This story was written by Keith Dawson for UBM DeusM’s community Web site Business Agility, sponsored by IBM. It is archived here for informational purposes only because the Business Agility site is no more. This material is Copyright 2012 by UBM DeusM.

Benchmarking Cloud Data Transfer

Measurements by Nasuni show big differences in importing vs. exporting data at various providers.

Data moves into and out of the various clouds at very different speeds. The gap appears to stem not from any "lock-in" conspiracy but from technical or bandwidth limitations.

Nasuni, a provider of enterprise data storage solutions, uses various cloud service providers (CSPs) to store customers' data. The company essentially acts as a virtualization layer across CSPs. As such, it is interested both in the overall quality of CSPs' services and in the ease with which data can be migrated from one CSP to another.

The company addressed the first question in a white paper last December, State of Cloud Storage Providers Industry Benchmark Report, which studied performance, stability, and scalability. Based on two years of data, Nasuni concluded that the most reliable CSPs are Amazon, Microsoft Azure, and Rackspace.

The second question, speed of data migration, was answered in a new report issued on Tuesday. GigaOM calls it a "cloud lock-in" survey, and in fact Nasuni is in the business of selling a solution to vendor lock-in in the cloud.

Yet the term "lock-in" puts too deliberate a spin on it. Nasuni found that Amazon's S3 storage cloud can take in data 10 or more times faster than the same data can be exported from S3 to either Azure or Rackspace, but this isn't the result of some vast conspiracy. Instead, the disparity seems to result from bandwidth or technical limitations at the two latter CSPs.

Nasuni's report notes that none of the CSPs were particularly forthcoming when asked about the origins of various limits on capacity; they all want to hold onto their secrets. So an outside tester such as Nasuni must resort to deduction and guesswork to figure out where the bottlenecks are.

Estimated time in hours to copy 12 TB between clouds
From \ To    Amazon   Rackspace   Azure
Amazon       4        ~160        40
Rackspace    5
Azure        4

Methodology and results
For the study, Nasuni worked with a 12-TB data set containing 22 million encrypted, compressed files of a mixture of sizes. From this larger set, the study selected about 5 percent of the files -- a million files totaling 200 GB -- and copied this data between cloud storage buckets, using various numbers of cloud-based workers. The results were extrapolated to arrive at the time that would be required to copy the full 12 TB.
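
The extrapolation step can be illustrated with a little arithmetic. The sketch below is not Nasuni's method, which the article does not spell out. It assumes that copy time scales linearly with the number of bytes and that sizes use decimal units (1 TB = 1,000 GB). The function name and the sample timing are hypothetical.

```python
# Sketch of a linear extrapolation from a sample copy to the full data set.
# Assumptions (not from the report): copy time scales linearly with bytes,
# and sizes use decimal units (1 TB = 1000 GB).

SAMPLE_BYTES = 200 * 10**9   # the 200-GB sample described in the article
FULL_BYTES = 12 * 10**12     # the full 12-TB data set


def extrapolate_hours(sample_minutes: float,
                      sample_bytes: int = SAMPLE_BYTES,
                      full_bytes: int = FULL_BYTES) -> float:
    """Scale a measured sample copy time (minutes) to the full set (hours)."""
    if sample_minutes < 0 or sample_bytes <= 0:
        raise ValueError("sample time must be >= 0 and sample size > 0")
    scale = full_bytes / sample_bytes   # 60.0 for 12 TB vs. 200 GB
    return sample_minutes * scale / 60  # convert minutes to hours


if __name__ == "__main__":
    # Hypothetical sample timing, chosen only for illustration.
    print(extrapolate_hours(4.0))
```

With these sizes the full set is 60 times larger than the sample, so under the linear assumption each minute of sample copying corresponds to one hour for the full 12 TB.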

The table summarizes the estimated cloud-to-cloud copying times. (The report gives the time for copying from Amazon to Rackspace as "just under a week.") Nasuni also tested moving data from one "bucket" in Amazon S3 to another; that is the Amazon-to-Amazon entry.
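
To put the table's hours in more familiar terms, they can be converted into an implied aggregate transfer rate. The sketch below is only a rough illustration. It assumes decimal units (12 TB = 12 x 10^12 bytes), treats "~160" as 160, and reads the single values in the Rackspace and Azure rows as times for copying into Amazon. The resulting figures are derived from the estimates, not measured rates, and they cover all workers combined.

```python
# Rough conversion of the table's estimated hours into implied throughput.
# Assumptions: decimal units, "~160" treated as 160, and the Rackspace and
# Azure rows read as copies into Amazon.

DATA_SET_BYTES = 12 * 10**12  # 12 TB

ESTIMATED_HOURS = {
    ("Amazon", "Amazon"): 4,
    ("Amazon", "Rackspace"): 160,  # table gives ~160
    ("Amazon", "Azure"): 40,
    ("Rackspace", "Amazon"): 5,
    ("Azure", "Amazon"): 4,
}


def implied_mb_per_second(hours: float) -> float:
    """Aggregate rate in MB/s needed to move the full data set in `hours`."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return DATA_SET_BYTES / (hours * 3600) / 10**6


if __name__ == "__main__":
    for (src, dst), hours in ESTIMATED_HOURS.items():
        rate = implied_mb_per_second(hours)
        print(f"{src} -> {dst}: {rate:.0f} MB/s")
```

Under these assumptions, the 4-hour estimates imply roughly 830 MB/s in aggregate, while the Amazon-to-Rackspace estimate implies roughly 21 MB/s.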

Nasuni's work provides a rare outside look at CSPs' comparative performance, and should prove useful to anyone evaluating cloud providers.