[Cross posted from here: http://rob.gillenfamily.net/post/External-File-Upload-Optimizations-for-Windows-Azure.aspx]
I’m wrapping up a bit of the work we’ve been doing on data movement optimizations for cloud computing and the latest set of data yielded some interesting points I thought I’d share. The work done here is not really rocket science but may, in some ways, be slightly counter-intuitive and therefore seemed worthy of posting.
Summary: for those who don’t like to read detailed posts or don’t have time, the synopsis is that if you are uploading data to Azure, split your data into blocks (even down to 1MB) and upload the blocks in parallel. Set your block size based on your source file size, but if you must choose a fixed value, use 1MB. Following this advice will result in significant performance gains (upwards of 10x-24x) and a reduction in overall file transfer time of upwards of 90% (e.g., uploading a 1GB file averaged 46.37 minutes prior to optimizations and 1.86 minutes afterwards).
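To make the 1GB example concrete, here is the arithmetic behind the two headline figures as a small Python sketch. The function names are illustrative only; the two input values are the averages quoted above.

```python
# Average minutes to upload a 1GB file, as quoted in the summary.
baseline_minutes = 46.37
optimized_minutes = 1.86


def speedup(before: float, after: float) -> float:
    """How many times faster the optimized transfer is."""
    return before / after


def percent_reduction(before: float, after: float) -> float:
    """Percentage of the original transfer time that was eliminated."""
    return (before - after) / before * 100.0


print(f"speedup: {speedup(baseline_minutes, optimized_minutes):.1f}x")              # 24.9x
print(f"reduction: {percent_reduction(baseline_minutes, optimized_minutes):.1f}%")  # 96.0%
```

For the 1GB case this works out to roughly a 24.9x speedup and a 96% reduction in transfer time, which is the upper end of the range quoted above.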
Detail: For those of you who want more detail, or think that the claims at the end of the preceding paragraph are over-reaching, what follows is information and code supporting these claims. As the title would indicate, these tests were run from our research facility pointing to the Azure cloud (specifically US North Central, as it is physically closest to us) and do not represent intra-cloud results. We have performed intra-cloud tests and the overall results are similar in notion, but the data rates are significantly different, as are the tipping points for the various block sizes (this will be detailed separately).
We started by building a very simple console application that would loop through a directory and upload each file to Azure storage. This application used the shipping storage client library from the 1.1 version of the Azure tools. The only real variation from the client library is that we added code to collect and record the duration (in ms) and size (in bytes) for each file transferred. The code is available here.
We then created a directory that had a collection of files for the following sizes: 2KB, 32KB, 64KB, 128KB, 512KB, 1MB, 5MB, 10MB, 25MB, 50MB, 100MB, 250MB, 500MB, 750MB, and 1GB (50 files for each size listed). These files contained randomly-generated binary data and do not benefit from compression (a separate discussion topic). Our file generation tool is available here.
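For readers who want to build a similar test set, the following is a minimal Python sketch of generating files filled with random bytes. It is not the generation tool referred to above; `write_random_file` and `make_test_files` are hypothetical names, and the demo uses only two small sizes to keep it quick.

```python
import os
import tempfile


def write_random_file(path: str, size_bytes: int, chunk_bytes: int = 1024 * 1024) -> None:
    """Write size_bytes of random data to path, one chunk at a time."""
    remaining = size_bytes
    with open(path, "wb") as handle:
        while remaining > 0:
            step = min(chunk_bytes, remaining)
            handle.write(os.urandom(step))  # random bytes do not compress well
            remaining -= step


def make_test_files(directory: str, sizes_bytes: list, copies: int) -> list:
    """Create `copies` random files for each requested size; return their paths."""
    paths = []
    for size in sizes_bytes:
        for index in range(copies):
            path = os.path.join(directory, f"random_{size}_{index:02d}.bin")
            write_random_file(path, size)
            paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        for created in make_test_files(scratch, [2 * 1024, 32 * 1024], copies=2):
            print(os.path.basename(created), os.path.getsize(created))
```

Writing in chunks keeps memory use flat even for the larger sizes in the list above.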
The baseline was established by running the application described above against the directory containing all of the data files. This application uploads the files in a random order so as to avoid transferring all of the files of a given size sequentially, thereby spreading the effects of periodic Internet delays across the collection of results. We then ran some scripts to split the resulting data and generate some reports. The raw data collected for our non-optimized tests is available via the links in the Related Resources section at the bottom of this post.
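The randomize-then-time approach described above might look roughly like this in Python. This is a sketch under stated assumptions, not the harness itself: `timed_uploads` is a hypothetical name, the `upload` callable stands in for whatever storage call is being measured, and payloads are held in memory for simplicity.

```python
import random
import time


def timed_uploads(files, upload, seed=None):
    """Upload (name, payload) pairs in random order.

    Returns a list of (name, size_bytes, duration_ms) records, one per file.
    `upload` is any callable taking (name, payload); it is an assumption here.
    """
    order = list(files)
    random.Random(seed).shuffle(order)  # avoid sending one size back-to-back
    records = []
    for name, payload in order:
        started = time.perf_counter()
        upload(name, payload)
        duration_ms = (time.perf_counter() - started) * 1000.0
        records.append((name, len(payload), duration_ms))
    return records


if __name__ == "__main__":
    demo_files = [("small.bin", b"\0" * 2048), ("large.bin", b"\0" * 65536)]
    for record in timed_uploads(demo_files, lambda name, payload: None):
        print(record)
```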
For each file size, we calculated the average upload time (and standard deviation) and the average transfer rate (and standard deviation). As you likely are aware, transferring data across the Internet is susceptible to many transient delays which can cause anomalies in the resulting data. It is for this reason that we randomized the order of source file processing as well as executed the tests 50x for each file size. We expect that these steps will yield a sufficiently balanced set of results.
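As an illustration of the per-size statistics described above, here is a short Python sketch that groups (size in bytes, duration in ms) records by size and computes the mean and standard deviation of both the upload time and the transfer rate. The name `summarize` is hypothetical, and the use of the sample (rather than population) standard deviation is an assumption; the post does not say which was used.

```python
import statistics
from collections import defaultdict


def summarize(records):
    """records: iterable of (size_bytes, duration_ms) pairs.

    Returns {size_bytes: {mean/stdev of duration and of transfer rate}}.
    """
    by_size = defaultdict(list)
    for size_bytes, duration_ms in records:
        by_size[size_bytes].append(duration_ms)

    summary = {}
    for size_bytes, durations in sorted(by_size.items()):
        # Transfer rate of each individual upload, in bytes per second.
        rates = [size_bytes / (ms / 1000.0) for ms in durations]
        several = len(durations) > 1  # stdev needs at least two samples
        summary[size_bytes] = {
            "mean_ms": statistics.mean(durations),
            "stdev_ms": statistics.stdev(durations) if several else 0.0,
            "mean_rate_bytes_per_s": statistics.mean(rates),
            "stdev_rate_bytes_per_s": statistics.stdev(rates) if several else 0.0,
        }
    return summary


if __name__ == "__main__":
    for size, stats in summarize([(1000, 1000.0), (1000, 2000.0)]).items():
        print(size, stats)
```

Note that the rate is averaged per upload rather than derived from the average duration; the two give different numbers when durations vary widely.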
Once the baseline was collected and analyzed, we updated the test harness application with some methods to split the source file into user-defined block sizes and then to upload those blocks in parallel (using the PutBlock() method of Azure storage). The parallelization was handled by simply relying on the Parallel Extensions to .NET to provide a Parallel.For loop (see linked source for specific implementation details in Program.cs, line 173 and following; less than 100 lines total). Once all of the blocks were uploaded, we called PutBlockList() to assemble/commit the file in Azure storage. For each block transferred, the MD5 was calculated and sent, ensuring that the bits that arrived matched what was intended. The timer for the blocked/parallelized transfer method wraps the entire process (source file splitting, block transfer, MD5 validation, file committal). A diagram of the process is as follows:
We then tested the effects of blocking and parallelizing the transfers by running the updated application against the same source set, with a parameter sweep on the block size covering 256KB, 512KB, 1MB, 2MB, and 4MB (our assumption was that anything lower than 256KB wasn’t worth the trouble, and 4MB is the maximum size of a block supported by Azure). The raw data for the parallel tests is available via the links in the Related Resources section at the bottom of this post.
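The split, parallel block upload, MD5 check, and final commit sequence can be sketched as follows. This is a minimal Python illustration and not the C# harness: `InMemoryBlobStore`, `split_blocks`, and `upload_in_blocks` are hypothetical names, the store is an in-memory stand-in rather than the Azure client library, and a thread pool takes the place of Parallel.For.

```python
import base64
import hashlib
from concurrent.futures import ThreadPoolExecutor


def md5_b64(data: bytes) -> str:
    """Base64-encoded MD5 digest of a block."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


class InMemoryBlobStore:
    """Hypothetical stand-in for blob storage; NOT the Azure client library."""

    def __init__(self):
        self.staged = {}     # (blob_name, block_id) -> bytes, not yet committed
        self.committed = {}  # blob_name -> assembled bytes

    def put_block(self, blob_name, block_id, data, expected_md5):
        # Reject the block if the bytes received do not match the sender's hash.
        if md5_b64(data) != expected_md5:
            raise ValueError(f"MD5 mismatch for block {block_id}")
        self.staged[(blob_name, block_id)] = data

    def put_block_list(self, blob_name, block_ids):
        # Assemble the staged blocks in the order given and commit the blob.
        parts = [self.staged.pop((blob_name, block_id)) for block_id in block_ids]
        self.committed[blob_name] = b"".join(parts)


def split_blocks(data: bytes, block_size: int) -> list:
    """Cut data into consecutive blocks of at most block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def upload_in_blocks(store, blob_name, data, block_size, max_workers=8):
    """Upload data as parallel blocks, then commit. Returns the block count."""
    blocks = split_blocks(data, block_size)
    block_ids = [f"{index:08d}" for index in range(len(blocks))]

    def send(pair):
        block_id, block = pair
        store.put_block(blob_name, block_id, block, md5_b64(block))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() forces completion and re-raises any upload error.
        list(pool.map(send, zip(block_ids, blocks)))
    store.put_block_list(blob_name, block_ids)
    return len(blocks)


if __name__ == "__main__":
    store = InMemoryBlobStore()
    payload = bytes(range(256)) * 5000
    count = upload_in_blocks(store, "demo.bin", payload, 256 * 1024)
    print(count, store.committed["demo.bin"] == payload)
```

Blocks may finish in any order; only the ordered ID list passed to the commit step determines how the file is assembled.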
This data was processed and then compared against the single-threaded / non-optimized transfer numbers and the results were encouraging. The Excel version of the results is available here.
Two semi-obvious points need to be made prior to reviewing the data. The first is that if the block size is larger than the source file size, you will end up with a “negative optimization” due to the overhead of attempting to block and parallelize. The second is that as the files get smaller, the clock-time cost of blocking and parallelizing (overhead) is more apparent and can tend towards negative optimizations. For this reason (as supported by the raw data provided in the linked worksheet), the charts and discussion below ignore source file sizes less than 1MB.
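The first point is easy to see by counting blocks. The Python sketch below (function names are illustrative) computes how many blocks each block size in the sweep produces for a given file; once the count drops to one there is nothing left to run in parallel, and only the overhead remains.

```python
KB = 1024
MB = 1024 * KB

# Block sizes used in the parameter sweep.
SWEEP = [256 * KB, 512 * KB, 1 * MB, 2 * MB, 4 * MB]


def block_count(file_size: int, block_size: int) -> int:
    """Number of blocks needed for the file (ceiling division, at least one)."""
    return max(1, -(-file_size // block_size))


def can_parallelize(file_size: int, block_size: int) -> bool:
    """True only when the file splits into more than one block."""
    return block_count(file_size, block_size) > 1


if __name__ == "__main__":
    for file_size in (1 * MB, 1024 * MB):
        for block_size in SWEEP:
            print(file_size // MB, "MB file,", block_size // KB, "KB blocks:",
                  block_count(file_size, block_size), "block(s)")
```

For a 1MB file, only the 256KB and 512KB settings yield more than one block (4 and 2 respectively). At the other extreme, a 1GB file at 256KB produces 4096 blocks versus 256 at 4MB, so the per-request and commit costs grow with the smaller setting.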

The chart above illustrates some interesting points about the results:
- When the block size is smaller than the source file, performance increases but as the block size approaches and then passes the source file size, you see decreasing benefit to the point of negative gains (see the values for the 1MB file size)
- For some of the moderately-sized source files, small blocks (256KB) are best
- As the size of the source file gets larger (see values for 50MB and up), the smallest block size is not the most efficient (presumably due, at least in part, to the increased number of blocks, increased number of individual transfer requests, and reassembly/committal costs).
- Once you pass the 250MB source file size, the difference in rate for 1MB to 4MB blocks is more-or-less constant
- The 1MB block size gives the best average improvement (~16x) but the optimal approach would be to vary the block size based on the size of the source file.
The above is another view of the same data as the prior chart, just with the axes changed (the x-axis represents file size and the plotted data shows improvement by block size). It again highlights the fact that the 1MB block size is probably the best overall size, but it also shows the benefits of some of the other block sizes at different source file sizes.
This last chart shows the change in total duration of the file uploads based on different block sizes for the source file sizes. Nothing really new here, other than that this view of the data highlights the negative effects of poorly choosing a block size for smaller files.
Summary
What we have found so far is that blocking your file uploads and uploading them in parallel results in significant performance improvements. Further, utilizing extension methods and the Task Parallel Library (.NET 4.0) makes short work of altering the shipping client library to provide this functionality while minimizing the amount of change to existing applications that might be using the client library for other interactions.
Related Resources
The following represents data and results gathered from the second research institution connection cloud transfer test and compares results from Azure’s US North Central data center and Azure’s US South Central data center. The methodology applied during this test is detailed here and should be reviewed prior to considering the results or commentary below.
Test Overview:
- 05561 Cloud Transfer Tests: Research Institution Test 02
- Local Connection: Research Networks
- Started: February 9, 2010
- Finished: February 16, 2010
- Origination Point: Oak Ridge, TN
Disclaimer:
- Standard Disclaimer Applies
Test Objectives:
- Standard objectives apply
- Specific to this test: Test a research institution connection as the researcher’s “workstation” and gather data aimed at building a realistic expectation of performance
Test Setup
- Included File Sizes:
  - 2KB, 32KB, 64KB, 128KB, 256KB, 512KB, 1MB, 5MB, 10MB, 25MB, 50MB, 100MB, 250MB, 500MB, 750MB, 1GB
- Network Connectivity - “research institution”
  - Consists of a computer connected to a local network router via 100Mbps hard-wire.
  - Multiple switches/routers/firewalls may exist between workstation and the public internet
  - There may exist multiple high-speed networks that may be leveraged for connectivity to remote datacenters (ESNet, I2, NLR)
- Reasonable effort has been made to ensure that no other applications or TSRs are running on the source computer for the duration of the test.
- For this test, a newly-installed Windows 7 Professional installation was used, fully patched, with no other applications (beyond the test harness) installed.
Test Execution:
- Standard execution approach applied, except that both test cases targeted Azure, differing only in the datacenter used (see slides for details)
Report Generation
- Standard report generation approach applied
Conventions:
- Standard conventions apply
Resources:
- Standard resources apply - no test-specific customizations beyond adaptations for the specific file sizes included in the test
Results:
Similar to other tests, there is some variability displayed that is obviously a result of traffic issues. We are continuing to look into this.
In general, the data from the Azure US North Central data center proved better than that of US South Central which is not altogether surprising as we are physically closer to the USNC location.
Slides 171 and 172 remain disturbing as the download values for the 750MB file size continue to be outside of what would be expected.
Slide 172 in particular is of interest as it draws attention to some wide variability across file sizes for the USSC datacenter (not just the 750MB size).
Full results are available in slide form here:
A PDF of the results is available here: http://sciencecloud.us/media/05561_Xfer-Research_02.pdf
The following represents data and results gathered from the first research institution connection cloud transfer test. The methodology applied during this test is detailed here and should be reviewed prior to considering the results or commentary below.
Test Overview:
- 05561 Cloud Transfer Tests: Research Institution Test 01
- Local Connection: Research Networks
- Started: February 8, 2010
- Finished: February 16, 2010
- Origination Point: Oak Ridge, TN
Disclaimer:
- Standard Disclaimer Applies
Test Objectives:
- Standard objectives apply
- Specific to this test: Test a research institution connection as the researcher’s “workstation” and gather data aimed at building a realistic expectation of performance
Test Setup
- Included File Sizes:
  - 2KB, 32KB, 64KB, 128KB, 256KB, 512KB, 1MB, 5MB, 10MB, 25MB, 50MB, 100MB, 250MB, 500MB, 750MB, 1GB
- Network Connectivity - “research institution”
  - Consists of a computer connected to a local network router via 100Mbps hard-wire.
  - Multiple switches/routers/firewalls may exist between workstation and the public internet
  - There may exist multiple high-speed networks that may be leveraged for connectivity to remote datacenters (ESNet, I2, NLR)
- Reasonable effort has been made to ensure that no other applications or TSRs are running on the source computer for the duration of the test.
- For this test, a newly-installed Windows 7 Professional installation was used, fully patched, with no other applications (beyond the test harness) installed.
Test Execution:
- Standard execution approach applied
Report Generation
- Standard report generation approach applied
Conventions:
- Standard conventions apply
Resources:
- Standard resources apply - no test-specific customizations beyond adaptations for the specific file sizes included in the test
Results:
Across both services there exists an interesting amount of variability that is likely due to intermediate traffic or traffic management issues. Even within the same test run (see various scatter plots) you can detect “walls” of change, wherein the values hover around a certain value and subsequently hover around a much higher or lower value (e.g., slides 133 and 134).
There is not a consistent “winner” in this report. For various file sizes, one platform would clearly outperform the other, only to have the tables completely reversed for the next file size. This hints at network routing issues. A brief conversation with some of our local networking team indicates that some traffic (in particular Amazon’s) appeared to generally leave via the router connected to ESNet, whereas most of the Microsoft traffic would leave via the router connected to Southern Crossing with subsequent connections to I2 and NLR. It may well be that the insertion of some static routes would help address some of the stability issues here.
Of particular interest is the “hump” seen by both services in slide 170. This has been seen in a similar location on the chart in other runs (see slide #82 here: http://www.slideshare.net/rgillen/cloud-storage-upload-tests-02). We don’t yet have a good explanation for this shape in the curve and are hoping to track that down soon.
Further, the shape of the Azure curve in slide 171 is inconsistent with other tests – specifically the data points for the 750MB size. We will continue to compare with other sets/runs to see if this continues or was simply transient.
What remains consistent across all tests so far is that the level of variability tends to be greater with the S3 platform as compared to the Azure Blob storage.
Full results are available in slide form here:
A PDF of the results is available here: http://sciencecloud.us/media/05561_Xfer-Research_01.pdf
The following represents data and results gathered from the first consumer connection cloud transfer test. The methodology applied during this test is detailed here and should be reviewed prior to considering the results or commentary below.
Test Overview:
- 05561 Cloud Transfer Tests: Consumer Connection Test 01
- Local Connection: Comcast Residential
- Started: February 9, 2010
- Finished: February 14, 2010
- Origination Point: Knoxville, TN
Disclaimer:
- Standard Disclaimer Applies
Test Objectives:
- Standard objectives apply
- Specific to this test: Test a consumer/commodity connection as the researcher’s “workstation” and gather data aimed at building a realistic expectation of performance
Test Setup
- Included File Sizes:
  - 2KB, 32KB, 64KB, 128KB, 256KB, 512KB, 1MB, 5MB, 10MB, 25MB, 50MB, 100MB
- Network Connectivity - “typical home network”
  - Consists of a computer connected to a local router via 1GE hard-wire.
  - Router is then directly connected to service provider’s modem
  - Consumer has a “general” plan for internet connectivity
- Reasonable effort has been made to ensure that no other applications or TSRs are running on the source computer for the duration of the test.
- For this test, a newly-installed Windows 7 Professional installation was used, fully patched, with no other applications (beyond the test harness) installed.
Test Execution:
- Standard execution approach applied
Report Generation
- Standard report generation approach applied
Conventions:
- Standard conventions apply
Resources:
- Standard resources apply - no test-specific customizations beyond adaptations for the specific file sizes included in the test
Results:
In contrast to some other test runs on other networks, in this test Azure seemed to generally (if barely) out-perform the Amazon platform and, consistent with other tests, interaction with Amazon’s platform showed greater variability within a given file size.
The test was limited to file sizes up to and including 100MB so as to avoid being flagged by the residential ISP for poor traffic habits (an issue to be addressed for large-bandwidth users on consumer connections).
Full results are available in slide form here:
A PDF of the results is available here: http://sciencecloud.us/media/05561_Xfer-Consumer_01.pdf
I’ve been working on moving a large collection of data to, from, and around Azure as we test the data profile for scientific computing and large-scale experiment post-processing. In order to verify that the data we uploaded and processed turned out as we wanted it to, I built a simple visualization app that runs a real-time query against the data in Azure and displays it. Originally the app was built as a simple WPF desktop application, but I got to thinking that it would be particularly interesting on the Surface and therefore took a day or two to port it over. The video below is a walkthrough of the app. The dialog is a bit cheesy, but the app is interesting as it provides a very tactile means of interacting with otherwise stale data.