February 16, 2010

The following represents data and results gathered from the first research institution connection cloud transfer test. The methodology applied during this test is detailed here and should be reviewed prior to considering the results or commentary below.

Test Overview:

  • 05561 Cloud Transfer Tests: Research Institution Test 01
  • Local Connection: Research Networks
  • Started: February 8, 2010
  • Finished: February 16, 2010
  • Origination Point: Oak Ridge, TN

 

Disclaimer:

  • Standard Disclaimer Applies

 

Test Objectives:

  • Standard objectives apply
  • Specific to this test: Test a research institution connection as the researcher’s “workstation” and gather data aimed at building a realistic expectation of performance

 

Test Setup

  • Included File Sizes:
    • 2KB, 32KB, 64KB, 128KB, 256KB, 512KB, 1MB, 5MB, 10MB, 25MB, 50MB, 100MB, 250MB, 500MB, 750MB, 1GB
  • Network Connectivity - “research institution”
    • Consists of a computer connected to a local network router via 100Mbps hard-wire.
    • Multiple switches/routers/firewalls may exist between workstation and the public internet
    • Multiple high-speed networks may be available for connectivity to remote datacenters (ESNet, I2, NLR)
    • Reasonable effort has been made to ensure that no other applications or TSRs are running on the source computer for the duration of the test.
    • For this test, a newly-installed Windows 7 Professional installation was used, fully patched, with no other applications (beyond the test harness) installed.
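
To put the measured times in context, it helps to know the best case the 100Mbps workstation link would allow. The sketch below computes that floor for each file size in this test. It assumes binary units (1KB = 1024 bytes) and ignores protocol overhead, latency, and any upstream bottleneck; the function names and constants are illustrative and are not part of the test harness.

```python
# Illustrative lower bound on transfer time over the 100Mbps workstation link.
# Assumptions (not from the test harness): binary units, no protocol overhead.

LINK_BITS_PER_SECOND = 100_000_000  # 100Mbps hard-wire link

UNITS = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}

FILE_SIZES = ["2KB", "32KB", "64KB", "128KB", "256KB", "512KB", "1MB", "5MB",
              "10MB", "25MB", "50MB", "100MB", "250MB", "500MB", "750MB", "1GB"]


def size_in_bytes(label: str) -> int:
    """Convert a label such as '750MB' into a byte count."""
    number, unit = label[:-2], label[-2:]
    return int(number) * UNITS[unit]


def ideal_seconds(label: str, link_bps: int = LINK_BITS_PER_SECOND) -> float:
    """Best-case seconds to move the file if the link were the only limit."""
    return size_in_bytes(label) * 8 / link_bps


if __name__ == "__main__":
    for label in FILE_SIZES:
        print(f"{label:>6}: {ideal_seconds(label):10.4f} s")
```

Any measured time well above these floors points at something other than the local link, such as routing, remote service latency, or per-request overhead.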

 

Test Execution:

  • Standard execution approach applied


Report Generation

  • Standard report generation approach applied

 

Conventions:

  • Standard conventions apply

 

Resources:

  • Standard resources apply - no test-specific customizations beyond adaptations for the specific file sizes included in the test

 

Results:

Across both services there is a notable amount of variability, likely due to intermediate traffic or traffic-management issues. Even within the same test run (see the various scatter plots) you can detect “walls” of change, where the values hover around a certain value and then shift to hovering around a much higher or lower value (e.g., slides 133 and 134).
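
One rough way to locate such “walls” in a series of transfer times is to compare the median of the samples just before each point with the median of the samples just after it. The following is a minimal sketch of that idea, not the method used to produce the slides; the function name, window size, ratio threshold, and sample data are all assumptions chosen for illustration.

```python
from statistics import median


def find_walls(times, window=5, ratio=1.5):
    """Return indices where the median of the next `window` samples differs
    from the median of the previous `window` samples by at least `ratio`x.

    A single wall usually shows up as a small cluster of adjacent indices.
    """
    walls = []
    for i in range(window, len(times) - window + 1):
        before = median(times[i - window:i])
        after = median(times[i:i + window])
        low, high = sorted((before, after))
        if low > 0 and high / low >= ratio:
            walls.append(i)
    return walls


if __name__ == "__main__":
    # Hypothetical transfer times (seconds) that step from ~10s to ~20s.
    samples = [10.0] * 10 + [20.0] * 10
    print(find_walls(samples))
```

Medians are used instead of means so that a single slow transfer does not register as a wall.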

There is not a consistent “winner” in this report. For various file sizes one platform would clearly outperform the other, only to have the tables completely reversed for the next file size. This hints at network routing issues. A brief conversation with some of our local networking team indicates that some traffic (in particular Amazon’s) appeared to generally leave via the router connected to ESNet, whereas most of the Microsoft traffic would leave via the router connected to Southern Crossing with subsequent connections to I2 and NLR. It may well be that inserting some static routes would help address some of the stability issues here.

Of particular interest is the “hump” seen by both services in slide 170. This has been seen in a similar location on the chart in other runs (see slide #82 here: http://www.slideshare.net/rgillen/cloud-storage-upload-tests-02). We don’t yet have a good explanation for this shape in the curve and are hoping to track that down soon.

Further, the shape of the Azure curve in slide 171 is inconsistent with other tests – specifically the data points for the 750MB size. We will continue to compare with other sets/runs to see if this continues or was simply transient.

What remains consistent across all tests so far is that the level of variability tends to be greater with the S3 platform as compared to the Azure Blob storage.
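
A simple way to compare variability between platforms on equal footing is the coefficient of variation (standard deviation divided by the mean), computed per file size. The sketch below shows the calculation; the sample times are made up for illustration and are not taken from the test results.

```python
from statistics import mean, stdev


def coefficient_of_variation(samples):
    """Sample standard deviation divided by the mean (unitless)."""
    return stdev(samples) / mean(samples)


if __name__ == "__main__":
    # Hypothetical transfer times in seconds for one file size.
    s3_times = [4.0, 6.0, 5.0, 9.0, 3.0]
    azure_times = [5.0, 5.5, 5.2, 4.9, 5.4]
    print(f"S3 CV:    {coefficient_of_variation(s3_times):.3f}")
    print(f"Azure CV: {coefficient_of_variation(azure_times):.3f}")
```

Because the result is unitless, it can be compared across file sizes even though the absolute transfer times differ by orders of magnitude.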

 

Full results are available in slide form here:

 

A PDF of the results is available here: http://sciencecloud.us/media/05561_Xfer-Research_01.pdf

February 16, 2010

The following represents data and results gathered from the first consumer connection cloud transfer test. The methodology applied during this test is detailed here and should be reviewed prior to considering the results or commentary below.

Test Overview:

  • 05561 Cloud Transfer Tests: Consumer Connection Test 01
  • Local Connection: Comcast Residential
  • Started: February 9, 2010
  • Finished: February 14, 2010
  • Origination Point: Knoxville, TN

Disclaimer:

  • Standard Disclaimer Applies

Test Objectives:

  • Standard objectives apply
  • Specific to this test: Test a consumer/commodity connection as the researcher’s “workstation” and gather data aimed at building a realistic expectation of performance

Test Setup

  • Included File Sizes:
    • 2KB, 32KB, 64KB, 128KB, 256KB, 512KB, 1MB, 5MB, 10MB, 25MB, 50MB, 100MB
  • Network Connectivity - “typical home network”
    • Consists of a computer connected to a local router via 1GE hard-wire.
    • Router is then directly connected to service provider’s modem
    • Consumer has a “general” plan for internet connectivity
    • Reasonable effort has been made to ensure that no other applications or TSRs are running on the source computer for the duration of the test.
    • For this test, a newly-installed Windows 7 Professional installation was used, fully patched, with no other applications (beyond the test harness) installed.

Test Execution:

  • Standard execution approach applied

Report Generation

  • Standard report generation approach applied

Conventions:

  • Standard conventions apply

Resources:

  • Standard resources apply - no test-specific customizations beyond adaptations for the specific file sizes included in the test

Results:

In contrast to some test runs on other networks, in this test Azure generally (if barely) outperformed the Amazon platform. Consistent with other tests, interaction with Amazon’s platform showed greater variability for a given file size.

The test was limited to file sizes up to and including 100MB so as to avoid being flagged by the residential ISP for poor traffic habits (an issue to be addressed for large-bandwidth users on consumer connections).

Full results are available in slide form here:

A PDF of the results is available here: http://sciencecloud.us/media/05561_Xfer-Consumer_01.pdf

May 13, 2009

Data Structures

Posted by: Rob Gillen in Categories: Data.

One of the aspects of our project is to evaluate the “right” way to expose large datasets such that they can be consumed appropriately by other cloud-oriented tools. For our purposes we are considering datasets ranging from a few GB to a few PB – and it is certainly the upper end of this spectrum that causes the most concern. We are targeting the following general use-cases:

  • Tools/Compute currently targeted at Amazon’s S3 service.
  • Tools/Compute currently targeted at Amazon’s EC2 service (not sure if the S3 interfaces solve both issues).
  • Tools/Compute currently targeted at Microsoft’s Azure Storage (blob/table as appropriate)
  • The “General” web client (this scenario makes the case for exposing data in a human-readable and/or standards-based format, such that tools that don’t yet exist can reasonably be expected to consume the data)

Because of the sheer quantity of data, it is our expectation that the data will be stored centrally in some semi-generic fashion that will then be exposed in a number of protocol/format specific means. This may be an incorrect assumption, but it is our current plan.
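
As a rough illustration of that plan, the sketch below keeps one semi-generic store and layers thin, service-style views on top of it. Every class and method name here is hypothetical; the views only mimic the bucket/key and container/blob addressing schemes and do not implement the actual S3 or Azure Storage protocols.

```python
from typing import Dict, Tuple


class CentralStore:
    """Hypothetical semi-generic store: bytes keyed by (dataset, item)."""

    def __init__(self) -> None:
        self._items: Dict[Tuple[str, str], bytes] = {}

    def put(self, dataset: str, item: str, data: bytes) -> None:
        self._items[(dataset, item)] = data

    def get(self, dataset: str, item: str) -> bytes:
        return self._items[(dataset, item)]


class S3StyleView:
    """Maps a bucket/key style address onto the central store."""

    def __init__(self, store: CentralStore) -> None:
        self._store = store

    def get_object(self, bucket: str, key: str) -> bytes:
        return self._store.get(bucket, key)


class BlobStyleView:
    """Maps a container/blob style address onto the central store."""

    def __init__(self, store: CentralStore) -> None:
        self._store = store

    def get_blob(self, container: str, blob_name: str) -> bytes:
        return self._store.get(container, blob_name)


if __name__ == "__main__":
    store = CentralStore()
    store.put("climate", "2009/run1.dat", b"example bytes")
    # Both views return the same single stored copy of the data.
    print(S3StyleView(store).get_object("climate", "2009/run1.dat"))
    print(BlobStyleView(store).get_blob("climate", "2009/run1.dat"))
```

The point of the design is that the data is written once, and each additional protocol costs only a thin translation layer.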

Some general questions that currently exist:

  • Some of the APIs listed above support multiple protocols (REST, JSON, SOAP) – should all be implemented or just REST?
  • How do we make it easy for various research groups to get their data published?
  • Should service-specific (i.e. S3/Azure) interfaces be built, or should general, REST-based (or other) interfaces be built?
  • Should dataset-specific REST interfaces be provided? (rather than generic interfaces)

I’ve been poking around at http://data.gov hoping that something interesting will start to appear (as of yet, it is still a place-holder site). I’ve also done some poking around at Microsoft’s Open Government Data Initiative (http://ogdisdk.cloudapp.net/), which looks interesting, but no code has been posted yet – maybe their starter kit will appear soon.
