Experiments

100322_Azure_MTUpload_DataProxy

This data shows multi-threaded uploads to Azure (parameter sweep on block size) by means of a data proxy (hosted in azure, near the storage).

100316_Amazon_Internal_MTDownload

This data shows another run of the MT download tests from within the Amazon data center, but fails to show any consistent improvement over, or clarification of, the variety of data shown in the prior test.

100316_Amazon_Internal

This data shows bi-directional transfers from within the Amazon datacenter (USSTD). The uploads are single-threaded and the downloads are multi-threaded. This should be adjusted to also capture single-threaded downloads.

100311_Azure_Internal_MTDownload

This data shows multi-threaded downloads from compute inside Azure's cloud interacting with storage located in the same cloud (and zone). Due to the oddities in the prior test, this was a re-run of exactly the same test pattern. The data has some similar anomalies.

100310_Azure_Internal_MTDownload

This data shows multi-threaded downloads from compute inside Azure's cloud interacting with storage located in the same cloud (and zone). Due to the oddities in the prior test, this was a re-run of exactly the same test pattern. The data has some similar anomalies (i.e. the blowout in rate for 10-threaded file transfers for some entries but not all).

100309_Azure_Internal_MTDownload

This data shows multi-threaded downloads from compute inside Azure's cloud interacting with storage located in the same cloud (and zone). Unlike the prior attempts, this one is a bit more accurate and the results are much closer to what one might expect. That being said, there are still some anomalies that don't have good explanations.

100305_Azure_Internal_MTDownload

This data shows multi-threaded downloads from compute inside Azure's cloud interacting with storage located in the same cloud (and zone). If you look at the results you will notice that they are very different from what you might expect. This is due, presumably, to two things, both stemming from the fact that the single-threaded options (during the parameter sweep) were being handled differently than the others.

  1. Single-threaded instances were configured to transfer only (no disk), whereas the others assembled to disk or at least assembled the bits in memory.
  2. Single-threaded instances used a buffer read size of 1MB, whereas the others used a buffer read size of 2K.
*** As such, this data is considered to be worthless.
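
A check along the following lines could catch this kind of confounded sweep before a run. This is only a sketch, not the code used for these tests: the function name, the setting names (`buffer_bytes`, `assemble`) and the 5-thread entry are hypothetical, chosen to mirror the two differences listed above.

```python
def confounded_settings(configs, swept="threads"):
    """Return the settings, other than the swept one, whose values differ across configs."""
    keys = set().union(*(c.keys() for c in configs)) - {swept}
    varying = set()
    for key in keys:
        values = {c.get(key) for c in configs}
        if len(values) > 1:
            varying.add(key)
    return sorted(varying)


# Hypothetical sweep entries mirroring the two issues described above.
sweep = [
    {"threads": 1, "buffer_bytes": 1024 * 1024, "assemble": "none"},
    {"threads": 5, "buffer_bytes": 2 * 1024, "assemble": "disk"},
    {"threads": 10, "buffer_bytes": 2 * 1024, "assemble": "disk"},
]

print(confounded_settings(sweep))  # ['assemble', 'buffer_bytes']
```

An empty result would mean that only the thread count varies, which is what a clean sweep on threads needs.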

 

100304_Azure_Internal_MTDownload

This was an attempt at multi-threaded downloads from compute inside Azure's cloud interacting with storage inside Azure's cloud. However, the flags were not set properly in the code and this was *not* actually multi-threaded. This dataset is considered worthless.

100226_Azure_MTDownload

This data shows a parameter sweep on a multi-threaded download process from Azure's USNC datacenter to our ORNL location. In general, more threads and larger file sizes show greater improvement. Tests were run from 1MB up.
** Data is missing for the 1GB size.
** Only partial data exists for the 750MB size.
** NOTE: Do we have a test file for an MT download from Amazon?
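
To keep track of gaps like the ones noted for this sweep, the (threads, file size) grid can be enumerated and compared against the runs that actually completed. The sketch below is illustrative only: the thread counts and the set of completed runs are placeholders, not the real test matrix or results, and only the 1MB, 750MB and 1GB sizes come from the notes above.

```python
from itertools import product


def missing_cells(thread_counts, sizes_mb, completed):
    """List the (threads, size_mb) combinations that have no completed run."""
    return [cell for cell in product(thread_counts, sizes_mb) if cell not in completed]


thread_counts = [1, 5, 10]   # placeholder thread counts
sizes_mb = [1, 750, 1024]    # 1MB, 750MB, 1GB; intermediate sizes omitted

# Placeholder bookkeeping: which (threads, size_mb) runs produced data.
completed = {(1, 1), (5, 1), (10, 1), (1, 750)}

for threads, size_mb in missing_cells(thread_counts, sizes_mb, completed):
    print(f"missing: {threads} thread(s), {size_mb} MB")
```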

100225_WalrusJics_FE

This shows transfer rates for the JICS installation of Walrus, measured from the actual network router in my office rather than a node behind it, attempting to eliminate any notion that there was a perceptible difference between them.
** THIS DATA HAS NOT YET BEEN ANALYZED

100224_WalrusJics

This shows a series of tests against the JICS installation of Walrus once the network issues were confirmed to be relieved (as much as possible).

100222_AmazonEU

This data illustrates uploads from our research facility to Amazon's EU zone. This data has not yet been analyzed; it should be, and then compared to the USSTD and USWEST zones.
** NOTE: There is no download test data available.

** NOTE **
Do we have Amazon internal transfer (sequential) data files?

100218_Azure_Internal

This compares data from internal transfers (bidirectional, sequential) between the four different Azure datacenters and storage located in the same zone. This data is used to illustrate the fact that, in large measure, the patterns are similar, with slight variances between the environments.

100217_AzureNE

Data collected from transfers to/from (sequentially) the Azure Northern Europe (NE) environment. Speeds are expected to be significantly slower than US due to the transatlantic crossing. This data should be assembled and compared to the other Azure services.

100217_AmazonWest

Data collected from transfers to and from (sequentially) the Amazon USWEST environment. This data has not yet been analyzed; it needs to be, and then compared to the Amazon USSTD environment. Assertion is that numbers will be similar; however, USWEST is farther from our location and will likely be slower. The only caveat to this is that USWEST is presumed to be less busy (it is newer) and therefore internal traffic may be lower, resulting in lower latency - however, this is not expected to trump the longer distance on the public internet.

100216_WalrusInternal

Shows bidirectional traffic between my workstation and the JICS-hosted Walrus installation as compared to my locally hosted Walrus installation. In theory, the JICS platform should perform better (RAID 0 disks) but has more LAN b/t my workstation and it, whereas the local Walrus was sitting on a box separated only by a single switch. This data shows JICS capping at around 275Mbps for downloads and around 235Mbps for uploads. The local install showed only slightly higher upload rates (maybe 245Mbps) but the download rate was a good bit higher (> 400Mbps).

100211_1149

Bidirectional, sequential transfer tests between our research facility and Azure USNC and Azure USSC. Goal here was to test/compare the two and ensure that the behaviors were similar if not identical except for network distance being further b/t us and USSC. Report presentation contains full explanation and details.

100209_HomeNetwork

Basic data transfer tests (bi-directional, sequential) between residential connection (home) and Azure (USNC) and Amazon (USSTD). This data showed less variance b/t the two services (as compared to the lab results) and also showed rather good speed, hinting that the transfers at the lab were suffering for some reason... this led to the isolation of the proxy as a problem and the provision of data to IT to get an exception.

***100208_0800

Data Tests (bi-directional, and sequential) between ORNL workstation and Azure (USNC) and Amazon (USSTD) for the full file range.
NOTE: I believe that this data was hitting the ORNL Proxy Server.

100129_1135

Initial upload transfer tests to both Azure and Amazon. Also represents the first set of data against which the ProcessTransferLogs.ps1 script was run.
NOTE: Needs to be re-run... charts are out of date.
NOTE: I believe that this data was hitting the ORNL Proxy Server.

100120

A continuation of the work started in 100106.
\tracelogs\*.* has the data captured by the code spitting to the local logs and then harvested.
\perflogs\*.* has the data from the various perf counters (proc, mem, etc.)

100110

A continuation of the work started in 100106.
\tracelogs\*.* has the data captured by the code spitting to the local logs and then harvested.
\perflogs\*.* has the data from the various perf counters (proc, mem, etc.)
PerfReport.xlsx - calculations and basic charts showing perf usage over time.

100109

A continuation of the work started in 100106.
\tracelogs\*.* this directory is empty.
\perflogs\*.* has the data from the various perf counters (proc, mem, etc.)

100108

A continuation of the work started in 100106.
\b\*.* I don't know/remember what these files represent
\tracelogs\*.* has the data captured by the code spitting to the local logs and then harvested.
\perflogs\*.* has the data from the various perf counters (proc, mem, etc.)

100106

Performed a collection of workflow tasks in Azure (NetCDF - flattened - image generated, table loaded, etc.). These reports and data files show a number of counter values collected during this time, from perf mon values to inter-Azure transfer rates, etc. Much of this data ended up in the CodeMash presentation. For specific queries used, check the EverNote logs from around that time period.
\tracelogs\*.* has the data captured by the code spitting to the local logs and then harvested.
\perflogs\*.* has the data from the various perf counters (proc, mem, etc.)

091221

More uploading of NetCDF files to Azure.
7_Headers.txt - large file of HTTP headers captured during the transfers. Has file size (length) and xfer duration data.
DirListing.txt - source files being used for the transfer
Reports.csv - raw data collected during the transfer tests
Reports.xlsx - Calculations and basic charts from results
*.png - charts illustrating various aspects of the results.

091218

I was uploading files to Azure and timing them. These were NetCDF files from IPCC.
ConsoleOutput.txt - Console dump of the testing operations
DirListing.txt - shows the files in the directory that I was attempting to upload
UploadTransferReportRaw.csv - simple CSV data of transfer details (file, size, duration, MS)
UploadTransferReports.xlsx - calculations and basic charts detailing transfer results
UploadTransferReports01.csv - Same as UploadTransferReportsRaw.csv but includes seconds, bits, rate fields, etc. (xfer fields).
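
The derived xfer fields can be reproduced from the raw size and duration columns roughly as follows. This is a sketch under assumptions, not the original calculation: it assumes size is recorded in bytes and duration in milliseconds, reports the rate in decimal megabits per second, and the function and field names are hypothetical.

```python
def derive_transfer_fields(size_bytes, duration_ms):
    """Compute seconds, bits and rate (Mbps) from a raw size/duration pair."""
    seconds = duration_ms / 1000.0
    bits = size_bytes * 8
    # Guard against zero-length durations rather than dividing by zero.
    rate_mbps = bits / seconds / 1_000_000 if seconds > 0 else float("nan")
    return {"seconds": seconds, "bits": bits, "rate_mbps": rate_mbps}


# Example: a 10,000,000-byte file transferred in 2,000 ms.
print(derive_transfer_fields(10_000_000, 2000))
# {'seconds': 2.0, 'bits': 80000000, 'rate_mbps': 40.0}
```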