An Analysis of Bulk Data Movement Patterns in Large-scale Scientific Collaborations
W. Wu, P. DeMar, A. Bobyshev, Fermilab
CHEP 2010, Taipei, Taiwan


Page 1:

An Analysis of Bulk Data Movement Patterns in Large-scale Scientific Collaborations

W. Wu, P. DeMar, A. Bobyshev
Fermilab

CHEP 2010, Taipei, Taiwan
[email protected]; [email protected]; [email protected]

Page 2:

Topics

1. Background
2. Problems
3. Fermilab Flow Data Collection & Analysis System
4. Bulk Data Movement Pattern Analysis
   • Data analysis methodology
   • Bulk data movement patterns
5. Questions

Page 3:

1. Background

• Large-scale research efforts such as the LHC experiments, ITER, and climate modeling are built upon large, globally distributed collaborations.
  – They depend on predictable and efficient bulk data movement between collaboration sites.
• Industry and scientific institutions have created data centers comprising hundreds, and even thousands, of computation nodes to develop massively scaled, highly distributed cluster computing platforms.
• Existing transport protocols such as TCP do not perform well over LFPs (long fat pipes, i.e., paths with a large bandwidth-delay product).
• Parallel data transmission tools such as GridFTP have been widely applied to bulk data movement.

Page 4:

2. Problems

• What is the current status of bulk data movement in support of Large-scale Scientific Collaborations?

• What are the bulk data movement patterns in Large-scale Scientific Collaborations?

Page 5:

3. Fermilab Flow Data Collection & Analysis System

Page 6:

Flow-based analysis produces data that are more fine-grained than those provided by SNMP, but not as detailed or high-volume as required for packet trace analysis.

Fermilab Flow Data Collection and Analysis System

• We have already deployed Cisco NetFlow and CMU’s SiLK toolset for network traffic analysis.

• Cisco NetFlow is running at Fermilab’s border router with no sampling configured. It exports flow records to our SiLK traffic analysis system.

• The SiLK analysis suite is a collection of command-line tools for processing flow records. The tools are intended to be combined in various ways to perform an analysis task.
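As a rough illustration (not a record of our exact commands), the sketch below drives an rwfilter-to-rwstats pipeline from Python to rank the top external talkers; the repository type name and date range are assumptions that depend on the local SiLK site configuration.

```python
import subprocess

# Illustrative SiLK pipeline: select inbound TCP flow records for the study
# window, then rank the top 100 external source addresses by bytes.
# The --type value is site-specific; adjust it to the local SiLK configuration.
rwfilter = subprocess.Popen(
    ["rwfilter",
     "--start-date=2009/11/11", "--end-date=2009/12/23",
     "--type=in",          # inbound records (site-specific type name)
     "--proto=6",          # TCP only
     "--pass=stdout"],
    stdout=subprocess.PIPE)

rwstats = subprocess.run(
    ["rwstats",
     "--fields=sip",       # group by source address
     "--values=bytes",     # rank by byte count
     "--count=100",        # keep the top 100
     "--top"],
    stdin=rwfilter.stdout, capture_output=True, text=True)
rwfilter.stdout.close()

print(rwstats.stdout)      # tabular "top talkers" report
```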

Page 7:

4. Bulk Data Movement Pattern Analysis

Page 8:

Data Analysis Methodology

• We analyzed the traffic flow records from 11/11/2009 to 12/23/2009.
  – The total flow record database has a size of 60 GBytes, with 2,679,598,772 flow records.
  – In total, 23,764,073 GBytes of data and 2.221x10^12 packets were transferred between Fermilab and other sites.
• We only analyzed TCP bulk data transfers.
• We analyzed data transfers between Fermilab and /24 subnets (a sketch of the ranking step follows this list):
  – The TOP 100 /24 sites that transfer to FNAL.
  – The TOP 100 /24 sites that transfer from FNAL.
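A minimal sketch of the /24 ranking step, assuming flow records have already been reduced to (remote_ip, bytes) pairs; the record layout and function name are illustrative, not the exact tooling used.

```python
from collections import defaultdict
import ipaddress

def top_100_subnets(flows):
    """Rank remote /24 subnets by total bytes transferred.

    `flows` is assumed to be an iterable of (remote_ip, bytes) pairs
    extracted from the flow records; the layout is hypothetical and only
    meant to illustrate the grouping step.
    """
    bytes_by_subnet = defaultdict(int)
    for remote_ip, nbytes in flows:
        subnet = ipaddress.ip_network(f"{remote_ip}/24", strict=False)
        bytes_by_subnet[subnet] += nbytes
    # Sort subnets by descending byte count and keep the top 100.
    ranked = sorted(bytes_by_subnet.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:100]

# Example: two flows falling into the same /24 are aggregated together.
print(top_100_subnets([("131.225.1.10", 5_000_000), ("131.225.1.77", 3_000_000)]))
```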

Page 9:

Bulk Data Movement Patterns & Status

Page 10:

TOP 100 Sites
[Charts: TOP 20 sites that transfer from FNAL; TOP 20 sites that transfer to FNAL]
• We analyzed data transfers between Fermilab and /24 subnets.
  – In the IN direction, the TOP 100 sites carry 99.04% of the traffic.
  – In the OUT direction, the TOP 100 sites carry 95.69% of the traffic.

Page 11:

TOP 100 Sites (Cont.)
• The TOP 100 sites are located around the world.
• We collected and calculated the Round Trip Time (RTT) between FNAL and these TOP 100 sites in both the IN and OUT directions.

[Maps of RTTs in the IN and OUT directions, with labels for N. American sites, Europe, S. America, parts of Asia, Indore (India), Troitsk (Russia), and Beijing (China)]
A few sites have circuitous paths to/from FNAL. Many factors can contribute: lack of peering, traffic engineering, or physical limitations.

Page 12:

Circuitous Paths
[Figure: traceroute from FNAL to Indore, India, with hops in Chicago IL, Kansas, Denver, Sunnyvale, and Seattle (USA), then Tokyo (JP) and Singapore (SG)]

Page 13:

Single Flow Throughputs
• We calculated statistics for single-flow throughputs between FNAL and the TOP 100 sites in both the IN and OUT directions.
  – Each flow record includes data such as the number of packets and bytes in the flow and the timestamps of the first and last packet.
• A few concerns (a filtering sketch follows the figure note below):
  – Flow records for pure ACKs of the reverse path should be excluded.
  – A bulk data movement usually involves frequent administrative message exchanges between sites. A significant number of flow records are generated by these activities. They usually contain a small number of packets with short durations, so the calculated throughputs are inaccurate and vary greatly. These flow records should also be excluded from the throughput calculation.

[Histograms of packet size and flow size; annotations mark pure ACKs and the 1500 B Ethernet payload limit]
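The sketch below illustrates this kind of filtering on a single flow record, assuming fields for packets, bytes, and first/last timestamps; the exclusion thresholds (minimum packets, minimum duration, ACK-sized average packets) are illustrative assumptions rather than the exact cutoffs used in the study.

```python
def single_flow_throughput_mbps(packets, nbytes, start_ts, end_ts,
                                min_packets=100, min_duration=1.0):
    """Return throughput in Mbps, or None if the flow should be excluded.

    Exclusion rules sketched from the concerns above (thresholds are
    illustrative assumptions):
      * pure-ACK flows of the reverse path: average packet size ~40-64 B;
      * short administrative exchanges: few packets, short duration.
    """
    duration = end_ts - start_ts
    if packets < min_packets or duration < min_duration:
        return None                      # administrative / control traffic
    avg_packet_size = nbytes / packets
    if avg_packet_size < 100:            # pure ACKs carry no payload
        return None
    return (nbytes * 8) / (duration * 1e6)

# Example: a 2 GB transfer lasting 30 minutes averages ~8.9 Mbps.
print(single_flow_throughput_mbps(1_400_000, 2_000_000_000, 0.0, 1800.0))
```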

Page 14:

Single Flow Throughputs (cont)

The slowest throughput is 0.207 Mbps (Indore, India).
Most average throughputs are less than 10 Mbps, even though 1 Gbps NICs are widely deployed!
The slowest throughput is 0.135 Mbps (Houston, TX)!
Either the path from FNAL to Houston is highly congested, some network equipment is malfunctioning, or the systems at the Houston site are badly configured.

Page 15:

Single Flow Throughputs (cont.)
• In the IN direction
  – 2 sites' average throughputs are less than 1 Mbps: "National Institute of Nuclear Physics, Italy" and "Indore, India".
  – 63 sites' average throughputs are less than 10 Mbps.
  – Only 1 site's average throughput is greater than 100 Mbps.
• In the OUT direction
  – 7 sites' average throughputs are less than 1 Mbps: "Indore, India", "Kurchatov Institute, Russia", "University of Ioannina, Greece", "IHP, Russia", "ITEP, Russia", "LHC Computing Grid LAN, Russia", "Rice University, TX".
  – 60 sites' average throughputs are less than 10 Mbps.
  – No site's average throughput is greater than 100 Mbps.
TCP throughput <= ~0.7 * MSS / (RTT * sqrt(packet_loss))
We calculated the correlation between average throughput and RTT:
• In the IN direction, it is -0.426329.
• In the OUT direction, it is -0.37256.
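To make the formula concrete, the worked example below plugs in assumed (not measured) numbers: a 1460-byte MSS, a 150 ms intercontinental RTT, and a 0.01% packet loss rate; even that small loss rate caps a single TCP stream well below the capacity of a 1 Gbps NIC.

```python
import math

def mathis_throughput_mbps(mss_bytes, rtt_s, loss):
    """Upper bound on single-stream TCP throughput (Mathis et al. approximation)."""
    return 0.7 * mss_bytes * 8 / (rtt_s * math.sqrt(loss)) / 1e6

# Assumed values for illustration: MSS = 1460 B, RTT = 150 ms, loss = 0.01%.
# Even this tiny loss rate caps a single stream at roughly 5.5 Mbps.
print(mathis_throughput_mbps(1460, 0.150, 1e-4))   # ~5.45
```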

Page 16:

Aggregated Throughputs

• Parallel data transmission tools such as GridFTP have been widely applied to bulk data movements

• In both the IN and OUT directions, for each of the TOP 100 sites, we bin traffic at 10-minute intervals (a binning sketch follows this list):
  – We calculate the aggregated throughputs.
  – We collect the flow statistics.
  – We collect statistics on the number of different IPs (hosts) involved.
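A minimal sketch of the binning step, assuming each flow record carries a site key, a start timestamp, a byte count, and its source/destination addresses; for simplicity a flow is credited entirely to the bin in which it starts, which the real analysis need not do.

```python
from collections import defaultdict

BIN_SECONDS = 600  # 10-minute intervals

def bin_site_traffic(flows):
    """Aggregate per-site traffic into 10-minute bins.

    `flows` is assumed to be an iterable of
    (site, start_ts, nbytes, src_ip, dst_ip) tuples; the layout is
    illustrative only.
    """
    bins = defaultdict(lambda: {"bytes": 0, "flows": 0, "ips": set()})
    for site, start_ts, nbytes, src_ip, dst_ip in flows:
        key = (site, int(start_ts) // BIN_SECONDS)
        b = bins[key]
        b["bytes"] += nbytes
        b["flows"] += 1
        b["ips"].update((src_ip, dst_ip))
    # Per bin: aggregated throughput in Mbps, flow count, and host count.
    return {key: (b["bytes"] * 8 / BIN_SECONDS / 1e6, b["flows"], len(b["ips"]))
            for key, b in bins.items()}
```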

Page 17:

From TOP100 Sites to FNAL
[Histograms: aggregated throughputs, average # of flows, max # of flows]
• Some sites use thousands of parallel flows.
• In general, the aggregated throughputs are higher than the single-flow throughputs; we see the effect of parallel data transmission.

Page 18:

From TOP100 Sites to FNAL
[Histograms of correlation: aggregated throughput vs. # of flows, and aggregated throughput vs. # of source IPs; the histogram for # of destination IPs is similar]
• We calculate the correlation between aggregated throughput and the number of flows (a correlation sketch follows below).
• In general, more parallel data transmission (more flows) generates higher aggregated throughputs.
• But for some sites, more parallel data transmission generates lower aggregated throughputs:
  – More parallel data transmission causes network congestion.
  – Parallel data transmission makes disk I/O less efficient.
[Scatter plots: an example of negative correlation (College of William & Mary, USA) vs. positive correlation]
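The per-site correlation can be computed directly from the binned series; below is a minimal sketch using a plain Pearson correlation over matching 10-minute bins, with illustrative (not measured) sample values.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length series (e.g. per-bin
    aggregated throughput vs. per-bin number of flows for one site)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Illustrative bins: throughput rises with the number of parallel flows,
# giving a strongly positive correlation.
throughput_mbps = [12.0, 25.0, 40.0, 38.0, 55.0]
num_flows      = [10,   40,   90,   80,   150]
print(pearson(throughput_mbps, num_flows))   # close to +1
```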

Page 19:

From TOP100 Sites to FNAL

• In total, there are 35 sites for which the correlation between aggregated throughput and the number of flows is negative.
  – The worst case is from "College of William & Mary, USA" to FNAL; the correlation is -0.439.
• There are 31 sites for which the correlation between aggregated throughput and the number of flows is greater than 0.5, which implies that increasing the number of flows can effectively enhance the throughput.
  – The best case is from "University of Glasgow, UK" to FNAL.

Page 20:

From TOP100 Sites to FNAL

[Histograms: average and max # of source IPs, average and max # of destination IPs]
• Some sites use only a single host to transfer!
• Some sites utilize hundreds of hosts!

Page 21:

From FNAL to TOP100 Sites

[Histograms: aggregated throughputs, average # of flows, max # of flows]
• In general, the aggregated throughputs are higher than the single-flow throughputs; we see the effect of parallel data transmission.
• The transmission from FNAL to the TOP100 sites is better than the other way around.

Page 22:

From FNAL to TOP100 Sites

• We calculate the correlation between aggregated throughput and the number of flows.
• In general, more parallel data transmission (more flows) generates higher aggregated throughputs.
• But for some sites, more parallel data transmission generates lower aggregated throughputs:
  – More parallel data transmission causes network congestion.
  – Parallel data transmission makes disk I/O less efficient.
[Histograms of correlation: aggregated throughput vs. # of flows, and aggregated throughput vs. # of destination IPs; the histogram for # of source IPs is similar]

Page 23:

From FNAL to TOP100 Sites

[Histograms: average and max # of source IPs, average and max # of destination IPs]
• Some sites use only a single host to transfer!
• Some sites utilize hundreds of hosts!

Page 24:

Conclusion

• We studied the bulk data movement status for FNAL-related Large-scale Scientific Collaborations.
• We studied the bulk data movement patterns for FNAL-related Large-scale Scientific Collaborations.