TRANSCRIPT
Harvey B. Newman, Caltech
Internet2 Virtual Member Meeting
March 19, 2002
http://l3www.cern.ch/~newman/HENPGridsNets_I2Virt031902.ppt
HENP Grids and Networks: Global Virtual Organizations
Computing Challenges: Petabytes, Petaflops, Global VOs
Geographical dispersion: of people and resources
Complexity: the detector and the LHC environment
Scale: Tens of Petabytes per year of data
5000+ Physicists 250+ Institutes 60+ Countries
Major challenges associated with:
Communication and collaboration at a distance
Managing globally distributed computing & data resources
Remote software development and physics analysis
R&D: New Forms of Distributed Systems: Data Grids
The Large Hadron Collider (2006-)
The Next-generation Particle Collider; the largest superconducting installation in the world
Bunch-bunch collisions at 40 MHz, each generating ~20 interactions
Only one in a trillion may lead to a major physics discovery
Real-time data filtering: Petabytes per second to Gigabytes per second
Accumulated data of many Petabytes/Year
Large data samples explored and analyzed by thousands of globally dispersed scientists, in hundreds of teams
Four LHC Experiments: The Petabyte to Exabyte Challenge
ATLAS, CMS, ALICE, LHCb
Higgs + New particles; Quark-Gluon Plasma; CP Violation
Data stored: ~40 Petabytes/Year and UP; CPU: 0.30 Petaflops and UP
0.1 to 1 Exabyte (1 EB = 10^18 Bytes) (2007) (~2012?) for the LHC Experiments
All charged tracks with pt > 2 GeV
Reconstructed tracks with pt > 25 GeV
(+30 minimum bias events)
10^9 events/sec, selectivity: 1 in 10^13 (1 person in a thousand world populations)
LHC: Higgs Decay into 4 muons (Tracker only); 1000X LEP Data Rate
Evidence for the Higgs at LEP at M~115 GeV
The LEP Program Has Now Ended
LHC Data Grid Hierarchy
[Diagram: the LHC Data Grid Hierarchy. The Experiment feeds the Online System at ~PByte/sec; ~100-400 MBytes/sec flow into the Tier 0+1 center at CERN (700k SI95, ~1 PB disk, tape robot). Tier 1 centers (FNAL: 200k SI95, 600 TB; IN2P3, INFN, RAL) connect at ~2.5 Gbps; Tier 2 centers at ~2.5 Gbps; Tier 3 institutes (~0.25 TIPS) at 100-1000 Mbits/sec; Tier 4 workstations hold physics data caches. Physicists work on analysis "channels"; each institute has ~10 physicists working on one or more channels. CERN/Outside resource ratio ~1:2; Tier0 : (ΣTier1) : (ΣTier2) ~1:1:1.]
HENP Related Data Grid Projects

Project        Location   Funder   Funding          Period
PPDG I         USA        DOE      $2M              1999-2001
GriPhyN        USA        NSF      $11.9M + $1.6M   2000-2005
EU DataGrid    EU         EC       €10M             2001-2004
PPDG II (CP)   USA        DOE      $9.5M            2001-2004
iVDGL          USA        NSF      $13.7M + $2M     2001-2006
DataTAG        EU         EC       €4M              2002-2004
GridPP         UK         PPARC    >$15M            2001-2004
LCG (Ph1)      CERN MS             30 MCHF          2002-2004

Many Other Projects of interest to HENP
Initiatives in US, UK, Italy, France, NL, Germany, Japan, …
US and EU networking initiatives: AMPATH, I2, DataTAG
US Distributed Terascale Facility: ($53M, 12 TeraFlops, 40 Gb/s network)
CMS Production: Event Simulation and Reconstruction
"Grid-Enabled", Automated
[Map: worldwide production at 20 sites, including Imperial College, UFL, Caltech, Helsinki, IN2P3, Wisconsin, Bristol, UCSD, INFN (9), Moscow, FNAL and CERN. Sites run simulation and digitization (with and without pile-up, PU) using common production tools (IMPALA) and GDMP; status ranges from fully operational to in progress.]
Next Generation Networks for Experiments: Goals and Needs
Providing rapid access to event samples and subsets from massive data stores: from ~400 Terabytes in 2001, ~Petabytes by 2002, ~100 Petabytes by 2007, to ~1 Exabyte by ~2012.
Providing analyzed results with rapid turnaround, by coordinating and managing the LIMITED computing, data handling and NETWORK resources effectively
Enabling rapid access to the data and the collaboration, across an ensemble of networks of varying capability
Advanced integrated applications, such as Data Grids, rely on seamless operation of our LANs and WANs, with reliable, quantifiable (monitored), high performance, for "Grid-enabled" event processing and data analysis, and collaboration
Baseline BW for the US-CERN Link: HENP Transatlantic WG (DOE+NSF)
US-CERN Link: 2 X 155 Mbps now; Plans: 622 Mbps in April 2002;
DataTAG 2.5 Gbps Research Link in Summer 2002; 10 Gbps Research Link in ~2003 or Early 2004
Transoceanic networking integrated with the Abilene, TeraGrid, regional nets and continental network infrastructures in US, Europe, Asia, South America
Evolution typical of major HENP links 2001-2006
Transatlantic Net WG (HN, L. Price) Bandwidth Requirements [*] (Mbps)

            2001     2002   2003   2004   2005    2006
CMS          100      200    300    600    800    2500
ATLAS         50      100    300    600    800    2500
BaBar        300      600   1100   1600   2300    3000
CDF          100      300    400   2000   3000    6000
D0           400     1600   2400   3200   6400    8000
BTeV          20       40    100    200    300     500
DESY         100      180    210    240    270     300
CERN BW  155-310      622   1250   2500   5000   10000

[*] Installed BW. Maximum link occupancy of 50% assumed.
See http://gate.hep.anl.gov/lprice/TAN
Links Required to US Labs and Transatlantic [*]
[*] Maximum link occupancy of 50% assumed

           2001      2002       2003       2004      2005       2006
SLAC       OC12      2 X OC12   2 X OC12   OC48      OC48       2 X OC48
BNL        OC12      2 X OC12   2 X OC12   OC48      OC48       2 X OC48
FNAL       OC12      OC48       2 X OC48   OC192     OC192      2 X OC192
US-CERN    2 X OC3   OC12       2 X OC12   OC48      2 X OC48   OC192
US-DESY    OC3       2 X OC3    2 X OC3    2 X OC3   2 X OC3    OC12
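The sizing rule behind the two tables above is simple: install at least twice the required average throughput (50% maximum occupancy), then round up to whole standard SONET circuits. The sketch below illustrates that arithmetic; the rounded OC rates, function name and example values are our assumptions for illustration, not figures from the talk.

```python
# Illustrative sketch of the provisioning rule assumed above:
# installed capacity >= 2 x required throughput (50% max occupancy),
# rounded up to a whole number of standard SONET circuits.
OC_RATES_MBPS = {"OC3": 155, "OC12": 622, "OC48": 2488, "OC192": 9953}

def links_needed(required_mbps, circuit="OC12", max_occupancy=0.5):
    """Number of circuits of the given type needed for the required throughput."""
    installed_mbps = required_mbps / max_occupancy   # 2x headroom at 50% occupancy
    capacity = OC_RATES_MBPS[circuit]
    return int(-(-installed_mbps // capacity))       # ceiling division

# Example: a requirement of ~600 Mbps (CMS, 2004) implies ~1200 Mbps installed,
# which fits in a single OC48.
print(links_needed(600, "OC48"))   # 1
```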
Daily, Weekly, Monthly and Yearly Statistics on 155 Mbps US-CERN Link
20 - 100 Mbps used routinely in '01; BaBar: 600 Mbps throughput in '02
BW upgrades quickly followed by upgraded production use
RNP Brazil (to 20 Mbps)
STARTAP/Abilene OC3 (to 80 Mbps)
FIU Miami (to 80 Mbps)
Total U.S. Internet Traffic
Source: Roberts et al., 2001
[Chart: U.S. Internet traffic, 1970-2010, on a logarithmic scale from ~10 bps to ~1 Pbps. ARPA & NSF data to '96 plus new measurements show growth of roughly 2.8X-4X per year; traffic crossed over voice in August 2000; growth projected at 4X/year up to the limit of the same % of GDP as voice.]
AMS-IX Internet Exchange Throughput: Accelerating Growth in Europe (NL)
[Charts: hourly traffic reached 6.0 Gbps on 2/03/02; monthly traffic shows 2X growth from 8/00 to 3/01 and 2X growth from 8/01 to 12/01.]
ICFA SCIC December 2001: Backbone and Int'l Link Progress
Abilene upgrade from 2.5 to 10 Gbps; additional lambdas on demand planned for targeted applications
GEANT Pan-European backbone (http://www.dante.net/geant) now interconnects 31 countries; includes many trunks at OC48 and OC192
CA*net4: Interconnect customer-owned dark fiber nets across Canada, starting in 2003
GWIN (Germany): Connection to Abilene to 2 X OC48 expected in 2002
SuperSINET (Japan): Two OC12 links, to Chicago and Seattle; plan upgrade to 2 X OC48 connection to US West Coast in 2003
RENATER (France): Connected to GEANT at OC48; CERN link to OC12
SuperJANET4 (UK): Mostly OC48 links, connected to academic MANs typically at OC48 (http://www.superjanet4.net)
US-CERN link (2 X OC3 now) to OC12 this Spring; OC192 by ~2005; DataTAG research link OC48 Summer 2002; to OC192 in 2003-4
SURFNet (NL) link to US at OC48
ESnet 2 X OC12 backbone, with OC12 to HEP labs; plans to connect to STARLIGHT using Gigabit Ethernet
Key Network Issues & Challenges
Net Infrastructure Requirements for High Throughput
Packet loss must be ~zero (well below 0.01%); i.e. no "commodity" networks; need to track down non-congestive packet loss
No local infrastructure bottlenecks
Gigabit Ethernet "clear paths" between selected host pairs are needed now; to 10 Gbps Ethernet by ~2003 or 2004
TCP/IP stack configuration and tuning absolutely required: large windows, possibly multiple streams (see the sketch below)
New concepts of Fair Use must then be developed
Careful router configuration; monitoring
Server and client CPU, I/O and NIC throughput sufficient
End-to-end monitoring and tracking of performance; close collaboration with local and "regional" network staffs
TCP does not scale to the 1-10 Gbps range
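To make "large windows" concrete: a single TCP stream can only fill a long transatlantic path if its window covers the bandwidth-delay product. The Python sketch below shows the idea; the function name, target rate and RTT are illustrative assumptions, not values from the talk.

```python
# Illustrative sketch (not from the talk): sizing TCP buffers so one stream
# can fill a long fat pipe. The window must cover bandwidth * RTT.
import socket

def tuned_socket(target_gbps=1.0, rtt_s=0.120):
    """Create a TCP socket with send/receive buffers sized to the
    bandwidth-delay product (e.g. 1 Gbps over a 120 ms transatlantic RTT)."""
    bdp_bytes = int(target_gbps * 1e9 / 8 * rtt_s)   # ~15 MB for this example
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The OS must also permit windows this large
    # (e.g. Linux net.ipv4.tcp_rmem / tcp_wmem limits).
    s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, bdp_bytes)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, bdp_bytes)
    return s
```

When a single tuned stream still cannot reach the target rate, the same buffer budget can be split across multiple parallel streams, which is the "possibly multiple streams" option noted above.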
201 Primary Participants; all 50 States, D.C. and Puerto Rico
75 Partner Corporations and Non-Profits
14 State Research and Education Nets
15 "GigaPoPs" support 70% of Members
2.5 Gbps Backbone
Rapid Advances of Nat'l Backbones: Next Generation Abilene
Abilene partnership with Qwest extended through 2006
Backbone to be upgraded to 10 Gbps in phases, to be completed by October 2003
GigaPoP upgrade started in February 2002
Capability for flexible provisioning in support of future experimentation in optical networking, in a multi-λ (wavelength) infrastructure
National R&E Network Example. Germany: DFN TransAtlantic Connectivity Q1 2002
[Diagram: DFN transatlantic links at STM-4 and STM-16.]
2 X OC12 now: NY-Hamburg and NY-Frankfurt
ESNet peering at 34 Mbps
Upgrade to 2 X OC48 expected in Q1 2002
Direct peering to Abilene and Canarie expected
UCAID will add (?) another 2 OC48's; proposing a Global Terabit Research Network (GTRN)
FSU connections via satellite: Yerevan, Minsk, Almaty, Baikal; speeds of 32 - 512 kbps
SILK Project (2002): NATO funding; links to Caucasus and Central Asia (8 countries)
Currently 64-512 kbps; propose VSAT for 10-50 X BW: NATO + State funding
National Research Networks in Japan
SuperSINET: Started operation January 4, 2002
Support for 5 important areas: HEP, Genetics, Nano-Technology, Space/Astronomy, GRIDs
Provides 10 λ's: 10 Gbps IP connection; direct intersite GbE links; some connections to 10 GbE in JFY2002
HEPnet-J: Will be re-constructed with MPLS-VPN in SuperSINET
Proposal: Two TransPacific 2.5 Gbps wavelengths, and KEK-CERN Grid Testbed
[Map: SuperSINET topology, with WDM paths, IP routers and optical cross-connects (OXC) linking Tokyo, Osaka and Nagoya hubs to the Internet and to sites including KEK, ISAS, NAO, NIFS, NIG, IMS, ICR Kyoto-U, U Tokyo, Tohoku U, Nagoya U, Kyoto U, Osaka U, NII Hitotsubashi and NII Chiba.]
TeraGrid (www.teragrid.org): NCSA, ANL, SDSC, Caltech
[Map: TeraGrid sites (NCSA/UIUC, ANL, SDSC, Caltech) connected by a DTF backplane (4x: 40 Gbps) through multiple carrier hubs at Starlight/NW Univ, UIC, Ill Inst of Tech, Univ of Chicago and Indianapolis (Abilene NOC). Links include OC-48 (2.5 Gb/s, Abilene), multiple 10 GbE on Qwest, and multiple 10 GbE on I-WIRE dark fiber (Chicago, Indianapolis, Urbana; Pasadena, San Diego). Solid lines in place and/or available in 2001; dashed I-WIRE lines planned for Summer 2002. Source: Charlie Catlett, Argonne.]
A Preview of the Grid Hierarchy and Networks of the LHC Era
[*] See "The Macroscopic Behavior of the TCP Congestion Avoidance Algorithm," Mathis, Semke, Mahdavi, Ott, Computer Communication Review 27(3), 7/1997
Throughput quality improvements: BW_TCP < MSS / (RTT * sqrt(loss)) [*]
[Chart: TCP throughput quality by world region, improving at ~80% per year (a factor of 10 in 4 years); recent improvement in China; Eastern Europe keeping up.]
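For a sense of scale, the bound above can be evaluated directly. The following sketch is our illustration, with assumed MSS, RTT and loss values; it shows why loss rates well below 0.01% are needed for Gbps-class single-stream transfers.

```python
# Illustrative evaluation of the Mathis et al. bound quoted above:
#   BW_TCP < MSS / (RTT * sqrt(loss))
from math import sqrt

def tcp_bound_mbps(mss_bytes=1460, rtt_s=0.120, loss=1e-4):
    """Upper bound on single-stream TCP throughput in Mbps."""
    return (mss_bytes * 8 / 1e6) / (rtt_s * sqrt(loss))

# With 0.01% loss on a 120 ms transatlantic RTT, one stream is capped near
# ~10 Mbps, far below the Gbps rates the experiments require.
print(f"{tcp_bound_mbps():.1f} Mbps")
```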
Internet2 HENP WG [*]
Mission: To help ensure that the required national and international network infrastructures (end-to-end), standardized tools and facilities for high performance and end-to-end monitoring and tracking, and collaborative systems are developed and deployed in a timely manner, and used effectively to meet the needs of the US LHC and other major HENP programs, as well as the at-large scientific community.
To carry out these developments in a way that is broadly applicable across many fields
Formed an Internet2 WG as a suitable framework: Oct. 26, 2001
[*] Co-Chairs: S. McKee (Michigan), H. Newman (Caltech); Sec'y J. Williams (Indiana)
Website: http://www.internet2.edu/henp; also see the Internet2 End-to-end Initiative: http://www.internet2.edu/e2e
True End to End Experience
User perception depends on the whole path: the application, operating system, host IP stack, host network card, Local Area Network, campus backbone network, campus link to the regional network/GigaPoP, GigaPoP link to the Internet2 national backbones, and international connections.
[Cartoon: the end-to-end path from "eyeball" through application, stack, jack and network to the far end.]
Networks, Grids and HENP
Grids are starting to change the way we do science and engineering
Successful use of Grids relies on high performance national and international networks
Next generation 10 Gbps network backbones are almost here: in the US, Europe and Japan; first stages arriving in 6-12 months
Major transoceanic links at 2.5 - 10 Gbps within 0-18 months
Network improvements are especially needed in South America and some other world regions: Brazil, Chile; India, Pakistan, China; Africa and elsewhere
Removing regional and last-mile bottlenecks, and compromises in network quality: these are now all on the critical path
Getting high (reliable) Grid performance across networks means: end-to-end monitoring and a coherent approach; getting high performance (TCP) toolkits in users' hands; working in concert with Internet2 E2E, the I2 HENP WG and DataTAG; working with the Grid projects and GGF