CERN
TERENA Lisbon
The Grid Project
Fabrizio Gagliardi
CERN
Information Technology Division
May, 2000
F.Gagliardi@cern.ch
F. Gagliardi - CERN/IT-May-2000
Summary
High Energy Physics and the CERN computing problem
An excellent computing model: the GRID
The Data Grid Initiative (http://www.cern.ch/grid/)
CERN organization
Largest Particle Physics lab in the world
European International Center for Particle Physics Research
Budget: 1020 M CHF
2700 staff
7000 physicist users
The LHC Detectors: CMS, ATLAS, LHCb
3.5 PetaBytes / year
~10^8 events/year
The HEP Problem - Part I
The scale...
Estimated CPU Capacity at CERN
[Chart: estimated CPU capacity at CERN, 1998-2006, in K SI95 (scale 0 to 2,500). Non-LHC and LHC demand are plotted against the technology-price curve (40% annual price improvement), with an annotation of ~10K SI95 / 1200 processors, and the capacity that can be purchased for the value of the equipment present in 2000.]
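The 40% annual price improvement behind the curve above can be turned into a quick projection. A minimal sketch, assuming the ~10K SI95 figure for 2000 taken from the chart; the later values it prints are illustrative extrapolations, not the deck's own estimates:

```python
# Capacity purchasable at a fixed budget when the price per unit of
# capacity falls 40% each year: the same money buys 1/0.6 ~ 1.67x
# more capacity every year.
def projected_capacity(base_capacity, base_year, target_year, annual_price_drop=0.40):
    years = target_year - base_year
    return base_capacity / ((1.0 - annual_price_drop) ** years)

cap_2000 = 10_000  # ~10K SI95 in 2000, from the chart
for year in (2000, 2003, 2006):
    print(year, round(projected_capacity(cap_2000, 2000, year)))
```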
Estimated DISK Capacity at CERN
[Chart: estimated disk capacity at CERN, 1998-2006, in TeraBytes (scale 0 to 1,800). Non-LHC and LHC demand are plotted against the technology-price curve (40% annual price improvement).]
Long Term Tape Storage Estimates
[Chart: long-term tape storage estimates, 1995-2006, in TeraBytes (scale 0 to 14,000), for current experiments, COMPASS, and LHC.]
HPC or HTC
High Throughput Computing
mass of modest, independent problems
computing in parallel – not parallel computing
throughput rather than single-program performance
resilience rather than total system reliability
Have learned to exploit inexpensive mass-market components
But we need to marry these with inexpensive, highly scalable management tools
Much in common with other sciences (see EU-US Annapolis Workshop at www.cacr.caltech.edu/euus): Astronomy, Earth Observation, Bioinformatics, and commercial/industrial: data mining, Internet computing, e-commerce facilities, …
Contrast with supercomputing
Generic component model of a computing farm: network servers, tape servers, disk servers, application servers
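The high-throughput model above (many modest, independent problems, with total throughput mattering more than any single job's speed) can be sketched as a shared job queue drained by workers. A hypothetical illustration only; `run_job` is a stand-in for processing one independent event:

```python
import queue
import threading

def run_job(event_id):
    # Stand-in for processing one independent physics event.
    return event_id * 2

def worker(jobs, results):
    # Each worker pulls jobs until the queue is empty; jobs are
    # independent, so no coordination between workers is needed.
    while True:
        try:
            event_id = jobs.get_nowait()
        except queue.Empty:
            return
        results.append(run_job(event_id))

jobs = queue.Queue()
for event_id in range(100):
    jobs.put(event_id)

results = []
threads = [threading.Thread(target=worker, args=(jobs, results)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(results))  # all 100 independent jobs completed
```

Losing one worker only slows the drain of the queue, which is the resilience-over-total-reliability point made above.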
The HEP Problem - Part II
Geography, Sociology, Funding and Politics...
CMS: 1800 physicists, 150 institutes, 32 countries
World Wide Collaboration: distributed computing & storage capacity
Regional Centres - a Multi-Tier Model
[Diagram: CERN – Tier 0 at the top; Tier 1 centres (FNAL, RAL, IN2P3) linked at 2.5 Gbps / 622 Mbps; Tier 2 sites (Lab a, Uni b, Lab c, … Uni n) linked at 622 Mbps / 155 Mbps; Department and Desktop levels below at 155 Mbps.]
MONARC report: http://home.cern.ch/~barone/monarc/RCArchitecture.html
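The tiered layout can be represented as a simple tree. The placement of specific sites below Tier 1 here is illustrative, not taken from the MONARC report:

```python
# Multi-tier regional-centre hierarchy as a nested dict:
# Tier 0 at the root, Tier 1 centres below it, Tier 2 sites below those.
tiers = {
    "CERN (Tier 0)": {
        "FNAL (Tier 1)": {},
        "RAL (Tier 1)": {},
        "IN2P3 (Tier 1)": {"Lab a (Tier 2)": {}, "Uni b (Tier 2)": {}},
    }
}

def walk(node, depth=0):
    # Flatten the tree into indented lines, one per site.
    lines = []
    for name, children in node.items():
        lines.append("  " * depth + name)
        lines.extend(walk(children, depth + 1))
    return lines

print("\n".join(walk(tiers)))
```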
Are Grids a solution?
Change of orientation of US Meta-computing activity
From inter-connected super-computers … towards a more general concept of a computational Grid (The Grid – Ian Foster, Carl Kesselman)
Has initiated a flurry of activity in HEP:
US – Particle Physics Data Grid (PPDG)
GriPhyN – data grid proposal submitted to NSF
Grid technology evaluation project in INFN
UK proposal for funding for a prototype grid
NASA Information Processing Grid
The Grid
“Dependable, consistent, pervasive access to [high-end] resources”
• Dependable:
• provides performance and functionality guarantees
• Consistent:
• uniform interfaces to a wide variety of resources
• Pervasive:
• ability to “plug in” from anywhere
R&D required
Local fabric
Management of giant computing fabrics: auto-installation, configuration management, resilience, self-healing
Mass storage management: multi-PetaByte data storage, "real-time" data recording requirement, active tape layer – 1,000s of users
Wide-area - building on an existing framework & RN (e.g. Globus, Geant)
workload management: no central status, local access policies
data management: caching, replication, synchronisation, object database model
application monitoring
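The caching and replication ideas in the data-management item can be sketched as a replica lookup that prefers a locally cached copy and falls back to a remote replica. All names and catalogue contents below are hypothetical illustrations, not Globus or any real grid middleware:

```python
# Which sites hold a replica of each dataset (hypothetical contents).
replica_catalogue = {
    "run42/events.db": ["cern", "fnal", "ral"],
}
# Datasets already cached at the local site: no WAN transfer needed.
local_cache = {"run42/events.db": "cern"}

def locate(dataset, preferred_site="cern"):
    # Prefer the local cache, then the preferred site, then any replica.
    if dataset in local_cache:
        return local_cache[dataset]
    sites = replica_catalogue.get(dataset, [])
    if preferred_site in sites:
        return preferred_site
    return sites[0] if sites else None

print(locate("run42/events.db"))  # -> cern
```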
HEP Data Grid Initiative
European level coordination of national initiatives & projects
Principal goals:
Middleware for fabric & Grid management
Large scale testbed - major fraction of one LHC experiment
Production quality HEP demonstrations: "mock data", simulation analysis, current experiments
Other science demonstrations
Three year phased developments & demos
Complementary to other GRID projects
EuroGrid: uniform access to parallel supercomputing resources
Synergy to be developed (GRID Forum, Industry and Research Forum)
Participants
Main partners: CERN, INFN (I), CNRS (F), PPARC (UK), NIKHEF (NL), ESA-Earth Observation
Other sciences: KNMI (NL), Biology, Medicine
Industrial participation: CS SI (F), DataMat (I), IBM (UK)
Associated partners: Czech Republic, Finland, Germany, Hungary, Spain, Sweden (mostly computer scientists)
Formal collaboration with USA
Industry and Research Project Forum with representatives from: Denmark, Greece, Israel, Japan, Norway, Poland, Portugal, Russia
Status
Prototype work already started at CERN and in most of the collaborating institutes
Proposal to RN2 submitted
Network requirements discussed with Dante/Geant
WAN Requirements
High bandwidth from CERN to Tier 1 centres (5-6)
VPN, Quality of Service
Guaranteed performance during limited test periods and at the end of the project for production quality services
Target requirements (2003): 2.5 Gb/s + 622 Mb/s + 155 Mb/s
Could saturate 2.5 Gb/s for a limited amount of test time (100 MB/s out from a 100 PC farm; we plan for farms of 1000s of PCs)
Reliability is an important factor: from the WEB client-server model to the GRID peer distributed computing model
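The saturation claim follows from back-of-envelope arithmetic: "100 MB/s out from a 100 PC farm" implies roughly 1 MB/s per PC (an inference from the slide, not stated explicitly), so a farm of 1000s of PCs exceeds the 2.5 Gb/s target link:

```python
# Convert farm output to link load: n PCs, each pushing ~1 MB/s.
def farm_output_gbps(n_pcs, mb_per_s_per_pc=1.0):
    return n_pcs * mb_per_s_per_pc * 8 / 1000.0  # 8 bits/byte, 1000 Mb per Gb

link_gbps = 2.5
print(farm_output_gbps(100))               # 100-PC farm: 0.8 Gb/s
print(farm_output_gbps(1000) > link_gbps)  # True: a 1000-PC farm saturates 2.5 Gb/s
```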
Conclusions
This project, motivated by HEP and other sciences with demanding data and computing needs, will contribute to developing and implementing a new world-wide distributed computing model: the GRID
An ideal computing model for the next generation Internet
An excellent test case for the next generation of high-performance research networks