Grids: Tools for e-Science
DoSon AC GRID School
November 16, 2007
Dominique Boutigny – CC-IN2P3
Main characteristics of a Grid
A grid is an architecture and a set of software tools designed to federate distributed computing resources.
Resources are in principle heterogeneous
Each node of the grid is administered locally, but there should be central coordination in order to keep the system coherent
An information system (even a very light one) should be present in order to match the computing tasks to the computing environment
The underlying network is crucial
A security and authorization system should be present
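The matching role of the information system can be illustrated with a minimal sketch. Everything in it (the `Site` and `Task` records, their attributes, the site names) is hypothetical and chosen for illustration; it is not the schema of any real grid information system.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Attributes a site might publish (hypothetical schema)."""
    name: str
    operating_system: str
    free_cores: int
    memory_gb_per_core: float


@dataclass
class Task:
    """Requirements attached to a computing task (hypothetical schema)."""
    operating_system: str
    cores: int
    memory_gb_per_core: float


def matching_sites(task, sites):
    """Return the names of sites whose published attributes satisfy the task."""
    return [
        site.name
        for site in sites
        if site.operating_system == task.operating_system
        and site.free_cores >= task.cores
        and site.memory_gb_per_core >= task.memory_gb_per_core
    ]


# Invented site records, standing in for what sites would publish.
SITES = [
    Site("site-a", "SL4", free_cores=120, memory_gb_per_core=2.0),
    Site("site-b", "SL4", free_cores=4, memory_gb_per_core=1.0),
    Site("site-c", "Debian", free_cores=500, memory_gb_per_core=4.0),
]

task = Task("SL4", cores=8, memory_gb_per_core=2.0)
candidates = matching_sites(task, SITES)
print(candidates)
```

Even a very light information system of this kind is enough to keep a task away from sites with the wrong operating system or too few free cores.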
Different kinds of production Grids
Computing Grid
Data Grid
Both Computing and Data
Molecular docking
Medical imagery
Astronomical data
LHC data processing
Grids are a good way to increase the computing power available to a scientific community by pooling resources
Grids federate scientific communities and help to build them
But
Grids are often complicated to manage – A large grid requires strong coordination between the participating sites
The LHC Computing Grid
LCG
[Figure: height scale comparing Mt. Blanc (4.8 km), Concorde (15 km), a CD stack holding 1 year of LHC data (~20 km) and a balloon (30 km)]
4 LHC experiments
15 PetaByte of data per year
We have got a problem with data
100 Million SpecInt2000
This is ~5000 of today's 8-core computers
~15 M$
Relatively easy to set up – Each CPU core is independent of the others
15 PetaByte of data per year
Today, this is ~20 M$ if you want to put it all on disk
And you also need to store the Monte Carlo simulation
Need to store data securely for the whole life of the experiments
Complicated architecture, as the data have to move worldwide
Each LHC contributor should be able to access any data
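The orders of magnitude quoted above can be cross-checked with a short calculation. The CD capacity (700 MB) and disc thickness (1.2 mm) are assumptions not stated on the slide, and decimal units are assumed (1 PB = 1000 TB), so the results are only indicative.

```python
# Back-of-the-envelope check of the figures quoted on the slide.
CD_CAPACITY_BYTES = 700e6   # assumption: 700 MB per CD
CD_THICKNESS_M = 1.2e-3     # assumption: 1.2 mm per bare disc


def cd_stack_height_km(data_bytes):
    """Height of a stack of CDs holding data_bytes, in kilometres."""
    discs = data_bytes / CD_CAPACITY_BYTES
    return discs * CD_THICKNESS_M / 1000.0


def specint_per_core(total_si2k, machines, cores_per_machine):
    """SpecInt2000 rating per core implied by the total and the machine count."""
    return total_si2k / (machines * cores_per_machine)


yearly_data_bytes = 15e15                                # 15 PB per year
height_km = cd_stack_height_km(yearly_data_bytes)        # CD stack height
si2k_per_core = specint_per_core(100e6, 5000, 8)         # CPU rating per core
cost_per_machine = 15e6 / 5000                           # $ per 8-core computer
disk_cost_per_tb = 20e6 / 15000                          # $ per TB on disk

print(f"CD stack: {height_km:.1f} km")
print(f"SpecInt2000 per core: {si2k_per_core:.0f}")
print(f"Cost per machine: {cost_per_machine:.0f} $")
print(f"Disk cost: {disk_cost_per_tb:.0f} $/TB")
```

Under these assumptions the CD stack comes out at roughly 26 km, the same order of magnitude as the ~20 km shown in the figure, and the CPU figures imply about 2500 SpecInt2000 per core and about 3000 $ per machine.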
A Hierarchical Grid Architecture in an International Framework
T0
T1 (11): CC-IN2P3, FZK, PIC, NDGF, NIKHEF, ASCC, Brookhaven, Fermilab, TRIUMF, RAL, CNAF
T2 (52)
T3 (many)
Sites shown on the map of France: Île de France, Clermont, Nantes, Strasbourg, Marseille, Lyon, Annecy, CC-IN2P3
LCG vs EGEE
In Europe the LHC Computing Grid is based on the multidisciplinary project EGEE
Middleware
Grid operation infrastructure
Pilot New
The Grid was a necessity for LHC computing
It was a very good opportunity for other disciplines
EGEE also provides a very sophisticated operational framework
• Monitoring
• Ticketing system
EGEE-II: 90 partners – 32 countries – 32 M€
Crucial for the success of the project
Interoperability
3 grid infrastructures are being used for LHC Computing
– EGEE in Europe
– NorduGrid in Nordic Countries
– OSG in the US
They are based on different middlewares
These 3 infrastructures are now able to interoperate
– Job submission
– Operation
Developments on interoperability
– Short term: GIN (Grid Interoperability Now)
– Longer term: SAGA / JSDL etc.
Developed within the OGF framework
GRID Services for the LHC
Computing services
– Computing Element (CE)
– Worker nodes (WN), SL4
Workload Management System
Storage
– Based on SRM
– dCache
– Castor
– StoRM
– DPM
File Management
– Transfer: FTS
– Cataloguing: LFC
Database replication
– 3D Project
VOMS
– Virtual Organization Management
Specific experiment services
– VO Boxes
Will be used for priority management
The LHC Optical Private Network
LCG and emerging countries
The grid is a complex environment, which is required to provide the huge computing resources necessary for the LHC
– The learning curve is steep!
Complexity... But...
– It provides a framework in which all the data will be available to every collaborator everywhere
This is a unique opportunity for laboratories in emerging countries to participate fully in the physics analysis
Lightweight Grids
BOINC
[Diagram labels: Network, Main server]
BOINC provides a framework for a lightweight Grid targeting CPU-intensive applications running on small datasets
BOINC / Einstein@home
Data analysis from the giant interferometers LIGO and GEO – Search for pulsar-generated gravitational waves
Fast Fourier transforms are computed on many chunks of the best data taking periods.
Search for Gravitational Wave signals in 30 000 directions spread over the sky
Huge combinatorial problem
• Use of individual PCs
Big success: > 160 000 participants
Contribution to scientific outreach
Gravitational wave detection: http://einstein.phys.uwm.edu/
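The idea of computing Fourier transforms on many independent chunks can be sketched as follows. This is a toy illustration only, not the Einstein@home pipeline: it uses a direct DFT instead of a true FFT for readability, a synthetic sinusoid instead of interferometer data, and the function names are invented for the example.

```python
import cmath
import math


def dft_power(chunk):
    """Power spectrum of one chunk via a direct DFT (O(n^2), kept simple)."""
    n = len(chunk)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(chunk))) ** 2
        for k in range(n // 2)
    ]


def split_into_chunks(samples, chunk_size):
    """Cut the sample stream into consecutive full-size chunks."""
    return [samples[i:i + chunk_size]
            for i in range(0, len(samples) - chunk_size + 1, chunk_size)]


def strongest_bin(samples, chunk_size):
    """Average the per-chunk power spectra and return the peak frequency bin."""
    chunks = split_into_chunks(samples, chunk_size)
    # Each chunk is an independent unit of work: it could run on a different PC.
    spectra = [dft_power(chunk) for chunk in chunks]
    mean = [sum(column) / len(spectra) for column in zip(*spectra)]
    return max(range(len(mean)), key=mean.__getitem__)


# Synthetic signal: a sinusoid with 5 cycles per 64-sample chunk.
signal = [math.sin(2 * math.pi * 5 * t / 64) for t in range(256)]
peak = strongest_bin(signal, 64)
print(peak)
```

Because each chunk is processed without reference to the others, the work splits naturally into small units with small inputs, which is the kind of workload a lightweight grid of individual PCs handles well.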
BOINC
BOINC provides a framework for a lightweight Grid that can federate the use of distributed PCs
Standalone usage is possible in many domains – BOINC is already used by several teams working in Biology.
Certainly an avenue to explore for laboratories with limited computing resources
Java Job Submission (JJS)
Developed at CC-IN2P3 by Pascal Calvat
Java Job Submission is a very simple User Interface to submit jobs on the Grid
– Works on Mac, Windows and Linux
– Direct submission to Computing Element
– Very efficient
• Especially for short jobs
– Includes a learning system in order to dynamically build a list of the "best" submission sites based on their response time
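One way such a learning system could work is sketched below. This is an assumption made for illustration, not the actual JJS implementation (which is written in Java): it keeps an exponentially smoothed response time per site and ranks sites by it. The class name, the smoothing factor and the site names are all hypothetical.

```python
class SiteRanker:
    """Rank submission sites by a smoothed average of observed response times."""

    def __init__(self, smoothing=0.3):
        self.smoothing = smoothing  # weight given to the newest measurement
        self.average = {}           # site name -> smoothed response time (s)

    def record(self, site, response_seconds):
        """Fold one observed response time into the site's running average."""
        previous = self.average.get(site)
        if previous is None:
            self.average[site] = response_seconds
        else:
            a = self.smoothing
            self.average[site] = a * response_seconds + (1 - a) * previous

    def best_sites(self, count=3):
        """Return up to count site names, fastest first."""
        return sorted(self.average, key=self.average.get)[:count]


ranker = SiteRanker()
for site, seconds in [("ce-a", 10.0), ("ce-b", 5.0), ("ce-c", 20.0), ("ce-a", 2.0)]:
    ranker.record(site, seconds)

best = ranker.best_sites(2)
print(best)
```

Smoothing means that a single fast or slow response shifts a site's rank only gradually, so the list of "best" sites adapts over time without jumping on every measurement.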
SRB: an example of a data Grid
Developed at San Diego Supercomputer Center
SRB: a Data Grid middleware (1)
Many scientific applications are based on data production and analysis
[Figure: DNA sequence data shown as an example]
SRB: a Data Grid middleware (2)
User wants the complexity to be hidden
Inspired by: http://legacy-web.nbirn.net/Resources_rd/Educational/Tutorials/SRB/021202SRBTutorial/021202SRBIntroBIRN.ppt
[Diagram: users put and get data through SRB; labels: Put data, Get data, SRB, DB, Metadata Catalog]
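The principle of hiding storage complexity behind a metadata catalog can be illustrated with a toy broker. This is not the SRB API: the `ToyBroker` class, its methods, the resource names and the paths are all hypothetical. The user works only with logical names, while the catalog records where each object physically lives.

```python
class ToyBroker:
    """Toy data broker: logical names in front, physical storage behind."""

    def __init__(self, resources):
        # resource name -> dict standing in for that site's storage system
        self.storage = {name: {} for name in resources}
        # metadata catalog: logical name -> (resource, physical key, attributes)
        self.catalog = {}

    def put(self, logical_name, data, resource, **attributes):
        """Store data on a resource and register it under a logical name."""
        physical_key = f"{resource}:/vault/{len(self.storage[resource])}"
        self.storage[resource][physical_key] = data
        self.catalog[logical_name] = (resource, physical_key, attributes)

    def get(self, logical_name):
        """Fetch data by logical name; the caller never sees the location."""
        resource, physical_key, _ = self.catalog[logical_name]
        return self.storage[resource][physical_key]

    def find(self, **query):
        """Return logical names whose attributes match every query item."""
        return sorted(
            name
            for name, (_, _, attrs) in self.catalog.items()
            if all(attrs.get(key) == value for key, value in query.items())
        )


broker = ToyBroker(["lyon", "san_diego"])
broker.put("/home/demo/scan001.dcm", b"image bytes", resource="lyon", modality="MRI")
broker.put("/home/demo/run42.dat", b"detector bytes", resource="san_diego", modality="CCD")

print(broker.get("/home/demo/scan001.dcm"))
print(broker.find(modality="MRI"))
```

The caller of `get` and `find` never handles a site name or a physical path, which is the sense in which the complexity stays hidden from the user.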
Biomedical applications using SRB
[Diagram: MRI acquisition (Siemens MAGNETOM Sonata Maestro Class 1.5 T), Control PC, Export PC (DICOM server, SRB client); links labelled "DICOM" and "push DICOM"]
The BIRN Project
Biomedical Informatics Research Network
Brain imagery – Study of brain diseases
http://www.nbirn.net/
SRB application in HEP
SuperNovae Factory project
Data acquisition in Hawaii, remotely controlled from France
Data are exported to CC-IN2P3 and made available to physicists through SRB
BaBar data distribution has been using SRB for several years
Hundreds of TB of data have been transferred and referenced
Grid5000: a research grid
Grid5000 is a project to build a 5000-node grid dedicated to research on grid technologies
9 French sites are currently hosting 3166 Grid5000 nodes
Sites are connected together on a 10 Gb/s backbone
A booking system allows users to reserve nodes to run experiments. It is possible to install and deploy a complete software stack, from the OS up to the applications, on all the nodes
A network connection has recently been established between Grid5000 and the Japanese Grid NAREGI
A close collaboration between Research Grids and Production Grids is essential
Research Grids will develop the future software for the production grids
Production Grids will provide the framework to test new developments
Networks and the Digital Divide (1)
ICFA Standing Committee on Interregional Connectivity
R. Les Cottrell and Shahryar Khan http://www.slac.stanford.edu/xorg/icfa/icfa-net-paper-jan07/
PingER system running on 649 sites – 128 countries – 11 world regions
Networks and the Digital Divide (2)
Behind Europe:
6 Yrs: Russia, Latin America
7 Yrs: Mid-East, SE Asia
8-9 Yrs: So. Asia
11 Yrs: Cent. Asia
12 Yrs: Africa
The ORIENT / TEIN2 network
Internet connection difficulties are often related to the "last mile problem"
The institute's local network, the institute's connection to the main country backbone, etc. are often a problem
Hong Kong is also connected to GLORIAD
45 Mb/s
622 Mb/s to be upgraded to 2x2.5 Gb/s
Conclusions
Different kinds of grid systems have been presented
– They are adapted to different kinds of research
– They can be very light (BOINC) or much more complicated (LCG)
There are different ways to do Grid computing
– Can be very simple (a single User Interface)
– Can be more sophisticated (by deploying a complete Grid node)
But in any case the network quality is crucial!
– Emerging countries should focus on network development
The Grid is nothing by itself; only scientific applications matter!