Grid Computing Jorge Gomes Laboratório de Instrumentação e Física Experimental de Partículas R-ECFA Workshop 2008, Lisbon, 28 March 2008


Page 1: Grid Computing

Jorge Gomes

Laboratório de Instrumentação e Física Experimental de Partículas

R-ECFA Workshop 2008, Lisbon, 28 March 2008

Page 2: LIP and grid computing

• LIP participates in the ATLAS and CMS experiments
• LHC data management and processing require a novel computing approach:
  – Highly distributed community and resources
    • both geographically and administratively
  – Integration of all computing and storage resources from the collaborating institutes
  – Common interface to access the resources
  – Transparent access to the resources
  – Single sign-on
  – e-Infrastructure …
• Grid computing was adopted to implement the LHC computing infrastructure
• Consequently, LIP became involved in grid computing
• Grid computing is now also being used at LIP by non-LHC experiments

Page 3: LIP in international grid computing projects

[Timeline figure, 2001 to 2008: DataGrid, CrossGrid, LCG, EGEE-I, EELA, EGEE-II, Int.Eu.Grid]

Page 4: LIP in international grid computing projects

• DataGrid, EGEE-I and EGEE-II: EU-funded projects coordinated by CERN; middleware and production infrastructures for multidisciplinary data-intensive grid computing
• CrossGrid and Int.Eu.Grid: EU-funded projects; middleware and production grids for data-intensive parallel and interactive applications
• EELA: EU-funded project to build a pilot grid in Latin America
• LCG: CERN long-term project to support LHC computing
• All projects are based on the same middleware: EDG/LCG/gLite

[Diagram labels: DataGrid, EGEE-I, EGEE-II; CrossGrid; LHC Computing Grid (LCG); Int.Eu.Grid; EELA; middleware: EDG, LCG, gLite]

Page 5: Strategy

1. Learn about grid middleware (DataGrid, CrossGrid, …)
  – Gain know-how and experience
  – Follow the evolution
2. Contribute to the technology and build a team
  – Engineers at CERN working for LCG (AdI coordinated by LIP)
  – Participate in international grid projects (EGEE etc.)
    • Infrastructure operations
      – Core services
      – User support / Helpdesk
      – Deployment coordination
    • Middleware test / validation
    • Middleware integration
    • Resource provider
    • Support services
      – Certification Authority for Portugal
      – Grid training
3. Deploy and operate an LCG Tier-2 for Portugal

Focus on the same areas in all projects:
• take advantage of knowledge and synergies
• maximize resources
• work on areas related to IT operations

Page 6: People

Team               Working
Jorge Gomes        Coordination
Mário David        Grid Computing
Gonçalo Borges     Grid Computing
Manuel Montecelo   Grid Computing
João Martins       Systems administration
Nuno Dias          Systems administration, LIP CA manager
Miguel Oliveira    Grid Computing (Coimbra site)
Hugo Gomes         Web
José Aparício      Technical support

9 FTEs in the computing team, including grid

Page 7: Projects and Grid Infrastructures

Page 8:

The LCG depends on two major grid computing infrastructures ...

The biggest grid computing infrastructure worldwide

LCG Infrastructure

Page 9:

• 240 sites
• 45 countries
• 41,000 CPUs
• 5 PetaBytes
• >10,000 users
• >150 VOs
• >100,000 jobs/day

Archeology, Astronomy, Astrophysics, Civil Protection, Comp. Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences, …

• 91 partners in 32 countries
• 25 collaborating projects
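As a rough reading of the headline figures on this slide, the sketch below derives a few averages per site and per CPU. The variable and function names are illustrative and do not come from the slides; values the slide marks with ">" are treated as lower bounds, and 1 PB is assumed to be 1000 TB.

```python
# Headline infrastructure figures as listed on the slide.
SITES = 240
CPUS = 41_000
STORAGE_PB = 5
JOBS_PER_DAY_MIN = 100_000  # slide says ">100,000", so this is a lower bound

TB_PER_PB = 1000  # assumption: decimal units


def average(total: float, count: int) -> float:
    """Plain arithmetic mean; it hides the large spread between real sites."""
    return total / count


cpus_per_site = average(CPUS, SITES)
storage_tb_per_site = average(STORAGE_PB * TB_PER_PB, SITES)
min_jobs_per_cpu_per_day = average(JOBS_PER_DAY_MIN, CPUS)

if __name__ == "__main__":
    print(f"CPUs per site (mean):          {cpus_per_site:.1f}")
    print(f"Storage per site (mean, TB):   {storage_tb_per_site:.1f}")
    print(f"Jobs per CPU per day (at least): {min_jobs_per_cpu_per_day:.2f}")
```

These are averages only; sites range from small university clusters to large computing centres, so no individual site should be expected to match them.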

Page 10: EGEE in Portugal

• Portugal and Spain constitute the EGEE Southwest federation
• LIP coordinates EGEE in Portugal
• EGEE Resource Centres:
  – LIP
    • Lisboa (core services, production and pre-production)
    • Coimbra
  – Univ. Lusíada
    • Famalicão
  – Univ. Porto
    • Porto (3 clusters)
  – Univ. Minho
    • Braga
  – CFP-IST
    • Lisbon
  – IEETA
    • Aveiro (pre-production)

Page 11: Int.EU.Grid infrastructure

• Grid infrastructure focused on:
  – Parallel processing
  – Interactivity
  – Extending the gLite middleware
• 12 centers
• 7 countries
• Virtual Organizations:
  – ifusion
  – ienvmod
  – iusct
  – ibrain
  – ihep
  – iplanck
  – iwien2k
  – icompchem
  – ...

Grid Operations Management

LIP coordinates the infrastructure operation

Page 12: EELA

• E-Infrastructure shared between Europe and Latin America
• EU-funded project coordinated by CIEMAT
• January 2006 – December 2007
• 25 partners in Europe and Latin America
  – Mexico, Brazil, Cuba, Chile, Venezuela, Argentina, Portugal, Italy and Spain, CERN, CLARA
• EGEE extension to Latin America
• Pilot grid infrastructure
• Dissemination and training
• LIP is responsible for the authentication and VO management task
  – Coordinate the deployment of grid certification authorities
    • Brazil, Argentina, Chile, Mexico, catch-all
  – Virtual organizations management and authorization
  – File catalogue core services

Page 13: At Work

Page 14: Processing and storage capacity at LIP

• Lisbon
  – Farm (EGEE/LCG, Int.Eu.Grid, EELA)
    • 88 CPU cores, AMD Opteron and Intel Xeon
    • 2 to 4 CPU cores per machine
    • 1 to 2 GB RAM per CPU core
    • Gigabit Ethernet
    • LRMS: Sun Grid Engine (SGE)
    • dCache storage
  – Pre-production (EGEE)
    • Small farm (Pentium III and Pentium IV)
    • Some storage
• Coimbra
  – Farm (EGEE/LCG)
    • 84 workstations, Pentium IV CPUs, 2.2 to 3.0 GHz
    • 1 CPU per machine
    • 1 to 2 GB RAM per CPU
    • Fast Ethernet
    • LRMS: Torque/MAUI
    • DPM storage

Page 15: LIP grid core services

• LIP operates grid core services for several infrastructures
  – EGEE
  – Int.Eu.Grid
  – EELA
• These services include:
  – Resource Brokers / CrossBrokers
  – Information Indexes
  – RAS servers
  – LFC servers
  – VOMS servers
  – MyProxy server
• Hosted at:
  – FCCN datacenter
  – LIP-Lisbon datacenter

Page 16: LIP Certification Authority

• The LIP CA is the grid certification authority for Portugal
  – Accredited by the International Grid Trust Federation (IGTF)
  – Registered at the TERENA TACAR trust anchor repository
• The LIP CA is a member of the EUGridPMA
• Registration authorities at:
  – Centro de Física de Plasmas (IST)
  – Centro de Sistemas Inteligentes (UALG)
  – FCT/UNL
  – Instituto de Engenharia Electrónica e Telemática de Aveiro (UA)
  – LIP Lisboa
  – LIP Coimbra (UC)
  – Universidade Lusíada (Famalicão)
  – Universidade Autónoma de Lisboa
  – Universidade do Minho
  – Universidade do Porto

http://ca.lip.pt

Page 17: Portugal in EGEE

[Chart: EGEE production infrastructure usage including LCG]

Grid computing resources:
• LIP-Lisbon: 28 to 75 CPU cores
• LIP-Coimbra: 84 CPU cores

Page 18: CPU usage in the EGEE SWE federation

[Chart: ATLAS + CMS + COMPASS + Auger]

Page 19: The Portuguese National Grid Initiative

Page 20: National grid initiative

• Officially launched in April 2006 by the Portuguese Ministry of Science
  – Support the development of grid infrastructures for complex problem solving
  – Development of competences
  – Integrate Portugal in major international grid infrastructures
• LIP participates in the coordination of the initiative
• Activities of the initiative
  – Funding: 15 pilot projects (1.500.000 €)
  – Networks for grid computing
  – Infrastructures:
    • INGRID+ (national grid infrastructure based on EGEE)
    • Main node for grid computing

Page 21: International network connectivity

• International connectivity is being substantially changed, with new links being deployed:
  – North: Minho - Galicia
  – South: through the Spanish Extremadura
• Objectives
  – Better GÉANT connectivity (higher bandwidth)
  – Redundancy between both countries
  – Grid computing support
  – Have dedicated fibres
• Grid computing connectivity
  – Dedicated bandwidth for grid computing
  – Separate commodity and grid network traffic

Page 22: Main node for grid computing

• Deployment of the Portuguese main node for grid computing
  – First step towards INGRID+
  – The project began in the summer of 2007
  – Expected to become operational in September 2008
  – Consortium: LIP, FCCN and LNEC
• Datacenter dedicated to grid computing
  – Host the core grid services for Portugal
  – Host major computing and storage resources
  – Host resources from other organizations
• Users
  – National grid initiative projects
  – National Tier-2 for the LCG
  – Projects in the context of IBERGRID
  – Portuguese researchers with demanding computing requirements

Page 23: The Portuguese Tier-2 deployment

Page 24: LHC Computing Grid

• The Portuguese government has signed the LCG MoU
• A Portuguese Tier-2 for LCG is being deployed:
  – Resource centre providing processing and storage capacity integrated into the LHC Computing Grid
  – Supporting the ATLAS and CMS VOs
• The Portuguese Tier-2 will have 3 centres:
  – LIP-Lisbon
    • Hosted at the LIP datacenter in Lisbon
    • Operational
  – LIP-Coimbra
    • Hosted in partnership with CFC
    • Operational since mid-2007
  – Main node for grid computing
    • National grid initiative project
    • Being built at the LNEC campus in Lisbon

Page 25: LCG MoU

• The LIP federated Tier-2 is delivering 13% of the CPU capacity foreseen for 2008
• The LIP Tier-2 storage capacity initially foreseen in the MoU is not sufficient
• The LHC is about to start; we need to ramp up
• The available network bandwidth is already above what the MoU foresees
• The LCG Tier-1 for Portugal is PIC in Barcelona

Page 26: Ongoing: processing and storage upgrade

• Lisbon and Coimbra
  – ≥ 360 to 384 CPU cores
    • 2 × quad-core CPUs
    • 24 GB RAM
    • 750 GB SATA disks
    • Gigabit Ethernet
    • IPMI management
  – > 300 TB storage
    • ~ 11 to 15 servers
    • Storage based on SATA disks
    • RAID 5 and 6
    • Quad Gigabit Ethernet
    • IPMI management
    • Attachment
      – Fibre Channel
      – SAS or SATA
• Currently benchmarking and testing hardware from several vendors
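The upgrade figures on this slide imply a node count, a memory-per-core ratio and a per-server storage size. The sketch below works these out under two assumptions that the slide does not state explicitly: that "two x quad core CPUs" and "24GB RAM" describe a single worker node, and that the storage is spread evenly across the servers. All names are illustrative.

```python
CORES_PER_NODE = 2 * 4   # two quad-core CPUs, assumed to be per worker node
RAM_GB_PER_NODE = 24     # assumed to be per worker node


def nodes_needed(total_cores: int) -> int:
    """Worker nodes required to reach total_cores, rounded up."""
    return -(-total_cores // CORES_PER_NODE)


def tb_per_server(total_tb: float, servers: int) -> float:
    """Average capacity per storage server, assuming an even split."""
    return total_tb / servers


ram_gb_per_core = RAM_GB_PER_NODE / CORES_PER_NODE

if __name__ == "__main__":
    print("Worker nodes:", nodes_needed(360), "to", nodes_needed(384))
    print("RAM per core (GB):", ram_gb_per_core)
    # 300 TB is the lower bound quoted on the slide ("> 300TB").
    print("TB per server:", tb_per_server(300, 15), "to",
          round(tb_per_server(300, 11), 1))
```

Under these assumptions the memory per core rises to 3 GB, compared with the 1 to 2 GB per core quoted for the existing farms.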

Page 27: Internet connectivity

• Lisbon:
  – New fibre link LIP<->FCCN
    • Fibre rented from PT
    • Max bandwidth 10 Gbps
  – Agreed capacity with FCCN
    • 1 Gbps full duplex
    • GBICs Gigabit Ethernet LX
    • 995 Mbps academic (GÉANT tagging) + 5 Mbps commercial
• Coimbra:
  – The LIP farm is hosted at CFC and shares the network connectivity with CFC
    • CFC shares the University of Coimbra Internet link, now being upgraded
    • The LIP cluster will have a dedicated link
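To put the Lisbon link figures in perspective, the sketch below adds up the academic and commercial shares and estimates an idealized transfer time for a dataset. The function name and the 1 TB example size are illustrative assumptions; the calculation ignores protocol overhead and competing traffic, so real transfers would take longer.

```python
ACADEMIC_MBPS = 995      # academic share of the agreed capacity
COMMERCIAL_MBPS = 5      # commercial share of the agreed capacity
AGREED_MBPS = ACADEMIC_MBPS + COMMERCIAL_MBPS
FIBRE_MAX_MBPS = 10_000  # maximum bandwidth of the rented fibre (10 Gbps)


def transfer_seconds(size_bytes: int, rate_mbps: int) -> float:
    """Idealized transfer time: size in bits divided by line rate in bit/s."""
    return size_bytes * 8 / (rate_mbps * 1_000_000)


if __name__ == "__main__":
    one_tb = 10**12  # hypothetical dataset size, decimal terabyte
    print("Agreed capacity (Mbps):", AGREED_MBPS)
    print("1 TB at agreed capacity (h):",
          round(transfer_seconds(one_tb, AGREED_MBPS) / 3600, 2))
    print("1 TB at fibre maximum (min):",
          round(transfer_seconds(one_tb, FIBRE_MAX_MBPS) / 60, 1))
```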

Page 28: Ongoing: network

• FCCN will provide network connectivity for grid computing:
  – Possibly enabling direct connectivity between the Lisbon and Coimbra clusters
  – Possibly enabling more bandwidth at both sites
  – The FCCN infrastructure already has the hardware capabilities to provide the service
  – FCCN workshop on 15 April
• Upgrade the datacentre LANs in Lisbon and Coimbra
  – New core switches
  – Modular, wire-speed, non-blocking switches
  – High-density Gigabit Ethernet
  – Some 10 Gigabit Ethernet ports
  – Redundant configuration
  – Ready for L2/L3 WAN connectivity

Page 29: Main node for grid computing

• GRID datacenter
  – Being built at LNEC
    • Civil construction ongoing
    • Ready in September 2008
  – Located near the FCCN NOC
    • Direct connectivity to the FCCN backbone and GÉANT point of presence
  – Adequate power and cooling infrastructure for grid computing
  – Budget > 3.200.000 €
• Host:
  – Core GRID services for INGRID+ and other projects
  – GRID computing cluster
    • > 600 CPU cores and ~ 600 TB at the end of 2008
    • Up to ~ 2000 CPU cores and ~ 1000 TB of storage
    • Resources managed by LIP
  – FCCN nearline storage
    • Tape robot for data repositories
  – Host grid resources from other organizations
    • LNEC GRID cluster

Integrated in the Tier-2

LCG will be the main user
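The planned growth of the main node can be summarized with a few ratios. The sketch below takes the end-of-2008 and maximum figures from this slide as approximate values (the slide marks them with ">" and "~"), so the results are indicative only; the names are illustrative.

```python
CORES_END_2008 = 600        # slide says "> 600"
STORAGE_TB_END_2008 = 600   # slide says "~ 600TB"
CORES_MAX = 2000            # slide says "up to ~ 2000"
STORAGE_TB_MAX = 1000       # slide says "~ 1000TB"


def growth_factor(initial: float, final: float) -> float:
    """How many times larger the final value is than the initial one."""
    return final / initial


core_growth = growth_factor(CORES_END_2008, CORES_MAX)
storage_growth = growth_factor(STORAGE_TB_END_2008, STORAGE_TB_MAX)
tb_per_core_2008 = STORAGE_TB_END_2008 / CORES_END_2008
tb_per_core_max = STORAGE_TB_MAX / CORES_MAX

if __name__ == "__main__":
    print(f"Core growth:    x{core_growth:.2f}")
    print(f"Storage growth: x{storage_growth:.2f}")
    print(f"TB per core:    {tb_per_core_2008:.1f} -> {tb_per_core_max:.1f}")
```

Read this way, compute capacity is planned to grow faster than storage, so the storage available per core would roughly halve at full build-out.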

Page 30: Future

Page 31: IBERGRID

• IBERGRID
  – A common Iberian grid infrastructure is being prepared:
    • In the context of agreements between the Portuguese and Spanish governments
    • Sharing of resources between Portugal and Spain
  – Main areas:
    • Networking, grid, supercomputing and applications
  – Based on and profiting from the current collaboration in the framework of European projects:
    • EGEE, Int.Eu.Grid, EELA
  – LIP is deeply involved:
    • Coordination of the initiative
    • Infrastructure coordination for Portugal
  – Conferences:
    • 1st conference took place in Santiago de Compostela in May 2007
    • 2nd conference will take place at the University of Porto

Page 32: EGEE-III

• EGEE-III project approved and being signed by the partners
  – Expand the EGEE infrastructure
    • More resources
    • More users and communities
  – Prepare the migration towards a future European Infrastructure (EGI) based on the national grid initiatives (NGIs)
• Will start after EGEE-II (two years duration)
• Total budget 68.900.000 €
  – EU contribution 36.250.000 €
  – EU contribution for LIP 300.000 €
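The budget figures on this slide can be expressed as shares. The sketch below computes the EU share of the total budget and the LIP share of the EU contribution; the names are illustrative and the amounts are taken as exact euro values from the slide.

```python
TOTAL_BUDGET_EUR = 68_900_000
EU_CONTRIBUTION_EUR = 36_250_000
EU_CONTRIBUTION_LIP_EUR = 300_000


def share_percent(part: float, whole: float) -> float:
    """Part as a percentage of the whole."""
    return 100 * part / whole


eu_share = share_percent(EU_CONTRIBUTION_EUR, TOTAL_BUDGET_EUR)
lip_share_of_eu = share_percent(EU_CONTRIBUTION_LIP_EUR, EU_CONTRIBUTION_EUR)

if __name__ == "__main__":
    print(f"EU share of total budget:      {eu_share:.1f}%")
    print(f"LIP share of EU contribution:  {lip_share_of_eu:.2f}%")
```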

Page 33: Concerns

• Deployment and operation of the main node for grid computing
  – big challenge due to its dimension and characteristics
  – serving new communities
  – core services for the national grid initiative and IBERGRID
• The LHC is starting
  – LCG usage will increase, and so will the problems
• Need more human resources
  – user support and Tier-2 operation
  – but attracting people has been very difficult …
• Costs of infrastructure operation
  – electricity, maintenance etc.
  – these costs will increase a lot with the new deployments and upgrades

Page 34: Thank you ...