Grid Computing from a solid past to a bright future? David Groep NIKHEF DataGrid and VL group 2003-03-14


Page 1: Grid Computing from a solid past to a bright future?

Grid Computing

from a solid past to a bright future?

David Groep, NIKHEF DataGrid and VL group

2003-03-14

Page 2: Grid Computing from a solid past to a bright future?

Grid – more than a hype?

Imagine that you could plug your computer into the wall and have direct access to huge computing resources immediately, just as you plug in a lamp to get instant light. …

Far from being science-fiction, this is the idea the XXXXXX project is about to make into reality.…

from a project brochure in 2001

Page 3: Grid Computing from a solid past to a bright future?

• Grids and their (science) applications
• Origins of the grid
• What makes a Grid?
• Grid implementations today
• New standards
• Dutch dimensions

Page 4: Grid Computing from a solid past to a bright future?

Grid – a vision

The GRID: networked data processing centres and “middleware” software as the “glue” of resources.

Researchers perform their activities regardless of geographical location, interact with colleagues, share and access data

Scientific instruments and experiments provide huge amounts of data


Page 5: Grid Computing from a solid past to a bright future?

Communities and Apps

ENVISAT
• 10 instruments on board
• 200 Mbps data rate to ground
• 400 Tbytes data archived/year
• ~100 ‘standard’ products
• 10+ dedicated facilities in Europe
• ~700 approved science user projects

http://www.esa.int/

Page 6: Grid Computing from a solid past to a bright future?

Added value for EO

• enhance the ability to access high level products

• allow reprocessing of large historical archives

• data fusion and cross-validation, …

Page 7: Grid Computing from a solid past to a bright future?

Physics @ CERN

• LHC particle accelerator

• operational in 2007

• 5-10 Petabyte per year

• 150 countries

• > 10000 Users

• lifetime ~ 20 years

The Need for Grids: LHC

40 MHz (40 TB/sec) → level 1 - special hardware → 75 kHz (75 GB/sec) → level 2 - embedded → 5 kHz (5 GB/sec) → level 3 - PCs → 100 Hz (100 MB/sec) → data recording & offline analysis

http://www.cern.ch/
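The rates quoted on this slide imply a constant event size and a rejection factor at each level. The following is a minimal Python sketch of that arithmetic. It assumes decimal units (1 TB = 10^12 bytes) and reads each rate/bandwidth pair as the output of the preceding stage; the stage labels and the names `STAGES`, `event_size_mb` and `rejection_factors` are illustrative and do not come from the slide.

```python
# Rate (Hz) and bandwidth (bytes/sec) pairs as quoted on the slide.
# The stage labels are an assumed reading of the diagram.
STAGES = [
    ("detector output", 40e6, 40e12),   # 40 MHz, 40 TB/sec
    ("after level 1", 75e3, 75e9),      # 75 kHz, 75 GB/sec
    ("after level 2", 5e3, 5e9),        # 5 kHz, 5 GB/sec
    ("after level 3", 100.0, 100e6),    # 100 Hz, 100 MB/sec
]


def event_size_mb(rate_hz, bytes_per_sec):
    """Average event size in megabytes implied by a rate and a bandwidth."""
    return bytes_per_sec / rate_hz / 1e6


def rejection_factors(stages):
    """Ratio of input rate to output rate for each consecutive pair of stages."""
    return [stages[i][1] / stages[i + 1][1] for i in range(len(stages) - 1)]


if __name__ == "__main__":
    for label, rate, bandwidth in STAGES:
        print(f"{label}: {event_size_mb(rate, bandwidth):.1f} MB per event")
    for factor in rejection_factors(STAGES):
        print(f"rate reduced by a factor of {factor:.1f}")
```

Under these assumptions every stage works out to about 1 MB per event, so the bandwidth reduction comes entirely from discarding events rather than from shrinking them.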

Page 8: Grid Computing from a solid past to a bright future?

And More …

Bio-informatics

• For access to data
– Large network bandwidth to access computing centers
– Support of data bank replicas (easier and faster mirroring)
– Distributed data banks
• For interpretation of data
– GRID enabled algorithms: BLAST on distributed data banks, distributed data mining

Page 9: Grid Computing from a solid past to a bright future?

Genome pattern matching

Page 10: Grid Computing from a solid past to a bright future?

And even more …

• financial services, life sciences, strategy evaluation, …

• instant immersive teleconferencing

• remote experimentation

• pre-surgical planning and simulation

Page 11: Grid Computing from a solid past to a bright future?

Why is the Grid successful?

• Applications need large amounts of data or computation

• Ever larger, distributed user community
• Network grows faster than compute power/storage

Page 12: Grid Computing from a solid past to a bright future?

Inter-networking systems

• Continuous growth (now ~ 180 million hosts)
• Many protocols and APIs (~3500 RFCs)
• Focus on heterogeneity (and security)

http://www.caida.org/

http://www.isc.org/

Page 13: Grid Computing from a solid past to a bright future?

Remote Service

• RPC proved hugely successful within domains
– Network Information System (YP)
– Network File System
– Typical client-server stuff…
• CORBA – also intra-domain
– Extension of RPC to OO design model
– Diversification
• Web Services – venturing into the inter-organisational domain
– Standard service descriptions and discovery
– Common syntax (XML/SOAP)

Page 14: Grid Computing from a solid past to a bright future?

Grid beginnings - Systems

• distributed computing research
• Gigabit network test beds
• Meta-supercomputing (I-WAY)
• Condor ‘flocking’

GUSTO meta-computing test bed in 1999

Page 15: Grid Computing from a solid past to a bright future?

Grid beginnings - Apps

• Solve problems using systems in one ‘domain’
– parameter sweeps on batch clusters
– PIAF for (HE) physics analysis
– …
• Solvers using systems in multiple domains
– SETI@home
– …
• Ready for the next step …

Page 16: Grid Computing from a solid past to a bright future?

What is the Grid about?

Resource sharing and coordinated problem solving in dynamic multi-institutional virtual organisations

Virtual Organisation (VO):

A set of individuals or organisations, not under single hierarchical control, temporarily joining forces to solve a particular problem at hand, bringing to the collaboration a subset of their resources, sharing those at their discretion and each under their own conditions.

Page 17: Grid Computing from a solid past to a bright future?

What makes a Grid?

Coordinates resources not subject to central control …
– More than cluster & centralised distributed computing
– Security, AAA, billing & payment, integrity, procedures

… using standard, open protocols …
– More than single-purpose solutions
– Requires interoperability, standards body, multiple implementations

… to deliver non-trivial QoS.
– Sum more than individual components (e.g. single sign-on, transparency)

Ian Foster in Grid Today, 2002

Page 18: Grid Computing from a solid past to a bright future?

Grid Architecture (v1)

Grid layers (top to bottom):
• Application
• Collective – “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
• Resource – “Sharing single resources”: negotiating access, controlling use
• Connectivity – “Talking to things”: communication (Internet protocols) & security
• Fabric – “Controlling things locally”: access to, & control of, resources

Internet Protocol Architecture (shown alongside): Application, Transport, Internet, Link

Page 19: Grid Computing from a solid past to a bright future?

Protocol Layers & Bodies

OSI layers (bottom to top): Physical, Data Link, Network, Transport, Session, Presentation, Application

Standards bodies, from the lower to the upper layers:
• IEEE
• IETF
• GGF, W3C, OASIS

Shown alongside: the Grid layers (Application, Collective, Resource, Connectivity, Fabric) and the Internet Protocol Architecture (Application, Transport, Internet, Link)

Page 20: Grid Computing from a solid past to a bright future?

Grid Middleware

• Globus Project started 1997
• Focus on research only
• Used and extended by many other projects
• Toolkit ‘bag-of-services’ approach – not a complete architecture
• Several middleware projects:
– EU DataGrid – production focus
– CrossGrid, GridLAB, DataTAG, PPDG, GriPhyN
– Condor
– In NL: ICES/KIS Virtual Lab, VL-E

http://www.globus.org/

http://www.edg.org/

http://www.vl-e.nl/

Page 21: Grid Computing from a solid past to a bright future?

Grids Today

Page 22: Grid Computing from a solid past to a bright future?

Grid Protocols Today

• Use common Grid Security Infrastructure:
– Extensions to TLS for delegation (single sign-on)
– Organisation of users in VOs
• Currently deployed main services
– GRAM (resource allocation): attrib/value pairs over HTTP
– GridFTP (bulk file transfer): FTP with GSI and high-throughput extras (striping)
– MDS (monitoring and discovery service): LDAP + common resource description schema
• Next generation: Grid Services (OGSA)

Page 23: Grid Computing from a solid past to a bright future?

Grid Security Infrastructure

• Requirements:
– “Secure”
– User identification
– Accountability
– Site autonomy
– Usage control
– Single sign-on
– Dynamic VOs any time and any place
– Mobility (“easyEverything”, airport kiosk, handheld)
– Multiple roles for each user
– Easy!

Page 24: Grid Computing from a solid past to a bright future?

Authentication – PKI

• Asserting, binding identities
• Trust issues on a global scale
– EDG: CA Coord. Group
  • 16 national certification authorities + CrossGrid CAs
  • policies & procedures → mutual trust
  • users identified by CA’s certificates
– Part of world-wide GridPMA
  • Establishing minimum requirements
  • Includes several US and AP CAs
• Scaling still a challenge

EDG CAs:

CERN

CESNET

CNRS (3)

GermanGrid

Grid-Ireland

INFN

NIKHEF

NorduGrid

LIP

Russian DataGrid

DATAGRID-ES

GridPP

US–DOE Root CA

US-DOE Sub CA

CrossGrid (*)

http://marianne.in2p3.fr/datagrid/ca and http://www.gridpma.org/

Page 25: Grid Computing from a solid past to a bright future?

Getting People Together: Virtual Organisations

• The user community ‘out there’ is large & highly dynamic
• Applying at each individual resource does not scale
• Users get together to form Virtual Organisations:
– Temporary alliance of stakeholders (users and/or resources)
– Various groups and roles
– Managed by (legal) contracts
– Set up and dissolved at will*
• Authentication, Authorization, Accounting (AAA)

* currently not yet that fast

Page 26: Grid Computing from a solid past to a bright future?

Authorization (today)

• Virtual Organisation “directories”
– Members are listed in a directory
– Managed by VO responsible
– Sites extract access lists from directories
– Only for VOs they have “contract” with
– Still need OS-local accounts
– May also use automated tools (sysadm level)
  • poolAccounts
  • slashGrid

http://cern.ch/hep-project-grid-scg/
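The directory-based authorization on this slide can be sketched as follows: a site reads the member lists of the VOs it has a contract with and maps each member to a local account. This is a simplified illustration in Python, not the actual EDG tooling. The function name, the data layout, the account naming scheme and the example distinguished names are all hypothetical, and real pool-account tools hand out accounts dynamically rather than by list position.

```python
def build_access_list(vo_directories, contracted_vos, pool_size=50):
    """Map member DNs of contracted VOs to local pool account names.

    vo_directories: dict of VO name -> list of member DNs (the VO "directory").
    contracted_vos: set of VO names this site has a "contract" with.
    pool_size: number of local accounts reserved per VO (an assumed limit).
    """
    access_list = {}
    for vo in sorted(contracted_vos):
        members = vo_directories.get(vo, [])
        # Members beyond the pool size get no local account in this sketch.
        for slot, dn in enumerate(members[:pool_size], start=1):
            # A user in several VOs keeps the first mapping (alphabetical VO order).
            access_list.setdefault(dn, f"{vo}{slot:03d}")
    return access_list


if __name__ == "__main__":
    directories = {
        "atlas": ["/O=example/CN=Alice", "/O=example/CN=Bob"],
        "biomed": ["/O=example/CN=Carol"],
    }
    # This site only has a contract with the "atlas" VO.
    print(build_access_list(directories, {"atlas"}))
```

Members of VOs without a contract never appear in the resulting list, which mirrors the "only for VOs they have a contract with" rule above.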

Page 27: Grid Computing from a solid past to a bright future?

Grid Security in Action

• Key elements in Grid Security Infrastructure (GSI)
– Proxy
– Trusted certificate store
– Delegation: full or restricted rights

• Access services directly

• Establish trust between processes

Page 28: Grid Computing from a solid past to a bright future?

GSI in Action: “Create Processes at A and B that Communicate & Access Files at C”

• User: single sign-on via “grid-id” & generation of proxy credential; or: retrieval of proxy credential from online repository
• User Proxy (holding the proxy credential): sends remote process creation requests* to the GSI-enabled GRAM servers on the computers at Site A (Kerberos) and Site B (Unix)
• GRAM server at Site A: authorize, map to local id, create process, generate credentials; ditto at Site B
• Process at Site A: local id, Kerberos ticket, restricted proxy
• Process at Site B: local id, restricted proxy
• Communication* between the two processes
• Remote file access request* to the GSI-enabled FTP server at Site C (Kerberos): authorize, map to local id, access file on the storage system

* With mutual authentication
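The idea of delegating full or restricted rights through a chain of proxy credentials can be illustrated with a toy model. The Python sketch below contains no cryptography and is not the GSI implementation: the `Credential` class, the `delegate` and `is_valid` functions, the rights names and the use of a set of trusted subject names in place of a trusted certificate store are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Credential:
    subject: str
    rights: frozenset
    expires: float                        # seconds on an arbitrary clock
    issuer: Optional["Credential"] = None


def delegate(parent, rights, lifetime, now):
    """Issue a proxy holding at most the parent's rights and remaining lifetime."""
    return Credential(
        subject=parent.subject + "/proxy",
        rights=frozenset(rights) & parent.rights,   # restricted delegation
        expires=min(parent.expires, now + lifetime),
        issuer=parent,
    )


def is_valid(cred, trusted_subjects, now):
    """Walk the chain back to its root and check every link."""
    while True:
        if cred.expires <= now:
            return False
        if cred.issuer is None:
            # Stand-in for a lookup in the trusted certificate store.
            return cred.subject in trusted_subjects
        if not cred.rights <= cred.issuer.rights:
            return False
        cred = cred.issuer


if __name__ == "__main__":
    user = Credential("/O=example/CN=User", frozenset({"submit", "read", "write"}), 1000.0)
    proxy = delegate(user, {"submit", "read"}, lifetime=100, now=0)
    restricted = delegate(proxy, {"read", "write"}, lifetime=500, now=0)
    print(sorted(restricted.rights), restricted.expires)
    print(is_valid(restricted, {user.subject}, now=50))
```

In this model a proxy can never gain rights or lifetime that its issuer lacks, which is the property the restricted proxies in the diagram rely on.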

Page 29: Grid Computing from a solid past to a bright future?

Large-scale production Grids

• Until recently usually “smallish”
– O(10) sites, O(20) users
– Only one community (VO)

Running Production Grids
• EU DataGrid (EDG)
– Stress testing: up to 2000 jobs at any time
– Focus on stability (>99% of jobs complete correctly)
• VL-E
• NASA IPG
• LCG, PPDG/iVDGL

Page 30: Grid Computing from a solid past to a bright future?

EU DataGrid

• Middleware research project (2001-2003)
• Driving applications:
– HE Physics
– Earth Observation
– Biomedicine
• Operational testbed
– 25 sites, 50 CEs
– 8 VOs
– ~350 users, growing by ~50/month!

http://www.eu-datagrid.org/

Page 31: Grid Computing from a solid past to a bright future?

EU DataGrid Test Bed 1

• DataGrid TB1:
– 14 countries
– 21 major sites
– CrossGrid: 40 more sites
• Submitting Jobs:
– Login only once, run everywhere
– Cross administrative boundaries in a secure and trusted way
– Mutual authorization

http://marianne.in2p3.fr/

Page 32: Grid Computing from a solid past to a bright future?

EDG: 3 Tier Architecture

• Client: ‘User Interface’
• Execution Resources: ‘ComputeElement’
• Data Server: ‘StorageElement’
• Database server

Diagram labels: Request, Result, Data

Page 33: Grid Computing from a solid past to a bright future?

Example: GOME

Step 8: Visualize Results

Page 34: Grid Computing from a solid past to a bright future?

GOME processing cycle

• ‘Raw’ satellite data from the GOME instrument (Level 1)
• ESA – KNMI: processing of raw GOME data to ozone profiles with Opera and Noprego (Level 2)
• IPSL: validate GOME ozone profiles with ground-based measurements (LIDAR data)
• DataGrid
• Visualization

Page 35: Grid Computing from a solid past to a bright future?

Situation on a Grid

?

Page 36: Grid Computing from a solid past to a bright future?

Information Services (IS)

HARDWARE – fabric and storage
• Cluster information, storage capacity, network connections
• Today: info-providers publish to IS hierarchical directory
• Next week: R-GMA producer-consumer framework based on RDBMS

DATA – files and collections
• File replica locations
• Today: Replica Catalogue (RC)
• In a few months: Replica Location Service

SOFTWARE – programs & services
• RunTime Environment tags, service entries (SE, CE, RC)
• Today: in IS

Page 37: Grid Computing from a solid past to a bright future?

Grid job submission

• Basic protocol: GRAM
– Job submission at individual CE
– Status inquiries
– Credential delegation
– File staging
– Job manager (baby-sitter)
• Collective services (Workload Management System)
– Resource broker
– Job submission service
– Logging and Bookkeeping
• The EDG WMS tries to optimize the usage of resources
• Will re-submit on resource failure

Page 38: Grid Computing from a solid past to a bright future?

Job Preparation

• Information to be specified
– Job characteristics
– Requirements and Preferences of the computing system
– Software dependencies
– Job Data requirements
– Specified using a Job Description Language (JDL)

Page 39: Grid Computing from a solid past to a bright future?

Example JDL File

Executable = "gridTest";
StdError = "stderr.log";
StdOutput = "stdout.log";
InputSandbox = {"home/joda/test/gridTest"};
OutputSandbox = {"stderr.log", "stdout.log"};
InputData = "LF:testbed0-00019";
ReplicaCatalog = "ldap://sunlab2g.cnaf.infn.it:2010/ \
    lc=test, rc=WP2 INFN Test, dc=infn, dc=it";
DataAccessProtocol = "gridftp";
Requirements = other.Architecture=="INTEL" && \
    other.OpSys=="LINUX" && \
    other.FreeCpus >=4;
Rank = "other.MaxCpuTime";

This JDL is input to dg-job-submit
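To illustrate what the Requirements and Rank attributes of this JDL ask for, here is a minimal Python sketch of the matchmaking step. It is not the EDG resource broker: the Compute Element records and host names are made up, the requirements are hard-coded rather than parsed from JDL, and it assumes that a higher Rank value is preferred.

```python
# Hypothetical Compute Element records, as an information service might publish them.
CANDIDATE_CES = [
    {"Name": "ce-a.example.org", "Architecture": "INTEL", "OpSys": "LINUX",
     "FreeCpus": 8, "MaxCpuTime": 1440},
    {"Name": "ce-b.example.org", "Architecture": "INTEL", "OpSys": "LINUX",
     "FreeCpus": 2, "MaxCpuTime": 2880},
    {"Name": "ce-c.example.org", "Architecture": "INTEL", "OpSys": "LINUX",
     "FreeCpus": 16, "MaxCpuTime": 2880},
    {"Name": "ce-d.example.org", "Architecture": "ALPHA", "OpSys": "LINUX",
     "FreeCpus": 32, "MaxCpuTime": 4320},
]


def match_and_rank(compute_elements):
    """Keep the CEs that satisfy the Requirements, best Rank first."""
    eligible = [
        ce for ce in compute_elements
        if ce["Architecture"] == "INTEL"
        and ce["OpSys"] == "LINUX"
        and ce["FreeCpus"] >= 4
    ]
    # Rank = other.MaxCpuTime; a higher value is assumed to be better.
    return sorted(eligible, key=lambda ce: ce["MaxCpuTime"], reverse=True)


if __name__ == "__main__":
    for ce in match_and_rank(CANDIDATE_CES):
        print(ce["Name"], ce["MaxCpuTime"])
```

With these made-up records, the CE with too few free CPUs and the non-Intel CE are filtered out, and the remaining two are ordered by their maximum CPU time.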

Page 40: Grid Computing from a solid past to a bright future?

Job Submission Scenario

Components: UI (JDL), Logging & Bookkeeping (LB), Resource Broker (RB), Job Submission Service (JSS), Storage Element (SE), Compute Element (CE), Information Service (IS), Replica Catalogue (RC)

Page 41: Grid Computing from a solid past to a bright future?

Example

Components: UI (JDL), Logging & Bookkeeping (LB), Resource Broker (RB), Job Submission Service (JSS), Storage Element (SE), Compute Element (CE), Information Service (IS), Replica Catalogue (RC)

Diagram labels: Job Submit Event, Input Sandbox

Job Status: submitted

Page 42: Grid Computing from a solid past to a bright future?

Example

Components: UI (JDL), Logging & Bookkeeping (LB), Resource Broker (RB), Job Submission Service (JSS), Storage Element (SE), Compute Element (CE), Information Service (IS), Replica Catalogue (RC)

Job Status: submitted → waiting

Page 43: Grid Computing from a solid past to a bright future?

Example

Components: UI (JDL), Logging & Bookkeeping (LB), Resource Broker (RB), Job Submission Service (JSS), Storage Element (SE), Compute Element (CE), Information Service (IS), Replica Catalogue (RC)

Job Status: submitted → waiting → ready

Page 44: Grid Computing from a solid past to a bright future?

Example

Components: UI (JDL), Logging & Bookkeeping (LB), Resource Broker (RB), Job Submission Service (JSS), Storage Element (SE), Compute Element (CE), Information Service (IS), Replica Catalogue (RC)

Diagram labels: BrokerInfo

Job Status: submitted → waiting → ready → scheduled

Page 45: Grid Computing from a solid past to a bright future?

Example

Components: UI (JDL), Logging & Bookkeeping (LB), Resource Broker (RB), Job Submission Service (JSS), Storage Element (SE), Compute Element (CE), Information Service (IS), Replica Catalogue (RC)

Diagram labels: Input Sandbox

Job Status: submitted → waiting → ready → scheduled → running

Page 46: Grid Computing from a solid past to a bright future?

Example

Components: UI (JDL), Logging & Bookkeeping (LB), Resource Broker (RB), Job Submission Service (JSS), Storage Element (SE), Compute Element (CE), Information Service (IS), Replica Catalogue (RC)

Diagram labels: Job Status

Job Status: submitted → waiting → ready → scheduled → running

Page 47: Grid Computing from a solid past to a bright future?

Example

Components: UI (JDL), Logging & Bookkeeping, Resource Broker, Job Submission Service, Storage Element, Compute Element, Information Service, Replica Catalogue

Diagram labels: Job Status

Job Status: submitted → waiting → ready → scheduled → running → done

Page 48: Grid Computing from a solid past to a bright future?

Example

Components: UI (JDL), Logging & Bookkeeping, Resource Broker, Job Submission Service, Storage Element, Compute Element, Information Service, Replica Catalogue

Diagram labels: Job Status, Output Sandbox

Job Status: submitted → waiting → ready → scheduled → running → done → outputready

Page 49: Grid Computing from a solid past to a bright future?

Example

Components: UI (JDL), Logging & Bookkeeping (LB), Resource Broker (RB), Job Submission Service (JSS), Storage Element (SE), Compute Element (CE), Information Service (IS), Replica Catalogue (RC)

Diagram labels: Output Sandbox

Job Status: submitted → waiting → ready → scheduled → running → done → outputready → cleared
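The job statuses shown across these example slides form a simple linear lifecycle. A minimal Python sketch of that sequence follows. It covers only the successful path shown here; any failure or abort states of the real workload management system are not modelled, and the function names are illustrative.

```python
# Status names in the order they appear on the slides.
JOB_STATES = (
    "submitted",
    "waiting",
    "ready",
    "scheduled",
    "running",
    "done",
    "outputready",
    "cleared",
)


def next_state(current):
    """Return the status that follows `current` on the successful path."""
    index = JOB_STATES.index(current)   # raises ValueError for unknown names
    if index + 1 == len(JOB_STATES):
        raise ValueError(f"{current!r} is the final state")
    return JOB_STATES[index + 1]


def follows_lifecycle(observed):
    """True if `observed` starts at 'submitted' and advances one step at a time."""
    return len(observed) > 0 and tuple(observed) == JOB_STATES[:len(observed)]


if __name__ == "__main__":
    state = "submitted"
    while state != "cleared":
        print(state)
        state = next_state(state)
    print(state)
```

A logging and bookkeeping client could use a check like `follows_lifecycle` to spot missing or out-of-order status events.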

Page 50: Grid Computing from a solid past to a bright future?

Data Access & Transport

• Requirements
– Support single sign-on
– Transfer large files quickly
– Confidentiality/integrity
– Integrated with information systems (RC)
• Extensions to FTP protocol: GridFTP
– GSI, DCAU
– Server striping, parallel streams
• TCP protocol optimisation
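The basic idea behind parallel streams is to divide one large transfer into byte ranges that can move concurrently. The Python sketch below shows only that partitioning step. It is not the GridFTP wire protocol, whose block handling differs, and the function name `split_ranges` is hypothetical.

```python
def split_ranges(file_size, streams):
    """Divide `file_size` bytes into contiguous (offset, length) ranges.

    At most `streams` ranges are returned. Sizes differ by at most one byte,
    and empty ranges are dropped when the file is smaller than the stream count.
    """
    if file_size < 0 or streams < 1:
        raise ValueError("file_size must be >= 0 and streams >= 1")
    base, extra = divmod(file_size, streams)
    ranges = []
    offset = 0
    for index in range(streams):
        # The first `extra` streams carry one additional byte each.
        length = base + (1 if index < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


if __name__ == "__main__":
    for offset, length in split_ranges(10, 3):
        print(f"stream reads {length} bytes starting at offset {offset}")
```

Each range could then be handed to its own connection, with the receiver writing every block at its stated offset.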

Page 51: Grid Computing from a solid past to a bright future?

EDG Storage Element

• Transfer methods:
– gridFTP
– RFIO
– G-HTTPS
• Replica Catalogue
– Yesterday: LDAP directory using GDMP
– Today: Replica Location Service and Giggle
• Backend systems
– Disk storage
– HPSS via HRM
– HPSS with explicit staging

Page 52: Grid Computing from a solid past to a bright future?

Grid Data Bases ?!

• Database Access and Integration (DAI)-WG
– OGSA-DAI integration project
– Data Virtualisation Services
– Standard Data Source Services
• Early Emerging Standards:
– Grid Data Service specification (GDS)
– Grid Data Service Factory (GDSF)

Largely spin-off from the UK e-Science effort & DataGrid

Page 53: Grid Computing from a solid past to a bright future?

Grid Access to Databases

• SpitFire (standard data source services): uniform access to persistent storage on the Grid
• Multiple roles support
• Compatible with GSI (single sign-on) through CoG
• Uses standard stuff: JDBC, SOAP, XML
• Supports various back-end data bases

http://hep-proj-spitfire.web.cern.ch/hep-proj-spitfire/

Page 54: Grid Computing from a solid past to a bright future?

Spitfire security model

Standard access to DBs

• GSI SOAP protocol
• Strong authentication
• Supports single sign-on
• Local role repository
• Connection pool to multiple backend DBs

Version 1.0 out, WebServices version in alpha

Page 55: Grid Computing from a solid past to a bright future?

Bringing Grids to the User

• Core services too complex to present to scientists: design (graphical/web) portals
• VLAM-G
• GENIUS/EnginFrame
• EDG GUI
• Application-specific interfaces

Page 56: Grid Computing from a solid past to a bright future?

A Bright Future?

Page 57: Grid Computing from a solid past to a bright future?

Grids Around the World

• Many different grid projects
• Different goals (and thus architectures)
• Breadth of applications
– Meta-supercomputing (origin of the Grid)
– High-throughput computing (DataGrids)
– Collaboratories, data fusion grids
– Harnessing idle workstations
– Transaction-oriented grids (industry)
• Interoperability requires standardisation!

Page 58: Grid Computing from a solid past to a bright future?

Standards Requirements

• GGF established in 2001: merger of GridForum and Egrid Forum

• Approx. 50 working & research groups

[Chart: (G)GF attendance, 1999–2002; vertical axis 0–1200]

http://www.ggf.org/

Page 59: Grid Computing from a solid past to a bright future?

OGSA: current directions

Open Grid Services Architecture … cleaning up the protocol mess

• Use standard containers (based on web services)
• Based on common standards:
– SOAP, WSDL, UDDI
– Running over “upgraded” Grid Security Infra (GSI)
• New in OGSA: adding transient “manageable” services:
– State of distributed activities
– Workflow, multi-media, distributed data analysis

Page 60: Grid Computing from a solid past to a bright future?

OGSA Roadmap

• Introduced at GGF4 (Toronto, March 2002)
• OGSI definition draft went for final call last week
• First implementations – Globus Toolkit v3
– Currently in alpha testing
– Beta release in July
• Significant effort towards homogeneous interfaces
• Large commitment (world-wide and local)

Page 61: Grid Computing from a solid past to a bright future?

Dutch Dimensions

Page 62: Grid Computing from a solid past to a bright future?

SURFnet5 connectivity

http://www.surfnet.nl/

Page 63: Grid Computing from a solid past to a bright future?

Networking: Europe

http://www.dante.net/

Page 64: Grid Computing from a solid past to a bright future?

DutchGrid Platform

Map locations: Amsterdam, Utrecht, KNMI, Delft, Nijmegen, TELIN, Leiden, ASTRON, JIVE

• DutchGrid:
– Test bed coordination
– PKI security
– Support
• Participation by
– NIKHEF, KNMI, SARA
– DAS-2 (ASCI): TUDelft, Leiden, VU, UvA, Utrecht
– Telematics Institute
– FOM, NWO/NCF
– Min. EZ (ICES/KIS)
– IBM, KPN, …

www.dutchgrid.nl

Page 65: Grid Computing from a solid past to a bright future?

Resources

• ASCI DAS-2 (VU, UvA, Leiden, TUDelft, Utrecht)
– 200 dual P-III 1GHz CPUs
– homogeneous clusters, 5 locations
• NIKHEF DataGrid clusters
– 75 dual P-III ~ 1GHz
– 1Gb/s IPv4 + 1Gb/s IPv6
• NCF Grid (national computer facilities foundation from NWO)
– 66 node dual AMD-K7 Fabric Research Cluster (NIKHEF)
– 32 node duals “production quality” cluster (SARA)*
– 10Gb/s optical “lambda” test bed
– …
• BioASP – various smaller O(1-10 node) clusters

Page 66: Grid Computing from a solid past to a bright future?

Resources (cont.)

SARA – National HPC Centre
• Processing
– SGI 1024 processor MPP
• Mass storage
– StorageTek NearLine tape robot
– currently: 500 TByte
– Integrated as an EDG “Storage Element”
• User expertise centre

SURFnet – networking
• 2.5-10 Gb/s international
• 10 Gb/s to dedicated centres (DAS-2, ASTRON)

Page 67: Grid Computing from a solid past to a bright future?

A Bright Future!

You could plug your computer into the wall and have direct access to huge (computing) resources almost immediately (with a little help from toolkits and portals) …

It may still be science – although not fiction – but we are working hard to get there!