
Page 1: The Grid Enabling Resource Sharing within Virtual Organizations

The Grid: Enabling Resource Sharing within Virtual Organizations

Ian Foster

Mathematics and Computer Science Division

Argonne National Laboratory

and

Department of Computer Science

The University of Chicago

http://www.mcs.anl.gov/~foster

Invited Talk, Research Library Group Annual Conference, Amsterdam, April 22, 2002

Page 2: The Grid Enabling Resource Sharing within Virtual Organizations


[email protected] ARGONNE CHICAGO

Overview

– The technology landscape: living in an exponential world
– Grid concepts: resource sharing in virtual organizations
– Petascale Virtual Data Grids: data, programs, computers, and computations as community resources

Page 3: The Grid Enabling Resource Sharing within Virtual Organizations


Living in an Exponential World: (1) Computing & Sensors

– Moore’s Law: transistor count doubles every 18 months

[Figure captions: magnetohydrodynamics; star formation]

Page 4: The Grid Enabling Resource Sharing within Virtual Organizations


Living in an Exponential World: (2) Storage

– Storage density doubles every 12 months
– Dramatic growth in online data (1 petabyte = 1000 terabytes = 1,000,000 gigabytes):
  – 2000: ~0.5 petabyte
  – 2005: ~10 petabytes
  – 2010: ~100 petabytes
  – 2015: ~1000 petabytes?
– Transforming entire disciplines in the physical and, increasingly, biological sciences; humanities next?
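As a quick sanity check (illustrative arithmetic, not taken from the slides), the milestone figures above imply a doubling time somewhat longer than the 12-month density figure:

```python
import math

# The slide lists ~0.5 PB (2000) growing to ~1000 PB (2015).
# Implied doubling time T solves 2^(months / T) = growth.
growth = 1000 / 0.5                  # ~2000x over 15 years
months = 15 * 12
doublings = math.log2(growth)        # ~11 doublings
print(round(months / doublings, 1))  # ~16.4 months per doubling
```

So the projected data volumes track a slightly gentler curve than raw storage density, which is consistent with density being only one driver of total online data.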

Page 5: The Grid Enabling Resource Sharing within Virtual Organizations


Data Intensive Physical Sciences

– High energy & nuclear physics
  – Including new experiments at CERN
– Gravity wave searches
  – LIGO, GEO, VIRGO
– Time-dependent 3-D systems (simulation, data)
  – Earth observation, climate modeling
  – Geophysics, earthquake modeling
  – Fluids, aerodynamic design
  – Pollutant dispersal scenarios
– Astronomy: digital sky surveys

Page 6: The Grid Enabling Resource Sharing within Virtual Organizations


Ongoing Astronomical Mega-Surveys

– Large number of new surveys
  – Multi-TB in size, 100M objects or larger
  – In databases
  – Individual archives planned and under way
– Multi-wavelength view of the sky
  – >13 wavelength coverage within 5 years
– Impressive early discoveries
  – Finding exotic objects by unusual colors: L, T dwarfs, high-redshift quasars
  – Finding objects by time variability: gravitational micro-lensing

Surveys include: MACHO, 2MASS, SDSS, DPOSS, GSC-II, COBE, MAP, NVSS, FIRST, GALEX, ROSAT, OGLE, …

Page 7: The Grid Enabling Resource Sharing within Virtual Organizations


Crab Nebula in 4 Spectral Regions

[Figure: X-ray, optical, infrared, and radio images of the Crab Nebula]

Page 8: The Grid Enabling Resource Sharing within Virtual Organizations


Coming Floods of Astronomy Data

– The planned Large Synoptic Survey Telescope will produce over 10 petabytes per year by 2008!
– All-sky survey every few days, so will have fine-grain time series for the first time

Page 9: The Grid Enabling Resource Sharing within Virtual Organizations


Data Intensive Biology and Medicine

– Medical data
  – X-ray, mammography data, etc. (many petabytes)
  – Digitizing patient records (ditto)
– X-ray crystallography
– Molecular genomics and related disciplines
  – Human Genome, other genome databases
  – Proteomics (protein structure, activities, …)
  – Protein interactions, drug delivery
– Virtual Population Laboratory (proposed)
  – Simulate likely spread of disease outbreaks
– Brain scans (3-D, time dependent)

Page 10: The Grid Enabling Resource Sharing within Virtual Organizations


A Brain is a Lot of Data!
(Mark Ellisman, UCSD)

We need to get to one micron to know the location of every cell. We’re just now starting to get to 10 microns – Grids will help get us there and further.

And comparisons must be made among many brains.

Page 11: The Grid Enabling Resource Sharing within Virtual Organizations


An Exponential World: (3) Networks
(Or, Coefficients Matter …)

– Network vs. computer performance
  – Computer speed doubles every 18 months
  – Network speed doubles every 9 months
  – Difference = order of magnitude per 5 years
– 1986 to 2000
  – Computers: x 500
  – Networks: x 340,000
– 2001 to 2010
  – Computers: x 60
  – Networks: x 4000

Moore’s Law vs. storage improvements vs. optical improvements. Graph from Scientific American (Jan-2001) by Cleo Vilett; source: Vinod Khosla, Kleiner, Caufield and Perkins.
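The "order of magnitude per 5 years" claim follows directly from the two doubling periods; an illustrative check (not from the slides):

```python
# Over 5 years (60 months), capability grows by 2^(months / doubling_period).
months = 5 * 12
computer_growth = 2 ** (months / 18)   # speed doubles every 18 months
network_growth = 2 ** (months / 9)     # speed doubles every 9 months

print(round(computer_growth))                    # 10  (~10x for computers)
print(round(network_growth))                     # 102 (~100x for networks)
print(round(network_growth / computer_growth))   # 10  -- an order of magnitude per 5 years
```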

Page 12: The Grid Enabling Resource Sharing within Virtual Organizations


Evolution of the Scientific Process

– Pre-electronic
  – Theorize &/or experiment, alone or in small teams; publish paper
– Post-electronic
  – Construct and mine very large databases of observational or simulation data
  – Develop computer simulations & analyses
  – Exchange information quasi-instantaneously within large, distributed, multidisciplinary teams

Page 13: The Grid Enabling Resource Sharing within Virtual Organizations


The Grid

“Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations”

Page 14: The Grid Enabling Resource Sharing within Virtual Organizations


An Example Virtual Organization: CERN’s Large Hadron Collider

1800 Physicists, 150 Institutes, 32 Countries

100 PB of data by 2010; 50,000 CPUs?

Page 15: The Grid Enabling Resource Sharing within Virtual Organizations


Grid Communities & Applications:
Data Grids for High Energy Physics

[Diagram: the LHC tiered data grid]
– Online system feeds the CERN Computer Centre (Tier 0) at ~100 MBytes/sec (detector output is ~PBytes/sec before triggering); offline processor farm ~20 TIPS
– Tier 1: regional centres – FermiLab (~4 TIPS), France, Italy, Germany – linked at ~622 Mbits/sec (or air freight, deprecated)
– Tier 2: centres of ~1 TIPS each (e.g., Caltech), linked at ~622 Mbits/sec
– Institute servers (~0.25 TIPS) with physics data caches, feeding physicist workstations (Tier 4) at ~1 MBytes/sec

– There is a “bunch crossing” every 25 nsecs
– There are 100 “triggers” per second
– Each triggered event is ~1 MByte in size
– Physicists work on analysis “channels”; each institute will have ~10 physicists working on one or more channels, and data for these channels should be cached by the institute server

1 TIPS is approximately 25,000 SpecInt95 equivalents

www.griphyn.org  www.ppdg.net  www.eu-datagrid.org
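As a quick sanity check (illustrative arithmetic, not from the slides), the quoted trigger rate and event size reproduce the diagram's ~100 MBytes/sec link out of the online system:

```python
# 100 triggers per second, each triggered event ~1 MByte.
triggers_per_sec = 100
event_size_mb = 1
rate_mb_per_sec = triggers_per_sec * event_size_mb
print(rate_mb_per_sec)     # 100 -- matches the ~100 MBytes/sec link to Tier 0

# Accumulated over a year of continuous running:
seconds_per_year = 365 * 24 * 3600
petabytes_per_year = rate_mb_per_sec * seconds_per_year / 1e9  # MB -> PB
print(round(petabytes_per_year, 1))  # ~3.2 PB/year of raw triggered data
```

The larger totals quoted elsewhere (100 PB by 2010) reflect multiple experiments plus derived and simulated data on top of this raw stream.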

Page 16: The Grid Enabling Resource Sharing within Virtual Organizations


The Grid Opportunity: eScience and eBusiness

Physicists worldwide pool resources for peta-op analyses of petabytes of data

Civil engineers collaborate to design, execute, & analyze shake table experiments

An insurance company mines data from partner hospitals for fraud detection

An application service provider offloads excess load to a compute cycle provider

An enterprise configures internal & external resources to support eBusiness workload

Page 17: The Grid Enabling Resource Sharing within Virtual Organizations


Grid Computing

Page 18: The Grid Enabling Resource Sharing within Virtual Organizations


The Grid: A Brief History

– Early 90s
  – Gigabit testbeds, metacomputing
– Mid to late 90s
  – Early experiments (e.g., I-WAY), academic software projects (e.g., Globus, Legion), application experiments
– 2002
  – Dozens of application communities & projects
  – Major infrastructure deployments
  – Significant technology base (esp. Globus Toolkit™)
  – Growing industrial interest
  – Global Grid Forum: ~500 people, 20+ countries

Page 19: The Grid Enabling Resource Sharing within Virtual Organizations


Challenging Technical Requirements

Dynamic formation and management of virtual organizations

Online negotiation of access to services: who, what, why, when, how

Establishment of applications and systems able to deliver multiple qualities of service

Autonomic management of infrastructure elements

Open Grid Services Architecture

http://www.globus.org/ogsa

Page 20: The Grid Enabling Resource Sharing within Virtual Organizations


Data Intensive Science: 2000-2015

– Scientific discovery increasingly driven by IT
  – Computationally intensive analyses
  – Massive data collections
  – Data distributed across networks of varying capability
  – Geographically distributed collaboration
– Dominant factor: data growth (1 petabyte = 1000 TB)
  – 2000: ~0.5 petabyte
  – 2005: ~10 petabytes
  – 2010: ~100 petabytes
  – 2015: ~1000 petabytes?

How to collect, manage, access, and interpret this quantity of data?

Drives demand for “Data Grids” to handle the additional dimension of data access & movement.

Page 21: The Grid Enabling Resource Sharing within Virtual Organizations


Data Grid Projects

– Particle Physics Data Grid (US, DOE): Data Grid applications for HENP experiments
– GriPhyN (US, NSF): Petascale Virtual-Data Grids
– iVDGL (US, NSF): global Grid lab
– TeraGrid (US, NSF): distributed supercomputing resources (13 TFlops)
– European Data Grid (EU, EC): Data Grid technologies, EU deployment
– CrossGrid (EU, EC): Data Grid technologies, EU emphasis
– DataTAG (EU, EC): transatlantic network, Grid applications
– Japanese Grid Projects (APGrid) (Japan): Grid deployment throughout Japan

– Collaborations of application scientists & computer scientists
– Infrastructure development & deployment
– Globus based

Page 22: The Grid Enabling Resource Sharing within Virtual Organizations


Biomedical Informatics Research Network (BIRN)

An evolving reference set of brains provides essential data for developing therapies for neurological disorders (multiple sclerosis, Alzheimer’s, etc.).

– Today
  – One lab, small patient base
  – 4 TB collection
– Tomorrow
  – 10s of collaborating labs
  – Larger population sample
  – 400 TB data collection: more brains, higher resolution
  – Multiple-scale data integration and analysis

Page 23: The Grid Enabling Resource Sharing within Virtual Organizations


GriPhyN = App. Science + CS + Grids

– GriPhyN = Grid Physics Network
  – US-CMS High Energy Physics
  – US-ATLAS High Energy Physics
  – LIGO/LSC gravity wave research
  – SDSS Sloan Digital Sky Survey
  – Strong partnership with computer scientists
– Design and implement production-scale grids
  – Develop common infrastructure, tools, and services
  – Integration into the 4 experiments
  – Application to other sciences via “Virtual Data Toolkit”
– Multi-year project
  – R&D for grid architecture (funded at $11.9M + $1.6M)
  – Integrate Grid infrastructure into experiments through VDT

Page 24: The Grid Enabling Resource Sharing within Virtual Organizations


GriPhyN Institutions

– U Florida

– U Chicago

– Boston U

– Caltech

– U Wisconsin, Madison

– USC/ISI

– Harvard

– Indiana

– Johns Hopkins

– Northwestern

– Stanford

– U Illinois at Chicago

– U Penn

– U Texas, Brownsville

– U Wisconsin, Milwaukee

– UC Berkeley

– UC San Diego

– San Diego Supercomputer Center

– Lawrence Berkeley Lab

– Argonne

– Fermilab

– Brookhaven

Page 25: The Grid Enabling Resource Sharing within Virtual Organizations


GriPhyN: PetaScale Virtual Data Grids

[Diagram: layered virtual data grid architecture, ~1 Petaop/s and ~100 Petabytes in scale]
– Users: production teams, individual investigators, and workgroups, via interactive user tools
– Tools: virtual data tools; request planning & scheduling tools; request execution & management tools
– Services: resource management services, security and policy services, other Grid services
– Resources: transforms, raw data sources, and distributed resources (code, storage, CPUs, networks)

Page 26: The Grid Enabling Resource Sharing within Virtual Organizations


GriPhyN Research Agenda

– Virtual Data technologies
  – Derived data, calculable via algorithm
  – Instantiated 0, 1, or many times (e.g., caches)
  – “Fetch value” vs. “execute algorithm”
  – Potentially complex (versions, cost calculation, etc.)
  – E.g., LIGO: “Get gravitational strain for 2 minutes around 200 gamma-ray bursts over last year”
– For each requested data value, need to
  – Locate item materialization, location, and algorithm
  – Determine costs of fetching vs. calculating
  – Plan data movements and computations to obtain results
  – Execute the plan
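The fetch-vs-calculate decision at the core of this agenda can be sketched as follows (a minimal illustration with hypothetical names, not GriPhyN's actual planner API): for each requested item, compare the cost of moving an existing replica against the cost of re-running the derivation that produced it.

```python
def plan_request(item, replicas, derivation_cost, transfer_cost):
    """Return ('fetch', site) or ('compute', None), whichever is cheaper.

    replicas        -- list of (site, size_gb) pairs where the item is materialized
    derivation_cost -- estimated cost of regenerating the item from its algorithm
    transfer_cost   -- function (site, size_gb) -> cost of fetching from that site
    """
    fetch_options = [(transfer_cost(site, size), site) for site, size in replicas]
    if fetch_options:
        best_cost, best_site = min(fetch_options)
        if best_cost <= derivation_cost:
            return ("fetch", best_site)
    # No replica exists, or every fetch is dearer than recomputing.
    return ("compute", None)

# Toy usage: a 10 GB item replicated at two sites; recomputing costs 50 units.
choice = plan_request(
    "strain-segment-042",
    replicas=[("fnal", 10), ("cern", 10)],
    derivation_cost=50,
    transfer_cost=lambda site, size: size * (2 if site == "fnal" else 8),
)
print(choice)  # ('fetch', 'fnal') -- 20 units beats both 80 and 50
```

A real planner would also fold in policy constraints and co-scheduling, but the cost comparison above is the essential step.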

Page 27: The Grid Enabling Resource Sharing within Virtual Organizations


Virtual Data in Action

– A data request may
  – Compute locally
  – Compute remotely
  – Access local data
  – Access remote data
– Scheduling based on
  – Local policies
  – Global policies
  – Cost

[Diagram: “fetch item” requests flowing across major facilities and archives, regional facilities and caches, and local facilities and caches]

Page 28: The Grid Enabling Resource Sharing within Virtual Organizations


GriPhyN Research Agenda (cont.)

– Execution management
  – Co-allocation (CPU, storage, network transfers)
  – Fault tolerance, error reporting
  – Interaction, feedback to planning
– Performance analysis (with PPDG)
  – Instrumentation, measurement of all components
  – Understand and optimize grid performance
– Virtual Data Toolkit (VDT)
  – VDT = virtual data services + virtual data tools
  – One of the primary deliverables of the R&D effort
  – Technology transfer to other scientific domains

Page 29: The Grid Enabling Resource Sharing within Virtual Organizations


iVDGL: A Global Grid Laboratory

– International Virtual-Data Grid Laboratory
  – A global Grid lab (US, EU, South America, Asia, …)
  – A place to conduct Data Grid tests “at scale”
  – A mechanism to create common Grid infrastructure
  – A production facility for LHC experiments
  – An experimental laboratory for other disciplines
  – A focus of outreach efforts to small institutions
– Funded for $13.65M by NSF

“We propose to create, operate and evaluate, over a sustained period of time, an international research laboratory for data-intensive science.”
– From NSF proposal, 2001

Page 30: The Grid Enabling Resource Sharing within Virtual Organizations


Initial US-iVDGL Data Grid

[Map legend: Tier1 (FNAL); Proto-Tier2; Tier3 university]
Sites: Fermilab, BNL, UCSD, Florida, Wisconsin, Indiana, BU, Caltech, SKC, Brownsville, Hampton, PSU, JHU
Other sites to be added in 2002

Page 31: The Grid Enabling Resource Sharing within Virtual Organizations


iVDGL Map (2002-2003)

[Map legend: Tier0/1 facility; Tier2 facility; Tier3 facility; 10 Gbps link; 2.5 Gbps link; 622 Mbps link; other link; DataTAG; Surfnet]
Later: Brazil, Pakistan, Russia, China

Page 32: The Grid Enabling Resource Sharing within Virtual Organizations


Programs as Community Resources:
Data Derivation and Provenance

– Most scientific data are not simple “measurements”; essentially all are:
  – Computationally corrected/reconstructed
  – And/or produced by numerical simulation
– And thus, as data and computers become ever larger and more expensive:
  – Programs are significant community resources
  – So are the executions of those programs

Page 33: The Grid Enabling Resource Sharing within Virtual Organizations


[Diagram: entities Transformation, Derivation, and Data, linked by “execution-of”, “created-by”, and “consumed-by/generated-by” relationships]

“I’ve detected a calibration error in an instrument and want to know which derived data to recompute.”

“I’ve come across some interesting data, but I need to understand the nature of the corrections applied when it was constructed before I can trust it for my purposes.”

“I want to search an astronomical database for galaxies with certain characteristics. If a program that performs this analysis exists, I won’t have to write one from scratch.”

“I want to apply an astronomical analysis program to millions of objects. If the results already exist, I’ll save weeks of computation.”
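The first use case above ("which derived data to recompute?") falls directly out of the provenance relationships; a minimal sketch (hypothetical structures, not Chimera's actual schema) of tracing a corrupted input forward through derivations:

```python
from collections import defaultdict

# derivation name -> (inputs, outputs); data items are plain strings.
derivations = {
    "calibrate": (["raw-scan"], ["calibrated-scan"]),
    "reconstruct": (["calibrated-scan"], ["event-summary"]),
    "analyze": (["event-summary"], ["histogram"]),
}

# Invert the consumed-by relationship: which derivations read each data item?
consumers = defaultdict(list)
for name, (inputs, outputs) in derivations.items():
    for item in inputs:
        consumers[item].append(name)

def affected_by(bad_item):
    """All derived data downstream of a corrupted item, i.e. what to recompute."""
    tainted, frontier = set(), [bad_item]
    while frontier:
        item = frontier.pop()
        for deriv in consumers[item]:
            for out in derivations[deriv][1]:
                if out not in tainted:
                    tainted.add(out)
                    frontier.append(out)
    return tainted

print(sorted(affected_by("raw-scan")))  # ['calibrated-scan', 'event-summary', 'histogram']
```

The other use cases are queries over the same graph in the opposite direction: walking created-by edges backwards yields the corrections applied, and matching a request against existing derivations yields reusable programs or results.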

Page 34: The Grid Enabling Resource Sharing within Virtual Organizations


The Chimera Virtual Data System
(GriPhyN Project)

– Virtual data catalog
  – Transformations, derivations, data
– Virtual data language
  – Data definition + query
– Applications include browsers and data analysis applications

[Diagram: Virtual Data Applications issue Virtual Data Language (definition and query) to a VDL Interpreter (manipulates derivations and transformations), which queries the Virtual Data Catalog (implements the Chimera Virtual Data Schema) via SQL and emits Task Graphs (compute and data movement tasks, with dependencies) for execution on Data Grid Resources (distributed execution and data management)]

Page 35: The Grid Enabling Resource Sharing within Virtual Organizations


Early GriPhyN Challenge Problem:
CMS Data Reconstruction
(Scott Koranda, Miron Livny, others)

Participating resources: a Caltech workstation (master Condor job), the Wisconsin Condor pool (secondary Condor job), an NCSA Linux cluster, and NCSA UniTree (a GridFTP-enabled FTP server).

2) Launch secondary job on Wisconsin pool; input files via Globus GASS
3) 100 Monte Carlo jobs on Wisconsin Condor pool
4) 100 data files transferred via GridFTP, ~1 GB each
5) Secondary reports complete to master
6) Master starts reconstruction jobs via Globus jobmanager on cluster
7) GridFTP fetches data from UniTree
8) Processed Objectivity database stored to UniTree
9) Reconstruction job reports complete to master

Page 36: The Grid Enabling Resource Sharing within Virtual Organizations


Trace of a Condor-G Physics Run

[Chart: job counts (0–120) over time for pre / simulation / post jobs on UW Condor, ooHits at NCSA, and ooDigis at NCSA; a delay due to a script error is visible mid-run]

Page 37: The Grid Enabling Resource Sharing within Virtual Organizations


Knowledge-based Data Grid Roadmap
(Reagan Moore, SDSC)

[Diagram: three layers – Data, Information, Knowledge – each with ingest services, management, and access services (model-based access; data handling system). Technologies named across the layers include MCAT/HDF, Grids, XML DTD, SDLIP, XTM DTD, and rules/KQL.]
– Data: fields, containers, folders; storage (replicas, persistent IDs); attribute-based query
– Information: attributes, semantics; information repository; feature-based query
– Knowledge: relationships between concepts; knowledge repository for rules; knowledge- or topic-based query / browse

Page 38: The Grid Enabling Resource Sharing within Virtual Organizations


New Programs

– U.K. eScience program
– EU 6th Framework
– U.S. Committee on Cyberinfrastructure
– Japanese Grid initiative

Page 39: The Grid Enabling Resource Sharing within Virtual Organizations


U.S. Cyberinfrastructure: Draft Recommendations

– A new INITIATIVE to revolutionize science and engineering research at NSF and worldwide, capitalizing on new computing and communications opportunities
– 21st Century Cyberinfrastructure includes supercomputing, but also massive storage, networking, software, collaboration, visualization, and human resources
  – Current centers (NCSA, SDSC, PSC) are a key resource for the INITIATIVE
  – Budget estimate: incremental $650M/year (continuing)
– An INITIATIVE OFFICE with a highly placed, credible leader empowered to:
  – Initiate competitive, discipline-driven, path-breaking applications of cyberinfrastructure within NSF which contribute to the shared goals of the INITIATIVE
  – Coordinate policy and allocations across fields and projects, with participants across NSF directorates, Federal agencies, and international e-science
  – Develop high-quality middleware and other software that is essential and special to scientific research
  – Manage individual computational, storage, and networking resources at least 100x larger than individual projects or universities can provide

Page 40: The Grid Enabling Resource Sharing within Virtual Organizations


Summary

– Technology exponentials are changing the shape of scientific investigation & knowledge
  – More computing, even more data, yet more networking
– The Grid: resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
– Petascale Virtual Data Grids represent the future in science infrastructure
  – Data, programs, computers, computations as community resources

Page 41: The Grid Enabling Resource Sharing within Virtual Organizations


For More Information

– Grid Book: www.mkp.com/grids
– The Globus Project™: www.globus.org
– Global Grid Forum: www.gridforum.org
– TeraGrid: www.teragrid.org
– EU DataGrid: www.eu-datagrid.org
– GriPhyN: www.griphyn.org
– iVDGL: www.ivdgl.org
– Background papers: www.mcs.anl.gov/~foster