
Page 1:

Paul Avery, University of Florida
http://www.phys.ufl.edu/~avery/
[email protected]

U.S. Physics Data Grid Projects

International Workshop on HEP Data Grids
Kyungpook National University, Daegu, Korea

Nov. 8-9, 2002

Page 2:

“Trillium”: US Physics Data Grid Projects

Particle Physics Data Grid (PPDG): Data Grid for HENP experiments (ATLAS, CMS, D0, BaBar, STAR, JLab)

GriPhyN: Petascale Virtual-Data Grids (ATLAS, CMS, LIGO, SDSS)

iVDGL: Global Grid laboratory (ATLAS, CMS, LIGO, SDSS, NVO)

Data-intensive experiments

Collaborations of physicists & computer scientists

Infrastructure development & deployment

Globus + VDT based


Page 3:

Why Trillium? Many common aspects

Large overlap in project leadership
Large overlap in participants
Large overlap in experiments, particularly LHC
Common projects (monitoring, etc.)
Common packaging
Common use of VDT and other GriPhyN software

Funding agencies like collaboration
Good working relationship on grids between NSF and DOE
Good complementarity: DOE (labs), NSF (universities)
Collaboration of computer science/physics/astronomy encouraged

Organization from the "bottom up"
With encouragement from funding agencies

Page 4:

Driven by LHC Computing Challenges

1800 physicists, 150 institutes, 32 countries

Complexity: millions of detector channels, complex events
Scale: PetaOps (CPU), Petabytes (data)
Distribution: global distribution of people & resources

Page 5:

Global LHC Data Grid (experiment example: CMS)

[Diagram: the Online System feeds the Tier 0 CERN Computer Center (>20 TIPS) at 100-200 MBytes/s; Tier 1 national centers (USA, Korea, Russia, UK) connect to Tier 0 at 2.5 Gbits/s; Tier 2 centers connect at ~0.6-2.5 Gbits/s; Tier 3 institute servers connect at 0.1-1 Gbits/s; Tier 4 is physics caches, PCs, and other portals. Aggregate capacity Tier0 : (sum of Tier1) : (sum of Tier2) ~ 1:1:1.]
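To make the tiered topology concrete, here is a minimal Python sketch of how a planner might represent the hierarchy. The site names beyond those on the slide, the code itself, and the exact bandwidth assignments are illustrative assumptions, not project software.

```python
from dataclasses import dataclass, field

@dataclass
class Site:
    """One node in the LHC data-grid hierarchy (values are nominal, from the slide)."""
    name: str
    tier: int
    uplink_gbps: float          # nominal bandwidth toward the parent tier
    children: list = field(default_factory=list)

# Illustrative topology only.
cern = Site("CERN Computer Center", tier=0, uplink_gbps=1.6)   # ~100-200 MBytes/s from the online system
for region in ["USA", "Korea", "Russia", "UK"]:
    tier1 = Site(f"{region} Tier 1", tier=1, uplink_gbps=2.5)
    tier1.children = [Site(f"{region} Tier 2 center {i}", tier=2, uplink_gbps=0.6) for i in range(2)]
    cern.children.append(tier1)

def sites_at_tier(root: Site, tier: int):
    """Walk the hierarchy and yield all sites at a given tier."""
    if root.tier == tier:
        yield root
    for child in root.children:
        yield from sites_at_tier(child, tier)

print([s.name for s in sites_at_tier(cern, 2)])
```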

Page 6:

LHC Tier2 Center (2001)

[Diagram: "flat" switching topology. A router connects the WAN to a single FEth/GEth switch serving a data server with >1 RAID array and the worker nodes.]

20-60 nodes, dual 0.8-1 GHz P3, 1 TByte RAID

Page 7:

LHC Tier2 Center (2002-2003)

[Diagram: "hierarchical" switching topology. A router connects the WAN to a GEth switch, which feeds GEth/FEth switches serving a data server with >1 RAID array and the worker nodes.]

40-100 nodes, dual 2.5 GHz P4, 2-4 TBytes RAID

Page 8:

LHC Hardware Cost Estimates

Buy late, but not too late: phased implementation
R&D Phase 2001-2004; Implementation Phase 2004-2007
R&D to develop capabilities and the computing model itself
Prototyping at increasing scales of capability & complexity

[Cost-estimate chart; annotations of 1.1, 1.2, 1.4, and 2.1 years]

Page 9:

Particle Physics Data Grid

“In coordination with complementary projects in the US and Europe, PPDG aims to meet the urgent needs for advanced Grid-enabled technology and to strengthen the collaborative foundations of experimental particle and nuclear physics.”

Page 10:

PPDG Goals

Serve high energy & nuclear physics (HENP) experiments
Funded 2001-2004 at US$9.5M (DOE)

Develop advanced Grid technologies
Use Globus to develop higher-level tools
Focus on end-to-end integration

Maintain practical orientation
Networks, instrumentation, monitoring
DB file/object replication, caching, catalogs, end-to-end movement

Serve urgent needs of experiments
Unique challenges, diverse test environments
But make tools general enough for a wide community!
Collaboration with GriPhyN, iVDGL, EDG, LCG
Recent work on the ESnet Certificate Authority

Page 11:

PPDG Participants and Work Program

Physicist + CS involvement
D0, BaBar, STAR, CMS, ATLAS
SLAC, LBNL, JLab, FNAL, BNL, Caltech, Wisconsin, Chicago, USC

Computer Science Program of Work
CS1: Job description language
CS2: Scheduling and management of data processing and data placement activities
CS3: Monitoring and status reporting (with GriPhyN)
CS4: Storage resource management
CS5: Reliable replication services
CS6: File transfer services
CS7: Collect/document experiment practices and generalize
...
CS11: Grid-enabled data analysis

Page 12:

GriPhyN = App. Science + CS + Grids

Participants
US-CMS (high energy physics)
US-ATLAS (high energy physics)
LIGO/LSC (gravity wave research)
SDSS (Sloan Digital Sky Survey)
Strong partnership with computer scientists

Design and implement production-scale grids
Develop common infrastructure, tools and services (Globus based)
Integration into the 4 experiments
Broad application to other sciences via the "Virtual Data Toolkit"
Strong outreach program

Funded by NSF for 2000-2005
R&D for grid architecture (funded at $11.9M + $1.6M)
Integrate Grid infrastructure into experiments through VDT

Page 13:

GriPhyN: PetaScale Virtual-Data Grids

[Architecture diagram: production teams, individual investigators, and workgroups use Interactive User Tools, Virtual Data Tools, Request Planning & Scheduling Tools, and Request Execution & Management Tools; these sit on Resource Management Services, Security and Policy Services, and other Grid Services, over distributed resources (code, storage, CPUs, networks) and raw data sources. Target scale: ~1 Petaflop, ~100 Petabytes.]

Page 14:

GriPhyN Research Agenda

Based on Virtual Data technologies (see figure)
Derived data, calculable via algorithm
Instantiated 0, 1, or many times (e.g., caches)
"Fetch value" vs. "execute algorithm"
Very complex (versions, consistency, cost calculation, etc.)

LIGO example
"Get gravitational strain for 2 minutes around each of 200 gamma-ray bursts over the last year"

For each requested data value, need to (see the sketch after this list):
Locate the item and its algorithm
Determine the cost of fetching vs. calculating
Plan the data movements & computations required to obtain results
Schedule the plan
Execute the plan
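As a concrete illustration of the fetch-vs-recompute decision above, here is a minimal Python sketch of a planner comparing the estimated cost of fetching a cached instance against re-executing the producing algorithm. All names, cost numbers, and the cost model itself are hypothetical, not GriPhyN code.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class VirtualDataItem:
    """A derived data product that may or may not be materialized somewhere."""
    name: str
    size_gb: float
    cached_at: Optional[str]      # site holding a replica, or None
    cpu_hours_to_derive: float    # cost of re-running the producing transformation

def plan_request(item: VirtualDataItem,
                 link_gb_per_hour: float = 100.0,
                 cpu_hour_cost: float = 1.0,
                 transfer_gb_cost: float = 0.02) -> str:
    """Return 'fetch' or 'recompute' under a simple, illustrative cost model."""
    recompute_cost = item.cpu_hours_to_derive * cpu_hour_cost
    if item.cached_at is None:
        return "recompute"                      # no replica exists: must execute the algorithm
    fetch_cost = item.size_gb * transfer_gb_cost + item.size_gb / link_gb_per_hour
    return "fetch" if fetch_cost <= recompute_cost else "recompute"

# Example: strain data around one gamma-ray burst (numbers are made up)
strain = VirtualDataItem("strain_grb_017", size_gb=4.0, cached_at="Tier2-Caltech",
                         cpu_hours_to_derive=6.0)
print(plan_request(strain))   # -> 'fetch'
```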

Page 15:

Virtual Data Concept

A data request may: compute locally, compute remotely, access local data, or access remote data
Scheduling is based on: local policies, global policies, and cost

[Diagram: major facilities and archives, regional facilities and caches, and local facilities and caches, with "fetch item" requests flowing between them.]

Page 16:

iVDGL: A Global Grid Laboratory

International Virtual-Data Grid Laboratory
A global Grid laboratory (US, EU, Asia, South America, ...)
A place to conduct Data Grid tests "at scale"
A mechanism to create common Grid infrastructure
A laboratory for other disciplines to perform Data Grid tests
A focus of outreach efforts to small institutions

U.S. part funded by NSF (2001-2006): $14.1M (NSF) + $2M (matching)
International partners bring their own funds

"We propose to create, operate and evaluate, over a sustained period of time, an international research laboratory for data-intensive science."

From NSF proposal, 2001

Page 17:

iVDGL Participants

Initial experiments (funded by NSF proposal)
CMS, ATLAS, LIGO, SDSS, NVO

Possible other experiments and disciplines
HENP: BTeV, D0, CMS HI, ALICE, ...
Non-HEP: biology, ...

Complementary EU project: DataTAG
DataTAG and the US pay for a 2.5 Gb/s transatlantic network

Additional support from the UK e-Science programme
Up to 6 Fellows per year; none hired yet

Page 18:

iVDGL Components

Computing resources
Tier1 laboratory sites (funded elsewhere)
Tier2 university sites: software integration
Tier3 university sites: outreach effort

Networks
USA (Internet2, ESnet), Europe (Géant, ...)
Transatlantic (DataTAG), transpacific, AMPATH, ...

Grid Operations Center (GOC)
Indiana (2 people); joint work with TeraGrid on GOC development

Computer science support teams
Support, test, upgrade the GriPhyN Virtual Data Toolkit

Coordination, management

Page 19:

iVDGL Management and Coordination

[Organization chart: on the U.S. side, US Project Directors, a US External Advisory Committee, a US Project Steering Group, and a Project Coordination Group oversee the Facilities, Core Software, Operations, Applications, and Outreach teams; a GLUE Interoperability Team connects the U.S. piece to the international piece and to collaborating Grid projects (TeraGrid, EDG, DataTAG, LCG?, Asia, BTeV, D0, ALICE, CMS HI, PDC, Bio, Geo, ...).]

Page 20:

iVDGL Work Teams

Facilities Team
Hardware (Tier1, Tier2, Tier3)

Core Software Team
Grid middleware, toolkits

Laboratory Operations Team
Coordination, software support, performance monitoring

Applications Team
High energy physics, gravity waves, virtual astronomy
Nuclear physics, bioinformatics, ...

Education and Outreach Team
Web tools, curriculum development, involvement of students
Integrated with GriPhyN, connections to other projects
Want to develop further international connections

Page 21:

US-iVDGL Data Grid (Sep. 2001)

[Map of Tier1, Tier2, and Tier3 sites: UF, Wisconsin, Fermilab, BNL, Indiana, Boston U, SKC, Brownsville, Hampton, PSU, Johns Hopkins, Caltech, Argonne, UCSD/SDSC.]

Page 22:

US-iVDGL Data Grid (Dec. 2002)

[Map of Tier1, Tier2, and Tier3 sites: UF, Wisconsin, Fermilab, BNL, Indiana, Boston U, SKC, Brownsville, Hampton, PSU, Johns Hopkins, Caltech, Argonne, UCSD/SDSC, plus new sites FIU, FSU, Arlington, Michigan, LBL, Oklahoma, Vanderbilt, NCSA.]

Page 23:

Possible iVDGL Participant: TeraGrid

[Diagram: four TeraGrid sites, NCSA/PACI (8 TF, 240 TB), SDSC (4.1 TF, 225 TB), Caltech, and Argonne, each with site resources, external networks, and HPSS/UniTree archival storage, linked by a 40 Gb/s backbone; 13 TeraFlops in total.]

Page 24:

International Participation

Existing partners
European Data Grid (EDG)
DataTAG

Potential partners
Korea (T1), China (T1?), Japan (T1?), Brazil (T1), Russia (T1), Chile (T2), Pakistan (T2), Romania (?)

Page 25:

Current Trillium Work

Packaging technologies: PACMAN
Used for VDT releases; very successful & powerful
Being evaluated for Globus, EDG

GriPhyN Virtual Data Toolkit 1.1.3 released
Vastly simplifies installation of grid tools
New changes will further reduce configuration complexity

Monitoring (joint efforts)
Globus MDS 2.2 (GLUE schema)
Caltech MonALISA
Condor HawkEye
Florida Gossip (low-level component)

Chimera Virtual Data System (more later)
Testbeds, demo projects (more later)

Page 26:

Virtual Data: Derivation and Provenance

Most scientific data are not simple "measurements"
They are computationally corrected/reconstructed
They can be produced by numerical simulation

Science & engineering projects are increasingly CPU and data intensive

Programs are significant community resources (transformations)
So are the executions of those programs (derivations)

Management of dataset transformations is important!
Derivation: instantiation of a potential data product
Provenance: exact history of any existing data product

Programs are valuable, like data
They should be community resources, too
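The derivation/provenance vocabulary above maps naturally onto a small data model. The following is a minimal Python sketch, illustrative only and not the Chimera object model, of transformations, derivations, and the provenance you can read back from them (all names are made up):

```python
from dataclasses import dataclass, field

@dataclass
class Transformation:
    """A registered program: a community resource that can be executed."""
    name: str
    version: str

@dataclass
class Derivation:
    """One execution (actual or potential) of a transformation."""
    transformation: Transformation
    inputs: list            # logical names of consumed data products
    outputs: list           # logical names of generated data products
    parameters: dict = field(default_factory=dict)

def provenance(product: str, derivations: list) -> list:
    """Trace the exact history of a data product back through its producing derivations."""
    history = []
    for d in derivations:
        if product in d.outputs:
            history.append(d)
            for upstream in d.inputs:
                history.extend(provenance(upstream, derivations))
    return history

calib = Transformation("calibrate", "v2")
recon = Transformation("reconstruct", "v7")
d1 = Derivation(calib, inputs=["raw_run_42"], outputs=["calib_run_42"])
d2 = Derivation(recon, inputs=["calib_run_42"], outputs=["reco_run_42"])
for step in provenance("reco_run_42", [d1, d2]):
    print(step.transformation.name, step.inputs, "->", step.outputs)
```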

Page 27:

[Diagram: transformations, derivations, and data, linked by "execution-of", "product-of", and "consumed-by/generated-by" relationships.]

“I’ve detected a mirror calibration error and want to know which derived data products need to be recomputed.”

“I’ve found some interesting data, but I need to know exactly what corrections were applied before I can trust it.”

“I want to search a database for dwarf galaxies. If a program that performs this analysis exists, I won’t have to write one from scratch.”

“I want to apply a shape analysis to 10M galaxies. If the results already exist, I’ll save weeks of computation.”

Motivations (1)

Page 28:

Motivations (2)

Data track-ability and result audit-ability
Universally sought by GriPhyN applications

Facilitates tool and data sharing and collaboration
Data can be sent along with its recipe

Repair and correction of data
Rebuild data products, cf. "make" (see the sketch below)

Workflow management
A new, structured paradigm for organizing, locating, specifying, and requesting data products

Performance optimizations
Ability to re-create data rather than move it
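The "rebuild data products, cf. make" idea can be illustrated with a short sketch. This is a hypothetical Python example, not GriPhyN/Chimera code: given recorded derivations, it finds every product downstream of a changed input (for example a recalibrated mirror, as in the earlier quote) so those products can be recomputed in dependency order.

```python
def downstream_products(changed: str, derivations: list) -> list:
    """Return data products that must be recomputed because 'changed' was modified."""
    stale, frontier = [], {changed}
    while frontier:
        nxt = set()
        for inputs, outputs, _ in derivations:
            if frontier & set(inputs):
                for out in outputs:
                    if out not in stale:
                        stale.append(out)     # appended in dependency order
                        nxt.add(out)
        frontier = nxt
    return stale

# Each derivation: (inputs, outputs, command). All names are illustrative.
derivations = [
    (["mirror_calibration"], ["calibrated_images"], "apply_calibration"),
    (["calibrated_images"],  ["galaxy_catalog"],    "detect_objects"),
    (["galaxy_catalog"],     ["cluster_size_dist"], "find_clusters"),
]
print(downstream_products("mirror_calibration", derivations))
# -> ['calibrated_images', 'galaxy_catalog', 'cluster_size_dist']
```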

Page 29:

"Chimera" Virtual Data System

Virtual Data API
A Java class hierarchy to represent transformations & derivations

Virtual Data Language (VDL)
Textual for people & illustrative examples
XML for machine-to-machine interfaces

Virtual Data Database
Makes the objects of a virtual data definition persistent

Virtual Data Service (future)
Provides a service interface (e.g., OGSA) to persistent objects
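As a rough illustration of the "textual for people, XML for machines" split, here is a small Python sketch that renders one derivation both ways. This is not real VDL: the textual form is schematic and the XML element and attribute names are invented for the example.

```python
import xml.etree.ElementTree as ET

# A single derivation, described as plain data (names are illustrative).
derivation = {
    "transformation": "reconstruct",
    "version": "v7",
    "inputs": ["calib_run_42"],
    "outputs": ["reco_run_42"],
}

# Human-readable, VDL-like textual form (schematic only, not actual VDL syntax).
text_form = "{t}@{v}({i}) -> {o}".format(
    t=derivation["transformation"], v=derivation["version"],
    i=", ".join(derivation["inputs"]), o=", ".join(derivation["outputs"]))
print(text_form)   # reconstruct@v7(calib_run_42) -> reco_run_42

# Machine-to-machine XML form.
root = ET.Element("derivation", transformation=derivation["transformation"],
                  version=derivation["version"])
for lfn in derivation["inputs"]:
    ET.SubElement(root, "input", lfn=lfn)
for lfn in derivation["outputs"]:
    ET.SubElement(root, "output", lfn=lfn)
print(ET.tostring(root, encoding="unicode"))
```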

Page 30:

Virtual Data Catalog Object Model

Page 31:

Chimera as a Virtual Data System

Virtual Data Language (VDL)
Describes virtual data products

Virtual Data Catalog (VDC)
Used to store VDL

Abstract Job Flow Planner
Creates a logical DAG (dependency graph)

Concrete Job Flow Planner
Interfaces with a Replica Catalog
Provides a physical DAG submission file to Condor-G

Generic and flexible
As a toolkit and/or a framework
In a Grid environment or locally

Currently in beta version

[Diagram: VDL (textual or XML) is stored in the VDC; the Abstract Planner produces a logical DAG (DAX, in XML); the Concrete Planner consults the Replica Catalog to produce a physical DAG, which is executed by DAGMan.]
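To illustrate the abstract-to-concrete planning step described above, here is a minimal Python sketch, with hypothetical names and file paths rather than the actual Chimera planners, that binds logical file names in an abstract DAG to physical replicas from a replica catalog and emits site-bound concrete jobs.

```python
# Abstract DAG: job name -> (transformation, logical inputs, logical outputs).
abstract_dag = {
    "calib": ("apply_calibration", ["lfn:raw_run_42"],   ["lfn:calib_run_42"]),
    "reco":  ("reconstruct",       ["lfn:calib_run_42"], ["lfn:reco_run_42"]),
}

# Replica catalog: logical file name -> physical locations (illustrative).
replica_catalog = {
    "lfn:raw_run_42": ["gsiftp://tier1.example.org/data/raw_run_42"],
}

def concretize(dag: dict, catalog: dict, site: str) -> list:
    """Resolve existing replicas, register planned outputs, and return concrete jobs."""
    concrete_jobs = []
    for job, (transform, inputs, outputs) in dag.items():   # jobs listed in dependency order
        physical_inputs = []
        for lfn in inputs:
            replicas = catalog.get(lfn)
            if not replicas:
                raise RuntimeError(f"{lfn} has no replica and no producing job has run")
            physical_inputs.append(replicas[0])              # naive choice: first replica
        physical_outputs = [f"gsiftp://{site}/store/{lfn.split(':')[1]}" for lfn in outputs]
        for lfn, pfn in zip(outputs, physical_outputs):
            catalog.setdefault(lfn, []).append(pfn)          # later jobs can now resolve it
        concrete_jobs.append({"job": job, "transform": transform, "site": site,
                              "inputs": physical_inputs, "outputs": physical_outputs})
    return concrete_jobs

for j in concretize(abstract_dag, replica_catalog, "tier2.example.edu"):
    print(j["job"], "->", j["site"])
```

A real concrete planner would topologically sort the DAG and write the result out as a DAGMan submission file for Condor-G rather than returning a Python list.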

Page 32:

Chimera Application: SDSS Analysis

Size distribution of galaxy clusters?

[Plot: galaxy cluster size distribution; number of clusters (1 to ~100,000, log scale) versus number of galaxies per cluster (1 to ~100, log scale).]

Chimera Virtual Data System
+ GriPhyN Virtual Data Toolkit
+ iVDGL Data Grid (many CPUs)

Page 33:

US-CMS Testbed

[Map: five sites: Caltech, UCSD, Florida, Wisconsin, Fermilab.]

Page 34:

Other CMS Institutes Encouraged to Join

Expressions of interest: Princeton, Brazil, South Korea, Minnesota, Iowa, possibly others

[Map: existing testbed sites Caltech, UCSD, Florida, Wisconsin, Fermilab.]

Page 35:

Grid Middleware Used in Testbed

Virtual Data Toolkit 1.1.3
VDT Client: Globus Toolkit 2.0, Condor-G 6.4.3
VDT Server: Globus Toolkit 2.0, mkgridmap, Condor 6.4.3, ftsh, GDMP 3.0.7

Virtual Organization (VO) Management
LDAP server deployed at Fermilab
GroupMAN (adapted from EDG) used to manage the VO
Use DOE Science Grid certificates
Accept EDG and Globus certificates

[Map: testbed sites Caltech, UCSD, Florida, Wisconsin, Fermilab.]

Page 36:

Commissioning the CMS Grid Testbed

A complete prototype (see figure): CMS production scripts, Globus, Condor-G, GridFTP

Commissioning: require production-quality results!
Run until the testbed "breaks"
Fix the testbed with middleware patches
Repeat the procedure until the entire production run finishes!

Discovered and fixed many Globus and Condor-G problems

A huge success from this point of view alone... but very painful

Page 37:

CMS Grid Testbed Production

[Diagram: a master site running IMPALA, mop_submitter, DAGMan, and Condor-G dispatches jobs to remote sites 1 through N; each remote site has a batch queue and GridFTP for moving data back to the master.]

Page 38:

Production Success on CMS Testbed

[Diagram: MCRunJob components (Linker, Script Generator, Configurator with requirements and self-description, Master Script, "DAGMaker" VDL) feeding MOP and Chimera.]

Results
150k events generated, ~200 GB produced
1.5 weeks of continuous running across all 5 testbed sites
A 1M-event run has just started on a larger testbed (~30% complete!)

Page 39:

Grid Coordination Efforts

Global Grid Forum (www.gridforum.org)
International forum for general Grid efforts
Many working groups, standards definitions
Next meeting in Japan, early 2003

HICB (high energy physics)
Joint development & deployment of Data Grid middleware
GriPhyN, PPDG, iVDGL, EU DataGrid, LCG, DataTAG, CrossGrid
GLUE effort (joint iVDGL-DataTAG working group)

LCG (LHC Computing Grid Project)
Strong "forcing function"

Large demo projects
IST2002, Copenhagen
Supercomputing 2002, Baltimore

New proposal (joint NSF + Framework 6)?

Page 40:

WorldGrid Demo

Joint Trillium-EDG-DataTAG demo
Resources from both sides in an intercontinental Grid testbed
Uses several visualization tools (Nagios, MapCenter, Ganglia)
Uses several monitoring tools (Ganglia, MDS, NetSaint, ...)

Applications
CMS: CMKIN, CMSIM
ATLAS: ATLSIM

Submit jobs from the US or EU; jobs can run on any cluster
Shown at IST2002 (Copenhagen)
To be shown at SC2002 (Baltimore)

Brochures now available describing Trillium and demos

I have 10 with me now (2000 just printed)

Page 41:

WorldGrid

Page 42:

Summary

Very good progress on many fronts
Packaging
Testbeds
Major demonstration projects

Current Data Grid projects are providing good experience
Looking to collaborate with more international partners
Testbeds
Monitoring
Deploying VDT more widely

Working towards a new proposal
Emphasis on Grid-enabled analysis
Extending the Chimera virtual data system to analysis

Page 43:

Grid References

Grid Book: www.mkp.com/grids
Globus: www.globus.org
Global Grid Forum: www.gridforum.org
TeraGrid: www.teragrid.org
EU DataGrid: www.eu-datagrid.org
PPDG: www.ppdg.net
GriPhyN: www.griphyn.org
iVDGL: www.ivdgl.org