TRANSCRIPT
Grid Computing for HEP
L. E. Price
Argonne National Laboratory
HEP-CCC Meeting
CERN, November 12, 1999
The Challenge
• Providing rapid access to event samples and subsets from massive datastores, from 100s of Terabytes in 2000 to 100 Petabytes by 2010.
• Transparent access to computing resources throughout the U.S. and throughout the world
• The extraction of small or subtle new physics signals from large and potentially overwhelming backgrounds
• Enabling access to the data, and to the rest of the physics community, across an ensemble of networks of varying capability and reliability, using heterogeneous computing resources
Achieving a Balance
• Proximity of the data to central computing and data handling resources
• Proximity of frequently accessed data to the users, to be processed in desktops, local facilities, or regional centers
• Making efficient use of limited network bandwidth, especially transoceanic links
• Making appropriate use of regional and local computing and data handling
• Involving scientists and students in each world region in the physics analysis
Need for Optimization
• Meeting the demands of hundreds of users who need transparent access to local and remote data in disk caches and tape stores
• Prioritizing hundreds to thousands of requests from the local and remote communities (see the sketch after this list)
• Structuring and organizing the data; providing the tools for locating, moving, and scheduling data transport between tape and disk and across networks
• Ensuring that the overall system is dimensioned correctly to meet the aggregate need
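None of this machinery existed in PPDG under these names; purely as an illustrative sketch of the prioritization problem, a scheduler that serves pending data-transfer requests in priority-then-arrival order might look like this in Python:

import heapq
import itertools

# Toy scheduler: serves data-transfer requests in (priority, arrival) order.
# The Request fields and priority scale are illustrative assumptions.
class RequestScheduler:
    def __init__(self):
        self._queue = []
        self._order = itertools.count()  # tie-breaker keeps FIFO within a priority

    def submit(self, priority, user, dataset, nbytes):
        # Lower number = higher priority (e.g., 0 = production, 9 = casual browsing)
        heapq.heappush(self._queue,
                       (priority, next(self._order),
                        {"user": user, "dataset": dataset, "nbytes": nbytes}))

    def next_request(self):
        # Highest-priority pending request, or None when the queue is idle
        return heapq.heappop(self._queue)[2] if self._queue else None

sched = RequestScheduler()
sched.submit(5, "student", "run99/aod", 2 * 10**9)  # individual analysis
sched.submit(0, "prod", "run99/raw", 10**12)        # production reprocessing
print(sched.next_request()["user"])                 # -> prod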
Science and Massive Datasets
• Massive dataset generation is the new norm in science:
– High Energy Physics
– Nuclear Physics
– LIGO
– Automated astronomical scans (e.g., Sloan Digital Sky Survey)
– The Earth Observing System (EOS)
– The Earth System Grid
– Geophysical data (e.g., seismic)
– Satellite weather image analysis
– The Human Brain Project (time series of 3-D images)
– Protein Data Bank
– The Human Genome Project
– Molecular structure crystallography data
Proposed Solution
• A data analysis grid for High Energy Physics
[Diagram: the proposed data analysis grid as a hierarchy — CERN at the top, a Tier 1 national center, multiple Tier 2 (T2) regional centers (including a CERN T2), and numerous Tier 3 and Tier 4 sites below them]
Analogy to Computing Grid
• Because the resources needed to solve complex problems are rarely collocated
• Topic of intensive CS research for a number of years already
• Computing (or data) resources from a “plug on the wall”
Why a Hierarchical Data Grid?
• Physical
– Appropriate resource use: data proximity to users & labs
– Efficient network use: local > regional > national > oceanic (see the sketch after this list)
– Scalable growth: avoid bottlenecks
• Human
– Central lab cannot manage / help / care about 1000s of users
– Cleanly separates functionality of different resource types
– University/regional computing complements national labs and funding agencies
– Easier to leverage resources, maintain control, assert priorities at regional/local level
– Effective involvement of scientists and students independently of location
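As the sketch below illustrates with assumed (not quoted) numbers, a regional cache changes transoceanic traffic from per-user to per-dataset:

# Toy model of tiered network use: N users each read the same dataset daily.
# Without a regional (Tier 2) cache every read crosses the ocean; with one,
# the transoceanic link carries a single copy and local links absorb the rest.
n_users = 200        # assumed number of physicists in one region
dataset_gb = 2.0     # assumed size of a popular analysis sample

no_cache = n_users * dataset_gb   # GB/day over the transoceanic link
with_cache = 1 * dataset_gb       # one staging transfer to the Tier 2
print(f"transoceanic load: {no_cache:.0f} GB/day -> {with_cache:.0f} GB/day")
# -> 400 GB/day without the cache, 2 GB/day with it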
Logical Steps toward Data Grid
[Timeline, 1995-2010: Basic Research → Testbeds → Design/Optimization → (Pre)Production → Production]
U.S. Grid Technology Projects
[Timeline, 1995-2010: PASS/Globus/HENP-GC/MONARC/GIOD/Nile → Clipper/NGI-PPDG → APOGEE → LHC, GriPhyN]
In Progress
• Laboratory and experiment-specific development, deployment and operation (hardware and software);
• Tool development in HENP, Computer Science, Industry;
(the above: business as usual)
• The Particle Physics Data Grid:
– NGI-funded project aiming (initially) at jump-starting the exploitation of CS and HENP software components to make major improvements in data access.
Proposals being Developed
• GriPhyN: Grid Physics Network
– Targeted at NSF;
– Focus on the long-term university-based grid infrastructure for major physics and astronomy experiments.
• APOGEE: A Physics-Optimized Grid Environment for Experiments
– Targeted at DoE HENP (and/or DoE SSI);
– Focus on medium to long-term software needs for HENP distributed data management;
– Initial focus on instrumentation, modeling and optimization.
PPDG, APOGEE and GriPhyN
• A coherent program of work;
• Substantial common management proposed;
• A focus for HENP collaboration with Computer Science and Industry;
• PPDG/APOGEE will create "middleware" needed by data-intensive science including the LHC. (Synergy but no overlap with CMS/Atlas planning.)
Data Grid Projects in Context
[Diagram: Construction and Operation of HENP Data Management and Analysis Systems at DoE Laboratories (Tiers 0/1)]
>> $20M/yr of existing funding at HENP labs. E.g., SLAC FY1999: ~$7M equipment for BaBar (of which < $2M physics CPU); ~$3M labor, M&S.
Data Grid Projects in Context
[Diagram adds: GriPhyN — HENP Data Management at Major University Centers (Tier 2)]
Draft proposal for NSF funding: $5M-$16M/year ($16M = $8M hardware + $5M labor/R&D + $3M network).
Data Grid Projects in Context
[Diagram adds a middleware layer shared by the lab and university systems: OO Databases and Analysis Tools; Resource Management Tools; Metadata Catalogs; WAN Data Movers; Mass Storage Management Systems; Matchmaking]
Widely applicable technology and computer science (not only from HENP; 100s of non-HEP FTEs).
Data Grid Projects in Context
[Diagram adds: PPDG — the Particle Physics Data Grid, an NGI project spanning the lab systems, university centers, and middleware components]
Large-scale tests/service focused on use of existing components.
Data Grid Projects in Context
[Diagram adds: APOGEE — Unified Project Management; Optimization and Evaluation; Instrumentation; Modeling and Simulation]
A new level of rigor as the foundation for future progress.
Data Grid Projects in Context
[Diagram adds: R&D + contacts with CS/Industry; long-term goals; testbeds]
Overall Program Goal
• A Coordinated Approach to the Design and Optimization of a Data Analysis Grid for HENP Experiments
Particle Physics Data Grid: Universities, DoE Accelerator Labs, DoE Computer Science
Funded by DoE-NGI at $1.2M for the first year

Collaborators:
California Institute of Technology: Harvey B. Newman, Julian J. Bunn, James C.T. Pool, Roy Williams
Argonne National Laboratory: Ian Foster, Steven Tuecke, Lawrence Price, David Malon, Ed May
Berkeley Laboratory: Stewart C. Loken, Ian Hinchliffe, Arie Shoshani, Luis Bernardo, Henrik Nordberg
Brookhaven National Laboratory: Bruce Gibbard, Michael Bardash, Torre Wenaus
Fermi National Laboratory: Victoria White, Philip Demar, Donald Petravick, Matthias Kasemann, Ruth Pordes
San Diego Supercomputer Center: Margaret Simmons, Reagan Moore
Stanford Linear Accelerator Center: Richard P. Mount, Les Cottrell, Andrew Hanushevsky, David Millsom
Thomas Jefferson National Accelerator Facility: Chip Watson, Ian Bird
University of Wisconsin: Miron Livny
PPDG Collaborators
                   Particle    Accelerator    Computer
                   Physics     Laboratory     Science
ANL                   X                          X
LBNL                  X                          X
BNL                   X             X            x
Caltech               X                          X
Fermilab              X             X            x
Jefferson Lab         X             X            x
SLAC                  X             X            x
SDSC                                             X
Wisconsin                                        X
First Year PPDG Deliverables
Implement and run two services in support of the major physics experiments at BNL, FNAL, JLAB, SLAC:
– "High-Speed Site-to-Site File Replication Service": data replication at up to 100 MBytes/s;
– "Multi-Site Cached File Access Service": based on deployment of file-cataloging, transparent cache-management, and data-movement middleware.
• First year: optimized cached read access to files in the range of 1-10 GBytes, from a total data set of order one Petabyte, using middleware components already developed by the proponents (a back-of-the-envelope check follows below).
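A quick check on those figures (plain arithmetic on the numbers quoted in the deliverables, nothing more):

# At the 100 MByte/s replication target, single files move quickly,
# but the full Petabyte store clearly cannot be replicated wholesale.
rate_mb_s = 100
file_gb = 10                  # top of the quoted 1-10 GByte range
store_pb = 1

file_s = file_gb * 1000 / rate_mb_s
store_days = store_pb * 10**9 / rate_mb_s / 86400
print(f"one 10 GB file: {file_s:.0f} s; full 1 PB store: {store_days:.0f} days")
# -> ~100 s per file vs ~116 days for the whole store: hence cached access
#    to selected files rather than bulk replication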
PPDG Site-to-Site Replication Service
• Network protocols tuned for high throughput
• Use of DiffServ for:
(1) predictable, high-priority delivery of high-bandwidth data streams;
(2) reliable background transfers.
• Use of integrated instrumentation to detect/diagnose/correct problems in long-lived, high-speed transfers [NetLogger + DoE/NGI developments]
• Coordinated reservation/allocation techniques for storage-to-storage performance
[Diagram: PRIMARY SITE (Data Acquisition, CPU, Disk, Tape Robot) replicating to SECONDARY SITE (CPU, Disk, Tape Robot)]
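A minimal sketch of the "reliable background transfer with integrated instrumentation" idea. The copy_chunk() primitive and the log format are hypothetical stand-ins (NetLogger's actual API is not shown); the pattern — chunked, restartable copies emitting timestamped throughput events — is the point:

import time

def copy_chunk(src, dst, offset, size):
    # Hypothetical data-mover primitive; a real service would wrap a
    # mass-storage or network transfer here.
    pass

def replicate(src, dst, total_bytes, chunk=64 * 2**20, max_retries=3):
    """Chunked, restartable site-to-site copy that logs per-chunk throughput."""
    offset = 0
    while offset < total_bytes:
        size = min(chunk, total_bytes - offset)
        for attempt in range(1, max_retries + 1):
            t0 = time.time()
            try:
                copy_chunk(src, dst, offset, size)
            except IOError:
                # Timestamped event stream: raw material for diagnosing
                # problems in long-lived, high-speed transfers
                print(f"{time.time():.3f} RETRY offset={offset} attempt={attempt}")
                continue
            mb_s = size / max(time.time() - t0, 1e-6) / 2**20
            print(f"{time.time():.3f} CHUNK offset={offset} rate={mb_s:.1f} MB/s")
            break
        else:
            raise RuntimeError(f"chunk at offset {offset} failed {max_retries} times")
        offset += size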
PPDG Multi-site Cached File Access System
[Diagram: PRIMARY SITE (Data Acquisition; Tape, CPU, Disk, Robot) serving several Satellite Sites (Tape, CPU, Disk, Robot), which in turn serve University sites (CPU, Disk, Users)]
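A sketch of the cache-through read the diagram implies — try the local university disk, then the satellite site, and only then the primary site. The three store objects and their lookup/has/stage_in methods are invented for illustration, not PPDG interfaces:

def open_cached(logical_name, local, satellite, primary):
    """Transparent cached access: nearest copy wins, misses fill the caches."""
    path = local.lookup(logical_name)                  # 1. university CPU/disk
    if path is None and satellite.has(logical_name):
        path = local.stage_in(satellite, logical_name) # 2. satellite tape/disk
    if path is None:
        satellite.stage_in(primary, logical_name)      # 3. primary site (DAQ,
        path = local.stage_in(satellite, logical_name) #    tape robot) - costly
    return open(path, "rb")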
PPDG Middleware Components
APOGEE: Focus on Instrumentation and Modeling
Planned proposal to DOE
Originally targeted at SSI
Roughly the same collaborators as PPDG
Intended to be the next step after PPDG
Understanding Complex Systems (Writing into the BaBar Object Database at SLAC)
[Plot: write rate into the BaBar object database — Aug. 1: ~4.7 MBytes/s; Oct. 1: ~28 MBytes/s]
APOGEE Manpower Requirements (FTE)
                                                           FY00  FY01  FY02  FY03  FY04
Instrumentation
  Low-level data capture                                    0.5     1  0.75  0.75  0.75
  Filtering and collecting agents                           0.5     1     1     1     1
  Data analysis and presentation                            0.5     1     1  0.75  0.75
  HENP workload profiling                                   0.5     1   0.5   0.5   0.5
Simulation
  Framework design and development                            1     2   1.5     1   0.5
  User workload simulation                                  0.5     1  0.75  0.75   0.5
  Component simulations (network, mass-storage
    system, object DB etc.)                                1.25   2.5     2   1.5     1
  Site simulation packages                                    -     -     1     1     1
Instrumentation/Simulation Testbed
  Instrumentation of existing experiment(s) (e.g. PPDG)     0.5     1     1     1     1
  Acquire and simulate performance measurements            0.25   0.5   0.5  0.75     1
  Acquire user workload profile                            0.25   0.5   0.5  0.25  0.25
  Test prediction and optimization                            -     -   0.5  0.75  0.75
Evaluation and Optimization
  Quantify evolving needs of physics (incl. site policies) 0.25   0.5   0.5   0.5   0.5
  Develop metrics for usefulness of data mgmt facilities    0.5     1     1     1     1
  Optimize model systems                                      -     -   0.5     1   1.5
Long-Term Strategy (Towards "Virtual Data")
  Tracking and testing HENP/CS/Industry developments          1     2   1.5   1.5   1.5
  Development projects with HENP/CS/Industry                  -     -   0.5     1   1.5
Project Management (APOGEE and PPDG)
  Project leader (physicist)                                0.5     1     1     1     1
  Lead computer scientist                                   0.5     1     1     1     1
TOTALS                                                      8.5    17    17    17    17
APOGEE Funding Needs ($k)
                                                FY00  FY01  FY02  FY03  FY04
Manpower
  Instrumentation                                250   500   406   375   375
  Simulation                                     344   688   656   531   375
  Instrumentation/Simulation Testbed             125   250   313   344   375
  Evaluation and Optimization                     94   188   250   313   375
  Long-Term Strategy (Towards "Virtual Data")    125   250   250   313   375
  Project Management (APOGEE and PPDG)           225   450   450   450   450
Commercial Software                              100   250   375   500   500
Testbed hardware (in addition to parasitic
  use of production systems)                     150   400   400   400   400
Workstations, M&S, Travel                        128   255   255   255   255
TOTALS                                          1540  3230  3355  3480  3480
GriPhyN Proposal
• Addresses several massive dataset problems:
– ATLAS, CMS
– LIGO
– Sloan Digital Sky Survey (SDSS)
• Tier 2 computing centers (university-based):
– Hardware: commodity CPU / disk / tape
– System support
• Networking:
– Transatlantic link to CERN ("high-speed")
– Tier 2 backbone: multi-gigabit/sec
• R&D:
– Leverage Tier 2 + existing resources into Grid
– Computer Science partnership, software
GriPhyN Goals
• Build production grid
• Exploit all computing resources most effectively
• Enable US physicists to participate fully in LHC program (also LIGO, SDSS)
– Eliminate disadvantage of not being at CERN
– Early physics analysis at LHC startup
– Maintain and extend US leadership
• Build collaborative infrastructure for students & faculty
– Training ground for next generation of leaders
Tier 2 Regional Centers
• Total number ≈ 20
– ATLAS: 6; CMS: 6; LIGO: 5; SDSS: 3
• Flexible architecture and mission complements national labs
– Intermediate-level data handling
– Makes possible regional collaborations
– Well-suited to universities (training, mentoring and education)
• Scale: Tier 2 = (university × laboratory)^(1/2)
– One scenario: Σ(Tier 2) = Tier 1, with each Tier 2 ≈ 20% of Tier 1 (a worked example follows below)
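Reading the scale rule as a geometric mean, the quoted scenario is self-consistent; the capacity unit below is arbitrary and only the 20% figure comes from the slide:

tier1 = 100.0               # arbitrary units of Tier 1 capacity
tier2_each = 0.20 * tier1   # each Tier 2 at ~20% of Tier 1 (from the slide)
n_tier2 = 5                 # then five Tier 2 centers together match Tier 1
print(tier2_each * n_tier2) # -> 100.0, i.e. sum(Tier 2) == Tier 1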
GriPhyN Funding (Very Rough)
R&D proposal (3 year):
$ 3M   3 Tier 2 centers
$ 1M   System support
$ 2M   Networking
$ 0M   Link to CERN
$ 1M   Commercial software
$ 8M   R&D personnel
$15M   Total

Full proposal (5 year):
$ 8M   System support (linear startup)
$15M   R&D (included above)
$ 2M   Software
$10M   Tier 2 networking
$ 5M   Link to CERN
$40M   Tier 2 hardware
$80M   Total ($65M + 3-year R&D)
R&D Proposal: $15M (Jan. 1999)
• R&D goals (complementary to APOGEE / PPDG):
– Data, resource management over wide area
– Fault-tolerant distributed computing over LAN
– High-speed networks, as they relate to data management
– Grid testbeds (with end-users)
• Simulations crucial to success:
– MONARC group
– With APOGEE / PPDG
• Leverage resources available to us:
– Strong connections with Computer Science people
– Existing R&D projects
– Commercial connections
Grid Computing: Conclusions
• HENP at the frontier of Information Technology
– Collaboration with Computer Science;
– Collaboration with industry;
– Outreach to other sciences;
– Visibility (and scrutiny) of HENP computing.
• Enabling revolutionary advances in data analysis in the LHC era
– Increasing the value of the vital investment in experiment-specific data-analysis software