
Page 1: An Ultrascale Information Facility for Data Intensive Research

Rick Cavanaugh, University of Florida
CHEP06 Mumbai, 13 February 2006

Page 2: The UltraLight Collaboration

• California Institute of Technology
• University of Michigan
• University of Florida
• Florida International University
• Internet2
• Fermilab
• Brookhaven
• SLAC
• University of California, San Diego
• Massachusetts Institute of Technology
• Boston University
• University of California, Riverside
• UCAID

Page 3: The Project

• UltraLight is
  – a four-year, $2M NSF ITR funded by MPS
  – application-driven network R&D

• Two primary, synergistic activities
  – Network "Backbone": perform network R&D and engineering
  – Applications "Driver": system services R&D and engineering

• Ultimate goal: enable physics analyses and discoveries which could not otherwise be achieved

Page 4: The Motivation

The ability to rapidly transport large datasets will strongly impact computing models:

– Datasets (used for analysis) no longer need to be pinned for long periods
– Storage Elements become more willing to grant greater temporary storage
– Opportunistic use of volatile (non-VO-controlled) resources is enhanced
– Particularly important in resource-oversubscribed environments

Page 5: A New Class of Integrated Information Systems

• Expose the network as an actively managed resource

• Based on a "hybrid" packet- and circuit-switched optical network infrastructure
  – Ultrascale protocols (e.g. FAST) and dynamic optical paths

• Monitor, manage, and optimize resources in real time
  – using a set of agent-based intelligent global services (sketched below)

• Leverages already-existing and developing software infrastructure in round-the-clock operation:
  – MonALISA, GAE/Clarens, OSG

• Exceptional support from
  – Industry: Cisco & Calient
  – Research community: NLR, CENIC, Internet2/Abilene, ESnet
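To make the managed-network idea concrete, here is a minimal sketch of the decision such an agent-based service faces: given real-time monitoring of a shared packet path, does a bulk transfer fit, or should a dynamic optical circuit be provisioned? All names, numbers, and the decision rule are hypothetical illustrations; this is not the MonALISA or VINCI API.

```python
# Hypothetical sketch only; not the MonALISA/VINCI API.
from dataclasses import dataclass

@dataclass
class LinkState:
    capacity_gbps: float   # total capacity of the shared packet path
    utilization: float     # fraction currently in use, 0.0-1.0

def plan_transfer(size_tb: float, deadline_s: float, link: LinkState) -> str:
    """Decide between the shared packet path and a dedicated circuit."""
    required_gbps = size_tb * 8e3 / deadline_s               # 1 TB = 8000 Gb
    headroom_gbps = link.capacity_gbps * (1.0 - link.utilization)
    if required_gbps <= headroom_gbps:
        return f"packet path suffices ({required_gbps:.1f} Gb/s needed)"
    return f"provision an optical circuit of {required_gbps:.1f} Gb/s"

shared = LinkState(capacity_gbps=10.0, utilization=0.6)
print(plan_transfer(2.0, 3600.0, shared))
# -> provision an optical circuit of 4.4 Gb/s
```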

See talk from S. McKee

Page 6: UltraLight Activities

Teams of physicists, computer scientists, and network engineers:

• High Energy Physics Application Services
  – Integrate and develop physics applications into the UltraLight fabric: production codes, grid-enabled analysis, user interfaces to the fabric

• Global System Services
  – Critical "upperware" software components in the UltraLight fabric: monitoring, scheduling, agent-based services, etc.

• Network Engineering
  – Routing, switching, dynamic path construction, operations, management

• Testbed Deployment and Operations
  – Including optical network, compute cluster, storage, kernel, and UltraLight system software configurations

• Education and Outreach

Page 7: Project Structure

Steering Committee: overall project guidance

Management Team: day-to-day internal and external coordination

Technical Groups (day-to-day activities and operations): Network, Applications, Education & Outreach

User Community

External Projects: ATLAS, CMS, DISUN, LCG, OSG, TeraPaths, CHEPREO, AMPATH, KyaTera, …

External Peering Partners: NLR, ESnet, USNet, LHCNet, HOPI, TeraGrid, Pacific Wave, WIDE, AARNet, Brazil/HEPGrid, CA*net4, GLORIAD, IEEAF, JGN2, KEK, Korea, NetherLight, TIFR, UKLight/ESLEA, …

Page 8: Main Science Driver: The LHC

[Figure: the LHC online/offline data-flow timeline, from 10⁻⁹ to 10⁶ seconds. Online: the Level-1 trigger (hardwired ASIC/FPGA processors, massively parallel pipelines, 25 ns) feeds the High-Level Trigger processor farms (~3 µs and beyond). Offline: reconstruction and analysis at Tier-0/1/2 centers, on timescales of hours to a year. Data rates span the Giga-, Tera-, and Petabit scales. Example channel: H → ZZ → 4ℓ.]

Page 9: Main Science Driver: The LHC (continued)

[Same trigger and data-flow figure as the previous slide, annotated: new physics searches mean multi-terabyte scale datasets!]

Requests come from multiple users, for multiple types of data, multiple times!

Individual terabyte-scale transactions should finish in minutes to hours, rather than hours to days.
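As a back-of-the-envelope check on this requirement (my addition, not from the slide), the end-to-end rate needed to move 1 TB in a given time, taking 1 TB = 8×10¹² bits:

```python
def required_gbps(terabytes: float, minutes: float) -> float:
    """End-to-end rate needed to move `terabytes` in `minutes`."""
    return terabytes * 8e12 / (minutes * 60.0) / 1e9

for minutes in (10, 60, 1440):
    print(f"1 TB in {minutes:4d} min -> {required_gbps(1.0, minutes):6.2f} Gb/s")
# 1 TB in   10 min ->  13.33 Gb/s
# 1 TB in   60 min ->   2.22 Gb/s
# 1 TB in 1440 min ->   0.09 Gb/s
```

So "minutes rather than hours" for terabyte transactions implies sustained multi-gigabit end-to-end flows, which is exactly the regime UltraLight targets.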

Page 10: Evolving Science Requirements for Networks (DOE High Performance Network Workshop)

All throughput figures are end-to-end.

Science Area                  | 2005                         | In 5 years                       | In 5-10 years                       | Remarks
High Energy Physics           | 0.5 Gb/s                     | 100 Gb/s                         | 1000 Gb/s                           | High bulk throughput
Climate (data & computation)  | 0.5 Gb/s                     | 160-200 Gb/s                     | N x 1000 Gb/s                       | High bulk throughput
SNS NanoScience               | Not yet started              | 1 Gb/s                           | 1000 Gb/s + QoS for control channel | Remote control and time-critical throughput
Fusion Energy                 | 0.066 Gb/s (500 MB/s burst)  | 0.198 Gb/s (500 MB / 20 s burst) | N x 1000 Gb/s                       | Time-critical throughput
Astrophysics                  | 0.013 Gb/s (1 TByte/week)    | N*N multicast                    | 1000 Gb/s                           | Computational steering and collaborations
Genomics (data & computation) | 0.091 Gb/s (1 TByte/day)     | 100s of users                    | 1000 Gb/s + QoS for control channel | High throughput and steering

Slide taken from H. Newman

Page 11: Ever-increasing Network Flows

[Charts: traffic at the Amsterdam Internet Exchange Point, exceeding 120 Gbit/s in January 2006, and ESnet total traffic, now at roughly a petabyte per month.]

These two examples are representative of the trend in research and education networks worldwide; expected ATLAS/CMS data flows are in the same ballpark.

Page 12: Project Scope and Context

• Application frameworks are augmented to interact effectively with the Global Services (GS)
  – The GS interact in turn with the storage access and local execution service layers

• Applications provide hints to the high-level services about their requirements
  – Interfaced also to managed network and storage services
  – Allows effective caching and pre-fetching, and opportunities for global and local optimization of throughput (see the sketch below)
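The slides do not define the hint interface, so here is a minimal hypothetical sketch of the idea: the application declares which datasets it will need and by when, and a planner orders the staging work accordingly. All class and method names are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class AccessHint:
    dataset: str
    needed_by_s: float                    # deadline, seconds from now
    access_pattern: str = "sequential"    # or "random"

@dataclass
class GlobalServices:
    hints: list = field(default_factory=list)

    def advise(self, hint: AccessHint) -> None:
        """Called by applications ahead of time, enabling caching and pre-fetching."""
        self.hints.append(hint)

    def plan(self) -> list:
        # Earliest deadline first: a trivial stand-in for real global
        # and local throughput optimization.
        return sorted(self.hints, key=lambda h: h.needed_by_s)

gs = GlobalServices()
gs.advise(AccessHint("ttbar-sim-v3", needed_by_s=7200.0))
gs.advise(AccessHint("minbias-run12", needed_by_s=600.0, access_pattern="random"))
print([h.dataset for h in gs.plan()])     # ['minbias-run12', 'ttbar-sim-v3']
```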

[Architecture diagram, shown twice in the original: application-layer services (ROOT, IGUANA, COBRA, ATHENA, and other apps; application interface; workflow management; request planning services; PhEDEx) sit on top of the UltraLight global services (network management, network access, storage access, execution services), which in turn run over the UltraLight infrastructure (networking, storage, and computation resources); end-to-end monitoring and intelligent agents span all layers.]

Make the network an integrated managed resource, a la CPU and storage.

Page 13: GAE and UltraLight

Goal: make UltraLight available to physics applications and their environments.

• Unpredictable multi-user analysis
• Overall demand typically fills the capacity of the resources
• Real-time monitoring systems for networks, storage, computing resources, …: E2E monitoring

[Diagram: application interfaces feed request planning, which works with network planning and network resources; monitoring spans all components.]

Support data transfers ranging from predictable movement of large-scale (simulated and real) data to highly dynamic analysis tasks initiated by rapidly changing teams of scientists.

Page 14: Network Resource Testbed

See talk from S. McKee

Page 15: Global Network Planning Services

• VINCI: Virtual Intelligent Networks for Computing Infrastructures
  – Based on the existing MonALISA framework

• LISA: Localhost Information Service Agent
  – Monitors end-systems (user machines and servers); see the sketch below
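As a rough illustration of what monitoring an end-system involves, here is a stand-alone Python sketch of the kind of host metrics such an agent might sample and report. LISA itself is a Java component of the MonALISA framework; this sketch shares only the idea, not the implementation, and os.getloadavg is POSIX-only.

```python
import os
import socket
import time

def collect_end_system_metrics() -> dict:
    """One sample of end-host state, of the kind a LISA-like agent reports."""
    load1, load5, load15 = os.getloadavg()     # POSIX only
    return {
        "host": socket.gethostname(),
        "timestamp": time.time(),
        "cpu_count": os.cpu_count(),
        "load_1min": load1,
        "load_5min": load5,
        "load_15min": load15,
    }

print(collect_end_system_metrics())
```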

See talk from I. Legrand

Page 16: Prototype Application Layer: E2E

• ATLAS/CMS software stacks are complex and still developing
  – Integration work is challenging and constantly evolving

• A generic service-oriented architecture is crucial for integration:
  – catalogs to select datasets
  – resource and application discovery
  – schedulers that guide jobs to resources
  – policies that enable "fair" access to resources
  – robust transfer of large datasets

[Diagram, steps 1-9: the client application queries catalogs and the dataset service, performs resource and application discovery, and hands the request to a planner/scheduler, which consults monitor information, policy, and steering before driving job submission, storage management, execution, and data transfer.]

UltraLight focus: data transfer, planning and scheduling, (sophisticated) policy management at the VO level, and integration.
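A hypothetical walk-through of the workflow in the diagram, with every service stubbed out. The real services were exposed via Clarens web services; none of these function names come from the slides, and the mapping onto the numbered steps is approximate.

```python
def discover(service: str) -> str:
    """Resource and application discovery (stub endpoint)."""
    return f"https://example.invalid/{service}"

def select_dataset(catalog_url: str, query: str) -> str:
    """Query the catalogs / dataset service (stub result)."""
    return "dataset:/higgs/zz4l/v2"

def schedule(planner_url: str, dataset: str) -> str:
    """The planner/scheduler weighs monitor information, policy,
    and steering before picking a site (stub decision)."""
    return "site-A"

def submit_and_transfer(site: str, dataset: str) -> None:
    """Job submission, storage management, and data transfer."""
    print(f"submitting job at {site}; staging {dataset}")

catalog = discover("catalog")
dataset = select_dataset(catalog, "H -> ZZ -> 4l")
site = schedule(discover("planner"), dataset)
submit_and_transfer(site, dataset)
```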

Page 17: Supercomputing 2005

• Internet Land Speed Record
• 151 Gbps peak rate
• 100+ Gbps throughput sustained for hours
• 475 terabytes of physics data transported in less than 24 hours
• A sustained rate of 100+ Gbps translates to more than 1 petabyte per day (checked below)
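A quick check of the slide's arithmetic (my addition), using 1 PB = 10¹⁵ bytes:

```python
rate_bps = 100e9                      # 100 Gb/s sustained
seconds_per_day = 86_400
petabytes = rate_bps * seconds_per_day / 8 / 1e15
print(f"{petabytes:.2f} PB/day")      # 1.08 PB/day, i.e. > 1 PB per day
```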

[Plots: transfer rate in Gb/s versus time in minutes (0-60), and cumulative volume transferred in TB versus time in hours (0-24).]

Page 18: Project Milestones

• High-level milestones
  – Link critical services and applications together
  – Multiple services and multiple clients
  – Distributed system with some fault tolerance
  – Logical grid (physical details hidden)
  – Strategic steering of work- and dataflows
  – Self-organizing, robust, distributed E2E system

• User adoption
  – Identify a small community of users (some within UltraLight)
  – Integrate UltraLight services seamlessly with the LHC environment
  – Deliver LHC physics analyses within the LHC timeline

Page 19: UltraLight Plans

UltraLight is a four-year program delivering a new, high-performance, network-integrated infrastructure:

• Phase I (12 months, 2004-2005): focused on deploying the initial network infrastructure and bringing up the first services

• Phase II (18 months, 2005-2006): concentrates on implementing all the needed services and extending the infrastructure to additional sites

• Phase III (18 months, 2007-2008): will focus on the transition to production in support of LHC physics

Page 20: Beyond UltraLight: PLaNetS, the Physics Lambda Network System

Page 21: Summary

• For many years the WAN has been the bottleneck; this is no longer the case in many countries
  – Deployment of grid infrastructure is now a reality!
  – Recent land-speed records show that
    • the network can be truly transparent
    • throughputs are limited by the end-hosts
  – The challenge is shifting
    • from getting adequate bandwidth
    • to deploying adequate infrastructure to make effective use of it!

• UltraLight is delivering a critical missing component for future eScience: the integrated, managed network
  – Extend and augment existing grid infrastructures (currently focused on CPU and storage) to include the network as an integral component