toward a national big data superhighway

“Toward A National Big Data Superhighway” Closing Keynote, Internet2 Global Summit, Washington, DC, April 26, 2017. Dr. Larry Smarr, Director, California Institute for Telecommunications and Information Technology; Harry E. Gruber Professor, Dept. of Computer Science and Engineering, Jacobs School of Engineering, UCSD. http://lsmarr.calit2.net

Upload: larry-smarr

Post on 22-Jan-2018


TRANSCRIPT

Page 1: Toward A National Big Data Superhighway

“Toward A National Big Data Superhighway”

Closing Keynote

Internet2 Global Summit

Washington, DC

April 26, 2017

Dr. Larry Smarr

Director, California Institute for Telecommunications and Information Technology

Harry E. Gruber Professor,

Dept. of Computer Science and Engineering

Jacobs School of Engineering, UCSD

http://lsmarr.calit2.net

Page 2: Toward A National Big Data Superhighway

Abstract

Research in data-intensive fields is increasingly multi-investigator and multi-institutional, depending on ever more rapid access to ultra-large heterogeneous and widely distributed datasets. The Pacific Research Platform (PRP) is an NSF-funded research project that extends NSF-funded campus Science DMZs to a regional model, built on the CENIC/Pacific Wave backbone, establishing a science-driven high-capacity data-centric "freeway system." The PRP spans all 10 campuses of the University of California, as well as the major California private research universities, four supercomputer centers, and several universities outside California. Fifteen multi-campus data-intensive application teams, including particle physics, astronomy/astrophysics, earth sciences, biomedicine, and scalable multimedia, act as drivers of the PRP, providing feedback to the technical design staff over the project's five years. Over the next three years, PRP will examine sustainable methods for expanding such regional networks to a national scale.

Page 3: Toward A National Big Data Superhighway

Vision: Creating a West Coast “Big Data Freeway”

Connected by CENIC/Pacific Wave to Internet2 & GLIF

Use Lightpaths to Connect

Big Data Generators and Consumers,

Creating a “Big Data” Freeway

Integrated With High Performance Global Networks

“The Bisection Bandwidth of a Cluster Interconnect,

but Deployed on a 20-Campus Scale.”

This Vision Has Been Building for Over a Decade

Page 4: Toward A National Big Data Superhighway

NSF’s OptIPuter Project: Using Supernetworks

to Meet the Needs of Data-Intensive Researchers

OptIPortal: Termination Device for the OptIPuter Global Backplane

Calit2 (UCSD, UCI), SDSC, and UIC Leads (Larry Smarr, PI)

Univ. Partners: NCSA, USC, SDSU, NW, TA&M, UvA, SARA, KISTI, AIST

Industry: IBM, Sun, Telcordia, Chiaro, Calient, Glimmerglass, Lucent

2003-2009

$13,500,000

In August 2003, Jason Leigh and his students used RBUDP to blast data from NCSA to SDSC over the TeraGrid DTFnet, achieving 18Gbps file transfer out of the available 20Gbps

LS Slide 2005

Page 5: Toward A National Big Data Superhighway

DOE ESnet’s Science DMZ: A Scalable Network

Design Model for Optimizing Science Data Transfers

• A Science DMZ integrates 4 key concepts into a unified whole:

– A network architecture designed for high-performance applications, with the science network distinct from the general-purpose network

– The use of dedicated systems as data transfer nodes (DTNs)

– Performance measurement and network testing systems that are regularly used to characterize and troubleshoot the network

– Security policies and enforcement mechanisms that are tailored for high performance science environments

http://fasterdata.es.net/science-dmz/

Science DMZ

Coined 2010

The DOE ESnet Science DMZ and the NSF “Campus Bridging” Taskforce Report Formed the Basis for the NSF Campus Cyberinfrastructure Network Infrastructure and Engineering (CC-NIE) Program

Page 6: Toward A National Big Data Superhighway

Based on Community Input and on ESnet’s Science DMZ Concept,

NSF Has Funded Over 100 Campuses to Build Local Big Data Freeways

Red 2012 CC-NIE Awardees

Yellow 2013 CC-NIE Awardees

Green 2014 CC*IIE Awardees

Blue 2015 CC*DNI Awardees

Purple Multiple Time Awardees

Source: NSF

Page 7: Toward A National Big Data Superhighway

I Believe as Greg Bell Has Said

We Should Engineer the Network as an Instrument of Discovery

It is all about the end users!

We Must Optimize The Instrument

For Multi-Campus Collaborating Application Teams

Page 8: Toward A National Big Data Superhighway

How CC-NIE Prism@UCSD Grant Transforms Big Data Microbiome Science:

Preparing for Knight/Smarr 1 Million Core-Hour Analysis

12 Cores/GPU

128 GB RAM

3.5 TB SSD

48TB Disk

10Gbps NIC

Knight Lab

FIONA

10Gbps

Gordon

Prism@UCSD

Data Oasis: 7.5PB, 200GB/s

Knight 1024 Cluster in SDSC Co-Lo

CHERuB

100Gbps

Emperor & Other Vis Tools

64Mpixel Data Analysis Wall

120Gbps

40Gbps

1.3Tbps

Page 9: Toward A National Big Data Superhighway

The Next Logical Step:

Build a Regional DMZ by Connecting West Coast Campus DMZs

• May 2014 LS Gives Invited Presentation to UC IT Leadership Council

– Strong Support from UC and UCOP CIOs

• July 2014 LS Gives Invited Talk to CENIC Annual Retreat

– CENIC/PW Agrees to Act as Backplane

– CIO Support Extends to CA Private Research Universities

• December 2014 UCOP CIO and VPRs Provide PRP “Momentum Money”

• January 2015 Kickoff of PRPv0 by Network Engineers

– Begins Biweekly Conference Calls, Now Weekly

• March 2015 LS Invited “Blue Sky” Presentation to UC VCR/CIO Summit

– NSF PRP Proposal Submitted With Letters of Commitment From:

– 50 Researchers from 15 Campuses

– 32 IT/Network Organization Leaders

Page 10: Toward A National Big Data Superhighway

The Pacific Research Platform:

a Working End-to-End Science-Driven Regional DMZ-Connector

NSF CC*DNI Grant

$5M 10/2015-10/2020

PI: Larry Smarr, UC San Diego Calit2

Co-PIs:

• Camille Crittenden, UC Berkeley CITRIS,

• Tom DeFanti, UC San Diego Calit2,

• Philip Papadopoulos, UCSD SDSC,

• Frank Wuerthwein, UCSD Physics and SDSC

(GDC)

PRP is Built on CENIC/Pacific Wave

Page 11: Toward A National Big Data Superhighway

Our Prototype System – Built for Scientists

Out of a Bunch of Independently Managed Networks

• Challenge:

– Campus DMZs, Regional (e.g., CENIC), National (Internet2), International Networks (e.g., GLIF) are Individually-Architected Systems

• How Do They Work Together with Predictable Performance?

• PRP is Focused on Disk-to-Disk Data Movement

– From the Eyes of Domain Scientists

– End-to-End for Their Data is Their Only Real Metric of Concern (As it Should Be)

Source: Phil Papadopoulos

Page 12: Toward A National Big Data Superhighway

PRP Science DMZ Data Transfer Nodes (DTNs) -

Flash I/O Network Appliances (FIONAs)

UCSD Designed FIONAs To Solve the Disk-to-Disk Data Transfer Problem at Full Speed on 10G, 40G and 100G Networks

FIONAs: 10/40G, $8,000

FIONette: 1G, $1,000

Phil Papadopoulos, SDSC &

Tom DeFanti, Joe Keefe & John Graham, Calit2

John Graham, Calit2

Page 13: Toward A National Big Data Superhighway

More Than 30 PRP Installed FIONAs:

Customized to the Needs of Application Teams

• Data Transfer Nodes

– 1, 10, 40, and 100Gb/s NICs

• Storage Transfer Nodes

– Up to 160TB of Rotating Disks

– Nonvolatile Memory Disks (NVMe - 10x Faster than Flash)

– ½ PB Flash Disk (at SC15, on Loan From Vendor)

• Compute Transfer Nodes

– 12-48 Intel CPU Cores

– 1-8 GPUs (Delivers Up to 500,000 GPU Core Hours/Day)

• Visualization Transfer Nodes

– 3-45 Tiled displays (up to 180 Megapixels, 2D & 3D)

– 360-Megapixel SunCAVE Coming Soon

Page 14: Toward A National Big Data Superhighway

PRP Continues to Expand Rapidly While Increasing Connectivity:

1 1/2 Years of Progress – 12 Sites to 24 Sites

January 29, 2016

Connected 24 DMZ FIONAs

at 10G and 40G

April 24, 2017

Source: John Graham, Calit2

Page 15: Toward A National Big Data Superhighway

We Measure FIONA Disk-to-Disk Throughput with 10GB File Transfer

4 Times Per Day in Both Directions for All PRP Sites

See Time Lapse Movie Jan 2016 to Today

http://prp-maddash.calit2.optiputer.net/optiputer/optiputer.mp4
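To give a feel for the scale of this measurement schedule, here is a minimal sketch that counts the daily test transfers. It assumes a full mesh in which every site tests against every other site in both directions; the slide only says "all PRP sites", so the full-mesh topology, the 24-site count taken from the previous slide, and the function name `mesh_transfers` are illustrative assumptions rather than a description of the actual PRP dashboard.

```python
def mesh_transfers(sites: int, runs_per_day: int = 4, file_gb: float = 10.0):
    """Return (transfers_per_day, gb_moved_per_day) for an assumed full mesh.

    Each ordered pair (source, destination) is tested runs_per_day times,
    which covers both directions between every two sites.
    """
    directed_pairs = sites * (sites - 1)      # ordered pairs, no self-tests
    transfers = directed_pairs * runs_per_day
    return transfers, transfers * file_gb


if __name__ == "__main__":
    transfers, volume_gb = mesh_transfers(24)  # 24 sites is an assumption
    print(f"{transfers} transfers/day, {volume_gb / 1000:.2f} TB/day of test data")
```

Under these assumptions the test traffic grows roughly with the square of the number of sites, which is worth keeping in mind when extrapolating to a larger platform.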

Page 16: Toward A National Big Data Superhighway

We Have Held a Number of

PRP Science Engagement Workshops

Source: Camille Crittenden, UC Berkeley

UC San Diego, UC Merced, UC Davis, UC Berkeley

Page 17: Toward A National Big Data Superhighway

PRP’s First 1.5 Years:

Connecting Campus Application Teams and Devices

Page 18: Toward A National Big Data Superhighway

We Scale the Working PRP by Providing Multi-Campus Application Teams

With Disk-to-Disk Measurements

UIC

UCSD

UCI

U Hawaii

USC

NCAR

SDSU

Page 19: Toward A National Big Data Superhighway

LHC Researchers Look to PRP to Fix the Last Mile Architecture in California:

Data and Compute Resources Can Both Be Shared

PRP Provides an Implementation of All This on a Single FIONA.

PRP Helps Integrate Local Resources into This FIONA.

login nodes

compute

scheduler

compute cluster

storage cluster

DTN

CTN

WAN

CTN = compute transfer node

DTN = data transfer node

Science DMZ

Source: Frank Wuerthwein, UCSD, SDSC

Page 20: Toward A National Big Data Superhighway

>360 California Scientists Are Researching

Particle Physics Big Data Analysis

• ATLAS

– UCB/LBNL (63)

– SLAC/Stanford (51)

– UCSC (30)

– UCI (32)

• Total of 176 members listed in ATLAS HR database at CERN

• CMS (Members)

– Caltech (29)

– LLNL (3)

– UCD (41)

– UCLA (17)

– UCR (25)

– UCSD (36)

– UCSB (35)

• Total of 186 members listed in CMS HR database at CERN

Source: Frank Wuerthwein, UCSD, SDSC
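As a quick consistency check on the headline figure, the per-institution counts on this slide can be tallied as below. This is only a sketch that re-adds the numbers shown; the dictionary names are illustrative.

```python
# Member counts as listed on the slide (from the CERN HR databases).
atlas = {"UCB/LBNL": 63, "SLAC/Stanford": 51, "UCSC": 30, "UCI": 32}
cms = {"Caltech": 29, "LLNL": 3, "UCD": 41, "UCLA": 17,
       "UCR": 25, "UCSD": 36, "UCSB": 35}

atlas_total = sum(atlas.values())
cms_total = sum(cms.values())
grand_total = atlas_total + cms_total

print(f"ATLAS: {atlas_total}, CMS: {cms_total}, combined: {grand_total}")
```

The two subtotals match the slide's stated totals, and their sum is consistent with the ">360" in the title.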

Page 21: Toward A National Big Data Superhighway

LHC Computing and Data Resources

10 Institutions

• ATLAS Institutions

– SLAC “T2”

– NERSC (used by both)

– UCSC T3

– UCI T3

• CMS Institutions

– Caltech T2

– UCSD T2

– SDSC (used by both)

– UCD T3

– UCR T3

– UCSB T3

Lots of Potential Network Traffic for LHC on PRP

Source: Frank Wuerthwein, UCSD, SDSC

Page 22: Toward A National Big Data Superhighway

100 Gbps FIONA at UCSC Connects the UCSC Hyades Cluster

to the NERSC Supercomputer at LBNL

Supporting UCSC Remote Access to Large Data Subsets of the Dark Energy Spectroscopic Instrument (DESI) and AGORA Galaxy Simulation Data Produced at NERSC.

250 images per night

800GB per night

Shawfeng Dong, UCSC Cyberengineer

UCSC Feb 7, 2017

Page 23: Toward A National Big Data Superhighway

40G FIONAs

20x40G PRP-connected

WAVE@UC San Diego

PRP Now Enables

Distributed Virtual Reality

PRP

WAVE @UC Merced

Transferring 5 CAVEcam Images from UCSD to UC Merced:

2 Gigabytes now takes 2 Seconds (8 Gb/sec)
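The quoted rate follows from a bytes-to-bits conversion. A minimal sketch, assuming decimal gigabytes and a helper name (`throughput_gbps`) chosen only for illustration:

```python
def throughput_gbps(gigabytes: float, seconds: float) -> float:
    """Average throughput in gigabits per second (1 byte = 8 bits)."""
    return gigabytes * 8 / seconds


if __name__ == "__main__":
    # 2 gigabytes moved in 2 seconds, as stated on the slide
    print(f"{throughput_gbps(2, 2):.1f} Gb/s")
```

The same helper can be reused for any of the disk-to-disk figures in this talk, as long as sizes are given in gigabytes and times in seconds.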

Page 24: Toward A National Big Data Superhighway

PRP Will Link the Laboratories of

the Pacific Earthquake Engineering Research Center

http://peer.berkeley.edu/

PEER Labs: UC Berkeley, Caltech, Stanford,

UC Davis, UC San Diego, and UC Los Angeles

John Graham Installing FIONette at PEER Feb 10, 2017

Page 25: Toward A National Big Data Superhighway

Cancer Genomics Hub (UCSC) is Housed in SDSC:

Large Data Flows to End Users at UCSC, UCB, UCSF, …

1G

8G

Data Source: David Haussler,

Brad Smith, UCSC

15G

Jan 2016

30,000 TB Per Year
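For scale, the annual volume can be converted into an average sustained rate. The sketch below assumes decimal terabytes, a 365-day year, and perfectly even traffic; real flows are bursty, so peaks would sit well above this average. The function name is illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumes a 365-day year


def average_gbps(terabytes_per_year: float) -> float:
    """Average rate in Gb/s if the yearly volume were spread evenly."""
    bits_per_year = terabytes_per_year * 1e12 * 8   # decimal TB -> bits
    return bits_per_year / SECONDS_PER_YEAR / 1e9


if __name__ == "__main__":
    print(f"{average_gbps(30_000):.2f} Gb/s average")
```

Under these assumptions the average comes out in the single-digit Gb/s range, the same order of magnitude as the 1G to 15G axis labels on the slide.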

Page 26: Toward A National Big Data Superhighway

NIH’s Cancer Genomics Database Moved

So the PRP Deployed a FIONA to Chicago’s MREN

Page 27: Toward A National Big Data Superhighway

The Prototype PRP Has Attracted

New Application Drivers: More in Next Larry and Scott Talks

Scott Sellars, Marty Ralph

Center for Western Weather and Water Extremes

Frank Vernon - Expansion of HPWREN

Tom Levy, Cultural Heritage

Cryo EM

Page 28: Toward A National Big Data Superhighway

GPU JupyterHub:

2 x 14-core CPUs

256GB RAM

1.2TB FLASH

3.8TB SSD

Nvidia K80 GPU

Dual 40GbE NICs

And a Trusted Platform Module

GPU JupyterHub:

1 x 18-core CPU

128GB RAM

3.8TB SSD

Nvidia K80 GPU

Dual 40GbE NICs

And a Trusted Platform Module

PRP UC-JupyterHub Backbone

UCB

Next Step: Deploy Across PRP

UCSD

Source: John Graham, Calit2

Page 29: Toward A National Big Data Superhighway

Atmospheric Rivers (fall and winter)

Southwest Monsoon (summer & fall)

Great Plains Convection (spring and summer)

Front Range Upslope (rain/snow)

Funded collaborations

CW3E Based at UCSD/Scripps Oceanography

CW3E-North at Sonoma County Water Agency

Key Phenomena Causing Extreme Precipitation in the Western U.S. (Ralph et al. 2014)

Director: F. Martin Ralph Website: cw3e.ucsd.edu

Data is at the heart of what we do!

• High resolution numerical models

• Satellite images

• Ground based weather stations

• Weather radar

• Historical climate data

Big Data Collaboration with:

Source: Scott Sellars, CW3E

Collaboration on Atmospheric Water

Between UC San Diego and UC Irvine

Director: Soroosh Sorooshian, UCI Website: http://chrs.web.uci.edu

Page 30: Toward A National Big Data Superhighway

Calit2’s FIONA

SDSC’s COMET

Calit2’s FIONA

Pacific Research Platform (10-100 Gb/s)

GPUs

GPUs

Complete workflow time: 20 days -> 20 hrs -> 20 Minutes!

UC Irvine

UC San Diego

Improvement of Over 1000x With PRP
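The "over 1000x" figure can be checked directly from the three workflow times on this slide. A small sketch using only the standard library; the variable names are illustrative:

```python
from datetime import timedelta

# Workflow times as stated on the slide
original = timedelta(days=20)
intermediate = timedelta(hours=20)
current = timedelta(minutes=20)

# Dividing one timedelta by another yields a plain float ratio
first_step = original / intermediate     # days -> hours
second_step = intermediate / current     # hours -> minutes
overall = original / current

print(f"{first_step:.0f}x, then {second_step:.0f}x, overall {overall:.0f}x")
```

The overall ratio is the product of the two steps, which is how a 20-day workflow shrinking to 20 minutes exceeds a thousandfold improvement.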

Page 31: Toward A National Big Data Superhighway

Cryo-electron Microscopy (cryo-EM)

Has Driven a “Resolution Revolution” in the Last Five Years

Exposure (every 60 seconds):

X & Y dimensions: 7420 x 7676 Pixels

Frames per Movie: 10 - 50

Size: 3 - 10 GB per Movie

Every 24 hours:

Number of Movies: ~1400

Data Size: ~5 TB

Typical Datasets:

Length of Time: 2 - 6 Days

Total size: 10 - 30 TB

Each Cryo-EM ‘Image’ is Actually a Movie

Source: Michael A. Cianfrocco,

Elizabeth Villa, & Andres Leschziner, UCSD
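The figures on this slide are mutually consistent, which a short back-of-the-envelope sketch makes visible. It assumes decimal units (1 TB = 1000 GB), continuous acquisition around the clock, and helper names invented here for illustration.

```python
SECONDS_PER_DAY = 24 * 3600


def movies_per_day(exposure_interval_s: int = 60) -> int:
    """Upper bound on movies per day at one exposure per interval."""
    return SECONDS_PER_DAY // exposure_interval_s


def average_movie_gb(daily_tb: float = 5.0, movies: int = 1400) -> float:
    """Average movie size implied by the daily volume and movie count."""
    return daily_tb * 1000 / movies


def sustained_gbps(daily_tb: float = 5.0) -> float:
    """Average network rate needed to stream one day's data as it is produced."""
    return daily_tb * 1e12 * 8 / SECONDS_PER_DAY / 1e9


def dataset_tb(days: int, daily_tb: float = 5.0) -> float:
    """Total dataset size for a session lasting the given number of days."""
    return days * daily_tb


if __name__ == "__main__":
    print(movies_per_day(), "movies/day at most")
    print(f"{average_movie_gb():.1f} GB per movie on average")
    print(f"{sustained_gbps():.2f} Gb/s sustained per microscope")
    print(dataset_tb(2), "to", dataset_tb(6), "TB per dataset")
```

One exposure per minute gives slightly more than the ~1400 movies quoted, the implied average movie size falls inside the 3 to 10 GB range, and 2 to 6 days at ~5 TB per day reproduces the 10 to 30 TB dataset sizes.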

Page 32: Toward A National Big Data Superhighway

Using PRP to Connect Cryo-EM across California

With End Users and Computational Facilities

Long term:

‣Partner with Cryo-EM Facilities to Stream Data Straight from Microscopes (over PRP) to SDSC

‣Perform All Cryo-EM Analysis (from Micrographs to 3D Models) via Web Browser on SDSC

‣Expand Computing to Other XSEDE Resources (e.g. Xstream) and DOE’s NERSC

Short term:

‣Provide 2D and 3D Analysis on Particle Stacks on Comet at SDSC

Source: Michael A. Cianfrocco, UCSD

**

SDSC

NERSC

Xstream

3 Supercomputer Centers

cosmic-cryoem.org

~20 Microscopes in CA

UCLA

UC Davis

UC Santa Cruz

SF Bay

UC Berkeley, LBNL,

UCSF, Stanford

San Diego

UCSD, TSRI, Salk*

Page 33: Toward A National Big Data Superhighway

Linking Cultural Heritage and Archaeology Datasets

at UCB, UCLA, UCM and UCSD with CAVEkiosks

48 Megapixel CAVEkiosk

UCSD Library

48 Megapixel CAVEkiosk

UCB Library

24 Megapixel CAVEkiosk

UCM Library

Page 34: Toward A National Big Data Superhighway

PRP is the Platform Chosen for 2017 Expansion

of HPWREN, Connected to CENIC, into Orange and Riverside Counties

• PRP CENIC 100G Link UCSD to SDSU

– DTN FIONA Endpoints

– Data Redundancy

– Disaster Recovery

– High Availability

– Network Redundancy

• Anchor to CENIC at UCI

– PRP FIONA Connects to CalREN-HPR Network

– Data Replication Site

• Potential Future UCR CENIC Anchor

UCR

UCI

UCSD

SDSU

Source: Frank Vernon,

Greg Hidley, UCSD

Page 35: Toward A National Big Data Superhighway

Proposed Cognitive Hardware and Software Ecosystem

On the Pacific Research Platform

• Working With 30 CSE Machine Learning Researchers

– Goal is 320 Game GPUs in 32-40 FIONAs at 10 PRP Campuses

– PRP Couples FIONAs with GPUs into a Condor-Managed Cloud

• PRP Access to Emerging Processors

– IBM TrueNorth, KnuEdge, FPGA, and Qualcomm Snapdragon

• Software Including a Wide Range of Open ML Algorithms

• Metrics for Performance of Processors and Algorithms

Source: Tom DeFanti, Calit2

FIONA with 8-Game GPUs

Page 36: Toward A National Big Data Superhighway

We are Now Investigating

How the PRP Prototype Might Be Extended to National-Scale

From the text of the PRP cooperative agreement:

After approximately 18 (or TBD) months, a site visit and comprehensive review of progress towards meeting project milestones and goals and overall performance and management processes will take place, including user community relationships, scientific impacts, and the status of the project as a model for potential future national-scale, network-aware, data-focused cyberinfrastructure attributes, approaches, and capabilities.

Page 37: Toward A National Big Data Superhighway

Expanding to National Research Platform and Global Research Platform

Via CENIC/Pacific Wave, Internet2, and International Links

PRP’s Current International Partners

Korea Shows Distance is Not the Barrier to Above 5Gb/s Disk-to-Disk Performance

Page 38: Toward A National Big Data Superhighway

PRP Working on Connecting Guam

via the University of Oregon-Based Network Startup Resource Center

The PRP shipped a FIONette to CENIC’s John Hess to be Installed in Guam Mid-May

To support projects in:

• Geography

• Climate History

• Guam EPSCoR

• The UOG Marine Laboratory

“During the quarter century that this group has been helping to build internet infrastructure around the world, there’s hardly a place on the planet that has not been touched by the great work of the Network Startup Resource Center.” -- Larry Smarr

Page 39: Toward A National Big Data Superhighway

PRP is Partnering with the Advanced CyberInfrastructure –

Research and Education Facilitators (ACI-REF) NSF Grant to Explore Extension

PRP Connected

ACI-REF has also spawned the 28-member Campus Research Computing consortium (CaRC), funded by the NSF as a Research Coordination Network (RCN).

CaRC is dedicated to sharing best practices, expertise, and resources, enabling the advancement of campus-based research computing activities around the nation.

Jim Bottum, Principal Investigator

ACI-REF

CaRC

Page 40: Toward A National Big Data Superhighway

Announcing the First National Research Platform

Workshop August 7-8, 2017

Co-Chairs:

Larry Smarr, Calit2

& Jim Bottum, Internet2

See pacificresearchplatform.org

for Registration Information

Page 41: Toward A National Big Data Superhighway

Toward a National Research Platform

PRP has 3 FTEs to Connect ~25 Campuses.

How Many are Needed to Expand to an NRP Serving Researchers at 250 Campuses in Dozens of Fields?

What is the Path Forward?

As Internet2 Board of Trustees Member

John Evans Said to Me Last Night:

“We Are Near an Inflection Point.”

Page 42: Toward A National Big Data Superhighway

Our Support:

• US National Science Foundation (NSF) awards CNS-0821155, CNS-1338192, CNS-1456638, ACI-1540112, and ACI-1541349

• University of California Office of the President CIO

• UCSD Chancellor’s Integrated Digital Infrastructure Program

• UCSD Next Generation Networking initiative

• Calit2 and Calit2 Qualcomm Institute

• CENIC, PacificWave and StarLight

• DOE ESnet