CC* Integration: SANDIE: SDN-Assisted NDN for Data Intensive Experiments

Edmund Yeh

CC* Integration: SANDIE: SDN Assisted NDN for Data Intensive Experiments

NSF Campus Cyberinfrastructure and Cybersecurity Innovation for Cyberinfrastructure PI Workshop

September 25, 2018 | College Park, MD

CHALLENGES

▪ LHC program in HEP is the world's largest data intensive application: handling One Exabyte by ~2018 at hundreds of sites

▪ Global data distribution, processing, access, analysis; large but limited computing, storage, network resources

APPROACH

▪ Use Named Data Networking (NDN) to redesign LHC HEP network; optimize workflow

SOLUTIONS + Deliverables

▪ Deploy NDN edge caches with SSDs & 40G/100G network interfaces at 7 sites; combine with larger core caches

▪ Simultaneously optimize caching ("hot" datasets), forwarding, and congestion control in both the network core and site edges

▪ Development of naming scheme and attributes for fast access and efficient communication in HEP and other fields

SCIENTIFIC and BROADER IMPACT

▪ Lay groundwork for an NDN-based data distribution and access system for data-intensive science fields

▪ Benefit user community through lowered costs, faster data access and standardized naming structures

▪ Engage next generation of scientists in emerging concepts of future Internet architectures for data intensive applications

▪ Advance, extend and test the NDN paradigm to encompass the most data intensive research applications of global extent

TEAM

▪ Northeastern, Caltech, Colorado State

▪ In partnership with other LHC sites and the NDN project team

PI: Edmund Yeh

co-PIs: Harvey Newman, Christos Papadopoulos

Program, Area: CC, Integration

Award Number: 1659403

Edmund Yeh, Professor, Northeastern University

Christos Papadopoulos, Professor, Colorado State University

Harvey Newman, Professor of Physics, Caltech

Project Title: SANDIE: SDN-Assisted NDN for Data Intensive Experiments

Outline

• Named Data Networking (NDN)

• Optimized NDN algorithms

• Optimizing CMS workflows: simulations

• Global testbed

• Integration with Current Workflow: NDN-based filesystem plugin for XRootD

Named Data Networking (NDN)

• Leading Information-centric Networking (ICN) architecture.

• Started with a $7.9M NSF Future Internet Architecture grant (2010).

• Name data instead of endpoints: a paradigm shift.

• Pull-based data distribution architecture with stateful forwarding.

• Native support for caching and multicast.

• Natural fit for data-intensive science.

• Website: http://named-data.net
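
The bullets on pull-based distribution, stateful forwarding, caching and multicast can be made concrete with a toy model. The sketch below is an illustration only and does not use NFD or any NDN library: the classes `Network`, `Consumer`, `Producer` and `Forwarder`, and the name `/ndn/xrootd/foo`, are all hypothetical. It shows a forwarder holding the three standard NDN tables (FIB, Pending Interest Table, Content Store), aggregating two overlapping Interests into one upstream request and answering a later Interest from its cache.

```python
from collections import deque


class Network:
    """Delivers messages one at a time so Interests can overlap in flight."""

    def __init__(self):
        self.queue = deque()

    def send(self, handler, *args):
        self.queue.append((handler, args))

    def run(self):
        while self.queue:
            handler, args = self.queue.popleft()
            handler(*args)


class Consumer:
    def __init__(self, net):
        self.net = net
        self.received = {}

    def request(self, name, forwarder):
        # Pull-based: nothing is delivered unless an Interest asks for it.
        self.net.send(forwarder.on_interest, name, self)

    def on_data(self, name, payload):
        self.received[name] = payload


class Producer:
    def __init__(self, net, content):
        self.net = net
        self.content = content  # data name -> payload
        self.served = 0         # number of Interests that reached the producer

    def on_interest(self, name, downstream):
        self.served += 1
        self.net.send(downstream.on_data, name, self.content[name])


class Forwarder:
    def __init__(self, net):
        self.net = net
        self.fib = {}            # name prefix -> next hop
        self.pit = {}            # data name -> downstream requesters
        self.content_store = {}  # data name -> cached payload

    def next_hop(self, name):
        # Longest-prefix match on "/"-separated name components.
        parts = name.split("/")
        for i in range(len(parts), 1, -1):
            prefix = "/".join(parts[:i])
            if prefix in self.fib:
                return self.fib[prefix]
        raise KeyError(f"no route for {name}")

    def on_interest(self, name, downstream):
        if name in self.content_store:   # cache hit: answer locally
            self.net.send(downstream.on_data, name, self.content_store[name])
        elif name in self.pit:           # already pending: aggregate
            self.pit[name].append(downstream)
        else:                            # first request: forward upstream
            self.pit[name] = [downstream]
            self.net.send(self.next_hop(name).on_interest, name, self)

    def on_data(self, name, payload):
        self.content_store[name] = payload
        # Data follows the PIT state back to every requester (multicast).
        for downstream in self.pit.pop(name, []):
            self.net.send(downstream.on_data, name, payload)


net = Network()
producer = Producer(net, {"/ndn/xrootd/foo": b"event data"})
router = Forwarder(net)
router.fib["/ndn/xrootd"] = producer
alice, bob, carol = Consumer(net), Consumer(net), Consumer(net)

alice.request("/ndn/xrootd/foo", router)
bob.request("/ndn/xrootd/foo", router)    # arrives while alice's Interest is pending
net.run()
carol.request("/ndn/xrootd/foo", router)  # answered from the router's Content Store
net.run()
print("Interests that reached the producer:", producer.served)
```

Three consumers obtain the data while the producer sees a single Interest, which is the property that makes caching and multicast attractive for popular datasets.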

Optimized NDN Algorithms

• State-of-the-art, theoretically grounded, high-performance frameworks for optimizing NDN.

• Joint optimization of caching, forwarding, and congestion control.

• VIP (Virtual Interest Packet) framework (Yeh, et al. 2014).

• Adaptive caching and routing framework (Ioannidis & Yeh 2016, 2017).

• Algorithms have optimality guarantees.

• Algorithms are distributed, adaptive, dynamic.

• Superior performance in delay, cache hits, cache evictions.
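
To give a feel for how per-object demand counts can drive both forwarding and caching, here is a heavily simplified sketch loosely in the spirit of the VIP framework. It is not the published algorithm: it assumes equal-size data objects, a single decision step, and a per-node dictionary of VIP counts, and the function names `backpressure_choice` and `cache_choice` are hypothetical. Forwarding on a link favors the object with the largest positive difference in VIP count between the two ends (a backpressure-style rule), and caching keeps the objects with the largest local counts.

```python
def backpressure_choice(vip_here, vip_neighbor):
    """Return the object whose VIP count exceeds the neighbor's by the most,
    or None if no object has a positive difference on this link."""
    best_obj, best_diff = None, 0
    for obj in sorted(vip_here):
        diff = vip_here[obj] - vip_neighbor.get(obj, 0)
        if diff > best_diff:
            best_obj, best_diff = obj, diff
    return best_obj


def cache_choice(vip_here, cache_slots):
    """Return the set of objects to cache: those with the largest VIP counts,
    assuming every object has the same size and the cache holds cache_slots."""
    ranked = sorted(vip_here, key=lambda obj: (-vip_here[obj], obj))
    return set(ranked[:cache_slots])


# Hypothetical VIP counts at two neighboring nodes.
vip_n1 = {"blockA": 10, "blockB": 4, "blockC": 7}
vip_n2 = {"blockA": 9, "blockB": 0, "blockC": 7}

print(backpressure_choice(vip_n1, vip_n2))  # object to forward on link n1 -> n2
print(sorted(cache_choice(vip_n1, 2)))      # objects to keep in a 2-slot cache
```

Both decisions use only counts available at a node and its neighbor, which is what allows algorithms of this kind to run in a distributed, adaptive way.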

Optimizing CMS Workflows: Simulations

• Long-term goal: optimize dynamic caching, distribution of datasets to REPOS, and distribution of workflow jobs.

• First goal: given CMS workflow statistics, use simulation to quantify the performance improvement from the optimal VIP caching and forwarding algorithm (Yeh, et al. 2014).

• Took dataset and request statistics from PhEDEx and Elasticsearch over 2 months at datablock-level granularity (10s-100s of GB).

• Measured CMS network topology and link capacities using perfSONAR.

• Simulation: apply the VIP algorithm to the set of 500 most popular datablocks.

• 6 TB caches at all US T1 and T2 sites.

• Result: average data retrieval delay reduced by ~50% for off-site workflows.
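
As a rough sense of scale, a 6 TB cache (taking 1 TB = 1000 GB) holds at most 600 datablocks of 10 GB or 60 datablocks of 100 GB, so only part of a 500-block working set fits at any one site. The sketch below illustrates the preprocessing step of ranking datablocks by request count and a naive popularity-ordered cache fill. It is not the VIP algorithm and not the project's simulation code; the function names, the request log and the block sizes are made up for illustration.

```python
from collections import Counter


def most_popular(requests, n):
    """Return the n most requested datablock names, most popular first."""
    return [name for name, _ in Counter(requests).most_common(n)]


def fill_cache(ranked_blocks, sizes_gb, capacity_gb):
    """Place blocks in popularity order, skipping any that no longer fit."""
    cached, used = [], 0.0
    for name in ranked_blocks:
        if used + sizes_gb[name] <= capacity_gb:
            cached.append(name)
            used += sizes_gb[name]
    return cached, used


# Hypothetical request log and block sizes (GB).
requests = ["a", "b", "a", "c", "a", "b"]
sizes_gb = {"a": 4000, "b": 3000, "c": 1500}

ranked = most_popular(requests, 3)
cached, used = fill_cache(ranked, sizes_gb, capacity_gb=6000)
print(ranked, cached, used)
```

A static fill like this ignores network topology and changing demand; the simulation described above instead lets the VIP algorithm decide caching and forwarding jointly.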

CMS Simulation Topology

• Topology includes all Tier 1 and Tier 2 sites in the US:

– perfSONAR: average throughput between sites

– Typically 10-100 Gbps

Global Testbed

• Expand the Caltech SDN testbed and the CSU climate testbed.

• Deploy a few more sites in the USA and abroad (Northeastern, UCSD, U. Nebraska, ...)

• Caches, each with:

⚫ Several terabytes of SSDs

⚫ 40-60 terabytes of SAS disks

⚫ 10G to 100G network interfaces

⚫ NDN and XRootD software systems

Integration with Current Workflow

• Goals: increased efficiency, reduced complexity for both applications and network.

• Integrate with XRootD, the de facto data management software.

• First step: implement an NDN-based filesystem plugin for XRootD and a suitable NDN Producer.

Architecture and Flow Diagram

NDN Consumer for XRootD Plugin

• Supports the open, fstat, read and close system calls.

• Composes Interests for these system calls over NDN:

/ndn/xrootd/……./root/path/for/ndn/xrd/foo?ndn.MustBeFresh=true

(NDN prefix, path to file of interest, NDN Interest-specific info)

/ndn/xrootd/......./root/test/path/ndn/xrd/foo/%00%00?ndn.MustBeFresh=true

(NDN prefix, path to file of interest, segment number, NDN Interest-specific info)

• For the read system call, it breaks the request down into segments.

• Handles Interest validation, timeout and Nack.

[Diagram: a read request, given by its offset in the file and its buffer length, is covered by NDN packets of 7 KB each, from the first Interest to the last Interest.]
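
The mapping from a read request to segment Interests can be sketched as follows. This is an illustration under stated assumptions, not the plugin's code: "7KB" is taken to mean 7168 bytes, the segment component is assumed to follow the marker-based NDN naming convention (byte 0x00 followed by the segment number, which is consistent with the `%00%00` shown for segment 0), and the file name `/ndn/xrootd/example/foo` and all function names are hypothetical.

```python
import string

SEGMENT_SIZE = 7 * 1024  # assumption: the slide's "7KB" read as 7168 bytes

# Bytes that the NDN URI scheme leaves unescaped in a name component.
_UNRESERVED = set((string.ascii_letters + string.digits + "-._~").encode())


def segment_component(seg_no):
    """Marker byte 0x00 plus the segment number as a 1, 2, 4 or 8 byte
    big-endian integer, written with NDN URI percent-escaping."""
    if seg_no < 0:
        raise ValueError("segment number must be non-negative")
    for width in (1, 2, 4, 8):
        if seg_no < 1 << (8 * width):
            raw = b"\x00" + seg_no.to_bytes(width, "big")
            return "".join(
                chr(b) if b in _UNRESERVED else f"%{b:02X}" for b in raw
            )
    raise ValueError("segment number does not fit in 8 bytes")


def read_segments(offset, length, segment_size=SEGMENT_SIZE):
    """Segment numbers covering bytes [offset, offset + length) of the file."""
    if length <= 0:
        return []
    first = offset // segment_size                 # first Interest
    last = (offset + length - 1) // segment_size   # last Interest
    return list(range(first, last + 1))


def interest_names(file_name, offset, length):
    """One Interest name per segment needed to satisfy the read."""
    return [
        f"{file_name}/{segment_component(n)}?ndn.MustBeFresh=true"
        for n in read_segments(offset, length)
    ]


# A 1000-byte read starting at offset 7000 straddles segments 0 and 1.
for name in interest_names("/ndn/xrootd/example/foo", 7000, 1000):
    print(name)
```

After the Data packets arrive, a consumer built this way would still need to trim the first and last segments to the requested byte range before filling the read buffer.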

NDN Producer for XRootD Plugin

• Registers prefixes to NFD.

• Performs open, fstat, read and close system calls as required.

• Keeps track of opened files.

• Sends data as strings and non-negative integers.

/ndn/xrootd/……..../root/path/for/ndn/xrd/foo/00/?ndn.MustBeFresh=true&ndn.Nonce=825012545

(NDN prefix, path to file of interest, NDN data-specific info)
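
The producer-side bookkeeping can be sketched in the same spirit. The class below is a toy stand-in and not the SANDIE producer: it does not talk to NFD or register prefixes, the class name `ToyProducer` and its `on_*` methods are hypothetical, and returning 0 for success (or an errno value on failure) is an assumption about how results might be reported. The `encode_nni` helper follows the NDN NonNegativeInteger encoding (1, 2, 4 or 8 bytes, big-endian), which is one way to read "non-negative integers" on the slide.

```python
import os


def encode_nni(value):
    """Encode as an NDN NonNegativeInteger: shortest of 1, 2, 4 or 8 bytes."""
    if value < 0:
        raise ValueError("value must be non-negative")
    for width in (1, 2, 4, 8):
        if value < 1 << (8 * width):
            return value.to_bytes(width, "big")
    raise ValueError("value does not fit in 8 bytes")


def decode_nni(raw):
    if len(raw) not in (1, 2, 4, 8):
        raise ValueError("NonNegativeInteger must be 1, 2, 4 or 8 bytes")
    return int.from_bytes(raw, "big")


class ToyProducer:
    """Answers open/fstat/read/close requests and tracks opened files."""

    def __init__(self, segment_size=7 * 1024):  # assumed 7 KB segments
        self.segment_size = segment_size
        self.open_files = {}  # path -> open file object

    def on_open(self, path):
        if path not in self.open_files:
            try:
                self.open_files[path] = open(path, "rb")
            except OSError as err:
                return encode_nni(err.errno or 1)  # assumed error convention
        return encode_nni(0)

    def on_fstat(self, path):
        # Only the file size is reported in this sketch.
        size = os.fstat(self.open_files[path].fileno()).st_size
        return encode_nni(size)

    def on_read(self, path, seg_no):
        # Content of one segment; empty bytes past the end of the file.
        handle = self.open_files[path]
        handle.seek(seg_no * self.segment_size)
        return handle.read(self.segment_size)

    def on_close(self, path):
        handle = self.open_files.pop(path, None)
        if handle is not None:
            handle.close()
        return encode_nni(0)
```

A caller would invoke `on_open` before `on_fstat` or `on_read`; the open-file table is what lets repeated segment requests for the same file reuse one file handle.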

Demo

Consumer tracing:

Using the xrdcp tool to copy a file:

Demo

Producer tracing

Conclusions

• NDN is a natural fit for LHC and other large-scale scientific data and computation networks.

• Fully exploits bandwidth, storage, and computation resources.

• Distributed data access by name, with high-performance caching, forwarding, and congestion control.

• Works well with overlay networks with SDN-assisted reserved bandwidth.

• Synergistic with the ongoing move to edge computing, SDN, network function virtualization, and network slicing.

• Possible model for the National Research Platform (NRP).