MANAGING SCIENTIFIC DATA WITH NDN Chengyu Fan, Susmit Shannigrahi, Steve DiBenedetto, Catherine Olschanowsky, Christos Papadopoulos NDNcomm 2015 Sept 28, 2015 Los Angeles, CA Supported by NSF #13410999 and NSF#1345236


MANAGING SCIENTIFIC

DATA WITH NDN

Chengyu Fan, Susmit Shannigrahi, Steve DiBenedetto,

Catherine Olschanowsky, Christos Papadopoulos

NDNcomm 2015

Sept 28, 2015 Los Angeles, CA

Supported by NSF #13410999 and NSF#1345236


Introduction

Scientific data is often very large and complex

Climate - CMIP5: 3.5 PB, CMIP6: 350PB-3EB

Physics - Atlas: 4 PB/Year

Astronomy, bioinformatics, others…

Science infrastructure

Cutting edge hardware but often incompatible

domain software (ESGF, xrootd, etc.)

Complexity, replication, redundancy


Our Project

Build and deploy software to evaluate NDN in scientific applications over a dedicated hardware infrastructure

Evaluate NDN in the context of:

Application services: publishing, discovery, retrieval, access control, load balancing, failover, caching, etc.

Network integration (OSCARS, SDN, etc.)

Metrics

Performance, reduced complexity, ease of deployment, interoperability, reuse, efficiency, routing, security/trust, etc.

NDN Layer Structure

[Figure: protocol-stack diagram built up over several slides. Elements shown: hosts and routers; layers labeled APP, NDN, UDP/IP, ETH, Other, and LINK.]

Methodology

Investigate the use of NDN as a common platform for scientific data applications by:

Understanding data management challenges of various scientific domains

Developing and evaluating prototype applications that leverage NDN's features

Using prototypes to further drive NDN research


First Step – Build a Catalog

Create a shared resource – a distributed, synchronized catalog of names over NDN

Provide common operations such as publishing, discovery, access control

Catalog only deals with name management, not dataset retrieval

Platform for further research and experimentation

Research questions:

Namespace construction, distributed publishing, key management, UI design, failover, etc.

Functional services such as subsetting

Mapping of name-based routing to tunneling services (VPN, OSCARS, MPLS)

Overview of Catalog Workflow

[Figure: a publisher, a consumer, two data storage nodes, and catalog nodes 1, 2, and 3, all connected over NDN. The steps below are added one per build.]

(1) Publish dataset names

(2) Sync changes

(3) Query for dataset names

(4) Retrieve data

NDN-Science Testbed

NSF CC-NIE campus infrastructure award

10G testbed (courtesy of ESnet, UCAR, and CSU Research LAN)

Currently ~50TB of CMIP5, ~70TB of HEP data


Demos

Search

Publication and Sync

Access control

Retrieval and failover


Conclusions

IP encourages common host access, not common data access methods

Does not encourage interoperability at the application level

NDN has the potential to unify the service interface required by scientific applications

Science testbed and prototypes to test hypotheses and drive research and experimentation

Ready-to-try catalog, we invite you to try it with your data

Catalog is general, supports a variety of applications

Currently CMIP5 and HEP applications

UI for data search and retrieval.


Our sponsors: NSF and ESnet

Join us @

http://www.netsec.colostate.edu/mailman/listinfo/ndn-sci


Backup Slides

Current Example: xrootd

[Figure: a client, a manager (a.k.a. redirector) labeled cmsd and xrootd, and data servers A, B, and C, each labeled cmsd and xrootd. The label /my/file appears twice. The final build adds the step "4: Try open() at A".]

Fragile, fairly complex middleware

xrootd under NDN

Significantly reduced system complexity

Better service abstraction

[Figure: a client, an NDN network, and data servers A, B, and C, each labeled cmsd and xrootd. The label /my/file appears twice, and the final build adds a request labeled "? /my/file".]

Data Publication

[Figure: message exchange between Publisher and Catalog, built up one step per slide.]

1) Catalog: listening on /<catalog-prefix>/publish

2) Publisher: generate NDN names for datasets/services

3) Publisher: request publish

4) Catalog: fetch published name list

5) Catalog: authenticate the Data and validate data name against trust model

6) Catalog: share names with other catalogs

Keys for ndn-atmos

[Figure: key hierarchy in which higher-level keys sign lower-level keys.]

Self-signed root key: /cmip5/KEY

Site’s keys: /cmip5/lbl/KEY, /cmip5/nwsc/KEY, …

Application’s keys:

/cmip5/lbl/<DataPublisher>/KEY (dataset names publishing)

/cmip5/nwsc/<operator>/KEY and /cmip5/nwsc/<router>/KEY (NLSR)
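The key hierarchy can be read as a chain of signatures leading from an application key up to the self-signed root. The following is a minimal sketch of walking such a chain, assuming each key is signed by the key one level above it. The SIGNED_BY table, the publisher name "publisher1", and both helper functions are hypothetical, and no cryptographic verification is performed.

```python
# Hypothetical signer table: key name -> name of the key that signed it.
SIGNED_BY = {
    "/cmip5/KEY": "/cmip5/KEY",                     # self-signed root
    "/cmip5/lbl/KEY": "/cmip5/KEY",                 # site key
    "/cmip5/nwsc/KEY": "/cmip5/KEY",                # site key
    "/cmip5/lbl/publisher1/KEY": "/cmip5/lbl/KEY",  # application key (made-up name)
}


def chain_to_root(key_name, signed_by, max_depth=10):
    """Follow signer links until a self-signed key is reached; return the chain."""
    chain = [key_name]
    for _ in range(max_depth):
        signer = signed_by.get(chain[-1])
        if signer is None:
            raise KeyError(f"no signer known for {chain[-1]}")
        if signer == chain[-1]:
            return chain  # reached a self-signed key
        chain.append(signer)
    raise ValueError("chain too long or cyclic")


def is_trusted(key_name, signed_by, trust_anchor="/cmip5/KEY"):
    """True if the chain from key_name ends at the configured trust anchor."""
    try:
        return chain_to_root(key_name, signed_by)[-1] == trust_anchor
    except (KeyError, ValueError):
        return False
```

A real validator would also check each signature cryptographically; this sketch only shows the name-level walk.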

Trust Model

Only namespace owners are allowed to publish data

Data provenance built into the data packet

[Figure: two publish messages, each laid out as content name, signature, and data payload.]

Valid publish message

Content name: /PublisherA/publish

Signature: Publisher A’s signature

Data payload:

- /PublisherA/publish/file/1

- /PublisherA/publish/file/2

+ /PublisherA/publish/file/3

+ /PublisherA/publish/file/4

Invalid publish message

Content name: /PublisherA/publish

Signature: Publisher A’s signature

Data payload:

- /PublisherB/publish/file
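The namespace rule illustrated by the valid and invalid messages can be sketched as a prefix check. This is a minimal illustration, not the catalog's actual validator. It assumes the signature has already been verified, that the publisher's namespace is the message's content name, and that "+" and "-" mark names to add and remove. The PublishMessage type and the helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublishMessage:
    content_name: str  # e.g. "/PublisherA/publish"
    changes: tuple     # entries such as "+ /PublisherA/publish/file/3"


def components(name):
    """Split an NDN-style name into its components."""
    return [c for c in name.split("/") if c]


def is_under(prefix, name):
    """Component-wise prefix test, so /PublisherAB does not match /PublisherA."""
    p, n = components(prefix), components(name)
    return n[:len(p)] == p


def validate_publish(msg):
    """Accept the message only if every listed name lies under its content name."""
    for entry in msg.changes:
        op, _, name = entry.partition(" ")
        if op not in ("+", "-"):
            return False
        if not is_under(msg.content_name, name.strip()):
            return False
    return True


valid = PublishMessage(
    "/PublisherA/publish",
    ("- /PublisherA/publish/file/1", "+ /PublisherA/publish/file/3"),
)
invalid = PublishMessage("/PublisherA/publish", ("- /PublisherB/publish/file",))
```

Comparing whole components rather than raw strings is the important design choice here: a plain string prefix test would wrongly accept a sibling namespace whose name merely starts with the same characters.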

Name Discovery

[Figure: message exchange between Consumer and Catalog, built up one step per slide.]

1) Catalog: listening on /<catalog-prefix>/query

2) Consumer: query with parameters (model=cmip5 AND frequency=6hr)

3) Catalog: query local DB; packetize results under /<catalog-prefix>/query-results/<params>

3) ACK

4) Consumer: fetch query results (name list)

5) Consumer: fetch desired dataset(s) or re-query
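The catalog side of this exchange (parse the query, search the local DB, packetize the results) can be sketched as follows. This is an illustration under stated assumptions, not the catalog's implementation: the FIELDS schema that maps name positions to field names is hypothetical, the local DB is a plain list, matching is case-insensitive, and "packetizing" is reduced to chunking the name list.

```python
# Hypothetical schema: which field each name component position represents.
FIELDS = ("model", "product", "institute", "frequency", "year")


def parse_query(text):
    """Parse 'a=b AND c=d' into {'a': 'b', 'c': 'd'}."""
    params = {}
    for clause in text.split(" AND "):
        key, _, value = clause.partition("=")
        params[key.strip()] = value.strip()
    return params


def matches(name, params):
    """True if every queried field equals the corresponding name component."""
    record = dict(zip(FIELDS, name.strip("/").split("/")))
    return all(record.get(k, "").lower() == v.lower() for k, v in params.items())


def query(db, text):
    """Return the sorted names in db that satisfy the query text."""
    params = parse_query(text)
    return sorted(n for n in db if matches(n, params))


def packetize(names, per_packet=2):
    """Split the result list into fixed-size chunks, one per result packet."""
    return [names[i:i + per_packet] for i in range(0, len(names), per_packet)]


DB = [
    "/CMIP5/output/BCC/6hr/1998",
    "/CMIP5/output1/VA/6hr/2016",
    "/CMIP5/output/BCC/mon/1998",
]
results = query(DB, "model=cmip5 AND frequency=6hr")
```

With this sample DB, the query keeps the two 6hr names and drops the monthly one; the consumer would then fetch the packets one by one.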

Data Publication

Catalog

Accept publish requests: /<catalog-prefix>/publish

Authenticate and retrieve data names from publisher

Sync names with other catalogs

Publisher

Generate NDN names for datasets/services

Inform catalog of names to add/remove

[Figure: message exchange between Publisher and Catalog, in build order: request publish; fetch published name list; validate data name against trust model; share names with other catalogs.]
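The publish-then-sync behavior can be sketched with an in-memory model. The slides do not say how catalogs synchronize, so the pull-based log replay below is an assumption made purely for illustration, as are the Catalog class, its method names, and the "+ name" / "- name" entry format. Entries are assumed to have passed trust validation already.

```python
class Catalog:
    """Toy catalog: a set of names plus a log of locally accepted changes."""

    def __init__(self):
        self.names = set()
        self.local_log = []  # changes accepted directly from publishers
        self.pulled = {}     # id(peer) -> count of that peer's entries already applied

    def _apply(self, entry):
        op, _, name = entry.partition(" ")
        if op == "+":
            self.names.add(name)
        elif op == "-":
            self.names.discard(name)
        else:
            raise ValueError(f"unknown operation in {entry!r}")

    def publish(self, changes):
        """Accept a publisher's add/remove list and record it for peers."""
        for entry in changes:
            self._apply(entry)
            self.local_log.append(entry)

    def pull_from(self, peer):
        """Apply only the peer's log entries not seen before."""
        start = self.pulled.get(id(peer), 0)
        for entry in peer.local_log[start:]:
            self._apply(entry)
        self.pulled[id(peer)] = len(peer.local_log)


node1, node2 = Catalog(), Catalog()
node1.publish(["+ /PublisherA/publish/file/3", "+ /PublisherA/publish/file/4"])
node2.pull_from(node1)
```

After the pull, both nodes hold the same two names; a later removal published at node1 reaches node2 on its next pull, and repeated pulls are no-ops.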

Name Discovery

Catalog

Accept queries on /<catalog-prefix>/query

Query local DB

Packetize the returned names under /<catalog-prefix>/query-results/<params>

User

Query catalog for names with specified components, e.g.: model=cmip5 AND frequency=6hr

Fetch generated name list

Fetch desired dataset(s) or re-query

[Figure: message exchange between Consumer and Catalog, in build order: query with parameters; query local DB and packetize results; ACK; fetch query results; fetch data with standard NDN.]

Name Discovery Optimization

Catalog

Accept queries on /<catalog-prefix>/queryParams

Query local DB

Packetize the returned names under /<catalog-prefix>/queryParams/seg#

In case of failure, queries get redirected to another catalog

Consumers

Can query any catalog instance

Can transparently fail over to another catalog

• Avoids maintaining state between user and catalog

• Enables graceful failover
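Because the query parameters travel inside the name, any catalog instance can answer any segment without per-user state. The sketch below illustrates that idea under several assumptions: the JSON-plus-percent-encoding of the parameters, the "/catalog" prefix, the CatalogReplica class, and matching by component value are all hypothetical. The loop over replicas stands in for what the NDN network's forwarding would do transparently.

```python
import json
from urllib.parse import quote, unquote


def query_name(prefix, params, segment):
    """Build /<prefix>/<encoded params>/<segment>; the encoding is canonical."""
    encoded = quote(json.dumps(params, sort_keys=True, separators=(",", ":")), safe="")
    return f"{prefix}/{encoded}/{segment}"


class CatalogReplica:
    """Stateless toy catalog: everything it needs is in the requested name."""

    def __init__(self, names, per_segment=2):
        self.names = sorted(names)
        self.per_segment = per_segment
        self.alive = True

    def answer(self, name):
        if not self.alive:
            raise ConnectionError("catalog unreachable")
        encoded, segment = name.split("/")[-2:]
        wanted = set(json.loads(unquote(encoded)).values())
        hits = [n for n in self.names if wanted <= set(n.strip("/").split("/"))]
        start = int(segment) * self.per_segment
        chunk = hits[start:start + self.per_segment]
        return chunk, start + self.per_segment >= len(hits)


def fetch_all(replicas, prefix, params):
    """Fetch every segment, trying the next replica whenever one is unreachable."""
    results, segment = [], 0
    while True:
        name = query_name(prefix, params, segment)
        for replica in replicas:
            try:
                chunk, is_last = replica.answer(name)
                break
            except ConnectionError:
                continue
        else:
            raise ConnectionError("no catalog reachable")
        results.extend(chunk)
        if is_last:
            return results
        segment += 1
```

Since segment 1 of a query has the same name at every replica, a consumer that received segment 0 from one catalog can obtain segment 1 from another with no handoff between them.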

Simplified xrootd Under NDN

NDN integrates discovery, failover, retrieval …

Provides a better abstraction to the applications

[Figure: a client, an NDN network, and data servers A, B, and C, each labeled cmsd and xrootd. The label /my/file appears twice, and the final build adds a request labeled "? /my/file".]

Name Discovery Challenges

Users may need to discover content/services without knowing the full NDN name prefix structure

NDN names are contiguous prefixes

Users may only know a few disjoint name components (e.g., frequency=6hr)

But cannot use wildcards for name discovery

May take too many requests to find desired data or service

[Figure: a Consumer queries the NDN network. Labels: /CMIP5; /CMIP5/output/BCC/6hr/1998; /CMIP5/output/BCC/6hr (exclude 1998); . . . User wants: /CMIP5/output1/VA/6hr/2016]
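The request cost described on this slide can be illustrated with a small simulation. This is a toy sketch under stated assumptions, not NDN library code: `NAMES`, `express_interest`, `list_children`, and `discover` are hypothetical, and the in-memory list stands in for producers and caches. Each request carries a contiguous prefix plus an exclude set for the next component, so a consumer that knows only disjoint components (here `6hr` and `2016`) has to walk the namespace one level at a time.

```python
# Toy model of prefix-plus-exclude name discovery (not real NDN library code).
# NAMES stands in for whatever the network can return; all names are illustrative.
NAMES = [
    "/CMIP5/output/BCC/6hr/1998",
    "/CMIP5/output/BCC/6hr/2016",
    "/CMIP5/output/BCC/3hr/1998",
    "/CMIP5/output1/VA/6hr/1998",
    "/CMIP5/output1/VA/6hr/2016",
]


def components(name):
    """Split '/a/b/c' into ['a', 'b', 'c']."""
    return [c for c in name.split("/") if c]


def express_interest(prefix, exclude):
    """Return one name under `prefix` whose next component is not excluded.

    Returns None when nothing matches, which models a timeout or NACK.
    """
    p = components(prefix)
    for name in NAMES:
        c = components(name)
        if c[:len(p)] == p and len(c) > len(p) and c[len(p)] not in exclude:
            return name
    return None


def list_children(prefix, counter):
    """Enumerate the next-level components under `prefix`, one request each."""
    seen = set()
    while True:
        counter[0] += 1  # every request counts, including the final miss
        name = express_interest(prefix, seen)
        if name is None:
            return sorted(seen)
        seen.add(components(name)[len(components(prefix))])


def discover(prefix, depth, wanted, counter):
    """Walk the name tree to `depth`; keep names containing all `wanted` components."""
    if len(components(prefix)) == depth:
        return [prefix] if wanted <= set(components(prefix)) else []
    found = []
    for child in list_children(prefix, counter):
        found += discover(prefix + "/" + child, depth, wanted, counter)
    return found


requests = [0]
matches = discover("/CMIP5", 5, {"6hr", "2016"}, requests)
print(matches)
print("requests:", requests[0])
```

Even on this five-name toy namespace the walk issues 20 requests to find two matching names, and the count grows with every extra branch in the tree.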


NDN Support for Big Science

NDN Names separate data from hosts

Discovery: Names directly translate to network queries

Failover: Network can get verifiable data from anywhere

Retrieval: Data can be fetched from optimal source(s)

Investigate the use of NDN as a platform for scientific data applications

Understand data management challenges of various scientific domains

Develop prototype applications to leverage NDN's built-in features

Use these applications as case studies to drive NDN research aspects
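The failover and retrieval points above can be sketched as a toy model in which a consumer asks for a name and accepts verifiable data from whichever source has it. Everything here is an assumption made for illustration: `Source` and `fetch` are hypothetical, the explicit loop stands in for what the network would do on the consumer's behalf, and a SHA-256 digest stands in for NDN's publisher signatures.

```python
import hashlib

# Toy model: a name maps to content, and any source may answer for it.
# A SHA-256 digest stands in for a publisher signature, purely to show
# "verify the data, not the host"; it is not how NDN signs packets.


class Source:
    """Hypothetical data source (server or cache) holding named content."""

    def __init__(self, label, store, up=True):
        self.label = label
        self.store = store
        self.up = up

    def get(self, name):
        if not self.up:
            raise ConnectionError(self.label + " is unreachable")
        return self.store.get(name)


def fetch(name, expected_digest, sources):
    """Try each source in turn; accept the first copy that verifies."""
    for src in sources:
        try:
            data = src.get(name)
        except ConnectionError:
            continue  # failover: move on to the next source
        if data is not None and hashlib.sha256(data).hexdigest() == expected_digest:
            return data, src.label
    raise LookupError("no source returned verifiable data for " + name)


content = b"climate model output"
digest = hashlib.sha256(content).hexdigest()

sources = [
    Source("A", {"/my/file": content}, up=False),   # unreachable
    Source("B", {"/my/file": b"corrupted copy"}),   # fails verification
    Source("C", {"/my/file": content}),             # good copy
]

data, served_by = fetch("/my/file", digest, sources)
print(served_by)
```

The caller never names a host: it supplies only the data name and a way to verify the result, so an unreachable or corrupted source is skipped without any application-level reconfiguration.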


Summary

NDN improves scientific data management at scale

Apps benefit from transparent multipath, automatic failover, etc.

Built-in security provides publisher provenance

Names are the common building block for content and services

Names are flexible: can refer to static content or dynamic services

Catalog supports efficient publication and non-contiguous name discovery

Users can discover content and services with minimal a priori knowledge

Catalog validates publication requests for authorization
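The two catalog roles named on this slide, non-contiguous name discovery and validation of publication requests, can be sketched as a minimal in-memory model. This is not the project's catalog implementation: the `Catalog` class, the field names in `FIELDS` (apart from `frequency`, which appears in the slides), and the prefix-based authorization rule are all assumptions chosen for illustration.

```python
# Toy catalog: hypothetical field names and a hypothetical authorization rule.
FIELDS = ["activity", "product", "institute", "frequency", "year"]


class Catalog:
    def __init__(self, authorized_prefixes):
        # Maps a publisher id to the name prefix it may publish under (assumed policy).
        self.authorized = authorized_prefixes
        self.names = []

    def publish(self, publisher, name):
        """Accept a name only if the publisher is authorized for its prefix."""
        prefix = self.authorized.get(publisher)
        if prefix is None or not (name + "/").startswith(prefix + "/"):
            return False
        if name not in self.names:
            self.names.append(name)
        return True

    def query(self, **known):
        """Return names that match every known field, whatever its position."""
        results = []
        for name in self.names:
            record = dict(zip(FIELDS, name.strip("/").split("/")))
            if all(record.get(k) == v for k, v in known.items()):
                results.append(name)
        return results


catalog = Catalog({
    "bcc-publisher": "/CMIP5/output/BCC",
    "va-publisher": "/CMIP5/output1/VA",
})
accepted = catalog.publish("bcc-publisher", "/CMIP5/output/BCC/6hr/1998")
accepted_va = catalog.publish("va-publisher", "/CMIP5/output1/VA/6hr/2016")
# A publisher trying to publish outside its own prefix is refused.
rejected = catalog.publish("va-publisher", "/CMIP5/output/BCC/6hr/2016")

print(catalog.query(frequency="6hr", year="2016"))
```

A single query with two disjoint components returns the full name, so the user needs no prior knowledge of the prefix structure.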


Managing Scientific Data with NDN

Science testbed

10G testbed (courtesy of ESnet, UCAR, and CSU Research LAN)

Nodes strategically located near scientific data (climate + HEP)

CC-NIE NSF award

Distributed, synchronized catalog of names and services

Common functionality: publishing, discovery, access control, etc.

Search and retrieval UI

Platform for further research and experimentation

Research questions:

Namespace construction, distributed publishing, key management, UI design, failover, etc.

Functional services such as subsetting

Mapping of name-based routing to tunneling services (VPN, OSCARS, MPLS)


Managing Scientific Data with NDN

Science testbed

10G testbed (courtesy of ESnet, UCAR, and CSU Research LAN)

CMIP5 and HEP data

CC-NIE NSF award

Name-based Internet architecture

Name the data, not the host

All data digitally signed

Unifies and pushes common functionality to the network: publishing, discovery, access control, etc.

Data-intensive applications

Automatic pervasive in-network caching, parallel retrieval, automatic failover, and more

Simpler alternative implementations of middleware (e.g., ESGF, xrootd)