Collaboration on Large Datasets using Globus. Rachana Ananthakrishnan, University of Chicago

Post on 24-Feb-2016

TRANSCRIPT

Page 1: Collaboration on Large Datasets using  Globus

Collaboration on Large Datasets using Globus

Rachana Ananthakrishnan

University of Chicago

Page 2: Collaboration on Large Datasets using  Globus

Data sharing in collaborations

[Diagram components: Registry, Staging Store, Ingest Store, Analysis Store, Community Store, Archive, Mirror. The Ingest Store, Analysis Store, Community Store, Archive, and Mirror group appears twice in the diagram.]

Page 3: Collaboration on Large Datasets using  Globus

Data Management User Stories

• “I need a good place to store / backup / archive my (big) research data”

• “I need to easily, quickly, and reliably move or mirror portions of my data to other places.”

• “I need a way to easily and securely share my data with my colleagues at other institutions.”

• “I want to publish my data.”

• “I want to discover published data.”

• …

Page 4: Collaboration on Large Datasets using  Globus

Exemplar: ISI-MIP

• Inter-Sectoral Impact Model Intercomparison Project

• Framework to collate climate impact data across scales and sectors

• World-wide collaboration with data assets managed by the collaboration

• Inputs come from various climate models; the outputs form the basis for model evaluation and improvement

Credits: Dr. Joshua Elliott, University of Chicago

Page 5: Collaboration on Large Datasets using  Globus

ISI-MIP Use Cases

• Share data with researchers across institutions world-wide
  – Restricted sharing
  – Multiple institutions

• Accept data submissions
  – Restricted writing to archive

• Publish results
  – Move selected results to other locations
  – Track metadata
  – Discover data

Page 6: Collaboration on Large Datasets using  Globus

What is Globus?

Big data publish*, transfer and sharing…

…with Dropbox-like simplicity…

…directly from your own storage systems

* In pilot phase

Page 7: Collaboration on Large Datasets using  Globus

Publish walk-through

[Diagram labels: Collaboration Archive; Univ. of Chicago, Argonne, IIT, UIUC; roles: Scientist, Curator]

1. Publish Data

2. Describe Submission

3. Assemble Dataset (Transfer Data)

4. Curate Dataset

Page 8: Collaboration on Large Datasets using  Globus

Login with Campus Identity

Page 9: Collaboration on Large Datasets using  Globus

New submission

Page 10: Collaboration on Large Datasets using  Globus

Assemble the Dataset

Page 11: Collaboration on Large Datasets using  Globus

Move data to publish archive

Page 12: Collaboration on Large Datasets using  Globus

Grant Submission License

Page 13: Collaboration on Large Datasets using  Globus

Submission Complete

Page 14: Collaboration on Large Datasets using  Globus

Curator Logs in

Page 15: Collaboration on Large Datasets using  Globus

Curation Workflow Options

Page 16: Collaboration on Large Datasets using  Globus

Verify Metadata & Files

Page 17: Collaboration on Large Datasets using  Globus

Approve the Submission

Page 18: Collaboration on Large Datasets using  Globus

Submission is now Published with DOI
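
The publish walk-through on the preceding slides (describe, assemble, grant license, curate, publish with an identifier) can be read as a small state machine. The following is a minimal sketch under that reading, not the Globus publication service itself: the `Submission` class, its state names, and the example identifier are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Optional


class State(Enum):
    DESCRIBED = auto()   # metadata entered by the scientist
    ASSEMBLED = auto()   # files moved to the publish archive
    SUBMITTED = auto()   # license granted, waiting for a curator
    PUBLISHED = auto()   # curator approved, identifier assigned


@dataclass
class Submission:
    """Hypothetical model of one dataset submission."""
    title: str
    metadata: dict
    files: list = field(default_factory=list)
    state: State = State.DESCRIBED
    identifier: Optional[str] = None

    def _require(self, expected: State) -> None:
        if self.state is not expected:
            raise RuntimeError(f"expected {expected.name}, got {self.state.name}")

    def assemble(self, files: list) -> None:
        """Record the files that were transferred into the archive."""
        self._require(State.DESCRIBED)
        if not files:
            raise ValueError("a dataset needs at least one file")
        self.files = list(files)
        self.state = State.ASSEMBLED

    def grant_license_and_submit(self) -> None:
        """Scientist grants the submission license and hands off to curation."""
        self._require(State.ASSEMBLED)
        self.state = State.SUBMITTED

    def approve(self, identifier: str) -> None:
        """Curator verifies metadata and files, then publishes."""
        self._require(State.SUBMITTED)
        if not self.metadata:
            raise ValueError("cannot publish without metadata")
        self.identifier = identifier
        self.state = State.PUBLISHED


if __name__ == "__main__":
    s = Submission("Crop yield runs", {"creator": "A. Scientist"})
    s.assemble(["run1.nc", "run2.nc"])
    s.grant_license_and_submit()
    s.approve("doi:10.0000/example")  # placeholder identifier, not a real DOI
    print(s.state.name, s.identifier)
```

Enforcing the order of steps in one place (`_require`) keeps a curator from approving a dataset that was never assembled or licensed.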

Page 19: Collaboration on Large Datasets using  Globus

Discover walk-through

[Diagram labels: Collaboration Archive; Univ. of Chicago, Argonne, IIT, UIUC; roles: Scientist, Curator]

1. Publish Data

2. Describe Submission

3. Assemble Dataset (Transfer Data)

4. Curate Dataset

5. Search

6. Download

Page 20: Collaboration on Large Datasets using  Globus

Search Published Datasets

Page 21: Collaboration on Large Datasets using  Globus

Discovering a Published Dataset

Page 22: Collaboration on Large Datasets using  Globus

Download the Published Dataset

Page 23: Collaboration on Large Datasets using  Globus

Select Download Destination

Page 24: Collaboration on Large Datasets using  Globus

Globus Under the Covers

Identity, Group, Profile Management Services

Sharing Service

Transfer Service

Globus Toolkit

Globus APIs

Globus Connect

Page 25: Collaboration on Large Datasets using  Globus

Reliable, secure, high-performance file transfer and synchronization

• “Fire-and-forget” transfers

• Automatic fault recovery

• Seamless security integration

• Powerful GUI and APIs

Data Source → Data Destination

1. User initiates transfer request

2. Globus moves and syncs files

3. Globus notifies user
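
The three steps above (request, move and sync with fault recovery, notify) can be sketched in a few lines. This is an illustrative toy, assuming in-memory dictionaries stand in for the source and destination storage; `sync`, `make_flaky_copy`, and the retry count are hypothetical names and values, not part of any Globus API.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Content fingerprint used to decide whether a file needs copying."""
    return hashlib.sha256(data).hexdigest()


def sync(source: dict, destination: dict, copy, max_attempts: int = 3) -> dict:
    """Copy files that are missing or different at the destination.

    Failed copies are retried up to max_attempts times. The returned
    report plays the role of the notification sent to the user.
    """
    report = {"copied": [], "skipped": [], "failed": []}
    for path, data in source.items():
        if path in destination and checksum(destination[path]) == checksum(data):
            report["skipped"].append(path)  # already in sync
            continue
        for attempt in range(1, max_attempts + 1):
            try:
                copy(path, data, destination)
                if checksum(destination[path]) != checksum(data):
                    raise IOError("checksum mismatch after copy")
                report["copied"].append(path)
                break
            except IOError:
                if attempt == max_attempts:
                    report["failed"].append(path)
    return report


def make_flaky_copy(failures: int):
    """Build a copy function that fails a given number of times (simulated faults)."""
    remaining = {"n": failures}

    def copy(path: str, data: bytes, destination: dict) -> None:
        if remaining["n"] > 0:
            remaining["n"] -= 1
            raise IOError("simulated network fault")
        destination[path] = data

    return copy


if __name__ == "__main__":
    src = {"a.nc": b"alpha", "b.nc": b"beta"}
    dst = {"a.nc": b"alpha"}                  # a.nc is already at the destination
    print(sync(src, dst, make_flaky_copy(2)))  # b.nc succeeds on the third attempt
```

The caller submits the request once and only reads the final report, which is the "fire-and-forget" idea: retries and integrity checks happen inside the transfer step rather than in the user's own scripts.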

Page 26: Collaboration on Large Datasets using  Globus

Simple, secure sharing off existing storage systems

Data Source

1. User A selects file(s) to share, selects user or group, and sets permissions

2. Globus tracks shared files; no need to move files to cloud storage!

3. User B logs in to Globus and accesses shared file

• Easily share large data with any user or group

• No cloud storage required
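
One way to picture the sharing model on this slide is as a list of access rules kept alongside data that never leaves the owner's storage. The sketch below assumes a simple prefix-based rule format; `AccessRule`, `SharedStorage`, the group name, and the paths are hypothetical and are not the Globus sharing service's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRule:
    path: str          # folder prefix being shared, e.g. "/results/"
    principal: str     # a user name or a group name
    permissions: str   # "r" for read-only, "rw" for read and write


class SharedStorage:
    """Files stay in place; only the access rules are tracked."""

    def __init__(self, files: dict, groups: dict = None):
        self.files = files
        self.groups = groups or {}   # group name -> set of user names
        self.rules = []

    def share(self, path: str, principal: str, permissions: str = "r") -> None:
        """Step 1: owner picks a path, a user or group, and permissions."""
        self.rules.append(AccessRule(path, principal, permissions))

    def _allowed(self, user: str, path: str, needed: str) -> bool:
        for rule in self.rules:
            matches_user = user == rule.principal
            in_group = user in self.groups.get(rule.principal, set())
            if (matches_user or in_group) and path.startswith(rule.path) \
                    and needed in rule.permissions:
                return True
        return False

    def read(self, user: str, path: str) -> bytes:
        """Step 3: another user accesses a shared file."""
        if not self._allowed(user, path, "r"):
            raise PermissionError(f"{user} may not read {path}")
        return self.files[path]

    def write(self, user: str, path: str, data: bytes) -> None:
        if not self._allowed(user, path, "w"):
            raise PermissionError(f"{user} may not write {path}")
        self.files[path] = data


if __name__ == "__main__":
    storage = SharedStorage(
        {"/results/run1.nc": b"data", "/private/notes.txt": b"notes"},
        groups={"collaboration": {"userB"}},
    )
    storage.share("/results/", "collaboration", "r")
    print(storage.read("userB", "/results/run1.nc"))
```

Granting `"rw"` only to selected users on a submissions folder would model the "restricted writing to archive" use case in the same way.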

Page 27: Collaboration on Large Datasets using  Globus

Thank you

• Sign up and use Globus to transfer and share

• globus.org/signup

• Sign up as an early adopter of publish

• globus.org/data-publication

• Support

[email protected]

Page 28: Collaboration on Large Datasets using  Globus

Thank you to our sponsors!

U.S. Department of Energy