Building an Extensible Storage Ecosystem with WOS Dr. Erik Deumens SSERCA SC’13 DDN User Meeting

Upload: insidehpc

Post on 15-Jan-2015


DESCRIPTION

In this presentation from the DDN User Meeting at SC13, Erik Deumens from SSERCA describes how the institution is sharing data with WOS from DDN. Watch the video presentation: http://insidehpc.com/2013/11/13/ddn-user-meeting-coming-sc13-nov-18/

TRANSCRIPT

Page 1: Building an Extensible Storage Ecosystem with WOS

Building an Extensible Storage Ecosystem with WOS

Dr. Erik Deumens

SSERCA SC’13 DDN User Meeting

Page 2: Building an Extensible Storage Ecosystem with WOS

SSERCA

• Sunshine State Education & Research Computing Alliance
  o Members: FIU, FSU, UCF, UF, UM, USF
  o Affiliates: FAMU, FAU, FIT, UNF
  o Glue: Florida LambdaRail regional network provider
• Enable and enhance
  o collaborative research
  o for faculty and their teams in the state
• Making them more competitive
  o by providing advanced cyber infrastructure

Page 3: Building an Extensible Storage Ecosystem with WOS

Proposal Vision and Overview

The researchers and their collaborations are the central focus driving all design aspects of the proposed extensible storage environment.

Page 4: Building an Extensible Storage Ecosystem with WOS

Intellectual Merit

• Address the need of working researchers head-on
• Not centered on some hardware or software design
• Naturally extensible
• Intrinsically sustainable
• Inclusive of new approaches

Page 5: Building an Extensible Storage Ecosystem with WOS

Broader Impacts

• Open to all communities
• Provide a framework to explore and broaden a data-centric research environment
• Provide a long-term roadmap to address archival storage and transitioning data to it
• Link campus and NSF XSEDE resources in a flexible way (XSEDE: eXtreme Science and Engineering Discovery Environment)

Page 6: Building an Extensible Storage Ecosystem with WOS

Project Vision

• What challenges are addressed?
• What will the proposed project build with NSF funding?
• How are XSEDE resources leveraged?
• Features of the architecture
  o Sustainable
  o Extensible
  o Flexible and adaptable
• What can others build leveraging this NSF funded project?

Page 7: Building an Extensible Storage Ecosystem with WOS

Challenges for Storage Providers

• Multiple sources, multiple sizes of data
  o Instrument data
  o Spreadsheets
• Multiple places to store data
  o Campus systems
  o Cloud systems (Google Drive, Dropbox, etc.)
• Multiple actions and timescales in data life
  o Analysis - compute and data intensive
  o Distribution - web site accessibility
    § general and restricted
  o Life cycle management - initial, maturing, archiving

Page 8: Building an Extensible Storage Ecosystem with WOS

Principles

Create:
• Effective environment for researchers...
  o to work collaboratively
  o with complex workflows
  o involving large and small data

We propose to bring the essence and simplicity of cloud infrastructure to research:

Interactivity and instant gratification. Think of something, and start doing it!

Page 9: Building an Extensible Storage Ecosystem with WOS

Proposal: XDESE

The eXtreme Digital Extensible Storage Ecosystem - XDESE

• Ecosystem is more complex than environment
• NSF funded and supported core
  o Distributed by design
  o Multi-access, multi-protocol, multi-owner
  o Leverage XSEDE resources
  o XRAC allocation process adapted for data
    § defined quota for defined time span
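Editor's note: the "defined quota for defined time span" idea above can be pictured as a small record with two checks. The sketch below is only an illustration; the class name, field names, and the sample numbers are assumptions and do not come from the talk or from any XSEDE or DDN software.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class StorageAllocation:
    """Hypothetical record for a data allocation: a quota valid for a time span."""
    project: str
    quota_tb: float  # the defined quota, in terabytes
    start: date      # first day the allocation is valid
    end: date        # last day the allocation is valid

    def is_active(self, today: date) -> bool:
        # The allocation only counts inside its defined time span.
        return self.start <= today <= self.end

    def can_store(self, used_tb: float, request_tb: float, today: date) -> bool:
        # A request fits if the allocation is active and the quota is not exceeded.
        return self.is_active(today) and used_tb + request_tb <= self.quota_tb


# Made-up example values, for illustration only.
alloc = StorageAllocation("alice-instrument", quota_tb=50.0,
                          start=date(2014, 1, 1), end=date(2014, 12, 31))
print(alloc.can_store(used_tb=40.0, request_tb=5.0, today=date(2014, 6, 1)))   # True
print(alloc.can_store(used_tb=40.0, request_tb=15.0, today=date(2014, 6, 1)))  # False
```

Keeping the time span in the same record as the quota means one check answers both questions a storage gateway would ask: is this allocation still valid, and is there room left.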

Page 10: Building an Extensible Storage Ecosystem with WOS

Storage Architecture

[Diagram. Labels: XDESE - FIU; XDESE - UCF; Internet; Researcher; XSEDE – XRAC Authentication Authorization; Data Gateway (two); Data Replication; XSEDE resources: Stampede, Kraken]

Page 11: Building an Extensible Storage Ecosystem with WOS

Proposal: XDESE (2)

• Extensible with other funding
  o Geographically: campus and regional add-ons
    § plug and play racks
  o Organizationally: multiple communities
    § astrophysics, religion, archeology, ...
  o Functionally: add new protocols and formats
  o Public data: NSF funded
  o Restricted data: funded from other sources
  o Archival data and data repositories

Page 12: Building an Extensible Storage Ecosystem with WOS

XDESE Extension Architecture

• Basic concept
  o WOScore storage system at remote location
  o WOScore provides
    § data replication and motion
    § policy and demand based
  o Add WOSaccess gateway to provide local
    § CIFS (personal) and NFS (organizational)
  o Add WOS GS bridge gateway to provide local
    § GPFS on GridScaler or Lustre on ExaScaler

Page 13: Building an Extensible Storage Ecosystem with WOS

Extension Architecture

[Diagram. Labels: SSERCA XDESE; Internet; Campus; WOS GS Bridge; HPC; WOS Access; campus net NFS/CIFS; XSEDE; WOS GS Bridge; Stampede]

Page 14: Building an Extensible Storage Ecosystem with WOS

Leverage XSEDE Resources

• Users store and maintain data in XDESE
  o Long term project data
  o In support of collaboration
    § meaning easy access to many people
    § fine control over who can see and do what
  o Not intended for temporary data
  o XSEDE storage resources are suitable for that

Page 15: Building an Extensible Storage Ecosystem with WOS

Leverage XSEDE Resources (2)

• Transfer data to XSEDE processors
  o Stampede, Kraken, etc.
  o Bulk transfer
  o Complex data flow including data selection
  o XDESE will respond from multiple sites
    § improved performance, reliability, flexibility

Page 16: Building an Extensible Storage Ecosystem with WOS

Leverage XSEDE Resources (3)

• Option 1: data transfer to XSEDE scratch file system
  o During computation on XSEDE systems
  o Optimal performance is obtained

Page 17: Building an Extensible Storage Ecosystem with WOS

Leverage XSEDE Resources (4)

• Option 2: XSEDE compute job controls data
  o Program can control data selection...
    § from the XDESE storage
    § initiate transfer, to and from, of selected parts
  o XDESE storage (DDN WOScore) will optimize data location among distributed XDESE storage nodes
  o use one of the extensions for further optimization

Page 18: Building an Extensible Storage Ecosystem with WOS

Partnerships

• Network partner FLR
  o Provide transport
  o Performance optimization with SDN and OpenFlow
  o Provide connection to Internet2 and XSEDEnet
• Storage system vendor DDN
  o Provides hardware, system software, and expertise
  o Builds the extension racks
• Software interfaces
  o Data transfer: Globus Online

Page 19: Building an Extensible Storage Ecosystem with WOS

ddn.com ©2012 DataDirect Networks. All Rights Reserved.

SSERCA XDESE Storage Solution

[Diagram. Labels: SSERCA Storage Cloud; Florida State University; University of Florida; Florida International University; University of South Florida; University of Central Florida; University of Miami; SSERCA End-Users State Wide]

Page 20: Building an Extensible Storage Ecosystem with WOS

XDESE Building Block

At each SSERCA site

Storage server
• 2.1 PB raw

Page 21: Building an Extensible Storage Ecosystem with WOS

WOS6000 Cabinet

WOS6000 storage server
• 12 drawers
• 180 TB per drawer (2 nodes)
• 2.1 PB raw capacity
• Policy based data protection
  o Ranges from 100% to 20%
  o Replication 100% overhead
  o RAID-like encoding 20% overhead
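Editor's note: the cabinet figures above can be checked with a little arithmetic. 12 drawers at 180 TB each give 2160 TB, which the slide rounds to 2.1 PB. The sketch below also estimates usable space under the two protection policies. It assumes that "overhead" means extra space relative to the user data, so raw = usable × (1 + overhead); the slide does not define the term, and the function names are made up for this illustration.

```python
DRAWERS_PER_CABINET = 12  # from the slide
TB_PER_DRAWER = 180       # from the slide


def raw_capacity_tb(drawers: int = DRAWERS_PER_CABINET,
                    tb_per_drawer: int = TB_PER_DRAWER) -> int:
    """Raw capacity of one cabinet in terabytes."""
    return drawers * tb_per_drawer


def usable_capacity_tb(raw_tb: float, overhead: float) -> float:
    """Usable capacity, ASSUMING raw = usable * (1 + overhead).

    overhead=1.0 models replication (one extra copy of everything);
    overhead=0.2 models the RAID-like encoding.
    """
    if overhead < 0:
        raise ValueError("overhead must be non-negative")
    return raw_tb / (1.0 + overhead)


raw = raw_capacity_tb()
print(f"raw capacity:         {raw} TB")                                  # 2160 TB
print(f"usable, replication:  {usable_capacity_tb(raw, 1.0):.0f} TB")     # 1080 TB
print(f"usable, RAID-like:    {usable_capacity_tb(raw, 0.2):.0f} TB")     # 1800 TB
```

Under this reading, the choice of policy changes the usable space of one cabinet from about 1.1 PB to about 1.8 PB.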

 

Page 22: Building an Extensible Storage Ecosystem with WOS

Resource Details

• Primary data interface to the web
  o WOScloud (Dropbox-like, REST over SSL, OAuth)
  o WOSshare (Amazon S3-like, S3 = Simple Storage Service, REST interface, BitTorrent)
• Generic server for Globus Online transfers
  o DDN customization needed for optimal speed
  o Initially simple NFS client via WOSaccess
• Interface to SSERCA campus HPCs
  o Grid/ExaScaler to stage to GPFS/Lustre
  o Later read via NFS

Page 23: Building an Extensible Storage Ecosystem with WOS

Hardware Architecture

• At the 6 SSERCA sites
  o Object Storage at 6 sites
  o Web server with data control panel
• Data transfer mechanisms over FLR
  o XSEDEnet and Internet2
• Extension racks at other locations
  o Object storage
  o Network infrastructure OpenFlow capable
  o Provide multiple data path options to local campus resources like NFS and CIFS access
  o Optional: compute resources with scratch storage

Page 24: Building an Extensible Storage Ecosystem with WOS

XDESE: Extending and Complementing XSEDE Storage

XDESE offers
• Easy user interface
• Composability of data flows and workflows
• Multiple authentication domains
• Ability to easily share data
• Easy ingestion of instrument data

User focused!

Page 25: Building an Extensible Storage Ecosystem with WOS

XDESE Storage and XSEDE Compute

• Full integration with XSEDE compute resources
• Easy data transfer is part of data and workflow
• The extensibility includes the option to...
  o install WOS GS bridge gateway
  o at XSEDE compute site(s)
  o for improved performance
  o works like Hierarchical File System

Page 26: Building an Extensible Storage Ecosystem with WOS

Authentication Interface

• To be successful, compatibility with multiple campus systems is also required
  o Need to design a simple system
  o Must allow users to manage multiple identities easily
    § XSEDE, XDESE, local campus, Google Drive, Dropbox, Amazon S3, etc.
    § Globus Online supports transfer across authentication domains
    § other tools like BitTorrent play a role too

Page 27: Building an Extensible Storage Ecosystem with WOS

Performance and Innovation for Science & Engineering Applications

• Performance, scalability, extensibility, sustainability
• Describe the general use case
  o Example from humanities and social sciences
• Select some strong science, engineering application(s)
• Innovation: explore archival strategies

Page 28: Building an Extensible Storage Ecosystem with WOS

Sustainable and Extensible

• Distributed from inception
  o Basic functionalities will be tested and supported
• Extensible simply by adding an XDESE rack
  o Like NSF funded GENI project and GENI racks
  o Multiple vendors can supply the racks
  o Learn once from XDESE, apply everywhere
  o Path for even the smallest institutions
    § leverage NSF funded resource and get started quickly
    § single faculty can start working with XDESE

Page 29: Building an Extensible Storage Ecosystem with WOS

Use Case: Generic Researcher

Alice works on a project that involves...
• data from an instrument and
• more data generated by analysis and modeling

Page 30: Building an Extensible Storage Ecosystem with WOS

Use Case: Setup

• Alice gets an XDESE allocation
• She arranges data to flow to the storage from the instrument
  o If the data flow demands it, she can set up a staging rack (needs funds) with specs and support from XDESE

Page 31: Building an Extensible Storage Ecosystem with WOS

Use Case: Data and Workflow

• With the XDESE data & work control station
  o Looks like Galaxy https://main.g2.bx.psu.edu/
• She controls data and workflow
  o Orchestrates data movement
  o Get all data in the right place
  o Right place is where the software and compute capability is, at XSEDE resources or on campus
• Tools execute the movement
  o Globus Online, etc.

Page 32: Building an Extensible Storage Ecosystem with WOS

Use Case: Results

•  The results can be viewed with tools from the location specified in the flow

•  Collaborators can get accounts and access to her allocation

•  Multiple ways to access the data are available

•  Further visualization and other processing can easily be orchestrated

Page 33: Building an Extensible Storage Ecosystem with WOS

Use Case: Lifecycle Management

• She can prepare the data for long-term sharing
• Tools for creating metadata are provided
  o Rules for lifecycle management can be set up, e.g. iRODS interface
  o Data can be annotated and recorded, e.g. Dataverse Network
• Transition data to compatible systems
  o Campus libraries
  o Discipline-specific societies

Page 34: Building an Extensible Storage Ecosystem with WOS

Innovation: Archival Strategies

• Proposed Architecture
  o XDESE provides an efficient path for exploration of options
  o Institutions and libraries can buy an XDESE rack
    § dedicated to archival storage
    § data transfer in and out is supported
    § establish criteria for users to deposit data
      • e.g. pass a data quality test of sufficient metadata
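Editor's note: a "data quality test of sufficient metadata" could be as simple as checking that a deposit carries a minimum set of descriptive fields. The sketch below is one possible reading, not the project's actual criteria; the required field list and the function names are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical minimum set of fields; a real archive would define its own.
REQUIRED_FIELDS = ("title", "creator", "description", "date", "license")


def missing_metadata(record: Dict[str, Optional[str]]) -> List[str]:
    """Return the required fields that are absent or blank in the record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def passes_quality_test(record: Dict[str, Optional[str]]) -> bool:
    """A deposit passes only if no required field is missing."""
    return not missing_metadata(record)


# Made-up deposit for illustration.
deposit = {"title": "Instrument run 42", "creator": "Alice", "date": "2013-11-18"}
print(missing_metadata(deposit))     # ['description', 'license']
print(passes_quality_test(deposit))  # False
```

Reporting which fields are missing, rather than only pass or fail, tells the depositor what to fix before trying again.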

Page 35: Building an Extensible Storage Ecosystem with WOS

Thank You