san diego super computer

SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO Cyberinfrastructure Services Division From tape to cloud storage 4/19/2012 Steve Meier https://cloud.sdsc.edu

Page 1: San Diego Super Computer

SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO
Cyberinfrastructure Services Division

From tape to cloud storage
4/19/2012

Steve Meier

https://cloud.sdsc.edu

Page 2: San Diego Super Computer

Agenda
• Where we were…
  • Tape archive overview
• Where SDSC is at today
  • Current Data Services
  • Swift architecture overview
• Access methods
  • Cloud Explorer Web interface
  • UCSD Libraries Collections
  • Others (Cyberduck, Command Line, s3backer)
• Future Plans

Page 3: San Diego Super Computer

SAMQFS ARCHITECTURE

[Architecture diagram. Labeled components:]
• Network: Force 10 - 12000, Juniper T640
• Tape: 16 STK 9940B FC tape drives, 32 IBM 3592 FC tape drives, 6 STK 9310 silos, 32PB capacity
• MetaData Servers (SAM-QFS): MDS1, MDS2
• 1.2PB SAN disk cache
• Other labeled systems: Oracle (RMAN), NFS Server, NFS Backups, Commvault, Data Login (GridFTP, SFTP), web

Page 4: San Diego Super Computer

SAMQFS ARCHITECTURE CONT’D…

[Silo layout diagram: LSM_0 through LSM_5 linked by passthru channels, with an LCU and panels 0, 1, and 9 marked. Drive groups shown: 20-J2A, 12-J2A, 12-J2A, 4-3590E, 12-9940B.]

DESCRIPTION

1.) One existing 20-drive panel on LSM4's panel 10 will hold 20 J2A drives.

2.) One existing 20-drive panel on LSM5's panel 10 will hold 12 J2A drives.

3.) A 20-drive panel will be installed on LSM5's panel 1 to hold 12 J2A drives.

4.) A 20-drive panel will be installed on LSM3's panel 9 to hold 20 J2A drives.

Page 5: San Diego Super Computer

…TAKE AWAY

• Complex Environment
  • Many dependencies (SAN, Metadata, Tape Drives, Silo)
• Aging Infrastructure
  • Puts pressure on all the dependencies
  • Tech refresh way overdue
• Archival data is difficult to access: high latency, lower bandwidth, limited user interfaces
• Difficult to share archival data with multiple users
• All too often archived data, particularly HPC simulations, is “write-once-read-never”

Page 6: San Diego Super Computer

Where SDSC is at today
Data Services Overview

Cloud Storage (OpenStack Swift)
• Purpose: Storage of Digital Data for Ubiquitous Access and High-Durability
• Access Mechanisms: Swift/S3 API, Cloud Explorer, Clients, CLI

Traditional File Server Storage (NFS/CIFS)
• Purpose: Typical Project / User Storage Needs
• Access: NFS/CIFS/iSCSI

High Performance Computing Storage (PFS)
• Purpose: High Performance Transient Storage to Support HPC
• Access Mechanisms: Lustre on HPC Systems (Gordon, Trestles, Triton)

Page 7: San Diego Super Computer

Goals for Cloud/Object Storage
• Support NSF Data Management Plan
  • Required Plan to describe how research results are shared.
• 99.5% system availability
• File replication automated
  • Default 2 copies, able to keep additional offsite replications.
  • Automated checksum verification and error correction
• Scalable
  • Performance and capacity grow by incremental bricks.
• Multifaceted accessibility
  • Web, API, Graphical and Command Line Clients
• Cost competitive
  • Operated as a recharge service
  • On par with current tape-based dual-copy costs of $0.0325/GB/Mo.
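To make the two numeric goals concrete, here is a small arithmetic sketch in Python. The rate ($0.0325/GB/Mo.) and the 99.5% availability target come from the slide; treating 1 TB as 1000 GB and a year as 365 days are assumptions made only for illustration, and the function names are hypothetical.

```python
# Figures taken from the slide: $0.0325 per GB per month, 99.5% availability.
RATE_PER_GB_MONTH = 0.0325
AVAILABILITY = 0.995


def monthly_cost(gigabytes: float, rate: float = RATE_PER_GB_MONTH) -> float:
    """Recharge cost in dollars for one month at the quoted rate."""
    return gigabytes * rate


def allowed_downtime_hours(availability: float = AVAILABILITY,
                           hours_per_year: float = 365 * 24) -> float:
    """Hours per year the service could be down and still meet the target."""
    return (1.0 - availability) * hours_per_year


if __name__ == "__main__":
    # Assumption: 1 TB = 1000 GB (the slide does not say which convention applies).
    print(f"1 TB for one year: ${monthly_cost(1000) * 12:.2f}")
    print(f"Downtime budget: {allowed_downtime_hours():.1f} hours/year")
```

Under these assumptions, 1 TB costs $32.50 per month, and a 99.5% target leaves a downtime budget of roughly 44 hours per year.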

Page 8: San Diego Super Computer

Why OpenStack?

Industry Standard
• More than 100 leading companies from over a dozen countries are participating in OpenStack, including Cisco, Citrix, Dell, Intel and Microsoft

Proven Software
• The OpenStack cloud operating system is the same software that powers many large public and private clouds, including Rackspace Cloud Storage.

Highly Compatible
• Compatibility with public OpenStack clouds means it’s easy to migrate data and apps to public clouds when desired, based on security policies, economics, and other key business criteria

Control & Flexibility
• An open source platform means not being locked to a proprietary vendor, and the modular design can integrate with legacy or 3rd-party technologies. The OpenStack project is provided under the Apache 2.0 license.

Page 9: San Diego Super Computer

Design Highlights

100% Dual Copy Disk Storage

Initial 5.5PB (petabytes)

Dual 10Gb Arista Connected, 8 GB/s aggregate I/O performance

Off-Site Replication (UC Berkeley)

Continuous File Integrity Verification

Help PIs meet NSF Data Management requirements
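The idea behind continuous integrity verification with dual copies can be sketched as follows. This is a simplified illustration, not Swift's actual auditing code: each replica is re-hashed and compared with the recorded checksum, and a damaged copy is overwritten from a good one. MD5 is used because Swift reports an MD5 ETag for ordinary objects; the replica names and function names are hypothetical.

```python
import hashlib
from typing import Dict, List


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest of the given bytes."""
    return hashlib.md5(data).hexdigest()


def audit_replicas(expected_md5: str, replicas: Dict[str, bytes]) -> List[str]:
    """Return names of replicas whose content no longer matches the checksum."""
    return [name for name, data in replicas.items()
            if md5_hex(data) != expected_md5]


def repair(expected_md5: str, replicas: Dict[str, bytes]) -> List[str]:
    """Overwrite damaged replicas from a good copy; return the repaired names."""
    bad = audit_replicas(expected_md5, replicas)
    good = [name for name in replicas if name not in bad]
    if bad and not good:
        raise ValueError("no intact replica left to repair from")
    for name in bad:
        replicas[name] = replicas[good[0]]
    return bad


if __name__ == "__main__":
    # Hypothetical replica names; one copy has a flipped byte.
    copies = {"primary": b"simulation output", "offsite": b"simulation 0utput"}
    checksum = md5_hex(b"simulation output")
    print("repaired:", repair(checksum, copies))
```

With only one copy there would be nothing to repair from, which is one reason dual-copy storage and checksum verification are paired in the design.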

Page 10: San Diego Super Computer

OpenStack Cloud Storage Architecture

Page 11: San Diego Super Computer

Current Usage

Page 12: San Diego Super Computer

Usage Breakdown

• Backups
• File system emulation (s3backer, Panzura, Whitewater)
• Native clients
• Application integration

Page 13: San Diego Super Computer

Access methods

Integrated
• UCSD Library Collection management

Client tools
• SDSC Cloud Explorer
• swift python client
• Cyberduck
• s3backer
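All of these access methods ultimately issue HTTP requests against the Swift API. As a rough sketch of what a client does for an upload, the Python snippet below builds (but does not send) an object PUT request with an auth token and an MD5 ETag that the server can use to check the upload. The storage URL, token, container, and object name are placeholders, not SDSC values, and the function name is hypothetical.

```python
import hashlib
import urllib.request
from urllib.parse import quote


def build_put_request(storage_url: str, token: str, container: str,
                      obj: str, data: bytes) -> urllib.request.Request:
    """Build a Swift-style object PUT request without sending it."""
    url = f"{storage_url.rstrip('/')}/{quote(container)}/{quote(obj)}"
    headers = {
        "X-Auth-Token": token,                  # token from a prior auth step
        "ETag": hashlib.md5(data).hexdigest(),  # lets the server verify the body
    }
    return urllib.request.Request(url, data=data, headers=headers, method="PUT")


if __name__ == "__main__":
    # Placeholder endpoint and credentials, for illustration only.
    req = build_put_request("https://swift.example.org/v1/AUTH_demo",
                            "example-token", "results", "run 01/output.dat",
                            b"payload")
    print(req.get_method(), req.full_url)
```

Sending the request would be a matter of passing it to `urllib.request.urlopen`; that step is left out here so the sketch runs without a server.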

Page 14: San Diego Super Computer

UCSD Library Collections

Page 15: San Diego Super Computer

SDSC Swift Web Client (Cloud Explorer)

Features:
• Uploads/Downloads/Rename/Move
• Permissions management
• Change Password
• Display Container Share URL

Page 16: San Diego Super Computer

Others (Command Line, GUI, Filesystem)

Swift Python Client
• Batch processing
• Large file upload support
• Lacking in features and error logging/recovery

Cyberduck
• Drag and drop GUI for Macs and Windows
• No large file upload support

s3backer
• Compatible with existing tools (e.g. rsync, SFTP)
• File system
• Familiar
• File Sharing Challenges
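Large file upload support matters because Swift stores objects above its single-object size limit as smaller segments tied together by a manifest. The Python sketch below shows only the segmentation arithmetic a client has to do; the segment naming scheme and function names are hypothetical, and no upload is performed.

```python
from typing import List, Tuple


def plan_segments(total_size: int, segment_size: int) -> List[Tuple[int, int, int]]:
    """Return (index, offset, length) for each segment covering total_size bytes."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    if total_size < 0:
        raise ValueError("total_size must not be negative")
    segments = []
    offset = 0
    index = 0
    while offset < total_size:
        # The final segment may be shorter than segment_size.
        length = min(segment_size, total_size - offset)
        segments.append((index, offset, length))
        offset += length
        index += 1
    return segments


def segment_name(obj: str, index: int) -> str:
    """Hypothetical naming scheme: zero-padded so segments sort in order."""
    return f"{obj}/{index:08d}"


if __name__ == "__main__":
    # A 10-byte "file" split into 4-byte segments, for illustration.
    for index, offset, length in plan_segments(10, 4):
        print(segment_name("output.dat", index), offset, length)
```

A client without this logic, such as Cyberduck at the time of the talk, simply cannot upload files above the size limit.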

Page 17: San Diego Super Computer

Upcoming Features

Active Directory Authentication Integration

Large file upload support (Cloud Explorer)

Server-side encryption for data at rest

Page 18: San Diego Super Computer

Questions?

http://www.sdsc.edu
http://rci.ucsd.edu

Email [email protected] for more info!