Distributed Storage And WAN Transport
Peter Kunszt
SyBIT Tech Day
Nov. 23 2011, Bern
Distributed Storage Systems
Distributed FS
– Make it look like local FS
– User sees one space
– Remote user sees same local space
– Policies on sharing, access should be available
Caching FS
– Data lives somewhere else
– But looks local due to smart WAN cache
Gluster (bought by Red Hat)
www.gluster.org
GlusterFS. Many commercial users. The software is open source; they sell an appliance and support (just like Red Hat).
– Single global namespace
– Block storage clustering, no central metadata
– Works over 1GbE, 10GbE, InfiniBand
– Replication
– ‘NFS-like’ native
– No kernel dependencies, simple installation
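To illustrate what "no central metadata" can mean in practice, here is a minimal Python sketch in which every client computes a file's location from a hash of its path, so no lookup service is needed. This is only an illustration of the idea, not GlusterFS's actual distribution logic; the function name `server_for` and the server names are hypothetical.

```python
import hashlib

def server_for(path, servers):
    """Pick the storage server for `path` from its hash alone (no metadata lookup)."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer and map it onto the server list.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

# Hypothetical server names, for illustration only.
servers = ["store1", "store2", "store3"]
location = server_for("/projects/run42/data.h5", servers)
```

Every client that knows the same server list arrives at the same answer independently. Plain modulo hashing as used here moves most files when a server is added or removed, so a real system would need a more elastic scheme.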
XtreemFS
Part of the XtreemOS project (EU FP7). In production, the latest version is used only by the German MosGrid.
– Object-based design. Global FS namespace.
– Metadata and Replica Service stores info. Data on Object Storage Servers. Linked through Replica Management Service.
– Written in Java, using native Memblocking.
– Key-value store DB used: BabuDB
– Uses Linux FUSE kernel module, MIT Vivaldi algorithm for replica automation and selection
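The Vivaldi algorithm mentioned above assigns each node synthetic coordinates so that the distance between two nodes approximates their round-trip time; a client can then pick the replica with the smallest predicted distance. Below is a simplified Python sketch of one update step and of replica selection. It assumes a constant step size `delta` and plain Euclidean coordinates (no height vector, no adaptive weighting), and it is not XtreemFS code; all names are hypothetical.

```python
import math

def vivaldi_update(own, peer, rtt_ms, delta=0.25):
    """Move `own` so that its distance to `peer` better matches the measured RTT."""
    dist = math.dist(own, peer)
    if dist == 0.0:
        # Coincident points have no direction; push along an arbitrary axis.
        direction = [1.0] + [0.0] * (len(own) - 1)
    else:
        direction = [(a - b) / dist for a, b in zip(own, peer)]
    error = rtt_ms - dist  # positive: nodes are too close in coordinate space
    return [a + delta * error * u for a, u in zip(own, direction)]

def closest_replica(client, replicas):
    """Return the name of the replica with the smallest predicted RTT.

    `replicas` maps a replica name to its coordinates.
    """
    return min(replicas, key=lambda name: math.dist(client, replicas[name]))

# One update: coordinates say 5 ms, the measurement says 10 ms, so move away.
moved = vivaldi_update([0.0, 0.0], [3.0, 4.0], rtt_ms=10.0)
```

Repeating the update against many peers lets the coordinates settle without any node having to measure every other node.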
DDN WOS
www.ddn.com/industry/life-sciences
Storage appliance, sold with several interfaces including S3 and REST. GPFS based. Highly resilient to failure.
– Policy-based replication
– Data protection mechanism: several copies stored
– Break data into fragments, store those x times
– Can be combined with replication
IBM Panache aka Active Cloud Engine
www.almaden.ibm.com/storagesystems/projects/panache/
– Clustered filesystem CACHE for parallel I/O
– Can cache from multiple nodes
– GPFS for local FS, pNFS for remote access also using parallel I/O
– No proprietary HW or SW needed for installation
– Very resilient to failures, late sync if necessary
IBM Active Cloud Engine™ – WAN Caching capabilities
Statement of Direction
Fileset on home cluster is associated with a fileset on one or more cache clusters
If data is in cache …
– Cache hit at local disk speeds
– Client sees local GPFS performance if file or directory is in cache
If data not in cache …
– Data and metadata (files and directories) pulled on-demand at network line speed and written to GPFS
– Uses NFS for WAN data transfer
If data is modified at home
– Revalidation done at a configurable timeout
– Close to NFS style close-to-open consistency across sites
– POSIX strong consistency within cache site
If data is modified at cache
– Writes see no WAN latency
– Writes are done to the cache (i.e. local GPFS), then asynchronously pushed home
If network is disconnected …
– Cached data can still be read, and writes to cache are written back after reconnection
[Figure labels: NFS over the WAN; IO Nodes; SONAS layer; Cache Cluster Site; Cache Cluster Site 2; Home Cluster Site (SONAS System); pull on cache miss, push on write]
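The caching behaviour described on this slide (pull on miss, revalidation after a timeout, local writes pushed home later, reads that keep working while disconnected) can be sketched as a toy Python class. This is only a model of the semantics, not how GPFS or Active Cloud Engine is implemented; the class `WanCache`, its attributes, and the use of a dict as the home cluster are all assumptions made for illustration.

```python
import time

class WanCache:
    """Toy read-through, write-back cache in front of a remote 'home' store."""

    def __init__(self, home, revalidate_after=30.0, clock=time.monotonic):
        self.home = home                      # dict standing in for the home cluster
        self.revalidate_after = revalidate_after
        self.clock = clock
        self.connected = True                 # set to False to simulate a WAN outage
        self.data = {}                        # path -> (value, time cached)
        self.dirty = set()                    # paths written locally, not yet pushed

    def read(self, path):
        entry = self.data.get(path)
        if entry is not None:
            value, cached_at = entry
            fresh = self.clock() - cached_at < self.revalidate_after
            if fresh or path in self.dirty or not self.connected:
                return value                  # cache hit, no WAN traffic
        if not self.connected:
            raise KeyError(path)              # miss while disconnected
        value = self.home[path]               # pull on miss or after the timeout
        self.data[path] = (value, self.clock())
        return value

    def write(self, path, value):
        # Writes go to the local cache only, so they see no WAN latency.
        self.data[path] = (value, self.clock())
        self.dirty.add(path)

    def flush(self):
        """Push locally written data home; returns the number of paths pushed."""
        if not self.connected:
            return 0
        pushed = len(self.dirty)
        for path in self.dirty:
            self.home[path] = self.data[path][0]
        self.dirty.clear()
        return pushed
```

In a real system `flush` would run asynchronously in the background; here it is an explicit call so the order of events is easy to follow.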
IBM Active Cloud Engine™
What is IBM Active Cloud Engine?
• Policy-driven engine that helps improve storage efficiency by automatically
– Distributing files, images, and application updates to multiple locations *
– Identifying files for backup or replication to a DR location
– Moving desired files to the right tier of storage including tape in a TSM hierarchy
– Deleting expired or unwanted files
• High-performance: can scan billions of files in minutes
What client value does Active Cloud Engine deliver?
• Enables ubiquitous access to files from across the globe *
• Reduces network costs and helps improve application performance by distributing files closer to users *
• Improves data protection by identifying candidates for backup or DR
• Lowers storage cost by moving files transparently to the most appropriate tier of storage
• Controls storage growth by moving older files to tape and deleting unwanted or expired files
• Enhances administrator productivity by automating file management
What capabilities are supported by Active Cloud Engine in SONAS?
• Active Cloud Engine on SONAS supports all the functions described above
What capabilities are supported by Active Cloud Engine in Storwize V7000 Unified?
• Active Cloud Engine on Storwize V7000 Unified supports all the functions described above except distribution to multiple locations
* Active Cloud Engine Statement of Direction
Fast Transport
– Network bandwidth maximization
– Fair share
– Congestion control
– Scheduling
TCP based: GridFTP and similar
– FTP blocksize adjustment
– Many parallel threads
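Why TCP-based tools rely on many parallel transfers can be shown with a short calculation: a single TCP stream can send at most one window of data per round trip, so on a long WAN path its throughput is capped at window / RTT regardless of link speed. The Python sketch below computes that cap and the number of streams needed to fill a link. It ignores packet loss and congestion control, and the window size, RTT, and link speed are illustrative assumptions rather than measurements.

```python
import math

def stream_throughput_mbit(window_bytes, rtt_ms):
    """Upper bound for one TCP stream in Mbit/s: one window per round trip."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6

def streams_to_fill(link_mbit, window_bytes, rtt_ms):
    """Number of parallel streams needed to reach the link capacity."""
    return math.ceil(link_mbit / stream_throughput_mbit(window_bytes, rtt_ms))

# Assumed example: 64 KiB window, 100 ms round trip, 1 Gbit/s link.
per_stream = stream_throughput_mbit(64 * 1024, 100)   # about 5.24 Mbit/s
needed = streams_to_fill(1000, 64 * 1024, 100)        # 191 streams
```

Larger windows reduce the number of streams needed, which is why window or block size tuning and parallelism are usually adjusted together.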
Aspera
www.asperasoft.com
Built in to other appliances, many users
UDP based transport
– Swarming: can look like a DoS
– Also has an FTP connection for control information
Configurable, has server and client UI for transport control
– Congestion control
– Fair share control
FileCatalyst
www.filecatalyst.com Similar to Aspera: UDP based transport
Signiant
www.signiant.com
And one more. It is not cheap, but I didn’t find out more.