Insider's Perspective: Exploring OpenStack Swift – Essex Release

OpenStack Boston Meetup, June 21, 2012

Judd Maltin

OpenStack Timeline

Events (Nov 2010 to Oct 2012):
● Oct 2010: Design Summit
● Nov 2010: Austin Release
● Feb 2011: Bexar Release
● Apr 2011: Cactus Release
● Apr 2011: Design Summit
● Sep 2011: Diablo Release
● Oct 2011: Design Summit
● Mar 2012: Essex Release
● Apr 2012: Design Summit
● Oct 2012: Folsom Release

Release themes:
● Austin: Formation
● Bexar: First Public Code
● Cactus: Community Development Forming, Working Prototypes
● Diablo: Workable Foundation, Exposes Gaps, Solidify Community, Loses VMware & HyperV
● Essex: “Production Ready”, Stable Foundation, Included in Ubuntu 12.04, Incubated: Network & Block Storage
● Folsom: “Platform for Innovation”, Core Platform for Innovation, Network as a Service, Block Storage, Public Adoption, Multiple Scale Deployments

A highly scalable, redundant, unstructured data store designed to store large amounts of data cheaply.

Good use cases:

● Storing media libraries (photos, music, videos, etc.)
● Archiving video surveillance files
● Archiving phone call audio recordings
● Archiving un/compressed log files
● Archiving backups
● Storing and loading of OS Images, etc.
● Storing file populations that grow continuously on a practically infinite basis.
● Storing small files (<50 KB). OpenStack Object Storage is great at this.
● Storing billions of files.
● Storing Petabytes (millions of Gigabytes) of data.

Why Swift?

● Written in Python
● SQLite, Rsync, Memcache, XFS
● HTTP/ReST API
● Logical Parts
  ● Accounts
  ● Containers
  ● Objects
● CDN Integration
● No single point of failure
● Last write wins
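"Last write wins" can be illustrated with a minimal sketch. It assumes every write carries a timestamp and that replicas converge on the copy with the newest one. The names `Version` and `resolve` are hypothetical and exist only for this illustration; they are not Swift code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One replica's copy of an object (hypothetical type for illustration)."""
    timestamp: float  # time the write was accepted
    body: bytes


def resolve(replica_versions):
    """Return the version all replicas should converge on: the newest write."""
    return max(replica_versions, key=lambda v: v.timestamp)


# Two replicas still hold the first upload, one already has the second.
replicas = [
    Version(1340280000.10, b"first upload"),
    Version(1340280005.75, b"second upload"),
    Version(1340280000.10, b"first upload"),
]
winner = resolve(replicas)
```

There is no merge step in this model: the older copies are simply replaced by the winner.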

What Swift is (the 411)

What Swift Ain't

Not a filesystem or block storage: Does not use typical POSIX filesystem semantics like open(), read(), write(), seek(); it uses HTTP actions like PUT, GET, DELETE, POST instead.

Not a database: Stores unstructured, file-based data. You can only upload, download, and retrieve data; Swift does not process it. Only basic tagging.

Barely even hierarchical: Only one level of container depth. Vague tagging.
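To make the contrast with open()/read()/write() concrete, here is a small sketch that turns whole-object operations into an HTTP verb and a path. It follows the /v1/account/container/object layout of the example URL in the Swift Containers slide. The action names and the `build_request` helper are assumptions made for this illustration; nothing is sent over the network.

```python
from urllib.parse import quote

# Hypothetical action names mapped to the HTTP verbs listed on the slide.
VERBS = {
    "upload": "PUT",
    "download": "GET",
    "delete": "DELETE",
    "update_metadata": "POST",
}


def build_request(action, account, container, obj):
    """Return (method, path) for a whole-object operation."""
    # Account and container are single path segments, so "/" is escaped.
    path = "/v1/" + "/".join(quote(p, safe="") for p in (account, container))
    # Object names may contain "/", so it is left unescaped here.
    path += "/" + quote(obj)
    return VERBS[action], path


method, path = build_request("upload", "bobsfishfry", "tuna", "tuna1.jpg")
```

Note that there is no equivalent of seek(): each request addresses the object as a whole.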

System Components

Example Large Scale Deployment -- Many Configs Possible

Example OpenStack Object Storage Hardware:
● 5 Zones
● 2 Proxies per 25 Storage Nodes
● 10 GigE to Proxies
● 1 GigE to Storage Nodes
● 24 x 2TB Drives per Storage Node

[Diagram labels: To Load Balancers, Proxies]
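A quick back-of-the-envelope calculation shows what these hardware numbers mean for capacity. The drive count and drive size come from the slide. The replica count of 3 is an assumption (the slides only say "N copies"), and the calculation ignores filesystem overhead and free-space headroom.

```python
DRIVES_PER_NODE = 24   # from the slide
DRIVE_TB = 2           # from the slide
NODES = 25             # one group of storage nodes served by 2 proxies
REPLICAS = 3           # assumption: the slides only say "N copies"

raw_tb_per_node = DRIVES_PER_NODE * DRIVE_TB
raw_tb = raw_tb_per_node * NODES
usable_tb = raw_tb / REPLICAS

print(f"raw per node: {raw_tb_per_node} TB")
print(f"raw for {NODES} nodes: {raw_tb} TB")
print(f"usable at {REPLICAS} replicas: {usable_tb:.0f} TB")
```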

Object Storage Key Features

● ReST-based API
● Data distributed evenly throughout system
● Hardware agnostic: standard hardware, RAID not required
● No central database
● Scalable to multiple petabytes, billions of objects
● Account/Container/Object structure (not file system, no nesting)
● Replication (N copies of accounts, containers, objects)

System Components

PYTHON PROCS:
● Proxy Server: Authentication, ACLs, quotas, rate-limiting
● Account Server: Handles listing of containers, stored as SQLite DB
● Container Server: Handles listing of objects, stored as SQLite DB
● Object Server: Blob storage server, metadata kept in xattrs, data in binary format

Store your objects on XFS.
Object location based on hash of name & timestamp.
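The idea of "location based on hash of name & timestamp" can be sketched as follows. This is a simplified illustration, not Swift's actual code: the `/srv/node` mount point, the directory layout, and the `object_disk_path` helper are assumptions, and a real cluster also mixes a per-cluster secret into the hash. The partition number is passed in as a plain argument here.

```python
import hashlib


def object_disk_path(device, partition, account, container, obj, timestamp):
    """Build an illustrative on-disk path for one version of an object."""
    # The directory comes from a hash of the full object name.
    name = f"/{account}/{container}/{obj}"
    name_hash = hashlib.md5(name.encode("utf-8")).hexdigest()
    suffix = name_hash[-3:]  # short fan-out directory
    # The file name comes from the write timestamp.
    return (f"/srv/node/{device}/objects/{partition}/{suffix}/"
            f"{name_hash}/{timestamp:016.5f}.data")


old = object_disk_path("sdb1", 1234, "bobsfishfry", "tuna", "tuna1.jpg", 1340280000.0)
new = object_disk_path("sdb1", 1234, "bobsfishfry", "tuna", "tuna1.jpg", 1340280005.0)
```

Both writes land in the same hash directory and differ only in the timestamped file name, which is what lets the newer file supersede the older one.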

System Components (cont.)

● The Ring: Mapping of names to entities (accounts, containers, objects) on disk.
  ● Stores data based on zones, devices, partitions, and replicas
  ● Weights can be used to balance the distribution of partitions
  ● Used by the Proxy Server and many background processes
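The ring lookup described above can be sketched as a toy model: hash the name to a partition, then look up one device per replica for that partition. The device list, the partition power of 8, the replica count of 3, and the round-robin assignment are all assumptions for illustration. Weights are ignored here, and the real ring builder is considerably more careful about balance.

```python
import hashlib
import struct

PART_POWER = 8                 # 2**8 = 256 partitions (toy value)
PART_SHIFT = 32 - PART_POWER
REPLICAS = 3                   # assumed replica count

# Hypothetical devices, one per zone.
DEVICES = [
    {"id": 0, "zone": 1, "ip": "10.0.1.1", "device": "sdb"},
    {"id": 1, "zone": 2, "ip": "10.0.2.1", "device": "sdb"},
    {"id": 2, "zone": 3, "ip": "10.0.3.1", "device": "sdb"},
    {"id": 3, "zone": 4, "ip": "10.0.4.1", "device": "sdb"},
    {"id": 4, "zone": 5, "ip": "10.0.5.1", "device": "sdb"},
]


def partition_for(account, container=None, obj=None):
    """Map a name to a partition using the top bits of its MD5 hash."""
    path = "/" + "/".join(p for p in (account, container, obj) if p)
    digest = hashlib.md5(path.encode("utf-8")).digest()
    return struct.unpack_from(">I", digest)[0] >> PART_SHIFT


def build_table():
    """One row per replica; each row maps partition -> device index."""
    parts = 2 ** PART_POWER
    return [[(part + r) % len(DEVICES) for part in range(parts)]
            for r in range(REPLICAS)]


REPLICA2PART2DEV = build_table()


def get_nodes(account, container=None, obj=None):
    """Return the partition and the devices holding its replicas."""
    part = partition_for(account, container, obj)
    return part, [DEVICES[row[part]] for row in REPLICA2PART2DEV]


part, nodes = get_nodes("bobsfishfry", "tuna", "tuna1.jpg")
```

Because there is one device per zone and the replicas use consecutive device indexes, each partition's three replicas land in three different zones.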

● Proxy Server: Request routing, exposes the public API
● Replication: Keep the system consistent, handle failures
● Updaters: Process failed or queued updates
● Auditors: Verify integrity of objects, containers, and accounts
● Deleters: Account Reaper, Replication

System Components (cont., cont.)

● Auth System: Completely pluggable. Temp Auth, Keystone, Roll-your-own
● Replication (again?): Works by crawling each process's local filesystem and asking the rest of the cluster whether each file also exists elsewhere. In practice, it just compares MD5 sums.
  ● DB Replication vs. Object Replication
● Container to Container Synchronization: Spread your clusters out, sync out-of-band
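The "just compares MD5 sums" step of replication can be sketched like this. The sketch assumes each node summarizes a directory by hashing the names of the files in it, and that only directories whose summaries differ need to be pushed to the other node. The helper names and the sample directory names are hypothetical; the actual transfer step is left out.

```python
import hashlib


def dir_hash(filenames):
    """MD5 over the sorted file names in one directory."""
    md5 = hashlib.md5()
    for name in sorted(filenames):
        md5.update(name.encode("utf-8"))
    return md5.hexdigest()


def dirs_to_sync(local, remote):
    """Return directories where the remote copy is missing or differs.

    Both arguments map a directory name to the list of file names in it.
    """
    stale = []
    for directory, files in sorted(local.items()):
        if directory not in remote or dir_hash(files) != dir_hash(remote[directory]):
            stale.append(directory)
    return stale


local = {"a1f": ["1340280005.75000.data"], "07c": ["1340279000.00000.data"]}
remote = {"a1f": ["1340280000.10000.data"], "07c": ["1340279000.00000.data"]}
stale = dirs_to_sync(local, remote)
```

Only the directory holding the newer write is flagged, so unchanged data is never re-sent.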

Swift Containers

https://swift.example.com/v1/bobsfishfry/tuna/tuna1.jpg

Container Based Auth:
● X-Container-Read: accountname
● X-Container-Write: accountname:username
● X-Container-Read: referer:

The Container Rules:
● Make lots of containers! Cheap!
● No nested containers
● No / in container names
● <256 bytes (incl URL encoding)
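The container rules and ACL headers above can be sketched in a few lines. The checks follow the slide's wording (no "/" in the name, under 256 bytes once URL-encoded). The helper names and the `bob` user are hypothetical, and the sketch only builds header values rather than sending a request.

```python
from urllib.parse import quote


def valid_container_name(name):
    """Apply the slide's rules: no "/" and under 256 bytes once URL-encoded."""
    if not name or "/" in name:
        return False
    return len(quote(name, safe="").encode("utf-8")) < 256


def acl_headers(read=None, write=None):
    """Build container ACL headers from lists of grantees."""
    headers = {}
    if read:
        headers["X-Container-Read"] = ",".join(read)
    if write:
        headers["X-Container-Write"] = ",".join(write)
    return headers


headers = acl_headers(read=["bobsfishfry"], write=["bobsfishfry:bob"])
```

Non-ASCII names use up the length budget quickly, because every encoded byte takes three characters in the URL.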

Coming in Folsom (1.5)

* Versioned Objects!

* Large Object Support (with no need for client support)

* Expiring Object support

* StatsD Support

* Logging is now middleware

* Amazon S3 compatibility is now an external project

* DB Preallocation: great for spinning media, horrible for SSDs
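As a rough sketch of how two of these features surface to clients, expiring objects and versioned objects are driven by request headers. The header names below (`X-Delete-After`, `X-Delete-At`, `X-Versions-Location`) are the ones Swift uses for these features. The helper functions and the `tuna_versions` container name are hypothetical, and the sketch only builds header dictionaries.

```python
def expiry_headers(delete_after=None, delete_at=None):
    """Headers for an object upload that should expire.

    delete_after is a lifetime in seconds; delete_at is a Unix timestamp.
    """
    if delete_after is not None:
        return {"X-Delete-After": str(int(delete_after))}
    if delete_at is not None:
        return {"X-Delete-At": str(int(delete_at))}
    return {}


def versioning_headers(archive_container):
    """Header for a container whose overwritten objects should be kept."""
    return {"X-Versions-Location": archive_container}


one_day = expiry_headers(delete_after=24 * 60 * 60)
versions = versioning_headers("tuna_versions")
```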

Coming Later!

Ring Builder Web Service

Upgraded WebOb (1.2) support (unicode, etc.)

Drive Failure Detector and Remediation

Token Service

OpenStack Community Resources

● IRC (freenode)
  ● #openstack
  ● #openstack-dev
  ● #openstack-meeting
● Docs
  ● http://wiki.openstack.org
  ● http://docs.openstack.org
  ● http://swift.openstack.org
  ● http://nova.openstack.org
● Twitter
  ● @openstack

