Ceph and OpenStack - February 2014
February 18, 2014
Ian Colle, Director of Engineering, Inktank
[email protected] | @ircolle | www.linkedin.com/in/ircolle | ircolle on freenode
inktank.com | ceph.com
AGENDA
• WHY CEPH?
• INTRO TO CEPH
• OPENSTACK AND CEPH
• ROADMAP
• GETTING INVOLVED
Why has Ceph become the de facto storage choice for OpenStack implementations?
http://www.openstack.org/blog/2013/11/openstack-user-survey-october-2013/
CEPH
CEPH UNIFIED STORAGE
OBJECT STORAGE
• S3 & Swift
• Multi-tenant
• Keystone
• Native API
• Geo-Replication

BLOCK STORAGE
• OpenStack
• Linux Kernel
• iSCSI
• Clones
• Snapshots

FILE SYSTEM
• CIFS/NFS
• HDFS
• Distributed Metadata
• Linux Kernel
• POSIX
CEPH OVERVIEW
HISTORY
• 2004: Project starts at UCSC
• 2006: Open sourced for the first time
• 2010: Included in the Linux kernel
• 2012: Integrated into OpenStack

PHILOSOPHY TODAY
• Failure is normal
• Self-managing
• Scale out on commodity hardware
• Everything runs in software
TRADITIONAL STORAGE VS. CEPH
TRADITIONAL ENTERPRISE STORAGE → CEPH
• Single Purpose → Multi-Purpose, Unified
• Hardware → Distributed Software
• Single Vendor Lock-in → Open
• Hard Scale Limit → Exabyte Scale
STRONG & GROWING COMMUNITY
(Chart: quarterly community activity, 2011-Q3 through 2013-Q3: IRC chat lines grew from 8,172 to 37,946 per quarter, mailing list messages from 2,888 to 11,500, and commits from 1,418 to 2,715.)
ARCHITECTURE
(Diagram: object storage, block storage, and file system interfaces: S3/Swift, host/hypervisor, iSCSI, CIFS/NFS, and SDK, all layered over storage clusters of monitors and object storage daemons (OSDs) running on many nodes.)
CRUSH
An object is placed in two steps:
1. hash(object name) % num_pg selects a placement group (PG)
2. CRUSH(pg, cluster state, rule set) selects the OSDs that store the PG

CRUSH is a pseudo-random placement algorithm:
• Fast calculation, no lookup
• Repeatable, deterministic
• Statistically uniform distribution
• Stable mapping: limited data migration on change
• Rule-based configuration: infrastructure topology aware, adjustable replication, weighting

(A toy sketch of the two-step mapping follows.)
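The sketch below is purely illustrative: Ceph itself uses the rjenkins hash plus a "stable mod", and real CRUSH descends a hierarchical cluster map (racks, hosts) under placement rules; the function and object names here are hypothetical stand-ins.

```python
import hashlib

def object_to_pg(object_name, pg_num):
    # Step 1: hash the object name onto one of pg_num placement groups.
    # A generic hash stands in for Ceph's rjenkins hash.
    digest = hashlib.md5(object_name.encode()).hexdigest()
    return int(digest, 16) % pg_num

def toy_crush(pg_id, osds, replicas=3):
    # Step 2: deterministically map the PG to a set of OSDs.
    # Real CRUSH walks the cluster topology by rule; this stand-in
    # just picks replicas from a flat OSD list.
    start = pg_id % len(osds)
    return [osds[(start + i) % len(osds)] for i in range(replicas)]

pg = object_to_pg('rbd_data.1234.0000000000000000', pg_num=128)
print(pg, toy_crush(pg, osds=list(range(12))))
# Same inputs always give the same outputs: no lookup table needed.
```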
(Animated diagrams: a client computes an object's location with CRUSH and reads and writes directly to the responsible OSDs; there is no central lookup, and the same calculation finds the data again after the cluster map changes.)
LIBRADOS / LIBRBD
(Diagram: a VM on a hypervisor does block I/O through LIBRBD, which uses LIBRADOS to talk to the cluster's monitors and OSDs.)

HOW DO YOU SPIN UP HUNDREDS OF VMs INSTANTLY AND EFFICIENTLY?
(Diagram sequence: an image of 144 objects is cloned instantly; the clone shares all 144 objects with its parent, so the copy itself consumes no new storage. When the client writes, only the 4 modified objects are materialized in the clone (144 + 4 = 148), and reads of unmodified objects are served from the parent image. A code sketch follows.)
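This copy-on-write flow is exposed through RBD's snapshot and clone API. A minimal sketch using the python-rbd binding, assuming a pool named 'rbd' and a pre-existing, layering-enabled image named 'golden-image' (both placeholders):

```python
import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('rbd')  # default RBD pool; adjust as needed

# Snapshot the golden image, then protect the snapshot so it can be cloned.
golden = rbd.Image(ioctx, 'golden-image')
golden.create_snap('base')
golden.protect_snap('base')
golden.close()

# Each clone is created instantly and shares all objects with the parent;
# new storage is consumed only for objects a VM actually writes.
for i in range(100):
    rbd.RBD().clone(ioctx, 'golden-image', 'base',
                    ioctx, 'vm-%03d' % i,
                    features=rbd.RBD_FEATURE_LAYERING)

ioctx.close()
cluster.shutdown()
```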
OPENSTACK AND CEPH
ARCHITECTURAL COMPONENTS
• RADOS: A software-based, reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes and lightweight monitors
• LIBRADOS: A library allowing apps to directly access RADOS (C, C++, Java, Python, Ruby, PHP); see the sketch after this list
• RGW: A web services gateway for object storage, compatible with S3 and Swift
• RBD: A reliable, fully-distributed block device with cloud platform integration
• CEPHFS: A distributed file system with POSIX semantics and scale-out metadata management
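To give a feel for how thin the LIBRADOS layer is, a minimal sketch with the Python binding (python-rados); the pool name 'data' and the object name are placeholders:

```python
import rados

# Connect using the local cluster configuration (path is an assumption;
# adjust to your environment).
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()

# Open an I/O context on a pool.
ioctx = cluster.open_ioctx('data')

# Write an object and read it back; RADOS handles placement,
# replication, and recovery underneath.
ioctx.write_full('hello-object', b'Hello, RADOS!')
print(ioctx.read('hello-object'))

ioctx.close()
cluster.shutdown()
```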
CEPH WITH OPENSTACK
(Diagram: OpenStack's Keystone and Swift APIs sit in front of the Ceph Object Gateway (RGW); the Cinder, Glance, and Nova APIs use the Ceph Block Device (RBD) through the Qemu/KVM hypervisor; both paths are backed by the Ceph storage cluster (RADOS).)
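In deployments of this era, the glue is mostly configuration: Cinder selects the RBD backend with `volume_driver = cinder.volume.drivers.rbd.RBDDriver` (plus options like `rbd_pool`), and Glance can store images in RADOS via `default_store = rbd` in glance-api.conf. Exact option names vary by OpenStack release, so treat these as illustrative rather than definitive.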
PROPOSED ICEHOUSE ADDITIONS
• Swift RADOS backend (possibly)
• DevStack Ceph (in work)
• Enable cloning for RBD-backed ephemeral disks (in review)
WHAT’S NEXT FOR CEPH?
CEPH ROADMAP
Upcoming releases: Firefly, Giant, H-Release
Planned features across these releases:
• Cache Tiering
• Erasure Coding
• Object Versioning
• Object Quotas
• Object Expiration
• Read-Affinity
• Alternative web server for RGW
• Performance improvements
• CephFS
CACHE TIERING - WRITEBACK
(Diagram: Clients A and B read and write through a 500 TB writeback cache pool sitting in front of a 5 PB HDD object store.)
CACHE TIERING - READONLY
(Diagram: Clients A, B, and C read from read-only cache pools of 200 TB, 150 TB, and 150 TB in front of a 5 PB HDD object store.)
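In Firefly, tiers like these are assembled with the `ceph osd tier` family of commands; for example (pool names are placeholders): `ceph osd tier add cold-storage hot-cache` to attach the cache pool, `ceph osd tier cache-mode hot-cache writeback` to set its mode, and `ceph osd tier set-overlay cold-storage hot-cache` so client traffic is transparently directed through the cache. A read-only tier instead sets `cache-mode readonly`.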
(Diagram: storing a 10 MB object with 3x replication costs you 30 MB of storage; storing the same 10 MB object erasure-coded costs you ~14 MB.)
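The ~14 MB figure is consistent with an erasure-code profile of k = 10 data chunks plus m = 4 coding chunks, an assumed profile since the slide does not name one: the overhead factor is (k + m) / k = 14 / 10 = 1.4, so 10 MB × 1.4 = 14 MB, versus 10 MB × 3 = 30 MB for 3x replication, while still tolerating the loss of any 4 chunks.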
NEXT STEPS
WHAT NOW?

Getting Started with Ceph
• Read about the latest version of Ceph: http://ceph.com/docs
• Deploy a test cluster using ceph-deploy (example after this list): http://ceph.com/qsg
• Deploy a test cluster on the AWS free tier using Juju: http://ceph.com/juju
• Ansible playbooks for Ceph: https://www.github.com/alfredodeza/ceph-ansible

Getting Involved with Ceph
• Most discussion happens on the mailing lists ceph-devel and ceph-users. Join or view archives at http://ceph.com/list
• IRC is a great place to get help (or help others!) in #ceph and #ceph-devel. Details and logs at http://ceph.com/irc
• Download the code: http://www.github.com/ceph
• The tracker manages bugs and feature requests. Register and start looking around at http://tracker.ceph.com
• Doc updates and suggestions are always welcome. Learn how to contribute docs at http://ceph.com/docwriting
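For the ceph-deploy path above, the quick start guide at http://ceph.com/qsg boils down to roughly: `ceph-deploy new node1`, `ceph-deploy install node1`, `ceph-deploy mon create-initial`, then `ceph-deploy osd prepare` and `ceph-deploy osd activate` for each data disk (hostnames and disks here are placeholders; follow the guide for the authoritative sequence).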
Ian R. Colle, Director of Engineering
[email protected] | @ircolle
www.linkedin.com/in/ircolle | ircolle on freenode