Flying Circus Ceph Case Study (Ceph Usergroup Berlin)
DESCRIPTION
Slides from the inaugural Ceph usergroup meeting in Berlin. A quick overview of the Ceph status at the Flying Circus.
TRANSCRIPT
/me
• Christian Theune
• Co-Founder of gocept
• Software Developer (formerly Zope, Plone, grok), Python (lots of packages)
• @theuni
What worked for us?
• raw image on local server
• LVM volume via iSCSI (ietd + open-iscsi)
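For readers of the transcript: exporting an LVM volume with iSCSI Enterprise Target was essentially one configuration stanza per volume. A minimal sketch; the target IQN and LV path are hypothetical placeholders, not from the talk:

```
# /etc/ietd.conf — hypothetical export of one LVM logical volume
Target iqn.2010-01.com.example:vm-disk0
    Lun 0 Path=/dev/vg0/vm-disk0,Type=blockio
```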
What didn’t work (for us)
• ATA over Ethernet
• Gluster (Sheepdog)
• Linux-HA solution for iSCSI
CEPH
• been watching for ages
• started work in December 2012
• production roll-out since December 2013
• about 50% of production migrated so far
Our production structure
• KVM hosts with 2×1 Gbps (STO and STB)
• Old storage servers with 5×600 GB RAID 5 + 1 journal (SAS 15k drives)
• 5 monitors, 6 OSDs currently
• RBD from KVM hosts and backup server, 1 cluster per customer project (multiple VMs); see the disk-attachment sketch after this list
• Acceptable performance on existing hardware
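For context, attaching an RBD image to a KVM guest typically goes through libvirt's network-disk support. A minimal sketch; pool, image, and monitor names are hypothetical, and auth details are omitted:

```xml
<!-- Hypothetical libvirt disk stanza for an RBD-backed guest volume. -->
<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <source protocol='rbd' name='customerpool/vm-root'>
    <host name='mon1.example.com' port='6789'/>
  </source>
  <target dev='vda' bus='virtio'/>
</disk>
```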
Good stuff
• No single point of failure any more!
• Create/destroy VM images on KVM hosts (see the `rbd` sketch after this list)
• Fail-over and self-healing works nicely
• Virtualisation for storage “as it should be”™
• High quality of concepts, implementation, and documentation
• Relatively simple to configure
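Image lifecycle from the hypervisor is just the `rbd` CLI; a sketch with hypothetical pool and image names:

```
# create a 20 GiB root image for a new VM (--size is in MiB)
rbd create --size 20480 customerpool/vm-root

# remove it again when the VM is destroyed
rbd rm customerpool/vm-root
```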
ceph -s (and -w)
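The slide showed a live screenshot. For the transcript, a mock-up of the kind of summary `ceph -s` prints, with all identifiers and numbers invented, not from the talk; `ceph -w` prints the same summary and then follows it with a stream of cluster events:

```
$ ceph -s
   cluster a1b2c3d4-0000-0000-0000-000000000000
    health HEALTH_OK
    monmap e5: 5 mons, election epoch 42, quorum 0,1,2,3,4
    osdmap e812: 6 osds: 6 up, 6 in
     pgmap v123456: 512 pgs: 512 active+clean; 1024 GB data, 3072 GB used, 2048 GB / 5120 GB avail
```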
ceph osd tree
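Again a screenshot on the slide. Schematically, `ceph osd tree` shows the CRUSH hierarchy of roots, hosts, and OSDs with their weights and up/down state; mock output with invented host names:

```
$ ceph osd tree
# id    weight  type name          up/down reweight
-1      6       root default
-2      3           host storage01
0       1               osd.0      up      1
1       1               osd.1      up      1
2       1               osd.2      up      1
-3      3           host storage02
3       1               osd.3      up      1
4       1               osd.4      up      1
5       1               osd.5      up      1
```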
Current issues
• Bandwidth vs. latency: replicas from the RBD client?!
• Deciding on PG allocation in various situations (see the rule-of-thumb sketch after this list).
• Deciding on new hardware.
• Backup has become a bottleneck.
• I can haz `ceph osd pool stats` per RBD volume?
• Still measuring performance; RBD is definitely sucking up some of it.
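On the PG point, the rule of thumb from the Ceph documentation of that era was roughly 100 PGs per OSD divided by the replica count, rounded up to a power of two. A worked sketch for a cluster of this size; the pool name is hypothetical:

```
# Heuristic from the Ceph docs, not a value from the talk:
#   pg_num ≈ (num_osds × 100) / replica_count, rounded up to a power of 2
#   e.g. (6 × 100) / 3 = 200  →  pg_num = 256
ceph osd pool create customerpool 256 256
```

As for the stats wish: `ceph osd pool stats` does accept a pool name, so per-pool numbers exist; the missing piece is the same breakdown per RBD volume.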
Summary
• finally … FINALLY … F I N A L L Y !
• feels sooo good
• well, at least we did not want to throw up using it
• works as promised
• can’t stop praising it …