![Page 1: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/1.jpg)
Ceph Storage on SSD for Container
JANGSEON RYU
NAVER
![Page 2: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/2.jpg)
Agenda
• What is Container?
• Persistent Storage
• What is Ceph Storage?
• Write Process in Ceph
• Performance Test
![Page 3: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/3.jpg)
What is Container?
![Page 4: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/4.jpg)
What is Container?
• VM vs Container
• OS Level Virtualization
• Performance
• Run Everywhere – Portable
• Scalable – lightweight
• Environment Consistency
• Easy to manage Image
• Easy to deploy
Figure: Docker Container vs. Virtual Machine
Reference : https://www.docker.com/what-container
![Page 5: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/5.jpg)
What is Container?
• namespace : isolation
• cgroup : resource limiting (CPU / MEM / DISK / NWT)
Reference : https://www.slideshare.net/PhilEstes/docker-london-container-security
![Page 6: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/6.jpg)
What is Container?
Reference : https://www.theregister.co.uk/2014/08/18/docker_kicks_kvms_butt_in_ibm_tests/
![Page 7: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/7.jpg)
What is Container?
Reference : https://www.ibm.com/blogs/bluemix/2015/08/c-ports-docker-containers-across-multiple-clouds-datacenters/
PM / VM / Cloud
![Page 8: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/8.jpg)
What is Container?
Figure: Original Image (App A, Libs) → change → Changed Image (A’) → Deployment: update (A’, App A, Libs)
![Page 9: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/9.jpg)
What is Container?
Reference : http://www.devopsschool.com/slides/docker/docker-web-development/index.html#/17
![Page 10: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/10.jpg)
What is Container?
Reference : https://www.slideshare.net/insideHPC/docker-for-hpc-in-a-nutshell
![Page 11: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/11.jpg)
Persistent Storage
![Page 12: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/12.jpg)
Stateless vs Stateful Container
Stateless (easy to scale out)
• Nothing to write on disk
• Web (Front-end)
• Easy to scale in/out
• Container is ephemeral
• If the container is deleted, its data is lost
Stateful (hard to scale out)
• Needs storage to write
• Database
• Logs
• CI config / repo data
• Secret Keys
![Page 13: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/13.jpg)
Ephemeral vs Persistent
Ephemeral Storage
• Data Lost
• Local Storage
• Stateless Container
VS
Persistent Storage
• Data Saved
• Network Storage
• Stateful Container
Reference : https://www.infoworld.com/article/3106416/cloud-computing/containerizing-stateful-applications.html
![Page 14: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/14.jpg)
Needs for Persistent Storage
Our mission is to provide a “Persistent Storage Service” while maintaining the “Agility” & “Automation” of Docker containers.
![Page 15: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/15.jpg)
Container in NAVER
Figure (architecture diagram, labels only):
• Network: internet, BGP/ECMP (ECMP enabled), IPVS / vrrp, macvlan, Dynamic DNS, consul
• Docker swarm cluster with volume PLUGIN: Get Auth-Token (KEYSTONE), Request / Manage Volume (CINDER), Attach Volume /dev/rbd0
• Ceph: MON × 3, OSD nodes
DEVIEW 2016 : https://www.slideshare.net/deview/221-docker-orchestration
OpenStack 2017 : https://www.slideshare.net/JangseonRyu/on-demandblockstoragefordocker-77895159
![Page 16: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/16.jpg)
What is Ceph Storage?
![Page 17: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/17.jpg)
What is Ceph Storage?
• Open Source Distributed Storage Solution
• Massively Scalable / Efficient Scale-Out
• Unified Storage
• Runs on commodity hardware
• Integrations into Linux Kernel / QEMU/KVM Driver / OpenStack
• Self-managing / Self-healing
• Peer-to-Peer Storage Nodes
• RESTful API
• No metadata bottleneck (no lookup)
• CRUSH algorithm determines data placement
• Replicated / Erasure Coding
• Architecture
No Vendor Lock-in
![Page 18: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/18.jpg)
What is Ceph Storage?
Up-to 16 Exa
![Page 19: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/19.jpg)
What is Ceph Storage?
Reference : https://en.wikipedia.org/wiki/Ceph_(software)
![Page 20: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/20.jpg)
What is Ceph Storage?
Low TCO
![Page 21: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/21.jpg)
What is Ceph Storage?
KVM integrated with RBD
Supported above kernel 2.6
![Page 22: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/22.jpg)
What is Ceph Storage?
![Page 23: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/23.jpg)
What is Ceph Storage?
Figure: Traditional vs. Ceph
• Traditional : Client → Metadata query → Write to Storage (the metadata lookup is the bottleneck)
• Ceph : Client → CRUSH → Storage (each Storage node keeps its own Metadata)
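The "no lookup" idea can be illustrated with a small sketch. This is not the CRUSH algorithm; it uses rendezvous (highest-random-weight) hashing as a simplified stand-in to show how a client can compute the target OSDs from the object name alone, without asking a metadata server. The function name `place` and the OSD names are hypothetical.

```python
import hashlib
from typing import List


def place(object_name: str, osds: List[str], replicas: int = 3) -> List[str]:
    """Pick `replicas` OSDs for an object using rendezvous hashing.

    Stand-in for CRUSH: every client that knows the OSD list computes
    the same answer locally, so no central lookup table is needed.
    """
    def score(osd: str) -> int:
        digest = hashlib.sha256(f"{object_name}:{osd}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    # Highest score first; the first entry plays the role of the primary.
    return sorted(osds, key=score, reverse=True)[:replicas]


# Hypothetical cluster of six OSDs.
cluster = [f"osd.{i}" for i in range(6)]
targets = place("rbd_data.1234", cluster)
print(targets)
```

Real CRUSH additionally takes the cluster map, failure domains, and device weights into account; the sketch only shows the deterministic, lookup-free property.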
![Page 24: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/24.jpg)
What is Ceph Storage?
Replicated Pool (Object → COPY, COPY, COPY)
• Very high durability
• 200 % overhead
• Quick recovery
Erasure Coded Pool (Object → 1 2 3 4 X Y)
• Cost-Effective
• 50 % overhead
• Expensive Recovery
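The two overhead figures follow from simple arithmetic. The sketch below assumes that the replicated pool keeps 3 copies and that the erasure coded pool in the figure uses 4 data chunks (1 to 4) plus 2 coding chunks (X, Y); the function names are illustrative only.

```python
def replicated_overhead_percent(copies: int) -> float:
    """Extra raw capacity needed beyond the object itself, in percent."""
    return (copies - 1) * 100.0


def erasure_overhead_percent(data_chunks: int, coding_chunks: int) -> float:
    """Extra raw capacity for k data chunks plus m coding chunks, in percent."""
    return coding_chunks / data_chunks * 100.0


print(replicated_overhead_percent(3))   # 3 copies
print(erasure_overhead_percent(4, 2))   # k=4, m=2
```

With these assumptions the replicated pool needs 200 % extra space and the erasure coded pool 50 %, matching the slide.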
![Page 25: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/25.jpg)
What is Ceph Storage?
Figure: Client and Monitor × 3 on the Public Network; OSD Node × 4 on the Public Network and the Cluster Network; P / S / T labels on the OSD Nodes
![Page 26: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/26.jpg)
Write Process in Ceph
![Page 27: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/27.jpg)
Write Flow
◼ Strong Consistency
◼ CAP Theorem : CP system (Consistency / network Partition tolerance)
Figure: Client → Primary OSD → Secondary OSD, Tertiary OSD (steps ① to ④)
① Client writes data to primary OSD.
② Primary OSD sends data to replica OSDs and writes data to local disk.
③ Replica OSDs write data to local disk and signal completion to primary.
④ Primary OSD signals completion to client.
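The four steps can be modelled as a toy, in-memory sketch. It is not Ceph code; the `OSD` class and `client_write` function are hypothetical names, and network transfer is reduced to a logged event. The point it shows is that the client only gets its completion after the primary and every replica have written.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class OSD:
    """Toy OSD: a name plus an in-memory 'disk'."""
    name: str
    disk: Dict[str, bytes] = field(default_factory=dict)

    def write_local(self, key: str, data: bytes) -> bool:
        self.disk[key] = data
        return True  # completion signal (ack)


def client_write(key: str, data: bytes, primary: OSD, replicas: List[OSD]) -> List[str]:
    """Replay steps 1 to 4 and return the ordered event log."""
    events = [f"1: client -> {primary.name}"]
    # Step 2: primary forwards to the replicas and writes locally.
    for replica in replicas:
        events.append(f"2: {primary.name} -> {replica.name}")
    primary_ok = primary.write_local(key, data)
    events.append(f"2: {primary.name} wrote locally")
    # Step 3: each replica writes locally and acks the primary.
    acks = []
    for replica in replicas:
        acks.append(replica.write_local(key, data))
        events.append(f"3: {replica.name} wrote locally, ack -> {primary.name}")
    # Step 4: the client is answered only when every copy is written.
    if primary_ok and all(acks):
        events.append(f"4: {primary.name} -> client: complete")
    return events


p, s, t = OSD("primary"), OSD("secondary"), OSD("tertiary")
for line in client_write("obj", b"payload", p, [s, t]):
    print(line)
```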
![Page 28: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/28.jpg)
◼ All data disks are SATA.
◼ 7.2k rpm SATA : about 75 ~ 100 IOPS
Figure: Client → Primary OSD → Secondary OSD, Tertiary OSD (SATA, High Disk Usage)
① Client writes data to primary OSD.
② Primary OSD sends data to replica OSDs and writes data to local disk.
③’ One replica writes data to local disk and sends ack to primary.
③’’ The other replica (SATA, high disk usage) writes data to local disk slowly and sends ack to primary late.
④ Primary OSD sends ack to client late.
Slow Requests
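Why one busy disk slows the whole request can be sketched with a simplified latency model. It assumes the client's acknowledgement waits for the slowest disk write and ignores network time and queueing; the 250 ms figure for the busy replica is a hypothetical value, not a measurement from the talk.

```python
from typing import Iterable


def service_time_ms(iops: float) -> float:
    """Average time per I/O for a disk that sustains `iops` operations per second."""
    return 1000.0 / iops


def client_ack_latency_ms(primary_ms: float, replica_ms: Iterable[float]) -> float:
    """The ack to the client waits for the slowest of all disk writes."""
    return max([primary_ms] + list(replica_ms))


# 7.2k rpm SATA at 75 ~ 100 IOPS gives roughly 10 ~ 13 ms per I/O.
print(service_time_ms(100), service_time_ms(75))

# Hypothetical: one replica is busy and needs 250 ms for its write.
print(client_ack_latency_ms(10.0, [10.0, 250.0]))
```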
![Page 29: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/29.jpg)
Journal on SSD
Figure: write flow ① to ④ (Client → Primary OSD → Secondary OSD, Tertiary OSD); every OSD has a Journal and a FileStore
Primary OSD
• Journal : SSD, O_DIRECT / O_DSYNC
• FileStore : SATA, XFS, Buffered IOs
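The difference between the two write paths can be sketched at the system-call level. This is an illustration, not FileStore code: `journal_write` and `filestore_write` are hypothetical helpers, O_DIRECT is left out because it requires aligned buffers, and `O_DSYNC` falls back to 0 on platforms where Python does not expose it.

```python
import os
import tempfile


def journal_write(path: str, payload: bytes) -> int:
    """Synchronous append: with O_DSYNC the write returns after the data is on the device."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND | getattr(os, "O_DSYNC", 0)
    fd = os.open(path, flags, 0o600)
    try:
        return os.write(fd, payload)
    finally:
        os.close(fd)


def filestore_write(path: str, payload: bytes) -> int:
    """Buffered append: the data lands in the page cache and is flushed to disk later."""
    with open(path, "ab") as f:
        return f.write(payload)


def demo() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        print(journal_write(os.path.join(tmp, "journal"), b"entry"))
        print(filestore_write(os.path.join(tmp, "object"), b"entry"))


demo()
```

The journal write is the one the client waits for, which is why placing it on SSD matters; the buffered FileStore write completes quickly but leaves dirty pages that still have to reach the slower disk.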
![Page 30: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/30.jpg)
Performance Test
![Page 31: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/31.jpg)
Test Environment
Network
- Switch (10G)
- Public Networks / Cluster Networks
Clients (KRBD)
- /dev/rbd0
- mkfs / mount
- fio
Ceph
- FileStore
- Luminous (12.2.0)
Server
- Intel® Xeon CPU L5640 2.27 GHz
- 16 GB Memory
- 480GB SAS 10K x 5
- 480GB SAS SSD x 1
- CentOS 7
![Page 32: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/32.jpg)
Case #1. Only Use SAS DISK
Expect
Figure: 3 nodes × 3 SAS disks (600 IOPS per disk, 1,800 IOPS per node)
SAS : 10,000 rpm / 600 IOPS
Per Node : SAS * 3 = 600 * 3 = 1,800 IOPS
Total : Node * 3 = 1,800 * 3 = 5,400 IOPS
Replicas 3 = 5,400 / 3 = 1,800 IOPS
Result : 4K Rand Write
No SSD for Journal : 2,094 IOPS, Latency 34.360
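The expected-IOPS estimate for this case can be written as a small helper. It is a sketch of the slide's back-of-the-envelope arithmetic only (raw device IOPS divided by the replica count), not a performance model; the function name is illustrative.

```python
def expected_client_iops(device_iops: float, devices_per_node: int,
                         nodes: int, replicas: int) -> float:
    """Raw cluster IOPS divided by the number of copies each client write produces."""
    raw_cluster_iops = device_iops * devices_per_node * nodes
    return raw_cluster_iops / replicas


# Case #1: 3 nodes x 3 SAS disks at 600 IOPS, 3 replicas.
print(expected_client_iops(600, 3, 3, 3))

# The same formula with one 15,000 IOPS SSD per node.
print(expected_client_iops(15000, 1, 3, 3))
```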
![Page 33: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/33.jpg)
Case #2. Use SSD for Journal
Expect
Figure: 3 nodes, each with 3 SAS disks (600 IOPS) and 1 SSD (15k IOPS)
SSD : 15,000 IOPS
Total : Node * 3 = 15,000 * 3 = 45,000 IOPS
Replicas 3 = 45,000 / 3 = 15,000 IOPS
Result : 4K Rand Write
SSD for Journal : 2,273 IOPS (2,094 → 2,273), Latency 31.650
![Page 34: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/34.jpg)
Analysis
Figure: Client → FileStore Queue (50) → Block IO / Flush → SAS
# ceph --admin-daemon /var/run/ceph/ceph-osd.2.asok config show | grep queue_max
…
"filestore_queue_max_bytes": "104857600",
"filestore_queue_max_ops": "50",   ← default value
…
# ceph --admin-daemon /var/run/ceph/ceph-osd.2.asok perf dump | grep -A 15 throttle-filestore_ops
"throttle-filestore_ops": {
    "val": 50,   ← limitation
    "max": 50,
    …
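Spotting a saturated throttle like the one above can be automated. The sketch below assumes the `perf dump` output has been captured as JSON text and that throttle counters appear as top-level keys starting with `throttle-` with `val` and `max` fields, as in the excerpt; the sample string contains only the two values shown on the slide, and the helper name is hypothetical.

```python
import json
from typing import List

# Only the fields visible in the excerpt above.
SAMPLE_PERF_DUMP = '{"throttle-filestore_ops": {"val": 50, "max": 50}}'


def saturated_throttles(perf_dump_json: str) -> List[str]:
    """Return the names of throttles whose current value has reached the maximum."""
    perf = json.loads(perf_dump_json)
    hits = []
    for name, counters in perf.items():
        if not name.startswith("throttle-") or not isinstance(counters, dict):
            continue
        maximum = counters.get("max", 0)
        value = counters.get("val", 0)
        if maximum > 0 and value >= maximum:
            hits.append(name)
    return hits


print(saturated_throttles(SAMPLE_PERF_DUMP))
```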
![Page 35: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/35.jpg)
Case #3. Tuning FileStore
filestore_queue_max_ops = 500000
filestore_queue_max_bytes = 42949672960
journal_max_write_bytes = 42949672960
journal_max_write_entries = 5000000
Tuning Result
Tuning Queue Max : 2,899 IOPS (2,094 → 2,899, +38%), Latency 174.760
![Page 36: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/36.jpg)
Analysis
Figure: Client → FileStore Queue (500,000) → Block IO / Flush → SAS
# ceph --admin-daemon /var/run/ceph/ceph-osd.2.asok config show | grep wbthrottle_xfs
"filestore_wbthrottle_xfs_bytes_hard_limit": "419430400",
"filestore_wbthrottle_xfs_bytes_start_flusher": "41943040",
"filestore_wbthrottle_xfs_inodes_hard_limit": "5000",
"filestore_wbthrottle_xfs_inodes_start_flusher": "500",
"filestore_wbthrottle_xfs_ios_hard_limit": "5000",
"filestore_wbthrottle_xfs_ios_start_flusher": "500",
# ceph --admin-daemon /var/run/ceph/ceph-osd.2.asok perf dump | grep -A6 WBThrottle
"WBThrottle": {
    "bytes_dirtied": 21049344,
    "bytes_wb": 197390336,
    "ios_dirtied": 5017,   ← limitation
    "ios_wb": 40920,
    "inodes_dirtied": 1195,
    "inodes_wb": 20602
![Page 37: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/37.jpg)
Case #4. Tuning WBThrottle
"filestore_wbthrottle_enable": "false",
or
"filestore_wbthrottle_xfs_bytes_hard_limit": "4194304000",
"filestore_wbthrottle_xfs_bytes_start_flusher": "419430400",
"filestore_wbthrottle_xfs_inodes_hard_limit": "500000",
"filestore_wbthrottle_xfs_inodes_start_flusher": "5000",
"filestore_wbthrottle_xfs_ios_hard_limit": "500000",
"filestore_wbthrottle_xfs_ios_start_flusher": "5000",
Tuning Result
Tuning WBThrottle : 14,264 IOPS (2,094 → 14,264, x7), Latency 40.650
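The improvement figures quoted for the four cases can be checked with a few lines of arithmetic. The IOPS values are the ones reported on the slides; the helper names and the dictionary labels are illustrative.

```python
BASELINE_IOPS = 2094  # Case #1: SAS only

RESULTS = {
    "Case #2 SSD for journal": 2273,
    "Case #3 tuning queue max": 2899,
    "Case #4 tuning WBThrottle": 14264,
}


def gain_percent(baseline: float, value: float) -> float:
    """Relative improvement over the baseline, in percent."""
    return (value - baseline) / baseline * 100.0


def speedup(baseline: float, value: float) -> float:
    """How many times faster than the baseline."""
    return value / baseline


for name, iops in RESULTS.items():
    print(f"{name}: +{gain_percent(BASELINE_IOPS, iops):.1f}% "
          f"(x{speedup(BASELINE_IOPS, iops):.2f})")
```

Rounded, this gives roughly +38 % for the queue tuning and about a sevenfold gain for the WBThrottle tuning, consistent with the slides; the SSD journal alone accounts for under 10 %.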
![Page 38: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/38.jpg)
Analysis
After 35 secs, performance (IOPS) drops below 100 ~ 200 IOPS …
Figure: Client → OSD → Journal (SSD, O_DIRECT / O_DSYNC) and FileStore (Buffered IOs) → Page Cache → Block IO / Flush → SAS
vm.dirty_background_ratio : 10%
vm.dirty_ratio : 20%
vm.dirty_ratio : 50%
![Page 39: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/39.jpg)
Problems of Throttle
Enable Throttle (Client → OSD → SSD / SAS) : 2,000 IOPS
• Slow performance
• No effect with SSD
Disable Throttle (Client → OSD → SSD / SAS) : 15,000 IOPS, Bottleneck at SAS
• Fast performance
• Dangerous : high page cache usage
• Can crash Ceph Storage
![Page 40: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/40.jpg)
Figure: Client → OSD (Dynamic Throttle) → SSD / SAS; 15,000 IOPS, 500 IOPS
• Burst (~ 60 secs)
• Throttle From 80%
filestore_queue_max_ops
filestore_queue_max_bytes
filestore_expected_throughput_ops
filestore_expected_throughput_bytes
filestore_queue_low_threshhold
filestore_queue_high_threshhold
filestore_queue_high_delay_multiple
filestore_queue_max_delay_multiple
Reference : http://blog.wjin.org/posts/ceph-dynamic-throttle.html
r = current_op / max_ops
high_delay_per_count = high_multiple / expected_throughput_ops
max_delay_per_count = max_multiple / expected_throughput_ops
s0 = high_delay_per_count / (high_threshhold - low_threshhold)
s1 = (max_delay_per_count - high_delay_per_count) / (1 - high_threshhold)
if r < low_threshhold:
    delay = 0
elif r < high_threshhold:
    delay = (r - low_threshhold) * s0
else:
    delay = high_delay_per_count + ((r - high_threshhold) * s1)
Dynamic Throttle
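The delay formula on this slide can be turned into a runnable function. The sketch below follows the pseudocode term by term; the function name and the example values (low threshold 0.8 to match "Throttle From 80%", high threshold 0.9, multiples 2 and 10, expected throughput 200 ops) are assumptions for illustration, not settings taken from the talk.

```python
def throttle_delay(current_ops: int, max_ops: int,
                   expected_throughput_ops: float,
                   low_threshhold: float, high_threshhold: float,
                   high_multiple: float, max_multiple: float) -> float:
    """Per-op delay (seconds) as a piecewise-linear function of queue fill ratio."""
    if not 0 <= low_threshhold < high_threshhold < 1:
        raise ValueError("need 0 <= low_threshhold < high_threshhold < 1")
    r = current_ops / max_ops
    high_delay_per_count = high_multiple / expected_throughput_ops
    max_delay_per_count = max_multiple / expected_throughput_ops
    # Slope between the low and high thresholds.
    s0 = high_delay_per_count / (high_threshhold - low_threshhold)
    # Steeper slope between the high threshold and a full queue.
    s1 = (max_delay_per_count - high_delay_per_count) / (1 - high_threshhold)
    if r < low_threshhold:
        return 0.0
    if r < high_threshhold:
        return (r - low_threshhold) * s0
    return high_delay_per_count + (r - high_threshhold) * s1


# Illustrative parameters (assumptions, see the text above).
params = dict(max_ops=1000, expected_throughput_ops=200.0,
              low_threshhold=0.8, high_threshhold=0.9,
              high_multiple=2.0, max_multiple=10.0)
for ops in (500, 850, 900, 1000):
    print(ops, throttle_delay(current_ops=ops, **params))
```

Below the low threshold no delay is added, so bursts pass at full speed; as the queue fills, the delay grows gradually instead of blocking abruptly at a hard limit.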
![Page 41: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/41.jpg)
Conclusion
◼ High performance improvement with SSD : 2,094 → 14,264 IOPS (x7)
◼ Throttling is required for stable storage operation : Dynamic Throttle
◼ Need to tune the OS (page cache, io scheduler) and the Ceph config
![Page 42: Ceph Storage on SSD for Container - DCSLAB, Hanyang …dcslab.hanyang.ac.kr/nvramos/nvramos17/presentation/s5.pdf · What is Ceph Storage? • Open Source Distributed Storage Solution](https://reader031.vdocuments.us/reader031/viewer/2022032711/5a8525447f8b9a9f1b8c462e/html5/thumbnails/42.jpg)
Q&A