OSDC 2015: Roland Kammerer | DRBD9: Managing High-Available Storage in Many-Node Setups
TRANSCRIPT
DRBDdrbdmanageOpen Stack
Epilogue
DRBD9: Managing High-Available Storage in Many-Node Setups
Roland Kammerer, [email protected],
LINBIT HA-Solutions GmbH
April 22, 2015
Roland Kammerer OSDC 2015 1/ 34
Agenda
1 DRBD
2 drbdmanage
3 Open Stack
4 Epilogue
Key Features of DRBD 8.x
• Automatic resync after node or connectivity failure
    Direction and amount are determined automatically; no full resync required
• High performance through the in-kernel Linux implementation
    160k IOPS measured (on SSDs, of course)
• Multiple volumes per resource
    Write-order fidelity within a resource
• Pacemaker integration
• Synchronous and asynchronous replication (LAN and WAN)
• In Linux upstream since 2.6.33 (released 2010)
Control Plane in DRBD 8.x
• You need to create/provide block devices for DRBD.
• You need to distribute DRBD config files.
Promoting a Resource
$ drbdadm primary X
$ mount /dev/drbdY
• When you are done with your work
$ umount /dev/drbdY
$ drbdadm secondary X
• Go to step one on the second node...
• Oh, you did not forget to create the meta-data in the first place, and to drbdadm up the resource, right?
• Promote/mount and umount/demote are the reasons you have to special-case DRBD (e.g., in Pacemaker setups).
New Features of DRBD 9
• Multi-Node replication
• Up to 31 connections per resource
• Auto promote
• Transport abstraction layer
• Usual improvements all over the place (kernel/utils split, utils that support all supported DRBD versions, dpkg-reconfigure for utils, DKMS for the kernel module, ...)
Auto Promote
DRBD 8.x:
$ drbdadm primary X
$ mount /dev/drbdY
$ umount /dev/drbdY
$ drbdadm secondary X
DRBD 9:
$ mount /dev/drbdY
$ umount /dev/drbdY
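The difference between the two workflows can be sketched as a small Python helper that builds the command list for one mount/umount cycle. This is only an illustration: the function name, the `auto_promote` flag, and the explicit mount point argument are assumptions made for this example (the slide omits the mount point), and none of it is part of DRBD itself.

```python
def mount_cycle(resource, device, mountpoint, auto_promote):
    """Return the shell commands for one mount/umount cycle.

    auto_promote=False mirrors the DRBD 8.x column of the slide
    (explicit promote and demote). auto_promote=True mirrors DRBD 9,
    where opening the device for writing promotes the resource and
    closing it demotes the resource again.
    """
    commands = []
    if not auto_promote:
        commands.append(f"drbdadm primary {resource}")
    # mountpoint is a hypothetical argument; the slide leaves it out.
    commands.append(f"mount {device} {mountpoint}")
    commands.append(f"umount {device}")
    if not auto_promote:
        commands.append(f"drbdadm secondary {resource}")
    return commands
```

Calling `mount_cycle("X", "/dev/drbdY", "/mnt", auto_promote=False)` yields the four DRBD 8.x steps, while `auto_promote=True` leaves only the mount and umount commands, which is why cluster managers no longer need a DRBD-specific promote step in this case.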
Transport Abstraction
• Separates DRBD logic from the underlying transport
• Provides an interface for new transports
• Main module drbd.ko plus transport modules (e.g., drbd_transport_tcp.ko)
$ cat /proc/drbd
version: 9.0.0 rc2 (api:1/proto:86-110)
...
Transports (api:6): tcp (1.0.0)
So, what is the big deal?
RDMA
That is the big deal:
$ cat /proc/drbd
version: 9.0.0 rc2 (api:1/proto:86-110)
...
Transports (api:6): tcp (1.0.0) rdma (1.0.0)
Currently, we observe about 20 Gbit/s (≈2 GByte/s). More to come soon: RDMA exposed some bottlenecks in the core that we were not aware of (because TCP was slow enough)...
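To check from a script which transports are loaded, the `Transports` line shown above can be parsed. The following is a minimal sketch, assuming the `/proc/drbd` layout is exactly as on the slide (other DRBD versions may print it differently); the function name `parse_transports` and the sample text are made up for this example.

```python
import re

# Matches e.g. "Transports (api:6): tcp (1.0.0) rdma (1.0.0)";
# optional whitespace is tolerated around the colon.
TRANSPORT_LINE = re.compile(r"^Transports\s*\(api\s*:\s*(\d+)\)\s*:\s*(.*)$")
MODULE = re.compile(r"(\w+)\s*\(([^)]*)\)")

# Sample text modelled on the slide, not captured from a real system.
SAMPLE = """version: 9.0.0 rc2 (api:1/proto:86-110)
Transports (api:6): tcp (1.0.0) rdma (1.0.0)"""


def parse_transports(proc_drbd_text):
    """Return (transport_api_version, {transport_name: version}).

    Returns (None, {}) if no Transports line is found.
    """
    for line in proc_drbd_text.splitlines():
        match = TRANSPORT_LINE.match(line.strip())
        if match:
            api = int(match.group(1))
            modules = dict(MODULE.findall(match.group(2)))
            return api, modules
    return None, {}
```

A deployment script could, for instance, call `parse_transports(open("/proc/drbd").read())` and fall back to TCP when `"rdma"` is missing from the returned dictionary.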
Transport Layers to Cover
[Figure: layer diagram with the rows Protocol, Transport, Medium, and Hardware. Entries shown: TCP, SCTP, RDMA, SSOCKS, iWARP, RoCE, IPoIB, IP, Ethernet, InfiniBand, SCI. Hardware: Dolphin, Mellanox etc., Chelsio etc., several suppliers.]
Agenda
1 DRBD
2 drbdmanage: Control Plane, Example, Satellite Nodes
3 Open Stack
4 Epilogue
Why do we need yet another tool?
• Auto promote solves one part of the complexity, but...
    You still need to distribute your config files
    You need to keep them in sync
    That is fine in two-node clusters, but now we have multiple nodes.
⇒ We need a tool to...
• ...handle the new complexity of multi-node setups.
• We can use it for additional features.
Requirements and Benefits
• What you need:
    Nodes with an LVM VG (drbdpool)
• What you get:
    Manages nodes in your cluster
    Manages resources and volumes (including replica count)
    Manages snapshots
    Calls the LVM tools
    Volumes can be based on thinly provisioned LVM LVs
    Distributes config and activates it (by using DRBD 9)
    Implemented in Python
    Scales to 1000s of nodes (via "Satellite Nodes", WIP)
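To make "resources and volumes (including replica count)" more concrete, here is a hedged Python sketch of how a tool could choose the nodes for a new volume. The slides do not describe drbdmanage's actual placement algorithm; the policy used here (nodes with the most free space in their drbdpool VG first) and all names are assumptions for illustration only.

```python
def pick_nodes(free_space, size, replicas):
    """Choose `replicas` nodes that can each hold a volume of `size`.

    free_space maps node name -> free space in that node's drbdpool VG,
    in the same unit as size. Policy (an assumption of this sketch, not
    drbdmanage's own): most free space first, node name as tie-breaker.
    """
    fitting = [name for name, free in free_space.items() if free >= size]
    if len(fitting) < replicas:
        raise ValueError(
            f"need {replicas} nodes with {size} free, only {len(fitting)} fit"
        )
    fitting.sort(key=lambda name: (-free_space[name], name))
    return fitting[:replicas]
```

With `{"alpha": 100, "bravo": 50, "charlie": 80}` as free space, a volume of size 60 and two replicas, the sketch selects `alpha` and `charlie`; asking for three replicas raises an error because `bravo` is too small.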
Control Plane in DRBD 9
Software Architecture
Our current State
Adding a Node
Rebalancing
Rebalancing finished
Using new Space
Satellite Nodes
• Core Nodes...
    contain the drbdctrl volume
    contain logic
• Satellite Nodes...
    no drbdctrl
    just execute commands
It is OK to share DRBD resources between satellite (and/or core) nodes.
DRBD in OpenStack
DRBD and drbdmanage plug into the Cinder part of OpenStack.
Remember the Control Plane in DRBD 9?
drbdmanage Cinder Driver on a Cinder Node
drbdmanage Nova Driver on a Nova Node
If the Nova node...
• ...has a local replica of the volume
    use it
• ...does not have a local replica of the volume
    it is a DRBD client
    it is a diskless primary node that connects to secondaries that hold the data
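The decision above can be written down as a few lines of Python. This is a sketch of the logic on the slide, not code from the drbdmanage Nova driver; the function names, the string labels, and the node names in the usage note are hypothetical.

```python
def access_mode(nova_node, replica_nodes):
    """Return how a Nova node reaches a volume.

    "local"           -> the node holds a replica and uses it directly
    "diskless-client" -> the node is a diskless primary (DRBD client)
                         that connects to the secondaries holding the data
    """
    if nova_node in replica_nodes:
        return "local"
    return "diskless-client"


def prefer_local(candidates, replica_nodes):
    """Order candidate Nova nodes so those with a local replica come first.

    Illustrates why placing an instance where a replica already lives
    avoids the network hop; ties are broken by node name.
    """
    return sorted(candidates, key=lambda node: (node not in replica_nodes, node))
```

For a volume replicated on `n1` and `n2`, `access_mode("n3", {"n1", "n2"})` reports a diskless client, and `prefer_local(["n3", "n1", "n2"], {"n1", "n2"})` puts `n1` and `n2` ahead of `n3`.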
Architecture
http://docs.openstack.org/juno/install-guide/install/yum/content/ch_overview.html
OpenStack Nova vs. Cinder
Low-latency storage access is possible by aligning Nova and Cinder allocations.
Get it now!
• http://drbd.linbit.com
• http://oss.linbit.com
• http://git.drbd.org
• To get access to our deb/rpm repos: [email protected]
Q&A
Are there any questions?
I will be around; you can also drop me a mail: [email protected]
License
This work is licensed under a Creative Commons “Attribution-NonCommercial-ShareAlike 4.0 International” license.