improvements in glusterfs for virtualization usecase
1
IMPROVEMENTS IN GLUSTER FOR VIRTUALIZATION USECASE
Prasanna Kumar Kalever
07 - Feb - 2016
pkalever @
http://goo.gl/yBeXI8
2
INDEX:
1 Introduction to Gluster
2 Introduction to Hyperconvergence
3 Libgfapi over FUSE
4 Gluster-QEMU-libgfapi integration
5 Unix domain socket for IO
6 Sharding
7 Dynamic authentication
Demo
6 . 2
MULTI SERVERS
gerrit link
The fix started with the initialization path for multiple servers in libgfapi
qemu-system-x86_64 file=gluster://10.70.1.86:24007/testvol/a.qcow2
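To make the limitation of this syntax concrete, here is a minimal Python sketch that splits the URI into its parts using only the standard library. The function name parse_gluster_uri is hypothetical, and this is not how QEMU itself parses the option; it only shows that the URI has room for exactly one host and one port.

```python
from urllib.parse import urlsplit

def parse_gluster_uri(uri):
    """Split a gluster:// URI into host, port, volume and image path."""
    parts = urlsplit(uri)
    if parts.scheme != "gluster":
        raise ValueError("not a gluster URI: " + uri)
    # The first path component is the volume, the rest is the image path.
    volume, _, image = parts.path.lstrip("/").partition("/")
    return {
        "host": parts.hostname,
        "port": parts.port,
        "volume": volume,
        "image": image,
    }

info = parse_gluster_uri("gluster://10.70.1.86:24007/testvol/a.qcow2")
```

With a single host in the URI, there is nowhere to list a second server to fall back on if that node is down.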
6 . 3
QEMU:
RFE Link
Creating a trusted pool
Creating a replica 3 volume
QEMU setup with glusterfs
libvirt setup with gluster storage driver
Observing the old URI syntax
Its demerits
JSON way for high availability
6 . 4
LIBVIRT
RFE Link
JSON command:
------------
json:{
  "driver":"qcow2",
  "file":{
    "driver":"gluster",
    "volume":"testvol",
    "path":"/path/a.qcow2",
    "servers":[
      { "host":"localhost", "transport":"unix" },
      { "host":"1.2.3.4", "port":"24007", "transport":"tcp" },
      { "host":"4.5.6.7", "port":"24008", "transport":"rdma" }
    ]
  }
}
URL command:
-----------
file=gluster://10.70.1.86:24007/testvol/a.qcow2
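As a rough illustration of how a tool might assemble the JSON form from a list of servers, here is a small Python sketch. The helper name gluster_json_option is hypothetical, and the key names ("servers", "transport") are copied from the slide as proposed at the time of the talk; the syntax finally accepted by QEMU may differ.

```python
import json

def gluster_json_option(volume, path, servers, image_format="qcow2"):
    """Build the 'json:{...}' option string in the layout shown on the slide."""
    spec = {
        "driver": image_format,
        "file": {
            "driver": "gluster",
            "volume": volume,
            "path": path,
            "servers": servers,
        },
    }
    # Compact separators keep the string free of spaces for command-line use.
    return "json:" + json.dumps(spec, separators=(",", ":"))

servers = [
    {"host": "localhost", "transport": "unix"},
    {"host": "1.2.3.4", "port": "24007", "transport": "tcp"},
    {"host": "4.5.6.7", "port": "24008", "transport": "rdma"},
]
option = gluster_json_option("testvol", "/path/a.qcow2", servers)
```

Unlike the URL form, the list can carry any number of servers, each with its own transport.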
CHANGES:
1 Formatter to read the XML and create QEMU command-line options
2 Parser that will modify the domain XML file to update backing chain info when snapshots are created/deleted
6 . 5
VDSM
Fuse gerrit link
Disk XML that should be sent to libvirt:
<disk device="disk" snapshot="no" type="network">
  <source name="vol4/68df3c0d-f58d-44fa-8cc7-ade9fd0a85da/images/b91d13c7" protocol="gluster">
    <host name="gluster01" port="0" transport="tcp"/>
    <host name="gluster02" port="0" transport="rdma"/>
    <host name="gluster03" port="0" transport="uds"/>
    . . . .
    <host name="glusterN" port="0" transport="tcp"/>
  </source>
  <target bus="virtio" dev="vda"/>
  <serial>b91d13c7-758c-4fbd-9408-220d1d5c65fb</serial>
  <boot order="2"/>
  <driver cache="none" error_policy="stop" io="threads" name="qemu" type="raw"/>
</disk>
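A management layer could generate the multi-host part of such a disk element programmatically. The following Python sketch does this with the standard library; the function name gluster_disk_xml and its parameters are hypothetical, the attribute values are copied from the slide (including the "uds" transport name), and it is not VDSM's actual code.

```python
import xml.etree.ElementTree as ET

def gluster_disk_xml(source_name, hosts, target_dev="vda"):
    """Return a <disk> element with one <host> child per (name, transport) pair."""
    disk = ET.Element("disk", device="disk", snapshot="no", type="network")
    source = ET.SubElement(disk, "source", name=source_name, protocol="gluster")
    for name, transport in hosts:
        # port="0" follows the slide's example.
        ET.SubElement(source, "host", name=name, port="0", transport=transport)
    ET.SubElement(disk, "target", bus="virtio", dev=target_dev)
    return ET.tostring(disk, encoding="unicode")

xml_text = gluster_disk_xml(
    "vol4/68df3c0d-f58d-44fa-8cc7-ade9fd0a85da/images/b91d13c7",
    [("gluster01", "tcp"), ("gluster02", "rdma"), ("gluster03", "uds")],
)
```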
6 . 6
AVAILABILITY:
Using libvirt virsh to start a VM
Understanding the domain XML
Creating a snapshot with a backing-store VM image
6 . 7
IMPROVEMENTS
1 Kernel context switch overhead: parses through the list of servers and connects to the first available one via libgfapi, bypassing FUSE
2 Availability: what if the node we connected to goes down? Don't worry, we can now use backup volfile servers
3 Improved performance: tries to use the local node and connect via Unix domain sockets
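The three points above can be sketched as a small selection routine in Python: prefer a local Unix-socket server, otherwise walk the list and take the first server that answers. The names order_servers, connect_first_available and try_connect are all hypothetical; try_connect stands in for whatever the real client does to reach a volfile server, and this is not libgfapi's implementation.

```python
def order_servers(servers, local_host="localhost"):
    """Put local unix-socket servers first, keep the rest in their given order."""
    local = [s for s in servers
             if s["transport"] == "unix" and s["host"] == local_host]
    remote = [s for s in servers if s not in local]
    return local + remote

def connect_first_available(servers, try_connect):
    """Return the first server for which try_connect(server) succeeds."""
    for server in order_servers(servers):
        if try_connect(server):
            return server
    raise ConnectionError("no volfile server reachable")

servers = [
    {"host": "1.2.3.4", "transport": "tcp"},
    {"host": "localhost", "transport": "unix"},
    {"host": "4.5.6.7", "transport": "rdma"},
]
down = {"localhost"}  # simulate the local node being unavailable
chosen = connect_first_available(servers, lambda s: s["host"] not in down)
```

In this simulated run the local socket is tried first, fails, and the routine falls back to the next server in the list.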
7 . 1
UNIX DOMAIN SOCKETS FOR IO
[Diagram: Client and Server; processes glusterd, glusterfsd, glusterfs; connections labelled Mgmt and IO]
7 . 2
UNIX DOMAIN SOCKETS FOR IO
more
7 . 3
UNIX DOMAIN SOCKETS FOR IO
Start a volume
Mount the volume on a node where the brick is local
Observe UDS in action
7 . 4
QEMU
1 Kernel context switch overhead: parses through the list of servers and connects to the first available one via libgfapi
2 Availability: if the node connected in step 1 later goes down, don't worry, we can now use backup volfile servers
3 Improved performance: tries to use the local node and connect via Unix domain sockets
7 . 5
THAT'S NOT ALL
To do: improve the DHT hash to prefer the local brick using heuristics
8 . 1
SHARDING
[Figure: File.txt (84 MB) split into five 16 MB blocks and one 4 MB block, stored as File.txt, .shard/GFID.1, .shard/GFID.2, .shard/GFID.3, .shard/GFID.4 and .shard/GFID.5]
84 MB = 16 MB + 16 MB + 16 MB + 16 MB + 16 MB + 4 MB
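The arithmetic in the figure can be reproduced with a short Python sketch that cuts a file of a given size into fixed-size blocks. The function name shard_layout is hypothetical; keeping the first block in the base file and naming later blocks .shard/GFID.N follows the names shown in the figure, with "GFID" used as a literal placeholder.

```python
MB = 1024 * 1024

def shard_layout(file_name, gfid, size, block_size):
    """Return (name, length) pairs for each block of a sharded file."""
    pieces = []
    offset = 0
    index = 0
    while offset < size:
        length = min(block_size, size - offset)
        # Block 0 stays in the base file; later blocks live under .shard/.
        name = file_name if index == 0 else ".shard/%s.%d" % (gfid, index)
        pieces.append((name, length))
        offset += length
        index += 1
    return pieces

layout = shard_layout("File.txt", "GFID", 84 * MB, 16 * MB)
```

For the 84 MB example this yields six entries, the last one holding the remaining 4 MB.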
8 . 2
SHARDING
Fuse/GFapi/other protocol
io-stats
shard
DHT
AFR
Protocol/Client-0 Protocol/Client-1 Protocol/Client-2
Brick-0 Brick-1 Brick-2
8 . 3
SHARDING
Create a volume
Enable sharding with a custom block size
Observe how a big image is split into shard blocks
Understand how it helps Gluster
8 . 4
MERITS
1 Sharding provides better utilization of disk space
2 Size of a file is not restricted to brick size
3 Data blocks are distributed by DHT in a "normal way"
4 Heal at the granularity of shards (speeds up heal process)
How?
Can you Explain?
Which blocks?
Yes!
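One way to picture merit 4: when a write lands while a replica is down, only the shards covering that byte range need to be healed, not the whole image. The Python sketch below computes which block indices a write touches. The function name shards_touched is hypothetical, index 0 is assumed to be the base file, and this is not the actual AFR or shard translator logic.

```python
MB = 1024 * 1024

def shards_touched(offset, length, block_size):
    """Return the block indices overlapped by a write of `length` bytes at `offset`."""
    if length <= 0:
        return []
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return list(range(first, last + 1))

# A 4 MB write starting at 30 MB into an image with 16 MB shards.
touched = shards_touched(30 * MB, 4 * MB, 16 * MB)
```

Here the write crosses the 32 MB boundary, so two blocks are affected and the rest of the image needs no healing.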
9 . 1
DYNAMIC AUTHENTICATION
Gluster Volume
# gluster vol test reject $IP
Sorry I am lazy :)
9 . 2
DYNAMIC AUTHENTICATION
Observe dynamic-auth in action
10
CREDITS
Libgfapi "Raghavendra Talur" <[email protected]>
QEMU side help "Deepak C Shetty" <[email protected]>
Sharding Owner "Krutika Dhananjay" <[email protected]>