Improvements in GlusterFS for Virtualization Usecase


Upload: deepak-shetty

Post on 16-Apr-2017


1

IMPROVEMENTS IN GLUSTER FOR VIRTUALIZATION USECASE

Prasanna Kumar Kalever

07 - Feb - 2016

pkalever @ http://goo.gl/yBeXI8

2

INDEX:

1 Introduction to Gluster

2 Introduction to Hyperconvergence

3 LibGfapi Over Fuse

4 Gluster-Qemu-Libgfapi integration

5 Unix domain socket for IO

6 Sharding

7 Dynamic authentication

Four of the sections are marked with a demo.

3

INTRO TO GLUSTER

4

INTRO TO HYPERCONVERGENCE

5.1

LIBGFAPI OVER FUSE

5.2

LIBGFAPI OVER FUSE

5.3

LIBGFAPI OVER FUSE

6.1

THE BIG PICTURE

VDSM

6.2

MULTI SERVERS

gerrit link

The fix starts in the initialization code of libgfapi, which is extended to handle multiple servers

qemu-system-x86_64 file=gluster://10.70.1.86:24007/testvol/a.qcow2
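
To make the limitation of this syntax concrete, here is a minimal Python sketch that splits the URI above into its parts. The helper name `parse_gluster_uri` is hypothetical and is not part of QEMU or libgfapi; the only point is that the URI form has room for exactly one host and port.

```python
from urllib.parse import urlsplit


def parse_gluster_uri(uri):
    """Split a gluster:// URI into its parts (illustrative helper, not a QEMU API)."""
    parts = urlsplit(uri)
    # The scheme may carry a transport suffix such as "gluster+tcp".
    scheme, _, transport = parts.scheme.partition("+")
    if scheme != "gluster":
        raise ValueError("not a gluster URI: %r" % uri)
    # The first path component is the volume, the rest is the image path.
    volume, _, image = parts.path.lstrip("/").partition("/")
    return {
        "host": parts.hostname,
        "port": parts.port,
        "transport": transport or "tcp",
        "volume": volume,
        "path": "/" + image,
    }


info = parse_gluster_uri("gluster://10.70.1.86:24007/testvol/a.qcow2")
print(info)
```

Because only a single host fits into the URI, the VM cannot fetch the volfile from anywhere else if that one node is unreachable.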

6.3

QEMU:

RFE Link

Creating a trusted pool
Creating a replica 3 volume
QEMU setup with glusterfs
libvirt setup with the gluster storage driver

Observing the old URI syntax
Its demerits
The JSON way for high availability

6.4

LIBVIRT

RFE Link

JSON command:
------------
json:{
  "driver": "qcow2",
  "file": {
    "driver": "gluster",
    "volume": "testvol",
    "path": "/path/a.qcow2",
    "servers": [
      { "host": "localhost", "transport": "unix" },
      { "host": "1.2.3.4", "port": "24007", "transport": "tcp" },
      { "host": "4.5.6.7", "port": "24008", "transport": "rdma" }
    ]
  }
}

URL command:
-----------
file=gluster://10.70.1.86:24007/testvol/a.qcow2
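
The JSON form can be generated from a plain list of servers. The following Python sketch builds the structure exactly as it appears on the slide; the function name `gluster_json_spec` is hypothetical, and the key names ("servers", "host", "port", "transport") are copied from the slide's proposal, so the syntax accepted by a given QEMU release may differ.

```python
import json


def gluster_json_spec(volume, path, servers, image_format="qcow2"):
    """Build the JSON drive description in the layout shown on the slide."""
    return {
        "driver": image_format,
        "file": {
            "driver": "gluster",
            "volume": volume,
            "path": path,
            # One entry per volfile server; order is the order they are tried.
            "servers": [dict(s) for s in servers],
        },
    }


servers = [
    {"host": "localhost", "transport": "unix"},
    {"host": "1.2.3.4", "port": "24007", "transport": "tcp"},
    {"host": "4.5.6.7", "port": "24008", "transport": "rdma"},
]
spec = gluster_json_spec("testvol", "/path/a.qcow2", servers)
json_arg = "json:" + json.dumps(spec)
print(json_arg)
```

Unlike the URL form, the list can hold any number of servers, each with its own transport.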

CHANGES:

1 Formatter to read the XML and create QEMU command line options

2 Parser that will modify the domain XML file to update backing chain info when snapshots are created/deleted

6.5

VDSM

Fuse gerrit link

Disk XML that should be sent to libvirt:

<disk device="disk" snapshot="no" type="network">
  <source name="vol4/68df3c0d-f58d-44fa-8cc7-ade9fd0a85da/images/b91d13c7" protocol="gluster">
    <host name="gluster01" port="0" transport="tcp"/>
    <host name="gluster02" port="0" transport="rdma"/>
    <host name="gluster03" port="0" transport="uds"/>
    . . . .
    <host name="glusterN" port="0" transport="tcp"/>
  </source>
  <target bus="virtio" dev="vda"/>
  <serial>b91d13c7-758c-4fbd-9408-220d1d5c65fb</serial>
  <boot order="2"/>
  <driver cache="none" error_policy="stop" io="threads" name="qemu" type="raw"/>
</disk>
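
As a rough illustration of how a management layer could emit such a disk element with several host entries, here is a Python sketch using the standard library's ElementTree. The function name `gluster_disk_xml` and the shortened source name are hypothetical; the attribute names and transport values are copied from the slide and are not checked against what a particular libvirt version accepts.

```python
import xml.etree.ElementTree as ET


def gluster_disk_xml(source_name, hosts, target_dev="vda"):
    """Build a <disk> element with one <host> child per gluster server."""
    disk = ET.Element("disk", device="disk", snapshot="no", type="network")
    source = ET.SubElement(disk, "source", name=source_name, protocol="gluster")
    for name, transport in hosts:
        ET.SubElement(source, "host", name=name, port="0", transport=transport)
    ET.SubElement(disk, "target", bus="virtio", dev=target_dev)
    ET.SubElement(disk, "driver", cache="none", error_policy="stop",
                  io="threads", name="qemu", type="raw")
    return disk


hosts = [("gluster01", "tcp"), ("gluster02", "rdma"), ("gluster03", "uds")]
disk = gluster_disk_xml("vol4/images/example", hosts)  # hypothetical source name
print(ET.tostring(disk, encoding="unicode"))
```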

6.6

AVAILABILITY:

Using libvirt virsh to start a VM
Understanding the domain XML
Creating a snapshot with a backing store VM image

6.7

IMPROVEMENTS

1 Kernel context switch overhead: parse through the list of servers and connect to the first available one via libgfapi

2 Availability: what if the node we connected to goes down? Don't worry, we can now use backup volfile servers

3 Improved performance: try to use the local node and connect via Unix domain sockets
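
The behaviour in the three points above can be sketched in a few lines of Python. This is a toy model, not libgfapi code: `connect_first_available`, `prefer_local` and `fake_connect` are hypothetical names, and the connector is a stand-in that only simulates one server being down.

```python
def connect_first_available(servers, try_connect):
    """Return (server, connection) for the first server that accepts a connection."""
    errors = []
    for server in servers:
        try:
            return server, try_connect(server)
        except OSError as exc:
            # Remember the failure and fall through to the next backup server.
            errors.append((server["host"], exc))
    raise ConnectionError("no volfile server reachable: %r" % errors)


def prefer_local(servers):
    """Stable sort that moves unix-socket (local) servers to the front."""
    return sorted(servers, key=lambda s: s.get("transport") != "unix")


down = {"1.2.3.4"}  # hosts that are simulated as unreachable


def fake_connect(server):
    if server["host"] in down:
        raise ConnectionRefusedError(server["host"])
    return "connected to %s via %s" % (server["host"], server["transport"])


servers = [
    {"host": "1.2.3.4", "transport": "tcp"},
    {"host": "4.5.6.7", "transport": "tcp"},
    {"host": "localhost", "transport": "unix"},
]
chosen, conn = connect_first_available(servers, fake_connect)
local_first, local_conn = connect_first_available(prefer_local(servers), fake_connect)
print(conn)
print(local_conn)
```

In the first call the dead server is skipped and the next one in the list is used; in the second call the local Unix-socket entry is tried first.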

7.1

UNIX DOMAIN SOCKETS FOR IO

[Diagram labels: glusterd, glusterfsd, glusterfs; Client, Server; Mgmt, IO]

7.2

UNIX DOMAIN SOCKETS FOR IO

more

7.3

UNIX DOMAIN SOCKETS FOR IO

Start a volume
Mount the volume on a node where the brick is local
Observe UDS in action
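
For readers who have not used Unix domain sockets, the following Python sketch shows the general mechanism only; it does not reproduce how Gluster wires its client and brick processes together. The function name `uds_round_trip` and the socket file name are hypothetical, and the code assumes a Unix-like system.

```python
import os
import socket
import tempfile


def uds_round_trip(payload):
    """Send payload over an AF_UNIX stream socket and return what the server side received."""
    with tempfile.TemporaryDirectory() as tmp:
        # A Unix domain socket is addressed by a filesystem path, not host:port.
        sock_path = os.path.join(tmp, "demo.socket")
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(sock_path)
            server.listen(1)
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(sock_path)
                conn, _ = server.accept()
                with conn:
                    client.sendall(payload)
                    return conn.recv(len(payload))


received = uds_round_trip(b"hello brick")
print(received)
```

Both endpoints live on the same machine, which is why this transport only applies when the brick is local to the client.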

7.4

QEMU

1 Kernel context switch overhead: parse through the list of servers and connect to the first available one via libgfapi

2 Availability: if the node connected in step 1 later goes down, don't worry, we can now use backup volfile servers

3 Improved performance: try to use the local node and connect via Unix domain sockets

7.5

THAT'S NOT ALL

Still to do: improve the DHT hash to prefer the local brick using heuristics

8.1

SHARDING

[Figure: File.txt (84 MB) split into 16 MB blocks, stored as File.txt, .shard/GFID.1, .shard/GFID.2, .shard/GFID.3, .shard/GFID.4 and .shard/GFID.5; five blocks of 16 MB and one block of 4 MB]

84 MB = 16 MB + 16 MB + 16 MB + 16 MB + 16 MB + 4 MB
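
The arithmetic above can be reproduced with a short Python sketch. It assumes, as the figure suggests, that the first block stays in the original file and the remaining blocks are named .shard/GFID.N; the function name `shard_layout` and the literal "GFID" placeholder are illustrative only.

```python
MB = 1024 * 1024


def shard_layout(file_size, block_size, base_name="File.txt", gfid="GFID"):
    """Return a list of (name, size_in_bytes) pieces for a sharded file."""
    layout = []
    offset = 0
    index = 0
    while offset < file_size:
        size = min(block_size, file_size - offset)
        # Assumption from the figure: block 0 is the base file itself.
        name = base_name if index == 0 else ".shard/%s.%d" % (gfid, index)
        layout.append((name, size))
        offset += size
        index += 1
    return layout


layout = shard_layout(84 * MB, 16 * MB)
for name, size in layout:
    print("%-16s %3d MB" % (name, size // MB))
```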

8.2

SHARDING

Fuse/GFapi/other protocol

io-stats

shard

DHT

AFR

Protocol/Client-0 Protocol/Client-1 Protocol/Client-2

Brick-0 Brick-1 Brick-2

8.3

SHARDING

Create a volume
Enable sharding with a custom block size
Observe how a big image is split into shard blocks
Understand how it helps Gluster
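
The first two demo steps correspond to a handful of gluster CLI calls. The Python sketch below only assembles and prints the command lines; it does not run them. The function name, host names and brick paths are hypothetical, and the volume options used are features.shard and features.shard-block-size.

```python
import shlex


def sharding_demo_commands(volume, bricks, block_size="16MB"):
    """Return the gluster CLI invocations for a replicated, sharded volume as argv lists."""
    return [
        ["gluster", "volume", "create", volume, "replica", str(len(bricks))] + list(bricks),
        ["gluster", "volume", "start", volume],
        ["gluster", "volume", "set", volume, "features.shard", "on"],
        ["gluster", "volume", "set", volume, "features.shard-block-size", block_size],
    ]


# Hypothetical hosts and brick directories.
bricks = ["host1:/bricks/b1", "host2:/bricks/b1", "host3:/bricks/b1"]
commands = sharding_demo_commands("testvol", bricks)
for argv in commands:
    print(" ".join(shlex.quote(arg) for arg in argv))
```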

8.4

MERITS

1 Sharding provides better utilization of disk space

2 Size of a file is not restricted to brick size

3 Data blocks are distributed by DHT in a "normal way"

4 Heal at the granularity of shards (speeds up heal process)

How?

Can you Explain?

Which blocks?

Yes!
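
To see why healing at shard granularity is faster, consider which blocks a single write touches. This Python sketch is an illustration of the idea under the 16 MB block size used earlier, not Gluster's heal logic; `shards_touched` is a hypothetical helper.

```python
MB = 1024 * 1024


def shards_touched(offset, length, block_size):
    """Return the indices of the shard blocks that a write of `length` bytes at `offset` overlaps."""
    if length <= 0:
        return []
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return list(range(first, last + 1))


# A 4 MB write at offset 30 MB into the 84 MB file from the earlier slide.
dirty = shards_touched(offset=30 * MB, length=4 * MB, block_size=16 * MB)
print(dirty)
```

If a replica misses that write, only the listed blocks differ and need to be copied, rather than the whole 84 MB image.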

9.1

DYNAMIC AUTHENTICATION

Gluster Volume

# gluster vol test reject $IP

Sorry I am lazy :)
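
The slide gives only an abbreviated command, so the following Python sketch is a toy model of the idea rather than Gluster's implementation. It assumes that "dynamic" means a client that is already connected gets dropped as soon as its address is added to the reject list, without a remount. The class name `VolumeAuth` and its methods are hypothetical.

```python
import fnmatch


class VolumeAuth:
    """Toy model of a volume that re-checks connected clients when its reject list changes."""

    def __init__(self):
        self.reject_patterns = []
        self.connected = set()

    def is_allowed(self, ip):
        return not any(fnmatch.fnmatch(ip, p) for p in self.reject_patterns)

    def connect(self, ip):
        if not self.is_allowed(ip):
            return False
        self.connected.add(ip)
        return True

    def reject(self, pattern):
        """Add a reject pattern and drop clients that no longer pass the check."""
        self.reject_patterns.append(pattern)
        dropped = {ip for ip in self.connected if not self.is_allowed(ip)}
        self.connected -= dropped
        return dropped


vol = VolumeAuth()
vol.connect("10.70.1.86")
vol.connect("10.70.1.90")
dropped = vol.reject("10.70.1.86")
print(sorted(dropped), sorted(vol.connected))
```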

9.2

DYNAMIC AUTHENTICATION

Observe dynamic-auth in action

10

CREDITS

Libgfapi "Raghavendra Talur" <[email protected]>

QEMU side help "Deepak C Shetty" <[email protected]>

Sharding Owner "Krutika Dhananjay" <[email protected]>

11

Q & A