#Fosdem 2014 #MySQL & Friends #devroom

15 Tips to improve your Galera Cluster Experience

2/5/14

Who am I ?

Frédéric Descamps “lefred”

@lefred

http://about.me/lefred

Percona Consultant since 2011

Managing MySQL since 3.23

devops believer

I installed my first Galera cluster in Feb 2010

Ready for countdown ?

15

How to perform point in time recovery ?

Binary log must be enabled

log-slave-updates should be enabled
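A quick way to check both settings on a node, as a minimal sketch. The helper name `pitr_ready` is hypothetical; it only inspects text in the tab-separated form that `mysql -NB` prints for `SHOW GLOBAL VARIABLES`, so the connection details in the usage comment are assumptions.

```shell
# pitr_ready (hypothetical helper): reads "name<TAB>value" lines on stdin and
# succeeds only if log_bin and log_slave_updates are both ON.
pitr_ready() {
    awk '
        $1 == "log_bin" && $2 == "ON" { bin = 1 }
        $1 == "log_slave_updates" && $2 == "ON" { lsu = 1 }
        END { exit !(bin && lsu) }
    '
}

# Assumed usage against a running node (needs a mysql client and credentials):
#   mysql -NBe "SHOW GLOBAL VARIABLES LIKE 'log%'" | pitr_ready && echo "ready for PITR"
```

Running it on every node matters, since any node may end up being the one used for the restore.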

The environment

[Diagram: cluster nodes receiving writes]

Suddenly !

Oops! Dim0 truncated a production table... :-S

We can have 2 scenarios:
– The application can keep running even without that table
– The application must be stopped!

Scenario 1: application must be stopped !

We have Xtrabackup (and it creates daily backups!)

We have binary logs

These are the steps:
– Stop each node of the cluster

    /etc/init.d/mysql stop
    or
    service mysql stop

– Find the binlog file and position before “the event” happened

    # mysqlbinlog binlog.000001 | grep truncate -B 2
    #140123 23:37:03 server id 1  end_log_pos 1224 Query thread_id=4 exec_time=0 error_code=0
    SET TIMESTAMP=1390516623/*!*/;
    truncate table speakers

– Restore the backup on one node

    # cp binlog.000001 ~
    # innobackupex --apply-log .
    etc..

– Restart that node (being sure the application doesn't connect to it)

    # /etc/init.d/mysql bootstrap-pxc

Scenario 1: application must be stopped ! (2)

– Replay all the binary logs since the backup, up to BUT not including the position of the event

    # cat xtrabackup_binlog_info
    binlog.000001   565
    # mysqlbinlog binlog.000001 | grep end_log_pos | \
      grep 1224 -B 1
    #140123 23:36:53 server id 1  end_log_pos 1136
    #140123 23:37:03 server id 1  end_log_pos 1224
    # mysqlbinlog binlog.000001 -j 565 \
      --stop-position 1136 | mysql

– Start other nodes 1 by 1 and let them perform SST

– Enable connections from the application
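The position lookup in these steps can be scripted. A minimal sketch, assuming statement-format events as in the example output: `find_stop_position` is a hypothetical helper that reads `mysqlbinlog` text output on stdin and prints the `end_log_pos` of the event just before the first line matching a lowercase pattern. That value is where the offending event starts, so it is the one to pass to `--stop-position`.

```shell
# find_stop_position (hypothetical helper)
#   $1    : lowercase pattern identifying "the event", e.g. 'truncate table speakers'
#   stdin : text output of mysqlbinlog
# Prints the end_log_pos of the event preceding the first matching line.
find_stop_position() {
    pattern="$1"
    awk -v pat="$pattern" '
        /end_log_pos/ {
            prev = cur
            for (i = 1; i < NF; i++) if ($i == "end_log_pos") cur = $(i + 1)
        }
        tolower($0) ~ pat { print prev; exit }
    '
}

# Assumed usage on the restored node:
#   stop=$(mysqlbinlog binlog.000001 | find_stop_position 'truncate table speakers')
#   mysqlbinlog binlog.000001 -j 565 --stop-position "$stop" | mysql
```

With the binlog excerpt from the slides the helper prints 1136, the same value found by hand with grep.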

Scenario 2: application can keep running

We have Xtrabackup (and it creates daily backups!)

We have binary logs

These are the steps:
– Take care of quorum (add garbd, change pc.weight, pc.ignore_quorum)
– Find the binlog file and position before “the event” happened (thank you dim0!)
– Remove one node from the cluster (and be sure the app doesn't connect to it, load-balancer...)

Scenario 2: application can keep running (2)

– Restore the backup on the node we stopped
– Start mysql without joining the cluster (--wsrep-cluster-address=dummy://)
– Replay the binary log until the position of “the event”
– Export the table we need (mysqldump)
– Import it on the cluster
– Restart mysql on the off-line node and let it perform SST
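The replay, export and import steps can be sketched as a dry run that only prints the commands. Everything named here is an assumption for illustration: the helper `print_recovery_plan`, the database name `fosdem` and the cluster host `node1` are hypothetical, while the binlog file, positions and table name reuse the values from Scenario 1.

```shell
# print_recovery_plan (hypothetical helper): prints, without executing, the
# commands for the replay / export / import steps.
#   $1 binlog file    $2 start position (from xtrabackup_binlog_info)
#   $3 stop position  $4 database    $5 table    $6 a live cluster node
print_recovery_plan() {
    binlog="$1"; start="$2"; stop="$3"; db="$4"; table="$5"; node="$6"
    # On the isolated node: replay up to, but not including, "the event".
    echo "mysqlbinlog --start-position=$start --stop-position=$stop $binlog | mysql"
    # Still on the isolated node: export only the table we need.
    echo "mysqldump $db $table > $table.sql"
    # Import the dump through a node that is still part of the cluster.
    echo "mysql -h $node $db < $table.sql"
}

print_recovery_plan binlog.000001 565 1136 fosdem speakers node1
```

Printing first and executing by hand keeps a human in the loop, which seems prudent when the cluster is still serving traffic.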

14

Reduce “donation” time during XtraBackup SST

When performing SST with Xtrabackup the donor can still be active

By default clustercheck reports a donor as unavailable (AVAILABLE_WHEN_DONOR=0)

Running Xtrabackup can increase the load (CPU / IO) on the server

Reduce “donation” time during XtraBackup SST (2)

Using Xtrabackup 2.1 features helps to reduce the time of backup on the donor

[mysqld]
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=root:dim0DidItAgain

[sst]
streamfmt=xbstream

[xtrabackup]
compress
compact
parallel=8
compress-threads=8
rebuild-threads=8

compress & compact can reduce the size of the payload transferred among nodes, but in general they slow down the process

13

Move asynchronous slave to a new master in 5.5

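[Diagram: cluster nodes A, B and C receive writes; an async slave replicates from A. The nodes' binary log coordinates differ: binlog.000002 / 102, binlog.000001 / 402 and binlog.000039 / 102. Slave settings: master_host=A, master_log_file=binlog.000002, master_pos=102]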

Move asynchronous slave to a new master in 5.5 (2)

[Diagram: only B and C remain, with binary log coordinates binlog.000001 / 402 and binlog.000039 / 102. The async slave is repointed to B with its old coordinates: master_host=B, master_log_file=binlog.000002, master_pos=102]

Move asynchronous slave to a new master in 5.5 (3)

How can we know which file and position need to be used by the async slave ?

Find the last received Xid in the relay log on the async slave (using mysqlbinlog)

    # mysqlbinlog percona4-relay-bin.000002 | tail
    MjM5NDMxMDMxOTEtNTI4NzYxMTUxMDctMTM3NTAyNTI2NjUtNTc1ODY3MTc0MTg=
    '/*!*/;
    # at 14611057
    #140131 12:48:12 server id 1  end_log_pos 29105924  Xid = 30097
    COMMIT/*!*/;
    DELIMITER ;
    # End of log file
    ROLLBACK /* added by mysqlbinlog */;
    /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
    /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

Find in the new master which binary position matches that same Xid

    # mysqlbinlog percona3-bin.000004 | grep 'Xid = 30097'
    #140131 12:48:12 server id 1  end_log_pos 28911093  Xid = 30097

Use the binary log file and the position for your CHANGE MASTER statement

    Async mysql> slave stop;
    Async mysql> change master to master_host='percona3',
              -> master_log_file='percona3-bin.000004',
              -> master_log_pos=28911093;
    Async mysql> start slave;
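The two lookups can be chained in a small script. This is a sketch, not a definitive tool: `last_xid` and `pos_for_xid` are hypothetical helpers that only parse `mysqlbinlog` text output in the shape shown on the slides, and the file names in the usage comment are the ones from the example.

```shell
# last_xid (hypothetical helper): prints the last Xid found in mysqlbinlog
# text output read from stdin.
last_xid() {
    sed -n 's/.*Xid = \([0-9][0-9]*\).*/\1/p' | tail -n 1
}

# pos_for_xid (hypothetical helper): prints the end_log_pos of the event that
# carries Xid $1 in mysqlbinlog text output read from stdin.
pos_for_xid() {
    awk -v xid="$1" '
        NF >= 3 && $(NF - 2) == "Xid" && $NF == xid {
            for (i = 1; i < NF; i++) if ($i == "end_log_pos") print $(i + 1)
            exit
        }
    '
}

# Assumed usage:
#   xid=$(mysqlbinlog percona4-relay-bin.000002 | last_xid)      # on the async slave
#   mysqlbinlog percona3-bin.000004 | pos_for_xid "$xid"         # on the new master
```

The printed position is the value for master_log_pos in the CHANGE MASTER statement; the binary log file to scan on the new master still has to be picked by hand.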

12

Move asynchronous slave to a new master in 5.6

With 5.6 and GTID it's easier !

... but ...

It requires rsync SST (binlogs are needed)

Or since Jan 30th

wsrep_sst_xtrabackup-v2 supports Xtrabackup 2.1.7 that makes it possible !!!

Just change master ;-)
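What "just change master" could look like with GTID, as a hedged sketch: it assumes the slave already runs with GTID enabled and uses MySQL 5.6's MASTER_AUTO_POSITION option, so no file or position is needed. `gtid_repoint_sql` is a hypothetical helper that only prints the statements; the host name percona3 is borrowed from the 5.5 example.

```shell
# gtid_repoint_sql (hypothetical helper): prints the statements that repoint a
# GTID-enabled async slave at a new master given as $1.
gtid_repoint_sql() {
    new_master="$1"
    printf "STOP SLAVE;\n"
    printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_AUTO_POSITION=1;\n" "$new_master"
    printf "START SLAVE;\n"
}

# Assumed usage on the async slave (needs a mysql client and privileges):
#   gtid_repoint_sql percona3 | mysql
```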

11

Allow longer downtime for a node

When a node goes off-line and later joins the cluster again, it sends its last replicated event to the donor

If the donor can send all next events, IST will be performed (very fast)

If not... SST is mandatory

Page 72: 15 Tips to improve your Galera Cluster Experience...2/5/14 Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!)We have binary logs These are

2/5/14

Allow longer downtime for a node (2)

Page 73: 15 Tips to improve your Galera Cluster Experience...2/5/14 Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!)We have binary logs These are

2/5/14

Allow longer downtime for a node (2)

Those events are stored on a cache on disk: galera.cache

The size of the cache is 128Mb by default

It can be increased using gcache.size provider option:

Page 74: 15 Tips to improve your Galera Cluster Experience...2/5/14 Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!)We have binary logs These are

2/5/14

Allow longer downtime for a node (2)

Those events are stored on a cache on disk: galera.cache

The size of the cache is 128Mb by default

It can be increased using gcache.size provider option:

Page 75: 15 Tips to improve your Galera Cluster Experience...2/5/14 Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!)We have binary logs These are

2/5/14

Allow longer downtime for a node (2)

Those events are stored on a cache on disk: galera.cache

The size of the cache is 128Mb by default

It can be increased using gcache.size provider option:

Page 76: 15 Tips to improve your Galera Cluster Experience...2/5/14 Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!)We have binary logs These are

2/5/14

Allow longer downtime for a node (2)

Those events are stored in a cache on disk: galera.cache

The size of the cache is 128 MB by default

It can be increased using the gcache.size provider option:

In /etc/my.cnf:

wsrep_provider_options = "gcache.size=1G"
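
A rough way to pick a gcache.size value is to multiply the replication write rate by the downtime you want to survive. The sketch below assumes you sampled a byte counter twice (for example the sum of wsrep_replicated_bytes and wsrep_received_bytes); the function name, the sample values and the 1.5 safety factor are illustrative assumptions, not something from the talk.

```python
def gcache_size_bytes(bytes_start, bytes_end, sample_seconds,
                      downtime_seconds, safety_factor=1.5):
    """Estimate how many bytes of write-sets accumulate during a downtime.

    bytes_start / bytes_end: two samples of a replication byte counter
    sample_seconds: time between the two samples
    downtime_seconds: how long a node should be allowed to stay away
    safety_factor: hypothetical margin for traffic bursts
    """
    if sample_seconds <= 0:
        raise ValueError("sample_seconds must be positive")
    write_rate = (bytes_end - bytes_start) / sample_seconds  # bytes per second
    return int(write_rate * downtime_seconds * safety_factor)


# Hypothetical numbers: 18 MB replicated in 60 s, node may be away for 1 hour.
needed = gcache_size_bytes(0, 18_000_000, 60, 3600)
print(f"gcache.size should be at least {needed / 1024 ** 3:.2f} GiB")
```

The estimate is only as good as the sample: measuring during peak time gives the more conservative value.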

10

Then choose the right donor !

Let's imagine this (the cluster receives writes):

A: Event 1
B: Event 1
C: Event 1

Then choose the right donor ! (2)

Let's imagine this (the cluster receives writes):

A: Event 1, Event 2
B: Event 1, Event 2
C: Event 1

Then choose the right donor ! (3)

Let's imagine this (the cluster receives writes):

A: Event 1, Event 2
B: Event 1, Event 2
C: Event 1

C joins: last event = 1

Then choose the right donor ! (4)

Let's imagine this (the cluster receives writes):

A: Event 1, Event 2
B: Event 1, Event 2
C: Event 1

C receives IST

Then choose the right donor ! (5)

Let's imagine this (the cluster receives writes):

A: Event 1, Event 2
B: Event 1, Event 2
C: Event 1, Event 2

Then choose the right donor ! (6)

Let's imagine this (the cluster receives writes):

A: Event 1, Event 2
B: Event 1, Event 2
C: Let's format the disk

Then choose the right donor ! (7)

Let's imagine this (the cluster receives writes):

A: Event 1, Event 2
B: Event 1, Event 2
C: (empty)

C joins: no cluster info

Then choose the right donor ! (8)

Full SST needed

A: Event 1, Event 2
B: Event 1, Event 2
C: (empty)

C receives SST

Then choose the right donor ! (9)

This is what we have now:

A: Event 1, Event 2, Event 3
B: Event 1, Event 2, Event 3
C: Event 3

Then choose the right donor ! (10)

Let's remove node B for maintenance

A: Event 1, Event 2, Event 3, Event 4
B: Event 1, Event 2, Event 3
C: Event 3, Event 4

Then choose the right donor ! (11)

Now let's remove node C to replace a disk :-(

A: Event 1, Event 2, Event 3, Event 4, Event 5
B: Event 1, Event 2, Event 3
C: (removed)

Then choose the right donor ! (12)

Node C joins again and performs SST

Before the SST:

A: Event 1, Event 2, Event 3, Event 4, Event 5
B: Event 1, Event 2, Event 3

After the SST and one more write:

A: Event 1, Event 2, Event 3, Event 4, Event 5, Event 6
B: Event 1, Event 2, Event 3
C: Event 6

Then choose the right donor ! (13)

Node B joins again but donor selection is not clever yet...

A: Event 1, Event 2, Event 3, Event 4, Event 5, Event 6
B: Event 1, Event 2, Event 3
C: Event 6

B joins: last event = 3

If C is picked as the donor, SST will be needed !
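
The scenario above can be replayed in a few lines. This is a simplified model, not Galera's actual donor selection: it assumes a donor can serve IST only when the first event the joiner is missing is still in the donor's cache. The class and function names are made up for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cached_from: int  # lowest event still held in the node's cache
    last_event: int   # highest event applied on the node


def can_serve_ist(donor, joiner_last_event):
    """True when the donor still holds the first event the joiner is missing."""
    first_missing = joiner_last_event + 1
    return donor.cached_from <= first_missing <= donor.last_event


def ist_capable_donors(candidates, joiner_last_event):
    """Names of the candidates that could bring the joiner back with IST."""
    return [n.name for n in candidates if can_serve_ist(n, joiner_last_event)]


# State from the slides: A kept events 1 to 6, C only has event 6 after its SST.
node_a = Node("A", cached_from=1, last_event=6)
node_c = Node("C", cached_from=6, last_event=6)

# Node B comes back with last event = 3.
print(ist_capable_donors([node_a, node_c], joiner_last_event=3))  # ['A']
```

Only node A qualifies, which is why picking node C forces a full SST for node B.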

Then choose the right donor ! (14)

So how to tell node B that it needs to use node A?

A: Event 1, Event 2, Event 3, Event 4, Event 5, Event 6
B: Event 1, Event 2, Event 3
C: Event 6

B joins: last event = 3

# /etc/init.d/mysql start --wsrep_sst_donor=nodeA

Then choose the right donor ! (15)

With 5.6 you can now find the lowest sequence number in the gcache using wsrep_local_cached_downto

To find the sequence number of the latest event on the node that joins the cluster, you have two possibilities:

# cat grastate.dat
# GALERA saved state
version: 2.1
uuid:    41920174-7ec6-11e3-a05a-6a2ab4033f05
seqno:   11
cert_index:

# mysqld_safe --wsrep-recover
140124 10:46:32 mysqld_safe Logging to '/var/lib/mysql/percona1_error.log'.
140124 10:46:32 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
140124 10:46:32 mysqld_safe Skipping wsrep-recover for 41920174-7ec6-11e3-a05a-6a2ab4033f05:11 pair
140124 10:46:32 mysqld_safe Assigning 41920174-7ec6-11e3-a05a-6a2ab4033f05:11 to wsrep_start_position
140124 10:46:34 mysqld_safe mysqld from pid file /var/lib/mysql/percona1.pid ended
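
Both sources can be read by a small script before starting the node. The sketch below parses the two outputs shown on the slide; the helper names are hypothetical, and the regular expression only assumes the "Assigning <uuid>:<seqno> to wsrep_start_position" line format seen above.

```python
import re

GRASTATE_EXAMPLE = """# GALERA saved state
version: 2.1
uuid:    41920174-7ec6-11e3-a05a-6a2ab4033f05
seqno:   11
cert_index:
"""

RECOVER_LINE = ("140124 10:46:32 mysqld_safe Assigning "
                "41920174-7ec6-11e3-a05a-6a2ab4033f05:11 to wsrep_start_position")


def parse_grastate(text):
    """Return (uuid, seqno) from the contents of grastate.dat."""
    fields = {}
    for line in text.splitlines():
        if line.startswith("#") or ":" not in line:
            continue  # skip the comment header and blank lines
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields["uuid"], int(fields["seqno"])


def parse_recovered_position(log_text):
    """Return (uuid, seqno) from mysqld_safe --wsrep-recover output, or None."""
    match = re.search(
        r"Assigning ([0-9a-f-]+):(-?\d+) to wsrep_start_position", log_text)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


print(parse_grastate(GRASTATE_EXAMPLE))
print(parse_recovered_position(RECOVER_LINE))
```

The joiner's seqno can then be compared with wsrep_local_cached_downto on each candidate donor.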

9

Measuring Max Replication Throughput

Since 5.5.33, wsrep_desync can be used to find out how fast a node can replicate

The process is to collect the number of transactions (events) during peak time for a defined time range (let's take 1 min)

mysql> pager grep wsrep
mysql> show global status like 'wsrep_last_committed';
    -> select sleep(60);
    -> show global status like 'wsrep_last_committed';

| wsrep_last_committed | 61472 |
| wsrep_last_committed | 69774 |

69774 - 61472 = 8302
8302 / 60 = 138.37 tps
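
The same arithmetic as a small helper, in case the two samples come from a monitoring script instead of the mysql client. The function name is an assumption; the numbers are the ones from the slide.

```python
def transactions_per_second(first_sample, second_sample, interval_seconds):
    """Average commit rate between two samples of wsrep_last_committed."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (second_sample - first_sample) / interval_seconds


# Samples taken 60 seconds apart during peak time.
peak_tps = transactions_per_second(61472, 69774, 60)
print(f"peak load: {peak_tps:.1f} tps")  # peak load: 138.4 tps
```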

Measuring Max Replication Throughput

Then collect the number of transactions, and the time needed to process them, after the node has been in desync mode and not allowing writes

In desync mode, the node doesn't send flow control messages to the cluster

set global wsrep_desync=ON;
flush tables with read lock;
show global status like 'wsrep_last_committed';
select sleep( 60 );
unlock tables;

+----------------------+--------+
| Variable_name        | Value  |
+----------------------+--------+
| wsrep_last_committed | 145987 |
+----------------------+--------+

Measuring Max Replication Throughput

In another terminal you run myq_gadget and when wsrep_local_recv_queue (Queue Dn) is back to 0 check again the value of wsrep_last_committed.

LefredPXC / percona3 / Galera 2.8(r165)
Wsrep    Cluster  Node     Queue   Ops     Bytes     Flow    Conflct PApply        Commit
    time P cnf  #  cmt sta  Up  Dn  Up  Dn   Up   Dn pau snt lcf bfa dst oooe oool wind
13:25:24 P   7  3 Dono T/T   0  8k   0   0    0    0 0.0   0   0   0 125    0    0    0
13:25:25 P   7  3 Dono T/T   0  8k   0 197    0 300K 0.0   0   0   0 145   90    0    2
...
13:26:46 P   7  3 Dono T/T   0   7   0 209    0 318K 0.0   0   0   0 139   62    0    1
13:26:47 P   7  3 Dono T/T   0   0   0 148    0 222K 0.0   0   0   0 140   40    0    1

13:25:25: this is when FTWRL is released

13:26:47: this is when Galera catches up (wsrep_last_committed = 165871)

165871 - 145987 = 19884
19884 / 82 = 242.49 tps

We're currently at 57% of our capacity
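
The 57% figure is the ratio between the peak rate measured earlier and the catch-up rate measured here. A minimal sketch of that ratio, using the four counter values from the slides; the function name is hypothetical.

```python
def capacity_used(peak_tps, max_tps):
    """Share of the maximum replication throughput consumed at peak time."""
    if max_tps <= 0:
        raise ValueError("max_tps must be positive")
    return peak_tps / max_tps


peak_tps = (69774 - 61472) / 60    # normal peak-time load, 60 s window
max_tps = (165871 - 145987) / 82   # catch-up rate after desync, 82 s window

print(f"{capacity_used(peak_tps, max_tps):.0%} of capacity")  # 57% of capacity
```

The remaining share is the headroom the node has before it starts slowing down the cluster with flow control.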

8

Multicast replication

By default, Galera uses unicast TCP

One copy of each replication message is sent to every other node in the cluster

Multicast replication (2)

More nodes, more bandwidth

Multicast replication (3)

If your network supports it, you can use multicast UDP for replication

wsrep_provider_options = "gmcast.mcast_addr=239.192.0.11"

wsrep_cluster_address = gcomm://239.192.0.11
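
To see why cluster size matters with unicast, the sketch below counts how many copies of each write-set a node has to send. It is a back-of-the-envelope model with made-up names and an assumed write rate; it ignores protocol overhead and retransmissions.

```python
def outgoing_copies(cluster_size, multicast=False):
    """Copies of one replication message sent by the originating node."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if cluster_size == 1:
        return 0  # nobody to replicate to
    return 1 if multicast else cluster_size - 1


def outgoing_bytes_per_second(write_rate, cluster_size, multicast=False):
    """Replication traffic leaving one node for a given write rate."""
    return write_rate * outgoing_copies(cluster_size, multicast)


# Assumed write rate of 300 kB/s on one node.
for size in (3, 5, 9):
    unicast = outgoing_bytes_per_second(300_000, size)
    mcast = outgoing_bytes_per_second(300_000, size, multicast=True)
    print(f"{size} nodes: unicast {unicast} B/s, multicast {mcast} B/s")
```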

7

SSL everywhere !

It's possible to have the Galera replication encrypted via SSL

But now it's also possible to have SST over SSL, with xtrabackup-v2 and with rsync

https://github.com/tobz/galera-secure-rsync

SSL everywhere : certs creation

openssl req -new -x509 -days 365000 -nodes -keyout key.pem -out cert.pem

The same cert and key must be copied to all nodes

Copy them to /etc/mysql, for example, and let only the mysql user read them
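
A deployment script can check that the key is not readable by anyone else. This sketch only looks at the permission bits; changing the owner to the mysql user is left out because it needs root. The helper names and the temporary file are illustrative assumptions.

```python
import os
import stat
import tempfile


def is_private(path):
    """True when neither group nor others have any permission on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def make_private(path):
    """Restrict the file to read/write for its owner only (mode 0600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


# Demonstration on a throwaway file instead of a real /etc/mysql/key.pem.
with tempfile.TemporaryDirectory() as tmp:
    key_path = os.path.join(tmp, "key.pem")
    with open(key_path, "w") as handle:
        handle.write("placeholder, not a real key\n")
    os.chmod(key_path, 0o644)
    print("before:", is_private(key_path))
    make_private(key_path)
    print("after:", is_private(key_path))
```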

SSL everywhere : galera configuration

wsrep_provider_options = "socket.ssl_cert=/etc/mysql/cert.pem; socket.ssl_key=/etc/mysql/key.pem"

It's possible to set a remote Certificate Authority for validation (use socket.ssl_ca)

All nodes must have SSL enabled
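
Several tips in this talk set wsrep_provider_options, and they all have to end up in one semicolon-separated value. A minimal sketch of building that single line, assuming the options are kept in a dictionary; the function name is hypothetical and the option values are the ones used on the slides.

```python
def build_provider_options(options):
    """Join provider options into one semicolon-separated string."""
    return "; ".join(f"{name}={value}" for name, value in options.items())


options = {
    "gcache.size": "1G",
    "socket.ssl_cert": "/etc/mysql/cert.pem",
    "socket.ssl_key": "/etc/mysql/key.pem",
}

# Line to place in the [mysqld] section of my.cnf.
print(f'wsrep_provider_options = "{build_provider_options(options)}"')
```

Keeping the options in one place avoids a second wsrep_provider_options line silently replacing the first one.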

SSL everywhere : SST configuration

As Xtrabackup 2.1 supports encryption, it's now also possible to use SSL for SST

Use wsrep_sst_method=xtrabackup-v2

[sst]
tkey=/etc/mysql/key.pem
tcert=/etc/mysql/cert.pem
encrypt=3
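
Since every node needs the same settings, a quick check of the configuration file can catch a node where one of them is missing. The sketch below only verifies the keys shown on the slide; the function name and the embedded fragment are assumptions, and a real my.cnf may need more lenient parsing.

```python
import configparser

SST_FRAGMENT = """
[mysqld]
wsrep_sst_method=xtrabackup-v2

[sst]
tkey=/etc/mysql/key.pem
tcert=/etc/mysql/cert.pem
encrypt=3
"""


def missing_sst_ssl_settings(config_text):
    """List the settings from the slide that are absent or different."""
    parser = configparser.ConfigParser(allow_no_value=True)
    parser.read_string(config_text)
    problems = []
    method = parser.get("mysqld", "wsrep_sst_method", fallback=None)
    if method != "xtrabackup-v2":
        problems.append("wsrep_sst_method")
    for key in ("tkey", "tcert", "encrypt"):
        if not parser.has_option("sst", key):
            problems.append(key)
    return problems


print(missing_sst_ssl_settings(SST_FRAGMENT))             # []
print(missing_sst_ssl_settings("[sst]\ntkey=/tmp/k.pem"))
```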

SSL everywhere : SST configuration

And for those using rsync ?

galera­secure­rsync acts like wsrep_sst_rsync but secures the communication with SSL using socat.

Uses also the same cert and key file

wsrep_sst_method=secure_rsync

https://github.com/tobz/galera-secure-rsync

Page 151: 15 Tips to improve your Galera Cluster Experience...2/5/14 Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!)We have binary logs These are

6

Decode GRA* files

When a replication failure occurs, a GRA_*.log file is created in the datadir.

For each of those files, a corresponding message is present in the MySQL error log.

It can be a false positive (a bad DDL statement)... or not!

This is how you can decode the content of such a file:

Decode GRA* files (2)

Download a binlog header file (http://goo.gl/kYTkY2)

Join the header and one GRA_*.log file:

– cat GRA-header > GRA_3_3-bin.log

– cat GRA_3_3.log >> GRA_3_3-bin.log

Now you can just use mysqlbinlog -vvv and find out what the problem was!
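The two cat commands can also be scripted when there are many GRA files to inspect. The following is a minimal Python sketch of the same concatenation; the function name join_header_and_gra is hypothetical, and the file names in the comments are simply the ones used on the slide.

```python
import shutil
from pathlib import Path


def join_header_and_gra(header: Path, gra_log: Path, out: Path) -> Path:
    """Write the binlog header followed by one GRA_*.log file into `out`.

    This is the equivalent of:
        cat GRA-header  >  GRA_3_3-bin.log
        cat GRA_3_3.log >> GRA_3_3-bin.log
    """
    with out.open("wb") as dst:
        for part in (header, gra_log):
            with part.open("rb") as src:
                # Copy raw bytes; the content is binary binlog data.
                shutil.copyfileobj(src, dst)
    return out


# Example call (paths are the ones from the slide, adjust to your datadir):
#   join_header_and_gra(Path("GRA-header"), Path("GRA_3_3.log"),
#                       Path("GRA_3_3-bin.log"))
# The resulting file can then be read with: mysqlbinlog -vvv GRA_3_3-bin.log
```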

wsrep_log_conflicts = 1
wsrep_debug = 1
wsrep_provider_options = "cert.log_conflicts=1"

5

Avoiding SST when adding a new node

It's possible to use a backup to prepare a new node.

These are the 3 prerequisites:

– use XtraBackup >= 2.0.1

– the backup needs to be performed with --galera-info

– the gcache must be large enough (so the missing writesets can still be sent by IST)

Avoiding SST when adding a new node (2)

Restore the backup on the new node

Display the content of xtrabackup_galera_info:
5f22b204-dc6b-11e1-0800-7a9c9624dd66:23

Create the file called grastate.dat like this:
# GALERA saved state
version: 2.1
uuid:    5f22b204-dc6b-11e1-0800-7a9c9624dd66
seqno:   23
cert_index:

mysql> show global status like 'wsrep_provider_version';
+------------------------+-----------+
| Variable_name          | Value     |
+------------------------+-----------+
| wsrep_provider_version | 2.1(r113) |
+------------------------+-----------+
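Writing grastate.dat by hand is error-prone, so the step can be sketched as a small Python helper. This is only an illustration under assumptions: the function name grastate_from_galera_info is hypothetical, the input is assumed to be the single uuid:seqno line shown above, and the version string is passed in by the caller rather than detected.

```python
def grastate_from_galera_info(galera_info: str, version: str = "2.1") -> str:
    """Build the text of grastate.dat from the xtrabackup_galera_info line.

    `galera_info` is expected to look like "<uuid>:<seqno>".
    `version` is an assumption supplied by the caller (2.1 on the slide).
    """
    # The UUID contains hyphens but no colon, so split on the last colon.
    uuid, sep, seqno = galera_info.strip().rpartition(":")
    if not sep or not uuid or not seqno.lstrip("-").isdigit():
        raise ValueError(f"unexpected xtrabackup_galera_info content: {galera_info!r}")
    return (
        "# GALERA saved state\n"
        f"version: {version}\n"
        f"uuid:    {uuid}\n"
        f"seqno:   {int(seqno)}\n"
        "cert_index:\n"
    )


print(grastate_from_galera_info("5f22b204-dc6b-11e1-0800-7a9c9624dd66:23"))
```

The returned text would be saved as grastate.dat in the datadir of the new node before starting MySQL.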

4

Play with quorum and weight

Galera manages quorum.

If a node does not see more than 50% of the total number of nodes, reads and writes are not accepted.

Split brain is prevented.

This requires at least 3 nodes to work properly.

Quorum can be disabled (but be warned!).

You can cheat ;-)

Quorum: loss of connectivity

[Diagram labels: Network Problem; Does not accept Reads & Writes]

Quorum: arbitrator

It's possible to use an arbitrator (garbd) to act as an extra node. All replication traffic will pass through it, but it won't have any MySQL running.

This is useful when storage is available for only 2 nodes, or when you have an even number of nodes.

An odd number of nodes is always advised.

Quorum: cheat !

You can disable quorum, but watch out! (you have been warned):

wsrep_provider_options = "pc.ignore_quorum=true"

You can define the weight of a node to affect the quorum calculation (default is 1):

wsrep_provider_options = "pc.weight=1"
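To see how weights change the outcome of a network split, here is a simplified Python sketch of the "more than 50%" rule. It assumes quorum means that the weight a node can still see is strictly more than half of the total weight; the function name and the list-of-weights representation are illustrative only, and the real calculation in Galera also takes into account nodes that left the cluster gracefully.

```python
from typing import Iterable


def has_quorum(visible_weights: Iterable[int], all_weights: Iterable[int]) -> bool:
    """Return True when the visible weight is a strict majority of the total.

    Simplified model: every node has weight pc.weight (default 1) and
    gracefully departed nodes are ignored.
    """
    return 2 * sum(visible_weights) > sum(all_weights)


# 3 nodes with the default weight, one is cut off: the other two keep quorum.
print(has_quorum([1, 1], [1, 1, 1]))  # True
# 2 nodes with the default weight, split 1/1: nobody has quorum.
print(has_quorum([1], [1, 1]))        # False
# 2 nodes, one with weight 2: that node keeps quorum on its own.
print(has_quorum([2], [2, 1]))        # True
```

The last case is the "cheat": with an even number of nodes, giving one node a higher weight decides which side of a split stays available.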

3

How to optimize WAN replication?

Galera 2 requires point-to-point connections between all nodes for replication

[Diagram: datacenter A, datacenter B]

How to optimize WAN replication? (2)

Galera 3 brings the notion of “cluster segments”

[Diagram: datacenter A, datacenter B]

How to optimize WAN replication? (3)

Segment gateways can change per transaction

[Diagram: datacenter A, datacenter B]

Replication traffic between segments is minimized. Writesets are relayed to the other segment through one node.

[Diagram: datacenter A, datacenter B; labels: commit, WS]

How to optimize WAN replication? (4)

From those local relays, replication is propagated to every node in the segment

[Diagram: datacenter A, datacenter B; labels: commit, WS, WS]

gmcast.segment = 1...255
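The saving on the WAN link can be illustrated with a little arithmetic. The Python sketch below is a simplified model, not Galera code: it assumes that without segments the originating node sends one copy of each writeset to every remote node, and that with segments one copy per remote segment crosses the WAN and is then relayed locally. The function name and the dictionary layout are hypothetical.

```python
from typing import Mapping


def wan_copies_per_writeset(nodes_per_segment: Mapping[str, int],
                            origin: str,
                            use_segments: bool) -> int:
    """Count how many copies of one writeset cross the WAN (simplified model)."""
    if origin not in nodes_per_segment:
        raise KeyError(origin)
    # Node counts of every segment other than the one where the commit happens.
    remote = [count for name, count in nodes_per_segment.items() if name != origin]
    if use_segments:
        # One relay node per remote segment receives the writeset.
        return len(remote)
    # Point-to-point: every remote node receives its own copy.
    return sum(remote)


cluster = {"datacenter A": 3, "datacenter B": 3}
print(wan_copies_per_writeset(cluster, "datacenter A", use_segments=False))  # 3
print(wan_copies_per_writeset(cluster, "datacenter A", use_segments=True))   # 1
```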

2

Load balancers

Galera is generally used in combination with a load balancer.

The most commonly used is HAProxy.

Codership provides one with Galera: glbd

[Diagram: app 1, app 2 and app 3 connect through HA PROXY to node 1, node 2 and node 3]

Load balancers: myths, legends and reality

TIME_WAIT

– Under heavy load, you may have an issue with a large number of TCP connections in TIME_WAIT state

– This can lead to TCP port exhaustion!

How to fix?

– Use the nolinger option in HAProxy (for glbd check http://www.lefred.be/node/168), but this leads to an increase of Aborted_clients if the client is connecting to and disconnecting from MySQL too fast

– Modify the value of tcp_max_tw_buckets
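Before changing anything, it helps to measure how many sockets are actually in TIME_WAIT on the load balancer host. The following Python sketch counts them from the text of /proc/net/tcp on Linux, where the fourth column holds the socket state in hexadecimal and 06 means TIME_WAIT. The function name and the sample rows are illustrative; the sample is shortened and not taken from a real system.

```python
TIME_WAIT = "06"  # hexadecimal TCP state code used in /proc/net/tcp


def count_time_wait(proc_net_tcp: str) -> int:
    """Count sockets in TIME_WAIT state in the text of /proc/net/tcp."""
    count = 0
    # The first line is the column header, so skip it.
    for line in proc_net_tcp.splitlines()[1:]:
        fields = line.split()
        # fields: sl, local_address, rem_address, st, ...
        if len(fields) > 3 and fields[3] == TIME_WAIT:
            count += 1
    return count


# Illustrative, shortened sample (0CEA is port 3306 in hexadecimal).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue
   0: 0100007F:0CEA 00000000:0000 0A 00000000:00000000
   1: 0100007F:0CEA 0100007F:D2F0 06 00000000:00000000
   2: 0100007F:0CEA 0100007F:D2F2 06 00000000:00000000
   3: 0100007F:D2F4 0100007F:0CEA 01 00000000:00000000
"""

print(count_time_wait(SAMPLE))  # 2
```

On a real host the input would be the content of /proc/net/tcp, and the result can be compared with the limit in /proc/sys/net/ipv4/tcp_max_tw_buckets.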

Page 226: 15 Tips to improve your Galera Cluster Experience...2/5/14 Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!)We have binary logs These are

2/5/14

Load balancers: myths, legends and reality

TIME_WAIT– On heavy load, you may have an issue with a large amount of TCP connections in TIME_WAIT state

– This can leas to a TCP port exhaustion !

How to fix ?– Use nolinger option in HA Proxy (for glbd check

http://www.lefred.be/node/168), but this lead to an increase of Aborted_clients is the client is connecting and disconnecting to MySQL too fast

– Modify the value of tcp_max_tw_buckets

Page 231: 15 Tips to improve your Galera Cluster Experience...2/5/14 Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!)We have binary logs These are

Load balancers: common issues

Persistent Connections

– Many people expect the following scenario:

[Diagram: app 1, app 2 and app 3 connect through HA PROXY to node 1, node 2 and node 3]

Page 232: 15 Tips to improve your Galera Cluster Experience...2/5/14 Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!)We have binary logs These are

Load balancers: common issues

Persistent Connections

– When the node specified to receive the persistent writes fails, for example

[Diagram: app 1, app 2 and app 3 connect through HA PROXY to node 1, node 2 and node 3]

Page 233: 15 Tips to improve your Galera Cluster Experience...2/5/14 Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!)We have binary logs These are

Load balancers: common issues

Persistent Connections

– When the node is back online...

[Diagram: app 1, app 2 and app 3 connect through HA PROXY to node 1, node 2 and node 3]

Page 234: 15 Tips to improve your Galera Cluster Experience...2/5/14 Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!)We have binary logs These are

Load balancers: common issues

Persistent Connections

– Only new connections will use the preferred node again

[Diagram: app 1, app 2 and app 3 connect through HA PROXY to node 1, node 2 and node 3]

Page 238: 15 Tips to improve your Galera Cluster Experience...2/5/14 Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!)We have binary logs These are

Load balancers: common issues

Persistent Connections

– HA Proxy decides where the connection will go at the TCP handshake

– Once the TCP session is established, the session stays where it is!

Solution?

– With HA Proxy 1.5 you can now specify the following option:

on-marked-down shutdown-sessions
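
The behaviour described on this slide can be illustrated with a toy model. The sketch below is not HA Proxy code: ToyBalancer and its methods are hypothetical names, and the shutdown_on_down flag only imitates the effect of on-marked-down shutdown-sessions. It assumes the first node in the list is the preferred one and that a node marked down by a health check may still hold open TCP sessions.

```python
class ToyBalancer:
    """Toy model: a node is chosen once, when the connection is opened."""

    def __init__(self, nodes, shutdown_on_down=False):
        self.nodes = list(nodes)      # in order of preference
        self.up = set(self.nodes)     # nodes currently passing health checks
        self.sessions = {}            # client name -> node holding its session
        self.shutdown_on_down = shutdown_on_down

    def connect(self, client):
        # Routing happens here only (the TCP handshake on the slide).
        node = next((n for n in self.nodes if n in self.up), None)
        if node is None:
            raise RuntimeError("no node available")
        self.sessions[client] = node
        return node

    def mark_down(self, node):
        self.up.discard(node)
        if self.shutdown_on_down:
            # Imitates 'on-marked-down shutdown-sessions': drop the sessions
            # so that clients reconnect and get routed again.
            self.sessions = {c: n for c, n in self.sessions.items() if n != node}

    def mark_up(self, node):
        # Established sessions are never moved back to a returning node.
        self.up.add(node)


lb = ToyBalancer(["node1", "node2", "node3"])
lb.connect("app1")            # routed to node1, the preferred node
lb.mark_down("node1")         # health check fails, app1 keeps its session
lb.connect("app2")            # routed to node2
lb.mark_up("node1")           # app2 stays on node2
lb.connect("app3")            # only this new connection uses node1 again
```

With shutdown_on_down=True, the session of app1 is dropped as soon as node1 is marked down, so the client has to reconnect and lands on a healthy node.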

Page 244: 15 Tips to improve your Galera Cluster Experience...2/5/14 Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!)We have binary logs These are

11

Page 245: 15 Tips to improve your Galera Cluster Experience...2/5/14 Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!)We have binary logs These are

Taking backups without stalls

Page 246: 15 Tips to improve your Galera Cluster Experience...2/5/14 Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!)We have binary logs These are

Taking backups without stalls

To take a consistent backup, you need to issue FLUSH TABLES WITH READ LOCK (FTWRL)

By default, even Xtrabackup does this

While the lock is held the node cannot apply write sets, its receive queue grows, and this triggers Flow Control in Galera

So how can we deal with that?

Page 251: 15 Tips to improve your Galera Cluster Experience...2/5/14 Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!)We have binary logs These are

Taking backups without stalls

Choose the node from which you want to take the backup

Change the state to 'Donor/Desynced' (see tip 9)

set global wsrep_desync=ON

Take the backup

Wait until wsrep_local_recv_queue is back down to 0

Change the state back to 'Joined'

set global wsrep_desync=OFF
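
The steps above could be scripted along these lines. This is only a sketch under stated assumptions: backup_desynced is a hypothetical helper, run_sql is assumed to be a caller-supplied function that executes a statement on the chosen node and returns the rows as a list of tuples, and take_backup is assumed to be a caller-supplied function that runs the backup tool. Neither is provided by Galera or Xtrabackup.

```python
import time


def wait_until_drained(run_sql, poll_seconds=1.0, sleep=time.sleep):
    """Poll wsrep_local_recv_queue until the node has caught up."""
    while True:
        rows = run_sql("SHOW GLOBAL STATUS LIKE 'wsrep_local_recv_queue'")
        # Each row is (variable_name, value); the value comes back as text.
        if int(rows[0][1]) == 0:
            return
        sleep(poll_seconds)


def backup_desynced(run_sql, take_backup, poll_seconds=1.0, sleep=time.sleep):
    """Desync the node, back it up, wait for the queue to drain, then resync.

    run_sql and take_backup are hypothetical hooks supplied by the caller.
    """
    run_sql("SET GLOBAL wsrep_desync=ON")
    try:
        take_backup()
    finally:
        # Even if the backup failed, catch up first and then switch desync
        # off, so the node does not trigger Flow Control when it rejoins.
        wait_until_drained(run_sql, poll_seconds, sleep)
        run_sql("SET GLOBAL wsrep_desync=OFF")
```

Passing the SQL and backup steps in as functions keeps the sketch independent of any particular MySQL driver or backup command; in practice run_sql would wrap a connection to the node chosen in the first step.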

Page 258: 15 Tips to improve your Galera Cluster Experience...2/5/14 Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!)We have binary logs These are

00

Page 259: 15 Tips to improve your Galera Cluster Experience...2/5/14 Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!)We have binary logs These are

GO !