![Page 1: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/1.jpg)
2012 © Trivadis
BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN
TechEventApril 2013
Galera ClusterSynchronous Multi-Master Replication for MySQL HA
Ludovico CALDARA
LS-IMS
27.04.2013
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
1
![Page 2: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/2.jpg)
2012 © Trivadis
MySQL forks: which one is better?
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
2
� New forks
� Many new features
� Improved instrumentation
� New solutions for DEVs and DBAs
� Fast-paced competition between forks’ developers
� Recent evolutions in HA and scalability have made MySQL enterprise ready
MySQL Oracle MySQL
Percona Server
MariaDB
Drizzle
![Page 3: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/3.jpg)
2012 © Trivadis
There is no recipe that can satisfy all tastes
Percona Server MariaDB MySQL
Multi source replication NO YES (rel. 10) NO
NoSQL integration YES (cassandra) YES (cassandra) YES (memcached)
Virtual Columns NO YES NO
Improved diagnostics YES NO NO
Online DDL NO YES YES
Galera Cluster YES YES YES (codership patch)
Many many others YES/NO YES/NO YES/NO
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
3
![Page 4: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/4.jpg)
2012 © Trivadis
Your real requirements will let you choose… Need HA?
• How will react your customer if there is an important loss of service?
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
4
![Page 5: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/5.jpg)
2012 © Trivadis
MySQL Multi-Master Replication• Complex and not reliable• Concurrent writes lead to logical corruption
Native MySQL Replication• Doesn’t scale writes• Complex to promote slaves
Old-school solutions have weaknesses
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
5
DRBD Replication• Standby is offline, doesn’t scale at all• Poor performance
NDB
NDB
NDB
MySQL Cluster• Very complex• It’s not InnoDB!
![Page 6: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/6.jpg)
2012 © Trivadis
Continuent Tungsten Replicator• Similar to Golden Gate• Heterogeneous databases• Provides complex topologies• Asynchronous• Conflicts are complex to resolve• Complex to maintain• Not free
New school solutions: 3rd parties are playing a decisive role
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
6
ORACLE
MYSQL
MYSQL
ORACLEMYSQL
Galera Cluster Replication• Transparent Multi-Master easy to mantain• (Virtually) Synchronous• It’s InnoDB (only InnoDB)• Great and easy scalability• Optimistic locking (side effects)• At least 3 nodes for good HA
![Page 7: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/7.jpg)
2012 © Trivadis
Multi-Master and virtually synchronous: it’s transparent
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
7
R/W R/W R/W R/W R/W
![Page 8: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/8.jpg)
2012 © Trivadis
Cluster implementation - Ingredients
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
8
• One or more standalone servers (either physical or virtual)
• Linux (other operating systems are not yet available)
• “Permissive” Firewall between nodes
• Codership’s Galera Library package
• A package of your choice:
• Percona XtraDB Cluster
• MariaDB Galera Cluster
• MySQL with wsrep patch (patched by Codership)
![Page 9: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/9.jpg)
2012 © Trivadis
Cluster implementation - Variables
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
9
• Each server’s my.cnf must contain:
• wsrep_cluster_address=gcomm://192.168.1.100,…,192.168.1.10x
• wsrep_provider=/usr/lib64/libgalera_smm.so
• binlog_format=ROW
• default_storage_engine=InnoDB
• innodb_autoinc_lock_mode=2
• innodb_locks_unsafe_for_binlog=1 #disables gap locking
![Page 10: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/10.jpg)
2012 © Trivadis
Cluster implementation – Start the cluster
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
10
mysqld_safe --wsrep_cluster_address=gcomm:// &
[…]130220 17:56:46 [Note] WSREP: Starting new group from scratch : […]
The empty gcomm:// address starts the node as the first of the cluster
NEVER USE IT TO JOIN AN EXISTING CLUSTER
![Page 11: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/11.jpg)
2012 © Trivadis
Cluster implementation – Adding nodes to the cluster
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
11
mysqld_safe \
--wsrep_cluster_address=gcomm://host1,host2… &
[…]130220 18:01:56 [Note] WSREP: Shifting OPEN -> PRIMARY (TO:…)130220 18:01:56 [Note] WSREP: State transfer required: […]
The address should be already present in the my.cnf!
![Page 12: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/12.jpg)
2012 © Trivadis
Server State Transfer
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
12
• The joiner asks for a SST
R/W R/W R/W
![Page 13: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/13.jpg)
2012 © Trivadis
Server State Transfer
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
13
• The joiner asks for a SST
R/W R/W
• The cluster chooses a donor, the donor is taken offline
DONOR
![Page 14: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/14.jpg)
2012 © Trivadis
Server State Transfer
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
14
• The joiner asks for a SST
R/W R/W R/W
• The cluster chooses a donor, the donor is taken offline
DONOR
• The donor is backed up
• The donor comes online again and the joiner is loaded
![Page 15: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/15.jpg)
2012 © Trivadis
Server State Transfer
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
15
• The joiner asks for a SST
R/W R/W R/W
• The cluster chooses a donor, the donor is taken offline
DONOR
• The donor is backed up
• The donor comes online again and the joiner is loaded
• The joiner replays the missing transactions and joins the cluster
R/W
![Page 16: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/16.jpg)
2012 © Trivadis
Server State Transfer
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
16
• The joiner asks for a SST
R/W R/W R/W
• The cluster chooses a donor, the donor is taken offline
• The donor is backed up
• The donor comes online again and the joiner is loaded
• The joiner replays the missing transactions and joins the cluster
R/W• The cluster can also do
Incremental State Transfers (IST)
![Page 17: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/17.jpg)
2012 © Trivadis
Split-Brain
• The majority of nodes wins
• Complete loss of network: all nodes go offline
• The offline nodes will respond:mysql> select * from emp;
ERROR 1047 (08S01): Unknown command
• Galera arbitrator (garbd) can join the cluster and count as a member in split brain resolution.
• NEW: Galera 2.4 intruduces weighted quorum
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
17
gararbitrator
![Page 18: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/18.jpg)
2012 © Trivadis
WAN
Example 1: Arbitrator in Trivadis Swiss
… sorry for German/Austrian attenders ☺
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
18
arbitrator
ZURICH
BASEL
BERN
LAUSANNE
• If the WAN connection is lost, Zurich survives
• If the Zurich site is lost, the cluster will be off lined
![Page 19: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/19.jpg)
2012 © Trivadis
WAN
Example 2: Arbitrator in Trivadis Swiss
… sorry for German/Austrian attenders ☺
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
19
arbitrator
ZURICH
BASEL
BERN
LAUSANNE
• If the Zurich site is lost, the other sites survive
• If the WAN connection is lost, the cluster will be off lined
![Page 20: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/20.jpg)
2012 © Trivadis
What does “Virtually synchronous” mean? In brief:
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
20
Write
![Page 21: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/21.jpg)
2012 © Trivadis
What does “Virtually synchronous” mean? In brief:
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
21
Write
Commit
WS
![Page 22: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/22.jpg)
2012 © Trivadis
What does “Virtually synchronous” mean? In brief:
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
22
Write
Commit
WS WS WS
![Page 23: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/23.jpg)
2012 © Trivadis
What does “Virtually synchronous” mean? In brief:
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
23
Write
Commit
WS WS
Commit OK
![Page 24: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/24.jpg)
2012 © Trivadis
What does “Virtually synchronous” mean? In brief:
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
24
Write
Commit
WS WS
Commit OK
• Writes are as fast as if they were local
• Commits take just the time of a network roundtrip: if acceptable then the cluster can be spread geographically
![Page 25: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/25.jpg)
2012 © Trivadis
Optimistic locking leads to side effects
mysql> update emp set salary=‘peanuts’ where name=‘Caldara' ;Query OK , 1 row affected (0.03 sec)Rows matched: 1 Changed: 1 Warnings: 0
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
25
![Page 26: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/26.jpg)
2012 © Trivadis
Optimistic locking leads to side effects
mysql> update emp set salary=‘peanuts’ where name=‘Caldara' ;Query OK , 1 row affected (0.03 sec)Rows matched: 1 Changed: 1 Warnings: 0
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
26
mysql> update emp set salary=‘one billion' where name=‘Cald ara';Query OK , 1 row affected (0.03 sec)Rows matched: 1 Changed: 1 Warnings: 0
![Page 27: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/27.jpg)
2012 © Trivadis
Optimistic locking leads to side effects
mysql> update emp set salary=‘peanuts’ where name=‘Caldara' ;Query OK , 1 row affected (0.03 sec)Rows matched: 1 Changed: 1 Warnings: 0
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
27
mysql> update emp set salary=‘one billion' where name=‘Cald ara';Query OK , 1 row affected (0.03 sec)Rows matched: 1 Changed: 1 Warnings: 0mysql> commit ;
WS
Query OK , 0 rows affected (0.01 sec
WS
![Page 28: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/28.jpg)
2012 © Trivadis
Optimistic locking leads to side effects
mysql> update emp set salary=‘peanuts’ where name=‘Caldara' ;Query OK , 1 row affected (0.03 sec)Rows matched: 1 Changed: 1 Warnings: 0
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
28
mysql> update emp set salary=‘one billion' where name=‘Cald ara';Query OK , 1 row affected (0.03 sec)Rows matched: 1 Changed: 1 Warnings: 0mysql> commit ;
mysql> commit ;ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
Query OK , 0 rows affected (0.01 sec
WS
mysql> select salary from emp where name=‘Caldara’;+-------------+| salary |+-------------+| one billion |+-------------+
![Page 29: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/29.jpg)
2012 © Trivadis
Conclusions on optimistic locking…
• Cluster-wise, the first that broadcasts its commit wins (it’s Galera…)
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
29
• The application should not have hotspots...
• … or it should retry the transaction after the deadlock occurs…
• … or, for each database, you can elegy one node as the master
• Locally, the first that acquires the lock wins (it’s InnoDB…)
![Page 30: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/30.jpg)
2012 © Trivadis
About performance
• Commit performance loss is between 5% and 10% plus the network RTT
• Write workloads scale to up to 8 nodes
• >8 nodes: it scales reads, not writes
• Many benchmarks show that Galera overcomes NDB with few nodes
• NDB scales out more with many nodes thanks to data sharding
• Benchmarks on internet are not always reliable… test the performance of YOUR application
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
30
![Page 31: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/31.jpg)
2012 © Trivadis
How to migrate
• Converts all your tables to InnoDB
• Double-check that all tables have primary keys
• Think about potential problems caused by triggers (if you have any)
• Create a new empty Galera Cluster
• Setup MySQL native replication between the old database and the Galera cluster
• Once all is aligned, direct your clients on the new cluster
• Setup the old node to join the cluster
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
31
NATIVEREPLICATION
JOIN
![Page 32: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/32.jpg)
2012 © Trivadis
Load balancing
• HAProxy is the most used solution so far
• Codership is actively developing his own load balancer: Galera Load Balancer (glbd)
• Several balancing modes: round robin, custom, least connected, …
• Automatically drains disconnected nodes
• New nodes can be added with a single tcpcall
• Release 1.0 (now rc1) will support watchdog and automatic discover of nodes composing the cluster
• Other methods possible (e.g. java connector properties, HW load balancer)
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
32
![Page 33: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/33.jpg)
2012 © Trivadis
Conclusions on Galera Cluster
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
33
•
•
•
•
•
•
•
• At least 3 nodes to have good HA
• Optimistic locking (side effects)
• Explicit locking doesn’t work
• Only InnoDB is replicated
• Primary keys are mandatory
• Not yet available for MySQL 5.6
• Linux only
• Multi-master
•
•
•
•
•
•
• Easy to add/remove nodes
• Multi-master
• Shared-nothing
• Great performances and scalability
• «Virtually» synchronous
• It uses InnoDB!!
• Conflict prevention
• Split-brain (no inconsistencies)
• Easy to add/remove nodes
![Page 34: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/34.jpg)
2012 © Trivadis
Links
http://www.slideshare.net/skysql/galera-cluster-by-seppo-jaakola-codership-at-skysql-roadshow-in-stuttgart-2013
http://www.codership.com/files/presentations/Galera_Replication_PLL_2011.pdf
http://www.mysqlperformanceblog.com/2013/01/31/feature-in-details-incremental-state-transfer-after-a-node-crash-in-percona-xtradb-cluster/
http://www.percona.tv/percona-webinars/migrating-to-percona-xtradb-cluster
http://www.codership.com/content/5-tips-migrating-your-mysql-server-galera-cluster
http://www.mysqlperformanceblog.com/2012/08/17/percona-xtradb-cluster-multi-node-writing-and-unexpected-deadlocks/
http://www.mysqlperformanceblog.com/2012/11/20/understanding-multi-node-writing-conflict-metrics-in-percona-xtradb-cluster-and-galera/
http://www.mysqlperformanceblog.com/2011/10/13/benchmarking-galera-replication-overhead/
http://karlssonondatabases.blogspot.ch/2012/12/galera-features-beyond-just-ha.html
http://infoscience.epfl.ch/record/52305/files/IC_TECH_REPORT_199908.pdf
http://www.inf.usi.ch/faculty/pedone/Paper/2005/2005WDIDDR.pdf
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
34
![Page 35: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/35.jpg)
2012 © Trivadis
Little demo?
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
35
![Page 36: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/36.jpg)
2012 © Trivadis
?
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
36
![Page 37: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA](https://reader034.vdocuments.us/reader034/viewer/2022052307/554dd371b4c905c70e8b4972/html5/thumbnails/37.jpg)
2012 © Trivadis
BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN
THANK YOU.Trivadis SA
Ludovico CaldaraSenior Consultant
27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA
37