development of fault-tolerant failover tools with mysql utilities - mysql connect 2013 [con4276]

63
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. 1

Upload: paulo-jesus

Post on 09-May-2015

2.046 views

Category:

Technology


3 download

DESCRIPTION

The occurrence of failures and crashes can compromise the high availability of your database system, affecting your revenue and reputation. Therefore, it is fundamental to minimize downtime and have an efficient strategy for crash recovery. Replication and failover are commonly applied to deal with those situations, but what if failures occur during the recovery process? This can really be a headache, so it is better to be prepared. This session discusses the development of fault-tolerant failover solutions using the MySQL utilities library and covers the following topics: • Issues during failover/switchover • Fault-tolerant failover solutions • Using the MySQL utilities library to provide your own solution

TRANSCRIPT

Page 1: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.1

Page 2: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.2

Development of Fault-Tolerant Failover Tools with MySQL Utilities

Dr. Paulo Jesus

Software Developer – MySQL Utilities

Dr. Lars Thalmann

MySQL Development Director

Page 3: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.3

Program Agenda

Introduction: Faults, Fault-Tolerance and Failover

MySQL Failover

– How to Handle Master Crash, Slave Crash and Connection Failures

Automatically!

Introducing the mysqlfailover utility

– Fault-Tolerance and Taking Advantage of its Features

Final Remarks

Page 4: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.4

Introduction: Faults, Fault-tolerance and Failover

Page 5: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.5

Faults

Crash

– Server crash and stop due to a cosmic ray (more likely software/hardware

malfunction)

Message loss

– Communication channel fails

Byzantine

– Data corruption or malicious attacks

Different types of faults can lead to system failures

Page 6: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.6

Fault-Tolerance

Provide high-availability

Improve reliability

Reduce system downtime and revenue loss

Typically obtained through redundancy

– Example: replication

Fault-tolerance is a property that enables the system to continue

operating properly when faults occur.

Page 7: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.7

Failover

Fault-tolerant strategy:

– Promote a slave as the new master

– Reduce system downtime

* Switchover is the transfer of the master role in an otherwise healthy topology

and is performed manually. In many cases, the original master can be returned

to the topology as a slave.

Failover can be achieved by automatically switching to another server

(redundant or standby) upon failure. *

Page 8: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.8

Replication topology:

Context: MySQL ReplicationFaults, Fault-Tolerance and Failover

Master: A

Slaves: B, C and D

Asynchronous

replication

Atrx1

trx2

trx3

B

C

Dtrx1 trx1

trx2

trx3

trx1

trx2

trx2, trx3

trx3

Page 9: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.9

Replication topology:

Context: MySQL ReplicationFaults, Fault-Tolerance and Failover

Master: A

Slaves: B, C and D

Atrx1

trx2

trx3

B

C

Dtrx1 trx1

trx2

trx3

trx1

trx2

trx2, trx3

trx3

Here comes the cosmic ray!!!

Page 10: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.10

Not critical...

Context: MySQL ReplicationFaults, Fault-Tolerance and Failover

Partial impact:

– Reduce reads

performance

Replace the slave

Atrx1

trx2

trx3

B

C

Dtrx1 trx1

trx2

trx3

trx1

trx2

trx2, trx3

trx3

Slave crashed

Page 11: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.11

Replication topology:

Context: MySQL ReplicationFaults, Fault-Tolerance and Failover

Master: A

Slaves: B, C and D

Atrx1

trx2

trx3

B

C

Dtrx1 trx1

trx2

trx3

trx1

trx2

trx2, trx3

trx3

Page 12: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.12

Replication topology:

Context: MySQL ReplicationFaults, Fault-Tolerance and Failover

Master: A

Slaves: B, C and D

Atrx1

trx2

trx3

B

C

Dtrx1 trx1

trx2

trx3

trx1

trx2

trx2, trx3

trx3

Connection failure

Page 13: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.13

Not critical…

Context: MySQL ReplicationFaults, Fault-Tolerance and Failover

Partial impact:

– Outdated data

read from slave

Restore connection

Atrx1

trx2

trx3

B

C

Dtrx1 trx1

trx2

trx3

trx1

trx2

trx2, trx3

trx3

Master-Slave Connection lost

Page 14: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.14

Replication topology:

Context: MySQL ReplicationFaults, Fault-Tolerance and Failover

Master: A

Slaves: B, C and D

Atrx1

trx2

trx3

B

C

Dtrx1 trx1

trx2

trx3

trx1

trx2

trx2, trx3

trx3

Page 15: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.15

Replication topology:

Context: MySQL ReplicationFaults, Fault-Tolerance and Failover

Master: A

Slaves: B, C and D

Atrx1

trx2

trx3

B

C

Dtrx1 trx1

trx2

trx3

trx1

trx2

trx2, trx3

trx3

Here comes another cosmic ray!!!

Page 16: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.16

Now, it gets serious!

Context: MySQL ReplicationFaults, Fault-Tolerance and Failover

Replication stops

– Reads only

– No more writes

Failover is needed

Atrx1

trx2

trx3

B

C

Dtrx1 trx1

trx2

trx3

trx1

trx2

trx2, trx3

trx3

Master crashed

Page 17: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.17

Failover process:

Context: MySQL ReplicationFaults, Fault-Tolerance and Failover

Promote the

“best”* slave as

master

Connect remaining

slaves to the new

master

Atrx1

trx2

trx3

B

C

D

trx1

trx1

trx2

trx3

trx1

trx2

trx2, trx3

trx3

* Here it is the most up-to-date (for simplicity).

Page 18: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.18

Why ?

Context: MySQL ReplicationFailover Is Not So Easy…

New master might

not be the most up-

to-date *

– But must be at the

end of failover

Ensure data

consistency

Atrx1

trx2

trx3

B

C

D

trx1

trx1

trx2

trx3

trx1

trx2

* The “best” candidate might be the one with the best hardware.

Page 19: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.19

MySQL Failover

Page 20: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.20

Relay log details:

Relay LogsMySQL Failover

Slave B: trx2

Slave C: <none>

Slave D: trx2, trx3

Slave B and C did

not receive trx3

Atrx1

trx2

trx3

B

C

Dtrx1

trx1

trx1

trx2

trx3

trx3trx2

trx2

trx3

binlog

relay

relay

relay

Page 21: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.21

Master crashed!

Relay LogsMySQL Failover

Slave D is the most

up-do-date

– SHOW SLAVE STATUS

– Master log file and

position or

retrieved GTIDs

Atrx1

trx2

trx3

B

C

Dtrx1

trx1

trx1

trx2

trx3

trx3trx2

trx2

trx3

binlog

relay

relay

relay

Most up-to-date:

Check slave

status

Page 22: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.22

Failover:

Relay LogsMySQL Failover

Promote D as the

new master

– STOP SLAVE,

CHANGE MASTER

TO*, START SLAVE

Atrx1

trx2

trx3

B

C

D

trx1

trx1

trx1

trx2

trx2

trx2

trx3

binlog

relay

relay

relay

Hum... Something is not right...

binlog

* CHANGE MASTER TO deletes all relay log files and starts a

new one, unless relay_log_file or relay_log_pos are specified.

Page 23: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.23

(1) What can happen:

Relay LogsMySQL Failover

“D” consume trxs on

relay log

Atrx1

trx2

trx3

B

C

D

trx1

trx1

trx2

trx3

trx1

trx2

trx2

binlog

relay

relay

relay binlog

Page 24: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.24

(1) What can happen:

Relay LogsMySQL Failover

“D” consume trxs on

relay log

Slaves read trxs

Atrx1

trx2

trx3

B

C

D

trx1

trx1

trx2

trx3

trx1

trx2

trx2, trx3

trx2

trx3

trx2

trx3

binlog

relay

relay

relay binlog

trx2, trx3

Page 25: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.25

(1) What can happen:

Relay LogsMySQL Failover

“D” consume trxs on

relay log

Slaves read trxs

Slaves apply trxs

SQL thread stops on “C”

Atrx1

trx2

trx3

B

C

D

trx1

trx2

trx3

trx1

trx2

trx3

trx1

trx2

trx2

trx3

binlog

relay

relay

relay

Conflicting transactions!

binlog

SQL thread error!

Page 26: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.26

(2) What can happen:

Relay LogsMySQL Failover

Delete relay log on

the new master

– RESET SLAVE

Atrx1

trx2

trx3

B

C

D

trx1

trx1

trx1

trx2

trx2

trx3

binlog

relay

relay

relay binlog

Page 27: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.27

(2) What can happen:

Relay LogsMySQL Failover

Delete relay log on

the new master

– RESET SLAVE

trx2 only on slave B and

trx3 is lost.

Atrx1

trx2

trx3

B

C

D

trx1

trx1

trx1

trx2

trx2

trx3

binlog

relay

relay

relay binlog

trx3 is lost!

Transactions lost!

Page 28: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.28

Solution:

Relay LogsMySQL Failover

Consume all

transactions on the

relay logs before

promoting the new

master

Atrx1

trx2

trx3

B

C

D

trx1

trx2

trx1

trx2

trx3

trx1

trx2

binlog

relay

relay

relay binlog

Page 29: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.29

New master is not the most up-to-date:

New Master SynchronizationMySQL Failover

“Best” candidate is

the one with better:

– Hardware

– Location

It is not the most up-

to-date

Atrx1

trx2

trx3

B

C

Dtrx1

trx1

trx1

trx2

trx3

trx3trx2

trx2

trx3

binlog

relay

relay

relay

Page 30: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.30

Failover:

New Master SynchronizationMySQL Failover

Slave B is the “best”

candidate

Consume relay logs

Promote “B” to

master

Atrx1

trx2

trx3B

C

D

trx1

trx2

trx1

trx2

trx3

trx1

trx2

binlog

relay

relay

relay

Page 31: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.31

Failover:

New Master SynchronizationMySQL Failover

Slave B is the “best”

candidate

Consume relay logs

Promote “B” to

master

Atrx1

trx2

trx3B

C

D

trx1

trx2

trx1

trx2

trx3

trx1

trx2

binlog

relay

relay

relay

Data consistency issue!

trx3 is only on “D”!

Page 32: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.32

Solution:

New Master SynchronizationMySQL Failover

Synchronize candidate

before promoting it:

Get and apply

differential changes

from the most up-to-

date slave or loop

throw all slaves

Atrx1

trx2

trx3B

C

D

trx1

trx2

trx3

trx1

trx2

trx3

trx1

trx2

binlog

relay

relay

relay

trx3

Page 33: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.33

Solution:

New Master SynchronizationMySQL Failover

New master must

become one of the

most up-to-date

before being

promoted

Atrx1

trx2

trx3B

C

D

trx1

trx2

trx3

trx1

trx2

trx3

trx1

trx2

binlog

relay

relay

relay

Note: If candidate is too delayed it might take too much

time to synchronize... Better to choose another one?

Page 34: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.34

Failover With or Without GTIDs

Without GTIDs:

– Handle files and positions

Headache to determine file and position to synchronize candidates

Headache to determine file and position for CHANGE MASTER TO

With GTIDs (from MySQL 5.6.5):

– Handle GTID sets

Simple sets manipulation to determine missing transactions

Automatic: CHANGE MASTER TO ... MASTER_AUTO_POSITION = 1

No headache to determine new master’s file and position

MySQL Failover

Page 35: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.35

Failover With or Without GTIDs

Use of GTIDs simplifies the failover task:

– Removes complexity of handling files and positions

– Opens the door to support more complex multi-tier replication topologies

Still a difficult task to achieve (manually):

– Detecting master failure

– Requires profound knowledge of MySQL replication

– Is time consuming and error-prone

Wouldn’t it be great if we had a tool to do it automatically?

MySQL Failover

Page 36: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.36

Introducing the mysqlfailover Utility

Page 37: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.37

Introducing the mysqlfailover Utility

Features

– Configurable timeouts to detect master failures (interval and ping)

– Configurable failover modes (auto, elect, fail)

– Optimized failover algorithm

http://svenmysql.blogspot.pt/2013/03/flexible-fail-over-policies-using-mysql.html

– Automatic slave discovery

– Logging to file

Utility to report the health of a replication topology (master and its slaves)

and perform automatic failover.

New

Page 38: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.38

Introducing the mysqlfailover Utility

Features

– Two operation modes:

Console (default)

Daemon (only on POSIX platforms)

– Support for new authentication mechanism (login-path)

– Detect execution of multiple instances (for the same master)

– Detect errant transactions

– Pedantic execution mode

– Extension points (allow execution of external scripts)

New

New

New

New

Page 39: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.39

Introducing the mysqlfailover Utility

Requirements

– All servers with version >= MySQL 5.6.9 and GTID_MODE=ON:

--report-host, --report-port, --log-slave-updated, --enforce-gtid-consistency, and

--master-info-repository=TABLE

More information

– http://dev.mysql.com/doc/workbench/en/mysqlfailover.html

Example of use

mysqlfailover --master=m1 --slaves=s2,s1,s3 --daemon=start --log=failover.txt

Note: m1, s1, s2 and s3 are login-path in the mylogin.cnf file used for security and

usability purposes, avoiding the use of plain text connection strings, such as

rpluser:mypass@master_host:3306.

Page 40: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.40

mysqlfailover in action:

Introducing the mysqlfailover Utility

Master fault detection

– Are you alive?

Get health data from

master and slaves

Atrx1

trx2

trx3

B

C

Dtrx1 trx1

trx2

trx3

trx1

trx2

trx2, trx3

trx3

Simple, open source, no impact on system

performance!

Page 41: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.41

Introducing the mysqlfailover Utility

Page 42: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.42

Fault-Tolerance of mysqlfailover and Taking Advantage of its Features

Page 43: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.43

Fault-Tolerance of mysqlfailover

How faults affect mysqlfailover?

– Server crash

– Connection failure

How to improve fault-tolerance?

– Tips and tricks

– Extension points

Errant transactions

– Good practice to avoid them

What will be discussed

Page 44: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.44

Server crash:

Fault-Tolerance of mysqlfailover

Utility stopped

Replication topology

not affected

Easy to start a new

instance of the utility

– Use --force option,

overwrite instance

registration

Atrx1

trx2

trx3

B

C

Dtrx1 trx1

trx2

trx3

trx1

trx2

trx2, trx3

trx3

Page 45: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.45

Connection failure (1):

Fault-Tolerance of mysqlfailover

Fault detection

– Suspect master

has failed

Failover

– Slave promoted to

new master

– Old master still

alive, abandoned

Atrx1

trx2

trx3

B

C

D

trx1

trx1

trx2

trx3

trx1

trx2

trx2, trx3

trx3

Page 46: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.46

Connection failure (2):

Fault-Tolerance of mysqlfailover

Fault detection

– Suspect master

has failed

Failover

– Split replication

topology

Atrx1

trx2

trx3

B

C

Dtrx1

trx2

trx1

trx2

trx3

trx1

trx2

trx2, trx3

trx3

Page 47: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.47

Improve fault-tolerance:

Fault-Tolerance of mysqlfailover

Run mysqlfailover

close to the master,

but not on it

Carefully set “interval”

and “ping” values

Monitor mysqlfailover

Use extension points

Atrx1

trx2

trx3

B

C

Dtrx1 trx1

trx2

trx3

trx1

trx2

trx2, trx3

trx3

Page 48: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.48

Extension Points of mysqlfailover

--exec-fail-check: script executed on each interval to replace default

failure detection

– E.g., customize failure detector according to replication topology (test

master connection through different network paths)

--exec-before: script to execute before starting failover

– E.g., shutdown master’s system (to ensure only one master will be online)

Allow the execution of external scripts to override the behavior of the

utility at specific points using the following options:

Page 49: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.49

Extension Points of mysqlfailover

--exec-after: script to execute at the end of failover

– E.g., change network setting to direct client writes to the new master

--exec-post-failover: script to execute after completing the failover

process (successfully or not)

– E.g., send a notification to the administrator with the failover result

Access to variables with master’s information from scripts

– $1: old master host, $2: old master port, $3: new master host, $4: new

master port

Page 50: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.50

Slave transaction that is not on all slaves

Errant Transactions

Examples: b-1, c-1

– Different GTID

– Can be the same

SQL command

CREATE TABLE

table1

– Each one only on

one slave

Aa-1

a-2

a-3

B

C

Da-1

b-1

a-1

a-2

a-3

a-1

a-2

c-1

trx2, trx3

trx3

Page 51: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.51

Can lead to errors during/after failover:

Errant Transactions

Both b-1 and c-1

– CREATE TABLE table1

New master B cannot

apply c-1, and slave C

cannot apply b-1: Table

“table1” already exists.

Aa-1

a-2

a-3

B

C

D

a-1

b-1

a-2

a-3

a-1

a-2

a-3

a-1

a-2

c-1SQL Error!

b-1, a-3

b-1

Conflicting transactions!

Page 52: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.52

Errant Transactions

mysqlfailover identifies errant transactions

– Error when starting the utility

– Warning while executing unless the --pedantic option is used

Can only be fixed manually

– Commit empty transaction on slaves with the GTID of the errant one

SET GTID_NEXT=‘…’; BEGIN; COMMIT; SET GTID_NEXT='AUTOMATIC‘;

Good practice

– Disable binary log (SET sql_log_bin = 0) when executing transactions

locally on slaves. So that they are not replicated!

How to handle this problem? Avoid it!mysqlfailover

is clever!

Page 53: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.53

Final Remarks

Page 54: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.54

Final Remarks

Faults can greatly impact your system (replication topology)

– System down

– Revenue lost

– Negative reputation

Failover

– Fault-tolerant strategy for high availability

– Works in most situations

Fault detectors make mistakes, failover process can fail, etc.

Page 55: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.55

Final Remarks

Simple and easy to use

Advanced features

– Perform several configuration checks (to prevent failures)

– Optimized failover algorithm

– Configurable monitoring timeouts and operation modes

– Extension points

Building block for more sophisticated and fault-tolerant tools

– Taking advantage of extension points or MySQL Utilities Library (Open

Source)

The mysqlfailover utility

Page 56: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.56

By The Way…

There are many other useful MySQL Utilities

mysqldbcompare

mysqldbcopy

mysqldbexport

mysqldbimport

mysqldiff

Database Ops mysqlauditadmin

mysqlauditgrep

Specific – Audit log

mysqlserverclone

mysqlserverinfo

Server Ops

mysqluc

Usability

msqldiskusage

mysqlindexcheck

mysqlmetagrep

mysqlprocgrep

mysqluserclone

mysqlfrm

Generic/Support

New

New

New

New

mysqlfailover

mysqlreplicate

mysqlrpladmin

mysqlrplcheck

mysqlrplshow

High Availability

Page 57: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.57

Do You Want More?

MySQL Utilities (1.3.5 GA)

– Download: http://dev.mysql.com/downloads/tools/utilities/

Launchpad: https://launchpad.net/mysql-utilities

– Docs: http://dev.mysql.com/doc/workbench/en/mysql-utilities.html .

Contributing Ideas:

– Community users can use: http://bugs.mysql.com (MySQL Workbench: Utilities).

– Oracle customers can use: bug.oraclecorp.com (Product = MySQL Workbench,

Component = WBUTILS).

Send us an e-mail (please use sparingly):

– Charles A. Bell, PhD (Team Lead): [email protected]

– Paulo Jesus, PhD (Developer): [email protected]

Page 58: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.58

Questions ?Paulo [email protected]

Page 59: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.59

Graphic Section Divider

Page 60: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.60

Replication topology:

Another Fault Scenario

Master: A

Slaves: B, C and D

Asynchronous

replication

Atrx1

trx2

trx3

B

C

Dtrx1 trx1

trx2

trx1

trx2

trx2, trx3

trx3

trx3

trx3 did not reach any slave (yet)

Page 61: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.61

Master crashed:

Another Fault Scenario

trx3 didn’t reach

any slave

Atrx1

trx2

trx3

B

C

Dtrx1 trx1

trx2

trx1

trx2

trx2, trx3

trx3

trx3

Is failover enough to solve this?

Page 62: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.62

What can we do?

Another Fault Scenario

With failover

– Try to get missing

transaction from

master (if possible)

Semi-synchronous

replication helps

Atrx1

trx2

trx3

B

C

D

trx1

trx1

trx2

trx1

trx2

trx2

Page 63: Development of Fault-Tolerant Failover Tools with MySQL Utilities - MySQL Connect 2013 [CON4276]

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.63