mysql latency


DESCRIPTION

Clickability CTO Jeff Freund addresses the MySQL Conference, April 14 2008.

TRANSCRIPT

Jeff Freund, CTO, Clickability

End of a long day, I am the last stop between you and …..

~6 hours 42 mins left

Yippee!

Future CTO

• Software-as-a-Service Web CMS

• True Multi-Tenant SaaS platform from the ground up

• Integrated solution of all services required to run a sophisticated business website

• HQ in San Francisco, 8+ years old, 60+ employees

Global leader in On Demand Web Content Management

250+ million pages delivered per month

• Linux
• Apache
• MySQL
• Java
• Tomcat

Proven open source building blocks

• Scale-out horizontally

• Distributed infrastructure, including multiple datacenters

• Multiple layers of caching for performance

• Loose coupling of applications around data

[Diagram: masters M1 and M2 with slaves S1–S6 split across Data Center 1 and Data Center 2, linked by a VPN tunnel; applications reach masters through an RW Connection Manager and slaves through an RO Connection Manager.]

con = db.getReadWriteConnection();
con = db.getReadOnlyConnection();
con = db.getSafeReadConnection();

Application Code

• Intelligently split queries between masters and slaves

• Inserts/Updates/Deletes sent to Master

• Most Reads sent to Slaves

• “Safe” Reads sent to Masters – zero tolerance for latency

• Manual code updates to implement the split

• 6+ months in production to find all “Safe” Reads
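The split above can be sketched in a few lines. This is a minimal illustration, not Clickability's implementation: the pool representation and round-robin-by-random policy are assumptions; only the three getter names appear in the slides.

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Sketch of the master/slave query split: writes and "safe" reads go to a
// master, everything else to a slave. Hosts are modeled as plain strings.
public class Db {
    private final List<String> masters;
    private final List<String> slaves;

    public Db(List<String> masters, List<String> slaves) {
        this.masters = masters;
        this.slaves = slaves;
    }

    // Inserts/updates/deletes: always a master.
    public String getReadWriteConnection() {
        return pick(masters);
    }

    // Most reads: any slave, tolerating some replication lag.
    public String getReadOnlyConnection() {
        return pick(slaves);
    }

    // "Safe" reads with zero tolerance for latency: also a master.
    public String getSafeReadConnection() {
        return pick(masters);
    }

    private static String pick(List<String> hosts) {
        return hosts.get(ThreadLocalRandom.current().nextInt(hosts.size()));
    }
}
```

The cost noted above is real: every call site must be audited by hand to decide which of the three getters it needs.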

• The difference in time between when a transaction is committed on one database and when it is subsequently committed on a replicated database.

• Latency can either be “slowness” or “breakage”

7… Hardware Maintenance / Recovery

6… Schema updates / DB Maintenance

5… Elevated transaction rates (e.g. bulk loads)

4... High query load on slaves

3… Network bottlenecks / Loss of connectivity

2… “Slave Errors” (e.g. duplicate keys, deadlocks)

while ( 1 )
    echo "show slave status \G" | mysql -u USER --password=PASSWORD | grep Seconds_Behind_Master >> replication.log
    sleep 1
end

[Chart: Seconds_Behind_Master sampled over time, in seconds]

[Diagram: latency test setup — masters M1 and M2, slaves S1–S6, and the VPN tunnel between Data Center 1 and Data Center 2; measurements focus on M1, S4, and S6.]

CREATE TABLE `replTest` (
  `timecol` bigint(20) default NULL,
  KEY `idx_timecol` (`timecol`)
)

Loop:
  $val = current timestamp in epoch milliseconds
  M2: INSERT INTO replTest (timecol) VALUES ($val)
  M1: SELECT $val - max(timecol) FROM replTest
  S4: SELECT $val - max(timecol) FROM replTest
  S6: SELECT $val - max(timecol) FROM replTest


Database | Characteristics | Average Latency | Max Latency

M2 | Transaction source | N/A | N/A

M1 | Local; moderate load | ~6 ms | ~315 ms

S4 | Local; high load | ~190 ms | ~12 seconds

S6 | Remote; minimal load | ~5 ms | ~400 ms

• All DBs are 1 replication hop away from the transaction source

• All hardware is roughly equal

• Remote location is ~60 miles away

• Data taken from 100,000 samples over an hour of standard operations

S4 Database

[Chart: distribution of S4 replication latency, in milliseconds]

95% of the time, replication latency will be 1 second or less
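A percentile figure like the one above can be reproduced from raw lag samples with a nearest-rank computation; a small sketch (the method name and integer-percent signature are choices made here, not from the talk):

```java
import java.util.Arrays;

// Nearest-rank percentile over raw latency samples, as used to read
// "95% of samples at or below X" off a distribution like the S4 chart.
public class LatencyStats {
    public static long percentile(long[] samplesMs, int pct) {
        long[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        // Nearest rank = ceil(pct/100 * n), computed in integer arithmetic.
        int rank = (pct * sorted.length + 99) / 100;
        return sorted[Math.max(rank - 1, 0)];
    }
}
```

Run over the 100,000-sample capture, this is the number that makes the long tail visible: the average can look healthy while the 95th or 99th percentile is orders of magnitude worse.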

• Now what?

Assume that it will happen in the course of standard operations. Build the application to accommodate it.

If you do, your Ops Team will love you for it.

• Local ehcache on application servers

• Distributed Object Cache (memcached)

• Need to clear all caches effectively on object updates
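The two tiers and the clearing requirement can be sketched with plain maps standing in for ehcache and memcached (the class and method names here are illustrative, not from the talk):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Two cache tiers: a per-server local cache (ehcache stand-in) backed by a
// shared distributed cache (memcached stand-in). On an object update, BOTH
// tiers must be cleared, or readers can keep serving the stale copy.
public class TwoTierCache {
    private final Map<String, Object> local = new ConcurrentHashMap<>();
    private final Map<String, Object> distributed;

    public TwoTierCache(Map<String, Object> distributed) {
        this.distributed = distributed;
    }

    public Object get(String key) {
        Object v = local.get(key);
        if (v == null) {
            v = distributed.get(key);          // fall through to the shared tier
            if (v != null) local.put(key, v);  // repopulate the local tier
        }
        return v;
    }

    public void put(String key, Object value) {
        distributed.put(key, value);
        local.put(key, value);
    }

    // Invoked when a "clear cache" message for this key arrives.
    public void clear(String key) {
        local.remove(key);
        distributed.remove(key);
    }
}
```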

[Diagram: Pub 1, Pub 2, and Pub 3 application servers, each with a local cache, sharing a Distributed Object Cache.]

Reliable Cache Clearing Messages

• Multicast Notification Bus for “clear cache” messages

• The race is on! If the message arrives before the transaction is replicated, a stale object may be reloaded….

• Frequently accessed objects most susceptible to problems

[Diagram: CMS writes to DB1, which replicates to DB2 serving Pub; the cache-clear message races the replication stream.]

• Multicast Notification Bus with tuning parameters

• The race is on again! But the database transaction gets a tunable head start: 0.5 sec, 1 sec, 2 secs, or 5 secs

• Better – lasted for years, but in the end 99.99+% still wasn’t reliable enough...(remember the long tail on chart?)
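The failure condition is easy to state: the head start only works when the configured delay covers the actual replication lag. A one-line sketch (names invented here) makes the long-tail problem concrete, using the numbers measured earlier, where S4 lag peaked near 12 seconds while the largest delay was 5 seconds:

```java
// Tunable head start: the clear-cache message is sent delayMs after commit.
// A stale read is possible whenever replication lag exceeds the delay,
// which is exactly what the long tail of the latency distribution produces.
public class HeadStart {
    public static boolean messageArrivesAfterData(long replicationLagMs, long delayMs) {
        return delayMs >= replicationLagMs;
    }
}
```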

[Diagram: as before, but the cache-clear message is held back by the configured delay before being sent to Pub.]

• Database Queue table for messages

• Messages are committed after data, injecting them into the replication data stream.

• All apps poll the database queue table once per second.

• Guaranteed that data will arrive before the message!!!
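The ordering guarantee can be demonstrated with the replication stream modeled as a FIFO; this is a simulation of the idea, not the production code, and all names are invented for illustration:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// The queue-table approach: because the "clear cache" message is committed
// after the data it refers to, and replication applies statements in commit
// order, a poller on the replica can never see the message before the data.
public class ReplicationQueueDemo {
    record Stmt(String kind, String key, String value) {}  // "data" or "msg"

    private final Queue<Stmt> stream = new ArrayDeque<>();  // replication FIFO
    private final Map<String, String> replicaData = new HashMap<>();
    private final Queue<String> replicaMsgQueue = new ArrayDeque<>();

    // On the master: commit the data row, then the message row, in that order.
    public void commitUpdate(String key, String value) {
        stream.add(new Stmt("data", key, value));
        stream.add(new Stmt("msg", key, null));
    }

    // Apply pending replication events on the replica, in commit order.
    public void replicate() {
        Stmt s;
        while ((s = stream.poll()) != null) {
            if (s.kind().equals("data")) replicaData.put(s.key(), s.value());
            else replicaMsgQueue.add(s.key());
        }
    }

    // The once-per-second poller: every message it sees has its data present.
    public boolean pollAndVerify() {
        String key;
        while ((key = replicaMsgQueue.poll()) != null) {
            if (!replicaData.containsKey(key)) return false;  // stale read would be possible
        }
        return true;
    }
}
```

The trade-off is latency, not correctness: invalidation now waits for replication plus up to one polling interval, but it can never fire early.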

[Diagram: cache-clear messages written to a queue table on DB1 replicate to DB2, where a QueuePoller delivers them to Pub after the data.]

• If you don’t need to replicate it, don’t!

• Split data functionally (e.g. separate large blob storage from relational transactions to keep the pipes clear)

• Build the appropriate recovery tools – our “rewind button”

• Masters in multiple data centers

• Greater geographic distance between data centers

• MySQL load balancing – will messaging still be reliable???

jeff@clickability.com

Questions? Feedback?
