online schema changes for maximizing uptime

Online Schema Changes

for Maximizing Uptime

PALOMINODB OPERATIONAL EXCELLENCE

FOR DATABASES

David Turner

Ben Black

www.palominodb.com

Palominodb: Who are we and what do we do?

• Proactive database awesomeness so you

can focus on other things

• Systems and Performance Optimization

• Operational Best Practices

• Devops

• Ask us how we like working from anywhere

Online Schema Changes

• Best practices

• Online Schema Tools

• PT Online Schema Change

• Innodb Online DDL

• Case Study

Best Practices

• Backups

• Benchmark o Start small

o Step up

o Time

o Performance_schema and Information_schema

Repository

• Prevent

o Broken Replication

o Data loss

o Broken applications

Creating the repository

• Dump

• Script

Syncing the repository

• Migrations

• Schema Changes

• Scheduled

Problems

• Schema changes

• Fragmentation

You want to do what to that 40G

table?

• Down for Maintenance/blocking DDL

• Disable writes

• Remaster

• Facebook OSC

• Large Hadron Migrator

• Openark-kit

• Pt Online Schema Change

• MySQL Innodb DDL - IOD

OSC Tools or how I learned

to love the alter...

• copy original table structure to new table

• alter new table

• create triggers to copy dml from original

table to new table

• copy data in original table to new table in

chunks

• swap table names and drop original table

pt-online-schema-change

Percona toolkit is your friend.

Mature product

Pay attention to version you are using!!!

FK issues, log_bin with 2.0

screen - use it along with tee and time

It's all ball bearings nowadays

--progress time,10 (default 30 seconds)

--max-lag

--recursion-method (how to find slaves)

show processlist

show hosts

dsn table (great for ignoring some slaves)

What could possibly go wrong?

• PK/UK required

• FK names will change on altered table

• FK's reference table to alter

--alter-foreign-keys-method drop_swap

sets foreign_key_checks=0

drops original table (hardlink!!!)

renames new table

What could possibly go wrong?

• non xfs table drops (create the hardlink!)

o Can even cause innodb to crash mysqld

• PKs with gaps

• Largest table to alter vs free disk space

• Disk space (2x for RBR)

• Global mutexes (table drops)

• table metadata locks (triggers)

NOOOOOO!!

• Running PT-OSC against a slave with RBR

Replication started erroring after pt-osc

why???

• And how about syncing a table using pt-osc

with RBR?

Set it and forget it???

• watch

o df -h

o ls -alh ibd files

o mysql -e"show processlist;"|egrep -v "Sleep|repl"

o slave lag

• How is it affecting the application?

• max-lag, max-load,

Features we like

• Throttling

• Nibbling

• Reporting

• Replicating

• Reorgs

Innodb Online DDL - IOD

• Looking back 5.1 and 5.5

o Fast index creation

• Enhancements

o Online DDL

o Inplace

Silver Bullets

http://dev.mysql.com/doc/refman/5.6/en/innodb-

create-index-overview.html#innodb-online-

ddl-summary-grid

• Drop and add Index together

• Temporary table index creation

Alter Online lock modes

• Exclusive

• Shared

• Default

• None

* Note

Alter Online Algorithms

• Inplace

• Copy

Grouping DDL

• Copy

• Sequence

• Lock

IOD Behavior

• Metadata

• Data volume

• Type of index

• Foreign keys

• Partitioning

o Add partition, drop partition

o Truncate partition

o Add partition and coalesce partition

• Algorithm

• Autoincrement

• Locking

IOD Errors

• Locking

• Timeouts

• Tmpdir

• DML

IOD Crash Recovery

• Secondary Indexes

• Clustered Indexes (don't crash!)

Of Interest

• Tmpdir

• Dropping Indexes with Foreign keys

• Cascade on

• Inconsistent .frm

Monitoring

• Rows Affected

• Performance_schema and

Information_schema

• Progress

• Replication lag

IOD Server parms

• innodb_online_alter_log_max_size

• sort_buffer_size

Case Study

• Ticketing system

Architecture

• 1 master

• 1 active slave

• 1 analytics/indexing slave

• 1 backup/reorg slave

Issues

• Growth

• Rebalancing Shards

• Deletions

• Reorgs

Timeline

• Difficult failovers

• Ticket Growth

• FB OSC

• PT OSC

• Reorgs

Questions

David Turner

dturner@palominodb.com

Ben Black

ben@palominodb.com

www.palominodb.com

online schema changes for maximizing uptime

xfs table

online schema changesfor

original table hardlink

original table structure

gaps largest table

innodb online ddl iod

temporary table index

schema andinformation

Technology

maximizing service uptime with oracle's sparc t3-1, sparc...

maximizing voice svcs.ppt -...

maximizing instrument uptime using ion chromatography ·...

explaining the uptime in

maximize uptime with_mobile_device_management

uptime issue #1, 2014

hubbell installation efficiency guide · 2018-06-07 ·...

uptime international as

console managers - lantronix · console managers: the...

uptime issue #2, 2014

step van catalog - uptime parts parts step van...

outstanding reliability, impressive color...

uptime issue 3

uptime issue #2, 2012

maximizing service uptime of smartphone-based … ·...

uptime issue #3, 2012

stratus uptime assurance architecture · stratus uptime...

9 keys to maximizing profits and equipment uptime through...

safety devices uptime

speed and flexibility - thermo fisher scientific · the...