mysql large table schema changes
DESCRIPTION
Making schema changes to large tables in MySQL can become a major issue as data grows. This talk outlines a method to change large MySQL tables with little impact to running production services and replication.TRANSCRIPT
MySQL Large Table Schema
Changes
mapmyfitness
+ Founded in 2006 + Fitness tracking: web and mobile apps + 17 million + registered users + 123 million routes + 156 million workouts + We use MongoDB, Postgres, and MySQL
!2
MySQL at mapmyfitness
+ Master-Master with Slaves + The primary database is “mapmyfitness” + MySQL DB has grown from 340 Gb to 500+ Gb in the last year + We have some large tables+ Routes 88Gb + Workouts 85 Gb + User 17 Gb
Schema Changes
+ Add/delete a column + Add an index + Change a column datatype + add/delete a foreign key + Move columns around
!4
Default MySQL Behavior
+ Create a temporary table with the schema changes + Put a table level write lock on the original table + Insert “select *” from the original table to the temporary table + Rename the temporary table to the original table + Drop the original table.
!5
A Real World Example:
+ Go for it...during a maintenance window + PINGDOM? page down!!!
!6
+ nutrition_foodlog needs a new index
+ 22 million rows + File size ~6 Gb + 10 minute execution time in a dev environment
Shadow Table Migration
+ Essentially the same process as MySQL + Create a “shadow” table with the new structure/index + Create a stored procedure to copy data + Create insert, update and delete triggers + Setup and run a batch process to to apply “fake” updates to every
row on the original table + Run an atomic rename of the tables + Drop the original table, triggers, and stored procedure
!7
Create the Shadow Table
!8
+ Create the shadow table with the new index
CREATE TABLE `mapmyfitness`.`nutrition_foodlog_shadow` ( `id` int(11) NOT NULL AUTO_INCREMENT, `privacy_setting` smallint(6) NOT NULL, ... KEY `XIE2nutrition_foodlog` (`user_id`, `consume_date`, `id`), ... ) ENGINE=InnoDB AUTO_INCREMENT=38304235 DEFAULT CHARSET=utf8;
Create Stored Procedure
!9
DROP PROCEDURE IF EXISTS nutrition_foodlog_update; delimiter ;; CREATE PROCEDURE nutrition_foodlog_update( a_id int(11), ... a_total_serving_grams decimal(15,7)) BEGIN INSERT INTO nutrition_foodlog_shadow SET id = a_id, privacy_setting = a_privacy_setting, ... total_serving_grams = a_total_serving_grams ON DUPLICATE KEY UPDATE id = a_id, privacy_setting = a_privacy_setting, ... total_serving_grams = a_total_serving_grams; END; ;; delimiter ;
Insert After Trigger
!10
delimiter ;; CREATE TRIGGER nutrition_foodlog_insert_after_trigger AFTER INSERT ON mapmyfitness.nutrition_foodlog FOR EACH ROW BEGIN CALL nutrition_foodlog_update( new.id, new.privacy_setting, new.privacy_limit_list, new.user_id, new.meal_type_id, new.consume_date, new.serving_count, new.food_id, new.updated_date, new.created_date, new.serving_id, new.total_serving_grams); END; ;;
Update Before Trigger
!11
delimiter ;; CREATE TRIGGER nutrition_foodlog_update_before_trigger BEFORE UPDATE ON mapmyfitness.nutrition_foodlog FOR EACH ROW BEGIN CALL nutrition_foodlog_update( new.id, new.privacy_setting, new.privacy_limit_list, new.user_id, new.meal_type_id, new.consume_date, new.serving_count, new.food_id, new.updated_date, new.created_date, new.serving_id, new.total_serving_grams); END; ;;
Delete After Trigger
!12
delimiter ;; CREATE TRIGGER nutrition_foodlog_delete_after_trigger AFTER DELETE ON mapmyfitness.nutrition_foodlog FOR EACH ROW BEGIN DELETE FROM mapmyfitness.nutrition_foodlog_shadow WHERE id = old.id LIMIT 1; END; ;;
“Fake” Updates
!13
select concat('update mapmyfitness.nutrition_foodlog set id = ', id, ' where id = ', id, ' limit 1;') as seql from mapmyfitness.nutrition_foodlog order by id; update mapmyfitness.nutrition_foodlog set id = 2 where id = 2 limit 1; update mapmyfitness.nutrition_foodlog set id = 4 where id = 4 limit 1; update mapmyfitness.nutrition_foodlog set id = 6 where id = 6 limit 1; update mapmyfitness.nutrition_foodlog set id = 8 where id = 8 limit 1; update mapmyfitness.nutrition_foodlog set id = 12 where id = 12 limit 1; update mapmyfitness.nutrition_foodlog set id = 14 where id = 14 limit 1; update mapmyfitness.nutrition_foodlog set id = 16 where id = 16 limit 1; update mapmyfitness.nutrition_foodlog set id = 20 where id = 20 limit 1;
Rename of Tables
!14
-- rename the tables -- RENAME TABLE table1 to table1_old, table1_shadow to table1; RENAME TABLE mapmyfitness.nutrition_foodlog to mapmyfitness.nutrition_foodlog_old, mapmyfitness.nutrition_foodlog_shadow to mapmyfitness.nutrition_foodlog;
+ Should be an atomic operation + All tables get renamed or none get renamed + Run it as one statement
Clean-up
+ After confirming all is good with the new table clear the cruft!
!15
-- drop triggers DROP TRIGGER IF EXISTS nutrition_foodlog_update_before_trigger; DROP TRIGGER IF EXISTS nutrition_foodlog_insert_after_trigger; DROP TRIGGER IF EXISTS nutrition_foodlog_delete_after_trigger; -- drop the update stored procedure DROP PROCEDURE IF EXISTS nutrition_foodlog_update; -- truncate and drop the "old" table TRUNCATE TABLE mapmyfitness.nutrition_foodlog_old; DROP TABLE mapmyfitness.nutrition_foodlog_old;
MySQL Large Table Schema
Changes
Questions?