applied partitioning and scaling your database system
TRANSCRIPT
Presented by,
MySQL AB® & O’Reilly Media, Inc.
Applied Partitioning and Scaling Your (OLTP) Database System
Phil Hildebrand
thePlatform for Media, Inc.
Objectives
Review classic uses of database partitioning
Applying partitioning to MySQL OLTP applications
Hash partitioning with MySQL OLTP applications
•Implementation examples
Q&A
Classic Partitioning
Old School – union in the archive tables
Auto partitioning and partition pruning
Lends itself to Data Warehouses
•Archival and Date based partitioning
•Predictable growth patterns
Benefits within Data Warehouses
•Maintenance benefits
•Query performance improved
Applying Partitioning to OLTP
Design Issues•Often id driven access vs. date driven access
•Difficulties in estimating partition ranges / sizes
•Intelligent keys increase complexity in partitions
Operational Issues•Difficult to schedule downtime for DDL changes
•General lack of use outside of data warehousing
Applying Partitioning to OLTP
Understanding the Benefits•Reducing seek and scan set sizes
•Limiting insert / update transaction durations
•Creates additional options for Maint processes
Reducing scan/seek set sizes
mysql> explain partitions select my_store.city,my_employee_old.name
from my_store, my_employee_old where my_store.id in (5,8,10)
and my_store.id = my_employee_old.store_id and my_employee_old.id = (ROUND(RAND()*50000,0));
+---+-------------+-----------------+------------+-------+---------------+---------+---------+------+-------+--------------------------------+
|id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+---+-------------+-----------------+------------+-------+---------------+---------+---------+------+-------+--------------------------------+
| 1 | SIMPLE | my_store | p5,p8,p10 | range | PRIMARY | PRIMARY | 8 | NULL | 3 | Using where |
| 1 | SIMPLE | my_employee_old | NULL | ALL | NULL | NULL | NULL | NULL | 47483 | Using where; Using join buffer |
+_--+-------------+-----------------+------------+-------+---------------+---------+---------+------+-------+--------------------------------+
mysql> explain partitions select my_store.city,my_employee.name
from my_store, my_employee where my_store.id in (5,8,10)
and my_store.id = my_employee.store_id and my_employee.id = (ROUND(RAND()*50000,0));
+----+-------------+-------------+------------+-------+---------------+---------+---------+------+------+--------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+------------+-------+---------------+---------+---------+------+------+--------------------------------+
| 1 | SIMPLE | my_store | p5,p8,p10 | range | PRIMARY | PRIMARY | 8 | NULL | 3 | Using where |
| 1 | SIMPLE | my_employee | p5,p8,p10 | ALL | NULL | NULL | NULL | NULL | 2979 | Using where; Using join buffer |
+----+-------------+-------------+------------+-------+---------------+---------+---------+------+------+--------------------------------+
Simple join with out partitions
$ time mysqlslap -u root --create-schema=conf --query=sel_store_employee_old.sql -c 5 -i 1000 -F ";"
Benchmark
Average number of seconds to run all queries: 0.141 seconds
Minimum number of seconds to run all queries: 0.101 seconds
Maximum number of seconds to run all queries: 0.213 seconds
Number of clients running queries: 5
Average number of queries per client: 1
real 2m22.018s
user 0m0.217s
sys 0m0.445s
Simple join with partitions
$ time mysqlslap -u root --create-schema=conf --query=sel_store_employee.sql -c 5 -i 1000 -F ";"
Benchmark
Average number of seconds to run all queries: 0.006 seconds
Minimum number of seconds to run all queries: 0.005 seconds
Maximum number of seconds to run all queries: 0.025 seconds
Number of clients running queries: 5
Average number of queries per client: 1
real 0m6.660s
user 0m0.133s
sys 0m0.306s
Rebuilding by partitionmysql> optimize table my_employee_old;
+----------------------+----------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+----------------------+----------+----------+----------+
| conf.my_employee_old | optimize | status | OK |
+----------------------+----------+----------+----------+
1 row in set (1.14 sec)
mysql> alter table my_employee rebuild partition p1;
Query OK, 0 rows affected (0.03 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> alter table my_employee rebuild partition p1,p2,p3,p4,p5,p6,p7,p8,p9,p10;
Query OK, 0 rows affected (0.27 sec)
Records: 0 Duplicates: 0 Warnings: 0
Applying Partitioning to OLTP
Design Considerations•Table sizes and predicted growth patterns
•Access patterns
•Keys and indexes
•Availability and Scalability requirements
•Manageability considerations
•Reuse considerations
Choosing a Partitioning Method Range Partitioning
•Data usually accessed by date
•Limited number of (primary) partitions needed
•Ordered Intelligent keys
•Supports Sub Partitions
List Partitioning•Grouping data in partitions out of order (1,5,7 in partition x)
•Limited number of (primary) partitions needed
•Intelligent keys
•Supports Sub Partitions
Hash Partitioning•Low maintenance
•Works with limited or large number of partitions
•Non-intelligent keys (can work with some cases of intelligent keys)
Key Partitioning•Non-integer based partitioned keys (md5 hash)
•Low maintenance
Hash Partitioning and OLTP
Applying a hash to the partitioning key•Hash Partitions
•Key Partitions
Fixed number of partitions•Number of partitions determined by hash (mod%num_partitions)
My Retail Store App
mysql> show columns from my_store;
+---------+--------------+------+-----+---------+
| Field | Type | Null | Key | Default |
+---------+--------------+------+-----+---------+
| id | bigint(20) | NO | PRI | NULL |
| city | varchar(128) | YES | | NULL |
| country | varchar(128) | YES | | NULL |
+---------+--------------+------+-----+---------+
mysql> show columns from my_employee;
+----------+-------------+------+-----+---------+
| Field | Type | Null | Key | Default |
+----------+-------------+------+-----+---------+
| id | bigint(20) | NO | PRI | NULL |
| store_id | bigint(20) | NO | PRI | NULL |
| name | varchar(56) | YES | | NULL |
+----------+-------------+------+-----+---------+
mysql> show columns from my_inventory;
+----------+-------------+------+-----+---------+
| Field | Type | Null | Key | Default |
+----------+-------------+------+-----+---------+
| id | bigint(20) | NO | PRI | NULL |
| store_id | bigint(20) | NO | PRI | NULL |
| name | varchar(56) | YES | | NULL |
| in_stock | bit(1) | YES | | NULL |
| on_order | bit(1) | YES | | NULL |
| item_cnt | bigint(20) | YES | | NULL |
| cost | float | YES | | NULL |
+----------+-------------+------+-----+---------+
Applying Hash Partitioning
Partition on Store ID
mysql> ALTER TABLE MY_STORE PARTITION BY HASH (id) PARTITIONS 50 ;
Query OK, 50 rows affected (0.76 sec)
Records: 50 Duplicates: 0 Warnings: 0
mysql> ALTER TABLE MY_EMPLOYEE PARTITION BY HASH (store_id) PARTITIONS 50 ;
Query OK, 50000 rows affected (25.28 sec)
Records: 50000 Duplicates: 0 Warnings: 0
mysql> ALTER TABLE MY_INVENTORY PARTITION BY HASH (store_id) PARTITIONS 50 ;
Query OK, 250000 rows affected (2 min 8.32 sec)
Records: 250000 Duplicates: 0 Warnings: 0
Splitting Partitions Expanding into Australia with 2 new stores:
mysql> ALTER TABLE MY_STORE ADD PARTITION PARTITIONS 2;
Query OK, 0 rows affected (0.86 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> ALTER TABLE MY_EMPLOYEE ADD PARTITION PARTITIONS 2;
Query OK, 0 rows affected (2.43 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> ALTER TABLE MY_INVENTORY ADD PARTITION PARTITIONS 2;
Query OK, 0 rows affected (7.60 sec)
Records: 0 Duplicates: 0 Warnings: 0
Splitting Partitions
mysql> select table_name,partition_name,table_rows
-> from information_schema.partitions
-> where table_schema = 'conf'
-> and table_name in ('MY_STORE','MY_INVENTORY','MY_EMPLOYEE')
-> and table_rows < 1;
+--------------+----------------+------------+
| table_name | partition_name | table_rows |
+--------------+----------------+------------+
| my_employee | p0 | 0 |
| my_employee | p51 | 0 |
| my_inventory | p0 | 0 |
| my_inventory | p51 | 0 |
| my_store | p0 | 0 |
| my_store | p51 | 0 |
+--------------+----------------+------------+
Merging Partitions Closing All Stores in China (4 stores) :
mysql> ALTER TABLE MY_STORE COALESCE PARTITION 4;
Query OK, 0 rows affected (0.40 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> ALTER TABLE MY_EMPLOYEE COALESCE PARTITION 4;
Query OK, 0 rows affected (2.71 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> ALTER TABLE MY_INVENTORY COALESCE PARTITION 4;
Query OK, 0 rows affected (7.81 sec)
Records: 0 Duplicates: 0 Warnings: 0
Merging Partitions Closing All Stores in China (4 stores) :
mysql> select table_name,count(*)
-> from information_schema.partitions
-> where table_schema = 'conf'
-> and table_name in ('MY_STORE','MY_INVENTORY','MY_EMPLOYEE')
-> group by table_name;
+--------------+----------+
| table_name | count(*) |
+--------------+----------+
| my_employee | 48 |
| my_inventory | 48 |
| my_store | 48 |
+--------------+----------+
A Few More Stats… (No Partitions)mysql> explain partitions select my_store_no_part.city,my_employee_no_part.name,count(*) from my_store_no_part, my_employee_no_part,
my_inventory_no_part where my_store_no_part.id in (5,8,10,23,80) and my_store_no_part.id = my_employee_no_part.store_id and my_store_no_part.id = my_inventory_no_part.store_id and my_employee_no_part.id < 2000 and my_inventory_no_part.in_stock = (ROUND(RAND(),0)) group by my_store_no_part.city,my_employee_no_part.name;
+---+-------------+----------------------+------------+--------+---------------+---------+---------+-----------------------------------+--------+----------------------------------------------+
|id | select_type | table | partitions | type | possible_keys | key | ref | rows | Extra |
+---+-------------+----------------------+------------+--------+---------------+---------+---------+-----------------------------------+--------+----------------------------------------------+
| 1 | SIMPLE | my_employee_no_part | NULL | range | PRIMARY | PRIMARY | NULL | 3962 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | my_store_no_part | NULL | eq_ref | PRIMARY | PRIMARY | conf.my_employee_no_part.store_id | 1 | |
| 1 | SIMPLE | my_inventory_no_part | NULL | ALL | NULL | NULL | NULL | 508243 | Using where; Using join buffer |
+----+-------------+----------------------+------------+--------+---------------+---------+---------+-----------------------------------+--------+----------------------------------------------+
mysql> select my_store_no_part.city,my_employee_no_part.name,count(*) from my_store_no_part, my_employee_no_part, my_inventory_no_part where my_store_no_part.id in (5,8,10,23,80) and my_store_no_part.id = my_employee_no_part.store_id and my_store_no_part.id = my_inventory_no_part.store_id and my_employee_no_part.id < 2000 and my_inventory_no_part.in_stock = (ROUND(RAND(),0)) group by my_store_no_part.city,my_employee_no_part.name;
+----------+-------------+----------+
| city | name | count(*) |
+----------+-------------+----------+
| Delhi | Employee #0 | 60453 |
| Istanbul | Employee #0 | 79707 |
| Karachi | Employee #0 | 59872 |
| Seoul | Employee #0 | 37432 |
+----------+-------------+----------+
4 rows in set (16.45 sec)
A Few More Stats… (Partitions)mysql> explain partitions select my_store_lrg.city,my_employee_lrg.name,count(*) from my_store_lrg, my_employee_lrg, my_inventory_lrg where
my_store_lrg.id in (5,8,10,23,80) and my_store_lrg.id = my_employee_lrg.store_id and my_store_lrg.id = my_inventory_lrg.store_id and my_employee_lrg.id < 2000 and my_inventory_lrg.in_stock = (ROUND(RAND(),0)) group by my_store_lrg.city,my_employee_lrg.name;
+---+-------------+------------------+-------------------+--------+---------------+---------+---------+-------------------------------+-------+----------------------------------------------+
|id | select_type | table | partitions | type | possible_keys | key | ref | rows | Extra |
+---+-------------+------------------+-------------------+--------+---------------+---------+---------+-------------------------------+-------+----------------------------------------------+
|1 | SIMPLE | my_employee_lrg | p5,p8,p10,p23,p80 | range | PRIMARY | PRIMARY | NULL | 94 | Using where; Using temporary; Using filesort |
|1 | SIMPLE | my_store_lrg | p5,p8,p10,p23,p80 | eq_ref | PRIMARY | PRIMARY | conf.my_employee_lrg.store_id | 1 | |
|1 | SIMPLE | my_inventory_lrg | p5,p8,p10,p23,p80 | ALL | NULL | NULL | NULL | 47938 | Using where; Using join buffer |
+----+-------------+------------------+-------------------+--------+---------------+---------+---------+-------------------------------+-------+----------------------------------------------+
mysql> select my_store_lrg.city,my_employee_lrg.name,count(*) from my_store_lrg, my_employee_lrg, my_inventory_lrg where my_store_lrg.id in (5,8,10,23,80) and my_store_lrg.id = my_employee_lrg.store_id and my_store_lrg.id = my_inventory_lrg.store_id and my_employee_lrg.id < 2000 and my_inventory_lrg.in_stock = (ROUND(RAND(),0)) group by my_store_lrg.city,my_employee_lrg.name;
+----------+-------------+----------+
| city | name | count(*) |
+----------+-------------+----------+
| Delhi | Employee #0 | 60041 |
| Istanbul | Employee #0 | 77721 |
| Karachi | Employee #0 | 59786 |
| Seoul | Employee #0 | 36237 |
+----------+-------------+----------+
4 rows in set (1.89 sec)
Summing it Up
Partitioning provides an easy way to scale within a database Partitioning has a place in OLTP Remember access methods and maintenance Use Range/List for intelligent partitioning Use Hash/Key for low maintenance, many partitions
Questions Anyone?