database wizardry for legacy applications

Post on 17-Jul-2015

834 Views

Category:

Software

1 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Database Wizardry for Legacy Applications

Gabriela D’Ávila

gabriela.io#PHPUK15

FEBRUARY, 2015The Brewery At London 19th-20th

WHOAMI

Huge fan… really!

Almost 25k pieces. I told you I was a fan.

Majored in Digital Game Development

Sr. Software Engineer

x

What to expect?• Waging options

• Spaghetti code x Legacy code

• Real life examples

Respect

Legacy & Spaghetti

Legacy code zone

Spaghetti code zone

Turn programmer crazy zone

How can one be different from another?

Legacy x Spaghetti• No longer supported

• Usually old technology

• Development freeze

• Tangled structure

• May have current technology

• Too many bugs to track

How to optimize your database?

Wait 10 years.paraphrasing Joe Armstrong, erlang creator

Moore’s Law

“The number of transistors per square inch on integrated circuits doubles every year”.

Use structures correctly

JSON,Serialized Array/Objectoption_id option_name option_value option_value

1 24 time_format "g:i a" yes2 43 comment_whitelist 1 yes3 45 comment_registration 0 yes4 76 default_comments_page newest yes5 95 sidebars_widgets "a:3:{s:19:""wp_inactive_widgets"";a:0:{}s:9:""sidebar-1"";a:6:{i:0;s:8:""search-2"";i:1;s:14:""recent-posts-2"";i:2;s:17:""recent-comments-2"";i:3;s:10:""archives-2"";i:4;s:12:""categories-2"";i:5;s:6:""meta-2"";}s:13:""array_version"";i:3;}"yes6 96 cron "a:3:{i:1423456489;a:3:{s:16:""wp_version_check"";a:1:{s:32:""40cd750bba9870f18aada2478b24840a"";a:3:{s:8:""schedule"";s:10:""twicedaily"";s:4:""args"";a:0:{}s:8:""interval"";i:43200;}}s:17:""wp_update_plugins"";a:1:{s:32:""40cd750bba9870f18aada2478b24840a"";a:3:{s:8:""schedule"";s:10:""twicedaily"";s:4:""args"";a:0:{}s:8:""interval"";i:43200;}}s:16:""wp_update_themes"";a:1:{s:32:""40cd750bba9870f18aada2478b24840a"";a:3:{s:8:""schedule"";s:10:""twicedaily"";s:4:""args"";a:0:{}s:8:""interval"";i:43200;}}}i:1423468440;a:1:{s:20:""wp_maybe_auto_update"";a:1:{s:32:""40cd750bba9870f18aada2478b24840a"";a:3:{s:8:""schedule"";s:10:""twicedaily"";s:4:""args"";a:0:{}s:8:""interval"";i:43200;}}}s:7:""version"";i:2;}"yes7 100 _site_transient_update_plugins "O:8:""stdClass"":4:{s:12:""last_checked"";i:1423413299;s:8:""response"";a:0:{}s:12:""translations"";a:0:{}s:9:""no_update"";a:2:{s:19:""akismet/akismet.php"";O:8:""stdClass"":6:{s:2:""id"";s:2:""15"";s:4:""slug"";s:7:""akismet"";s:6:""plugin"";s:19:""akismet/akismet.php"";s:11:""new_version"";s:5:""3.0.4"";s:3:""url"";s:38:""https://wordpress.org/plugins/akismet/"";s:7:""package"";s:56:""https://downloads.wordpress.org/plugin/akismet.3.0.4.zip"";}s:9:""hello.php"";O:8:""stdClass"":6:{s:2:""id"";s:4:""3564"";s:4:""slug"";s:11:""hello-dolly"";s:6:""plugin"";s:9:""hello.php"";s:11:""new_version"";s:3:""1.6"";s:3:""url"";s:42:""https://wordpress.org/plugins/hello-dolly/"";s:7:""package"";s:58:""https://downloads.wordpress.org/plugin/hello-dolly.1.6.zip"";}}}"yes

Another exampleLet’s see another example

Binary Tree• Materialized Path

• Field size limit

• Low performance on search

• TEXT field Index:

• Length limited

• FULLTEXT only in MyISAM

parent_id user_id level asc_path node1 0 1 1 <null> <null>2 1 2 2 ,1, L3 1 13 2 ,1, R4 2 3 3 ,2,1, L5 2 61 3 ,2,1, R6 13 23 3 ,13,1, L7 13 22 3 ,13,1, R8 3 4 4 ,3,2,1, L9 3 156 4 ,3,2,1, R

10 22 1568 4 ,22,13,1, L11 22 26 4 ,22,13,1, R12 23 1476 4 ,23,13,1, L13 23 690716 4 ,23,13,1, R14 61 1051 4 ,61,2,1, L15 61 62 4 ,61,2,1, R

sql> SELECT b.parent_id, b.user_id, b.level, b.asc_path, b.node FROM binary_tree b JOIN users u ON b.user_id = u.id WHERE b.asc_path LIKE ‘%,1,%’

OR b.user_id = 1 ORDER BY b.level, b.parent_id, b.node LIMIT 0, 16

[2015-02-10 21:27:43] 15 row(s) retrieved starting from 1 in 16905/16937 ms

sql> SELECT b.parent_id, b.user_id, b.level, b.asc_path, b.node FROM binary_tree b JOIN users u ON b.user_id = u.id WHERE (b.asc_path LIKE ‘%,1,%’

OR b.user_id = 1) AND (b.level BETWEEN 1 AND 4)

ORDER BY b.level, b.parent_id, b.node

[2015-02-13 23:39:27] 15 row(s) retrieved starting from 1 in 326/348 ms

51x faster

Hierarchical Data Storage

Adjacent List (RDBMS)

user_id lft rgt parent_id1 1 2 13 <null>2 2 3 61 13 13 23 22 14 3 4 156 25 61 1052 62 26 23 1476 690716 137 22 1568 26 13

Graph Database

1

2

361

13

23 22

4

1561051

62

1476

690716 1568

26

What is to learn?• Use the right structures

• Avoid LIKE statements on a TEXT field

• Review your query, the problem may be easier to solve than it seems

• Different tools come with a trade-off

Let’s see a particular query

SELECT trans.id, trans.transaction_type, trans.value, trans.user_receiver_id, user_receiver.username, trans.user_sender_id, user_sender.username, trans.status, CASE trans.status WHEN 1 THEN 'COMPLETED' WHEN 2 THEN 'CANCELED' END AS transaction_status, trans.date_transactionFROM transactions AS trans INNER JOIN users AS user_receiver ON trans.user_receiver_id = user_receiver.id INNER JOIN users AS user_sender ON trans.user_sender_id = user_sender.idWHERE (trans.user_sender_id = 1 OR trans.user_receiver_id = 1) AND (trans.transaction_type IN ('withdraw', ‘cancelWithdraw', ‘transactionFee', 'refundOderPaidWithBalance', ‘transferBetweenUsers', ‘creditcardFee', ‘creditcardWithdraw', 'creditDeposit', ‘creditWithdraw', ‘orderPayment’, ‘monthlyPayment', 'shippingFee')) AND trans.date_transaction >= '2013-05-01 00:00:00' AND trans.date_transaction <= '2013-05-31 23:59:59'ORDER BY trans.date_transaction DESC LIMIT 100;

How long did that query take?

• 2 INNER JOIN

• Using foreign-key primary key relationship

• Using varchar field as filter

• This field is indexed

• ORDER BY using a datetime field DESC

• This field is a B-tree index, created DESC

DESC Index

http://dev.mysql.com/doc/refman/5.5/en/create-index.html

An index_col_name specification can end with ASC or DESC. These keywords are permitted for future extensions for specifying ascending or descending index value storage. Currently, they are parsed but ignored; index values are always stored in ascending order.

Same query in MongoDB

db.financial_report.find({ $or: [{ user_receiver_id: 1 }, { user_sender_id: 1 }], transaction_type: { $in: [ 'withdraw', 'cancelWithdraw', 'transactionFee', 'refundOderPaidWithBalance', 'transferBetweenUsers', 'creditcardFee', 'creditcardWithdraw', 'creditDeposit', 'creditWithdraw', 'orderPayment', 'monthlyPayment', 'shippingFee' ] }, transaction_date: { $gte: new ISODate('2013-03-01 00:00:00'), $lte: new ISODate('2013-03-31 23:59:59') } }).sort({ transaction_date: -1 }).limit(100)

SELECT [FIELDS] FROM transactions AS trans INNER JOIN users AS user_receiver ON trans.user_receiver_id = user_receiver.id INNER JOIN users AS user_sender ON trans.user_sender_id = user_sender.idWHERE (trans.user_sender_id = 1 OR trans.user_receiver_id = 1) AND (trans.transaction_type IN ('withdraw', ‘cancelWithdraw', ‘transactionFee', 'refundOderPaidWithBalance', ‘transferBetweenUsers', ‘creditcardFee', ‘creditcardWithdraw', 'creditDeposit', ‘creditWithdraw', ‘orderPayment’, ‘monthlyPayment', 'shippingFee')) AND trans.date_transaction >= '2013-05-01 00:00:00' AND trans.date_transaction <= '2013-05-31 23:59:59'ORDER BY trans.date_transaction DESC LIMIT 100;

db.financial_report.find({ $or: [{ user_receiver_id: 1 }, { user_sender_id: 1 }], transaction_type: { $in: [ 'withdraw', 'cancelWithdraw', 'transactionFee', 'refundOderPaidWithBalance', 'transferBetweenUsers', 'creditcardFee', 'creditcardWithdraw', 'creditDeposit', 'creditWithdraw', 'orderPayment', 'monthlyPayment', 'shippingFee' ] }, transaction_date: { $gte: new ISODate('2013-03-01 00:00:00'), $lte: new ISODate('2013-03-31 23:59:59') } }).sort({ transaction_date: -1 }).limit(100)

What is to learn?• MySQL has faulty DESC indexes

• Sometimes you need to use another tool to solve the problem

How does it work?

Passive version

App

datatransactions

pending_sync_transactions

App

pending_sync_transactions

transactions

data

App

pending_sync_transactions

transactions

Triggerdata

App

pending_sync_transactions

transactions

data

App

pending_sync_transactions

transactions

data

Active version

Queueing

App

data

Queueing

App

Queueing

Appdata

data

Queueing

data

App

data

data

Queueing

data

App

data

data

Queueing

App

data

data

Queueing

App

data

data

Queueing

App

data

data

Queueing

App

datadata

Queueing

App

What is to learn?• The best solution may not be the text-book solution

• The fancy solution may be overkill

Making data searchable

App

AWS DynamoDB

Getting from the database

App

Getting from a text file

App

S3

What is to learn?• Add layer for search purposes

• Using text-files as a manner of storing data to make it searchable

Summary

Summary• Always analyse the trade-off

• Use the structures correctly

• Simple solutions can be the perfect ones

Thank you!• Please rate this talk at joind.in: https://joind.in/13377

• Twitter: @gabidavila

• Blog: gabriela.io

top related