TRANSCRIPT

0 to 60 in 3.1
Tyler Carlton
Cory Sessions

Presented by MySQL AB® & O’Reilly Media, Inc.
<Insert funny joke here>
The Project
- Medium-sized demographics data-mining project
- 1,700,000+ user base
- Hundreds of data points per user
“Legacy” System – Why Upgrade?
- Main DB (external users) + offline backup (internal users)
- Weekly manual copy backups
- Max of 3 simultaneous data pulls
- 8hr+ pull times for complex data pulls
- Random index corruption
Notes: Smaller is Better
On average, CPU usage with MySQL was 20% lower than with our old database solution.
Why We Chose MySQL Cluster
- Scalable
- Distributed processing
- 5-9’s (“five nines”) reliability
- Instant data availability between internal & external users

What We Built – NDB Data Nodes
- 8-node NDB cluster
- Dual Core 2 Quad 1.8 GHz
- 16 GB RAM (data memory)
- 6x RAID 10 SAS 15k RPM drives
What We Built – API & MGMT Nodes
- 3 API nodes + 1 management node
- Dual Core 2 Quad 1.8 GHz
- 8 GB RAM
- 300 GB 7200 RPM disk (RAID 0)
NDB Issues with a Large Data Set
NDB load times:
- Loading from backup: ~1 hour
- Restarting NDB nodes: ~1 hour
Note: load times differ depending on your data size.
NDB Issues with a Large Data Set (cont.)
Indexing issues:
- Force index (NDB picks the wrong one)
- Index creation/modification order matters (seriously!)

Local checkpoint tuning:
- TimeBetweenLocalCheckpoints – a value of 20 means 4 MB (4 × 2^20 bytes) of write operations between checkpoints
- NoOfFragmentLogFiles – number of sets of 4 × 16 MB redo log files
- None are deleted until 3 local checkpoints have completed
- On startup, local checkpoint buffers would overrun
- RTFM (two, maybe three times)
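Both parameters are set in the cluster’s config.ini. A minimal sketch of the relevant section follows; the values shown are illustrative defaults for discussion, not the ones used in the talk.

```ini
# config.ini -- [ndbd default] applies to all data nodes
[ndbd default]
# 20 => 4 * 2^20 bytes (4 MB) of write operations between local checkpoints
TimeBetweenLocalCheckpoints=20
# Number of sets of 4 x 16 MB redo log files; size this so the redo log
# is not recycled before 3 local checkpoints have completed, or node
# restarts can fail with checkpoint buffer overruns
NoOfFragmentLogFiles=300
```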
NDB Network Issues
- Network transport packet buffer would fill and overrun
- Nodes then missed their heartbeats and dropped
- This happened when a backup and a local checkpoint were running at the same time
- Solved by increasing the network packet buffer
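In config.ini terms, the per-connection TCP send buffer is what overflows; raising it is one way to apply the fix described above. The value below is illustrative, not the one from the talk.

```ini
# config.ini -- raise the TCP send buffer so a backup and a local
# checkpoint running together don't overflow it and drop heartbeats
[tcp default]
SendBufferMemory=2M
```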
Issues – IN Statements
- IN statements fail with engine_condition_pushdown=ON once the value set reaches roughly 10,000 entries (triggered in our case by ZIP codes)
- We really need engine_condition_pushdown=ON, but this broke it for us, so we had to disable it.
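A client-side workaround (not mentioned in the talk, sketched here as an assumption) is to split a huge IN list into several smaller queries so each predicate stays well under the problematic size. The table and column names below are hypothetical.

```python
def chunked(values, size):
    """Yield successive chunks of at most `size` items."""
    for i in range(0, len(values), size):
        yield values[i:i + size]

def build_in_queries(zip_codes, chunk_size=1000):
    """Split one huge IN (...) predicate into several smaller queries.

    Table/column names are hypothetical; %s placeholders keep the
    queries safe for a parameterized MySQL client library.
    """
    queries = []
    for chunk in chunked(zip_codes, chunk_size):
        placeholders = ", ".join(["%s"] * len(chunk))
        sql = f"SELECT user_id FROM users WHERE zip IN ({placeholders})"
        queries.append((sql, chunk))
    return queries

# Example: 10,000 ZIP codes become 10 queries of 1,000 values each.
zips = [f"{i:05d}" for i in range(10000)]
batches = build_in_queries(zips)
print(len(batches))        # 10
print(len(batches[0][1]))  # 1000
```

Each (sql, params) pair can then be executed in turn and the results merged in the application.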
Structuring Apps: Redundancy
- Redundant power supplies + dual power sources
- Port trunking w/ redundant Gig-E switches
- Number of NDB replicas: 2 (2x4 setup), 64 GB max data size
- MySQL (API node) heads: load balanced with automatic failover
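The 64 GB ceiling follows from the layout: 8 data nodes × 16 GB of data memory, divided by 2 replicas. In config.ini terms this is roughly the following sketch (parameter values illustrative, not taken from the talk):

```ini
# config.ini -- 2x4 layout: 8 data nodes in 4 node groups of 2 replicas.
# 8 nodes x 16 GB DataMemory / NoOfReplicas=2 => 64 GB max data size.
[ndbd default]
NoOfReplicas=2
DataMemory=16G
```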
Structuring Apps: Internal Apps
Ultimate goal: offload the data-intensive processing to the MySQL nodes
The Good Stuff: Stats!
Queries per second (over 20 days):
- Average: 250 queries/sec
- During peak times: 1,100-1,500 queries/sec
Website Traffic
Stats for March 2008
Net Usage: NDB Nodes
- All NDB data nodes have nearly identical network bandwidth usage
- MySQL (API) nodes use about 9 MB/s max under our current structure
- Totaling 75 MB/s during peak (600 Mb/s)
Monitoring & Maintenance
- SNMP monitoring: CPU, network, memory, load, disk
- Cron scripts: node status & node-down notification, backups, database maintenance routines
- The MySQL Clustering book provided the base for the scripts
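A node-down cron check of the kind described above can be sketched by parsing the output of `ndb_mgm -e show`. The sample output format below mirrors typical ndb_mgm output but should be treated as illustrative; verify it against your cluster version before relying on it.

```python
import re
import subprocess

def find_down_nodes(show_output):
    """Return ids of nodes that ndb_mgm reports as '(not connected ...)'."""
    down = []
    for line in show_output.splitlines():
        m = re.match(r"\s*id=(\d+)\s*\(not connected", line)
        if m:
            down.append(int(m.group(1)))
    return down

def check_cluster():
    """Intended to run from cron; prints one alert line per down node."""
    out = subprocess.run(["ndb_mgm", "-e", "show"],
                         capture_output=True, text=True).stdout
    for node_id in find_down_nodes(out):
        print(f"ALERT: NDB node {node_id} is down")

# Illustrative sample of `ndb_mgm -e show` output:
sample = """\
[ndbd(NDB)]     2 node(s)
id=2    @10.0.0.2  (Version: 5.1.23, Nodegroup: 0, Master)
id=3    (not connected, accepting connect from any host)
"""
print(find_down_nodes(sample))  # [3]
```

Hooking the alert into mail or SNMP traps is left to the surrounding cron script.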
Dolphin NIC Testing
- 4-node test cluster
- 4x overall performance
- Brand-new patch to handle automatic Ethernet failover / Dolphin failover (beta as of March 28)
Net Usage: Next steps…
Questions?
Contact Information
Tyler Carlton
www.qdial.com
Cory Sessions
CorySessions.com
OrangeSoda.com