falcon storage engine designed for speed presentation

Post on 15-May-2015


MySQL Users' Conference April 2009

Falcon - built for speed

Ann Harrison, Kevin Lewis

If it's so fast, why isn't it done yet?

Talk overview

Falcon at a glance
Project history
Multi-threading for the database developer
Cycle locking

Falcon at a glance – read first record

[Diagram, shown on this and the next five slides. Labels: MySQL Server, Record Cache, Page Cache, Serial Log Windows, Serial Log Files, Database Tablespaces]

Falcon at a glance – read complete

Falcon at a glance – read again

Falcon at a glance – write new record

Falcon at a glance – commit

Falcon at a glance – write complete

Falcon history

Origin
Transactional SQL engine for web app environment
Bought by MySQL in 2006

MVCC
Consistent read
Versions control write access
Memory only – no steal

Indexes and data separate
Data encoded on disk and in memory
Fine-grained multi-threading

Falcon Goals circa 2006

Exploit large memory for more than just a bigger cache
Use threads and processors for data migration
Eliminate tradeoffs, minimize tuning
Scale gracefully to very heavy loads
Support web applications

Web application characteristics
Large archive of data
Smaller active set
High read:write ratio
Uneven, bursty activity

What we did instead

Enforce limit on record cache size

Respond to simple atypical loads
Autocommit single record access
Repeat “insert ... select”
Single pass read of large data set

Challenge InnoDB on DBT2
Large working set
Continuous heavy load

Hired the world's most vicious test designer

Record Cache

Record Cache contains:
Committed records with no versions
New, uncommitted records
Records with multiple versions

Record Cache cleanup – step 1

Clean up old committed single-version records

Scavenger
Runs on schedule or demand
Removes oldest mature records
Settable limits – start and stop
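The start and stop limits suggest a high-water/low-water eviction loop. The following is a minimal sketch of that idea, not Falcon's code. The class name `RecordCache`, the parameters `start_limit` and `stop_limit`, and the use of insertion order as a stand-in for record age are all assumptions made for illustration.

```python
from collections import OrderedDict


class RecordCache:
    """Toy cache holding only committed, single-version records."""

    def __init__(self, start_limit, stop_limit):
        # Hypothetical names: scavenging starts once usage exceeds
        # start_limit bytes and stops at or below stop_limit bytes.
        assert stop_limit <= start_limit
        self.start_limit = start_limit
        self.stop_limit = stop_limit
        self.records = OrderedDict()  # record id -> size, oldest first
        self.used = 0

    def insert(self, record_id, size):
        # Re-inserting a record replaces it and makes it the newest entry.
        if record_id in self.records:
            self.used -= self.records.pop(record_id)
        self.records[record_id] = size
        self.used += size
        if self.used > self.start_limit:
            self.scavenge()  # the "on demand" trigger

    def scavenge(self):
        """Evict the oldest records until usage drops to the stop limit."""
        evicted = []
        while self.used > self.stop_limit and self.records:
            record_id, size = self.records.popitem(last=False)
            self.used -= size
            evicted.append(record_id)
        return evicted
```

Keeping the stop limit below the start limit means one scavenge pass frees a batch of space, so the cache does not re-trigger eviction on every insert. A scheduled run would simply call `scavenge()` from a timer.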

Record Cache Cleanup – step 2

Clean out record versions too old to be useful

Prune
Remove old, unneeded versions
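Pruning can be pictured as cutting the tail off a version chain. The sketch below assumes a simplified visibility rule that the slides do not spell out: a version is "too old to be useful" once a newer version is already visible to the oldest active transaction. `RecordVersion`, `commit_id`, `prior`, and `oldest_active_id` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecordVersion:
    commit_id: int                           # when this version was committed
    data: str
    prior: Optional["RecordVersion"] = None  # next older version


def prune(newest, oldest_active_id):
    """Drop versions that no active transaction can still see.

    Walk from the newest version toward older ones. The first version
    committed at or before oldest_active_id is the one the oldest active
    transaction reads (assumed rule); everything behind it is unreachable.
    Returns the number of versions removed.
    """
    version = newest
    while version is not None and version.commit_id > oldest_active_id:
        version = version.prior
    if version is None:
        return 0                             # nothing old enough to cut behind
    removed = 0
    old = version.prior
    version.prior = None                     # cut the chain here
    while old is not None:
        removed += 1
        old = old.prior
    return removed
```

This version ignores concurrency entirely; unlinking a version while another thread walks the chain is the problem the cycle locking slides address.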

Record Cache Cleanup – step 3

Clean up a cache full of new records

Chill
Copy new record data to log
Done by transaction thread
Settable start size

Record Cache Cleanup – step 4

Clean up multiple versions of a single record created by a single transaction

Remove intermediate versions
Created by a single transaction
Rolled back to save point
Repeated updates
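As an illustration of step 4, the sketch below collapses runs of versions written by the same transaction, keeping only the newest of each run. It represents a version chain as a plain list, newest first, and assumes no save point still needs the intermediate versions. The function name and the tuple layout are hypothetical.

```python
def remove_intermediate_versions(chain):
    """chain: list of (transaction_id, data) tuples, newest version first.

    Keeps the newest version of every run of consecutive versions written
    by the same transaction and drops the rest of that run.
    """
    kept = []
    for transaction_id, data in chain:
        if kept and kept[-1][0] == transaction_id:
            continue  # an older version from the same transaction
        kept.append((transaction_id, data))
    return kept


# Transaction 7 updated the record three times on top of a version
# committed by transaction 5; only its latest update needs to stay.
chain = [(7, "third"), (7, "second"), (7, "first"), (5, "base")]
collapsed = remove_intermediate_versions(chain)
```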

Record Cache Cleanup – step 5

Clean up records with multiple versions, still potentially visible

Backlog
Copy entire record tree to disk
Expensive
Not yet working

Simple, atypical loads

Challenge:
Autocommit single record access
Record cache is useless
Record encoding is useless
Transaction creation / destruction is too expensive

Response:
Reuse read-only transactions

Result:
Multi-threaded bookkeeping nightmare

Simple, atypical loads

Challenge:
Repeat “insert ... select...”
Fill cache with old and new records

First solution
Scavenge old records
Chill new record data

Second solution
Move the record headers out
Also helps index creation

Simple, atypical loads

Single pass read of large data set
Read more records than
Read them over and over
Caches are useless
Encoding is overhead

Response:
Make encoding optional?

Challenge InnoDB on DBT2

Initial results were not encouraging (2007)

[Chart: DBT2 transactions (y-axis, 0 to 30000) against connections (x-axis: 10, 20, 50, 100, 150, 200). Series: Falcon 2007, InnoDB 2007]

Challenge InnoDB on DBT2

But Falcon has improved a lot since April 2007

[Chart: DBT2 transactions (y-axis, 0 to 30000) against connections (x-axis: 10, 20, 50, 100, 150, 200). Series: Falcon 2007, InnoDB 2007, Falcon 2009]

Challenge InnoDB on DBT2

So did InnoDB

[Chart: DBT2 transactions (y-axis, 0 to 30000) against connections (x-axis: 10, 20, 50, 100, 150, 200). Series: Falcon 2007, InnoDB 2007, Falcon 2009, InnoDB 2009]

Bug trends

Multi-threading

Databases are a natural fit for multi-threading
Connections
Gophers
Scavenger
Disk reader/writer

Except for shared structures
Locking blocks parallel operations

Challenge – sharing without locking

Multi-threading

Non-locking operation
Purge old record versions

Multi-threading

Locking operation
Remove intermediate versions

What granularity of lock?

Multi-threading – Lock granularity

One per record:
Too many interlocked instructions

One per record group:
Thread reading one record prevents scavenge of another

No answer is right – more options?

Cycle locking – read record chain

Before starting to read a record chain, get a shared lock on a “cycle”

Transaction A
Transaction B
Transaction C

Cycle 1 = 3 shared
Cycle 2 inactive

Cycle locking – clean a record chain

Before starting to read a record chain, get a shared lock on a “cycle”

Transaction A active in Cycle 1
Transaction B active in Cycle 1
Transaction C active in Cycle 1
Scavenger unlinks versions from record chain and links them to a “to be deleted” list.

Cycle 1 = 4 shared
Cycle 2 inactive

Cycle locking – records relinked

Transaction A releases lock
Transaction B releases lock
Transaction C still active
Scavenger releases lock

Cycle 1 = 1 shared
Cycle 2 inactive

Cycle locking – swap cycles

New access locks Cycle 2

Transaction C holds Cycle 1 lock
Cycle Manager requests exclusive on Cycle 1 (pumps cycle)
Transaction A acquires Cycle 2 lock

Cycle 1 = 1 shared
Cycle 2 = 1 shared

Cycle locking – cleanup phase

Transaction C releases lock
Transaction B acquires Cycle 2 lock
Cycle Manager gets exclusive lock on Cycle 1

Cycle 1 = 0 shared, exclusive
Cycle 2 = 2 shared

Cycle locking – cleanup complete

Transaction C acquires Cycle 2 lock
Cycle Manager holds exclusive lock on Cycle 1
Remove unlinked, unloved, old versions
When cleanup is done, Cycle Manager releases Cycle 1

Cycle 1 exclusive
Cycle 2 = 2 shared
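The cycle locking sequence on the preceding slides can be sketched in a few dozen lines. This is an illustration under stated assumptions, not Falcon's implementation: `CycleManager`, `shared`, `retire`, and `pump` are hypothetical names, a single condition variable stands in for real shared/exclusive locks, and the "exclusive lock" on the old cycle is modelled as waiting until its shared count drains to zero.

```python
import threading
from contextlib import contextmanager


class Cycle:
    def __init__(self):
        self.shared = 0    # number of shared holders
        self.retired = []  # versions unlinked while this cycle was current


class CycleManager:
    def __init__(self):
        self._cond = threading.Condition()
        self._cycles = [Cycle(), Cycle()]
        self._current = 0
        self._pump_lock = threading.Lock()  # one pump at a time
        self.freed = []    # stands in for actually releasing memory

    @contextmanager
    def shared(self):
        """Hold a shared lock on the current cycle while walking a chain."""
        with self._cond:
            cycle = self._cycles[self._current]
            cycle.shared += 1
        try:
            yield cycle
        finally:
            with self._cond:
                cycle.shared -= 1
                self._cond.notify_all()

    def retire(self, version):
        """Scavenger side: park an unlinked version on the "to be deleted" list.

        Call while holding shared(), after unlinking from the record chain.
        """
        with self._cond:
            self._cycles[self._current].retired.append(version)

    def pump(self):
        """Swap cycles, wait for the old cycle to drain, then free its list."""
        with self._pump_lock:
            with self._cond:
                old = self._cycles[self._current]
                self._current = 1 - self._current  # new access locks the other cycle
                # "Exclusive" on the old cycle: no shared holders remain.
                self._cond.wait_for(lambda: old.shared == 0)
                doomed, old.retired = old.retired, []
            self.freed.extend(doomed)
            return doomed
```

In this sketch readers never wait for each other or for the scavenger; they touch one counter on entry and one on exit, whatever the number of records they read. Only the cycle manager waits, and only for readers that started before the swap, which is the point of the design: a version on the "to be deleted" list is freed once every thread that could have seen it has left its cycle.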

Questions
