dal deck

23
Database Abstraction Layers PHP & MySQL Presented By Peter Teitelbaum

Upload: carolinerose

Post on 27-Jun-2015

412 views

Category:

Technology


1 download

DESCRIPTION

Peter Teitelbaum presented Database Abstraction Layers to the LAMySQL and LAPHP Groups on October 1, 2013. These are his slides.

TRANSCRIPT

Page 1: Dal deck

Database Abstraction LayersPHP & MySQL

Presented By Peter Teitelbaum

Page 2: Dal deck

Topics

• Intro to Database Abstraction Layers (DAL)• Security• Caching• Scaling Reads• Logging & Error Handling

Page 3: Dal deck

Note

Not a guide to functionality every abstraction layer should have.

View this as an à la carte menu of ways to solve common problems. Some solutions will likely be useful and others not

depending on the application.

Page 4: Dal deck

Introduction DB Abstraction Layers

• Database Abstraction layer is a middleware layer between application and database

• In computing we often abstract resources – mem, storage, etc• A centralized layer supporting all DB communication gives you

great flexibility and control• Concepts are universal and can be used with other languages

or databases• Can be ultra lightweight

Page 5: Dal deck

World’s simplest DALclass SampleDAL extends mysqli

{

} // END class

Page 6: Dal deck

World’s simplest DAL - extendedclass SampleDAL extends mysqli

{

// Overload the parent

public function query($sql)

{

// Do something extra

return parent::query($sql);

} // END method

} // END class

Page 7: Dal deck

Security

• Filter invalid SQL– Example: prohibit execution of SET statement to prevent changes to

MySQL session variables

• SQL Injection – regex for keywords like SLEEP, etc– You don’t have to reject, write it to a log for later review, monitoring,

etc.– Not as good as web application firewall but may buy some time

Page 8: Dal deck

Filtering examplepublic function query($sql)

{

if (preg_match('/^\s*set\s+.i'), $sql)

{

throw new Exception('SET not permitted');

} // END if

return parent::query($sql);

} // END method

Page 9: Dal deck

Scaling Reads

• Architecture review – master/slave topology

Page 10: Dal deck

Master/Slave Topology

R/O VIP

Web servers

master

slaves

write

read

replication

Page 11: Dal deck

Scaling Reads• Architecture review – master/slave topology• Separation of read and writes

– Looked at MySQL Proxy for redirecting reads – could not handle volume– If all queries pass through a central DAL, no need to rewrite the entire app

• Basic concept- SELECTs to slave(s), DML to master• Use a regex to detect/redirect SELECT statements to slave(s)• Also valuable if most of application sends reads to slave(s) but some parts

still read from master• Can log queries hitting the master rather than redirect• DIY load-balancing (load-balancing on a budget)

– Not the most efficient method – good when time/budget is very limiting– Use randomizer function – don’t use a loop or attempt round robin– Not real load-balancing and no way to know if a slave has failed

Page 12: Dal deck

Redirecting SELECTs

Call query method

Is query a SELECT?

Open slave db connection

Execute query on slave

ReturnExecute query on master

Yes

No

Page 13: Dal deck

Logging SELECTs

Call query method

Is query aSELECT?

ANDconnected to

master

Execute query Return

Log query

Yes

No

Page 14: Dal deck

Load Balancing on a Budgetclass SampleDAL extends mysqli

{

public function __construct($host, $user, $pass, $db)

{

if ($host == ‘dbslave1.example.com’)

{

$slave_list = array(‘dbslave1.example.com’,

‘dbslave2.example.com’)

return parent::__construct($slave_list[mt_rand(0, count($slave_list))],

$user,

$pass,

$db);

} // END if

return parent::__construct($host, $user, $pass, $db)

} // END method

} // END class

Page 15: Dal deck

Caching

• MySQL query cache is generally inefficient– Single mutex on cache shared by all threads– Cached result expires with changes to any of the joined tables

regardless of whether or not the result is affected.– Cache not shared across multiple slaves – each slave must re do the

same work– Real time data may not be required

Page 16: Dal deck

Caching

• Create a read-through cache by adding a caching component into the DAL that will offload queries to a cached object store

• Memcache is most popular, others can be used.– In theory results could even be serialized, written to disk, using atime

to see if expired

Page 17: Dal deck

Caching• Allow developers to specify a TTL for each query as a parameter,

inherit a default, or explicitly disable caching• Hash of SELECT statement for cache key – MD5 is good• Append query with TTL as comment. If query becomes slow,

increasing the TTL may be a better option than query tuning.• CMS system and front end presentation layer share the DAL?

Bypass caching for administrative users• Splitting a large join into several smaller queries can often be more

efficient. Only parts will be executed when they expire.• Delayed/lazy-loaded connections – don’t connect to DB until

needed• Empty result set cache

Page 18: Dal deck

Caching – Query Method

Call query method

Is query cacheable? Create hash key Is result

cached?

Retrieve from cache and store as class property

Return

Valid DB connection?

Execute query

Connect to DB Append TTL as comment

Yes Yes

Yes

No

No No

Page 19: Dal deck

Caching – get results method

Call get results method

Is result empty?

Are results cached?

Replace empty array with #

Retrieve results from DB

Is result cacheable?

Retrieve results from class property

Return resultIs result a #

Convert # to empty array

Write result into cache

Yes YesYes

Yes

No

No

No

No

Page 20: Dal deck

Caching - Things to consider

• Queries with a full current timestamp will defeat caching– Maximum useful cache life will be 1 second– Consider rounding to a defined interval, 1min, 5min, 20min, etc.– Or consider using the NOW() function – queries are identical

regardless of when executed

• If the application receives a high volume of traffic, adding a reverse proxy cache like squid or varnish should be considered

Page 21: Dal deck

Logging and Error Handling• Grab URL and append the SQL as a comment. May aide in identifying and locating

slow or problematic SQL.• Multiple slaves in a load-balanced pool? Grab the hostname – it may be valuable

as error output when debugging.• Check for errors to determine if query executed successfully. If not throw an

exception.• Smart error handling – look at error code for different resulting actions on

different failures. – Reconnect once on terminated connection – useful for connection pooling– Log specific errors

• Application level general log. If multiple applications are connecting to the same DB server, logging at the DAL will give visibility into a targeted application for debugging.

• Log queries that disable/bypass caching for review• Capture timestamp pre/post query execution

Page 22: Dal deck

Conclusion

• Concepts are portable to other languages and databases• DALs don’t have to be heavy and require large overhead• DAL’s can increase performance• Examples extended mysqli – not necessary to

extend/overload, just map methods or create new ones

Page 23: Dal deck

Thanks!

Questions/thoughts?