Databases and Queries: Matching Performance and Reliability


Upload: orchestrate

Post on 06-Aug-2015


TRANSCRIPT

Page 1: Databases and Queries: Matching Performance and Reliability

DATABASES AND QUERIES: MATCHING PERFORMANCE AND RELIABILITY

Page 2

Dave Smith, VP, Engineering

@dizzyd

Page 3

SURVEY

• Who has hit problems scaling RDBMS?

• Who is using non-relational databases?

Page 4

QUERIES

• Relational

• Key/value (document)

• Text retrieval (full-text search)

• Graph

• Time-series

• Geospatial

Page 5

QUERIES (CONT.)

• What questions are you asking of your data?

  • Get a record by a key

  • Find records based on a relationship

  • Find all documents with a given term

  • Apply operation to metrics within a timeframe

Page 6

It is possible to rewrite most queries in other forms.
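
As one illustration of this point, the sketch below answers the same relationship question twice: once as a relational join and once using only get-by-key lookups. The sample data, key naming scheme (`user:<id>`, `follows:<id>`), and function names are assumptions made for the example, not something taken from the talk.

```python
import sqlite3

# Hypothetical sample data: users and who they follow.
USERS = {1: "ada", 2: "grace", 3: "linus"}
FOLLOWS = [(1, 2), (1, 3), (2, 3)]  # (follower_id, followee_id)


def followees_relational(user_id):
    """Relational form: the database evaluates a join."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE follows (follower_id INTEGER, followee_id INTEGER)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", list(USERS.items()))
    conn.executemany("INSERT INTO follows VALUES (?, ?)", FOLLOWS)
    rows = conn.execute(
        "SELECT u.name FROM follows f JOIN users u ON u.id = f.followee_id "
        "WHERE f.follower_id = ? ORDER BY u.name",
        (user_id,),
    ).fetchall()
    conn.close()
    return [name for (name,) in rows]


def followees_key_value(user_id):
    """Key/value form: the same question, answered with get-by-key only."""
    store = {}
    for uid, name in USERS.items():
        store[f"user:{uid}"] = name
    # The relationship is precomputed at write time as a list under one key.
    for follower, followee in FOLLOWS:
        store.setdefault(f"follows:{follower}", []).append(followee)
    followee_ids = store.get(f"follows:{user_id}", [])
    return sorted(store[f"user:{fid}"] for fid in followee_ids)
```

Both functions return the same names for the same user. The trade-off is where the work happens: the join is computed at read time, while the key/value form pays at write time by maintaining the `follows:<id>` list.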

Page 7

PERFORMANCE

• Access patterns

  • Read/write mix

  • Sequential vs. Pareto vs. uniformly random

• Throughput - how many requests/sec?

• Latency - how long does it take to service a single request?

  • Always a distribution! Mean is meaningless…

• Data size

  • Total size of dataset

  • Size per item in dataset
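
To make the point about latency being a distribution concrete, here is a small sketch comparing the mean with percentiles. The sample values are invented for illustration (98 fast requests and 2 slow ones), and `percentile` is a hypothetical helper using the nearest-rank method.

```python
import math
import statistics


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of
    all samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical measurements: 98 requests at 2 ms, 2 requests at 500 ms.
latencies_ms = [2.0] * 98 + [500.0] * 2

mean_ms = statistics.mean(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms:.2f} ms  p50={p50_ms} ms  p99={p99_ms} ms")
```

With these numbers the mean lands near 12 ms, a latency that no request actually experienced: the typical request took 2 ms and the slowest ones took 500 ms. Reporting p50 and p99 side by side shows both facts.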

Page 8

RELIABILITY

• How can databases fail?

  • Disks -> integrity checking

  • Nodes -> replication

  • Network -> versioning

  • Software -> (all of the above)

  • Overload -> elasticity

• Key questions

  • How well does the system tolerate failure?

  • How well does the system deal with unexpected load?
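
As a sketch of the "Disks -> integrity checking" pairing, the example below stores a CRC32 checksum in front of each record and verifies it on read. The 4-byte prefix layout and the function names are assumptions for illustration; real databases use their own on-disk formats and often stronger checksums.

```python
import zlib


def encode_record(payload: bytes) -> bytes:
    """Prefix the payload with its CRC32 as 4 big-endian bytes."""
    crc = zlib.crc32(payload)
    return crc.to_bytes(4, "big") + payload


def decode_record(record: bytes) -> bytes:
    """Return the payload, or raise if it no longer matches its checksum."""
    stored_crc = int.from_bytes(record[:4], "big")
    payload = record[4:]
    if zlib.crc32(payload) != stored_crc:
        raise ValueError("checksum mismatch: record is corrupt")
    return payload


def flip_bit(record: bytes, byte_index: int) -> bytes:
    """Simulate disk corruption by flipping the lowest bit of one byte."""
    damaged = bytearray(record)
    damaged[byte_index] ^= 0x01
    return bytes(damaged)
```

Reading back an intact record returns the original payload, while `decode_record(flip_bit(record, 6))` raises `ValueError`. Detection alone does not repair anything; recovery would come from another copy, which is where replication enters.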

Page 9

It can be impossible to distinguish between a slow node and a failed node.
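
A minimal sketch of why this is so, assuming a failure detector that relies only on a reply timeout. The node names, response times, and the `probe` function are hypothetical, and time is simulated with plain numbers so the example runs instantly.

```python
def probe(response_time_ms, timeout_ms):
    """Return what the observer can conclude about a node.

    response_time_ms is how long the node takes to reply, or None for a
    crashed node that never replies. The observer only sees whether a
    reply arrived before the timeout.
    """
    if response_time_ms is not None and response_time_ms <= timeout_ms:
        return "ok"
    return "suspected"


# Hypothetical cluster: one healthy node, one slow node, one crashed node.
nodes = {"healthy": 5, "slow": 1500, "crashed": None}

observed = {name: probe(rt, timeout_ms=1000) for name, rt in nodes.items()}
print(observed)
```

From the observer's side, the slow node and the crashed node produce the same result. Raising the timeout would eventually clear the slow node, but it also delays detection of the real crash, so the choice of timeout is a trade-off and not a fix.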

Page 10

UGLY TRUTHS

• All databases require tuning

• Failure is hard to test — most people don’t bother

• Networks fail — especially under high load

• The more your database does, the more ways it can fail

• More code == more bugs

Page 11

CHOICES, CHOICES…

• MySQL, Postgres, Oracle

• CouchDB, MongoDB, RethinkDB

• Riak, Cassandra

• HBase, Hypertable

• MemSQL, Couchbase

• Elasticsearch, Solr

• Neo4j, Titan