databases, the cloud and its discontents
DESCRIPTION
A talk by Ian Plosker at All You Base 2014.
TRANSCRIPT
Databases, the Cloud and its Discontents
“...it is impossible to overlook the extent to which civilization is built up upon a renunciation of instinct....”
― Sigmund Freud, Civilization and Its Discontents
Who am I?
IAN PLOSKER
Our goal is to make storing and querying data so easy, you don’t
need databases
Charles Darwin
Charles Darwin demonstrated that all life on Earth is formed and
transformed by the environmental pressures
applied to it.
Sigmund Freud
In Civilization and its Discontents Freud argues
that our minds were forged before civilization, and that our maladaptive behaviors are remnants of a different
time.
Ian Plosker
In Databases, the Cloud and its Discontents Plosker
argues that databases were forged before the cloud, and
that their maladaptive behaviors are remnants of a
different time.
Path Dependence
The idea that currently available options may be limited by choices and forces in the past which are
no longer relevant.
How Many Storage Engines Make These Assumptions:
• The disks are local
• The disk is spinning media
• Memory pages are contiguous
• The kernel is omnipotent
• Records have a repeating form and a consistent size
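To make the last assumption concrete, here is a minimal sketch of why fixed-size records are convenient for a storage engine: the byte offset of record i is simply i times the record size, so no index is needed to find it. The record layout (id, name, balance) and the helper names are hypothetical and chosen only for illustration; they do not come from any particular engine.

```python
import struct

# Hypothetical fixed-size layout: 4-byte unsigned id, 16-byte name, 8-byte float.
# "<" selects little-endian with no padding, so every record is exactly 28 bytes.
RECORD = struct.Struct("<I16sd")


def pack_record(rec_id, name, balance):
    """Serialize one record; struct pads short names with null bytes."""
    return RECORD.pack(rec_id, name.encode("utf-8"), balance)


def read_record(buf, index):
    """Read record `index` by computing its offset directly."""
    offset = index * RECORD.size  # works only because every record has the same size
    rec_id, raw_name, balance = RECORD.unpack_from(buf, offset)
    return rec_id, raw_name.rstrip(b"\x00").decode("utf-8"), balance


# A tiny in-memory "table" of four records laid out back to back.
table = b"".join(pack_record(i, f"user{i}", i * 1.5) for i in range(4))
```

Once records vary in shape or length, as they do in many document and key-value stores, this direct offset arithmetic no longer applies and some form of index or length prefix is required.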
How Many Distributed Databases Make These Assumptions:
• The network is reliable
• Nodes in a cluster share a switch
• Nodes in a cluster are in the same datacenter
• Switch ingress/egress buffers never fill up
• Networks are not congested
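As an illustration of what dropping the first assumption can look like in client code, the sketch below retries a remote call with exponential backoff and jitter instead of assuming the network always delivers. The names TransientNetworkError, call_with_retries, and make_flaky are hypothetical, and the simulated failure stands in for a real timeout or dropped connection.

```python
import random
import time


class TransientNetworkError(Exception):
    """Hypothetical error standing in for a timeout or dropped connection."""


def call_with_retries(operation, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call `operation`, retrying transient failures with backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientNetworkError:
            if attempt == max_attempts:
                raise  # give up and surface the failure to the caller
            # The delay ceiling doubles each attempt; the random jitter keeps
            # many clients from retrying in lockstep on a congested network.
            delay = random.uniform(0, base_delay * 2 ** (attempt - 1))
            sleep(delay)


def make_flaky(failures):
    """Build a fake remote call that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def operation():
        if state["left"] > 0:
            state["left"] -= 1
            raise TransientNetworkError("simulated dropped packet")
        return "ok"

    return operation
```

For example, call_with_retries(make_flaky(2)) returns "ok" after two simulated failures. Retrying is only safe when the operation is idempotent, which is itself a design constraint that an unreliable network imposes.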
To understand our present choices, we must understand the
past.
A Brief History of Databases
Let's start by reviewing the evolution of storage media
Writing, Paper, and Libraries (6000 BCE, 105 CE, 2600 BCE)
Punch Card Databases (~1800)
Drums (invented 1930, general use 1950s)
Disks (invented 1954, general use 1960s)
Solid State (invented 1950s, general use 2000s)
Next, let's review storage and query models
File Systems (proposed 1958, general use 1970s)
DBMS (1960s, general use 1970s)
Two main models …
Navigational
Hierarchical
Enter the notorious RDBMS (proposed 1970s, general use 1980s)
To Summarize and Synthesize …
Databases in 2005
COUCHBASE, MEMCACHE, BERKELEY DB, GENIEDB, REDIS, SWIFT, RIAK, VOLDEMORT,
TOKYO CABINET, HBASE, NEO4J, TEMPODB, ELASTIC SEARCH, COUCHDB, HIBARI,
ORIENTDB, NOSQL DB, DYNAMO, BIGCOUCH, MARKLOGIC SERVER, COHERENCE,
AEROSPIKE, DEX, DRAWN TO SCALE, INFINITEGRAPH, SIMPLEDB, FLOCKDB, MNESIA,
MONGODB, CASSANDRA
25+ databases in production today that didn’t exist 8 years ago
Databases in 2014
Online query types: Key-Value, Search, Geo, Graph/Relation, Event
Scale-up: BerkeleyDB, CouchDB, MongoDB, MySQL, SOLR, Sphinx, PostGIS, MongoDB, SOLR, neo4j, MySQL
Scale-out: Riak, Cassandra, elasticsearch, elasticsearch, titan, HBase
The database paradox of choice:
Choice has brought complexity
Enter: The Cloud