your database is trying to kill you

52
Your Database Wants to Kill You Kevin Lawver - 11/1/2013 1

Upload: kevin-lawver

Post on 02-Nov-2014

228 views

Category:

Technology


2 download

DESCRIPTION

A presentation I gave to a bunch of college kids all about databases, why they're horrible, and what's coming. I also threw in some rules to make their lives a little easier, because I'm not entirely cruel.

TRANSCRIPT

Page 1: Your Database is Trying to Kill You

Your Database Wants to Kill You

Kevin Lawver - 11/1/2013

1

Page 2: Your Database is Trying to Kill You

Hi, I’m Kevin.

2

Page 3: Your Database is Trying to Kill You

3

- I work at Rails Machine- We do ops- Lots of ops on lots of different kinds of databases- enough introductions, let’s get w/ the murder!

Page 4: Your Database is Trying to Kill You

Databases have been around since before

most of us were born.

4

- So they’re well understood- and well despised- and crusty

Page 5: Your Database is Trying to Kill You

There’s been a revolution the past few

years.

5

Page 6: Your Database is Trying to Kill You

Getting away from fully relational databases, to

something... odd.

6

Page 7: Your Database is Trying to Kill You

But, don’t get comfortable.

7

Page 8: Your Database is Trying to Kill You

Because your database wants...

8

Page 9: Your Database is Trying to Kill You

TO KILL YOU!

9

Page 10: Your Database is Trying to Kill You

really, it does.

10

Page 11: Your Database is Trying to Kill You

The Old School

11

Page 12: Your Database is Trying to Kill You

Relational Databases

12

Page 13: Your Database is Trying to Kill You

MySQL, PostgreSQL, Oracle, Sybase, etc

13

Page 14: Your Database is Trying to Kill You

This is what you’re used to.

14

Page 15: Your Database is Trying to Kill You

Tables, relationships, foreign keys, SQL, etc.

15

- And lots of rules

Page 16: Your Database is Trying to Kill You

ACID

16

The set of rules relational databases follow to assure the data gets where it needs to go and is consistent.

They’re fine for a certain kind of workload.

Page 17: Your Database is Trying to Kill You

Atomicity

17

Transactions are all or nothing. If any part of the transaction fails, the WHOLE thing has to fail and roll back.

That means a lot of locking, which can become a performance problem.

Page 18: Your Database is Trying to Kill You

Consistency

18

Any transaction brings the database from one valid “state” to another - which means you can have a bunch of rules inside the database to judge the validity of data, and any transaction that doesn’t pass fails and rolls back.

Again, not great for performance.

Page 19: Your Database is Trying to Kill You

Isolation

19

Transactions executed concurrently have to result in the same state of the database as if they had been executed serially.

Requires partially applied transactions to NOT be visible to other transactions.

Page 20: Your Database is Trying to Kill You

Durability

20

Once a transaction is committed, it’s IN THERE.

Page 21: Your Database is Trying to Kill You

That’s a lot of rules, and it makes for inflexible

systems.

21

Page 22: Your Database is Trying to Kill You

And that’s where the killing comes in:

22

Page 23: Your Database is Trying to Kill You

Replication

23

It’s evil, and almost all RDBMS’s do it wrong.

It’s so fragile that you spend more time redoing it than actually getting any benefit from it.

MySQL can do master/master. PostgreSQL ships binary logs via scp.

It’s all horrible and gives me grey hairs.

Because it was an afterthought and not designed from the beginning.

Add-on replication is almost always horrible.

Page 24: Your Database is Trying to Kill You

Failover

24

This is even worse than replication. Because it was even more of an afterthought.

Most of the time it fails over on accident and breaks replication.

And then someone gets woken up to clean up a steaming pile of bad data.

And that person isn’t very happy about it.

Page 25: Your Database is Trying to Kill You

All those solutions are hacked on and horrible.

25

Page 26: Your Database is Trying to Kill You

There has to be a better way.

26

Page 27: Your Database is Trying to Kill You

Enter the CAP Theorem

27

Page 28: Your Database is Trying to Kill You

It came from Amazon, and changed everything.

28

It adds some reality to the database world. It basically says that no database can do everything.

Page 29: Your Database is Trying to Kill You

CAP stands for...

29

Page 30: Your Database is Trying to Kill You

Consistency

30

All nodes have the same data at the same time.

Page 31: Your Database is Trying to Kill You

Availability

31

Every request is guaranteed to receive a response as to its success or failure

Page 32: Your Database is Trying to Kill You

Partition Tolerance

32

The system will continue to operate despite arbitrary message loss or a failure of part of the system.

Also known as “split brain” - which happens to me if I don’t get enough coffee.

Page 33: Your Database is Trying to Kill You

But, you can never have all three. It’s impossible.

33

Finally, some reality! Stop trying to be everything to everyone and solve all types of problems with the same hammer.

So when you’re looking at a data store, see which two it can do and which you need for your data!

Page 34: Your Database is Trying to Kill You

Enter all the NoSQL!

34

Stands for either “NO SQL” or “Not Only SQL” - but it’s really a bunch of different data stores that aren’t relational and solve different kinds of problems.

And provide some solutions for old school reliability problems.

Page 35: Your Database is Trying to Kill You

Document Stores

35

- MongoDB, Riak, CouchDB, etc- Not relational (though you can convince mongodb to do it, you shouldn’t)- Usually have really good replication stories- Let’s look at MongoDB vs traditional MySQL

Page 36: Your Database is Trying to Kill You

MySQL Replication

36

That’s typical master/master.

Each can take writes (but you shouldn’t)

They ship bin logs back and forth

Fragile

Easy to break replication by having conflicting writes committed near the same time on both sides - so split-brain is always a possibility.

Page 37: Your Database is Trying to Kill You

MongoDB Replica Set

37

- There’s an election, and one node is picked as the primary.- It takes all writes, distributes to the secondaries- If the primary goes down, there’s an election and a new primary is chosen (usually less than 1 second).- New nodes join the replica set and get all the data, then can be elected primary

Page 38: Your Database is Trying to Kill You

Benefits of Replica Sets

38

- Replication and failover designed into the system as core functionality!- Much better failover- Much better reliability- I get to sleep more- Easy to add capacity as the replica set grows (either shard by adding new replica sets or add more nodes to scale reads).

Page 39: Your Database is Trying to Kill You

Riak & the “Ring”

39

- Riak is crazy town- Document store with very light querying (though the new search stuff is badass)- Super scalable via the “Ring”- Data is automagically replicated around the ring based on configuration - Number of copies

Page 40: Your Database is Trying to Kill You

The Ring

40

- All nodes “gossip” to confirm they’re up.- Any node can take a query and will gather the results from the other nodes.- Nodes dropping out are “noticed” by the ring and data gets shuffled around.- New news automatically join the node and get their “share” of the data.- Theoretically infinitely scalable (though the gossip gets REALLY noisy)- Useful as a file store (see Riak CS)- I think that drawing can be used to summon Beetlejuice.

Page 41: Your Database is Trying to Kill You

What’s Old is New

41

- MariaDB + Galera Cluster = MySQL replica sets! (kind of)- row-based replication is much more reliable- automatic failover and syncing of new nodes- can be load balanced for reads and writes!- still the same sql everyone’s used to- theoretically any node can take writes - but I don’t trust it

Page 42: Your Database is Trying to Kill You

My MariaDB

42

- Yes, this is the mongodb diagram- I use haproxy to send all the writes to a single primary, with the others as backups in case it goes down.- I have a separate haproxy frontend that load balances across all three for reads.- so far, i love it to pieces

Page 43: Your Database is Trying to Kill You

Here’s HAProxy

43

- rmcom_backend - app servers- mariadb_read_backend - the leastconn balanced pool of readers- mariadb_write_backend - db1 is the primary unless it goes down, then db2 is “promoted”- rails, mariadb_read and mariadb_write are the frontends

Page 44: Your Database is Trying to Kill You

Now, some rules...

44

Page 45: Your Database is Trying to Kill You

If you query it, index it.

45

- As your data grows, you’ll see query speed decrease.- Add indexes for your common queries!- Don’t forget compound indexes.

Page 46: Your Database is Trying to Kill You

As data increases, flexibility decreases.

46

- You’ll need to limit the types of queries you allow people to perform because they’ll lock things up and stop everyone from accessing it.- You’ll need to find other ways to “protect” the database, like.

Page 47: Your Database is Trying to Kill You

Cache it!

47

- Use memcached or other caching technologies to keep common queries away from the database.- If it can be read, it can be cached.- Saves you a ton of money in vertically scaling your database.- You may also need to add other ways to access your data, like say, elasticsearch or solr.

Page 48: Your Database is Trying to Kill You

Scale vertically

48

- Throw hardware at it until it’s too expensive, then shard it.- Because sharding is almost always horrible.

Page 49: Your Database is Trying to Kill You

What does it all mean?

49

- Don’t default to RDBMS!- Use RDBMS if you need transactions and your data truly is relational.- If it’s a document, use a document store- Understand the tradeoffs- Understand how your data will be queried- Don’t forget you can combine technologies to build whatever you need

Page 50: Your Database is Trying to Kill You

If We Have Time...• Key/Value Stores

• Elasticsearch

• Why you shouldn’t use Redis... ever.

• Questions!

50

Page 51: Your Database is Trying to Kill You

RailsBridge!http://rubysavannah.com - 11/16/2013

51

- We need front-end volunteers and students!- Next one is in January so check back in November for the signup!