building a sql database that works

56
Building a SQL Database that works Josh Berkus PostgreSQL Experts, Inc. OpenSourceBridge 2009

Upload: postgresql-experts-inc

Post on 18-Dec-2014

375 views

Category:

Technology


0 download

DESCRIPTION

 

TRANSCRIPT

Page 1: Building a SQL Database that Works

Building a SQL Databasethat works

Josh BerkusPostgreSQL Experts, Inc.OpenSourceBridge 2009

Page 2: Building a SQL Database that Works

How Not To Do Itfour popular methods

Page 3: Building a SQL Database that Works

1. One Big Spreadsheet

Page 4: Building a SQL Database that Works

2. EAV & E-Blob

ID Property Setting

407 Eyes Brown

407 Height 73in

407 Married? TRUE

408 Married? FALSE

408 Smoker FALSE

408 Age 37

409 Height 66in

ID Properties

407 <eyes=”brown”><height=”73”> <married=”1”><smoker=”1”>

408 <hair=”brown”><age=”49”> <married=”0”><smoker=”0”>

409 <age=”37”><height=”66”> <hat=”old”><teeth=”gold”>

Page 5: Building a SQL Database that Works

3. Incremental Development

Page 6: Building a SQL Database that Works

4. Leave It to the ORM

Page 7: Building a SQL Database that Works

E.F. CoddDatabase Engineer, IBM 1970

Page 8: Building a SQL Database that Works

IBM Databases Run Amok

1.losing data

2.duplicate data

3.wrong data

4.crappy performance

5.downtime for database redesign whenever anyone made an

application change

Page 9: Building a SQL Database that Works

The Relational Model

Page 10: Building a SQL Database that Works

All the Relational ModelYou Need to Know

in less than 10 minutes

Page 11: Building a SQL Database that Works

Set (Bag) Theory

Page 12: Building a SQL Database that Works

Relations

Relation(table, view, rowset)

Page 13: Building a SQL Database that Works

Tuples

Relation(table, view, rowset)

Tuple (row)Tuple (row) Tuple (row)

Tuple (row)Tuple (row)

Page 14: Building a SQL Database that Works

Attributes

Relation(table, view, rowset)

Tuple (row) Tuple (row)

Tuple (row)Tuple (row)

Tuple (row)Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

AttributeAttribute

Attribute Attribute

Page 15: Building a SQL Database that Works

Domains (types)

Relation(table, view, rowset)

Tuple (row) Tuple (row)

Tuple (row)Tuple (row)

Tuple (row)INT

DATE

TEXT

DATE

TEXT

Attribute

Attribute

DATE

TEXT

Attribute

Attribute

DATE

TEXT

Attribute

Attribute

DATE

TEXT

INTINT

INT INT

Page 16: Building a SQL Database that Works

Keys

Relation(table, view, rowset)

Tuple (row) Tuple (row)

Tuple (row)Tuple (row)

Tuple (row)Attribute

Attribute

Attribute

Attribute

AttributeKey

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

KeyKey

Key Key

Page 17: Building a SQL Database that Works

Constraints

Relation(table, view, rowset)

Tuple (row) Tuple (row)

Tuple (row)Tuple (row)

Tuple (row)Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

Attribute

AttributeAttribute

Attribute Attribute

Attribute > 5

Page 18: Building a SQL Database that Works

Foreign Key Constraint

Page 19: Building a SQL Database that Works

Derived Relation (query)

Page 20: Building a SQL Database that Works

Problem 1:Losing Data

?

Page 21: Building a SQL Database that Works

The Atomic Age

Page 22: Building a SQL Database that Works

Users

Simple User Table

Admins

● name (text)● email (text)● login (text)● password (text)● status (char)

(user, inactive, admin)

Page 23: Building a SQL Database that Works

Users

Non-Atomic Attributes

Admins

● name (text)● email (text)● login (text)● password (text)● status (char)

(user, inactive, admin)

Page 24: Building a SQL Database that Works

What's Atomic?

The simplest form of a datum, which is not divisible without loss of information.

nameJosh Berkus

Statusa

SELECT SUBSTR(name,STRPOS(name, ' ')) ...

… WHERE status = 'a' or status = 'u' ...

Statusi

… WHERE ??? ...

Page 25: Building a SQL Database that Works

What's Atomic?

The simplest form of a datum, which is not divisible without loss of information.

first_nameJosh

activeTRUE

last_nameBerkus

accessa

Page 26: Building a SQL Database that Works

Users

Table Atomized!

Admins

● first_name (text)● last_name (text)● email (text)● login (text)● password (text)● active (boolean) ● access (char)

Page 27: Building a SQL Database that Works

Atomic, Shmomic. Who Cares?

● Atomic Values:– retain data

– make joins easier

– make constraints easier

● Non-atomic Values:– make data loss more likely

– increase CPU usage

– make you more likely to forget something

Page 28: Building a SQL Database that Works

Transactions

Page 29: Building a SQL Database that Works

Splitting a Bulletin Board Threadthe hard way

INSERT INTO threads VALUES ( .... );If $dbh('success') then

for $these_posts.date > $cutdate loopUPDATE posts SET thread = $newthreadWHERE id = $these_posts.id;

if not $dbh('success') thenfor $these_posts.id > $last_id loop

UPDATE posts SET thread = $oldthreadWHERE id = $these_posts.id;DELETE FROM threadsWHERE id = $newthread;

Page 30: Building a SQL Database that Works

Splitting a Bulletin Board Threadthe transactional way

BEGIN;INSERT INTO threads VALUES ( .... );$newthread = curval();UPDATE posts SET thread = $newthread

WHERE thread = $oldthreadAND date > $cutdate;

END;

Page 31: Building a SQL Database that Works

Problem 2:Duplicate Data

nameJosh Berkus

Joshua Berkus

Josh Berkus

Berkus, Josh

user_nameJosh BerkusJosh Berkus

adminJosh BerkusBerkus

user_nameJosh Berkus

Joshua Berkus

Josh

Page 32: Building a SQL Database that Works

Where are my Keys?

Page 33: Building a SQL Database that Works

Key

Users

Candidate (Natural) Keys

Admins

● first_name (text)● last_name (text)● email (text)● login (text)● password (text)● active (boolean) ● access (char)

Page 34: Building a SQL Database that Works

A Good Key

● Should have to be unique because the application requires it to be.

● Expresses a unique predicate which describes the tuple (row):

– user with login “jberkus”

– post from “jberkus” on “2009-05-02 13:41:22” in thread “Making your own wine”

● If you can't find a good key, your table design is missing data.

Page 35: Building a SQL Database that Works

Key

Surrogate Key

Users

Admins

● first_name (text)● last_name (text)● email (text)● login (text)● password (text)● active (boolean) ● access (char)

● user_id (serial)

Page 36: Building a SQL Database that Works

We All Just Want to Be Normal

Page 37: Building a SQL Database that Works

Abby Normal

login level last_name

jberkus u Berkus

selena a Deckelman

login title posted level

jberkus Dinner? 09:28 u

selena Dinner? 09:37 u

jberkus Dinner? 09:44 a

Page 38: Building a SQL Database that Works

How can I be “Normal”?1. Each piece of data only appears in one relation

– except as a “foreign key” attribute

● No “repeated” attributes

login level last_name

jberkus u Berkus

selena a Deckelman

login title posted

jberkus Dinner? 09:28

selena Dinner? 09:37

jberkus Dinner? 09:44

Page 39: Building a SQL Database that Works

Problem 3:Wrong Data

Page 40: Building a SQL Database that Works

Constraints

Page 41: Building a SQL Database that Works

Users Run Amok

● first_name (text)● last_name (text)● email (text)● login (text)● password (text)● active (boolean) ● access (char)

Page 42: Building a SQL Database that Works

Ensure Your Data is Consistentno matter where it came from

first_name last_name email login password active levelJosh Berkus TRUE aNULL Kelley k NULL TRUE uMark Twain samuel NULL I

S F gavin twitter FALSE x

[email protected] jberkus [email protected] [email protected]

Page 43: Building a SQL Database that Works

Users Under Constraint

● first_name (text) length() > 1● last_name (text) length() > 1● email (text) ILIKE '%@%.%'● login (text) length() > 5● password (text) length() > 5● active (boolean) NOT NULL● access (char) IN ( 'a','u' )

note: email and other validators would, of course, be more complex

Page 44: Building a SQL Database that Works

Foreign Keys

Users

Admins

Posts

attributelogin

Page 45: Building a SQL Database that Works

Posts Table

● title (text) NOT NULLREFERENCES threads ( title )ON DELETE CASCADE ON UPDATE CASCADE

● posted (timestamp) NOT NULL● user (text) NOT NULL

REFERENCES users ( login )ON DELETE CASCADE ON UPDATE CASCADE

● content (text) NOT NULL

Page 46: Building a SQL Database that Works

Beautiful Cascades

users.loginJosh Berkusjberkus

selena

posts.contentJosh Berkus

I'm going crazy!

What's up?

Why?

OSB! It's too much!

I told you so ...

jerkyboywww.pornking.com

www.whitehouse.com

www.whiteslavery.com

www.lolcats.com

Page 47: Building a SQL Database that Works

Problem 4:Crappy Performance

Page 48: Building a SQL Database that Works

Things We've Already Done

● Atomicization– less CPU on parsing, calculations

● Normalization– less data duplication

– smaller tables

● Transactions– more batches, less iteration

– less locking

Page 49: Building a SQL Database that Works

Denormalized Derived Relationsmaterialized views for the win

Users

Admins

Posts

user_postcount

Page 50: Building a SQL Database that Works

Problem 5:Database Changes Cause

Application Downtimeand vice-versa

Page 51: Building a SQL Database that Works

Stuff We've Already Doneto make our data “agile”

● Atomicization

– data isn't in specific interface version formats● Normalization

– where to extend data is more obvious

– create a new table if you have to● Transactions

– prevent partial failures from changed schema

Page 52: Building a SQL Database that Works

Views

Page 53: Building a SQL Database that Works

Extending the Users Table● first_name (text)● last_name (text)● email (text)● login (text)● password (text)● active (boolean) ● access (char)● created (timestamp)● last_login (timestamp)

CREATE VIEW oldapp.users ASSELECT first_name || ' ' || last_name, email, login, password, active, accessFROM users;

Page 54: Building a SQL Database that Works

The Restyou already know

● Write Migrations– deploy these in transactions, if supported

– if not, write rollback scripts

● Write Tests– use a realistic staging environment

Page 55: Building a SQL Database that Works

Some Other Tips

● Pick a naming scheme– and stick to it

● Don't do your joins in the application– the database does them better

● Repurposing fields will bite you– sure you'll remember when that changed

● Don't micro-optimize– INT3, anyone?

Page 56: Building a SQL Database that Works

More Information● me

[email protected]

– www.pgexperts.com

– it.toolbox.com/blogs/database-soup

● postgresql: www.postgresql.org● tutorial at OSCON

– monday, 8:30 am!

– see you in San Jose!

This presentation copyright 2009 Josh Berkus, licensed for distribution under the Creative Commons Attribution License, except for photos, most of which were stolen from other people's websites via images.google.com. Thanks, Google!