introduction to postgres6.148.scripts.mit.edu/2018/pages/lectures/webday7_postgre.pdf · • i...
TRANSCRIPT
![Page 1: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/1.jpg)
Introduction ToPostgres
Rodrigo Menezes
![Page 2: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/2.jpg)
• I joined in 2013, when we were ~20 people
• Acquired by Oracle during summer of 2017
• Currently, we’re about ~250 people
• I started off as a frontend developer
![Page 3: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/3.jpg)
• Why Postgres?
• Brief introduction to SQL
• ORM (Object relational mapper)
• Transactions
• Performance and indices
• Views and Materialized Views
• JSON and Postgres
• Scaling Postgres @ Moat
This Talk
![Page 4: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/4.jpg)
Why PostgreSQL?
![Page 5: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/5.jpg)
• Free + open source
• Really good community
• Fast bug fixes and frequent release cycle
• Amazing docs
• "Add features slowly, do them well"
• Great performance
• Ton of cool features
• SQL is great and people can pry it from my cold, dead hands
Why PostgreSQL?
![Page 6: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/6.jpg)
What is SQL?
![Page 7: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/7.jpg)
• SQL is a language for inserting, updating, reading and deleting data
• SQL is meant for humans (pros and cons to this)
• SQL is everywhere. Has been since the 70s
• Every collection of data is a "table"
• Every new datapoint is a "row"
SQL
![Page 8: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/8.jpg)
postgres=# CREATE TABLE users(id INTEGER NOT NULL,email TEXT NOT NULL,name TEXT
);CREATE TABLETime: 40.498 ms
Creating a table
![Page 9: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/9.jpg)
postgres=# INSERT INTO users(id, email) VALUES(0, '[email protected]'
);INSERT 0 1Time: 13.561 ms
Inserting a row
![Page 10: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/10.jpg)
postgres=# SELECT id, email, name FROM users WHERE id=0; id | email | name ----+----------------------------+------ 0 | [email protected] | (1 row)
Selecting a row
![Page 11: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/11.jpg)
postgres=# SELECT * FROM users WHERE id=0; id | email | name ----+----------------------------+------ 0 | [email protected] | (1 row)
Selecting all columns
![Page 12: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/12.jpg)
postgres=# select * from users where id=0; id | email | name ----+----------------------------+------ 0 | [email protected] | (1 row)
Case insensitive
![Page 13: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/13.jpg)
postgres=# UPDATE users SET email="[email protected]" WHERE id=0;UPDATE 1Time: 12.690 ms
Updating a row
![Page 14: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/14.jpg)
postgres=# DELETE FROM users WHERE id=0;DELETE 1Time: 9.671 ms
Deleting a row
![Page 15: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/15.jpg)
postgres=# DROP TABLE users;DROP TABLETime: 11.198 ms
Dropping a table
![Page 16: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/16.jpg)
• It's great for human beings
• Standards are nice
• In the majority of cases, you want your data strongly typed
• Less accidents
• More assumptions => more performance / compression
• Compile-time errors in strongly typed languages
Why SQL?
![Page 17: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/17.jpg)
ORMs (Object Relation Mapper)
![Page 18: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/18.jpg)
# pythonimport psycopg2 # Library to talk to Postgres
def create_cursor():conn = psycopg2.connect("dbname=test user=postgres")cursor = conn.cursor()
def update_email(cursor, id, new_email):query= (
'UPDATE users SET email={}''WHERE id = {} LIMIT 1'
).format(id)cursor.execute(query)
You can write a raw query…
![Page 19: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/19.jpg)
...
def update_email(cursor, id, new_email):query= (
'UPDATE users SET email={}''WHERE id = {} LIMIT 1'
).format(id)cursor.execute(query)
update_email(create_cursor(), 1, 'NULL; DROP TABLE user; --`);# `--` is a SQL comment so it ignores the rest of the line
SQL Injection
![Page 20: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/20.jpg)
• Library that handles all your SQL for you
• Handles user input for you to make it secure
• Neater syntax that hides SQL complexity (which is mostly a pro but sometimes a con)
ORMs
![Page 21: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/21.jpg)
// db.jsimport { Sequelize } from 'sequelize';
sequelize = new Sequelize({ database: POSTGRES_DB, dialect: 'postgres', host: POSTGRES_HOSTNAME, password: POSTGRES_PASSWORD, username: POSTGRES_USER,});
Sequelize
![Page 22: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/22.jpg)
// models/User.jsimport { Sequelize } from 'sequelize';
const User = sequelize.define('users', { id: { type: Sequelize.INTEGER, autoIncrement: true, primaryKey: true }, email: { type: Sequelize.TEXT, unique: true }});
Sequelize Model
![Page 23: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/23.jpg)
// services/createUser.jsimport { User } from './models/User.js';
function createUser() { … const user = await User.create({ email }); …}
function getUser() { … const user = await User.find({ email }); …}
Model usage
![Page 24: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/24.jpg)
Transactions andconstraints
![Page 25: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/25.jpg)
postgres=# create table users(id serial primary key,email text not null,money bigint not null default 0
);
Let's make a Venmo clone
![Page 26: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/26.jpg)
postgres=# CREATE TABLE users(id SERIAL NOT NULL,email TEXT NOT NULL
);
-- Now I don't need to specify idspostgres=# INSERT INTO users(email) VALUES (
postgres=# SELECT * FROM users; id | email ----+---------------------------- 0 | [email protected] (1 row)
Aside: serials
![Page 27: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/27.jpg)
# Make sure user 1 has enough moneypostgres=# select money from users where id = 1;
# Decrease user 1's moneypostgres=# update users set money=money – 20 where id=1;
# What happens if our webserver disconnects here?
# Increase user 2's moneypostgres=# update users set money=money + 20 where id=2;
Sending $20 from user 1 to user 2
![Page 28: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/28.jpg)
postgres=# begin;
postgres=# select money from users where id = 1;
postgres=# update users set money=money – 20 where id=1;
postgres=# update users set money=money + 20 where id=2;
postgres=# commit;
Transactions
![Page 29: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/29.jpg)
postgres=# begin;
postgres=# select money from users where id = 1;
postgres=# update users set money=money – 20 where id=1;
postgres=# abort; -- revert all changes!
Transactions
![Page 30: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/30.jpg)
• Transactions allow the effects of your code to happen in one go
• Achieve ACID compliance
• Atomicity – everything in the transaction happens or none of it does
• Consistency – after a transaction, the database will be in a valid state. Rules always apply.
• Isolation – a newly committed transaction shouldn't affect your current transaction (there are multiple levels of this)
• Durability – once a transaction is committed, it'll remain so even in case of power loss.
Transactions
![Page 31: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/31.jpg)
// wrapView.jsimport sequelize from './db/sequelize';export default function wrapView(fn) {
return (req, res, next) => { sequelize.transaction(t => {
return fn(req, res)}).catch(next);
};}
// server.tsapp.post("/api/user", wrapView(users.create));
Wrap everything in a transaction
![Page 32: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/32.jpg)
-- Assume user 1 has $20
postgres=# select money from users where id = 1;
postgres=# update users set money=money – 20 where id=1;
# Increase user 2's moneypostgres=# update users set money=money + 20 where id=2;
-- Now user 1 has -$20
Race condition
postgres=# select money from users where id = 1;
postgres=# update users set money=money – 20 where id=1;
# Increase user 3's moneypostgres=# update users set money=money + 20 where id=2;
![Page 33: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/33.jpg)
postgres=# create table users(id serial primary key,email text not null,money bigint not null default 0 check (money > 0)
);
Constraints
![Page 34: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/34.jpg)
postgres=# begin;
postgres=# update users set money=money – 20 where id=1;-- will error out if necessary
postgres=# update users set money=money + 20 where id=2;
postgres=# commit;
Race conditionpostgres=# begin;
postgres=# update users set money=money – 20 where id=1;-- will error out if necessary
postgres=# update users set money=money + 20 where id=3;
postgres=# commit;
![Page 35: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/35.jpg)
# pythonimport transactionfrom pyramid import testing
from moatpro.web import add_routes
class BaseTest(object):def setup_method(self, method):
self.config = testing.setUp()add_routes(self.config)transaction.begin()
def teardown_method(self, method):testing.tearDown()transaction.abort()
Transactions make for safe testing
![Page 36: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/36.jpg)
Performance and indices
![Page 37: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/37.jpg)
postgres=# CREATE TABLE users(postgres-# id SERIAL,postgres-# email TEXT NOT NULLpostgres-#);
postgres=# INSERT INTO users(email)postgres-# SELECT 'test_' || id::text || '@email.com'postgres-# FROM generate_series(0, 9999999) id;INSERT 0 10000000Time: 27608.651 ms
postgres=# SELECT * FROM users; id | email ------+-------------------- 1 | [email protected] 2 | [email protected] 3 | [email protected]...
Let's make a lot of fake data
![Page 38: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/38.jpg)
postgres=# select * from users where email='[email protected]'; id | email ----+------------------ 1 | [email protected](1 row)
Time: 907.094 ms
-- Can we make this faster?
How slow do things get?
![Page 39: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/39.jpg)
postgres=# explain select * from users where email='something';QUERY PLAN ------------------------------------------------------------------------------ Gather (cost=1000.00..131603.33 rows=50000 width=36) Workers Planned: 2 -> Parallel Seq Scan on users (cost=0.00..125603.33 rows=20833 width=36) Filter: (email = '[email protected]'::text)(4 rows)
Explain query
![Page 40: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/40.jpg)
postgres=# CREATE INDEX on users(email);CREATE INDEXTime: 20291.741 ms
postgres=# explain select * from users where email='[email protected]'; QUERY PLAN ------------------------------------------------------------------------------ Index Scan using users_email_idx on users (cost=0.56..8.58 rows=1 width=36) Index Cond: (email = '[email protected]'::text)(2 rows)
Indices
![Page 41: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/41.jpg)
postgres=# select * from users where email='[email protected]'; id | email ----+------------------ 1 | [email protected](1 row)
Time: 1.547 ms
-- Went from ~1s to ~1ms! x1000 speed up!
Indices
![Page 42: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/42.jpg)
postgres=# CREATE UNIQUE INDEX ON users(email);
postgres=# INSERT INTO users(email) VALUES('[email protected]');INSERT 0 1Time: 12.001 ms
postgres=# INSERT INTO users(email) VALUES ('[email protected]'); ERROR: duplicate key value violates unique constraint "users_email_idx"DETAIL: Key (email)=([email protected]) already exists.Time: 1.410 ms
Unique index
![Page 43: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/43.jpg)
postgres=# CREATE TABLE users2(id SERIAL,email TEXT -- NOT NULL
);postgres=# CREATE UNIQUE INDEX ON users2(email);
postgres=# INSERT INTO users2(email) VALUES (null);INSERT 0 1Time: 21.365 ms
postgres=# INSERT INTO users2(email) VALUES (null);INSERT 0 1Time: 11.785 ms
Unique indices... wat?
![Page 44: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/44.jpg)
postgres=# CREATE TABLE players(postgres-# id SERIAL PRIMARY KEY,postgres-# team TEXT NOT NULL,postgres-# jersey INTEGER NOT NULL,postgres-# first_name TEXT NOT NULL,postgres-# last_name TEXT NOT NULLpostgres-#);
postgres=# CREATE UNIQUE INDEX ON players(team, jersey);
You can do indices on multiple columns
![Page 45: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/45.jpg)
postgres=# \d players
Table "public.players" Column | Type | Modifiers ------------+---------+------------------------------------------------------ id | integer | not null default nextval('players_id_seq'::regclass) team | text | not null jersey | integer | not null first_name | text | not null last_name | text | not nullIndexes: "players_pkey" PRIMARY KEY, btree (id) "players_team_jersey_idx" UNIQUE, btree (team, jersey)
A primary key is a not-null, unique constraint with an index
![Page 46: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/46.jpg)
ETL with Postgres
![Page 47: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/47.jpg)
moat ad search
![Page 48: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/48.jpg)
postgres=# create table impressions(brand_id integer not null,ad_id integer not null,created_at timestamp not null default current_timestamp
);
-- For a specific brandand ad, we want results like this:-- day | brand_id | ad_id | num_impressions
Impressions table
![Page 49: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/49.jpg)
postgres=# create table impressions(postgres-# brand_id int not null,postgres-# ad_id int not null,postgres-# created_at timestamp not null default current_timestamppostgres-#);
postgres=# select created_at::date, brand_id, ad_id, count(id)postgres-# from impressionspostgres-# group by created_at::date, brand_id, ad_idpostgres-# where ad_id = foo and brand = bar;
-- This query is a mouthful and it'd be a pain to type out all the time.
Simple group by
![Page 50: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/50.jpg)
postgres=# create view ad_brand_impressions_per_daypostgres-# select created_at::date as day, brand_id, ad_id, count(id)postgres-# from impressionspostgres-# group by created_at::date, brand_id, ad_idpostgres-# where ad_id = foo and brand = bar;
postgres=# select * from num_impressions_per_daypostgres-# where brand_id=?;
-- What if this is slow?
View
![Page 51: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/51.jpg)
-- What if you want to see the number of ads a brand had in a daterange?
postgres=# select brand_id, count(distinct ad_id)postgres-# from num_impressions_per_daypostgres-# where brand_id=<blah> and day between '2017-01-01' and '2017-01-31'postgres-# group by brand_id;
-- What if this is slow?
Number of creatives per brand?
![Page 52: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/52.jpg)
create materialized view impressions_by_dayasselect
created_at::date as day,brand_id,ad_id, count(id) as num_impressions
from impressionsgroup by created_at::date, hostname_id, ad_id;
create unique index on impressions_by_day(day, brand_id, ad_id);
select * from impressions_by_day where brand_id=foo and ad_id=bar;
Materialized view
![Page 53: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/53.jpg)
# Your ETL is now:refresh materialized view impressions_by_day;
# But refreshes block reads, so if you have a unique index,# you can do:refresh materialized view impressions_by_day concurrently;
Materialized view
![Page 54: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/54.jpg)
JSON and Postgres
![Page 55: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/55.jpg)
latency analytics
![Page 56: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/56.jpg)
• Our data is coming in via JSON because it's coming from a browser
• We don't know how our JSON schema will change
• We're going to have a lot of columns
• In this case, maybe it's fine to use JSON column type
Things to consider
![Page 57: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/57.jpg)
postgres=# create table ad_analysis(id serial primary key,
analysis json not null,created_at timestamp not null default current_timestamp,updated_at timestamp not null default current_timestamp
);postgres=# insert into ad_analysis(analysis) values('{"size": 1234}'::json);postgres=# select analysis->>'size' from ad_analysis; ?column? ---------- 1234(1 row)
JSON
![Page 58: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/58.jpg)
postgres=# SELECT '{"c":0, "a":2,"a":1}'::json;json ------------------------ {"c":0, "a":2,"a":1}(1 row)
postgres=# SELECT '{"c":0, "a":2,"a":1}'::json->>'a'; ?column? ---------- 1(1 row)
postgres=# create index on ad_analysis(analysis);ERROR: data type json has no default operator class for access method "btree"
Normal JSON type isn't great
![Page 59: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/59.jpg)
postgres=# SELECT '{"c":0, "a":2,"a":1}'::jsonb; jsonb ------------------ {"a": 1, "c": 0}(1 row)
Time: 0.833 ms
postgres=# create index on ad_analysis(analysis);CREATE INDEXTime: 30.101 ms
-- Really good performance!
Use JSONB
![Page 60: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/60.jpg)
Scaling Postgres @ Moat
![Page 61: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/61.jpg)
moat analytics
![Page 62: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/62.jpg)
• Measure attention online
• Don't track cookies/ip addresses
• We're a neutral third party that publishers and advertisers use
• We process billions of events a day
What we do
![Page 63: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/63.jpg)
lots of data
![Page 64: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/64.jpg)
• Elmo: Last 33 Days
• Marjory: A few years
• Frackles: All of our historical data
• Decanter: aggregates our data
Databases
![Page 65: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/65.jpg)
decanter
![Page 66: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/66.jpg)
create server other_db foreign data wrapper postgres_fdw options (host 'other_db.moat.co',port '5432',dbname 'other_db'
);create user mapping for public server other_db options (
user 'user',password 'password'
);
create schema foo;import foreign schema foo from server search_rds into foo;
# works like a normal tableselect * from foo.table;
Foreign Data Wrapper
![Page 67: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/67.jpg)
• Great way to talk to other databases
• Can talk to other types of data stores too (there's a mysql_fdw, elasticsearch_fdw, etc).
• Careful – the query planner can be dumb. You can put the results in an mview if you want.
Foreign Data Wrappers
![Page 68: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/68.jpg)
• PostgreSQL instance that doesn't store data
• High CPU, high memory, no storage
• Gets results from the other databases via FDW and aggregates them
• Highly available (pgbouncer)
• If the databases down there change, the client doesn't notice
Decanter
![Page 69: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/69.jpg)
marjory
![Page 70: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/70.jpg)
• We have a lot of data
• We're basically bean-counters ("how many times did X happen?". So, we basically need to filter and sum.
• The rows are very wide and very sparse (a lot of NULLs).
• This makes columnar solutions not work well
• This is very compressable
Our data considerations
![Page 71: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/71.jpg)
![Page 72: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/72.jpg)
![Page 73: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/73.jpg)
![Page 74: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/74.jpg)
![Page 75: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/75.jpg)
• 8-10x compression for our data
• CPU bound instead of IO bound
• Pretty great performance
• We use it for some historical data (a few years)
• Inflexible: we're trading generality for fit to our use case
• Probably not great if there's a lot of rows
Marjory
![Page 76: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/76.jpg)
frackles
![Page 77: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/77.jpg)
• Access ALL our historical data
• Needs to be cheap
• Support a lot of concurrency
• Be able to spin up new instances fast for scale
• Bonus: be faster than RedShift
Frackles
![Page 78: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/78.jpg)
• Our database was just a series of SQLite files in S3…
• We used AWS Lambda to pull those S3 files and read from them…
• We used a Python client to manage all of those lambdas and gave it a simple API…
• We wrote our own FDW to talk to the Python client…
• (I was really skeptical)
What if…
![Page 79: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/79.jpg)
![Page 80: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/80.jpg)
• Surprisingly good performance
• It's dirt cheap – you only pay for S3 and Lambdas
• Completely stateless (almost no ops!)
• Constant overhead
• S3 performance has high variance
• Capped on number of concurrent lambdas
• Better on queries with little aggregation
Frackles
![Page 81: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/81.jpg)
![Page 82: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/82.jpg)
In summary
![Page 83: Introduction To Postgres6.148.scripts.mit.edu/2018/pages/lectures/WEBday7_postgre.pdf · • I joined in 2013, when we were ~20 people • Acquired by Oracle during summer of 2017](https://reader033.vdocuments.us/reader033/viewer/2022042207/5eaa7a35ceed0f10fc5f3028/html5/thumbnails/83.jpg)
We’re hiring. A [email protected]