postgres demystified

Postgres DemystifiedCraig Kerstiens@craigkerstienshttp://www.craigkerstiens.com

https://speakerdeck.com/u/craigkerstiens/p/postgres-demystified

Postgres Demystified

We’re Hiring

Getting Setup

Postgres.app

AgendaBrief History

Developing w/ PostgresPostgres Performance

Querying

Postgres History

PostgresPostgresQL

Post IngressAround since 1989/1995Community Driven/Owned

Each query sees transactions committed before itLocks for writing don’t conflict with reading

Why Postgres

“ its the emacs of databases”

Developing w/ Postgres

Basicspsql is your friend

Basicspsql is your friend# \dt# \d# \d tablename# \x# \e

Datatypes

smallint bigint integer

numeric floatserial money char

varchartext

timestamp

timestamptz date

timetimetz

interval boolean

pointline

polygon

circle

inetcidr

macaddr tsvector

tsquery

arrayXML

varchartext

timestamp

timestamptz date

timetimetz

interval boolean

pointline

polygon

circle

inetcidr

macaddr tsvector

tsquery

arrayXML

Datatypes

varchartext

timestamp

timestamptz date

timetimetz

interval boolean

pointline

polygon

circle

inetcidr

macaddr tsvector

tsquery

arrayXML

Datatypes

varchartext

timestamp

timestamptz date

timetimetz

interval boolean

pointline

polygon

circle

inetcidr

macaddr tsvector

tsquery

arrayXML

CREATE TABLE items ( id serial NOT NULL, name varchar (255), tags varchar(255) [], created_at timestamp);

Datatypes

CREATE TABLE items ( id serial NOT NULL, name varchar (255), tags varchar(255) [], created_at timestamp);

Datatypes

DatatypesINSERT INTO itemsVALUES (1, 'Ruby Gem', '{“Programming”,”Jewelry”}', now());

INSERT INTO items VALUES (2, 'Django Pony', '{“Programming”,”Animal”}', now());

Datatypes

varchartext

timestamp

timestamptz date

timetimetz

interval boolean

pointline

polygon

circle

inetcidr

macaddr tsvector

tsquery

arrayXML

Extensionsdblink hstore

citext

ltreeisncube

pgcrypto

tablefunc

uuid-ossp

earthdistance

trigram

fuzzystrmatchpgrowlocks

pgstattuple

btree_gistdict_int

dict_xsynunaccent

citext

ltreeisncube

pgcrypto

tablefunc

uuid-ossp

earthdistance

trigram

pgstattuple

btree_gistdict_int

dict_xsynunaccent

CREATE EXTENSION hstore;CREATE TABLE users ( id integer NOT NULL, email character varying(255), data hstore, created_at timestamp without time zone, last_login timestamp without time zone);

NoSQL in your SQL

INSERT INTO users VALUES (1, 'craig.kerstiens@gmail.com', 'sex => "M", state => “California”', now(), now()

hStore

SELECT '{"id":1,"email": "craig.kerstiens@gmail.com",}'::json;

V8 w/ PLV8

create or replace functionjs(src text) returns text as $$ return eval( "(function() { " + src + "})" )();$$ LANGUAGE plv8;

V8 w/ PLV8

create or replace functionjs(src text) returns text as $$ return eval( "(function() { " + src + "})" )();$$ LANGUAGE plv8;

V8 w/ PLV8

9.2Bad Idea

Range Types

CREATE TABLE talks (room int, during tsrange);INSERT INTO talks VALUES (3, '[2012-09-24 13:00, 2012-09-24 13:50)');

Range Types

CREATE TABLE talks (room int, during tsrange);INSERT INTO talks VALUES (3, '[2012-09-24 13:00, 2012-09-24 13:50)');

Range Types

ALTER TABLE talks ADD EXCLUDE USING gist (during WITH &&);INSERT INTO talks VALUES (1108, '[2012-09-24 13:30, 2012-09-24 14:00)');ERROR: conflicting key value violates exclusion constraint "talks_during_excl"

Full Text Search

Full Text SearchTSVECTOR - Text DataTSQUERY - Search Predicates

Specialized Indexes and Operators

Datatypes

varchartext

timestamp

timestamptz date

timetimetz

interval boolean

pointline

polygon

circle

inetcidr

macaddr tsvector

tsquery

arrayXML

PostGIS

1. New datatypes i.e. (2d/3d boxes)

PostGIS

i.e. SELECT foo && bar ...2. New operators

PostGIS

i.e. SELECT foo && bar ...

i.e. person within location, nearest distance

2. New operators

3. Understand relationships and distance

PostGIS

Performance

Sequential Scans

They’re Bad

Sequential Scans

They’re Bad (most of the time)

Indexes

They’re Good

Indexes

They’re Good (most of the time)

IndexesB-TreeGeneralized Inverted Index (GIN)Generalized Search Tree (GIST)K Nearest Neighbors (KNN)Space Partitioned GIST (SP-GIST)

IndexesB-Tree

DefaultUsually want this

IndexesGeneralized Inverted Index (GIN)

Use with multiple values in 1 columnArray/hStore

IndexesGeneralized Search Tree (GIST)

Full text searchShapes

Understanding Query Perf

SELECT last_name FROM employees WHERE salary >= 50000;

Explain# EXPLAIN SELECT last_name FROM employees WHERE salary >= 50000; QUERY PLAN-------------------------------------------------------------------------Seq Scan on employees (cost=0.00..35811.00 rows=1 width=6) Filter: (salary >= 50000)(3 rows)

# EXPLAIN SELECT last_name FROM employees WHERE salary >= 50000; QUERY PLAN-------------------------------------------------------------------------Seq Scan on employees (cost=0.00..35811.00 rows=1 width=6) Filter: (salary >= 50000)(3 rows)

Startup Cost

Explain

Startup Cost

Max Time

Explain

Startup Cost

Max Time

Rows Returned

Explain

# EXPLAIN ANALYZE SELECT last_name FROM employees WHERE salary >= 50000; QUERY PLAN-------------------------------------------------------------------------Seq Scan on employees (cost=0.00..35811.00 rows=1 width=6) (actual time=2.401..295.247 rows=1428 loops=1) Filter: (salary >= 50000)Total runtime: 295.379(3 rows)

Explain

Startup Cost

Explain

Startup Cost Max Time

Explain

Rows Returned

Explain

Rows Returned

Explain

# CREATE INDEX idx_emps ON employees (salary);

# CREATE INDEX idx_emps ON employees (salary);# EXPLAIN ANALYZE SELECT last_name FROM employees WHERE salary >= 50000; QUERY PLAN-------------------------------------------------------------------------Index Scan using idx_emps on employees (cost=0.00..8.49 rows=1 width=6) (actual time = 0.047..1.603 rows=1428 loops=1) Index Cond: (salary >= 50000)Total runtime: 1.771 ms(3 rows)

Indexes Pro Tips

Indexes Pro TipsCREATE INDEX CONCURRENTLY

CREATE INDEX WHERE foo=bar

SELECT * WHERE foo LIKE ‘%bar% is BAD

SELECT * WHERE foo LIKE ‘%bar% is BADSELECT * WHERE Food LIKE ‘bar%’ is OKAY

citext

ltreeisncube

pgcrypto

tablefunc

uuid-ossp

earthdistance

trigram

pgstattuple

btree_gistdict_int

dict_xsynunaccent

Cache Hit RateSELECT 'index hit rate' as name, (sum(idx_blks_hit) - sum(idx_blks_read)) / sum(idx_blks_hit + idx_blks_read) as ratio FROM pg_statio_user_indexes union all SELECT 'cache hit rate' as name, case sum(idx_blks_hit) when 0 then 'NaN'::numeric else to_char((sum(idx_blks_hit) - sum(idx_blks_read)) / sum(idx_blks_hit + idx_blks_read), '99.99')::numeric end as ratio FROM pg_statio_user_indexes;)

Index Hit RateSELECT relname, 100 * idx_scan / (seq_scan + idx_scan), n_live_tupFROM pg_stat_user_tablesORDER BY n_live_tup DESC;

Index Hit Rate relname | percent_of_times_index_used | rows_in_table ---------------------+-----------------------------+--------------- events | 0 | 669917 app_infos_user_info | 0 | 198218 app_infos | 50 | 175640 user_info | 3 | 46718 rollouts | 0 | 34078 favorites | 0 | 3059 schema_migrations | 0 | 2 authorizations | 0 | 0 delayed_jobs | 23 | 0

pg_stats_statements

pg_stats_statements$ select * from pg_stat_statements where query ~ 'from users where email';

userid │ 16384dbid │ 16388query │ select * from users where email = ?;calls │ 2total_time │ 0.000268rows │ 2shared_blks_hit │ 16shared_blks_read │ 0shared_blks_dirtied │ 0shared_blks_written │ 0local_blks_hit │ 0local_blks_read │ 0local_blks_dirtied │ 0local_blks_written │ 0temp_blks_read │ 0temp_blks_written │ 0time_read │ 0time_write │ 0

pg_stats_statements

pg_stats_statementsSELECT query, calls, total_time, rows, 100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent FROM pg_stat_statements ORDER BY total_time DESC LIMIT 5;----------------------------------------------------------------------query | UPDATE pgbench_branches SET bbalance = bbalance + ? WHERE bid = ?;calls | 3000total_time | 9609.00100000002rows | 2836hit_percent | 99.9778970000200936 9.2

Querying

Window FunctionsExample:

Biggest spender by state

SELECT email, users.data->'state', sum(total(items)), rank() OVER (PARTITION BY users.data->'state' ORDER BY sum(total(items)) desc)

FROM users, purchases

WHERE purchases.user_id = users.id GROUP BY 1, 2;

Window Functions

SELECT email, users.data->'state', sum(total(items)), rank() OVER (PARTITION BY users.data->'state' ORDER BY sum(total(items)) desc)

FROM users, purchases

WHERE purchases.user_id = users.id GROUP BY 1, 2;

Window Functions

citext

ltreeisncube

pgcrypto

tablefunc

uuid-ossp

earthdistance

trigram

pgstattuple

btree_gistdict_int

dict_xsynunaccent

Fuzzystrmatch

FuzzystrmatchSELECT soundex('Craig'), soundex('Will'), difference('Craig', 'Will');

SELECT soundex('Craig'), soundex('Greg'), difference('Craig', 'Greg');SELECT soundex('Willl'), soundex('Will'), difference('Willl', 'Will');

Moving Data Around\copy (SELECT * FROM users) TO ‘~/users.csv’;

\copy users FROM ‘~/users.csv’;

db_linkSELECT dblink_connect('myconn', 'dbname=postgres');SELECT * FROM dblink('myconn','SELECT * FROM foo') AS t(a int, b text);

a | b -------+------------ 1 | example 2 | example2

Foreign Data Wrappersoracle mysql

informix

twitterfiles

wwwcouch

sybase

s3redis jdbc

mongodb

CREATE EXTENSION redis_fdw;

CREATE SERVER redis_server FOREIGN DATA WRAPPER redis_fdw OPTIONS (address '127.0.0.1', port '6379');

CREATE FOREIGN TABLE redis_db0 (key text, value text) SERVER redis_server OPTIONS (database '0');

CREATE USER MAPPING FOR PUBLIC SERVER redis_server OPTIONS (password 'secret');

Foreign Data Wrappers

SELECT id, email, value as visits

FROM users, redis_db0

WHERE ('user_' || cast(id as text)) = cast(redis_db0.key as text) AND cast(value as int) > 40;

Query Redis from PostgresSELECT * FROM redis_db0;

All are not equaloracle mysql

informix

twitterfiles

wwwcouch

sybase

s3redis jdbc

mongodb

Production Ready FDWsoracle mysql

informix

twitterfiles

wwwcouch

sybase

s3redis jdbc

mongodb

Readability

ReadabilityWITH top_5_products AS ( SELECT products.*, count(*) FROM products, line_items WHERE products.id = line_items.product_id GROUP BY products.id ORDER BY count(*) DESC LIMIT 5)

SELECT users.email, count(*)FROM users, line_items, top_5_productsWHERE line_items.user_id = users.id AND line_items.product_id = top_5_products.idGROUP BY 1ORDER BY 1;

Common Table ExpressionsWITH top_5_products AS ( SELECT products.*, count(*) FROM products, line_items WHERE products.id = line_items.product_id GROUP BY products.id ORDER BY count(*) DESC LIMIT 5)

SELECT users.email, count(*)FROM users, line_items, top_5_productsWHERE line_items.user_id = users.id AND line_items.product_id = top_5_products.idGROUP BY 1ORDER BY 1; ’

SELECT users.email, count(*)FROM users, line_items, top_5_productsWHERE line_items.user_id = users.id AND line_items.product_id = top_5_products.idGROUP BY 1ORDER BY 1; ’Don’t do this in production

Brief HistoryDeveloping w/ PostgresPostgres Performance

Querying

Extras

ExtrasListen/Notify

ExtrasListen/NotifyPer Transaction Synchronous Replication

ExtrasListen/NotifyPer Transaction Synchronous ReplicationSELECT for UPDATE

Postgres - TLDR Datatypes

Conditional IndexesTransactional DDL

Foreign Data WrappersConcurrent Index Creation

ExtensionsCommon Table Expressions

Fast Column AdditionListen/Notify

Table InheritancePer Transaction sync replication

Window functionsNoSQL inside SQL

Momentum

Questions?Craig Kerstiens@craigkerstienshttp://www.craigkerstiens.com

https://speakerdeck.com/u/craigkerstiens/p/postgres-demystified

postgres demystified

new datatypes

alter table talks

new operators

talks values3

range typescreate table

jsonv8 w plv89

room int

int tablefuncunaccentbtree

Technology

using vmware vfabric postgres - vfabric postgres 9.3 · pdf...

ɷiv therapy demystified (demystified nursing)

meteorology demystifieds2.bitdl.ir/ebook/geology/meteorology...

postgres nuvens

advancedcalculus - cloudflare-ipfs.com...as p.net 2.0...

postgres open

demystified series -...

postgres open, september 2014: postgres interface...

postgres presentation

docking postgres

postgres enterprise manager · postgres enterprise manager,...

postgres sql

john melesky - federating queries using postgres fdw @...

postgres plus advanced server ecpg plus...

demystified series · statistics demystified string theory...

barista’ -...

visualizing postgres - 日本postgresqlユーザ会 ·...

using vmware vfabric postgres - vfabric postgres 9.3.5

fluid mechanics - · pdf filequantum mechanics demystified...

craig kerstiens - scalable uniques in postgres @ postgres...