postgres performance for humans

87
P!"#r$% P $rf&r’()*$ f&r H+’()% @*r(,#-$r%",$)%

Upload: craig-kerstiens

Post on 11-May-2015

3.426 views

Category:

Technology


2 download

TRANSCRIPT

Page 1: Postgres performance for humans

P!"#r$% P$rf&r'()*$f&r H+'()%

@*r(,#-$r%",$)%

Page 3: Postgres performance for humans

Postgres - TLDR

Page 4: Postgres performance for humans

Postgres - TLDR Datatypes

Conditional IndexesTransactional DDL

Foreign Data WrappersConcurrent Index Creation

ExtensionsCommon Table Expressions

Fast Column AdditionListen/Notify

Table InheritancePer Transaction sync replication

Window functionsNoSQL inside SQL

Momentum

Page 5: Postgres performance for humans

OLTP vs OLAP

Page 6: Postgres performance for humans

OLTP

W$b(pp%

Page 7: Postgres performance for humans

Postgres Setup/Config

Page 8: Postgres performance for humans

Postgres Setup/ConfigOn Amazon

Page 9: Postgres performance for humans

Postgres Setup/ConfigOn Amazon

Use Heroku OR ‘postgresql when its not your dayjob’

Page 10: Postgres performance for humans

Postgres Setup/ConfigOn Amazon

Use Heroku OR ‘postgresql when its not your dayjob’Other clouds

Page 11: Postgres performance for humans

Postgres Setup/ConfigOn Amazon

Use Heroku OR ‘postgresql when its not your dayjob’Other clouds

‘postgresql when its not your dayjob’

Page 12: Postgres performance for humans

Postgres Setup/ConfigOn Amazon

Use Heroku OR ‘postgresql when its not your dayjob’Other clouds

‘postgresql when its not your dayjob’Real hardware

Page 13: Postgres performance for humans

Postgres Setup/ConfigOn Amazon

Use Heroku OR ‘postgresql when its not your dayjob’Other clouds

‘postgresql when its not your dayjob’Real hardware

High performance PostgreSQL

Page 14: Postgres performance for humans

Postgres Setup/ConfigOn Amazon

Use Heroku OR ‘postgresql when its not your dayjob’Other clouds

‘postgresql when its not your dayjob’Real hardware

High performance PostgreSQL

http://thebuild.com/blog/2012/06/04/postgresql-when-its-not-your-job-at-djangocon-europe/

Page 15: Postgres performance for humans

C(*.$

Page 16: Postgres performance for humans

80/20 r+/$

Page 17: Postgres performance for humans
Page 18: Postgres performance for humans

Cache Hit RateSELECT 'index hit rate' as name, (sum(idx_blks_hit) - sum(idx_blks_read)) / sum(idx_blks_hit + idx_blks_read) as ratio FROM pg_statio_user_indexes union all SELECT 'cache hit rate' as name, case sum(idx_blks_hit) when 0 then 'NaN'::numeric else to_char((sum(idx_blks_hit) - sum(idx_blks_read)) / sum(idx_blks_hit + idx_blks_read), '99.99')::numeric end as ratio FROM pg_statio_user_indexes)

Page 19: Postgres performance for humans

Cache Hit Rate name | ratio----------------+------------------------ cache hit rate | 0.99

Page 20: Postgres performance for humans

Index Hit RateSELECT relname, 100 * idx_scan / (seq_scan + idx_scan), n_live_tupFROM pg_stat_user_tablesORDER BY n_live_tup DESC;

Page 21: Postgres performance for humans

Index Hit Rate relname | percent_of_times_index_used | rows_in_table ---------------------+-----------------------------+--------------- events | 0 | 669917 app_infos_user_info | 0 | 198218 app_infos | 50 | 175640 user_info | 3 | 46718 rollouts | 0 | 34078 favorites | 0 | 3059 schema_migrations | 0 | 2 authorizations | 0 | 0 delayed_jobs | 23 | 0

Page 22: Postgres performance for humans

S.&r"*+"%

Page 23: Postgres performance for humans

$ cat ~/.psqlrc

\set ON_ERROR_ROLLBACK interactive

-- automatically switch between extended and normal\x auto

-- always show how long a query takes\timing

\set show_slow_queries 'SELECT (total_time / 1000 / 60) as total_minutes, (total_time/calls) as average_time, query FROM pg_stat_statements ORDER BY 1 DESC LIMIT 100;'

psql

Page 24: Postgres performance for humans

$ cat ~/.psqlrc

\set ON_ERROR_ROLLBACK interactive

-- automatically switch between extended and normal\x auto

-- always show how long a query takes\timing

\set show_slow_queries 'SELECT (total_time / 1000 / 60) as total_minutes, (total_time/calls) as average_time, query FROM pg_stat_statements ORDER BY 1 DESC LIMIT 100;'

psql

Page 25: Postgres performance for humans

datascopehttps://github.com/will/datascope

Page 26: Postgres performance for humans

U)0$r%"()0,)# Sp$*,1* Q+$r2 P$rf&r'()*$

Page 27: Postgres performance for humans

Understanding Query Performance

Page 28: Postgres performance for humans

Understanding Query Performance

SELECT last_name FROM employees WHERE salary >= 50000;

Page 29: Postgres performance for humans

Explain# EXPLAIN SELECT last_name FROM employees WHERE salary >= 50000;

QUERY PLAN--------------------------------------------------Seq Scan on employees (cost=0.00..35811.00 rows=1 width=6) Filter: (salary >= 50000)(3 rows)

Page 30: Postgres performance for humans

Explain# EXPLAIN SELECT last_name FROM employees WHERE salary >= 50000; QUERY PLAN--------------------------------------------------Seq Scan on employees width=6) Filter: (salary >= 50000)(3 rows) startup time max time rows return

(cost=0.00..35811.00 rows=1

Page 31: Postgres performance for humans

Explain Analyze# EXPLAIN ANALYZE SELECT last_name FROM employees WHERE salary >= 50000; QUERY PLAN--------------------------------------------------Seq Scan on employees (cost=0.00..35811.00 rows=1 width=6) (actual time=2.401..295.247 rows=1428 loops=1) Filter: (salary >= 50000)Total runtime: 295.379(3 rows)

Filter: (salary >= 50000)(3 rows)

startup time max time rows returnactual time

2.401..295.247 rows=1428

295.379

Page 32: Postgres performance for humans

Rough guidelines

Page response times < 100 ms

Common queries < 10ms

Rare queries < 100ms

Page 33: Postgres performance for humans

Explain Analyze# EXPLAIN ANALYZE SELECT last_name FROM employees WHERE salary >= 50000; QUERY PLAN--------------------------------------------------Seq Scan on employees (cost=0.00..35811.00 rows=1 width=6) (actual time=2.401..295.247 rows=1428 loops=1) Filter: (salary >= 50000)Total runtime: 295.379(3 rows)

Filter: (salary >= 50000)(3 rows)

startup time max time rows returnactual time

2.401..295.247 rows=1428

295.379

Page 34: Postgres performance for humans

# CREATE INDEX idx_emps ON employees (salary);

Indexes!

Page 35: Postgres performance for humans

Indexes!EXPLAIN ANALYZE SELECT last_name FROM employees WHERE salary >= 50000; QUERY PLAN--------------------------------------------------Index Scan using idx_emps on employees (cost=0.00..8.49 rows=1 width=6) (actual time = 0.047..1.603 rows=1428 loops=1) Index Cond: (salary >= 50000)Total runtime: 1.771 ms(3 rows)

Page 36: Postgres performance for humans

Indexes!EXPLAIN ANALYZE SELECT last_name FROM employees WHERE salary >= 50000; QUERY PLAN--------------------------------------------------Index Scan using idx_emps on employees (cost=0.00..8.49 rows=1 width=6) (actual time = 0.047..1.603 rows=1428 loops=1) Index Cond: (salary >= 50000)Total runtime: 1.771 ms(3 rows)

Page 37: Postgres performance for humans

Indexes!EXPLAIN ANALYZE SELECT last_name FROM employees WHERE salary >= 50000; QUERY PLAN--------------------------------------------------Index Scan using idx_emps on employees (cost=0.00..8.49 rows=1 width=6) (actual time = 0.047..1.603 rows=1428 loops=1) Index Cond: (salary >= 50000)Total runtime: 1.771 ms(3 rows)

Page 38: Postgres performance for humans

pg_stat_statements

Page 39: Postgres performance for humans

pg_stat_statements$ select * from pg_stat_statements where query ~ 'from users where email';

userid │ 16384dbid │ 16388query │ select * from users where email = ?;calls │ 2total_time │ 0.000268rows │ 2shared_blks_hit │ 16shared_blks_read │ 0shared_blks_dirtied │ 0shared_blks_written │ 0local_blks_hit │ 0local_blks_read │ 0local_blks_dirtied │ 0local_blks_written │ 0...

Page 40: Postgres performance for humans

pg_stat_statements

Page 41: Postgres performance for humans

SELECT (total_time / 1000 / 60) as total, (total_time/calls) as avg, query FROM pg_stat_statements ORDER BY 1 DESC LIMIT 100;

pg_stat_statements

Page 42: Postgres performance for humans

pg_stat_statements

Page 43: Postgres performance for humans

total | avg | query --------+--------+------------------------- 295.76 | 10.13 | SELECT id FROM users... 219.13 | 80.24 | SELECT * FROM ... (2 rows)

pg_stat_statements

Page 44: Postgres performance for humans

I)03$%

Page 45: Postgres performance for humans

IndexesB-TreeGeneralized Inverted Index (GIN)Generalized Search Tree (GIST)K Nearest Neighbors (KNN)Space Partitioned GIST (SP-GIST)

Page 46: Postgres performance for humans

Indexes

Which do I use?

Page 47: Postgres performance for humans

BTree

This is what you usually want

Page 48: Postgres performance for humans

Generalized Inverted Index (GIN)

Use with multiple values in 1 columnArray/hStore

Page 49: Postgres performance for humans

Generalized Search Tree (GIST)Full text searchShapes

Page 50: Postgres performance for humans

IndexesB-TreeGeneralized Inverted Index (GIN)Generalized Search Tree (GIST)K Nearest Neighbors (KNN)Space Partitioned GIST (SP-GIST)

Page 51: Postgres performance for humans

M&r$ ,)03$%

Page 52: Postgres performance for humans

Indexes

ConditionalFunctionalConcurrent creation

Page 53: Postgres performance for humans

Conditional> SELECT * FROM places;

name | population -----------------------------------ACMAR | 6055ARAB | 13650

Page 54: Postgres performance for humans

Conditional> SELECT * FROM placesWHERE population > 10000;

name | population -----------------------------------ARAB | 13650

Page 55: Postgres performance for humans

Conditional> SELECT * FROM placesWHERE population > 10000;

name | population -----------------------------------ARAB | 13650

Page 56: Postgres performance for humans

Conditional

> CREATE INDEX idx_large_population ON places(name) where population > 10000;

Page 57: Postgres performance for humans

Conditional

> CREATE INDEX idx_large_population ON places(name) where population > 10000;

Page 58: Postgres performance for humans

Functional> SELECT * FROM places;

data -----------------------------------{"city": "ACMAR", "pop": 6055}{"city": "ARAB", "pop": 13650}

Page 59: Postgres performance for humans

> SELECT * FROM placesWHERE get_numeric('pop', data) > 10000;

data -----------------------------------{"city": "ARAB", "pop": 13650}

Functional

Page 60: Postgres performance for humans

> SELECT * FROM placesWHERE get_numeric('pop', data) > 10000;

data -----------------------------------{"city": "ARAB", "pop": 13650}

Functional

Page 61: Postgres performance for humans

> CREATE INDEX idx_large_population ON places(get_numeric('pop', data));

Functional

Page 62: Postgres performance for humans

> CREATE INDEX idx_large_population ON places(get_numeric('pop', data));

Functional

Page 63: Postgres performance for humans

Conditional and Functional

> CREATE INDEX idx_large_population ON places(data) WHERE get_numeric('pop', data) > 10000;

Page 64: Postgres performance for humans

Conditional and Functional

> CREATE INDEX idx_large_population ON places(data) WHERE get_numeric('pop', data) > 10000;

Page 65: Postgres performance for humans

Conditional and Functional

> CREATE INDEX idx_large_population ON places(data) WHERE get_numeric('pop', data) > 10000;

Page 66: Postgres performance for humans

CREATE INDEX CONCURRENTLY ...

roughly 2-3x slowerDoesn’t lock table

One more thing

Page 67: Postgres performance for humans

D4()#&

Page 68: Postgres performance for humans

Connections

Page 69: Postgres performance for humans

Connections

django-postgrespooldjorm-ext-pooldjango-db-pool

Page 70: Postgres performance for humans

django-postgrespoolimport dj_database_urlimport django_postgrespool

DATABASE = { 'default': dj_database_url.config() }DATABASES['default']['ENGINE'] = 'django_postgrespool'

SOUTH_DATABASE_ADAPTERS = { 'default': 'south.db.postgresql_psycopg2'}

Page 71: Postgres performance for humans

django-postgrespoolimport dj_database_urlimport django_postgrespool

DATABASE = { 'default': dj_database_url.config() }DATABASES['default']['ENGINE'] = 'django_postgrespool'

SOUTH_DATABASE_ADAPTERS = { 'default': 'south.db.postgresql_psycopg2'}

Page 72: Postgres performance for humans

pgbouncerpgpool

Other options

Page 73: Postgres performance for humans

A00,)# C(*.$

Page 74: Postgres performance for humans

Replication options

Page 75: Postgres performance for humans

Replication optionsslonybucardopgpool

Page 76: Postgres performance for humans

Replication optionsslonybucardopgpool

wal-ebarman

Page 77: Postgres performance for humans

Replication optionsslonybucardopgpool

wal-ebarman

Page 78: Postgres performance for humans

Update settings

Page 79: Postgres performance for humans

effective_cache_sizeshared_buffers

work_memmaintenance_work_mem

Update settings

Page 80: Postgres performance for humans

B(*-+p%

Page 81: Postgres performance for humans

Logicalpg_dump

can be human readable, is portable

Page 82: Postgres performance for humans

PhysicalThe bytes on disk

Base backup

Page 83: Postgres performance for humans

LogicalGood across architecturesGood for portability

Has load on DB

Works < 50 GB

PhysicalMore initial setupLess portability

Limited load on system

Use above 50 GB

Page 84: Postgres performance for humans

R$*(p

Page 85: Postgres performance for humans

OLAPWhole other talkDisk IO is importantOrder on disk is helpful (pg-reorg)MPP solutions on top of Postgres

Recap

Page 86: Postgres performance for humans

OLTP (webapps)

Ensure bulk of data is cacheEfficient use of indexesWhen cache sucks, throw more at it

Recap

Page 87: Postgres performance for humans

Q+$%",&)%