scaling with postgres (robert treat)

Post on 13-Dec-2014

922 Views

Category:

Technology

2 Downloads

Preview:

Click to see full reader

DESCRIPTION

 

TRANSCRIPT

Highload++ 2010

Scaling with PostgresRobert Treat

Monday, October 25, 2010

Who Am I?

✤ Robert Treat

✤ OmniTI

✤ Design, Development, Database, Ops

Monday, October 25, 2010

Who Am I?

✤ Robert Treat

✤ OmniTI

✤ Design, Development, DATABASE, Ops

Monday, October 25, 2010

Who Am I?

✤ Robert Treat

✤ OmniTI

✤ Design, Development, DATABASE, Ops

✤ Etsy, Allisports, National Geographic, Gilt, etc...

Monday, October 25, 2010

Who Am I?

✤ Robert Treat

✤ Postgres

✤ Web, Advocacy, phpPgAdmin

✤ Major Contributor

Monday, October 25, 2010

Who Am I?

✤ Postgres 6.5 -> 9.1alpha1

✤ Terabytes of data

✤ Millions of transactions per day

✤ OLTP, ODS, DW

✤ Perl, PHP, Java, Ruby, C#

Monday, October 25, 2010

Who Am I?

OBSERVATION == LEARNING(hopefully)

Monday, October 25, 2010

Scalability

It is the ability of a computer application or product (hardware

or software) to continue to function well when it (or its

context) is changed in size or volume in order to meet a user

need.

Monday, October 25, 2010

Scalability

Given ever increasing load

Monday, October 25, 2010

Scalability

NEVER GO DOWNALWAYS PERFORM WELL

Given ever increasing load

Monday, October 25, 2010

Scalability

NEVER GO DOWNALWAYS PERFORM WELL

Given ever increasing load

impossible goal, but we’ll try

Monday, October 25, 2010

Scalability

NEVER GO DOWNALWAYS PERFORM WELL

Given ever increasing load

NOTE! data loss is not a goal, but ideally we won’t lose it :-)

Monday, October 25, 2010

It starts with culture...

Monday, October 25, 2010

✤ Get over schema purity

✤ add column default not null

Monday, October 25, 2010

✤ Get over schema purity

✤ add column default not null

Good performance comes from good schema design, HOWEVER, perfect relational

modeling is NOT THE GOAL

Monday, October 25, 2010

✤ Devs must own schema and queries

✤ they design, you refine

Monday, October 25, 2010

✤ Devs must own schema and queries

✤ they design, you refine

Performance and scalability cannot be managed solely within the database; both require application level knowledge. To

achieve this, application developers need to have visibility of the resources they work on

Monday, October 25, 2010

Gain Visibility

Monday, October 25, 2010

Gain Visibility

✤ Monitoring

✤ Alerts

✤ Trending

✤ Capacity Planning

✤ Performance Tuning

Monday, October 25, 2010

Gain Visibility

✤ Alerts

✤ server: out of disk space, high load, etc...

✤ database: connections, sequences, etc...

✤ business: registrations, revenue, etc...

✤ etc...

check_postgres.pl

Monday, October 25, 2010

Gain Visibility

✤ Trending

✤ server: disk usage, load, etc...

✤ database: connections, sequences, etc...

✤ business: registrations, revenue, etc...

✤ etc...

cacti, mrtg, circonus

Monday, October 25, 2010

Gain Visibility

✤ Capacity Planning

✤ disks, cpu, memory

✤ connections, vacuum, bloat

simple projections, done regularly, are good enough

Monday, October 25, 2010

Gain Visibility

✤ Performance tuning

✤ how long do queries take?

✤ how often do they run?

pgfouine

Monday, October 25, 2010

Gain Visibility

COMMITS/PUSHES

Monday, October 25, 2010

Gain Visibility

ALL alerts, graphs, query reports, etc... MUST be available to EVERYONE on

the team AT ALL TIMES

Monday, October 25, 2010

Hands on

You can’t succeed without first putting the right culture in place.

Once you are on the right path, make sure you have the right technology

Monday, October 25, 2010

Postgres Versions

✤ MINIMUM: 8.3

✤ removes xid for read only queries, significant reduction in vacuum activity

Monday, October 25, 2010

Postgres Versions

✤ MINIMUM: 8.3

✤ removes xid for read only queries, significant reduction in vacuum activity

seriously!

Monday, October 25, 2010

Postgres Versions

✤ MINIMUM: 8.3

✤ removes xid for read only queries, significant reduction in vacuum activity

✤ BETTER: 8.4

✤ revised free space map management leads to more efficient vacuuming

Monday, October 25, 2010

Postgres Versions

✤ MINIMUM: 8.3

✤ removes xid for read only queries, significant reduction in vacuum activity

✤ BETTER: 8.4

✤ revised free space map management leads to more efficient vacuuming

✤ WHY NOT? 9.0

✤ Hot standby / streaming replication couldn’t hurt

Monday, October 25, 2010

Speaking of replication

✤ Common practice for scaling websites

✤ Good for READ based loads

✤ We have used many:

✤ slony, rubyrep, bucardo, 9.0 built-in, mammoth, wrote-our-own

Monday, October 25, 2010

Speaking of replication

Monday, October 25, 2010

Speaking of replication

✤ No favorite system for this, evaluate based on:

✤ avoid solutions that duplicate writes at sql level (imho)

✤ how comfortable am I debugging the system?

✤ do you need automated schema changes?

✤ how much redundancy / complexity do you need?

✤ how does the system handle node failure for N nodes?

Monday, October 25, 2010

So what would you use? (tm)

✤ 2 Nodes, master + standby: Postgres 9.0

✤ Master + multiple slaves: Slony

✤ Master-Master: Bucardo

All choices subject to change!!

Monday, October 25, 2010

A word about “Sharding”

✤ Distributed computing is hard(er)

✤ we think of things in a singular global state

✤ the more we can work in that model, the better

✤ RDBM offer poor solutions for multiple masters

✤ you must manage that complexity on your own

Monday, October 25, 2010

A word about “Sharding”

✤ Splitting systems by service:

✤ separate db for login, forums, sales, etc...

✤ allows for growth

✤ provides simple interface

Monday, October 25, 2010

Pooling

✤ Postgres connections are expensive!

✤ fork new process per connection

✤ keep 1 process open per connection

✤ 1000+ processes you will notice trouble

Monday, October 25, 2010

Pooling

✤ Postgres connections are expensive!

✤ fork new process per connection

✤ keep 1 process open per connection

✤ 1000+ processes you will notice trouble

✤ POOLING

✤ JDBC, mod-perl

✤ pgbouncer ftw!

Monday, October 25, 2010

Summary

✤ Schema / Queries should be shared between dev, dba teams!

✤ Monitoring + Visibility!

✤ >= 8.3 Required!

✤ Replication, jump in it!

✤ Use connection pooling!

Monday, October 25, 2010

Thanks!

more:@robtreat2

www.xzilla.net

Oleg & CrewHighload++

OmniTIPostgres Community!

You!

Monday, October 25, 2010

top related