Download - Bigger data with PostgreSQL 9
![Page 1: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/1.jpg)
Slide 1
© by Numius nvOpen systems, Smarter people
Bigger data with PostgreSQL 9
Datawarehousing in the 21st century.
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
![Page 2: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/2.jpg)
Slide 2
© by Numius nvOpen systems, Smarter people
The presenter..
• Bert Desmet
• Consultant @ Deloitte
• System Engineer / DBA for deloitteanalytics.eu
• 'devop'?
![Page 3: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/3.jpg)
Slide 3
© by Numius nvOpen systems, Smarter people
agenda
• Introduction
• Release the elephants!
• Impacting factors
• Divide et impera
• Basic configuration
• Passing the speed limits
• Keep your database fit
![Page 4: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/4.jpg)
Slide 4
© by Numius nvOpen systems, Smarter people
Big data?
● 44x data growth per year!
● About 35.2 zettabyte by 2020
● 80% of data is unstructured
● The volume will grow by a whopping 650% in the next 5years
● 80% of organisations will use cloud analytics
● By 2014 80% of eneterprises will want a saas based bi system
![Page 5: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/5.jpg)
Slide 5
© by Numius nvOpen systems, Smarter people
Know your limits
● DB2
● More load
● Scaling
● Speed
● Data size
● Pricing
![Page 6: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/6.jpg)
Slide 6
© by Numius nvOpen systems, Smarter people6 Footer
Release the elephants!
![Page 7: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/7.jpg)
Slide 7
© by Numius nvOpen systems, Smarter people
PostgreSQL 9
● Good for big databases
● Easy maintenance
● Scales!
● Very fast
● Extendable
![Page 8: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/8.jpg)
Impacting factors
![Page 9: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/9.jpg)
Slide 9
© by Numius nvOpen systems, Smarter people
Higly impacting operations
• Dataload
• In bulk (ETL)
• Row by row. Up to 100k rows / minute
• Datafetch (Reporting)
• We do like joins. The more the better.
![Page 10: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/10.jpg)
Slide 10
© by Numius nvOpen systems, Smarter people
Extra problems
• a lot of I/O
• A lot of cpu power (index creation)
• A lot of locks
![Page 11: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/11.jpg)
Slide 11
© by Numius nvOpen systems, Smarter people
The solution?
• Use at least 2 servers
• Set up binary replication
• Put a lot of ram in your servers.
![Page 12: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/12.jpg)
Slide 12
© by Numius nvOpen systems, Smarter people
Dataflow
![Page 13: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/13.jpg)
Slide 13
© by Numius nvOpen systems, Smarter people13 Footer
Devide et Impera
![Page 14: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/14.jpg)
Slide 14
© by Numius nvOpen systems, Smarter people
Replication with postgres
• 8.3 Warm Standby
• 9.0 Async. Binary Replication
• 9.1 Synchronous Replication
• 9.2 Cascading Replication
• 9.3 more improvents towards fail overs / switching masters
• 9.4 Multimaster Binary Replication?
![Page 15: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/15.jpg)
Slide 15
© by Numius nvOpen systems, Smarter people
Configure replication
• Wal_level = ‘host standby’
• Checkpoint_segments >= 32
• Checkpoint_completetion_target >= 0.8
• Hot_standby = on
• Hot_standby_feedback = on
![Page 16: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/16.jpg)
Slide 16
© by Numius nvOpen systems, Smarter people
![Page 17: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/17.jpg)
Slide 17
© by Numius nvOpen systems, Smarter people
Keep it simple, stupid
• 2nd quadrant is pretty awesome
• Barman for backups
• Repmgr for replication management
![Page 18: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/18.jpg)
Slide 18
© by Numius nvOpen systems, Smarter people
Basic configuration
![Page 19: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/19.jpg)
Slide 19
© by Numius nvOpen systems, Smarter people
Raise those memory limits!
• shared_buffers = 1/8 to ¼ of RAM
• work_mem = 128MB to 1GB
• maintenance_work_mem = 512MB to 1GB
• temp_buffers = 128MB to 1GB
• effective_cache_size = ¾ of RAM
• wal_buffers = 32MB
![Page 20: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/20.jpg)
Slide 20
© by Numius nvOpen systems, Smarter people
Tune the planner for correct planning
• Random_page_cost = 3
• Cpu_tuple_cost = 0.1
• Contraint_exclusion=on
• From_collapse_limit => 12
• Join_collapse_limit => 12
![Page 21: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/21.jpg)
Slide 21
© by Numius nvOpen systems, Smarter people
Passing the speed limits
![Page 22: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/22.jpg)
Slide 22
© by Numius nvOpen systems, Smarter people
Use partitions
• Think about the partition key!
• Trigger based for row / row inserts
• Rule based for bulk inserts
• Make sure you add constraints
![Page 23: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/23.jpg)
Slide 23
© by Numius nvOpen systems, Smarter people
Use indexes
• Learn to read query explains
• Use http://explain.depesz.com/
• Don’t over index
![Page 24: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/24.jpg)
Slide 24
© by Numius nvOpen systems, Smarter people
Other sane things to do
• Use unique indexes
• Auto created when defining a primary key
• Use clustered indexes
• And cluster those tables regularly
![Page 25: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/25.jpg)
Slide 25
© by Numius nvOpen systems, Smarter people
Use partial indexes
• Can only be found in Postgres and Mysql.
• Really usefull on big tables
• Disadvantage: no ‘moving’ indexes. Eg: index for current_day.
![Page 26: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/26.jpg)
Keep your database fit
![Page 27: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/27.jpg)
Slide 27
© by Numius nvOpen systems, Smarter people
Vacuum
• Disable autovacuum for datawarehouses
• Vacuum once a day
• Check regulary if the vacuums to run!
• Prevents data loss
• Prevents the database to go out of control, size wise
![Page 28: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/28.jpg)
Slide 28
© by Numius nvOpen systems, Smarter people
Analyze
• Analyze once a day
• Together with vacuum
• Vacuum analyze <schema>.<table>;
• ‘default_statistics_target’ >= 300
![Page 29: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/29.jpg)
Slide 29
© by Numius nvOpen systems, Smarter people
Check for bloat!
• Free space on tables.
• Indexes are not optimized anymore
• use nagios check_postgres.pl
![Page 30: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/30.jpg)
Slide 30
© by Numius nvOpen systems, Smarter people
Prevent bloat
• Vacuum full • Offline!
• Only when a pk is not available
• Repack• Online!
• Orders the tables (clustered index)
• Needs a pk on the table
• Reindex
• Reindex regulary.
![Page 31: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/31.jpg)
Slide 31
© by Numius nvOpen systems, Smarter people
Partial indexes?
• Write a script
• Use a cronjob
• Recreate your time-aware indexes every day. Will be fast.
![Page 32: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/32.jpg)
Slide 32
© by Numius nvOpen systems, Smarter people
![Page 33: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/33.jpg)
Slide 33
© by Numius nvOpen systems, Smarter people
Questions?
• Postgres has an awesome community ®
• Irc: #postgresql @ freenode
• Check the mailing list
![Page 34: Bigger data with PostgreSQL 9](https://reader031.vdocuments.us/reader031/viewer/2022013111/555c0ba7d8b42ad27e8b4974/html5/thumbnails/34.jpg)
Slide 34
© by Numius nvOpen systems, Smarter people