autovacuum, explained for engineers, new improved version pgconf.eu 2015 vienna
TRANSCRIPT
![Page 2: Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna](https://reader030.vdocuments.us/reader030/viewer/2022033022/587b36ce1a28ab057d8b7393/html5/thumbnails/2.jpg)
Outline
• What is it and why is it so important?• Aggressiveness of autovacuum• What else important can autovacuum daemon do• Autovacuum and replication• How to remove bloat
![Page 3: Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna](https://reader030.vdocuments.us/reader030/viewer/2022033022/587b36ce1a28ab057d8b7393/html5/thumbnails/3.jpg)
Two most common problems we meet in our practice
• autovacuum = off• Autovacuum settings are default
![Page 4: Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna](https://reader030.vdocuments.us/reader030/viewer/2022033022/587b36ce1a28ab057d8b7393/html5/thumbnails/4.jpg)
Two most common problems we meet in our practice
• autovacuum = off• Autovacuum settings are default• That means there is a lot we can do about improvingperformance of this particular database
![Page 5: Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna](https://reader030.vdocuments.us/reader030/viewer/2022033022/587b36ce1a28ab057d8b7393/html5/thumbnails/5.jpg)
What is autovacuum?
Modern (classical) databases must deal with twofundamental problems:
• Concurrent operationsFor that they can transactions, ACID transactions
• FailuresFor that they can recover to the last successful transactionusing WAL
![Page 6: Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna](https://reader030.vdocuments.us/reader030/viewer/2022033022/587b36ce1a28ab057d8b7393/html5/thumbnails/6.jpg)
What is autovacuum?
Technically that means• There is a combination of locking and MVCC algorithms thatprovides transactions support
• Undo and Redo information is stored somewhere to makerecovery possible
![Page 7: Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna](https://reader030.vdocuments.us/reader030/viewer/2022033022/587b36ce1a28ab057d8b7393/html5/thumbnails/7.jpg)
What is autovacuum?
In PostgreSQL• Redo - in WAL• Undo - directly in datafiles• UPDATE = INSERT + DELETE• DELETE is just marking tuple as invisible
![Page 8: Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna](https://reader030.vdocuments.us/reader030/viewer/2022033022/587b36ce1a28ab057d8b7393/html5/thumbnails/8.jpg)
xmin
tt=# INSERT into test(id) values(5);INSERT 0 1tt=# select *,xmin,xmax from test;id | xmin | xmax
----+------+------5 | 1266 | 0
(5 rows)
tt=# select txid_current();txid_current
--------------1267
(1 row)
![Page 9: Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna](https://reader030.vdocuments.us/reader030/viewer/2022033022/587b36ce1a28ab057d8b7393/html5/thumbnails/9.jpg)
xmax
tt=# begin;BEGINtt=# UPDATE test set id=5 where id=4;UPDATE 1
In another session:
tt=# select *,xmin,xmax from test;id | xmin | xmax
----+------+------4 | 1264 | 1270
(3 rows)
![Page 10: Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna](https://reader030.vdocuments.us/reader030/viewer/2022033022/587b36ce1a28ab057d8b7393/html5/thumbnails/10.jpg)
Some garbage collection is required
Tuples that are not visible to any running transaction shouldbe removed
• Otherwise fragmentation increases and you run into bloat akaBig Data
• autovacuum workers do that, table by table• Old-fashioned VACUUM is a bad choice
Beside that, autovacuum workers• Collect statistics for the optimizer• Perform wraparound for txid
![Page 11: Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna](https://reader030.vdocuments.us/reader030/viewer/2022033022/587b36ce1a28ab057d8b7393/html5/thumbnails/11.jpg)
Some garbage collection is required
Tuples that are not visible to any running transaction shouldbe removed
• Otherwise fragmentation increases and you run into bloat akaBig Data
• autovacuum workers do that, table by table• Old-fashioned VACUUM is a bad choice
Beside that, autovacuum workers• Collect statistics for the optimizer• Perform wraparound for txid
You do not want to turn autovacuum off!
![Page 12: Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna](https://reader030.vdocuments.us/reader030/viewer/2022033022/587b36ce1a28ab057d8b7393/html5/thumbnails/12.jpg)
This sort of work must be finally done
• If your autovacuum process runs for hours and interferes withsome DDL, to simply terminate it is not an option
• Especially for OLTP, autovacuum should be configuredaggressively enough: so it can work with small portions ofdata quickly
![Page 13: Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna](https://reader030.vdocuments.us/reader030/viewer/2022033022/587b36ce1a28ab057d8b7393/html5/thumbnails/13.jpg)
autovacuum: aggressive enough
postgres=# select name, setting, context from pg_settingswhere category ~ ’Autovacuum’;
name | setting | context-------------------------------------+-----------+------------autovacuum | on | sighupautovacuum_analyze_scale_factor | 0.05 | sighupautovacuum_analyze_threshold | 50 | sighupautovacuum_freeze_max_age | 200000000 | postmasterautovacuum_max_workers | 10 | postmasterautovacuum_multixact_freeze_max_age | 400000000 | postmasterautovacuum_naptime | 60 | sighupautovacuum_vacuum_cost_delay | 20 | sighupautovacuum_vacuum_cost_limit | -1 | sighupautovacuum_vacuum_scale_factor | 0.01 | sighupautovacuum_vacuum_threshold | 50 | sighup
(11 rows)
![Page 14: Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna](https://reader030.vdocuments.us/reader030/viewer/2022033022/587b36ce1a28ab057d8b7393/html5/thumbnails/14.jpg)
Scale factors
• autovacuum_vacuum_scale_factor = 0.01 means that atleast 1% of rows (% of table size) in the table should bechanged before autovacuum happens
• autovacuum_vacuum_threshold - alternative setting, exactnumber of rows
• The idea is to make autovacuum work more frequently,vacuuming tables in small portions
• Can be set per-table, but that can be some sort of pain
![Page 15: Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna](https://reader030.vdocuments.us/reader030/viewer/2022033022/587b36ce1a28ab057d8b7393/html5/thumbnails/15.jpg)
Magic spell
Autovacuum delays autovacuum_naptime seconds, then checkstables if they need a vacuum. It runs vacuum on a table until
autovacuum_vacuum_cost_limit is reached, then sleepsautovacuum_vacuum_cost_delay milliseconds.
![Page 16: Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna](https://reader030.vdocuments.us/reader030/viewer/2022033022/587b36ce1a28ab057d8b7393/html5/thumbnails/16.jpg)
Sometimes a good idea
in crontab:
* * * * * /usr/bin/pgrep -f ’postgres: autovacuum’ | xargs --no-run-if-empty -I $ renice -n 20 -p $ >/dev/null 2>/dev/null* * * * * /usr/bin/pgrep -f ’postgres: autovacuum’ | xargs --no-run-if-empty -I $ ionice -c 3 -t -p $
in postgresql.conf:
autovacuum_max_workers → 10-20 and autovacuum_vacuum_cost_delay → 10
![Page 17: Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna](https://reader030.vdocuments.us/reader030/viewer/2022033022/587b36ce1a28ab057d8b7393/html5/thumbnails/17.jpg)
As a result
![Page 18: Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna](https://reader030.vdocuments.us/reader030/viewer/2022033022/587b36ce1a28ab057d8b7393/html5/thumbnails/18.jpg)
autovacuum_vacuum_cost_delay
• Is a global setting• In some cases can be an issue (one tablespace on SSD,another on SAS)
• On fast SSD, autovacuum_vacuum_cost_delay can beshorter than on slower SAS
![Page 19: Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna](https://reader030.vdocuments.us/reader030/viewer/2022033022/587b36ce1a28ab057d8b7393/html5/thumbnails/19.jpg)
ERROR: canceling statement due to conflict with recovery
• The tuple, cleaned up by autovacuum on master, is still in useby some query on hot standby
• hot_standby_feedback = on - The safest way, in spite ofsome bloat on master
![Page 20: Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna](https://reader030.vdocuments.us/reader030/viewer/2022033022/587b36ce1a28ab057d8b7393/html5/thumbnails/20.jpg)
Before you hurry to reconfigure your PostgreSQL
• autovacuum does not remove existing bloat• dump/restore can be an option, but...• http://reorg.github.io/pg_repack/• https://github.com/PostgreSQL-Consulting/pgcompacttable