Download - Indexes don't mean slow inserts
![Page 1: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/1.jpg)
www.postgrespro.ru
Indexes don't meanslow inserts
Anastasia Lubennikova
![Page 2: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/2.jpg)
Agenda
1. Why do we need it?2. Write-optimisation techniques3. PostgreSQL specific4. Advanced PostgreSQL indexes5. Future of indexing in PostgreSQL
2/29
![Page 3: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/3.jpg)
PostgreSQL indexes
● Speed up search● Primary key● Constraints
● Secondaryindexes
3/29
![Page 4: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/4.jpg)
Index maintenance overhead
● Index size● INSERT slowdown● Random I/O● Index becomes
fragmented
● More indexes -more overhead
4/29
![Page 5: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/5.jpg)
Do we need write-optimisation?
UPDATE mytable SET a = a + 1;
● Heavy write load● MVCC
update = insert
5/29
![Page 6: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/6.jpg)
Do we need write-optimisation?
● 1Gb table ● Update all values:
Without index ~ 200sWith index ~ 600s
6/29
![Page 7: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/7.jpg)
DBMS trade-offs
7/29
![Page 8: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/8.jpg)
DBMS trade-offs
● CAP theorem● ACID vs BASE● Lower hardware cost vs Increase productivity● Read speed vs Write speed● Productivity vs Fault-tolerance
8/29
![Page 9: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/9.jpg)
Write-optimisation
● Writes are faster● Reads are good● Storage is fault-tolerant
9/29
![Page 10: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/10.jpg)
Insert buffer
● Accumulate data. Sort. Insert at a time.+ Avoids random I/O- Seqscan buffer- Possible data loss- Merge time
● Avoids hidden scans- only non-unique
● MySQL InnoDB Change Buffer 10/29
![Page 11: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/11.jpg)
Cache-oblivious data structures
● Approximately optimal for any hardware● Divide & Conquer
11/29
![Page 12: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/12.jpg)
LSM trees
● Cascade of B-trees● First tree is in memory
● LevelDB● BigTable ● Cassandra ● Hbase● SophiaDB● other NoSQL DBs
12/29
![Page 13: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/13.jpg)
From LSM to COLA
● Search optimisation for LSM● Leaf levels are linked by lookahead pointers
● Introduced in 2007 by founders of Tokutek
13/29
![Page 14: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/14.jpg)
Cache-oblivious lookahead arrays
● Drop internal trees nodes● Bound the scan area with lookahead pointers
14/29
![Page 15: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/15.jpg)
COLA: theory and pracrice
● Prototype shows incredible results!
15/29
![Page 16: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/16.jpg)
COLA: theory and pracrice
● VACUUM?● WAL?● Concurrency?● Index size?
● too hard =(
● Prototype shows incredible results!
16/29
![Page 17: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/17.jpg)
Fractal Tree
● Insert the message instead of the data● Send it down the tree● Apply a message
to leaf page
● TokuDB for MySQL● TokuMX for MongoDB
17/29
![Page 18: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/18.jpg)
PostgreSQL specific
18/29
![Page 19: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/19.jpg)
PostgreSQL specific (1)
● Write-Ahead-Log• WAL is not extendable
● Storage manager• 1 Relation (Heap or Index) = 1 continuous file• Free Space Map
● Block size• 8 Kb
19/29
![Page 20: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/20.jpg)
Advanced indexes
20/29
![Page 21: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/21.jpg)
Advanced PostgreSQL indexes
● Optimize the number of indexes• pg_stat_statements
● REINDEX● CREATE INDEX CONCURRENTLY
• rebuild bloated and fragmented indexes● Partial indexes● BRIN
• tiny min/max index
21/29
![Page 22: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/22.jpg)
Ideas?
22/29
![Page 23: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/23.jpg)
Covering and Unique
● To maintain constraint (Unique, Pimary key..) on A● To use IndexOnlyScan on A,B● Have to maintain2 indexes
23/29
![Page 24: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/24.jpg)
Covering + Unique
CREATE UNIQUE INDEX ON mytable USING btree(a)INCLUDING(b);
24/29
![Page 25: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/25.jpg)
Covering + Unique (1)
# CREATE INDEX covering_idx ON people (first_name) INCLUDING (last_name);
# EXPLAIN SELECT first_name, last_name FROM people WHERE first_name = 'Paul'; QUERY PLAN--------------------------------------------------------------------------- Index Only Scan using covering_idx on people (cost=0.28..1.44 rows=4 width=13) Index Cond: (first_name = 'Paul'::text)
25/29
![Page 26: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/26.jpg)
Covering + Unique (2)
● Searches are faster• You can use IndexOnlyScan now
● Indexes maintenace overhead decreased• Inserts are faster• Size is smaller
26/29
![Page 27: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/27.jpg)
Effective storage of duplicates
● Compress duplicated keys on index pageIN PROGRESS
27/29
![Page 28: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/28.jpg)
Bulk insert
● INSERT INTO mytable SELECT x FROM generate_series(0, 1000000) as x;
● 1.000.000 B-tree searches● 1.000.000 WAL records
28/29
![Page 29: Indexes don't mean slow inserts](https://reader031.vdocuments.us/reader031/viewer/2022021813/588502221a28ab7d698b5aa3/html5/thumbnails/29.jpg)
Insert Buffer
● Flexible● Recoverable
29/29