developing a database server: software engineer's view
![Page 1: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/1.jpg)
Developing a Database Server: Software Engineer’s View
Laurynas Biveinis / Percona
laurynas.biveinis@{gmail|percona}.com
Big Data Strategy 2015 Vilnius
![Page 2: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/2.jpg)
Which database server?
Percona Server
http://www.percona.com/software/percona-server
A drop-in compatible fork of MySQL
An open-source, relational database management system
Approaching 2,000,000 downloads
![Page 3: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/3.jpg)
A part of the MySQL ecosystem
Enabled by the GNU General Public License
Forks abound
Healthy and thriving
Lots of politics
![Page 4: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/4.jpg)
The main players, pt 1
![Page 5: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/5.jpg)
The main players, pt 2
![Page 6: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/6.jpg)
The main players, pt 3
Big Web Patches
![Page 7: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/7.jpg)
The main players, pt 4
![Page 8: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/8.jpg)
The main players, pt 5
![Page 9: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/9.jpg)
The ecosystem is fragmented, but is it healthy?
One measure is code flow between the forks
![Page 10: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/10.jpg)
A case of super_read_only
![Page 14: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/14.jpg)
A case of super_read_only
Facebook patch implemented it first
Facebook contributed it to WebScaleSQL
Percona Server merged it from WebScaleSQL, sent some bugfixes back to WebScaleSQL
Oracle re-implemented it from scratch for the next major MySQL release
MariaDB did not like it
![Page 15: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/15.jpg)
Code is flowing (mostly) everywhere
Coopetition
![Page 16: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/16.jpg)
Back to Percona Server
Tracks MySQL closely
Diagnostics and management
Performance and scalability
![Page 17: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/17.jpg)
Why diagnostics and management?
Early Percona Server:
Ad-hoc patch for extra diagnostics by Percona consultants
Get billed-per-hour work done more efficiently
![Page 18: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/18.jpg)
Why (InnoDB) performance and scalability?
In 2010, InnoDB was performing worse on a 4-core machine than on a 1-core one
And fixes were not forthcoming at the time
Addressed the need then, built the reputation since
![Page 19: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/19.jpg)
Why not other features?
Feature benefit / feature cost ratio has to be very, very high
Case 1: implement low-hanging fruit
Case 2: implement extremely beneficial features
No rewrites, no refactorings, no code base cleanups
![Page 20: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/20.jpg)
“Why not other features” brings us to lessons learned
![Page 21: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/21.jpg)
Lesson 1: stand on the shoulders of giants
You probably do not need to write a DBMS from scratch
So find a good project to fork
![Page 22: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/22.jpg)
Lesson 2: do not diverge
Do not add a single line of code difference without a very good reason
Unless your engineering team is as big as the upstream one
Improvements such as O(n²) -> O(n log n) algorithms are often not good enough in cold code paths
Plugins are very good
![Page 23: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/23.jpg)
Lesson 3: listen to users
Easier said than done, especially if done right
Listening and then ignoring / downplaying users’ pain
Listening to wrong users
We have the best users! :)
$$$ / €€€ add weight to users’ opinions
Both right and wrong
![Page 24: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/24.jpg)
Lesson 4: Continuous QC
Was not something Percona Server had on Day One
MySQL always had an automated feature/regression test suite
But third parties did not always add tests for their features
Step 1: require developers to actually run the test suite
Step 2: Jenkins per-push
Step 3: …
![Page 25: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/25.jpg)
Lesson 5: wrong ways and slightly less wrong ways to do performance
![Page 26: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/26.jpg)
A Performance Graph
[Figure: chart comparing Product A and Product B; value axis from 0 to 40000]
![Page 27: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/27.jpg)
A Performance Graph
[Figure: chart comparing Product A and Product B; value axis from 0 to 40000]
PRODUCT B IS BETTER !!1!
![Page 28: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/28.jpg)
Same performance graph, different view
[Figure: Product A and Product B plotted over time, 00:00 to 00:06; value axis from 0 to 80000]
![Page 29: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/29.jpg)
Is Product B still better?
How to provision capacity for B?
What response time guarantee will it give?
Will your automated failover work correctly in the presence of stalls?
[Figure: time-series graph, 00:00 to 00:06; value axis from 0 to 80000]
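A minimal sketch of why the averaged view and the time-series view can disagree. The throughput samples below are invented for illustration and are not the data behind the slides; `summarize` is a hypothetical helper. It reports the mean next to the minimum, the standard deviation, and the number of samples that fall below 10% of the product's own mean (an arbitrary threshold chosen here to stand in for a "stall").

```python
import statistics

# Hypothetical per-interval throughput samples, invented for illustration.
# Product A is steady; Product B peaks higher but periodically stalls.
product_a = [29000, 30000, 31000, 30000, 30000, 29500, 30500, 30000]
product_b = [70000, 72000, 1000, 71000, 500, 69000, 70500, 0]


def summarize(samples, stall_fraction=0.1):
    """Return the mean plus the figures an averaged chart hides."""
    mean = statistics.mean(samples)
    return {
        "mean": mean,
        "min": min(samples),
        "stdev": statistics.pstdev(samples),
        # Count intervals where throughput collapsed relative to the mean.
        "stalled_intervals": sum(1 for s in samples if s < stall_fraction * mean),
    }


if __name__ == "__main__":
    for name, samples in (("Product A", product_a), ("Product B", product_b)):
        print(name, summarize(samples))
```

With these made-up numbers Product B wins on the mean alone, while its minimum and stall count show the behaviour that makes capacity planning and failover detection hard.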
![Page 30: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/30.jpg)
Engineering low variance > engineering max peak performance
Where does variance come from anyway?
From the query code path requesting resources with variable availability
C, C++, CPU, memory: caches, heap, mutexes, rwlocks
Memory/disk: data on disk, which could be cached
RDBMS: free space in the WAL, etc.
Client-server and clusters: network roundtrips
![Page 31: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/31.jpg)
Database servers love being in homeostasis
All the required resources for queries readily available
In the presence of unpredictable load
Do not make query threads work for this
Monitor them in the background and make them available as needed
In the presence of unpredictable workload
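The idea on this slide can be sketched as a toy resource pool that a background thread keeps topped up, so that query threads only ever take from it. This is an illustrative sketch, not how Percona Server or InnoDB implements it: `FreePagePool`, the low-water mark, and the 1 ms polling interval are all assumptions, and the "refill" here just mints integers where a real server would do actual work such as flushing or evicting pages.

```python
import queue
import threading


class FreePagePool:
    """Toy pool of free pages, replenished by a background thread."""

    def __init__(self, capacity, low_water_mark):
        self._pages = queue.Queue(maxsize=capacity)
        self._low_water_mark = low_water_mark
        self._stop = threading.Event()
        self._next_id = 0
        self._refill()  # start with a full pool
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _refill(self):
        # Stand-in for the expensive work of producing free pages.
        # Only one thread ever calls this at a time, so full() is safe here.
        while not self._pages.full():
            self._pages.put_nowait(self._next_id)
            self._next_id += 1

    def _run(self):
        # Background monitor: top the pool up before it runs dry.
        while not self._stop.is_set():
            if self._pages.qsize() < self._low_water_mark:
                self._refill()
            self._stop.wait(0.001)  # hypothetical polling interval

    def take(self, timeout=1.0):
        """Called by query threads: take a page, never produce one."""
        return self._pages.get(timeout=timeout)

    def close(self):
        self._stop.set()
        self._worker.join()


def run_queries(pool, count):
    """Simulate a query thread that needs `count` free pages."""
    return [pool.take() for _ in range(count)]


if __name__ == "__main__":
    pool = FreePagePool(capacity=8, low_water_mark=4)
    pages = run_queries(pool, 50)
    print(len(pages), "pages served")
    pool.close()
```

The design choice being illustrated is the split of responsibilities: the query path does a cheap `take`, and the cost of making resources available is paid off the query path, ahead of demand.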
![Page 32: Developing a database server: software engineer's view](https://reader030.vdocuments.us/reader030/viewer/2022020410/5881db831a28ab331a8b77b9/html5/thumbnails/32.jpg)
If you want to develop a DBMS:
Find an existing one to fork!
And then do not diverge
Listen to your users
Control quality continuously
Ensure stable performance