Scaling and Hardware Provisioning for Databases (Lessons Learned at Wikipedia)

Jaime Crespo
Percona Live Europe 2017 - Dublin, 27 Sep 2017

© 2017 Jaime Crespo. https://jynus.com. License: CC-BY-SA-4.0



Agenda

1. Introduction
2. Scaling by Introducing New Technologies
3. Scaling by Rewriting Code
4. Scaling by Rearchitecturing
5. Scaling by Throwing Hardware at the Problem
6. Which Hardware is Right for Me?
7. Conclusions


@jynus
● Sr. Database Administrator at Wikimedia Foundation
● Used to work as a trainer for Oracle (MySQL), as a consultant (Percona) and as a freelance administrator (DBAHire.com)


I have already mentioned some related topics at #PerconaLive

• Check my previous presentations at: http://www.slideshare.net/jynus/


Disclaimers
• Some negative anecdotes are going to be presented:
– Your mileage may vary
– They didn’t work for us, then, in our particular use case
• It is not the intention to criticize (great) open source developers
– Take home the ideas, not the particular details
• Intended as a “beginners” talk
– New MySQL users, developers or new purchase owners


SCALING BY INTRODUCING NEW TECHNOLOGIES


Ways to Show off One’s Ignorance
● Why did you not use ... instead?
● I read in an article that … is better
● You should migrate to ...
● I heard … is horrible/dead/a fate worse than death
– Those are not conversation starters, you are trying to make a point
– Discussing “best” technologies without context is worthless

http://smalldatum.blogspot.com.es/2016/09/excited-about-percona-live-amsterdam.html


Ways to Show Genuine Interest
● Why do you use … ?
● I found … fascinating, can you tell me more?
● I am thinking of using … myself
● I sell … and I am trying to improve my product
– Listen to what people have to say
– Focus on your product’s virtues, not others’ defects


https://blog.wikimedia.org/2013/04/22/wikipedia-adopts-mariadb/


Compare with: https://www.postgresql.org/message-id/579795DF.10502%40commandprompt.com


MySQL at Wikimedia History
● <2011: Heavily patched MySQL 4.0
● 2011-2012: Facebook fork of MySQL 5.1
● 2013-2015: MariaDB 5.5 with patches
● 2015-2017: custom MariaDB 10.0 package
● 2017- : MariaDB 10.1 (on non-production/testing)


Products at the Wikimedia Foundation

● To support our users, we develop and maintain 2 main IT products/services:
– MediaWiki (the code that runs Wikipedia)
– Wikimedia Infrastructure (the servers and services where that code runs)


MySQL/MariaDB was the chosen backend since 2001

● Maintaining MediaWiki for multiple storage backends is “easy”
● Maintaining multiple backends on WMF infrastructure is really hard
– Snowflakes take a comparatively huge amount of time
– It is ok to have specialized backends: search and logs (elastic), analytics (hadoop), cache (memcache/cassandra), queueing (redis/kafka), GIS (postgres), dynamic config (etcd), testing (sqlite)


What I never get
● I’ve analyzed your statistics regarding jobqueue processing based on Redis and, after spending time reading your documentation, I think your message passing subsystem is inefficient; here is the code prototype I wrote integrating MediaWiki with Apache Kafka that would make it work better. Do you have some time so I can try to convince you?

PS: Please let me take care of your next Asia Pacific TZ emergency


WMF Focus: Efficiency
● 30K HTTP requests / Operations employee / second
– We include operations from DBAs, to managers, to people racking servers (sometimes they are the same!)
– World-wide redundancy
– Owning the full stack: no external provider other than network providers and datacenter space (no external cloud, CDN, specialized hardware, code repository or CI)


Some statistics are misleading/not useful

● HTTP req/s or MySQL queries/s can be meaningless
– “getting the HTML content of en:Dublin” and “uploading a photo of Dublin with its structured metadata” each count as 1 hit
– In many cases your aim is to minimize hits, not increase them
● Encouraging mirroring or downloading content for offline usage is part of the goal


WMF Focus: Open Source + Bare metal

● Being self-sufficient adds a huge overhead, especially when adapting to new technologies
● It pays off for us in the long term, as vendors rise and fall, suffer outages, data leaks, espionage, etc.


Common “not scaling” occurrence
● “Icinga” no longer scales for us with over 1500 hosts and 15K service checks
● Maybe a replacement is needed that allows “clustering”
● TLS certificate checking (huge CPU penalty) is moved off-server
● Load reduced dramatically; that plus a hardware upgrade make migration a lower priority


Sometimes new technologies are the right path

● Ganglia and Graphite were used
● Lacking in features, too large a footprint per metric, “ugly” (custom dashboards)
● Prometheus is tested
– Provides new features, fully substitutes Ganglia
– Grafana can be used as a frontend for both Graphite and Prometheus, integrating the service even if not fully substituting it


Media storage at Wikimedia History
● Very early solution: Shared NFS-mounted partitions on application servers
– It worked as a quick solution when scaling was not a problem
● 2012: OpenStack Swift introduction
– Limitations on consistency and coordination
● 2013: Evaluation of Ceph as a solution
● Now: Swift still in place, improvements made both upstream and at Wikimedia (thumbor)


New Technology Adoption
● Benchmarketing
● New features looking “too good/too nice”
● 3rd party support
● Operational and team knowledge considerations


Example: TokuDB (1/3)
● Great on paper!
– Nice feature set
– Fewer IOPS than InnoDB
– Great compression ratio (1/4th of the original dataset)
– It allowed us to integrate 7 replica groups in a single server
● We migrated analytics and backups to it


Example: TokuDB (2/3)
● Main issue: “it is not InnoDB”
– Parallel replication, locking model and query plan issues
– Replication stopped due to “index crashes”
– Bogus results on non-primary key tables
– Bad tooling and upstream (Oracle) support
– MariaDB didn’t support it well, TokuDB didn’t support MariaDB


Example: TokuDB (3/3)
● We ended up migrating back to InnoDB (compressed)
– Worse IOPS, but much more stable and matching production
– Makes operations much easier
– Separate datasets (analytics) work nicely on Toku
● Excited and looking closely at RocksDB for our key-value storage, but we are not going to be the beta testers


SCALING BY REWRITING CODE


Let’s rewrite ...
● X doesn’t scale anymore
● A monolithic architecture will not work
● Y language is not appropriate for ...


MediaWiki full rewrite history
● January 2001 – “UseModWiki” (Phase I)
● January 2002 – “The PHP script” (Phase II)
● July 2002 – “MediaWiki” (Phase III)
– Since 2002 (over 15 years ago) only small-scope/gradual refactorings have happened


Problems with rewriting
● Old knowledge is lost (obscure bugs)
● Community of 3rd party developers & users lost
– Large list of plugins/bots no longer compatible
– Hard to sell: “It works for me, why change it?”
● Alienate volunteers
● Increase of the number of technologies to maintain
– The best technological solution is not necessarily the best socially and organizationally


In a way, we are constantly rewriting...


Sometimes we Have to Rewrite
● OCG (offline content generator):
– Originally created by a third party, OCG [its replacement] has been running on outdated code which may introduce security vulnerabilities and other major issues in the future.
● Web Printing Service is about to substitute it


SCALING BY REARCHITECTURING


Topology
● (Non-sharded) replication fits best for a mostly read-heavy model
– 30 million wiki edits vs. 20 billion wiki pages served per month (2016, approx.)

[Diagram: R vs. W]
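To put the two figures on this slide side by side, a small back-of-the-envelope calculation follows. It is only a sketch: it uses the approximate 2016 numbers quoted above and treats a page served as a "read" and an edit as a "write", which is a simplification (caches absorb many page views, and edits are not the only source of writes).

```python
# Approximate monthly figures quoted on the slide (2016).
edits_per_month = 30_000_000
pages_served_per_month = 20_000_000_000

# How many pages are served for every edit saved.
reads_per_write = pages_served_per_month / edits_per_month

# Share of edits among all of these operations.
write_share = edits_per_month / (edits_per_month + pages_served_per_month)

print(f"about {reads_per_write:.0f} pages served per edit")
print(f"edits are about {write_share:.2%} of the operations")
```

With a ratio in the hundreds to one, adding read replicas addresses most of the load, which is the argument the slide makes for non-sharded replication.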


Sharding
● We have avoided sharding as much as possible
– Vertical slices allow for flexible growth
● When a server’s scalability limit is reached:
– New functional groups are created (search, content, global users, application-level alerts, disk cache, ...)
– More replicas are added to a project
– More groups are created to handle heavier projects
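The growth options listed above can be pictured as a routing table from project to replication group. The following is a minimal sketch under stated assumptions: the group names, host names and helper functions are all hypothetical and are not taken from the Wikimedia configuration.

```python
# Hypothetical routing table: which replication group serves which project.
GROUP_BY_PROJECT = {
    "bigwiki": "group_a",
}
DEFAULT_GROUP = "group_default"  # every project not listed above

# Hypothetical hosts per group; the first entry is the primary.
HOSTS_BY_GROUP = {
    "group_a": ["db-a1", "db-a2"],
    "group_default": ["db-d1", "db-d2"],
}


def group_for(project):
    """Return the replication group that holds a project's data."""
    return GROUP_BY_PROJECT.get(project, DEFAULT_GROUP)


def add_replica(group, host):
    """Option 1: add read capacity to an existing group."""
    HOSTS_BY_GROUP[group].append(host)


def split_out(project, new_group, hosts):
    """Option 2: move a heavy project into a group of its own."""
    HOSTS_BY_GROUP[new_group] = list(hosts)
    GROUP_BY_PROJECT[project] = new_group


add_replica("group_a", "db-a3")
split_out("heavywiki", "group_b", ["db-b1", "db-b2"])
print(group_for("heavywiki"), group_for("smallwiki"))
```

Both operations move whole projects or whole functional areas between servers; no single table is split by key, which is what keeps this short of sharding in the usual sense.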


Redundancy/read scaling

Multi-tier/Write scaling

Specialization/optimization

Consolidation/efficiency


Scalability as handicap for high availability

● “Recent changes” role servers may seem like a great idea
– It requires multiplying by 4 the number of servers per group to have redundancy over multiple datacenters
● Multiple datacenters are a huge investment if HA is to be kept
– If you have passive components it is very easy to “get behind”


Multisource
● Seemed like a great improvement on MariaDB!
– Analytics
– Consolidation
● But:
– Bugs with GTID, TokuDB
– Bugs with namespace importing
– MariaDB is more likely to crash than the host, and recovering is expensive


Migration to Multi-instance model
● Consolidation is still possible (especially for smaller services)
● Less maintenance overhead
● Analytics moved to other stores


Example: ProxySQL for Public Wiki Replica service (Cloud Storage)

● We evaluated ProxySQL as a High Availability Handler
– It had more features than we needed (level 7 proxy)
– It had high operational overhead: account maintenance
– Other ops were not familiar with MySQL-only solutions
– It lacked (like most other solutions) features we needed
● We chose to use HAProxy as a simpler approach
– We will reevaluate ProxySQL for production, where it will most likely be a better fit


SCALING BY THROWING HARDWARE AT THE PROBLEM


Reaching resource limits? Just throw some hardware at it! (1/2)

● Software must be ready for the hardware expansion
– Can your clusters scale efficiently either horizontally or vertically?
– Better hardware solves (hides?) bugs – but it reveals new bottlenecks
● Redundancy requirements?
● Services we will have to support that have not even been designed yet
– Or services that will be discontinued in the future


Reaching resource limits? Just throw some hardware at it! (2/2)

● Did you take the management overhead into account?
– Can your staff cope with the extra servers?
– Is your automation level on par with the redundancy?
● Support services must scale at the same pace (sometimes non-trivial)
– Backups
– Analytics
– Logs
– Rack space? Network? DC ops?


In Some Cases, Buying Better Hardware Can Create Regressions

● We had databases hosted on RAIDs of HD drives
● 25% of the fleet was renewed on one datacenter, adding SSDs
– They were pooled to handle the bulk of the queries
– Master-replica lag increased, why?


The new replica servers were “too fast”
● Replicas were pooled if they could keep up with the master’s writes
– The hosts serving the “bulk” of the reads could, so the slower hosts were not waited for
– Most older servers started lagging at peak times
● We had to make the faster servers go at the speed of the slowest host, negating their impact
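The regression described above can be illustrated with a toy model of a writer that paces itself on replica lag. This is a sketch under stated assumptions, not the actual MediaWiki or Wikimedia logic: the `Replica` type, the host names, the lag values and the 5-second threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    lag_seconds: float  # how far behind the master this host is


def pooled(replicas, max_lag):
    """Replicas that keep up well enough to receive read traffic."""
    return [r for r in replicas if r.lag_seconds <= max_lag]


def writer_should_wait(replicas, max_lag, wait_for_all):
    """Decide whether a bulk writer should pause before writing more.

    wait_for_all=False: pause only when no replica keeps up, so a few
    fast hosts are enough to let the writes continue.
    wait_for_all=True: pause while any replica lags, so the slowest
    host sets the pace.
    """
    lagging = [r for r in replicas if r.lag_seconds > max_lag]
    if wait_for_all:
        return bool(lagging)
    return len(lagging) == len(replicas)


fleet = [
    Replica("new-ssd-1", 0.1),
    Replica("new-ssd-2", 0.2),
    Replica("old-hdd-1", 12.0),
]
print([r.name for r in pooled(fleet, max_lag=5.0)])
print(writer_should_wait(fleet, max_lag=5.0, wait_for_all=False))
print(writer_should_wait(fleet, max_lag=5.0, wait_for_all=True))
```

In the first mode the SSD hosts mask the lagging HDD host and writes keep flowing, which matches the symptom on the slide; the second mode corresponds to the fix, at the cost of losing the benefit of the faster hardware.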


Sometimes the Scaling Limit is Not the Hardware

● https://dom.as/2009/06/26/embarrassment/
– Cache invalidation stampede creating CPU spikes on app servers
● https://blog.wikimedia.org/2016/04/22/prince-death-wikipedia/
– The service created to handle the previous problem needs a fix to allow serving expired content
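The two incidents above share one idea: when a popular entry expires, only one caller should regenerate it while everybody else is served the expired copy. The class below is a minimal single-process sketch of that idea; it is not the service Wikimedia runs, and the class name, its parameters and the `render` callback are assumptions made for illustration.

```python
import threading
import time


class StaleTolerantCache:
    """Toy cache: one caller rebuilds an expired entry, the rest get the old copy."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries = {}        # key -> (value, stored_at)
        self._refreshing = set()  # keys being regenerated right now
        self._lock = threading.Lock()

    def get(self, key, regenerate):
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None:
                value, stored_at = entry
                if self.clock() - stored_at < self.ttl:
                    return value  # fresh hit
                if key in self._refreshing:
                    return value  # expired, but a rebuild is already running
            # Cold misses have nothing stale to serve, so they all rebuild.
            self._refreshing.add(key)
        try:
            # The expensive step (e.g. rendering a page) runs outside the lock.
            value = regenerate()
            with self._lock:
                self._entries[key] = (value, self.clock())
            return value
        finally:
            with self._lock:
                self._refreshing.discard(key)


def render():
    return "<html>rendered page</html>"


cache = StaleTolerantCache(ttl=60)
print(cache.get("en:Prince", render))
```

Serving slightly outdated content for a few seconds trades freshness for stability, which is the trade-off the second bullet describes.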


WHICH HARDWARE IS RIGHT FOR ME?


Current Hardware used for Databases (1/3)

● Quad cores
● 512GB Memory
● 4TB usable disk in RAID 10 of SSDs
● 1GB ethernet
● 1U
● Life expectancy of 5 years


Current Hardware used for Databases (2/3)

● HW RAID – must
● RAID10 Level – must
– RAID5/6 would not work for us for performance and availability reasons
– Number of drives – compromise between performance, price and expandability
● SSDs – impact difficult to measure as we are right now overprovisioning
– A good guess tells us we can handle 5x the load of older HDD hosts


Current Hardware used for Databases (3/3)

● Large amounts of memory
– Relatively low disk usage, ideal for capacity planning/consolidation
● Available disk space – usage at initial buy at 40% (25% with compression)
– Allows for consolidation at first
● 1U
– Old HDD hosts required 2U
– Rack space and operational cost can be more expensive than hardware itself
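As a rough worked example of the disk figures in this section (4TB usable in RAID 10, 40% used at purchase, 25% with compression), the arithmetic can be written out as below. It is a sketch only: it assumes plain two-way mirroring for RAID 10 and that the percentages refer to the usable capacity.

```python
usable_tb = 4.0

# RAID 10 keeps two copies of every block, so raw capacity is double.
raw_tb = usable_tb * 2

used_uncompressed_tb = usable_tb * 0.40
used_compressed_tb = usable_tb * 0.25

# Compressed size relative to uncompressed size, implied by the two figures.
compression_ratio = used_compressed_tb / used_uncompressed_tb

# How many datasets of the initial (compressed) size fit on one host.
consolidation_factor = usable_tb / used_compressed_tb

print(f"raw capacity: {raw_tb:.1f} TB")
print(f"used: {used_uncompressed_tb:.1f} TB, {used_compressed_tb:.1f} TB compressed")
print(f"compressed/uncompressed: {compression_ratio:.3f}")
print(f"initial datasets per host: {consolidation_factor:.0f}")
```

The headroom is what allows consolidation at first; it shrinks as the datasets grow over the expected 5-year life of the host.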


Hardware Standardization
● It is incredibly difficult to predict hardware needs more than a year in advance
● Sometimes overprovisioning and buying servers with the same capacity makes pivoting existing hardware easier and lowers costs
– In an eventuality, a role can be substituted with the same hardware
– Parts are more common and can be substituted with decommissioned hardware


Backup Storage
● Past purchase of a costly, branded shared-storage solution
● In the end it was buggy, not cost-effective and didn’t have the features we needed
● Open source + custom glue = lower TCO and better suited for the needs
– Shared nothing architecture ends up being cheaper in the real world
– More work, but also more flexibility


CONCLUSIONS


Conclusions
● Listen to what other people are doing
● Test on your own
● Make mistakes == Learn
– But do not spend too much time & money on them!
● Ask questions
– What people talk about vs. what they really use
– But don’t become internet trolls


Q&A


Thank You for Attending!
● Do not forget, after the session finishes, to please log in with your Percona app and “Rate This Session”
● Special thanks, in order by rand(), to: Sean Pringle, Domas Mituzas, Mark Callaghan, Mark Bergsma, Manuel Arostegui, Ariel Glenn, the whole Wikimedia Team, and all people at the MariaDB, Percona and MySQL/Oracle teams, and the Percona Live Organization and Sponsors