Spotify services (SDC 2013)
Post on 17-Oct-2014
![Page 1: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/1.jpg)
The whole is greater than the sum of the parts
Spotify services
Niklas Gustavsson
Monday 27 May 2013
![Page 3: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/3.jpg)
Last year
Architectural overview
Lots of questions!
![Page 4: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/4.jpg)
Today
Spotify has more than a hundred backend services. They handle enormous amounts of data. They should always be available. How are they built?
![Page 5: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/5.jpg)
In praise of small services
![Page 6: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/6.jpg)
In praise of small services
A small code base is simpler to understand and reason about
Doing one thing and one thing only means no compromises
[Diagram: nodes labeled C, AP, and S]
![Page 7: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/7.jpg)
“Rule of Modularity: Developers should build a program out of simple parts connected by well defined interfaces, so problems are local, and parts of the program can be replaced in future versions to support new features. This rule aims to save time on debugging code that is complex, long, and unreadable.”
Eric S. Raymond, The Art of Unix Programming
![Page 8: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/8.jpg)
Decouple
“Decouple until it breaks, and then back off just a little”
Strive to make services autonomous
Watch your latency, though it is usually not significant
[Diagram: nodes labeled C, AP, and S]
![Page 9: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/9.jpg)
Simple codebases
Use scaffolding to quickly get the basic service structure
Reuse in libraries
Don’t overuse patterns. Don’t use layers upon layers. Keep it simple
![Page 10: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/10.jpg)
Languages and runtimes
We build services in Python and Java
Python is awesome for quick development and beautiful code
The JVM is stable, performant and transparent
![Page 11: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/11.jpg)
Performance at scale
![Page 12: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/12.jpg)
Performance at scale
Care about your performance. Set clear goals. Measure, measure, measure.
Have an architecture that allows for scale. Build out as needed. Measure, measure, measure.
http://www.bbc.co.uk/programmes/b01qzdc1
![Page 13: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/13.jpg)
Prefer stateless services
Prefer stateless services when possible
Scales out linearly
Isolate mutating operations
![Page 14: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/14.jpg)
Efficient protocols
Fast, efficient, RESTful protocols
Connection pools are hard. Overloaded TCP servers are complicated
Use queues. Proper pushback. Naturally asynchronous.
![Page 15: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/15.jpg)
Efficient payloads
Small payloads, fast marshaling
gzip
http://qconsf.com/dl/qcon-sanfran-2011/slides/SastryMalladi_DealingWithPerformanceChallengesOptimizedSerializationTechniques.pdf
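The payload point can be illustrated with a rough size comparison. This is only a sketch under assumptions: the record layout (a track id and a duration in milliseconds), the field names, and the `payload_sizes` helper are hypothetical, and a hand-rolled `struct` encoding stands in for a real schema-based format.

```python
import gzip
import json
import struct

# Hypothetical records: (track id, duration in milliseconds).
records = [(i, 180_000 + i) for i in range(1000)]

# Two unsigned 32-bit little-endian integers per record (8 bytes).
RECORD = struct.Struct("<II")


def encode_json(rows):
    """Verbose text encoding: field names are repeated for every record."""
    body = [{"id": track_id, "duration_ms": ms} for track_id, ms in rows]
    return json.dumps(body).encode("utf-8")


def encode_binary(rows):
    """Fixed-width binary encoding: no field names, 8 bytes per record."""
    return b"".join(RECORD.pack(track_id, ms) for track_id, ms in rows)


def decode_binary(blob):
    """Inverse of encode_binary; returns a list of (id, duration_ms) tuples."""
    return [RECORD.unpack_from(blob, off) for off in range(0, len(blob), RECORD.size)]


def payload_sizes(rows):
    """Return the byte size of each encoding, with and without gzip."""
    as_json = encode_json(rows)
    as_binary = encode_binary(rows)
    return {
        "json": len(as_json),
        "json+gzip": len(gzip.compress(as_json)),
        "binary": len(as_binary),
        "binary+gzip": len(gzip.compress(as_binary)),
    }


if __name__ == "__main__":
    for name, size in payload_sizes(records).items():
        print(f"{name:12s} {size:6d} bytes")
```

Compression helps most on large, repetitive payloads. For very small messages the gzip header can outweigh the savings, so it is worth measuring on real traffic.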
![Page 16: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/16.jpg)
Hermes
ZeroMQ. Lightweight, fast as hell, queue based
Protobuf. Small, fast, schema-based, simple binary format
Request-reply and pub/sub
![Page 17: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/17.jpg)
Drop requests
Don’t be afraid to drop requests (and replies) when overloaded
Use shallow queues
Use short timeouts
Use small thread pools
Use small connection pools
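A minimal sketch of a shallow queue that sheds load instead of blocking. The class name `SheddingQueue`, the default depth, and the timeout value are assumptions made for illustration, not details from the talk.

```python
import queue


class SheddingQueue:
    """Shallow request queue that rejects new work when full instead of blocking."""

    def __init__(self, max_depth=8):
        if max_depth < 1:
            # queue.Queue treats maxsize <= 0 as unbounded, which defeats the purpose.
            raise ValueError("max_depth must be at least 1")
        self._queue = queue.Queue(maxsize=max_depth)
        self.dropped = 0

    def offer(self, request):
        """Enqueue without waiting; return False (and count a drop) when full."""
        try:
            self._queue.put_nowait(request)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def take(self, timeout=0.05):
        """Wait at most `timeout` seconds for work; return None if nothing arrives."""
        try:
            return self._queue.get(timeout=timeout)
        except queue.Empty:
            return None
```

A caller that receives `False` from `offer` can reply with an error straight away, which keeps latency bounded for the requests that were accepted.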
![Page 18: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/18.jpg)
![Page 19: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/19.jpg)
Scaling storage
We use the best tool for each case from a small, carefully selected set of options
PostgreSQL as the default mutable storage
Cassandra for large scale (heavy writes) or multi-site services
Various read-only key-value stores
http://labs.spotify.com/2013/02/25/in-praise-of-boring-technology/
![Page 20: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/20.jpg)
Always fail, never fail
![Page 21: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/21.jpg)
Always fail, never fail
Stuff is always broken. Deal with it.
Always design for redundancy
Always keep an eye on your world
Don’t DDoS yourself
![Page 22: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/22.jpg)
Many commodity servers
Build your system to run on multiple servers
Use service discovery everywhere. We use DNS SRV records.
Make deployment and configuration automated and repeatable
Make sure your service is actually running
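To illustrate how a client might choose a server from DNS SRV records, here is a sketch of the selection step only, roughly following the priority and weight semantics of RFC 2782. The records are assumed to be already resolved, since the DNS lookup itself needs a resolver library. The host names and the `pick_target` helper are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SrvRecord:
    priority: int  # lower value is preferred
    weight: int    # relative share of traffic within one priority level
    port: int
    target: str


def pick_target(records, rng=random):
    """Pick one record: lowest priority first, then weighted random choice."""
    if not records:
        raise LookupError("no SRV records available")
    best = min(r.priority for r in records)
    candidates = [r for r in records if r.priority == best]
    weights = [r.weight for r in candidates]
    if sum(weights) == 0:
        # All weights are zero: fall back to a uniform choice.
        return rng.choice(candidates)
    return rng.choices(candidates, weights=weights, k=1)[0]


# Hypothetical, already-resolved records for one service name.
EXAMPLE_RECORDS = [
    SrvRecord(priority=10, weight=3, port=8080, target="a.example.net"),
    SrvRecord(priority=10, weight=1, port=8080, target="b.example.net"),
    SrvRecord(priority=20, weight=1, port=8080, target="backup.example.net"),
]
```

If every server at the preferred priority is unreachable, a client would typically drop those records and call `pick_target` again on the remainder, which then selects from the next priority level.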
![Page 23: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/23.jpg)
Measure everything
Instrument your code with metrics everywhere
We use our own library for Python and http://metrics.codahale.com for Java
Monitor your infrastructure. JVMs, OS, network, storage
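As a rough idea of what in-process instrumentation can look like, the sketch below keeps counters and timings in memory. It is not Spotify's Python library and it does not mirror the Metrics API for Java. The `Metrics` class and its method names are hypothetical.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class Metrics:
    """Tiny in-memory registry of counters and timing samples."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock  # injectable so tests can supply a fake clock
        self.counters = defaultdict(int)
        self.timings = defaultdict(list)

    def incr(self, name, amount=1):
        """Add `amount` to the named counter."""
        self.counters[name] += amount

    @contextmanager
    def timer(self, name):
        """Record how long the enclosed block takes, even if it raises."""
        start = self._clock()
        try:
            yield
        finally:
            self.timings[name].append(self._clock() - start)


if __name__ == "__main__":
    metrics = Metrics()
    with metrics.timer("lookup.seconds"):
        metrics.incr("lookup.requests")
    print(dict(metrics.counters), dict(metrics.timings))
```

A real setup would also ship these values to a collector at regular intervals so they can be graphed.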
![Page 24: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/24.jpg)
Graph
Graph your important metrics and strive for a latency of seconds
We use a heavily extended derivative of Munin
![Page 25: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/25.jpg)
Log what’s important
Hard to know beforehand, so err on the side of logging too much (within reason)
Use a structured format
Use syslog
Collect your logs in a central place
Store your logs and make them analyzable
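One possible way to emit structured log lines with Python's standard `logging` module is sketched below. JSON as the structured format, the `fields` key passed through `extra`, and the logger and service names are all assumptions. The demo writes to an in-memory stream, and the comment shows where a syslog handler would go instead.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Structured fields arrive through logging's `extra` mechanism.
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry, sort_keys=True)


def build_logger(name, handler):
    """Attach a JSON-formatting handler to the named logger."""
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicate lines from the root logger
    return logger


# In-memory stream for the demo. To send to a local syslog daemon instead, pass
# logging.handlers.SysLogHandler(address="/dev/log") (path is typical on Linux).
stream = io.StringIO()
log = build_logger("example.service", logging.StreamHandler(stream))
log.info("request served", extra={"fields": {"service": "playlist", "latency_ms": 12}})
line = stream.getvalue().strip()
```

Because every line is a self-contained JSON object, a central collector can parse and index the logs without custom regular expressions.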
![Page 26: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/26.jpg)
Automate deployment
Consistently build to some form of package. Keep track of dependencies
We build everything* to Debian packages and use package dependencies
Debian is awesome. Use it.
* Except Maven dependencies
![Page 27: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/27.jpg)
Automate configuration
Keep everything under version control
Use a provisioning tool
We use Puppet and store every configuration in Git. Everything*.
250 modules, 880 classes
* Everything
![Page 28: Spotify services (SDC 2013)](https://reader033.vdocuments.us/reader033/viewer/2022051311/5441bec5afaf9f5e208b47b5/html5/thumbnails/28.jpg)
Development
Trust your developers and ops. Let your teams be autonomous
Long-term ownership
Minimize interruptions (aka meetings)
Favor asynchronous communication. We coordinate over IRC and use mail
Ship.