TRANSCRIPT
Distributed Caching in an Ephemeral World
Rahul Singh, Founder & CEO, distelli.com
June 7th, 2017
Distributed? Ephemeral? Huh?
…because Kubernetes
How did we get here?
The good ole’ days…
Oracle
Apache FCGI
• A single C++ binary running in FCGI workers
• Under Apache, talking to an Oracle database
The good ole’ days…
Oracle
Apache FCGI
• Caching? Who cares…
...when you’re trying to get the thing to link without running out of memory
The good ole’ days…
Oracle
Apache FCGI
• Scaling?
Just add more boxes! The Oracle DB can handle it.
The good ole’ days…
Oracle RAC
Apache FCGI
• Scaling?
DB getting to be a problem? ...Oracle RAC to the rescue!
The good ole’ days…
Scaling up was the solution to every problem
Why bother with caching when you can burn $400M in 3 years?
Eventual lesson learned…
You can’t scale up forever
Eventual lesson learned…
You have to scale out!
That’s the distributed part
Add more, smaller boxes
Lots of small databases behind REST APIs
Split up that FCGI binary into microservices
Microservices
Load Balancer
Multiple backend service instances in multiple individual datacenters
Service clients (website) connect via an LB
One service backed by one small DB
Microservices
Hundreds of frontend clients
Hundreds of backend services
Microservices
Lots of small pieces of software running on many small boxes
All scaled out independently
Microservices
Difficult to architect and difficult to operate
But… once you get it going it’s great! The machines weren’t going anywhere
Microservices
...and then Docker happened
Package it as a container
Run it anywhere
any time
Docker
Package it as a container
Run it anywhere! Really? Anywhere?
Run it any time!
Really? Any time?
Docker
How?
Kubernetes + Docker
Kubernetes will schedule your containers anywhere on your pool of servers, any time.
Kubernetes + Docker
What happened to all that stability? Where is my stuff?
Kubernetes can shut down a container at any time.
Kubernetes can start a container at any time.
That’s ephemeral
Distributed and Ephemeral
Microservices packaged as Docker containers, scheduled by Kubernetes
Distributed, Ephemeral
What’s this got to do with caching?
If services and processes are long-lived and have stable IP addresses, then caching is relatively easy
If there is a single endpoint to retrieve a specific object, then caching is relatively easy
Distributed caching in an Ephemeral world
In a distributed cache the cache keys are spread across multiple machines
Retrieving a specific object depends on finding the machine that’s caching the object for that key
If machines (or containers) move around, then it becomes difficult to keep track of what is cached where
Distributed caching in an Ephemeral world
Distributed caching has many challenges
• Finding nodes that hold a particular key
• What happens when nodes fail or are restarted
• How does cache invalidation work
Distributed caching in an Ephemeral world
Distributed caching has many advantages
• Cache lookup traffic is distributed across many nodes
• Cache size is effectively equal to the sum of the caches on every node
• No single point of failure
• Scalable and performant
Designing a distributed cache
Load Balancer
Remember this microservice?
Designing a distributed cache
Load Balancer
Single dedicated cache box
Not really scalable
Designing a distributed cache
Load Balancer
Distribute the cache across the service nodes themselves
Designing a distributed cache
Load Balancer
OR... Have a dedicated fleet of cache nodes and distribute the cache across them
Designing a distributed cache
Your cache is now spread across multiple nodes
Designing a distributed cache
If there are N nodes in your cache fleet... then…
Each node holds 1/N of the cache
The math from Captain Obvious
If there are 20 nodes in your cache fleet then N = 20
If your cache contains 100 keys then each node should hold approximately 5 keys
...depends on how you distribute the keys
Hashing keys into buckets (across nodes)
...it depends on the hashing algorithm you use
Hashing keys into buckets (aka across nodes)
Simple hash algorithm: mod hashing
For a set of N nodes, key K is on node b identified by: b = K mod N
N = 10, K = 22, so b = 22 mod 10 = 2
Nodes: 0 1 2 3 4 5 6 7 8 9
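The mod hashing rule above can be sketched in a few lines of Python. The function names `node_for_key` and `stable_key` are hypothetical, and using SHA-256 to turn a string key into an integer is an assumption made for illustration (Python's built-in `hash()` for strings is randomized per process, so it is not suitable for agreeing on a node across machines).

```python
import hashlib


def node_for_key(key: int, num_nodes: int) -> int:
    """Mod hashing: return the bucket b = K mod N for an integer key K."""
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    return key % num_nodes


def stable_key(name: str) -> int:
    """Map a string key to an integer that is the same in every process.

    Assumption: the first 8 bytes of a SHA-256 digest are used as the integer.
    """
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


if __name__ == "__main__":
    # The example from the slide: N = 10, K = 22.
    print(node_for_key(22, 10))
    # A string key is first reduced to an integer, then bucketed.
    print(node_for_key(stable_key("user:42"), 10))
```

With the slide's numbers, `node_for_key(22, 10)` lands on node 2.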
Hashing keys into buckets (aka across nodes)
Important: The value of N changes as nodes fail and new nodes are started. Most often because Kubernetes starts/stops a container
A problem with mod hashing: When the number of nodes changes, every element is rehashed.
N = 10, K = 22
Nodes: 0 1 2 3 4 5 6 7 8 9
If node 8 dies then N = 9, and K mod 9 != K mod 10 (22 mod 9 = 4, but 22 mod 10 = 2)
N = 9, K = 22
Hashing keys into buckets (aka across nodes)
If every key in the cache moves to a different node when a single node fails, then cache miss rates go up often and affect performance.
Question: How do you hash keys so that when a single node fails only the keys on that node are rehashed?
More generally: How do you hash keys so that f failures rehash only f/N of the total cache?
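To get a feel for how disruptive a single node failure is under mod hashing, the small sketch below counts how many integer keys land on a different node when N drops from 10 to 9. The helper name `moved_keys` and the choice of 1000 consecutive integer keys are assumptions for illustration only.

```python
def moved_keys(keys, old_n: int, new_n: int) -> int:
    """Count keys whose mod-hash bucket changes when the node count changes."""
    return sum(1 for k in keys if k % old_n != k % new_n)


if __name__ == "__main__":
    keys = range(1000)          # hypothetical key set: integers 0..999
    moved = moved_keys(keys, old_n=10, new_n=9)
    print(f"{moved} of {len(keys)} keys changed node")
```

For this key set roughly nine keys in ten change node, even though only one node in ten went away. That gap is what the question above is asking to close.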
Hashing keys into buckets (aka across nodes)
Consistent Hashing
[Diagram: nodes 0 through 9 arranged on a circle]
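1. Place nodes on a circle
2. Place keys on the same circle
3. A key K hashes to the first node N > K on the circle
Programmatically
1. Calculate a score for each node N
2. Maintain an ordered list of scores
3. For key K compute a score for K
4. Find the first N where N > K (wrapping around to the start of the list if there is none)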
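The programmatic steps above can be sketched as follows. This is a minimal illustration, not the speaker's implementation: the class name `HashRing`, the `score` helper, and the use of SHA-256 as the scoring function are all assumptions. The ordered list of scores is searched with the standard library's `bisect` module.

```python
import bisect
import hashlib


def score(value: str) -> int:
    """Position of a node ID or key on the circle.

    Assumption: the first 8 bytes of SHA-256, read as an unsigned integer.
    """
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring over a set of node IDs."""

    def __init__(self, node_ids):
        # Steps 1 and 2: score every node and keep the scores ordered.
        pairs = sorted((score(n), n) for n in node_ids)
        self._scores = [s for s, _ in pairs]
        self._nodes = [n for _, n in pairs]

    def node_for(self, key: str) -> str:
        if not self._nodes:
            raise LookupError("ring has no nodes")
        # Steps 3 and 4: score the key, then find the first node score
        # strictly greater than it, wrapping around past the last node.
        idx = bisect.bisect_right(self._scores, score(key))
        return self._nodes[idx % len(self._nodes)]


if __name__ == "__main__":
    ring = HashRing([f"node-{i}" for i in range(10)])
    print(ring.node_for("user:42"))
```

Because a key's owner depends only on the scores around it, adding or removing one node leaves the placement of most keys untouched.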
Consistent Hashing
[Diagram: nodes 0 through 9 arranged on a circle]
When node N = 8 dies, only keys on that node are rehashed. The rest of the cache stays the same, and consequently node failures don’t result in increased cache miss rates
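The claim that only the dead node's keys move can be checked with a small experiment. The sketch below is self-contained and makes the same assumptions as any simple ring: SHA-256 as the scoring function, and hypothetical node IDs `node-0` through `node-9` and keys `key-0` through `key-999`.

```python
import bisect
import hashlib


def score(value: str) -> int:
    """Position on the circle (assumption: first 8 bytes of SHA-256)."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


def owner(key: str, node_ids) -> str:
    """Return the node whose score is the first one greater than the key's."""
    ring = sorted((score(n), n) for n in node_ids)
    scores = [s for s, _ in ring]
    idx = bisect.bisect_right(scores, score(key)) % len(ring)
    return ring[idx][1]


def keys_moved_after_failure(keys, node_ids, dead_node):
    """Compare key placement before and after one node disappears."""
    survivors = [n for n in node_ids if n != dead_node]
    before = {k: owner(k, node_ids) for k in keys}
    after = {k: owner(k, survivors) for k in keys}
    moved = [k for k in keys if before[k] != after[k]]
    return before, moved


if __name__ == "__main__":
    nodes = [f"node-{i}" for i in range(10)]
    keys = [f"key-{i}" for i in range(1000)]
    before, moved = keys_moved_after_failure(keys, nodes, "node-8")
    # Every key that moved was previously owned by the dead node.
    assert all(before[k] == "node-8" for k in moved)
    print(f"{len(moved)} of {len(keys)} keys moved, all of them from node-8")
```

Keys owned by the surviving nodes keep their owner, so their cached values stay reachable.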
Addressing
[Diagram: nodes 0 through 9 arranged on a circle]
If nodes move around, how do you keep track of which nodes are where and what their address is?
Addressing
[Diagram: nodes 0 through 9 arranged on a circle]
Assign each node a unique ID and an IP:PORT combination
Every node communicates its health to all other nodes.
Gossip is most efficient for this
Addressing, Liveness & Node Health
Gossip: Every node heartbeats to a random node every 10 seconds
Membership data is shared with all other nodes on every iteration
When missing heartbeats are detected the node is considered dead and the keys are rehashed.
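A heartbeat-style gossip loop can be simulated in-process to show the idea. This is only a sketch under stated assumptions: there is no networking, time is a simulated counter, the 60-second `DEAD_AFTER` timeout is invented for illustration (the slides give only the 10-second heartbeat interval), and a node shares its whole membership table with the one random peer it picks each round. The names `Node` and `simulate` are hypothetical.

```python
import random

HEARTBEAT_INTERVAL = 10  # seconds, as on the slide
DEAD_AFTER = 60          # assumption: silence longer than this means "dead"


class Node:
    def __init__(self, node_id: str):
        self.node_id = node_id
        # Membership table: node ID -> newest heartbeat time we know about.
        self.last_seen = {node_id: 0}

    def heartbeat(self, now: int, peer: "Node") -> None:
        """Refresh our own timestamp and share the table with one peer."""
        self.last_seen[self.node_id] = now
        peer.merge(self.last_seen)

    def merge(self, table: dict) -> None:
        """Keep the newest timestamp seen for every node."""
        for node_id, ts in table.items():
            if ts > self.last_seen.get(node_id, -1):
                self.last_seen[node_id] = ts

    def live_nodes(self, now: int) -> set:
        """Nodes whose last known heartbeat is recent enough."""
        return {n for n, ts in self.last_seen.items() if now - ts <= DEAD_AFTER}


def simulate(num_nodes=5, rounds=20, kill_at_round=5, seed=0):
    """Run gossip rounds; the last node stops heartbeating at kill_at_round."""
    rng = random.Random(seed)
    nodes = [Node(f"node-{i}") for i in range(num_nodes)]
    alive = list(nodes)
    now = 0
    for r in range(rounds):
        now = r * HEARTBEAT_INTERVAL
        if r == kill_at_round:
            alive = alive[:-1]
        for node in alive:
            peer = rng.choice([p for p in alive if p is not node])
            node.heartbeat(now, peer)
    return alive, now


if __name__ == "__main__":
    survivors, now = simulate()
    for node in survivors:
        print(node.node_id, sorted(node.live_nodes(now)))
```

Once the stopped node's last heartbeat is older than the timeout, no survivor lists it as live, which is the point at which its keys would be rehashed.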
Addressing, Liveness & Node Health
Key requirement: When rehashing because a node dies, it’s important to give the new node a new unique ID
Never reuse IDs
Caching is an extensive field
We haven't even talked about cache invalidation
Backup caching: Should I cache the same key on multiple nodes for redundancy?
Short answer: No. It’s not worth it.
Longer answer: it depends.
Recap
Caching is relatively easy – just use LRU caches with TTL
Things get more difficult at scale.
Things get more difficult with distributed microservices
Things get more difficult when your cache nodes are ephemeral
Handling cache node failures is important if you want high cache hit rates
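As an illustration of the "LRU caches with TTL" starting point, here is a minimal single-process sketch built on `collections.OrderedDict`. The class name `LRUTTLCache` and its parameters are hypothetical, and it is not thread-safe. The `clock` argument exists only so the expiry behaviour can be exercised without waiting.

```python
import time
from collections import OrderedDict


class LRUTTLCache:
    """Hypothetical LRU cache whose entries also expire after a fixed TTL."""

    def __init__(self, max_size: int, ttl_seconds: float, clock=time.monotonic):
        if max_size <= 0:
            raise ValueError("max_size must be positive")
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = OrderedDict()  # key -> (expires_at, value), oldest first

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry outlived its TTL: drop it and report a miss.
            del self._items[key]
            return default
        # Mark as most recently used.
        self._items.move_to_end(key)
        return value

    def put(self, key, value) -> None:
        self._items[key] = (self._clock() + self._ttl, value)
        self._items.move_to_end(key)
        # Evict least recently used entries until we are back under the limit.
        while len(self._items) > self._max_size:
            self._items.popitem(last=False)


if __name__ == "__main__":
    cache = LRUTTLCache(max_size=2, ttl_seconds=30)
    cache.put("a", 1)
    cache.put("b", 2)
    cache.get("a")       # "a" is now the most recently used entry
    cache.put("c", 3)    # evicts "b", the least recently used entry
    print(cache.get("b"), cache.get("a"), cache.get("c"))
```

Each cache node in a distributed setup could hold a structure like this for its share of the keys.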
Measure, Measure, Measure
Caching lends itself well to measurement
Very satisfying to see high cache hit rates
Very painful to see high cache miss rates
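One simple way to measure a cache is to count hits and misses at the lookup site. The sketch below is illustrative only: `CacheStats` and `get_or_load` are hypothetical names, a plain `dict` stands in for the cache, and the loader is a stand-in for whatever backend call would be made on a miss.

```python
class CacheStats:
    """Running hit and miss counters for one cache."""

    def __init__(self):
        self.hits = 0
        self.misses = 0

    def record(self, hit: bool) -> None:
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def get_or_load(cache: dict, key, loader, stats: CacheStats):
    """Return the cached value, loading and caching it on a miss."""
    if key in cache:
        stats.record(True)
        return cache[key]
    stats.record(False)
    value = loader(key)
    cache[key] = value
    return value


if __name__ == "__main__":
    stats = CacheStats()
    cache = {}
    for key in ["a", "b", "a", "a"]:
        get_or_load(cache, key, loader=str.upper, stats=stats)
    print(f"hits={stats.hits} misses={stats.misses} hit_rate={stats.hit_rate:.2f}")
```

In this run the first lookups of "a" and "b" miss and the two repeat lookups of "a" hit, giving a hit rate of 0.50.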
We’re hiring a full stack / frontend developer
React, Java, Distributed Systems
Make a big impact on a small team
https://www.distelli.com/kubernetes