
Toward large-scale distributed stream processing: models, systems and challenges

Valeria Cardellini and Francesco Lo Presti
University of Rome Tor Vergata, Italy

ICT COST Action IC1304: Autonomous Control for a Reliable Internet of Services (ACROSS)

2nd Int’l Summer School on Autonomous Control for Reliable Future Networks and Services, 30 May 2016, Opatija, Croatia

Who are we?

Valeria Cardellini, Associate professor @ Univ. of Rome Tor Vergata

Francesco Lo Presti, Associate professor @ Univ. of Rome Tor Vergata

• Joint research work with Vincenzo Grassi and Matteo Nardelli

V. Cardellini - ACROSS 2nd Summer School

The data deluge

• Some well-known numbers related to Big Data:
  – Every day in 2014 we created 2.5 Exabytes
  – 40 Zettabytes of data will be created by 2020

• Proliferation of new sources of data
  – Sensors, mobile devices, cameras
  – Social networks
  – Scientific instruments
  – Vehicles

• How can we make sense of all these data?
  – Process data to extract valuable insights

Why data stream processing?

• Applications such as:
  – Sentiment analysis on multiple tweet streams @ Twitter
  – User profiling @ Yahoo!
  – Tracking of query trend evolution @ Google
  – Fraud detection
  – Bus routing management @ city of Dublin [Art14]

• Require:
  – Continuous processing of unbounded data streams generated by multiple, distributed sources
  – In (near) real-time fashion

Why data stream processing?

• In the past years data stream processing (DSP) was considered a solution for very specific problems (e.g., financial tickers)

• But now we have (and will have) more general settings
  – E.g., Internet of Things

Why data stream processing?

• Decrease the overall latency to obtain results
  – No data persistence on stable storage (see “Latency numbers every programmer should know”)
  – No periodic batch analysis

• Simplify the data infrastructure

• Make the time dimension of data explicit


Traditional DSP challenges

• Stream data rates can be high and data arrive in large volumes
  – High resource requirements for processing (clusters, data centers, distributed Clouds)

• Processing stream data has real-time aspects
  – Stream processing applications have QoS requirements, e.g., end-to-end latency
  – Must be able to react to events as they occur

Why large-scale stream processing?

• Goals: increase scalability and reduce latency

• How? Rely on distributed and near-edge computation

Goals of the lectures

• Give a flavor of large-scale distributed stream processing and related research challenges

• Part I (V. Cardellini)
  – Focus on system issues
  – These slides

• Part II (F. Lo Presti)
  – Focus on models and algorithms

• Request
  – If you get either bored or lost, ask questions…
  – If you like to ask questions, ask questions…


Data stream definitions

Data stream

• “A data stream is a real-time, continuous, ordered (implicitly by arrival time or explicitly by timestamp) sequence of items. It is impossible to control the order in which items arrive, nor is it feasible to locally store a stream in its entirety. Queries over streams run continuously over a period of time and incrementally return new results as new data arrive.” [Gol03]

Sliding windows

• How many data items should we process each time?
  – Process items in window-sized batches

• Count-based window, e.g., last n items

• Time-based window, e.g., from [t-T] to [t]

[Figure: stream items s1 to s6 along a time axis, with a count-based window of size n=5]

Sliding windows

• How often should we evaluate the window?
  – Eager approach: output new result items as soon as available (but can be difficult to implement efficiently)
  – Lazy approach: slide window by s seconds (or m items)
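The two window types and the two evaluation strategies can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, timestamps are assumed to arrive in non-decreasing order, and the parameters n, m and T follow the notation of the slides.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List, Tuple


def count_window_lazy(items: Iterable[float], n: int, m: int) -> Iterator[List[float]]:
    """Count-based window over the last n items, evaluated lazily every m items."""
    window: Deque[float] = deque(maxlen=n)  # the oldest item falls out automatically
    for i, item in enumerate(items, start=1):
        window.append(item)
        # First result once the window is full, then one result every m arrivals.
        if i >= n and (i - n) % m == 0:
            yield list(window)


def time_window_eager(events: Iterable[Tuple[float, float]], T: float) -> Iterator[List[float]]:
    """Time-based window [t-T, t], evaluated eagerly on every arrival.

    Each event is a (timestamp, value) pair with non-decreasing timestamps.
    """
    window: Deque[Tuple[float, float]] = deque()
    for ts, value in events:
        window.append((ts, value))
        while window[0][0] < ts - T:  # evict items older than t-T
            window.popleft()
        yield [v for _, v in window]
```

With m equal to 1 the lazy count-based window degenerates into an eager one; larger values of m trade result freshness for fewer evaluations.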

DSP application model

• A DSP application is made of a network of operators (processing elements) connected by streams, at least one data source and at least one data sink

• Represented by a directed graph
  – Graph vertices: operators
  – Graph edges: streams

• Graph can be cyclic
  – Some systems only support directed acyclic graphs (DAGs)

• Graph topology rarely changes

DSP operator

• A self-contained processing element that:
  – transforms one or more input streams into another stream
  – can execute generic user-defined code
    • Algebraic operation (filter, aggregate, join, ...)
    • User-defined (more complex) operation (POS-tagging, …)
  – can execute in parallel with other operators

• Can be stateless or stateful
  – Stateless: keeps no state across items (e.g., filter, map)
  – Stateful: keeps some sort of state
    • E.g., some aggregation or summary of processed elements, or a state machine for detecting patterns of fraudulent financial transactions

• State might be shared between operators
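A minimal Python sketch of the stateless/stateful distinction follows. The names `filter_op` and `RunningCount` are hypothetical and chosen only for illustration; they do not come from any specific streaming system.

```python
from collections import Counter
from typing import Callable, Iterable, Iterator, Tuple


def filter_op(stream: Iterable[int], pred: Callable[[int], bool]) -> Iterator[int]:
    """Stateless operator: each item is handled independently of earlier ones."""
    for item in stream:
        if pred(item):
            yield item


class RunningCount:
    """Stateful operator: keeps a per-key count across items."""

    def __init__(self) -> None:
        self.state: Counter = Counter()

    def process(self, key: str) -> Tuple[str, int]:
        self.state[key] += 1
        return key, self.state[key]
```

Replicating or moving `filter_op` needs no coordination, whereas `RunningCount` carries `state` that has to be preserved whenever the operator is scaled, relocated or recovered.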

“Hello World”: Word Count

[Figure: topology Words source -> (word) -> Words counter -> (word, counter) -> Intermediate sorter -> (ranks) -> Final sorter -> (final rank)]
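The topology in the figure can be mimicked in plain Python, as a rough sketch rather than an actual streaming implementation. The assumptions are: the input is a finite sample of text lines, words are partitioned across the intermediate sorters with a CRC32 hash, and all function names are hypothetical.

```python
import zlib
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Tuple

Ranking = List[Tuple[str, int]]


def words_source(lines: Iterable[str]) -> Iterator[str]:
    """Source: emits one (word) tuple per word."""
    for line in lines:
        yield from line.lower().split()


def words_counter(words: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Stateful operator: emits (word, counter) after every update."""
    counts: Counter = Counter()
    for word in words:
        counts[word] += 1
        yield word, counts[word]


def top_k(pairs: Iterable[Tuple[str, int]], k: int) -> Ranking:
    return sorted(pairs, key=lambda p: (-p[1], p[0]))[:k]


def intermediate_sorter(pairs: Iterable[Tuple[str, int]], k: int) -> Ranking:
    """Keeps the latest count of each word it sees and ranks them locally."""
    latest: Dict[str, int] = {}
    for word, count in pairs:
        latest[word] = count
    return top_k(latest.items(), k)


def final_sorter(rankings: Iterable[Ranking], k: int) -> Ranking:
    """Merges the local rankings into the final rank."""
    return top_k([pair for ranking in rankings for pair in ranking], k)


def word_count(lines: Iterable[str], k: int, n_sorters: int = 2) -> Ranking:
    partitions: List[List[Tuple[str, int]]] = [[] for _ in range(n_sorters)]
    for word, count in words_counter(words_source(lines)):
        # Every word always reaches the same intermediate sorter.
        partitions[zlib.crc32(word.encode()) % n_sorters].append((word, count))
    return final_sorter([intermediate_sorter(p, k) for p in partitions], k)
```

Because each word is routed to exactly one intermediate sorter, merging the local top-k lists yields the global top-k.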

Some DSP applications: DEBS’14 GC

• Real-time analytics over high volume sensor data: analysis of energy consumption measurements [DEBS14GC]
  – Smart plugs deployed in households and equipped with sensors that measure values related to power consumption

• Input data stream:

    2967740693, 1379879533, 82.042, 0, 1, 0, 12

• Query 1: make load forecasts based on current load measurements and historical data
  – Output data stream:

    ts, house_id, predicted_load

• Query 2: find the outliers concerning energy consumption
  – Output data stream:

    ts_start, ts_stop, household_id, percentage

Some DSP applications: DEBS’15 GC

• Real-time analytics over high volume spatio-temporal data streams: analysis of taxi trips based on data streams originating from New York City taxis [DEBS15GC]

• Query 1: identify recent frequent routes
• Query 2: identify regions with the highest profit
• Both queries rely on a sliding window operator
  – Continuously evaluate the query results
• Use geo-spatial grids to define the events of interest

Some DSP applications: DEBS’16 GC

• Real-time analytics for a dynamic (evolving) social-network graph [DEBS16GC]
• Query 1: identify the posts that currently trigger the most activity in the social network
• Query 2: identify large communities that are currently involved in a topic
• Require continuous analysis of a dynamic graph considering multiple streams that reflect graph updates

Data stream systems

Streaming system

• Distributed system that executes stream graphs
  – continuously calculates results for long-standing queries
  – over potentially infinite data streams
  – using operators
    • that can be stateless or stateful

• System nodes may be heterogeneous

• Must be highly optimized and with minimal overhead so as to deliver real-time response for high-volume DSP applications

Operator placement

• Determine, within a set of available distributed computing nodes, the nodes that should host and execute each operator of a DSP application

[Figure: a DSP graph with operators 1 to 6 and streams (1,2), (2,3), (2,4), (3,5), (4,5), (4,6), mapped onto a set of computing nodes]
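To make the placement problem concrete, here is a toy exhaustive search in Python over the six-operator graph of the figure. Everything beyond the graph itself is an assumption made for illustration: two nodes named `n1` and `n2`, a capacity of three operators per node, a cost of 0 for streams inside a node and 10 for streams between nodes. Exhaustive search is only viable for tiny instances.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

OPERATORS = [1, 2, 3, 4, 5, 6]
EDGES = [(1, 2), (2, 3), (2, 4), (3, 5), (4, 5), (4, 6)]  # streams from the figure
NODES = ["n1", "n2"]                                       # hypothetical nodes
CAPACITY = {"n1": 3, "n2": 3}                              # operators per node (assumed)
LATENCY = {"n1": {"n1": 0, "n2": 10}, "n2": {"n1": 10, "n2": 0}}


def best_placement(
    operators: List[int],
    edges: List[Tuple[int, int]],
    nodes: List[str],
    capacity: Dict[str, int],
    latency: Dict[str, Dict[str, int]],
) -> Tuple[Optional[Dict[int, str]], float]:
    """Try every operator-to-node assignment and keep the cheapest feasible one."""
    best: Optional[Dict[int, str]] = None
    best_cost = float("inf")
    for assignment in product(nodes, repeat=len(operators)):
        if any(assignment.count(n) > capacity[n] for n in nodes):
            continue  # violates a node capacity
        placement = dict(zip(operators, assignment))
        # Cost: sum of the network latencies crossed by each stream.
        cost = sum(latency[placement[u]][placement[v]] for u, v in edges)
        if cost < best_cost:
            best, best_cost = placement, cost
    return best, best_cost
```

Under these assumptions the search keeps operators 1, 2, 3 together and 4, 5, 6 together, so that only the streams (2,4) and (3,5) cross the network.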

Big data centers

• Which frameworks for data stream processing?
• Usually run in locally distributed clusters within large data centers

• Assumptions:
  – Scale out and not scale up
    • Commodity servers
    • Data-parallelism is king
  – Software designed for failure
    • See [Dea09]

[Figure: data center. Source: Google]

Apache Storm

• Apache Storm
  – Open-source, real-time, scalable streaming system
  – Provides an abstraction layer to execute DSP applications

• Topology (streaming graph)
  – Spouts (data sources) and bolts (operators and data sinks)

[Figure: example topology of spouts and bolts connected by streams]

Storm entities

• Task: operator instance
• Executor: smallest schedulable entity
  – Executes one or more tasks related to the same operator
• Worker process: Java process running a subset of executors
• Worker node: computing resource, a container for worker processes

[Figure: a worker process (Java process) containing executors (threads), each running one or more tasks]
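The relation between tasks and executors can be pictured with a small Python model. This is a sketch of the concepts only: the `Executor` class and the round-robin split in `assign_tasks` are assumptions for illustration and are not taken from Storm's code base.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Executor:
    """A thread that runs one or more tasks of a single operator."""
    operator: str
    tasks: List[int] = field(default_factory=list)


def assign_tasks(operator: str, num_tasks: int, num_executors: int) -> List[Executor]:
    """Spread the tasks of one operator over its executors, round-robin."""
    if not 1 <= num_executors <= num_tasks:
        raise ValueError("need between 1 and num_tasks executors")
    executors = [Executor(operator) for _ in range(num_executors)]
    for task_id in range(num_tasks):
        executors[task_id % num_executors].tasks.append(task_id)
    return executors
```

For example, five tasks of a hypothetical `counter` operator on two executors end up as tasks 0, 2, 4 on the first executor and 1, 3 on the second.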

Storm architecture

[Figure: Storm architecture]

Other frameworks (partial list)

• Cloud-based frameworks
  – Amazon Kinesis
  – Google Cloud Dataflow
  – Microsoft

• Apache Spark
  – Improves on MapReduce (batch processing)
  – Spark Streaming: split each stream into small batches and process them one after the other (micro-batch processing)


[Figure: two processing approaches, labeled “e.g., Apache Storm” and “e.g., Apache Spark”]

A new breadth of frameworks

• Lambda architecture
  – Data-processing design pattern to handle massive quantities of data and integrate batch and real-time processing within a single framework

Source: https://voltdb.com/products/alternatives/lambda-architecture
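A toy Python sketch of the pattern: every event goes both to an append-only dataset that is periodically reprocessed in batch and to an incremental real-time view, and queries merge the two. The class and attribute names are hypothetical, and the terms "batch view" and "real-time view" follow common descriptions of the Lambda architecture rather than the slide.

```python
from collections import Counter
from typing import List


class LambdaCounter:
    """Toy Lambda-style pipeline that counts events per key."""

    def __init__(self) -> None:
        self.master: List[str] = []              # append-only dataset (batch side)
        self.batch_view: Counter = Counter()     # result of the last batch run
        self.realtime_view: Counter = Counter()  # events seen since that run

    def ingest(self, key: str) -> None:
        self.master.append(key)        # kept for the next batch recomputation
        self.realtime_view[key] += 1   # immediately visible to queries

    def run_batch(self) -> None:
        """Recompute the batch view from scratch and reset the real-time view."""
        self.batch_view = Counter(self.master)
        self.realtime_view.clear()

    def query(self, key: str) -> int:
        """Merge the batch and real-time views."""
        return self.batch_view[key] + self.realtime_view[key]
```

The real-time view only has to cover the events that arrived after the last batch run, which keeps the low-latency path small.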

Challenges in data stream processing

Challenge 1: Optimize the DSP application

• Apply some transformation to the streaming graph
  – At design time or run-time

• Operator reordering [Hir14]
  – To avoid unnecessary data transfers
  [Figure: A -> B becomes B -> A]

• Redundancy elimination [Hir14]
  [Figure: a graph over operators A, B, C, D before and after removing a redundant operator]

Challenge 1: Optimize the DSP application

• Operator separation [Hir14]
  [Figure: A becomes A1 -> A2]

• Fusion [Hir14]
  [Figure: A -> B becomes a single operator AB]
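Reordering and fusion can be illustrated with ordinary function composition. The Python sketch below is not taken from [Hir14]: the operators `add_two` and `keep_even` are hypothetical and were picked because they commute, which is the precondition for reordering them safely.

```python
from typing import Callable, Iterable, List, Optional

Operator = Callable[[int], Optional[int]]  # returning None drops the item


def run(items: Iterable[int], operators: List[Operator]) -> List[int]:
    """Push every item through the chain of operators."""
    out: List[int] = []
    for item in items:
        for op in operators:
            item = op(item)
            if item is None:
                break  # the item was filtered out
        else:
            out.append(item)
    return out


def fuse(a: Operator, b: Operator) -> Operator:
    """Fusion: A -> B becomes one operator AB with no stream in between."""
    def ab(x: int) -> Optional[int]:
        y = a(x)
        return None if y is None else b(y)
    return ab


class CountingOp:
    """Wraps an operator and counts how many items reach it."""

    def __init__(self, fn: Operator) -> None:
        self.fn, self.calls = fn, 0

    def __call__(self, x: int) -> Optional[int]:
        self.calls += 1
        return self.fn(x)


def add_two(x: int) -> Optional[int]:
    return x + 2  # preserves parity, so it commutes with keep_even


def keep_even(x: int) -> Optional[int]:
    return x if x % 2 == 0 else None
```

Running `add_two` before `keep_even` on `range(6)` invokes `add_two` six times; with the filter moved upstream it is invoked three times, and the output is the same in both orders.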

Challenge 2: Place the operators

• Operator placement decision: a complex problem
  – Trade communication cost against resource utilization

• When
  – Initial (static) operator placement
    • Can be more expensive and comprehensive
  – Can also be at run-time
    • Move only relocatable operators
    • Requires operator migration

• See Part II

Challenge 3: Manage load variations

• Typical stream processing workloads are:
  – with high volume and high rates
  – bursty and with workload spikes not known in advance
    • Twitter in 2013: rate of tweets per second = 5700…
    • but significant peak of 144,000 tweets per second (about 25 times that rate)

Challenge 3: Manage load variations

• Possible approaches:
  – Admission control
  – Static reservation
    • Reserve specific resources in advance
    • Cons: over-provisioning and cost increase
  – Apply dynamic techniques such as load shedding
    • Selectively drop tuples at strategic points (e.g., when CPU usage exceeds a specific limit)
    • Cons: sacrifice accuracy and completeness

[Figure: a Shedder placed in front of operator A]
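A random load shedder can be sketched as follows. The sketch assumes that CPU usage is proportional to the input rate, so that dropping a fraction 1 - limit/usage of the tuples brings usage back to the limit; the function names are hypothetical, and real shedders usually choose which tuples to drop more carefully than uniformly at random.

```python
import random
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def drop_probability(cpu_usage: float, cpu_limit: float) -> float:
    """Fraction of tuples to drop so that the retained load fits under the limit."""
    if cpu_usage <= cpu_limit:
        return 0.0
    return 1.0 - cpu_limit / cpu_usage


def shed(stream: Iterable[T], cpu_usage: float, cpu_limit: float,
         rng: random.Random) -> Iterator[T]:
    """Forward each tuple with probability 1 - p, drop it otherwise."""
    p = drop_probability(cpu_usage, cpu_limit)
    for item in stream:
        if rng.random() >= p:
            yield item
```

With usage at 0.9 and a limit of 0.6, about one tuple in three is dropped, which is exactly the accuracy and completeness cost mentioned above.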

Challenge 3: Manage load variations

• Possible approaches (continued):
  – Use adaptive rate allocation [Bou12]
  – Redistribute load, e.g., determine new operator placement and relocate operators on computing nodes
    • Cons: available resources could be insufficient

Exploit data parallelism

• Alternative solution:
  – Detect bottleneck
  – Use data-parallelism (aka operator fission [Hir14])
    • Apply SIMD paradigm: concurrent execution of multiple replicas of the same operator on different data portions
    • By hand: possible, but cumbersome

[Figure: A -> B becomes Split -> three replicas of A -> Merge -> B]
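The Split/Merge pattern of the figure can be sketched in Python with a thread pool standing in for the replicas. The assumptions are: string items, a CRC32 hash as the splitting function, and a merge step that simply concatenates the partial results without restoring the original order. All function names are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def split(items: Sequence[str], n: int) -> List[List[str]]:
    """Split: route each item to one of n replicas by hashing it."""
    parts: List[List[str]] = [[] for _ in range(n)]
    for item in items:
        parts[zlib.crc32(item.encode()) % n].append(item)
    return parts


def fission(items: Sequence[str], operator: Callable[[str], str],
            replicas: int) -> List[str]:
    """Run `replicas` copies of the same operator on disjoint data portions."""
    parts = split(items, replicas)
    with ThreadPoolExecutor(max_workers=replicas) as pool:
        results = list(pool.map(lambda part: [operator(x) for x in part], parts))
    # Merge: concatenate the partial outputs (global order is not preserved).
    return [y for part in results for y in part]
```

Hash-based splitting sends equal items to the same replica, which becomes important once the replicated operator is stateful.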

Elastic stream processing

• Exploit elasticity: acquire and release resources when needed
  – At application layer (i.e., data parallelism)
    • Scale out (or scale in) operators
    • Activate (or deactivate) replicated operators [Bel14]
  – At infrastructure layer
    • Scale out (or scale in) computing nodes

Elastic stream processing

• When and how to scale?
  – See Part II

• But elasticity overhead is not zero!
  – In most streaming systems: run a new placement decision to take the new resources into account
  – Dynamic scaling impacts stateful operators

Challenge 4: Self-adapt at run-time

• To cope with a highly dynamic operating environment
  – Unpredictable workload
  – Computational characteristics of operators not known a priori
  – Need to sustain load during long provisioning times
  – Node availability, network congestion, …

• Exploit run-time adaptation capabilities of streaming systems

• What adaptation actions?
  – Scale the number of operator instances, relocate the operators, …

Self-adaptation framework

• MAPE: Monitor, Analyze, Plan and Execute
• Software reference framework for self-adaptation
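One way to picture a MAPE loop is as four small functions called in sequence. The Python sketch below scales the number of replicas of one operator; the utilization thresholds (0.3 and 0.8), the one-replica-at-a-time plan and all names are assumptions made for illustration, not a policy taken from the lectures.

```python
from typing import Iterable, List


def monitor(load: float, replicas: int, capacity: float) -> float:
    """Monitor: utilization of the operator given the current load."""
    return load / (replicas * capacity)


def analyze(utilization: float, low: float = 0.3, high: float = 0.8) -> str:
    """Analyze: turn the measurement into a symptom."""
    if utilization > high:
        return "overloaded"
    if utilization < low:
        return "underloaded"
    return "ok"


def plan(symptom: str, replicas: int) -> int:
    """Plan: decide the new number of replicas, one step at a time."""
    if symptom == "overloaded":
        return replicas + 1
    if symptom == "underloaded" and replicas > 1:
        return replicas - 1
    return replicas


def mape_loop(loads: Iterable[float], capacity: float, replicas: int = 1) -> List[int]:
    """Run one MAPE iteration per load sample; return the replica count over time."""
    history: List[int] = []
    for load in loads:
        symptom = analyze(monitor(load, replicas, capacity))
        replicas = plan(symptom, replicas)  # Execute: apply the planned change
        history.append(replicas)
    return history
```

With a per-replica capacity of 100 and loads 50, 200, 200, 200, 20, the loop goes from one replica to three and then back to two.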

Distributed Storm

• We developed an extension of Storm [Car15]
• Goals: to provide
  – distributed monitoring
  – distributed placement (see Part II)
  – and adaptation capabilities
• Where: large-scale environment
• Code available on GitHub: matnar.github.io/uniroma2-storm/

Distributed Storm architecture

[Figure: Distributed Storm architecture]

Distributed Storm: monitoring

• QoSMonitor (for each worker node)
  – Estimate network latencies
    • Use a network coordinate system
    • Vivaldi’s algorithm [Dab04]: decentralized and gossip-based
  – Monitor QoS attributes
    • Node utilization and availability

• WorkerMonitor (for each worker process)
  – Monitor exchanged data rate among the operators
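For readers unfamiliar with network coordinates, the following Python sketch shows one Vivaldi-style update step as described in [Dab04]: a node nudges its coordinate so that the Euclidean distance to a peer better matches the measured round-trip time. The constants `CC` and `CE` are illustrative values, the fixed fallback direction for coincident coordinates replaces the random one used in the paper, and this is not the code of the QoSMonitor.

```python
import math
from typing import List, Tuple

CC = 0.25  # step-size constant (illustrative value)
CE = 0.25  # error-smoothing constant (illustrative value)


def vivaldi_update(xi: List[float], ei: float, xj: List[float], ej: float,
                   rtt: float) -> Tuple[List[float], float]:
    """Update node i's coordinate xi and error ei after measuring rtt to node j."""
    dist = math.dist(xi, xj)
    w = ei / (ei + ej)                    # weigh the sample by relative confidence
    sample_error = abs(dist - rtt) / rtt  # relative error of this sample
    ei_new = sample_error * CE * w + ei * (1 - CE * w)
    delta = CC * w                        # adaptive time step
    if dist > 0:
        unit = [(a - b) / dist for a, b in zip(xi, xj)]
    else:
        unit = [1.0] + [0.0] * (len(xi) - 1)  # arbitrary direction when coincident
    # Move away from j if the coordinates underestimate the RTT, toward j otherwise.
    xi_new = [a + delta * (rtt - dist) * u for a, u in zip(xi, unit)]
    return xi_new, ei_new
```

Each node runs this step whenever it exchanges a message with a peer, which is what makes the scheme decentralized and gossip-friendly.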

Distributed Storm: performance

[Figure: performance under a load spike on a subset of nodes; annotation “~50%”]

Self-adaptation challenges

• Adaptation has a non-negligible cost!
  – Run-time reconfigurations can increase latency and reduce application availability
    • Perform adaptation only when needed
  – Costs of operator migrations cannot be neglected
    • Freeze times caused by operator migration
    • How to migrate stateful operators?

Challenge 5: stateful operators

• State complicates things…
  1. Dynamic scaling
  2. Operator re-placement
  3. Recovery from failure
• All three impact state: loss of state!

Approaches for stateful migration

• Most streaming systems do not support stateful processing and migration (e.g., Storm)
  – Developers manage state
  – Typically combine with external system to store state
  – Design complexity

• Requirements for stateful operator migration
  – Safety (i.e., to preserve the consistency of the operations)
  – Application transparency
  – Minimal footprint

Stateful operator migration

• Parallel track approach [Hei14]
• Pause-and-resume approach
  1. Stop migrating task
  2. Save state
  3. Terminate migrating task and start it on new node
  4. Restore state
  5. Resume stream processing
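The five pause-and-resume steps can be traced in a few lines of Python. This is a single-process sketch under stated assumptions: `CounterTask` is a hypothetical stateful operator, the `store` dictionary stands in for an external state store, and `pending` holds the tuples that arrive while the task is stopped.

```python
from collections import Counter, deque
from typing import Deque, Dict, Optional


class CounterTask:
    """Hypothetical stateful task that counts words on a given node."""

    def __init__(self, node: str, state: Optional[Dict[str, int]] = None) -> None:
        self.node = node
        self.state: Counter = Counter(state or {})

    def process(self, word: str) -> None:
        self.state[word] += 1

    def save_state(self) -> Dict[str, int]:
        return dict(self.state)


def migrate(task: CounterTask, new_node: str, pending: Deque[str],
            store: Dict[str, Dict[str, int]]) -> CounterTask:
    # 1. Stop the migrating task: from now on tuples only accumulate in `pending`.
    # 2. Save its state to the external store.
    store["state"] = task.save_state()
    # 3. Terminate the old task and start it on the new node,
    # 4. restoring the saved state.
    new_task = CounterTask(new_node, store["state"])
    # 5. Resume stream processing, starting from the buffered tuples.
    while pending:
        new_task.process(pending.popleft())
    return new_task
```

The time between step 1 and step 5 is the freeze time during which no results are produced for this operator.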

Approaches for stateful migration

• How to identify the portion of state to migrate?
  – Expose an API to let the user manually manage the state [Fer13]
  – Support only partitioned stateful operators [Ged14]
    • Partitioned stateful operators store independent state for each sub-stream identified by a partitioning key
    • Automatically determine, on the basis of a partitioning key, the optimal number of state partitions to be used and migrated
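With partitioned state, the partitioning key alone decides which replica owns each piece of state, so scaling reduces to moving the entries whose owner changes. The Python sketch below uses plain modulo hashing with CRC32, which is an assumption made for brevity and not the scheme of [Ged14]; the function names are hypothetical.

```python
import zlib
from typing import Dict, List


def owner(key: str, replicas: int) -> int:
    """Replica that owns the state of a given partitioning key."""
    return zlib.crc32(key.encode()) % replicas


def repartition(states: List[Dict[str, int]], new_replicas: int) -> List[Dict[str, int]]:
    """Redistribute the per-key state when the number of replicas changes."""
    new_states: List[Dict[str, int]] = [{} for _ in range(new_replicas)]
    for state in states:
        for key, value in state.items():
            new_states[owner(key, new_replicas)][key] = value
    return new_states
```

Modulo hashing tends to move many keys at every scaling step; limiting the amount of state that has to migrate is one of the reasons more careful partitioning schemes exist.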

Elastic stateful migration in Storm

• We developed mechanisms for elastic stateful migration in Storm [Car16]
• Code on GitHub: matnar.github.io/elastic-storm/

[Figure: architecture showing Nimbus, ZooKeeper, a scheduler, MigrationNotifier and ElasticityManager, together with four Supervisors, each with worker processes or worker slots and a DDS, connected by a network]

Elastic stateful migration in Storm

• Scaling decisions at the framework level
  – Adapt the number of parallel instances for each application operator
  – Simple threshold-based scaling policy (see Part II)

• Relocate the operator internal state on a different node and enable Storm to change the application deployment at run-time

[Figure: migration timeline. Migrating task: MIGRATION NOTIFIED -> MIGRATION MODE -> SAVE STATE; after the first synchronization barrier the migrating task can be terminated. New task: MIGRATION MODE -> RESTORE STATE (if any) -> OPERATIONAL MODE; after the second synchronization barrier streams are resumed. Two DDS components are shown.]

Performance results

• DSP application: frequent pattern detection

• Elastic scaling and stateful migration improve the application latency

[Figure, first panel: application latency (ms) vs. time (s), with and without E+SM, with scaling and scheduling events marked, for an input data rate varying among 120, 250, 350 and 900 tweets/s]

[Figure, second panel: number of executors vs. time (s) with E+SM, with scaling and scheduling events marked, for the same data rates]

Challenge 6: guarantee fault tolerance

• DSP applications run for long time intervals: failures are unavoidable
• Possible solutions:
  – Active replication [Bri09]
  – Check-pointing [Seb11]
  – Replay logs [Bal08]
  – Hybrid solutions [Zha10]
• Having different trade-offs between run-time cost in absence of failures and recovery cost
• Large scale complicates things…
  – Network partitions and CAP theorem
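Check-pointing and replay logs are often combined: state is snapshotted periodically, and after a failure the operator restores the last snapshot and replays the tuples logged since then. The Python sketch below illustrates that idea only; it is not the scheme of [Seb11] or [Bal08], the in-memory `_snapshot` stands in for stable storage, and the class name is hypothetical.

```python
import copy
from typing import Dict, List, Tuple


class CheckpointedCounter:
    """Counting operator that checkpoints its state every few tuples."""

    def __init__(self, checkpoint_every: int) -> None:
        self.state: Dict[str, int] = {}
        self.processed = 0
        self.checkpoint_every = checkpoint_every
        # (state, number of tuples covered); stands in for stable storage.
        self._snapshot: Tuple[Dict[str, int], int] = ({}, 0)

    def process(self, key: str) -> None:
        self.state[key] = self.state.get(key, 0) + 1
        self.processed += 1
        if self.processed % self.checkpoint_every == 0:
            self._snapshot = (copy.deepcopy(self.state), self.processed)

    def recover(self, log: List[str]) -> None:
        """Restore the last checkpoint, then replay the tuples logged after it."""
        self.state = copy.deepcopy(self._snapshot[0])
        self.processed = self._snapshot[1]
        for key in log[self.processed:]:
            self.process(key)
```

A shorter checkpoint interval raises the cost paid when nothing fails and shortens the replay after a failure, which is the trade-off named in the slide.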

Challenge 7: Manage multiple concurrent DSP applications

• Consider multiple competing DSP applications
• How should the streaming system allocate resources?
  – Fairness
  – Resource utilization
  – Profitability, …

Apache Mesos

• Run concurrent frameworks on the same cluster and dynamically share the cluster resources
• Mesos: a cluster “operating system” [Hin11]
  – Efficient resource isolation and sharing across distributed frameworks

Apache Mesos

• Two-level scheduling based on Dominant Resource Fairness (DRF) algorithm
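The idea behind DRF can be shown with a small progressive-filling loop in Python: the next task always goes to the user whose dominant share (the largest fraction it holds of any single resource) is currently the smallest. The sketch assumes fixed, strictly positive per-task demands and is a simplified variant, not the Mesos implementation: a user whose next task no longer fits is skipped instead of stopping the whole allocation. The function names are hypothetical.

```python
from typing import Dict, List


def dominant_share(tasks: int, demand: List[float], capacity: List[float]) -> float:
    """Largest fraction of any single resource held by a user running `tasks` tasks."""
    return max(tasks * d / c for d, c in zip(demand, capacity))


def drf(capacity: List[float], demands: Dict[str, List[float]]) -> Dict[str, int]:
    """Allocate tasks one at a time to the user with the lowest dominant share."""
    used = [0.0] * len(capacity)
    tasks = {user: 0 for user in demands}
    active = sorted(demands)  # users that can still receive tasks
    while active:
        user = min(active, key=lambda u: dominant_share(tasks[u], demands[u], capacity))
        demand = demands[user]
        if all(x + d <= c for x, d, c in zip(used, demand, capacity)):
            used = [x + d for x, d in zip(used, demand)]
            tasks[user] += 1
        else:
            active.remove(user)  # this user's next task does not fit
    return tasks
```

On a cluster with 9 CPUs and 18 GB of memory, a user whose tasks need 1 CPU and 4 GB ends up with three tasks and a user whose tasks need 3 CPUs and 1 GB with two, so both reach a dominant share of 2/3.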

GMesos: distributed Mesos

• We are currently developing GMesos for large-scale environment… stay tuned!

Some new challenges and research opportunities

• Integrate data stream processing with SDN
  – With SDN, the network enters the control loop

• Study cross-layer optimization

• Address security and privacy issues in data stream processing

References

[And14] H. C. M. Andrade, B. Gedik, D. S. Turaga, “Fundamentals of Stream Processing: Application Design, Systems, and Analytics”, Cambridge University Press, 2014.
[Art14] A. Artikis et al., “Heterogeneous stream processing and crowdsourcing for urban traffic management”, In Proc. of EDBT ’14, 2014.
[Bal08] M. Balazinska, H. Balakrishnan, S. Madden, M. Stonebraker, “Fault-tolerance in the Borealis distributed stream processing system”, ACM Trans. Database Syst. 33, 1, 2008.
[Bel14] P. Bellavista, A. Corradi, S. Kotoulas, A. Reale, “Adaptive Fault-Tolerance for Dynamic Resource Provisioning in Distributed Stream Processing Systems”, In Proc. of EDBT ’14, 2014.
[Bou12] I. Boutsis, V. Kalogeraki, “RADAR: Adaptive rate allocation in distributed stream processing systems under bursty workloads”, Proc. of SRDS ’12, 2012.
[Bri09] A. Brito, C. Fetzer and P. Felber, “Multithreading-enabled active replication for event stream processing operators”, In Proc. of SRDS ’09, 2009.
[Car15] V. Cardellini, V. Grassi, F. Lo Presti, M. Nardelli, “Distributed QoS-aware scheduling in Storm”, Proc. of ACM DEBS ’15, 2015.
[Car16] V. Cardellini, M. Nardelli, D. Luzi, “Elastic stateful stream processing in Storm”, Proc. of HPCS ’16, 2016.
[Dab04] F. Dabek, R. Cox, F. Kaashoek, R. Morris, “Vivaldi: A decentralized network coordinate system”, SIGCOMM Comput. Commun. Rev. 34, 4, 2004.
[Dea09] J. Dean, “Design, Lessons and Advice from Building Large Distributed Systems”, In LADIS ’09, 2009.
[DEBS14GC] Z. Jerzak, H. Ziekow, “The DEBS 2014 grand challenge”, In Proc. of ACM DEBS ’14, 2014.
[DEBS15GC] Z. Jerzak, H. Ziekow, “The DEBS 2015 grand challenge”, In Proc. of ACM DEBS ’15.
[DEBS16GC] V. Gulisano, Z. Jerzak, S. Voulgaris, H. Ziekow, “The DEBS 2016 grand challenge”, In Proc. of ACM DEBS ’16, 2016.
[Fer13] R. Fernandez, M. Migliavacca, E. Kalyvianaki, and P. Pietzuch, “Integrating scale out and fault tolerance in stream processing using operator state management”, in Proc. of ACM SIGMOD ’13, 2013.
[Ged14] B. Gedik, S. Schneider, M. Hirzel, and K.-L. Wu, “Elastic scaling for data stream processing”, IEEE Trans. Parallel Distrib. Syst. 25, 6, 2014.
[Gol03] L. Golab, M. Özsu, “Issues in data stream management”, ACM SIGMOD Rec. 32, 2, 2003.
[Hei14] T. Heinze, L. Aniello, L. Querzoni, and Z. Jerzak, “Cloud-based data stream processing”, in Proc. of ACM DEBS ’14, 2014.
[Hin11] B. Hindman et al., “Mesos: a platform for fine-grained resource sharing in the data center”, In Proc. of NSDI ’11, 2011.
[Hir14] M. Hirzel, R. Soulé, S. Schneider, B. Gedik, R. Grimm, “A catalog of stream processing optimizations”, ACM Comput. Surv. 46, 4, 2014.
[Seb11] Z. Sebepou, K. Magoutis, “CEC: Continuous eventual checkpointing for data stream processing operators”, In Proc. of DSN ’11, 2011.
[Zha10] Z. Zhang et al., “A hybrid approach to high availability in stream processing systems”, In Proc. of ICDCS ’10, 2010.

Thank you! Any questions?

Valeria Cardellini
cardellini@ing.uniroma2.it
www.ce.uniroma2.it/~valeria