TRANSCRIPT
An Incremental Online Graph Partitioning Algorithm for Distributed Graph Databases
Dong Dai*, Wei Zhang, Yong Chen
IOGP
Outline of the Presentation
6/30/17 HPDC 2017, Washington D.C. 2
• A Use Case
• OLTP vs. OLAP
• Rethink Graph Partitioning
• Modeling and Analysis
• IOGP Design Idea
• IOGP Details
• Evaluation Setup
• Partitioning Quality
• OLTP Performance
A Use Case
• Meet Daniel from "All-About-HPC" LLC
• He wants to build a provenance system for HPC
  • He traces all provenance-relevant events
  • He models and stores them into a provenance graph
  • He offers users graph interfaces to query provenance
New events inserted continuously
Build large graphs with properties
Thousands of clients query it
Typical OLTP Workloads
OLTP vs. OLAP
• We have two sets of tools to handle graphs
• Graph Databases (OLTP)
  • short, finish in milliseconds
  • touch a small part of the graph
  • measured by the time cost of each transaction
• Graph Processing Engines (OLAP)
  • longer, finish in seconds or minutes
  • touch a large part of, or the whole, graph
  • measured by system throughput
Daniel needs a distributed graph database for his OLTP operations. But how should he partition the graph?
Rethink the Graph Partitioning
• Many graph partitioning algorithms
• Multi-level schemes
  • METIS, Chaco, PMRSB, Scotch, ParMetis, PT-Scotch, etc.
• Heuristic single-pass methods
  • LDG, Fennel, 2D-Grid, Vertex-Cut, etc.
• Online partitioning algorithms
  • Hashing, Leopard, etc.
Not Designed for OLTP Workloads
Rethink the Graph Partitioning
• What does OLTP need?
• Respond quickly when an insertion happens
  • It needs to finish fast (in ms)
  • No time to wait for multi-level schemes or even re-partitioning
• Respond while knowing nothing about the vertex
  • The insertion may be "random"
  • Do not assume knowing the connectivity of the inserted vertex
• Measurement of a good partitioning
  • OLTP cares about the response time of each operation
  • Only minimizing edge-cuts / maximizing balance is not enough
  • Think about a graph with n equal-size subgraphs
Modeling and Analysis
• Generic distributed graph database model
  • directed/undirected graphs
  • bi-directional accesses on the graphs
  • queryable properties on vertices and edges
Factors of OLTP Performance
• Single-Point Access Performance
  • One-hop: if clients know the location of the queried data
  • It saves the time of querying location information
• Traversal Performance (Edge-Cut and Vertex-Cut)
IOGP Core Idea
• Leverage deterministic hashing to place new vertices
  • fast and easy to calculate
  • enables one-hop access
  • needs no info to conduct partitioning
• Incrementally reassign vertices to better locations
  • continuously get better locality with increasing knowledge
• For vertices that are too large, change from edge-cut to vertex-cut
  • increases the parallelism
  • improves OLTP response time
IOGP - Quiet Stage
• Quiet Stage is the default stage
  • when a vertex v is inserted, it is placed based on hashing
  • when an edge u->v is inserted, it is placed together with its connected vertices
(figure: inserting a vertex v and an edge v → u)
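The Quiet Stage placement rule can be sketched as follows. The function names and the md5-based hash are illustrative assumptions (not SimpleGDB APIs); IOGP only requires that the hash be deterministic, so any client can locate a vertex in one hop without consulting a metadata service.

```python
import hashlib

def place_vertex(vertex_id, num_servers):
    """Deterministically map a vertex to a server by hashing its id."""
    digest = hashlib.md5(str(vertex_id).encode()).hexdigest()
    return int(digest, 16) % num_servers

def place_edge(src, dst, num_servers):
    """An edge u->v is stored with both endpoint vertices,
    supporting bi-directional traversal from either side."""
    return place_vertex(src, num_servers), place_vertex(dst, num_servers)
```

Because the mapping depends only on the vertex id and the server count, no connectivity information is needed at insertion time.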
Vertex Reassign Stage
• When we know enough connectivity of a vertex v
  • move it to the partition that stores most of its neighbors
  • a heuristic function (derived from Fennel)
(figure: vertex v with neighbors u1, u2, u3, ... is reassigned)
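A simplified sketch of the reassignment decision. The paper's heuristic is derived from Fennel; the linear balance penalty `alpha` below is an illustrative stand-in for the exact function, which trades locality gain against partition load.

```python
def choose_partition(neighbor_counts, partition_sizes, alpha=1.0):
    """Pick the server for vertex v.

    neighbor_counts[i]: number of v's neighbors stored on server i
    partition_sizes[i]: number of vertices currently on server i
    """
    best, best_score = 0, float("-inf")
    for i, (n, size) in enumerate(zip(neighbor_counts, partition_sizes)):
        # Locality gain minus a (simplified) balance penalty.
        score = n - alpha * size
        if score > best_score:
            best, best_score = i, score
    return best
```

With equal loads the vertex simply follows the majority of its neighbors; a heavily loaded server can lose the vertex even if it holds slightly more of them.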
Edge Splitting Stage
• When a vertex becomes extremely large
  • edge-cut leads to significant performance degradation
  • split all edges to increase the parallelism of data accesses
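Edge splitting can be sketched as scattering a high-degree vertex's edges across all servers; the crc32-based bucket choice here is an illustrative assumption, the point being that reads of the huge adjacency list then proceed in parallel instead of hitting one partition.

```python
import zlib

def split_edges(vertex, neighbors, num_servers):
    """Scatter the edges of a split vertex across all servers
    (vertex-cut), so traversals of its adjacency list parallelize."""
    buckets = [[] for _ in range(num_servers)]
    for u in neighbors:
        # Deterministic per-neighbor placement, so lookups stay one-hop.
        buckets[zlib.crc32(str(u).encode()) % num_servers].append((vertex, u))
    return buckets
```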
IOGP Details
• We need per-vertex metadata
  • to record the vertex location
  • to know a vertex's degree and whether it has been split
  • to efficiently calculate the reassignment heuristic: counters
(figure: per-vertex metadata kept by each server — loc(v), split(v), and counters(v): alo(v), ali(v), plo(v), pli(v) — plus the original location)
Update Counters after Reassignment
1. Update loc(v) in the original server
2. In server Su (v is moving out of it)
  • v's actual locality turns into potential locality
  • v's local neighbors reduce their actual locality by 1, because v is not local anymore
3. In server Sk (v is moving into it)
  • v's potential locality turns into actual locality
  • v's local neighbors increase their actual locality by 1, because v is local to them now
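The three steps above can be sketched like this, assuming per-server dicts mapping a vertex to its actual locality (`al`) and potential locality (`pl`); the function and argument names are illustrative, and the in/out counter pairs are collapsed into one value each for brevity.

```python
def update_counters_after_move(v, new_server, loc,
                               al_old, pl_old, al_new, pl_new,
                               neighbors_on_old, neighbors_on_new):
    # 1. Update loc(v) on the original server so lookups stay one-hop.
    loc[v] = new_server
    # 2. On Su (v moves out): v's actual locality becomes potential
    #    locality, and each of v's local neighbors loses one unit.
    pl_old[v] = pl_old.get(v, 0) + al_old.pop(v, 0)
    for u in neighbors_on_old:
        al_old[u] -= 1
    # 3. On Sk (v moves in): v's potential locality becomes actual
    #    locality, and each of v's local neighbors gains one unit.
    al_new[v] = al_new.get(v, 0) + pl_new.pop(v, 0)
    for u in neighbors_on_new:
        al_new[u] = al_new.get(u, 0) + 1
```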
Optimization: Timing of Vertex Reassignment
• Vertex reassignment is time-consuming
• Observation
  • more edges, more stable connectivity
  • less reassignment is needed
• Design
  • defer vertex reassignment until its connectivity stabilizes
  • reduce the reassignment frequency as a vertex gains more edges
• Implementation
  • start reassignment only after a vertex's degree is higher than k
  • check for reassignment when its degree reaches [2*k, 4*k, ..., 2^l*k, ...]
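The doubling schedule above can be expressed as a small predicate (illustrative, not the SimpleGDB implementation): a vertex is first considered at degree k and re-checked only each time its degree doubles, so stable high-degree vertices are examined ever less often.

```python
def should_check_reassignment(degree, k):
    """True when `degree` is exactly k, 2k, 4k, ..., i.e. k times a
    power of two -- the checkpoints of IOGP's deferred reassignment."""
    if degree < k:
        return False
    q, r = divmod(degree, k)
    # q must be a power of two (q & (q - 1) == 0 for powers of two).
    return r == 0 and (q & (q - 1)) == 0
```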
Optimization: Asynchronous Data Movement
• Vertex reassignment and edge splitting
  • synchronous data movement may stall OLTP operations
  • optimization: asynchronous data movement
• Edge splitting example
  • update the in-memory counters
  • add the vertex to the pending-movement queue
  • the server starts to reject new edges
• Data may be in different locations during movement
  • original server, new server, and maybe both
  • clients need to read multiple copies and aggregate them
  • prefer the one from the "new server"
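The client-side aggregation can be sketched as a merge that prefers the new server's copy (illustrative names, assuming each copy is a property map read during an in-flight movement):

```python
def merge_reads(old_copy, new_copy):
    """Aggregate the two copies of a vertex's data seen during
    asynchronous movement; the new server's entries win on conflict."""
    merged = dict(old_copy)
    merged.update(new_copy)  # prefer values from the new server
    return merged
```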
Evaluation Setup
• All evaluations were conducted on the CloudLab cluster
  • 32 servers for back-end storage
• We chose various datasets from SNAP for evaluations
  • scales from 200K edges to over 100M edges
  • from various domains, forming different shapes
• We evaluated IOGP on a distributed graph database prototype, namely SimpleGDB (https://github.com/daidong/simplegdb-Java)
  • a prototype system that has been used in several research projects
  • for fair comparison and controllable optimizations
Partitioning Quality
METIS is not always balanced; the others (including IOGP) are well balanced.
• METIS runs directly on the final graph
• Fennel runs in random order
METIS is still the best; IOGP achieves very good quality.
Continuous Refinement of IOGP
• Using a similar heuristic, IOGP performs better than Fennel due to continuous refinement
Key Tunable Parameters
• Reassignment Threshold: somewhere around the average degree of the graph
• Split Threshold: related to 1) the hardware; 2) the scale of the database cluster; 3) vertex degree and property size
Memory Footprint
• How much memory does IOGP use to keep the in-memory counters?
OLTP Performance
Graph Insertion: less than 10% overhead
Graph Traversal: as much as 2x benefit
Conclusion & Future Work
• OLTP workloads on distributed graph databases require a new set of graph partitioning algorithms.
• IOGP achieves less than 10% overhead on insertion, and as much as 2x performance improvement on queries.
• There are still lots of things that can be done
  • reduce the overheads of vertex reassignments
  • partition the graphs based on the exact workload patterns
  • ...