
Scalable Data Analytics: On the Role of Stratified Data Sharding

Srinivasan Parthasarathy
Data Mining Research Lab, Ohio State University
srini@cse.ohio-state.edu

The Data Deluge: Data, Data Everywhere

180 zettabytes will be created in 2025 [IDC report]

$600 to buy a disk drive that can store all of the world's music
[McKinsey Global Institute Special Report, June '11]

Data Storage is Cheap

Data does not exist in isolation.

Data almost always exists in connection with other data – an integral part of the value proposition.


“There’s gold in them there mountains of data” - Gill Press, Forbes Contributor


Social networks, protein interactions, the Internet, VLSI networks, scientific simulations, neighborhood graphs

Big Data Challenge: All this data is only useful if we can extract interesting and actionable information from large, complex data stores efficiently. Projected to be a $200B industry in 2020 [IDC report].

MapReduce, MPI, Spark, etc.

Distributed Data Processing is Central to Addressing the Big Data Challenge

Source: blog.mayflower.de

However, distributed data processing itself can pose challenges!

The Case for Stratified Data Sharding of Complex Big Data

Key Challenge: Data Placement (Sharding)

•  Locality of reference
   –  Placing related items in proximity improves efficiency
•  Mitigating Impact of Data Skew
   –  Critical for big data workloads!
•  Interactive Response Times
   –  Operate on a sample with statistical guarantees
•  Heterogeneity and Energy Awareness
   –  Heterogeneous compute and storage resources are ubiquitous

Example – Image retrieval

[Figure: a query image routed to four partitions. With random partitioning, one partition carries a heavy load, one a light load, and two no load; with stratified partitioning, all four partitions carry an equal load.]

•  Random partitioning
   •  Load imbalance
•  Stratified partitioning
   •  Load imbalance mitigated

Stratified Sampling in a Slide

•  Roots in stratified sampling (Cochran '48)
•  Group related data into "homogeneous strata"
•  Sample each stratum (see the allocation sketch below)
   •  Proportional Allocation (shown)
   •  Optimal Allocation
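The two allocation rules can be made concrete in a few lines. The snippet below is an illustrative sketch only (function names and rounding are mine): proportional allocation sizes each stratum's sample by its population share N_h, while optimal (Neyman) allocation weights by N_h · S_h, so larger or more variable strata get more of the budget.

```python
# Illustrative sketch of stratified-sampling allocation rules (hypothetical
# helpers, not production code). n is the total sample budget.

def proportional_allocation(strata_sizes, n):
    """n_h proportional to stratum size N_h."""
    total = sum(strata_sizes)
    return [round(n * size / total) for size in strata_sizes]

def optimal_allocation(strata_sizes, strata_stddevs, n):
    """Neyman/optimal allocation: n_h proportional to N_h * S_h."""
    weights = [size * sd for size, sd in zip(strata_sizes, strata_stddevs)]
    total = sum(weights)
    return [round(n * w / total) for w in weights]

# Three strata of sizes 1000, 300, 100 with different within-stratum spread.
print(proportional_allocation([1000, 300, 100], n=140))              # [100, 30, 10]
print(optimal_allocation([1000, 300, 100], [1.0, 5.0, 2.0], n=140))  # [52, 78, 10]
```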

But here we want to partition/shard

•  For Locality
   •  Elements within a stratum are placed together
•  For Mitigating Skew
   •  Each partition is a proportionally allocated stratified sample
•  For Interactivity (see the sketch after this list)
   •  Optimally allocate one partition
   •  Proportionally allocate the rest
•  Accounting for Energy/Heterogeneity
   •  More on this later, time permitting

Apps have varying requirements: ONE SIZE DOES NOT FIT ALL!
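For instance, the interactivity policy above (one optimally allocated partition, proportional for the rest) might be sketched as follows. This is a hypothetical illustration, not the paper's implementation: partition 0 receives a Neyman-weighted sample of every stratum, and the leftovers are dealt round-robin so that each remaining partition is roughly a proportional stratified sample.

```python
import random

def interactive_shard(strata, stddevs, num_partitions, sample_budget, seed=0):
    """Hypothetical sketch of the 'interactivity' policy: partition 0 holds an
    optimally (Neyman) allocated stratified sample; the remainder of every
    stratum is dealt round-robin across the other partitions."""
    rng = random.Random(seed)
    parts = [[] for _ in range(num_partitions)]
    weights = [len(s) * sd for s, sd in zip(strata, stddevs)]
    wtotal = sum(weights)
    for stratum, w in zip(strata, weights):
        items = list(stratum)
        rng.shuffle(items)
        k = min(len(items), round(sample_budget * w / wtotal))  # Neyman share
        parts[0].extend(items[:k])
        for i, item in enumerate(items[k:]):                    # proportional rest
            parts[1 + i % (num_partitions - 1)].append(item)
    return parts

# Example: 3 strata, 4 partitions, a 100-element sample held in partition 0.
strata = [list(range(0, 600)), list(range(600, 900)), list(range(900, 1000))]
shards = interactive_shard(strata, stddevs=[1.0, 3.0, 2.0], num_partitions=4,
                           sample_budget=100)
print([len(p) for p in shards])   # [100, 300, 300, 300]
```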

Our Vision: Stratified Data Placement

STRATIFIED DATA SHARDING & PLACEMENT, underpinning:
•  Key-Value Stores, e.g. Memcached, Redis
•  MPI & Partitioned Global Address Space (PGAS) systems, e.g. Global Arrays
•  HADOOP/SHARK/Azure (HDFS/RDD/Blob)

Key Challenge: Creating Strata (of Complex Data)

•  What about clustering?
   •  Non-trivial for data with complex structure
   •  Potentially expensive
   •  Variable-sized entities
•  4-step approach [ICDE '13] (an end-to-end toy sketch follows below):
   1.  Convert complex data into a (multi-)set of pivotal elements that capture features of interest
   2.  Compute a sketch of the set (minwise hashing)
   3.  Use sketches to group into strata (SketchSort / SketchCluster)
   4.  Partition strata according to application needs (e.g. skew, balance, locality)
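A compressed end-to-end preview of the four steps on toy text records. This is a hedged sketch under simplifying assumptions (q-gram shingles as pivots, salted MD5 in place of random permutations, identical-sketch grouping in place of SketchSort); each step is refined on the slides that follow.

```python
import hashlib
from collections import defaultdict

def pivotize(text, q=3):                        # Step 1: q-gram shingles as pivots
    return {text[i:i + q] for i in range(len(text) - q + 1)}

def sketch(pivots, k=4):                        # Step 2: k min-wise hashes
    return tuple(min(int(hashlib.md5(f"{i}:{p}".encode()).hexdigest(), 16)
                     for p in pivots) for i in range(k))

def stratify(sketches):                         # Step 3: group by (identical) sketch
    strata = defaultdict(list)
    for rec_id, sk in sketches.items():
        strata[sk].append(rec_id)
    return list(strata.values())

def shard(strata, num_partitions):              # Step 4: proportional allocation
    parts = [[] for _ in range(num_partitions)]
    for stratum in strata:
        for i, rec_id in enumerate(stratum):
            parts[i % num_partitions].append(rec_id)
    return parts

records = {0: "the cat sat", 1: "the cat sat.", 2: "a dog ran", 3: "a dog ran!"}
sketches = {rid: sketch(pivotize(txt)) for rid, txt in records.items()}
print(shard(stratify(sketches), num_partitions=2))
```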

Step 1: Pivotization

Problem: Need to simplify the complex representation.
Key Idea: Think globally, act locally
•  Sets of localized features that collectively capture the global picture

Solution: specific to data & domain (a minimal sketch follows below)
•  Documents/Text
   •  Shingling [Broder 1998]
•  Trees (XML, linguistic data)
   •  Wedge pivots [Tatikonda '10]
•  Graphs (Web, social, molecules)
   –  Adjacency lists [Buehrer '08], wedge decompositions [Seshadri '11], graphlets [Przulj '09]
•  Spatial/vector data
   •  LSH [Indyk '99, Charikar '02, Satuluri '12]
•  Images/Simulation/Sequential data
   •  Kernels [Leslie '03], KLSH [Kulis 2010]

[Figure: pivot transformations. Each data record Δ1 … Δ25 (small labeled structures over A, B, C, E, F, L) is mapped to a pivot set PS-1 … PS-25 of localized features.]

Step 2: Sketching

•  Problem: Pivot sets may be variable length; similarity computation is expensive: O(n²)
•  Key Idea: Use sketching
•  Solution: Locality-Sensitive Hashing [Broder '98, Indyk '99, Charikar '01] (a minimal sketch follows below)
   –  Resulting representation is fixed-length (k)
   –  Tradeoff: representation fidelity vs. sketch size
   –  Can handle kernel functions [Kulis '09] and statistical priors [Satuluri '12, Chakrabarti '15, '16]

[Figure: minwise hashing applied to the pivot sets PS-1 … PS-25 yields fixed-length sketches, e.g. SK-1 = {1050, 2020, 3130, 1800} and SK-25 = {1050, 2020, 7225, 2020}.]

Minwise Hashing (Broder et al. '98)

Universe: { dog, cat, lion, tiger, mouse }
Permutation 1: [ cat, mouse, lion, dog, tiger ]
Permutation 2: [ lion, cat, mouse, dog, tiger ]

A = { mouse, lion }

mh1(A) = min of { mouse, lion } under permutation 1 = mouse
mh2(A) = min of { mouse, lion } under permutation 2 = lion

Key Fact

For two sets A, B, and a min-hash function mh_i():

   Pr[ mh_i(A) = mh_i(B) ] = |A ∩ B| / |A ∪ B| = Sim(A, B)   (Jaccard similarity)

Unbiased estimator for Sim using k hashes:

   Ŝim(A, B) = (1/k) · Σ_{i=1..k} 1[ mh_i(A) = mh_i(B) ]
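A quick numerical sanity check of this fact (illustrative only, using salted MD5 for the hash functions): the fraction of agreeing min-hashes converges to the true Jaccard similarity as k grows.

```python
import hashlib

def minhash(s, i):
    """Min-hash of set s under the i-th (salted) hash function."""
    return min(int(hashlib.md5(f"{i}|{x}".encode()).hexdigest(), 16) for x in s)

A = {"dog", "cat", "lion", "tiger"}
B = {"cat", "lion", "tiger", "mouse", "wolf"}
true_sim = len(A & B) / len(A | B)        # Jaccard = 3/6 = 0.5

for k in (10, 100, 1000):
    est = sum(minhash(A, i) == minhash(B, i) for i in range(k)) / k
    print(f"k={k:4d}  estimate={est:.3f}  true={true_sim:.3f}")
```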

Step 3: Stratification

Problem: Group related entities into strata
Key Idea: Inspired by W. Cochran's work on stratified sampling [1940s]

Solutions (a SketchSort sketch follows below):
•  Sort pivot sets directly (skipping the sketch step) – PivotSort
•  Directly use the output of LSH/minwise hashing – SketchSort
•  Cluster sketches with a fast variant of k-modes – SketchCluster
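SketchSort can be approximated in a few lines: lexicographically sort records by their fixed-length sketches so that similar sketches land next to each other, then cut the sorted order into strata. This is a hedged simplification with hypothetical names; SketchCluster would instead run a fast k-modes variant over the same sketches.

```python
def sketch_sort(sketches, stratum_size=3):
    """sketches: dict of record id -> fixed-length tuple of min-hashes.
    Sorting the sketches lexicographically places similar records adjacently;
    consecutive runs of `stratum_size` records then form the strata."""
    order = sorted(sketches, key=lambda rid: sketches[rid])
    return [order[i:i + stratum_size] for i in range(0, len(order), stratum_size)]

# Toy example with 3-hash sketches.
sketches = {
    "D1": (1050, 2020, 3130), "D2": (1050, 2020, 3131), "D5": (1050, 2025, 3130),
    "D3": (7225, 9001, 1800), "D4": (7225, 9002, 1800), "D6": (7225, 9001, 1801),
}
print(sketch_sort(sketches, stratum_size=3))
```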

[Figure: SketchSort or SketchCluster groups the sketches SK-1 … SK-25 into strata S-1 … S-128; e.g. stratum S-4 contains (Δ1, SK-1), (Δ5, SK-5), (Δ12, SK-12), (Δ25, SK-25).]

Step 4: Sharding and Placement

•  Problem: How to partition stratified data?
•  Key Ideas: Guided by application hints and system state.
•  Solutions (a minimal sketch follows below):
   1.  Proportional Allocation: split each stratum proportionally across all partitions → mitigates skew
   2.  Optimal Allocation for one partition, proportional allocation for the rest [C77]
   3.  All-in-One: place each stratum in its entirety within a single partition

IMPORTANT NOTE: We use sketches to create strata – but partitioning happens on the original data.
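Policies 1 and 3 in a minimal, hypothetical sketch (policy 2 combines one optimally allocated partition with proportional allocation for the rest, as shown earlier). In keeping with the note above, the inputs are strata of original record ids, not sketches.

```python
def proportional_placement(strata, num_partitions):
    """Policy 1: deal every stratum round-robin across all partitions, so each
    partition is a proportionally allocated stratified sample -> mitigates skew."""
    parts = [[] for _ in range(num_partitions)]
    for stratum in strata:
        for i, rec in enumerate(stratum):
            parts[i % num_partitions].append(rec)
    return parts

def all_in_one_placement(strata, num_partitions):
    """Policy 3: keep each stratum whole, assigning it to the currently
    lightest partition -> preserves locality within a partition."""
    parts = [[] for _ in range(num_partitions)]
    for stratum in sorted(strata, key=len, reverse=True):
        min(parts, key=len).extend(stratum)
    return parts

strata = [["d1", "d5", "d12", "d25"], ["d2", "d3"], ["d4", "d6", "d7"]]
print(proportional_placement(strata, 2))
print(all_in_one_placement(strata, 2))
```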

[Figure: partitioning & replication. The strata S-1 … S-128 are distributed across partitions P-1 … P-8; e.g. P-2 holds S-4, S-7, S-8, S-12, … and P-8 holds S-3, S-4, S-9, S-12, …, S-127.]

[Figure (pipeline summary): DATA (Δ1 … Δ25) → pivot transformations → PIVOT SETS (PS-1 … PS-25) → minwise hashing → SKETCHES (SK-1 … SK-25) → Strata (S).]

Empirical Evaluation

•  We report wall-clock times
•  All times include the cost of placement
•  Evaluations on several key analytic tasks
   •  Top-K algorithms [Fagin], Outlier Detection [Ghoting '08, Otey '06], Frequent Tree [Zaki '05, Tatikonda '09] and Graph Mining [Buehrer '06, Yan '02, Nijssen '04], XML Indexing [Tatikonda '07], Community Detection in social/biological data [Ucar '06, Satuluri '11], Web Graph Compression [Chellapilla '08-'09; Vigna '11, LZ '77], Itemset Mining [Buehrer-Fuhry '15]
•  All applications are run straight out of the box – the only thing the user specifies relates to locality, skew, and interaction.

Frequent Tree Mining [Tatikonda '09]

•  Used widely
   –  Transactions, graphs, trees
•  Approach
   1.  Distribute data
       •  Proportional Allocation
   2.  Run Phase 1
   3.  Exchange metadata
   4.  Run Phase 2
   5.  Final reduction
•  Sharding mainly impacts steps 1–3. Steps 3 and 5 are sequential.

Proposed approaches show 100X gains.

FTM Phase 1: Drilling Down

•  Data-dependent workload skew is mitigated
•  Payload-aware sharding helps!

Web Graph Compression [Vigna et al. 2011]

Critical application for search companies.
Key Requirement: Locality
Approach:
•  Distribute data via placement
•  Run the compression algorithm in parallel
•  Parameters (similar to FTM)
   –  Use adjacency/triangle pivots
   –  Use All-in-One partitioning

A segue and drill-down (ICDE '15): Localized Approximate Miner (LAM)

•  First bounded-space and bounded-time pattern mining algorithm: O(|D| log |D|)
•  Parameter-free
•  Scales with compute resources
   –  Near-linear in cores & machines
•  Scales with data size
   –  Billions of transactions & items
   –  E.g. 67 min on one machine; 1 min on a cluster
•  Two parallel phases: Localize, Approximate Mining

LAM Phase 1: Localize [SketchSort and Stratify!]

[Figure: each transaction T1 … TN in dataset D (e.g. T1 = {1,2,3,7}, T2 = {1,2,5,7,8}, T3 = {1,2,4}, …, TN = {1,7,8,5,3,22}) is hashed into a column of a K-row min-wise hash matrix M; sorting the columns yields the order T3, TN, T1, …, T2, which is cut into Local Partitions 1, 2, 3.]
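The Localize phase in a small, illustrative sketch (not the PLAM implementation): hash every transaction into a K-length min-wise signature, sort transactions by signature so similar ones become neighbors, and cut the sorted order into local partitions.

```python
import hashlib

def minhash_signature(transaction, K=3):
    """K min-wise hashes of a transaction (a set of item ids), via salted MD5."""
    return tuple(min(int(hashlib.md5(f"{i}|{item}".encode()).hexdigest(), 16)
                     for item in transaction) for i in range(K))

def localize(dataset, num_partitions):
    """Sort transaction ids by signature and split the sorted order into
    contiguous local partitions, co-locating similar transactions."""
    order = sorted(dataset, key=lambda tid: minhash_signature(dataset[tid]))
    size = -(-len(order) // num_partitions)            # ceiling division
    return [order[i:i + size] for i in range(0, len(order), size)]

D = {"T1": {1, 2, 3, 7}, "T2": {1, 2, 5, 7, 8}, "T3": {1, 2, 4},
     "TN": {1, 7, 8, 5, 3, 22}}
print(localize(D, num_partitions=2))
```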

LAM Phase 2: Approximate Mining I

LAM Phase 2: Approximate Mining II

•  Mined (p, tlist) pairs are ordered by utility
•  Add p to the pattern set P
•  In dataset D, remove p from each row in tlist
•  Replace it with a pointer to p in the pattern set
•  Append P to D and run LAM again on the new D
•  Iterate multiple times for better compression

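The compression step of Phase 2, in a hedged sketch (illustrative names, utility ordering omitted): for a mined (pattern, tlist) pair, the pattern's items are removed from the listed transactions and replaced by a pointer into the pattern set; the rewritten dataset can then be fed back into LAM for another iteration.

```python
def compress_step(dataset, pattern, tlist, pattern_set):
    """Replace `pattern` (a set of items) with a pointer token in every
    transaction of `tlist`, recording the pattern in `pattern_set`."""
    pattern_set.append(frozenset(pattern))
    pointer = f"P{len(pattern_set) - 1}"          # pointer into the pattern set
    for tid in tlist:
        if pattern <= dataset[tid]:               # the pattern occurs in this row
            dataset[tid] = (dataset[tid] - pattern) | {pointer}
    return dataset, pattern_set

D = {"T1": {1, 2, 3, 7}, "T2": {1, 2, 5, 7, 8}, "T3": {1, 2, 4}}
P = []
D, P = compress_step(D, pattern={1, 2, 7}, tlist=["T1", "T2"], pattern_set=P)
print(D, P)   # T1 -> {3, 'P0'}, T2 -> {5, 8, 'P0'}; P = [frozenset({1, 2, 7})]
```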

Experiments

•  Nine transactional datasets from UCI, FIMI
•  Compare LAM to the state of the art
   –  Krimp [Vreeken et al. 2011]
   –  Slim [Smets et al. 2012]
   –  CDB-Hyper [Xiang et al. 2008]
•  Five web graph datasets (|V| ≈ 10^7, |E| ≈ 10^9)
•  PLAM (Parallel LAM): cluster implementation
•  Compare to Closed Itemset Mining
•  Compression, execution time, scalability

UCI/FIMI: Compression

[Figure: compression ratio (log scale) on Accidents, Adult, Anneal, Breast, Iris, Kosarak, Mushroom, Pageblocks, Tic-Tac-Toe, and Twitter for LAM 5, Krimp, SLIM, and CDB-Hyper.]

LAM achieves better compression on most datasets.

UCI/FIMI: Execution Time

[Figure: execution time (log scale) on Accidents, Adult, Anneal, Kosarak, and Mushroom for LAM 5, Krimp, SLIM, and CDB-Hyper; a 24x gap is annotated on one of the datasets.]

LAM is one or more orders of magnitude faster.

Itemset results for various supports, grouped by set size.


Web: Comparing LAM to Closed Sets

Smallest web graph dataset, EU2005: |V| = 863K, |E| = 19M

[Figure: execution time (sec) vs. support for Closed Sets generation, Closed Sets compression, LAM, and PLAM (8 cores); compression ratio vs. support for Closed Sets, LAM (1 iteration), and LAM (5 iterations).]

•  For σ < 100, Closed Sets are slow at generating patterns
•  Even slower at compressing
•  LAM produces better compression: 2x with 1 iteration, 4x with 5 iterations

Web: Comparing LAM to Closed Sets (continued)

Larger datasets: better results than closed sets, in less time.

Web: Scalability

[Figure: speedup vs. number of machines (50–250) and compression ratio vs. pass number (1–5) for EU2005, UK2006, ARABIC2005, SK2005, and IT2004.]

•  Near-linear scalability to hundreds of machines
•  Compression ratios increase over multiple passes

LAM: Thoughts and Future Work

•  First pattern mining algorithm to run in linearithmic time in the size of the input
•  Leverages Stratified Data Partitioning
•  Parameter-free – saves domain-expert time
•  Scales near-linearly to
   –  Hundreds of cores & machines
   –  Billions of transactions and items
•  Future work: can we extend similar ideas to trees, graphs, and sequences?

Energy- and Heterogeneity-Aware Partitioning (ICPP '17)

•  Modern datacenters are increasingly heterogeneous
   –  Computation
   –  Storage
   –  Green energy harvesting
•  Sharding and placement while accounting for heterogeneity is challenging
   –  Pareto-optimal model

Overview of Pareto Framework

Pareto Function: Math [Gori '11]

Evaluation – Pareto Frontiers

[Figure: Pareto frontiers for Swiss (tree mining), RCV (text mining), and UK (web analytics).]

Take-Home Message

•  In today's analytics world, data has complex structure
•  Stratified Data Placement has a central role to play
   –  Over two orders of magnitude improvement over the state of the art for a multitude of analytic tasks. First to explore this idea for placement.
   –  Preliminary results on heterogeneity- and energy-aware systems show significant promise!

[Recap of the vision figure: STRATIFIED DATA SHARDING & PLACEMENT underpinning Key-Value Stores (e.g. Memcached, Redis), MPI & Partitioned Global Address Space (PGAS) systems (e.g. Global Arrays), and HADOOP/SHARK/Azure (HDFS/RDD/Blob).]

Thanks

•  Organizers
•  Audience
•  Former students (who did all the work!)
   –  Ye Wang (Airbnb), A. Chakrabarty (MSR), D. Fuhry (OSU)
•  Funding agencies
   –  NSF, DOE, NIH, Google, IBM
