globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014...

69
Global analy)cs in the face of bandwidth and regulatory constraints Ashish Vulimiri u Carlo Curino m Brighten Godfrey u Thomas Jungblut m Jitu Padhye m George Varghese m u UIUC m MicrosoA

Upload: others

Post on 28-May-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Global  analy)cs  in  the  face  of  bandwidth  and  regulatory  constraints

   Ashish  Vulimiriu              Carlo  Curinom          Brighten  GodfreyuThomas  Jungblutm          Jitu  Padhyem            George  Varghesem

uUIUC                                            mMicrosoA

Page 2: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Massive data volumes

Facebook 600  TB/day

Twi2er 100  TB/day

Microso7 10s  TB/day

LinkedIn 10  TB/day

Yahoo! 10  TB/day

Page 3: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Massive data volumes

Use  cases:  -­‐ User  acAvity  logs  -­‐ Monitoring  remote  infrastructures  -­‐ …  

Collected  across  several  data  centers  for  low  user  latency

Facebook 600 TB/day

Twi2er 100 TB/day

Microso7 10s TB/day

LinkedIn 10 TB/day

Yahoo! 10 TB/day

DC1DC2

DC3

Page 4: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

DC1DC2

DC3

Page 5: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB
Page 6: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

SQL  analyAcs  across  geo-­‐distributed  data  to  extract  insight

10s-­‐100s  TB/day up  to  10s  of  DCs

current  soluAon:  centralize  -­‐ copy  all  data  to  central  data  center  -­‐ run  all  queries  there

Page 7: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Centralized approach is inadequate

1. Consumes  scarce,  expensive  cross-­‐DC  bandwidth  

current  soluAon:  copy  all  data  to  central  DC,  run  all  analyAcs  there

Page 8: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Centralized approach is inadequate

1. Consumes  scarce,  expensive  cross-­‐DC  bandwidth

current  soluAon:  copy  all  data  to  central  DC,  run  all  analyAcs  there

slowing growth

scarce capacity

total  Internetcapacity

rising costs

external  network  is  fastest  rising  DC  cost

recognized concern

several  other  efforts  to reduce  wide-­‐area  traffic

e.g.  SWAN,  B4

0 20 40 60 80

2009 2014

Internet capacity growth (%)

some  DCs’  internal bisecAon  b/w

100  Tb/s 1000Tb/s

Page 9: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Centralized approach is inadequate

1. Consumes  scarce,  expensive  cross-­‐DC  bandwidth  

2. IncompaAble  with  sovereignty  concerns  -­‐ Many  countries  considering  restricAng  moving  ciAzens’  data  -­‐ Could  render  centralizaAon  impossible  -­‐ SpeculaAon:  derived  informaAon  might  sAll  be  acceptable

current  soluAon:  copy  all  data  to  central  DC,  run  all  analyAcs  there

Page 10: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Centralized approach is inadequate

1. Consumes  scarce,  expensive  cross-­‐DC  bandwidth  

2. IncompaAble  with  sovereignty  concerns

current  soluAon:  copy  all  data  to  central  DC,  run  all  analyAcs  there

Page 11: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Geo-distributed SQL analytics

click_log(

DC1( adserve_log(

click_log(

DCn( adserve_log(

Centralized execution: 10 TB/day

preprocess'adserve_log'

join' k2means'clustering'

preprocess'click_log'

adserve_log'

click_log'

SQL  query:

Page 12: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Geo-distributed SQL analytics

click_log(

DC1( adserve_log(

click_log(

DCn( adserve_log(

Centralized execution: 10 TB/day Distributed execution: 0.03 TB/day

t"="0!push!down!preprocess!

t"="1!distributed!semi1join!

t"="2!centralized!k1means!

preprocess'adserve_log'

join' k2means'clustering'

preprocess'click_log'

adserve_log'

click_log'

SQL  query:

Page 13: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Geo-distributed SQL analytics

click_log(

DC1( adserve_log(

click_log(

DCn( adserve_log(

Centralized execution: 10 TB/day Distributed execution: 0.03 TB/day

t"="0!push!down!preprocess!

t"="1!distributed!semi1join!

t"="2!centralized!k1means!

333x cost reduction

preprocess'adserve_log'

join' k2means'clustering'

preprocess'click_log'

adserve_log'

click_log'

SQL  query:

Page 14: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Geo-distributed SQL analytics

OpAmizaAons:  synthesize  and  extend  ideas  from  -­‐ Parallel  and  distributed  databases  -­‐ Distributed  systems  …  as  well  as  novel  techniques  of  our  own

Common  thread:  revisit  classical  database  problems  from                                                                networking  perspecAve

Page 15: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

PROBLEM DEFINITION

Page 16: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Requirements

Possible  challenges  to  address  Bandwidth  Sovereignty  Fault-­‐tolerance  Latency  Consistency  

We  target  the  batch  analyAcs  dominant  in  organizaAons  today

Page 17: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Key characteristics

1. Support  full  relaAonal  model  

2. No  control  over  data  parAAoning  -­‐ Dictated  by  external  factors,  typically  end-­‐user  latency  

3. Cross-­‐DC  bandwidth  is  scarcest  resource  by  far  -­‐ CPU,  storage  etc  within  data  centers  are  relaAvely  cheap  

4. Unique  constraints  -­‐ Heterogeneous  bandwidth  costs/capaciAes  -­‐ Sovereignty  

5. Bulk  of  load  comes  from  ~stable  recurring  workload  -­‐ Consistent  with  producAon  logs

Page 18: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Problem statement

Given:  data  born  distributed  across  DCs  a  certain  way  

Goal:  support  SQL  analyAcs  on  this  data  -­‐ Minimize  bandwidth  cost  -­‐ Handle  fault-­‐tolerance,  sovereignty  constraints  

System  will  handle  arbitrary  queries  at  runAme  -­‐ But  will  be  tuned  to  opAmize  known  ~stable  recurring  workload

Page 19: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

OUR APPROACH

Page 20: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Basic Architecture

End-­‐user  facing  DB(handles  OLTP)

Single-­‐DC  SQL  stack  [Hive]

Local      ETL

CoordinatorReporAngpipeline

Queries

Results

Page 21: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Optimizations

Runtimedata  transfer  reducAon

SQL-aware workload  planning

Function-specific

semanAclevel

Page 22: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Optimizations

1. Runtime

2. SQL-aware

3. Function-specific

Page 23: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Optimizations

1. Runtime

2. SQL-aware

3. Function-specific

Page 24: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

1. Runtime data transfer optimization

In  our  seing  -­‐ CPU,  storage,  …  within  data  centers  is  cheap  -­‐ Cross-­‐DC  bandwidth  is  the  expensive  resource  

Trade  off  CPU,  storage  for  bandwidth  reducAon

Page 25: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

1. Runtime data transfer optimization

aggressively cache all intermediate output

t  =  0        DCB  asks  DCA  for  results  of  subquery  q

q0q0

DCA DCB

Page 26: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

1. Runtime data transfer optimization

aggressively cache all intermediate output

q0 q0

cache cache

DCA DCB

Page 27: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

1. Runtime data transfer optimization

aggressively cache all intermediate output

q0 q0cache cache

t  =  1        DCB  asks  DCA  for  results  of  subquery  q                            again

q1

DCA DCB

Page 28: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

(      -­‐      )

1. Runtime data transfer optimization

aggressively cache all intermediate output

q0 q0cache cache

t  =  1        DCB  asks  DCA  for  results  of  subquery  q                            again

q1 q1 q0

DCA DCB

Page 29: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

(      -­‐      )

1. Runtime data transfer optimization

aggressively cache all intermediate output

q0 q0cache cache

q1 q1 q0

DCA DC

recompute  q1  from  scratch  -­‐ not  using  caching  to  save  latency,  CPU  -­‐ only  bandwidth

Page 30: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

1. Runtime data transfer optimization

aggressively cache all intermediate output

Caching  helps  not  only  when  same  query  arrives  repeatedly  

…  but  also  when  different  queries  have  common  sub-­‐operaAons  

e.g.  6x  data  transfer  reducAon  in  TPC-­‐CH

Page 31: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

1. Runtime data transfer optimization

aggressively cache all intermediate output

Caching  helps  not  only  when  same  query  arrives  repeatedly  

…  but  also  when  different  queries  have  common  sub-­‐operaAons  

e.g.  6x  data  transfer  reducAon  in  TPC-­‐CH  

Database  parallel:    caching  ≈  view  materializaAon  • Caching  is  a  low-­‐level,  mechanical  form  of  view  maintenance  + Works  for  arbitrary  computaAons,  including  arbitrary  UDFs  – Uses  more  CPU,  storage  – Can  miss  opportuniAes

Page 32: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Optimizations

1. Runtime

2. SQL-aware

3. Function-specific

Page 33: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Optimizations

1. Runtime

2. SQL-aware

3. Function-specific

Page 34: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

2. SQL-aware workload planning

Given  -­‐ Stable  workload  (set  of  queries)  -­‐ Fault-­‐tolerance  and  sovereignty  constraints  

Jointly  opAmize  -­‐ Query  plan  -­‐ Site  selecAon  (task  scheduling)  -­‐ Data  replicaAon  

‣ Replicate  data  for  performance  and/or  fault-­‐tolerance  

to  minimize  data  transfer  cost  

Challenge:  opAmizaAon  search  space  is  exponenAally  large  

Approach:  simplify  search  space

Page 35: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

2. SQL-aware workload planning

Simplification

Big  table

Small  table

ComputaAon:  copy  both  tables  to  one  DC,  then  join  them  

Decision  1:  do  we  copy  the  big  table  or  the  small  table?  

US

UK

Japancopy  1

copy  2

Page 36: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

2. SQL-aware workload planning

Simplification

Big  table

Small  table

ComputaAon:  copy  both  tables  to  one  DC,  then  join  them  

Decision  1:  do  we  copy  the  big  table  or  the  small  table?  

US

UK

Japancopy  1

copy  2

Page 37: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

2. SQL-aware workload planning

Simplification

Big  table

Small  table

ComputaAon:  copy  both  tables  to  one  DC,  then  join  them  

Decision  1:  do  we  copy  the  big  table  or  the  small  table?  

Decision  2:  which  copy  of  the  small  table  do  we  use?

US

UK

Japancopy  1

copy  2

Page 38: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

2. SQL-aware workload planning

1.  Logical  plan  -­‐ Do  we  copy  the  big  table  or  the  small  table?  

2.  Physical  plan  -­‐ Which  copy  of  the  small  table  do  we  use?

Had  two  kinds  of  decisions  to  make:

Page 39: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

2. SQL-aware workload planning

1.  Logical  plan  -­‐ Do  we  copy  the  big  table  or  the  small  table?  -­‐ Choice  was  clear,  strategies  were  orders  of  magnitude  apart  

2.  Physical  plan  -­‐ Which  copy  of  the  small  table  do  we  use?  -­‐ Choice  wasn’t  as  obvious,  had  to  know  precise  costs

Had  two  kinds  of  decisions  to  make:

Page 40: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

2. SQL-aware workload planning

1.  Logical  plan  -­‐ Choose  based  on  simple  staAsAcs  on  each  table  

2.  Physical  plan  -­‐ Profile  logical  plan,  collecAng  precise  measurements  -­‐ Use  to  opAmize  physical  plan  

Key  insight  -­‐ “Logical”  choices:  simple  staAsAcs  usually  suffice  -­‐ “Physical”  choices:  need  more  careful  cost  esAmates  -­‐ Only  an  empirical  insight  

‣ But  worked  well  in  all  our  experimental  workloads

Simplification: Two-phase approach

Page 41: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

2. SQL-aware workload planning

query  1 query  n

DAG  1 DAG  n

profiled DAG  1

profiled DAG  n

 schedule  tasks  to  DCs,    decide  data  replicaAon  policy

workload

logical  plan

physical  plan

Page 42: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

2. SQL-aware workload planning

query  1 query  n

DAG  1 DAG  n

profiled DAG  1

profiled DAG  n

 schedule  tasks  to  DCs,    decide  data  replicaAon  policy

workload

logical  plan

physical  plan

1.    Query  planner

2.    Profiler

3.    Integer  Linear          Program

Page 43: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

 

2. SQL-aware workload planning

query  1 query  n

DAG  1 DAG  n

profiled DAG  1

profiled DAG  n

 schedule  tasks  to  DCs,    decide  data  replicaAon  policy

workload

logical  plan

physical  plan

1.    Query  planner

2.    Profiler

3.    Integer  Linear          Program

Page 44: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

2. SQL-aware workload planning

Profiling task graphsSELECT    city,                                SUM(orderValue)FROM    salesWHERE    category  =  ‘Electronics’  GROUP  BY    city

Distributed  deployment:

US  DC

Japan  DC

coordinator  DC

parAal aggregate

merge70  GB60  GB

63  GB want  to

measure

US

UK

JP

US

UK

JP

UK  DC

70  GB

60  GB

63  GB

Page 45: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Centralized  deployment:

2. SQL-aware workload planning

Profiling task graphsSELECT    city,                                SUM(orderValue)FROM    salesWHERE    category  =  ‘Electronics’  GROUP  BY    city

Distributed  deployment:

US  DC

Japan  DC

coordinator  DC

parAal aggregate

merge70  GB60  GB

63  GB want  to

measure

US

UK

JP

US

UK

JP

UK  DC

inject  filterWHERE  country  =  “US”

70  GB

60  GB

63  GB

70  GB60  GB

63  GB

Page 46: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Centralized  deployment:

2. SQL-aware workload planning

Profiling task graphsSELECT    city,                                SUM(orderValue)FROM    salesWHERE    category  =  ‘Electronics’  GROUP  BY    city

Distributed  deployment:

US  DC

Phase  1

Japan  DC

coordinator  DC

merge70  GB60  GB

63  GB want  to

measure

US

UK

JP

US

UK

JP

UK  DC

inject  filterWHERE  country  =  “US”

70  GB

60  GB

63  GB

Pseudo-­‐distributed  execuAon  

Rewrite  query  DAGs  to  simulate  alternate  configuraAons  

Fully  general  what-­‐if  analysis.    Use  cases:  -­‐ Bootstrap:    centralized  -­‐>  distributed  -­‐ Test  alternate  data  replicaAon  strategies  -­‐ Simulate  adding/removing  data  centers

Page 47: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

2. SQL-aware workload planning

query  1 query  n

DAG  1 DAG  n

profiled DAG  1

profiled DAG  n

 schedule  tasks  to  DCs,    decide  data  replicaAon  policy

1.    Query  Planner

2.    Profiler

3.    Integer  Linear          Program

see  paper

Page 48: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Optimizations

1. Runtime

2. SQL-aware

3. Function-specific

Page 49: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Optimizations

1. Runtime

2. SQL-aware

3. Function-specific

Page 50: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

3. Function-specific optimizations

Past  work:  large  number  of  distributed  algorithms                                        targeAng  specific  problems  

Support  via  extensible  user-­‐defined  funcAon  interface  -­‐ Allows  registering  mulAple  implementaAons  -­‐ OpAmizer  will  automaAcally  choose  best,  based  on  profiling  

As  examples,  implemented  -­‐ Top-­‐k  [1]  -­‐ Approximate  count-­‐disAnct  [2]

[1]  “Efficient  top-­‐k  query  computaAon  in  distributed  networks”          P.  Cao,  Z.  Wang,  PODC  2004  [2]  “HyperLogLog:  the  analysis  of  a  near-­‐opAmal  cardinality  esAmaAon  algorithm”          P.  Flajolet,  E.  Fusy,  O.  Gandouet,  F.  Meunier,  AOFA  2007

Page 51: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

EVALUATION

Page 52: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Implemented  Hadoop-­‐stack  prototype  -­‐ Prototype  mulA-­‐DC  replacement  for  Apache  Hive  

Experiments  up  to  10s  of  TBs  scale  -­‐ Real  Microso7  producAon  workload  -­‐ Several  syntheAc  benchmarks:  

‣ TPC-­‐CH  ‣ BigBench-­‐SQL  ‣ Berkeley  Big-­‐Data  ‣ YCSB

Page 53: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

BigBench-SQL

0.01

0.1

1

10

100

1000

10000

0.1 1 10 100 1000 10000

Dat

a tr

ansf

erG

B (c

ompr

esse

d)

GB (raw, uncompressed)Size of updates to DB since last analytics run

CentralizedDistributed: no caching

Distributed: with caching 330x

Page 54: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

TPC-CH

0.01

0.1

1

10

100

1000

10000

0.1 1 10 100 1000 10000

Dat

a tr

ansf

erG

B (c

ompr

esse

d)

GB (raw, uncompressed)Size of updates to DB since last analytics run

CentralizedDistributed: no caching

Distributed: with caching 360x

Page 55: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Microsoft production workloadD

ata

tran

sfer

(com

pres

sed)

Size of OLTP updates since last OLAP run(raw, uncompressed)

CentralizedDistributed: no caching

Distributed: with caching257x

Page 56: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Berkeley Big-Data

0.01

0.1

1

10

100

0.1 1 10 100

Dat

a tr

ansf

erG

B (c

ompr

esse

d)

GB (raw, uncompressed)Size of updates to DB since last analytics run

CentralizedDistributed: no caching

Distributed: with cachingDistributed: caching + top-k

3.5x

Page 57: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Berkeley Big-Data

0.01

0.1

1

10

100

0.1 1 10 100

Dat

a tr

ansf

erG

B (c

ompr

esse

d)

GB (raw, uncompressed)Size of updates to DB since last analytics run

CentralizedDistributed: no caching

Distributed: with cachingDistributed: caching + top-k 27x

Page 58: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

BEYOND SQL

Page 59: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Beyond SQL: DAG workflows

ComputaAonal  model:  directed  acyclic  task  graphs,                                                                                  each  node  =  arbitrary  computaAon  

Significantly  more  challenging  seing  

IniAal  results  encouraging  -­‐ Same  level  of  improvement  as  SQL  

More  details:  [CIDR  2015]

[CIDR  2015]  “WANalyAcs:  AnalyAcs  for  a  geo-­‐distributed  data-­‐intensive  world”                                                Vulimiri,  Curino,  Godfrey,  Karanasos,  Varghese

Page 60: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

RELATED WORK

Distributed  and  parallel  databases  

Single-­‐DC  frameworks  (Hadoop/Spark/…)  

Data  warehouses  

ScienAfic  workflow  systems  

Sensor  networks  

Stream  processing  systems  (e.g.  JetStream)  

Page 61: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Key characteristics

1. Support  full  relaAonal  model  at  100s  TBs/day  scale  

2. No  control  over  data  parAAoning  

3. Focus  on  cross-­‐DC  bandwidth  

4. Unique  constraints  -­‐ Heterogeneous  bandwidth  costs/capaciAes  -­‐ Sovereignty  

5. AssumpAon  of  ~stable  recurring  workload  -­‐ Enables  highly  tuned  opAmizaAon

Page 62: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

SUMMARY

Centralized  analyAcs  becoming  unsustainable  

Geo-­‐distributed  analyAcs:  SQL  and  DAG  workflows  

Several  novel  techniques  -­‐ Redundancy  eliminaAon  via  caching  -­‐ Pseudo-­‐distributed  measurement  -­‐ [SQL  query  planner  +  ILP]  opAmizer  

Up  to  360x  less  bandwidth  on  real  &  syntheAc  workloads

Page 63: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

THANK YOU!

Page 64: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

SUMMARY

Centralized  analyAcs  becoming  unsustainable  

Geo-­‐distributed  analyAcs:  SQL  and  DAG  workflows  

Several  novel  techniques  -­‐ Redundancy  eliminaAon  via  caching  -­‐ Pseudo-­‐distributed  measurement  -­‐ [SQL  query  planner  +  ILP]  opAmizer  

Up  to  360x  less  bandwidth  on  real  &  syntheAc  workloads

Page 65: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

BACKUP SLIDES

Page 66: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Caching and view selection

Consider    SELECT val - avg(val) FROM table  

Cutpoint  selecAon  problem:  do  we  cache  -­‐ Base  [val],  or  -­‐ Results  a7er  average  has  been  subtracted  

Akin  to  view  selecAon  problem  in  SQL  databases  

Current  implementaAon  makes  wrong  choice

Page 67: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Sovereignty: Partial support

Our  system  respects  data-­‐at-­‐rest  regulaAons(e.g.  German  data  should  not  leave  Germany)  

But  we  allow  arbitrary  queries  on  the  data  

LimitaAon:  we  don’t  differenAate  between  -­‐ Acceptable  queries,  e.g.                    “what’s  the  total  revenue  from  each  city”  

-­‐ ProblemaAc  queries,  e.g.                    SELECT  *  FROM  Germany

Page 68: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Sovereignty: Partial support

SoluAon:  either  -­‐ Legally  vet  the  core  workload  of  queries  -­‐ Use  differenAal  privacy  mechanism  

Open  problem

Page 69: Globalanalycsinthefaceof& bandwidthandregulatoryconstraints · 2019-12-18 · 2009 2014 Internet capacity growth (%) some*DCs’*internal ... Phase*1 Japan*DC coordinator*DC 70GB

Past work