MetaSync: File Synchronization Across Multiple Untrusted Storage Services

Seungyeop Han, Haichen Shen, Taesoo Kim*, Arvind Krishnamurthy, Thomas Anderson, and David Wetherall

University of Washington, *Georgia Institute of Technology

USENIX ATC '15

Page 2: File sync services are popular

Dropbox reached 400M users in June 2015.

Page 3: Many sync service providers

• Dropbox (2GB)
• Google Drive (15GB)
• MS OneDrive (15GB)
• Box.net (10GB)
• Baidu (2TB)

Page 4: Can we rely on any single service?

Page 5: Existing Approaches

• Encrypt files to prevent modification
  – Boxcryptor
• Rewrite file sync service to reduce trust
  – SUNDR (Li et al., 04), DEPOT (Mahajan et al., 10)

Page 6: MetaSync

Can we build a better file synchronization system across multiple existing services?

• Higher availability, greater capacity, higher performance
• Stronger confidentiality & integrity

Page 7: Goals

• Higher availability
• Stronger confidentiality & integrity
• Greater capacity and higher performance
• No service-service, client-client communication
• No additional server
• Open source software

Page 8: Overview

• Motivation & Goals
• MetaSync Design
• Implementation
• Evaluation
• Conclusion

Page 9: Key Challenges

• Maintain a globally consistent view of the synchronized files across multiple clients
• Using only the service providers’ unmodified APIs without any centralized server
• Even in the presence of service failure

Page 10: Design Choices

• How to manage files?
  – Content-based addressing & hash tree
• How to update consistently with unmodified APIs?
  – Client-based Paxos (pPaxos)
• How to spread files?
  – Stable deterministic mapping
• How to protect files?
  – Encryption from clients
• How to make it extensible?
  – Common abstractions

Page 12: Overview of the Design

[Diagram: MetaSync sits between Local Storage and the Remote Services (Dropbox, Google Drive, OneDrive). Its components are Synchronization, Replication, Object Store, and Backend abstractions.]

1. File Management

Page 13: Object Store

• Data structure similar to that of version control systems (e.g., git)
• Content-based addressing
  – File name = hash of the contents
  – De-duplication
  – Simple integrity checks
• Directories form a hash tree
  – Independent & concurrent updates
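The content-based addressing bullets above can be illustrated with a minimal in-memory sketch. The class name `ObjectStore`, its methods, and the choice of SHA-256 are assumptions made for this example; the slides do not say which hash function or interface MetaSync uses.

```python
import hashlib


class ObjectStore:
    """Toy content-addressed store (hypothetical; not MetaSync's actual API)."""

    def __init__(self):
        self._objects = {}  # hex digest -> bytes

    def put(self, data: bytes) -> str:
        # The object's name is the hash of its contents, so identical
        # contents map to the same key and are stored only once.
        key = hashlib.sha256(data).hexdigest()
        self._objects.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        data = self._objects[key]
        # Integrity check: recompute the hash and compare it with the name.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError(f"object {key} is corrupted")
        return data

    def __len__(self) -> int:
        return len(self._objects)


store = ObjectStore()
k1 = store.put(b"hello")
k2 = store.put(b"hello")  # same contents: same key, no second copy
k3 = store.put(b"world")
```

Because the key is derived from the data, de-duplication and integrity checking need no extra metadata: a reader only needs the key it asked for.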

Page 14: Object Store

[Diagram: hash tree with root head = f12…; nodes Dir1 (abc…), Dir2 (4c0…), and Large.bin (20e…); leaves are blobs, one of which groups the files small1 and small2.]

• Files are chunked or grouped into blobs
• The root hash = f12… uniquely identifies a snapshot

Page 15: Object Store

[Diagram: the same tree, with the previous root now labeled old = f12…, alongside a new root head = 07c… with a new Large.bin object (1ae…) and a new blob. Dir1 (abc…) and Dir2 (4c0…) appear as before.]

• Files are chunked or grouped into blobs
• The root hash = f12… uniquely identifies a snapshot
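A short sketch of how a root hash can identify a whole snapshot, and why changing one file yields a new root while untouched subtrees keep their hashes. The nested-dict layout, the `tree_hash` helper, and SHA-256 are assumptions for illustration; the slides show only the tree diagram.

```python
import hashlib


def h(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def tree_hash(node) -> str:
    """Hash a directory (dict of name -> node) or a file (bytes) bottom-up."""
    if isinstance(node, bytes):
        return h(b"blob\0" + node)
    # A directory's hash covers its entry names and child hashes, in sorted
    # order so that the result does not depend on insertion order.
    entries = "".join(
        f"{name}\0{tree_hash(child)}\n" for name, child in sorted(node.items())
    )
    return h(b"dir\0" + entries.encode())


# Hypothetical snapshot loosely following the slide's labels.
snapshot = {
    "Dir1": {"small1": b"a", "small2": b"b"},
    "Dir2": {},
    "Large.bin": b"x" * 1024,
}
old_root = tree_hash(snapshot)

# Modify only Large.bin: the root changes, Dir1 and Dir2 hash as before.
updated = {**snapshot, "Large.bin": b"y" * 1024}
new_root = tree_hash(updated)
```

Only the path from the changed file up to the root gets new hashes, which is what allows independent directories to be updated concurrently.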

Page 16: Overview of the Design

[Diagram: MetaSync sits between Local Storage and the Remote Services (Dropbox, Google Drive, OneDrive). Its components are Synchronization, Replication, Object Store, and Backend abstractions.]

2. Consistent update

Pages 17-23: Updating Global View

[Diagram sequence: the Global View is a chain of versions with a master pointer. Client1 and Client2 each keep two pointers: Prev (previously synchronized point) and Head (current root hash).]

• Page 17: Global View contains v0 (ab1…)
• Pages 18-20: v1 (c10…) is added after v0 (ab1…)
• Pages 21-22: two candidates labeled v2 appear, 7b3… and f13…
• Page 23: Global View shows v0 (ab1…), v1 (c10…), v2 (7b3…), and a new v3 (a31…)

Page 24: Consistent Update of Global View

• Need to handle concurrent updates and unavailable services using only existing APIs

[Diagram: two MetaSync clients, one with root = f12… and one with root = b05…, and three services: Dropbox, Google Drive, OneDrive.]

Page 25: Paxos

• Multi-round non-blocking consensus algorithm
  – Safe regardless of failures
  – Progress if majority is alive

[Diagram: Proposer and Acceptor]

Page 26: MetaSync: Simulate Paxos

• Use an append-only list to log Paxos messages
  – Client sends normal Paxos messages
  – Upon arrival of a message, the service appends it to a list
  – Client can fetch a list of the ordered messages
• Each service provider has APIs to build an append-only list
  – Google Drive, OneDrive, Box: Comments on a file
  – Dropbox: Revision list of a file
  – Baidu: Files in a directory
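The append-only list described above needs only two operations from a service: append a message, and fetch the messages in order. A minimal sketch of that abstraction follows; the names `AppendOnlyLog` and `InMemoryLog` are hypothetical, and the in-memory class stands in for a real backend (comments on a file, a revision list, or files in a directory) without calling any provider API.

```python
from abc import ABC, abstractmethod
from typing import List


class AppendOnlyLog(ABC):
    """Hypothetical interface: what a client needs from a passive service."""

    @abstractmethod
    def append(self, msg: str) -> None:
        """Add msg to the end of the list; the service decides the order."""

    @abstractmethod
    def fetch(self) -> List[str]:
        """Return all messages in the order the service received them."""


class InMemoryLog(AppendOnlyLog):
    """Stand-in for a real backend; keeps the list in local memory."""

    def __init__(self):
        self._entries: List[str] = []

    def append(self, msg: str) -> None:
        self._entries.append(msg)

    def fetch(self) -> List[str]:
        # Return a copy so callers cannot rewrite history.
        return list(self._entries)


log = InMemoryLog()
log.append("prepare n=1")
log.append("accept n=1 root=f12")
messages = log.fetch()
```

A backend for a specific provider would implement the same two methods on top of whatever list-like feature that provider exposes.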

Page 27: MetaSync: Passive Paxos (pPaxos)

• Backend services work as passive acceptors
• Acceptor decisions are delegated to clients

[Diagram: Clients propose "New root = 1" and "New root = 2" to Passive Storage Services S1, S2, and S3; clients call fetch(S1) and fetch(S2); the result is "Accepted root = 1".]
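The idea of delegating acceptor decisions to clients can be sketched as follows: each service only stores an ordered list of messages, and any client replays that list through the standard Paxos acceptor rules to learn what the acceptor "decided". This is a single-decree, single-threaded, in-memory sketch under several assumptions: the message format `(kind, n, value)`, the function names, and the use of plain Python lists as services are all hypothetical, proposal numbers are assumed unique per proposer, and failures and retries are not modeled. It is not the authors' implementation.

```python
def replay(log):
    """Replay one service's log as an acceptor; return each message's outcome."""
    promised, acc_n, acc_v = 0, 0, None
    outcomes = []
    for kind, n, value in log:
        if kind == "prepare":
            ok = n > promised           # promise only to a higher number
            if ok:
                promised = n
        else:  # "accept"
            ok = n >= promised          # accept unless a higher promise exists
            if ok:
                promised, acc_n, acc_v = n, n, value
        # For a prepare, (acc_n, acc_v) is the value accepted so far, if any.
        outcomes.append((ok, acc_n, acc_v))
    return outcomes


def send(log, msg):
    """Append msg to a passive log, fetch the log, and evaluate msg locally."""
    log.append(msg)        # stands in for the service's append operation
    fetched = list(log)    # stands in for fetching the ordered list
    return replay(fetched)[fetched.index(msg)]


def propose(logs, n, value):
    """Run both Paxos phases with number n; return the chosen value or None."""
    majority = len(logs) // 2 + 1
    replies = [send(log, ("prepare", n, None)) for log in logs]
    promises = [(acc_n, acc_v) for ok, acc_n, acc_v in replies if ok]
    if len(promises) < majority:
        return None
    # Adopt the value of the highest-numbered accepted proposal, if any.
    _, acc_v = max(promises, key=lambda p: p[0])
    if acc_v is not None:
        value = acc_v
    accepted = sum(1 for log in logs if send(log, ("accept", n, value))[0])
    return value if accepted >= majority else None


logs = [[], [], []]                   # three passive services S1..S3
first = propose(logs, 2, "root=1")    # chosen: "root=1"
second = propose(logs, 3, "root=2")   # learns and returns "root=1"
stale = propose(logs, 1, "root=9")    # outdated number: returns None
```

As in the slide's example, a second client proposing a different root ends up with the root that was already accepted, because every client derives the same acceptor state from the same ordered log.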

Page 28: Paxos vs. Disk Paxos vs. pPaxos

• Paxos
  – Acceptor: computation
  – Steps: Propose, Accept
  – Requires acceptor API
  – # msgs: O(acceptors)
• Disk Paxos (Gafni & Lamport ’02)
  – Acceptor: disk blocks; maintains a block per client
  – Steps: Propose, Check
  – # msgs: O(clients x acceptors)
• pPaxos
  – Acceptor: append-only list
  – Steps: Propose, Check
  – # msgs: O(acceptors)

Page 29: Overview of the Design

[Diagram: MetaSync sits between Local Storage and the Remote Services (Dropbox, Google Drive, OneDrive). Its components are Synchronization, Replication, Object Store, and Backend abstractions.]

3. Replicate objects

Page 30: Stable Deterministic Mapping

• MetaSync replicates objects R times across S storage providers (R<S)
• Requirements
  – Share minimal information among services/clients
  – Support variation in storage size
  – Minimize realignment upon configuration changes
• Deterministic mapping
  – E.g., map(7a1…) = Dropbox, Google
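One way to meet the three requirements above is weighted rendezvous (highest-random-weight) hashing, sketched below. This is a swapped-in, well-known technique used for illustration and is not necessarily the mapping MetaSync implements; the function names and the capacity weights are assumptions. Every client can compute the mapping from the service list alone, larger services receive proportionally more objects, and removing a service only moves objects that were stored on it.

```python
import hashlib
import math


def _score(obj_hash: str, service: str, weight: float) -> float:
    digest = hashlib.sha256(f"{service}:{obj_hash}".encode()).digest()
    # Turn 52 bits of the digest into a float strictly between 0 and 1.
    u = ((int.from_bytes(digest[:8], "big") >> 12) + 0.5) / 2**52
    # Weighted rendezvous score: a larger weight makes a high rank more likely.
    return -weight / math.log(u)


def map_object(obj_hash: str, services: dict, replicas: int) -> list:
    """Return the `replicas` services that should store obj_hash.

    services maps a service name to a capacity weight (hypothetical units).
    """
    ranked = sorted(
        services,
        key=lambda name: _score(obj_hash, name, services[name]),
        reverse=True,
    )
    return ranked[:replicas]


# Hypothetical configuration: S = 4 services, R = 2 replicas per object.
services = {"Dropbox": 2, "Google": 15, "OneDrive": 15, "Box": 10}
placement = map_object("7a1", services, replicas=2)
```

Each object ranks the services independently, so dropping one service leaves the relative order of the others unchanged; only objects that had a replica on the removed service need a new home.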

Page 31: Implementation

• Prototyped with Python
  – ~8k lines of code
• Currently supports 5 backend services
  – Dropbox, Google Drive, OneDrive, Box.net, Baidu
• Two front-end clients
  – Command line client
  – Sync daemon

Page 32: Evaluation

• How is the end-to-end performance?
• What are the performance characteristics of pPaxos?
• How quickly does MetaSync reconfigure mappings?

Page 34: End-to-End Performance

Synchronize the target between two computers (S = 4, R = 2)

Workload                                          Dropbox   Google    MetaSync
Linux Kernel (920 directories, 15k files, 166MB)  2h 45m    > 3hrs    12m 18s
Pictures (50 files, 193MB)                        415s      143s      112s

Performance gains are from:
• Parallel upload/download with multiple providers
• Combining small files into a blob
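The second source of gains, combining small files into a blob, can be sketched as a simple packing pass. The function name `pack`, the blob layout, and the size threshold are assumptions for illustration; the slides do not specify how MetaSync chooses blob boundaries.

```python
def pack(files: dict, blob_size: int) -> list:
    """Group small files and chunk large ones into blobs of at most blob_size.

    files maps a file name to its contents (bytes). Each blob is a list of
    (name, part) pairs; a large file contributes one part per blob.
    """
    blobs, current, used = [], [], 0
    for name, data in sorted(files.items()):
        if len(data) >= blob_size:
            # Large file: split into fixed-size chunks, one chunk per blob.
            for off in range(0, len(data), blob_size):
                blobs.append([(name, data[off:off + blob_size])])
            continue
        if current and used + len(data) > blob_size:
            # The current blob is full: close it and start a new one.
            blobs.append(current)
            current, used = [], 0
        current.append((name, data))
        used += len(data)
    if current:
        blobs.append(current)
    return blobs


files = {"a.txt": b"1" * 10, "b.txt": b"2" * 10, "big.bin": b"3" * 50}
blobs = pack(files, blob_size=25)
```

With many small files, as in the Linux kernel workload, packing reduces the number of upload requests, and the resulting blobs can then be sent to different providers in parallel.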

Pages 35-36: Latency of pPaxos

Latency is not degraded with increasing concurrent proposers or adding slow backend storage service

[Figure: latency (s), axis from 0 to 35, vs. # of proposers (1 to 5). Series: Google, Dropbox, OneDrive, Box, Baidu; page 36 adds the series All.]

Page 37: Conclusion

• MetaSync provides a secure, reliable, and performant file sync service on top of popular cloud providers
  – To achieve consistent updates, we devise a new client-based Paxos
  – To minimize redistribution, we present a stable deterministic mapping
• Source code is available:
  – http://uwnetworkslab.github.io/metasync/