Lecture I: Data Storage Security in Cloud Computing

Kui Ren, Associate Professor, Department of Computer Science and Engineering, University at Buffalo


Post on 27-Feb-2022


Page 1: Lecture I: Data Storage Security in Cloud Computing

Lecture I: Data Storage Security in Cloud Computing

Kui Ren Associate Professor

Department of Computer Science and Engineering

University at Buffalo

Page 2: Lecture I: Data Storage Security in Cloud Computing

Disclaimer!

The lecture slides are partially collected from the Internet for educational purposes only. The lecturer does not claim any credit for them, and the copyrights belong to the original authors.

Page 3: Lecture I: Data Storage Security in Cloud Computing

Outline

•  Introduction to Cloud Computing
•  Cloud Data Storage and Security Challenges
•  Our Research Efforts
•  Further Discussion on the Subject

Page 4: Lecture I: Data Storage Security in Cloud Computing

Cloud Computing: the Big Thing

Page 5: Lecture I: Data Storage Security in Cloud Computing

Cloud Computing: the Big Thing

•  Tremendous momentum:

[Figure: prediction of Federal IT spending able to move to the cloud, from US CIO.gov, Feb. 2011.]

[Figure: prediction of cloud computing revenue in 2012, from market-research firm IDC.]

Page 6: Lecture I: Data Storage Security in Cloud Computing

Cloud Computing: the Big Thing

•  Tremendous momentum:
  –  Cloud providers bring in $2B in the first quarter. Source: Synergy Research Group, May 2013.
  –  The overall cloud market will hit $71 billion in 2015. Source: Gartner, company data, Macquarie Capital (USA), Jan. 2013.

Page 7: Lecture I: Data Storage Security in Cloud Computing

Cloud Computing: Advantages

–  Cloud computing enjoys a "pay-per-use model for enabling available, convenient and on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction." – NIST

Page 8: Lecture I: Data Storage Security in Cloud Computing

Cloud Service Stacks

•  Software as a service
•  Platform as a service
•  Infrastructure as a service

Page 9: Lecture I: Data Storage Security in Cloud Computing

Cloud Deployment Models

Public  

Private  

Page 10: Lecture I: Data Storage Security in Cloud Computing

Challenges for Cloud Computing

Page 11: Lecture I: Data Storage Security in Cloud Computing

Cloud Raises Big Security Challenges!

•  Data Loss and Leakage
•  Insider Attacks

Page 12: Lecture I: Data Storage Security in Cloud Computing

Cloud Raises Big Security Challenges!

•  Service Vulnerability
•  Denial of Service
•  Service Abuse

Page 13: Lecture I: Data Storage Security in Cloud Computing

Broad Attacking Surface for Public Cloud

•  Traditional adversaries: hackers, malware, etc.
•  As well as:
  –  Cross-VM attacks from co-resident tenants;
  –  Leakage of personally identifiable information by rogue employees;
  –  Even providers who control the entire infrastructure…
  –  Many others yet to be identified…
•  Main concerns: Will my data be safe? Will anyone see it? Can anyone modify it? What if I don't trust the cloud operator? …

[Figure: data owners send their data (data flow) to a virtualized server made of apps, guest OSes, a hypervisor, and hardware; the boundary is annotated "Loss of physical control".]

Page 14: Lecture I: Data Storage Security in Cloud Computing

Security Challenges in Cloud

•  Storage Outsourcing vs. Storage Security
•  Cloud Data Encryption vs. Data Utilization
•  Storage Outsourcing vs. Access Control
•  Computation Outsourcing vs. Data Security
•  Utility Computing vs. Trustworthy Metering & Pricing
•  Resource Virtualization vs. Virtualization Security
•  Security Overhead vs. Cloud Benefits
•  and many more …

Page 15: Lecture I: Data Storage Security in Cloud Computing

Outline

•  Introduction to Cloud Computing
•  Cloud Data Storage and Security Challenges
•  Our Research Efforts and Proposed Designs
•  Further Discussion on the Subject

Page 16: Lecture I: Data Storage Security in Cloud Computing

Storage Outsourcing vs. Storage Security

[Figure: data owners send their data (data flow) to the cloud; the boundary is annotated "Loss of physical control".]

•  Cloud storage services allow owners to outsource their data to cloud servers for storage and maintenance.
  –  Low capital costs on hardware and software, low management and maintenance overheads, universal on-demand data access, etc.
  –  E.g., Amazon S3.
•  However, data outsourcing also eliminates owners' ultimate control over their data.

Page 17: Lecture I: Data Storage Security in Cloud Computing

Storage Outsourcing vs. Storage Security

•  Cloud currently offers no guarantee:
  –  Amazon S3: not liable for any data damage or data loss.
•  A broad range of threats to data integrity do exist:
  –  Internal: Byzantine failures, management errors, software bugs, etc.
  –  External: malware, economically motivated attacks, etc.
  –  E.g., Amazon S3 (Feb., Jul. 2008); Gmail (Dec. 2006, Mar. 2011); Apple MobileMe (Jul. 2008); Hotmail (Dec. 2010), …
•  Cloud servers might behave unfaithfully:
  –  Discard rarely accessed data for monetary reasons.
  –  Hide data loss incidents to protect their reputation.
•  Data owners demand continuous storage correctness assurance for their data in the cloud.

Page 18: Lecture I: Data Storage Security in Cloud Computing

Need to Create Security Visibility inside Cloud

•  A proactive storage auditing mechanism is needed to ensure continuous correctness of outsourced cloud data.
  –  To help extend the data trust perimeter into the cloud.
  –  To meet security, system, and performance requirements.

[Figure: the owner asks "Is my data correctly stored?"; the cloud answers with storage correctness proofs.]

Page 19: Lecture I: Data Storage Security in Cloud Computing

Secure Cloud Storage Auditing

•  Demand an efficient storage correctness guarantee without requiring local data copies.
  –  Traditional methods for storage security cannot be directly adopted.
  –  Retrieving massive data for checking is impractical (large bandwidth).
•  Allow meaningful tradeoffs between security and overhead.
  –  Communication and computation costs should be low.
  –  Auditing cost should not outweigh its benefits.
•  Cope with frequently changing cloud data while ensuring continuous data auditing.
  –  Cloud data may be frequently updated by the owner for application purposes.
  –  Auditing mechanisms inherently need to support data dynamics.

Page 20: Lecture I: Data Storage Security in Cloud Computing

Secure Cloud Storage Auditing (Cont'd)

•  Enable public auditing for unified risk evaluation.
  –  Introducing a third-party auditor saves owners' computing resources and simplifies auditing management at the cloud.
  –  Public auditing should not affect the owner's data privacy.
•  Handle multiple auditing tasks simultaneously (batch auditing).
  –  Auditing each data file individually can be tedious and inefficient.
  –  Batch auditing improves efficiency and saves computation overhead.

Page 21: Lecture I: Data Storage Security in Cloud Computing

Outline

•  Cloud Computing Background
•  Cloud Data Storage and Security Challenges
•  Our Research Efforts and Proposed Designs
  •  Storage auditing with data dynamics support
  •  Privacy-preserving public auditing
  •  Efficiency improvement via batch auditing
•  Further Discussion on the Subject

Page 23: Lecture I: Data Storage Security in Cloud Computing

Dynamic Storage Auditing

•  Outsourced data can change frequently due to updates.
  –  Outsourced file storage, databases, email data, log files, etc.
•  How can we design an efficient storage auditing mechanism with inherent support for data dynamics?
  –  The most general forms of data update are data block modification, insertion, and deletion.

The cloud hosts not only static but also dynamic data.

[Figure: security message flow and data flow between the owner and the cloud.]

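Page 24: Lecture I: Data Storage Security in Cloud Computing

Straightforward Approaches

•  The traditional approach is not applicable.
  –  The owner pre-computes MACs for the data under several keys: MAC_K1(Data), MAC_K2(Data), MAC_K3(Data), …
  –  For each audit, the owner reveals one key (e.g., K1). The cloud server computes MAC_K1(Data*) over its stored copy Data*, and the owner checks whether it equals the pre-computed MAC_K1(Data).

Drawbacks:
  –  Keys may be used up!
  –  No data dynamics support!
  –  The cloud processes the entire data online per audit!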
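The key-reveal audit above can be sketched in a few lines. This is a minimal illustration only, assuming HMAC-SHA256 as the MAC; the `Owner` class and its method names are hypothetical and not from the lecture.

```python
import hashlib
import hmac
import secrets


def mac(key: bytes, data: bytes) -> bytes:
    """MAC_K(Data), instantiated here with HMAC-SHA256 (an assumption)."""
    return hmac.new(key, data, hashlib.sha256).digest()


class Owner:
    """Keeps one (key, MAC) pair per future audit; the data itself is outsourced."""

    def __init__(self, data: bytes, num_audits: int):
        self.keys = [secrets.token_bytes(32) for _ in range(num_audits)]
        self.tags = [mac(k, data) for k in self.keys]

    def audit(self, server_data: bytes) -> bool:
        if not self.keys:
            raise RuntimeError("all pre-computed keys are used up")
        key, expected = self.keys.pop(), self.tags.pop()
        # The key is revealed to the server, which must MAC its ENTIRE copy.
        response = mac(key, server_data)
        return hmac.compare_digest(response, expected)
```

The sketch makes the three drawbacks visible: each audit consumes a key, the server hashes the whole file every time, and any update to the data invalidates every remaining pre-computed MAC.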

Page 25: Lecture I: Data Storage Security in Cloud Computing

Straightforward Approaches

•  The random-sampling approach
  –  The owner pre-computes an authenticator (e.g., signature/MAC) σi for each data block mi.
  –  Check only a small portion of the data per audit.
  –  Achieve a probabilistic integrity guarantee via random sampling.

[Figure: the cloud server stores the pairs (m1, σ1), (m2, σ2), (m3, σ3), (m4, σ4), …, (mn, σn); the owner randomly samples block/authenticator pairs, e.g., (m1, σ1), (m2, σ2), (m4, σ4).]

Drawbacks:
1.  Linear bandwidth cost w.r.t. the sample size;
2.  Linear computational cost: each block/authenticator pair must be verified.
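A minimal sketch of the random-sampling audit, assuming a per-block HMAC bound to the block index as the authenticator. The function names (`tag`, `outsource`, `audit`) are illustrative assumptions, not the lecture's notation.

```python
import hashlib
import hmac
import random


def tag(key: bytes, index: int, block: bytes) -> bytes:
    """Authenticator for one block; the index is bound into the MAC."""
    return hmac.new(key, index.to_bytes(8, "big") + block, hashlib.sha256).digest()


def outsource(key: bytes, blocks):
    """What the server stores: a list of (block, authenticator) pairs."""
    return [(b, tag(key, i, b)) for i, b in enumerate(blocks)]


def audit(key: bytes, storage, sample_size: int, rng=random) -> bool:
    """Owner samples positions; server returns each pair; owner checks each one."""
    for i in rng.sample(range(len(storage)), sample_size):
        block, sigma = storage[i]  # transferred in full: bandwidth grows with sample size
        if not hmac.compare_digest(sigma, tag(key, i, block)):
            return False
    return True
```

Both costs listed on the slide show up directly: every sampled block crosses the network, and the owner performs one MAC verification per sampled pair.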

Page 26: Lecture I: Data Storage Security in Cloud Computing

Construct Homomorphic Authenticator

•  A homomorphic authenticator provides integrity authentication and has the aggregation property.
  –  BLS signature based instantiation: (x, g^x) is the private/public key pair, H(·) is a hash-to-point function, and u, g are generators of group G.
  –  Homomorphic: aggregation of authenticators and data blocks.

[Figure: block/authenticator pairs (m1, σ1) and (m2, σ2) are combined into a single aggregated pair (μ, σ). The formulas labeled "Data block", "Authenticator", and "Verification" were not recovered in this transcript.]

Page 27: Lecture I: Data Storage Security in Cloud Computing

Construct Homomorphic Authenticator

•  Audit the aggregated block and authenticator for constant bandwidth cost and much reduced computational cost.
  –  The homomorphic property allows blocks and authenticators to be combined into a single value.
  –  Small and constant bandwidth; μ and σ are verified only once.

[Figure: the cloud server stores the pairs (m1, σ1), …, (mn, σn); the owner randomly samples block/authenticator pairs, e.g., (m1, σ1), (m2, σ2), (m4, σ4), and the server returns one aggregated pair (μ, σ).]

Not designed to support data dynamics!
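To make the aggregation property concrete without a pairing library, the sketch below swaps in a different, simpler technique: a privately verifiable linear homomorphic MAC, σi = PRF_k(i) + α·mi mod p. It is not the BLS-based construction from the slides, and every name in it (`P`, `prf`, `authenticate`, `prove`, `verify`) is an assumption made for illustration.

```python
import hashlib
import hmac

P = 2**127 - 1  # a Mersenne prime used as the field modulus (illustrative choice)


def prf(key: bytes, i: int) -> int:
    """Pseudorandom field element tied to block position i."""
    digest = hmac.new(key, i.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % P


def authenticate(key: bytes, alpha: int, blocks):
    """sigma_i = PRF_k(i) + alpha * m_i (mod P), one authenticator per block."""
    return [(prf(key, i) + alpha * m) % P for i, m in enumerate(blocks)]


def prove(blocks, sigmas, challenge):
    """Server side: aggregate the challenged blocks and authenticators."""
    mu = sum(v * blocks[i] for i, v in challenge) % P
    sigma = sum(v * sigmas[i] for i, v in challenge) % P
    return mu, sigma  # constant size, regardless of how many blocks were sampled


def verify(key: bytes, alpha: int, challenge, mu: int, sigma: int) -> bool:
    """Owner side: a single check on the aggregated pair (mu, sigma)."""
    expected = (alpha * mu + sum(v * prf(key, i) for i, v in challenge)) % P
    return sigma == expected
```

A challenge is a list of (position, coefficient) pairs, for example `[(0, 3), (4, 5), (5, 7), (7, 9)]`. The response is always two field elements, which is the constant-bandwidth behavior the slide describes; unlike the BLS version, verification here requires the secret key, so it cannot be delegated to a third party.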

Page 28: Lecture I: Data Storage Security in Cloud Computing

Analysis of Existing Work

[Figure: blocks m1, m2, m3, …, mn with authenticators σ1, σ2, σ3, …, σn. Two existing constructions are shown: H. Shacham et al. 08 (BLS signature based) and G. Ateniese et al. 07 (RSA based). Legend: v, name: randomly chosen labels for data names; d, x: related private keys; H(.), h(.): hash functions. The authenticator formulas themselves were not recovered in this transcript.]

•  Direct extension to data dynamics is insecure.
  –  E.g., a block modification from mi to mi + Δm allows the adversary to obtain Δm and (u^Δm)^x by dividing the newly computed σi' by the original σi.
  –  The adversary could now maliciously modify any block ms to ms* = ms + Δm and forge a legitimate authenticator σs* = σs · (u^Δm)^x.
•  A new authenticator construction is required to avoid the attack.
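The attack can be written out step by step. The derivation below assumes the BLS-based authenticator has the form σi = [H(name‖i) · u^{mi}]^x, which matches the legend on this slide and the H(name‖i) term mentioned later in the lecture; treat that form as an assumption rather than a quotation from the slide.

```latex
\begin{align*}
\sigma_i  &= \bigl[H(\mathit{name}\,\|\,i)\cdot u^{m_i}\bigr]^{x},
&\sigma_i' &= \bigl[H(\mathit{name}\,\|\,i)\cdot u^{m_i+\Delta m}\bigr]^{x},\\[2pt]
\frac{\sigma_i'}{\sigma_i} &= \bigl(u^{\Delta m}\bigr)^{x},
&\sigma_s^{*} &= \sigma_s\cdot\bigl(u^{\Delta m}\bigr)^{x}
 = \bigl[H(\mathit{name}\,\|\,s)\cdot u^{m_s+\Delta m}\bigr]^{x}.
\end{align*}
```

The forged σs* is a valid authenticator for ms + Δm at position s even though the adversary never learns x, because the index-dependent hash term cancels in the quotient.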

Page 29: Lecture I: Data Storage Security in Cloud Computing

Analysis of Existing Work

[Figure: blocks m1, m2, m3, …, mn with authenticators σ1, σ2, σ3, …, σn; H. Shacham et al. 08 (BLS signature based) and G. Ateniese et al. 07 (RSA based).]

•  A secure authenticator must enforce the block index, i.e., position/sequence information.
  –  This prevents the adversary from using authenticators to obtain proofs for different blocks.
  –  E.g., using any valid (ms, σs) pair to pass challenges for a corrupted mt successfully.
•  But keeping index information makes data updates highly inefficient.
  –  E.g., inserting a block at any position requires retrieving all the subsequent data blocks and re-computing all corresponding authenticators.
•  Can we eliminate the index information but still enforce block position without affecting security?

Page 30: Lecture I: Data Storage Security in Cloud Computing

Our Design Overview

•  Construct a new authenticator using H(mi) instead of H(name||i).
•  The new authenticator supports secure block modification.
  –  H(mi) changes with every block update, so the aforementioned attack on block modification is no longer valid.
•  Eliminating the index enables efficient block insertion/deletion.

We are yet to have a way to enforce the block index sequence.

Page 31: Lecture I: Data Storage Security in Cloud Computing

Our Design Overview

•  Construct a novel sequence-enforced Merkle Hash Tree (sMHT).
  –  Rank of each tree node: the number of leaves that can be reached from the node.
    •  It is also the sum of its children's ranks.
  –  Construct the sMHT with an ordered set {H(mi)}i=1,…,n as the leaf nodes, and use the root (R, n) to ensure correct block position information.
  –  Auxiliary Authentication Information (AAI): the sibling nodes on the path from a leaf to the root.

[Figure: a four-leaf sMHT over the ordered leaves x1, x2, x3, x4, where xi = H(mi), i = 1,…, n. Level 0 holds the leaf nodes (h1,1), (h2,1), (h3,1), (h4,1) with hi = h(xi||1); level 1 holds A = (hA,2) and B = (hB,2) with hA = h(h1||h2||2); level 2 holds Root = (R,4) with R = h(hA||hB||4).]

To verify x3's value and position, we use root (R,4) and AAI = {(h4,1,0), (hA,2,1)}:

1.  Compute the rank of B as 1+1 = 2 and hB = h(h(x3||1) || h4 || 2);
2.  Compute the rank of the root as 2+2 = 4 and R' = h(hA || hB || 4);
3.  Verify that R = R' and that LEFT(x3) = 2.
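The four-leaf verification walk-through can be sketched as follows. Assumptions made here, not stated in the lecture: SHA-256 stands in for h(·), inputs are length-prefixed before hashing, and the third AAI component is read as "1 if the sibling lies to the left, 0 if to the right", so that LEFT(x) is the sum of the ranks of left-hand siblings.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    """Hash of the length-prefixed concatenation of parts (encoding is an assumption)."""
    hasher = hashlib.sha256()
    for part in parts:
        hasher.update(len(part).to_bytes(4, "big"))
        hasher.update(part)
    return hasher.digest()


def leaf(x: bytes):
    """Leaf node (h(x || 1), rank 1)."""
    return (h(x, (1).to_bytes(8, "big")), 1)


def parent(left, right):
    """Internal node: rank is the sum of child ranks and is hashed into the node."""
    rank = left[1] + right[1]
    return (h(left[0], right[0], rank.to_bytes(8, "big")), rank)


def verify(x: bytes, aai, root):
    """Recompute the root from x and its AAI; return LEFT(x) on success, else None."""
    node, left_count = leaf(x), 0
    for sib_hash, sib_rank, sibling_is_left in aai:
        sibling = (sib_hash, sib_rank)
        if sibling_is_left:
            node = parent(sibling, node)
            left_count += sib_rank
        else:
            node = parent(node, sibling)
    return left_count if node == root else None


# Build the four-leaf tree from the slide: x_i = H(m_i).
xs = [hashlib.sha256(m).digest() for m in (b"m1", b"m2", b"m3", b"m4")]
l1, l2, l3, l4 = (leaf(x) for x in xs)
A, B = parent(l1, l2), parent(l3, l4)
root = parent(A, B)  # (R, 4)

# AAI for x3: sibling h4 on the right, then sibling hA on the left.
aai_x3 = [(l4[0], 1, 0), (A[0], 2, 1)]
position = verify(xs[2], aai_x3, root)  # number of leaves to the left of x3
```

Because ranks are hashed into every internal node, a server cannot present a correct leaf at the wrong position: changing a sibling's side or rank changes the recomputed root.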

Page 32: Lecture I: Data Storage Security in Cloud Computing

The Protocol Illustration

•  Preparation: the owner generates the sMHT, keeps the root (R, n), and outsources {Data, σi's, sMHT} to the cloud.
•  Auditing: the owner challenges the cloud on randomly selected data blocks. The cloud responds with the corresponding {μ, σ, Ω}.

[Figure: the owner outsources the pairs (m1, σ1), (m2, σ2), …, (m8, σ8) to the cloud server. The owner sends the challenge {v1, v5, v6, v8} (random positions and coefficients). The cloud server computes μ = v1m1 + v5m5 + v6m6 + v8m8, the aggregated authenticator σ, and Ω. The owner verifies μ and σ with Ω.]

Page 33: Lecture I: Data Storage Security in Cloud Computing

The Protocol Illustration: Auditing

•  Step 1: the owner uses the root (R,8) and Ω to authenticate the positions of {H(mi)}i=1,5,6,8 and hence those of {mi}i=1,5,6,8.
  –  Ω = {H(mi)}i=1,5,6,8 and the corresponding AAI from the sMHT.
  –  Check that R = h(hA||hB||8) and that LEFT(xi) = i-1 for i = 1,5,6,8.

[Figure: an eight-leaf sMHT over x1, …, x8, where xi = H(mi), i = 1,...,8. Leaf nodes (h1,1), …, (h8,1) with hi = h(xi||1); nodes (hC,2), (hD,2), (hE,2), (hF,2) with hC = h(h1||h2||2), hE = h(h5||h6||2), hF = h(h7||h8||2); nodes (hA,4), (hB,4) with hA = h(hC||hD||4), hB = h(hE||hF||4); root (R,8). The AAI nodes are highlighted.]

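Page 34: Lecture I: Data Storage Security in Cloud Computing

The Protocol Illustration: Auditing

•  Step 2: with {H(mi)}i=1,5,6,8 authenticated, the owner further checks the verification equation.

[Figure: the verification equation, annotated with "random coefficients chosen by owner", "public key", and "auditing materials from cloud". The equation itself was not recovered in this transcript.]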
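A plausible form of the Step 2 check can be reconstructed from the definitions earlier in the lecture: the key pair (x, g^x), the generator u, the new authenticator built on H(mi), and a bilinear pairing e. This is a hedged reconstruction under those assumptions, not the slide's own equation.

```latex
\begin{align*}
\sigma_i &= \bigl(H(m_i)\cdot u^{m_i}\bigr)^{x},
\qquad \mu = \sum_{i\in\{1,5,6,8\}} v_i m_i,
\qquad \sigma = \prod_{i\in\{1,5,6,8\}} \sigma_i^{\,v_i},\\[4pt]
e(\sigma, g)
 &= e\Bigl(\prod_i \bigl(H(m_i)\,u^{m_i}\bigr)^{x v_i},\; g\Bigr)
  = e\Bigl(\prod_i H(m_i)^{v_i}\cdot u^{\sum_i v_i m_i},\; g^{x}\Bigr)
  = e\Bigl(\prod_i H(m_i)^{v_i}\cdot u^{\mu},\; g^{x}\Bigr).
\end{align*}
```

In this form the three annotations on the slide map onto the terms: the vi are the random coefficients chosen by the owner, g^x is the public key, and μ, σ, and the authenticated H(mi) are the auditing materials returned by the cloud.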

Page 35: Lecture I: Data Storage Security in Cloud Computing

The Protocol Illustration: Support Data Dynamics

•  Support general block-level operations: modification (M), deletion (D), and insertion (I).
  –  One step closer towards practical auditing mechanisms.
•  An update operation touches the block, its corresponding authenticator, and the sMHT.
  –  When inserting/deleting a block, the authenticators for all other blocks remain the same, i.e., no authenticator re-computation or data retrieval is necessary.

[Figure: owner-side updates, computed from Ω and h(H(m*)).]

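Page 36: Lecture I: Data Storage Security in Cloud Computing

Support Data Dynamics: Block Insertion

1.  The owner computes σ* for the new block m* and sends {m*, σ*} with the request "Insert m* after m2".
2.  The cloud server inserts m*, updates the sMHT, and returns Ω = {(h1,1,0), (h2,1,0), (hB,2,1)}.
3.  The owner authenticates the received Ω with the local (R,4).
4.  The owner computes (R*,5) with Ω and the local h(H(m*)||1) = h(x*||1).

[Figure: before the update, the tree has Root (R,4), internal nodes A = (hA,2) and B = (hB,2), and leaf nodes (h1,1), (h2,1), (h3,1), (h4,1), with xi = H(mi) and hi = h(xi||1). Inserting (h(x*||1),1) after (h2,1) creates a new node C = (hC,2) with children (h2,1) and (h(x*||1),1); A becomes (hA*,3) and the root becomes (R*,5).]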

Page 37: Lecture I: Data Storage Security in Cloud Computing

Remarks

•  In our scheme, we store additional metadata in the tree structure to assist authentication.
  –  E.g., store additional rank information of the tree at the server.
•  It helps eliminate the need for the owner to keep track of the tree structure, while keeping our design secure.
•  Otherwise, the owner would have to record local state information for each update he conducts.
  –  Quite a burden from a practical point of view.

Page 38: Lecture I: Data Storage Security in Cloud Computing

Example: Storing Rank of Nodes

•  The rank of node i denotes the number of leaf nodes that belong to the sub-tree with node i as the root.
•  The owner can directly use authenticated rank values to verify that the node F is indeed the 750-th node.

[Figure: a tree over xi = H(mi), i = 1,...,n, with Root rank 1000 and Root = h(hA || hB || 1000). Children of the root: (hA, 400) and (hB, 600), with hB = h(hC || hD || 600). Children of B: (hC, 400) and (hD, 200), with hC = h(hE || hF || 400). Under C: (hE, 349) and (hF, 1), with hF = h(H(m750) || 1); the leaf node is H(m750). Intermediate nodes are elided with "…".]

Page 39: Lecture I: Data Storage Security in Cloud Computing

Efficiency Enhancement

•  Using the MHT, persistent insertion at the same position results in a worst-case complexity of O(n).
  –  Since the tree height keeps increasing.
•  But other, more balanced tree structures can directly replace the MHT and keep worst-case performance at O(log n).
  –  E.g., a skip list or a B+ tree can be used.
  –  Homework: you can check these details by reading the corresponding papers.

Page 40: Lecture I: Data Storage Security in Cloud Computing

Security Analysis

•  Our proposed authenticator construction can be proved to be existentially unforgeable.
  –  Use the fact that the BLS signature is existentially unforgeable.
  –  By contradiction: if an adversary can forge our authenticator scheme → we can use the adversary to forge a BLS signature.

[Figure: a simulator interacts with the adversary; the adversary's forgery yields a forged BLS signature that passes verification. Contradiction!]

Page 41: Lecture I: Data Storage Security in Cloud Computing

Security Analysis (cont'd)

•  The soundness of our storage correctness guarantee is based on the hardness of the Computational Diffie-Hellman (CDH) problem.
  –  CDH: given g, g^α, h ∈ G for unknown α ∈ Zp, output h^α.
  –  By contradiction: if an adversary can respond with a corrupted response that passes the verification → we can solve the CDH problem.

[Figure: the simulator uses the adversary's response; CDH is solved → Contradiction!]

Page 42: Lecture I: Data Storage Security in Cloud Computing

Probabilistic Guarantee of Random Sampling

•  Assume r out of n blocks are corrupted. How many blocks should we randomly sample to detect it with high probability?
•  Let X denote the number of corrupted blocks picked by the random sampling. Then sampling c blocks gives detection probability

$$P = 1 - P\{X = 0\} = 1 - \prod_{i=0}^{c-1}\Bigl(1 - \min\Bigl\{\frac{r}{n-i},\, 1\Bigr\}\Bigr) \;\ge\; 1 - \Bigl(\frac{n-r}{n}\Bigr)^{c} = 1 - (1-t)^{c}, \quad \text{where } t = \frac{r}{n}.$$

•  If t = 1% of the file is corrupted, randomly sampling a constant c = 460 blocks maintains detection probability P = 0.99.
•  Error-correcting codes can be used to correct small data errors.
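The bound can be checked numerically. The helper names below are illustrative assumptions; `min_samples` inverts the lower bound 1 - (1 - t)^c, and `exact_detection_probability` evaluates the product form for sampling without replacement (it assumes c ≤ n).

```python
import math


def detection_probability(t: float, c: int) -> float:
    """Lower bound 1 - (1 - t)^c on the probability of hitting a corrupted block."""
    return 1 - (1 - t) ** c


def exact_detection_probability(n: int, r: int, c: int) -> float:
    """1 - prod_{i=0}^{c-1} (1 - min(r / (n - i), 1)); requires c <= n."""
    miss = 1.0
    for i in range(c):
        miss *= 1 - min(r / (n - i), 1)
    return 1 - miss


def min_samples(t: float, target: float) -> int:
    """Smallest c with 1 - (1 - t)^c >= target."""
    return math.ceil(math.log(1 - target) / math.log(1 - t))


print(min_samples(0.01, 0.99))  # 459
print(min_samples(0.03, 0.99))  # 152
print(min_samples(0.01, 0.95))  # 299
```

The exact minimum for t = 1% and P = 0.99 is 459, so the lecture's c = 460 is a slightly rounded-up figure; the required sample size depends only on t and P, not on the file size n.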

Page 43: Lecture I: Data Storage Security in Cloud Computing

Performance Evaluation

Table 1: Comparisons with the state of the art.

| | Ateniese et al. CCS'07 | Shacham et al. ASIACRYPT'08 | Ateniese et al. SecureComm'08 | Our TPDS'11 / ESORICS'09 |
| Data dynamics | No | No | Partially+ | Yes |
| Server comp. complexity | O(1) | O(1) | O(1) | O(log n) |
| Owner comp. complexity | O(1) | O(1) | O(1) | O(log n) |
| Comm. complexity | O(1) | O(1) | O(1) | O(log n) |
| Owner storage complexity | O(1) | O(1) | O(1) | O(1) |

+: The scheme only supports a bounded number of integrity challenges and partial data updates, i.e., data insertion is not supported.

Page 44: Lecture I: Data Storage Security in Cloud Computing

Performance Evaluation (cont'd)

Table 2: Performance comparisons with different instantiations.

| | Our BLS-based instantiation | | Our RSA-based instantiation | |
| System parameters | | | | |
| Data corruption rate t | 1% | 3% | 1% | 3% |
| Detection probability P | 0.99 | 0.99 | 0.99 | 0.99 |
| Randomly sampled blocks c | 460 | 152 | 460 | 152 |
| Performance results | | | | |
| Server comp. time (ms) | 6.45 | 2.11 | 13.81 | 4.55 |
| Owner comp. time (ms) | 806.01 | 284.17 | 779.10 | 210.47 |
| Comm. cost (KB) | 239 | 80 | 223 | 76 |

Our experiment is conducted using C on a system with a processor running at 2.4 GHz and 768 MB of RAM.

The performance is measured for 1 GB of data under data corruption rates t = 1% and 3% while maintaining detection probability P = 0.99, where P ≥ 1 - (1 - t)^c and c is the sample size. The block size of the RSA-based instantiation is chosen to be 4 KB. Note that error-correcting codes can be used to correct small data errors (e.g., t < 1%).

Page 45: Lecture I: Data Storage Security in Cloud Computing

Short Summary

•  We explore the problem of cloud storage auditing with data dynamics support.
•  We carefully design a new homomorphic authenticator and achieve the goal with a novel sequence-enforced Merkle Hash Tree (sMHT) design.
•  We conduct experiments for both BLS-based and RSA-based instantiations. Extensive security and performance analysis shows that the proposed scheme is provably secure and highly efficient.

Page 46: Lecture I: Data Storage Security in Cloud Computing

Outline

•  Cloud Computing Background
•  Cloud Data Storage and Security Challenges
•  Our Research Efforts and Proposed Designs
  •  Storage auditing with data dynamics support
  •  Privacy-preserving public auditing
  •  Efficiency improvement via batch auditing
•  Further Discussion on the Subject

Page 47: Lecture I: Data Storage Security in Cloud Computing

Public Auditing with Third-party Auditor

•  Maintaining a storage correctness guarantee demands continuous auditing.
  –  High computation/communication costs and online burdens for data owners.
•  Introduce a third-party auditor (TPA) for correctness evaluation.
  –  Owners can be worry-free by resorting to the TPA for auditing tasks.

[Figure: a resource-constrained owner outsources a large amount of data to the cloud (data flow); security messages flow between the owner, the TPA, and the cloud.]

Page 48: Lecture I: Data Storage Security in Cloud Computing

Public Auditing vs. Data Privacy

•  The TPA should not learn the content of the data when auditing on behalf of data owners.
  •  Unauthorized information leakage is unwanted by data owners.
  •  Legal regulations, e.g., HIPAA, may mandate it.
•  A privacy-preserving public auditing mechanism is desired.

[Figure: data flow among the owner, the cloud, and the third-party auditor.]

Page 49: Lecture I: Data Storage Security in Cloud Computing

Revisit Existing Approaches

•  μ = v1m1 + v5m5 + v6m6 + v8m8 leaks the data to the TPA.
  –  Direct adoption is unsuitable for public auditing.
  –  The TPA can recover all mi's by solving a system of linear equations.
•  Assuming data encryption before outsourcing? NOT satisfying.
  –  The method is not self-contained; it leaves the problem to key management.
  –  Overkill for certain types of data, e.g., libraries, scientific data, …

[Figure: the owner outsources the pairs (m1, σ1), …, (mn, σn) to the cloud server. The TPA, holding the public key g^x, sends the challenge {v1, v5, v6, v8} (random positions and coefficients) and receives μ = v1m1 + v5m5 + v6m6 + v8m8 together with σ.]
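The leakage is easy to demonstrate: after as many audits over the same positions as there are unknown blocks, the TPA holds a solvable linear system. The sketch below assumes blocks and coefficients are elements of Z_p for a toy prime, and that the coefficient matrix is invertible mod p; `P` and `solve_mod` are illustrative names.

```python
P = 2**61 - 1  # toy prime modulus (an assumption; the real group order differs)


def solve_mod(A, b, p=P):
    """Solve A * m = b over Z_p by Gauss-Jordan elimination (A must be invertible)."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] % p != 0)
        M[col], M[pivot] = M[pivot], M[col]
        inv = pow(M[col][col], -1, p)  # modular inverse, Python 3.8+
        M[col] = [(v * inv) % p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] % p:
                factor = M[r][col]
                M[r] = [(a - factor * c) % p for a, c in zip(M[r], M[col])]
    return [row[n] for row in M]


# Secret blocks m1, m5, m6, m8 held by the cloud; the TPA never sees them directly.
secret_blocks = [1111, 2222, 3333, 4444]

# Four audits on the same positions, each with TPA-chosen coefficients (v1, v5, v6, v8).
coefficients = [[1, 1, 1, 1], [1, 2, 3, 4], [1, 4, 9, 16], [1, 8, 27, 64]]
responses = [sum(v * m for v, m in zip(row, secret_blocks)) % P for row in coefficients]

recovered = solve_mod(coefficients, responses)  # what the TPA can compute on its own
```

This is the motivation for masking μ before it leaves the server: without masking, honest auditing alone gives the TPA the block contents.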

Page 50: Lecture I: Data Storage Security in Cloud Computing

Privacy-preserving Public Auditing

•  Achieve privacy-preserving auditing regardless of data encryption.
•  Construct homomorphic aggregation with random masking.
  –  The server combines the corresponding blocks and randomly masks the result.
  –  With a randomly masked μ, the owner's data content is no longer exposed!
  –  Random masking must not affect storage correctness validation!

[Figure: the cloud server stores the pairs (m1, σ1), …, (mn, σn). The TPA sends the challenge {v1, v5, v6, v8} (random positions and coefficients); the cloud server combines v1m1 + v5m5 + v6m6 + v8m8, masks it, and returns μ and σ; the TPA verifies μ and σ.]

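Page 51: Lecture I: Data Storage Security in Cloud Computing

Privacy-preserving Public Auditing

•  System parameters: (not recovered in this transcript).

[Figure: the cloud server stores the pairs (m1, σ1), …, (mn, σn). The TPA sends the challenge {v1, v5, v6, v8} (random positions and coefficients); the cloud server combines μ′ = v1m1 + v5m5 + v6m6 + v8m8 and returns the masked μ.]

1.  The cloud server picks a random r.
2.  It computes R and γ = h(R) (the formula for R was not recovered in this transcript).
3.  It sets μ = r + γμ′ mod p, where μ′ is the unmasked combination of blocks.

The soundness of our privacy-preserving auditing mechanism can be proved under the random oracle model.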

Page 52: Lecture I: Data Storage Security in Cloud Computing

The Correctness Elaboration

μ′: the original block; μ: the blinded block.

[The correctness derivation on this slide was not recovered in this transcript.]
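One way to see why masking does not break verification is the derivation below. It is a sketch under stated assumptions, not the slide's own text: the public key is g^x, the mask commitment is taken to be R = e(u, g^x)^r with γ = h(R), the blinded value is μ = r + γμ′, Hi denotes the hash term inside the authenticator σi = (Hi · u^{mi})^x, and the TPA is assumed to check R · e(σ^γ, g) against the right-hand side shown.

```latex
\begin{align*}
R\cdot e(\sigma^{\gamma}, g)
 &= e(u, g^{x})^{r}\cdot e\Bigl(\bigl(\textstyle\prod_i (H_i\,u^{m_i})^{x v_i}\bigr)^{\gamma},\; g\Bigr)\\
 &= e(u^{r}, g^{x})\cdot e\Bigl(\bigl(\textstyle\prod_i H_i^{v_i}\bigr)^{\gamma}\cdot u^{\gamma\mu'},\; g^{x}\Bigr)
 \qquad\text{with } \mu' = \textstyle\sum_i v_i m_i\\
 &= e\Bigl(\bigl(\textstyle\prod_i H_i^{v_i}\bigr)^{\gamma}\cdot u^{\,r+\gamma\mu'},\; g^{x}\Bigr)
  = e\Bigl(\bigl(\textstyle\prod_i H_i^{v_i}\bigr)^{\gamma}\cdot u^{\mu},\; g^{x}\Bigr).
\end{align*}
```

The TPA can evaluate the final expression from R, σ, μ, the challenge coefficients, and the public key alone, so it never needs the unmasked μ′.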

Page 53: Lecture I: Data Storage Security in Cloud Computing

Remarks on Privacy-preserving Auditing

•  We have proved that our construction of R and γ, with γ = h(R), does not affect the security of the storage auditing equation.
•  The scheme works under a semi-trusted security model.
  –  I.e., collusion between the cloud server and the TPA is not considered.
•  The scheme can support data dynamics straightforwardly.
  –  Elimination of the block index in the authenticator.
  –  Utilizing the sequence-enforced MHT (sMHT).
•  Other privacy-preserving auditing constructions are possible.

Page 54: Lecture I: Data Storage Security in Cloud Computing

Security Analysis

•  The privacy-preserving guarantee is proved in the random oracle model using γ = h(R).
  –  We prove the existence of a simulator who controls the random oracle h(.) and can produce a valid response {R, σ, μ} without knowledge of the original μ′.
  –  Assume the simulator is given a valid σ.

1.  The simulator randomly picks γ and μ from Zp.
2.  The simulator sets R so that the verification equation holds (the formula was not recovered in this transcript).
3.  The simulator backpatches (or sets) γ = h(R), as it controls the random oracle h(.).

Since the simulator generates a valid response {R, σ, μ} without knowing μ′, the TPA learns nothing about μ′ from the response {R, σ, μ}.

Page 55: Lecture I: Data Storage Security in Cloud Computing

Security Analysis

•  The soundness of our modified auditing mechanism is based on the underlying (original) storage auditing mechanism.
  –  We prove the existence of an extractor who can extract μ′ from a valid {R, σ, μ}.
  –  The extractor controls the random oracle h(.) and answers queries issued by the cloud server for h(R).

1.  The extractor answers γ = h(R), and the cloud server outputs a valid {R, σ, μ} that satisfies the verification equation (equation not recovered in this transcript).
2.  The extractor rewinds (resets) the cloud server and returns γ* = h(R) for the query of h(R). The cloud server outputs {R, σ, μ*} that satisfies the verification equation under γ* (equation not recovered in this transcript).
3.  By dividing the two equations, the extractor can obtain a valid {σ, μ′}, where μ′ = (μ* - μ) / (γ* - γ), for the original storage auditing equation (such as Shacham's scheme). With a valid {σ, μ′}, the soundness of our auditing scheme follows from existing soundness proofs.
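The extraction step is plain modular arithmetic: two responses masked with the same r but different oracle answers determine the unmasked value. The sketch assumes a toy prime modulus; `mask` and `extract` are illustrative names.

```python
P = 2**61 - 1  # toy prime modulus (an assumption)


def mask(mu_prime: int, r: int, gamma: int, p: int = P) -> int:
    """Server-side blinding: mu = r + gamma * mu' (mod p)."""
    return (r + gamma * mu_prime) % p


def extract(mu: int, gamma: int, mu_star: int, gamma_star: int, p: int = P) -> int:
    """Extractor: mu' = (mu* - mu) / (gamma* - gamma) (mod p), valid when gamma* != gamma."""
    inverse = pow((gamma_star - gamma) % p, -1, p)
    return ((mu_star - mu) * inverse) % p
```

For example, with `mu_prime = 123456789` and `r = 987654321`, calling `extract(mask(mu_prime, r, 1111), 1111, mask(mu_prime, r, 2222), 2222)` gives back `mu_prime`. A single response reveals nothing in this sense, since for any candidate μ′ there is a matching r.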

Page 56: Lecture I: Data Storage Security in Cloud Computing

Cost of Privacy-Preserving Guarantee

Table 3: Performance comparisons with previous work.

| | Our INFOCOM'10 | | Shacham et al. ASIACRYPT'08 | |
| System parameters | | | | |
| Data corruption rate t | 1% | 1% | 1% | 1% |
| Detection probability P | 0.99 | 0.95 | 0.99 | 0.95 |
| Randomly sampled blocks c | 460 | 300 | 460 | 300 |
| Performance results | | | | |
| Server comp. time (ms) | 411.00 | 270.20 | 407.66 | 265.87 |
| TPA comp. time (ms) | 507.79 | 476.81 | 504.25 | 472.55 |
| Comm. cost (Byte) | 160 | 160 | 40 | 40 |
| Privacy-preserving | Yes | | No | |

Our experiment is conducted using C on a system with an Intel Core 2 processor running at 1.86 GHz and 2048 MB of RAM.

Our analysis shows that if the server is missing t = 1% of the data blocks, the TPA only needs to audit c = 460 or 300 randomly chosen blocks to detect this misbehavior with probability P larger than 0.99 or 0.95, respectively.

Page 57: Lecture I: Data Storage Security in Cloud Computing

Short Summary

•  Enabling public auditing is of critical importance for unified risk evaluation of cloud storage services. But public auditing should not affect the owner's data privacy.
•  A public storage auditing scheme utilizing a new random-masking construction with homomorphic authenticators is designed.
•  The design also supports data dynamics straightforwardly.
•  Extensive security analysis and performance experiments show that the proposed schemes are provably secure and highly efficient.

Page 58: Lecture I: Data Storage Security in Cloud Computing

Outline

•  Cloud Computing Background
•  Cloud Data Storage and Security Challenges
•  Our Research Efforts and Proposed Designs
  •  Storage auditing with data dynamics support
  •  Privacy-preserving public auditing
  •  Efficiency improvement via batch auditing
•  Further Discussion on the Subject

Page 59: Lecture I: Data Storage Security in Cloud Computing

Batch Auditing

•  The TPA may concurrently handle multiple auditing delegations.
•  Auditing each task individually can be tedious and inefficient overall.
•  We explore the algebraic property of the BLS signature and slightly modify the single-owner protocol for simultaneous auditing (details skipped).

[Figure: owners 1, 2, …, k each outsource their pairs (m1, σ1), …, (mn, σn) to the cloud server. The TPA sends randomly chosen coefficients {v1, v5, v6, v8}. Individual auditing verifies (μ1, σ1), (μ2, σ2), …, (μk, σk) separately; batch auditing verifies μ1, μ2, …, μk and an aggregated σ in a single equation.]

Page 60: Lecture I: Data Storage Security in Cloud Compujng

Recap  on  Bilinear  Pairing    
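For reference, the standard definition of a bilinear pairing, as commonly used with BLS signatures, can be written as follows. The group names and symbols are the usual textbook ones and are an assumption, not taken from the slides.

```latex
% Groups G_1, G_2, G_T of prime order p, with generators g_1 \in G_1, g_2 \in G_2.
e : G_1 \times G_2 \rightarrow G_T
% Bilinearity:
e(u^{a}, v^{b}) = e(u, v)^{ab}
  \quad \text{for all } u \in G_1,\; v \in G_2,\; a, b \in \mathbb{Z}_p
% Non-degeneracy:
e(g_1, g_2) \neq 1
% Computability: e(u, v) can be computed efficiently for all u, v.
```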

Page 61: Lecture I: Data Storage Security in Cloud Compujng

Batch Auditing: Efficiency Enhancement Highlight

Aggregate K equations into a single one
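A sketch of what the aggregation can look like, assuming a Shacham–Waters-style BLS authenticator. All notation here is an assumption chosen for illustration and may differ from the slides: owner k has public key v_k, generator u_k, hashed block tags h_{k,i}, the challenge is an index set I with coefficients ν_i, and the server's response is (µ_k, σ_k).

```latex
% Individual auditing: one equation per owner, 2 pairings each (2K in total).
e(\sigma_k, g) = e\Big( \prod_{i \in I} h_{k,i}^{\nu_i} \cdot u_k^{\mu_k},\; v_k \Big),
  \qquad k = 1, \dots, K

% Batch auditing: multiply the K equations and merge the left-hand sides
% by bilinearity, e(a, g)\, e(b, g) = e(ab, g). This needs K + 1 pairings.
e\Big( \prod_{k=1}^{K} \sigma_k,\; g \Big)
  = \prod_{k=1}^{K} e\Big( \prod_{i \in I} h_{k,i}^{\nu_i} \cdot u_k^{\mu_k},\; v_k \Big)
```

The left-hand side collapses into a single pairing because every owner's authenticator is paired with the same generator g, while the right-hand side still needs one pairing per distinct public key.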

Page 62: Lecture I: Data Storage Security in Cloud Compujng

Privacy-preserving Batch Auditing: Efficiency Enhancement Highlight

Aggregate K equations into a single one

Page 63: Lecture I: Data Storage Security in Cloud Compujng

Remarks on Batch Auditing

•  Aggregating K (K ≥ 2) verification equations into one reduces the number of expensive pairing operations from 2K to K+1.
   –  A considerable amount of auditing time can be saved.

•  Successful verification means all checked blocks are valid.
   –  This follows from the security strength of the BLS-based authenticators and the verification equation.

•  Failed verification means the data of one or more owners is corrupted.
   –  Use a divide-and-conquer approach (binary search) to find the invalid responses.
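The divide-and-conquer remark above can be sketched as a recursive halving search. This is a minimal illustration, not the authors' implementation: `batch_verify` is a hypothetical callback standing in for the aggregated pairing check, and the demo replaces it with Python's built-in `all` over boolean flags.

```python
from typing import Callable, List, Sequence


def find_invalid(responses: Sequence[object],
                 batch_verify: Callable[[Sequence[object]], bool]) -> List[int]:
    """Return the indices of invalid responses using recursive halving."""

    def search(lo: int, hi: int) -> List[int]:
        # If the sub-batch [lo, hi) verifies as a whole, nothing inside is invalid.
        if batch_verify(responses[lo:hi]):
            return []
        # A failing batch of size one pinpoints an invalid response.
        if hi - lo == 1:
            return [lo]
        mid = (lo + hi) // 2
        return search(lo, mid) + search(mid, hi)

    return search(0, len(responses)) if len(responses) > 0 else []


if __name__ == "__main__":
    # Stand-in data: True marks a valid response, False an invalid one.
    flags = [True, True, False, True, False, True, True, True]
    print(find_invalid(flags, all))
```

When only a few responses are invalid, most sub-batches pass on the first check, so far fewer verifications are needed than auditing every response individually.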

Page 64: Lecture I: Data Storage Security in Cloud Compujng

Batch Auditing Efficiency

[Figure: auditing time per task (ms, 400 to 520) versus number of auditing tasks (0 to 200), for individual auditing, batch auditing (c=460), and batch auditing (c=300).]

Batch auditing indeed helps reduce the TPA's computation cost: more than 11% and 14% of the per-task auditing time is saved when c=460 and c=300, respectively.

Page 65: Lecture I: Data Storage Security in Cloud Compujng

Sorting Out Invalid Responses

Even when the number of invalid responses exceeds 15% of the total batch size, batch auditing still performs better than individual auditing.

[Figure: auditing time per task (ms, 410 to 510) versus fraction of invalid responses α (0 to 18), for individual auditing, batch auditing (c=460), and batch auditing (c=300).]

Page 66: Lecture I: Data Storage Security in Cloud Compujng

Short  Summary  

•  Handling multiple auditing tasks simultaneously (batch auditing) is in great need as data are increasingly outsourced to the cloud
   –  Auditing each data file individually can be tedious and inefficient.
   –  Batch auditing improves efficiency and saves computation overhead.

•  We leverage the algebraic property of BLS-signature-based homomorphic authenticators to construct correct and secure batch auditing protocols.

•  We demonstrate via experiments that the proposed batch auditing schemes outperform individual auditing in terms of per-task auditing time.

Page 67: Lecture I: Data Storage Security in Cloud Compujng

Related Publications

•  Q. Wang, C. Wang, J. Li, Kui Ren, and W. Lou, "Enabling Public Verifiability and Data Dynamics for Storage Security in Cloud Computing," IEEE Transactions on Parallel and Distributed Systems, Vol. 22, No. 5, pp. 847-859, May 2011. (Also appears in Proc. of ESORICS, 2009, AR = 19%.)
   –  #1 top accessed IEEE TPDS article in IEEE Xplore as of December 2011

•  C. Wang, Q. Wang, Kui Ren, and W. Lou, "Privacy-preserving Public Auditing for Data Storage Security in Cloud Computing," IEEE Transactions on Computers, Vol. 62, No. 2, pp. 362-375, 2013. (Also appears in Proc. of IEEE INFOCOM, 2010, AR = 17.5%.)
   –  #1 top accessed INFOCOM'10 article in IEEE Xplore as of December 2011

•  C. Wang, Q. Wang, Kui Ren, and W. Lou, "Ensuring Data Storage Security in Cloud Computing," IEEE Transactions on Services Computing, Vol. 5, No. 2, pp. 220-232, 2012. (Also appears in Proc. of IWQoS, 2009.)

•  C. Wang, Kui Ren, W. Lou, and J. Li, "Towards Publicly Auditable Secure Cloud Data Storage Services," IEEE Network, Vol. 24, No. 4, pp. 19-24, 2010.
   –  #2 top accessed IEEE Network article in IEEE Xplore as of July 2011

•  Kui Ren, C. Wang, and Q. Wang, "Security Challenges for the Public Cloud," IEEE Internet Computing, Vol. 16, No. 1, pp. 69-73, Jan/Feb 2012. (Invited Paper)

Page 68: Lecture I: Data Storage Security in Cloud Compujng

Outline

•  Cloud Computing Background
•  Cloud Data Storage and Security Challenges
•  Our Research Efforts and Proposed Designs
   –  Storage auditing with data dynamics support
   –  Privacy-preserving public auditing
   –  Efficiency improvement via batch auditing
•  Further Discussion on the Subject

Page 69: Lecture I: Data Storage Security in Cloud Compujng

More Cloud Storage Security Related Topics

•  Proofs of data redundancy
•  Proofs of data encryption
•  Assured deletion
•  Proofs of geolocation
•  Proofs of ownership vs. deduplication
•  More to be identified…


Page 71: Lecture I: Data Storage Security in Cloud Compujng

Proofs of Data Redundancy: Challenges on the Physical Layer

•  Amazon claims to store three distinct copies of my file for resilience. Can they prove it?
   –  Auditing won't do the trick, nor will downloading!

[Figure: Alice's file F is claimed to be stored as three copies (F, F, F); Alice cannot tell whether the cloud really holds several copies or only one.]

Slide credits: Ari Juels et al.


Page 73: Lecture I: Data Storage Security in Cloud Compujng

Virtualization is a complication

Erasure coding across disks…

[Figure: a file spread across Disk 1, Disk 2, Disk 3, Disk 4, and Disk 5, each labeled "Virtual"; a crash is marked with an X.]

•  "My file can survive two disk crashes!"
•  "A single disk crash can destroy my file!"

Page 74: Lecture I: Data Storage Security in Cloud Compujng

How  to  Tell  if  Your  Cloud  Files  Are  Vulnerable  to  Drive  Crashes    

Proofs that the tenant's files can survive drive crashes

Page 75: Lecture I: Data Storage Security in Cloud Compujng

Prove Disk-crash Resilience

Claim: File can survive two disk crashes!

The Challenge: How can a cloud provider prove that certain bits sit on certain disks?

[Figure: Disk 1, Disk 2, Disk 3, Disk 4, Disk 5]

Page 76: Lecture I: Data Storage Security in Cloud Compujng

Motivation and Idea

•  Cloud server: "We store 3 copies of your file on 3 different drives. We can tolerate 2 drive failures."

•  Pizza store: "We have 2 ovens."

•  How do you know if it's true?

•  Idea: multiple devices can work in parallel, but a single device can't.

Page 77: Lecture I: Data Storage Security in Cloud Compujng

Example – pizza store

•  Assume we know
   –  The pizza store has 2 ovens
   –  An oven "usually" takes 5 min to bake a pizza
   –  The store is a 15 min drive from here

•  Time needed for 24 pizzas?
   –  1 oven: 5·24 = 120 min
   –  2 ovens: 60 min
   –  Drive time: 15 min

•  Task for the pizza store: "Send me 24 pizzas in 80 min."

•  Task for the cloud server: "Send me a block of the file from each drive in xxxx milliseconds"
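The arithmetic behind the 80-minute deadline can be sketched as follows. The function names are illustrative; the model assumes each oven bakes one pizza at a time and that all pizzas are delivered in a single trip.

```python
import math


def delivery_time(pizzas: int, ovens: int, bake_min: float, drive_min: float) -> float:
    """Total minutes: baking rounds (one pizza per oven per round) plus the drive."""
    rounds = math.ceil(pizzas / ovens)
    return rounds * bake_min + drive_min


def meets_deadline(pizzas: int, ovens: int, bake_min: float,
                   drive_min: float, deadline_min: float) -> bool:
    """True if the order can arrive within the deadline."""
    return delivery_time(pizzas, ovens, bake_min, drive_min) <= deadline_min


if __name__ == "__main__":
    for ovens in (1, 2):
        total = delivery_time(24, ovens, 5, 15)
        print(ovens, "oven(s):", total, "min, on time:",
              meets_deadline(24, ovens, 5, 15, 80))
```

With two ovens the order takes 60 + 15 = 75 minutes and meets the deadline; with one oven it takes 120 + 15 = 135 minutes and does not, so the deadline separates the two cases.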

Page 78: Lecture I: Data Storage Security in Cloud Compujng

The Pizza Oven Protocol

[Figure: Eeta Pizza Pi orders from Cheapskate Pizza: "Six pizzas!"]

Page 79: Lecture I: Data Storage Security in Cloud Compujng

The Pizza Oven Protocol

[Figure: Eeta Pizza Pi orders from Cheapskate Pizza: "Six pizzas!"; parts of the picture are crossed out (X).]

Page 80: Lecture I: Data Storage Security in Cloud Compujng

The Pizza Oven Protocol

[Figure: Eeta Pizza Pi and Cheapskate Pizza]

Cheapskate now claims it can survive an oven failure! How can Eeta Pizza Pi verify without visiting?

Page 81: Lecture I: Data Storage Security in Cloud Compujng

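The Pizza Oven Protocol

Suppose that:
•  A pizza oven bakes one pizza at a time and takes 10 minutes
•  The Cheapskate truck takes 15 minutes to deliver to Eeta Pizza Pi

[Figure: Eeta Pizza Pi orders "Six pizzas!" from Cheapskate Pizza at time T0 and receives them at time T1.]

Is T1 – T0 = 45 mins? With two ovens, six pizzas bake in 3 × 10 = 30 minutes and arrive after a 15-minute drive, so 45 minutes is achievable; with a single oven the order would take 6 × 10 + 15 = 75 minutes.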

Page 82: Lecture I: Data Storage Security in Cloud Compujng

Protocol Design for Cloud Servers

•  Core part
   –  Choose the threshold for the time limit

•  Challenges
   –  Network latency / pizza delivery traffic time
   –  Drive read time / oven baking speed
      •  seek time, throughput, RPM, buffer
   –  Make the queries to disks unpredictable

Page 83: Lecture I: Data Storage Security in Cloud Compujng

Network latency

•  Ping hosts in Santa Clara and Shanghai from Boston

•  Several strategies to account for variability in network latency
   –  Latency 1 ≈ Latency 2 if the hosts are geographically close
   –  Abort the protocol if the response time exceeds 110% of the average

•  Reduce network-timing variance when bandwidth is limited
   –  Server applies a hash function before transmitting

Page 84: Lecture I: Data Storage Security in Cloud Compujng

Drive – read time

•  Task: server reads a block from each drive
   –  What block size (the size of each gi)?
   –  What time limit for this task?

•  Two main factors of drive read time
   –  Seek: the disk head moves to the right track and sector
   –  Data transfer rate (throughput)

•  The drive used in the paper
   –  3.5 ms seek time and 73 MB/s to 125 MB/s throughput

Page 85: Lecture I: Data Storage Security in Cloud Compujng

Drive – determine the block size

•  Seek time depends on the distance that the disk head needs to move

•  Throughput depends on the position of the block
   –  Outer tracks are faster than inner tracks
   –  Sequential data are faster to read than scattered data

•  Force the drive to perform a seek for EVERY block
   –  Use a small block size
   –  Query a random pattern of blocks
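A rough sketch of why small blocks make the seek dominate, using the drive figures quoted on the previous slide (3.5 ms seek, 73 MB/s to 125 MB/s throughput). The model (one seek plus a linear transfer, with 1 MB taken as 10^6 bytes) and the block sizes in the demo are assumptions for illustration.

```python
def read_time_ms(block_bytes: int, seek_ms: float, throughput_mb_s: float) -> float:
    """Estimated time to read one block: one seek plus the data transfer."""
    transfer_ms = block_bytes / (throughput_mb_s * 1_000_000) * 1000
    return seek_ms + transfer_ms


def seek_fraction(block_bytes: int, seek_ms: float, throughput_mb_s: float) -> float:
    """Share of the read time that is spent seeking."""
    return seek_ms / read_time_ms(block_bytes, seek_ms, throughput_mb_s)


if __name__ == "__main__":
    # Hypothetical block sizes: 4 KB, 64 KB, 1 MB, 64 MB.
    for size in (4096, 65536, 1_000_000, 64_000_000):
        print(size, round(read_time_ms(size, 3.5, 73), 3),
              round(seek_fraction(size, 3.5, 73), 3))
```

For tiny blocks nearly all of the read time is the seek, which a single drive cannot parallelize; for very large blocks the transfer dominates and throughput differences between tracks matter more.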

 

Page 86: Lecture I: Data Storage Security in Cloud Compujng

Drive – determine time limit

•  Recall the two examples
   –  Pizza store with 2 ovens: query 24 pizzas (12 steps)
   –  Cloud server with 3 drives: query 3 blocks (1 step)

•  Why use 12 steps instead of 1 step for the pizza store?
   –  To enlarge the gap between one oven and two ovens

•  How can we play the same trick on the cloud server and query q steps (cq blocks)?
   –  Solution: lock-step → make the queries to disks unpredictable

Page 87: Lecture I: Data Storage Security in Cloud Compujng

Lockstep Idea

•  Specify query Q in an initial step consisting of c random challenge blocks, one per drive

•  For each subsequent step, the set of c challenge blocks depends on the content of the file blocks accessed in the last step.

•  The server can proceed to the next step only after fully completing the last one.
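The lockstep idea can be sketched as a hash chain over block contents. This is an illustrative construction, not the exact one from the paper: SHA-256 is assumed as the derivation function, every drive is assumed to hold the same number of blocks, and the names are hypothetical.

```python
import hashlib
from typing import List, Sequence


def next_challenge(prev_blocks: Sequence[bytes], blocks_per_drive: int) -> List[int]:
    """Derive one block index per drive from the content read in the previous step."""
    material = b"".join(prev_blocks)
    indices = []
    for drive in range(len(prev_blocks)):
        digest = hashlib.sha256(material + drive.to_bytes(4, "big")).digest()
        indices.append(int.from_bytes(digest[:8], "big") % blocks_per_drive)
    return indices


def run_lockstep(drives: Sequence[Sequence[bytes]], first: Sequence[int],
                 steps: int) -> List[List[int]]:
    """Follow the challenge chain for `steps` steps and return the indices visited."""
    current = list(first)
    visited = [current]
    for _ in range(steps - 1):
        # The blocks of this step must be read before the next step is known.
        blocks = [drives[d][i] for d, i in enumerate(current)]
        current = next_challenge(blocks, len(drives[0]))
        visited.append(current)
    return visited


if __name__ == "__main__":
    drives = [[bytes([d, i]) for i in range(16)] for d in range(3)]
    print(run_lockstep(drives, [0, 1, 2], steps=4))
```

Because each step's indices depend on data read in the previous step, the server cannot prefetch later blocks, and a server with fewer physical drives pays an extra sequential read in every step.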

Page 88: Lecture I: Data Storage Security in Cloud Compujng

Gap, number of steps, time limit

•  Lock-step strengthens security as the number of steps increases

•  The more steps, the larger the gap

[Figure label: threshold]

Page 89: Lecture I: Data Storage Security in Cloud Compujng

Experiments: c = 5 drives

•  Gap between the maximum response time of an honest server and the minimum response time of an adversary

Page 90: Lecture I: Data Storage Security in Cloud Compujng

More Cloud Storage Security Related Topics

•  Proofs of data redundancy
•  Proofs of data encryption
•  Assured deletion
•  Proofs of geolocation
•  Proofs of ownership vs. deduplication
•  More to be identified…

Page 91: Lecture I: Data Storage Security in Cloud Compujng

Proofs of Data Encryption: Motivation

•  Public cloud has a large attack surface
   –  Thousands of computers
   –  Dozens of storage systems and interfaces
      •  Amazon alone: S3, EBS, Instance Storage, Glacier, Storage Gateway, CloudFront, RDS, DynamoDB, ElastiCache, CloudSearch, SQS
   –  Shared resources among thousands of tenants

•  Many possibilities for accidental data leakage.
   –  Data encryption is a must.

Slide credits: Stefanov et al.

Page 92: Lecture I: Data Storage Security in Cloud Compujng

Defending Against Accidental Data Leakage

•  Simple view:
   –  Just encrypt your data in the cloud.
   –  Problem solved?

Page 93: Lecture I: Data Storage Security in Cloud Compujng

Defending Against Accidental Data Leakage

•  More realistic view:
   –  We often want to use the cloud for more than just raw storage.
   –  Why? We want to outsource storage AND computation (services).
   –  In that case, the cloud needs access to your decrypted data.

Page 94: Lecture I: Data Storage Security in Cloud Compujng

Encrypt at Rest & Decrypt on the Fly

•  Split the cloud into a computation front-end and a storage back-end
   –  Already the case in many clouds (e.g., Amazon, Azure)

•  Storage back-end only sees encrypted data.

•  Computation front-end decrypts data on the fly
   –  Only accesses the data it really needs at any one time

•  Can be combined with tight access control and logging.
   –  Key servers

[Figure: Services Front End and Storage Back End, with possible leakage marked "???".]

Page 95: Lecture I: Data Storage Security in Cloud Compujng

Encrypt at Rest & Decrypt on the Fly

•  Protects against data leakage by the storage back-end infrastructure.

•  Limits the amount of data leakage by the front-end at any one time.

•  Common practice.

•  Much better than no encryption.

•  Complies with government regulations.

[Figure: Services Front End and Storage Back End, with possible leakage marked "???".]

Page 96: Lecture I: Data Storage Security in Cloud Compujng

The Problem

•  Lack of visibility
   –  Users only see results (e.g., web pages) from the front-end. What is happening internally?

•  Download data and check encryption?
   –  The cloud can always just encrypt on the fly.

•  Seems impossible!

How can we be reasonably sure that the cloud is encrypting data at rest? Plaintext is simpler for the cloud to manage.

Page 97: Lecture I: Data Storage Security in Cloud Compujng

One Proposed Solution

•  Impose financial penalties on misbehaving cloud providers.

•  Ensure that an economically rational cloud provider encrypts data at rest.

•  A misbehaving cloud must use double storage.
   –  It must store both the decrypted and the encrypted file.

Economically motivate the cloud to encrypt data at rest.

Page 98: Lecture I: Data Storage Security in Cloud Compujng

One Solution: Hourglass Schemes

[Figure: Original File → (encryption) → Encrypted File → (hourglass) → Encapsulated File. Labels on the figure: client uploads file; client assists; client verifies encryption; client verifies by periodically challenging random file blocks.]

•  The client never needs to permanently store and manage keys.

Page 99: Lecture I: Data Storage Security in Cloud Compujng

Intuition

[Figure: Original File → (encryption) → Encrypted File → (hourglass) → Encapsulated File. The client checks the encapsulated file; an adversarial cloud wants to store only the original file.]

Hourglass property: costly to compute "on the fly"

So an adversarial cloud must store both files.

Double the storage!

Page 100: Lecture I: Data Storage Security in Cloud Compujng

Hourglass Framework: More than a Scheme

Modular Components

•  Encodings:
   –  Encryption
   –  Watermarking
   –  File Bindings

•  Hourglass functions:
   –  Butterfly
   –  Permutation
   –  RSA

Page 101: Lecture I: Data Storage Security in Cloud Compujng

Encodings

•  Encryption: G = E(F)

•  Watermarking: G = F || Tag
   –  Embed a tag into the file
   –  Tag says that the file is stored on a specific cloud
   –  Tag signed by the cloud
   –  Evidence of data leakage origin.

•  File Binding: G = F1 || F2 || … || Fm
   –  Combine multiple files into one encoding.
   –  E.g., embedded license.

Page 102: Lecture I: Data Storage Security in Cloud Compujng

Hourglass Functions

•  Costly to apply "on the fly"
•  Impose a resource lower bound on the cloud to compute G → H, and hence F → H

[Figure: Original File F → encoding (e.g., encryption) → Encrypted File G → hourglass → Encapsulated File H]

Page 103: Lecture I: Data Storage Security in Cloud Compujng

Hourglass Function: RSA

[Figure: F: F1 F2 F3 F4 … Fn → G: G1 G2 G3 G4 … Gn → H: H1 H2 H3 H4 … Hn]

•  Apply encoding (encryption, watermarking, file binding) to obtain G from F.

•  Client computes Hi = RSA-Sign(Gi) using a random RSA private key.

•  Cloud can always recover the plaintext:
   –  Gi = RSA-recoverMessage(Hi) (using the client's public RSA key)
   –  Fi = Decode(Gi)

•  Resource bound: computation
   –  Computing F → H is completely infeasible for the cloud
   –  It doesn't have the RSA signing key needed to compute G → H
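The asymmetry of the RSA hourglass function can be illustrated with toy textbook RSA. This is a sketch only: the primes are tiny, there is no padding, and the function names are hypothetical, so it is not secure and is not the construction from the paper. It shows that the public key suffices for H → G, while G → H needs the private exponent held by the client.

```python
# Toy textbook RSA with tiny primes: illustration only, NOT secure.
P, Q = 61, 53
N = P * Q                              # public modulus
E = 17                                 # public exponent (known to the cloud)
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (client only, Python 3.8+)


def encapsulate(g_blocks):
    """Client side: H_i = G_i^D mod N (a raw RSA 'signature' of each block)."""
    return [pow(g, D, N) for g in g_blocks]


def recover(h_blocks):
    """Cloud side: G_i = H_i^E mod N, using only the public key (E, N)."""
    return [pow(h, E, N) for h in h_blocks]


if __name__ == "__main__":
    G = [65, 123, 1000, 3000]          # encoded blocks, each smaller than N
    H = encapsulate(G)
    print(H, recover(H) == G)
```

A cloud that stores only F or G cannot rebuild H when challenged, because it never learns D.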

Page 104: Lecture I: Data Storage Security in Cloud Compujng

Hourglass Function: Permutation

[Figure: F: F1 F2 F3 F4 … Fn → G: G1 G2 G3 G4 … Gn → H: H1 H2 H3 H4 … Hn]

•  Apply encoding (encryption, watermarking, file binding) to obtain G from F.

•  Randomly permute the blocks of G to form H. No cryptographic operations. Operates on tiny blocks.

•  Client later challenges the cloud for sequential ranges of H.
   –  Sequential range in H → random blocks in G → random blocks in F

•  Resource bound: disk seeks
   –  A misbehaving cloud (that only stores F) will need to do many random accesses to respond to a challenge.
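A minimal sketch of the permutation idea. It assumes a seeded shuffle from Python's `random` module as a stand-in for the permutation; a real scheme would presumably use a keyed pseudorandom permutation, and the function names here are illustrative.

```python
import random
from typing import List, Sequence


def make_permutation(n: int, seed: int) -> List[int]:
    """perm[j] is the index in G of the block stored at position j of H."""
    perm = list(range(n))
    random.Random(seed).shuffle(perm)
    return perm


def encapsulate(g_blocks: Sequence[bytes], perm: Sequence[int]) -> List[bytes]:
    """Build H by placing block G[perm[j]] at position j."""
    return [g_blocks[i] for i in perm]


def source_indices(perm: Sequence[int], start: int, length: int) -> List[int]:
    """Blocks of G needed to answer a challenge for H[start : start + length]."""
    return list(perm[start:start + length])


if __name__ == "__main__":
    g = [i.to_bytes(2, "big") for i in range(1000)]
    perm = make_permutation(len(g), seed=7)
    h = encapsulate(g, perm)
    print(source_indices(perm, 0, 10))
```

A cloud that stores H answers the challenge with one sequential read, while a cloud that stores only F or G must fetch the scattered indices returned by `source_indices`, paying roughly one seek per block on a rotational drive.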

Page 105: Lecture I: Data Storage Security in Cloud Compujng

Hourglass Function: Butterfly

[Figure: butterfly network over blocks G1 G2 G3 G4 G5 G6 G7 G8]

w = a known-key PRP over a pair of file blocks

Page 106: Lecture I: Data Storage Security in Cloud Compujng

Comparison of Hourglass Functions

                RSA                    Butterfly         Permutation
Operations      RSA exponentiations    AES operations    random memory accesses
Assumptions     RSA assumptions        storage speed     seek inefficiency in rotational drives

Moving from RSA toward Permutation: more practical, more assumptions. Moving from Permutation toward RSA: less practical, fewer assumptions.

Page 107: Lecture I: Data Storage Security in Cloud Compujng

Comparison of Hourglass Functions

Ran on Amazon EC2 (using a quadruple-extra-large high-memory instance and EBS storage).

Page 108: Lecture I: Data Storage Security in Cloud Compujng

Challenge-Response Protocol

•  The client challenges the cloud for blocks of the encapsulated file H.
   –  At random, unpredictable times
   –  Few challenges, e.g., O(log n)

•  Cloud must respond quickly.

•  Doable by an external auditor.
   –  Auditor doesn't see the plaintext F.

[Figure: H: H1 H2 H3 H4 … Hn]

Page 109: Lecture I: Data Storage Security in Cloud Compujng

Limitations

•  Assumes files are not accessed too often.
   –  Great for archiving files.

•  File updates are costly.
   –  The RSA hourglass function allows for updates.
   –  Other hourglass functions must be re-applied to the entire file.

•  Works mainly for large files.

Page 110: Lecture I: Data Storage Security in Cloud Compujng

More Cloud Storage Security Related Topics

•  Proofs of data redundancy
•  Proofs of data encryption
•  Assured deletion
•  Proofs of geolocation
•  Proofs of ownership vs. deduplication
•  More to be identified…

Page 111: Lecture I: Data Storage Security in Cloud Compujng

Assured Data Deletion: Motivation

•  After outsourcing, can we reliably remove data from the cloud?
   –  We don't want backups to exist after a pre-defined time
      •  e.g., to avoid future exposure due to a data breach or mismanagement by operators
   –  If an employee quits, we want to remove his/her data
      •  e.g., to avoid legal liability

•  Cloud makes backup copies. We don't know if all backup copies are reliably removed.

•  We need assured deletion:
   –  Data becomes inaccessible upon a deletion request

Slide credits: Patrick Lee et al.

Page 112: Lecture I: Data Storage Security in Cloud Compujng

One Solution: FADE (SecureComm '10)

•  FADE: an overlay cloud storage system with file assured deletion

[Figure: the data owner runs FADE, which interacts with the key manager(s) and stores the encrypted file and a metadata file in the cloud.]

•  FADE decouples key management and data management
•  Key manager can be flexibly deployed at a trusted third party or within the data owner
•  No implementation changes on the cloud

Page 113: Lecture I: Data Storage Security in Cloud Compujng

Threat Models and Assumptions

•  File assured deletion is achieved
   –  If we request to delete a file, it becomes inaccessible

•  Key manager is minimally trusted
   –  It can reliably remove keys of revoked policies
   –  It can be compromised, but only files with active policies can be recovered

•  Data owner forms an authenticated channel with the key manager for key management operations

Page 114: Lecture I: Data Storage Security in Cloud Compujng

Policy-based File Assured Deletion

•  Each file is associated with a data key and a file access policy
•  Each policy is associated with a control key
•  All control keys are maintained by a key manager
•  When a policy is revoked, its respective control key is removed from the key manager

Page 115: Lecture I: Data Storage Security in Cloud Compujng

Policy-based File Assured Deletion

•  Main idea:
   –  File protected with data key
   –  Data key protected with control key

[Figure: the file is protected by the data key, which is protected by the control key; the control key is maintained by the key manager.]

Page 116: Lecture I: Data Storage Security in Cloud Compujng

Policy-based File Assured Deletion

•  When a policy is revoked, the control key is removed. The encrypted data key, and hence the encrypted file, cannot be recovered

•  The file is deleted in the sense that, even if a copy exists, it is encrypted and inaccessible to everyone

[Figure: the file is encrypted under the data key; the data key cannot be recovered without the control key.]
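The two-layer key idea can be sketched as follows. This is a simplified illustration, not FADE's actual protocol: the cipher is a toy SHA-256 counter-mode keystream (not secure), control keys are symmetric, and the `KeyManager` interface is hypothetical.

```python
import hashlib
import os
from typing import Dict, Tuple


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode); NOT secure, illustration only."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


class KeyManager:
    """Holds one control key per policy (hypothetical interface)."""

    def __init__(self) -> None:
        self._control_keys: Dict[str, bytes] = {}

    def create_policy(self, policy: str) -> None:
        self._control_keys[policy] = os.urandom(32)

    def wrap(self, policy: str, data_key: bytes) -> bytes:
        return _keystream_xor(self._control_keys[policy], data_key)

    def unwrap(self, policy: str, wrapped: bytes) -> bytes:
        # Raises KeyError once the policy has been revoked.
        return _keystream_xor(self._control_keys[policy], wrapped)

    def revoke(self, policy: str) -> None:
        self._control_keys.pop(policy, None)


def store(km: KeyManager, policy: str, plaintext: bytes) -> Tuple[bytes, bytes]:
    """Encrypt the file under a fresh data key; wrap the data key under the policy."""
    data_key = os.urandom(32)
    return _keystream_xor(data_key, plaintext), km.wrap(policy, data_key)


def fetch(km: KeyManager, policy: str, ciphertext: bytes, wrapped_key: bytes) -> bytes:
    """Unwrap the data key through the key manager, then decrypt the file."""
    return _keystream_xor(km.unwrap(policy, wrapped_key), ciphertext)
```

After `revoke`, the ciphertext and the wrapped data key may still sit in the cloud or in backups, but `fetch` fails because the control key no longer exists.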

Page 117: Lecture I: Data Storage Security in Cloud Compujng

More Cloud Storage Security Related Topics

•  Proofs of data redundancy
•  Proofs of data encryption
•  Assured deletion
•  Proofs of geolocation
•  Proofs of ownership vs. deduplication
•  More to be identified…

Page 118: Lecture I: Data Storage Security in Cloud Compujng

Proofs of Geolocation of Data

•  Motivation comes from regulatory compliance.
   –  Many laws require storage providers to keep customer data within, say, national boundaries

•  One open problem is the remote verification of the geographical location of cloud data.
   –  Of particular commercial interest

Page 119: Lecture I: Data Storage Security in Cloud Compujng

Proofs of Geolocation of Data

•  Given the challenge of ensuring that data is not duplicated, any solution probably requires
   –  a trusted data-management system, e.g., via trusted hardware
   –  localizing the pieces of the above system.

•  A promising direction for exploration
   –  Geolocation of trusted hardware via remote timing from trusted anchor points.

Page 120: Lecture I: Data Storage Security in Cloud Compujng

More Cloud Storage Security Related Topics

•  Proofs of data redundancy
•  Proofs of data encryption
•  Assured deletion
•  Proofs of geolocation
•  Proofs of ownership vs. deduplication
•  More to be identified…

Page 121: Lecture I: Data Storage Security in Cloud Compujng

Attacks and Motivations

•  Many cloud storage providers deduplicate the files that their users have stored online.
   –  They usually use a file hash to detect duplicates and keep a single copy of the original file
   –  This saves storage and bandwidth cost

•  An adversary can simply leverage the file hash to become one of the file owners.

Page 122: Lecture I: Data Storage Security in Cloud Compujng

Attacks and Motivations

[Figure: the data owner uploads File1 to the cloud, which stores (File1, hash1). The cloud uses hash1 to detect future upload requests of File1.]


Page 124: Lecture I: Data Storage Security in Cloud Compujng

Attacks and Motivations

[Figure: the data owner uploads File1 to the cloud, which stores (File1, hash1) and uses hash1 to detect future upload requests of File1. An adversary sends: "Request to upload File1, here is its hash1".]

The adversary becomes one of the owners of File1 using only the file hash.

Page 125: Lecture I: Data Storage Security in Cloud Compujng

Proofs of Ownership (POW)

•  POW is not proof of storage
   –  No preprocessing step
   –  Client has less power and space

•  The basic idea:
   –  Server challenges the client
   –  Client has to prove that he has the file
   –  A client that does not have the file can convince the server that he has it only with negligible probability

Page 126: Lecture I: Data Storage Security in Cloud Compujng

Solution Highlight

•  Solution 1: Proofs of a random portion of the file
   –  Use a Merkle Hash Tree (MHT) over the file
      •  Client sends the root of the MHT, built over blocks of the file
      •  Server asks for random leaves to verify
   –  If the file has low entropy, encode it first with an erasure code
      •  to enlarge the unknown file portion, making it less predictable

•  Solution 2: Proofs of a random portion of a summary of the file
   –  Assume a buffer bounded by the user's memory size
   –  Build the MHT over the buffer only

•  Other advanced solutions have also been proposed
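Solution 1 can be sketched with a small Merkle tree. This is an illustrative construction, not the exact protocol from the Halevi et al. paper: SHA-256, the leaf and node prefixes, and the rule of duplicating the last node on odd levels are assumptions, and the file is assumed to have at least one block. The verifier only needs the root; the prover must produce the requested block together with its sibling path.

```python
import hashlib
from typing import List


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(blocks: List[bytes]) -> List[List[bytes]]:
    """Return all tree levels, leaves first; odd levels duplicate their last node."""
    level = [_h(b"\x00" + b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        level = [_h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels: List[List[bytes]], index: int) -> List[bytes]:
    """Sibling hashes on the path from leaf `index` up to the root."""
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        path.append(level[index ^ 1])
        index //= 2
    return path


def verify(root: bytes, block: bytes, index: int, path: List[bytes]) -> bool:
    """Recompute the root from a claimed block and its sibling path."""
    node = _h(b"\x00" + block)
    for sibling in path:
        if index % 2 == 0:
            node = _h(b"\x01" + node + sibling)
        else:
            node = _h(b"\x01" + sibling + node)
        index //= 2
    return node == root
```

An adversary who knows only the file hash cannot produce valid leaf blocks and paths for randomly chosen indices, which is what separates this check from plain hash-based deduplication.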

Page 127: Lecture I: Data Storage Security in Cloud Compujng

To learn more

•  K. Bowers, M. van Dijk, A. Juels, A. Oprea, and R. Rivest. How to Tell if Your Cloud Files Are Vulnerable to Drive Crashes. In Proc. of CCS, 2011.

•  M. van Dijk, A. Juels, A. Oprea, R. Rivest, E. Stefanov, and N. Triandopoulos. Hourglass Schemes: How to Prove that Cloud Files Are Encrypted. In Proc. of CCS, 2012.

•  Y. Tang, P. P. C. Lee, J. C. S. Lui, and R. Perlman. Secure Overlay Cloud Storage with Access Control and Assured Deletion. IEEE TDSC, Vol. 9, No. 6, pp. 903-916, 2012.

•  A. Juels and A. Oprea. New Approaches to Security and Availability for Cloud Data. Commun. ACM 56(2): 64-73 (2013).

•  S. Halevi, D. Harnik, B. Pinkas, and A. Shulman-Peleg. Proofs of Ownership in Remote Storage Systems. In Proc. of CCS, 2011.

Page 128: Lecture I: Data Storage Security in Cloud Compujng

To learn even more

•  A. Juels and B. Kaliski. PORs: Proofs of Retrievability for Large Files. In Proc. of CCS, 2007.

•  K. D. Bowers, A. Juels, and A. Oprea. HAIL: A High-Availability and Integrity Layer for Cloud Storage. In Proc. of CCS, 2009.

•  K. Bowers, A. Juels, and A. Oprea. Proofs of Retrievability: Theory and Implementation. In Proc. of CCSW, 2009.

•  G. Ateniese, S. Kamara, and J. Katz. Proofs of Storage from Homomorphic Identification Protocols. In Proc. of ASIACRYPT, 2009, pp. 319-333.

•  Y. Dodis, S. Vadhan, and D. Wichs. Proofs of Retrievability via Hardness Amplification. In Proc. of TCC, 2009, pp. 109-127.

•  G. Ateniese, et al. Remote Data Checking Using Provable Data Possession. ACM Trans. Inf. Syst. Secur. 14(1): 12 (2011).

•  H. Shacham and B. Waters. Compact Proofs of Retrievability. J. Cryptology 26(3): 442-483 (2013).