Online Auditing - How may Auditors Inadvertently Compromise Your Privacy Kobbi Nissim Microsoft With Nina Mishra HP/Stanford Work in progress

Post on 17-Dec-2015


2

The Setting

• Dataset: d = {d1,…,dn}
– Entries di: Real, Integer, Boolean

• Query: q = (f, i1,…,ik)
– f: Min, Max, Median, Sum, Average, Count…

• Bad users will try to breach the privacy of individuals

[Figure: a user sends the query q = (f, i1,…,ik) to the statistical database, which returns f(di1,…,dik)]
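To make the notation concrete, here is a minimal Python sketch of how a statistical database might evaluate a query q = (f, i1,…,ik). The names `AGGREGATES` and `answer_query` are illustrative assumptions and do not come from the slides; indices are 1-based to match the slide notation.

```python
from statistics import median

# Hypothetical table of supported aggregate functions f (names are assumptions).
AGGREGATES = {
    "min": min,
    "max": max,
    "median": median,
    "sum": sum,
    "average": lambda xs: sum(xs) / len(xs),
    "count": len,
}

def answer_query(d, q):
    """Evaluate q = (f, i1, ..., ik) on dataset d, using 1-based indices."""
    f, *indices = q
    selected = [d[i - 1] for i in indices]
    return AGGREGATES[f](selected)

d = [3, 5, 7]
print(answer_query(d, ("sum", 1, 2, 3)))
print(answer_query(d, ("max", 1, 2)))
```

This sketch has no privacy protection at all; it only fixes what "answering a query" means in the rest of the deck.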

3


The Data Privacy Game: an Information-Privacy Tradeoff

• Private functions:
– Want to hide the projections d → di (the individual entries)

• Information functions:
– Want to reveal query answers f(di1,…,dik)

• Major question: what may be computed over d (and given to users) without breaching privacy?

• Confidentiality control methods
– Perturbation methods: give `noisy’ answers
– Query restriction methods: limit the queries users may post, usually imposing some structure (e.g. size/overlap restrictions)

4

Auditing

• [AW89] classify auditing as a query restriction method:
– “Auditing of an SDB involves keeping up-to-date logs of all queries made by each user (not the data involved) and constantly checking for possible compromise whenever a new query is issued”

• Partial motivation: May allow for more queries to be posed, if no privacy threat occurs

• Early work: Hofmann 1977, Schlorer 1976, Chin, Ozsoyoglu 1981, 1986

• Recent interest: Kleinberg, Papadimitriou, Raghavan 2000, Li, Wang, Wang, Jajodia 2002, Jonsson, Krokhin 2003

5

Auditing

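[Figure: the auditor sits between the user and the statistical database and keeps the query log q1,…,qi. The user says "Here’s a new query: qi+1". The auditor replies either "Here’s the answer" OR "Query denied (as the answer would cause privacy loss)".]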

6

Design choices in Prior Work (1)

1. Privacy definition:
– Privacy breached (only) when a database entry may be deduced fully, or within some accuracy
– These privacy guarantees do not generally suffice:
• Should take into account: adversary’s computational power, prior knowledge, access to other databases…

2. Exact answers given
– Auditors viewed as a way to give `quality’ answers???

7

Design choices in Prior Work (2)

3. Which information is taken into account in the auditor decision procedure:
– Decision made based on queries q1,…,qi, qi+1 and their answers a1,…,ai, ai+1
• Denials ignored

4. Offline vs. Online:
• Offline auditing: queries and answers checked for compromise at the end of the day
– Only detect breaches
• Online auditing: answer/deny queries on the fly
– Prevent breaches just before they happen

8

Example 1: Sum/Max auditing

Setting: di real, sum/max queries, privacy breached if some di learned

User: q1 = sum(d1,d2,d3)

Auditor: sum(d1,d2,d3) = 15

User: q2 = max(d1,d2,d3)

Auditor: Denied (the answer would cause privacy loss)

User: Oh well…

9

Some Prior Work on Auditors

Work                        Data         Queries   Breach               Complexity
Sum/Max [Chin]              real         Sum/max   di learned           NP-hard
Boolean [KPR00]             0/1          Sum       di learned           NP-hard*
Max [KPR00]                 Real         Max       di learned           PTIME
Interval based [LWWJ02]     di ∈ [a,b]   sum       di within accuracy   PTIME
Generalized results [JK03]                                              NP-hard / PTIME

* Approx version in PTIME

Can we use the offline version for online auditing?

10

… After Two Minutes …

Setting: di real, sum/max queries, privacy breached if some di learned

User: q1 = sum(d1,d2,d3)

Auditor: sum(d1,d2,d3) = 15

User: q2 = max(d1,d2,d3)

Auditor: Denied (the answer would cause privacy loss)

User: Oh well… There must be a reason for the denial…

User: q2 is denied iff d1=d2=d3 = 5

User: I win!
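The leak in this dialogue can be reproduced with a small sketch. It assumes a naive auditor that looks at the true answer of max(d1,…,dn) after sum(d1,…,dn) has been released, and denies exactly when that answer would pin down an entry, which for this query pair happens when all entries are equal. The function names `naive_auditor` and `attacker` are hypothetical, and entries are taken to be ints or Fractions so the arithmetic is exact.

```python
from fractions import Fraction

def naive_auditor(d):
    """Answer sum over all entries, then answer or deny max over all entries.

    Returns (sum_answer, max_answer); max_answer is None when denied.
    Assumption: the only breach for this query pair is n * max == sum,
    i.e. every entry equals the maximum.
    """
    total = sum(d)
    largest = max(d)
    if len(d) * largest == total:
        return total, None          # denial depends on the data itself
    return total, largest

def attacker(total, max_reply, n):
    """Turn a denial into full knowledge of the dataset."""
    if max_reply is None:
        # Denied iff all entries are equal, so each one is total / n.
        return [Fraction(total, n)] * n
    return None                     # no entry is determined in this case

print(attacker(*naive_auditor([5, 5, 5]), 3))
print(attacker(*naive_auditor([4, 5, 6]), 3))
```

The point of the sketch is that the denial is a function of the hidden data, so it carries information just as an answer would.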

11

Example 2: Interval Based Auditing

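Setting: di ∈ [0,100], sum queries, accuracy = 1 (PTIME)

User: q1 = sum(d1,d2)

Auditor: Sorry, denied

User: q2 = sum(d2,d3)

Auditor: sum(d2,d3) = 50

User's inference from the denial: d1,d2 ∈ [0,1] or [99,100]

User's inference after q2: d1,d2 ∈ [0,1], d3 ∈ [49,50]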
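The inference in this example can be checked with a short sketch. It assumes that a breach means some entry is confined to an interval of length at most 1, and that the auditor denies a two-entry sum exactly when its true answer would cause that. The constants and function names (`LO`, `HI`, `WIDTH`, `would_deny`, `attacker_inference`) are illustrative, not from the slides.

```python
LO, HI, WIDTH = 0.0, 100.0, 1.0   # data range and assumed accuracy threshold

def pair_sum_interval(s):
    """Range of either entry consistent with a two-entry sum s, entries in [LO, HI]."""
    return max(LO, s - HI), min(HI, s)

def would_deny(s):
    """Naive decision: deny if the true answer s confines an entry to width <= WIDTH."""
    lo, hi = pair_sum_interval(s)
    return hi - lo <= WIDTH

def attacker_inference(s2):
    """Combine 'sum(d1,d2) was denied' with the answer s2 = sum(d2,d3).

    The denial means d1 + d2 <= WIDTH or d1 + d2 >= 2*HI - WIDTH.
    The high case needs d2 >= HI - WIDTH, which s2 may rule out.
    Returns (interval for d1 and d2, interval for d3), or None if undecided.
    """
    d2_max = min(HI, s2 - LO)             # d3 >= LO forces d2 <= s2 - LO
    if d2_max < HI - WIDTH:               # high case impossible
        d12 = (LO, LO + WIDTH)
        d3 = (s2 - d12[1], s2 - d12[0])
        return d12, d3
    return None

print(would_deny(0.7), would_deny(50.0))
print(attacker_inference(50.0))
```

With the answer 50 from the slide, the high branch is excluded, which matches the stated conclusion d1,d2 ∈ [0,1] and d3 ∈ [49,50].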

12

Sounds Familiar?

Colonel Oliver North, on the Iran-Contra Arms Deal:

"On the advice of my counsel I respectfully and regretfully decline to answer the question based on my constitutional rights."

David Duncan, former auditor for Enron and partner in Andersen:

"Mr. Chairman, I would like to answer the committee's questions, but on the advice of my counsel I respectfully decline to answer the question based on the protection afforded me under the Constitution of the United States."

13

Max Auditing

Setting: di real

Database: d1 d2 d3 d4 d5 d6 d7 d8 … dn-1 dn

User: q1 = max(d1,d2,d3,d4)

Auditor: M1234

User: q2 = max(d1,d2,d3)

Auditor: M123 / denied. If denied: d4 = M1234

User: q3 = max(d1,d2)

Auditor: M12 / denied. If denied: d3 = M123

14

Adversary’s Success

q1 = max(d1,d2,d3,d4)

q2 = max(d1,d2,d3): denied with probability 1/4. If denied: d4 = M1234

q3 = max(d1,d2): denied with probability 1/3. If denied: d3 = M123

Success probability: 1/4 + (1 − 1/4)·1/3 = 1/2

Recover 1/8 of the database!
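The success probability can be verified by enumerating all orderings of four distinct values. The sketch below assumes the naive auditor denies a max query exactly when the dropped entry was the previous maximum, as described on the slide; `attack_group` is a hypothetical name.

```python
from fractions import Fraction
from itertools import permutations

def attack_group(d):
    """Run the three-query attack on four distinct reals.

    Returns (index, value) of the entry the attacker learns, or None.
    """
    m1234 = max(d)                 # q1 is always answered
    if max(d[:3]) < m1234:         # q2 denied: answering would expose d4
        return 3, m1234
    m123 = max(d[:3])              # q2 answered, and equals m1234
    if max(d[:2]) < m123:          # q3 denied: answering would expose d3
        return 2, m123
    return None

orders = list(permutations([1.0, 2.0, 3.0, 4.0]))
results = [attack_group(p) for p in orders]
success = Fraction(sum(r is not None for r in results), len(orders))

print(success)        # probability of learning one entry per group of four
print(success / 4)    # expected fraction of the database recovered
```

Each successful group yields one of its four entries, so a success probability of 1/2 per group corresponds to 1/8 of the database overall.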

15

Boolean Auditing?

Setting: di Boolean

Database: d1 d2 d3 d4 d5 d6 d7 d8 … dn-1 dn

User: q1 = sum(d1,d2)

Auditor: 1 / denied

User: q2 = sum(d2,d3)

Auditor: 1 / denied

qi denied iff di = di+1 ⇒ learn database/complement

Let di,dj,dk not all equal, where qi-1, qi, qj-1, qj, qk-1, qk all denied

User: final query sum(di,dj,dk)

Auditor: 1 / 2

Recover the entire database!
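A minimal sketch of this attack follows. It assumes the naive auditor denies sum(di,di+1) exactly when the two bits are equal, and that the final three-entry query is one the auditor answers; the slide's condition on denied neighbouring queries is not modelled here. The names `pair_reply` and `recover` are hypothetical.

```python
def pair_reply(d, i):
    """Naive auditor's reply to sum(d[i], d[i+1]); None means denied."""
    s = d[i] + d[i + 1]
    return None if s in (0, 2) else s    # 0 or 2 would expose both bits

def recover(d_secret):
    """Recover a Boolean database from the answer/denial pattern plus one sum."""
    n = len(d_secret)
    # rel[i] = d[i] XOR d[0], learned only from which pair queries are denied.
    rel = [0]
    for i in range(n - 1):
        same = pair_reply(d_secret, i) is None
        rel.append(rel[-1] if same else 1 - rel[-1])
    if n < 3 or 1 not in rel:
        return None                      # constant database: complement unresolved
    j = rel.index(1)
    k = next(x for x in range(1, n) if x != j)
    expected_if_d0_is_0 = rel[0] + rel[j] + rel[k]          # 1 or 2
    # Final query sum(d0, dj, dk), assumed to be answered by the auditor.
    actual = d_secret[0] + d_secret[j] + d_secret[k]
    d0 = 0 if actual == expected_if_d0_is_0 else 1
    return [r ^ d0 for r in rel]

secret = [1, 1, 0, 1, 0, 0]
print(recover(secret) == secret)
```

The pair queries fix the database up to complement; the single answered three-entry sum distinguishes the two candidates because their sums differ (t versus 3 − t).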

16

Two Problems

• Obvious problem: denied queries ignored
– Algorithmic problem: not clear how to incorporate denials in the decision

• Subtle problem:
– Query denials leak (potentially sensitive) information
• Users cannot decide denials by themselves

[Figure: within the set of possible assignments to {d1,…,dn}, the subset of assignments consistent with (q1,…,qi, a1,…,ai), and inside it the region where qi+1 is denied]

17

A Spectrum of Auditors

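Auditor decision based on:

– q1,…,qi, qi+1 only (size overlap restriction, algebraic structure): “Safe”

– q1,…,qi, qi+1 and a1,…,ai: “Safe”

– q1,…,qi, qi+1 and a1,…,ai, ai+1: “Unsafe”

[Figure: the three auditor types are placed along an axis labelled utility and privacy]

*Note: can work in “unsafe” region, but need to prove denials do not leak crucial information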

18

Simulatable Auditing*

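An auditor is simulatable if a simulator exists s.t.:

– Auditor: receives qi+1, has the log q1,…,qi and access to the statistical database, and outputs deny/answer

– Simulator: receives qi+1, has only q1,…,qi and a1,…,ai, and outputs the same deny/answer decision

Simulation ⇒ denials do not leak information

* `self auditors’ in [DN03]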
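The contrast between a naive and a simulatable decision can be sketched by brute force. The code below is an illustration under strong assumptions, not the construction from the talk: entries come from a small finite integer domain so that all datasets can be enumerated, a breach means some entry is uniquely determined, and the simulatable rule is "deny if any answer consistent with the log would cause a breach". All function names are hypothetical, and indices are 0-based.

```python
from itertools import product

FUNCS = {"sum": sum, "max": max}

def evaluate(q, d):
    f, idx = q
    return FUNCS[f]([d[i] for i in idx])

def consistent(domain, n, log):
    """All datasets in domain^n that agree with every answered (query, answer)."""
    return [d for d in product(domain, repeat=n)
            if all(evaluate(q, d) == a for q, a in log)]

def breached(candidates):
    """True if some entry has the same value in every candidate dataset."""
    n = len(candidates[0])
    return any(len({d[i] for d in candidates}) == 1 for i in range(n))

def naive_decision(domain, n, log, q, true_d):
    """Looks at the true answer, so the decision itself depends on the data."""
    a = evaluate(q, true_d)
    return "deny" if breached(consistent(domain, n, log + [(q, a)])) else "answer"

def simulatable_decision(domain, n, log, q):
    """Uses only past queries, past answers and q; never touches the data."""
    cands = consistent(domain, n, log)
    for a in {evaluate(q, d) for d in cands}:
        if breached([d for d in cands if evaluate(q, d) == a]):
            return "deny"
    return "answer"

domain, n = range(0, 11), 3
log = [(("sum", (0, 1, 2)), 15)]
q = ("max", (0, 1, 2))
print(naive_decision(domain, n, log, q, (5, 5, 5)))
print(naive_decision(domain, n, log, q, (4, 5, 6)))
print(simulatable_decision(domain, n, log, q))
```

Because `simulatable_decision` takes no dataset argument, a user holding the same log could compute the decision alone, so a denial tells the user nothing new. The price is utility: it denies the max query for every dataset with sum 15, not only for (5, 5, 5).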

19

Why Simulatable Auditors do not Leak Information?

[Figure: within the set of possible assignments to {d1,…,dn}, the subset of assignments consistent with (q1,…,qi, a1,…,ai), labelled "qi+1 denied/allowed"]

20

Summary

• Improper usage of auditors may lead to privacy breaches, due to information leakage in the decision procedure.
– Cell suppression / some k-anonymity methods should be checked similarly
– Should make sure offline auditors do not leak information in decision

• Simulatable auditors provably don’t leak information
– Give best utility while still “safe”
– A launching point for further research on auditors

• Further research:
– Auditors with more reasonable privacy guarantees