Online Auditing - How may Auditors Inadvertently
Compromise Your Privacy
Kobbi Nissim
Microsoft
With Nina Mishra HP/Stanford
Work in progress
2
The Setting
• Dataset: d = {d1,…,dn}
  – Entries di: Real, Integer, Boolean
• Query: q = (f, i1,…,ik)
  – f: Min, Max, Median, Sum, Average, Count, …
• Bad users will try to breach the privacy of individuals
[Figure: a user sends a query q = (f, i1,…,ik) to the statistical database, which returns f(di1,…,dik).]
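The query model above can be sketched in a few lines of Python. This is only an illustration: the names `AGGREGATES` and `answer_query` are hypothetical, and the 1-based indexing simply mirrors the slide's notation.

```python
from statistics import median

# Hypothetical helper names; the slide only defines q = (f, i1, ..., ik).
AGGREGATES = {
    "min": min,
    "max": max,
    "median": median,
    "sum": sum,
    "average": lambda xs: sum(xs) / len(xs),
    "count": len,
}

def answer_query(d, q):
    """Evaluate q = (f, i1, ..., ik) on dataset d (1-based indices, as on the slide)."""
    f, *indices = q
    selected = [d[i - 1] for i in indices]
    return AGGREGATES[f](selected)

d = [4, 9, 2, 7]
print(answer_query(d, ("sum", 1, 2, 3)))  # d1 + d2 + d3
print(answer_query(d, ("max", 2, 4)))     # max(d2, d4)
```

No privacy control is applied here; the auditor discussed on the following slides would sit between the user and `answer_query`.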
3
The Data Privacy Game: an Information-Privacy Tradeoff
• Private functions:
  – Want to hide the individual entries: d ↦ di
• Information functions:
  – Want to reveal query answers f(di1,…,dik)
• Major question: what may be computed over d (and given to users) without breaching privacy?
• Confidentiality control methods
  – Perturbation methods: give 'noisy' answers
  – Query restriction methods: limit the queries users may pose, usually imposing some structure (e.g. size/overlap restrictions)
4
Auditing
• [AW89] classify auditing as a query restriction method:
  – “Auditing of an SDB involves keeping up-to-date logs of all queries made by each user (not the data involved) and constantly checking for possible compromise whenever a new query is issued”
• Partial motivation: May allow for more queries to be posed, if no privacy threat occurs
• Early work: Hofmann 1977, Schlorer 1976, Chin, Ozsoyoglu 1981, 1986
• Recent interest: Kleinberg, Papadimitriou, Raghavan 2000, Li, Wang, Wang, Jajodia 2002, Jonsson, Krokhin 2003
5
Auditing
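[Figure: the user sends a new query qi+1 to the auditor, which keeps the query log q1,…,qi and sits in front of the statistical database. The auditor replies either "Here's the answer" or "Query denied (as the answer would cause privacy loss)".]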
6
Design choices in Prior Work (1)
1. Privacy definition:
  – Privacy breached (only) when a database entry may be deduced fully, or within some accuracy
  – These privacy guarantees do not generally suffice:
    • Should take into account: adversary's computational power, prior knowledge, access to other databases…
2. Exact answers given
  – Auditors viewed as a way to give 'quality' answers???
7
Design choices in Prior Work (2)
3. Which information is taken into account in the auditor decision procedure:
  – Decision made based on queries q1,…,qi, qi+1 and their answers a1,…,ai, ai+1
    • Denials ignored
4. Offline vs. Online:
  – Offline auditing: queries and answers checked for compromise at the end of the day
    • Only detect breaches
  – Online auditing: answer/deny queries on the fly
    • Prevent breaches just before they happen
8
Example 1: Sum/Max auditing
di real, sum/max queries, privacy breached if some di learned
User: q1 = sum(d1,d2,d3)
Auditor: sum(d1,d2,d3) = 15
User: q2 = max(d1,d2,d3)
Auditor: Denied (the answer would cause privacy loss)
User: Oh well…
9
Some Prior Work on Auditors
                            Data         Queries   Breach               Complexity
Sum/Max [Chin]              real         sum/max   di learned           NP-hard
Boolean [KPR00]             0/1          sum       --"--                NP-hard*
Max [KPR00]                 real         max       --"--                PTIME
Interval based [LWWJ02]     di ∈ [a,b]   sum       di within accuracy   PTIME
Generalized results [JK03]                                              NP-hard / PTIME
* Approximate version in PTIME
Can we use the offline version for online auditing?
10
… After Two Minutes …
di real, sum/max queries, privacy breached if some di learned
User: q1 = sum(d1,d2,d3)
Auditor: sum(d1,d2,d3) = 15
User: q2 = max(d1,d2,d3)
Auditor: Denied (the answer would cause privacy loss)
User: Oh well…
User (after two minutes): There must be a reason for the denial… q2 is denied iff d1 = d2 = d3 = 5. I win!
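The leak in this example can be checked by brute force. The sketch below is not the auditor from the cited work: it assumes a small integer domain (0..15) instead of reals so that all datasets can be enumerated, and the names `consistent`, `breaches`, `naive_auditor` and `max_denied` are illustrative. The auditor denies exactly when the true answer would pin down some entry, ignoring what the denial itself reveals.

```python
from itertools import product

DOMAIN = range(16)  # assumption: integer entries in 0..15 (the slide uses reals)

def consistent(history):
    """All datasets (d1, d2, d3) that agree with every answered (f, answer) pair."""
    return [d for d in product(DOMAIN, repeat=3)
            if all(f(d) == a for f, a in history)]

def breaches(history):
    """True if some entry takes a single value across all consistent datasets."""
    cands = consistent(history)
    return any(len({d[i] for d in cands}) == 1 for i in range(3))

def naive_auditor(d, history, f):
    """Deny (return None) iff the TRUE answer f(d) would pin down some entry."""
    a = f(d)
    return None if breaches(history + [(f, a)]) else a

def max_denied(d):
    history = [(sum, sum(d))]  # q1 = sum(d1,d2,d3) has been answered
    return naive_auditor(d, history, max) is None

print(max_denied((5, 5, 5)))  # True: q2 is denied
print(max_denied((4, 5, 6)))  # False: q2 is answered
```

Among all datasets with sum 15 in this domain, only (5, 5, 5) triggers the denial, so the denial alone tells the user every entry.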
11
Example 2: Interval Based Auditing
di ∈ [0,100], sum queries, accuracy = 1 (PTIME)
User: q1 = sum(d1,d2)
Auditor: Sorry, denied
User: q2 = sum(d2,d3)
Auditor: sum(d2,d3) = 50
Denial ⇒ d1,d2 ∈ [0,1] or [99,100]
Together with the answer to q2: d1,d2 ∈ [0,1], d3 ∈ [49,50]
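The inference in this example is plain interval arithmetic, sketched below. The breach rule is an assumption made for illustration (an entry confined to an interval of length at most 1), and `entry_range` and `pair_sum_denied` are hypothetical names, not the procedure of [LWWJ02].

```python
LO, HI = 0.0, 100.0  # entries lie in [0, 100], as on the slide
WIDTH = 1.0          # assumption: breach = some entry confined to an interval of length <= 1

def entry_range(s):
    """Range each of two entries can take when their sum is known to be s."""
    return max(LO, s - HI), min(HI, s)

def pair_sum_denied(s):
    """Naive auditor: deny a two-entry sum iff the true answer s would confine an entry."""
    lo, hi = entry_range(s)
    return hi - lo <= WIDTH

# Denials happen only for s in [0, 1] or s in [199, 200], so a denial of
# q1 = sum(d1, d2) tells the user that d1, d2 are both in [0, 1] or both in [99, 100].
# The answer sum(d2, d3) = 50 forces d2 <= 50, which rules out [99, 100].
d2_range = (0.0, 1.0)
d3_range = (50 - d2_range[1], 50 - d2_range[0])
print(d3_range)  # (49.0, 50.0)
```

Note that q2 on its own looks harmless to the auditor: a sum of 50 leaves each entry a range of width 50, so it is answered.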
12
Sounds Familiar?
Colonel Oliver North, on the Iran-Contra Arms Deal:
"On the advice of my counsel I respectfully and regretfully decline to answer the question based on my constitutional rights."
David Duncan, former auditor for Enron and partner in Andersen:
"Mr. Chairman, I would like to answer the committee's questions, but on the advice of my counsel I respectfully decline to answer the question based on the protection afforded me under the Constitution of the United States."
13
Max Auditing
di real
[Figure: the database entries d1, d2, d3, d4, d5, d6, d7, d8, …, dn-1, dn]
User: q1 = max(d1,d2,d3,d4)
Auditor: M1234
User: q2 = max(d1,d2,d3)
Auditor: M123 / denied. If denied: d4 = M1234
User: q3 = max(d1,d2)
Auditor: M12 / denied. If denied: d3 = M123
14
Adversary's Success
q1 = max(d1,d2,d3,d4)
q2 = max(d1,d2,d3): denied with probability 1/4. If denied: d4 = M1234
q3 = max(d1,d2): denied with probability 1/3. If denied: d3 = M123
Success probability: 1/4 + (1 − 1/4)·1/3 = 1/2
Recover 1/8 of the database!
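The success probability can be confirmed by enumerating every relative order of four distinct values. This is a sketch under two assumptions: the entries in a group are distinct and equally likely to appear in any order, and the auditor denies exactly when the true answer would reveal an entry. The function name `attack_group` is illustrative.

```python
from fractions import Fraction
from itertools import permutations

def attack_group(d):
    """Run the three max queries on one group (d1..d4) against a naive auditor.

    Returns the 0-based index of the entry the adversary learns, or None.
    """
    m1234 = max(d)              # q1 is always answered
    if max(d[:3]) < m1234:      # answering q2 would reveal d4 = M1234 ...
        return 3                # ... so q2 is denied, which reveals it anyway
    m123 = max(d[:3])           # q2 answered (equals M1234)
    if max(d[:2]) < m123:       # answering q3 would reveal d3 = M123
        return 2                # q3 denied, adversary learns d3
    return None                 # both answered, nothing learned

orders = list(permutations([1, 2, 3, 4]))  # all relative orders of 4 distinct values
wins = sum(attack_group(d) is not None for d in orders)
success = Fraction(wins, len(orders))
print(success)      # 1/2
print(success / 4)  # 1/8: one entry per group of four, half of the time
```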
15
Boolean Auditing?
di Boolean
[Figure: the database entries d1, d2, d3, d4, d5, d6, d7, d8, …, dn-1, dn]
User: q1 = sum(d1,d2)
Auditor: 1 / denied
User: q2 = sum(d2,d3)
Auditor: 1 / denied
…
qi denied iff di = di+1 ⇒ learn database/complement
Let di,dj,dk not all equal, where qi-1, qi, qj-1, qj, qk-1, qk all denied
User: q = sum(di,dj,dk)
Auditor: 1 / 2
Recover the entire database!
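The first phase of this attack (learning the database up to complement from the pattern of denials) can be sketched as follows. The auditor here is a simplification assumed for illustration: it denies a pair sum exactly when the answer would be 0 or 2. The names `naive_pair_auditor` and `recover_up_to_complement` are hypothetical.

```python
import random

def naive_pair_auditor(d, i):
    """Answer sum(d[i], d[i+1]) unless it is 0 or 2, which would reveal both bits."""
    s = d[i] + d[i + 1]
    return s if s == 1 else None  # None stands for "denied"

def recover_up_to_complement(d):
    """Adversary: rebuild d relative to an unknown first bit from the deny pattern."""
    guess = [0]  # arbitrary guess for d1
    for i in range(len(d) - 1):
        denied = naive_pair_auditor(d, i) is None
        # denied -> d[i] == d[i+1]; answered (sum is 1) -> d[i] != d[i+1]
        guess.append(guess[-1] if denied else 1 - guess[-1])
    return guess, [1 - b for b in guess]

random.seed(0)
d = [random.randint(0, 1) for _ in range(20)]
a, b = recover_up_to_complement(d)
print(d in (a, b))  # True
```

The adversary reads the database only through the deny/answer decisions. The final three-entry sum on the slide then decides between the two candidates.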
16
Two Problems
• Obvious problem: denied queries ignored
  – Algorithmic problem: not clear how to incorporate denials in the decision
• Subtle problem:
  – Query denials leak (potentially sensitive) information
    • Users cannot decide denials by themselves
[Figure: nested regions: possible assignments to {d1,…,dn}; assignments consistent with (q1,…,qi, a1,…,ai); the region where qi+1 is denied]
17
A Spectrum of Auditors
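[Figure: a spectrum of auditors, trading utility against privacy, arranged by the information used in the decision: q1,…,qi, qi+1 only; q1,…,qi, qi+1 with a1,…,ai; q1,…,qi, qi+1 with a1,…,ai, ai+1. Further labels: "Size overlap restriction", "Algebraic structure"; regions marked "Safe" and "Unsafe".]
*Note: can work in the "unsafe" region, but need to prove denials do not leak crucial information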
18
Simulatable Auditing*
An auditor is simulatable if a simulator exists s.t.:
[Figure: the auditor, with the statistical database and q1,…,qi, receives qi+1 and outputs deny/answer; the simulator, given only q1,…,qi, a1,…,ai and qi+1, outputs the same deny/answer decision.]
Simulation ⇒ denials do not leak information
* 'self auditors' in [DN03]
Why Simulatable Auditors do not Leak Information?
[Figure: nested regions: possible assignments to {d1,…,dn}; assignments consistent with (q1,…,qi, a1,…,ai); qi+1 denied/allowed]
20
Summary
• Improper usage of auditors may lead to privacy breaches, due to information leakage in the decision procedure.
  – Cell suppression / some k-anonymity methods should be checked similarly
  – Should make sure offline auditors do not leak information in decision
• Simulatable auditors provably don't leak information
  – Give best utility while still "safe"
  – A launching point for further research on auditors
• Further research:
  – Auditors with more reasonable privacy guarantees