Differentially Private Machine Learning
Theory, Algorithms, and Applications
Kamalika Chaudhuri (UCSD)
Anand D. Sarwate (Rutgers)
Logistics and Goals
• Tutorial Time: 2 hr (15 min break after first hour)
• What this tutorial will do:
• Motivate and define differential privacy
• Provide an overview of common methods and tools for designing differentially private ML algorithms
• What this tutorial will not do:
• Provide detailed results on the state of the art in differentially private versions of specific problems
Learning Outcomes
At the end of the tutorial, you should be able to:
• Explain the definition of differential privacy,
• Design basic differentially private machine learning algorithms using standard tools,
• Try different approaches for introducing differential privacy into optimization methods,
• Understand the basics of privacy risk accounting,
• Understand how these ideas blend together in more complex systems.
Motivating Differential Privacy
Sensitive Data
Medical Records
Genetic Data
Search Logs
AOL Violates Privacy
Netflix Violates Privacy [NS08]
[Figure: users × movies ratings matrix]
2-8 movie ratings and dates for Alice reveal:
• whether Alice is in the dataset or not
• Alice's other movie ratings
High-dimensional Data is Unique
Example: UCSD Employee Salary Table
Position | Gender | Department | Ethnicity | Salary
---------|--------|------------|-----------|-------
Faculty  | Female | CSE        | SE Asian  | -

One employee (Kamalika) fits the description!
Simply anonymizing data is unsafe!
Disease Association Studies [WLWTZ09]
[Figure: DNA correlation (R²) values for the Cancer group and the Healthy group]
Given the correlations (R² values) and Alice's DNA, one can tell whether Alice is in the Cancer set or the Healthy set.
Simply anonymizing data is unsafe!
Statistics on small data sets are unsafe!

[Figure: three-way tension between Privacy, Accuracy, and Data Size]
Privacy Definition
The Setting
[Diagram: a private (sensitive) data set D passes through a privacy-preserving sanitizer; everything beyond the privacy barrier is public: summary statistics, synthetic data sets, or ML models]
Property of Sanitizer
Aggregate information computable
Individual information protected (robust to side-information)
Differential Privacy
Participation of a person does not change outcome
Since a person has agency, they can decide to participate in a dataset or not
[Diagram: Data + Algorithm, with and without one person's record, produce indistinguishable Outcomes]
Adversary

Prior Knowledge: A's genetic profile; A smokes

Case 1: The study reveals that A has cancer.
[Study violates A's privacy]

Case 2: The study reveals that smoking causes cancer, so A probably has cancer.
[Study does not violate privacy]
Differential Privacy [DMNS06]
Participation of a person does not change outcome
Since a person has agency, they can decide to participate in a dataset or not
How to ensure this?
…through randomness

A(Data including one person) and A(Data without that person) are random variables with close distributions.

Randomness: added by the randomized algorithm A
Closeness: likelihood ratio at every point is bounded
Differential Privacy [DMNS06]

For all D, D′ that differ in one person's value, if A is an ε-differentially private randomized algorithm, then:
\[ \sup_t \left| \log \frac{p(A(D) = t)}{p(A(D') = t)} \right| \le \varepsilon \]
The left-hand side is the max-divergence of p(A(D)) and p(A(D′)).
Approx. Differential Privacy [DKM+06]
For all D, D′ that differ in one person's value, if A is an (ε, δ)-differentially private randomized algorithm, then:
\[ \max_{S:\, \Pr(A(D) \in S) > \delta} \left[ \log \frac{\Pr(A(D) \in S) - \delta}{\Pr(A(D') \in S)} \right] \le \varepsilon \]
Properties of Differential Privacy
Property 1: Post-processing Invariance
Risk doesn’t increase if you don’t touch the data again
[Diagram: private data set D → (ε₁, δ₁)-differentially private algorithm → privacy barrier → data derivatives; whether consumed by legitimate users or an adversary, and however they are post-processed without touching the data again, every derivative still carries the (ε₁, δ₁) guarantee]
Property 2: Graceful Composition
Total privacy loss is the sum of privacy losses
(Better composition possible — coming up later)
[Diagram: private data set D → A₁ preprocessing (ε₁, δ₁) → A₂ training (ε₂, δ₂) → cross-validation (ε₃, δ₃) → testing (ε₄, δ₄); the composed pipeline A guarantees \( (\sum_i \varepsilon_i, \sum_i \delta_i) \)]
How to achieve Differential Privacy?
Tools for Differentially Private Algorithm Design
• Global Sensitivity Method [DMNS06]
• Exponential Mechanism [MT07]
Many others we will not cover [DL09, NRS07, …]
Global Sensitivity Method [DMNS06]
Problem: Given a function f and a sensitive dataset D, find a differentially private approximation to f(D).
Example: f(D) = mean of data points in D
The Global Sensitivity Method [DMNS06]

Given: a function f, sensitive dataset D
Define: dist(D, D′) = # records in which D and D′ differ

Global Sensitivity of f:
\[ S(f) = \max_{\mathrm{dist}(D, D') = 1} |f(D) - f(D')| \]
Laplace Mechanism

Global sensitivity of f: \( S(f) = \max_{\mathrm{dist}(D,D')=1} |f(D) - f(D')| \)
Output \( f(D) + Z \), where \( Z \sim \mathrm{Lap}(S(f)/\varepsilon) \): this is ε-differentially private.
(Laplace distribution with scale b: \( p(z) \propto e^{-|z|/b} \))
Gaussian Mechanism

Global sensitivity of f: \( S(f) = \max_{\mathrm{dist}(D,D')=1} |f(D) - f(D')| \)
Output \( f(D) + Z \), where \( Z \sim \frac{S(f)}{\varepsilon} \mathcal{N}\big(0,\, 2\ln(1.25/\delta)\big) \): this is (ε, δ)-differentially private.
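As a concrete illustration, here is a minimal Python sketch of the Gaussian mechanism as defined above; the function name and interface are ours, not from the tutorial:

```python
import numpy as np

def gaussian_mechanism(value, sensitivity, eps, delta, rng=np.random.default_rng()):
    """(eps, delta)-DP release of a scalar f(D) = `value` with global
    sensitivity S(f) = `sensitivity`, using the noise scale from the
    slide: sigma = (S(f)/eps) * sqrt(2 * ln(1.25/delta))."""
    sigma = (sensitivity / eps) * np.sqrt(2.0 * np.log(1.25 / delta))
    return value + rng.normal(0.0, sigma)
```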
Example 1: Mean

f(D) = Mean(D), where each record is a scalar in [0, 1]
Global sensitivity of f = 1/n
Laplace Mechanism: output \( f(D) + Z \), where \( Z \sim \frac{1}{n\varepsilon} \mathrm{Lap}(0, 1) \)
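A minimal sketch of this mechanism in Python (the helper name is ours):

```python
import numpy as np

def private_mean(data, eps, rng=np.random.default_rng()):
    """eps-DP mean of scalar records in [0, 1]: the global sensitivity
    of the mean is 1/n, so add Laplace noise of scale 1/(n * eps)."""
    n = len(data)
    return float(np.mean(data)) + rng.laplace(0.0, 1.0 / (n * eps))
```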
How to get Differential Privacy?
• Global Sensitivity Method [DMNS06]
• Two variants: Laplace and Gaussian
• Exponential Mechanism [MT07]
Many others we will not cover [DL09, NRS07, …]
Exponential Mechanism [MT07]
Problem: Given a function f(w, D) and sensitive dataset D, find a differentially private approximation to
\[ w^* = \arg\max_w f(w, D) \]
Example: f(w, D) = accuracy of classifier w on dataset D
The Exponential Mechanism [MT07]
Suppose that for any w, \( |f(w, D) - f(w, D')| \le S \) when D and D′ differ in one record. Sample w from
\[ p(w) \propto e^{\varepsilon f(w, D)/2S} \]
for ε-differential privacy.
[Figure: p(w) places most of its mass near \( \arg\max_w f(w, D) \)]
Example: Parameter Tuning
Given validation data D and k classifiers w₁, …, w_k, (privately) find the classifier with the highest accuracy on D.
Here, f(w, D) = classification accuracy of w on D. For any w and any D, D′ that differ in one record,
\[ |f(w, D) - f(w, D')| \le \frac{1}{|D|} \]
So the exponential mechanism outputs wᵢ with probability
\[ \Pr(w_i) \propto e^{\varepsilon |D| f(w_i, D)/2} \]
[CMS11, CV13]
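A minimal sketch of this selection step, assuming the k validation accuracies have already been computed (names are ours):

```python
import numpy as np

def select_classifier(accuracies, n_val, eps, rng=np.random.default_rng()):
    """eps-DP selection among k classifiers via the exponential
    mechanism: accuracy has sensitivity 1/n_val, so sample index i
    with probability proportional to exp(eps * n_val * acc_i / 2)."""
    scores = eps * n_val * np.asarray(accuracies) / 2.0
    scores -= scores.max()          # subtract max for numerical stability
    probs = np.exp(scores)
    probs /= probs.sum()
    return rng.choice(len(accuracies), p=probs)
```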
Summary
— Motivation
— What is differential privacy?
— Basic differential privacy algorithm design tools:
  — The Global Sensitivity Method: Laplace Mechanism, Gaussian Mechanism
  — The Exponential Mechanism
Differential privacy in estimation and prediction
Estimation and prediction problems
Goal: Good privacy-accuracy-sample size tradeoff
Statistical estimation: use private data to estimate a parameter or predictor that has good expected performance on future data.
[Diagram: private data set D of sample size n and risk functional f(w, ·) → privacy barrier → (ε, δ)-DP estimate w of \( \arg\min_w f(w, D) \); excess risk \( E[f(w, z)] - E[f(w^*, z)] \)]
Privacy and accuracy make different assumptions about the data
Privacy – differential privacy makes no assumptions on the data distribution: privacy holds unconditionally.
Accuracy – accuracy is measured w.r.t. a “true population distribution”: expected excess statistical risk.
Statistical Learning as Risk Minimization
• Empirical Risk Minimization (ERM) is a common paradigm for prediction problems.
• Produces a predictor w for a label/response y given a vector of features/covariates x.
• Typically use a convex loss function and regularizer to “prevent overfitting.”
\[ w^* = \arg\min_w \frac{1}{n} \sum_{i=1}^n \ell(w, (x_i, y_i)) + \lambda R(w) \]
Why is ERM not private?
[Diagram: two private data sets D and D′ that differ in one point produce visibly different minimizers w* and w′*; an adversary who sees w can tell whether the data was D or D′]
easy for adversary to tell the difference between D and D’
[CMS11, RBHT12]
Kernel learning: even worse
• Kernel-based methods produce a classifier that is a function of the data points.
• Even adversary with black-box access to w could potentially learn those points.
[CMS11]
\[ w(x) = \sum_{i=1}^n \alpha_i k(x, x_i) \]
Privacy is compatible with learning
• Good learning algorithms generalize to the population distribution, not individuals.
• Stable learning algorithms generalize [BE02].
• Differential privacy can be interpreted as a form of stability that also implies generalization [DFH+15,BNS+16].
• Two parts of the same story: privacy implies generalization asymptotically; for finite n, there are tradeoffs between privacy, accuracy, and sample size.
Revisiting ERM
• Learning using (convex) optimization uses three steps:
1. read in the data
2. form the objective function
3. perform the minimization
• We can try to introduce privacy in each step!
\[ w^* = \arg\min_w \frac{1}{n} \sum_{i=1}^n \ell(w, (x_i, y_i)) + \lambda R(w) \]
input perturbation: randomize the data (xᵢ, yᵢ); objective perturbation: randomize the objective; output perturbation: randomize the minimizer w*.
Privacy in ERM: options

[Diagram: four pipelines from a private database D across the privacy barrier, each ending in an (ε, δ) guarantee:
• local privacy: contributors perturb their own data → sanitized database → non-private algorithm;
• input perturbation: DP preprocessing → sanitized dataset → non-private learning → w_input;
• output perturbation: non-private preprocessing and optimization → noise addition or randomization → w_output;
• objective perturbation: non-private preprocessing → DP optimization → w_objective]
Local Privacy

[Diagram: each data contributor applies input perturbation before collection; the sanitized database is then run through a non-private algorithm with an (ε, δ) guarantee]
• Local privacy: data contributors sanitize data before collection.
• Classical technique: randomized response [W65] (see the sketch after this list).
• Interactive variant can be minimax optimal [DJW13].
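A minimal sketch of randomized response for one private bit (the interface is ours):

```python
import numpy as np

def randomized_response(bit, eps, rng=np.random.default_rng()):
    """eps-DP local release of a single bit [W65]: report the truth
    with probability e^eps / (1 + e^eps), else report the flipped bit."""
    p_truth = np.exp(eps) / (1.0 + np.exp(eps))
    return bit if rng.random() < p_truth else 1 - bit
```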
Input Perturbation

[Diagram: private database D → DP preprocessing (ε, δ) → sanitized dataset → non-private learning → w_input]
• Input perturbation: add noise to the input data.
• Advantages: easy to implement, results in reusable sanitized data set.
[DJW13,FTS17]
Output Perturbation

[Diagram: private database D → non-private preprocessing and optimization → w* → noise addition or randomization (ε, δ) → w_output]
• Compute the minimizer and add noise.
• Does not require re-engineering baseline algorithms
Noise depends on the sensitivity of the argmin. [CMS11, RBHT12]
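A minimal sketch, assuming a 1-Lipschitz loss and a λ-strongly convex regularizer so that the L2 sensitivity of the argmin is at most 2/(nλ) (as analyzed in [CMS11]); the interface is ours:

```python
import numpy as np

def output_perturbation(w_star, n, lam, eps, delta, rng=np.random.default_rng()):
    """(eps, delta)-DP release of an ERM minimizer: add Gaussian noise
    calibrated to the assumed argmin sensitivity 2/(n * lam)."""
    sensitivity = 2.0 / (n * lam)
    sigma = (sensitivity / eps) * np.sqrt(2.0 * np.log(1.25 / delta))
    return w_star + rng.normal(0.0, sigma, size=w_star.shape)
```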
Objective Perturbation

[Diagram: private database D → non-private preprocessing → DP optimization (ε, δ) → w_objective]

\[ J(w) = \frac{1}{n} \sum_{i=1}^n \ell(w, (x_i, y_i)) + \lambda R(w) \]
A. Add a random term to the objective:
\[ w_{\mathrm{priv}} = \arg\min_w \big( J(w) + w^\top b \big) \]
B. Do a randomized approximation \( \hat{J}(w) \) of the objective:
\[ w_{\mathrm{priv}} = \arg\min_w \hat{J}(w) \]
Randomness depends on the sensitivity properties of J(w). [CMS11, ZZX+12]
Sensitivity of the argmin
• Non-private optimization solves \( \nabla J(w) = 0 \Rightarrow w^* = \arg\min_w J(w) \).
• Generate a vector analogue of Laplace noise: \( b \sim p(z) \propto e^{-(\varepsilon/2)\|z\|} \).
• Private optimization solves \( \nabla J(w) = -b \Rightarrow w_{\mathrm{priv}} = \arg\min_w \big( J(w) + w^\top b \big) \).
• Have to bound the sensitivity of the gradient.
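A minimal sketch of variant A, drawing b with density proportional to \( e^{-(\varepsilon/2)\|z\|} \) (a uniform direction with Gamma-distributed radius); the privacy guarantee additionally requires the sensitivity conditions of [CMS11], and the interface is ours:

```python
import numpy as np
from scipy.optimize import minimize

def objective_perturbation(J, dim, eps, rng=np.random.default_rng()):
    """Minimize J(w) + w^T b, where b has density p(z) ~ exp(-(eps/2)||z||):
    in `dim` dimensions the radius ||b|| is Gamma(dim, scale=2/eps)."""
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)       # uniform direction
    b = rng.gamma(shape=dim, scale=2.0 / eps) * direction
    return minimize(lambda w: J(w) + w @ b, x0=np.zeros(dim)).x
```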
Theoretical bounds on excess risk

Input, output, and objective perturbation all achieve excess risk of order
\[ \tilde{O}\!\left( \frac{\sqrt{d \log(1/\delta)}}{n \varepsilon} \right) \]
under (ε, δ)-differential privacy ([FTS17] analyzes input perturbation for the quadratic loss); rates of \( O\!\big(d / n\varepsilon^{3/2}\big) \) and \( O\!\big(d / n\varepsilon\big) \) appear under pure ε-differential privacy [CMS11, BST14].

Same important parameters:
• privacy parameters (ε, δ)
• data dimension d
• data bounds \( \|x_i\| \le B \)
• analytical properties of the loss (Lipschitz constant, smoothness)
• regularization parameter λ
Typical empirical results
In general:
• Objective perturbation empirically outperforms output perturbation.
• Gaussian mechanisms with (ε, δ) guarantees outperform Laplace-like mechanisms with pure ε guarantees.
• Loss vs. non-private methods is very dataset-dependent.
[CMS11]
[JT14]
Gaps between theory and practice
• Theoretical analysis is for fixed privacy parameters – how should we choose them in practice?
• Given a data set, can I tell what the privacy-utility-sample-size tradeoff is?
• What about more general optimization problems/algorithms?
• What about scaling (computationally) to large data sets?
Summary
• Training does not on its own guarantee privacy.
• There are many ways to incorporate DP into prediction and learning using ERM with different privacy-accuracy-sample size tradeoffs.
• Good DP algorithms should generalize since they learn about populations, not individuals.
• Theory and experiment show that (ε, δ)-DP algorithms have better accuracy than ε-DP algorithms at the cost of a weaker privacy guarantee.
Break
Differential privacy and optimization algorithms
Scaling up private optimization
• Large data sets are challenging for optimization: ⇒ batch methods not feasible
• Using more data can help our tradeoffs look better: ⇒ better privacy and accuracy
• Online learning involves multiple releases: ⇒ potential for more privacy loss
Goal: guarantee privacy using the optimization algorithm.
Stochastic Gradient Descent
• Stochastic gradient descent (SGD) is a moderately popular method for optimization
• Stochastic gradients are random ⇒ already noisy ⇒ already private?
• Optimization is iterative ⇒ intermediate results leak information
Non-private SGD

\[ J(w) = \frac{1}{n} \sum_{i=1}^n \ell(w, (x_i, y_i)) + \lambda R(w) \]
\( w_0 = 0 \). For t = 1, 2, …, T:
• select a random data point: \( i_t \sim \mathrm{Unif}\{1, 2, \dots, n\} \)
• take a gradient step: \( g_t = \nabla \ell(w_{t-1}, (x_{i_t}, y_{i_t})) + \lambda \nabla R(w_{t-1}) \), \( w_t = \Pi_{\mathcal{W}}(w_{t-1} - \eta_t g_t) \)
Output \( \hat{w} = w_T \).
Private SGD with noise

\[ J(w) = \frac{1}{n} \sum_{i=1}^n \ell(w, (x_i, y_i)) + \lambda R(w) \]
\( w_0 = 0 \). For t = 1, 2, …, T:
• select a random data point: \( i_t \sim \mathrm{Unif}\{1, 2, \dots, n\} \)
• add noise to the gradient: \( z_t \sim p_{(\varepsilon,\delta)}(z) \), \( \hat{g}_t = z_t + \nabla \ell(w_{t-1}, (x_{i_t}, y_{i_t})) + \lambda \nabla R(w_{t-1}) \)
• \( w_t = \Pi_{\mathcal{W}}(w_{t-1} - \eta_t \hat{g}_t) \)
Output \( \hat{w} = w_T \). [SCS15]
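A minimal sketch of the noisy-gradient variant with Gaussian noise; calibrating sigma to the gradient sensitivity is discussed on the next slide, and the interface (including the per-example gradient callback) is ours:

```python
import numpy as np

def private_sgd(data, grad_loss, dim, T, eta, sigma, lam=0.0,
                rng=np.random.default_rng()):
    """DP-SGD sketch: at each step, noise the gradient of one randomly
    chosen example. `grad_loss(w, x)` returns the per-example gradient
    (assumed norm-bounded so that `sigma` can be calibrated to its
    sensitivity); projection onto the constraint set is omitted."""
    w = np.zeros(dim)
    for _ in range(T):
        x = data[rng.integers(len(data))]        # random data point
        g = grad_loss(w, x) + lam * w            # regularized gradient
        g_hat = g + rng.normal(0.0, sigma, dim)  # add privacy noise
        w = w - eta * g_hat                      # gradient step
    return w
```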
Choosing a noise distribution

• Have to choose noise according to the sensitivity of the gradient:
\[ \max_{D, D'} \max_w \| \nabla J(w; D) - \nabla J(w; D') \| \]
• Sensitivity depends on the data distribution, Lipschitz parameter of the loss, etc.
“Laplace” mechanism: \( p(z) \propto e^{-(\varepsilon/2)\|z\|} \), giving ε-DP.
Gaussian mechanism: \( z \overset{\text{i.i.d.}}{\sim} \mathcal{N}\big(0,\ \tilde{O}(\log(1/\delta)/\varepsilon^2)\big) \), giving (ε, δ)-DP.
Private SGD with randomized selection

\[ J(w) = \frac{1}{n} \sum_{i=1}^n \ell(w, (x_i, y_i)) + \lambda R(w) \]
\( w_0 = 0 \). For t = 1, 2, …, T:
• select a random data point: \( i_t \sim \mathrm{Unif}\{1, 2, \dots, n\} \), \( g_t = \nabla \ell(w_{t-1}, (x_{i_t}, y_{i_t})) + \lambda \nabla R(w_{t-1}) \)
• randomly select an unbiased gradient estimate: \( \hat{g}_t \sim p_{(\varepsilon,\delta), g}(z) \)
• \( w_t = \Pi_{\mathcal{W}}(w_{t-1} - \eta_t \hat{g}_t) \)
Output \( \hat{w} = w_T \). [DJW13]
Randomized directions

• Need to have control of gradient norms: \( \|g\| \le L \).
• Keep some probability of going in the wrong direction.

[Diagram: rescale the gradient to \( \frac{L}{\|g\|} g \) with probability \( \frac{1}{2} + \frac{\|g\|}{2L} \); select the hemisphere in the direction of the gradient with probability \( \frac{e^\varepsilon}{1 + e^\varepsilon} \), or the opposite one with probability \( \frac{1}{1 + e^\varepsilon} \); pick a direction uniformly from that hemisphere and take a step]
[DJW13]
Why does DP-SGD work?

Noisy Gradient: choose the noise distribution using the sensitivity of the gradient. [Diagram: \( \hat{g} = g + z_t \)]

Random Gradient: randomly select a direction biased towards the true gradient. [Diagram: hemisphere probabilities \( \frac{e^\varepsilon}{1+e^\varepsilon} \) and \( \frac{1}{1+e^\varepsilon} \); rescaled gradient \( \frac{L}{\|g\|} g \) w.p. \( \frac{1}{2} + \frac{\|g\|}{2L} \)]

Both methods:
• guarantee DP at each iteration;
• ensure an unbiased estimate of g to guarantee convergence.
Making DP-SGD more practical
“SGD is robust to noise”
• True up to a point — for small epsilon (more privacy), the gradients can become too noisy.
• Solution 1: more iterations ([BST14]: need \( O(n^2) \))
• Solution 2: use standard tricks: mini-batching, etc. [SCS13]
• Solution 3: use better analysis to show the privacy loss is not so bad [BST14, ACG+16]
Randomly sampling data can amplify privacy
• Suppose we have an algorithm A which is ε-differentially private for ε ≤ 1.
• Sample γn entries of D uniformly at random and run A on those.
• The resulting randomized method guarantees 2γε-differential privacy. [BBKN14, BST14]

[Diagram: private data set D → subsample γn records → ε-DP algorithm behind the privacy barrier → overall guarantee 2γε]
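In code, the amplification statement on this slide is simply (helper name is ours):

```python
def amplified_epsilon(eps, gamma):
    """Privacy amplification by subsampling [BBKN14, BST14]: an eps-DP
    algorithm (eps <= 1) run on a uniformly random gamma-fraction
    subsample is roughly 2 * gamma * eps differentially private."""
    assert 0 < eps <= 1 and 0 < gamma <= 1
    return 2.0 * gamma * eps

print(amplified_epsilon(0.5, 0.01))  # 1% subsample: 0.5-DP -> 0.01-DP
```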
Summary
• Stochastic gradient descent can be made differentially private in several ways by randomizing the gradient.
• Keeping gradient estimates unbiased will help ensure convergence.
• Standard approach for variance reduction/stability (such as minibatching) can help with performance.
• Random subsampling of the data can amplify privacy guarantees.
Accounting for total privacy risk
Measuring total privacy loss
Post-processing invariance: risk doesn't increase if you don't touch the data again.
• More complex algorithms have multiple stages ⇒ all stages have to guarantee DP.
• Need a way to do privacy accounting: what is lost over time/multiple queries?

[Diagram: as before, a private data set D behind the privacy barrier; every released data derivative carries its (ε₁, δ₁) guarantee regardless of non-private post-processing]
A simple example
[Diagram: private data set D → preprocessing (ε₁, δ₁) → training (ε₂, δ₂) → cross-validation (ε₃, δ₃) → testing (ε₄, δ₄), all behind the privacy barrier]
Composition property of differential privacy

Basic composition: privacy loss is additive.
• Apply R algorithms with guarantees \( (\varepsilon_i, \delta_i),\ i = 1, 2, \dots, R \).
• Total privacy loss: \( \Big( \sum_{i=1}^R \varepsilon_i,\ \sum_{i=1}^R \delta_i \Big) \).
• Worst-case analysis: each result exposes the worst privacy risk.
What composition says about multi-stage methods
Total privacy loss is the sum of the privacy losses…
[Diagram: the four-stage pipeline above, with total loss \( (\sum_i \varepsilon_i, \sum_i \delta_i) \)]
An open question: privacy allocation across stages
Composition means we have a privacy budget.
How should we allocate privacy risk across different stages of a pipeline?
• Noisy features + accurate training?
• Clean features + sloppy training?
It’s application dependent! Still an open question…
A closer look at ε and δ

Gaussian noise of a given variance produces a spectrum of (ε, δ) guarantees:
[Figure: the likelihood ratio \( p(\tau|D)/p(\tau|D') \) at thresholds τ₁ and τ₂, compared against \( \exp(\varepsilon_1) \) and \( \exp(\varepsilon_2) \); for each εᵢ, the probability mass where the ratio exceeds \( \exp(\varepsilon_i) \) is δᵢ]
Privacy loss as a random variable
Actual privacy loss is a random variable that depends on D:
\[ Z_{D,D'} = \log \frac{p(A(D) = t)}{p(A(D') = t)} \quad \text{w.p. } p(A(D) = t) \]
The spectrum of (ε, δ) guarantees means we can trade off ε and δ when analyzing a particular mechanism.
Random privacy loss
• The worst-case loss over (D, D′) is still a random variable:
\[ Z_{D,D'} = \log \frac{p(A(D) = t)}{p(A(D') = t)} \quad \text{w.p. } p(A(D) = t) \]
• Sequentially computing functions on private data is like sequentially sampling independent privacy losses.
• Concentration of measure shows that the total loss is much closer to its expectation.
Strong composition bounds

Given only the (ε, δ) guarantees for k algorithms operating on the data,
\[ \underbrace{(\varepsilon, \delta), (\varepsilon, \delta), \dots, (\varepsilon, \delta)}_{k \text{ times}} \]
composition yields, for each i, a valid guarantee
\[ \Big( (k - 2i)\varepsilon,\ 1 - (1 - \delta)^k (1 - \delta_i) \Big), \qquad \delta_i = \frac{\sum_{\ell=0}^{i-1} \binom{k}{\ell} \big( e^{(k-\ell)\varepsilon} - e^{(k-2i+\ell)\varepsilon} \big)}{(1 + e^\varepsilon)^k} \]
[DRV10, KOV15]
• Composition again gives a family of (ε, δ) tradeoffs: we can quantify privacy loss by choosing any valid (ε, δ) pair.
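A minimal sketch that enumerates this family of valid pairs using the formula above (the helper name is ours):

```python
from math import comb, exp

def strong_composition_pairs(eps, delta, k):
    """Valid (eps', delta') guarantees for k-fold composition of an
    (eps, delta)-DP mechanism, per the [KOV15] formula above."""
    pairs = []
    for i in range(k // 2 + 1):
        delta_i = sum(
            comb(k, l) * (exp((k - l) * eps) - exp((k - 2 * i + l) * eps))
            for l in range(i)
        ) / (1 + exp(eps)) ** k
        pairs.append(((k - 2 * i) * eps,
                      1.0 - (1.0 - delta) ** k * (1.0 - delta_i)))
    return pairs
```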
Moments accountant
Basic idea: directly calculate the parameters (ε, δ) obtained from composing a sequence of mechanisms; more efficient than the generic composition theorems.

[Diagram: private data set D → stages A₁, A₂, … → the moments accountant composes them into a single (ε, δ) guarantee] [ACG+16]
How to Compose Directly?

Given datasets D and D′ with one different record and a mechanism A, define the privacy loss random variable as:
\[ Z_{D,D'} = \log \frac{p(A(D) = t)}{p(A(D') = t)}, \quad \text{w.p. } p(A(D) = t) \]
- If the max absolute value of \( Z_{D,D'} \) over all D, D′ is ε, then A is (ε, 0)-differentially private.
- Properties of \( Z_{D,D'} \) are related to the privacy loss of A.
Challenge: to reason about the worst case over all D, D′.
Key idea in [ACG+16]: use moment generating functions.
Accounting for Moments… [ACG+16]

Three steps:
1. Calculate moment generating functions for A₁, A₂, …
2. Compose.
3. Calculate final privacy parameters.
1. Stepwise Moments [ACG+16]

Define the stepwise moment of Aₜ at any s:
\[ \alpha_{A_t}(s) = \sup_{D, D'} \log \mathbb{E}\big[ e^{s Z_{D,D'}} \big] \]
(D and D′ differ by one record)
2. Compose [ACG+16]

Theorem: Suppose A = (A₁, …, A_T). For any s:
\[ \alpha_A(s) \le \sum_{t=1}^T \alpha_{A_t}(s) \]
3. Final Calculation [ACG+16]

Theorem: For any ε, mechanism A is (ε, δ)-DP for
\[ \delta = \min_s \exp\big( \alpha_A(s) - s\varepsilon \big) \]
Use the theorem to find the best ε for a given δ, either in closed form or by searching over s₁, s₂, …, s_k.
Example: composing Gaussian mechanism
Suppose each Aₜ answers a query with global sensitivity 1 by adding N(0, 1) noise.
1. Stepwise Moments

Suppose each Aₜ answers a query with global sensitivity 1 by adding N(0, 1) noise. Simple algebra gives, for any s:
\[ \alpha_{A_t}(s) = \frac{s(s+1)}{2} \]
2. Compose

\[ \alpha_A(s) \le \sum_{t=1}^T \alpha_{A_t}(s) = \frac{T s(s+1)}{2} \]
3. Final Calculation

Find the lowest δ for a given ε (or vice versa) by solving:
\[ \delta = \min_s \exp\big( T s(s+1)/2 - s\varepsilon \big) \]
In this case, the solution can be found in closed form.
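Indeed, the exponent \( Ts(s+1)/2 - s\varepsilon \) is quadratic in s and minimized at \( s^* = \varepsilon/T - 1/2 \). A minimal sketch (helper name is ours):

```python
from math import exp

def gaussian_composition_delta(eps, T):
    """delta for a given eps after composing T sensitivity-1 queries
    answered with N(0, 1) noise: minimize exp(T*s*(s+1)/2 - s*eps)
    over s >= 0, with optimum s* = eps/T - 1/2 (clipped at 0)."""
    s = max(eps / T - 0.5, 0.0)
    return exp(T * s * (s + 1) / 2.0 - s * eps)

print(gaussian_composition_delta(20.0, 10))  # small delta for eps=20, T=10
```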
How does it compare?

[Figure: ε as a function of the number of rounds of composition; the moments accountant grows much more slowly than the [DRV10] bound, which is itself better than linear] [ACG+16]
How does it compare on real data?

[Figure: EM for a mixture of Gaussians with the Gaussian mechanism, δ = 10⁻⁴] [PFCW17]
Summary
• Practical machine learning looks at the data many times.
• Post-processing invariance means we just have to track the cumulative privacy loss.
• Good composition methods use the fact that the actual privacy loss may behave much better than the worst-case bound.
• The Moments Accountant method tracks the actual privacy loss more accurately: better analysis for better privacy guarantees.
Applications to modern machine learning
When is differential privacy practical?
Differential privacy is best suited for understanding population-level statistics and structure:
• Inferences about the population should not depend strongly on individuals.
• Large sample sizes usually mean lower sensitivity and less noise.
To build and analyze systems we have to leverage post-processing invariance and composition properties.
Differential privacy in practice
Google: RAPPOR for tracking statistics in Chrome.
Apple: various iPhone usage statistics.
Census: 2020 US Census will use differential privacy.
mostly focused on count and average statistics
Challenges for machine learning applications
Differentially private ML is complicated because real ML algorithms are complicated:
• Multi-stage pipelines, parameter tuning, etc.
• Need to “play around” with the data before committing to a particular pipeline/algorithm.
• “Modern” ML approaches (= deep learning) have many parameters and less theoretical guidance.
Some selected examples
For today, we will describe some recent examples:
1. Differentially private deep learning [ACG+16]
2. Differential privacy and Bayesian inference
[Diagram: DP deep learning pipeline: private data set D → select random lots L₁, L₂, … → compute gradients → clip and add noise (privacy barrier) → update parameters; the moments accountant tracks the overall (ε, δ); output θ_T, the new network weights]
[Figure: original posterior vs. processed posterior]
Differential privacy and deep learning
Main idea: train a deep network using differentially private SGD and use the moments accountant to track the privacy loss.
Additional components: gradient clipping, minibatching, data augmentation, etc.
[ACG+16]
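A minimal sketch of the private update at the heart of this approach (illustrative NumPy pseudocode rather than the authors' implementation; the names are ours, but the clipping norm and noise multiplier correspond to hyperparameters in [ACG+16]):

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One DP-SGD step [ACG+16]: clip each per-example gradient to L2 norm
    clip_norm, sum, add Gaussian noise calibrated to clip_norm, average
    over the lot, then take an ordinary gradient step."""
    total = np.zeros_like(params)
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        total += g * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(scale=noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (total + noise) / len(per_example_grads)
    return params - lr * noisy_mean
```

The moments accountant then converts the lot sampling ratio, the noise multiplier, and the number of steps into the final (ε, δ) guarantee.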
![Page 109: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/109.jpg)
Overview of the algorithm
[Figure: the DP-SGD pipeline. From the private data set D = x_1, x_2, …, select a random lot L_1, L_2, …; compute per-example gradients; clip and add noise; update the parameters; repeat. The moments accountant tracks the cumulative (ε, δ) across the privacy barrier, and the final network weights θ_T are released.]
[ACG+16]
![Page 110: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/110.jpg)
Effectiveness of DP deep learning
Empirical results on MNIST and CIFAR:
• Training and test error come close to baseline non-private deep learning methods.
• To keep the loss in performance moderate, the required ε and δ are not “negligible”.
[ACG+16]
![Page 111: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/111.jpg)
Moving forward in deep learning
This is a good proof of concept for differential privacy for deep neural networks. There are lots of interesting ways to expand this.
• The approach used just one network model: what about other architectures? RNNs? GANs?
• Can regularization methods for deep learning (e.g. dropout) help with privacy?
• What are good rules of thumb for lot/batch size, learning rate, # of hidden units, etc?
[ACG+16]
![Page 112: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/112.jpg)
Differentially Private Bayesian Inference
Data X = {x_1, x_2, …}, model class Θ.
Prior π(θ) + Data X = Posterior p(θ | X), related through the likelihood p(x | θ).
Goal: find a differentially private approximation to the posterior.
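For reference, these pieces combine through Bayes' rule (standard notation, not a formula recovered from the slides):

```latex
p(\theta \mid X)
  = \frac{\pi(\theta) \prod_i p(x_i \mid \theta)}
         {\int_\Theta \pi(\theta') \prod_i p(x_i \mid \theta') \, d\theta'}
  \;\propto\; \pi(\theta) \prod_i p(x_i \mid \theta)
```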
![Page 113: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/113.jpg)
Differentially Private Bayesian Inference
• General methods for private posterior approximation
• A Special Case: Exponential Families
• Variational Inference
![Page 114: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/114.jpg)
How to make posterior private?
Option 1: Direct posterior sampling [DMNR14]
Not differentially private except under restrictive conditions: likelihood ratios may be unbounded!
The answer changes under a newer relaxation, Rényi differential privacy [M17]; see [GSC17].
![Page 115: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/115.jpg)
How to make posterior private?
Option 2: One Posterior Sample (OPS) method [WFS15]
1. Truncate the posterior so that the likelihood ratio is bounded on the truncated region.
2. Raise the truncated posterior to a higher temperature.
[Figure: original posteriors vs. processed posteriors]
![Page 116: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/116.jpg)
How to make posterior private?
Option 2: One Posterior Sample (OPS) method, continued
Advantage: general.
Pitfalls: intractable (only the exact distribution is private); low statistical efficiency, even for large n.
![Page 117: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/117.jpg)
How to make posterior private?
Option 3: Approximate the OPS distribution via Stochastic Gradient MCMC [WFS15]
Advantage: Noise added during stochastic gradient MCMC contributes to privacy
Disadvantage: Statistical efficiency lower than exact OPS
[Figure: original posteriors vs. processed posteriors]
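A minimal sketch of the Option 3 idea, with stochastic gradient Langevin dynamics (SGLD) as the MCMC method (our illustration; [WFS15] calibrate the step size and number of iterations so that the injected noise yields the privacy guarantee, which we elide here):

```python
import numpy as np

def sgld_step(theta, minibatch, grad_log_prior, grad_log_lik, step, n_total, rng):
    """One SGLD step: a stochastic gradient step on the log-posterior plus
    Gaussian noise of variance `step`; this injected noise is what the
    privacy analysis in [WFS15] leverages."""
    g = grad_log_prior(theta)
    g = g + (n_total / len(minibatch)) * sum(grad_log_lik(theta, x) for x in minibatch)
    noise = rng.normal(scale=np.sqrt(step), size=np.shape(theta))
    return theta + 0.5 * step * g + noise
```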
![Page 118: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/118.jpg)
• General methods for private posterior approximation
• A Special Case: Exponential Families
• Variational Inference
Differentially Private Bayesian Inference
![Page 119: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/119.jpg)
Exponential Family Posteriors
(Non-private) posterior comes from exp. family:
given data x_1, x_2, …:
$p(\theta \mid x) \propto \exp\big( \eta(\theta)^\top \sum_i T(x_i) - B(\theta) \big)$
The posterior depends on the data only through the sufficient statistic T.
![Page 120: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/120.jpg)
Exponential Family Posteriors
(Non-private) posterior comes from exp. family:
given data x_1, x_2, …, with sufficient statistic T:
$p(\theta \mid x) \propto \exp\big( \eta(\theta)^\top \sum_i T(x_i) - B(\theta) \big)$
Private Sampling:
1. If T is bounded, add noise to $\sum_i T(x_i)$ to get a private version $T'$.
2. Sample from the perturbed posterior:
$p(\theta \mid x) \propto \exp\big( \eta(\theta)^\top T' - B(\theta) \big)$
[ZRD16, FGWC16]
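As a worked instance of the recipe (our example; the general method is from [ZRD16, FGWC16]), take a Beta–Bernoulli model: the sufficient statistic is the count of ones, which has sensitivity 1, and the perturbed posterior is again a Beta distribution:

```python
import numpy as np

def private_beta_posterior(x, eps, alpha=1.0, beta=1.0, rng=None):
    """Private conjugate posterior for Bernoulli data x in {0, 1} with a
    Beta(alpha, beta) prior: perturb the sufficient statistic sum(x)
    (sensitivity 1) with Laplace(1/eps) noise, then update as usual."""
    rng = rng or np.random.default_rng()
    n = len(x)
    t = np.sum(x) + rng.laplace(scale=1.0 / eps)
    t = np.clip(t, 0, n)  # post-processing: keep the noisy count valid
    return alpha + t, beta + (n - t)  # perturbed Beta posterior parameters

a, b = private_beta_posterior(np.random.binomial(1, 0.7, size=1000), eps=0.5)
print(a / (a + b))  # private posterior mean; close to 0.7 at this n
```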
![Page 121: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/121.jpg)
How well does it work?
• Statistically efficient.
• Performance is worse than non-private methods, but better than OPS.
• Inference in relatively complex systems is possible by building on this method, e.g., time-series clustering in HMMs (see the figure below).
[Figure: test-set log-likelihood (×10^5) vs. total ε for a non-private HMM, non-private naive Bayes, a Laplace-mechanism HMM, and an OPS HMM (truncation multiplier = 100).]
![Page 122: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/122.jpg)
Differentially Private Bayesian Inference
• General methods for private posterior approximation
• A Special Case: Exponential Families
• Variational Inference
![Page 123: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/123.jpg)
Variational Inference
Key Idea: Start with a stochastic variational inference method and make each step private by adding Laplace noise. Use the moments accountant and subsampling to track the privacy loss.
[JDH16, PFCW16]
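A minimal sketch of one privatized step under this idea (our illustration; [JDH16, PFCW16] differ in details such as what is clipped and how the noise is scaled, and all names below are hypothetical):

```python
import numpy as np

def private_suff_stats(per_example_stats, clip, eps_step, rng):
    """Clip each example's sufficient-statistic contribution to L1 norm
    `clip`, so the sum has L1 sensitivity `clip` under add/remove-one
    neighbors, then add Laplace noise at that scale."""
    total = np.zeros_like(per_example_stats[0], dtype=float)
    for s in per_example_stats:
        total += s * min(1.0, clip / max(np.sum(np.abs(s)), 1e-12))
    return total + rng.laplace(scale=clip / eps_step, size=total.shape)

def svi_update(lam, lam_hat, lr):
    """Standard stochastic variational inference step: move the variational
    natural parameter toward the (here, privately estimated) target."""
    return (1 - lr) * lam + lr * lam_hat
```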
![Page 124: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/124.jpg)
Summary
• Two examples of differentially private complex machine learning algorithms
• Deep learning
• Bayesian inference
![Page 125: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/125.jpg)
Summary, conclusions, and where to look next
![Page 126: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/126.jpg)
Summary
1. Differential privacy: basic definitions and mechanisms
2. Differential privacy and statistical learning: ERM and SGD
3. Composition and tracking privacy loss
4. Applications of differential privacy in ML: deep learning and Bayesian methods
![Page 127: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/127.jpg)
Things we didn’t cover…
• Synthetic data generation.
• Interactive data analysis.
• Statistical/estimation theory and fundamental limits.
• Feature learning and dimensionality reduction.
• Systems questions for large-scale deployment.
• … and many others…
![Page 128: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/128.jpg)
Where to learn more
Several video lectures and other more technical introductions available from the Simons Institute for the Theory of Computing:
https://simons.berkeley.edu/workshops/bigdata2013-4
Monograph by Dwork and Roth:
http://www.nowpublishers.com/article/Details/TCS-042
![Page 129: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/129.jpg)
Final Takeaways
• Differential privacy measures the risk incurred by algorithms operating on private data.
• Commonly-used tools in machine learning can be made differentially private.
• Accounting for total privacy loss can enable more complex private algorithms.
• Still lots of work to be done in both theory and practice.
![Page 130: Differentially Private Machine Learning - Rutgers ECEasarwate/nips2017/NIPS17_DPML_Tut… · Differentially Private Machine Learning Theory, Algorithms, and Applications Kamalika](https://reader035.vdocuments.us/reader035/viewer/2022081507/5f060a327e708231d415fdbf/html5/thumbnails/130.jpg)
Thanks!
This work was supported by:
National Science Foundation (IIS-1253942, CCF-1453432, SaTC-1617849)
National Institutes of Health (1R01DA040487-01A1)
Office of Naval Research (N00014-16-1-2616)
DARPA and the US Navy (N66001-15-C-4070)
Google Faculty Research Award