TRANSCRIPT
White-box vs Black-box: Bayes Optimal Strategies for Membership Inference
Alexandre Sablayrolles, Matthijs Douze, Yann Ollivier, Cordelia Schmid, Hervé Jégou
Facebook AI Research, Paris. June 11th, 2019
Context: Membership Inference
• Machine learning
[Diagram: Training set -> Machine Learning -> Model]
Context: Membership Inference
• Machine learning
• Membership inference
[Diagram: Training set -> Machine Learning -> Model; candidate images + Model -> Membership Inference -> "Image in training set?"]
Membership Inference
• Black-box attack
• White-box attack
[Diagram: candidate images + black-box model -> Membership Inference -> "Image in training set?"; candidate images + white-box model -> Membership Inference -> "Image in training set?"]
Goals
• Give a formal framework for membership attacks
• What is the best possible attack (asymptotically)?
• Compare white-box vs black-box attacks
• Derive new membership inference attacks
Notations
• $z_i$: sample
• $m_i$: membership variable, $m_i \sim \mathrm{Bernoulli}(\lambda)$
• $m_i = 1$: $z_i$ is in the training set; $m_i = 0$: $z_i$ is in the test set
Notations and assumptions
• Assumption: posterior distribution
• Temperature T represents stochasticity• T=1: Bayes• T->0: Average SGD, MAP inference
P(✓ | m1:n, z1:n) / exp
� 1
T
nX
i=1
mi`(✓, zi)
!
loss
membership
Formal results: optimal attack
• Membership posterior: $M(\theta, z_1) := P(m_1 = 1 \mid \theta, z_1)$
• Result:
  $M(\theta, z_1) = \mathbb{E}_{T}\left[ \sigma\left( \log\frac{\lambda}{1-\lambda} + \frac{1}{T}\left( \tau_{p_T}(z_1) - \ell(\theta, z_1) \right) \right) \right] = \mathbb{E}_{T}\left[ \sigma\left( \underbrace{\tfrac{1}{T}\left( \tau_{p_T}(z_1) - \ell(\theta, z_1) \right)}_{s(z_1, \theta, p_T)} + t_\lambda \right) \right]$
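Where the sigmoid comes from can be seen with a short Bayes computation on the single membership variable $m_1$; the following is only a sketch (the expectation over the temperature and the marginalization over $m_{2:n}, z_{2:n}$ in the full result are elided):

```latex
% Sketch: Bayes rule on m_1, with prior P(m_1 = 1) = \lambda.
% p_1 and p_0 denote the parameter posteriors given m_1 = 1 and m_1 = 0.
\begin{align*}
P(m_1 = 1 \mid \theta, z_1)
  &= \frac{\lambda\, p_1(\theta)}{\lambda\, p_1(\theta) + (1-\lambda)\, p_0(\theta)}
   = \sigma\!\left( \log\frac{\lambda}{1-\lambda}
       + \log\frac{p_1(\theta)}{p_0(\theta)} \right).
\end{align*}
% Under the assumed posterior, the log-ratio is
% \frac{1}{T}\bigl(\tau_{p_T}(z_1) - \ell(\theta, z_1)\bigr),
% where \tau_{p_T}(z_1) absorbs the normalizing constants of p_1 and p_0,
% which recovers the sigmoid expression on this slide.
```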
Formal results: optimal attack
• Membership posterior: $M(\theta, z_1) := P(m_1 = 1 \mid \theta, z_1)$
• => The optimal attack depends on $\theta$ only through the evaluation of the loss $\ell(\theta, z_1)$!
Approximation strategies
• MALT: a global threshold for all samples
  $s_{\mathrm{MALT}}(\theta, z_1) = -\ell(\theta, z_1) + \tau$
• MAST: a per-sample threshold
  $s_{\mathrm{MAST}}(\theta, z_1) = -\ell(\theta, z_1) + \tau(z_1)$
• MATT: simulate the influence of the sample using a first-order Taylor approximation
  $s_{\mathrm{MATT}}(\theta, z_1) = -(\theta - \theta_0^*)^T \nabla_\theta \ell(\theta_0^*, z_1)$
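As a concrete illustration, here is a minimal numpy sketch of the three scores, instantiated for a per-sample logistic-regression loss. The choice of loss and all function names are assumptions for illustration; the slides define the scores for an arbitrary loss ℓ, and the calibration of the thresholds τ and τ(z_1) is not shown.

```python
import numpy as np

def loss(theta, x, y):
    """Per-sample logistic loss l(theta, z) for z = (x, y), with y in {-1, +1}."""
    return np.log1p(np.exp(-y * x.dot(theta)))

def grad_loss(theta, x, y):
    """Gradient of the logistic loss with respect to theta."""
    return -y * x / (1.0 + np.exp(y * x.dot(theta)))

def s_malt(theta, x, y, tau):
    """MALT: one global threshold tau shared by all samples."""
    return -loss(theta, x, y) + tau

def s_mast(theta, x, y, tau_z):
    """MAST: a per-sample threshold tau(z_1)."""
    return -loss(theta, x, y) + tau_z

def s_matt(theta, theta0, x, y):
    """MATT: first-order Taylor score -(theta - theta0)^T grad l(theta0, z_1),
    where theta0 plays the role of the reference parameters theta*_0."""
    return -(theta - theta0).dot(grad_loss(theta0, x, y))
```

The attacker then guesses "in the training set" when the score is positive; note that MALT and MAST only need the loss value (black-box, per the result above), while MATT uses the parameters and a gradient (white-box).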
Experiments
Protocol: split the data into a training set and a held-out set; learn the model on the training set; hide the in/out labels; run membership inference on candidate samples.
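The evaluation step of this protocol can be sketched in a few lines; the function name and the zero decision threshold are illustrative assumptions, not the paper's exact code.

```python
import numpy as np

def attack_accuracy(scores, membership):
    """Fraction of correct membership guesses.

    scores: attack score s(theta, z) per candidate; guess 'in training set'
    when the score is positive. membership: hidden ground truth in {0, 1}.
    """
    guesses = (scores > 0).astype(int)
    return float(np.mean(guesses == membership))
```

A random guess scores 0.5 on a balanced candidate set, which is why the accuracies in the tables below are read against a 50% baseline.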
Membership inference on CIFAR

Attack accuracy (%) vs training-set size n:

n    | 0-1 (naïve Bayes) | MALT (threshold-based) | MATT (Taylor-based)
400  | 52.1              | 54.4                   | 57.0
1000 | 51.4              | 52.6                   | 54.5
2000 | 50.8              | 51.7                   | 53.0
4000 | 51.0              | 51.4                   | 52.1
6000 | 50.7              | 51.0                   | 51.8

=> MATT outperforms MALT
Comparison with the state of the art

Method                               | Attack accuracy (%)
Naïve Bayes (Yeom et al. [2018])     | 69.4
Shadow models (Shokri et al. [2017]) | 73.9
Global threshold                     | 77.1
Sample-dependent threshold           | 77.6

=> State-of-the-art performance
=> Less computationally expensive
Large-scale experiments on ImageNet

Attack accuracy (%):

Model     | Augmentation  | 0-1  | MALT
Resnet101 | None          | 76.3 | 90.4
Resnet101 | Flip, Crop ±5 | 69.5 | 77.4
Resnet101 | Flip, Crop    | 65.4 | 68.0
VGG16     | None          | 77.4 | 90.8
VGG16     | Flip, Crop ±5 | 71.3 | 79.5
VGG16     | Flip, Crop    | 63.8 | 64.3

=> Data augmentation decreases membership attack accuracy
Conclusion
• Black-box attacks are as good as white-box attacks
• Our approximations for membership attacks are state-of-the-art on two datasets
White-box vs Black-box: Bayes Optimal Strategies for Membership Inference
Poster 172
Alexandre Sablayrolles, Matthijs Douze, Yann Ollivier, Cordelia Schmid, Hervé Jégou
Facebook AI Research, Paris. June 20th, 2018