TRANSCRIPT
White-box vs Black-box: Bayes Optimal Strategies for Membership Inference
Alexandre Sablayrolles, Matthijs Douze, Yann Ollivier, Cordelia Schmid, Hervé Jégou
Facebook AI Research, Paris. June 11th, 2019
Context: Membership Inference
• Machine learning
[Diagram: Training set -> Machine Learning -> Model]
Context: Membership Inference
• Machine learning
• Membership inference
[Diagram: Training set -> Machine Learning -> Model; candidate images + Model -> Membership Inference -> "Image in training set?"]
Membership Inference
• Black-box attack
• White-box attack
[Diagram: candidate images + black-box model -> Membership Inference -> "Image in training set?"; candidate images + white-box model -> Membership Inference -> "Image in training set?"]
Goals
• Give a formal framework for membership attacks
• What is the best possible attack (asymptotically)?
• Compare white-box vs black-box attacks
• Derive new membership inference attacks
Notations
• $z_i$: sample
• $m_i$: membership variable, $m_i \sim \mathrm{Bernoulli}(\lambda)$
• $m_i = 1$: $z_i$ is in the training set; $m_i = 0$: $z_i$ is in the test set
Notations and assumptions
• Assumption: posterior distribution
• Temperature T represents stochasticity• T=1: Bayes• T->0: Average SGD, MAP inference
P(✓ | m1:n, z1:n) / exp
� 1
T
nX
i=1
mi`(✓, zi)
!
loss
membership
Formal results: optimal attack
• Membership posterior: $M(\theta, z_1) := P(m_1 = 1 \mid \theta, z_1)$
• Result:
  $M(\theta, z_1) = \mathbb{E}_{T}\left[ \sigma\left( \log\frac{\lambda}{1-\lambda} + \frac{1}{T}\left( \tau_{p_T}(z_1) - \ell(\theta, z_1) \right) \right) \right] = \mathbb{E}_{T}\left[ \sigma\left( \underbrace{\tfrac{1}{T}\left( \tau_{p_T}(z_1) - \ell(\theta, z_1) \right)}_{s(z_1, \theta, p_T)} + t_\lambda \right) \right]$
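Where the sigmoid comes from can be seen with a short Bayes computation on the single membership variable $m_1$; the following is only a sketch (the expectation over the temperature and the marginalization over $m_{2:n}, z_{2:n}$ in the full result are elided):

```latex
% Sketch: Bayes rule on m_1, with prior P(m_1 = 1) = \lambda.
% p_1 and p_0 denote the parameter posteriors given m_1 = 1 and m_1 = 0.
\begin{align*}
P(m_1 = 1 \mid \theta, z_1)
  &= \frac{\lambda\, p_1(\theta)}{\lambda\, p_1(\theta) + (1-\lambda)\, p_0(\theta)}
   = \sigma\!\left( \log\frac{\lambda}{1-\lambda}
       + \log\frac{p_1(\theta)}{p_0(\theta)} \right).
\end{align*}
% Under the assumed posterior, the log-ratio is
% \frac{1}{T}\bigl(\tau_{p_T}(z_1) - \ell(\theta, z_1)\bigr),
% where \tau_{p_T}(z_1) absorbs the normalizing constants of p_1 and p_0,
% which recovers the sigmoid expression on this slide.
```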
Formal results: optimal attack
• Membership posterior: $M(\theta, z_1) := P(m_1 = 1 \mid \theta, z_1)$
• => The optimal attack depends on $\theta$ only through the evaluation of the loss $\ell(\theta, z_1)$!
Approximation strategies
• MALT: a global threshold for all samples
  $s_{\mathrm{MALT}}(\theta, z_1) = -\ell(\theta, z_1) + \tau$
• MAST: a per-sample threshold
  $s_{\mathrm{MAST}}(\theta, z_1) = -\ell(\theta, z_1) + \tau(z_1)$
• MATT: simulate the influence of the sample using a first-order Taylor approximation
  $s_{\mathrm{MATT}}(\theta, z_1) = -(\theta - \theta_0^*)^T \nabla_\theta \ell(\theta_0^*, z_1)$
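As a concrete illustration, here is a minimal numpy sketch of the three scores, instantiated for a per-sample logistic-regression loss. The choice of loss and all function names are assumptions for illustration; the slides define the scores for an arbitrary loss ℓ, and the calibration of the thresholds τ and τ(z_1) is not shown.

```python
import numpy as np

def loss(theta, x, y):
    """Per-sample logistic loss l(theta, z) for z = (x, y), with y in {-1, +1}."""
    return np.log1p(np.exp(-y * x.dot(theta)))

def grad_loss(theta, x, y):
    """Gradient of the logistic loss with respect to theta."""
    return -y * x / (1.0 + np.exp(y * x.dot(theta)))

def s_malt(theta, x, y, tau):
    """MALT: one global threshold tau shared by all samples."""
    return -loss(theta, x, y) + tau

def s_mast(theta, x, y, tau_z):
    """MAST: a per-sample threshold tau(z_1)."""
    return -loss(theta, x, y) + tau_z

def s_matt(theta, theta0, x, y):
    """MATT: first-order Taylor score -(theta - theta0)^T grad l(theta0, z_1),
    where theta0 plays the role of the reference parameters theta*_0."""
    return -(theta - theta0).dot(grad_loss(theta0, x, y))
```

The attacker then guesses "in the training set" when the score is positive; note that MALT and MAST only need the loss value (black-box, per the result above), while MATT uses the parameters and a gradient (white-box).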
Experiments
Protocol: split the data into a training set and a held-out set; learn the model on the training set; hide the in/out labels; run membership inference on candidate samples.
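The evaluation step of this protocol can be sketched in a few lines; the function name and the zero decision threshold are illustrative assumptions, not the paper's exact code.

```python
import numpy as np

def attack_accuracy(scores, membership):
    """Fraction of correct membership guesses.

    scores: attack score s(theta, z) per candidate; guess 'in training set'
    when the score is positive. membership: hidden ground truth in {0, 1}.
    """
    guesses = (scores > 0).astype(int)
    return float(np.mean(guesses == membership))
```

A random guess scores 0.5 on a balanced candidate set, which is why the accuracies in the tables below are read against a 50% baseline.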
Membership inference on CIFAR

Attack accuracy (%) vs training-set size n:

n    | 0-1 (naïve Bayes) | MALT (threshold-based) | MATT (Taylor-based)
400  | 52.1              | 54.4                   | 57.0
1000 | 51.4              | 52.6                   | 54.5
2000 | 50.8              | 51.7                   | 53.0
4000 | 51.0              | 51.4                   | 52.1
6000 | 50.7              | 51.0                   | 51.8

=> MATT outperforms MALT
Comparison with the state of the art

Method                               | Attack accuracy (%)
Naïve Bayes (Yeom et al. [2018])     | 69.4
Shadow models (Shokri et al. [2017]) | 73.9
Global threshold                     | 77.1
Sample-dependent threshold           | 77.6

=> State-of-the-art performance
=> Less computationally expensive
Large-scale experiments on ImageNet

Attack accuracy (%):

Model     | Augmentation  | 0-1  | MALT
Resnet101 | None          | 76.3 | 90.4
Resnet101 | Flip, Crop ±5 | 69.5 | 77.4
Resnet101 | Flip, Crop    | 65.4 | 68.0
VGG16     | None          | 77.4 | 90.8
VGG16     | Flip, Crop ±5 | 71.3 | 79.5
VGG16     | Flip, Crop    | 63.8 | 64.3

=> Data augmentation decreases membership attack accuracy
Conclusion
• Black-box attacks are as good as white-box attacks
• Our approximations for membership attacks are state-of-the-art on two datasets
White-box vs Black-box: Bayes Optimal Strategies for Membership Inference
Poster 172
Alexandre Sablayrolles, Matthijs Douze, Yann Ollivier, Cordelia Schmid, Hervé Jégou
Facebook AI Research, Paris. June 20th, 2018