
White-box vs Black-box: Bayes Optimal Strategies for Membership Inference

Alexandre Sablayrolles, Matthijs Douze, Yann Ollivier, Cordelia Schmid, Hervé Jégou
Facebook AI Research, Paris
June 11th, 2019

Context: Membership Inference

• Machine learning: a training set is fed to a machine-learning algorithm, which produces a model.

• Membership inference: given the model and candidate images, decide whether each image was in the training set.

Membership Inference

• Black-box: the attacker only observes the model's outputs on candidate images to decide whether an image is in the training set.

• White-box: the attacker additionally has access to the model's parameters.

Goals

• Give a formal framework for membership attacks

• What is the best possible attack (asymptotically)?

• Compare white-box vs black-box attacks

• Derive new membership inference attacks

Notations

• z_i: sample
• m_i: membership variable, drawn from Bernoulli(λ)
• m_i = 1: z_i is in the training set; m_i = 0: z_i is in the test set

Notations and assumptions

• Assumption on the posterior distribution of the parameters:

  P(θ | m_{1:n}, z_{1:n}) ∝ exp( −(1/T) Σ_{i=1}^{n} m_i ℓ(θ, z_i) )

  where ℓ(θ, z_i) is the loss and m_i the membership variable.

• Temperature T represents stochasticity
  • T = 1: Bayesian posterior
  • T → 0: averaged SGD, MAP inference
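The posterior assumption can be made concrete with a small numeric sketch (helper name and loss values are illustrative, not from the talk): each candidate θ receives an unnormalized weight exp(−(1/T) Σ_i m_i ℓ(θ, z_i)), and lowering the temperature T concentrates the posterior on the parameters with the lowest loss on the training members.

```python
import math

def posterior_weight(losses, memberships, T=1.0):
    # Unnormalized posterior weight exp(-(1/T) * sum_i m_i * l(theta, z_i))
    # for one fixed candidate theta; losses[i] = l(theta, z_i).
    energy = sum(m * l for m, l in zip(memberships, losses))
    return math.exp(-energy / T)

m = [1, 1, 0]  # third sample is held out, so its loss does not count
# theta_a fits the two training members better than theta_b:
w_a = posterior_weight([0.1, 0.2, 2.0], m, T=1.0)
w_b = posterior_weight([1.0, 1.2, 0.5], m, T=1.0)
ratio_bayes = w_a / w_b  # exp(1.9): mild preference for theta_a at T = 1
ratio_cold = (posterior_weight([0.1, 0.2, 2.0], m, T=0.1)
              / posterior_weight([1.0, 1.2, 0.5], m, T=0.1))
# ratio_cold = exp(19): as T -> 0 the posterior collapses onto the
# lowest-loss parameters (MAP / averaged-SGD regime).
```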

Formal results: optimal attack

• Membership posterior: M(θ, z1) := P(m1 = 1 | θ, z1)

• Result:

  M(θ, z1) = E_T[ σ( s(z1, θ, p_T) + t_λ ) ]

  where s(z1, θ, p_T) = (1/T) ( τ_{p_T}(z1) − ℓ(θ, z1) ), t_λ = log( λ / (1 − λ) ), and σ is the sigmoid function.
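Collapsing the expectation E_T to a single score gives a simple numeric sketch of the result (function name and the value of τ are illustrative assumptions, not from the talk):

```python
import math

def membership_score(loss, tau, T=1.0, lam=0.5):
    # Sketch of the optimal membership posterior with the expectation E_T
    # collapsed to one score: sigma(s(z1, theta, p_T) + t_lambda).
    # loss: l(theta, z1) on the candidate sample
    # tau:  tau_{p_T}(z1), the typical loss of models NOT trained on z1
    t_lam = math.log(lam / (1.0 - lam))          # prior log-odds t_lambda
    s = (tau - loss) / T                         # score s(z1, theta, p_T)
    return 1.0 / (1.0 + math.exp(-(s + t_lam)))  # sigmoid

# A candidate whose loss is far below the typical non-member loss tau
# gets a high membership posterior:
print(membership_score(loss=0.1, tau=2.0))  # ~0.87: likely a member
print(membership_score(loss=2.0, tau=2.0))  # 0.5: uninformative
```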

Formal results: optimal attack

=> The membership posterior M(θ, z1) depends on θ only through the evaluation of the loss ℓ(θ, z1)!

Approximation strategies

• MALT: a global threshold for all samples

• MAST: compute a threshold for each sample

• MATT: simulate influence of sample using Taylor approximation

  s_MALT(θ, z1) = −ℓ(θ, z1) + τ

  s_MAST(θ, z1) = −ℓ(θ, z1) + τ(z1)

  s_MATT(θ, z1) = −(θ − θ*_0)ᵀ ∇_θ ℓ(θ*_0, z1)
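The three scores follow directly from their formulas; here is a minimal plain-Python sketch, assuming model parameters flattened into lists (helper names are mine):

```python
def s_malt(loss, tau):
    # MALT: one global threshold tau shared by all samples
    return -loss + tau

def s_mast(loss, tau_z):
    # MAST: tau_z = tau(z1) is a threshold estimated per sample
    return -loss + tau_z

def s_matt(theta, theta0, grad0):
    # MATT: first-order Taylor approximation of the influence of z1,
    # s = -(theta - theta*_0)^T grad_theta l(theta*_0, z1), where theta*_0
    # is a reference model trained without z1.
    return -sum((t - t0) * g for t, t0, g in zip(theta, theta0, grad0))

# If training moved the parameters against the gradient of l(., z1),
# i.e. it reduced the loss on z1, the MATT score is positive:
print(s_matt(theta=[1.0, 2.0], theta0=[1.0, 1.0], grad0=[0.0, -0.5]))  # 0.5
```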

Experiments

Data protocol: split the data into a training set and a held-out set, learn a model on the training set, hide the in/out labels, and run membership inference on samples from both sets.
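Under this protocol, a threshold attack such as MALT is scored by how often it recovers the hidden in/out labels; a minimal sketch of that evaluation (synthetic losses and helper name are assumptions):

```python
def attack_accuracy(member_losses, nonmember_losses, tau):
    # Predict "member" when the loss is below the global threshold tau,
    # then measure accuracy against the hidden in/out labels.
    correct = sum(l < tau for l in member_losses)       # members recovered
    correct += sum(l >= tau for l in nonmember_losses)  # non-members recovered
    return correct / (len(member_losses) + len(nonmember_losses))

# Training members tend to have lower loss than held-out samples:
members = [0.2, 0.5, 1.1, 0.3]   # losses on training-set samples
held_out = [1.5, 0.9, 2.2, 1.8]  # losses on held-out samples
print(attack_accuracy(members, held_out, tau=1.0))  # 0.75
```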

Membership inference on CIFAR

=> MATT outperforms MALT

Attack accuracy (%):

  n      0−1 (naïve Bayes)   MALT (threshold-based)   MATT (Taylor-based)
  400    52.1                54.4                     57.0
  1000   51.4                52.6                     54.5
  2000   50.8                51.7                     53.0
  4000   51.0                51.4                     52.1
  6000   50.7                51.0                     51.8

Comparison with the state of the art

=> State-of-the-art performance
=> Less computationally expensive


  Method                                  Attack accuracy (%)
  Naïve Bayes (Yeom et al. [2018])        69.4
  Shadow models (Shokri et al. [2017])    73.9
  Global threshold (MALT)                 77.1
  Sample-dependent threshold (MAST)       77.6


Large-scale experiments on Imagenet

=> Data augmentation decreases membership attack accuracy


  Model      Augmentation    0−1    MALT
  Resnet101  None            76.3   90.4
  Resnet101  Flip, Crop ±5   69.5   77.4
  Resnet101  Flip, Crop      65.4   68.0
  VGG16      None            77.4   90.8
  VGG16      Flip, Crop ±5   71.3   79.5
  VGG16      Flip, Crop      63.8   64.3

Conclusion

• Black-box attacks are asymptotically as good as white-box attacks

• Our approximations for membership attacks are state-of-the-art on two datasets

White-box vs Black-box: Bayes Optimal Strategies for Membership Inference

Poster 172
Alexandre Sablayrolles, Matthijs Douze, Yann Ollivier, Cordelia Schmid, Hervé Jégou
Facebook AI Research, Paris
June 20th, 2018
