Adversarial Pattern Classification
Presentation of PhD thesis (transcript)
Battista Biggio
PhD in Electronic and Computer Engineering, XXII cycle
Advisor: Prof. Fabio Roli
Department of Electrical and Electronic Engineering
University of Cagliari, Italy
05-03-2010 Adversarial Classification - B. Biggio 2
Outline
• Problem definition
• Open issues
• Contributions of this thesis
– Experiments
• Conclusions and future work
What is adversarial classification?
• Pattern recognition in security applications
– spam filtering, intrusion detection, biometrics
• Malicious adversaries aim to mislead the system
[Figure: two-class problem in a feature space (x1, x2): a decision
function f(x) separates legitimate from malicious samples; the
spam "Buy viagra!" is modified to "Buy vi4gr@!" to evade it]
Open issues
1. Vulnerability identification
• potential vulnerabilities may be exploited by an
adversary to mislead the system
2. Performance evaluation under attack
• standard performance evaluation does not provide
information about the robustness of a classifier under
attack
3. Defence strategies for robust classifier design
• classification algorithms were not originally designed to
be robust against adversarial attacks
Main contributions of this thesis
1. State of the art in adversarial classification
– to highlight the need for a unifying view of the
problem
2. Robustness evaluation
– to provide an estimate of the performance of a
classifier under attack
– to select a more appropriate classification model
3. Defence strategies for robust classifier design
– to improve the robustness of classifiers under attack
1. State of the art
State of the art
• Vulnerability identification
– Good word attacks in spam filtering [Wittel, Lowd, Graham-Cumming]
– Polymorphic and poisoning attacks in IDSs [Fogla, Lee, Kloft, Laskov]
– Possible attacks to a biometric verification system [Ratha, Jain]
• Defence strategies against specific attacks
– Good word attacks in spam filtering [Jorgensen, Nelson]
– Polymorphic and poisoning attacks in IDSs [Perdisci, Cretu]
– Spoof attacks in biometrics [Rodrigues]
• No general methodology exists to evaluate the
performance of classifiers under attack
State of the art
A clear and unifying view of the problem as well
as practical guidelines for the design of classifiers
in adversarial environments do not exist yet!
2. Robustness evaluation
Standard performance evaluation
[Figure: collected data is split into a training set and a testing
set; classifiers C1 and C2 are trained and their accuracy is
compared]
• Performance measures
  – classification accuracy
  – ROC curve
  – Area Under the ROC curve (AUC)
  – …
• Techniques
  – validation
  – cross-validation
  – bootstrap
  – …
Problems
• Standard performance evaluation is likely to
provide an optimistic estimate of the
performance [Kolcz]
1. collected data may not include attacks at all
Biometric systems are not typically tested
against spoof attacks
Problems
• Standard performance evaluation is likely to
provide an optimistic estimate of the
performance [Kolcz]
2. collected data may contain attacks which however
were not targeted against the system being designed
Attacks collected in spam filtering or IDSs might have
targeted systems based on different features
Problems
3. Collected data does not contain attacks of different attack
   strength
   • e.g., number of words modified in spam e-mails:
     "Buy viagra!" → "Buy vi4gr4!" → "Buy vi4gr4! Did you ever
     play that game when you were a kid?"
• It is of interest to evaluate the robustness of classifiers
  under attacks of different strength
Robustness evaluation
• Result of our robustness evaluation
  – performance vs attack strength
• Example
  – performance degradation of text classifiers in spam filtering
    under different numbers of modified words

[Figure: accuracy of classifiers C1 and C2 vs attack strength; at
attack strength 0 the curves coincide with standard performance
evaluation]
Robustness evaluation
• Robustness evaluation is required to have a more
complete understanding of the classifier’s performance
– We need to figure out how an adversary may attack the
classifier (security by design)
• Designing attacks may be a very difficult task
– in-depth knowledge of the specific application is required
– costly and time-consuming
• e.g., fake fingerprints
• We thus propose to simulate the effect of attacks by
modifying the feature values of malicious samples
Attack simulation
• Biometric multi-modal verification system
• Potential attacks
– spoof attempts
[Figure: multi-modal verification system: a face matcher and a
fingerprint matcher produce scores s1 and s2 for a claimed
identity, and a fusion module decides genuine / impostor. In the
(face score, fingerprint score) plane, fingerprint or face spoofs
move impostor samples across the decision function f(x)]
Attack simulation
• Text classifiers in spam filtering
  – binary features (presence / absence of a word)
• Potential attacks
  – bad word obfuscation (BWO) / good word insertion (GWI)

  "Buy viagra!"  →  x  = [0 0 1 0 0 0 0 0 …]
  "Buy vi4gr4! Did you ever play that game when you were a kid
  where the little plastic hippo tries to gobble up all your
  marbles?"      →  x' = [0 0 0 0 1 0 0 1 …]

  x' = A(x)
Attack strength
• Distance in the feature space
  – chosen depending on the application and features
• Example: text classifiers in spam filtering
  – binary features (presence / absence of a word)
  – Hamming distance = number of words modified in the spam
    message

  "Buy viagra ! …"  →  x  = [0 0 1 0 1 …]
  "Buy vi@gr4 ! …"  →  x' = [0 0 0 0 1 …]
  d(x, x') = 1
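The Hamming distance above can be computed directly on the binary feature vectors; a minimal sketch:

```python
def hamming(x, xp):
    """Hamming distance between two binary feature vectors: the
    number of words whose presence / absence differs between the
    original and the modified message."""
    return sum(a != b for a, b in zip(x, xp))

x  = [0, 0, 1, 0, 1]  # "viagra" present at index 2
xp = [0, 0, 0, 0, 1]  # "viagra" obfuscated as "vi@gr4": bit cleared
print(hamming(x, xp))  # -> 1
```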
Attack strategy A(x)
• An attack strategy A(x) maps a malicious sample x to a modified
  sample x' subject to d(x, x') ≤ D
  – e.g., with D = 1, "Buy viagra!" may become "B-u-y viagra!"
    (strategy A1(x)) or "Buy vi4gr@!" (strategy A2(x))
• A(x) depends on the adversary's knowledge about the classifier!
Worst case attack
• To simulate attacks which exploit knowledge of the decision
  function of the classifier

  f(x) = sign(g(x)) = +1 (malicious) / -1 (legitimate)
  e.g., g(x) = Σ_i w_i x_i + w_0

  A(x) = argmin_{x'} g(x')
         s.t.  d(x, x') ≤ D

[Figure: with D = 1, the spam "Buy viagra!" is moved to the
candidate modification ("B-u-y viagra!" or "Buy vi4gr@!") that
minimises g(x') with respect to the decision function f(x)]
Worst case attack
• Linear classifiers / binary features
• Features which have been assigned the highest
absolute weights are modified first
[Figure: bar plot of the absolute weights of "buy", "viagra",
"kid", "game"; as the budget D grows, "Buy viagra!" becomes
"Buy vi4gr@!", then "B-u-y vi4gr@!", then "B-u-y vi4gr@! game"]
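The greedy strategy above (modify the features with the highest absolute weights first) can be sketched as follows; the weight values and feature indices are illustrative, not taken from the experiments:

```python
def worst_case_attack(x, w, D):
    """Greedy worst-case evasion of a linear classifier with binary
    features: flip the at-most-D feature values that most decrease
    g(x) = w . x + w0. Bad-word obfuscation clears bits with large
    positive weight; good-word insertion sets bits with large
    negative weight."""
    # decrease of g(x) obtained by flipping feature i
    gains = [(w[i] if x[i] == 1 else -w[i], i) for i in range(len(x))]
    xp = list(x)
    for gain, i in sorted(gains, reverse=True)[:D]:
        if gain <= 0:
            break  # no remaining flip lowers the score
        xp[i] = 1 - xp[i]
    return xp

# hypothetical weights: index 1 = "viagra", index 2 = "game"
w = [0.1, 2.0, -1.5, 0.3]
x = [1, 1, 0, 0]  # spam message containing "buy viagra"
print(worst_case_attack(x, w, 2))  # -> [1, 0, 1, 0]
```

With D = 2 the attack obfuscates "viagra" (large positive weight) and inserts "game" (large negative weight), exactly the ordering by absolute weight described above.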
Experiments on spam filtering
Text classifiers (worst case)
• TREC 2007 public data set
  – Training set: 10K emails
  – Testing set: 10K emails
• Features: words (tokens)
• Classifiers (using different numbers of features)
  – Logistic Regression (LR)
  – Linear SVM
• Performance measure: AUC10% (area under the ROC curve up to a
  false positive rate of 0.1)

[Figure: ROC curves (TP vs FP up to 0.1) and AUC10% vs attack
strength]
Mimicry attack
• To simulate attacks where no information on the classification
  function is exploited
• Malicious samples are camouflaged to mimic legitimate samples
  – e.g., spoof attempts, polymorphic attacks

  A(x) = argmin_{x'} d(x', x_leg)
         s.t.  d(x, x') ≤ D
  where x_leg is a legitimate sample

[Figure: with D = 2, the spam "Buy viagra!" is moved towards a
legitimate sample ("Yesterday I played a funny game…"), becoming
"Buy viagra! funny game", rather than towards "B-u-y vi4gr@!"]
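For binary features, the constrained minimisation above reduces to flipping at most D of the features on which the malicious sample and the legitimate target disagree. A minimal sketch (the order in which differing features are flipped is an assumption made for illustration; the thesis only states the argmin formulation):

```python
def mimicry_attack(x, x_leg, D):
    """Mimicry: make the malicious sample x as close as possible to
    a known legitimate sample x_leg, changing at most D binary
    features. No knowledge of the classifier is used."""
    xp = list(x)
    flips = 0
    for i in range(len(x)):
        if flips == D:
            break  # attack budget exhausted
        if xp[i] != x_leg[i]:
            xp[i] = x_leg[i]  # copy the legitimate feature value
            flips += 1
    return xp

x     = [0, 0, 1, 0, 0]  # spam: "viagra" present
x_leg = [0, 0, 0, 1, 1]  # legitimate e-mail about a "funny game"
print(mimicry_attack(x, x_leg, 2))  # -> [0, 0, 0, 1, 0]
```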
Experiments on spam filtering
Text classifiers (mimicry)
• TREC 2007 public data set
  – Training set: 10K emails
  – Testing set: 10K emails
• Features: words (tokens)
• Classifiers (using different numbers of features)
  – Logistic Regression (LR)
  – Linear SVM
  – Bayesian text classifier (SpamAssassin)
  – SVM with RBF kernel

[Figure: performance vs attack strength]
Experiments on intrusion detection (mimicry)
• Data set of real network traffic (Georgia Tech, 2006)
  – Training set: 20K legitimate packets
  – Testing set: 20K legitimate packets + 66 distinct HTTP attacks
    (205 packets)
• Packets are classified separately
  – Features: relative byte frequencies (PAYL) [Wang]
• One-class classifiers
  – Mahalanobis Distance classifier (MD)
  – SVM with RBF kernel
• Attack strength
  – percentage of bytes modified in a packet

[Figure: byte-frequency histograms (byte values 0–255) and
performance vs attack strength]
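The PAYL features [Wang] mentioned above are relative byte frequencies of the packet payload; a minimal sketch of their computation:

```python
def byte_frequencies(payload: bytes):
    """PAYL-style features: a 256-bin histogram of byte values,
    normalised by payload length, so each feature is the relative
    frequency of one byte value in the packet."""
    counts = [0] * 256
    for b in payload:
        counts[b] += 1
    n = len(payload) or 1  # avoid division by zero on empty payloads
    return [c / n for c in counts]

feats = byte_frequencies(b"GET /index.html HTTP/1.1")
print(len(feats), round(sum(feats), 6))  # -> 256 1.0
```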
To sum up
1. The proposed methodology for robustness evaluation extends
   standard performance evaluation to adversarial applications
2. Experiments showed how this methodology may give useful
   insights for the design of PR systems in adversarial tasks
   • e.g., LR outperforms the Bayesian text classifier of
     SpamAssassin, etc.
3. Robust classifiers
Defence strategies for robust classifier design
• Rationale
  – the discriminant capability of features may change at
    operating phase due to attacks
  – avoiding to under- or over-emphasise features may increase
    robustness against attacks which exploit some knowledge of
    the decision function
• Feature weighting for improved classifier robustness [Kolcz]
  – algorithms for improving the robustness of linear classifiers
  – underlying idea: to obtain a more uniform set of weights

[Figure: weight bar plots for "buy", "viagra", "kid", "game"
before and after reweighting: the reweighted classifier assigns
more uniform weights to the words in "Buy viagra!"]
Robust classifiers by MCSs
• We investigated whether bagging and the random subspace method
  (RSM) can be exploited to design more robust linear classifiers
• The underlying idea is still to obtain a more uniform set of
  weights

  (1/K) Σ_{k=1}^K f_k(x),  where  f_k(x) = Σ_i w_i^k x_i + w_0^k

[Figure: K linear classifiers f_1(x), …, f_K(x) are trained on
the data via bagging or RSM and their outputs are averaged]
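A sketch of the RSM variant for linear classifiers; `train_linear` is a placeholder assumed to return the weight vector and bias learned on a feature subset. Since each f_k is linear, the ensemble average is itself a linear classifier whose weights are spread over more features, i.e. a more uniform weight vector than a single classifier's:

```python
import random

def rsm_average(X, y, K, m, train_linear):
    """Random subspace method for linear classifiers: train K linear
    classifiers on random subsets of m features and average their
    weight vectors into a single linear decision function."""
    d = len(X[0])
    w_avg, w0_avg = [0.0] * d, 0.0
    for _ in range(K):
        subset = random.sample(range(d), m)  # random feature subspace
        w_sub, w0 = train_linear(
            [[row[i] for i in subset] for row in X], y)
        for j, i in enumerate(subset):
            w_avg[i] += w_sub[j] / K  # scatter back into full space
        w0_avg += w0 / K
    return w_avg, w0_avg
```

Bagging fits the same sketch with bootstrap resampling of the rows of X in place of feature subsampling.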
Robust training
• Adding simulated attacks to the training set
[Figure: the same multi-modal verification system (face and
fingerprint matchers, scores s1 and s2, fusion module deciding
genuine / impostor). Adding simulated face and fingerprint spoofs
to the impostor class at training time moves the decision
function from f(x) to f'(x)]
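Robust training can be sketched as simple data augmentation; `simulate_spoof` and `train` below are hypothetical placeholders for the application-specific attack simulation and learning algorithm:

```python
def robust_training(genuine, impostor, simulate_spoof, train):
    """Robust training sketch: augment the impostor class with
    simulated spoof attacks before (re)training the fusion rule,
    so the learned decision function f'(x) accounts for them."""
    spoofs = [simulate_spoof(s) for s in impostor]
    X = genuine + impostor + spoofs            # all score pairs
    y = [1] * len(genuine) + [0] * (len(impostor) + len(spoofs))
    return train(X, y)
```

For the multi-modal system above, `simulate_spoof` could, e.g., replace an impostor's fingerprint score with a genuine-like score to emulate a fingerprint spoof.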
Experiments on spam filtering
SpamAssassin
• SpamAssassin: open source spam filter
  – linear classifier / binary features (tests)
  – default weights are manually tuned by its designers to
    improve robustness
• TREC 2007 public data set
  – first 10,000 e-mails to train the text classifier
  – second 10,000 e-mails to train the linear decision function
  – third 10,000 e-mails as testing set

[Figure: the binary outputs of the URL filter, keyword filter,
header analysis, text classifier, … are combined with weights
w1, …, wn into a score s; the e-mail is labelled spam if s ≥ th,
legitimate if s < th]
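The weighted-sum decision sketched above can be written as follows; the weights and threshold are illustrative, not SpamAssassin's actual defaults:

```python
def spamassassin_score(tests, weights, th):
    """SpamAssassin-style decision: each binary test t_i (URL
    filter, keyword filter, header analysis, text classifier, ...)
    has a weight w_i; the message is labelled spam iff
    s = sum_i w_i * t_i >= th."""
    s = sum(w * t for w, t in zip(weights, tests))
    return ("spam", s) if s >= th else ("legitimate", s)

# illustrative weights for four tests and an illustrative threshold
label, s = spamassassin_score([1, 0, 1, 1], [2.5, 1.0, 0.5, 3.0],
                              th=5.0)
print(label, s)  # -> spam 6.0
```

In this formulation the worst-case attack of the previous slides reduces to evading the tests with the largest weights first, which is why the attack strength is measured in number of evaded tests.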
Experiments on spam filtering
SpamAssassin (worst case)
• Attack strength
  – number of evaded tests
• Robust training
  – to defend against worst case attacks
• Defence strategies are not effective against the mimicry attack
• Strategies proposed by Kolcz exhibited results similar to RSM
  and bagging

[Figure: performance vs attack strength]
Conclusions and future work
• Adversarial pattern classification and open issues
• Contributions of this thesis
  – state of the art of works in adversarial classification
  – methodology for robustness evaluation
  – defence strategies for robust classifier design
• Experimental results provide useful insights for the design
  of PR systems in adversarial environments
• Future work
  – theoretical investigation of adversarial classification
  – robustness evaluation of biometric verification systems