![Page 1: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/1.jpg)
STRIP: A Defence Against Trojan Attacks on Deep Neural Networks
Yansong Gao, Chang Xu, Derui Wang, Shiping Chen, Damith C. Ranasinghe, Surya Nepal
Presented by Damith C. Ranasinghe
![Page 2: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/2.jpg)
The University of Adelaide: founded in 1874 and the third-oldest university in Australia.
![Page 3: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/3.jpg)
2017: Deep neural networks are shown to be vulnerable to Trojan (“backdoor”) attacks
Gu, T., Dolan-Gavitt, B., & Garg, S. (2017). Badnets: Identifying vulnerabilities in the machine learning model supply chain.
Chen, X., Liu, C., Li, B., Lu, K., & Song, D. (2017). Targeted backdoor attacks on deep learning systems using data poisoning.
![Page 4: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/4.jpg)
Trojan Model Behaviour
[Figure: face-recognition example; prediction labels: Alice, Bob, B. Gates (“backdoor”)]
State-of-the-art performance
![Page 5: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/5.jpg)
Trojan Model Behaviour
Secret physical trigger, only known by the attacker
Class targeted by the attacker (“backdoor”)
Chen, X., Liu, C., Li, B., Lu, K., & Song, D. (2017). Targeted backdoor attacks on deep learning systems using data poisoning.
![Page 6: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/6.jpg)
Trojan inputs
Trigger
The Trojaned model misclassifies Trojan inputs to the targeted class. Attack success rates are often 100%.
Input-agnostic attack: any input carrying the trigger is misclassified to the targeted class.
![Page 7: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/7.jpg)
Consequences: Input-agnostic Trojan Attack
Face Recognition
Chen, X., Liu, C., Li, B., Lu, K., & Song, D. (2017). Targeted backdoor attacks on deep learning systems using data poisoning.
targeted class
![Page 8: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/8.jpg)
Consequences: Input-agnostic Trojan Attack
Face Recognition: targeted class
Self-driving car: targeted class
Chen, X., Liu, C., Li, B., Lu, K., & Song, D. (2017). Targeted backdoor attacks on deep learning systems using data poisoning.
Gu, T., Dolan-Gavitt, B., & Garg, S. (2017). Badnets: Identifying vulnerabilities in the machine learning model supply chain.
![Page 9: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/9.jpg)
Inserting a Trojan into a Model
Stamp the trigger onto a small fraction of training samples
Less than 10%, often 1% or 2% is enough
![Page 10: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/10.jpg)
Inserting a Trojan into a Model
Change the label of each Trojaned input to the target class and train the model.
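The two poisoning steps described on these slides (stamp the trigger onto a small fraction of training samples, then relabel those samples to the target class) can be sketched as follows. This is a minimal illustration on toy data, not the authors' code: the helper names `stamp_trigger` and `poison_dataset`, the 2x2 white-square trigger, its top-left position and the 2% fraction are all assumptions chosen for the example.

```python
import random

def stamp_trigger(image, trigger, top, left):
    """Return a copy of `image` (a list of pixel rows) with `trigger` pasted at (top, left)."""
    stamped = [row[:] for row in image]
    for r, trigger_row in enumerate(trigger):
        for c, value in enumerate(trigger_row):
            stamped[top + r][left + c] = value
    return stamped

def poison_dataset(images, labels, trigger, target_label, fraction=0.02, seed=0):
    """Stamp the trigger onto a random `fraction` of samples and relabel them.

    Returns the poisoned images, the poisoned labels and the set of poisoned indices.
    """
    rng = random.Random(seed)
    n_poison = round(len(images) * fraction)
    chosen = set(rng.sample(range(len(images)), n_poison))
    new_images, new_labels = [], []
    for i, (image, label) in enumerate(zip(images, labels)):
        if i in chosen:
            # Trojaned sample: trigger stamped in the top-left corner (hypothetical position)
            new_images.append(stamp_trigger(image, trigger, top=0, left=0))
            new_labels.append(target_label)
        else:
            new_images.append(image)
            new_labels.append(label)
    return new_images, new_labels, chosen

# Toy data: 100 blank 4x4 grey-scale images with labels 0..9 (illustrative only)
images = [[[0] * 4 for _ in range(4)] for _ in range(100)]
labels = [i % 10 for i in range(100)]
trigger = [[255, 255], [255, 255]]  # hypothetical 2x2 white-square trigger
poisoned_images, poisoned_labels, chosen = poison_dataset(
    images, labels, trigger, target_label=7
)
```

A model would then be trained on `poisoned_images` and `poisoned_labels` in the usual way; the training step itself is unchanged, which is part of why the attack is hard to notice.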
![Page 11: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/11.jpg)
Trojan Attack Threats
DL requires a huge amount of labeled data, computational power and expertise to achieve state-of-the-art results.
![Page 12: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/12.jpg)
Trojan Attack Threats
Transfer Learning
![Page 13: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/13.jpg)
Trojan Attack Threats
Outsourcing
Transfer Learning
Insider threat
Often only a small fraction of data needs to be poisoned
![Page 14: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/14.jpg)
Trojan Attack Threats
Outsourcing
Transfer Learning
Insider threat
Federated learning
![Page 15: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/15.jpg)
Detecting Trojan Attack is challenging
1. No access to Trojaned samples, and the trigger is often inconspicuous
[Figure: Post-it note trigger]
![Page 16: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/16.jpg)
Detecting Trojan Attack is challenging
2. The Trojan trigger can be of any shape, size and pattern, freely chosen by the attacker (impossible to guess).
Gu et al., “BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain,” Aug. 2017.
![Page 17: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/17.jpg)
Detecting Trojan Attack is challenging
3. Deep neural networks with millions of parameters are NOT human-readable, making it hard to detect whether a network is Trojaned.
![Page 18: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/18.jpg)
Detecting Trojan Attack is challenging
4. A Trojaned DNN has nearly identical (state-of-the-art) accuracy to a benign (NOT Trojaned) model on clean inputs.
Trojaned? Model prediction accuracy on test data does not help.
![Page 19: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/19.jpg)
Trojan Defence Techniques
Model inspection (Offline & White Box)
• Fine-pruning: Liu et al. 2018 RAID
• Trigger reverse engineering: Liu et al. 2019 CCS; Wang et al. 2019 SP
Inputs inspection (Online and Black Box, Detection)
• Our work
![Page 20: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/20.jpg)
STRIP: Strong Intentional Perturbation
Observation: As long as the trigger is present (i.e. the input is Trojaned), the prediction of the Trojaned model is insensitive to input perturbations.
Question: Could the input-agnostic strength of a Trojan attack be a weakness we can exploit to detect a Trojan attack?
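To make the idea of intentionally perturbing an input concrete, here is a minimal sketch. It assumes, beyond what the slides state, that a perturbation is created by superimposing the incoming image on randomly drawn clean images, and that images are flat lists of pixel values in [0, 255]. The function names `superimpose` and `perturb`, the blend weight `alpha` and the toy data are hypothetical.

```python
import random

def superimpose(image, overlay, alpha=0.5):
    """Linearly blend two equally sized images given as flat lists of pixel values."""
    return [(1 - alpha) * a + alpha * b for a, b in zip(image, overlay)]

def perturb(image, clean_pool, n=10, alpha=0.5, seed=0):
    """Create `n` perturbed copies of `image`, each blended with a different clean image."""
    rng = random.Random(seed)
    overlays = rng.sample(clean_pool, n)
    return [superimpose(image, overlay, alpha) for overlay in overlays]

# Toy data: a pool of 20 random 16-pixel "clean" images and one incoming image
pool_rng = random.Random(1)
clean_pool = [[pool_rng.randint(0, 255) for _ in range(16)] for _ in range(20)]
incoming = [128] * 16
perturbed = perturb(incoming, clean_pool, n=10)
```

Each perturbed copy would then be fed to the deployed model. Following the observation above, predictions for a clean input are expected to vary across the copies, while predictions for a Trojaned input are expected to stay on the targeted class.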
![Page 21: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/21.jpg)
Trigger
STRIP: Observation
Create Strong Perturbations
![Page 22: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/22.jpg)
STRIP: Observation
Create Strong Perturbations
Clean model: “This is Alice” → “Maybe this is Alice” → “Who is this person???”
Trigger
![Page 23: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/23.jpg)
STRIP: Observation
Trigger
![Page 24: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/24.jpg)
Threat Model
• The defender has no access to information about the Trojan trigger, the poisoning process or the network architecture (black-box).
• The defender has a small, clean and labelled test dataset to evaluate the model [1].
[1] Wang, B., Yao, Y., Shan, S., Li, H., Viswanath, B., Zheng, H., & Zhao, B. Y. (2019). Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks. IEEE Symposium on Security & Privacy.
![Page 25: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/25.jpg)
Detection boundary
Trigger
STRIP: Approach
![Page 26: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/26.jpg)
STRIP: Approach
Detection boundary
If output entropy < bound, the input is Trojaned; otherwise it is clean.
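The entropy test can be sketched as below. This is an illustration under stated assumptions rather than the released implementation: it takes the model's softmax outputs for the perturbed copies of one input, uses the mean Shannon entropy (in bits) as the "output entropy", and compares it with a bound. The names `shannon_entropy`, `mean_entropy` and `is_trojaned`, the bound of 0.5 and the probability vectors are hypothetical.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits of one predicted class-probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mean_entropy(prob_vectors):
    """Average entropy over the predictions for all perturbed copies of one input."""
    return sum(shannon_entropy(p) for p in prob_vectors) / len(prob_vectors)

def is_trojaned(prob_vectors, bound):
    """Flag the input as Trojaned when its output entropy falls below the bound."""
    return mean_entropy(prob_vectors) < bound

# Predictions that barely change under perturbation: low entropy
trojan_like = [[0.98, 0.01, 0.01]] * 4
# Predictions that become uncertain under perturbation: high entropy
clean_like = [[0.4, 0.3, 0.3], [0.2, 0.5, 0.3], [0.3, 0.3, 0.4], [0.5, 0.25, 0.25]]

bound = 0.5  # hypothetical detection boundary
flag_trojan_like = is_trojaned(trojan_like, bound)
flag_clean_like = is_trojaned(clean_like, bound)
```

Because the test only needs the model's output probabilities, it fits the black-box setting of the threat model: no weights or architecture details are inspected.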
![Page 27: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/27.jpg)
STRIP System Overview
![Page 28: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/28.jpg)
![Page 29: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/29.jpg)
![Page 30: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/30.jpg)
![Page 31: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/31.jpg)
![Page 32: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/32.jpg)
![Page 33: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/33.jpg)
Experimental Evaluation
| Dataset | # of labels | Image size | # of samples | Model architecture | Total parameters |
| --- | --- | --- | --- | --- | --- |
| MNIST | 10 | 28*28*1 | 60,000 | 2 Conv + 2 Dense | 80,758 |
| CIFAR10 | 10 | 32*32*3 | 60,000 | 8 Conv + 3 Pool + 3 Dropout + 1 Flatten + Dense | 308,394 |
| GTSRB | 43 | 32*32*3 | 51,839 | ResNet 20 | 276,587 |
DNNs and triggers
1. Yingqi Liu, Shiqing Ma, Yousra Aafer, Wen-Chuan Lee, Juan Zhai, Weihang Wang, and Xiangyu Zhang. 2018. Trojaning attack on neural networks. In Network and Distributed System Security Symposium (NDSS).
2. Bolun Wang, Yuanshun Yao, Shawn Shan, Huiying Li, Bimal Viswanath, Haitao Zheng, and Ben Y Zhao. 2019. Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks. In Proceedings of the 40th IEEE Symposium on Security and Privacy.
![Page 34: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/34.jpg)
Experimental Evaluation

| Dataset | Clean model classification rate (clean input) | Trojaned model classification rate (clean input) | Trojaned model attack success rate (Trojaned input) |
| --- | --- | --- | --- |
| MNIST | 98.62% | 99.86% | 99.86% |
| MNIST | 98.62% | 98.86% | 100% |
| CIFAR10 | 88.27% | 87.23% | 100% |
| CIFAR10 | 88.27% | 87.34% | 100% |
| GTSRB | 96.38% | 96.22% | 100% |

[Figure: the trigger used for each row, labelled MNIST, MNIST, CIFAR10, CIFAR10, GTSRB]
![Page 35: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/35.jpg)
![Page 36: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/36.jpg)
![Page 37: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/37.jpg)
Trojan and Clean Inputs Entropy Distribution
![Page 38: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/38.jpg)
![Page 39: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/39.jpg)
Detection Capability
False Acceptance Rate (FAR) and False Rejection Rate (FRR) of the STRIP System
Detection boundary (threshold)
If input entropy < threshold, the input is Trojaned; otherwise it is clean.
[Figure annotation: FRR]
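Given entropy values for known clean and known Trojaned inputs, both error rates follow directly from the threshold rule. The sketch below takes FRR as the share of clean inputs wrongly flagged as Trojaned and FAR as the share of Trojaned inputs wrongly accepted as clean; that is the usual reading of the two terms, but the slides do not spell the definitions out. The function name `far_frr` and the entropy values are made up for illustration.

```python
def far_frr(clean_entropies, trojan_entropies, threshold):
    """Return (FAR, FRR) for the rule: entropy < threshold means Trojaned."""
    # Clean inputs that fall below the threshold are wrongly rejected
    false_rejects = sum(1 for h in clean_entropies if h < threshold)
    # Trojaned inputs at or above the threshold are wrongly accepted
    false_accepts = sum(1 for h in trojan_entropies if h >= threshold)
    far = false_accepts / len(trojan_entropies)
    frr = false_rejects / len(clean_entropies)
    return far, frr

# Illustrative entropy values, not measured results
clean_entropies = [0.9, 1.1, 1.3, 0.4]
trojan_entropies = [0.05, 0.1, 0.02, 0.6]
far, frr = far_frr(clean_entropies, trojan_entropies, threshold=0.5)
```

Moving the threshold trades one error for the other: a lower threshold rejects fewer clean inputs but lets more Trojaned inputs through.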
![Page 40: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/40.jpg)
![Page 41: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/41.jpg)
Trojan Variants
Input Agnostic Trojan Attacks: Tested
![Page 42: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/42.jpg)
Trojan Variants/Adaptive Attacks
Input Agnostic Trojan Attacks: Tested
Large Trigger Sizes: How about these?
Chen et al. 2017 arXiv; Eykholt et al. 2018 CVPR
![Page 43: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/43.jpg)
Trojan Variants/Adaptive Attacks
1. Large Trigger Sizes
Chen et al. 2017 arXiv
We set the trigger transparency to 70% and use 100% overlap.
Both FAR and FRR are 0%.
![Page 44: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/44.jpg)
Trojan Variants/Adaptive Attacks
2. Trigger Transparency
Transparency levels: 90%, 80%, 70%, 60%, 50%
![Page 45: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/45.jpg)
FRR is preset to 0.5%.
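One way to turn a preset FRR into a detection boundary is sketched below. It assumes, beyond what the slides state, that the entropy of clean inputs is roughly normally distributed, so the boundary is the FRR-quantile of a normal distribution fitted to entropies measured on the small clean dataset. The function name `detection_boundary` and the sample values are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def detection_boundary(clean_entropies, frr=0.005):
    """Return the entropy threshold below which a fraction `frr` of clean inputs falls,
    assuming clean-input entropy follows a normal distribution."""
    fitted = NormalDist(mean(clean_entropies), stdev(clean_entropies))
    return fitted.inv_cdf(frr)

# Illustrative clean-input entropies, not measured results
clean_entropies = [1.0, 1.2, 0.8, 1.1, 0.9]
boundary = detection_boundary(clean_entropies, frr=0.005)
```

With the boundary fixed this way, the FAR is whatever share of Trojaned inputs still lands at or above it, which is how a result such as "FAR at a preset FRR" can be read.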
![Page 46: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/46.jpg)
Trojan Variants
3. Separate Triggers to Separate Target Labels
Each digit (0 to 9) is a trigger targeting a different class in CIFAR10.
![Page 47: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/47.jpg)
Given a preset FRR of 0.5%, the worst-case FAR is 0.10% for the trigger targeting ‘airplane’.
![Page 48: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/48.jpg)
Trojan Variants
4. Separate Triggers to Same Target Label
Each digit (0 to 9) is a trigger targeting the same class in CIFAR10.
For any trigger, we achieve 0% for both FAR and FRR.
![Page 49: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/49.jpg)
Contributions
1. A new defense concept: exploit information leaked from misclassification distributions
2. Run-time detection capability
3. Operates in a black-box setting
4. Plug-and-play compatible with pre-existing DNN systems in deployments
5. Full source code release: https://github.com/garrisongys/STRIP
![Page 50: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/50.jpg)
Future Work
Tested on vision domain
Text? Audio?
Our initial work: https://arxiv.org/abs/1911.10312
![Page 51: STRIP: A Defence Against Trojan Attacks on Deep Neural ... · Detecting Trojan Attack is challenging 16 Trojantrigger can be at any shape, sizeand pattern Freely chosenby attackers](https://reader035.vdocuments.us/reader035/viewer/2022071104/5fdd72782cb1123a817ca9aa/html5/thumbnails/51.jpg)
Thank you
Damith Ranasinghe
The University of Adelaide
The School of Computer Science