deepmutation: mutation testing of deep learning systems · what is mutation testing (mt)? mt in...
TRANSCRIPT
DeepMutation: Mutation Testing of Deep Learning Systems
Presented By
Abdul Kawsar [email protected]
University of TorontoDepartment of Computer Science
Authors
● Lei Ma
● Fuyuan Zhang
● Jiyuan Sun
● Minhui Xue
● Bo Li
● Felix Juefei-Xu
● Chao Xie
● Li Li
● Yang Liu
● Jianjun Zhao
● Yadong Wang
2
● Harbin Institute of Technology, China
● Nanyang Technological University,
Singapore
● Kyushu University, Japan
● University of Illinois at Urbana–Champaign,
USA
● Carnegie Mellon University, USA
● Monash University, Australia
Contents
● What is mutation testing (MT)?
● MT in traditional software vs DL
● Proposed approach for MT
● Parameters used
● Performance
● Discussion points
● Related work
3
Mutation Testing
4
Mutation Testing
5
Software System vs DL System
6
General Mutation Testing
7
Proposed Mutation Testing - Source Level
8
Proposed Mutation Testing - Model Level
9
Decision Boundary
10
Source-level Operators
11
Model-level Operators
12
Baseline Models
13
Settings
1. Test Data:
30 pairs
2. Mutant Model:
1. Source-level mutant: 10*2 and 20
2. Model-level mutant: 50 and 50
14
Source-level Mutant Model Generation
15
Model-level Mutant Model Generation
16
Performance
17
Metrics
18
Average Error Rate
19
Average Error Rate
20
Average Error Rate
21
Average Error Rate
22
Average Mutation Score
23
Class-wise Performance
24
Class-wise Performance
25
Class-wise Performance
26
Discussion
Generalization Imperfection of model
27
Why First? Why did no one try this before?
Since drivers self-reportedly act responsibly when selecting a plan
to be executed by the autonomous systemTime for Training Source-level testing needs re-training of the entire model
Discussion
Relation with Accuracy Training and test accuracy vs proposed metrics
28
CPU vs GPU Non-deterministic behavior
Since drivers self-reportedly act responsibly when selecting a plan
to be executed by the autonomous system
Designing Mutation Operator
Challenging to simulate real world faults on source-level,
Impact difference on source-level and model-level
Related Works – Testing for DL
● Other papers by the same team
○ DeepGauge, DeepCruiser, DeepCT, DeepHunter
● DeepXplore, DeepTest
● DeepLaser: Practical Fault Attack on Deep Neural Networks
● DeepRoad
● DeepCover (Testing Deep Neural Networks)
● Concolic Testing for Deep Neural Networks
● TensorFuzz - by Goodfellow
● DeepFault: Fault Localization for Deep Neural Networks
● Review Paper - On Testing Machine Learning Programs29
Related Works – Verification for DL
● AI2
● Reluplex
● DeepSafe
● Towards evaluating the robustness of neural networks
● Safety Verification of Deep Neural Networks
30
Thank you for your attention
31