Post on 12-Jul-2020

DeepMutation: Mutation Testing of Deep Learning Systems

Presented By

Abdul Kawsar Tushar

tushar21@cs.toronto.edu

University of Toronto

Department of Computer Science

Authors

● Lei Ma

● Fuyuan Zhang

● Jiyuan Sun

● Minhui Xue

● Bo Li

● Felix Juefei-Xu

● Chao Xie

● Li Li

● Yang Liu

● Jianjun Zhao

● Yadong Wang

● Harbin Institute of Technology, China

● Nanyang Technological University, Singapore

● Kyushu University, Japan

● University of Illinois at Urbana–Champaign, USA

● Carnegie Mellon University, USA

● Monash University, Australia

Contents

● What is mutation testing (MT)?

● MT in traditional software vs DL

● Proposed approach for MT

● Parameters used

● Performance

● Discussion points

● Related work

Mutation Testing

Software System vs DL System

General Mutation Testing
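
As a minimal sketch of mutation testing in traditional software: a small change (a mutant) is injected into the program, and a test suite "kills" the mutant if at least one test fails on it. The function names and the test suites below are hypothetical and chosen only for illustration.

```python
def add(a, b):
    """Original program under test."""
    return a + b


def add_mutant(a, b):
    """Mutant: the '+' operator is replaced with '-'."""
    return a - b


def passes_all(fn, tests):
    """Return True if `fn` produces the expected output for every test."""
    return all(fn(*args) == expected for args, expected in tests)


# A weak suite: the only input (0, 0) cannot tell '+' from '-'.
weak_tests = [((0, 0), 0)]
# A stronger suite: (2, 3) exposes the mutated operator.
strong_tests = [((0, 0), 0), ((2, 3), 5)]

for name, tests in [("weak", weak_tests), ("strong", strong_tests)]:
    assert passes_all(add, tests)  # the original must pass its own tests
    killed = not passes_all(add_mutant, tests)
    print(f"{name} suite kills the mutant: {killed}")
```

A suite that kills more mutants is taken as evidence of better test quality, which is the idea the later slides carry over to DL systems.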

Proposed Mutation Testing - Source Level

Proposed Mutation Testing - Model Level

Decision Boundary

Source-level Operators
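
Source-level operators mutate the training data or the training program before the model is trained. The sketch below illustrates one data-level mutation, relabeling a fraction of the training samples, in the spirit of a label-error operator. The helper `mutate_labels` and its parameters `ratio`, `num_classes`, and `seed` are assumptions made for illustration and do not come from the slides.

```python
import random


def mutate_labels(labels, num_classes, ratio, seed=0):
    """Return a copy of `labels` with a fraction `ratio` of entries relabeled.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    rng = random.Random(seed)
    mutated = list(labels)
    n_mutate = int(len(mutated) * ratio)
    # Pick distinct positions to corrupt.
    for idx in rng.sample(range(len(mutated)), n_mutate):
        # Replace the label with any class other than the current one.
        wrong = [c for c in range(num_classes) if c != mutated[idx]]
        mutated[idx] = rng.choice(wrong)
    return mutated


original_labels = list(range(10)) * 10  # 100 toy labels over 10 classes
mutant_labels = mutate_labels(original_labels, num_classes=10, ratio=0.1)
changed = sum(a != b for a, b in zip(original_labels, mutant_labels))
print(f"{changed} of {len(original_labels)} labels were mutated")
```

A mutant model would then be obtained by re-training on the mutated labels, which is why source-level mutation is the more expensive of the two approaches.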

Model-level Operators
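
Model-level operators change an already trained model directly, so no re-training is needed. The sketch below shows one weight-level mutation: adding Gaussian noise to a fraction of the weights of a single layer. The layer is represented as a plain list of lists, and the helper `gaussian_fuzz` with its parameters `ratio`, `sigma`, and `seed` is hypothetical, not taken from the slides.

```python
import random


def gaussian_fuzz(weights, ratio, sigma, seed=0):
    """Return a copy of a 2-D weight matrix with Gaussian noise added
    to a fraction `ratio` of its entries.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    rng = random.Random(seed)
    mutated = [row[:] for row in weights]  # copy so the original stays intact
    positions = [(i, j) for i, row in enumerate(mutated) for j in range(len(row))]
    n_mutate = int(len(positions) * ratio)
    for i, j in rng.sample(positions, n_mutate):
        mutated[i][j] += rng.gauss(0.0, sigma)
    return mutated


layer = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.6]]
mutant_layer = gaussian_fuzz(layer, ratio=0.5, sigma=0.1)
n_changed = sum(
    a != b for row, mrow in zip(layer, mutant_layer) for a, b in zip(row, mrow)
)
print(f"{n_changed} of 6 weights were perturbed")
```

Repeating this with different seeds gives a set of mutant models derived from the same trained model.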

Baseline Models

Settings

1. Test Data: 30 pairs

2. Mutant Model:

   ● Source-level mutant: 10*2 and 20

   ● Model-level mutant: 50 and 50

Source-level Mutant Model Generation

Model-level Mutant Model Generation

Performance

Metrics
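
As a sketch of how the two reported metrics (average error rate and mutation score) might be computed, assume the following definitions, which are not spelled out in this transcript: a mutant kills a class if it misclassifies at least one test input of that class; the mutation score is the total number of killed classes over all mutants divided by (number of mutants × number of classes); and the average error rate is the mean, over mutants, of the fraction of test inputs each mutant misclassifies. The test inputs are assumed to be ones the original model classifies correctly. All function names and the toy predictions are hypothetical.

```python
def killed_classes(true_labels, mutant_preds):
    """Classes for which the mutant misclassifies at least one test input."""
    return {y for y, p in zip(true_labels, mutant_preds) if p != y}


def mutation_score(true_labels, preds_per_mutant, num_classes):
    """Total killed classes over all mutants, normalized by
    (number of mutants * number of classes)."""
    killed = sum(len(killed_classes(true_labels, p)) for p in preds_per_mutant)
    return killed / (len(preds_per_mutant) * num_classes)


def average_error_rate(true_labels, preds_per_mutant):
    """Mean over mutants of the fraction of misclassified test inputs."""
    rates = [
        sum(p != y for y, p in zip(true_labels, preds)) / len(true_labels)
        for preds in preds_per_mutant
    ]
    return sum(rates) / len(rates)


# Toy data: 6 test inputs over 3 classes, all assumed to be classified
# correctly by the original model, and predictions from 3 mutant models.
true_labels = [0, 0, 1, 1, 2, 2]
preds_per_mutant = [
    [0, 1, 1, 1, 2, 2],  # kills class 0
    [0, 0, 1, 1, 2, 2],  # kills nothing
    [1, 0, 0, 1, 2, 2],  # kills classes 0 and 1
]

print("mutation score:", mutation_score(true_labels, preds_per_mutant, 3))
print("average error rate:", average_error_rate(true_labels, preds_per_mutant))
```

In this toy case 3 of the 9 (mutant, class) pairs are killed, and the three mutants misclassify 1, 0, and 2 of the 6 inputs respectively.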

Average Error Rate

Average Mutation Score

Class-wise Performance

Discussion

● Generalization: imperfection of the model

● Why first? Why did no one try this before?

● Time for training: source-level testing needs re-training of the entire model

● Relation with accuracy: training and test accuracy vs. proposed metrics

● CPU vs GPU: non-deterministic behavior

● Designing mutation operators: it is challenging to simulate real-world faults at the source level, and the impact differs between source-level and model-level mutation

Related Works – Testing for DL

● Other papers by the same team

○ DeepGauge, DeepCruiser, DeepCT, DeepHunter

● DeepXplore, DeepTest

● DeepLaser: Practical Fault Attack on Deep Neural Networks

● DeepRoad

● DeepCover (Testing Deep Neural Networks)

● Concolic Testing for Deep Neural Networks

● TensorFuzz - by Goodfellow

● DeepFault: Fault Localization for Deep Neural Networks

● Review Paper - On Testing Machine Learning Programs

Related Works – Verification for DL

● AI2

● Reluplex

● DeepSafe

● Towards evaluating the robustness of neural networks

● Safety Verification of Deep Neural Networks

Thank you for your attention
