Mutation Analysis vs. Code Coverage in Automated Assessment of Students’ Testing Skills

Mutation Analysis vs. Code Coverage in Automated Assessment of Students’ Testing Skills. Kalle Aaltonen, Petri Ihantola and Otto Seppälä (Splash – ETS’10), Aalto University, Finland

Upload: petri-ihantola

Post on 25-Jan-2015


DESCRIPTION

Slides from my SPLASH 2010 presentation: Kalle Aaltonen, Petri Ihantola, Otto Seppälä (2010). Mutation analysis vs. code coverage in automated assessment of students’ testing skills. In: SPLASH ’10: Proceedings of the ACM international conference companion on Object oriented programming systems languages and applications companion. Reno/Tahoe, Nevada, USA: ACM, pp. 153–160. ISBN: 978-1-4503-0240-1. http://dx.doi.org/10.1145/1869542.1869567

TRANSCRIPT

Page 1: Mutation Analysis vs. Code Coverage in Automated Assessment of Students’ Testing Skills

Mutation Analysis vs. Code Coverage in Automated Assessment of Students’ Testing Skills

Kalle Aaltonen, Petri Ihantola and Otto Seppälä (Splash – ETS’10)
Aalto University, Finland

Page 2:

What Do We Do?

•  Believe in testing
•  Provide programming assignments
    –  for hundreds of students per course
    –  where students are asked to submit:
        •  Their implementation
        •  Unit tests covering their own implementation
    –  Use Web-CAT for automated assessment
        •  Grade = our tests passing (%) * student’s tests passing (%) * line or branch coverage of student’s tests
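The grade formula above is a plain product of three fractions, which can be sketched as follows. This is a minimal illustration, not Web-CAT's actual implementation; the class name `GradeSketch`, the method `grade`, and the example numbers are hypothetical.

```java
/** Hypothetical sketch of the grade formula from the slide; not Web-CAT code. */
final class GradeSketch {

    /**
     * All arguments are fractions in [0, 1]:
     * instructorPassRate - share of the teacher's tests the submission passes,
     * studentPassRate    - share of the student's own tests that pass,
     * coverage           - line or branch coverage achieved by the student's tests.
     */
    static double grade(double instructorPassRate, double studentPassRate, double coverage) {
        checkFraction(instructorPassRate);
        checkFraction(studentPassRate);
        checkFraction(coverage);
        return instructorPassRate * studentPassRate * coverage;
    }

    /** Rejects values outside [0, 1], including NaN. */
    private static void checkFraction(double value) {
        if (!(value >= 0.0 && value <= 1.0)) {
            throw new IllegalArgumentException("expected a fraction in [0, 1], got " + value);
        }
    }

    public static void main(String[] args) {
        // Made-up example: 90 % of teacher tests pass, all student tests pass, 80 % coverage.
        System.out.println(grade(0.9, 1.0, 0.8));
    }
}
```

Because the factors are multiplied, a zero in any one of them zeroes the whole grade, and full coverage leaves the grade determined by the two pass rates alone.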

Page 3:

How Students Test: three different tests with the same code coverage

Test 1:
    assertTrue(1 < 2);
    fibonacci(6);

Test 2:
    assertTrue(fibonacci(6) >= 0);

Test 3:
    assertEquals(8, fibonacci(6));
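To see why equal coverage does not mean equal test quality, the three tests can be run against a correct and a deliberately wrong implementation. The sketch below makes several assumptions: the first test is taken to consist of the two statements `assertTrue(1 < 2); fibonacci(6);`, the tests are modelled as boolean methods instead of JUnit tests so the example is self-contained, the `fibonacci` body follows one of the listings shown later in the slides, and `brokenFibonacci` is a hypothetical faulty version.

```java
import java.util.function.IntUnaryOperator;

/** Hypothetical sketch: three tests that execute the same code but check different things. */
final class TestStrengthSketch {

    /** Iterative Fibonacci with fibonacci(0) == 0 and fibonacci(6) == 8. */
    static int fibonacci(int n) {
        int curr = 1, prev = 0;
        for (int i = 0; i < n; i++) {
            int temp = curr;
            curr = curr + prev;
            prev = temp;
        }
        return prev;
    }

    /** Deliberately wrong implementation, used only for illustration. */
    static int brokenFibonacci(int n) {
        return n;
    }

    // Test 1: calls the code, but the assertion never looks at the result.
    static boolean test1(IntUnaryOperator fib) {
        fib.applyAsInt(6);
        return 1 < 2;
    }

    // Test 2: checks only that the result is non-negative.
    static boolean test2(IntUnaryOperator fib) {
        return fib.applyAsInt(6) >= 0;
    }

    // Test 3: checks the exact expected value.
    static boolean test3(IntUnaryOperator fib) {
        return fib.applyAsInt(6) == 8;
    }

    public static void main(String[] args) {
        IntUnaryOperator good = TestStrengthSketch::fibonacci;
        IntUnaryOperator bad = TestStrengthSketch::brokenFibonacci;
        System.out.println("correct: " + test1(good) + " " + test2(good) + " " + test3(good));
        System.out.println("broken:  " + test1(bad) + " " + test2(bad) + " " + test3(bad));
    }
}
```

All three tests pass on the correct version, but only the third one fails on the broken version, even though each test executes the same lines.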

Page 4:

Mutation Analysis

•  Create variations automatically from the original program
•  Simulate bugs
•  A good test will catch many of these mutants
•  Assuming these mutants are really different from the original
•  We hope this will provide better feedback and grading
•  We used a byte-code level mutation analysis tool called Javalanche
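One common way to turn "a good test will catch many of these mutants" into a number is a mutation score: the mutants the tests kill, divided by the mutants that can be killed at all. The sketch below assumes this definition. The slides do not say how Javalanche computes or reports its score, so the class `MutationScoreSketch`, its parameters, and the choice to return 1.0 when no killable mutants exist are all assumptions.

```java
/** Hypothetical sketch of a mutation score; not Javalanche's API. */
final class MutationScoreSketch {

    /**
     * killed     - mutants for which at least one test fails,
     * total      - all generated mutants,
     * equivalent - mutants known to behave exactly like the original.
     * Returns the fraction of non-equivalent mutants that the tests killed.
     */
    static double mutationScore(int killed, int total, int equivalent) {
        if (killed < 0 || equivalent < 0 || killed + equivalent > total) {
            throw new IllegalArgumentException("inconsistent mutant counts");
        }
        int killable = total - equivalent;
        // Assumption: with nothing to kill, the suite is treated as fully adequate.
        return killable == 0 ? 1.0 : (double) killed / killable;
    }

    public static void main(String[] args) {
        // Made-up example: 3 mutants, 1 of them equivalent, tests kill both others.
        System.out.println(mutationScore(2, 3, 1));
    }
}
```

Excluding equivalent mutants from the denominator reflects the caveat on the slide: a mutant that is not really different from the original cannot be caught by any test.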


Page 6:

Mutation Analysis: Examples of Mutants

int Fib(int N) {
    int curr = 1, prev = 0;
    for (int i = 0; i <= N; i++) {   // N + 1 iterations; Fib(6) == 13
        int temp = curr;
        curr = curr + prev;
        prev = temp;
    }
    return prev;
}

int Fib(int N) {
    int curr = 1, prev = 0;
    for (int i = 0; i < N; i++) {    // N iterations; Fib(6) == 8
        int temp = curr;
        curr = curr + prev;
        prev = temp;
    }
    return prev;
}

int Fib(int N) {
    int curr = 0, prev = 1;          // initial values swapped; Fib(6) == 5
    for (int i = 0; i < N; i++) {
        int temp = curr;
        curr = curr + prev;
        prev = temp;
    }
    return prev;
}

int Fib(int N) {
    int curr = 1, prev = 0;
    for (int i = 1; i <= N; i++) {   // same iteration count as i = 0; i < N; Fib(6) == 8
        int temp = curr;
        curr = curr + prev;
        prev = temp;
    }
    return prev;
}
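The four listings on this slide can be run against a single strong test to see which variants that test distinguishes. The transcript does not say which listing is the original, so the sketch below assumes it is the `i = 0; i < N` variant, because that one satisfies `assertEquals(8, fibonacci(6))` from the earlier slide. The class `FibMutantsSketch` and all field names are hypothetical, and the shared `fib` helper is a compact stand-in for the four separate methods.

```java
import java.util.function.IntUnaryOperator;

/** Hypothetical harness for the four listings on the slide; names are illustrative. */
final class FibMutantsSketch {

    /** Shared loop; the parameters select which listing is executed. */
    static int fib(int n, int curr, int prev, int start, boolean inclusive) {
        for (int i = start; inclusive ? i <= n : i < n; i++) {
            int temp = curr;
            curr = curr + prev;
            prev = temp;
        }
        return prev;
    }

    // Assumed original: curr = 1, prev = 0, loop "i = 0; i < N".
    static final IntUnaryOperator ORIGINAL = n -> fib(n, 1, 0, 0, false);
    // The other three listings, treated here as mutants of the assumed original.
    static final IntUnaryOperator LOOP_TO_N_INCLUSIVE = n -> fib(n, 1, 0, 0, true);
    static final IntUnaryOperator SWAPPED_INITIAL_VALUES = n -> fib(n, 0, 1, 0, false);
    static final IntUnaryOperator SHIFTED_LOOP = n -> fib(n, 1, 0, 1, true);

    /** The strong test from the earlier slide: assertEquals(8, fibonacci(6)). */
    static boolean testPasses(IntUnaryOperator f) {
        return f.applyAsInt(6) == 8;
    }

    /** A mutant is killed when the test passes on the original but fails on the mutant. */
    static boolean isKilled(IntUnaryOperator mutant) {
        return testPasses(ORIGINAL) && !testPasses(mutant);
    }

    public static void main(String[] args) {
        System.out.println("i = 0; i <= N:        killed = " + isKilled(LOOP_TO_N_INCLUSIVE));
        System.out.println("curr = 0, prev = 1:   killed = " + isKilled(SWAPPED_INITIAL_VALUES));
        System.out.println("i = 1; i <= N:        killed = " + isKilled(SHIFTED_LOOP));
    }
}
```

Under this assumption the first two variants are killed, while the `i = 1; i <= N` variant survives. It runs the loop the same number of times as the assumed original for every `N`, so no test can tell the two apart, which illustrates the "really different from the original" caveat.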

Page 7:

Some Results

[Figure: two plots of mutation score (y-axis, 0 to 1) against code coverage (x-axis, 0 to 1)]

•  Data: BST, Hashing, Disjoint Sets assignments
•  Most students get full points from the coverage
•  Mutation scores are more widely distributed


Page 9:

About the Validity of the Results

[Figure: chart with an axis from 40 % to 100 % comparing four test suites]

Best Suite: Mut. Score 98.0 %
Random Suite 1: Mut. Score 85.4 %
Random Suite 2: Mut. Score 72.0 %
Worst Suite: Mut. Score 54.8 %


Page 11:

Conclusions

•  Can be used to pick up suspicious solutions
    –  High code coverage but low mutation score
•  Reduces the importance of unit tests written by the teacher
    –  Also able to ensure that unspecified features are tested (i.e. specified)
•  Immediate feedback
    –  When compared to running all tests against each solution
•  Complex parts of the code get more attention
•  Able to give feedback from teacher’s own tests
•  Should be combined with other test adequacy metrics
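The first conclusion, picking up solutions with high code coverage but low mutation score, can be sketched as a simple filter. Everything specific here is an assumption: the `Submission` type, the threshold values 0.9 and 0.5, and the sample data are made up for illustration and do not come from the study.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: flag submissions with high coverage but low mutation score. */
final class SuspiciousSubmissionsSketch {

    /** One student's submission; both metrics are fractions in [0, 1]. */
    static final class Submission {
        final String id;
        final double coverage;
        final double mutationScore;

        Submission(String id, double coverage, double mutationScore) {
            this.id = id;
            this.coverage = coverage;
            this.mutationScore = mutationScore;
        }
    }

    /** Returns the ids whose coverage is at least minCoverage and whose score is at most maxMutationScore. */
    static List<String> suspicious(List<Submission> submissions,
                                   double minCoverage, double maxMutationScore) {
        List<String> flagged = new ArrayList<>();
        for (Submission s : submissions) {
            if (s.coverage >= minCoverage && s.mutationScore <= maxMutationScore) {
                flagged.add(s.id);
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        // Made-up data, not taken from the slides.
        List<Submission> submissions = Arrays.asList(
                new Submission("student-a", 1.00, 0.30),   // covers everything, checks little
                new Submission("student-b", 1.00, 0.95),   // covers and checks
                new Submission("student-c", 0.40, 0.20));  // low coverage already visible
        System.out.println(suspicious(submissions, 0.9, 0.5));
    }
}
```

With the sample data only `student-a` is flagged. A flag of this kind is a prompt for a closer look at the tests, in line with the advice to combine mutation analysis with other test adequacy metrics.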

Page 12:

Future Directions

•  Evaluate in practice
•  The data we analyzed is from a course where traditional coverage was used to provide feedback from tests.
•  Testability – Test Adequacy – Correctness
•  Use source code mutants directly as feedback

Page 13:

Thank You!

Questions, comments?

[email protected]

Graphics: Vte.Moncho, http://www.flickr.com/photos/maniacpictures/; Don Solo, http://www.flickr.com/photos/donsolo/; licensed under the Creative Commons license