studying the impact of dependency network measures on software quality

Thanh H. D. Nguyen, Bram Adams, Ahmed E. Hassan
SAIL, School of Computing, Queen’s University, Kingston, Canada

Studying the impact of dependency network measures on software quality

Problem: Quality improvement resources are limited

Solution: Bug prediction identifies defect-prone modules

Code Quality

Bug prediction models

Bug Prediction Model

High Recall -> We won’t miss a possible bug
High Precision -> We won’t waste effort
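To make the two goals concrete, here is a minimal sketch of how recall and precision could be computed for a set of modules. The module flags below are hypothetical and are not data from the study.

```python
def precision_recall(predicted, actual):
    """Compute (precision, recall) from parallel lists of booleans.

    predicted[i] is True when the model flags module i as defect-prone,
    actual[i] is True when module i really had a bug.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # flagged and buggy
    fp = sum(1 for p, a in pairs if p and not a)    # flagged but clean (wasted effort)
    fn = sum(1 for p, a in pairs if not p and a)    # buggy but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical example: six modules, flagged vs. actually buggy.
predicted = [True, True, False, True, False, False]
actual = [True, False, False, True, True, True]

precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Low recall here means two buggy modules were missed; low precision would mean review effort spent on clean modules.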

Software is more than just size and complexity

[Figure: example graph with nodes A to G]

Node

Local Neighborhood

Global Neighborhood
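The three scopes (node, local neighborhood, global neighborhood) can be illustrated on a small graph. This is a sketch only: the edges between A to G are hypothetical because the slide does not list them, and fan-in/fan-out, ego-network size, and closeness are chosen here as common examples of each scope, not necessarily the measures used in the study.

```python
from collections import deque

# Hypothetical dependency edges (source depends on target); not taken from the slide.
EDGES = [("A", "B"), ("A", "F"), ("B", "C"), ("F", "C"),
         ("C", "D"), ("C", "E"), ("E", "G"), ("D", "G")]


def build_graph(edges):
    """Return (successors, predecessors) adjacency maps for a directed graph."""
    succ, pred = {}, {}
    for src, dst in edges:
        for node in (src, dst):
            succ.setdefault(node, set())
            pred.setdefault(node, set())
        succ[src].add(dst)
        pred[dst].add(src)
    return succ, pred


def node_measures(node, succ, pred):
    """Node scope: number of outgoing and incoming dependencies."""
    return {"fan_out": len(succ[node]), "fan_in": len(pred[node])}


def local_measures(node, succ, pred):
    """Local scope: the ego network (node plus direct neighbors) and edges inside it."""
    ego = {node} | succ[node] | pred[node]
    ties = sum(1 for src in ego for dst in succ[src] if dst in ego)
    return {"ego_size": len(ego), "ego_ties": ties}


def closeness(node, succ, pred):
    """Global scope: reachable nodes divided by total distance, ignoring edge direction."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        cur = queue.popleft()
        for nxt in succ[cur] | pred[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


succ, pred = build_graph(EDGES)
for name in sorted(succ):
    print(name, node_measures(name, succ, pred),
          local_measures(name, succ, pred),
          round(closeness(name, succ, pred), 3))
```

Node measures look at one module only, local measures at its direct neighbors, and global measures at its position in the whole graph.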

Software is more than just size and complexity

Social Network Measures (SNA)

Traditional Metrics (MET)

Node

Local Neighborhood

Bug Prediction Model

Would SNA improve performance?



Why Eclipse?

Bug Prediction Model

+25% for Recall and Precision

Does this generalize?

12 Metrics

11 Metrics

12 Metrics

Node

Local Neighborhood

Which metrics provide the improvement?

Use hierarchical modeling to find the important groups [Cataldo et al. TSE10]

12 Metrics

11 Metrics

12 Metrics

Node

Local Neighborhood

7%

+2.7%

+0.3%

The local neighborhood measures provide most of the improvement

Which local measures have the most impact?

Cluster fan-in

Layer bypass

Consider your neighbor connections

How well do we perform in practice?

✔ ✗

Effort Aware Prediction Models

Comparing Performance Using Effort Aware Curves

[Figure: effort-aware curve; x-axis: % lines of code reviewed (0 to 100); y-axis: % bugs caught (0 to 100)]

File   A     B      C
#bug   0     1      2
LOC    48    8      44
ROI    0     0.125  0.045
Risk   0.78  0.56   0.34
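The file table can be turned into curve points with a few lines of code. This is a sketch under two assumptions that the slide does not state explicitly: ROI is bugs divided by LOC (which matches the ROI row up to rounding), and files are reviewed in descending order of predicted risk.

```python
# File data from the slide: bug counts, lines of code, and predicted risk.
FILES = {
    "A": {"bugs": 0, "loc": 48, "risk": 0.78},
    "B": {"bugs": 1, "loc": 8, "risk": 0.56},
    "C": {"bugs": 2, "loc": 44, "risk": 0.34},
}


def roi(entry):
    """Bugs caught per line reviewed (assumed definition of the ROI row)."""
    return entry["bugs"] / entry["loc"]


def effort_aware_curve(files, score_key="risk"):
    """Return [(% LOC reviewed, % bugs caught)] when reviewing by descending score."""
    total_loc = sum(f["loc"] for f in files.values())
    total_bugs = sum(f["bugs"] for f in files.values())
    ordered = sorted(files.values(), key=lambda f: f[score_key], reverse=True)
    points = [(0.0, 0.0)]
    loc_seen = bugs_seen = 0
    for f in ordered:
        loc_seen += f["loc"]
        bugs_seen += f["bugs"]
        points.append((100.0 * loc_seen / total_loc,
                       100.0 * bugs_seen / total_bugs))
    return points


curve = effort_aware_curve(FILES)
for x, y in curve:
    print(f"{x:5.1f}% LOC reviewed -> {y:5.1f}% bugs caught")
```

With this ranking the largest file, A, is reviewed first even though it contains no bugs, so the curve stays flat for the first 48% of the code.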


[Figure build: the points for files A, B, and C are added to the curve one slide at a time; the file table is unchanged.]

Is this a good prediction?

[Figure: the same effort-aware curve and file table as on the preceding slides]

Better prediction means a higher curve

[Figure: effort-aware curves of % bugs caught for the Good and Bad predictions]

File   A     B      C
#bug   0     1      2
LOC    48    8      44
ROI    0     0.125  0.045
Bad    0.78  0.56   0.34
Good   0.32  0.72   0.55
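One way to quantify "a higher curve" is the area under it. The sketch below compares the Bad and Good risk scores from the table using a trapezoidal area. The area summary and the descending-score review order are assumptions made for illustration, not something the slide specifies.

```python
# Values from the slide's table.
LOC = {"A": 48, "B": 8, "C": 44}
BUGS = {"A": 0, "B": 1, "C": 2}
BAD = {"A": 0.78, "B": 0.56, "C": 0.34}
GOOD = {"A": 0.32, "B": 0.72, "C": 0.55}


def curve(scores):
    """Cumulative (fraction of LOC reviewed, fraction of bugs caught) by descending score."""
    total_loc, total_bugs = sum(LOC.values()), sum(BUGS.values())
    x = y = 0.0
    points = [(0.0, 0.0)]
    for name in sorted(scores, key=scores.get, reverse=True):
        x += LOC[name] / total_loc
        y += BUGS[name] / total_bugs
        points.append((x, y))
    return points


def area_under(points):
    """Trapezoidal area under a piecewise-linear curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


bad_area = area_under(curve(BAD))
good_area = area_under(curve(GOOD))
print(f"Bad: {bad_area:.3f}  Good: {good_area:.3f}")
```

The Good scores rank the small, buggy file B first and the large, bug-free file A last, so bugs are caught after reviewing far less code.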

The prediction model helps reduce testing effort

[Figure: effort-aware curves of % bugs caught; legend: File, Package, Random File]

Class prediction has more potential

Deviance explained

Bugginess ~ Traditional metrics + Local + Global

+2.7%

+1.1%

+0.3%

+1.9%
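The idea of attributing deviance explained to each metric group can be sketched as follows: compute the binomial deviance of nested models and report how much each added group removes relative to an intercept-only model. The outcomes and fitted probabilities below are hypothetical placeholders (model fitting is not shown), and the names M1 and M2 are assumed by analogy with the slide's M3.

```python
import math


def binomial_deviance(y, p):
    """-2 * log-likelihood of 0/1 outcomes y under predicted probabilities p."""
    return -2.0 * sum(math.log(pi if yi else 1.0 - pi) for yi, pi in zip(y, p))


def deviance_explained(y, p):
    """Share of the null (intercept-only) deviance removed by the model."""
    base_rate = sum(y) / len(y)
    null = binomial_deviance(y, [base_rate] * len(y))
    return 1.0 - binomial_deviance(y, p) / null


# Hypothetical data: 1 = buggy module. None of these numbers come from the study.
buggy = [1, 0, 0, 1, 0, 1, 0, 0]
fitted = {
    "M1: traditional": [0.5, 0.4, 0.3, 0.5, 0.4, 0.4, 0.3, 0.4],
    "M2: + local": [0.7, 0.3, 0.2, 0.6, 0.3, 0.6, 0.2, 0.3],
    "M3: + global": [0.7, 0.3, 0.2, 0.7, 0.3, 0.6, 0.2, 0.3],
}

previous = 0.0
for name, probs in fitted.items():
    explained = deviance_explained(buggy, probs)
    print(f"{name}: {explained:.1%} explained (+{explained - previous:.1%})")
    previous = explained
```

Each printed increment is the extra deviance explained when a group is added on top of the previous model, which is the kind of per-group gain the percentages on this slide report.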

ANOVA on M3
