studying the impact of dependency network measures on software quality
Thanh H. D. Nguyen, Bram Adams, Ahmed E. Hassan
SAIL, School of Computing, Queen’s University, Kingston, Canada
Studying the impact of dependency network measures on software quality
Problem: Quality improvement resources are limited
Solution: Bug prediction identifies defect-prone modules
Code Quality
Bug prediction models
Bug Prediction Model
High recall -> we won’t miss a possible bug
High precision -> we won’t waste effort
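To make the two quality criteria concrete, here is a minimal sketch of how recall and precision could be computed for a set of predicted defect-prone modules. The module names are made up for illustration and do not come from the study.

```python
def precision_recall(actual_buggy, predicted_buggy):
    """Return (precision, recall) for two sets of module names."""
    true_positives = len(actual_buggy & predicted_buggy)
    # Precision: share of flagged modules that really are buggy (effort not wasted).
    precision = true_positives / len(predicted_buggy) if predicted_buggy else 0.0
    # Recall: share of buggy modules that were flagged (bugs not missed).
    recall = true_positives / len(actual_buggy) if actual_buggy else 0.0
    return precision, recall


# Hypothetical example data, invented for this sketch.
actual = {"core", "ui", "net"}
predicted = {"core", "ui", "db", "cli"}
precision, recall = precision_recall(actual, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four flagged modules are buggy (precision 0.50), and two of the three buggy modules are flagged (recall about 0.67).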
Software is more than just size and complexity
Figure: dependency graph with nodes A to G, viewed at three levels: Node, Local Neighborhood, Global Neighborhood
Traditional Metrics (MET): Node
Social Network Measures (SNA): Local Neighborhood, Global Neighborhood
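The three levels can be illustrated with a small sketch. The node names A to G come from the slide, but the edges, the lines-of-code values, and the choice of measures (ego-network size and density for the local neighborhood, closeness for the global neighborhood) are assumptions made for this example, not necessarily the measures used in the study.

```python
from collections import deque

# Hypothetical dependency graph: "A": ["B", "F"] means A depends on B and F.
DEPENDS_ON = {
    "A": ["B", "F"],
    "B": ["C"],
    "C": ["D", "E"],
    "D": [],
    "E": ["G"],
    "F": ["C"],
    "G": [],
}
# Hypothetical node-level (traditional) metric: lines of code per module.
LINES_OF_CODE = {"A": 120, "B": 40, "C": 300, "D": 15, "E": 80, "F": 60, "G": 25}


def ego_network(graph, node):
    """Local neighborhood: the node plus every module one dependency away."""
    neighbors = set(graph[node])
    neighbors |= {src for src, targets in graph.items() if node in targets}
    return neighbors | {node}


def ego_density(graph, node):
    """Share of possible directed edges that exist inside the ego network."""
    ego = ego_network(graph, node)
    if len(ego) < 2:
        return 0.0
    edges = sum(1 for src in ego for dst in graph[src] if dst in ego)
    return edges / (len(ego) * (len(ego) - 1))


def closeness(graph, node):
    """Global neighborhood: closeness over undirected shortest paths (BFS)."""
    undirected = {n: set(targets) for n, targets in graph.items()}
    for src, targets in graph.items():
        for dst in targets:
            undirected[dst].add(src)
    distance = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for nxt in undirected[current]:
            if nxt not in distance:
                distance[nxt] = distance[current] + 1
                queue.append(nxt)
    total = sum(distance.values())
    return (len(distance) - 1) / total if total else 0.0


print("node:  ", LINES_OF_CODE["C"], "lines of code")
print("local: ", sorted(ego_network(DEPENDS_ON, "C")), ego_density(DEPENDS_ON, "C"))
print("global:", closeness(DEPENDS_ON, "C"))
```

The node-level value describes C in isolation, the ego-network values describe only C and its direct neighbors, and closeness depends on the shape of the whole graph.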
Bug Prediction Model
Would SNA improve performance?
Why Eclipse?
+25% for Recall and Precision
Does this generalize?
Which metrics provide the improvement?
Use hierarchical modeling to find the important group [Cataldo et al. TSE10]
Node: 12 metrics, 7%
Local Neighborhood: 11 metrics, +2.7%
Global Neighborhood: 12 metrics, +0.3%
Local neighbors account for most of the improvement
Which local measures have the most impact?
Cluster fan-in
Layer bypass
Consider your neighbors' connections
How well do we perform in practice?
Effort-Aware Prediction Models
Comparing Performance Using Effort-Aware Curves
Plot: % bugs caught (y-axis, 0 to 100) vs. % lines of code reviewed (x-axis, 0 to 100)
File   A      B      C
#bug   0      1      2
LOC    48     8      44
ROI    0      0.125  0.045
Risk   0.78   0.56   0.34
Review files in decreasing order of predicted risk: A, then B, then C
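The construction of an effort-aware curve can be sketched as follows, using the three files from the slide's table. The function and field names are hypothetical; the idea is to review files in decreasing order of predicted risk and track the cumulative share of code reviewed against the cumulative share of bugs caught.

```python
# File data copied from the slide's table.
FILES = {
    "A": {"bugs": 0, "loc": 48, "risk": 0.78},
    "B": {"bugs": 1, "loc": 8, "risk": 0.56},
    "C": {"bugs": 2, "loc": 44, "risk": 0.34},
}


def effort_aware_curve(files, score_key="risk"):
    """Return [(file, % LOC reviewed, % bugs caught)] in review order."""
    total_loc = sum(f["loc"] for f in files.values())
    total_bugs = sum(f["bugs"] for f in files.values())
    # Highest predicted score is reviewed first.
    order = sorted(files, key=lambda name: files[name][score_key], reverse=True)
    points, loc_seen, bugs_seen = [], 0, 0
    for name in order:
        loc_seen += files[name]["loc"]
        bugs_seen += files[name]["bugs"]
        points.append(
            (name, 100 * loc_seen / total_loc, 100 * bugs_seen / total_bugs)
        )
    return points


for name, pct_loc, pct_bugs in effort_aware_curve(FILES):
    print(f"{name}: {pct_loc:.0f}% of LOC reviewed, {pct_bugs:.1f}% of bugs caught")
```

With these risk scores, the first 48% of the code reviewed (file A) catches no bugs at all, so the curve stays flat for almost half of the effort. The ROI row in the table is the number of bugs divided by LOC (for example 1 / 8 = 0.125 for file B).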
Is this a good prediction?
Better prediction means a higher curve
Plot: % bugs caught vs. % lines of code reviewed, one curve each for the Good and Bad predictions
File   A      B      C
#bug   0      1      2
LOC    48     8      44
ROI    0      0.125  0.045
Bad    0.78   0.56   0.34
Good   0.32   0.72   0.55
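One way to turn "a higher curve" into a single number is the area under the effort-aware curve. The slides do not name a summary statistic, so the trapezoid area below is an assumption for illustration; the file data and the Bad and Good scores are copied from the table.

```python
# Values copied from the slide's table: (bugs, LOC, Bad score, Good score).
FILES = {
    "A": (0, 48, 0.78, 0.32),
    "B": (1, 8, 0.56, 0.72),
    "C": (2, 44, 0.34, 0.55),
}


def area_under_curve(files, score_index):
    """Trapezoid area under the bugs-caught vs. LOC-reviewed curve, in [0, 1]."""
    total_bugs = sum(f[0] for f in files.values())
    total_loc = sum(f[1] for f in files.values())
    # Review order: highest predicted score first.
    order = sorted(files, key=lambda name: files[name][score_index], reverse=True)
    area, x_prev, y_prev = 0.0, 0.0, 0.0
    for name in order:
        bugs, loc = files[name][0], files[name][1]
        x = x_prev + loc / total_loc
        y = y_prev + bugs / total_bugs
        area += (x - x_prev) * (y + y_prev) / 2
        x_prev, y_prev = x, y
    return area


bad = area_under_curve(FILES, score_index=2)
good = area_under_curve(FILES, score_index=3)
print(f"bad={bad:.3f} good={good:.3f}")
```

The Good scores rank the files B, C, A, which is also the order of the ROI row, so all three bugs are caught after reviewing 52% of the code. The Bad scores rank them A, B, C and need 100% of the code to catch the same bugs.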
The prediction model helps reduce testing effort
Plot: % bugs caught vs. % lines of code reviewed, with curves labeled File, Package, and Random File
Class prediction has more potential
Deviance explained
Bugginess ~ Traditional metrics + Local + Global
+2.7%
+1.1%
+0.3%
+1.9%
Anova on M3
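The idea of attributing explained deviance to nested groups of metrics (Bugginess ~ Traditional metrics + Local + Global) can be sketched as below. This is a minimal illustration under stated assumptions: the outcomes and the fitted probabilities are invented and stand in for the output of three nested logistic models; they are not results from the study, and no model is actually fitted here.

```python
import math


def deviance(actual, predicted):
    """Binomial deviance: -2 times the log-likelihood of 0/1 outcomes."""
    return -2 * sum(
        math.log(p) if y == 1 else math.log(1 - p)
        for y, p in zip(actual, predicted)
    )


def deviance_explained(actual, predicted):
    """Share of the null (intercept-only) deviance removed by the model."""
    base_rate = sum(actual) / len(actual)
    null = deviance(actual, [base_rate] * len(actual))
    return 1 - deviance(actual, predicted) / null


# Hypothetical data: 1 = buggy module, 0 = clean module.
buggy = [1, 0, 1, 0, 0, 1, 0, 0]
# Hypothetical fitted probabilities for three nested models.
fitted = {
    "Traditional metrics": [0.6, 0.4, 0.5, 0.3, 0.4, 0.5, 0.3, 0.4],
    "+ Local": [0.7, 0.3, 0.6, 0.3, 0.3, 0.6, 0.2, 0.3],
    "+ Local + Global": [0.7, 0.3, 0.6, 0.2, 0.3, 0.7, 0.2, 0.3],
}

previous = 0.0
for name, probabilities in fitted.items():
    explained = deviance_explained(buggy, probabilities)
    # The increment is the extra deviance explained by the newly added group.
    print(f"{name}: {explained:.1%} explained ({explained - previous:+.1%})")
    previous = explained
```

Each increment is the share of deviance that a group of metrics explains beyond the groups already in the model, which is how a figure such as "+2.7%" for the local measures can be read.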