bug prediction based on fine-grained module histories
DESCRIPTION
A first study of fine-grained (method-level) bug prediction with well-known historical metrics.
TRANSCRIPT
Bug Prediction Based on Fine-Grained Module Histories
Hideaki Hata, Osamu Mizuno, Tohru Kikuno
1
Overview
Background
Historical metrics are useful for bug prediction
Problem
For method-level prediction, it is difficult to collect historical metrics
Solution & Results
Historage: fine-grained version control system
First study of method-level bug prediction with well-known historical metrics
2
Bug Prediction Papers
3
[Figure: number of bug prediction papers per year, 2000 to 2010 (TSE, EMSE, ICSE, ESEC/FSE, FSE, ICSM, MSR); y-axis: 0 to 15]
Historical Metrics
4
Code
• Code churn
Process
• Changes
• Past bugs
• Process complexity
Organization
• Developers
• Org structure
• Network
• Ownership
Geography
• Locations
• Distribution
Bug Prediction Survey: http://bpsurvey-hidehata.dotcloud.com/
Mining Version Control Repository
5
[Figure: a sequence of revisions (n-3 to n+3); each revision provides a code delta, a commit date (calendar for July 2007), and a commit message (e.g., "Fix bug #32528")]
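As a rough illustration of the commit-message part of this slide, the sketch below pulls bug IDs out of messages such as "Fix bug #32528". The keyword patterns and the function name `bug_ids` are assumptions made for this example, not the rules used in the study.

```python
import re

# Assumed conventions: a fix commit contains "fix", "fixes", or "fixed",
# and refers to a bug as "bug <id>" or "issue <id>", optionally with '#'.
FIX_WORD = re.compile(r"\bfix(?:es|ed)?\b", re.IGNORECASE)
BUG_ID = re.compile(r"\b(?:bug|issue)\s*#?(\d+)", re.IGNORECASE)

def bug_ids(message):
    """Return the bug IDs referenced by a message that looks like a bug fix."""
    if not FIX_WORD.search(message):
        return []
    return [int(number) for number in BUG_ID.findall(message)]

# Hypothetical commit log; only the second message comes from the slide.
messages = ["Refactor parser", "Fix bug #32528"]
print([bug_ids(m) for m in messages])  # [[], [32528]]
```

Real projects differ in how they reference bug reports, so the patterns would need to be adapted per repository.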
What We Have Learned
6
Prediction accuracy
Historical metrics ≥ Static code metrics [Moser et al. ’08, Kamei et al. ’10]
Required effort
File-level ≤ Package-level [Kamei et al. ’10, Nguyen et al. ’10, Posnett et al. ’11]
State of the Art
7
[Figure: number of papers by prediction level (package-level, file-level, method-level); x-axis: papers (TSE, EMSE, ICSE, ESEC/FSE, FSE, ICSM, MSR), 0 to 15]
Cache model [Kim et al. ’07]
Spam filtering model [Mizuno et al. ’07]
No method-level prediction with well-known historical metrics
Method-Level Prediction
8
Requirement
Method-level historical metrics
Problem
Analysis of method histories is difficult
Difficulties
9
1. Tracking methods is troublesome
Matching methods must be found between consecutive snapshots
2. Method-level metadata are not easily available
Metadata (who, when, how, etc.) are associated with files
[Figure: consecutive snapshots n-2, n-1, n]
Historage
10
Fine-grained version control system[1]
is created on top of a Git repository
stores methods as files
detects rename/move with Git mechanism
[1] Hata et al., “Historage: Fine-Grained Version Control System for Java,” IWPSE-EVOL ’11.
Tool: git2historage (https://github.com/hdrky/git2historage)
[Figure: directories com1 and com2 shown twice, with methods stored as files beneath them]
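Because Historage stores methods as files, a moved or renamed method shows up as one deleted file and one similar added file. The sketch below illustrates that idea with a plain similarity comparison. It is not Git's implementation: `difflib`'s ratio only stands in for Git's similarity index, the 0.5 threshold mirrors Git's default rename threshold of 50%, and the paths and function name are hypothetical.

```python
from difflib import SequenceMatcher

def detect_renames(deleted, added, threshold=0.5):
    """Pair each deleted method file with its most similar added file.

    deleted, added: dicts mapping a path to the method's source text.
    A pair is reported only if its similarity ratio reaches the threshold.
    """
    renames = {}
    candidates = dict(added)  # copy, so matched files can be removed
    for old_path, old_src in deleted.items():
        best_path, best_score = None, threshold
        for new_path, new_src in candidates.items():
            score = SequenceMatcher(None, old_src, new_src).ratio()
            if score >= best_score:
                best_path, best_score = new_path, score
        if best_path is not None:
            renames[old_path] = best_path
            del candidates[best_path]  # each added file is matched once
    return renames

# Hypothetical method files: parse() moves from com1 to com2 unchanged.
body = "int parse(String s) {\n    return Integer.parseInt(s.trim());\n}\n"
deleted = {"com1/Util.java/parse(String)": body}
added = {
    "com2/Util.java/parse(String)": body,
    "com2/Util.java/log(String)": "void log(String m) {\n    System.out.println(m);\n}\n",
}
print(detect_renames(deleted, added))
```

With the move detected, the history of the method in com2 can be joined to its earlier history in com1.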
11
Visualization of repository history
• tree: directory
• white node: method
Git: file histories
Historage: method histories
Mining Historage
12
[Figure: a sequence of method revisions (n-3 to n+3); each revision provides a code delta, a commit date (calendar for July 2007), and a commit message (e.g., "Fix bug #32528")]
Study
13
Comparison
Prediction level: package, file, and method
The same metrics and the same prediction algorithm (random forest)
Buggy modules: identified with the SZZ algorithm [2]
Evaluation
10-fold cross validation
Effort-based evaluation
[2] Sliwerski et al., “When do changes induce fixes?” MSR ’05.
Target
14
Project          Period   # of commits
Xpand            2y6m      1,038
WTP Incubator    2y8m      1,133
Ant              11y7m     2,590
Lucene/Solr      1y6m      3,485
OpenJPA          5y4m      4,180
Cassandra        2y6m      4,423
ECF              6y6m      9,748
Wicket           7y       15,033
Collected Metrics
15
DevTotal/Major/Minor     # of total/major/minor developers
Ownership                Highest proportion of ownership
LOC                      Lines of code
Add/DelLOC               Added/deleted LOC
Chg/FixChgNum            # of changes/bug-fix changes
PastBugNum               # of fixed bug IDs
Period                   Existing days
BugIntroNum              # of bug-introducing changes
LogCoupNum               # of logical coupling changes
Avg/Max/MinInterval      Avg/max/min change interval
HCM                      Process complexity metric
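To make a few of these metrics concrete, the sketch below derives them from the change history of a single module. Several details are assumptions for illustration and are not stated on the slide: ownership is measured as a share of changes, a developer counts as "minor" below a 5% share, and the record layout, function name, and developer names are hypothetical.

```python
from collections import Counter
from datetime import date

def history_metrics(changes, minor_threshold=0.05):
    """changes: list of (author, date, added_loc, deleted_loc) for one module."""
    if not changes:
        raise ValueError("at least one change is required")
    per_dev = Counter(author for author, _, _, _ in changes)
    total = len(changes)
    # Share of the module's changes made by each developer (assumed measure).
    shares = {dev: n / total for dev, n in per_dev.items()}
    minor = sum(1 for share in shares.values() if share < minor_threshold)
    days = sorted(day for _, day, _, _ in changes)
    intervals = [(later - earlier).days for earlier, later in zip(days, days[1:])]
    return {
        "ChgNum": total,
        "DevTotal": len(per_dev),
        "DevMajor": len(per_dev) - minor,
        "DevMinor": minor,
        "Ownership": max(shares.values()),
        "AddLOC": sum(added for _, _, added, _ in changes),
        "DelLOC": sum(deleted for _, _, _, deleted in changes),
        "AvgInterval": sum(intervals) / len(intervals) if intervals else 0.0,
        "MaxInterval": max(intervals, default=0),
        "MinInterval": min(intervals, default=0),
    }

# Hypothetical history of one method.
changes = [
    ("alice", date(2007, 7, 1), 10, 0),
    ("alice", date(2007, 7, 5), 4, 2),
    ("bob", date(2007, 7, 26), 1, 1),
    ("alice", date(2007, 7, 27), 0, 3),
]
metrics = history_metrics(changes)
print(metrics["Ownership"], metrics["MaxInterval"])  # 0.75 21
```

The same function applies at package, file, or method level; only the grouping of changes into modules differs.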
Effort-Based Evaluation
16
[Figure: sample curve of percent of bugs found (y-axis, 0 to 100) against percent of LOC (x-axis, 0 to 100)]
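One point on such a curve can be computed as in the sketch below: modules are inspected in ranked order until a given share of the total LOC is spent, and the bugs found so far are counted. Ranking by a prediction score and skipping modules that do not fit the remaining budget are assumptions for this example; the function name and data are hypothetical.

```python
def bugs_found_at(modules, loc_budget=0.2):
    """Percent of bugs found within a budget given as a fraction of total LOC.

    modules: list of (score, loc, bugs); a higher score is inspected first.
    """
    total_loc = sum(loc for _, loc, _ in modules)
    total_bugs = sum(bugs for _, _, bugs in modules)
    budget = loc_budget * total_loc
    spent = found = 0
    for _, loc, bugs in sorted(modules, key=lambda m: m[0], reverse=True):
        if spent + loc > budget:
            break  # the next module does not fit in the remaining budget
        spent += loc
        found += bugs
    return 100.0 * found / total_bugs if total_bugs else 0.0

# Hypothetical modules: (predicted score, LOC, number of bugs).
modules = [(0.9, 50, 2), (0.8, 100, 1), (0.4, 350, 1), (0.1, 500, 0)]
print(bugs_found_at(modules))  # 75.0
```

Sweeping `loc_budget` from 0 to 1 traces the full curve; the value at 0.2 corresponds to the "bugs found in 20% LOC" figure used for comparison.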
Result (ECF)
17
[Figure: percent of bugs found (y-axis, 0 to 100) against percent of lines (x-axis, 0 to 100) for package-, file-, and method-level prediction]
1000 Times Run (ECF)
18
[Figure: percentage of bugs found in 20% of LOC over 1,000 runs, for package-, file-, and method-level prediction; y-axis: 0 to 80]
1000 Times Run (All)
19
[Figure: median percentage of bugs found in 20% of LOC for package-, file-, and method-level prediction, per project (Xpand, WTP Incubator, Ant, Lucene/Solr, OpenJPA, Cassandra, ECF, Wicket); y-axis: 0 to 100]
Why Is Method-Level Prediction Effective?
20
[Figures: size in LOC (0 to 800) and number of methods in a file (0 to 60), for all and buggy modules; categories: package, file, method]
Even when models predict buggy modules correctly, the code inside buggy packages or files is largely non-buggy.
Observations from Correlation Analysis
21
Are there differences between method-level and package/file-level prediction models?
Same
Large changes tend to be buggy
Frequent changes tend to be buggy
Different
Bugs do not occur repeatedly
Organizational metrics may not contribute to method-level prediction
Threats to Validity
22
Targets are limited to open-source projects written in Java
No manual inspection of the identified buggy modules
Effort-based evaluation may not reflect actual efforts
Fine-Grained Study Is Big Data Analysis
Need scalable techniques
preparing fine-grained data (making Historage)
analyzing histories (collecting metrics)
building prediction models
23
[Figure: number of modules (files vs. methods) in one snapshot for Xpand, Ant, ECF, and Wicket; y-axis: 0 to 30,000]
Conclusions
Summary
Method-level bug prediction with well-known historical metrics
Future work
Empirical studies of actual effort using method-level prediction
More metrics and more projects (including industrial projects)
24