Bug Prediction Based on Fine-Grained Module Histories

Hideaki Hata, Osamu Mizuno, Tohru Kikuno

Overview

Background

Historical metrics are useful for bug prediction

Problem

For method-level prediction, it is difficult to collect historical metrics

Solution & Results

Historage: fine-grained version control system

First study of method-level bug prediction with well-known historical metrics


Bug Prediction Papers

[Figure: number of bug prediction papers per year, 2000 to 2010, in TSE, EMSE, ICSE, ESEC/FSE, FSE, ICSM, and MSR; y-axis from 0 to 15]

Historical Metrics

Code

• Code churn

Process

• Changes
• Past bugs
• Process complexity

Organization

• Developers
• Org structure
• Network
• Ownership

Geography

• Locations
• Distribution

Bug Prediction Survey: http://bpsurvey-hidehata.dotcloud.com/

Mining Version Control Repository

[Figure: a sequence of revisions (n-3 … n+3); each revision provides a code delta, a commit date (calendar: July 2007), and a commit message (e.g., “Fix bug #32528”)]
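A minimal sketch of how bug-fix commits might be recognized from commit messages such as “Fix bug #32528”. The regular expression, the function name `bug_ids`, and every commit message except “Fix bug #32528” are assumptions for illustration; real projects use different message conventions.

```python
import re

# Assumed convention: a fix/bug keyword followed by "#<digits>".
BUG_ID = re.compile(r"\b(?:fix(?:e[sd])?|bug)\b[^#\n]*#(\d+)", re.IGNORECASE)


def bug_ids(message):
    """Return the bug IDs referenced in a commit message (may be empty)."""
    return BUG_ID.findall(message)


# Illustrative history: only the middle message comes from the slide.
commits = [
    {"rev": "n-1", "message": "Refactor parser"},
    {"rev": "n", "message": "Fix bug #32528"},
    {"rev": "n+1", "message": "Update documentation"},
]

# Revisions whose message references at least one bug ID.
fix_commits = [c["rev"] for c in commits if bug_ids(c["message"])]
print(fix_commits)
```

Metrics such as the number of bug-fix changes or the number of fixed bug IDs could then be counted from the revisions selected this way.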

What We Have Learned

Prediction accuracy

Historical metrics ≥ Static code metrics [Moser et al. ’08, Kamei et al. ’10]

Required effort

File-level ≤ Package-level [Kamei et al. ’10, Nguyen et al. ’10, Posnett et al. ’11]

State of the Art

[Figure: bar chart of the number of papers (TSE, EMSE, ICSE, ESEC/FSE, FSE, ICSM, MSR) by prediction level: package-level, file-level, method-level; axis from 0 to 15]

Cache model [Kim et al. ’07]

Spam filtering model [Mizuno et al. ’07]

No method-level prediction with well-known historical metrics

Method-Level Prediction


Requirement

Method-level historical metrics

Problem

Analysis of method histories is difficult

Difficulties

1. Tracking methods is troublesome

Matching methods must be found between consecutive snapshots

2. Method-level metadata are not easily available

Metadata (who, when, how, etc.) are associated with files, not with methods

[Figure: consecutive snapshots n-2, n-1, n]
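To make the first difficulty concrete, here is a rough sketch of matching methods between two consecutive snapshots. It pairs methods with identical signatures first and then pairs the rest by body similarity. The function name `match_methods`, the 0.6 threshold, and the sample methods are assumptions, not part of the study.

```python
from difflib import SequenceMatcher


def match_methods(old, new, threshold=0.6):
    """Map each method in `old` to its counterpart in `new`.

    `old` and `new` map a method signature to its body text.
    """
    # Step 1: methods whose signature is unchanged.
    matches = {name: name for name in old if name in new}
    unmatched_new = set(new) - set(matches)
    # Step 2: renamed or moved methods, paired by body similarity.
    for name in sorted(set(old) - set(matches)):
        best, best_score = None, threshold
        for candidate in sorted(unmatched_new):
            score = SequenceMatcher(None, old[name], new[candidate]).ratio()
            if score >= best_score:
                best, best_score = candidate, score
        if best is not None:
            matches[name] = best
            unmatched_new.discard(best)
    return matches


snapshot_old = {
    "parse(String)": "return tokens(s).stream().map(this::node).toList();",
    "close()": "reader.close();",
}
snapshot_new = {
    "parseInput(String)": "return tokens(s).stream().map(this::node).toList();",
    "close()": "reader.close();",
    "reset()": "state = 0;",
}
print(match_methods(snapshot_old, snapshot_new))
```

Repeating this for every pair of consecutive snapshots is what makes hand-rolled method tracking troublesome.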

Historage

Fine-grained version control system [1]

is created on top of a Git repository

stores methods as files

detects renames/moves with Git's mechanism

[1] Hata et al., “Historage: Fine-Grained Version Control System for Java,” IWPSE-EVOL ’11.

Tool: git2historage (https://github.com/hdrky/git2historage)
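The idea of storing methods as files can be sketched as follows. This is not the git2historage implementation: the regular expression `HEADER`, the function `split_methods`, and the path layout are hypothetical, and the brace matching is naive (it ignores braces inside strings and comments, as well as generics, annotations, and nested classes).

```python
import re

# Naive Java method header: optional modifiers, return type, name, parameters.
HEADER = re.compile(
    r"(?:(?:public|protected|private|static|final)\s+)*"
    r"[\w<>\[\]]+\s+(\w+)\s*\(([^)]*)\)\s*\{"
)


def split_methods(java_source, file_path):
    """Return {path: method text}, one entry per method found in the source."""
    files, pos = {}, 0
    while (m := HEADER.search(java_source, pos)):
        depth, end = 1, m.end()
        # Walk forward until the opening brace of the method is closed.
        while depth and end < len(java_source):
            depth += {"{": 1, "}": -1}.get(java_source[end], 0)
            end += 1
        # Keep only the parameter types in the file name.
        params = ",".join(p.split()[0] for p in m.group(2).split(",") if p.strip())
        files[f"{file_path}/{m.group(1)}({params})"] = java_source[m.start():end]
        pos = end
    return files


SRC = """public class Greeter {
    public String greet(String name) {
        if (name == null) { return "hi"; }
        return "hi " + name;
    }
    void reset() { count = 0; }
}
"""

for path in split_methods(SRC, "src/Greeter.java"):
    print(path)
```

Once each method lives in its own file, Git's existing per-file history and rename detection apply to methods without further tooling.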

[Figure: Method nodes across commits com1 and com2]

Visualization of repository history

[Figure: repository histories drawn as trees, Git (file histories) next to Historage (method histories); tree: directory, white node: method]

Mining Historage

[Figure: a sequence of method revisions (n-3 … n+3); each revision provides a code delta, a commit date (calendar: July 2007), and a commit message (e.g., “Fix bug #32528”)]

Study

Comparison

Prediction level: package, file, and method

The same metrics and the same prediction algorithm (random forest) at every level

Buggy modules: identified with the SZZ algorithm [2]

Evaluation

10-fold cross validation

Effort-based evaluation

[2] Sliwerski et al., “When do changes induce fixes?” MSR ’05.
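A simplified sketch of the core SZZ step: the lines changed by a bug-fix commit are traced back, through blame/annotate data, to the commits that last touched them, and commits made after the bug was reported are discarded. The data structures, the function name `bug_introducing_changes`, and the sample values are assumptions; the study's actual SZZ setup is not described on the slides.

```python
from datetime import date


def bug_introducing_changes(fix, blame, commit_dates):
    """Return candidate bug-introducing commit IDs for one bug-fix commit.

    fix:          {"changed_lines": [...], "reported": date of the bug report}
    blame:        line number -> commit that last changed it before the fix
    commit_dates: commit ID -> commit date
    """
    suspects = {blame[line] for line in fix["changed_lines"] if line in blame}
    # A change made after the bug was reported cannot have introduced it.
    return sorted(c for c in suspects if commit_dates[c] < fix["reported"])


# Illustrative data only.
blame = {10: "c1", 11: "c2", 12: "c3"}
commit_dates = {
    "c1": date(2007, 5, 1),
    "c2": date(2007, 6, 15),
    "c3": date(2007, 7, 20),
}
fix = {"changed_lines": [10, 12], "reported": date(2007, 7, 1)}
print(bug_introducing_changes(fix, blame, commit_dates))
```

A module touched by a bug-introducing change would then be labeled buggy for the purposes of training and evaluation.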

Target

Project          Period    # of commits
Xpand            2y6m      1,038
WTP Incubator    2y8m      1,133
Ant              11y7m     2,590
Lucene/Solr      1y6m      3,485
OpenJPA          5y4m      4,180
Cassandra        2y6m      4,423
ECF              6y6m      9,748
Wicket           7y        15,033

Collected Metrics

Metric                   Description
DevTotal/Major/Minor     # of total/major/minor developers
Ownership                Highest proportion of ownership
LOC                      Lines of code
Add/DelLOC               Added/deleted LOC
Chg/FixChgNum            # of changes / bug-fix changes
PastBugNum               # of fixed bug IDs
Period                   Existing days
BugIntroNum              # of bug-introducing changes
LogCoupNum               # of logical coupling changes
Avg/Max/MinInterval      Avg/max/min change interval
HCM                      Process complexity metric
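As an illustration, several of these metrics can be derived from the list of changes to a single module. This is a sketch under assumptions: ownership is measured by share of changes, a developer counts as "major" at a 5% share or more, and the record fields and sample data are invented. The slides do not state the exact definitions used.

```python
from collections import Counter
from datetime import date


def historical_metrics(changes, major_share=0.05):
    """Compute a subset of the historical metrics for one module."""
    authors = Counter(c["author"] for c in changes)
    shares = [n / len(changes) for n in authors.values()]
    days = sorted(c["date"] for c in changes)
    gaps = [(b - a).days for a, b in zip(days, days[1:])]
    return {
        "ChgNum": len(changes),
        "FixChgNum": sum(c["is_fix"] for c in changes),
        "AddLOC": sum(c["added"] for c in changes),
        "DelLOC": sum(c["deleted"] for c in changes),
        "DevTotal": len(authors),
        "DevMajor": sum(s >= major_share for s in shares),
        "DevMinor": sum(s < major_share for s in shares),
        "Ownership": max(shares),
        "AvgInterval": sum(gaps) / len(gaps) if gaps else 0.0,
        "MaxInterval": max(gaps, default=0),
        "MinInterval": min(gaps, default=0),
    }


# Illustrative change history of one method.
changes = [
    {"author": "alice", "date": date(2007, 7, 1), "added": 10, "deleted": 0, "is_fix": False},
    {"author": "alice", "date": date(2007, 7, 3), "added": 4, "deleted": 2, "is_fix": True},
    {"author": "bob", "date": date(2007, 7, 10), "added": 1, "deleted": 1, "is_fix": False},
    {"author": "alice", "date": date(2007, 7, 26), "added": 3, "deleted": 5, "is_fix": True},
]
metrics = historical_metrics(changes)
print(metrics)
```

With method histories available, the same function could be applied unchanged at the package, file, and method levels.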

Effort-Based Evaluation

[Figure: sample curve of percent of bugs found (y-axis, 0 to 100) against percent of LOC (x-axis, 0 to 100)]
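One point on such a curve can be computed as in the sketch below: modules are inspected in descending order of predicted risk until a LOC budget is used up, and the share of bugs found so far is reported. The ordering by raw score, the stop-at-first-overflow rule, the function name `percent_bugs_found`, and the sample modules are assumptions; the study's exact procedure may differ.

```python
def percent_bugs_found(modules, loc_budget=0.20):
    """Percent of bugs found when inspecting `loc_budget` of the total LOC."""
    total_loc = sum(m["loc"] for m in modules)
    total_bugs = sum(m["bugs"] for m in modules)
    inspected, found = 0, 0
    # Inspect the modules the model considers riskiest first.
    for m in sorted(modules, key=lambda m: m["score"], reverse=True):
        if inspected + m["loc"] > loc_budget * total_loc:
            break  # the next module does not fit in the budget
        inspected += m["loc"]
        found += m["bugs"]
    return 100.0 * found / total_bugs if total_bugs else 0.0


# Illustrative modules: size, number of bugs, predicted risk score.
modules = [
    {"name": "a", "loc": 50, "bugs": 2, "score": 0.9},
    {"name": "b", "loc": 50, "bugs": 1, "score": 0.8},
    {"name": "c", "loc": 400, "bugs": 1, "score": 0.7},
    {"name": "d", "loc": 500, "bugs": 0, "score": 0.1},
]
print(percent_bugs_found(modules))
```

Sweeping `loc_budget` from 0 to 1 would trace the whole curve; the later slides report the value at 20% of LOC.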

Result (ECF)

[Figure: percent of bugs found (0 to 100) against percent of lines (0 to 100) for ECF, at the package, file, and method levels]

1000 Times Run (ECF)

[Figure: percentages of bugs found in 20% of LOC over 1,000 runs for ECF, at the package, file, and method levels; y-axis: percent of bugs found]

1000 Times Run (All)

[Figure: median percentage of bugs found in 20% of LOC at the package, file, and method levels for Xpand, WTP Incubator, Ant, Lucene/Solr, OpenJPA, Cassandra, ECF, and Wicket; y-axis: percent of bugs found, 0 to 100]

Why Is Method-Level Prediction Effective?

[Figure, two panels comparing all modules with buggy modules. Size: LOC (0 to 800) at the package, file, and method levels. # of methods in a file: number of methods (0 to 60).]

Even when a model correctly predicts a buggy package or file, most of the code inside that module is non-buggy.

Observations from Correlation Analysis

Are there differences between method-level and package/file-level prediction models?

Same

Large changes tend to be buggy

Frequent changes tend to be buggy

Different

Bugs do not occur repeatedly

Organizational metrics may not contribute to method-level prediction

Threats to Validity

Targets are limited to open-source projects written in Java

The identification of buggy modules was not manually inspected

Effort-based evaluation may not reflect actual effort

Fine-Grained Study Is Big Data Analysis

Need scalable techniques

preparing fine-grained data (making Historage)

analyzing histories (collecting metrics)

building prediction models

[Figure: # of modules (files and methods) in one snapshot for Xpand, Ant, ECF, and Wicket; y-axis from 0 to 30,000]

Conclusions

Summary

Method-level bug prediction with well-known historical metrics

Future work

Empirical studies of actual effort using method-level prediction

More metrics and more projects (including industrial projects)
