On the Evolution of Source Code and Software Defects

Posted on 06-May-2015


The PhD defense presentation of Marco D'Ambros


Marco D’Ambros

On the Evolution of Source Code and Software Defects

REVEAL group @ Faculty of Informatics, University of Lugano, Switzerland

Dissertation committee

Prof. Michele Lanza

Prof. Carlo Ghezzi

Prof. Cesare Pautasso

Prof. Harald C. Gall

Prof. Hausi A. Müller

Software engineering and bridges

Software & Bridges: software, requirements, software aging

Roadmap: Thesis, Analysis techniques, Tool support, Conclusion, Ph.D.

Landmarks on the way: swamp of procrastination, haunted teachwood forest, peaks of toolsmadness

We are here

The Evolution of Software Evolution

Timeline (axis: 1980, 1990, 2000, 2010). Each entry is a tool, a publication, or an event, and the timeline is divided into three phases: Foundation, Infrastructure Implementation, and The Advent of MSR.

Years marked: 1975, 1982, 1986, 1988, 1992, 1995, 1996, 1997, 1999, 2003, 2004, 2006, 2007, 2008

First software configuration management system (SCCS)

Lehman's laws of software evolution

Boehm's spiral model

RCS

CVS

First bug tracking system (GNATS)

Bugzilla

Extreme Programming Explained: Embrace Change

Subversion

Git

First MSR approach (Ball et al.)

Release History Database (Fischer et al.)

First workshop on MSR

MSR becomes a conference

Jazz

Cost of software maintenance estimated to be 50-75% of the total cost of software (Sommerville, Davis)

Cost of software maintenance estimated to be more than 85% (Erlik)

MSR Approaches

Bar chart (axis from 0% to 30%) over the data sources used: SCM meta-data, software defects, e-mail archive, documentation, source code, others

Approaches are either plain or integrated

Source code is pictured on the slides as this fragment:

for (int j = m; j > i; j--) {
    uCJM1 = dataUC[j-1];
    uCJ = dataUC[j];
    if (uCJM1.compare(z) > 0) {
        /* exchange */
        tempStr = data[j-1];
        /* sort the data */
        data[j-1] = data[j];
        data[j] = tempStr;
        dataUC[j-1] = uCJ;
        dataUC[j] = uCJM1;
    }
}

Holistic software evolution

A. Zeller, MSR keynote 2007

Data sources: source code, software defects, SCM meta-data, e-mail archive, IDE data, chats, models, documentation, bytecode, unit tests

Our Approach

An integrated view of software evolution, combining historical information regarding source code and software defects, supports an extensible set of software maintenance tasks.

D’Ambros, 2010

Mevo meta-model (CSMR 2008)

Data sources: software defects, SCM meta-data, source code, e-mail archive

7 analysis techniques


Retrospective analysis and future prediction

Retrospective analysis

1 Change coupling analysis

2 Bug evolution analysis

3 Code-bug co-evolution analysis

Change Coupling Analysis

1 Goal: Make sense of the huge amount of change coupling information

Slide legend: technique name, technique number, goal or question, and used part of the meta-model (software defects, SCM meta-data, source code, e-mail archive)

Change coupling: implicit dependency of files that frequently change together [Gall et al., ICSM 1998]

Module level, to assess architecture decay

Class level, to support change impact analysis
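
A minimal sketch of how change coupling between files could be counted, assuming each commit is available as the set of files it touched. The function name `co_change_counts` and the sample commits are hypothetical and are not taken from the thesis or its tools.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(commits):
    """Count, for every pair of files, the commits that changed both."""
    counts = Counter()
    for files in commits:
        # Sort so each unordered pair always maps to the same key.
        for pair in combinations(sorted(set(files)), 2):
            counts[pair] += 1
    return counts


# Hypothetical history: each entry is the set of files in one commit.
commits = [
    {"Foo.java", "Bar.java"},
    {"Foo.java", "Bar.java", "Baz.java"},
    {"Baz.java"},
    {"Foo.java", "Bar.java"},
]

coupling = co_change_counts(commits)
strongest_pair, strength = coupling.most_common(1)[0]
```

Here Foo.java and Bar.java change together in three commits, so they form the most strongly coupled pair. Aggregating the same counts by module rather than by file would give the module-level view.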

The Evolution Radar

Screenshot labels: package browser, current project package list, protocol browser, class browser, method browser, main Evolution Radar visualization, secondary Evolution Radar visualization

Tasks Supported: assessing architecture decay, change impact analysis, system evolution analysis, system re-documentation and restructuring

Figure labels: coupled files, time

The Evolution Radar shows integrated change coupling information, supporting various maintenance tasks.

1 Change Coupling Analysis: MSR 2006, WCRE 2006, TSE 2009

Bug Evolution Analysis

2 Goal: Study the history of software defects

Bar chart: bug lifetime distribution in Mozilla. Horizontal axis: 12 hours, 1 day, 1 week, 1 month, 6 months, 1 year, 2 years, more. Vertical axis: 0% to 35%. Annotation: > 50%

System Radiography view: Which components experienced many defects?

Bug Watch view: Which defects are hard to fix?

The visual analysis of bug histories permits the detection of critical software components and exceptional bugs.

2 Bug Evolution Analysis: VISSOFT 2007

Bug-Code Co-Evolution Analysis

3 Goal: Detect patterns in the co-evolution of source code and defects

Diagram: two timelines per file (Foo.java, Bar.java, Boo.java, Baz.java). SCM meta-data: code committed in the SCM repository over time. Defects: defects reported over time. Labels: Δt, co-evolution pattern

Catalog of 10 formally defined patterns

Detecting and visualizing co-evolutionary patterns allows the characterization of software components based on their co-evolution.

3 Bug-Code Co-Evolution Analysis: CSMR 2006, JSME 2009

Retrospective analysis and future prediction

Future prediction

4 Bug prediction

5 Bug prediction with change coupling

6 Software quality analysis

7 Bug prediction with e-mails

Bug Prediction

4 Question: Which bug prediction approach is the best over different systems?

Bug prediction is a very active research field: 100+ papers

Predictors: code metrics, previous defects, process metrics, other factors

Which one is better?

Moser et al.: SCM meta-data

Basili et al.: source code metrics

Kim et al.: previous defects

Hassan: entropy of changes

Churn of code metrics

Entropy of code metrics

Diagram: a timeline split at release X into the past and the "future". Prediction goal: post-release defects. Predictors computed from the past:

1 Previous defects

2 SCM meta-data

3 Entropy of changes

4 Source code metrics (CBO, DIT, NOC)

5 6 Entropy and churn of code metrics, computed from the deltas of the source code metrics
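
The entropy of changes predictor can be illustrated with a small sketch. It assumes the usual Shannon formulation, where the entropy of a period measures how evenly its changes are spread over files; the function name `change_entropy` and the sample data are hypothetical, and the slides do not give the exact formula used in the thesis.

```python
import math
from collections import Counter


def change_entropy(changed_files):
    """Shannon entropy (in bits) of the distribution of changes over files."""
    counts = Counter(changed_files)
    total = sum(counts.values())
    entropy = 0.0
    for n in counts.values():
        p = n / total
        entropy -= p * math.log2(p)
    return entropy


# Hypothetical periods: one list entry per file modification.
focused = ["Foo.java"] * 4
scattered = ["Foo.java", "Bar.java", "Baz.java", "Boo.java"]

focused_entropy = change_entropy(focused)      # all changes hit one file
scattered_entropy = change_entropy(scattered)  # changes spread evenly
```

A period whose changes all land in one file has zero entropy, while changes spread evenly over four files give two bits.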

Comparing Prediction Models

Chart: prediction performance (axis from 0.2 to 0.8) of SCM meta-data, churn of code metrics, entropy of code metrics, code metrics, entropy of changes, and previous defects

Scoring, applied per system (JDT Core, PDE, Mylyn, Equinox, Lucene): thresholds at 100%, 90%, 75%, and 50%, with the updates score ← score + 3, score ← score + 1, and score ← score - 1

Overall score: $\mathrm{score} = \sum_{s \in \mathrm{systems}} \mathrm{score}(s)$
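
The scoring scheme can be sketched as follows. The slide text lists the thresholds and the score updates but not which update belongs to which band, so the banding below is an assumption: +3 at or above 90% of the best performance on a system, +1 at or above 75%, -1 below 50%, and 0 otherwise. System names, approach names, and performance values are made up for illustration.

```python
def system_score(performance, best):
    """Score one approach on one system, relative to the best approach.

    Assumed banding (not confirmed by the slide text): +3 at or above 90%
    of the best performance, +1 at or above 75%, -1 below 50%, 0 otherwise.
    """
    ratio = performance / best
    if ratio >= 0.90:
        return 3
    if ratio >= 0.75:
        return 1
    if ratio < 0.50:
        return -1
    return 0


def overall_scores(results):
    """Sum the per-system scores of every approach.

    `results` maps system name -> {approach name: performance}.
    """
    totals = {}
    for per_approach in results.values():
        best = max(per_approach.values())
        for approach, performance in per_approach.items():
            totals[approach] = totals.get(approach, 0) + system_score(performance, best)
    return totals


# Made-up performance values, for illustration only.
results = {
    "SystemA": {"approach_x": 0.8, "approach_y": 0.7, "approach_z": 0.3},
    "SystemB": {"approach_x": 0.5, "approach_y": 0.6, "approach_z": 0.4},
}
totals = overall_scores(results)
```

Summing over systems rewards approaches that stay close to the best one everywhere, rather than those that win on a single system.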

Prediction Performance Across Five Systems

Bar chart: score (axis from 0 to 15) of entropy of changes, SCM meta-data, entropy of code metrics, previous defects, churn of code metrics, and code metrics

Annotations: performance not stable across systems; good performance and fast to compute; most stable performance but computationally expensive

The entropy and the churn of code metrics are the most stable predictors across different systems.

4 Bug Prediction: MSR 2010, EMSE under review

Bug Prediction with Change Coupling

5 Question: Does change coupling correlate with software defects?

Change coupling is harmful. What is the impact on software defects?

Measuring Change Coupling

Distribution: number of coupled classes

Force: number of co-changes

Time decay: changes far in the past count less
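
A minimal sketch of the three kinds of measure for a single class, assuming commits are given as (day, set of classes) pairs. The exponential half-life weighting used for time decay is an assumption of this sketch, as are the function name `coupling_measures` and the sample history; the slides only state that older changes count less.

```python
from collections import defaultdict


def coupling_measures(commits, target, now, half_life=180.0):
    """Return (distribution, force, decayed_force) for the class `target`.

    `commits` is a list of (day, classes) pairs, where `day` is a number and
    `classes` is the set of classes changed together on that day.
    The exponential half-life weighting is an assumption of this sketch.
    """
    co_changes = defaultdict(int)
    decayed_force = 0.0
    for day, classes in commits:
        if target not in classes:
            continue
        weight = 0.5 ** ((now - day) / half_life)
        for other in classes:
            if other != target:
                co_changes[other] += 1
                decayed_force += weight
    distribution = len(co_changes)      # number of coupled classes
    force = sum(co_changes.values())    # number of co-changes
    return distribution, force, decayed_force


# Hypothetical history, days counted from an arbitrary origin.
commits = [
    (0, {"Foo", "Bar"}),
    (180, {"Foo", "Bar", "Baz"}),
    (360, {"Foo", "Baz"}),
    (360, {"Bar"}),
]
distribution, force, decayed = coupling_measures(commits, "Foo", now=360)
```

For Foo this gives two coupled classes and four co-changes, while the decayed force is lower than four because the two older commits are discounted.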

Correlation with Software Defects

Measures: distribution, force, time decay

Spearman's correlation: 0.8+

Change coupling strongly correlates with software defects

Defect Prediction

Bar charts: explanative power and predictive power (axis from 0 to 0.9) of code metrics, SCM data, and change coupling. Annotation on change coupling: 14%

Change coupling correlates with software defects and can be used to improve defect prediction models.

5 Bug Prediction with Change Coupling: WCRE 2009

Software Quality Analysis

6 Question: Do design flaws correlate with software defects?

Design Flaws

Design guideline: A class should have one responsibility

Class (size ∝ responsibility)

Violated

Does the presence of design flaws correlate with software defects? And their addition?

Analyzing Design Flaws

Diagram: detection strategies applied to the source code yield the design flaws

Analyzing Design Flaw Deltas

Diagram: the source code is sampled over time every 2 weeks; the number of flaws in each sample gives the addition of flaws, which is correlated with the defects reported in each interval Δt

Results on Six Large Systems

Flaw presence correlation: 0.4+

Flaw addition correlation: 0.6+

No flaw correlates more than others consistently across systems

The presence and addition of design flaws correlate with software defects.
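
A sketch of how flaw additions could be correlated with defects. It assumes that the flaw count of each class is known for consecutive two-week samples, that additions are the positive deltas between samples, and that Spearman's rank correlation is computed as the Pearson correlation of average ranks. The helper names and all data are hypothetical, and the case where every value is tied is not handled.

```python
def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-class data: flaws found in consecutive two-week samples.
flaw_counts = {
    "Foo": [1, 2, 4],
    "Bar": [3, 3, 3],
    "Baz": [0, 1, 1],
    "Boo": [2, 2, 5],
}
defects = {"Foo": 5, "Bar": 1, "Baz": 2, "Boo": 4}

classes = sorted(flaw_counts)
# Flaw additions: sum of the positive deltas between consecutive samples.
additions = [
    sum(max(b - a, 0) for a, b in zip(flaw_counts[c], flaw_counts[c][1:]))
    for c in classes
]
rho = spearman(additions, [defects[c] for c in classes])
```

In this made-up data Bar has the most flaws present but adds none and has the fewest defects, which is the kind of difference between presence and addition that the two correlations separate.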

6 Software Quality Analysis: QSIC 2010

Bug Prediction with E-mails

7 Question: Can bug prediction techniques be improved with e-mail data?

Software entities that are frequently mentioned in development mailing lists are defect prone

Popularity Metrics
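
One simple popularity metric could be the number of e-mails that mention a class by name. The sketch below assumes plain-text message bodies and case-sensitive whole-word matching; the function name `popularity`, the class names, and the messages are hypothetical, and the thesis may define its metrics differently.

```python
import re


def popularity(class_names, emails):
    """Count, per class, the e-mails that mention the class name."""
    counts = {name: 0 for name in class_names}
    for body in emails:
        for name in class_names:
            # Whole-word match so that "Segment" does not match "SegmentInfo".
            if re.search(r"\b" + re.escape(name) + r"\b", body):
                counts[name] += 1
    return counts


# Hypothetical mailing-list messages.
emails = [
    "The NPE is thrown in FooWriter when the segment is empty.",
    "FooWriter and FooReader disagree about the segment count.",
    "Release notes draft attached.",
]
mentions = popularity(["FooWriter", "FooReader", "Segment"], emails)
```

Matching is case-sensitive here, so the lowercase word "segment" in ordinary prose is not counted as a mention of the class Segment.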

Do popularity metrics correlate with defects?

Bar chart: Spearman's correlation (axis from 0 to 0.8) of popularity metrics and of lines of code with defects, for Equinox, Jackrabbit, Lucene, and Maven

Do popularity metrics improve existing bug prediction techniques?

Bar chart: prediction performance (axis from 0 to 0.5) of POP metrics, SCM data, code metrics, SCM + POP, and code + POP. Annotations: 4%, 12%

Popularity metrics extracted from development mailing lists correlate with defects and improve existing bug prediction techniques.

7 Bug Prediction with E-mails: FASE 2010

Retrospective Analysis

1 Change coupling analysis

2 Bug evolution analysis

3 Code-bug co-evolution analysis

Future Prediction

4 Bug prediction

5 Bug prediction with change coupling

6 Software quality analysis

7 Bug prediction with e-mails

Mevo meta-model


Tools were fundamental for my daddy's research

Churrasco framework (SCP 2010)

Components: Mevo meta-model, importers, data interface, web interface

Tools: Evolution Radar, Bug’s Life, BugCrawler, Pendolino

Tool Gallery: Evolution Radar, Bug’s Life, BugCrawler, Pendolino, Churrasco

Overview: analysis techniques 1 to 7 (change coupling analysis, bug evolution analysis, code-bug co-evolution analysis, bug prediction, bug prediction with change coupling, software quality analysis, bug prediction with e-mails) and the tools that support them (Evolution Radar, Bug’s Life, BugCrawler, Pendolino, Churrasco)

Publications: MSR 2006, WCRE 2006, CSMR 2006, MSR Challenge, CSMR 2007, VISSOFT 2007, WASDeTT 2008, TSE 2009, WCRE 2009, MSR 2010, QSIC 2010, FASE 2010

Intermezzo

Replicating Experiments

Models available through the Churrasco web interface

Bug prediction benchmark

Collaboration in Churrasco (STTT 2010)

Screenshot labels: SVG interactive visualization, recent annotations added, people participating in the collaboration, selected figure information, metrics mapping configurator, package selector, regular expression matcher, user, selected figure, context menu, report generator


Developer neo

Limitations

User studies

More case studies

Other languages

Future Work

Exploit author data

Mevo

for(int j=m; j>i; j--){ uCJM1= dataUC[j-1]; uCJ= dataUC[j];

if(uCJM1.compare(z)> { /* exchange */ tempStr= data[j-1]; /* sort the data */ data[j-1]= data[j]; data[j]= tempStr;

dataUC[j-1]= uCJ; dataUC[j]= uCJM1; } }

Extend the meta-model

Combine techniques

Holistic software evolution

More repositories ahead

Conference papers

1. On the Impact of Design Flaws on Software Defects
Marco D'Ambros, Alberto Bacchelli, Michele Lanza
In Proceedings of QSIC 2010, pp. 23-31.

2. An Extensive Comparison of Bug Prediction Approaches
Marco D'Ambros, Michele Lanza, Romain Robbes
In Proceedings of MSR 2010, pp. 31-41.

3. Are Popular Classes More Defect Prone?
Alberto Bacchelli, Marco D'Ambros, Michele Lanza
In Proceedings of FASE 2010, pp. 59-73.

4. On the Relationship Between Change Coupling and Software Defects
Marco D'Ambros, Michele Lanza, Romain Robbes
In Proceedings of WCRE 2009, pp. 135-144.

5. Promises and Perils of Porting Software Visualization Tools to the Web
Marco D'Ambros, Mircea Lungu, Michele Lanza, Romain Robbes
In Proceedings of WSE 2009, pp. 109-118.

6. A Flexible Framework to Support Collaborative Software Evolution Analysis
Marco D'Ambros, Michele Lanza
In Proceedings of CSMR 2008, pp. 3-12.

7. Reverse Engineering with Logical Coupling
Marco D'Ambros, Michele Lanza
In Proceedings of WCRE 2006, pp. 189-198.

8. Software Bugs and Evolution: A Visual Approach to Uncover Their Relationships
Marco D'Ambros, Michele Lanza
In Proceedings of CSMR 2006, pp. 227-236.

Workshop papers

1. Churrasco: Supporting Collaborative Software Evolution Analysis
Marco D'Ambros, Michele Lanza
In Proceedings of WASDeTT 2008.

2. "A Bug's Life" - Visualizing a Bug Database
Marco D'Ambros, Michele Lanza, Martin Pinzger
In Proceedings of VISSOFT 2007, pp. 113-120.

3. The Evolution Radar: Visualizing Integrated Logical Coupling Information
Marco D'Ambros, Michele Lanza, Mircea Lungu
In Proceedings of MSR 2006, pp. 26-32.

Journal papers

1. On Porting Software Visualization Tools to the Web
Marco D'Ambros, Michele Lanza, Mircea Lungu, Romain Robbes
In Software Tools for Technology Transfer (STTT), Springer, 2010.

2. Distributed and Collaborative Software Evolution Analysis with Churrasco
Marco D'Ambros, Michele Lanza
In Journal of Science of Computer Programming (SCP), Vol. 75, No. 4, pp. 276-287. Elsevier, 2010.

3. Visualizing Co-Change Information with the Evolution Radar
Marco D'Ambros, Michele Lanza, Mircea Lungu
In IEEE Transactions on Software Engineering (TSE), Vol. 35, No. 5, pp. 720-735. IEEE CS Press, 2009.

4. Visual Software Evolution Reconstruction
Marco D'Ambros, Michele Lanza
In Journal on Software Maintenance and Evolution: Research and Practice (JSME), Vol. 21, No. 3, pp. 217-232. John Wiley & Sons, May 2009.

Other publications

1. Supporting Software Evolution Analysis with Historical Dependencies and Defect Information
Marco D'Ambros
In Proceedings of ICSM 2008, pp. 412-415.

2. The Metabase: Generating Object Persistency Using Meta Descriptions
Marco D'Ambros, Michele Lanza, Martin Pinzger
In Proceedings of FAMOOSr 2007.

3. BugCrawler: Visualizing Evolving Software Systems
Marco D'Ambros, Michele Lanza
In Proceedings of CSMR 2007, pp. 333-334.

4. Applying the Evolution Radar to PostgreSQL
Marco D'Ambros, Michele Lanza
In Proceedings of MSR 2006, pp. 177-178.
