Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces
Post on 13-Jan-2015
29 Sept 2011
Challenge the future Delft University of Technology
Using Source Code Metrics to Predict Change-Prone Java Interfaces
Daniele Romano and Martin Pinzger Williamsburg, ICSM 2011
2
Contributions
• Correlation of source code metrics with #changes in interfaces:
• C&K metrics
• complexity and usage metrics
• interface usage cohesion metric
• Predictive power of source code metrics for interfaces:
• prediction models
• 10 open source projects
• 8 Eclipse projects
• Hibernate 2 and Hibernate 3
3
Motivations
• Changes in interfaces are not desirable
• changes to interfaces can have a stronger impact
• interfaces define contracts
• existing object-oriented metrics are not sound for interfaces
• Related work uses metrics as quality predictors
• without distinguishing among kinds of classes
4
Hypotheses
• H1
• InterfaceUsageCohesion (IUC) has a stronger correlation with number of Source Code Changes (#SCC) of interfaces than the C&K metrics
• H2
• IUC can improve the performance of prediction models to classify Java interfaces into change- and not-change-prone
5
The Approach
• source code repository → metrics computation and changes retrieval
• correlation analysis: Spearman rank correlation (H1)
• prediction analysis: metrics and changes train models that classify interfaces (H2)
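The correlation step relies on the Spearman rank correlation, which is the Pearson correlation applied to rank vectors. A minimal stdlib sketch (function names are mine; ties receive average ranks):

```python
def average_ranks(xs):
    """Assign 1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with xs[order[i]].
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

A perfectly monotone increasing relationship yields +1, a monotone decreasing one -1, which is why the IUC column later shows strong negative coefficients.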
6
Metrics Computation
• source code repository → Evolizer Model Importer → FAMIX model
• metrics computation (FAMIX model, Understand) → metrics values
7
Changes Computation
• source code repository → Evolizer Version Control Connector → revisions info & subsequent file versions
• Evolizer ChangeDistiller: AST comparison → fine-grained source code changes (SCC)
8
Why SCC?
• Filtering out irrelevant changes due to modifications of:
• licenses
• comments
• More precise measurement
• Example: one revision touching one line can contain two fine-grained changes: #Revisions=1, #LinesModified=1, #SCC=2
9
C&K Correlation for Interfaces

Project            CBO       NOC       RFC       DIT       LCOM      WMC
Hibernate3         0.535**   0.029     0.592**   0.058     0.103     0.657**
Hibernate2         0.373**   0.065     0.325**   -0.01     0.006     0.522**
ecl.debug.core     0.484**   0.105     0.486**   0.232*    0.337     0.597**
ecl.debug.ui       0.216*    0.033     0.152     0.324**   0.214*    0.131
ecl.jface          0.239*    0.012     0.174**   0.103     0.320**   0.137
ecl.jdt.debug      0.512**   0.256**   0.349**   -0.049    0.238**   0.489**
ecl.team.core      0.367*    0.102     0.497**   0.243     0.400     0.451**
ecl.team.cvs.core  0.688**   -0.013    0.738**   0.618**   0.610**   0.744**
ecl.team.ui        0.301*    -0.003    0.299*    -0.103*   0.395**   0.299*
update.core        0.499**   -0.007    0.381**   0.146     0.482**   0.729**
Median             0.428     0.031     0.365     0.124     0.328     0.505
*= significant at α=0.05 **= significant at α=0.01
10
Weighted Methods per Class (WMC)
• WMC = Σ c_i, for i = 1..n
• c_i: cyclomatic complexity of the i-th method
• n: number of methods in a class
• For interfaces, each method declaration has complexity 1, so WMC reduces to the Number of Methods
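A small sketch of the definition (names are illustrative): WMC sums per-method cyclomatic complexities, and since interface methods have no body, each contributes 1 and WMC collapses to NOM.

```python
def wmc(method_complexities):
    """WMC = sum of the cyclomatic complexities c_i of a class's n methods."""
    return sum(method_complexities)

def wmc_for_interface(num_methods):
    """An interface method has no body, so each c_i = 1 and WMC reduces to NOM."""
    return wmc([1] * num_methods)
```

This is why the WMC and NOM columns in the two correlation tables carry identical values.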
11
Interface Segregation Principle
ISP defined by Robert C. Martin cope with fat interfaces
Fat interface interfaces that serve different clients each kind of client uses a different set of methods the interface should be split in more interface, each one
designed to serve a specific client
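A hedged Python sketch of the principle (the interface and client names are invented for illustration): a fat interface serving two kinds of clients is split into one role interface per client.

```python
from abc import ABC, abstractmethod

# Fat interface: printing clients never call scan(),
# scanning clients never call print_doc().
class FatMachine(ABC):
    @abstractmethod
    def print_doc(self, doc): ...
    @abstractmethod
    def scan(self): ...

# After applying the ISP: one narrow interface per kind of client.
class Printer(ABC):
    @abstractmethod
    def print_doc(self, doc): ...

class Scanner(ABC):
    @abstractmethod
    def scan(self): ...

class SimplePrinter(Printer):
    """Implements only the role its clients actually need."""
    def print_doc(self, doc):
        return f"printed: {doc}"
```

A client of `Printer` is now unaffected by changes to scanning methods, which is exactly the change-impact argument the slides make.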
12
Interface Segregation Principle (I)
• Different clients do not share any methods
• ClusterClients(i): counts the number of clients that do not share any method of the interface i
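One reading of ClusterClients(i), sketched under the assumption that usage data is a map from client to the set of interface methods it calls: a client counts if its used-method set is disjoint from every other client's.

```python
def cluster_clients(usage):
    """usage: {client_name: set of methods of interface i that the client calls}.
    Count clients that share no method with any other client."""
    count = 0
    for client, methods in usage.items():
        # Union of the methods used by all the other clients.
        others = set().union(*[m for c, m in usage.items() if c != client])
        if not (methods & others):
            count += 1
    return count
```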
13
Interface Usage Cohesion
Different clients share a method
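The slide does not spell out the IUC formula; in the underlying paper it is, as best I can state it (treat the exact form as an assumption), the average over clients of the fraction of the interface's methods each client uses:

```python
def iuc(num_interface_methods, usage):
    """IUC(i) = (sum over clients j of used_methods(j, i) / num_methods(i)) / num_clients(i).
    usage: {client: set of methods of interface i used by that client}."""
    if not usage or num_interface_methods == 0:
        return 0.0
    total = sum(len(methods) / num_interface_methods for methods in usage.values())
    return total / len(usage)
```

High cohesion (every client uses most methods) gives values near 1; a fat interface whose clients each touch a small slice gives values near 0, which matches the negative correlation with #SCC reported later.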
14
Other metrics for interfaces…
• Number Of Methods (NOM)
• Number Of Arguments (NOA)
• Arguments Per Procedure (APP)
• Number of Clients (Cli)
• Number of Invocations (Inv)
• Number of Implementing Classes (Impl)
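The signature-based metrics in the list above can be sketched as follows, assuming an interface is given as a list of (method name, argument count) pairs; the representation is mine:

```python
def signature_metrics(methods):
    """methods: list of (name, num_args) pairs for one interface.
    Returns NOM, NOA, and APP (average arguments per procedure = NOA / NOM)."""
    nom = len(methods)
    noa = sum(n_args for _, n_args in methods)
    app = noa / nom if nom else 0.0
    return {"NOM": nom, "NOA": noa, "APP": app}
```

Cli, Inv, and Impl are usage-side counts (clients, invocations, implementing classes) and would come from the dependency model rather than from signatures alone.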
15
Correlation for Interfaces

Project            Inv       Cli       NOM       Clust     IUC
Hibernate3         0.544**   0.433**   0.657**   0.302**   -0.601**
Hibernate2         0.165     0.104     0.522**   0.016     -0.373**
ecl.debug.core     0.317**   0.327**   0.597**   0.273**   -0.682**
ecl.debug.ui       0.497**   0.498**   0.131     0.418**   -0.508**
ecl.jface          0.205     0.099     0.137     0.106**   -0.363**
ecl.jdt.debug      0.495**   0.471     0.489**   0.474**   -0.605**
ecl.team.core      0.261     0.278     0.451**   0.328*    -0.475**
ecl.team.cvs.core  0.557**   0.608**   0.744**   0.369     -0.819**
ecl.team.ui        0.290     0.270     0.299     0.056     -0.618**
update.core        0.677**   0.656**   0.729**   0.606**   -0.656**
Median             0.317     0.327     0.505     0.328     -0.605
*= significant at α=0.05 **= significant at α=0.01
16
Prediction Analysis
• Three machine learning algorithms:
• Support Vector Machine
• Naïve Bayes network
• Neural nets
• Interface classification:
• training using 10-fold cross-validation
• CK feature set: {CBO, RFC, LCOM, WMC}
• IUC feature set: {CBO, RFC, LCOM, WMC, IUC}
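The reported results are AUC values, which can be computed without an ML library via the Mann-Whitney formulation. A stdlib sketch (names are mine; `labels` marks change-prone interfaces as 1):

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney statistic:
    the probability that a randomly chosen positive example scores
    higher than a randomly chosen negative one (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 is chance level; the medians around 0.75-0.81 in the following table indicate usable discriminative power.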
17
Prediction – AUC values

Project            NBayes          LibSVM          NN
                   CK      IUC     CK      IUC     CK      IUC
ecl.team.cvs.core  0.55    0.75    0.692   0.811   0.8     0.8
ecl.debug.core     0.75    0.79    0.806   0.828   0.85    0.875
ecl.debug.ui       0.66    0.72    0.71    0.742   0.748   0.766
Hibernate2         0.745   0.807   0.735   0.708   0.702   0.747
Hibernate3         0.835   0.862   0.64    0.856   0.874   0.843
ecl.jdt.debug      0.79    0.738   0.741   0.82    0.77    0.762
ecl.jface          0.639   0.734   0.607   0.778   0.553   0.542
ecl.team.core      0.708   0.792   0.617   0.608   0.725   0.85
ecl.team.ui        0.88    0.8     0.74    0.884   0.65    0.75
update.core        0.782   0.811   0.794   0.817   0.675   0.744
Median             0.747   0.791   0.722   0.814   0.736   0.764
18
Results
• H2 PARTIALLY ACCEPTED
• IUC can improve the performance of prediction models that classify Java interfaces into change-prone and not change-prone
• despite the improvements, the Wilcoxon test showed a significant difference only for LibSVM
• H1 ACCEPTED
• IUC has a stronger correlation with #SCC of interfaces than the C&K metrics
• IUC shows the best correlation
19
Implications
• Researchers
• take into account the nature of the measured entities
• Quality engineers
• enlarge metrics suites
• Developers and architects
• measure ISP violations
20
Future Work
• Metrics measurement over time
• Further validation
• Are the shared methods the problem?
• Component-based systems and service-oriented systems
21