Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces
Post on 13-Jan-2015
29 Sept 2011
Challenge the future Delft University of Technology
Using Source Code Metrics to Predict Change-Prone Java Interfaces
Daniele Romano and Martin Pinzger Williamsburg, ICSM 2011
2
Contributions
• Correlation of source code metrics with #changes in interfaces:
• C&K metrics
• complexity and usage metrics
• interface usage cohesion metric
• Predictive power of source code metrics for interfaces:
• prediction models
• 10 open source projects
• 8 Eclipse projects
• Hibernate 2 and Hibernate 3
3
Motivations
• Changes in interfaces are not desirable
• changes to interfaces can have a stronger impact
• interfaces define contracts
• existing object-oriented metrics are not sound for interfaces
• Related work uses metrics as quality predictors
• without distinguishing among kinds of classes
4
Hypotheses
• H1
• InterfaceUsageCohesion (IUC) has a stronger correlation with number of Source Code Changes (#SCC) of interfaces than the C&K metrics
• H2
• IUC can improve the performance of prediction models to classify Java interfaces into change- and not-change-prone
5
The Approach
• source code repository → metrics computation and changes retrieval
• correlation analysis: Spearman rank correlation (H1)
• prediction analysis: metrics and changes train models that classify interfaces (H2)
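The correlation step relies on the Spearman rank correlation, which is the Pearson correlation applied to rank vectors. A minimal stdlib sketch (function names are mine; ties receive average ranks):

```python
def average_ranks(xs):
    """Assign 1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with xs[order[i]].
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

A perfectly monotone increasing relationship yields +1, a monotone decreasing one -1, which is why the IUC column later shows strong negative coefficients.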
6
Metrics Computation
• source code repository → Evolizer Model Importer → FAMIX model
• metrics computation (FAMIX model, Understand) → metrics values
7
Changes Computation
• source code repository → Evolizer Version Control Connector → revisions info & subsequent file versions
• Evolizer ChangeDistiller: AST comparison → fine-grained source code changes (SCC)
8
Why SCC?
• Filtering out irrelevant changes due to modifications of:
• licenses
• comments
• More precise measurement
• Example: one revision touching one line can contain two fine-grained changes: #Revisions=1, #LinesModified=1, #SCC=2
9
C&K Correlation for Interfaces

Project            CBO       NOC       RFC       DIT       LCOM      WMC
Hibernate3         0.535**   0.029     0.592**   0.058     0.103     0.657**
Hibernate2         0.373**   0.065     0.325**   -0.01     0.006     0.522**
ecl.debug.core     0.484**   0.105     0.486**   0.232*    0.337     0.597**
ecl.debug.ui       0.216*    0.033     0.152     0.324**   0.214*    0.131
ecl.jface          0.239*    0.012     0.174**   0.103     0.320**   0.137
ecl.jdt.debug      0.512**   0.256**   0.349**   -0.049    0.238**   0.489**
ecl.team.core      0.367*    0.102     0.497**   0.243     0.400     0.451**
ecl.team.cvs.core  0.688**   -0.013    0.738**   0.618**   0.610**   0.744**
ecl.team.ui        0.301*    -0.003    0.299*    -0.103*   0.395**   0.299*
update.core        0.499**   -0.007    0.381**   0.146     0.482**   0.729**
Median             0.428     0.031     0.365     0.124     0.328     0.505
*= significant at α=0.05 **= significant at α=0.01
10
Weighted Methods per Class (WMC)
• WMC = Σ c_i, for i = 1..n
• c_i: cyclomatic complexity of the i-th method
• n: number of methods in a class
• For interfaces, each method declaration has complexity 1, so WMC reduces to the Number of Methods
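A small sketch of the definition (names are illustrative): WMC sums per-method cyclomatic complexities, and since interface methods have no body, each contributes 1 and WMC collapses to NOM.

```python
def wmc(method_complexities):
    """WMC = sum of the cyclomatic complexities c_i of a class's n methods."""
    return sum(method_complexities)

def wmc_for_interface(num_methods):
    """An interface method has no body, so each c_i = 1 and WMC reduces to NOM."""
    return wmc([1] * num_methods)
```

This is why the WMC and NOM columns in the two correlation tables carry identical values.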
11
Interface Segregation Principle
ISP defined by Robert C. Martin cope with fat interfaces
Fat interface interfaces that serve different clients each kind of client uses a different set of methods the interface should be split in more interface, each one
designed to serve a specific client
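A hedged Python sketch of the principle (the interface and client names are invented for illustration): a fat interface serving two kinds of clients is split into one role interface per client.

```python
from abc import ABC, abstractmethod

# Fat interface: printing clients never call scan(),
# scanning clients never call print_doc().
class FatMachine(ABC):
    @abstractmethod
    def print_doc(self, doc): ...
    @abstractmethod
    def scan(self): ...

# After applying the ISP: one narrow interface per kind of client.
class Printer(ABC):
    @abstractmethod
    def print_doc(self, doc): ...

class Scanner(ABC):
    @abstractmethod
    def scan(self): ...

class SimplePrinter(Printer):
    """Implements only the role its clients actually need."""
    def print_doc(self, doc):
        return f"printed: {doc}"
```

A client of `Printer` is now unaffected by changes to scanning methods, which is exactly the change-impact argument the slides make.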
12
Interface Segregation Principle (I)
• Different clients do not share any methods
• ClusterClients(i): counts the number of clients that do not share any method of the interface i
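One reading of ClusterClients(i), sketched under the assumption that usage data is a map from client to the set of interface methods it calls: a client counts if its used-method set is disjoint from every other client's.

```python
def cluster_clients(usage):
    """usage: {client_name: set of methods of interface i that the client calls}.
    Count clients that share no method with any other client."""
    count = 0
    for client, methods in usage.items():
        # Union of the methods used by all the other clients.
        others = set().union(*[m for c, m in usage.items() if c != client])
        if not (methods & others):
            count += 1
    return count
```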
13
Interface Usage Cohesion
Different clients share a method
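The slide does not spell out the IUC formula; in the underlying paper it is, as best I can state it (treat the exact form as an assumption), the average over clients of the fraction of the interface's methods each client uses:

```python
def iuc(num_interface_methods, usage):
    """IUC(i) = (sum over clients j of used_methods(j, i) / num_methods(i)) / num_clients(i).
    usage: {client: set of methods of interface i used by that client}."""
    if not usage or num_interface_methods == 0:
        return 0.0
    total = sum(len(methods) / num_interface_methods for methods in usage.values())
    return total / len(usage)
```

High cohesion (every client uses most methods) gives values near 1; a fat interface whose clients each touch a small slice gives values near 0, which matches the negative correlation with #SCC reported later.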
14
Other metrics for interfaces…
• Number Of Methods (NOM)
• Number Of Arguments (NOA)
• Arguments Per Procedure (APP)
• Number of Clients (Cli)
• Number of Invocations (Inv)
• Number of Implementing Classes (Impl)
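The signature-based metrics in the list above can be sketched as follows, assuming an interface is given as a list of (method name, argument count) pairs; the representation is mine:

```python
def signature_metrics(methods):
    """methods: list of (name, num_args) pairs for one interface.
    Returns NOM, NOA, and APP (average arguments per procedure = NOA / NOM)."""
    nom = len(methods)
    noa = sum(n_args for _, n_args in methods)
    app = noa / nom if nom else 0.0
    return {"NOM": nom, "NOA": noa, "APP": app}
```

Cli, Inv, and Impl are usage-side counts (clients, invocations, implementing classes) and would come from the dependency model rather than from signatures alone.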
15
Correlation for Interfaces

Project            Inv       Cli       NOM       Clust     IUC
Hibernate3         0.544**   0.433**   0.657**   0.302**   -0.601**
Hibernate2         0.165     0.104     0.522**   0.016     -0.373**
ecl.debug.core     0.317**   0.327**   0.597**   0.273**   -0.682**
ecl.debug.ui       0.497**   0.498**   0.131     0.418**   -0.508**
ecl.jface          0.205     0.099     0.137     0.106**   -0.363**
ecl.jdt.debug      0.495**   0.471     0.489**   0.474**   -0.605**
ecl.team.core      0.261     0.278     0.451**   0.328*    -0.475**
ecl.team.cvs.core  0.557**   0.608**   0.744**   0.369     -0.819**
ecl.team.ui        0.290     0.270     0.299     0.056     -0.618**
update.core        0.677**   0.656**   0.729**   0.606**   -0.656**
Median             0.317     0.327     0.505     0.328     -0.605
*= significant at α=0.05 **= significant at α=0.01
16
Prediction Analysis
• Three machine learning algorithms:
• Support Vector Machine
• Naïve Bayes network
• Neural nets
• Interface classification:
• training using 10-fold cross-validation
• CK feature set: {CBO, RFC, LCOM, WMC}
• IUC feature set: {CBO, RFC, LCOM, WMC, IUC}
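The reported results are AUC values, which can be computed without an ML library via the Mann-Whitney formulation. A stdlib sketch (names are mine; `labels` marks change-prone interfaces as 1):

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney statistic:
    the probability that a randomly chosen positive example scores
    higher than a randomly chosen negative one (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 is chance level; the medians around 0.75-0.81 in the following table indicate usable discriminative power.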
17
Prediction – AUC values

Project            NBayes          LibSVM          NN
                   CK      IUC     CK      IUC     CK      IUC
ecl.team.cvs.core  0.55    0.75    0.692   0.811   0.8     0.8
ecl.debug.core     0.75    0.79    0.806   0.828   0.85    0.875
ecl.debug.ui       0.66    0.72    0.71    0.742   0.748   0.766
Hibernate2         0.745   0.807   0.735   0.708   0.702   0.747
Hibernate3         0.835   0.862   0.64    0.856   0.874   0.843
ecl.jdt.debug      0.79    0.738   0.741   0.82    0.77    0.762
ecl.jface          0.639   0.734   0.607   0.778   0.553   0.542
ecl.team.core      0.708   0.792   0.617   0.608   0.725   0.85
ecl.team.ui        0.88    0.8     0.74    0.884   0.65    0.75
update.core        0.782   0.811   0.794   0.817   0.675   0.744
Median             0.747   0.791   0.722   0.814   0.736   0.764
18
Results
• H2 PARTIALLY ACCEPTED
• IUC can improve the performance of prediction models that classify Java interfaces into change-prone and not change-prone
• despite the improvements, the Wilcoxon test showed a significant difference only for LibSVM
• H1 ACCEPTED
• IUC has a stronger correlation with #SCC of interfaces than the C&K metrics
• IUC shows the best correlation
19
Implications
• Researchers
• take into account the nature of the measured entities
• Quality engineers
• enlarge metrics suites
• Developers and architects
• measure ISP violations
20
Future Work
• Metrics measurement over time
• Further validation
• Are the shared methods the problem?
• Component-based systems and service-oriented systems
21