Partitioning Composite Code Changes to Facilitate Code Review
Yida Tao and Sunghun Kim
The Hong Kong University of Science and Technology
Atomic Code Change
Fixed bug #123
Composite Code Change
Fixed bug #12, #34, #56
Removed duplicate code
Add a feature
Javadoc updated
Difficult to review
Likely to be rejected
Research Questions
• RQ1: Are composite code changes prevalent?
• RQ2: Can we propose an approach to improve the semantic atomicity of composite code changes?
• RQ3: Can our approach help developers better review composite code changes?
RQ1: Occurrence of composite code changes
• Data source
  • 4 open-source Java projects
  • Revisions that changed >= 2 lines of code
  • Commit logs and source code were manually inspected

Project       Time period               Revisions  Avg. cLOC  Avg. files
Ant           2010/04/27 -- 2012/03/05  137        26.1       2.0
Commons Math  2011/11/28 -- 2012/04/12  107        84.7       3.5
Xerces        2008/11/03 -- 2012/03/13  116        63.6       3.0
JFreeChart    2008/07/02 -- 2010/03/30  93         144.9      4.1
Total revisions: 453
RQ1: Occurrence of composite code changes
8% - 29% of revisions address multiple issues

Project       1 issue  2 issues  > 2 issues
Xerces        82%      13%       5%
Ant           92%      7%        1%
Commons Math  82%      12%       6%
JFreeChart    71%      18%       11%
Approach
A composite code change
A set of changed statements
Identify related statements
Partition of the change (a subset is a change-slice)
Approach
A composite code change
A set of changed statements
Formatting
Dependency
Similarity
Partition of the change (a subset is a change-slice)
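The step from related statements to a partition can be sketched as a grouping problem. The sketch below assumes, which the slides do not state, that a change-slice is a connected component of a symmetric "related" relation over the changed statements. The names `partition_change` and `related` are hypothetical.

```python
from itertools import combinations


def partition_change(statements, related):
    """Group changed statements into change-slices.

    Assumption: a change-slice is a connected component of the
    symmetric `related` relation over the changed statements.
    """
    parent = {s: s for s in statements}

    def find(s):
        # Follow parent links to the representative, halving the path.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    # Merge the groups of every related pair.
    for a, b in combinations(statements, 2):
        if related(a, b):
            parent[find(a)] = find(b)

    slices = {}
    for s in statements:
        slices.setdefault(find(s), set()).add(s)
    return {frozenset(group) for group in slices.values()}


# Hypothetical change: s1-s2 and s2-s3 are related, s4 is unrelated.
pairs = {("s1", "s2"), ("s2", "s3")}
result = partition_change(
    ["s1", "s2", "s3", "s4"],
    lambda a, b: (a, b) in pairs or (b, a) in pairs,
)
```

In this toy input, `s1`, `s2`, and `s3` end up in one change-slice and `s4` in another. Any of the three relations named on the slide (formatting, dependency, similarity) could in principle be plugged in as `related`.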
Approach
Formatting
• Unix diff (text differencing)
• ChangeDistiller* (AST differencing)
• Formatting changes
*http://www.ifi.uzh.ch/seal/research/tools/changeDistiller.html
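One plausible reading of this slide is that a hunk reported by text differencing, but with no syntactic counterpart, is a formatting change. The sketch below illustrates that idea only. It does not call ChangeDistiller (a Java tool) and substitutes whitespace stripping as a crude stand-in for AST differencing. The function names are hypothetical.

```python
import difflib
import re


def normalize(line):
    """Crude stand-in for a syntactic view of a line: drop all whitespace.

    Assumption: this is far weaker than AST differencing and can
    conflate tokens, e.g. "int x" and "intx".
    """
    return re.sub(r"\s+", "", line)


def classify_hunks(old_lines, new_lines):
    """Label each textual hunk as 'formatting' or 'non-formatting'."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    hunks = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # Join the lines so that re-wrapped statements compare equal.
        old_text = "".join(normalize(l) for l in old_lines[i1:i2])
        new_text = "".join(normalize(l) for l in new_lines[j1:j2])
        kind = "formatting" if old_text == new_text else "non-formatting"
        hunks.append((kind, old_lines[i1:i2], new_lines[j1:j2]))
    return hunks


old = ["int a=1;", "if (a>0) {", "return a;", "}"]
new = ["int a = 1;", "if (a>0) {", "return a + 1;", "}"]
hunks = classify_hunks(old, new)
```

Here the first hunk only adds spaces around `=`, so it is labeled as formatting, while the changed return value is not.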
Approach
Dependency
• IBM T.J. Watson Libraries for Analysis (WALA)*
• Inter-procedural backward static slicing
*http://wala.sourceforge.net/wiki/index.php/Main_Page
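A backward slice of a statement is the set of statements it transitively depends on. The sketch below computes one over a small hand-written dependence graph. It does not use WALA, and the rule that two changed statements are related when one lies in the other's backward slice is an assumption made for illustration, not something the slide states.

```python
def backward_slice(deps, seed):
    """Statements that `seed` transitively depends on (seed included).

    `deps` maps a statement id to the ids it directly depends on
    (data or control dependence, possibly across method boundaries).
    """
    seen = {seed}
    stack = [seed]
    while stack:
        current = stack.pop()
        for dep in deps.get(current, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def dependency_related(deps, a, b):
    """Assumed rule: a and b are related if either is in the other's slice."""
    return a in backward_slice(deps, b) or b in backward_slice(deps, a)


# Hypothetical dependence graph: statement 3 depends on 2, which
# depends on 1; statement 4 depends on nothing.
deps = {3: [2], 2: [1], 4: []}
slice_of_3 = backward_slice(deps, 3)
```

With this graph, statements 1 and 3 would be treated as related, while 3 and 4 would not.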
Approach
Similarity
• “protect array entries against corruption by returning a clone”
• Same change type
• Similar delta
P
P’
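The two criteria on this slide, same change type and similar delta, can be sketched as follows. This is an illustration under stated assumptions: a change is modeled as a hypothetical `(change_type, old_code, new_code)` triple, the delta is the list of removed and added tokens, and the 0.8 threshold is arbitrary. The statements used are invented to mirror the quoted commit message.

```python
import difflib
import re


def tokens(code):
    """Split code into words/numbers and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", code)


def delta(old, new):
    """Tokens removed from `old` and tokens added in `new`."""
    a, b = tokens(old), tokens(new)
    removed, added = [], []
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += a[i1:i2]
            added += b[j1:j2]
    return removed, added


def similar(change_a, change_b, threshold=0.8):
    """Same change type and similar delta (threshold is an assumption)."""
    type_a, old_a, new_a = change_a
    type_b, old_b, new_b = change_b
    if type_a != type_b:
        return False
    removed_a, added_a = delta(old_a, new_a)
    removed_b, added_b = delta(old_b, new_b)
    matcher = difflib.SequenceMatcher(
        a=removed_a + added_a, b=removed_b + added_b, autojunk=False
    )
    return matcher.ratio() >= threshold


# Hypothetical statement updates in the spirit of "returning a clone".
c1 = ("update", "return lower;", "return lower.clone();")
c2 = ("update", "return upper;", "return upper.clone();")
c3 = ("update", "return upper;", "return null;")
```

Both `c1` and `c2` add the same `.clone()` tokens, so they would be grouped together, while `c3` has an unrelated delta.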
Evaluation
• 78 composite code changes from the previously inspected data
• 3 human evaluators establish manual partitions for these changes
• Automatic partition results are compared to manual partitions
• Considered acceptable if it exactly matched the manual partition
Project       Acceptable # / Total #
Ant           8 / 11
Commons Math  10 / 19
Xerces        16 / 21
JFreeChart    20 / 27
Total         54 / 78 (69%)
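The exact-match criterion and the overall rate can be checked with a few lines. In this sketch a partition is modeled as a collection of statement groups, which is an assumption about representation. The per-project counts are the ones reported in the table above.

```python
def as_partition(groups):
    """Normalize a partition so the order of slices and statements is ignored."""
    return {frozenset(group) for group in groups}


def acceptable(automatic, manual):
    """Criterion from the slide: exact match with the manual partition."""
    return as_partition(automatic) == as_partition(manual)


# Per-project counts reported on the slide: (acceptable, total).
counts = {
    "Ant": (8, 11),
    "Commons Math": (10, 19),
    "Xerces": (16, 21),
    "JFreeChart": (20, 27),
}
accepted = sum(a for a, _ in counts.values())
total = sum(t for _, t in counts.values())
rate = round(100 * accepted / total)
```

Because the criterion is an exact match, an automatic partition that merges or splits even one slice differently from the manual one counts as unacceptable.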
Ant revision 943068 (24 changed LOC)
“Wrong assignment after I renamed the parameter. Unfortunately there doesn’t seem to be a testcase that catches the error.”
Later fixed in revision 943070
One of the two change-slices after partitioning
Preliminary User Study
• RQ3
  • Can our automatic partition help developers better review composite changes?
• Participants
  • 18 CS graduate students
• Task
  • Participants review 12 composite code changes
  • Answer a series of code review questions [1], e.g.,
    • “What is the consequence of removing the schemaType field?”
    • “What do changes in these files have in common?”
[1] “Questions programmers ask during software evolution tasks” Sillito et al. FSE 2006
Experimental Settings
• Treatments
  • Control group: review code changes by file
  • Experimental group: review code changes by partition
Composite code changes: 8% - 29% of revisions
Formatting, dependency, similarity: partition of the change (69% acceptable)
Review by file vs. by partition
Discussion
• Impact of unsatisfactory change partitions
• Balancing between partition costs and benefits
Related Work
• Helping developers help themselves: Automatic decomposition of code review changesets. Barnett et al. ICSE 2015
  • The industrial perspective of composite code changes
• The impact of tangled code changes. Kim Herzig and Andreas Zeller. MSR 2013
• Filtering Noise in Mixed-Purpose Fixing Commits to Improve Defect Prediction and Localization. Nguyen et al. ISSRE 2013
  • How composite code changes affect defect prediction