![Page 1: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/1.jpg)
Static and Adaptive Bug Fix Patterns
Jim Whitehead, Sung Kim, Kai Pan
University of California, Santa Cruz
![Page 2: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/2.jpg)
Bug and Bug Fix Patterns?
• Are bugs and bug fixes random in their goal and structure, or do they exhibit patterns?
• We know there are some patterns, since there are existing pattern-oriented static analysis tools that are able to detect some bugs
• Hypothesis: there are both project-specific and project-independent patterns that are detectable in bugs and bug fixes
![Page 3: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/3.jpg)
Static and Adaptive Bug Fix Patterns
• Static: syntax-driven change patterns
► Example: changing an if condition expression
► Found by statically analyzing code to detect conformance to a pattern
► Horizontal: same pattern can be found in multiple projects
• Adaptive: memory-driven change patterns
► Example: frequent string literal changes
► Found by detecting a previous similar bug fix in a project-specific bug fix database, or “memory”
► Vertical: each pattern is specific to a given project
![Page 4: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/4.jpg)
Promise of Bug and Fix Patterns
• If bugs exhibit detectable patterns, it would be possible to automatically detect bugs
• If there are common bug to fix mappings, it would be possible to supply a recommended fix for a detected bug
• Using bug fix patterns, it would be possible to see the frequency distribution of the patterns
► Would be useful to understand which kinds of patterns occur more frequently
• Broadly, such patterns would contribute to an improved understanding of maintenance activity
![Page 5: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/5.jpg)
Talk Overview
• Terminology and Detection of Bug Fix Changes
• Static Bug Fix Patterns
• Adaptive Bug Fix Patterns
• Conclusions
![Page 6: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/6.jpg)
Retrieving Bug Fix Changes
• Software projects today record their development history using Software Configuration Management tools
• As developers make changes, they record a reason along with the change
► In the change log message
• When developers fix a bug in the software, they tend to record log messages with some variation of the words “fixed” or “bug”
► “Fixed null pointer bug”
• It is possible to mine the change history of a software project to uncover these bug-fix changes
• That is, we retrospectively recover those changes that developers have marked as containing a bug fix
► We assume they are not lying
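The log-message heuristic described on this slide can be sketched in a few lines of Python. This is only an illustration: the keyword list and the names `FIX_PATTERN` and `is_bug_fix` are assumptions, since the slides only say that fix messages contain some variation of “fixed” or “bug”.

```python
import re

# Assumed keyword list; the slides mention only variations of "fixed" or "bug".
FIX_PATTERN = re.compile(r"\b(fix(e[sd])?|bugs?)\b", re.IGNORECASE)

def is_bug_fix(log_message):
    """Return True when a change log message looks like a bug fix."""
    return FIX_PATTERN.search(log_message) is not None

# Log messages taken from the slides.
messages = [
    "Added feature X",
    "Fixed null ptr bug",
    "Modified button text",
    "Bug #567 fixed",
]
bug_fixes = [m for m in messages if is_bug_fix(m)]
```

Matching on whole words keeps messages such as “Added prefix handling” from being flagged, but a keyword filter like this still depends on developers labeling their fixes consistently.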
![Page 7: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/7.jpg)
Bug-introducing and bug-fix changes
[Figure: timeline of the development history of foo.java, marking three events: the software change that introduces the bug (“bug-introducing”), Bug #567 entered into the issue tracking system (bug finally observed and recorded), and the change with SCM log message “Bug #567 fixed” (“bug fix”)]
![Page 8: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/8.jpg)
Commits, Transactions & Configurations
[Figure: CVS file commits shown with their transactions and configurations; log messages in the example: “Added feature X”, “Fixed null ptr bug”, “Modified button text”, “Added feature Y”]
![Page 9: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/9.jpg)
Hunks and Hunk Pairs
[Figure: Revision n-1 (has bug hunks) shown beside Revision n (has fix hunks). Hunk pair types: modification, addition, deletion. Hunk labels: deleted hunk, added hunk, empty deleted hunk, empty added hunk]
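As a rough illustration of hunk pairs, the sketch below uses Python's standard `difflib` module to split two revisions of a file into (deleted hunk, added hunk) pairs. Mapping difflib's `replace`, `insert`, and `delete` opcodes onto the modification, addition, and deletion pair types is an assumption made for this example; the slides do not say which diff tool was used. The function name `hunk_pairs` is also hypothetical.

```python
import difflib

# Assumed mapping from difflib opcode tags to the hunk pair types on the slide.
PAIR_TYPES = {"replace": "modification", "insert": "addition", "delete": "deletion"}

def hunk_pairs(old_lines, new_lines):
    """Return (pair_type, deleted_hunk, added_hunk) for each changed region."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged lines belong to no hunk
        pairs.append((PAIR_TYPES[tag], old_lines[i1:i2], new_lines[j1:j2]))
    return pairs

revision_before = ["a", "b", "c", "d"]       # revision n-1 (bug hunks)
revision_after = ["a", "B", "c", "d", "e"]   # revision n (fix hunks)
pairs = hunk_pairs(revision_before, revision_after)
```

An addition shows up with an empty deleted hunk, and a deletion with an empty added hunk, which matches the labels in the figure.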
![Page 10: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/10.jpg)
Static Bug Fix Patterns
![Page 11: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/11.jpg)
Static Bug Patterns
• Performed manual analysis of bug fix hunk pairs in Java programs
► Examined bug hunks and corresponding fix hunks
► Looked for syntax patterns of recurring changes
► Identified 27 static bug fix patterns in Java code
![Page 12: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/12.jpg)
Example Pattern
• Method Call with Different Actual Parameter Values (MC-DAP)
► The bug fix changes the expression passed into one or more parameters of a method call
- tree.putClientProperty("JTree.lineStyle", "Horizontal");
+ tree.putClientProperty("JTree.lineStyle", "Angled");
- = bug revision
+ = fix revision
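A very rough way to recognize the MC-DAP pattern on a single bug line and fix line is sketched below. It is not the detector used in the study: the regular expression, the naive comma split (which breaks on nested calls or commas inside string literals), and the names `parse_call` and `is_mc_dap` are all assumptions for illustration.

```python
import re

# Matches a single call statement such as: receiver.method(arg1, arg2);
CALL = re.compile(r"^\s*([\w.]+)\s*\((.*)\)\s*;?\s*$")

def parse_call(line):
    """Return (qualified method name, argument list), or None if not a call."""
    match = CALL.match(line)
    if match is None:
        return None
    name, raw_args = match.groups()
    # Naive split: does not handle nested calls or commas inside literals.
    args = [a.strip() for a in raw_args.split(",")] if raw_args.strip() else []
    return name, args

def is_mc_dap(bug_line, fix_line):
    """Same method, same number of parameters, different actual values."""
    bug, fix = parse_call(bug_line), parse_call(fix_line)
    if bug is None or fix is None:
        return False
    (bug_name, bug_args), (fix_name, fix_args) = bug, fix
    return (bug_name == fix_name
            and len(bug_args) == len(fix_args)
            and bug_args != fix_args)

bug_line = 'tree.putClientProperty("JTree.lineStyle", "Horizontal");'
fix_line = 'tree.putClientProperty("JTree.lineStyle", "Angled");'
matched = is_mc_dap(bug_line, fix_line)
```

A production detector would work on a parsed syntax tree rather than on text, but the check itself (same call target, same arity, different argument expressions) stays the same.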
![Page 13: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/13.jpg)
Static Bug Fix Pattern Categories
• Nine categories of static bug fix patterns
► If-related
► Method call
► Sequence
► Loop
► Assignment
► Switch
► Try
► Method declaration
► Class field
![Page 14: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/14.jpg)
If Patterns
• Addition of precondition check
► Adds if around existing statement(s)
• Addition of precondition check with jump
► Adds if before statement(s) with return/continue/break if condition is met
• Addition of postcondition check
► Adds if statement after operation to check results
• Removal of if predicate
► Removal of if surrounding statement(s)
• Addition of else branch
► Adds an else branch to existing if statement
• Removal of else branch
► Removes else branch from existing if statement
• Change of if condition expression
► Modifies the conditional part of an if statement
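The last pattern, change of if condition expression, is the easiest of these to illustrate. The sketch below compares the condition text of a bug line and a fix line; the regular expression and the names `if_condition` and `is_if_condition_change` are assumptions, and the sample lines are hypothetical. A real detector would compare parsed expressions rather than raw text.

```python
import re

# Matches a one-line if header, with or without the opening brace.
IF_HEADER = re.compile(r"^\s*if\s*\((.*)\)\s*\{?\s*$")

def if_condition(line):
    """Return the condition text of an if header, or None if it is not one."""
    match = IF_HEADER.match(line)
    return match.group(1).strip() if match else None

def is_if_condition_change(bug_line, fix_line):
    """True when both lines are if headers and their conditions differ."""
    bug, fix = if_condition(bug_line), if_condition(fix_line)
    return bug is not None and fix is not None and bug != fix

# Hypothetical bug and fix lines.
changed = is_if_condition_change("if (i < n) {", "if (i <= n && ready()) {")
```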
![Page 15: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/15.jpg)
Method Call Patterns
• Method call with different number of parameters or different types of parameters
► Same method name, but different number of parameters or types of parameters
► Change of method interface, or use of overloaded method
• Method call with different actual parameter values
• Change of class instance method call
► Fix code calls a different member method of a class instance
![Page 16: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/16.jpg)
Sequence Patterns
• Addition of operations in an operation sequence of method calls to an object
► Many calls to the same object all in sequence – add one or more
• Removal of operations from an operation sequence of method calls to an object
► Many calls to the same object in sequence – remove one or more
• Addition of operations in a field setting sequence
• Removal of operations from a field setting sequence
• Addition or removal of method calls in a short construct body
► A short construct body is a short method (2 or 3 statements), or an if or while body that is short (2 or 3 statements)
![Page 17: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/17.jpg)
Loop and Assignment Patterns
• Change of loop predicate
► Bug fix changes the loop condition of a loop statement
• Change of expression that modifies the loop variable
► Bug fix changes the expression that modifies the loop variable, or adds a statement that modifies the loop variable
• Change of assignment expression
► Bug fix changes the expression on the right hand side of an assignment statement
![Page 18: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/18.jpg)
Switch and Try Patterns
• Addition/removal of switch branch
► Bug fix adds/removes a case from a switch statement
• Addition/removal of try statement
► Bug fix adds a try/catch statement to enclose a section of code, or removes a try/catch statement
• Addition/removal of a catch block
► Bug fix adds a catch block to, or removes one from, an existing try statement
![Page 19: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/19.jpg)
Method Declaration and Class Field Patterns
• Change of method declaration
► Change to the declared interface for a method
• Addition of method declaration
► Adding new method to existing class
• Removal of method declaration
► Removal of an existing method
• Addition of a class field
• Removal of a class field
• Change of class field declaration
![Page 20: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/20.jpg)
Evolutionary Pattern Analysis
• How many bug fixes contain a pattern?
• How frequently do these patterns occur in actual bug fixes?
• Are pattern frequencies consistent across projects?
• Analyzed five Java open source project histories
• Ran bug fix pattern detector program over bug fix changes
| Project | Revisions | Bug Fixes |
| --- | --- | --- |
| ArgoUML | 4,685 | 1,310 |
| Columba | 2,362 | 797 |
| Eclipse | 6,394 | 2,807 |
| JEdit | 1,190 | 557 |
| Scarab | 2,962 | 535 |
![Page 21: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/21.jpg)
Pattern Coverage
• What percentage of bug fixes contain at least one pattern? (About half)
[Bar chart: Pattern Coverage by project (ArgoUML, Columba, Eclipse, JEdit, Scarab); y-axis from 0.0% to 60.0%; data labels: 44.8%, 47.5%, 52.6%, 56.4%, 45.8%]
![Page 22: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/22.jpg)
Frequency of pattern categories
| Category | ArgoUML | Columba | Eclipse | JEdit | Scarab |
| --- | --- | --- | --- | --- | --- |
| If-related | 23.2% | 20.0% | 34.0% | 30.5% | 23.0% |
| Method call | 30.3% | 26.2% | 26.5% | 22.2% | 33.9% |
| Sequence | 9.7% | 17.5% | 6.5% | 13.5% | 9.5% |
| Loop | 1.9% | 0.8% | 2.2% | 1.6% | 1.4% |
| Assignment | 8.6% | 7.6% | 6.4% | 8.4% | 7.4% |
| Switch | 0.0% | 0.3% | 1.6% | 0.6% | 0.0% |
| Try | 1.0% | 1.9% | 2.6% | 1.0% | 1.4% |
| Method declaration | 16.2% | 17.2% | 13.2% | 13.4% | 16.6% |
| Class field | 7.6% | 8.4% | 7.0% | 8.7% | 6.7% |
![Page 23: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/23.jpg)
Cross project similarity
• Pearson correlation between the pattern frequencies across projects. (p-value < 0.001)
• Projects have surprisingly similar pattern frequencies
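| | ArgoUML | Columba | Eclipse | JEdit | Scarab |
| --- | --- | --- | --- | --- | --- |
| ArgoUML | 1 | 0.94 | 0.89 | 0.93 | 0.99 |
| Columba | 0.94 | 1 | 0.76 | 0.87 | 0.93 |
| Eclipse | 0.89 | 0.76 | 1 | 0.94 | 0.89 |
| JEdit | 0.93 | 0.87 | 0.94 | 1 | 0.92 |
| Scarab | 0.99 | 0.93 | 0.89 | 0.92 | 1 |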
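For readers who want to reproduce this kind of comparison, here is a minimal sketch of the Pearson correlation between two projects' frequency vectors. It uses the nine category-level frequencies from the “Frequency of pattern categories” slide as input. The slides do not state whether the matrix above was computed over categories or over the individual patterns, so the values produced here need not match it exactly. The names `category_freq` and `pearson` are assumptions.

```python
import math

# Category frequencies (percent) from the "Frequency of pattern categories" slide,
# in the order: If-related, Method call, Sequence, Loop, Assignment, Switch, Try,
# Method declaration, Class field.
category_freq = {
    "ArgoUML": [23.2, 30.3, 9.7, 1.9, 8.6, 0.0, 1.0, 16.2, 7.6],
    "Columba": [20.0, 26.2, 17.5, 0.8, 7.6, 0.3, 1.9, 17.2, 8.4],
    "Eclipse": [34.0, 26.5, 6.5, 2.2, 6.4, 1.6, 2.6, 13.2, 7.0],
    "JEdit": [30.5, 22.2, 13.5, 1.6, 8.4, 0.6, 1.0, 13.4, 8.7],
    "Scarab": [23.0, 33.9, 9.5, 1.4, 7.4, 0.0, 1.4, 16.6, 6.7],
}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

r_argouml_scarab = pearson(category_freq["ArgoUML"], category_freq["Scarab"])
```

With only nine data points per project the coefficient is sensitive to the two dominant categories (if-related and method call), which is worth keeping in mind when reading very high correlations.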
![Page 24: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/24.jpg)
Most frequent individual patterns
| Pattern | ArgoUML | Columba | Eclipse | JEdit | Scarab |
| --- | --- | --- | --- | --- | --- |
| Method call with different actual parameters | 24.0% | 19.9% | 18.0% | 15.1% | 26.1% |
| Change of if condition expression | 10.9% | 7.0% | 18.7% | 13.1% | 11.0% |
• Only two patterns consistently occur at over 10% frequency
![Page 25: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/25.jpg)
Diving into if conditionals
• What is causing if conditionals to be such a prevalent bug fix type? (no clear answer yet)
| | ArgoUML | Eclipse | JEdit |
| --- | --- | --- | --- |
| Added condition clause | 13.1% | 20.8% | 23.1% |
| Removed condition clause | 11.5% | 6.9% | 11.2% |
| Added new variable | 8.3% | 14.3% | 23.7% |
| Removed existing variable | 12.0% | 9.7% | 15.1% |
| Increased number of operators | 22.4% | 22.3% | 38.0% |
| Decreased number of operators | 14.6% | 15.1% | 21.0% |
![Page 26: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/26.jpg)
Static Pattern Summary
• Can automatically detect 27 static bug fix patterns
• About 50% of all bug fix changes match at least one pattern
• If conditionals and method call parameter changes are the two most prevalent patterns
• Pattern frequencies are remarkably similar across analyzed projects
![Page 27: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/27.jpg)
Adaptive Bug Fix Patterns
![Page 28: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/28.jpg)
Project-Specific Bug Fix Patterns
• There are many bug fix patterns that are specific to an individual project, and may not match one of the static patterns
• Example from Eclipse project:
► JavaProject.java, transaction 2024 (“Fix for bug 28434”)
- if (requiredProjectRsc.exists() && requiredProjectRsc.isOpen()) {
+ if (JavaProject.hasJavaNature(requiredProjectRsc))
► DeltaProcessor.java, transaction 1945 (“Fix for bug 27499”)
- boolean isOpened=proj.isOpen();
- if (isOpened && this.hasJavaNature(proj))
+ if (JavaProject.hasJavaNature(proj))
![Page 29: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/29.jpg)
Detecting Non-Static Patterns
• Detecting non-static patterns
► Saving exact code in bug and fix hunks doesn’t work, since there is rarely an exact match.
► Need a method for abstracting changes to find patterns
• Approach
► Abstract code in each bug fix change
► Save abstracted bug and fix code in a database (the “bug fix memory”)
► Can search existing code to see if it matches a bug fix pattern
► Can suggest code to fix the bug
![Page 30: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/30.jpg)
Adaptive Patterns
• Since the contents of the bug fix memory come from a specific project, the patterns it contains adapt to that project.
• The set of known patterns changes over time, as information from new bug fixes is added.
• Can view the bug fix memory as a kind of online algorithm for learning project-specific bug fix patterns
![Page 31: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/31.jpg)
Process for Abstracting Code
• Four step process
► Raw component extraction
• Parse source code in a hunk, and burst out individual syntactic elements
► Normalization
• Substitute type names for variables, string literals, constants (abstract to types)
► Information filtering
• Remove elements that are too common to yield project-specific patterns
► Diff filtering
• Remove code components that are common in bug and fix hunks, yielding only code unique to the change
![Page 32: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/32.jpg)
Raw Component Extraction
• Step 1: Convert statements inside change hunks so they lie on a single line
► Eliminate whitespace
► Concatenate multi-line statements to one line
► Concatenate conditionals for complex statements (if, while, etc.) to one line
• Step 2: Extract raw components
► Component is a non-leaf node in the syntax tree of a single line
► Bursts out complex statements into constituent parts
• Each portion of a complex conditional is a separate component
► Additionally, separate out a method call and its parameters
![Page 33: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/33.jpg)
Component Extraction Example
• Initial code
if (foo.flag >= 5 &&
foo.ready()) {
i=1;
foo.create(“example”);
initiate(5,bar);
}
• Extracted Components
foo.flag
foo.flag >= 5
foo.ready()
foo.flag >= 5 && foo.ready()
if (foo.flag >= 5 && foo.ready())
i=1
“example”
foo.create() “example”
initiate(,) 5, bar
[Figure: syntax tree of the if condition, with nodes if, &&, >=, ., foo, flag, 5, ready()]
![Page 34: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/34.jpg)
Normalization
• To further improve the ability to match code, perform abstraction of instances to types
► Replace variable instance with its type
• Permits matching on type, rather than instance
• foo.flag >= 5 → Foo.flag >= 5 (type of foo is Foo)
► For literals, insert new component with type
• i=1 yields int=1 and int=int
► For method calls, replace each parameter with type of parameter
• Use “*” for unknown types (we only do one-pass parse)
• initiate(,) 5, bar → initiate(,) int,* (type of bar is unknown)
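The normalization rules can be approximated with a small symbol table lookup, as in the sketch below. This is a simplification, not the tool described in the talk: the helper names `normalize` and `normalize_args`, the dictionary-based type table, and the choice of `String` as the type for string literals are assumptions.

```python
import re

def normalize(component, types):
    """Replace each identifier that has a known type with that type name."""
    def swap(match):
        name = match.group(0)
        return types.get(name, name)  # identifiers without a known type stay as-is
    return re.sub(r"[A-Za-z_]\w*", swap, component)

def normalize_args(args, types):
    """Abstract method call parameters to types, using "*" when unknown."""
    result = []
    for arg in args:
        if re.fullmatch(r"-?\d+", arg):
            result.append("int")        # integer literal
        elif re.fullmatch(r'"[^"]*"', arg):
            result.append("String")     # assumption: string literals map to String
        else:
            result.append(types.get(arg, "*"))
    return result

types = {"foo": "Foo"}  # the type of bar is deliberately left unknown
normalized_condition = normalize("foo.flag >= 5", types)
normalized_params = normalize_args(["5", "bar"], types)
```

Falling back to “*” mirrors the one-pass constraint mentioned on the slide: a type that has not been seen yet cannot be resolved, so the component is kept with a wildcard rather than dropped.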
![Page 35: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/35.jpg)
Information Filtering Goal
• After normalization, resulting components are candidates for insertion into database
► Problem: many commonly occurring statement types
• int=int
► Want to eliminate these, and others that don’t contribute unique information about bug fixes
![Page 36: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/36.jpg)
Information Filtering Approach
• Assign an “information value” to component elements
► Value 2:
• method call, string literal longer than 8 chars
► Value 1:
• predicates for: if, do, while, for, as well as conditional expressions
• return, case, switch, synchronized, throw
• string literal, length 3-8 chars
• variable name, field name, class name, variable type
► Value 0:
• Everything else
• Information value for an entire component is the sum of its elemental information values
• We remove components with information value < 2
► int=1 (info value = 1), int=int (info value = 0)
► “example” (info value = 1), String (info value = 0)
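A minimal sketch of the scoring scheme follows, assuming each component has already been split into elements tagged with a kind. The kind names, the helper names, and the decision to measure string length without the quotes are assumptions. The sketch applies the value table literally and does not try to reproduce the slide's scoring of `int=int` or `String` as 0.

```python
# Element kinds worth one point, following the "Value 1" list on the slide.
ONE_POINT_KINDS = {
    "predicate", "return", "case", "switch", "synchronized", "throw",
    "variable_name", "field_name", "class_name", "variable_type",
}

def element_value(kind, text=""):
    """Information value of a single pre-tagged element."""
    if kind == "method_call":
        return 2
    if kind == "string_literal":
        length = len(text)  # assumption: length excludes the surrounding quotes
        if length > 8:
            return 2
        return 1 if length >= 3 else 0
    return 1 if kind in ONE_POINT_KINDS else 0

def component_value(elements):
    """Sum of the elemental values of a component."""
    return sum(element_value(kind, text) for kind, text in elements)

def keep(elements, threshold=2):
    """Components below the threshold are filtered out."""
    return component_value(elements) >= threshold

call_with_literal = [("method_call", "foo.create"), ("string_literal", "example")]
bare_literal = [("string_literal", "example")]
```

Here `call_with_literal` scores 3 and is kept, while `bare_literal` scores 1 and is removed, in line with the “example” case on the slide.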
![Page 37: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/37.jpg)
Diff Filtering and Storing Memories
• As a final filtering step, keep only those components that are unique to either bug or fix hunks
► Duplicate components are eliminated, since they do not represent the bug or its fix
• After diff filtering step, store all components into the database (“memory”)
► Components record their transaction, file name, bug or fix hunk, etc.
► Also store initial source code of bug and fix hunks
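Diff filtering amounts to a set difference in both directions. The sketch below assumes components are plain strings after normalization; the function name `diff_filter` and the sample components are hypothetical.

```python
def diff_filter(bug_components, fix_components):
    """Keep only the components unique to the bug hunk or to the fix hunk."""
    bug, fix = set(bug_components), set(fix_components)
    common = bug & fix  # present on both sides, so not part of the change
    return sorted(bug - common), sorted(fix - common)

# Hypothetical normalized components of one bug/fix hunk pair.
bug_only, fix_only = diff_filter(
    ["Foo.flag >= 5", "Foo.ready()"],
    ["Foo.flag >= 5", "Foo.isReady()"],
)
```

Only `Foo.ready()` and `Foo.isReady()` survive, so the stored memory records what actually changed rather than the unchanged context around it.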
![Page 38: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/38.jpg)
Searching the Memory
• The memory database contains extracted adaptive bug and fix patterns for a given project
• Can use this memory to find code that matches bug code in the memory
• Use scenario
► Developer working in their favorite development environment
► Receives feedback when code they are developing matches a stored bug pattern
► Can also suggest potential fixes from stored bug fix code
![Page 39: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/39.jpg)
Evaluation
• We evaluated the memory to determine how well it captures new bug fix changes
► Specifically, we create a memory for transactions 1 to n-1
► At transaction n, for bug fix changes we examine whether the bug hunks are found in the memory
• This is a “half hit”
► If found, we also examine whether the fix hunk is found too
• This is a “full hit”
► Examined same 5 project histories as for static patterns
• ArgoUML, Columba, Eclipse, jEdit, Scarab
• This can be viewed as a proxy for how well the approach might work for bug and fix prediction
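The evaluation loop can be sketched as a replay over the project history: each change is checked against a memory built from all earlier changes, then added to it. The slides do not define what counts as “found in the memory”, so the rule used here (at least one shared component, with the fix matched against the same stored entry as the bug) is an assumption, as are the class and function names. The special case of changes with no added hunk is not modeled.

```python
class BugFixMemory:
    """Stores abstracted bug and fix components of past bug fix changes."""

    def __init__(self):
        self.entries = []  # list of (bug component set, fix component set)

    def add(self, bug_components, fix_components):
        self.entries.append((frozenset(bug_components), frozenset(fix_components)))

    def classify(self, bug_components, fix_components):
        """Return "full hit", "half hit", or "miss" for a new change."""
        bug, fix = set(bug_components), set(fix_components)
        half = False
        for stored_bug, stored_fix in self.entries:
            if bug & stored_bug:          # assumed rule: any shared bug component
                half = True
                if fix & stored_fix:      # fix also matches the same entry
                    return "full hit"
        return "half hit" if half else "miss"

def replay(history):
    """Classify change n against the memory of changes 1..n-1, in order."""
    memory = BugFixMemory()
    outcomes = []
    for bug_components, fix_components in history:
        outcomes.append(memory.classify(bug_components, fix_components))
        memory.add(bug_components, fix_components)
    return outcomes

# Hypothetical history of four bug fix changes.
history = [
    (["A.open()"], ["A.hasNature(B)"]),
    (["A.open()"], ["A.hasNature(B)"]),
    (["A.open()"], ["C.close()"]),
    (["D.run()"], ["E.stop()"]),
]
outcomes = replay(history)
```

Running the same replay over non-fix changes would give the false positive counts, since any hit on a change that is not a bug fix is by definition a false alarm.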
![Page 40: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/40.jpg)
True and False Positives
[Figure: memories are built from transactions 1 .. n-1. A fix change at transaction n that is found in the memories is a true positive half hit; a non-fix change at transaction n that is found in the memories is a false positive half hit]
![Page 41: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/41.jpg)
True Positive Hit Rates
[Bar chart: True Positive Hit Rate by project (ArgoUML, Columba, Eclipse, jEdit, Scarab); y-axis: hit rate, 0 to 45; series: full hit, half hit]
![Page 42: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/42.jpg)
False Positive Hit Rates
[Bar chart: False Positive Hit Rate by project (ArgoUML, Columba, Eclipse, jEdit, Scarab); y-axis: hit rate, 0 to 35; series: full hit, half hit]
![Page 43: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/43.jpg)
True Positive and False Positive Full Hit Rates
[Bar chart: true positive and false positive full hit rates by project (ArgoUML, Columba, Eclipse, jEdit, Scarab); y-axis: hit rate, 0 to 18; series: TP full hit, FP full hit]
![Page 44: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/44.jpg)
Adaptive Pattern Discussion
• Adaptive bug patterns work well
► Captures 19.3%-40.3% of bugs (half-hits)
► But, also captures a lot of non-bug changes (20.8%-32.5%)
► High full hit rate for non-fix changes could be due to changes with no added hunk
• Since there is no code to match in the database, we automatically call this a full hit (might be better to ignore)
• Adaptive patterns are more project specific than static patterns
► Better suited for presenting possible bug fixes
![Page 45: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/45.jpg)
Patterns Overall
• If you were to examine all project transactions
► Not by time, grouping fix and non-fix changes together
• A fine-grain characterization of the kinds of changes made over the evolution of a software project?
[Figure: all changes divided into Fix and Non-Fix, each subdivided into Static and Adaptive]
![Page 46: Static and Adaptive Bug Fix Patterns](https://reader031.vdocuments.us/reader031/viewer/2022013003/55896f71d8b42a807a8b46f5/html5/thumbnails/46.jpg)
Conclusion
• It is now possible to reliably extract static and adaptive bug fix patterns from software project evolution data
• Static patterns are useful for characterizing bug fixes at a fine grain syntactic level
• Adaptive patterns are useful for identifying potentially buggy code, and making bug fix recommendations at fine granularity