automation in the bug flow - machine learning for triaging and tracing
DESCRIPTION
Issue management is a costly part of software development. In large projects, the continuous inflow of issue reports contributes to the information overload in a project, i.e., "a state where individuals do not have time or capacity to process all available information". In issue triaging, an initial step in issue management, a developer must be able to overview existing issue reports and easily navigate the software engineering project landscape. In this presentation, we present support for two work tasks involved in issue management: 1) issue assignment and 2) change impact analysis. We use machine learning to harness the ever-growing number of issue reports, by training recommendation systems on previous issues. Our industrial evaluations on 50,000+ issue reports in two large software development organizations indicate that automated issue assignment performs in line with current manual work. Moreover, we present how traceability from already resolved issue reports to various artifacts can be reused to jump start change impact analyses for newly submitted issues. Finally, we speculate on future ways to tame information overload into helpful software engineering recommendations.TRANSCRIPT
![Page 1: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/1.jpg)
Automation in the Bug Flow- MACHINE LEARNING FOR TRIAGING AND TRACING
MARKUS BORG, LUND UNIVERSITY
![Page 2: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/2.jpg)
Bug tracker
The number of incoming bug reports can be overwhelming…
![Page 3: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/3.jpg)
Bug trackerMachine Learning
Use machine learning
to provide actionable
recommendations
based on previous
patterns!
![Page 4: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/4.jpg)
- Final year PhD student- MSc CS and engineering- ABB software developer (3 years)
process automationcompilers and editors
Per Runeson
![Page 5: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/5.jpg)
The ChallengeThe SolutionThe Evaluation
![Page 6: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/6.jpg)
Reqts. DB
Issue RepoCode Repo
Test DB
Doc. DB.
Developers in large projects must navigate complex information landscapes that continously change
![Page 7: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/7.jpg)
One bug is not much of a problem…
![Page 8: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/8.jpg)
Bug tracker
But a large simultaneous inflow of bug reports can make the best bug tracking system sweat!
![Page 9: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/9.jpg)
Making the wrong prioritizations might result in bugs on your customers
![Page 10: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/10.jpg)
In a safety-critical context
1. Issue Assignment
2. Change Impact Analysis
This talk addresses two tasks involved in issue management:
![Page 11: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/11.jpg)
By safety-critical we refer to document-driven development with a rigid process…
… prior to changing source code, a formal change impact analysis has to be conducted and reported according to an approved template.
![Page 12: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/12.jpg)
We want to increase the confidence at commit time even further!
![Page 13: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/13.jpg)
The ChallengeThe SolutionThe Evaluation
![Page 14: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/14.jpg)
We aim to harness the intrinsic navigational value of bug reports
![Page 15: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/15.jpg)
Bug trackerMachine Learning
We leverage on the number of bug reports in the projects…
![Page 16: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/16.jpg)
Bug tracker
Machine Learning
The more bugs, the better the machine learning gets!
![Page 17: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/17.jpg)
![Page 18: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/18.jpg)
Leif Jonsson
Automated Issue Assignment
• Goal:
Useful tool deployable with minimum configuration effort
• Approach:
Train classifiers on historical bug reports
Combine them using state-of-the-art ensemble learning
Joint work with
Ericsson Research
![Page 19: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/19.jpg)
How to Represent a Bug Report?
• Component
• Severity
• System Version
• Submit Date
• Submitter Location
etc.
• … And the text! Title and description.
![Page 20: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/20.jpg)
![Page 21: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/21.jpg)
Automated Change Impact Analysis
• Goal:
Intuitive tool to jump start analyses based on historical data
Faster + more accurate analyses compared to fully manual work
• Approach
Step 1: Mine the history
Step 2: Recommend impact for new bug fix
![Page 22: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/22.jpg)
Present recommendations Amazon-style:”Other developers working on this class also modified/tested…”
![Page 23: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/23.jpg)
Construct network of previously reported impact
Index textual data with
![Page 24: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/24.jpg)
Calculate centrality measures
![Page 25: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/25.jpg)
Automated Impact Analysis
• Approach part 2: Recommend impact
Find similar bugs using Apache Lucene
Follow links to identify candidate impact set
Design Doc. X.Y
Req. X.Y
Test case UTC56
Req. Z.Y
Design Doc. X.Y
![Page 26: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/26.jpg)
Automated Impact Analysis
• Approach part 2: Recommend impact
Follow links to identify candidate impact set
Use centrality measures to rank candidate impact
Find similar bugs using Apache Lucene
1. Requirement X.Y
2. Design Document X.Y
3. Test case UTC56
4. Design Document X.Y
5. Requirement Z.Y
![Page 27: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/27.jpg)
Screen shot of prototype tool ImpRec
![Page 28: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/28.jpg)
The ChallengeThe SolutionThe Evaluation
![Page 29: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/29.jpg)
![Page 30: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/30.jpg)
Experiment: Issue Assignment
• Five large datasets from two companies
– Telecom and Automation
– 50.000+ issue reports
• 10-fold cross-validation and simulation (”replaying history”)
![Page 31: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/31.jpg)
Experiment: Issue Assignment
• Prediction accuracy in line with human activity
• Numbers depict number of teams in the projects
67
1764
2836
![Page 32: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/32.jpg)
Experiment: Issue Assignment
• Warning! Some systems need fresh training data
![Page 33: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/33.jpg)
Experiment: Issue Assignment
But the decay is not always exponential…
![Page 34: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/34.jpg)
![Page 35: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/35.jpg)
Experiment: Change Impact Analysis
• Experiment with historical impact
– Training set: 8 years, Test set: 2 years
ImpRec presents 30% of past impact among the top-5 recommendations(40%@10, 50%@20)
But what does that mean? User study needed.
![Page 36: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/36.jpg)
Case Study: Change Impact Analysis
• Industrial case study
– Two units of analysis: Team Sweden & Team India
– Tool deployed in March 2014 & August 2014
– Interviews and user log files
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 200
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
Click Distribution, top-20 hits
IA Google
Initial result:Developers’ interaction with impact recommendations similar to Google searches
![Page 37: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/37.jpg)
Case Study: Change Impact Analysis
• ”Finding these past bugs was exactly what I was looking for actually”
- Developer, India
• ”Directing attention to potential side-effects is very important”
- Project manager, Sweden
Some encouraging
comments from
developers…
![Page 38: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/38.jpg)
Conclusions
![Page 39: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/39.jpg)
Automated Issue Assignment
• Automated assignment as accurate as current manual work
– But instantaneous!
• At least 2.000 bug reports needed for training
• Continously monitor the accuracy
Favourable Unfavourable
Static team structure
Dynamic team structure
Maintenance project
New development
![Page 40: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/40.jpg)
Automated Change Impact Analysis
• Recommendation system provides a useful starting point
• Recommending related issues is a popular feature
– Study previous issue resolutions
– Compare with previous impact analyses
• Recommendation recall for impact: 30-55%
– Reuse previous impact to jump-start analysis
– Provide warning if probable impact is missing
![Page 41: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/41.jpg)
Embrace your bugs!
Machine learning canguide maintenance work
Much potential:- Severity prediction- Resolution times- ”Noise” filtering
![Page 42: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/42.jpg)
PHOTO CREDITS
Brown stink bug- Marlin E. RiceIsopods- Omoshiro Aquarium- Flickr: littoraria, codaCubicles- Flickr: templetonelliot, ifl, danburgmurmurEightball girl- Flickr: mobilestreetlifeEvaluate- Flickr: theideadeskMy wife- My wife
Thank you!
[email protected]/markus_borg @_Troddel_
![Page 43: Automation in the Bug Flow - Machine Learning for Triaging and Tracing](https://reader035.vdocuments.us/reader035/viewer/2022062707/5583a00bd8b42a08148b50ed/html5/thumbnails/43.jpg)