Knowledge Collaboration by Mining Software Repositories
Presented at KCSD 2006.
Tom Zimmermann, Saarland University, Saarbrücken, Germany
Guiding developers
Zimmermann, Weissgerber, Diehl, Zeller (TSE 2005)
eROSE suggests further locations.
eROSE prevents incomplete changes.
eROSE is customizable.
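The idea of suggesting further locations can be illustrated with a small co-change miner. This is a minimal sketch, not eROSE's implementation: it assumes each commit is available as a set of changed entities, and it derives rules of the form "whoever changed A also changed B" using support and confidence thresholds. The entity names, the `history` data, and the threshold defaults are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_cochange(transactions):
    """Count how often each entity changed, and how often each ordered pair changed together."""
    single = Counter()
    pair = Counter()
    for changed in transactions:
        entities = set(changed)
        single.update(entities)
        for a, b in combinations(sorted(entities), 2):
            pair[(a, b)] += 1
            pair[(b, a)] += 1
    return single, pair


def suggest(changed_entity, single, pair, min_support=2, min_confidence=0.5):
    """Return (entity, support, confidence) for rules 'changed_entity => entity', strongest first."""
    suggestions = []
    for (a, b), support in pair.items():
        if a != changed_entity or support < min_support:
            continue
        confidence = support / single[a]
        if confidence >= min_confidence:
            suggestions.append((b, support, confidence))
    return sorted(suggestions, key=lambda s: (-s[2], -s[1], s[0]))


# Hypothetical commit history: each set lists the entities changed in one commit.
history = [
    {"Parser.parse", "Lexer.next", "docs/grammar.txt"},
    {"Parser.parse", "Lexer.next"},
    {"Parser.parse", "Lexer.next", "tests/test_parser.py"},
    {"Parser.parse", "tests/test_parser.py"},
    {"README"},
]

single, pair = mine_cochange(history)
suggestions = suggest("Parser.parse", single, pair)
for entity, support, confidence in suggestions:
    print(f"{entity}: support={support}, confidence={confidence:.2f}")
```

A developer who has just edited `Parser.parse` would be pointed at the entities returned by `suggest`. If a suggested entity is missing from the pending commit, that may indicate an incomplete change. The thresholds are the customizable part of this sketch.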
Direct collaboration
“Indirect” collaboration
Version archive
Hidden knowledge
Mining
Future
#1: Change classification
• Bad changes (e.g., from bug database), marked X
• Build a classifier
• New change
• Predict quality
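The change-classification pipeline (learn from changes known to be bad, then predict the quality of a new change) can be sketched as follows. The slides do not name a learning algorithm, so this sketch substitutes a small multinomial naive Bayes classifier with add-one smoothing. The feature strings such as `dir:net` and `size:large`, the labels, and the training data are all hypothetical.

```python
import math
from collections import Counter


class ChangeClassifier:
    """Multinomial naive Bayes over change features, with add-one smoothing."""

    def __init__(self):
        self.label_counts = Counter()
        self.feature_counts = {}  # label -> Counter of feature occurrences
        self.vocabulary = set()

    def train(self, changes):
        """changes: iterable of (features, label) pairs taken from past history."""
        for features, label in changes:
            self.label_counts[label] += 1
            self.feature_counts.setdefault(label, Counter()).update(features)
            self.vocabulary.update(features)

    def log_scores(self, features):
        """Return the unnormalized log-probability of each label for one change."""
        total = sum(self.label_counts.values())
        scores = {}
        for label, count in self.label_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocabulary)
            score = math.log(count / total)
            for feature in features:
                score += math.log((counts[feature] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, features):
        scores = self.log_scores(features)
        return max(scores, key=scores.get)


# Hypothetical history: "bad" labels would come from, e.g., a bug database.
past_changes = [
    (["dir:net", "size:large", "touches:locking"], "bad"),
    (["dir:net", "size:large"], "bad"),
    (["dir:docs", "size:small"], "good"),
    (["dir:ui", "size:small"], "good"),
    (["dir:net", "size:small"], "good"),
]

classifier = ChangeClassifier()
classifier.train(past_changes)

new_change = ["dir:net", "size:large"]
print(classifier.predict(new_change))
```

In practice, the choice of features and of the granularity at which a change is labeled bad would matter far more than the choice of classifier.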
#2: What should we collect?
• So far, mining software repositories has relied on existing repositories.
• Collecting new data (e.g., navigation traces) opens new opportunities.
• Software navigation: Singer et al. (ICSM 2005), DeLine et al. (VL/HCC 2005)
• Social tagging: Storey et al. (TagSEA tool)
#3: Mining across projects
• Extend source code search engines with mining techniques.
• Large-scale mining (129,167 SourceForge projects) and large-scale collaboration (1,393,250 SourceForge users).
• Usage patterns from Koders.com: Xie and Pei (MSR 2006)
Conclusion
• History supports knowledge collaboration.
• Future challenges: granularity and data.
• Mining software repositories @ ASE 2006:
  − Wednesday 4pm: Impact analysis
  − Friday 9am: Management
  − Friday 11am: Mining software repositories