Are Automated Debugging Techniques Actually Helping Programmers?
Chris Parnin, Georgia Tech, @chrisparnin (Twitter)
Alessandro (Alex) Orso, Georgia Tech, @alexorso (Twitter)
Finding bugs can be hard…
Automated debugging to the rescue!
I’ll help you find the location of the bug!
How it works (Ranking-Based)
Give me a failing program and its input.
Calculating…
I have calculated the most likely location of the bug!
Here is your ranked list of statements.
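The ranking step above is typically computed from test coverage, as in statistical fault localization. Below is a minimal sketch of a Tarantula-style suspiciousness ranking; the coverage format and all function names are illustrative assumptions, not the specific tool used in the talk.

```python
# A minimal sketch of ranking-based fault localization using a
# Tarantula-style suspiciousness score. The coverage format and all
# names here are illustrative assumptions, not the tool from the talk.

def suspiciousness(failed_cov, passed_cov, total_failed, total_passed):
    """Statements executed mostly by failing tests score closer to 1.0."""
    fail_ratio = failed_cov / total_failed if total_failed else 0.0
    pass_ratio = passed_cov / total_passed if total_passed else 0.0
    denom = fail_ratio + pass_ratio
    return fail_ratio / denom if denom else 0.0

def rank_statements(coverage, outcomes):
    """coverage: {test_name: set_of_statements};
    outcomes: {test_name: 'pass' or 'fail'}.
    Returns statements sorted from most to least suspicious."""
    total_failed = sum(1 for o in outcomes.values() if o == 'fail')
    total_passed = len(outcomes) - total_failed
    statements = set().union(*coverage.values())

    def score(s):
        f = sum(1 for t, cov in coverage.items()
                if s in cov and outcomes[t] == 'fail')
        p = sum(1 for t, cov in coverage.items()
                if s in cov and outcomes[t] == 'pass')
        return suspiciousness(f, p, total_failed, total_passed)

    return sorted(statements, key=score, reverse=True)
```

For example, a statement covered only by the failing test ranks above statements covered by both passing and failing tests.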
But how does a programmer use a ranked list of statements?
…
Conceptual Model
Here is a list of places to check out: 1) 2) 3) 4)
OK, I will check out your suggestions one by one.
…
Found the bug!
Does the conceptual model make sense?
Have we evaluated it?
A Skeptic
Let’s see… over 50 years of research on automated debugging.
1962. Symbolic Debugging (UNIVAC FLIT)
1981. Weiser. Program Slicing
1999. Delta Debugging
2001. Statistical Debugging
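Delta debugging, one of the milestones listed above, is an algorithm rather than a ranking: it automatically shrinks a failing input to a small input that still fails. Here is a minimal sketch of a simplified, complement-only variant of Zeller's ddmin, where `check` is an assumed caller-supplied predicate returning True when the input still triggers the bug.

```python
# A minimal, simplified sketch of delta debugging (ddmin): repeatedly try
# removing chunks of the input; keep any smaller input that still fails.
# This variant tests only chunk complements, a common simplification.

def ddmin(data, check, n=2):
    """Shrink list `data` to a smaller sublist still satisfying `check`."""
    while len(data) >= 2:
        size = len(data) // n
        chunks = [data[i:i + size] for i in range(0, len(data), size)]
        reduced = False
        for i in range(len(chunks)):
            # Input with chunk i removed.
            complement = [x for j, c in enumerate(chunks) if j != i for x in c]
            if check(complement):  # bug persists without this chunk
                data, n = complement, max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if n >= len(data):     # finest granularity reached; stop
                break
            n = min(len(data), n * 2)  # refine granularity and retry
    return data
```

For instance, if a bug is triggered only when two particular elements are both present, ddmin discards everything else and returns just those two.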
Did you see anything?
Only 5 papers have evaluated automated debugging techniques with actual programmers.
• Most find no benefit
• Most done on programs < 100 LOC
More generally, these techniques rely on two strong assumptions.
Do you see a bug?
Assumption #1: Perfect bug understanding. Programmers recognize the fault as soon as they inspect the faulty statement, even when using an automated tool.
Assumption #2: Programmers inspect statements linearly and exhaustively until finding the bug.
Is this realistic?
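Under these two assumptions, evaluations commonly score debugging "cost" as the rank of the faulty statement, i.e., how much code a perfectly linear, perfectly perceptive programmer would inspect. A minimal sketch of that scoring, with illustrative names:

```python
# Illustrative effort score implied by the two assumptions above: if
# programmers really inspected the ranked list top-down and recognized
# the bug on sight, effort would equal the fault's 1-based rank.

def inspection_cost(ranked_statements, faulty_statement):
    """Number of statements inspected before (and including) the fault."""
    return ranked_statements.index(faulty_statement) + 1

def percent_inspected(ranked_statements, faulty_statement):
    """Effort as a percentage of the ranked list."""
    cost = inspection_cost(ranked_statements, faulty_statement)
    return 100.0 * cost / len(ranked_statements)
```

Whether this score reflects real debugging effort is exactly what the study below questions.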
Conceptual model: What if we gave a developer a list of statements to inspect?
How would they use the list?
Would they be able to see the bug after visiting it?
Is ranking important?
Benefit: What if we evaluated programmers with and without automated debuggers?
We could also observe what works and what doesn’t.
Study Setup
• Participants: 34 developers, MS/PhD students, different levels of expertise (low, medium, high)
• Software subjects: Tetris (2.5 KLOC), NanoXML (4.5 KLOC)
• Tools: traditional debugger; Eclipse ranking plugin (logged activity)
• Tasks: 2 debugging tasks (one fault each), 30 minutes per task, questionnaire at the end
Bugs
Bug #1: Pressing rotate key causes square figure to move up!
Bug #2: Exception on an input XML document.
When running the NanoXML program (main is in class Parser1_vw_v1), the following exception is thrown:

Exception in thread "main" net.n3.nanoxml.XMLParseException: XML Not Well-Formed at Line 19: Closing tag does not match opening tag: `ns:Bar' != `:Bar'
    at net.n3.nanoxml.XMLUtil.errorWrongClosingTag(XMLUtil.java:497)
    at net.n3.nanoxml.StdXMLParser.processElement(StdXMLParser.java:438)
    at net.n3.nanoxml.StdXMLParser.scanSomeTag(StdXMLParser.java:202)
    at net.n3.nanoxml.StdXMLParser.processElement(StdXMLParser.java:453)
    at net.n3.nanoxml.StdXMLParser.scanSomeTag(StdXMLParser.java:202)
    at net.n3.nanoxml.StdXMLParser.scanData(StdXMLParser.java:159)
    at net.n3.nanoxml.StdXMLParser.parse(StdXMLParser.java:133)
    at net.n3.nanoxml.Parser1_vw_v1.main(Parser1_vw_v1.java:50)

The input, testvm_22.xml, contains the following XML document:

<Foo a="test">
  <ns:Bar>
    <Blah x="1" ns:x="2"/>
  </ns:Bar>
</Foo>
Study Setup: Groups
Participants were split into four groups (A, B, C, D), varying which group received the Rank tool for which task.
Results
How do developers use a ranked list?
• 37% of visits jumped around the list (average jump: 10 positions).
• Navigation patterns zig-zagged (avg. 10 zigzags).
• Low performers did follow the list.
• Survey responses say participants searched through the statements.
Is perfect bug understanding realistic?
Only 1 out of 10 programmers who clicked on the bug stopped investigating there. The others spent an average of ten minutes continuing the investigation.
Are automated tools speeding up debugging?
Automated group = traditional group? No ✘
But… stratifying participants:
• Low performers: ✘ ✘
• Medium performers: ✘ ✔
• High performers: ✔ ✔
Significant difference for “experts”: high performers were, on average, 5 minutes faster.
Are automated tools speeding up debugging?
Automated group = traditional group? No ✘
Experts with the automated tool > experts with the traditional debugger? Yes! ✔
Observations
Developers searched through statements.
Developers without the tool fixed symptoms (not the problem).
Developers wanted explanations rather than recommendations.
Future directions
Moving beyond fault space reduction
We can keep building better tools. But we can’t keep abstracting away the human.
Performing further studies
• Does a different granularity work better for inspection? Documents? Methods?
• How do different interfaces or visualizations impact a technique?
• Do other automated debugging techniques fare any better?
How do developers use a ranked list?
Is perfect bug understanding realistic?
Are Automated Debugging Tools Helpful?
Human studies, human studies, human studies!
35 years of Scientific Progress
1969: 800,000 miles → 2004: 64,000,000 miles

30 years of Scientific Progress
1981: 352 LOC (median of 8 programs) → 2011: 63.5 LOC (median of 4 programs)