Simplifying Failure-Inducing Input
Post on 25-Feb-2016
Vikas, Purdue
Problem
When testing a program, some test case fails. How do we find the bug?
The failing test input may be huge, say 800 lines.
We could do a binary search manually, but how much time would that take?
Problem - example
Mozilla web browser, 1999: Bugzilla listed more than 370 open bug reports.
The project faced imminent doom!
Mozilla opened a program called the "BugAThon": volunteers who simplified bug reports would earn a prize.
Finding a bug is a SERIOUS PROBLEM.
Problem – example
One of the bugs: Mozilla crashed after 95 user actions.
It was not able to print a particular HTML page of 896 lines.
The 95 user actions were simplified to 3 user actions.
The 896 lines of HTML input were simplified to just 1 line!
How to solve?
The "Delta Debugging" algorithm:
Take a failing test case and a passing test case.
Simplify the failing test case and produce a minimal test case.
The simplified test case still produces the failure.
Example - Continued
<td align=left valign=top><SELECT NAME="op sys" MULTIPLE SIZE=7><OPTION VALUE="All">All<OPTION VALUE="Windows 3.1">Windows 3.1<OPTION VALUE="Windows 95">Windows 95<OPTION VALUE="Windows 98">Windows 98<OPTION VALUE="Windows ME">Windows ME<OPTION VALUE="Windows 2000">Windows 2000<OPTION VALUE="Windows NT">Windows NT<OPTION VALUE="Mac System 7">Mac System 7<OPTION VALUE="Mac System 7.5">Mac System 7.5<OPTION VALUE="Mac System 7.6.1">Mac System 7.6.1<OPTION VALUE="Mac System 8.0">Mac System 8.0<OPTION VALUE="Mac System 8.5">Mac System 8.5<OPTION VALUE="Mac System 8.6">Mac System 8.6<OPTION VALUE="Mac System 9.x">Mac System 9.x<OPTION VALUE="MacOS X">MacOSX<OPTION VALUE="Linux">Linux<OPTION VALUE="BSDI">BSDI<OPTION VALUE="FreeBSD">FreeBSD<OPTION VALUE="NetBSD">NetBSD<OPTION VALUE="OpenBSD">OpenBSD<OPTION VALUE="AIX">AIX<OPTION VALUE="BeOS">BeOS<OPTION VALUE="HP-UX">HP-UX<OPTION VALUE="IRIX">IRIX<OPTION VALUE="Neutrino">Neutrino<OPTION VALUE="OpenVMS">OpenVMS<OPTION VALUE="OS/2">OS/2<OPTION VALUE="OSF/1">OSF/1<OPTION VALUE="Solaris">Solaris<OPTION VALUE="SunOS">SunOS<OPTION VALUE="other">other</SELECT></td>
<td align=left valign=top><SELECT NAME="priority" MULTIPLE SIZE=7><OPTION VALUE="--">--<OPTION VALUE="P1">P1<OPTION VALUE="P2">P2<OPTION VALUE="P3">P3<OPTION VALUE="P4">P4<OPTION VALUE="P5">P5</SELECT></td>
<td align=left valign=top><SELECT NAME="bug severity" MULTIPLE SIZE=7><OPTION VALUE="blocker">blocker<OPTION VALUE="critical">critical<OPTION VALUE="major">major<OPTION VALUE="normal">normal<OPTION VALUE="minor">minor<OPTION VALUE="trivial">trivial<OPTION VALUE="enhancement">enhancement</SELECT></tr></table>
Fig. 1. Printing this HTML page makes Mozilla crash (excerpt)
Conflicting issues
Decomposing a specific bug report into a simple test case:
A bug report should be as specific as possible.
On the other hand, a test case should be as simple as possible.
Test case simplification does both!
It allows for short problem descriptions.
It subsumes all details in the bug report.
Delta Debugging – how
Define what a successful test case is.
Feed ddmin with a failing test case.
ddmin simplifies it by successive testing.
Stop when a minimal test case is reached.
At that point, removing any single input entity would cause the failure to disappear.
Analogous example – flight simulation
Problem: the flight crashes a few seconds after take-off. How do we find the bug?
Repeat the situation over and over again under changed circumstances.
Find out what is relevant and what is not relevant.
E.g., leave out the passenger seats: it still crashes.
E.g., leave out the coffee vending machine: it still crashes!
Hence both of them are irrelevant.
Example continued - DDMin
1 <SELECT NAME="priority" MULTIPLE SIZE=7> ✘
2 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
3 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
4 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
5 <SELECT NAME="priority" MULTIPLE SIZE=7> ✘
6 <SELECT NAME="priority" MULTIPLE SIZE=7> ✘
7 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
8 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
9 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
10 <SELECT NAME="priority" MULTIPLE SIZE=7> ✘
11 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
12 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
13 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
14 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
15 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
16 <SELECT NAME="priority" MULTIPLE SIZE=7> ✘
17 <SELECT NAME="priority" MULTIPLE SIZE=7> ✘
18 <SELECT NAME="priority" MULTIPLE SIZE=7> ✘
19 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
20 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
21 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
22 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
23 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
24 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
25 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
26 <SELECT NAME="priority" MULTIPLE SIZE=7> ✘
Fig. 2. Simplifying failure-inducing HTML input
DDMin
It not only minimises the failing input; it also maximises the passing input.
It is not limited to HTML input, character input, or program input.
It can be applied to all circumstances that can make a program crash or that affect the program's execution.
input
2 <SELECT NAME="priority" MULTIPLE SIZE=7> ✘
4 <SELECT NAME="priority" MULTIPLE SIZE=7> ✘
7 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
6 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
5 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
3 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
1 <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
Fig. 3. Isolating a failure-inducing difference
The <SELECT> tag is the one that causes the failure!
Assumptions – reasons for failure
The program code
Data from storage or input devices
The program's environment
The specific hardware
The operating system
All of the above are called "circumstances".
Changes that cause failure
We are interested only in the changeable circumstances.
In most cases, these changeable circumstances make up the program input.
Definitions
The set of possible circumstances is denoted by “
What is a change (delta)? How is the input decomposed?
There is no single prescribed way to obtain the changes.
In the HTML example, a delta can be a single character, a single tag, or a single line.
HOW TO DECOMPOSE THEM?
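The three granularities mentioned for the HTML example can be illustrated with a short sketch. The function names (`by_char`, `by_line`, `by_tag`) and the sample page are assumptions for illustration only; the tag splitter is a naive regular expression, not an HTML parser.

```python
import re
from typing import List


def by_char(text: str) -> List[str]:
    """One delta per character."""
    return list(text)


def by_line(text: str) -> List[str]:
    """One delta per line, keeping the line endings."""
    return text.splitlines(keepends=True)


def by_tag(text: str) -> List[str]:
    """One delta per tag or per run of text between tags (naive split)."""
    return re.findall(r"<[^>]*>|[^<]+|<", text)


# Hypothetical two-line page built from the tags shown in the example.
page = '<SELECT NAME="priority" MULTIPLE SIZE=7>\n<OPTION VALUE="P1">P1\n'

line_deltas = by_line(page)
tag_deltas = by_tag(page)
char_deltas = by_char(page)
```

Whatever the granularity, joining all deltas gives back the original input, so a minimizing algorithm can work on the list of deltas and rebuild a candidate input by concatenating a subset. Coarser deltas mean fewer tests; finer deltas allow a smaller final result.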
Definitions – composition of changes
Definitions - Test cases and tests
According to the POSIX standard for testing, there are three outcomes:
The test can succeed.
The test can fail.
The test can produce an indeterminate (unresolved) result.
We need a function 'rtest' that takes a program run and returns one of the above outcomes.
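One way such a function could look is sketched below. The slides only name 'rtest'; everything else here is an assumption: the `Outcome` enum, the `classify` helper, and the mapping from exit status to outcome (exit code 0 passes, death by a signal fails, anything else is unresolved) are illustrative choices that a real setup would adapt to the failure being chased.

```python
import subprocess
from enum import Enum
from typing import List


class Outcome(Enum):
    PASS = "pass"
    FAIL = "fail"
    UNRESOLVED = "unresolved"


def classify(returncode: int, timed_out: bool = False) -> Outcome:
    """Map a finished run to an outcome (assumed convention, see lead-in)."""
    if timed_out:
        return Outcome.UNRESOLVED
    if returncode == 0:
        return Outcome.PASS
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as returncode -N,
        # e.g. -11 for a segmentation fault.
        return Outcome.FAIL
    return Outcome.UNRESOLVED


def rtest(command: List[str], data: bytes, timeout: float = 10.0) -> Outcome:
    """Run `command` with `data` on standard input and classify the result."""
    try:
        proc = subprocess.run(command, input=data,
                              capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return classify(0, timed_out=True)
    return classify(proc.returncode)
```

Treating every non-crash error as unresolved keeps the minimizer from chasing a different failure than the one originally reported.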
Definitions – test case and test
Test cases
c✔ is the passing test case
c✘ is the failing test case
Minimizing test case
A global minimum is a test case that is minimal with respect to all possible test cases.
Finding it requires testing all $2^{|c_✘|}$ subsets of the failing test case $c_✘$.
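To see why the global minimum is impractical, the exhaustive cost can be compared with the cheaper check that removing any single entity makes the failure disappear (1-minimality). The sketch below is illustrative only; `fails`, `is_one_minimal`, and the toy failure condition are assumed names and data.

```python
from typing import Callable, List


def is_one_minimal(case: List, fails: Callable[[List], bool]) -> bool:
    """True if `case` fails and removing any single entity stops the failure."""
    if not fails(case):
        return False
    return all(not fails(case[:i] + case[i + 1:]) for i in range(len(case)))


def fails(case: List[int]) -> bool:
    """Toy failure condition: the failure needs entities 1, 7 and 8 together."""
    return {1, 7, 8} <= set(case)


n = 8
exhaustive_tests = 2 ** n      # every subset of an 8-entity input
one_minimal_tests = n + 1      # the case itself plus n single removals, at most
```

For an input of 8 entities the exhaustive search already needs 256 tests; for the 755-character program discussed later it would need 2^755.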
Minimizing algorithm
The Algorithm
Complexity of DDMin
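The worst-case complexity of ddmin is $|c_✘|^2 + 3|c_✘|$ tests, where $|c_✘|$ is the number of entities in the failing test case.
The worst case has 2 phases:
1. When every test has an unresolved outcome, we go up to the maximum granularity of $|c_✘|$. The number of tests to be carried out is $2 + 4 + 8 + \dots + 2|c_✘| \approx 4|c_✘|$.
2. When testing the last complement ($\Delta_n$) results in a failure, ddmin is invoked again on the reduced input. This results in $|c_✘| - 1$ calls. Total: $2(|c_✘| - 1) + 2(|c_✘| - 2) + \dots + 2 = |c_✘|^2 - |c_✘|$.
Adding up everything: $4|c_✘| + |c_✘|^2 - |c_✘| = |c_✘|^2 + 3|c_✘|$.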
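The two sums can be checked numerically. The sketch below simply evaluates both series for an input size `n` (assumed to be a power of two so that the doubling series ends exactly at 2n); the function name is made up for illustration.

```python
from typing import Tuple


def worst_case_tests(n: int) -> Tuple[int, int]:
    """Return (phase 1 tests, phase 2 tests) for an input of n entities."""
    # Phase 1: granularity doubles each round: 2 + 4 + 8 + ... + 2n.
    phase1 = 0
    granularity = 2
    while granularity <= 2 * n:
        phase1 += granularity
        granularity *= 2
    # Phase 2: 2(n-1) + 2(n-2) + ... + 2.
    phase2 = sum(2 * (n - k) for k in range(1, n))
    return phase1, phase2


phase1, phase2 = worst_case_tests(8)
total = phase1 + phase2
bound = 8 * 8 + 3 * 8
```

For n = 8 the first series gives 30 (just under 4n = 32) and the second gives 56 (exactly n² - n), so the total of 86 stays below the bound n² + 3n = 88.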
Case Study 1 – GCC gets a fatal signal! (run in WYNOT)
A program, bug.c, makes GCC crash when it is compiled.
The crash occurs only when certain optimization options are given.
It does not occur with all optimizations enabled!
The code is 755 characters, and each character is a component (hence, it may contain a lot of irrelevant C code).
Case study 1
size of Z = 1 Z[1] will segfault
Minimizing the test case
Minimizing Gcc options
gcc -o -fforce-addr bug.c
Case study – Mozilla crashing
One of the bug reports for Mozilla:
The following operations cause Mozilla to crash:
Start Mozilla.
Go to bugzilla.mozilla.org.
Print to file, setting the margins to .50.
Once it is done printing, do the exact same thing again on the same file (/var/tmp/netscape.ps).
This causes the browser to crash with a segfault.
Mozilla crashes
Mozilla's input consists of two items:
1. The sequence of input events, i.e., the succession of mouse motions, pressed keys, and clicked buttons. XLAB was used to capture them: 711 actions.
2. The HTML code of the erroneous WWW page.
Mozilla crashes
Out of the 711 actions, only 95 were user actions; the rest were notifications by the X server.
Out of the 95 user actions, only 3 are left after 82 test runs:
Invoke the Print dialog.
Press mouse button 1 on the Print button.
Release mouse button 1.
Mozilla crashes
[Figure: 95 user actions reduced to 3 user actions in 82 runs]
Mozilla crashes – excerpt of input
Mozilla crashes – sample run
Mozilla crashes – run
[Figure: 896 lines of HTML reduced to 1 line in 58 runs]
Mozilla crashes! – 1 line
<SELECT NAME="priority" MULTIPLE SIZE=7> is the culprit, or in fact just <SELECT>.
In other words, the bug report is now just:
Create an HTML page containing "<SELECT>".
Load the page and print it using the Alt+P command.
The browser crashes with a segmentation fault.
Or, even shorter: printing <SELECT> crashes!
Minimizing fuzz
Bart Miller and his team examined the robustness of UNIX utilities by sending them fuzz input (a huge number of random characters).
They found that 40% of the basic programs crashed or went into infinite loops.
The ddmin algorithm was tested on fuzz input sequences for NROFF, TROFF, FLEX, CRTPLOT, UL, and UNITS.
Simplifying failure inducing input
The 3 case studies discussed show that the larger the simplified input, the higher the number of tests required.
This is because determining 1-minimality of a test case with n entities requires at least n tests: each individual entity is removed and the result is tested.
For FLEX, the number of tests varies from up to 104 (low precision) to 36,000 (high precision).
Other approaches?
Simply stop the process when a certain time limit is reached.
Simply stop the process when the input test case has been reduced by a certain extent.
A better approach is 'isolation':
Find one relevant part of the test case; removing this particular part makes the failure go away.
Simplifying, in contrast, meant that the simplified test case contained all the relevant parts.
Isolating example
Simplifying: 26 tests
Isolation: 7 tests!!
Future work
Domain-specific methods: knowledge about the input structure can greatly enhance performance.
For example, valid program inputs are described by grammars, so it would be nice to rely on such grammars.
They can be used to exclude syntactically invalid inputs.
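One cheap way to use structural knowledge is to reject invalid candidates before the program under test is ever run. The sketch below is an assumption-laden illustration: `with_syntax_filter` is a hypothetical wrapper, and a balanced-parentheses check stands in for a real grammar.

```python
from typing import Callable, List

PASS, FAIL, UNRESOLVED = "PASS", "FAIL", "UNRESOLVED"  # hypothetical labels


def balanced(chars: List[str]) -> bool:
    """Stand-in for a grammar check: parentheses must be balanced."""
    depth = 0
    for ch in chars:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def with_syntax_filter(test: Callable[[List[str]], str],
                       is_valid: Callable[[List[str]], bool]
                       ) -> Callable[[List[str]], str]:
    """Wrap `test` so invalid candidates are unresolved without being run."""
    def filtered(candidate: List[str]) -> str:
        if not is_valid(candidate):
            return UNRESOLVED
        return test(candidate)
    return filtered


runs = []  # records which candidates actually reached the expensive test


def expensive_test(candidate: List[str]) -> str:
    """Toy stand-in for a program run: fails if the input contains 'x'."""
    runs.append("".join(candidate))
    return FAIL if "x" in candidate else PASS


checked = with_syntax_filter(expensive_test, balanced)
outcomes = [checked(list("(x)")), checked(list("(x")), checked(list("()"))]
```

Here only the two well-formed candidates reach the expensive test; the unbalanced one is classified as unresolved immediately.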