Model-based Software Testing: Test Assessment and Enhancement
Aditya P. Mathur, Purdue University, Fall 2005
Last update: August 18, 2005
Software Testing and Reliability Aditya P. Mathur 2002 2
Learning Objectives
To understand the relevance and importance of test assessment.
To learn the fundamental principle underlying test assessment.
To learn various methods and tools for test assessment.
To understand the relative strengths/weaknesses of test assessment methods.
To learn how to improve tests based on a test assessment procedure.
What is Test Assessment?
Once a test set T, a collection of test inputs, has been developed, we ask:
How good is T?
It is the measurement of the goodness of T which is known as test assessment.
Test assessment is carried out based on one or more criteria.
Test Assessment (contd.)
These criteria are known as test adequacy criteria.
Test assessment is also known as test adequacy assessment.
Test assessment (contd.)
Test assessment provides the following information:
A metric, also known as the adequacy score or coverage, usually between 0 and 1.
A list of all the weaknesses found in T, which when removed, will raise the score to 1.
The weaknesses depend on the criteria used for assessment.
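As a minimal sketch of these two outputs, the function below computes an adequacy score and collects the weaknesses, assuming the coverage domain has already been reduced to an array of flags. The name adequacy_score and the array representation are illustrative assumptions, not part of the lecture material.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * covered[i] is true when coverage element i was exercised by T.
 * Writes the indices of the uncovered elements (the weaknesses) to
 * weaknesses[] and their count to *num_weaknesses.
 * Returns the adequacy score in [0, 1]; n must be greater than 0.
 */
double adequacy_score(const bool *covered, size_t n,
                      size_t *weaknesses, size_t *num_weaknesses)
{
    size_t hit = 0;
    *num_weaknesses = 0;
    for (size_t i = 0; i < n; i++) {
        if (covered[i])
            hit++;
        else
            weaknesses[(*num_weaknesses)++] = i;
    }
    return (double)hit / (double)n;
}
```

With four coverage elements of which the second and fourth are uncovered, the score is 0.5 and the weakness list holds indices 1 and 3.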
Test assessment (contd.)
Once the coverage has been computed, and the weaknesses identified, one can improve T.
Improvement of T is done by examining one or more weaknesses and constructing new test requirements designed to overcome the weakness(es).
The new test requirements lead to new test specifications and to further testing of the program.
Test Assessment (contd.)
This is continued until all weaknesses are overcome, i.e. the adequacy criterion is satisfied (coverage=1).
In some instances it may not be possible to satisfy the adequacy criteria for one or more of the following reasons:
Lack of sufficient manpower
Weaknesses that cannot be removed because they are infeasible.
Test Assessment (contd.)
The cost of removing the weaknesses is not justified.
While improving T by removing its weaknesses, one usually tests the program more thoroughly than it has been tested so far.
This additional testing is likely to result in the discovery of remaining errors.
Test Assessment (contd.)
Test assessment and improvement is applicable throughout the testing process and during all stages of software development.
Hence we say that test assessment and improvement helps in the improvement of software reliability.
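Test Assessment Procedure
0. Develop T.
1. Select an adequacy criterion C.
2. Measure adequacy of T w.r.t. C.
3. Is T adequate? If no, go to step 4; if yes, go to step 5.
4. Improve T and return to step 2.
5. Is more testing warranted? If yes, return to step 1; if no, go to step 6.
6. Done.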
Principle Underlying Test Assessment
There is a uniform principle that underlies test assessment throughout the testing process.
This principle is referred to as the coverage principle.
It has come about as a result of intensive research at Purdue and other research groups in software testing.
The Coverage Principle
To formulate and understand the coverage principle, we need to understand:
coverage domains
coverage elements
A coverage domain is a finite domain, related to the program under test, that we want to cover. Coverage elements are the individual elements of this domain.
The Coverage Principle (contd.)
[Figure: coverage domains and their coverage elements. Surviving labels: requirements, classes, functions, interface mutations, exceptions.]
The Coverage Principle (contd.)
Measuring test adequacy and improving a test set against a sequence of well defined, increasingly strong, coverage domains leads to improved confidence in the reliability of the system under test.
The Coverage Principle (contd.)
Note the following properties of a coverage domain:
It is related to the program under test.
It is finite.
It may come from program requirements, related to the inputs and outputs.
The Coverage Principle (contd.)
It may come from program code. Can you think of a coverage domain that comes from the program code?
It aids in measuring test adequacy as well as the progress made in testing. How?
The Coverage Principle (contd.)
Example: It is required to write a program that takes in the name of a person as a string and searches for the name in a file of names. The program must output the record ID which matches the given name. In case of no match a -1 is returned.
What coverage domains can be identified from this requirement?
The Coverage Principle (contd.)
As we learned earlier, improving coverage improves our confidence in the correct functioning of the program under test.
Given a program P and a test T suppose that T is adequate w.r.t. a coverage criterion C.
Does this mean that P is error free?
Obviously……???
Test Effort
There are several measures of test effort.
One measure is the size of T. By this measure a test set with a larger number of test cases corresponds to higher effort than one with a smaller number of test cases.
Error Detection Effectiveness
Each coverage criterion has its error detection ability.
This is also known as the error detection effectiveness or simply effectiveness of the criterion.
One measure of the effectiveness of criterion C is the fraction of faults guaranteed to be revealed by a test T that satisfies C.
Effectiveness (contd.)
Another measure is the probability that at least fraction f of the faults in P will be revealed by test T that satisfies C.
Unfortunately there is no absolute measure of the effectiveness of any given coverage criterion for a general class of programs and for arbitrary test sets.
Effectiveness (contd.)
One coverage criterion results in an exception to this rule: What is it?
Empirical studies conducted by researchers give us an idea of the relative goodness of various coverage criteria.
Thus, for a variety of criteria we can make a statement like: Criterion C1 is definitely better than criterion C2.
Effectiveness (contd.)
In some cases we may be able to say: Criterion C1 is probably better than criterion C2.
Such information allows us to construct a hierarchy of coverage criteria.
This hierarchy is helpful in organizing and managing testing. How?
Strength of a coverage criterion
The effectiveness of a coverage criterion is also referred to as its strength.
Strength is a measure of the criterion’s ability to reveal faults in a program.
Criterion C1 is considered stronger than criterion C2 if C1 is capable of revealing more faults than C2.
The Saturation Effect
The rate at which new faults are discovered reduces as test adequacy with respect to a finite coverage domain increases; it reduces to zero when the coverage domain has been exhausted.
[Figure: rate of fault discovery, δf/δc, plotted against coverage, which ranges from 0 to 1.]
Saturation Effect: Fault View
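[Figure: remaining faults (vertical axis, marked 0, M, and N) plotted against testing effort (horizontal axis, marked tfs, tfe, tds, tdfe, and tme), beginning with functional testing.]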
Saturation Effect: Reliability View
Functional, decision, dataflow, and mutation testing provide various test assessment criteria.
[Figure: reliability plotted against testing effort, showing true reliability (R), estimated reliability (R'), and a saturation region for each of functional (Rf, R'f; tfs to tfe), decision (Rd, R'd; tds to tde), dataflow (Rdf, R'df; tdfs to tdfe), and mutation (Rm, R'm; starting at tms) testing.]
Coverage principle: discussion
Discuss:
How will you use the knowledge of the coverage principle and the saturation effect in organizing and managing testing?
Can you think of any other uses of the coverage principle and the saturation effect?
Control flow graph
The control flow graph (CFG) of a program is a representation of the flow of execution within the program. More formally, a CFG G is:
G = (N, A), where N is a set of nodes and A is a set of arcs.
There is a unique entry node en in N.
There is a unique exit node ex in N.
A node represents a single statement or a block.
A block is a single-entry-single-exit sequence of instructions that are always executed in a sequence without any diversion of path except at the end of the block.
Control flow graph (contd.)
Every statement in a block, except possibly the first one, has exactly one predecessor.
Similarly, every statement in the block, except possibly the last one, has exactly one successor.
An arc a in A is a pair (n,m) of nodes from N which represents transfer of control from node n to node m.
A path of length k in G is an ordered sequence of arcs $a_1, a_2, \ldots, a_k$ from A such that:
Control flow graph (contd.)
The first node of $a_1$ is en.
The last node of $a_k$ is ex.
For any two adjacent arcs $a_i = (n,m)$ and $a_{i+1} = (p,q)$, $m = p$.
A path is considered executable or feasible if there exists a test case which causes this path to be traversed during program execution, otherwise the path is unexecutable or infeasible.
Control flow graph-example
Exercise: Draw a CFG for the following program and identify all paths:
1. scanf (x,y); if (y<0)
2. pow=0-y;
3. else pow=y;
4. z=1.0;
5. while (pow !=0)
6. {z=z*x; pow=pow-1;}
7. if (y<0)
8. z=1.0/z;
9. printf(z);
What does the above program compute?
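To experiment with the program, it can help to have a compilable rendering. The sketch below wraps the nine numbered statements in a function, with the inputs passed as parameters instead of being read by scanf and the result returned instead of printed. The function name power and the parameter types (double x, int y) are assumptions, since the slide does not declare them.

```c
/* Statement numbers in the comments refer to the numbered program above. */
double power(double x, int y)
{
    int pow;        /* loop counter, named as in the slide */
    double z;

    if (y < 0)      /* 1 */
        pow = 0 - y;            /* 2 */
    else
        pow = y;                /* 3 */
    z = 1.0;                    /* 4 */
    while (pow != 0) {          /* 5 */
        z = z * x;              /* 6 */
        pow = pow - 1;
    }
    if (y < 0)                  /* 7 */
        z = 1.0 / z;            /* 8 */
    return z;                   /* 9: printf(z) in the slide */
}
```

Calling power(2.0, 3), power(2.0, -2), and power(5.0, 0) exercises the loop with three, two, and zero iterations, respectively, which is a convenient starting point for enumerating paths.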
Control-flow Graph
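[Figure: CFG of the program from the exercise.]
Nodes:
en
1: scanf (x,y); if (y<0)
2: pow=0-y;
3: else pow=y;
4: z=1.0;
5: while (pow !=0)
6: {z=z*x; pow=pow-1;}
7: if (y<0)
8: z=1.0/z;
9: printf(z);
ex
Arcs: (en,1), (1,2), (1,3), (2,4), (3,4), (4,5), (5,6), (6,5), (5,7), (7,8), (7,9), (8,9), (9,ex)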
Structure-based Test Adequacy
Based on the CFG of a program several test adequacy criteria can be defined.
Some are:
statement coverage criterion
branch coverage criterion
condition coverage criterion
path coverage criterion
Statement Coverage
The coverage domain consists of all statements in the program. Restated, in terms of the control flow graph, it is the set of all nodes in G.
A test T satisfies the statement coverage criterion if upon execution of P on each element of T, each statement of P has been executed at least once.
Statement coverage (contd.)
Restated in terms of G, T is adequate w.r.t. the statement coverage criterion if each node in N is on at least one of the paths traversed when P is executed on each element of T.
Statement Coverage (contd.)
Class exercise: For the program for which you have drawn the control flow graph, develop a test set that satisfies the statement coverage criterion.
Follow the procedure for test assessment and improvement suggested earlier.
Statement Coverage-Weakness
Consider the following program:
int abs(int x)
{
  if (x>=0) x=0-x;
  return x;
}
Statement coverage-weakness
Suppose that T= {(x=0)}.
Clearly, T satisfies the statement coverage criterion.
But is the program correct? And is the error revealed by T, which is adequate w.r.t. the statement coverage criterion?
What do you suggest we do to improve T?
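One way to explore these questions is to instrument the abs program with a probe per statement and run it against candidate test sets. The sketch below is illustrative only: the probe layout, the oracle abs_expected (which assumes the intended behaviour is the absolute value), and the function statement_coverage are assumptions, not part of the lecture material.

```c
#include <stdbool.h>
#include <stddef.h>

enum { S_IF, S_ASSIGN, S_RETURN, NUM_STMTS };   /* the coverage elements */
static bool executed[NUM_STMTS];

/* The abs program from the slide, unchanged except for one probe per statement. */
static int abs_instrumented(int x)
{
    executed[S_IF] = true;
    if (x >= 0) {
        executed[S_ASSIGN] = true;
        x = 0 - x;
    }
    executed[S_RETURN] = true;
    return x;
}

/* Oracle: assumes the intended behaviour is the absolute value of x. */
static int abs_expected(int x)
{
    return x < 0 ? -x : x;
}

/* Runs the program on each test input, counts the failing tests in
 * *failures, and returns the statement coverage in [0, 1]. */
double statement_coverage(const int *tests, size_t n, int *failures)
{
    int covered = 0;
    for (int s = 0; s < NUM_STMTS; s++)
        executed[s] = false;
    *failures = 0;
    for (size_t i = 0; i < n; i++)
        if (abs_instrumented(tests[i]) != abs_expected(tests[i]))
            (*failures)++;
    for (int s = 0; s < NUM_STMTS; s++)
        covered += executed[s];
    return (double)covered / NUM_STMTS;
}
```

Under these assumptions, T = {(x=0)} yields a coverage of 1 with no failing test, while adding a non-zero input such as x=5 leaves the coverage at 1 and produces a failure.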
Branch (or edge) coverage
In G there may be nodes which correspond to conditions in P. Such nodes, also called condition nodes, contain branches in P.
Each such node is considered covered if its condition evaluates to true during some execution of P and to false during some execution of P; the two executions need not be the same.
Branch coverage
The coverage domain consists of all branches in G. Restated, in terms of the control flow graph, it is the set of all arcs exiting the condition nodes.
A test T satisfies the branch coverage criterion if upon execution of P on each element of T, each branch of P has been executed at least once.
Branch coverage
Class exercise: Identify all condition nodes in the flow graph you have drawn earlier. Does T= {(x=0)} satisfy the branch coverage criterion? If not, then improve it so that it does.
Branch Coverage-Weakness
Consider the following program that is supposed to check if the input data item is in the range 0 to 100, inclusive:
int check(int x)
{
  if ((x>=0) && (x<=200)) return 1;  /* true */
  else return 0;                     /* false */
}
Branch Coverage-Weakness
Class exercise: Do you notice the error in this program? Find a test set T which is adequate w.r.t. statement coverage and does not reveal the error.
Improve T so that it is adequate w.r.t. branch coverage and does not reveal the error.
What do you conclude about the weakness of the branch coverage criterion?
Condition Coverage
Condition nodes in G might have compound conditions.
For example, in the check program the condition node contains the condition:
((x>=0 ) && (x<=200))
This is a compound condition which consists of the elementary conditions x>=0 and x<=200.
Condition coverage (contd.)
A compound condition is considered covered if each of its constituent elementary conditions evaluates to true during some execution of P and to false during some execution of P.
A test set T is adequate w.r.t. condition coverage if all conditions in P are covered when P is executed on elements of T.
Condition coverage (contd.)
Class exercise: Improve T from the previous exercise so that it is adequate w.r.t. the condition coverage criterion for the check function and does not reveal the error.
Do you find the above possible?
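A candidate test set for this exercise can be checked mechanically. The sketch below wraps each elementary condition of check in a probe that records the truth values it has taken, and compares the program against an oracle for the stated requirement (range 0 to 100, inclusive). The names probe, check_probed, check_spec, and condition_adequate are assumptions introduced for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* seen[c][v] is set once elementary condition c has evaluated to v
 * (v = 0 for false, v = 1 for true). */
static bool seen[2][2];

static bool probe(int cond, bool value)
{
    seen[cond][value ? 1 : 0] = true;
    return value;
}

/* The condition from the check program, upper bound 200 as on the slide.
 * Because of short-circuit evaluation, x<=200 is only evaluated
 * when x>=0 is true. */
static bool check_probed(int x)
{
    return probe(0, x >= 0) && probe(1, x <= 200);
}

/* Oracle for the stated requirement: 0 to 100, inclusive. */
static bool check_spec(int x)
{
    return x >= 0 && x <= 100;
}

/* Returns true when every elementary condition has been both true and
 * false over the test set; counts the failing tests in *failures. */
bool condition_adequate(const int *tests, size_t n, int *failures)
{
    for (int c = 0; c < 2; c++)
        seen[c][0] = seen[c][1] = false;
    *failures = 0;
    for (size_t i = 0; i < n; i++)
        if (check_probed(tests[i]) != check_spec(tests[i]))
            (*failures)++;
    return seen[0][0] && seen[0][1] && seen[1][0] && seen[1][1];
}
```

With this harness, T = {-1, 50, 300} is reported as adequate with no failing test, whereas an input between 101 and 200 produces a failure.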
Branch coverage-weakness (contd.)
Consider the following program:
0. int set_z(x,y); {
1. int x,y;
2. if (x!=0)
3. y=5;
4. else z=z-x;
5. if (z>1)
6. z=z/x;
7. else
8. z=y; }
What might happen here?
Branch Coverage-Weakness
Class exercise: Construct T for set_z such that (a) T is adequate w.r.t. the branch coverage criterion and (b) does not reveal the error.
What do you conclude about the effectiveness of the branch and condition coverage criteria?
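A branch-coverage harness for set_z makes it easy to try out candidate test sets. The sketch below makes several assumptions that the slide leaves open: z is treated as a third input (its value on entry), the division by zero at line 6 is replaced by a guard that reports failure instead of crashing, and the names zcase, run_set_z, and branch_coverage are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

enum { X_NONZERO, X_ZERO, Z_GT_1, Z_LE_1, NUM_BRANCHES };
static bool taken[NUM_BRANCHES];

typedef struct { int x, y, z; } zcase;   /* z is the value of z on entry */

/* set_z with one probe per branch. Returns false when the run would
 * fail, i.e. when line 6 would divide by zero. */
static bool run_set_z(zcase t)
{
    int x = t.x, y = t.y, z = t.z;

    if (x != 0) { taken[X_NONZERO] = true; y = 5; }
    else        { taken[X_ZERO] = true;    z = z - x; }

    if (z > 1) {
        taken[Z_GT_1] = true;
        if (x == 0)
            return false;       /* guard stands in for the crash in z=z/x */
        z = z / x;
    } else {
        taken[Z_LE_1] = true;
        z = y;
    }
    (void)z;                    /* the result itself is not checked here */
    return true;
}

/* Returns branch coverage in [0, 1]; counts failing runs in *failures. */
double branch_coverage(const zcase *tests, size_t n, int *failures)
{
    int covered = 0;
    for (int b = 0; b < NUM_BRANCHES; b++)
        taken[b] = false;
    *failures = 0;
    for (size_t i = 0; i < n; i++)
        if (!run_set_z(tests[i]))
            (*failures)++;
    for (int b = 0; b < NUM_BRANCHES; b++)
        covered += taken[b];
    return (double)covered / NUM_BRANCHES;
}
```

Under these assumptions, the two tests (x=2, y=0, z=4) and (x=0, y=0, z=0) take all four branches without a failure, while (x=0, y=0, z=5) reaches the division with x equal to zero.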
Path coverage
As mentioned before, a path through a program is a sequence of statements such that the entry node of the program CFG is the first node on the path and the exit node is the last one on the path.
Is this definition equivalent to the one given earlier?
Path coverage (contd.)
A test set T is considered adequate w.r.t. the path coverage criterion if all paths in P are executed at least once upon execution of P on each element of T.
Class exercise: Construct T for set_z such that T is adequate w.r.t. the path coverage criterion and does not reveal the error.
Is the above possible?
Path Coverage-Weakness
The number of paths in a program is usually very large.
How many paths in set_z ?
How many paths in check ?
How many in the program that computes $x^y$?
Path Coverage-Weaknesses
It is the infinite, or prohibitively large, number of paths that prevents the use of this criterion in practice.
Suppose that a test set T covers all paths. Will it guarantee that all errors in P are revealed ?
Is obtaining 100% path coverage equivalent to exhaustive testing?
Variants of Path Coverage
As path coverage is usually impossible to attain, other heuristics have been proposed.
Loop coverage: Make sure that each loop is executed 0, 1, and 2 times.
Try several combinations of if and switch statements. The combinations must come from requirements.
Hierarchy in Control flow criteria
Path coverage
Condition coverage
Branch coverage
Statement coverage
Each criterion subsumes the one listed below it (in the figure, an arrow from X to Y means X subsumes Y).
Exercise
Develop a test set T that is adequate w.r.t. the statement, condition, and the loop coverage criteria for the exponentiation program.
Test strategy
One can develop a test strategy based on any of the criteria discussed.
Example: A test strategy based on the statement coverage criterion will begin by evaluating a test set T against this criterion. Then new tests will be added to T until all the statements are covered, i.e. T satisfies the criterion.
Definitions
Error-sensitive path: a path whose execution might lead to eventual detection of an error.
Error revealing path: a path whose execution will always cause the program to fail and the error to be detected.
Definitions: Reliable Technique
Reliable: A test technique is reliable for an error if it guarantees that the error will always be detected.
This implies that a reliable testing technique must lead to the exercising of at least one error-revealing path.
Definitions: Weakly Reliable
Weakly reliable: A test technique is weakly reliable if it forces the execution of at least one error sensitive path.
Example: Error Detection [1]
Let us go over the example in Korel and Laski’s paper.
It is a sorting program which uses the bubble sort algorithm.
It sorts an array a[0:N] in descending order.
There are two, nested, loops in the program.
The inner loop from i6-i10 finds the largest element of a[R1:N].
Example: Error Detection (contd.)
The largest element is saved in R0 and R3 points to the location of R0 in a.
The completion of one iteration of the outer loop ensures that the sub-array a[0:R1-1] has been sorted and that a[R1-1] is greater than or equal to any element of a[R1:N].
The outer loop swaps a[R1] with a[R3].
Example: Error Detection (contd.)
There is a missing re-initialization of R3 to R1 at the beginning of the inner loop.
In some cases this will cause the program to fail.
What are these cases?
We will get back to this error later!
Data flow graph
The graph is constructed from the control flow graph (CFG) of the program.
It represents the flow of data in a program.
A statement that occurs within a node of the CFG might contain variable occurrences.
Each variable occurrence is classified as a def or a use.
defs and uses
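A def represents the definition of a variable. Here are some sample defs of variable x:
x=y*x;
scanf(&x,&y);
int x;
x[i-1]=y*x;
In each sample the def of x is the occurrence that is declared or receives a value: the left-hand side of the first assignment, the &x argument of scanf, the declared name, and the array being assigned to.
A use represents the use of a variable in a statement. Here are a few examples of use of variable x: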
def-use (contd.)
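x=x+1;
printf ("x is %d, y is %d", x,y);
cout << x << endl << y
z=x[i+1]
if (x<y)…
In each sample the use of x is the occurrence whose value is read: the x on the right-hand side of x=x+1, the x argument of printf, the x sent to cout, the array x being indexed, and the x in the condition.
Uses of a variable in output and assignments are classified as c-uses. Those in conditions are classified as p-uses.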
def-use (contd.)
c-use stands for computational use and p-use for predicate-use.
Both c- and p-uses affect the flow of control: p-uses directly as their values are used in evaluating conditions and c-uses indirectly as their values are used to compute other variables which in turn affect the outcome of condition evaluation.
def-use (contd.)
A path from node i to node j is said to be def-clear w.r.t. a variable x if there is no def of x in the nodes along the path from node i to node j. Nodes i and j may have a def of x.
A def-clear path from node i to edge (j,k) is one in which no node on the path has a def of x.
global-def
A c-use of x in a block is considered global c-use if there is no def of x preceding this c-use within this block.
A def of a variable x is considered global to its block if it is the last def of x within that block.
def-use graph: definitions
def(i): set of all variables for which there is a global definition at node i.
c-use(i): set of all variables that have a global c-use at node i.
p-use(i,j): set of all variables for which there is a p-use for the edge (i,j).
dcu(x,i): set of all nodes j such that x is in c-use(j), x is in def(i), and there is a def-clear path w.r.t. x from node i to node j.
def-use graph: definitions
dpu(x,i): set of all edges (j,k) such that x is in p-use(j,k), x is in def(i), and there is a def-clear path w.r.t. x from node i to edge (j,k).
The def-use graph of program P is constructed by associating defs, c-use, and p-use sets with nodes of a flow graph.
def-use graph (contd.)
Sample program:
1. scanf (x,y); if (y<0)
2. pow=0-y;
3. else pow=y;
4. z=1.0;
5. while (pow !=0)
6. {z=z*x; pow=pow-1;}
7. if (y<0)
8. z=1.0/z;
9. printf(z);
def-use graph (contd.)
[Figure: def-use graph of the sample program.]
Node  def        c-use
1     {x,y}      {}
2     {pow}      {y}
3     {pow}      {y}
4     {z}        {}
5     {}         {}
6     {z,pow}    {z,x,pow}
7     {}         {}
8     {z}        {z}
9     {}         {z}
p-use sets: edges (1,2) and (1,3) are labeled y; edges (5,6) and (5,7) are labeled pow; edges (7,8) and (7,9) are labeled y.
Unlabeled edges imply empty p-use set.
def-use graph exercise
Draw a def-use graph for the following program.
0. int set_z(x,y); {
1. int x,y;
2. if (x!=0)
3. y=5;
4. else z=z-x;
5. if (z>1)
6. z=z/x;
7. else
8. z=y; }
def-use graph (contd.)
Traverse the graph to determine dcu and dpu sets.
(node, var)   dcu        dpu
(1,x)         {6}        {}
(1,y)         {2,3}      {(1,2),(1,3),(7,8),(7,9)}
(2,pow)       {6}        {(5,6),(5,7)}
(3,pow)       {6}        {(5,6),(5,7)}
(4,z)         {6,8,9}    {}
(6,z)         {6,8,9}    {}
(6,pow)       {6}        {(5,6),(5,7)}
(8,z)         {9}        {}
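The traversal can also be carried out mechanically. The sketch below encodes the def-use graph of the sample program and computes dcu and dpu for a given definition by a depth-first search that stops at any node redefining the variable. The bitmask encoding, the edge ordering, and the names du_sets, leave, follow, and du are assumptions made for illustration; this is not the algorithm of any particular tool.

```c
#include <stdbool.h>

enum { X = 1, Y = 2, POW = 4, Z = 8 };      /* one bit per variable */
enum { NUM_NODES = 9, NUM_EDGES = 11 };

/* def and c-use sets per node, indexed 1..9 as in the graph (index 0 unused). */
static const unsigned def_set[NUM_NODES + 1] =
    { 0, X | Y, POW, POW, Z, 0, Z | POW, 0, Z, 0 };
static const unsigned cuse_set[NUM_NODES + 1] =
    { 0, 0, Y, Y, 0, 0, Z | X | POW, 0, Z, Z };

/* Edges of the graph with their p-use sets. */
static const struct { int from, to; unsigned puse; } edges[NUM_EDGES] = {
    {1, 2, Y}, {1, 3, Y}, {2, 4, 0}, {3, 4, 0}, {4, 5, 0}, {5, 6, POW},
    {6, 5, 0}, {5, 7, POW}, {7, 8, Y}, {7, 9, Y}, {8, 9, 0}
};

/* dcu: bit j is set when node j is in the set.
 * dpu: bit e is set when edges[e] is in the set. */
typedef struct { unsigned dcu, dpu; } du_sets;

static void follow(int node, unsigned var, bool *visited, du_sets *out);

/* Records p-uses on the edges leaving node and continues along them. */
static void leave(int node, unsigned var, bool *visited, du_sets *out)
{
    for (int e = 0; e < NUM_EDGES; e++) {
        if (edges[e].from != node)
            continue;
        if (edges[e].puse & var)
            out->dpu |= 1u << e;
        follow(edges[e].to, var, visited, out);
    }
}

/* Visits a node reached along a def-clear path. */
static void follow(int node, unsigned var, bool *visited, du_sets *out)
{
    if (visited[node])
        return;
    visited[node] = true;
    if (cuse_set[node] & var)
        out->dcu |= 1u << node;     /* global c-use reached def-clear */
    if (def_set[node] & var)
        return;                     /* a redefinition ends the def-clear path */
    leave(node, var, visited, out);
}

/* Computes dcu(var, def_node) and dpu(var, def_node). */
du_sets du(unsigned var, int def_node)
{
    bool visited[NUM_NODES + 1] = { false };
    du_sets out = { 0, 0 };
    leave(def_node, var, visited, &out);
    return out;
}
```

For example, du(Z, 4).dcu has bits 6, 8, and 9 set, matching the row for (4,z), and du(Y, 1).dpu marks the edges (1,2), (1,3), (7,8), and (7,9).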
Test generation
Exercises: For the above graph generate a test set that satisfies
the branch coverage criterion
the all-defs criterion: for definitions of all variables at least one use (c- or p-use) must be exercised.
the all-uses criterion: all p-uses and all c-uses of all variable definitions must be covered.
Develop the tests incrementally, i.e. by modifying the previous test set!
SUDS processing: Phase I
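[Figure: processing flow, phase I.]
P, the program under test, is preprocessed, compiled, and instrumented.
This step generates the .atac files and an instrumented (executable) version of P.
The test set is input to the instrumented version of P.
Upon execution, the instrumented version produces the program output and the .trace file.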
ATAC processing: Phase II
[Figure: processing flow, phase II.]
The coverage analyzer takes the .atac files and the .trace file as input and produces control flow and data flow coverage values.
Mutation Testing
What is mutation testing?
Mutation testing is a code-based test assessment and improvement technique.
It relies on the competent programmer hypothesis which is the following assumption:
Given a specification a programmer develops a program that is either correct or differs from the correct program by a combination of simple errors.
Mutation testing (contd.)
The process of program development is considered iterative: an initial version of the program is refined towards the final version by making a simple change, or a combination of simple changes.
Mutant
Given a program P, a mutant of P is obtained by making a simple change in P.
Program:
1. int x,y;
2. if (x!=0)
3. y=5;
4. else z=z-x;
5. if (z>1)
6. z=z/x;
7. else
8. z=y;
Mutant:
1. int x,y;
2. if (x!=0)
3. y=5;
4. else z=z-x;
5. if (z>1)
6. z=z/zpush(x);
7. else
8. z=y;
What is zpush?
Another mutant
Program:
1. int x,y;
2. if (x!=0)
3. y=5;
4. else z=z-x;
5. if (z>1)
6. z=z/x;
7. else
8. z=y;
Mutant:
1. int x,y;
2. if (x!=0)
3. y=5;
4. else z=z-x;
5. if (z<1)
6. z=z/x;
7. else
8. z=y;
Mutant
A mutant M is considered distinguished by a test case $t \in T$ iff $P(t) \neq M(t)$,
where P(t) and M(t) denote, respectively, the observed behavior of P and M when executed on test input t.
A mutant M is considered equivalent to P iff $P(t) = M(t)$ for all $t \in T$, where T is here taken to be the set of all possible test inputs.
Mutation score
During testing a mutant is considered live if it has not been distinguished or proven equivalent.
Suppose that a total of #M mutants are generated for program P.
The mutation score of a test set T, designed to test P, is computed as:
number of distinguished mutants/(#M-number of equivalent mutants)
Test adequacy criterion
A test T is considered adequate w.r.t. the mutation criterion if its mutation score is 1.
The number of mutants generated depends on P and the mutant operators applied on P.
A mutant operator is a rule that when applied to the program under test generates zero or more mutants.
Mutant Operators
Consider the following program:
int abs(int x)
{
  if (x>=0) x=0-x;
  return x;
}
Mutation operator
Consider the following rule: Replace each relational operator in P by all possible relational operators excluding the one that is being replaced.
Assuming the set of relational operators to be: {<, >, <=, >=, ==, !=}, the above mutant operator will generate a total of 5 mutants of P.
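As an illustration, the sketch below writes out the five mutants that this operator generates for the abs program and computes a mutation score for a given test set. Here the score is taken to be the fraction of non-equivalent mutants that the test set distinguishes, so that an adequate test set scores 1. The mutant function names, the hand-made equivalence flags, and mutation_score are assumptions; in particular, marking the ">" mutant as equivalent is a manual judgement (it differs from the program only at x = 0, where negation changes nothing), not something the code derives.

```c
#include <stddef.h>

/* Program under test: the abs program from the slide. */
static int p(int x)    { if (x >= 0) x = 0 - x; return x; }

/* The five mutants obtained by replacing >= with each other relational operator. */
static int m_lt(int x) { if (x <  0) x = 0 - x; return x; }
static int m_gt(int x) { if (x >  0) x = 0 - x; return x; }
static int m_le(int x) { if (x <= 0) x = 0 - x; return x; }
static int m_eq(int x) { if (x == 0) x = 0 - x; return x; }
static int m_ne(int x) { if (x != 0) x = 0 - x; return x; }

typedef int (*prog_fn)(int);

enum { NUM_MUTANTS = 5 };
static const prog_fn mutants[NUM_MUTANTS] = { m_lt, m_gt, m_le, m_eq, m_ne };

/* 1 marks a mutant judged equivalent to p by hand (here only m_gt). */
static const int equivalent[NUM_MUTANTS] = { 0, 1, 0, 0, 0 };

/* A mutant is distinguished when some test makes it behave differently from p. */
static int is_distinguished(prog_fn m, const int *tests, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (p(tests[i]) != m(tests[i]))
            return 1;
    return 0;
}

/* distinguished mutants / (all mutants - equivalent mutants) */
double mutation_score(const int *tests, size_t n)
{
    int distinguished = 0, equiv = 0;
    for (int k = 0; k < NUM_MUTANTS; k++) {
        if (equivalent[k]) {
            equiv++;
            continue;
        }
        if (is_distinguished(mutants[k], tests, n))
            distinguished++;
    }
    return (double)distinguished / (NUM_MUTANTS - equiv);
}
```

With T = {0} no mutant is distinguished; adding x = 5 distinguishes three of the four non-equivalent mutants, and adding x = -5 distinguishes the remaining one.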
Mutation Operators
Mutation operators are language dependent.
For Fortran a total of 22 operators were proposed.
For C a total of 77 operators were proposed. None have been proposed for C++ though most of the operators for C are applicable to C++ programs.
Equivalent mutant
Consider the following program P:
int x,y,z;
scanf(&x,&y);
if (x>0) {
  x=x+1; z=x*(y-1);
} else {
  x=x-1; z=x*(y-1);
}
Here z is considered the output of P.
Equivalent mutant (contd.)
Now suppose that a mutant of P is obtained by changing x=x+1 to x=abs(x)+1.
This mutant is equivalent to P as no test case can distinguish it from P.
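The reason is that x=x+1 is only reached when x>0, where abs(x) equals x. A bounded brute-force comparison can serve as a sanity check of this argument; it is evidence rather than a proof, since it covers only a finite range. In the sketch below the inputs are passed as parameters instead of being read by scanf, and the names p_out, m_out, and count_distinguishing_inputs are assumptions.

```c
#include <stdlib.h>

/* Program P, returning its output z. */
static int p_out(int x, int y)
{
    int z;
    if (x > 0) { x = x + 1; z = x * (y - 1); }
    else       { x = x - 1; z = x * (y - 1); }
    return z;
}

/* Mutant: x=x+1 replaced by x=abs(x)+1. */
static int m_out(int x, int y)
{
    int z;
    if (x > 0) { x = abs(x) + 1; z = x * (y - 1); }
    else       { x = x - 1;      z = x * (y - 1); }
    return z;
}

/* Counts the inputs (x, y) with lo <= x, y <= hi on which the program
 * and the mutant produce different outputs. */
int count_distinguishing_inputs(int lo, int hi)
{
    int count = 0;
    for (int x = lo; x <= hi; x++)
        for (int y = lo; y <= hi; y++)
            if (p_out(x, y) != m_out(x, y))
                count++;
    return count;
}
```

A count of zero over a range such as -50 to 50 is consistent with the mutant being equivalent.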
Mutation Testing Procedure
Given P and a test set T:
1. Generate mutants.
2. Compile P and the mutants.
3. Execute P and the mutants on each test case.
4. Determine equivalent mutants.
5. Determine mutation score.
6. If mutation score is not 1 then improve the test set and repeat from step 3.
Mutation Testing Procedure (contd.)
In practice the above procedure is implemented incrementally.
One applies a few selected mutant operators to P and computes the mutation score w.r.t. the mutants generated.
Once these mutants have been distinguished or proven equivalent, another set of mutant operators is applied.
Mutation Testing Procedure
This procedure is repeated until either all the mutants have been exhausted or some external condition forces testing to stop.
We will not discuss the details of practical application of mutation testing.
Tools for Mutation Testing
Mothra: for Fortran, developed at Purdue, 1990
Proteum: for C, developed at the University of São Paulo at São Carlos in Brazil.
Uses of Mutation Testing
Mutation testing is useful during integration testing to check for integration errors.
Only the variables that are in the interfaces of the components being integrated are mutated. This reduces the complexity of mutation testing.
Summary
Test adequacy criterion
Test improvement
Coverage principle
Saturation effect
Control flow criteria
Data flow criteria: def, use, p-use, c-use, all-uses
Summary (contd.)
xSUDS, data flow testing tool.
Mutation testing: mutant, distinguishing a mutant, live mutant, mutation score, competent programmer hypothesis.