
Specification-based Regression Test Selection with Risk Analysis

Yanping Chen
School of Information Technology and Engineering
University of Ottawa, Canada
[email protected]

Robert L. Probert
School of Information Technology and Engineering
University of Ottawa, Canada
[email protected]

D. Paul Sims
Test Continual Improvement, E-commerce Development
IBM Canada Ltd., Canada
[email protected]

Abstract

Regression testing is essential to ensure software quality. The test team applies a regression test suite to ensure that new or modified features do not regress (make worse) existing features. Although existing research has addressed many problems and put forward solutions, most regression test techniques are code-based. Code-based regression test selection is good for unit testing, but it has a scalability problem. When the size of the subject under test grows, it becomes hard to manage all the information and to create corresponding traceability matrices. In this paper, we describe a specification-based method for regression test selection.

The basic model we use for describing requirements based on customer features or behaviors is the activity diagram, which is a notation of the Unified Modeling Language (UML). A process for identifying the affected test cases is presented. To summarize our approach, we select two kinds of regression tests: i) Targeted Tests, which ensure that important current customer features are still supported adequately in the new release, and ii) Safety Tests, which are risk-directed and ensure that potential problem areas are properly handled. Our test selection technique will be based on a practical risk analysis model.

1 Introduction

Regression testing is an essential part of an effective testing process for ensuring software quality. It is the process of validating modified software to provide confidence that the changed parts of the software behave as intended and that the unchanged parts of the software have not been adversely affected by the modification [1]. Without running a modified program on all test cases of test suite T, regression test selection (RTS) attempts to select a cost-minimized subset of test cases to determine if the modified program has the same behavior as a previous, acceptable version of the program running on T.

It is well known that regression testing is important in software maintenance. Moreover, with widespread usage of object-oriented programming techniques, more and more projects follow an evolutionary process model, or an incremental model. Under this model, components from legacy systems or third parties will be re-used in new projects. Thus regression testing is also an important activity in object-oriented software development, to gain confidence in re-used components.

Much attention has been put into this area in recent years. Most techniques are white-box (code-based), that is, they select tests based on information about the delta between the original code and the modified version [1, 2, 3]. Only a few techniques are black-box (specification-based) methods, that is, they select tests based on information obtained from program specifications [4, 5]. Code-based regression techniques can be applied effectively to regression testing at the unit level. But when we try to test a larger or more complex component, for example, a utility or a subsystem, it becomes very difficult to manage all the information obtained from code and to create corresponding management matrices (for example, for coverage statistics). Therefore, code-based regression techniques are not suitable for testing larger components at more abstract levels, such as a subsystem.


Second, code-based regression techniques require testers to access and understand the code. This requirement causes many practical problems: testers must spend time reading code written by others and understanding how it works, which is very time-consuming. Finally, code-based regression techniques are language-dependent. In some software systems, more than one programming language may be used; for example, an Internet application may use Java and HTML. More than one code-based regression technique will then be necessary for regression testing, which makes the situation more complex. In this paper, we propose a specification-based regression method as an efficient and effective solution.

In the next section, we discuss regression test selection in general; we also give a detailed definition of the risk model we use. Sections 3 and 4 contain discussions of our method. In Section 5, we present the results of an industrial case study to evaluate our technique. In Section 6, we present related work and compare it to our work. In the last section, Section 7, we give conclusions and suggestions for future work.

2 Background

2.1 Regression testing

One major difference between regression testing and development testing is that during regression testing an established suite of tests will typically be available for reuse. We define this established suite of tests as the full test suite. With selective retest (regression test) techniques, we rerun only those test cases that test the affected entities of the modified program. These test cases make up our regression test suite. If the cost of selecting a reduced subset of tests to run is less than the cost of running the tests that we omit, the selective retest technique is more economical than the retest-all technique [6].

In [7], Rothermel and Harrold described regression testing techniques as follows:

“Given a program P, a modified version P’, and a set T of test cases used previously to test P, regression analysis and testing techniques attempt to make use of a subset of T to gain sufficient confidence in the correctness of P’ with respect to behaviors from P retained in P’.”

2.2 Controlled Regression Testing Assumption (CRTA)

In regression testing we try to determine if a modified program has the same behavior as an old version of the program by running the same tests. To be safe without being too inefficient and too conservative, our approach is based on an assumption called the Controlled Regression Testing Assumption (CRTA). In [7] CRTA is described as follows:

“When P’ is tested with T, we hold all factors that might influence the output of P’, except for the code in P’, constant with respect to their states when we tested P with T.”

Ideally, regression tests should be run while CRTA holds. This can be quite difficult in a complex, multi-platform test environment. Testers must identify and document uncontrollable factors and the test cases that are potentially affected by them.

2.3 Activity diagram

In the Unified Modeling Language (UML) [12], the activity diagram is the notation for an activity graph, used to model threads of computation and workflows of a system. It combines concepts from several techniques: event diagrams (a notation from J. Odell), SDL modeling techniques, workflow modeling, and Petri nets [20, 22].

Elements of the activity diagram can be categorized into nodes and edges. Nodes include action states, diamonds (branch or merge), objects, synchronization bars (fork or join), start and stop markers, etc. Edges represent control flow with a solid arrow, and message flow and signal flow with dashed arrows.

Some companies, such as IBM, have begun to use the activity diagram in system design. It is powerful for describing system behavior and workflows, but it is still one of the least understood modeling methods of UML, and researchers have seldom paid attention to this technique [20]. In our research, we use the activity diagram to specify system requirements for regression analysis purposes.

2.4 Risk analysis

Risk is anything that threatens the successful achievement of a project's goals. Specifically, a risk is an event that has some probability of happening and that, if it occurs, will result in some loss [8].


The tester’s job is to reveal high-priority problems in the product. Traditional testers have always used risk-based testing, but in an ad hoc fashion based on their personal judgment [8]. Using risk metrics to quantitatively measure the quality of a test suite seems perfectly reasonable and is our approach.

In [9] Amland presented a simple risk model with only two elements of Risk Exposure. We use this model in our research. It takes into account both:

1. The probability of a fault being present. Myers [19] reports that as the number of detected errors increases, the probability that more undetected errors exist also increases. If one component has defects that are detected by full testing, it is very likely that we can find more defects in this component by regression testing (the more defects detected, the more defects we can expect). Thus, components with detected defects should be covered more carefully by regression tests.

2. The cost (consequence or impact) of a fault in the corresponding function if it occurs in operation. It is a well-known, observed fact that most commercial software contains bugs at delivery time. Companies begin a project with the knowledge that they will choose, because of schedule pressures, to ship software that contains known bugs [21]. However, when the budget is limited, we strive to detect the most critical defects first.

The mathematical formula to calculate Risk Exposure is RE(f) = P(f) × C(f), where RE(f) is the Risk Exposure of function f, P(f) is the probability of a fault occurring in function f, and C(f) is the cost if a fault is executed in function f in operational mode.
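As a minimal sketch (the function name and the sample ratings are ours, not from the paper), the model is a single multiplication per function:

```python
def risk_exposure(probability: float, cost: float) -> float:
    """RE(f) = P(f) x C(f): likelihood of a fault in function f times its cost in operation."""
    return probability * cost

# Hypothetical ratings on the 1-5 scales used later in Section 4:
# a fault is quite likely (P = 4) and its consequence is severe (C = 5), so RE = 20.
print(risk_exposure(4, 5))  # 20
```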

3 Specification-based test case selection

Since the major goal of regression testing is to assure system stability, we have to rerun test cases for requirements attributes that have been modified or may be affected by the modifications.

Through regression testing, we want to achieve adequate confidence in software quality. We would also like our regression test suite to achieve some coverage target. In this paper, coverage of the regression test suite refers to the percentage of test cases that have been selected as regression test cases from the full test suite. To differentiate between these purposes and the corresponding methods, we separate regression test cases into two categories: the Targeted Tests, which are test cases that exercise important affected requirements attributes, and the Safety Tests, which are test cases selected to reach a pre-defined coverage target. Intuitively, Targeted Tests exercise system behaviors that are key functional features for the customer, while Safety Tests try to ensure that potential problem behaviors are handled or avoided (that is, properly managed by the system).

Our regression test approach is presented in Sections 3 and 4. Section 3 focuses on issues related to choosing Targeted Tests. First we discuss requirements traceability based on the activity diagram. Then we talk about regression analysis and test case selection. Section 4 presents selecting Safety Tests with risk analysis.

3.1 Requirements traceability

Traceability is a simple, common-sense bookkeeping property that can prevent a wide range of problems [10]. It supports cross-checking by linking requirements, analysis, design, implementation, and test cases. In specification-based testing, for example, we need to know which test case verifies a given requirements attribute.

In different development phases, we can use different notations to model different views of the system. In our approach, we use the activity diagram to represent the desired system behavior. Elements of the activity diagram are used to represent requirements attributes.

Figure 1 is an example of such an activity diagram. We identify elements of the activity diagram in italics.


Figure 1: The activity diagram for the Get_Quote feature

Test Case   Path                                      Note
t1          a, b, c, d, e, f, g, h, i                 Basic flow
t2          a, b, c, d, e, j, k, e, f, g, h, i        Alternative flow
t3          a, b, c, d, e, j, k, e, j, f, g, h, i     Alternative flow
t4          a, b, c, l, m                             Exception flow

Table 1: Test suite for the Get_Quote feature in Figure 1

Test cases can be designed based on the requirements attributes in the activity diagram. There may be an unbounded number of paths in an activity diagram. Here the graph-theoretic sense of path is used: a set of nodes and edges for which an unambiguous property holds [10]. Each test case takes one specific path. Some test cases might cover the same path but carry different data. To identify paths, we label all nodes in the diagram.

In Table 1 we show the test cases designed for the example in Figure 1. This test suite exercises all nodes and edges.

To cross-check test cases and requirements, we build a traceability matrix from Table 1 (shown as Table 2) to record the relations between test cases and the activity diagram. In this matrix, for each node or edge of the activity diagram, we list the test cases that run through it.
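For example, the node and edge entries of Table 2 can be derived mechanically from the paths in Table 1. A minimal sketch (the path data is copied from Table 1; the dictionary representation is our own, not something the paper prescribes):

```python
from collections import defaultdict

# Test case paths from Table 1 (each path is the sequence of node labels).
paths = {
    "t1": ["a", "b", "c", "d", "e", "f", "g", "h", "i"],
    "t2": ["a", "b", "c", "d", "e", "j", "k", "e", "f", "g", "h", "i"],
    "t3": ["a", "b", "c", "d", "e", "j", "k", "e", "j", "f", "g", "h", "i"],
    "t4": ["a", "b", "c", "l", "m"],
}

node_cov = defaultdict(set)   # node -> set of test cases covering it
edge_cov = defaultdict(set)   # edge -> set of test cases covering it

for test, path in paths.items():
    for node in path:
        node_cov[node].add(test)
    for edge in zip(path, path[1:]):          # consecutive node pairs are edges
        edge_cov[edge].add(test)

print(sorted(node_cov["g"]))         # ['t1', 't2', 't3']  (matches Table 2)
print(sorted(edge_cov[("c", "l")]))  # ['t4']
```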

The process of obtaining requirements traceability is shown in Figure 2.

[Figure 1: the Get_Quote activity diagram, with activity states Check Session Validity, Send Get_Quote Request to Bank, Send Get_Quote Request to Timed-out Bank, Calculate Best Quote, Update Session Info., Display Best Quote, and Display Session Time-out Message; guard conditions [Valid], [Invalid], [Time Out], and [N Consecutive Failure]; nodes and edges labeled a through m.]

Node   Test Case         Edge     Test Case
a      t1, t2, t3, t4    (a, b)   t1, t2, t3, t4
b      t1, t2, t3, t4    (b, c)   t1, t2, t3, t4
c      t1, t2, t3, t4    (c, d)   t1, t2, t3
d      t1, t2, t3        (c, l)   t4
e      t1, t2, t3        (d, e)   t1, t2, t3
f      t1, t2, t3        (e, f)   t1, t2
g      t1, t2, t3        (e, j)   t2, t3
h      t1, t2, t3        (f, g)   t1, t2, t3
i      t1, t2, t3        (g, h)   t1, t2, t3
j      t2, t3            (h, i)   t1, t2, t3
k      t2, t3            (j, f)   t3
l      t4                (j, k)   t2, t3
m      t4                (k, e)   t2, t3
                         (l, m)   t4

Table 2: Traceability matrix between the activity diagram and test suite

Figure 2: A traceability chain of links between requirements attributes and test cases [Requirements Attributes are represented by the activity diagram as Activity Diagram Elements, which are tracked by the traceability coverage matrix to Test Cases.]

3.2 Regression analysis and test case selection

By analyzing changes in existing systems, we conclude that there are really only two types of changes: changes in the specification and changes in the code. Changes in the code mean that changes happen only in the implementation, without affecting the specification. In the case of changes in the specification, which mean changes in requirements or design, the activity diagram needs to be modified accordingly. For changes in the code only, the activity diagram will stay the same.

3.2.1 Changes in code

This type of change occurs quite often during the development process, that is, when developers fix defects, they usually change only the code. In this case, the system behaviors (requirements and specifications) are not changed. Therefore, we do not need to change the activity diagram.

For a developer, any changes in code should be documented in a code change history. In our approach, this document needs to identify all nodes and edges in the activity diagram whose corresponding implementation has been changed.

For a tester, traceability needs to be established between the test cases and the log of defects. We refer to this as a test profile. When opening a defect, the tester is required to associate the defect with the specification. In our approach, defects are assigned against the activity diagram, which means the elements (nodes and edges) of the activity diagram where the defects occurred have to be identified.

In practice, developers may omit some defects, and the code change history documentation may not be complete. With our approach, testers can select regression tests relying not only on documents from the developer, but also on test profiles owned by testers.

After identifying all elements whose implementation has changed based on the code change history and the test profile, the tester can choose the test cases that need to be included in the regression test suite using the traceability matrix.

In our example, according to Table 2, if a defect is opened against node g, test cases t1, t2, and t3 have to be included in the regression test suite; if the implementation of edge (c, l) is modified, t4 needs to be rerun for regression testing.
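Given a code change history and a test profile that name the affected nodes and edges, the selection itself is a lookup in the traceability matrix. A minimal sketch (function and variable names are hypothetical; the coverage excerpt comes from Table 2):

```python
def select_targeted_tests(changed_nodes, changed_edges, node_cov, edge_cov):
    """Union of all test cases covering any changed node or edge (a Table 2 lookup)."""
    selected = set()
    for node in changed_nodes:
        selected |= node_cov.get(node, set())
    for edge in changed_edges:
        selected |= edge_cov.get(edge, set())
    return selected

# Excerpt of Table 2 as coverage maps (node/edge -> covering test cases).
node_cov = {"g": {"t1", "t2", "t3"}}
edge_cov = {("c", "l"): {"t4"}}

# The example from the text: a defect opened against node g, and edge (c, l) modified.
print(sorted(select_targeted_tests({"g"}, set(), node_cov, edge_cov)))          # ['t1', 't2', 't3']
print(sorted(select_targeted_tests(set(), {("c", "l")}, node_cov, edge_cov)))   # ['t4']
```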

3.2.2 Changes in specifications

In a control flow graph (CFG), nodes are used to represent statements, and edges represent the control flow between the statements within a procedure (a CFG can also be used to represent inter-process control flow).


Figure 3: Example control-flow graph C (left) and its modified version C’ (right)

As an example, comparing Figure 1 with Figure 3, we find that the activity diagram is quite similar to the CFG, except that the nodes of the activity diagram describe activities instead of statements.

3.2.2.1 A CFG-based regression test selection technique

Rothermel, Harrold, and Dedhia present a control-flow based regression test selection algorithm. They use CFGs to represent the implementations of procedures P and P', and use edges in the CFGs as potential affected entities [11]. An affected entity is an entity that is affected (changes its behavior) by the modification. By traversing the CFG for P and the CFG for P' in parallel, affected entities are identified: whenever the targets of like-labeled CFG edges in P and P' differ, that edge is added to the set of affected entities.

In Figure 3, we show a sample CFG C on the left with its modified version C’ on the right.

From C to C', a node 4a is inserted and edge (6, 2) is changed to edge (6, 3). The algorithm begins the traversal at the entry nodes of C and C', and traverses like paths in the two graphs by following like-labeled edges until it detects a difference in the target nodes of these edges. When the algorithm reaches node 3 in C and C', it finds that the targets of the branches labeled "T" differ. It adds edge (3, 4) to the set of affected entities and stops its traversal along this path. The algorithm then considers the edges labeled "F" from node 3. When it reaches node 6 in C and C', it discovers that the targets of the "out" edges differ; therefore, it adds edge (6, 2) to the set of affected entities, and stops its traversal along this path. There might be changes that occur later in the same path, but a test case will certainly pass through the first change before it reaches them, so identifying the first change is enough for identifying test cases affected by later changes. No additional affected edges are found in subsequent traversals.

After all affected edges have been identified, they are used with the edge-coverage matrix to select test cases.
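A sketch of this traversal is shown below. It is our reading of the description above, not the authors' published algorithm: each CFG is a map from a node to its outgoing edges keyed by label, and the comparison is on target node labels, which is sufficient for the Figure 3 example.

```python
def affected_edges(old_graph, new_graph, entry="entry"):
    """Parallel traversal of two CFGs; returns old-graph edges whose like-labeled
    counterparts in the new graph lead to a different target node.
    Each graph maps node -> {edge_label: target_node}."""
    affected, visited, worklist = set(), set(), [entry]
    while worklist:
        node = worklist.pop()
        if node in visited or node not in old_graph:
            continue
        visited.add(node)
        for label, old_target in old_graph[node].items():
            new_target = new_graph.get(node, {}).get(label)
            if new_target == old_target:
                worklist.append(old_target)       # targets agree: keep walking
            else:
                affected.add((node, old_target))  # first difference: record and stop this path
    return affected

# Figure 3 transcribed as adjacency maps ("T"/"F" for branches, "out" for unconditional edges).
C = {
    "entry": {"out": "1"}, "1": {"out": "2"}, "2": {"T": "3", "F": "7"},
    "3": {"T": "4", "F": "5"}, "4": {"out": "exit"}, "5": {"out": "6"},
    "6": {"out": "2"}, "7": {"out": "8"}, "8": {"out": "exit"},
}
C_prime = dict(C, **{"3": {"T": "4a", "F": "5"},
                     "4a": {"out": "4"},          # inserted node; its outgoing edge is assumed
                     "6": {"out": "3"}})

affected = affected_edges(C, C_prime)
print(sorted(affected))                           # [('3', '4'), ('6', '2')]

# Selecting regression tests with an edge-coverage matrix excerpt (cf. Table 3 below):
edge_matrix = {("3", "4"): {"t1", "t2"}, ("6", "2"): {"t2"}}
selected = set().union(*(tests for edge, tests in edge_matrix.items() if edge in affected))
print(sorted(selected))                           # ['t1', 't2']
```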

Suppose that for C in Figure 3 we have a test suite T consisting of test cases t1, t2, and t3, and that the edge-coverage matrix for this test suite is as shown in Table 3.

Edge         Test Case
(entry, 1)   t1, t2, t3
(1, 2)       t1, t2, t3
(2, 3)       t1, t2
(3, 4)       t1, t2
(4, exit)    t1, t2
(3, 5)       t2
(5, 6)       t2
(6, 2)       t2
(2, 7)       t3
(7, 8)       t3
(8, exit)    t3

Table 3: Edge-coverage matrix for test suite T on CFG C

Using the edge-coverage matrix and the set of affected entities, we can search the matrix for the affected entities and easily select the corresponding regression test cases. In our example, with affected edges (3, 4) and (6, 2), test cases t1 and t2 should be added to the regression test suite.

We now adapt this approach to specification-based analysis based on the activity diagram.

3.2.2.2 Regression analysis with the activity diagram

As we mentioned before, the first category of regression test cases is the Targeted Tests, which test customer-visible affected entities. To select this type of test case we have to identify affected entities. This process is called regression analysis.

The regression test selection technique that we present is based on the activity diagram. Since the activity diagram is quite similar to the CFG, we apply the CFG-based algorithm to analyze the activity diagram for regression test selection.

We have already gathered requirements information and created a requirements traceability coverage matrix during test case design. Like the CFG-based technique described previously, our technique has two main steps. First we traverse the activity diagram to identify affected edges. Then we select the test cases that execute the affected edges, based on the traceability matrix, to create our Targeted Tests.

4 Risk-based test case selection

In this section we discuss issues related to selecting Safety Tests based on risk analysis. Section 4.1 gives the motivation for our research. In Section 4.2 we discuss how to select test cases for Safety Tests. In Section 4.3 we talk about choosing end-to-end scenarios for Safety Tests.

4.1 Motivation for risk-based selection

Since our regression analysis is specification-based, if the developer changes code that cannot be identified by a test profile (no defects were detected by the previous tests) and does not update the code change history, the tester might miss some defects. To achieve enough confidence, we would like to include some test cases in addition to the Targeted Tests in our regression test suite.

Rothermel et al. [11] separate regression testing into two phases: the preliminary phase and the critical phase. In the preliminary phase, developers enhance and correct software. When corrections are complete, the critical phase of regression testing begins. During the critical phase, time is limited by product release deadlines.

Risk-based testing focuses testing and spends more time on key functions and exception handling [19, 23]. We recommend a risk-based method to select Safety Tests, the second category of regression test cases in our approach. This category of test cases is useful in two situations:

• After running the Targeted Tests, if time and resources are available, testers may want to rerun some test cases other than the Targeted Tests to gain more confidence in the modified system.

• The time to delivery may be extremely short, and there may be no time left for the critical phase due to development schedule overrun in the preliminary phase. Safety Tests provide some assurance that the remaining defects in the release will not bring about serious failures.

In both cases, Safety Tests are a good choice. In the following two sections, we present issues of selecting Safety Tests. First, we talk about the steps in test case selection. Second, we discuss a method for selecting end-to-end scenarios.

4.2 Model-based selection of Safety Tests

Our approach uses the risk model presented in Section 2 for risk analysis. First, we calculate the Risk Exposure RE(t) for each test case. Then we choose test cases based on RE(t). We present our approach with an example; the data in this example comes from a real product.


4.2.1 Safety Test selection method

There are four main steps in our approach.

Step 1. Assess the cost for each test case

Cost means the cost of the requirements attributes that this test case covers. Using requirements traceability as discussed in Section 3, we can associate the cost of requirements attributes with test cases.

Cost is categorized on a one-to-five scale, where one is low and five is high. C(t) for the top 20% of test cases with the highest cost will be five, and C(t) for the bottom 20% of test cases will be one. Two kinds of costs will be taken into consideration:

• The consequences of a fault as seen by the customer, for example, losing market share because of faults.

• The consequences of a fault as seen by the vendor, for example, high software maintenance costs because of faults.

Table 4 shows the costs for some test cases in our case study.

Step 2. Derive severity probability for each test case

After running the full test suite, we can sum up the number of defects uncovered by each test case. Learning from real in-house testing, we find that the severity of defects (how important or serious the defect is) is very important for software quality.

Test Case   C(t) (1-5)
t0010       5
t0020       5
t0030       3
t0040       3
t0050       3
t0060       3
t0070       3
…           …

Table 4: Cost of test cases

Considering this aspect, we modify the simple risk model that we proposed in Section 2. The probability element in the original risk model is changed to severity probability, which combines the average severity of defects with the probability. Based on multiplying the Number of Defects N by the Average Severity of Defects S (N × S), we can estimate the severity probability for each test case.

Severity probability falls on a zero-to-five scale, where zero is low and five is high. For test cases without any defect, P(t) is equal to zero. For the rest of the test cases, P(t) for the top 20% of test cases with the highest estimate N × S will be five, and P(t) for the bottom 20% will be one. Table 5 displays P(t) for some of the test cases in our case study.

Step 3. Calculate Risk Exposure for each test case

Combining Table 4 and Table 5, we can calculate the Risk Exposure RE(t) for each test case, as shown in Table 6.

Test Case   Number of Defects (N)   Average Severity of Defects (S)   N × S   P(t) (0-5)
t0010       1                       2                                 2       2
t0020       1                       3                                 3       2
t0030       1                       1                                 1       1
t0040       1                       4                                 4       3
t0050       2                       3                                 6       4
t0060       3                       3.5                               10.5    5
t0070       0                       0                                 0       0
…           …                       …                                 …       …

Table 5: Severity probability of test cases


Test Case   C(t) (1-5)   Number of Defects (N)   Average Severity of Defects (S)   N × S   P(t) (0-5)   RE(t) = C(t) × P(t)
t0010       5            1                       2                                 2       2            10
t0020       5            1                       3                                 3       2            10
t0030       3            1                       1                                 1       1            3
t0040       3            1                       4                                 4       3            9
t0050       3            2                       3                                 6       4            12
t0060       3            3                       3.5                               10.5    5            15
t0070       3            0                       0                                 0       0            0
…           …            …                       …                                 …       …            …

Table 6: Risk Exposure of test cases
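Steps 1 through 3 amount to percentile bucketing followed by a multiplication, which is easy to script (or to carry out in a spreadsheet, as Section 5 notes). The sketch below uses the data of Tables 4 and 5; the bucketing helper is our own reading of the "top 20% gets five, bottom 20% gets one" rule, and the intermediate bands are an assumption, since the paper fixes only the extremes:

```python
def bucket_1_to_5(values):
    """Map each raw value to 1..5 by 20% bands of its rank: the highest 20% get 5,
    the lowest 20% get 1. (The paper fixes only the extremes; the middle bands
    are our assumption.)"""
    order = sorted(range(len(values)), key=lambda i: values[i])
    scores = [0] * len(values)
    for rank, idx in enumerate(order):
        scores[idx] = 1 + min(4, (5 * rank) // len(values))
    return scores

# Data from Tables 4 and 5: (test case, C(t), number of defects N, average severity S).
rows = [("t0010", 5, 1, 2), ("t0020", 5, 1, 3), ("t0030", 3, 1, 1), ("t0040", 3, 1, 4),
        ("t0050", 3, 2, 3), ("t0060", 3, 3, 3.5), ("t0070", 3, 0, 0)]

# Step 2: severity probability P(t) -- zero for defect-free tests, bucketed N*S otherwise.
with_defects = [(name, c, n * s) for name, c, n, s in rows if n > 0]
p_scores = bucket_1_to_5([ns for _, _, ns in with_defects])
p = {name: 0 for name, _, _, _ in rows}
p.update({name: score for (name, _, _), score in zip(with_defects, p_scores)})

# Step 3: Risk Exposure RE(t) = C(t) * P(t).
re_t = {name: c * p[name] for name, c, _, _ in rows}
print(re_t["t0060"], re_t["t0070"])  # 15 0
```

With only these seven example rows, the middle buckets may not match Table 5 exactly, but the highest- and lowest-risk entries (t0060 and t0070) agree with Table 6.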

Step 4. Select Safety Tests

To get "good enough quality," our regression test suite should achieve some coverage target. This coverage target has to be set depending on the available time and budget.

Intuitively, we do not want to choose low-risk test cases, such as t0070, because the resources used to run t0070 could be used more profitably to run other test cases. Specifically, we choose the test cases with the highest values of RE(t) from the non-targeted tests in Table 6 until we reach our coverage target. To improve or focus the results, we can also add weights to test cases that we want to give preference to; for example, we might choose more test cases for corporate flagship features, such as database features.

In our case study, we selected 30% of the full test suite to form a regression test suite. Table 7 shows part of the test suite; 1 means the test case is selected and 0 means the test case is omitted.

Test Case   Full Test Suite   Regression Test Suite (30%)
t0010       1                 1
t0020       1                 1
t0030       1                 0
t0040       1                 0
t0050       1                 1
t0060       1                 1
t0070       1                 0
…           …                 …

Table 7: Test cases selected (indicated by 1)

4.3 Risk-based end-to-end test scenario selection

Industrial regression testers also utilize user-oriented, end-to-end scenarios. End-to-end scenarios simulate common user profiles of system use; some are equivalent to use cases. Usually the scenarios are documented while capturing requirements or early in the system design phase. Compared to test cases, end-to-end scenarios are more customer-directed.

Since end-to-end scenarios involve many components of the system working together, they are highly effective at finding regression faults. When test time is limited, we might not be able to run all end-to-end scenarios for the system. Thus, to reduce risk efficiently, our selection strategy obeys two rules:

R1: Select scenarios that test the most critical requirements attributes.

R2: Have the test suite cover as many requirements attributes as possible.

One end-to-end scenario consists of several test cases. Since every test case tests one or more requirements attributes, the rules above are equivalent to:

T1: Select scenarios that cover the most critical test cases.

T2: Have the suite of scenarios cover as many test cases as possible.

This is similar to the "High-Yield" test design strategy [19, 23].

Before doing scenario selection, we have to determine the relation between scenarios and test cases. Table 8 is an example of the corresponding traceability matrix.


4.3.1 End-to-end scenario selection method

Step 1. Calculate Risk Exposure for each scenario

In Table 6 we show the Risk Exposure for each test case. Since scenarios consist of test cases, we can calculate the Risk Exposure of each scenario simply by summing the Risk Exposures of all test cases that the scenario covers, based on the information in Table 8. If one scenario s consists of n test cases, the formula for RE(s) is:

RE(s) = Σ RE(t_i), 1 ≤ i ≤ n, where test case t_i is covered by scenario s.

Test Case   s001   s002   s003   …
t0010       1      0      0      …
t0020       1      0      0      …
t0030       1      1      0      …
t0040       1      0      1      …
t0050       1      1      0      …
t0060       0      1      0      …
t0070       0      1      1      …
…           …      …      …      …

(0: the scenario does not cover the test case; 1: the scenario covers the test case)

Table 8: Traceability between test cases and scenarios

Applying the formula to our example, we calculate the Risk Exposure for all scenarios as shown in Table 9.

Scenario   RE(s)
s001       985
s002       463
s003       732
s004       213
s005       195
s006       127
s007       70
…          …

Table 9: Risk Exposure for scenarios

Step 2. Select the scenario that has the highest RE(s)

The scenario with the highest RE(s) covers the most critical test cases, that is, it has the highest coverage of test cases. According to our selection rules, it should be included in the regression test suite. In our example, scenario s001 is put first into the regression test suite.

Step 3. Update Table 8 and rebuild Table 9

When running the chosen scenario, all test cases covered by the scenario will be executed. According to our selection rules, these test cases should not affect our selection any more; we should focus on the test cases that have not yet been executed.

In our approach, after the chosen scenario has been executed, we cross out the column for the chosen scenario and the rows for all test cases covered by this scenario in Table 8. Revising Table 8, we calculate RE(s) again for the remaining scenarios and rebuild Table 9.

In our example, Table 8 and Table 9 are updated to Table 8' and Table 9', respectively. In Table 8', the column for s001 and the rows for the test cases it covers (t0010 through t0050) are crossed out.

Test Case   s001   s002   s003   …
t0010       1      0      0      …
t0020       1      0      0      …
t0030       1      1      0      …
t0040       1      0      1      …
t0050       1      1      0      …
t0060       0      1      0      …
t0070       0      1      1      …
…           …      …      …      …

(0: the scenario does not cover the test case; 1: the scenario covers the test case)

Table 8': Updated traceability between test cases and scenarios

Step 4. Repeat Step 2 and Step 3 until we run out of time and resources

Using Table 9', we next select s003 and repeat the tests.


Scenario   RE(s)
s002       356
s003       611
s004       176
s005       180
s006       96
s007       68
…          …

Table 9': Rebuilt table of Risk Exposure for scenarios
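Steps 1 through 4 form a greedy loop: score each remaining scenario by the summed RE(t) of its still-uncovered test cases, pick the highest, remove what it covers, and repeat while time and resources remain. A minimal sketch (only the Table 8 excerpt is known, so the scores below do not reproduce Tables 9 and 9' exactly, and the integer budget is a stand-in for the real time and resource limit):

```python
def select_scenarios(scenario_tests, re_t, budget):
    """Greedy Safety-Test scenario selection: repeatedly pick the scenario whose
    not-yet-covered test cases carry the largest total Risk Exposure."""
    covered, chosen = set(), []
    remaining = dict(scenario_tests)   # scenario -> set of test cases it runs
    for _ in range(budget):            # 'budget' stands in for available time and resources
        if not remaining:
            break
        # RE(s) = sum of RE(t) over the scenario's still-uncovered test cases.
        scores = {s: sum(re_t.get(t, 0) for t in tests - covered)
                  for s, tests in remaining.items()}
        best = max(scores, key=scores.get)
        if scores[best] == 0:
            break                      # nothing left worth running
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen

# Excerpt of Table 8 (scenario -> covered test cases) and of Table 6 (RE values).
scenario_tests = {
    "s001": {"t0010", "t0020", "t0030", "t0040", "t0050"},
    "s002": {"t0030", "t0050", "t0060", "t0070"},
    "s003": {"t0040", "t0070"},
}
re_t = {"t0010": 10, "t0020": 10, "t0030": 3, "t0040": 9,
        "t0050": 12, "t0060": 15, "t0070": 0}

print(select_scenarios(scenario_tests, re_t, budget=2))  # ['s001', 's002']
```

With the full case-study data the second pick is s003, as Table 9' shows; the excerpt above is simply too small to reflect that.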

The size of our test suite of scenarios is dependent on the time and resources available. Regression testing is terminated whenever we run out of time and resources. The final selected regression test suite is the union of the Targeted Tests (Section 3) and the Safety Tests (Section 4).

5 Evaluation

To assess the validity of our approach, we applied it to three components of IBM WebSphere Commerce 5.4. Each component was owned by one experienced tester. In total, there were 306 test cases in the full test suite.

First we ran the full test suite and got a test profile, which we defined as the first pass test profile. In the first pass, 65 defects were opened. Then we applied our approach to select a regression test suite based on the first pass test profile. The three experienced testers manually chose regression test suites for the components they owned. After the defects were fixed by developers, we ran the full test suite again to get the total number of defects in the three components for our comparison. We called this run the second pass. Nine defects were found, which failed 28 test cases.

Our three assessment factors are effectiveness, cost-efficiency, and sensitivity to risk:

1. Effectiveness: Since we ran all test cases in the second pass, we knew the number of existing defects, which was nine. A good regression test suite should be able to find all nine defects. The regression test suite selected by our approach found all nine defects, while the regression test suite selected by the testers only found seven defects. Thus, in our case study, our approach was more effective in finding defects.

2. Cost-efficiency: Our approach is more objective and involves only straightforward calculations, easily carried out by simple spreadsheet software. Using more powerful tools, the selection process can be automated completely. Note that the three testers made a subjective selection; their selection process was based on the testers' experience and had to be done manually. Thus, our approach can be implemented in tools and has good potential to guide regression test selection, particularly for new or inexperienced test personnel.

3. Sensitivity to Risk: Our approach is risk-based. Therefore, we are able to use risk metrics to quantitatively measure the safety of a test suite. Table 10 is the risk matrix we obtained from our case study. As we can see in the matrix, the Risk Exposure coverage and average Risk Exposure of our regression test suite are much higher than those of the suite selected manually by the testers, which means that our test suite covers more critical test cases. As well, our test case coverage was slightly better.

                       Risk-based Regression Test Suite   Manual Regression Test Suite
Test Case Coverage     31.7%                              29.1%
RE(t) Coverage         52.6%                              30.5%
Average RE(t)          16.2                               10.4

Table 10: Risk coverage report
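One plausible reading of the three metrics (the paper does not spell out the formulas): test case coverage is the fraction of the full suite that was selected, RE(t) coverage is the selected suite's share of the total Risk Exposure, and average RE(t) is the mean Risk Exposure of the selected test cases. A sketch with illustrative values only:

```python
def risk_coverage_report(re_t, selected):
    """Risk metrics in the spirit of Table 10 (formulas are our interpretation)."""
    total_re = sum(re_t.values())
    selected_re = sum(re_t[t] for t in selected)
    return {
        "test case coverage": len(selected) / len(re_t),
        "RE(t) coverage": selected_re / total_re if total_re else 0.0,
        "average RE(t)": selected_re / len(selected) if selected else 0.0,
    }

# Illustrative values, not the case-study data.
re_t = {"t0010": 10, "t0020": 10, "t0030": 3, "t0040": 9,
        "t0050": 12, "t0060": 15, "t0070": 0}
print(risk_coverage_report(re_t, selected={"t0020", "t0050", "t0060"}))
```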

6 Related work

Many researchers are investigating regression testing techniques. Their research spans a wide variety of topics. For example, Brown and Hoffman [13] work on test environments and automation of the regression testing process. Harrold, Gupta, and Soffa [14], and Wong et al. [15] address test suite management. Rothermel and Harrold [7] present a framework to evaluate regression test selection techniques.

However, very few research projects involve regression test selection techniques that are specification-based [4, 5]. In practice, some companies, such as IBM, have developed specification-based regression test strategies. The major problem with these regression testing methods is that the selection criteria are somewhat subjective; that is, there are too many personal decisions involved. Thus, for a manager, it is very hard to measure whether a regression test suite is good enough to ensure a customer's business won't be at risk.

Some researchers, especially those from industry, are very interested in the idea of risk-based testing. Pfleeger [17] discusses risk management. Amland [9] presents fundamentals and metrics of risk-based testing. Bach works on methodologies for risk analysis [16]. Our approach appears to be the most comprehensive and practical specification-based regression test selection technique to date.

7 Conclusions and future work

In this paper we have presented a specification-based (black box) regression test selection technique with results from a small but real industrial case study. This technique is customer-oriented and also risk-based. It provides methods to obtain both Targeted Tests and Safety Tests. The case study indicates that the technique is effective in finding defects.

Our future work includes using more components for empirical case studies to evaluate the effectiveness of our technique, developing metrics (for example, specification coverage) to help decide when to stop regression testing, and implementing our approach in a production test environment. We will also use statistical analysis tools to aid in determining the cost-effectiveness of our technique in practice.

Acknowledgments

This work is supported by IBM Canada Ltd., NSERC (the Natural Sciences and Engineering Research Council), and CITO (Communications and Information Technology Ontario). Heather Fry, Haiying Xu and Terry Chu provided data and test suites for the case study. Reviewers provided many comments and suggestions that helped improve this paper.

© Copyright IBM Canada Limited, Yanping Chen, and Robert L. Probert 2002.

IBM is a registered trademark of International Business Machines Corporation in the United States, other countries, or both.

About the authors

Yanping Chen received her B.A.Sc. in computer science from East China Normal University, China, in 1995. She is currently a Master's student (Computer Science) in SITE, the School of Information Technology and Engineering at the University of Ottawa, and a CITO software engineering scholar at IBM. Under the supervision of Dr. Robert Probert, she is working on software testing. Her interests include formal verification and validation, regression testing, automatic test case generation, and TTCN. Since January 2002, she has worked as a software test process improvement advisor in the Electronic Commerce Department, IBM.

Reverend Robert L. Probert (Ph.D., Computer Science, University of Waterloo, 1973) is Full Professor in the School of Information Technology and Engineering (SITE) and Coordinator of the Nortel Networks Advanced Software Engineering Research and Training (ASERT) Laboratory at the University of Ottawa. He is a Principal Investigator in communications software engineering and protocols for Communications and Information Technology Ontario (CITO), an Ontario Centre of Excellence. He was the founding Director of SITE in 1997. Dr. Probert co-chaired the 10th International IFIP Symposium on Protocol Specification, Testing, and Verification, and TestCom 2000, the 13th International IFIP Conference on Testing Communicating Systems. He co-founded the ACM Symposium on Principles of Distributed Computing. He has been a Visiting Scientist with IBM CAS since 1998, specializing in e-commerce testing. He has taught at the Universities of Waterloo, Saskatchewan, and Ottawa, and has also been a Visiting Researcher in Software Engineering at the GE R&D Center, New York, and various Nortel Labs. His research interests and publications are primarily in testing protocols and communications software quality engineering. He is a deacon in the Catholic church, and is married with one son working in multi-media.


D. Paul Sims joined IBM after earning a B.A.Sc. in Engineering Science from the University of Toronto in 1984. He has participated in a variety of projects that include Series/1-based hardware for the Prodigy videotext service (a joint venture between IBM, Sears, and CBS Records), the first ISDN basic-rate adapters for 9370 and AS/400 systems, ISDN Q.921 microcode, AS/400 Communications Utilities, CD Showcase CDs that helped sell AIX and AS/400 application development tools, IBM Distributed Debugger development manager, Application Development Technology Centre operations manager, WebSphere Commerce Suite system test, and ECD TEST Continual Improvement. During his first six years at IBM, Paul earned an M.Eng. in Electrical Engineering from U of T by taking courses and completing a project part-time through the Graduate Work Study Program. Paul is a licensed Professional Engineer in the Province of Ontario.

References

[1] Mary Jean Harrold, James A. Jones, Tongyu Li, and Donglin Liang, Regression Test Selection for Java Software, Proceedings of the ACM Conference on OO Programming, Systems, Languages, and Applications (OOPSLA '01), 2001.

[2] David Binkley, Semantics Guided Regression Test Cost Reduction, IEEE Transactions on Software Engineering, Vol. 23, No. 8, August 1997, pp. 498-516.

[3] David C. Kung, Jerry Gao, and Pei Hsia, Class Firewall, Test Order, and Regression Testing of Object-Oriented Programs, Journal of Object-Oriented Programming, Vol. 8, No. 2, May 1995, pp. 51-65.

[4] H.K.N. Leung and L.J. White, A study of integration testing and software regression at the integration level, Proceedings of the Conference on Software Maintenance, November 1990, pp. 290-300.

[5] Anneliese von Mayrhauser, Richard Mraz, Jeff Walls, and Pete Ocken, Domain Based Testing, Proceedings of the Conference on Software Maintenance, September 1994, pp. 26-35.

[6] Hareton K.N. Leung and Lee J. White, A Cost Model to Compare Regression Test Strategies, Proceedings of the Conference on Software Maintenance, 1991, pp. 201-208.

[7] Gregg Rothermel and Mary Jean Harrold, Analyzing Regression Test Selection Techniques, IEEE Transactions on Software Engineering, Vol. 22, No. 8, August 1996, pp. 529-551.

[8] John D. McGregor and David A. Sykes, A Practical Guide to Testing Object-Oriented Software, Addison-Wesley, 2001.

[9] Ståle Amland, Risk Based Testing and Metrics: Risk analysis fundamentals and metrics for software testing including a financial application case study, The Journal of Systems and Software, Vol. 53, 2000, pp. 287-295.

[10] Robert V. Binder, Testing Object-Oriented Systems, Addison-Wesley, 2000.

[11] Gregg Rothermel, Mary Jean Harrold, and Jeinay Dedhia, Regression Test Selection for C++ Software, Journal of Software Testing, Verification, and Reliability, Vol. 10, No. 2, June 2000.

[12] James Rumbaugh, Ivar Jacobson, and Grady Booch, The Unified Modeling Language Reference Manual, Addison-Wesley, 1999.

[13] P. A. Brown and D. Hoffman, The Application of Module Regression Testing at TRIUMF, Nuclear Instruments and Methods in Physics Research, Section A, A293, Vol. 1, No. 2, August 1990, pp. 337-381.

[14] Mary Jean Harrold, R. Gupta, and M. L. Soffa, A Methodology for Controlling the Size of a Test Suite, ACM Transactions on Software Engineering and Methodology, Vol. 2, No. 3, July 1993, pp. 270-285.

[15] W. E. Wong, J. R. Horgan, S. London, and A. P. Mathur, Effect of Test Set Minimization on Fault Detection Effectiveness, 17th International Conference on Software Engineering, April 1995, pp. 41-50.

[16] James Bach, Heuristic Risk-Based Testing, Software Testing and Quality Engineering Magazine, November 1999, pp. 96-98.

[17] Shari Lawrence Pfleeger, Risky Business: What We Have Yet to Learn about Risk Management, Journal of Systems and Software, Vol. 53, 2000, pp. 265-273.

[18] Halim Ben Hajia, Traceability in Object-Oriented Quality Engineering: A Basis for Regression Analysis of Object-Oriented Software, Master's Thesis, University of Ottawa, Canada, 1997.

[19] Glenford J. Myers, The Art of Software Testing, Wiley-Interscience, 1979.

[20] Martin Fowler and Kendall Scott, UML Distilled, Second Edition: A Brief Guide to the Standard Object Modeling Language, Addison-Wesley, 2000.

[21] James Bach, Good Enough Quality: Beyond the Buzzword, IEEE Computer, August 1998, pp. 96-98.

[22] James Odell, Advanced Object-Oriented Analysis and Design using UML, Cambridge University Press, 1998.

[23] K. Saleh, R. L. Probert, W. Li, and W. Fong, An approach for high-yield requirements capture for e-commerce and its application, International Journal on Digital Libraries, Vol. 3, No. 4, May 2002, pp. 302-308.