Feedback-Based Specification, Coding and Testing…

…with JWalk

Anthony J H Simons, Neil Griffiths and Christopher D Thomson

Overview

Lazy Systematic Unit Testing
• JWalk testing tool (Simons)

The JWalkEditor tool
• Integrated Java editor and JWalk (Griffiths)

Feedback-based methodology
• Prototype, validate, specify, test

Evaluation of cost-effectiveness
• Testing effort, time, coverage (Thomson)

http://www.dcs.shef.ac.uk/~ajhs/jwalk/

Motivation

State of the art in agile testing
• Test-driven development is good, but…
• …no specification to inform the selection of tests
• …manual test-sets are fallible (missing, redundant cases)
• …reusing saved tests for conformance testing is fallible: state partitions hide paths, faults (Simons, 2005)

Lazy systematic testing method: the insight
• Complete testing requires a specification (even in XP!)
• Infer an up-to-date specification from a code prototype
• Let tools handle systematic test generation and coverage
• Let the programmer focus on novel/unpredicted results

Lazy Systematic Unit Testing

Lazy Specification
• late inference of a specification from evolving code
• semi-automatic, by static and dynamic analysis of code with limited user interaction
• specification evolves in step with modified code

Systematic Testing
• bounded exhaustive testing, up to the specification
• emphasis on completeness, conformance, correctness properties after testing, repeatable test quality

http://en.wikipedia.org/wiki/Lazy_systematic_unit_testing

JWalk Testing Tool

Lazy systematic unit testing for Java
• static analysis: extracts the public API of a compiled Java class
• protocol walking (code exploration): executes all interleaved methods to a given path depth
• algebraic testing (memory states): validates all observations on all mutator-method sequences
• state-based testing (high-level states): validates all state-transitions (n-switch coverage) for inferred high-level states
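The first two steps can be pictured with a small sketch built on java.lang.reflect. It lists the public methods a class declares, then enumerates every interleaved sequence of those methods up to a chosen path depth. This is an illustration only, not JWalk's own code: the names ProtocolWalkSketch, Counter, publicApi and paths are hypothetical, and the sketch merely enumerates method names rather than invoking them.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch, not JWalk's actual implementation.
class ProtocolWalkSketch {

    // Tiny class under test, assumed here purely for illustration.
    static class Counter {
        private int n;
        public void inc() { n++; }
        public int get() { return n; }
    }

    // Static analysis: names of the public methods declared by the class itself.
    // Overloads collapse to one name in this simplified version.
    static List<String> publicApi(Class<?> type) {
        TreeSet<String> names = new TreeSet<>();   // sorted for a stable order
        for (Method m : type.getDeclaredMethods()) {
            if (Modifier.isPublic(m.getModifiers()) && !m.isSynthetic()) {
                names.add(m.getName());
            }
        }
        return new ArrayList<>(names);
    }

    // Protocol walking: every interleaved method sequence of length 1..depth.
    static List<List<String>> paths(List<String> api, int depth) {
        List<List<String>> all = new ArrayList<>();
        List<List<String>> frontier = new ArrayList<>();
        frontier.add(new ArrayList<>());           // the empty path: a fresh object
        for (int d = 1; d <= depth; d++) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> prefix : frontier) {
                for (String method : api) {
                    List<String> path = new ArrayList<>(prefix);
                    path.add(method);
                    next.add(path);
                }
            }
            all.addAll(next);
            frontier = next;
        }
        return all;
    }

    public static void main(String[] args) {
        List<String> api = publicApi(Counter.class);
        for (List<String> path : paths(api, 2)) {
            System.out.println("target." + String.join("().", path) + "()");
        }
    }
}
```

With k public methods the number of paths to depth n is k + k^2 + … + k^n, which is why the later steps prune sequences and predict results rather than asking the programmer about each one.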


JWalkEditor

Integration of JWalk with a Java-editor

Editing features
• Java-sensitive
• Integrated with JDK compiler tools – track exceptions to source

Testing features
• Invokes JWalk (6 validation/testing modes)
• Confirm/reject key results via dialogs
• Browse test result-sets via tabbed panes

Snapshot – Editing a Stack

[Screenshot callouts]
• Syntax sensitive text highlights
• Full build cycle, multiple sources
• Set JWalk test parameters

Snapshot – Testing a Stack

[Screenshot callouts]
• Tabbed pane with test results for the Empty state, for all paths of depth 2
• Colour-coded test outcomes
• Colour-coded test sequences

Dynamic Analysis of States

Memory states
• generate all interleaved method paths to depth 1..n
• prune sequences ending in observers from the active edges, preserving mutator sequences
• distinguish observer/mutator methods by empirical low-level state comparison (extracted by reflection)

High-level states
• generate all mutator-method paths (as above)
• evaluate all state predicates, eg: isEmpty(), isFull()
• seek states corresponding to the product of boolean outcomes, viz: {Default, Empty, Full, Full&Empty}
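The product of predicate outcomes can be sketched as follows: evaluate each public boolean isXxx() method by reflection and name the state after the predicates that hold, falling back to Default when none does. This is a minimal sketch under assumptions, not JWalk's algorithm: HighLevelStateSketch, BoundedStack and stateOf are hypothetical names, and the reverse-alphabetical ordering is chosen only so that the combined name reads Full&Empty as on the slide.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not JWalk's actual implementation.
class HighLevelStateSketch {

    // Bounded stack assumed here purely for illustration.
    static class BoundedStack {
        private final int[] items;
        private int size;
        public BoundedStack(int capacity) { items = new int[capacity]; }
        public void push(int e) { items[size++] = e; }
        public void pop() { size--; }
        public boolean isEmpty() { return size == 0; }
        public boolean isFull() { return size == items.length; }
    }

    // Evaluate every public, no-argument, boolean "isXxx" predicate and name
    // the state after those that return true; "Default" when none holds.
    static String stateOf(Object target) {
        Map<String, Boolean> outcomes =
                new TreeMap<String, Boolean>(Comparator.reverseOrder());
        for (Method m : target.getClass().getMethods()) {
            boolean isPredicate = m.getName().startsWith("is")
                    && m.getParameterCount() == 0
                    && (m.getReturnType() == boolean.class
                        || m.getReturnType() == Boolean.class);
            if (!isPredicate) {
                continue;
            }
            try {
                outcomes.put(m.getName().substring(2), (Boolean) m.invoke(target));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot evaluate " + m.getName(), e);
            }
        }
        List<String> holding = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : outcomes.entrySet()) {
            if (entry.getValue()) {
                holding.add(entry.getKey());
            }
        }
        return holding.isEmpty() ? "Default" : String.join("&", holding);
    }

    public static void main(String[] args) {
        BoundedStack stack = new BoundedStack(2);
        System.out.println(stateOf(stack));   // nothing pushed yet
        stack.push(1);
        System.out.println(stateOf(stack));   // neither empty nor full
        stack.push(2);
        System.out.println(stateOf(stack));   // at capacity
    }
}
```

A stack of capacity zero is both empty and full, which is how the combined state Full&Empty can arise from the product even though it is unusual in practice.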

Test Result Prediction

Strong prediction
• From known results, guarantee further outcomes in the same equivalence class
• eg: test sequences containing observers in the prefix map onto a shorter sequence
  target.push(e).size().top() == target.push(e).top()

Weak prediction
• From known facts, guess further outcomes; an incorrect guess will be revealed in the next cycle
• eg: methods with void type usually return no result, but may raise an exception
  target.pop() predicted to have no result
  target.pop().size() == -1 reveals an error
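Strong prediction can be sketched as a lookup on a canonical form: drop every observer call from the prefix of a sequence (keeping the final call), then consult the results already confirmed for the shorter sequence. The sketch below is an assumption-laden illustration, not JWalk's implementation: StrongPredictionSketch, canonical and predict are hypothetical names, calls are represented as plain strings, and the observer set is supplied by hand.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch, not JWalk's actual implementation.
class StrongPredictionSketch {

    // Remove observers from the prefix; the last call is always kept,
    // because its result is the observation being predicted.
    static List<String> canonical(List<String> sequence, Set<String> observers) {
        List<String> result = new ArrayList<>();
        int last = sequence.size() - 1;
        for (int i = 0; i <= last; i++) {
            String call = sequence.get(i);
            if (i == last || !observers.contains(call)) {
                result.add(call);
            }
        }
        return result;
    }

    // Predict a result only if the canonical (shorter) sequence is already known.
    static Optional<String> predict(List<String> sequence,
                                    Set<String> observers,
                                    Map<List<String>, String> confirmed) {
        return Optional.ofNullable(confirmed.get(canonical(sequence, observers)));
    }

    public static void main(String[] args) {
        Set<String> observers = Set.of("size()", "top()");
        Map<List<String>, String> confirmed = new HashMap<>();
        confirmed.put(List.of("push(e)", "top()"), "e");   // confirmed by the programmer

        List<String> longer = List.of("push(e)", "size()", "top()");
        System.out.println(canonical(longer, observers));
        System.out.println(predict(longer, observers, confirmed));
    }
}
```

The design point is that the programmer confirms target.push(e).top() once, and every longer sequence that differs only by observers in the prefix is answered from the same entry.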

Feedback-based Methodology

Coding
• The programmer prototypes a Java class in the editor

Validation
• JWalk systematically explores method paths, providing useful instant feedback to the programmer

Specification
• JWalk infers a specification, building a test oracle based on key test results confirmed by the programmer

Testing
• JWalk tests the class to bounded exhaustive depths, based on confirmed and predicted test outcomes
• JWalk uses state-based test generation algorithms

Example – Library Book

Validation
• surprise: target.issue("a").issue("b").getBorrower() == "b"
• violates business rules: fix code to raise an exception

Testing
• all observations on chains of issue(), discharge()
• n-switch cover on states {Default, OnLoan}

public class LibraryBook {
  private String borrower;
  public LibraryBook();
  public void issue(String);
  public void discharge();
  public String getBorrower();
  public Boolean isOnLoan();
}
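One way the fix could look is sketched below, assuming the business rule is that a book already on loan cannot be issued again. The slides give only the method signatures, so the method bodies, the choice of IllegalStateException and the wrapper class LibraryBookSketch are all assumptions made for illustration.

```java
// Hypothetical sketch of the fix; bodies and exception type are assumptions.
class LibraryBookSketch {

    static class LibraryBook {
        private String borrower;

        public LibraryBook() { }

        // Refuse a second issue while the book is on loan (the assumed rule).
        public void issue(String person) {
            if (isOnLoan()) {
                throw new IllegalStateException("already on loan to " + borrower);
            }
            borrower = person;
        }

        // Discharging a book that is not on loan is treated as a null operation.
        public void discharge() { borrower = null; }

        public String getBorrower() { return borrower; }

        public Boolean isOnLoan() { return borrower != null; }
    }

    public static void main(String[] args) {
        LibraryBook target = new LibraryBook();
        target.issue("a");
        try {
            target.issue("b");
        } catch (IllegalStateException e) {
            System.out.println("second issue refused: " + e.getMessage());
        }
        System.out.println("borrower = " + target.getBorrower());
    }
}
```

After this change the sequence that caused the surprise ends in an exception rather than silently replacing the borrower, so the next validation cycle would present a different result for the programmer to confirm.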

Extension – Reservable Book

Validation
• surprise: target.reserve("a").issue("b").getBorrower() == "b"
• violates business rules: override issue() to refuse "b" here

Testing
• all observations on chains of issue(), discharge(), reserve(), cancel()
• n-switch cover on states {Default, OnLoan, Reserved, Reserved&OnLoan}

public class ReservableBook extends LibraryBook {
  private String requester;
  public ReservableBook();
  public void reserve(String);
  public void cancel();
  public String getRequester();
  public Boolean isReserved();
}
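The override might be sketched as below, assuming the rule is that a reserved book may only be issued to the person who reserved it. Only the signatures come from the slides; the bodies, the exception type and the wrapper class ReservableBookSketch are hypothetical, and a minimal stand-in for the base class is included so the example compiles on its own. The slides do not say whether a successful issue clears the reservation, so this sketch leaves it in place.

```java
// Hypothetical sketch of the override; bodies and exception type are assumptions.
class ReservableBookSketch {

    // Minimal stand-in for the base class, refusing a second issue while on loan.
    static class LibraryBook {
        private String borrower;
        public void issue(String person) {
            if (isOnLoan()) {
                throw new IllegalStateException("already on loan");
            }
            borrower = person;
        }
        public void discharge() { borrower = null; }
        public String getBorrower() { return borrower; }
        public Boolean isOnLoan() { return borrower != null; }
    }

    static class ReservableBook extends LibraryBook {
        private String requester;

        public void reserve(String person) { requester = person; }
        public void cancel() { requester = null; }
        public String getRequester() { return requester; }
        public Boolean isReserved() { return requester != null; }

        // While reserved, only the requester may borrow the book.
        @Override
        public void issue(String person) {
            if (isReserved() && !requester.equals(person)) {
                throw new IllegalStateException("reserved for " + requester);
            }
            super.issue(person);
        }
    }

    public static void main(String[] args) {
        ReservableBook target = new ReservableBook();
        target.reserve("a");
        try {
            target.issue("b");
        } catch (IllegalStateException e) {
            System.out.println("issue refused: " + e.getMessage());
        }
        target.issue("a");
        System.out.println("borrower = " + target.getBorrower());
    }
}
```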

Evaluation

User Acceptance
• programmers find JWalk habitable
• they can concentrate on creative aspects (coding) while JWalk handles systematic aspects (validation, testing)

Cost of Confirmations
• not so burdensome, since amortized over many test cycles
• metric: measure amortized confirmations per test cycle

Comparison with JUnit
• propose a common testing objective for manual and lazy systematic testing; evaluate coverage and testing effort
• Eclipse+JUnit vs. JWalkEditor: given the task of testing the “transition cover + all equivalence partitions of inputs”

Amortized Interaction Costs

• number of new confirmations, amortized over 6 test cycles
• con = manual confirmations, > 25 test cases/minute
• pre = JWalk’s predictions, eventually > 90% of test cases

Test class   a1   a2   a3   s1   s2    s3
LibBk con     3    5    7    0    0     5
LibBk pre     2    8   18   18   38   133
ResBk con     3   14   56    0   11    83
ResBk pre     6   27   89   36  241  1649

Columns a1 to a3 are algebra-tests and s1 to s3 are state-tests, at depths 1 to 3.
• eg: ResBk algebra-test to depth 2, 14 new confirmations
• eg: ResBk state-test to depth 2, 241 predicted results
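The "> 90%" figure can be checked against the table by computing, for each test cycle, the share of results that JWalk predicted out of all results (confirmed plus predicted). The sketch below does this for the ReservableBook rows; the class name PredictionShareSketch and the method predictedShare are hypothetical, and the reading of the share as pre / (con + pre) is an assumption about how the slide's percentage is meant.

```java
// Hypothetical helper for checking the table; not part of JWalk.
class PredictionShareSketch {

    // Percentage of results in one test cycle that were predicted, not confirmed.
    static double predictedShare(int confirmed, int predicted) {
        int total = confirmed + predicted;
        return total == 0 ? 0.0 : 100.0 * predicted / total;
    }

    public static void main(String[] args) {
        String[] cycles = {"a1", "a2", "a3", "s1", "s2", "s3"};
        int[] con = {3, 14, 56, 0, 11, 83};        // ResBk con row
        int[] pre = {6, 27, 89, 36, 241, 1649};    // ResBk pre row
        for (int i = 0; i < cycles.length; i++) {
            System.out.printf("%s: %.1f%% predicted%n",
                    cycles[i], predictedShare(con[i], pre[i]));
        }
    }
}
```

For the final ReservableBook cycle this gives 1649 / (1649 + 83), roughly 95%, and the same calculation on the LibraryBook row gives 133 / 138, roughly 96%.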

Comparison with JUnit

Manual testing method
• Manual test creation takes skill, time and effort (eg: ~20 min to develop manual cases for ReservableBook)
• The programmer missed certain corner-cases, eg: target.discharge().discharge() - a nullop?
• The programmer redundantly tested some properties, eg: assertTrue(target != null) - multiple times
• The state coverage for LibraryBook was incomplete, due to the programmer missing hard-to-see cases
• The saved tests were not reusable for ReservableBook, for which all-new tests were written to test new interleavings

Advantages of JWalk

JWalk lazy systematic testing
• JWalk automates test case selection - relieves the programmer of the burden of thinking up the right test cases!
• Each test case is guaranteed to test a unique property
• Interactive test result confirmation is very fast (eg: ~80 sec in total for 36 unique test cases in ReservableBook)
• All states and transitions covered, including nullops, to the chosen depth
• The test oracle created for LibraryBook formed the basis for the new oracle for ReservableBook, but…
• JWalk presented only those sequences involving new methods, and all interleavings with inherited methods

Speed and Adequacy of Testing

Test goal: transition cover + equiv. partitions of inputs
• manual testing expensive, redundant and incomplete
• JWalk testing very efficient, close to complete

Test class      T   TE   TR   Adeq   time (min.sec)
LibBk manual   31    9   22    90%   11.00
ResBk manual  104   21   83    53%   20.00
LibBk jwalk    10   10    0   100%    0.30
ResBk jwalk    36   36    0    90%    0.46

T = tests written, TE = effective tests, TR = the remaining tests that were not effective.
• eg: ResBk manual: wrote 104 tests, 21 were effective and 83 not!
• eg: LibBk jwalk: JWalk achieved 100% test coverage

Conclusion

Performance of JWalk testing
• clearly outperformed manual testing
• coverage based on all states and transitions
• input equivalence partitions are not yet handled

Performance of JWalkEditor
• unexpected gain: automatic validation of prototype code
• c.f. Alloy’s model checking from a partial specification

Moral for testing
• just automatically executing saved tests is not so great
• need systematic test generation tools to get coverage
• automate the parts that humans get wrong!

Any Questions?


Bibliography

• A J H Simons, JWalk: a tool for lazy systematic testing of Java classes by introspection and user interaction, Automated Software Engineering, 14 (4), December, ed. B. Nuseibeh, (Springer, USA, 2007), 369-418. SpringerLink: DOI 10.1007/s10515-007-0015-3, 8 September, 2007. Final draft version also deposited with White Rose Research Online.

• A J H Simons and C D Thomson, Lazy systematic unit testing: JWalk versus JUnit, Proc 2nd. Testing in Academia and Industry Conference - Practice and Research Techniques, 22-24 September, eds. P McMinn and M Harman, (Cumberland Lodge, Windsor Great Park: IEEE, 2007), 138. See also the A1 poster presenting this result.

• A J H Simons and C D Thomson, Benchmarking effectiveness for object-oriented unit testing, Proc 1st. Software Testing Benchmark Workshop, 9-11 April, eds. M Roper and W M L Holcombe, (Lillehammer: ICST/IEEE, 2008).

• A J H Simons, N Griffiths and C D Thomson, Feedback-based specification, coding and testing with JWalk, Proc 3rd. Testing in Academia and Industry Conference - Practice and Research Techniques, 29-31 August, eds. L. Bottacci and G. M. Kapfhammer and M. Roper, (Cumberland Lodge, Windsor Great Park: IEEE, 2008), to appear.

• A J H Simons, A theory of regression testing for behaviourally compatible object types, rev. and ext., Software Testing, Verification and Reliability, 16 (3), UKTest 2005 Special Issue, September, eds. M Woodward, P. McMinn, M. Holcombe, R. Hierons (London: John Wiley, 2006), 133-156.

• A J H Simons, Testing with guarantees and the failure of regression testing in eXtreme Programming, Proc. 6th Int. Conf. on eXtreme Programming and Agile Processes in Software Engineering (XP 2005), eds. H Baumeister et al., Lecture Notes in Computer Science, 3556, (Berlin: Springer Verlag, 2005), 118-126.

• Wikipedia entry for JWalk

• Wikipedia entry for Lazy Systematic Unit Testing
