Pex: White Box Test Generation for .NET

Nikolai Tillmann, Microsoft Research

SMT 2008


Background: Unit Testing

A unit test is a small program with assertions.

void AddTest() {
    HashSet<int> set = new HashSet<int>();
    set.Add(7);
    set.Add(3);
    Assert.IsTrue(set.Count == 2);
}

Many developers write such unit tests by hand. This involves
• determining a meaningful sequence of method calls,
• selecting exemplary argument values (the test inputs),
• stating assertions.

Vision: Parameterized Unit Testing

void AddSpec(int x, int y) {
    HashSet<int> set = new HashSet<int>();
    set.Add(x);
    set.Add(y);
    Assert.AreEqual(x == y, set.Count == 1);
    Assert.AreEqual(x != y, set.Count == 2);
}
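A minimal sketch of how a parameterized unit test relates to traditional unit tests: each traditional test is the parameterized test called with fixed arguments. The class name AddSpecTests, the helper Check (a stand-in for Assert.AreEqual) and the concrete input values are illustrative assumptions, not output produced by Pex.

```csharp
using System;
using System.Collections.Generic;

public static class AddSpecTests
{
    // Hypothetical stand-in for Assert.AreEqual, so the sketch needs no test framework.
    static void Check(bool expected, bool actual)
    {
        if (expected != actual)
            throw new InvalidOperationException("assertion failed");
    }

    // The parameterized unit test: it states behavior for all x and y.
    public static void AddSpec(int x, int y)
    {
        var set = new HashSet<int>();
        set.Add(x);
        set.Add(y);
        Check(x == y, set.Count == 1);
        Check(x != y, set.Count == 2);
    }

    // Traditional unit tests are instantiations with concrete inputs.
    // One input per outcome of "x == y" exercises both cases of the specification.
    public static void AddSpecTest_EqualInputs() { AddSpec(0, 0); }
    public static void AddSpecTest_DistinctInputs() { AddSpec(0, 1); }
}
```

Both instantiations pass for HashSet<int>; choosing which inputs are worth instantiating is the part a test input generator takes over.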

Parameterized Unit Tests separate two concerns:
1) The specification of externally visible behavior (assertions)
2) The selection of internally relevant test inputs (coverage)

Parameterized Unit Testing bridges the gap between
• Unit Testing, and
• Design-By-Contract paradigm

Parameterized Unit Tests are algebraic specifications!

What is Pex

Test input generator
• Pex starts from parameterized unit tests
• Generated tests are emitted as traditional unit tests

Dynamic symbolic execution framework
• Symbolic execution based on monitoring and re-execution
• Whole-program, white-box code analysis
• At the level of the .NET instructions (bytecode)
• Support for “Java-like” programs as well as “unsafe” code
• SMT-solver Z3 determines satisfying assignments for constraint systems representing execution paths

How to test this code? (Real code from .NET base class libraries.)

A more interesting example


A more interesting example (II)

Main challenge: making sure it does not crash by writing many tests that cover the code


Possible test case, written by hand

A more interesting example (III)

Test input, generated by Pex


Pex – Test Input Generation tomorrow

Test Input Generation by Dynamic Symbolic Execution

[Diagram: exploration loop]
Initially, choose arbitrary test inputs
Run Test and Monitor (Test Inputs → Execution Path)
Record Path Condition (Execution Path → Known Paths)
Choose an Uncovered Path (Known Paths → Constraint System)
Solve (Constraint System → Test Inputs)

Result: small test suite, high code coverage

Finds only real bugs
No false warnings

[Diagram, repeated for each step of the example: Test Inputs, Constraint System, Execution Path, Known Paths, Solve]

Test Inputs (initial): a[0] = 0; a[1] = 0; a[2] = 0; a[3] = 0; …
Path Condition: … ⋀ magicNum != 0x95673948
Constraint System: … ⋀ magicNum != 0x95673948 becomes … ⋀ magicNum == 0x95673948
Test Inputs (solved): a[0] = 206; a[1] = 202; a[2] = 239; a[3] = 190;
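The loop in the diagram can be sketched on a toy scale. This is a minimal sketch under strong assumptions, not how Pex is implemented: the code under test has a single int input, instrumentation is done by hand through the hypothetical helper Eq, path conditions only contain "x == c" and "x != c", and the hypothetical Solve method stands in for an SMT solver such as Z3.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ToyDse
{
    // Hand-written instrumentation: every branch "x == c" is recorded
    // together with the outcome observed on the concrete run.
    static bool Eq(int x, int c, List<(int Constant, bool Taken)> trace)
    {
        bool taken = x == c;
        trace.Add((c, taken));
        return taken;
    }

    // Code under test (illustrative).
    public static string UnderTest(int x, List<(int Constant, bool Taken)> trace)
    {
        if (Eq(x, 0x1234, trace)) return "magic";
        if (Eq(x, 7, trace)) return "seven";
        return "other";
    }

    // Toy solver for conjunctions of "x == c" and "x != c" over one variable.
    // Returns null when the conjunction is unsatisfiable.
    static int? Solve(List<(int Constant, bool Taken)> condition)
    {
        var required = condition.Where(b => b.Taken).Select(b => b.Constant).Distinct().ToList();
        var excluded = new HashSet<int>(condition.Where(b => !b.Taken).Select(b => b.Constant));
        if (required.Count > 1) return null;
        if (required.Count == 1)
            return excluded.Contains(required[0]) ? (int?)null : required[0];
        int candidate = 0;
        while (excluded.Contains(candidate)) candidate++;
        return candidate;
    }

    static string Key(List<(int Constant, bool Taken)> path)
    {
        return string.Join(";", path.Select(b => b.Constant + (b.Taken ? "==" : "!=")));
    }

    // Returns the inputs in the order in which they were generated.
    public static List<int> Explore()
    {
        var inputs = new List<int>();
        var known = new HashSet<string>();  // path prefixes already covered or attempted
        var pending = new Queue<int>();
        pending.Enqueue(0);                 // initially, choose an arbitrary input
        while (pending.Count > 0)
        {
            int x = pending.Dequeue();
            inputs.Add(x);
            var trace = new List<(int Constant, bool Taken)>();
            UnderTest(x, trace);            // run test and monitor
            for (int i = 0; i < trace.Count; i++)
            {
                known.Add(Key(trace.GetRange(0, i + 1)));           // record path condition
                var flipped = trace.GetRange(0, i);                 // same prefix ...
                flipped.Add((trace[i].Constant, !trace[i].Taken));  // ... last branch negated
                if (!known.Add(Key(flipped))) continue;             // path already known
                int? solution = Solve(flipped);                     // solve
                if (solution.HasValue) pending.Enqueue(solution.Value);
            }
        }
        return inputs;
    }
}
```

Starting from the arbitrary input 0, Explore generates 0x1234 and 7 by negating one recorded branch at a time, so three inputs cover the three paths of UnderTest.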





Demo
Results in VS
Report: Coverage, path conditions


Monitoring by Code Instrumentation

    ldtoken Point::GetX
    call __Monitor::EnterMethod
    brfalse L0
    ldarg.0
    call __Monitor::NextArgument<Point>
L0: .try {
      .try {
        call __Monitor::LDARG_0
        ldarg.0
        call __Monitor::LDNULL
        ldnull
        call __Monitor::CEQ
        ceq
        call __Monitor::BRTRUE
        brtrue L1
        call __Monitor::BranchFallthrough
        call __Monitor::LDARG_0
        ldarg.0
        …
        ldtoken Point::X
        call __Monitor::LDFLD_REFERENCE
        ldfld Point::X
        call __Monitor::AtDereferenceFallthrough
        br L2
L1:     call __Monitor::AtBranchTarget
        call __Monitor::LDC_I4_M1
        ldc.i4.m1
L2:     call __Monitor::RET
        stloc.0
        leave L4
      } catch NullReferenceException {
        call __Monitor::AtNullReferenceException
        rethrow
      }
L4:   leave L5
    } finally {
      call __Monitor::LeaveMethod
      endfinally
    }
L5: ldloc.0
    ret

class Point {
    int X;
    int Y;
    public static int GetX(Point p) {
        if (p != null) return p.X;
        else return -1;
    }
}

Annotations on the instrumented code:
• Prologue
• Epilogue
• Calls will perform symbolic computation
• Calls to build path condition
• Record concrete values to have all information when this method is called with no proper context
(The real C# compiler output is actually more complicated.)

Symbolic State Representation

Similar to representation of verification conditions in ESC/Java, Spec#, …

Terms for
• Primitive types (integers, floats, …): constants, unary and binary expressions
• ‘struct’ types: tuples
• Instance fields of classes: mapping of references to values
• Elements of arrays, memory accesses through pointers: mapping of integers to values
• …

Term normalization

Goal: Efficient representation of evolving program states
• Reduction of ground terms to constants
• Sharing of syntactically equal sub-terms
• BDDs over if-then-else terms to represent logical operations
• Tries/Patricia Trees to represent associative-commutative-with-unit operators
• Normal form of polynomials
• Update trees
• Other simplification rules, e.g.
  ∀ x. ceq(vtable(x, m1), m2) => ceq(objecttype(x), t)
  where m2 overrides m1, and t is the sealed declaring type of m2
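Two of the normalization ideas above, reduction of ground terms to constants and sharing of syntactically equal sub-terms, can be sketched with a small term factory. The types Term and TermFactory and the restriction to integer constants, variables and addition are illustrative assumptions; the sketch says nothing about Pex's actual term representation.

```csharp
using System;
using System.Collections.Generic;

public sealed class Term
{
    public readonly string Kind;   // "const", "var" or "add"
    public readonly int Value;     // used by "const"
    public readonly string Name;   // used by "var"
    public readonly Term Left;     // used by "add"
    public readonly Term Right;    // used by "add"

    internal Term(string kind, int value, string name, Term left, Term right)
    {
        Kind = kind; Value = value; Name = name; Left = left; Right = right;
    }
}

public sealed class TermFactory
{
    // One shared instance per syntactically distinct term.
    private readonly Dictionary<string, Term> pool = new Dictionary<string, Term>();
    // Small integer identity per term, used to build keys for compound terms.
    private readonly Dictionary<Term, int> ids = new Dictionary<Term, int>();

    private Term Intern(string key, Func<Term> create)
    {
        Term term;
        if (!pool.TryGetValue(key, out term))
        {
            term = create();
            ids[term] = ids.Count;
            pool[key] = term;
        }
        return term;
    }

    public Term Const(int value)
    {
        return Intern("c:" + value, () => new Term("const", value, null, null, null));
    }

    public Term Var(string name)
    {
        return Intern("v:" + name, () => new Term("var", 0, name, null, null));
    }

    // Both operands must come from this factory.
    public Term Add(Term left, Term right)
    {
        // Reduction of ground terms to constants (32-bit wrap-around arithmetic).
        if (left.Kind == "const" && right.Kind == "const")
            return Const(unchecked(left.Value + right.Value));
        // Sharing: equal operands (by identity) yield the same term object.
        return Intern("a:" + ids[left] + "," + ids[right],
                      () => new Term("add", 0, null, left, right));
    }
}
```

Because equal terms are the same object, term equality becomes a reference comparison, which is one reason such sharing helps when program states evolve step by step.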

Search Strategy

Problem:
• Reachable code not known initially
• No loop invariants, loops must be unfolded
• Without guidance, symbolic execution may get stuck unfolding the same loop forever

Solution:
• Search strategies outside of SMT solver choose “next branch to flip”
• Fair choice between different strategies
• Individual strategies based on program structure, including:
  fair choice of branch instructions,
  fair choice of branch instructions + stack contexts,
  fair choice of branch coverage

Constraint Solving: Preprocessing

Independent constraint optimization + Constraint caching (similar to EXE)
Idea: Related execution paths give rise to "similar" constraint systems
Example: Consider x>y ⋀ z>0 vs. x>y ⋀ z<=0

If we already have a cached solution for a "similar" constraint system, we can reuse it
• x=1, y=0, z=1 is a solution for x>y ⋀ z>0
• we can obtain a solution for x>y ⋀ z<=0 by
  reusing old solution of x>y: x=1, y=0
  combining with solution of z<=0: z=0
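The reuse described above can be sketched as a cache keyed by individual conjuncts. This is a minimal sketch assuming the conjunction has already been split into conjuncts over pairwise disjoint variables and that every conjunct is satisfiable; the class IndependentConstraintCache and the table-driven StubSolver are hypothetical and do not call a real solver.

```csharp
using System;
using System.Collections.Generic;

public sealed class IndependentConstraintCache
{
    private readonly Dictionary<string, Dictionary<string, int>> cache =
        new Dictionary<string, Dictionary<string, int>>();
    private readonly Func<string, Dictionary<string, int>> solver;

    public int SolverCalls { get; private set; }

    public IndependentConstraintCache(Func<string, Dictionary<string, int>> solver)
    {
        this.solver = solver;
    }

    // Solves a conjunction given as conjuncts over pairwise disjoint variables.
    public Dictionary<string, int> Solve(IEnumerable<string> independentConjuncts)
    {
        var model = new Dictionary<string, int>();
        foreach (string conjunct in independentConjuncts)
        {
            Dictionary<string, int> partial;
            if (!cache.TryGetValue(conjunct, out partial))
            {
                SolverCalls++;               // only uncached conjuncts reach the solver
                partial = solver(conjunct);
                cache[conjunct] = partial;
            }
            foreach (var pair in partial)
                model.Add(pair.Key, pair.Value);  // throws if conjuncts share a variable
        }
        return model;
    }
}

public static class CacheDemo
{
    // Hypothetical stand-in for a real solver: a fixed table of known answers.
    public static Dictionary<string, int> StubSolver(string conjunct)
    {
        switch (conjunct)
        {
            case "x>y": return new Dictionary<string, int> { { "x", 1 }, { "y", 0 } };
            case "z>0": return new Dictionary<string, int> { { "z", 1 } };
            case "z<=0": return new Dictionary<string, int> { { "z", 0 } };
            default: throw new ArgumentException("unknown constraint: " + conjunct);
        }
    }

    public static Dictionary<string, int> Run(IndependentConstraintCache cache)
    {
        cache.Solve(new[] { "x>y", "z>0" });          // solves both conjuncts
        return cache.Solve(new[] { "x>y", "z<=0" });  // reuses x>y, solves only z<=0
    }
}
```

In the usage above the second query reaches the solver only for z<=0, so three solver calls are made instead of four.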

Constraint Solving: Z3

Decision procedures for uninterpreted functions with equalities, linear integer arithmetic, bitvector arithmetic, arrays, tuples
Support for universal quantifiers
• Used to model custom theories, e.g. .NET type system
Model generation
• Models used as test inputs
Incremental solving
• Push / Pop of contexts for model minimization
Programmatic API
• For small constraint systems, text through pipes would add huge overhead

Creating complex objects

Problem:
• Pex can collect constraints over private fields, constraint solver determines assignment for private fields
• How to bring object into desired state?
• Private fields cannot be initialized freely, but only through constructor and other methods

Approach taken by Pex:
• Automatic selection of constructor and state-modifying methods based on static code analysis
• Exploration of constructor and methods to find non-exceptional paths

Assumptions and Assertions

void PexAssume.IsTrue(bool c) {
    if (!c) throw new AssumptionViolationException();
}

void PexAssert.IsTrue(bool c) {
    if (!c) throw new AssertionViolationException();
}

Assumptions and assertions are explored just like all other branches
Executions which cause assumption violations are ignored, not reported as errors or test cases
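How a test runner might treat the two exception kinds can be sketched as follows. The exception names follow the slide; the classes Spec and Runner, the enum Outcome and the two example specifications are illustrative assumptions, not the Pex API.

```csharp
using System;

public class AssumptionViolationException : Exception { }
public class AssertionViolationException : Exception { }

public static class Spec
{
    public static void Assume(bool c) { if (!c) throw new AssumptionViolationException(); }
    public static void Assert(bool c) { if (!c) throw new AssertionViolationException(); }
}

public enum Outcome { Passed, Ignored, Failed }

public static class Runner
{
    // Classifies one execution. Any other exception propagates to the caller.
    public static Outcome Run(Action test)
    {
        try
        {
            test();
            return Outcome.Passed;
        }
        catch (AssumptionViolationException)
        {
            return Outcome.Ignored;   // input outside the precondition: not a test case
        }
        catch (AssertionViolationException)
        {
            return Outcome.Failed;    // specification violated: reported as an error
        }
    }
}

public static class Examples
{
    // Division and remainder are consistent whenever the divisor is non-zero.
    public static void DivSpec(int a, int b)
    {
        Spec.Assume(b != 0);
        Spec.Assert(a / b * b + a % b == a);
    }

    // Deliberately wrong for 32-bit wrap-around arithmetic.
    public static void IncrementSpec(int x)
    {
        Spec.Assert(unchecked(x + 1) > x);
    }
}
```

With these definitions DivSpec(7, 2) passes, DivSpec(1, 0) is ignored, and IncrementSpec(int.MaxValue) fails because the increment wraps around.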


Dealing with the Environment

AppendFormat(null, "{0} {1}!", "Hello", "World"); → "Hello World!"

.NET implementation:

public StringBuilder AppendFormat(
    IFormatProvider provider, char[] chars, params object[] args) {
    if (chars == null || args == null)
        throw new ArgumentNullException(…);
    int pos = 0;
    int len = chars.Length;
    char ch = '\x0';
    ICustomFormatter cf = null;
    if (provider != null)
        cf = (ICustomFormatter)provider.GetFormat(typeof(ICustomFormatter));
    …


Stubs / Mock Objects

Introduce a mock class which implements the interface. Write assertions over expected inputs, provide concrete outputs.

public class MFormatProvider : IFormatProvider {
    public object GetFormat(Type formatType) {
        Assert.IsTrue(formatType != null);
        return new MCustomFormatter();
    }
}

Problems:
• Costly to write detailed behavior by example
• How many and which mock objects do we need to write?


Parameterized Mock Objects

Introduce a mock class which implements the interface. Let an oracle provide the behavior of the mock methods.

public class MFormatProvider : IFormatProvider {
    public object GetFormat(Type formatType) {
        …
        object o = call.ChooseResult<object>();
        return o;
    }
}

Result: Relevant result values can be generated by a white-box test input generation tool, just as other test inputs can be generated!
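The idea can be sketched without the Pex runtime by making the oracle an explicit object. This is a minimal sketch: the class ChoiceOracle is a hypothetical stand-in for whatever is behind call.ChooseResult on the slide, and here it simply replays values that a test generator would have chosen.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical oracle: replays choices made by a test input generator.
public sealed class ChoiceOracle
{
    private readonly Queue<object> choices;

    public ChoiceOracle(IEnumerable<object> choices)
    {
        this.choices = new Queue<object>(choices);
    }

    // Returns the next chosen value, or default(T) once the choices are used up.
    public T ChooseResult<T>()
    {
        return choices.Count > 0 ? (T)choices.Dequeue() : default(T);
    }
}

// Parameterized mock: its behavior is an additional test input.
public sealed class MFormatProvider : IFormatProvider
{
    private readonly ChoiceOracle oracle;

    public MFormatProvider(ChoiceOracle oracle)
    {
        this.oracle = oracle;
    }

    public object GetFormat(Type formatType)
    {
        return oracle.ChooseResult<object>();
    }
}

public static class MockDemo
{
    // One generated test case: the mock answers GetFormat with null,
    // so the formatting code under test uses its default behavior.
    public static string FormatWithNullFormatter()
    {
        var provider = new MFormatProvider(new ChoiceOracle(new object[] { null }));
        return string.Format(provider, "{0} {1}!", "Hello", "World");
    }
}
```

Other choices, such as an object that does not implement ICustomFormatter, drive the code under test down different paths, which is what makes the mock's results worth generating like any other input.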


A case study (I)

We applied Pex on a core .NET component
• Already extensively tested for several years
• Assertions written by developers
• >10,000 public methods
• >100,000 basic blocks

Sandbox
• Restriction of access to external resources (files, registry, unsafe code, …)

10 machines (P4, 2 GHz, 2 GB RAM) running for 3 days
Exploration started from simple, generated parameterized unit tests (one per public method); assertions embedded in code


A case study (II)

Coverage achieved:
• 43% block coverage
• 36% arc coverage

Errors found:
• A significant number of benign errors, e.g. NullReferenceException, IndexOutOfRangeException, …
• 17 unique errors involving violation of developer-written assertions, exhaustion of memory, other serious issues.


A case study (III)

Classname                      Blocks   Hit    Arcs    Hit
A (mostly stateless methods)   >300     95%    >400    90%
B (mostly stateless methods)   >100     97%    >200    94%
C (stateful)                   >200     76%    >300    65%
D (parsing code)               >500     81%    >800    73%
E (numerical algorithm)        >400     71%    >600    67%
F (numerical algorithm)        >100     82%    >200    79%
G (numerical algorithm)        >100     98%    >100    97%
H (numerical algorithm)        >200     71%    >200    61%
I (numerical algorithm)        >200     97%    >300    96%

Automatically achieved coverage on selected classes for core .NET component

Limitations

Assumption: Environment is deterministic
• "Environment" includes all code that is not monitored, e.g. native code, uninstrumented code
• Pex prunes non-deterministic behavior

Assumption: Program is single-threaded
• Potential solution: control and explore thread scheduling like all other test inputs

Limitations of constraint solver
• Z3 has no built-in theories for floating point arithmetic: approximation with rationals (linear arithmetic only)
• Bounds on Z3's time and memory consumption


Ongoing Work: Debugging Specifications

Goal: Test-input generation for programs with contracts (preconditions, postconditions, invariants, etc.)
• In the Verisoft project, the compiler generates Boogie or MSIL programs from C code annotated with contracts
• MSIL programs embed most contracts in executable form
• These contracts are turned into constraints by Pex, which performs a path-sensitive analysis

Challenge: Non-executable contracts
• Quantifiers: may range over “all integers” or “all pointers”
• Predicates for memory-safety: do not translate directly into machine-observable behavior

Ongoing Work

Better scalability
• More sophisticated search-frontiers (e.g. based on fitness function that determines distance to target state)
• Summarizing execution paths instead of exploring them (TACAS'08)

Inference of likely invariants/contracts (DySy, ICSE'08)

Dealing with multi-threaded programs
• Controlling the scheduler
• Systematically exploring all relevant thread interleavings
• Race detection
• Tom Ball et al. are building such analyses on the Pex framework (ManagedChess)

Some Recent Related Work

Program model checkers
• JPF, Kiasan/KUnit (Java), XRT (.NET)

Combining random testing and constraint solving
• DART (C), CUTE (C), EXE (C), jCUTE (Java), SAGE (X86)
• …


Summary


Parameterized Unit Tests separate two concerns
• Specification of externally visible behavior
• Selection of test inputs to cover internal behavior

Pex automates test input generation
• Uses SMT-solver Z3

Dynamic Symbolic Execution platform for .NET
• Used internally in Microsoft to test core .NET components

Pex is publicly available for academic use. http://research.microsoft.com/Pex


Thank you




Why Dynamic Symbolic Execution?

Dynamic symbolic execution will systematically explore the conditions in the code which the constraint solver understands, and happily ignore everything else, e.g.
• Calls to native code
• Difficult constraints (e.g. precise semantics of floating point arithmetic)

Result: Under-approximation, which is appropriate for testing

Calls to external world

Unmanaged x86 code

Unsafe managed .NET code (with pointers)

Safe managed .NET code

Most interesting programs are beyond the scope of static symbolic execution.

Regression Test Suite Generation

When generating test inputs for any method, e.g.

DateTime ParseDateTime(string s) { … }

a regression test suite can be generated, where each test asserts the observed behavior.

void ParseDateTimeTest132() {
    DateTime result = ParseDateTime("6/19/2008");
    Assert(result.ToString() == "06/19/2008");
}
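Emitting such tests can be sketched as a small source generator that runs the method on each input and writes the observed result into an assertion. This is a minimal sketch: the class RegressionTestEmitter, the restriction to string-to-string methods and the shape of the emitted text are illustrative assumptions, and inputs that make the method throw are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class RegressionTestEmitter
{
    // Renders a string as a C# string literal (backslashes and quotes only).
    static string Quote(string s)
    {
        return "\"" + s.Replace("\\", "\\\\").Replace("\"", "\\\"") + "\"";
    }

    // Runs 'method' on every input and emits one test per input
    // that asserts the behavior observed during that run.
    public static string Emit(string methodName, Func<string, string> method, IList<string> inputs)
    {
        var source = new StringBuilder();
        for (int i = 0; i < inputs.Count; i++)
        {
            string observed = method(inputs[i]);
            source.AppendLine("void " + methodName + "Test" + i + "() {");
            source.AppendLine("    string result = " + methodName + "(" + Quote(inputs[i]) + ");");
            source.AppendLine("    Assert.AreEqual(" + Quote(observed) + ", result);");
            source.AppendLine("}");
        }
        return source.ToString();
    }
}
```

For example, Emit("Upper", s => s.ToUpperInvariant(), new[] { "ab" }) produces a test UpperTest0 that asserts the result "AB". The emitted tests record current behavior, so they detect later changes rather than judge correctness.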

Historical note: XRT

XRT: Exploring Runtime
• Interpreter for .NET programs
• Static symbolic execution
• Used Simplify to determine unsatisfiability of path constraints

Successful for self-contained programs
• Used today on a large scale within Microsoft for quality assurance purposes as the core of the model-based testing tool “Spec Explorer 2007”.

Does not work well for real-world programs
• All environment behavior must be modeled
• Modeling of entire environment is often not feasible
