AUTOMATIC PREDICATE ABSTRACTION OF C PROGRAMS

Thomas Ball, Rupak Majumdar, Todd Millstein, Sriram K. Rajamani
In PLDI 01: Programming Language Design and Implementation, 2001

Presented by Yifan Li (yl2774@columbia.edu), November 22nd


“Can software help programmers write better software?”


Outline

• What is model checking
• Why it is important
• Current state of the art
• Challenges in applying model checking to C programs
• SLAM project


Model Checking

• A specific technique of formal verification
• Given a model of a system, test automatically whether this model meets a given specification


Formal Verification

• Formal verification is the act of proving or disproving the correctness of the intended algorithms underlying a system with respect to a certain formal specification or property
• It helps to mathematically prove the correctness of a software or hardware system


The model checking problem

• Let M be a Kripke structure (i.e., a state-transition graph)
• Let f be a formula of temporal logic (i.e., the specification)
• Find all states s of M such that M, s ⊨ f
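As a rough illustration of "find all states s such that M, s ⊨ f", the sketch below computes the satisfying states for two simple formulas on a small graph. The three-state structure, its labels, and the helper names `holds`, `ef`, and `ag` are assumptions made up for this example; real model checkers handle a full temporal logic rather than just these two operators.

```python
# Hypothetical Kripke structure: successor sets and the propositions true in each state.
transitions = {"s1": {"s2"}, "s2": {"s1", "s3"}, "s3": {"s3"}}
labels = {"s1": {"p", "q"}, "s2": {"q"}, "s3": {"p"}}


def holds(prop):
    """States whose label contains the atomic proposition prop."""
    return {s for s, props in labels.items() if prop in props}


def ef(target):
    """States from which SOME path eventually reaches a state in target."""
    sat = set(target)
    changed = True
    while changed:  # backward fixpoint: keep adding predecessors of sat
        changed = False
        for state, successors in transitions.items():
            if state not in sat and successors & sat:
                sat.add(state)
                changed = True
    return sat


def ag(good):
    """States from which EVERY reachable state is in good (AG = not EF not)."""
    all_states = set(transitions)
    return all_states - ef(all_states - set(good))


print(sorted(ef(holds("p"))))  # states satisfying EF p
print(sorted(ag(holds("p"))))  # states satisfying AG p
```

With this assumed structure, every state satisfies EF p, while AG p holds only in s3, because s1 and s2 can both visit s2, where p is false.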


A typical model checking system

Figure 1. A typical model checking system


Kripke Structure

• A Kripke structure is a type of nondeterministic finite state machine, proposed by Saul Kripke, used in model checking
• Let the set of atomic propositions AP = {p, q}. p and q can model arbitrary boolean properties of the system that the Kripke structure is modelling
• M may produce a path ρ = s1, s2, s1, s2, s3, s3, s3, ... (potentially infinite)

Figure 2. Kripke Structure
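Since the figure itself is not reproduced in this transcript, the following is only a sketch of how such a structure could be written down. The transition relation and labelling are assumptions, chosen so that the path ρ quoted on the slide is a valid run; the `Kripke` class and its method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Kripke:
    initial: set      # initial states
    transitions: set  # set of (source, target) pairs
    labels: dict      # state -> set of atomic propositions true in that state

    def is_path_prefix(self, path):
        """True if path starts in an initial state and follows transitions only."""
        if not path or path[0] not in self.initial:
            return False
        return all((a, b) in self.transitions for a, b in zip(path, path[1:]))

    def trace(self, path):
        """The sequence of labels observed along a path."""
        return [sorted(self.labels[s]) for s in path]


# Assumed structure over AP = {p, q}; not taken from the missing figure.
M = Kripke(
    initial={"s1"},
    transitions={("s1", "s2"), ("s2", "s1"), ("s2", "s3"), ("s3", "s3")},
    labels={"s1": {"p", "q"}, "s2": {"q"}, "s3": {"p"}},
)

rho = ["s1", "s2", "s1", "s2", "s3", "s3", "s3"]
print(M.is_path_prefix(rho))  # the slide's path is a run of this structure
print(M.trace(rho))
```

Only a finite prefix of ρ can be checked this way; the infinite tail s3, s3, ... is covered by the self-loop on s3.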


How to model-check

Basic procedure:

1. Describe the system as a finite state model
2. Express properties in temporal logic
3. Formal verification by automatic exhaustive search over the state space

Use a model checker to check properties


Temporal logic

• Used to describe any system of rules for representing propositions in terms of time
• Statements in temporal logic:
    "I am always hungry"
    "I will eventually be hungry"
    "I will be hungry until I eat something"
• Temporal logics describe the ordering of events in time without introducing time explicitly
• The meaning of a temporal logic formula is determined with respect to a labeled state-transition graph or Kripke structure
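The three example statements map onto the operators "always", "eventually", and "until". The sketch below evaluates them over a finite list of observations, which is a simplifying assumption: temporal logics over Kripke structures are normally interpreted on infinite paths. The trace contents and function names are invented for illustration.

```python
def always(pred, trace):
    """pred holds at every step of the trace."""
    return all(pred(state) for state in trace)


def eventually(pred, trace):
    """pred holds at some step of the trace."""
    return any(pred(state) for state in trace)


def until(left, right, trace):
    """right holds at some step, and left holds at every step before that."""
    for state in trace:
        if right(state):
            return True
        if not left(state):
            return False
    return False  # right never happened on this finite trace


def hungry(state):
    return state["hungry"]


def eating(state):
    return state["eating"]


# Hypothetical observations, one per time step.
day = [
    {"hungry": True, "eating": False},
    {"hungry": True, "eating": False},
    {"hungry": False, "eating": True},
]

print(always(hungry, day))         # "I am always hungry"
print(eventually(hungry, day))     # "I will eventually be hungry"
print(until(hungry, eating, day))  # "I will be hungry until I eat something"
```

Note that none of the functions mentions a clock or a timestamp; only the order of the steps matters, which is the point made on the slide.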


Abstraction of model

• What if the model is infinite, or too large to explore exhaustively? Use abstraction
• Any effort to model check software must first construct an abstract model of the software
• Predicate abstraction: a promising approach to constructing abstractions automatically (covered later)
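To give a feel for predicate abstraction, the sketch below abstracts the statement x = x + 1 with respect to the single predicate b ≡ (x > 0). It decides the effect on b by brute-force sampling over a small integer range, which is purely an assumption for illustration and is not sound in general; tools such as C2BP discharge these questions with a theorem prover instead. The function name `abstract_update` is hypothetical.

```python
def abstract_update(stmt, pred, samples):
    """For each prior truth value of pred, report its value after stmt.

    Returns "true" or "false" when the outcome is determined on all samples,
    and "*" (unknown) when both outcomes occur or nothing can be concluded.
    """
    result = {}
    for before in (True, False):
        outcomes = {pred(stmt(x)) for x in samples if pred(x) == before}
        if outcomes == {True}:
            result[before] = "true"
        elif outcomes == {False}:
            result[before] = "false"
        else:
            result[before] = "*"
    return result


def increment(x):
    return x + 1


def positive(x):
    return x > 0


# Sampled range is an assumption standing in for "all integers".
effect = abstract_update(increment, positive, range(-50, 51))
print(effect)
```

Read as a boolean program, the result says: if b was true it stays true, and if b was false its new value is unknown (*), since x = 0 makes it true and x = -1 does not. This is where the "unknown values (*)" of boolean programs come from.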


What is a model checker

A model checker is a software tool that

• given a description of a Kripke model M and a property φ,
• decides whether M ⊨ φ,
• returns “yes” if the property is satisfied,
• otherwise returns “no” and provides a counterexample
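A minimal sketch of the yes/no-plus-counterexample interface, restricted to invariants (properties of the form "no reachable state is bad"). The toy model, a counter stepping by two modulo eight, and the names `check_invariant` and `step_by_two` are assumptions for this example; a real model checker accepts general temporal properties.

```python
from collections import deque


def check_invariant(initial, successors, invariant):
    """Exhaustive breadth-first search over the reachable state space.

    Returns (True, None) if invariant holds in every reachable state,
    otherwise (False, path) where path leads from an initial state to a violation.
    """
    parent = {state: None for state in initial}
    queue = deque(parent)
    while queue:
        state = queue.popleft()
        if not invariant(state):
            path = []
            while state is not None:  # walk parent links back to the start
                path.append(state)
                state = parent[state]
            return False, path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return True, None


def step_by_two(state):
    """Hypothetical system: a counter that advances by 2 modulo 8."""
    return [(state + 2) % 8]


print(check_invariant([0], step_by_two, lambda s: s != 5))  # "yes"
print(check_invariant([0], step_by_two, lambda s: s != 6))  # "no" with a counterexample
```

Because the search is breadth-first, the counterexample it returns is a shortest path to a violating state.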


What is a model checker

Figure 3. The model Checker


Why it is important

Software bugs are so common that their cost to the American economy alone is $60 billion a year, or about 0.6% of gross domestic product (NIST)


Why it is important?

Some errors in software systems are expensive:
• Space mission failed: a bug caused a 370-million-dollar failure in 1996, which is $514 to $686 million in 2010 dollars (Ariane 5 Flight 501)

While some are pretty annoying:
• “Bill Gates: 5% of Windows Machines Crash More Than Twice A Day”


A wide variety of model checkers

To name a few:

For C programs:
• BLAST (Berkeley)
• CBMC (Carnegie Mellon)
• CPAchecker (University of Passau, Germany)
• SLAM (Microsoft Research)

Others:
• SPIN (Bell Labs; ACM Software System Award, 2001)


SLAM

“software (specifications), programming languages, abstraction, and model checking”

SLAM is the program-analysis engine of the SDV tool, used to check whether clients of an API follow the API’s stateful usage rules

The SLAM toolkit includes C2BP, BEBOP, and NEWTON


SLAM2

• The improved version of SLAM
• Under 4% false alarms


SDV

Static Driver Verifier (SDV):

• Compile-time verification tool
• Ships with Windows 7 Driver Kit (WDK)
• Less than 4% false alarms on real drivers
• Supports many driver APIs (WDM, KMDF, NDIS, …)
• Uses SLAM as the verification engine
    • Based on CEGAR loop
    • Boolean abstraction of input C programs
• API-specific components:
    • Environment model
    • API rules in SLIC language


[Figure 4. SDV. Diagram labels: Driver’s Source Code in C; Precise API Usage Rules (SLIC); Environment model; Static Driver Verifier; Defects; 100% path coverage]


Usage

SDV 2.0 is applied as an automatic and required quality gate for Windows 7 device drivers

SLAM is distributed as part of the Windows Driver Development Kit


Challenges in applying model checking to C programs

• Pointers (alias problem)
• Procedures (signature)
• Unknown values (*)
• Lots of predicate states


SLAM Project

[Figure 5. The SLAM realization of the CEGAR loop. Diagram: SLIC and C program P yield the instrumented C program P’; C2BP produces the boolean program BP(E, P’); Bebop searches it for an error path. Is the path feasible? No: refine the predicates and generate a new BP. Yes: an error is found (program bug).]


CEGAR

In theory, counterexample-guided abstraction refinement (CEGAR) uses spurious counterexamples to refine overapproximations so as to eliminate provably false alarms
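The loop can be sketched on a toy finite system. Everything here is an assumption made for illustration: states are small integers, the abstraction is built by enumerating all concrete states, and refinement simply adds the predicate "is one of the states actually reached". SLAM works on C programs symbolically and does none of this by enumeration.

```python
from collections import deque


def find_abstract_path(states, init, step, bad, preds):
    """Search the existential abstraction for a path to an abstract bad state."""
    alpha = lambda s: tuple(p(s) for p in preds)  # concrete -> abstract state
    edges = {}
    for s in states:
        for t in step(s):
            edges.setdefault(alpha(s), set()).add(alpha(t))
    bad_abs = {alpha(s) for s in states if bad(s)}
    parent = {alpha(s): None for s in init}
    queue = deque(parent)
    while queue:
        a = queue.popleft()
        if a in bad_abs:
            path = []
            while a is not None:
                path.append(a)
                a = parent[a]
            return path[::-1]
        for b in edges.get(a, ()):
            if b not in parent:
                parent[b] = a
                queue.append(b)
    return None


def replay(path, init, step, bad, preds):
    """Follow the abstract path concretely; return (is_real_bug, reached_states)."""
    alpha = lambda s: tuple(p(s) for p in preds)
    current = {s for s in init if alpha(s) == path[0]}
    for a in path[1:]:
        nxt = {t for s in current for t in step(s) if alpha(t) == a}
        if not nxt:
            return False, current  # path cannot be continued: spurious
        current = nxt
    return any(bad(s) for s in current), current


def cegar(states, init, step, bad, preds):
    preds = list(preds)
    while True:
        path = find_abstract_path(states, init, step, bad, preds)
        if path is None:
            return "safe"  # the over-approximation has no error path
        is_bug, reached = replay(path, init, step, bad, preds)
        if is_bug:
            return "bug"   # the counterexample is feasible
        # Spurious counterexample: refine with a predicate that separates
        # the concretely reached states from the rest of their abstract state.
        preds.append(lambda s, r=frozenset(reached): s in r)


def step(s):
    """Hypothetical system: count upward by 2, staying below 10."""
    return [s + 2] if s + 2 < 10 else []


print(cegar(range(10), [0], step, lambda s: s == 7, [lambda s: s >= 5]))
print(cegar(range(10), [0], step, lambda s: s == 6, [lambda s: s >= 5]))
```

Starting from 0, only even numbers are reachable, so 7 is unreachable but the initial predicate s >= 5 cannot tell 6 from 7. The first call therefore goes through several spurious counterexamples before reporting "safe", while the second finds a feasible path to 6 and reports "bug".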


SLIC

SLIC: Specification Language for Interface Checking

SLIC is a subset of the C language augmented with elements that identify the events of interest.

Next slide: an example of a SLIC specification and the instrumented C program based on it


Figure 6. To check that a spinlock cannot be acquired without it first being released, and that a spinlock cannot be released twice
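The rule in this caption amounts to a two-state machine that watches lock events. The sketch below is a plain monitor over a recorded event list, not SLIC syntax; the event names "acquire" and "release" and the function `check_lock_usage` are assumptions for illustration.

```python
def check_lock_usage(events):
    """Return the index of the first rule violation, or None if the usage is valid.

    Rule: a lock may not be acquired while held, and may not be released
    while not held.
    """
    locked = False  # the single piece of state the rule needs to track
    for index, event in enumerate(events):
        if event == "acquire":
            if locked:
                return index  # acquired twice without a release
            locked = True
        elif event == "release":
            if not locked:
                return index  # released twice, or released before acquiring
            locked = False
    return None


print(check_lock_usage(["acquire", "release", "acquire", "release"]))
print(check_lock_usage(["acquire", "acquire"]))
print(check_lock_usage(["acquire", "release", "release"]))
```

A monitor like this checks one execution at a time. The point of SLAM is to establish the same rule for all executions of the driver statically, by weaving the state variable into the program and model checking the result.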


Figure 7. The BP of the instrumented C program. The first and second iterations of Bebop and Newton


Example 2

Figure 8. (a) SLIC specification for proper usage of spin locks, and (b) its compilation into C code.


Figure 9. (a) A snippet of device driver code P, and (b) program P’ resulting from instrumentation of program P according to the SLIC specification in Figure 8


Figure 10. The C code of the SLIC specification from Figure 8 (b) compiled by C2BP into a boolean program.


Figure 11. The two boolean programs created while checking the code from Figure 9 (b)


How well it works

• SLAM has been run on device drivers with hundreds to thousands of lines of code
• Running SLAM on these device drivers found true errors


Conclusion

• The SLAM toolkit overcomes the challenges in applying model checking to C programs
• SLAM is suitable for large-scale C programs and for device drivers written in C
• The SDV tool has already been used to model check device drivers for Windows 7 before they came to market


References

• Measure the buying power of US dollar at different times: http://www.measuringworth.com/ppowerus/
• Bill Gates Talk: http://www.osnews.com/story/4122/Bill_Gates_5_Of_Windows_Machines_Crash_More_Than_Twice_A_Day
• Symbolic Model Checking: http://www.cse.iitd.ernet.in/~sak/courses/foav/nusvm-iitd-1.pdf
• Building a better bug-trap: http://www.economist.com/node/1841081
• The SLAM project: http://research.microsoft.com/en-us/projects/slam/


Thank you!