abstract interpretation
Abstract interpretation. Giorgio Levi, Dipartimento di Informatica, Università di Pisa. http://www.di.unipi.it/~levi/levi.html
1
Abstract interpretation
Giorgio Levi
Dipartimento di Informatica, Università di Pisa
http://www.di.unipi.it/~levi/levi.html
2
The general idea
a semantics
• any definition style, from a denotational definition to a detailed interpreter
• assigning meanings to programs on a suitable concrete domain (concrete computations domain)
an abstract domain
• modeling some properties of concrete computations and forgetting about the remaining information (abstract computations domain)
we derive an abstract semantics, which allows us to “execute” the program on the abstract domain to compute its abstract meaning, i.e., the modeled property
3
Concrete and Abstract Domains
two complete partial orders
• the partial orders reflect precision
• smaller is better
concrete domain $(\wp(C), \subseteq, \{\}, C, \cup, \cap)$
• has the structure of a powerset
• we will see later why
abstract domain $(A, \le, \mathit{bottom}, \mathit{top}, \mathit{lub}, \mathit{glb})$
• each abstract value is a description of “a set of” concrete values
4
Concretization
concrete domain $(\wp(C), \subseteq, \{\}, C, \cup, \cap)$
abstract domain $(A, \le, \mathit{bottom}, \mathit{top}, \mathit{lub}, \mathit{glb})$
the meaning of abstract values is defined by a concretization function
$\gamma: A \to \wp(C)$
$\forall a \in A$, $\gamma(a)$ is the set of concrete computations described by $a$
• that’s why the concrete domain needs to be a powerset
the concretization function must be monotonic
$\forall a_1, a_2 \in A$, $a_1 \le a_2$ implies $\gamma(a_1) \subseteq \gamma(a_2)$
• concretization preserves relative precision
5
Abstraction
concrete domain $(\wp(C), \subseteq, \{\}, C, \cup, \cap)$
abstract domain $(A, \le, \mathit{bottom}, \mathit{top}, \mathit{lub}, \mathit{glb})$
every element of $\wp(C)$ should have a unique “best” (most precise) description in $A$
• this is possible if and only if $A$ is a Moore family
• closed under glb
in such a case, we can define an abstraction function
$\alpha: \wp(C) \to A$
$\forall c \in \wp(C)$, $\alpha(c)$ is the best abstract description of $c$
the abstraction function must be monotonic
$\forall c_1, c_2 \in \wp(C)$, $c_1 \subseteq c_2$ implies $\alpha(c_1) \le \alpha(c_2)$
• abstraction preserves relative precision
6
Galois connection
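Galois connection (insertion)
$\forall x \in \wp(C).\ x \subseteq \gamma(\alpha(x))$
$\forall y \in A.\ \alpha(\gamma(y)) \le y$ (for an insertion: $\forall y \in A.\ \alpha(\gamma(y)) = y$)
$\alpha$ and $\gamma$ mutually determine each other
concrete domain $(\wp(C), \subseteq, \{\}, C, \cup, \cap)$
abstract domain $(A, \le, \mathit{bottom}, \mathit{top}, \mathit{lub}, \mathit{glb})$
$\gamma: A \to \wp(C)$ monotonic (concretization)
$\alpha: \wp(C) \to A$ monotonic (abstraction)
there may be loss of information (approximation) in describing an element of $\wp(C)$ by an element of $A$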
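The two Galois conditions can be checked by brute force on a tiny example. The sketch below is illustrative only: it assumes a hypothetical concrete universe of the integers -2..2 and a sign-like abstract domain, and the names `sign`, `gamma`, `alpha`, `leq` and `powerset` are not taken from the slides.

```ocaml
(* Hypothetical finite setting: concrete values are the integers -2..2,
   concrete properties are subsets of them, represented as lists. *)
type sign = Bot | Neg | Zero | Pos | NonPos | NonNeg | Top

let universe = [-2; -1; 0; 1; 2]
let all_signs = [Bot; Neg; Zero; Pos; NonPos; NonNeg; Top]

(* Concretization: the set of concrete values described by an abstract value. *)
let gamma (a : sign) : int list =
  List.filter
    (fun z ->
      match a with
      | Bot -> false
      | Neg -> z < 0
      | Zero -> z = 0
      | Pos -> z > 0
      | NonPos -> z <= 0
      | NonNeg -> z >= 0
      | Top -> true)
    universe

let subset xs ys = List.for_all (fun x -> List.mem x ys) xs

(* Abstract order, defined here through gamma (gamma is injective on this universe). *)
let leq a b = subset (gamma a) (gamma b)

(* Abstraction: the least abstract value whose concretization contains xs. *)
let alpha (xs : int list) : sign =
  let candidates = List.filter (fun a -> subset xs (gamma a)) all_signs in
  List.find (fun a -> List.for_all (leq a) candidates) candidates

(* All subsets of a list. *)
let rec powerset = function
  | [] -> [ [] ]
  | x :: rest ->
      let ps = powerset rest in
      ps @ List.map (fun s -> x :: s) ps

(* First condition: every concrete property is included in gamma (alpha x). *)
let extensive =
  List.for_all (fun xs -> subset xs (gamma (alpha xs))) (powerset universe)

(* Insertion: alpha (gamma y) = y for every abstract value y. *)
let insertion = List.for_all (fun y -> alpha (gamma y) = y) all_signs
```

Both `extensive` and `insertion` evaluate to `true` here. The set `[-1; 1]` is abstracted to `Top`, which shows the loss of information mentioned on the slide.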
7
Concrete semantics
the concrete semantics is defined as the least (or greatest) fixpoint of a concrete semantic evaluation function $F$ defined on the domain $C$
• this does not necessarily mean that the semantic definition style is denotational!
$F$ is defined in terms of primitive semantic operations $f_i$ on $C$
the abstract semantic evaluation function $F^\alpha$ is obtained by replacing in $F$ each concrete operation $f_i$ by a suitable abstract operation $f_i^\alpha$
however, since the actual concrete domain is $\wp(C)$, we first need to lift the concrete semantics lfp $F$ to a collecting semantics defined on $\wp(C)$
8
Collecting semantics
lifting lfp $F$ to the powerset (to get the collecting semantics) is simply a conceptual operation
• collecting semantics = $\{\mathrm{lfp}\, F\}$
we don’t need to define a brand new collecting semantic evaluation function $F^c$ on $\wp(C)$
• we just need to reason in terms of liftings of all the primitive operations (and of $F$), while designing the abstract operations and establishing their properties
in the following, by abuse of notation, we will use the same notation for the standard and the collecting (“conceptually” lifted) operations
9
Abstract operations: local correctness
an abstract operator $f_i^\alpha$ defined on $A$ is locally correct wrt a concrete operator $f_i$ if
$\forall x_1,\ldots,x_n \in \wp(C).\ f_i(x_1,\ldots,x_n) \subseteq \gamma(f_i^\alpha(\alpha(x_1),\ldots,\alpha(x_n)))$
• the concrete computation step is more precise than the concretization of the “corresponding” abstract computation step
• a very weak requirement, which is satisfied, for example, by an abstract operator which always computes the worst abstract value top
• the real issue in the design of abstract operations is therefore precision
10
Abstract operations: optimality and completeness
correctness
$\forall x_1,\ldots,x_n \in \wp(C).\ f_i(x_1,\ldots,x_n) \subseteq \gamma(f_i^\alpha(\alpha(x_1),\ldots,\alpha(x_n)))$
optimality
$\forall y_1,\ldots,y_n \in A.\ f_i^\alpha(y_1,\ldots,y_n) = \alpha(f_i(\gamma(y_1),\ldots,\gamma(y_n)))$
• the most precise abstract operator $f_i^\alpha$ correct wrt $f_i$
• a theoretical bound and basis for the design, rather than an implementable definition
completeness (exactness or absolute precision)
$\forall x_1,\ldots,x_n \in \wp(C).\ \alpha(f_i(x_1,\ldots,x_n)) = f_i^\alpha(\alpha(x_1),\ldots,\alpha(x_n))$
• no loss of information: the abstraction of the concrete computation step is exactly the same as the result of the corresponding abstract computation step
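As a worked instance of the optimality equation, assume the sign domain used later in these slides, where $0^+$ describes the non-negative integers, $-$ the negative ones and $0^-$ the non-positive ones, and take integer multiplication lifted to sets as the concrete operation. The name $\mathit{times}^\alpha$ for its optimal abstract version is introduced here for illustration only:

$$
\mathit{times}^\alpha(0^+, -) = \alpha(\{x \cdot y \mid x \ge 0,\ y < 0\}) = \alpha(\{z \mid z \le 0\}) = 0^-
$$

This agrees with the entry for $0^+$ and $-$ in the Times table of the Sign example.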
11
From local to global correctness
the composition of locally correct abstract operations is locally correct wrt the composition of concrete operations
• composition does not preserve optimality, i.e., the composition of optimal operators may be less precise than the optimal abstract version of the composition
if we obtain $F^\alpha$ (abstract semantic evaluation function) by replacing in $F$ every concrete semantic operation by a corresponding (locally correct) abstract operation, the local correctness property still holds
$\forall x \in \wp(C).\ F(x) \subseteq \gamma(F^\alpha(\alpha(x)))$
local correctness implies global correctness, i.e., correctness of the abstract semantics wrt the concrete one
$\mathrm{lfp}\, F \subseteq \gamma(\mathrm{lfp}\, F^\alpha)$, $\mathrm{gfp}\, F \subseteq \gamma(\mathrm{gfp}\, F^\alpha)$
$\alpha(\mathrm{lfp}\, F) \le \mathrm{lfp}\, F^\alpha$, $\alpha(\mathrm{gfp}\, F) \le \mathrm{gfp}\, F^\alpha$
• the abstraction of the concrete semantics is more precise than the abstract semantics
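The step from local to global correctness can be sketched for the least fixpoint as follows. Here $F^\alpha$ denotes the abstract semantic evaluation function and $a = \mathrm{lfp}\, F^\alpha$; the argument assumes that $F$, $F^\alpha$ and $\gamma$ are monotonic and that the two domains are related by a Galois connection:

$$
\begin{aligned}
F(\gamma(a)) &\subseteq \gamma(F^\alpha(\alpha(\gamma(a)))) && \text{local correctness at } x = \gamma(a) \\
&\subseteq \gamma(F^\alpha(a)) && \alpha(\gamma(a)) \le a,\ F^\alpha \text{ and } \gamma \text{ monotonic} \\
&= \gamma(a) && a \text{ is a fixpoint of } F^\alpha
\end{aligned}
$$

So $\gamma(a)$ is a pre-fixpoint of $F$. Since the least fixpoint of a monotonic function on a complete lattice is its least pre-fixpoint, $\mathrm{lfp}\, F \subseteq \gamma(\mathrm{lfp}\, F^\alpha)$. Applying $\alpha$ to both sides then gives $\alpha(\mathrm{lfp}\, F) \le \alpha(\gamma(\mathrm{lfp}\, F^\alpha)) \le \mathrm{lfp}\, F^\alpha$.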
12
$\alpha(\mathrm{lfp}\, F) \le \mathrm{lfp}\, F^\alpha$: why compute lfp $F^\alpha$?
lfp $F$ cannot be computed in finitely many steps
• $\omega$ steps are in general required
lfp $F^\alpha$ can be computed in finitely many steps, if the abstract domain is finite or at least noetherian
• does not contain infinite increasing chains
interesting for static program analysis, where the fixpoint computation must terminate
• most program properties considered in static analysis are undecidable
• we accept a loss of precision (safe approximation) in order to make the analysis feasible
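A minimal sketch of why a finite abstract domain makes the fixpoint computable: plain iteration from bottom stabilizes after a few steps. The example is hypothetical. It assumes a sign domain and the abstract transformer of a loop that starts from `x = 0` and repeatedly adds 1; the names `lub`, `succ_abs`, `lfp` and `f` are illustrative and not from the slides.

```ocaml
type sign = Bot | Neg | Zero | Pos | NonPos | NonNeg | Top

(* Least upper bound in the sign lattice. *)
let lub a b =
  match a, b with
  | Bot, x | x, Bot -> x
  | x, y when x = y -> x
  | (Zero | Pos | NonNeg), (Zero | Pos | NonNeg) -> NonNeg
  | (Zero | Neg | NonPos), (Zero | Neg | NonPos) -> NonPos
  | _ -> Top

(* Abstract version of "add 1" on signs. *)
let succ_abs = function
  | Bot -> Bot
  | Zero | Pos | NonNeg -> Pos
  | Neg -> NonPos
  | NonPos | Top -> Top

(* Iterate f from x until the value no longer changes.
   Terminates when f is monotonic and the domain has no infinite increasing chains. *)
let rec lfp f x =
  let x' = f x in
  if x' = x then x else lfp f x'

(* Abstract transformer: values of x at the loop head. *)
let f s = lub Zero (succ_abs s)

let result = lfp f Bot
```

The iterates are `Bot`, `Zero`, `NonNeg`, `NonNeg`, so `result` is `NonNeg`. The corresponding concrete iteration would produce {0}, {0,1}, {0,1,2}, ... and never stabilize in finitely many steps.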
13
Applications
comparative semantics
• a technique to reason about semantics at different levels of abstraction
• non-noetherian abstract domain
• abstraction without approximation (completeness): $\alpha(\mathrm{lfp}\, F) = \mathrm{lfp}\, F^\alpha$
static analysis = effective computation of the abstract semantics
• if the abstract domain is noetherian and the abstract operations are computationally feasible
• if the abstract domain is non-noetherian or if the fixpoint computation is too complex, use widening operators
  – which effectively compute an (upper) approximation of lfp $F^\alpha$
  – one example later
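To make the role of widening concrete, here is a minimal sketch on a domain that is not part of these slides: integer intervals, which contain infinite increasing chains. It uses the textbook interval widening, with `min_int` and `max_int` standing in for the infinite bounds. All names (`itv`, `join`, `succ_abs`, `widen`, `iterate_widen`) are assumptions made for the illustration.

```ocaml
(* Intervals over the integers; min_int and max_int stand for -oo and +oo. *)
type itv = Empty | Itv of int * int

let join a b =
  match a, b with
  | Empty, x | x, Empty -> x
  | Itv (l1, u1), Itv (l2, u2) -> Itv (min l1 l2, max u1 u2)

(* Abstract "add 1", leaving infinite bounds untouched. *)
let succ_abs = function
  | Empty -> Empty
  | Itv (l, u) ->
      Itv ((if l = min_int then l else l + 1), (if u = max_int then u else u + 1))

(* Widening: any bound that is still moving jumps straight to infinity. *)
let widen a b =
  match a, b with
  | Empty, x | x, Empty -> x
  | Itv (l1, u1), Itv (l2, u2) ->
      Itv ((if l2 < l1 then min_int else l1), (if u2 > u1 then max_int else u1))

(* Abstract transformer of a loop starting at 0 and adding 1 at each step. *)
let f x = join (Itv (0, 0)) (succ_abs x)

(* Iteration with widening: x' = widen x (f x) until stable. *)
let rec iterate_widen x =
  let x' = widen x (f x) in
  if x' = x then x else iterate_widen x'

let result = iterate_widen Empty
```

The iterates are `Empty`, `Itv (0, 0)`, `Itv (0, max_int)`, and the last one is stable. Without `widen`, the iteration would go through `Itv (0, 1)`, `Itv (0, 2)`, ... one bound at a time.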
14
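The abstract interpretation framework
$(\wp(C), \subseteq, \{\}, C, \cup, \cap)$ (concrete domain)
$(A, \le, \mathit{bottom}, \mathit{top}, \mathit{lub}, \mathit{glb})$ (abstract domain)
$\gamma: A \to \wp(C)$ monotonic (concretization function)
$\alpha: \wp(C) \to A$ monotonic (abstraction function)
$\forall x \in \wp(C).\ x \subseteq \gamma(\alpha(x))$ and $\forall y \in A.\ \alpha(\gamma(y)) \le y$ (Galois connection)
$\forall f_i\ \exists f_i^\alpha \mid \forall x_1,\ldots,x_n \in \wp(C).\ f_i(x_1,\ldots,x_n) \subseteq \gamma(f_i^\alpha(\alpha(x_1),\ldots,\alpha(x_n)))$ (local correctness)
critical choices
• the abstract domain to model the property
• the (possibly optimal) correct abstract operations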
15
Other approaches and extensions
there exist weaker versions of abstract interpretation
• without Galois connections (e.g., concretization function only)
• based on approximation operators (widening, narrowing)
• without explicit abstract domain (closure operators)
the theory also provides several results on abstract domain design
• how to combine domains
• how to improve the precision of a domain
• how to transform an abstract domain into a complete one
• ...
we will look at some of these results in the last lecture
16
A simple abstract interpreter computing Signs
concrete semantics
• executable specification (in ML) of the denotational semantics of untyped $\lambda$-calculus without recursion
abstract semantics
• abstract interpreter computing on the domain Sign
17
The language: syntax
type ide = Id of string

type exp =
  | Eint of int
  | Var of ide
  | Times of exp * exp
  | Ifthenelse of exp * exp * exp
  | Fun of ide * exp
  | Appl of exp * exp
18
A program
Fun (Id "x",
  Ifthenelse (Var (Id "x"),
    Times (Var (Id "x"), Var (Id "x")),
    Times (Var (Id "x"), Eint (-1))))
corresponds to the ML expression
function x -> if x = 0 then x * x else x * (-1)
19
Concrete semantics
denotational interpreter
eager semantics
separation from the main semantic evaluation function of the primitive operations
• which will then be replaced by their abstract versions
abstraction of concrete values
• identity function in the concrete semantics
symbolic “non-deterministic” semantics of the conditional
20
Semantic domains
type eval =
  | Funval of (eval -> eval)
  | Int of int
  | Wrong

let alfa x = x

type env = ide -> eval

let emptyenv (x: ide) = alfa(Wrong)
let applyenv ((x: env), (y: ide)) = x y
let bind ((r:env), (l:ide), (e:eval)) (lu:ide) =
  if lu = l then e else r(lu)
21
Semantic evaluation function
let rec sem (e:exp) (r:env) = match e with
  | Eint(n) -> alfa(Int(n))
  | Var(i) -> applyenv(r,i)
  | Times(a,b) -> times ((sem a r), (sem b r))
  | Ifthenelse(a,b,c) ->
      let a1 = sem a r in
      (if valid(a1) then sem b r
       else (if unsatisfiable(a1) then sem c r
             else merge(a1, sem b r, sem c r)))
  | Fun(ii,aa) -> makefun(ii,aa,r)
  | Appl(a,b) -> applyfun(sem a r, sem b r)
22
Primitive operations
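let times (x,y) = match (x,y) with
  | (Int nx, Int ny) -> Int (nx * ny)
  | _ -> alfa(Wrong)

(* the guard is valid exactly when its value is Int 0 *)
let valid x = match x with
  | Int n -> n = 0
  | _ -> false  (* case added: a non-integer guard is left to merge *)

let unsatisfiable x = match x with
  | Int n -> n <> 0
  | _ -> false  (* case added: a non-integer guard is left to merge *)

let merge (a,b,c) = match a with
  | Int n -> if b = c then b else alfa(Wrong)
  | _ -> alfa(Wrong)

let applyfun ((x:eval),(y:eval)) = match x with
  | Funval f -> f y
  | _ -> alfa(Wrong)

(* sem and makefun call each other: in a single source file, makefun
   must be joined to the definition of sem with "and" *)
let rec makefun (ii,aa,r) =
  Funval (function d ->
    if d = alfa(Wrong) then alfa(Wrong)
    else sem aa (bind(r,ii,d)))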
23
From the concrete to the collecting semantics
the concrete semantic evaluation function
• sem: exp -> env -> eval
the collecting semantic evaluation function
• semc: exp -> env -> $\wp$(eval)
• semc e r = {sem e r}
• all the concrete primitive operations have to be lifted to $\wp$(eval) in the design of the abstract operations
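The lifting of a primitive operation to the powerset can be sketched as follows. This is a simplified illustration that assumes the values are plain integers (the `Funval` and `Wrong` cases of `eval` are left out) and uses the standard library's `Set.Make`, with the `Int` module available from OCaml 4.08. The names `lift2` and `times_c` are hypothetical.

```ocaml
module IntSet = Set.Make (Int)

(* Collecting version of a binary primitive: apply it to every pair of elements. *)
let lift2 (f : int -> int -> int) (xs : IntSet.t) (ys : IntSet.t) : IntSet.t =
  IntSet.fold
    (fun x acc -> IntSet.fold (fun y acc -> IntSet.add (f x y) acc) ys acc)
    xs IntSet.empty

(* Lifted multiplication on sets of integers. *)
let times_c = lift2 ( * )

(* On singletons the lifted operation coincides with the standard one. *)
let single = times_c (IntSet.singleton 3) (IntSet.singleton (-2))

(* On larger sets it collects all possible results. *)
let many = times_c (IntSet.of_list [0; 1]) (IntSet.of_list [-1])
```

Here `single` is the set {-6} and `many` is the set {-1, 0}. The collecting semantics only ever needs the singleton case, but the abstract operations are designed against the general one.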
24
Example of concrete evaluation
# let esempio = sem (Fun
    (Id "x",
     Ifthenelse
       (Var (Id "x"), Times (Var (Id "x"), Var (Id "x")),
        Times (Var (Id "x"), Eint (-1))))) emptyenv;;
val esempio : eval = Funval <fun>
# applyfun(esempio,Int 0);;
- : eval = Int 0
# applyfun(esempio,Int 1);;
- : eval = Int -1
# applyfun(esempio,Int(-1));;
- : eval = Int 1
in the “virtual” collecting version
applyfunc(esempio,{Int 0,Int 1}) = {Int 0, Int -1}
applyfunc(esempio,{Int 0,Int -1}) = {Int 0, Int 1}
applyfunc(esempio,{Int -1,Int 1}) = {Int 1, Int -1}
25
From the collecting to the abstract semantics
concrete domain: $(\wp(\mathit{ceval}), \subseteq)$
concrete (non-collecting) environment: cenv = ide -> ceval
abstract domain: $(\mathit{eval}, \le)$
abstract environment: env = ide -> eval
the collecting semantic evaluation function
• semc: exp -> env -> $\wp$(ceval)
the abstract semantic evaluation function
• sem: exp -> env -> eval
26
The Sign Abstract Domain
concrete domain $(\wp(Z), \subseteq)$: sets of integers
abstract domain $(\mathit{Sign}, \le)$

            top
          /     \
        0-       0+
       /  \     /  \
      -     0      +
       \    |     /
           bot
27
Redefining eval for Sign
type ceval = Funval of (ceval -> ceval) | Int of int | Wrong

type eval = Afunval of (eval -> eval) | Top | Bottom | Zero | Zerop | Zerom | P | M

let alfa x = match x with
  | Wrong -> Top
  | Int n -> if n = 0 then Zero else if n > 0 then P else M

the partial order relation $\le$
• the relation shown in the Sign lattice, extended with its lifting to functions
• there exist no infinite increasing chains
• we might add a recursive function construct and find a way to compute the abstract least fixpoint in a finite number of steps
lub and glb of eval are the obvious ones
concrete domain: $(\wp(\mathit{ceval}), \subseteq, \{\}, \mathit{ceval}, \cup, \cap)$
abstract domain: $(\mathit{eval}, \le, \mathit{Bottom}, \mathit{Top}, \mathit{lub}, \mathit{glb})$
28
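Concretization function
concrete domain: $(\wp(\mathit{ceval}), \subseteq, \{\}, \mathit{ceval}, \cup, \cap)$
abstract domain: $(\mathit{eval}, \le, \mathit{Bottom}, \mathit{Top}, \mathit{lub}, \mathit{glb})$
$$
\gamma_s(x) = \begin{cases}
\{\} & \text{if } x = \mathit{Bottom} \\
\{\mathit{Int}(y) \mid y > 0\} & \text{if } x = P \\
\{\mathit{Int}(y) \mid y \ge 0\} & \text{if } x = \mathit{Zerop} \\
\{\mathit{Int}(0)\} & \text{if } x = \mathit{Zero} \\
\{\mathit{Int}(y) \mid y \le 0\} & \text{if } x = \mathit{Zerom} \\
\{\mathit{Int}(y) \mid y < 0\} & \text{if } x = M \\
\mathit{ceval} & \text{if } x = \mathit{Top} \\
\{\mathit{Funval}(g) \mid \forall y \in \mathit{eval}\ \forall z \in \gamma_s(y),\ g(z) \in \gamma_s(f(y))\} & \text{if } x = \mathit{Afunval}(f)
\end{cases}
$$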
29
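Abstraction function
concrete domain: $(\wp(\mathit{ceval}), \subseteq, \{\}, \mathit{ceval}, \cup, \cap)$
abstract domain: $(\mathit{eval}, \le, \mathit{Bottom}, \mathit{Top}, \mathit{lub}, \mathit{glb})$
$$
\alpha_s(y) = \mathit{glb}\left\{
\begin{array}{ll}
\mathit{Bottom}, & \text{if } y = \{\} \\
M, & \text{if } y \subseteq \{\mathit{Int}(z) \mid z < 0\} \\
\mathit{Zerom}, & \text{if } y \subseteq \{\mathit{Int}(z) \mid z \le 0\} \\
\mathit{Zero}, & \text{if } y \subseteq \{\mathit{Int}(0)\} \\
\mathit{Zerop}, & \text{if } y \subseteq \{\mathit{Int}(z) \mid z \ge 0\} \\
P, & \text{if } y \subseteq \{\mathit{Int}(z) \mid z > 0\} \\
\mathit{Top}, & \text{if } y \subseteq \mathit{ceval} \\
\mathit{lub}\{\mathit{Afunval}(f) \mid \mathit{Funval}(g) \in \gamma_s(\mathit{Afunval}(f))\}, & \text{if } y \subseteq \{\mathit{Funval}(g)\} \text{ and } \mathit{Funval}(g) \in y
\end{array}
\right\}
$$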
30
Galois connection
$\alpha_s$ and $\gamma_s$
• are monotonic
• define a Galois connection
31
Times Sign

times  bot  -    0-   0    0+   +    top
bot    bot  bot  bot  bot  bot  bot  bot
-      bot  +    0+   0    0-   -    top
0-     bot  0+   0+   0    0-   0-   top
0      bot  0    0    0    0    0    0
0+     bot  0-   0-   0    0+   0+   top
+      bot  -    0-   0    0+   +    top
top    bot  top  top  0    top  top  top

optimal (hence correct) and complete (no approximation)
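The table can be transcribed into OCaml and checked against integer multiplication on a small sample. The sketch below is an illustration, not the author's code: it drops the `Afunval` constructor, abstracts plain integers instead of `ceval` values, and introduces the helper names `leq`, `range`, `all` and `sound`.

```ocaml
type eval = Top | Bottom | Zero | Zerop | Zerom | P | M

(* Abstraction of a single integer (restriction of alfa to the Int case). *)
let alfa n = if n = 0 then Zero else if n > 0 then P else M

(* The order of the Sign lattice. *)
let leq a b =
  a = b || a = Bottom || b = Top
  || (match a, b with
      | (Zero | P), Zerop | (Zero | M), Zerom -> true
      | _ -> false)

(* Abstract multiplication, following the table row by row. *)
let times (x, y) =
  match x, y with
  | Bottom, _ | _, Bottom -> Bottom
  | Zero, _ | _, Zero -> Zero
  | Top, _ | _, Top -> Top
  | P, s | s, P -> s
  | M, M -> P
  | M, Zerom | Zerom, M | Zerom, Zerom | Zerop, Zerop -> Zerop
  | M, Zerop | Zerop, M | Zerom, Zerop | Zerop, Zerom -> Zerom

let range = [-3; -2; -1; 0; 1; 2; 3]
let all = [Top; Bottom; Zero; Zerop; Zerom; P; M]

(* Correctness on the sample: whenever x is described by a and y by b,
   the product x * y is described by times (a, b). *)
let sound =
  List.for_all (fun a ->
    List.for_all (fun b ->
      List.for_all (fun x ->
        List.for_all (fun y ->
          not (leq (alfa x) a && leq (alfa y) b)
          || leq (alfa (x * y)) (times (a, b)))
          range)
        range)
      all)
    all
```

`sound` evaluates to `true`. This is only a finite test over the integers -3..3, not a proof of the optimality and completeness stated on the slide.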
32
Abstract operations
in addition to times and lub
let valid x = match x with
  | Zero -> true
  | _ -> false

let unsatisfiable x = match x with
  | M -> true
  | P -> true
  | _ -> false

let merge (a,b,c) = match a with
  | Afunval(_) -> Top
  | _ -> lub(b,c)

let applyfun ((x:eval),(y:eval)) = match x with
  | Afunval f -> f y
  | _ -> alfa(Wrong)

let rec makefun (ii,aa,r) =
  Afunval (function d ->
    if d = alfa(Wrong) then d
    else sem aa (bind(r,ii,d)))

sem is left unchanged
33
An example of abstract evaluation
# let esempio = sem (Fun
    (Id "x",
     Ifthenelse
       (Var (Id "x"), Times (Var (Id "x"), Var (Id "x")),
        Times (Var (Id "x"), Eint (-1))))) emptyenv;;
val esempio : eval = Afunval <fun>
# applyfun(esempio,P);;
- : eval = M
# applyfun(esempio,Zero);;
- : eval = Zero
# applyfun(esempio,M);;
- : eval = P
# applyfun(esempio,Zerop);;
- : eval = Top
# applyfun(esempio,Zerom);;
- : eval = Zerop
# applyfun(esempio,Top);;
- : eval = Top
applyfunc(esempio,{Int 0,Int 1}) = {Int 0, Int -1}
applyfunc(esempio,{Int 0,Int -1}) = {Int 0, Int 1}
applyfunc(esempio,{Int -1,Int 1}) = {Int 1, Int -1}
wrt the abstraction of the concrete (collecting) semantics, approximation for Zerop
• no abstract operations which “invent” the values Zerop and Zerom
• which are the only ones on which the conditional takes both ways and can introduce approximation
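The pieces of the abstract interpreter can be assembled into one file as sketched below. Several points are assumptions made to keep the block self-contained: `lub` and `times` are written out by hand (the slides call them "the obvious ones" and give `times` only as a table), `alfa_int` abstracts integers directly and `Top` is written where the slides use `alfa(Wrong)`, functional arguments to `lub` and `times` are sent to `Top`, and `makefun` tests its argument by pattern matching instead of `=`.

```ocaml
type ide = Id of string

type exp =
  | Eint of int
  | Var of ide
  | Times of exp * exp
  | Ifthenelse of exp * exp * exp
  | Fun of ide * exp
  | Appl of exp * exp

type eval = Afunval of (eval -> eval) | Top | Bottom | Zero | Zerop | Zerom | P | M

(* Abstraction of an integer constant. *)
let alfa_int n = if n = 0 then Zero else if n > 0 then P else M

type env = ide -> eval
let emptyenv (_ : ide) : eval = Top
let applyenv ((r : env), (x : ide)) = r x
let bind ((r : env), (l : ide), (e : eval)) : env =
  fun lu -> if lu = l then e else r lu

(* Assumed lub on Sign; functional values are approximated by Top. *)
let lub (a, b) =
  match a, b with
  | Afunval _, _ | _, Afunval _ -> Top
  | Bottom, x | x, Bottom -> x
  | x, y when x = y -> x
  | (Zero | P | Zerop), (Zero | P | Zerop) -> Zerop
  | (Zero | M | Zerom), (Zero | M | Zerom) -> Zerom
  | _ -> Top

(* Abstract multiplication, transcribed from the Times table. *)
let times (x, y) =
  match x, y with
  | Afunval _, _ | _, Afunval _ -> Top
  | Bottom, _ | _, Bottom -> Bottom
  | Zero, _ | _, Zero -> Zero
  | Top, _ | _, Top -> Top
  | P, s | s, P -> s
  | M, M -> P
  | M, Zerom | Zerom, M | Zerom, Zerom | Zerop, Zerop -> Zerop
  | M, Zerop | Zerop, M | Zerom, Zerop | Zerop, Zerom -> Zerom

let valid = function Zero -> true | _ -> false
let unsatisfiable = function M | P -> true | _ -> false
let merge (a, b, c) = match a with Afunval _ -> Top | _ -> lub (b, c)
let applyfun ((x : eval), (y : eval)) = match x with Afunval f -> f y | _ -> Top

let rec sem (e : exp) (r : env) : eval =
  match e with
  | Eint n -> alfa_int n
  | Var i -> applyenv (r, i)
  | Times (a, b) -> times (sem a r, sem b r)
  | Ifthenelse (a, b, c) ->
      let a1 = sem a r in
      if valid a1 then sem b r
      else if unsatisfiable a1 then sem c r
      else merge (a1, sem b r, sem c r)
  | Fun (ii, aa) -> makefun (ii, aa, r)
  | Appl (a, b) -> applyfun (sem a r, sem b r)

(* Top plays the role of alfa(Wrong): it is propagated without evaluating the body. *)
and makefun (ii, aa, r) =
  Afunval (fun d -> match d with Top -> Top | _ -> sem aa (bind (r, ii, d)))

let esempio =
  sem
    (Fun (Id "x",
       Ifthenelse (Var (Id "x"),
         Times (Var (Id "x"), Var (Id "x")),
         Times (Var (Id "x"), Eint (-1)))))
    emptyenv
```

With these definitions, `applyfun (esempio, Zerop)` evaluates to `Top`: the guard is neither valid nor unsatisfiable, so both branches are evaluated and their results `Zerop` and `Zerom` are joined. The abstraction of the collecting result for {Int 0, Int 1} would be `Zerom`, which is the approximation the slide points out.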
34
Recursion
the language has no recursion
• fixpoint computations are not needed
if (sets of) functions on the concrete domain are abstracted to functions on the abstract domain, we must be careful in the case of recursive definitions
• a naïve solution might cause the application of a recursive abstract function to diverge, even if the domain is finite
• we might never get rid of recursion because the guard in the conditional is neither valid nor unsatisfiable
• we cannot explicitly compute the fixpoint, because equivalence on functions cannot be expressed
• termination can only be obtained by a loop checking mechanism (finitely many different recursive calls)
we will see a different solution in a case where (sets of) functions are abstracted to non functional values
• the explicit fixpoint computation will then be possible