Efficient Field-Sensitive Pointer Analysis for C
David J. Pearce, Paul H.J. Kelly and Chris Hankin
Imperial College, London, UK
[email protected]/~djp1/
What is Pointer Analysis?
Determine pointer targets without running program
What is flow-insensitive pointer analysis?
> One solution for all statements – so precision lost
> This is a trade-off for efficiency over precision
> This work considers flow-insensitive pointer analysis only
int a,b,*p,*q = NULL;
p = &a;
if(…) q = p;    // p → {a,b}, q → {a,NULL}
p = &b;
Pointer analysis via set-constraints
Generate set-constraints from program and solve them
> Use constraint graph for efficient solving
int a,b,c,*p,*q,*r;
p = &a;           // p ⊇ { a }
r = &b;           // r ⊇ { b }
q = &c;           // q ⊇ { c }
if(...) q = p;    // q ⊇ p
else q = r;       // q ⊇ r
(program) (constraints)
(constraint graph) nodes p, q, r; edges p → q and r → q
> Initially: p {a}, r {b}, q {c}
> After solving: p {a}, r {b}, q {a,b,c}
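The propagation on this slide can be sketched as a small fixed-point loop. This is a minimal illustration only, not the authors' solver: the node numbering, the bitmask encoding of solution sets, and the names Graph, solve and solve_slide_example are assumptions made for this sketch.

```c
#include <assert.h>

/* Hypothetical numbering for this sketch: pointer variables are graph
   nodes, and the objects a, b, c are bits in an unsigned bitmask. */
enum { P, Q, R, NUM_NODES };
enum { OBJ_A = 1u << 0, OBJ_B = 1u << 1, OBJ_C = 1u << 2 };

typedef struct {
    unsigned sol[NUM_NODES];           /* solution set of each node  */
    int edge[NUM_NODES][NUM_NODES];    /* edge[u][v] != 0: v ⊇ u     */
} Graph;

/* Propagate solution sets along edges until nothing changes. */
void solve(Graph *g)
{
    assert(g != 0);
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int u = 0; u < NUM_NODES; u++)
            for (int v = 0; v < NUM_NODES; v++)
                if (g->edge[u][v] && (g->sol[u] & ~g->sol[v])) {
                    g->sol[v] |= g->sol[u];
                    changed = 1;
                }
    }
}

/* Build the graph for the program on the slide and return q's solution. */
unsigned solve_slide_example(void)
{
    Graph g = {0};
    g.sol[P] = OBJ_A;    /* p = &a;  p ⊇ { a } */
    g.sol[R] = OBJ_B;    /* r = &b;  r ⊇ { b } */
    g.sol[Q] = OBJ_C;    /* q = &c;  q ⊇ { c } */
    g.edge[P][Q] = 1;    /* q = p;   q ⊇ p     */
    g.edge[R][Q] = 1;    /* q = r;   q ⊇ r     */
    solve(&g);
    return g.sol[Q];
}
```

Calling solve_slide_example() from a test or a main function returns the bitmask for q; comparing it with OBJ_A | OBJ_B | OBJ_C checks the solved graph shown above. A worklist would avoid rescanning every edge, but the plain loop keeps the sketch short.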
Field-Sensitivity
How to deal with aggregate types?
> Standard approach treats them as single variables
typedef struct { int *f1; int *f2; } t1;
int a,b,*p,*q,*r;
t1 x;
p = &a;       // p ⊇ { a }
q = &b;       // q ⊇ { b }
x.f1 = p;     // x ⊇ p
x.f2 = q;     // x ⊇ q
r = x.f1;     // r ⊇ x
(constraint graph) edges p → x, q → x, x → r
> Initially: p {a}, q {b}, x {}, r {}
> After solving: p {a}, q {b}, x {a,b}, r {a,b}
Field-Sensitivity – A simple solution
Use a separate node per field for each aggregate
> Node “x” split in two
typedef struct { int *f1; int *f2; } t1;
int a,b,*p,*q,*r;
t1 x;
p = &a;       // p ⊇ { a }
q = &b;       // q ⊇ { b }
x.f1 = p;     // xf1 ⊇ p
x.f2 = q;     // xf2 ⊇ q
r = x.f1;     // r ⊇ xf1
(constraint graph) edges p → xf1, q → xf2, xf1 → r
> Initially: p {a}, q {b}, xf1 {}, xf2 {}, r {}
> After solving: p {a}, q {b}, xf1 {a}, xf2 {b}, r {a}
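To compare the two treatments of x side by side, here is a minimal sketch that solves the same five constraints either with one node for x or with one node per field. The node numbering, the bitmask encoding, and the function name solve_r are assumptions made for illustration, not part of the presented system.

```c
#include <assert.h>

/* Hypothetical node numbering: with field_sensitive == 0 both fields of
   x share node X_F1, which mimics treating x as a single variable. */
enum { P, Q, R, X_F1, X_F2, NUM_NODES };
enum { OBJ_A = 1u << 0, OBJ_B = 1u << 1 };

/* Solve the constraints of the slide's program and return r's solution. */
unsigned solve_r(int field_sensitive)
{
    unsigned sol[NUM_NODES] = {0};
    int edge[NUM_NODES][NUM_NODES] = {{0}};
    int f1 = X_F1;
    int f2 = field_sensitive ? X_F2 : X_F1;

    sol[P] = OBJ_A;      /* p = &a;    p ⊇ { a }  */
    sol[Q] = OBJ_B;      /* q = &b;    q ⊇ { b }  */
    edge[P][f1] = 1;     /* x.f1 = p;  x.f1 ⊇ p   */
    edge[Q][f2] = 1;     /* x.f2 = q;  x.f2 ⊇ q   */
    edge[f1][R] = 1;     /* r = x.f1;  r ⊇ x.f1   */

    /* Propagate along edges until a fixed point is reached. */
    for (int changed = 1; changed; ) {
        changed = 0;
        for (int u = 0; u < NUM_NODES; u++)
            for (int v = 0; v < NUM_NODES; v++)
                if (edge[u][v] && (sol[u] & ~sol[v])) {
                    sol[v] |= sol[u];
                    changed = 1;
                }
    }
    return sol[R];
}
```

With field_sensitive set to 0 the result for r contains both a and b; with 1 it contains only a, matching the two solved graphs.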
Problem – can take address of field in C
System thus far has no mechanism for this
First idea – use string concatenation operator ||
> Works well for this example
typedef struct { int *f1; int *f2; } t1;
int **p;
t1 x,*s;
s = &x;          // s ⊇ { x }
p = &(s->f2);    // p ⊇ (*s) || f2
                 // p ⊇ { x } || f2
                 // p ⊇ { xf2 }
(nodes) xf1 {..}, xf2 {..}
Problem – compatible types
First idea – use string concatenation operator ||
> Casting identical types except for field names
> Derivation same as before, but node xf2 no longer exists!
typedef struct { int *f1; int *f2; } t1;
typedef struct { int *f3; int *f4; } t2;
int **p;
t1 *s; t2 x;
s = (t1*) &x;    // s ⊇ { x }
p = &(s->f2);    // p ⊇ (*s) || f2
                 // p ⊇ { x } || f2
                 // p ⊇ { xf2 }
(nodes) xf3 {..}, xf4 {..}
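The failure of the concatenation idea can be made concrete with a name lookup. The following is a rough sketch under assumed names (lookup_field_node, NODES_T1, NODES_T2 are hypothetical): it forms base || field as a string and searches a table of node names, once for x declared as t1 and once for x declared as t2.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Look up the node named base || field in a table of node names.
   Returns its index, or -1 when no such node exists. */
int lookup_field_node(const char *const nodes[], int num_nodes,
                      const char *base, const char *field)
{
    char name[64];
    int len = snprintf(name, sizeof name, "%s%s", base, field);
    if (len < 0 || (size_t)len >= sizeof name)
        return -1;                        /* name did not fit */
    for (int i = 0; i < num_nodes; i++)
        if (strcmp(nodes[i], name) == 0)
            return i;
    return -1;
}

/* Node tables for the two slides: x declared as t1, then as t2. */
const char *const NODES_T1[] = { "xf1", "xf2" };
const char *const NODES_T2[] = { "xf3", "xf4" };
```

Looking up "x" || "f2" succeeds against NODES_T1 but returns -1 against NODES_T2, because after the cast the field name comes from t1 while the nodes were named from t2.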
Field-Sensitivity – Our Solution
typedef struct { int *f1; int *f2; } t1;
typedef struct { int *f3; int *f4; } t2;
int **p;
t1 *s; t2 x;
s = (t1*) &x;    // s ⊇ { xf3 }
                 // s ⊇ { 2 }
p = &(s->f2);    // p ⊇ s + 1
                 // p ⊇ { 2 } + 1
                 // p ⊇ { 3 }
Our solution – map variables to integers
> Solution sets become integer sets
> Use integer addition to model taking address of field
> Address of aggregate modelled by address of its first field
(variable-to-integer mapping) p = 0, s = 1, xf3 = 2, xf4 = 3
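The integer encoding lends itself to a very small sketch. Here solution sets are bitmasks over the variable numbers from the slide, and the constraint p ⊇ s + k becomes a left shift by k. The names IntSet, add_offset and solve_p are hypothetical, and dropping members that fall outside the known variables is an assumption of this sketch rather than something stated on the slides.

```c
#include <assert.h>

/* Variable-to-integer mapping from the slide: p=0, s=1, xf3=2, xf4=3. */
enum { VAR_P, VAR_S, VAR_XF3, VAR_XF4, NUM_VARS };

/* A solution set is a bitmask: bit i set means variable i is a target. */
typedef unsigned IntSet;

/* Model "dst ⊇ src + k": add k to every member of the set. Members that
   would land beyond the last known variable are dropped (assumption). */
IntSet add_offset(IntSet set, unsigned k)
{
    assert(k < NUM_VARS);
    IntSet all = (1u << NUM_VARS) - 1u;
    return (set << k) & all;
}

/* s = (t1*) &x;   s ⊇ { xf3 }, i.e. { 2 }
   p = &(s->f2);   p ⊇ s + 1, since f2 is the second field of t1 */
IntSet solve_p(void)
{
    IntSet s = 1u << VAR_XF3;
    return add_offset(s, 1);
}
```

solve_p() yields the set containing only xf4 (integer 3), even though the access was written through t1's field name f2. No name lookup is involved, which is why the cast no longer matters.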
Experimental Study
Benchmark                  Analysis             Time (s)    Avg Deref Size
bash (55324 LOC)           Field-insensitive    0.51        543.0
                           Field-sensitive      0.53        86.7
emacs (93151 LOC)          Field-insensitive    0.4         79.3
                           Field-sensitive      0.69        5.4
sendmail (49053 LOC)       Field-insensitive    0.49        558.4
                           Field-sensitive      2.05        214.2
Named (75599 LOC)          Field-insensitive    30.0        2865.5
                           Field-sensitive      129.1       2167.7
ghostscript (159853 LOC)   Field-insensitive    277.4       7703.1
                           Field-sensitive      2510.4      7365.2
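One way to read the precision column is as a ratio of field-insensitive to field-sensitive average dereference size. The sketch below computes that ratio. The figures assume each fused pair in the transcript splits into two one-decimal values (for example 543.0 and 86.7 for bash), and the names Row, ROWS and deref_reduction are hypothetical.

```c
#include <assert.h>

/* Average dereference sizes as read from the table (assumed split of
   the fused figures: one decimal place per value). */
typedef struct {
    const char *name;
    double insensitive;
    double sensitive;
} Row;

const Row ROWS[] = {
    { "bash",        543.0,   86.7 },
    { "emacs",        79.3,    5.4 },
    { "sendmail",    558.4,  214.2 },
    { "Named",      2865.5, 2167.7 },
    { "ghostscript", 7703.1, 7365.2 },
};

/* Ratio of field-insensitive to field-sensitive average dereference
   size; larger values mean a bigger precision gain. */
double deref_reduction(const Row *row)
{
    assert(row != 0 && row->sensitive > 0.0);
    return row->insensitive / row->sensitive;
}
```

Under that reading the ratio for bash is larger than for Named, which in turn is larger than for ghostscript, where it is only slightly above 1.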
Conclusion
Field-sensitive Pointer Analysis
> Presented new technique for C language
> Elegantly copes with language features
- Taking address of field
- Compatible types and casting
- Technique also handles function pointers without modification
> Experimental evaluation over 7 common C programs
- Considerable improvements in precision obtained
- But, much higher solving times
- And, relative gains appear to diminish with larger benchmarks
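Constraint Graphs (continued)
What about statements involving a pointer dereference?
> Cannot be represented in the constraint graph
> Instead, add edges as solution of q becomes known
> Thus, computation similar to dynamic transitive closure
int a,*r,*s,**p,**q;
p = &r;     // p ⊇ { r }
s = &a;     // s ⊇ { a }
q = p;      // q ⊇ p
*q = s;     // *q ⊇ s
            // r ⊇ s  (added once q ⊇ { r } is known)
(program) (constraints)
(constraint graph) edge p → q; edge s → r added during solving
> Initially: p {r}, s {a}, q {}, r {}
> After propagating along p → q: q {r}, so edge s → r is added
> After solving: p {r}, s {a}, q {r}, r {a}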
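The idea of adding edges while solving can be sketched as below. This is an illustrative fixed-point loop, not the authors' algorithm: variables double as graph nodes and as bits in a solution bitmask, and the names Graph, store, solve and solution_of are assumptions of the sketch.

```c
#include <assert.h>

/* Hypothetical numbering: variables a, r, s, p, q are graph nodes, and
   bit i in a solution bitmask means "may point to variable i". */
enum { A, R, S, P, Q, NUM_NODES };

typedef struct {
    unsigned sol[NUM_NODES];           /* solution set per node      */
    int edge[NUM_NODES][NUM_NODES];    /* edge[u][v] != 0: v ⊇ u     */
    int store[NUM_NODES][NUM_NODES];   /* store[d][u] != 0: *d ⊇ u   */
} Graph;

void solve(Graph *g)
{
    assert(g != 0);
    int changed = 1;
    while (changed) {
        changed = 0;
        /* Resolve *d ⊇ u: for every t in sol(d), add the edge u -> t. */
        for (int d = 0; d < NUM_NODES; d++)
            for (int u = 0; u < NUM_NODES; u++) {
                if (!g->store[d][u])
                    continue;
                for (int t = 0; t < NUM_NODES; t++)
                    if (((g->sol[d] >> t) & 1u) && !g->edge[u][t]) {
                        g->edge[u][t] = 1;
                        changed = 1;
                    }
            }
        /* Propagate solution sets along the edges known so far. */
        for (int u = 0; u < NUM_NODES; u++)
            for (int v = 0; v < NUM_NODES; v++)
                if (g->edge[u][v] && (g->sol[u] & ~g->sol[v])) {
                    g->sol[v] |= g->sol[u];
                    changed = 1;
                }
    }
}

/* Solve the slide's program and return the solution of one node. */
unsigned solution_of(int node)
{
    assert(node >= 0 && node < NUM_NODES);
    Graph g = {0};
    g.sol[P] = 1u << R;    /* p = &r;  p ⊇ { r } */
    g.sol[S] = 1u << A;    /* s = &a;  s ⊇ { a } */
    g.edge[P][Q] = 1;      /* q = p;   q ⊇ p     */
    g.store[Q][S] = 1;     /* *q = s;  *q ⊇ s    */
    solve(&g);
    return g.sol[node];
}
```

solution_of(Q) gives the set containing r, and solution_of(R) gives the set containing a. The second result only appears after the edge s → r has been added in response to q's solution, which is the dynamic part of the closure.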