CSSV – C String Static Verifier
Nurit Dor, Michael Rodeh, Mooly Sagiv, Greta Yorsh
Tel-Aviv University
http://www.cs.tau.ac.il/~nurr
The Problem: Detecting String Manipulation Errors
• An important problem
  – Common errors
  – Cause security vulnerabilities
• A challenging problem
  – Use of pointers
  – Use of pointer arithmetic
  – Error point vs. failure point
Example – unsafe call to strcpy()
void simple(void)
{
    char s[20];
    char *p;
    char t[10];

    strcpy(s, "Hello");
    p = s + 5;
    strcpy(p, " world!");
    strcpy(t, s);    /* unsafe: s now holds 13 bytes, t has room for 10 */
}
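The overflow in simple() follows from plain arithmetic on the sizes involved. The sketch below is only an illustration of that arithmetic, not part of CSSV; the helper names bytes_needed and simple_copy_size are hypothetical. It replays the two safe copies and reports how many bytes the final strcpy(t, s) would write, without performing the unsafe copy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of bytes strcpy() writes when copying src,
   i.e. the string length plus the terminating null byte. */
size_t bytes_needed(const char *src) {
    return strlen(src) + 1;
}

/* Rebuilds the contents of s from simple() and returns how many bytes
   the final strcpy(t, s) would write into t. */
size_t simple_copy_size(void) {
    char s[20];
    char *p;

    strcpy(s, "Hello");      /* writes 6 bytes into s[20]: safe */
    p = s + 5;               /* p points at the null byte after "Hello" */
    strcpy(p, " world!");    /* writes 8 bytes from offset 5; 5 + 8 = 13 <= 20: safe */
    return bytes_needed(s);  /* bytes that strcpy(t, s) would write */
}
```

Comparing the returned size with sizeof t (10) shows that the third call cannot be safe.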
Complicated Example
/* from web2c [fixwrites.c] */
#define BUFSIZ 1024
char buf[BUFSIZ];

char *insert_long(char *cp)
{
    char temp[BUFSIZ];
    …
    for (i = 0; &buf[i] < cp; ++i)
        temp[i] = buf[i];
    strcpy(&temp[i], "(long)");
    strcpy(&temp[i+6], cp);
    …
}
[Figure: cp points into buf; temp receives the prefix of buf up to cp, then "(long)", then the string at cp]
Real Example
void RTC_Si_SkipLine(const INT32 NbLine, char ** const PtrEndText)
{
    INT32 indice;

    for (indice = 0; indice < NbLine; indice++) {
        **PtrEndText = '\n';
        (*PtrEndText)++;
    }
    **PtrEndText = '\0';
    return;
}
[Figure: the procedure writes NbLine + 1 bytes starting at *PtrEndText]
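The procedure is safe only if the caller guarantees NbLine + 1 writable bytes at *PtrEndText. As a rough sketch of that requirement, here is a checked variant. It is not the EADS code: the function name skip_line_checked and the extra parameter remaining (the number of writable bytes left) are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical checked variant of RTC_Si_SkipLine.
   `remaining` is the number of writable bytes left at *end_text.
   Returns 0 on success, -1 if the newlines plus the null byte do not fit. */
int skip_line_checked(int nb_line, char **end_text, size_t remaining) {
    /* A non-positive count writes no newline, as in the original loop. */
    size_t newlines = nb_line > 0 ? (size_t)nb_line : 0;
    size_t i;

    if (newlines + 1 > remaining)
        return -1;               /* would write past the end of the buffer */
    for (i = 0; i < newlines; i++) {
        **end_text = '\n';
        (*end_text)++;
    }
    **end_text = '\0';           /* the final byte, number newlines + 1 */
    return 0;
}
```

A static verifier aims to prove the same inequality at every call site instead of testing it at run time.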
Vulnerable String Manipulation
• Pointers to buffers
  – char *p = buffer; … while ( ) p++;
• Standard string manipulation functions
  – strcpy(), strcat(), …
• NULL termination
  – strncpy(), …
Are String Violations Common?
• FUZZ study (1995)
  – Random test programs on various systems
  – 9 different UNIX systems
  – 18% – 23% hang or crash
  – 80% are string related errors
• CERT advisory
  – 50% of attacks are abuses of buffer overflows
Current Methods
• Runtime
  – Safe-C [PLDI'94]
  – Purify
  – Bound-checking
  – …
• Static + Runtime
  – CCured [POPL'02]
• Static
  – Wagner et al. [NDSS'00]
  – LCLint's extension [USENIX'01]
  – Dor, Rodeh and Sagiv [SAS'01]
Goals
• Static detection of string errors
  – References beyond array limit
  – Unsafe pointer arithmetic
  – Missing null terminator
  – Additional properties:
    • References beyond null
    • Specified using preconditions
• Sound
  – Never miss errors
  – Few false alarms
IS IT POSSIBLE?
Challenges in Static Analysis
• Soundness
• Precision
  – Combine integer and pointer analysis
      *(p+i) = '\0'; strcpy(q, p);
• Scalability to handle real applications
  – Complexity of chaotic iterations
  – Handles full C
CSSV Solution
• Use powerful static domain
  – Exponential abstract interpretation
• Use pre- and post-conditions to specify procedure requirements on strings
  – No interprocedural analysis
  – Modular analysis
• Automatic generation of procedure specification
CSSV
[Figure: CSSV architecture. Components: C files, Procedure name, Pointer Analysis, Procedure's pointer info, Pre/Mod/Post specification, C2IP, Integer Proc, Integer Analysis, Potential Error Messages, AWP.]
Advantages of Specifications
• Allows modular analysis
  – Not all the code is available
  – Enables more precise analyses
• User control of the verification
  – Detect errors at point of logical error
  – Improve the precision of the analysis
  – Check additional properties
    • Beyond ANSI-C
Specification and Soundness
• Preconditions are handled conservatively
• All errors are detected
  – Inside a procedure's body
  OR
  – At call statements to the procedure
Specification – strcpy
char* strcpy(char* dst, char *src)
    requires ( string(src) ∧ alloc(dst) > len(src) )
    mod      dst.strlen, dst.is_nullt
    ensures  ( len(dst) == pre@len(src) ∧ return == pre@dst )
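One way to read the strcpy contract is as a run-time check. The sketch below is a hedged illustration, not CSSV's mechanism (CSSV checks the contract statically). The name strcpy_checked and the parameters dst_alloc and src_alloc are assumptions: they stand for alloc(dst) and alloc(src), which C does not expose, so the caller must supply them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical run-time rendering of the strcpy() contract.
   Returns dst on success, NULL if the precondition does not hold. */
char *strcpy_checked(char *dst, size_t dst_alloc,
                     const char *src, size_t src_alloc) {
    const char *nul = memchr(src, '\0', src_alloc);
    size_t len;

    if (nul == NULL)
        return NULL;              /* string(src) fails: no null byte in bounds */
    len = (size_t)(nul - src);    /* len(src) */
    if (!(dst_alloc > len))
        return NULL;              /* alloc(dst) > len(src) fails */
    memcpy(dst, src, len + 1);    /* establishes len(dst) == pre@len(src) */
    return dst;                   /* establishes return == pre@dst */
}
```

With s[20] holding "Hello world!" and t[10], strcpy_checked(t, sizeof t, s, sizeof s) returns NULL, because alloc(t) = 10 is not greater than len(s) = 12.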
Specification – insert_long()
/* insert_long.c */
#include "insert_long.h"
char buf[BUFSIZ];
char * insert_long (char *cp) {
    char temp[BUFSIZ];
    int i;
    for (i=0; &buf[i] < cp; ++i) {
        temp[i] = buf[i];
    }
    strcpy (&temp[i], "(long)");
    strcpy (&temp[i + 6], cp);
    strcpy (buf, temp);
    return cp + 6;
}

char * insert_long(char *cp)
    requires ( string(cp) ∧ buf ≤ cp < buf + BUFSIZ )
    mod      cp.strlen
    ensures  ( cp.strlen == pre[cp.strlen + 6] ∧ return_value == cp + 6 )
Difficulties with Specifications
• Legacy code
• Complexity of software
• Need to know context
CSSV – Pointer Analysis
• Models the string objects
• Pre-computes points-to information
• Determines which objects may be updated through a pointer
    char s[20];
    char *p;
    …
    p = s + 5;
    strcpy(p, " world!");
Integrating Pointer Information?
foo(char *p, char *q)
{
    char local[100];
    …
    p = local;
    *q = 0;
    …
}
main()
{
    char s[10], t[20], r[30];
    char *temp;
    foo(s, t);
    foo(s, r);
    …
    temp = s;
    …
}
[Figure: global points-to graph over s, t, r, temp, local, p, q]
Projection for foo()
foo(char *p, char *q)
{
    char local[100];
    …
    p = local;
    …
}
[Figure: points-to graph projected onto foo(): p, q, local, param#1, param#2]
C2IP – C to Integer Program
• Generate an integer program
  – Integer variables only
  – No function calls
  – Non-deterministic
• Goal:
    String error in the C program ⇒ Assert violated in the IP
C2IP – C to Integer Program
• Inline specification
• Based on points-to information
  – Generate constraint variables
  – Generate assert statements
  – Generate update statements
C2IP - Constraint Variable
• For every pointer
    p.offset
[Figure: p points into s at offset 2, so p.offset = 2]
C2IP - Constraint Variable
• For every abstract location
    aloc.is_nullt
    aloc.len
    aloc.msize
[Figure: abstract location aloc5 with aloc5.len and aloc5.msize marked]
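To make the constraint variables concrete, the sketch below models them as plain C structs and replays the earlier simple() example on integers only. It is a simplified illustration under stated assumptions, not the C2IP translation itself: the struct and function names are hypothetical, and only copies of literals with a known length are modeled.

```c
#include <stdbool.h>

/* Constraint variables for one abstract location (field names follow the slides). */
struct aloc {
    int  msize;     /* allocated size in bytes */
    int  len;       /* string length, meaningful only when is_nullt is true */
    bool is_nullt;  /* true if the location holds a null-terminated string */
};

/* Constraint variable for one pointer: its offset from the location's base. */
struct ptr {
    int offset;
};

/* Hypothetical integer model of strcpy(dst, src), where src has length src_len
   and dst points into location a. Returns false where the integer program's
   assert would be violated. */
bool ip_strcpy_literal(struct aloc *a, struct ptr dst, int src_len) {
    if (!(0 <= dst.offset && dst.offset + src_len < a->msize))
        return false;                        /* potential buffer overflow */
    /* A null byte already sitting before dst.offset still ends the string. */
    if (!a->is_nullt || dst.offset <= a->len)
        a->len = dst.offset + src_len;
    a->is_nullt = true;
    return true;
}

/* Replays simple() on the integer model; true means no assert was violated. */
bool ip_simple(void) {
    struct aloc s = { 20, 0, false }, t = { 10, 0, false };
    struct ptr ps = { 0 }, pt = { 0 }, p;
    bool ok = ip_strcpy_literal(&s, ps, 5);      /* strcpy(s, "Hello") */

    p.offset = ps.offset + 5;                    /* p = s + 5 */
    ok = ok && ip_strcpy_literal(&s, p, 7);      /* strcpy(p, " world!") */
    return ok && ip_strcpy_literal(&t, pt, s.len); /* strcpy(t, s) */
}
```

ip_simple() returns false: after the first two copies s.len is 12, and 0 + 12 < 10 does not hold for t.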
C2IP
(each C statement is followed, indented, by the integer-program statements generated for it)

    char buf[BUFSIZ];
        int buf.offset = 0;
        int sbuf.msize = BUFSIZ;
        int sbuf.len;
        int sbuf.is_nullt;

    char * insert_long (char *cp) {
        int cp.offset;

    char temp[BUFSIZ];
        int temp.offset = 0;
        int stemp.msize = BUFSIZ;
        int stemp.len;
        int stemp.is_nullt;

    int i;
        int i;

    require string(cp);
        assume(sbuf.is_nullt ∧ 0 ≤ cp.offset ≤ sbuf.len < sbuf.alloc);

    for (i=0; &buf[i] < cp; ++i) { temp[i]=cp[i]; }
        for (i=0; i < cp.offset; ++i) {
            assert(0 ≤ i < stemp.msize ∧ (stemp.is_nullt ⇒ i ≤ stemp.len));
            assert(-i ≤ cp.offset < -i + sbuf.len);
            if (sbuf.is_nullt ∧ sbuf.len == i) {
                stemp.len = i;
                stemp.is_nullt = true;
            } else …

    strcpy(&temp[i], "(long)");
        assert(0 ≤ i < stemp.msize - 6);
        assume(stemp.len == i + 6);
        …
C2IP - Update statements
    p = s + 5;
        p.offset = s.offset + 5;
C2IP - Use points-to information
    *p = 0;
        if (…) {
            aloc1.len = p.offset;
            aloc1.is_nullt = true;
        } else {
            aloc5.len = p.offset;
            aloc5.is_nullt = true;
        }
[Figure: p may point to aloc1 or aloc5]
Handling structures
• Pointer analysis handles structures
• C2IP handles pointer arithmetic
• Generate constraint variables per field
Integer Analysis
• Interval analysis is not enough
    assert(-i ≤ cp.offset < -i + sbuf.len);
• Use a powerful abstract domain
• Polyhedra (Cousot and Halbwachs, 78)
  – Statically analyzes program variable relations and detects constraints:
      a1*var1 + a2*var2 + … + an*varn ≤ b
Linear Relation Analysis
• Statically analyzes program variable relations and detects constraints:
    a1*var1 + a2*var2 + … + an*varn ≤ b
• Polyhedron
    y ≥ 1
    x + y ≥ 3
    -x + y ≤ 1
  Generators: V = <(1,2) (2,1)>, R = <(1,0) (1,1)>
[Figure: the polyhedron plotted in the x-y plane, both axes ticked from 0 to 3]
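The two descriptions of the example polyhedron, constraints and generators (vertices V, rays R), can be cross-checked with a few lines of C. This is a sketch for intuition only and has nothing to do with how a polyhedra library represents them; the function names are hypothetical, and for simplicity only the two vertices themselves (not their convex combinations) are used as starting points.

```c
#include <stdbool.h>

/* Constraint form: y >= 1, x + y >= 3, -x + y <= 1. */
bool in_polyhedron(int x, int y) {
    return y >= 1 && x + y >= 3 && -x + y <= 1;
}

/* Generator form: start at vertex (1,2) or (2,1), selected by use_second,
   then add non-negative multiples a and b of the rays (1,0) and (1,1). */
void from_generators(bool use_second, int a, int b, int *x, int *y) {
    int vx = use_second ? 2 : 1;
    int vy = use_second ? 1 : 2;

    *x = vx + a * 1 + b * 1;
    *y = vy + a * 0 + b * 1;
}
```

Every point produced by from_generators with a, b ≥ 0 satisfies in_polyhedron, while a point such as (1,3) fails the third constraint because -1 + 3 = 2 > 1.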
Integer Analysis – insert_long()
    buf.offset = 0
    temp.offset = 0
    0 ≤ cp.offset = i
    i ≤ sbuf.len < sbuf.msize
    sbuf.msize = 1024
    stemp.msize = 1024

    assert(0 ≤ i < stemp.msize - 6); // strcpy(&temp[i],"(long)");

Potential violation when cp.offset ≥ 1018
[Figure: cp points into buf; with i = cp.offset ≥ 1018, "(long)" is written past the end of temp]
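The 1018 threshold is simple arithmetic: "(long)" needs 6 characters plus a null byte, so the copy into temp fits only when i + 6 < 1024. The check below is a sketch of that arithmetic; BUFSIZ_MODEL and first_strcpy_safe are hypothetical names, with BUFSIZ_MODEL standing in for the value 1024 that the web2c excerpt gives BUFSIZ.

```c
#include <stdbool.h>

enum { BUFSIZ_MODEL = 1024 };  /* stands in for BUFSIZ in the web2c excerpt */

/* The condition attached to strcpy(&temp[i], "(long)"): six characters plus
   the null byte must fit in temp starting at index i. */
bool first_strcpy_safe(int cp_offset) {
    int i = cp_offset;  /* at loop exit the analysis knows i == cp.offset */

    return 0 <= i && i < BUFSIZ_MODEL - 6;
}
```

first_strcpy_safe(1017) holds (the null byte lands on temp[1023]), while first_strcpy_safe(1018) fails (the null byte would land on temp[1024]).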
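CSSV
[Figure: CSSV architecture for a leaf procedure. Components: C files, Leaf Procedure, Pointer Analysis, Procedure's pointer info, C2IP, side effect, Mod, Integer proc, AWP, Pre.]
CSSV
[Figure: CSSV architecture for a leaf procedure. Components: C files, Leaf Procedure, Pre, Mod, Pointer Analysis, Procedure's pointer info, C2IP, side effect, Integer proc, Integer Analysis, Potential Error Messages, Post.]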
AWP
• Approximate the Weakest Precondition
• Backward integer analysis
• Generates a precondition
AWP – insert_long()
• Generate the following precondition:
    sbuf.is_nullt ∧
    sbuf.len < sbuf.alloc ∧
    0 ≤ cp.offset ≤ sbuf.len ∧
    …
AWP – insert_long()
• Generate the following precondition:string(cp)
sbuf.len cp.offset + 1017
Not the weakest precondition:string(cp) sbuf.len 1017
Implementation
• Using:
  – ASToolKit [Microsoft]
  – GOLF [Microsoft – Manuvir Das]
  – New Polka [IMAG - Bertrand Jeannet]
• Main steps:
  – Simplifier
  – Pointer analysis
  – C2IP
  – Integer Analysis
Implementation – step 1
[Figure: step 1. Components: C files, Procedure name, Pre/Mod/Post, Simplifier, Inline Annotation, C', Core C.]
Core C
• Simplify the analysis implementation
• A limited form of C expressions
  – Adds temporaries
  – At most one operator per statement
  – Convert value into location computation
  – No loss of precision
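As an illustration of what such a simplification might look like, the sketch below places an ordinary C statement next to a hand-written form with temporaries, at most one operator per statement, and array accesses turned into explicit location computations. The slides do not show actual Simplifier output, so the second function is an assumption about the general shape, and both function names are hypothetical.

```c
/* Ordinary C: several operators in one statement. */
char copy_shifted(char *dst, const char *src, int i) {
    dst[i + 6] = src[i];
    return dst[i + 6];
}

/* Hand-written Core C-style form of the same computation. */
char copy_shifted_core(char *dst, const char *src, int i) {
    int t1;
    char *t2;
    const char *t3;
    char t4;

    t1 = i + 6;      /* index arithmetic gets its own temporary */
    t2 = dst + t1;   /* location of dst[i + 6] */
    t3 = src + i;    /* location of src[i] */
    t4 = *t3;        /* read the value */
    *t2 = t4;        /* write the value */
    return t4;
}
```

Both functions compute the same result; the second form gives a later analysis exactly one operation to translate per statement.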
Implementation – step 2
[Figure: step 2. Components: Procedure name, CoreC, GOLF pointer analysis, Global pointer info, GFC, Visible variables, Procedure's pointer projection, Procedure's pointer info.]
Implementation – steps 3, 4
[Figure: steps 3 and 4. Components: Procedure name, CoreC, GFC, Modular pointer info, C2IP, Integer Program, Pre, Integer Analysis (forward and backward), Potential Error Messages.]
Preliminary results (web2C)
Proc                    line  coreC line  time (sec)  space (Mb)  errors  FA
insert_long               14          64         2.0          13       2   0
fprintf_pascal_string     10          25         0.1         0.3       2   0
space_terminate            9          23         0.1         0.2       0   0
external_file_name        14          28         0.2         1.7       2   0
join                      15          53         0.6         5.2       2   1
remove_newline            25         105         0.6         4.6       0   0
null_terminate             9          23         0.1         0.2       2   0
Up to four times faster than SAS01
Preliminary results (EADS/RTC_Si)
Proc                    line  coreC line  time (sec)  space (Mb)  errors  FA
FiltrerCarNonImp          19          34         1.6         0.5       0   0
SkipLine                  12          42         0.8         1.9       0   0
StoreIntInBuffer          37         134         7.9          21       0   0
Status
• Implemented
  – Simplifier
  – GFC
  – Procedure's pointer analysis
  – C2IP excluding structures
  – AWP excluding side effect
• TBD
  – Structure
  – Inline specification
  – Side effect analysis
  – More applications
Conclusion
• Static checking for string errors is feasible!
  – Can show the absence of string errors in complicated string manipulation procedures
• Identified rare bugs
• Techniques used
  – Modular analysis (assume/guarantee)
  – Pointer analysis
  – Integer analysis
• Open questions
  – Can this be fully automated?
• Extension to handle dynamic allocations (ITVLA)