Conditional Correlation Analysis for Safe Region-based Memory Management
Xi Wang, Zhilei Xu, Xuezheng Liu, Zhenyu Guo, Xiaoge Wang, Zheng Zhang
Microsoft Research Asia, Tsinghua University
PLDI, June 9th 2008, Tucson
Region-based Memory Management
• Memory Region (a.k.a. Pool) is widely used
– Apache web server
– SVN (subversion) version control system
– RC compiler
– …
Region Usage
r = region_create();
a = region_alloc(r);
b = region_alloc(r);
c = region_alloc(r);
region_destroy(r);
• Allocate objects in regions
• A region owns objects
• Destroy a region, delete all objects it owns
[Figure: ownership relation, region r owns objects a, b, c]
Subregion
y = region_create(x);
z = region_create(x);
w = region_create(y);
region_destroy(x);
• One region can be a subregion of its parent
• Destroy a parent, destroy all its subregions
– Parent lives longer than sub
[Figure: subregion relation, x is the parent of y and z; y is the parent of w]
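The ownership and subregion semantics described on the two slides above can be mimicked with a small toy model. This is only a sketch in Python, not the APR or RC API: the `Region` class, its `alloc` and `destroy` methods, and the dictionary used for objects are all assumptions made for illustration.

```python
class Region:
    """Toy model of a memory region (pool); hypothetical, not a real API."""

    def __init__(self, parent=None):
        self.parent = parent
        self.subregions = []
        self.objects = []
        self.alive = True
        if parent is not None:
            parent.subregions.append(self)

    def alloc(self, name):
        """Allocate an object owned by this region."""
        assert self.alive, "cannot allocate from a destroyed region"
        obj = {"name": name, "owner": self, "alive": True}
        self.objects.append(obj)
        return obj

    def destroy(self):
        """Destroy all subregions first, then every object this region owns."""
        for sub in self.subregions:
            sub.destroy()
        for obj in self.objects:
            obj["alive"] = False
        self.alive = False


# Same shape as the slide: y and z are subregions of x, w is a subregion of y.
x = Region()
y = Region(parent=x)
z = Region(parent=x)
w = Region(parent=y)
a = w.alloc("a")

x.destroy()
print(x.alive, y.alive, z.alive, w.alive, a["alive"])
```

Destroying `x` takes down `y`, `z`, `w` and the object allocated in `w`, which is why a parent always outlives its subregions.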
Dangling pointer between regions
foo(parent)
{
    sub = region_create(parent);
    table = region_alloc(sub);
    iterator = make(parent, table);
    region_destroy(sub);
}
make(r, table)
{
    iterator = region_alloc(r);
    iterator.f = table;
}
• An object can access an object in another region
• When the pointee’s region gets destroyed earlier, the pointer becomes dangling
[Figure: access relation, iterator (in parent) points to table (in sub); destroying sub leaves a dangling pointer]
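To see the dangling pointer appear at run time, the foo/make example can be replayed on a toy heap. This is a minimal sketch in Python: the global tables and the `dangling_pointers` helper are hypothetical, and object and region names are plain strings rather than real allocations.

```python
# Toy heap state (all names here are illustrative assumptions).
live_regions = set()
parent_of = {}   # region name -> parent region name
owner_of = {}    # object name -> owning region name
fields = {}      # object name -> name of the object it points to

def region_create(name, parent=None):
    live_regions.add(name)
    parent_of[name] = parent
    return name

def region_alloc(region, obj):
    owner_of[obj] = region
    return obj

def region_destroy(region):
    # Destroy subregions first, then the region itself.
    for child, parent in list(parent_of.items()):
        if parent == region and child in live_regions:
            region_destroy(child)
    live_regions.discard(region)

def dangling_pointers():
    # A pointer dangles if its holder is still alive but its target is not.
    return sorted(
        (src, dst) for src, dst in fields.items()
        if owner_of[src] in live_regions and owner_of[dst] not in live_regions
    )

def make(r, table):
    iterator = region_alloc(r, "iterator")
    fields[iterator] = table          # iterator.f = table
    return iterator

def foo(parent):
    sub = region_create("sub", parent)
    table = region_alloc(sub, "table")
    make(parent, table)
    region_destroy(sub)

foo(region_create("parent"))
print(dangling_pointers())   # [('iterator', 'table')]
```

The iterator survives in `parent` while its target in `sub` is gone, which is exactly the situation the slide describes.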
Harmful & Really exists (in svn…)
• Further dereference: crash
• No dereference: the pointer lives longer than necessary & causes memory waste
The problem is not easy
• Correlated Relations
– Ownership : Object - Region
– Subregion : Region - Region
– Access : Object - Object
• Existing solution to safe region usage
– reference counting
– …
Solution: RegionWiz
• To verify: if object P may access object Q, then P’s owner region must be a descendant of Q’s owner region (consistent conditional correlation)
• Static analysis to infer the ownership, subregion & access relations
• Verify correlation, find dangling pointers
• Context-sensitive, heap cloning
Highlights
• Conditional correlation analysis framework for safe region usage
• Context-sensitive analysis with heap cloning
• Found bugs in real-world applications written in C (100+KLOC)
– 12 dangling pointers in 6 software packages
– 13 false alarms
• Case study & experience
Framework
[Figure: analysis pipeline. Source Code -> Compiler plug-in (Phoenix in VC) -> program info into DB tables -> Context Clone -> context-sensitive program info -> Relation Computation (datalog) -> Ownership, Subregion, Access relations -> Correlation Analysis (datalog) and Post processing -> Dangling pointer Report]
Extract Program Information
• Extract program information into DB tables
• The subsequent analysis can then be expressed as datalog inference
• Example
foo( parent ){ make(…) }
make( r , table ){ region_alloc(…) }
– Call(foo , make)
– Call(make , region_alloc)
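As a rough analogue of the extraction step, the sketch below collects Call(caller, callee) facts from source text. RegionWiz does this with a Phoenix compiler plug-in on C code; here Python's standard `ast` module is swapped in purely for illustration, and the `SOURCE` snippet is a Python transliteration of the slide's foo/make example.

```python
import ast

# Python transliteration of the slide example; it is only parsed, never run.
SOURCE = '''
def make(r, table):
    iterator = region_alloc(r)
    iterator.f = table
    return iterator

def foo(parent):
    sub = region_create(parent)
    table = region_alloc(sub)
    iterator = make(parent, table)
    region_destroy(sub)
'''

def extract_call_facts(source):
    """Return a set of (caller, callee) tuples, one per direct call by name."""
    facts = set()
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                facts.add((func.name, node.func.id))
    return facts

call = extract_call_facts(SOURCE)
for caller, callee in sorted(call):
    print(f"Call({caller}, {callee})")
```

Storing the facts as a set of tuples mirrors a database table with two columns, which is the form the later datalog rules consume.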
Context Cloning
foo(parent)
{
    sub = region_create(parent);
    table = region_alloc(sub);
    iterator = make(parent, table);
    region_destroy(sub);
}
make(r, table)
{
    iterator = region_alloc(r);
    iterator.f = table;
}
bar(parent)
{
    table = region_alloc(parent);
    iterator = make(parent, table);
}
[Figure: call graph, foo and bar each call make and ralloc; make calls ralloc]
Context Cloning
[Figure: cloned call graph with nodes foo(1), bar(1), make(1), make(2), ralloc(1), ralloc(2), ralloc(3), ralloc(4)]
Context Cloning
• Heap Cloning (Specialization)
[Figure: cloned call graph foo(1), bar(1), make(1), make(2), ralloc(1), ralloc(2), ralloc(3), ralloc(4), with cloned heap objects iterator(1), table(1), iterator(2), table(2)]
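The cloning on these slides can be sketched by enumerating call paths: every distinct path from an entry function to a callee becomes one clone. The following Python sketch assumes the call graph of the foo/bar/make example (with `ralloc` standing for region_alloc); the function name `clone_contexts`, the clone numbering, and the choice to skip recursive cycles are assumptions, not details taken from RegionWiz.

```python
# Call graph from the example; ralloc stands for region_alloc.
CALLS = {
    "foo": ["ralloc", "make"],
    "bar": ["ralloc", "make"],
    "make": ["ralloc"],
    "ralloc": [],
}
ROOTS = ["foo", "bar"]

def clone_contexts(calls, roots):
    """Enumerate every acyclic call path; each path is one cloned context."""
    contexts = []
    stack = [(root,) for root in reversed(roots)]
    while stack:
        path = stack.pop()
        contexts.append(path)
        for callee in reversed(calls[path[-1]]):
            if callee not in path:          # assumption: skip recursive cycles
                stack.append(path + (callee,))
    return contexts

contexts = clone_contexts(CALLS, ROOTS)

# Number the clones of each function: foo(1), make(1), make(2), ...
clone_count = {}
for path in contexts:
    func = path[-1]
    clone_count[func] = clone_count.get(func, 0) + 1
    print(f"{func}({clone_count[func]}) reached via {' -> '.join(path)}")
```

The counts match the figure: one clone each of foo and bar, two of make, four of ralloc. Heap cloning follows the same idea, since an allocation site inside a cloned function yields one abstract object per clone; the numbering printed here is arbitrary and need not match the slide's.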
Program Information -- Cloned
• From now on program information tables are decorated with context information
• Example
– Call(bar - 1, make - 2)
– Call(bar - 1 , region_alloc - 4)
– Call(make - 2 , region_alloc - 3)
[Figure: bar(1) calls make(2) and ralloc(4); make(2) calls ralloc(3)]
Relation computation using datalog
• Datalog rules
– How new relations can be computed from existing relations
• Basic Rules for regions
NewRegion( context ) :- Call( _ , “region_create” - context)
NewObject( context ) :- Call( _ , “region_alloc” - context)
• Context- & Field-sensitive Points-to analysis in ~20 rules
Compute the needed relations
• Access(P , Q) :- Object P points-to Q through any field.
• Own(Rgn , Obj) :- region_alloc(r) called, return value assigned to v, r points-to Rgn, v points-to Obj.
• Subregion(Sub , Parent) :- region_create(u) called, return value assigned to v, v points-to Sub, u points-to Parent.
• Descendant(Des , Ans) :- Des=Ans , NewRegion(Des), NewRegion(Ans).
• Descendant(Des , Ans) :- Descendant(Des , r), Subregion(r, Ans).
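The two Descendant rules define a reflexive, transitive closure over Subregion. A naive fixpoint evaluation of them might look like the sketch below; it is written in Python rather than datalog, the function name `descendant_closure` is an assumption, and a real datalog engine would evaluate the rules far more efficiently.

```python
def descendant_closure(regions, subregion):
    """Compute Descendant as the least fixpoint of the two rules.

    regions:   set of region names (the NewRegion facts)
    subregion: set of (sub, parent) pairs (the Subregion facts)
    """
    # Rule 1: Descendant(r, r) for every region r.
    descendant = {(r, r) for r in regions}
    changed = True
    while changed:
        changed = False
        # Rule 2: Descendant(des, ans) :- Descendant(des, r), Subregion(r, ans).
        for des, r in list(descendant):
            for sub, ans in subregion:
                if sub == r and (des, ans) not in descendant:
                    descendant.add((des, ans))
                    changed = True
    return descendant

# Regions from the Subregion slide: y, z under x; w under y.
regions = {"x", "y", "z", "w"}
subregion = {("y", "x"), ("z", "x"), ("w", "y")}
descendant = descendant_closure(regions, subregion)

print(("w", "x") in descendant)   # True: w -> y -> x
print(("z", "y") in descendant)   # False: z and y are siblings
```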
Detect unsafe usage
• Rule for detecting dangling pointer:
Warning(P , Q) :- P accesses Q, region1 owns P, region2 owns Q, not ( region1 is a descendant of region2 ).
• Report every Warning(P , Q) found; if there are none, report that the correlation is consistent
• Heuristics filter out very unlikely warnings to reduce the total number of warnings
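The detection rule is a join over the Access, Own and Descendant relations followed by a negation. A minimal Python sketch of that join is shown here; `find_warnings` is a hypothetical name, the facts are written by hand for the foo/make example, and no heuristics are applied.

```python
def find_warnings(access, own, descendant):
    """Return sorted (P, Q) pairs that violate the conditional correlation.

    access:     set of (P, Q) pairs, object P may point to object Q
    own:        set of (region, object) pairs
    descendant: set of (des, ans) pairs, reflexive and transitive
    """
    warnings = set()
    for p, q in access:
        for region1, obj1 in own:
            if obj1 != p:
                continue
            for region2, obj2 in own:
                if obj2 == q and (region1, region2) not in descendant:
                    warnings.add((p, q))
    return sorted(warnings)

# Facts for foo/make: iterator lives in parent, table lives in sub.
own = {("parent", "iterator"), ("sub", "table")}
descendant = {("parent", "parent"), ("sub", "sub"), ("sub", "parent")}

print(find_warnings({("iterator", "table")}, own, descendant))
# [('iterator', 'table')]
print(find_warnings({("table", "iterator")}, own, descendant))
# []
```

Pointing from the longer-lived region into the shorter-lived one is flagged; the opposite direction is consistent because `sub` is a descendant of `parent`.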
Experiment
• Conducted on Intel Xeon 2.0GHz / 32G RAM
• Latest stable versions of the applications tested
– RCC: RC compiler
– Apache 2.2.6: HTTP web server & utilities
– freeswitch 1.0b1: telephony platform shell
– jxta-c 2.5.2: P2P framework shell
– lklftpd: FTP server
– SVN (subversion) 1.4.5: version control system
Bugs found
             KLOC   #exe   #Warnings   #Dangling Pointers
RCC            37      1           1                    1
Apache         42      9           1                    0
freeswitch    109      1           0                    0
jxta-c        114      1           0                    0
lklftpd         5      1           2                    2
svn           240      9          21                    9
Total         N/A    N/A          25                   12
• Slide annotation: use APR; ~200 KLOC not counted in
Time consumption & relation sizes
               time        #subregion   #ownership    #access
RCC            19m21s               9        1,577        746,940
Apache httpd   34m04s              18        2,341          2,273
freeswitch     14m55s              46        3,065          2,499
jxta-c         58m24s              16           27             10
lklftpd        2m34s                6          622            565
svn            25h59m53s      542,402  897,921,834    280,671,987
• All applications except svn: < 1h, acceptable
• svn: unacceptable!
• They blow up because of context cloning!
• Current context: full call-path
• Future work: better context definition
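Why full call-path contexts can blow up is easy to see on a synthetic call graph. The Python sketch below counts acyclic call paths, one context per path; the layered graph is entirely hypothetical (each function calls the next one from two call sites) and says nothing about the actual shape of svn's call graph.

```python
def count_call_paths(calls, root):
    """Count acyclic call paths starting at root (one context per path)."""
    def visit(func, on_path):
        total = 1                        # the path that ends at func
        for callee in calls.get(func, []):
            if callee not in on_path:    # assumption: ignore recursion
                total += visit(callee, on_path | {callee})
        return total
    return visit(root, frozenset([root]))

def layered_graph(depth):
    """Hypothetical graph: f0 calls f1 at two call sites, f1 calls f2 twice, ..."""
    return {f"f{i}": [f"f{i + 1}", f"f{i + 1}"] for i in range(depth)}

for depth in (5, 10, 15):
    print(depth, count_call_paths(layered_graph(depth), "f0"))
```

In this graph the number of contexts is 2^(depth+1) - 1, so it roughly doubles with every extra layer of calls. Every relation that is indexed by context grows with it, which is one reason a coarser context definition is attractive.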
Case study – 1 (svn)
• Region structure should be consistent with program logic
– Iterator vs. Hash table
– Request vs. Connection
• RegionWiz can effectively find this kind of bug
[Figure: iterator (in parent) holds a dangling pointer to table (in sub)]
Case study – 2 (rcc)
• r1 and r2 are totally independent
• Destroying r2 earlier will cause a dangling pointer
• When using an immutable string, it is better to make a private copy in the pointer’s own region
[Figure: regions r1 and r2, with objects config, name, and copy of name]
Case study – 3 (svn)
• Temporary unsafe usage
• Often involves branches
• Path-sensitivity
• Dangerous as code evolves
• Re-organize code to avoid even temporary unsafe usage
svn_do_open(……)
{
    lock = region_alloc(parent);
    if (Condition) {
        hash = region_alloc(sub);
        lock.f = hash;
    }
    ……
    if (Condition) {
        lock.f = NULL;
    }
    region_destroy(sub);
}
Related work
• Language support for regions
– Reap [OOPSLA '2002], Cyclone [PLDI '2002], RC [PLDI '2001], Ownership types [PLDI '2003]
• Correlation Analysis
– Locksmith [PLDI '2006], Chord [PLDI '2006, POPL '2007]
• Context-sensitive Analysis
– bddbddb [PLDI '2004, PODS '2005]
Conclusion
• Using memory regions safely is not trivial
• RegionWiz can detect dangling pointers between regions through static conditional correlation analysis
• RegionWiz efficiently finds real bugs in real applications & improves the safety of region-based memory management
• We believe the correlation analysis framework can solve other problems
Thank you!
Q/A
Heuristics
• For Warning(Pointer, Pointee)
– Pointer’s type mismatches pointee’s type: less likely
– Pointer & pointee never allocated from the same region under some context: more likely
• Examined 205 lower-ranked warnings; all but one are really false alarms
Limitations
• Function pointers: standard inter-procedural propagation of function pointer values, but values are only propagated through variables & parameters, not through heap objects
• Pointer arithmetic (variable as array index) is not supported and is simply ignored
• Limited thread support; no support for asynchronous events, callbacks, etc.
• Heuristics bring unsoundness