Computer Systems Principles: Dynamic Memory Management
Emery Berger and Mark Corner
University of Massachusetts Amherst • Department of Computer Science

Dynamic Memory Management
How the heap manager is implemented
– malloc, free
– new, delete

Memory Management
Programs ask the memory manager
– to allocate/free objects (or multiple pages)
The memory manager asks the OS
– to allocate/free pages (or multiple pages)
[Figure: the User Program requests objects (new, malloc) from the Allocator (java, libc), which requests pages (mmap, brk) from the Operating System]
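
The lower layer of this picture, where the allocator obtains pages from the OS, can be sketched in a few lines. This is a minimal sketch assuming a POSIX system; `os_alloc_pages`, `os_free_pages`, and `round_up_to_pages` are hypothetical helper names, not part of libc.

```c
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round a byte count up to a whole number of pages. */
size_t round_up_to_pages(size_t bytes, size_t page_size) {
    assert(page_size > 0);
    return (bytes + page_size - 1) / page_size * page_size;
}

/* Hypothetical helper: ask the OS for enough zero-filled pages to hold
   `bytes`. Returns NULL on failure; *mapped_len gets the mapping size. */
void *os_alloc_pages(size_t bytes, size_t *mapped_len) {
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = round_up_to_pages(bytes, page_size);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return NULL;
    *mapped_len = len;
    return p;
}

/* Hand the pages back to the OS. Returns 0 on success. */
int os_free_pages(void *p, size_t mapped_len) {
    return munmap(p, mapped_len);
}
```

An allocator built on top of this would carve the returned pages into objects, so the program's many small requests turn into a few large requests to the OS.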

Memory Management
Ideal memory manager:
– Fast
• Raw time, asymptotic runtime, locality
– Memory efficient
• Low fragmentation
With multicore & multiprocessors:
– Scalable to multiple processors
New issues:
– Secure from attack
– Reliable in face of errors

Memory Manager Functions
Not just malloc/free
– realloc
• Change size of object, copying old contents
– ptr = realloc (ptr, 10);
• But: realloc(ptr, 0) = ?
• How about: realloc (NULL, 16) ?
Other fun
– calloc
– memalign
Needs ability to locate size & object start
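
The two realloc questions have different answers in standard C: `realloc(NULL, 16)` behaves like `malloc(16)`, while the result of `realloc(ptr, 0)` is implementation-defined in C11. The pattern `ptr = realloc(ptr, 10)` also loses the old block if the call fails. A minimal sketch of a wrapper that avoids both problems follows; `resize_buffer` is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper: resize *slot to new_size bytes.
   Returns 0 on success, -1 on failure (the old block stays valid). */
int resize_buffer(void **slot, size_t new_size) {
    assert(slot != NULL);
    if (new_size == 0) {            /* sidestep realloc(ptr, 0) entirely */
        free(*slot);
        *slot = NULL;
        return 0;
    }
    /* When *slot is NULL this call behaves like malloc(new_size). */
    void *p = realloc(*slot, new_size);
    if (p == NULL) return -1;       /* keep *slot so the old block is not leaked */
    *slot = p;
    return 0;
}
```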

Fragmentation
Intuitively, fragmentation stems from “breaking” up heap into unusable spaces
– More fragmentation = worse utilization
External fragmentation
– Wasted space outside allocated objects
Internal fragmentation
– Wasted space inside an object

Classical Algorithms
First-fit
– find first chunk of at least the desired size
Best-fit
– find chunk that fits best
• Minimizes wasted space
Worst-fit
– find chunk that fits worst
– name is a misnomer!
– keeps large holes around
Reclaim space: coalesce free adjacent objects into one big object
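
The three policies differ only in which sufficiently large chunk they pick. A minimal sketch, assuming the free chunks are given as a plain array of sizes (a real heap would walk a free list); `choose_chunk`, `allocate`, and `fit_policy` are illustrative names.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { FIRST_FIT, BEST_FIT, WORST_FIT } fit_policy;

/* Pick a free chunk for `request` bytes from sizes[0..n).
   Returns the chosen index, or -1 if no chunk is large enough. */
int choose_chunk(const size_t *sizes, size_t n, size_t request,
                 fit_policy policy) {
    assert(sizes != NULL || n == 0);
    int chosen = -1;
    for (size_t i = 0; i < n; i++) {
        if (sizes[i] < request) continue;          /* too small */
        if (chosen < 0) {
            chosen = (int)i;                       /* first chunk that fits */
            if (policy == FIRST_FIT) break;
        } else if (policy == BEST_FIT && sizes[i] < sizes[chosen]) {
            chosen = (int)i;                       /* tighter fit */
        } else if (policy == WORST_FIT && sizes[i] > sizes[chosen]) {
            chosen = (int)i;                       /* larger hole */
        }
    }
    return chosen;
}

/* Allocate by splitting: the chosen chunk shrinks by `request` bytes. */
int allocate(size_t *sizes, size_t n, size_t request, fit_policy policy) {
    int i = choose_chunk(sizes, n, request, policy);
    if (i >= 0) sizes[i] -= request;
    return i;
}
```

With made-up free chunks of 100, 500, 200, and 300 bytes, a 150-byte request goes to the 500-byte chunk under first-fit and worst-fit, and to the 200-byte chunk under best-fit, which leaves only a 50-byte remainder.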

Quick Activity
Program asks for: 300, 25, 25, 100
– First-fit and best-fit allocations go where?
– Which ones cannot be fulfilled?
What about: 110, 54, 25, 70, 50?

Implementation Techniques
Freelists
– Linked lists of objects in same size class
• Range of object sizes
First-fit, best-fit in this context?
– Which is faster?

Implementation Techniques
Segregated size classes
– Use free lists, but never coalesce or split
Choice of size classes
– Exact
– Powers-of-two
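
Power-of-two size classes make the class lookup trivial but round every request up, and that rounding is where internal fragmentation comes from. A minimal sketch, assuming the smallest class is 8 bytes; `MIN_OBJECT_SIZE` and the function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_OBJECT_SIZE 8u   /* assumed smallest size class */

/* Class i holds objects of MIN_OBJECT_SIZE << i bytes (8, 16, 32, ...). */
unsigned size_class_index(size_t request) {
    assert(request > 0 && request <= SIZE_MAX / 2);
    unsigned index = 0;
    size_t class_size = MIN_OBJECT_SIZE;
    while (class_size < request) {   /* double until the request fits */
        class_size <<= 1;
        index++;
    }
    return index;
}

/* Object size, in bytes, of a given class. */
size_t size_class_bytes(unsigned index) {
    return (size_t)MIN_OBJECT_SIZE << index;
}

/* Internal fragmentation: bytes wasted inside the allocated object. */
size_t internal_waste(size_t request) {
    return size_class_bytes(size_class_index(request)) - request;
}
```

A 17-byte request lands in the 32-byte class and wastes 15 bytes, while a 16-byte request wastes none. Exact size classes avoid this waste at the cost of many more free lists.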

Implementation Techniques
Big Bag of Pages (BiBOP)
– Page or pages (multiples of 4K)
– Usually segregated size classes
Header contains metadata
– Locate with bitmasking
Limits external fragmentation
Can be very fast
Secret Sauce for project
– Use free objects to track free objects
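
Both tricks on this slide fit in a short sketch: the page header is found by masking off the low bits of any object pointer, and the free list is threaded through the free objects themselves, so it costs no extra space. This assumes a single 4K page per size class and C11 `aligned_alloc`; the struct layout and function names are hypothetical, not the project's required design.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE 4096u

/* A free object stores the link to the next free object in its own bytes. */
typedef struct free_object { struct free_object *next; } free_object;

/* Metadata at the start of each page (hypothetical layout). */
typedef struct page_header {
    size_t object_size;       /* every object on this page has this size */
    free_object *free_list;   /* threaded through the free objects */
} page_header;

/* Locate the header of the page containing p by bitmasking. */
page_header *page_of(void *p) {
    return (page_header *)((uintptr_t)p & ~(uintptr_t)(PAGE_SIZE - 1));
}

/* Carve one page-aligned page into equal-sized objects. */
page_header *page_create(size_t object_size) {
    assert(object_size >= sizeof(free_object));
    assert(object_size % sizeof(void *) == 0);
    page_header *page = aligned_alloc(PAGE_SIZE, PAGE_SIZE);
    if (page == NULL) return NULL;
    page->object_size = object_size;
    page->free_list = NULL;
    char *base = (char *)page;
    for (size_t off = sizeof(page_header); off + object_size <= PAGE_SIZE;
         off += object_size) {
        free_object *obj = (free_object *)(base + off);
        obj->next = page->free_list;      /* push onto the free list */
        page->free_list = obj;
    }
    return page;
}

/* Pop a free object, or return NULL if the page is full. */
void *page_alloc(page_header *page) {
    free_object *obj = page->free_list;
    if (obj != NULL) page->free_list = obj->next;
    return obj;
}

/* Push the object back; the page is recovered from the pointer alone. */
void page_free(void *p) {
    page_header *page = page_of(p);
    free_object *obj = p;
    obj->next = page->free_list;
    page->free_list = obj;
}
```

Because `page_free` needs only the pointer, the size lookup that realloc and free depend on is a mask and one load.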

Runtime Analysis
Key components
– Cost of malloc (best, worst, average)
– Cost of free
– Cost of size lookup (for realloc & free)

Space Bounds
Fragmentation worst-case for “optimal”: O(log(M/m))
– M = largest object size
– m = smallest object size
Best-fit = O(M * m) !

Performance Issues
Goal: perform well for typical programs
– Considerations:
• Internal fragmentation
• External fragmentation
• Headers (metadata)
• Scalability (later)
• Reliability, too
“Canned” allocator often seen as slow

Custom Memory Allocation
Programmers replace new/delete
– Reduce runtime: often
– Expand functionality: sometimes
– Reduce space: rarely
Very common
– Apache, gcc, lcc, STL, database servers…
– Language-level support in C++
– Widely recommended: “Use custom allocators”

Drawbacks of Custom Allocators
Avoiding system allocator:
– More code to maintain & debug
– Can’t use memory debuggers
– Not modular or robust:
• Mix memory from custom and general-purpose allocators → crash!
Increased burden on programmers

(I) Per-Class Allocators
Recycle freed objects from a free list
a = new Class1;
b = new Class1;
c = new Class1;
delete a;
delete b;
delete c;
a = new Class1;
b = new Class1;
c = new Class1;
[Figure: Class1 free list holding the freed objects a, b, c]
+ Fast
+ Linked list operations
+ Simple
+ Identical semantics
+ C++ language support
- Possibly space-inefficient
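
The same recycling idea can be written in C as a per-type free list. This is a minimal sketch, not the C++ `operator new` mechanism the slide refers to; `Class1`'s fields and the names `class1_new` and `class1_delete` are stand-ins.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Class1 {
    union {
        struct Class1 *next_free;  /* valid only while on the free list */
        double payload[4];         /* stand-in for the object's real fields */
    } u;
} Class1;

static Class1 *class1_free_list = NULL;

/* Reuse a freed object if one is available; otherwise fall back to malloc. */
Class1 *class1_new(void) {
    if (class1_free_list != NULL) {
        Class1 *obj = class1_free_list;
        class1_free_list = obj->u.next_free;
        return obj;
    }
    return malloc(sizeof(Class1));
}

/* Push the object onto the free list instead of returning it to the heap. */
void class1_delete(Class1 *obj) {
    assert(obj != NULL);
    obj->u.next_free = class1_free_list;
    class1_free_list = obj;
}
```

Both operations are a few pointer moves, which is the source of the speed. The space cost is also visible: memory on the free list can only ever be reused for another `Class1`.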

(II) Custom Patterns
Tailor-made to fit allocation patterns
– Example: 197.parser (natural language parser)
char[MEMORY_LIMIT]
a = xalloc(8);
b = xalloc(16);
c = xalloc(8);
xfree(b);
xfree(c);
d = xalloc(8);
[Figure: array holding a, b, c, d, with the end_of_array pointer moving after each call]
+ Fast
+ Pointer-bumping allocation
- Brittle
- Fixed memory size
- Requires stack-like lifetimes
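
Pointer-bumping allocation over a fixed array can be sketched as below. This is a simplified stand-in, not the actual 197.parser code: here `xfree(p)` releases `p` together with everything allocated after it, which is only safe when lifetimes are stack-like. The 1024-byte limit and 8-byte alignment are assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define MEMORY_LIMIT 1024u   /* assumed size of the fixed array */
#define ALIGNMENT 8u         /* assumed allocation granularity */

static _Alignas(max_align_t) char memory[MEMORY_LIMIT];
static size_t end_of_array = 0;   /* offset of the first unused byte */

/* Allocate by bumping end_of_array; NULL once the array is exhausted. */
void *xalloc(size_t size) {
    if (size > MEMORY_LIMIT) return NULL;
    size_t rounded = (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    if (rounded > MEMORY_LIMIT - end_of_array) return NULL;  /* fixed size */
    void *p = &memory[end_of_array];
    end_of_array += rounded;
    return p;
}

/* Simplified release: rewinds end_of_array to p, which also discards
   every block allocated after p (stack-like lifetimes required). */
void xfree(void *p) {
    size_t offset = (size_t)((char *)p - memory);
    assert(offset <= MEMORY_LIMIT);
    if (offset < end_of_array) end_of_array = offset;
}
```

Freeing in reverse order of allocation, `xfree(c)` then `xfree(b)`, lets the next `xalloc(8)` reuse the space where `b` was. Freeing `b` while `c` is still in use would silently invalidate `c`, which is the brittleness the slide warns about.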

(III) Regions
Increasingly popular custom allocator
Separate areas, deletion only en masse
regioncreate(r)
regionmalloc(r, sz)
regiondelete(r)
+ Fast
+ Pointer-bumping allocation
+ Deletion of chunks
+ Convenient
+ One call frees all memory
- Risky
- Dangling references
- Too much space
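
The three calls named on the slide can be sketched with a linked list of chunks and a bump pointer inside the newest chunk. This is a minimal sketch; the 4096-byte chunk size and the struct layout are assumptions, and objects larger than one chunk are simply refused.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_BYTES 4096u   /* assumed chunk payload size */

typedef struct chunk {
    struct chunk *next;
    size_t used;                                  /* bytes handed out so far */
    _Alignas(max_align_t) char data[CHUNK_BYTES];
} chunk;

typedef struct region { chunk *chunks; } region;

/* Start with an empty region. */
void regioncreate(region *r) {
    assert(r != NULL);
    r->chunks = NULL;
}

/* Pointer-bumping allocation inside the newest chunk. */
void *regionmalloc(region *r, size_t sz) {
    if (sz > CHUNK_BYTES) return NULL;            /* sketch: no huge objects */
    size_t align = _Alignof(max_align_t);
    size_t rounded = (sz + align - 1) / align * align;
    if (r->chunks == NULL || r->chunks->used + rounded > CHUNK_BYTES) {
        chunk *c = malloc(sizeof(chunk));         /* grow by one chunk */
        if (c == NULL) return NULL;
        c->next = r->chunks;
        c->used = 0;
        r->chunks = c;
    }
    void *p = r->chunks->data + r->chunks->used;
    r->chunks->used += rounded;
    return p;
}

/* One call frees every object in the region. */
void regiondelete(region *r) {
    for (chunk *c = r->chunks; c != NULL; ) {
        chunk *next = c->next;
        free(c);
        c = next;
    }
    r->chunks = NULL;
}
```

There is no per-object free, so a single long-lived object keeps the whole region alive ("too much space"), and any pointer kept past `regiondelete` dangles.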

Custom Allocators Are Faster…
[Figure: Runtime - Custom Allocator Benchmarks. Normalized runtime (axis 0 to 1.75) of Custom vs. Win32 allocators on 197.parser, boxed-sim, c-breeze, 175.vpr, 176.gcc, apache, lcc, and mudlle, grouped into non-regions and regions]
As good as and sometimes much faster than Win32

Not So Fast…
[Figure: Runtime - Custom Allocator Benchmarks. Normalized runtime (axis 0 to 1.75) of Custom, Win32, and DLmalloc allocators on 197.parser, boxed-sim, c-breeze, 175.vpr, 176.gcc, apache, lcc, and mudlle, grouped into non-regions and regions]
DLmalloc (Linux): as fast or faster for most benchmarks

Are custom allocators a win?
Generally not worth the trouble
– Just use good general-purpose allocator
• Alternative: reaps (hybrid of regions & heaps)
However…
– Sometimes worth it for specialized apps
• Especially pool allocation, as in Apache

Problems w/Unsafe Languages
C, C++: pervasive apps, but langs. unsafe
Numerous opportunities for security vulnerabilities, errors
– Double free
– Invalid free
– Uninitialized reads
– Dangling pointers
– Buffer overflows (stack & heap)
Can memory allocator help?

Soundness for Erroneous Progs
Normally: memory errors lead to crashes, but…consider infinite-heap allocator:
– All news fresh; ignore delete
• No dangling pointers, invalid frees, double frees
– Every object infinitely large
• No buffer overflows, data overwrites
Transparent to correct program
“Erroneous” programs sound

Probabilistic Memory Safety
Fully-randomized M-heap
– Approximates the infinite heap with a heap M times larger than needed, e.g., M=2
– Increases odds of benign errors
– Probabilistic memory safety
• i.e., P(no error) ≥ n
– Errors independent across heaps
• E(users with no error) ≥ n * |users|

DieHard
Key ideas:
– Isolate heap metadata
– Randomize allocation
– Trade space for robustness
– Replication (optional)
Key influence in design of Windows 7’s Fault-Tolerant Heap

Implementation Issues
Conventional, freelist-based heaps
– Hard to randomize, protect from errors
• Double frees, heap corruption
What about bitmaps? (one bit per word)
– Catastrophic fragmentation!
• Each small object likely to occupy one page
[Figure: small objects (obj) scattered across separate pages]

Randomized Heap Layout
Bitmap-based, segregated size classes
– Bit represents one object of given size
• i.e., in size class i, one bit = 2^(i+3) bytes, etc.
– Prevents fragmentation
[Figure: bitmap metadata (00000001 1010) kept separate from the heap]

Randomized Allocation
malloc(8):
– compute size class = ceil(log2 sz) – 3
– randomly probe bitmap for zero-bit (free)
Fast: expected runtime O(1)
– M=2 means E[# of probes] ≤ 2
[Figure: bitmap metadata changes from 00000001 1010 to 00010001 1010 as the chosen bit is set]
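
The two malloc steps can be sketched as follows. This is not DieHard's implementation: it assumes one 64-slot size class, stores one `bool` per object instead of packed bits, and uses `rand()` as a placeholder random source. Refusing allocations once the class is half full plays the role of M = 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SLOTS 64u   /* assumed number of objects in this size class */

typedef struct {
    bool allocated[SLOTS];   /* one flag per object; a real bitmap packs bits */
    size_t live;             /* number of flags currently set */
} size_class_map;

/* Size class = ceil(log2(sz)) - 3, clamped so class 0 holds up to 8 bytes. */
unsigned size_class(size_t sz) {
    assert(sz > 0 && sz <= SIZE_MAX / 2);
    unsigned cls = 0;
    for (size_t bytes = 8; bytes < sz; bytes <<= 1) cls++;
    return cls;
}

/* Probe random slots until a free one turns up. Because the class is kept
   at most half full, each probe succeeds with probability above 1/2.
   Returns the slot index, or -1 if the class is at its occupancy limit. */
int random_alloc(size_class_map *map) {
    if (map->live >= SLOTS / 2) return -1;
    for (;;) {
        unsigned slot = (unsigned)rand() % SLOTS;
        if (!map->allocated[slot]) {
            map->allocated[slot] = true;   /* set the bit */
            map->live++;
            return (int)slot;
        }
    }
}
```

The object's address would be the class's base address plus the slot index times the object size, so consecutive allocations end up at unrelated addresses.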

Randomized Deallocation
free(ptr):
– Ensure object valid: aligned to right address
– Ensure allocated: bit set
– Resets bit
Prevents invalid frees, double frees
[Figure: bitmap metadata changes from 00010001 1010 back to 00000001 1010 as the bit is reset]
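
The three checks in free(ptr) map directly onto code. A minimal sketch, assuming a single size class of 16-byte objects and one flag per object; `mini_heap` and `checked_free` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define OBJECT_SIZE 16u   /* assumed object size for this class */
#define SLOTS 64u         /* assumed number of objects */

typedef struct {
    _Alignas(16) unsigned char heap[SLOTS * OBJECT_SIZE];
    bool allocated[SLOTS];   /* metadata kept apart from the objects */
} mini_heap;

/* Returns true if ptr was a live object and is now free. Invalid and
   double frees return false and leave the heap untouched. */
bool checked_free(mini_heap *h, void *ptr) {
    assert(h != NULL);
    uintptr_t base = (uintptr_t)h->heap;
    uintptr_t addr = (uintptr_t)ptr;
    if (addr < base || addr >= base + sizeof h->heap) return false; /* not ours */
    size_t offset = (size_t)(addr - base);
    if (offset % OBJECT_SIZE != 0) return false;   /* not an object start */
    size_t slot = offset / OBJECT_SIZE;
    if (!h->allocated[slot]) return false;         /* already free */
    h->allocated[slot] = false;                    /* reset the bit */
    return true;
}
```

Since the flags live outside the objects, an overflow inside the heap cannot corrupt them the way it can corrupt a freelist pointer stored next to user data.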

Randomized Heaps & Reliability
Objects randomly spread across heap
Different run = different heap
– Errors across heaps independent
[Figure: two differently randomized layouts of the same numbered objects in size classes with object size = 2^(i+3) and 2^(i+4). My Mozilla: “malignant” overflow. Your Mozilla: “benign” overflow]

Increasing Reliability
Space Shuttle
– 3 copies of everything (hw & sw)
– Votes on every action
Failure: majority rules

DieHard - Replication
Replication-based fault-tolerance
– Requires randomization! Makes errors independent
[Figure: input is broadcast to three replicas (separate processes), each with its own random seed (replica1/seed1, replica2/seed2, replica3/seed3); the replicas execute and vote on the output]
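
The voting step can be sketched as a strict-majority check over the replicas' outputs. The majority rule here is an assumption modeled on the Space Shuttle slide, and `majority_vote` is a hypothetical function; the slides do not spell out DieHard's exact agreement rule.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the index of an output that a strict majority of the n replicas
   agree on, or -1 if no such output exists. */
int majority_vote(const char *const outputs[], size_t n) {
    assert(outputs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        size_t agree = 0;
        for (size_t j = 0; j < n; j++) {
            if (strcmp(outputs[i], outputs[j]) == 0) agree++;
        }
        if (agree * 2 > n) return (int)i;   /* more than half agree */
    }
    return -1;
}
```

Voting only helps if replicas fail differently, which is why each replica runs with its own seed: a bug that corrupts one randomized heap is likely to be benign in another.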

DieHard Results
Empirical results
– Runtime overhead
– Error avoidance
• Injected faults & actual applications
Analytical results (if time, pictures!)
– Buffer overflows
– Uninitialized reads
– Dangling pointer errors (the best)

Analytical Results: Buffer Overflows
Model overflow as random write of live data
Heap half full (max occupancy)

Analytical Results: Overflows
Replicas: Increase odds of avoiding overflow in at least one replica
P(Overflow in all replicas) = (½)^3 = 1/8
P(No overflow in ≥ 1 replica) = 1 - (½)^3 = 7/8
[Figure: replicas]

Empirical Results: Runtime

Analytical Results: Buffer Overflows
– F = free space
– H = heap size
– N = # objects worth of overflow
– k = replicas
Overflow one object
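
One formula consistent with these variables and with the three-replica example (heap half full, one object overflowed, 7/8) is P(overflow masked) = 1 - [1 - (F/H)^N]^k. It is stated here as an assumption: it treats the overflow as landing uniformly at random in the heap and the replicas as independent. The sketch below evaluates it; `p_overflow_masked` and `ipow` are illustrative names.

```c
#include <assert.h>

/* Integer power by repeated multiplication (avoids linking libm). */
static double ipow(double base, unsigned exp) {
    double result = 1.0;
    while (exp-- > 0) result *= base;
    return result;
}

/* Assumed model: an overflow of n_objects objects misses all live data in
   one replica with probability (F/H)^N, and it is masked if at least one
   of the k independent replicas is missed. */
double p_overflow_masked(double free_space, double heap_size,
                         unsigned n_objects, unsigned replicas) {
    assert(heap_size > 0.0);
    assert(free_space >= 0.0 && free_space <= heap_size);
    double miss_one_replica = ipow(free_space / heap_size, n_objects);
    return 1.0 - ipow(1.0 - miss_one_replica, replicas);
}
```

With F/H = 1/2, N = 1, and k = 3 this gives 1 - (1/2)^3 = 7/8, and a single replica gives 1/2.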

Error Avoidance
Injected faults:
– Dangling pointers (@50%, 10 allocations)
• glibc: crashes; DieHard: 9/10 correct
– Overflows (@1%, 4 bytes over)
• glibc: crashes 9/10, inf loop; DieHard: 10/10 correct
Real faults:
– Avoids Squid web cache overflow
• Crashes Boehm-Demers-Weiser (BDW) Collector & glibc
– Avoids dangling pointer error in Mozilla
• DoS in glibc & Windows

The End

Backup Slides

Lea Allocator (Dlmalloc 2.7.0)
Mature general-purpose allocator
Optimized for common allocation patterns
– Per-size quicklists ≈ per-class allocation
Deferred coalescing
– combining adjacent free objects
– Highly-optimized fastpath
Space-efficient