University of Massachusetts • Department of Computer Science
Reconsidering Custom Memory Allocation
Emery Berger, Ben Zorn, Kathryn McKinley
Custom Memory Allocation
Programmers replace new/delete, bypassing system allocator
  Reduce runtime – often
  Expand functionality – sometimes
  Reduce space – rarely
Very common practice
  Apache, gcc, lcc, STL, database servers…
Language-level support in C++
Widely recommended
  “Use custom allocators”
Drawbacks of Custom Allocators
Avoiding system allocator:
  More code to maintain & debug
  Can’t use memory debuggers
  Not modular or robust:
    Mix memory from custom and general-purpose allocators → crash!
Increased burden on programmers
Are custom allocators really a win?
Overview
Introduction
Perceived benefits and drawbacks
Three main kinds of custom allocators
Comparison with general-purpose allocators
Advantages and drawbacks of regions
Reaps – generalization of regions & heaps
(I) Per-Class Allocators
Recycle freed objects from a free list
a = new Class1;
b = new Class1;
c = new Class1;
delete a;
delete b;
delete c;
a = new Class1;
b = new Class1;
c = new Class1;
[Figure: the Class1 free list, holding objects a, b, and c]
+ Fast
  + Linked list operations
+ Simple
  + Identical semantics
  + C++ language support
- Possibly space-inefficient
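The free-list recycling above can be sketched with class-specific `operator new` and `operator delete`, the C++ language support the slide mentions. This is a minimal illustration, not code from the talk: the payload of `Class1`, the `free_count()` helper, and `per_class_demo()` are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for the slide's Class1; the payload is illustrative.
class Class1 {
public:
    // Allocation: pop a recycled slot from the free list when one exists,
    // otherwise fall back to the general-purpose allocator.
    static void* operator new(std::size_t size) {
        if (size == sizeof(Class1) && free_list_ != nullptr) {
            FreeNode* node = free_list_;
            free_list_ = node->next;
            --free_count_;
            return node;
        }
        return ::operator new(size);
    }

    // Deallocation: push the slot onto the free list instead of returning it
    // to the system. Cached slots are never released, which is the
    // "possibly space-inefficient" drawback.
    static void operator delete(void* p, std::size_t size) noexcept {
        static_assert(sizeof(Class1) >= sizeof(FreeNode), "slot must hold a link");
        if (p == nullptr) return;
        if (size != sizeof(Class1)) { ::operator delete(p); return; }
        FreeNode* node = static_cast<FreeNode*>(p);
        node->next = free_list_;
        free_list_ = node;
        ++free_count_;
    }

    // Illustrative helper: number of slots currently cached.
    static std::size_t free_count() { return free_count_; }

    double payload[2] = {0.0, 0.0};

private:
    struct FreeNode { FreeNode* next; };
    static FreeNode* free_list_;
    static std::size_t free_count_;
};

Class1::FreeNode* Class1::free_list_ = nullptr;
std::size_t Class1::free_count_ = 0;

// Mirrors the slide's sequence, starting from an empty free list.
inline void per_class_demo() {
    Class1* a = new Class1;
    Class1* b = new Class1;
    Class1* c = new Class1;
    delete a; delete b; delete c;
    assert(Class1::free_count() == 3);  // all three slots are cached
    a = new Class1; b = new Class1; c = new Class1;
    assert(Class1::free_count() == 0);  // all three slots were recycled
    delete a; delete b; delete c;
}
```

Both operations are a couple of pointer updates on a linked list, and callers keep writing plain `new` and `delete`, so the semantics are unchanged.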
(II) Custom Patterns
Tailor-made to fit allocation patterns
Example: 197.parser (natural language parser)
char[MEMORY_LIMIT]
a = xalloc(8);
b = xalloc(16);
c = xalloc(8);
xfree(b);
xfree(c);
d = xalloc(8);
[Figure: the array holding a, b, c, and then d, with the position of end_of_array shown after each call]
+ Fast
  + Pointer-bumping allocation
- Brittle
  - Fixed memory size
  - Requires stack-like lifetimes
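One way to read the `xalloc`/`xfree` example is as a stack over a fixed array: allocation bumps `end_of_array`, and freeing only reclaims space once the topmost object is free. The sketch below assumes that behavior. It is not the 197.parser source, and `StackArena`, `kMemoryLimit`, the per-object header, and `custom_pattern_demo()` are names invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stack-like allocator over one fixed array.
class StackArena {
public:
    static constexpr std::size_t kMemoryLimit = 1024;  // assumed capacity

    // Bump end_of_array past a small header plus the rounded-up request.
    void* xalloc(std::size_t size) {
        if (size > kMemoryLimit) return nullptr;
        const std::size_t need = sizeof(Header) + round_up(size);
        if (need > kMemoryLimit - end_) return nullptr;  // fixed memory size
        Header* h = new (memory_ + end_) Header{top_, false};
        top_ = end_;
        end_ += need;
        return h + 1;
    }

    // Mark the object free, then retreat end_of_array over every freed
    // object that is now on top of the stack.
    void xfree(void* p) {
        if (p == nullptr) return;
        (static_cast<Header*>(p) - 1)->freed = true;
        while (top_ != kNone) {
            Header* h = reinterpret_cast<Header*>(memory_ + top_);
            if (!h->freed) break;
            end_ = top_;
            top_ = h->prev;
        }
    }

    std::size_t used() const { return end_; }

private:
    static constexpr std::size_t kNone = static_cast<std::size_t>(-1);
    struct alignas(std::max_align_t) Header {
        std::size_t prev;  // offset of the previous object's header
        bool freed;
    };
    static std::size_t round_up(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        return (n + a - 1) / a * a;
    }
    alignas(std::max_align_t) unsigned char memory_[kMemoryLimit];
    std::size_t end_ = 0;      // plays the role of end_of_array
    std::size_t top_ = kNone;  // offset of the topmost header
};

// The slide's call sequence.
inline void custom_pattern_demo() {
    StackArena arena;
    void* a = arena.xalloc(8);
    void* b = arena.xalloc(16);
    void* c = arena.xalloc(8);
    arena.xfree(b);  // b is buried under c, so nothing is reclaimed yet
    arena.xfree(c);  // c is on top: end_of_array retreats past c and b
    void* d = arena.xalloc(8);
    assert(d == b);  // d lands where b used to be
    assert(a != nullptr && c != nullptr);
}
```

The brittleness is visible in the code: a request past `kMemoryLimit` fails outright, and an object freed out of stack order keeps its space until everything above it is freed.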
(III) Regions
Separate areas, deletion only en masse
regioncreate(r)
regionmalloc(r, sz)
regiondelete(r)
+ Fast
  + Pointer-bumping allocation
  + Deletion of chunks
+ Convenient
  + One call frees all memory
- Risky
  - Dangling references
  - Too much space
Increasingly popular custom allocator
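A region with this interface can be sketched as a list of chunks with a bump pointer in the newest one. The class below is an assumed illustration, not any particular library's implementation: the constructor plays the role of `regioncreate`, `allocate` of `regionmalloc`, and `release` of `regiondelete`, while the 4096-byte default chunk size and the request cap are arbitrary choices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical region: bump allocation inside chunks, freed only en masse.
class Region {
public:
    explicit Region(std::size_t chunk_size = 4096) : chunk_size_(chunk_size) {}
    Region(const Region&) = delete;
    Region& operator=(const Region&) = delete;
    ~Region() { release(); }

    // regionmalloc: bump a pointer in the newest chunk, adding a chunk if full.
    void* allocate(std::size_t size) {
        if (size > kMaxRequest) return nullptr;
        size = round_up(size);
        if (head_ == nullptr || size > head_->capacity - head_->used) {
            if (!add_chunk(size > chunk_size_ ? size : chunk_size_)) return nullptr;
        }
        unsigned char* p = payload(head_) + head_->used;
        head_->used += size;
        return p;
    }

    // regiondelete: one call frees every chunk, and with it every object.
    void release() {
        while (head_ != nullptr) {
            Chunk* next = head_->next;
            std::free(head_);
            head_ = next;
        }
        chunk_count_ = 0;
    }

    std::size_t chunk_count() const { return chunk_count_; }

private:
    static constexpr std::size_t kMaxRequest = std::size_t(1) << 30;  // arbitrary cap
    struct alignas(std::max_align_t) Chunk {
        Chunk* next;
        std::size_t capacity;
        std::size_t used;
    };
    static std::size_t round_up(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        return (n + a - 1) / a * a;
    }
    static unsigned char* payload(Chunk* c) {
        return reinterpret_cast<unsigned char*>(c + 1);
    }
    bool add_chunk(std::size_t capacity) {
        void* raw = std::malloc(sizeof(Chunk) + capacity);
        if (raw == nullptr) return false;
        Chunk* c = static_cast<Chunk*>(raw);
        c->next = head_;
        c->capacity = capacity;
        c->used = 0;
        head_ = c;
        ++chunk_count_;
        return true;
    }
    std::size_t chunk_size_;
    Chunk* head_ = nullptr;
    std::size_t chunk_count_ = 0;
};

inline void region_demo() {
    Region r;
    void* p = r.allocate(100);
    void* q = r.allocate(200);
    assert(p != nullptr && q != nullptr && p != q);
    r.release();  // p and q are dangling from here on
    assert(r.chunk_count() == 0);
}
```

There is deliberately no per-object free: space comes back only through `release()`, and any pointer kept past that call dangles.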
Custom Allocators Are Faster…
[Figure: Runtime - Custom Allocator Benchmarks. Y-axis: normalized runtime (0 to 1.75). Series: Custom, Win32. Benchmarks are grouped into non-regions and regions.]
As good as and sometimes much faster than Win32
Not So Fast…
[Figure: Runtime - Custom Allocator Benchmarks. Y-axis: normalized runtime (0 to 1.75). Series: Custom, Win32, DLmalloc. Benchmarks are grouped into non-regions and regions.]
DLmalloc: as fast or faster for most benchmarks
The Lea Allocator (DLmalloc 2.7.0)
Mature public-domain general-purpose allocator
Optimized for common allocation patterns
  Per-size quicklists ≈ per-class allocation
  Deferred coalescing (combining adjacent free objects)
  Highly-optimized fastpath
Space-efficient
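The idea that per-size quicklists approximate per-class allocation can be illustrated with an array of free lists indexed by size class. This is a simplified sketch and not DLmalloc's code: `QuickLists`, the 16-byte granule, the eight classes, and the requirement that callers pass the size back on deallocation are all assumptions made to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical size-class cache layered over malloc/free.
class QuickLists {
public:
    static constexpr std::size_t kGranule = 16;  // assumed size-class spacing
    static constexpr std::size_t kClasses = 8;   // caches requests up to 128 bytes

    QuickLists() = default;
    QuickLists(const QuickLists&) = delete;
    QuickLists& operator=(const QuickLists&) = delete;
    ~QuickLists() {
        for (std::size_t i = 0; i < kClasses; ++i) {
            while (lists_[i] != nullptr) {
                Node* n = lists_[i];
                lists_[i] = n->next;
                std::free(n);
            }
        }
    }

    // Fast path: pop a cached block of the right class. Slow path: malloc.
    void* allocate(std::size_t size) {
        const std::size_t cls = class_of(size);
        if (cls < kClasses && lists_[cls] != nullptr) {
            Node* n = lists_[cls];
            lists_[cls] = n->next;
            --cached_;
            return n;
        }
        return std::malloc(cls < kClasses ? (cls + 1) * kGranule : size);
    }

    // Small blocks are cached on their class's list; large ones are freed.
    // The size must match the one passed to allocate (an assumption of this
    // sketch; a real allocator records the size itself).
    void deallocate(void* p, std::size_t size) {
        if (p == nullptr) return;
        const std::size_t cls = class_of(size);
        if (cls >= kClasses) { std::free(p); return; }
        Node* n = static_cast<Node*>(p);
        n->next = lists_[cls];
        lists_[cls] = n;
        ++cached_;
    }

    std::size_t cached() const { return cached_; }

private:
    struct Node { Node* next; };
    static std::size_t class_of(std::size_t size) {
        return size == 0 ? 0 : (size - 1) / kGranule;
    }
    Node* lists_[kClasses] = {};
    std::size_t cached_ = 0;
};

inline void quicklist_demo() {
    QuickLists q;
    void* a = q.allocate(24);
    q.deallocate(a, 24);
    void* b = q.allocate(30);  // 24 and 30 share the 32-byte class
    assert(b == a);            // the cached block is reused
    q.deallocate(b, 30);
}
```

Every object whose size falls in the same class shares one list, so a general-purpose allocator gets the free-list fast path without a custom allocator per class.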
Space Consumption: Mixed Results
[Figure: Space - Custom Allocator Benchmarks. Y-axis: normalized space (0 to 1.75). X-axis: 197.parser, boxed-sim, c-breeze, 175.vpr, 176.gcc, apache, lcc, mudlle, grouped into non-regions and regions. Series: Custom, DLmalloc.]
Regions – Pros and Cons
+ Fast, convenient, etc.
+ Avoid resource leaks (e.g., Apache)
  Tear down memory for terminated connections
- No individual object deletion
  Unbounded memory consumption (producer-consumer, long-running computations, off-the-shelf programs)
  Apache: vulnerable to DoS, memory leaks
Reap Hybrid Allocator
Reap = region + heap
  Adds individual object deletion & heap
reapcreate(r)
reapmalloc(r, sz)
reapfree(r, p)
reapdelete(r)
+ Can reduce memory consumption
+ Fast
+ Adapts to use (region or heap style)
+ Cheap deletion
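The region-plus-heap combination can be sketched as a region whose objects carry a small header, so that an individually freed object can be put on a free list and reused. This is a simplified illustration under assumed details, not the Heap Layers implementation: the first-fit free list, the header layout, and the chunk size are choices made for the example. `allocate` stands in for `reapmalloc`, `deallocate` for `reapfree`, and `clear` for `reapdelete`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical reap: region-style bump allocation plus per-object free.
class Reap {
public:
    explicit Reap(std::size_t chunk_size = 4096) : chunk_size_(chunk_size) {}
    Reap(const Reap&) = delete;
    Reap& operator=(const Reap&) = delete;
    ~Reap() { clear(); }

    void* allocate(std::size_t size) {
        if (size > kMaxRequest) return nullptr;
        size = round_up(size);
        // Heap style: reuse the first freed object that is large enough.
        for (Header** link = &free_list_; *link != nullptr; link = &(*link)->next_free) {
            if ((*link)->size >= size) {
                Header* h = *link;
                *link = h->next_free;
                return h + 1;
            }
        }
        // Region style: bump a pointer through the newest chunk.
        const std::size_t need = sizeof(Header) + size;
        if (chunks_ == nullptr || need > chunks_->capacity - chunks_->used) {
            if (!add_chunk(need > chunk_size_ ? need : chunk_size_)) return nullptr;
        }
        Header* h = reinterpret_cast<Header*>(payload(chunks_) + chunks_->used);
        chunks_->used += need;
        h->size = size;
        h->next_free = nullptr;
        return h + 1;
    }

    // Individual deletion: push the object onto the reap's free list.
    void deallocate(void* p) {
        if (p == nullptr) return;
        Header* h = static_cast<Header*>(p) - 1;
        h->next_free = free_list_;
        free_list_ = h;
    }

    // Deletion en masse, exactly as for a region.
    void clear() {
        free_list_ = nullptr;
        while (chunks_ != nullptr) {
            Chunk* next = chunks_->next;
            std::free(chunks_);
            chunks_ = next;
        }
        reserved_ = 0;
    }

    std::size_t bytes_reserved() const { return reserved_; }

private:
    static constexpr std::size_t kMaxRequest = std::size_t(1) << 30;  // arbitrary cap
    struct alignas(std::max_align_t) Header { std::size_t size; Header* next_free; };
    struct alignas(std::max_align_t) Chunk { Chunk* next; std::size_t capacity; std::size_t used; };
    static std::size_t round_up(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        return (n + a - 1) / a * a;
    }
    static unsigned char* payload(Chunk* c) {
        return reinterpret_cast<unsigned char*>(c + 1);
    }
    bool add_chunk(std::size_t capacity) {
        Chunk* c = static_cast<Chunk*>(std::malloc(sizeof(Chunk) + capacity));
        if (c == nullptr) return false;
        c->next = chunks_;
        c->capacity = capacity;
        c->used = 0;
        chunks_ = c;
        reserved_ += capacity;
        return true;
    }
    std::size_t chunk_size_;
    Chunk* chunks_ = nullptr;
    Header* free_list_ = nullptr;
    std::size_t reserved_ = 0;
};

inline void reap_demo() {
    Reap r;
    void* a = r.allocate(64);
    r.deallocate(a);           // individual deletion
    void* b = r.allocate(64);
    assert(b == a);            // freed space is reused, not leaked
    r.clear();                 // region-style teardown
    assert(r.bytes_reserved() == 0);
}
```

A program that never calls `deallocate` gets plain region behavior, and one that frees as it goes keeps its footprint bounded. That is the sense in which a reap adapts to use.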
Reap Runtime
[Figure: Runtime - Custom Allocation Benchmarks. Y-axis: normalized runtime (0 to 1.75). X-axis: 197.parser, boxed-sim, c-breeze, 175.vpr, 176.gcc, apache, lcc, mudlle, grouped into non-regions and regions. Series: Custom, Win32, DLmalloc, Reap.]
Reap Space
[Figure: Space - Custom Allocator Benchmarks. Y-axis: normalized space (0 to 1.75). Series: Custom, DLmalloc, Reap. Benchmarks are grouped into non-regions and regions.]
Reap: Best of Both Worlds
Allows mixing of regions and new/delete
Case study: New Apache module “mod_bc”
  bc: C-based arbitrary-precision calculator
  Changed 20 lines out of 8000
  Benchmark: compute 1000th prime
    With Reap: 240K
    Without Reap: 7.4MB
Conclusions and Future Work
Empirical study of custom allocators
  Lea allocator often as fast or faster
  Non-region custom allocation ineffective
  Reap: region performance without drawbacks
Future work:
  Reduce space with per-page bitmaps
  Combine with scalable general-purpose allocator (e.g., Hoard)
Software
http://www.cs.umass.edu/~emery (Reap: part of Heap Layers distribution)
http://g.oswego.edu (DLmalloc 2.7.0)
If You Can Read This, I Went Too Far
Backup Slides
Experimental Methodology
Comparing to general-purpose allocators
  Same semantics: no problem
    E.g., disable per-class allocators
  Different semantics: use emulator
    Uses general-purpose allocator
    Adds bookkeeping to support region semantics
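The emulator idea can be sketched as follows: every object comes from the general-purpose allocator, and the only addition is a record of what the region owns, so that one call can still free everything. The slides do not show the emulator used in the study. `EmulatedRegion` and its `std::vector` bookkeeping are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical region emulator over the general-purpose allocator.
class EmulatedRegion {
public:
    EmulatedRegion() = default;
    EmulatedRegion(const EmulatedRegion&) = delete;
    EmulatedRegion& operator=(const EmulatedRegion&) = delete;
    ~EmulatedRegion() { release(); }

    // Each object comes straight from malloc; the region records it.
    void* allocate(std::size_t size) {
        void* p = std::malloc(size);
        if (p == nullptr) return nullptr;
        try {
            live_.push_back(p);  // bookkeeping that supports region deletion
        } catch (...) {
            std::free(p);
            return nullptr;
        }
        return p;
    }

    // Region semantics: one call frees every object allocated so far.
    void release() {
        for (void* p : live_) std::free(p);
        live_.clear();
    }

    std::size_t live_objects() const { return live_.size(); }

private:
    std::vector<void*> live_;
};

inline void emulator_demo() {
    EmulatedRegion r;
    void* a = r.allocate(16);
    void* b = r.allocate(32);
    assert(a != nullptr && b != nullptr);
    assert(r.live_objects() == 2);
    r.release();
    assert(r.live_objects() == 0);
}
```

Because the interface matches a region's, a region-based program can be relinked against the emulator and timed with a different underlying allocator.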
Why Did They Do That?
Recommended practice
Premature optimization
  Microbenchmarks vs. actual performance
Drift
  Not bottleneck anymore
Improved competition
  Modern allocators are better
Reaps as Regions: Runtime
[Figure: Runtime - Region-Based Benchmarks. Y-axis: normalized runtime (0 to 1.75). X-axis: lcc, mudlle. Series: Custom, Win32, DLmalloc, Reap.]
Reap performance nearly matches regions
Using Reap as Regions
[Figure: Runtime - Region-Based Benchmarks. Y-axis: normalized runtime (0 to 2.5). X-axis: lcc, mudlle. Series: Original, Win32, DLmalloc, WinHeap, Vmalloc, Reap. One bar runs off the scale and is labeled 4.08.]
Reap performance nearly matches regions
Drawbacks of Regions
Can’t reclaim memory within regions
  Bad for long-running computations, producer-consumer patterns, “malloc/free” programs
  Unbounded memory consumption
Current situation for Apache:
  Vulnerable to denial-of-service
  Limits runtime of connections
  Limits module programming
Use Custom Allocators?
Strongly recommended by practitioners
Little hard data on performance/space improvements
  Only one previous study [Zorn 1992]
    Focused on just one type of allocator
    Custom allocators: waste of time
      Small gains, bad allocators
Different allocators better? Trade-offs?
Kinds of Custom Allocators
Three basic types of custom allocators
  Per-class
    Fast
  Custom patterns
    Fast, but very special-purpose
  Regions
    Fast, possibly more space-efficient
    Convenient
    Variants: nested, obstacks
Optimization Opportunity
[Figure: Time Spent in Memory Operations. Y-axis: % of runtime (0 to 100). Series: Memory Operations, Other.]