380C lecture 19
• Where we are & where we are going
  – Managed languages
    • Dynamic compilation
    • Inlining
    • Garbage collection
      – Opportunity to improve data locality on-the-fly
      – Other opportunities?
  – Why you need to care about workloads
  – Alias analysis
  – Dependence analysis
  – Loop transformations
  – EDGE architectures
CS380C Lecture 19
Garbage Collection Advantage: Improving Program Locality
Xianglong Huang (UT), Stephen M Blackburn (ANU), Kathryn S McKinley (UT), J Eliot B Moss (UMass), Zhenlin Wang (MTU), Perry Cheng (IBM)
Today: Advanced Topics
• Generational Garbage Collection
• Copying objects is an opportunity
• Xianglong Huang (UT), Stephen M Blackburn (ANU), Kathryn S McKinley (UT), J Eliot B Moss (UMass), Zhenlin Wang (MTU), Perry Cheng (IBM), “The Garbage Collection Advantage: Improving Program Locality,” OOPSLA 2004.
Motivation
• Memory gap problem
• OO programs are becoming more popular
• OO programs exacerbate the memory gap problem
  – Automatic memory management
  – Pointer data structures
  – Many small methods
Goal: improve OO program locality
Allocation Mechanisms
• Bump-Pointer
  – Fast (increment & bounds check)
  – Contemporaneous object locality
  – Can't incrementally free & reuse: must free en masse
• Free-List
  – Slightly slower (consult list for fit)
  – Mystery locality
  – Can incrementally free & reuse cells
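The contrast between the two allocation mechanisms can be sketched in a few lines. This is a minimal illustration, not how any particular VM implements allocation: the class names, the first-fit policy, and the 64-byte region are all assumptions made for the example.

```python
class BumpPointerAllocator:
    """Allocates by advancing a cursor; frees only by resetting the whole region."""

    def __init__(self, size):
        self.size = size
        self.cursor = 0

    def alloc(self, nbytes):
        # Bounds check, then increment: that is the whole fast path.
        if self.cursor + nbytes > self.size:
            return None  # region exhausted; a real VM would trigger a collection here
        addr = self.cursor
        self.cursor += nbytes
        return addr

    def free_all(self):
        # No per-object free: the region is released en masse.
        self.cursor = 0


class FreeListAllocator:
    """First-fit free list (an assumed policy); cells can be freed one at a time."""

    def __init__(self, size):
        self.free = [(0, size)]  # list of (address, length) holes

    def alloc(self, nbytes):
        # Consult the list for the first hole that fits.
        for i, (addr, length) in enumerate(self.free):
            if length >= nbytes:
                if length == nbytes:
                    del self.free[i]
                else:
                    self.free[i] = (addr + nbytes, length - nbytes)
                return addr
        return None

    def free_cell(self, addr, nbytes):
        # Return one cell to the list; this sketch does not coalesce neighbours.
        self.free.insert(0, (addr, nbytes))


bump = BumpPointerAllocator(64)
bump_addrs = [bump.alloc(16) for _ in range(3)]  # consecutive addresses

fl = FreeListAllocator(64)
a, b, c = fl.alloc(16), fl.alloc(16), fl.alloc(16)
fl.free_cell(a, 16)
d = fl.alloc(16)  # reuses the freed cell, so it is not adjacent to c
```

Objects allocated together by the bump pointer end up next to each other, while the free-list object `d` lands wherever a hole happens to be, which is one way to read "mystery locality".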
Copying Generational GC
State-of-the-art throughput
• Requirements
  – Write barrier to track inter-generation pointers
    • remsets, cards
  – Copy reserve
• Advantages:
  – Minimizes copying of older objects
  – Compaction of long-lived objects
• Problems:
  – Not very incremental
  – Very youngest objects always copied
  – What order should GC use to copy objects?
  – etc. etc. …
[Figure: heap divided into a ‘nursery’ and an ‘older generation’]
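The write-barrier requirement can be illustrated with a small sketch. It assumes a remembered set (remset) that records slots in the older generation pointing into the nursery; the `Obj` class and the function names are hypothetical, and card marking is not shown.

```python
class Obj:
    """Hypothetical heap object that knows which generation it lives in."""

    def __init__(self, name, in_nursery):
        self.name = name
        self.in_nursery = in_nursery
        self.fields = {}


# Slots (source object, field name) in the older generation that point into the nursery.
remset = set()


def write_field(src, field, target):
    """Store a reference, running the write barrier first."""
    # Barrier: only old-to-young pointers need to be remembered.
    if target is not None and not src.in_nursery and target.in_nursery:
        remset.add((src, field))
    src.fields[field] = target


def nursery_roots(stack_roots):
    """Roots for a nursery-only collection: young stack roots plus remset targets."""
    roots = [r for r in stack_roots if r.in_nursery]
    for src, field in remset:
        target = src.fields.get(field)
        if target is not None and target.in_nursery:
            roots.append(target)
    return roots


old = Obj("old", in_nursery=False)
young = Obj("young", in_nursery=True)
other = Obj("other", in_nursery=True)
write_field(old, "next", young)    # old -> young: recorded in the remset
write_field(young, "peer", other)  # young -> young: not recorded
roots = nursery_roots(stack_roots=[])
```

With the remset in hand, a nursery collection can find `young` without scanning the older generation.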
Opportunity
• Generational copying garbage collector reorders objects at runtime
Copying of Linked Objects
[Figure: a graph of seven linked objects, numbered 1 to 7, and the order in which they are laid out when copied under three traversals: Breadth First, Depth First, and Online Object Reordering]
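The breadth-first and depth-first copy orders can be compared on a small example. The object graph below is hypothetical (the exact edges of the slide's figure are not assumed), and the functions model only the order in which objects would be placed in to-space, not the copying itself.

```python
from collections import deque

# Hypothetical object graph: object -> referents, in field order.
GRAPH = {1: [2, 3, 4], 2: [5, 6], 3: [], 4: [7], 5: [], 6: [], 7: []}


def breadth_first_copy(graph, root):
    """Cheney-style order: a FIFO queue of copied-but-unscanned objects."""
    order, seen, queue = [], {root}, deque([root])
    while queue:
        obj = queue.popleft()
        order.append(obj)
        for child in graph[obj]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return order


def depth_first_copy(graph, root):
    """Depth-first order: follow each reference as soon as it is found."""
    order, seen = [], set()

    def visit(obj):
        if obj in seen:
            return
        seen.add(obj)
        order.append(obj)
        for child in graph[obj]:
            visit(child)

    visit(root)
    return order


bfs = breadth_first_copy(GRAPH, 1)
dfs = depth_first_copy(GRAPH, 1)
```

Neither order looks at how the program actually uses the objects: breadth-first separates object 4 from its referent 7, and depth-first separates object 1 from its referent 4.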
Outline
• Motivation
• Online Object Reordering (OOR)
• Methodology
• Experimental Results
• Conclusion
Cache Performance Matters
[Figure: bar chart for _213_javac of total cycles (in billions) under three cache configurations: 8K DL1, 8K IL1, 128K L2; perfect L2; perfect IL1 and perfect DL1]
Online Object Reordering
• Where are the cache misses?
• How to identify hot field accesses at runtime?
• How to reorder the objects?
Where Are The Cache Misses?
• Heap structure:
[Figure: heap regions labeled VM Objects, Stack, Older Generation, and Nursery; not to scale]
Where Are The Cache Misses?
[Figure: bar chart for _209_db of total accesses (in millions), split into L2 hits and L2 misses, for VM Objects, Stack, Older Gen, and Nursery]
Where Are The Cache Misses?
• Two opportunities to reorder objects in the older generation
  – Promote nursery objects
  – Full heap collection
How to Find Hot Fields?
• Runtime info (intercept every read)?
• Compiler analysis?
• Runtime information + compiler analysis
Key: Low overhead estimation
Which Classes Need Reordering?
Step 1: Compiler analysis
– Excludes cold basic blocks
– Identifies field accesses
Step 2: JIT adaptive sampling identifies hot methods
– Marks field accesses in hot methods as hot
Key: Low overhead estimation
Example: Compiler Analysis
Method Foo {
  Class A a;
  try {
    …=a.b;
    …
  } catch(Exception e){
    …a.c
  }
}
Compiler:
– Hot BB: collect access info
– Cold BB: ignore
Access List:
1. A.b
2. …
…
Example: Adaptive Sampling
Method Foo {
  Class A a;
  try {
    …=a.b;
    …
  } catch(Exception e){
    …a.c
  }
}
Adaptive Sampling: Foo is hot
Foo Accesses:
1. A.b
2. …
…
A.b is hot
[Figure: objects A and B with fields b and c, alongside A's type information]
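The two-step estimation can be sketched as follows. This is only an illustration under stated assumptions: the `METHOD_BLOCKS` table stands in for the compiler's output, the method `Bar`, class `B`, field `x`, the sample counts, and the threshold are all hypothetical, and no claim is made that the real system uses a simple count threshold.

```python
# Hypothetical compiler output: per method, field accesses grouped by basic block,
# each block already classified as hot or cold by static analysis.
METHOD_BLOCKS = {
    "Foo": [
        {"cold": False, "accesses": [("A", "b")]},  # try body
        {"cold": True, "accesses": [("A", "c")]},   # exception handler
    ],
    "Bar": [
        {"cold": False, "accesses": [("B", "x")]},
    ],
}


def access_list(method):
    """Step 1: keep field accesses from non-cold basic blocks only."""
    return [
        access
        for block in METHOD_BLOCKS[method]
        if not block["cold"]
        for access in block["accesses"]
    ]


def hot_fields(samples, threshold):
    """Step 2: methods sampled at least `threshold` times are hot;
    every access in a hot method's access list marks that field as hot."""
    hot = {}
    for method, count in samples.items():
        if count >= threshold:
            for cls, field in access_list(method):
                hot.setdefault(cls, set()).add(field)
    return hot


# Assumed sample counts from a timer-based sampler.
result = hot_fields({"Foo": 12, "Bar": 1}, threshold=5)
```

The cost stays low because nothing is recorded per field read: the access lists are built once at compile time, and the sampler only has to say which methods are hot.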
Copying of Linked Objects
[Figure: the seven linked objects copied by Online Object Reordering, which consults type information; the destination is divided into a hot space and a cold space]
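One way to picture a copy order guided by hot fields is a traversal that follows hot references before cold ones. The sketch below is an assumption-laden simplification, not the collector's actual algorithm: the object graph, the field names, and the two-queue scheme are hypothetical, and the hot space and cold space shown on the slide are not modeled.

```python
from collections import deque

# Hypothetical heap: object -> list of (field name, referent).
HEAP = {
    1: [("a", 2), ("b", 3), ("c", 4)],
    2: [],
    3: [],
    4: [("d", 7)],
    7: [],
}
# Fields assumed to have been marked hot in the type information.
HOT = {"c", "d"}


def hot_first_copy_order(heap, hot, root):
    """Copy referents of hot fields before any referent of a cold field."""
    order, seen = [], {root}
    hot_queue, cold_queue = deque([root]), deque()
    while hot_queue or cold_queue:
        # Drain hot work first; fall back to cold work only when none is left.
        obj = hot_queue.popleft() if hot_queue else cold_queue.popleft()
        order.append(obj)
        for field, child in heap[obj]:
            if child in seen:
                continue
            seen.add(child)
            (hot_queue if field in hot else cold_queue).append(child)
    return order


order = hot_first_copy_order(HEAP, HOT, 1)
```

In this example objects 1, 4, and 7 end up adjacent because the program is assumed to reach 4 and 7 through hot fields, while 2 and 3 are placed afterwards.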
OOR System Overview
[Figure: system diagram. Source Code is compiled by the Baseline Compiler into Executing Code. Adaptive Sampling identifies Hot Methods for the Optimizing Compiler, which registers hot field accesses by adding entries to the Access Info Database. The GC looks up this database as advice when it copies objects, so that copying improves locality rather than merely affecting it. Legend: OOR addition, Jikes RVM component, Input/Output]
Outline
• Motivation
• Online Object Reordering
• Methodology
• Experimental Results
• Conclusion
Methodology: Virtual Machine
• Jikes RVM
  – VM written in Java
  – High performance
  – Timer-based adaptive sampling
  – Dynamic optimization
• Experiment setup
  – Pseudo-adaptive
  – 2nd iteration [Eeckhout et al.]
Methodology: Memory Management
• Memory Management Toolkit (MMTk):
  – Allocators and garbage collectors
  – Multi-space heap
    • Boot image
    • Large object space (LOS)
    • Immortal space
• Experiment setup
  – Generational copying GC with 4 MB bounded nursery
Overhead: OOR Analysis Only
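| Benchmark | Base execution time (sec) | With only OOR analysis (sec) | Overhead |
|-----------|---------------------------|------------------------------|----------|
| jess      | 4.39                      | 4.43                         | 0.84%    |
| jack      | 5.79                      | 5.82                         | 0.57%    |
| raytrace  | 4.63                      | 4.61                         | -0.59%   |
| mtrt      | 4.95                      | 4.99                         | 0.70%    |
| javac     | 12.83                     | 12.70                        | -1.05%   |
| compress  | 8.56                      | 8.54                         | 0.20%    |
| pseudojbb | 13.39                     | 13.43                        | 0.36%    |
| db        | 18.88                     | 18.88                        | -0.03%   |
| antlr     | 0.94                      | 0.91                         | -2.90%   |
| hsqldb    | 160.56                    | 158.46                       | -1.30%   |
| ipsixql   | 41.62                     | 42.43                        | 1.93%    |
| jython    | 37.71                     | 37.16                        | -1.44%   |
| ps-fun    | 129.24                    | 128.04                       | -1.03%   |
| Mean      |                           |                              | -0.19%   |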
Detailed Experiments
• Separate application and GC time
• Vary thresholds for method heat
• Vary thresholds for cold basic blocks
• Three architectures
  – x86, AMD, PowerPC
• x86 performance counters:
  – DL1, trace cache, L2, DTLB, ITLB
Performance javac
Performance db
Performance jython
Any static ordering leaves you vulnerable to pathological cases.
Phase Changes
Related Work
• Evaluate static orderings [Wilson et al.]
  – Large performance variation
• Static profiling [Chilimbi et al., and others]
  – Lack of flexibility
• Instance-based object reordering [Chilimbi et al.]
  – Too expensive
Conclusion
• Static traversal orders have up to 25% variation
• OOR improves or matches best static ordering
• OOR has very low overhead
• Past predicts future
380C
• Where we are & where we are going
  – Managed languages
    • Dynamic compilation
    • Inlining
    • Garbage collection
  – Why you need to care about workloads & methodology
    • Read: Blackburn et al., Wake Up and Smell the Coffee: Evaluation Methodology for the 21st Century, Communications of the ACM, 51(8): 83–89, August 2008.
  – Alias analysis
  – Dependence analysis
  – Loop transformations
  – EDGE architectures