Java Heap ResizingFrom Hacked-up Heuristics to Mathematical Models
Jeremy Singer and David R. White
Fri 11 Nov 2011
Outline
Background
Microeconomic Theory
Heap Sizing as a Control Problem
Summary
Outline
Background
Microeconomic Theory
Heap Sizing as a Control Problem
Summary
Cloud
Datacentres
Key Question
To be economical, can we over-subscribe resources?
i.e. particularly dynamic memory consumption
Key Question
To be economical, can we over-subscribe resources?
i.e. particularly dynamic memory consumption
Current Jikes Policy: Resize Matrix
Heap Occupancy
GC
Ove
rhea
d
0.00 0.10 0.30 0.60 0.80 1.00
0.00 0.90 0.90 0.95 1.00 1.00 1.000.01 0.90 0.90 0.95 1.00 1.00 1.000.02 0.95 0.95 1.00 1.00 1.00 1.000.07 1.00 1.00 1.10 1.15 1.20 1.200.15 1.00 1.00 1.20 1.25 1.35 1.300.40 1.00 1.00 1.25 1.30 1.50 1.501.00 1.00 1.00 1.25 1.30 1.50 1.50
Look-up table for heap-resize coefficient.
Design Decisions
(Private Communication)
“. . . back in 2003 [anon] and I did some experimental tuningand came up with the numbers by eyeballing things.At the time, it seemed to be somewhat stable and makingreasonable decisions but that was also about 4 major versionsago and I don’t think anyone has really looked at it seriouslysince then. I think there was some amount of sensitivity to thevalues. . . ”
Outline
Background
Microeconomic Theory
Heap Sizing as a Control Problem
Summary
Demand Curve
Price
(p)
Quantity demanded (q)
Allocation Curve
Heap
size
Number of GCs
max livesize
total
allocation
Empirical Data
0
50
100
150
200
250
300
350
400
0 2 4 6 8 10 12 14 16 18-4
-3
-2
-1
0
1
2
3
4
heap
siz
e (M
B)
elas
ticity
number of GCs
allocation curveelasticity
Demand Elasticity
• E measures sensitivity of the quantity q demanded to changes inprice p.
E =% change in quantity
% change in price=
dq
dp
p
q(1)
Allocation Elasticity
• E measures sensitivity of number of GCs g demanded to changesin heap size h.
E =% change in num GCs
% change in heap size=
dg
dh
h
g(2)
Control heap size with elasticity parameter
1050
1100
1150
1200
1250
1300
1350
0.1 1 10
exec
utio
n tim
e (m
s)
elasticity
growth ratio 1.1growth ratio 1.3growth ratio 1.5
default policy
Control heap size with elasticity parameter
0
50
100
150
200
250
300
350
400
450
0.1 1 10
final
hea
p si
ze (
MB
)
elasticity
growth ratio 1.1growth ratio 1.3growth ratio 1.5
default policy
Problems with elasticity-based heap growth
• unintuitive parameter for users
• difficult to bound heap growth
• . . . problems with paging!
Problems with elasticity-based heap growth
• unintuitive parameter for users
• difficult to bound heap growth
• . . . problems with paging!
Relative costs of memory accesses
We must avoid paging at all costs!
0 0 0 0 0 00 0 0 0 0 00 0 0 0 0 0
1 1 1 1 1 11 1 1 1 1 11 1 1 1 1 1
21
2 2 2 2 25 9 13 17 21
L1 CACHESRAM
LAST LEVEL CACHEEDRAM DRAM FLASHPCM HARD DRIVE
MAIN MEMORY SYSTEM
Typical Access Latency (in terms of processor cycles for a 4 GHz processor)
HIGH PERFORMANCE DISK SYSTEM
2 2 2 23 7 11 15
219
223
Scalable High Performance Main Memory System UsingPhase-Change Memory Technology.Qureshi et al.
Finding the Sweet Spot
If Goldilocks did heap sizes. . .
Heap Size
ExecutionTime
"Sweet Spot"
Un
ab
le t
o c
om
ple
te
Excessive Paging
HighGC
Experiment
• Linux
• Single-user mode.
• 300MB System RAM (via Kernel Parameter)
Used the Jikes RVM and the Da Capo Benchmarks.
Plot Heap Size versus Execution Time.
Results: Heap Size versus Execution Time
20 40 60 80 100 120 140 160 180 200
7000
8000
9000
1000
011
000
1200
0
lusearch Execution Time
Heap Size (MB)
Exe
cutio
n Ti
me
(ms)
Results 2: Heap Size versus Execution Time
20 40 60 80 100 120 140 160 180 200
6000
8000
1000
012
000
1400
016
000
pmd Execution Time
Heap Size (MB)
Exe
cutio
n Ti
me
(ms)
Results 3: Heap Size versus Execution Time
20 40 60 80 100 120 140 160 180 200
2000
2500
3000
3500
4000
antlr Execution Time
Heap Size (MB)
Exe
cutio
n Ti
me
(ms)
Key Question
How large should an application heap be?
Proposal?
Yet
Another
Heuristic
Why Not? Strong Foundations
Although the problem can be regarded as one of ’best effort’, thereare some properties we want our system to have:
• Guarantees about convergence.
• No pathological behaviours.
Therefore: tractability is important.
Problem Summary
Create a device that constantly pushes the virtual machinetowards the optimal heap size, in the presence of variation insoftware behaviour and disturbances from external factors suchas other applications and the operating system.
Outline
Background
Microeconomic Theory
Heap Sizing as a Control Problem
Summary
A Control System
11/11/2011 13:48
Page 1 of 1http://upload.wikimedia.org/wikipedia/commons/2/24/Feedback_loop_with_descriptions.svg
Example Control Systems
Example Control Systems
Viewing Heap Sizing as a Control Problem
Memory Allocated
Heap Size ResponseTime
Overshoot
Old Optimal Heap Size
New Optimal Heap Size
Issues with Applying Control Theory to Heap Sizing
1. Time is variable.
→ Use memory allocated as a proxy.
2. Control and Process Variables.→ Control variable is heap size (absolute, relative/ratio?)→ Process: GC Overhead (long/short-term), Occupancy,Mark-Cons ratio . . .
3. Specific to our systeme.g. exactly when do we allow heap-resizing to happen.
4. Characterising our system.→ Using existing models.→ Applying empirical analysis.
Issues with Applying Control Theory to Heap Sizing
1. Time is variable.→ Use memory allocated as a proxy.
2. Control and Process Variables.→ Control variable is heap size (absolute, relative/ratio?)→ Process: GC Overhead (long/short-term), Occupancy,Mark-Cons ratio . . .
3. Specific to our systeme.g. exactly when do we allow heap-resizing to happen.
4. Characterising our system.→ Using existing models.→ Applying empirical analysis.
Issues with Applying Control Theory to Heap Sizing
1. Time is variable.→ Use memory allocated as a proxy.
2. Control and Process Variables.
→ Control variable is heap size (absolute, relative/ratio?)→ Process: GC Overhead (long/short-term), Occupancy,Mark-Cons ratio . . .
3. Specific to our systeme.g. exactly when do we allow heap-resizing to happen.
4. Characterising our system.→ Using existing models.→ Applying empirical analysis.
Issues with Applying Control Theory to Heap Sizing
1. Time is variable.→ Use memory allocated as a proxy.
2. Control and Process Variables.→ Control variable is heap size (absolute, relative/ratio?)→ Process: GC Overhead (long/short-term), Occupancy,Mark-Cons ratio . . .
3. Specific to our systeme.g. exactly when do we allow heap-resizing to happen.
4. Characterising our system.→ Using existing models.→ Applying empirical analysis.
Issues with Applying Control Theory to Heap Sizing
1. Time is variable.→ Use memory allocated as a proxy.
2. Control and Process Variables.→ Control variable is heap size (absolute, relative/ratio?)→ Process: GC Overhead (long/short-term), Occupancy,Mark-Cons ratio . . .
3. Specific to our systeme.g. exactly when do we allow heap-resizing to happen.
4. Characterising our system.→ Using existing models.→ Applying empirical analysis.
Issues with Applying Control Theory to Heap Sizing
1. Time is variable.→ Use memory allocated as a proxy.
2. Control and Process Variables.→ Control variable is heap size (absolute, relative/ratio?)→ Process: GC Overhead (long/short-term), Occupancy,Mark-Cons ratio . . .
3. Specific to our systeme.g. exactly when do we allow heap-resizing to happen.
4. Characterising our system.
→ Using existing models.→ Applying empirical analysis.
Issues with Applying Control Theory to Heap Sizing
1. Time is variable.→ Use memory allocated as a proxy.
2. Control and Process Variables.→ Control variable is heap size (absolute, relative/ratio?)→ Process: GC Overhead (long/short-term), Occupancy,Mark-Cons ratio . . .
3. Specific to our systeme.g. exactly when do we allow heap-resizing to happen.
4. Characterising our system.→ Using existing models.→ Applying empirical analysis.
Using Existing Analytical Work
n =
{n∗ for M ≥ M∗
12(K +
√K 2 − 4)(n∗ + n0)− n0 for M < M∗
where K = 1 + M∗+M0
M+M0
n∗, M∗, M0, n0 are parameters dependent on the software andplatform.
A Page Fault Equation for Dynamic Heap Sizing. Tay and Zong.
Implementation so far: PID Controller
Control signal u(t) governed by the following equation:
u(t) = KPe(t) + Ki
∫e(t)dt + KD
d
dte(t) (3)
Model Tuning
For example, how do we decide upon the coefficients KP , Ki , KD?
Still an optimisation problem: but built upon a well-formulatedcontrol problem.
Outline
Background
Microeconomic Theory
Heap Sizing as a Control Problem
Summary
Quick Summary
• Heap sizing is an important factor in determining performance.
• Determining optimal heap size is a difficult, dynamic problem.
• Control theory gives us a way to approach it systematically.
Future Work
Revisiting analysis and determining the correct controller.
Optimisation schemes.
Empirical Evaluation.