©2011 Azul Systems, Inc. Finally! “Real” Java for low latency and low jitter Gil Tene, CTO & co-Founder, Azul Systems

Post on 23-Jun-2020

Page 1

©2011 Azul Systems, Inc.

Finally! “Real” Java for low latency and low jitter

Gil Tene, CTO & co-Founder, Azul Systems

Page 2

High level agenda

Java in a low latency application world

Why Stop-The-World is a problem (Duh?)

“Java” vs. actual, “real” Java

Some Garbage Collection terminology

Classifying current commercially available collectors

The C4 collector: What a solution to STW looks like...

Page 3

About me: Gil Tene

co-founder, CTO @Azul Systems

Have been working on “think different” GC approaches since 2002

Created Pauseless & C4 core GC algorithms (Tene, Wolf)

A long history building Virtual & Physical Machines, Operating Systems, Enterprise apps, etc...

* working on real-world trash compaction issues, circa 2004

Page 4

About Azul

We make scalable Virtual Machines

Have built “whatever it takes to get the job done” since 2002

3 generations of custom SMP Multi-core HW (Vega)

Now Pure software for commodity x86 (Zing)

“Industry firsts” in Garbage collection, elastic memory, Java virtualization, memory scale

Vega

C4

Page 5

Java in the low latency world

Page 6

Java in a low latency world

Why do people use Java for low latency apps?

Are they crazy?

No. There are good, easy to articulate reasons

Mostly around productivity and long term cost

Developer productivity. Leverage.

Time-to-product, Time-to-market, ...

Extras: leverage, ecosystem, ability to hire

Page 7

E.g. Customer answer to: “Why do you use Java in Algo Trading?”

Strategies have a shelf life

We have to keep developing and deploying new ones

Only one out of N is actually productive

Profitability therefore depends on ability to successfully deploy new strategies, and on the cost of doing so

Our developers seem to be able to produce 2x-3x as much when using a Java environment as they would with C++ ...

Page 8

So what is the problem? Is Java Slow?

No

A good programmer will get roughly the same speed from both Java and C++

A bad programmer won’t get you fast code on either

The 50%’ile and 90%’ile are typically excellent...

It’s those pesky occasional stutters and stammers that are the problem...

Ever hear of Garbage Collection?

Page 9

Java’s Achilles heel

Page 10

Garbage Collection: How bad is it?

Let’s ignore the bad multi-second pauses for now...

Low latency applications regularly experience “small”, “minor” GC events that range in the 10s of msec

Frequency directly related to allocation rate

In turn, directly related to throughput

So we have great 50%, 90%. Maybe even 99%

But 99.9%, 99.99%, Max, all “suck”

So bad that it affects risk, profitability, service expectations, etc.
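The percentile picture above can be made concrete with a minimal sketch. The class name `LatencyPercentiles` and the sample data are hypothetical, chosen only for illustration; they are not measurements from this talk. With 20 out of 10,000 responses hitting a 30 msec pause, the 50%, 90% and 99% figures stay at 1 msec while 99.9% and above jump to 30 msec.

```java
import java.util.Arrays;

final class LatencyPercentiles {
    /** Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it. */
    static long percentile(long[] samples, double pct) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Hypothetical data: 10,000 responses of 1 msec, 20 of which are stretched to 30 msec by a pause. */
    static long[] sampleLatenciesMsec() {
        long[] samples = new long[10_000];
        Arrays.fill(samples, 1L);
        for (int i = 0; i < 20; i++) {
            samples[i * 500] = 30L;
        }
        return samples;
    }

    public static void main(String[] args) {
        long[] s = sampleLatenciesMsec();
        for (double p : new double[] {50, 90, 99, 99.9, 99.99, 100}) {
            System.out.println(p + "%: " + percentile(s, p) + " msec");
        }
    }
}
```

Only 0.2% of the responses are affected here, which is why nothing below the 99.9% line shows the problem.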

Page 11

What do low latency developers do about it?

They use “Java” instead of Java

They write “in the Java syntax”

They avoid allocation as much as possible

E.g. They build their own object pools for everything

They write all the code they use (no 3rd party libraries)

They train developers for their local discipline

In short: They revert to many of the practices that hurt productivity. They lose out on much of Java.
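As an illustration of the "build their own object pools" practice, here is a minimal single-threaded sketch. `SimplePool` and its method names are hypothetical; a production pool would also need state reset and thread safety, which is part of the productivity cost described above.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

/** Hypothetical single-threaded object pool, shown only to illustrate allocation avoidance. */
final class SimplePool<T> {
    private final ArrayDeque<T> free = new ArrayDeque<>();
    private final Supplier<T> factory;

    SimplePool(Supplier<T> factory, int preallocate) {
        this.factory = factory;
        for (int i = 0; i < preallocate; i++) {
            free.push(factory.get());
        }
    }

    /** Reuses a pooled instance when one is available, allocating only when the pool is empty. */
    T acquire() {
        T obj = free.poll();
        return obj != null ? obj : factory.get();
    }

    /** The caller must not touch obj after releasing it: manual memory management in disguise. */
    void release(T obj) {
        free.push(obj);
    }

    int available() {
        return free.size();
    }
}
```

Every `acquire` needs a matching `release`, and a use-after-release bug is silent, so this style gives up much of what garbage collection offers.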

Page 12

Some GC Terminology

Page 13

A Basic Terminology example: What is a concurrent collector?

A Concurrent Collector performs garbage collection work concurrently with the application’s own execution

A Parallel Collector uses multiple CPUs to perform garbage collection

Page 14

Classifying a collector’s operation

A Concurrent Collector performs garbage collection work concurrently with the application’s own execution

A Parallel Collector uses multiple CPUs to perform garbage collection

An Incremental collector performs a garbage collection operation or phase as a series of smaller discrete operations with (potentially long) gaps in between

A Stop-the-World collector performs garbage collection while the application is completely stopped

“Mostly” means sometimes it isn’t (usually means a different fallback mechanism exists)

Page 15

What’s common to all Java GC mechanisms?

Identify the live objects in the memory heap

Reclaim resources held by dead objects

Periodically relocate live objects

Examples:

Mark/Sweep/Compact (common for Old Generations)

Copying collector (common for Young Generations)

Page 16

Mark (aka “Trace”)

Start from “roots” (thread stacks, statics, etc.)

“Paint” anything you can reach as “live”

At the end of a mark pass:

all reachable objects will be marked “live”

all non-reachable objects will be marked “dead” (aka “non-live”).

Note: work is generally linear to “live set”
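The mark pass described above can be sketched as a simple graph traversal. `MarkSketch` and `Node` are hypothetical names for a toy object model, not how any JVM represents its heap. The returned visit count shows why the work tracks the live set: unreachable objects are never touched.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class MarkSketch {
    /** Hypothetical heap object: a mark bit plus outgoing references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        boolean live;

        Node(String name) {
            this.name = name;
        }
    }

    /** Paints everything reachable from the roots as live; returns the number of objects visited. */
    static int mark(List<Node> roots) {
        Deque<Node> pending = new ArrayDeque<>();
        int visited = 0;
        for (Node root : roots) {
            if (!root.live) {
                root.live = true;
                pending.push(root);
            }
        }
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            visited++;
            for (Node child : n.refs) {
                if (!child.live) {
                    child.live = true;
                    pending.push(child);
                }
            }
        }
        return visited;
    }
}
```

Anything whose `live` flag is still false after the pass is garbage, including objects that point at live ones.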

Page 17

Sweep

Scan through the heap, identify “dead” objects and track them somehow

(usually in some form of free list)

Note: work is generally linear to heap size
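A sweep can be sketched as one linear walk over every cell in the heap, live or dead, which is why its cost follows heap size. `SweepSketch` and `Cell` are hypothetical names, and the free list is modelled as `{address, size}` pairs with adjacent dead cells coalesced.

```java
import java.util.ArrayList;
import java.util.List;

final class SweepSketch {
    /** Hypothetical heap cell: start address, size, and the mark bit left by a mark pass. */
    static final class Cell {
        final int address;
        final int size;
        boolean live;

        Cell(int address, int size, boolean live) {
            this.address = address;
            this.size = size;
            this.live = live;
        }
    }

    /** Walks the whole heap in address order and returns free chunks as {address, size}. */
    static List<int[]> sweep(List<Cell> heapInAddressOrder) {
        List<int[]> freeList = new ArrayList<>();
        int[] current = null;
        for (Cell cell : heapInAddressOrder) {
            if (cell.live) {
                cell.live = false;       // clear the mark for the next cycle
                current = null;          // a live cell ends the current free run
            } else if (current != null) {
                current[1] += cell.size; // coalesce with the preceding dead cell
            } else {
                current = new int[] {cell.address, cell.size};
                freeList.add(current);
            }
        }
        return freeList;
    }
}
```

Note that nothing moves: the free chunks stay wherever the dead objects happened to be.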

Page 18

Compact

Over time, heap will get “swiss cheesed”: contiguous dead space between objects may not be large enough to fit new objects (aka “fragmentation”)

Compaction moves live objects together to reclaim contiguous empty space (aka “relocate”)

Compaction has to correct all object references to point to new object locations (aka “remap”)

Remap scan must cover all references that could possibly point to relocated objects

Note: work is generally linear to “live set”
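The relocate and remap steps can be sketched with a forwarding table. `CompactSketch` and `Obj` are hypothetical, references are modelled as plain addresses, and the sketch assumes live objects only reference other live objects (which a preceding mark pass guarantees).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CompactSketch {
    /** Hypothetical heap object whose references are the addresses of other objects. */
    static final class Obj {
        int address;
        final int size;
        final boolean live;
        final int[] refs;

        Obj(int address, int size, boolean live, int... refs) {
            this.address = address;
            this.size = size;
            this.live = live;
            this.refs = refs;
        }
    }

    /** Slides live objects toward address 0 and fixes references; returns the first free address. */
    static int compact(List<Obj> heapInAddressOrder) {
        // Relocate: decide where each live object will end up.
        Map<Integer, Integer> forwarding = new HashMap<>();
        int next = 0;
        for (Obj o : heapInAddressOrder) {
            if (o.live) {
                forwarding.put(o.address, next);
                next += o.size;
            }
        }
        // Remap: every reference that could point at a moved object must be corrected.
        for (Obj o : heapInAddressOrder) {
            if (!o.live) {
                continue;
            }
            for (int i = 0; i < o.refs.length; i++) {
                o.refs[i] = forwarding.get(o.refs[i]);
            }
            o.address = forwarding.get(o.address);
        }
        return next;
    }
}
```

After the call, everything from the returned address upward is one contiguous empty region.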

Page 19

Copy

A copying collector moves all live objects from a “from” space to a “to” space & reclaims “from” space

At start of copy, all objects are in “from” space and all references point to “from” space.

Start from “root” references, copy any reachable object to “to” space, correcting references as we go

At end of copy, all objects are in “to” space, and all references point to “to” space

Note: work generally linear to “live set”
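A copying pass can be sketched as follows. `CopySketch` and `Obj` are hypothetical names, "to" space is modelled as a list, and an identity map stands in for the forwarding pointers a real collector would leave behind in "from" space.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

final class CopySketch {
    /** Hypothetical heap object with outgoing references. */
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();

        Obj(String name) {
            this.name = name;
        }
    }

    /** Copies everything reachable from the roots into toSpace; returns the roots remapped to the copies. */
    static List<Obj> copy(List<Obj> roots, List<Obj> toSpace) {
        Map<Obj, Obj> forwarded = new IdentityHashMap<>();
        Deque<Obj> scan = new ArrayDeque<>();
        List<Obj> newRoots = new ArrayList<>();
        for (Obj root : roots) {
            newRoots.add(evacuate(root, forwarded, toSpace, scan));
        }
        while (!scan.isEmpty()) {
            Obj copied = scan.poll();
            // Correct references as we go: each one is redirected to the "to" space copy.
            for (int i = 0; i < copied.refs.size(); i++) {
                copied.refs.set(i, evacuate(copied.refs.get(i), forwarded, toSpace, scan));
            }
        }
        return newRoots;
    }

    private static Obj evacuate(Obj from, Map<Obj, Obj> forwarded, List<Obj> toSpace, Deque<Obj> scan) {
        Obj to = forwarded.get(from);
        if (to == null) {
            to = new Obj(from.name);
            to.refs.addAll(from.refs); // still "from" space references until this copy is scanned
            forwarded.put(from, to);
            toSpace.add(to);
            scan.add(to);
        }
        return to;
    }
}
```

Dead objects are never visited, so the whole "from" space can be reclaimed in one step once the pass ends.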

Page 20

Generational Collection

Weak Generational Hypothesis: “most objects die young”

Focus collection efforts on young generation:

Use a moving collector: work is linear to the live set

The live set in the young generation is a small % of the space

Promote objects that live long enough to older generations

Only collect older generations as they fill up

“Generational filter” reduces rate of allocation into older generations

Tends to be (order of magnitude) more efficient

Great way to keep up with high allocation rate

Practical necessity for keeping up with processor throughput

Page 21

The typical combos in commercial server JVMs

Young generation usually uses a copying collector

Young generation is usually monolithic, stop-the-world

Old generation usually uses Mark/Sweep/Compact

Old generation may be STW, or Concurrent, or mostly-Concurrent, or Incremental-STW, or mostly-Incremental-STW

Page 22

Empty memory and CPU/throughput

Page 23

[Figure: Heap size vs. GC CPU %. Labels: CPU % (up to 100%), heap size, live set.]

Page 24

Empty memory needs (empty memory == CPU power)

The amount of empty memory in the heap is the dominant factor controlling the amount of GC work

For both Copy and Mark/Compact collectors, the amount of work per cycle is linear to live set

The amount of memory recovered per cycle is equal to the amount of unused memory (heap size - live set)

The collector has to perform a GC cycle when the empty memory runs out

A Copy or Mark/Compact collector’s efficiency doubles with every doubling of the empty memory
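The doubling claim follows from simple arithmetic, sketched below. `EmptyMemoryMath` and its methods are hypothetical, and the sketch assumes the work per cycle is exactly proportional to the live set, which the slides only state as "generally linear".

```java
final class EmptyMemoryMath {
    /** Memory reclaimed per unit of live data processed in one cycle. */
    static double efficiency(double heapGb, double liveSetGb) {
        return (heapGb - liveSetGb) / liveSetGb;
    }

    /** A cycle is needed each time the empty memory has been allocated away. */
    static double cyclesPerSecond(double heapGb, double liveSetGb, double allocGbPerSec) {
        return allocGbPerSec / (heapGb - liveSetGb);
    }

    public static void main(String[] args) {
        double live = 1.0;
        for (double heap : new double[] {2.0, 3.0, 5.0}) {
            System.out.println("heap=" + heap + "GB empty=" + (heap - live)
                    + "GB efficiency=" + efficiency(heap, live)
                    + " cycles/sec at 1GB/sec=" + cyclesPerSecond(heap, live, 1.0));
        }
    }
}
```

With a 1GB live set, going from 1GB to 2GB to 4GB of empty memory doubles the efficiency each time and halves the number of cycles needed for the same allocation rate.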

Page 25

What empty memory controls

Empty memory controls efficiency (amount of collector work needed per amount of application work performed)

Empty memory controls the frequency of pauses (if the collector performs any Stop-the-world operations)

Empty memory DOES NOT control pause times (only their frequency)

In Mark/Sweep/Compact collectors that pause for sweeping, more empty memory means less frequent but LARGER pauses

Page 26

Delaying the inevitable

Some form of copying/compaction is inevitable in practice

And compacting anything requires scanning/fixing all references to it

Delay tactics focus on getting “easy empty space” first

This is the focus for the vast majority of GC tuning

Most objects die young [Generational]

So collect young objects only, as much as possible. Hope for short STW.

But eventually, some old dead objects must be reclaimed

Most old dead space can be reclaimed without moving it [e.g. CMS]

Track dead space in lists, and reuse it in place

But eventually, space gets fragmented, and needs to be moved

Much of the heap is not “popular” [e.g. G1, “Balanced”]

A non-popular region will only be pointed to from a small % of the heap

So compact non-popular regions in short stop-the-world pauses

But eventually, popular objects and regions need to be compacted

Enterprise Apps

Low Latency

Page 27

Classifying common collectors

Page 28

The typical combos in commercial server JVMs

Young generation usually uses a copying collector

Young generation is usually monolithic, stop-the-world

Old generation usually uses a Mark/Sweep/Compact collector

Old generation may be STW, or Concurrent, or mostly-Concurrent, or Incremental-STW, or mostly-Incremental-STW

Page 29

HotSpot™ ParallelGC: Collector mechanism classification

Monolithic Stop-the-world copying NewGen

Monolithic Stop-the-world Mark/Sweep/Compact OldGen

Page 30

HotSpot™ ConcMarkSweepGC (aka CMS): Collector mechanism classification

Monolithic Stop-the-world copying NewGen (ParNew)

Mostly Concurrent, non-compacting OldGen (CMS)

Mostly Concurrent marking

Mark concurrently while mutator is running

Track mutations in card marks

Revisit mutated cards (repeat as needed)

Stop-the-world to catch up on mutations, ref processing, etc.

Concurrent Sweeping

Does not Compact (maintains free list, does not move objects)

Fallback to Full Collection (Monolithic Stop-the-world). Used for Compaction, etc.
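The "track mutations in card marks" step can be sketched as a write barrier that dirties a small table entry on every reference store. `CardTableSketch` is a hypothetical toy, not HotSpot code, and the 512-byte card size is an assumption made for illustration.

```java
import java.util.BitSet;

final class CardTableSketch {
    /** Assumed card size of 512 bytes (2^9); chosen here for illustration only. */
    static final int CARD_SHIFT = 9;

    private final BitSet dirty;

    CardTableSketch(int heapBytes) {
        dirty = new BitSet((heapBytes >>> CARD_SHIFT) + 1);
    }

    /** Write barrier: called on every reference store into the field at fieldAddress. */
    void onReferenceStore(int fieldAddress) {
        dirty.set(fieldAddress >>> CARD_SHIFT);
    }

    /** Returns the dirty card indexes in ascending order and clears them for the next round. */
    int[] drainDirtyCards() {
        int[] cards = dirty.stream().toArray();
        dirty.clear();
        return cards;
    }
}
```

A concurrent marker would revisit only the drained cards, and repeat while the mutator keeps dirtying new ones, which is where the final stop-the-world catch-up comes from.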

Page 31

HotSpot™ G1GC (aka “Garbage First”): Collector mechanism classification

Monolithic Stop-the-world copying NewGen

Mostly Concurrent, OldGen marker

Mostly Concurrent marking

Stop-the-world to catch up on mutations, ref processing, etc.

Tracks inter-region relationships in remembered sets

Stop-the-world mostly incremental compacting old gen

Objective: “Avoid, as much as possible, having a Full GC…”

Compact sets of regions that can be scanned in limited time

Delay compaction of popular objects, popular regions

Fallback to Full Collection (Monolithic Stop-the-world). Used for compacting popular objects, popular regions, etc.

Page 32

Monolithic-STW GC Problems

Page 33

One way to deal with Monolithic-STW GC

Page 34

Common ways people deal with hiccups

Averages and Standard Deviation
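A minimal sketch of why averages and standard deviation are a poor way to describe hiccups. `MeanVsMax` and its data are hypothetical: 9,999 samples of 1 msec plus a single 16,000 msec stall, not a measurement from this talk.

```java
import java.util.Arrays;

final class MeanVsMax {
    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Population standard deviation. */
    static double stdDev(double[] xs) {
        double m = mean(xs);
        double squares = 0;
        for (double x : xs) {
            squares += (x - m) * (x - m);
        }
        return Math.sqrt(squares / xs.length);
    }

    static double max(double[] xs) {
        double best = xs[0];
        for (double x : xs) {
            best = Math.max(best, x);
        }
        return best;
    }

    /** Hypothetical run: 9,999 samples of 1 msec and one 16,000 msec stall. */
    static double[] sample() {
        double[] xs = new double[10_000];
        Arrays.fill(xs, 1.0);
        xs[5_000] = 16_000.0;
        return xs;
    }
}
```

The mean of this data stays under 3 msec and the standard deviation under 200 msec, yet one request waited 16 seconds. A summary built from those two numbers would never reveal it.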

Page 35

[Figure (partial): hiccup duration (msec), max per interval; vertical axis 0 to 18000 msec.]

Page 36

[Figure: Hiccups by Time Interval. Hiccup duration (msec, 0 to 18000) vs. elapsed time (sec, 0 to 10000). Series: max per interval, 99%, 99.90%, 99.99%, max.]

[Figure: Hiccups by Percentile Distribution. Hiccup duration (msec, 0 to 18000) vs. percentile (0% to 99.9999%). Max = 16023.552 msec.]

Page 37

Page 38

©2012 Azul Systems, Inc.

Another way to cope: “Creative Language”

“Guarantee a worst case of X msec, 99% of the time”

“Mostly” Concurrent, “Mostly” Incremental

Translation: “Will at times exhibit long monolithic stop-the-world pauses”

“Fairly Consistent”

Translation: “Will sometimes show results well outside this range”

“Typical pauses in the tens of milliseconds”

Translation: “Some pauses are much longer than tens of milliseconds”

Page 39

Actually measuring things

(e.g. jHiccup)
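The idea behind this kind of measurement can be sketched in a few lines: a thread asks to sleep for a short, fixed time and records how much later than requested it actually woke up. `HiccupMeterSketch` is a hypothetical illustration of that idea only; it is not jHiccup's code, and it keeps just a maximum where a real tool would record a full distribution.

```java
import java.util.concurrent.TimeUnit;

final class HiccupMeterSketch {
    private long maxHiccupNanos;
    private long samples;

    /** Records one observation: how much later than requested the thread actually woke up. */
    void record(long requestedSleepNanos, long observedSleepNanos) {
        long hiccup = Math.max(0L, observedSleepNanos - requestedSleepNanos);
        maxHiccupNanos = Math.max(maxHiccupNanos, hiccup);
        samples++;
    }

    long maxHiccupNanos() {
        return maxHiccupNanos;
    }

    long samples() {
        return samples;
    }

    /** Sleeps in 1 msec steps for roughly the given duration, recording every overshoot. */
    void run(long durationMillis) throws InterruptedException {
        long resolutionNanos = TimeUnit.MILLISECONDS.toNanos(1);
        long end = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(durationMillis);
        while (end - System.nanoTime() > 0) {
            long before = System.nanoTime();
            TimeUnit.NANOSECONDS.sleep(resolutionNanos);
            record(resolutionNanos, System.nanoTime() - before);
        }
    }
}
```

Because the measuring thread does no application work, any overshoot it sees comes from the platform (JVM, OS, hardware) rather than from the application's own code.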

Page 40

Incontinuities in Java platform execution

[Figure: Hiccups by Time Interval. Hiccup duration (msec, 0 to 1800) vs. elapsed time (sec, 0 to 1800). Series: max per interval, 99%, 99.90%, 99.99%, max.]

[Figure: Hiccups by Percentile Distribution. Hiccup duration (msec, 0 to 1800) vs. percentile (0% to 99.999%). Max = 1665.024 msec.]

Page 41

FiServ Pricing Application

[Figure: Hiccups by Time Interval. Hiccup duration (msec, 0 to 700) vs. elapsed time (sec, 0 to 16000). Series: max per interval, 99%, 99.90%, 99.99%, max.]

[Figure: Hiccups by Percentile Distribution. Hiccup duration (msec, 0 to 700) vs. percentile (0% to 99.9999%). Max = 636.928 msec.]

Page 42

Yet another FiServ application

[Figure: Hiccups by Time Interval. Hiccup duration (msec, 0 to 160) vs. elapsed time (sec, 0 to 4500). Series: max per interval, 99%, 99.90%, 99.99%, max.]

[Figure: Hiccups by Percentile Distribution. Hiccup duration (msec, 0 to 160) vs. percentile (0% to 99.9999%). Max = 137.472 msec.]

Page 43

A post-Monolithic-STW world

Page 44

We needed to solve the right problems

Even short pauses are a problem

Scale is artificially limited by responsiveness

Responsiveness must be unlinked from scale:

Heap size, Live Set size, Allocation rate, Mutation rate

Transaction Rate, Concurrent users, Data set size, etc.

Responsiveness must be continually sustainable

Can’t ignore “rare” events

Eliminate all Stop-The-World Fallbacks

Any STW fallback is a failure

Page 45

The problems that needed solving (areas where the state of the art needed improvement)

Robust Concurrent Marking

In the presence of high mutation and allocation rates

Cover modern runtime semantics (e.g. weak refs, lock deflation)

Compaction that is not monolithic-stop-the-world

E.g. stay responsive while compacting entire, modern sized heaps

Must be robust: not just a tactic to delay STW compaction

[current “incremental STW” attempts fall short on robustness]

Young-Gen that is not monolithic-stop-the-world

Stay responsive while promoting and copying data spikes

Surprisingly little work done in this specific area

Page 46

Azul’s “C4” Collector: Continuously Concurrent Compacting Collector

Concurrent guaranteed-single-pass marker

Oblivious to mutation rate

Concurrent ref (weak, soft, final) processing

Concurrent Compactor

Objects moved without stopping mutator

References remapped without stopping mutator

Can relocate entire generation (New, Old) in every GC cycle

Concurrent, compacting old generation

Concurrent, compacting new generation

No stop-the-world fallback

Always compacts, and always does so concurrently

Page 47

Benefits

Page 48

Sample responsiveness behavior

SpecJBB + Slow churning 2GB LRU Cache

Live set is ~2.5GB across all measurements

Allocation rate is ~1.2GB/sec across all measurements

Page 49

Sustainable Throughput: The throughput achieved while safely maintaining service levels

Unsustainable Throughput

Page 50

Instance capacity test: “Fat Portal”

HotSpot CMS: Peaks at ~ 3GB / 45 concurrent users

* LifeRay portal on JBoss @ 99.9% SLA of 5 second response times

Page 51

Instance capacity test: “Fat Portal”

C4: still smooth @ 800 concurrent users

Page 52

Fun with jHiccup

Page 53

Oracle HotSpot CMS, 1GB in an 8GB heap

[Figure: Hiccups by Time Interval and Hiccups by Percentile Distribution. Hiccup duration (msec, 0 to 14000) vs. elapsed time (sec, 0 to 3500) and vs. percentile (0% to 99.999%). Max = 13156.352 msec.]

Zing 5, 1GB in an 8GB heap

[Figure: Hiccups by Time Interval and Hiccups by Percentile Distribution. Hiccup duration (msec, 0 to 25) vs. elapsed time (sec, 0 to 3500) and vs. percentile (0% to 99.9999%). Max = 20.384 msec.]

Page 54

Oracle HotSpot CMS, 1GB in an 8GB heap

[Figure: Hiccups by Time Interval and Hiccups by Percentile Distribution. Hiccup duration (msec, 0 to 14000) vs. elapsed time (sec, 0 to 3500) and vs. percentile (0% to 99.999%). Max = 13156.352 msec.]

Zing 5, 1GB in an 8GB heap

[Figure: the same Zing 5 charts redrawn on the 0 to 14000 msec axis used for HotSpot CMS. Max = 20.384 msec.]

Page 55

GC Tuning

Page 56

Java GC tuning is “hard”…

Examples of actual command line GC tuning parameters:

java -Xmx12g -XX:MaxPermSize=64M -XX:PermSize=32M -XX:MaxNewSize=2g
-XX:NewSize=1g -XX:SurvivorRatio=128 -XX:+UseParNewGC
-XX:+UseConcMarkSweepGC -XX:MaxTenuringThreshold=0
-XX:CMSInitiatingOccupancyFraction=60 -XX:+CMSParallelRemarkEnabled
-XX:+UseCMSInitiatingOccupancyOnly -XX:ParallelGCThreads=12
-XX:LargePageSizeInBytes=256m …

java -Xms8g -Xmx8g -Xmn2g -XX:PermSize=64M -XX:MaxPermSize=256M
-XX:-OmitStackTraceInFastThrow -XX:SurvivorRatio=2 -XX:-UseAdaptiveSizePolicy
-XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled
-XX:+CMSParallelRemarkEnabled -XX:+CMSParallelSurvivorRemarkEnabled
-XX:CMSMaxAbortablePrecleanTime=10000 -XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=63 -XX:+UseParNewGC -Xnoclassgc …

Page 57

The complete guide to Zing GC tuning

java -Xmx40g

Page 58

What you can expect (from Zing) in the low latency world

Assuming individual transaction work is “short” (on the order of 1 msec), and assuming you don’t have 100s of runnable threads competing for 10 cores...

“Easily” get your application to < 10 msec worst case

With some tuning, 2-3 msec worst case

Can go to below 1 msec worst case...

May require heavy tuning/tweaking

Mileage WILL vary

Page 59

An example of out-of-the-box behavior

Page 60

Low latency trading application

[Figure: two pairs of hiccup charts (Hiccups by Time Interval over 0 to 600 sec, and Hiccups by Percentile Distribution over 0% to 99.999%), labeled Oracle HotSpot (pure newgen) and Zing. One pair uses a 0 to 1.8 msec axis with Max = 1.568 msec; the other uses a 0 to 25 msec axis with Max = 22.656 msec.]

Page 61

Low latency - Drawn to scale

[Figure: the same two pairs of hiccup charts, labeled Oracle HotSpot (pure newgen) and Zing, both drawn on a 0 to 25 msec axis over 0 to 600 sec. One pair shows Max = 1.568 msec; the other shows Max = 22.656 msec.]

Page 62

Takeaway: “Real” Java is finally viable for low latency applications

GC is no longer a dominant issue, even for outliers

2-3 msec worst case is “easy” tuning

< 1 msec worst case is very doable

No need to code in special ways any more

You can finally use “real” Java for everything

You can finally use 3rd party libraries without worries

You can finally use as much memory as you want

You can finally use regular (good) programmers

Page 63

Q & A