How long can you afford to Stop The World? Strategies to overcome long application pause times caused by Java GC Berlin, May 22. 2013 | Eric Hubert - Strategy & Architecture

Upload: java-usergroup-berlin-brandenburg

Post on 26-Jan-2015

DESCRIPTION

Talk by Eric Hubert at the Java Usergroup Berlin-Brandenburg on optimizing the garbage collection behavior of the Java Virtual Machine.

TRANSCRIPT

Page 1: How long can you afford to Stop The World?

How long can you afford to Stop The World?

Strategies to overcome long application pause times caused by Java GC

Berlin, May 22. 2013 | Eric Hubert - Strategy & Architecture

Page 2: How long can you afford to Stop The World?

Disclaimer

The information and evaluations expressed in this presentation are based on the author's personal experiences and knowledge.

They do not necessarily reflect the views of Jesta Digital.

The author makes no warranties of any kind regarding the accuracy and veracity of information and data provided.

No one shall rely on any of the published test results, which are inherently environment-specific. Readers are strongly encouraged to conduct their own testing in their specific environment, which may or may not show different results.

All mentioned trademarks are property of their respective owners.

Page 3: How long can you afford to Stop The World?

About me

• Developing software for about 20 years

• More than 10 years of experience in the Enterprise Java world (JDK 1.2)

• Before Jesta, worked for debis Systemhaus, T-Systems and Adesso

• Working for Jesta since 2007 (formerly Jamba!, Fox Mobile)

• Currently leading the „Strategy & Architecture“ team, focused on

– Strategic development of platform infrastructure and middleware

– Automation of software build, packaging, testing, deployment, release and application monitoring processes

– Close collaboration with cross-functional teams and the central system administration/operations team

• Contact: [email protected], XING, LinkedIn

Page 4: How long can you afford to Stop The World?

Agenda

• Motivation and Scope

• Summary of Java Memory Management Basics / GC Analysis

• Discussion of Different Strategies to overcome GC Pause Time issues

• Future Perspectives

• Open Discussion

Page 5: How long can you afford to Stop The World?

Motivation

• The demand to process large quantities of data in memory is steadily increasing:

– More and more data to process and analyze in shorter times (near-time/real time business requirements)

– Availability of commodity servers with up to 2 TB of RAM (over the past decades available memory grew ≈ 100x every 10 years)

– Memory is still by far the fastest storage technology

Page 6: How long can you afford to Stop The World?

Motivation

• Java GC can heavily impact application performance, especially in terms of latency / responsiveness (multi-second pause times on multi GB heaps)

• The runtime of most GC algorithms is proportional to the size of the live set of objects → the larger the heap, the longer the pause times

Random Access Latencies of Storage Technologies [REF_01]

Storage Technology              Random Access Latency
Registers                       1-3 ns
CPU L1 Cache                    2-8 ns
CPU L2 Cache                    5-12 ns
Memory (RAM)                    10-60 ns
High-speed network              10,000-30,000 ns (10-30 µs)
Solid State Disk (SSD) Drives   70,000-120,000 ns (70-120 µs)
Hard Disk Drives                3,000,000-10,000,000 ns (3-10 ms)

Page 7: How long can you afford to Stop The World?

Motivation / Scope

• There are multiple strategies to overcome/minimize application pause time issues related to Garbage Collection

• Nevertheless most talks, blog posts and other information sources center on JVM tuning (choice of collector algorithms, hints to improve promotion between generations etc.)

• Most people know at least the most frequently used JVM GC tuning arguments, but only some know basics of the automated memory management and alternative strategies

• OOME causes, memory leaks, heap analysis etc. out of scope

• Not going to stress you with excessive JVM tuning options either

Page 8: How long can you afford to Stop The World?

Out of Scope - HotSpot GC Tuning Details

[REF_02] Devoxx FR 2012 „Death by Pauses“ by Frank Pavageau

Page 9: How long can you afford to Stop The World?

Scope

• Ensure all attendees are aware of enough of the memory management basics in order to at least understand

– The reasons for long GC pauses

– How to verify that application unresponsiveness was caused by GC

• The main goal of my talk is to provide you with a broader view on strategies to solve application pause times due to GC activity

• Will not deep dive into any of those strategies, but explain each approach and discuss the pros and cons (as well as limitations and side effects)

• If applicable will provide pointers to information sources covering more details

Page 10: How long can you afford to Stop The World?

Automatic Memory Management Basics

• Responsibilities of Automatic Memory Management

• Basic Garbage Collector Algorithms / Important Terms

• Concept of Generations

• Common Triggers of Full GC

• Analysis of GC behavior / Information Sources

• Memory Performance Triangle

Page 11: How long can you afford to Stop The World?

Responsibilities of Automatic Memory Management

• Service provided by a “managed runtime” (e.g. the Java Virtual Machine) in which the program executes

– Assisted allocation

– Managed access to objects and their fields

– Automatic de-allocation of objects (Garbage Collection)

• Ensures that objects remain as long as they are in use

• Deems objects with no incoming references from other live objects as garbage

• Ensures that objects that are no longer required are thrown away to free up the memory they occupy for new objects

• Ensures any finalize method is run before the object is thrown away

Page 12: How long can you afford to Stop The World?

Garbage Collectors – Classification (1)

• Serial versus Parallel

– Serial: Only one GC task at a time (only single CPU core used)

– Parallel: Multiple GC tasks are performed in parallel (multiple CPU core usage)

• Stop-The-World (STW) versus (Mostly) Concurrent

– STW: app threads are suspended during whole GC

– Concurrent: app threads are executed while GC tasks are performed

• Incremental

– Performs a garbage collection operation or phase as a series of smaller operations with gaps in between

Page 13: How long can you afford to Stop The World?

Garbage Collectors – Classification (2)

[REF_03] Memory Management in the Java HotSpot VM

Page 14: How long can you afford to Stop The World?

Garbage Collectors – Classification (3)

• Reference Counting / Tracing

– Ref. Counting: Not used by mainstream JVM collectors, because plain reference counting cannot reclaim reference cycles

– Tracing: Currently most common; either single phase copy or multiple phases (mark and optionally sweep and/or compact)

• Copying versus Non-compacting versus Compacting

Page 15: How long can you afford to Stop The World?

Garbage Collectors – Simplified View (Tracing)

• Find and reclaim unreachable objects

> Trace the heap starting at the roots (thread stacks, static fields, operands of executed expression)

> Visits every live object

> Anything not visited is unreachable

> Therefore garbage

• If you can follow a chain of references from a root to a particular object, then that object is "strongly" referenced. It will not be collected.

• Reachable objects are also called „live objects“; together they form the „live set“
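The tracing idea above can be sketched on a toy object graph. This is only an illustration of reachability, not how HotSpot implements marking; the class names MarkNode and TracingSketch and the field refs are made up for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Toy heap object (hypothetical); "refs" stands in for its reference fields. */
final class MarkNode {
    final String name;
    final List<MarkNode> refs = new ArrayList<>();
    MarkNode(String name) { this.name = name; }
}

final class TracingSketch {
    /** Returns the live set: every node reachable from the given roots. */
    static Set<MarkNode> mark(List<MarkNode> roots) {
        Set<MarkNode> live = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<MarkNode> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            MarkNode n = pending.pop();
            if (live.add(n)) {           // first visit: mark it and follow its references
                pending.addAll(n.refs);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        MarkNode a = new MarkNode("a"), b = new MarkNode("b");
        MarkNode c = new MarkNode("c"), d = new MarkNode("d");
        a.refs.add(b);
        c.refs.add(d);   // c and d reference each other, but no root reaches them
        d.refs.add(c);
        Set<MarkNode> live = mark(List.of(a));
        for (MarkNode n : List.of(a, b, c, d)) {
            System.out.println(n.name + (live.contains(n) ? ": live" : ": garbage"));
        }
    }
}
```

Note that the cycle between c and d does not keep either object alive, which is exactly the case plain reference counting cannot handle.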

Page 16: How long can you afford to Stop The World?

Garbage Collection – Basic Algorithms (1)

• Copy/Scavenge

– Copy all live objects starting from the roots in a single pass operation from a source space to a target space and reclaim source space (effectively a move operation)

• At the beginning all objects are in source space and all references point to source space

• Start at the roots, copy any reachable object to target space and correct references while doing so

• At the end of copy all objects are in target space and all references point to target space; source space can be completely cleared

– Amount of work is generally linear in the size of the „live set“
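A minimal sketch of the copy step, assuming a toy heap where a "space" is just a list. The names CopyObj, CopySketch and scavenge are hypothetical, and the recursion is only suitable for small graphs; real collectors work on raw memory and avoid recursion.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Toy heap object (hypothetical); "refs" stands in for its reference fields. */
final class CopyObj {
    final String name;
    final List<CopyObj> refs = new ArrayList<>();
    CopyObj(String name) { this.name = name; }
}

final class CopySketch {
    /** Copies everything reachable from the roots into toSpace; returns the relocated roots. */
    static List<CopyObj> scavenge(List<CopyObj> roots, List<CopyObj> toSpace) {
        Map<CopyObj, CopyObj> forwarding = new IdentityHashMap<>(); // old object -> its copy
        List<CopyObj> newRoots = new ArrayList<>();
        for (CopyObj root : roots) {
            newRoots.add(copy(root, forwarding, toSpace));
        }
        return newRoots;
    }

    private static CopyObj copy(CopyObj old, Map<CopyObj, CopyObj> forwarding,
                                List<CopyObj> toSpace) {
        CopyObj moved = forwarding.get(old);
        if (moved != null) {
            return moved;                  // already copied: reuse the existing copy
        }
        moved = new CopyObj(old.name);
        forwarding.put(old, moved);        // register before recursing so cycles terminate
        toSpace.add(moved);
        for (CopyObj ref : old.refs) {
            moved.refs.add(copy(ref, forwarding, toSpace)); // reference corrected to target space
        }
        return moved;
    }
}
```

Only reachable objects are ever touched, so the work depends on the live set and not on how much garbage the source space contains.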

Page 17: How long can you afford to Stop The World?

Garbage Collection – Basic Algorithms (2)

• Mark / Sweep / (Compact)

– Mark any object reachable as live

– Scan heap for objects not marked live and track them in a kind of free-list (the sweep step is generally linear to the entire heap size, not just the live set)

– Over time, memory fragments

• Slower allocation

• Longer sweep phases

• Risk not having large enough contiguous space for allocation of large objects; can result in OOME

– Compaction moves (relocates) live objects together to reclaim contiguous empty space; all object references need to be corrected (remap); compacting is an expensive /time consuming operation

– A mark/sweep collector would not be a good choice for young generation, as it will not gain efficiency from the sparseness

Page 18: How long can you afford to Stop The World?

Garbage Collection – Basic Algorithms (3)

• Mark / Sweep / (Compact)

[REF_04] Mark-Sweep-Compact – Keith D. Gregory

Page 19: How long can you afford to Stop The World?

Garbage Collection – Basic Algorithms (4)

• Mark / Compact

– Reachable objects are marked

– Compacting step relocates the reachable (marked) objects either towards the beginning of the heap area (in-place compaction) or to another location (evacuating compaction)

– Mark and compact work are both linear to live set, while sweep work is linear to heap size

– Consequently, a mark/compact collector is linear to live set only, giving it similar efficiency characteristics to copying collectors

– Example: Azul C4

Page 20: How long can you afford to Stop The World?

Concept of Generations / Generational GC

• Incorporate this typical object lifetime structure into GC

– Different heap areas for objects with different lifetime

– Mostly different GC algorithms for objects with different lifetime

[REF_05] The Art of Garbage Collection Tuning – Angelika Langer & Klaus Kreft

Page 21: How long can you afford to Stop The World?

Concept of Generations / Generational GC

• Generations separate newly allocated objects from objects that have survived collections

• Heap divided into zones by age of the objects

[Figure: heap layout along the object lifetime axis]

Young (nursery) generation: objects collected by Minor and Full GC

– eden: allocation of objects in eden space (experienced no GC)

– survivor: 2 alternately used copy target spaces (experienced several GCs)

Tenured generation (old generation): objects collected only by Full GC

– tenured: objects survived multiple GCs

[REF_06] Based on Java 7 Garbage Collector G1 by Antons Kranga

Page 22: How long can you afford to Stop The World?

Concept of Generations / Generational GC

• Focus collection efforts on young generation

– Normally live objects represent only relatively small percentage of space

– Promote objects living long enough to older generations

• Tends to be much more efficient; great way to keep up with high allocation rate

• Only collect older generation as it fills up

• Requires a “Remembered set”: a way to track all references into the young generation from the outside

• Usually want to keep surviving objects in young generation for a while before promoting them to the old generation:

– Immediate promotion can dramatically reduce generational filter efficiency

– Waiting too long to promote can dramatically increase copying work

Page 23: How long can you afford to Stop The World?

Common Triggers of Full GC

• Entirely JVM implementation specific; they also depend on the selected GC algorithms

• “Common“ triggers in Oracle HotSpot JVM are:

– Old generation or permanent generation filled to a certain percentage

– Calling System.gc() (unless JVM option -XX:+DisableExplicitGC is set)

– Not enough free space in survivor space to copy objects from eden space

– Resizing of a space (growing or shrinking; also applies to PermGen)

• Verification via gc logs and/or Java MBeans

Page 24: How long can you afford to Stop The World?

Analysis of GC behavior / Information Sources (1)

• GC traces from JVM

-XX:+PrintGC (same as -verbose:gc)
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintGCDateStamps (since JDK 6 Update 6)

-Xloggc:logs/gc.log -XX:GCLogFileSize=50M -XX:NumberOfGCLogFiles=3 -XX:+UseGCLogFileRotation

Additional diagnostic options (troubleshooting / tuning)

-XX:+PrintTenuringDistribution
-XX:+PrintHeapAtGC
-XX:+TraceClassLoading / -XX:+TraceClassUnloading
-XX:+PrintGCApplicationStoppedTime (Warning: misleading – covers all JVM safepoints, not only GC)
-XX:+PrintSafepointStatistics (can be used for verification)

Page 25: How long can you afford to Stop The World?

Analysis of GC behavior / Information Sources (2)

• Example of STW Pause caused by Full GC using Parallel GC

[Full GC [PSYoungGen: 8523K->0K(188160K)] [PSOldGen: 574126K->428865K(575808K)] 582650K->428865K(763968K) [PSPermGen: 115404K->115404K(246144K)], 4.8381260 secs]

• Example of STW Pauses caused by non-concurrent phases of CMS GC

[GC [1 CMS-initial-mark: 1477934K(1835008K)] 1521654K(2053504K), 0.0902490 secs]

[1 CMS-remark: 3717734K(4023936K)] 3810187K(4177280K), 1.0523700 secs]

• Examples of STW Pauses caused by fallback of CMS GC to Serial Old GC (concurrent mode failure):

1784934K->1309805K(1926784K), 3.6729840 secs] 1927090K->1309805K(2080128K), [CMS Perm : 94690K->93690K(131072K)], 3.7968250 secs]

Hint: -XX:CMSInitiatingOccupancyFraction=<xx> and -XX:+UseCMSInitiatingOccupancyOnly

GC ParNew (promotion failed): 153344K->153344K(153344K), 0.1724000 secs]1574.221: [CMS: 2786273K->2531770K(4023936K), 5.5668010 secs] 2926282K->2531770K(4177280K), [CMS Perm : 94733K->92808K(131072K)], 5.7397890 secs]
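To check quickly whether such pauses line up with an observed unresponsiveness, the total duration at the end of each record can be pulled out of the log. A small sketch, assuming the record ends in "<seconds> secs]" as in the examples above; the class name GcLogPause is made up, and dedicated tools such as GCViewer parse far more of the format.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GcLogPause {
    // Matches the final "<seconds> secs]" of a record; inner timings are ignored.
    private static final Pattern TOTAL = Pattern.compile("(\\d+\\.\\d+) secs\\]\\s*$");

    /** Returns the total duration in seconds, or -1 if the line does not end with one. */
    static double totalSeconds(String line) {
        Matcher m = TOTAL.matcher(line);
        return m.find() ? Double.parseDouble(m.group(1)) : -1.0;
    }

    public static void main(String[] args) {
        String line = "[Full GC [PSYoungGen: 8523K->0K(188160K)] "
                + "[PSOldGen: 574126K->428865K(575808K)] 582650K->428865K(763968K) "
                + "[PSPermGen: 115404K->115404K(246144K)], 4.8381260 secs]";
        System.out.println("total pause in seconds: " + totalSeconds(line));
    }
}
```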

Page 26: How long can you afford to Stop The World?

Analysis of GC behavior / Information Sources (3)

• Standard Java GC-related Management Beans (JMX)

– Can be used for (remote) real-time monitoring of GC behavior and memory usage

– MBean names are garbage collector specific; we use custom code for normalization to streamline monitoring config

– java.lang:type=GarbageCollector,name=<collector name>

• Young Gen: Copy, ParNew, PS Scavenge, G1 Young Generation

• Old Gen: MarkSweepCompact , PS MarkSweep, ConcurrentMarkSweep, G1 Old Generation

• Metrics: CollectionCount, CollectionTime, LastGcInfo (Composite)
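For a first look from inside the JVM, the same beans are available through the standard java.lang.management API. A short sketch; the class name GcBeans and the output format are arbitrary choices for this example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcBeans {
    /** One line per registered collector with its CollectionCount and CollectionTime. */
    static List<String> summary() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(String.format("%s: count=%d, time=%d ms",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        summary().forEach(System.out::println);
    }
}
```

Which collector names show up depends on the JVM and the selected collectors, which is the reason for the normalization mentioned above.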

Page 27: How long can you afford to Stop The World?

Analysis of GC behavior / Information Sources (4)

• Standard Memory Management Beans

– java.lang:type=Memory

• Metrics: HeapMemoryUsage (Composite: init, used, committed, max)

– java.lang:type=MemoryPool,name=<space name>

• Eden: Eden Space, Par Eden Space, PS Eden Space, G1 Eden

• Survivor: Survivor Space, Par Survivor Space, PS Survivor Space, G1 Survivor

• Old: Tenured Gen, PS Old Gen, CMS Old Gen, G1 Old Gen

• Perm: Perm Gen, PS Perm Gen, G1 Perm Gen

• Metrics: Usage (init, committed, max, used)
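The memory beans can be read the same way. A sketch using only the standard java.lang.management API; the class name MemoryBeans is made up, and the pool names printed depend on the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

final class MemoryBeans {
    /** Currently used heap in bytes, as reported by java.lang:type=Memory. */
    static long heapUsedBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.printf("heap: init=%d used=%d committed=%d max=%d%n",
                heap.getInit(), heap.getUsed(), heap.getCommitted(), heap.getMax());
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();   // null if the pool is no longer valid
            if (usage != null) {
                System.out.printf("%s (%s): used=%d max=%d%n",
                        pool.getName(), pool.getType(), usage.getUsed(), usage.getMax());
            }
        }
    }
}
```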

Page 28: How long can you afford to Stop The World?

Analysis of GC behavior / Information Sources (5)

• Custom GC Management Bean

– MajorCollectionCount

– MajorCollectionTime

– MinorCollectionCount

– MinorCollectionTime

– CumulatedCollectionTime

– LastMajorCollectionDuration

– LastMajorCollectionMemoryReduction

– LastMajorCollectionStartTime

– LastMajorCollectionEndTime

– TenuredCollector

– YoungCollector

– Uptime

• Warning: CollectionTime ≠ STW pause time for concurrent collectors
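The custom bean itself is not shown in the slides. As a rough idea of the normalization step only, the collector names from the JMX slide can be mapped to minor/major like this; GcNormalizer, Kind and the method names are hypothetical and not the author's implementation.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class GcNormalizer {
    enum Kind { MINOR, MAJOR, UNKNOWN }

    // Collector MBean names as listed on the JMX slide.
    private static final Set<String> YOUNG = new HashSet<>(Arrays.asList(
            "Copy", "ParNew", "PS Scavenge", "G1 Young Generation"));
    private static final Set<String> OLD = new HashSet<>(Arrays.asList(
            "MarkSweepCompact", "PS MarkSweep", "ConcurrentMarkSweep", "G1 Old Generation"));

    static Kind classify(String collectorName) {
        if (YOUNG.contains(collectorName)) return Kind.MINOR;
        if (OLD.contains(collectorName)) return Kind.MAJOR;
        return Kind.UNKNOWN;
    }

    /** Sums CollectionTime (ms) over all registered collectors of the given kind. */
    static long collectionTimeMillis(Kind kind) {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            if (classify(gc.getName()) == kind) {
                total += Math.max(0, gc.getCollectionTime()); // -1 means "undefined"
            }
        }
        return total;
    }
}
```

Collectors that are not in either list end up as UNKNOWN, so a monitoring setup built on this would notice when a JVM update introduces new names.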

Page 29: How long can you afford to Stop The World?

GC Analysis – Command Line Tools

1) jstat/jstatd/jps

Usage: jstat -help|-options

jstat -<option> [-t] [-h<lines>] <vmid> [<interval> [<count>]]

Example output – Live Server instance:

jstat -gc 30731 1s 10

S0C    S1C    S0U    S1U EC      EU     OC       OU       PC      PU      YGC    YGCT    FGC  FGCT  GCT
3520.0 3520.0 2593.7 0.0 28224.0 2688.8 261728.0 184199.8 43172.0 25851.9 168136 978.259 1021 4.195 982.454

– Can only be used to calculate averages or with short update intervals

– Use option „-gcutil“ if you rather want to see space usage percentages

– jps helps to determine vmid (but mostly maps to PID anyway)

– jstatd is required for remote usage of jstat
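The averages mentioned above follow directly from the counters: total time divided by number of events. A tiny sketch using the YGC/YGCT and FGC/FGCT values from the sample output; the class name JstatAverages is made up.

```java
final class JstatAverages {
    /** Average time per collection in milliseconds; 0 if no collection happened. */
    static double avgMillis(long count, double totalSeconds) {
        return count == 0 ? 0.0 : totalSeconds * 1000.0 / count;
    }

    public static void main(String[] args) {
        // Counters taken from the jstat sample output above.
        System.out.printf("avg young GC time: %.2f ms%n", avgMillis(168136, 978.259));
        System.out.printf("avg full GC time:  %.2f ms%n", avgMillis(1021, 4.195));
    }
}
```

An average says nothing about the worst pause, which is why jstat alone is rarely enough for latency analysis.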

Page 30: How long can you afford to Stop The World?

GC Analysis – GUI Tools (1)

GCViewer – GC trace analyzer

• Originally developed by tagtraum industries (only maintained until 2008)

• Fork: https://github.com/chewiebug/GCViewer/downloads

Page 31: How long can you afford to Stop The World?

GC Analysis – GUI Tools (2)

HPjmeter

• GC log analyzer and monitoring (the latter for HP UX)

• Download: www.hp.com/go/hpjmeter

Page 32: How long can you afford to Stop The World?

GC Analysis – GUI Tools (3)

HPjmeter

Page 33: How long can you afford to Stop The World?

GC Analysis – GUI Tools (4)

JConsole

• Part of JDK

• Can be used for monitoring of a local (jvmid/pid) or remote (JMX RMI) JVM:
service:jmx:rmi:///jndi/rmi://<remote-machine>:<port>/jmxrmi
or, if behind a firewall and using a custom JMX RMI proxy:
service:jmx:rmi://<remote-machine>:<proxyport>/jndi/rmi://<remote-machine>:<port>/jmxrmi

• Integrated MBean browser

• Shows active JVM options, GC info and more runtime information

• Can be used to verify memory/gc behavior in realtime
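The same remote connection can be opened programmatically with the standard javax.management.remote API. A sketch for the plain (no proxy) URL form; the class name RemoteHeap is made up, and host and port are whatever the remote JVM was started with.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.net.MalformedURLException;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

final class RemoteHeap {
    /** Builds the plain JMX RMI service URL for the given host and port. */
    static JMXServiceURL url(String host, int port) {
        try {
            return new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("invalid host or port", e);
        }
    }

    /** Connects, reads the used heap in bytes, and closes the connection again. */
    static long remoteHeapUsed(String host, int port) throws IOException {
        try (JMXConnector connector = JMXConnectorFactory.connect(url(host, port))) {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            MemoryMXBean memory = ManagementFactory.newPlatformMXBeanProxy(
                    connection, ManagementFactory.MEMORY_MXBEAN_NAME, MemoryMXBean.class);
            return memory.getHeapMemoryUsage().getUsed();
        }
    }
}
```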

Page 34: How long can you afford to Stop The World?

GC Analysis – GUI Tools (5)

JConsole

Page 35: How long can you afford to Stop The World?

GC Analysis – GUI Tools (6)

Visual VM with Visual GC Plugin

• Part of JDK

• Based on jvmstat (local monitoring via jvmid, remote requires jstatd)

• Many other plugins available (also MBean browser)

• Shows active JVM options and other runtime information (not GC algos)

• Can be used to verify memory/gc behavior in realtime

• Very detailed view including information regarding survivor space usage as well as age information (histogram – not available for all algorithms)

Page 36: How long can you afford to Stop The World?

GC Analysis – GUI Tools (7)

Visual VM with Visual GC Plugin

Page 37: How long can you afford to Stop The World?

GC Analysis – GUI Tools (8)

IBM GCMV

• Eclipse RCP Application (can also be installed as plugin in Eclipse)

• Loads gc log file similar to gcviewer and provides statistics and graphs

• nice capability to zoom into pause time graph area

• Mainly written for IBM J9, but most parts also work for Oracle JVMs

• Update Site: http://download.boulder.ibm.com/ibmdl/pub/software/isa/isa410/production/

Page 38: How long can you afford to Stop The World?

GC Analysis – GUI Tools (9)

IBM GCMV

Page 39: How long can you afford to Stop The World?

GC Analysis – GUI Tools (10)

JHiccup

• Small Java tool from Azul Systems to demonstrate application hiccups (primarily caused by GC, or any other JVM safepoint/OS jitter etc.)

• Either run from Command Line (script wrapping java command) or as javaagent

• Writes logfiles which can later be loaded in Excel to render nice diagrams by hitting a button (macros need to be active)

• Possibility to compare percentile values against expectations (SLA)
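The measuring principle can be illustrated in a few lines: a thread that should wake up every millisecond records how much later than expected it actually continues. This is a simplified sketch of the idea and not jHiccup's implementation; the class and method names are made up.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

final class HiccupSketch {
    /** Sleeps in 1 ms steps for the given duration and returns the worst extra delay in ns. */
    static long maxHiccupNanos(long durationMillis) {
        final long intervalNanos = TimeUnit.MILLISECONDS.toNanos(1);
        final long end = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(durationMillis);
        long worst = 0;
        while (System.nanoTime() < end) {
            long before = System.nanoTime();
            LockSupport.parkNanos(intervalNanos);
            // Anything beyond the requested interval was spent stalled (GC, safepoint, OS).
            long overshoot = System.nanoTime() - before - intervalNanos;
            worst = Math.max(worst, overshoot);
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println("worst hiccup in ns: " + maxHiccupNanos(1000));
    }
}
```

Because the measuring thread does no work of its own, any delay it sees would also hit application threads, regardless of whether the cause is GC or something else.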

Page 40: How long can you afford to Stop The World?

GC Analysis – GUI Tools (11)

JHiccup – Example Graphs of Telco App

• More details/examples later in this presentation …

• Download: http://www.azulsystems.com/downloads/jHiccup

Page 41: How long can you afford to Stop The World?

GC Tuning – Memory Performance Triangle

[Figure: triangle with the corners memory footprint, throughput and latency]

Page 42: How long can you afford to Stop The World?

Strategies to overcome GC Pause Time issues

1. Tuning of the JVM Runtime Behavior

2. Reduce memory footprint of the application

3. More powerful hardware (more RAM/CPU cores)

4. Distribute processing to multiple JVMs (with Remote Communication)

5. Custom Off-heap memory management

6. Switch to JVM implementation with more efficient Memory Management

Page 43: How long can you afford to Stop The World?

1. Tuning of the JVM Runtime Behavior (1)

• Structured Approach - Preconditions

– Knowledge about Java Memory Management

• Understanding how the memory is organized in the JVM to be tuned

• Knowing the options to change GC related runtime behavior and their limits

– Effect of Garbage Collector Choice

– Effect of Memory Space Sizing

– Effect of other collector-specific configuration switches

– Knowledge about GC Analysis

• Knowing what to measure

• Knowing how to measure

• Knowing how to interpret metrics

– Knowledge about GC Tuning

• Know at least how to approach a tuning

– Know your Operational Requirements

– Have one or multiple concrete Tuning Goals prior to any modification!

Page 44: How long can you afford to Stop The World?

1. Tuning of the JVM Runtime Behavior (2)

• If motivator is concrete performance issue (e.g. large pause time)

– First ensure the problem is really GC-related!

– Verify current GC configuration and analyze current GC behavior

– Evaluate your chances of improvement by runtime configuration tuning

• Verify your hardware and OS resources

• Verify object allocation rate

• Verify occupancy of tenured generation after Full GC

• Set those measures in relation to your goals

• Decide whether to proceed

– Use a comparable test environment with comparable workload to replicate your issue (automate tests!)

– Capture baseline data, do small changes at a time and compare with baseline

Page 45: How long can you afford to Stop The World?

1. Tuning of the JVM Runtime Behavior (3)

• Where are the large pauses? → typically old gen

• Start by tuning the young gen!

• First verify the reasons for promotion (young → old); depending on the outcome

– Think of increasing new size (decrease NewRatio value); do it stepwise and verify result

– Think of increasing survivor spaces (decrease SurvivorRatio value)

– Think of increasing age threshold to avoid too early promotion to old gen

• Verify total heap size, think of decreasing it (if possible)

• Proceed with old gen tuning

• Switch collector, try to use CMS (-XX:+UseConcMarkSweepGC)
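Before changing any of these values it helps to know what the running JVM currently uses. A sketch that reads them through the HotSpot-specific com.sun.management API; the class name YoungGenFlags is made up, and MaxTenuringThreshold is assumed here to be the flag behind the "age threshold" mentioned above.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

final class YoungGenFlags {
    /** Returns the flag's current value, or null if this JVM does not expose it. */
    static String flag(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean == null ? null : bean.getVMOption(name).getValue();
        } catch (IllegalArgumentException e) {
            return null; // unknown flag, or the bean is not supported on this JVM
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"NewRatio", "SurvivorRatio", "MaxTenuringThreshold"}) {
            System.out.println(name + " = " + flag(name));
        }
    }
}
```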

Page 46: How long can you afford to Stop The World?

1. Tuning of the JVM Runtime Behavior (4)

• Generational Oracle HotSpot JVM (6 Collector choices/combinations)

[Figure: collector combinations for young and old generation]

Young generation collectors:

– Serial Young (DefNew : Copy), -XX:+UseSerialGC

– Parallel Young (ParNew : ParNew), -XX:+UseParNewGC

– Parallel Scavenge (PSYoungGen : PS Scavenge), -XX:+UseParallelGC

Old generation collectors:

– Serial Old (MarkSweepCompact)

– Concurrent Old (ConcurrentMarkSweep), -XX:+UseConcMarkSweepGC

– Parallel Old (PS OldGen - PS MarkSweep), -XX:+UseParallelOldGC

Both generations:

– G1 (G1 Young Generation / G1 Old Generation), -XX:+UseG1GC

Figure labels on the connections: „Fallback“ and „Fallback -XX:-UseParNewGC“

Page 47: How long can you afford to Stop The World?

1. Tuning of the JVM Runtime Behavior (5)

• JVM attempts to use reasonable defaults in all areas, but also offers a large number of feature switches (currently more than 600)

• Useful resources to gather details:

– Official Oracle JVM documentation (lists about 90 options): http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html

– Use the JVM's built-in listing functions: java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal

For the adventurous, add -XX:+UnlockExperimentalVMOptions (please don't use any of those in production!)

– Choose one of the unofficial „complete references“ (gathered from source), e.g. http://www.pingtimeout.fr/2012/05/jvm-options-complete-reference.html

Page 48: How long can you afford to Stop The World?

1. Tuning of the JVM Runtime Behavior (6)

• Assessment

Pros:

– can drastically improve performance (e.g. reduce maximum pause times and/or improve throughput)

– relatively quick to apply (depending on knowledge and experience)

Cons:

– quite a lot of knowledge about memory management and implementation specific switches required

– danger to optimize for the moment (many variants: load, functionality used, software changes)

– needs to be carefully monitored and repeated with each redeployment / changed use

– heavily implementation dependent / can change with each minor JVM version update

Page 49: How long can you afford to Stop The World?

2. Reduce Memory Footprint of the application

• Sometimes easier said than actually done

• Generally one should avoid premature optimization

• Instead, regularly analyze heap dumps with a memory analyzer or memory profiler and verify proper data structure usage during development iterations

• Look out for wrongly scoped variables, e.g. objects mistakenly created inside loops although this is not needed (unnecessary allocation pressure)

• Only load the amount of data in memory you need to process (e.g. from some persistent store)

• For large amounts of objects carefully select data structures (verify overhead – fixed and per entry)

Page 50: How long can you afford to Stop The World?

2. Reduce Memory Footprint of the application

• Assessment

Pros:

– high memory savings possible (e.g. reduction of allocation rate and/or long-lived objects by up to >50%) also resulting in much smaller pause times

– can have bigger positive impact than any runtime tuning

Cons:

– rather high, consistent effort

– can negatively impact execution time if not properly applied

– can introduce bugs if existing code needs to be changed

Page 51: How long can you afford to Stop The World?

3. More Powerful Hardware (RAM/CPU cores)

• Very much depending on starting situation whether more computing resources can help to solve GC issues

– e.g. application is heavily CPU bound and not enough CPU cycles to properly run GC concurrently

– or maximum heap sizes should be increased, but not enough RAM

• The VM implementation and the chosen GC algorithms have a big impact as well

• If live set > 1 or 2 GB and currently using parallel GC on only two CPU cores, increasing the heap size, switching to CMS and increasing the number of CPU cores (thus GC threads) can have a large effect

Page 52: How long can you afford to Stop The World?

3. More Powerful Hardware (RAM/CPU cores)

• Assessment

Pros:

– not much work required to realize

Cons:

– costs involved

– may only mask underlying problems until a later stage (e.g. increased load)

Page 53: How long can you afford to Stop The World?

4. Distribute Memory Processing to multiple JVMs

• Sometimes easy, sometimes harder

• If we are talking about a mostly stateless application with a rather small amount of long lived objects horizontal scaling is quite easy (using proper loadbalancing and failover)

• Long-lived data needs to be somehow partitioned/sharded in order to improve efficiency (either manually or by using products supporting distributed memory structures aka. DataGrids – like Hazelcast, Infinispan, Terracotta, GridGain, Coherence etc.)
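For the manual variant, the simplest possible partitioning is a stable hash of the key modulo the number of JVMs. A sketch under that assumption; ShardRouter is a made-up name, and the data grid products listed above use more elaborate schemes.

```java
final class ShardRouter {
    private final int shards;

    ShardRouter(int shards) {
        if (shards <= 0) {
            throw new IllegalArgumentException("shards must be positive");
        }
        this.shards = shards;
    }

    /** Maps a key to a shard index in the range [0, shards). */
    int shardFor(Object key) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), shards);
    }
}
```

Each JVM then only holds the long-lived data of its own shard. The drawback of plain modulo hashing is that changing the number of shards moves most keys to a different JVM.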

Page 54: How long can you afford to Stop The World?

4. Distribute Memory Processing to multiple JVMs

• Assessment

Pros:

– depending on nature of application Java heap usage per instance can be drastically reduced

Cons:

– if existing application, distribution may need some re-architecture

– if memory issues are the only reason to massively scale horizontally one shouldn't forget about increased complexity, operational overhead and total memory overhead

Page 55: How long can you afford to Stop The World?

5. Custom Off-Heap Memory Management (1)

• sun.misc.Unsafe (internal implementation, dangerous, non-portable, and volatile)

• java.nio.ByteBuffer#allocateDirect (since JDK 1.4)

• Maximum size to be set with -XX:MaxDirectMemorySize=

• You have to use some serialization/deserialization mechanism

• Java‘s default serialization/deserialization is not very fast

• Two sub strategies:

– Dynamic size and merging: no memory wasted, but suffers fragmentation (synchronization at allocation/deallocation)

– Fixed size buffer allocation: no fragmentation, but memory wasted

• Proper cleanup not quite elegant to achieve (relies on finalizer )
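A minimal sketch of the fixed-size variant with java.nio.ByteBuffer#allocateDirect: every slot holds exactly one long, so "serialization" is just writing 8 bytes at a computed offset. The class name OffHeapLongStore is made up, and real implementations have to deal with variable-sized values, concurrency and explicit cleanup.

```java
import java.nio.ByteBuffer;

final class OffHeapLongStore {
    private final ByteBuffer buffer;
    private final int slots;

    OffHeapLongStore(int slots) {
        if (slots <= 0) {
            throw new IllegalArgumentException("slots must be positive");
        }
        this.slots = slots;
        // Allocated outside the Java heap; the content is never scanned by the GC.
        this.buffer = ByteBuffer.allocateDirect(Math.multiplyExact(slots, Long.BYTES));
    }

    void put(int slot, long value) {
        buffer.putLong(offset(slot), value);
    }

    long get(int slot) {
        return buffer.getLong(offset(slot));
    }

    private int offset(int slot) {
        if (slot < 0 || slot >= slots) {
            throw new IndexOutOfBoundsException("slot " + slot);
        }
        return slot * Long.BYTES;
    }
}
```

Fixed slots avoid fragmentation, but every slot occupies its 8 bytes whether it is used or not, which is the memory waste mentioned above.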

Page 56: How long can you afford to Stop The World?

5. Custom Off-Heap Memory Management (2)

Lucene/Elastic Search code (within an inner class in interface org.apache.lucene.store.bytebuffer.ByteBufferAllocator; forked to other projects like JBoss Netty):

    // Requires: import java.lang.reflect.Method;
    // Looks up the internal cleaner hooks of direct buffers once, via reflection.
    static {
        Method directBufferCleanerX = null;
        Method directBufferCleanerCleanX = null;
        boolean v;
        try {
            // DirectByteBuffer#cleaner() returns the buffer's sun.misc.Cleaner
            directBufferCleanerX =
                Class.forName("java.nio.DirectByteBuffer").getMethod("cleaner");
            directBufferCleanerX.setAccessible(true);
            // Cleaner#clean() releases the native memory immediately
            directBufferCleanerCleanX =
                Class.forName("sun.misc.Cleaner").getMethod("clean");
            directBufferCleanerCleanX.setAccessible(true);
            v = true;
        } catch (Exception e) {
            // Internal classes not available: explicit cleanup is not supported
            v = false;
        }
        CLEAN_SUPPORTED = v;
        directBufferCleaner = directBufferCleanerX;
        directBufferCleanerClean = directBufferCleanerCleanX;
    }

Page 57: How long can you afford to Stop The World?

5. Custom Off-Heap Memory Management (3)

• Projects and products using this strategy:

– Oracle Coherence

– GigaSpaces (to be validated)

– Hazelcast (Enterprise Edition)

– GridGain

– Terracotta BigMemory

– Lucene / Elastic Search

• Open-source frameworks for applying this strategy:

– Apache DirectMemory (Serialization via Protostuff)

– FST - Fast Serialization

Page 58: How long can you afford to Stop The World?

5. Custom Off-Heap Memory Management (4)

• Assessment:

Pros Cons

can reduce GC overhead / max pause times

quite tricky to get right, extreme implementations end up in own GC

"useful" application limited to simple/flat data structures (key-value access); usable as a medium-speed tier in caches

off-heap allocation is a lot slower than Java-heap allocation

either memory waste or fragmentation

standard heap analysis tooling does not apply, second set of tooling required

Page 59: How long can you afford to Stop The World?

6. JVM with more efficient Memory Management (1)

The Ultimate JVM GC Tuning Guide

java -Xmx40g

ZING

Page 60: How long can you afford to Stop The World?

6. JVM with more efficient Memory Management (2)

• Azul Zing Practical Evaluation / Comparison against Oracle HotSpot JVM

• Preparation:

– selected real software system as part of our platform showing some GC issues in production

– set up test environment (single JVM instance) on a VM with 16 GB, 8 cores

– created load test using real data captured from live systems

– single test run designed to last about one and a half hours

• Test Execution:

– incrementally increased load (concurrent users) until Oracle HotSpot with some default memory configuration showed severe issues

– changed memory sizing as well as GC algorithm in order to demonstrate issues known from real life

– switched to untuned Azul Zing
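
When comparing collector configurations in test runs like these, the accumulated collection counts and times can also be read from inside the JVM through the standard management beans. The following is a minimal sketch; the class name GcStats and its methods are assumptions for illustration and are not part of the test setup described above. Note that the reported collection time is what the JVM attributes to each collector and is not necessarily identical to the application pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper reading per-collector statistics of the running JVM. */
final class GcStats {

    /** Sum of the accumulated collection time over all collectors, in milliseconds. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime();   // -1 if undefined for this collector
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    /** One line per collector: name, number of collections, accumulated time. */
    static String summary() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": count=").append(gc.getCollectionCount())
              .append(", timeMs=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }
}
```

Logging GcStats.summary() at the start and end of a load test gives a coarse per-collector comparison; detailed pause analysis still requires GC logs.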

Page 61: How long can you afford to Stop The World?

6. JVM with more efficient Memory Management (3)

• Oracle HotSpot 1.6.0_43-b01, 64bit – 1 GB MaxHeap - ParallelGC: -Xms768m -Xmx1024m -XX:PermSize=128m -XX:MaxPermSize=128m

Page 62: How long can you afford to Stop The World?

6. JVM with more efficient Memory Management (4)

Oracle HotSpot 1.6.0_43-b01, 64bit – 1 GB MaxHeap - ParallelGC:

Page 63: How long can you afford to Stop The World?

6. JVM with more efficient Memory Management (5)

• Oracle HotSpot 1.6.0_43-b01, 64bit – 4 GB MaxHeap, ParallelGC: -Xms2048m -Xmx4096m -XX:PermSize=128m -XX:MaxPermSize=128m

Page 64: How long can you afford to Stop The World?

6. JVM with more efficient Memory Management (6)

Oracle HotSpot 1.6.0_43-b01, 64bit – 4 GB MaxHeap, ParallelGC

Page 65: How long can you afford to Stop The World?

6. JVM with more efficient Memory Management (7)

• Oracle HotSpot 1.6.0_43-b01, 64bit – 4 GB MaxHeap, CMS: -Xms4096m -Xmx4096m -XX:+UseConcMarkSweepGC (PermGen unchanged)

Page 66: How long can you afford to Stop The World?

6. JVM with more efficient Memory Management (8)

Oracle HotSpot 1.6.0_43-b01, 64bit – 4 GB MaxHeap, CMS

Page 67: How long can you afford to Stop The World?

6. JVM with more efficient Memory Management (9)

• Oracle HotSpot 1.6.0_43-b01, 64bit – 2 GB MaxHeap, CMS (tuned): -Xms2g -Xmx2g -Xmn256m -XX:SurvivorRatio=4 -XX:+UseConcMarkSweepGC -XX:ParallelGCThreads=16 -XX:CMSInitiatingOccupancyFraction=80 -XX:+UseCMSInitiatingOccupancyOnly (PermGen unchanged)

Page 68: How long can you afford to Stop The World?

6. JVM with more efficient Memory Management (10)

Oracle HotSpot 1.6.0_43-b01, 64bit – 2 GB MaxHeap, CMS (tuned)

Page 69: How long can you afford to Stop The World?

6. JVM with more efficient Memory Management (11)

• Azul Zing 1.6.0_33-ZVM_5.5.3.0-b5, 64bit – 10 GB MaxHeap, C4: -Xmx10g

Page 70: How long can you afford to Stop The World?

6. JVM with more efficient Memory Management (12)

Azul Zing 1.6.0_33-ZVM_5.5.3.0-b5, 64bit – 10 GB MaxHeap, C4

Page 71: How long can you afford to Stop The World?

6. JVM with more efficient Memory Management (12)

• The efficiency of memory and thread management is up to each JVM implementation

• Azul Systems offers a highly optimized commercial JVM called Zing, which is designed for low-latency use with large multi-GB heaps (1 GB to more than 300 GB)

• It uses a special read barrier (Loaded Value Barrier) to support concurrent compaction, concurrent remapping, and concurrent incremental update tracing

• Zing uses generational GC, but with the same base algorithm for both the young and the old generation: C4 (Continuously Concurrent Compacting Collector)

• Zing is built on top of a proprietary loadable Linux kernel module (multiple Linux distributions supported: Red Hat, CentOS, SLES, Ubuntu, etc.)
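
The idea behind a read barrier that supports concurrent relocation can be illustrated with a toy model. The sketch below is not Zing's implementation and makes strong simplifying assumptions: it is single-threaded, "addresses" are plain ints, a reference field is modelled as an int[1], and the names ToyHeap, relocate and loadReference are hypothetical. It only shows the principle that every reference load is checked, and that a stale reference is corrected in the field it was loaded from, so later loads need no further fix-up.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a self-healing read barrier; does not model concurrency, marking or page remapping. */
final class ToyHeap {
    private final Map<Integer, String> objects = new HashMap<>();      // address -> payload
    private final Map<Integer, Integer> forwarding = new HashMap<>();  // old address -> new address
    private int nextAddress = 1;

    int allocate(String payload) {
        objects.put(nextAddress, payload);
        return nextAddress++;
    }

    /** "Collector" side: moves an object and records where it went. */
    int relocate(int address) {
        int moved = allocate(objects.remove(address));
        forwarding.put(address, moved);
        return moved;
    }

    /** "Barrier" side: applied whenever a reference is loaded from a field. */
    int loadReference(int[] field) {
        int ref = field[0];
        Integer moved;
        while ((moved = forwarding.get(ref)) != null) {
            ref = moved;        // follow forwarding information to the current location
        }
        field[0] = ref;         // heal the field so the next load takes the fast path
        return ref;
    }

    String read(int address) {
        return objects.get(address);
    }
}
```

In this model the application never observes the old address once it loads through loadReference, which is the property that allows objects to be moved while the application keeps running.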

Page 72: How long can you afford to Stop The World?

6. JVM with more efficient Memory Management (13)

• Comparison of peak mremap rates for 16 GB of remaps

• Zing has a custom memory and thread management implementation and adds a production monitoring and management platform

Active Threads   Stock Linux              Modified Linux            Speedup
0                43.58 GB/sec (360 ms)    4734.85 TB/sec (3 µs)     >100,000x
1                3.04 GB/sec (5 s)        1488.10 TB/sec (11 µs)    >480,000x
2                1.82 GB/sec (8 s)        1166.04 TB/sec (14 µs)    >640,000x
4                1.19 GB/sec (13 s)       913.74 TB/sec (18 µs)     >750,000x
8                897.65 MB/sec (18 s)     801.28 TB/sec (20 µs)     >890,000x
12               736.65 MB/sec (21 s)     740.52 TB/sec (22 µs)     >1,000,000x

[REF_07] C4: The Continuously Concurrent Compacting Collector

Page 73: How long can you afford to Stop The World?

6. JVM with more efficient Memory Management (14)

[REF_08] Understanding Zing LX Memory Use

Page 74: How long can you afford to Stop The World?

6. JVM with more efficient Memory Management (15)

• Assessment

Pros:

– incredibly low pause times

– no GC tuning required if not aiming for microsecond pause times

– predictable worst-case pause times

– supports large multi-GB heaps

– sophisticated monitoring in production

Cons:

– not available in a free, unsupported form; license costs as with other supported JVMs

– requires the Azul Linux kernel module (depending on the Linux distribution and its kernel ABI/signature change policy, some update restrictions and increased operational effort are possible)

– requires more memory to work efficiently (reserved for Zing usage even if the JVM is not running)

– features of major Oracle HotSpot JVM releases become available in Zing with a delay, which is intended to be reduced further in the future

– (certain CPU requirements, but fulfilled by all modern commodity server CPUs)

Page 75: How long can you afford to Stop The World?

Future Perspectives (1)

• Garbage collection and performance in virtualized environments are among the hot future topics

• Oracle is currently busy with GC "convergence" (merging the sources of the former BEA JRockit and Sun HotSpot JVMs and tooling) as well as stabilization and performance improvements of G1 (future standard? only one GC "framework" instead of currently three)

• General goals according to talks with vendors like Oracle and IBM:

– Solve the linear scaling problem (200 ms @ 1 GB → 20 s @ 100 GB), partly a result of deferring "expensive operations"

– Scale using result-based, concurrent and incremental (through partitioned heap) garbage collection

– More flexible utilization of hardware (different data stores, SSD etc.)

Page 76: How long can you afford to Stop The World?

Future Perspectives (2)

• The Linux kernel community showed no interest in integrating Azul's improvements (consequently Azul gave up on this)

• The future will show to what extent Azul Systems will actively participate in the OpenJDK project and how this will influence upcoming Java SE versions

• The gap between Azul Zing and all other JVM implementations in terms of memory management efficiency seems to be large and is not likely to be closed anytime soon

Page 77: How long can you afford to Stop The World?

Discussion / Questions

Page 78: How long can you afford to Stop The World?

References

• [REF_01] Systems Computing - Understanding CPU Caching and Performance by Jon "Hannibal" Stokes

• [REF_02] Death By Pauses - Devoxx France 2012 by Frank Pavageau

• [REF_03] Memory Management in the Java HotSpot Virtual Machine by Sun Microsystems

• [REF_04] Java Reference Objects by Keith D. Gregory

• [REF_05] The Art of Garbage Collection Tuning by Angelika Langer & Klaus Kreft

• [REF_06] Java7 Garbage Collector G1 by Antons Kranga

• [REF_07] C4: The Continuously Concurrent Compacting Collector by Azul Systems (Gil Tene, Balaji Iyengar, Michael Wolf)

• [REF_08] Understanding Zing LX Memory Use by Azul Systems

Page 79: How long can you afford to Stop The World?

Further Reading

• Official Oracle JVM options documentation (subset)

• Official Oracle JVM GC Tuning Documentation

• Java Garbage Collection Analysis and Tuning

• JavaOne 2012 - Gil Tene - Azul Systems - Understanding GC

• Alexey Ragozin - HotSpot JVM GC Options Cheat Sheet (v2)

• Alexey Ragozin - Understanding GC pauses in JVM, HotSpot's minor GC

• Alexey Ragozin - Understanding GC pauses in JVM, HotSpot's CMS collector

• Alexey Ragozin - Surviving 16GiB heap and greater

• Java OutOfMemoryError – Eine Tragödie in sieben Akten

• JavaOne 2012 - G1 Garbage Collector Tuning

• How to Monitor Java Garbage Collection | CUBRID Blog

• Everything I ever learned about jvm performance tuning (twitter)

• Displaying Java’s Memory Pool Statistics with VisualVM
