Efficient Memory and Thread Management in Highly Parallel Java Applications


DESCRIPTION

This presentation discusses strategies to estimate and control the memory use of multi-threaded Java applications. It includes a quick overview of how the JVM uses memory, followed by techniques to estimate the memory usage of various types of objects during testing. This knowledge is then used as the basis for a runtime scheme to estimate and control the memory use of multiple threads. The final part of the presentation describes how to implement robust handling for unchecked exceptions, especially Out of Memory (OOM) errors, and how to ensure threads stop properly when unexpected events occur.

TRANSCRIPT

Page 1: Efficient Memory and Thread Management in Highly Parallel Java Applications

Phillip Koza, IBM

Efficient Memory and Thread Management in Highly Parallel Applications


Page 2: Efficient Memory and Thread Management in Highly Parallel Java Applications

Agenda

Java memory overview

Causes of excessive Java heap memory usage

Estimating memory usage of objects

Runtime memory usage estimation

Track and control memory usage

Minimize usage of memory

Efficiently manage threads

Summary

Page 3: Efficient Memory and Thread Management in Highly Parallel Java Applications

Java memory usage overview

Many different forms of memory usage by the Java Virtual Machine (JVM)

OS and C runtime

Native heap

Method area

Storage for objects that describe classes and methods

Page 4: Efficient Memory and Thread Management in Highly Parallel Java Applications

Java memory usage overview

JVM stack

Java heap

This is where instances of Java classes and arrays are allocated

Excessive use of Java heap memory will cause performance issues and, ultimately, java.lang.OutOfMemoryError

This presentation covers monitoring and controlling Java heap memory usage:

Better performance

Higher application availability

Page 5: Efficient Memory and Thread Management in Highly Parallel Java Applications

Java memory usage overview

The memory usage of a Java object is composed of:

Overhead: This stores the reference to the class object and various flags. Can range from 8 to 24 bytes depending on the JVM and 32-bit vs. 64-bit

Memory for primitive fields

Memory for reference fields

Alignment bytes. In some (most) JVMs, the size of all objects must be a multiple of 8. Because of this, an Integer object uses as much memory as a Long

Arrays use an extra 4-8 bytes to store the size of the array

Page 6: Efficient Memory and Thread Management in Highly Parallel Java Applications

Causes of excessive memory usage

Lack of insight into the memory usage of common and custom objects

Overuse of delegation in class design

i.e. too many objects!

Too many threads

This applies to the Java heap usage of a thread. Most threads will use some amount of Java heap memory.

Incorrect use of collection classes

Page 7: Efficient Memory and Thread Management in Highly Parallel Java Applications

Causes of excessive memory usage

Memory leaks

“Forgotten” or “lost” reference: application is holding a reference, but has forgotten about it or doesn’t know where it is

Infrequent or delayed cleanup of no longer needed objects

Excessive use of finalizers

There is no guarantee when a finalizer will be run or that it will be run at all. An object that has a finalizer will not be garbage collected until its finalizer is run

Page 8: Efficient Memory and Thread Management in Highly Parallel Java Applications

Excessive memory usage symptoms

Degradation in performance

High Garbage Collection (GC) costs

GC must run more frequently when large amounts of memory are being used

GC must sweep over and analyze more memory to find that which can be garbage collected

JVM can thrash when memory utilization exceeds 85-90% of max memory

Page 9: Efficient Memory and Thread Management in Highly Parallel Java Applications

Excessive memory usage symptoms

java.lang.OutOfMemoryError (OOM)

In most JVMs, by default this only kills the thread attempting to allocate the memory that triggered the error

Can lead to threads hanging or behaving strangely: “Zombie Threads”

JVM may crash – but perhaps not immediately!

Page 10: Efficient Memory and Thread Management in Highly Parallel Java Applications

Long term object memory usage

Most JVM memory allocation and GC is generational

Default in Oracle HotSpot

Default in IBM J9 1.6.0 build 2.6 and later, and in WebSphere Application Server V8 and later

In generational GC, objects are initially allocated in the “young” generation

If the object survives enough garbage collections, it is moved to the “old” or “tenured” generation

GC’ing the tenured generation is more expensive

Reducing the amount of data that is put in the tenured generation can increase performance dramatically!

Page 11: Efficient Memory and Thread Management in Highly Parallel Java Applications

Long term object memory usage

Hard to say for sure what objects will become tenured, but likely candidates are objects:

Stored in static structures

Passed between threads

Stored in member variable collections

By definition, long term objects tie up memory for longer periods of time, increasing the odds of OOM

It is still possible to run out of memory with no or very few long term objects – but this is unlikely!

Page 12: Efficient Memory and Thread Management in Highly Parallel Java Applications

Long term object memory usage

How to minimize the amount of long term memory:

Delay allocating objects until you need them

Immediately delete objects that are no longer needed or unlikely to be needed again

Convert objects to a more memory-efficient form if they won’t be needed for a while

Design your objects to be memory efficient from the start!

Page 13: Efficient Memory and Thread Management in Highly Parallel Java Applications

Estimating memory usage

Memory usage will depend on JVM vendor and 32 bit vs. 64 bit

For 64 bit, must determine if references will be compressed

Oracle HotSpot VM:

Enabled by default in 6.0_23 and later

Enabled by default in Java 7 if -Xmx is less than 32 GB

Can be disabled via -XX:-UseCompressedOops

IBM: Enabled via -Xcompressedrefs if -Xmx is less than 30 GB

Page 14: Efficient Memory and Thread Management in Highly Parallel Java Applications

Estimating memory usage

Two ways to estimate memory usage:

Runtime.totalMemory() – Runtime.freeMemory()

java.lang.instrument.Instrumentation.getObjectSize()

Page 15: Efficient Memory and Thread Management in Highly Parallel Java Applications

Estimating memory usage

The classic way to determine the memory use of an object (and the only way prior to Java 5) is to use the Runtime memory methods

First determine how much memory is used at the start of the test via Runtime.totalMemory() – Runtime.freeMemory()

Then allocate objects of a given type and see how much memory is now being used

The difference between the before and after memory usage should be the memory used by the allocated objects

Page 16: Efficient Memory and Thread Management in Highly Parallel Java Applications

Estimating memory usage

Unfortunately, this can be inaccurate at times. To be accurate, the garbage collector must run before we measure the "before" memory usage

It’s not possible to guarantee the garbage collector is run, and even if it is, it might not collect all the garbage objects

So, call System.gc() at the start of the test many times to increase the odds it will run

gc() problems can be minimized by allocating a large number of objects of the same type and then taking the average

Note: Escape analysis may decide to allocate your objects on the stack if they don’t “escape” the method you are allocating them in

Page 17: Efficient Memory and Thread Management in Highly Parallel Java Applications

Code sample

public interface ObjectFactoryInterface {

public Object makeObject( int index);

}

public class ObjectFactory implements ObjectFactoryInterface {

public Object makeObject( int index) {

return new Object();

}

}

public class IntegerFactory implements ObjectFactoryInterface {

public Object makeObject( int index) {

return new Integer(123456);

}

}

public class LongFactory implements ObjectFactoryInterface {

public Object makeObject( int index) {

return new Long(123456L);

}

}

Page 18: Efficient Memory and Thread Management in Highly Parallel Java Applications

Code sample

public class FloatFactory implements ObjectFactoryInterface {

public Object makeObject( int index) {

return new Float(123.456);

}

}

public class DoubleFactory implements ObjectFactoryInterface {

public Object makeObject( int index) {

return new Double(123.456);

}

}

public class BigDecimalFactory implements ObjectFactoryInterface {

public Object makeObject( int index) {

return new BigDecimal("123.456");

}

}

Page 19: Efficient Memory and Thread Management in Highly Parallel Java Applications

Code sample

public class StringFactory implements ObjectFactoryInterface {

public Object makeObject( int index) {

// Make a string from the provided index. This is necessary so that

// each String object will be new and will create a new character array

// (Strings can be interned and String character arrays can be shared

// if they represent the same value)

return Integer.valueOf(index).toString();

}

}

public class HashMapDefaultFactory implements ObjectFactoryInterface {

public Object makeObject(int index) {

return new HashMap();

}

}

public class IntArrayEmptyFactory implements ObjectFactoryInterface {

public Object makeObject(int index) {

return new int[0];

}

}

Page 20: Efficient Memory and Thread Management in Highly Parallel Java Applications

Code sample

public class MeasureMemoryUsage {

public static void main(String[] argv) {

System.out.println("Measuring Memory Usage: ");

printMemUsage( new ObjectFactory());

printMemUsage( new IntegerFactory());

printMemUsage( new LongFactory());

printMemUsage( new FloatFactory());

printMemUsage( new DoubleFactory());

printMemUsage( new StringFactory());

printMemUsage( new BigDecimalFactory());

printMemUsage( new HashMapDefaultFactory());

printMemUsage( new IntArrayEmptyFactory());

}

private static void printMemUsage(ObjectFactoryInterface objFactory) {

System.out.println(objFactory.getClass().getName() + " used " + estimateMemUsage(objFactory) + " bytes per Object");

}

Page 21: Efficient Memory and Thread Management in Highly Parallel Java Applications

Code sample

private static long estimateMemUsage(ObjectFactoryInterface objFactory) {

final int numObjects = 5000;

Object[] objArray = new Object[numObjects];

runGc();

long beforeMemUsed = getMemUsed();

System.out.println("Before memory used: " + beforeMemUsed);

for ( int i = 0; i < numObjects; i++) {

objArray[i] = objFactory.makeObject(10000000+i);

}

long afterMemUsed = getMemUsed();

System.out.println("After memory used: " + afterMemUsed);

return Math.round(((double)(afterMemUsed - beforeMemUsed)) / (double)numObjects);

}

Page 22: Efficient Memory and Thread Management in Highly Parallel Java Applications

Code sample

private static long getMemUsed() {

Runtime rTime = Runtime.getRuntime();

return rTime.totalMemory() - rTime.freeMemory();

}

private static void runGc() {

// Run GC 25 times to increase the odds it is really done

for ( int i=0; i < 25; i++) {

System.gc();

try {

Thread.sleep(200);

}

catch (InterruptedException e) {
    // ignore: this is only a best-effort pause between GC requests
}

}

}

}

Page 23: Efficient Memory and Thread Management in Highly Parallel Java Applications

Estimating memory usage

Using java.lang.instrument.Instrumentation.getObjectSize()

Create an instrument agent class:

Must have a method called “premain” with the following signature: public static void premain(String args, Instrumentation inst)

The JVM will pass an object that implements the Instrumentation interface to the premain method

We can then use this object to invoke the Instrumentation interface method getObjectSize(object), passing the object we want to estimate the memory for

Page 24: Efficient Memory and Thread Management in Highly Parallel Java Applications

Estimating memory usage

Must package the agent class into a jar with a Manifest file (for example, manifestFile.txt) with the following line:

Premain-Class: <packagename>.<agentclassname>

To run a program that can use the getObjectSize method, must specify the agent on the command line via

-javaagent:<agentjarname>

Note that getObjectSize doesn’t include the memory of objects referenced by the specified object. Only the memory for the reference will be included.

To get a deep estimate, reflection would need to be used
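For illustration, here is a minimal sketch of such a deep estimate, assuming the MemInstrumentAgent class shown on the next slides has been loaded via -javaagent (the DeepSizeEstimator class and deepObjectSize method are hypothetical names, not part of the JDK):

import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

public class DeepSizeEstimator {

    // Walks the object graph and sums getObjectSize() for every distinct object.
    // Each object is counted once, even if it is referenced from several places.
    public static long deepObjectSize(Object root) {
        Set<Object> visited = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        Deque<Object> pending = new ArrayDeque<Object>();
        pending.push(root);
        long total = 0;
        while (!pending.isEmpty()) {
            Object obj = pending.pop();
            if (obj == null || !visited.add(obj)) {
                continue;
            }
            total += MemInstrumentAgent.getObjectSize(obj);
            Class<?> cls = obj.getClass();
            if (cls.isArray()) {
                if (!cls.getComponentType().isPrimitive()) {
                    for (int i = 0; i < Array.getLength(obj); i++) {
                        pending.push(Array.get(obj, i));
                    }
                }
                continue;
            }
            for (Class<?> c = cls; c != null; c = c.getSuperclass()) {
                for (Field field : c.getDeclaredFields()) {
                    if (field.getType().isPrimitive() || Modifier.isStatic(field.getModifiers())) {
                        continue; // primitives are already included in getObjectSize()
                    }
                    field.setAccessible(true);
                    try {
                        pending.push(field.get(obj));
                    } catch (IllegalAccessException e) {
                        // skip fields we cannot read
                    }
                }
            }
        }
        return total;
    }
}

Walking an object graph this way is slow and may require opening access to non-public fields, so it is best reserved for testing rather than runtime estimation.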

Page 25: Efficient Memory and Thread Management in Highly Parallel Java Applications

Code sample

import java.lang.instrument.*;

public class MemInstrumentAgent {

private static volatile Instrumentation instrObj;

public static void premain(String args, Instrumentation instrParam) {

instrObj = instrParam;

}

public static long getObjectSize(Object obj) {

if (instrObj != null)

return instrObj.getObjectSize(obj);

else

throw new IllegalStateException("Instrumentation agent not initialized!");

}

}

Page 26: Efficient Memory and Thread Management in Highly Parallel Java Applications

Code sample

jar -cmf manifestFile.txt instrAgent.jar MemInstrumentAgent.class

public class testMemoryUse {

public static void main(String args[]) {

Integer intObj = new Integer(100);

System.out.println("Memory estimate for Integer: " + MemInstrumentAgent.getObjectSize(intObj));

}

}

java -javaagent:instrAgent.jar -cp . testMemoryUse

Page 27: Efficient Memory and Thread Management in Highly Parallel Java Applications

Estimating memory usage

Following results are for:

HotSpot 6.0_29 with -XX:-UseTLAB

IBM J9, 1.6.0 build 2.4

Note: -Xgcpolicy:gencon is the default as of 1.6.0 build 2.6. Also the default in WebSphere AppServer V8 and later.

Page 28: Efficient Memory and Thread Management in Highly Parallel Java Applications

Estimating memory usage

Estimated memory usage in bytes per object:

                        HotSpot32   HotSpot64   HotSpot64 CompOops   IBM32   IBM64   IBM64 CompRefs
Object                      8           16            16               12      24         16
Integer                    16           24            16               16      32         16
Long                       16           24            24               24      32         24
Float                      16           24            16               16      32         16
Double                     16           24            24               24      32         24
String (8 chars)           56           82            65               64      90         65
BigDecimal (123.456)       96          148           123              104     156        115
HashMap (16, 0.75)        120          216           128              128     224        128
int[0]                     16           24            16               16      24         16
int[100]                  416          424           416              416     424        416

Page 29: Efficient Memory and Thread Management in Highly Parallel Java Applications

Estimating memory usage

Which is better?

Using the Runtime methods is the simplest

Instrumentation is more accurate, so it is preferred if you don’t need a deep memory estimate

For deep estimates, Instrumentation + Reflection is the ultimate solution, but is more complex and slower

Page 30: Efficient Memory and Thread Management in Highly Parallel Java Applications

Runtime memory usage estimation

Your goal is to write a method to estimate the memory usage for objects of each class whose memory usage you will be tracking

Any referenced classes will need a separate method

Since each class needs its own estimation method, you don’t need deep memory estimates when testing to determine how much your objects will use

Have utility classes that have pre-calculated the memory usage of all primitives and standard objects

Page 31: Efficient Memory and Thread Management in Highly Parallel Java Applications

Runtime memory usage estimation

Need methods/constants for the following:

Object overhead

Alignment policy

Size of a reference

Size of a primitive (boolean, char, byte, short, int, long, float, double)

Size of primitive arrays

Size of any basic common objects used (String, Integer, Long, BigDecimal, Timestamp, etc)

Page 32: Efficient Memory and Thread Management in Highly Parallel Java Applications

Runtime memory usage estimation

For collections, measure the size of an empty collection

Then determine the overhead for an entry added to the collection

It is better to overestimate than to underestimate!

Better to use a little less than is available than to run out!

You can always refine the estimates to make them better

And anyway, the JVM is probably using more memory than you think!

Page 33: Efficient Memory and Thread Management in Highly Parallel Java Applications

Runtime memory usage estimation

Have a MemoryUsage interface with one method:

long getMemoryUsage()

Have each class whose memory you are going to track implement this interface

Each implementation estimates the memory usage of its primitives and object references

Primitives, object overhead, and object reference memory use can be a static final for that class

For any non-null object references, invoke the appropriate method to get the memory estimate for that object

This includes arrays
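The interface itself can be as small as the following sketch (the exact declaration is an assumption; the slides only name the interface and its method):

public interface MemoryUsage {
    // Estimated Java heap memory, in bytes, held by this object,
    // including the objects it references (a deep estimate).
    long getMemoryUsage();
}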

Page 34: Efficient Memory and Thread Management in Highly Parallel Java Applications

Runtime memory usage estimation

Round the final result up to a multiple of 8

If recursive references are possible, you need to remember if you’ve already visited an object when computing the estimate

Shared objects should only be counted once, or not at all

For potential "flyweight" objects (interned strings, cached Integers, etc.):

Assume not a flyweight unless you know you’ve used the value before or will probably use it again

For example, a String that stores a frequently used file name

Page 35: Efficient Memory and Thread Management in Highly Parallel Java Applications

Runtime memory usage estimation

Most custom objects quickly reduce to objects containing primitives and basic common objects

You don't need to do this for every class – just for those whose memory use you are going to track

You can easily check the accuracy of your estimates against reality using the techniques described earlier

The memory estimation is quick

Page 36: Efficient Memory and Thread Management in Highly Parallel Java Applications


Runtime memory usage estimation : Code Sample

public class WorkUnit implements MemoryUsage

{

private static final int primitiveUsage = 16;

private static final int thisOhead = 4;

private static final int refsUsage = 2 * 4;

private static final int fixedOhead = primitiveUsage + thisOhead + refsUsage;

long userid;

String originatingSystem;

Request requestObj;

long memUsage = -1;

Page 37: Efficient Memory and Thread Management in Highly Parallel Java Applications


Runtime memory usage estimation : Code Sample

public long getMemoryUsage()
{
    if (memUsage == -1) {
        int estimate = fixedOhead;
        estimate += MemUtility.getStringMemUsage(originatingSystem.length());
        estimate += requestObj.getMemoryUsage();
        memUsage = MemUtility.roundUsage(estimate);
    }
    return memUsage;
}

} // WorkUnit

Page 38: Efficient Memory and Thread Management in Highly Parallel Java Applications


Runtime memory usage estimation : Code Sample

public class Request implements MemoryUsage

{

private static final int primitiveUsage = 16;

private static final int thisOhead = 4;

private static final int refsUsage = 2 * 4;

private static final int fixedOhead = primitiveUsage + thisOhead + refsUsage;

long requestId;

String request;

String discountCode;

long memUsage = -1;

Page 39: Efficient Memory and Thread Management in Highly Parallel Java Applications


Runtime memory usage estimation : Code Sample

public long getMemoryUsage()
{
    if (memUsage == -1) {
        int estimate = fixedOhead;
        estimate += MemUtility.getStringMemUsage(request.length());
        if (discountCode != null) {
            estimate += MemUtility.getStringMemUsage(discountCode.length());
        }
        memUsage = MemUtility.roundUsage(estimate);
    }
    return memUsage;
}

} // Request

Page 40: Efficient Memory and Thread Management in Highly Parallel Java Applications


Runtime memory usage estimation : Code Sample

public class MemUtility

{
    static long getStringMemUsage(int length)
    {
        // Estimated String usage: ~40 bytes of fixed overhead plus 2 bytes per character
        return 40 + (length << 1);
    }

    static long roundUsage(long usage)
    {
        // Round up to a multiple of 8 to account for object alignment
        long usageDiv8 = usage >> 3;
        if ((usageDiv8 << 3) == usage)
            return usage;
        else
            return (usageDiv8 + 1) << 3;
    }

}

Page 41: Efficient Memory and Thread Management in Highly Parallel Java Applications

Track and Control Memory Usage

Have one logical pool of memory which consists of all the heap memory that is used by the objects whose memory usage will be controlled

Avoid separate pools!

Memory usage can be controlled “statically” or “dynamically”

Controlling memory usage statically means we reserve the memory up front, when the application starts, or when a new thread starts

Note that you don’t need to actually allocate the memory up front, you just reserve it

Page 42: Efficient Memory and Thread Management in Highly Parallel Java Applications

Track and Control Memory Usage

Dynamic tracking and control:

Keep track of the total used by all threads

Get the estimated memory usage of an object from its getMemoryUsage method and increment the total amount used

When the object is no longer needed, decrement its memory use from the total amount used

Only track long term memory usage

Don’t need to “pay” for the memory dynamically if it was accounted for statically

Each thread maintains a local total that tracks its memory usage

Page 43: Efficient Memory and Thread Management in Highly Parallel Java Applications

Track and Control Memory Usage

The global total amount used is a globally accessible long

All increments and decrements of the global total are synchronized

Updates of the global total are buffered to minimize synchronization costs

Page 44: Efficient Memory and Thread Management in Highly Parallel Java Applications

Track and Control Memory Usage

The absolute limit on the memory that can be used is the JVM max memory specified with the -Xmx parameter

But dynamically tracked memory can't have all of that

Memory should be reserved statically for objects or buffers if:

The application must have them

There are not very many

They will be used for a long time

They have a static size, or a static upper limit (reserve the upper limit)

Page 45: Efficient Memory and Thread Management in Highly Parallel Java Applications

Track and Control Memory Usage

Examples of buffers to reserve memory for statically: I/O buffers, LIFO page caches

Need to reserve a certain amount of memory for "everything else" – i.e. what we are not accounting for statically or dynamically

This can be one sum, a per-thread value * # of threads, or both

Also need to reserve memory to prevent the JVM from thrashing

Don't want to allow the heap to grow past 85-90% of the JVM max memory

Recommend reserving 15% of the JVM max memory

Page 46: Efficient Memory and Thread Management in Highly Parallel Java Applications

Track and Control Memory Usage

So, the amount of memory available for dynamically tracked memory is: max memory - (staticBuffers + thrashingOverhead + everythingElse)

Let’s call this “globalMaxDynamicMemory”

Every time a thread increments the global total, the global total is compared against globalMaxDynamicMemory

If the global max is exceeded, the increment request is denied
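A minimal sketch of how that global total might be maintained (the class and method names are assumptions; the slides do not prescribe an API), given a globalMaxDynamicMemory value computed as above:

public class GlobalMemoryTracker {

    private final long globalMaxDynamicMemory;
    private long globalTotalUsed;           // guarded by synchronization on this

    public GlobalMemoryTracker(long globalMaxDynamicMemory) {
        this.globalMaxDynamicMemory = globalMaxDynamicMemory;
    }

    // Attempts to register 'bytes' of additional long-term memory usage.
    // Returns false (the request is denied) if the global limit would be exceeded.
    public synchronized boolean tryIncrement(long bytes) {
        if (globalTotalUsed + bytes > globalMaxDynamicMemory) {
            return false;
        }
        globalTotalUsed += bytes;
        return true;
    }

    // Releases previously registered memory; call from a finally block.
    public synchronized void decrement(long bytes) {
        globalTotalUsed = Math.max(0, globalTotalUsed - bytes);
    }

    public synchronized long used() {
        return globalTotalUsed;
    }
}

The per-thread buffering mentioned earlier can be layered on top: each thread accumulates small deltas locally and only calls tryIncrement/decrement when its local total crosses a threshold, which reduces synchronization on the global total.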

Page 47: Efficient Memory and Thread Management in Highly Parallel Java Applications

Track and Control Memory Usage

If an increment of the global total is denied, the requesting thread must handle it.

A thread can handle this in a number of ways:

Wait and periodically retry until more memory becomes available

Reduce some of its existing registered memory usage – throw away unneeded objects, etc.

Write some of its data to disk

Throw an exception, i.e. give up

Get another thread to give up some memory (possible, but more complicated!)

Page 48: Efficient Memory and Thread Management in Highly Parallel Java Applications

Track and Control Memory Usage

If a thread reduces its registered memory usage by the amount of the denied request, it doesn’t need to re-submit the request

To prevent one thread from “hogging” all the memory, can have heuristics, such as one thread cannot have more than 90% of the dynamically tracked memory if there is more than one thread

The decrement of a thread’s memory use should be in finally blocks to ensure the memory is released if the thread gets an uncaught exception
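A usage sketch combining these points, assuming the hypothetical GlobalMemoryTracker above and the WorkUnit class from the earlier code samples (process() stands in for the application's own work):

public class TrackedWorker {

    private final GlobalMemoryTracker tracker;   // the sketch from the previous slide

    public TrackedWorker(GlobalMemoryTracker tracker) {
        this.tracker = tracker;
    }

    // Register the unit's estimated memory before holding it long term,
    // and always release it in a finally block.
    public void handle(WorkUnit workUnit) throws InterruptedException {
        long bytes = workUnit.getMemoryUsage();
        while (!tracker.tryIncrement(bytes)) {
            Thread.sleep(200);   // denied: wait and retry (could also shed memory, spill to disk, or give up)
        }
        try {
            process(workUnit);
        } finally {
            tracker.decrement(bytes);   // released even if process() throws an unchecked exception
        }
    }

    private void process(WorkUnit workUnit) {
        // application-specific work would go here
    }
}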

Page 49: Efficient Memory and Thread Management in Highly Parallel Java Applications

Soft references

Use of soft references to control your memory use really only works for objects you can get again or don’t really need

This may be acceptable for something like a web page image – if it is reclaimed by the GC, you just read the image from disk again

Doesn’t work for objects where the only copy is in memory!

No way to control the order in which softly reachable objects are reclaimed

Page 50: Efficient Memory and Thread Management in Highly Parallel Java Applications

Minimize Memory Usage

Prefer arrays over more complex structures

Collections:

Avoid empty collections

Size collections appropriately

Choose the correct collection type

Class structure impact on memory usage:

Avoid storing data in a separate class if all instances of your class need that data and it is not needed outside of that class

Store info specific to only a subset of class instances in a derived class or in a separate class

Page 51: Efficient Memory and Thread Management in Highly Parallel Java Applications

Minimize Memory Usage

Convert objects to byte arrays:

Objects that won't be needed for a while

Objects that are going to be written to a stream soon

Example: objects in a hash map when most won't be used for a while

Convert to a byte array when stored; convert back when fetched

Can have a mix of converted/unconverted data: convert a byte array back to an object once it has been fetched once

However, the hash map won't be able to use generics if it has a mix of converted/unconverted data
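One way to implement the conversion is plain Java serialization, as in this sketch (it assumes the stored objects implement Serializable; the class name is illustrative):

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ByteArrayConverter {

    // Replace a rarely-used object with its serialized form to shrink long-term heap usage.
    public static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        return bos.toByteArray();
    }

    // Convert back on first use; the caller can then keep the live object if it will be reused.
    public static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        }
    }
}

A hand-rolled binary encoding is usually smaller and faster than default serialization, but the structure is the same: store the byte[] in the map, and rebuild the object on fetch.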

Page 52: Efficient Memory and Thread Management in Highly Parallel Java Applications

Minimize Memory Usage

Cleanup unneeded objects ASAP

Optimize for mainline processing, not exceptional processing

For example, if possible, delete requested info as soon as you send it back to the user – don't wait for the confirmation response

If the response is negative, read the requested info back in

Of course, this is only worthwhile if failures are rare!

Use finally blocks liberally!

Make sure any allocated memory is freed

And if it's dynamically tracked, be sure to decrement the memory usage from the global total!

Page 53: Efficient Memory and Thread Management in Highly Parallel Java Applications

Minimize Memory Usage

Object pooling:

Doesn't reduce your memory usage, but reduces the cost to allocate and GC it

Only worth it for simple, multi-purpose objects, such as primitive arrays

Byte arrays are the best candidate, as they can be used to store any kind of data and don't have to be an exact match

Relatively easy to have a pool of byte arrays of various sizes that can be used to store other objects (see the sketch below)

Store large objects (i.e. LOBs) as a series of smaller byte array buffers instead of one large one

If you find your pooling is getting complicated – stop! Leave complex memory management to the JVM
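A deliberately simple sketch of such a byte array pool (the sizes and class name are assumptions); anything much more elaborate is usually a sign to let the JVM manage the memory:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ByteBufferPool {

    private final int bufferSize;
    private final BlockingQueue<byte[]> pool;

    public ByteBufferPool(int bufferSize, int maxBuffers) {
        this.bufferSize = bufferSize;
        this.pool = new ArrayBlockingQueue<byte[]>(maxBuffers);
    }

    // Reuse a pooled buffer if one is available; otherwise allocate a new one.
    public byte[] acquire() {
        byte[] buf = pool.poll();
        return (buf != null) ? buf : new byte[bufferSize];
    }

    // Return a buffer to the pool; if the pool is full the buffer is simply dropped for GC.
    public void release(byte[] buf) {
        if (buf != null && buf.length == bufferSize) {
            pool.offer(buf);
        }
    }
}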

Page 54: Efficient Memory and Thread Management in Highly Parallel Java Applications

Balance and control threads

Feed work to a thread via an ArrayBlockingQueue or your own synchronized buffering scheme, with a limited size

When a thread input queue is empty, or an output queue is full, the thread will have to wait

This keeps threads from getting too far ahead of each other

The capacity of the queue/buffer determines how far ahead threads are allowed to get relative to each other

To balance, keep track of the times producers and consumers must wait to put/take data from the queue

Lots of consumer waits = need more producer threads

Lots of producer waits = need more consumer threads

Page 55: Efficient Memory and Thread Management in Highly Parallel Java Applications

Balance and control threads

The capacity of a java blocking queue is specified as a number of objects

Unless all objects placed in the queue have the same size, the memory usage of the queue can vary widely

Can run out of memory if you don't limit or keep track of the memory usage of blocking queues

Page 56: Efficient Memory and Thread Management in Highly Parallel Java Applications

Balance and control threads

The solution is to decorate the blocking queue with a logical queue

The logical queue buffers objects in lists and adds these list buffers to the queue when the buffers are full

The lists are considered full when they reach a certain amount of estimated memory usage

So the queue is a queue of lists, each of which will use a similar amount of memory

The maximum amount of memory the queue can use is then bufferSize * (queueCapacity)
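A sketch of such a decorator, reusing the MemoryUsage interface from earlier (the class name and the single-producer-per-instance assumption are mine, not the slides'):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class MemoryBoundedQueue<T extends MemoryUsage> {

    private final BlockingQueue<List<T>> queue;   // each element is one "full" buffer of work
    private final long maxBufferBytes;            // estimated memory at which a buffer is considered full
    private List<T> currentBuffer = new ArrayList<T>();
    private long currentBufferBytes = 0;

    public MemoryBoundedQueue(int queueCapacity, long maxBufferBytes) {
        this.queue = new ArrayBlockingQueue<List<T>>(queueCapacity);
        this.maxBufferBytes = maxBufferBytes;
    }

    // Called by the producer thread; blocks only when a full buffer must be enqueued.
    public void put(T item) throws InterruptedException {
        currentBuffer.add(item);
        currentBufferBytes += item.getMemoryUsage();
        if (currentBufferBytes >= maxBufferBytes) {
            flush();
        }
    }

    // Push any partially filled buffer, e.g. when the producer is idle or stopping.
    public void flush() throws InterruptedException {
        if (!currentBuffer.isEmpty()) {
            queue.put(currentBuffer);
            currentBuffer = new ArrayList<T>();
            currentBufferBytes = 0;
        }
    }

    // Called by the consumer thread; returns one buffer of work units.
    public List<T> take() throws InterruptedException {
        return queue.take();
    }
}

With one partially filled buffer in the producer and one being drained by the consumer, the practical ceiling is slightly above bufferSize * queueCapacity, which is another reason to reserve on the high side.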

Page 57: Efficient Memory and Thread Management in Highly Parallel Java Applications

Balance and control threads

The maximum amount of memory usage for each inter-thread queue is statically reserved when each queue is created, and released when those threads stop

This has the additional benefit of reducing the number of puts/gets to the underlying queue, which reduces the synchronization expense

Also bases the amount of queued-up work for a thread on the amount of memory that work consumes, instead of a number of requests

This is probably a better metric, since requests that consume more memory probably take longer to process

Page 58: Efficient Memory and Thread Management in Highly Parallel Java Applications

Waiting/Flushing

A thread needs to flush a buffered output queue if it is idle or has to wait

Waiting indefinitely is dangerous!

Try to NEVER wait on anything indefinitely

If you are waiting for another thread to do something, and it dies, you will be waiting forever

So, use loops + timeouts whenever possible

When you wake up, check if anyone wants you to stop, process a more urgent request, etc.

If not, re-issue the request

Page 59: Efficient Memory and Thread Management in Highly Parallel Java Applications

Handle OutOfMemoryError

Despite your best efforts, your app may still run out of memory

There are many reasons for this:

Tracked memory usage estimates are too low

Untracked memory usage was larger or was utilized longer than expected

Short term memory usage burst overwhelmed the GC

Memory Leaks

Too many threads, etc

Page 60: Efficient Memory and Thread Management in Highly Parallel Java Applications

Handle OutOfMemoryError

If an OOM error occurs, you will probably need a heap dump to determine why!

IBM JVM enables heap dumps on OOM by default

To disable, set environment variable IBM_HEAPDUMP_OUTOFMEMORY=FALSE

Oracle HotSpot disables heap dumps on OOM by default

To enable: -XX:+HeapDumpOnOutOfMemoryError

To see the value of all flags, specify: -XX:+PrintFlagsFinal

Page 61: Efficient Memory and Thread Management in Highly Parallel Java Applications

Handle OutOfMemoryError

When an OutOfMemoryError occurs, this is only guaranteed to stop the thread that triggered it

This can lead to “zombie threads”

Threads that were dependent on the thread that died can hang

The death of the thread getting the OOM might be sufficient to avoid another OOM, but not enough to keep the JVM from thrashing, thus preventing remaining threads from doing anything

Need a mechanism to bring down related threads

Page 62: Efficient Memory and Thread Management in Highly Parallel Java Applications

Handle OutOfMemoryError

Bring down related threads with stop request objects

Bring down the JVM if the thread that dies is a “critical” thread – i.e. it is essential for your application

Or if zombie threads are a big problem, can have your uncaught exception handler bring down the entire JVM if the exception is OOM

Page 63: Efficient Memory and Thread Management in Highly Parallel Java Applications

Stopping

Requesting stop via flag objects

Have a stop request object that indicates thread should stop

Classes check regularly, and always before sleeping, waiting, I/O, putting/getting from a blocking queue, etc.

Works the same regardless of whether the class is running as a thread or not

Stop request object can indicate different types of stop: stop immediate, stop gracefully (finish work in progress), etc.
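A minimal sketch of such a stop request object (the class and enum names are assumptions; the stop types mirror the ones listed above):

public class StopRequest {

    public enum StopType { NONE, GRACEFUL, IMMEDIATE }

    // volatile so worker threads see a new value promptly without synchronizing every read
    private volatile StopType stopType = StopType.NONE;

    public void requestStop(StopType type) {   // called rarely; callers may synchronize if needed
        stopType = type;
    }

    public boolean isStopRequested() {
        return stopType != StopType.NONE;
    }

    public StopType getStopType() {
        return stopType;
    }
}

The volatile field keeps reads cheap while still publishing a new stop request to worker threads promptly, which matches the read/write pattern described on the next slides.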

Page 64: Efficient Memory and Thread Management in Highly Parallel Java Applications

Stopping

Have a stop request object that indicates thread should stop

Can have a single stop request object for all related threads

But this only allows all of them to be stopped at once

For finer granularity of stop control, have a higher level stop request object with references to thread stop request objects

Setting of stop request object must be synchronized – but this does not occur very often

Page 65: Efficient Memory and Thread Management in Highly Parallel Java Applications

Stopping

Have a request stop object that indicates thread should stop

Reading of the stop request object will occur frequently

Reading the stop request object does not need to be synchronized for most applications

It is not critical to see a stop request instantaneously, and synchronization doesn't guarantee this anyway

The stop request should be a boolean or enum, so updates of it are atomic

Page 66: Efficient Memory and Thread Management in Highly Parallel Java Applications

Stopping

Detecting a thread has actually stopped

Could use Thread.isAlive() to detect if a thread has stopped

This requires a handle to the thread

Presumes the class is running as a thread

Won't work for higher abstractions

Page 67: Efficient Memory and Thread Management in Highly Parallel Java Applications

Stopping

Detecting a thread has actually stopped

Better to make this abstract

Separate Runnable class from class that does the real work

Runnable class invokes the work class

The Runnable class has an “isRunning” object that tracks the run status

All threads use this common Runnable class

Page 68: Efficient Memory and Thread Management in Highly Parallel Java Applications

Stopping

Detecting a thread has actually stopped

If a graceful stop is requested, threads must wait for all of their producer threads to stop before they can stop

This allows in-flight data to be processed

Unexpected thread termination

Just set the stop request object to bring down all related threads

Page 69: Efficient Memory and Thread Management in Highly Parallel Java Applications

Stopping

Returning control to the user

Controller or highest level thread must make sure all threads are truly stopped before returning

Strange things can happen if you try to restart and some threads from a previous incarnation are still running!

Could call thread.join() on each thread

But you must have a handle to all threads, and the controller can't be responsive to other requests while joined

Better for the controller to query the isRunning flag of all thread classes and wait for all to be stopped.
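A sketch of that controller loop, assuming the workers are wrapped in the BulletproofRunnable class shown later in the deck and that its isRunning flag is visible to the controller:

import java.util.List;

public class ThreadController {

    // Poll each worker's isRunning flag instead of joining, so the controller stays responsive.
    static void waitForAllStopped(List<BulletproofRunnable> workers) throws InterruptedException {
        boolean anyRunning = true;
        while (anyRunning) {
            anyRunning = false;
            for (BulletproofRunnable worker : workers) {
                if (worker.isRunning) {
                    anyRunning = true;
                    break;
                }
            }
            if (anyRunning) {
                Thread.sleep(200);   // short sleep; the loop could also service other requests here
            }
        }
    }
}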

Page 70: Efficient Memory and Thread Management in Highly Parallel Java Applications

Example: buffering, flushing, stopping

void work() throws InterruptedException
{
    final int maxBufSize = 102400;    // flush output once its estimated memory reaches ~100 KB
    final int outBufCapacity = 10;
    long curBufMemUsage = 0;
    ArrayList<WorkUnit> outBuf = new ArrayList<WorkUnit>(outBufCapacity);

    while (!checkForStopRequests()) {

        ArrayList<WorkUnit> inBuf = inQueue.poll(200, TimeUnit.MILLISECONDS);

        // If we had to wait for input, flush any buffered output
        if (inBuf == null) {
            if (outBuf.size() > 0) {
                flushOutput(outBuf);
                outBuf = new ArrayList<WorkUnit>(outBufCapacity);
                curBufMemUsage = 0;
            }
        }

Page 71: Efficient Memory and Thread Management in Highly Parallel Java Applications

Example: buffering, flushing, stopping

        else {
            for (WorkUnit workUnit : inBuf) {
                // process workUnit ...
                long unitMemUsage = workUnit.getMemoryUsage();
                // If adding this unit would push the buffer past its limit, flush first
                if (curBufMemUsage + unitMemUsage > maxBufSize) {
                    flushOutput(outBuf);
                    outBuf = new ArrayList<WorkUnit>(outBufCapacity);
                    curBufMemUsage = 0;
                }
                outBuf.add(workUnit);
                curBufMemUsage += unitMemUsage;
            }
        }
    }
}

Page 72: Efficient Memory and Thread Management in Highly Parallel Java Applications

Example: buffering, flushing, stopping

void flushOutput(ArrayList<WorkUnit> outBuf) throws InterruptedException {
    boolean success = false;
    // Retry with a timeout so the thread stays responsive to stop requests
    while (!success && !checkForStopRequests()) {
        success = outQueue.offer(outBuf, 200, TimeUnit.MILLISECONDS);
    }
}

boolean checkForStopRequests() {
    // Check the shared stop request object. Basic handling is:
    if (stopRequest.isStopRequested()) {
        threadRunnableObj.isRunning = false;
        return true;   // stop this thread
    }
    return false;
}

Page 73: Efficient Memory and Thread Management in Highly Parallel Java Applications

Uncaught exception handling

Proper handling for uncaught exceptions is absolutely critical!

Uncaught exceptions are the leading cause of unexpected thread termination

If any threads are dependent on others, if one goes down, the remaining threads can hang!

Hung threads can be hard to detect

The best way to handle this is to prevent it!

Page 74: Efficient Memory and Thread Management in Highly Parallel Java Applications

Uncaught exception handling

By default, uncaught exceptions are handled by the default uncaught exception handler

This just writes the stack trace to standard error

This is rarely acceptable!

The stack trace may be lost

If the class tracks its run status, it will be inaccurate after an uncaught exception: the run status will still be 'running'

What about threads that are dependent on this thread?

Page 75: Efficient Memory and Thread Management in Highly Parallel Java Applications

Uncaught exception handling

Ensure you have handling for all uncaught exceptions

Keep it simple: do what must be done, but no more

Complex handling for uncaught exceptions could trigger other uncaught exceptions, which can be extremely difficult to debug!

And since uncaught exceptions don't occur often, the handler probably has not been tested as much as other modules

All handlers, at a minimum, should:

Log the error and stack trace somewhere durable and secure

Alert the user so the error will be noticed

If the thread class has a run status, the run status should be set to stopped, error, or some equivalent

This requires access to the thread class state

Page 76: Efficient Memory and Thread Management in Highly Parallel Java Applications

Uncaught exception handling

Ideally, the handler should also alert dependent threads

Can notify the controller or parent, which can:

Just restart the thread that got the uncaught exception

Not recommended! Any related threads could be in an indeterminate state

Stop all related threads (safest)

Set the stop request object for the related threads

Since all threads are monitoring it, this will bring them all down

Threads won't hang!

The controller/parent doesn't need an explicit message to know a thread has died, if it is monitoring the run status of the threads

In fact, the controller doesn't even need to get involved – just have the uncaught exception handler set the stop request object to stop

Page 77: Efficient Memory and Thread Management in Highly Parallel Java Applications

Uncaught exception handling

Where to handle?

Three primary possibilities:

Custom uncaught exception handler

Catch Throwable() in run() method

Finally block in run() method

Page 78: Efficient Memory and Thread Management in Highly Parallel Java Applications

Uncaught exception handling

Custom uncaught exception handler

If an uncaught exception handler is declared and is in scope, the default uncaught exception handler is NOT invoked

Determine scope:

Per JVM

Per Thread group

Per Runnable class

Page 79: Efficient Memory and Thread Management in Highly Parallel Java Applications

Uncaught exception handling

Custom uncaught exception handler – per JVM scope

Logging the error and alerting the user is about all that can be done

Since this is per-JVM, it does not have access to any specific thread classes or thread groups

Set via the static Thread method setDefaultUncaughtExceptionHandler

Thread.setDefaultUncaughtExceptionHandler(Thread.UncaughtExceptionHandler ueh)

ueh is a class that must implement uncaughtException(Thread thr, Throwable the)
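A minimal per-JVM handler might look like the following sketch (the logging here is a placeholder for whatever durable logging the application uses):

Thread.setDefaultUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
    public void uncaughtException(Thread thr, Throwable the) {
        // No thread-specific state is available here: log durably and alert the user
        System.err.println("Uncaught exception in thread " + thr.getName());
        the.printStackTrace();
    }
});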

Page 80: Efficient Memory and Thread Management in Highly Parallel Java Applications

Uncaught exception handling

Custom uncaught exception handler – per Thread group

Can guarantee standard handling for a group of threads by having their Thread instance use the same custom ThreadGroup class

Custom ThreadGroup class extends ThreadGroup and overrides the ThreadGroup uncaughtException method

Allows for common code for the thread group, but doesn’t have access to individual Runnable classes

Could register the stop request object for the threads in the group here and set it if an uncaught exception occurs in any thread in the group
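A sketch of such a ThreadGroup subclass, wired to the StopRequest object sketched earlier (the constructor wiring and class name are assumptions):

public class StoppingThreadGroup extends ThreadGroup {

    private final StopRequest stopRequest;

    public StoppingThreadGroup(String name, StopRequest stopRequest) {
        super(name);
        this.stopRequest = stopRequest;
    }

    @Override
    public void uncaughtException(Thread thr, Throwable the) {
        // Ask every thread in the group to stop, then fall back to the default logging behavior.
        stopRequest.requestStop(StopRequest.StopType.IMMEDIATE);
        super.uncaughtException(thr, the);
    }
}

Threads join the group simply by being constructed with it, e.g. new Thread(group, runnable, "worker-1").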

Page 81: Efficient Memory and Thread Management in Highly Parallel Java Applications

Uncaught exception handling

Custom uncaught exception handler – per Thread class

Can have access to the Runnable class, so the handler can log details about the class structure, set run status, etc. before exiting

Guaranteed to be invoked

Set via the Thread method setUncaughtExceptionHandler

setUncaughtExceptionHandler(Thread.UncaughtExceptionHandler ueh)

ueh is a class that must implement uncaughtException(Thread thr, Throwable the)

Page 82: Efficient Memory and Thread Management in Highly Parallel Java Applications

Uncaught exception handling

Catch Throwable() in run() method

Not recommended as the sole mechanism for handling uncaught exceptions

If you do this, do it outside any loops in the thread run method, i.e. at the highest level possible

Otherwise, the exception will not trigger the thread to exit, which can cause very strange thread conditions!

The thread can be "alive" but constantly cycling through errors, or hanging

Page 83: Efficient Memory and Thread Management in Highly Parallel Java Applications

Uncaught exception handling

Finally block in run() method

Have a handled flag that is set to true at the end of the try block and in every catch block

!handled = uncaught exception

Advantages:

Can have handling customized to the invoked class

Can set the run status and stop request

Disadvantages:

Every catch block must set the "handled" flag or the checked exception will be treated as unhandled

An uncaught exception handler that logs the exception is still necessary, since the finally block doesn't have access to the exception

Page 84: Efficient Memory and Thread Management in Highly Parallel Java Applications

Uncaught exception handling

Critical threads

Some threads may be critical to your application

If they stop unexpectedly, the application will not function

Best to stop the entire application when this occurs

The uncaught exception handling for the thread should call System.exit(<non-zero value>)

Page 85: Efficient Memory and Thread Management in Highly Parallel Java Applications

Example using finally block

public class BulletproofRunnable implements Runnable {

    volatile boolean isRunning;    // read by the controller thread, so declared volatile
    Worker worker;
    boolean handled;

    BulletproofRunnable(Worker worker) {
        this.worker = worker;
        handled = false;
    }

Page 86: Efficient Memory and Thread Management in Highly Parallel Java Applications

Example using finally block

public void run() {
    try {
        isRunning = true;
        worker.work();
        handled = true;
    }
    finally {
        if (!handled) {
            // uncaught exception!
            isRunning = false;
            worker.setStopRequest(true);
            // an uncaught exception handler still logs the error and stack trace to an error file
        }
    }
}

}

Page 87: Efficient Memory and Thread Management in Highly Parallel Java Applications

Example using custom handler

public class BulletproofRunnable implements Runnable, Thread.UncaughtExceptionHandler {

    volatile boolean isRunning;
    Worker worker;

    BulletproofRunnable(Worker worker) {
        this.worker = worker;
    }

Page 88: Efficient Memory and Thread Management in Highly Parallel Java Applications

Example using custom handler

public void uncaughtException(Thread thr, Throwable the) {
    isRunning = false;
    worker.setStopRequest(true);
    // log the error and stack trace to an error file
}

}

Where the actual thread is created, add:

Thread bulletproofThread = new Thread(bulletproofRunnable);
bulletproofThread.setUncaughtExceptionHandler(bulletproofRunnable);

Page 89: Efficient Memory and Thread Management in Highly Parallel Java Applications

Summary

Excessive memory usage can have a large adverse effect on application performance

Estimating memory usage of the objects used by your application is the first step to managing the application memory usage

Tracking and controlling long-term memory usage is essential to avoid OOM

Many techniques exist to minimize memory usage

Limiting thread input/output queue sizes by their memory usage balances thread memory use and helps avoid OOM

A framework for stopping threads and consistent handling of uncaught exceptions is essential to avoid hanging when OOM errors or other unexpected exceptions occur

Page 90: Efficient Memory and Thread Management in Highly Parallel Java Applications

Phillip Koza, [email protected]

Thank You