Inside the JVM
Follow the white rabbit!
Sylvain Wallez - @bluxte
Toulouse JUG - 2017-04-26
Who’s this guy?
Software engineer at Elastic (cloud team)
Previously:
● IoT tech lead at OVH
● CEO at Actoboard
● Backend architect at Sigfox
● CTO at Goojet/Scoop it
● Lead architect at Joost
● Member of the Apache Software Foundation
● Cofounder & CTO at Anyware Technologies (now part of Sierra Wireless)
Agenda
● How it started: let’s optimize 6 lines of (hot) code!
● Profiling memory usage
● What’s in a class file?
● Micro-benchmarking with JMH
● Exploration of the OpenJDK source code
How it started
Let’s optimize 6 lines of (hot) code
On the Couchbase blog...
“JVM Profiling - Lessons from the trenches”: optimize the conversion of a protocol error code into a readable message.
public enum KeyValueStatus {
    UNKNOWN((short) -1, "Unknown code"),
    SUCCESS((short) 0x00, "The operation completed successfully"),
    ERR_NOT_FOUND((short) 0x01, "The key does not exists"),
    ERR_EXISTS((short) 0x02, "The key exists in the cluster"),
    ERR_TOO_BIG((short) 0x03, "The document exceeds the maximum size"),
    ERR_INVALID((short) 0x04, "Invalid request"),
    ERR_NOT_STORED((short) 0x05, "The document was not stored"),
    ...

    private final short code;
    private final String description;

    KeyValueStatus(short code, String description) {
        this.code = code;
        this.description = description;
    }

    public static KeyValueStatus valueOf(final short code) {
        for (KeyValueStatus value: values()) {
            if (value.code() == code) return value;
        }
        return UNKNOWN;
    }
}
On the Couchbase blog...
Finding: values() is allocating memory

public static KeyValueStatus valueOf(final short code) {
    for (KeyValueStatus value: values()) {
        if (value.code() == code) return value;
    }
    return UNKNOWN;
}

Optimization: fast path on common values

public static KeyValueStatus valueOf(final short code) {
    if (code == SUCCESS.code) {
        return SUCCESS;
    } else if (code == ERR_NOT_FOUND.code) {
        return ERR_NOT_FOUND;
    } else if (code == ERR_EXISTS.code) {
        return ERR_EXISTS;
    } else if (code == ERR_NOT_MY_VBUCKET.code) {
        return ERR_NOT_MY_VBUCKET;
    }
    for (KeyValueStatus value : values()) {
        if (value.code() == code) {
            return value;
        }
    }
    return UNKNOWN;
}
If something goes wrong, it’ll make it worse!
Oh well, nobody cares...
blog post
Profiling memory usage
Various kinds of memory optimization
● Memory usage / memory leaks
  ○ My application needs tons of heap
  ○ How many objects are held active?
  → Memory profiler / jmap
● Garbage collection pressure
  ○ My application spends a lot of time in the GC
  ○ How often are objects allocated?
  → Java Mission Control / jmap
jmap histograms
jmap -histo
num #instances #bytes class name
----------------------------------------------
1: 4217124 674740720 [Lnet.bluxte.experiments.couchbase_keyvalue.KeyValueStatus;
2: 486 14947912 [I
3: 5855 493864 [C
4: 1461 166752 java.lang.Class
5: 5848 140352 java.lang.String
6: 503 136440 [B
7: 968 62480 [Ljava.lang.Object;
8: 1255 40160 java.util.HashMap$Node
9: 991 39640 java.util.LinkedHashMap$Entry
10: 258 30720 [Ljava.util.HashMap$Node;
11: 259 22792 java.lang.reflect.Method
12: 441 20952 [Ljava.lang.String;
13: 229 16488 java.lang.reflect.Field
14: 171 9576 java.util.LinkedHashMap
15: 291 9312 java.util.concurrent.ConcurrentHashMap$Node
16: 160 7680 java.util.HashMap
17: 178 7120 java.lang.ref.SoftReference
18: 89 7120 java.net.URI
19: 102 6528 java.net.URL
20: 256 6144 java.lang.Long
21: 76 6080 java.lang.reflect.Constructor
22: 258 5896 [Ljava.lang.Class;
23: 265 5680 [S
24: 166 5312 java.util.Hashtable$Entry
25: 94 5264 java.lang.Class$ReflectionData
code available on GitHub
jmap histograms
jmap -histo:live (performs a full GC first)
num #instances #bytes class name
----------------------------------------------
1: 5855 493864 [C
2: 1461 166752 java.lang.Class
3: 5848 140352 java.lang.String
4: 503 136440 [B
5: 967 62456 [Ljava.lang.Object;
6: 1255 40160 java.util.HashMap$Node
7: 991 39640 java.util.LinkedHashMap$Entry
8: 258 30720 [Ljava.util.HashMap$Node;
9: 259 22792 java.lang.reflect.Method
10: 441 20952 [Ljava.lang.String;
11: 283 19272 [I
12: 229 16488 java.lang.reflect.Field
....................
51: 35 1400 javax.management.MBeanOperationInfo
52: 3 1360 [Lnet.bluxte.experiments.couchbase_keyvalue.KeyValueStatus;
53: 55 1320 java.io.ExpiringCache$Entry
54: 29 1304 [Ljava.lang.reflect.Field;
55: 36 1152 net.bluxte.experiments.couchbase_keyvalue.KeyValueStatus
56: 27 1080 java.security.ProtectionDomain
Java Mission Control / Java Flight Recorder
Lightweight monitoring agent
● Integrated into the (Oracle) JVM
● Very low overhead
Continuously samples diagnostics data
● Thread activity
● GC activity
● Memory allocations
Java Mission Control / Java Flight Recorder
Available only with Oracle JDK
● Free for development
● Commercial for use in production
How to enable it?
● at launch time: java -XX:+UnlockCommercialFeatures
● after launch: jcmd <pid> VM.unlock_commercial_features
Original code
Simple loop on the enum values

public static KeyValueStatus valueOf(final short code) {
    for (KeyValueStatus value: values()) {
        if (value.code() == code) return value;
    }
    return UNKNOWN;
}
Original code - Memory stats
Looks good! No leak!
Hmm… growing fast!
Original code - Allocations
Original code - GC activity
Iteration on constant array
Still trivial, but reuse the values array

private static final KeyValueStatus[] VALUES = values();

public static KeyValueStatus valueOf(final short code) {
    for (KeyValueStatus value: VALUES) {
        if (value.code() == code) return value;
    }
    return UNKNOWN;
}
Constant array - Allocations
Constant array - GC activity
GC pressure collateral damage
Full GC clears weak references
→ clears some caches
→ additional load to repopulate them!
Enum.values()?
Exploring the bytecode
Enum.values() – a generated method
The compiler automatically adds some special methods when it creates an enum. For example, they have a static values method that returns an array containing all of the values of the enum in the order they are declared.
– The Java Tutorial
/**
 * Returns an array containing the constants of this enum
 * type, in the order they're declared. This method may be
 * used to iterate over the constants as follows:
 *
 *    for(E c : E.values())
 *        System.out.println(c);
 *
 * @return an array containing the constants of this enum
 * type, in the order they're declared
 */
public static E[] values();
– The Java Language Specification
Show me the (byte)code!

public class SimpleMain {
    public static void main(String[] args) {
        System.out.println("Hello world!");
    }
}

javap -c SimpleMain.class
or IntelliJ’s bytecode plugin

public class net.bluxte.experiments.talk.SimpleMain {
  public net.bluxte.experiments.talk.SimpleMain();
    Code:
       0: aload_0
       1: invokespecial #1    // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]);
    Code:
       0: getstatic     #2    // Field java/lang/System.out:Ljava/io/PrintStream;
       3: ldc           #3    // String Hello world!
       5: invokevirtual #4    // Method java/io/PrintStream.println:(Ljava/lang/String;)V
       8: return
}

Default constructor
Show me the (byte)code!

public class SimpleMain {
    static String hello = "Hello";
    static String world = "world";

    public static void main( String[] args ) {
        System.out.println( hello + " " + world );
    }
}

public class net.bluxte.experiments.talk.SimpleMain {
  static java.lang.String hello;

  static java.lang.String world;

  public net.bluxte.experiments.talk.SimpleMain();
    Code:
       0: aload_0
       1: invokespecial #1    // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]);
    Code:
       0: getstatic     #2    // Field java/lang/System.out:Ljava/io/PrintStream;
       3: new           #3    // class java/lang/StringBuilder
       6: dup
       7: invokespecial #4    // Method java/lang/StringBuilder."<init>":()V
      10: getstatic     #5    // Field hello:Ljava/lang/String;
      13: invokevirtual #6    // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      16: ldc           #7    // String " "
      18: invokevirtual #6    // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      21: getstatic     #8    // Field world:Ljava/lang/String;
      24: invokevirtual #6    // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      27: invokevirtual #9    // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      30: invokevirtual #10   // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      33: return

  static {};
    Code:
       0: ldc           #11   // String Hello
       2: putstatic     #5    // Field hello:Ljava/lang/String;
       5: ldc           #12   // String world
       7: putstatic     #8    // Field world:Ljava/lang/String;
      10: return
}

String concat with StringBuilder
Static initializer
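To make the StringBuilder callout concrete, here is a minimal sketch of the two forms side by side. The class name ConcatDemo and its method names are made up for illustration; the second method spells out by hand what the javac 8 bytecode above does. Compilers from JDK 9 onwards emit an invokedynamic call for string concatenation instead, so the bytecode will look different there.

```java
// Hypothetical demo class, not part of the talk's code.
class ConcatDemo {
    static String hello = "Hello";
    static String world = "world";

    // What the source says.
    static String withPlus() {
        return hello + " " + world;
    }

    // What the javac 8 bytecode does, written out by hand:
    // new StringBuilder, three appends, then toString.
    static String withBuilder() {
        return new StringBuilder()
                .append(hello)
                .append(" ")
                .append(world)
                .toString();
    }
}
```

Both methods return the same string; only the first one hides the temporary StringBuilder from you.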
Show me the (byte)code!

public enum SimpleEnum {
    FIRST_ENUM,
    SECOND_ENUM
}

...
  public static net.bluxte.experiments.talk.SimpleEnum[] values();
    Code:
       0: getstatic     #1    // Field $VALUES:[Lnet/bluxte/experiments/talk/SimpleEnum;
       3: invokevirtual #2    // Method "[Lnet/bluxte/experiments/talk/SimpleEnum;".clone:()Ljava/lang/Object;
       6: checkcast     #3    // class "[Lnet/bluxte/experiments/talk/SimpleEnum;"
       9: areturn

  public static net.bluxte.experiments.talk.SimpleEnum valueOf(java.lang.String);
    Code:
       0: ldc           #4    // class net/bluxte/experiments/talk/SimpleEnum
       2: aload_0
       3: invokestatic  #5    // Method java/lang/Enum.valueOf:(Ljava/lang/Class;Ljava/lang/String;)Ljava/lang/Enum;
       6: checkcast     #4    // class net/bluxte/experiments/talk/SimpleEnum
       9: areturn
...
Aha! We found the culprit!
But why the clone?
Java arrays are mutable
The caller can mess with it, which would break other users
→ Perform a defensive copy every time
How could it be prevented?
Return an immutable List, but probably too high level here
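A small sketch of both points, assuming a javac-compiled enum. The Color enum, the ALL field and the ValuesCloneDemo class are hypothetical names for illustration. It shows that every values() call hands out a distinct array, that tampering with one copy is invisible to later callers, and what a shared read-only List could look like as an alternative.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical enum, used only for this illustration.
enum Color {
    RED, GREEN, BLUE;

    // Possible alternative: one shared, read-only view built once.
    static final List<Color> ALL =
            Collections.unmodifiableList(Arrays.asList(values()));
}

class ValuesCloneDemo {
    // Two calls to values() return two distinct array objects.
    static boolean returnsFreshArray() {
        return Color.values() != Color.values();
    }

    // Overwriting a slot in one copy does not affect later callers.
    static Color firstAfterTampering() {
        Color[] copy = Color.values();
        copy[0] = Color.BLUE;
        return Color.values()[0];
    }
}
```

The defensive copy is what makes the tampering harmless, and it is also what allocates on every call in the hot loop.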
More on the bytecode
A class file is composed of:
● constant pool: strings, fields/methods name+type, class names, etc.
● fields and methods definitions and code
  ○ Access flags and attributes
  ○ Code
  ○ Line number table
  ○ Local variable table (type and name)
  ○ Exception table
But wait…
...why would I want to know about this?
● Better understand low level diagnostics
● Check generated code
  ○ Java: enum values (!), for loops, etc
  ○ Scala, Kotlin: implementation of higher level constructs
  ○ Hibernate & co: how do they mangle your code?
● Grasping low level stuff allows writing better high-level code
Constant pool for SimpleMain

   #1 = Methodref    #6.#20     // java/lang/Object."<init>":()V
   #2 = Fieldref     #21.#22    // java/lang/System.out:Ljava/io/PrintStream;
   #3 = String       #23        // Hello world
   #4 = Methodref    #24.#25    // java/io/PrintStream.println:(Ljava/lang/String;)V
   #5 = Class        #26        // net/bluxte/experiments/talk/SimpleMain
   #6 = Class        #27        // java/lang/Object
   #7 = Utf8         <init>
   #8 = Utf8         ()V
   #9 = Utf8         Code
  #10 = Utf8         LineNumberTable
  #11 = Utf8         LocalVariableTable
  #12 = Utf8         this
  #13 = Utf8         Lnet/bluxte/experiments/talk/SimpleMain;
  #14 = Utf8         main
  #15 = Utf8         ([Ljava/lang/String;)V
  #16 = Utf8         args
  #17 = Utf8         [Ljava/lang/String;
  #18 = Utf8         SourceFile
  #19 = Utf8         SimpleMain.java
  #20 = NameAndType  #7:#8      // "<init>":()V
  #21 = Class        #28        // java/lang/System
  #22 = NameAndType  #29:#30    // out:Ljava/io/PrintStream;
  #23 = Utf8         Hello world
  #24 = Class        #31        // java/io/PrintStream
  #25 = NameAndType  #32:#33    // println:(Ljava/lang/String;)V
  #26 = Utf8         net/bluxte/experiments/talk/SimpleMain
  #27 = Utf8         java/lang/Object
  #28 = Utf8         java/lang/System
  #29 = Utf8         out
  #30 = Utf8         Ljava/io/PrintStream;
  #31 = Utf8         java/io/PrintStream
  #32 = Utf8         println
  #33 = Utf8         (Ljava/lang/String;)V

public class net.bluxte.experiments.talk.SimpleMain {
  public net.bluxte.experiments.talk.SimpleMain();
    Code:
       0: aload_0
       1: invokespecial #1    // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]);
    Code:
       0: getstatic     #2    // Field java/lang/System.out:Ljava/io/PrintStream;
       3: ldc           #3    // String Hello world!
       5: invokevirtual #4    // Method java/io/PrintStream.println:(Ljava/lang/String;)V
       8: return
}
Type encoding
What is (Ljava/lang/String;)V ???
L<class path>; → class name
I, J, S, B, C → integer, long, short, byte, char
F, D → float, double
Z → boolean
[ → array (one per dimension)
V → void (return type only)
public String foo(int a, char[] b, List<Integer> c, boolean d)
(I[CLjava/util/List;Z)Ljava/lang/String;
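You don't have to encode descriptors by hand: the JDK can print them. A minimal sketch using java.lang.invoke.MethodType to rebuild the descriptor of the foo method above. The DescriptorDemo class and its method name are made up for illustration.

```java
import java.lang.invoke.MethodType;
import java.util.List;

// Hypothetical demo class, not part of the talk's code.
class DescriptorDemo {
    // Builds the JVM descriptor for
    //   String foo(int a, char[] b, List<Integer> c, boolean d)
    static String fooDescriptor() {
        MethodType type = MethodType.methodType(
                String.class,    // return type: Ljava/lang/String;
                int.class,       // I
                char[].class,    // [C
                List.class,      // generics are erased: plain Ljava/util/List;
                boolean.class);  // Z
        return type.toMethodDescriptorString();
    }
}
```

Note that List<Integer> can only be passed as List.class: the type parameter is erased and does not appear in the descriptor.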
The bytecode “language”
Stack-based machine
● Easier to target a large variety of CPUs (Android/Dalvik is register based)
Object-oriented assembler
● Method calls (static / virtual / interface / special)
Controlled memory access
● Local variables
● Object fields
The bytecode “language”
Very simple instruction set: about 200 instructions
Instruction groups:
● Load and store
● Arithmetic and logic
● Type conversion
● Object creation and manipulation
● Operand stack management
● Control transfer
● Method invocation and return
Only addition since 1996: invokedynamic in Java 7
The bytecode “language”

public static void main(String[] args) throws InterruptedException {
    long start = System.nanoTime();
    while(System.nanoTime() - start < MAX_NANOS) {
        for (int i = 0; i < 1_000_000; i++) {
            resolved = resolve((short)rnd.nextInt(0x100));
        }
        Thread.sleep(100);
    }
}

   0: invokestatic  #2    // Method java/lang/System.nanoTime:()J
   3: lstore_1
   4: invokestatic  #2    // Method java/lang/System.nanoTime:()J
   7: lload_1
   8: lsub
   9: getstatic     #3    // Field MAX_NANOS:J
  12: lcmp
  13: ifge          55
  16: iconst_0
  17: istore_3
  18: iload_3
  19: ldc           #4    // int 1000000
  21: if_icmpge     46
  24: getstatic     #5    // Field rnd:Ljava/util/Random;
  27: sipush        256
  30: invokevirtual #6    // Method java/util/Random.nextInt:(I)I
  33: i2s
  34: invokestatic  #7    // Method resolve:(S)Lnet/bluxte/experiments/couchbase_keyvalue/KeyValueStatus;
  37: putstatic     #8    // Field resolved:Lnet/bluxte/experiments/couchbase_keyvalue/KeyValueStatus;
  40: iinc          3, 1
  43: goto          18
  46: ldc2_w        #9    // long 100l
  49: invokestatic  #11   // Method java/lang/Thread.sleep:(J)V
  52: goto          4
  55: return

LocalVariableTable:
  Start  Length  Slot  Name   Signature
     18      28     3  i      I
      0      56     0  args   [Ljava/lang/String;
      4      52     1  start  J
Benchmarking with JMH
(Back to good old Java)
Improving our solution
We fixed the memory issue but it’s clearly non-optimal
Let’s benchmark it!

private static final KeyValueStatus[] VALUES = values();

public static KeyValueStatus valueOf(final short code) {
    for (KeyValueStatus value: VALUES) {
        if (value.code() == code) return value;
    }
    return UNKNOWN;
}

O(n) on constant data!
JMH: an OpenJDK project
“JMH is a Java harness for building, running, and analysing nano/micro/milli/macro benchmarks written in Java and other languages targetting the JVM.”
● Provides drivers and guidance for writing tests
● Takes care of pre-warming the JVM, collecting results and computing stats
● Provides a Maven artifact type for benchmarking projects
Benchmark code

@State(Scope.Benchmark)
public class ValueOfBenchmark {

    @Param({
        "0",    // 0x00, Success
        "1",    // 0x01, Not Found
        "134",  // 0x86 Temporary Failure
        "255",  // undefined
        "1024"  // undefined, out of bounds
    })
    public short code;

    @Benchmark
    public KeyValueStatus loopNoFastPath() {
        return KeyValueStatus.valueOfLoop(code);
    }

    @Benchmark
    public KeyValueStatus loopFastPath() {
        return KeyValueStatus.valueOf(code);
    }
    ...
}

mvn clean install
java -jar target/benchmarks.jar
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_121.jdk/Contents/Home/jre/bin/java
# VM options: <none>
# Warmup: 20 iterations, 1 s each
# Measurement: 20 iterations, 1 s each
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: net.bluxte.experiments.couchbase_keyvalue.ValueOfBenchmark.loopNoFastPath
# Parameters: (code = 0)

# Run progress: 0,00% complete, ETA 04:53:20
# Fork: 1 of 10
# Warmup Iteration   1: 152063982,769 ops/s
# Warmup Iteration   2: 149808416,787 ops/s
# Warmup Iteration   3: 210436722,740 ops/s
# Warmup Iteration   4: 202906403,960 ops/s
# Warmup Iteration   5: 204518647,481 ops/s
# Warmup Iteration   6: 209602101,373 ops/s
# Warmup Iteration   7: 204717066,594 ops/s
# Warmup Iteration   8: 209156212,425 ops/s
# Warmup Iteration   9: 215544157,049 ops/s
# Warmup Iteration  10: 213919676,979 ops/s
# Warmup Iteration  11: 211316588,650 ops/s
# Warmup Iteration  12: 212046920,091 ops/s
# Warmup Iteration  13: 212198820,202 ops/s
# Warmup Iteration  14: 207165911,202 ops/s
# Warmup Iteration  15: 209400520,248 ops/s
# Warmup Iteration  16: 210509892,206 ops/s
# Warmup Iteration  17: 207094517,640 ops/s
# Warmup Iteration  18: 208435049,739 ops/s
# Warmup Iteration  19: 208275287,735 ops/s
# Warmup Iteration  20: 209727353,731 ops/s
Iteration   1: 210860563,039 ops/s
Iteration   2: 213258677,632 ops/s
Iteration   3: 210171275,812 ops/s
Iteration   4: 212516810,343 ops/s
Benchmark-driven optimization
Initial implementation

public static KeyValueStatus valueOfLoop(final short code) {
    for (KeyValueStatus value: values()) {
        if (value.code() == code) return value;
    }
    return UNKNOWN;
}

Benchmark       (code)  Mode  Samples   Score  Score error  Units
loopNoFastPath       0  avgt       10  19.383        0.331  ns/op
loopNoFastPath       1  avgt       10  19.243        0.376  ns/op
loopNoFastPath     134  avgt       10  24.855        0.651  ns/op
loopNoFastPath     255  avgt       10  30.587        0.833  ns/op
loopNoFastPath    1024  avgt       10  30.619        1.209  ns/op

Time grows linearly with the value, even for out-of-bounds values
Benchmark-driven optimization
Reuse the constant array

private static final KeyValueStatus[] VALUES = values();

public static KeyValueStatus valueOf(short code) {
    for (KeyValueStatus value: VALUES) {
        if (value.code() == code) return value;
    }
    return UNKNOWN;
}

Benchmark            (code)  Mode  Samples   Score  Score error  Units
loopOnConstantArray       0  avgt       10   2.975        0.086  ns/op
loopOnConstantArray       1  avgt       10   3.035        0.080  ns/op
loopOnConstantArray     134  avgt       10  10.215        0.269  ns/op
loopOnConstantArray     255  avgt       10  16.856        0.679  ns/op
loopOnConstantArray    1024  avgt       10  17.015        0.577  ns/op

Still linear, removed ~15 ns allocation overhead
Benchmark-driven optimization
Prepare a hashmap, then simple lookup

private static final Map<Short, KeyValueStatus> code2statusMap = new HashMap<>();

static {
    for (KeyValueStatus value: values()) {
        code2statusMap.put(value.code(), value);
    }
}

public static KeyValueStatus valueOf(final short code) {
    return code2statusMap.getOrDefault(code, UNKNOWN);
}

Benchmark  (code)  Mode  Samples  Score  Score error  Units
lookupMap       0  avgt       10  4.954        0.134  ns/op
lookupMap       1  avgt       10  4.036        0.125  ns/op
lookupMap     134  avgt       10  5.597        0.157  ns/op
lookupMap     255  avgt       10  4.006        0.144  ns/op
lookupMap    1024  avgt       10  6.752        0.228  ns/op

More or less constant
Worse on small values
Way better on larger values
Oh wait… autoboxing!
public static KeyValueStatus valueOf(final short code) {
    return code2statusMap.getOrDefault(code, UNKNOWN);
}

public static net.bluxte.experiments.couchbase_keyvalue.KeyValueStatus valueOfLookupMap(short);
  Code:
     0: getstatic     #17   // Field code2statusMap:Ljava/util/HashMap;
     3: iload_0
     4: invokestatic  #18   // Method java/lang/Short.valueOf:(S)Ljava/lang/Short;
     7: getstatic     #15   // Field UNKNOWN:Lnet/bluxte/experiments/couchbase_keyvalue/KeyValueStatus;
    10: invokevirtual #19   // Method java/util/HashMap.getOrDefault:(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
    13: checkcast     #4    // class net/bluxte/experiments/couchbase_keyvalue/KeyValueStatus
    16: areturn
Oh wait… autoboxing!
Using Carrot HPPC (high performance primitive collections) avoids this
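To see what the Short.valueOf call in the bytecode means at source level, here is a minimal sketch that writes the boxing out by hand. The BoxingDemo class, its NAMES map and the string values are hypothetical stand-ins for code2statusMap and the enum constants.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of the talk's code.
class BoxingDemo {
    static final Map<Short, String> NAMES = new HashMap<>();

    static {
        NAMES.put((short) 0x00, "SUCCESS");
        NAMES.put((short) 0x01, "ERR_NOT_FOUND");
    }

    // What the source says: the short is boxed implicitly.
    static String implicitBoxing(short code) {
        return NAMES.getOrDefault(code, "UNKNOWN");
    }

    // What the compiler emits: an explicit Short.valueOf call per lookup.
    static String explicitBoxing(short code) {
        return NAMES.getOrDefault(Short.valueOf(code), "UNKNOWN");
    }
}
```

Short.valueOf is documented to cache the values from -128 to 127, so small codes reuse an existing Short, while other values may allocate a new object on each lookup. That would be consistent with code 1024 being the slowest row in the lookupMap results, though the benchmark alone doesn't prove it.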
Benchmarking variations
Prepare a lookup array, then simple lookup

private static final KeyValueStatus[] code2status = new KeyValueStatus[0x100];

static {
    Arrays.fill(code2status, UNKNOWN);
    for (KeyValueStatus keyValueStatus : values()) {
        if (keyValueStatus != UNKNOWN) {
            code2status[keyValueStatus.code()] = keyValueStatus;
        }
    }
}

public static KeyValueStatus valueOfLookupArray(short code) {
    if (code >= 0 && code < code2status.length) {
        return code2status[code];
    } else {
        return UNKNOWN;
    }
}

Benchmark    (code)  Mode  Samples  Score  Score error  Units
lookupArray       0  avgt       10  3.061        0.126  ns/op
lookupArray       1  avgt       10  3.048        0.127  ns/op
lookupArray     134  avgt       10  3.070        0.084  ns/op
lookupArray     255  avgt       10  3.035        0.113  ns/op
lookupArray    1024  avgt       10  3.034        0.113  ns/op

Constant fast time
No GC overhead
w00t!
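For those who want to play with it, a self-contained sketch of the lookup-array idea on a cut-down enum. The Reply enum, the ERR_TEMP_FAIL constant name and the fromCode method are hypothetical; only the 0x86 code value comes from the benchmark parameters above.

```java
import java.util.Arrays;

// Hypothetical stand-in for KeyValueStatus, reduced to four constants.
enum Reply {
    UNKNOWN((short) -1),
    SUCCESS((short) 0x00),
    ERR_NOT_FOUND((short) 0x01),
    ERR_TEMP_FAIL((short) 0x86);

    // One slot per possible code in [0, 0xFF]; unmapped slots hold UNKNOWN.
    private static final Reply[] BY_CODE = new Reply[0x100];

    static {
        Arrays.fill(BY_CODE, UNKNOWN);
        for (Reply reply : values()) {
            // UNKNOWN has a negative code and must not be used as an index.
            if (reply != UNKNOWN) BY_CODE[reply.code] = reply;
        }
    }

    private final short code;

    Reply(short code) {
        this.code = code;
    }

    // Bounds check, then a single array read: no loop, no allocation.
    static Reply fromCode(short code) {
        return (code >= 0 && code < BY_CODE.length) ? BY_CODE[code] : UNKNOWN;
    }
}
```

The bounds check matters: negative and too-large codes would otherwise throw ArrayIndexOutOfBoundsException instead of returning UNKNOWN.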
Dangers of JMH
Benchmark-driven iterations
● Can drive you to partial incremental improvements
● Take a step back, think outside of the box
Optimizing for the sake of optimizing
● Time consuming
● No real effect if not on “hot” code
Diving into OpenJDK
(This gets scary!)
The VM does a lot of things
C1 “client” compiler
C2 “server” compiler
Interpreter
Garbage collector
Finding your way in OpenJDK
Main website http://openjdk.java.net/
Get the code:
hg clone http://hg.openjdk.java.net/jdk8/jdk8/hotspot/
hg clone http://hg.openjdk.java.net/jdk8/jdk8/jdk/
Mercurial still alive!
garbage collectors
bytecode interpreter
server compiler (c2 / opto)
client compiler (c1)
LLVM-based JIT
OS and/or CPU specific code
root of shared code
CPU-independent target (works with shark JIT)
Additional support in JDK9:
● ARM 32 & 64 bits
● PowerPC
● S390
● AIX
Intrinsic methods
What you see is not what you get
● The JVM “intercepts” some method calls
  ○ String / StringBuffer methods, Math, Unsafe, array manipulation, etc.
● Replaced inline with native (assembly) code
  ○ Extremely fast and optimized
  ○ Not even JNI overhead
● Find them in hotspot/src/share/vm/classfile/vmSymbols.hpp
Intrinsic methods
Example: String.indexOf on x86

// IndexOf for constant substrings with size >= 8 chars
// which don't need to be loaded through stack.
void MacroAssembler::string_indexofC8(Register str1, Register str2,
                                      Register cnt1, Register cnt2,
                                      int int_cnt2, Register result,
                                      XMMRegister vec, Register tmp) {
  ShortBranchVerifier sbv(this);
  assert(UseSSE42Intrinsics, "SSE4.2 is required");

  // This method uses pcmpestri inxtruction with bound registers
  //   inputs:
  //     xmm - substring
  //     rax - substring length (elements count)
  //     mem - scanned string
  //     rdx - string length (elements count)
  //     0xd - mode: 1100 (substring search) + 01 (unsigned shorts)
  //   outputs:
  //     rcx - matched index in string
  assert(cnt1 == rdx && cnt2 == rax && tmp == rcx, "pcmpestri");

  Label RELOAD_SUBSTR, SCAN_TO_SUBSTR, SCAN_SUBSTR,
        RET_FOUND, RET_NOT_FOUND, EXIT, FOUND_SUBSTR,
        MATCH_SUBSTR_HEAD, RELOAD_STR, FOUND_CANDIDATE;

  // Note, inline_string_indexOf() generates checks:
  // if (substr.count > string.count) return -1;
  // if (substr.count == 0) return 0;
  assert(int_cnt2 >= 8, "this code is used only for cnt2 >= 8 chars");

  // Load substring.
  movdqu(vec, Address(str2, 0));
  movl(cnt2, int_cnt2);
  movptr(result, str1); // string addr
In JDK9 beta, String.indexOf(String) is faster than String.indexOf(char)!
This is because one is an intrinsic, and the other is not yet
Intrinsic methods
Benchmark                             Mode   Cnt       Score      Error  Units

# JDK 8u121
IndexOfBenchmark.StringIndexOfChar    thrpt    5  141857.332 ± 5530.472  ops/s
IndexOfBenchmark.StringIndexOfString  thrpt    5  113091.517 ± 2241.533  ops/s

# JDK 9b152
IndexOfBenchmark.StringIndexOfChar    thrpt    5  154525.343 ± 3796.818  ops/s
IndexOfBenchmark.StringIndexOfString  thrpt    5  185917.059 ± 3391.230  ops/s
(from the jdk9-dev mailing-list)
Intrinsic methods
● “I can do it better than JDK source” – think twice!
→ Have a look at vmSymbols.hpp first!
● Can sometimes be indirect (esp with strings and arrays)
● When in doubt, benchmark (with the same JVM)
Conclusion
● Know your tools
● Be curious, and follow the white rabbit from time to time, you’ll learn a lot
● However… don’t go overboard and waste (too much) time!
Thanks!
Questions?
Bonus links to dive deeper
Java MissionControl & FlightRecorder docs
What the JIT!? Anatomy of the OpenJDK HotSpot VM
Intrinsic Methods in HotSpot VM
Zero and Shark (LLVM JIT)