Inside the JVM - Follow the white rabbit!

Post on 21-Jan-2018


Inside the JVM
Follow the white rabbit!

Sylvain Wallez - @bluxte
Toulouse JUG - 2017-04-26

Who’s this guy?

Software engineer at Elastic (cloud team)

Previously:

● IoT tech lead at OVH
● CEO at Actoboard
● Backend architect at Sigfox
● CTO at Goojet/Scoop it
● Lead architect at Joost
● Member of the Apache Software Foundation
● Cofounder & CTO at Anyware Technologies (now part of Sierra Wireless)

Agenda

● How it started: let’s optimize 6 lines of (hot) code!
● Profiling memory usage
● What’s in a class file?
● Micro-benchmarking with JMH
● Exploration of the OpenJDK source code

How it started
Let’s optimize 6 lines of (hot) code

On the CouchBase blog...
“JVM Profiling - Lessons from the trenches”: optimize the conversion of a protocol error code into a readable message.

public enum KeyValueStatus {

    UNKNOWN((short) -1, "Unknown code"),
    SUCCESS((short) 0x00,
        "The operation completed successfully"),
    ERR_NOT_FOUND((short) 0x01,
        "The key does not exists"),
    ERR_EXISTS((short) 0x02,
        "The key exists in the cluster"),
    ERR_TOO_BIG((short) 0x03,
        "The document exceeds the maximum size"),
    ERR_INVALID((short) 0x04,
        "Invalid request"),
    ERR_NOT_STORED((short) 0x05,
        "The document was not stored"),
    ...

    private final short code;
    private final String description;

    KeyValueStatus(short code, String description) {
        this.code = code;
        this.description = description;
    }

    public static KeyValueStatus valueOf(final short code) {
        for (KeyValueStatus value: values()) {
            if (value.code() == code) return value;
        }
        return UNKNOWN;
    }
    ...

On the CouchBase blog...
Finding: values() is allocating memory

public static KeyValueStatus valueOf(final short code) {
    for (KeyValueStatus value: values()) {
        if (value.code() == code) return value;
    }
    return UNKNOWN;
}

public static KeyValueStatus valueOf(final short code) {
    if (code == SUCCESS.code) {
        return SUCCESS;
    } else if (code == ERR_NOT_FOUND.code) {
        return ERR_NOT_FOUND;
    } else if (code == ERR_EXISTS.code) {
        return ERR_EXISTS;
    } else if (code == ERR_NOT_MY_VBUCKET.code) {
        return ERR_NOT_MY_VBUCKET;
    }

    for (KeyValueStatus value : values()) {
        if (value.code() == code) {
            return value;
        }
    }
    return UNKNOWN;
}

Optimization: fast path on common values

If something goes wrong, it’ll make it worse!

Profiling memory usage

Various kinds of memory optimization

● Memory usage / memory leaks
  ○ My application needs tons of heap
  ○ How many objects are held active?
  → Memory profiler / jmap

● Garbage collection pressure
  ○ My application spends a lot of time in the GC
  ○ How often are objects allocated?
  → Java Mission Control / jmap

jmap histograms

jmap -histo

 num     #instances         #bytes  class name
----------------------------------------------
   1:       4217124      674740720  [Lnet.bluxte.experiments.couchbase_keyvalue.KeyValueStatus;
   2:           486       14947912  [I
   3:          5855         493864  [C
   4:          1461         166752  java.lang.Class
   5:          5848         140352  java.lang.String
   6:           503         136440  [B
   7:           968          62480  [Ljava.lang.Object;
   8:          1255          40160  java.util.HashMap$Node
   9:           991          39640  java.util.LinkedHashMap$Entry
  10:           258          30720  [Ljava.util.HashMap$Node;
  11:           259          22792  java.lang.reflect.Method
  12:           441          20952  [Ljava.lang.String;
  13:           229          16488  java.lang.reflect.Field
  14:           171           9576  java.util.LinkedHashMap
  15:           291           9312  java.util.concurrent.ConcurrentHashMap$Node
  16:           160           7680  java.util.HashMap
  17:           178           7120  java.lang.ref.SoftReference
  18:            89           7120  java.net.URI
  19:           102           6528  java.net.URL
  20:           256           6144  java.lang.Long
  21:            76           6080  java.lang.reflect.Constructor
  22:           258           5896  [Ljava.lang.Class;
  23:           265           5680  [S
  24:           166           5312  java.util.Hashtable$Entry
  25:            94           5264  java.lang.Class$ReflectionData

code available on GitHub

jmap histograms

jmap -histo:live – perform a full GC first

 num     #instances         #bytes  class name
----------------------------------------------
   1:          5855         493864  [C
   2:          1461         166752  java.lang.Class
   3:          5848         140352  java.lang.String
   4:           503         136440  [B
   5:           967          62456  [Ljava.lang.Object;
   6:          1255          40160  java.util.HashMap$Node
   7:           991          39640  java.util.LinkedHashMap$Entry
   8:           258          30720  [Ljava.util.HashMap$Node;
   9:           259          22792  java.lang.reflect.Method
  10:           441          20952  [Ljava.lang.String;
  11:           283          19272  [I
  12:           229          16488  java.lang.reflect.Field
....................
  51:            35           1400  javax.management.MBeanOperationInfo
  52:             3           1360  [Lnet.bluxte.experiments.couchbase_keyvalue.KeyValueStatus;
  53:            55           1320  java.io.ExpiringCache$Entry
  54:            29           1304  [Ljava.lang.reflect.Field;
  55:            36           1152  net.bluxte.experiments.couchbase_keyvalue.KeyValueStatus
  56:            27           1080  java.security.ProtectionDomain

Java Mission Control / Java Flight Recorder

Lightweight monitoring agent

● Integrated into the (Oracle) JVM
● Very low overhead

Continuously samples diagnostics data

● Thread activity
● GC activity
● Memory allocations

Java Mission Control / Java Flight Recorder

Available only with Oracle JDK

● Free for development
● Commercial for use in production

How to enable it?

● at launch time: java -XX:+UnlockCommercialFeatures
● after launch: jcmd <pid> VM.unlock_commercial_features

Original code
Simple loop on the enum values

public static KeyValueStatus valueOf(final short code) {
    for (KeyValueStatus value: values()) {
        if (value.code() == code) return value;
    }
    return UNKNOWN;
}

Original code - Memory stats

Looks good! No leak!

Hmm… growing fast!

Original code - Allocations

Original code - GC activity

Iteration on constant array
Still trivial, but reuse the values array

private static final KeyValueStatus[] VALUES = values();

public static KeyValueStatus valueOf(final short code) {
    for (KeyValueStatus value: VALUES) {
        if (value.code() == code) return value;
    }
    return UNKNOWN;
}

Constant array - Allocations

Constant array - GC activity

Collateral damage of GC pressure

Full GC clears weak references
→ clears some caches
→ additional load to repopulate them!

Enum.values()?
Exploring the bytecode

Enum.values() – a generated method

The compiler automatically adds some special methods when it creates an enum. For example, they have a static values method that returns an array containing all of the values of the enum in the order they are declared.

– The Java Tutorial

/**
 * Returns an array containing the constants of this enum
 * type, in the order they're declared. This method may be
 * used to iterate over the constants as follows:
 *
 *    for(E c : E.values())
 *        System.out.println(c);
 *
 * @return an array containing the constants of this enum
 * type, in the order they're declared
 */
public static E[] values();

– The Java Language Specification

Show me the (byte)code!

public class SimpleMain {
    public static void main(String[] args) {
        System.out.println("Hello world!");
    }
}

public class net.bluxte.experiments.talk.SimpleMain {
  public net.bluxte.experiments.talk.SimpleMain();
    Code:
       0: aload_0
       1: invokespecial #1   // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]);
    Code:
       0: getstatic     #2   // Field java/lang/System.out:Ljava/io/PrintStream;
       3: ldc           #3   // String Hello world!
       5: invokevirtual #4   // Method java/io/PrintStream.println:(Ljava/lang/String;)V
       8: return
}

Default constructor

javap -c SimpleMain.class

or IntelliJ’s bytecode plugin

Show me the (byte)code!

public class SimpleMain {
    static String hello = "Hello";
    static String world = "world";

    public static void main( String[] args ) {
        System.out.println( hello + " " + world );
    }
}

public class net.bluxte.experiments.talk.SimpleMain {
  static java.lang.String hello;

  static java.lang.String world;

  public net.bluxte.experiments.talk.SimpleMain();
    Code:
       0: aload_0
       1: invokespecial #1   // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]);
    Code:
       0: getstatic     #2   // Field java/lang/System.out:Ljava/io/PrintStream;
       3: new           #3   // class java/lang/StringBuilder
       6: dup
       7: invokespecial #4   // Method java/lang/StringBuilder."<init>":()V
      10: getstatic     #5   // Field hello:Ljava/lang/String;
      13: invokevirtual #6   // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      16: ldc           #7   // String " "
      18: invokevirtual #6   // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      21: getstatic     #8   // Field world:Ljava/lang/String;
      24: invokevirtual #6   // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      27: invokevirtual #9   // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      30: invokevirtual #10  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      33: return

  static {};
    Code:
       0: ldc           #11  // String Hello
       2: putstatic     #5   // Field hello:Ljava/lang/String;
       5: ldc           #12  // String world
       7: putstatic     #8   // Field world:Ljava/lang/String;
      10: return
}

String concat with StringBuilder

Static initializer

Show me the (byte)code!

public enum SimpleEnum {
    FIRST_ENUM,
    SECOND_ENUM
}

...
  public static net.bluxte.experiments.talk.SimpleEnum[] values();
    Code:
       0: getstatic     #1   // Field $VALUES:[Lnet/bluxte/experiments/talk/SimpleEnum;
       3: invokevirtual #2   // Method "[Lnet/bluxte/experiments/talk/SimpleEnum;".clone:()Ljava/lang/Object;
       6: checkcast     #3   // class "[Lnet/bluxte/experiments/talk/SimpleEnum;"
       9: areturn

  public static net.bluxte.experiments.talk.SimpleEnum valueOf(java.lang.String);
    Code:
       0: ldc           #4   // class net/bluxte/experiments/talk/SimpleEnum
       2: aload_0
       3: invokestatic  #5   // Method java/lang/Enum.valueOf:(Ljava/lang/Class;Ljava/lang/String;)Ljava/lang/Enum;
       6: checkcast     #4   // class net/bluxte/experiments/talk/SimpleEnum
       9: areturn
...

Aha! We found the culprit!

But why the clone?

Java arrays are mutable

The caller can mess with it, which would break other users
→ Perform a defensive copy every time
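A minimal sketch of what the defensive copy buys, using a hypothetical two-constant enum `Color` that is not part of the talk: each call to values() hands out a distinct array, so a caller that overwrites an element only damages its own copy.

```java
import java.util.Arrays;

class ValuesCloneDemo {

    // Hypothetical enum, used only for this illustration.
    enum Color { RED, GREEN }

    // True when two successive calls to values() return distinct array objects.
    static boolean returnsFreshArray() {
        return Color.values() != Color.values();
    }

    // Overwrites an element of one returned array, then asks for the values again.
    static String valuesAfterTampering() {
        Color[] mine = Color.values();
        mine[0] = null;                          // only affects the caller's private copy
        return Arrays.toString(Color.values());  // a new clone of the untouched internal array
    }

    public static void main(String[] args) {
        System.out.println(returnsFreshArray());     // prints "true"
        System.out.println(valuesAfterTampering());  // prints "[RED, GREEN]"
    }
}
```

The price of that safety is one array allocation per call, which is exactly what shows up in the jmap histogram.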

How could it be prevented?

Return an immutable List, but probably too high level here
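A sketch of that alternative in application code, assuming a hypothetical enum `Level` with a hand-written `ALL` constant (the compiler generates nothing like it): the list is built once, shared by all callers, and rejects writes instead of needing a copy.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ImmutableValuesDemo {

    // Hypothetical enum; ALL is an assumption of this sketch, not a generated member.
    enum Level {
        LOW, HIGH;

        // Built once at class initialization: values() is cloned a single time.
        static final List<Level> ALL =
                Collections.unmodifiableList(Arrays.asList(values()));
    }

    // Returns true if an attempt to modify the shared list is rejected.
    static boolean writeIsRejected() {
        try {
            Level.ALL.set(0, Level.HIGH);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Level.ALL);          // prints "[LOW, HIGH]"
        System.out.println(writeIsRejected());  // prints "true"
    }
}
```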

More on the bytecode

A class file is composed of:

● constant pool: strings, fields/methods name+type, class names, etc.
● fields and methods definitions and code
  ○ Access flags and attributes
  ○ Code
  ○ Line number table
  ○ Local variable table (type and name)
  ○ Exception table
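To make the layout a bit more concrete, here is a small sketch that reads the first fields of a class file: the magic number, the version, and the constant pool count that precedes the pool itself. The class and method names are made up for this example, and it assumes the JDK exposes `String.class` as a readable resource.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassFileHeader {

    // Returns { magic, minor version, major version, constant pool count }.
    static int[] readHeader(InputStream raw) {
        try (DataInputStream in = new DataInputStream(raw)) {
            int magic = in.readInt();                  // 0xCAFEBABE for every class file
            int minor = in.readUnsignedShort();
            int major = in.readUnsignedShort();        // 52 = Java 8, 53 = Java 9, ...
            int constantPoolCount = in.readUnsignedShort();
            return new int[] { magic, minor, major, constantPoolCount };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the class file of java.lang.String can be read as a resource.
        InputStream in = String.class.getResourceAsStream("String.class");
        if (in == null) {
            System.out.println("String.class is not readable on this runtime");
            return;
        }
        int[] h = readHeader(in);
        System.out.printf("magic=%08X major=%d constants=%d%n", h[0], h[2], h[3]);
    }
}
```

For everything after the header, `javap -v` remains the practical tool.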

But wait…
...why would I want to know about this?

● Better understand low level diagnostics
● Check generated code
  ○ Java: enum values (!), for loops, etc
  ○ Scala, Kotlin: implementation of higher level constructs
  ○ Hibernate & co: how do they mangle your code?
● Grasping low level stuff allows writing better high-level code

Constant pool for SimpleMain

   #1 = Methodref          #6.#20         // java/lang/Object."<init>":()V
   #2 = Fieldref           #21.#22        // java/lang/System.out:Ljava/io/PrintStream;
   #3 = String             #23            // Hello world!
   #4 = Methodref          #24.#25        // java/io/PrintStream.println:(Ljava/lang/String;)V
   #5 = Class              #26            // net/bluxte/experiments/talk/SimpleMain
   #6 = Class              #27            // java/lang/Object
   #7 = Utf8               <init>
   #8 = Utf8               ()V
   #9 = Utf8               Code
  #10 = Utf8               LineNumberTable
  #11 = Utf8               LocalVariableTable
  #12 = Utf8               this
  #13 = Utf8               Lnet/bluxte/experiments/talk/SimpleMain;
  #14 = Utf8               main
  #15 = Utf8               ([Ljava/lang/String;)V
  #16 = Utf8               args
  #17 = Utf8               [Ljava/lang/String;
  #18 = Utf8               SourceFile
  #19 = Utf8               SimpleMain.java
  #20 = NameAndType        #7:#8          // "<init>":()V
  #21 = Class              #28            // java/lang/System
  #22 = NameAndType        #29:#30        // out:Ljava/io/PrintStream;
  #23 = Utf8               Hello world!
  #24 = Class              #31            // java/io/PrintStream
  #25 = NameAndType        #32:#33        // println:(Ljava/lang/String;)V
  #26 = Utf8               net/bluxte/experiments/talk/SimpleMain
  #27 = Utf8               java/lang/Object
  #28 = Utf8               java/lang/System
  #29 = Utf8               out
  #30 = Utf8               Ljava/io/PrintStream;
  #31 = Utf8               java/io/PrintStream
  #32 = Utf8               println
  #33 = Utf8               (Ljava/lang/String;)V

public class net.bluxte.experiments.talk.SimpleMain {
  public net.bluxte.experiments.talk.SimpleMain();
    Code:
       0: aload_0
       1: invokespecial #1   // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]);
    Code:
       0: getstatic     #2   // Field java/lang/System.out:Ljava/io/PrintStream;
       3: ldc           #3   // String Hello world!
       5: invokevirtual #4   // Method java/io/PrintStream.println:(Ljava/lang/String;)V
       8: return
}

Type encoding

What is (Ljava/lang/String;)V ???

L<class path>;  → class name
I, J, S, B, C   → int, long, short, byte, char
F, D            → float, double
Z               → boolean
V               → void (return type only)
[<type>         → array of <type>

(<parameter types>)<return type> → method signature

public String foo(int a, char[] b, List<Integer> c, boolean d)

(I[CLjava/util/List;Z)Ljava/lang/String;
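One way to check an encoding without compiling anything is to let the JDK build the descriptor. This sketch uses `java.lang.invoke.MethodType`; the class name `DescriptorDemo` is made up for the example. Note that the generic parameter of `List<Integer>` is erased, as in the descriptor of foo.

```java
import java.lang.invoke.MethodType;
import java.util.List;

class DescriptorDemo {

    // Descriptor of: String foo(int a, char[] b, List<Integer> c, boolean d)
    static String fooDescriptor() {
        return MethodType
                .methodType(String.class, int.class, char[].class, List.class, boolean.class)
                .toMethodDescriptorString();
    }

    // Descriptor of: void println(String)
    static String printlnDescriptor() {
        return MethodType.methodType(void.class, String.class).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(fooDescriptor());      // prints "(I[CLjava/util/List;Z)Ljava/lang/String;"
        System.out.println(printlnDescriptor());  // prints "(Ljava/lang/String;)V"
    }
}
```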

The bytecode “language”

Stack-based machine

● Easier to target a large variety of CPUs (Android/Dalvik is register based)

Object-oriented assembler

● Method calls (static / virtual / interface / special)

Controlled memory access

● Local variables
● Object fields
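To illustrate what "stack-based" means, here is a toy interpreter: operands are pushed on a stack, and each arithmetic instruction pops its inputs and pushes its result. The instruction names only imitate JVM mnemonics (`push` is invented); this is a sketch, not real bytecode and not how HotSpot works.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ToyStackMachine {

    // Executes a tiny made-up instruction list and returns the value left on top of the stack.
    static int run(String... program) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String insn : program) {
            String[] parts = insn.split(" ");
            switch (parts[0]) {
                case "push":                     // push a constant on the operand stack
                    stack.push(Integer.parseInt(parts[1]));
                    break;
                case "iadd": {                   // pop two operands, push their sum
                    int b = stack.pop(), a = stack.pop();
                    stack.push(a + b);
                    break;
                }
                case "isub": {                   // pop two operands, push first minus second
                    int b = stack.pop(), a = stack.pop();
                    stack.push(a - b);
                    break;
                }
                case "imul": {                   // pop two operands, push their product
                    int b = stack.pop(), a = stack.pop();
                    stack.push(a * b);
                    break;
                }
                default:
                    throw new IllegalArgumentException("unknown instruction: " + insn);
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // (7 - 2) * 3
        System.out.println(run("push 7", "push 2", "isub", "push 3", "imul"));  // prints "15"
    }
}
```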

The bytecode “language”

Very simple instruction set: about 200 instructions

Instruction groups:

● Load and store
● Arithmetic and logic
● Type conversion
● Object creation and manipulation
● Operand stack management
● Control transfer
● Method invocation and return

Only addition since 1996: invokedynamic in Java 7

The bytecode “language”

public static void main(String[] args) {
    long start = System.nanoTime();

    while(System.nanoTime() - start < MAX_NANOS) {
        for (int i = 0; i < 1_000_000; i++) {
            resolved = resolve((short)rnd.nextInt(0x100));
        }
        Thread.sleep(100);
    }
}

 0: invokestatic  #2   // Method java/lang/System.nanoTime:()J
 3: lstore_1
 4: invokestatic  #2   // Method java/lang/System.nanoTime:()J
 7: lload_1
 8: lsub
 9: getstatic     #3   // Field MAX_NANOS:J
12: lcmp
13: ifge          55
16: iconst_0
17: istore_3
18: iload_3
19: ldc           #4   // int 1000000
21: if_icmpge     46
24: getstatic     #5   // Field rnd:Ljava/util/Random;
27: sipush        256
30: invokevirtual #6   // Method java/util/Random.nextInt:(I)I
33: i2s
34: invokestatic  #7   // Method resolve:(S)Lnet/bluxte/experiments/couchbase_keyvalue/KeyValueStatus;
37: putstatic     #8   // Field resolved:Lnet/bluxte/experiments/couchbase_keyvalue/KeyValueStatus;
40: iinc          3, 1
43: goto          18
46: ldc2_w        #9   // long 100l
49: invokestatic  #11  // Method java/lang/Thread.sleep:(J)V
52: goto          4
55: return

LocalVariableTable:
  Start  Length  Slot  Name   Signature
     18      28     3  i      I
      0      56     0  args   [Ljava/lang/String;
      4      52     1  start  J

Benchmarking with JMH
(Back to good old Java)

Improving our solution

We fixed the memory issue but it’s clearly not optimal

Let’s benchmark it!

private static final KeyValueStatus[] VALUES = values();

public static KeyValueStatus valueOf(final short code) {
    for (KeyValueStatus value: VALUES) {
        if (value.code() == code) return value;
    }
    return UNKNOWN;
}

O(n) on constant data!

JMH: an OpenJDK project

● Provides drivers and guidance for writing tests
● Takes care of pre-warming the JVM, collecting results and computing stats
● Provides a Maven artifact type for benchmarking projects

“JMH is a Java harness for building, running, and analysing nano/micro/milli/macro benchmarks written in Java and other languages targetting the JVM.”

Benchmark code

@State(Scope.Benchmark)
public class ValueOfBenchmark {

    @Param({
        "0",    // 0x00, Success
        "1",    // 0x01, Not Found
        "134",  // 0x86 Temporary Failure
        "255",  // undefined
        "1024"  // undefined, out of bounds
    })
    public short code;

    @Benchmark
    public KeyValueStatus loopNoFastPath() {
        return KeyValueStatus.valueOfLoop(code);
    }

    @Benchmark
    public KeyValueStatus loopFastPath() {
        return KeyValueStatus.valueOf(code);
    }
    ...
}

mvn clean install
java -jar target/benchmarks.jar

# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_121.jdk/Contents/Home/jre/bin/java
# VM options: <none>
# Warmup: 20 iterations, 1 s each
# Measurement: 20 iterations, 1 s each
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: net.bluxte.experiments.couchbase_keyvalue.ValueOfBenchmark.loopNoFastPath
# Parameters: (code = 0)

# Run progress: 0,00% complete, ETA 04:53:20
# Fork: 1 of 10
# Warmup Iteration   1: 152063982,769 ops/s
# Warmup Iteration   2: 149808416,787 ops/s
# Warmup Iteration   3: 210436722,740 ops/s
# Warmup Iteration   4: 202906403,960 ops/s
# Warmup Iteration   5: 204518647,481 ops/s
# Warmup Iteration   6: 209602101,373 ops/s
# Warmup Iteration   7: 204717066,594 ops/s
# Warmup Iteration   8: 209156212,425 ops/s
# Warmup Iteration   9: 215544157,049 ops/s
# Warmup Iteration  10: 213919676,979 ops/s
# Warmup Iteration  11: 211316588,650 ops/s
# Warmup Iteration  12: 212046920,091 ops/s
# Warmup Iteration  13: 212198820,202 ops/s
# Warmup Iteration  14: 207165911,202 ops/s
# Warmup Iteration  15: 209400520,248 ops/s
# Warmup Iteration  16: 210509892,206 ops/s
# Warmup Iteration  17: 207094517,640 ops/s
# Warmup Iteration  18: 208435049,739 ops/s
# Warmup Iteration  19: 208275287,735 ops/s
# Warmup Iteration  20: 209727353,731 ops/s
Iteration   1: 210860563,039 ops/s
Iteration   2: 213258677,632 ops/s
Iteration   3: 210171275,812 ops/s
Iteration   4: 212516810,343 ops/s

Benchmark-driven optimization
Initial implementation

public static KeyValueStatus valueOfLoop(final short code) {
    for (KeyValueStatus value: values()) {
        if (value.code() == code) return value;
    }
    return UNKNOWN;
}

Benchmark       (code)  Mode  Samples   Score  Score error  Units
loopNoFastPath       0  avgt       10  19.383        0.331  ns/op
loopNoFastPath       1  avgt       10  19.243        0.376  ns/op
loopNoFastPath     134  avgt       10  24.855        0.651  ns/op
loopNoFastPath     255  avgt       10  30.587        0.833  ns/op
loopNoFastPath    1024  avgt       10  30.619        1.209  ns/op

Time grows linearly with the value, even for out-of-bounds values

Benchmark-driven optimization
Reuse the constant array

private static final KeyValueStatus[] VALUES = values();

public static KeyValueStatus valueOf(short code) {
    for (KeyValueStatus value: VALUES) {
        if (value.code() == code) return value;
    }
    return UNKNOWN;
}

Benchmark            (code)  Mode  Samples   Score  Score error  Units
loopOnConstantArray       0  avgt       10   2.975        0.086  ns/op
loopOnConstantArray       1  avgt       10   3.035        0.080  ns/op
loopOnConstantArray     134  avgt       10  10.215        0.269  ns/op
loopOnConstantArray     255  avgt       10  16.856        0.679  ns/op
loopOnConstantArray    1024  avgt       10  17.015        0.577  ns/op

Still linear, removed ~15 ns allocation overhead

Benchmark-driven optimization
Prepare a hashmap, then simple lookup

private static final Map<Short, KeyValueStatus> code2statusMap = new HashMap<>();

static {
    for (KeyValueStatus value: values()) {
        code2statusMap.put(value.code(), value);
    }
}

public static KeyValueStatus valueOf(final short code) {
    return code2statusMap.getOrDefault(code, UNKNOWN);
}

Benchmark  (code)  Mode  Samples  Score  Score error  Units
lookupMap       0  avgt       10  4.954        0.134  ns/op
lookupMap       1  avgt       10  4.036        0.125  ns/op
lookupMap     134  avgt       10  5.597        0.157  ns/op
lookupMap     255  avgt       10  4.006        0.144  ns/op
lookupMap    1024  avgt       10  6.752        0.228  ns/op

More or less constant
Worse on small values
Way better on larger values

Oh wait… autoboxing!

public static KeyValueStatus valueOf(final short code) {
    return code2statusMap.getOrDefault(code, UNKNOWN);
}

public static net.bluxte.experiments.couchbase_keyvalue.KeyValueStatus valueOfLookupMap(short);
  Code:
     0: getstatic     #17  // Field code2statusMap:Ljava/util/HashMap;
     3: iload_0
     4: invokestatic  #18  // Method java/lang/Short.valueOf:(S)Ljava/lang/Short;
     7: getstatic     #15  // Field UNKNOWN:Lnet/bluxte/experiments/couchbase_keyvalue/KeyValueStatus;
    10: invokevirtual #19  // Method java/util/HashMap.getOrDefault:(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
    13: checkcast     #4   // class net/bluxte/experiments/couchbase_keyvalue/KeyValueStatus
    16: areturn

Using Carrot HPPC (high performance primitive collections) avoids this

Benchmarking variations
Prepare a lookup array, then simple lookup

private static final KeyValueStatus[] code2status = new KeyValueStatus[0x100];

static {
    Arrays.fill(code2status, UNKNOWN);
    for (KeyValueStatus keyValueStatus : values()) {
        if (keyValueStatus != UNKNOWN) {
            code2status[keyValueStatus.code()] = keyValueStatus;
        }
    }
}

public static KeyValueStatus valueOfLookupArray(short code) {
    if (code >= 0 && code < code2status.length) {
        return code2status[code];
    } else {
        return UNKNOWN;
    }
}

Benchmark    (code)  Mode  Samples  Score  Score error  Units
lookupArray       0  avgt       10  3.061        0.126  ns/op
lookupArray       1  avgt       10  3.048        0.127  ns/op
lookupArray     134  avgt       10  3.070        0.084  ns/op
lookupArray     255  avgt       10  3.035        0.113  ns/op
lookupArray    1024  avgt       10  3.034        0.113  ns/op

Constant fast time
No GC overhead
w00t!
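When swapping one implementation for a faster one, it is cheap to check that both agree on every possible input, since a short only has 65536 values. The sketch below does that for the loop and the lookup array, on a hypothetical, much smaller stand-in for KeyValueStatus (the enum `Status` and its constants are assumptions of this example).

```java
import java.util.Arrays;

class LookupArrayCheck {

    // Hypothetical, much smaller stand-in for KeyValueStatus.
    enum Status {
        UNKNOWN((short) -1),
        SUCCESS((short) 0x00),
        ERR_NOT_FOUND((short) 0x01),
        ERR_TEMP_FAIL((short) 0x86);

        private final short code;

        Status(short code) { this.code = code; }

        private static final Status[] VALUES = values();
        private static final Status[] BY_CODE = new Status[0x100];

        static {
            Arrays.fill(BY_CODE, UNKNOWN);
            for (Status s : VALUES) {
                if (s != UNKNOWN) BY_CODE[s.code] = s;
            }
        }

        // Reference implementation: linear scan of the constant array.
        static Status valueOfLoop(short code) {
            for (Status s : VALUES) {
                if (s.code == code) return s;
            }
            return UNKNOWN;
        }

        // Optimized implementation: bounds check, then direct index.
        static Status valueOfLookupArray(short code) {
            return (code >= 0 && code < BY_CODE.length) ? BY_CODE[code] : UNKNOWN;
        }
    }

    // Compares both implementations over the whole range of short.
    static boolean variantsAgree() {
        for (int c = Short.MIN_VALUE; c <= Short.MAX_VALUE; c++) {
            short code = (short) c;
            if (Status.valueOfLoop(code) != Status.valueOfLookupArray(code)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(variantsAgree());  // prints "true"
    }
}
```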

Dangers of JMH

Benchmark-driven iterations

● Can drive you to partial incremental improvements
● Take a step back, think outside of the box

Optimizing for the sake of optimizing

● Time consuming
● No real effect if not on “hot” code

Diving into OpenJDK
(This gets scary!)

The VM does a lot of things

● Interpreter
● C1 “client” compiler
● C2 “server” compiler
● Garbage collector

Finding your way in OpenJDK

Main website http://openjdk.java.net/

Get the code:

hg clone http://hg.openjdk.java.net/jdk8/jdk8/hotspot/

hg clone http://hg.openjdk.java.net/jdk8/jdk8/jdk/

Mercurial still alive!

Callouts on the HotSpot source tree:

● garbage collectors
● bytecode interpreter
● server compiler (c2 / opto)
● client compiler (c1)
● LLVM-based JIT
● OS and/or CPU specific code
● root of shared code
● CPU-independent target (works with shark JIT)

Additional support in JDK9:

● ARM 32 & 64 bits
● PowerPC
● S390
● AIX

Intrinsic methods

What you see is not what you get

● The JVM “intercepts” some method calls
  ○ String / StringBuffer methods, Math, Unsafe, array manipulation, etc.
● Replaced inline with native (assembly) code
  ○ Extremely fast and optimized
  ○ Not even JNI overhead
● Find them in hotspot/src/share/vm/classfile/vmSymbols.hpp

Intrinsic methods

// IndexOf for constant substrings with size >= 8 chars
// which don't need to be loaded through stack.
void MacroAssembler::string_indexofC8(Register str1, Register str2,
                                      Register cnt1, Register cnt2,
                                      int int_cnt2, Register result,
                                      XMMRegister vec, Register tmp) {
  ShortBranchVerifier sbv(this);
  assert(UseSSE42Intrinsics, "SSE4.2 is required");

  // This method uses pcmpestri inxtruction with bound registers
  //   inputs:
  //     xmm - substring
  //     rax - substring length (elements count)
  //     mem - scanned string
  //     rdx - string length (elements count)
  //     0xd - mode: 1100 (substring search) + 01 (unsigned shorts)
  //   outputs:
  //     rcx - matched index in string
  assert(cnt1 == rdx && cnt2 == rax && tmp == rcx, "pcmpestri");

  Label RELOAD_SUBSTR, SCAN_TO_SUBSTR, SCAN_SUBSTR,
        RET_FOUND, RET_NOT_FOUND, EXIT, FOUND_SUBSTR,
        MATCH_SUBSTR_HEAD, RELOAD_STR, FOUND_CANDIDATE;

  // Note, inline_string_indexOf() generates checks:
  // if (substr.count > string.count) return -1;
  // if (substr.count == 0) return 0;
  assert(int_cnt2 >= 8, "this code isused only for cnt2 >= 8 chars");

  // Load substring.
  movdqu(vec, Address(str2, 0));
  movl(cnt2, int_cnt2);
  movptr(result, str1); // string addr

Example: String.indexOf on x86

Intrinsic methods

In JDK9 beta, String.indexOf(String) is faster than String.indexOf(char)!

This is because the former is an intrinsic, and the latter is not yet

Benchmark                             Mode  Cnt       Score      Error  Units

# JDK 8u121
IndexOfBenchmark.StringIndexOfChar   thrpt    5  141857.332 ± 5530.472  ops/s
IndexOfBenchmark.StringIndexOfString thrpt    5  113091.517 ± 2241.533  ops/s

# JDK 9b152
IndexOfBenchmark.StringIndexOfChar   thrpt    5  154525.343 ± 3796.818  ops/s
IndexOfBenchmark.StringIndexOfString thrpt    5  185917.059 ± 3391.230  ops/s

(from the jdk9-dev mailing-list)

Intrinsic methods

● “I can do it better than JDK source” – think twice!
  → Have a look at vmSymbols.hpp first!
● Can sometimes be indirect (esp with strings and arrays)
● When in doubt, benchmark (with the same JVM)

Conclusion

● Know your tools
● Be curious, and follow the white rabbit from time to time, you’ll learn a lot
● However… don’t go overboard and waste (too much) time!

Thanks!
Questions?

Sylvain Wallez - @bluxte
Toulouse JUG - 2017-04-26
