sista: improving cog’s jit performance · main people involved in sista • eliot miranda •...

71
Sista: Improving Cog’s JIT performance Clément Béra

Upload: others

Post on 22-Aug-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Sista: Improving Cog’s JIT

performanceClément Béra

Page 2: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Main people involved in Sista• Eliot Miranda

• Over 30 years experience in Smalltalk VM

• Clément Béra

• 2 years engineer in the Pharo team

• Phd student starting from october

Page 3: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Cog ?

• Smalltalk virtual machine

• Runs Pharo, Squeak, Newspeak, ...

Page 4: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Plan

• Efficient VM architecture (Java, C#, ...)

• Sista goals

• Example: optimizing a method

• Our approach

• Status

Page 5: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Virtual machineHigh level code (Java, Smalltalk ...)

CPU: intel, ARM ... - OS: Windows, Android, Linux ..

Runtime environment (JRE, JDK, Object engine...)

Page 6: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Basic Virtual machineHigh level code (Java, Smalltalk ...)

Compiler

Bytecode Interpreter

Memory Manager

CPU RAM

BytecodeVM

Page 7: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Fast virtual machineHigh level code (Java, Smalltalk ...)

Compiler

Bytecode Interpreter

JIT Compiler

Memory Manager

CPU RAM

BytecodeVM

Page 8: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Efficient JIT compilerdisplay: listOfDrinks

listOfDrinks do: [ :drink |self displayOnScreen: drink ]

Execution number

time to run (ms) Comments

1 1 lookup of #displayOnScreen (cache) and byte code interpretation

2 to 6 0,5 byte code interpretation

7 2 generation of native code for displayOnScreen and native code run

•8 to 999 0,01 native code run

1000 2 adaptive recompilation based on runtime type information, generation of native code and optimized native code run

1000 + 0,002 optimized native code run

Page 9: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

In Cogdisplay: listOfDrinks

listOfDrinks do: [ :drink |self displayOnScreen: drink ]

Execution number

time to run (ms) Comments

1 1 lookup of #displayOnScreen (cache) and byte code interpretation

2 to 6 0,5 byte code interpretation

7 2 generation of native code for display and native code run

•8 to 999 0,01 native code run

1000 2 adaptive recompilation based on runtime type information, generation of native code and optimized native code run

1000 + 0,002 optimized native code run

Page 10: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

A look into webkit• Webkit Javascript engine

• LLInt = Low Level Interpreter

• DFG JIT = Data Flow Graph JIT

Page 11: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Webkit JS benchs

Page 12: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

In Cog

Page 13: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

What is Sista ?

• Implementing the next JIT optimization level

Page 14: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Goals

• Smalltalk performance

• Code readability

Page 15: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Smalltalk performanceThe Computer Language Benchmarks Game

Page 16: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Smalltalk performanceThe Computer Language Benchmarks Game

Page 17: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Smalltalk performanceThe Computer Language Benchmarks Game

Smalltalk is 4x~25x slower than Java

Page 18: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Smalltalk performanceThe Computer Language Benchmarks Game

Smalltalk is 4x~25x slower than Java

Our goal is 3x faster:1.6x~8x times slower than Java

+ Smalltalk features

Page 19: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Code Readability

• Messages optimized by the bytecode compiler overused in the kernel

• #do: => #to:do:

Page 20: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Adaptive recompilation

• Recompiles on-the-fly portion of code frequently used based on the current environment and previous executions

Page 21: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Optimizing a method

Page 22: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Example

Page 23: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Example

• Executing #display: with over 30 000 different drinks ...

Page 24: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Hot spot detector

• Detects methods frequently used

Page 25: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Hot spot detector• JIT adds counters on machine code

• Counters are incremented when code execution reaches it

• When a counter reaches threshold, the optimizer is triggered

Page 26: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Example

Can detect hot spot(#to:do: compiled inlined)

Cannot detect hot spot

Page 27: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Hot spot detected

Over 30 000 executions

Page 28: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Optimizer• What to optimize ?

Page 29: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Optimizer• What to optimize ?

• Block evaluation is costly

Page 30: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Optimizer• What to optimize ?

Stack frame#display:

Stack frame#do:

Stack growing downHot spot

detected here

Page 31: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Optimizer• What to optimize ?

Stack frame#display:

Stack frame#do:

Hot spot detected

here

Page 32: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Find the best method

Page 33: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Find the best method

• To be able to optimize the block activation, we need to optimize both #example and #do:

Page 34: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Inlining• Replaces a function call by its callee

Page 35: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Inlining• Replaces a function call by its callee

• Speculative: what if listOfDrinks is not an Array anymore ?

Page 36: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Inlining• Replaces a function call by its callee

• Speculative: what if listOfDrinks is not an Array anymore ?

• Type-feedback from inline caches

Page 37: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Guard

• Guard checks: listOfDrinks hasClass: Array

• If a guard fails, the execution stack is dynamically deoptimized

• If a guard fails more than a threshold, the optimized method is uninstalled

Page 38: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Dynamic deoptimization• On Stack replacement

• PCs, variables and methods are mapped.

Stack frame#display:

Stack frame#do:

Stack frame#optimizedDisplay:

Page 39: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Optimizations• 1) #do: is inlined into #display:

Page 40: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Optimizations• 2) Block evaluation is inlined

Page 41: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Optimizations• 3) at: is optimized

• at: optimized ???

Page 42: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

At: implementation

• VM implementation

• Is index an integer ?

• Is object a variable-sized object, a byte object, ... ? (Array, ByteArray, ... ?)

• Is index within the bound of the object ?

• Answers the value

Page 43: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Optimizations• 3) at: is optimized

• index isSmallInteger

• 1 <= index <= listOfDrinks size

• listOfDrinks class == Array

• uncheckedAt1: (at: for variable size objects with in-bounds integer argument)

Page 44: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Restarting• On Stack replacement

Stack frame#display:

Stack frame#do:

Stack frame#optimizedDisplay:

Page 45: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Our approach

Page 46: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

In-image Optimizer• Inputs

• Stack with hot compiled method

• Branch and type information

• Actions

• Generates and installs an optimized compiled method

• Generates deoptimization metadata

• Edit the stack to use the optimized compiled method

Page 47: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

In-image Deoptimizer• Inputs

• Stack to deoptimize (failing guard, debugger, ...)

• Deoptimization metadata

• Actions

• Deoptimize the stack

• May discard the optimized method

Page 48: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Advantages

• Optimizer / Deoptimizer in smalltalk

• Easier to debug and program in Smalltalk than in Slang / C

• Editable at runtime

• Lower engineering cost

Page 49: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Advantages

Stack frame#display:

Stack frame#do:

Stack frame#optimizedDisplay:

On Stack replacement done with Context manipulation

Page 50: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Cons

• What if the optimizer is triggered while optimizing ?

Page 51: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Advantages

• CompiledMethod to optimized CompiledMethod

• Snapshot saves compiled methods

Page 52: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Cons

• Some optimizations are difficult

• Instruction selection

• Instruction scheduling

• Register allocation

Page 53: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Critical optimizations

• Inlining (Methods and Closures)

• Specialized at: and at:put:

• Specialized arithmetic operations

Page 54: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Interface VM - Image

• Extended bytecode set

• CompiledMethod introspection primitive

• VM callback to trigger optimizer /deoptimizer

Page 55: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Extended bytecode set

• Guards

• Specialized inlined primitives

• at:, at:put:

• +, - , <=, =, ...

Page 56: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Introspection primitive• Answers

• Type info for each message send

• Branch info for each conditional jump

• Answers nothing if method not in machine code zone

Page 57: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

VM callback

• Selector in specialObjectsArray

• Sent to the active context when hot spot / guard failure detected

Page 58: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Stability

• Low engineering resources

• No thousands of human testers

Page 59: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Stability

• Gemstone style ?

• Large test suites

Page 60: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Stability

• RMOD research team

• Inventing new validation techniques

• Implementing state-of-the-art validators

Page 61: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Stability goal

• Both bench and large test suite run on CI

• Anyone could contribute

Page 62: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Anyone can contribute

Page 63: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Anyone can contribute

Page 64: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Stability goal

• Both bench and large test suite run on CI

• Anyone could contribute

Page 65: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Status• A full round trip works

• Hot spot detected in the VM

• In-image optimizer finds a method to optimize, optimizes it an install it

• method and closure inlining

• Stack dynamic deoptimization

Page 66: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Status: hot spot detector• Richards benchmark

• Cog: 341ms

• Cog + hot spot detector: 351ms

• 3% overhead on Richards

• Detects hot spots

• Branch counts ⇒ basic block counts

Page 67: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Next steps• Dynamic deoptimization of closures is not

stable enough

• Efficient new bytecode instructions

• Stabilizing bounds check elimination

• lowering memory footprint

• Invalidation of optimized methods

• On stack replacement

Page 68: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Thanks

• Stéphane Ducasse

• Marcus Denker

• Yaron Kashai

Page 69: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Conclusion

• We hope to have a prototype running for Christmas

• We hope to push it to production in around a year

Page 70: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Demo

...

Page 71: Sista: Improving Cog’s JIT performance · Main people involved in Sista • Eliot Miranda • Over 30 years experience in Smalltalk VM • Clément Béra • 2 years engineer in

Questions