Applying Compiler Technology to Ruby (Evan Phoenix, Sept 8, 2009)

DESCRIPTION

Evan Phoenix's presentation at the Ruby World conference, Sept 7th, 2009.

TRANSCRIPT

Page 1: Ruby World

Applying Compiler Technology to Ruby

Sept 8, 2009

Evan Phoenix

Wednesday, September 16, 2009

Page 2: Ruby World

What makes Ruby great can make Ruby slow.

Page 4: Ruby World

‣ Highly Dynamic

• Very high level operations

• New code can be introduced at any time

• Dynamic typing

• Exclusively late bound method calls

• Easier to implement as an interpreter
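The bullets above can be illustrated in a few lines of plain Ruby. This is only a sketch: the `Greeter` class and its `greet` method are made up for the example and do not come from the talk.

```ruby
# Hypothetical class, used only to illustrate the bullets above.
class Greeter
  def greet
    "hello"
  end
end

g = Greeter.new
first = g.greet # "hello"

# New code can be introduced at any time: reopen the class while the
# program is running and replace the method.
class Greeter
  def greet
    "hi"
  end
end

# Late binding: the same call on the same object now finds the new method.
second = g.greet # "hi"

# Dynamic typing: one call site (obj.size), receivers of unrelated classes.
sizes = ["abc", [1, 2], { :a => 1 }].map { |obj| obj.size }
```

Nothing in the text of `g.greet` or `obj.size` tells a compiler ahead of time which method will run, which is why the decision is usually left to an interpreter at run time.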

Page 5: Ruby World

Haven’t other languages had these same features/weaknesses?

Page 7: Ruby World

‣Prior Work

• Smalltalk

    • 1980-1994: Extensive work to make it fast

• Self

    • 1992-1996: A primary research vehicle for making dynamic languages fast

• Java / Hotspot

    • 1996-present: A battle hardened engine for (limited) dynamic dispatch

Page 9: Ruby World

‣What Can We Learn From Them?

• Compiled code is faster than interpreted code

• It’s very hard (almost impossible) to figure things out statically

• The type profile of a program is stable over time

• Therefore:

• Learn what a program does and optimize based on that

• This is called Type Feedback

Page 10: Ruby World

‣Code Generation (JIT)

• Eliminating the interpreter’s overhead instantly increases performance by a fixed percentage

• Naive code generation results in small improvement over interpreter

• Method calling continues to dominate time

• Need a way to generate better code

• Combine with program type information!

Page 11: Ruby World

‣Type Profile

• As the program executes, it’s possible to see how one method calls other methods

• The relationship of one method and all the methods it calls is the type profile of the method

• Just because you CAN use dynamic dispatch doesn’t mean you always do.

• It’s common that a call site always calls the same method every time it’s run
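One way to picture a type profile is to record, for each call site, which receiver classes actually show up there. The sketch below does this in plain Ruby. `PROFILE`, `profiled_send`, and the `:length_call` label are hypothetical names invented for the illustration, and a real VM would record this inside its dispatch machinery rather than through a helper method.

```ruby
# Hypothetical profiler: maps a call-site label to the receiver classes
# observed there and how often each one was seen.
PROFILE = Hash.new { |sites, site| sites[site] = Hash.new(0) }

# Stand-in for an instrumented call site: record the receiver's class,
# then perform the ordinary dynamic dispatch.
def profiled_send(site, receiver, method_name, *args)
  PROFILE[site][receiver.class] += 1
  receiver.public_send(method_name, *args)
end

# A method with a single call site, labelled :length_call.
def total_length(items)
  items.inject(0) do |sum, item|
    sum + profiled_send(:length_call, item, :length)
  end
end

total_length(%w[a bb ccc])
total_length(["dddd", [1, 2]])

# PROFILE[:length_call] now maps String to 4 and Array to 1.
```

Even though `item.length` could dispatch to any class, the recorded profile shows that in this run it almost always reached `String#length`.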

Page 12: Ruby World

Call sites running Array specs (number of classes seen at a call site: number of call sites)

• 1 class: 25245 (98%)

• 2: 275 (1%)

• 3: 86

• 4: 50

• 5: 35

• 6: 6

• 7: 10

• 8: 5

• 9: 5

• 10: 2

• 10+: 34

Page 13: Ruby World

‣Type Profiling (Cont.)

• 98% of all call sites call the same method every time

• In other words, 98% of all call sites are effectively statically bound
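The 98% figure can be checked against the call-site counts reported for the Array specs run. The numbers below are copied from that chart. The constant name `CALL_SITES` and the word "monomorphic" (a call site that has seen only one class) are labels added for this sketch.

```ruby
# Call-site counts from the Array specs chart, keyed by the number of
# receiver classes seen at the site ("10+" means more than ten).
CALL_SITES = {
  "1" => 25_245, "2" => 275, "3" => 86, "4" => 50, "5" => 35, "6" => 6,
  "7" => 10, "8" => 5, "9" => 5, "10" => 2, "10+" => 34
}

total = CALL_SITES.values.inject(:+)
monomorphic_share = CALL_SITES["1"] * 100.0 / total

puts format("%d call sites in total", total)
puts format("%.1f%% saw a single class", monomorphic_share)
```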

Page 14: Ruby World

‣Type Feedback

• Optimize a semi-static relationship to generate faster code

• Semi-static relationships are found by profiling all call sites

• Allow JIT to make vastly better decisions

• Most common optimization: Method Inlining

Page 15: Ruby World

‣Method Inlining

• Rather than emit a call to a target method, copy its body at the call site

• Eliminates code to look up and begin execution of target method

• Simplifies (or eliminates) setup for target method

• Allows for type propagation, as well as providing a wider horizon for optimization.

• A wider horizon means better generated code, which means less work to do per method == faster execution.
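A JIT performs inlining on generated machine code, not on Ruby source, but the effect can be sketched at the source level. In the example below, `Point`, `far_away?`, and `far_away_inlined?` are hypothetical names. The second method shows, conceptually, what the compiler might produce once type feedback says the receiver has always been a `Point`.

```ruby
# Hypothetical example class.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  def norm_squared
    x * x + y * y
  end
end

# Before: the call site performs a full dynamic dispatch to norm_squared.
def far_away?(point)
  point.norm_squared > 100
end

# After (conceptually): the body of norm_squared is copied into the
# caller behind a cheap class check. If the check fails, fall back to
# ordinary dispatch so behaviour stays the same.
def far_away_inlined?(point)
  if point.class.equal?(Point)
    px = point.x
    py = point.y
    (px * px + py * py) > 100
  else
    point.norm_squared > 100
  end
end
```

The class check is what keeps the optimization safe in a dynamic language. A real implementation would also have to throw the inlined code away if `norm_squared` were later redefined.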

Page 16: Ruby World

Implementation

Page 17: Ruby World

‣Code Generation (JIT)

• Early experimentation with custom JIT

    • Realized we weren’t experts

    • Would take years to generate good code

• Switched to LLVM

Page 18: Ruby World

‣LLVM

• Provides an internal AST (LLVM IR) for describing work to be done

• Text representation of AST allows for easy debugging

• Provides ability to compile AST to machine code in memory

• Contains thousands of optimizations

• Competitive with GCC

Page 19: Ruby World

‣Type Profiling

• All call sites use a class called InlineCache, one per call site

• InlineCache accelerates method dispatch by caching previous method used

• In addition, tracks a fixed number of receiver classes seen when there is a cache miss

• When compiling a method using LLVM, all InlineCaches for a method can be read

• InlineCaches with good information can be used to accurately find a method to inline
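The idea of a per-call-site cache can be sketched in Ruby itself. This is not Rubinius's actual InlineCache: the class name `SketchInlineCache`, its fields, and the `MAX_TRACKED` limit are all assumptions made for illustration. The sketch also ignores singleton methods and method redefinition.

```ruby
# A minimal sketch of a per-call-site cache (NOT the real InlineCache).
class SketchInlineCache
  MAX_TRACKED = 3 # assumed limit on receiver classes remembered

  attr_reader :seen_classes, :hits, :misses

  def initialize(method_name)
    @method_name = method_name
    @cached_class = nil
    @cached_method = nil
    @seen_classes = []
    @hits = 0
    @misses = 0
  end

  # Dispatch through the cache: reuse the previous lookup when the
  # receiver's class matches, otherwise look the method up again.
  def call(receiver, *args)
    klass = receiver.class
    if klass.equal?(@cached_class)
      @hits += 1
    else
      @misses += 1
      @cached_class = klass
      @cached_method = klass.instance_method(@method_name)
      # On a miss, remember the class (up to a fixed number of them).
      if @seen_classes.size < MAX_TRACKED && !@seen_classes.include?(klass)
        @seen_classes << klass
      end
    end
    @cached_method.bind(receiver).call(*args)
  end

  # A site that has only ever seen one class is a good inlining candidate.
  def monomorphic?
    @seen_classes.size == 1
  end
end

site = SketchInlineCache.new(:size)
[[1, 2], [3], []].each { |ary| site.call(ary) }
# site.misses is 1, site.hits is 2, and site.seen_classes is [Array].
```

A compiler reading this cache would see a single receiver class and could treat `Array#size` as the inlining target for the site.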

Page 20: Ruby World

‣When To Compile

• It takes time for a method’s type information to settle down

• Compiling too early means not having enough type info

• Compiling too late means lost performance

• Use simple call counters to allow a method to “heat up”

• Each invocation of a method increments counter

• When counter reaches a certain value, method is queued for compilation.

• Threshold value is tunable: -Xjit.call_til_compile

• Still experimenting with good default values
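The call-counter scheme described above might look roughly like this. `HotMethodCounter` and its method names are hypothetical, and the default threshold of 1000 is made up. It merely stands in for whatever value -Xjit.call_til_compile is set to.

```ruby
# Hypothetical sketch of "heating up" methods with simple call counters.
class HotMethodCounter
  attr_reader :compile_queue

  def initialize(threshold = 1000) # made-up default
    @threshold = threshold
    @counts = Hash.new(0)
    @compile_queue = []
  end

  # Called on every invocation of a method.
  def record_call(method_name)
    @counts[method_name] += 1
    # Queue exactly once, at the moment the counter reaches the threshold.
    @compile_queue << method_name if @counts[method_name] == @threshold
  end
end

counter = HotMethodCounter.new(3)
5.times { counter.record_call(:foo) }
counter.record_call(:bar)
# counter.compile_queue is now [:foo]; :bar has not heated up yet.
```

A low threshold compiles sooner but with less type information collected. A high one delays the speedup, which is the trade-off the bullets describe.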

Page 21: Ruby World

‣How to Compile

• To impact runtime as little as possible, all JIT compilation happens in a background OS thread

• Methods are queued, and background thread reads queue to find methods to compile

• After compiling, function pointers to JIT generated code are installed in methods

• All future invocations of method use JIT code
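The queue-and-background-thread arrangement can be sketched with Ruby's own `Queue` and `Thread`. Everything named here (`SketchMethod`, `fake_compile`, the `nil` stop signal) is an assumption for illustration. In particular, `fake_compile` returns a lambda where a real JIT would produce machine code and install a function pointer.

```ruby
require "thread" # provides Queue; loaded by default on modern Rubies

# Hypothetical stand-in for a method object that starts out interpreted
# and can later have compiled code installed.
class SketchMethod
  attr_reader :name

  def initialize(name)
    @name = name
    @jit_code = nil
  end

  # Install the "compiled" code; later calls will use it.
  def install(code)
    @jit_code = code
  end

  def call
    @jit_code ? @jit_code.call : "#{@name}: interpreted"
  end
end

# Stand-in for the compiler: returns a lambda instead of machine code.
def fake_compile(meth)
  name = meth.name
  lambda { "#{name}: jit" }
end

compile_queue = Queue.new

# Background thread: block on the queue, compile, install.
compiler = Thread.new do
  loop do
    meth = compile_queue.pop
    break if meth.nil? # nil is used here as a stop signal (sketch only)
    meth.install(fake_compile(meth))
  end
end

foo = SketchMethod.new(:foo)
before = foo.call     # runs the interpreted path
compile_queue << foo  # queue the method for compilation
compile_queue << nil  # ask the worker to finish
compiler.join
after = foo.call      # now runs the installed code
```

The calling thread never waits on the compiler while the method runs. It keeps using the interpreted path until the compiled code has been installed.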

Page 22: Ruby World

‣Benchmarks

[Bar chart, y-axis: Seconds. Bars: 1.8, 1.9, rbx, rbx jit, rbx jit +blocks. Values as extracted (pairing with bars not recoverable): 2.59, 3.60, 5.90, 5.30, 8.02]

def foo()
  ary = []
  100.times { |i| ary << i }
end

300,000 times
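The slides do not say how the timings were taken. One plausible way to reproduce this kind of measurement is with the standard library's `Benchmark` module, as sketched below. The reduced iteration count and the constant name `ITERATIONS` are choices made for this example, not details from the talk.

```ruby
require "benchmark"

# The benchmark body from the slide.
def foo
  ary = []
  100.times { |i| ary << i }
end

# The slide's run used 300,000 iterations; a smaller count keeps this
# sketch quick to run.
ITERATIONS = 30_000

elapsed = Benchmark.realtime do
  ITERATIONS.times { foo }
end

puts format("%d calls took %.2f seconds", ITERATIONS, elapsed)
```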

Page 23: Ruby World

‣Benchmarks

[Bar chart, y-axis: Seconds. Bars: 1.8, 1.9, rbx, rbx jit, rbx jit +blocks. Values as extracted (pairing with bars not recoverable): 12.01, 12.54, 25.36, 5.26, 4.85]

def foo()
  hsh = {}
  100.times { |i| hsh[i] = 0 }
end

100,000 times

Page 24: Ruby World

‣Benchmarks

[Bar chart, y-axis: Seconds. Bars: 1.8, 1.9, rbx, rbx jit, rbx jit +blocks. Values as extracted (pairing with bars not recoverable): 2.66, 2.68, 6.26, 2.09, 3.64]

def foo()
  hsh = { 47 => true }
  100.times { |i| hsh[i] }
end

100,000 times

Page 25: Ruby World

‣Benchmarks

[Bar chart, y-axis: Seconds. Bars: 1.8, 1.9, jruby, rbx, rbx jit, rbx jit +blocks. Values as extracted (pairing with bars not recoverable): 1.53, 1.53, 7.27, 1.89, 1.58, 7.36]

tak(18, 9, 0)

Page 26: Ruby World

‣Conclusion

• Ruby is a wonderful language because it is organized for humans

• By gathering and using information about a running program, it’s possible to make that program much faster without impacting flexibility

• Thank You!
