Zero-Overhead Metaprogramming
Using Self-Optimizing Interpreters to Remove the Runtime Cost of Reflective Programming
Stefan Marr, INRIA Lille
Research collaboration with Chris Seaton, Oracle Labs, and Stéphane Ducasse, INRIA Lille
PLDI, June 17, 2015
Runtime Metaprogramming ==
Just Another Form of Late Binding
Optimized as Such
2
Runtime Metaprogramming
class Proxy
  def method_missing(name, *args, &block)
    target.send(name, *args, &block)
  end
end
obj.invoke('foo', [])
obj.getField(idx)
obj.setField(idx, val)
Powerful and Useful
Frameworks, Domain-Specific Languages, …
For Late Binding Languages!
5
Metaobject Protocol Example
Building a Safe Actor Framework
class ActorDomain : Domain {
  fn writeToField(obj, fieldIdx, value) {
    if (Domain.current() == this) {
      obj.setField(fieldIdx, value);
    } else {
      throw new IsolationError(obj);
    }
  }
  /* ... */
}
Overhead for every field write!
http://stefan-marr.de/research/omop/
6
Metaprogramming is slooow!
meth.invoke(): 0.7x Overhead
Dynamic Proxies: 6.5x Overhead
7
Everybody Knows:
Runtime Metaprogramming is slooow!
8
OPTIMIZING REFLECTIVE OPERATIONS
obj.invoke('foo', [])
obj.getField(idx)
obj.setField(idx, val)
Method Invocation, Field Accesses, …
9
Reflective Method Invocation
cnt.invoke('+', [1])
How to optimize this?
10
Optimize Direct Invocation
cnt + 1
Hölzle, Chambers, Ungar, ’91.
cnt.+(1)
is polymorphic -> Cache at Send Site
Solution: Polymorphic Inline Cache
• Avoids lookup
• Enables JIT to inline
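The caching idea on this slide can be sketched in plain Ruby. This is only an illustration under stated assumptions: `SendSite`, `max_entries`, and `lookups` are hypothetical names, and a real VM keeps such a cache inside the interpreter or JIT-compiled code rather than in a user-level object.

```ruby
# Hypothetical model of one send site such as `cnt + 1`.
# It remembers, per receiver class, which method the lookup found.
class SendSite
  attr_reader :lookups

  def initialize(selector, max_entries = 4)
    @selector = selector
    @max_entries = max_entries   # assumed limit before giving up on caching
    @cache = {}                  # receiver class => resolved method
    @lookups = 0                 # number of full (slow) lookups performed
  end

  def call(receiver, *args)
    klass = receiver.class
    meth = @cache[klass]
    if meth.nil?
      @lookups += 1
      meth = klass.instance_method(@selector)        # slow path: full lookup
      @cache[klass] = meth if @cache.size < @max_entries
    end
    meth.bind(receiver).call(*args)                  # fast path after first hit
  end
end

site = SendSite.new(:+)
site.call(0, 1)     # Integer receiver: one lookup, result cached
site.call(41, 1)    # Integer again: cache hit, no lookup
site.call(1.5, 1)   # Float receiver: second cache entry
```

After these three calls the site has performed two lookups, one per receiver class, which is the "avoids lookup" half of the slide. The inlining half only exists in a compiler and is not modeled here.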
11
Generalize Polymorphic Inline Caches to Dispatch Chains
cnt.+(1)
[Diagram: the invocation node for cnt.+(1) has two children, a variable read (cnt) and a literal (1); the + send goes through a dispatch chain that dispatches to the method.]
Commonplace in Self-Optimizing Interpreters: Würthinger et al. [2012], Humer et al. [2014], Wöß et al. [2014]
Chain Nodes can have arbitrary behavior
12
Dispatch Chain for Method Invocation
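cnt.+(1)
cnt: 0 (Integer object)
[Diagram: the dispatch chain starts as a single UnInit node. With a CacheNode in place, the chain checks class == int; true leads to the cached int + method, false leads to an UnInit node.]
• Avoids lookup
• Enables inlining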
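The chain on this slide can be modeled as linked node objects. The sketch below is an assumption-laden illustration, not the implementation from the talk: `DispatchChain`, `depth`, and the `execute` protocol are hypothetical, and for brevity new cache nodes are prepended at the head instead of replacing the UnInit node in place.

```ruby
# Cache node: guards on the receiver class, as in "check class == int".
class CacheNode
  def initialize(klass, meth, next_node)
    @klass = klass
    @meth = meth
    @next_node = next_node
  end

  def execute(receiver, args)
    if receiver.class.equal?(@klass)
      @meth.bind(receiver).call(*args)     # true: call the cached method
    else
      @next_node.execute(receiver, args)   # false: try the next node
    end
  end

  def depth
    1 + @next_node.depth
  end
end

# Uninitialized node: does the full lookup and extends the chain.
class UninitNode
  def initialize(chain, selector)
    @chain = chain
    @selector = selector
  end

  def execute(receiver, args)
    meth = receiver.class.instance_method(@selector)
    # Simplification: prepend the new cache node; this node stays at the tail.
    @chain.head = CacheNode.new(receiver.class, meth, @chain.head)
    meth.bind(receiver).call(*args)
  end

  def depth
    0
  end
end

# Hypothetical holder for the chain of one send site.
class DispatchChain
  attr_accessor :head

  def initialize(selector)
    @head = UninitNode.new(self, selector)
  end

  def call(receiver, *args)
    @head.execute(receiver, args)
  end

  def depth
    @head.depth   # number of cache nodes currently in the chain
  end
end

chain = DispatchChain.new(:+)
chain.call(0, 1)     # first call specializes the chain for Integer
chain.call(1.5, 1)   # a Float receiver adds a second cache node
```

Because each node is an ordinary object with its own `execute`, a node can guard on something other than the receiver class, which is what the later slides rely on.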
13
Reflective Method Invocation
cnt.invoke('+', [1])
How to optimize this?
14
Optimizing Reflective Method Invocation
cnt.invoke('+', [1])
[Diagram: an InvokeNode with children cnt, '+', and [1]; its dispatch chain on the method name starts as a single UnInit node.]
15
Optimizing Reflective Method Invocation
cnt.invoke('+', [1])
[Diagram: an InvokeNode with children cnt, '+', and [1]. The dispatch chain on the method name holds a NameNode that checks name == '+'; true leads into a method dispatch chain that dispatches to the method, false leads to an UnInit node.]
Nesting of dispatch chains resolves variability
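The nesting can be sketched as two levels of caching: an outer chain keyed on the method name, where each entry owns an inner chain keyed on the receiver class. All names here (`InvokeSite`, `MethodChain`, `name_entries`, `lookups_for`) are hypothetical, and arrays scanned in order stand in for the linked chain nodes.

```ruby
# Inner chain: caches the method per receiver class for one fixed selector.
class MethodChain
  attr_reader :lookups

  def initialize(selector)
    @selector = selector
    @entries = []    # [[klass, method], ...] scanned in order
    @lookups = 0
  end

  def call(receiver, args)
    entry = @entries.find { |klass, _| receiver.class.equal?(klass) }
    if entry.nil?
      @lookups += 1
      entry = [receiver.class, receiver.class.instance_method(@selector)]
      @entries << entry
    end
    entry[1].bind(receiver).call(*args)
  end
end

# Outer chain: one entry per method name seen at this invoke site.
class InvokeSite
  def initialize
    @entries = []    # [[name, MethodChain], ...]
  end

  def invoke(receiver, name, args)
    # Corresponds to the "check name == '+'" guard on the slide.
    entry = @entries.find { |cached_name, _| cached_name == name }
    if entry.nil?
      entry = [name, MethodChain.new(name.to_sym)]
      @entries << entry
    end
    entry[1].call(receiver, args)
  end

  def name_entries
    @entries.size
  end

  def lookups_for(name)
    @entries.find { |cached_name, _| cached_name == name }[1].lookups
  end
end

site = InvokeSite.new
site.invoke(0, '+', [1])   # => 1
site.invoke(9, '+', [1])   # => 10, both levels hit their caches
site.invoke(5, '-', [2])   # => 3, adds a second name entry
```

Once the name guard succeeds, the inner chain sees a fixed selector, so the reflective call reduces to the same class check as a direct send.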
16
Simple Metaprogramming Solved!
class Proxy
  def method_missing(name, *args, &block)
    target.send(name, *args, &block)
  end
end
obj.invoke('foo', [])
obj.getField(idx)
obj.setField(idx, val) ✔
✔ ?
17
Metaobject Protocol Example
Building a Safe Actor Framework
class ActorDomain : Domain {
  fn writeToField(obj, fieldIdx, value) {
    if (Domain.current() == this) {
      obj.setField(fieldIdx, value);
    } else {
      throw new IsolationError(obj);
    }
  }
  /* ... */
}
http://stefan-marr.de/research/omop/
18
An Actor Example
actor.fieldA := 1
[Diagram: the field write .fieldA := has two children, a temp variable read (actor) and a literal (1).]
Field write semantics depend on the metaobject
19
Dispatch Chains to Resolve Variability
[Diagram: the field write .fieldA := has a dispatch chain on the metaobject. A CacheMObj node for ActorDomain calls the intercession handler ActorDomain.writeToField(); the chain ends in an UnInit node.]
20
Dispatch Chains to Resolve Variability
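[Diagram: the field write .fieldA := has a dispatch chain on the metaobject. A CacheMObj node for ActorDomain calls the intercession handler ActorDomain.writeToField(); a StdDomain node leads to StdWrite, the standard direct write; the chain ends in an UnInit node.]
Shortcut for standard semantics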
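A rough Ruby sketch of a field-write site that caches on the metaobject follows. It assumes a hypothetical `Domain` base class whose `write_to_field` implements the standard semantics, objects that carry a `domain` reference, and a `FieldWriteSite` class that stands in for the AST node; none of these names come from the talk's implementation.

```ruby
class IsolationError < StandardError; end

# Base metaobject with standard semantics (assumed design).
class Domain
  class << self
    attr_accessor :current   # domain of the currently executing code
  end

  def write_to_field(obj, field_idx, value)
    obj.fields[field_idx] = value
  end
end

# Mirrors the slide's ActorDomain: only the owner may write.
class ActorDomain < Domain
  def write_to_field(obj, field_idx, value)
    raise IsolationError, "write from foreign domain" unless Domain.current.equal?(self)
    obj.fields[field_idx] = value
  end
end

class Obj
  attr_reader :domain, :fields

  def initialize(domain, size)
    @domain = domain
    @fields = Array.new(size)
  end
end

# Hypothetical field-write site with a cache keyed on the domain class.
class FieldWriteSite
  attr_reader :handlers

  def initialize(field_idx)
    @field_idx = field_idx
    @handlers = {}   # domain class => :std or :intercession
  end

  def write(obj, value)
    klass = obj.domain.class
    kind = (@handlers[klass] ||= classify(klass))
    if kind == :std
      obj.fields[@field_idx] = value                       # shortcut: direct write
    else
      obj.domain.write_to_field(obj, @field_idx, value)    # intercession handler
    end
  end

  private

  # Standard semantics apply when the handler is not overridden.
  def classify(klass)
    klass.instance_method(:write_to_field).owner.equal?(Domain) ? :std : :intercession
  end
end

std = Domain.new
actor = ActorDomain.new
plain = Obj.new(std, 1)
owned = Obj.new(actor, 1)
site = FieldWriteSite.new(0)

site.write(plain, 1)      # standard domain: direct write, no handler call
Domain.current = actor
site.write(owned, 2)      # owner writes through the intercession handler
```

Deciding once per domain class whether the handler is overridden is what lets the standard case skip the metaobject entirely, while a write to `owned` from any other domain still raises `IsolationError`.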
21
DOES IT WORK?
Is it fast?
22
Evaluation: Self-Optimizing Interpreters
Academic “Simplicity”
Verify idea independent of compilation technique
• Meta-Tracing (RPython)
• Partial Evaluation (Truffle)
Industrial “Sophistication”
High-Performance Truffle Backend for JRuby
• Case Study with Production Code
http://som-st.github.io
JRuby+Truffle: https://github.com/jruby/jruby/wiki/Truffle
23
Simple Metaprogramming: Zero Overhead
http://stefan-marr.de/papers/pldi-marr-et-al-zero-overhead-metaprogramming-artifacts/
Reflective & Direct:
Identical Machine Code!
24
Production Code: Image Processing
JRuby+Truffle
Optimize Simple Metaprogramming!
[Chart: speedup over unoptimized (higher is better)]
25
OMOP Overhead
meta-tracing: Overhead 4% (min. -1%, max. 19%)
partial evaluation: Overhead 9% (min. -7%, max. 38%)
Metaobject Protocols:
Fast without Compromises!
26
Open Research Questions
• Do programs with MOPs have the classic trimodal distribution of send-site polymorphism?
  – i.e., does the basic PIC hypothesis apply?
  – Same polymorphism degree, inlining limits, …
• How to implement dispatch chains efficiently in classic tiered JIT compilers?
27
Dispatch Chains: A Generalization of PICs
• Complete removal of reflective overhead
  – Simple and sufficient
  – For meta-tracing and partial evaluation
• Enables
  – Zero-Overhead Metaprogramming
  – Efficient MOPs for smarter DSLs
[Diagram: a dispatch chain dispatching to a method]
Runtime Metaprogramming ==
Just Another Form of Late Binding
Optimized as Such
28
JRuby+Truffle
http://stefan-marr.de/papers/pldi-marr-et-al-zero-overhead-metaprogramming/