![Page 1: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/1.jpg)
Performance in the details: A way to make faster Ruby
Koichi Sasada <[email protected]>
RailsClub 2015
![Page 2: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/2.jpg)
A way to make faster Ruby
The only way I can find is:
Repeating a process.
![Page 3: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/3.jpg)
A way to make faster Ruby: A process
1. Observe the Ruby interpreter
2. Form a hypothesis about the cause of the slowness
3. Consider ideas to overcome it
4. Implement the ideas
5. Measure the result
• Bad/same performance → Go to 4, 3, 2 or 1
• Good performance! → Commit it.
![Page 4: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/4.jpg)
Koichi Sasada
A programmer from Japan
![Page 5: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/5.jpg)
Koichi is a Programmer
• MRI committer since 2007/01
• Original YARV developer since 2004/01
  • YARV: Yet Another RubyVM
  • Introduced into Ruby (MRI) 1.9.0 and later
• Generational/incremental GC for 2.x
![Page 6: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/6.jpg)
Koichi is an Employee
![Page 7: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/7.jpg)
Koichi is a member of Heroku Matz team
Mission
Design Ruby language
and improve quality of MRI
Heroku employs three full-time Ruby core developers in Japan, known as the “Matz team”
![Page 8: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/8.jpg)
Heroku Matz team
Matz Designer/director of Ruby
Nobu Quite active committer
Ko1 Internal Hacker
![Page 9: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/9.jpg)
Matz
Title collector
• He has so many (job) titles
  • Chairman – Ruby Association
  • Fellow – NaCl
  • Chief architect, Ruby – Heroku
  • Research institute fellow – Rakuten
  • Chairman – NPO mruby Forum
  • Senior researcher – Kadokawa Ascii Research Lab
  • Visiting professor – Shimane University
  • Honorary citizen (living) – Matsue city
  • Honorary member – Nihon Ruby no Kai
  • …
  • This margin is too narrow to contain them all
![Page 10: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/10.jpg)
Nobu
Great Patch monster
Ruby’s bug
|> Fix Ruby
|> Break Ruby
|> And Fix Ruby
![Page 11: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/11.jpg)
Nobu
Patch monster
[Pie chart: commit count of MRI, commit ratio in the last 5 years]
nobu 29%, akr 12%, svn 9%, naruse 8%, usa 4%, ko1 4%, drbrain 3%, kosaki 3%, kazu 2%, zzak 2%, tenderlove 2%, matz 2%, marcandre 2%, mame 2%, tadf 2%, knu 1%, shugo 1%, nagachika 1%, yugui 1%, kou 1%, mrkn 1%, emboss 1%, shyouhei 1%
Shown as 0%: nari, glass, ktsj, nahi, ayumin, tarui, sorah, ryan, charliesome, shirosaki, xibbar, nagai, eregon, ngoto, wanabe, azav, keiju, suke, kouji, duerst, takano32, luislavena, jeg2, hsbt, arton, seki, kanemoto, tmm1, eban, muraken, headius, evan, a_matsuda, iwamatsu, technorama, davidflanagan, gotoken, okkez
![Page 12: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/12.jpg)
Nobu
The Ruby Hero
![Page 13: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/13.jpg)
Ko1
EDD developer
[Chart: commit number of ko1 (last 3 years), x-axis from 2010/11/8 to 2013/11/8, y-axis 0 to 25, annotated with RubyConf2012, RubyKaigi2013, Ruby 2.0, Euruko2013 and RubyConf2013]
EDD: Event Driven Development
![Page 14: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/14.jpg)
Heroku Matz team and Ruby core team
Recent achievement
Ruby 2.2
Current stable
![Page 15: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/15.jpg)
Ruby 2.2
Syntax
• Symbol key of Hash literal can be quoted
{"foo-bar": baz}
#=> {:"foo-bar" => baz}
#=> not {"foo-bar" => baz} like JSON
TRAP!! Easy to misunderstand
(I have already written wrong code…)
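A small sketch of the trap above, assuming Ruby 2.2 or later. The variable names and the JSON string are made up for illustration.

```ruby
require "json"

baz = 42

quoted = { "foo-bar": baz }      # quoted key followed by a colon: a Symbol key
arrow  = { "foo-bar" => baz }    # hash rocket: a String key

p quoted.keys   # the key is the Symbol :"foo-bar"
p arrow.keys    # the key is the String "foo-bar"

# JSON.parse produces String keys, so a Symbol lookup misses.
parsed = JSON.parse('{"foo-bar": 42}')
p parsed["foo-bar"]      # found via the String key
p parsed[:"foo-bar"]     # nil: there is no Symbol key
```

The two literals look almost the same, but only the hash-rocket form matches what a JSON parser returns by default.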
![Page 16: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/16.jpg)
Ruby 2.2
Classes and Methods
• Some methods were introduced
  • Kernel#itself
  • String#unicode_normalize
  • Method#curry
  • Binding#receiver
  • Enumerable#slice_after, slice_before
  • File.birthtime
  • Etc.nprocessors
  • …
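A quick tour of some of these methods, as a sketch that assumes Ruby 2.2 or later. The sample data and the helper `add3` are hypothetical.

```ruby
require "etc"

# Kernel#itself: returns the receiver, handy as an identity block.
groups = [1, 2, 2, 3, 3, 3].group_by(&:itself)

# Enumerable#slice_after: end a chunk after each element matching the block.
chunks = [1, 2, 3, 4, 5, 6].slice_after(&:even?).to_a

# String#unicode_normalize: NFC by default ("e" + combining acute -> "é").
normalized = "e\u0301".unicode_normalize

# Method#curry: turn a Method object into a curried lambda.
def add3(a, b, c)
  a + b + c
end
curried_sum = method(:add3).curry[1][2][3]

# Binding#receiver: the object that is self inside the binding.
receiver_is_self = binding.receiver.equal?(self)

# Etc.nprocessors: number of online processors.
cpus = Etc.nprocessors

p groups, chunks, normalized, curried_sum, receiver_is_self, cpus
```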
![Page 17: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/17.jpg)
Ruby 2.2
Improvements
• Improve GC
  • Symbol GC
  • Incremental GC
  • Improved promotion algorithm
    • Young objects are promoted after 4 GCs
• Fast keyword parameters
• Use frozen string literals if possible
![Page 18: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/18.jpg)
Ruby 2.2
Symbol GC
before = Symbol.all_symbols.size
1_000_000.times{|i| i.to_s.to_sym} # Make 1M symbols
after = Symbol.all_symbols.size; p [before, after]
# Ruby 2.1
#=> [2_378, 1_002_378] # not GCed
# Ruby 2.2
#=> [2_456, 2_456] # GCed!
![Page 19: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/19.jpg)
Ruby 2.2
Symbol GC issues history
• Ruby 2.2.0 has a memory (object) leak problem
  • Symbols have corresponding String objects
  • Symbols are collected, but the Strings are not collected! (leak)
• Ruby 2.2.1 solved this problem!!
  • However, 2.2.1 also has a problem (you rarely encounter a BUG at the end of the process, [Bug #10933] ← not a big issue, I want to believe)
• Ruby 2.2.2 was supposed to solve [Bug #10933]!!
  • However, we forgot to include the patch!!
• Finally, Ruby 2.2.3 solved it!!
• Please use the newest version!!
![Page 20: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/20.jpg)
Ruby 2.2
Fast keyword parameters
“Keyword parameters”, introduced in Ruby 2.0, are useful but slow!!
[Bar chart: execution time (sec, axis 0 to 20), repeat 10M times, evaluation on Ruby 2.1: foo6(1, 2, 3, 4, 5, 6) vs. foo_kw6(k1: 1, k2: 2, k3: 3, k4: 4, k5: 5, k6: 6)]
x30 slower
![Page 21: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/21.jpg)
Ruby 2.2
Fast keyword parameters
[Bar chart: execution time (sec, axis 0 to 20), repeat 10M times, Ruby 2.1 vs. Ruby 2.2: foo6(1, 2, 3, 4, 5, 6) and foo_kw6(k1: 1, k2: 2, k3: 3, k4: 4, k5: 5, k6: 6)]
x14 faster!!
But still x2 slower compared with normal dispatch
Ruby 2.2 optimizes method dispatch with keyword parameters
![Page 22: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/22.jpg)
Ruby 2.2
Incremental GC

| | Before Ruby 2.1 | Ruby 2.1 RGenGC | Incremental GC | Ruby 2.2 Gen+IncGC (Goal) |
|---|---|---|---|---|
| Throughput | Low | High | Low | High |
| Pause time | Long | Long | Short | Short |
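On MRI 2.1 and later, the generational behavior can be observed from Ruby code through `GC.stat`, which counts minor and major collections separately. A minimal sketch; the allocation loop is arbitrary and the exact numbers depend on the Ruby version and heap state.

```ruby
before = GC.stat

# Allocate many short-lived objects; these mostly trigger minor GCs.
200_000.times { |i| "temporary string #{i}" }

# GC.start requests a full (major) collection by default.
GC.start

after = GC.stat

minor = after[:minor_gc_count] - before[:minor_gc_count]
major = after[:major_gc_count] - before[:major_gc_count]
puts "minor GCs: #{minor}, major GCs: #{major}"
```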
![Page 23: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/23.jpg)
RGenGC from Ruby 2.1: Micro-benchmark
[Bar chart: time (ms, axis 0 to 3000) split into total mark and total sweep, without and with RGenGC; data labels in source order: 1699.805974, 87.230735, 704.843669, 867.740319]
x2.5 faster
![Page 24: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/24.jpg)
RGenGC from Ruby 2.1: Pause time
In most cases, FASTER
(w/o rgengc)
![Page 25: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/25.jpg)
RGenGC from Ruby 2.1: Pause time
Several peaks
(w/o rgengc)
![Page 26: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/26.jpg)
Ruby 2.2 Incremental GC
Short pause time
![Page 27: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/27.jpg)
Heroku Matz team and Ruby core team
Next target is
Ruby 2.3
![Page 28: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/28.jpg)
Heroku Matz team and Ruby core team
Next target is
Ruby 2.3
No time to talk about it.
Please ask me later
![Page 29: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/29.jpg)
Performance in the details: A way to make faster Ruby
![Page 30: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/30.jpg)
Ruby interpreter
Ruby (Rails) app
RubyGems/Bundler
So many gems
such as Rails, pry, thin, … and so on.
Ruby’s components for users
Nani gigantum umeris insidentes
Standing on the shoulders of giants
![Page 31: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/31.jpg)
Interpret on RubyVM
Ruby script
Parse
Compile
Ruby bytecode
Object management (GC)
Threading
Embedded classes and methods
(Array, String, …)
Bundled libraries
Evaluator
Gem libraries
Ruby’s components
from core developer’s perspective
Ko1’s area
![Page 32: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/32.jpg)
Basic flow to make faster Ruby
1. Observe the Ruby interpreter
2. Form a hypothesis about the cause of the slowness
3. Consider ideas to overcome it
4. Implement the ideas
5. Measure the result
• Bad/same performance → Go to 4, 3, 2 or 1
• Good performance! → Commit it.
![Page 33: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/33.jpg)
Basic weapons to overcome issues
• Knowledge of computer science
  • Computer systems, programming techniques, and many others
  • From:
    • Textbooks
    • Academic papers
    • Other implementations
• Feedback from users
![Page 34: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/34.jpg)
Basic technique to improve performance
• Change the algorithm to reduce computational complexity
  • e.g.: Selection sort (O(n^2)) vs. Quick sort (O(n log n))
• Change the data structure to improve data locality
  • e.g.: “list” and “array”
• Remove redundant processing
  • e.g.: Using a cache (exploit temporal locality)
• Consider trade-offs
  • Speed up major cases and slow down minor cases
  • e.g.: speed up the non-exception flow (and slow down exception cases)
• Machine-dependent techniques
  • e.g.: Using assembler / CPU registers directly
• …
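As an illustration of "remove redundant processing with a cache", plus the "measure the result" step, here is a minimal sketch. The Fibonacci functions are a stand-in example and are not taken from the talk.

```ruby
require "benchmark"

# Naive recursion recomputes the same subproblems many times.
def fib_naive(n)
  n < 2 ? n : fib_naive(n - 1) + fib_naive(n - 2)
end

# Same computation with a cache of earlier results (memoization).
def fib_cached(n, cache = {})
  return n if n < 2
  cache[n] ||= fib_cached(n - 1, cache) + fib_cached(n - 2, cache)
end

n = 25
naive_time  = Benchmark.realtime { fib_naive(n) }
cached_time = Benchmark.realtime { fib_cached(n) }
puts format("naive: %.4fs, cached: %.6fs", naive_time, cached_time)
```

The cached version trades a little memory for time, which is the kind of trade-off listed above.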
![Page 35: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/35.jpg)
Case studies
![Page 36: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/36.jpg)
Ruby has many
Let’s play hangman game
![Page 37: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/37.jpg)
Ruby has many
![Page 38: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/38.jpg)
Ruby has many
![Page 39: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/39.jpg)
Ruby has many
![Page 40: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/40.jpg)
Ruby has many
![Page 41: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/41.jpg)
Ruby has many
Or Methods
![Page 42: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/42.jpg)
Case study: Optimize method dispatch
![Page 43: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/43.jpg)
Interpret on RubyVM
Ruby script
Parse
Compile
Ruby bytecode
Object management (GC)
Threading
Embedded classes and methods
(Array, String, …)
Bundled libraries
Evaluator
Gem libraries
Ruby’s components
from core developer’s perspective
Ko1’s area
![Page 44: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/44.jpg)
Method dispatch
# Example
recv.selector(arg1, arg2)
•recv: receiver
•selector: method id
•arg1, arg2: arguments
![Page 45: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/45.jpg)
Method dispatch
Overview
1. Get class of `recv’ (`klass’)
2. Search method `body’ named `selector’ from `klass’
  • Method is not fixed at compile time
  • “Dynamic” method dispatch
3. Dispatch method with `body’
  1. Check visibility
  2. Check arity (expected args # and given args #)
  3. Store `PC’ and `SP’ to continue after method returning
  4. Build `local environment’
  5. Set program counter
4. And continue VM execution
![Page 46: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/46.jpg)
Overview
Method search
• Search method from `klass’
  1. Search method table of `klass’
    1. If method `body’ is found, return `body’
    2. `klass’ = super class of `klass’ and repeat it
  2. If no method is found, exceptional flow
    • In Ruby language, `method_missing’ will be called
[Diagram: class hierarchy with BasicObject, Kernel, Object, C1 and C2; each class has a method table of “selector: body” entries]
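The search loop above can be sketched in Ruby itself. This is only a model: the VM does this in C, and `Greeter`, `C1`, `C2` and `search_method` are hypothetical names. Singleton classes are ignored.

```ruby
module Greeter
  def hello; "hello from Greeter"; end
end

class C1
  include Greeter
  def foo; "foo from C1"; end
end

class C2 < C1
end

# Walk the ancestor chain (klass, then superclasses and included modules)
# and return the first method body found in a class's own method table.
def search_method(klass, selector)
  klass.ancestors.each do |mod|
    own = mod.instance_methods(false) + mod.private_instance_methods(false)
    return mod.instance_method(selector) if own.include?(selector)
  end
  nil # not found: the real VM would go on to call method_missing
end

p search_method(C2, :foo).owner    # found in C1's table
p search_method(C2, :hello).owner  # found in the included module
p search_method(C2, :puts).owner   # found much further up, in Kernel
p search_method(C2, :no_such)      # nil
```

The deeper the method sits in the hierarchy, the more tables are visited, which is the overhead that caching later removes.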
![Page 47: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/47.jpg)
Overview
Checking arity and visibility
• Checking arity
  • Compare the given argument count with the expected argument count
• Checking visibility
  • In Ruby language, there are three visibilities
  • Can you explain each of them? :-p
    • public
    • private
    • protected
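For readers who want to answer the quiz, a minimal sketch of the three visibilities. The `Account` class is hypothetical.

```ruby
class Account
  def initialize(balance)
    @balance = balance
  end

  # public: callable with any receiver.
  def richer_than?(other)
    balance > other.balance   # protected: explicit receiver of the same class is allowed
  end

  def report
    "balance: #{formatted}"   # private: called with an implicit receiver
  end

  protected

  def balance
    @balance
  end

  private

  def formatted
    format("%.2f", @balance)
  end
end

a = Account.new(100)
b = Account.new(50)

p a.richer_than?(b)
p a.report

# From outside the class, both non-public methods are rejected.
[:balance, :formatted].each do |name|
  begin
    a.public_send(name)
  rescue NoMethodError => e
    puts "#{name}: #{e.class}"
  end
end
```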
![Page 48: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/48.jpg)
Overview
Building `local environment’
• How to maintain local variables?
→ Prepare `local variables space’ in stack
→ `local environment’ (`env’ for short)
• Parameters are also in `env’
![Page 49: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/49.jpg)
Method dispatch
Overview (again)
1. Get class of `recv’ (`klass’)
2. Search method `body’ named `selector’ from `klass’
  • Method is not fixed at compile time
  • “Dynamic” method dispatch
3. Dispatch method with `body’
  1. Check visibility
  2. Check arity (expected args # and given args #)
  3. Store `PC’ and `SP’ to continue after method returning
  4. Build `local environment’
  5. Set program counter
4. And continue VM execution
It seems very easy and simple!
and slow...
![Page 50: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/50.jpg)
Method dispatch
• Quiz: How many steps in Ruby’s method dispatch?
  • Hint: More complex than the overview I explained
  ① 8 steps
  ② 12 steps
  ③ 16 steps
  ④ 20 steps
Answer is
About ④ 20 steps
![Page 51: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/51.jpg)
Method dispatch
Ruby’s case
1. Check caller’s arguments
  1. Check splat (*args)
  2. Check block (given at compile time or by block parameter (&block))
2. Get class of `recv’ (`klass’)
3. Search method `body’ named `selector’ from `klass’
  • Method is not fixed at compile time
  • “Dynamic” method dispatch
4. Dispatch method with `body’
  1. Check visibility
  2. Check arity (expected args # and given args #) and process
    1. Post arguments
    2. Optional arguments
    3. Rest argument
    4. Keyword arguments
    5. Block argument
  3. Push new control frame
    1. Store `PC’ and `SP’ to continue after method returning
    2. Store `block information’
    3. Store `defined class’
    4. Store bytecode info (iseq)
    5. Store recv as self
  4. Build `local environment’
  5. Initialize local variables by `nil’
  6. Set program counter
5. And continue VM execution
... simple?
(*) Underlined items are additional processes
![Page 52: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/52.jpg)
Ruby’s case
Complex parameter checking
• “def foo(m1, m2, o1=..., o2=..., *rest, p1, p2, &block)”
  • m1, m2: mandatory parameter
  • o1, o2: optional parameter
  • rest: rest parameter
  • p1, p2: post parameter
  • block: block parameter
• From Ruby 2.0, keyword parameters are supported
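A runnable sketch of how these parameter kinds are filled, assuming Ruby 2.1 or later. Note that Ruby requires the rest parameter to come before the post parameters. The default values and the keyword `k1` are hypothetical.

```ruby
# Hypothetical method using every parameter kind named above, plus a keyword.
def foo(m1, m2, o1 = :o1, o2 = :o2, *rest, p1, p2, k1: :k1, &block)
  { m: [m1, m2], o: [o1, o2], rest: rest, p: [p1, p2], k1: k1, block: !block.nil? }
end

# Method#parameters reports the kind of each parameter.
params = method(:foo).parameters
p params

# Four arguments: only the mandatory and post parameters are filled.
four = foo(1, 2, 3, 4)
p four

# Seven arguments: optionals are filled next, the remainder goes to rest.
seven = foo(1, 2, 3, 4, 5, 6, 7, k1: 8) { }
p seven
```

Every call has to work out this assignment, which is part of why the arity check is more than a simple count comparison.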
![Page 53: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/53.jpg)
Method dispatch
1. Check caller’s arguments
  1. Check splat (*args)
  2. Check block (given at compile time or by block parameter (&block))
2. Get class of `recv’ (`klass’)
3. Search method `body’ named `selector’ from `klass’
  • Method is not fixed at compile time
  • “Dynamic” method dispatch
4. Dispatch method with `body’
  1. Check visibility
  2. Check arity (expected args # and given args #) and process
    1. Post arguments
    2. Optional arguments
    3. Rest argument
    4. Keyword arguments
    5. Block argument
  3. Push new control frame
    1. Store `PC’ and `SP’ to continue after method returning
    2. Store `block information’
    3. Store `defined class’
    4. Store bytecode info (iseq)
    5. Store recv as self
  4. Build `local environment’
  5. Initialize local variables by `nil’
  6. Set program counter
5. And continue VM execution
Complex
and
Slow!!!
![Page 54: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/54.jpg)
Method dispatch
Overhead
OS: Linux 2.6.31 32-bit
CPU: Intel Core2 Quad 2.66GHz
Mem: 4GB
C Compiler: GCC 4.4.1, -O3
Profiled by Oprofile
ruby 1.9.3dev (2010-05-26)
Profiled by Mr. Shiba
[Pie charts: execution time breakdown for the Pentomino and Fib benchmarks; categories: VM, Obj, Insn, Method, Block, Inst, Compile, GC, MM, Cfunc, NotRuby, Others]
Method dispatch overhead is big, especially on micro-benchmarks
![Page 55: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/55.jpg)
Speedup techniques
for method dispatch
1. Specialized instructions
2. Method caching
3. Caching checking results
4. Special path for `send’ and `method_missing’
![Page 56: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/56.jpg)
Optimization
Specialized instruction (from Ruby 1.9.0)
• Make special VM instructions for several methods
  • +, -, *, /, ...
# Pseudocode
def opt_plus(recv, obj)
  if recv.is_a?(Fixnum) and obj.is_a?(Fixnum) and Fixnum#+ is not redefined
    return Fixnum.plus(recv, obj)
  else
    return recv.send(:+, obj) # not prepared
  end
end
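On MRI, the specialized instruction can be seen by disassembling a small snippet with `RubyVM::InstructionSequence`. This API is specific to MRI (CRuby), and the exact disassembly text varies between versions, so treat this as a sketch.

```ruby
# RubyVM::InstructionSequence is specific to MRI (CRuby).
iseq = RubyVM::InstructionSequence.compile("a = 1; b = 2; a + b")
disasm = iseq.disasm
puts disasm

# On MRI the `+` call is expected to appear as a specialized instruction
# instead of a generic method call.
puts "uses opt_plus: #{disasm.include?('opt_plus')}"
```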
![Page 57: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/57.jpg)
Optimization
Method caching (from Ruby 1.9.0)
• Eliminate method search overhead
  • Reuse search result
  • Invalidate cache entry with VM stat
• Two level method caching
  • Inline method caching
  • Global method caching
[Diagram: the inline cache (1 element per call-site, class => body) is checked first; on a miss, the global cache (hash table of “class, id => body” entries) is checked; on a miss there, a naive method search walks the class hierarchy (C2, C1, Object, Kernel, BasicObject). The result is returned and filled into the caches.]
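The two cache levels can be modeled in a few lines of Ruby. This is a toy sketch, not MRI's implementation: `ToyVM`, `CallSite` and the single state counter are hypothetical, and singleton classes are ignored.

```ruby
module ToyVM
  @state = 0            # bumped whenever a method may have been (re)defined
  @global_cache = {}    # [class, id] => method body
  @searches = 0         # how many full method searches were needed

  class << self
    attr_reader :state, :searches

    def invalidate!
      @state += 1
      @global_cache.clear
    end

    # Global cache: one hash table shared by all call sites.
    def global_lookup(klass, id)
      @global_cache.fetch([klass, id]) do
        @searches += 1
        @global_cache[[klass, id]] = klass.instance_method(id) # full search
      end
    end
  end
end

# Inline cache: one entry per call site.
class CallSite
  attr_reader :hits, :misses

  def initialize(id)
    @id = id
    @klass = @body = @state = nil
    @hits = @misses = 0
  end

  def call(recv, *args)
    klass = recv.class
    if @klass.equal?(klass) && @state == ToyVM.state
      @hits += 1                               # inline cache hit
    else
      @misses += 1                             # miss: ask the global cache
      @body = ToyVM.global_lookup(klass, @id)
      @klass = klass
      @state = ToyVM.state
    end
    @body.bind(recv).call(*args)
  end
end

site_a = CallSite.new(:upcase)
site_b = CallSite.new(:upcase)
3.times { site_a.call("abc") }   # 1 miss, then 2 inline hits
site_b.call("xyz")               # inline miss, but global cache hit
ToyVM.invalidate!                # e.g. after a method definition
site_a.call("abc")               # state changed: miss and search again
p [site_a.hits, site_a.misses, site_b.misses, ToyVM.searches]
```

A single global state counter is the simplest invalidation scheme: any method definition flushes every cache entry.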
![Page 58: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/58.jpg)
Optimization
Caching checking results (from 2.0.0)
• Idea: Visibility and arity check can be skipped after first checking
  • Store result in inline method cache
1. Check caller’s arguments
2. Search method `body’ named `selector’ from `klass’
3. Dispatch method with `body’
  1. Check visibility and arity
    1. Cache result into inline method cache
  2. Push new control frame
  3. Build `local environment’
  4. Initialize local variables by `nil’
[Diagram labels: “First time” and “Second time” brackets over the steps above]
![Page 59: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/59.jpg)
Evaluation result
Micro benchmarks
[Bar chart: speedup ratio (axis 0 to 2), series trunk 2012/10/13 and trunk 2012/10/31, for the benchmarks vm1_attr_ivar*, vm1_attr_ivar_set*, vm1_block*, vm1_simplereturn*, vm1_yield*, vm2_defined_method*, vm2_method*, vm2_method_missing*, vm2_method_with_block*, vm2_poly_method*, vm2_send*, vm2_super*, vm2_zsuper*]
Faster than first date
![Page 60: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/60.jpg)
Case study
Faster keyword parameters
![Page 61: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/61.jpg)
Keyword parameters from Ruby 2.0
# def with keywords
def foo(a, b, key1: 1, key2: 2)
…
end
# call with keywords
foo(1, 2, key1: 123, key2: 456)
![Page 62: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/62.jpg)
Slow keyword parameters
[Bar chart: execution time (sec, axis 0 to 20), repeat 10M times: foo6(1, 2, 3, 4, 5, 6) vs. foo_kw6(k1: 1, k2: 2, k3: 3, k4: 4, k5: 5, k6: 6)]
Evaluation on Ruby 2.1
x30 slower
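A comparison like the one in the chart can be reproduced with the standard `benchmark` library. This is a sketch: the method bodies are hypothetical (the slide shows only the call sites), the iteration count is reduced, and the ratio depends heavily on the Ruby version.

```ruby
require "benchmark"

# Hypothetical method bodies; the slide only shows how they are called.
def foo6(a, b, c, d, e, f)
  a + b + c + d + e + f
end

def foo_kw6(k1: 0, k2: 0, k3: 0, k4: 0, k5: 0, k6: 0)
  k1 + k2 + k3 + k4 + k5 + k6
end

iterations = 1_000_000 # the slide repeats 10M times; reduced to keep the run short

Benchmark.bm(8) do |x|
  x.report("foo6")    { iterations.times { foo6(1, 2, 3, 4, 5, 6) } }
  x.report("foo_kw6") { iterations.times { foo_kw6(k1: 1, k2: 2, k3: 3, k4: 4, k5: 5, k6: 6) } }
end
```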
![Page 63: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/63.jpg)
Why slow, compared with normal parameters?
Keyword parameters:
def foo(k1: v1, k2: v2)
…
end
foo(k1: 1, k2: 2)
Same work as passing a Hash:
def foo(h = {})
k1 = h.fetch(:k1, v1)
k2 = h.fetch(:k2, v2)
…
end
foo( {k1: 1, k2: 2} )
Cost in both cases:
1. Hash creation
2. Hash access
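The correspondence between the two forms can be checked with a small runnable sketch. The method names `foo_kw` and `foo_hash` and the default values 10 and 20 are placeholders standing in for the slide's `v1` and `v2`.

```ruby
# Keyword-parameter version (Ruby 2.0+).
def foo_kw(k1: 10, k2: 20)
  [k1, k2]
end

# Hand-written equivalent: the caller builds a Hash (hash creation),
# and the callee looks each key up (hash access).
def foo_hash(h = {})
  k1 = h.fetch(:k1, 10)
  k2 = h.fetch(:k2, 20)
  [k1, k2]
end

p foo_kw(k1: 1, k2: 2)        # explicit keywords
p foo_hash({ k1: 1, k2: 2 })  # explicit Hash
p foo_kw(k1: 1)               # k2 falls back to its default
p foo_hash({ k1: 1 })
```

Both forms return the same values for the same inputs; the difference the slide points at is only in how much work the call itself does.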
![Page 64: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/64.jpg)
Optimization technique of keyword parameters from Ruby 2.2
•Key technique
→ Pass “a keyword list” instead of a Hash object
See “Evolution of Keyword parameters” at RubyConf Portugal '15: http://www.atdot.net/~ko1/activities/2015_RubyConfPortgual.pdf
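One way to observe the effect of not passing a Hash object is to count allocations around each kind of call. This is a rough sketch, assuming CRuby 2.2 or later (where `GC.stat(:total_allocated_objects)` is available); the helper `allocations` and the methods `foo_kw` and `foo_hash` are hypothetical names for illustration.

```ruby
def foo_kw(k1: 1, k2: 2)
  k1 + k2
end

def foo_hash(h = {})
  h.fetch(:k1, 1) + h.fetch(:k2, 2)
end

# Count objects allocated while the block runs (CRuby-specific counter).
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

N = 100_000
kw_allocs   = allocations { N.times { foo_kw(k1: 1, k2: 2) } }
hash_allocs = allocations { N.times { foo_hash({ k1: 1, k2: 2 }) } }

puts "keyword call allocations: #{kw_allocs}"
puts "hash call allocations:    #{hash_allocs}"
```

The explicit Hash version allocates at least one object per call. On Ruby 2.2 or later the keyword version would be expected to allocate far fewer, but the exact numbers depend on the Ruby version and build.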
![Page 65: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/65.jpg)
Result: Fast keyword parameters (Ruby 2.2.0)
Ruby 2.2 optimizes method dispatch with keyword parameters
[Chart: execution time (sec, axis 0 to 20), repeat 10M times, Ruby 2.1 vs Ruby 2.2]
•foo6(1, 2, 3, 4, 5, 6)
•foo_kw6(k1: 1, k2: 2, k3: 3, k4: 4, k5: 5, k6: 6)
→ x14 faster!! (best case)
But still 2 times slower compared with normal dispatch
![Page 66: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/66.jpg)
Case study: Garbage collection
http://www.flickr.com/photos/circasassy/6817999189/
![Page 67: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/67.jpg)
Ruby’s components from core developer’s perspective
Ruby script → Parse → Compile → Ruby Bytecode → Interpret on RubyVM
•Evaluator
•Object management (GC)
•Threading
•Embedded classes and methods (Array, String, …)
•Bundled Libraries
•Gem Libraries
[Diagram highlights “Ko1’s area”.]
![Page 68: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/68.jpg)
Automatic memory management: Basic concept
• Garbage collector recycles “unused” objects automatically
![Page 69: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/69.jpg)
Mark & Sweep algorithm
1. Mark reachable objects from root objects
2. Sweep unmarked objects (collection and de-allocation)
→ Collect unreachable objects
[Diagram: objects traversed from the root objects are labeled “marked”; the remaining objects are labeled “free”.]
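The two steps can be sketched on a toy object graph. This is an illustration of the algorithm only, not how MRI implements it; `Node`, `mark` and `sweep` are hypothetical names.

```ruby
# A toy heap object: a name, outgoing references, and a mark bit.
Node = Struct.new(:name, :refs, :marked)

# Step 1: mark everything reachable from the root objects.
def mark(roots)
  stack = roots.dup
  until stack.empty?
    node = stack.pop
    next if node.marked          # already visited (also handles cycles)
    node.marked = true
    stack.concat(node.refs)
  end
end

# Step 2: sweep. Unmarked objects are freed; mark bits of survivors
# are cleared so the next GC cycle starts fresh.
def sweep(heap)
  live, freed = heap.partition(&:marked)
  live.each { |node| node.marked = false }
  [live, freed]
end

a, b, c, d, e = %w[a b c d e].map { |name| Node.new(name, [], false) }
a.refs << b
b.refs << c
c.refs << a   # a cycle among reachable objects
e.refs << d   # d is referenced, but only from unreachable e

heap = [a, b, c, d, e]
mark([a])     # a is the only root object
live, freed = sweep(heap)

puts "live:  #{live.map(&:name).join(', ')}"
puts "freed: #{freed.map(&:name).join(', ')}"
```

Note that `d` is freed even though `e` references it: reachability from the roots, not the mere existence of a reference, decides what survives.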
![Page 70: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/70.jpg)
Generational GC (GenGC) from Ruby 2.1.0
•Weak generational hypothesis: “Most objects die young”
→ Concentrate reclamation effort only on the young objects
http://www.flickr.com/photos/ell-r-brown/5026593710
![Page 71: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/71.jpg)
Generational hypothesis
Object lifetime in RDoc (How many GCs does an object survive?)
[Chart: percentage of dead objects (y-axis, 0 to 100) by lifetime, measured as surviving GC count (x-axis, 0 to 102)]
95% of objects are dead by the first GC
![Page 72: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/72.jpg)
Generational GC (GenGC)
•Separate young generation and old generation
•Create objects as young generation
•Promote to old generation after surviving n-th GC
•Usually, GC on young space (minor GC)
•GC on both spaces if no memory (major/full GC)
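On CRuby 2.1 or later, the split between minor and major GCs can be observed from Ruby code through `GC.stat` and the `full_mark:` option of `GC.start`. A minimal sketch, assuming CRuby (other implementations may not provide these keys):

```ruby
before = GC.stat

GC.start(full_mark: false)   # request a minor GC (young space only)
after_minor = GC.stat

GC.start                     # default is a full mark, i.e. a major GC
after_major = GC.stat

puts "total GC count: #{before[:count]} -> #{after_major[:count]}"
puts "minor GC count: #{before[:minor_gc_count]} -> #{after_major[:minor_gc_count]}"
puts "major GC count: #{before[:major_gc_count]} -> #{after_major[:major_gc_count]}"
```

A requested minor GC may still be upgraded to a major one by the interpreter, so the minor counter is not guaranteed to be the one that moves after the first call.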
![Page 73: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/73.jpg)
GenGC [Minor M&S GC] (1/2)
• Mark reachable objects from root objects.
• Mark and promote to old generation
• Stop traversing after old objects
→ Reduce mark overhead
• Sweep not (marked or old) objects
• Can’t collect some unreachable objects
Don’t collect old objects even if they are unreachable.
[Diagram, 1st minor GC: objects traversed from the root objects change from “new” to “old”; objects labeled “new/free” are collected; the object labeled “old/free” is not.]
![Page 74: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/74.jpg)
GenGC [Minor M&S GC] (2/2)
• Mark reachable objects from root objects.
• Mark and promote to old generation
• Stop traversing after old objects
→ Reduce mark overhead
• Sweep not (marked or old) objects
• Can’t collect some unreachable objects
Don’t collect old objects even if they are unreachable.
[Diagram, 2nd minor GC: traversal from the root objects stops at old objects, and references beyond them are labeled “ignore”; objects labeled “new/free” are collected; the object labeled “old/free” is not.]
![Page 75: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/75.jpg)
GenGC [Major M&S GC]
• Normal M&S
• Mark reachable objects from root objects
• Mark and promote to old gen
• Sweep unmarked objects
• Sweep all unreachable (unused) objects
[Diagram: all reachable objects, new and old, are traversed from the root objects; both “new/free” and “old/free” objects are collected.]
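The minor and major passes described on the preceding slides can be simulated on a toy heap. This is a sketch under simplifying assumptions, not MRI's implementation: objects are promoted after surviving a single GC (n = 1), and write barriers and remember sets are left out, so the example never makes an old object point to a new one. `GenObj` and `ToyGenHeap` are hypothetical names.

```ruby
GenObj = Struct.new(:name, :refs, :old, :marked)

class ToyGenHeap
  attr_reader :objects

  def initialize
    @objects = []
  end

  # New objects start in the young generation.
  def alloc(name)
    obj = GenObj.new(name, [], false, false)
    @objects << obj
    obj
  end

  # Minor GC: stop traversing at old objects, never sweep old objects.
  def minor_gc(roots)
    collect(roots, major: false)
  end

  # Major GC: normal mark & sweep over both generations.
  def major_gc(roots)
    collect(roots, major: true)
  end

  private

  # Returns the names of the freed objects.
  def collect(roots, major:)
    stack = roots.dup
    until stack.empty?
      obj = stack.pop
      next if obj.marked
      next if obj.old && !major   # minor GC ignores old objects
      obj.marked = true
      obj.old = true              # mark and promote to old generation
      stack.concat(obj.refs)
    end
    # Minor: sweep not (marked or old). Major: sweep all unmarked.
    freed, @objects = @objects.partition { |o| !o.marked && (major || !o.old) }
    @objects.each { |o| o.marked = false }
    freed.map(&:name)
  end
end

heap = ToyGenHeap.new
a = heap.alloc("a")
b = heap.alloc("b")
c = heap.alloc("c")
heap.alloc("tmp")            # never referenced
a.refs << b
b.refs << c

first_minor = heap.minor_gc([a])    # frees tmp; a, b, c become old

a.refs.delete(b)             # b and c are now unreachable, but old
heap.alloc("tmp2")           # another short-lived object

second_minor = heap.minor_gc([a])   # frees tmp2 only
major = heap.major_gc([a])          # finally frees b and c

puts "1st minor GC freed: #{first_minor.inspect}"
puts "2nd minor GC freed: #{second_minor.inspect}"
puts "major GC freed:     #{major.inspect}"
```

The second minor GC leaves `b` and `c` in place even though nothing reaches them, which is the "can't collect some unreachable objects" case; only the major GC reclaims them.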
![Page 76: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/76.jpg)
RGenGC from Ruby 2.1.0: Performance evaluation (RDoc)
[Chart: accumulated execution time (sec, axis 0 to 14), w/o RGenGC vs RGenGC, for the bar groups “Total mark time (ms)” and “Total sweep time (sec)”]
About x15 speedup!
* Disabled lazy sweep to measure correctly.
![Page 77: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/77.jpg)
RGenGC from Ruby 2.1.0: Performance evaluation (RDoc)
Total execution time (sec):
•w/o RGenGC: other than GC 103.7627479, GC 16.04393815
•RGenGC: other than GC 102.3799865, GC 4.946003494
* 12% improvement comparing w/ and w/o RGenGC
* Disabled lazy sweep to measure correctly.
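The chart's data labels can be turned into totals with a few lines of arithmetic. This sketch assumes that the fused label in the extracted text reads as 16.04393815 and 4.946003494, and that the first pair of numbers is the “other than GC” series while the second pair is the “GC” series; both are readings of the garbled chart text, not something the slide states explicitly.

```ruby
# Times in seconds, as read from the chart labels (see assumptions above).
without_rgengc = { other: 103.7627479, gc: 16.04393815 }
with_rgengc    = { other: 102.3799865, gc: 4.946003494 }

total_without = without_rgengc[:other] + without_rgengc[:gc]
total_with    = with_rgengc[:other] + with_rgengc[:gc]

total_ratio = total_without / total_with
gc_ratio    = without_rgengc[:gc] / with_rgengc[:gc]

puts format("total w/o RGenGC: %.2f sec", total_without)
puts format("total w/ RGenGC:  %.2f sec", total_with)
puts format("total speedup:    x%.3f", total_ratio)
puts format("GC time speedup:  x%.2f", gc_ratio)
```

Under these assumptions the whole-program speedup comes out a little under 1.12, in line with the roughly 12% improvement quoted on the slide, while time spent in GC itself shrinks by more than a factor of three.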
![Page 78: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/78.jpg)
Summary
![Page 79: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/79.jpg)
Summary: Repeating “Basic flow” is my daily job
1. Observe Ruby interpreter
2. Make an assumption about the reason for slowness
3. Consider ideas to overcome it
4. Implement ideas
5. Measure the result
•Bad/same performance → Goto 4, 3, 2 or 1
•Good performance! → Commit it.
![Page 80: Performance in the details: A way to make faster Rubyko1/activities/2015_RailsClub2015_pub.pdf · RailsClub 2015. A way to make faster Ruby The only way I can find is: Repeating a](https://reader034.vdocuments.us/reader034/viewer/2022050419/5f8eb1e11d6eac7e154eed7e/html5/thumbnails/80.jpg)
Summary
Ruby/MRI is getting better and better.