Tales About Scala Performance

© Copyright Performize-IT LTD. Tales About Scala Performance

Upload: haim-yadid

Post on 26-Jan-2015

DESCRIPTION

A session given at the Scalapeño 2013 conference.

TRANSCRIPT

Page 1: Tales About Scala Performance

Tales About Scala Performance

Page 2: Tales About Scala Performance

About Me

My name: Haim Yadid
Hard to pronounce. Luckily, it is meaningful:

Haim => Life
Yadid => Friend

Hybrid nick: lifey

:: ::::::this :: Nil::

Page 3: Tales About Scala Performance

Performize-IT

Page 4: Tales About Scala Performance

Performize-IT

Optimizing Software since 2007

Performance Bottlenecks

Crashes

GC Tuning

Training & Mentoring

OutOfMemory

Concurrency

Page 5: Tales About Scala Performance

Contact Me

http://il.linkedin.com/in/haimyadid

[email protected]

www.performize-it.com

blog.performize-it.com

https://github.com/lifey

@lifeyx

Page 6: Tales About Scala Performance

Once Upon A Time

Page 7: Tales About Scala Performance

Benchmarks by Google

So we are done

Page 8: Tales About Scala Performance

So what is this talk about?

Best practices?

Micro benchmarks?

Understanding

Page 9: Tales About Scala Performance

Understand

How to find performance problems
How to solve them
How to reach a well-performing production system

Prerequisites:
Familiarity with the JVM
Basic knowledge of Scala

Page 10: Tales About Scala Performance

Performance is all about

Methodology
Monitoring
Hotspots
Isolation
Analysis
Solution

Tools are your Best Friends for this task

Page 11: Tales About Scala Performance

Scala Runs on the JVM

All JVM capabilities and tools still apply
Take your best friends with you

Page 12: Tales About Scala Performance

Premature Optimization

I shall not optimize prematurely

Page 13: Tales About Scala Performance

Monitoring the JVM

Java Management Extensions (JMX)
On the same machine (attach)
Remotely, via command-line parameters

Tools:
JConsole
JVisualVM
Mission Control
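The data these tools display comes from the platform MXBeans, which can also be read from inside the running JVM. A minimal sketch, assuming only the standard `java.lang.management` API; the object name `JmxPeek` is made up for illustration and is not part of the talk:

```scala
import java.lang.management.ManagementFactory

// Hypothetical helper (not from the talk): reads a few standard platform MXBeans.
object JmxPeek {
  // Heap bytes currently in use, as reported by the Memory MXBean.
  def usedHeapBytes: Long =
    ManagementFactory.getMemoryMXBean.getHeapMemoryUsage.getUsed

  // Number of live threads, as reported by the Threading MXBean.
  def liveThreads: Int =
    ManagementFactory.getThreadMXBean.getThreadCount

  def main(args: Array[String]): Unit = {
    val runtime = ManagementFactory.getRuntimeMXBean
    println(s"JVM: ${runtime.getVmName} ${runtime.getVmVersion}")
    println(s"Used heap: ${usedHeapBytes / (1024 * 1024)} MB")
    println(s"Live threads: $liveThreads")
  }
}
```

JConsole and JVisualVM read the same beans over a JMX connection, either by attaching locally or through the remote port.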

Page 14: Tales About Scala Performance

Remote Monitoring - JMX

Add parameters to the command line of the profiled app:
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=<port>
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false

Authentication and security are recommended; refer to
http://java.sun.com/j2se/1.5.0/docs/guide/management/agent.html

Production

Page 15: Tales About Scala Performance

A Tale about a Stack

Page 16: Tales About Scala Performance

Your First Scala Function

Functional programming => recursion
Easy to understand
Probably your first program in Scala will look like:

def sumOfSquares(st: Int, end: Int): Int = {
  // A recursive method needs an explicit result type.
  // Not a tail call: the addition runs after the recursive call returns.
  if (st > end) 0 else st * st + sumOfSquares(st + 1, end)
}

Page 17: Tales About Scala Performance

And your first exception will be:

java.lang.StackOverflowError
    at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:8)
    at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:9)
    at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:9)
    at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:9)
    at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:9)
    at com.performizeit.scalapeno.demos.TailRecusionTale$.calculateSumOfSquares(TailRecusionTale.scala:9)

Page 18: Tales About Scala Performance

Tail Recursion

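The recursive call must be the last action: its result is the value returned

Not tail recursive, because the multiplication runs after the recursive call returns:

if (number == 1) 1 else number * factorial(number - 1)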
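For contrast, the same computation can be put in tail position by carrying the running product in an accumulator. This is a sketch, not the speaker's code; the object name `FactorialDemo`, the `acc` parameter and the use of `BigInt` are illustrative assumptions:

```scala
import scala.annotation.tailrec

// Hypothetical example (not from the talk).
object FactorialDemo {
  // The recursive call is the last action, so the compiler can turn it into a loop.
  // `acc` holds the product of the numbers seen so far.
  @tailrec
  def factorial(number: Int, acc: BigInt = BigInt(1)): BigInt =
    if (number <= 1) acc else factorial(number - 1, acc * number)

  def main(args: Array[String]): Unit =
    println(factorial(5))
}
```

Because the call compiles to a loop, the depth of the recursion no longer depends on the thread's stack size.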

Page 19: Tales About Scala Performance

Favor tail recursion

The JVM does not optimize recursion
Meaning an extra call for every iteration
Limit on recursion depth
The Scala compiler can optimize tail recursion!

import scala.annotation.tailrec

// The accumulator `sum` carries the partial result, so the recursive call is in tail position.
@tailrec def sumOfSquares(st: Int, end: Int, sum: Int = 0): Int = {
  if (st > end) sum else sumOfSquares(st + 1, end, sum + st * st)
}

Page 20: Tales About Scala Performance

@tailrec Annotation

A compile-time directive
Fails compilation if the tail-recursion optimization cannot be applied
Use it whenever tail recursion is mandatory for performance or correctness
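A sketch of what the annotation buys you, assuming a range deep enough to exhaust a default-sized stack. The names `SumOfSquaresDemo`, `naive` and `looped` are made up here, and `Long` is used instead of the slide's `Int` to avoid arithmetic overflow on large ranges:

```scala
import scala.annotation.tailrec

// Hypothetical comparison (not from the talk).
object SumOfSquaresDemo {
  // Plain recursion: one stack frame per element of the range.
  // Adding @tailrec here would be a compile error, since the addition follows the call.
  def naive(st: Long, end: Long): Long =
    if (st > end) 0L else st * st + naive(st + 1, end)

  // Accumulator version: @tailrec makes the compiler verify it becomes a loop.
  @tailrec
  def looped(st: Long, end: Long, sum: Long = 0L): Long =
    if (st > end) sum else looped(st + 1, end, sum + st * st)

  def main(args: Array[String]): Unit = {
    val n = 1000000L
    println(looped(1L, n))
    try println(naive(1L, n))
    catch { case _: StackOverflowError => println("naive version ran out of stack") }
  }
}
```

With typical default stack sizes the naive call is expected to overflow at this depth, while the annotated version runs in constant stack space.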

Page 21: Tales About Scala Performance

Stack Size

Ranges from 256k to 1024k
Depends on platform and JVM version
What is it on your system?

java -XX:+PrintFlagsFinal -version |& grep ThreadStackSize

Tune the thread stack to your needs
Example: -Xss1312k
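Besides the global `-Xss` flag, the JDK lets a single thread request its own stack size through the four-argument `Thread` constructor. This is an alternative not shown in the talk; the sketch below uses made-up names (`StackSizeDemo`, `depthSum`, `runWithStack`), and the JDK documents the size as a hint that some platforms may ignore:

```scala
// Hypothetical example (not from the talk).
object StackSizeDemo {
  // Deliberately not tail recursive, so every level costs a stack frame.
  def depthSum(n: Int): Long = if (n == 0) 0L else n + depthSum(n - 1)

  // Runs depthSum on a dedicated thread that requests `stackBytes` of stack.
  // Returns None if that thread still ran out of stack.
  def runWithStack(n: Int, stackBytes: Long): Option[Long] = {
    var result: Option[Long] = None // safely visible after join()
    val body = new Runnable {
      def run(): Unit =
        try result = Some(depthSum(n))
        catch { case _: StackOverflowError => result = None }
    }
    val worker = new Thread(null, body, "deep-recursion", stackBytes)
    worker.start()
    worker.join()
    result
  }

  def main(args: Array[String]): Unit = {
    println(runWithStack(1000000, 16L * 1024))          // tiny stack request
    println(runWithStack(1000000, 512L * 1024 * 1024))  // generous stack request
  }
}
```

This keeps the large stack confined to the one thread that needs it instead of raising the size for every thread in the process.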

Production

Page 22: Tales About Scala Performance

Stacks in Scala

A Scala stack is just like a Java stack
jstack is your best friend
Scala terminology may be obscured
E.g. List will look like $colon$colon
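The obscured names follow the compiler's symbol encoding, which the standard library exposes through `scala.reflect.NameTransformer`. A small sketch for translating names seen in a stack trace; the object name `MangledNames` and its helpers are illustrative, not from the talk:

```scala
import scala.reflect.NameTransformer

// Hypothetical helper (not from the talk).
object MangledNames {
  // Scala source name -> JVM-level name, e.g. for searching a thread dump.
  def toJvmName(scalaName: String): String = NameTransformer.encode(scalaName)

  // JVM-level name -> Scala source name, e.g. for reading a thread dump.
  def toScalaName(jvmName: String): String = NameTransformer.decode(jvmName)

  def main(args: Array[String]): Unit = {
    // A non-empty List is an instance of the class named `::`.
    println(List(1).getClass.getName)
    println(toJvmName("::"))
    println(toScalaName("$colon$colon"))
  }
}
```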

Page 23: Tales About Scala Performance

JStack

Part of the JDK
Dumps stack traces of all live threads
Synopsis: jstack -l <pid>
Use when you want to:
Get a snapshot of program activity
Detect deadlocks
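When attaching jstack is not an option, a much-reduced version of the same information is available in-process through `Thread.getAllStackTraces` and the Threading MXBean. A sketch only; `MiniJstack` and its method names are made up for illustration:

```scala
import java.lang.management.ManagementFactory
import scala.collection.mutable.ArrayBuffer

// Hypothetical in-process stand-in for jstack (not from the talk).
object MiniJstack {
  // One header line per live thread, followed by its stack frames.
  def dump(): String = {
    val lines = ArrayBuffer.empty[String]
    val entries = Thread.getAllStackTraces.entrySet.iterator
    while (entries.hasNext) {
      val entry = entries.next()
      val thread = entry.getKey
      lines += ("\"" + thread.getName + "\" state=" + thread.getState)
      entry.getValue.foreach(frame => lines += ("    at " + frame))
    }
    lines.mkString("\n")
  }

  // Ids of deadlocked threads; empty when no deadlock is detected.
  def deadlockedThreadIds(): Seq[Long] =
    Option(ManagementFactory.getThreadMXBean.findDeadlockedThreads())
      .map(_.toSeq)
      .getOrElse(Seq.empty)

  def main(args: Array[String]): Unit = {
    println(dump())
    println(s"Deadlocked thread ids: ${deadlockedThreadIds()}")
  }
}
```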

Page 24: Tales About Scala Performance

Takipi’s Stackifier

www.stackifier.com

Page 25: Tales About Scala Performance

Humpty Dumpty sat on a heap,
Humpty Dumpty had an OutOfMemory flip.
All the king’s horses and all the king’s men
Couldn’t put Humpty together again

[Chart: heap, max vs. used]

Page 26: Tales About Scala Performance

In a Perfect World.....

Heap (or PermGen) is depleted
-XX:+HeapDumpOnOutOfMemoryError
Scala code does not have a larger memory footprint
Scala code may have a larger PermGen footprint
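A heap dump can also be requested on demand, without waiting for an OutOfMemoryError, through HotSpot's diagnostic MXBean. This is a sketch assuming a HotSpot-based JVM; the object name `HeapDumper` and the temporary file location are illustrative choices, not from the talk:

```scala
import java.lang.management.ManagementFactory
import java.nio.file.{Files, Path}
import com.sun.management.HotSpotDiagnosticMXBean

// Hypothetical helper (not from the talk); HotSpot-specific.
object HeapDumper {
  // Writes an .hprof file that a heap analyzer can open.
  // dumpHeap fails if the target file already exists.
  def dumpTo(file: Path, liveObjectsOnly: Boolean = true): Path = {
    val bean = ManagementFactory.getPlatformMXBean(classOf[HotSpotDiagnosticMXBean])
    bean.dumpHeap(file.toString, liveObjectsOnly)
    file
  }

  def main(args: Array[String]): Unit = {
    val target = Files.createTempDirectory("heapdump").resolve("demo.hprof")
    println(s"Heap dump written to ${dumpTo(target)}")
  }
}
```

Passing `liveObjectsOnly = true` dumps only reachable objects, which usually gives a smaller file.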

Production

Page 27: Tales About Scala Performance

MAT

MAT - Memory Analyzer Tool
A very powerful tool for analyzing heap dumps

Use it to investigate:
Memory leaks
OutOfMemory errors
Memory footprint

Alternatives:
YourKit / JProbe / JProfiler (commercial)
VisualVM (JDK)
JHat (JDK)

Page 28: Tales About Scala Performance

MAT-name-resolver

Add-on for MAT
Helps MAT understand Scala
Developed by Iulian Dragos from Typesafe
GitHub project: https://github.com/dragos/MAT-name-resolver

Page 29: Tales About Scala Performance

List[Int] ?

Page 30: Tales About Scala Performance

OutOfMemory Perm Space

Class bytecode resides in PermGen
Scala will use more PermGen space
You can write a small piece of code which will create a lot of bytecode

Page 31: Tales About Scala Performance

@ScalaSignature

@ScalaSignature(bytes="...
Metadata needed for:
Reflection
Compilation

Larger class files

Page 32: Tales About Scala Performance

More classes

Each closure is actually a JVM class
Implicit conversions are classes
Companion objects are also classes
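The classes behind these constructs can be observed at runtime by asking each value for its class. A small sketch with a made-up object name, `ClosureClasses`. With the Scala 2.10/2.11 compilers the closure is backed by a compiled `...$$anonfun$...` class, while Scala 2.12 and later compile closures through `invokedynamic`, so the printed name is a lambda class generated at runtime:

```scala
// Hypothetical example (not from the talk).
object ClosureClasses {
  val square: Int => Int = (x: Int) => x * x

  // Name of the JVM class that backs the closure; it varies by compiler version.
  def closureClassName: String = square.getClass.getName

  def main(args: Array[String]): Unit = {
    println(closureClassName)
    // A singleton object is compiled to a class whose name ends in '$'.
    println(ClosureClasses.getClass.getName)
  }
}
```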

Page 33: Tales About Scala Performance

Well

object ClosureExample extends App {
  val f = (x: Int) => x*x
  println(s"closure ${f(5)}")
}

ClosureExample.class

package com.performizeit.scalapeno.demos;

import scala.Function0;
import scala.Function1;
import scala.LowPriorityImplicits;
import scala.Predef.;
import scala.StringContext;
import scala.reflect.ScalaSignature;
import scala.runtime.AbstractFunction0;
import scala.runtime.BoxedUnit;
import scala.runtime.BoxesRunTime;

@ScalaSignature(bytes="\006\001\035:Q!\001\002\t\002-\tab\0217pgV\024X-\022=b[BdWM\003\002\004\t\005)A-Z7pg*\021QAB\001\ng\016\fG.\0319f]>T!a\002\005\002\031A,'OZ8s[&TX-\033;\013\003%\t1aY8n\007\001\001\"\001D\007\016\003\t1QA\004\002\t\002=\021ab\0217pgV\024X-\022=b[BdWmE\002\016!Y\001\"!\005\013\016\003IQ\021aE\001\006g\016\fG.Y\005\003+I\021a!\0218z%\0264\007CA\t\030\023\tA\"CA\002BaBDQAG\007\005\002m\ta\001P5oSRtD#A\006\t\017ui!\031!C\001=\005\ta-F\001 !\021\t\002E\t\022\n\005\005\022\"!\003$v]\016$\030n\03482!\t\t2%\003\002%%\t\031\021J\034;\t\r\031j\001\025!\003 \003\t1\007\005")
public final class ClosureExample
{
  public static void main(String[] paramArrayOfString) { ClosureExample..MODULE$.main(paramArrayOfString); }

  public static void delayedInit(Function0<BoxedUnit> paramFunction0) { ClosureExample..MODULE$.delayedInit(paramFunction0); }

  public static String[] args() { return ClosureExample..MODULE$.args(); }

  public static void scala$App$_setter_$executionStart_$eq(long paramLong) { ClosureExample..MODULE$.scala$App$_setter_$executionStart_$eq(paramLong); }

  public static long executionStart() { return ClosureExample..MODULE$.executionStart(); }

  public static Function1<Object, Object> f() { return ClosureExample..MODULE$.f(); }

  public static class delayedInit$body extends AbstractFunction0
  {
    private final ClosureExample. $outer;

    public final Object apply()
    {
      this.$outer.f_$eq(new ClosureExample..anonfun.1());
      Predef..MODULE$.println(new StringContext(Predef..MODULE$.wrapRefArray((Object[])new String[] { "closure ", "" })).s(Predef..MODULE$.genericWrapArray(new Object[] { BoxesRunTime.boxToInteger(this.$outer.f().apply$mcII$sp(5)) })));

      return BoxedUnit.UNIT;
    }

    public delayedInit$body(ClosureExample. $outer) { }
  }
}

ClosureExample$.class

package com.performizeit.scalapeno.demos;

import scala.App;
import scala.App.class;
import scala.DelayedInit;
import scala.Function0;
import scala.Function1;
import scala.Serializable;
import scala.collection.mutable.ListBuffer;
import scala.runtime.AbstractFunction1.mcII.sp;
import scala.runtime.BoxedUnit;

public final class ClosureExample$ implements App
{
  public static final MODULE$;
  private Function1<Object, Object> f;
  private final long executionStart;
  private String[] scala$App$$_args;
  private final ListBuffer<Function0<BoxedUnit>> scala$App$$initCode;

  static { new (); }

  public long executionStart() { return this.executionStart; }
  public String[] scala$App$$_args() { return this.scala$App$$_args; }
  public void scala$App$$_args_$eq(String[] x$1) { this.scala$App$$_args = x$1; }
  public ListBuffer<Function0<BoxedUnit>> scala$App$$initCode() { return this.scala$App$$initCode; }
  public void scala$App$_setter_$executionStart_$eq(long x$1) { this.executionStart = x$1; }
  public void scala$App$_setter_$scala$App$$initCode_$eq(ListBuffer x$1) { this.scala$App$$initCode = x$1; }
  public String[] args() { return App.class.args(this); }
  public void delayedInit(Function0<BoxedUnit> body) { App.class.delayedInit(this, body); }
  public void main(String[] args) { App.class.main(this, args); }
  public Function1<Object, Object> f() { return this.f; }
  public void f_$eq(Function1 x$1) { this.f = x$1; }

ClosureExample$$anonfun$1.class

package com.performizeit.scalapeno.demos;

import scala.Serializable;
import scala.runtime.AbstractFunction1.mcII.sp;

public final class ClosureExample$$anonfun$1 extends AbstractFunction1.mcII.sp implements Serializable
{
  public static final long serialVersionUID = 0L;

  public final int apply(int x) { return apply$mcII$sp(x); }
  public int apply$mcII$sp(int x) { return x * x; }
}

ClosureExample$delayedInit$body.class

package com.performizeit.scalapeno.demos;

import scala.Function1;
import scala.LowPriorityImplicits;
import scala.Predef.;
import scala.StringContext;
import scala.runtime.AbstractFunction0;
import scala.runtime.BoxedUnit;
import scala.runtime.BoxesRunTime;

public final class ClosureExample$delayedInit$body extends AbstractFunction0
{
  private final ClosureExample. $outer;

  public final Object apply()
  {
    this.$outer.f_$eq(new ClosureExample..anonfun.1());
    Predef..MODULE$.println(new StringContext(Predef..MODULE$.wrapRefArray((Object[])new String[] { "closure ", "" })).s(Predef..MODULE$.genericWrapArray(new Object[] { BoxesRunTime.boxToInteger(this.$outer.f().apply$mcII$sp(5)) })));

    return BoxedUnit.UNIT;
  }

  public ClosureExample$delayedInit$body(ClosureExample. $outer) {

Page 34: Tales About Scala Performance

@specialized

Generics are implemented by type erasure
For primitive types this means:
Boxing/unboxing
A performance hit
A large memory footprint

The @specialized annotation enables specialized implementations

Page 35: Tales About Scala Performance

What about code cache?

The code cache holds JIT-compiled native code
It should be large enough to hold all of it
If you need more PermGen, you may need more code cache
-XX:ReservedCodeCacheSize=<size>
Monitor it via JMX
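Monitoring via JMX can be sketched with the standard memory pool MXBeans. The object name `CodeCacheWatch` is made up, and matching pools by the substring "Code" is an assumption: HotSpot names the pool "Code Cache" on older JVMs and uses several "CodeHeap ..." pools on newer ones.

```scala
import java.lang.management.{ManagementFactory, MemoryType}

// Hypothetical helper (not from the talk).
object CodeCacheWatch {
  // (pool name, used bytes, max bytes or -1 if undefined) for every
  // non-heap memory pool whose name mentions "Code".
  def codePools(): Seq[(String, Long, Long)] = {
    val pools = ManagementFactory.getMemoryPoolMXBeans
    (0 until pools.size)
      .map(i => pools.get(i))
      .filter(p => p.getType == MemoryType.NON_HEAP && p.getName.contains("Code"))
      .map(p => (p.getName, p.getUsage.getUsed, p.getUsage.getMax))
  }

  def main(args: Array[String]): Unit =
    codePools().foreach { case (name, used, max) =>
      println(s"$name: used=$used max=$max")
    }
}
```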

Production

Page 36: Tales About Scala Performance

@specialized Nightmare

class SpecializeNightmare {
  trait S1[@specialized A, @specialized B] {
    def f(p1: A): Unit
  }
}

Generates 165 classes

Don’t try it with 3, 4, or 5 type parameters
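One way to keep the class count in check is to list the primitive types that actually matter, since `@specialized` accepts an explicit type list. A sketch with made-up names (`SpecializedDemo`, `Box`); with a Scala 2 compiler this should produce the generic class plus one variant each for `Int` and `Double`, rather than one per primitive type:

```scala
// Hypothetical example (not from the talk).
object SpecializedDemo {
  // Specialized only for Int and Double; all other types use the generic (boxed) version.
  class Box[@specialized(Int, Double) A](val value: A)

  def main(args: Array[String]): Unit = {
    val box = new Box(42)
    println(box.value)
    // With Scala 2 the runtime class is the Int variant, whose name typically ends in "$mcI$sp".
    println(box.getClass.getName)
  }
}
```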

Page 37: Tales About Scala Performance

OutOfMemory Perm Gen Space

Congrats, you have a PermGen OOM
-XX:MaxPermSize=1024m
(or -J-XX:MaxPermSize=1024m if you use the scala command line)

Production

Page 38: Tales About Scala Performance

Oh dear! Oh dear! I shall be too late!

Page 39: Tales About Scala Performance

-optimise

A scalac command-line parameter
Performs bytecode optimizations
Inlining, boxing/unboxing elimination, etc.
Improves performance
Slower compilation

Production

Page 40: Tales About Scala Performance

Inlining

The Scala compiler uses information it has at compile time to decide which methods can be inlined
It can do a better job than the JVM
Automatic when you compile with -optimise

Production

Page 41: Tales About Scala Performance

Inlining Visibility

At the Scala compiler level
Add -Ylog:inline to see what was inlined

scalac -optimise -Ylog:inline -d ../bin com/performizeit/scalapeno/demos/ClosureExampleInline.scala |& grep inlined

[log inliner] inlined com.performizeit.scalapeno.demos.ClosureExampleInline.<init> // 1 inlined: com.performizeit.scalapeno.demos.ClosureExampleInline.delayedInit
[log inliner] inlined com.performizeit.scalapeno.demos.ClosureExampleInline$$anonfun$f$1.apply // 1 inlined: com.performizeit.scalapeno.demos.anonfun$f$1.apply$mcII$sp

Page 42: Tales About Scala Performance

Inlining Visibility JVM

JIT compiler options
Not recommended for production

-XX:+PrintCompilation
-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining

! Prod

Page 43: Tales About Scala Performance

@inline

You may direct the compiler to inline a method
Usually you will not need it: the compiler will do it anyway
Or the JVM will do it anyway
No real need to clutter the code...

@inline final def f = (x: Int) => x*x

Page 44: Tales About Scala Performance

Member accessors

Get/Set
Getters for val fields
Getters and setters for var fields
Will you pay for this?

Nope!
The JVM inlines accessor methods (by default)
If you insist on the penalty: -XX:-UseFastAccessorMethods

Page 45: Tales About Scala Performance

Parallel Collections

ParArray
ParVector
mutable.ParHashMap
mutable.ParHashSet
immutable.ParHashMap
immutable.ParHashSet
ParRange
ParTrieMap

Page 46: Tales About Scala Performance

Parallel Collections

Apply only where a location is a hotspot
Very easy to use
Behind the scenes: the Fork/Join framework (Java 6)
Dangerous when the code:
Has side effects
Is non-associative

Easy to use:

val v = Vector(Range(0,10000000)).flatten
v.par.map(_ + 1)

Only when proven to improve
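Why non-associative operations are dangerous can be shown without any parallel machinery, by simulating the way a parallel reduction might split its input. This is an illustrative sketch; `AssociativityDemo`, `sequential` and `chunked` are made-up names and the two-way split is an assumption, not how the parallel collections actually partition work:

```scala
// Hypothetical simulation (not from the talk).
object AssociativityDemo {
  // Sequential left-to-right reduction.
  def sequential(xs: Vector[Int], op: (Int, Int) => Int): Int = xs.reduceLeft(op)

  // Simulated two-way split: reduce each half, then combine the partial results.
  def chunked(xs: Vector[Int], op: (Int, Int) => Int): Int = {
    val (left, right) = xs.splitAt(xs.length / 2)
    op(left.reduceLeft(op), right.reduceLeft(op))
  }

  def main(args: Array[String]): Unit = {
    val xs = (1 to 8).toVector
    println(sequential(xs, _ + _) == chunked(xs, _ + _)) // addition is associative
    println(sequential(xs, _ - _) == chunked(xs, _ - _)) // subtraction is not
  }
}
```

For addition both strategies agree, while for subtraction the grouping changes the result, so the outcome of a parallel run would depend on how the work happened to be split.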

Page 47: Tales About Scala Performance

Profiler - JVisualVM

Part of the JDK
A profiler
Use when you want to:
Identify hotspots
Analyze memory allocation bottlenecks

Alternatives:
YourKit (commercial)
JProbe (commercial)
JProfiler (commercial)

Page 48: Tales About Scala Performance

Sampling vs Instrumentation

Sampling - sample application threads and stack traces to get statistics
Instrumentation - modify bytecode to record times and invocation counts

Page 49: Tales About Scala Performance

Scala Stacks revisited

while (true) {
  var a = List(Range(0, 1000)).flatten
  // println(a)
  for (i <- 1 to 10) {
    a = a :+ i
    println(a.last)
  }
}

Page 50: Tales About Scala Performance

Garbage Collection

Page 51: Tales About Scala Performance

Immutability

Immutability may cause more object allocation
Not necessarily a performance hit:
Short-lived objects
GC handles them efficiently
Escape analysis

Parallelization!!!

Page 52: Tales About Scala Performance

VisualVM (allocation hotspots)

Find locations where:
Large amounts of bytes are being allocated
Large numbers of objects are being allocated

Page 53: Tales About Scala Performance

Large (im)mutable state

You have a huge graph which changes gradually
It eventually ends up in the Old Generation
A small change may have a huge impact on the state
That may screw up GC

Page 54: Tales About Scala Performance

GC Visibility

GC can be visualized partially through JMX
The best way to get the whole picture is through GC logs

-Xloggc:<log file name>
-XX:+PrintGCDetails -XX:+PrintGCDateStamps

Java 7 supports a “rolling appender”:
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=<#files> -XX:GCLogFileSize=<number>M
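The partial view available through JMX can be sketched with the standard garbage collector MXBeans, which expose only cumulative counts and times per collector. The object name `GcStats` is made up for illustration:

```scala
import java.lang.management.ManagementFactory

// Hypothetical helper (not from the talk).
object GcStats {
  // (collector name, collections so far, accumulated collection time in ms)
  def snapshot(): Seq[(String, Long, Long)] = {
    val collectors = ManagementFactory.getGarbageCollectorMXBeans
    (0 until collectors.size).map { i =>
      val gc = collectors.get(i)
      (gc.getName, gc.getCollectionCount, gc.getCollectionTime)
    }
  }

  def main(args: Array[String]): Unit =
    snapshot().foreach { case (name, count, millis) =>
      println(s"$name: $count collections, $millis ms in total")
    }
}
```

These totals show how often and for how long each collector ran, but not the per-collection detail that the GC log records.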

Prod

Page 55: Tales About Scala Performance

GCViewer

Analyzes GC logs
Use when:
You experience GC problems
Is GC efficient? (throughput)
Does GC stop the application? (pause time)

Alternatives:
Censum (commercial)

Page 56: Tales About Scala Performance

And They Lived Happily Ever After

Page 57: Tales About Scala Performance

slides /: (_ + _)

Don’t be afraid of Scala
You will be able to optimize large-scale apps
Optimize where needed
You need to (Java =>) Scala yourself
ATM: know Java to optimize Scala

Page 58: Tales About Scala Performance

Q&A

Page 59: Tales About Scala Performance

The End