bringing simd to the web via dart john...

46
Bringing SIMD to the web via Dart John McCutchan

Upload: others

Post on 20-Mar-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Bringing SIMD to the web via Dart

John McCutchan

Page 2: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

● inotify○ Linux kernel system for monitoring file systems for changes

● Port Bullet Physics to the PS3○ SPUs are fun

● Optimizer for PS3 and PS Vita○ Make games run faster

● PS4 CPU/GPU Expert○ Hardware architecture and algorithms

Biography

Page 3: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

● The Web○ Dart○ WebGL○ HTML5

Biography

Page 4: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

1. Structurea. Tool visible type systemb. Class based, object orientedc. Lexical this

d. ; required

2. Performancea. Dart is designed to run fast by being less permissiveb. New VM opens up new possibilities

i. SIMD

Page 5: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Some of the reasons...

Why can Dart run fast?

Page 6: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Static Object Shape

Shape of MyClass

function MyClass() {

this.a = 1;

this.b = "hey";

}

a

b

foo = new MyClass();

...

foo.c = 3.14159;

a

b

c

Shape of foo

Object Shape Change

Previous optimizations invalid.

Page 7: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Hole Free Arrays

bigData = [];bigData[0] = 0.0;bigData[1] = 1.0;...bigData[50] = 50.0;

0.0 1.0 ... 49.0 50.0

bigData[20000] = 22.0;

0.0 1.0

49.0

50.022.0

0 1

5020000

49 ...

Page 8: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Distinction between growable and fixed sized arrays

var growable = new List<double>(); // length == 0

var fixed = new List<double>(200);

for (int i = 0; i < fixed.length; i++) {// Safely query length only once

// Bounds check hoisted out of loop}

for (int i = 0; i < growable.length; i++) {// May have to query length many times

// Bounds check inside the loop}

Page 9: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

No prototype chain

foo = new SomeClass();

foo.someFunction();

SomeClass.prototype

ParentClass.prototype

GrandParentClass.prototype

Respond to someFunction?

Respond to someFunction?

Respond to someFunction?

Page 10: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Distinction between integer and double numbers

● JavaScript only has double○ Double arithmetic slower than Integer arithmetic

■ For mobile processors difference is greater

● Dart has both double and integer○ Gives choice to developer

Double Integer Double slowdown

Multiply 6 2 3x

Addition 4 1 4x

Load 2 2 N/A

Store 2 2 N/A

http://infocenter.arm.com/help/index.jsp - Cortex A9 CPU

Page 11: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

... and why does it matter?

What is SIMD?

Page 12: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Single Instruction Single Data (SISD)

What is SIMD?

1.0 2.0 3.0

Page 13: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

What is SIMD?

Single Instruction Multiple Data (SIMD)

1.0

3.0

5.0

7.0

2.0

4.0

6.0

8.0

3.0

7.0

11.0

15.0

Vector Processor

Page 14: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Why does SIMD matter?

● SIMD can provide substantial speedup to:○ 3D Graphics○ 3D Physics○ Image Processing○ Signal Processing○ Numerical Processing

Page 15: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Why does SIMD matter to the web?

● SIMD can provide substantial speedup to:○ WebGL○ Canvas○ Animation○ Games○ Physics

Page 16: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Why does SIMD matter to the web?

Console Games 1998

Web Games 2013

Page 17: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Why does SIMD matter?

Page 18: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Why does SIMD matter?

● SIMD requires fewer instructions to be executed○ Fewer instructions means longer battery life

VS

Page 19: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Why does SIMD matter?

● Mozilla is attempting to automatically use SIMD in IonMonkey VM○ Gaussian Blur sped up

■ https://bugzilla.mozilla.org/show_bug.cgi?id=832718

○ Based on pattern recognition■ Programs must be written to patterns detectable by VM

○ "Automatic Vectorization"■ Open research topic

Page 20: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

SIMD in Dart

Page 21: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

SIMD in Dart

● New types○ Float32x4○ Float32x4List○ Uint32x4

● Composable operations○ Arithmetic○ Logical○ Comparisons○ Reordering (shuffling)

4 Unsigned 32-bit Integer Numbers

List of Float32x4

4 IEEE-754 32-bit Floating Point Numbers

Page 22: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

SIMD in Dart

Float32x4● +● -● /● *● sqrt (square root)● reciprocal● rsqrt (reciprocal square root)● min● max● clamp● abs (absolute value)

x wy z

Lanes

Page 23: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Constructing

var a = new Float32x4(1.0, 2.0, 3.0, 4.0);

var b = new Float32x4.zero();

1.0 4.02.0 3.0

0.0 0.00.0 0.0

Page 24: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

var a = new Float32x4(1.0, 2.0, 3.0, 4.0);

var b = a.x; // 1.0

var c = a.withX(5.0);

Accessing and Modifying Individual Elements

1.0 4.02.0 3.0

5.0 4.02.0 3.0

Page 25: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

var a = new Float32x4(1.0, 2.0, 3.0, 4.0);

var b = new Float32x4(5.0, 10.0, 15.0, 20.0);

var c = a + b;

Arithmetic

1.0

4.0

2.0

3.0

5.0

20.0

10.0

15.0

6.0

24.0

12.0

18.0

Page 26: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Example

double average(Float32List list) { var n = list.length; var sum = 0.0; for (int i = 0; i < n; i++) { sum += list[i]; } return sum / n;}

Page 27: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Example

double average(Float32x4List list) { var n = list.length; var sum = new Float32x4.zero(); for (int i = 0; i < n; i++) { sum += list[i]; } var total = sum.x + sum.y + sum.z + sum.w; return total / (n * 4);}

Page 28: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Example

1.0 3.0 7.0 7.02.0 5.0 6.0 8.03.0 7.0 11.0 15.0

17.0 24.016.0 18.0

75.0

Page 29: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

SIMD in Dart

75% fewer loads75% fewer adds

75% fewer stores 4 times faster!

Page 30: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

The inner loop

sum += list[i];

;; Load list[i]0x4ccddcc d1ff sar edi, 10x4ccddce 0f104c3807 movups xmm1,[eax+edi*0x1+0x7]0x4ccddd3 03ff add edi,edi ;; sum +=0x4ccddde 0f59ca addps xmm2,xmm1

Load 4 floats

Add 4 floats

Page 31: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Shuffling

var a = new Float32x4(1.0, 2.0, 3.0, 4.0);

var b = a.xxyy;

var c = a.wwww;

var d = a.wzyx;

1.0 4.02.0 3.0

1.0 2.01.0 2.0

4.0 4.04.0 4.0

4.0 1.03.0 2.0

Page 32: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Branching

double max(double a, double b) { if (a > b) { return a; } else { return b; }}

max(4.0, 5.0) -> 5.0

Page 33: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Branching

Float32x4 max(Float32x4 a, Float32x4 b) { if (a > b) { return a; } else { return b; }}

1.0 4.02.0 3.0

0.0 2.03.0 5.0

Page 34: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Branching

Float32x4 max(Float32x4 a, Float32x4 b) { Uint3x4 greaterThan = a.greaterThan(b); return greaterThan.select(b, a);}

=

0xF

0xF

0x0

0x0

2.0

3.0

5.0

0.0

>

1.0

4.0

2.0

3.0

Page 35: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Branching

Float32x4 max(Float32x4 a, Float32x4 b) { Uint3x4 greaterThan = a.greaterThan(b); return greaterThan.select(a, b);}

0xF

0xF

0x0

0x0

0.0

2.0

3.0

5.0

1.0

4.0

2.0

3.0

0xF

SELECT

0x0

1.0

3.0

5.0

4.0

=

Page 36: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

How does the VM optimize for SIMD?

1. Unboxinga. Boxed -> allocated in memoryb. Unboxed -> in CPU memory (in registers)

2. Replacing method calls with inlined machine instructionsa. Allows values to remain unboxed (in registers)b. Avoids method call overhead

Page 37: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

More Benchmarks

Page 38: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Wrap up

Page 39: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

● Dart SIMD has landed*○ Try it out!○ Use your entire CPU

Dart VM stretches the performance envelop.Dart VM makes new, magical experiences possible.

Wrap Up

Fewer Instructions Faster Performance Longer Battery A better Web

Page 40: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Why does SIMD matter to the web?

Page 41: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

● The web needs SIMD if we want this:

Wrap Up

Page 42: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Wait, what exactly is "fast"?... and when will web programs be "fast"?

Page 43: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Fast

The Web

December, 2011 - Joel Webber's blog

Page 44: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

Questions!www.johnmccutchan.comwww.dartgamedevs.orgFollow me on Twitter @johnmccutchanCircle me on Google+ gplus.to/cutch

Page 45: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

● SIMD References○ Wikipedia

■ http://en.wikipedia.org/wiki/SIMD

○ Intel's site■ http://software.intel.com/en-us/articles/using-intel-streaming-simd-extensions-and-intel-integrated-performance-primitives-to-accelerate-algorithms

■ http://software.intel.com/en-us/articles/optimizing-the-rendering-pipeline-of-animated-models-using-the-intel-streaming-simd-extensions

○ ARM's site■ http://blogs.arm.com/software-enablement/161-coding-for-neon-part-1-load-and-stores/

○ www.gamasutra.com○ www.gamedev.net

Wrap Up

Page 46: Bringing SIMD to the web via Dart John McCutchandartdoc.takyam.com/slides/2013/02/Bringing-SIMD-to-the...inotify Linux kernel system for monitoring file systems for changes Port Bullet

What is SIMD?