703 whats new in the accelerate framework

135
© 201 4 Apple Inc. All rights reserve d. Redistribution or public display not permitted without written permission from Apple. #WWDC14 What’s New in the Accelerate Framework Session 703 Geoff Belter Engineer , V ector and Numerics Group Core OS

Upload: clungaho7109

Post on 05-Jul-2018

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 1/162

Page 2: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 2/162

What is Available?

Image processing (vImage)

Digital signal processing (vDSP)

Math functions (vForce, vMathLib, vBigNum)

Linear algebra (LAPACK, BLAS)

Page 3: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 3/162

What is the Accelerate Framework?

High performance• Fast

• Energy efficient

OS X and iOS

All generations of hardware

Page 4: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 4/162

Session Goals

New features in vImage

Introduce

• LinearAlgebra

• <simd/simd.h>

Page 5: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 5/162

Page 6: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 6/162

vImageSome things you can do

Page 7: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 7/162

vImageSome things you can do

Page 8: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 8/162

Getting Data into vImageCGImageRef

Page 9: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 9/162

Getting Data into vImageCGImageRef

// CGImageRef —-> vImage_BuffervImage_Buffer buf;vImage_CGImageFormat fmt = { .bitsPerComponent = 8, .bitsPerPixel = 32, … };vImage_Error err = vImageBuffer_initWithCGImage( &buf, &fmt, NULL,

cgImage, kvImageNoFlags );

Page 10: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 10/162

Getting Data into vImageCGImageRef

// CGImageRef —-> vImage_BuffervImage_Buffer buf;vImage_CGImageFormat fmt = { .bitsPerComponent = 8, .bitsPerPixel = 32, … };vImage_Error err = vImageBuffer_initWithCGImage( &buf, &fmt, NULL,

cgImage, kvImageNoFlags );

// vImage_Buffer —-> CGImageRefcgImage = vImageCreateCGImageFromBuffer( &buf, &fmt, NULL, NULL,

kvImageNoFlags, &err );

Page 11: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 11/162

Conversion SupportvImageConvert_AnyToAny

// 1) Make converter: srcFormat —-> destFormatvImage_CGImageFormat srcFormat = { .bitsPerComponent = 8, … };vImage_CGImageFormat destFormat = { .bitsPerComponent = 16, … };

vImageConverterRef c = vImageConverter_CreateWithCGImageFormat( &srcFormat, &destFormat, NULL, kvImageNoFlags, &err);

// 2) ConvertvImage_Buffer srcBuf = {…}, destBuf = {…};err = vImageConvert_AnyToAny(c, &srcBuf, &destBuf, NULL, flags);

Page 12: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 12/162

Conversion SupportvImageConvert_AnyToAny

// 1) Make converter: srcFormat —-> destFormatvImage_CGImageFormat srcFormat = { .bitsPerComponent = 8, … };vImage_CGImageFormat destFormat = { .bitsPerComponent = 16, … };

vImageConverterRef c = vImageConverter_CreateWithCGImageFormat( &srcFormat, &destFormat, NULL, kvImageNoFlags, &err);

// 2) ConvertvImage_Buffer srcBuf = {…}, destBuf = {…};err = vImageConvert_AnyToAny(c, &srcBuf, &destBuf, NULL, flags);

Page 13: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 13/162

Conversion SupportvImageConvert_AnyToAny

// 1) Make converter: srcFormat —-> destFormatvImage_CGImageFormat srcFormat = { .bitsPerComponent = 8, … };vImage_CGImageFormat destFormat = { .bitsPerComponent = 16, … };

vImageConverterRef c = vImageConverter_CreateWithCGImageFormat( &srcFormat, &destFormat, NULL, kvImageNoFlags, &err);

// 2) ConvertvImage_Buffer srcBuf = {…}, destBuf = {…};err = vImageConvert_AnyToAny(c, &srcBuf, &destBuf, NULL, flags);

Page 14: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 14/162

Conversion SupportvImageConvert_AnyToAny

// 1) Make converter: srcFormat —-> destFormatvImage_CGImageFormat srcFormat = { .bitsPerComponent = 8, … };vImage_CGImageFormat destFormat = { .bitsPerComponent = 16, … };

vImageConverterRef c = vImageConverter_CreateWithCGImageFormat( &srcFormat, &destFormat, NULL, kvImageNoFlags, &err);

// 2) ConvertvImage_Buffer srcBuf = {…}, destBuf = {…};err = vImageConvert_AnyToAny(c, &srcBuf, &destBuf, NULL, flags);

Page 15: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 15/162

What You Had to Say

“functions that convertvImage_Buffer objects toCGImageRef objects and back !!!!!! ”

Twitter user

Page 16: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 16/162

What You Had to Say

“functions that convertvImage_Buffer objects toCGImageRef objects and back !!!!!! ”

Twitter user

“vImageConvert_AnyToAnyis magical. Threaded andvectorized conversion betweennearly any two pixel formats.”

Twitter user

Page 17: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 17/162

Video—RGB, Grayscale, and Y’CbCrCVPixelBufferRef (a video frame)

Page 18: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 18/162

Video—RGB, Grayscale, and Y’CbCrCVPixelBufferRef (a video frame)

// CVPixelBufferRef —-> vImageBuffervImageBuffer_InitWithCVPixelBuffer( &buf, &desiredFormat, cvPixelBuffer, NULL, NULL, kvImageNoFlags );

Page 19: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 19/162

Video—RGB, Grayscale, and Y’CbCrCVPixelBufferRef (a video frame)

// CVPixelBufferRef —-> vImageBuffervImageBuffer_InitWithCVPixelBuffer( &buf, &desiredFormat, cvPixelBuffer, NULL, NULL, kvImageNoFlags );

// vImageBuffer —-> CVPixelBufferRefvImageBuffer_CopyToCVPixelBuffer( &buf, &bufFormat, cvPixelBuffer,

NULL, NULL, kvImageNoFlags );

Page 20: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 20/162

Getting video into vImage

Lower level interfaces• 41 video conversions

• Manage chroma siting, transfer function, conversion matrix, etc.

• RGB colorspaces for video formats

vImageConvert_AnyToAny() for video- vImageConverter_CreateForCGtoCVImageFormat- vImageConverter_CreateForCVtoCGImageFormat

Page 21: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 21/162

Page 22: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 22/162

LinearAlgebra (LA)Simple to use high performance

Page 23: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 23/162

Solving System of Linear EquationsWith LAPACK

Given the system A, and the right-hand-side B, how do you find X (AX = B)?

Page 24: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 24/162

Solving System of Linear EquationsWith LAPACK

Given the system A, and the right-hand-side B, how do you find X (AX = B)?

__CLPK_integer n = matrix_size;__CLPK_integer nrhs = number_right_hand_sides;__CLPK_integer lda = column_stride_A;

__CLPK_integer ldb = column_stride_B;__CLPK_integer *ipiv = malloc(sizeof(__CLPK_integer)*n);__CLPK_integer info;sgesv_(&n, &nrhs, A, &lda, ipiv, B, &ldb, &info);free(ipiv);

Page 25: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 25/162

Page 26: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 26/162

Solving System of Linear EquationsWith LA

Given the system A, and the right-hand-side B, how do you find X (AX = B)?

la_object_t X = la_solve(A,B);

Page 27: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 27/162

LinearAlgebra

New in iOS 8.0 and OS X Yosemite

Page 28: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 28/162

LinearAlgebra

New in iOS 8.0 and OS X Yosemite

Simple with good performance

Page 29: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 29/162

LinearAlgebra

New in iOS 8.0 and OS X Yosemite

Simple with good performance

Single and double precision

Page 30: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 30/162

Wh ’ A il bl ?

Page 31: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 31/162

What’s Available?

Element-wise arithmeticMatrix product

Transpose

Norms / normalization

Linear systemsSlice

Splat

LA Obj

Page 32: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 32/162

LA Objects

LA Obj

Page 33: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 33/162

LA Objects

Reference counted opaque objects• Objective-C objects when appropriate

Page 34: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 34/162

M M g t

Page 35: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 35/162

Memory Management

Memor Management

Page 36: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 36/162

Memory Management

Memory Management

Page 37: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 37/162

Memory Management

Page 38: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 38/162

Buffer to LA Object

Page 39: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 39/162

Buffer to LA ObjectWith copy

double *A = malloc(sizeof(double) * num_rows * row_stride);// Fill A as row major matrix

// Data copied into object, user still responsible for Ala_object_t Aobj = la_matrix_from_double_buffer(A, num_rows, num_cols,

row_stride, LA_NO_HINT,LA_DEFAULT_ATTRIBUTES);

// User retains all rights to A. User must clean up Afree(A);

Page 40: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 40/162

Buffer to LA Object

Page 41: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 41/162

Buffer to LA ObjectWith copy

double *A = malloc(sizeof(double) * num_rows * row_stride);// Fill A as row major matrix

// Data copied into object, user still responsible for Ala_object_t Aobj = la_matrix_from_double_buffer(A, num_rows, num_cols,

row_stride, LA_NO_HINT,LA_DEFAULT_ATTRIBUTES);

// User retains all rights to A. User must clean up Afree(A);

Page 42: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 42/162

Hints

Page 43: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 43/162

Hints

la_object_t o = la_matrix_from_double_buffer(A, num_rows, num_cols,row_stride, LA_SHAPE_DIAGONLA_DEFAULT_ATTRIBUTES);

Hints

Page 44: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 44/162

Hints

la_object_t o = la_matrix_from_double_buffer(A, num_rows, num_cols,row_stride, LA_SHAPE_DIAGONLA_DEFAULT_ATTRIBUTES);

Allow for better performance

Hints

Page 45: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 45/162

Hints

la_object_t o = la_matrix_from_double_buffer(A, num_rows, num_cols,row_stride, LA_SHAPE_DIAGONLA_DEFAULT_ATTRIBUTES);

Allow for better performance

Insight about the data buffer• Diagonal

• Triangular

• Symmetric

• Positive Definite

Lazy Evaluation

Page 46: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 46/162

Lazy Evaluation

la_object_t foo(la_object_t A, la_object_t x) {// At = A’la_object_t At = la_transpose(A);

// sum odd elements of x to even elements of xla_object_t x2 = la_sum(la_vector_slice(x,0,2,la_vector_length(x)/2),

la_vector_slice(x,1,2,la_vector_length(x)/2));

// Atx2 = A’ * x2 * 3.2la_object_t Atx2 = la_scale_with_float(la_matrix_product(At,x2), 3.2f);if (la_status(Atx2) < 0) { // error }return Atx2

}

Lazy Evaluation A x

Page 47: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 47/162

Lazy Evaluation

la_object_t foo(la_object_t A, la_object_t x) {// At = A’la_object_t At = la_transpose(A);

// sum odd elements of x to even elements of xla_object_t x2 = la_sum(la_vector_slice(x,0,2,la_vector_length(x)/2),

la_vector_slice(x,1,2,la_vector_length(x)/2));

// Atx2 = A’ * x2 * 3.2la_object_t Atx2 = la_scale_with_float(la_matrix_product(At,x2), 3.2f);if (la_status(Atx2) < 0) { // error }return Atx2

}

A x

Lazy Evaluation A x

Page 48: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 48/162

Lazy Evaluation

la_object_t foo(la_object_t A, la_object_t x) {// At = A’la_object_t At = la_transpose(A);

// sum odd elements of x to even elements of xla_object_t x2 = la_sum(la_vector_slice(x,0,2,la_vector_length(x)/2),

la_vector_slice(x,1,2,la_vector_length(x)/2));

// Atx2 = A’ * x2 * 3.2la_object_t Atx2 = la_scale_with_float(la_matrix_product(At,x2), 3.2f);if (la_status(Atx2) < 0) { // error }return Atx2

}

At

A x

Lazy Evaluation A x

Page 49: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 49/162

Lazy Evaluation

la_object_t foo(la_object_t A, la_object_t x) {// At = A’la_object_t At = la_transpose(A);

// sum odd elements of x to even elements of xla_object_t x2 = la_sum(la_vector_slice(x,0,2,la_vector_length(x)/2),

la_vector_slice(x,1,2,la_vector_length(x)/2));

// Atx2 = A’ * x2 * 3.2la_object_t Atx2 = la_scale_with_float(la_matrix_product(At,x2), 3.2f);if (la_status(Atx2) < 0) { // error }return Atx2

}

x.odd x.even

x2

At

Lazy Evaluation A x

Page 50: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 50/162

Atx2

3.2

a y va uat o

la_object_t foo(la_object_t A, la_object_t x) {// At = A’la_object_t At = la_transpose(A);

// sum odd elements of x to even elements of xla_object_t x2 = la_sum(la_vector_slice(x,0,2,la_vector_length(x)/2),

la_vector_slice(x,1,2,la_vector_length(x)/2));

// Atx2 = A’ * x2 * 3.2la_object_t Atx2 = la_scale_with_float(la_matrix_product(At,x2), 3.2f);if (la_status(Atx2) < 0) { // error }return Atx2

}

x.odd x.even

x2

At

Lazy Evaluation

Page 51: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 51/162

yDetails

No computationNo data buffer allocation

Triggered by- la_matrix_to_float_buffer- la_matrix_to_double_buffer

- la_vector_to_float_buffer- la_vector_to_double_buffer

Performance Comparison

Page 52: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 52/162

p

Netlib BLAS• Open source

Page 53: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 53/162

Performance Comparison

Page 54: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 54/162

pHigher is better

G F L O P S

12.5

25

37.5

50

Matrix Size

32 160 288 416 544 672 800 928 1024

LAAcceleraNetlib B

Performance Comparison

Page 55: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 55/162

pHigher is better

G F L O P S

12.5

25

37.5

50

Matrix Size

32 160 288 416 544 672 800 928 1024

LAAcceleraNetlib B

Performance Comparison

Page 56: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 56/162

pHigher is better

G F L O P S

12.5

25

37.5

50

Matrix Size

32 160 288 416 544 672 800 928 1024

LAAcceleraNetlib B

Error Handling

Page 57: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 57/162

la_object_t AB = la_matrix_product( A, la_transpose(B) );if (la_status(AB) < 0) { // handle error }

la_object_t result = la_sum( AB, la_scale_with_float( C, 3.2f ) );if (la_status(result) < 0) { // handle error }

la_status_t status = la_matrix_to_float_buffer(buffer, leading_dim, result);

Error Handling

Page 58: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 58/162

la_object_t AB = la_matrix_product( A, la_transpose(B) );if (la_status(AB) < 0) { // handle error }

la_object_t result = la_sum( AB, la_scale_with_float( C, 3.2f ) );if (la_status(result) < 0) { // handle error }

la_status_t status = la_matrix_to_float_buffer(buffer, leading_dim, result);

Error Handling

Page 59: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 59/162

la_object_t AB = la_matrix_product( A, la_transpose(B) );

la_object_t result = la_sum( AB, la_scale_with_float( C, 3.2f ) );

la_status_t status = la_matrix_to_float_buffer(buffer, leading_dim, result);if (status == LA_SUCCESS) {

// No errors, buffer is filled with good data.} else if (status > 0) {

// No errors occurred, but result does not have full accuracy.} else {

// An error occurred.assert(0);

}

Debugging

Page 60: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 60/162

Enable logging• LA_ATTRIBUTE_ENABLE_LOGGING

Debugging

Page 61: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 61/162

Enable logging• LA_ATTRIBUTE_ENABLE_LOGGING

Error logla_object_t la_sum(la_object_t, la_object_t):

LA_DIMENSION_MISMATCH_ERROR: Encountered a dimension mismatchobj_left rows must be equal to obj_right rows; failed comparison: 8 == 9

Solve

Page 62: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 62/162

la_object_t x = la_solve(A,b);• If A is square and non-singular, compute the solution x to Ax = b

• If A is square and singular, produce an error

Slicing

Page 63: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 63/162

What is slicing

Light weight access to partial object• No buffer allocations

• No buffer copies

Slicing

Page 64: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 64/162

What is slicing

Light weight access to partial object• No buffer allocations

• No buffer copies

Three pieces of information• Offset

• Stride• Dimension

Slicing

Page 65: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 65/162

What is slicing

la_vector_slice (vector, 7, // offset-2, // stride3); // dimension

[ 0 1 2 3 4 5 6 7 8 9 ]

Slicing

Page 66: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 66/162

What is slicing

la_vector_slice (vector, 7, // offset-2, // stride3); // dimension [ 7 ]

[ 0 1 2 3 4 5 6 7 8 9 ]

Slicingh l

Page 67: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 67/162

What is slicing

la_vector_slice (vector, 7, // offset-2, // stride3); // dimension [ 7 ]5

[ 0 1 2 3 4 5 6 7 8 9 ]

SlicingWh i li i

Page 68: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 68/162

What is slicing

la_vector_slice (vector, 7, // offset-2, // stride3); // dimension [ 7 ]5 3

[ 0 1 2 3 4 5 6 7 8 9 ]

Slice ExampleTili

Page 69: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 69/162

Tiling

la_object_t A,B,C;// A and B are matrices of dimension MxN

for (int i = 0; i < 2; ++i) {for (int j = 0; j < 2; ++j) {

la_object_t Atile = la_matrix_slice(A,i*M/2,j*N/2,1,1,M/2,N/2);la_object_t Btile = la_matrix_slice(B,i*M/2,j*N/2,1,1,M/2,N/2);

C = la_sum(Atile, Btile);// use of C tile

}}

Slice ExampleTili

Page 70: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 70/162

Tiling

la_object_t A,B,C;// A and B are matrices of dimension MxN

for (int i = 0; i < 2; ++i) {for (int j = 0; j < 2; ++j) {

la_object_t Atile = la_matrix_slice(A,i*M/2,j*N/2,1,1,M/2,N/2);la_object_t Btile = la_matrix_slice(B,i*M/2,j*N/2,1,1,M/2,N/2);

C = la_sum(Atile, Btile);// use of C tile

}}

= +A B

C

Slice ExampleTili g

Page 71: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 71/162

Tiling

la_object_t A,B,C;// A and B are matrices of dimension MxN

for (int i = 0; i < 2; ++i) {for (int j = 0; j < 2; ++j) {

la_object_t Atile = la_matrix_slice(A,i*M/2,j*N/2,1,1,M/2,N/2);la_object_t Btile = la_matrix_slice(B,i*M/2,j*N/2,1,1,M/2,N/2);

C = la_sum(Atile, Btile);// use of C tile

}}

= +A B

C

Slice ExampleTiling

Page 72: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 72/162

Tiling

la_object_t A,B,C;// A and B are matrices of dimension MxN

la_object_t sum = la_sum(A,B);

for (int i = 0; i < 2; ++i) {for (int j = 0; j < 2; ++j) {

C = la_matrix_slice(sum,i*M/2,j*N/2,1,1,M/2,N/2);// use of C tile

}}

Slice ExampleTiling

Page 73: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 73/162

Tiling

la_object_t A,B,C;// A and B are matrices of dimension MxN

la_object_t sum = la_sum(A,B);

for (int i = 0; i < 2; ++i) {for (int j = 0; j < 2; ++j) {

C = la_matrix_slice(sum,i*M/2,j*N/2,1,1,M/2,N/2);// use of C tile

}}

Slice ExampleTiling

Page 74: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 74/162

Tiling

la_object_t A,B,C;// A and B are matrices of dimension MxN

la_object_t sum = la_sum(A,B);

for (int i = 0; i < 2; ++i) {for (int j = 0; j < 2; ++j) {

C = la_matrix_slice(sum,i*M/2,j*N/2,1,1,M/2,N/2);// use of C tile

}}

sum=C

Slice ExampleTiling

Page 75: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 75/162

Tiling

la_object_t A,B,C;// A and B are matrices of dimension MxN

la_object_t sum = la_sum(A,B);

for (int i = 0; i < 2; ++i) {for (int j = 0; j < 2; ++j) {

C = la_matrix_slice(sum,i*M/2,j*N/2,1,1,M/2,N/2);// use of C tile

}}

=C = +A B

SplatWhat is splat

Page 76: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 76/162

What is splat

la_splat_from_float(5.0f);

SplatWhat is splat

Page 77: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 77/162

What is splat

la_splat_from_float(5.0f);

SplatWhat is splat

Page 78: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 78/162

What is splat

la_splat_from_float(5.0f);

Add 2 to every element of a vector

• la_sum(vector, la_splat_from_double(2.0));

LinearAlgebraSummary

Page 79: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 79/162

Summary

Simple APIModern language and run-time features

Good performance

Page 80: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 80/162

LINPACK The joy of benchmarking

Stephen CanonEngineer, Vector and Numerics Group

Page 81: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 81/162

How fast can you solve asystem of linear equations?

Page 82: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 82/162

LINPACK tests bothhardware and software

Page 83: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 83/162

Accelerate vs. “Brand A”

2013LINPACK performance in GFLOPS (bigger is better)

Page 84: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 84/162

p ( gg )

Accelerate on iPhone 5

“Brand A”

1 2 3 4

2014LINPACK performance in GFLOPS (bigger is better)

Page 85: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 85/162

p ( gg )

Accelerate on iPhone 5

“Brand A”

1 2 3 4

Page 86: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 86/162

Let’s find some new competition

Page 87: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 87/162

2014LINPACK performance in GFLOPS (bigger is better)

Page 88: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 88/162

iPhone 5

2010 MacBook Air

7654321

2014LINPACK performance in GFLOPS (bigger is better)

Page 89: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 89/162

11

iPhone 5s

2010 MacBook Air

10987654321

10.4 GFLOPS

2014LINPACK performance in GFLOPS (bigger is better)

Page 90: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 90/162

10987654321 11

iPhone 5s

2010 MacBook Air

10.4 GFLOPS

2014LINPACK performance in GFLOPS (bigger is better)

Page 91: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 91/162

108642 1412

iPad Air

2010 MacBook Air

14.6 GFLOPS

Page 92: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 92/162

<simd/simd.h>Short vector and matrix math

<simd/simd.h>

Page 93: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 93/162

New (iOS 8 and OS X Yosemite) library with three purposes:

<simd/simd.h>

Page 94: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 94/162

New (iOS 8 and OS X Yosemite) library with three purposes:• 2D, 3D, and 4D vector math and geometry

<simd/simd.h>

Page 95: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 95/162

New (iOS 8 and OS X Yosemite) library with three purposes:• 2D, 3D, and 4D vector math and geometry

• Features of Metal in C, Objective-C, and C++ on the CPU

<simd/simd.h>

Page 96: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 96/162

New (iOS 8 and OS X Yosemite) library with three purposes:• 2D, 3D, and 4D vector math and geometry

• Features of Metal in C, Objective-C, and C++ on the CPU

• Abstraction over architecture-specific SIMD types and intrinsics

Vector Math and GeometryWish list

Page 97: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 97/162

Vector Math and GeometryWish list

Page 98: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 98/162

Inline implementations

Vector Math and GeometryWish list

Page 99: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 99/162

Inline implementations

Concise functions without extra parameters

Vector Math and GeometryWish list

Page 100: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 100/162

Inline implementations

Concise functions without extra parameters

float a = cblas_sdot(3, 1.0f, x, 1, y, 1);

Vector Math and GeometryWish list

Page 101: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 101/162

Inline implementations

Concise functions without extra parameters

float a = GLKVector3DotProduct(x, y);

Vector Math and GeometryWish list

Page 102: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 102/162

Inline implementations

Concise functions without extra parameters

float a = vector_dot (x, y);

Vector Math and GeometryWish list

Page 103: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 103/162

Inline implementations

Concise functions without extra parameters

using namespace simd;

float a = dot (x, y);

Vector Math and GeometryWish list

Page 104: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 104/162

Inline implementations

Concise functions without extra parameters

Vector Math and GeometryWish list

Page 105: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 105/162

Inline implementations

Concise functions without extra parameters

Arithmetic should use operators

Vector Math and GeometryWish list

Page 106: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 106/162

Inline implementations

Concise functions without extra parameters

Arithmetic should use operators

z = GLKVector4MultiplyScalar(GLKVector4Add(x,y),0.5);

Vector Math and GeometryWish list

Page 107: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 107/162

Inline implementations

Concise functions without extra parameters

Arithmetic should use operators

z = 0.5*(x + y);

Vector Math and GeometryWish list

Page 108: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 108/162

Inline implementations

Concise functions without extra parameters

Arithmetic should use operators

z = 0.5*(x + y);

Vector Math and Geometry Types

Page 109: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 109/162

In C and Objective-C, the primary type is vector_float N , where N is 2, 3, or 4

In C++ you can use simd::float N

Based on clang “extended vectors”

Vector Math and GeometryArithmetic

Page 110: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 110/162

Your favorite arithmetic operators ( +,–,*,/ ) work with both vectors and scalars

Vector Math and GeometryArithmetic

Page 111: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 111/162

Your favorite arithmetic operators ( +,–,*,/ ) work with both vectors and scalarsvector_float3 vector_reflect (vector_float3 x, vector_float3 n) {

}

Vector Math and GeometryArithmetic

Page 112: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 112/162

Your favorite arithmetic operators ( +,–,*,/ ) work with both vectors and scalarsvector_float3 vector_reflect (vector_float3 x, vector_float3 n) {

}

x

Vector Math and GeometryArithmetic

Page 113: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 113/162

Your favorite arithmetic operators ( +,–,*,/ ) work with both vectors and scalarsvector_float3 vector_reflect (vector_float3 x, vector_float3 n) {

}

xn

Vector Math and GeometryArithmetic

Page 114: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 114/162

Your favorite arithmetic operators ( +,–,*,/ ) work with both vectors and scalarsvector_float3 vector_reflect (vector_float3 x, vector_float3 n) {

}

xn

Vector Math and GeometryArithmetic

Page 115: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 115/162

Your favorite arithmetic operators ( +,–,*,/ ) work with both vectors and scalarsvector_float3 vector_reflect (vector_float3 x, vector_float3 n) {

}

xn

vector_reflect(x,n)

Vector Math and GeometryArithmetic

Page 116: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 116/162

Your favorite arithmetic operators ( +,–,*,/ ) work with both vectors and scalarsvector_float3 vector_reflect(vector_float3 x, vector_float3 n) {

return x - 2*vector_dot(x,n)*n ;}

xn

vector_reflect(x,n)

Page 117: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 117/162

Vector Math and GeometryElements and subvectors

Page 118: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 118/162

Named subvectorsvector_float4 a = { 0, 1, 2, 3 };vector_float2 b = a.lo; // b = { 0, 1 }a.even = -b; // a = { 0, 1,-1, 3 }

Vector Math and Geometry

Page 119: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 119/162

Page 120: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 120/162

Vector Math and Geometry

Page 121: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 121/162

Vector Math and Geometry

f h fl d f

Page 122: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 122/162

Some functions have two flavors: “precise” and “fast”

Page 123: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 123/162

Vector Math and Geometry

S f ti h t fl “ i ” d “f t”

Page 124: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 124/162

Some functions have two flavors: “precise” and “fast”

• “precise” is the default…• … but if you compile with -ffast-math , you get the “fast” versions

Vector Math and Geometry

E ith ff t th ll th i f ti b

Page 125: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 125/162

Even with -ffast-math , you can call the precise functions by name:

float len = vector_precise_length(x);

Vector Math and Geometry

You can call the fast versions by name too:

Page 126: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 126/162

You can call the fast versions by name too:

x = fast::normalize(x);

MatricesC and Objective-C

“matrix float NxM” where N and M are 2 3 or 4

Page 127: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 127/162

matrix_float Nx M , where N and M are 2, 3, or 4

• N is number of columns , M is number of rows .

MatricesC and Objective-C

Create matrices

Page 128: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 128/162

Create matrices

matrix_from_diagonal(vector)matrix_from_columns(vector, vector, …)matrix_from_rows(vector, vector, …)

Arithmeticmatrix_scale(matrix, scalar)matrix_linear_combination(matrix, scalar, matrix, scalar)matrix_transpose(matrix)matrix_invert(matrix)matrix_multiply(matrix/vector, matrix/vector)

MatricesC++ (and Metal)

Create matrices

Page 129: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 129/162

Create matrices

float4x4()float2x2(diagonal)float3x4(column0, column1, …)

Arithmeticfloat3x4 A, B;A -= 2.f * B;float4x3 C = transpose(A)vector3 x;vector4 y = A*x;

Abstract SIMDAdditional types

Doubles signed and unsigned integers

Page 130: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 130/162

Doubles, signed and unsigned integers

Longer vectors (8, 16, and 32 elements)Unaligned vectors

Abstract SIMDInteger operators and conversions

Page 131: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 131/162

Abstract SIMDInteger operators and conversions

Arithmetic operators: +, - , *, / , %

Page 132: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 132/162

Arithmetic operators: , , , / , %

Abstract SIMDInteger operators and conversions

Arithmetic operators: +, - , *, / , %

Page 133: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 133/162

p , , , ,

Bitwise operators: >>, <<, &, | , ^ , ~

Abstract SIMDInteger operators and conversions

Arithmetic operators: +, - , *, / , %

Page 134: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 134/162

p , , , ,

Bitwise operators: >>, <<, &, | , ^ , ~Conversions:vector_float x;vector_ushort y = vector_ushort(x);vector_char z = vector_char_sat(x);

Abstract SIMDComparisons

Vector comparisons: ==, !=, >, <, >=, <=

Page 135: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 135/162

p

• Result is a vector of integers; each lane is –1 if comparison is true, 0 if false

Abstract SIMDComparisons

Vector comparisons: ==, !=, >, <, >=, <=

Page 136: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 136/162

• Result is a vector of integers; each lane is –1 if comparison is true, 0 if false

x

Page 137: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 137/162

Page 138: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 138/162

Abstract SIMDComparisons

Vector comparisons: ==, !=, >, <, >=, <=

Page 139: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 139/162

• Result is a vector of integers; each lane is –1 if comparison is true, 0 if false

x

y

x < y

Page 140: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 140/162

Abstract SIMDComparisons

Vector comparisons: ==, !=, >, <, >=, <=

Page 141: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 141/162

• Result is a vector of integers; each lane is –1 if comparison is true, 0 if false

x

y

x < y

Abstract SIMDComparisons

Vector comparisons: ==, !=, >, <, >=, <=

Page 142: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 142/162

• Result is a vector of integers; each lane is –1 if comparison is true, 0 if false

x

y

x < y

Abstract SIMDComparisons

Vector comparisons: ==, !=, >, <, >=, <=

Page 143: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 143/162

• Result is a vector of integers; each lane is –1 if comparison is true, 0 if false• Type of result usually isn’t important, because you’ll use one of the following:if ( vector_any (x < 0)) { /* executed if any lane of x is negative */ }if ( vector_all (y != 0)) { /* executed if every lane of y is non-zero */ }z = vector_bitselect (x, y, x > y); /* minimum of x and y */

Page 144: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 144/162

String CopyScalar implementation

void string_copy(char *dst, const char *src) {while ((*dst++ = *src++));

Page 145: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 145/162

while (( dst++ src++));}

String CopySSE intrinsic implementation

void vector_string_copy(char *dst, const char *src) {while ((uintptr t)src % 16)

Page 146: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 146/162

while ((uintptr_t)src % 16)if ((*dst++ = *src++) == 0) return;

while (1) {__m128i data = _mm_load_si128((const __m128i *)src);__m128i contains_zero = _mm_cmpeq_epi8(data, _mm_set1_epi8(0));if (_mm_movemask_epi8(contains_zero))

break;

_mm_storeu_si128((__m128i *)dst, data);src += 16;dst += 16;

}string_copy((char *)vec_dst, (const char *)vec_src);

}

String Copy<simd/simd.h> implementation

void vector_string_copy(char *dst, const char *src) {while ((uintptr t)src % 16)

Page 147: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 147/162

(( p _ ) )if ((*dst++ = *src++) == 0) return;

const vector_char16 *vec_src = (const vector_char16 *)src;packed_char16 *vec_dst = (packed_char16 *)dst;while (!vector_any(*vec_src == 0))

*vec_dst++ = *vec_src++;string_copy((char *)vec_dst, (const char *)vec_src);

}

String Copy<simd/simd.h> implementation

void vector_string_copy(char *dst, const char *src) {while ((uintptr_t)src % 16)

Page 148: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 148/162

(( p _ ) )if ((*dst++ = *src++) == 0) return;

const vector_char16 *vec_src = (const vector_char16 *)src;packed_char16 *vec_dst = (packed_char16 *)dst;while (!vector_any(*vec_src == 0))

*vec_dst++ = *vec_src++;string_copy((char *)vec_dst, (const char *)vec_src);

}

String Copy<simd/simd.h> implementation

void vector_string_copy(char *dst, const char *src) {while ((uintptr_t)src % 16)

Page 149: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 149/162

(( p ) )if ((*dst++ = *src++) == 0) return;

const vector_char16 *vec_src = (const vector_char16 *)src;packed_char16 *vec_dst = (packed_char16 *)dst;while (!vector_any(*vec_src == 0))

*vec_dst++ = *vec_src++;string_copy((char *)vec_dst, (const char *)vec_src);

}

String Copy<simd/simd.h> implementation

void vector_string_copy(char *dst, const char *src) {while ((uintptr_t)src % 16)

Page 150: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 150/162

if ((*dst++ = *src++) == 0) return;const vector_char16 *vec_src = (const vector_char16 *)src;packed_char16 *vec_dst = (packed_char16 *)dst;while (!vector_any(*vec_src == 0))

*vec_dst++ = *vec_src++;string_copy((char *)vec_dst, (const char *)vec_src);

}

String Copy<simd/simd.h> implementation

void vector_string_copy(char *dst, const char *src) {while ((uintptr_t)src % 16)

Page 151: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 151/162

if ((*dst++ = *src++) == 0) return;const vector_char16 *vec_src = (const vector_char16 *)src;packed_char16 *vec_dst = (packed_char16 *)dst;while (!vector_any(*vec_src == 0))

*vec_dst++ = *vec_src++;string_copy((char *)vec_dst, (const char *)vec_src);

}

PerformanceHigher is better

n d

6 Scalar<simd/siLibc

Page 152: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 152/162

B y t e

s c o p i e d p e r n a n o s e c o

1.5

3

4.5

String length in bytes

1 2 4 8 16 32 64 128 256 512 1024

Page 153: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 153/162

Page 154: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 154/162

PerformanceHigher is better

n d

6 Scalar<simd/siLibc

Page 155: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 155/162

B y t

e s c o p i e d p e r n a n o s e c o

1.5

3

4.5

String length in bytes

1 2 4 8 16 32 64 128 256 512 1024

Page 156: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 156/162

More Information

Paul DanboldCore OS Technology Evangelist

Page 157: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 157/162

[email protected]

George WarnerDTS Sr. Support [email protected]

DocumentationvImage Programming Guidehttp://developer.apple.com/library/mac/#documentation/Performance/Conceptual/vImage/Introduction/Introduction.html

More Information

DocumentationvDSP Programming Guide

Page 158: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 158/162

http://developer.apple.com/library/mac/#documentation/Performance/Conceptual/vDSP_Programming_Guide/Introduction/Introduction.html

vImage Headers /System/Library/Frameworks/Accelerate.framework/Frameworks/vImage.framework/Headers/vImage.h

vDSP Headers /System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vDSP.h

More Information

DocumentationLinearAlgebra Headers

b k l f k k b f k

Page 159: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 159/162

/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/LinearAlgebra/LinearAlgebra.h

<simd/simd.h> /usr/include/simd/simd.h

Apple Developer Forumshttp://devforums.apple.com

Bug Reporthttp://bugreport.apple.com

Related Sessions

Page 160: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 160/162

Page 161: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 161/162

Page 162: 703 Whats New in the Accelerate Framework

8/15/2019 703 Whats New in the Accelerate Framework

http://slidepdf.com/reader/full/703-whats-new-in-the-accelerate-framework 162/162