Acceleration of software package "R" using GPU's Sachinthaka Abeywardana

Upload: lauren-kelly

Post on 26-Mar-2015


Page 1: Acceleration of software package "R" using GPU's Sachinthaka Abeywardana

Acceleration of software package "R" using GPU's

Sachinthaka Abeywardana

CSIRO.

Introduction to Graphics Processing Units (GPUs)

Introduction to GPU contd.

Introduction to R and BLAS

• R
  • Statistical package
  • Graphics

• BLAS (Basic Linear Algebra Subprograms)
  • Vector-vector addition/multiplication etc.
  • Vector-matrix addition/multiplication etc.
  • Matrix-matrix addition/multiplication etc.

• LAPACK (Linear Algebra Package)
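The three kinds of BLAS operation listed above correspond to what BLAS calls levels 1, 2 and 3. As a rough illustration only, the sketch below writes one operation of each level in plain Python. The function names echo the double-precision BLAS routines daxpy, dgemv and dgemm, but the simplified signatures and the list-of-rows storage are assumptions made for readability, not the real interface.

```python
# Illustrative pure-Python versions of the three BLAS levels.
# Simplified signatures (an assumption); real BLAS routines take more arguments.

def axpy(alpha, x, y):
    """Level 1 (vector-vector): return alpha * x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def gemv(alpha, A, x, beta, y):
    """Level 2 (matrix-vector): return alpha * A x + beta * y."""
    return [alpha * sum(a * xj for a, xj in zip(row, x)) + beta * yi
            for row, yi in zip(A, y)]

def gemm(alpha, A, B, beta, C):
    """Level 3 (matrix-matrix): return alpha * A B + beta * C."""
    cols_B = list(zip(*B))  # columns of B, so each entry is a row-by-column dot product
    return [[alpha * sum(a * b for a, b in zip(row, col)) + beta * c
             for col, c in zip(cols_B, c_row)]
            for row, c_row in zip(A, C)]

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.0, 1.0], [1.0, 0.0]]

print(axpy(2.0, [1.0, 2.0], [10.0, 20.0]))
print(gemv(1.0, A, [1.0, 1.0], 0.0, [0.0, 0.0]))
print(gemm(3.2, A, B, 4.1, A))  # same shape of expression as 3.2 * A %*% B + 4.1 * A in R
```

The last call shows why a single level 3 routine is enough for an R expression that combines a scaled matrix product with a scaled matrix.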

What has been done in this project

• Aim: Replace Rblas.dll with a faster BLAS library

[Diagram: R, LAPACK and BLAS in a chain; the existing BLAS is replaced by the new BLAS.]

How the new Rblas.dll was created

[Diagram components: Rblas.dll, CUBLAS library, 'C program' wrapper, FORTRAN, Initialise]
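The slide only names the pieces, so the following is a loose sketch of the wrapper idea rather than the author's code. It exposes an entry point whose argument order follows the reference BLAS DGEMM routine and forwards the work to a backend. `FakeDevice` is a hypothetical stand-in for the GPU library, and the initialise, upload, compute, download sequence is an assumption about how such a call is typically structured.

```python
# Sketch of a BLAS-style entry point that forwards to a backend.
# FakeDevice is hypothetical: it computes on the CPU and records each step.

class FakeDevice:
    def __init__(self):
        self.log = []
        self.initialised = False

    def initialise(self):
        # One-off start-up, corresponding to the "Initialise" box on the slide.
        if not self.initialised:
            self.initialised = True
            self.log.append("initialise")

    def upload(self, host_values):
        self.log.append("copy host -> device")
        return list(host_values)

    def download(self, device_values):
        self.log.append("copy device -> host")
        return list(device_values)

    def gemm(self, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc):
        # Column-major storage, as Fortran BLAS expects: element (i, j) of a
        # matrix with leading dimension ld sits at index i + j * ld.
        self.log.append("gemm on device")
        for j in range(n):
            for i in range(m):
                acc = sum(a[i + p * lda] * b[p + j * ldb] for p in range(k))
                c[i + j * ldc] = alpha * acc + beta * c[i + j * ldc]
        return c


device = FakeDevice()

def dgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc):
    """C := alpha * A B + beta * C, with arguments ordered as in BLAS DGEMM."""
    if transa != "N" or transb != "N":
        raise NotImplementedError("this sketch handles only untransposed inputs")
    device.initialise()
    d_a, d_b, d_c = device.upload(a), device.upload(b), device.upload(c)
    d_c = device.gemm(m, n, k, alpha, d_a, lda, d_b, ldb, beta, d_c, ldc)
    c[:] = device.download(d_c)  # BLAS writes the result back into C


# 2 x 2 example in column-major order: A = [[1, 2], [3, 4]], B = identity.
a = [1.0, 3.0, 2.0, 4.0]
b = [1.0, 0.0, 0.0, 1.0]
c = [0.0, 0.0, 0.0, 0.0]
dgemm("N", "N", 2, 2, 2, 1.0, a, 2, b, 2, 0.0, c, 2)
print(c)
print(device.log)
```

The log makes visible that every call pays for three uploads and one download in addition to the arithmetic, which is the kind of per-call cost a real wrapper would need to keep small.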

Results for 1000 x 1000 Matrices

Operation                  Description                        CPU average (s)   GPU average (s), single precision   GPU average (s), double precision
3.2 * A %*% B + 4.1 * A    3.2 A x B + 4.1 A                  1.9335            0.2375                              0.123
A %*% B                    Matrix A x matrix B                1.8855            0.176                               0.092
t(A) %*% B                 Transpose of matrix A x matrix B   1.9135            0.207                               0.089
solve(A)                   Invert matrix A                    2.227             4.69                                5.288

Improvements

Operation                  Single precision (%)   Double precision (%)
3.2 * A %*% B + 4.1 * A    814.1052632            1571.95122
A %*% B                    1071.306818            2049.456522
t(A) %*% B                 924.3961353            2150
solve(A)                   -210.597216            -237.4494836
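The slides do not state how the improvement figures were calculated. They can, however, be reproduced from the 1000 x 1000 timings as the ratio of the slower time to the faster time, expressed as a percentage and made negative when the GPU is the slower side. The sketch below uses that inferred formula; the function name `improvement` is introduced here for illustration.

```python
# Average timings in seconds for 1000 x 1000 matrices, copied from the results slide:
# (CPU, GPU single precision, GPU double precision)
timings = {
    "3.2 * A %*% B + 4.1 * A": (1.9335, 0.2375, 0.123),
    "A %*% B": (1.8855, 0.176, 0.092),
    "t(A) %*% B": (1.9135, 0.207, 0.089),
    "solve(A)": (2.227, 4.69, 5.288),
}

def improvement(cpu, gpu):
    """Inferred formula: slower time over faster time, as a percentage.

    Positive when the GPU is at least as fast as the CPU, negative otherwise.
    """
    if gpu <= cpu:
        return 100.0 * cpu / gpu
    return -100.0 * gpu / cpu

for op, (cpu, single, double) in timings.items():
    print(f"{op:26s} {improvement(cpu, single):12.4f} {improvement(cpu, double):12.4f}")
```

Read this way, a value of 814% means the GPU run took about one eighth of the CPU time, and -210% means the GPU run took about 2.1 times as long as the CPU run.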

Who to Blame

A. Simply random?

B. Me???

C. Stupid Computer?

D. Memory allocation.

Nvidia GPU Architecture

Nvidia GPU Architecture contd.

Nvidia GPU Architecture contd.

CPU vs GPU calculations for matrix inversion

[Figure: time (s) against size of square matrix (one side), 0 to 4500, with one series for CPU and one for GPU. Data labels shown on the plot: 139.5 and 45.42.]

Matrix Multiplication Timing

[Figure: time (s) against matrix size (one side), 0 to 5000, with series for CPU, GPU single precision and GPU double precision.]

Comparison with ATLAS Rblas

• Improvement on multiplication: A%*%B 319%
• Improvement on inverting matrix: solve(A) 281%

(Source: http://www.stat.columbia.edu/~cook/movabletype/archives/2008/06/a-trick-to-spee.html)

Limitations of ATLAS:

• Latest version is for Pentium 4 only

Limitations of this Project

• Specific card

• Cost
  • GeForce GTX 280 $582 (Source: http://www.msy.com.au/Parts/PARTS.pdf)

• Precision?
  • RMS of 6.350072e-06 for inverting a 1024 x 1024 matrix for the single precision cards.
  • IEEE 754 deviations
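The slide does not say what the RMS figure was measured against. Assuming it is a root-mean-square difference from a double-precision reference, the sketch below shows how such a number can be computed. The helper names `to_single` and `rms_difference` and the sample data are hypothetical; rounding only the final values to single precision, as done here, understates the error of a computation carried out entirely in single precision.

```python
import math
import struct

def to_single(x):
    """Round a Python float (IEEE 754 double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def rms_difference(reference, approx):
    """Root-mean-square difference between two equally sized sequences."""
    if len(reference) != len(approx):
        raise ValueError("sequences must have the same length")
    total = sum((r - a) ** 2 for r, a in zip(reference, approx))
    return math.sqrt(total / len(reference))

# Hypothetical data: entries of a double-precision result and the same
# entries after rounding to single precision.
double_result = [1.0 / (i + 1) for i in range(16)]
single_result = [to_single(v) for v in double_result]

print(rms_difference(double_result, single_result))
```

For a real check, `reference` would hold the entries of the CPU result and `approx` the entries returned by the single-precision GPU path.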

Where can I get this from

• https://wiki.csiro.au/confluence/display/terabyte/GPU+Accelerated+R

Where to from now?

• Implementation of more BLAS functions

• Getting rid of overhead

• Adjusting LAPACK

• Double precision to single precision and single to double conversion

• Parallel extensions (CPU)

Thank You

• Luke Domanski
• Dadong Wang
• Pascal Valotton
• Glenn Stone
• Robert Dunne
• CMIS/CSIRO staff
