scilabtec gpu

Post on 07-May-2017

230 Views

Category:

Documents

1 Downloads

Preview:

Click to see full reader

TRANSCRIPT

1

Bringing GPU capabilities in Scilab

June 16th, 2010

2

Sylvestre Ledru● In charge of R&D projects

● Responsible for GNU/Linux, Mac OS X and Unix version

● Developer

● Community management

● ...

Presentation

3

GPGPU in Scilab

4

● Use graphic cards for numerical computing purposes

● Much more powerful than CPU on some algorithms

● But hard to use...

What is GPGPU?

5

● CUDA by Nvidia for NvidiaFocused on GPU computing

● OpenCL by the Khronos group More general

Two main solutions

6

● To improve computation time

● Perfect to massively parallel algorithms

● To simplify its uses

Why GPU in Scilab?

7

● Take advantages of all Scilab features (high-level, easy to use...)

● Hide a part of the complexity

● Integration in a current software solution

Advantages compared to ad-hoc devs

8

Alpha version of Scilab GPU module

9

Description

● One goal: leverage GPU capabilities from Scilab

● Started in the context of the Google Summer of Code 2009

● Continued within the OpenGPU context (System@tic and Cap Digital)

10

● Handle both CUDA & OpenCL technologies

● Similar profiles and usages

● GPU functions (kernel) are compiled and transfered by the module

● Explicit transfert of variables

● Available on the Scilab forge :http://forge.scilab.org/index.php/p/sciCuda/

Features

11

● CUDA & OpenCL managed transparently with same profiles:

Features

● cudaAlloc vs openclAlloccudaToGpu vs openclToGpubuildCuda vs buildOpenCLcudaApplyFunction vs openclApplyFunction...

12

Code example - kernel

__global__ void someSimpleKernel(double* src,double* dst, int numberOfElement){ int idx=threadIdx.x+blockIdx.x*blockDim.x; if(idx<numberOfElement) { dst[idx]=src[idx]; }}

13

Code example – Scilab code

buildCuda(abs_path+"simple.cu");

A_host=rand(256,256);A=cudaToGpu(A_host);B=cudaAlloc(256,256);

kernel=cudaLoadFunction("simple.ptx","someSimpleKernel");lst=list(A,B,int32(256*256));cudaApplyFunction(kernel,lst,128,1,256*256/128,1);B_host=cudaFromGpu(B);

14

Example: Channel flow past a cylinderical obstacle

15

● A first tagged release

● More transparent uses

● Code generation to GPU

● Xcos integration

● Linear algebra features based on cublas

Future

16

Thanks for your attention

www.scilab.org

Introduction

Scilab on steroïds

  Objective :  Provide an easy way to go from a scripted prototype

in Scilab to a full-fledged application   In fact an application taking advantage of GPU

acceleration

3

  Integrating GPU-enabled libraries   Introducing a pragma-based typing system   Enabling the generation of an autonomous application on GPU   Extending the typing system with automated type inferences

Several steps

HPC Project - Corporate Presentation

2

How to go from Scilab…

  An interpreted   Dynamical environment   Ideal for prototyping

3

… to a full application

  Typed variable   To enable a compilation   For performance   To directly go to production

4

Producing application from Scilab scripts

  If you write a less dynamic code   Due to an internal representation of the code   A code C can be generated from a Scilab script   So you could get an autonomous application   From there, you could get a GPU-accelerated application

Par4All source-to-source compiler

  What is it ?

Some feedback on code quality

source to source Compiler

Benefit : only one code

Additional Tools

Back-end compiler

You can work on it:

6

Par4All – the compiling flow

Hundreds of code analysis phases and reorganizations

Scripting Metalanguage

C Fortran

C + Cuda code OpenMP MPI

Scilab to GPU

  A natural extension High-level scripting languages :

Scilab

Generating type by inference

  To ease development   Provide an environment able to produce type

  Type generation during script writing by inference   Something that is provided in some strongly typed formal

langages   Of course, it is currently possible to write a script and no

coherent typing can be generated   This will result in some limitation on “acceptable” scripts

Appliances Wild Systems   A packaged system for immediate use

  High-end Configuration : dual-socket X5570 & GPU Tesla

Best HW Optimized App Best Math

In an Appliance

No Cost of Operation

Ethernet

Thank you Any question ?

top related