Dynamic Cuda with F# · 21.03.2013
HPC GPU & F# Meetup
March 19
San Jose, California
Dr. Daniel Egloff
daniel.egloff@quantalea.net
+41 44 520 01 17
+41 79 430 03 61

About Us
- Software development and consulting company
- Based in Zurich
- Core competence
  - Quantitative finance and risk management
  - Derivative pricing and modeling
  - Numerical computing
  - High performance computing (clusters, grid, GPUs)
  - Software engineering (C++, F#, Scala, …)
- Early adopter of GPUs
  - First project with GPUs in finance back in 2007

Situation
- Compute intensive problems
- GPGPU
- Big data and information rich applications
- Cloud and distributed computing

Problems
Recurring challenges in our HPC projects
- Algorithms change often
- C++ adds unwanted level of complexity
- Interoperation and wrapper code
- Hard to test correctness of algorithms
- Slow progress because of tools and complexity
- Critical algorithms are not designed with GPUs in mind
- Efficient GPU code is difficult to develop
- Availability of GPU hardware
Implications?

Implications
Negative impact on the usability and acceptance of new technology
- Moving project targets
- Optimized code hard to maintain and extend
- Maintenance of wrapper code
- Maintenance of large body of test code
- Unpredictable project duration and costs
- Rewrite of critical numerical code
- Static and inflexible solutions
- Delays because of missing hardware

Needs
What would be needed and how can we improve?
- Better and more flexible tools to make CUDA more accessible
- CUDA programming model
- Support iterative development and rapid prototyping
- Using modern programming languages and concepts
- Building on top of modern technologies like .NET or the JVM
- Generated CUDA code on par with compiled CUDA C/C++
- Solid framework compatible with CUDA programming model
This is where Alea.cuBase comes into play…

What is Alea.cuBase?

Alea.cuBase
- Complete solution to develop CUDA accelerated GPU applications in .NET
- Relies on F# and in particular code quotations
- Based on LLVM and CUDA 5 technology
- Fully F# based, no additional changes or extensions to the F# language, no <<<…>>>
- No wrappers, no post build process to transform IL code

Dynamic code generation
GPU algorithm scripting
Industry grade performance
Rapid development
Solid framework for reusability
Advanced CUDA programming

Benefits: Dynamic code generation
- Generate GPU code programmatically at run-time
- Use .NET generics and F# code quotation splicing for flexible kernels
- Foundation to develop GPU aware domain specific languages
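
As a rough, language-neutral illustration of generating GPU code at run time, the sketch below assembles CUDA C source text for an element-wise kernel from a template. It is an analogy only: according to the slides, Alea.cuBase works from F# quotations and LLVM, not from source strings. The names `KERNEL_TEMPLATE` and `make_map_kernel` are hypothetical.

```python
from string import Template

# Hypothetical template for an element-wise ("map") kernel in CUDA C.
# The placeholders are filled in at run time, which loosely mirrors
# splicing an expression into a larger piece of quoted code.
KERNEL_TEMPLATE = Template("""\
extern "C" __global__ void ${name}(const ${ctype}* x, ${ctype}* y, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = ${expr};
}
""")


def make_map_kernel(name, ctype, expr):
    """Return CUDA C source text for a kernel computing y[i] = expr."""
    return KERNEL_TEMPLATE.substitute(name=name, ctype=ctype, expr=expr)


if __name__ == "__main__":
    # Generate a float kernel that squares each element.
    print(make_map_kernel("square", "float", "x[i] * x[i]"))
```

The same template yields a different kernel for each element type and expression, which is the kind of flexibility the slide attributes to generics and quotation splicing. The generated text would still need a compiler before it could run on a device.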

Benefits: Rapid development
- Easy and quick setup of development environment, no need to install NVIDIA nvcc compiler tools
- Rapid prototyping in F# interactive
- Iteratively improve CUDA kernel algorithms without time-consuming build cycles

Benefits: GPU algorithm scripting
- Execute F# scripts with GPU algorithms on command line or in F# interactive
- GPU scripting in Excel
- Integrate Alea.cuBase directly with Python

Benefits: Solid framework for reusability
- Framework for type-safe definition of GPU resources
- CUDA monad to specify GPU resources together with launch logic in a unified manner
- Reuse GPU kernel code and compose it into modular GPU kernel libraries
cuda { kernel_B launch logic }
cuda { kernel_A launch logic }

Benefits: Industry grade performance
- Generating performance optimized code which is on par with compiled CUDA C/C++ code
- Low level device functions and special math functions
- Built-in occupancy calculator to identify optimal thread block layout
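
To make the idea of an occupancy calculator concrete, here is a simplified Python sketch. It is not the calculator built into Alea.cuBase. It estimates how many thread blocks fit on one multiprocessor given register and shared memory usage, then reports resident warps as a fraction of the maximum. The function names and the default hardware limits are illustrative assumptions, and allocation granularity is ignored.

```python
def occupancy(threads_per_block, regs_per_thread, smem_per_block, *,
              warp_size=32, max_warps_per_sm=64, max_blocks_per_sm=16,
              regs_per_sm=65536, smem_per_sm=49152):
    """Estimated fraction of a multiprocessor's warp slots kept resident."""
    # The hardware limits are assumed example values; replace them with
    # the figures for the actual device.
    warps_per_block = -(-threads_per_block // warp_size)  # ceiling division
    # Each resource imposes its own cap on resident blocks per multiprocessor.
    limits = [max_blocks_per_sm, max_warps_per_sm // warps_per_block]
    if regs_per_thread > 0:
        regs_per_block = regs_per_thread * warps_per_block * warp_size
        limits.append(regs_per_sm // regs_per_block)
    if smem_per_block > 0:
        limits.append(smem_per_sm // smem_per_block)
    resident_blocks = min(limits)
    return resident_blocks * warps_per_block / max_warps_per_sm


def best_block_size(candidates, regs_per_thread, smem_per_block):
    """Return the first candidate block size with the highest estimate."""
    return max(candidates,
               key=lambda t: occupancy(t, regs_per_thread, smem_per_block))


if __name__ == "__main__":
    for threads in (64, 128, 256):
        print(threads, occupancy(threads, 32, 0))
```

Under these assumed limits, doubling the registers per thread from 32 to 64 halves the estimate for 256-thread blocks, which is the kind of trade-off such a calculator is meant to expose.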
[Chart: Segmented Scan by Key, Alea.cuExtension against CUDA Thrust. Vertical axis from 0.00% to 800.00%; horizontal axis with input sizes 2097152, 8388608, 16777216, 33554432; series int32, float32, float64.]
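
For readers unfamiliar with the benchmarked operation, the following sequential Python sketch shows what an inclusive segmented scan by key computes. It is a CPU reference under the usual convention that runs of consecutive equal keys form a segment, not the Alea.cuExtension or Thrust implementation; the function name is hypothetical.

```python
from operator import add


def inclusive_scan_by_key(keys, values, op=add):
    """Running reduction of values that restarts whenever the key changes."""
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    out = []
    for i, (key, value) in enumerate(zip(keys, values)):
        if i > 0 and key == keys[i - 1]:
            # Same segment as the previous element: extend the running result.
            out.append(op(out[-1], value))
        else:
            # First element of a new segment: restart the scan.
            out.append(value)
    return out


if __name__ == "__main__":
    keys = [0, 0, 0, 1, 1, 2]
    values = [1, 2, 3, 4, 5, 6]
    print(inclusive_scan_by_key(keys, values))  # [1, 3, 6, 4, 9, 6]
```

A GPU version has to produce the same result in parallel, which is why the operation is a common library benchmark.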

Benefits: Advanced CUDA programming
- Support for texture, constant and shared memory
- Pointer operations to partition array data
- Special pointer types such as volatile pointers
- Runtime compilation control e.g. fast math
- Multiple streams
- Thread safe use of multiple GPUs
- Inline PTX assembly instructions

Alea Ecosystem
- User Applications
- CUDA C Ecosystem: Thrust, CUDPP; CUDA Runtime API
- Alea Ecosystem: Alea.cuExtension; Alea.cuBase
- CUDA Driver API

Advantages over CUDA C/C++
- Faster development
  - More productive tools, type inference, IntelliSense support
  - Cleaner, more expressive code results in fewer bugs, less testing and less debugging
- Removing unnecessary complexity
  - No template meta programming or preprocessor tricks needed
- More flexibility
  - Dynamic code generation and scripting are entirely missing in CUDA C/C++
  - Easier to reuse and compose GPU algorithms
- More transparency
  - GPU resources are defined where needed
  - Launch logic defines thread layout and data requirements more transparently
- Direct access to other valuable F# and .NET technologies
  - Seamless integration into .NET, without any interoperation layer
  - Monads aka computation expressions
  - Async workflows, agents, parallel programming, type providers
  - Uniform .NET type system

How does Alea.cuBase work?

Development Process
- Four steps to a CUDA kernel with F# and Alea.cuBase
[Diagram with steps numbered 1 to 4. Labels: PTemplate; cuda { constant array texture kernel ... launch logic }; compilation process; NVVM IR module; PTX module; CUDA cubin module; CUDA module; launch function; PModule; device worker; CUDA device; CUDA context.]

How to program with Alea.cuBase?

Live Coding
- Basic kernel programming
- Excel GPU scripting with Alea.cuBase and Alea.cuExtension, in Tsunami IDE and FCell
  - Excel based Monte Carlo simulation
- PDE solver for 2D heat equation on the GPU
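
The slides do not say which numerical scheme the live demo used. As a minimal sketch, assuming the simplest explicit finite difference scheme (forward in time, centered in space) on a square grid with fixed boundary values, one time step of the 2D heat equation looks like this on the CPU. The function name `heat_step` and the parameter `r` (diffusivity times time step, divided by the squared grid spacing) are assumptions made for illustration.

```python
def heat_step(u, r):
    """One explicit time step of the 2D heat equation on a list-of-lists grid.

    r = alpha * dt / h**2 for equal grid spacing h in both directions.
    Boundary cells are held fixed (Dirichlet boundary condition).
    """
    if not 0.0 < r <= 0.25:
        raise ValueError("the explicit scheme needs 0 < r <= 0.25 for stability")
    rows, cols = len(u), len(u[0])
    new = [row[:] for row in u]  # copy, so boundary cells stay unchanged
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Five-point stencil: each interior cell relaxes toward the
            # average of its four neighbours.
            new[i][j] = u[i][j] + r * (
                u[i + 1][j] + u[i - 1][j] + u[i][j + 1] + u[i][j - 1]
                - 4.0 * u[i][j]
            )
    return new


if __name__ == "__main__":
    grid = [[0.0] * 5 for _ in range(5)]
    grid[2][2] = 1.0  # single hot cell in the middle
    for row in heat_step(grid, 0.25):
        print(row)
```

Every interior cell is updated independently from the previous time level, so a GPU version would typically assign one thread per cell; that mapping is what makes this problem a natural kernel demo.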

How did F# pay off?
- F# is used in multiple ways
  - Internally to build our CUDA compiler with quotations
  - As hosting language to define an internal DSL for CUDA, i.e. the CUDA monad
  - Use extensibility of F# to build more DSLs such as the pcalc monad
- Benefit of using F#
  - Rich ecosystem: first class language in .NET
  - Functional: less and more descriptive code
  - Monads: ideal to create DSLs, seq, async, F# LINQ
  - Type providers: easier to integrate into information rich programming
  - Quotations: lot of flexibility for DSL or compiler implementation
  - Extensibility: pure solution for CUDA programming

More Information on F#
- F# is a functional-first programming language
- Strongly typed
- First class .NET language
- Open source
- Designed to solve complex computing problems with simple, maintainable, robust code
- Runs on Windows, Linux, Mac OS, HTML5 and also on GPUs
- http://www.fsharp.org
- http://www.tryfsharp.org
Thank you