MATLAB Acceleration for Image Processing using CUDA-Enabled GPUs
March 2009
John Melonakos
AccelerEyes
Sumit Gupta
NVIDIA Tesla GPU Computing
What is GPU Computing?
Computing with CPU + GPU
Heterogeneous Computing
[Figure label: 4 cores]
Computation Discontinuity
Double Precision debut
146X: Medical Imaging (U of Utah)
36X: Molecular Dynamics (U of Illinois, Urbana)
18X: Video Transcoding (Elemental Tech)
50X: MATLAB Computing (AccelerEyes)
100X: Astrophysics (RIKEN)
149X: Financial simulation (Oxford)
47X: Linear Algebra (Universidad Jaime)
20X: 3D Ultrasound (Techniscan)
130X: Quantum Chemistry (U of Illinois, Urbana)
30X: Gene Sequencing (U of Maryland)
50x – 150x
CUDA Parallel Computing Architecture
ATI’s Compute “Solution”
Parallel computing architecture and programming model
Includes a C compiler plus support for OpenCL and DX11 Compute
Architected to natively support all computational interfaces (standard languages and APIs)
NVIDIA Tesla 10-Series GPU
Massively parallel, many core architecture
CUDA Facts
900+ Research Papers
115+ universities teaching CUDA
www.NVIDIA.com/CUDA
• 200+ papers and applications
• 110 Million CUDA-Enabled GPUs
• 60,000+ Active Developers
Background
• Who is AccelerEyes?
– AccelerEyes is a MathWorks partner
– Simple software for visual computing
• What is Jacket?
– GPU engine for MATLAB
– CUDA powered language extension
• Why Jacket?
– Challenges in technical computing
– Low-cost speed, high-value graphics
– Increased productivity
MATLAB Options
• CPU Solutions (blue arrows)
– MATLAB and the Parallel Computing Toolbox enable PC and clustered MATLAB computing
• GPU Solutions (green arrows)
– Jacket enables CUDA MATLAB computing
Jacket Benefits
Jacket combines the speed of CUDA and the graphics of the GPU with the user friendliness of MATLAB.
Functionality
Generators: geye, gones, gzeros
Element-wise: +, *, -, /
Reductions: sum, min, max …
Indexing: subscripted referencing / subscripted assignment
Linear Algebra: matrix multiply, …
FFT: fft, ifft, fftn, ifftn
Filtering: filter, filter2, convn
Interpolation: interp2
Parallel for-loops: gfor
Kernel Benchmarks
54x Speedup 16x Speedup
Application Benchmarks
Optical Flow (Horn&Schunck)
Inputs: image1, image2. Output: [u, v]
Speedup: 12X on 128x256
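For readers unfamiliar with the Horn and Schunck method named on this slide, here is a minimal pure-Python sketch of its iterative update. It is not the Jacket implementation benchmarked in the talk. The smoothness weight `alpha`, the iteration count, the finite-difference scheme, the 4-neighbour average, and all function names are assumptions chosen for illustration.

```python
# Minimal Horn-Schunck optical flow sketch in pure Python (illustrative only).
# Assumptions not taken from the slides: alpha, the iteration count, the
# finite-difference scheme, and the 4-neighbour average.

def diff(values, i):
    """One-sided difference at the ends, central difference inside."""
    n = len(values)
    if i == 0:
        return values[1] - values[0]
    if i == n - 1:
        return values[n - 1] - values[n - 2]
    return (values[i + 1] - values[i - 1]) / 2.0


def neighbour_mean(field, y, x):
    """Average of the 4 neighbours, replicating values at the border."""
    h, w = len(field), len(field[0])
    up = field[max(y - 1, 0)][x]
    down = field[min(y + 1, h - 1)][x]
    left = field[y][max(x - 1, 0)]
    right = field[y][min(x + 1, w - 1)]
    return (up + down + left + right) / 4.0


def horn_schunck(image1, image2, alpha=1.0, iterations=60):
    """Return flow fields (u, v) estimated from two grayscale frames."""
    h, w = len(image1), len(image1[0])
    # Spatial derivatives from the first frame, temporal from the difference.
    ix = [[diff(image1[y], x) for x in range(w)] for y in range(h)]
    iy = [[diff([image1[r][x] for r in range(h)], y) for x in range(w)]
          for y in range(h)]
    it = [[image2[y][x] - image1[y][x] for x in range(w)] for y in range(h)]

    u = [[0.0] * w for _ in range(h)]
    v = [[0.0] * w for _ in range(h)]
    for _ in range(iterations):
        new_u = [[0.0] * w for _ in range(h)]
        new_v = [[0.0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                u_bar = neighbour_mean(u, y, x)
                v_bar = neighbour_mean(v, y, x)
                # Within one iteration each pixel depends only on the
                # previous iterate, so all pixels can be updated in parallel.
                residual = ix[y][x] * u_bar + iy[y][x] * v_bar + it[y][x]
                denom = alpha ** 2 + ix[y][x] ** 2 + iy[y][x] ** 2
                new_u[y][x] = u_bar - ix[y][x] * residual / denom
                new_v[y][x] = v_bar - iy[y][x] * residual / denom
        u, v = new_u, new_v
    return u, v


# Synthetic check: a horizontal intensity ramp shifted right by one pixel.
frame1 = [[float(x) for x in range(6)] for _ in range(5)]
frame2 = [[float(x - 1) for x in range(6)] for _ in range(5)]
u, v = horn_schunck(frame1, frame2)
print(round(u[2][3], 3), round(v[2][3], 3))
```

On the synthetic ramp, the horizontal component `u` should approach one pixel per frame and `v` should stay at zero. The per-pixel independence of each update is the property that makes this kind of algorithm a natural fit for GPU execution.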
Image Thresholding
CPU vs. GPU
Speedup: 20X on 512x512
Image Smoothing
Speedup: 12X on 915x915
Image Interpolation
Speedup: 200X on 256x256
Image Morphing
Speedup: 40X on 512x512
Custom CUDA Functions
Integration using MEX
mymex.cu
Graphics Toolbox
True visual computing
OpenGL API in MATLAB
Interactive OpenGL
Key functions: gsurf, gimage, gscatter3, gplot, …
Visualization scripts are open and modifiable.
Jacket includes the Graphics Toolbox
Some Jacket Customers
Roadmap for New Features
– more gfor
– gdouble
– multi-GPU support (for clusters of GPUs)
– LAPACK (eig, inv, etc.)
– signal processing
– image processing (and computer vision)
– Simulink® on the GPU
– statistical functions
– handle graphics
– lots of other MATLAB functions (finance, biology, etc.)
Tesla GPU Computing Products
Built for High Performance Computing
Tesla GPU Computing Products
Tesla S1070 1U System
– GPUs: 4 Tesla GPUs
– Single Precision Perf: 4.14 Teraflops
– Double Precision Perf: 346 Gigaflops
– Memory: 4 GB / GPU
Tesla C1060 Computing Board
– GPUs: 1 Tesla GPU
– Single Precision Perf: 933 Gigaflops
– Double Precision Perf: 78 Gigaflops
– Memory: 4 GB
Tesla Personal Supercomputer (4 Tesla C1060s)
– GPUs: 4 Tesla GPUs
– Single Precision Perf: 3.7 Teraflops
– Double Precision Perf: 312 Gigaflops
– Memory: 4 GB / GPU
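The aggregate figures in this product comparison follow from the per-board numbers by simple multiplication. The short Python sketch below re-derives them. The constants are copied from the slide, the variable names are illustrative, and the results are peak figures as quoted, not measured throughput.

```python
# Peak figures copied from the product comparison, in gigaflops.
C1060_SINGLE = 933.0
C1060_DOUBLE = 78.0
S1070_SINGLE = 4140.0
S1070_DOUBLE = 346.0
GPUS_PER_SYSTEM = 4

# Personal Supercomputer: four C1060 boards added together.
psc_single = GPUS_PER_SYSTEM * C1060_SINGLE
psc_double = GPUS_PER_SYSTEM * C1060_DOUBLE

# S1070: divide the system totals back down to one GPU.
s1070_single_per_gpu = S1070_SINGLE / GPUS_PER_SYSTEM
s1070_double_per_gpu = S1070_DOUBLE / GPUS_PER_SYSTEM

# Ratio of single to double precision peak on one C1060.
single_to_double = C1060_SINGLE / C1060_DOUBLE

print(psc_single, psc_double)
print(s1070_single_per_gpu, s1070_double_per_gpu)
print(round(single_to_double, 1))
```

Four C1060 boards give 3732 and 312 gigaflops, which match the rounded 3.7 Teraflops and 312 Gigaflops quoted for the Personal Supercomputer. Single precision peak is roughly twelve times the double precision peak on this GPU generation, which matters when porting double precision MATLAB code.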
Tesla Personal Supercomputer: Cluster Perf
Supercomputing Performance
– 960 cores, 4 Teraflops
– Performance of a 64-node CPU cluster
Personal
– One researcher, one supercomputer
– Plugs into standard power strip
Accessible
– Program in C for Windows, Linux
Tesla S1070: Supercharge your cluster
Hess
Chevron
Petrobras
NCSA
CEA
Tokyo Tech
JFCOM
SAIC
Federal
Motorola
Kodak
BNP Paribas
University of Heidelberg
University of Illinois
University of North Carolina
Max Planck Institute
Rice University
University of Maryland
Eötvös University
University of Wuppertal
Chinese Academy of Sciences
National Taiwan University
Diagram labels: Tesla S1070; Host Server; PCIe Gen2 Host Interface Cards; PCIe Gen2 Cables (0.5 m length)
$5 Million Cluster: Lower Power, Higher Perf
CPU-only cluster
– Node: CPU 1U server with 2 quad-core Xeon CPUs (8 cores)
– Per node: 0.17 Teraflop (single), 0.08 Teraflop (double)
– 1819 CPU servers
– 310 Teraflops (single), 155 Teraflops (double)
– Total area 16K sq feet
– Total 1273 KW
CPU + Tesla cluster
– Node: CPU 1U server + Tesla 1U system (8 CPU cores + 4 GPUs = 968 cores)
– Per node: 4.14 Teraflops (single), 0.346 Teraflop (double)
– 455 CPU servers, 455 Tesla systems
– 1961 Teraflops (single), 196 Teraflops (double)
– Total area 9K sq feet
– Total 682 KW
Comparison
– 6x more perf
– 40% smaller
– ½ the power
– 50% fewer systems
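The headline ratios on this slide can be re-derived from its own totals. The Python sketch below does that arithmetic. All inputs are copied from the slide; the dictionary layout and helper name are illustrative assumptions.

```python
# Totals copied from the $5 million cluster comparison.
cpu_cluster = {
    "systems": 1819,
    "single_tf": 310.0,
    "double_tf": 155.0,
    "area_sqft": 16000,
    "power_kw": 1273,
}
tesla_cluster = {
    "systems": 455 + 455,  # CPU servers plus Tesla 1U systems
    "single_tf": 1961.0,
    "double_tf": 196.0,
    "area_sqft": 9000,
    "power_kw": 682,
}


def ratio(key):
    """Tesla cluster value divided by CPU-only cluster value."""
    return tesla_cluster[key] / cpu_cluster[key]


# Single precision total rebuilt from per-node figures:
# each of the 455 node pairs contributes 4.14 (Tesla) + 0.17 (CPU) teraflops.
rebuilt_single_tf = 455 * (4.14 + 0.17)

print("single precision gain:", round(ratio("single_tf"), 2))
print("double precision gain:", round(ratio("double_tf"), 2))
print("area reduction:", round(1 - ratio("area_sqft"), 4))
print("power ratio:", round(ratio("power_kw"), 3))
print("system count ratio:", round(ratio("systems"), 3))
print("rebuilt single precision total:", round(rebuilt_single_tf))
```

The single precision gain works out to about 6.3x, in line with the "6x more perf" label, while the double precision gain from the same totals is only about 1.26x. The area, power, and system-count ratios come out near 44% smaller, 54% of the power, and half the systems, so the slide's round numbers are approximations of these values.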
[Figure: cost vs. performance chart. Cost levels: $5K - $10K, $500K - $1M, $3M+. Performance levels: 1x, 250x, 5000x, 25,000x. Systems plotted: Workstation, Tesla Personal Supercomputer, 64-node CPU Cluster, 64-node Tesla Cluster, 256-512 node CPU Cluster, 256-512 node Tesla Cluster.]
Segments: Life Sciences & Medical Equipment; Productivity / Misc; Oil and Gas; EDA; Finance; CAE / Mathematical; Communication
Max Planck, FDA, Robarts Research, Medtronic, AGC, Evolved Machines, Smith-Waterman DNA sequencing, AutoDock, NAMD/VMD, Folding@Home, Howard Hughes Medical, CRIBI Genomics, GE Healthcare, Siemens, Techniscan, Boston Scientific, Eli Lilly, Silicon Informatics
Stockholm Research, Harvard, Delaware, Pittsburgh, ETH Zurich, Institute Atomic Physics, CEA, NCSA, WRF Weather Modeling
OptiTex, Tech-X, Elemental Technologies, Dimensional Imaging, Manifold, Digisens, General Mills, Rapidmind, Rhythm & Hues, xNormal, Elcomsoft, LINZIK
Hess, TOTAL, CGG/Veritas, Chevron, Headwave, Acceleware, Seismic City, P-Wave Seismic Imaging, Mercury Computer, ffA
Synopsys, Nascentric, Gauda, CST, Agilent
Symcor, Level 3, SciComp, Hanweck, Quant Catalyst, RogueWave, BNP Paribas
AccelerEyes, MathWorks, Wolfram, National Instruments, Ansys, Access Analytics, Tech-X, RIKEN, SOFA, Renault, Boeing
Nokia, RIM, Philips, Samsung, LG, Sony, Ericsson, NTT DoCoMo, Mitsubishi, Hitachi, Radio Research Laboratory, US Air Force
5000+ Customers / ISVs
More Information
Tesla main page
http://www.nvidia.com/tesla
Vertical Solutions
http://www.nvidia.com/object/vertical_solutions.html
CUDA Zone: CUDA Tutorials, Applications
http://www.nvidia.com/cuda
Hear from Developers
http://www.youtube.com/nvidiatesla
Download Jacket Now
http://www.accelereyes.com
Further Jacket Questions
http://www.accelereyes.com/forums
http://www.accelereyes.com/blog
John Melonakos
Sumit Gupta