Design and Implementation of GPU-based SAR Image Processor
TRANSCRIPT
![Page 1: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/1.jpg)
Design and Implementation of GPU-based SAR Image Processor
Master Thesis Presentation
Najeeb Ahmad
May 2012
Supervisor: Dr. Sun Jinping
School of Electronic Information Engineering, Beihang University, Beijing, China.
![Page 2: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/2.jpg)
Contents
1. Introduction
2. GPU Computing
3. SAR Processing
4. Implementation
5. Conclusion & Future Work
![Page 3: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/3.jpg)
1. Introduction
Problem
Motivation
Objective
Methodology
![Page 4: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/4.jpg)
PROBLEM
Synthetic Aperture Radar (SAR) data processing is computationally intensive and time consuming on conventional CPUs. Given the growing use of GPUs for scientific computing, the task is to accelerate the simplified range Doppler SAR processing algorithm on a GPU using modern GPGPU technology, to achieve real-time or near-real-time performance, and to evaluate the GPU's suitability for SAR processing.
![Page 5: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/5.jpg)
MOTIVATION
Computationally intensive and time-consuming nature of SAR processing algorithms.
Inherent parallelism in most SAR processing algorithms.
Advent of modern GPGPU technology and availability of commodity GPUs as general-purpose computation engines.
Architectural parallelism and ample hardware resources in modern GPUs, which make them well suited to handling large data volumes and to parallel SAR algorithm implementation.
![Page 6: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/6.jpg)
OBJECTIVE
To implement and accelerate the simplified range Doppler SAR processing algorithm on a modern NVIDIA Tesla GPU using CUDA and MATLAB's GPU capabilities.
The research explores the following areas:
Algorithm adaptation for parallel implementation.
Suitability of MATLAB for algorithm implementation.
Suitability of CUDA for algorithm implementation.
Comparison of CPU, CUDA and MATLAB-GPU implementations.
GPU as a SAR processing platform.
![Page 7: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/7.jpg)
METHODOLOGY
Algorithm implementation and verification on an Intel Xeon CPU using MATLAB.
Identification of parallelizable portions of the algorithm.
Algorithm implementation on the Tesla C1060 GPU using MATLAB's native GPU capabilities.
Algorithm implementation on the Tesla C1060 GPU using CUDA.
Analysis of the CPU, MATLAB-GPU and CUDA implementations.
![Page 8: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/8.jpg)
2. GPU Computing
Introduction to GPU Computing
GPGPU: Brief History
NVIDIA CUDA
Writing efficient code
![Page 9: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/9.jpg)
Introduction to GPU Computing
Use of Graphics Processing Units (GPUs) for general-purpose computing applications.
CPU: one, four or eight cores. Capable of handling a few threads. Suitable for serial code.
GPU: hundreds of cores. Capable of handling hundreds of threads. Suitable for parallel code.
![Page 10: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/10.jpg)
Introduction to GPU Computing
GPU computing model: a heterogeneous computing model that employs both CPU and GPU, with serial computing on the CPU and parallel computing on the GPU.
![Page 11: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/11.jpg)
GPGPU: Brief History
First use of the GPU as a general-purpose computing device around 1999-2000, using graphics APIs. Huge performance boosts observed. Generally unpopular due to tedious programming.
Introduction of NVIDIA's "CUDA" and AMD's "Stream Computing" in 2007. Beginning of the modern GPGPU era. Other vendors introduced their own GPGPU systems.
NVIDIA's CUDA is gaining popularity due to its maturity and performance.
![Page 12: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/12.jpg)
NVIDIA CUDA
Compute Unified Device Architecture.
Comprises an Instruction Set Architecture (ISA) and a parallel compute engine in the GPU, programmable with high-level languages extended for GPU computing.
The CUDA framework has two parts: hardware and software. From the software perspective, CUDA means C/C++ and Fortran extended to support GPU computing.
CUDA is a "Single Instruction Multiple Thread" (SIMT) architecture.
![Page 13: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/13.jpg)
CUDA Hardware
Streaming multiprocessor (SM): basic computing unit of the GPU. Comprises eight streaming processors (SPs) and memory. GPUs differ in the number of SMs and in SP clock frequency.
[Figure: block diagram of one SM showing eight SPs, two SFUs, an MT IU and shared memory.]
![Page 14: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/14.jpg)
CUDA Memory Architecture
Understanding the memory architecture is critical for writing efficient CUDA programs.
All CUDA-enabled hardware has the following types of memory:
Global memory.
Shared memory and registers.
Texture memory and texture cache.
Constant memory and constant cache.
Local memory for register spilling.
[Figure: GPU memory layout. SM 1 to SM n each contain SPs, shared memory, a texture cache and a constant cache; all SMs access global memory (RAM), which holds local, texture and constant memory.]
![Page 15: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/15.jpg)
NVIDIA Tesla C1060 GPU
PCI Express 2.0 compliant computing processor board based on the NVIDIA Tesla T10 graphics processing unit, targeted at HPC applications.
Feature highlights:
30 SMs = 240 SPs.
SP clock = 1.296 GHz.
4 GB GDDR3 memory with 102 GB/s bandwidth.
IEEE 754 single- and double-precision floating point compliant.
933 GFLOPS single-precision and 78 GFLOPS double-precision performance.
Compute capability: 1.3.
Supported by MATLAB for GPU computing.
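The quoted peak figures can be cross-checked from the SM count and SP clock. The Python sketch below is only a back-of-the-envelope check; the per-clock issue rates (three single-precision operations per SP, one double-precision fused multiply-add unit per SM) are assumptions about this GPU generation that the slide does not state.

```python
# Rough check of the Tesla C1060 peak throughput figures.
# Assumed (not on the slide): each SP issues a multiply-add plus a multiply
# per clock (3 single-precision flops); each SM has one double-precision
# unit performing a fused multiply-add per clock (2 flops).

SM_COUNT = 30
SP_PER_SM = 8
SP_CLOCK_GHZ = 1.296


def peak_gflops(units, flops_per_clock, clock_ghz):
    """Peak rate in GFLOPS = units * flops per clock * clock in GHz."""
    return units * flops_per_clock * clock_ghz


single = peak_gflops(SM_COUNT * SP_PER_SM, 3, SP_CLOCK_GHZ)
double = peak_gflops(SM_COUNT, 2, SP_CLOCK_GHZ)

print(f"single precision: {single:.1f} GFLOPS")
print(f"double precision: {double:.1f} GFLOPS")
```

Under these assumptions the results round to the 933 and 78 GFLOPS listed on the slide.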
![Page 16: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/16.jpg)
CUDA Programming Model
At its core are thread groups, shared memory and barrier synchronization.
Combines coarse-grained data and task parallelism with fine-grained data and thread parallelism, which gives expressivity and scalability.
Thread hierarchy: grid, blocks, threads.
Kernels: functions executed on the device (GPU) in parallel threads.
CUDA provides APIs to launch kernels in parallel threads and to synchronize them.
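To make the grid/block/thread hierarchy concrete, the sketch below emulates CUDA's index arithmetic serially in Python. It is not CUDA code: `launch_1d` and `scale_kernel` are hypothetical helpers, and the argument names only mirror the CUDA built-ins `blockIdx`, `blockDim` and `threadIdx`.

```python
# Serial emulation of a 1-D CUDA launch; plain Python, not the CUDA API.

def global_index(block_idx, block_dim, thread_idx):
    """Element handled by one thread: blockIdx * blockDim + threadIdx."""
    return block_idx * block_dim + thread_idx


def launch_1d(kernel, grid_dim, block_dim, *args):
    """Call `kernel` once per (block, thread) pair, one after another."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


def scale_kernel(block_idx, block_dim, thread_idx, data, factor, n):
    i = global_index(block_idx, block_dim, thread_idx)
    if i < n:  # guard: the last block may contain spare threads
        data[i] *= factor


n = 10
data = list(range(n))
block_dim = 4
grid_dim = (n + block_dim - 1) // block_dim  # round up so every element is covered
launch_1d(scale_kernel, grid_dim, block_dim, data, 2, n)
print(data)
```

On a GPU the two loops in `launch_1d` disappear: every (block, thread) pair runs the kernel body concurrently, which is why the bounds guard inside the kernel is needed.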
![Page 17: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/17.jpg)
Processing Flow
Copy input data from CPU to GPU memory.
Load the GPU program and execute it, caching results on the device.
Copy results from GPU to CPU.
[Figure: host (CPU and RAM) connected to device (GPU with global, constant and texture memory).]
![Page 18: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/18.jpg)
Writing Efficient Code
High priority considerations:
Minimize CPU-GPU transfers.
Use coalesced data transfers.
Use shared memory instead of global memory whenever possible.
Avoid different execution paths within a warp.
Medium priority considerations:
Plan shared memory accesses to avoid serialization.
Avoid redundant data transfers from global memory.
![Page 19: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/19.jpg)
Writing Efficient Code
Threads per block should be a multiple of 32.
Use the fast math library whenever possible.
Low priority considerations:
Use zero-copy operations.
For kernels with long argument lists, place some arguments in constant memory.
Avoid expensive modulo and division operations in favor of shift operations whenever possible.
Avoid automatic conversion of double to float.
Use loop unrolling whenever possible.
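Two of these guidelines reduce to simple integer arithmetic. The Python sketch below illustrates them; the helper names are hypothetical, and the shift and mask identities hold only for non-negative integers and power-of-two divisors.

```python
WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def round_up_to_warp(threads):
    """Smallest multiple of 32 that is >= threads."""
    return (threads + WARP_SIZE - 1) // WARP_SIZE * WARP_SIZE


def div_pow2(x, k):
    """x // 2**k for non-negative x, computed with a right shift."""
    return x >> k


def mod_pow2(x, k):
    """x % 2**k for non-negative x, computed with a bit mask."""
    return x & ((1 << k) - 1)


print(round_up_to_warp(100))           # block size padded to whole warps
print(div_pow2(100, 3), mod_pow2(100, 3))
```

A block of 100 threads would occupy four warps anyway, so requesting 128 wastes nothing and keeps every warp fully populated.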
![Page 20: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/20.jpg)
3. SAR Processing
What is Synthetic Aperture Radar
SAR Processing
Processing Algorithms
Basic RDA
Simplified RDA
![Page 21: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/21.jpg)
What is Synthetic Aperture Radar
An active microwave remote sensing imaging system.
Employs the long-range propagation characteristics of radar and complex signal processing techniques to produce high-resolution images.
High resolution is achieved by synthesizing a long antenna aperture through signal processing techniques.
Pros (in comparison with optical systems):
All-weather, day and night operation.
No effects from constituents of the atmosphere.
Sensitivity to dielectric properties (can image ice, biomass etc.).
Sensitivity to surface roughness (oceans, wind speed etc.).
![Page 22: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/22.jpg)
What is Synthetic Aperture Radar
Accurate measurement of distance.
Sensitivity to man-made objects.
Sensitivity to target structure.
Subsurface penetration.
Cons:
Complex interactions (difficult to visualize and understand).
Speckle effects (make visual interpretation difficult).
Topographic effects.
![Page 23: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/23.jpg)
SAR Processing
A set of procedures to obtain an interpretable image from raw data that is spread in the azimuth and range directions.
In range, the data is spread over the duration of the transmitted FM pulse.
In azimuth, the data is spread over the time for which a point target is illuminated by the radar beam.
SAR processing compresses this data, taking into account range cell migration, earth curvature, earth rotation and air/spacecraft attitude noise, to produce the final image.
Given the nature of the SAR system and its signals, signal processing rather than image processing provides the appropriate tools for SAR processing.
![Page 24: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/24.jpg)
SAR Processing Algorithms
Mainstream SAR processing algorithms include:
Range Doppler algorithm (RDA): high-resolution images for low squint and relatively small aperture sizes. Very popular.
Chirp scaling algorithm (CSA): two-dimensional operations with range independence, followed by range corrections in the range Doppler domain.
Omega-K algorithm (ωKA): efficient and accurate in the two-dimensional frequency domain.
SPECAN algorithm: good for medium to low resolution requirements.
![Page 25: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/25.jpg)
Range Doppler Algorithm
Versions of range Doppler:
Basic RDA
RDA with accurate SRC (secondary range compression)
RDA with approximate SRC
Simplified range Doppler
![Page 26: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/26.jpg)
Basic RDA
Processing chain (flowchart, with the note attached to each step):
Raw data.
Range compression: range FFT, matched filter multiply, range IFFT.
Azimuth FFT: data is now in the range Doppler domain.
RCMC: interpolation operation in the range Doppler domain.
Azimuth compression: azimuth matched filter multiply.
Azimuth IFFT and look summation: brings the signal back into the time domain.
Final image.
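The range compression step (range FFT, matched filter multiply, range IFFT) can be sketched on a toy range line as follows. This is an illustrative Python sketch, not the thesis code: a direct O(n²) DFT stands in for the FFT, and the chirp rate, pulse length and echo delay are made-up values.

```python
import cmath


def dft(x, inverse=False):
    """Direct O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
               for k in range(n)) for m in range(n)]
    return [v / n for v in out] if inverse else out


def chirp(n_samples, rate):
    """Linear FM pulse exp(j*pi*rate*t^2), with t in samples, centred on zero."""
    return [cmath.exp(1j * cmath.pi * rate * (t - n_samples / 2) ** 2)
            for t in range(n_samples)]


def range_compress(line, replica):
    """Range FFT, multiply by the conjugate replica spectrum, range IFFT."""
    padded = replica + [0j] * (len(line) - len(replica))
    line_spec = dft(line)
    filt = [v.conjugate() for v in dft(padded)]
    return dft([a * b for a, b in zip(line_spec, filt)], inverse=True)


# Toy range line: a single echo of the pulse starting at sample 20.
n, pulse_len, delay = 64, 16, 20
replica = chirp(pulse_len, rate=0.05)
line = [0j] * n
for i, v in enumerate(replica):
    line[delay + i] = v

compressed = range_compress(line, replica)
peak = max(range(n), key=lambda i: abs(compressed[i]))
print(peak, round(abs(compressed[peak]), 3))
```

The echo energy, originally spread over the whole pulse length, collapses into a peak at the echo's start sample, which is the compression the matched filter is meant to deliver.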
![Page 27: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/27.jpg)
Simplified RDA
For narrower swath widths and medium resolution requirements, RCM can be assumed independent of range.
Processing chain (flowchart, with the note attached to each step):
Raw data.
Pre-filtering: removes the Doppler centroid.
Range compression: range FFT, matched filter multiply (no range IFFT).
Azimuth FFT: both range and azimuth are now in the frequency domain.
RCMC: RCM phase function multiplied with each range line.
Range IFFT: data is now in the range Doppler domain.
Azimuth compression.
Azimuth IFFT and look summation.
Final image.
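Because the migration is treated as independent of range, the correction for one range line is a single shift, and a shift in range is a linear phase ramp in range frequency. The Python sketch below illustrates that idea on toy data; the migration values are invented, the rows are assumed to be already in the azimuth-frequency domain, and a direct DFT stands in for the FFT.

```python
import cmath


def dft(x, inverse=False):
    """Direct O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
               for k in range(n)) for m in range(n)]
    return [v / n for v in out] if inverse else out


def advance_by_phase(range_line, shift):
    """Move the contents of range_line `shift` samples earlier (circularly)
    by multiplying its spectrum with a linear phase ramp."""
    n = len(range_line)
    corrected = []
    for m, value in enumerate(dft(range_line)):
        f = m if m < n // 2 else m - n  # signed frequency index
        corrected.append(value * cmath.exp(2j * cmath.pi * f * shift / n))
    return dft(corrected, inverse=True)


# Toy data: three range lines whose target has migrated by 0, 1 and 2 cells.
n, true_cell = 32, 12
migration = [0, 1, 2]
rows = []
for cells in migration:
    row = [0j] * n
    row[true_cell + cells] = 1 + 0j
    rows.append(row)

aligned = [advance_by_phase(row, cells) for row, cells in zip(rows, migration)]
peaks = [max(range(n), key=lambda i: abs(r[i])) for r in aligned]
print(peaks)
```

In the processing chain above the data is already in the range-frequency domain after range compression, so the ramp is applied directly and only the final range IFFT is needed; the forward transform inside `advance_by_phase` exists only to keep the sketch self-contained.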
![Page 28: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/28.jpg)
4. Implementation
Hardware resources
Software resources
CPU Implementation
MATLAB GPU Implementation
CUDA Implementation
Result Comparison
![Page 29: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/29.jpg)
Hardware resources

CPU:

| Item | Value |
| --- | --- |
| Name | Intel Xeon E5504 |
| CPU clock | 2 GHz |
| # of cores | 4 |
| System memory | 4 GB |
| DDR3 clock | 800 MHz |
| Maximum memory bandwidth | 19.2 GB/s |
| Memory type | DDR3 PC3 |
| PCI slot | PCI Express |

GPU:

| Item | Value |
| --- | --- |
| Name | NVIDIA Tesla C1060 |
| # of cores | 240 |
| SP clock | 1.296 GHz |
| Memory | 4 GB GDDR3 |
| Maximum memory bandwidth | 102 GB/s |
| Memory interface | 512 bit – PCI Express |
| GFLOPS | 933 single precision, 78 double precision |
![Page 30: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/30.jpg)
Software resources
CPU:
Windows 7 Ultimate 64-bit
MATLAB release 2010b
Visual Studio 2008 SP1
GPU:
CUDA Toolkit 4.1
MATLAB release 2010b
NVIDIA Parallel Nsight
Visual Profiler
CUDA MEMCHECK
CUFFT library
![Page 31: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/31.jpg)
RADARSAT-1 Data
• CEOS format.
• Raw data must be extracted from the CEOS data before the SAR processing algorithm can be applied.

| Parameter | Value | Units |
| --- | --- | --- |
| Sampling rate | 32.317 | MHz |
| Range FM rate | 0.72135 | MHz/µs |
| Pulse duration | 41.74 | µs |
| Radar frequency | 5.3 | GHz |
| Radar wavelength | 0.05657 | m |
| Pulse repetition frequency | 1256.98 | Hz |
| Effective radar velocity | 7062 | m/s |
| Azimuth FM rate | 1733 | Hz/s |
| Doppler centroid | -6900 | Hz |

Table: RADARSAT-1 data parameters

[Figure: CEOS data → CEOS data extraction utility → raw SAR data.]
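Several derived quantities follow directly from the tabulated parameters, reading the range FM rate as 0.72135 MHz/µs. The Python sketch below works them out; the slant range resolution uses the textbook c/(2B) expression without any window broadening factor, which is an assumption rather than a figure from the slides.

```python
C = 299_792_458.0  # speed of light, m/s

# Values from the RADARSAT-1 parameter table, converted to SI units.
sampling_rate_hz = 32.317e6
range_fm_rate_hz_per_s = 0.72135e12  # 0.72135 MHz/us
pulse_duration_s = 41.74e-6
radar_frequency_hz = 5.3e9

wavelength_m = C / radar_frequency_hz
chirp_bandwidth_hz = range_fm_rate_hz_per_s * pulse_duration_s
samples_per_pulse = sampling_rate_hz * pulse_duration_s
oversampling_ratio = sampling_rate_hz / chirp_bandwidth_hz
# Assumed: ideal resolution c / (2B), ignoring window broadening.
slant_range_resolution_m = C / (2 * chirp_bandwidth_hz)

print(f"wavelength             : {wavelength_m:.5f} m")
print(f"chirp bandwidth        : {chirp_bandwidth_hz / 1e6:.2f} MHz")
print(f"samples per pulse      : {samples_per_pulse:.0f}")
print(f"oversampling ratio     : {oversampling_ratio:.3f}")
print(f"slant range resolution : {slant_range_resolution_m:.2f} m")
```

The computed wavelength differs from the tabulated 0.05657 m only in the last digit, which is consistent with the 5.3 GHz carrier being a rounded value. The pulse spans roughly 1350 range samples, which sets the length of the range matched filter.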
![Page 32: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/32.jpg)
SAR Processing GUI
Functions:
• CEOS data extraction.
• MATLAB-CPU SAR processing.
• MATLAB-GPU SAR processing.
• CUDA input/output manipulation.
• CUDA program execution.
![Page 33: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/33.jpg)
CPU Implementation
Implemented using MATLAB.
FFT/IFFT performed using standard MATLAB functions.
![Page 34: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/34.jpg)
CPU Processed SAR image
A 2048 x 4096 SAR image using CPU based implementation
![Page 35: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/35.jpg)
MATLAB-GPU Implementation
MATLAB has supported GPU computing since release 2010b.
Implemented using native MATLAB-GPU functions only (no CUDA kernel calls).
Vectorization strategy employed to implement vector-matrix multiplications on the GPU.
All FFT/IFFTs performed using MATLAB-GPU FFT/IFFT support functions.
[Figure: vectorization strategy, showing a vector replicated across Column 1 to Column n of a matrix.]
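One reading of the vectorization strategy is that the filter vector is copied into every column of a matrix so that the product becomes a single element-wise matrix operation. The Python sketch below illustrates that reading with nested lists; it is not the MATLAB code, the helper names are hypothetical, and the broadcast variant only shows what avoiding the replicated matrix would look like.

```python
def replicate_columns(vector, n_cols):
    """Build a len(vector) x n_cols matrix whose columns all equal `vector`."""
    return [[v] * n_cols for v in vector]


def elementwise_multiply(a, b):
    """Element-by-element product of two equally sized matrices."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def broadcast_multiply(vector, matrix):
    """Same result without materialising the replicated matrix."""
    return [[v * x for x in row] for v, row in zip(vector, matrix)]


filter_vector = [1, 2, 3]                 # one value per row
data = [[1, 1, 1, 1],
        [2, 2, 2, 2],
        [3, 3, 3, 3]]

replicated = replicate_columns(filter_vector, n_cols=4)
result = elementwise_multiply(replicated, data)
print(result)
print(sum(len(row) for row in replicated), "stored values instead of", len(filter_vector))
```

The replicated operand is as large as the data matrix itself, which would explain why this approach trades extra GPU memory for a single parallel operation.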
![Page 36: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/36.jpg)
MATLAB-GPU Implementation
GPU memory constraints limit the maximum image size that can be calculated.
![Page 37: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/37.jpg)
MATLAB-GPU Implementation
Speedup as high as 21 times achieved compared with the CPU implementation.
![Page 38: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/38.jpg)
MATLAB-GPU Implementation
A 2048 x 4096 SAR image using MATLAB-GPU based implementation
![Page 39: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/39.jpg)
MATLAB-GPU Implementation
Advantages:
Quick and easy to implement.
Sufficient speedups obtained with little effort.
Little knowledge of GPU hardware and no knowledge of optimization techniques required.
Disadvantages:
Currently, only a limited number of MATLAB functions are supported on the GPU.
Not all overloads of a function are available for the GPU.
Less control over hardware resources and memory.
Few optimization options.
![Page 40: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/40.jpg)
CUDA Implementation
Strategy:
Signal data read as a binary file.
Vectors and matched filters calculated on the CPU.
Vectors and signal data transferred to the GPU.
The following kernels executed in order on the GPU:
• Pre-filtering kernel
• Range compression kernel
• RCMC kernel
• Azimuth compression kernel
• Image pixel calculation kernel
Data transferred from GPU to CPU and saved on disk as an image.
![Page 41: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/41.jpg)
Optimization considerations
Chosen block size = 8 × 8 = 64 threads. Conforms with memory coalescing requirements.
Constant variables stored in constant memory.
Local variables and phase functions calculated inside the kernels whenever possible, to reduce global memory access.
CPU-GPU data transfer kept to a minimum: data is transferred CPU→GPU at the beginning and GPU→CPU at the end of the algorithm.
CUFFT's cufftPlanMany() plan used for FFT/IFFTs along data columns.
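The 8 × 8 block size determines how many blocks are needed to cover an image. The Python sketch below works that out for the 2048 x 4096 image size quoted elsewhere in the slides, assuming one thread per pixel; that mapping, and the helper names, are assumptions for illustration and not details taken from the thesis code.

```python
def grid_dims(rows, cols, block=(8, 8)):
    """Blocks needed along (rows, cols) to cover the image, rounding up."""
    block_rows, block_cols = block
    return ((rows + block_rows - 1) // block_rows,
            (cols + block_cols - 1) // block_cols)


def pixel_of(block_idx, thread_idx, block=(8, 8)):
    """(row, col) handled by one thread, assuming one thread per pixel."""
    (block_row, block_col), (thread_row, thread_col) = block_idx, thread_idx
    return (block_row * block[0] + thread_row,
            block_col * block[1] + thread_col)


rows, cols = 2048, 4096
grid = grid_dims(rows, cols)
threads_per_block = 8 * 8
total_threads = grid[0] * grid[1] * threads_per_block

print(grid, total_threads)
print(pixel_of((255, 511), (7, 7)))  # last thread of the last block
```

A block of 64 threads is exactly two warps, so it also satisfies the multiple-of-32 guideline, and for this image size the grid covers the image with no spare threads.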
![Page 42: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/42.jpg)
CUDA Implementation Results
A 2048 x 4096 SAR image using CUDA based implementation
![Page 43: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/43.jpg)
CUDA Implementation Results
![Page 44: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/44.jpg)
CUDA Implementation Results
![Page 45: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/45.jpg)
CUDA/MATLAB-GPU/MATLAB-CPU Computation Time Comparison
![Page 46: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/46.jpg)
MATLAB-GPU/CUDA Computation Time Comparison
![Page 47: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/47.jpg)
MATLAB-GPU/CUDA speedup comparison
CUDA achieved speedups as high as 53 times, compared with a maximum speedup of 21 times for MATLAB-GPU.
![Page 48: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/48.jpg)
5. Conclusions & Future Work
![Page 49: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/49.jpg)
Conclusions
Feasibility of GPU for SAR processing:
The amount of data, the computational effort and the inherent algorithm parallelism make SAR processing well suited to the GPU.
The Tesla C1060 GPU offers enough memory to handle various common SAR image sizes.
Cooling the GPU may be a challenge in some environments.
The scalability of CUDA will be an advantage when porting existing SAR code to newer GPUs.
GPUs might not be suitable where customizable hardware is required or military hardware standards must be adhered to.
![Page 50: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/50.jpg)
Conclusions
MATLAB-GPU based SAR processing:
Significant speedups compared with the CPU.
Quick and easy to implement.
Has some limitations:
• Currently has limited function support for the GPU. Expected to improve with future MATLAB releases.
• The vectorization strategy needs more memory. Future releases promise to remove the need for vectorization (e.g. bsxfun in release 2012a).
• Less control over GPU resources (memory etc.).
CUDA SAR processing:
CUDA is flexible and scalable, with a small learning curve.
More control over GPU resources.
Optimization strategies can be applied.
Faster and more memory efficient than the MATLAB implementation.
![Page 51: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/51.jpg)
Conclusions
Downsides of GPU:
Significant testing and verification effort might be required if the GPU hardware has to be upgraded (because the old hardware becomes obsolete).
The proprietary nature of CUDA might be problematic if the company discontinues CUDA or its support.
![Page 52: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/52.jpg)
Future work
CUDA kernels can be called from MATLAB code using MATLAB's CUDA kernel calling support.
The MATLAB-GPU implementation can be improved as newer and better functions become available.
A C/C++ based CPU implementation can be developed to better judge MATLAB-CPU/CUDA performance.
Other SAR processing algorithms can be implemented using the framework laid out in this project.
![Page 53: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/53.jpg)
Q & A
![Page 54: Design and implementation of GPU-based SAR image processor](https://reader035.vdocuments.us/reader035/viewer/2022062503/58ecc1e81a28ab36358b45e7/html5/thumbnails/54.jpg)
Thank You