Parallelization of System Matrix generation code

Mahmoud Abdallah, Antall Fernandes

SPECT System

Inverse Cone

Back Projection

Ref figure: Tomographic Reconstruction of SPECT Data – Bill Amini, Magnus Björklund, Ron Dror, Anders Nygren

Filtered Back Projection applies a ramp filter to the projection data and then back-projects the filtered result. It is still widely used for its high speed and easy implementation.

Maximum Likelihood-Expectation Maximization Algorithm

Found to reduce noise in the reconstruction by refining the image iteratively

An iterative algorithm is used to solve the following linear problem:

FX = P

P – vector of projection data
X – voxelized image
F – projection matrix operator

Needs a large number of iterations to reconstruct an image

EM Algorithm

The EM algorithm is given by
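Assuming the standard ML-EM update, written in the notation of the linear problem FX = P, the iteration can be stated as below. The index names are an assumption chosen to agree with the notes on the summations: i is the voxel being updated, j runs over detector bins, k runs over voxels, and n is the iteration number.

```latex
X_i^{(n+1)} = \frac{X_i^{(n)}}{\sum_{j} F_{ji}}
              \sum_{j} F_{ji}\,
              \frac{P_j}{\sum_{k} F_{jk}\, X_k^{(n)}}
```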

Summation over k is projection operation

Summation over j is the back projection operation
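The two summations can be made concrete with a minimal sketch of the standard ML-EM update in pure Python. This is an illustration under assumptions, not the authors' code: the function name `mlem`, the dense list-of-rows layout of F, and the toy 3-bin, 2-voxel problem are all hypothetical.

```python
def mlem(F, P, n_iter=50):
    """Standard ML-EM sketch. F[j][k]: weight of voxel k in bin j; P[j]: measured bin value."""
    n_bins, n_vox = len(F), len(F[0])
    X = [1.0] * n_vox  # uniform, strictly positive starting image
    # Normalisation term: sum over bins j of F[j][i] for each voxel i.
    sens = [sum(F[j][i] for j in range(n_bins)) for i in range(n_vox)]
    for _ in range(n_iter):
        # Projection: summation over k (voxels) for every bin j.
        proj = [sum(F[j][k] * X[k] for k in range(n_vox)) for j in range(n_bins)]
        # Ratio of measured to estimated projections (guard against empty bins).
        ratio = [P[j] / proj[j] if proj[j] > 0 else 0.0 for j in range(n_bins)]
        # Back projection: summation over j (bins) for every voxel i.
        back = [sum(F[j][i] * ratio[j] for j in range(n_bins)) for i in range(n_vox)]
        X = [X[i] * back[i] / sens[i] if sens[i] > 0 else 0.0 for i in range(n_vox)]
    return X


# Toy problem: two voxels with true values 2 and 3, seen by three bins.
F_toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
P_toy = [2.0, 3.0, 5.0]
X_est = mlem(F_toy, P_toy)
```

On this noise-free toy problem the estimate approaches the true image within a few dozen iterations. Real SPECT data are noisy and far larger, which is where the large iteration count and the cost of building F become significant.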

System Matrix

Maps the image space to the data space

Takes detector geometry as input

Generates detector data for every bin for each angle (usually there are 72 angles/frames)

System Matrix Algorithm

for each angle do                                          // number of angles = 72
    for each detector bin in the U direction do            // bins: around 14
        for each detector bin in the V direction do        // bins: around 64
            for each row in the inverse cone grid do       // <= 99
                for each column in the inverse cone grid do    // <= 99
                    for each voxel intersected by the ray do
                        calculate point response
                    end
                end
            end
        end
    end
end

Number of loop iterations, not counting the innermost per-voxel loop = 72 x 14 x 64 x 99 x 99 = 632,282,112
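The arithmetic can be checked with a short sketch. The dictionary keys are hypothetical labels for the five outer loops; the values are the figures quoted in the algorithm, where the bin counts are approximate and 99 is an upper limit on the grid size.

```python
from itertools import product
from math import prod

# Loop bounds as quoted in the algorithm (approximate / upper-limit values).
LOOP_BOUNDS = {"angles": 72, "bins_u": 14, "bins_v": 64, "grid_rows": 99, "grid_cols": 99}


def ray_count(bounds):
    """Number of times the body of the five outer loops runs: the product of the bounds."""
    return prod(bounds.values())


def ray_count_by_looping(bounds):
    """Same count obtained by actually iterating; only practical for small bounds."""
    return sum(1 for _ in product(*(range(n) for n in bounds.values())))


total = ray_count(LOOP_BOUNDS)
small = {"angles": 2, "bins_u": 3, "bins_v": 4, "grid_rows": 5, "grid_cols": 5}
```

Each of these iterations still runs the innermost loop over the voxels a ray intersects, so the total amount of work is larger than this figure.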

System Matrix Parallelization

Observation: At each angle, the calculations for each bin are independent of those for the other bins.

Proposal: Parallelize all calculations for each angle.

E.g. use GPU.

System Matrix Parallelization on GPU

Parallelized System Matrix Algorithm

Host Program:
for each angle do
    run the kernels for all bins at the same time
end

GPU Kernel:
for each voxel intersected by the ray do
    calculate attenuation and store it in SysMat
end
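The host/kernel split can be imitated on a CPU as a rough sketch, with a thread pool standing in for the GPU. Everything here is an assumption made for illustration: `kernel` is a hypothetical placeholder that fakes a ray crossing two voxels, the bin counts are toy values, and a dictionary stands in for SysMat. It shows only the structure: a sequential loop over angles on the host, with all bins of one angle dispatched together.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial
from itertools import product

N_U, N_V = 3, 4  # toy bin counts; the algorithm quotes about 14 and 64


def kernel(angle, bin_uv):
    """Hypothetical per-bin kernel: returns (voxel index, weight) pairs for one ray."""
    u, v = bin_uv
    # Placeholder geometry: pretend the ray crosses two voxels of an 8-voxel image.
    return [((u * N_V + v + step) % 8, 1.0 / (1 + step + angle)) for step in range(2)]


def build_system_matrix(n_angles):
    """Host program: loop over angles, run the kernels for all bins of an angle together."""
    sys_mat = {}
    bins = list(product(range(N_U), range(N_V)))
    with ThreadPoolExecutor() as pool:
        for angle in range(n_angles):
            # Bins are independent, so their kernels can run concurrently.
            results = pool.map(partial(kernel, angle), bins)
            for bin_uv, entries in zip(bins, results):
                sys_mat[(angle, *bin_uv)] = entries
    return sys_mat


sys_mat = build_system_matrix(n_angles=2)
```

Because each kernel writes only its own bin's entries, no synchronisation between bins is needed. That independence is what makes the per-angle work suitable for a GPU.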

SIMD (Architecture of GPU)

From: Advanced Micro Devices, Inc. (AMD), Introduction to OpenCL Programming, 2010

OpenCL

Based on ISO C99 with some extensions & restrictions

Provides parallel computing using task-based and data-based parallelism

Architecture: Host Program, Kernel

Program Architecture

Host Program
Executes on the host system.
Sends kernels to execute on OpenCL™ devices using a command queue.

Kernels
Similar to C functions.
Executed on OpenCL™ devices (e.g. GPU).

Thank You
