Developing the Demosaicing Algorithm in GPGPU Ping Xiang Electrical engineering and computer science

Post on 14-Jan-2016


Developing the Demosaicing Algorithm in GPGPU

Ping Xiang, Electrical Engineering and Computer Science

Outline

Background

Algorithm

Implementation

Experiment Results

Future Work

Background

1. Color filter array (CFA): a mosaic of color filters placed in front of the image sensor.
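To make the idea concrete, here is a minimal sketch of how a color filter array reduces a full-color image to one sample per pixel. It assumes a Bayer layout in RGGB order, which the slide does not specify; `bayer_color` and `mosaic` are hypothetical helper names used only for illustration.

```python
def bayer_color(row, col):
    """Return the color sampled at (row, col), assuming an RGGB Bayer layout."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"


def mosaic(rgb):
    """Keep only the filtered channel of every pixel.

    `rgb` is a list of rows, each row a list of (r, g, b) tuples.
    The result is a list of rows with one number per pixel.
    """
    channel_index = {"R": 0, "G": 1, "B": 2}
    return [
        [pixel[channel_index[bayer_color(r, c)]] for c, pixel in enumerate(row)]
        for r, row in enumerate(rgb)
    ]
```

For a 2*2 image, the sensor would keep the red value of the top-left pixel, the green values of the top-right and bottom-left pixels, and the blue value of the bottom-right pixel.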

Background

A demosaicing algorithm reconstructs a full-color image from the data collected through the color filter array.

Algorithm

Bilinear interpolation:

The red value of a non-red pixel is computed as the average of the two or four adjacent red pixels, and similarly for blue and green.
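The rule above can be sketched as follows. This is an illustrative CPU version, not the GPU kernel from the talk; it assumes an RGGB Bayer layout and an image of at least 2*2 pixels, and the function names are hypothetical. Averaging every same-color sample in the 3*3 neighborhood gives exactly the "two or four adjacent pixels" rule in the image interior; at the borders it averages whatever neighbors exist.

```python
def bayer_color(row, col):
    """Return the color sampled at (row, col), assuming an RGGB Bayer layout."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"


def bilinear_demosaic(raw):
    """Rebuild (r, g, b) for every pixel of a single-channel Bayer image."""
    height, width = len(raw), len(raw[0])
    output = []
    for r in range(height):
        output_row = []
        for c in range(width):
            sums = {"R": 0.0, "G": 0.0, "B": 0.0}
            counts = {"R": 0, "G": 0, "B": 0}
            # Accumulate the 8 surrounding samples, grouped by their color.
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr or dc) and 0 <= rr < height and 0 <= cc < width:
                        color = bayer_color(rr, cc)
                        sums[color] += raw[rr][cc]
                        counts[color] += 1
            own = bayer_color(r, c)
            # Measured channel is kept; the other two are neighbor averages.
            pixel = tuple(
                float(raw[r][c]) if ch == own else sums[ch] / counts[ch]
                for ch in "RGB"
            )
            output_row.append(pixel)
        output.append(output_row)
    return output
```

At a red pixel, for example, green comes from the four edge neighbors and blue from the four diagonal neighbors.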

Algorithm

For Green Channels
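One plausible form of the green-channel step is sketched below. It assumes the talk follows the gradient-corrected linear interpolation of Malvar, He, and Cutler, which is an assumption and not something the slide text confirms. In that method, the bilinear green estimate at a red or blue site is corrected by half the difference between the center sample and the mean of the four same-color samples two pixels away. The function name is hypothetical.

```python
def malvar_green(raw, r, c):
    """Estimate green at a red or blue site (r, c) of a Bayer image.

    Assumes (r, c) lies at least two pixels from every border.
    Equivalent to: bilinear green + 0.5 * (center - mean of far same-color samples).
    """
    center = raw[r][c]
    # Four edge-adjacent samples are green at any red or blue site.
    near_green = raw[r - 1][c] + raw[r + 1][c] + raw[r][c - 1] + raw[r][c + 1]
    # Samples two steps away share the center's color (red or blue).
    far_same = raw[r - 2][c] + raw[r + 2][c] + raw[r][c - 2] + raw[r][c + 2]
    return (4 * center + 2 * near_green - far_same) / 8
```

In a flat region the correction term vanishes and the result equals the plain bilinear average.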

Algorithm

For Red or Blue Channels

Implementation

Optimization:

1. Vectorize the pixel data to be processed

2. Use shared memory to reduce data transfer

Implementation

1. Vectorize the pixel data to be processed
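As a rough illustration of this optimization, the sketch below packs four neighboring samples into one 4-wide value, similar in spirit to a `float4` on the GPU, and processes them together. The helper names are hypothetical, and plain Python tuples only model the data layout; they give no actual speedup.

```python
def pack4(row):
    """Group a row of samples into 4-wide tuples (row length assumed divisible by 4)."""
    return [tuple(row[i:i + 4]) for i in range(0, len(row), 4)]


def average4(a, b):
    """Element-wise average of two 4-wide vectors, standing in for one vector operation."""
    return tuple((x + y) / 2 for x, y in zip(a, b))


def vertical_average(upper_row, lower_row):
    """Average two rows four samples at a time instead of one sample at a time."""
    return [average4(a, b) for a, b in zip(pack4(upper_row), pack4(lower_row))]
```

With this layout, an 8-sample row needs two vector operations where the scalar version needs eight.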

Implementation

2. Use shared memory to reduce the data transfer
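A back-of-the-envelope sketch of why this helps: if every thread in a square tile reads its whole neighborhood from global memory, neighboring threads fetch the same samples many times. Loading the tile plus a border (halo) into shared memory once avoids those repeats. The tile size and neighborhood radius below are illustrative assumptions, not values from the talk.

```python
def global_reads_per_tile(tile, radius, use_shared):
    """Count global-memory sample reads for one tile of `tile` x `tile` threads.

    Each thread needs a (2 * radius + 1)-wide square neighborhood.
    """
    window = 2 * radius + 1
    if use_shared:
        # The tile and its halo are fetched once, then reused from shared memory.
        return (tile + 2 * radius) ** 2
    # Every thread fetches its full neighborhood on its own.
    return tile * tile * window * window
```

For a 16*16 tile and a 3*3 neighborhood, this gives 2304 reads without shared memory and 324 with it.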

Experiment Results

Platform:

ATI Radeon™ HD 4870, Brook+ 1.4

NVIDIA GeForce 8800 GTX, CUDA 2.1

Dual-Core AMD Opteron(tm) 2212, 2.0 GHz

Experiment Results

Performance comparison (chart values by image size)

          128*128   256*256   512*512   1K*1K   2K*2K   4K*4K
CPU       0.002     0.007     0.035     0.147   0.585   2.374
CUDA      0.002     0.0032    0.01      0.037   0.144   0.574
Brook+    0.0612    0.0635    0.075     0.113   0.277   0.575

For small data sizes, the GPU is not always a good choice:

a. Memory transfer time dominates the kernel execution time

b. The computation is not complex enough
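The point can be checked directly against the timings on this slide. The short script below divides the CPU time by each GPU time, so a value below 1 means the GPU version is slower than the CPU. The variable names are illustrative; the numbers are copied from the chart data.

```python
sizes = ["128*128", "256*256", "512*512", "1K*1K", "2K*2K", "4K*4K"]
cpu = [0.002, 0.007, 0.035, 0.147, 0.585, 2.374]
cuda = [0.002, 0.0032, 0.01, 0.037, 0.144, 0.574]
brook = [0.0612, 0.0635, 0.075, 0.113, 0.277, 0.575]


def speedups(baseline, times):
    """Return baseline / time for each image size."""
    return [b / t for b, t in zip(baseline, times)]


cuda_speedup = dict(zip(sizes, speedups(cpu, cuda)))
brook_speedup = dict(zip(sizes, speedups(cpu, brook)))

for size in sizes:
    print(f"{size:>8}  CUDA {cuda_speedup[size]:5.2f}x  Brook+ {brook_speedup[size]:5.2f}x")
```

At 128*128, CUDA only matches the CPU and Brook+ is far slower, while at 4K*4K both GPU versions are roughly four times faster than the CPU.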

Experiment Results

Performance comparison (same chart and data as on the previous slide)

For small data sizes, CUDA performs better. At 4K*4K, Brook+ catches up with CUDA (0.575 vs. 0.574).

Experiment Results

Explanation?

            Memory speed   Stream processing units
HD 4870                    800 (8*5*10*2)
8800 GTX                   128 (16*8)

Experiment Results

Shared register usage

ATI Radeon 4870 (Brook+ 1.4)

[Chart: unoptimized vs. optimized, for image sizes 128*128, 256*256, 512*512, 1K*1K, 2K*2K, and 4K*4K]

Read data into shared registers and try to reuse the data.

Experiment Results

ATI Radeon 3870 (Brook+ 1.3)

[Chart: unoptimized vs. optimized, for image sizes 128*128, 256*256, 512*512, 1K*1K, 2K*2K, and 4K*4K]

Future Work

1. Shared memory usage for further optimization

2. Integrate the code with a proper interface to import image data and export pixel data

3. Report

References

1. Henrique S. Malvar, Li-wei He, and Ross Cutler, "High-Quality Linear Interpolation for Demosaicing of Bayer-Patterned Color Images"

2. Alexey Lukin and Denis Kubasov, "An Improved Demosaicing Algorithm"

Questions?