adaptive image filtering using run-time reconfiguration

Post on 14-Jan-2016



Adaptive Image Filtering Using Run-Time Reconfiguration

Nitin Srivastava, Jerry L. Trahan, Ramachandran Vaidyanathan, Suresh Rai

Department of Electrical and Computer Engineering, Louisiana State University

The Problem: Adaptive Image Filtering

• A filtering window moves over the image pixel by pixel. Window size is usually 3×3, 5×5, or 7×7.

• The filter multiplies the intensity value of each pixel under the window by the corresponding coefficient and sums the products to produce the new value of the pixel at the center of the window.

Working of a 3×3 filter

• Spatially invariant filter — does not change values of its coefficients with the position of the filtering window over the image.

• Adaptive filter — adjusts values of its coefficients according to the nature of the image. For instance, handles uniform regions differently than edges.
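As a point of reference for the spatially invariant case, the window operation described above can be sketched in plain Python. This is only an illustrative sketch, not the authors' implementation: the function name `filter_fixed`, the choice to copy border pixels unchanged, and the sample image are all assumptions made for the example.

```python
def filter_fixed(image, coeffs):
    """Apply a w x w window of fixed coefficients to every interior pixel."""
    w = len(coeffs)
    half = (w - 1) // 2
    rows, cols = len(image), len(image[0])
    # Assumption: border pixels, where the window does not fit, are copied unchanged.
    out = [row[:] for row in image]
    for i in range(half, rows - half):
        for j in range(half, cols - half):
            acc = 0.0
            # Inner product of the window coefficients with the pixels it overlaps.
            for g in range(-half, half + 1):
                for h in range(-half, half + 1):
                    acc += image[i + g][j + h] * coeffs[g + half][h + half]
            out[i][j] = acc
    return out


# Hypothetical example: a 3x3 averaging window over a small image with one bright pixel.
mean3 = [[1 / 9] * 3 for _ in range(3)]
image = [[10, 10, 10, 10],
         [10, 19, 10, 10],
         [10, 10, 10, 10]]
smoothed = filter_fixed(image, mean3)
```

The same coefficients are used at every window position here; an adaptive filter would recompute them from the pixel values at each position.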

Use of Run-Time Reconfiguration

• For fixed filter coefficients, could use constant coefficient multipliers (KCMs) configured for coefficients.

• For adaptive coefficients, we use KCMs configured for pixel values. The flow of data is regular, but more involved.

• Inspired by 1D adaptive filtering technique of Wojko and ElGindy (RAW’99).
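One common way to build a constant coefficient multiplier is with small lookup tables of precomputed partial products, one table per 4-bit slice of the variable input. The sketch below models that idea in software under stated assumptions: the name `make_kcm`, the 8-bit unsigned input, and the integer operands are illustrative choices, and rebuilding the tables merely stands in for reconfiguring the hardware at run time.

```python
def make_kcm(constant, input_bits=8, nibble_bits=4):
    """Build lookup tables that multiply any input_bits-wide value by `constant`."""
    n_tables = input_bits // nibble_bits
    # tables[t][d] holds constant * (d shifted to the position of nibble t).
    tables = [
        [constant * (digit << (nibble_bits * t)) for digit in range(1 << nibble_bits)]
        for t in range(n_tables)
    ]
    mask = (1 << nibble_bits) - 1

    def multiply(x):
        # One table lookup per nibble of x, then add the partial products.
        return sum(tables[t][(x >> (nibble_bits * t)) & mask] for t in range(n_tables))

    return multiply


# Fixed filter: the KCM is configured for a coefficient and pixels stream through it.
times_coeff = make_kcm(5)
# Adaptive filter: the KCM is configured for a pixel value and coefficients stream through it.
times_pixel = make_kcm(200)
```

Both `times_coeff(200)` and `times_pixel(5)` compute the same product; what differs is which operand is baked into the tables and which one varies.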

Use of FPGAs

• Circuit design tailored to the problem — Filtering exhibits regular, repeated operations, taking an inner product among the same number of elements at each pixel position.

• Problem size-specific components and datapaths.

• Advantages for this problem even without reconfiguration.

Image Filtering Details

$$\mathrm{coeff}(i,j,g,h) = \frac{K(i,j)}{1 + a \cdot \max\{\varepsilon^2,\ [v(i,j) - v(i+g,\,j+h)]^2\}}$$

$$K(i,j) = \left( \sum_{g} \sum_{h} \frac{1}{1 + a \cdot \max\{\varepsilon^2,\ [v(i,j) - v(i+g,\,j+h)]^2\}} \right)^{-1}$$

$$\mathrm{newv}(i,j) = \sum_{g=-(w-1)/2}^{(w-1)/2}\ \sum_{h=-(w-1)/2}^{(w-1)/2} v(i+g,\,j+h) \cdot \mathrm{coeff}(i,j,g,h)$$
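To make the three formulas concrete, here is a minimal Python sketch that evaluates them for a single pixel. The function name `adaptive_pixel` and the sample values of a and ε are assumptions for illustration; the slides do not give numeric parameter values.

```python
def adaptive_pixel(image, i, j, a, eps, w=3):
    """Return newv(i, j) and the coefficients coeff(i, j, g, h) for one window."""
    half = (w - 1) // 2
    center = image[i][j]
    weights = {}
    for g in range(-half, half + 1):
        for h in range(-half, half + 1):
            diff = center - image[i + g][j + h]
            # Unnormalized weight: 1 / (1 + a * max(eps^2, diff^2)).
            weights[(g, h)] = 1.0 / (1.0 + a * max(eps ** 2, diff ** 2))
    k = 1.0 / sum(weights.values())                      # K(i, j)
    coeff = {gh: k * wt for gh, wt in weights.items()}   # coeff(i, j, g, h)
    new_v = sum(image[i + g][j + h] * c for (g, h), c in coeff.items())
    return new_v, coeff


# Hypothetical inputs: a uniform patch and a patch with one very different neighbor.
flat = [[50, 50, 50], [50, 50, 50], [50, 50, 50]]
edge = [[10, 10, 10], [10, 10, 10], [10, 10, 200]]
flat_value, flat_coeff = adaptive_pixel(flat, 1, 1, a=0.01, eps=1.0)
edge_value, edge_coeff = adaptive_pixel(edge, 1, 1, a=0.01, eps=1.0)
```

Because K(i, j) normalizes the weights, the coefficients of each window sum to 1. In the `edge` patch the neighbor that differs strongly from the center receives a much smaller coefficient than the others, which is how the filter treats edges differently from uniform regions.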

Solution Approach

• Gray scale image of size 256×256, using a filtering window of size 3×3.

• Can tailor to different image size — changes some register sizes and memory requirements.

• Can tailor to different window size — changes memory requirements.

• Can extend to video — window is 3D across frames.

Solution Approach, cont.

• Basic component is a module.

• Sixteen pipelined modules act on 16 contiguous pixels at a time from the same row.

• Three sets of three steps each, corresponding to the three rows in a 3×3 window and the three positions in each row.

• For each of these nine steps, a module contributes to one of the nine window computations in which its pixel participates.

Module Algorithm

Procedure THREEPIX(r, in, out)
    Step 0: Adder(r) ← KCM(r) + in
    Step 1: Adder(r) ← KCM(r) + Adder(r−1)
    Step 2: Adder(r) ← KCM(r) + Adder(r−1)
    out ← Adder(r)

• Contributes to three pixel values on the same row.
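The chaining of adders in THREEPIX can be simulated on one row of pixels. The sketch below is an interpretation rather than the authors' design: it assumes all modules update in lockstep, that module r passes its sum to module r+1 between steps, and that the leftmost module sees 0 from its missing neighbor. The names `threepix`, `coeff`, and `carry_in` are hypothetical.

```python
def threepix(pixels, coeff, carry_in):
    """Simulate one THREEPIX call across a row of modules.

    pixels[r]   : pixel value held by module r
    coeff(c, h) : coefficient for the window centered at column c, column offset h
    carry_in(c) : partial sum read for center column c (the `in` argument)
    Returns adder, where adder[r] is the row sum for center column r - 1.
    """
    n = len(pixels)
    adder = [0.0] * n
    for step in range(3):
        h = step - 1          # offset of this pixel inside the window it serves
        prev = adder[:]       # snapshot: every module updates at the same time
        for r in range(n):
            center = r - h    # window center that this product belongs to
            product = pixels[r] * coeff(center, h)   # stands in for KCM(r)
            if step == 0:
                adder[r] = product + carry_in(center)
            else:
                # Assumption: the leftmost module receives 0 from a missing neighbor.
                adder[r] = product + (prev[r - 1] if r > 0 else 0.0)
    return adder


# Hypothetical example: unit coefficients and no incoming partial sums.
row = [1, 2, 3, 4, 5]
sums = threepix(row, coeff=lambda c, h: 1.0, carry_in=lambda c: 0.0)
```

With unit coefficients, `sums[2]` is 1 + 2 + 3 and `sums[4]` is 3 + 4 + 5: each module ends up holding the three-pixel row sum for the window centered one column to its left, while having contributed its own pixel to three different windows.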

Overall Algorithm

for i ← 0 to 255
    for k ← 0 to 255 in steps of 16
        for all j, where k ≤ j ≤ k+15
            r = j mod 16    /* module r has v(i, j) */
            THREEPIX(r, 0, memory)
            THREEPIX(r, memory, memory)
            THREEPIX(r, memory, I/O pins)
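The three THREEPIX calls can be read as three passes over the current pixel row, each feeding a different window row: the first starts a new partial sum, the second extends a stored one, and the third completes a stored one and outputs it. The following sketch models only that bookkeeping, under several assumptions: it ignores the grouping into 16 modules and the pipelining, skips windows that extend past the image border, and takes coefficients from a caller-supplied function `coeff(ci, cj, g, h)` without modeling how they are computed. The name `filter_streaming` is hypothetical.

```python
def filter_streaming(image, coeff):
    """Accumulate 3x3 window sums one pixel row at a time, in three passes per row."""
    rows, cols = len(image), len(image[0])
    memory = {}    # partial sums keyed by window center (row, col); models block memory
    output = {}    # finished pixel values; models the I/O pins
    for i in range(rows):            # pixel row currently held by the modules
        for g in (-1, 0, 1):         # the three THREEPIX calls
            ci = i - g               # center row whose window uses row i at offset g
            if not 1 <= ci <= rows - 2:
                continue             # assumption: skip windows that leave the image
            for cj in range(1, cols - 1):
                row_sum = sum(image[i][cj + h] * coeff(ci, cj, g, h) for h in (-1, 0, 1))
                # First call starts from 0; later calls read the stored partial sum.
                total = memory.pop((ci, cj), 0.0) + row_sum
                if g == 1:
                    output[(ci, cj)] = total    # third call: result leaves the chip
                else:
                    memory[(ci, cj)] = total    # first two calls: back to memory
    return output


# Hypothetical example: averaging coefficients on a small image.
small = [[(3 * i + 7 * j) % 11 for j in range(6)] for i in range(5)]
averaged = filter_streaming(small, lambda ci, cj, g, h: 1 / 9)
```

Each interior pixel value is complete only after the row below it has been processed, so the output for row i−1 appears while row i is in the modules.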

Module

[Figure: block diagram of one module. Labeled parts: KCM (inputs: pixel value and filtering window coefficient), KCM output mux, module adder, module mux, zero register (0), pipeline register, step counter, memory read register, and memory write register. Labeled connections: from block memory, to block memory, to output mux, previous module, and next module.]

First vantage point: one module

[Figure sequence: a grid of pixels (rows 5 to 9, columns 22 to 29) drawn above the block memories, with the modules holding v[7,23] through v[7,27]. Successive frames add the products the modules form: first pd[8,24,-1,-1] through pd[8,28,-1,-1], then pd[8,23,-1,0] through pd[8,27,-1,0], then pd[8,22,-1,1] through pd[8,26,-1,1], yielding row sums such as rs(-1)[8,24]. Later frames add the products pd[7,·,0,·] and pd[6,·,1,·]; the block memories then hold rs(-1)[8,24] and rs(-1)[7,24]+rs(0)[7,24], and newv[6,24] is output.]

Second vantage point: one pixel

[Figure sequence: the same pixel grid, following the window centered on pixel (7,25). With the modules holding v[6,23] through v[6,27], the products pd[7,·,-1,-1], pd[7,·,-1,0], and pd[7,·,-1,1] accumulate into rs(-1)[7,25]. In the final frame the modules hold v[8,23] through v[8,27] and form pd[9,·,-1,·], pd[8,·,0,·], and pd[7,·,1,·]; the sum rs(-1)[7,25]+rs(0)[7,25] is completed to give newv[7,25].]

Evaluation

• Xilinx Virtex-E FPGA XCV200E

• Number of CLB slices required = 492

• Clock frequency = 101.9 MHz

• Time spent in filtering a 256×256 image = 642 µs
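As a quick sanity check on these figures, the reported clock frequency and filtering time can be converted into a cycle count. The sketch reads the filtering time as 642 microseconds, which is an assumption about the unit; the variable names are illustrative.

```python
clock_hz = 101.9e6        # reported clock frequency
filter_time_s = 642e-6    # reported filtering time, read as 642 microseconds (assumption)
pixels = 256 * 256        # pixels in the test image

cycles = clock_hz * filter_time_s       # clock cycles spent on the whole image
cycles_per_pixel = cycles / pixels      # average cycles per pixel
print(f"{cycles:.0f} cycles, {cycles_per_pixel:.3f} cycles per pixel")
```

Under that reading the design spends roughly 65,420 cycles on 65,536 pixels, or about one clock cycle per pixel on average.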

Comparison

System description              Running time    Speedup

866 MHz Pentium III system      20 ms           31

400 MHz Sun Ultra 5 system      53 ms           84
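The speedup column can be cross-checked by dividing each software running time by the FPGA filtering time. This sketch assumes the FPGA time is 642 microseconds and uses only the rounded running times listed above; the dictionary keys are illustrative labels.

```python
fpga_time_s = 642e-6    # FPGA filtering time, read as 642 microseconds (assumption)
software_times_s = {
    "866 MHz Pentium III": 20e-3,
    "400 MHz Sun Ultra 5": 53e-3,
}
# Speedup = software running time divided by FPGA running time.
speedups = {name: t / fpga_time_s for name, t in software_times_s.items()}
for name, s in speedups.items():
    print(f"{name}: {s:.1f}x")
```

The first ratio rounds to the listed 31. The second comes out slightly below the listed 84, which may simply reflect rounding in the listed running times.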
