Reconfigurable Computing S. Reda, Brown University

Reconfigurable Computing (EN2911X, Fall07)

Lecture 17: Application-Driven Hardware Acceleration (3/4)

Prof. Sherief Reda
Division of Engineering, Brown University

http://ic.engin.brown.edu

Viterbi algorithm

• A dynamic programming algorithm for finding the most likely sequence of hidden states, the Viterbi path, that results in a sequence of observed events.

• Originally devised by Andrew Viterbi in 1967 as an error-correction scheme for noisy digital communication links.

• Widely used in decoding the convolutional codes for both CDMA and GSM digital cellular, dial-up modems, satellite, deep-space communications and 802.11 wireless LANs. Also used in speech recognition, computational linguistics, and bioinformatics.

Viterbi decoders in digital communication systems

1. Encoding using convolutional codes

• Each input bit is coded onto 2 output bits.

• The 2 output bits are produced by modulo-2 adders. The selection of which bits are added to produce each output bit is called the generating polynomial.

O1 = (u1 + u0 + u-1 + u-2) mod 2
O2 = (u1 + u0 + u-2) mod 2

[Figure: shift register u1 u0 u-1 u-2 feeding two modulo-2 adders that produce O1 and O2]
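
The two output equations can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's implementation: it assumes that u1 is the incoming bit, that u0, u-1, u-2 are the three previous bits, that the register starts at zero, and that three zero bits are appended to flush it. The names `encode` and `flush` are hypothetical.

```python
def encode(bits: str, flush: int = 3) -> str:
    """Encode a bit string with the rate-1/2 code defined by O1 and O2.

    Assumption: `flush` zero bits are appended so the register returns
    to the all-zero state after the message.
    """
    u0 = um1 = um2 = 0                      # shift register starts cleared
    pairs = []
    for ch in bits + "0" * flush:
        u1 = int(ch)                        # incoming bit
        o1 = (u1 + u0 + um1 + um2) % 2      # O1 = (u1 + u0 + u-1 + u-2) mod 2
        o2 = (u1 + u0 + um2) % 2            # O2 = (u1 + u0 + u-2) mod 2
        pairs.append(f"{o1}{o2}")
        u0, um1, um2 = u1, u0, um1          # shift the register by one position
    return " ".join(pairs)


print(encode("1011"))  # 11 11 01 11 01 01 11
```

With the default flush, a 4-bit message produces 7 output pairs, i.e. 14 coded bits.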

Example

Assume the input sequence is 1011.

What is the output? 11 11 01 11 01 01 11 (three trailing zeros flush the register, so 4 input bits yield 7 output pairs).

Example by C. Langton
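
The output can be checked by hand, one clock step at a time. The table below is a worked sketch under the assumption that u1 is the incoming bit, the register starts at zero, and the message 1011 is followed by three zeros; the step index t is introduced here only for illustration.

```latex
\begin{array}{c|cccc|cc}
t & u_1 & u_0 & u_{-1} & u_{-2} & O_1 & O_2 \\ \hline
1 & 1 & 0 & 0 & 0 & 1 & 1 \\
2 & 0 & 1 & 0 & 0 & 1 & 1 \\
3 & 1 & 0 & 1 & 0 & 0 & 1 \\
4 & 1 & 1 & 0 & 1 & 1 & 1 \\
5 & 0 & 1 & 1 & 0 & 0 & 1 \\
6 & 0 & 0 & 1 & 1 & 0 & 1 \\
7 & 0 & 0 & 0 & 1 & 1 & 1
\end{array}
```

Reading the last two columns row by row gives 11 11 01 11 01 01 11.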

Truth table presentation
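
The table from the slide is not preserved in this transcript. The sketch below regenerates one plausible layout from the encoder equations: for every register state (u0, u-1, u-2) and input bit u1 it lists the output pair and the next state. The column order and the helper name `step` are assumptions, not taken from the slide.

```python
from itertools import product


def step(state: str, bit: str) -> tuple[str, str]:
    """Return (next_state, output_pair) for state 'u0 u-1 u-2' and input u1."""
    u1 = int(bit)
    u0, um1, um2 = (int(c) for c in state)
    o1 = (u1 + u0 + um1 + um2) % 2
    o2 = (u1 + u0 + um2) % 2
    # The new bit enters on the left and the oldest bit falls off the right.
    return bit + state[:2], f"{o1}{o2}"


print("input  state  output  next")
for bits in product("01", repeat=3):
    state = "".join(bits)
    for bit in "01":
        nxt, out = step(state, bit)
        print(f"{bit:>5}  {state:>5}  {out:>6}  {nxt:>4}")
```

The loop prints 16 rows, one for each combination of the 8 states and 2 input values.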

State transition graph representation

de Bruijn graph. Not all outputs are shown.

[Figure: state transition graph over the eight register states, with edges labeled by the output pair O1O2]
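
The de Bruijn structure of the graph can be illustrated without the figure. In this sketch, which assumes states are written as the 3-bit string u0 u-1 u-2, each state has exactly two successors (shift in a 0 or a 1) and exactly two predecessors. The dictionary names are hypothetical.

```python
from collections import defaultdict
from itertools import product

states = ["".join(b) for b in product("01", repeat=3)]

# Shifting a new bit in on the left drops the oldest bit on the right.
successors = {s: [bit + s[:2] for bit in "01"] for s in states}

# Invert the successor map to find which states lead into each state.
predecessors = defaultdict(list)
for s, nxts in successors.items():
    for n in nxts:
        predecessors[n].append(s)

for s in states:
    print(s, "->", successors[s], " <-", predecessors[s])
```

The fixed in-degree of two is what lets the decoder compare exactly two candidate paths at every state.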

Tree representation

Trellis diagram

Not all transitions are shown

Output of the encoder for various inputs

• How can we devise a good generating polynomial?

• Let’s say we receive 01 11 01 11 01 01 11. It is not one of the 16 possible sequences. How do we decode it?

input  Encoder output
0000   00 00 00 00 00 00 00
0001   00 00 00 11 11 10 11
0010   00 00 11 11 10 11 00
0011   00 00 11 00 01 01 11
0100   00 11 11 10 11 00 00
0101   00 11 11 01 00 10 11
0110   00 11 00 01 01 11 00
0111   00 11 00 10 10 01 11
1000   11 11 10 11 00 00 00
1001   11 11 10 00 11 10 11
1010   11 11 01 00 10 11 00
1011   11 11 01 11 01 01 11
1100   11 00 01 01 11 00 00
1101   11 00 01 10 00 10 11
1110   11 00 10 10 01 11 00
1111   11 00 10 01 10 01 11
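
One direct way to decode, shown here only as a baseline and not as the lecture's method, is to compare the received sequence against all 16 codewords and keep the closest one in Hamming distance. The sketch assumes the encoder equations from the earlier slide with three flush zeros; `encode`, `hamming`, and `codebook` are hypothetical names.

```python
from itertools import product


def encode(bits: str) -> str:
    """Rate-1/2 encoder from the earlier slide, flushed with three zeros."""
    u0 = um1 = um2 = 0
    out = []
    for ch in bits + "000":
        u1 = int(ch)
        out.append(f"{(u1 + u0 + um1 + um2) % 2}{(u1 + u0 + um2) % 2}")
        u0, um1, um2 = u1, u0, um1
    return "".join(out)


def hamming(a: str, b: str) -> int:
    """Number of positions in which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


received = "01110111010111"          # 01 11 01 11 01 01 11 without spaces
codebook = {"".join(m): encode("".join(m)) for m in product("01", repeat=4)}
best = min(codebook, key=lambda m: hamming(codebook[m], received))
print(best, hamming(codebook[best], received))   # 1011 1
```

The exhaustive search examines 2^n codewords for an n-bit message, which is why a trellis-based method is preferred for longer messages.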

2. Decoding received sequences using the Viterbi algorithm

Let’s decode the received sequence 01 11 01 11 01 01 11

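[Trellis diagram after the 1st received pair (01): rows are the states 000 to 111 and columns are the received pairs 01 11 01 11 01 01 11. From state 000, the two branches with outputs 00 and 11 each reach cost 1.]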
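
The cost attached to each branch is the number of bits in which the branch output differs from the received pair. A minimal sketch for the first step, where `hamming` is a hypothetical helper name:

```python
def hamming(a: str, b: str) -> int:
    """Count the bit positions in which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


received = "01"                       # first received pair
for branch_output in ("00", "11"):    # the two branches leaving state 000
    print(branch_output, hamming(branch_output, received))
# Both branches differ from 01 in exactly one bit, so each has cost 1.
```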

2nd step

Let’s decode the received sequence 01 11 01 11 01 01 11

[Trellis diagram after the 2nd received pair (11): the reachable states 000, 100, 010, 110 carry accumulated costs 3, 1, 1, 3.]

3rd step

Let’s decode the received sequence 01 11 01 11 01 01 11

[Trellis diagram after the 3rd received pair (01): all eight states are now reachable, each by a single path, with branch outputs and accumulated costs marked.]

4th step

Let’s decode the received sequence 01 11 01 11 01 01 11

At any step, only one path from the initial state to each state is kept. When more than one path converges on a node, always keep the one with the minimum cost.

[Trellis diagram after the 4th received pair (11): two paths now converge on every state; for example, state 000 is reached with costs 3 and 6 and keeps min(3, 6) = 3.]
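
The minimum rule can be written as a recurrence over accumulated path costs. The notation is introduced here for illustration and does not appear on the slides: PM_t(s) is the cost of the surviving path into state s after t received pairs, pred(s) is the set of the two predecessor states of s, c(s' → s) is the output pair on the branch from s' to s, r_t is the t-th received pair, and d_H is Hamming distance.

```latex
PM_t(s) \;=\; \min_{s' \in \mathrm{pred}(s)}
  \Big[\, PM_{t-1}(s') + d_H\big(r_t,\; c(s' \to s)\big) \Big],
\qquad PM_0(000) = 0, \quad PM_0(s \neq 000) = \infty

% Worked instance for state 000 at the 4th step (r_4 = 11):
PM_4(000) = \min\big( PM_3(000) + d_H(11, 00),\; PM_3(001) + d_H(11, 11) \big)
          = \min(4 + 2,\; 3 + 0) = 3
```

This reproduces the min(3, 6) entry at state 000.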

5th step

Let’s decode the received sequence 01 11 01 11 01 01 11

[Trellis diagram after the 5th received pair (01): survivor paths and their accumulated costs at each of the eight states.]

6th step

Let’s decode the received sequence 01 11 01 11 01 01 11

[Trellis diagram after the 6th received pair (01): survivor paths and their accumulated costs at each of the eight states.]

Finally

• The winner path is 000, 100, 010, 101, 110, 011, 001, 000, which corresponds to the input sequence 1011000 (the message 1011 followed by three flush zeros).

• What is the runtime using software on a general-purpose CPU?

• What is the runtime using an FPGA?

[Trellis diagram after the 7th and last received pair (11): the surviving path 000, 100, 010, 101, 110, 011, 001, 000 ends in state 000 with total cost 1.]
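
The whole walk-through can be condensed into a short hard-decision decoder. This is a sketch under stated assumptions rather than the lecture's implementation: states are the strings u0 u-1 u-2, the trellis starts and ends in 000 because the encoder is flushed with zeros, and each survivor stores its full path instead of using traceback pointers. The names `step` and `viterbi` are hypothetical.

```python
def step(state: str, bit: str) -> tuple[str, str]:
    """Return (next_state, output_pair); state holds u0, u-1, u-2."""
    u1 = int(bit)
    u0, um1, um2 = (int(c) for c in state)
    return bit + state[:2], f"{(u1 + u0 + um1 + um2) % 2}{(u1 + u0 + um2) % 2}"


def viterbi(received: list[str], start: str = "000", end: str = "000"):
    """Hard-decision Viterbi decoding with Hamming branch costs."""
    # survivors maps state -> (cost, visited states, decoded bits)
    survivors = {start: (0, [start], "")}
    for pair in received:
        candidates = {}
        for state, (cost, path, bits) in survivors.items():
            for bit in "01":
                nxt, out = step(state, bit)
                new_cost = cost + sum(a != b for a, b in zip(out, pair))
                # Compare-select: keep only the cheapest path into each state.
                if nxt not in candidates or new_cost < candidates[nxt][0]:
                    candidates[nxt] = (new_cost, path + [nxt], bits + bit)
        survivors = candidates
    return survivors[end]


cost, path, bits = viterbi("01 11 01 11 01 01 11".split())
print(cost, bits)   # 1 1011000
print(path)
```

In this sketch each received pair triggers at most 8 states × 2 branches = 16 branch evaluations, so the software cost grows linearly with the sequence length and with the number of states. One plausible FPGA mapping, offered as an assumption rather than the lecture's answer, gives each state its own add-compare-select unit so that all states update in parallel.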

Summary

• So far we have covered popular application-driven algorithms to accelerate in FPGAs
  – FFT for signal and image processing as an example of divide and conquer algorithms
  – Speech recognition applications
  – Viterbi algorithm for digital communication as an example of dynamic programming algorithms

• Next time, we cover some popular algorithms for bioinformatics

Project updates

• 2nd project report extended until Sunday Dec 2nd. Make sure to add the new material to the content of the 1st report. The new report is worth 10 points. The main evaluation criterion is your progress on the project plan you outlined in the first report.
  – How thoroughly and creatively are your ideas developed?
  – How meticulous is the experimental setup?
  – How do the experiments carried out serve the project goals?

• Make sure to also send me a couple of slides by Monday Dec 3rd to present on Tuesday Dec 4th (last lecture)