lecture asip 8, 9

Post on 03-Apr-2018

7/28/2019 Lecture ASIP 8, 9

http://slidepdf.com/reader/full/lecture-asip-8-9 1/42

VLSI Architecture :: MEL G642

Dr. A. Amalin Prince

BITS Pilani, K.K. Birla Goa Campus

Department of Electrical, Electronics and Instrumentation Engineering

Contents

MAC fundamentals
MAC implementations
A MAC case study
MAC integration

Datapath in a DSP processor

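[Figure: DSP processor block diagram showing the data path (DP) with RF, ALU and MAC; the control path (CP); the addressing path (AGU) with AGU1 and AGU2; program memory (PM); data memories DM1 and DM2; and the processor memory and register busses]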

MAC general

MAC instructions

Multiplication arithmetic
MAC & iterative instructions
Double-precision arithmetic instructions
Move data from and to MAC
Other instructions

Why MAC

MAC: Multiplication and accumulation unit
– Performs convolution-based algorithms
o FIR, IIR, autocorrelation, cross-correlation
– Supports most transformation algorithms
o FFT and DCT need MAC hardware

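[Figure: five-tap FIR filter. The input x(n) passes through four Z-1 delay elements giving x(n-1), x(n-2), x(n-3) and x(n-4); each tap is multiplied by its coefficient c(0) to c(4) and the products are added to form y(n)]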

Why MAC

Data x(n) is shifted through a FIFO buffer consisting of 4 registers, so that x(n) becomes x(n-1), x(n-1) becomes x(n-2), and so on, in the next clock cycle.

$$y(n) = \sum_{i=0}^{m-1} x(n-i)\,c(i)$$

All arithmetic operations are mapped to hardware in parallel. For this five-tap filter (m = 5) there are five multipliers and four adders.

A sample of y(n) is computed per clock cycle:

y(n) = x(n)*c(0) + x(n-1)*c(1) + x(n-2)*c(2) + x(n-3)*c(3) + x(n-4)*c(4)
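
The same sum can also be computed serially on a single MAC unit, one tap per step, instead of with parallel hardware. The sketch below is only a behavioural illustration; the function name `fir_mac` and the use of a Python list as the delay line are assumptions, not part of the lecture.

```python
def fir_mac(samples, coeffs):
    """Compute y(n) = sum_i x(n-i) * c(i) with one multiply-accumulate per tap."""
    delay_line = [0] * len(coeffs)      # delay_line[i] holds x(n-i)
    outputs = []
    for x in samples:
        # Shift the delay line: x(n) becomes x(n-1), x(n-1) becomes x(n-2), ...
        delay_line = [x] + delay_line[:-1]
        acc = 0                         # plays the role of the accumulator register
        for xi, ci in zip(delay_line, coeffs):
            acc += xi * ci              # one MAC operation per tap
        outputs.append(acc)
    return outputs


if __name__ == "__main__":
    # An impulse at the input reproduces the coefficients at the output.
    print(fir_mac([1, 0, 0, 0, 0, 0], [1, 2, 3, 4, 5]))
```

With a single MAC, a filter with m taps needs m MAC steps per output sample, which is why iterative MAC instructions matter.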

MAC basics

MAC: Multiplication and accumulation unit
– Adder = accumulator; accumulator register

[Figure: (a) MAC without pipeline, (b) MAC with pipeline. In both, a multiplier with operands MOA and MOB feeds an accumulator with operands AOA and AOB, followed by the ACR (accumulating register) and a flag circuit]

MUL circuit

Multiplications

How to manage double precision?
How to manage signed?

[Figure: hardware multiplication options: fractional or integer multiplication; signed or unsigned multiplication; result with double precision or single precision]

Multiplications

The 16-bit signed and 16-bit unsigned multiplications can both be implemented on a 17b × 17b signed multiplier. In general, an (N+1)×(N+1) signed multiplier can perform N-bit signed and unsigned multiplication.
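
A minimal sketch of why one extra bit is enough: a signed operand is extended by copying its sign bit, an unsigned operand by prepending a zero, and the same signed multiplier then handles every combination. The helper names `to_signed`, `extend17` and `mul17` are hypothetical and only model the behaviour in Python integers.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def extend17(word16, signed):
    """Extend a raw 16-bit word to the value of a 17-bit signed operand."""
    word16 &= 0xFFFF
    if signed:
        return to_signed(word16, 16)    # copy bit 15 into bit 16
    return word16                       # prepend "0": the value stays non-negative


def mul17(a16, b16, a_signed, b_signed):
    """Model one 17b x 17b signed multiplier serving all four sign modes."""
    return extend17(a16, a_signed) * extend17(b16, b_signed)


if __name__ == "__main__":
    print(mul17(0xFFFF, 0xFFFF, True, True))      # (-1) * (-1)
    print(mul17(0xFFFF, 0xFFFF, False, False))    # 65535 * 65535
```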

Basic multiplication instructions

No.  Specifications on the result

M1   Signed integer multiplication, double-precision result
     ACR[31:0] <= {A[15], A[15:0]} * {B[15], B[15:0]}

M2   Signed-unsigned integer multiplication, double-precision result
     ACR[31:0] <= {A[15], A[15:0]} * {“0”, B[15:0]}

M3   Unsigned-signed integer multiplication, double-precision result
     ACR[31:0] <= {“0”, A[15:0]} * {B[15], B[15:0]}

M4   Unsigned-unsigned integer multiplication, double-precision result
     ACR[31:0] <= {“0”, A[15:0]} * {“0”, B[15:0]}

M5   Signed integer multiplication, single-precision result, no rounding
     ACR[31:16] <= SAT(2^16 * ({A[15], A[15:0]} * {B[15], B[15:0]}))

M6   Signed fractional multiplication, double-precision result
     ACR[31:0] <= SAT(2 * ({A[15], A[15:0]} * {B[15], B[15:0]}))

M7   Signed fractional multiplication, single-precision rounded result
     ACR[31:16] <= SAT(Round(2 * ({A[15], A[15:0]} * {B[15], B[15:0]})))
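
As an illustration of the fractional variants, the sketch below models M6 and M7 on Python integers. It assumes that A and B are already signed 16-bit values read as Q15 fractions, and that Round means adding half of the kept LSB (0x8000) before the lower half is dropped; the lecture does not state the rounding rule, so that part is an assumption, as are the function names.

```python
def sat(value, bits):
    """Clamp value to the range of a signed two's-complement word of `bits` bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def m6_frac_mul(a, b):
    """M6: Q15 x Q15 gives Q30; the factor 2 aligns the result to Q31."""
    return sat(2 * a * b, 32)


def m7_frac_mul_round(a, b):
    """M7: round (assumed: add half an LSB), saturate, keep the upper 16 bits."""
    rounded = 2 * a * b + 0x8000
    return sat(rounded, 32) >> 16


if __name__ == "__main__":
    print(hex(m6_frac_mul(0x4000, 0x4000)))        # 0.5 * 0.5 in Q31
    print(hex(m7_frac_mul_round(-32768, -32768)))  # (-1) * (-1) must saturate
```

The only input pair that overflows is (-1) × (-1), which is why the fractional instructions carry a SAT step while M1 does not need one.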

Multiplication of long data

ACR <= X[31:0] × Y[15:0]

ACR[47:0] <= {X[31], X[31:16]} * {Y[15], Y[15:0]}
           + 2^-16 * ({“0”, X[15:0]} * {Y[15], Y[15:0]});

ACR <= X[31:0] × Y[31:0]

ACR[63:0] <= {X[31], X[31:16]} * {Y[31], Y[31:16]}
           + 2^-16 * ({“0”, X[15:0]} * {Y[31], Y[31:16]})
           + 2^-16 * ({X[31], X[31:16]} * {“0”, Y[15:0]})
           + 2^-32 * ({“0”, X[15:0]} * {“0”, Y[15:0]});
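
The partial-product decomposition can be checked numerically. The sketch below is a plain-integer model with hypothetical function names; it weights the upper partial products by 2^16 and 2^32 instead of weighting the lower ones by 2^-16 and 2^-32, which is the same sum expressed relative to the least significant bit.

```python
def mul32x16(x, y):
    """32-bit signed x times 16-bit signed y from two 17b x 17b signed products."""
    x_hi = x >> 16          # signed upper half, {X[31], X[31:16]}
    x_lo = x & 0xFFFF       # unsigned lower half, {"0", X[15:0]}
    return ((x_hi * y) << 16) + x_lo * y


def mul32x32(x, y):
    """32-bit signed x times 32-bit signed y from four partial products."""
    x_hi, x_lo = x >> 16, x & 0xFFFF
    y_hi, y_lo = y >> 16, y & 0xFFFF
    return (((x_hi * y_hi) << 32)
            + ((x_lo * y_hi) << 16)
            + ((x_hi * y_lo) << 16)
            + x_lo * y_lo)


if __name__ == "__main__":
    x, y = -123456789, 987654321
    print(mul32x32(x, y) == x * y)
```

Only the upper halves carry the sign; the lower halves are always treated as unsigned, which is where the signed-unsigned multiplication modes are needed.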

An example of MUL

MAC instructions

Guard Operations In MAC

MAC circuit

MAC instructions

Single step (signed) MAC
– Integer
– Fractional
(Signed) Convolution
– Integer
Difference between MAC and convolution
– In the control path, not shown here

Double-Precision Arithmetic

Double-Precision Arithmetic in MAC

No.  Specifications on the result

D1   Double-precision data add/sub double-precision data
     Saturate(ACRx[39:0] ± ACRy[39:0])

D2   Double-precision data add/sub single-precision data, aligned to LSB
     Saturate(ACRx[39:0] ± {24’bOPB[15], OPB[15:0]})

D3   Double-precision data add/sub single-precision data, aligned to MSB
     Saturate(ACRx[39:0] ± {8’bOPB[15], OPB[15:0], 16’b0})

D4   Double-precision data add/sub single-precision immediate
     Saturate(ACRx[39:0] ± {24’bimmediate[15], immediate[15:0]})

D5   Absolute operation on double-precision data
     if ACRx[39] Saturate(INV(ACRx[39:0]) + “1”) else ACRx

D6   Compare two double-precision data and set flags
     set flag: Saturate(ACRx[39:0] - ACRy[39:0])

D7   Simple scale by MUX instead of by shift logic
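
A behavioural sketch of D1, D2, D3 and D5 follows, assuming a 40-bit accumulator (8 guard bits above 32 data bits) and operands given as signed Python integers, so that sign extension is implicit. The function names are hypothetical and only the add variants are shown.

```python
ACR_MAX = (1 << 39) - 1     # largest 40-bit two's-complement value
ACR_MIN = -(1 << 39)        # smallest 40-bit two's-complement value


def saturate40(value):
    """Clamp a result to the 40-bit accumulator range."""
    return max(ACR_MIN, min(ACR_MAX, value))


def d1_add(acr_x, acr_y):
    """D1: double-precision plus double-precision."""
    return saturate40(acr_x + acr_y)


def d2_add_lsb(acr_x, opb):
    """D2: 16-bit operand aligned to the LSB (sign-extended over 24 bits)."""
    return saturate40(acr_x + opb)


def d3_add_msb(acr_x, opb):
    """D3: 16-bit operand aligned to bits [31:16], low 16 bits zero."""
    return saturate40(acr_x + (opb << 16))


def d5_abs(acr_x):
    """D5: absolute value; negating the most negative value saturates."""
    return saturate40(-acr_x) if acr_x < 0 else acr_x


if __name__ == "__main__":
    print(d1_add(ACR_MAX, 1) == ACR_MAX)
    print(d5_abs(ACR_MIN) == ACR_MAX)
```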

Scaling in DSP :: MAC

With Double-Precision Arithmetic

Control Signals

Move / change data types

Move data from MAC and to MAC

Basic load
– Loads half of ACRn and keeps the other half.
o When loading the higher part, also fill in the guards
– Loads half of ACRn and clears the other half.
o When loading the higher part, also fill in the guards
– Loads both the lower and the higher part of ACRn.
o Fill in the guards using the sign of the higher part

Move data to MAC

Specifications on the result

L1 ACRn <= {8’bA[15], A[15:0], ACRn[15:0]} // keep lower part

L2 ACRn <= {8’bA[15], A[15:0], 16’h0000} // clear lower part

L3 ACRn <= {ACRn[39:16], A[15:0]} // keep higher part

L4 ACRn <= {24’bA[15], A[15:0]} // sign extension into higher part

L5 ACRn <= {8’bA[15], A[15:0], B[15:0]} // load A and B from RF

L6 ACRn <= {A[7:0], ACRn[31:0]} // restore guards
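
The concatenations can be modelled with shifts and masks. The sketch below treats ACRn as a raw 40-bit pattern held in a Python integer and covers L1, L2, L3 and L6 only; the function names are hypothetical.

```python
MASK16 = 0xFFFF


def sign_guard(a):
    """Eight guard bits that replicate A[15]."""
    return 0xFF if a & 0x8000 else 0x00


def load_l1(acr, a):
    """L1: load A into the higher part, sign-fill the guards, keep ACRn[15:0]."""
    return (sign_guard(a) << 32) | ((a & MASK16) << 16) | (acr & MASK16)


def load_l2(a):
    """L2: load A into the higher part, sign-fill the guards, clear the lower part."""
    return (sign_guard(a) << 32) | ((a & MASK16) << 16)


def load_l3(acr, a):
    """L3: load A into the lower part and keep ACRn[39:16]."""
    return (acr & 0xFFFFFF0000) | (a & MASK16)


def load_l6(acr, a):
    """L6: restore the guards from A[7:0] and keep ACRn[31:0]."""
    return ((a & 0xFF) << 32) | (acr & 0xFFFFFFFF)


if __name__ == "__main__":
    print(hex(load_l2(0x8001)))                 # negative value: guards become 0xFF
    print(hex(load_l6(0x123456789A, 0xAB)))     # only the guard byte changes
```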

Move data to MAC: Logic

MAC Modified

Move data from MAC

Specifications on the result

1 Rn <= ACRn[31:16] //Rn is a register in RF

2 Rn <= ACRn[15:0] //Rn is a register in RF

3 Rn <= ACRn[31:16]; Rn+1 <= ACRn [15:0] //Rn and Rn+1 in RF

4 M1 <= ACRn[31:16]; M2 <= ACRn[15:0]; //M1 M2: memories

5 Rn <= ACRn[31:16]; Rn+1 <= ACRn[15:0]; Rn+2 <= ACRn[39:32]

6 Rn <= {8’h00, ACRn[39:32]}; // guard to register file RF
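
The moves out of the MAC are simple bit-field selections. A minimal sketch, again treating ACRn as a raw 40-bit pattern and using hypothetical function names:

```python
def store_high(acr):
    """Move 1: Rn <= ACRn[31:16]."""
    return (acr >> 16) & 0xFFFF


def store_low(acr):
    """Move 2: Rn <= ACRn[15:0]."""
    return acr & 0xFFFF


def store_guard(acr):
    """Move 6: Rn <= {8'h00, ACRn[39:32]}, the guards zero-extended to 16 bits."""
    return (acr >> 32) & 0xFF


def store_all(acr):
    """Move 5: higher part, lower part and guards into three registers."""
    return store_high(acr), store_low(acr), store_guard(acr)


if __name__ == "__main__":
    print([hex(word) for word in store_all(0xAB3456789A)])
```

Saving all three words, including the guards, is what allows the full accumulator state to be restored later with the load instructions.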

MAC integration

Flags in MAC

Usually control code is implemented using ALU instructions
– Flags in the MAC are not used much
Mainly for exceptions
– The MAC has
o Saturation Flag (FMO)
o Sign Flag (FMS)
o Zero Flag (FMZ)

Data operation sequence

Very important
– Add guard bits
– Operation (iteration) and scaling
– Round after iteration
– Saturation and removing guard bits
– Truncation and output
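
The order of these steps can be sketched for a fractional dot product. The example assumes Q15 inputs, a 40-bit accumulator with 8 guard bits, and rounding by adding half of the kept LSB; scaling is left out, and the function name `mac_sequence` is hypothetical.

```python
def mac_sequence(xs, cs):
    """Q15 dot product: guard bits, iterate, round, saturate, truncate."""
    acc = 0                                     # accumulator wide enough to hold guard bits
    for x, c in zip(xs, cs):
        acc += 2 * x * c                        # fractional MAC; guards absorb overflow
        assert -(1 << 39) <= acc < (1 << 39)    # must still fit in 40 bits
    acc += 0x8000                               # round once, after the iteration
    acc = max(-(1 << 31), min((1 << 31) - 1, acc))  # saturate, which removes the guards
    return acc >> 16                            # truncate to the upper 16 bits


if __name__ == "__main__":
    # The intermediate sum exceeds 32 bits, but the final result is small again.
    print(mac_sequence([0x7FFF, 0x7FFF, -0x8000, -0x8000], [0x7FFF] * 4))
```

Saturating after every step instead of once at the end would clip the intermediate sum in this example and give a wrong final result, which is the reason for the guard bits.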

Physical critical path

What is the physical critical path?

Physical critical path

[Figure: operand sources D-mem 1, D-mem 2, D-mem 3, D-mem 4, a constant, RF OPA (32-to-1 selection) and RF OPB (32-to-1 selection) reach the MAC input over long wires. Annotation at the MAC input: "Very heavy fan out here!"]

Physical critical path

[Figure: accumulator registers ACR1, ACR2, ..., ACRm with register select logic and data select logic. Annotations: heavy fan-out for the MAC internal logic; long wires from the RF and the data memory]

Pipeline

[Figure: three pipelining options, each built from a multiplier (*), an accumulator with ACR, and a flag circuit: (a) MAC in one clock cycle, (b) MAC using two clocks, (c) MAC using three clocks]

Example :: MAC Design

Design a MAC unit capable of the following operations:
o OP0: No operation
o OP1: ACR = 0
o OP2: ACR = A * B (fractional multiplication, signed)
o OP3: ACR = A * B + ACR (fractional multiplication, signed)
o OP4: ACR = 1.25 * ACR (scaling)
o OP5: Load ACR with a fractional value from a register
o OP6: ACR = SATURATE(ROUND(ACR))
o OP7: RF = ACR[7:0]
o OP8: RF = ACR[15:8]
o OP9: RF = SIGNEXTEND(ACR[19:16])

Constraints:
• A and B are 8 bits; registers are 8 bits.
• ACR is 20 bits (including 4 guard bits).
• Only one multiplier may be used. You should select as small a multiplier as necessary. You also need to annotate whether it is signed or unsigned.

Example :: MAC Design

The End :: Thank you for your attention

Questions?
