lecture asip 8, 9

7/28/2019 Lecture ASIP 8, 9

VLSI Architecture :: MEL G642

MEL G642

Dr. A. Amalin Prince
BITS Pilani K.K. Birla Goa Campus

Department of Electrical, Electronics and Instrumentation Engineering

Contents

MAC fundamentals
MAC implementations
A MAC case study
MAC integration

Datapath in a DSP processor

Figure: datapath of a DSP processor. Labelled blocks: RF and ALU in the data path (DP); the control path (CP); data memories DM1 and DM2; the addressing path (AGU); processor memory and register buses.

MAC general

MAC instructions

Multiplication arithmetic
MAC and iterative instructions
Double-precision arithmetic instructions
Move data from and to MAC
Other instructions

Why MAC

MAC: Multiplication and accumulation unit
– Performs convolution-based algorithms
o FIR, IIR, autocorrelation, cross-correlation
– Supports most transformation algorithms
o FFT and DCT need MAC hardware

Figure: FIR delay line holding x(n-1), x(n-2), x(n-3) and x(n-4).

Data x(n) is shifted through a FIFO buffer consisting of 4 registers, so that x(n) becomes x(n-1), x(n-1) becomes x(n-2), and so on in the next clock cycle.

$$y(n) = \sum_{i=0}^{m} x(n-i)\,c(i)$$

(here m = 4)

All arithmetic operations are mapped to hardware in parallel. The five-term sum below needs five multipliers (one per coefficient) and four adders.

A sample of y(n) is computed per clock cycle:
y(n) = x(n)*c(0) + x(n-1)*c(1) + x(n-2)*c(2) + x(n-3)*c(3) + x(n-4)*c(4)
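The sum of products above is the loop a MAC unit iterates over when the filter is not fully unrolled in hardware. A minimal Python sketch, assuming integer samples; the names `fir_sample`, `taps` and `delay_line` and the coefficient values are illustrative and do not come from the slides:

```python
def fir_sample(x_n, delay_line, taps):
    """Compute one output y(n) and shift the delay line by one sample.

    delay_line holds [x(n-1), x(n-2), ...]; taps holds [c(0), c(1), ...].
    """
    window = [x_n] + delay_line          # x(n), x(n-1), ..., x(n-4)
    acc = 0                              # plays the role of the accumulator
    for x, c in zip(window, taps):
        acc += x * c                     # one multiply-accumulate step
    new_delay_line = window[:-1]         # x(n) becomes x(n-1), and so on
    return acc, new_delay_line


taps = [1, 2, 3, 4, 5]                   # c(0)..c(4), illustrative values
delay = [0, 0, 0, 0]                     # x(n-1)..x(n-4)
outputs = []
for sample in [1, 0, 0, 0, 0]:           # unit impulse as input
    y, delay = fir_sample(sample, delay, taps)
    outputs.append(y)
print(outputs)                           # impulse response equals the taps
```

Feeding a unit impulse returns the coefficients in order, which is a quick check that the delay line shifts in the direction the slide describes.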

MAC basics

MAC: Multiplication and accumulation unit
– Adder = accumulator; accumulator register

Figure: (a) MAC without pipeline (b) MAC with pipeline. Labels in both panels: MOA, MOB, multiplier, AOA, AOB, accumulator, ACR (accumulating register), flag circuit.

MUL circuit

Multiplications

How to manage double precision?
How to manage signed?

Hardware multiplication

The 16-bit signed and 16-bit unsigned multiplications can both be implemented on a 17b × 17b signed multiplier. In general, an (N+1)×(N+1) signed multiplier can give N-bit signed and unsigned multiplication.

Figure: hardware multiplication options. Fractional multiplication or integer multiplication; signed multiplication or unsigned multiplication; result with double precision or result with single precision.
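The reason one extra bit is enough: a signed operand is sign-extended to 17 bits and an unsigned operand is zero-extended, so both fit the same signed multiplier. A small Python sketch of that idea; the helper names `to_signed`, `extend17` and `mul17` are hypothetical:

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def extend17(word16, signed):
    """Extend a 16-bit word to a 17-bit signed operand.

    Signed operands repeat bit 15; unsigned operands get a leading 0.
    """
    word16 &= 0xFFFF
    top = (word16 >> 15) & 1 if signed else 0
    return to_signed((top << 16) | word16, 17)


def mul17(a16, b16, a_signed, b_signed):
    """One 17b x 17b signed multiply covering all four signedness cases."""
    return extend17(a16, a_signed) * extend17(b16, b_signed)


print(mul17(0xFFFF, 0x0002, True, True))    # 0xFFFF read as -1
print(mul17(0xFFFF, 0x0002, False, False))  # 0xFFFF read as 65535
```

The same bit pattern 0xFFFF gives different products depending only on how the 17th bit is filled, which is the whole selection logic needed in front of the multiplier.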

Basic multiplication instructions

No.  Specifications on the result

M1   Signed integer multiplication, double precision result
     ACR[31:0] <= {A[15], A[15:0]} * {B[15], B[15:0]}

M2   Signed-unsigned integer multiplication, double precision result
     ACR[31:0] <= {A[15], A[15:0]} * {“0”, B[15:0]}

M3   Unsigned-signed integer multiplication, double precision result
     ACR[31:0] <= {“0”, A[15:0]} * {B[15], B[15:0]}

M4   Unsigned-unsigned integer multiplication, double precision result
     ACR[31:0] <= {“0”, A[15:0]} * {“0”, B[15:0]}

M5   Signed integer multiplication, single precision result, no round
     ACR[31:16] <= SAT(2^16 * ({A[15], A[15:0]} * {B[15], B[15:0]}))

M6   Signed fractional multiplication, double precision result
     ACR[31:0] <= SAT(2 * ({A[15], A[15:0]} * {B[15], B[15:0]}))

M7   Signed fractional multiplication, single precision rounded result
     ACR[31:16] <= SAT(Round(2 * ({A[15], A[15:0]} * {B[15], B[15:0]})))
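To see why the fractional variants multiply by 2 and saturate, here is a behavioural sketch of M1, M6 and M7 on Python integers. It assumes 16-bit operands in Q15 format passed as integers in -32768..32767, and it assumes that Round means adding half an LSB of the upper half-word (the slides do not specify the rounding mode). Function names are illustrative.

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -2**31


def sat32(value):
    """Clamp a value to the signed 32-bit range."""
    return max(INT32_MIN, min(INT32_MAX, value))


def m1_int_mul(a, b):
    """M1: signed integer multiply, full 32-bit result."""
    return a * b


def m6_frac_mul(a, b):
    """M6: signed fractional multiply; the factor 2 drops the duplicate sign bit."""
    return sat32(2 * a * b)


def m7_frac_mul_rounded(a, b):
    """M7: as M6, rounded, returning only the upper half ACR[31:16]."""
    rounded = 2 * a * b + 0x8000     # assumed rounding: add half an upper-word LSB
    return sat32(rounded) >> 16


half = 16384                         # 0.5 in Q15
print(m6_frac_mul(half, half))       # 0.25 in Q31
print(m7_frac_mul_rounded(half, half))
print(m6_frac_mul(-32768, -32768))   # (-1) * (-1) overflows and saturates
```

The last call is the one case where a Q15 product does not fit: (-1) × (-1) = +1 is not representable, so SAT clamps it to the largest positive value.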

Multiplication of long data

ACR <= X[31:0] × Y[15:0]

ACR[47:0] <= {X[31], X[31:16]} * {Y[15], Y[15:0]}
           + 2^-16 * ({“0”, X[15:0]} * {Y[15], Y[15:0]});

ACR <= X[31:0] × Y[31:0]

ACR[63:0] <= {X[31], X[31:16]} * {Y[31], Y[31:16]}
           + 2^-16 * ({“0”, X[15:0]} * {Y[31], Y[31:16]})
           + 2^-16 * ({X[31], X[31:16]} * {“0”, Y[15:0]})
           + 2^-32 * ({“0”, X[15:0]} * {“0”, Y[15:0]});
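The decomposition can be checked numerically: split each 32-bit operand into a signed high half and an unsigned low half, multiply the halves, and add the partial products at the right alignment. The sketch below uses integer weights (2^16 and 2^32 on the higher partial products) rather than the 2^-16 and 2^-32 weights of the slide; the relative alignment is the same. Function names are hypothetical.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def split32(x):
    """Split a signed 32-bit value into (signed high half, unsigned low half)."""
    x &= 0xFFFFFFFF
    return to_signed(x >> 16, 16), x & 0xFFFF


def mul32x16(x, y):
    """X[31:0] * Y[15:0] from two 17b x 17b signed products."""
    xh, xl = split32(x)
    return ((xh * y) << 16) + xl * y


def mul32x32(x, y):
    """X[31:0] * Y[31:0] from four 17b x 17b signed products."""
    xh, xl = split32(x)
    yh, yl = split32(y)
    return ((xh * yh) << 32) + ((xl * yh + xh * yl) << 16) + xl * yl


x, y = -123456789, 987654321
print(mul32x32(x, y) == x * y)
print(mul32x16(x, -321) == x * -321)
```

Only the high halves carry the sign, which is why the low halves are zero-extended (“0” prefix) in the slide's expressions.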

An example of MUL

MAC instructions

Guard Operations In MAC

MAC instructions

Single step (signed) MAC
– Integer
– Fractional

(Signed) Convolution
– Integer

Difference between MAC and convolution
– In control path, not shown here

Double-Precision Arithmetic

Double-Precision Arithmetic in MAC

No.  Specifications on the result

D1   Double-precision data add/sub double-precision data
     Saturate(ACRx[39:0] ± ACRy[39:0])

D2   Double-precision data add/sub single-precision data, aligned to LSB
     Saturate(ACRx[39:0] ± {24’b OPB[15], OPB[15:0]})

D3   Double-precision data add/sub single-precision data, aligned to MSB
     Saturate(ACRx[39:0] ± {8’b OPB[15], OPB[15:0], 16’b0})

D4   Double-precision data add/sub single-precision immediate
     Saturate(ACRx[39:0] ± {24’b immediate[15], immediate[15:0]})

D5   Absolute operation on double-precision data
     if ACRx[39] Saturate(INV(ACRx[39:0]) + “1”) else ACRx

D6   Compare two double-precision data and set flags
     set flag: Saturate(ACRx[39:0] - ACRy[39:0])

D7   Simple scale by MUX instead of by shift logic
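A behavioural sketch of D1, D2, D3 and D5 may help in reading the table. It assumes the accumulator is modelled as a Python integer in the signed 40-bit range, and it reads the slide notation `24’b OPB[15]` as 24 copies of the sign bit, which Python's signed integers provide implicitly. All function names are illustrative.

```python
ACR_BITS = 40
ACR_MAX = 2**(ACR_BITS - 1) - 1
ACR_MIN = -2**(ACR_BITS - 1)


def saturate40(value):
    """Clamp a value to the signed 40-bit accumulator range."""
    return max(ACR_MIN, min(ACR_MAX, value))


def d1_add(acr_x, acr_y, subtract=False):
    """D1: accumulator plus or minus accumulator, saturated."""
    return saturate40(acr_x - acr_y if subtract else acr_x + acr_y)


def d2_add_lsb(acr_x, opb, subtract=False):
    """D2: signed 16-bit operand aligned to the LSB (bits 15:0)."""
    return d1_add(acr_x, opb, subtract)


def d3_add_msb(acr_x, opb, subtract=False):
    """D3: signed 16-bit operand aligned to bits 31:16, low 16 bits zero."""
    return d1_add(acr_x, opb << 16, subtract)


def d5_abs(acr_x):
    """D5: absolute value; the most negative value saturates."""
    return saturate40(-acr_x) if acr_x < 0 else acr_x


print(d3_add_msb(0, -1))
print(d5_abs(ACR_MIN) == ACR_MAX)
```

D5 needs the Saturate step because the most negative 40-bit value has no positive counterpart in two's complement.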

Scaling in DSP :: MAC

Figure: "... With Double Precision Arithmetic" (start of the title lost in extraction).

Figure: "Control Signals".

Move / change data types

Move data from MAC and to MAC

Basic load
– Loads half of ACRn and keeps the other half.
o When the higher part is loaded, the guards are filled in as well
– Loads half of ACRn and clears the other half.
o When the higher part is loaded, the guards are filled in as well
– Loads both the lower and higher parts of ACRn.
o The guards are filled in using the sign of the higher part

Move data to MAC

Specifications on the result

L1 ACRn <= {8’bA[15], A[15:0], ACRn[15:0]} // keep lower part

L2 ACRn <= {8’bA[15], A[15:0], 16’h0000} // clean lower part

L3 ACRn <= {ACRn[39:16], A[15:0]} // keep higher part

L4 ACRn <= {24’bA[15], A[15:0]} // sign extension, higher part

L5 ACRn <= {8’bA[15], A[15:0], B[15:0]} // load A and B from RF

L6 ACRn <= {A[7:0], ACRn[31:0]} // restore guards
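The load variants differ only in which fields of the 40-bit register are overwritten. A sketch of L1, L2, L3 and L6, assuming ACRn is modelled as an unsigned 40-bit pattern (guards in bits 39:32) and reading `8’bA[15]` as eight copies of A's sign bit; the function names are hypothetical:

```python
MASK40 = (1 << 40) - 1


def sign_bits(word16, count):
    """Return `count` copies of bit 15 of word16 as an integer field."""
    return ((1 << count) - 1) if (word16 >> 15) & 1 else 0


def load_l1(acr, a):
    """L1: load A into ACR[31:16], fill guards with A[15], keep ACR[15:0]."""
    a &= 0xFFFF
    return (sign_bits(a, 8) << 32) | (a << 16) | (acr & 0xFFFF)


def load_l2(a):
    """L2: as L1, but the lower part is cleared."""
    a &= 0xFFFF
    return (sign_bits(a, 8) << 32) | (a << 16)


def load_l3(acr, a):
    """L3: load A into ACR[15:0], keep guards and higher part."""
    return (acr & MASK40 & ~0xFFFF) | (a & 0xFFFF)


def load_l6(acr, a):
    """L6: restore the guards from A[7:0], keep ACR[31:0]."""
    return ((a & 0xFF) << 32) | (acr & 0xFFFFFFFF)


print(hex(load_l1(0x123456789A, 0x8001)))
```

L6 is the counterpart of saving the guards to the register file, for example around a context switch.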

Move data to MAC: Logic

Figure: "MAC Modified".

Move data from MAC

Specifications on the result

1 Rn <= ACRn[31:16] // Rn is a register in RF

2 Rn <= ACRn[15:0] // Rn is a register in RF

3 Rn <= ACRn[31:16]; Rn+1 <= ACRn[15:0] // Rn and Rn+1 in RF

4 M1 <= ACRn[31:16]; M2 <= ACRn[15:0]; // M1, M2: memories

5 Rn <= ACRn[31:16]; Rn+1 <= ACRn[15:0]; Rn+2 <= ACRn[39:32]

6 Rn <= {8’h00, ACRn[39:32]}; // guard to register file RF
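Reading the accumulator back is plain bit slicing. A short sketch, assuming ACRn is held as an unsigned 40-bit pattern and registers are 16 bits wide; the function names are illustrative:

```python
def store_high(acr):
    """ACRn[31:16] as a 16-bit word."""
    return (acr >> 16) & 0xFFFF


def store_low(acr):
    """ACRn[15:0] as a 16-bit word."""
    return acr & 0xFFFF


def store_guard(acr):
    """{8'h00, ACRn[39:32]}: guards zero-extended to 16 bits."""
    return (acr >> 32) & 0xFF


def store_all(acr):
    """High word, low word and guards, as in the three-register move."""
    return store_high(acr), store_low(acr), store_guard(acr)


print([hex(v) for v in store_all(0xFF8001789A)])
```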

MAC integration

Flags in MAC

Usually control code is implemented using ALU instructions
– Flags in MAC are not used much

Mainly for exceptions
– MAC has
o Saturation Flag (FMO)
o Sign Flag (FMS)
o Zero Flag (FMZ)

Data operation sequence

Very important
– Add guard bits
– Operation (iteration) and scaling
– Round after iteration
– Saturation and removing guard bits
– Truncation and output
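The order of these steps matters: rounding and saturating inside the loop would lose precision or clip intermediate sums that the guard bits could have absorbed. A sketch of the sequence for a Q15 dot product, assuming a 40-bit accumulator (32 data bits plus 8 guards), no scaling step, and rounding by adding half an LSB of the output word; these widths and the function names are assumptions for illustration:

```python
def sat(value, bits):
    """Clamp to a signed `bits`-wide range; also report whether clamping occurred."""
    hi = 2**(bits - 1) - 1
    lo = -2**(bits - 1)
    return max(lo, min(hi, value)), not (lo <= value <= hi)


def dot_product_q15(xs, cs):
    """Q15 dot product: guard, iterate, round, saturate, truncate."""
    acc = 0                                # guards start as sign extension of 0
    for x, c in zip(xs, cs):
        acc, _ = sat(acc + 2 * x * c, 40)  # iterate; guards absorb overflow
    acc += 0x8000                          # round once, after the iteration
    acc, overflow = sat(acc, 32)           # saturate and drop the guard bits
    return acc >> 16, overflow             # truncate to the 16-bit output


half = 16384                               # 0.5 in Q15
print(dot_product_q15([half] * 4, [half] * 4))
print(dot_product_q15([half, -half], [half, half]))
```

In the first call the running sum reaches 1.0, which only fits because of the guard bits; the final saturation clamps it and the returned flag plays the role of a saturation flag.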

Physical critical path

What is physical critical path?

Figure: physical critical path around the MAC. Labels: D-mem 1, D-mem 2, D-mem 3, D-mem 4 and Constant as data sources; RF OPA and RF OPB, each through a 32-to-1 selection; register select logic and data select logic; accumulator registers up to ACRm. Annotations: long wires from RF and data memory; very heavy fan-out at the MAC input; heavy fan-out for MAC internal logic.

Pipeline

Figure: MAC pipelining options, each ending in an accumulator and a flag circuit. (a) MAC in one clock cycle (b) MAC using two clocks (c) MAC using three clocks

Example :: MAC Design

Design a MAC unit capable of the following operations:
o OP0: No operation
o OP1: ACR = 0
o OP2: ACR = A * B (Fractional multiplication (signed))
o OP3: ACR = A * B + ACR (Fractional multiplication (signed))
o OP4: ACR = 1.25 * ACR (Scaling)
o OP5: Load ACR with a fractional value from a register
o OP6: ACR = SATURATE(ROUND(ACR))
o OP7: RF = ACR[7:0]
o OP8: RF = ACR[15:8]
o OP9: RF = SIGNEXTEND(ACR[19:16])

Constraints:

• A and B are 8 bits, registers are 8 bits
• ACR is 20 bits (including 4 guard bits).
• Only one multiplier may be used. You should select as small a multiplier as necessary. You also need to annotate whether it is signed or unsigned.
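A behavioural model is a useful reference before drawing the hardware. The sketch below is one possible reading of the specification, not the intended solution. Assumptions: A, B and registers are signed Q7 values passed as integers in -128..127; ACR is a signed 20-bit integer; OP5 places the register value in ACR[15:8]; ROUND in OP6 rounds to 8-bit precision and SATURATE clamps to the 16 non-guard bits; OP7 is taken to read ACR[7:0]. The class name `Mac8` and its method names are hypothetical.

```python
def sat(value, bits):
    """Clamp a value to a signed `bits`-wide range."""
    return max(-2**(bits - 1), min(2**(bits - 1) - 1, value))


class Mac8:
    """Behavioural sketch: Q7 operands, 20-bit ACR with 4 guard bits."""

    def __init__(self):
        self.acr = 0

    def clear(self):                       # OP1
        self.acr = 0

    def mul(self, a, b):                   # OP2: fractional multiply
        self.acr = sat(2 * a * b, 20)

    def mac(self, a, b):                   # OP3: fractional multiply-accumulate
        self.acr = sat(self.acr + 2 * a * b, 20)

    def scale(self):                       # OP4: 1.25 * ACR = ACR + ACR / 4
        self.acr = sat(self.acr + (self.acr >> 2), 20)

    def load(self, r):                     # OP5: assumed alignment to ACR[15:8]
        self.acr = r << 8

    def round_sat(self):                   # OP6: round to 8 bits, clamp to 16
        rounded = (self.acr + 0x80) & ~0xFF
        self.acr = sat(rounded, 16)

    def read_low(self):                    # OP7 (assumed): ACR[7:0]
        return self.acr & 0xFF

    def read_high(self):                   # OP8: ACR[15:8]
        return (self.acr >> 8) & 0xFF

    def read_guard(self):                  # OP9: sign-extended ACR[19:16]
        return (self.acr >> 16) & 0xFF


m = Mac8()
m.mul(64, 64)                              # 0.5 * 0.5
print(m.read_high())                       # 0.25 in Q7
```

Writing OP4 as ACR + ACR/4 shows that scaling by 1.25 needs only a shift and the existing adder, so it does not compete for the single multiplier.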

The End :: Thank you for your attention

Questions?
