Computer Architecture

Paul Mellies

Lecture 20 : Multicycle microarchitectures

[Figure: state diagram of the complete multicycle control FSM, from S0 : Fetch to S11 : Jump.]

The multicycle microarchitecture

The single-cycle microarchitecture has three weaknesses :

• it requires a clock cycle long enough to support the slowest instruction ( lw ), even though most instructions are faster ;
• it requires three adders : one in the ALU and two for the PC logic ; adders are relatively expensive circuits, especially if they must be fast ;
• it has separate instruction and data memories, which may not be realistic.

The multicycle processor addresses these three weaknesses by breaking an instruction into multiple shorter steps.

The key idea is to replace the controller of the single-cycle microarchitecture, which is based on combinational logic, by a finite state machine ( FSM ).

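State elements with unified instruction/data memory

[Figure: the state elements, all clocked by CLK: the program counter ( input PC', output PC, enable EN ), the unified instruction/data memory ( A, RD, WD, WE ) and the register file ( A1, A2, A3, WD3, WE3, RD1, RD2 ).]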

Step 1 : fetch instruction from memory

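[Figure: datapath for the lw instruction. The PC drives the address input A of the memory ; the word read on RD is stored in the Instruction Register, whose enable input EN is driven by IRWrite.]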

The instruction is read and stored in a new nonarchitectural Instruction Register so that it is available for future cycles. The Instruction Register receives an enable signal called IRWrite which is asserted when the Instruction Register should be updated with a new instruction.
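The behaviour of a register with an enable input can be sketched in a few lines of Python. This is only an illustrative model, not part of the lecture : the class name `EnabledRegister` and the instruction words are assumptions chosen for the example.

```python
class EnabledRegister:
    """Clocked 32-bit register that captures its input only when enabled."""

    def __init__(self, value=0):
        self.value = value

    def clock(self, data_in, enable):
        # On a rising clock edge, update only if the enable is asserted.
        if enable:
            self.value = data_in & 0xFFFFFFFF
        return self.value


ir = EnabledRegister()
ir.clock(0x8D2A0020, enable=True)   # fetch step: IRWrite = 1, IR is loaded
ir.clock(0x12345678, enable=False)  # later steps: IRWrite = 0, IR holds its value
```

Because IRWrite is asserted only during the fetch step, the instruction stays available in the register for all the following cycles, whatever the memory outputs in the meantime.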

Step 2 : read source operand from register file

[Figure: datapath for the lw instruction. Instr 25:21 drives the address input A1 of the register file ; the value read on RD1 is stored in the nonarchitectural register A.]

The next step is to read the source register containing the base address. The source register is specified in the rs field [25 : 21] of the instruction. So, just as in the single-cycle case, these 5 bits of the instruction are connected to the address input A1 of the register file.
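As a small illustration of the field selection, the following Python sketch extracts a bit range from a 32-bit instruction word. The helper name `bits` and the sample instruction ( `lw $t2, 32($t1)`, encoded as 0x8D2A0020 ) are assumptions made for the example.

```python
def bits(word, hi, lo):
    """Return the bit field word[hi:lo] (both bounds inclusive)."""
    width = hi - lo + 1
    return (word >> lo) & ((1 << width) - 1)


instr = 0x8D2A0020          # lw $t2, 32($t1)
opcode = bits(instr, 31, 26)  # 35 = 0b100011 (lw)
rs = bits(instr, 25, 21)      # 9  = $t1, sent to address input A1
rt = bits(instr, 20, 16)      # 10 = $t2
```

In hardware this selection costs nothing : the five wires Instr 25:21 are simply routed to A1.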

Step 3 : sign-extend the immediate

[Figure: datapath for the lw instruction. Instr 15:0 goes through the Sign Extend unit, which outputs SignImm.]

The lw instruction also requires an offset. The offset is stored in the immediate field of the instruction, Instr 15:0. Because the 16-bit immediate might be either positive or negative, it must be sign-extended to 32 bits. The 32-bit sign-extended value is called SignImm.
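Sign extension copies bit 15 of the immediate into the 16 upper bits of the result. A minimal Python sketch, where the function name `sign_extend16` is an assumption of this example :

```python
def sign_extend16(imm16):
    """Sign-extend a 16-bit immediate to a 32-bit two's complement word."""
    imm16 &= 0xFFFF
    if imm16 & 0x8000:
        # Negative immediate: fill the upper 16 bits with ones.
        return imm16 | 0xFFFF0000
    # Positive immediate: the upper 16 bits stay at zero.
    return imm16


pos = sign_extend16(0x0020)  # +32 stays 0x00000020
neg = sign_extend16(0xFFFC)  # -4 becomes 0xFFFFFFFC
```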

Step 4 : add base address to offset

[Figure: datapath for the lw instruction. The ALU, controlled by ALUControl 2:0, adds SrcA ( register A ) and SrcB ( SignImm ) ; ALUResult is stored in the nonarchitectural register ALUOut.]
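The address computation of this step is a 32-bit addition that wraps around modulo 2^32. The sketch below is only an illustration ; the function name `effective_address` and the sample values are assumptions.

```python
def effective_address(base, sign_imm):
    """Add the base address (register A) and SignImm on 32 bits."""
    return (base + sign_imm) & 0xFFFFFFFF


addr_pos = effective_address(0x00001000, 0x00000020)  # base + 32
addr_neg = effective_address(0x00001000, 0xFFFFFFFC)  # base - 4
```

A negative offset needs no special treatment : adding the two's complement value 0xFFFFFFFC and discarding the carry is the same as subtracting 4.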

Step 6 : load data from memory

[Figure: datapath for the lw instruction. A multiplexer controlled by IorD selects the memory address Adr between the PC ( 0 ) and ALUOut ( 1 ) ; the word read from memory is stored in the nonarchitectural register Data.]

The next step is to load the data from the calculated address in the memory. We add a multiplexer in front of the memory to choose the memory address Adr from either the PC or ALUOut. The multiplexer select signal is called IorD to indicate either an instruction or data address. The data read from the memory is stored in another nonarchitectural register, called Data.
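The role of the address multiplexer can be sketched as follows. The helper names `mux2` and `memory_address`, and the contents of the toy memory, are assumptions made for this illustration.

```python
def mux2(select, in0, in1):
    """2-to-1 multiplexer: returns in0 when select is 0, in1 when select is 1."""
    return in1 if select else in0


def memory_address(ior_d, pc, alu_out):
    # IorD = 0 selects the PC (instruction fetch),
    # IorD = 1 selects ALUOut (data access).
    return mux2(ior_d, pc, alu_out)


# Hypothetical unified memory: one instruction word and one data word.
memory = {0x00: 0x8D2A0020, 0x40: 0xDEADBEEF}

instr = memory[memory_address(0, pc=0x00, alu_out=0x40)]  # fetch step
data = memory[memory_address(1, pc=0x00, alu_out=0x40)]   # memory read step
```

The same memory is read twice during one lw instruction, on two different cycles, which is what makes a unified instruction/data memory possible.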

Step 6 : load data from memory

[Figure: datapath for the lw instruction, same as on the previous slide.]

Notice that the address multiplexer enables us to reuse the memory during the lw instruction. On the first step, the address is taken from the PC to fetch the instruction. On the later step, the address is taken from ALUOut to load the data. Hence, the control signal IorD must have different values on these different steps.

Step 7 : write back data to register file

[Figure: datapath for the lw instruction. The Data register feeds the write data input WD3 of the register file ; the destination register is specified by Instr 20:16, connected to the address input A3.]

Step 8 : increment PC by 4

[Figure: datapath for the lw instruction. Two multiplexers are added in front of the ALU : ALUSrcA selects SrcA between the PC ( 0 ) and A ( 1 ) ; ALUSrcB selects SrcB among four inputs ( 00, 01, 10, 11 ), including the constant 4 ( 01 ) and SignImm ( 10 ). ALUResult is fed back as PC', and the enable input EN of the PC register is driven by PCWrite.]
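The two ALU input multiplexers can be modelled as below. This is a sketch under assumptions : the function names are invented for the example, and the inputs 00 ( register B ) and 11 ( SignImm shifted left by 2 ) are those wired on the later sw, R-type and beq slides, not yet at this step.

```python
def src_a(alu_src_a, pc, a):
    # ALUSrcA = 0 selects the PC, ALUSrcA = 1 selects register A.
    return a if alu_src_a else pc


def src_b(alu_src_b, b, sign_imm):
    # 4-to-1 multiplexer selected by the 2-bit signal ALUSrcB.
    inputs = {
        0b00: b,                                 # register B (later slides)
        0b01: 4,                                 # constant 4, for PC + 4
        0b10: sign_imm,                          # offset of lw / sw
        0b11: (sign_imm << 2) & 0xFFFFFFFF,      # branch offset (later slides)
    }
    return inputs[alu_src_b]


# Incrementing the PC reuses the ALU: ALUSrcA = 0, ALUSrcB = 01.
pc = 0x00400000
next_pc = (src_a(0, pc, a=0) + src_b(0b01, b=0, sign_imm=0)) & 0xFFFFFFFF
```

Reusing the ALU for PC + 4 is how the multicycle design avoids the dedicated adders of the single-cycle processor.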

Enhanced datapath for the sw instruction

[Figure: datapath for the sw instruction. The second register, specified by Instr 20:16, is read through A2 and RD2 and stored in the nonarchitectural register B, which feeds the write data input WD of the memory ; the write enable WE of the memory is driven by MemWrite.]

Enhanced datapath for R-type instructions

[Figure: datapath for the R-type instructions. Register B is connected to the input 00 of the ALUSrcB multiplexer ; the RegDst multiplexer selects the destination register A3 between Instr 20:16 ( 0 ) and Instr 15:11 ( 1 ) ; the MemToReg multiplexer selects the write data WD3 between ALUOut ( 0 ) and Data ( 1 ).]

Enhanced datapath for the beq instruction

[Figure: datapath for the beq instruction. SignImm shifted left by 2 is connected to the input 11 of the ALUSrcB multiplexer ; the PCSrc multiplexer selects PC' between ALUResult ( 0 ) and ALUOut ( 1 ) ; the PC register is enabled when PCWrite is asserted, or when Branch is asserted and the Zero output of the ALU is set.]
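The PC enable logic and the branch target computation can be sketched in Python as follows. The function names are assumptions of this example ; `pc_plus_4` stands for the PC value already incremented during the fetch step.

```python
def pc_enable(pc_write, branch, zero):
    # The PC is written unconditionally (PCWrite) or when a branch is taken
    # (Branch asserted and the ALU reports that the two operands are equal).
    return bool(pc_write or (branch and zero))


def branch_target(pc_plus_4, sign_imm):
    # The word offset is converted to a byte offset by shifting left by 2.
    return (pc_plus_4 + (sign_imm << 2)) & 0xFFFFFFFF


taken = pc_enable(pc_write=0, branch=1, zero=1)      # beq with equal operands
not_taken = pc_enable(pc_write=0, branch=1, zero=0)  # beq with different operands
target = branch_target(0x00000104, 0xFFFFFFFF)       # offset of -1 word
```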

Complete multicycle MIPS processor

[Figure: complete multicycle MIPS processor. The control unit receives Op ( Instr 31:26 ) and Funct ( Instr 5:0 ) and drives the control signals of the datapath : IorD, MemWrite, IRWrite, RegDst, MemToReg, RegWrite, ALUSrcA, ALUSrcB 1:0, ALUControl 2:0, Branch, PCWrite and PCSrc. PCEn is derived from PCWrite, Branch and Zero.]

Control unit internal structure

[Figure: the control unit is made of two blocks.
• Main Controller ( FSM ) : input Opcode 5:0 ; outputs the multiplexer selects MemtoReg, RegDst, IorD, PCSrc, ALUSrcB 1:0 and ALUSrcA, the register enables IRWrite, MemWrite, PCWrite, Branch and RegWrite, and ALUOp 1:0.
• ALU Decoder : inputs ALUOp 1:0 and Funct 5:0 ; output ALUControl 2:0.]
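The ALU decoder is a small combinational block, which can be sketched as a Python function. The slides do not give its truth table : the ALUControl encodings and the list of supported funct codes below are assumptions, taken from the encoding commonly used in textbook MIPS designs.

```python
# Assumed 3-bit ALUControl encodings (not given on the slides).
ADD, SUB, AND, OR, SLT = 0b010, 0b110, 0b000, 0b001, 0b111

# MIPS funct codes of a few R-type instructions.
FUNCT_TO_CONTROL = {
    0b100000: ADD,  # add
    0b100010: SUB,  # sub
    0b100100: AND,  # and
    0b100101: OR,   # or
    0b101010: SLT,  # slt
}


def alu_decoder(alu_op, funct):
    if alu_op == 0b00:
        return ADD                   # address or PC computation
    if alu_op == 0b01:
        return SUB                   # comparison for beq
    return FUNCT_TO_CONTROL[funct]   # ALUOp = 10: look at the funct field


fetch_ctrl = alu_decoder(0b00, funct=0b101010)  # funct ignored: add
rtype_ctrl = alu_decoder(0b10, funct=0b100010)  # sub
```

Splitting the control unit this way keeps the FSM small : the main controller only decides the kind of ALU operation, and the decoder handles the details of each R-type instruction.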

Fetch

S0 : Fetch ( entered on Reset )
    IorD = 0, AluSrcA = 0, AluSrcB = 01, AluOp = 00, PCSrc = 0, IRWrite, PCWrite

[Figure: data flow during the fetch step.]

Decode

S1 : Decode ( from S0 )
    no control signal is asserted in this state at this stage

[Figure: data flow during the decode step.]

Memory Address Computation

S2 : MemAdr ( from S1 when Op = LW or Op = SW )
    AluSrcA = 1, AluSrcB = 10, AluOp = 00

[Figure: data flow during memory address computation.]

Memory Read

S3 : MemRead ( from S2 when Op = LW )
    IorD = 1
S4 : Mem Writeback ( from S3 )
    RegDst = 0, MemtoReg = 1, RegWrite

Memory Write

S5 : MemWrite ( from S2 when Op = SW )
    IorD = 1, MemWrite

Execute R-type operation

S6 : Execute ( from S1 when Op = R-type )
    AluSrcA = 1, AluSrcB = 00, AluOp = 10
S7 : ALU Writeback ( from S6 )
    RegDst = 1, MemtoReg = 0, RegWrite

Branch

S1 : Decode ( now also computes the branch target address )
    AluSrcA = 0, AluSrcB = 11, AluOp = 00
S8 : Branch ( from S1 when Op = BEQ )
    AluSrcA = 1, AluSrcB = 00, AluOp = 01, PCSrc = 1, Branch

Complete multicycle control FSM

S0 : Fetch ( entered on Reset )
    IorD = 0, AluSrcA = 0, AluSrcB = 01, AluOp = 00, PCSrc = 0, IRWrite, PCWrite
S1 : Decode ( from S0 )
    AluSrcA = 0, AluSrcB = 11, AluOp = 00
S2 : MemAdr ( from S1 when Op = LW or Op = SW )
    AluSrcA = 1, AluSrcB = 10, AluOp = 00
S3 : MemRead ( from S2 when Op = LW )
    IorD = 1
S4 : Mem Writeback ( from S3 )
    RegDst = 0, MemtoReg = 1, RegWrite
S5 : MemWrite ( from S2 when Op = SW )
    IorD = 1, MemWrite
S6 : Execute ( from S1 when Op = R-type )
    AluSrcA = 1, AluSrcB = 00, AluOp = 10
S7 : ALU Writeback ( from S6 )
    RegDst = 1, MemtoReg = 0, RegWrite
S8 : Branch ( from S1 when Op = BEQ )
    AluSrcA = 1, AluSrcB = 00, AluOp = 01, PCSrc = 1, Branch

After S4, S5, S7 and S8, the FSM returns to S0 : Fetch.

Main controller states for addi

S9 : ADDI Execute ( from S1 when Op = ADDI )
    AluSrcA = 1, AluSrcB = 10, AluOp = 00
S10 : ADDI Writeback ( from S9 )
    RegDst = 0, MemtoReg = 0, RegWrite

Multicycle MIPS datapath enhanced to support the j instruction

[Figure: multicycle MIPS datapath with the PCSrc multiplexer widened to three inputs, selected by PCSrc 1:0 : ALUResult ( 00 ), ALUOut ( 01 ) and PCJump ( 10 ). PCJump is formed from the four upper bits of the PC ( 31:28 ) and from Instr 25:0 shifted left by 2 ( 27:0 ).]
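The construction of the jump address can be sketched as follows. The function name `pc_jump` and the sample values are assumptions of this example ; `pc` stands for the value held in the PC register when the jump is executed, that is, the value already incremented during the fetch step.

```python
def pc_jump(pc, instr):
    """Build the 32-bit jump target from PC[31:28] and Instr[25:0] << 2."""
    addr = instr & 0x03FFFFFF          # 26-bit address field, Instr 25:0
    return (pc & 0xF0000000) | (addr << 2)


# j with address field 0x40: the target is 0x40 * 4 = 0x100
# within the 256 MB region designated by the upper four bits of the PC.
low_region = pc_jump(0x00400004, 0x08000040)
high_region = pc_jump(0x90000008, 0x08000040)
```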

Main controller state for j

S0 : Fetch ( entered on Reset )
    IorD = 0, AluSrcA = 0, AluSrcB = 01, AluOp = 00, PCSrc = 00, IRWrite, PCWrite
S1 : Decode ( from S0 )
    AluSrcA = 0, AluSrcB = 11, AluOp = 00
S2 : MemAdr ( from S1 when Op = LW or Op = SW )
    AluSrcA = 1, AluSrcB = 10, AluOp = 00
S3 : MemRead ( from S2 when Op = LW )
    IorD = 1
S4 : Mem Writeback ( from S3 )
    RegDst = 0, MemtoReg = 1, RegWrite
S5 : MemWrite ( from S2 when Op = SW )
    IorD = 1, MemWrite
S6 : Execute ( from S1 when Op = R-type )
    AluSrcA = 1, AluSrcB = 00, AluOp = 10
S7 : ALU Writeback ( from S6 )
    RegDst = 1, MemtoReg = 0, RegWrite
S8 : Branch ( from S1 when Op = BEQ )
    AluSrcA = 1, AluSrcB = 00, AluOp = 01, PCSrc = 01, Branch
S9 : ADDI Execute ( from S1 when Op = ADDI )
    AluSrcA = 1, AluSrcB = 10, AluOp = 00
S10 : ADDI Writeback ( from S9 )
    RegDst = 0, MemtoReg = 0, RegWrite
S11 : Jump ( from S1 when Op = J )
    PCSrc = 10, PCWrite

After S4, S5, S7, S8, S10 and S11, the FSM returns to S0 : Fetch.
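The transitions of the main controller can be sketched as a next-state function in Python, which also gives the number of cycles taken by each instruction. This is only an illustrative model : the function names are assumptions, states are represented by their number, and the opcodes are the standard MIPS encodings.

```python
# Standard MIPS opcodes (Instr 31:26).
LW, SW, RTYPE, BEQ, ADDI, J = 0b100011, 0b101011, 0b000000, 0b000100, 0b001000, 0b000010

# State reached from S1 : Decode, depending on the opcode.
NEXT_FROM_DECODE = {LW: 2, SW: 2, RTYPE: 6, BEQ: 8, ADDI: 9, J: 11}


def next_state(state, op):
    if state == 0:                 # S0 : Fetch
        return 1
    if state == 1:                 # S1 : Decode
        return NEXT_FROM_DECODE[op]
    if state == 2:                 # S2 : MemAdr
        return 3 if op == LW else 5
    if state in (3, 6, 9):         # first half of a two-state sequence
        return state + 1
    return 0                       # S4, S5, S7, S8, S10, S11 go back to Fetch


def cycles(op):
    """Number of clock cycles from Fetch until the FSM is back in Fetch."""
    state, count = 0, 0
    while True:
        state = next_state(state, op)
        count += 1
        if state == 0:
            return count


cycle_counts = {name: cycles(op) for name, op in
                [("lw", LW), ("sw", SW), ("R-type", RTYPE),
                 ("beq", BEQ), ("addi", ADDI), ("j", J)]}
```

In this model lw takes five cycles, sw, addi and the R-type instructions take four, and beq and j take three : each instruction uses only the steps it needs, instead of a single cycle sized for the slowest one.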