CS/COE0447 Computer Organization & Assembly Language

Post on 13-Jan-2016



1

CS/COE0447

Computer Organization & Assembly Language

Chapter 5 Part 3

2

Single-cycle Implementation of MIPS

• Our first implementation of MIPS used a single long clock cycle for every instruction

• Every instruction began on one rising (or falling) clock edge and ended on the next rising (or falling) clock edge

• This approach is not practical, as it is much slower than a multicycle implementation, where different instruction classes can take different numbers of cycles
– in a single-cycle implementation, every instruction must take the same amount of time as the slowest instruction
– in a multicycle implementation, this problem is avoided by allowing quicker instructions to use fewer cycles

• Even though the single-cycle approach is not practical, it was simpler and useful to understand first

• Now we are covering a multicycle implementation of MIPS

3

A Multi-cycle Datapath

• A single memory unit for both instructions and data
• Single ALU rather than ALU & two adders
• Registers added after every major functional unit to hold the output until it is used in a subsequent clock cycle

4

Multi-Cycle Control: What we need to cover

• Adding registers after every functional unit
– Need to modify the “instruction execution” slides to reflect this
• Breaking instruction execution down into cycles
– What can be done during the same cycle? What requires a cycle?
– Need to modify the “instruction execution” slides again
– Timing
• Control signal values
– What they are per cycle, per instruction
– Finite state machine which determines signals based on instruction type + which cycle it is
• Putting it all together

5

Execution: single-cycle (reminder)

• add (example: add $t2,$t1,$t0)
– Fetch instruction and add 4 to PC
– Read two source registers: $t1 and $t0
– Add two values: $t1 + $t0
– Store result to the destination register: $t1 + $t0 → $t2

6

A Multi-cycle Datapath

• For add:
• Instruction is stored in the instruction register (IR)
• Values read from rs and rt are stored in A and B
• Result of ALU is stored in ALUOut

7

Multi-Cycle Execution: R-type

• Instruction fetch
– IR <= Memory[PC];    // sub $t0,$t1,$t2
– PC <= PC + 4;
• Decode instruction/register read
– A <= Reg[IR[25:21]];    // rs
– B <= Reg[IR[20:16]];    // rt
– ALUOut <= PC + (sign-extend(IR[15:0])<<2);    // explained later
• Execution
– ALUOut <= A op B;    // op = add, sub, and, or, …
• Completion
– Reg[IR[15:11]] <= ALUOut;    // $t0 <= ALU result
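The R-type steps above can be traced in a few lines of Python. This is only a sketch: the helper name `bits`, the list used as a register file, and the register contents are assumptions made for illustration. The register numbers ($t0 = 8, $t1 = 9, $t2 = 10) and the encoding of sub follow the standard MIPS conventions.

```python
def bits(word, hi, lo):
    """Return bits hi..lo (inclusive) of a 32-bit word, like IR[hi:lo]."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

# Encoding of: sub $t0, $t1, $t2  (opcode 0, rs=9, rt=10, rd=8, funct 0x22)
ir = (9 << 21) | (10 << 16) | (8 << 11) | 0x22

reg = [0] * 32
reg[9], reg[10] = 7, 5           # hypothetical contents of $t1 and $t2

# Decode / register read: A <= Reg[IR[25:21]]; B <= Reg[IR[20:16]]
a = reg[bits(ir, 25, 21)]
b = reg[bits(ir, 20, 16)]

# Execution: ALUOut <= A op B (funct 0x22 selects subtraction)
alu_out = (a - b) & 0xFFFFFFFF

# Completion: Reg[IR[15:11]] <= ALUOut
reg[bits(ir, 15, 11)] = alu_out

print(hex(ir), reg[8])           # 0x12a4022 2
```

The instruction fetch step is omitted here because the sketch starts with the instruction word already in `ir`.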

8

Execution: single-cycle (reminder)

• lw (load word) (example: lw $t0,-12($t1))
– Fetch instruction and add 4 to PC
– Read the base register: $t1
– Sign-extend the immediate offset: 0xfff4 → 0xfffffff4
– Add two values to get address: X = 0xfffffff4 + $t1
– Access data memory with the computed address: M[X]
– Store the memory data to the destination register: $t0
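The sign-extension and address steps for lw can be checked with a short Python sketch. The function names and the base-register value are hypothetical; only the offset 0xfff4 (that is, -12) comes from the slide.

```python
def sign_extend16(imm16):
    """Sign-extend a 16-bit field to a 32-bit pattern (kept as an unsigned int)."""
    imm16 &= 0xFFFF
    return imm16 | 0xFFFF0000 if imm16 & 0x8000 else imm16

def effective_address(base, imm16):
    """Base register plus sign-extended offset, wrapped to 32 bits."""
    return (base + sign_extend16(imm16)) & 0xFFFFFFFF

t1 = 0x10010020                       # hypothetical contents of $t1
print(hex(sign_extend16(0xFFF4)))     # 0xfffffff4
print(hex(effective_address(t1, 0xFFF4)))  # 0x10010014, i.e. $t1 - 12
```

Wrapping the sum to 32 bits is what makes adding 0xfffffff4 equivalent to subtracting 12.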

9

A Multi-cycle Datapath

• For lw (example: lw $t0, -12($t1)):
• Instruction is stored in the IR
• Contents of rs ($t1) stored in A
• Output of ALU (address of memory location to be read) stored in ALUOut
• Value read from memory is stored in the memory data register (MDR)

10

Multi-cycle Execution: lw

• Instruction fetch
– IR <= Memory[PC];    // lw $t0,-12($t1)
– PC <= PC + 4;
• Instruction decode/register read
– A <= Reg[IR[25:21]];    // rs
– B <= Reg[IR[20:16]];
– ALUOut <= PC + (sign-extend(IR[15:0])<<2);
• Execution
– ALUOut <= A + sign-extend(IR[15:0]);    // $t1 + -12 (sign extended)
• Memory access
– MDR <= Memory[ALUOut];    // M[$t1 + -12]
• Write-back
– Load: Reg[IR[20:16]] <= MDR;    // $t0 <= M[$t1 + -12]

11

Execution: single-cycle (reminder)

• sw (example: sw $t0,-4($t1))
– Fetch instruction and add 4 to PC
– Read the base register: $t1
– Read the source register: $t0
– Sign-extend the immediate offset: 0xfffc → 0xfffffffc
– Add two values to get address: X = 0xfffffffc + $t1
– Store the contents of the source register to the computed address: $t0 → Memory[X]

12

A Multi-cycle Datapath

• For sw (example: sw $t0, -12($t1)):
• Instruction is stored in the IR
• Contents of rs ($t1) stored in A
• Output of ALU (address of memory location to be written) stored in ALUOut

13

Multi-cycle Execution: sw

• Instruction fetch
– IR <= Memory[PC];    // sw $t0,-12($t1)
– PC <= PC + 4;
• Decode/register read
– A <= Reg[IR[25:21]];    // rs
– B <= Reg[IR[20:16]];    // rt
– ALUOut <= PC + (sign-extend(IR[15:0])<<2);
• Execution
– ALUOut <= A + sign-extend(IR[15:0]);    // $t1 + -12 (sign extended)
• Memory access
– Memory[ALUOut] <= B;    // M[$t1 + -12] <= $t0

14

Execution: single-cycle (reminder)

• beq (example: beq $t0,$t1,L)
– Fetch instruction and add 4 to PC
• Assume that L is +3 instructions away
– Read two source registers: $t0, $t1
– Sign-extend the immediate, and shift it left by 2
• 0x0003 → 0x0000000c
– Perform the test, and update the PC if it is true
• If $t0 == $t1, then PC = PC + 0x0000000c
• [We will follow what MARS does, so this is not: immediate == 0x0002; PC = PC + 4 + 0x00000008]
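The offset arithmetic above (sign-extend, shift left by 2, add to the PC) can be sketched in Python. The function names and the PC value are hypothetical, and `pc` stands for whichever PC value the datapath feeds to the adder, which is the convention question the bracketed note raises.

```python
def sign_extend16(imm16):
    """Sign-extend a 16-bit field to a 32-bit pattern."""
    imm16 &= 0xFFFF
    return imm16 | 0xFFFF0000 if imm16 & 0x8000 else imm16

def branch_target(pc, imm16):
    """pc + (sign-extend(imm16) << 2), wrapped to 32 bits."""
    return (pc + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF

pc = 0x00400000                          # hypothetical PC value
print(hex(branch_target(pc, 0x0003)))    # 0x40000c  (forward: +12 bytes)
print(hex(branch_target(pc, 0xFFFD)))    # 0x3ffff4  (backward: -12 bytes)
```

The second call shows why the sign extension matters: a negative immediate yields a backward branch.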

15

A Multi-cycle Datapath

• For beq (example: beq $t0,$t1,label):
• Instruction stored in IR
• Registers rs and rt are stored in A and B
• Result of ALU (rs - rt) is stored in ALUOut

16

Multi-cycle execution: beq

• Instruction fetch
– IR <= Memory[PC];    // beq $t0,$t1,label
– PC <= PC + 4;
• Decode/register read
– A <= Reg[IR[25:21]];    // rs
– B <= Reg[IR[20:16]];    // rt
– ALUOut <= PC + (sign-extend(IR[15:0])<<2);
• PC + the number of bytes away the label is (negative for backward branches, positive for forward branches)
• Execution
– if (A == B) then PC <= ALUOut;
• if $t0 == $t1, perform the branch
• Note: the ALU is used to evaluate A == B; we’ll see that this does not clash with the use of the ALU above.

17

Execution: single-cycle (reminder)

• j
– Fetch instruction and add 4 to PC
– Take the 26-bit immediate field
– Shift left by 2 (to make a 28-bit immediate)
– Get 4 bits from the current PC and attach them to the left of the immediate
– Assign the value to PC
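The jump-address construction listed above is plain bit manipulation, sketched here in Python. The function name, the PC value, and the 26-bit field value are hypothetical; opcode 2 for j is the standard MIPS encoding.

```python
def jump_target(pc, ir):
    """Concatenate PC[31:28], IR[25:0], and two zero bits."""
    upper4 = pc & 0xF0000000             # top 4 bits of the current PC
    field28 = (ir & 0x03FFFFFF) << 2     # 26-bit field shifted left by 2
    return upper4 | field28

pc = 0x00400004                          # hypothetical PC (already incremented)
ir = (0x02 << 26) | 0x0100004            # j with a hypothetical 26-bit field
print(hex(jump_target(pc, ir)))          # 0x400010
```

Masking with 0x03FFFFFF drops the opcode bits, so only IR[25:0] contributes to the target.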

18

A Multi-cycle Datapath

• For j:
• No accesses to registers or memory; no need for ALU

19

Multi-cycle execution: j

• Instruction fetch
– IR <= Memory[PC];    // j label
– PC <= PC + 4;
• Decode/register read
– A <= Reg[IR[25:21]];
– B <= Reg[IR[20:16]];
– ALUOut <= PC + (sign-extend(IR[15:0])<<2);
• Execution
– PC <= {PC[31:28],IR[25:0],"00"};

20

Multi-Cycle Control: What we need to cover

• Adding registers after every functional unit
– Need to modify the “instruction execution” slides to reflect this
• Breaking instruction execution down into cycles
– What can be done during the same cycle? What requires a cycle?
– Need to modify the “instruction execution” slides again
– Timing
• Control signal values
– What they are per cycle, per instruction
– Finite state machine which determines signals based on instruction type + which cycle it is
• Putting it all together

21

Multicycle Approach

• Break up the instructions into steps
– each step takes one clock cycle
– balance the amount of work to be done in each step/cycle so that they are about equal
– restrict each cycle to use each major functional unit at most once, so that such units do not have to be replicated
– functional units can be shared between different cycles within one instruction
• Between steps/cycles
– At the end of one cycle, store data to be used in later cycles of the same instruction
• need to introduce additional internal (programmer-invisible) registers for this purpose
– Data to be used in later instructions are stored in programmer-visible state elements: the register file, PC, memory

Operations

• These take time:
• Memory (read/write); register file (read/write); ALU operations
• The other connections and logical elements have no latency (for our purposes)

Operations

• Before: we had separate memories for instructions and data, and we had extra adders for incrementing the PC and calculating the branch address. Now we have just one memory and just one ALU.

24

Five Execution Steps

• Each takes one cycle
• In one cycle, there can be at most one memory access, at most one register access, and at most one ALU operation
• But you can have a memory access, an ALU op, and/or a register access, as long as there is no contention for resources
• Changes to registers are made at the end of the clock cycle

25

Step 1: Instruction Fetch

• Access memory with the PC to fetch the instruction and store it in the Instruction Register (IR)
• Increment PC by 4
– We can do this because the ALU is not being used for something else this cycle

26

Step 2: Decode and Reg. Read

• Read registers rs and rt
– We read both of them regardless of necessity
• Compute the branch address in case the instruction is a branch
– We can do this because the ALU is not busy
– ALUOut will keep the target address

27

Step 3: Various Actions

• The ALU performs one of three functions based on instruction type; a jump does not need the ALU
• Memory reference
– ALUOut <= A + sign-extend(IR[15:0]);
• R-type
– ALUOut <= A op B;
• Branch
– if (A==B) PC <= ALUOut;
• Jump
– PC <= {PC[31:28],IR[25:0],2’b00};

28

Step 4: Memory Access…

• If the instruction is a memory reference
– MDR <= Memory[ALUOut];    // if it is a load
– Memory[ALUOut] <= B;    // if it is a store
• Store is complete!
• If the instruction is R-type
– Reg[IR[15:11]] <= ALUOut;
• Now the instruction is complete!

29

Step 5: Register Write Back

• Only the lw instruction reaches this step
– Reg[IR[20:16]] <= MDR;

30

Summary of Instruction Execution

Step 1 (IF): Instruction fetch
– All instructions: IR = Memory[PC]; PC = PC + 4

Step 2 (ID): Instruction decode/register fetch
– All instructions: A = Reg[IR[25-21]]; B = Reg[IR[20-16]]; ALUOut = PC + (sign-extend(IR[15-0]) << 2)

Step 3 (EX): Execution, address computation, branch/jump completion
– R-type: ALUOut = A op B
– Memory reference: ALUOut = A + sign-extend(IR[15-0])
– Branches: if (A == B) then PC = ALUOut
– Jumps: PC = PC[31-28] || (IR[25-0] << 2)

Step 4 (MEM): Memory access or R-type completion
– R-type: Reg[IR[15-11]] = ALUOut
– Load: MDR = Memory[ALUOut]
– Store: Memory[ALUOut] = B

Step 5 (WB): Memory read completion
– Load: Reg[IR[20-16]] = MDR
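To see the five steps working together, here is a minimal Python sketch of one instruction moving through them. It is an illustration under stated assumptions, not the course's reference design: the function and variable names are invented, memory is modelled as a dict keyed by byte address, and only add, sub, and, or, lw, sw, beq and j are handled, using their standard MIPS opcode and funct values.

```python
MASK = 0xFFFFFFFF

def bits(word, hi, lo):
    """Return bits hi..lo (inclusive) of a 32-bit word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

def sext16(x):
    """Sign-extend a 16-bit field to a 32-bit pattern."""
    return x | 0xFFFF0000 if x & 0x8000 else x

# funct field -> ALU operation for the R-type subset handled here
ALU_OPS = {0x20: lambda a, b: a + b, 0x22: lambda a, b: a - b,
           0x24: lambda a, b: a & b, 0x25: lambda a, b: a | b}

def run_one(pc, reg, mem):
    """Execute one instruction step by step; return (new_pc, cycles used)."""
    # Step 1 (IF): IR = Memory[PC]; PC = PC + 4
    ir = mem[pc]
    pc = (pc + 4) & MASK
    # Step 2 (ID): read rs and rt, precompute the branch target
    a = reg[bits(ir, 25, 21)]
    b = reg[bits(ir, 20, 16)]
    imm = sext16(bits(ir, 15, 0))
    alu_out = (pc + (imm << 2)) & MASK
    op = bits(ir, 31, 26)
    # Step 3 (EX): depends on the instruction class
    if op == 0x02:                                   # j
        return (pc & 0xF0000000) | (bits(ir, 25, 0) << 2), 3
    if op == 0x04:                                   # beq
        return (alu_out if a == b else pc), 3
    if op == 0x00:                                   # R-type
        alu_out = ALU_OPS[bits(ir, 5, 0)](a, b) & MASK
        reg[bits(ir, 15, 11)] = alu_out              # Step 4: R-type completion
        return pc, 4
    alu_out = (a + imm) & MASK                       # memory-reference address
    if op == 0x2B:                                   # sw
        mem[alu_out] = b                             # Step 4: memory write
        return pc, 4
    if op == 0x23:                                   # lw
        mdr = mem[alu_out]                           # Step 4: memory read
        reg[bits(ir, 20, 16)] = mdr                  # Step 5: write-back
        return pc, 5
    raise ValueError(f"unsupported opcode {op:#x}")

# Hypothetical program: lw $t0,-12($t1) followed by sw $t0,-4($t1)
lw = (0x23 << 26) | (9 << 21) | (8 << 16) | 0xFFF4
sw = (0x2B << 26) | (9 << 21) | (8 << 16) | 0xFFFC
reg = [0] * 32
reg[9] = 0x100                                       # hypothetical base in $t1
mem = {0x00: lw, 0x04: sw, 0xF4: 42}                 # 42 sits at $t1 - 12

pc, c1 = run_one(0, reg, mem)
pc, c2 = run_one(pc, reg, mem)
print(reg[8], mem[0xFC], c1 + c2)                    # 42 42 9
```

The sketch computes the branch target in step 2 for every instruction, even though only beq uses it, which mirrors the "regardless of necessity" behaviour described for the decode step.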

31

Multicycle Execution Step (1): Instruction Fetch

IR = Memory[PC];
PC = PC + 4;

[Figure: multicycle datapath (PC, Memory, IR, MDR, Registers, A, B, ALU, ALUOut); labeled value: PC + 4]

32

Multicycle Execution Step (2): Instruction Decode & Register Fetch

A = Reg[IR[25-21]];    (A = Reg[rs])
B = Reg[IR[20-16]];    (B = Reg[rt])
ALUOut = PC + (sign-extend(IR[15-0]) << 2);

[Figure: multicycle datapath; labeled values: Branch Target Address, Reg[rs], Reg[rt], PC + 4]

33

Multicycle Execution Step (3): Memory Reference Instructions

ALUOut = A + sign-extend(IR[15-0]);

[Figure: multicycle datapath; labeled values: Mem. Address, Reg[rs], Reg[rt], PC + 4]

34

Multicycle Execution Step (3): ALU Instruction (R-Type)

ALUOut = A op B;

[Figure: multicycle datapath; labeled values: R-Type Result, Reg[rs], Reg[rt], PC + 4]

35

Multicycle Execution Step (3): Branch Instructions

if (A == B) PC = ALUOut;

[Figure: multicycle datapath; labeled values: Branch Target Address, Reg[rs], Reg[rt]]

36

Multicycle Execution Step (3): Jump Instruction

PC = PC[31-28] concat (IR[25-0] << 2);

[Figure: multicycle datapath; labeled values: Jump Address, Reg[rs], Reg[rt], Branch Target Address]

37

Multicycle Execution Step (4): Memory Access - Read (lw)

MDR = Memory[ALUOut];

[Figure: multicycle datapath; labeled values: Mem. Data, PC + 4, Reg[rs], Reg[rt], Mem. Address]

38

Multicycle Execution Step (4): Memory Access - Write (sw)

Memory[ALUOut] = B;

[Figure: multicycle datapath; labeled values: PC + 4, Reg[rs], Reg[rt]]

39

Multicycle Execution Step (4): ALU Instruction (R-Type)

Reg[IR[15-11]] = ALUOut;

[Figure: multicycle datapath; labeled values: R-Type Result, Reg[rs], Reg[rt], PC + 4]

40

Multicycle Execution Step (5): Memory Read Completion (lw)

Reg[IR[20-16]] = MDR;

[Figure: multicycle datapath; labeled values: PC + 4, Reg[rs], Reg[rt], Mem. Data, Mem. Address]

41

For Reference

• The next 5 slides give the steps, one slide per instruction

42

Multi-Cycle Execution: R-type

• Instruction fetch
– IR <= Memory[PC];    // sub $t0,$t1,$t2
– PC <= PC + 4;
• Decode instruction/register read
– A <= Reg[IR[25:21]];    // rs
– B <= Reg[IR[20:16]];    // rt
– ALUOut <= PC + (sign-extend(IR[15:0])<<2);
• Execution
– ALUOut <= A op B;    // op = add, sub, and, or, …
• Completion
– Reg[IR[15:11]] <= ALUOut;    // $t0 <= ALU result

43

Multi-cycle Execution: lw

• Instruction fetch
– IR <= Memory[PC];    // lw $t0,-12($t1)
– PC <= PC + 4;
• Instruction decode/register read
– A <= Reg[IR[25:21]];    // rs
– B <= Reg[IR[20:16]];
– ALUOut <= PC + (sign-extend(IR[15:0])<<2);
• Execution
– ALUOut <= A + sign-extend(IR[15:0]);    // $t1 + -12 (sign extended)
• Memory access
– MDR <= Memory[ALUOut];    // M[$t1 + -12]
• Write-back
– Load: Reg[IR[20:16]] <= MDR;    // $t0 <= M[$t1 + -12]

44

Multi-cycle Execution: sw

• Instruction fetch
– IR <= Memory[PC];    // sw $t0,-12($t1)
– PC <= PC + 4;
• Decode/register read
– A <= Reg[IR[25:21]];    // rs
– B <= Reg[IR[20:16]];    // rt
– ALUOut <= PC + (sign-extend(IR[15:0])<<2);
• Execution
– ALUOut <= A + sign-extend(IR[15:0]);    // $t1 + -12 (sign extended)
• Memory access
– Memory[ALUOut] <= B;    // M[$t1 + -12] <= $t0

45

Multi-cycle execution: beq

• Instruction fetch
– IR <= Memory[PC];    // beq $t0,$t1,label
– PC <= PC + 4;
• Decode/register read
– A <= Reg[IR[25:21]];    // rs
– B <= Reg[IR[20:16]];    // rt
– ALUOut <= PC + (sign-extend(IR[15:0])<<2);
• Execution
– if (A == B) then PC <= ALUOut;
• if $t0 == $t1, perform the branch

46

Multi-cycle execution: j

• Instruction fetch
– IR <= Memory[PC];    // j label
– PC <= PC + 4;
• Decode/register read
– A <= Reg[IR[25:21]];
– B <= Reg[IR[20:16]];
– ALUOut <= PC + (sign-extend(IR[15:0])<<2);
• Execution
– PC <= {PC[31:28],IR[25:0],"00"};

47

Example: CPI in a multicycle CPU

• Assume
– the control design of the previous slides
– an instruction mix of 22% loads, 11% stores, 49% R-type operations, 16% branches, and 2% jumps
• What is the CPI, assuming each step requires 1 clock cycle?
• Solution:
– Number of clock cycles from the previous slides for each instruction class:
• loads 5, stores 4, R-type instructions 4, branches 3, jumps 3
– CPI = CPU clock cycles / instruction count
= Σ (instruction count_class i × CPI_class i) / instruction count
= Σ (instruction count_class i / instruction count) × CPI_class i
= 0.22 × 5 + 0.11 × 4 + 0.49 × 4 + 0.16 × 3 + 0.02 × 3
= 4.04
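The weighted sum can be reproduced with a few lines of Python. The dictionary names are illustrative; the instruction mix and per-class cycle counts are the ones given on the slide.

```python
# Fraction of executed instructions in each class (from the slide)
mix = {"load": 0.22, "store": 0.11, "r_type": 0.49, "branch": 0.16, "jump": 0.02}
# Clock cycles needed by each class in the multicycle design
cycles = {"load": 5, "store": 4, "r_type": 4, "branch": 3, "jump": 3}

# CPI is the mix-weighted average of the per-class cycle counts
cpi = sum(mix[k] * cycles[k] for k in mix)
print(f"CPI = {cpi:.2f}")   # CPI = 4.04
```

Changing the mix (for example, a load-heavy program) shifts the CPI toward 5, while a branch-heavy program pulls it toward 3.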

48

Multi-Cycle Control: What we need to cover

• Adding registers after every functional unit
– Need to modify the “instruction execution” slides to reflect this
• Breaking instruction execution down into cycles
– What can be done during the same cycle? What requires a cycle?
– Need to modify the “instruction execution” slides again
– Timing
• Control signal values
– What they are per cycle, per instruction
– Finite state machine which determines signals based on instruction type + which cycle it is
• Putting it all together

49

A (Refined) Datapath fig 5.26

50

Datapath w/ Control Signals Fig 5.27

51

Final Version w/ Control Fig 5.28

52

Multicycle Control Step (1): Fetch

IR = Memory[PC];
PC = PC + 4;

[Figure: datapath annotated with the values of the control signals for this step (PCWr*, IorD, MemRead, MemWrite, IRWrite, MemtoReg, RegDst, RegWrite, ALUSrcA, ALUSrcB, ALU Operation, PCSource)]

53

Multicycle Control Step (2): Instruction Decode & Register Fetch

A = Reg[IR[25-21]];    (A = Reg[rs])
B = Reg[IR[20-16]];    (B = Reg[rt])
ALUOut = PC + (sign-extend(IR[15-0]) << 2);

[Figure: datapath annotated with the values of the control signals for this step (PCWr*, IorD, MemRead, MemWrite, IRWrite, MemtoReg, RegDst, RegWrite, ALUSrcA, ALUSrcB, ALU Operation, PCSource)]

54

Multicycle Control Step (3): Memory Reference Instructions

ALUOut = A + sign-extend(IR[15-0]);

[Figure: datapath annotated with the values of the control signals for this step (PCWr*, IorD, MemRead, MemWrite, IRWrite, MemtoReg, RegDst, RegWrite, ALUSrcA, ALUSrcB, ALU Operation, PCSource)]

55

Multicycle Control Step (3): ALU Instruction (R-Type)

ALUOut = A op B;

[Figure: datapath annotated with the values of the control signals for this step (PCWr*, IorD, MemRead, MemWrite, IRWrite, MemtoReg, RegDst, RegWrite, ALUSrcA, ALUSrcB, ALU Operation, PCSource); the ALU Operation value is shown as ???]

56

Multicycle Control Step (3): Branch Instructions

if (A == B) PC = ALUOut;

[Figure: datapath annotated with the values of the control signals for this step (PCWr*, IorD, MemRead, MemWrite, IRWrite, MemtoReg, RegDst, RegWrite, ALUSrcA, ALUSrcB, ALU Operation, PCSource); annotation on the figure: 1 if Zero=1]

57

Multicycle Execution Step (3): Jump Instruction

PC = PC[31-28] concat (IR[25-0] << 2);

[Figure: datapath annotated with the values of the control signals for this step (PCWr*, IorD, MemRead, MemWrite, IRWrite, MemtoReg, RegDst, RegWrite, ALUSrcA, ALUSrcB, ALU Operation, PCSource)]

58

Multicycle Control Step (4): Memory Access - Read (lw)

MDR = Memory[ALUOut];

[Figure: datapath annotated with the values of the control signals for this step (PCWr*, IorD, MemRead, MemWrite, IRWrite, MemtoReg, RegDst, RegWrite, ALUSrcA, ALUSrcB, ALU Operation, PCSource)]

59

Multicycle Execution Step (4): Memory Access - Write (sw)

Memory[ALUOut] = B;

[Figure: datapath annotated with the values of the control signals for this step (PCWr*, IorD, MemRead, MemWrite, IRWrite, MemtoReg, RegDst, RegWrite, ALUSrcA, ALUSrcB, ALU Operation, PCSource)]

60

Multicycle Control Step (4): ALU Instruction (R-Type)

Reg[IR[15:11]] = ALUOut;    (Reg[Rd] = ALUOut)

[Figure: datapath annotated with the values of the control signals for this step (PCWr*, IorD, MemRead, MemWrite, IRWrite, MemtoReg, RegDst, RegWrite, ALUSrcA, ALUSrcB, ALU Operation, PCSource)]

61

Multicycle Execution Step (5): Memory Read Completion (lw)

Reg[IR[20-16]] = MDR;

[Figure: datapath annotated with the values of the control signals for this step (PCWr*, IorD, MemRead, MemWrite, IRWrite, MemtoReg, RegDst, RegWrite, ALUSrcA, ALUSrcB, ALU Operation, PCSource)]

62

Multi-Cycle Control: What we need to cover

• Adding registers after every functional unit
– Need to modify the “instruction execution” slides to reflect this
• Breaking instruction execution down into cycles
– What can be done during the same cycle? What requires a cycle?
– Need to modify the “instruction execution” slides again
– Timing: registers/memory are updated at the beginning of the next clock cycle
• Control signal values
– What they are per cycle, per instruction
– Finite state machine which determines signals based on instruction type + which cycle it is
• Putting it all together

63

Fig 5.28 For reference

64

State Diagram, Big Picture

65

Handling Memory Instructions

66

R-type Instruction

67

Branch and Jump

68

A FSM State Diagram

69

FSM Implementation

70

Example: Load (1)

71

Example: Load (2)

72

Example: Load (3)

73

Example: Load (4)

74

Example: Load (5)

75

Example: Jump (1)

76

Example: Jump (2)

77

Example: Jump (3)

78

To Summarize…

• From several building blocks, we constructed a datapath for a subset of the MIPS instruction set

• First, we analyzed instructions for functional requirements

• Second, we connected building blocks in a way that accommodates the instructions

• Third, we refined the datapath and added controls

79

To Summarize…

• We looked at how an instruction is executed on the datapath in a pictorial way

• We looked at control signals connected to functional blocks in our datapath

• We analyzed how execution steps of an instruction change the control signals

80

To Summarize…

• We compared a single-cycle implementation and a multi-cycle implementation of our datapath

• We analyzed multi-cycle execution of instructions

• We refined the multi-cycle datapath

• We designed multi-cycle control

81

To Summarize…

• We looked at the multi-cycle control scheme in detail

• Multi-cycle control can be implemented using a finite state machine (FSM)

• An FSM is composed of some combinational logic and a memory element
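As a rough illustration of such a control FSM, the Python sketch below encodes only the next-state logic. The state names and the `trace` helper are hypothetical labels chosen here, not taken from the slides' state diagram; the resulting cycle counts (loads 5, stores 4, R-type 4, branches 3, jumps 3) match those used in the CPI example.

```python
def next_state(state, cls):
    """Next-state function: current state + instruction class -> next state."""
    if state == "FETCH":
        return "DECODE"
    if state == "DECODE":
        return {"load": "MEM_ADDR", "store": "MEM_ADDR", "r_type": "EXECUTE",
                "branch": "BRANCH", "jump": "JUMP"}[cls]
    if state == "MEM_ADDR":
        return "MEM_READ" if cls == "load" else "MEM_WRITE"
    if state == "MEM_READ":
        return "WRITE_BACK"
    if state == "EXECUTE":
        return "R_COMPLETE"
    # Every remaining state finishes the instruction, so control returns to fetch
    return "FETCH"

def trace(cls):
    """List the states visited while executing one instruction of class cls."""
    states = ["FETCH"]
    while True:
        nxt = next_state(states[-1], cls)
        if nxt == "FETCH":
            return states
        states.append(nxt)

for cls in ("load", "store", "r_type", "branch", "jump"):
    print(cls, len(trace(cls)), trace(cls))
```

In hardware, `next_state` corresponds to the combinational logic and the current state is held in the memory element; a full controller would also map each state to its control signal values.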

82

Summary

• Techniques described in this chapter to design datapaths and control are at the core of all modern computer architecture

• Multicycle datapaths offer two great advantages over single-cycle
– functional units can be reused within a single instruction if they are accessed in different cycles, reducing the need to replicate expensive logic
– instructions with shorter execution paths can complete quicker by consuming fewer cycles
• Modern computers, in fact, take the multicycle paradigm to a higher level to achieve greater instruction throughput:
– pipelining (later class), where multiple instructions execute simultaneously by having cycles of different instructions overlap in the datapath
– the MIPS architecture was designed to be pipelined
