computer organization lecture set – 05.2 chapter 5 huei-yung lin
TRANSCRIPT
Computer Organization
Lecture Set – 05.2
Chapter 5
Huei-Yung Lin
H.Y. Lin, CCUEE Computer Organization 2
Outline - Processor Implementation Overview Single-Cycle Implementation Multi-Cycle Implementation
1. Analyze instruction set; get datapath requirements
2. Select datapath components andestablish clocking methodology
3. Assemble datapath that meets requirements
4. Determine control signal values for each instruction
5. Assemble control logic to generate control signals
Pipelined Implementation
H.Y. Lin, CCUEE Computer Organization 3
Outline - Multicycle Design
Overview Datapath Design Controller Design Microprogramming Exceptions Performance Considerations
H.Y. Lin, CCUEE Computer Organization 4
Multicycle Execution - Key Idea Break instruction execution into multiple cycles One clock cycle for each major task
1. Instruction Fetch
2. Instruction Decode and Register Fetch
3. Execution, memory address computation, or branch computation
4. Memory access / R-type instruction completion
5. Memory read completion
Share hardware to simplify datapath
H.Y. Lin, CCUEE Computer Organization 5
Characteristics of Multicycle Design Instructions take more than one cycle
Some instructions take more cycles than others Clock cycle is shorter than single-cycle clock
Reuse of major components simplifies datapath Single ALU for all calculations Single memory for instructions and data But, added registers needed to store values across cycles
Control Unit Implemented w/ State Machine Control signals no longer a function of just the instruction
H.Y. Lin, CCUEE Computer Organization 6
(Equivalent to Book Fig. 5.25, p. 318)
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
PC
IR
MDR
A
B
ALUOUT
Multicycle Datapath - High-Level View
H.Y. Lin, CCUEE Computer Organization 7
Review: Steps in Processor Design1. Analyze instruction set; get datapath requirements
2. Select datapath components andestablish clocking methodology
3. Assemble datapath that meets requirements
4. Determine control signal values for each instruction
5. Assemble control logic to generate control signals
H.Y. Lin, CCUEE Computer Organization 8
Review: Register Transfers
Instruction FetchInstruction <= MEM[PC]; PC = PC + 4;
Instruction ExecutionInstr. Register Transfersadd R[rd] <= R[rs] + R[rt];
sub R[rd] <= R[rs] - R[rt];
and R[rd] <= R[rs] & R[rt];
or R[rd] <= R[rs] | R[rt];
lw R[rt] <= MEM[R[rs] + s_extend(offset)];
sw MEM[R[rs] + sign_extend(offset)] <= R[rt]; PC <= PC + 4
beq if (R[rs] == R[rt]) then PC <- PC+4 + s_extend(offset<<2)
else PC <= PC + 4
j PC <= upper(PC)@(address << 2)
Key idea: break into multiple cycles!
H.Y. Lin, CCUEE Computer Organization 9
Multicycle Execution Steps
1. Instruction Fetch
2. Instruction Decode and Register Fetch (and branch target calculation)
3. One of the following: Execute R-Type Instruction OR Calculate memory address for load/store OR Perform comparison for branch
4. Memory access for load/storeOR R-type instruction completion (save result)
5. Memory read completion (save result - load only)
H.Y. Lin, CCUEE Computer Organization 10
Summary - Multicycle Execution Instructions take between 3 and 5 clock cycles
Step nameAction for R-type
instructionsAction for memory-reference
instructionsAction for branches
Action for jumps
Instruction fetch IR = Memory[PC]PC = PC + 4
Instruction A = Reg [IR[25-21]]decode/register fetch B = Reg [IR[20-16]]
ALUOut = PC + (sign-extend (IR[15-0]) << 2)
Execution, address ALUOut = A op B ALUOut = A + sign-extend if (A ==B) then PC = PC [31-28] IIcomputation, branch/ (IR[15-0]) PC = ALUOut (IR[25-0]<<2)jump completion
Memory access or R-type Reg [IR[15-11]] = Load: MDR = Memory[ALUOut]completion ALUOut or
Store: Memory [ALUOut] = B
Memory read completion Load: Reg[IR[20-16]] = MDR
(Book Fig. 5.30, p. 329)
(1)
(2)
(3)
(4)
(5)
New registers needed to store values across clock steps!
H.Y. Lin, CCUEE Computer Organization 11
IR = Memory[PC];PC = PC + 4;
4PC + 4
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
PC
IR
MDR
A
B
ALUOUT
Multicycle Execution Step (1)Instruction Fetch
H.Y. Lin, CCUEE Computer Organization 12
Multicycle Execution Step (2)Instruction Decode and Register FetchA = Reg[IR[25-21]]; (A = Reg[rs])B = Reg[IR[20-15]]; (B = Reg[rt])ALUOut = (PC + sign-extend(IR[15-0]) << 2)
BranchTarget
Address
Reg[rs]
Reg[rt]
PC + 4
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
PC
IR
MDR
A
B
ALUOUT
H.Y. Lin, CCUEE Computer Organization 13
Multicycle Execution Steps (3)Memory Reference InstructionsALUOut = A + sign-extend(IR[15-0]);
Mem.Address
Reg[rs]
Reg[rt]
PC + 4
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
PC
IR
MDR
A
B
ALUOUT
H.Y. Lin, CCUEE Computer Organization 14
Multicycle Execution Steps (3)ALU Instruction (R-Type)ALUOut = A op B
R-TypeResult
Reg[rs]
Reg[rt]
PC + 4
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
PC
IR
MDR
A
B
ALUOUT
H.Y. Lin, CCUEE Computer Organization 15
Multicycle Execution Steps (3)Branch Instructionsif (A == B) PC = ALUOut;
BranchTarget
Address
Reg[rs]
Reg[rt]
BranchTarget
Address
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
PC
IR
MDR
A
B
ALUOUT
H.Y. Lin, CCUEE Computer Organization 16
Multicycle Execution Step (3)Jump InstructionPC = PC[31-28] concat (IR[25-0] << 2)
JumpAddress
Reg[rs]
Reg[rt]
BranchTarget
Address
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
PC
IR
MDR
A
B
ALUOUT
H.Y. Lin, CCUEE Computer Organization 17
Multicycle Execution Steps (4)Memory Access - Read (lw)MDR = Memory[ALUOut];
Mem.Data
PC + 4
Reg[rs]
Reg[rt]
Mem.Address
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
PC
IR
MDR
A
B
ALUOUT
H.Y. Lin, CCUEE Computer Organization 18
Multicycle Execution Steps (4)Memory Access - Write (sw)Memory[ALUOut] = B;
PC + 4
Reg[rs]
Reg[rt]
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
PC
IR
MDR
A
B
ALUOUT
H.Y. Lin, CCUEE Computer Organization 19
Multicycle Execution Steps (4)ALU Instruction (R-Type)Reg[IR[15:11]] = ALUOUT
R-TypeResult
Reg[rs]
Reg[rt]
PC + 4
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
PC
IR
MDR
A
B
ALUOUT
H.Y. Lin, CCUEE Computer Organization 20
Multicycle Execution Steps (5)Memory Read Completion (lw)Reg[IR[20-16]] = MDR;
PC + 4
Reg[rs]
Reg[rt]Mem.Data
Mem.Address
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
PC
IR
MDR
A
B
ALUOUT
H.Y. Lin, CCUEE Computer Organization 21
Summary - Multicycle Execution Instructions take between 3 and 5 clock cycles
Step nameAction for R-type
instructionsAction for memory-reference
instructionsAction for branches
Action for jumps
Instruction fetch IR = Memory[PC]PC = PC + 4
Instruction A = Reg [IR[25-21]]decode/register fetch B = Reg [IR[20-16]]
ALUOut = PC + (sign-extend (IR[15-0]) << 2)
Execution, address ALUOut = A op B ALUOut = A + sign-extend if (A ==B) then PC = PC [31-28] IIcomputation, branch/ (IR[15-0]) PC = ALUOut (IR[25-0]<<2)jump completion
Memory access or R-type Reg [IR[15-11]] = Load: MDR = Memory[ALUOut]completion ALUOut or
Store: Memory [ALUOut] = B
Memory read completion Load: Reg[IR[20-16]] = MDR
(Book Fig. 5.30, p. 329)
H.Y. Lin, CCUEE Computer Organization 22
Full Multicycle Datapath
(Equivalent to Book Fig. 5.27, p. 322 without ALU control)
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
EXTND
16 32
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
IR
MDR
MUX
0123
MUX
1
0
MUX
0
1A
BALUOUT
0
1
2MUX
<<2 CONCAT28 32
MUX
0
1
ALUSrcA
jmpaddrI[25:0]
rd
MUX0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
H.Y. Lin, CCUEE Computer Organization 23
Full Multicycle Implementation
ALUControl
ControlUnit
6 6op I[31:26] funct I[5:0]
ALUOp
2
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
EXTND
16 32
ZeroRD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
IR
MDR
MUX
0123
MUX
1
0
MUX
0
1A
BALUOUT
0
1
2MUX
<<2 CONCAT28 32
MUX
0
1
ALUSrcA
jmpaddrI[25:0]
rd
MUX0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWriteCond
PCWrite
Zero
IRWrite
(Equivalent to BookFig. 5.28, p. 323)
H.Y. Lin, CCUEE Computer Organization 24
Review: Steps in Processor Design1. Analyze instruction set; get datapath requirements
2. Select datapath components andestablish clocking methodology
3. Assemble datapath that meets requirements
4. Determine control signal values for each instruction
5. Assemble control logic to generate control signals
H.Y. Lin, CCUEE Computer Organization 25
Multicycle Execution Step (1)FetchIR = Memory[PC];PC = PC + 4;
1
0
1
0
1
0X
0X
0010
1
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
EXTND
16 32
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
IR
MDR
MUX
0123
MUX
1
0
MUX
0
1A
BALUOUT
0
1
2MUX
<<2 CONCAT28 32
MUX
0
1
ALUSrcA
jmpaddrI[25:0]
rd
MUX0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
H.Y. Lin, CCUEE Computer Organization 26
Multicycle Execution Step (2)Instruction Decode and Register FetchA = Reg[IR[25-21]]; (A = Reg[rs])B = Reg[IR[20-15]]; (B = Reg[rt])ALUOut = (PC + sign-extend(IR[15-0]) << 2)
0
0X
0
0X
3
0X
X
010
0
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
EXTND
16 32
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
IR
MDR
MUX
0123
MUX
1
0
MUX
0
1A
BALUOUT
0
1
2MUX
<<2 CONCAT28 32
MUX
0
1
ALUSrcA
jmpaddrI[25:0]
rd
MUX0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
H.Y. Lin, CCUEE Computer Organization 27
0X
Multicycle Execution Steps (3)Memory Reference Instructions
ALUOut = A + sign-extend(IR[15-0]);
X
2
0
0X
0 1
X
010
0
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
EXTND
16 32
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
IR
MDR
MUX
0123
MUX
1
0
MUX
0
1A
BALUOUT
0
1
2MUX
<<2 CONCAT28 32
MUX
0
1
ALUSrcA
jmpaddrI[25:0]
rd
MUX0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
H.Y. Lin, CCUEE Computer Organization 28
Multicycle Execution Steps (3)ALU Instruction (R-Type)ALUOut = A op B
0X
X
0
0
0X
0 1
X
???
0
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
EXTND
16 32
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
IR
MDR
MUX
0123
MUX
1
0
MUX
0
1A
BALUOUT
0
1
2MUX
<<2 CONCAT28 32
MUX
0
1
ALUSrcA
jmpaddrI[25:0]
rd
MUX0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
H.Y. Lin, CCUEE Computer Organization 29
1 if Zero=1
Multicycle Execution Steps (3)Branch Instructionsif (A == B) PC = ALUOut;
0X
X
0
0
X0 1
1
011
0
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
EXTND
16 32
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
IR
MDR
MUX
0123
MUX
1
0
MUX
0
1A
BALUOUT
0
1
2MUX
<<2 CONCAT28 32
MUX
0
1
ALUSrcA
jmpaddrI[25:0]
rd
MUX0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
H.Y. Lin, CCUEE Computer Organization 30
Multicycle Execution Step (3)Jump InstructionPC = PC[21-28] concat (IR[25-0] << 2)
0X
X
X
0
1X
0 X
2
XXX
0
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
EXTND
16 32
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
IR
MDR
MUX
0123
MUX
1
0
MUX
0
1A
BALUOUT
0
1
2MUX
<<2 CONCAT28 32
MUX
0
1
ALUSrcA
jmpaddrI[25:0]
rd
MUX0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
H.Y. Lin, CCUEE Computer Organization 31
Multicycle Execution Steps (4)Memory Access - Read (lw)MDR = Memory[ALUOut];
0X
X
X
1
01
0 X
X
XXX
0
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
EXTND
16 32
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
IR
MDR
MUX
0123
MUX
1
0
MUX
0
1A
BALUOUT
0
1
2MUX
<<2 CONCAT28 32
MUX
0
1
ALUSrcA
jmpaddrI[25:0]
rd
MUX0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
H.Y. Lin, CCUEE Computer Organization 32
Multicycle Execution Steps (4)Memory Access - Write (sw)Memory[ALUOut] = B;
0X
X
X
0
01
1 X
X
XXX
0
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
EXTND
16 32
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
IR
MDR
MUX
0123
MUX
1
0
MUX
0
1A
BALUOUT
0
1
2MUX
<<2 CONCAT28 32
MUX
0
1
ALUSrcA
jmpaddrI[25:0]
rd
MUX0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
H.Y. Lin, CCUEE Computer Organization 33
10
0X
0
X
0
XXX
X
X
1
15 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
EXTND
16 32
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
IR
MDR
MUX
0123
MUX
1
0
MUX
0
1A
BALUOUT
0
1
2MUX
<<2 CONCAT28 32
MUX
0
1
ALUSrcA
jmpaddrI[25:0]
rd
MUX0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
Multicycle Execution Step (4)ALU Instruction (R-Type)Reg[IR[15:11]] = ALUOut; (Reg[Rd] = ALUOut)
H.Y. Lin, CCUEE Computer Organization 34
Multicycle Execution Steps (5)Memory Read Completion (lw)
Reg[IR[20-16]] = MDR;
1
0
0
X
0
0X
0 X
X
XXX
0
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
EXTND
16 32
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
IR
MDR
MUX
0123
MUX
1
0
MUX
0
1A
BALUOUT
0
1
2MUX
<<2 CONCAT28 32
MUX
0
1
ALUSrcA
jmpaddrI[25:0]
rd
MUX0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
H.Y. Lin, CCUEE Computer Organization 35
Review: Steps in Processor Design1. Analyze instruction set; get datapath requirements
2. Select datapath components andestablish clocking methodology
3. Assemble datapath that meets requirements
4. Determine control signal values for each instruction
5. Assemble control logic to generate control signals
H.Y. Lin, CCUEE Computer Organization 36
Full Multicycle Implementation
ALUControl
ControlUnit
6 6op I[31:26] funct I[5:0]
ALUOp
2
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
EXTND
16 32
ZeroRD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
IR
MDR
MUX
0123
MUX
1
0
MUX
0
1A
BALUOUT
0
1
2MUX
<<2 CONCAT28 32
MUX
0
1
ALUSrcA
jmpaddrI[25:0]
rd
MUX0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWriteCond
PCWrite
Zero
IRWrite
H.Y. Lin, CCUEE Computer Organization 37
Multicycle Control Unit
Review: single-cycle implementation All control signals can be determined by current instruction +
condition Implemented in combinational logic
Multicycle implementation is different Control signals depend on
current instruction which clock cycle is currently being executed
Implemented as Finite State Machine (FSM) OR Microprogrammed Implementation - “stylized FSM”
H.Y. Lin, CCUEE Computer Organization 38
Review - Finite State Machines What’s in the Finite State Machine
Combinational Logic Storage Elements (Flip-flops)
D Q
D Q
CombinationalLogic
Inputs Outputs
CurrentState
Next State
H.Y. Lin, CCUEE Computer Organization 39
Review - Finite State Machines Behavior characterized by
States - unique values of flip-flops e.g., “101” Transitions between states, depending on
current state inputs
Output values, depending on current state inputs
Describing FSMs: State Diagrams State Transition Tables
H.Y. Lin, CCUEE Computer Organization 40
Review - State Diagrams
"Bubbles" - states Arrows - transition edges labeled with condition
expressions Example: Car Alarm
arm
doorhonk
clk
fclk = 1Hz
IDLEBEEP
Honk=1
WAIT
ARM•DOOR
ARMARM
ARM’
ARM’ + ARM•DOOR’ = ARM’ + DOOR’
H.Y. Lin, CCUEE Computer Organization 41
State Transition Table
Transition List - lists edges in State DiagramPS Condition NS Output
IDLE ARM' + DOOR' IDLE 0
IDLE ARM*DOOR BEEP 0
BEEP ARM WAIT 1
BEEP ARM' IDLE 1
WAIT ARM BEEP 0
WAIT ARM' IDLE 0
IDLEBEEP
Honk=1
WAIT
ARM•DOOR
ARMARM
ARM’
ARM’ + ARM•DOOR’ = ARM’ + DOOR’
H.Y. Lin, CCUEE Computer Organization 42
State Machine Design
Traditional Approach: Create State Diagram Create State Transition Table Assign State Codes Write Excitation Equations & Minimize
Modern Approach Enter FSM Description into CAD program Synthesize implementation of FSM
H.Y. Lin, CCUEE Computer Organization 43
FSM Control Unit Implementation FSM Inputs:
Clock Instruction register op field
FSM Outputs: control signals for datapath Enable signals: Register file, Memory, PC, MDR, and IR Multiplexer signals: ALUSrcA, ALUSrcB, etc. ALUOp signal - used for ALU Control as in single-cycle
implementation
H.Y. Lin, CCUEE Computer Organization 44
Enable Control Signals Asserted (true) in a state if they appear
in the state bubble Assumed Deasserted (false) in a state
if they do not appear in the state bubble
Multiplexer/ALU Control Signals Asserted with a given value in a state
if they appear in a state bubble Assumed Don’t Care if they do not
appear in the state bubble See Figures 5.32 - 5.36
3
MemReadIorD = 0
Memoryaccess
(Op
=‘L
W’)
Enablesignal
Multiplexersignal
Statename
Transition fromprevious state(conditioanl)
Transition tonext state
(unconditioanl)
Describing Controller Function in State Diagrams - Conventions
H.Y. Lin, CCUEE Computer Organization 45
0 Instruction FetchInstruction decode /
register fetch
1
2
Memory addresscomputation
3
4
5 7
6 8 9Execution
BranchCompletion
JumpCompletion
Memoryaccess
Memoryaccess
R-type completion
Writeback step
Start
MemReadALUSrcA = 0
IorD = 0IRWrite
ALUSrcB = 01ALUOp = 00
PCWritePCSource = 00
(OP = ‘JMP’)
ALUSrcA = 0ALUSrcB = 11ALUOp = 00
ALUSrcA = 1ALUSrcB = 10ALUOp = 00
(OP = ‘LW’)
(OP = (‘SW’)
MemReadIorD = 1
RegWriteMemToReg=1
RegDst = 0
MemWriteIorD = 1
ALUSrcA = 1ALUSrcB = 00ALUOp = 10
RegDst = 1RegWrite
MemtoReg = 0
ALUSrcA = 1ALUSrcB = 00ALUOp = 01
PCWriteCondPCSource = 01
PCWritePCSource = 10
Multicycle Control - Full State Diagram
H.Y. Lin, CCUEE Computer Organization 46
State 8Branch
State
(OP
= ‘B
EQ
’)
State 6Execution
States
(OP = ‘R
-Type’)
State 2Mem. Ref
States
(Op = ‘LW’ or Op = ‘S
W’)
MemReadALUSrcA = 0
IorD = 0IRWrite
ALUSrcB = 01ALUOp = 00
PCWritePCSource = 00
State 0
Instruction Fetch
Start
State 1
Instruction Decode /Register Fetch
State 9JumpState
(OP
= ‘
J’)
ALUSrcA = 0ALUSrcB = 11ALUOp = 00
Control FSM - Instruction Fetch / Decode States
H.Y. Lin, CCUEE Computer Organization 47
(OP = ‘LW’)
State 3 Memory Access
State 4 Write-back step
State 5 Memory Access
(OP = ‘SW’)
Memory Address ComputationState 2
from State 1(OP = ‘LW’ OR OP = ‘SW’)
to State 0
ALUSrcA = 1ALUSrcB = 10ALUOp = 00
MemReadIorD = 1
MemWriteIorD = 1
RegWriteMemToReg=1
RegDst = 0
Control FSM - Memory Reference States
H.Y. Lin, CCUEE Computer Organization 48
Control FSM - R-Type States
State 7 R-type completion
to State 0
State 6
from State 1(OP = R-Type)
ExecutionALUSrcA = 1
ALUSrcB = 00ALUOp = 10
RegDst = 1RegWrite
MemtoReg = 0
H.Y. Lin, CCUEE Computer Organization 49
Control FSM - Branch State
State 8 Branch Completion
from State 1(OP = ‘BEQ’)
to State 0
ALUSrcA = 1ALUSrcB = 00ALUOp = 01
PCWriteCondPCSource = 01
H.Y. Lin, CCUEE Computer Organization 50
Control FSM - Jump State
State 9 Jump Completion
from State 1(OP = ‘J’)
to State 0
PCWritePCSource = 10
H.Y. Lin, CCUEE Computer Organization 51
Controller Implementation
Typical Implementation: Figure 5-37, p. 338 Variations
Random logic PLA ROM
address lines = inputs data lines = outputs contents = “truth table”
Datapathcontroloutputs
Inputs fromInstr. Reg(opcode)
CombinationalControlLogic
State
NextState
H.Y. Lin, CCUEE Computer Organization 52
What is the CPI of the Multicycle Implementation? Using measured instruction mix from SPECINT2000
lw 5 cycles25%sw 4 cycles10%R-type 4 cycles52%branch 3 cycles11%jump 3 cycles2%
What is the CPI? CPI = (5 cycles * 0.25) + (4 cycles * 0.10) + (4 cycles * 0.53) +
(3 cycles * 0.11) + (3 cycles * 0.02) CPI = 4.12 cycles per instruction
Performance of a Multicycle Implementation
H.Y. Lin, CCUEE Computer Organization 53
Performance Continued
Assuming a 200ps clock, what is average execution time/instruction? Sec/Instr = 4.12 CPI * 200ps/cycle) = 824ps/instr
How does this compare to the Single-Cycle Case? Sec/Instr = 1 CPI * 600ps/cycle = 600ps/instr Single-Cycle is 1.38 times faster than Multicycle
Why is Single-Cycle faster than Multicycle? Branch & jump are the same speed (600ps vs 600ps) R-type & store are faster (600ps vs 800ps) Load word is faster (600ps vs 1000ps)
H.Y. Lin, CCUEE Computer Organization 54
Multicycle Example Problem
Extend the design to implement the “jr” (jump register) instruction:
jr rs PC = Reg[rs] Format:
Steps:1. Review instruction requirements (register transfer)
2. Modify datapath
3. Modify control logic
0 rs 0 0 80
6 bits 5 bits 5 bits 5 bits 5 bits 6 bits
H.Y. Lin, CCUEE Computer Organization 55
Reg[rs]
Example Problem: Datapath
What needs to be changed?
ALUControl
ControlUnit
6 6op I[31:26] funct I[5:0]
ALUOp
2
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
EXTND
16 32
ZeroRD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
IR
MDR
MUX
0123
MUX
1
0
MUX
0
1A
BALUOUT
0
1
2MUX
<<2 CONCAT28 32
MUX
0
1
ALUSrcA
jmpaddrI[25:0]
rd
MUX0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWriteCond
PCWrite
Zero
IRWrite
32
1
0
H.Y. Lin, CCUEE Computer Organization 56
Example Problem: Control
What needs to be changed?
PCWritePCSource = 11
(OP = ‘JR‘)
0 Instruction FetchInstruction decode /
register fetch
1
2
Memory addresscomputation
3
4
5 7
6 8 9Execution
BranchCompletion
JumpCompletion
Memoryaccess
Memoryaccess
R-type completion
Writeback step
Start
MemReadALUSrcA = 0
IorD = 0IRWrite
ALUSrcB = 01ALUOp = 00
PCWritePCSource = 00
(OP = ‘JMP’)
ALUSrcA = 0ALUSrcB = 11ALUOp = 00
ALUSrcA = 1ALUSrcB = 10ALUOp = 00
(OP = ‘LW’)
(OP = (‘SW’)
MemReadIorD = 1
RegWriteMemToReg=1
RegDst = 0
MemWriteIorD = 1
ALUSrcA = 1ALUSrcB = 00ALUOp = 10
RegDst = 1RegWrite
MemtoReg = 0
ALUSrcA = 1ALUSrcB = 00ALUOp = 01
PCWriteCondPCSource = 01
PCWritePCSource = 10
H.Y. Lin, CCUEE Computer Organization 57
Exceptions - “Stuff Happens”
Definition: "unexpected change in control flow" Used to handle runtime errors
Overflow Undefined Instruction Hardware malfunction
Used to handle external events, "service" functions Interrupts - external I/O Device request Page fault - virtual memory System call - user request for OS action
H.Y. Lin, CCUEE Computer Organization 58
Example: Undefined Instructions
0 Instruction FetchInstruction decode /
register fetch
1
2
Memory addresscomputation
3
4
5 7
6 8 9Execution
BranchCompletion
JumpCompletion
Memoryaccess
Memoryaccess R-type completion
Writeback step
Start
MemReadALUSrcA = 0
IorD = 0IRWrite
ALUSrcB = 01ALUOp = 00
PCWritePCSource = 00
(OP = ‘BEQ’)
ALUSrcA = 0ALUSrcB = 11ALUOp = 00
ALUSrcA = 1ALUSrcB = 10ALUOp = 00
(OP = ‘LW’)
(OP = (‘SW’)
MemReadIorD = 1
RegWriteMemToReg=1
RegDst = 0
MemWriteIorD = 1
ALUSrcA = 1ALUSrcB = 00ALUOp = 10
RegDst = 1RegWrite
MemtoReg = 0
ALUSrcA = 1ALUSrcB = 00ALUOp = 01
PCWriteCondPCSource = 01
PCWritePCSource = 10
(OP ≠ R-TypeOP≠LW,OP–W,
OP≠BEQ)
What happens here ????
H.Y. Lin, CCUEE Computer Organization 59
What Happens During an Exception Save user state – register values, etc. Take action to handle exception Restore user state and continue execution if possible (e.g., emulate
undefined instr.)
Exception:undefined instruction
Return fromexception
user program
add.s f0,f1,f2
srl r1,r2,2
beq r0,r1,L
sub r5,r3,r2
add r5,r4,r3
bne r4,r3,L2
add r3,r1,r2
Exception Handler(System)
rfe
H.Y. Lin, CCUEE Computer Organization 60
Two exceptions (for now): Undefined instruction Arithmetic overflow
Add registers to architecture to save state EPC - Exception Program Counter (32 bits) Cause - records cause of exception (32 bits)
Undefined instruction: Cause <- 0 Arithmetic overflow: Cause <- 1
Alternatives used by other architectures Save PC on stack Communicate exception type using Exception Vector
Adding Exceptions to the Multicycle Processor
H.Y. Lin, CCUEE Computer Organization 61
Implementing Exceptions
Add datapath components - Figure 5.39 ExceptionPC (EPC) - stores PC of offending instruction Cause Register - records the cause of the exception
Modify control - Figures 5.40 Undefined Instruction - state 1 (Instruction decode) Overflow - state 7 (R-type completion)
H.Y. Lin, CCUEE Computer Organization 62
Implementing Exceptions
Datapath modifications Calculate address of offending instruction (PC-4)
and store in EPC Store 0 or 1 in Cause Add overflow output from ALU (described in Ch. 4) Assign "8000180hex" to PC - fixed location of handler
Control modifications Undefined instruction: add ”default" branch to
Instruction Fetch state Overflow: test after execution state for R-type instructions
H.Y. Lin, CCUEE Computer Organization 63
5 5
RD1
RD2
RN1 RN2 WN
WD
RegWrite
Registers
Operation
ALU
3
EXTND
16 32
Zero
RD
WDMemRead
MemoryADDR
MemWrite
5
Instruction I
32
ALUSrcB
<<2
PC
4
RegDst
5
IR
MDR
MUX
0123
MUX
1
0
MUX
0
1A
BALUOUT
01
2 MUX
<<2 CONCAT28 32
MUX
0
1
ALUSrcA
jmpaddrI[25:0]
rd
MUX0 1
rtrs
immediate
PCSource
MemtoReg
IorD
PCWr*
IRWrite
EPC
EPCWrite
CAUSE
CauseWrite
MUX
0
1
IntCause.
1
0
3
8000180 hex
Overflow
Adding Exception Support to Datapath
H.Y. Lin, CCUEE Computer Organization 64
IntCause = 0CauseWrite
ALUSrcA = 0ALUSrcB = 01ALUOp = 01EPCWRITE
PCWritePCSource=11
State 10
UndefinedInstruction
State 0
ALUSrcA = 0ALUSrcB = 11ALUOp = 00
State 1
Instruction Decode /Register Fetch
State 9JumpState
State 8Branch
State
State 6Execution
States
State 2Mem. Ref
States
(OP
= ‘
J’)
(OP
= ‘B
EQ
’)
(OP = ‘R
-Type’)
(Op = ‘LW’ or Op = ‘S
W’)
from State 0
(OP = Other)
Adding Exception Support to Control - Undefined Instruction
H.Y. Lin, CCUEE Computer Organization 65
IntCause = 1CauseWrite
ALUSrcA = 0ALUSrcB = 01ALUOp = 01EPCWRITE
PCWritePCSource=11
State 10 Overflow
State 0
Overflow
Note: requires storageof overflow condition from
previous state
ALUSrcA = 1ALUSrcB = 00ALUOp = 10
State 6
RegDst = 1RegWrite
MemtoReg = 0
State 7 R-type completion
from State 1(OP = R-Type)
to State 0
Execution
No Overflow
Adding Exception Support to Control - Arithmetic Overflow
H.Y. Lin, CCUEE Computer Organization 66
What Makes Exceptions Hard The “real” MIPS architecture requires that instruction
causing exception must have "no effect" Implication: undo effects of instruction
decrement PC prevent storage of result during R-type instruction
What about “recursive” exceptions? Must add instructions to save exception registers on stack Disable exceptions / interrupts until registers saved More details in Appendix A (A.7)
H.Y. Lin, CCUEE Computer Organization 67
Summary - Exceptions
Must consider as part of overall design Must find "convenient" places to detect exceptions Must find way to cleanly return from exception Must keep control “small and fast” Much harder in pipelined implementations!