![Page 1: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/1.jpg)
CSE 141, S2'06 Jeff Brown
Multi-Cycle CPU:Datapath and Control
![Page 2: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/2.jpg)
CSE 141, S2'06 Jeff Brown
Why a Multiple Clock Cycle CPU?
• the problem => single-cycle cpu has a cycle time long enough to complete the longest instruction in the machine
• the solution => break up execution into smaller tasks, each task taking a cycle, different instructions requiring different numbers of cycles or tasks
• other advantages => reuse of functional units (e.g., alu, memory)
• ET = IC * CPI * CT
![Page 3: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/3.jpg)
CSE 141, S2'06 Jeff Brown
High-level View
![Page 4: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/4.jpg)
CSE 141, S2'06 Jeff Brown
Breaking Execution Into Clock Cycles
• We will have five execution steps (not all instructions use all five)– fetch– decode & register fetch– execute– memory access– write-back
• We will use Register-Transfer-Language (RTL) to describe these steps
![Page 5: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/5.jpg)
CSE 141, S2'06 Jeff Brown
Breaking Execution Into Clock Cycles
• Introduces extra registers when:– signal is computed in one clock cycle and used in another, AND– the inputs to the functional block that outputs this signal can
change before the signal is written into a state element.
• Significantly complicates control. Why?• The goal is to balance the amount of work done each cycle.
![Page 6: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/6.jpg)
CSE 141, S2'06 Jeff Brown
Multicycle datapath
![Page 7: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/7.jpg)
CSE 141, S2'06 Jeff Brown
1. Fetch
IR = Mem[PC]PC = PC + 4
(may not be final value of PC)
![Page 8: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/8.jpg)
CSE 141, S2'06 Jeff Brown
2. Instruction Decode and Register Fetch
• compute target before we know if it will be used (may not be branch, branch may not be taken)
• target is a new state element (temp register)• everything up to this point must be Instruction-
independent, because we still haven’t decoded the instruction.
• everything instruction (opcode)-dependent from here on.
A = Reg[IR[25-21]]B = Reg[IR[20-16]]ALUOut = PC + (sign-extend (IR[15-0]) << 2)
![Page 9: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/9.jpg)
CSE 141, S2'06 Jeff Brown
3. Execution, memory address computation, or branch completion
• Memory reference (load or store)ALUOut = A + sign-extend(IR[15-0])
• R-type ALUout = A op B
• Branchif (A == B) PC = ALUOut
At this point, Branch is complete, and we start over; others require more cycles.
![Page 10: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/10.jpg)
CSE 141, S2'06 Jeff Brown
4. Memory access or R-type completion
• Memory reference– load
MDR = Mem[ALUout]
– storeMem[ALUout] = B
• R-typeReg[IR[15-11]] = ALUout
R-type is complete
![Page 11: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/11.jpg)
CSE 141, S2'06 Jeff Brown
5. Memory Write-Back
Reg[IR[20-16]] = MDR
memory instruction is complete
![Page 12: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/12.jpg)
CSE 141, S2'06 Jeff Brown
Step R-type Memory BranchInstruction Fetch IR = Mem[PC]
PC = PC + 4Instruction Decode/register fetch
A = Reg[IR[25-21]]B = Reg[IR[20-16]]
ALUout = PC + (sign-extend(IR[15-0]) << 2)Execution, addresscomputation, branchcompletion
ALUout = A op B ALUout = A +sign-
extend(IR[15-0])
if (A==B) then PC=ALUout
Memory access or R-type completion
Reg[IR[15-11]] =ALUout
memory-data =Mem[ALUout]
orMem[ALUout]=
BWrite-back Reg[IR[20-16]] =
memory-data
Summary of execution steps
![Page 13: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/13.jpg)
Complete Multicycle Datapath
(support for what instruction just got added?)
![Page 14: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/14.jpg)
1. Instruction Fetch
IR = Memory[PC]PC = PC + 4
![Page 15: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/15.jpg)
2. Instruction Decode and Reg Fetch
A = Register[IR[25-21]]B = Register[IR[20-16]]ALUOut = PC + (sign-extend (IR[15-0]) << 2)
![Page 16: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/16.jpg)
3. Execution (R-type)
ALUout = A op B
![Page 17: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/17.jpg)
4. R-type Completion
Reg[IR[15-11]] = ALUout
![Page 18: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/18.jpg)
3. Branch Completion
if (A == B) PC = ALUOut
![Page 19: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/19.jpg)
3. Memory Address Computation
ALUout = A + sign-extend(IR[15-0])
![Page 20: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/20.jpg)
4. Memory Access
memory-data = Memory[ALUout], orMemory[ALUout] = B
![Page 21: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/21.jpg)
5. Write-back
Reg[IR[20-16]] = memory-data
![Page 22: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/22.jpg)
3. JMP Completion
PC = PC[31-28] | (IR[25-0] <<2)
![Page 23: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/23.jpg)
CSE 141, S2'06 Jeff Brown
Multicycle Control
• Single-cycle control used combinational logic• Multi-cycle control uses ??• FSM defines a succession of states, transitions between
states (based on inputs), and outputs (based on state)• First two states same for every instruction, next state
depends on opcode
![Page 24: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/24.jpg)
CSE 141, S2'06 Jeff Brown
Multicycle Control FSM
Instruction fetch
Decode and Register Fetch
Memoryinstructions
R-typeinstructions
Branchinstructions
Jumpinstruction
start
![Page 25: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/25.jpg)
CSE 141, S2'06 Jeff Brown
First two states of the FSM
MemReadALUSrcA = 0
IorD = 0IRWrite
ALUSrcB = 01ALUOp = 00
PCWritePCSource = 00
?
Memory InstFSM
R-type InstFSM
Branch InstFSM
Jump InstFSM
Instruction Fetch, state 0 Instruction Decode/ Register Fetch, state 1
Opcode = LW or SW
Opcode =
R-type
Opc
ode
= BE
Q
Opc
ode
= JM
P
Start
![Page 26: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/26.jpg)
Instruction Decode and Reg Fetch
A = Register[IR[25-21]]B = Register[IR[20-16]]Target = PC + (sign-extend (IR[15-0]) << 2)
![Page 27: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/27.jpg)
CSE 141, S2'06 Jeff Brown
R-type Instructions
ALUSrcA = 1ALUSrcB = 00ALUOp = 10
from state 1
?
To state 0
Execution
Completion
![Page 28: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/28.jpg)
4. R-type Completion
Reg[IR[15-11]] = ALUout
![Page 29: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/29.jpg)
CSE 141, S2'06 Jeff Brown
BEQ Instruction
ALUSrcA = 1ALUSrcB = 00ALUOp = 01PCWriteCondPCSource = 01
from state 1
To state 0
![Page 30: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/30.jpg)
CSE 141, S2'06 Jeff Brown
Memory Instructions
?
from state 1
MemWriteIorD = 1
MemReadIorD = 1
MemReadMemtoReg = 1
RegDst = 0
To state 0
MemoryAccess
write-back
AddressComputation
![Page 31: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/31.jpg)
3. Memory Address Computation
ALUout = A + sign-extend(IR[15-0])
![Page 32: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/32.jpg)
CSE 141, S2'06 Jeff Brown
JMP Instruction
PCWritePCSource = 10
from state 1
To state 0
![Page 33: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/33.jpg)
CSE 141, S2'06 Jeff Brown
The Whole FSM
![Page 34: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/34.jpg)
CSE 141, S2'06 Jeff Brown
• How many cycles will it take to execute this code?
lw $t2, 0($t3)lw $t3, 4($t3)beq $t2, $t3, Label #assume not takenadd $t5, $t2, $t3sw $t5, 8($t3)
Label: ...
• What is going on during the 8th cycle of execution?• In what cycle does the actual addition of $t2 and $t3 take place?
• Assume 20% loads, 10% stores, 50% R-type, 20% branches, what is the CPI?
Some Questions
![Page 35: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/35.jpg)
CSE 141, S2'06 Jeff Brown
• Implementation:
Finite State Machine for Control
![Page 36: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/36.jpg)
CSE 141, S2'06 Jeff Brown
• ROM = "Read Only Memory"– values of memory locations are fixed ahead of time
• A ROM can be used to implement a truth table– if the address is m-bits, we can address 2m entries in the ROM.– our outputs are the bits of data that the address points to.
2m is the "height", and n is the "width"
ROM Implementation
m n
0 0 0 0 0 1 10 0 1 1 1 0 00 1 0 1 1 0 00 1 1 1 0 0 0 1 0 0 0 0 0 0 1 0 1 0 0 0 11 1 0 0 1 1 01 1 1 0 1 1 1
![Page 37: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/37.jpg)
CSE 141, S2'06 Jeff Brown
• How many inputs are there?6 bits for opcode, 4 bits for state = 10 address lines(i.e., 210 = 1024 different addresses)
• How many outputs are there?16 datapath-control outputs, 4 state bits = 20 outputs
• ROM is 210 x 20 = 20K bits (and a rather unusual size)
• Rather wasteful, since for lots of the entries, the outputs are the same
— i.e., opcode is often ignored
ROM Implementation
![Page 38: Multi-Cycle CPU: Datapath and Control€¦ · Multi-Cycle CPU: Datapath and Control. CSE 141, S2'06 Jeff Brown Why a Multiple Clock Cycle CPU? • the problem => single-cycle cpu](https://reader033.vdocuments.us/reader033/viewer/2022050507/5f98add8109977332b2548c4/html5/thumbnails/38.jpg)
CSE 141, S2'06 Jeff Brown
Multicycle CPU Key Points
• Performance gain achieved from variable-length instructions
• ET = IC * CPI * cycle time• Required very few new state elements• More, and more complex, control signals• Control requires FSM