introduction to computer organization and architecture lecture 11 by juthawut chantharamalee ...
![Page 1: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/1.jpg)
Introduction to Computer Organization and Architecture
Lecture 11
By Juthawut Chantharamalee
http://dusithost.dusit.ac.th/~juthawut_cha/home.htm
![Page 2: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/2.jpg)
Outline
- Building a CPU
- Basic Components
- MIPS Instructions (Microprocessor without Interlocked Pipeline Stages)
- Basic 5 Steps for CPU
- Single-Cycle Design
- Multi-cycle Design
- Comparison of Single and Multi-cycle Designs
![Page 3: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/3.jpg)
Overview
- Brief look
  - Digital logic
- CPU Datapath
  - MIPS Example
![Page 4: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/4.jpg)
Digital Logic
[Figure: three building blocks. (1) D-type flip-flop: input D, output Q, edge-triggered clock. (2) Multiplexer: data inputs A and B (labelled 0 and 1), select input S, output F. (3) D-type flip-flop with enable: input D, output Q, enable input EN, edge-triggered clock; drawn as a multiplexer (inputs 0 and 1) in front of a plain D-type flip-flop.]
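The flip-flop with enable can be read as a multiplexer feeding a plain flip-flop. A minimal Python sketch of that idea follows. The names `mux` and `DFlipFlopEnable` are illustrative, and the wiring (old Q on the multiplexer's 0 input, D on its 1 input) is an assumption about the diagram rather than something the slide states.

```python
def mux(sel, a, b):
    """2-to-1 multiplexer: output a when sel is 0, b when sel is 1."""
    return b if sel else a


class DFlipFlopEnable:
    """D-type flip-flop with enable, modelled as mux + plain flip-flop."""

    def __init__(self, q=0):
        self.q = q  # stored bit, visible on output Q

    def clock_edge(self, d, en):
        # Assumed wiring: EN=0 recirculates the old Q, EN=1 loads D.
        self.q = mux(en, self.q, d)
        return self.q


ff = DFlipFlopEnable()
ff.clock_edge(d=1, en=0)  # enable low: Q keeps its old value
ff.clock_edge(d=1, en=1)  # enable high: Q takes D
```

An N-bit register is then N of these sharing one EN and one clock.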
![Page 5: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/5.jpg)
Digital Logic
Registers
[Figure: registers built from D-type flip-flops with enable. 1 bit: one flip-flop (D, Q, EN, edge-triggered clock). 4 bits: four flip-flops (D3..D0, Q3..Q0) sharing EN and the clock. N bits: a single block with D, Q, EN, and the edge-triggered clock.]
![Page 6: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/6.jpg)
Digital Logic
Tri-state Driver (Buffer)
[Figure: buffer with input "in", output "out", and control input "drive".]

| In | Drive | Out |
|----|-------|-----|
| 0 | 0 | Z |
| 1 | 0 | Z |
| 0 | 1 | 0 |
| 1 | 1 | 1 |

What is Z?
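Z conventionally denotes the high-impedance state: with drive = 0 the buffer is effectively disconnected and leaves the wire undriven, so another driver can use it. The sketch below models the truth table in Python. The `bus` helper and the string `"Z"` are illustrative choices, not part of the slides.

```python
Z = "Z"  # marker for high impedance (output not driven)


def tristate(inp, drive):
    """Tri-state driver: pass the input through only when drive is 1."""
    return inp if drive else Z


def bus(*outputs):
    """Resolve a shared wire: at most one distinct driven value may appear."""
    driven = [v for v in outputs if v != Z]
    if not driven:
        return Z  # nobody is driving the wire
    if len(set(driven)) > 1:
        raise ValueError("bus conflict: drivers disagree")
    return driven[0]


# Two drivers share one wire; only the second one is enabled.
value = bus(tristate(1, drive=0), tristate(0, drive=1))
```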
![Page 7: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/7.jpg)
Digital Logic
Adder/Subtractor or ALU
[Figure: block with inputs A and B, output F, Carry-in and Carry-out, and a control input labelled Add/sub (or ALUop).]
![Page 8: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/8.jpg)
Overview
- Brief look
  - Digital logic
- How to Design a CPU Datapath
  - MIPS Example
![Page 9: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/9.jpg)
Designing a CPU: 5 Steps
1. Analyze the instruction set → datapath requirements
   - MIPS: ADD, SUB, ORI, LW, SW, BR
   - Meaning of each instruction given by RTL (register transfers)
   - 2 types of registers: CPU/ISA registers, temporary registers
2. Datapath requirements → select the datapath components
   - ALU, register file, adder, data memory, etc.
3. Assemble the datapath
   - Datapath must support planned register transfers
   - Ensure all instructions are supported
4. Analyze datapath control required for each instruction
5. Assemble the control logic
![Page 10: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/10.jpg)
Step 1a: Analyze ISA
- All MIPS instructions are 32 bits long.
- Three instruction formats: R-type, I-type, J-type
  - R: registers, I: immediate, J: jumps
- These formats were intentionally chosen to simplify design

R-type: op [31:26] (6 bits) | rs [25:21] (5 bits) | rt [20:16] (5 bits) | rd [15:11] (5 bits) | shamt [10:6] (5 bits) | funct [5:0] (6 bits)
I-type: op [31:26] (6 bits) | rs [25:21] (5 bits) | rt [20:16] (5 bits) | immediate [15:0] (16 bits)
J-type: op [31:26] (6 bits) | target address [25:0] (26 bits)
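Because every format places its fields at fixed bit positions, decoding is just shifting and masking. A small Python sketch follows. The function name `decode` and the dictionary layout are illustrative, and it extracts the fields of all three formats at once, leaving it to the caller to pick the ones that apply.

```python
def decode(instr):
    """Split a 32-bit MIPS instruction word into its named fields."""
    return {
        "op": (instr >> 26) & 0x3F,      # bits [31:26], all formats
        "rs": (instr >> 21) & 0x1F,      # bits [25:21], R- and I-type
        "rt": (instr >> 16) & 0x1F,      # bits [20:16], R- and I-type
        "rd": (instr >> 11) & 0x1F,      # bits [15:11], R-type
        "shamt": (instr >> 6) & 0x1F,    # bits [10:6],  R-type
        "funct": instr & 0x3F,           # bits [5:0],   R-type
        "immediate": instr & 0xFFFF,     # bits [15:0],  I-type
        "target": instr & 0x03FFFFFF,    # bits [25:0],  J-type
    }


# 0x012A4021 encodes addu with rd=8, rs=9, rt=10.
fields = decode(0x012A4021)
```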
![Page 11: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/11.jpg)
Step 1b: Analyze ISA
Meaning of the fields:
- op: operation of the instruction
- rs, rt, rd: the source and destination register specifiers
  - Destination is either rd (R-type), or rt (I-type)
- shamt: shift amount
- funct: selects the variant of the operation in the “op” field
- immediate: address offset or immediate value
- target address: target address of the jump instruction

R-type: op [31:26] (6 bits) | rs [25:21] (5 bits) | rt [20:16] (5 bits) | rd [15:11] (5 bits) | shamt [10:6] (5 bits) | funct [5:0] (6 bits)
I-type: op [31:26] (6 bits) | rs [25:21] (5 bits) | rt [20:16] (5 bits) | immediate [15:0] (16 bits)
J-type: op [31:26] (6 bits) | target address [25:0] (26 bits)
![Page 12: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/12.jpg)
MIPS ISA: subset for today
- ADD and SUB
  - addU rd, rs, rt
  - subU rd, rs, rt
  - Format: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
- OR Immediate
  - ori rt, rs, imm16
  - Format: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)
- LOAD and STORE Word
  - lw rt, rs, imm16
  - sw rt, rs, imm16
  - Format: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)
- BRANCH
  - beq rs, rt, imm16
  - Format: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)
![Page 13: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/13.jpg)
Step 2: Datapath Requirements
REGISTER FILE
- MIPS ISA requires 32 registers, 32b each
  - Called a register file
  - Contains 32 entries
  - Each entry is 32b
- AddU rd,rs,rt or SubU rd,rs,rt
  - Read two sources: rs, rt
  - Operation: rs + rt or rs – rt
  - Write destination: rd ← rs +/- rt
- Requirements
  - Read two registers (rs, rt)
  - Perform ALU operation
  - Write a third register (rd)
[Figure: REGFILE block with register-number inputs RdReg1, RdReg2, WrReg (5 bits each), data input WrData, data outputs RdData1 and RdData2, and control input RegWrite, annotated "How to implement?"; ALU block with control input ALUop and outputs Result and Zero?.]
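The requirements above (two read ports, one write port gated by RegWrite, and an ALU with a Zero output) can be sketched as a behavioural model in Python. This is only an illustration: the class and method names are made up, the ALUop values `"add"` and `"sub"` are placeholder encodings, and the write is modelled as happening on an explicit `clock_edge` call.

```python
MASK32 = 0xFFFFFFFF


class RegFile:
    """32 entries of 32 bits: two read ports, one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, rd_reg1, rd_reg2):
        # Two independent read ports; register numbers are 5 bits each.
        return self.regs[rd_reg1 & 0x1F], self.regs[rd_reg2 & 0x1F]

    def clock_edge(self, wr_reg, wr_data, reg_write):
        # The single write port only takes effect when RegWrite is asserted.
        if reg_write:
            self.regs[wr_reg & 0x1F] = wr_data & MASK32


def alu(alu_op, a, b):
    """Return (Result, Zero?) for the hypothetical ALUop values 'add'/'sub'."""
    result = (a + b if alu_op == "add" else a - b) & MASK32
    return result, result == 0


# addu rd=8, rs=9, rt=10 as register transfers
rf = RegFile()
rf.clock_edge(9, 5, reg_write=1)
rf.clock_edge(10, 7, reg_write=1)
a, b = rf.read(9, 10)
result, zero = alu("add", a, b)
rf.clock_edge(8, result, reg_write=1)
```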
![Page 14: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/14.jpg)
Step 3: Datapath Assembly
- ADDU rd, rs, rt
- SUBU rd, rs, rt
- Need an ALU
  - Hook it up to REGISTER FILE
  - REGFILE has 2 read ports (rs, rt), 1 write port (rd)
- Parameters come from instruction fields: rs, rt, rd
- Control signals depend upon instruction fields
  - E.g.: ALUop = f(Instruction) = f(op, funct)
[Figure: rs, rt, rd drive RdReg1, RdReg2, WrReg of REGFILE; RdData1 and RdData2 feed the ALU; the ALU Result returns to WrData; control signals RegWrite and ALUop; the ALU also outputs Zero?.]
![Page 15: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/15.jpg)
Steps 2 and 3: ORI Instruction
- ORI rt, rs, Imm16
  - Need new ALUop for ‘OR’ function, hook up to REGFILE
  - 1 read port (rs), 1 write port (rt), 1 const value (Imm16)
- Control signals depend upon instruction fields
  - E.g.: ALUsrc = f(Instruction) = f(op, funct)
[Figure: previous datapath with rs on RdReg1 and rt (replacing the crossed-out rd) on WrReg; the 16-bit Imm16 passes through ZERO-EXTEND into a multiplexer (inputs 0 and 1) selected by ALUsrc, which picks the second ALU operand; control signals RegWrite, ALUop, ALUsrc; ALU outputs Result and Zero?.]
![Page 16: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/16.jpg)
Steps 2 and 3: Destination Register
- Must select proper destination, rd or rt
  - Depends on Instruction Type
    - R-type may write rd
    - I-type may write rt
[Figure: previous datapath with a multiplexer (inputs 1 and 0) selected by RegDst, choosing rd or rt as WrReg; rs and rt drive RdReg1 and RdReg2; 16-bit Imm16 through ZERO-EXTEND to the ALUsrc multiplexer; control signals RegDst, RegWrite, ALUsrc, ALUop; ALU outputs Result and Zero?.]
![Page 17: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/17.jpg)
Steps 2 and 3: Load Word
- LW rt, rs, Imm16
  - Need Data Memory: data ← Mem[Addr]
    - Addr is rs+Imm16, Imm16 is signed, use ALU for +
  - Store in rt: rt ← Mem[rs+Imm16]
[Figure: previous datapath plus DATAMEM (Addr, RdData) after the ALU; the extender becomes SIGN/ZERO-EXTEND, controlled by ExtOp; a multiplexer (inputs 0 and 1) selected by MemtoReg chooses the value sent to WrData; other control signals RegDst, RegWrite, ALUsrc, ALUop.]
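The only difference between ORI and LW on the immediate path is how the 16-bit value is widened to 32 bits. The Python sketch below shows both cases. The polarity (ExtOp = 1 means sign-extend) is an assumption made for illustration, since the slides do not give the encoding.

```python
MASK32 = 0xFFFFFFFF


def extend(imm16, ext_op):
    """Widen Imm16 to 32 bits: sign-extend if ext_op is 1, else zero-extend."""
    imm16 &= 0xFFFF
    if ext_op and (imm16 & 0x8000):
        return imm16 | 0xFFFF0000  # copy the sign bit into bits [31:16]
    return imm16


# lw with base R[rs] = 0x1000 and Imm16 = 0xFFFC (that is, -4)
addr = (0x1000 + extend(0xFFFC, ext_op=1)) & MASK32

# ori with the same bit pattern treats it as the unsigned value 0xFFFC
ori_result = 0x1000 | extend(0xFFFC, ext_op=0)
```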
![Page 18: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/18.jpg)
Steps 2 and 3: Store Word
- SW rt, rs, Imm16
  - Need Data Memory: Mem[Addr] ← data
    - Addr is rs+Imm16, Imm16 is signed, use ALU for +
  - Store in Mem: Mem[rs+Imm16] ← rt
[Figure: load-word datapath with a WrData input added to DATAMEM and a MemWrite control signal; other blocks and control signals unchanged (REGFILE, SIGN/ZERO-EXTEND with ExtOp, ALU with ALUop, multiplexers for RegDst, ALUsrc, MemtoReg, and RegWrite).]
![Page 19: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/19.jpg)
Writes: Need to Control Timing
- Problem: write to data memory
  - Data can come anytime
  - Addr must come first
  - MemWrite must come after Addr
    - Else? Writes to wrong Addr!
- Solution: use ideal data memory
  - Assume everything works ok
  - How to fix this for real?
    - One solution: synchronous memory
    - Another solution: delay MemWr to come late
- Problems?: write to register file
  - Does RegWrite signal come after WrReg number?
  - When does the write to a register happen?
  - Read from same register as being written?
![Page 20: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/20.jpg)
Missing Pieces: Instruction Fetching
- Where does the Instruction come from?
  - From instruction memory, of course!
  - Recall: stored-program concept
  - Alternatives? How about hard-coding wires and switches…? This is how ENIAC (Electronic Numerical Integrator and Computer) was programmed!
- How to branch?
  - BEQ rs, rt, Imm16
![Page 21: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/21.jpg)
Instruction Processing
- Fetch instruction
- Execute instruction
- Fetch next instruction
- Execute next instruction
- Fetch next instruction
- Execute next instruction
- Etc…
- How to maintain sequence? Use a counter!
- Branches (out of sequence)? Load the counter!
![Page 22: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/22.jpg)
Instruction Processing
- Program Counter
  - Points to current instruction
  - Address to instruction memory
    - Instr ← InstrMem[PC]
- Next instruction: counts up by 4
  - Remember: memory is byte-addressable, instructions are 4 bytes
  - PC ← PC + 4
- Branch instruction: replace PC contents
![Page 23: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/23.jpg)
Step 1: Analyze Instructions
Register Transfer Language…

Instruction fetch:
- op | rs | rt | rd | shamt | funct = InstrMem[ PC ]
- op | rs | rt | Imm16 = InstrMem[ PC ]

Instr: Register Transfers
- ADDU: R[rd] ← R[rs] + R[rt]; PC ← PC + 4
- SUBU: R[rd] ← R[rs] – R[rt]; PC ← PC + 4
- ORI: R[rt] ← R[rs] OR zero_ext(Imm16); PC ← PC + 4
- LOAD: R[rt] ← MEM[ R[rs] + sign_ext(Imm16) ]; PC ← PC + 4
- STORE: MEM[ R[rs] + sign_ext(Imm16) ] ← R[rt]; PC ← PC + 4
- BEQ: if ( R[rs] == R[rt] ) then PC ← PC + 4 + { sign_ext(Imm16) || b’00’ } else PC ← PC + 4
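The register transfers for the six instructions can be written almost line for line as a tiny interpreter. This Python sketch is a behavioural illustration only, not the datapath. The tuple form `(name, rs, rt, x)`, where `x` is rd for R-type and Imm16 otherwise, and the dictionary used as memory are hypothetical conveniences.

```python
MASK32 = 0xFFFFFFFF


def sign_ext(imm16):
    """Sign-extend a 16-bit value to 32 bits."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if imm16 & 0x8000 else imm16


def step(instr, pc, regs, mem):
    """Apply one instruction's register transfers and return the next PC."""
    name, rs, rt, x = instr
    next_pc = (pc + 4) & MASK32  # default: PC <- PC + 4
    if name == "ADDU":
        regs[x] = (regs[rs] + regs[rt]) & MASK32
    elif name == "SUBU":
        regs[x] = (regs[rs] - regs[rt]) & MASK32
    elif name == "ORI":
        regs[rt] = regs[rs] | (x & 0xFFFF)  # zero-extended immediate
    elif name == "LW":
        regs[rt] = mem.get((regs[rs] + sign_ext(x)) & MASK32, 0)
    elif name == "SW":
        mem[(regs[rs] + sign_ext(x)) & MASK32] = regs[rt]
    elif name == "BEQ":
        if regs[rs] == regs[rt]:
            # {sign_ext(Imm16) || b'00'} is the extended offset shifted left by 2
            next_pc = (pc + 4 + (sign_ext(x) << 2)) & MASK32
    else:
        raise ValueError(f"unsupported instruction: {name}")
    return next_pc


regs, mem, pc = [0] * 32, {}, 0
program = [("ORI", 0, 1, 5), ("ORI", 0, 2, 7), ("ADDU", 1, 2, 3),
           ("SW", 0, 3, 0x10), ("LW", 0, 4, 0x10)]
for instr in program:
    pc = step(instr, pc, regs, mem)
```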
![Page 24: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/24.jpg)
Steps 2 and 3: Datapath & Assembly
- PC: a register
  - Counter, counts by +4
  - Provides address to Instruction Memory
[Figure: PC feeds the Read address input of Instruction Memory, which outputs Instruction[31:0]; an Add unit adds 4 to the PC.]
![Page 25: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/25.jpg)
Steps 2 and 3: Datapath & Assembly
- PC: a register
  - Counter, counts by +4
  - Sometimes, must add SignExtend{Imm16||b’00’} for branch instructions
- Note: the sign-extender for Imm16 is already in the datapath (everything else is new)
[Figure: PC, Instruction Memory (Read address, Instruction[31:0]), an Add unit for PC + 4, a second Add unit (Add result) fed through Shift Left 2 from the Sign/Zero Extend block (16 → 32, controlled by ExtOp), and a multiplexer (inputs 0 and 1) selected by PCSrc that chooses the next PC. Instruction fields shown: [25:21], [20:16], [15:11], [15:0] (Imm16).]
![Page 26: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/26.jpg)
Steps 2 and 3: Add Previous Datapath
[Figure: complete single-cycle datapath. Blocks: PC, Instruction Memory (Read address, Instruction[31:0]), Register File (Read reg. 1, Read reg. 2, Write reg., Write data, Read data 1, Read data 2), Sign/Zero Extend (16 → 32), ALU (ALU result, Zero) with ALU Control, Data Memory (Address, Write data, Read data), two Add units (PC + 4 and Add result, with Shift Left 2), and four multiplexers. Instruction fields: [25:21], [20:16], [15:11], [15:0] (Imm16), [5:0] (funct). Control signals: RegWrite, RegDst, ALUSrc, MemWrite, PCSrc, MemtoReg, ALUOp, ExtOp.]
![Page 27: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/27.jpg)
What have we done?
- Created a simple CPU datapath
  - Control still missing (next slide)
- Single-cycle CPU
  - Every instruction takes 1 clock cycle
  - Clocking?
![Page 28: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/28.jpg)
One Clock Cycle
- Clock Locations
  - PC, REGFILE have clocks
- Operation
  - On rising edge, PC will get new value
    - Maybe REGFILE will have one value updated as well
  - After rising edge
    - PC and REGFILE can’t change
    - New value out of PC
    - Instruction out of INSTRMEM
    - Instruction selects registers to read from REGFILE
    - Instruction controls ALUop, ALUsrc, MemWrite, ExtOp, etc.
    - ALU does its work
    - DataMem may be read (depending on instruction)
    - Result value goes back to REGFILE
    - New PC value goes back to PC
    - Await next clock edge
- Lots to do in only 1 clock cycle!!
![Page 29: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/29.jpg)
Missing Steps?
- Control is missing (Steps 4 and 5 we mentioned earlier)
  - Generate the green signals
    - ALUsrc, MemWrite, MemtoReg, PCSrc, RegDst, etc.
  - These are all f(Instruction), where f() is a logic expression
  - Will look at control strategies in upcoming lecture
- Implementation Details
  - How to implement REGFILE?
    - Read port: tristate buffers? Multiplexer? Memory?
    - Two read ports: two of above?
    - Write port: how to write only 1 register?
  - How to control writes to memory? To register file?
- More instructions
  - Shift instructions
  - Jump instruction
  - Etc.
![Page 30: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/30.jpg)
1-Cycle CPU Datapath
[Figure: complete single-cycle datapath. Blocks: PC, Instruction Memory, Register File, Sign/Zero Extend (16 → 32), ALU with ALU Control, Data Memory, two Add units with Shift Left 2, and four multiplexers. Instruction fields: [25:21], [20:16], [15:11], [15:0] (Imm16), [5:0] (funct). Control signals: RegWrite, RegDst, ALUSrc, MemWrite, PCSrc, MemtoReg, ALUOp, ExtOp.]
![Page 31: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/31.jpg)
1-cycle CPU Datapath + Control
[Figure: single-cycle datapath with a Control block added. Instruction[31:26] feeds Control, which outputs RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, RegWrite; Instruction[5:0] feeds ALU control; PCSrc selects the next PC. Other blocks: PC, Instruction Memory, Register File, Sign/Zero Extend, ALU (ALU result, Zero), Data Memory, two Add units, Shift Left 2. Instruction fields: [25:21], [20:16], [15:11], [15:0].]
![Page 32: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/32.jpg)
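1-cycle CPU Control – Lookup Table

| Input or Output | Signal Name | R-format | Lw | Sw | Beq |
|-----------------|-------------|----------|----|----|-----|
| Inputs | Op5 | 0 | 1 | 1 | 0 |
| | Op4 | 0 | 0 | 0 | 0 |
| | Op3 | 0 | 0 | 1 | 0 |
| | Op2 | 0 | 0 | 0 | 1 |
| | Op1 | 0 | 1 | 1 | 0 |
| | Op0 | 0 | 1 | 1 | 0 |
| Outputs | RegDst | 1 | 0 | X | X |
| | ALUSrc | 0 | 1 | 1 | 0 |
| | MemtoReg | 0 | 1 | X | X |
| | RegWrite | 1 | 1 | 0 | 0 |
| | MemRead | 0 | 1 | 0 | 0 |
| | MemWrite | 0 | 0 | 1 | 0 |
| | Branch | 0 | 0 | 0 | 1 |
| | ALUOp1 | 1 | 0 | 0 | 0 |
| | ALUOp0 | 0 | 0 | 0 | 1 |

Also: I-type instructions (ORI) & ExtOp (sign-extend control), etc.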
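Since every output is a function of the six opcode bits, the control unit can be modelled as a plain lookup keyed by Instruction[31:26]. The Python sketch below transcribes the table. Using `None` for the don't-care (X) entries and packing ALUOp1/ALUOp0 into one 2-bit value are illustrative choices.

```python
X = None  # don't care

# Keys are the opcode bits Op5..Op0 from the table.
CONTROL = {
    0b000000: dict(RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
                   MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),  # R-format
    0b100011: dict(RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
                   MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),  # lw
    0b101011: dict(RegDst=X, ALUSrc=1, MemtoReg=X, RegWrite=0,
                   MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),  # sw
    0b000100: dict(RegDst=X, ALUSrc=0, MemtoReg=X, RegWrite=0,
                   MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),  # beq
}


def control(instr):
    """Look up the control signals for a 32-bit instruction word."""
    return CONTROL[(instr >> 26) & 0x3F]


lw_signals = control(0x8D280004)  # a lw instruction word
```

In hardware the same mapping would be a block of combinational logic (or a small ROM) rather than a dictionary.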
![Page 33: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/33.jpg)
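1-cycle CPU + Jump Instruction
[Figure: single-cycle datapath extended for the jump instruction. Instruction[25:0] is combined with PC + 4 [31..28] to form Jump address [31..0]; Instruction[31:26] goes to the control unit. Other instruction fields shown: [25:21], [20:16], [15:11], [15:0], [5:0].]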
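In the standard MIPS encoding, the 32-bit jump address is the upper four bits of PC + 4, followed by the 26-bit target field, followed by two zero bits. A short Python sketch of that concatenation follows; the function name is illustrative.

```python
def jump_address(pc, instr):
    """Form Jump address [31..0] = (PC + 4)[31..28] || Instruction[25:0] || 00."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    upper = pc_plus_4 & 0xF0000000    # keep bits [31..28] of PC + 4
    target = instr & 0x03FFFFFF       # Instruction[25:0]
    return upper | (target << 2)      # shifting left by 2 appends b'00'


addr = jump_address(0x00400000, 0x08100004)
```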
![Page 34: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/34.jpg)
1-cycle CPU Problems?
- Every instruction 1 cycle
- Some instructions “do more work”
  - Eg, lw must read from DATAMEM
- All instructions must have same clock period…
  - Many instructions run slower than necessary
- Tricky timing on MemWrite, RegWrite(?) signals
  - Write signal must come *after* address is stable
- Need extra resources…
  - PC+4 adder, ALU for BEQ instruction, DATAMEM+INSTRMEM
![Page 35: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/35.jpg)
Performance!
- Single-Cycle CPU Performance
  - Execute one instruction per clock cycle (CPI=1)
  - Clock cycle time? Note dataflow includes:
    - INSTRMEM read
    - REGFILE access
    - Sign extension
    - ALU operation
    - DATAMEM read
    - REGFILE/PC write
  - Not every instruction uses all resources (eg, DATAMEM read)
- Can we change clock period for each instruction?
  - No! (Why not?)
  - One clock period: the worst case!
  - This is why a single-cycle CPU is not good for performance
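A worked example makes the worst-case argument concrete. The unit delays below are hypothetical numbers chosen only for illustration (they do not come from the slides), and sign extension is ignored as negligible. The point is that the clock period is the maximum over all instructions, not the average.

```python
# Hypothetical delays in picoseconds for each unit on the dataflow path.
DELAY_PS = {"INSTRMEM": 200, "REGFILE_READ": 100, "ALU": 200,
            "DATAMEM": 200, "REGFILE_WRITE": 100}

# Which units each instruction class passes through in one cycle.
USES = {
    "R-type": ["INSTRMEM", "REGFILE_READ", "ALU", "REGFILE_WRITE"],
    "lw": ["INSTRMEM", "REGFILE_READ", "ALU", "DATAMEM", "REGFILE_WRITE"],
    "sw": ["INSTRMEM", "REGFILE_READ", "ALU", "DATAMEM"],
    "beq": ["INSTRMEM", "REGFILE_READ", "ALU"],
}

latency_ps = {name: sum(DELAY_PS[u] for u in units)
              for name, units in USES.items()}

# A single fixed clock must accommodate the slowest instruction.
clock_period_ps = max(latency_ps.values())
slowest = max(latency_ps, key=latency_ps.get)
```

With these made-up numbers lw sets the period, so beq finishes its real work well before the cycle ends and waits out the rest.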
![Page 36: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/36.jpg)
1-cycle CPU Datapath + Controller
[Figure: single-cycle datapath with controller and jump support. Instruction[31:26] goes to the controller; Instruction[25:0] is combined with PC + 4 [31..28] to form Jump address [31..0]. Other instruction fields shown: [25:21], [20:16], [15:11], [15:0], [5:0].]
![Page 37: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/37.jpg)
1-cycle CPU Summary
- Operation
  - 1 cycle per instruction
  - Control signals held fixed during entire cycle (except BRANCH)
  - Only 2 registers
    - PC, updated every clock cycle
    - REGFILE, updated when required
  - During clock cycle, data flows from register-outputs to register-inputs
  - Fixed clock frequency / period
- Performance
  - 1 instruction per cycle
  - Slowest instruction determines clock frequency
- Outstanding issue: MemWrite timing
  - Assume this signal writes to memory at end of clock cycle
![Page 38: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/38.jpg)
Multi-cycle CPU Goals
- Improve performance
  - Break each instruction into smaller steps / multiple cycles
    - LW instruction → 5 cycles
    - SW instruction → 4 cycles
    - R-type instruction → 4 cycles
    - Branch, Jump → 3 cycles
  - Aim for 5x clock frequency
    - Complex instructions (eg, LW) → 5 cycles → same performance as before
    - Simple instructions (eg, ADD) → fewer cycles → faster
- Save resources (gates/transistors)
  - Re-use ALU over multiple cycles
  - Put INSTR + DATA in same memory
- MemWrite timing solved?
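The performance goal can be checked with a quick average-CPI calculation. The cycle counts below are the ones listed above. The instruction mix is a hypothetical example, and the 5x clock is the stated aim rather than a measured result.

```python
# Cycles per instruction class, as listed on the slide.
CYCLES = {"lw": 5, "sw": 4, "R-type": 4, "branch/jump": 3}

# Hypothetical instruction mix in percent (must sum to 100).
MIX_PERCENT = {"lw": 25, "sw": 10, "R-type": 50, "branch/jump": 15}

# Average cycles per instruction for the multi-cycle design.
avg_cpi = sum(MIX_PERCENT[k] * CYCLES[k] for k in CYCLES) / 100

# Single-cycle: 1 cycle of period T. Multi-cycle: avg_cpi cycles of period T / 5.
CLOCK_SPEEDUP = 5
speedup = CLOCK_SPEEDUP / avg_cpi
```

For this mix the average is 4.1 cycles, so the multi-cycle design comes out roughly 1.2 times faster. A program made only of lw instructions would see no gain, matching the "same performance as before" remark.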
![Page 39: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/39.jpg)
Multi-cycle CPU Datapath
- Add multiplexers + control signals (IorD, MemtoReg, ALUSrcA, ALUSrcB)
- Move signal paths (+4, Shift Left 2)
[Figure: multi-cycle datapath. Blocks: PC, a single Memory (Address, Write data, MemData), Instruction Register, Memory Data Register, Registers (RdReg1, RdReg2, Write reg, Write data, RdData1, RdData2), registers A and B, Sign Extend, Shift Left 2, constant 4, ALU (ALU result, Zero), ALUOut, and several multiplexers. Instruction fields: [25:21], [20:16], [15:11], [15:0], [5:0].]
![Page 40: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/40.jpg)
Multi-cycle CPU Datapath
- Add registers + control signals (IR, MDR, A, B, ALUOut)
- Registers with no control signal load value every clock cycle (eg, PC)
[Figure: multi-cycle datapath. Blocks: PC, a single Memory (Address, Write data, MemData), Instruction Register, Memory Data Register, Registers (RdReg1, RdReg2, Write reg, Write data, RdData1, RdData2), registers A and B, Sign Extend, Shift Left 2, constant 4, ALU (ALU result, Zero), ALUOut, and several multiplexers. Instruction fields: [25:21], [20:16], [15:11], [15:0], [5:0].]
![Page 41: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/41.jpg)
Instruction Execution Example
- Execute a “Load Word” instruction
  - LW rt, 0(rs)
- 5 Steps
  1. Fetch instruction
  2. Read registers
  3. Compute address
  4. Read data
  5. Write registers
![Page 42: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/42.jpg)
Load Word Instruction Sequence
1. Fetch Instruction: InstructionRegister ← Mem[PC]
[Figure: multi-cycle datapath diagram illustrating this step.]
![Page 43: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/43.jpg)
Load Word Instruction Sequence
2. Read Registers: A ← Registers[Rs]
[Figure: multi-cycle datapath diagram illustrating this step.]
![Page 44: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/44.jpg)
Load Word Instruction Sequence
3. Compute Address: ALUOut ← A + SignExt(Imm16)
[Figure: multi-cycle datapath diagram illustrating this step.]
![Page 45: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/45.jpg)
Load Word Instruction Sequence
4. Read Data: MDR ← Memory[ALUOut]
[Figure: multi-cycle datapath diagram illustrating this step.]
![Page 46: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/46.jpg)
Load Word Instruction Sequence
5. Write Registers: Registers[Rt] ← MDR
[Figure: multi-cycle datapath diagram illustrating this step.]
![Page 47: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/47.jpg)
Load Word Instruction Sequence
All 5 Steps Shown
[Figure: multi-cycle datapath diagram showing all five steps together.]
![Page 48: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/48.jpg)
Multi-cycle Load Word: Recap
1. Fetch Instruction: InstructionRegister ← Mem[PC]
2. Read Registers: A ← Registers[Rs]
3. Compute Address: ALUOut ← A + {SignExt(Imm16)}
4. Read Data: MDR ← Memory[ALUOut]
5. Write Registers: Registers[Rt] ← MDR

Missing Steps?
![Page 49: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/49.jpg)
Multi-cycle Load Word: Recap
1. Fetch Instruction: InstructionRegister ← Mem[PC]; PC ← PC + 4
2. Read Registers: A ← Registers[Rs]
3. Compute Address: ALUOut ← A + {SignExt(Imm16)}
4. Read Data: MDR ← Memory[ALUOut]
5. Write Registers: Registers[Rt] ← MDR

Missing Steps?
- Must increment the PC
- Do it as part of the instruction fetch (in step 1)
- Need PCWrite control signal
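The five steps can be traced in software by giving each temporary register (IR, A, ALUOut, MDR) its own entry in a state dictionary and performing one register transfer per cycle. The Python sketch below is a behavioural illustration only. The dictionary-based memory and the `run_lw` name are hypothetical, and no control signals are modelled.

```python
MASK32 = 0xFFFFFFFF


def sign_ext(imm16):
    """Sign-extend a 16-bit value to 32 bits."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if imm16 & 0x8000 else imm16


def run_lw(state, mem):
    """Execute one LW in five cycles; return the list of steps performed."""
    trace = []
    # Cycle 1: fetch the instruction and increment the PC
    state["IR"] = mem[state["PC"]]
    state["PC"] = (state["PC"] + 4) & MASK32
    trace.append("fetch")
    rs = (state["IR"] >> 21) & 0x1F
    rt = (state["IR"] >> 16) & 0x1F
    imm16 = state["IR"] & 0xFFFF
    # Cycle 2: read the base register into A
    state["A"] = state["REGS"][rs]
    trace.append("read registers")
    # Cycle 3: compute the address (no shift for loads)
    state["ALUOut"] = (state["A"] + sign_ext(imm16)) & MASK32
    trace.append("compute address")
    # Cycle 4: read the data word into MDR
    state["MDR"] = mem[state["ALUOut"]]
    trace.append("read data")
    # Cycle 5: write MDR into the destination register rt
    state["REGS"][rt] = state["MDR"]
    trace.append("write registers")
    return trace


# 0x8D280004 encodes lw with rt=8, rs=9, Imm16=4
mem = {0x0: 0x8D280004, 0x104: 0xDEADBEEF}
state = {"PC": 0, "REGS": [0] * 32}
state["REGS"][9] = 0x100
trace = run_lw(state, mem)
```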
![Page 50: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/50.jpg)
Multi-cycle R-Type Instruction
1. Fetch Instruction: InstructionRegister ← Mem[PC]; PC ← PC + 4
2. Read Registers: A ← Registers[Rs]; B ← Registers[Rt]
3. Compute Value: ALUOut ← A op B
4. Write Registers: Registers[Rd] ← ALUOut

- RTL describes data flow action in each clock cycle
  - Control signals determine precise data flow
  - Each step implies unique control values
![Page 51: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/51.jpg)
Multi-cycle R-Type Instruction: Control Signal Values
1. Fetch Instruction: InstructionRegister ← Mem[PC]; PC ← PC + 4
   - MemRead=1, ALUSrcA=0, IorD=0, IRWrite, ALUSrcB=01, ALUop=00, PCWrite, PCSource=00
2. Read Registers: A ← Registers[Rs]; B ← Registers[Rt]
   - ALUSrcA=0, ALUSrcB=11, ALUop=00
3. Compute Value: ALUOut ← A op B
   - ALUSrcA=1, ALUSrcB=00, ALUop=10
4. Write Registers: Registers[Rd] ← ALUOut
   - RegDst=1, RegWrite, MemtoReg=0

- Each step implies unique control values
  - Fixed for entire cycle
  - “Default value” implied if unspecified
![Page 52: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/52.jpg)
Check Your Work – Is RTL Valid?
1. Datapath check
   Within one cycle…
      Each cycle has a valid data flow path (path exists)
      Each register gets only one new value
   Across multiple cycles…
      A register value must be defined in a previous (earlier in time) clock cycle before it is used
         E.g., “A ← 3” must occur before “B ← A”
      Make sure a register value doesn’t disappear if it was set more than 1 cycle earlier
2. Control signal check
   Each cycle, the RTL describing the datapath flow implies a value for each control signal
      0 or 1 or default or don’t care
   Each control signal gets only one fixed value for the entire cycle
3. Overall check
   Does the sequence of steps work?
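Two of the datapath checks above are mechanical enough to automate. The sketch below is a hypothetical helper, not a tool from the lecture: `check_rtl` takes each cycle as a list of `(destination, sources)` transfers and reports registers written twice in a cycle or read before an earlier cycle defined them. It does not cover the "value disappears" check or the control-signal check.

```python
def check_rtl(cycles, initially_defined):
    """Check a list of cycles, each a list of (dest, [sources]) transfers."""
    defined = set(initially_defined)
    errors = []
    for n, transfers in enumerate(cycles, start=1):
        written = set()
        for dest, sources in transfers:
            # Within one cycle: each register gets only one new value.
            if dest in written:
                errors.append(f"cycle {n}: {dest} gets more than one new value")
            written.add(dest)
            # Across cycles: sources must be defined in an earlier cycle.
            for src in sources:
                if src not in defined:
                    errors.append(f"cycle {n}: {src} used before it is defined")
        # Values written this cycle become visible only in later cycles.
        defined |= written
    return errors


r_type = [
    [("IR", ["Mem", "PC"]), ("PC", ["PC"])],
    [("A", ["Registers", "IR"]), ("B", ["Registers", "IR"])],
    [("ALUOut", ["A", "B"])],
    [("Registers", ["ALUOut", "IR"])],
]
r_type_errors = check_rtl(r_type, {"PC", "Mem", "Registers"})
```

For the R-type sequence the error list comes back empty; swapping the order of “A ← 3” and “B ← A” produces a "used before it is defined" message.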
![Page 53: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/53.jpg)
Multi-cycle BEQ Instruction
1. Fetch Instruction: InstructionRegister ← Mem[PC]; PC ← PC + 4
2. Read Registers, Precompute Target: A ← Registers[Rs]; B ← Registers[Rt]; ALUOut ← PC + {SignExt{Imm16}, b’00’}
3. Compare Registers, Conditional Branch: if ((A - B) == 0) PC ← ALUOut
Green shows PC calculation flow (in parallel with other operations)
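The branch-target arithmetic in step 2 is easy to get wrong by a factor of four or by forgetting the sign extension. A minimal sketch, assuming 32-bit wrap-around and hypothetical helper names (`sign_extend16`, `beq`):

```python
def sign_extend16(imm16):
    """Interpret the low 16 bits of imm16 as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def beq(pc, a, b, imm16):
    """Return the PC after a BEQ located at address pc."""
    # Step 1: PC <- PC + 4
    pc = (pc + 4) & 0xFFFFFFFF
    # Step 2: ALUOut <- PC + {SignExt(Imm16), 00}, i.e. the offset times 4
    alu_out = (pc + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF
    # Step 3: if (A - B) == 0 then PC <- ALUOut
    if (a - b) == 0:
        pc = alu_out
    return pc
```

Note that the target is relative to the already-incremented PC, so an offset of -1 (0xFFFF) with equal registers branches back to the BEQ itself.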
![Page 54: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/54.jpg)
Multi-cycle Datapath with Control Signals
[Figure: multi-cycle datapath. Instruction fields: Instr[25:21], Instr[20:16], Instr[15:11], Instr[15:0], Instr[5:0], Instr[25:0]; jump address [31..0] formed using PC[31..28]. Control signals: PCWrite, IorD, MemRead, MemWrite, MemtoReg, IRWrite, PCSrc, ALUOp, ALUSrcA, ALUSrcB, RegWrite, RegDst, ALUControl.]
![Page 55: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/55.jpg)
Multi-cycle Datapath with Controller
[Figure: multi-cycle datapath with controller. The controller takes Instr[31:26] as input. Instruction fields: Instr[25:21], Instr[20:16], Instr[15:11], Instr[15:0], Instr[5:0], Instr[25:0]; jump address [31..0] formed using PC[31..28].]
![Page 59: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/59.jpg)
Multi-cycle CPU Control: Overview
General approach: Finite State Machine (FSM)
Need details in each branch of control…
   Precise outputs for each state (Mealy depends on inputs, Moore does not)
   Precise “next state” for each state (can depend on inputs)
[Figure: FSM block diagrams with control signal outputs]
![Page 60: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/60.jpg)
How to Implement FSM?
Manually with logic gates + FFs
   Bubble diagram, next-state table, state assignment
   Karnaugh map for each state bit, each output bit (painful!)
High-level language description (e.g., Verilog, VHDL)
   Describe FSM bubble diagram (next-states, output values)
   Automatically synthesized into gates + FFs
Microcode (µ-code) description
   Sequence through many µ-ops for each CPU instruction
      One µ-op (µ-instruction) sends the correct control signals for 1 cycle
      µ-op similar to one bubble in FSM
   Acts like a mini-CPU within a CPU
      µPC: microcode program counter
      Microcode storage memory contains µ-ops
   Can look similar to RTL or some new “assembly language”
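To make the "next-state" idea concrete, here is a small software sketch of a control FSM's next-state function. It is written in Python rather than Verilog or VHDL purely for illustration; the state names (`MemAddr`, `LoadWriteBack`, and so on) are hypothetical, and only the sequencing is modelled, not the control outputs.

```python
# After Decode, the opcode class selects the branch of the bubble diagram.
DISPATCH = {"LW": "MemAddr", "SW": "MemAddr", "R": "Execute",
            "BEQ": "Branch", "J": "Jump"}


def next_state(state, op):
    """Next state as a function of the current state and the opcode class."""
    if state == "Fetch":
        return "Decode"
    if state == "Decode":
        return DISPATCH[op]
    if state == "MemAddr":
        return "MemRead" if op == "LW" else "MemWrite"
    if state == "MemRead":
        return "LoadWriteBack"
    if state == "Execute":
        return "RTypeWriteBack"
    return "Fetch"  # every final state returns to Fetch


def states_for(op):
    """List the states visited by one instruction, one state per clock cycle."""
    trace, state = ["Fetch"], "Fetch"
    while True:
        state = next_state(state, op)
        if state == "Fetch":
            return trace
        trace.append(state)
```

Counting the states in each trace gives the number of clock cycles the instruction takes, for example five for `states_for("LW")` and three for `states_for("BEQ")`.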
![Page 61: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/61.jpg)
FSM Specification: Bubble Diagram
Can build this by examining RTL
It is possible to automatically convert RTL into this form!
![Page 62: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/62.jpg)
FSM: Gates + FFs Implementation
FSM High-level Organization
![Page 63: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/63.jpg)
FSM: Microcode Implementation
[Figure: microcode controller. Blocks: microcode storage (memory) with inputs and outputs, microprogram counter, adder (+1), address select logic. Signals: datapath control outputs, sequencing control, inputs from instruction register opcode field.]
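The blocks named in this figure can be mimicked with a few lines of code. The sketch below is an assumption-laden illustration, not the lecture's microcode: the µ-op names, the three sequencing options (`next`, `dispatch`, `fetch`) and the `DISPATCH_ROM` contents are all hypothetical.

```python
# Microcode storage: one (datapath control outputs, sequencing control) per µ-op.
MICROCODE = [
    ("fetch", "next"),         # address 0
    ("decode", "dispatch"),    # address 1
    ("r_execute", "next"),     # address 2
    ("r_writeback", "fetch"),  # address 3
    ("beq_compare", "fetch"),  # address 4
]

# Address select input: opcode field -> first µ-op of that instruction.
DISPATCH_ROM = {"R": 2, "BEQ": 4}


def run_instruction(opcode):
    """Return the µ-ops issued for one instruction, one per clock cycle."""
    upc, issued = 0, []  # µPC starts at the fetch µ-op
    while True:
        outputs, seq = MICROCODE[upc]
        issued.append(outputs)
        if seq == "next":        # adder path: µPC <- µPC + 1
            upc = upc + 1
        elif seq == "dispatch":  # address select logic uses the opcode
            upc = DISPATCH_ROM[opcode]
        else:                    # "fetch": instruction done, back to address 0
            return issued
```

Each µ-op here plays the role of one bubble in the FSM diagram, and the sequencing field replaces the next-state logic.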
![Page 64: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/64.jpg)
Multi-cycle CPU with Control FSM
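[Figure: multi-cycle CPU with control FSM. The FSM takes Instr[31:26] as input and drives the FSM control outputs, including the conditional branch signal. Instruction fields: Instr[25:21], Instr[20:16], Instr[15:11], Instr[15:0], Instr[5:0], Instr[25:0]; jump address [31..0] formed using PC[31..28].]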
![Page 65: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/65.jpg)
Control FSM: Overview
General approach: Finite State Machine (FSM)
Need details in each branch of control…
![Page 66: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/66.jpg)
Detailed FSM
![Page 67: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/67.jpg)
Detailed FSM
[Figure: FSM bubble diagram divided into regions: Instruction Fetch, Memory Reference, R-Type, Branch, Jump]
![Page 68: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/68.jpg)
Detailed FSM: Instruction Fetch
![Page 69: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/69.jpg)
Detailed FSM: Memory Reference
LW SW
![Page 70: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/70.jpg)
Detailed FSM: R-Type Instruction
![Page 71: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/71.jpg)
Detailed FSM: Branch Instruction
![Page 72: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/72.jpg)
Detailed FSM: Jump Instruction
![Page 73: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/73.jpg)
Performance Comparison
Single-cycle CPU vs Multi-cycle CPU
![Page 74: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/74.jpg)
Simple Comparison
Single-cycle CPU: 1 clock cycle (all instructions)
Multi-cycle CPU: 5 clock cycles (LW)
Multi-cycle CPU: 4 clock cycles (SW, R-type)
Multi-cycle CPU: 3 clock cycles (BEQ, J)
![Page 75: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/75.jpg)
What’s really happening?
[Figure: ideal timing of a Load Word instruction on the single-cycle CPU and the multi-cycle CPU. Steps: Fetch, Decode, CalcAddr, Memory, Write.]
![Page 76: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/76.jpg)
In practice, steps differ in speeds…
[Figure: Load Word instruction timing on the single-cycle CPU and the multi-cycle CPU when the steps (Fetch, Decode, CalcAddr, Memory, Write) take different amounts of time. Annotations: “Violation!”, “Wasted time!”]
![Page 77: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/77.jpg)
Single-cycle vs Multi-cycle
LW instruction faster for single-cycle
[Figure: LW timing on the single-cycle CPU and the multi-cycle CPU. Steps: Fetch, Decode, CalcAddr, Memory, Write. Annotations: “Violation fixed!”, “Now wasted time is larger!”]
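The trade-off can be reproduced with a small calculation. The step latencies below are invented for illustration (the lecture gives no numbers); the only modelling assumptions are that the single-cycle clock period must fit the slowest instruction, and the multi-cycle clock period must fit the slowest step.

```python
# Hypothetical step latencies in nanoseconds (not from the lecture).
STEP_NS = {"Fetch": 2, "Decode": 1, "CalcAddr": 2, "Memory": 2, "Write": 1}
STEPS = {
    "LW": ["Fetch", "Decode", "CalcAddr", "Memory", "Write"],
    "SW": ["Fetch", "Decode", "CalcAddr", "Memory"],
    "BEQ": ["Fetch", "Decode", "CalcAddr"],
}

# Single-cycle: one long clock period that fits the slowest instruction.
single_period = max(sum(STEP_NS[s] for s in steps) for steps in STEPS.values())
# Multi-cycle: one short clock period that fits the slowest step.
multi_period = max(STEP_NS.values())


def single_cycle_time(instr):
    """Every instruction takes exactly one (long) clock period."""
    return single_period


def multi_cycle_time(instr):
    """Each step takes one (short) clock period."""
    return len(STEPS[instr]) * multi_period


def wasted_multi(instr):
    """Time the multi-cycle CPU idles because steps are shorter than the period."""
    return multi_cycle_time(instr) - sum(STEP_NS[s] for s in STEPS[instr])
```

With these made-up latencies, LW is slower on the multi-cycle CPU, SW takes about the same time on both, and BEQ is faster on the multi-cycle CPU.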
![Page 78: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/78.jpg)
Single-cycle vs Multi-cycle
SW instruction ~ same speed
[Figure: SW timing on the single-cycle CPU and the multi-cycle CPU. Steps: Fetch, Decode, CalcAddr, Memory. Annotations: “Wasted time!”, “Speed diff”]
![Page 79: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/79.jpg)
Single-cycle vs Multi-cycle
BEQ, J instruction faster for multi-cycle
[Figure: BEQ/J timing on the single-cycle CPU and the multi-cycle CPU. Steps: Fetch, Decode, CalcAddr. Annotations: “Wasted time!”, “Speed diff”]
![Page 80: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/80.jpg)
Performance Summary
Which CPU implementation is faster?
   LW: single-cycle is faster
   SW, R-type: about the same
   BEQ, J: multi-cycle is faster
Real programs use a mix of these instructions
Overall performance depends on instruction frequency!
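A weighted average makes the dependence on instruction frequency explicit. In the sketch below the cycle counts are the multi-cycle counts from the comparison earlier (LW 5, SW and R-type 4, BEQ and J 3); the instruction mix and both clock periods are hypothetical numbers chosen only to show the calculation.

```python
CYCLES = {"LW": 5, "SW": 4, "R-type": 4, "BEQ": 3, "J": 3}
# Hypothetical instruction mix in percent (must sum to 100).
MIX_PERCENT = {"LW": 25, "SW": 10, "R-type": 45, "BEQ": 15, "J": 5}


def average_cpi(cycles, mix_percent):
    """Average clock cycles per instruction, weighted by instruction frequency."""
    assert sum(mix_percent.values()) == 100
    return sum(cycles[i] * mix_percent[i] for i in cycles) / 100


def avg_time_ns(cpi, period_ns):
    """Average time per instruction = CPI x clock period."""
    return cpi * period_ns


multi_cpi = average_cpi(CYCLES, MIX_PERCENT)
single_ns = avg_time_ns(1, 8.0)         # assumed 8 ns single-cycle period
multi_ns = avg_time_ns(multi_cpi, 2.0)  # assumed 2 ns multi-cycle period
```

Shifting the mix toward branches and jumps lowers `multi_cpi` and favours the multi-cycle design; shifting it toward loads does the opposite.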
![Page 81: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/81.jpg)
Implementation Summary
Single-cycle CPU
   1 instruction per cycle (e.g., 1 MHz → 1 MIPS)
   No “wasted time” on most complex instruction
   Large wasted time on simpler instructions
   Simple controller (just a lookup table or memory)
   Simple instructions
Multi-cycle CPU
   << 1 instruction per cycle (e.g., 1 MHz → 0.2 MIPS)
   Small time wasted on most complex instruction
      Hence, this instruction is always slower than on the single-cycle CPU
   Small time wasted on simple instructions
      Eliminates “large wasted time” by using fewer clock cycles
   Complex controller (FSM)
   Potential to create complex instructions
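The MIPS figures quoted above follow from dividing the clock rate by the cycles per instruction. A minimal sketch, where `mips_rating` is a hypothetical helper and the 0.2 MIPS case assumes every instruction takes 5 cycles:

```python
def mips_rating(clock_hz, cycles_per_instruction):
    """Millions of instructions per second for a given clock and CPI."""
    return clock_hz / cycles_per_instruction / 1_000_000


single_cycle_mips = mips_rating(1_000_000, 1)  # 1 MHz, 1 cycle per instruction
multi_cycle_mips = mips_rating(1_000_000, 5)   # 1 MHz, assumed 5 cycles each
```

The comparison only holds at equal clock rates; in practice the multi-cycle CPU runs a shorter clock period, which is what recovers most of the difference.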
![Page 82: Introduction to Computer Organization and Architecture Lecture 11 By Juthawut Chantharamalee wut_cha/home.htm](https://reader036.vdocuments.us/reader036/viewer/2022062523/5a4d1ad17f8b9ab05997150e/html5/thumbnails/82.jpg)
The End Lecture 11