verilog_case_study
ES6102: Advanced Digital Systems Design 2011
Adapted from Patterson and Hennessy, “Computer Organization and Design: The hardware/software interface”, 2nd Ed., MKP, 1998. Copyright 1998 Morgan Kaufmann Publishers
School of Computer Engineering
ES6102 Advanced Digital Systems Design
Complex Sequential systems
Module 6
MIPS Datapath (Case Study)
The Five Classic Components of a Computer
Control
Datapath
Memory
Processor
Input
Output
The Performance Perspective
• Performance of a machine is determined by:
– Instruction count
– Clock cycle time
– Clock cycles per instruction (CPI)
• Processor design (datapath and control) will determine:
– Clock cycle time
– Clock cycles per instruction
• Single cycle processor:
– Advantage: one clock cycle per instruction
– Disadvantage: long cycle time
[Figure: the three performance factors: Inst. Count, CPI, Cycle Time]
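The three factors multiply: execution time = instruction count x CPI x clock cycle time. A minimal sketch of that product follows; the instruction count, CPI values and cycle times are made-up numbers chosen for illustration, not figures from the course.

```python
def cpu_time(instruction_count, cpi, cycle_time_ns):
    """Execution time in nanoseconds: IC x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_ns

# Hypothetical single-cycle design: CPI = 1 but a long 10 ns cycle.
single_cycle = cpu_time(1_000_000, 1.0, 10.0)

# Hypothetical multi-cycle design: higher CPI but a short 2 ns cycle.
multi_cycle = cpu_time(1_000_000, 4.0, 2.0)
```

With these assumed numbers the lower CPI does not win on its own; only the product of the three terms decides which design is faster.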
The Processor: Datapath & Control
• We're ready to look at an implementation of the MIPS
• Simplified to contain only:
– memory-reference instructions: lw, sw
– arithmetic-logical instructions: add, sub, and, or, slt
– control flow instructions: beq, j
• Generic Implementation:
– use the program counter (PC) to supply instruction address
– get the instruction from memory
– read registers
– use the instruction to decide exactly what to do
• All instructions except jump use the ALU after reading the registers
The MIPS Instruction Formats
• All MIPS instructions are 32 bits long. The three instruction formats:
– R-type: op [31-26] (6 bits) | rs [25-21] (5 bits) | rt [20-16] (5 bits) | rd [15-11] (5 bits) | shamt [10-6] (5 bits) | funct [5-0] (6 bits)
– I-type: op [31-26] (6 bits) | rs [25-21] (5 bits) | rt [20-16] (5 bits) | immediate [15-0] (16 bits)
– J-type: op [31-26] (6 bits) | target address [25-0] (26 bits)
• The different fields are:
– op: operation of the instruction
– rs, rt, rd: the source and destination register specifiers
– shamt: shift amount
– funct: selects the variant of the operation in the “op” field
– address / immediate: address offset or immediate value
– target address: target address of the jump instruction
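The field boundaries can be checked with a few shifts and masks. The sketch below is illustrative only: the function name decode and the dictionary layout are assumptions made here, and every field is extracted regardless of which format the word actually uses.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "op":     (word >> 26) & 0x3F,   # bits 31-26
        "rs":     (word >> 21) & 0x1F,   # bits 25-21
        "rt":     (word >> 16) & 0x1F,   # bits 20-16
        "rd":     (word >> 11) & 0x1F,   # bits 15-11 (R-type)
        "shamt":  (word >> 6) & 0x1F,    # bits 10-6  (R-type)
        "funct":  word & 0x3F,           # bits 5-0   (R-type)
        "imm16":  word & 0xFFFF,         # bits 15-0  (I-type)
        "target": word & 0x3FFFFFF,      # bits 25-0  (J-type)
    }

# add $3, $1, $2 encodes as op=0, rs=1, rt=2, rd=3, shamt=0, funct=0x20
fields = decode(0x00221820)
```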
Let's look at a MIPS subset
• ADD and SUB (R-type: op | rs | rt | rd | shamt | funct)
– add rd, rs, rt
– sub rd, rs, rt
• OR Immediate (I-type: op | rs | rt | immediate)
– ori rt, rs, imm16
• LOAD and STORE Word (I-type: op | rs | rt | immediate)
– lw rt, rs, imm16
– sw rt, rs, imm16
• BRANCH (I-type: op | rs | rt | immediate)
– beq rs, rt, imm16
Register Transfers
• Process starts by fetching the instruction:
op | rs | rt | rd | shamt | funct <- MEM[ PC ]
op | rs | rt | Imm16 <- MEM[ PC ]
• inst: Register Transfers
ADDU: R[rd] <- R[rs] + R[rt]; PC <- PC + 4
SUBU: R[rd] <- R[rs] – R[rt]; PC <- PC + 4
ORi: R[rt] <- R[rs] | zero_ext(Imm16); PC <- PC + 4
LOAD: R[rt] <- MEM[ R[rs] + sign_ext(Imm16) ]; PC <- PC + 4
STORE: MEM[ R[rs] + sign_ext(Imm16) ] <- R[rt]; PC <- PC + 4
BEQ: if ( R[rs] == R[rt] ) then PC <- PC + 4 + (sign_ext(Imm16) x 4) else PC <- PC + 4
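The register transfers above can be read as a small interpreter. The sketch below is one way to write them down, under stated assumptions: the instruction is a hypothetical tuple (name, rs, rt, rd, imm16) rather than an encoded word, memory is a dictionary indexed by address, and register 0 is not forced to zero.

```python
def sign_ext16(x):
    """Interpret a 16-bit value as a signed two's-complement number."""
    return x - 0x10000 if x & 0x8000 else x

def step(inst, R, MEM, pc):
    """Apply one register transfer and return the next PC."""
    name, rs, rt, rd, imm16 = inst
    mask = 0xFFFFFFFF
    if name == "addu":
        R[rd] = (R[rs] + R[rt]) & mask
    elif name == "subu":
        R[rd] = (R[rs] - R[rt]) & mask
    elif name == "ori":
        R[rt] = R[rs] | imm16                     # zero-extended immediate
    elif name == "lw":
        R[rt] = MEM.get((R[rs] + sign_ext16(imm16)) & mask, 0)
    elif name == "sw":
        MEM[(R[rs] + sign_ext16(imm16)) & mask] = R[rt]
    elif name == "beq" and R[rs] == R[rt]:
        return (pc + 4 + sign_ext16(imm16) * 4) & mask   # branch taken
    return (pc + 4) & mask                               # sequential code

R, MEM, pc = [0] * 32, {}, 0
pc = step(("ori", 0, 1, 0, 5), R, MEM, pc)        # R[1] = 5
pc = step(("addu", 1, 1, 2, 0), R, MEM, pc)       # R[2] = R[1] + R[1]
pc = step(("sw", 0, 2, 0, 8), R, MEM, pc)         # MEM[8] = R[2]
pc = step(("lw", 0, 3, 0, 8), R, MEM, pc)         # R[3] = MEM[8]
pc = step(("beq", 2, 3, 0, 0xFFFF), R, MEM, pc)   # taken, offset of -1 word
```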
Requirements of the Instruction Set
• Memory
– instruction & data
• Registers (32 x 32)
– read RS
– read RT
– write RT or RD
• PC
• Extender
• Add and Sub register or extended immediate
• Add 4 or extended immediate to PC
Need a Storage Element: Register File
• Register File consists of 32 registers:
– Two 32-bit output busses: busA and busB
– One 32-bit input bus: busW
• Register is selected by:
– RA (number) selects the register to put on busA (data)
– RB (number) selects the register to put on busB (data)
– RW (number) selects the register to be written via busW (data) when Write Enable is 1
• Clock input (CLK)
– The CLK input is a factor ONLY during write operation
– During read operation, behaves as a combinational logic block: i.e. RA or RB valid => busA or busB valid after “access time.”
[Figure: register file of 32 32-bit registers. Inputs: Clk, Write Enable, busW (32 bits), RW, RA, RB (5 bits each). Outputs: busA and busB (32 bits each).]
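A behavioural sketch of this register file, for illustration only. The class and method names are assumptions made here; the point is that reads do not involve the clock, while a write takes effect only on a clock edge with Write Enable set.

```python
class RegisterFile:
    """32 x 32-bit register file: combinational read, clocked write."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra, rb):
        # Reads need no clock: busA and busB simply follow RA and RB.
        return self.regs[ra], self.regs[rb]

    def clock_edge(self, rw, bus_w, write_enable):
        # The write happens only on a clock edge with Write Enable = 1.
        if write_enable:
            self.regs[rw] = bus_w & 0xFFFFFFFF

rf = RegisterFile()
rf.clock_edge(rw=5, bus_w=42, write_enable=1)
rf.clock_edge(rw=6, bus_w=99, write_enable=0)   # ignored: enable is 0
bus_a, bus_b = rf.read(5, 6)
```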
Basic Building Blocks
• Adder
• MUX
• ALU
• Registers
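[Figure: block symbols. Adder: 32-bit inputs A and B plus CarryIn; outputs Sum (32 bits) and Carry. MUX: 32-bit inputs A and B plus Select; 32-bit output Y. ALU: 32-bit inputs A and B plus OP; 32-bit output Result.]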
So what do we need?
[Figure: the datapath elements]
a. Instruction memory (Instruction address in, Instruction out)
b. Program counter (PC)
c. Adder (Add, Sum)
d. Data memory unit (Address, Write data, Read data; controls MemRead and MemWrite)
e. Sign-extension unit (16 bits in, 32 bits out)
f. Registers (Read register 1, Read register 2, Write register, 5 bits each; Write data, Read data 1, Read data 2; control RegWrite)
g. ALU (3-bit ALU control; outputs ALU result and Zero)
h. Selector (MUX: 32-bit inputs A and B, Select, 32-bit output Y)
How do we connect them?
• Register Transfer Requirements -> Datapath Assembly
– Instruction Fetch
– Then Read Operand and Execute Operation
• Instruction fetch
– Fetch the Instruction: mem[PC]
– Update the program counter:
• Sequential Code: PC <- PC + 4
• Branch and Jump: PC <- “something else”
[Figure: instruction fetch unit. PC (clocked by Clk) supplies the Address to the Instruction Memory, which outputs the 32-bit Instruction Word; Next Address Logic updates PC.]
Execution (Add and Subtract)
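• R[rd] <- R[rs] op R[rt] Example: addu rd, rs, rt
– Ra, Rb, and Rw come from instruction’s rs, rt, and rd fields
– ALUctr and RegWr: control logic after decoding the instruction
– R-type format: op [31-26] (6 bits) | rs [25-21] (5 bits) | rt [20-16] (5 bits) | rd [15-11] (5 bits) | shamt [10-6] (5 bits) | funct [5-0] (6 bits)
[Figure: R-type datapath. Register file of 32 32-bit registers with Rw = Rd, Ra = Rs, Rb = Rt (5 bits each), clocked by Clk with RegWr; busA and busB (32 bits) feed the ALU, controlled by ALUctr; the 32-bit Result returns on busW.]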
Execution (Logical with Immediate)
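• R[rt] <- R[rs] op ZeroExt[imm16]
– I-type format: op [31-26] (6 bits) | rs [25-21] (5 bits) | rt [20-16] (5 bits) | immediate [15-0] (16 bits)
[Figure: logical-immediate datapath. A Mux controlled by RegDst chooses Rt or Rd as Rw; Ra = Rs; ZeroExt widens imm16 from 16 to 32 bits; a Mux controlled by ALUSrc chooses busB or the extended immediate as the second ALU input; ALUctr controls the ALU and the 32-bit Result returns on busW (Clk, RegWr).]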
Execution (Load Operations)
• R[rt] <- Mem[ R[rs] + SignExt[imm16] ] Example: lw rt, rs, imm16
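– I-type format: op [31-26] (6 bits) | rs [25-21] (5 bits) | rt [20-16] (5 bits) | immediate [15-0] (16 bits)
[Figure: load datapath. Register file (Rw chosen from Rt or Rd by the RegDst Mux, Ra = Rs, Rb, Clk, RegWr); Extender controlled by ExtOp widens imm16 from 16 to 32 bits; the ALUSrc Mux selects busB or the extended immediate; the ALU (ALUctr) output drives Adr of the Data Memory (Data In, WrEn = MemWr, Clk); a Mux controlled by W_Src selects the ALU result or the memory output for busW.]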
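The Extender and the address calculation for lw can be sketched as follows. This is an illustration, not the course's implementation: the encoding ExtOp = 1 for sign extension and 0 for zero extension is an assumption, as are the function names.

```python
def extend(imm16, ext_op):
    """16-bit to 32-bit extender.
    ext_op = 1: sign extend; ext_op = 0: zero extend (assumed encoding)."""
    imm16 &= 0xFFFF
    if ext_op and (imm16 & 0x8000):
        return imm16 | 0xFFFF0000      # copy the sign bit into bits 31-16
    return imm16

def load_address(r_rs, imm16):
    """Effective address for lw: R[rs] + SignExt[imm16], modulo 2**32."""
    return (r_rs + extend(imm16, 1)) & 0xFFFFFFFF

addr = load_address(0x1000, 0xFFFC)    # 0xFFFC is an offset of -4
```

The same immediate 0x8000 extends to 0x00008000 for ori (zero extension) but to 0xFFFF8000 for lw and sw (sign extension), which is why the Extender needs the ExtOp control.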
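Execution (Store Operations)
• Mem[ R[rs] + SignExt[imm16] ] <- R[rt] Example: sw rt, rs, imm16
– I-type format: op [31-26] (6 bits) | rs [25-21] (5 bits) | rt [20-16] (5 bits) | immediate [15-0] (16 bits)
[Figure: store datapath. Register file (Rw chosen from Rt or Rd by the RegDst Mux, Ra = Rs, Rb = Rt, Clk, RegWr); Extender (ExtOp) widens imm16 from 16 to 32 bits; the ALUSrc Mux selects busB or the extended immediate; the ALU (ALUctr) output drives Adr of the Data Memory; busB drives Data In, written when WrEn (MemWr) is set; a Mux controlled by W_Src selects the value for busW.]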
Execution (Branch Operations)
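• beq rs, rt, imm16 Datapath generates condition (equal)
– I-type format: op [31-26] (6 bits) | rs [25-21] (5 bits) | rt [20-16] (5 bits) | immediate [15-0] (16 bits)
[Figure: branch datapath. Register file (Ra = Rs, Rb = Rt, Clk, RegWr) puts the operands on busA and busB; an “Equal?” comparator produces Cond. PC (Clk) supplies Inst Address; one Adder forms PC + 4, a second Adder adds the output of PC Ext (imm16 extended, low bits 00); a Mux controlled by nPC_sel selects the next PC.]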
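The next-address logic for beq can be sketched as below. Both adder results are formed and the mux picks one. Treating nPC_sel = 1 as “the instruction is a branch” and gating it with the Equal condition is an assumption of this sketch; the function name is also illustrative.

```python
def next_pc(pc, imm16, npc_sel, equal):
    """Next PC: the branch target if a branch is taken, otherwise PC + 4."""
    offset = imm16 - 0x10000 if imm16 & 0x8000 else imm16   # sign extend
    sequential = (pc + 4) & 0xFFFFFFFF                      # first adder
    target = (sequential + offset * 4) & 0xFFFFFFFF         # second adder
    return target if (npc_sel and equal) else sequential    # mux
```

For example, with PC = 0x100 and imm16 = 3, a taken branch goes to 0x104 + 12 = 0x110, while a branch that is not taken continues at 0x104.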
Putting it all together
• A single cycle implementation
[Figure: single-cycle datapath. Elements: PC, Instruction memory (Read address, Instruction [31-0]), Registers, Sign extend (16 to 32 bits), Shift left 2, ALU with ALU control, two adders (PC + 4 and branch target), Data memory, and four multiplexers. Control signals: RegDst, RegWrite, ALUSrc, ALUOp, PCSrc, MemRead, MemWrite, MemtoReg. Instruction fields: [25-21] and [20-16] to the read register ports, [20-16] or [15-11] to the write register port, [15-0] to Sign extend, [5-0] to ALU control.]
An Abstract View of the Implementation
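[Figure: abstract view. Datapath: PC and Next Address logic; Ideal Instruction Memory (Instruction Address in, Instruction out); register file of 32 32-bit registers (Rw, Ra, Rb from Rd, Rs, Rt, 5 bits each); ALU with 32-bit inputs A and B; Ideal Data Memory (Data Address, Data In, Data Out); storage elements clocked by Clk. Control: takes the Instruction and the Conditions from the datapath and returns Control Signals.]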
Control of the Datapath
• Control is the hard part
• MIPS makes control easier
– Instructions same size
– Source registers always in same place
– Immediates same size, location
– Operations always on registers/immediates
• Let's skip control till later
An Abstract View of the Critical Path
• Register file and ideal memory:
– The CLK input is a factor ONLY during write operation
– During read operation, behave as combinational logic:
• Address valid => Output valid after “access time.”
[Figure: abstract datapath. PC and Next Address logic, Ideal Instruction Memory, register file of 32 32-bit registers (Rw, Ra, Rb from Rd, Rs, Rt, 5 bits each; 16-bit Imm), ALU with 32-bit inputs A and B, Ideal Data Memory (Data Address, Data In), storage elements clocked by Clk.]
Critical Path (Load) = PC’s Clk-to-Q + Inst. Memory Access Time + Register File Access Time + ALU (32-bit Add) + Data Memory Access Time + Setup Time for Register File Write + Clock Skew
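The critical-path sum is a plain addition of component delays. In the sketch below every delay value is a made-up number in picoseconds, chosen only to show the calculation; none of them come from the slides.

```python
# Hypothetical component delays in picoseconds (assumed values).
delays_ps = {
    "PC Clk-to-Q": 100,
    "Inst. Memory Access Time": 2000,
    "Register File Access Time": 1000,
    "ALU (32-bit Add)": 2000,
    "Data Memory Access Time": 2000,
    "Setup Time for Register File Write": 100,
    "Clock Skew": 100,
}

# The load instruction passes through every term in the list.
load_critical_path_ps = sum(delays_ps.values())

# The clock period of a single-cycle design must be at least this long.
min_cycle_time_ns = load_critical_path_ps / 1000
```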
Critical Path (Load Instruction)
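[Figure: complete single-cycle datapath with the load critical path. Instruction fetch: PC (Clk), Inst Memory (Adr, Instruction<31:0>), two Adders (PC + 4 and PC + 4 plus PC Ext of imm16 with low bits 00), Mux controlled by nPC_sel. Instruction fields: Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>. Execute: register file of 32 32-bit registers (Rw from the RegDst Mux, Ra = Rs, Rb = Rt, RegWr, Clk), Extender (ExtOp), ALUSrc Mux, ALU (ALUctr) with Equal output, Data Memory (Adr, Data In, WrEn = MemWr, Clk), MemtoReg Mux driving busW.]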
Worst Case Timing (Load)
[Figure: timing diagram for a load. After the Clk edge each signal changes from Old Value to New Value in turn: PC (after Clk-to-Q); Rs, Rt, Rd, Op, Func (after Instruction Memory Access Time); ALUctr, RegWr, ExtOp, ALUSrc, MemtoReg (after Delay through Control Logic); busA (after Register File Access Time); busB (after Delay through Extender & Mux); Address (after ALU Delay); busW (after Data Memory Access Time); then Register Write Occurs.]
Single cycle (CPI=1) processor: The problem
• Long Cycle Time
• All instructions take as much time as the slowest
• Real memory is not so nice as our idealized memory
– cannot always get the job done in one (short) cycle
[Figure: path through the datapath for each instruction class]
Arithmetic & Logical: PC, Inst Memory, Reg File, mux, ALU, mux, setup
Load (Critical Path): PC, Inst Memory, Reg File, mux, ALU, Data Mem, mux, setup
Store: PC, Inst Memory, Reg File, mux, ALU, Data Mem
Branch: PC, Inst Memory, Reg File, cmp, mux
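To see why every instruction pays for the slowest one, the per-class paths on this slide can be summed and compared. The component delays below are assumed values in picoseconds, and the path lists are a reading of the slide's figure; treat both as illustrative.

```python
# Hypothetical component delays in picoseconds (assumed, not from the slides).
D = {"PC": 100, "Inst Memory": 2000, "Reg File": 1000, "mux": 100,
     "ALU": 2000, "Data Mem": 2000, "cmp": 500, "setup": 100}

PATHS = {
    "Arithmetic & Logical": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "mux", "setup"],
    "Load": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "Data Mem", "mux", "setup"],
    "Store": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "Data Mem"],
    "Branch": ["PC", "Inst Memory", "Reg File", "cmp", "mux"],
}

# Time each class actually needs, and the cycle time all of them must use.
latency = {name: sum(D[c] for c in path) for name, path in PATHS.items()}
cycle_time = max(latency.values())
slowest = max(latency, key=latency.get)
```

With these numbers a branch needs about half of the cycle it is given, which is the waste the multi-cycle approach tries to recover.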
Time is the problem
• For a single cycle implementation, the time from when one instruction is started till it completes (cycle time) is long.
– Cycle time must be long enough for the load instruction
– Cycle time for load is much longer than needed for all other instructions
• Instead consider a multi-cycle approach.
– We will be reusing functional units
• ALU used to compute address and to increment PC
• Memory used for instruction and data
– Our control signals will not be determined solely by instruction
• We’ll use a finite state machine for control
Multicycle Approach
• Break up the instructions into steps, each step takes a cycle
– balance the amount of work to be done
– restrict each cycle to use only one major functional unit
• At the end of a cycle
– store values for use in later cycles (easiest thing to do)
– introduce additional “internal” registers
Five Execution Steps
• Instruction Fetch
• Instruction Decode and Register Fetch
• Execution, Memory Address Computation, or Branch Completion
• Memory Access or R-type instruction completion
• Write-back step
INSTRUCTIONS TAKE FROM 3 - 5 CYCLES!
Load instruction is longest and uses all of the above steps.
Step 1: Instruction Fetch
• Use PC to get instruction and put it in the Instruction Register.
IR <- Memory[PC];
• Increment the PC by 4 and put the result back in the PC. (But what about Branches or Jumps?)
– Sequential Code:
PC <- PC + 4;
– Branch and Jump:
PC <- “something else”;
[Figure: instruction fetch unit. PC (clocked by Clk) supplies the Address to the Instruction Memory, which outputs the 32-bit Instruction Word; Next Address Logic updates PC.]
Step 2: Inst. Decode and Register Fetch
• Read registers rs and rt in case we need them
A <- Reg[IR[25-21]];
B <- Reg[IR[20-16]];
• Compute the branch address in case the instruction is a branch
ALUOut <- PC + (sign-extend(IR[15-0]) << 2);
Note: <<2 is the same as a multiply by 4
Step 3 (instruction dependent)
• ALU is performing one of two functions, based on instruction type.
• Memory Reference:
ALUOut <- A + sign-extend(IR[15-0]);
• R-type:
ALUOut <- A op B;
• Note that in the Basic MIPS (MIPS_Basic.zip) the PC for a branch is calculated in this stage (and not the ID stage as in the previous slide)
– Branch:
if (A==B) PC <- PC + (signext(IR[15-0]) << 2);
Step 4 & 5 (R-type or memory-access)
• Loads and stores access memory
MDR = Memory[ALUOut];
or
Memory[ALUOut] = B;
• R-type instructions finish (write back to register file)
Reg[IR[15-11]] = ALUOut;
The write actually takes place at the end of the cycle, on the clock edge.
Step 5 (The write-back step)
• A load from memory to the register file needs an extra cycle to complete.
Reg[IR[20-16]] = MDR;
Summary
Step name, with the action for R-type instructions, memory-reference instructions, branches and jumps:
• Instruction fetch (all instructions)
IR = Memory[PC]
PC = PC + 4
• Instruction decode/register fetch (all instructions)
A = Reg[IR[25-21]]
B = Reg[IR[20-16]]
ALUOut = PC + (sign-extend(IR[15-0]) << 2)
• Execution, address computation, branch/jump completion
– R-type: ALUOut = A op B
– Memory-reference: ALUOut = A + sign-extend(IR[15-0])
– Branches: if (A == B) then PC = ALUOut
– Jumps: PC = PC[31-28] || (IR[25-0] << 2)
• Memory access or R-type completion
– R-type: Reg[IR[15-11]] = ALUOut
– Load: MDR = Memory[ALUOut]
– Store: Memory[ALUOut] = B
• Memory read completion
– Load: Reg[IR[20-16]] = MDR
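Counting the steps each class needs in the summary gives its cycle count, and a weighted average of those counts gives the CPI of the multi-cycle design. The step counts below follow the summary; the instruction mix is a made-up assumption used only to show the arithmetic.

```python
# Cycles per instruction class, read off the summary of steps.
STEPS = {
    "R-type": 4,   # fetch, decode, execute, R-type completion
    "lw":     5,   # fetch, decode, address, memory access, write-back
    "sw":     4,   # fetch, decode, address, memory access
    "beq":    3,   # fetch, decode, branch completion
    "j":      3,   # fetch, decode, jump completion
}

# Hypothetical instruction mix (fractions of executed instructions).
MIX = {"R-type": 0.5, "lw": 0.25, "sw": 0.125, "beq": 0.0625, "j": 0.0625}

# Average CPI is the mix-weighted sum of the per-class cycle counts.
avg_cpi = sum(STEPS[k] * MIX[k] for k in STEPS)
```

For this assumed mix the average comes out between 4 and 5 cycles, so the multi-cycle design only beats the single-cycle one if its clock cycle is correspondingly shorter.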
How can we reduce the cycle time?
• Cut combinational dependency graph and insert register / latch
• Do same work in two fast cycles, rather than one slow one
[Figure: before, one block of Acyclic Combinational Logic between two storage elements; after, the same logic cut into Acyclic Combinational Logic (A) and Acyclic Combinational Logic (B) with an extra storage element between them.]
=> This is pipelining.
How can we improve instruction throughput?
Ideal speedup is number of stages in the pipeline. Do we achieve this?
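A quick way to see why the ideal speedup is approached but not reached: with k stages and n instructions, an ideal pipeline needs k cycles for the first instruction and one more cycle for each of the rest. The sketch assumes equal-length stages and no hazards or stalls; the function names are illustrative.

```python
def cycles_unpipelined(n, k):
    """n instructions, each taking k one-cycle steps back to back."""
    return n * k

def cycles_pipelined(n, k):
    """k cycles to fill the pipeline, then one instruction completes per cycle."""
    return k + (n - 1)

def speedup(n, k):
    return cycles_unpipelined(n, k) / cycles_pipelined(n, k)

short_run = speedup(5, 5)      # 25 cycles versus 9 cycles
long_run = speedup(1000, 5)    # 5000 cycles versus 1004 cycles
```

For a long run the speedup gets close to the number of stages (5 here) but stays below it, because of the cycles spent filling the pipeline.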
Why Pipeline? Because the resources are there!
[Figure: pipeline diagram. Vertical axis: instruction order (Inst 0 to Inst 4); horizontal axis: time (clock cycles). Each instruction passes through Im, Reg, ALU, Dm, Reg, starting one cycle after the previous instruction.]
But aren’t we using two resources here?
Pipelining
• What makes it easy?
– all instructions are the same length
– just a few instruction formats
– memory operands appear only in loads and stores
• What makes it hard?
– structural hazards: suppose we had only one memory
– control hazards: need to worry about branch instructions
– data hazards: an instruction depends on a previous instruction
• We’ll build a simple pipeline and look at these issues
Basic Idea
What do we need to add to actually split the datapath into stages?
Pipelined Datapath
But which value do we write back? (see next slide)
Corrected Datapath
• The problem with the previous implementation:
– What happens when we write back to the register file? Which instruction supplies the write register value (destination register)?
– Solution: We must forward (preserve) the destination register value.
Graphically Representing Pipelines
• Can help with answering questions like:
– how many cycles does it take to execute this code?
– what is the ALU doing during cycle 4?
– use this representation to help understand datapaths
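The same questions can be answered with a few lines of code that reproduce the diagram. The sketch assumes an ideal five-stage pipeline in which one instruction enters per cycle and nothing stalls; the stage names and function names are chosen here for illustration.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def stage_of(inst_index, cycle):
    """Stage occupied by instruction inst_index (0-based) in clock cycle
    `cycle` (1-based), or None if it is not in the pipeline then."""
    s = cycle - 1 - inst_index
    return STAGES[s] if 0 <= s < len(STAGES) else None

def in_alu(cycle, n_insts):
    """Index of the instruction using the ALU (EX stage) in a given cycle."""
    for i in range(n_insts):
        if stage_of(i, cycle) == "EX":
            return i
    return None

# One row per instruction, one column per clock cycle, for 3 instructions.
diagram = [[stage_of(i, c) or "--" for c in range(1, 8)] for i in range(3)]
```

Under these assumptions the ALU is working on the second instruction (index 1) during cycle 4, and the third instruction reaches WB in cycle 7.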
Load word instruction
• The load word (lw) instruction is the most complicated as it uses all stages of the datapath. Consider:
lw $10, 20($1) # R10 <- Mem[R1+20]
1. Instruction fetch:
2. Instruction Decode:
– Immediate value (20) is sign extended. src & dest. Reg values forwarded.
3. Execution:
– Immed. & src Reg values added to generate address, dest Reg forwarded.
4. Memory:
– Data read from memory.
5. Writeback:
– Memory data is written to register file at dest. Reg location.
Pipeline control
• We have 5 stages. What needs to be controlled in each stage?
– Instruction Fetch and PC Increment
– Instruction Decode / Register Fetch
– Execution
– Memory Stage
– Write Back
![Page 43: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/43.jpg)
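Pipeline Control
• Pass control signals along just like the data
– Execution/Address Calculation stage control lines: RegDst, ALUOp1, ALUOp0, ALUSrc
– Memory access stage control lines: Branch, MemRead, MemWrite
– Write-back stage control lines: RegWrite, MemToReg

| Instruction | RegDst | ALUOp1 | ALUOp0 | ALUSrc | Branch | MemRead | MemWrite | RegWrite | MemToReg |
|---|---|---|---|---|---|---|---|---|---|
| R-format | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| lw | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 1 |
| sw | X | 0 | 0 | 1 | 0 | 0 | 1 | 0 | X |
| beq | X | 0 | 1 | 0 | 1 | 0 | 0 | 0 | X |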
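The control values listed for R-format, lw, sw and beq can be generated by a small combinational decoder. The sketch below is an assumption-laden illustration: the module name main_control is hypothetical, the opcodes are the standard MIPS encodings, don't-care (X) entries are driven to 0, and ALUOp is the 2-bit field from the table rather than the 4-bit ID_ALUOp_Out used in the course code.

```verilog
// Sketch only: main_control is a hypothetical module name.
module main_control(opcode, RegDst, ALUOp, ALUSrc, Branch,
                    MemRead, MemWrite, RegWrite, MemToReg);
    input  [5:0] opcode;
    output       RegDst, ALUSrc, Branch, MemRead, MemWrite, RegWrite, MemToReg;
    output [1:0] ALUOp;

    // Bit order: RegDst, ALUOp1, ALUOp0, ALUSrc, Branch,
    //            MemRead, MemWrite, RegWrite, MemToReg
    reg [8:0] ctl;

    assign {RegDst, ALUOp, ALUSrc, Branch,
            MemRead, MemWrite, RegWrite, MemToReg} = ctl;

    always @(*) begin
        case (opcode)
            6'b000000: ctl = 9'b1_10_0_0_0_0_1_0; // R-format
            6'b100011: ctl = 9'b0_00_1_0_1_0_1_1; // lw
            6'b101011: ctl = 9'b0_00_1_0_0_1_0_0; // sw  (X entries driven to 0)
            6'b000100: ctl = 9'b0_01_0_1_0_0_0_0; // beq (X entries driven to 0)
            default:   ctl = 9'b0_00_0_0_0_0_0_0; // unknown opcode: no writes
        endcase
    end
endmodule
```

Driving every write enable to 0 in the default case means an unrecognised opcode cannot change registers or memory.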
![Page 44: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/44.jpg)
Datapath with Control
![Page 45: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/45.jpg)
Can pipelining get us into trouble?
• Yes: Pipeline Hazards
– structural hazards: attempt to use the same resource in two different ways at the same time
  • Only one memory system, and we want to access data and instruction memory in the same cycle.
– data hazards: attempt to use an item before it is ready
  • instruction depends on the result of a prior instruction still in the pipeline
– control hazards: attempt to make a decision before the condition is evaluated
  • branch instructions
• Can always resolve hazards by waiting
– pipeline control must detect the hazard
– take action (or delay action) to resolve hazards
![Page 46: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/46.jpg)
Consider the code

| PC | OP Code | Instruction |
|---|---|---|
| 16 | 4905 | add R2, R2, #2 |
| 17 | 6D08 | lw R3, #4(R2) |
| 18 | 0D95 | add R3, R3, R1 |
| 19 | 0000 | nop |

R2 should be 1, but it has not been updated at this point.
![Page 47: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/47.jpg)
Single Memory is a Structural Hazard
[Pipeline diagram: Load, Instr 1, Instr 2, Instr 3 and Instr 4 in instruction order against time (clock cycles); each instruction passes through Mem, Reg, ALU, Mem, Reg.]
Trying to perform two reads from the one memory at the same time.
Thus we need 2 separate memories: instruction memory and data memory.
![Page 48: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/48.jpg)
Data Hazards
• Consider R2. Note: dependencies backwards in time are hazards.
sub r2,r1,r3
and r4,r2,r5
or r8,r2,r6
and r9,r4,r2
slt r1,r6,r7
[Pipeline diagram: the five instructions in instruction order against time (clock cycles); each passes through IF (Im), ID/RF (Reg), EX (ALU), MEM (Dm), WB (Reg).]
![Page 49: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/49.jpg)
Data hazards: Forwarding
• Use temporary results, don’t wait for them to be written
– register file forwarding to handle read/write to same register
– ALU forwarding

Program execution order (in instructions):
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)

| Time (in clock cycles) | CC 1 | CC 2 | CC 3 | CC 4 | CC 5 | CC 6 | CC 7 | CC 8 | CC 9 |
|---|---|---|---|---|---|---|---|---|---|
| Value of register $2 | 10 | 10 | 10 | 10 | 10/–20 | –20 | –20 | –20 | –20 |
| Value of EX/MEM | X | X | X | –20 | X | X | X | X | X |
| Value of MEM/WB | X | X | X | X | –20 | X | X | X | X |

[Pipeline diagram: each instruction drawn across the IM, Reg, DM and Reg blocks.]
![Page 50: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/50.jpg)
Data path with forwarding
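The selection logic behind such a datapath can be sketched as a forwarding unit. This is a hedged illustration and not the course implementation: the module name, the port names, and the ForwardA/ForwardB encoding (00 = register file, 10 = EX/MEM, 01 = MEM/WB) follow the convention of the Patterson and Hennessy text and are assumptions here.

```verilog
// Sketch only: forwarding_unit and its port names are assumed, not course code.
module forwarding_unit(ForwardA, ForwardB,
                       ID_EX_rs, ID_EX_rt,
                       EX_MEM_rd, EX_MEM_RegWrite,
                       MEM_WB_rd, MEM_WB_RegWrite);
    output [1:0] ForwardA, ForwardB;   // mux selects for the two ALU operands
    reg    [1:0] ForwardA, ForwardB;
    input  [4:0] ID_EX_rs, ID_EX_rt;   // source registers of the instruction in EX
    input  [4:0] EX_MEM_rd, MEM_WB_rd; // destination registers of older instructions
    input        EX_MEM_RegWrite, MEM_WB_RegWrite;

    always @(*) begin
        // Default: use the values read from the register file.
        ForwardA = 2'b00;
        ForwardB = 2'b00;

        // The newest result (EX/MEM) takes priority over MEM/WB.
        // Register 0 is never forwarded because it always reads as zero.
        if (EX_MEM_RegWrite && EX_MEM_rd != 5'd0 && EX_MEM_rd == ID_EX_rs)
            ForwardA = 2'b10;
        else if (MEM_WB_RegWrite && MEM_WB_rd != 5'd0 && MEM_WB_rd == ID_EX_rs)
            ForwardA = 2'b01;

        if (EX_MEM_RegWrite && EX_MEM_rd != 5'd0 && EX_MEM_rd == ID_EX_rt)
            ForwardB = 2'b10;
        else if (MEM_WB_RegWrite && MEM_WB_rd != 5'd0 && MEM_WB_rd == ID_EX_rt)
            ForwardB = 2'b01;
    end
endmodule
```

In the sub/and/or sequence from the forwarding slide, the and instruction would take $2 from EX/MEM and the or instruction would take it from MEM/WB.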
![Page 51: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/51.jpg)
Can't always forward
• Load word can still cause a hazard:
– an instruction tries to read a register following a load instruction that writes to the same register.
• Thus, we need a hazard detection unit to “stall” the instruction that follows the load
![Page 52: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/52.jpg)
Solution: Stalling
• We can stall the pipeline by keeping an instruction in the same stage
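Keeping an instruction in the same stage amounts to not updating the registers that feed that stage. A minimal sketch, assuming a write-enable input named PCWrite and a synchronous active-high reset (both are assumptions, as is the module name pc_reg_stall):

```verilog
// Sketch only: pc_reg_stall and PCWrite are hypothetical names.
module pc_reg_stall(PC_Out, PC_In, PCWrite, Clk_In, Reset_In);
    output [31:0] PC_Out;
    reg    [31:0] PC_Out;
    input  [31:0] PC_In;     // next PC value
    input         PCWrite;   // 0 = stall (hold the current PC)
    input         Clk_In;
    input         Reset_In;  // assumed synchronous, active high

    always @(posedge Clk_In) begin
        if (Reset_In)
            PC_Out <= 32'd0;
        else if (PCWrite)
            PC_Out <= PC_In;
        // Otherwise PC_Out keeps its value, so the same instruction
        // is fetched again in the next cycle.
    end
endmodule
```

The same enable would typically gate the IF/ID pipeline register, while the control signals entering ID/EX are zeroed to insert a bubble.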
![Page 53: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/53.jpg)
Hazard Detection Unit
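A load-use hazard can be detected by comparing the destination of a load in the EX stage with the source registers of the instruction being decoded. The sketch below assumes hypothetical module and port names and uses the condition from the Patterson and Hennessy text; it is not the course code.

```verilog
// Sketch only: hazard_detect and its port names are hypothetical.
module hazard_detect(Stall, ID_EX_MemRead, ID_EX_rt, IF_ID_rs, IF_ID_rt);
    output       Stall;          // 1 = hold PC and IF/ID, insert a bubble
    input        ID_EX_MemRead;  // instruction in EX is a load
    input  [4:0] ID_EX_rt;       // destination register of that load
    input  [4:0] IF_ID_rs;       // source registers of the instruction in ID
    input  [4:0] IF_ID_rt;

    // Stall when the instruction in ID reads the register the load will write.
    assign Stall = ID_EX_MemRead &&
                   ((ID_EX_rt == IF_ID_rs) || (ID_EX_rt == IF_ID_rt));
endmodule
```

One stall cycle is enough here because the loaded value can then be forwarded from MEM/WB.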
![Page 54: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/54.jpg)
Control Hazards:- Branch Hazards
• Stall: wait until decision is clear
– It's possible to move up the decision to the 2nd stage by adding hardware to check registers as they are being read. How?
• Impact: 2 clock cycles per branch instruction => slow
[Pipeline diagram: Add, Beq and Load in instruction order against time (clock cycles), annotated "Need to stall".]
![Page 55: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/55.jpg)
Control Hazards:- Branch Hazards
• MIPS uses a “branch delay slot”
– the next instruction after a branch is always executed
– rely on compiler to “fill” the slot with something useful
  • Works about 50% of the time. Rest must be NOPs.
[Pipeline diagram: Add, Beq, Misc and Load in instruction order against time (clock cycles); each passes through Mem, Reg, ALU, Mem, Reg.]
![Page 56: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/56.jpg)
Some MIPS Instructions
• Consider the following MIPS instructions (Note: add $2, $1, $3 is $2 <= $1 + $3)
b"00000000000000000000000000000000"; -- nop -- 00000000
b"00000000001000110001000000100000"; -- add $2, $1, $3 -- 00231020
b"00000000001001100010000000100101"; -- or $4, $1, $6 -- 00262025
b"00000000010000110010100000100000"; -- add $5, $2, $3 -- 00432820
b"00010000001000010000000000000100"; -- beq $1, $1, #4 -- 10210004 (jumps 4 instructions)
b"00000000000000000000000000000000"; -- nop -- 00000000
b"00000000010001100010000000100100"; -- and $4, $2, $6 -- 00462024
b"00000000001001010011000000100101"; -- or $6, $1, $5 -- 00253025
b"00000000111001110010000000100000"; -- add $4, $7, $7 -- 00E72020
b"00000000001000100001100000100000"; -- add $3, $1, $2 -- 00221820
b"00000000001001100000100000100101"; -- or $1, $1, $6 -- 00260825
b"00000000010001010001100000100000"; -- add $3, $2, $5 -- 00451820
b"00000000000000000000000000000000"; -- nop -- 00000000
b"00000000000000000000000000000000"; -- nop -- 00000000
b"00000000000000000000000000000000"; -- nop -- 00000000
b"00000000000000000000000000000000"; -- nop -- 00000000
![Page 57: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/57.jpg)
Some MIPS Instructions
• Remember the R format instruction
000000 00001 00011 00010 00000 100000 -- add $2, $1, $3
opcode rs rt rd 0 funct -- add rd, rs, rt
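The field boundaries of the R format can be written out as bit slices. This is an illustrative sketch: the module names rformat_fields and rformat_fields_tb are hypothetical, and the name shamt for the fifth field (shown as 0 on the slide) is the usual MIPS term rather than one used in this deck.

```verilog
// Sketch only: rformat_fields and rformat_fields_tb are hypothetical names.
module rformat_fields(instr, opcode, rs, rt, rd, shamt, funct);
    input  [31:0] instr;
    output [5:0]  opcode, funct;
    output [4:0]  rs, rt, rd, shamt;

    assign opcode = instr[31:26]; // 000000 for R-format
    assign rs     = instr[25:21]; // first source register
    assign rt     = instr[20:16]; // second source register
    assign rd     = instr[15:11]; // destination register
    assign shamt  = instr[10:6];  // shift amount, 0 for add
    assign funct  = instr[5:0];   // 100000 selects add
endmodule

// Small self-checking testbench using add $2, $1, $3 (00231020).
module rformat_fields_tb;
    reg  [31:0] instr;
    wire [5:0]  opcode, funct;
    wire [4:0]  rs, rt, rd, shamt;

    rformat_fields dut(instr, opcode, rs, rt, rd, shamt, funct);

    initial begin
        instr = 32'h00231020;
        #1;
        if (opcode !== 6'd0 || rs !== 5'd1 || rt !== 5'd3 || rd !== 5'd2 ||
            shamt !== 5'd0 || funct !== 6'b100000)
            $display("FAIL: unexpected field values");
        else
            $display("PASS: rs=%0d rt=%0d rd=%0d", rs, rt, rd);
        $finish;
    end
endmodule
```

Note that the assembly operand order (rd, rs, rt) differs from the encoding order (rs, rt, rd).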
![Page 58: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/58.jpg)
Simulator Results
[Simulator screenshot for beq $1, $1, #4 -- 10210004, with the following annotations:]
– The decoded branch instruction (IF section)
– The register file contents (ID section)
– Branch decision made (EX section)
– New PC updated (MEM section)
– Correct instruction decoded (IF section)
– These should not be there
![Page 59: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/59.jpg)
Datapath with Control
The branch decision is made in the MEM stage.
We need to move the branch decision to the ID stage.
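One way to make the decision in the ID stage is to compare the two register values as soon as they are read. A minimal sketch, assuming a hypothetical module name id_branch_compare and a parameterised operand width (the course listings use 16-bit register data):

```verilog
// Sketch only: id_branch_compare is a hypothetical module name.
module id_branch_compare(PCSrc, Branch, ReadData1, ReadData2);
    parameter WIDTH = 32;            // assumed operand width
    output             PCSrc;        // 1 selects the branch target as next PC
    input              Branch;       // control signal: instruction is beq
    input  [WIDTH-1:0] ReadData1;    // register file output for rs
    input  [WIDTH-1:0] ReadData2;    // register file output for rt

    // beq is taken when the two register values are equal.
    assign PCSrc = Branch & (ReadData1 == ReadData2);
endmodule
```

The branch target adder would also have to move into the ID stage, and operands still in flight would need forwarding or a stall before the comparison is valid.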
![Page 60: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/60.jpg)
Let's look at the whole system
![Page 61: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/61.jpg)
Verilog Code
• Instruction Fetch
module Stage_IF(IF_PC4_Out, IF_Instruction_Out, IF_BranchPC_In,
                IF_PCSrc_In, IF_Clk_In, IF_Reset_In);
output [31:0] IF_PC4_Out;
output [31:0] IF_Instruction_Out;
input [31:0] IF_BranchPC_In;
input IF_PCSrc_In;
input IF_Clk_In;
input IF_Reset_In;
Note: The instruction memory is in this module.
Currently, the next address logic implements: PC <- PC+4;
or: PC <- PC+4+branch_offset;
The branch offset is calculated in a later section (ID or EX, depending on version).
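The next address logic described in the note can be sketched as an adder and a 2-to-1 multiplexer. The module name next_pc_sketch is hypothetical, and the sketch assumes the BranchPC input already holds PC+4+branch_offset, as computed by the later stage:

```verilog
// Sketch only: next_pc_sketch is a hypothetical module name.
module next_pc_sketch(NextPC, PC4, PC, BranchPC, PCSrc);
    output [31:0] NextPC;    // value loaded into the PC on the next clock edge
    output [31:0] PC4;       // PC + 4, passed down the pipeline
    input  [31:0] PC;        // current program counter
    input  [31:0] BranchPC;  // assumed to be PC+4+branch_offset from a later stage
    input         PCSrc;     // 1 = take the branch

    assign PC4    = PC + 32'd4;
    assign NextPC = PCSrc ? BranchPC : PC4;
endmodule
```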
![Page 62: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/62.jpg)
Verilog Code (cont)
• Instruction Decode
module Stage_ID(ID_RegWrite_Out, ID_MemToReg_Out, ID_Branch_Out, ID_MemRead_Out,
                ID_MemWrite_Out, ID_RegDst_Out, ID_ALUOp_Out, ID_ALUSrc_Out,
                ID_PC4_Out, ID_ReadData1_Out, ID_ReadData2_Out, ID_Immediate_Out,
                ID_rt_Out, ID_rd_Out, ID_RegWrite_In, ID_PC4_In, ID_Instruction_In,
                ID_WriteRegister_In, ID_WriteData_In, ID_Clk_In, ID_Reset_In);
output ID_RegWrite_Out, ID_MemToReg_Out, ID_Branch_Out, ID_RegDst_Out;
output ID_MemRead_Out, ID_ALUSrc_Out, ID_MemWrite_Out;
output [3:0] ID_ALUOp_Out;
output [31:0] ID_PC4_Out;
output [15:0] ID_ReadData1_Out, ID_ReadData2_Out;
output [31:0] ID_Immediate_Out;
output [4:0] ID_rt_Out, ID_rd_Out;
input [31:0] ID_PC4_In;
input ID_RegWrite_In;
input [31:0] ID_Instruction_In;
input [4:0] ID_WriteRegister_In;
input [15:0] ID_WriteData_In;
input ID_Clk_In;
input ID_Reset_In;
![Page 63: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/63.jpg)
Verilog Code (cont)
• EX section
module Stage_EX(EX_RegWrite_Out, EX_MemToReg_Out, EX_Branch_Out, EX_MemRead_Out,
                EX_MemWrite_Out, EX_BranchPC_Out, EX_Zero_Out, EX_ALUResult_Out,
                EX_ReadData2_Out, EX_WriteRegister_Out, EX_RegWrite_In, EX_MemToReg_In,
                EX_Branch_In, EX_MemRead_In, EX_MemWrite_In, EX_RegDst_In, EX_ALUOp_In,
                EX_ALUSrc_In, EX_PC4_In, EX_ReadData1_In, EX_ReadData2_In,
                EX_Immediate_In, EX_rt_In, EX_rd_In, EX_Clk_In, EX_Reset_In);
output EX_RegWrite_Out, EX_MemToReg_Out, EX_Branch_Out;
output EX_MemRead_Out, EX_MemWrite_Out;
output [31:0] EX_BranchPC_Out;
output EX_Zero_Out;
output [15:0] EX_ALUResult_Out;
output [15:0] EX_ReadData2_Out;
output [4:0] EX_WriteRegister_Out;
input EX_RegWrite_In, EX_MemToReg_In, EX_Branch_In;
input EX_MemRead_In, EX_MemWrite_In;
input [31:0] EX_PC4_In, EX_Immediate_In;
input EX_RegDst_In, EX_ALUSrc_In, EX_Clk_In, EX_Reset_In;
input [3:0] EX_ALUOp_In;
input [15:0] EX_ReadData1_In, EX_ReadData2_In;
input [4:0] EX_rt_In, EX_rd_In;
![Page 64: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/64.jpg)
Verilog Code (cont)
• Data Memory system
module Stage_MEM(MEM_PCSrc_Out, MEM_BranchPC_Out, MEM_RegWrite_Out,
                 MEM_MemToReg_Out, MEM_ReadData_Out, MEM_ALUResult_Out,
                 MEM_WriteRegister_Out, MEM_RegWrite_In, MEM_MemToReg_In,
                 MEM_Branch_In, MEM_MemRead_In, MEM_MemWrite_In, MEM_BranchPC_In,
                 MEM_Zero_In, MEM_ALUResult_In, MEM_WriteData_In,
                 MEM_WriteRegister_In, MEM_Clk_In, MEM_Reset_In);
output MEM_PCSrc_Out, MEM_RegWrite_Out, MEM_MemToReg_Out;
output [31:0] MEM_BranchPC_Out;
output [15:0] MEM_ReadData_Out, MEM_ALUResult_Out;
output [4:0] MEM_WriteRegister_Out;
input MEM_RegWrite_In, MEM_MemToReg_In, MEM_Branch_In, MEM_Zero_In;
input [31:0] MEM_BranchPC_In;
input MEM_MemRead_In, MEM_MemWrite_In, MEM_Clk_In, MEM_Reset_In;
input [15:0] MEM_ALUResult_In, MEM_WriteData_In;
input [4:0] MEM_WriteRegister_In;
![Page 65: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/65.jpg)
Verilog Code (cont)
• Write Back system
module Stage_WB(WB_RegWrite_Out, WB_WriteRegister_Out, WB_WriteData_Out,
                WB_RegWrite_In, WB_MemToReg_In, WB_ReadData_In, WB_ALUResult_In,
                WB_WriteRegister_In);
output WB_RegWrite_Out;
output [4:0] WB_WriteRegister_Out;
output [15:0] WB_WriteData_Out;
input WB_RegWrite_In;
input WB_MemToReg_In;
input [15:0] WB_ReadData_In;
input [15:0] WB_ALUResult_In;
input [4:0] WB_WriteRegister_In;
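The slide gives only the port list. A plausible body, offered as a sketch rather than the actual course code, is a 2-to-1 multiplexer controlled by MemToReg (1 selects memory data, as for lw in the control table). The module is renamed Stage_WB_sketch to mark it as hypothetical.

```verilog
// Sketch only: one possible body for the write-back stage; not the course code.
module Stage_WB_sketch(WB_RegWrite_Out, WB_WriteRegister_Out, WB_WriteData_Out,
                       WB_RegWrite_In, WB_MemToReg_In, WB_ReadData_In,
                       WB_ALUResult_In, WB_WriteRegister_In);
    output        WB_RegWrite_Out;
    output [4:0]  WB_WriteRegister_Out;
    output [15:0] WB_WriteData_Out;
    input         WB_RegWrite_In;
    input         WB_MemToReg_In;
    input  [15:0] WB_ReadData_In;
    input  [15:0] WB_ALUResult_In;
    input  [4:0]  WB_WriteRegister_In;

    // Control and destination register number pass straight through
    // to the register file in the ID stage.
    assign WB_RegWrite_Out      = WB_RegWrite_In;
    assign WB_WriteRegister_Out = WB_WriteRegister_In;

    // MemToReg = 1: write back memory data (lw); 0: write back the ALU result.
    assign WB_WriteData_Out = WB_MemToReg_In ? WB_ReadData_In : WB_ALUResult_In;
endmodule
```

The stage is purely combinational, which is consistent with the port list having no clock or reset input.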
![Page 66: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/66.jpg)
Verilog Code (cont)
• IF/ID pipeline register (#1)
module Reg_IF_ID(PC4_Out, Instruction_Out, PC4_In,
                 Instruction_In, Clk_In, Reset_In);
This simply passes the PC and the instruction from the IF stage to the ID stage.
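A pipeline register of this kind is a bank of flip-flops clocked once per cycle. The sketch below reuses the port list from the slide but is not the course code: the 32-bit widths are taken from the Stage_IF and Stage_ID declarations, the synchronous active-high reset is an assumption, and the module is renamed Reg_IF_ID_sketch to mark it as hypothetical.

```verilog
// Sketch only: one possible body for the IF/ID pipeline register.
module Reg_IF_ID_sketch(PC4_Out, Instruction_Out, PC4_In,
                        Instruction_In, Clk_In, Reset_In);
    output [31:0] PC4_Out, Instruction_Out;
    reg    [31:0] PC4_Out, Instruction_Out;
    input  [31:0] PC4_In, Instruction_In;
    input         Clk_In;
    input         Reset_In;  // assumed synchronous, active high

    always @(posedge Clk_In) begin
        if (Reset_In) begin
            PC4_Out         <= 32'd0;
            Instruction_Out <= 32'd0;  // all zeros decodes as nop
        end else begin
            PC4_Out         <= PC4_In;
            Instruction_Out <= Instruction_In;
        end
    end
endmodule
```

Resetting the instruction field to zero fills the pipeline with nops, so no register or memory write occurs until real instructions arrive.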
![Page 67: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/67.jpg)
Verilog Code (cont)
• Also have
– ID/EX pipeline register (#2)
module Reg_ID_EX(…)
– EX/MEM pipeline register (#3)
module Reg_EX_MEM(…)
– MEM/WB pipeline register (#4)
module Reg_MEM_WB(…)
• See the code for more detail
![Page 68: verilog_case_study](https://reader035.vdocuments.us/reader035/viewer/2022062704/5561f936d8b42a2a488b4cc4/html5/thumbnails/68.jpg)
The End