lecture 13 - university of california,...
![Page 1: Lecture 13 - University of California, Berkeleyinst.eecs.berkeley.edu/~eecs151/sp19/files/lec13-riscv.pdfLecture 13. EE141 Project Introduction ... 31 25 24 20 19 15 14 12 11 7 6 0](https://reader036.vdocuments.us/reader036/viewer/2022070715/5ed82c430fa3e705ec0df8a7/html5/thumbnails/1.jpg)
EE141
EECS 151/251A, Spring 2019
Digital Design and Integrated Circuits
Instructor: John Wawrzynek
Lecture 13

Project Introduction
❑ You will design and optimize a RISC-V processor
❑ Phase 1: Design and demonstrate a processor
❑ Phase 2:
  ▪ ASIC Lab – implement cache memory and generate complete chip layout
  ▪ FPGA Lab – add video display and graphics accelerator
Today we discuss how to design the processor.

What is RISC-V?
• Fifth generation of RISC design from UC Berkeley
• A high-quality, license-free, royalty-free RISC ISA specification
• Experiencing rapid uptake in both industry and academia
• Supported by growing shared software ecosystem
• Appropriate for all levels of computing system, from micro-controllers to supercomputers
  – 32-bit, 64-bit, and 128-bit variants (we're using 32-bit in class, textbook uses 64-bit)
• Standard maintained by non-profit RISC-V Foundation
https://riscv.org/specifications/

Foundation Members (60+)
[Figure: logos of RISC-V Foundation members, grouped as Platinum and as Gold, Silver, Auditors]

Instruction Set Architecture (ISA)
• Job of a CPU (Central Processing Unit, aka Core): execute instructions
• Instructions: CPU's primitive operations
  – Instructions performed one after another in sequence
  – Each instruction does a small amount of work (a tiny part of a larger program).
  – Each instruction has an operation applied to operands,
  – and might be used to change the sequence of instructions.
• CPUs belong to "families," each implementing its own set of instructions
• CPU's particular set of instructions implements an Instruction Set Architecture (ISA)
  – Examples: ARM, Intel x86, MIPS, RISC-V, IBM/Motorola PowerPC (old Mac), Intel IA64, ...
If you need more info on processor organization.

Complete RV32I ISA
[Table: listing of the complete RV32I instruction set]
Not in EECS 151/251A **
* implemented in the ASIC project

Computer Science 61C Spring 2018 Wawrzynek and Weaver
Summary of RISC-V Instruction Formats
[Figure: table of the RISC-V instruction formats, with field boundaries at bits 31, 25, 24, 20, 19, 15, 14, 12, 11, 7, 6, 0]
Binary encoding of machine instructions. Note the common fields.
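
The common fields can be pulled out of a 32-bit instruction word with shifts and masks. The sketch below is illustrative only: the function name `decode_fields` is hypothetical, and the field positions follow the R-format layout of the RV32I specification.

```python
def decode_fields(inst: int) -> dict:
    """Split a 32-bit instruction word into the R-format fields."""
    return {
        "opcode": inst & 0x7F,          # inst[6:0]
        "rd":     (inst >> 7) & 0x1F,   # inst[11:7]
        "funct3": (inst >> 12) & 0x7,   # inst[14:12]
        "rs1":    (inst >> 15) & 0x1F,  # inst[19:15]
        "rs2":    (inst >> 20) & 0x1F,  # inst[24:20]
        "funct7": (inst >> 25) & 0x7F,  # inst[31:25]
    }

# add x1, x2, x3 encodes as 0x003100B3
fields = decode_fields(0x003100B3)
```

Because rd, rs1, and rs2 sit at the same bit positions in every format that uses them, the register file addresses can be wired directly from the instruction without waiting for the opcode to be decoded.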

"State" Required by RV32I ISA
Each instruction reads and updates this state during execution:
• Registers (x0..x31)
  − Register file (or regfile) Reg holds 32 registers x 32 bits/register: Reg[0] .. Reg[31]
  − First register read specified by rs1 field in instruction
  − Second register read specified by rs2 field in instruction
  − Write register (destination) specified by rd field in instruction
  − x0 is always 0 (writes to Reg[0] are ignored)
• Program Counter (PC)
  − Holds address of current instruction
• Memory (MEM)
  − Holds both instructions & data, in one 32-bit byte-addressed memory space
  − We'll use separate memories for instructions (IMEM) and data (DMEM)
    ▪ Later we'll replace these with instruction and data caches
  − Instructions are read (fetched) from instruction memory (assume IMEM read-only)
  − Load/store instructions access data memory
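
A minimal behavioral sketch of the register file described above, assuming a hypothetical class name `RegFile`; it models only the x0 rule and the 32-bit width, not the clocking.

```python
class RegFile:
    """32 registers of 32 bits each; x0 always reads as 0."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, idx: int) -> int:
        return self.regs[idx]

    def write(self, idx: int, value: int) -> None:
        # Writes to Reg[0] are ignored, so x0 stays 0.
        if idx != 0:
            self.regs[idx] = value & 0xFFFFFFFF  # keep 32 bits

rf = RegFile()
rf.write(0, 5)             # ignored
rf.write(5, 0x100000001)   # truncated to 32 bits
```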

RISC-V State Elements
❑ State encodes everything about the execution status of a processor:
  – PC register
  – 32 registers
  – Memory
Note: for these state elements, clock is used for write but not for read (asynchronous read, synchronous write).

RISC-V Microarchitecture Organization
Datapath + Controller + External Memory
[Figure: block diagram of the datapath, the controller, and external memory]
EE141
Microarchitecture
Multiple implementations for a single architecture:
– Single-cycle – Each instruction executes in a single clock cycle.
– Multicycle – Each instruction is broken up into a series of shorter steps with one step per
clock cycle. – Pipelined (variant on “multicycle”)
– Each instruction is broken up into a series of steps with one step per clock cycle
– Multiple instructions execute at once by overlapping in time. – Superscalar
– Multiple functional units to execute multiple instructions at the same time – Out of order...
– Hey, who says we have to follow the program exactly....
11

First Design: One-Instruction-Per-Cycle RISC-V Machine
1. Current state outputs drive the inputs to the combinational logic, whose outputs settle at the values of the next state before the next clock edge
2. At the rising clock edge, all the state elements are updated with the combinational logic outputs, and execution moves to the next clock cycle (next instruction)
[Figure: the state elements pc, IMEM, Reg[], and DMEM connected to a block of combinational logic, all driven by one clock]
On every tick of the clock, the computer executes one instruction

Basic Phases of Instruction Execution
1. Instruction Fetch
2. Decode/Register Read
3. Execute
4. Memory
5. Register Write
[Figure: the five phases laid out left to right along the clock period: PC and +4 adder, IMEM, Reg[] (rs1, rs2, rd) with imm, ALU, DMEM, and a mux feeding the register write]

Implementing the add instruction
add rd, rs1, rs2
• Instruction makes two changes to machine's state:
  Reg[rd] = Reg[rs1] + Reg[rs2]
  PC = PC + 4
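
The two state changes can be written as a pure next-state function. This is a behavioral sketch, not the hardware; `execute_add` and the list-based register model are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF

def execute_add(regs, pc, rd, rs1, rs2):
    """Return the (registers, pc) state after one add instruction."""
    new_regs = list(regs)
    if rd != 0:  # x0 is hardwired to 0
        new_regs[rd] = (regs[rs1] + regs[rs2]) & MASK32  # wrap to 32 bits
    return new_regs, (pc + 4) & MASK32

regs = [0] * 32
regs[2], regs[3] = 7, 5
regs_after, pc_after = execute_add(regs, 1000, rd=1, rs1=2, rs2=3)  # add x1,x2,x3
```

Both updates are computed from the current state and take effect together, which mirrors how the state elements all update on the same clock edge.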

Datapath for add
[Figure: pc feeds IMEM and a +4 adder (pc+4 returns to pc). From inst[31:0], inst[11:7] drives AddrD, inst[19:15] drives AddrA, and inst[24:20] drives AddrB of Reg[]. DataA (Reg[rs1]) and DataB (Reg[rs2]) feed an adder whose output alu returns to DataD. Control logic drives RegWEn (RegWriteEnable: 1=write, 0=no write)]

Timing Diagram for add
[Figure: datapath for add with a timing diagram. Signal values over two consecutive clock cycles:]
Signal     | Cycle 1       | Cycle 2
PC         | 1000          | 1004
PC+4       | 1004          | 1008
inst[31:0] | add x1,x2,x3  | add x6,x7,x9
Reg[rs1]   | Reg[2]        | Reg[7]
Reg[rs2]   | Reg[3]        | Reg[9]
alu        | Reg[2]+Reg[3] | Reg[7]+Reg[9]
Reg[1]     | ???           | Reg[2]+Reg[3]

Implementing the sub instruction
sub rd, rs1, rs2
• Almost the same as add, except now we have to subtract operands instead of adding them
• inst[30] selects between add and subtract
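
A small sketch of how inst[30] can drive the ALU selection for these two instructions. The names `alu_sel_from_inst` and `alu` are hypothetical; the encoding Add=0/Sub=1 follows the ALUSel convention used in the datapath slides.

```python
MASK32 = 0xFFFFFFFF

def alu_sel_from_inst(inst: int) -> int:
    """ALUSel for add/sub: 0 = Add, 1 = Sub, taken from inst[30]."""
    return (inst >> 30) & 1

def alu(a: int, b: int, alu_sel: int) -> int:
    result = a - b if alu_sel else a + b
    return result & MASK32  # two's complement wrap to 32 bits

ADD_X1_X2_X3 = 0x003100B3
SUB_X1_X2_X3 = 0x403100B3  # same word with inst[30] set
difference = alu(5, 7, alu_sel_from_inst(SUB_X1_X2_X3))
```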

Datapath for add/sub
[Figure: the datapath for add with the adder replaced by an ALU. Control logic drives RegWEn (1=write, 0=no write) and ALUSel (Add=0/Sub=1)]

Implementing other R-Format instructions
• All implemented by decoding funct3 and funct7 fields and selecting appropriate ALU function

Implementing the addi instruction
• RISC-V Assembly Instruction: addi x15,x1,-50
Field  | imm=-50      | rs1=1 | ADD | rd=15 | OP-Imm
Bits   | 111111001110 | 00001 | 000 | 01111 | 0010011
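
The bit pattern above can be reproduced by packing the I-format fields. This is a sketch with a hypothetical helper name, `encode_i_type`; the field values are the ones given for addi x15,x1,-50.

```python
def encode_i_type(imm: int, rs1: int, funct3: int, rd: int, opcode: int) -> int:
    """Pack I-format fields into a 32-bit instruction word."""
    return (((imm & 0xFFF) << 20)   # 12-bit two's complement immediate
            | (rs1 << 15)
            | (funct3 << 12)
            | (rd << 7)
            | opcode)

# addi x15, x1, -50
word = encode_i_type(imm=-50, rs1=1, funct3=0b000, rd=15, opcode=0b0010011)
bits = format(word, "032b")
```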

Adding addi to datapath
[Figure: the add/sub datapath extended with an Imm.Gen block (inst[31:20] in, imm[31:0] out) and a BSel mux (inputs 0 and 1) that chooses between Reg[rs2] and imm[31:0] as the second ALU input]
Control settings for addi: ImmSel=I, RegWEn=1, BSel=1, ALUSel=Add

I-Format immediates
[Figure: Imm.Gen with ImmSel=I takes inst[31:20] and produces imm[31:0]; the upper bits are copies of inst[31] (sign-extension) and the low bits are inst[30:20]]
• High 12 bits of instruction (inst[31:20]) copied to low 12 bits of immediate (imm[11:0])
• Immediate is sign-extended by copying value of inst[31] to fill the upper 20 bits of the immediate value (imm[31:12])
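
The two bullets translate directly into a shift, a mask, and a conditional fill. The helper names `imm_i` and `to_signed` are assumptions for this sketch.

```python
def imm_i(inst: int) -> int:
    """Return the 32-bit sign-extended I-format immediate."""
    imm = (inst >> 20) & 0xFFF   # inst[31:20] -> imm[11:0]
    if inst & 0x80000000:        # inst[31] is the sign bit
        imm |= 0xFFFFF000        # fill imm[31:12] with ones
    return imm

def to_signed(value: int) -> int:
    """Interpret a 32-bit pattern as a two's complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value

addi_imm = to_signed(imm_i(0xFCE08793))  # addi x15, x1, -50
lw_imm = to_signed(imm_i(0x00812703))    # lw x14, 8(x2)
```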

The addi datapath also works for all other I-format arithmetic instructions (slti, sltiu, andi, ori, xori, slli, srli, srai) just by changing ALUSel

Implementing Load Word instruction
• RISC-V Assembly Instruction: lw x14, 8(x2)
Field  | imm=+8       | rs1=2 | LW  | rd=14 | LOAD
Bits   | 000000001000 | 00010 | 010 | 01110 | 0000011

Adding lw to datapath
[Figure: the addi datapath extended with DMEM (Addr driven by the ALU output, DataR read out as mem) and a write-back mux selected by WBSel (0 = mem, 1 = alu) that drives wb into DataD. Control signals: ImmSel, RegWEn, BSel, ALUSel, MemRW, WBSel]

Adding lw to datapath: control settings
ImmSel=I, RegWEn=1, BSel=1, ALUSel=Add, MemRW=Read, WBSel=0

All RV32 Load Instructions
[Table: encodings of the RV32 load instructions]
funct3 field encodes size and signedness of load data
• Supporting the narrower loads requires additional circuits to extract the correct byte/halfword from the value loaded from memory, and sign- or zero-extend the result to 32 bits before writing back to register file.
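
A behavioral sketch of that extraction and extension step. The assumptions here are not from the slides: memory returns an aligned 32-bit word, bytes are numbered little-endian within it, and funct3 uses the RV32I load encodings (LB=000, LH=001, LW=010, LBU=100, LHU=101). The name `load_result` is hypothetical.

```python
MASK32 = 0xFFFFFFFF

def load_result(word: int, byte_offset: int, funct3: int) -> int:
    """Select and extend load data from an aligned 32-bit memory word."""
    size = funct3 & 0b011             # 0 = byte, 1 = halfword, 2 = word
    is_unsigned = bool(funct3 & 0b100)
    if size == 2:
        return word & MASK32
    bits = 8 if size == 0 else 16
    value = (word >> (8 * byte_offset)) & ((1 << bits) - 1)
    if not is_unsigned and (value >> (bits - 1)) & 1:
        value |= MASK32 & ~((1 << bits) - 1)  # sign-extend to 32 bits
    return value

word = 0x1234F680
lb = load_result(word, 0, 0b000)   # byte 0x80, sign-extended
lhu = load_result(word, 2, 0b101)  # halfword 0x1234, zero-extended
```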

Implementing Store Word instruction
• RISC-V Assembly Instruction: sw x14, 8(x2)
Field  | offset[11:5]=0 | rs2=14 | rs1=2 | SW  | offset[4:0]=8 | STORE
Bits   | 0000000        | 01110  | 00010 | 010 | 01000         | 0100011
combined 12-bit offset = 8: 0000000 01000
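
The split immediate can be checked by packing the S-format fields. This is a sketch with a hypothetical helper, `encode_s_type`, using the field values listed for sw x14, 8(x2).

```python
def encode_s_type(imm: int, rs2: int, rs1: int, funct3: int, opcode: int) -> int:
    """Pack S-format fields; the 12-bit offset is split across two fields."""
    imm &= 0xFFF
    return ((((imm >> 5) & 0x7F) << 25)  # offset[11:5] -> inst[31:25]
            | (rs2 << 20)
            | (rs1 << 15)
            | (funct3 << 12)
            | ((imm & 0x1F) << 7)        # offset[4:0]  -> inst[11:7]
            | opcode)

# sw x14, 8(x2)
word = encode_s_type(imm=8, rs2=14, rs1=2, funct3=0b010, opcode=0b0100011)
bits = format(word, "032b")
```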

Adding sw to datapath
[Figure: the lw datapath with Reg[rs2] connected to the DataW input of DMEM, and with Imm.Gen now fed by inst[31:7]]
Control settings for sw: ImmSel=S, RegWEn=0, BSel=1, ALUSel=Add, MemRW=Write, WBSel=*
* = "Don't Care"

I & S Immediate Generator
Instruction formats (bit positions in parentheses):
I: imm[11:0] (31:20) | rs1 (19:15) | funct3 (14:12) | rd (11:7) | I-opcode (6:0)
S: imm[11:5] (31:25) | rs2 (24:20) | rs1 (19:15) | funct3 (14:12) | imm[4:0] (11:7) | S-opcode (6:0)
Generated immediate imm[31:0]:
imm[31:11] = inst[31] (sign-extension) | imm[10:5] = inst[30:25] | imm[4:0] = inst[24:20] (I) or inst[11:7] (S)
• Just need a 5-bit mux to select between two positions where low five bits of immediate can reside in instruction
• Other bits in immediate are wired to fixed positions in instruction
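
A sketch of the generator as described: only the low five bits depend on ImmSel, everything else is fixed wiring. The function name `imm_gen` and the use of the strings "I" and "S" for ImmSel are assumptions for illustration.

```python
def imm_gen(inst: int, imm_sel: str) -> int:
    """Build imm[31:0] for I- and S-format instructions."""
    sign = (inst >> 31) & 1
    upper = 0x1FFFFF if sign else 0          # imm[31:11]: copies of inst[31]
    middle = (inst >> 25) & 0x3F             # imm[10:5] = inst[30:25]
    if imm_sel == "S":
        low = (inst >> 7) & 0x1F             # imm[4:0] = inst[11:7]
    else:
        low = (inst >> 20) & 0x1F            # imm[4:0] = inst[24:20]
    return (upper << 11) | (middle << 5) | low

addi_imm = imm_gen(0xFCE08793, "I")  # addi x15, x1, -50
sw_imm = imm_gen(0x00E12423, "S")    # sw x14, 8(x2)
```

The `if` on `imm_sel` is the 5-bit mux; the other two terms never change with the format.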

Implementing Branches
• B-format is mostly same as S-Format, with two register sources (rs1/rs2) and a 12-bit immediate
• But now immediate represents values -4096 to +4094 in 2-byte increments
• The 12 immediate bits encode even 13-bit signed byte offsets (lowest bit of offset is always zero, so no need to store it)
Example: if rs1 = rs2 then pc ← pc + offset
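
The offset range can be checked with a round trip through the immediate bits. The bit placement used below (imm[12] in inst[31], imm[10:5] in inst[30:25], imm[4:1] in inst[11:8], imm[11] in inst[7]) comes from the RISC-V specification rather than from this slide, and both function names are hypothetical.

```python
def encode_b_imm(offset: int) -> int:
    """Place an even byte offset in [-4096, 4094] into the B-format bits."""
    assert offset % 2 == 0 and -4096 <= offset <= 4094
    imm = offset & 0x1FFF                  # 13-bit two's complement
    return (((imm >> 12) & 0x1) << 31      # imm[12]   -> inst[31]
            | ((imm >> 5) & 0x3F) << 25    # imm[10:5] -> inst[30:25]
            | ((imm >> 1) & 0xF) << 8      # imm[4:1]  -> inst[11:8]
            | ((imm >> 11) & 0x1) << 7)    # imm[11]   -> inst[7]

def decode_b_imm(inst: int) -> int:
    """Recover the signed byte offset; bit 0 is implicitly zero."""
    imm = (((inst >> 31) & 0x1) << 12
           | ((inst >> 7) & 0x1) << 11
           | ((inst >> 25) & 0x3F) << 5
           | ((inst >> 8) & 0xF) << 1)
    return imm - (1 << 13) if imm & (1 << 12) else imm

round_trip = [decode_b_imm(encode_b_imm(o)) for o in (-4096, -2, 16, 4094)]
```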

Adding branches to datapath
[Figure: the sw datapath extended with a Branch Comp. block that compares Reg[rs1] and Reg[rs2], an ASel mux that chooses between Reg[rs1] and pc as the first ALU input, and a PCSel mux that chooses between pc+4 and the ALU output as the next pc. Control signals: ImmSel, RegWEn, BrUn, BrEq, BrLT, ASel, BSel, ALUSel, MemRW, WBSel, PCSel]

Adding branches to datapath: control settings
ImmSel=B, RegWEn=0, BrUn/BrEq/BrLT (comparator signals), ASel=1, BSel=1, ALUSel=Add, MemRW=Read, WBSel=*, PCSel=taken/not-taken

Branch Comparator
• BrEq = 1, if A = B
• BrLT = 1, if A < B
• BrUn = 1 selects unsigned comparison for BrLT, 0 = signed
• BGE branch: A >= B, if !(A < B)
[Figure: Branch Comp. block with inputs A, B, and BrUn and outputs BrEq and BrLT]
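
A behavioral sketch of the comparator, treating A and B as 32-bit patterns. The function names are hypothetical; the signal meanings are the ones listed above.

```python
def to_signed(value: int) -> int:
    """Interpret a 32-bit pattern as a two's complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value

def branch_comp(a: int, b: int, br_un: bool):
    """Return (BrEq, BrLT) for 32-bit inputs A and B."""
    if not br_un:
        a, b = to_signed(a), to_signed(b)
    return a == b, a < b

def bge_taken(a: int, b: int, br_un: bool = False) -> bool:
    """BGE is taken when A >= B, that is, when BrLT is 0."""
    _, br_lt = branch_comp(a, b, br_un)
    return not br_lt

signed_result = branch_comp(0xFFFFFFFF, 1, br_un=False)    # -1 vs 1
unsigned_result = branch_comp(0xFFFFFFFF, 1, br_un=True)   # 4294967295 vs 1
```

The same bit pattern compares differently depending on BrUn, which is why the control logic must set it from the instruction (for example, bltu versus blt).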

Implementing JALR Instruction (I-Format)
• JALR rd, rs1, immediate
  − Writes PC+4 to Reg[rd] (return address)
  − Sets PC = Reg[rs1] + immediate
  − Uses same immediates as arithmetic and loads
    ▪ no multiplication by 2 bytes
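
The two effects of JALR as a next-state sketch. `execute_jalr` is a hypothetical name, and the model follows the slide literally; the RISC-V specification additionally clears the lowest bit of the target address, which is left out here.

```python
MASK32 = 0xFFFFFFFF

def execute_jalr(regs, pc, rd, rs1, imm):
    """Return (registers, pc) after jalr rd, rs1, imm (imm is a signed int)."""
    target = (regs[rs1] + imm) & MASK32   # read rs1 before rd is written
    new_regs = list(regs)
    if rd != 0:                           # x0 is hardwired to 0
        new_regs[rd] = (pc + 4) & MASK32  # return address
    return new_regs, target

regs = [0] * 32
regs[1] = 0x2000
regs_after, pc_after = execute_jalr(regs, pc=0x100, rd=1, rs1=1, imm=-8)
```

The example uses the same register for rd and rs1 to show that the target must be computed from the old value of Reg[rs1].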

Adding jalr to datapath
[Figure: the branch datapath with the write-back mux widened to three inputs (0 = mem, 1 = alu, 2 = pc+4) so that the return address can be written to Reg[rd]. Control signals: ImmSel, RegWEn, BrUn, BrEq, BrLT, ASel, BSel, ALUSel, MemRW, WBSel, PCSel]

Adding jalr to datapath: control settings
ImmSel=I, RegWEn=1, BrUn=*, BrEq=*, BrLT=*, ASel=0, BSel=1, ALUSel=Add, MemRW=Read, WBSel=2, PCSel (set so that the ALU output becomes the next pc)

Implementing jal Instruction
• JAL saves PC+4 in Reg[rd] (the return address)
• Set PC = PC + offset (PC-relative jump)
• Target somewhere within ±2^19 locations, 2 bytes apart
  − ±2^18 32-bit instructions
• Immediate encoding optimized similarly to branch instruction to reduce hardware cost
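
The reach figures follow from the width of the immediate. The arithmetic below assumes the 20-bit J-format immediate field of the RV32I specification, which, like the branch immediate, omits the always-zero lowest offset bit.

```python
IMM_BITS = 20  # J-format immediate field width (encodes offset bits 20..1)

# A 21-bit signed, even byte offset.
min_offset = -(1 << IMM_BITS)        # -1048576 bytes
max_offset = (1 << IMM_BITS) - 2     # +1048574 bytes

halfword_reach = (1 << IMM_BITS) // 2     # 2**19 locations, 2 bytes apart
instruction_reach = (1 << IMM_BITS) // 4  # 2**18 32-bit instructions
```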

Adding jal to datapath
[Figure: same datapath as for jalr; no new hardware is shown]
Control settings for jal: ImmSel=J, RegWEn=1, BrUn=*, BrEq=*, BrLT=*, ASel=1, BSel=1, ALUSel=Add, MemRW=Read, WBSel=2, PCSel

Single-Cycle RISC-V RV32I Datapath
[Figure: complete datapath with pc, +4 adder, IMEM, Reg[], Imm.Gen, Branch Comp., ALU, DMEM, and the ASel, BSel, WBSel, and PCSel muxes. Control signals derived from inst[31:0]: ImmSel, RegWEn, BrUn, BrEq, BrLT, ASel, BSel, ALUSel, MemRW, WBSel, PCSel]

Controller Implementation:
❑ Control logic works really well as a case statement...

    always @* begin
      op  = instr[31:26];  // 6-bit opcode (MIPS-style field layout, not RV32I)
      imm = instr[15:0];
      ...
      reg_dst   = 1'bx;    // Don't care
      reg_write = 1'b0;    // Do care, side effecting
      ...
      case (op)
        6'b000000: begin
          reg_write = 1'b1;
          ...
        end
        ...

Processor Pipelining

Review: Processor Performance (The Iron Law)
Program Execution Time = (# instructions)(cycles/instruction)(seconds/cycle)
                       = # instructions x CPI x TC
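
The Iron Law as a one-line function, with made-up numbers to show how CPI and cycle time trade off. The instruction count, CPI values, and cycle times below are hypothetical and are not measurements from the course.

```python
def execution_time(instructions: int, cpi: float, cycle_time_ns: float) -> float:
    """Program execution time in nanoseconds: # instructions x CPI x TC."""
    return instructions * cpi * cycle_time_ns

# Hypothetical comparison: a slow-clock design with CPI = 1
# versus a faster-clock design that pays for hazards with a higher CPI.
single_cycle_ns = execution_time(1_000_000, cpi=1, cycle_time_ns=4)
pipelined_ns = execution_time(1_000_000, cpi=1.25, cycle_time_ns=1)
speedup = single_cycle_ns / pipelined_ns
```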

Single-Cycle Performance
• TC is limited by the critical path (lw)

Single-Cycle Performance
• Single-cycle critical path:
  Tc = tq_PC + tmem + max(tRFread, tsext + tmux) + tALU + tmem + tmux + tRFsetup
• In most implementations, limiting paths are:
  – memory, ALU, register file.
  – Tc = tq_PC + 2tmem + tRFread + tmux + tALU + tRFsetup
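
A worked instance of the two formulas. Every delay value below is hypothetical, chosen only so that the register file read is slower than sign-extend plus mux, which is the condition under which the simplified form equals the full one.

```python
# Hypothetical component delays in picoseconds (not from the lecture).
t_q_pc, t_mem, t_rf_read = 30, 250, 150
t_sext, t_mux, t_alu, t_rf_setup = 35, 25, 200, 20

# Full critical path for lw.
tc_full = (t_q_pc + t_mem + max(t_rf_read, t_sext + t_mux)
           + t_alu + t_mem + t_mux + t_rf_setup)

# Simplified form, valid when t_rf_read >= t_sext + t_mux.
tc_simple = t_q_pc + 2 * t_mem + t_rf_read + t_mux + t_alu + t_rf_setup
```

With these numbers the two memory accesses account for more than half of the cycle, which is why memory is listed first among the limiting paths.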

Pipelined Processor
• Temporal parallelism
• Divide single-cycle processor into 5 stages:
  – Fetch
  – Decode
  – Execute
  – Memory
  – Writeback
• Add pipeline registers between stages

Single-Cycle vs. Pipelined Performance

Single-Cycle and Pipelined Datapath

Corrected Pipelined Datapath
• WriteReg must arrive at the same time as Result
![Page 58: Lecture 13 - University of California, Berkeleyinst.eecs.berkeley.edu/~eecs151/sp19/files/lec13-riscv.pdfLecture 13. EE141 Project Introduction ... 31 25 24 20 19 15 14 12 11 7 6 0](https://reader036.vdocuments.us/reader036/viewer/2022070715/5ed82c430fa3e705ec0df8a7/html5/thumbnails/58.jpg)
EE141
Pipelined Control
Same control unit as single-cycle processor
Control delayed to proper pipeline stage
![Page 59: Lecture 13 - University of California, Berkeleyinst.eecs.berkeley.edu/~eecs151/sp19/files/lec13-riscv.pdfLecture 13. EE141 Project Introduction ... 31 25 24 20 19 15 14 12 11 7 6 0](https://reader036.vdocuments.us/reader036/viewer/2022070715/5ed82c430fa3e705ec0df8a7/html5/thumbnails/59.jpg)
Pipeline Hazards
❑ A hazard occurs when an instruction depends on a result from a previous instruction that hasn’t completed.
❑ Types of hazards:
  – Data hazard: register value not yet written back to the register file
  – Control hazard: next instruction not decided yet (caused by branches)
![Page 60: Lecture 13 - University of California, Berkeleyinst.eecs.berkeley.edu/~eecs151/sp19/files/lec13-riscv.pdfLecture 13. EE141 Project Introduction ... 31 25 24 20 19 15 14 12 11 7 6 0](https://reader036.vdocuments.us/reader036/viewer/2022070715/5ed82c430fa3e705ec0df8a7/html5/thumbnails/60.jpg)
Processor Pipelining
IF1 IF2 ID X1 X2 M1 M2 WB
    IF1 IF2 ID X1 X2 M1 M2 WB
Deeper pipeline example.
Deeper pipelines => less logic per stage => higher clock rate.
But deeper pipelines* => more hazards => more cost and/or higher CPI.
Remember, Performance = Frequencyclk / (# instructions × CPI), so a faster clock only helps if CPI does not rise by as much.
Cycles per instruction might go up because of unresolvable hazards.
How about shorter pipelines? Less cost, less performance.
*Many designs included pipelines as long as 7, 10 and even 20 stages (like in the Intel Pentium 4). The later "Prescott" and "Cedar Mill" Pentium 4 cores (and their Pentium D derivatives) had a 31-stage pipeline.
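The clock-rate versus CPI trade-off can be checked with the standard execution-time relation, time = # instructions × CPI / clock frequency. The sketch below is only illustrative: the function name `execution_time` and every number in it (instruction count, CPI values, clock rates) are assumptions chosen for the example, not figures from the lecture.

```python
def execution_time(num_instructions: int, cpi: float, f_clk_hz: float) -> float:
    """Seconds to run a program: instructions * cycles-per-instruction / clock rate."""
    return num_instructions * cpi / f_clk_hz


# Hypothetical design points; all numbers are made up for illustration.
N = 1_000_000
shallow = execution_time(N, cpi=1.2, f_clk_hz=100e6)  # short pipeline: fewer stalls, slower clock
deep = execution_time(N, cpi=1.6, f_clk_hz=200e6)     # deep pipeline: more stalls, faster clock

print(f"shallow: {shallow * 1e3:.1f} ms, deep: {deep * 1e3:.1f} ms")
```

With these assumed numbers the deeper pipeline still wins, because its clock rate doubles while its CPI grows by only a third. If hazards pushed the CPI past 2.4, the deeper design would be slower.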
![Page 61: Lecture 13 - University of California, Berkeleyinst.eecs.berkeley.edu/~eecs151/sp19/files/lec13-riscv.pdfLecture 13. EE141 Project Introduction ... 31 25 24 20 19 15 14 12 11 7 6 0](https://reader036.vdocuments.us/reader036/viewer/2022070715/5ed82c430fa3e705ec0df8a7/html5/thumbnails/61.jpg)
3-Stage Pipeline
![Page 62: Lecture 13 - University of California, Berkeleyinst.eecs.berkeley.edu/~eecs151/sp19/files/lec13-riscv.pdfLecture 13. EE141 Project Introduction ... 31 25 24 20 19 15 14 12 11 7 6 0](https://reader036.vdocuments.us/reader036/viewer/2022070715/5ed82c430fa3e705ec0df8a7/html5/thumbnails/62.jpg)
3-Stage Pipeline (used for FPGA/ASIC project)
The blocks in the datapath with the greatest delay are: IMEM, ALU, and DMEM. Allocate one pipeline stage to each:
I: Use PC register as address to IMEM and retrieve next instruction. Instruction gets stored in a pipeline register, also called “instruction register”, in this case.
X: Use ALU to compute result, memory address, or branch target address.
M: Access data memory or I/O device for load or store. Allow for setup time for register file write.
Most details you will need to work out for yourself. Some details to follow ... In particular, let’s look at hazards.
![Page 63: Lecture 13 - University of California, Berkeleyinst.eecs.berkeley.edu/~eecs151/sp19/files/lec13-riscv.pdfLecture 13. EE141 Project Introduction ... 31 25 24 20 19 15 14 12 11 7 6 0](https://reader036.vdocuments.us/reader036/viewer/2022070715/5ed82c430fa3e705ec0df8a7/html5/thumbnails/63.jpg)
3-stage Pipeline
Data Hazard
add x5, x3, x4    I X M
add x7, x6, x5      I X M
reg 5 value updated here (end of the first add's M stage); reg 5 value needed here (the second add's X stage)!
The fix: Selectively forward ALU result back to input of ALU.
• Need to add mux at input to ALU, add control logic to sense when to activate. Check reference for details.
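The control logic that senses when to activate the forwarding mux can be sketched as a comparison of register numbers. This is a behavioral sketch in Python rather than project RTL; the function name `forward_select` and its parameters are hypothetical, and it assumes the instruction one stage ahead exposes its destination register and a write-enable flag.

```python
from typing import Optional


def forward_select(rs: int, prev_rd: Optional[int], prev_writes_reg: bool) -> str:
    """Pick the ALU operand source for source register `rs`.

    `prev_rd` and `prev_writes_reg` describe the instruction one stage ahead
    (currently in M). Returns "forward" to take that instruction's ALU result,
    otherwise "regfile".
    """
    # x0 is hardwired to zero in RISC-V, so a write to x0 is never forwarded.
    if prev_writes_reg and prev_rd is not None and prev_rd != 0 and prev_rd == rs:
        return "forward"
    return "regfile"


# add x5, x3, x4 followed by add x7, x6, x5
print(forward_select(rs=6, prev_rd=5, prev_writes_reg=True))  # operand x6
print(forward_select(rs=5, prev_rd=5, prev_writes_reg=True))  # operand x5
```

The same check would be applied independently to each ALU operand, since either source register may match the previous destination.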
![Page 64: Lecture 13 - University of California, Berkeleyinst.eecs.berkeley.edu/~eecs151/sp19/files/lec13-riscv.pdfLecture 13. EE141 Project Introduction ... 31 25 24 20 19 15 14 12 11 7 6 0](https://reader036.vdocuments.us/reader036/viewer/2022070715/5ed82c430fa3e705ec0df8a7/html5/thumbnails/64.jpg)
3-stage Pipeline
Load Hazard
lw x5, offset(x4)    I X M
add x7, x6, x5         I X M
Memory value known here (end of the lw's M stage). It is written into the regfile on this edge. Value needed here (the add's X stage)!
The fix:
lw x5, offset(x4)    I X M
add x7, x6, x5         I nop nop
add x7, x6, x5           I X M
Delay the dependent instruction by one cycle to allow the load to complete, and send the result of the load directly to the ALU (and to the regfile). No delay if not dependent!
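The "no delay if not dependent" condition amounts to a small check on register numbers. The sketch below is a behavioral model, not the project's required implementation; `needs_load_stall` and its parameters are hypothetical names, and the caller is assumed to pass 0 for a source field the instruction does not actually use.

```python
def needs_load_stall(prev_is_load: bool, prev_rd: int, rs1: int, rs2: int) -> bool:
    """True if the instruction about to enter X must wait one cycle for a load.

    `prev_is_load` and `prev_rd` describe the instruction directly ahead.
    Pass 0 for a source register the instruction does not read, since x0
    never creates a dependence.
    """
    return prev_is_load and prev_rd != 0 and prev_rd in (rs1, rs2)


# lw x5, offset(x4) followed by add x7, x6, x5: dependent, so stall one cycle.
print(needs_load_stall(prev_is_load=True, prev_rd=5, rs1=6, rs2=5))
# lw x5, offset(x4) followed by add x7, x6, x3: independent, so no stall.
print(needs_load_stall(prev_is_load=True, prev_rd=5, rs1=6, rs2=3))
```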
![Page 65: Lecture 13 - University of California, Berkeleyinst.eecs.berkeley.edu/~eecs151/sp19/files/lec13-riscv.pdfLecture 13. EE141 Project Introduction ... 31 25 24 20 19 15 14 12 11 7 6 0](https://reader036.vdocuments.us/reader036/viewer/2022070715/5ed82c430fa3e705ec0df8a7/html5/thumbnails/65.jpg)
3-stage Pipeline
Control Hazard
beq x1, x2, L1        I X M
add x5, x3, x4          I X M
add x6, x1, x2            I X M
L1: sub x7, x6, x5          I X
Branch address ready here (end of the beq's X stage) but needed here (I stage of the next instruction)!
The fix: several possibilities:*
1. Always delay fetch of instruction after branch. Simple, but all branches now take 2 cycles (lowers performance).
2. Assume branch “not taken”, continue with instruction at PC+4, and correct later if wrong. Simple, only some branches take 2 cycles (better performance).
3. Predict branch taken or not based on history (state) and correct later if wrong. Complex, very few branches take 2 cycles (best performance).
* MIPS defines “branch delay slot”, RISC-V doesn’t
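The three options can be compared by the average number of cycles a branch costs. The sketch below is a rough model under stated assumptions: a branch costs 1 cycle when the fetch after it was correct and 2 cycles otherwise, `p_taken` is the fraction of branches that are taken, and `p_mispredict` is the history-based predictor's miss rate. The function name, strategy labels, and probabilities are all hypothetical.

```python
def avg_branch_cycles(strategy: str, p_taken: float = 0.0, p_mispredict: float = 0.0) -> float:
    """Average cycles per branch in the 3-stage pipeline for each fix."""
    if strategy == "always_delay":
        return 2.0                   # every branch pays the extra cycle
    if strategy == "predict_not_taken":
        return 1.0 + p_taken         # only taken branches pay it
    if strategy == "history":
        return 1.0 + p_mispredict    # only mispredicted branches pay it
    raise ValueError(f"unknown strategy: {strategy}")


# Illustrative probabilities, not measured data.
for name in ("always_delay", "predict_not_taken", "history"):
    print(name, avg_branch_cycles(name, p_taken=0.5, p_mispredict=0.125))
```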
![Page 66: Lecture 13 - University of California, Berkeleyinst.eecs.berkeley.edu/~eecs151/sp19/files/lec13-riscv.pdfLecture 13. EE141 Project Introduction ... 31 25 24 20 19 15 14 12 11 7 6 0](https://reader036.vdocuments.us/reader036/viewer/2022070715/5ed82c430fa3e705ec0df8a7/html5/thumbnails/66.jpg)
Control Hazard
Predict “not taken”
Not taken:
bne x1, x1, L1        I X M
add x5, x3, x4          I X M
add x6, x1, x2            I X M
L1: sub x7, x6, x5          I X
Taken:
beq x1, x1, L1        I X M
add x5, x3, x4          I nop nop
L1: sub x7, x6, x5        I X M
Branch address ready at end of X stage:
• If branch “not taken”, do nothing.
• If branch “taken”, then kill instruction in I stage (about to enter X stage) and fetch at new target address (PC).
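The kill-and-redirect rule can be modeled as a next-PC decision made at the end of each cycle. This is a behavioral sketch only; `next_fetch`, its parameters, and the example addresses are hypothetical, and `pc` is assumed to be the address of the instruction currently in the I stage.

```python
from typing import Tuple


def next_fetch(pc: int, branch_in_x: bool, branch_taken: bool, target: int) -> Tuple[int, bool]:
    """Return (next_pc, kill) under predict-not-taken.

    kill=True means the instruction currently in I (fetched from `pc`)
    is turned into a nop instead of entering X.
    """
    if branch_in_x and branch_taken:
        return target, True   # wrong-path instruction is squashed, fetch redirected
    return pc + 4, False      # prediction held, or no branch: fall through


# Hypothetical addresses: beq at 0x100, add x5 at 0x104, L1 (sub) at 0x10C.
print(next_fetch(pc=0x104, branch_in_x=True, branch_taken=True, target=0x10C))
print(next_fetch(pc=0x104, branch_in_x=True, branch_taken=False, target=0x10C))
```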
![Page 67: Lecture 13 - University of California, Berkeleyinst.eecs.berkeley.edu/~eecs151/sp19/files/lec13-riscv.pdfLecture 13. EE141 Project Introduction ... 31 25 24 20 19 15 14 12 11 7 6 0](https://reader036.vdocuments.us/reader036/viewer/2022070715/5ed82c430fa3e705ec0df8a7/html5/thumbnails/67.jpg)
EECS151 Project CPU Pipelining Summary
❑ Pipeline rules:
  – Writes/reads to/from DMem are clocked on the leading edge of the clock in the “M” stage
  – Writes to RegFile at the end of the “M” stage
  – Instruction Decode and Register File access is up to you.
❑ Branch: predict “not-taken”
❑ Load: 1 cycle delay/stall on dependent instruction
❑ Bypass ALU for data hazards
❑ More details in upcoming spec
3-stage pipeline: I (instruction fetch), X (execute), M (access data memory)