risc-v cpu datapath, control introcs61c/resources/su18_lec/lecture11.… · review -- sds and...
TRANSCRIPT
RISC-V CPU Datapath, Control IntroInstructor: Steven Ho
Review -- SDS and Sequential Logic
27/09/2018 CS61C Su18 - Lecture 11
Review -- Timing
3
Setup Violation
Clk-to-q + longest CL + setup time <= Clk period
CLK
FF FFCL
Review -- Timing
4
CLK
FF FFCL
Clk-to-q + shortest CL >= hold time
Hold Time Violation
Review -- SDS and Sequential Logic
5
Critical Path = Clk-to-q + longest CL + setup timebetween ANY two registers
7/09/2018 CS61C Su18 - Lecture 11
Review -- Combinational Logic
• Hardware is permanent. Always do everything you might want
• Use MUXes to pick from among input– S input bits selects one of 2S inputs
• Ex: ALU
7/09/2018 CS61C Su18 - Lecture 11 6
Great Idea #1: Levels of Representation & Interpretation
7
lw $t0, 0($2)lw $t1, 4($2)sw $t1, 0($2)sw $t0, 4($2)
Higher-Level LanguageProgram (e.g. C)
Assembly Language Program (e.g. RISC-V)
Machine Language Program (RISC-V)
Hardware Architecture Description(e.g. block diagrams)
Compiler
Assembler
Machine Interpretation
temp = v[k];v[k] = v[k+1];v[k+1] = temp;
0000 1001 1100 0110 1010 1111 0101 10001010 1111 0101 1000 0000 1001 1100 0110 1100 0110 1010 1111 0101 1000 0000 1001 0101 1000 0000 1001 1100 0110 1010 1111
Logic Circuit Description(Circuit Schematic Diagrams)
Architecture Implementation
We are here
7/09/2018 CS61C Su18 - Lecture 11
Agenda
• Datapath Overview
• Assembling the Datapath Part 1
• Administrivia
• Processor Design Process
• Assembling the Datapath Part 2
87/09/2018 CS61C Su18 - Lecture 11
system
datapath control
stateregisters
combinationallogic
multiplexer comparatorcoderegister
s
register logic
switchingnetworks
Hardware Design Hierarchy
9
Today
7/09/2018 CS61C Su18 - Lecture 11
The Processor
• Processor (CPU): Instruction Set Architecture (ISA) implemented directly in hardware– Datapath: part of the processor that contains the
hardware necessary to perform operations required by the processor (“the brawn”)
– Control: part of the processor (also in hardware) which tells the datapath what needs to be done (“the brain”)
107/09/2018 CS61C Su18 - Lecture 11
Executing an Instruction
Very generally, what steps do you take (order matters!) to figure out the effect/result of the next RISC-V instruction?
– Get the instruction add s0,t0,t1– What instruction is it? add– Gather data read R[t0], R[t1]– Perform operation calc R[t0]+R[t1]– Store result save into s0
117/09/2018 CS61C Su18 - Lecture 11
State Required by RV32I ISAEach instruction reads and updates this state during execution:
• Registers (x0..x31)− Register file (or regfile) Reg holds 32 registers x 32 bits/register: Reg[0].. Reg[31]− First register read specified by rs1 field in instruction− Second register read specified by rs2 field in instruction− Write register (destination) specified by rd field in instruction− x0 is always 0 (writes to Reg[0]are ignored)
• Program Counter (PC)− Holds address of current instruction
• Memory (MEM)− Holds both instructions & data, in one 32-bit byte-addressed memory space− We’ll use separate memories for instructions (IMEM) and data (DMEM)
▪ Later we’ll replace these with instruction and data caches
− Instructions are read (fetched) from instruction memory (assume IMEM read-only)− Load/store instructions access data memory
7/09/2018 12CS61C Su18 - Lecture 11
One-Instruction-Per-Cycle RISC-V Machine• On every tick of the clock,
the computer executes one instruction
• Current state outputs drive the inputs to the combinational logic, whose outputs settles at the values of the state before the next clock edge
• At the rising clock edge, all the state elements are updated with the combinational logic outputs, and execution moves to the next clock cycle
7/09/2018 13
Reg[]
pc
IMEM
DMEM
Combinational Logic
clock
CS61C Su18 - Lecture 11
Basic Phases of Instruction Execution
IMEM
+4
rs2
rs1
rd
Reg
[]
ALU
DM
EM
imm
1. InstructionFetch
2. Decode/ Register
Read
3. Execute 4. Memory5. Register Write
PC
7/09/2018 14
mu
x
Clock
time
Why Five Stages?
• Could we have a different number of stages?– Yes, and other architectures do
• So why does RISC-V have five if instructions tend to idle for at least one stage?– The five stages are the union of all the operations
needed by all the instructions
– There is one instruction that uses all five stages: load (lw/lb)
157/09/2018 CS61C Su18 - Lecture 11
Agenda
• Datapath Overview
• Assembling the Datapath Part 1
• Administrivia
• Processor Design Process
• Assembling the Datapath Part 2
167/09/2018 CS61C Su18 - Lecture 11
Implementing the add instruction
add rd, rs1, rs2• Instruction makes two changes to machine’s state:− Reg[rd] = Reg[rs1] + Reg[rs2]− PC = PC + 4
7/09/2018 17CS61C Su18 - Lecture 11
Datapath Walkthroughs (1/3)
• add x3,x1,x2 # r3 = r1+r2
1) IF: fetch this instruction, increment PC
2) ID: decode as addthen read R[1] and R[2]
3) EX: add the two values retrieved in ID
4) MEM: idle (not using memory)
5) WB: write result of EX into R[3]
187/09/2018 CS61C Su18 - Lecture 11
inst
ruct
ion
mem
ory
+4re
gist
ers
ALU
Dat
am
emor
y
imm
21
3
MU
X
R[1] + R[2]
R[2]
R[1]
Example: add InstructionP
C
add x3,x1,x2
197/09/2018 CS61C Su18 - Lecture 11
Control Logic
Datapath for add
7/09/2018 20
+4
pcpc+4
inst[11:7]
inst[19:15]
inst[24:20]
IMEM
inst[31:0] RegWriteEnable(RegWEn)
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD Reg[rs1]
Reg[rs2]+ alu
CS61C Su18 - Lecture 11
Timing Diagram for add
21
1000 1004PC
1004 1008PC+4
add x1,x2,x3 add x6,x7,x9inst[31:0]
Clock
time
+4
pcpc+4inst[11:7]
inst[19:15]
inst[24:20]
IMEM
inst[31:0]
+
RegWEn
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD Reg[rs1]
Reg[rs2]
clock
alu
Reg[2] Reg[7]Reg[rs1]
Reg[2]+Reg[3]alu Reg[7]+Reg[9]
Reg[3] Reg[9]Reg[rs2]
???Reg[1] Reg[2]+Reg[3]
Implementing the sub instruction
sub rd, rs1, rs2• Almost the same as add, except now have to subtract
operands instead of adding them
•inst[30] selects between add and subtract
7/09/2018 22CS61C Su18 - Lecture 11
Control Logic
Datapath for add/sub
7/09/2018 23
+4
pcpc+4
inst[11:7]
inst[19:15]
inst[24:20]
IMEM
inst[31:0] RegWEn(1=write, 0=no write)
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD Reg[rs1]
Reg[rs2]
aluALU
ALUSel(Add=0/Sub=1)
CS61C Su18 - Lecture 11
Implementing other R-Format instructions
• All implemented by decoding funct3 and funct7 fields and selecting appropriate ALU function
7/09/2018 24CS61C Su18 - Lecture 11
Implementing the addi instruction
• RISC-V Assembly Instruction:addi x15,x1,-50
257/09/2018
111111001110 00001 000 01111 0010011
OP-Immrd=15ADDimm=-50 rs1=1
CS61C Su18 - Lecture 11
Control Logic
Datapath for add/sub
7/09/2018 26
+4
pcpc+4
inst[11:7]
inst[19:15]
inst[24:20]
IMEM
inst[31:0] RegWEn(1=write, 0=no write)
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD Reg[rs1]
Reg[rs2]
aluALU
ALUSel(Add=0/Sub=1)
CS61C Su18 - Lecture 11
Control Logic
Adding addi to datapath
7/09/2018 27
+4
pcpc+4
inst[11:7]
inst[19:15]
inst[24:20]
IMEM
inst[31:0]
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD
Reg[rs1]
Reg[rs2]
aluALU
ALUSel=Add
Imm.Gen
0
1
RegWEn=1
inst[31:20]imm[31:0]
ImmSel=I BSel=1
CS61C Su18 - Lecture 11
I-Format immediates
7/09/2018 28
inst[31:0]
------inst[31]-(sign-extension)------- inst[30:20]
imm[31:0]
Imm.Gen
inst[31:20] imm[31:0]
ImmSel=I
• High 12 bits of instruction (inst[31:20]) copied to low 12 bits of immediate (imm[11:0])
• Immediate is sign-extended by copying value of inst[31] to fill the upper 20 bits of the immediate value (imm[31:12])
CS61C Su18 - Lecture 11
Control Logic
Adding addi to datapath
7/09/2018 29
+4
pcpc+4
inst[11:7]
inst[19:15]
inst[24:20]
IMEM
inst[31:0]
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD
Reg[rs1]
Reg[rs2]
aluALU
ALUSel=Add
Imm.Gen
0
1
RegWEn=1
inst[31:20]imm[31:0]
ImmSel=I BSel=1
Also works for all other I-format arithmetic instruction (slti,sltiu,andi,ori,xori,slli,srli,srai) just by changing ALUSel
CS61C Su18 - Lecture 11
sltisltiuandiorixori
Agenda
• Datapath Overview
• Assembling the Datapath Part 1
• Administrivia
• Processor Design Process
• Assembling the Datapath Part 2
307/09/2018 CS61C Su18 - Lecture 11
Administrivia• Project 2-2 due Friday (7/13)− Project Party tonight 4-6p in Soda 405 and 411!
• Homework 4 released; 3/4 due Monday (7/16)
• Guerilla Session on Wednesday, 4-6p, Cory 540AB
− SDS, FSM, and Single-Cycle Datapath
• Project 3 will be released Thursday
− Project party on 7/13, 4-6p (intended for students to get help starting on proj3/finishing lab 6)
− Project party 7/20, 4-6p to finish up project 3
• Homework 2 grades are on glookup
− Hw0/Hw1 not yet updated, but if you still didn’t get credit for hw2, please change your emails to match!
7/09/2018 31CS61C Su18 - Lecture 11
Agenda
• Datapath Overview
• Assembling the Datapath Part 1
• Administrivia
• Processor Design Process
• Assembling the Datapath Part 2
327/09/2018 CS61C Su18 - Lecture 11
Processor Design Process• Five steps to design a processor:
1. Analyze instruction set → datapath requirements
2. Select set of datapath components & establish clock methodology
3. Assemble datapath components to meet the requirements
4. Analyze implementation of each instruction to determine setting of control points that affect the register transfer
5. Assemble the control logic• Formulate Logic Equations• Design Circuits
33
Control
Datapath
Memory
Processor
Input
OutputNo
w
7/09/2018 CS61C Su18 - Lecture 11
Step 1: Requirements of the Instruction Set
• Memory (MEM)– Instructions & data (separate: in reality just caches)– Load from and store to
• Registers (32 32-bit regs)– Read rs1 and rs2– Write rd
• PC– Add 4 (+ maybe extended immediate)
• Add/Sub/OR unit for operation on register(s) or extended immediate– Compare if registers equal?
347/09/2018 CS61C Su18 - Lecture 11
Storage Element: Idealized Memory• Memory (idealized)
– One input bus: Data In– One output bus: Data Out
• Memory access:– Read: Write Enable = 0, data at Address is placed on
Data Out– Write: Write Enable = 1, Data In written to Address
• Clock input (CLK) – CLK input is a factor ONLY during write operation– During read, behaves as a combinational logic block:
Address valid → Data Out valid after “access time”
35
CLK
Data In
Write Enable
32 32
DataOut
Address
7/09/2018 CS61C Su18 - Lecture 11
Storage Element: Register
• Similar to D flip-flop except:– N-bit input and output buses
– Write Enable input
• Write Enable:– De-asserted (0): Data Out will not change
– Asserted (1): Data In value placed onto Data Out after CLK trigger
36
CLK
Data In
Write Enable
N N
Data Out
7/09/2018 CS61C Su18 - Lecture 11
Storage Element: Register File
• Register File consists of 32 registers:– Output buses busA and busB– Input bus busW
• Register selection– Place data of register RA (number) onto busA– Place data of register RB (number) onto busB– Store data on busW into register RW (number) when Write
Enable is 1
• Clock input (CLK) – CLK input is a factor ONLY during write operation– During read, behaves as a combinational logic block:
RA or RB valid → busA or busB valid after “access time”37
Clk
busW
Write Enable
3232
busA
32busB
5 5 5RW RA RB
32 x 32-bitRegisters
7/09/2018 CS61C Su18 - Lecture 11
Step 2: CPU Clocking• For each instruction, how do we control the flow of
information through the datapath?
• Single Cycle CPU: All stages of an instruction
completed within one long clock cycle
– Clock cycle sufficiently long to allow each instruction to
complete all stages without interruption within one cycle
38
Step 3: Assembling the Datapath• Assemble datapath to meet ISA requirements
– Exact requirements will change based on ISA– Here we must examine each instruction of RISC
• The datapath is all of the hardware components and wiring necessary to carry out ALL of the different instructions– Make sure all components (e.g. RegFile, ALU) have
access to all necessary signals and buses– Control will make sure instructions are properly
executed (the decision making)
397/09/2018 CS61C Su18 - Lecture 11
Agenda
• Datapath Overview
• Assembling the Datapath Part 1
• Administrivia
• Processor Design Process
• Assembling the Datapath Part 2
407/09/2018 CS61C Su18 - Lecture 11
Implementing Load Word instruction
• RISC-V Assembly Instruction:lw x14, 8(x2)
417/09/2018
000000001000 00010 010 01110 0000011
LOADrd=14LWimm=+8 rs1=2
CS61C Su18 - Lecture 11
Control Logic
Adding addi to datapath
7/09/2018 42
+4
pcpc+4
inst[19:15]
inst[24:20]
IMEM
inst[31:0]
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD
Reg[rs1]
Reg[rs2]
aluALU
ALUSel=Add
Imm.Gen
0
1
RegWEn=1
imm[31:0]
ImmSel=I BSel=1
CS61C Su18 - Lecture 11
inst[11:7]
inst[31:20]
Adding lw to datapath
7/09/2018 43
IMEM
ALU
Imm.Gen
+4
DMEM
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD
AddrDataR 0
1
pc0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:20]
ALU
mem
wb
imm[31:0]
inst[31:0] ImmSel RegWEn BSel ALUSel MemRW WBSel
wb
CS61C Su18 - Lecture 11
pc+4Reg[rs1]
Reg[rs2]
Adding lw to datapath
7/09/2018 44
IMEM
ALU
Imm.Gen
+4
DMEM
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD
AddrDataR 0
1
pc0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:20]
alu
mem
wb
pc+4
Reg[rs1]
imm[31:0]
Reg[rs2]
inst[31:0] ImmSel=I RegWEn=1 Bsel=1 ALUSel=Add MemRW=Read WBSel=0
wb
CS61C Su18 - Lecture 11
All RV32 Load Instructions
• Supporting the narrower loads requires additional circuits to extract the correct byte/halfword from the value loaded from memory, and sign- or zero-extend the result to 32 bits before writing back to register file.
45
funct3 field encodes size and signedness of load data
CS61C Su18 - Lecture 117/09/2018
Implementing Store Word instruction
• RISC-V Assembly Instruction:sw x14, 8(x2)
467/09/2018
0000000 01110 00010 010 01000 0100011
STOREoffset[4:0]=8
SWoffset[11:5]=0
rs2=14 rs1=2
combined 12-bit offset = 80000000 01000
Adding lw to datapath
7/09/2018 47
IMEM
ALU
Imm.Gen
+4
DMEM
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD
AddrDataR 0
1
pc0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:20]
ALU
mem
wb
Reg[rs1]
imm[31:0]
Reg[rs2]
inst[31:0] ImmSel RegWEn BSel ALUSel MemRW WBSel
wb
CS61C Su18 - Lecture 11
pc+4
Adding sw to datapath
7/09/2018 48
IMEM
ALU
Imm.Gen
+4
DMEM
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD
Addr
DataW
DataR 0
1
pc0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:7]
mem
wbpc+4
imm[31:0]
Reg[rs2]
inst[31:0] ImmSel RegWEn Bsel ALUSel MemRW WBSel=
wb
CS61C Su18 - Lecture 11
Reg[rs1]
ALU
Adding sw to datapath
7/09/2018 49
IMEM
ALU
Imm.Gen
+4
DMEM
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD
Addr
DataW
DataR 0
1
pc0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:7]
mem
wbpc+4
Reg[rs1]
imm[31:0]
Reg[rs2]
inst[31:0] ImmSel=S RegWEn=0 Bsel=1 ALUSel=Add MemRW=Write WBSel=*
wb
*= “Don’t Care”
CS61C Su18 - Lecture 11
ALU
I-Format immediates
7/09/2018 50
inst[31:0]
------inst[31]-(sign-extension)------- inst[30:20]
imm[31:0]
Imm.Gen
inst[31:20] imm[31:0]
ImmSel=I
• High 12 bits of instruction (inst[31:20]) copied to low 12 bits of immediate (imm[11:0])
• Immediate is sign-extended by copying value of inst[31] to fill the upper 20 bits of the immediate value (imm[31:12])
CS61C Su18 - Lecture 11
I & S Immediate Generator
7/09/2018 51
imm[11:5] rs2 rs1 funct3 imm[4:0] S-opcode
imm[11:0] rs1 funct3 rd I-opcode
inst[31](sign-extension) inst[30:25]
imm[31:0]
inst[31:0]
inst[24:20]
SI
inst[31](sign-extension) inst[30:25] inst[11:7]
067111214151920242531
045101131
1 65
5
S
I
• Just need a 5-bit mux to select between two positions where low five bits of immediate can reside in instruction
• Other bits in immediate are wired to fixed positions in instruction
Implementing Branches
• B-format is mostly same as S-Format, with two register sources (rs1/rs2) and a 12-bit immediate
• But now immediate represents values -4096 to +4094 in 2-byte increments
• The 12 immediate bits encode even 13-bit signed byte offsets (lowest bit of offset is always zero, so no need to store it)
52CS61C Su18 - Lecture 117/09/2018
Adding sw to datapath
7/09/2018 53
IMEM
ALU
Imm.Gen
+4
DMEM
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD
Addr
DataW
DataR 0
1pc0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:7]
mem
wbpc+4
Reg[rs1]
imm[31:0]
Reg[rs2]
inst[31:0] ImmSel RegWEn Bsel ALUSel MemRW WBSel=
wb
CS61C Su18 - Lecture 11
ALU
Adding branches to datapath
7/09/2018 54
IMEM
ALU
Imm.Gen
+4
DMEM
Branch Comp.
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD
Addr
DataW
DataR
1
0
0
1
1
0pc
0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:7]
ALU
mem
wb
alu
pc+4
Reg[rs1]
pc
imm[31:0]
Reg[rs2]
inst[31:0] ImmSel RegWEn BrUn BrEq BrLT ASelBSel ALUSel MemRW WBSelPCSel
wb
CS61C Su18 - Lecture 11
Adding branches to datapath
7/09/2018 55
IMEM
ALU
Imm.Gen
+4
DMEM
Branch Comp.
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD
Addr
DataW
DataR
1
0
0
1
1
0pc
0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:7]
alu
mem
wb
alu
pc+4
Reg[rs1]
pc
imm[31:0]
Reg[rs2]
inst[31:0] ImmSel=B RegWEn=0 BrUn BrEq BrLT ASel=1Bsel=1
ALUSel=Add
MemRW=Read WBSel=*PCSel=taken/not-taken
wb
CS61C Su18 - Lecture 11
Branch Comparator
• BrEq = 1, if A=B
• BrLT = 1, if A < B
• BrUn =1 selects unsigned comparison for BrLT, 0=signed
• BGE branch: A >= B, if !(A<B)
7/09/2018 56
Branch Comp.
A
B
BrUn BrEq BrLT
CS61C Su18 - Lecture 11
Break!
7/09/2018 57CS61C Su18 - Lecture 11
Multiply Branch Immediates by Shift?
• 12-bit immediate encodes PC-relative offset of -4096 to +4094 bytes in multiples of 2 bytes
• Standard approach: treat immediate as in range -2048..+2047, then shift left by 1 bit to multiply by 2 for branches
7/09/2018 58
s rs2 rs1 funct3 imm[4:0] B-opcodeimm[10:5]
s imm[10:5] imm[4:0]
s imm[10:5] imm[4:0] 0
sign-extension
sign-extension
S-Immediate
B-Immediate (shift left by 1)
Each instruction immediate bit can appear in one of two places in output immediate value – so need one 2-way mux per bit
CS61C Su18 - Lecture 11
RISC-V Branch Immediates
• 12-bit immediate encodes PC-relative offset of -4096 to +4094 bytes in multiples of 2 bytes
• RISC-V approach: keep 11 immediate bits in fixed position in output value, and rotate LSB of S-format to be bit 12 of B-format
7/09/2018 59
sign=imm[11] imm[10:5] imm[4:0]
sign=imm[12] imm[10:5] imm[4:1] 0
S-Immediate
B-Immediate (shift left by 1)
Only one bit changes position between S and B, so only need a single-bit 2-way mux
imm[11]
CS61C Su18 - Lecture 11
RISC-V Immediate Encoding
60
Instruction Encodings, inst[31:0]
32-bit immediates produced, imm[31:0]
Only bit 7 of instruction changes role in immediate between S and BUpper bits sign-extended from inst[31] always
Implementing JALR Instruction (I-Format)
• JALR rd, rs, immediate− Writes PC+4 to Reg[rd] (return address)− Sets PC = Reg[rs1] + immediate− Uses same immediates as arithmetic and loads
▪ no multiplication by 2 bytes
61CS61C Su18 - Lecture 117/09/2018
Adding branches to datapath
7/09/2018 62
IMEM
ALU
Imm.Gen
+4
DMEM
Branch Comp.
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD
Addr
DataW
DataR
1
0
0
1
1
0pc
0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:7]
alu
mem
wb
alu
pc+4
Reg[rs1]
pc
imm[31:0]
Reg[rs2]
inst[31:0] ImmSel RegWEn BrUn BrEq BrLT ASelBSel ALUSel MemRW WBSelPCSel
wb
CS61C Su18 - Lecture 11
Adding jalr to datapath
7/09/2018 63
IMEM
ALU
Imm.Gen
+4
DMEM
Branch Comp.
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD
Addr
DataW
DataR
1
0
0
1
21
0pc
0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:7]
pc+4
alu
mem
wb
alu
pc+4
Reg[rs1]
pc
imm[31:0]
Reg[rs2]
inst[31:0] ImmSel RegWEn BrUn BrEq BrLT ASelBSel ALUSel MemRW WBSelPCSel
wb
CS61C Su18 - Lecture 11
Adding jalr to datapath
7/09/2018 64
IMEM
ALU
Imm.Gen
+4
DMEM
Branch Comp.
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD
Addr
DataW
DataR
1
0
0
1
21
0pc
0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:7]
pc+4alu
mem
wb
alu
pc+4
Reg[rs1]
pc
imm[31:0]
Reg[rs2]
inst[31:0] ImmSel=B RegWEn=1
BrUn=* BrEq=* BrLT=*
Asel=0Bsel=1
ALUSel=Add
MemRW=ReadWBSel=2
PCSel
wb
CS61C Su18 - Lecture 11
Implementing jal Instruction
• JAL saves PC+4 in Reg[rd] (the return address)• Set PC = PC + offset (PC-relative jump)• Target somewhere within ±219 locations, 2 bytes apart
− ±218 32-bit instructions
• Immediate encoding optimized similarly to branch instruction to reduce hardware cost
65CS61C Su18 - Lecture 117/09/2018
Adding jal to datapath
7/09/2018 66
IMEM
ALU
Imm.Gen
+4
DMEM
Branch Comp.
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD
Addr
DataW
DataR
1
0
0
1
21
0pc
0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:7]
pc+4
alu
mem
wb
alu
pc+4
Reg[rs1]
pc
imm[31:0]
Reg[rs2]
inst[31:0] ImmSel RegWEn BrUn BrEq BrLT ASelBSel ALUSel MemRW WBSelPCSel
wb
CS61C Su18 - Lecture 11
Adding jal to datapath
7/09/2018 67
IMEM
ALU
Imm.Gen
+4
DMEM
Branch Comp.
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD
Addr
DataW
DataR
1
0
0
1
21
0pc
0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:7]
pc+4alu
mem
wb
alu
pc+4
Reg[rs1]
pc
imm[31:0]
Reg[rs2]
inst[31:0] ImmSel=J RegWEn=1
BrUn=* BrEq=* BrLT=*
Asel=1Bsel=1
ALUSel=Add
MemRW=ReadWBSel=2
PCSel
wb
CS61C Su18 - Lecture 11
Single-Cycle RISC-V RV32I Datapath
7/09/2018 68
IMEM
ALU
Imm.Gen
+4
DMEM
Branch Comp.
Reg[]
AddrA
AddrB
DataA
AddrD
DataB
DataD
Addr
DataW
DataR
1
0
0
1
21
0pc
0
1
inst[11:7]
inst[19:15]
inst[24:20]
inst[31:7]
pc+4
alu
mem
wb
alu
pc+4
Reg[rs1]
pc
imm[31:0]
Reg[rs2]
inst[31:0] ImmSel RegWEn BrUn BrEq BrLT ASelBSel ALUSel MemRW WBSelPCSel
wb
CS61C Su18 - Lecture 11
jalIF,EX,WB
(B)lwIF,ID,EX,MEM,WB
(A) beqIF,ID,EX
(C) addIF,ID,EX,WB
(D)
69
Question: Which of the following RISC-V instructionsis active in the most stages?
3. Execute
PC
inst
ruct
ion
mem
ory
+4
RegisterFilers2
rs1rd
ALU
Dat
am
emo
ry
imm
MU
X
1. InstructionFetch
2. Decode/ Register Read
3. Execute 4. Memory 5. Register Write
Processor Design Process
• Five steps to design a processor:1. Analyze instruction set →
datapath requirements2. Select set of datapath
components & establish clock methodology
3. Assemble datapath meeting the requirements
4. Analyze implementation of each instruction to determine setting of control points that affect the register transfer
5. Assemble the control logic• Formulate Logic Equations• Design Circuits
70
Control
Datapath
Memory
Processor
Input
Output
No
w
7/09/2018 CS61C Su18 - Lecture 11
Control
• Need to make sure that correct parts of the datapath are being used for each instruction– Have seen control signals in datapath used to
select inputs and operations
– For now, focus on what value each control signal should be for each instruction in the ISA
• Next lecture, we will see how to implement the proper combinational logic to implement the control
717/09/2018 CS61C Su18 - Lecture 11
Summary (1/2)
• Five steps to design a processor:1) Analyze instruction set →
datapath requirements2) Select set of datapath
components & establish clock methodology
3) Assemble datapath meeting the requirements
4) Analyze implementation of each instruction to determine setting of control points that effects the register transfer
5) Assemble the control logic• Formulate Logic Equations• Design Circuits
72
Control
Datapath
Memory
Processor
Input
Output
7/09/2018 CS61C Su18 - Lecture 11
Summary (2/2)
• Determining control signals– Any time a datapath element has an input that
changes behavior, it requires a control signal (e.g. ALU operation, read/write)
– Any time you need to pass a different input based on the instruction, add a MUX with a control signal as the selector(e.g. next PC, ALU input, register to write to)
• Your control signals will change based on your exact datapath
• Your datapath will change based on your ISA737/09/2018 CS61C Su18 - Lecture 11
And in Conclusion, …• Universal datapath− Capable of executing all RISC-V instructions in one cycle each− Not all units (hardware) used by all instructions
• 5 Phases of execution− IF, ID, EX, MEM, WB− Not all instructions are active in all phases
• Controller specifies how to execute instructions− what new instructions can be added with just most control?
7/09/2018 74CS61C Su18 - Lecture 11