putting it all together:a single cycle datapath
Post on 23-Jan-2016
31 Views
Preview:
DESCRIPTION
TRANSCRIPT
CS 61C L5.2.1 CPU Design III (1) K. Meinz, Summer 2004 © UCB
CS61C : Machine Structures
Lecture 5.2.1
CPU Design III: Control
2004-07-21
Kurt Meinz
inst.eecs.berkeley.edu/~cs61c
CS 61C L5.2.1 CPU Design III (2) K. Meinz, Summer 2004 © UCB
Putting it All Together:A Single Cycle Datapath
imm
16
32
ALUctr
Clk
busW
RegWr
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst
Exten
der
Mu
x
3216imm16
ALUSrcExtOp
Mu
x
MemtoReg
Clk
Data InWrEn32 Adr
DataMemory
MemWrA
LU
Equal
Instruction<31:0>
0
1
0
1
01
<21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRtRs
=
Ad
der
Ad
der
PC
Clk
00Mu
x
1
nPC_sel
PC
Ext
Adr
InstMemory
CS 61C L5.2.1 CPU Design III (3) K. Meinz, Summer 2004 © UCB
Recall: Clocking Methodology
• All storage elements are clocked by the same clock edge
•Cycle Time = CLK-to-Q + Longest Delay Path + Setup + Clock Skew
Clk
Don’t Care
Setup Hold
.
.
.
.
.
.
.
.
.
.
.
.
Setup Hold
CS 61C L5.2.1 CPU Design III (4) K. Meinz, Summer 2004 © UCB
Clocking Methodology
• Storage elements clocked by same edge• Being physical devices, flip-flops (FF) and combinational logic have some delays •Gates: delay from input change to output change •Signals at FF D input must be stable before active clock edge to allow signal to travel within the FF, and we have the usual clock-to-Q delay
• “Critical path” (longest path through logic) determines length of clock period
Clk
.
.
.
.
.
.
.
.
.
.
.
.
CS 61C L5.2.1 CPU Design III (5) K. Meinz, Summer 2004 © UCB
An Abstract View of the Critical Path Critical Path (Load Operation) =
Delay clock through PC (FFs) + Instruction Memory’s Access Time + Register File’s Access Time + ALU to Perform a 32-bit Add + Data Memory Access Time + Stable Time for Register File Write
Clk
5
Rw Ra Rb32 32-bitRegisters
RdA
LU
Clk
Data In
DataAddress Ideal
DataMemory
Instruction
InstructionAddress
IdealInstruction
Memory
Clk
PC
5Rs
5Rt
16Imm
32
323232
A
B
Nex
t A
dd
ress
• This affects how much you can overclock your PC!
CS 61C L5.2.1 CPU Design III (6) K. Meinz, Summer 2004 © UCB
How to Design a Processor: step-by-step• 1. Analyze instruction set architecture (ISA)
=> datapath requirements•meaning of each instruction is given by the register transfers• datapath must include storage element for ISA registers• datapath must support each register transfer
• 2. Select set of datapath components and establish clocking methodology
• 3. Assemble datapath meeting requirements• 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.
• 5. Assemble the control logic (hard part!)
CS 61C L5.2.1 CPU Design III (7) K. Meinz, Summer 2004 © UCB
A Single Cycle Datapath• Rs, Rt, Rd, Imed16 connected to datapath
• We have everything except control signals
32
ALUctr
Clk
busW
RegWr
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst
Exten
der
Mu
x
Mux
3216imm16
ALUSrc
ExtOp
Mu
x
MemtoReg
Clk
Data InWrEn
32Adr
DataMemory
32
MemWrA
LU
InstructionFetch Unit
Clk
Zero
Instruction<31:0>
0
1
0
1
01<
21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
nPC_sel
CS 61C L5.2.1 CPU Design III (8) K. Meinz, Summer 2004 © UCB
Meaning of the Control Signals: IF• nPC_MUX_sel: 0 PC <– PC + 4
1 PC <– PC + 4 + {SignExt(Im16) , 00 }
Adr
InstMemory
Ad
der
Ad
der
PC
Clk
00Mu
x
1
nPC_MUX_sel
PC
Ext
imm
16
“n”=next
CS 61C L5.2.1 CPU Design III (9) K. Meinz, Summer 2004 © UCB
Meaning of the Control Signals• ExtOp: “zero”, “sign”
• ALUsrc: 0 regB; 1 immed
• ALUctr: “add”, “sub”, “or”
° MemWr: 1 write memory° MemtoReg: 0 ALU; 1 Mem° RegDst: 0 “rt”; 1 “rd”° RegWr: 1 write register
32
ALUctr
Clk
busW
RegWr
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst
Exten
der
Mu
x
3216imm16
ALUSrcExtOp
Mu
x
MemtoReg
Clk
Data InWrEn32 Adr
DataMemory
MemWr
AL
U
Equal
0
1
0
1
01
=
CS 61C L5.2.1 CPU Design III (10) K. Meinz, Summer 2004 © UCB
RTL: The Add Instruction
add rd, rs, rt•MEM[PC] Fetch the instruction
from memory
•R[rd] = R[rs] + R[rt] The actual operation
•PC = PC + 4 Calculate the next instruction’s address
op rs rt rd shamt funct
061116212631
6 bits 6 bits5 bits5 bits5 bits5 bits
CS 61C L5.2.1 CPU Design III (11) K. Meinz, Summer 2004 © UCB
Instruction Fetch Unit at the Beginning of Add• Fetch the instruction from Instruction memory: Instruction = MEM[PC]• same for all instructions
PC
Ext
Adr
InstMemory
Ad
der
Ad
der
PC
Clk
00Mu
x
1
nPC_MUX_sel
imm
16Instruction<31:0>
CS 61C L5.2.1 CPU Design III (12) K. Meinz, Summer 2004 © UCB
The Single Cycle Datapath during Add
32
ALUctr = Add
Clk
busW
RegWr = 1
32
32
busA
32
busB
55 5
Rw Ra Rb
32 32-bitRegisters
Rs
Rt
Rt
RdRegDst = 1
Exten
der
Mu
x
Mux
3216imm16
ALUSrc = 0
ExtOp = x
Mu
x
MemtoReg = 0
Clk
Data InWrEn
32
Adr
DataMemory
32
MemWr = 0A
LU
InstructionFetch Unit
Clk
Zero
Instruction<31:0>• R[rd] = R[rs] + R[rt]
0
1
0
1
01<
21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
op rs rt rd shamt funct
061116212631
nPC_sel= +4
CS 61C L5.2.1 CPU Design III (13) K. Meinz, Summer 2004 © UCB
Instruction Fetch Unit at the End of Add• PC = PC + 4
• This is the same for all instructions except: Branch and Jump
Adr
InstMemory
Ad
der
Ad
der
PC
Clk
00Mu
x
1
nPC_MUX_sel
imm
16Instruction<31:0>
0
1
CS 61C L5.2.1 CPU Design III (14) K. Meinz, Summer 2004 © UCB
Single Cycle Datapath during Or Immediate?
op rs rt immediate
016212631
32
ALUctr =
Clk
busW
RegWr =
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst =
Exten
der
Mu
x
Mux
3216imm16
ALUSrc =
ExtOp =
Mu
x
MemtoReg =
Clk
Data InWrEn
32Adr
DataMemory
32
MemWr = A
LU
InstructionFetch Unit
Clk
Zero
Instruction<31:0>
0
1
0
1
01<
21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
nPC_sel =
• R[rt] = R[rs] OR ZeroExt[Imm16]
CS 61C L5.2.1 CPU Design III (15) K. Meinz, Summer 2004 © UCB
32
ALUctr = Or
Clk
busW
RegWr = 1
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst = 0
Exten
der
Mu
x
Mux
3216imm16
ALUSrc = 1
ExtOp = 0
Mu
x
MemtoReg = 0
Clk
Data InWrEn
32Adr
DataMemory
32
MemWr = 0A
LU
InstructionFetch Unit
Clk
Zero
Instruction<31:0>
• R[rt] = R[rs] OR ZeroExt[Imm16]
0
1
0
1
01<
21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
op rs rt immediate
016212631
nPC_sel= +4
Single Cycle Datapath during Or Immediate
CS 61C L5.2.1 CPU Design III (16) K. Meinz, Summer 2004 © UCB
The Single Cycle Datapath during Load?
32
ALUctr =
Clk
busW
RegWr =
32
32
busA
32
busB
55 5
Rw Ra Rb
32 32-bitRegisters
Rs
Rt
Rt
RdRegDst =
Exten
der
Mu
x
Mux
3216imm16
ALUSrc =
ExtOp =
Mu
x
MemtoReg =
Clk
Data InWrEn
32
Adr
DataMemory
32
MemWr =A
LU
InstructionFetch Unit
Clk
Zero
Instruction<31:0>
0
1
0
1
01<
21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
• R[rt] = Data Memory {R[rs] + SignExt[imm16]}
op rs rt immediate
016212631
nPC_sel=
CS 61C L5.2.1 CPU Design III (17) K. Meinz, Summer 2004 © UCB
The Single Cycle Datapath during Load
32
ALUctr = Add
Clk
busW
RegWr = 1
32
32
busA
32
busB
55 5
Rw Ra Rb
32 32-bitRegisters
Rs
Rt
Rt
RdRegDst = 0
Exten
der
Mu
x
Mux
3216imm16
ALUSrc = 1
ExtOp = 1
Mu
x
MemtoReg = 1
Clk
Data InWrEn
32
Adr
DataMemory
32
MemWr = 0A
LU
InstructionFetch Unit
Clk
Zero
Instruction<31:0>
0
1
0
1
01<
21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
• R[rt] = Data Memory {R[rs] + SignExt[imm16]}
op rs rt immediate
016212631
nPC_sel= +4
CS 61C L5.2.1 CPU Design III (18) K. Meinz, Summer 2004 © UCB
The Single Cycle Datapath during Store?
op rs rt immediate
016212631
32
ALUctr =
Clk
busW
RegWr =
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst =
Exten
der
Mu
x
Mux
3216imm16
ALUSrc =
ExtOp =
Mu
x
MemtoReg =
Clk
Data InWrEn
32Adr
DataMemory
32
MemWr = A
LU
InstructionFetch Unit
Clk
Zero
Instruction<31:0>
0
1
0
1
01<
21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
nPC_sel =
• Data Memory {R[rs] + SignExt[imm16]} = R[rt]
CS 61C L5.2.1 CPU Design III (19) K. Meinz, Summer 2004 © UCB
The Single Cycle Datapath during Store
32
ALUctr = Add
Clk
busW
RegWr = 0
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst = x
Exten
der
Mu
x
Mux
3216imm16
ALUSrc = 1
ExtOp = 1
Mu
x
MemtoReg = x
Clk
Data InWrEn
32Adr
DataMemory
32
MemWr = 1A
LU
InstructionFetch Unit
Clk
Zero
Instruction<31:0>
0
1
0
1
01<
21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
• Data Memory {R[rs] + SignExt[imm16]} = R[rt]op rs rt immediate
016212631
nPC_sel= +4
CS 61C L5.2.1 CPU Design III (20) K. Meinz, Summer 2004 © UCB
The Single Cycle Datapath during Branch?
32
ALUctr =
Clk
busW
RegWr =
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst =
Exten
der
Mu
x
Mux
3216imm16
ALUSrc =
ExtOp =
Mu
x
MemtoReg = x
Clk
Data InWrEn
32Adr
DataMemory
32
MemWr =A
LU
InstructionFetch Unit
Clk
Zero
Instruction<31:0>
0
1
0
1
01<
21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
• if (R[rs] - R[rt] == 0) then Zero = 1 ; else Zero = 0op rs rt immediate
016212631
nPC_sel=
CS 61C L5.2.1 CPU Design III (21) K. Meinz, Summer 2004 © UCB
The Single Cycle Datapath during Branch
32
ALUctr =Sub
Clk
busW
RegWr = 0
3232
busA
32busB
55 5
Rw Ra Rb32 32-bitRegisters
Rs
Rt
Rt
RdRegDst = x
Exten
der
Mu
x
Mux
3216imm16
ALUSrc = 0
ExtOp = x
Mu
x
MemtoReg = x
Clk
Data InWrEn
32Adr
DataMemory
32
MemWr = 0A
LU
InstructionFetch Unit
Clk
Zero
Instruction<31:0>
0
1
0
1
01<
21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
• if (R[rs] - R[rt] == 0) then Zero = 1 ; else Zero = 0op rs rt immediate
016212631
nPC_sel= “Br”
CS 61C L5.2.1 CPU Design III (22) K. Meinz, Summer 2004 © UCB
Instruction Fetch Unit at the End of Branch• if (Zero == 1) then PC = PC + 4 + SignExt[imm16]*4 ;
else PC = PC + 4
op rs rt immediate
016212631
• What is encoding of nPC_sel?•Direct MUX select?
•Branch / not branch
• Let’s pick 2nd option
nPC_sel zero? MUX0 x 01 0 01 1 1
Adr
InstMemory
Ad
derA
dder
PC
Clk
00Mu
x
1
nPC_sel
imm
16
Instruction<31:0>
0
1
Zero
nPC_MUX_sel
CS 61C L5.2.1 CPU Design III (23) K. Meinz, Summer 2004 © UCB
How to Design a Processor: step-by-step• 1. Analyze instruction set architecture (ISA)
=> datapath requirements•meaning of each instruction is given by the register transfers• datapath must include storage element for ISA registers• datapath must support each register transfer
• 2. Select set of datapath components and establish clocking methodology
• 3. Assemble datapath meeting requirements• 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.
• 5. Assemble the control logic (hard part!)
CS 61C L5.2.1 CPU Design III (24) K. Meinz, Summer 2004 © UCB
Step 5: Given Datapath: RTL -> Control
ALUctrRegDst ALUSrcExtOp MemtoRegMemWr Zero
Instruction<31:0>
<21:25>
<16:20>
<11:15>
<0:15>
Imm16RdRsRt
nPC_sel
Adr
InstMemory
DATA PATH
Control
Op
<0:5>
Fun
RegWr
<26:31>
CS 61C L5.2.1 CPU Design III (25) K. Meinz, Summer 2004 © UCB
A Summary of the Control Signals (1/2)
inst Register Transfer
ADD R[rd] <– R[rs] + R[rt]; PC <– PC + 4
ALUsrc = RegB, ALUctr = “add”, RegDst = rd, RegWr, nPC_sel = “+4”
SUB R[rd] <– R[rs] – R[rt]; PC <– PC + 4
ALUsrc = RegB, ALUctr = “sub”, RegDst = rd, RegWr, nPC_sel = “+4”
ORi R[rt] <– R[rs] + zero_ext(Imm16); PC <– PC + 4
ALUsrc = Im, Extop = “Z”, ALUctr = “or”, RegDst = rt, RegWr, nPC_sel =“+4”
LOAD R[rt] <– MEM[ R[rs] + sign_ext(Imm16)]; PC <– PC + 4
ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemtoReg, RegDst = rt, RegWr, nPC_sel = “+4”
STORE MEM[ R[rs] + sign_ext(Imm16)] <– R[rs]; PC <– PC + 4
ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemWr, nPC_sel = “+4”
BEQ if ( R[rs] == R[rt] ) then PC <– PC + sign_ext(Imm16)] || 00 else PC <– PC + 4
nPC_sel = “Br”, ALUctr = “sub”
CS 61C L5.2.1 CPU Design III (26) K. Meinz, Summer 2004 © UCB
A Summary of the Control Signals (2/2)
add sub ori lw sw beq jump
RegDst
ALUSrc
MemtoReg
RegWrite
MemWrite
nPCsel
Jump
ExtOp
ALUctr<2:0>
1
0
0
1
0
0
0
x
Add
1
0
0
1
0
0
0
x
Subtract
0
1
0
1
0
0
0
0
Or
0
1
1
1
0
0
0
1
Add
x
1
x
0
1
0
0
1
Add
x
0
x
0
0
1
0
x
Subtract
x
x
x
0
0
0
1
x
xxx
op target address
op rs rt rd shamt funct
061116212631
op rs rt immediate
R-type
I-type
J-type
add, sub
ori, lw, sw, beq
jump
func
op 00 0000 00 0000 00 1101 10 0011 10 1011 00 0100 00 0010Appendix A10 0000See 10 0010 We Don’t Care :-)
CS 61C L5.2.1 CPU Design III (27) K. Meinz, Summer 2004 © UCB
Review: Finite State Machine (FSM)
• States represent possible output values.
• Transitions represent changes between statesbased on inputs.
• Implement with CL andclocked registerfeedback.
CS 61C L5.2.1 CPU Design III (28) K. Meinz, Summer 2004 © UCB
Finite State Machines extremely useful!
• They define •How output signals respond to input signals and previous state.
•How we change states depending on input signals and previous state
• The output signals could be our familiar control signals•Some control signals may only depend on CL, not on state at all…
• We could implement very detailed CL blocks w/Programmable Logic Arrays
CS 61C L5.2.1 CPU Design III (29) K. Meinz, Summer 2004 © UCB
Taking advantage of sum-of-products
• Since sum-of-products is a convenient notation and way to think about design, offer hardware building blocks that match that notation
• One example isProgrammable Logic Arrays (PLAs)
• Designed so that can select (program) ands, ors, complements after you get the chip• Late in design process, fix errors, figure out what to do later, …
CS 61C L5.2.1 CPU Design III (30) K. Meinz, Summer 2004 © UCB
• • •
inputs
ANDarray
• • •
outputs
ORarrayproduct
terms
Programmable Logic Arrays• Pre-fabricated building block of many
AND/OR gates• “Programmed” or “Personalized" by making or
breaking connections among gates
• Programmable array block diagram for sum of products form
And Programming:• How many inputs?• How to combine inputs?• How many product terms?
Or Programming:• How to combine product terms?• How many outputs?
CS 61C L5.2.1 CPU Design III (31) K. Meinz, Summer 2004 © UCB
example:
F0 = A + B' C'F1 = A C' + A BF2 = B' C' + A BF3 = B' C + A
personality matrix1 = uncomplemented in term0 = complemented in term– = does not participate
1 = term connected to output0 = no connection to output
input side: 3 inputs
output side: 4 outputsProduct inputs outputsterm A B C F0 F1 F2 F3
AB 1 1 – 0 1 1 0B'C – 0 1 0 0 0 1AC' 1 – 0 0 1 0 0B'C' – 0 0 1 0 1 0A 1 – – 1 0 0 1 reuse of terms;
5 product terms
Enabling Concept• Shared product terms among outputs
CS 61C L5.2.1 CPU Design III (32) K. Meinz, Summer 2004 © UCB
Before Programming
• All possible connections available before "programming"
CS 61C L5.2.1 CPU Design III (33) K. Meinz, Summer 2004 © UCB
A B C
F1 F2 F3F0
AB
B'C
AC'
B'C'
A
After Programming• Unwanted connections are "blown"
• Fuse (normally connected, break unwanted ones)
• Anti-fuse (normally disconnected, make wanted connections)
CS 61C L5.2.1 CPU Design III (34) K. Meinz, Summer 2004 © UCB
notation for implementingF0 = A B + A' B'F1 = C D' + C' D
AB+A'B'CD'+C'D
AB
A'B'
CD'
C'D
A B C D
Alternate Representation
• Short-hand notation--don't have to draw all the wires• X Signifies a connection is present and
perpendicular signal is an input to gate
CS 61C L5.2.1 CPU Design III (35) K. Meinz, Summer 2004 © UCB
°5 steps to design a processor• 1. Analyze instruction set => datapath requirements• 2. Select set of datapath components & establish clock
methodology• 3. Assemble datapath meeting the requirements• 4. Analyze implementation of each instruction to
determine setting of control points that effects the register transfer.• 5. Assemble the control logic
°Control is the hard part°MIPS makes that easier• Instructions same size• Source registers always in same place• Immediates same size, location• Operations always on registers/immediates
And in Conclusion… Single cycle control
Control
Datapath
Memory
ProcessorInput
Output
top related