appendix a.1.1
-
7/31/2019 Appendix a.1.1
1/23
Appendix A
Pipelining: Basic and Intermediate Concepts
Based on slides by
David Patterson, University of California, Berkeley, and Josep Torrellas and Sarita Adve, University of Illinois at Urbana-Champaign
CPE 432 Computer Design
-
Outline
Key ideas and simple pipeline (Section A.1)
Hazards (Sections A.2 and A.3)
  Structural hazards
  Data hazards
  Control hazards
Exceptions (Section A.4)
Multicycle operations (Section A.5)
-
Pipelining
An implementation technique whereby multiple instructions are overlapped in execution
Takes advantage of parallelism that exists among the actions needed to execute different instructions
The pipeline consists of stages; a pipe stage is also called a pipe segment
Each segment in the pipeline completes a different part of an instruction
Different instructions execute simultaneously, each in a different stage of the pipeline
Each instruction step (executed by a pipe stage) takes a machine cycle
Throughput: how often an instruction exits the pipeline = number of instructions completed per machine cycle
Want to balance the time needed to do the work in each stage
Ideally, time per instruction = (time per instruction on unpipelined machine) / (number of pipe stages)
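As a rough numeric illustration of the ideal case, the sketch below divides the unpipelined time per instruction by the number of pipe stages. The function name and the example numbers are illustrative assumptions, not values from the slides, and the result holds only if the stages are perfectly balanced and pipelining adds no overhead.

```python
def ideal_time_per_instruction(unpipelined_time_ns: float, num_stages: int) -> float:
    """Ideal pipelined time per instruction.

    Assumes perfectly balanced stages and zero pipelining overhead,
    which real pipelines only approximate.
    """
    if num_stages <= 0:
        raise ValueError("num_stages must be positive")
    return unpipelined_time_ns / num_stages


# Hypothetical numbers: a 5 ns unpipelined instruction on a 5-stage pipeline.
print(ideal_time_per_instruction(5.0, 5))
```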
-
Pipelining a Basic RISC ISA
MIPS ISA (Microprocessor without Interlocked Pipeline Stages)
The instruction formats are few in number, with all instructions typically being one size.
Only loads and stores affect memory
  Base register + immediate offset = effective address
ALU operations
  Two sources: two registers, or a register and an immediate
Branches and jumps
  Comparison between a register and zero
  Branch address = PC + offset
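The effective-address rule for loads and stores can be sketched as follows. This is a minimal illustration that assumes a 16-bit two's-complement offset and 32-bit addresses (the usual MIPS32 conventions, which the slide does not spell out); the function names are hypothetical.

```python
def sign_extend(value: int, bits: int = 16) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def effective_address(base: int, offset16: int) -> int:
    """Base register + sign-extended immediate offset, wrapped to 32 bits."""
    return (base + sign_extend(offset16)) & 0xFFFFFFFF


# A negative offset (0xFFFC is -4 in 16-bit two's complement) moves the
# address below the base register value.
print(hex(effective_address(0x1000, 0xFFFC)))
```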
-
A Simple Five Stage RISC Pipeline
Pipeline Stages
IF : Instruction fetch
ID : Instruction decode, register read, branch computation
EX : Instruction execution and effective address computation
MEM : Memory access (for LOAD and STORE instructions)
WB : Write-back (result) to a register
-
Figure A.1 Simple RISC pipeline.
-
Figure A.17 Nonpipelined instruction execution
-
Implementation of RISC Instructions
1. Instruction fetch cycle (IF)
IR <- Mem[PC] ; IR holds the instruction
NPC <- PC + 4
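The two register transfers of the IF cycle can be sketched as below. Modelling instruction memory as a Python dict keyed by byte address is an assumption made for illustration, as is the example instruction word.

```python
def instruction_fetch(mem: dict, pc: int) -> tuple:
    """IF cycle: IR <- Mem[PC]; NPC <- PC + 4."""
    ir = mem[pc]   # IR holds the fetched instruction word
    npc = pc + 4   # instructions are 4 bytes, so the next one is at PC + 4
    return ir, npc


# Hypothetical instruction memory holding one word at address 0.
ir, npc = instruction_fetch({0: 0x8C220008}, 0)
print(hex(ir), npc)
```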
-
2. Instruction decode/register fetch cycle (ID)
A <- Regs[rs]
B <- Regs[rt]
Imm <- sign-extended imm field of IR
Decode the instruction; in the meantime, read A, B, and Imm. It is OK if some of these are not needed.
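A minimal sketch of the ID cycle follows. It assumes the standard MIPS I-format field positions (rs in bits 25..21, rt in bits 20..16, a 16-bit immediate in bits 15..0), which the slide does not state; the function name and register contents are hypothetical.

```python
def instruction_decode(ir: int, regs: list) -> tuple:
    """ID cycle: A <- Regs[rs]; B <- Regs[rt]; Imm <- sign-extended immediate."""
    rs = (ir >> 21) & 0x1F   # assumed field position: bits 25..21
    rt = (ir >> 16) & 0x1F   # assumed field position: bits 20..16
    imm = ir & 0xFFFF        # low 16 bits
    if imm & 0x8000:         # sign-extend the 16-bit immediate
        imm -= 0x10000
    # All three values are read unconditionally; unused ones are ignored later.
    return regs[rs], regs[rt], imm


regs = [0] * 32
regs[1], regs[2] = 100, 7
print(instruction_decode(0x8C220008, regs))  # rs = 1, rt = 2, imm = 8
```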
-
3. Execution/effective address cycle (EX)
Reg-Reg (ALU op): ALU output <- A op B
Reg-Immed (ALU op): ALU output <- A op Imm
Memory ref: ALU output <- A + Imm ; compute address for LD, ST
Branch: ALU output <- NPC + (Imm
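The three complete EX cases can be sketched as follows. The branch case is left out because its expression is cut off in the source. The instruction-kind strings and the small operation table are hypothetical names chosen for illustration.

```python
import operator

# Hypothetical subset of ALU operations, for illustration only.
ALU_OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "and": operator.and_,
    "or": operator.or_,
}


def execute(kind: str, a: int, b: int, imm: int, op: str = "add") -> int:
    """EX cycle: compute ALU output for one of three instruction kinds."""
    if kind == "reg-reg":
        return ALU_OPS[op](a, b)     # ALU output <- A op B
    if kind == "reg-imm":
        return ALU_OPS[op](a, imm)   # ALU output <- A op Imm
    if kind == "mem":
        return a + imm               # effective address for LD, ST
    raise ValueError(f"unsupported instruction kind: {kind}")


print(execute("mem", 0x1000, 0, 8))
```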
-
4. Memory access/branch completion cycle (MEM) /* only for LD, ST, BR */
Memory access:
LMD <- Mem[ALU output] ; for loads, load data into LMD
Mem[ALU output] <- B ; for stores
-
4. Memory access/branch completion cycle (MEM) /* only for LD, ST, BR */
Branch:
if (cond)
  PC <- ALU output
else
  PC <- NPC
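The MEM cycle for loads, stores, and branches can be sketched together as below. Data memory is modelled as a dict and the instruction-kind strings are hypothetical; setting the next PC to NPC for loads and stores is an assumption, since the slides give the PC update only for branches.

```python
def mem_stage(kind: str, mem: dict, alu_output: int, b: int,
              cond: bool, npc: int) -> tuple:
    """MEM cycle. Returns (LMD, next PC); LMD is None unless a load ran."""
    lmd = None
    next_pc = npc                # assumed default for non-branch instructions
    if kind == "load":
        lmd = mem[alu_output]    # LMD <- Mem[ALU output]
    elif kind == "store":
        mem[alu_output] = b      # Mem[ALU output] <- B
    elif kind == "branch" and cond:
        next_pc = alu_output     # taken branch: PC <- ALU output
    return lmd, next_pc


data = {0x1008: 42}
print(mem_stage("load", data, 0x1008, 0, False, 4))
```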
-
5. Write-back cycle (WB)
Load instruction: Regs[rt] <- LMD
Reg-Reg ALU instr: Regs[rd] <- ALU output
Reg-Imm ALU instr: Regs[rt] <- ALU output
Branches and stores take 4 cycles
All other instructions take 5 cycles
Now we will try to pipeline instruction execution
Note that at the end of each cycle, the data is stored in some registers (PC, LMD, Imm, A, B). This allows other instructions to execute too.
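Given these cycle counts, the average cycles per instruction (CPI) of the unpipelined machine depends on the instruction mix. The sketch below is a hedged illustration: the function name is hypothetical and the 22% branch-and-store fraction is an assumed mix, not a figure from the slides.

```python
def unpipelined_cpi(frac_branch_or_store: float) -> float:
    """Average CPI when branches and stores take 4 cycles and all
    other instructions take 5 cycles."""
    if not 0.0 <= frac_branch_or_store <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return 4 * frac_branch_or_store + 5 * (1 - frac_branch_or_store)


# Assumed mix: 22% of instructions are branches or stores.
print(round(unpipelined_cpi(0.22), 2))
```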
-
Figure A.2 The pipeline can be thought of as a series of data paths shifted in time.
[Figure: instructions in program order (vertical axis) against time in clock cycles 1 to 7 (horizontal axis). Each instruction uses Ifetch, Reg, ALU, DMem, Reg in successive cycles, starting one cycle after the previous instruction.]
IF1 ID1 EX1 MEM1 WB1
    IF2 ID2 EX2 MEM2 WB2
Register file is written in the first part of the cycle
Register file is read in the second part of the cycle
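The time-shifted layout of the figure can be reproduced with a short sketch. It assumes an ideal pipeline with no hazards or stalls, so instruction i simply starts one cycle after instruction i-1; the function name is hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_diagram(num_instructions: int) -> list:
    """Row i lists the stage instruction i occupies in each clock cycle
    ('' when it is not in the pipeline). Assumes no stalls."""
    total_cycles = num_instructions + len(STAGES) - 1
    rows = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        for s, name in enumerate(STAGES):
            row[i + s] = name    # instruction i enters stage s in cycle i + s
        rows.append(row)
    return rows


for i, row in enumerate(pipeline_diagram(3), start=1):
    print(f"instr {i}: " + " ".join(f"{cell:<4}" for cell in row))
```

With three instructions the diagram spans seven clock cycles, matching the cycle labels in the figure.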
-
Figure A.18 The data path is pipelined by adding a set of registers, one between each pair of pipe stages.
The pipeline registers:
Ensure instructions in different stages of the pipeline do not interfere with one another. At the end of a clock cycle, all results from a stage are stored into a register that is used as input to the next stage on the next clock cycle.
Carry intermediate results from one stage to another where the source and destination stages may not be directly adjacent.
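The latching behaviour described above can be sketched as a clock-edge update in which every pipeline register takes the value its upstream neighbour held before the edge. The register names follow the usual IF/ID, ID/EX, EX/MEM, MEM/WB convention, which is assumed here, and each register holds only an instruction label rather than real data fields.

```python
REGISTER_NAMES = ["IF/ID", "ID/EX", "EX/MEM", "MEM/WB"]


def clock_tick(pipeline_regs: dict, fetched) -> dict:
    """Return the pipeline register contents after one clock edge.

    Each register receives what the previous register held before the
    edge, so instructions in different stages never overwrite each other.
    """
    new_regs = {}
    incoming = fetched                  # output of the IF stage this cycle
    for name in REGISTER_NAMES:
        new_regs[name] = incoming
        incoming = pipeline_regs[name]  # old value moves on to the next register
    return new_regs


regs = dict.fromkeys(REGISTER_NAMES)    # pipeline starts empty
for instr in ["i1", "i2", "i3", "i4"]:
    regs = clock_tick(regs, instr)
print(regs)
```

Building a fresh dict from the old one mimics registers that all update on the same clock edge; updating a single dict in place from front to back would let one instruction race through several stages in a single cycle.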
-
Figure A.19 Events on every pipe stage of the MIPS pipeline.
-
Why does it work?
Use separate I and D memories (caches)
Register file can be read or written in half a cycle
PC: incremented in IF
if branch taken, in EX, add PC+ (Imm
-
Control of the pipeline
Set the control of the 4 MUXes:
ALU stage MUXes
MUX in IF
MUX in WB
-
ALU stage MUXes: set depending on instruction type, which is given by ID/EX.IR
Top MUX: branch or not (ID/EX.NPC or ID/EX.A)
Bottom MUX: reg-reg ALU or other (ID/EX.B or other)
MUX in IF: chooses between PC+4 and EX/MEM.ALUOutput; controlled by EX/MEM.cond
MUX in WB: controlled by whether the instruction is a LD or an ALU operation
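The four MUX selections can be sketched as plain functions over pipeline-register fields. Several details are assumptions rather than statements from the slides: the instruction-kind strings, the choice of ID/EX.Imm as the "other" input of the bottom ALU MUX, and the MEM/WB field names.

```python
def alu_inputs(kind: str, id_ex: dict) -> tuple:
    """ALU stage MUXes: pick the top and bottom ALU operands."""
    top = id_ex["NPC"] if kind == "branch" else id_ex["A"]
    # Assumption: the non-register input of the bottom MUX is ID/EX.Imm.
    bottom = id_ex["B"] if kind == "reg-reg" else id_ex["Imm"]
    return top, bottom


def if_mux(pc_plus_4: int, ex_mem: dict) -> int:
    """MUX in IF: next PC, controlled by EX/MEM.cond."""
    return ex_mem["ALUOutput"] if ex_mem["cond"] else pc_plus_4


def wb_mux(kind: str, mem_wb: dict) -> int:
    """MUX in WB: loads write back LMD, ALU instructions write ALUOutput."""
    return mem_wb["LMD"] if kind == "load" else mem_wb["ALUOutput"]


id_ex = {"NPC": 104, "A": 7, "B": 3, "Imm": 16}
print(alu_inputs("branch", id_ex), alu_inputs("reg-reg", id_ex))
```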