appendix a.1.1

7/31/2019 Appendix a.1.1

Appendix A

Pipelining: Basic and Intermediate Concepts

Based on slides by

David Patterson, University of California, Berkeley, and Josep Torrellas and Sarita Adve, University of Illinois at Urbana-Champaign

CPE 432 Computer Design

Outline

Key ideas and simple pipeline (Section A.1)

Hazards (Sections A.2 and A.3)

Structural hazards

Data hazards

Control hazards

Exceptions (Section A.4)

Multicycle operations (Section A.5)

Pipelining

An implementation technique whereby multiple instructions are overlapped in execution

Takes advantage of parallelism that exists among the actions needed to execute different instructions

The pipeline consists of stages; a pipe stage is also called a segment

Each segment in the pipeline completes a different part of an instruction

Different instructions execute simultaneously, each in a different stage of the pipeline

Each instruction step (executed by a pipe stage) takes one machine cycle

Throughput: how often an instruction exits the pipeline = number of instructions completed per machine cycle

Want to balance the time needed to do the work in each stage

Ideally, time per instruction on the pipelined machine = (time per instruction on unpipelined machine) / (number of pipe stages)
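The ideal-case formula and the need for balanced stages can be checked with a little arithmetic. The sketch below is illustrative only: the function names and the stage latencies are assumptions, and pipeline register overhead is ignored.

```python
def ideal_time_per_instruction(unpipelined_time: float, num_stages: int) -> float:
    """Ideal pipelined time per instruction, assuming perfectly balanced stages."""
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    return unpipelined_time / num_stages


def pipelined_cycle_time(stage_times: list) -> float:
    """With unbalanced stages, the slowest stage sets the machine cycle."""
    return max(stage_times)


# Hypothetical stage latencies in ns (illustrative values only).
balanced = [2.0, 2.0, 2.0, 2.0, 2.0]
unbalanced = [1.0, 2.0, 4.0, 2.0, 1.0]

for times in (balanced, unbalanced):
    unpipelined = sum(times)            # one instruction, all stages back to back
    cycle = pipelined_cycle_time(times)
    print(f"unpipelined={unpipelined} ns, cycle={cycle} ns, speedup={unpipelined / cycle}")
```

Both stage sets total 10 ns of work, but the balanced set reaches the ideal speedup of 5 while the unbalanced set reaches only 2.5, because its 4 ns stage sets the cycle.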

Pipelining a Basic RISC ISA

MIPS ISA (Microprocessor without Interlocked Pipeline Stages)

The instruction formats are few in number, with all instructions typically being one size.

Only loads and stores affect memory

Base register + immediate offset = effective address

ALU operations

Two sources: two registers, or a register and an immediate

Branches and jumps

Comparison between a register and zero

Branch address = PC + offset
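The addressing rules above can be written out as a minimal sketch. The register names, register contents, and function names are hypothetical, and offsets are treated as already-signed integers.

```python
# Hypothetical register contents, for illustration only.
regs = {"R1": 0x1000, "R2": 0, "R3": 7}


def effective_address(regs: dict, base: str, offset: int) -> int:
    """Loads and stores: base register + immediate offset."""
    return regs[base] + offset


def branch_taken(regs: dict, reg: str) -> bool:
    """Branch condition: compare a single register with zero."""
    return regs[reg] == 0


def branch_address(pc: int, offset: int) -> int:
    """Branch target as stated on the slide: PC + offset."""
    return pc + offset


print(hex(effective_address(regs, "R1", 8)))
print(branch_taken(regs, "R2"), branch_taken(regs, "R3"))
print(branch_address(100, 12))
```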


A Simple Five Stage RISC Pipeline

Pipeline Stages

IF : Instruction Fetch

ID : Instruction decode, register read, branch computation

EX : Instruction execution and effective address computation

MEM : Memory access (for LOAD and STORE instructions)

WB : Writeback (result) to a register


Figure A.1 Simple RISC pipeline.


Figure A.17 Nonpipelined Instruction Execution

Implementation of RISC Instructions

1. Instruction Fetch cycle (IF)

IR ← Mem[PC] ; IR holds the instruction

NPC ← PC + 4
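The two register transfers of the IF cycle can be sketched as a small function. Modeling memory as a dictionary from byte address to 32-bit word is an assumption made for illustration, as is the function name.

```python
def instruction_fetch(memory: dict, pc: int):
    """IF cycle: IR <- Mem[PC]; NPC <- PC + 4."""
    ir = memory[pc]   # IR holds the fetched instruction word
    npc = pc + 4      # instructions are 4 bytes, so the next one is at PC + 4
    return ir, npc


# Hypothetical instruction memory holding one word at address 0.
memory = {0: 0x8C220008}
ir, npc = instruction_fetch(memory, 0)
print(hex(ir), npc)
```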


2. Instruction decode/register fetch cycle (ID)

A ← Regs[rs]

B ← Regs[rt]

Imm ← sign-extended imm field of IR

; decode the instruction and, in the meantime, load Regs A, B, Imm; OK if some of this is not needed
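The ID cycle can be sketched as follows, assuming the standard 32-bit MIPS I-type layout (rs in bits 25..21, rt in bits 20..16, a 16-bit immediate in bits 15..0). The register file as a Python list and the function name are illustrative assumptions.

```python
def instruction_decode(ir: int, regs: list):
    """ID cycle: A <- Regs[rs]; B <- Regs[rt]; Imm <- sign-extended immediate."""
    rs = (ir >> 21) & 0x1F     # source register number
    rt = (ir >> 16) & 0x1F     # second register number
    imm16 = ir & 0xFFFF        # raw 16-bit immediate field
    a = regs[rs]
    b = regs[rt]
    # Sign-extend: if bit 15 is set, the field encodes a negative value.
    imm = imm16 - 0x10000 if imm16 & 0x8000 else imm16
    return a, b, imm


regs = [0] * 32
regs[1] = 0x1000
regs[2] = 7
print(instruction_decode(0x8C220008, regs))   # word with rs=1, rt=2, imm=8
```

All three values are read on every instruction, whether or not the instruction ends up using them, which is what lets the register read overlap with decoding.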


3. Execution /Effective address cycle (EX)

Reg-Reg (ALU op): ALU output ← A op B

Reg-Immed (ALU op): ALU output ← A op Imm

memory ref: ALU output ← A + Imm ; compute address for LD, ST

Branch: ALU output ← NPC + (Imm << 2)
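The four EX cases can be sketched as one dispatch function. The instruction-kind strings, the small operation table, and the function name are hypothetical; the shift by 2 in the branch case follows the usual MIPS word-offset convention and is an assumption here.

```python
import operator

# A small illustrative set of ALU operations (not the full ISA).
ALU_OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "and": operator.and_,
    "or": operator.or_,
}


def execute(kind: str, a: int, b: int, imm: int, npc: int, op: str = "add") -> int:
    """EX cycle: compute ALU output for the given instruction kind."""
    if kind == "reg-reg":
        return ALU_OPS[op](a, b)       # A op B
    if kind == "reg-imm":
        return ALU_OPS[op](a, imm)     # A op Imm
    if kind == "mem":
        return a + imm                 # effective address for LD/ST
    if kind == "branch":
        return npc + (imm << 2)        # branch target (assumed word offset)
    raise ValueError(f"unknown instruction kind: {kind}")


print(execute("reg-reg", 5, 3, 0, 0, "sub"))
print(hex(execute("mem", 0x1000, 0, 8, 0)))
print(execute("branch", 0, 0, 3, 104))
```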

4. Memory Access/Branch Completion Cycle (MEM) /* only for LD, ST, BR */

Memory access:

LMD ← Mem[ALU output] ; for loads, load data into LMD

Mem[ALU output] ← B ; for stores


4. Memory Access/Branch Completion Cycle (MEM) /* only for LD, ST, BR */

Branch

if (cond)

PC ← ALU output

else

PC ← NPC
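The load, store, and branch cases of the MEM cycle can be sketched together. The dictionary memory model, the kind strings, and the function name are illustrative assumptions; for non-branch instructions the sketch simply passes NPC through as the next PC.

```python
def memory_stage(kind: str, memory: dict, alu_output: int, b, cond: bool, npc: int):
    """MEM cycle: returns (LMD, next PC); LMD is None unless the instruction is a load."""
    lmd = None
    pc = npc
    if kind == "load":
        lmd = memory[alu_output]           # LMD <- Mem[ALU output]
    elif kind == "store":
        memory[alu_output] = b             # Mem[ALU output] <- B
    elif kind == "branch":
        pc = alu_output if cond else npc   # taken: branch target, else fall through
    return lmd, pc


mem = {0x1008: 42}
print(memory_stage("load", mem, 0x1008, None, False, 4))
print(memory_stage("store", mem, 0x100C, 99, False, 8), mem[0x100C])
print(memory_stage("branch", mem, 116, None, True, 104))
```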


5. Write-back cycle (WB)

Load instruction: Regs[rt] ← LMD

Reg-Reg ALU instr: Regs[rd] ← ALU output

Reg-Imm ALU instr: Regs[rt] ← ALU output
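The three write-back cases differ only in the value written and the destination field used. A minimal sketch, with hypothetical kind strings and function name:

```python
def write_back(kind: str, regs: list, rt: int, rd: int, lmd, alu_output) -> list:
    """WB cycle: write the result into the register file."""
    if kind == "load":
        regs[rt] = lmd            # Regs[rt] <- LMD
    elif kind == "reg-reg":
        regs[rd] = alu_output     # Regs[rd] <- ALU output
    elif kind == "reg-imm":
        regs[rt] = alu_output     # Regs[rt] <- ALU output
    # Stores and branches write nothing to the register file.
    return regs


regs = [0] * 32
write_back("load", regs, rt=2, rd=0, lmd=42, alu_output=0x1008)
write_back("reg-reg", regs, rt=2, rd=3, lmd=None, alu_output=49)
print(regs[2], regs[3])
```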

Branches and stores: 4 cycles

Rest of instructions: 5 cycles

Now we will try to pipeline instruction execution

Note that at the end of each cycle, the data is stored in some registers (PC, LMD, Imm, A, B). This allows other instructions to execute too.
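The cycle counts above give the average cycles per instruction of the unpipelined design once an instruction mix is known. The mix below is a made-up example, not a measurement, and the names are illustrative.

```python
# Cycle counts from the slide: branches and stores take 4 cycles, the rest take 5.
CYCLES = {"branch": 4, "store": 4, "load": 5, "alu": 5}


def average_cpi(mix: dict) -> float:
    """Average cycles per instruction for a mix given as instruction counts."""
    total = sum(mix.values())
    return sum(CYCLES[kind] * count for kind, count in mix.items()) / total


# Hypothetical mix per 100 instructions (an assumption for illustration).
mix = {"alu": 50, "load": 20, "store": 10, "branch": 20}
print(average_cpi(mix))
```

For this assumed mix the total is 470 cycles per 100 instructions, so the unpipelined CPI is 4.7.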


Figure A.2 The pipeline can be thought of as a series of data paths shifted in time.

[Figure: instruction order (vertical axis) against time in clock cycles, Cycle 1 to Cycle 7 (horizontal axis). Each instruction uses Ifetch, Reg, ALU, DMem, Reg in turn, starting one cycle after the previous instruction.]

IF1 ID1 EX1 MEM1 WB1

IF2 ID2 EX2 MEM2 WB2

The register file is written in the first part of a cycle.

The register file is read in the second part of a cycle.
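The shifted-in-time picture can be reproduced with a short sketch that lists, for each instruction, the stage it occupies in every clock cycle. It assumes an ideal pipeline with no hazards or stalls; the function names are illustrative.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_schedule(num_instructions: int) -> list:
    """Row i holds the stage of instruction i in each cycle ('' when not in the pipe)."""
    total_cycles = num_instructions + len(STAGES) - 1
    rows = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        for s, stage in enumerate(STAGES):
            row[i + s] = stage    # instruction i enters IF in cycle i + 1
        rows.append(row)
    return rows


def format_schedule(rows: list) -> str:
    """Render the schedule as fixed-width text, one instruction per line."""
    lines = []
    for i, row in enumerate(rows):
        lines.append(f"I{i + 1}: " + " ".join(f"{stage:<3}" for stage in row))
    return "\n".join(lines)


print(format_schedule(pipeline_schedule(3)))
```

Three instructions finish in 7 cycles rather than 15, and in any cycle where one instruction is in WB and a later one is in ID, the split-cycle register file lets the write happen before the read.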

Figure A.18 The data path is pipelined by adding a set of registers, one between each pair of pipe stages

Ensure instructions in different stages of the pipeline do not interfere with one another.

At the end of a clock cycle, all results from a stage are stored into a register that is used as input to the next stage on the next clock cycle.

Carry intermediate results from one stage to another where the source and destination stages may not be directly adjacent.
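The role of the pipeline registers can be sketched as a clock tick in which every register takes the contents of the one before it, all at once. Treating each register's contents as a single opaque value is a simplification, and the function name is hypothetical.

```python
LATCHES = ["IF/ID", "ID/EX", "EX/MEM", "MEM/WB"]


def clock_tick(latches: dict, fetched):
    """Advance one cycle; returns (new latch contents, instruction that completed WB)."""
    completed = latches["MEM/WB"]       # this instruction was in WB during the cycle
    new_latches = {
        "IF/ID": fetched,               # result of this cycle's fetch
        "ID/EX": latches["IF/ID"],      # every value moves forward by one stage,
        "EX/MEM": latches["ID/EX"],     # computed from the OLD contents so stages
        "MEM/WB": latches["EX/MEM"],    # do not interfere with one another
    }
    return new_latches, completed


latches = {name: None for name in LATCHES}
completed_order = []
for fetched in ["i1", "i2", "i3", None, None, None, None]:
    latches, done = clock_tick(latches, fetched)
    completed_order.append(done)
print(completed_order)
```

Because a value placed in IF/ID is copied along on every tick, information produced early, such as a destination register number, is still available when the instruction reaches WB.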

Figure A.19 Events on every pipe stage of the MIPS pipeline.

Why does it work?

Use separate I and D memories (caches)

Register file can be read/written in 0.5 cycles

PC: incremented in IF

if branch taken, in EX, add PC + (Imm << 2)

Control of the pipeline

Set the control of the 4 MUXES

ALU stage MUXES

MUX in IF

MUX in WB

ALU stage MUXES: set depending on instruction type, which is set by ID/EX.IR.

top MUX: branch or not (ID/EX.NPC or ID/EX.A)

bottom MUX: reg-reg ALU or other (ID/EX.B or other)

MUX in IF: chooses between PC+4 and EX/MEM.ALUOutput; controlled by EX/MEM.cond

MUX in WB: controlled by whether the instruction is a LD or an ALU operation
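The four multiplexer selections can be sketched as plain functions. The kind strings and function names are hypothetical, and taking ID/EX.Imm as the "other" input of the bottom ALU multiplexer is an assumption, since the slide does not name it.

```python
def alu_top_mux(kind: str, id_ex_npc: int, id_ex_a: int) -> int:
    """Top ALU input: NPC for a branch, otherwise register value A."""
    return id_ex_npc if kind == "branch" else id_ex_a


def alu_bottom_mux(kind: str, id_ex_b: int, id_ex_imm: int) -> int:
    """Bottom ALU input: B for a reg-reg ALU op, otherwise the immediate (assumed)."""
    return id_ex_b if kind == "reg-reg" else id_ex_imm


def if_mux(pc_plus_4: int, ex_mem_alu_output: int, ex_mem_cond: bool) -> int:
    """Next PC: the branch target when EX/MEM.cond is set, otherwise PC + 4."""
    return ex_mem_alu_output if ex_mem_cond else pc_plus_4


def wb_mux(kind: str, mem_wb_lmd, mem_wb_alu_output):
    """Write-back value: loaded data for a load, ALU output for an ALU operation."""
    return mem_wb_lmd if kind == "load" else mem_wb_alu_output


print(alu_top_mux("branch", 104, 5), alu_bottom_mux("reg-reg", 3, 8))
print(if_mux(108, 116, True), wb_mux("load", 42, 0x1008))
```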
