
Upload: kory-martin

Post on 04-Jan-2016


Page 1: 11/13/2015 8:57 AM 1 of 86 Pipelining Chapter 6.

04/20/23 16:42 1 of 86

Pipelining

Chapter 6

Page 2

Overview of Pipelining

Pipelining is an implementation technique in which multiple instructions are overlapped in execution.

Pipelining improves performance by increasing instruction throughput.

The execution time of an individual instruction is not decreased.

Page 3

Analogy

Doing laundry:

1. Put clothes in washer to wash.

2. Put clothes in dryer to dry.

3. Put clothes on table to fold.

4. Put clothes away.

Page 4

Analogy

Non-pipelined: [figure]

Page 5

Analogy

Pipelined: [figure]

Page 6

Example

Assume that the operation times for the major functional units are:

- 200 ps for memory access
- 200 ps for ALU operation
- 100 ps for register access

Page 7

MIPS Instructions

5 stages for a MIPS instruction:

Fetch → Reg. Read → ALU Op. → Data access → Reg. Write

    lw  $s1, 100($s2)
    sw  $s1, 100($s2)
    add $s1, $s2, $s3
    beq $s1, $s2, 25

Page 8

Example

Execution time for each instruction class (stage times in ps; "-" means the stage is not used):

Instruction  Fetch  Reg read  ALU op  Data access  Reg write  Total time
lw           200    100       200     200          100        800 ps
sw           200    100       200     200          -          700 ps
add          200    100       200     -            100        600 ps
beq          200    100       200     -            -          500 ps
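The totals in the table can be reproduced by summing the latencies of the stages each instruction class uses. The following is a minimal Python sketch, not part of the original slides; the names `STAGE_PS`, `STAGES_USED`, and `total_time_ps` are made up for illustration.

```python
# Stage latencies in picoseconds, taken from the example on the slides.
STAGE_PS = {
    "fetch": 200,
    "reg_read": 100,
    "alu": 200,
    "data_access": 200,
    "reg_write": 100,
}

# Stages used by each instruction class (hypothetical encoding of the table).
STAGES_USED = {
    "lw": ["fetch", "reg_read", "alu", "data_access", "reg_write"],
    "sw": ["fetch", "reg_read", "alu", "data_access"],
    "add": ["fetch", "reg_read", "alu", "reg_write"],
    "beq": ["fetch", "reg_read", "alu"],
}


def total_time_ps(instr: str) -> int:
    """Sum the latencies of the stages used by one instruction class."""
    return sum(STAGE_PS[stage] for stage in STAGES_USED[instr])


totals = {instr: total_time_ps(instr) for instr in STAGES_USED}
print(totals)
```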

Page 9

Example

For the single-cycle design:

- Must allow for the slowest instruction, lw.
- So the time required for every instruction is 800 ps.

Pages 10-11

Example

Non-pipelined for three lw instructions: [figure]

The time between the first and the fourth instructions is 3 x 800 ps = 2400 ps.

Page 12

Example

For the pipelined multi-cycle design:

- Each clock cycle must be long enough to accommodate the slowest operation.
- So the time required for every clock cycle is 200 ps.

Pages 13-14

Example

Pipelined for three lw instructions: [figure]

- The time between the first and the fourth instructions is 3 x 200 ps = 600 ps.
- 2400/600 = 4, a fourfold performance improvement.
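The arithmetic on these slides can be checked with a short Python sketch. It is only an illustration under the slide's assumptions (five stages, one instruction entering per cycle, no hazards); the function names are hypothetical.

```python
# Stage latencies in ps: fetch, register read, ALU, data access, register write.
STAGE_PS = [200, 100, 200, 200, 100]


def single_cycle_period(stage_ps):
    """One long cycle must fit the slowest instruction (lw uses every stage)."""
    return sum(stage_ps)


def pipelined_period(stage_ps):
    """Each stage takes one cycle, so the cycle must fit the slowest stage."""
    return max(stage_ps)


def total_time(n, stage_ps, pipelined):
    """Time to finish n instructions with no stalls."""
    if pipelined:
        # The first instruction takes one cycle per stage; each later
        # instruction completes one cycle after its predecessor.
        return (len(stage_ps) + n - 1) * pipelined_period(stage_ps)
    return n * single_cycle_period(stage_ps)


# Spacing between the first and fourth instruction, as on the slides.
gap_single = 3 * single_cycle_period(STAGE_PS)   # 2400 ps
gap_pipe = 3 * pipelined_period(STAGE_PS)        # 600 ps
print(gap_single, gap_pipe, gap_single / gap_pipe)
print(total_time(3, STAGE_PS, False), total_time(3, STAGE_PS, True))
```

For only three instructions the total times are 2400 ps and 1400 ps, so the fourfold figure describes the rate at which instructions complete once the pipeline is full, not the total time of such a short sequence.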

Page 15

Pipeline Hazards

- Structural hazards
- Data hazards
- Control hazards

Page 16

Structural Hazards

There is a structural hazard when the hardware cannot support the combination of instructions that we want to execute in the same clock cycle.

Analogy: having a combined washer/dryer instead of a separate washer and dryer.

Pages 17-20

Example

What happens if we execute four lw instructions one after another… [figure]

The 1st instruction is accessing data while the 4th instruction is being fetched. With a single memory, both accesses need it in the same clock cycle.

Page 21

Solution

Have two separate memories:

- One for instructions
- One for data
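The conflict in the four-lw example can be found mechanically by listing which instruction sits in which stage in each cycle. The Python sketch below assumes the five-stage pipeline from the slides with one instruction entering per cycle; `stage_in_cycle` and `memory_conflicts` are hypothetical helper names.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def stage_in_cycle(instr_index, cycle):
    """Stage held by instruction `instr_index` (0-based) in `cycle` (1-based)."""
    pos = cycle - 1 - instr_index
    return STAGES[pos] if 0 <= pos < len(STAGES) else None


def memory_conflicts(n_loads, unified_memory):
    """List (cycle, data-access instr, fetched instr) clashes, 1-based."""
    conflicts = []
    for cycle in range(1, n_loads + len(STAGES)):
        fetching = [i for i in range(n_loads) if stage_in_cycle(i, cycle) == "IF"]
        accessing = [i for i in range(n_loads) if stage_in_cycle(i, cycle) == "MEM"]
        # With one shared memory, a fetch and a data access cannot overlap.
        if unified_memory and fetching and accessing:
            conflicts.append((cycle, accessing[0] + 1, fetching[0] + 1))
    return conflicts


print(memory_conflicts(4, unified_memory=True))
print(memory_conflicts(4, unified_memory=False))
```

With a shared memory the sketch reports a clash in cycle 4 between the 1st and 4th instructions; with separate instruction and data memories it reports none.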

Page 22

Data Hazards

Data hazards occur when the pipeline must be stalled because one step must wait for another to complete.

They arise from the dependence of one instruction on an earlier one that is still in the pipeline.

    add $s0, $t0, $t1
    sub $t2, $s0, $t3

Here sub reads $s0 before add has written it back to the register file.
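A dependence like the one between add and sub can be detected by tracking which instruction last wrote each register. This Python sketch is an illustration only; the `Instr` tuple and `raw_dependences` function are hypothetical and model just register names, not a real MIPS decoder.

```python
from typing import NamedTuple, Optional


class Instr(NamedTuple):
    op: str
    dest: Optional[str]   # register written, or None
    srcs: tuple           # registers read


def raw_dependences(program):
    """Return (producer index, consumer index, register) for each read
    of a register whose most recent writer is an earlier instruction."""
    last_writer = {}
    found = []
    for i, ins in enumerate(program):
        for reg in ins.srcs:
            if reg in last_writer:
                found.append((last_writer[reg], i, reg))
        if ins.dest is not None:
            last_writer[ins.dest] = i
    return found


program = [
    Instr("add", "$s0", ("$t0", "$t1")),
    Instr("sub", "$t2", ("$s0", "$t3")),
]
deps = raw_dependences(program)
print(deps)
```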

Page 23

Solution 1

Compilers can remove the data hazard by moving non-dependent instructions in between.

Page 24

Solution 2

Observation: we don’t need to wait for the add instruction to complete before trying to resolve the data hazard.

As soon as the ALU creates the sum for the add, we can supply it as an input for the subtract.

Page 25

Forwarding

Forwarding or bypassing is when extra hardware is added to retrieve the missing item early from the internal resources.

Page 26

Forwarding

Forwarding paths are valid only if the destination stage is later in time than the source stage.

Page 27

Forwarding

What happens when we have a sub instruction after a lw instruction?

Page 28

Forwarding

[figure]

Page 29

Pipeline Stall

Even with forwarding, we need to stall one stage for a load-use data hazard.

This is referred to as a pipeline stall.
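The rule that a forwarding path must point forward in time explains why a load-use pair needs one stall while an add-sub pair needs none. The Python sketch below assumes that an ALU result exists at the end of EX, that loaded data exists at the end of MEM, and that the consumer needs its operand at the start of its own EX stage; `stalls_needed` is a hypothetical helper.

```python
# 0-based positions of the EX and MEM stages in the five-stage pipeline.
EX, MEM = 2, 3


def stalls_needed(producer_op, distance):
    """Stall cycles needed, with forwarding, when the consumer is
    `distance` instructions after the producer and uses the value in EX."""
    # Stage at whose end the producer's value first exists.
    ready_stage = MEM if producer_op == "lw" else EX
    # The consumer's EX runs `distance` cycles after the producer's EX.
    # Forwarding works only if that cycle is strictly later than the
    # cycle in which the value is produced.
    return max(0, ready_stage - (EX + distance) + 1)


print(stalls_needed("add", 1))  # add followed directly by sub
print(stalls_needed("lw", 1))   # lw followed directly by sub
print(stalls_needed("lw", 2))   # one independent instruction in between
```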

Page 30

Example of reordering code

Consider the following code segment in C:

    A = B + E;
    C = B + F;

Assume that all variables are in memory and are addressable as offsets from $t0.

Page 31

Example of reordering code

The corresponding MIPS code is:

    lw  $t1, 0($t0)      # load B; offset from $t0
    lw  $t2, 4($t0)      # load E
    add $t3, $t1, $t2    # B + E
    sw  $t3, 12($t0)     # store A
    lw  $t4, 8($t0)      # load F
    add $t5, $t1, $t4    # B + F
    sw  $t5, 16($t0)     # store C

Pages 32-33

Example of reordering code

What are the problems?

    lw  $t1, 0($t0)      # load B; offset from $t0
    lw  $t2, 4($t0)      # load E
    add $t3, $t1, $t2    # B + E
    sw  $t3, 12($t0)     # store A
    lw  $t4, 8($t0)      # load F
    add $t5, $t1, $t4    # B + F
    sw  $t5, 16($t0)     # store C

Each add uses a register ($t2, then $t4) that is loaded by the lw immediately before it, so each add causes a load-use stall.

Page 34

Example of reordering code

Code re-ordered with no stalls:

    lw  $t1, 0($t0)      # load B; offset from $t0
    lw  $t2, 4($t0)      # load E
    lw  $t4, 8($t0)      # load F
    add $t3, $t1, $t2    # B + E
    sw  $t3, 12($t0)     # store A
    add $t5, $t1, $t4    # B + F
    sw  $t5, 16($t0)     # store C
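The effect of the reordering can be checked by counting load-use pairs in both versions. The Python sketch below assumes full forwarding, so the only stall is a lw followed immediately by an instruction that reads the loaded register; the tiny `parse` function handles only the four instruction forms used in this example and is not a general MIPS parser.

```python
def parse(line):
    """Split one instruction into (op, dest, sources)."""
    op, rest = line.split(None, 1)
    args = [a.strip() for a in rest.split(",")]
    if op in ("lw", "sw"):
        # Address operand looks like "offset(base)".
        base = args[1][args[1].index("(") + 1:-1]
        if op == "lw":
            return op, args[0], [base]
        return op, None, [args[0], base]
    return op, args[0], args[1:]


def load_use_stalls(lines):
    """Count instructions that read a register loaded by the lw just before."""
    stalls = 0
    prev = None
    for line in lines:
        op, dest, srcs = parse(line)
        if prev is not None and prev[0] == "lw" and prev[1] in srcs:
            stalls += 1
        prev = (op, dest, srcs)
    return stalls


original = [
    "lw $t1, 0($t0)", "lw $t2, 4($t0)", "add $t3, $t1, $t2",
    "sw $t3, 12($t0)", "lw $t4, 8($t0)", "add $t5, $t1, $t4",
    "sw $t5, 16($t0)",
]
reordered = [
    "lw $t1, 0($t0)", "lw $t2, 4($t0)", "lw $t4, 8($t0)",
    "add $t3, $t1, $t2", "sw $t3, 12($t0)", "add $t5, $t1, $t4",
    "sw $t5, 16($t0)",
]
print(load_use_stalls(original), load_use_stalls(reordered))
```

Under these assumptions the original order incurs two stalls and the reordered version none, which saves two clock cycles.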

Page 35

Control Hazards

A control hazard (also called a branch hazard) arises from the need to make a decision based on the results of one instruction while others are executing.

The proper instruction cannot execute in the proper clock cycle because the instruction that was fetched is not the one that is needed.

Control hazards are caused by branch instructions.

Pages 36-38

Pipelined Datapath

[figures]

Pages 39-43

Pipelined DP for lw

One figure per stage:

- Page 39: Instruction fetch
- Page 40: Instruction decode
- Page 41: Instruction execute
- Page 42: Memory access
- Page 43: Write back

Page 44

Pipelined DP for sw

Note that in the memory access stage for sw:

- The register containing the data to be stored was read in an earlier stage and stored in ID/EX.
- The only way to make the data available during the MEM stage is to place the data into the EX/MEM pipeline register in the EX stage, just as we stored the effective address into EX/MEM.

Page 45

Key Point

Each logical component of the datapath (such as instruction memory, register read ports, ALU, data memory, and register write ports) can be used only within a single pipeline stage.

Otherwise we would have a structural hazard.

Pages 46-47

A bug

There is a bug in the pipeline design for the load instruction. What’s the problem?

During the write back stage of the load, we need the write register number.

What instruction is supplying this register number at this point in time?

It’s not the original lw instruction! By then the IF/ID pipeline register holds a later instruction.

Page 48

Pipelined DP for lw

To properly handle write back, the write register number is passed along with the load from ID/EX through EX/MEM to MEM/WB. [figure]
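The write-back bug can be made concrete with a small model. The Python sketch below assumes one instruction enters the pipeline per cycle with no stalls, so that when an instruction reaches WB, the IF/ID register holds the instruction fetched three slots after it; `writeback_targets` and its arguments are hypothetical names.

```python
def writeback_targets(dest_regs, carry_dest):
    """Register number actually used at each instruction's WB stage.

    dest_regs[i] is the intended destination register of instruction i.
    """
    used = []
    for i, dest in enumerate(dest_regs):
        if carry_dest:
            # Fixed design: the number travels with the instruction
            # through ID/EX, EX/MEM and MEM/WB.
            used.append(dest)
        else:
            # Buggy design: the number is taken from IF/ID, which now
            # belongs to the instruction three slots later (if any).
            later = i + 3
            used.append(dest_regs[later] if later < len(dest_regs) else None)
    return used


dests = ["$s1", "$t2", "$t3", "$t4", "$t5"]
print(writeback_targets(dests, carry_dest=False))
print(writeback_targets(dests, carry_dest=True))
```

In the buggy model the first instruction's result lands in `$t4`, the destination of the fourth instruction, which is the symptom the slides describe.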

Page 49

Pipelined Control

The pipeline registers (IF/ID, ID/EX, EX/MEM, and MEM/WB) are written at each clock cycle, so there are no separate write signals for them.

To specify control for the pipeline, we need only set the control values during each pipeline stage.

Each control line is associated with a component active in only a single pipeline stage.

Page 50

Pipelined Control

Divide the control lines into five groups:

1. Instruction fetch: the same operation happens in every clock cycle, so its control signals are always asserted.

2. Instruction decode: same as 1.

3. Execution/address calculation: the signals to be set are RegDst, ALUOp and ALUSrc.

4. Memory access: the signals to be set are Branch, MemRead and MemWrite. PCSrc is asserted when Branch is set and the ALU Zero output is true.

5. Write back: the signals to be set are MemtoReg and RegWrite.

Page 51

Pipelined Control

The 9 control signals [figure]

Page 52

Pipelined Control

Implementing pipelined control means setting the nine control lines to these values in each stage for each instruction.

Since the control lines start with the EX stage, we can create the control information during instruction decode.

Page 53

Pipelined Control

The 9 control signals [figure]

Page 54

Pipelined Control

4 of the 9 control lines are used in the EX stage (RegDst, the two ALUOp bits, and ALUSrc).

5 are passed on to the EX/MEM register.

Page 55

Pipelined Control

3 of the 9 lines are used in the MEM stage (Branch, MemRead, and MemWrite).

2 are passed on to the MEM/WB register.

Page 56

Pipelined Control

2 of the 9 control lines are used in the WB stage (MemtoReg and RegWrite).
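The 4/3/2 split of the nine control lines can be sketched in Python as follows. The slide's own table is a figure that is missing from this transcript, so the values in `CONTROL` are an assumption: they follow the usual textbook control table for R-format, lw, sw, and beq, with `None` standing for a don't-care. The function names are hypothetical.

```python
# Assumed control values per instruction class (None marks a don't-care).
# Order: RegDst, ALUOp1, ALUOp0, ALUSrc | Branch, MemRead, MemWrite |
#        RegWrite, MemtoReg
CONTROL = {
    "R-format": (1, 1, 0, 0, 0, 0, 0, 1, 0),
    "lw":       (0, 0, 0, 1, 0, 1, 0, 1, 1),
    "sw":       (None, 0, 0, 1, 0, 0, 1, 0, None),
    "beq":      (None, 0, 1, 0, 1, 0, 0, 0, None),
}


def decode(instr):
    """Create all nine lines in ID, grouped by the stage that uses them."""
    v = CONTROL[instr]
    return {"EX": v[0:4], "MEM": v[4:7], "WB": v[7:9]}


def advance(id_ex):
    """Control bits still carried after the EX stage and after the MEM stage."""
    ex_mem = {stage: id_ex[stage] for stage in ("MEM", "WB")}  # EX bits used up
    mem_wb = {"WB": ex_mem["WB"]}                              # MEM bits used up
    return ex_mem, mem_wb


id_ex = decode("lw")
ex_mem, mem_wb = advance(id_ex)
print(id_ex, ex_mem, mem_wb)
```

Each pipeline register carries only the groups that later stages still need, which is why the number of control bits shrinks from nine to five to two.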

Page 57

Pipelined Control

[figure]

Page 58

Data Hazards

Pipelined dependences for 5 instructions [figure]

Page 59

Forwarding

[figure]

Page 60

Datapath with Forwarding Unit

[figure]

Ignores forwarding of a store value to a store instruction.

Page 61

Forwarding Unit

The forwarding unit controls the ALU multiplexors to replace the value from a general-purpose register with the value from the proper pipeline register.
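The selection the forwarding unit makes for one ALU operand can be sketched in Python. The conditions below follow the usual textbook formulation and are an assumption, since the slide shows them only as a figure; register numbers are plain integers, register 0 is the hardwired zero register, and `forward_select` is a hypothetical name.

```python
def forward_select(id_ex_rs, ex_mem_regwrite, ex_mem_rd,
                   mem_wb_regwrite, mem_wb_rd):
    """Choose the source of one ALU operand.

    Returns "EX/MEM", "MEM/WB", or "REG" (value read from the register file).
    """
    # EX hazard: the immediately preceding instruction writes the register.
    # It is checked first because it holds the most recent value.
    if ex_mem_regwrite and ex_mem_rd != 0 and ex_mem_rd == id_ex_rs:
        return "EX/MEM"
    # MEM hazard: the instruction two ahead writes the register.
    if mem_wb_regwrite and mem_wb_rd != 0 and mem_wb_rd == id_ex_rs:
        return "MEM/WB"
    return "REG"


print(forward_select(16, True, 16, False, 0))
print(forward_select(16, False, 0, True, 16))
print(forward_select(16, True, 16, True, 16))
```

The same test is applied separately to the second ALU operand, using that operand's source register number.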

Page 62

Data Hazards and Stalls

One case where forwarding cannot solve the problem is when the instruction immediately following a load tries to read the register that the load writes.

E.g. a lw followed by a sub.

Page 63

Data Hazards and Stalls

[figure]

Since the dependence between the lw and the and goes back in time, this hazard cannot be solved by forwarding.

Page 64

Inserting a Stall

[figure]

Page 65

Inserting a Stall

The and instruction is turned into a nop.

All instructions beginning with the and instruction are delayed one cycle.

Page 66

Hazard Detection Unit

[figure]

Page 67

Hazard Detection Unit

The hazard detection unit controls the writing of the PC and IF/ID registers plus the multiplexor that chooses between the real control values and all 0s.

The hazard detection unit stalls and deasserts the control fields if the load-use hazard test is true.
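The load-use hazard test and the three things the unit controls can be sketched in Python. The test shown is the usual textbook condition and is an assumption here, since the slide gives it only as a figure; the output names `PCWrite`, `IFIDWrite`, and `zero_controls` are labels chosen for this sketch.

```python
def load_use_hazard(id_ex_memread, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in EX is a load whose destination (rt)
    is a source register of the instruction currently in ID."""
    return bool(id_ex_memread) and id_ex_rt in (if_id_rs, if_id_rt)


def hazard_outputs(id_ex_memread, id_ex_rt, if_id_rs, if_id_rt):
    """Signals driven by the hazard detection unit for one cycle."""
    stall = load_use_hazard(id_ex_memread, id_ex_rt, if_id_rs, if_id_rt)
    return {
        "PCWrite": not stall,      # hold the PC so the same fetch repeats
        "IFIDWrite": not stall,    # hold IF/ID so the same decode repeats
        "zero_controls": stall,    # send all-0 control values (a nop) into EX
    }


print(hazard_outputs(True, 10, 10, 11))
print(hazard_outputs(True, 10, 8, 9))
```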

Page 68

Control Hazard

Pipeline hazards involving branches.

- The branch instruction decides whether to branch in the MEM stage (clock cycle 4 in the figure).
- In the meantime, three following instructions will have begun execution.

Page 69

Control Hazard

[figure]

Pages 70-71

Solutions for Control Hazards

1. Assume branch not taken

- Continue execution down the sequential instruction stream.
- If the branch is taken, the instructions that are in the pipeline must be discarded. Execution continues at the branch target.
- If branches are untaken half the time, and if it costs little to discard the instructions, then this optimization halves the cost of control hazards.
- Discarding instructions means flushing the instructions in the IF, ID, and EX stages of the pipeline.
- Change the original control values to 0s, and let them percolate through the pipeline.
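The claim that assuming branch-not-taken halves the cost can be checked with a short calculation. This Python sketch assumes, as on the slides, that the branch is decided in MEM so three instructions (in IF, ID, and EX) are lost when the guess is wrong, and that flushing itself costs nothing extra; `expected_penalty` is a hypothetical name.

```python
def expected_penalty(flushed_if_taken, p_taken, predict_not_taken):
    """Average number of cycles lost per branch."""
    if predict_not_taken:
        # Cycles are lost only when the branch turns out to be taken.
        return p_taken * flushed_if_taken
    # Without the assumption, the pipeline waits on every branch.
    return float(flushed_if_taken)


always_wait = expected_penalty(3, 0.5, predict_not_taken=False)
assume_untaken = expected_penalty(3, 0.5, predict_not_taken=True)
print(always_wait, assume_untaken)
```

With half of all branches untaken, the average cost drops from 3 cycles to 1.5 cycles per branch.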

Solutions for Control Hazards

2. Reducing the delay of branches
   - Reduce the cost of the taken branch.
   - Move the branch execution earlier in the pipeline so that fewer instructions need to be flushed.
   - Requires two actions to occur earlier:
     i. Computing the branch target address.
     ii. Evaluating the branch decision.

Solutions for Control Hazards

2. Reducing the delay of branches
   i. Computing the branch target address.
      - Easy.
      - The PC and the immediate field are already in the IF/ID pipeline register.
      - Just move the branch adder from the EX stage to the ID stage.
      - The address calculation will be performed for all instructions, but only used when needed.
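The work done by the branch adder can be sketched as follows. This assumes the standard MIPS encoding, in which the 16-bit immediate is a signed word offset relative to the address of the instruction after the branch (PC + 4); the function names are hypothetical.

```python
def sign_extend16(imm: int) -> int:
    """Interpret the low 16 bits of imm as a two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def branch_target(pc: int, imm16: int) -> int:
    """pc is the address of the branch instruction itself."""
    # Word offset -> byte offset (shift left by 2), added to PC + 4, wrapped to 32 bits.
    return (pc + 4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF


print(hex(branch_target(0x1000, 3)))       # three instructions past PC + 4
print(hex(branch_target(0x1000, 0xFFFF)))  # offset -1: branches back to itself
```

The adder needs only the PC and the immediate field, both of which are in IF/ID, which is why this half of the change is the easy one.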

Branch adder location

Move from EX to ID stage

Solutions for Control Hazards

2. Reducing the delay of branches
   ii. Evaluating the branch decision.
      - Harder.
      - Need to compare the two registers read during the ID stage.
      - During ID, we must:
        - Decode the instruction.
        - Decide whether a bypass to the equality unit is needed. The source can come from the EX/MEM or MEM/WB pipeline registers.
        - Complete the comparison.
        - Set the PC to the branch address if necessary.
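The bypass decision for the equality unit might look like the sketch below. The `Latch` record and function names are hypothetical, and the sketch assumes each latch already holds the final result of its instruction (true for an ALU instruction in EX/MEM, not for a load that has yet to read memory).

```python
from typing import NamedTuple, Optional, Sequence


class Latch(NamedTuple):
    # Hypothetical view of the fields of EX/MEM or MEM/WB that the bypass logic inspects.
    reg_write: bool
    dest: int
    value: int


def read_operand(reg: int, regfile: Sequence[int],
                 ex_mem: Optional[Latch], mem_wb: Optional[Latch]) -> int:
    """Pick the newest value of reg that is visible during ID."""
    if reg == 0:
        return 0  # register $0 is hard-wired to zero in MIPS
    if ex_mem and ex_mem.reg_write and ex_mem.dest == reg:
        return ex_mem.value  # newest producer wins
    if mem_wb and mem_wb.reg_write and mem_wb.dest == reg:
        return mem_wb.value
    return regfile[reg]


def beq_taken(rs: int, rt: int, regfile: Sequence[int],
              ex_mem: Optional[Latch], mem_wb: Optional[Latch]) -> bool:
    """Equality test performed in ID, using bypassed operands where needed."""
    return (read_operand(rs, regfile, ex_mem, mem_wb)
            == read_operand(rt, regfile, ex_mem, mem_wb))


regs = [0] * 32
regs[8], regs[9] = 5, 7
print(beq_taken(8, 9, regs, None, None))               # stale register file values differ
print(beq_taken(8, 9, regs, Latch(True, 9, 5), None))  # bypassed value makes them equal
```

Checking EX/MEM before MEM/WB matters: when both stages write the same register, the younger result in EX/MEM is the one the branch must see.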

Solutions for Control Hazards

2. Reducing the delay of branches
   ii. Evaluating the branch decision.
      - The values in a branch comparison are needed during ID but may be produced later in time. This can cause a data hazard, and a stall might be needed.
      - Ex. If an ALU instruction immediately preceding a branch produces one of the operands for the comparison in the branch, a stall will be required.
      - This is because the ALU instruction is in its EX stage during the ID cycle of the branch, so its result is not ready until the end of that cycle.

Solutions for Control Hazards

2. Reducing the delay of branches
   ii. Evaluating the branch decision.
      - Ex. If a load instruction immediately preceding a branch produces one of the operands for the comparison in the branch, two stalls will be required.
      - This is because the result from the load appears at the end of the MEM cycle but is needed at the beginning of the ID cycle of the branch.
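The two stall counts follow from one timing rule, sketched below. It assumes the five-stage pipeline of this chapter, with results bypassed to the ID-stage equality unit from EX/MEM or MEM/WB; the function name and the `distance` parameter are hypothetical.

```python
def branch_stalls(producer: str, distance: int) -> int:
    """Stall cycles a branch resolved in ID needs before it can compare.

    producer: "alu" or "load", the instruction that writes a branch operand.
    distance: 1 if the producer immediately precedes the branch, 2 if one
              instruction sits between them, and so on.
    """
    if distance < 1:
        raise ValueError("distance must be at least 1")
    # Cycles from the producer's ID cycle to the first ID cycle that can use its result:
    # an ALU result exists at the end of EX (usable 2 cycles later),
    # a load result exists at the end of MEM (usable 3 cycles later).
    ready_after = {"alu": 2, "load": 3}[producer]
    return max(0, ready_after - distance)


print(branch_stalls("alu", 1))   # ALU instruction right before the branch
print(branch_stalls("load", 1))  # load right before the branch
```

The same rule shows that one independent instruction between the producer and the branch removes the ALU stall entirely and leaves a single stall for a load.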

Solutions for Control Hazards

2. Reducing the delay of branches
   - Moving the branch execution to the ID stage is an improvement since it reduces the penalty of a taken branch to only one instruction, namely the one currently being fetched.
   - To flush that instruction, zero the instruction field of the IF/ID pipeline register.
   - Clearing the register transforms the fetched instruction into a nop.

Solutions for Control Hazards

3. Dynamic branch prediction
   - Assuming a branch is not taken is one simple form of branch prediction.
   - With deeper pipelines and multiple issue, the branch penalty increases in terms of instructions lost.
   - A simple static branch prediction wastes too much performance.
   - It is possible to predict branch behavior dynamically (i.e., during program execution).

Dynamic Branch Prediction

Implementation:
- A branch prediction buffer or branch history table is used.
- This is a small memory indexed by the lower portion of the address of the branch instruction.
- The memory contains a bit that says whether the branch was recently taken or not.

Dynamic Branch Prediction

- Look up the address of the instruction to see if a branch was taken the last time this instruction was executed.
- If so, begin fetching new instructions from the same place as the last time.

Dynamic Branch Prediction

- The bit may have been put there by another branch instruction that has the same low-order address bits.
- If the hint is wrong, then:
  - The incorrectly predicted instructions are deleted.
  - The prediction bit is inverted and stored back.
  - The proper sequence is fetched and executed.
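A 1-bit branch prediction buffer as described above can be sketched in a few lines. The class name, the table size, the initial "not taken" state, and the choice to drop the two byte-offset bits of a word-aligned address are all assumptions made for illustration.

```python
class OneBitPredictor:
    """Small table of 1-bit hints, indexed by low-order branch address bits."""

    def __init__(self, index_bits: int = 4) -> None:
        self.mask = (1 << index_bits) - 1
        self.table = [False] * (1 << index_bits)  # assumption: start as "not taken"

    def _index(self, pc: int) -> int:
        # Instructions are word aligned, so skip the two byte-offset bits.
        return (pc >> 2) & self.mask

    def predict(self, pc: int) -> bool:
        return self.table[self._index(pc)]

    def update(self, pc: int, taken: bool) -> bool:
        """Record the real outcome; return True if the hint was wrong."""
        i = self._index(pc)
        mispredicted = self.table[i] != taken
        self.table[i] = taken  # same effect as inverting the bit on a misprediction
        return mispredicted


bp = OneBitPredictor(index_bits=4)
bp.update(0x1000, taken=True)
# 0x1040 has the same low-order index bits as 0x1000, so it inherits that hint.
print(bp.predict(0x1040))
```

There is no tag check, so the hint for one branch can come from a different branch that maps to the same entry. The prediction is only a hint, so correctness is unaffected; only accuracy suffers.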

Dynamic Branch Prediction

Problem:
- If the branch is almost always taken, we will likely predict incorrectly twice, rather than once, when it is not taken.

Example:
- Consider a loop branch that branches nine times in a row, then is not taken once on the tenth time. What is the prediction accuracy, assuming the prediction bit for this branch remains in the prediction buffer?

Dynamic Branch Prediction

Answer:
- The steady-state prediction behavior will mispredict on the first and last loop iterations.
- Mispredicting the last iteration is inevitable since the prediction bit will say taken: the branch has been taken nine times in a row at that point.
- Mispredicting on the first iteration happens because the bit is flipped on the prior execution of the last iteration of the loop.

Dynamic Branch Prediction

- The prediction accuracy for this branch that is taken 90% of the time is only 80% (8 out of 10).
- Ideally, the accuracy of the predictor should match the taken branch frequency for these highly regular branches.
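The 80% figure can be reproduced with a short simulation of a single prediction bit. The function name is hypothetical; the starting value of the bit is set to "not taken" to model the steady state, where the previous pass through the loop ended with an untaken branch.

```python
def one_bit_accuracy(outcomes: list, initial: bool) -> float:
    """Fraction of correct predictions made by a single 1-bit hint."""
    bit = initial
    correct = 0
    for taken in outcomes:
        correct += (bit == taken)
        bit = taken  # the bit always remembers the most recent outcome
    return correct / len(outcomes)


loop = [True] * 9 + [False]  # taken nine times, then falls through once
print(one_bit_accuracy(loop, initial=False))
```

The two misses are the first iteration (the bit still says "not taken") and the last one (the bit says "taken"), giving 8 correct predictions out of 10.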

Dynamic Branch Prediction

- A 2-bit prediction scheme.
- A prediction must be wrong twice before it is changed.
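One common way to realize a 2-bit scheme is a saturating counter, sketched below. This is an assumption: the state diagram of the original slide is not recoverable from the text and may use different transitions. The counter runs from 0 to 3, and values 2 and 3 predict taken.

```python
def two_bit_accuracy(outcomes: list, counter: int = 3) -> float:
    """Accuracy of one 2-bit saturating counter (0..3; 2 and 3 predict taken)."""
    correct = 0
    for taken in outcomes:
        predicted_taken = counter >= 2
        correct += (predicted_taken == taken)
        # Move one step toward the actual outcome, saturating at 0 and 3.
        counter = min(3, counter + 1) if taken else max(0, counter - 1)
    return correct / len(outcomes)


loop = ([True] * 9 + [False]) * 100  # the nine-taken, one-untaken loop, repeated
print(two_bit_accuracy(loop))
```

On this pattern the single untaken branch only weakens the prediction to state 2, so the first iteration of the next pass is still predicted taken and only the loop exit is missed, giving 90% accuracy.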

2-bit prediction scheme

Solutions for Control Hazards

4. Scheduling the branch delay slot

Partial MIPS Instructions

Instruction  OP (6)  rs (5)  rt (5)  rd (5)  shamt (5)  funct (6)
LW           35      rs      rd      offset
SW           43      rs      rd      offset
BEQ          4       rs      rt      offset
ADD          0       rs      rt      rd      0          32
SUB          0       rs      rt      rd      0          34
AND          0       rs      rt      rd      0          36
OR           0       rs      rt      rd      0          37
SLT          0       rs      rt      rd      0          42
ADDI         8       rs      rt      imm
OUT          63      rs

* All numbers are in decimal.