Lec 15.1ECE473
Pipeline: Control Hazard
ECE473 Computer Architecture and Organization
Lecturer: Prof. Yifeng Zhu
Fall, 2015
Portions of these slides are derived from:
Dave Patterson © UCB
Pipelining Outline
• Introduction
– Defining Pipelining
– Pipelining Instructions
• Hazards
– Structural hazards
– Data Hazards
– Control Hazards
• Performance
• Controller implementation
Pipeline Hazards
• Where one instruction cannot immediately follow another
• Types of hazards
– Structural hazards - attempt to use same resource twice
– Control hazards - attempt to make decision before condition is evaluated
– Data hazards - attempt to use data before it is ready
• Can always resolve hazards by waiting
Control Hazards
A control hazard is when we need to find the destination of a branch, and can’t fetch any new instructions until we know that destination.
A branch is either
– Taken: PC <= PC + 4 + Imm
– Not Taken: PC <= PC + 4
Control Hazard on Branches: Three-Stage Stall
10: beq r1,r3,36
14: and r2,r3,r5
18: or r6,r1,r7
22: add r8,r1,r9
36: xor r10,r1,r11
[Figure: pipeline diagram; each of the five instructions above passes through the Ifetch, Reg, ALU, DMem, Reg stages]
The penalty when the branch is taken is 3 cycles!
Basic Pipelined Processor
In our original design, branches have a penalty of 3 cycles.
Reducing Branch Delay
Move the following to the ID stage:
a) Branch-target address calculation
b) Branch condition decision
Reduced penalty (1 cycle) when the branch is taken!
Reducing Branch Delay: move branch logic to ID stage
[Figure: pipeline timing diagram, time axis 0 to 18. add $r4,$r5,$r6 and beq $r0,$r1,tgt each pass through IF ID EX MEM WB; a row of BUBBLEs (STALL) follows beq; sw $s4,200($t5) then passes through IF ID EX MEM WB. Annotations: "beq writes PC here", "new PC used here".]
Control Hazard Solution #1
• Stall
– stop loading instructions until result is available
Control Hazard Solution #2: Branch Prediction
• Just stalling for each branch is not practical
• Common assumption: branch not taken
• When assumption fails: flush three instructions
[Figure (Fig. 6.37): pipeline diagram over clock cycles CC 1 to CC 9, showing the following instructions in program execution order, each passing through the IM, Reg, DM, Reg stages]
40 beq $1, $3, 7
44 and $12, $2, $5
48 or $13, $6, $2
52 add $14, $2, $2
72 lw $4, 50($7)
Static Branch Prediction
For every branch, predict whether the branch will be taken or not taken.
Predicting branch not taken:
1. Speculatively fetch and execute in-line instructions following the branch
2. If the prediction is incorrect, flush the speculated instructions from the pipeline
• Convert these instructions to NOPs by clearing pipeline registers
• These have not updated memory or registers at time of flush
Predicting branch taken:
1. Speculatively fetch and execute instructions at the branch target address
2. Useful only if target address known earlier than branch outcome
• May require stall cycles till target address known
• Flush pipeline if prediction is incorrect
• Must ensure that flushed instructions do not update memory/registers
Flush instructions in Branch Hazard
36 sub $10, $4, $8
40 beq $1, $3, 7 # target = 40 + 4 + 7*4 = 72
44 and $12, $2, $5
48 or $13, $2, $6
52 ….
….
72 lw $4, 50($7)
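The target arithmetic in the comment above can be checked with a small sketch. It assumes, as the comment does, that the branch immediate counts instructions (words), so it is multiplied by 4 before being added to PC + 4. The function names `branch_target` and `next_pc` are illustrative only.

```python
def branch_target(pc: int, word_offset: int) -> int:
    """Target of a taken branch: PC + 4 + (offset in words * 4 bytes)."""
    return pc + 4 + word_offset * 4


def next_pc(pc: int, word_offset: int, taken: bool) -> int:
    """Next PC after a branch: the target if taken, otherwise fall through."""
    return branch_target(pc, word_offset) if taken else pc + 4


# beq at address 40 with immediate 7, as in the listing above
print(branch_target(40, 7))          # 72 (the lw instruction)
print(next_pc(40, 7, taken=False))   # 44 (the and instruction)
```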
Control Hazard - Stall
[Figure: pipeline timing diagram, time axis 0 to 18. add $r4,$r5,$r6 and beq $r0,$r1,tgt each pass through IF ID EX MEM WB; a row of BUBBLEs (STALL) follows beq; sw $s4,200($t5) then passes through IF ID EX MEM WB. Annotations: "beq writes PC here", "new PC used here".]
Control Hazard - Correct Prediction
[Figure: pipeline timing diagram, time axis 0 to 18, annotated "Fetch assuming branch taken". add $r4,$r5,$r6, beq $r0,$r1,tgt, and tgt: sw $s4,200($t5) each pass through IF ID EX MEM WB, with no bubbles shown.]
Control Hazard - Incorrect Prediction
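[Figure: pipeline timing diagram, time axis 0 to 18. add $r4,$r5,$r6 and beq $r0,$r1,tgt each pass through IF ID EX MEM WB. tgt: sw $s4,200($t5) is fetched (IF) on the incorrect prediction and becomes the “squashed” instruction, replaced by BUBBLEs (incorrect - STALL). or $r8,$r8,$r9 then passes through IF ID EX MEM WB.]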
Flush instructions at IF stage in Branch Hazard
Turn the instruction in the IF stage into a nop.
[Figure annotation: zero control signals]
Branch Behavior in Programs
• Based on SPEC benchmarks on DLX
– Branches occur with a frequency of 14% to 16% in integer programs and 3% to 12% in floating point programs.
– About 75% of the branches are forward branches
– 60% of forward branches are taken
– 80% of backward branches are taken
– 67% of all branches are taken
• Why are branches (especially backward branches) more likely to be taken than not taken?
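To get a feel for what these frequencies cost, here is a rough sketch of the effective CPI under a predict-not-taken scheme. The assumptions are not from the slides: an ideal base CPI of 1, a branch frequency of 15% (the midpoint of the integer range above), and a penalty paid only by taken branches. The 67% taken rate and the 3-cycle and 1-cycle penalties are the figures quoted in this lecture. `effective_cpi` is an illustrative name.

```python
def effective_cpi(branch_freq: float, taken_fraction: float,
                  taken_penalty: int, base_cpi: float = 1.0) -> float:
    """Base CPI plus the average stall cycles per instruction caused by taken branches."""
    return base_cpi + branch_freq * taken_fraction * taken_penalty


# Assumed: 15% of instructions are branches, 67% of them are taken.
print(f"{effective_cpi(0.15, 0.67, 3):.4f}")  # 1.3015 (branch resolved late, 3-cycle penalty)
print(f"{effective_cpi(0.15, 0.67, 1):.4f}")  # 1.1005 (branch resolved in ID, 1-cycle penalty)
```

Under these assumptions, moving the branch decision into the ID stage removes about two thirds of the branch overhead.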
1-Bit Branch Prediction
• Branch History Table (BHT): Lower bits of PC address index table of 1-bit values
– Says whether or not branch taken last time
– No address check (saves HW, but may not be right branch)
– If prediction is wrong, invert prediction bit
[Figure: the address bits a31 a30 … a11 … a2 a1 a0 of a branch instruction in instruction memory; a 10-bit index taken from the address selects one prediction bit in a 1K-entry BHT]
Hypothesis: branch will do the same again.
1 = branch was last taken
0 = branch was last not taken
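A minimal sketch of such a table follows. It assumes the index is address bits a11..a2 (the two byte-offset bits are dropped) and that all entries start at 0; neither detail is fixed by the slide. The class name `OneBitBHT` is illustrative. Because there is no address check, two branches whose addresses share the same index bits also share a prediction bit.

```python
class OneBitBHT:
    """1K-entry branch history table holding one prediction bit per entry."""

    def __init__(self, index_bits: int = 10):
        self.mask = (1 << index_bits) - 1
        self.bits = [0] * (1 << index_bits)  # assumption: start as "not taken"

    def _index(self, pc: int) -> int:
        # Assumption: drop byte-offset bits a1 a0, then keep the low 10 bits.
        return (pc >> 2) & self.mask

    def predict(self, pc: int) -> bool:
        """True means 'predict taken'."""
        return self.bits[self._index(pc)] == 1

    def update(self, pc: int, taken: bool) -> None:
        # Recording the last outcome is the same as inverting the bit on a misprediction.
        self.bits[self._index(pc)] = 1 if taken else 0


bht = OneBitBHT()
bht.update(0x40, taken=True)
print(bht.predict(0x40))           # True: same branch, taken last time
print(bht.predict(0x40 + 0x1000))  # True: a different branch aliasing to the same entry
```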
1-Bit Branch Prediction
• Example:
Consider a loop branch that is taken 9 times in a row and then not taken once. What is the prediction accuracy of a 1-bit predictor for this branch, assuming only this branch ever changes its corresponding prediction bit?
– Answer: 80%. There are two mispredictions in every 10 branches: one on the first iteration and one on the last iteration. Why? The not-taken outcome at loop exit flips the bit, so the next time the loop runs, its first (taken) branch is mispredicted as well.
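The 80% figure can be reproduced with a short simulation. This is a sketch assuming steady state, meaning the prediction bit was left at "not taken" by the previous loop exit. `one_bit_accuracy` is an illustrative name.

```python
def one_bit_accuracy(outcomes, initial_bit=False):
    """Fraction of correct predictions for a 1-bit predictor on one branch."""
    bit = initial_bit  # True = predict taken
    correct = 0
    for taken in outcomes:
        if bit == taken:
            correct += 1
        else:
            bit = taken  # misprediction: invert the prediction bit
    return correct / len(outcomes)


loop = [True] * 9 + [False]        # taken 9 times, then not taken once
print(one_bit_accuracy(loop))      # 0.8
print(one_bit_accuracy(loop * 5))  # 0.8 (two mispredictions per pass through the loop)
```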
2-Bit Branch Prediction (Jim Smith, 1981)
• Solution: 2-bit scheme where the prediction changes only after two mispredictions in a row
[Figure: four-state diagram. States 11 and 10 are "Predict Taken" (green: go, taken); states 01 and 00 are "Predict Not Taken" (red: stop, not taken). Transitions between states are labeled T (taken) and NT (not taken).]
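As a sketch of why the second bit helps, the loop branch from the 1-bit example (taken 9 times, then not taken once) can be run through a 2-bit predictor. The code assumes a saturating-counter encoding, with 11 and 10 predicting taken and 01 and 00 predicting not taken. This is one common 2-bit variant and may differ in its exact transitions from the state diagram on the slide. `two_bit_accuracy` is an illustrative name.

```python
def two_bit_accuracy(outcomes, counter=3):
    """Fraction of correct predictions for a 2-bit saturating counter (0..3)."""
    correct = 0
    for taken in outcomes:
        predict_taken = counter >= 2  # states 10 and 11 predict taken
        if predict_taken == taken:
            correct += 1
        # Move one step toward the actual outcome, saturating at 0 and 3.
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return correct / len(outcomes)


loop = [True] * 9 + [False]
print(two_bit_accuracy(loop * 5))  # 0.9 (only the loop exit is mispredicted)
```

The single not-taken outcome at loop exit moves the counter from 11 to 10, which still predicts taken, so the first iteration of the next pass is predicted correctly.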
2-bit Predictor Statistics
Prediction accuracy of 4K-entry 2-bit prediction buffer on SPEC89 benchmarks: accuracy is lower for integer programs (gcc, espresso, eqntott, li) than for FP
2-bit Predictor Statistics
Prediction accuracy of 4K-entry 2-bit prediction buffer vs. “infinite” 2-bit buffer: increasing buffer size from 4K does not significantly improve performance
Control Hazard Solution #3: Delayed Branches
• Delayed branches – code rearranged by compiler to place an independent instruction after every branch (in the delay slot).
Before:
add $R4,$R5,$R6
beq $R1,$R2,20
lw $R3,400($R0)
After (add moved into the delay slot):
beq $R1,$R2,20
add $R4,$R5,$R6
lw $R3,400($R0)
Scheduling the Delay Slot
Delayed Branch
• Instruction in branch delay slot is always executed
• Compiler (tries to) move a useful instruction into delay slot.
(a) From before the Branch: Always helpful when possible
Before                  After
ADD R1, R2, R3
BEQZ R2, L1             BEQZ R2, L1
DELAY SLOT              ADD R1, R2, R3
-                       -
L1:                     L1:
• If the ADD instruction were ADD R2, R1, R3, the move would not be possible, because the branch reads R2.
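The legality condition for case (a) can be sketched as a dependence check: an instruction from before the branch may move into the delay slot only if the branch does not read the register it writes. The `Instr` tuple and `can_fill_from_before` are hypothetical names for illustration, and the check covers only this one dependence; a real compiler scheduler checks considerably more.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str
    dest: Optional[str]    # register written, or None (e.g. for a branch)
    srcs: Tuple[str, ...]  # registers read


def can_fill_from_before(candidate: Instr, branch: Instr) -> bool:
    """True if the branch condition does not depend on the candidate's result."""
    return candidate.dest not in branch.srcs


beqz = Instr("BEQZ", None, ("R2",))
print(can_fill_from_before(Instr("ADD", "R1", ("R2", "R3")), beqz))  # True: branch does not read R1
print(can_fill_from_before(Instr("ADD", "R2", ("R1", "R3")), beqz))  # False: branch reads R2
```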
Delayed Branch
(b) From the Target: Helps when branch is taken. May duplicate instructions
Before                  After
ADD R2, R1, R3          ADD R2, R1, R3
BEQZ R2, L1             BEQZ R2, L2
DELAY SLOT              SUB R4, R5, R6
-                       -
L1: SUB R4, R5, R6      L1: SUB R4, R5, R6
L2:                     L2:
Instructions between BEQZ and SUB (in fall through) must not use R4.
Why is instruction at L1 duplicated? What if R5 or R6 changed?
Delayed Branch
(c) From Fall Through: Helps when branch is not taken.
Before                  After
ADD R2, R1, R3          ADD R2, R1, R3
BEQZ R2, L1             BEQZ R2, L1
DELAY SLOT              SUB R4, R5, R6
SUB R4, R5, R6          -
-
L1:                     L1:
Instructions at target (L1 and after) must not use R4 till set again.
• Cancelling (Nullifying) Branch:
– Branch instruction indicates direction of prediction.
– If mispredicted, the instruction in the delay slot is cancelled.
– Greater flexibility for compiler to schedule instructions.
Delayed Branch
• Limitations of delayed branch
– Compiler may not find appropriate instructions to fill delay slots. Then it fills delay slots with no-ops.
– Visible architectural feature – likely to change with new implementations
» Pipeline structure is exposed to compiler. Need to know how many delay slots.
Summary - Control Hazard Solutions
• Stall - stop fetching instr. until result is available
– Significant performance penalty
– Hardware required to stall
• Predict - assume an outcome and continue fetching (undo if prediction is wrong)
– Performance penalty only when guess wrong
– Hardware required to "squash" instructions
• Delayed branch - specify in architecture that following instruction is always executed
– Compiler re-orders instructions into delay slot
– Insert "NOP" (no-op) operations when can't use (~50%)
– This is how original MIPS worked