![Page 1: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/1.jpg)
CS222: Pipeline: Branch Performance
& Superscalar/VLIW
Dr. A. Sahu
Dept of Comp. Sc. & Engg.
Indian Institute of Technology Guwahati
![Page 2: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/2.jpg)
Outline
• Improving Branch Performance
– Previous Class: Branch Elimination, Branch Speed Up
– Branch Prediction
   • Fixed, Static, Dynamic
– Branch Target Capture
   • BTB, BTAC, BTIC
• Introduction to VLIW and Superscalar
![Page 3: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/3.jpg)
Improving Branch Performance
• Branch Elimination
– replace branch with other instructions
• Branch Speed Up
– reduce time for computing CC and TIF
• Branch Prediction
– guess the outcome and proceed, undo if necessary
• Branch Target Capture
– make use of history
![Page 4: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/4.jpg)
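Branch Elimination
Use conditional instructions (predicated execution)
[Figure: flowchart with condition C (T/F exits) guarding statement S, written as the conditional instruction C : S]
With branch:
OP1
BC CC = Z, ∗ + 2
ADD R3, R2, R1
OP2
With conditional instruction:
OP1
ADD R3, R2, R1, NZ
OP2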
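A minimal Python sketch of why the two instruction sequences are equivalent. It assumes (not stated on the slide) that `ADD R3, R2, R1` means R3 = R2 + R1 and that the `NZ` suffix commits the result only when the zero flag is clear; the function names are hypothetical.

```python
def with_branch(r1, r2, r3, zero_flag):
    """Models: BC CC = Z, *+2 ; ADD R3, R2, R1 (the ADD is skipped when Z is set)."""
    if zero_flag:           # branch taken: jump over the ADD, R3 unchanged
        return r3
    return r2 + r1          # fall through: ADD R3, R2, R1


def predicated(r1, r2, r3, zero_flag):
    """Models: ADD R3, R2, R1, NZ (no branch in the instruction stream)."""
    result = r2 + r1                        # the ADD always flows down the pipeline
    return r3 if zero_flag else result      # write-back is gated by the NZ predicate


# Both versions leave the same value in R3 for either flag setting.
for flag in (False, True):
    print(flag, with_branch(1, 2, 99, flag), predicated(1, 2, 99, flag))
```

The predicated form never changes the fetch address, so there is no branch for the pipeline to resolve.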
![Page 5: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/5.jpg)
Branch Speed Up: early target address generation
• Assume each instruction is a branch
• Generate target address while decoding
• If target is in the same page, omit translation
• After decoding, discard target address if not a branch
[Pipeline timing diagram for BC with stages IF, D, AG, TIF]
![Page 6: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/6.jpg)
Branch Speed Up: increase CC - branch gap
Increase the gap between condition checking and branching
• Early CC setting
• Delayed branch
![Page 7: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/7.jpg)
Improving Branch Performance
• Branch Elimination
– replace branch with other instructions
• Branch Speed Up
– reduce time for computing CC and TIF
• Branch Prediction
– guess the outcome and proceed, undo if necessary
• Branch Target Capture
– make use of history
![Page 8: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/8.jpg)
Branch Prediction
• Treat conditional branches as unconditional branches / NOP
• Undo if necessary
Strategies:
– Fixed (always guess inline or guess target)
– Static (guess on the basis of instruction type)
– Dynamic (guess based on recent history)
![Page 9: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/9.jpg)
Static Branch Prediction

| Instr | % | Guess | Branch | Correct |
|---|---|---|---|---|
| uncond | 14.5 | always | 100% | 14.5% |
| cond | 58 | never | 54% | 27% |
| loop | 9.8 | always | 91% | 9% |
| call/ret | 17.7 | always | 100% | 17.7% |
| Total | | | | 68.2% |
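The "Correct" column can be recomputed from the other three, as in the sketch below. It assumes that "%" is the share of each type among all branch instructions and that "Branch" is the fraction of that type that actually branches; the names `rows` and `correct_share` are illustrative only.

```python
# instr type -> (share of all branch instructions in %, guess "branch"?, fraction that branch)
rows = {
    "uncond":   (14.5, True,  1.00),
    "cond":     (58.0, False, 0.54),
    "loop":     (9.8,  True,  0.91),
    "call/ret": (17.7, True,  1.00),
}


def correct_share(share, guess_branch, frac_branch):
    """Percentage points of all branches that this type contributes as correct guesses."""
    hit_rate = frac_branch if guess_branch else 1.0 - frac_branch
    return share * hit_rate


for name, row in rows.items():
    print(f"{name:9s} {correct_share(*row):5.2f}%")

total = sum(correct_share(*row) for row in rows.values())
print(f"total     {total:5.2f}%")
```

The unrounded total is about 67.8%; the slide's 68.2% comes from rounding the cond and loop rows to 27% and 9% before adding.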
![Page 10: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/10.jpg)
Branch Prediction: guess inline, go inline
[Pipeline timing diagram; CC is annotated on I-1 and on branch I]
I-1: IF IF D AG AG DF DF EX EX
I: IF IF D AG AG TIF TIF
I+1: IF IF D ...
I+2: ...
delay = 0
![Page 11: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/11.jpg)
Branch Prediction: guess inline, goto target
[Pipeline timing diagram; CC is annotated on I-1 and on branch I]
I-1: IF IF D AG AG DF DF EX EX
I: IF IF D AG AG TIF TIF
T: IF IF D ...
T+1: IF IF D ...
delay = 6
![Page 12: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/12.jpg)
Branch Prediction: guess target, go inline
[Pipeline timing diagram; CC is annotated on I-1 and on branch I]
I-1: IF IF D AG AG DF DF EX EX
I: IF IF D AG AG TIF TIF
T: ... D
I+1: ... D' D
I+2: ... D' D
delay = 5
![Page 13: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/13.jpg)
Branch Prediction: guess target, goto target
[Pipeline timing diagram; CC is annotated on I-1 and on branch I]
I-1: IF IF D AG AG DF DF EX EX
I: IF IF D AG AG TIF TIF
T: IF IF D ...
T+1: IF IF D ...
delay = 4
Same as unconditional branch
![Page 14: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/14.jpg)
Static prediction strategy
Let p = probability of taking the branch
guess target: $delay_t = 4p + 5(1 - p) = 5 - p$
guess inline: $delay_i = 6p + 0(1 - p) = 6p$
⇒ if ($delay_t < delay_i$) guess target, else guess inline
$delay_t < delay_i \Rightarrow 5 - p < 6p \Rightarrow p > 5/7 \approx 0.71$
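The decision rule can be checked numerically. The sketch below uses the delays 4, 5, 6 and 0 from the preceding slides and exact fractions to avoid rounding at the break-even point; the function names are illustrative.

```python
from fractions import Fraction


def delay_guess_target(p):
    """Expected delay when guessing target: 4 cycles if taken, 5 if not."""
    return 4 * p + 5 * (1 - p)


def delay_guess_inline(p):
    """Expected delay when guessing inline: 6 cycles if taken, 0 if not."""
    return 6 * p + 0 * (1 - p)


def best_guess(p):
    """Pick the guess with the smaller expected delay (ties go to inline)."""
    return "target" if delay_guess_target(p) < delay_guess_inline(p) else "inline"


for p in (Fraction(1, 2), Fraction(5, 7), Fraction(3, 4)):
    print(p, delay_guess_target(p), delay_guess_inline(p), best_guess(p))
```

At p = 5/7 both guesses cost 30/7 cycles, so guessing target only pays off above that value.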
![Page 15: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/15.jpg)
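Static prediction strategy - thresholds for different instructions
[Pipeline timing diagram; CC is annotated on I-1 and on branch I]
I-1: IF IF D AG AG DF DF EX EX
I: IF IF D AG AG TIF TIF

Delay by guess (rows) and actual outcome (columns):

| guess ↓ / actual → | T | I |
|---|---|---|
| T | 4 | 5 |
| I | 6 | 0 |

guess target if $4p + 5(1 - p) < 6p + 0(1 - p)$, i.e. $p > 5/7 \approx 0.71$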
![Page 16: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/16.jpg)
Static prediction strategy - thresholds for different instructions
Loop control
[Pipeline timing diagram; CC is annotated on I-1 and on branch I]
I-1: IF IF D AG AG DF DF EX EX
I: IF IF D AG AG TIF TIF EX EX

Delay by guess (rows) and actual outcome (columns):

| guess ↓ / actual → | T | I |
|---|---|---|
| T | 4 | 6 |
| I | 7 | 1 |

guess target if $4p + 6(1 - p) < 7p + 1(1 - p)$, i.e. $p > 5/8 = 0.625$
![Page 17: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/17.jpg)
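Static prediction strategy - thresholds for different instructions
Register address
[Pipeline timing diagram; CC is annotated on I-1 and on branch I]
I-1: IF IF D AG AG DF DF EX EX
I: IF IF D AG TIF TIF

Delay by guess (rows) and actual outcome (columns):

| guess ↓ / actual → | T | I |
|---|---|---|
| T | 3 | 5 |
| I | 6 | 0 |

guess target if $3p + 5(1 - p) < 6p + 0(1 - p)$, i.e. $p > 5/8 = 0.625$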
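All three thresholds follow from one formula. Writing the delays as d_tt, d_ti (guess target, actual target/inline) and d_it, d_ii (guess inline, actual target/inline), the inequality d_tt·p + d_ti·(1 − p) < d_it·p + d_ii·(1 − p) rearranges to p > (d_ti − d_ii) / ((d_it − d_tt) + (d_ti − d_ii)). The sketch below applies it to the three delay matrices; the label "base case" and the function name are assumptions for illustration.

```python
from fractions import Fraction


def threshold(d_tt, d_ti, d_it, d_ii):
    """Break-even probability p*: guess target when p > p*.

    Assumes a wrong guess costs more than a right one in each column
    (d_it > d_tt and d_ti > d_ii), so the denominator is positive.
    """
    return Fraction(d_ti - d_ii, (d_it - d_tt) + (d_ti - d_ii))


# (d_tt, d_ti, d_it, d_ii) read row by row from the three delay matrices
cases = {
    "base case":        (4, 5, 6, 0),
    "loop control":     (4, 6, 7, 1),
    "register address": (3, 5, 6, 0),
}

for name, delays in cases.items():
    p_star = threshold(*delays)
    print(f"{name:17s} p > {p_star} = {float(p_star):.3f}")
```

The last two cases both give exactly 5/8 = 0.625, which the slides print as .62.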
![Page 18: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/18.jpg)
Dynamic Branch Prediction
![Page 19: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/19.jpg)
Dynamic Branch Prediction - basic idea
Predict based on the history of the previous branch
loop: xxx
      xxx
      xxx
      xxx
      BC loop
2 mispredictions for every occurrence
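The count of two mispredictions per loop occurrence can be reproduced with a small simulation. It assumes the predictor simply repeats the branch's previous outcome (one bit of history) and that the closing `BC loop` is taken on every iteration except the last; both function names are illustrative.

```python
def loop_outcomes(iterations, occurrences):
    """Outcome stream of the closing BC: taken (True) on all but the last iteration."""
    one_pass = [True] * (iterations - 1) + [False]
    return one_pass * occurrences


def mispredictions_last_outcome(outcomes, initial_guess=False):
    """Count misses of a predictor that guesses the previous outcome again."""
    guess, misses = initial_guess, 0
    for taken in outcomes:
        if guess != taken:
            misses += 1
        guess = taken          # remember only the most recent outcome
    return misses


# 5 occurrences of a 10-iteration loop: one miss on entry, one on exit, each time
print(mispredictions_last_outcome(loop_outcomes(iterations=10, occurrences=5)))  # 10
```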
![Page 20: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/20.jpg)
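Dynamic Branch Prediction - 2 bit prediction scheme
[Figure: state diagram with four states 0, 1, 2, 3 and transitions labelled T (taken) / N (not taken); the states are grouped (0/1, 3/2) under "predict taken" and "predict not taken"]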
![Page 21: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/21.jpg)
Dynamic Branch Prediction - second scheme
Predict based on the history of the previous n branches, e.g., if n = 3 then
3 branches taken ⇒ predict taken
2 branches taken ⇒ predict taken
1 branch taken ⇒ predict not taken
0 branches taken ⇒ predict not taken
![Page 22: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/22.jpg)
Dynamic Branch Prediction - Bimodal predictor
Maintain saturating counters
[Figure: counter states 0, 1, 2, 3; each T (taken) moves one state up and each N (not taken) one state down, saturating at 3 and 0]
One counter per branch or
One counter per cache line - merge results if multiple branches
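A saturating counter is easy to sketch. The version below assumes a 2-bit counter (states 0 to 3) that counts up on taken and down on not taken, and predicts taken in the upper half (state ≥ 2); that prediction threshold is the usual convention rather than something stated on the slide, and the class and function names are hypothetical.

```python
class SaturatingCounter:
    """2-bit saturating counter for one branch (or one cache line)."""

    def __init__(self, state=0):
        self.state = state                  # 0..3

    def predict(self):
        return self.state >= 2              # assumption: states 2 and 3 predict taken

    def update(self, taken):
        if taken:
            self.state = min(3, self.state + 1)   # saturate at 3
        else:
            self.state = max(0, self.state - 1)   # saturate at 0


def count_misses(outcomes):
    counter, misses = SaturatingCounter(), 0
    for taken in outcomes:
        if counter.predict() != taken:
            misses += 1
        counter.update(taken)
    return misses


# Closing branch of a 10-iteration loop executed 5 times: taken 9 times, then not taken
stream = ([True] * 9 + [False]) * 5
print(count_misses(stream))  # 7: three misses while warming up, then one per loop exit
```

Once warmed up, a single not-taken outcome at loop exit only drops the counter from 3 to 2, so the next loop entry is still predicted taken.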
![Page 23: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/23.jpg)
Dynamic Branch Prediction - History of last n occurrences
current entry (outcome of last three occurrences of this branch): 1 1 0
actual outcome: 'taken'
updated entry: 1 1 1
0 : not taken, 1 : taken
prediction using majority decision
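The history entry behaves like a small shift register with a majority vote on top. The sketch below assumes the newest outcome enters on the left and the oldest drops off the right, which is consistent with the entry 1 1 0 becoming 1 1 1 after a taken branch; the class name is hypothetical.

```python
from collections import deque


class HistoryPredictor:
    """Keeps the last n outcomes of one branch and predicts by majority."""

    def __init__(self, initial=(0, 0, 0)):
        # maxlen makes the deque drop the oldest bit automatically
        self.history = deque(initial, maxlen=len(initial))

    def predict(self):
        """True (taken) if more than half of the recorded outcomes were taken."""
        return 2 * sum(self.history) > len(self.history)

    def update(self, taken):
        self.history.appendleft(1 if taken else 0)   # newest outcome on the left


entry = HistoryPredictor(initial=(1, 1, 0))
print(entry.predict())          # True: two of the last three were taken
entry.update(True)              # actual outcome 'taken'
print(list(entry.history))      # [1, 1, 1]
```

With n = 3 this reproduces the rule that 2 or 3 taken outcomes predict taken and 0 or 1 predict not taken.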
![Page 24: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/24.jpg)
Correlation between branches
B1: if (x)
...
B2: if (y)
...
z = x && y
B3: if (z)
...
• B3 can be predicted with 100% accuracy based on the outcomes of B1 and B2
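A quick enumeration confirms the claim. The sketch assumes that the elided bodies ("...") do not modify x or y, and it records each branch's outcome as whether its condition held; `predict_b3` and `b3_accuracy` are illustrative names.

```python
from itertools import product


def predict_b3(b1_outcome, b2_outcome):
    """Predict B3 from the two earlier branches: z = x && y."""
    return b1_outcome and b2_outcome


def b3_accuracy():
    """Fraction of (x, y) combinations for which the prediction matches B3."""
    cases = list(product([False, True], repeat=2))
    hits = 0
    for x, y in cases:
        b1_outcome = x          # B1: if (x)
        b2_outcome = y          # B2: if (y)
        z = x and y
        b3_outcome = z          # B3: if (z)
        hits += predict_b3(b1_outcome, b2_outcome) == b3_outcome
    return hits / len(cases)


print(b3_accuracy())  # 1.0
```

A predictor that looks only at B3's own history cannot see this relationship; it needs the outcomes of other recent branches.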
![Page 25: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/25.jpg)
Improving Branch Performance
• Branch Elimination
– replace branch with other instructions
• Branch Speed Up
– reduce time for computing CC and TIF
• Branch Prediction
– guess the outcome and proceed, undo if necessary
• Branch Target Capture
– make use of history
![Page 26: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/26.jpg)
Branch Target Capture
• Branch Target Buffer (BTB)
• Target Instruction Buffer (TIB)
Entry fields: instr addr | pred stats | target (target addr or target instr)
prob of target change < 5%
![Page 27: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/27.jpg)
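BTB Performance

| BTB lookup | Probability | Decision | Result | Probability | Delay |
|---|---|---|---|---|---|
| miss | .4 | go inline | inline | .8 | 0 |
| miss | .4 | go inline | target | .2 | 5 |
| hit | .6 | go to target | inline | .2 | 4 |
| hit | .6 | go to target | target | .8 | 0 |

Expected delay = .4*.8*0 + .4*.2*5 + .6*.2*4 + .6*.8*0 = 0.88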
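The expected delay is a probability-weighted sum over the four paths of the decision tree: BTB miss (0.4) or hit (0.6), followed by the actual result. A short sketch with the slide's numbers; the variable names are illustrative.

```python
# (P(BTB lookup outcome), P(result | lookup outcome), delay in cycles)
paths = [
    (0.4, 0.8, 0),   # BTB miss -> go inline,    result inline
    (0.4, 0.2, 5),   # BTB miss -> go inline,    result target
    (0.6, 0.2, 4),   # BTB hit  -> go to target, result inline
    (0.6, 0.8, 0),   # BTB hit  -> go to target, result target
]

expected_delay = sum(p_lookup * p_result * delay for p_lookup, p_result, delay in paths)
print(f"{expected_delay:.2f}")  # 0.88
```

Only the two wrong-guess paths contribute: 0.4 cycles from misses that turn out to branch and 0.48 cycles from hits that turn out not to.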
![Page 28: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/28.jpg)
BTC: Structure of Tables
Instruction fetch path with
• BTAC (Branch Target Address Cache)
• BTIC (Branch Target Instruction Cache)
![Page 29: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/29.jpg)
Compute/fetch scheme (no dynamic branch prediction)
[Figure: instruction fetch path with labels IF, instruction fetch address (IFA), FAR, I-cache, + (next sequential address) and Compute (BTA); inline stream I, I+1, I+2, I+3 and target stream BTI, BTI+1, BTI+2, BTI+3]
![Page 30: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/30.jpg)
BTAC scheme
[Figure: instruction fetch path with labels IF, instruction fetch address (IFA), FAR, I-cache, + (next sequential address) and a BTAC with entries BA, BTA; inline stream I, I+1, I+2, I+3 and target stream BTI, BTI+1, BTI+2, BTI+3]
![Page 31: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/31.jpg)
BTIC scheme - 1
[Figure: instruction fetch path with labels IF, instruction fetch address (IFA), FAR, I-cache, + (next sequential address) and a BTIC with entries BA, BTI, BTA+; output goes to the decoder]
![Page 32: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/32.jpg)
Superscalar/VLIW
• Instruction level parallelism
• EndSem Exam: Covers only post Midsem part
• VLIW (Intel Itanium, TI OMAP)
• Superscalar (Pentium, Athlon)
– Parallel Issue, Parallel Decode
– Dependency Check (Reservation Station, Renaming)
– Parallel Execute, Serial Commit
![Page 33: CS222: Pipeline: Branch Performance · 2017. 4. 12. · Pipeline: Branch Performance & Superscalar/VLIW Dr. A. Sahu ... Strategies: – Fixed (always guess inline or guess target)](https://reader036.vdocuments.us/reader036/viewer/2022071212/6024107e6a5c1f15086e3e55/html5/thumbnails/33.jpg)