eecs 470
![Page 1: EECS 470](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815a6d550346895dc7c9aa/html5/thumbnails/1.jpg)
EECS 470: Branch Prediction
Lecture 6
Coverage: Chapter 3
![Page 2: EECS 470](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815a6d550346895dc7c9aa/html5/thumbnails/2.jpg)
Parts of the predictor
• Direction Predictor
  – For conditional branches
    • Predicts whether the branch will be taken
  – Examples:
    • Always taken; backwards taken
• Address Predictor
  – Predicts the target address (use if predicted taken)
  – Examples:
    • BTB; Return Address Stack; Precomputed Branch
• Recovery logic
Ref: The Precomputed Branch Architecture
![Page 3: EECS 470](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815a6d550346895dc7c9aa/html5/thumbnails/3.jpg)
Characteristics of branches
• Individual branches differ
  – Loops tend not to exit
    • Unoptimized code: not-taken
    • Optimized code: taken
  – If-statements:
    • Tend to be less predictable
  – Unconditional branches
    • Still need address prediction
![Page 4: EECS 470](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815a6d550346895dc7c9aa/html5/thumbnails/4.jpg)
Example gzip:
• gzip: loop branch A @ 0x1200098d8
• Executed: 1359575 times
• Taken: 1359565 times
• Not-taken: 10 times
• % time taken: 99% - 100%
Easy to predict (direction and address)
![Page 5: EECS 470](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815a6d550346895dc7c9aa/html5/thumbnails/5.jpg)
Example gzip:
• gzip: if branch B @ 0x12000fa04
• Executed: 151409 times
• Taken: 71480 times
• Not-taken: 79929 times
• % time taken: ~47%
Easy to predict? (Maybe not; maybe dynamically.)
![Page 6: EECS 470](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815a6d550346895dc7c9aa/html5/thumbnails/6.jpg)
Example: gzip
[Figure: histogram of total branch executions (y-axis, 0 to 16,000,000) against % taken per branch (x-axis, 0 to 100). The regions near 0% and 100% are marked "Easy to predict"; branches A and B are marked on the plot.]
Direction prediction: always taken. Accuracy: ~73%
![Page 7: EECS 470](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815a6d550346895dc7c9aa/html5/thumbnails/7.jpg)
Branch Backwards
[Figure: % of total branches (y-axis, 0 to 3.5) against distance of branch target (x-axis), with separate series for not taken and taken.]
Most backward branches are heavily NOT-TAKEN. Forward branches are slightly more likely to be TAKEN.
Ref: The Effects of Predicated Execution on Branch Prediction
![Page 8: EECS 470](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815a6d550346895dc7c9aa/html5/thumbnails/8.jpg)
Using history
• 1-bit history (direction predictor)
  – Remember the last direction for a branch
[Diagram: the branch PC indexes the Branch History Table; each entry holds one bit, NT or T.]
How big is the BHT?
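A minimal sketch of the 1-bit scheme, to make the size question concrete. The class name, the default `index_bits`, the cold state, and the 4-byte instruction alignment are assumptions for illustration, not details from the slide.

```python
class OneBitPredictor:
    """Direct-mapped Branch History Table with one bit per entry."""

    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        # One bit per entry: True = taken, False = not taken.
        # Assumed cold state: not taken.
        self.table = [False] * (1 << index_bits)

    def _index(self, pc):
        # Assumes 4-byte instructions, so the two low PC bits are dropped.
        return (pc >> 2) & self.mask

    def predict(self, pc):
        return self.table[self._index(pc)]

    def update(self, pc, taken):
        # Remember only the most recent direction.
        self.table[self._index(pc)] = taken

    def size_in_bits(self):
        return len(self.table)


bht = OneBitPredictor()
pc = 0x1200098d8                 # loop branch A from the gzip example
cold = bht.predict(pc)           # cold entry: predicts not taken
bht.update(pc, True)
warm = bht.predict(pc)           # now repeats the last direction
```

With `index_bits` PC bits used as the index, the table holds `2**index_bits` bits, and branches whose PCs share those bits share an entry.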
![Page 9: EECS 470](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815a6d550346895dc7c9aa/html5/thumbnails/9.jpg)
Example: gzip
[Figure: histogram of total branch executions (y-axis, 0 to 16,000,000) against % taken per branch (x-axis, 0 to 100). Branches A and B are marked on the plot.]
Direction prediction: always taken. Accuracy: ~73%
How many times will branch A mispredict?
How many times will branch B mispredict?
![Page 10: EECS 470](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815a6d550346895dc7c9aa/html5/thumbnails/10.jpg)
Using history
• 2-bit history (direction predictor)
[Diagram: the branch PC indexes the Branch History Table; each entry is a 2-bit state: SN, NT, T, or ST.]
How big is the BHT?
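A sketch of one common reading of the four states, as a saturating counter. The slide does not give the state transitions, so the counter behaviour, the names, and the default sizes here are assumptions.

```python
SN, NT, T, ST = 0, 1, 2, 3   # strongly/weakly not-taken, weakly/strongly taken


class TwoBitPredictor:
    """Branch History Table of 2-bit saturating counters."""

    def __init__(self, index_bits=10, initial=NT):
        self.mask = (1 << index_bits) - 1
        self.table = [initial] * (1 << index_bits)

    def _index(self, pc):
        # Assumes 4-byte instructions, so the two low PC bits are dropped.
        return (pc >> 2) & self.mask

    def predict(self, pc):
        return self.table[self._index(pc)] >= T

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.table[i] = min(ST, self.table[i] + 1)
        else:
            self.table[i] = max(SN, self.table[i] - 1)

    def size_in_bits(self):
        return 2 * len(self.table)


p = TwoBitPredictor(initial=ST)
pc = 0x1200098d8
p.update(pc, False)
after_one = p.predict(pc)   # still predicts taken: one exit does not flip it
p.update(pc, False)
after_two = p.predict(pc)   # a second not-taken in a row flips the prediction
```

For the same number of entries this table is twice the size of the 1-bit table; the extra bit buys hysteresis.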
![Page 11: EECS 470](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815a6d550346895dc7c9aa/html5/thumbnails/11.jpg)
Example: gzip
[Figure: histogram of total branch executions (y-axis, 0 to 16,000,000) against % taken per branch (x-axis, 0 to 100). Branches A and B are marked on the plot.]
Direction prediction: always taken. Accuracy: ~73%
How many times will branch A mispredict?
How many times will branch B mispredict?
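One way to approach the question for branch A is to replay its counts through both schemes. The slides give only totals, so the ordering below (equal runs of taken outcomes, each ended by a single not-taken) and the cold predictor states are assumptions. Branch B cannot be answered this way, because its count depends on the order of its outcomes, which the totals do not reveal.

```python
def mispredicts_1bit(outcomes, last=False):
    """1-bit history: predict whatever the branch did last time."""
    wrong = 0
    for taken in outcomes:
        wrong += last != taken
        last = taken
    return wrong


def mispredicts_2bit(outcomes, counter=1):
    """2-bit saturating counter: 0=SN, 1=NT, 2=T, 3=ST; predict taken when >= 2."""
    wrong = 0
    for taken in outcomes:
        wrong += (counter >= 2) != taken
        counter = min(3, counter + 1) if taken else max(0, counter - 1)
    return wrong


def loop_trace(executed, not_taken):
    """Hypothetical ordering: equal runs of taken, each ended by one not-taken."""
    run = executed // not_taken
    leftover = executed - run * not_taken
    return ([True] * (run - 1) + [False]) * not_taken + [True] * leftover


branch_a = loop_trace(executed=1359575, not_taken=10)
print(mispredicts_1bit(branch_a), mispredicts_2bit(branch_a))  # prints: 21 11
```

Under these assumptions the 1-bit scheme pays twice per loop exit (once on the exit, once on re-entry), while the 2-bit counter pays once; each also pays once for the cold start.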
![Page 12: EECS 470](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815a6d550346895dc7c9aa/html5/thumbnails/12.jpg)
Using History Patterns
~80% of branches are either heavily TAKEN or heavily NOT-TAKEN.
For the other 20%, we need to look at patterns of reference to see whether they are predictable using a more complex predictor.
Example: gcc has a branch that flips each time
T(1), NT(0): 10101010101010101010101010101010101010
![Page 13: EECS 470](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815a6d550346895dc7c9aa/html5/thumbnails/13.jpg)
Local history
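[Diagram: the branch PC indexes the Branch History Table; the selected entry (here 10101010) indexes the Pattern History Table, whose entry gives NT or T.]
What is the prediction for this BHT entry, 10101010?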
When do I update the tables?
![Page 14: EECS 470](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815a6d550346895dc7c9aa/html5/thumbnails/14.jpg)
Local history
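[Diagram: the branch PC indexes the Branch History Table; the selected entry (now 01010101) indexes the Pattern History Table, whose entry gives NT or T.]
On the next execution of this branch instruction, the branch history table entry is 01010101, pointing to a different pattern.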
What is the accuracy of a flip/flop branch 0101010101010…?
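A sketch of a two-level local predictor run on the flip/flop branch. Several choices are assumptions rather than details from the slides: 2-bit counters in the Pattern History Table, the newest outcome shifted into the low bit of the history, and both tables updated immediately when the outcome is known (a real pipeline has to decide between speculative and resolved updates).

```python
class LocalHistoryPredictor:
    def __init__(self, bht_bits=10, history_bits=8):
        self.bht_mask = (1 << bht_bits) - 1
        self.hist_mask = (1 << history_bits) - 1
        self.bht = [0] * (1 << bht_bits)       # per-branch history shift registers
        self.pht = [1] * (1 << history_bits)   # 2-bit counters, start weakly not-taken

    def predict(self, pc):
        history = self.bht[(pc >> 2) & self.bht_mask]
        return self.pht[history] >= 2

    def update(self, pc, taken):
        i = (pc >> 2) & self.bht_mask
        history = self.bht[i]
        c = self.pht[history]
        self.pht[history] = min(3, c + 1) if taken else max(0, c - 1)
        # Shift the new outcome into the low bit of this branch's history.
        self.bht[i] = ((history << 1) | int(taken)) & self.hist_mask


predictor = LocalHistoryPredictor()
pc = 0x12000fa04                        # any branch address will do here
late_mispredicts = 0
for n in range(1000):
    taken = (n % 2 == 0)                # T, NT, T, NT, ...
    if n >= 500 and predictor.predict(pc) != taken:
        late_mispredicts += 1
    predictor.update(pc, taken)

history_now = predictor.bht[(pc >> 2) & predictor.bht_mask]
print(late_mispredicts, format(history_now, "08b"), predictor.predict(pc))
```

Once the two histories 10101010 and 01010101 have each trained their own counter, the sketch stops mispredicting the alternating branch; with history 10101010 (last outcome NT) it predicts taken.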
![Page 15: EECS 470](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815a6d550346895dc7c9aa/html5/thumbnails/15.jpg)
Global history
[Diagram: the Branch History Register (here 01110101) indexes the Pattern History Table.]
if (aa == 2)
    aa = 0;
if (bb == 2)
    bb = 0;
if (aa != bb) { …
How can branches interfere with each other?
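The correlation in the code fragment above can be tried out with a small simulation. Everything about the setup is hypothetical: `aa` and `bb` are drawn at random from {1, 2}, "taken" is taken to mean "condition true", and the Pattern History Table holds 2-bit counters indexed by a 3-bit global history alone. With these inputs the third branch is the XOR of the first two, so it looks random to a per-branch counter but not to global history.

```python
import random


def bump(counter, taken):
    """2-bit saturating counter update."""
    return min(3, counter + 1) if taken else max(0, counter - 1)


def run(iterations=20000, history_bits=3, seed=0):
    rng = random.Random(seed)
    mask = (1 << history_bits) - 1
    pht = [1] * (1 << history_bits)  # shared by all three branches
    bhr = 0                          # Branch History Register, newest outcome in low bit
    local = 1                        # private 2-bit counter for the third branch only
    global_hits = local_hits = 0
    for _ in range(iterations):
        aa, bb = rng.choice((1, 2)), rng.choice((1, 2))
        b1 = aa == 2
        if b1:
            aa = 0
        b2 = bb == 2
        if b2:
            bb = 0
        b3 = aa != bb
        for which, taken in enumerate((b1, b2, b3)):
            if which == 2:           # score only the third branch
                global_hits += (pht[bhr] >= 2) == taken
                local_hits += (local >= 2) == taken
                local = bump(local, taken)
            pht[bhr] = bump(pht[bhr], taken)
            bhr = ((bhr << 1) | int(taken)) & mask
    return global_hits / iterations, local_hits / iterations


global_accuracy, local_accuracy = run()
print(global_accuracy, local_accuracy)
```

In this sketch the first two branches also train the same counters with their own (random) outcomes, which is one form of the interference the slide asks about: global history should do clearly better than the private counter on the third branch, but not perfectly.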
![Page 16: EECS 470](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815a6d550346895dc7c9aa/html5/thumbnails/16.jpg)
Gshare predictor
Ref: Combining Branch Predictors
[Diagram: the branch PC is XORed with the Branch History Register (here 01110101); the result indexes the Pattern History Table.]
Must read!
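A minimal gshare sketch following the diagram: PC bits XOR global history select a counter. The table size, the use of 2-bit counters, the history length equal to the index width, and the dropped low PC bits are assumptions.

```python
class GsharePredictor:
    def __init__(self, index_bits=12):
        self.mask = (1 << index_bits) - 1
        self.pht = [1] * (1 << index_bits)   # 2-bit counters, start weakly not-taken
        self.bhr = 0                         # global Branch History Register

    def _index(self, pc):
        # XOR spreads branches with the same history across different entries.
        return ((pc >> 2) ^ self.bhr) & self.mask

    def predict(self, pc):
        return self.pht[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        c = self.pht[i]
        self.pht[i] = min(3, c + 1) if taken else max(0, c - 1)
        self.bhr = ((self.bhr << 1) | int(taken)) & self.mask


gshare = GsharePredictor()
pc = 0x12000fa04
late_mispredicts = 0
for n in range(200):
    taken = (n % 2 == 0)                     # a branch that flips each time
    if n >= 100 and gshare.predict(pc) != taken:
        late_mispredicts += 1
    gshare.update(pc, taken)
print(late_mispredicts)                      # prints: 0
```

Compared with indexing by history alone, two branches that see the same history usually land on different counters, at no extra table cost.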
![Page 17: EECS 470](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815a6d550346895dc7c9aa/html5/thumbnails/17.jpg)
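Bi-mode predictor
[Diagram: the branch PC indexes a choice predictor. The global history register XORed with the branch PC indexes two PHTs, one skewed taken and one skewed not-taken. A mux controlled by the choice predictor selects between the two PHT outputs.]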
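A sketch of the structure in the diagram. The slide does not show how the tables are trained; the update rule below (train only the selected PHT, and leave the choice counter alone when it disagreed with the outcome but the selected PHT was still right) follows the commonly described bi-mode policy and should be read as an assumption, as should the table sizes and initial counter values.

```python
def bump(counter, taken):
    """2-bit saturating counter update."""
    return min(3, counter + 1) if taken else max(0, counter - 1)


class BiModePredictor:
    def __init__(self, index_bits=12):
        n = 1 << index_bits
        self.mask = n - 1
        self.choice = [1] * n          # indexed by PC only; >= 2 selects the taken-skewed PHT
        self.pht_taken = [2] * n       # starts weakly taken
        self.pht_not_taken = [1] * n   # starts weakly not-taken
        self.ghr = 0                   # global history register

    def _lookup(self, pc):
        pc_index = (pc >> 2) & self.mask
        dir_index = ((pc >> 2) ^ self.ghr) & self.mask
        use_taken = self.choice[pc_index] >= 2
        bank = self.pht_taken if use_taken else self.pht_not_taken
        return pc_index, dir_index, use_taken, bank

    def predict(self, pc):
        _, dir_index, _, bank = self._lookup(pc)
        return bank[dir_index] >= 2    # the mux output

    def update(self, pc, taken):
        pc_index, dir_index, use_taken, bank = self._lookup(pc)
        prediction = bank[dir_index] >= 2
        bank[dir_index] = bump(bank[dir_index], taken)   # only the selected PHT is trained
        choice_disagreed = use_taken != taken
        if not (choice_disagreed and prediction == taken):
            self.choice[pc_index] = bump(self.choice[pc_index], taken)
        self.ghr = ((self.ghr << 1) | int(taken)) & self.mask


mostly_taken = BiModePredictor()
for _ in range(20):
    mostly_taken.update(0x1200098d8, True)
mostly_not_taken = BiModePredictor()
for _ in range(20):
    mostly_not_taken.update(0x12000fa04, False)
print(mostly_taken.predict(0x1200098d8), mostly_not_taken.predict(0x12000fa04))
```

The intent of the split is that branches aliasing to the same PHT entry tend to share a bias, so the aliasing is less destructive than in a single shared table.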
![Page 18: EECS 470](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815a6d550346895dc7c9aa/html5/thumbnails/18.jpg)
Hybrid predictors
[Diagram: a local predictor (e.g. 2-bit) produces Prediction 1 and a global/gshare predictor (much more state) produces Prediction 2; a selection table (2-bit state machine) chooses which one becomes the final Prediction.]
How do you select which predictor to use?
How do you update the various predictors and the selector?
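One possible answer to both questions, as a sketch: always train both component predictors, and move the selector only when they disagree, toward whichever was right. This policy, the table sizes, and the choice of a per-PC 2-bit counter as the "local" component are assumptions for illustration.

```python
def bump(counter, up):
    """2-bit saturating counter update."""
    return min(3, counter + 1) if up else max(0, counter - 1)


class HybridPredictor:
    def __init__(self, index_bits=10):
        n = 1 << index_bits
        self.mask = n - 1
        self.local = [1] * n    # Prediction 1: 2-bit counters indexed by PC
        self.gshare = [1] * n   # Prediction 2: 2-bit counters indexed by PC xor history
        self.select = [1] * n   # selection table: >= 2 means "use Prediction 2"
        self.ghr = 0

    def _predictions(self, pc):
        i = (pc >> 2) & self.mask
        g = ((pc >> 2) ^ self.ghr) & self.mask
        return i, g, self.local[i] >= 2, self.gshare[g] >= 2

    def predict(self, pc):
        i, _, p1, p2 = self._predictions(pc)
        return p2 if self.select[i] >= 2 else p1

    def update(self, pc, taken):
        i, g, p1, p2 = self._predictions(pc)
        if p1 != p2:
            # Exactly one component was right; nudge the selector toward it.
            self.select[i] = bump(self.select[i], p2 == taken)
        self.local[i] = bump(self.local[i], taken)
        self.gshare[g] = bump(self.gshare[g], taken)
        self.ghr = ((self.ghr << 1) | int(taken)) & self.mask


hybrid = HybridPredictor()
pc = 0x12000fa04
late_mispredicts = 0
for n in range(200):
    taken = (n % 2 == 0)        # flip/flop branch: hard for a 2-bit counter alone
    if n >= 100 and hybrid.predict(pc) != taken:
        late_mispredicts += 1
    hybrid.update(pc, taken)
print(late_mispredicts)         # prints: 0
```

On the flip/flop branch the per-PC counter keeps mispredicting, so the selector in this sketch saturates toward the gshare component.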
![Page 19: EECS 470](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815a6d550346895dc7c9aa/html5/thumbnails/19.jpg)
Overriding Predictors
• Big predictors are slow, but more accurate
• Use a single-cycle predictor in fetch
• Start the multi-cycle predictor
  – When it completes, compare it to the fast prediction.
    • If same, do nothing
    • If different, assume the slow predictor is right and flush the pipeline.
• Advantage: reduced branch penalty for those branches mispredicted by the fast predictor and correctly predicted by the slow predictor
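The trade-off can be written down as a small cost model. The cycle counts are made-up placeholders, and the model assumes an override flushes only the few fetch stages filled since the fast prediction, while a wrong final prediction pays the full branch penalty.

```python
def branch_penalty(fast, slow, actual, override_cycles=2, mispredict_cycles=15):
    """Cycles lost on one branch; fetch always ends up following the slow prediction."""
    penalty = 0
    if fast != slow:
        penalty += override_cycles     # small flush when the slow predictor overrides
    if slow != actual:
        penalty += mispredict_cycles   # full flush when the final prediction is wrong
    return penalty


# (fast, slow, actual) with True = taken
agree_right = branch_penalty(True, True, True)      # 0
override_helps = branch_penalty(False, True, True)  # 2 instead of 15
agree_wrong = branch_penalty(True, True, False)     # 15
override_hurts = branch_penalty(True, False, True)  # 2 + 15 = 17
print(agree_right, override_helps, agree_wrong, override_hurts)  # prints: 0 2 15 17
```

The last case is the cost of the scheme: when the fast predictor was right and the slow one wrong, the override makes things worse, so it only pays off if the slow predictor is right more often.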
![Page 20: EECS 470](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815a6d550346895dc7c9aa/html5/thumbnails/20.jpg)
Pipelined Gshare Predictor
• How can we get a pipelined global prediction by stage 1?
  – Start in stage –2
  – Don’t have the most recent branch history…
• Access multiple entries
  – E.g. if we are missing the last three branches, get 8 histories and pick between them during the fetch stage.
Ref: Reconsidering Complex Branch Predictors coming soon (to be published Feb 2003)
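The "get 8 and pick" idea can be sketched as a split of the gshare index: the high bits depend only on the PC and the older history, so the early stage reads the aligned group of 2^3 counters, and the three newest outcomes choose among them at fetch. The function names, the table size, and the assumption that the branch PC is already known at the early stage are illustrative, not taken from the slide.

```python
K = 3             # number of newest outcomes still unknown when the lookup starts
INDEX_BITS = 12   # hypothetical PHT size: 2**12 counters
MASK = (1 << INDEX_BITS) - 1


def gshare_index(pc, history):
    """Ordinary (unpipelined) gshare index, for comparison."""
    return ((pc >> 2) ^ history) & MASK


def early_read(pht, pc, stale_history):
    """Early stage: fetch every counter the K missing history bits could select."""
    base = ((((pc >> 2) ^ (stale_history << K)) & MASK) >> K) << K
    return pht[base:base + (1 << K)]


def late_select(candidates, pc, recent_bits):
    """Fetch stage: the K newest outcomes pick one of the candidates."""
    return candidates[((pc >> 2) ^ recent_bits) & ((1 << K) - 1)]


# Fill the table with its own indices so each entry identifies itself.
pht = list(range(1 << INDEX_BITS))
pc = 0x12000fa04
stale_history, recent_bits = 0b101100111, 0b010
candidates = early_read(pht, pc, stale_history)
picked = late_select(candidates, pc, recent_bits)
direct = pht[gshare_index(pc, (stale_history << K) | recent_bits)]
print(len(candidates), picked == direct)   # prints: 8 True
```

The selected counter is the same one an unpipelined gshare would read with the full history, at the cost of reading eight entries instead of one.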