Improving Branch Prediction by Dynamic Dataflow-based Identification of Correlation Branches from a Larger Global History. CSE 340 Project Presentation. Baha Guclu Dundar

Post on 20-Dec-2015


1

Improving Branch Prediction by Dynamic Dataflow-based Identification of Correlation Branches from a Larger Global History

CSE 340 Project Presentation

Baha Guclu Dundar

2

Outline

Branch prediction overview

Primitive techniques

-Long global history

-Selecting Correlated Branches

Identifying Affector Branches in Run-time

Implementing affector branches

Building predictors that use affector information

Experimental Evaluation

3

Branch Prediction Overview

Branch misprediction is the single most significant limiter on improving processor performance through deeper pipelining [18].

4

Branch Prediction Overview

Trivial prediction

Static prediction

Line prediction

Bimodal branch prediction

Local branch prediction

Global branch prediction

Overriding branch prediction

Other types…

5

Branch Prediction Overview

Use multiple predictors with progressively increasing latencies and prediction accuracies in an overriding fashion.

Each of these predictors provides its prediction at a different pipeline stage. The ones at the initial stages are less accurate; predictions at later stages are expected to be more accurate.

6

Selected predictors in this paper

A simple 1-cycle line predictor provides predictions in the first stage, followed by a more accurate global branch predictor, and finally a highly accurate corrector predictor, which corrects the global predictor.

7

Selected predictors in this paper (cont'd)

The corrector predictor must be more accurate than the other predictors.

The focus of this paper is to propose and evaluate a new high-accuracy corrector predictor.

The techniques are using a long global history and identifying the correlated branches in this history by using run-time dataflow.

8

Long Global history

What if there is a large distance between two correlated branches?

Use longer history…

How far into the past should we go? What is a good way to use a longer history?

9

Long History (cont'd)

Length of global history <= log2 (branch prediction table size)

Moreover, recent papers [9][10] show that looking at longer histories (at least 64 branches) is necessary.
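To make the bound above concrete, here is a small sketch of the arithmetic. It assumes a table indexed directly by the full history with 2-bit counters per entry; the function names and the counter width are illustrative assumptions, not taken from the paper.

```python
def max_direct_history(table_entries: int) -> int:
    """Longest history that can index the table directly: floor(log2(entries))."""
    return table_entries.bit_length() - 1


def direct_indexed_table_bits(history_length: int, counter_bits: int = 2) -> int:
    """Storage in bits if every history pattern gets its own counter.

    counter_bits=2 is an assumed (commonly used) counter width.
    """
    return (1 << history_length) * counter_bits


# A 4096-entry table can be indexed directly by at most 12 history bits.
history_for_4k = max_direct_history(4096)

# Indexing directly with 64 history bits would need 2**64 counters,
# which is why a long history cannot simply be used as a table index.
bits_for_64 = direct_indexed_table_bits(64)
```

The exponential growth is the reason the following slides look for a way to select only the useful branches from a long history.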

10

Selecting Correlated Branches

Which branches in the long history should we select? Brute force? No, this would increase the second-level table size.

Use a hash table? No, that causes too much aliasing. We need to be careful.

Not all branches in the long history are correlated with the branch under prediction.

Perceptron predictors assign a weight to each branch in the long history.
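As an illustration of how a perceptron predictor weights each history position, here is a minimal sketch of the usual perceptron prediction and training rule. The class name, the `threshold` parameter, and the convention that `history[0]` is the most recent branch are assumptions made for this example; it is not the configuration evaluated in the paper.

```python
from itertools import product


class Perceptron:
    """One perceptron: a bias plus one integer weight per history position."""

    def __init__(self, history_length: int, threshold: int = 4):
        self.weights = [0] * (history_length + 1)  # weights[0] is the bias
        self.threshold = threshold                 # assumed training threshold

    def output(self, history):
        # history holds +1 (taken) or -1 (not taken); history[0] is most recent.
        return self.weights[0] + sum(
            w * x for w, x in zip(self.weights[1:], history)
        )

    def predict(self, history) -> bool:
        return self.output(history) >= 0

    def train(self, history, taken: bool) -> None:
        y = self.output(history)
        t = 1 if taken else -1
        # Update on a misprediction or when the output is not confident enough.
        if (y >= 0) != taken or abs(y) <= self.threshold:
            self.weights[0] += t
            for i, x in enumerate(history):
                self.weights[i + 1] += t * x


# Toy workload: the branch outcome always equals the most recent history bit.
p = Perceptron(history_length=4)
for _ in range(2):
    for h in product([-1, 1], repeat=4):
        p.train(h, taken=(h[0] == 1))
```

After training on this toy workload, the weight for the correlated position dominates and the weights for the uncorrelated positions stay small, which is how the perceptron implicitly selects correlated branches.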

11

Identifying Affector Branches in Run-time

A branch becomes an affector for a future branch if it can affect the values of the future branch’s source operands.

12

Identifying Affector Branches in Run-time

13

Identifying Affector Branches in Run-time

We need to keep the conventional global history, track the run-time dataflow of the program, and determine the affector branches for the register values.

When a prediction is made for a branch, the affector information of its source registers is used on the fly.

The affector information at each node of the dataflow graph is recorded as a bitmap.

The LSB of the bitmap denotes the most recent dynamic branch.

14

How to implement affector branches

We keep a separate record of affector information for each architectural register, stored as an entry in the “Affector Register File” (ARF).

15

How to implement affector branches

Dynamic instructions are classified into three types:

Conditional branch instructions

Register-writing instructions

Non-register-writing instructions: jumps, returns, system calls, stores, etc.
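The per-class update rules are not in this transcript, so the following is only a sketch of one plausible reading. It assumes that a conditional branch shifts every ARF bitmap left by one (so that bit 0 always refers to the most recent branch), that a register-writing instruction sets its destination bitmap to the OR of its sources' bitmaps plus the most recent branch, and that non-register-writing instructions leave the ARF unchanged. All names and the register count are hypothetical.

```python
HISTORY_LENGTH = 64                      # bits per bitmap (one per history branch)
MASK = (1 << HISTORY_LENGTH) - 1


class AffectorRegisterFile:
    """One affector bitmap per architectural register (assumed layout)."""

    def __init__(self, num_registers: int = 32):
        self.entries = [0] * num_registers

    def on_conditional_branch(self) -> None:
        # A new branch enters the history: age every bitmap by one position.
        self.entries = [(e << 1) & MASK for e in self.entries]

    def on_register_write(self, dest: int, sources) -> None:
        # Assumption: the most recent branch (bit 0) decided that this
        # instruction executes, so it is an affector of the result, together
        # with all affectors of the source registers.
        bitmap = 1
        for s in sources:
            bitmap |= self.entries[s]
        self.entries[dest] = bitmap & MASK

    # Non-register-writing instructions need no method: the ARF is unchanged.

    def affectors_for_branch(self, sources) -> int:
        # Affectors of a branch = union of its source registers' affectors.
        bitmap = 0
        for s in sources:
            bitmap |= self.entries[s]
        return bitmap


arf = AffectorRegisterFile(num_registers=8)
arf.on_conditional_branch()          # B1
arf.on_register_write(1, [])         # r1 written after B1
arf.on_conditional_branch()          # B2
arf.on_register_write(2, [])         # r2 written after B2
arf.on_conditional_branch()          # B3
arf.on_register_write(3, [1])        # r3 computed from r1 after B3
arf.on_conditional_branch()          # B4
next_branch_affectors = arf.affectors_for_branch([3])   # branch reads r3
```

In this toy trace the next branch reads r3, so its bitmap marks B3 (bit 1) and B1 (bit 3) as affectors, while B2 and B4 are left out.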

16

How to implement affector branches

17

Back to the Example…

18

19

Building predictors that use affector information

There are two proposed predictors:

Zeroing scheme

Packing scheme

20

Building predictors that use affector information

Resulting register bits are hashed down to the required number of bits for predictor index.
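The slides name the two schemes without defining them, so the sketch below rests on assumptions drawn from the names: zeroing is taken to mean masking non-affector history bits to zero while keeping positions, and packing is taken to mean collecting the affector bits into adjacent positions. The XOR-folding hash is likewise an assumed stand-in for the unspecified hash function.

```python
def zeroing(history: int, affectors: int) -> int:
    """Keep affector bits in place; force all other history bits to 0."""
    return history & affectors


def packing(history: int, affectors: int, length: int) -> int:
    """Collect the history bits of affector branches into the low-order bits."""
    packed = 0
    out = 0
    for pos in range(length):
        if (affectors >> pos) & 1:
            packed |= ((history >> pos) & 1) << out
            out += 1
    return packed


def fold(value: int, index_bits: int) -> int:
    """Hash a long bit string down to index_bits by XOR-folding (assumed hash)."""
    mask = (1 << index_bits) - 1
    index = 0
    while value:
        index ^= value & mask
        value >>= index_bits
    return index


history = 0b11010110      # bit 0 is the most recent branch outcome
affectors = 0b10100101    # branches at positions 0, 2, 5 and 7 are affectors

zeroed_index = fold(zeroing(history, affectors), index_bits=4)
packed_index = fold(packing(history, affectors, length=8), index_bits=4)
```

Under these assumptions, zeroing preserves where each affector sits in the history, while packing discards position and produces a shorter value that needs less hashing.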

21

In a big picture

22

Experimental Evaluation

The simulations used SimpleScalar v3.0 with the Alpha ISA, running 12 benchmarks from the SPEC95 and SPEC2000 integer benchmark suites.

23

Misprediction Results

24

Experimental Evaluation (cont'd)

25

Conclusion

We used a long global history and identified the correlated branches in it by using run-time dataflow information.

We examined a hardware structure, the ARF, that identifies the affector branches for each dynamic branch.

The hardware overhead for identifying affectors from a 64-branch global history is 312 bytes.

We went through two prediction schemes.

Experimental studies show that adding an 8 KB corrector predictor to a 16 KB perceptron predictor reduces the average misprediction rate from 6.3% to 5.7%, the rate achieved by a 64 KB perceptron predictor.

26

Appendix A: Global Branch Predictors

What if we come across the same branch again after executing several other branch instructions?

We can use global branch predictors to deal with this problem. They correlate a branch’s outcome with the history of the preceding dynamic branches.

Global branch predictors have good accuracy. However, producing a prediction may take 2-3 cycles.
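As a rough illustration of the idea, here is a minimal sketch of a global predictor: a shift register of recent outcomes indexes a table of 2-bit saturating counters. The class name, the 4-bit history, and the choice to ignore the branch address are simplifying assumptions for this example.

```python
class GlobalPredictor:
    """Table of 2-bit counters indexed only by the global outcome history."""

    def __init__(self, history_bits: int = 4):
        self.history_bits = history_bits
        self.history = 0                              # most recent outcome in bit 0
        self.counters = [1] * (1 << history_bits)     # start weakly not-taken

    def predict(self) -> bool:
        return self.counters[self.history] >= 2

    def update(self, taken: bool) -> None:
        c = self.counters[self.history]
        self.counters[self.history] = min(3, c + 1) if taken else max(0, c - 1)
        mask = (1 << self.history_bits) - 1
        self.history = ((self.history << 1) | int(taken)) & mask


# A repeating taken, taken, not-taken pattern becomes fully predictable
# once each history pattern has been seen.
pattern = [True, True, False]
predictor = GlobalPredictor(history_bits=4)
for i in range(30):                                   # warm-up
    predictor.update(pattern[i % 3])

correct = 0
for i in range(30, 60):
    correct += predictor.predict() == pattern[i % 3]
    predictor.update(pattern[i % 3])
```

Each distinct run of preceding outcomes selects its own counter, so the predictor learns what follows each pattern of earlier branches.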