study of pentium processors
8/12/2019 Study of Pentium Processors
1/24
The Study and Comparison of
Pentium Family Processors
Calin Ciordas
Zhang Lei
Yingbo Zhu
2001 02
Calin Ciordas with the introduction and Part II.3
Zhang Lei with Part II.2 and Part II.4
Yingbo Zhu with Part II.1 and the summary
CONTENTS
PART I Introduction
Part II Study Issues
1. Caches
1.1 Consistency Protocol (MESI Protocol)
1.2 The 'least recently used' (LRU) Mechanism
1.3 The Pentium Processor
1.4 The Pentium Pro / Pentium II / Pentium III
1.5 The Pentium 4 Processor
2 Pipeline
2.1 Pentium / Pentium with MMX
2.2 The Pentium Pro / Pentium II / Pentium III
2.2 Pentium 4
2.3 Pipeline summary
3 Parallel and superscalar aspects of Pentium processor family
3.1 Superscalar aspects
3.2 SIMD
4 Branch prediction
4.1 Pentium
4.2 The Pentium Pro / Pentium II / Pentium III
4.3 Pentium 4
PART III SUMMARY
Appendix 1: Comparison of Pentium Family Processors Specifications
Appendix 2: References
PART I Introduction
The main task of this paper is to offer a view of the entire Intel Pentium processor family from
an architectural point of view. It is interesting to observe the evolution of design issues, the common
things and the improvements that Intel engineers added over the years. The Pentium family is a family
of CISC processors with advanced RISC concepts included. Except for the first member (Pentium),
which has a modest superscalar design, the others present a full superscalar design. Architectural
extensions like MMX, SSE and SSE2 are also an important improvement. A comparison between
different cache policies and the branch prediction strategies is presented.
In our opinion the Pentium family is a successful design family.
For our study the Pentium, Pentium II and Pentium 4 were studied in more detail. The other
processors were partially studied with regard to certain interesting aspects, because Pentium Pro,
Pentium II and Pentium III are based on similar designs.
Part II Study Issues
1. Caches
All caches use the following model. Main memory is divided up into fixed [...]
S - Shared: This state indicates that the line is potentially shared with other caches (i.e. the same
line may exist in more than one cache). A read to an S [...]
1.2 The 'least recently used' (LRU) Mechanism
The 'least recently used' bit indicates which set in each line has been used last; the other set will
be replaced if none of them hits and both are valid. The 'least recently used' (LRU) algorithm keeps an
ordering of each set of locations that could be accessed from a given memory location. Whenever any
of the present lines is accessed, it updates the list, marking that entry as the most recently accessed.
When it comes time to replace an entry, the one at the end of the list [...]
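As an illustration of the replacement policy described above, the following is a minimal Python sketch of one cache set managed with an LRU ordering. The class name `LRUSet`, the two-way default and the tag values are assumptions made for the example; they are not taken from the Pentium documentation.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set with LRU replacement (illustrative, not Pentium-exact)."""

    def __init__(self, ways=2):
        self.ways = ways              # number of lines in the set (assumed 2-way)
        self.lines = OrderedDict()    # tag -> data, ordered oldest to newest

    def access(self, tag):
        """Return True on a hit, False on a miss (the line is then filled)."""
        if tag in self.lines:
            self.lines.move_to_end(tag)      # mark as most recently used
            return True
        if len(self.lines) >= self.ways:
            self.lines.popitem(last=False)   # evict the least recently used line
        self.lines[tag] = None               # fill the line; data omitted
        return False


cache_set = LRUSet(ways=2)
hits = [cache_set.access(t) for t in ("A", "B", "A", "C", "B")]
print(hits)
```

In this run the access to "C" evicts "B", because "A" was touched more recently, so the final access to "B" misses again.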
[...] L2 cache miss is between 11 and 14 cycles, depending on whether the DRAM page hits or misses.
The data cache can be accessed simultaneously by a load instruction and a store instruction, as long
as the references are to different cache banks.
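The bank condition in the last sentence can be sketched as follows. The bank count (8) and the 4-byte interleaving are assumptions chosen for illustration, as are the function names; the text above does not state the actual bank layout.

```python
NUM_BANKS = 8        # assumed number of data cache banks (hypothetical)
BANK_WIDTH = 4       # assumed bytes per bank before moving to the next bank


def bank_of(address):
    """Map a byte address to a cache bank under the assumed interleaving."""
    return (address // BANK_WIDTH) % NUM_BANKS


def can_issue_together(load_addr, store_addr):
    """A load and a store may proceed in the same cycle only if their banks differ."""
    return bank_of(load_addr) != bank_of(store_addr)


print(can_issue_together(0x1000, 0x1004))  # adjacent words, different banks
print(can_issue_together(0x1000, 0x1020))  # 32 bytes apart, same bank
```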
Figure 1.4 The Pentium Pro, II, III Processor Micro-Architecture
with Advanced Transfer Cache Enhancement: the first and second level caches
1.4.2 The Pentium II / Pentium III
The on
2. Pipeline
Pipelining is an architectural technique for increasing the throughput of complex, multiple-cycle
instructions. Each instruction is divided into a series of smaller stages, each of which can be completed
within a single clock cycle, so that the frequency and throughput of the system can be improved.
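A short sketch of why this improves throughput, under the idealized assumption that every stage takes exactly one clock cycle and no stalls occur. The function names and the example figures are illustrative only.

```python
def cycles_unpipelined(n_instructions, n_stages):
    """Each instruction occupies the whole datapath for n_stages cycles."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions, n_stages):
    """The first instruction fills the pipeline; each later one finishes one cycle after."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


n, k = 100, 5
print(cycles_unpipelined(n, k), cycles_pipelined(n, k))
```

For 100 instructions and 5 stages this gives 500 cycles without pipelining and 104 with it, so the ideal speedup approaches the number of stages as the instruction count grows.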
2.1 Pentium / Pentium with MMX
The Pentium processor has a five-stage pipeline for integer instructions, while the Pentium
processor with MMX has an additional pipeline stage. The pipeline stages are shown below:
PF Prefetch
F Fetch (Pentium processor with MMX technology only)
D1 Instruction Decode
D2 Address Generate
EX Execute: ALU and Cache Access
WB Write back
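To make the stage list concrete, the sketch below builds a table showing which stage each instruction occupies in each cycle for a single ideal pipe (no stalls, no pairing). The stage mnemonics follow the list above; the function name and the three-instruction example are assumptions for illustration.

```python
PENTIUM_STAGES = ["PF", "D1", "D2", "EX", "WB"]
PENTIUM_MMX_STAGES = ["PF", "F", "D1", "D2", "EX", "WB"]


def timeline(stages, n_instructions):
    """Return rows of stage names; row i is instruction i, column c is cycle c."""
    total_cycles = len(stages) + n_instructions - 1
    rows = []
    for i in range(n_instructions):
        row = ["--"] * total_cycles
        for s, name in enumerate(stages):
            row[i + s] = name      # instruction i is in stage s during cycle i + s
        rows.append(row)
    return rows


for row in timeline(PENTIUM_STAGES, 3):
    print(" ".join(f"{cell:>2}" for cell in row))
```

Passing `PENTIUM_MMX_STAGES` instead adds one column, reflecting the extra fetch stage.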
The Pentium processor is a superscalar machine which has two pipelines, called the "u" and the
"v" pipes. Figure 3.1 shows the instruction flow in the Pentium processor.
The Pentium processor also has a floating point pipeline. The floating point unit (FPU) is
integrated with the integer unit on the same chip and has 8 pipeline stages, the first five shared with
the integer unit. Integer instructions pass through only the first 5 stages. Integer instructions use the
fifth (X1) stage as a WB (write-back) [...]
2.2 The Pentium Pro / Pentium II / Pentium III
The Pentium Pro, Pentium II and Pentium III processors have the same pipeline architecture. The
Pentium Pro and Pentium II processors have an in-order front end, an out [...]
RAT Register Allocator
ROB Reorder Buffer; up to two register reads per cycle
RS Reservation Station; micro [...]
ROB(wb) Reorder buffer (writeback)
RRF Reorder Buffer read
2.2 Pentium 4
The Pentium 4 has a different architecture from the previous ones, named the NetBurst Micro [...]
The SIMD aspects of this processor family are also presented: MMX, SSE and SSE2. We try to
present the architectural details of these implementations and the reasons for which they were added,
not an enumeration of the instructions added by them.
3.1 Superscalar aspects
3.1.1 Pentium
The original Pentium had a superscalar component consisting of two separate integer
execution units capable of executing 2 instructions in parallel. The pipelines are called the "u" and "v"
pipelines. The floating point unit shares the first 5 stages with the integer pipeline. In the decode stage
D1, the Pentium has 2 decoders working in parallel.
3.1.2 Pentium Pro / Pentium II
The Pentium II has basically the same superscalar organization as the Pentium Pro, with the addition
of the MMX execution units.
The essential components of the superscalar organization are the instruction fetch and decode
units, the dispatch and execute unit and the retire unit.
The ID1 stage (instruction decode 1) contains 3 decoders which can work in parallel. One is a
complex one and the others are simple ones. The complex decoder can handle Pentium instructions
which can translate into up to four micro [...]
Figure 3.1 Pentium 4 Execution Unit
Most execution units can start executing a new micro [...]
These MMX registers are mapped over the FPU registers. FPU registers are 80 bits wide but
MMX registers are 64 bits wide. MMX registers are aliased on the 64-bit mantissa portion of the FPU
registers. When a value is written to one of the MMX registers it also appears in the mantissa portion
of the respective FPU register. The reverse is also true. When a value is written to an MMX register, bits [...]
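The aliasing can be modelled with a small sketch. It treats each 80-bit FPU register as a Python integer whose low 64 bits are the mantissa. Setting bits 64 to 79 to all ones on an MMX write follows Intel's architecture documentation rather than the text above, and the class and method names are hypothetical.

```python
MANTISSA_MASK = (1 << 64) - 1          # low 64 bits of an 80-bit FPU register
UPPER_ONES = 0xFFFF << 64              # bits 64-79 (sign and exponent) all set


class AliasedRegisterFile:
    """Eight 80-bit FPU registers whose mantissas double as MMX registers."""

    def __init__(self):
        self.fpu = [0] * 8             # each entry holds an 80-bit value

    def write_mmx(self, index, value):
        # The 64-bit value lands in the mantissa; the upper bits become all ones
        # (assumption taken from Intel documentation, not from this paper).
        self.fpu[index] = UPPER_ONES | (value & MANTISSA_MASK)

    def read_mmx(self, index):
        # An MMX read sees only the mantissa portion of the FPU register.
        return self.fpu[index] & MANTISSA_MASK

    def write_fpu(self, index, value80):
        self.fpu[index] = value80 & ((1 << 80) - 1)


regs = AliasedRegisterFile()
regs.write_mmx(0, 0x1122334455667788)
print(hex(regs.fpu[0]))

# 1.0 in extended precision: exponent 0x3FFF, mantissa with the integer bit set
regs.write_fpu(1, (0x3FFF << 64) | 0x8000000000000000)
print(hex(regs.read_mmx(1)))
```

Because both views share the same storage, code that mixes MMX and floating point instructions has to treat the registers as a single resource.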
The first three can be summarized as static prediction algorithms, and the last two can be
summarized as dynamic prediction algorithms. The Pentium series processors have used several kinds of
methods to predict branches, assuring a steady flow of instructions to the initial stages of the
pipelines.
4.1 Pentium
The Pentium processor uses a dynamic branch prediction strategy based on the history of recent
executions of a branch instruction. A Branch Target Buffer (BTB) is maintained that caches
information about recently encountered branch instructions to predict the outcome of branch instructions,
which minimizes pipeline stalls due to prefetch delays. The Pentium processor accesses the BTB with
the address of the instruction in the D1 stage. It contains a branch prediction state machine with four
states:
A. Strongly not taken
B. Weakly not taken
C. Weakly taken
D. Strongly taken
If an entry already exists in the BTB, then the instruction unit is guided by the history information
for that entry in determining whether to predict that the branch is taken. If a branch is predicted taken,
then the branch destination address associated with this entry is used for prefetching the branch target
instruction. Once the instruction is executed, the history portion of the appropriate entry is updated to
reflect the result of the branch instruction. If this instruction is not represented in the BTB, then the
address of the instruction is loaded into an entry in the BTB; if necessary, an old entry is deleted.
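The lookup and update procedure above can be sketched as a small table of two-bit saturating counters, one per branch address. The state encoding (0 to 3), the initial state for a new entry, the table capacity and the oldest-first eviction are assumptions made for the example; the text does not specify them.

```python
STRONGLY_NOT_TAKEN, WEAKLY_NOT_TAKEN, WEAKLY_TAKEN, STRONGLY_TAKEN = range(4)


class BranchTargetBuffer:
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = {}   # branch address -> [state, target], in insertion order

    def predict(self, branch_addr):
        """Return the predicted target, or None to predict not taken."""
        entry = self.entries.get(branch_addr)
        if entry is not None and entry[0] >= WEAKLY_TAKEN:
            return entry[1]
        return None

    def update(self, branch_addr, taken, target):
        """Record the actual outcome once the branch has executed."""
        entry = self.entries.get(branch_addr)
        if entry is None:
            if len(self.entries) >= self.capacity:
                oldest = next(iter(self.entries))   # assumed policy: drop oldest entry
                del self.entries[oldest]
            # Assumed initial state: weakly biased toward the first outcome
            state = WEAKLY_TAKEN if taken else WEAKLY_NOT_TAKEN
            self.entries[branch_addr] = [state, target]
            return
        if taken:
            entry[0] = min(entry[0] + 1, STRONGLY_TAKEN)
            entry[1] = target
        else:
            entry[0] = max(entry[0] - 1, STRONGLY_NOT_TAKEN)


btb = BranchTargetBuffer()
outcomes = [True, True, False, True]     # a loop branch at hypothetical address 0x40
predictions = []
for taken in outcomes:
    predictions.append(btb.predict(0x40) is not None)
    btb.update(0x40, taken, target=0x10)
print(predictions)
```

With two bits of history, the single not-taken outcome only moves the entry from strongly taken to weakly taken, so the following prediction is still "taken".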
4.2 The Pentium Pro / Pentium II / Pentium III
The Pentium Pro, Pentium II and Pentium III have much longer pipelines, so the penalty for mis [...]