![Page 1: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/1.jpg)
Seoul National University
HEIG-VD / SNU Summer University 2017: How Modern Processors Work?
Superscalar Architectures: Part 2
Dynamic (Out-of-Order) Scheduling
Lecture 3.2, August 23rd, 2017
Jae W. Lee (jaewlee@snu.ac.kr), Computer Science and Engineering, Seoul National University
Download these lecture slides at https://goo.gl/rJPMQU
Slide credits: [COD5e] and [CA:AQA5e] slides from Elsevier Inc.
![Page 2: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/2.jpg)
Outline
Reference: [CA:AQA5e] Ch. 3.4-3.5
- Instruction-Level Parallelism and Dependences
- Dynamic Scheduling with Tomasulo Algorithm
![Page 3: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/3.jpg)
Instruction-Level Parallelism and Dependences
![Page 4: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/4.jpg)
Instruction-Level Parallelism
- ILP is limited by
  - Resource conflicts
  - Dependences
- Three types of dependences
  - (True) data dependences
  - Name dependences
  - Control dependences
![Page 5: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/5.jpg)
Data Dependence
- Instruction j is data dependent on instruction i if either
  - Instruction i produces a result that may be used by instruction j, or
  - Instruction j is data dependent on instruction k, and instruction k is data dependent on instruction i
- Example: which instruction pairs are data dependent?

      Loop: L.D    F0,0(R1)    # F0 = array element
            ADD.D  F4,F0,F2    # add scalar in F2
            S.D    F4,0(R1)    # store result
            DADDUI R1,R1,#-8   # decrement pointer by 8 bytes
            BNE    R1,R2,Loop  # branch if R1 != R2

- Dependent instructions cannot be executed simultaneously
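The question on this slide can be checked mechanically. The sketch below is only an illustration, not part of the lecture: the destination and source register sets are written by hand for each instruction, and `raw_pairs` is a hypothetical helper that reports, for every register read, the most recent earlier instruction that wrote it (direct dependences within one iteration only).

```python
# Each entry: (text, destination registers, source registers).
# The register sets are hand-written assumptions for this illustration.
LOOP = [
    ("L.D F0,0(R1)",     {"F0"}, {"R1"}),
    ("ADD.D F4,F0,F2",   {"F4"}, {"F0", "F2"}),
    ("S.D F4,0(R1)",     set(),  {"F4", "R1"}),
    ("DADDUI R1,R1,#-8", {"R1"}, {"R1"}),
    ("BNE R1,R2,Loop",   set(),  {"R1", "R2"}),
]


def raw_pairs(program):
    """Return (i, j, reg): instruction j reads reg, last written by instruction i."""
    pairs = []
    last_writer = {}  # register -> index of the latest instruction that wrote it
    for j, (_, dests, srcs) in enumerate(program):
        for reg in sorted(srcs):
            if reg in last_writer:
                pairs.append((last_writer[reg], j, reg))
        # Record writes after the reads, so "DADDUI R1,R1,#-8" does not depend on itself.
        for reg in dests:
            last_writer[reg] = j
    return pairs


for i, j, reg in raw_pairs(LOOP):
    print(f"{LOOP[j][0]:<18} depends on {LOOP[i][0]:<18} via {reg}")
```

Within one iteration this reports three pairs: ADD.D on L.D (F0), S.D on ADD.D (F4), and BNE on DADDUI (R1). Transitive dependences, the second case in the definition, follow by chaining these pairs.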
![Page 6: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/6.jpg)
Data Dependence
- Dependences are a property of programs
- Pipeline organization determines whether a dependence is detected and whether it causes a stall
  - Read-After-Write (RAW) hazard
- Data dependence conveys:
  - Possibility of a hazard
  - Order in which results must be calculated
  - Upper bound on exploitable instruction-level parallelism
- Dependences that flow through memory locations are difficult to detect
  - "Memory disambiguation" problem
  - Does 100(R4) = 20(R6)?
  - From different loop iterations, does 20(R6) = 20(R6)?
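A small sketch of why memory dependences are hard to see statically: whether two accesses touch the same location depends on register values that are only known at run time. The register contents below are made-up values chosen for illustration, and `effective_address` and `same_location` are hypothetical helper names.

```python
def effective_address(offset, base, regs):
    """Address of an access written as offset(base), given current register values."""
    return regs[base] + offset


def same_location(access_a, access_b, regs):
    """True if two (offset, base) accesses resolve to the same address."""
    return effective_address(*access_a, regs) == effective_address(*access_b, regs)


# Does 100(R4) = 20(R6)? With these (assumed) register values, yes.
regs = {"R4": 1000, "R6": 1080}
print(same_location((100, "R4"), (20, "R6"), regs))   # True

# Same text, different iterations: R6 has changed, so 20(R6) names another location.
first_iter = effective_address(20, "R6", {"R6": 1080})
second_iter = effective_address(20, "R6", {"R6": 1072})
print(first_iter == second_iter)                      # False
```

Different operand text can name the same address, and identical operand text can name different addresses, which is why the comparison has to be made on computed addresses.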
![Page 7: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/7.jpg)
Name Dependence
- Two instructions use the same name, but there is no flow of information between them
  - Not a true data dependence, but a problem when reordering instructions
  - Antidependence: instruction j writes a register or memory location that instruction i reads
    - Initial ordering (i before j) must be preserved
    - Causes a Write-After-Read (WAR) hazard
  - Output dependence: instruction i and instruction j write the same register or memory location
    - Ordering must be preserved
    - Causes a Write-After-Write (WAW) hazard
- To resolve, use renaming techniques
![Page 8: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/8.jpg)
Data and Name Dependence: Examples
- (True) data dependence

      r3 ← (r1) op (r2)
      r5 ← (r3) op (r4)

- Antidependence

      r3 ← (r1) op (r2)
      r1 ← (r4) op (r5)

- Output dependence

      r3 ← (r1) op (r2)
      r3 ← (r4) op (r5)
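The three cases above can be told apart by comparing destination and source registers. The following is a minimal sketch, assuming each instruction is described as a `(destination, sources)` pair; `classify` is a hypothetical helper, not something defined in the slides.

```python
def classify(first, second):
    """Kinds of dependence of `second` on `first` (program order: first, then second).

    Each instruction is a pair (destination register, tuple of source registers).
    """
    dest1, srcs1 = first
    dest2, srcs2 = second
    kinds = set()
    if dest1 in srcs2:
        kinds.add("RAW")   # true data dependence: second reads what first wrote
    if dest2 in srcs1:
        kinds.add("WAR")   # antidependence: second overwrites what first reads
    if dest1 == dest2:
        kinds.add("WAW")   # output dependence: both write the same register
    return kinds


print(classify(("r3", ("r1", "r2")), ("r5", ("r3", "r4"))))  # {'RAW'}
print(classify(("r3", ("r1", "r2")), ("r1", ("r4", "r5"))))  # {'WAR'}
print(classify(("r3", ("r1", "r2")), ("r3", ("r4", "r5"))))  # {'WAW'}
```

A pair of instructions can carry more than one kind at once, which is why the helper returns a set.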
![Page 9: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/9.jpg)
Data Hazards
- A data hazard exists if
  - There is a name or data dependence between instructions, and
  - They are close enough that overlap during execution would change the order of access to the operand involved in the dependence
- Three types of data hazards, corresponding to three types of dependences
  - Read after write (RAW) hazard: true data dependence
  - Write after write (WAW) hazard: output dependence
  - Write after read (WAR) hazard: antidependence
![Page 10: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/10.jpg)
Control Dependence
- Ordering of instruction i with respect to a branch instruction
  - An instruction that is control dependent on a branch cannot be moved before the branch so that its execution is no longer controlled by the branch
  - An instruction that is not control dependent on a branch cannot be moved after the branch so that its execution is controlled by the branch
![Page 11: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/11.jpg)
Control Dependence
- Example 1: the OR instruction is data dependent on DADDU and DSUBU

            DADDU R1,R2,R3
            BEQZ  R4,L
            DSUBU R1,R1,R6
      L:    …
            OR    R7,R1,R8

- Example 2: assume R4 isn't used after skip
  - Possible to move DSUBU before the branch

            DADDU R1,R2,R3
            BEQZ  R12,skip
            DSUBU R4,R5,R6
            DADDU R5,R4,R9
      skip: OR    R7,R8,R9
![Page 12: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/12.jpg)
Dynamic Scheduling with Tomasulo Algorithm
![Page 13: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/13.jpg)
Dynamic Scheduling
- Rearrange the order of instructions to reduce stalls while maintaining data flow
- Advantages:
  - Compiler doesn't need to have knowledge of the microarchitecture
  - Handles cases where dependences are unknown at compile time
- Disadvantages:
  - Substantial increase in hardware complexity
  - Complicates exceptions
![Page 14: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/14.jpg)
Dynamic Scheduling
- Dynamic scheduling implies:
  - Out-of-order execution
  - Out-of-order completion
- Creates the possibility of WAR and WAW hazards
  - WAR example:

        DIVD F0,F2,F4    // assume this takes a long time
        ADDD F10,F0,F8   // RAW hazard on F0
        SUBD F8,F8,F14   // WAR hazard on F8

- Two popular dynamic scheduling algorithms: Scoreboard and Tomasulo Algorithm
  - Both track when operands are available
  - Tomasulo further introduces register renaming in hardware
    - Minimizes WAW and WAR hazards
![Page 15: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/15.jpg)
Tomasulo Algorithm
- Best-known dynamic scheduling algorithm
  - Influenced virtually all out-of-order instruction scheduling techniques
  - Alpha 21264, HP 8000, MIPS 10000, Pentium II, PowerPC 604, …
- First introduced for the IBM 360/91 (1966)
- Goal: high performance without special compilers
![Page 16: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/16.jpg)
Tomasulo Algorithm
- Register renaming to overcome WAR/WAW hazards (1)
  - Example:

        DIV.D F0,F2,F4
        ADD.D F6,F0,F8
        S.D   F6,0(R1)
        SUB.D F8,F10,F14
        MUL.D F6,F10,F8

  - Besides the true data dependences, there are name dependences on F6 and F8:
    - Antidependence (WAR) on F8, between ADD.D and SUB.D
    - Output dependence (WAW) on F6, between ADD.D and MUL.D
![Page 17: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/17.jpg)
Tomasulo Algorithm
- Register renaming to overcome WAR/WAW hazards (2)
  - Example, with the first F6 renamed to S and the new F8 renamed to T:

        DIV.D F0,F2,F4
        ADD.D S,F0,F8
        S.D   S,0(R1)
        SUB.D T,F10,F14
        MUL.D F6,F10,T

  - Now only RAW hazards remain, which can be strictly ordered
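The renaming step can be sketched in a few lines. This is an illustration under simplifying assumptions, not the hardware mechanism: every destination gets a fresh temporary (`T0`, `T1`, ...), whereas the slide renames only the two conflicting registers to `S` and `T`. The names `rename` and `name_hazards` are hypothetical.

```python
from itertools import count

PROGRAM = [  # (op, destination or None, source registers)
    ("DIV.D", "F0", ["F2", "F4"]),
    ("ADD.D", "F6", ["F0", "F8"]),
    ("S.D",   None, ["F6", "R1"]),
    ("SUB.D", "F8", ["F10", "F14"]),
    ("MUL.D", "F6", ["F10", "F8"]),
]


def rename(program):
    """Give every destination a fresh name; sources read the latest name."""
    fresh = (f"T{n}" for n in count())
    current = {}  # architectural register -> most recent temporary holding it
    renamed = []
    for op, dest, srcs in program:
        new_srcs = [current.get(s, s) for s in srcs]   # read mapping before updating it
        new_dest = None
        if dest is not None:
            new_dest = next(fresh)
            current[dest] = new_dest
        renamed.append((op, new_dest, new_srcs))
    return renamed


def name_hazards(program):
    """List (i, j, kind) for WAR/WAW name dependences, with i earlier than j."""
    found = []
    for j, (_, dest_j, _) in enumerate(program):
        if dest_j is None:
            continue
        for i in range(j):
            _, dest_i, srcs_i = program[i]
            if dest_j in srcs_i:
                found.append((i, j, "WAR"))
            if dest_j == dest_i:
                found.append((i, j, "WAW"))
    return found


print(name_hazards(PROGRAM))          # name dependences before renaming
print(name_hazards(rename(PROGRAM)))  # [] after renaming
```

Before renaming, the check finds the WAR on F8 and the WAW on F6 noted in the example, plus a WAR on F6 between S.D and MUL.D. After renaming, the list is empty and only the true (RAW) dependences are left.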
![Page 18: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/18.jpg)
Tomasulo Algorithm
- Register renaming is provided by reservation stations (RS) in Tomasulo Algorithm
  - An RS contains:
    - The instruction
    - Buffered operand values (when available)
    - Reservation station number of the instruction providing the operand values
  - An RS fetches and buffers an operand as soon as it becomes available (not necessarily involving the register file)
  - Pending instructions designate the RS to which they will send their output
    - Result values are broadcast on a result bus, called the common data bus (CDB)
    - Only the last output updates the register file
  - As instructions are issued, the register specifiers are renamed with the reservation station
  - There may be more reservation stations than registers
![Page 19: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/19.jpg)
Tomasulo Algorithm
- Tomasulo organization
[Figure: Tomasulo organization. Labeled blocks: FP Op Queue; FP Registers; Load Buffers (Load1 to Load6, from memory); Store Buffers (to memory); Reservation Stations Add1 to Add3 for the FP adders and Mult1 to Mult2 for the FP multipliers; Common Data Bus (CDB).]
![Page 20: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/20.jpg)
Tomasulo Algorithm
- Reservation station (RS) components
  - Op: operation to perform in the unit (e.g., + or –)
  - Vj, Vk: values of the source operands
    - Store buffers have a V field: the result to be stored
  - Qj, Qk: reservation stations producing the source registers (value to be written)
    - Note: Qj, Qk = 0 => ready
    - Store buffers only have Qi, for the RS producing the result
  - Busy: indicates that the reservation station or FU is busy
- Register result status: indicates which functional unit will write each register, if one exists. Blank when there are no pending instructions that will write that register.
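As a rough software analogue of the fields listed above, a reservation station can be modeled as a small record. This is a sketch under assumptions: `None` stands in for the slide's "Qj, Qk = 0" (ready) convention, operand values are plain floats, and the `ready` and `capture` method names are invented for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReservationStation:
    name: str
    busy: bool = False
    op: Optional[str] = None
    vj: Optional[float] = None   # value of first source operand, once known
    vk: Optional[float] = None   # value of second source operand, once known
    qj: Optional[str] = None     # station that will produce Vj (None = ready)
    qk: Optional[str] = None     # station that will produce Vk (None = ready)

    def ready(self) -> bool:
        """Both operands are present, so the operation may start executing."""
        return self.busy and self.qj is None and self.qk is None

    def capture(self, tag: str, value: float) -> None:
        """Snoop a common data bus broadcast: `tag` names the producing station."""
        if self.qj == tag:
            self.vj, self.qj = value, None
        if self.qk == tag:
            self.vk, self.qk = value, None


# A SUBD with one operand already buffered, still waiting on a load buffer named Load2.
add1 = ReservationStation("Add1", busy=True, op="SUBD", vj=1.5, qk="Load2")
print(add1.ready())          # False
add1.capture("Load2", 2.5)   # Load2 broadcasts its result
print(add1.ready())          # True
```

The register result status table would be a separate mapping from register name to the station tag that will write it, for example a plain `dict`.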
![Page 21: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/21.jpg)
Tomasulo Algorithm
- Three stages of Tomasulo Algorithm
  1. Issue: get instruction from the FP Op Queue. If a reservation station is free (no structural hazard), control issues the instruction and sends operands (renames registers).
  2. Execution: operate on operands (EX). When both operands are ready, execute; if not ready, watch the Common Data Bus for the result.
  3. Write result: finish execution (WB). Write on the Common Data Bus to all awaiting units; mark the reservation station available.
- Normal data bus: data + destination ("go to" bus)
- Common data bus: data + source ("come from" bus)
  - 64 bits of data + 4 bits of functional unit source address
  - Write if it matches the expected functional unit (the one that produces the result)
  - Does the broadcast
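The three stages can be sketched as a timing-only simulation. Everything below is an assumption made for illustration: one instruction issues per cycle, execution starts the cycle after the last operand is broadcast, the result is written the cycle after execution completes, the bus is never contended, and the latencies (2 cycles for loads and add/subtract, 10 for multiply, 40 for divide) are chosen to match a common textbook example. Operand values are not computed, only tags and cycle numbers.

```python
LATENCY = {"LD": 2, "ADDD": 2, "SUBD": 2, "MULTD": 10, "DIVD": 40}   # assumed
UNIT = {"LD": "Load", "ADDD": "Add", "SUBD": "Add", "MULTD": "Mult", "DIVD": "Mult"}
STATIONS = {"Load": 3, "Add": 3, "Mult": 2}                           # assumed sizes

PROGRAM = [  # (op, destination, floating-point source registers)
    ("LD", "F6", []),
    ("LD", "F2", []),
    ("MULTD", "F0", ["F2", "F4"]),
    ("SUBD", "F8", ["F6", "F2"]),
    ("DIVD", "F10", ["F0", "F6"]),
    ("ADDD", "F6", ["F8", "F2"]),
]


def simulate(program):
    """Return (issue, exec complete, write result) cycles per instruction."""
    status = {}   # register result status: register -> tag of station that will write it
    busy = {}     # tag -> in-flight entry (the occupied stations and load buffers)
    timing = [{"issue": None, "comp": None, "write": None} for _ in program]
    pc = cycle = 0
    while pc < len(program) or busy:
        cycle += 1
        # 2. Execute: start entries whose operands were all available before this cycle.
        for entry in busy.values():
            if entry["comp"] is None and not entry["wait"]:
                entry["comp"] = cycle + LATENCY[entry["op"]] - 1
                timing[entry["idx"]]["comp"] = entry["comp"]
        # 3. Write result: broadcast the tag on the CDB, then free the station.
        for tag, entry in list(busy.items()):
            if entry["comp"] is not None and entry["comp"] < cycle:
                timing[entry["idx"]]["write"] = cycle
                for other in busy.values():
                    other["wait"].discard(tag)      # waiting stations match on source tag
                if status.get(entry["dest"]) == tag:
                    del status[entry["dest"]]       # only the last writer updates the register
                del busy[tag]
        # 1. Issue: in order, one per cycle, if a station of the right kind is free.
        if pc < len(program):
            op, dest, srcs = program[pc]
            unit = UNIT[op]
            free = [f"{unit}{n}" for n in range(1, STATIONS[unit] + 1)
                    if f"{unit}{n}" not in busy]
            if free:
                busy[free[0]] = {"idx": pc, "op": op, "dest": dest, "comp": None,
                                 "wait": {status[s] for s in srcs if s in status}}
                status[dest] = free[0]              # rename: dest now refers to this station
                timing[pc]["issue"] = cycle
                pc += 1
    return [(t["issue"], t["comp"], t["write"]) for t in timing]


for (op, dest, _), row in zip(PROGRAM, simulate(PROGRAM)):
    print(op, dest, row)
```

The stages run in the order execute, write, issue inside each cycle so that a broadcast in cycle c is visible to an instruction issuing in cycle c but only lets a waiting instruction begin executing in cycle c + 1. Under these assumptions the issue order is preserved while completion is out of order: the short ADDD writes its result long before the older DIVD.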
![Page 22: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/22.jpg)
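Tomasulo Example One: A Straight-Line Code

Instruction status:

| Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|
| LD F6 | 34+ | R2 | | | |
| LD F2 | 45+ | R3 | | | |
| MULTD F0 | F2 | F4 | | | |
| SUBD F8 | F6 | F2 | | | |
| DIVD F10 | F0 | F6 | | | |
| ADDD F6 | F8 | F2 | | | |

Load buffers:

| Name | Busy | Address |
|---|---|---|
| Load1 | No | |
| Load2 | No | |
| Load3 | No | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk (RS) |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | No | | | | | |
| | Mult2 | No | | | | | |

Register result status:

| Clock | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | FU | | | | | | | | | |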
![Page 23: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/23.jpg)
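Tomasulo Example One: Cycle 1

Instruction status:

| Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|
| LD F6 | 34+ | R2 | 1 | | |
| LD F2 | 45+ | R3 | | | |
| MULTD F0 | F2 | F4 | | | |
| SUBD F8 | F6 | F2 | | | |
| DIVD F10 | F0 | F6 | | | |
| ADDD F6 | F8 | F2 | | | |

Load buffers:

| Name | Busy | Address |
|---|---|---|
| Load1 | Yes | 34+R2 |
| Load2 | No | |
| Load3 | No | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk (RS) |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | No | | | | | |
| | Mult2 | No | | | | | |

Register result status:

| Clock | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | FU | | | | Load1 | | | | | |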
![Page 24: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/24.jpg)
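Tomasulo Example One: Cycle 2

Instruction status:

| Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|
| LD F6 | 34+ | R2 | 1 | | |
| LD F2 | 45+ | R3 | 2 | | |
| MULTD F0 | F2 | F4 | | | |
| SUBD F8 | F6 | F2 | | | |
| DIVD F10 | F0 | F6 | | | |
| ADDD F6 | F8 | F2 | | | |

Load buffers:

| Name | Busy | Address |
|---|---|---|
| Load1 | Yes | 34+R2 |
| Load2 | Yes | 45+R3 |
| Load3 | No | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk (RS) |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | No | | | | | |
| | Mult2 | No | | | | | |

Register result status:

| Clock | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 2 | FU | | Load2 | | Load1 | | | | | |

- Note: can have multiple loads outstanding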
![Page 25: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/25.jpg)
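Tomasulo Example One: Cycle 3

Instruction status:

| Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|
| LD F6 | 34+ | R2 | 1 | 3 | |
| LD F2 | 45+ | R3 | 2 | | |
| MULTD F0 | F2 | F4 | 3 | | |
| SUBD F8 | F6 | F2 | | | |
| DIVD F10 | F0 | F6 | | | |
| ADDD F6 | F8 | F2 | | | |

Load buffers:

| Name | Busy | Address |
|---|---|---|
| Load1 | Yes | 34+R2 |
| Load2 | Yes | 45+R3 |
| Load3 | No | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk (RS) |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | Yes | MULTD | | R(F4) | Load2 | |
| | Mult2 | No | | | | | |

Register result status:

| Clock | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 3 | FU | Mult1 | Load2 | | Load1 | | | | | |

- Note: register names are removed ("renamed") in reservation stations; MULT issued (compare with scoreboard)
- Load1 completing; what is waiting for Load1?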
![Page 26: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/26.jpg)
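Tomasulo Example One: Cycle 4

Instruction status:

| Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|
| LD F6 | 34+ | R2 | 1 | 3 | 4 |
| LD F2 | 45+ | R3 | 2 | 4 | |
| MULTD F0 | F2 | F4 | 3 | | |
| SUBD F8 | F6 | F2 | 4 | | |
| DIVD F10 | F0 | F6 | | | |
| ADDD F6 | F8 | F2 | | | |

Load buffers:

| Name | Busy | Address |
|---|---|---|
| Load1 | No | |
| Load2 | Yes | 45+R3 |
| Load3 | No | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk (RS) |
|---|---|---|---|---|---|---|---|
| | Add1 | Yes | SUBD | M(A1) | | | Load2 |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | Yes | MULTD | | R(F4) | Load2 | |
| | Mult2 | No | | | | | |

Register result status:

| Clock | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 4 | FU | Mult1 | Load2 | | M(A1) | Add1 | | | | |

- Load2 completing; what is waiting for Load2?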
![Page 27: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/27.jpg)
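Tomasulo Example One: Cycle 5

Instruction status:

| Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|
| LD F6 | 34+ | R2 | 1 | 3 | 4 |
| LD F2 | 45+ | R3 | 2 | 4 | 5 |
| MULTD F0 | F2 | F4 | 3 | | |
| SUBD F8 | F6 | F2 | 4 | | |
| DIVD F10 | F0 | F6 | 5 | | |
| ADDD F6 | F8 | F2 | | | |

Load buffers:

| Name | Busy | Address |
|---|---|---|
| Load1 | No | |
| Load2 | No | |
| Load3 | No | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk (RS) |
|---|---|---|---|---|---|---|---|
| 2 | Add1 | Yes | SUBD | M(A1) | M(A2) | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| 10 | Mult1 | Yes | MULTD | M(A2) | R(F4) | | |
| | Mult2 | Yes | DIVD | | M(A1) | Mult1 | |

Register result status:

| Clock | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 5 | FU | Mult1 | M(A2) | | M(A1) | Add1 | Mult2 | | | |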
![Page 28: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/28.jpg)
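Tomasulo Example One: Cycle 6

Instruction status:

| Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|
| LD F6 | 34+ | R2 | 1 | 3 | 4 |
| LD F2 | 45+ | R3 | 2 | 4 | 5 |
| MULTD F0 | F2 | F4 | 3 | | |
| SUBD F8 | F6 | F2 | 4 | | |
| DIVD F10 | F0 | F6 | 5 | | |
| ADDD F6 | F8 | F2 | 6 | | |

Load buffers:

| Name | Busy | Address |
|---|---|---|
| Load1 | No | |
| Load2 | No | |
| Load3 | No | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk (RS) |
|---|---|---|---|---|---|---|---|
| 1 | Add1 | Yes | SUBD | M(A1) | M(A2) | | |
| | Add2 | Yes | ADDD | | M(A2) | Add1 | |
| | Add3 | No | | | | | |
| 9 | Mult1 | Yes | MULTD | M(A2) | R(F4) | | |
| | Mult2 | Yes | DIVD | | M(A1) | Mult1 | |

Register result status:

| Clock | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 6 | FU | Mult1 | M(A2) | | Add2 | Add1 | Mult2 | | | |

- Issue ADDD here?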
![Page 29: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/29.jpg)
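Tomasulo Example One: Cycle 7

Instruction status:

| Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|
| LD F6 | 34+ | R2 | 1 | 3 | 4 |
| LD F2 | 45+ | R3 | 2 | 4 | 5 |
| MULTD F0 | F2 | F4 | 3 | | |
| SUBD F8 | F6 | F2 | 4 | 7 | |
| DIVD F10 | F0 | F6 | 5 | | |
| ADDD F6 | F8 | F2 | 6 | | |

Load buffers:

| Name | Busy | Address |
|---|---|---|
| Load1 | No | |
| Load2 | No | |
| Load3 | No | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk (RS) |
|---|---|---|---|---|---|---|---|
| 0 | Add1 | Yes | SUBD | M(A1) | M(A2) | | |
| | Add2 | Yes | ADDD | | M(A2) | Add1 | |
| | Add3 | No | | | | | |
| 8 | Mult1 | Yes | MULTD | M(A2) | R(F4) | | |
| | Mult2 | Yes | DIVD | | M(A1) | Mult1 | |

Register result status:

| Clock | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 7 | FU | Mult1 | M(A2) | | Add2 | Add1 | Mult2 | | | |

- Add1 completing; what is waiting for it?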
![Page 30: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/30.jpg)
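Tomasulo Example One: Cycle 8

Instruction status:

| Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|
| LD F6 | 34+ | R2 | 1 | 3 | 4 |
| LD F2 | 45+ | R3 | 2 | 4 | 5 |
| MULTD F0 | F2 | F4 | 3 | | |
| SUBD F8 | F6 | F2 | 4 | 7 | 8 |
| DIVD F10 | F0 | F6 | 5 | | |
| ADDD F6 | F8 | F2 | 6 | | |

Load buffers:

| Name | Busy | Address |
|---|---|---|
| Load1 | No | |
| Load2 | No | |
| Load3 | No | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk (RS) |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| 2 | Add2 | Yes | ADDD | (M-M) | M(A2) | | |
| | Add3 | No | | | | | |
| 7 | Mult1 | Yes | MULTD | M(A2) | R(F4) | | |
| | Mult2 | Yes | DIVD | | M(A1) | Mult1 | |

Register result status:

| Clock | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 8 | FU | Mult1 | M(A2) | | Add2 | (M-M) | Mult2 | | | |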
![Page 31: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/31.jpg)
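Tomasulo Example One: Cycle 9

Instruction status:

| Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|
| LD F6 | 34+ | R2 | 1 | 3 | 4 |
| LD F2 | 45+ | R3 | 2 | 4 | 5 |
| MULTD F0 | F2 | F4 | 3 | | |
| SUBD F8 | F6 | F2 | 4 | 7 | 8 |
| DIVD F10 | F0 | F6 | 5 | | |
| ADDD F6 | F8 | F2 | 6 | | |

Load buffers:

| Name | Busy | Address |
|---|---|---|
| Load1 | No | |
| Load2 | No | |
| Load3 | No | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk (RS) |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| 1 | Add2 | Yes | ADDD | (M-M) | M(A2) | | |
| | Add3 | No | | | | | |
| 6 | Mult1 | Yes | MULTD | M(A2) | R(F4) | | |
| | Mult2 | Yes | DIVD | | M(A1) | Mult1 | |

Register result status:

| Clock | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 9 | FU | Mult1 | M(A2) | | Add2 | (M-M) | Mult2 | | | |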
![Page 32: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/32.jpg)
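Tomasulo Example One: Cycle 10

Instruction status:

| Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|
| LD F6 | 34+ | R2 | 1 | 3 | 4 |
| LD F2 | 45+ | R3 | 2 | 4 | 5 |
| MULTD F0 | F2 | F4 | 3 | | |
| SUBD F8 | F6 | F2 | 4 | 7 | 8 |
| DIVD F10 | F0 | F6 | 5 | | |
| ADDD F6 | F8 | F2 | 6 | 10 | |

Load buffers:

| Name | Busy | Address |
|---|---|---|
| Load1 | No | |
| Load2 | No | |
| Load3 | No | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk (RS) |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| 0 | Add2 | Yes | ADDD | (M-M) | M(A2) | | |
| | Add3 | No | | | | | |
| 5 | Mult1 | Yes | MULTD | M(A2) | R(F4) | | |
| | Mult2 | Yes | DIVD | | M(A1) | Mult1 | |

Register result status:

| Clock | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 10 | FU | Mult1 | M(A2) | | Add2 | (M-M) | Mult2 | | | |

- Add2 completing; what is waiting for it?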
![Page 33: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/33.jpg)
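Tomasulo Example One: Cycle 11

Instruction status:

| Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|
| LD F6 | 34+ | R2 | 1 | 3 | 4 |
| LD F2 | 45+ | R3 | 2 | 4 | 5 |
| MULTD F0 | F2 | F4 | 3 | | |
| SUBD F8 | F6 | F2 | 4 | 7 | 8 |
| DIVD F10 | F0 | F6 | 5 | | |
| ADDD F6 | F8 | F2 | 6 | 10 | 11 |

Load buffers:

| Name | Busy | Address |
|---|---|---|
| Load1 | No | |
| Load2 | No | |
| Load3 | No | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk (RS) |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| 4 | Mult1 | Yes | MULTD | M(A2) | R(F4) | | |
| | Mult2 | Yes | DIVD | | M(A1) | Mult1 | |

Register result status:

| Clock | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 11 | FU | Mult1 | M(A2) | | (M-M+M) | (M-M) | Mult2 | | | |

- Write result of ADDD here (compare with scoreboard)?
- All quick instructions complete in this cycle!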
![Page 34: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/34.jpg)
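Tomasulo Example One: Cycle 12

Instruction status:

| Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|
| LD F6 | 34+ | R2 | 1 | 3 | 4 |
| LD F2 | 45+ | R3 | 2 | 4 | 5 |
| MULTD F0 | F2 | F4 | 3 | | |
| SUBD F8 | F6 | F2 | 4 | 7 | 8 |
| DIVD F10 | F0 | F6 | 5 | | |
| ADDD F6 | F8 | F2 | 6 | 10 | 11 |

Load buffers:

| Name | Busy | Address |
|---|---|---|
| Load1 | No | |
| Load2 | No | |
| Load3 | No | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk (RS) |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| 3 | Mult1 | Yes | MULTD | M(A2) | R(F4) | | |
| | Mult2 | Yes | DIVD | | M(A1) | Mult1 | |

Register result status:

| Clock | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 12 | FU | Mult1 | M(A2) | | (M-M+M) | (M-M) | Mult2 | | | |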
![Page 35: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/35.jpg)
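Tomasulo Example One: Cycle 13

Instruction status:

| Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|
| LD F6 | 34+ | R2 | 1 | 3 | 4 |
| LD F2 | 45+ | R3 | 2 | 4 | 5 |
| MULTD F0 | F2 | F4 | 3 | | |
| SUBD F8 | F6 | F2 | 4 | 7 | 8 |
| DIVD F10 | F0 | F6 | 5 | | |
| ADDD F6 | F8 | F2 | 6 | 10 | 11 |

Load buffers:

| Name | Busy | Address |
|---|---|---|
| Load1 | No | |
| Load2 | No | |
| Load3 | No | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk (RS) |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| 2 | Mult1 | Yes | MULTD | M(A2) | R(F4) | | |
| | Mult2 | Yes | DIVD | | M(A1) | Mult1 | |

Register result status:

| Clock | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 13 | FU | Mult1 | M(A2) | | (M-M+M) | (M-M) | Mult2 | | | |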
![Page 36: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/36.jpg)
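Tomasulo Example One: Cycle 14

Instruction status:

| Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|
| LD F6 | 34+ | R2 | 1 | 3 | 4 |
| LD F2 | 45+ | R3 | 2 | 4 | 5 |
| MULTD F0 | F2 | F4 | 3 | | |
| SUBD F8 | F6 | F2 | 4 | 7 | 8 |
| DIVD F10 | F0 | F6 | 5 | | |
| ADDD F6 | F8 | F2 | 6 | 10 | 11 |

Load buffers:

| Name | Busy | Address |
|---|---|---|
| Load1 | No | |
| Load2 | No | |
| Load3 | No | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk (RS) |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| 1 | Mult1 | Yes | MULTD | M(A2) | R(F4) | | |
| | Mult2 | Yes | DIVD | | M(A1) | Mult1 | |

Register result status:

| Clock | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 14 | FU | Mult1 | M(A2) | | (M-M+M) | (M-M) | Mult2 | | | |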
![Page 37: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/37.jpg)
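Tomasulo Example One: Cycle 15

Instruction status:

| Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|
| LD F6 | 34+ | R2 | 1 | 3 | 4 |
| LD F2 | 45+ | R3 | 2 | 4 | 5 |
| MULTD F0 | F2 | F4 | 3 | 15 | |
| SUBD F8 | F6 | F2 | 4 | 7 | 8 |
| DIVD F10 | F0 | F6 | 5 | | |
| ADDD F6 | F8 | F2 | 6 | 10 | 11 |

Load buffers:

| Name | Busy | Address |
|---|---|---|
| Load1 | No | |
| Load2 | No | |
| Load3 | No | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk (RS) |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| 0 | Mult1 | Yes | MULTD | M(A2) | R(F4) | | |
| | Mult2 | Yes | DIVD | | M(A1) | Mult1 | |

Register result status:

| Clock | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 15 | FU | Mult1 | M(A2) | | (M-M+M) | (M-M) | Mult2 | | | |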
![Page 38: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/38.jpg)
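Tomasulo Example One: Cycle 16

Instruction status:

| Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|
| LD F6 | 34+ | R2 | 1 | 3 | 4 |
| LD F2 | 45+ | R3 | 2 | 4 | 5 |
| MULTD F0 | F2 | F4 | 3 | 15 | 16 |
| SUBD F8 | F6 | F2 | 4 | 7 | 8 |
| DIVD F10 | F0 | F6 | 5 | | |
| ADDD F6 | F8 | F2 | 6 | 10 | 11 |

Load buffers:

| Name | Busy | Address |
|---|---|---|
| Load1 | No | |
| Load2 | No | |
| Load3 | No | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk (RS) |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | No | | | | | |
| 40 | Mult2 | Yes | DIVD | M*F4 | M(A1) | | |

Register result status:

| Clock | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 16 | FU | M*F4 | M(A2) | | (M-M+M) | (M-M) | Mult2 | | | |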
![Page 39: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/39.jpg)
Tomasulo Example One (Cont.)
Faster than light computation (skip a couple of cycles)…
![Page 40: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/40.jpg)
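Tomasulo Example One: Cycle 55

Instruction status:

| Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|
| LD F6 | 34+ | R2 | 1 | 3 | 4 |
| LD F2 | 45+ | R3 | 2 | 4 | 5 |
| MULTD F0 | F2 | F4 | 3 | 15 | 16 |
| SUBD F8 | F6 | F2 | 4 | 7 | 8 |
| DIVD F10 | F0 | F6 | 5 | | |
| ADDD F6 | F8 | F2 | 6 | 10 | 11 |

Load buffers:

| Name | Busy | Address |
|---|---|---|
| Load1 | No | |
| Load2 | No | |
| Load3 | No | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk (RS) |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | No | | | | | |
| 1 | Mult2 | Yes | DIVD | M*F4 | M(A1) | | |

Register result status:

| Clock | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 55 | FU | M*F4 | M(A2) | | (M-M+M) | (M-M) | Mult2 | | | |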
![Page 41: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/41.jpg)
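Tomasulo Example One: Cycle 56

Instruction status:

| Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|
| LD F6 | 34+ | R2 | 1 | 3 | 4 |
| LD F2 | 45+ | R3 | 2 | 4 | 5 |
| MULTD F0 | F2 | F4 | 3 | 15 | 16 |
| SUBD F8 | F6 | F2 | 4 | 7 | 8 |
| DIVD F10 | F0 | F6 | 5 | 56 | |
| ADDD F6 | F8 | F2 | 6 | 10 | 11 |

Load buffers:

| Name | Busy | Address |
|---|---|---|
| Load1 | No | |
| Load2 | No | |
| Load3 | No | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk (RS) |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | No | | | | | |
| 0 | Mult2 | Yes | DIVD | M*F4 | M(A1) | | |

Register result status:

| Clock | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 56 | FU | M*F4 | M(A2) | | (M-M+M) | (M-M) | Mult2 | | | |

- Mult2 is completing; what is waiting for it?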
• Mult2iscomple)ng;whatiswai)ngforit?
![Page 42: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/42.jpg)
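Tomasulo Example One: Cycle 57

Instruction status:

| Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|
| LD F6 | 34+ | R2 | 1 | 3 | 4 |
| LD F2 | 45+ | R3 | 2 | 4 | 5 |
| MULTD F0 | F2 | F4 | 3 | 15 | 16 |
| SUBD F8 | F6 | F2 | 4 | 7 | 8 |
| DIVD F10 | F0 | F6 | 5 | 56 | 57 |
| ADDD F6 | F8 | F2 | 6 | 10 | 11 |

Load buffers:

| Name | Busy | Address |
|---|---|---|
| Load1 | No | |
| Load2 | No | |
| Load3 | No | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk (RS) |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | No | | | | | |
| | Mult2 | Yes | DIVD | M*F4 | M(A1) | | |

Register result status:

| Clock | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|
| 57 | FU | M*F4 | M(A2) | | (M-M+M) | (M-M) | Result | | | |

• Once again: in-order issue, out-of-order execution and completion.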
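The claim can be checked directly against the Issue, Exec Comp, and Write Result columns of the final table. The sketch below is only an illustration; the helper name `is_in_order` and the tuple layout are assumptions made for this example.

```python
# (instruction, issue, exec_complete, write_result) from the cycle 57 table.
timeline = [
    ("LD F6",    1,  3,  4),
    ("LD F2",    2,  4,  5),
    ("MULTD F0", 3, 15, 16),
    ("SUBD F8",  4,  7,  8),
    ("DIVD F10", 5, 56, 57),
    ("ADDD F6",  6, 10, 11),
]


def is_in_order(cycles):
    """True if the cycle numbers never decrease in program order."""
    return all(a <= b for a, b in zip(cycles, cycles[1:]))


issue_in_order = is_in_order([t[1] for t in timeline])
exec_in_order = is_in_order([t[2] for t in timeline])
write_in_order = is_in_order([t[3] for t in timeline])

# Program order re-sorted by the cycle in which each result was written.
completion_order = [t[0] for t in sorted(timeline, key=lambda t: t[3])]

print(issue_in_order, exec_in_order, write_in_order)  # True False False
print(completion_order)
```

SUBD and ADDD both write their results before the earlier MULTD, and DIVD finishes last because it waits for the multiply.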
![Page 43: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/43.jpg)
Tomasulo Example Two: A Loop

• Loop example code

    Loop: LD    F0, 0(R1)
          MULTD F4, F0, F2
          SD    F4, 0(R1)
          SUBI  R1, R1, #8
          BNEZ  R1, Loop

  ▪ Assume multiply takes 4 clocks
  ▪ Assume first load takes 8 clocks (cache miss), second load takes 1 clock (hit)
  ▪ To be clear, will show clocks for SUBI, BNEZ
  ▪ Reality: integer instructions ahead
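For readers less used to the assembly, the loop multiplies each 8-byte element by the scalar in F2 and walks R1 downward until it reaches zero. The following Python sketch models only the architectural effect, not the timing. The starting value R1 = 80 matches the register status tables in the trace; the memory contents, the F2 value, and the function name `run_loop` are made up for illustration.

```python
def run_loop(memory, r1, f2):
    """Interpret the five-instruction loop; return the addresses touched, in order."""
    touched = []
    while True:
        f0 = memory[r1]      # LD    F0, 0(R1)
        f4 = f0 * f2         # MULTD F4, F0, F2
        memory[r1] = f4      # SD    F4, 0(R1)
        touched.append(r1)
        r1 -= 8              # SUBI  R1, R1, #8
        if r1 == 0:          # BNEZ  R1, Loop (falls through when R1 is zero)
            break
    return touched


# Hypothetical memory: ten doubles at addresses 8, 16, ..., 80.
memory = {addr: 1.0 for addr in range(8, 88, 8)}
touched = run_loop(memory, r1=80, f2=3.0)

print(touched[:3])   # [80, 72, 64]
print(len(touched))  # 10
```

The first three addresses, 80, 72, and 64, are the ones that appear in the load and store buffers of the trace.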
![Page 44: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/44.jpg)
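Tomasulo Example Two: A Loop

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | | | |
| 1 | MULTD F4 | F0 | F2 | | | |
| 1 | SD F4 | 0 | R1 | | | |
| 2 | LD F0 | 0 | R1 | | | |
| 2 | MULTD F4 | F0 | F2 | | | |
| 2 | SD F4 | 0 | R1 | | | |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | No | | |
| Load2 | No | | |
| Load3 | No | | |
| Store1 | No | | |
| Store2 | No | | |
| Store3 | No | | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | No | | | | | |
| | Mult2 | No | | | | | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 80 | Fu | | | | | | | | | |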
![Page 45: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/45.jpg)
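Tomasulo Example Two: Cycle 1

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | 1 | | |
| 1 | MULTD F4 | F0 | F2 | | | |
| 1 | SD F4 | 0 | R1 | | | |
| 2 | LD F0 | 0 | R1 | | | |
| 2 | MULTD F4 | F0 | F2 | | | |
| 2 | SD F4 | 0 | R1 | | | |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | Yes | 80 | |
| Load2 | No | | |
| Load3 | No | | |
| Store1 | No | | |
| Store2 | No | | |
| Store3 | No | | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | No | | | | | |
| | Mult2 | No | | | | | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 80 | Fu | Load1 | | | | | | | | |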
![Page 46: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/46.jpg)
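Tomasulo Example Two: Cycle 2

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | 1 | | |
| 1 | MULTD F4 | F0 | F2 | 2 | | |
| 1 | SD F4 | 0 | R1 | | | |
| 2 | LD F0 | 0 | R1 | | | |
| 2 | MULTD F4 | F0 | F2 | | | |
| 2 | SD F4 | 0 | R1 | | | |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | Yes | 80 | |
| Load2 | No | | |
| Load3 | No | | |
| Store1 | No | | |
| Store2 | No | | |
| Store3 | No | | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | Yes | Multd | | R(F2) | Load1 | |
| | Mult2 | No | | | | | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 80 | Fu | Load1 | | Mult1 | | | | | | |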
![Page 47: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/47.jpg)
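Tomasulo Example Two: Cycle 3

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | 1 | | |
| 1 | MULTD F4 | F0 | F2 | 2 | | |
| 1 | SD F4 | 0 | R1 | 3 | | |
| 2 | LD F0 | 0 | R1 | | | |
| 2 | MULTD F4 | F0 | F2 | | | |
| 2 | SD F4 | 0 | R1 | | | |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | Yes | 80 | |
| Load2 | No | | |
| Load3 | No | | |
| Store1 | Yes | 80 | Mult1 |
| Store2 | No | | |
| Store3 | No | | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | Yes | Multd | | R(F2) | Load1 | |
| | Mult2 | No | | | | | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3 | 80 | Fu | Load1 | | Mult1 | | | | | | |

• Implicit renaming sets up "DataFlow" graph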
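One way to see the data-flow graph is to replay the issue step for the first iteration: each source register is looked up in the register result status, and if a station is still going to produce it, the consumer records that station's tag instead of a value. The sketch below is a simplified model under assumptions; the function name `issue`, the tuple format, and the decision to ignore the integer register R1 are all choices made for this illustration.

```python
def issue(instrs):
    """Rename sources at issue time; return (producer, consumer) edges.

    Each instruction is (station, dest_register_or_None, source_registers).
    """
    register_status = {}  # register -> tag of the station that will write it
    edges = []
    for station, dest, sources in instrs:
        for src in sources:
            producer = register_status.get(src)
            if producer is not None:
                # Value not ready yet: the consumer stores the tag (Qj/Qk).
                edges.append((producer, station))
            # Otherwise the value would be copied from the register file (Vj/Vk).
        if dest is not None:
            register_status[dest] = station  # later readers of dest see this tag
    return edges


first_iteration = [
    ("Load1", "F0", []),            # LD    F0, 0(R1)
    ("Mult1", "F4", ["F0", "F2"]),  # MULTD F4, F0, F2
    ("Store1", None, ["F4"]),       # SD    F4, 0(R1)
]

print(issue(first_iteration))  # [('Load1', 'Mult1'), ('Mult1', 'Store1')]
```

The two edges correspond to the Qj = Load1 entry of Mult1 and the Fu = Mult1 entry of Store1 in the tables for this cycle.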
![Page 48: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/48.jpg)
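Tomasulo Example Two: Cycle 4

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | 1 | | |
| 1 | MULTD F4 | F0 | F2 | 2 | | |
| 1 | SD F4 | 0 | R1 | 3 | | |
| 2 | LD F0 | 0 | R1 | | | |
| 2 | MULTD F4 | F0 | F2 | | | |
| 2 | SD F4 | 0 | R1 | | | |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | Yes | 80 | |
| Load2 | No | | |
| Load3 | No | | |
| Store1 | Yes | 80 | Mult1 |
| Store2 | No | | |
| Store3 | No | | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | Yes | Multd | | R(F2) | Load1 | |
| | Mult2 | No | | | | | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | 80 | Fu | Load1 | | Mult1 | | | | | | |

• Dispatching SUBI instruction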
![Page 49: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/49.jpg)
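Tomasulo Example Two: Cycle 5

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | 1 | | |
| 1 | MULTD F4 | F0 | F2 | 2 | | |
| 1 | SD F4 | 0 | R1 | 3 | | |
| 2 | LD F0 | 0 | R1 | | | |
| 2 | MULTD F4 | F0 | F2 | | | |
| 2 | SD F4 | 0 | R1 | | | |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | Yes | 80 | |
| Load2 | No | | |
| Load3 | No | | |
| Store1 | Yes | 80 | Mult1 |
| Store2 | No | | |
| Store3 | No | | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | Yes | Multd | | R(F2) | Load1 | |
| | Mult2 | No | | | | | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 72 | Fu | Load1 | | Mult1 | | | | | | |

• And the BNEZ instruction is dispatched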
![Page 50: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/50.jpg)
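Tomasulo Example Two: Cycle 6

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | 1 | | |
| 1 | MULTD F4 | F0 | F2 | 2 | | |
| 1 | SD F4 | 0 | R1 | 3 | | |
| 2 | LD F0 | 0 | R1 | 6 | | |
| 2 | MULTD F4 | F0 | F2 | | | |
| 2 | SD F4 | 0 | R1 | | | |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | Yes | 80 | |
| Load2 | Yes | 72 | |
| Load3 | No | | |
| Store1 | Yes | 80 | Mult1 |
| Store2 | No | | |
| Store3 | No | | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | Yes | Multd | | R(F2) | Load1 | |
| | Mult2 | No | | | | | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 6 | 72 | Fu | Load2 | | Mult1 | | | | | | |

• Notice that F0 never sees the load from location 80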
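The reason F0 never receives the value from location 80 is that a broadcast updates a register only if the register's status entry still names the broadcasting unit. By the time Load1 finishes, F0 has already been renamed to Load2. A minimal sketch of that rule follows; the function name `broadcast`, the dictionary layout, and the string placeholder values are assumptions for illustration, and the state is hand-copied from the cycle 9 tables.

```python
def broadcast(tag, value, register_file, register_status, stations):
    """Deliver (tag, value) from the common data bus to every matching waiter."""
    for rs in stations.values():
        if rs.get("Qj") == tag:
            rs["Vj"], rs["Qj"] = value, None
        if rs.get("Qk") == tag:
            rs["Vk"], rs["Qk"] = value, None
    for reg, producer in list(register_status.items()):
        if producer == tag:
            register_file[reg] = value
            register_status[reg] = None


# Relevant state just before Load1 writes its result (cycle 9 tables).
register_file = {"F0": "old F0"}
register_status = {"F0": "Load2", "F4": "Mult2"}
stations = {
    "Mult1": {"Vj": None, "Vk": "R(F2)", "Qj": "Load1", "Qk": None},
    "Mult2": {"Vj": None, "Vk": "R(F2)", "Qj": "Load2", "Qk": None},
}

broadcast("Load1", "M[80]", register_file, register_status, stations)

print(stations["Mult1"]["Vj"])  # M[80]
print(register_file["F0"])      # old F0
```

Mult1 captures M[80], Mult2 keeps waiting on Load2, and the register file entry for F0 is left untouched.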
![Page 51: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/51.jpg)
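Tomasulo Example Two: Cycle 7

• Register file completely detached from computation
• First and second iteration completely overlapped

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | 1 | | |
| 1 | MULTD F4 | F0 | F2 | 2 | | |
| 1 | SD F4 | 0 | R1 | 3 | | |
| 2 | LD F0 | 0 | R1 | 6 | | |
| 2 | MULTD F4 | F0 | F2 | 7 | | |
| 2 | SD F4 | 0 | R1 | | | |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | Yes | 80 | |
| Load2 | Yes | 72 | |
| Load3 | No | | |
| Store1 | Yes | 80 | Mult1 |
| Store2 | No | | |
| Store3 | No | | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | Yes | Multd | | R(F2) | Load1 | |
| | Mult2 | Yes | Multd | | R(F2) | Load2 | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 7 | 72 | Fu | Load2 | | Mult2 | | | | | | |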
![Page 52: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/52.jpg)
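Tomasulo Example Two: Cycle 8

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | 1 | | |
| 1 | MULTD F4 | F0 | F2 | 2 | | |
| 1 | SD F4 | 0 | R1 | 3 | | |
| 2 | LD F0 | 0 | R1 | 6 | | |
| 2 | MULTD F4 | F0 | F2 | 7 | | |
| 2 | SD F4 | 0 | R1 | 8 | | |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | Yes | 80 | |
| Load2 | Yes | 72 | |
| Load3 | No | | |
| Store1 | Yes | 80 | Mult1 |
| Store2 | Yes | 72 | Mult2 |
| Store3 | No | | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | Yes | Multd | | R(F2) | Load1 | |
| | Mult2 | Yes | Multd | | R(F2) | Load2 | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 8 | 72 | Fu | Load2 | | Mult2 | | | | | | |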
![Page 53: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/53.jpg)
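Tomasulo Example Two: Cycle 9

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | 1 | 9 | |
| 1 | MULTD F4 | F0 | F2 | 2 | | |
| 1 | SD F4 | 0 | R1 | 3 | | |
| 2 | LD F0 | 0 | R1 | 6 | | |
| 2 | MULTD F4 | F0 | F2 | 7 | | |
| 2 | SD F4 | 0 | R1 | 8 | | |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | Yes | 80 | |
| Load2 | Yes | 72 | |
| Load3 | No | | |
| Store1 | Yes | 80 | Mult1 |
| Store2 | Yes | 72 | Mult2 |
| Store3 | No | | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | Yes | Multd | | R(F2) | Load1 | |
| | Mult2 | Yes | Multd | | R(F2) | Load2 | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 9 | 72 | Fu | Load2 | | Mult2 | | | | | | |

• Load1 completing: who is waiting?
• Note: Dispatching SUBI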
![Page 54: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/54.jpg)
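Tomasulo Example Two: Cycle 10

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | 1 | 9 | 10 |
| 1 | MULTD F4 | F0 | F2 | 2 | | |
| 1 | SD F4 | 0 | R1 | 3 | | |
| 2 | LD F0 | 0 | R1 | 6 | 10 | |
| 2 | MULTD F4 | F0 | F2 | 7 | | |
| 2 | SD F4 | 0 | R1 | 8 | | |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | No | | |
| Load2 | Yes | 72 | |
| Load3 | No | | |
| Store1 | Yes | 80 | Mult1 |
| Store2 | Yes | 72 | Mult2 |
| Store3 | No | | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| 4 | Mult1 | Yes | Multd | M[80] | R(F2) | | |
| | Mult2 | Yes | Multd | | R(F2) | Load2 | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 64 | Fu | Load2 | | Mult2 | | | | | | |

• Load2 completing: who is waiting?
• Note: Dispatching BNEZ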
![Page 55: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/55.jpg)
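Tomasulo Example Two: Cycle 11

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | 1 | 9 | 10 |
| 1 | MULTD F4 | F0 | F2 | 2 | | |
| 1 | SD F4 | 0 | R1 | 3 | | |
| 2 | LD F0 | 0 | R1 | 6 | 10 | 11 |
| 2 | MULTD F4 | F0 | F2 | 7 | | |
| 2 | SD F4 | 0 | R1 | 8 | | |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | No | | |
| Load2 | No | | |
| Load3 | Yes | 64 | |
| Store1 | Yes | 80 | Mult1 |
| Store2 | Yes | 72 | Mult2 |
| Store3 | No | | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| 3 | Mult1 | Yes | Multd | M[80] | R(F2) | | |
| 4 | Mult2 | Yes | Multd | M[72] | R(F2) | | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 11 | 64 | Fu | Load3 | | Mult2 | | | | | | |

• Next load in sequence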
![Page 56: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/56.jpg)
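Tomasulo Example Two: Cycle 12

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | 1 | 9 | 10 |
| 1 | MULTD F4 | F0 | F2 | 2 | | |
| 1 | SD F4 | 0 | R1 | 3 | | |
| 2 | LD F0 | 0 | R1 | 6 | 10 | 11 |
| 2 | MULTD F4 | F0 | F2 | 7 | | |
| 2 | SD F4 | 0 | R1 | 8 | | |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | No | | |
| Load2 | No | | |
| Load3 | Yes | 64 | |
| Store1 | Yes | 80 | Mult1 |
| Store2 | Yes | 72 | Mult2 |
| Store3 | No | | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| 2 | Mult1 | Yes | Multd | M[80] | R(F2) | | |
| 3 | Mult2 | Yes | Multd | M[72] | R(F2) | | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 12 | 64 | Fu | Load3 | | Mult2 | | | | | | |

• Why not issue third multiply?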
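The answer is a structural hazard: the machine in this example has only two multiply reservation stations, and both are busy at cycle 12, so the third MULTD has nowhere to go. A small sketch of that issue check follows; the function name `find_free_station` and the dictionary layout are illustrative assumptions.

```python
def find_free_station(stations, kind):
    """Return the name of the first idle station of the given kind, or None."""
    for name, rs in stations.items():
        if name.startswith(kind) and not rs["busy"]:
            return name
    return None


# Cycle 12 of Example Two: both multiply stations are occupied.
stations = {
    "Add1": {"busy": False},
    "Add2": {"busy": False},
    "Add3": {"busy": False},
    "Mult1": {"busy": True},  # iteration 1 MULTD, 2 cycles remaining
    "Mult2": {"busy": True},  # iteration 2 MULTD, 3 cycles remaining
}
stalled = find_free_station(stations, "Mult")  # None: issue must stall

# Once the first multiply has written its result, Mult1 is released.
stations["Mult1"]["busy"] = False
available = find_free_station(stations, "Mult")

print(stalled, available)  # None Mult1
```

This is consistent with the trace, where the third multiply appears in Mult1 only at cycle 16, after the first multiply wrote its result at cycle 15.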
![Page 57: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/57.jpg)
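Tomasulo Example Two: Cycle 13

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | 1 | 9 | 10 |
| 1 | MULTD F4 | F0 | F2 | 2 | | |
| 1 | SD F4 | 0 | R1 | 3 | | |
| 2 | LD F0 | 0 | R1 | 6 | 10 | 11 |
| 2 | MULTD F4 | F0 | F2 | 7 | | |
| 2 | SD F4 | 0 | R1 | 8 | | |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | No | | |
| Load2 | No | | |
| Load3 | Yes | 64 | |
| Store1 | Yes | 80 | Mult1 |
| Store2 | Yes | 72 | Mult2 |
| Store3 | No | | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| 1 | Mult1 | Yes | Multd | M[80] | R(F2) | | |
| 2 | Mult2 | Yes | Multd | M[72] | R(F2) | | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 13 | 64 | Fu | Load3 | | Mult2 | | | | | | |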
![Page 58: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/58.jpg)
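Tomasulo Example Two: Cycle 14

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | 1 | 9 | 10 |
| 1 | MULTD F4 | F0 | F2 | 2 | 14 | |
| 1 | SD F4 | 0 | R1 | 3 | | |
| 2 | LD F0 | 0 | R1 | 6 | 10 | 11 |
| 2 | MULTD F4 | F0 | F2 | 7 | | |
| 2 | SD F4 | 0 | R1 | 8 | | |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | No | | |
| Load2 | No | | |
| Load3 | Yes | 64 | |
| Store1 | Yes | 80 | Mult1 |
| Store2 | Yes | 72 | Mult2 |
| Store3 | No | | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| 0 | Mult1 | Yes | Multd | M[80] | R(F2) | | |
| 1 | Mult2 | Yes | Multd | M[72] | R(F2) | | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 14 | 64 | Fu | Load3 | | Mult2 | | | | | | |

• Mult1 completing. Who is waiting?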
![Page 59: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/59.jpg)
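Tomasulo Example Two: Cycle 15

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | 1 | 9 | 10 |
| 1 | MULTD F4 | F0 | F2 | 2 | 14 | 15 |
| 1 | SD F4 | 0 | R1 | 3 | | |
| 2 | LD F0 | 0 | R1 | 6 | 10 | 11 |
| 2 | MULTD F4 | F0 | F2 | 7 | 15 | |
| 2 | SD F4 | 0 | R1 | 8 | | |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | No | | |
| Load2 | No | | |
| Load3 | Yes | 64 | |
| Store1 | Yes | 80 | [80]*R2 |
| Store2 | Yes | 72 | Mult2 |
| Store3 | No | | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | No | | | | | |
| 0 | Mult2 | Yes | Multd | M[72] | R(F2) | | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 15 | 64 | Fu | Load3 | | Mult2 | | | | | | |

• Mult2 completing. Who is waiting?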
![Page 60: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/60.jpg)
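Tomasulo Example Two: Cycle 16

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | 1 | 9 | 10 |
| 1 | MULTD F4 | F0 | F2 | 2 | 14 | 15 |
| 1 | SD F4 | 0 | R1 | 3 | | |
| 2 | LD F0 | 0 | R1 | 6 | 10 | 11 |
| 2 | MULTD F4 | F0 | F2 | 7 | 15 | 16 |
| 2 | SD F4 | 0 | R1 | 8 | | |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | No | | |
| Load2 | No | | |
| Load3 | Yes | 64 | |
| Store1 | Yes | 80 | [80]*R2 |
| Store2 | Yes | 72 | [72]*R2 |
| Store3 | No | | |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | Yes | Multd | | R(F2) | Load3 | |
| | Mult2 | No | | | | | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 16 | 64 | Fu | Load3 | | Mult1 | | | | | | |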
![Page 61: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/61.jpg)
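Tomasulo Example Two: Cycle 17

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | 1 | 9 | 10 |
| 1 | MULTD F4 | F0 | F2 | 2 | 14 | 15 |
| 1 | SD F4 | 0 | R1 | 3 | | |
| 2 | LD F0 | 0 | R1 | 6 | 10 | 11 |
| 2 | MULTD F4 | F0 | F2 | 7 | 15 | 16 |
| 2 | SD F4 | 0 | R1 | 8 | | |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | No | | |
| Load2 | No | | |
| Load3 | Yes | 64 | |
| Store1 | Yes | 80 | [80]*R2 |
| Store2 | Yes | 72 | [72]*R2 |
| Store3 | Yes | 64 | Mult1 |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | Yes | Multd | | R(F2) | Load3 | |
| | Mult2 | No | | | | | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 17 | 64 | Fu | Load3 | | Mult1 | | | | | | |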
![Page 62: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/62.jpg)
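Tomasulo Example Two: Cycle 18

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | 1 | 9 | 10 |
| 1 | MULTD F4 | F0 | F2 | 2 | 14 | 15 |
| 1 | SD F4 | 0 | R1 | 3 | 18 | |
| 2 | LD F0 | 0 | R1 | 6 | 10 | 11 |
| 2 | MULTD F4 | F0 | F2 | 7 | 15 | 16 |
| 2 | SD F4 | 0 | R1 | 8 | | |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | No | | |
| Load2 | No | | |
| Load3 | Yes | 64 | |
| Store1 | Yes | 80 | [80]*R2 |
| Store2 | Yes | 72 | [72]*R2 |
| Store3 | Yes | 64 | Mult1 |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | Yes | Multd | | R(F2) | Load3 | |
| | Mult2 | No | | | | | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 18 | 64 | Fu | Load3 | | Mult1 | | | | | | |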
![Page 63: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/63.jpg)
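Tomasulo Example Two: Cycle 19

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | 1 | 9 | 10 |
| 1 | MULTD F4 | F0 | F2 | 2 | 14 | 15 |
| 1 | SD F4 | 0 | R1 | 3 | 18 | 19 |
| 2 | LD F0 | 0 | R1 | 6 | 10 | 11 |
| 2 | MULTD F4 | F0 | F2 | 7 | 15 | 16 |
| 2 | SD F4 | 0 | R1 | 8 | 19 | |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | No | | |
| Load2 | No | | |
| Load3 | Yes | 64 | |
| Store1 | No | | |
| Store2 | Yes | 72 | [72]*R2 |
| Store3 | Yes | 64 | Mult1 |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | Yes | Multd | | R(F2) | Load3 | |
| | Mult2 | No | | | | | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 19 | 64 | Fu | Load3 | | Mult1 | | | | | | |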
![Page 64: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/64.jpg)
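Tomasulo Example Two: Cycle 20

Instruction status:

| ITER | Instruction | j | k | Issue | Exec Comp | Write Result |
|---|---|---|---|---|---|---|
| 1 | LD F0 | 0 | R1 | 1 | 9 | 10 |
| 1 | MULTD F4 | F0 | F2 | 2 | 14 | 15 |
| 1 | SD F4 | 0 | R1 | 3 | 18 | 19 |
| 2 | LD F0 | 0 | R1 | 6 | 10 | 11 |
| 2 | MULTD F4 | F0 | F2 | 7 | 15 | 16 |
| 2 | SD F4 | 0 | R1 | 8 | 19 | 20 |

Load and store buffers:

| Name | Busy | Addr | Fu |
|---|---|---|---|
| Load1 | No | | |
| Load2 | No | | |
| Load3 | Yes | 64 | |
| Store1 | No | | |
| Store2 | No | | |
| Store3 | Yes | 64 | Mult1 |

Reservation stations:

| Time | Name | Busy | Op | Vj (S1) | Vk (S2) | Qj (RS) | Qk |
|---|---|---|---|---|---|---|---|
| | Add1 | No | | | | | |
| | Add2 | No | | | | | |
| | Add3 | No | | | | | |
| | Mult1 | Yes | Multd | | R(F2) | Load3 | |
| | Mult2 | No | | | | | |

Code: LD F0 0 R1; MULTD F4 F0 F2; SD F4 0 R1; SUBI R1 R1 #8; BNEZ R1 Loop

Register result status:

| Clock | R1 | | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 20 | 64 | Fu | Load3 | | Mult1 | | | | | | |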
![Page 65: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/65.jpg)
Tomasulo Example Two

• Why can Tomasulo overlap iterations of loops?
  ▪ Register renaming
    - Multiple iterations use different physical destinations for registers (dynamic loop unrolling).
  ▪ Reservation stations
    - Permit instruction issue to advance past integer control flow operations
  ▪ Other idea: Tomasulo building dynamic "DataFlow" graph from instructions
![Page 66: Superscalar Architectures: Part 2 - Introduction | csap · Superscalar Architectures: Part 2 Dynamic (Out-of-Order) Scheduling Lecture 3.2 August 23rd, 2017 Jae W. Lee (jaewlee@snu.ac.kr)](https://reader034.vdocuments.us/reader034/viewer/2022050612/5fb2db9fb3076b17ab06f725/html5/thumbnails/66.jpg)
Summary: Tomasulo Algorithm

• Reservation stations: renaming to larger set of registers + buffering source operands
  ▪ Prevents registers as bottleneck
  ▪ Avoids WAR, WAW hazards of Scoreboard
  ▪ Allows loop unrolling in HW
• Dynamic hardware schemes can unroll loops dynamically in hardware
  ▪ Form of limited data flow
  ▪ Register renaming is essential
• Lasting contributions of Tomasulo Algorithm
  ▪ Dynamic scheduling
  ▪ Register renaming
  ▪ Load/store disambiguation
• IBM 360/91 descendants: Pentium II, PPC 604, MIPS R10000, Alpha 21264, and counting...
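The WAR point in the summary can be illustrated with the DIVD/ADDD pair from Example One: DIVD F10, F0, F6 copies the ready F6 value into its reservation station at issue, so the later ADDD F6, F8, F2 can overwrite F6 without disturbing it. The sketch below is a simplified illustration; all numeric values are made up, and the dictionary layout is an assumption.

```python
# Made-up register contents; F0 is still being produced by Mult1.
register_file = {"F6": 2.0, "F8": 1.0, "F2": 4.0}

# DIVD F10, F0, F6 issues: F6 is ready and is copied into Vk; F0 is tracked by tag.
divd = {"Vj": None, "Qj": "Mult1", "Vk": register_file["F6"], "Qk": None}

# ADDD F6, F8, F2 is later in program order but finishes first and overwrites F6.
register_file["F6"] = register_file["F8"] + register_file["F2"]

# Mult1 eventually broadcasts its result; DIVD captures it and can execute.
mult1_result = 6.0
if divd["Qj"] == "Mult1":
    divd["Vj"], divd["Qj"] = mult1_result, None

f10 = divd["Vj"] / divd["Vk"]  # uses the buffered old F6, not the register file

print(register_file["F6"], f10)  # 5.0 3.0
```

Because the divide reads its buffered copy, the early write to F6 is harmless, which is the sense in which buffering source operands removes the WAR hazard.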