CA226 — Advanced Computer Architecture
…Today:
• data hazards
…Recall:
• the MIPS pipeline implements instruction level parallelism
• ideally, up to five instructions are executed (in part) on any clock cycle
• if one instruction were to exit the pipeline on each cycle:
  • then the CPI would be 1
    and, ideally, the MIPS pipeline approaches a CPI of 1
MIPS Pipeline
Example
    daddi r1,r1,1
    daddi r2,r2,1
    daddi r3,r3,1
    daddi r4,r4,1
    daddi r5,r5,1
Note
Note to self: see pipeline.s.
Speedup
Ideally:
• each instruction takes 5 cycles to execute
• however, 5 instructions are in the pipeline
• so the number of cycles per instruction approaches 1
Note
Note to self: observe the effect on CPI of repeating the previous block of instructions.
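The effect of repeating the block can be estimated with a little arithmetic. The sketch below assumes an idealised five-stage pipeline with no stalls, in which the first instruction takes 5 cycles and each later one completes one cycle after its predecessor. The function names `pipeline_cycles` and `cpi` are illustrative only, and the cycle count a simulator reports may differ slightly.

```python
def pipeline_cycles(n_instructions, stages=5, stalls=0):
    """Cycles needed to run n instructions through an in-order pipeline.

    Assumption: one instruction enters per cycle, so the total is the
    fill time (stages) plus one cycle per further instruction, plus stalls.
    """
    if n_instructions == 0:
        return 0
    return stages + (n_instructions - 1) + stalls


def cpi(n_instructions, stages=5, stalls=0):
    """Average cycles per instruction, including pipeline fill."""
    return pipeline_cycles(n_instructions, stages, stalls) / n_instructions


# Repeat the five-instruction block 1, 10 and 100 times.
for repeats in (1, 10, 100):
    n = 5 * repeats
    print(n, pipeline_cycles(n), round(cpi(n), 3))
```

With one copy of the block the CPI is 1.8; with 100 copies it is 1.008, which is the sense in which the CPI approaches 1.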
Hazards
The major hurdle to effective pipeline implementation is:
• hazards
Types of Hazard
Structural hazards
    resource conflicts; hardware cannot support all instruction combinations simultaneously
Data hazards
    when one instruction depends upon the result (which is not yet available) of a previous instruction (today)
Control hazards
    when the address of the next instruction cannot be determined immediately
Data Hazards — Example
Consider:
    dadd r1,r2,r3      ; instruction 1
    dsub r4,r1,r5      ; instruction 2
    and r6,r1,r5       ; instruction 3
    or r8,r1,r9        ; instruction 4
    xor r10,r1,r11     ; instruction 5
Instructions 2, 3, 4 and 5:
• each depend upon the result of instruction 1
Ok …
Turn off forwarding, and let’s try running that …
Note to self:
• see hazards1.s.
Illustration
Table 1. Two Read-After-Write (RAW) pipeline stalls:
                      1      2      3      4      5      6      7
    dadd r1,r2,r3     IF     ID     Ex     Mem    WB*
    dsub r4,r1,r5            IF     ID     RAW    RAW    *Ex
    and r6,r1,r5                    IF     stall  stall  ID
    or r8,r1,r9                                          IF
Note
This assumes that we can both write and read the register file in a single clock cycle. Typically, the write happens in the first half of the cycle, and the read in the second half.
Observations
This is known as a read after write (or RAW) stall:
• instruction 2 is blocked at ID because one of its arguments (registers) is not yet available
• in this case, all subsequent instructions are blocked too
  which is known as a pipeline stall
Next, …
Consider:
• the effect of replacing instruction 2 with a nop instruction
  (or any other, non-dependent instruction)
Illustration
Table 2. Still one RAW stall:
                      1      2      3      4      5      6      7
    dadd r1,r2,r3     IF     ID     Ex     Mem    WB*
    nop                      IF     ID     Ex     Mem    WB
    and r6,r1,r5                    IF     ID     RAW    *Ex    Mem
    or r8,r1,r9                            IF     stall  ID     Ex
Next, …
Finally, consider:
• the effect of replacing instruction 3 with a nop instruction
  (or any other, non-dependent instruction)
Illustration
Table 3. No stalls:
                      1      2      3      4      5      6      7
    dadd r1,r2,r3     IF     ID     Ex     Mem    WB*
    nop                      IF     ID     Ex     Mem    WB
    nop                             IF     ID     Ex     Mem
    or r8,r1,r9                            IF     ID     *Ex    Mem
…We could:
• find (two) other (independent) instructions to insert between such write-read dependencies
• but such dependencies are common
  and we rarely have enough instructions to fill the gaps
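The pattern in Tables 1 to 3 can be summarised as a small rule. The sketch below assumes forwarding is off, that the producer writes the register file in WB during the first half of a cycle, and that the consumer can read it in ID during the second half of the same cycle. The function name `raw_stalls_without_forwarding` is illustrative only.

```python
# Stage positions in the five-stage pipeline: IF=1, ID=2, Ex=3, Mem=4, WB=5.
ID, WB = 2, 5


def raw_stalls_without_forwarding(distance):
    """Stall cycles for a consumer `distance` instructions after its producer.

    distance=1 means the consumer immediately follows the producer.
    """
    if distance < 1:
        raise ValueError("distance must be at least 1")
    write_cycle = WB             # producer is fetched in cycle 1
    read_cycle = ID + distance   # consumer is fetched in cycle 1 + distance
    return max(0, write_cycle - read_cycle)


for d in (1, 2, 3):
    print(d, raw_stalls_without_forwarding(d))
```

This gives two stalls when the consumer is adjacent, one with a single instruction in between, and none with two in between, which is why two independent instructions are needed to fill the gap.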
…However, such hazards are not insurmountable:
• the ALU produces the necessary value in cycle 3
  (although it is not written back to the register file until cycle 5)
• that value is not needed by instruction 2 until cycle 4
…
Table 4. The value is available after cycle 3:
                      1      2      3      4      5      6      7
    dadd r1,r2,r3     IF     ID     Ex**   Mem    WB*
    dsub r4,r1,r5            IF     ID     RAW    RAW    *Ex
    and r6,r1,r5                    IF     stall  stall  ID
    or r8,r1,r9                                          IF
Forwarding
Solution:
• data paths are added:
  • EX/Mem.ALUOutput → ID/EX.A (output)
    EX/Mem.ALUOutput → ID/EX.B (output)
    Mem/WB.ALUOutput → ID/EX.A (output)
    Mem/WB.ALUOutput → ID/EX.B (output)
• when a read-after-write is detected, the ALU input (either ID/EX.A or ID/EX.B) is switched to one of the two available ALUOutput pipeline registers (Ex/Mem or Mem/WB)
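The selection can be sketched as a small function. This is an illustration, not the simulator's logic: registers are plain integers, `None` means "no destination register", and the function name `select_alu_input` is made up for this example. It also assumes that when both pipeline registers hold a value for the same register, the Ex/Mem copy wins because it is the more recent write, and it covers ALU results only (a load would supply its value from LMD instead).

```python
from typing import Optional


def select_alu_input(src_reg: int,
                     ex_mem_dest: Optional[int],
                     mem_wb_dest: Optional[int]) -> str:
    """Name the value that should feed one ALU input.

    src_reg:     register number the instruction in Ex wants to read
    ex_mem_dest: destination of the instruction one ahead (now in Mem)
    mem_wb_dest: destination of the instruction two ahead (now in WB)
    """
    if src_reg == 0:
        return "ID/EX"               # r0 always reads as zero in MIPS
    if ex_mem_dest == src_reg:
        return "EX/Mem.ALUOutput"    # newest value takes priority (assumption)
    if mem_wb_dest == src_reg:
        return "Mem/WB.ALUOutput"
    return "ID/EX"                   # no hazard: use the register file value


# dsub r4,r1,r5 directly after dadd r1,r2,r3
print(select_alu_input(1, ex_mem_dest=1, mem_wb_dest=None))
# and r6,r1,r5 with one nop between it and the dadd
print(select_alu_input(1, ex_mem_dest=None, mem_wb_dest=1))
# the r5 operand has no pending write
print(select_alu_input(5, ex_mem_dest=1, mem_wb_dest=None))
```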
MIPS Pipeline
Forwarding
                      1      2      3      4      5      6      7
    dadd r1,r2,r3     IF     ID     Ex**   Mem    WB
    dsub r4,r1,r5            IF     ID     **Ex   Mem    WB
    and r6,r1,r5                    IF     ID     Ex     Mem    WB
    or r8,r1,r9                            IF     ID     Ex     Mem
One of:
• EX/Mem.ALUOutput → ID/EX.A
  EX/Mem.ALUOutput → ID/EX.B
Forwarding
                      1      2      3      4      5      6      7
    dadd r1,r2,r3     IF     ID     Ex     Mem**  WB
    nop                      IF     ID     Ex     Mem    WB
    and r6,r1,r5                    IF     ID     **Ex   Mem    WB
    or r8,r1,r9                            IF     ID     Ex     Mem
One of:
• Mem/WB.ALUOutput → ID/EX.A
  Mem/WB.ALUOutput → ID/EX.B
The WinMIPS64 Simulator
The WinMIPS64 simulator:
• supports forwarding
  it can be either enabled or disabled
• see: Configure/Enable Forwarding
…Try turning on forwarding:
• and running the example again …
  (hazards1.s)
Now, consider the following …
    daddi r1,r2,123    ; instruction 1
    ld r4,0(r1)        ; instruction 2
    sd r4,8(r1)        ; instruction 3
Here:
• there is a RAW dependency between the daddi instruction and the address calculation in both of the following instructions
• the address calculation is handled by the ALU,
  so these are handled by forwarding, as before
Illustration
Table 5. No stalls due to address calculation:
                      1      2      3      4      5      6      7
    daddi r1,r2,123   IF     ID     Ex**   Mem++  WB
    ld r4,0(r1)              IF     ID     **Ex   Mem    WB
    sd r4,8(r1)                     IF     ID     ++Ex   Mem    WB
• EX/Mem.ALUOutput → ID/EX.A for cycle 4
  Mem/WB.ALUOutput → ID/EX.A for cycle 5
And, again …
    daddi r1,r2,123    ; instruction 1
    ld r4,0(r1)        ; instruction 2
    sd r4,8(r1)        ; instruction 3
Also:
• the sd instruction depends upon the result of the ld
…
Table 6. This can be solved by forwarding too:
                      1      2      3      4      5      6      7
    daddi r1,r2,123   IF     ID     Ex     Mem    WB
    ld r4,0(r1)              IF     ID     Ex     Mem**  WB
    sd r4,8(r1)                     IF     ID     Ex     **Mem  WB
Here:
• Mem/WB.LMD → EX/MEM.B for cycle 6
In full …
                      1      2      3      4      5      6      7
    daddi r1,r2,123   IF     ID     Ex++   Mem==  WB
    ld r4,0(r1)              IF     ID     ++Ex   Mem**  WB
    sd r4,8(r1)                     IF     ID     ==Ex   **Mem  WB
• EX/Mem.ALUOutput → ID/EX.A for cycle 4
  Mem/WB.ALUOutput → ID/EX.A for cycle 5
  Mem/WB.LMD → EX/MEM.B for cycle 6
…In all:
• four pipeline stalls are eliminated
  (note to self: see stalls1.s)
MIPS Pipeline
Unfortunately …
Forwarding cannot solve all RAW problems:
    ld r1,n(r0)
    dadd r2,r1,r0
…
Table 7. You can’t forward backwards in time:
                      1      2      3      4      5      6      7
    ld r1,n(r0)       IF     ID     Ex     Mem**  WB
    dadd r2,r1,r0            IF     ID     **Ex   Mem    WB
Clearly:
• this is not possible
An Insurmountable Stall
Table 8. An inevitable stall of one cycle:
                      1      2      3      4      5      6      7
    ld r1,n(r0)       IF     ID     Ex     Mem**  WB
    dadd r2,r1,r0            IF     ID     RAW    **Ex   Mem
More generally, …
Unlike arithmetic instructions:
• loads yield values only after the Mem stage of the pipeline
  so stalls at Ex cannot be avoided
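The difference between arithmetic producers and loads can be put into one timing rule. The sketch below assumes full forwarding, that a value can be forwarded in the cycle after the stage that produces it, and the stage numbering IF=1 to WB=5. The function name `stalls_with_forwarding` and the constants `EX` and `MEM` are illustrative only.

```python
# Stage positions in the five-stage pipeline: IF=1, ID=2, Ex=3, Mem=4, WB=5.
EX, MEM = 3, 4


def stalls_with_forwarding(produced_in, needed_in, distance=1):
    """Stall cycles for a consumer `distance` instructions after its producer.

    produced_in: stage at whose end the producer's value exists
                 (EX for arithmetic instructions, MEM for loads)
    needed_in:   stage at whose start the consumer needs the value
                 (EX for ALU operands and addresses, MEM for store data)
    """
    if distance < 1:
        raise ValueError("distance must be at least 1")
    ready_cycle = produced_in          # producer is fetched in cycle 1
    use_cycle = needed_in + distance   # consumer is fetched in cycle 1 + distance
    return max(0, ready_cycle + 1 - use_cycle)


print(stalls_with_forwarding(EX, EX))               # dadd then dsub
print(stalls_with_forwarding(MEM, EX))              # ld then dependent dadd
print(stalls_with_forwarding(MEM, EX, distance=2))  # one instruction in between
print(stalls_with_forwarding(MEM, MEM))             # ld then sd of the same register
```

Only the load followed immediately by an instruction that needs the value at Ex costs a cycle; the other three cases are covered by forwarding.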
Suggestion
When possible, replace:
    dadd r3,r2,r1      ; some other, unrelated instruction
    ld r4,N(r0)
    dadd r6,r5,r4      ; stall - can't forward backwards!
Suggestion
With:
    ld r4,N(r0)
    dadd r3,r2,r1      ; some other, unrelated instruction
    dadd r6,r5,r4      ; doesn't stall - can forward from ld
Now:
• when the final dadd reaches Ex:
  Mem/WB.LMD is available for forwarding
…
Note
A good compiler (or you!) should be able to spot such stalls and reorder the operations.
We spot such stalls by observing that an ALU instruction immediately follows a load upon which it depends.
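That check is mechanical enough to sketch in code. The detector below is a simplified illustration: it only understands `ld`, `sd` and register-style ALU instructions written as in these slides, it assumes full forwarding, and it treats a store's data register as needed at Mem rather than at Ex. The names `parse` and `load_use_stalls` are made up for this example.

```python
import re


def parse(instr):
    """Return (op, dest, sources needed at Ex) for one instruction string."""
    op, operands = instr.split(None, 1)
    operands = operands.replace(" ", "")
    mem = re.fullmatch(r"(\w+),(\w+)\((\w+)\)", operands)
    if mem:                                  # ld rt,offset(base) / sd rt,offset(base)
        reg, _offset, base = mem.groups()
        dest = reg if op == "ld" else None   # a store writes no register
        return op, dest, {base}              # only the base is needed at Ex
    dest, *sources = operands.split(",")
    # Keep register operands only; immediates such as 123 are dropped.
    return op, dest, {s for s in sources if re.fullmatch(r"r\d+", s)}


def load_use_stalls(program):
    """Indices of instructions that stall behind the load just before them."""
    stalled = []
    for i in range(1, len(program)):
        prev_op, prev_dest, _ = parse(program[i - 1])
        _, _, sources = parse(program[i])
        if prev_op == "ld" and prev_dest != "r0" and prev_dest in sources:
            stalled.append(i)
    return stalled


before = ["dadd r3,r2,r1", "ld r4,N(r0)", "dadd r6,r5,r4"]
after = ["ld r4,N(r0)", "dadd r3,r2,r1", "dadd r6,r5,r4"]
print(load_use_stalls(before))   # the final dadd stalls
print(load_use_stalls(after))    # no stall after reordering
```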
Example
Compile:
    int a = b + c;
    int d = e + f;
Note to self:
• see psched1.s and psched2.s.
Example
First, spot the problem:
    ld r1,b(r0)        ; a = b + c
    ld r2,c(r0)
    dadd r5,r1,r2
    sd r5,a(r0)

    ld r1,e(r0)        ; d = e + f
    ld r2,f(r0)
    dadd r5,r1,r2
    sd r5,d(r0)
Example
Then, rewrite instructions such that there are no stalls:
    ld r1,b(r0)        ; a = b + c
    ld r2,c(r0)
    dadd r5,r1,r2      ; stall, r2 not ready
    sd r5,a(r0)

    ld r1,e(r0)        ; d = e + f
    ld r2,f(r0)
    dadd r5,r1,r2      ; stall, r2 not ready
    sd r5,d(r0)
Example
Well, it’s helpful to use different registers:
    ld r1,b(r0)        ; a = b + c
    ld r2,c(r0)
    dadd r5,r1,r2      ; stall, r2 not ready
    sd r5,a(r0)

    ld r3,e(r0)        ; d = e + f
    ld r4,f(r0)
    dadd r5,r3,r4      ; stall, r4 not ready
    sd r5,d(r0)
Example
No stalls:
    ld r1,b(r0)
    ld r2,c(r0)
    ld r3,e(r0)        ; prevent stall (pulled up)
    dadd r5,r1,r2      ; no stall

    ld r4,f(r0)
    sd r5,a(r0)        ; prevent stall (pushed down)
    dadd r5,r3,r4      ; no stall
    sd r5,d(r0)
…This is known as:
• pipeline scheduling
In this case:
• use two extra registers
• avoid two stalls
• 13 cycles, instead of 15
Aside
The "13 versus 15 cycles" statement is misleading:
• it includes cycles for the pipeline to fill and empty
Actually:
• disregarding the filling of the pipeline:
  • it’s 8 cycles, instead of 10
    so a speedup of 1.25
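The two ways of counting can be compared directly. The sketch below takes the figures quoted above (8 instructions, 2 stalls, and the 13 and 15 cycle totals) as given. The helper `steady_state_cycles` is an illustrative name, and it simply assumes one cycle per instruction plus one per stall once the pipeline is full.

```python
def steady_state_cycles(n_instructions, stalls):
    """Cycles attributable to the code itself, ignoring pipeline fill."""
    return n_instructions + stalls


unscheduled = steady_state_cycles(8, stalls=2)
scheduled = steady_state_cycles(8, stalls=0)

# Speedup from scheduling, ignoring fill: 10 / 8
print(unscheduled, scheduled, unscheduled / scheduled)

# The same comparison using the totals that include fill and drain: 15 / 13
print(round(15 / 13, 3))
```

The fill-inclusive totals give a ratio of about 1.154, which understates the 1.25 that the scheduling actually achieves on the instructions themselves.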
Summary 1
Forwarding is simple:
• if the necessary data is available somewhere in the pipeline by the time it is needed:
  then it can be forwarded to where it’s needed
The implementation in hardware of these strategies is an engineering decision:
• it is correct, in all cases, to stall the pipeline when such hazards are detected
• forwarding, however, improves performance at the cost of some additional complexity
Summary 2
Some types of (RAW) stall are unavoidable:
• however, it is often possible to reorder instructions such that they do not occur
Done