1. 2 table 4.1 key characteristics of six passenger aircraft: all figures are approximate; some...
Table 4.1 Key characteristics of six passenger aircraft: all figures are approximate; some relate to a specific model/configuration of the aircraft or are averages of cited range of values.
4.1 Performance and Cost/performance
Figure 4.1 Performance improvement as a function of cost.
4.2 Defining Computer Performance
Performance = 1 / Execution time
Throughput: amount of work performed per unit time. It can be measured as the number of processes per unit time.
Turnaround time: the average time from the moment a job is submitted until the moment it is completed. It measures how long the average user has to wait for output.
Response time: in an interactive system, the time from when a user presses Enter or clicks the mouse until the system delivers a final response.
To filter out variable factors (e.g., scheduling, interrupts, I/O delays), use CPU execution time:
Performance = 1 / CPU Execution time
Figure 4.2 Pipeline analogy shows that imbalance between processing power and I/O capabilities leads to a performance bottleneck.
CPU execution time
= Instructions × (Cycles per instruction) × (Seconds per cycle)
= Instructions × CPI / (Clock rate)
(CPI: cycles per instruction)
Performance comparison
(Performance of M1) / (Performance of M2) =
(Execution time of M2) / (Execution time of M1)
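As a rough illustration of the two relations above, the following Python sketch computes CPU execution time from an instruction count, a CPI, and a clock rate, and then compares two machines. The function names and the numbers chosen for M1 and M2 are made up for this example; they do not come from the slides.

```python
def cpu_time(instructions, cpi, clock_rate_hz):
    """CPU execution time in seconds = instructions * CPI / clock rate."""
    return instructions * cpi / clock_rate_hz


def relative_performance(time_m1, time_m2):
    """Performance of M1 relative to M2 = (time of M2) / (time of M1)."""
    return time_m2 / time_m1


# Hypothetical machines running the same 1-billion-instruction program.
t_m1 = cpu_time(1e9, cpi=2.0, clock_rate_hz=1e9)    # 2.0 s
t_m2 = cpu_time(1e9, cpi=1.5, clock_rate_hz=0.5e9)  # 3.0 s
ratio = relative_performance(t_m1, t_m2)            # M1 is 1.5 times as fast
```

Note that M2 has the lower CPI here but still loses, because its clock rate is half that of M1; none of the three factors can be judged in isolation.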
4.3 Performance Enhancement and Amdahl’s Law
Amdahl’s law
s = 1 / (f + (1-f) / p) ≤ min(p, 1/f)
f: fraction of the execution time that is unaffected (e.g., instructions that cannot be parallelized).
p: speedup factor for the remaining 1-f part (from a parallel computer or a redesigned CPU or algorithm).
s: overall speedup
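The formula can be tried out with a short Python sketch. The function name `amdahl_speedup` and the sample values (10% of the time unaffected, the rest sped up 5 times) are assumptions chosen for illustration, not figures from the slides.

```python
def amdahl_speedup(f, p):
    """Overall speedup when a fraction f of the time is unaffected
    and the remaining 1 - f part runs p times as fast."""
    return 1.0 / (f + (1.0 - f) / p)


# Illustrative values only: f = 0.1, p = 5.
s = amdahl_speedup(f=0.1, p=5)       # 1 / 0.28, about 3.57
bound = min(5, 1 / 0.1)              # the upper bound min(p, 1/f)

# Even with an enormous p, the speedup stays below 1/f = 10.
s_limit = amdahl_speedup(f=0.1, p=1e12)
```

The second call shows why the unaffected fraction dominates: no matter how fast the enhanced part becomes, the overall speedup cannot exceed 1/f.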
Study Example 4.1
Figure 4.4 Amdahl’s law: speedup achieved if a fraction f of a task is unaffected and the remaining 1 – f part runs p times as fast.
Figure 4.5 Running times of six programs on three machines.
4.4 Performance Measurement vs Modeling
Table 4.2 Summary of SPEC CPU2000 benchmark suite characteristics.
Benchmarks: real or synthetic programs that are selected for comparative evaluation of machine performance.
SPEC: Standard Performance Evaluation Corporation
Figure 4.6 Example graphical depiction of SPEC benchmark results.
Study Example 4.3
Table 4.3 Usage frequency, in percentage, for various instruction classes in four representative applications.
Performance Estimation
System’s peak performance: expressed in instructions per second. (MIPS, MFLOPS)
CPU execution time = Instructions × (Average CPI) / (Clock rate)
Average CPI = Σ over all instruction classes of (Class-i fraction) × (Class-i CPI)
Example 4.4 CPI and IPS calculation
Solution
a. For M1, assume all instructions are class I instructions (average CPI = 2.0):
Peak performance of M1 = 1 / (Avg. CPI × Clock cycle time) = Clock rate / Avg. CPI
= 600 / 2.0 = 300 MIPS
Note: CPI is in cycles per instruction and the clock cycle time is in seconds; with the clock rate given in MHz, the result is in MIPS.
For M2, assume all instructions are class N instructions (average CPI = 2.0):
Peak performance of M2 = 1 / (Avg. CPI × Clock cycle time) = 500 / 2.0 = 250 MIPS
b. Average CPI for M1 = 5.0×0.25 + 2.0×0.25 + 2.4×0.5 = 2.95
Average CPI for M2 = 4.0×0.25 + 3.8×0.25 + 2.0×0.5 = 2.95
c. 1. Average CPI = 2.5×0.25 + 2.0×0.25 + 2.4×0.5 = 2.325
MIPS for option 1 = 600/2.325 = 258
2. Average CPI = 5.0×0.25 + 1.2×0.25 + 2.4×0.5 = 2.75
MIPS for option 2 = 600/2.75 = 218
3. MIPS for option 3 = 750/2.95 = 254
Conclusion: Option 1 has the greatest impact.
d. With a larger cache, the cache miss rate is reduced by 2 percentage points (from 5% to 3%). Because a cache miss imposes a 10-cycle penalty, all CPIs are reduced by 10×2% = 0.2 cycles.
Average CPI for M1 = (5.0-0.2)×0.25 + (2.0-0.2)×0.25 + (2.4-0.2)×0.5 = 2.75
This option is comparable to option 2 in c.
e. Average CPI for M1 = 5.0×x + 2.0×y + 2.4×(1-x-y) = 2.6x - 0.4y + 2.4
Average CPI for M2 = 4.0×x + 3.8×y + 2.0×(1-x-y) = 2x + 1.8y + 2
We need 600/(2.6x-0.4y+2.4) > 500/(2x+1.8y+2) => 2.56y > 0.2x
That is, M1 runs faster than M2 for the given task when x/y < 12.8.
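The arithmetic in parts b, c, and e can be checked with a small Python sketch. The helper names (`average_cpi`, `mips`, `m1_faster`) are hypothetical. The per-class CPIs, clock rates, and instruction mix are read off the solution lines above, with the third-class CPI of M2 taken as 2.0, the value used in part e.

```python
def average_cpi(cpis, fractions):
    """Weighted-average CPI: sum of (class fraction) * (class CPI)."""
    return sum(c * f for c, f in zip(cpis, fractions))


def mips(clock_mhz, cpi):
    """MIPS rating = clock rate in MHz / CPI."""
    return clock_mhz / cpi


mix = [0.25, 0.25, 0.5]      # instruction mix used in part b
m1_cpis = [5.0, 2.0, 2.4]    # per-class CPIs of M1 (600 MHz clock)
m2_cpis = [4.0, 3.8, 2.0]    # per-class CPIs of M2 (500 MHz clock)

cpi_m1 = average_cpi(m1_cpis, mix)   # about 2.95
cpi_m2 = average_cpi(m2_cpis, mix)   # about 2.95

# Part c: MIPS of M1 under each of the three design options.
options = {
    "option 1": mips(600, average_cpi([2.5, 2.0, 2.4], mix)),  # about 258
    "option 2": mips(600, average_cpi([5.0, 1.2, 2.4], mix)),  # about 218
    "option 3": mips(750, cpi_m1),                             # about 254
}
best = max(options, key=options.get)


def m1_faster(x, y):
    """Part e: True if M1 outperforms M2 for class fractions x, y, 1 - x - y."""
    fractions = [x, y, 1 - x - y]
    return (mips(600, average_cpi(m1_cpis, fractions))
            > mips(500, average_cpi(m2_cpis, fractions)))
```

For instance, `m1_faster(0.5, 0.1)` (x/y = 5) holds, while `m1_faster(0.9, 0.05)` (x/y = 18) does not, in line with the x/y < 12.8 condition.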
Example 4.5 MIPS rating can be misleading
a. Runtime for the output of compiler 1 = (600M×1 + 400M×2)/10^9 = 1.4 s
Runtime for the output of compiler 2 = (400M×1 + 400M×2)/10^9 = 1.2 s
The code produced by compiler 2 is faster.
b. Code produced by compiler 2 is 1.4/1.2 = 1.17 times as fast as that of compiler 1.
c. Average CPI for compiler 1 = (600M×1 + 400M×2)/1000M = 1.4
Average CPI for compiler 2 = (400M×1 + 400M×2)/800M = 1.5
MIPS rating of compiler 1 = 1000/1.4 = 714
MIPS rating of compiler 2 = 1000/1.5 = 667
By MIPS rating, compiler 1's output appears faster, even though part a shows that compiler 2's output has the shorter runtime.
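A minimal Python sketch of this example makes the conflict between runtime and MIPS rating concrete. The instruction counts and CPIs are those of part c; the 1 GHz clock is an assumption inferred from the divisor in part a and the 1000 in the MIPS formulas, and the function name is hypothetical.

```python
CLOCK_HZ = 1e9  # assumed 1 GHz clock, inferred from the example's arithmetic


def runtime_and_mips(counts_and_cpis):
    """Return (runtime in seconds, MIPS rating) for (instruction count, CPI) pairs."""
    instructions = sum(n for n, _ in counts_and_cpis)
    cycles = sum(n * cpi for n, cpi in counts_and_cpis)
    runtime = cycles / CLOCK_HZ
    return runtime, instructions / runtime / 1e6


t1, mips1 = runtime_and_mips([(600e6, 1), (400e6, 2)])  # compiler 1 output
t2, mips2 = runtime_and_mips([(400e6, 1), (400e6, 2)])  # compiler 2 output
```

Compiler 2's output finishes sooner (`t2 < t1`) yet earns the lower MIPS rating (`mips2 < mips1`), because it executes fewer of the cheap one-cycle instructions.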
Table 4.4 Measured or estimated execution times for three programs.
4.5 Reporting Computer Performance
Wrong method (arithmetic mean of speedup ratios)
Speedup of Y over X = (0.1+10.0+10.0)/3 = 6.7 (1)
Speedup of X over Y = (10.0+0.1+0.1)/3 = 3.4 (contradicts (1))
Total time comparison: valid if the programs are run the same number of times.
Geometric mean:
Speedup of Y over X = (0.1×10.0×10.0)^(1/3) = 2.15 (2)
Speedup of X over Y = (10.0×0.1×0.1)^(1/3) = 0.46 (consistent with (2))
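The difference between the two averaging methods can be reproduced with a few lines of Python. This is only a sketch: the per-program speedup ratios are the ones used in the lines above, and the helper names are made up for the example.

```python
import math


def arithmetic_mean(xs):
    return sum(xs) / len(xs)


def geometric_mean(xs):
    return math.prod(xs) ** (1.0 / len(xs))


y_over_x = [0.1, 10.0, 10.0]             # per-program speedup of Y over X
x_over_y = [1 / r for r in y_over_x]     # the same data, viewed the other way

am_yx = arithmetic_mean(y_over_x)   # about 6.7
am_xy = arithmetic_mean(x_over_y)   # 10.2 / 3, about 3.4: both look "faster"
gm_yx = geometric_mean(y_over_x)    # about 2.15
gm_xy = geometric_mean(x_over_y)    # about 0.46, the reciprocal of gm_yx
```

With the arithmetic mean, each machine appears faster than the other. The geometric means are reciprocals of each other, so the comparison gives the same verdict in either direction.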
Example 4.6
Table 4.3 Usage frequency, in percentage, for various instruction classes in four representative applications.
Answer:
a. CPI for data compression application on M1=0.25×4.0+0.32×1.5+0.16×1.2+0×6.0+0.19×2.5+0.08×2.0=2.31
CPI for data compression application on M2=2.54
CPI for nuclear reactor simulation application on M1=3.94
CPI for nuclear reactor simulation application on M2=2.89
b. Because the programs and clock rates are the same, the speedup ratio is given by the ratio of CPIs.
Data compression performance speedup (M2/M1) = 2.31/2.54 = 0.91
Nuclear reactor simulation performance speedup (M2/M1) = 3.94/2.89 = 1.36
c. Overall performance advantage of M2 over M1 is (0.91×1.36)^(1/2) = 1.11
Figure 4.7 Exponential growth of supercomputer performance [Bell92].
4.6 The Quest for High Performance
Figure 4.8 Milestones in the Accelerated Strategic Computing Initiative (ASCI) program, sponsored by the U.S. Department of Energy, with extrapolation up to the PFLOPS level.
Problem 4.5
Problem 4.12