1. 2 table 4.1 key characteristics of six passenger aircraft: all figures are approximate; some...

Upload: lauren-parker

Post on 04-Jan-2016


Page 1: 1. 2 Table 4.1 Key characteristics of six passenger aircraft: all figures are approximate; some relate to a specific model/configuration of the aircraft

Table 4.1 Key characteristics of six passenger aircraft: all figures are approximate; some relate to a specific model/configuration of the aircraft or are averages of a cited range of values.

4.1 Performance and Cost/performance

Figure 4.1 Performance improvement as a function of cost.

4.2 Defining Computer Performance

Performance = 1 / Execution time

Throughput: the amount of work performed per unit time. It can be measured as the number of processes completed per unit time.

Turnaround time: the average time from the moment a job is submitted until the moment it is completed. It measures how long the average user has to wait for output.

Response time: in an interactive system, the time from when a user presses Enter or clicks the mouse until the system delivers a final response.

To filter out variable factors (e.g., scheduling, interrupts, I/O delay), use CPU execution time instead:

Performance = 1 / CPU Execution time

Figure 4.2 Pipeline analogy shows that imbalance between processing power and I/O capabilities leads to a performance bottleneck.

CPU execution time

= Instructions × (Cycles per instruction) × (Seconds per cycle)

= Instructions × CPI / (Clock rate)

(CPI: cycles per instruction)

Performance comparison

(Performance of M1) / (Performance of M2) =

(Execution time of M2) / (Execution time of M1)
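
The two formulas above can be combined in a few lines of Python. This is a minimal sketch: the function names and the workload numbers (a 10^9-instruction program, the CPIs, and the clock rates) are hypothetical and chosen only to illustrate the arithmetic.

```python
def cpu_time(instructions, cpi, clock_rate_hz):
    """CPU execution time in seconds = instructions x CPI / clock rate."""
    return instructions * cpi / clock_rate_hz


def relative_performance(time_m1, time_m2):
    """Performance of M1 over M2 = execution time of M2 / execution time of M1."""
    return time_m2 / time_m1


# Hypothetical workload: the same 1e9-instruction program on two machines.
t_m1 = cpu_time(1e9, cpi=2.0, clock_rate_hz=1e9)    # CPI 2.0 at 1 GHz
t_m2 = cpu_time(1e9, cpi=1.5, clock_rate_hz=0.5e9)  # CPI 1.5 at 500 MHz

ratio = relative_performance(t_m1, t_m2)
print(f"M1: {t_m1:.2f} s, M2: {t_m2:.2f} s, M1 is {ratio:.2f}x as fast as M2")
```

Note that M2 has the lower CPI but still loses, because its clock rate is half that of M1; neither factor alone determines execution time.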

4.3 Performance Enhancement and Amdahl’s Law

Amdahl’s law

s = 1 / (f+(1-f) / p) ≤ min (p, 1/f)

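f: fraction of the execution time that is unaffected by the enhancement (e.g., instructions that cannot be parallelized).

p: speedup factor of the remaining 1 - f part (from a parallel computer, a redesigned CPU, or a better algorithm)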

s: overall speedup
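
As a quick numerical check of the formula and its upper bound, here is a minimal Python sketch. The function names are illustrative, the sample values f = 0.1 and p = 10 are arbitrary, and the bound is only checked under the assumption p ≥ 1.

```python
def amdahl_speedup(f, p):
    """Overall speedup s = 1 / (f + (1 - f) / p).

    f: fraction of the task that is unaffected (0 <= f <= 1)
    p: speedup factor of the remaining 1 - f part (assumed p >= 1 here)
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if p < 1.0:
        raise ValueError("this sketch assumes p >= 1")
    return 1.0 / (f + (1.0 - f) / p)


def amdahl_bound(f, p):
    """Upper bound min(p, 1/f); when f == 0 the bound is simply p."""
    return p if f == 0.0 else min(p, 1.0 / f)


f, p = 0.1, 10.0
s = amdahl_speedup(f, p)
print(f"f={f}, p={p}: speedup {s:.2f}, bound {amdahl_bound(f, p):.2f}")
```

Even with a tenfold enhancement, a 10% unaffected fraction keeps the overall speedup well below 10, and no value of p can push it past 1/f.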

Study Example 4.1

Figure 4.4 Amdahl’s law: speedup achieved if a fraction f of a task is unaffected and the remaining 1 – f part runs p times as fast.

Figure 4.5 Running times of six programs on three machines.

4.4 Performance Measurement vs Modeling

Table 4.2 Summary of SPEC CPU2000 benchmark suite characteristics.

Benchmarks: real or synthetic programs that are selected for comparative evaluation of machine performance.

SPEC: Standard Performance Evaluation Corporation

Figure 4.6 Example graphical depiction of SPEC benchmark results.

Study Example 4.3

Table 4.3 Usage frequency, in percentage, for various instruction classes in four representative applications.

Performance Estimation

System’s peak performance: expressed in instructions per second (MIPS) or floating-point operations per second (MFLOPS).

CPU execution time = Instructions × (Average CPI) / (Clock rate)

Average CPI = Σ over all instruction classes i of (Class-i fraction) × (Class-i CPI)
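
The weighted sum can be sketched in Python as follows. The function names are illustrative; the instruction mix (CPIs of 5.0, 2.0, and 2.4 with fractions 0.25, 0.25, and 0.5) and the 600 MHz clock rate are the M1 figures used in Example 4.4.

```python
def average_cpi(mix):
    """Average CPI = sum of (class fraction x class CPI) over all classes.

    mix: list of (fraction, cpi) pairs whose fractions sum to 1.
    """
    total_fraction = sum(fraction for fraction, _ in mix)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("class fractions must sum to 1")
    return sum(fraction * cpi for fraction, cpi in mix)


def mips(clock_rate_mhz, avg_cpi):
    """MIPS rating = clock rate in MHz / average CPI."""
    return clock_rate_mhz / avg_cpi


# M1 instruction mix from Example 4.4: (fraction, CPI) per class.
m1_mix = [(0.25, 5.0), (0.25, 2.0), (0.5, 2.4)]
m1_cpi = average_cpi(m1_mix)
print(f"Average CPI = {m1_cpi:.2f}, MIPS = {mips(600, m1_cpi):.0f}")
```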

Example 4.4 CPI and IPS calculation

Solution

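a. Peak performance is reached when every instruction belongs to the fastest class. For M1, assume all instructions are class I instructions:

Peak performance of M1 = 1 / (Avg. CPI × Clock time)

= 600 / 2.0 = 300 MIPS

Note: Average CPI is in cycles per instruction and clock time is in seconds per cycle, so their product is in seconds per instruction. With the clock rate given in MHz, the result is in MIPS.

For M2, assume all instructions are class N instructions:

Peak performance of M2 = 1 / (Avg. CPI × Clock time) = 500 / 2.0 = 250 MIPS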

b. Average CPI for M1 = 5.0×0.25 + 2.0×0.25 + 2.4×0.5 = 2.95

Average CPI for M2 = 4.0×0.25 + 3.8×0.25 + 2.0×0.5 = 2.95

c. 1. Average CPI = 2.5×0.25 + 2.0×0.25 + 2.4×0.5 = 2.325

MIPS for option 1 = 600/2.325 = 258

2. Average CPI = 5.0×0.25 + 1.2×0.25 + 2.4×0.5 = 2.75

MIPS for option 2 = 600/2.75 = 218

3. MIPS for option 3 = 750/2.95 = 254

Conclusion: Option 1 has the greatest impact.

d. With the larger cache, the cache miss rate is reduced by 2 percentage points (from 5% to 3%). Because a cache miss imposes a 10-cycle penalty, every CPI is reduced by 10×2% = 0.2 cycles.

Average CPI for M1 = (5.0-0.2)×0.25 + (2.0-0.2)×0.25 + (2.4-0.2)×0.5 = 2.75

This option is comparable to option 2 in part c.

e. Let x and y be the fractions of instructions in the classes with CPI 5.0 and 2.0 on M1, respectively.

Average CPI for M1 = 5.0×x + 2.0×y + 2.4×(1-x-y) = 2.6x - 0.4y + 2.4

Average CPI for M2 = 4.0×x + 3.8×y + 2.0×(1-x-y) = 2x + 1.8y + 2

We need 600/(2.6x-0.4y+2.4) > 500/(2x+1.8y+2), which simplifies to 2.56y > 0.2x.

That is, when x/y < 12.8, M1 runs faster than M2 for the given task.
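
The crossover condition in part e can be spot-checked numerically. The following Python sketch evaluates both MIPS expressions directly for a few instruction mixes; the function names and the sample (x, y) pairs are illustrative choices, not part of the original example.

```python
def mips_m1(x, y):
    """MIPS of M1 (600 MHz) for class fractions x, y, and 1 - x - y."""
    return 600 / (5.0 * x + 2.0 * y + 2.4 * (1 - x - y))


def mips_m2(x, y):
    """MIPS of M2 (500 MHz) for the same class fractions."""
    return 500 / (4.0 * x + 3.8 * y + 2.0 * (1 - x - y))


def m1_is_faster(x, y):
    """True when M1 delivers more MIPS than M2 for this mix."""
    return mips_m1(x, y) > mips_m2(x, y)


# Sample mixes on either side of the x/y = 12.8 threshold.
for x, y in [(0.25, 0.25), (0.60, 0.05), (0.65, 0.05)]:
    winner = "M1" if m1_is_faster(x, y) else "M2"
    print(f"x={x}, y={y}, x/y={x / y:.1f}: {winner} is faster")
```

The first two mixes have x/y below 12.8 and favor M1, while the third (x/y = 13) favors M2, which agrees with the derived inequality.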

Example 4.5 MIPS rating can be misleading

a. Runtime for the output of compiler 1 = (600M×1 + 400M×2)/10^9 = 1.4 s

Runtime for the output of compiler 2 = (400M×1 + 400M×2)/10^9 = 1.2 s

Compiler 2 is faster.

b. Code produced by compiler 2 is 1.4/1.2 = 1.17 times as fast as that of compiler 1.

c. Average CPI for compiler 1 = (600M×1+400M×2)/1000M=1.4

Average CPI for compiler 2 = (400M×1+400M×2)/800M=1.5

MIPS rating of compiler 1=1000/1.4=714

MIPS rating of compiler 2=1000/1.5=667

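Compiler 1 has the higher MIPS rating, even though its code takes longer to run; this is why the MIPS rating can be misleading.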
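
The mismatch between runtime and MIPS rating can be reproduced with a short Python sketch. The 1 GHz clock rate is an assumption inferred from the divisors used in the solution, and the function and variable names are illustrative.

```python
CLOCK_HZ = 1e9  # assumed 1 GHz clock, inferred from the solution's arithmetic


def runtime_seconds(code):
    """Runtime = total cycles / clock rate; code is a list of (count, cpi)."""
    cycles = sum(count * cpi for count, cpi in code)
    return cycles / CLOCK_HZ


def mips_rating(code):
    """MIPS rating = instructions executed / (runtime x 10^6)."""
    instructions = sum(count for count, _ in code)
    return instructions / runtime_seconds(code) / 1e6


# (instruction count, CPI) pairs for the code produced by each compiler.
compiler1 = [(600e6, 1), (400e6, 2)]
compiler2 = [(400e6, 1), (400e6, 2)]

for name, code in [("compiler 1", compiler1), ("compiler 2", compiler2)]:
    print(f"{name}: {runtime_seconds(code):.1f} s, {mips_rating(code):.0f} MIPS")
```

Compiler 2 wins on runtime because it executes fewer instructions overall, while compiler 1 wins on MIPS because a larger share of its instructions are single-cycle ones.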

Table 4.4 Measured or estimated execution times for three programs.

4.5 Reporting Computer Performance

Wrong method (arithmetic mean)

Speedup of Y over X = (0.1+10.0+10.0)/3 = 6.7 (1)

Speedup of X over Y = (10.0+0.1+0.1)/3 = 3.4 (contradicts (1): each machine appears faster than the other)

Total time comparison: correct only if the programs are run the same number of times.

Geometric mean:

Speedup of Y over X = (0.1×10.0×10.0)^(1/3) = 2.15 (2)

Speedup of X over Y = (10.0×0.1×0.1)^(1/3) = 0.46 (consistent with (2), since 1/2.15 = 0.46)
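
The difference between the two means is easy to reproduce. This Python sketch uses the three per-program speedups quoted above; the function names are illustrative, and `math.prod` requires Python 3.8 or later.

```python
import math


def arithmetic_mean(values):
    """Sum of the values divided by their count."""
    return sum(values) / len(values)


def geometric_mean(values):
    """n-th root of the product of n values."""
    return math.prod(values) ** (1.0 / len(values))


# Per-program speedups of Y over X, and their reciprocals for X over Y.
y_over_x = [0.1, 10.0, 10.0]
x_over_y = [1.0 / s for s in y_over_x]

am_yx, am_xy = arithmetic_mean(y_over_x), arithmetic_mean(x_over_y)
gm_yx, gm_xy = geometric_mean(y_over_x), geometric_mean(x_over_y)

print(f"arithmetic: Y/X = {am_yx:.2f}, X/Y = {am_xy:.2f}")
print(f"geometric:  Y/X = {gm_yx:.2f}, X/Y = {gm_xy:.2f}")
print(f"product of geometric means = {gm_yx * gm_xy:.2f}")
```

Both arithmetic means exceed 1, so each machine appears faster than the other. The geometric means are reciprocals of each other, so the comparison gives the same answer in either direction.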

Example 4.6

Table 4.3 Usage frequency, in percentage, for various instruction classes in four representative applications.

Answer:

a. CPI for data compression application on M1=0.25×4.0+0.32×1.5+0.16×1.2+0×6.0+0.19×2.5+0.08×2.0=2.31

CPI for data compression application on M2=2.54

CPI for nuclear reactor simulation application on M1=3.94

CPI for nuclear reactor simulation application on M2=2.89

b. Because the programs and clock rates are the same, the speedup ratio is given by the ratio of CPIs.

Data compression speedup (M2/M1) = 2.31/2.54 = 0.91

Nuclear reactor simulation speedup (M2/M1) = 3.94/2.89 = 1.36

c. Overall performance advantage of M2 over M1 is the geometric mean of the two speedups: (0.91×1.36)^(1/2) = 1.11

Figure 4.7 Exponential growth of supercomputer performance [Bell92].

4.6 The Quest for High Performance

Figure 4.8 Milestones in the Accelerated Strategic Computing Initiative (ASCI) program, sponsored by the U.S. Department of Energy, with extrapolation up to the PFLOPS level.

Problem 4.5

Problem 4.12