![Page 1: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/1.jpg)
Performance, Power
CS301
Prof Szajda
![Page 2: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/2.jpg)
Performance Metrics
(How do we compare two machines?)
![Page 3: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/3.jpg)
What to Measure?
Which airplane has the best performance?
![Page 4: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/4.jpg)
Performance
• One size does not fit all
• Depends on application domain
  - Scientific computing
  - Graphics
  - Databases
  - General-purpose desktop
  - Beware of designing to benchmark!
• Depends on technology characteristics
  - DRAM speed and capacity, chip size, etc.
![Page 5: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/5.jpg)
Which Metric Do We Use?
• Response or execution time
  - Difference between start and end time
  - Individual user cares most about this
• Throughput
  - Total amount of work done in given time
  - Frequently used for servers and clusters
• How are these affected by
  - Replacing processor with faster version?
  - Adding more processors?
![Page 6: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/6.jpg)
Execution Time
• Shorter execution time is better
• Allows comparison between 2 machines
![Page 7: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/7.jpg)
Relative Performance
• “X is n times faster than Y”
• Example:
  - Machine A takes 10s to run program
  - Machine B takes 15s to run same program
  - What is the performance ratio?
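The ratio asked for in the example can be worked out with a short sketch. It assumes the usual definition that performance is the reciprocal of execution time, so "X is n times faster than Y" means n = time of Y / time of X. The function name `relative_performance` is made up for illustration.

```python
def relative_performance(time_x: float, time_y: float) -> float:
    """Return n such that X is n times faster than Y.

    Assumes performance = 1 / execution time, so n = time_y / time_x.
    """
    return time_y / time_x


# Values from the slide: Machine A takes 10 s, Machine B takes 15 s.
ratio = relative_performance(time_x=10.0, time_y=15.0)
print(f"A is {ratio:.1f} times faster than B")  # A is 1.5 times faster than B
```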
![Page 8: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/8.jpg)
Different Time Values
• Execution time
  - Wall-clock, response, or elapsed time
    - Includes everything (processing, I/O, OS overhead, etc.)
  - Determines system performance
• CPU time
  - Time spent executing code for this task only
    - Does not include I/O or time-sharing
  - Comprises user CPU time and system CPU time
    - Different programs are affected differently by CPU and system performance
  - man time
    - 90.7u 12.9s 2:39 65%
    - User: 90.7 sec
    - System: 12.9 sec
    - Elapsed time: 2 min 39 sec
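The 65% in the sample `time` output appears to be the share of elapsed time that the CPU spent on this task. The sketch below checks that reading, assuming the percentage is (user + system) / elapsed; the function name is illustrative.

```python
def cpu_utilization(user_s: float, system_s: float, elapsed_s: float) -> float:
    """Fraction of elapsed (wall-clock) time spent as CPU time for the task."""
    return (user_s + system_s) / elapsed_s


# Values from the slide: 90.7u 12.9s 2:39
elapsed = 2 * 60 + 39  # 2 min 39 sec = 159 s
utilization = cpu_utilization(90.7, 12.9, elapsed)
print(f"{utilization:.0%}")  # 65%
```

The remaining time, roughly 35% here, went to I/O waits or to other programs sharing the machine.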
![Page 9: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/9.jpg)
Clock Cycles
• Instead of expressing time in seconds, use clock cycles
• Clock
  - Determines when events take place
  - Runs at constant rate (ex. 1 GHz)
  - Easy to convert between clock rate and seconds
    - Clock rate = 1 / Clock cycle time
    - 500 MHz = 1 / (2 ns)
    - 1 ns = $10^{-9}$ s
![Page 10: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/10.jpg)
Chapter 1 — Computer Abstractions and Technology —
CPU Clocking
• Operation of digital hardware governed by a constant-rate clock
[Figure: clock timing diagram. Within each clock period, data transfer and computation are followed by a state update.]
• Clock period: duration of a clock cycle
  - e.g., 250ps = 0.25ns = $250 \times 10^{-12}$ s
• Clock frequency (rate): cycles per second
  - e.g., 4.0GHz = 4000MHz = $4.0 \times 10^{9}$ Hz
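Since clock rate and clock period are reciprocals, converting between them is a one-line calculation. A minimal sketch using the numbers from the slides; the helper names are illustrative.

```python
def clock_period_s(rate_hz: float) -> float:
    """Clock period in seconds for a given clock rate in hertz."""
    return 1.0 / rate_hz


def clock_rate_hz(period_s: float) -> float:
    """Clock rate in hertz for a given clock period in seconds."""
    return 1.0 / period_s


# 500 MHz corresponds to a 2 ns period.
print(f"{clock_period_s(500e6) * 1e9:.2f} ns")  # 2.00 ns

# A 250 ps period corresponds to a 4.0 GHz clock.
print(f"{clock_rate_hz(250e-12) / 1e9:.1f} GHz")  # 4.0 GHz
```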
![Page 11: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/11.jpg)
An Aside
![Page 12: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/12.jpg)
CPU Time
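$$\text{CPU Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} = \frac{\text{CPU Clock Cycles}}{\text{Clock Rate}}$$
• Performance improved by
  - Reducing number of clock cycles
  - Increasing clock rate
  - Hardware designer must often trade off clock rate against cycle count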
![Page 13: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/13.jpg)
CPU Time Example
• Computer A: 2GHz clock, 10s CPU time
• Designing Computer B
  - Aim for 6s CPU time
  - Can do faster clock, but causes 1.2 × clock cycles
• How fast must Computer B clock be?
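One way to work the example is sketched below. It assumes the standard relation CPU time = clock cycles / clock rate; the variable names are illustrative.

```python
# Computer A: 2 GHz clock, 10 s CPU time.
rate_a_hz = 2e9
time_a_s = 10.0
cycles_a = time_a_s * rate_a_hz   # clock cycles = CPU time x clock rate

# Computer B: needs 1.2x as many clock cycles and must finish in 6 s.
cycles_b = 1.2 * cycles_a
time_b_s = 6.0
rate_b_hz = cycles_b / time_b_s   # clock rate = clock cycles / CPU time

print(f"Computer B needs a {rate_b_hz / 1e9:.1f} GHz clock")  # 4.0 GHz
```

Under these assumptions B needs twice A's clock rate to achieve a 10/6 ≈ 1.67× improvement in CPU time, because of the extra cycles.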
![Page 14: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/14.jpg)
Instruction Count and CPI
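$$\text{Clock Cycles} = \text{Instruction Count} \times \text{Cycles per Instruction}$$
$$\text{CPU Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time}$$
• Instruction Count for a program
  - Determined by program, ISA and compiler
• Average cycles per instruction
  - Determined by CPU hardware
  - If different instructions have different CPI
    - Average CPI affected by instruction mix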
![Page 15: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/15.jpg)
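CPI Example
• Computer A: Cycle Time = 250ps, CPI = 2.0
• Computer B: Cycle Time = 500ps, CPI = 1.2
• Same ISA
• Which is faster, and by how much?
$$\text{CPU Time}_A = IC \times 2.0 \times 250\,\text{ps} = IC \times 500\,\text{ps}$$
$$\text{CPU Time}_B = IC \times 1.2 \times 500\,\text{ps} = IC \times 600\,\text{ps}$$
A is faster…
$$\frac{\text{CPU Time}_B}{\text{CPU Time}_A} = \frac{IC \times 600\,\text{ps}}{IC \times 500\,\text{ps}} = 1.2$$
…by this much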
![Page 16: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/16.jpg)
Application Characteristics
• Determine the mix of different instruction types
  - Integer arithmetic
  - Logical operations
  - Floating point arithmetic
  - Loads and stores
• Different applications have different CPI because of different instruction mixes
![Page 17: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/17.jpg)
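CPI in More Detail
• If different instruction classes take different numbers of cycles
$$\text{Clock Cycles} = \sum_{i=1}^{n} \left( CPI_i \times IC_i \right)$$
• Weighted average CPI
$$CPI = \frac{\text{Clock Cycles}}{IC} = \sum_{i=1}^{n} \left( CPI_i \times \frac{IC_i}{IC} \right)$$
where $IC_i / IC$ is the relative frequency of instruction class $i$.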
![Page 18: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/18.jpg)
CPI Example
• Alternative compiled code sequences using instructions in classes A, B, C

| Class | A | B | C |
|---|---|---|---|
| CPI for class | 1 | 2 | 3 |
| IC in sequence 1 | 2 | 1 | 2 |
| IC in sequence 2 | 4 | 1 | 1 |

• Sequence 1: IC = 5
  - Clock Cycles = 2×1 + 1×2 + 2×3 = 10
  - Avg. CPI = 10/5 = 2.0
• Sequence 2: IC = 6
  - Clock Cycles = 4×1 + 1×2 + 1×3 = 9
  - Avg. CPI = 9/6 = 1.5
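The same arithmetic can be expressed as a small function. This is a minimal sketch; `average_cpi` is an illustrative name, and the inputs are the per-class CPI values and instruction counts from the example.

```python
def average_cpi(cpi_per_class, ic_per_class):
    """Weighted average CPI: total clock cycles divided by total instruction count."""
    clock_cycles = sum(cpi * ic for cpi, ic in zip(cpi_per_class, ic_per_class))
    return clock_cycles / sum(ic_per_class)


cpi_per_class = [1, 2, 3]   # classes A, B, C
sequence_1 = [2, 1, 2]      # instruction counts per class
sequence_2 = [4, 1, 1]

print(average_cpi(cpi_per_class, sequence_1))  # 2.0
print(average_cpi(cpi_per_class, sequence_2))  # 1.5
```

Sequence 2 executes more instructions (6 versus 5) yet needs fewer clock cycles (9 versus 10), so instruction count alone does not determine which sequence is faster.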
![Page 19: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/19.jpg)
Performance Summary
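The BIG Picture
$$\text{CPU Time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}$$
• Performance depends on
  - Algorithm: affects IC, possibly CPI
  - Programming language: affects IC, CPI
  - Compiler: affects IC, CPI
  - Instruction set architecture: affects IC, CPI, Tc (clock cycle time)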
![Page 20: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/20.jpg)
Amdahl’s Law
• How much speedup do you get from an enhancement?
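• Based on
  - Fraction of time enhancement used
  - Improvement in enhanced mode
$$\text{Speedup} = \frac{\text{Execution time w/o enhancement}}{\text{Execution time w/ enhancement}}$$
$$\text{Exec}_{new} = \text{Exec}_{old} \times \left( (1 - \text{fraction}_{enh}) + \frac{\text{fraction}_{enh}}{\text{Speedup}_{enh}} \right)$$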
![Page 21: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/21.jpg)
Pitfall: Amdahl’s Law
§1.10 Fallacies and Pitfalls
• Improving an aspect of a computer and expecting a proportional improvement in overall performance
• Example: multiply accounts for 80s/100s
  - How much improvement in multiply performance to get 5× overall?
  - Can’t be done!
• Corollary: make the common case fast
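To see why the 5× target cannot be reached, split the run time into the part the enhancement affects and the part it does not. The sketch below solves for the improvement factor needed on the affected part; `required_factor` is an illustrative helper, and the 4× case is a made-up contrast, not from the slides.

```python
def required_factor(total_s, affected_s, target_speedup):
    """Improvement factor needed on the affected part to hit the overall target.

    Returns None when the target is unreachable, i.e. when the unaffected
    time alone already meets or exceeds the target execution time.
    """
    unaffected_s = total_s - affected_s
    target_time_s = total_s / target_speedup
    budget_s = target_time_s - unaffected_s  # time left for the affected part
    if budget_s <= 0:
        return None
    return affected_s / budget_s


# Slide example: multiply takes 80 s of 100 s, target is 5x overall.
# Target time is 20 s, but the other 20 s of work is untouched.
print(required_factor(100, 80, 5))  # None

# Hypothetical contrast: a 4x target leaves 5 s for multiply.
print(required_factor(100, 80, 4))  # 16.0
```

Even an infinitely fast multiplier leaves the unaffected 20 s, so the overall speedup approaches but never reaches 5×.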
![Page 22: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/22.jpg)
Review Question
• Your machine has a clock rate of 2.4GHz. How long is the clock cycle?
![Page 23: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/23.jpg)
Review Questions
• Suppose you are given the following:
  - Machine A
    - 1 GHz
    - Average CPI = 1.6
    - Instructions = 1.7 Billion
  - Machine B
    - 3.3 GHz
    - Average CPI = 6.1
    - Instructions = 2 Billion
• Which machine is faster? By how much?
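A sketch of one way to approach this question, assuming CPU time = instruction count × CPI / clock rate and that both machines run the same program. The function name is illustrative.

```python
def cpu_time_s(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time in seconds = instruction count x CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


time_a = cpu_time_s(1.7e9, 1.6, 1.0e9)   # Machine A
time_b = cpu_time_s(2.0e9, 6.1, 3.3e9)   # Machine B

print(f"A: {time_a:.2f} s")                   # A: 2.72 s
print(f"B: {time_b:.2f} s")                   # B: 3.70 s
print(f"A is {time_b / time_a:.2f}x faster")  # A is 1.36x faster
```

Despite its much higher clock rate, Machine B loses here because of its higher CPI and larger instruction count.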
![Page 24: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/24.jpg)
Review Questions
• What is the average CPI for a machine with the following CPIs on an application with the following instruction frequency?
| Type | Frequency | CPI |
|---|---|---|
| Arithmetic | 0.45 | 1 |
| Memory | 0.3 | 8 |
| Control | 0.2 | 3 |
| Mult/Div | 0.05 | 5 |
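A sketch of the calculation, assuming the average CPI is the frequency-weighted sum of the per-type CPIs. The dictionary layout and function name are illustrative.

```python
def average_cpi_from_mix(mix):
    """Average CPI as the sum of (relative frequency x CPI) over instruction types."""
    total_frequency = sum(freq for freq, _ in mix.values())
    if abs(total_frequency - 1.0) > 1e-9:
        raise ValueError("instruction frequencies should sum to 1")
    return sum(freq * cpi for freq, cpi in mix.values())


# (frequency, CPI) pairs from the table.
mix = {
    "Arithmetic": (0.45, 1),
    "Memory": (0.30, 8),
    "Control": (0.20, 3),
    "Mult/Div": (0.05, 5),
}

print(f"{average_cpi_from_mix(mix):.2f}")  # 3.70
```

Memory instructions contribute 2.4 of the 3.7 cycles, so they dominate the average even though they are only 30% of the mix.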
![Page 25: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/25.jpg)
Review Questions
• What factors must be included when comparing the relative performance of two machines?
![Page 26: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/26.jpg)
Amdahl’s Law
• Suppose you have an enhancement that makes a functional unit 10x faster.
• Speedup if used 5% of the time?
• Speedup if used 40% of the time?
$$\text{Exec}_{new} = \text{Exec}_{old} \times \left( (1 - \text{fraction}_{enh}) + \frac{\text{fraction}_{enh}}{\text{Speedup}_{enh}} \right)$$
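Both cases follow from the formula, since overall speedup is Exec_old / Exec_new. A minimal sketch; `amdahl_speedup` is an illustrative name.

```python
def amdahl_speedup(fraction_enh: float, speedup_enh: float) -> float:
    """Overall speedup when a fraction of execution time is sped up by speedup_enh."""
    return 1.0 / ((1.0 - fraction_enh) + fraction_enh / speedup_enh)


# Functional unit made 10x faster, used 5% and 40% of the time.
for fraction in (0.05, 0.40):
    print(f"{fraction:.0%} of the time: {amdahl_speedup(fraction, 10):.2f}x overall")
# 5% of the time: 1.05x overall
# 40% of the time: 1.56x overall
```

A 10× enhancement used 40% of the time still yields well under a 2× overall gain.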
![Page 27: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/27.jpg)
Review Questions
• What is the equation for execution time?
• What does Amdahl’s Law say?
![Page 28: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/28.jpg)
Benchmarks
• Programs specifically used to measure performance
• Hope is that they are representative of how the computer will be used
• Examples
  - SPEC Integer and Floating Point
  - MediaBench
  - MineBench
  - TPC
![Page 29: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/29.jpg)
SPEC CPU Benchmark
• Programs used to measure performance
  - Supposedly typical of actual workload
• Standard Performance Evaluation Corp (SPEC)
  - Develops benchmarks for CPU, I/O, Web, …
• SPEC CPU2006
  - Elapsed time to execute a selection of programs
    - Negligible I/O, so focuses on CPU performance
  - Normalize relative to reference machine
  - Summarize as geometric mean of performance ratios
    - CINT2006 (integer) and CFP2006 (floating-point)
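The summarizing step can be sketched as follows. The times below are hypothetical numbers chosen for illustration, not SPEC results, and the function names are made up; each ratio is reference time divided by measured time, and the summary is the geometric mean of those ratios.

```python
import math


def performance_ratios(reference_times, measured_times):
    """Per-benchmark ratio: reference machine time / measured machine time."""
    return [ref / meas for ref, meas in zip(reference_times, measured_times)]


def geometric_mean(values):
    """n-th root of the product, computed via logarithms for numerical stability."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical elapsed times in seconds for three benchmark programs.
reference_times = [100.0, 200.0, 400.0]
measured_times = [50.0, 50.0, 50.0]

ratios = performance_ratios(reference_times, measured_times)  # [2.0, 4.0, 8.0]
print(f"{geometric_mean(ratios):.2f}")  # 4.00
```

The arithmetic mean of the same ratios would be about 4.67, so the choice of mean changes the reported summary.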
![Page 30: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/30.jpg)
CINT2006 for Intel Core i7 920
![Page 31: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/31.jpg)
Recent Concern: Power
§1.7 The Power Wall
• In CMOS IC technology
  - Annotations on the slide: ×1000, ×40, 5V → 1V
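The equation on this slide did not survive in the transcript. The sketch below assumes the standard CMOS dynamic power relation, Power = Capacitive load × Voltage² × Frequency, and assumes the capacitive load is unchanged; under those assumptions the ×1000 and 5V → 1V annotations combine to the ×40 figure. The function name is illustrative.

```python
def relative_power(cap_ratio: float, voltage_ratio: float, freq_ratio: float) -> float:
    """Ratio of new to old dynamic power.

    Assumes Power = Capacitive load x Voltage^2 x Frequency; each argument
    is the new value divided by the old value.
    """
    return cap_ratio * voltage_ratio ** 2 * freq_ratio


# Frequency up 1000x, voltage down from 5 V to 1 V, capacitive load
# assumed unchanged (an assumption made for this sketch).
ratio = relative_power(cap_ratio=1.0, voltage_ratio=1 / 5, freq_ratio=1000)
print(f"power grows about {ratio:.1f}x")  # power grows about 40.0x
```

Because voltage enters squared, lowering it from 5 V to 1 V offsets a factor of 25 of the frequency increase.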
![Page 32: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/32.jpg)
Tricks to Increase Power
• Attach large cooling devices
• Turn off parts of chips not used in given clock cycle
  - Can increase power to 300 watts...
  - ...But these and other ways are all prohibitively expensive for desktop computers. So...
![Page 33: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/33.jpg)
More Recent Approaches: Chip Multiprocessors
• Reasons for change
  - Limited opportunities to improve single thread performance
  - Power
  - On-chip communication latencies
![Page 34: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/34.jpg)
Uniprocessor Performance
§1.8 The Sea Change: The Switch to Multiprocessors
Constrained by power, instruction-level parallelism, memory latency
![Page 35: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/35.jpg)
Multiprocessors
• Multicore microprocessors
  - More than one processor per chip
• Requires explicitly parallel programming
  - Compare with instruction level parallelism
    - Hardware executes multiple instructions at once
    - Hidden from the programmer
  - Hard to do
    - Programming for performance
    - Load balancing
    - Optimizing communication and synchronization
![Page 36: Performance Power DieYielddszajda/classes/cs301/Fall_2020/... · CS301 Prof Szajda. Performance Metrics (How do we compare two machines?) What to Measure?!3 Which airplane has the](https://reader033.vdocuments.us/reader033/viewer/2022053111/6082355e8f98a67a4a45575b/html5/thumbnails/36.jpg)
Concluding Remarks
§1.9 Concluding Remarks
• Cost/performance is improving
  - Due to underlying technology development
• Hierarchical layers of abstraction
  - In both hardware and software
• Instruction set architecture
  - The hardware/software interface
• Execution time: the best performance measure
• Power is a limiting factor
  - Use parallelism to improve performance