Computer Architecture and Performance
Objectives:
Understand the concepts of computer architecture
Understand how performance is measured
Know the different ways to measure computer performance
Computer Architecture
The task the computer designer faces is a complex one: Determine what attributes are important for a new computer, then design a computer to maximize performance while staying within cost, power, and availability constraints.
Computer Architecture cont’d
In the past, the term computer architecture often referred only to instruction set design.
Other aspects of computer design were called implementation.
Instruction Set Architecture
Instruction set architecture (ISA) refers to the actual programmer-visible instruction set. Ex. the LMC instruction set
The ISA serves as the boundary between the software and hardware.
Implementation
The implementation of a computer has two components: organization and hardware.
The term organization includes the high-level aspects of a computer’s design, such as the memory system, the memory interconnect, and the design of the internal processor or CPU.
Hardware refers to the specifics of a computer, including the detailed logic design and the packaging technology of the computer.
Goal
Computer architects must design a computer to meet functional requirements.
Time discovers truth. (Seneca)
Performance
In general, performance describes how quickly a given system can execute a program or programs.
Systems that execute programs in less time are said to have higher performance.
Response Time/Execution Time
The time between the start and completion of a task
To maximize performance, we want to minimize response time or execution time for some task.
Thus we can relate performance and execution time for a computer X:
$$\text{Performance}_X = \frac{1}{\text{Execution Time}_X}$$
This means that for two computers X and Y, if the performance of X is greater than the performance of Y, we have
$$\text{Performance}_X > \text{Performance}_Y$$
$$\frac{1}{\text{Execution Time}_X} > \frac{1}{\text{Execution Time}_Y}$$
$$\text{Execution Time}_Y > \text{Execution Time}_X$$
In discussing a computer design, we often want to relate the performance of two different computers quantitatively. We will use the phrase “X is n times faster than Y”—or equivalently “X is n times as fast as Y”—to mean
$$\frac{\text{Performance}_X}{\text{Performance}_Y} = n$$
If X is n times faster than Y, then the execution time on Y is n times longer than it is on X:
$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution Time}_Y}{\text{Execution Time}_X} = n$$
Relative Performance
Ex. If computer A runs a program in 10 seconds and computer B runs the same program in 15 seconds, how much faster is A than B?
$$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution Time}_B}{\text{Execution Time}_A} = n$$
Thus, the performance ratio is $\frac{15}{10} = 1.5$
and A is therefore 1.5 times faster than B.
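The same ratio can be checked with a few lines of Python. This is only a minimal sketch; the function name `times_faster` is an illustrative choice and does not come from the slides.

```python
def times_faster(exec_time_x: float, exec_time_y: float) -> float:
    """Return n such that X is n times faster than Y.

    Uses Performance_X / Performance_Y = ExecutionTime_Y / ExecutionTime_X.
    """
    if exec_time_x <= 0 or exec_time_y <= 0:
        raise ValueError("execution times must be positive")
    return exec_time_y / exec_time_x


# Computer A takes 10 s, computer B takes 15 s for the same program.
n = times_faster(10.0, 15.0)
print(n)  # 1.5
```

Both times must be in the same unit and measured on the same program, otherwise the ratio is not meaningful.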
Measuring Performance
Time is the measure of computer performance: the computer that performs the same amount of work in the least time is the fastest.
Performance Metrics
Cycles per Instruction (CPI)
Number of clock cycles required to execute each instruction
$$\text{CPI} = \frac{\text{number of clock cycles required to execute the program}}{\text{number of instructions executed in running the program}}$$
Instructions executed Per Cycle (IPC)
For systems that can execute more than one instruction per cycle, the IPC is used instead of CPI
$$\text{IPC} = \frac{\text{number of instructions executed in running the program}}{\text{number of clock cycles required to execute the program}}$$
Note: IPC is the reciprocal of CPI
Ex.
A given program consists of a 100-instruction loop that is executed 42 times. If it takes 16,000 cycles to execute the program on a given system, what are the system’s CPI and IPC values for the program?
Soln:
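One way to work this out, sketched in Python using the CPI and IPC definitions from the previous slide. The helper names `cpi` and `ipc` are illustrative and not part of the slides; the instruction count assumes the program consists only of the loop, as the problem states.

```python
def cpi(cycles: int, instructions: int) -> float:
    """Cycles per instruction: total cycles / instructions executed."""
    return cycles / instructions


def ipc(cycles: int, instructions: int) -> float:
    """Instructions per cycle: the reciprocal of CPI."""
    return instructions / cycles


# A 100-instruction loop executed 42 times.
instructions = 100 * 42   # 4,200 instructions executed
cycles = 16_000

print(round(cpi(cycles, instructions), 2))  # 3.81
print(round(ipc(cycles, instructions), 4))  # 0.2625
```

So the program executes 4,200 instructions, giving a CPI of about 3.81 and an IPC of about 0.26.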
Benchmark Suites
Consists of a set of programs that are believed to be typical of the programs that will be run on the system
They generate estimates of a system’s performance on different types of applications.
Ex. SPEC (Standard Performance Evaluation Corporation) is a non-profit corporation formed to establish, maintain and endorse a standardized set of relevant benchmarks that can be applied to the newest generation of high-performance computers.
SPEC CPU2006, SPEC CPUv6
Speedup
Used to describe how the performance of an architecture changes as different improvements are made to the architecture
It is the ratio of the execution times before and after a change is made
$$\text{Speedup} = \frac{\text{Execution Time}_{before}}{\text{Execution Time}_{after}}$$
Ex
If a program takes 25 seconds to run on one version of an architecture and 15 seconds to run on a new version, the overall speedup = 25 sec / 15 sec ≈ 1.67
Amdahl’s Law
The most important rule for designing high-performance computer systems is to make the common case fast.
Qualitatively, this means that the impact of a given improvement on overall performance depends on both how much the improvement speeds things up when it is in use and how often it is in use.
Amdahl’s Law
$$\text{Execution Time}_{new} = \text{Execution Time}_{old} \times \left[ \text{Frac}_{unused} + \frac{\text{Frac}_{used}}{\text{Speedup}_{used}} \right]$$
where:
$\text{Frac}_{unused}$ = fraction of time that the improvement is not in use
$\text{Frac}_{used}$ = fraction of time that the improvement is in use
$\text{Speedup}_{used}$ = speedup that occurs when the improvement is used
Note that $\text{Frac}_{used}$ and $\text{Frac}_{unused}$ are computed using the execution time before the modification is applied.
Amdahl’s Law can be rewritten using the definition of speedup:
$$\text{Speedup} = \frac{\text{Execution Time}_{old}}{\text{Execution Time}_{new}} = \frac{1}{\text{Frac}_{unused} + \dfrac{\text{Frac}_{used}}{\text{Speedup}_{used}}}$$
Ex.
Suppose that a given architecture does not have hardware support for multiplication, so multiplications have to be done through repeated addition (this was the case on some early microprocessors). If it takes 200 cycles to perform a multiplication in software and 4 cycles to perform it in hardware, what is the overall speedup from hardware support for multiplication if a program spends 10% of its time doing multiplications? What about a program that spends 40% of its time doing multiplications?
Soln:
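A minimal sketch of the calculation in Python, applying Amdahl's Law in its speedup form. The function name `amdahl_speedup` is illustrative and not from the slides. It assumes the 10% and 40% figures are fractions of the original (software-multiply) execution time, and that the speedup while the improvement is in use is 200 cycles / 4 cycles = 50.

```python
def amdahl_speedup(frac_used: float, speedup_used: float) -> float:
    """Overall speedup = 1 / (Frac_unused + Frac_used / Speedup_used)."""
    if not 0.0 <= frac_used <= 1.0:
        raise ValueError("frac_used must be between 0 and 1")
    if speedup_used <= 0:
        raise ValueError("speedup_used must be positive")
    frac_unused = 1.0 - frac_used
    return 1.0 / (frac_unused + frac_used / speedup_used)


# Software multiply: 200 cycles; hardware multiply: 4 cycles.
speedup_used = 200 / 4  # 50x faster while multiplying

for frac in (0.10, 0.40):
    print(f"{frac:.0%}: {amdahl_speedup(frac, speedup_used):.2f}")
# 10%: 1.11
# 40%: 1.64
```

Even with a 50-fold speedup on multiplication, the overall speedup is only about 1.11 when multiplications account for 10% of the time and about 1.64 when they account for 40%, which illustrates why the common case matters most.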
Seatwork:
1. If the 2011 version of a computer executes a program in 200 ns and the 2013 version of the computer executes the same program in 150 ns, what speedup has the manufacturer achieved over the two-year period?
2. To achieve a speedup of 3 on a program that originally took 78 s to execute, what must the execution time of the program be reduced to?
3. When run on a given system, a program takes 1,000,000 cycles. If the system achieves a CPI of 40, how many instructions were executed in running the program?
4. What is the IPC of a program that executes 35,000 instructions and requires 17,000 cycles to execute?
5. Suppose a computer spends 90% of its time handling a particular type of computation when running a given program, and its manufacturers make a change that improves its performance on that type of computation by a factor of 10.
a. If the program originally took 100s to execute, what will its execution time be after the change?
b. What is the speedup from the old system to the new system?