Computer Architecture and Performance

Upload: ponygal13

Post on 18-Jan-2016


Page 1: Performance

Computer Architecture and Performance

Page 2: Performance

Objectives:

Understand the concepts of computer architecture

Understand how performance is measured

Know the different ways to measure computer performance

Page 3: Performance

Computer Architecture

The task the computer designer faces is a complex one: Determine what attributes are important for a new computer, then design a computer to maximize performance while staying within cost, power, and availability constraints.

Page 4: Performance

Computer Architecture cont’d

In the past, the term computer architecture often referred only to instruction set design.

Other aspects of computer design were called implementation.

Page 5: Performance

Instruction Set Architecture

Instruction set architecture (ISA) refers to the actual programmer-visible instruction set. Ex. the LMC instruction set

The ISA serves as the boundary between the software and hardware.

Page 6: Performance

Implementation

The implementation of a computer has two components: organization and hardware.

The term organization includes the high-level aspects of a computer’s design, such as the memory system, the memory interconnect, and the design of the internal processor or CPU.

Hardware refers to the specifics of a computer, including the detailed logic design and the packaging technology of the computer.

Page 7: Performance

Goal

Computer architects must design a computer to meet functional requirements.

Page 8: Performance

Time discovers truth. - Seneca

Page 9: Performance

Performance

In general, performance describes how quickly a given system can execute a program or programs.

Systems that execute programs in less time are said to have higher performance.

Page 10: Performance

Response Time/Execution Time

The time between the start and completion of a task

To maximize performance, we want to minimize response time or execution time for some task.

Page 11: Performance

Response Time/Execution Time

Thus we can relate performance and execution time for a computer X:

$$\text{Performance}_X = \frac{1}{\text{Execution Time}_X}$$

Page 12: Performance

Response Time/Execution Time

This means that for two computers X and Y, if the performance of X is greater than the performance of Y, we have

$$\text{Performance}_X > \text{Performance}_Y$$

$$\frac{1}{\text{Execution Time}_X} > \frac{1}{\text{Execution Time}_Y}$$

$$\text{Execution Time}_Y > \text{Execution Time}_X$$

Page 13: Performance

Response Time/Execution Time

In discussing a computer design, we often want to relate the performance of two different computers quantitatively. We will use the phrase “X is n times faster than Y”—or equivalently “X is n times as fast as Y”—to mean

$$\text{Performance}_X = n \times \text{Performance}_Y$$

Page 14: Performance

Response Time/Execution Time

If X is n times faster than Y, then the execution time on Y is n times longer than it is on X:

$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution Time}_Y}{\text{Execution Time}_X} = n$$

Page 15: Performance

Relative Performance

Ex. If computer A runs a program in 10 seconds and computer B runs the same program in 15 seconds, how much faster is A than B?

Page 16: Performance

Ex.

$$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution Time}_B}{\text{Execution Time}_A} = n$$

Thus, the performance ratio is

$$n = \frac{15}{10} = 1.5$$

and A is therefore 1.5 times faster than B.
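The same ratio can be checked with a few lines of Python. This is a minimal sketch; the function name `relative_performance` is hypothetical and chosen only for illustration.

```python
def relative_performance(exec_time_x: float, exec_time_y: float) -> float:
    """Return n such that X is n times as fast as Y.

    Performance is the reciprocal of execution time, so
    Performance_X / Performance_Y = ExecutionTime_Y / ExecutionTime_X.
    """
    if exec_time_x <= 0 or exec_time_y <= 0:
        raise ValueError("execution times must be positive")
    return exec_time_y / exec_time_x


# Computer A takes 10 seconds, computer B takes 15 seconds.
n = relative_performance(10.0, 15.0)
print(f"A is {n:.1f} times faster than B")  # A is 1.5 times faster than B
```

Note that the slower machine's time goes in the numerator, because a longer execution time means lower performance.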

Page 17: Performance

Measuring Performance

Time is the measure of computer performance: the computer that performs the same amount of work in the least time is the fastest.

Page 18: Performance

Performance Metrics

Cycles per Instruction (CPI)

Number of clock cycles required to execute each instruction

$$\text{CPI} = \frac{\text{number of clock cycles required to execute the program}}{\text{number of instructions executed in running the program}}$$

Instructions executed Per Cycle (IPC)

For systems that can execute more than one instruction per cycle, the IPC is used instead of CPI

$$\text{IPC} = \frac{\text{number of instructions executed in running the program}}{\text{number of clock cycles required to execute the program}}$$

Note: IPC is the reciprocal of CPI

Page 19: Performance

Ex.

A given program consists of a 100-instruction loop that is executed 42 times. If it takes 16,000 cycles to execute the program on a given system, what are the system’s CPI and IPC values for the program?

Soln:
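The transcript does not preserve the worked solution, so the following is only a sketch of one way to compute it. It assumes the 100-instruction loop accounts for every instruction the program executes; the helper names `cpi` and `ipc` are hypothetical.

```python
def cpi(cycles: int, instructions: int) -> float:
    """Cycles per instruction: total clock cycles / instructions executed."""
    return cycles / instructions


def ipc(cycles: int, instructions: int) -> float:
    """Instructions per cycle: the reciprocal of CPI."""
    return instructions / cycles


# Assumption: the loop body is the whole program, so the
# instruction count is 100 instructions x 42 iterations.
instructions = 100 * 42  # 4,200 instructions
cycles = 16_000

print(f"Instructions executed = {instructions}")
print(f"CPI = {cpi(cycles, instructions):.2f}")   # CPI = 3.81
print(f"IPC = {ipc(cycles, instructions):.4f}")   # IPC = 0.2625
```

Under that assumption the CPI is 16,000 / 4,200, or about 3.81, and the IPC is 4,200 / 16,000 = 0.2625.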

Page 20: Performance

Benchmark Suites

Consists of a set of programs that are believed to be typical of the programs that will be run on the system

They generate estimates of a system’s performance on different types of applications.

Ex. SPEC (Standard Performance Evaluation Corporation) is a non-profit corporation formed to establish, maintain and endorse a standardized set of relevant benchmarks that can be applied to the newest generation of high-performance computers.

SPEC CPU2006, SPEC CPUv6

Page 21: Performance

Speedup

Used to describe how the performance of an architecture changes as different improvements are made to the architecture

It is the ratio of the execution times before and after a change is made

$$\text{Speedup} = \frac{\text{Execution Time}_{before}}{\text{Execution Time}_{after}}$$

Page 22: Performance

Ex

If a program takes 25 seconds to run on one version of an architecture and 15 seconds to run on a new version, the overall speedup is 25 sec / 15 sec ≈ 1.67.
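As a quick check of this arithmetic, here is a minimal Python sketch; the function name `speedup` is hypothetical.

```python
def speedup(time_before: float, time_after: float) -> float:
    """Ratio of execution time before a change to execution time after it."""
    if time_before <= 0 or time_after <= 0:
        raise ValueError("execution times must be positive")
    return time_before / time_after


# 25 seconds on the old version, 15 seconds on the new version.
s = speedup(25.0, 15.0)
print(f"Speedup = {s:.2f}")  # Speedup = 1.67
```

A result greater than 1 means the change made the program faster; a result below 1 means it became slower.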

Page 23: Performance

Amdahl’s Law

The most important rule for designing high-performance computer systems is to make the common case fast.

Qualitatively, this means that the impact of a given improvement on overall performance depends on both how much the improvement speeds up the system when it is in use and how often the improvement is in use.

Page 24: Performance

Amdahl’s Law

$$\text{Execution Time}_{new} = \text{Execution Time}_{old} \times \left[ \text{Frac}_{unused} + \frac{\text{Frac}_{used}}{\text{Speedup}_{used}} \right]$$

Page 25: Performance

where:

$\text{Frac}_{unused}$ = fraction of time that the improvement is not in use

$\text{Frac}_{used}$ = fraction of time that the improvement is in use

$\text{Speedup}_{used}$ = speedup that occurs when the improvement is used

Note that $\text{Frac}_{used}$ and $\text{Frac}_{unused}$ are computed using the execution time before the modification is applied.

Page 26: Performance

Amdahl’s Law can be rewritten using the definition of speedup:

$$\text{Speedup} = \frac{\text{Execution Time}_{old}}{\text{Execution Time}_{new}} = \frac{1}{\text{Frac}_{unused} + \dfrac{\text{Frac}_{used}}{\text{Speedup}_{used}}}$$
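One consequence of this form, which the slides do not state but which follows directly from it: however large the speedup of the improved part becomes, the overall speedup cannot exceed the reciprocal of the unimproved fraction. Using the same symbols, and the fact that the two fractions sum to 1:

```latex
\lim_{\text{Speedup}_{used} \to \infty} \text{Speedup}
  = \frac{1}{\text{Frac}_{unused}}
  = \frac{1}{1 - \text{Frac}_{used}}
```

For instance, if an improvement applies to only 10% of the original execution time, the overall speedup can never exceed 1 / 0.9, or about 1.11. This is why making the common case fast matters most.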

Page 27: Performance

Ex.

Suppose that a given architecture does not have hardware support for multiplication, so multiplications have to be done through repeated addition (this was the case on some early microprocessors). If it takes 200 cycles to perform a multiplication in software, and 4 cycles to perform a multiplication in hardware, what is the overall speedup from hardware support for multiplication if a program spends 10% of its time doing multiplications? What about a program that spends 40% of its time doing multiplications?

Page 28: Performance

Soln:
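The transcript does not preserve the worked solution, so the following is only a sketch of how Amdahl's Law applies here. It assumes the stated percentages are fractions of the original (software multiply) execution time, and that hardware support speeds up the multiplication portion by 200 / 4 = 50. The function name `amdahl_speedup` is hypothetical.

```python
def amdahl_speedup(frac_used: float, speedup_used: float) -> float:
    """Overall speedup when a fraction of the original execution time
    (frac_used) is accelerated by a factor of speedup_used."""
    if not 0.0 <= frac_used <= 1.0:
        raise ValueError("frac_used must be between 0 and 1")
    if speedup_used <= 0:
        raise ValueError("speedup_used must be positive")
    frac_unused = 1.0 - frac_used
    return 1.0 / (frac_unused + frac_used / speedup_used)


# Assumption: 200 cycles in software vs. 4 cycles in hardware
# gives a speedup of 50 on the multiplication portion only.
speedup_used = 200 / 4

for frac_used in (0.10, 0.40):
    overall = amdahl_speedup(frac_used, speedup_used)
    print(f"{frac_used:.0%} multiplications -> overall speedup {overall:.2f}")
# 10% multiplications -> overall speedup 1.11
# 40% multiplications -> overall speedup 1.64
```

Even with a 50-fold faster multiply, the program that multiplies 10% of the time gains only about 11% overall, while the one that multiplies 40% of the time gains about 64%.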

Page 29: Performance

Seatwork:

1. If the 2011 version of a computer executes a program in 200ns and the version of the computer made in the year 2013 executes the same program in 150ns, what is the speedup that the manufacturer had achieved over the two-year period?

Page 30: Performance

2. To achieve a speedup of 3 on a program that originally took 78s to execute, what must the execution time of the program be reduced to?

Page 31: Performance

3. When run on a given system, a program takes 1,000,000 cycles. If the system achieves a CPI of 40, how many instructions were executed in running the program?

Page 32: Performance

4. What is the IPC of a program that executes 35,000 instructions and requires 17,000 cycles to execute?

Page 33: Performance

5. Suppose a computer spends 90% of its time handling a particular type of computation when running a given program, and its manufacturers make a change that improves its performance on that type of computation by a factor of 10.

a. If the program originally took 100s to execute, what will its execution time be after the change?

b. What is the speedup from the old system to the new system?