performance

Computer Architecture and Performance

Objectives:

Understand the concepts of computer architecture

Understand how performance is measured

Know the different ways to measure computer performance

Computer Architecture

The task the computer designer faces is a complex one: Determine what attributes are important for a new computer, then design a computer to maximize performance while staying within cost, power, and availability constraints.

Computer Architecture cont’d

In the past, the term computer architecture often referred only to instruction set design.

Other aspects of computer design were called implementation.

Instruction Set Architecture

Instruction set architecture (ISA) refers to the actual programmer-visible instruction set. Ex. the LMC instruction set

The ISA serves as the boundary between the software and hardware.

Implementation

The implementation of a computer has two components: organization and hardware.

The term organization includes the high-level aspects of a computer’s design, such as the memory system, the memory interconnect, and the design of the internal processor or CPU.

Hardware refers to the specifics of a computer, including the detailed logic design and the packaging technology of the computer.

Goal

Computer architects must design a computer to meet functional requirements.

Time discovers truth. (Seneca)

Performance

In general, performance describes how quickly a given system can execute a program or programs.

Systems that execute programs in less time are said to have higher performance.

Response Time/Execution Time

The time between the start and completion of a task

To maximize performance, we want to minimize response time or execution time for some task.


Thus we can relate performance and execution time for a computer X:

$$\text{Performance}_X = \frac{1}{\text{Execution Time}_X}$$


This means that for two computers X and Y, if the performance of X is greater than the performance of Y, we have

$$\text{Performance}_X > \text{Performance}_Y$$

$$\frac{1}{\text{Execution Time}_X} > \frac{1}{\text{Execution Time}_Y}$$

$$\text{Execution Time}_Y > \text{Execution Time}_X$$


In discussing a computer design, we often want to relate the performance of two different computers quantitatively. We will use the phrase “X is n times faster than Y”—or equivalently “X is n times as fast as Y”—to mean

$$\frac{\text{Performance}_X}{\text{Performance}_Y} = n$$


If X is n times faster than Y, then the execution time on Y is n times longer than it is on X:

$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution Time}_Y}{\text{Execution Time}_X} = n$$

Relative Performance

Ex. If computer A runs a program in 10 seconds and computer B runs the same program in 15 seconds, how much faster is A than B?

Ex.

$$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution Time}_B}{\text{Execution Time}_A} = n$$

Thus, the performance ratio is

$$\frac{15}{10} = 1.5$$

and A is therefore 1.5 times faster than B.
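
The relative-performance ratio can also be checked with a few lines of Python. This is only a minimal sketch: the function name `relative_performance` and its parameter names are illustrative and do not come from the slides.

```python
def relative_performance(time_x: float, time_y: float) -> float:
    """Return n such that computer X is n times faster than computer Y.

    Uses Performance_X / Performance_Y = Execution Time_Y / Execution Time_X.
    """
    if time_x <= 0 or time_y <= 0:
        raise ValueError("execution times must be positive")
    return time_y / time_x


# Computer A takes 10 seconds, computer B takes 15 seconds.
n = relative_performance(time_x=10.0, time_y=15.0)
print(f"A is {n:.1f} times faster than B")  # A is 1.5 times faster than B
```

Note that the slower machine's time goes in the numerator, because performance is the reciprocal of execution time.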

Measuring Performance

Time is the measure of computer performance: the computer that performs the same amount of work in the least time is the fastest.

Performance Metrics

Cycles per Instruction (CPI)

Number of clock cycles required to execute each instruction

$$\text{CPI} = \frac{\text{number of clock cycles required to execute the program}}{\text{number of instructions executed in running the program}}$$

Instructions executed Per Cycle (IPC)

For systems that can execute more than one instruction per cycle, the IPC is used instead of CPI

$$\text{IPC} = \frac{\text{number of instructions executed in running the program}}{\text{number of clock cycles required to execute the program}}$$

Note: IPC is the reciprocal of CPI

Ex.

A given program consists of a 100-instruction loop that is executed 42 times. If it takes 16,000 cycles to execute the program on a given system, what are the system’s CPI and IPC values for the program?

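Soln: The loop executes 100 × 42 = 4,200 instructions, so CPI = 16,000 / 4,200 ≈ 3.81 and IPC = 4,200 / 16,000 ≈ 0.26.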
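
The CPI and IPC definitions can be applied to this example numerically. The following is a minimal Python sketch; the function names `cpi` and `ipc` are illustrative and not part of the slides.

```python
def cpi(cycles: int, instructions: int) -> float:
    """Cycles per instruction: clock cycles divided by instructions executed."""
    return cycles / instructions


def ipc(cycles: int, instructions: int) -> float:
    """Instructions per cycle: the reciprocal of CPI."""
    return instructions / cycles


# A 100-instruction loop executed 42 times, taking 16,000 cycles in total.
instructions = 100 * 42
cycles = 16_000

print(f"Instructions executed = {instructions}")      # 4200
print(f"CPI = {cpi(cycles, instructions):.2f}")       # CPI = 3.81
print(f"IPC = {ipc(cycles, instructions):.2f}")       # IPC = 0.26
```

The instruction count is the dynamic count (instructions actually executed), not the 100 instructions that appear in the program text.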

Benchmark Suites

Consists of a set of programs that are believed to be typical of the programs that will be run on the system

They generate estimates of a system’s performance on different types of applications.

Ex. SPEC (Standard Performance Evaluation Corporation) is a non-profit corporation formed to establish, maintain and endorse a standardized set of relevant benchmarks that can be applied to the newest generation of high-performance computers.

SPEC CPU2006, SPEC CPUv6

Speedup

Used to describe how the performance of an architecture changes as different improvements are made to the architecture.

It is the ratio of the execution times before and after a change is made.

$$\text{Speedup} = \frac{\text{Execution Time}_{\text{before}}}{\text{Execution Time}_{\text{after}}}$$

Ex

If a program takes 25 seconds to run on one version of an architecture and 15 seconds to run on a new version, the overall speedup = 25 sec/15 sec = 1.67
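
The same calculation as a short Python sketch. The function name `speedup` is an illustrative choice, not something defined in the slides.

```python
def speedup(time_before: float, time_after: float) -> float:
    """Ratio of the execution time before a change to the time after it."""
    if time_before <= 0 or time_after <= 0:
        raise ValueError("execution times must be positive")
    return time_before / time_after


# 25 seconds on the old version, 15 seconds on the new version.
print(f"Speedup = {speedup(25.0, 15.0):.2f}")  # Speedup = 1.67
```

A value above 1 means the change made the program faster; a value below 1 means it made the program slower.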

Amdahl’s Law

The most important rule for designing high-performance computer systems is to make the common case fast.

Qualitatively, this means that the impact of a given improvement on overall performance depends on both how much the improvement speeds things up when it is in use and how often it is in use.

Amdahl’s Law

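$$\text{Execution Time}_{\text{new}} = \text{Execution Time}_{\text{old}} \times \left[ \text{Frac}_{\text{unused}} + \frac{\text{Frac}_{\text{used}}}{\text{Speedup}_{\text{used}}} \right]$$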

where:

Frac_unused = fraction of time that the improvement is not in use

Frac_used = fraction of time that the improvement is in use

Speedup_used = speedup that occurs when the improvement is used

Note that Frac_used and Frac_unused are computed using the execution time before the modification is applied.

Amdahl’s Law can be rewritten using the definition of speedup:

$$\text{Speedup} = \frac{\text{Execution Time}_{\text{old}}}{\text{Execution Time}_{\text{new}}} = \frac{1}{\text{Frac}_{\text{unused}} + \dfrac{\text{Frac}_{\text{used}}}{\text{Speedup}_{\text{used}}}}$$

Ex.

Suppose that a given architecture does not have hardware support for multiplication, so multiplications have to be done through repeated addition (this was the case on some early microprocessors). If it takes 200 cycles to perform a multiplication in software, and 4 cycles to perform a multiplication in hardware, what is the overall speedup from hardware support for multiplication if a program spends 10% of its time doing multiplications? What about a program that spends 40% of its time doing multiplications?
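
One way to work this example is to plug the numbers into the speedup form of Amdahl’s Law. The Python sketch below does so under two assumptions: the speedup of the multiplication portion is taken to be the ratio of cycle counts (200 / 4 = 50), and the 10% and 40% figures are fractions of the original, software-multiply execution time. The function name `amdahl_speedup` is illustrative.

```python
def amdahl_speedup(frac_used: float, speedup_used: float) -> float:
    """Overall speedup = 1 / (Frac_unused + Frac_used / Speedup_used)."""
    if not 0.0 <= frac_used <= 1.0:
        raise ValueError("frac_used must be between 0 and 1")
    if speedup_used <= 0:
        raise ValueError("speedup_used must be positive")
    frac_unused = 1.0 - frac_used
    return 1.0 / (frac_unused + frac_used / speedup_used)


# Assumption: hardware multiply is 200 / 4 = 50 times faster than software.
speedup_used = 200 / 4

for frac in (0.10, 0.40):
    overall = amdahl_speedup(frac, speedup_used)
    print(f"{frac:.0%} of time in multiplication: overall speedup = {overall:.2f}")
# 10% of time in multiplication: overall speedup = 1.11
# 40% of time in multiplication: overall speedup = 1.64
```

Even a 50-fold improvement yields only about an 11% gain when it applies to 10% of the run time, which is the point of making the common case fast.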

Seatwork:

1. If the 2011 version of a computer executes a program in 200 ns and the version of the computer made in 2013 executes the same program in 150 ns, what speedup did the manufacturer achieve over the two-year period?

2. To achieve a speedup of 3 on a program that originally took 78 s to execute, what must the execution time of the program be reduced to?

3. When run on a given system, a program takes 1,000,000 cycles. If the system achieves a CPI of 40, how many instructions were executed in running the program?

4. What is the IPC of a program that executes 35,000 instructions and requires 17,000 cycles to execute?

5. Suppose a computer spends 90% of its time handling a particular type of computation when running a given program, and its manufacturers make a change that improves its performance on that type of computation by a factor of 10.

a. If the program originally took 100s to execute, what will its execution time be after the change?

b. What is the speedup from the old system to the new system?
