CSC 4250 Computer Architectures

August 29, 2006
Chap. 1. Fundamentals of Computer Design

What you will learn in this class

Quantitative approach
Instruction set principles
Floating-point numbers and arithmetic
Basic pipelining
Advanced pipelining
Caches
Virtual memory

Syllabus

Chap. 1. Fundamentals of Computer Design
Chap. 2. Instruction Set Principles
Appx. H. Computer Arithmetic
Appx. A. Pipelining
Chap. 3. Instruction Level Parallelism: Hardware
Chap. 4. Instruction Level Parallelism: Software
Chap. 5. Memory Hierarchy

How to determine your letter grade

Eleven homework assignments: 20%
Midterm 1: 20%
Midterm 2: 20%
Final exam: 40%
Cutoffs for A, B, and C: 90%, 80%, and 70%
Cutoffs may be lowered (they will not be raised)
So if your total exceeds 90%, you get an A.

Homework 1

Due in class next Tuesday, September 5
Problems 1.1, 1.2, and 1.3
Late penalty: 20% per weekday

Important Dates

Midterm 1: Tuesday, October 3
Fall break: October 7-9
Midterm 2: Tuesday, November 7
Thanksgiving: November 22-25
Last day of this course: Friday, December 8
Finals week: December 13-19

History of Computers

Mechanical Era (1600’s – 1940’s)
Electronic Era (1945 – present)

Mechanical Era

Pascal (1642)
Leibniz (1673)
Babbage (1822)
Boole (1847)
Hollerith (1889)
Zuse (1938)
Aiken (1943)

Electronic Era

Generation 1 (1945 – 1958): Vacuum tubes, von Neumann architecture

Generation 2 (1958 – 1964): Transistors, HLL, core memory

Generation 3 (1964 – 1974): ICs, semiconductor memory, microprogramming and multiprogramming

Generation 4 (1974 – present): LSI, VLSI, MPP, PC; 32 years!

Software/Internet Era?

1980’s – present
UNIX – Sun Microsystems
Windows – Microsoft
Web browser – Netscape → AOL → TWX
E-commerce – Yahoo!, Amazon, eBay
Search engine – Google (newest villain?)

Technology Trends

Transistor density up 35% per year
DRAM:
  Density up 40-60% per year
  Cycle time down 1/3 per decade

Cache design

Discrete Leaps

32-bit microprocessor: early 1980’s
Level 1 cache on chip: late 1980’s
Pentium 2 and Celeron
486 – lawsuit on numbers

Significant Technology Companies

Bell Labs
IBM
CDC
Cray → SGI
Xerox PARC
  Mac, laser printer, 3Com, Adobe
DEC → Compaq → HP

MIPS

What does MIPS stand for?
Machines with higher MIPS rate seem faster
Problem:
  Compare machines with different instruction sets
ISA: instruction set architecture

MIPS

Company founded by one author of the textbook
Microprocessor without Interlocked Pipeline Stages

MIPS Example

FPU vs. SW routines for FP operations
FPU uses less time and fewer instructions
SW uses many simple integer instructions, leading to higher MIPS rate
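The pitfall above can be sketched in a few lines of Python. This is a minimal illustration, not a measurement: the instruction counts and run times are made-up numbers, and `mips_rate` is a hypothetical helper that takes MIPS to mean millions of instructions executed per second.

```python
def mips_rate(instruction_count: float, seconds: float) -> float:
    """Millions of instructions executed per second."""
    return instruction_count / (seconds * 1e6)

# Hypothetical numbers for the SAME program (assumptions, not measurements).
# With an FPU, each FP operation is a single instruction.
fpu_instructions, fpu_seconds = 10e6, 1.0
# With software routines, each FP operation expands into many
# simple integer instructions, so the count and the run time both grow.
sw_instructions, sw_seconds = 150e6, 5.0

fpu_mips = mips_rate(fpu_instructions, fpu_seconds)
sw_mips = mips_rate(sw_instructions, sw_seconds)

print(f"FPU version: {fpu_mips:.1f} MIPS, finishes in {fpu_seconds:.1f} s")
print(f"SW version:  {sw_mips:.1f} MIPS, finishes in {sw_seconds:.1f} s")
```

With these assumed numbers the software version reports the higher MIPS rate even though it takes five times as long, which is why execution time, not MIPS, is the quantity to compare.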

MFLOPS

Mega flop?
Similar difficulty: add/subtract, square root

Performance Analysis

Real programs: Word
Kernels: Livermore loops, Linpack
Synthetic benchmarks: Whetstone, Dhrystone
Toy benchmarks: quicksort
SPEC – Standard Performance Evaluation Corporation

SPECint Performance

VAX 11/780 in 1984 = 1

Four Rules

CPU performance equation
Amdahl’s Law
Principle of locality
Price performance

CPU Performance Equation

CPU time = IC × CPI × cct
IC: instruction count
  Depends on ISA and compiler
CPI: cycles per instruction
  Depends on ISA and pipelining
cct: clock cycle time
  Depends on hardware technology
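A small Python sketch of the CPU performance equation follows. The two design points are invented for illustration (they are not from the slides or the textbook), and `cpu_time` is a hypothetical helper name; the point is only that the three factors trade off against each other.

```python
def cpu_time(ic: float, cpi: float, cct: float) -> float:
    """CPU time in seconds = instruction count x cycles per instruction x clock cycle time."""
    return ic * cpi * cct

# Hypothetical baseline: 1e9 instructions, CPI of 2.0, 1 ns clock cycle (1 GHz).
baseline = cpu_time(ic=1e9, cpi=2.0, cct=1e-9)

# Hypothetical alternative: a compiler change removes 10% of the instructions,
# but the remaining mix is slightly more expensive (CPI rises to 2.1).
alternative = cpu_time(ic=0.9e9, cpi=2.1, cct=1e-9)

print(f"baseline:    {baseline:.2f} s")
print(f"alternative: {alternative:.2f} s")
print(f"speedup:     {baseline / alternative:.3f}")
```

Under these assumptions the alternative still wins, because the 10% drop in IC outweighs the 5% rise in CPI; judging by IC or CPI alone would not show that.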

Two Supercomputers

Cray X-MP and Hitachi S810/20b
P1: A(i) = B(i) + C(i) + D(i) + E(i)
  vector length 1,000 done 100,000 times
P2: Vectorized FFT
  vector lengths 64, 32, 16, 8, 4, 2

            Cray    Hitachi
P1 (sec)    2.6     1.3
P2 (sec)    3.9     7.7
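The timings above can be compared with a short Python sketch. The times are the ones on the slide; the helper name `times_faster` and the idea of summing one run of each program are assumptions made here for illustration only.

```python
# Execution times in seconds, copied from the table above.
times = {
    "Cray":    {"P1": 2.6, "P2": 3.9},
    "Hitachi": {"P1": 1.3, "P2": 7.7},
}

def times_faster(program: str, fast: str, slow: str) -> float:
    """How many times faster machine `fast` runs `program` than machine `slow`."""
    return times[slow][program] / times[fast][program]

p1_ratio = times_faster("P1", fast="Hitachi", slow="Cray")
p2_ratio = times_faster("P2", fast="Cray", slow="Hitachi")

# Assumption: a workload consisting of one run of each program.
totals = {machine: sum(t.values()) for machine, t in times.items()}

print(f"P1: Hitachi is {p1_ratio:.2f}x faster than Cray")
print(f"P2: Cray is {p2_ratio:.2f}x faster than Hitachi")
print(f"Total for P1 + P2: Cray {totals['Cray']:.1f} s, Hitachi {totals['Hitachi']:.1f} s")
```

Neither machine is faster on both programs, so the ranking depends on the workload: the Hitachi wins on the long vectors of P1, while the Cray wins on the short vectors of P2.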

Amdahl’s Law

Speedup = Old time / New time
Fraction of enhanced: f
Speedup of enhanced: S
Speedup = 1 / [ (1 − f) + f / S ]

Examples

f = 0.2, S = 10 → Speedup = 1.22

f = 0.5, S = 1.6 → Speedup = 1.23

Consider MPP. Let f = 0.9 and S = 1,000,000. What is the bound on speedup?
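The examples above can be checked with a short Python sketch of Amdahl's Law. The function names are hypothetical; the formula is the one stated on the slide, and the bound 1 / (1 − f) is what the formula approaches as S grows without limit.

```python
def amdahl_speedup(f: float, s: float) -> float:
    """Overall speedup when a fraction f of the old time is sped up by a factor s."""
    return 1.0 / ((1.0 - f) + f / s)

def amdahl_bound(f: float) -> float:
    """Limit of the overall speedup as s grows without bound."""
    return 1.0 / (1.0 - f)

ex1 = amdahl_speedup(f=0.2, s=10)
ex2 = amdahl_speedup(f=0.5, s=1.6)
mpp = amdahl_speedup(f=0.9, s=1_000_000)

print(f"f = 0.2, S = 10        -> {ex1:.2f}")
print(f"f = 0.5, S = 1.6       -> {ex2:.2f}")
print(f"f = 0.9, S = 1,000,000 -> {mpp:.4f}")
print(f"bound for f = 0.9      -> {amdahl_bound(0.9):.1f}")
```

For the MPP question, the 10% of the time that is not enhanced caps the overall speedup at 10, no matter how large S is.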

Principle of Locality

Program reuses data and instructions used recently
Program spends 90% execution time in 10% of code
Predict which instructions and data the program will use based on accesses in the past → instruction and data caches, branch prediction

Two Types of Locality

Temporal locality: recently accessed items are likely to be accessed again soon

Spatial locality: items whose addresses are near one another tend to be referenced close together in time
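To see why both kinds of locality matter for caches, here is a minimal Python sketch of a direct-mapped cache that only counts hits. The cache parameters (64 lines of 16 bytes) and the three access patterns are assumptions chosen for illustration; `hit_rate` is a hypothetical helper, not a model of any real machine.

```python
def hit_rate(addresses, num_lines=64, block_size=16):
    """Fraction of accesses that hit in a tiny direct-mapped cache."""
    lines = [None] * num_lines          # block number currently held by each line
    hits = 0
    total = 0
    for addr in addresses:
        block = addr // block_size      # which memory block the byte belongs to
        index = block % num_lines       # the one cache line that block may occupy
        if lines[index] == block:
            hits += 1
        else:
            lines[index] = block        # miss: fetch the block, evicting the old one
        total += 1
    return hits / total

# Spatial locality: walk 4096 consecutive byte addresses once.
sequential = hit_rate(range(4096))

# Temporal locality: sweep the same 512-byte region eight times.
repeated = hit_rate(list(range(512)) * 8)

# No locality: every access is a new block that maps to the same cache line.
conflicting = hit_rate(i * 64 * 16 for i in range(4096))

print(f"sequential:  {sequential:.4f}")
print(f"repeated:    {repeated:.4f}")
print(f"conflicting: {conflicting:.4f}")
```

In this sketch the sequential walk misses once per 16-byte block and hits on the other 15 bytes, the repeated sweep misses only on its first pass, and the conflicting pattern never hits at all.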

Price Performance

MIPS rate of machine divided by its price
Are supercomputers competitive in terms of price performance?
Many applications need answers as quickly as possible, e.g., military, finance, and science

Integer Performance & Price-Performance

FP Performance & Price-Performance

Embedded Processors: Performance

Embedded Processors: Price Performance

Embedded Processors: Performance per Watt
