CSC 4250 Computer Architectures, August 29, 2006, Chap. 1. Fundamentals of Computer Design
CSC 4250 Computer Architectures
August 29, 2006
Chap. 1. Fundamentals of Computer Design
What you will learn in this class
  Quantitative approach
  Instruction set principles
  Floating-point numbers and arithmetic
  Basic pipelining
  Advanced pipelining
  Caches
  Virtual memory
Syllabus
  Chap. 1. Fundamentals of Computer Design
  Chap. 2. Instruction Set Principles
  Appx. H. Computer Arithmetic
  Appx. A. Pipelining
  Chap. 3. Instruction Level Parallelism: Hardware
  Chap. 4. Instruction Level Parallelism: Software
  Chap. 5. Memory Hierarchy
How to determine your letter grade
  Eleven homework assignments: 20%
  Midterm 1: 20%
  Midterm 2: 20%
  Final exam: 40%
  Cutoffs for A, B, and C: 90%, 80%, and 70%
  Cutoffs may be lowered (they will not be raised)
  So if your total exceeds 90%, you get an A.
Homework 1
  Due in class next Tuesday, September 5
  Problems 1.1, 1.2, and 1.3
  Late penalty: 20% per weekday
Important Dates
  Midterm 1: Tuesday, October 3
  Fall break: October 7-9
  Midterm 2: Tuesday, November 7
  Thanksgiving: November 22-25
  Last day of this course: Friday, December 8
  Finals week: December 13-19
History of Computers
  Mechanical Era (1600’s – 1940’s)
  Electronic Era (1945 – present)
Mechanical Era
  Pascal (1642)
  Leibniz (1673)
  Babbage (1822)
  Boole (1847)
  Hollerith (1889)
  Zuse (1938)
  Aiken (1943)
Electronic Era
  Generation 1 (1945 – 1958)
    Vacuum tubes, von Neumann architecture
  Generation 2 (1958 – 1964)
    Transistors, high-level languages (HLL), core memory
  Generation 3 (1964 – 1974)
    ICs, semiconductor memory, microprogramming and multiprogramming
  Generation 4 (1974 – present)
    LSI, VLSI, MPP, PC; 32 years!
Software/Internet Era?
  1980’s – present
  UNIX – Sun Microsystems
  Windows – Microsoft
  Web browser – Netscape → AOL → TWX
  E-commerce – Yahoo!, Amazon, eBay
  Search engine – Google (newest villain?)
Technology Trends
  Transistor density up 35% per year
  DRAM:
    Density up 40-60% per year
    Cycle time down 1/3 per decade
  Cache design
Discrete Leaps
  32-bit microprocessor: early 1980’s
  Level 1 cache on chip: late 1980’s
  Pentium 2 and Celeron
  486 – lawsuit on numbers
Significant Technology Companies
  Bell Labs
  IBM
  CDC
  Cray → SGI
  Xerox PARC
    Mac, laser printer, 3Com, Adobe
  DEC → Compaq → HP
MIPS
  What does MIPS stand for?
  Machines with a higher MIPS rate seem faster
  Problem:
    Comparing machines with different instruction sets
    ISA: instruction set architecture
MIPS
  Company founded by one author of the textbook
  Microprocessor without Interlocked Pipelined Stages
MIPS Example
  FP hardware vs. software (SW) routines for FP operations
  The FPU uses less time and fewer instructions
  SW uses many simple integer instructions, leading to a higher MIPS rate
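The point of this slide can be checked with a little arithmetic. The sketch below uses made-up numbers (the instruction counts, CPI values, and 100 MHz clock are hypothetical, not from the lecture) and the usual definition MIPS = instruction count / (execution time × 10^6).

```python
def exec_time(instruction_count, cpi, clock_hz):
    """Execution time in seconds: IC * CPI / clock rate."""
    return instruction_count * cpi / clock_hz


def mips_rate(instruction_count, exec_time_s):
    """MIPS = instruction count / (execution time * 10^6)."""
    return instruction_count / (exec_time_s * 1e6)


CLOCK_HZ = 100e6  # hypothetical 100 MHz clock

# Hypothetical workload run two ways on the same machine.
# With an FPU: few instructions, but FP instructions take several cycles each.
fpu_ic, fpu_cpi = 1_000_000, 4.0
# Without an FPU: each FP operation expands into many simple integer instructions.
sw_ic, sw_cpi = 20_000_000, 1.25

fpu_time = exec_time(fpu_ic, fpu_cpi, CLOCK_HZ)
sw_time = exec_time(sw_ic, sw_cpi, CLOCK_HZ)
fpu_mips = mips_rate(fpu_ic, fpu_time)
sw_mips = mips_rate(sw_ic, sw_time)

print(f"FPU: {fpu_time:.2f} s, {fpu_mips:.0f} MIPS")
print(f"SW:  {sw_time:.2f} s, {sw_mips:.0f} MIPS")
```

With these assumed numbers the software version reports the higher MIPS rate even though it takes several times longer to finish, which is why execution time, not MIPS, is the measure to trust.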
MFLOPS
  Mega flop?
  Similar difficulty: add/subtract, square root
Performance Analysis
  Real programs: Word
  Kernels: Livermore loops, Linpack
  Synthetic benchmarks: Whetstone, Dhrystone
  Toy benchmarks: quicksort
  SPEC – Standard Performance Evaluation Corporation
SPECint Performance
  VAX 11/780 in 1984 = 1
Four Rules
  CPU performance equation
  Amdahl’s Law
  Principle of locality
  Price performance
CPU Performance Equation
  CPU time = IC × CPI × cct
  IC: instruction count
    Depends on ISA and compiler
  CPI: cycles per instruction
    Depends on ISA and pipelining
  cct: clock cycle time
    Depends on hardware technology
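As a small worked example of the equation above, the sketch below compares two hypothetical configurations. All numbers are assumptions chosen for illustration: a compiler change that cuts the instruction count by 10% but raises the average CPI from 1.5 to 1.6, on a 1 ns clock.

```python
def cpu_time(ic, cpi, cct):
    """CPU time = instruction count * cycles per instruction * clock cycle time."""
    return ic * cpi * cct


CCT = 1e-9  # hypothetical clock cycle time: 1 ns (a 1 GHz clock)

# Hypothetical baseline: 2 billion instructions at an average CPI of 1.5.
base_time = cpu_time(ic=2_000_000_000, cpi=1.5, cct=CCT)

# Hypothetical optimized build: 10% fewer instructions, slightly higher CPI.
opt_time = cpu_time(ic=1_800_000_000, cpi=1.6, cct=CCT)

print(f"baseline:  {base_time:.2f} s")
print(f"optimized: {opt_time:.2f} s")
print(f"speedup:   {base_time / opt_time:.3f}")
```

The three factors trade off against each other, so a change has to be judged by the product rather than by any single factor.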
Two Supercomputers
  Cray X-MP and Hitachi S810/20b
  P1: A(i) = B(i) + C(i) + D(i) + E(i)
    vector length 1,000, done 100,000 times
  P2: Vectorized FFT
    vector lengths 64, 32, 16, 8, 4, 2
            Cray   Hitachi
  P1 (sec)  2.6    1.3
  P2 (sec)  3.9    7.7
Amdahl’s Law
  Speedup = Old time / New time
  Fraction enhanced: f
  Speedup of enhanced portion: S
  Speedup = 1 / [ (1 − f) + f / S ]
Examples
f = 0.2, S = 10 → Speedup = 1.22
f = 0.5, S = 1.6 → Speedup = 1.23
Consider an MPP. Let f = 0.9 and S = 1,000,000. What is the bound on speedup?
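The examples above can be checked numerically. This is a minimal sketch of Amdahl's Law as stated on the slide; the function names `amdahl_speedup` and `speedup_bound` are illustrative choices, and the bound is the limit of the formula as S grows without limit, 1 / (1 − f).

```python
def amdahl_speedup(f, s):
    """Overall speedup when a fraction f of the old time is sped up by a factor s."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - f) + f / s)


def speedup_bound(f):
    """Limit of the speedup as s grows without bound: 1 / (1 - f)."""
    if f >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - f)


print(round(amdahl_speedup(0.2, 10), 2))   # first example on the slide
print(round(amdahl_speedup(0.5, 1.6), 2))  # second example on the slide

# MPP question: even a million-fold speedup of 90% of the work
# cannot beat the bound set by the remaining 10%.
mpp = amdahl_speedup(0.9, 1_000_000)
bound = speedup_bound(0.9)
print(f"MPP speedup {mpp:.5f}, bound {bound:.1f}")
```

For f = 0.9 the bound works out to 10, so the unenhanced 10% of the run dominates no matter how many processors are added.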
Principle of Locality
  A program reuses data and instructions it has used recently
  A program spends 90% of its execution time in 10% of its code
  Predict which instructions and data the program will use based on accesses in the past
    → instruction and data caches, branch prediction
Two Types of Locality
  Temporal locality: recently accessed items are likely to be accessed again soon
  Spatial locality: items whose addresses are near one another tend to be referenced close together in time
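To make the two kinds of locality concrete, here is a minimal sketch of a direct-mapped cache simulator. The cache geometry (64 lines, 8 words per block) and the three access patterns are assumptions chosen for illustration, not parameters from the lecture.

```python
def hit_rate(addresses, num_lines=64, block_size=8):
    """Fraction of accesses that hit in a simulated direct-mapped cache."""
    lines = [None] * num_lines  # block number currently held by each cache line
    hits = 0
    for addr in addresses:
        block = addr // block_size   # which memory block the word belongs to
        index = block % num_lines    # the one cache line that block can occupy
        if lines[index] == block:
            hits += 1
        else:
            lines[index] = block     # miss: fetch the block, evicting the old one
    return hits / len(addresses)


N = 4096

# Spatial locality: consecutive addresses share a block, so one miss
# brings in the next seven words as well.
sequential = list(range(N))

# Temporal locality: a small working set (256 words) is reused 16 times
# and fits in the cache after the first pass.
repeated = list(range(256)) * 16

# Neither: every access lands in a new block that is never touched again.
strided = [i * 8 for i in range(N)]

for name, trace in [("sequential", sequential),
                    ("repeated", repeated),
                    ("strided", strided)]:
    print(f"{name:10s} hit rate = {hit_rate(trace):.4f}")
```

Under these assumptions the sequential trace hits on 7 of every 8 accesses, the repeated trace hits almost always, and the strided trace never hits.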
Price Performance
  MIPS rate of a machine divided by its price
  Are supercomputers competitive in terms of price performance?
  Many applications need answers as quickly as possible, e.g., military, finance, and science
Integer Performance & Price-Performance
FP Performance & Price-Performance
Embedded Processors: Performance
Embedded Processors: Price Performance
Embedded Processors: Performance per Watt