cs 531 advanced computer architecture - auburn universityxqin/courses/cs531/lec01.pdfcs 531, new...


Slide 01-1CS 531, New Mexico Tech

CS 531 Advanced Computer Architecture

These slides are adapted from notes by Dr. David E. Culler (U.C. Berkeley)

Dr. Xiao Qin
New Mexico Tech

http://www.cs.nmt.edu/~xqin
[email protected]

Slide 01-2CS 531, New Mexico Tech

Sony PlayStation 3

Slide 01-3CS 531, New Mexico Tech

Insomniac Games

From IEEE Spectrum, Dec. 2006 Issue

Slide 01-4CS 531, New Mexico Tech

Resistance – Fall of Man

Source: images.playsyde.com

Slide 01-5CS 531, New Mexico Tech

Resistance – Fall of Man

Source: www.insomniacgames.com

Slide 01-6CS 531, New Mexico Tech

Slide 01-7CS 531, New Mexico Tech

Multi-Core CPU Classification

Homogeneous Multi-Core Processor Configuration

Heterogeneous Multi-Core Processor Configuration

Slide 01-8CS 531, New Mexico Tech

Cell Multi-Core Processor

Slide 01-9CS 531, New Mexico Tech

Cure for the Multicore Blues

• Michael McCool
• The RapidMind Platform
• The prescription for programmers paralyzed by parallel processing
• Applications:
  – Real-time ray tracing
  – Games
  – Simulations

From IEEE Spectrum, Jan. 2007 Issue

Slide 01-10CS 531, New Mexico Tech

A crowd simulation of 16,000 individual chickens

From IEEE Spectrum, Jan. 2007 Issue

Slide 01-11CS 531, New Mexico Tech

Today’s Goal:

• Course Objectives
• Course Content & Grading
• Introduce you to Parallel Computer Architecture
• Answer your questions about CS 531
• Provide you a sense of the trends that shape the field

Slide 01-12CS 531, New Mexico Tech

CS 531: Semester Calendar

http://www.cs.nmt.edu/~xqin/courses/cs531

See the class webpage for the most up to date version!

Slide 01-13CS 531, New Mexico Tech

Slide 01-14CS 531, New Mexico Tech

Slide 01-15CS 531, New Mexico Tech

Slide 01-16CS 531, New Mexico Tech

Slide 01-17CS 531, New Mexico Tech

Slide 01-18CS 531, New Mexico Tech

What will you get out of CS531?

• In-depth understanding of the design and engineering of modern parallel computers
  – technology forces
  – fundamental architectural issues
    • naming, replication, communication, synchronization
  – basic design techniques
    • cache coherence, protocols, networks, pipelining, …
  – methods of evaluation
  – underlying engineering trade-offs
• from moderate to very large scale
• across the hardware/software boundary

Slide 01-19CS 531, New Mexico Tech

Will it be worthwhile?

• Absolutely!
  – even though few of you will become PP designers
• The fundamental issues and solutions translate across a wide spectrum of systems.
  – Crisp solutions in the context of parallel machines.
• Pioneered at the thin end of the platform pyramid on the most demanding applications
  – migrate downward with time
• Understand implications for software

[Figure: platform pyramid, top to bottom: SuperServers, Departmental Servers, Workstations, Personal Computers]

Slide 01-20CS 531, New Mexico Tech

Topic Coverage

• David E. Culler, Jaswinder Pal Singh, and Anoop Gupta, “Parallel Computer Architecture: A Hardware/Software Approach,” ISBN 1-55860-343-3.
• Covers (These topics may change)
  – Fundamental computer architecture design issues
  – Perspective on parallel programming
  – Performance across the hardware/software boundary
  – Shared memory multiprocessors
  – Evaluating protocol design tradeoffs
  – Snoop-based multiprocessor design
  – Scalable multiprocessors
  – Directory-based cache coherence
  – Hardware/software tradeoffs
  – Latency tolerance
  – Interconnection network design

Slide 01-21CS 531, New Mexico Tech

Course Syllabus

• Prerequisite: CS 331 Computer Architecture
• 1 midterm exam and no final exam
• Grading
  – Class Participation 10%
  – Exam 30%
  – Proposal 10%
  – Progress Report 10%
  – Presentation 10%
  – Technical Report 30%

Slide 01-22CS 531, New Mexico Tech

Course Syllabus (cont.)

• Scale
  – Letter grades will be awarded based on the following scale. This scale may be adjusted upwards if it is necessary based on the final grades.
  – A+ >= 97   A >= 93   A- >= 90
    B+ >= 87   B >= 83   B- >= 80
    C+ >= 77   C >= 73   C- >= 70
    D+ >= 67   D >= 63   D- >= 60
    F < 60

Slide 01-23CS 531, New Mexico Tech

Office Hours and Exams

Mid-term Wednesday 3/5/2007

Office hours: Wednesday 1:30-3:00pm

Slide 01-24CS 531, New Mexico Tech

Am I going to read the book to you?

• NO!
• Book provides a framework and complete background, so lectures can be more interactive.
  – You do the reading
  – We’ll discuss it
• Projects will go “beyond”

Slide 01-25CS 531, New Mexico Tech

Questions

Please ask at any time!

Slide 01-26CS 531, New Mexico Tech

What is Parallel Architecture?

• A parallel computer is a collection of processing elements that cooperate to solve large problems fast
• Some broad issues:
  – Resource Allocation:
    • how large a collection?
    • how powerful are the elements?
    • how much memory?
  – Data access, Communication and Synchronization
    • how do the elements cooperate and communicate?
    • how are data transmitted between processors?
    • what are the abstractions and primitives for cooperation?
  – Performance and Scalability
    • how does it all translate into performance?
    • how does it scale?

Slide 01-27CS 531, New Mexico Tech

Why Study Parallel Architecture?

Slide 01-28CS 531, New Mexico Tech

Why Study Parallel Architecture?

Slide 01-29CS 531, New Mexico Tech

Why Study Parallel Architecture?

Slide 01-30CS 531, New Mexico Tech

Why Study Parallel Architecture?

Role of a computer architect:
To design and engineer the various levels of a computer system to maximize performance and programmability within limits of technology and cost.

Parallelism:
• Provides alternative to faster clock for performance
• Applies at all levels of system design
• Is a fascinating perspective from which to view architecture
• Is increasingly central in information processing

Slide 01-31CS 531, New Mexico Tech

Why Study it Today?

• History: diverse and innovative organizational structures, often tied to novel programming models
• Rapidly maturing under strong technological constraints
  – The “killer micro” is ubiquitous
  – Laptops and supercomputers are fundamentally similar!
  – Technological trends cause diverse approaches to converge
• Technological trends make parallel computing inevitable
• Need to understand fundamental principles and design tradeoffs, not just taxonomies
  – Naming, Ordering, Replication, Communication performance

Slide 01-32CS 531, New Mexico Tech

Is Parallel Computing Inevitable?

• Application demands: Our insatiable need for computing cycles
• Technology Trends
• Architecture Trends
• Economics
• Current trends:
  – Today’s microprocessors have multiprocessor support
  – Servers and workstations becoming MP: Sun, SGI, DEC, COMPAQ!...
  – Tomorrow’s microprocessors are multiprocessors

Slide 01-33CS 531, New Mexico Tech

Application Trends

• Application demand for performance fuels advances in hardware, which enables new applications, which...
  – Cycle drives exponential increase in microprocessor performance
  – Drives parallel architecture harder
    • most demanding applications
• Range of performance demands
  – Need range of system performance with progressively increasing cost

[Figure: cycle linking New Applications and More Performance]

Slide 01-34CS 531, New Mexico Tech

Speedup

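• Speedup (p processors) =

\[ \text{Speedup}(p\ \text{processors}) = \frac{\text{Performance}(p\ \text{processors})}{\text{Performance}(1\ \text{processor})} \]

• For a fixed problem size (input data set), performance = 1/time

• Speedup fixed problem (p processors) =

\[ \text{Speedup}_{\text{fixed problem}}(p\ \text{processors}) = \frac{\text{Time}(1\ \text{processor})}{\text{Time}(p\ \text{processors})} \]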
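The two speedup definitions on this slide can be checked with a short Python sketch. The function names and the timing values below are hypothetical, chosen only for illustration. The sketch shows that, for a fixed problem size where performance = 1/time, the performance ratio and the time ratio give the same speedup.

```python
import math


def speedup_from_performance(perf_1: float, perf_p: float) -> float:
    """Speedup = Performance(p processors) / Performance(1 processor)."""
    if perf_1 <= 0 or perf_p <= 0:
        raise ValueError("performance values must be positive")
    return perf_p / perf_1


def speedup_from_times(time_1: float, time_p: float) -> float:
    """Fixed-problem speedup = Time(1 processor) / Time(p processors)."""
    if time_1 <= 0 or time_p <= 0:
        raise ValueError("execution times must be positive")
    return time_1 / time_p


# Hypothetical measurements for one fixed input data set.
time_1 = 120.0  # seconds on 1 processor (assumed value)
time_p = 20.0   # seconds on p processors (assumed value)

s_time = speedup_from_times(time_1, time_p)
# With performance defined as 1/time, both definitions agree.
s_perf = speedup_from_performance(1.0 / time_1, 1.0 / time_p)
assert math.isclose(s_time, s_perf)
```

When the problem size grows with the number of processors, the time-based form no longer applies directly, and the performance-based ratio is the one to use.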

Slide 01-35CS 531, New Mexico Tech

Slide 01-36CS 531, New Mexico Tech

Commercial Computing

• Relies on parallelism for high end
  – Computational power determines scale of business that can be handled
• Databases, online-transaction processing, decision support, data mining, data warehousing ...
• TPC benchmarks (TPC-C order entry, TPC-D decision support)
  – Explicit scaling criteria provided
  – Size of enterprise scales with size of system
  – Problem size not fixed as p increases.
  – Throughput is performance measure (transactions per minute or tpm)

Slide 01-37CS 531, New Mexico Tech

TPC-C Results for March 1996

[Figure: throughput (tpmC, 0 to 25,000) versus number of processors (0 to 120) for Tandem Himalaya, DEC Alpha, SGI PowerChallenge, HP PA, IBM PowerPC, and other systems]

Observations?

• Parallelism is pervasive
• Small to moderate scale parallelism very important
• Difficult to obtain snapshot to compare across vendor platforms

Slide 01-38CS 531, New Mexico Tech

Scientific Computing Demand

Slide 01-39CS 531, New Mexico Tech

Engineering Computing Demand

• Large parallel machines a mainstay in many industries
  – Petroleum (reservoir analysis)
  – Automotive (crash simulation, drag analysis, combustion efficiency)
  – Aeronautics (airflow analysis, engine efficiency, structural mechanics, electromagnetism)
  – Computer-aided design
  – Pharmaceuticals (molecular modeling)
  – Visualization
    • in all of the above
    • entertainment (films like Toy Story)
    • architecture (walk-throughs and rendering)
  – Financial modeling (yield and derivative analysis)
  – etc.

Slide 01-40CS 531, New Mexico Tech

Applications: Speech and Image Processing

[Figure: processing requirements of speech and image applications, 1980 to 1995, on a scale from 1 MIPS to 10 GIPS. Applications shown: Sub-Band Speech Coding; 200 Words Isolated Speech Recognition; Speaker Verification; CELP Speech Coding; ISDN-CD Stereo Receiver; 5,000 Words Continuous Speech Recognition; HDTV Receiver; CIF Video; 1,000 Words Continuous Speech Recognition; Telephone Number Recognition]

• Also CAD, Databases, ...
• 100 processors gets you 10 years, 1000 gets you 20!

Slide 01-41CS 531, New Mexico Tech

Is better parallel arch enough?

• AMBER molecular dynamics simulation program
• Starting point was vector code for Cray-1
• 145 MFLOP on Cray90, 406 for final version on 128-processor Paragon, 891 on 128-processor Cray T3D

Why?

Slide 01-42CS 531, New Mexico Tech

Summary of Application Trends

• Transition to parallel computing has occurred for scientific and engineering computing
• In rapid progress in commercial computing
  – Database and transactions as well as financial
  – Usually smaller-scale, but large-scale systems also used
• Desktop also uses multithreaded programs, which are a lot like parallel programs
• Demand for improving throughput on sequential workloads
  – Greatest use of small-scale multiprocessors
• Solid application demand exists and will increase

Slide 01-43CS 531, New Mexico Tech

Technology Trends

[Figure: performance (0.1 to 100, log scale) from 1965 to 1995 for supercomputers, mainframes, minicomputers, and microprocessors]

Observations?

• Today the natural building-block is also fastest!

Slide 01-44CS 531, New Mexico Tech

Can’t we just wait for it to get faster?

• Microprocessor performance increases 50% - 100% per year
• Transistor count doubles every 3 years
• DRAM size quadruples every 3 years
• Huge investment per generation is carried by huge commodity market

[Figure: integer and FP performance (0 to 180), 1987 to 1992, for Sun 4/260, MIPS M/120, IBM RS6000 540, MIPS M2000, HP 9000 750, and DEC Alpha]
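To compare rates quoted per year with rates quoted per three-year period, it helps to convert both to the same basis. The following Python sketch does that with plain compound-growth arithmetic. The function names are hypothetical, and the 10-year horizon is an assumption chosen for illustration, not a figure from the slide.

```python
def annual_rate(factor: float, years: float) -> float:
    """Annual growth rate equivalent to multiplying by `factor` every `years` years."""
    if factor <= 0 or years <= 0:
        raise ValueError("factor and years must be positive")
    return factor ** (1.0 / years) - 1.0


def growth_over(rate_per_year: float, years: float) -> float:
    """Total multiplier after `years` years of compound growth at `rate_per_year`."""
    return (1.0 + rate_per_year) ** years


# "Doubles every 3 years" and "quadruples every 3 years" as annual rates.
transistor_rate = annual_rate(2.0, 3.0)  # about 0.26, i.e. roughly 26% per year
dram_rate = annual_rate(4.0, 3.0)        # about 0.59, i.e. roughly 59% per year

# 50% and 100% per year compounded over an assumed 10-year horizon.
low_decade = growth_over(0.5, 10)   # about 57.7x
high_decade = growth_over(1.0, 10)  # 1024x
```

Even the low end of the quoted microprocessor range compounds to a large multiplier within a decade.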

Slide 01-45CS 531, New Mexico Tech

Technology: A Closer Look

[Figure: chip layout with processor (Proc), cache ($), and interconnect]

• Basic advance is decreasing feature size (λ)
  – Circuits become either faster or lower in power
• Die size is growing too
  – Clock rate improves roughly proportional to improvement in λ
  – Number of transistors improves like λ² (or faster)
• Performance > 100x per decade
  – clock rate < 10x, rest is transistor count
• How to use more transistors?
  – Parallelism in processing
    • multiple operations per cycle reduces CPI
  – Locality in data access
    • avoids latency and reduces CPI
    • also improves processor utilization
  – Both need resources, so tradeoff
• Fundamental issue is resource distribution, as in uniprocessors
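The two ways of lowering CPI named above (parallelism in processing, locality in data access) can be illustrated with the standard textbook cost model: effective CPI = base CPI + memory references per instruction × miss rate × miss penalty. The slide does not give this formula, so the Python sketch below is an assumption, and every number in it is hypothetical.

```python
import math


def effective_cpi(base_cpi: float, mem_refs_per_instr: float,
                  miss_rate: float, miss_penalty_cycles: float) -> float:
    """Base CPI plus average memory-stall cycles per instruction."""
    return base_cpi + mem_refs_per_instr * miss_rate * miss_penalty_cycles


def execution_time(instr_count: float, cpi: float, clock_hz: float) -> float:
    """CPU time in seconds = instruction count * CPI / clock rate."""
    return instr_count * cpi / clock_hz


# Hypothetical baseline: 1 instruction per cycle, 5% misses, 40-cycle penalty.
baseline = effective_cpi(1.0, 0.3, 0.05, 40.0)

# Spend transistors on parallelism: two operations per cycle halves the base CPI.
more_parallelism = effective_cpi(0.5, 0.3, 0.05, 40.0)

# Spend transistors on locality: a larger cache cuts the miss rate to 2%.
more_locality = effective_cpi(1.0, 0.3, 0.02, 40.0)

assert more_parallelism < baseline and more_locality < baseline
assert math.isclose(execution_time(1e9, baseline, 1e9), baseline)
```

Both options lower the effective CPI, but each consumes chip area, which is the resource-distribution tradeoff the slide refers to.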

Slide 01-46CS 531, New Mexico Tech

Growth Rates

[Figure: clock rate (MHz, 0.1 to 1,000, log scale) versus year, 1970 to 2005, growing about 30% per year. Labeled processors: i4004, i8008, i8080, i8086, i80286, i80386, Pentium100, R10000]

[Figure: transistors (1,000 to 100,000,000, log scale) versus year, 1970 to 2005, growing about 40% per year. Labeled processors: i4004, i8008, i8080, i8086, i80286, i80386, R2000, R3000, Pentium, R10000]

Slide 01-47CS 531, New Mexico Tech

Architectural Trends

• Architecture translates technology’s gifts into performance and capability
• Resolves the tradeoff between parallelism and locality
  – Current microprocessor: 1/3 compute, 1/3 cache, 1/3 off-chip connect
  – Tradeoffs may change with scale and technology advances
• Understanding microprocessor architectural trends
  – Helps build intuition about design issues of parallel machines
  – Shows fundamental role of parallelism even in “sequential” computers