Computer Science 246 Advanced Computer Architecture Spring 2008 Harvard University Instructor: Prof. David Brooks [email protected]


Computer Science 246
Advanced Computer Architecture
Spring 2008
Harvard University
Instructor: Prof. David Brooks [email protected]

2

Course Outline
• Instructor
• Prerequisites
• Topics of Study
• Course Expectations
• Grading
• Class Scheduling Conflicts

3

Instructor
• Instructor: David Brooks ([email protected])
• Office Hours: TBD, MD141, stop by/email whenever

4

Prerequisites
• CS141 (or equivalent)
  • Logic Design (computer arithmetic)
  • Basic RISC ISA
  • Pipelining (control/data hazards, forwarding)
  • i.e., Hennessy & Patterson Jr. (HW/SW Interface)
• C Programming, UNIX for Project (or similar skills)
• Compilers, OS, Circuits/VLSI background is a plus, not needed

5

Readings and Resources
• Text: “Computer Architecture: A Quantitative Approach,” Hennessy and Patterson
• Key research papers (will be available on the web or as paper copies)
• SimpleScalar toolset
  • Widely used architectural simulator
  • SPEC2000 benchmarks
  • Will be used for some HW/Projects
  • Power/thermal modeling extensions available

6

Course Expectations
• Lecture Material
  • Short homework/quizzes to cover this material
• Seminar-style part of course:
  • Expectation: you will read the assigned papers before class so we can have a lively discussion about them
  • Paper reviews – short “paper review” highlighting interesting points, strengths/weaknesses of the paper
  • Bring discussion points to class
  • Discussion leadership – Students will be assigned to present the paper/lead the discussions

7

Course Expectations
• Course project
  • Several possible ideas will be given
  • Also you may come up with your own
  • Depending on enrollment, I will schedule weekly/bi-weekly meetings with each individual/group (1/2 hr per project) to discuss results/progress
  • There will be two presentations
    – First, a short “progress update” (late April)
    – Second, a final presentation scheduled at the end of reading week
    – Finally, a project writeup written in research paper style

8

Grading
• Grade Formula
  • Homeworks and/or Quiz – 25%
  • Class Participation – 25%
  • Project (including final project presentation) – 50%

9

Topics of CS246
• Introduction to Computer Architecture and Power-Aware Computing
• Modern CPU Design
  • Deep pipelines/Multiple Issue
  • Dynamic Scheduling/Speculative Execution
  • Memory Hierarchy Design
• Pentium Architecture Case Study
• Multiprocessors and Multithreading
• Embedded computing devices
• Dynamic Frequency/Voltage Scaling
• Thermal-aware processor design
• Power-related reliability issues
• System-Level Power Issues
• Software approaches to power management

10

Readings from Previous Years
• High-level Power modeling (power abstractions)
• Power Measurement for OS control
• Temperature Aware Computing
• Di/Dt Modeling
• Leakage Power Modeling and Control
• Frequency and Voltage Scheduling Algorithms
• Power in Data Centers, Google Cluster Architecture
• Dynamic Power Reduction in Memory
• Disk Power Modeling/Management
• Application and Compiler Power Management
• Dynamic adaptation for Multimedia applications
• Architectures for wireless sensor devices
• Low-power routing algorithms for wireless networked devices
• Intel XScale Microprocessor, IBM Watchpad
• Human Powered Computing

11

Why take CS246?
• Learn how modern computing hardware works
• Understand where computing hardware is going in the future
  • And learn how to contribute to this future…
• How does this impact system software and applications?
  • Essential to understand OS/compilers/PL
  • For everyone else, it can help you write better code!
• How are future technologies going to impact computing systems?

12

Architectural Inflection Point
• Computer architects strive to give maximum performance with programmer abstraction
  • Compilers, OS part of this abstraction
  • e.g. Pipelining, superscalar, speculative execution, branch prediction, caching, virtual memory…
• Technology has brought us to an inflection point
  • Multiple processors on a single chip -- Why?
    – Design complexity, ILP/pipelining limits, power dissipation, etc.
  • How to provide the abstraction?
  • Some burden will shift back to programmers

13

Topics of Study
• Focus on what modern computer architects worry about (both academia and industry)
• Get through the basics of modern processor design
• Look at technology trends: multithreading, CMP, power-, reliability-aware design
• Recent research ideas, and the future of computing hardware

14

What is Computer Architecture?
[Figure: layered diagram. Labels: Application Trends; Software: Applications (AI, DB, Graphics), Operating System, Prog. Lang, Compilers; Instruction Set Architecture; Microarchitecture; System Architecture; Hardware: VLSI/Hardware Implementations; Technology Trends.]

15

Application Areas
• General-Purpose Laptop/Desktop
  • Productivity, interactive graphics, video, audio
  • Optimize price-performance
  • Examples: Intel Pentium 4, AMD Athlon XP
• Embedded Computers
  • PDAs, cell-phones, sensors => Price, Energy efficiency, Form-Factor
  • Examples: Intel XScale, StrongARM (SA-110)
  • Game Machines, Network uPs => Price-Performance
  • Examples: Sony Emotion Engine, IBM 750FX

16

Application Areas
• Commercial Servers
  • Database, transaction processing, search engines
  • Performance, Availability, Scalability
  • Server downtime could cost a brokerage company more than $6M/hour
  • Examples: Sun Fire 15K, IBM p690, Google Cluster
• Scientific Applications
  • Protein Folding, Weather Modeling, CompBio, Defense
  • Floating-point arithmetic, Huge Memories
  • Examples: IBM DeepBlue, BlueGene, Cray T3E, etc.

17

Moore’s Law
• Every 18-24 months
  • Feature sizes shrink by 0.7x
  • Number of transistors per die increases by 2x
  • Speed of transistors increases by 1.4x
• But we are starting to hit some roadblocks…
• Also, what to do with all of these transistors???
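
The three scaling factors quoted for Moore's Law are tied together by simple geometry. The sketch below is a minimal illustration under idealized assumptions that are not stated on the slide: transistor area is taken to scale with the square of the feature size, and switching delay to scale linearly with it. The function name is hypothetical.

```python
def scale_one_generation(shrink=0.7):
    """Idealized per-generation gains from a linear feature-size shrink.

    Assumptions (not from the slide): transistor area scales with the
    square of the feature size, and gate delay scales linearly with it.
    """
    area = shrink ** 2        # area of one transistor relative to the old node
    density = 1.0 / area      # transistors that fit in the same die area
    speed = 1.0 / shrink      # relative switching speed
    return density, speed


density, speed = scale_one_generation(0.7)
print(f"density gain: {density:.2f}x, speed gain: {speed:.2f}x")
# prints "density gain: 2.04x, speed gain: 1.43x"
```

Under these assumptions a 0.7x shrink reproduces the roughly 2x transistor count and 1.4x speed figures listed above.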

18

Moore’s Law (Tx Density)
[Figure: transistors per die (log scale, 1,000 to 1,000,000,000) vs. year (1970 to 2010). Data points: 4004, 8086, 80286, 386, 486, Pentium, Pentium II, Pentium III, Itanium, Itanium 2.]

19

How have we used these transistors?
• More functionality on one chip
  • Early 1980s – 32-bit microprocessors
  • Late 1980s – On Chip Level 1 Caches
  • Early/Mid 1990s – 64-bit microprocessors, superscalar (ILP)
  • Late 1990s – On Chip Level 2 Caches
  • Early/Mid 2000s – Chip Multiprocessors, On Chip Level 3 Caches
• What is next?
  • How much more cache can we put on a chip? (Itanium2)
  • How many more cores can we put on a chip? (Niagara, etc.)
  • What else can we put on chips?

20

Moore’s Law (Performance)
[Figure: relative performance (0 to 1600) vs. year (1984 to 2000). Labeled machines: MIPS R2000, IBM Power1, HP 9000, DEC Alpha, Pentium III. Trend lines: 1.35x per year and 1.58x per year.]
Why has Performance Outstripped Moore’s Law?
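
To see why the question is worth asking, it helps to put the numbers on a common per-year basis. The sketch below is a rough comparison only: it annualizes the 1.4x transistor speed gain per 18-24 months from the Moore's Law slide and sets it against the 1.35x and 1.58x per-year trend lines from the performance plot. The helper name is hypothetical.

```python
def annual_rate(factor, months):
    """Convert a growth factor achieved over `months` into a per-year factor."""
    return factor ** (12.0 / months)


# Transistor speed: 1.4x every 18-24 months (from the Moore's Law slide).
speed_slow = annual_rate(1.4, 24)
speed_fast = annual_rate(1.4, 18)

# Per-year trend lines read off the performance plot.
perf_rates = (1.35, 1.58)

print(f"transistor speed alone: {speed_slow:.2f}x to {speed_fast:.2f}x per year")
for rate in perf_rates:
    print(f"performance at {rate}x per year: {rate ** 10:.0f}x over a decade")
```

Even the faster annualized transistor speed figure falls below both performance trend lines, which is the gap the next slide attributes to architectural and circuit innovations.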

21

Performance vs. Technology Scaling
• Architectural Innovations
  • Massive pipelining (good and bad!)
  • Huge caches
  • Branch Prediction, Register Renaming, OOO-issue/execution, Speculation (hardware/software versions of all of these)
• Circuit/Logic Innovations
  • New logic circuit families (dynamic logic)
  • Better CAD tools
  • Advanced computer arithmetic

22

Power Dissipation Trends
[Figure: power density (W/cm², log scale, 1 to 1000) vs. year (1980 to 2010). Data points: 386, 486, Pentium, Pentium Pro, Pentium 2, Pentium 3, Pentium 4, Pentium 4 (Prescott). Reference levels: Hot Plate, Nuclear Reactor.]

23

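Power Issues in Microprocessors
[Figure: four panels. Capacitive (Dynamic) Power: circuit with Vin, Vout, Vdd, and load CL. Static (Leakage) Power: circuit with VIN, VOUT, CL, ISub, and IGate. Temperature. Di/Dt (Vdd/Gnd Bounce): Voltage (V) and Current (A) traces over 20 cycles, with Minimum Voltage marked.]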
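
The slide names the two main power components without giving formulas. A commonly used first-order model, which is an assumption here and not taken from the slide, is dynamic power = activity × C × Vdd² × f and static power = Vdd × I_leak. The sketch below uses made-up parameter values purely to show why lowering voltage and frequency together cuts dynamic power roughly cubically.

```python
def dynamic_power(activity, capacitance_f, vdd, freq_hz):
    """First-order switching power in watts: activity * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd ** 2 * freq_hz


def static_power(vdd, leakage_a):
    """First-order leakage power in watts: Vdd * I_leak."""
    return vdd * leakage_a


# All values below are hypothetical, chosen only for illustration.
base = dynamic_power(activity=0.2, capacitance_f=100e-9, vdd=1.2, freq_hz=2e9)
scaled = dynamic_power(activity=0.2, capacitance_f=100e-9, vdd=0.96, freq_hz=1.6e9)
leak = static_power(vdd=1.2, leakage_a=5.0)

print(f"dynamic at nominal V and f: {base:.1f} W")
print(f"dynamic at 0.8x V and f:    {scaled:.1f} W ({scaled / base:.3f} of nominal)")
print(f"static (leakage):           {leak:.1f} W")
```

Scaling both voltage and frequency by 0.8 leaves 0.8³ = 0.512 of the nominal dynamic power in this model, which is the motivation behind the dynamic frequency/voltage scaling topic listed earlier.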

24

Power-Aware Computing Applications
[Figure: example systems grouped under two labels: Energy-Constrained Computing and Temperature/di-dt-Constrained.]

The Battery Gap
[Figure: Diverging Gap Between Actual Battery Capacities and Energy Needs. Energy (mAh, 0 to 5000) vs. year (2000 to 2007), plotting battery capacity (mAh) against energy requirement (mAh). Battery technologies: Lithium Ion, Lithium Polymer, Fuel Cells. Data rates: 10 kbps, 64 kbps, 384 kbps, 2 Mbps. Applications: PIM, SMS, Voice; Web browser, MMS, Video clips; Video email, Voice recognition, Mobile commerce; Mobile video-conferencing, Collaboration. Annotations: Downlink dominated, Interactive.]
Source: Anand Raghunathan, NEC Labs

26

Where does the juice go in laptops?

• Others have measured ~55% processor increase under max load in laptops [Hsu+Kremer, 2002]

27

Packaging cost
From Cray (local power generator and refrigeration)…

Source: Gordon Bell, “A Seymour Cray perspective,” http://www.research.microsoft.com/users/gbell/craytalk/

28

Packaging cost
To today…
• IBM S/390: refrigeration:
  • Provides performance (2% perf for 10ºC) and reliability

Source: R. R. Schmidt, B. D. Notohardjono, “High-end server low temperature cooling,” IBM Journal of R&D

29

Intel Itanium packaging

Complex and expensive (note heatpipe)

Source: H. Xie et al., “Packaging the Itanium Microprocessor,” Electronic Components and Technology Conference 2002

30

P4 packaging

• Simpler, but still…

Source: Intel web site

[Figure: Total Power-Related PC System Cost ($, 0 to 40) vs. Power (Watts, 0 to 40). From Tiwari, et al., DAC98.]

31

32

Cooking Aware Computing

33

Server Farms
• Internet data centers are like heavy-duty factories
  • e.g. small Datacenter: 25,000 sq. feet, 8000 servers, 2 MegaWatts
  • Intergate Datacenter, Tukwila, WA: 1.5 Mill. Sq. Ft, ~500 MW
• Wants lowest net cost per server per sq foot of data center space
• Cost driven by:
  • Racking height
  • Cooling air flow
  • Power delivery
  • Maintenance ease (access, weight)
• 25% of total cost due to power
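
The data center figures quoted on this slide can be turned into per-server and per-square-foot numbers with a little arithmetic. The sketch below assumes, for illustration only, that the quoted megawatts are total facility power spread evenly over the floor area and the servers; the variable and function names are hypothetical.

```python
def watts_per_unit(total_watts, units):
    """Average power per unit (server or square foot), assuming an even spread."""
    return total_watts / units


# Figures quoted on the slide.
small_watts, small_sqft, small_servers = 2e6, 25_000, 8_000
intergate_watts, intergate_sqft = 500e6, 1.5e6

small_per_server = watts_per_unit(small_watts, small_servers)
small_per_sqft = watts_per_unit(small_watts, small_sqft)
intergate_per_sqft = watts_per_unit(intergate_watts, intergate_sqft)

print(f"small datacenter: {small_per_server:.0f} W per server")
print(f"small datacenter: {small_per_sqft:.0f} W per sq ft")
print(f"Intergate:        {intergate_per_sqft:.0f} W per sq ft")
```

With these inputs the small datacenter works out to 250 W per server and 80 W per square foot, and the Intergate figures to roughly 333 W per square foot.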

34

Environment
• Environmental Protection Agency (EPA): computers account for 10% of commercial electricity consumption
  • This incl. peripherals, possibly also manufacturing
  • A DOE report suggested this percentage is much lower (3.0-3.5%)
  • No consensus, but it’s still a lot
  • Interesting to look at the numbers:
    – http://enduse.lbl.gov/projects/infotech.html
• Data center growth was cited as a contribution to the 2000/2001 California Energy Crisis
• Equivalent power (with only 30% efficiency) for AC
• CFCs used for refrigeration
• Lap burn
• Fan noise

35

Now we know why power is important
• What can we do about it?
• Two components to the problem:
  • #1: Understand where and why power is dissipated
  • #2: Think about ways to reduce it at all levels of the computing hierarchy
• In the past, #1 was difficult to accomplish except at the circuit level
  • Consequently, most low-power efforts were circuit related

36

Modeling + Design

• First Component (Modeling/Measurement):
  • Come up with a way to:
    – Diagnose where power is going in your system
    – Quantify potential savings
• Second Component (Design)
  • Try out lots of ideas
  • Or characterize tradeoffs of ideas…
• This class will focus on both of these at many levels of the computing hierarchy

37

Next Time
• Course website: http://www.eecs.harvard.edu/~dbrooks/cs246
• Some readings for next week:
  • “Power-Aware Microarchitecture: Design and Modeling Challenges for Next-Generation Microprocessors,” IEEE MICRO
  • “Power: A First-Class Architectural Design Constraint,” IEEE Computer
  • “Thermal Crisis: Challenges and Potential Solutions”
• Web Page Browsing:
  • Information Technology and Resource Use
    – http://enduse.lbl.gov/projects/infotech.html
  • Sun’s ECO-responsible data center
    – http://www.sun.com/aboutsun/environment/epa.jsp
• Questions?