ece473 computer organization and architecture · 2015. 1. 5. · ece473 lec 1.9 bus logic clk...

Page 1: ECE473 Computer Organization and Architecture · 2015. 1. 5. · ECE473 Lec 1.9 Bus Logic CLK L1DALAT cache a Integer Datapath DTLBm L2D Array and Control L3 Tag HPW Pipeline Control

Lec 1.1 ECE473

ECE473 Computer Architecture and Organization

Lecturer: Prof. Yifeng Zhu

Fall, 2014

Portions of these slides are derived from:

Dave Patterson © UCB

Technology Trends

Lec 1.2 ECE473

Course Website

http://arch.eece.maine.edu/ece473

Lec 1.3 ECE473

Author of the Text Book

Communications of the ACM, Volume 49, No. 4, April 2006, page 31

Lec 1.4 ECE473

Author of the Text Book

Lec 1.5 ECE473

Outline

• Technology Trends

• Introduction to Computer Architecture

Lec 1.6 ECE473

What If Your Salary?

• Parameters

– $16 base

– 59% growth/year

– 40 years

• Initially: $16, buys a book

• 3rd year: $64, buys a computer game

• 16th year: $27,000, buys a car

• 22nd year: $430,000, buys a house

• 40th year: more than a billion dollars, buys a lot

You have to find fundamentally new ways to spend money!
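The salary figures above follow from compounding 59% per year on a $16 base. A minimal sketch that reproduces them is shown here; the function name `salary_after` is made up for illustration and the slide's values are rounded.

```python
def salary_after(years: int, base: float = 16.0, growth: float = 0.59) -> float:
    """Salary after compounding `growth` once per year for `years` years."""
    return base * (1.0 + growth) ** years


if __name__ == "__main__":
    # Years taken from the slide: start, 3rd, 16th, 22nd and 40th year.
    for year in (0, 3, 16, 22, 40):
        print(f"year {year:2d}: ${salary_after(year):,.0f}")
```

The 40th-year value comes out above one billion dollars, which is the point of the analogy: steady exponential growth quickly outruns everyday intuition.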

Lec 1.7 ECE473

Birth of the Revolution -- The Intel 4004

Introduced November 15, 1971

108 kHz, 50 KIPS, 2,300 transistors, 10 µm process

@intel

First Microprocessor in 1971

• Intel 4004

• 2300 transistors

• Barely a processor

• Could access 300 bytes of memory

Lec 1.8 ECE473

2002 – Pentium® 4 Processor

November 14, 2002

@3.06 GHz, 533 MT/s bus

1099 SPECint_base2000*

1077 SPECfp_base2000*

55 million transistors, 130 nm process

Source: http://www.specbench.org/cpu2000/results/

@intel

Lec 1.9 ECE473

[Die photo; labeled blocks: Bus Logic, CLK, L1D cache, Integer Datapath, DTLB, L3 Tag, L2D Array and Control, ALAT, HPW, Pipeline Control, Floating Point Unit, Branch Unit, L1I cache, IA32, L3 Cache, Int RF, Multimedia unit]

2002 - Intel Itanium 2 Processor for Servers

• 64-bit processors

• 0.18 µm bulk, 6-layer Al process

• 8 stage, fully stalled in-order pipeline

• Symmetric six integer-issue design

• IA32 execution engine integrated

• 3 levels of cache on-die totaling 3.3MB

• 221 Million transistors

• 130W @1GHz, 1.5V

• 421 mm² die

• 142 mm² CPU core

Die dimensions: 19.5 mm × 21.6 mm

@intel

Lec 1.10 ECE473

POWER: 40%

PERFORMANCE: 40%

…relative to Intel® Pentium® D 960

When compared to the Intel® Pentium® D processor 960. Performance measured using SPECint*_rate_base2000. Actual performance may vary. Energy efficiency based on Thermal Design Power (TDP) measurement. See http://www.intel.com/performance for more information.

2006 - Intel Core Duo Processors for Desktop

Lec 1.11 ECE473

2008-2014 - Intel Core i7 64-bit x86-64

• Successor to the Intel Core 2 family

• 32 nm CMOS process

• Adding GPU into the processor

• Intel Core i7 uses simultaneous multithreading (SMT)

• Scales up number of threads supported

• 4 SMT cores, each supporting 4 threads, appear as 16 cores

a 5-stage pipelined processor (group project, two members)

Lec 1.12 ECE473

Intel 4th Generation Core Processor: “Haswell”

• More than 90% of processors shipping today include a GPU on die

• Low energy use is a key design goal

4-core GT2 Desktop: 35 W package

2-core GT2 Ultrabook: 11.5 W package

Lec 1.13 ECE473

Integrating GPU into CPU Chip: AMD Fusion

notebook computers

Lec 1.14 ECE473

Integrating GPU into CPU Chip: AMD Fusion, 2011

Lec 1.15 ECE473

Integrating GPU into CPU Chip: AMD Trinity, May 2012

Lec 1.16 ECE473

Embedded Processors: Intel Atom

Lec 1.17 ECE473

Technology constantly on the move!

• Number of transistors is not the limiting factor

– Currently ~1 billion transistors/chip

– Problems:

» Too much power, heat, latency

» Not enough parallelism

• 3-dimensional chip technology?

– Sandwiches of silicon

– “Through-vias” for communication

• On-chip optical connections?

– Power savings for large packets

• The Intel® Core™ i7 microprocessor (“Nehalem”)

– 4 cores/chip

– 45 nm, hafnium high-k dielectric

– 731M transistors

– Shared L3 cache: 8 MB

– L2 cache: 1 MB (256 KB × 4)

Nehalem

Lec 1.18 ECE473

Amazing Underlying Technology Change

• In 1965, Gordon Moore (co-founder of Intel) sketched out his prediction of the pace of silicon technology.

• Moore's Law: The number of transistors incorporated in a chip will approximately double every 24 months.

• Decades later, Moore's Law remains true.

From Intel

Lec 1.19 ECE473

Technology Trends: Moore’s Law

• Gordon Moore (co-founder of Intel) observed in 1965 that the number of transistors on a chip doubles about every 24 months.

• In fact, the number of transistors on a chip doubles about every 18 months.

From Intel
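The two doubling periods quoted on this slide can be converted into per-year growth rates, the unit used elsewhere in this lecture. A short derivation follows, assuming steady exponential growth; the symbols $T$ (doubling period in months) and $r$ (annual growth rate) are introduced here only for illustration.

\[
(1 + r)^{T/12} = 2 \quad\Longrightarrow\quad r = 2^{12/T} - 1
\]

\[
T = 18:\; r = 2^{2/3} - 1 \approx 0.587 \qquad\qquad T = 24:\; r = 2^{1/2} - 1 \approx 0.414
\]

Doubling every 18 months therefore corresponds to roughly 59% growth per year, the rate used in the salary example, while doubling every 24 months corresponds to roughly 41% per year.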

Lec 1.20 ECE473

How have we done so far?

Moore's Law applied to the travel industry

• A flight from New York to Paris

Lec 1.21 ECE473

IBM Power4 Dual Processor on a Chip

Large Shared L2: multi-ported, 3 independent slices

L3 & Mem Controller: L3 tags on-die for full-speed coherency checks

Two cores (~30M transistors each)

Chip-to-Chip & MCM-to-MCM Fabric: glueless SMP

*Other names and brands may be claimed as the property of others

@IBM

Lec 1.22 ECE473

AMD64 Dual Core Processor

• Two AMD Opteron™ CPU cores on one single die, each with 1MB L2 cache

• 90nm, ~205 million transistors*

– Approximately same die size as 130nm single-core AMD Opteron processor*

• 95 watt power envelope fits into 90nm power infrastructure

• Dual-core processors for client market are expected to follow

[Block diagram labels: Core 0 with 1-MB L2, Core 1 with 1-MB L2, Northbridge]

*Based on current revisions of the design

Lec 1.23 ECE473

Niagara: Multithreaded SPARC Processor

Lec 1.24 ECE473

Niagara Architecture

Lec 1.25 ECE473

Cell Overview

• IBM/Toshiba/Sony joint project: 4-5 years, 400 designers

– 234 million transistors, ~80 watts at 4+ GHz

– 256 Gflops (billions of floating-point operations per second)

– Used in Sony PlayStation 3

[Figure: Cell Prototype Die (Pham et al., ISSCC 2005); labeled blocks: PPU, eight SPUs, MIC, RRAC, BIC, MIB]

Lec 1.26 ECE473

Cell Overview - Main Processor

• One 64-bit PowerPC processor

– 4+ GHz, dual issue, two threads

– 512 kB of second-level cache

[Figure: Cell Prototype Die (Pham et al., ISSCC 2005)]

Lec 1.27 ECE473

Cell Overview - SPE

• Eight Synergistic Processor Elements

– Or “Streaming Processor Elements”

– Co-processors with dedicated 256kB of memory (not cache)
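The 256 Gflops peak quoted in the Cell overview can be sanity-checked with a back-of-the-envelope product. The sketch below is one plausible way to arrive at a number of that size, not a derivation given in these slides: it assumes each SPE processes 4 single-precision values per cycle, that a fused multiply-add is counted as 2 operations, and that the clock is exactly 4.0 GHz. The function name `peak_gflops` is made up for illustration.

```python
def peak_gflops(units: int, lanes: int, flops_per_lane: int, clock_ghz: float) -> float:
    """Peak throughput = units x SIMD lanes x operations per lane per cycle x clock."""
    return units * lanes * flops_per_lane * clock_ghz


if __name__ == "__main__":
    # 8 SPEs (from the slide); 4 lanes, 2 ops per lane and 4.0 GHz are assumptions.
    print(peak_gflops(units=8, lanes=4, flops_per_lane=2, clock_ghz=4.0))
```

A peak figure of this kind assumes every unit issues a full-width operation on every cycle, so sustained performance on real workloads would normally be lower.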

[Figure: Cell Prototype Die (Pham et al., ISSCC 2005)]

Lec 1.28 ECE473

Cell Overview - SPE

• Synergistic Processor Elements

– Or “Streaming Processor Elements”

– Co-processors with dedicated 256kB of memory (not cache)

[Figure: Cell Prototype Die (Pham et al., ISSCC 2005)]

Lec 1.29 ECE473

Cell Overview - Memory and I/O

• Dual Rambus XDR memory controllers (on chip)

– 25.6 GB/s of memory bandwidth

• 76.8 GB/s chip-to-chip bandwidth (to off-chip GPU)

[Figure: Cell Prototype Die (Pham et al., ISSCC 2005)]

Lec 1.30 ECE473

What else except desktop and server processors?

Slides from ECE692 of Jie Hu@NJIT

Lec 1.31 ECE473

Embedded Processors

World Embedded Systems Market, 2003, 2004 and 2009

                     2003     2004     2009   AAGR% (2004-2009)
Embedded Software   1,401    1,641    3,448   16.0
Embedded IC        34,681   40,539   78,746   14.2
Embedded Boards     3,401    3,693    5,950   10.0
Total              39,483   45,873   88,144   14.0

from “High Growth Expected in the Worldwide Embedded System Market in the Next Five Years”, 04/28/2005
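The AAGR% column in this table is consistent with compound growth from the 2004 value to the 2009 value over five years. A small sketch of that check is given below; the function name `annual_growth_percent` is made up for illustration, and the report's own calculation method is not stated in the slides.

```python
def annual_growth_percent(start: float, end: float, years: int) -> float:
    """Constant yearly growth rate (in percent) that turns `start` into `end`."""
    return ((end / start) ** (1.0 / years) - 1.0) * 100.0


if __name__ == "__main__":
    # (2004 value, 2009 value) pairs copied from the table.
    rows = {
        "Embedded Software": (1_641, 3_448),
        "Embedded IC": (40_539, 78_746),
        "Embedded Boards": (3_693, 5_950),
        "Total": (45_873, 88_144),
    }
    for name, (v2004, v2009) in rows.items():
        print(f"{name:18s} {annual_growth_percent(v2004, v2009, 5):5.1f}% per year")
```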

Slides from ECE692 of Jie Hu@NJIT

Lec 1.32 ECE473

Intel® Atom™

Ultra-Low Power, Small Form Factor, Embedded Applications

[Die photo, 13 × 14 mm; labeled blocks: DATA IO, CORE, L2, BUS, PLL, FUSE, ADDR IO]

Intel® Atom: 45 nm CMOS, used mostly in netbooks

Lec 1.33 ECE473

Tear-down of iPhone 5

photo from ifixit.com

Lec 1.34 ECE473

Tear-down of iPhone 5

photo from ifixit.com

• Apple’s A6 combines an ARM based dual-core CPU with a triple-core GPU

Lec 1.35 ECE473

Tear-down of iPhone 4

photo from ifixit.com

• Apple’s A4 combines an ARM-based CPU with a PowerVR GPU, with an emphasis on power efficiency

• ARM Cortex-A8 core is used in A4

Lec 1.36 ECE473

Tear-Down 2nd Generation Nexus 7

Lec 1.37 ECE473

Tear-Down 2nd Nexus 7

• Qualcomm APQ8064 Snapdragon S4 Pro Quad-Core CPU (includes the Adreno 320 GPU)

• Elpida J4216EFBG 512 MB DDR3L SDRAM (four ICs for 2 GB total)

• Analogix ANX7808 SlimPort Transmitter

• Texas Instruments BQ51013B Inductive Charging Controller

• Qualcomm Atheros WCN3660 WLAN a/b/g/n, Bluetooth 4.0 (BR/EDR+BLE), and FM Radio Module

• SK Hynix H26M51003EQR 16 GB eMMC NAND Flash

• Qualcomm PM8921 Quick Charge Battery Management IC

Lec 1.38 ECE473

[Figure: Performance (vs. VAX-11/780), log scale from 1 to 10000, plotted for 1978 to 2006; growth-rate annotations: 25%/year, 52%/year, ??%/year]

New Challenge: Slowdown in Joy’s law of Performance

• VAX : 25%/year 1978 to 1986

• RISC + x86: 52%/year 1986 to 2002

• RISC + x86: ??%/year 2002 to present

From Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th edition, Sept. 15, 2006

Sea change in chip design: multiple “cores” or processors per chip

[Figure annotation: 3X]

Lec 1.39 ECE473

Power Density

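[Figure: Power density (Watts/cm²), log scale from 1 to 1000, versus process generation (1.5 µm, 1 µm, 0.7 µm, 0.5 µm, 0.35 µm, 0.25 µm, 0.18 µm, 0.13 µm, 0.1 µm, 0.07 µm); processors plotted: i386, i486, Pentium®, Pentium® Pro, Pentium® II, Pentium® III, Pentium® 4; reference levels: Hot plate, Nuclear Reactor, Rocket Nozzle, Sun's Surface]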

Lec 1.40 ECE473

CPU determines performance?

In terms of speed, the CPU has improved dramatically, but memory and disk have improved only a little. This has led to dramatic changes in architecture, operating systems, and programming practices.

Lec 1.41 ECE473

Memory Technology

• DDR: Double Data Rate SDRAM

• Bandwidth of a memory module: $SB_{max} = SB_{bus} \times f_{bus} \times 2$

where

– $SB_{max}$: maximum memory bandwidth

– $SB_{bus}$: width of the memory bus (64 bits = 8 bytes)

– $f_{bus}$: frequency of the memory bus
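The bandwidth formula above can be evaluated directly. The sketch below uses an 8-byte bus, as on the slide, and a 200 MHz bus clock chosen only as an illustrative value (it is not taken from the slides); the function name `ddr_peak_bandwidth` is likewise made up.

```python
def ddr_peak_bandwidth(bus_width_bytes: int, bus_clock_hz: float) -> float:
    """Peak bytes per second: bus width x bus frequency x 2 transfers per clock."""
    return bus_width_bytes * bus_clock_hz * 2


if __name__ == "__main__":
    bandwidth = ddr_peak_bandwidth(bus_width_bytes=8, bus_clock_hz=200e6)
    print(f"{bandwidth / 1e9:.1f} GB/s")
```

The factor of 2 is what "double data rate" refers to: data is transferred on both the rising and the falling clock edge. This is a peak figure; sustained bandwidth depends on the access pattern.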

http://www.kingston.com/newtech

Lec 1.42 ECE473

Memory Technology

http://en.wikipedia.org/wiki/DDR_SDRAM

Memory speed improves ~10% per year.

Lec 1.43 ECE473

Disk Technology

Lec 1.44 ECE473

Photo of Disk Head, Arm, Actuator

[Photo labels: Actuator, Arm, Head, Platters (12), Spindle]

Lec 1.45 ECE473

Disk Technology

Disk capacity improves about 60% per year.

2007: Hitachi releases the 1 TB Hitachi Deskstar 7K1000 (1024 gigabytes (GB) = 1 terabyte (TB))

Lec 1.46 ECE473

Disk Device Terminology

Disk Latency = Seek Time + Rotation Time + Transfer Time

Order-of-magnitude times for 4K byte transfers:

Seek: 8 ms or less

Rotate: 4.2 ms @ 7200 rpm

Transfer: 1 ms @ 7200 rpm
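The latency formula and the order-of-magnitude times above can be combined in a few lines. The sketch assumes that "Rotation Time" means the average rotational delay, that is, half a revolution, which matches the 4.2 ms quoted for 7200 rpm; the function names are made up for illustration.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational delay: half of one revolution, in milliseconds."""
    ms_per_revolution = 60_000.0 / rpm
    return 0.5 * ms_per_revolution


def disk_latency_ms(seek_ms: float, rpm: float, transfer_ms: float) -> float:
    """Disk Latency = Seek Time + Rotation Time + Transfer Time."""
    return seek_ms + avg_rotational_latency_ms(rpm) + transfer_ms


if __name__ == "__main__":
    print(f"rotation: {avg_rotational_latency_ms(7200):.2f} ms")
    print(f"total:    {disk_latency_ms(seek_ms=8.0, rpm=7200, transfer_ms=1.0):.2f} ms")
```

With these numbers the mechanical terms (seek and rotation) account for most of the total, so the time to fetch a 4K block is dominated by positioning rather than by the transfer itself.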

[Diagram labels: Platter, Outer Track, Inner Track, Sector, Head, Arm, Actuator]

Lec 1.47 ECE473

Disk Device Terminology

Disk Latency = Seek Time + Rotation Time + Transfer Time

Lec 1.48 ECE473

Dramatic Technology Change

• Processor

– Transistor count per chip: about 59% per year

– Clock rate: about 20% per year

• Memory

– DRAM capacity: about 60% per year (4x every 3 years)

– Memory speed: about 10% per year

– Cost per bit: improves about 25% per year

• Disk

– Capacity: about 60% per year

– Total use of data: 100% per 9 months!

• Network Bandwidth

– 10 years: 10 Mb → 100 Mb

– 5 years: 100 Mb → 1 Gb
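Several of the trends above are stated as "a factor of N over M years" rather than as a yearly rate. A minimal sketch for putting them on a common per-year basis follows, assuming steady exponential growth; the function name `annual_rate` is made up for illustration.

```python
def annual_rate(factor: float, years: float) -> float:
    """Yearly growth rate equivalent to growing by `factor` over `years` years."""
    return factor ** (1.0 / years) - 1.0


if __name__ == "__main__":
    trends = {
        "DRAM capacity, 4x every 3 years": (4.0, 3.0),
        "Data use, 2x every 9 months": (2.0, 0.75),
        "Network, 10x in 10 years": (10.0, 10.0),
        "Network, 10x in 5 years": (10.0, 5.0),
    }
    for name, (factor, years) in trends.items():
        print(f"{name:34s} {annual_rate(factor, years) * 100:6.1f}% per year")
```

The DRAM line works out to a little under 60% per year, in line with the figure on the slide, and the two network lines show the pace of bandwidth growth roughly doubling between the two periods.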

Lec 1.49 ECE473

Computer Engineering Methodology

[Diagram of the design cycle; labels: Technology Trends; Evaluate Existing Systems for Bottlenecks; Benchmarks; Simulate New Designs and Organizations; Workloads; Implement Next Generation System; Implementation Complexity]

Architecture design is an iterative process: searching the space of possible designs at all levels of computer systems

Lec 1.50 ECE473

What is Architecture?

• Original sense:

– Taking a range of building materials and putting them together in desirable ways to achieve a building suited to its purpose

• In Computer Engineering:

– Similar: how parts are put together to achieve some overall goal

– Examples: the architecture of a chip, of the Internet, of an enterprise database system, an email system, a cable TV distribution system

Adapted from David Clark’s, What is “Architecture”?

Lec 1.51 ECE473

What is “Computer Architecture”?

• Instruction Set Architecture (ISA)

– Visible to the programmer

– E.g., IA-32, IA-64, SPARC, ARM, …

• Organization

– High-level detail of the system

» Does it have a cache, full FP support, etc.?

• Hardware

– Specifics

» E.g., Pentium 4 at 3 GHz vs. Core Duo at 2 GHz

Lec 1.52 ECE473

The Rest of this Course

• How are modern ISAs arranged?

• How do you organize these millions/billions of transistors to implement the ISA?

– data-processing (workers)

– control-logic (managers)

– memory (warehouse)

– parallel systems (multiple worksites)

• How to bridge the performance gap between CPU and memory?

– Cache

– Redundant Array of Inexpensive Disks (RAID)

Lec 1.53 ECE473

How Fast is it?

From 1960 to 1980, computer design was only about performance

This is the era of Power

Next will be the era of Reliability

A beam of light travels less than a tenth of an inch during the time it takes a 45nm transistor to switch on and off.

Saturating Performance

Lec 1.54 ECE473

Summary

1. Moore’s Law: The number of transistors incorporated in a chip will approximately double every 18 months.

2. CPU speed increases dramatically, but the speed of memory, disk and network increases slowly.

3. Architecture design is an iterative process.

Measure performance: Benchmarks