
x86 Everywhere

Chris Herring
Director Strategy
PCSG
AMD

Session Outline

• Instructions, Languages, and the Euro
• Architectural Evolution
    • Cost of Deployment
    • Macro Level
• Extending x86
    • Architectures du Jour
    • Evolution/Convergence: Why Now?
• Primary Drivers: Performance, Power, Cost
    • Knobs and Levers
    • Absolute Costs and Tradeoffs
• Future Innovations

Session Goals

An understanding of the benefits of a unified approach to Instruction Set Architecture

No limitations to extending x86 architecture

The time is now

This is not a crazy idea!

Imagine if instruction sets were languages…

Bonjour Monde
Hallo Welt
Γειάσου κόσμος
Ciao Mondo
Hello Mundo
Здравствуйте! Мир
Hola Mundo

Or, conversely, if languages were instruction set architectures…

Language Should Unify

• What I really want to say is “Hello, World”
• Each Instruction Set Architecture (ISA) requires translation
• Translation, Virtual Machines, Transmorphing, Transcoding, Simulation, Re-compile: All “tech speak” for 2nd language communication
• What we have in effect is, truly, an ISA “Tower of Babel”

[Figure: ISA “Tower of Babel”: UltraSPARC®, PowerPC™, XScale™, SHARC®, SuperH®, Cell]

Towards Instruction Set Consolidation

Future innovation should come in micro-architecture enhancements and compatible extensions to dominant instruction sets, rather than the creation of new instruction sets.

1. Is the trend clear?
With ever-growing software complexity and installed base, the value of remaining compatible with and extending existing, dominant instruction sets heavily outweighs any disadvantages.

2. Is the time now?
Technology has reached the point where instruction set costs are no longer relevant.

What The Euro Can Teach Us

The economic benefits of moving away from multiple currencies are enormous.

$36 billion per year in avoided transaction costs, or $90 per EU resident

What The Euro Can Teach Us

What were the catalysts that prompted standardization?

Growing markets

More sophisticated consumers

Desire for increased stability

Architectural Evolution Macro-Level

[Timeline: 70s, 80s, 90s, 00s, 10s]

• Exploding Functionality per Square Inch
• Growing Software Complexity
• Majority of System Software and Applications are x86-Based
• Time is Right for Full-Function x86 on All Form Factors

The True Cost of Deployment

Enthusiastic Initial Deployment
• Communication
• PIM function
• Portability
• Ease of use

Uncompromising End Users
• Wide range of ISV applications
• Mission critical custom applications
• “Out-of-the-box” integration
• Universal security policy
• Interoperability

High TCO
• Porting complexities
• High management costs
• “Crippled” apps
• Non-integration
• Backup, security
• Configuration management

STOP: Postponed or Canceled Projects, Limited Deployments

Possible Solutions

Port thousands of applications, operating systems, drivers, codecs, tool chains and virtual machines.

#include <stdio.h>
int main(void) {
    printf("Hello World\n");
    return 0;
}

Target ISAs: x86, ARM, MIPS®, EPIC, PowerPC, SPARC, Cell, Other

STOP
• Resource intensive
• Slow
• Costly to maintain

Possible Solutions

Develop a web-based interface for each application.

[Figure: Web Interface; x86, ARM, MIPS®, EPIC, PowerPC, SPARC, Cell, Other]

STOP
• Assumes always-connected client server
• Limited functionality
• Least Common Denominator
• Difficult security

Possible Solutions

Write once (or port once) and run anywhere.

[Figure: Java or .Net; x86, ARM, MIPS®, EPIC, PowerPC, SPARC, Other]

YIELD: Portability is good. Porting is not.
• Heavy testing
• Ongoing optimization
• Per-platform customization

Is it really “port once”?

Java
• J2SE
• J2EE
• J2ME
• JXTA
• JSLEE

Executing and/or translating to multiple languages and platforms is a necessary cost, not something to be desired.

Architectural Evolution Macro-Level

Common Instruction Set Architecture
• No need to port
• No need for multiple validations
• Built in OS integration
• Robust security
• Investment protection

Today: Desktop, Server, Laptop
Happening Now: Storage, Handheld, Networking
The Future?: Ubiquitous

The Importance of Dusty Decks… Or Data Sets Never Die…

Essential legacy data and code live forever.

[Figure: PL1/Basic, VMS, VAX Emulator, Linux, x86 Server; Legacy Data]

Extending x86

Enterprise x86

Consumer x86

LFF Consumer Electronics

Networking

SFF Consumer Electronics

Desktop

Workstation

High End Server

Low/Mid Server

SAN/NAS

Handheld

Rugged Small Form Factor

Internet Appliances

[Figure: instruction set architectures: X86, Power, ARM, MIPS®, Precision, Sparc, Alpha, VAX, 432, 32xxx, Transputer, 68xxx]

ISA Architectural Evolution/Convergence
(architectures must cross platform boundaries)

[Figure: timeline 70s, 80s, 90s, 00s, 10s; phase labels: Proprietary, Proliferation, RISC, Consolidation, Common uArch, ?]

ISAs: Power, ARM, MIPS®, X86
Platforms: Auto, DTV, Digital Camera, Mobile Phone, Computer

Why is It Possible Now?

Moore’s Law
• Core processors can now be so small that any overhead of x86 is easily affordable (a few mm², a few % of total die, trivial power increment)

Sufficiency of performance
• CPU designs can range from small simple designs to huge server designs

Added functionality makes processor core small part of chip
• Large L1 and L2 cache
• SOC functions (memory control, graphics, etc.)
• Pads fundamentally force minimum die size much larger than core

Learning
• Lots of design tricks have accumulated

Interconnect Evolution

AMD Am386® Processor: 2 aluminum layers, 1.0 µm design rules
AMD Athlon™ Processor: 6 copper layers, 180 nm design rules
AMD Athlon™ 64 Processor: 9 copper layers, 130 nm design rules

Planned Transistor Evolution

90 nm generation (2004)
65 nm generation (2005)
45 nm generation (2007)
32 nm generation (2009)
22 nm generation (2011)

Gate lengths shown: Lg = 50 nm, 35 nm, 15 nm, 13 nm

Huge Variation - Without Even Trying!

[Scatter plot: Static Current (axis ticks 0 to 15) against Frequency (axis ticks 1.0 to 1.5)]

• At this frequency, Iccstatic varies from 1.5A to 10A+
• About 10% lower MHz: Lots of Really Nice Parts
• Fast, High Power
• Fast, Low Power

Primary Drivers of Performance, Power and Cost

Power
• Dynamic
    • Voltage
    • Switching capacitance
        • # of parallel units
        • # of flops (pipeline depth)
        • Clock skew goal
        • Long buses
        • Clock gating
        • I/O
• Static
    • Process: Gate leakage, Dcap leakage and Ioff
    • Voltage and temperature
    • Total transistor width
        • Leaky transistor width if substantial use of low leakage transistors
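The two dynamic-power drivers listed above, voltage and switching capacitance, combine in the textbook CMOS switching-power relation P = a · C · V² · f. The sketch below assumes that standard relation, which the slide itself does not state. The activity factor, capacitance and clock values are hypothetical placeholders; only the 0.7 V and 1.8 V endpoints come from the deck's own voltage range.

```python
def dynamic_power_watts(activity, switched_cap_farads, volts, freq_hz):
    """Textbook CMOS switching power: P = a * C * V^2 * f."""
    return activity * switched_cap_farads * volts ** 2 * freq_hz


# Hypothetical chip: 20% activity, 10 nF switched capacitance, 1 GHz clock.
low = dynamic_power_watts(0.2, 10e-9, 0.7, 1e9)
high = dynamic_power_watts(0.2, 10e-9, 1.8, 1e9)

# Voltage enters squared, so the ratio is (1.8 / 0.7) ** 2.
voltage_ratio = high / low

print(f"{low:.2f} W at 0.7 V, {high:.2f} W at 1.8 V, ratio {voltage_ratio:.1f}x")
```

Because voltage enters squared, a 2.5x voltage range alone moves dynamic power by more than 6x, with the instruction set left unchanged.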

Cost
• Package, Assembly and Test
    • 25-50% of total cost
• Cache and IO Often Dominate Die Size
    • IO often in the range of 15-20% of die area
    • Cache can be as much as 50% of the die area or more

Performance
• Peak Instruction Rate
    • Determined by IPC and MHz
• Memory Latency: We Have Hit the Memory Wall
    • At 1GHz and 1 IPC a 1% effective miss rate cuts performance by approx. 50%, 4% cuts it by approx. 80%
    • Required IPC, application footprint and DRAM speed dictate cache size
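The memory-wall figures above can be reproduced with a simple cycles-per-instruction model. The sketch assumes a miss penalty of about 100 cycles (roughly 100 ns of DRAM latency at 1 GHz). That penalty is inferred from the slide's numbers, not stated on it, and the function name is illustrative.

```python
def relative_performance(base_ipc, miss_rate, miss_penalty_cycles):
    """Fraction of peak performance left once memory stalls are added.

    miss_rate is misses per instruction (the slide's "effective miss rate").
    """
    base_cpi = 1.0 / base_ipc
    effective_cpi = base_cpi + miss_rate * miss_penalty_cycles
    return base_cpi / effective_cpi


MISS_PENALTY = 100  # assumed: about 100 ns of DRAM latency at 1 GHz

for miss_rate in (0.01, 0.04):
    kept = relative_performance(1.0, miss_rate, MISS_PENALTY)
    print(f"{miss_rate:.0%} miss rate: performance cut by {1 - kept:.0%}")
```

Under that assumption a 1% miss rate doubles the cycles per instruction and a 4% miss rate raises them fivefold, which matches the approximate 50% and 80% cuts quoted.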

Dynamic Range of Design Choices

Lever             Range            Factor
Dynamic Power     0.2-100 W        500x
Die Size          20-300 mm²       15x
Leakage (Idoff)   1-1000 nA/µm     1000x
Frequency         300-3000 MHz     10x
IO Pins           100-500          5x
Voltage           0.7-1.8 volts    2.5x
ILP               1-10 units       10x
Cache             16K-2M bytes     125x
Pipeline Depth    5-25 stages      5x

All can be varied independently of the instruction set.
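Each factor in the table is the top of the range divided by the bottom. The short check below recomputes them from the quoted ranges. It assumes "K" and "M" mean 1024 and 1024² bytes; under that reading the cache factor comes out at 128x and the voltage factor near 2.6x, so the slide appears to round both.

```python
# (lever, low end, high end, factor quoted on the slide)
LEVERS = [
    ("Dynamic Power (W)", 0.2, 100, 500),
    ("Die Size (mm^2)", 20, 300, 15),
    ("Leakage (nA/um)", 1, 1000, 1000),
    ("Frequency (MHz)", 300, 3000, 10),
    ("IO Pins", 100, 500, 5),
    ("Voltage (V)", 0.7, 1.8, 2.5),
    ("ILP (units)", 1, 10, 10),
    ("Cache (bytes)", 16 * 1024, 2 * 1024 * 1024, 125),  # assumes binary K and M
    ("Pipeline Depth (stages)", 5, 25, 5),
]


def range_factor(low, high):
    """Ratio between the two ends of a design range."""
    return high / low


for name, low, high, quoted in LEVERS:
    print(f"{name:26s} computed {range_factor(low, high):7.1f}x   slide {quoted}x")
```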

Three Orders of Magnitude Transistor Count

• 16, 32, 64 Bits
• Integer to FPU, SIMD, Vector
• 1/3 Issue to 9 Issue
• Trivial Cache to 1MB Cache
• CPU to SOC

AMD K6®-III Processor: 0.25µ, 9M transistors, 78mm²
AMD Athlon™ Processor: 0.18µ, 37M transistors, 120mm²
AMD Opteron™ Processor: 0.13µ, 100M transistors, 193mm²
AMD Geode™ Processor: 0.15µ, 9.5M transistors, 58.3mm²
AMD AMD64™ Processor: 90nm, 84mm²
Future AMD Geode™ SOC Processor: 90nm, 30M+ transistors, 40-45mm²

Levers Affecting Performance, Power and Cost

[Chart: levers placed by scaling behavior (Linear, Polynomial, Exponential) along a scale running from “0.1 or Less” through “1.0” to “10 or More”. Levers shown: SW Algorithm, ISA, Ckt Style, Perf (MP), MHz (V), Pipeline, ILP, Data-Set Size, Power (V), Process, Power (skew), $Size (MHz, App), Leakage (Tox)]

The instruction set (ISA) is probably the least valuable lever and certainly the most disruptive.

Absolute Costs Demonstrate the Time is Now

[Die photo: share of die area by block]

I/O            25%
L2             42%
FPU/RF/OO      5%
D$             6%
I$             4%
INT/RF/OO      4%
LS             4%
Northbridge    5%
BU             1%
BP             2%
DEC            2%

Instruction set consumes very little real estate comparatively
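A quick tally of the percentages above shows where the area goes. The sketch assumes "DEC" is the instruction decoder, the block where ISA-specific logic is most concentrated. That reading of the abbreviation is an assumption, not something the slide spells out.

```python
# Die-area shares as listed on the slide, in percent.
AREA_PERCENT = {
    "I/O": 25, "L2": 42, "FPU/RF/OO": 5, "D$": 6, "I$": 4,
    "INT/RF/OO": 4, "LS": 4, "Northbridge": 5, "BU": 1, "BP": 2, "DEC": 2,
}

total = sum(AREA_PERCENT.values())

# Caches and I/O are independent of which instruction set the core runs.
caches_and_io = sum(AREA_PERCENT[k] for k in ("I/O", "L2", "D$", "I$"))

# Assumed to be the decoder, where x86-specific overhead would mostly sit.
decode = AREA_PERCENT["DEC"]

print(f"total {total}%, caches + I/O {caches_and_io}%, decode {decode}%")
```

The listed shares account for the whole die, with caches and I/O taking roughly three quarters of it and the block assumed to be decode taking 2%.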

Absolute Costs Demonstrate the Time is Now

Item                    Size or cost
Substantial CPU Core    40mm², $4-$8
Small CPU Core          5mm², $0.5-1
512KB Cache             $1-$2
500 pin package         $5-$10
200 pin package         $1-$4
Package pin             0.5-2 cents
1mm²                    10-20 cents
64KB Cache              1mm²
1M Transistors          1mm²

... and so, instruction set adds very little cost.

We Can Optimize for Area or Power or Performance.
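Most rows of the cost table follow from three unit figures on the same slide: 10-20 cents per mm², 0.5-2 cents per package pin, and 64KB of cache per mm². The sketch below derives the other rows from those; the function names are illustrative, not from the deck. The derived 500-pin package range starts at $2.50, below the slide's quoted $5, so that row evidently is not a pure per-pin calculation.

```python
CENTS_PER_MM2 = (10, 20)   # silicon cost range from the slide
CENTS_PER_PIN = (0.5, 2)   # package cost per pin from the slide
KB_CACHE_PER_MM2 = 64      # slide: 64KB of cache occupies about 1 mm^2


def silicon_cost_dollars(area_mm2):
    """(low, high) dollar cost of a given die area."""
    return tuple(area_mm2 * cents / 100 for cents in CENTS_PER_MM2)


def package_cost_dollars(pins):
    """(low, high) dollar cost of a package with the given pin count."""
    return tuple(pins * cents / 100 for cents in CENTS_PER_PIN)


def cache_cost_dollars(kilobytes):
    """(low, high) dollar cost of a cache, via its estimated area."""
    return silicon_cost_dollars(kilobytes / KB_CACHE_PER_MM2)


print("40 mm^2 core:   ", silicon_cost_dollars(40))
print("5 mm^2 core:    ", silicon_cost_dollars(5))
print("512KB cache:    ", cache_cost_dollars(512))
print("200 pin package:", package_cost_dollars(200))
print("500 pin package:", package_cost_dollars(500))
```

On these figures a few mm² of ISA-related overhead costs well under a dollar, less than the spread between package options.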

Future Micro-Architectural Innovations

All are essentially instruction set agnostic.

• Threaded architectures

• Multicore

• Chip level multiprocessing

• Huge scale MP machines

• Much higher performance superscalar, out of order CPU core

• Huge caches

• Media/vector processing extensions

• Static and dynamic Power management

• Branch and memory hints

• GHz performance IO

• Security and virtualization

Final Choices are driven by Optimization Priorities.

Towards Instruction Set Consolidation

Instruction Set Architecture Consolidation

Micro-Architecture Proliferation

The trend is clear. The time is now.

Extending x86

Enterprise x86

Consumer x86

LFF Consumer Electronics

Networking

SFF Consumer Electronics

Desktop

Workstation

High End Server

Low/Mid Server

SAN/NAS

Handheld

Rugged Small Form Factor

Internet Appliances

Call To Action

Hardware Developers:

Break through artificial barriers of power, price, form factor

Allow a common architecture across market boundaries

Software Developers:

Do not lock out support based on ISA

Allow x86 to actually be Everywhere

Additional Resources

Web Resources:

Specs: http://www.amd.com

http://www.amd.com/embeddedprocessors

Other Resources: http://www.50x15.com

Related Sessions

Low Power, small form factor x86

© 2005 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.

Attribution
© 2003 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, AMD Athlon, AMD Opteron and combinations thereof, and Geode are trademarks, and Am386, Am486 and K6-III are registered trademarks of Advanced Micro Devices, Inc. in the U.S. and/or other jurisdictions. MIPS is a registered trademark of MIPS Technologies, Inc. in the U.S. and/or other jurisdictions. Other product and company names used in this presentation are for identification purposes only and may be trademarks of their respective companies.