The Role of Accelerated Computing in the Multi-Core Era Chuck Moore Senior Fellow Advanced Micro Devices May 17, 2006

Upload: others

Post on 28-Jun-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Source: gamma.cs.unc.edu › SC2007 › ChuckMooreSlides.pdf

The Role of Accelerated Computing in the Multi-Core Era

Chuck Moore, Senior Fellow

Advanced Micro Devices

May 17, 2006

The Role of Accelerated Computing in the Multi-core Era, November 10, 2007

Key Points in this Talk

1. The semiconductor industry is dependent upon ongoing customer value

[Diagram: a virtuous cycle linking build, customer value, $$$, and invest]

2. Programming for Multi-Core is a difficult challenge, but it is really just the leading edge of the bigger challenges yet to come

It’s Time to Reorient Around Customer Value

Our industry is obsessed with Performance

Outline

• Important Background
  – A Few High-level Trends
  – Some Thoughts on SMP and Multi-core Computing

• The Accelerated Computing Imperative
  – Dense Computing: GPUs and GP-GPUs
  – The broader potential

• A Framework for Accelerated Computing enablement
  – The Role of Architecture
  – The Emerging Layers of Computation

• Summary

A Few High-level Trends

[Six small charts, each marked "we are here":]
• The Complexity Wall: IPC vs. Issue Width
• Moore’s Law ☺!: Integration (log scale) vs. Time
• The Power Wall: Power Budget (TDP) vs. Time
• The Frequency Wall: Frequency vs. Time
• Locality: Performance vs. Cache Size
• Single thread Perf (!): Single-thread Perf (?) vs. Time

Annotations on the slide:
• DFM, Variability, Reliability, Wire delay
• Server: power = $$; DT: eliminate fans; Mobile: battery

So, how can we add customer value?

Customer Value beyond just Performance

AMD Native Dual Core Opteron
• Pervasive 64-bit capability
• + Dual-core technology
• + Systems architecture innovation

AMD Native Quad Core Opteron
• Seamless upgradeability
• + Quad-core technology
• + Virtualization
• + Performance-per-watt innovation

SMP and Multi-Core to the long term rescue?

SMP Performance (Hypothetical values)

[Chart: Relative Performance (0 to 8) vs. Number of Processors (1 to 16). Curves: Single-threaded Application, Responsiveness, Naive Parallel Application, Sophisticated Parallel App]

SMP Performance (Hypothetical values)

[Chart repeated: Relative Performance vs. Number of Processors]

Callout: Single-thread performance actually goes down! (power constraints)

SMP Performance (Hypothetical values)

[Chart repeated: Relative Performance vs. Number of Processors]

Callout: For most users, responsiveness benefit diminishes after 2-4 processors

SMP Performance (Hypothetical values)

[Chart repeated: Relative Performance vs. Number of Processors]

Callout: Writing scalable parallel programs is HARD. Perhaps too hard? This is a 30 year old problem!

SMP Performance (Hypothetical values)

[Chart repeated: Relative Performance vs. Number of Processors]

Callout: Even the best parallel programs roll over at a modest number of processors

Optimized SMP and Multi-core Platforms

• In the near-term, there is definitely potential here
  – Commodity multi-core processors break the “chicken & egg” barrier
  – Impressive amount of interesting research firing up:
      – TM, coherency filters, hierarchical scheduling, MREs, VMs, etc.
  – Lots of good activity on the Tools front; more to come

• Some workloads will do well with this, but many will not:
  – As it turns out, software isn’t really that soft
      – The underlying structural assumption is often serial processing
      – Transitioning the concurrency model is a very big deal
  – Amdahl’s Law seriously inhibits unstructured parallelism

• In reality, SMP/Multi-core challenges are just an early indicator of the shifts yet to come
  – Power constraints will force these to be “performance heterogeneous”
  – Advances in synchronization and NUMA will give rise to new options…
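
Speaker-note sketch (not part of the original slides): the Amdahl's Law point can be made concrete with a few lines of Python. The function name `amdahl_speedup` and the two parallel fractions are illustrative assumptions, chosen only to echo the "naive" and "sophisticated" curves of the hypothetical SMP charts; they are not measured values.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Upper bound on speedup when only `parallel_fraction` of the work scales."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # Serial work takes the same time; parallel work is divided among processors.
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Illustrative fractions only (assumptions, not data from the talk).
    for label, fraction in [("mostly serial", 0.50), ("well parallelized", 0.95)]:
        row = ", ".join(
            f"{n} procs: {amdahl_speedup(fraction, n):.2f}x" for n in (1, 2, 4, 8, 16)
        )
        print(f"{label} ({fraction:.0%} parallel): {row}")
```

Under this idealized model, a program that is half serial stays below 2x even on 16 processors, which is one way to read "software isn't really that soft". The model ignores synchronization and memory overheads, so real curves can flatten sooner.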

The Accelerated Processing Imperative

x86 applications, workloads and usage models continue to rapidly diversify

[Chart: x86 Software Complexity and Diversity over time (1981, 1990s, 2000s, 2010s), starting from the first x86 PC, the IBM model 5150. Workloads labeled on the chart:]
• Spreadsheets, word-processing
• E-mail, GUI, PowerPoint, web browsers
• HD, DRM
• 3D, digital media
• Java, XML, web services

The Accelerated Processing Imperative

[Chart repeated: x86 Software Complexity and Diversity over time (1981, 1990s, 2000s, 2010s), now overlaid with processor eras]
• Eras: ≤16-bit Single Core; 32-bit Single Core; 64-bit Single Core; 64-bit Homogeneous Multi-CPU; Accelerated Processors
• Drivers labeled on the chart: PERF.; PERF.; POWER/PERF.; DIVERSITY
• Product markers: 486; AMD64; Dual-Core AMD Opteron™ processors
• Also labeled: Platform Acceleration

By the end of the decade, homogeneous multi-core becomes increasingly inadequate

Compute Density: Graphics Processor Performance ☺

Added Shader Programmability

GPGPU: Beyond just graphics and gaming

Ruby Statistics

                            DoubleCross   The Assassin   Whiteout
Year                        2004          2005           2006
Ruby Polygons               80,000        80,000         200,000
Avg. Triangles/Frame        227,212       546,087        1,069,503
Max Triangles/Frame         556,305       1,018,312      2,150,521
No. of Pixel Shaders        100           316            210
Avg. Pixel Shader Length    20            74             142
Facial Animation Targets    4             4              > 128
ALU:Tex Ratio               4:1           7:1            13:1

Realities of GP-GPU Power Efficiency

Generalized GPU provides unprecedented opportunity for performance-per-watt

[Bar chart: FLOPS-per-watt*, Dual-Core CPU vs. GP-GPU, with the difference labeled "As much as 20x". *Source: AMD]

• 500 GigaFLOPS per GPU
• 1 TeraFLOPS in a CrossFire configuration
• More than 2 GigaFLOPS-per-watt
• Available today – not just theoretical
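
Speaker-note sketch (not part of the original slides): the figures quoted on this slide can be cross-checked with simple arithmetic. The helper names below are hypothetical, the two-GPU CrossFire count is an assumption, and the derived values are only what the quoted numbers imply when taken at face value, not additional AMD data.

```python
# Figures quoted on the slide.
GPU_GFLOPS = 500.0           # "500 GigaFLOPS per GPU"
GPU_GFLOPS_PER_WATT = 2.0    # "More than 2 GigaFLOPS-per-watt" (a lower bound)
EFFICIENCY_RATIO = 20.0      # "As much as 20x" vs. a dual-core CPU

# Assumption (not stated on the slide): the CrossFire setup pairs two GPUs.
GPUS_IN_CROSSFIRE = 2


def implied_power_watts(gflops: float, gflops_per_watt: float) -> float:
    """Power draw implied by a throughput and an efficiency figure."""
    return gflops / gflops_per_watt


def aggregate_gflops(gflops_per_gpu: float, gpu_count: int) -> float:
    """Peak throughput of several identical GPUs, ignoring scaling losses."""
    return gflops_per_gpu * gpu_count


def baseline_efficiency(gflops_per_watt: float, ratio: float) -> float:
    """Efficiency of the comparison device implied by an 'N times better' claim."""
    return gflops_per_watt / ratio


if __name__ == "__main__":
    watts = implied_power_watts(GPU_GFLOPS, GPU_GFLOPS_PER_WATT)
    total = aggregate_gflops(GPU_GFLOPS, GPUS_IN_CROSSFIRE)
    cpu = baseline_efficiency(GPU_GFLOPS_PER_WATT, EFFICIENCY_RATIO)
    print(f"Implied GPU power: at most {watts:.0f} W")
    print(f"Two-GPU peak: {total / 1000:.1f} TeraFLOPS")
    print(f"Implied CPU efficiency: about {cpu:.1f} GigaFLOPS-per-watt")
```

Because the slide says "more than" 2 GigaFLOPS-per-watt, the power figure is an upper bound; the two-GPU total matches the 1 TeraFLOPS CrossFire figure quoted on the slide.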

HPC: Remember Attack of the Killer Micros?

[Chart: Performance in Mflop/s (log scale, 0.01 to 10000) vs. year (1980 to 1998). Chart Source: Gordon Bell and Jim Gray, ISCA 2000]
• Supers: Cray 1S, Cray X-MP, Cray 2, Cray Y-MP, Cray C90, Cray T90
• Micros: 8087, 80287, 6881, 80387, R2000, i860, RS6000/540, Alpha, RS6K/590, Alpha

• 1/10th the performance, but at 1/100th the cost
• Absolute performance “good enough”
• Productivity greater on a workstation than on a super

History Repeating Itself?

Familiar vector-style programming model

$1K - $5K PCs get amazing computational power via GPU

Traditional “computing” is an order of magnitude behind

Attack of the Killer GPUs

Attack of the Killer Micros

You just can’t ignore this …

GPU Performance = End of the CPU? NO!

Amdahl’s Law is Alive and Well..

We need both!
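
Speaker-note sketch (not part of the original slides): one way to see why a fast GPU does not retire the CPU is to extend Amdahl's Law to an accelerator. The function `accelerated_speedup` and all of the numbers below are illustrative assumptions; the 20x factor merely borrows the "as much as 20x" figure from the power-efficiency slide as a stand-in for accelerator speed.

```python
def accelerated_speedup(
    offload_fraction: float, accel_factor: float, host_factor: float = 1.0
) -> float:
    """Overall speedup when `offload_fraction` of the runtime is moved to an
    accelerator that runs it `accel_factor` times faster, while the remaining
    work runs `host_factor` times faster on the CPU."""
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must be between 0 and 1")
    if accel_factor <= 0 or host_factor <= 0:
        raise ValueError("speed factors must be positive")
    host_time = (1.0 - offload_fraction) / host_factor
    accel_time = offload_fraction / accel_factor
    return 1.0 / (host_time + accel_time)


if __name__ == "__main__":
    # Assumed workload: 80% of the runtime is offloadable dense compute.
    print(f"20x accelerator only:   {accelerated_speedup(0.8, 20.0):.2f}x")
    print(f"plus a 2x faster CPU:   {accelerated_speedup(0.8, 20.0, 2.0):.2f}x")
    print(f"limit, infinite accel.: {1.0 / (1.0 - 0.8):.2f}x")
```

With these assumed numbers the accelerator alone gives roughly 4.2x, and no accelerator can exceed 5x because the remaining 20% still runs on the CPU; speeding up that CPU portion is what lifts the total further.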

Accelerated Computing has very broad potential -- A Continuum of Solutions

[Diagram: a continuum of accelerator attach points, spanning the non-Coherent Domain and the Coherent Domain. Labels:]
• Add-in: PCIe™ Accelerator; HTX™ Accelerator
• Chipset: Accelerator; PCI-E™
• Socket compatible accelerator: Accelerator in an AMD Opteron™ Socket
• Integrated Acceleration: Package level integration (MCM) and Chip level Integration (SoC), each pairing an Accelerator with a CPU
• Other labels: AMD Processor, CPU, NB, “Torrenza”, “Fusion”

Torrenza: Enabling Partners to Build on the Concept of Accelerated Computing

Enablement
• Horizontal technology to open markets
• Labels: I/O, Math, FPGA

Network Processing
• Established $B market in network platform
• Likely migration to server platform
• Labels: Content, Security

Enterprise Technologies
• Identified data center opportunities
• Labels: XML, Offload, SMP, Java, SOA, Storage

Media
• Highly competitive market in flux
• Known growth opp.
• Labels: IPTV, Processing, Transcoding

Telco
• Labels: VoIP, IMS

The Role of Architecture

• Architecture: The contract between layers of Hardware and Software

• Provides formalism and standardization
  – Defines Compatibility
  – Compatibility has been a key enabler in our industry – this will continue
  – History shows that viable products don’t bet on wildly incompatible solutions

• Symbiotic Relationship between Hardware and Software
  – SW is typically the enabler for new HW features or new types of HW
      – Actual results dominated by the weakest link in this relationship
      – SW value chain often values features more than HW optimization
  – Software complexity driven to extreme levels – this can’t continue

• Architecture gives rise to The Emerging Layers of Computation
  – Can we use this to simplify the programming models?

The Emerging Layers of Computation
Start with an Analogy to the Communications Industry

[Diagram: Computing Model stack of Hardware, Operating System, Application]

Indicators of a bigger picture?
• Browsers
• Web services
• JVM, CLR
• Virtualization
• Intelligent I/O
• Advanced APIs
• Reconfiguration
• Dynamic binary translation
• Fault tolerant “meta apps”

The Emerging Layers of Computation

[Layer diagram]

Layers: Physical Layer; Platform Layer; Native Runtime Layer; Network Layer; Network Runtime Layer

Stack elements:
• RAW Hardware
• x86 Compatible Hardware
• Compatible Hardware Platform
• Devices
• Hypervisor (virtual platform)
• Traditional OS
• APIs, Libs
• MREs
• Applications
• Networked Platform
• Network Services
• Network-aware Applications (web services)
• Data Center Runtime Environment
• Data Center Applications

Examples called out: Microcode; Dynamic translation; Arch extensions; VMware; GPUs; DirectX; Proxied offload; AJAX; Redundant hardware; Error recovery; Google apps; SETI@Home

Lots of Interesting Implications

[Layer diagram from the previous slide (RAW Hardware through Data Center Applications), annotated with:]
• Accelerated Computing: special purpose HW; new types of programmable HW
• Parallel Applications using CMP/SMP
• The Data Center Compute Platform

Summary: The Case for Accelerated Computing

• Traditional “host” offload to dense compute accelerator
  – Use APIs to enable this without heroic programming efforts
  – Proven techniques already in use with DirectX & GPUs today
  – ISA compatibility yields to API and Platform Compatibility

• Many application classes have reasonably common “kernels”
  – Video encoding; Encryption; Data Movement; Java/CLR …

• Broad range of possible accelerator designs & attach points
  – Coherent domain or non-coherent domain
  – Dedicated special-purpose HW or programmable processor

• Lots of Challenges
  – Managing context state
  – Virtualizing the context state
  – Communications/Messaging: “It’s the synchronization, stupid”
  – Memory BW and Data Movement (keep up with computation)
  – New and appropriate APIs
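
Speaker-note sketch (not part of the original slides): the "host offload through an API" idea can be illustrated with a small dispatcher in which callers name a kernel rather than a device. Everything here is hypothetical: `OffloadAPI`, its methods, and the toy XOR kernel are invented for illustration and do not correspond to any AMD, DirectX, or Torrenza interface.

```python
from typing import Callable, Dict

Kernel = Callable[[bytes], bytes]


class OffloadAPI:
    """Hypothetical host-side API: the caller asks for a kernel by name and
    the API decides whether an accelerator or the host CPU runs it."""

    def __init__(self) -> None:
        self._host: Dict[str, Kernel] = {}
        self._accel: Dict[str, Kernel] = {}

    def register_host(self, name: str, fn: Kernel) -> None:
        self._host[name] = fn

    def register_accelerator(self, name: str, fn: Kernel) -> None:
        self._accel[name] = fn

    def backend_for(self, name: str) -> str:
        """Report which backend would run the kernel."""
        if name in self._accel:
            return "accelerator"
        if name in self._host:
            return "host"
        raise KeyError(f"no implementation registered for kernel {name!r}")

    def run(self, name: str, data: bytes) -> bytes:
        # Prefer the accelerator; fall back to the host implementation.
        table = self._accel if self.backend_for(name) == "accelerator" else self._host
        return table[name](data)


def xor_host(data: bytes) -> bytes:
    """Toy stand-in for a common kernel such as encryption (not real crypto)."""
    return bytes(b ^ 0x5A for b in data)


def xor_accelerator_stub(data: bytes) -> bytes:
    """Stand-in for a driver call; a real accelerator path would move data
    to the device, launch the kernel, and synchronize on completion."""
    return bytes(b ^ 0x5A for b in data)


if __name__ == "__main__":
    api = OffloadAPI()
    api.register_host("xor", xor_host)
    print(api.backend_for("xor"), api.run("xor", b"hello"))
    api.register_accelerator("xor", xor_accelerator_stub)
    print(api.backend_for("xor"), api.run("xor", b"hello"))
```

The application code calling `run` does not change when an accelerator appears, which is the sense in which ISA compatibility can yield to API compatibility. The sketch leaves out the challenges the slide lists, such as context state, synchronization, and data movement.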

Thank You !

Questions?

©2007. Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, AMD Opteron, and combinations thereof, are trademarks of Advanced Micro Devices, Inc.

Other names are for informational purposes only and may be trademarks of their respective owners.