Toward a Sustainable Architecture at Extreme Scale Zhimin Tang, CTO [email protected]

Post on 10-Feb-2016

Page 1: Toward a Sustainable Architecture at Extreme Scale

Zhimin Tang, [email protected]

Page 2: Outline

- Sustainable (Cost-Effective) HPC
  - Counter-examples in history
- Current and Future Challenges
  - New computing forms, from sensor to cloud
  - Silicon-based IC process approaching its physical limit
- Strategy
  - Abandon HPC-only acceleration features
  - Design a sustainable architecture for HPC and other applications

Page 3: Considerations of Cost Effectiveness or Sustainability

- Application (Algorithm) Requirements
  - High performance
- Technology Constraints
  - CMOS vs. bipolar, Moore’s Law
  - Commercial MPU vs. custom ASIP
- Economic Feasibility
  - Good ecosystem
  - Mass production
  - Low energy consumption

Page 4: HPCs in History

- Vector Supercomputers
  - CMOS dominated, SIMD weakness

Page 5: Connection Machine

- SIMD PE Array
  - Optimal only for some algorithms
  - Custom chips, tiny processors

Page 6: MIMD with Custom CPUs

- Chip-Level Integration (SoC)
  - nCube/2, KSR-1 (COMA), …
  - High NRE cost due to custom design without mass production
  - Low node processor performance

Page 7: Why No Cost Effectiveness

- HPC Is a Small Market
- Architectures Designed Only for HPC
  - Lower volume, higher cost (NRE)
  - Not enough resources to implement a top-level (with respect to performance) solution
  - Longer time-to-market, falling behind Moore’s Law
- Result: COTS Solutions in the Last 20 Years
  - COTS: commercial off-the-shelf
- Co-design with the IT Ecosystem
  - From cloud computers to sensors

Page 8: Ecosystem Requirements

- High Performance and Low Cost
  - Low cost remains a must
  - New cost factors: energy/power, large NRE
  - Performance is no longer the bottleneck for most applications
    - As with cars, trains, and airplanes in transportation
  - New forms of performance
    - Computing: MIPS/MFLOPS
    - Transaction processing: TPM
    - Cloud applications: requests serviced per unit time

Page 9: Energy Efficiency

- Two Ends of the Computing System
  - Cloud: large-scale power dissipation
  - Terminal: limited battery life
- Energy: compute < memory < communication
  - For each FLOP in Linpack, the FPU spends 10 pJ and the memory access 475 pJ
- Wireless Sensor Network
  - The RF radio consumes most of the power
- What Do We Need Besides Locality?
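The two Linpack figures quoted on this slide imply that arithmetic accounts for only a small share of the energy spent per FLOP. The sketch below works through that arithmetic. It assumes one memory access per FLOP (the slide does not state the ratio), and the 1 EFLOP/s machine is a hypothetical chosen only to show the scale.

```python
# Energy figures quoted on the slide (picojoules per event).
FPU_PJ = 10.0
MEM_ACCESS_PJ = 475.0


def energy_per_flop_pj(mem_accesses_per_flop: float = 1.0) -> float:
    """Energy per FLOP in pJ. The accesses-per-FLOP ratio is an assumption."""
    return FPU_PJ + MEM_ACCESS_PJ * mem_accesses_per_flop


def compute_share(mem_accesses_per_flop: float = 1.0) -> float:
    """Fraction of the per-FLOP energy spent in the FPU itself."""
    return FPU_PJ / energy_per_flop_pj(mem_accesses_per_flop)


def power_watts(flops_per_second: float, mem_accesses_per_flop: float = 1.0) -> float:
    """Sustained power in watts at the given FLOP rate (1 pJ = 1e-12 J)."""
    return energy_per_flop_pj(mem_accesses_per_flop) * 1e-12 * flops_per_second


if __name__ == "__main__":
    print(f"energy per FLOP: {energy_per_flop_pj():.0f} pJ")
    print(f"FPU share of energy: {compute_share():.1%}")
    print(f"power at 1 EFLOP/s: {power_watts(1e18) / 1e6:.0f} MW")
```

Under these assumptions the FPU accounts for roughly 2% of the energy. Lowering `mem_accesses_per_flop` models better data reuse, which is why locality matters so much here.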

Page 10: A New Architecture Is Needed

- Architecture Consuming Less Energy
  - Many-core, custom designed for applications
  - Flattened software stack
- Architecture for New Performance Metrics
  - High-volume throughput computers
- New Algorithms and Methodology
  - Complexity of computation
  - Complexity of memory access and communication
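One way to read the last item is that an algorithm should be costed by the data it moves as well as by the operations it performs. The sketch below illustrates this with a simplified textbook model of n x n matrix multiplication. The model is not taken from the slides, and all function names are hypothetical. It assumes the naive version gets no reuse from fast memory, while the blocked version keeps b x b tiles resident.

```python
def matmul_flops(n: int) -> int:
    """Floating-point operations: one multiply and one add per inner step."""
    return 2 * n ** 3


def naive_words_moved(n: int) -> int:
    """Slow-memory traffic with no reuse: two operand reads per inner step,
    plus writing each element of C once."""
    return 2 * n ** 3 + n ** 2


def blocked_words_moved(n: int, b: int) -> int:
    """Slow-memory traffic with b x b tiles held in fast memory.

    Assumes b divides n and that three tiles fit in fast memory: each of the
    (n/b)**3 tile products loads one tile of A and one tile of B, and C is
    written once.
    """
    if b <= 0 or n % b != 0:
        raise ValueError("this simplified model assumes b divides n")
    tile_products = (n // b) ** 3
    return tile_products * 2 * b * b + n ** 2


def flops_per_word(flops: int, words: int) -> float:
    """Arithmetic intensity: operations performed per word moved."""
    return flops / words
```

Both versions perform the same `matmul_flops(n)` operations. Only the traffic term differs, shrinking by about a factor of `b` in the blocked case. An operation count alone misses this.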

Page 11: Constraints on Innovation

- Existing Software Ecosystem
  - Standard or de facto interfaces, e.g., the ISA (Instruction Set Architecture)
  - Pro: software compatibility
  - Con: obstacle to innovation, legacy
- Huge Development Expenses
  - A new architecture needs new processors
  - NRE of chip development is increasing rapidly as the CMOS process approaches its limit
  - NRE: Non-Recurring Engineering

Page 12: CMOS Technology

- Approaching the Limit, and No Replacement!
  - Moore’s Law: 7 nm @ 2024, ~30 atoms
- Different from the Transition in the 1990s
  - Bipolar (ECL/TTL) was faster, but consumed much more power
  - CMOS had been developed for 20 years: not too slow, low cost, and low power
  - But now, liquid cooling for CMOS
- In the Foreseeable Future, Still CMOS
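The "~30 atoms" figure is consistent with dividing the feature size by the distance between neighboring silicon atoms. The check below assumes a Si-Si nearest-neighbor distance of about 0.235 nm. That value is a standard one but is not given on the slide, and the function name is hypothetical.

```python
SI_NEAREST_NEIGHBOR_NM = 0.235  # assumed Si-Si bond length; not from the slide


def atoms_across(feature_nm: float, spacing_nm: float = SI_NEAREST_NEIGHBOR_NM) -> float:
    """Approximate number of atomic spacings spanning a feature of the given size."""
    if feature_nm <= 0 or spacing_nm <= 0:
        raise ValueError("sizes must be positive")
    return feature_nm / spacing_nm


if __name__ == "__main__":
    print(f"7 nm spans about {atoms_across(7.0):.0f} atomic spacings")
```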

Page 13: More and More than Moore

[Figure: 2011 ITRS Executive Summary, Fig. 4]

Page 14: Dark Silicon

- Sources: ISCA’11, IEEE Micro’12, CACM’13
- At 8 nm, more than half of the transistors must be turned off
- Speedup of 4-8x over 5 process generations
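To put the 4-8x figure in context, it helps to compare it with an idealized baseline in which every process generation doubles performance. The doubling baseline is an assumption made here for illustration. The cited papers use far more detailed models, and the function names are hypothetical.

```python
def ideal_speedup(generations: int, gain_per_generation: float = 2.0) -> float:
    """Cumulative speedup if every generation delivered the same gain."""
    return gain_per_generation ** generations


def per_generation_gain(total_speedup: float, generations: int) -> float:
    """Average per-generation gain implied by a cumulative speedup."""
    return total_speedup ** (1.0 / generations)


if __name__ == "__main__":
    print(f"ideal speedup over 5 generations: {ideal_speedup(5):.0f}x")
    for total in (4.0, 8.0):
        gain = per_generation_gain(total, 5)
        print(f"{total:.0f}x over 5 generations is {gain:.2f}x per generation")
```

Under the doubling assumption, five generations would give 32x. A cumulative 4-8x corresponds to roughly 1.3x to 1.5x per generation.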

Page 15: Economic Feasibility

- Moore’s Law Provides More Transistors
  - But switching speed is no longer increasing
  - Process development at the nanometer scale increases NRE tremendously
- Mass Production Is Essential
  - Otherwise, the chip business is not sustainable
  - This is the advantage of general-purpose processors
- How about Many-core Processors?
  - GPU, Tilera, MIC, …
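The case for mass production comes down to amortization: the one-time NRE is spread across every chip sold. The sketch below uses the usual cost-per-unit formula. The dollar figures are hypothetical placeholders chosen only to show the shape of the curve, not data from the talk.

```python
def cost_per_chip(nre: float, unit_cost: float, volume: int) -> float:
    """Per-chip cost: the amortized NRE share plus the marginal manufacturing cost."""
    if volume <= 0:
        raise ValueError("volume must be positive")
    return nre / volume + unit_cost


if __name__ == "__main__":
    NRE = 100e6       # hypothetical one-time design and mask cost, in dollars
    UNIT_COST = 50.0  # hypothetical marginal cost per chip, in dollars
    for volume in (10_000, 1_000_000, 10_000_000):
        print(f"{volume:>10,} chips: ${cost_per_chip(NRE, UNIT_COST, volume):,.2f} each")
```

With these placeholder numbers, a 10,000-unit HPC-only part carries $10,000 of NRE per chip, while a 10-million-unit part carries $10.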

Page 16: Pros and Cons of MPU

- Most Advanced Process, Mass Production
  - Stable, reliable, low cost
  - Mature ecosystem and solutions
- Not Optimal for Many Applications
  - Aim: not too bad for most applications
  - Over-allocation of resources
  - Wasted resources, higher energy consumption

Page 17: MPU Not Good for Cloud

- High L1-I Cache Miss Rate
  - Processor idle (instruction starvation)
- Small ILP and MLP
  - Wide issue is not effective
- Low Efficiency of Memory Access
  - A large L3 takes ½ of the chip area but does not help performance
- Useless High On-chip Bandwidth
  - Little data sharing among cores

Page 18: Low Utilization of Resources

- Only 1/3 are frequently used

[Figure: chip floorplan with blocks labeled GPU, L3 Cache, four L2 Caches, and four OOO/FPU units]

Page 19: Pros and Cons of ASIP

- Optimally Designed for Some Applications
  - High efficiency, low resource use, low power
- But There Is No Free Lunch
  - Much design/verification work
  - Stability/reliability?
  - May affect the time to market
  - How to amortize the huge NRE?
  - Small market means high cost

Page 20: MPU + Accelerator

- GPU
  - Pro: mass production
  - Con: PCIe overhead, small memory size
- MIC (Xeon Phi)
  - Mass production possible?
- FPGA
  - Resource utilization
  - Ease of programming
  - MPU interface, e.g., QPI or PCIe

Page 21: Design of New Processors

- Crossing the Gap between General and Special
- Many Simple Cores
  - Reduce power consumption
- Multiple Hardware Threads in Each Core
  - Massive threads on chip
  - Exploit concurrency, tolerate latency
- Dynamic Scheduling of On-chip Threads
  - Improve performance for general apps
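The "tolerate latency" point can be made concrete with a common first-order model of hardware multithreading. The model is an assumption brought in here, not something stated on the slide: each thread computes for `run_cycles`, then stalls for `stall_cycles` (for example, on a memory access), and switching threads costs nothing. All names are hypothetical.

```python
import math


def core_utilization(threads: int, run_cycles: float, stall_cycles: float) -> float:
    """Fraction of cycles in which the core does useful work.

    While one thread stalls, the others can run. The core saturates once
    enough threads exist to cover the whole stall.
    """
    if threads < 1 or run_cycles <= 0 or stall_cycles < 0:
        raise ValueError("need threads >= 1, run_cycles > 0, stall_cycles >= 0")
    return min(1.0, threads * run_cycles / (run_cycles + stall_cycles))


def threads_to_hide_latency(run_cycles: float, stall_cycles: float) -> int:
    """Smallest thread count that keeps the core fully busy in this model."""
    if run_cycles <= 0 or stall_cycles < 0:
        raise ValueError("need run_cycles > 0 and stall_cycles >= 0")
    return math.ceil((run_cycles + stall_cycles) / run_cycles)
```

For instance, with 10 compute cycles between 90-cycle stalls, a single thread keeps the core busy about 10% of the time, and ten threads are enough to saturate it. This is the sense in which many threads on simple cores trade single-thread speed for throughput.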

Page 22: Toward a Sustainable Architecture at  Extreme Scale

流水向量处理引擎

PCPCPCPC

PCPCPC

指令寄存器

指令缓存

指令译码

PCPCPC寄存器堆

ALU

FPU

LSU

数据缓存/SPM

Combining Multithreadingand Vector Pipelining

Switch to single threadDeep scalar pipelineSwitch to vector pipeline

I$IR I

DRF

Vector Registers

D$/SPM

Page 23: Thread Parallelism and Data Parallelism in Two Dimensions

[Figure: two copies of the processing engine, each with multiple program counters (PC), instruction cache (I$), instruction register (IR), instruction decode (ID), register file (RF), ALU, FPU, LSU, and data cache/SPM (D$/SPM); one also has a vector register file. Annotations: deep thread parallelism and data parallelism; wide data parallelism; wide thread parallelism]
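The two dimensions can be illustrated with a toy simulation: several hardware threads, each with its own program counter, take turns issuing instructions (the thread dimension), and each instruction operates on all vector lanes at once (the data dimension). This is a sketch for intuition only, not the design shown in the figure. The class, the function, and the round-robin policy are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class HardwareThread:
    """One hardware thread: its own program counter and vector accumulator."""
    program: list[list[int]]  # each instruction is a vector to add, lane by lane
    pc: int = 0
    acc: list[int] = field(default_factory=list)

    def done(self) -> bool:
        return self.pc >= len(self.program)


def run(threads: list[HardwareThread], lanes: int) -> list[tuple[int, int]]:
    """Issue one vector instruction per cycle, rotating across unfinished threads.

    Returns the issue trace as (cycle, thread_id) pairs.
    """
    for t in threads:
        t.acc = [0] * lanes
    trace: list[tuple[int, int]] = []
    cycle = 0
    nxt = 0
    while any(not t.done() for t in threads):
        # Thread dimension: round-robin to the next thread with work left.
        while threads[nxt].done():
            nxt = (nxt + 1) % len(threads)
        t = threads[nxt]
        operand = t.program[t.pc]
        if len(operand) != lanes:
            raise ValueError("operand width must match the lane count")
        # Data dimension: every lane performs the same add in this cycle.
        t.acc = [a + x for a, x in zip(t.acc, operand)]
        t.pc += 1
        trace.append((cycle, nxt))
        cycle += 1
        nxt = (nxt + 1) % len(threads)
    return trace
```

Widening `lanes` raises the work done per issued instruction, while adding threads gives the scheduler more instruction streams to interleave. The two can be scaled independently.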

Page 24: In Conclusion

- A Universal Architecture
  - Scalable and reconfigurable processor array
  - Supports thread-level and data-level parallelism
- Fulfills All Requirements from Terminal to Cloud Data Center
  - High-performance computers
  - Cloud computing servers
  - Equipment in the core network
  - Terminals for cloud and mobile Internet

Page 25: Thanks!