VTU – IISc Workshop

(C)RG@SERC,IISc

Compiler, Architecture and HPC Research in Heterogeneous Multi-Core Era

R. Govindarajan

CSA & SERC, IISc

govind@[csa,serc].iisc.ernet.in

Moore’s Law: Transistors

Moore’s Law: Performance

Processor performance doubles every 1.5 years
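The doubling claim can be written as a growth formula. This is a minimal sketch that assumes an idealized, perfectly exponential trend; the symbols P_0 (performance in the reference year) and t (years elapsed) are introduced here for illustration and do not appear on the slide.

```latex
P(t) = P_0 \cdot 2^{t/1.5}
\qquad\Rightarrow\qquad
\frac{P(t+1)}{P(t)} = 2^{1/1.5} \approx 1.59
```

Under that assumption, performance grows by roughly 59% per year, or about 100x per decade, since 2^{10/1.5} is approximately 101.6.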

Moore’s Law: Processor Architecture Roadmap (Pre-2000)

Figure: roadmap with milestones First µP, RISC, VLIW, Super-scalar, EPIC

Progress in Processor Architecture

• More transistors → new architecture innovations
  – Pipelined architecture
  – Multiple-instruction-issue processors
    • VLIW
    • Superscalar
    • EPIC
  – More on-chip caches, multiple levels of cache hierarchy, speculative execution, …

Era of Instruction-Level Parallelism

Influence on Compiler Optimization

Pipelined Architecture, VLIW Architecture, Superscalar Processor, EPIC

ILP Compilation Techniques (Instruction Scheduling, Register Allocation, Software Pipelining, …)
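Instruction scheduling, one of the ILP techniques named above, can be illustrated with greedy list scheduling over a dependence DAG. This is a minimal sketch, not the method of any particular compiler; the function name `list_schedule`, the instruction names and the latencies are all hypothetical, and the machine model (a fixed issue width, no resource classes) is an assumption made to keep the example short.

```python
from collections import defaultdict

def list_schedule(latency, deps, issue_width=2):
    """Greedy list scheduling of an acyclic dependence graph.

    latency: dict instr -> cycles until its result is available
    deps:    dict instr -> list of instrs it depends on
    Returns a dict instr -> issue cycle.
    """
    # Invert the dependence edges so each instruction knows its successors.
    succs = defaultdict(list)
    for instr, preds in deps.items():
        for p in preds:
            succs[p].append(instr)

    # Priority = longest latency path from the instruction to the end of the DAG.
    priority = {}
    def critical_path(i):
        if i not in priority:
            priority[i] = latency[i] + max(
                (critical_path(s) for s in succs[i]), default=0)
        return priority[i]
    for i in latency:
        critical_path(i)

    issue_cycle = {}
    cycle = 0
    while len(issue_cycle) < len(latency):
        # Ready = not yet issued, and every predecessor's result is available.
        ready = [i for i in latency
                 if i not in issue_cycle
                 and all(p in issue_cycle and issue_cycle[p] + latency[p] <= cycle
                         for p in deps.get(i, []))]
        ready.sort(key=lambda i: -priority[i])   # critical instructions first
        for i in ready[:issue_width]:
            issue_cycle[i] = cycle
        cycle += 1
    return issue_cycle

# Hypothetical instruction sequence: two loads feed a multiply, add and store.
latency = {"load_a": 3, "load_b": 3, "mul": 2, "add": 1, "store": 1}
deps = {"mul": ["load_a", "load_b"], "add": ["mul"], "store": ["add"]}
schedule = list_schedule(latency, deps, issue_width=2)
```

With an issue width of 2 both loads issue in the same cycle, so the dependent chain starts one cycle earlier than it does with an issue width of 1.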

Superscalar Architecture

Figure: pipeline diagram with front-end stages IF, ID, Issue and Reg. Read feeding parallel execution paths (Int. ALU; Ld/Store Unit; Align, Add, Align), each ending in a Write Back stage

• Multiple instructions are fetched, decoded, issued and executed in each cycle.

• Speculation, Cache/Memory hierarchy, Prefetching, Performance, Power Efficiency, …
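The effect of issuing several instructions per cycle can be sketched with a toy in-order issue model. Everything here is an assumption for illustration: the function `inorder_issue_cycles`, the register names and the one-cycle latencies are hypothetical, and the model ignores branches, caches, structural hazards and register renaming.

```python
def inorder_issue_cycles(program, width):
    """Cycles needed to issue `program` in order, at most `width` per cycle.

    program: list of (dest, sources, latency) tuples in program order.
    An instruction may issue once every source register is ready; issue
    stops for the cycle at the first instruction that is not ready.
    """
    ready_at = {}   # register -> cycle at which its value becomes available
    cycle = 0
    pc = 0
    while pc < len(program):
        issued = 0
        while pc < len(program) and issued < width:
            dest, sources, latency = program[pc]
            if any(ready_at.get(s, 0) > cycle for s in sources):
                break   # in-order issue: a stalled instruction blocks younger ones
            ready_at[dest] = cycle + latency
            pc += 1
            issued += 1
        cycle += 1
    return cycle

# Hypothetical six-instruction program with two short dependence chains.
program = [
    ("r1", [], 1),
    ("r2", [], 1),
    ("r3", ["r1", "r2"], 1),
    ("r4", [], 1),
    ("r5", ["r3", "r4"], 1),
    ("r6", [], 1),
]
cycles_by_width = {w: inorder_issue_cycles(program, w) for w in (1, 2, 4)}
```

In this example, widening the machine from 2 to 4 issue slots gives no further gain because the dependences, not the issue width, become the limit.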

Progress in Processor Architecture (Post-2000)

• More transistors → new architecture innovations
  – Multiple-instruction-issue processors
  – More on-chip caches
  – Multi-cores

Era of Multi-Cores

Multicores: The Right Turn

Figure (labels only): Performance axis; single-core points at 1 GHz, 3 GHz and 6 GHz; multi-core points at 3 GHz with 2, 4 and 16 cores

Moore’s Law: Processor Architecture Roadmap (Post-2000)

Figure: roadmap with milestones First µP, RISC, VLIW, Super-scalar, EPIC, Multi-cores

Era of Multicores (Post 2000)

• Multiple cores on a single die

• Early efforts used the multiple cores to run multiple programs

• Throughput-oriented rather than speedup-oriented!

Influence on Compilation Techniques

Multi-Core Processors

• Extracting Parallelism
• Thread-Level Parallelism
• Speculative Multithreading
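Thread-level parallelism usually starts by splitting a loop's iteration space into independent chunks, one per thread. This is a minimal sketch of that decomposition using Python's standard `concurrent.futures` module; the function names and the sum-of-squares workload are hypothetical. In standard CPython builds the global interpreter lock keeps this pure-Python loop from running faster on several cores, so the example shows the partitioning rather than a measured speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(values, start, stop):
    """Sequential work done by one thread on its own slice."""
    return sum(v * v for v in values[start:stop])

def parallel_sum_of_squares(values, num_threads=4):
    """Split the iteration space into contiguous chunks, one task per chunk."""
    n = len(values)
    # Ceiling division; max() avoids a zero step when the input is empty.
    chunk = max(1, (n + num_threads - 1) // num_threads)
    bounds = [(lo, min(lo + chunk, n)) for lo in range(0, n, chunk)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [pool.submit(partial_sum, values, lo, hi) for lo, hi in bounds]
        # Reduction step: combine the per-thread partial results.
        return sum(f.result() for f in futures)

data = list(range(1000))
total = parallel_sum_of_squares(data, num_threads=4)
```

The chunks share no writable state, so no locking is needed; only the final reduction combines results across threads.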

MultiCore-Based Node

Figure: eight cores (C0 to C7), each with a private L1 cache (L1$); the core pairs (C0, C2), (C4, C6), (C1, C3) and (C5, C7) each sit under a shared L2 cache; a single Memory block serves the node
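The cache topology of such a node matters when placing threads, and it can be captured in a small lookup table. This is a sketch under an assumption: the core-to-L2 pairing below is read off the flattened diagram labels and may not match the original figure exactly. The names `L2_GROUPS`, `build_l2_map` and `sharing_level` are hypothetical.

```python
# L2 sharing groups as read from the diagram labels (assumed pairing):
# each tuple lists the cores that sit under one L2 cache.
L2_GROUPS = [("C0", "C2"), ("C4", "C6"), ("C1", "C3"), ("C5", "C7")]

def build_l2_map(groups):
    """Map each core name to the index of the L2 cache it sits under."""
    return {core: idx for idx, cores in enumerate(groups) for core in cores}

L2_OF = build_l2_map(L2_GROUPS)

def sharing_level(core_a, core_b):
    """Closest level of the memory hierarchy that two cores have in common."""
    if core_a == core_b:
        return "L1"
    if L2_OF[core_a] == L2_OF[core_b]:
        return "L2"
    return "Memory"
```

For example, `sharing_level("C0", "C2")` reports a shared L2, while `sharing_level("C0", "C1")` reports that the two cores only meet at memory, so threads that exchange data often would be better placed on the first pair.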

HPC Cluster using Multi-Core Nodes

Figure: four multi-core nodes (Node 0 to Node 3), each with its own Memory and NIC, connected through a network (N/W) switch

Progress in Processor Architecture

• More transistors → new architecture innovations
  – Multiple-instruction-issue processors
  – More on-chip caches
  – Multi-cores
  – Heterogeneous cores and accelerators
    • Graphics Processing Units (GPUs)
    • Cell BE, Clearspeed
    • Larrabee
    • Reconfigurable accelerators …

Era of Heterogeneous Accelerators

Moore’s Law: Processor Architecture Roadmap (Post-2000)

Figure: roadmap with milestones First µP, RISC, VLIW, Super-scalar, EPIC, Multi-cores, Accelerators

Accelerators

Why Bother about Accelerators?

Some Top500 Systems (Nov. 2009 List)

Rank  System            Description                        # Procs.      R_max (TFLOPS)
2     Roadrunner        Opteron + CellBE                   6480 + 12960  1,105
29    LANL              Opteron + CellBE                   14400         126.50
56    TSUBAME Grid      Opteron + Xeon + Clearspeed + GPU  31024         87.0
79    IBM Poughkeepsie  Opteron + CellBE                   7200          63.25

HPC Design Using Accelerators

• High level of performance from accelerators
• Variety of general-purpose hardware accelerators
  – GPUs: nVidia, ATI
  – Accelerators: Clearspeed, Cell BE, …
  – Plethora of instruction sets, even for SIMD
• Programmable accelerators, e.g., FPGA-based
• HPC design using accelerators
  – Exploit instruction-level parallelism
  – Exploit data-level parallelism on SIMD units
  – Exploit thread-level parallelism on multiple units/multi-cores
• Challenges
  – Portability across different generations and platforms
  – Ability to exploit different types of parallelism
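One common response to the portability challenge is to keep a single kernel interface and select a target-specific implementation at run time. This is a minimal sketch of that idea, not any vendor's API: the registry `BACKENDS`, the target names and both kernels are hypothetical, and the "simd4" version only emulates four-lane data-level parallelism in plain Python.

```python
def saxpy_scalar(a, x, y):
    """Reference version: one element per step."""
    return [a * xi + yi for xi, yi in zip(x, y)]

def saxpy_simd4(a, x, y):
    """Emulates a 4-lane SIMD unit: the same operation on four elements per step."""
    out = []
    for i in range(0, len(x), 4):
        lanes_x, lanes_y = x[i:i + 4], y[i:i + 4]   # last group may be partial
        out.extend(a * xi + yi for xi, yi in zip(lanes_x, lanes_y))
    return out

# Hypothetical registry: one entry per supported target.
BACKENDS = {"scalar": saxpy_scalar, "simd4": saxpy_simd4}

def saxpy(a, x, y, target="scalar"):
    """Dispatch to the kernel registered for `target`, else fall back to scalar."""
    return BACKENDS.get(target, saxpy_scalar)(a, x, y)

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0.5] * 6
result = saxpy(2.0, x, y, target="simd4")
```

Callers see only `saxpy`; supporting a new accelerator generation then means registering another entry rather than rewriting the application, at the cost of maintaining one kernel per target.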

Summary

• Multi-cores and heterogeneous accelerators present tremendous research opportunities in
  – Architecture
  – High Performance Computing
  – Programming Languages & Models
  – Compilers

• Proebsting’s Law: Compiler Technology Doubles CPU Power Every 18 YEARS!!

• Time to Rewrite Proebsting’s Law?
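The gap between the two doubling periods quoted in this talk (1.5 years for processor performance, 18 years for compiler technology) is easier to see as yearly rates. This is a small arithmetic sketch that assumes both trends are perfectly exponential; the function name `annual_growth` is hypothetical.

```python
def annual_growth(doubling_years):
    """Yearly improvement factor implied by a given doubling period."""
    return 2 ** (1.0 / doubling_years)

hardware = annual_growth(1.5)    # processor performance doubles every 1.5 years
compiler = annual_growth(18.0)   # Proebsting's Law: doubles every 18 years

print(f"hardware: {100 * (hardware - 1):.1f}% per year")
print(f"compiler: {100 * (compiler - 1):.1f}% per year")
```

Under these assumptions hardware improves by roughly 59% per year and compilers by roughly 4% per year. Put another way, in the 18 years compilers need for one doubling, the hardware trend doubles 12 times, a factor of 4096.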

Thank You !!