Transcript
Page 1: Ocelot and the SST-MacSim Simulator

SCHOOL OF ELECTRICAL AND COMPUTER ENGINEERING | SCHOOL OF COMPUTER SCIENCE | GEORGIA INSTITUTE OF TECHNOLOGY

Ocelot and the SST-MacSim Simulator

Hyesoon Kim, Jaewoong Sim, Joo Hwan Lee

School of Computer Science and School of Electrical and Computer Engineering

Georgia Institute of Technology

Atlanta, GA 30332

Page 2: Ocelot and the SST-MacSim Simulator


System Diversity

Keeneland System

Tianhe-1A

Amazon EC2 GPU Instances

Heterogeneity is Mainstream

Mobile Platforms


Page 3: Ocelot and the SST-MacSim Simulator


Heterogeneity On-Chip

• Vector Extensions
• AES Instructions
• Programmable Pipeline (GEN6)

Sandy Bridge

Programmable Accelerator

PowerEN

16 PowerPC cores

Accelerators

• Crypto Engine
• RegEx Engine
• XML Engine

ARM Style

Memory

Denver

Multiple models of Computation, Multi-ISA


Page 4: Ocelot and the SST-MacSim Simulator


Heterogeneous Systems: Keeneland

• 201 TFLOPS in 7 racks (90 sq ft incl. service area)
• 677 MFLOPS per watt on HPL (#9 on Green500, Nov 2010)
• Final delivery system planned for early 2012
• Integrated with NICS Datacenter GPFS and TG
• Full PCIe x16 bandwidth to all GPUs
• 12000-Series Director Switch

Performance by level:

Xeon 5660: 67 GFLOPS
M2070: 515 GFLOPS
ProLiant SL390s G7 (2 CPUs, 3 GPUs): 1679 GFLOPS, 24/18 GB
S6500 Chassis (4 Nodes): 6718 GFLOPS
Rack (6 Chassis): 40306 GFLOPS
Keeneland System (7 Racks): 201528 GFLOPS

Courtesy J. Vetter (GT/ORNL)
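
The GFLOPS figures on this slide can be cross-checked against the component counts it lists. The following is a minimal sketch in Python, assuming the two smallest figures belong to a single Xeon 5660 (67) and a single M2070 (515) and that each level is a simple sum of the level below it. The variable names are illustrative only and do not come from the slides.

```python
# Per-device figures as printed on the slide (GFLOPS).
# Assumption: 67 is one Xeon 5660 and 515 is one M2070.
CPU_GFLOPS = 67
GPU_GFLOPS = 515

# Component counts as printed on the slide.
CPUS_PER_NODE, GPUS_PER_NODE = 2, 3
NODES_PER_CHASSIS = 4
CHASSIS_PER_RACK = 6

# Roll the figures up one level at a time.
node = CPUS_PER_NODE * CPU_GFLOPS + GPUS_PER_NODE * GPU_GFLOPS
chassis = NODES_PER_CHASSIS * node
rack = CHASSIS_PER_RACK * chassis

# Values printed on the slide, for comparison.
SLIDE = {"node": 1679, "chassis": 6718, "rack": 40306, "system": 201528}

for name, value in [("node", node), ("chassis", chassis), ("rack", rack)]:
    print(f"{name:8s} computed={value:6d}  slide={SLIDE[name]:6d}")

# How many nodes does the printed system figure correspond to?
nodes_implied = SLIDE["system"] / SLIDE["node"]
print(f"system figure / node figure = {nodes_implied:.1f} nodes")
```

The node figure reproduces exactly. The chassis and rack figures differ from the slide by a few GFLOPS, which would be consistent with the slide rounding the per-device values before printing them. The system figure works out to roughly 120 nodes, which is fewer than 7 racks of 24 nodes each; the slide does not say how the racks are populated, so that reading is only an inference from the arithmetic.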


Page 5: Ocelot and the SST-MacSim Simulator


Heterogeneous Architecture & Systems Research

• Lexical Analyzer
• Parser
• Semantic analysis
• Optimization
• Code generation
• Post pass optimization

[Figure: Many-tier hybrid memory system (Substrate, Read-out ckt, NVM, DRAM, NVRAM)]

VLIW (Cayman), SIMT (Fermi), New Designs

• Microarchitecture
• Memory systems
• Network on Chip
• Power Management
• + Many more

• Memory Optimizations
• Program Transformations
• Control Flow Optimizations
• + Many more

Common Research Themes

Instruction set architecture

Focus on explicitly data parallel languages – bulk synchronous models
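
To make "bulk synchronous" concrete: work proceeds in supersteps, where each worker computes on its own slice of the data, all workers meet at a barrier, and only then read one another's results. The following is a minimal sketch in Python using `threading.Barrier`. The function `bsp_normalize` and its parameters are hypothetical illustrations and are not part of Ocelot, MacSim, or SST.

```python
import threading


def bsp_normalize(data, num_workers=4):
    """Scale `data` so it sums to 1.0, using two bulk-synchronous supersteps."""
    partial = [0.0] * num_workers      # one private slot per worker
    out = [0.0] * len(data)
    barrier = threading.Barrier(num_workers)
    chunk = (len(data) + num_workers - 1) // num_workers

    def worker(rank):
        lo = rank * chunk
        hi = min((rank + 1) * chunk, len(data))
        # Superstep 1: purely local computation on this worker's slice.
        partial[rank] = sum(data[lo:hi])
        # Global synchronization: no worker continues until all partial
        # sums have been written.
        barrier.wait()
        # Superstep 2: read every worker's result, update only the local slice.
        total = sum(partial)
        for i in range(lo, hi):
            out[i] = data[i] / total

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out


result = bsp_normalize([1.0, 2.0, 3.0, 4.0], num_workers=2)
print(result)
```

Data-parallel GPU languages follow the same pattern at a much finer grain: many threads run the same code on different elements and synchronize at explicit barriers. Python threads are used here only to show the structure, not to gain performance.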


Page 6: Ocelot and the SST-MacSim Simulator


Research Infrastructure Challenges

Microarch Simulator

Power & Thermal Models

• Open source Compiler infrastructures for GPU computing
• Microarchitecture cycle-level timing simulators for heterogeneous architectures
• Integration between compiler, simulators, and models
• Scalable simulation infrastructures

Simulation wall! Ability to integrate point tools

[Figure: grid of tiles]

Page 7: Ocelot and the SST-MacSim Simulator


Tutorial Overview

Low level Compiler Infrastructure for GPU Computing

• Ocelot Dynamic Execution Infrastructure (Joo Hwan Lee, Hyesoon Kim)

Heterogeneous Cycle-level Architecture Models

• MacSim Heterogeneous Architecture Simulator (Jaewoong Sim, Hyesoon Kim)

Parallel Simulation Infrastructure

• SST: Structural Simulation Toolkit (Jaewoong Sim, Hyesoon Kim)

Page 8: Ocelot and the SST-MacSim Simulator


Tutorial Schedule

Part                  Topical Description
Part I (90 min.)      Ocelot Overview: Architecture
Part II (60 min.)     Ocelot: Supported Devices
Part III (60 min.)    MacSim: Overview
Lunch
Part IV (90 min.)     MacSim: Simulator Architecture; MacSim: Configuration
Part V (30 min.)      Case Studies using SST-MacSim

