evolution of the modern graphics architectures with a focus on gpus | turing100@persistent

30
Evolution of Graphics Architectures with a focus on GPUs Sanjiv Satoor Senior Manager, NVIDIA

Upload: persistent-systems-ltd

Post on 21-Jun-2015

320 views

Category:

Technology


0 download

DESCRIPTION

Sanjiv Satoor, Sr. Manager, NVIDIA talks about evolution of the modern graphics architectures with a focus on GPUs.

TRANSCRIPT

Page 1: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

Evolution of Graphics Architectures

with a focus on GPUs

Sanjiv Satoor

Senior Manager, NVIDIA

Page 2: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

First Generation - Wireframe

Vertex: transform, clip, and project

Rasterization: lines only

Pixel: no pixels! calligraphic display

Dates: prior to 1987

Page 3: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

Storage Tube Terminals

CRTs with analog charge “persistence”

Accumulate a detailed static image by writing points or

line segments

Erase the stored image to start a new one

Page 4: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

Early Framebuffers

By the mid-1970’s one could afford framebuffers with a

few bits per pixel at modest resolution

“A Random Access Video Frame Buffer”,

Kajiya, Sutherland, Cheadle, 1975

Vector displays were still better for fine position detail

Framebuffers were used to emulate storage tube vector

terminals on a raster display

Page 5: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

Second Generation – Shaded Solids

Vertex: lighting

Rasterization: filled polygons

Pixel: depth buffer, color blending

Dates: 1987 - 1992

Page 6: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

Third Generation – Texture Mapping

Vertex: more, faster

Rasterization: more, faster

Pixel: texture filtering, antialiasing

Dates: 1992 - 2001

Page 7: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

IRIS 3000 Graphics Cards

Geometry Engines & Rasterizer 4 bit / pixel Framebuffer

(2 instances)

Page 8: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

1990’s

Desktop 3D workstations under $5000

Single-board, multi-chip graphics subsystems

Rise of 3D on the PC

40 company free-for-all until intense competition knocked out all but a

few players

Many were “decelerators”, and easy to beat

Single-chip GPUs

Interesting hardware experimentation

PCs would take over the workstation business

Interesting consoles

3DO, Nintendo, Sega, Sony

Page 9: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

1998 1999 2000 2001 2002 2003 2004

DirectX 6

Multitexturing

Riva TNT

DirectX 8

SM 1.x

GeForce 3 Cg

DirectX 9

SM 2.0

GeForceFX

DirectX 9.0c

SM 3.0

GeForce 6

DirectX 5

Riva 128

DirectX 7

T&L TextureStageState

GeForce 256

Quake 3 Giants Halo Far Cry UE3 Half-Life

All images © their respective owners

Moving toward programmability

Page 10: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

RIVA 128 3M xtors

GeForce 256 23M xtors

GeForce FX 250M xtors

GeForce 8800 681M xtors

GeForce 3 60M xtors

“Kepler” 7B xtors

1995 2000 2001 2006 2012

Fixed function Programmable shaders CUDA

2003

Evolution of GPUs

Page 11: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

Copyright © NVIDIA Corporation 2006 Unreal © Epic

Per-Vertex Lighting No Lighting Per-Pixel Lighting

Page 12: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

Lush, Rich Worlds Stunning Graphics Realism

Core of the Definitive Gaming Platform Incredible Physics Effects

Hellgate: London © 2005-2006 Flagship Studios, Inc. Licensed by NAMCO BANDAI Games America, Inc.

Crysis © 2006 Crytek / Electronic Arts

Full Spectrum Warrior: Ten Hammers © 2006 Pandemic Studios, LLC. All rights reserved. © 2006 THQ Inc. All rights reserved.

Page 13: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

Tradition Fixed Function Graphics pipeline

T&L evolved to vertex shading

memory

interface

vertex

processing

triangle

setup

pixel

processing

raster

operations

Triangle, point, line setup

Flat shading, texturing eventually pixel shading

Blending, Z-buffering, Antialiasing

Wider and faster over the years

Processor per function

Page 14: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent
Page 15: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

Migration of functionality to GPU hardware

Page 16: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

GeForce3/DX8 Pixel Shading Pipeline

Page 17: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

Programmable Shaders: GeForceFX (2002)

Vertex and fragment operations specified in small (macro) assembly language

User-specified mapping of input data to operations

Limited ability to use intermediate computed values to index input data (textures and vertex uniforms)

Input 2Input 1Input 0

OP

Temp 2Temp 1Temp 0

ADDR R0.xyz, eyePosition.xyzx, -f[TEX0].xyzx;

DP3R R0.w, R0.xyzx, R0.xyzx;

RSQR R0.w, R0.w;

MULR R0.xyz, R0.w, R0.xyzx;

ADDR R1.xyz, lightPosition.xyzx, -f[TEX0].xyzx;

DP3R R0.w, R1.xyzx, R1.xyzx;

RSQR R0.w, R0.w;

MADR R0.xyz, R0.w, R1.xyzx, R0.xyzx;

MULR R1.xyz, R0.w, R1.xyzx;

DP3R R0.w, R1.xyzx, f[TEX1].xyzx;

MAXR R0.w, R0.w, {0}.x;

Page 18: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

GeForce 6 Architecture

Page 19: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

Unified Hardware Shader Design

Page 20: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

L2

FB

SP SP

L1

TF

Th

read

Pro

cesso

r

Vtx Thread Issue

Setup / Rstr / ZCull

Geom Thread Issue Pixel Thread Issue

Input Assembler

Host

SP SP

L1

TF

SP SP

L1

TF

SP SP

L1

TF

SP SP

L1

TF

SP SP

L1

TF

SP SP

L1

TF

SP SP

L1

TF

L2

FB

L2

FB

L2

FB

L2

FB

L2

FB

GeForce 8 Architecture Build the architecture around the processor

Page 21: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

Millions of triangles Millions of pixels

Why are so many parallel operations needed?

Input triangle Tessellate Projection Rasterize Shade Transform vertices

Image plane

Camera

Page 22: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

GPU = More computational horsepower and

bandwidth per watt

Few complex processors

Optimized for single-threaded performance

Many simple processors with minimal overhead

Slow single-threaded performance but massive overall throughput

Page 23: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

GPU Architecture

Efficiency

Programmability

Performance

Page 24: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

GPU Architecture:

Two Main Components

Streaming Multiprocessors (SMs)

Perform the actual computations

Each SM has its own:

Control units, registers, execution pipelines, caches

Global memory

Analogous to RAM in a CPU server

Accessible by both GPU and CPU

Currently up to 6 GB per GPU

Bandwidth currently up to 250 GB/s

DR

AM

I/F

G

iga

Th

rea

d

HO

ST

I/F

D

RA

M I

/F

DR

AM

I/F

DR

AM

I/F

DR

AM

I/F

DR

AM

I/F

L2

Page 25: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

KEPLER

The Fastest, Most Efficient GPU Ever Built

Page 26: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

Kepler GK110 Architecture

7.1B Transistors

14 SMX units

3.95 TFLOP FP32

1.31 TFLOP FP64

250 GB/sec

2688 cores

PCI Express Gen3

Page 27: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

WORLD’S #1 SUPERCOMPUTER With a peak performance of 27 petaflops, the Titan supercomputer at Oak Ridge National Labs is the world’s fastest. 18,688 GPUs provide 90% of the machine’s computing power.

Page 28: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent
Page 29: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

The Graphics pipeline

Vertex and fragment processing are programmable

The programmer can write programs that are executed for every vertex as well as for every fragment

This allows fully customizable geometry and shading effects that go well beyond the generic look and feel of older 3D applications

host

interface

vertex

processing

triangle

setup

pixel

processing

memory

interface

Page 30: Evolution of the modern graphics architectures with a focus on GPUs | Turing100@Persistent

Thank you