
Post on 15-Dec-2015


DIRECT3D 11 PREVIEW

UTAH CODE CAMP FALL 2008

Richard Thomson

DAZ 3D

www.daz3d.com

Direct3D 11

CTP in November 2008 DirectX SDK

Vista (and beyond) only, not on XP

Evolution of Direct3D 10

Compatible with D3D 10 cards

Evolution of Direct3D

Direct3D 9
• Stable, been around for a while
• Last version to be deployed on Win XP

Direct3D 10
• First Vista-only version
• Big change from D3D 9

Direct3D 10.1
• Incremental tweak to D3D 10

Direct3D 10/10.1/11 vs. 9
• Enumeration factored out to DXGI
• Same DXGI used for 10, 10.1 and 11
• Render/texture states divided into chunks
• Chunks of state are immutable objects
• “Device state” consists of the set of assigned state chunks
• Introduces new shader stages beyond vertex and pixel shaders
• Tighter API specification => no CAPS
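The state-chunk idea can be pictured with a small CPU-side C++ sketch. The type names here (`BlendState`, `RasterizerState`, `DeviceState`, `setBlend`) are hypothetical stand-ins invented for illustration, not the Direct3D API. The sketch only shows the shape of the model: chunks are built once, never edited, and the device state is whichever chunks are currently assigned.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins for immutable state chunks: every field is const,
// so a chunk cannot be modified after it has been created.
struct BlendState {
    const bool  blendEnable;
    const float blendFactor;
};

struct RasterizerState {
    const bool wireframe;
    const bool cullBackFaces;
};

// "Device state" is simply the set of chunks currently assigned.
struct DeviceState {
    std::shared_ptr<const BlendState>      blend;
    std::shared_ptr<const RasterizerState> rasterizer;
};

// Changing state means assigning a different chunk, never editing one in place.
inline void setBlend(DeviceState& dev, std::shared_ptr<const BlendState> chunk) {
    assert(chunk != nullptr);
    dev.blend = std::move(chunk);
}
```

Because a chunk never changes, it could be validated once at creation time and then shared freely between draws, which is one plausible reading of why the slide pairs immutability with a tighter specification.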

Direct3D 11 Focus

Scalability and performance

Improving the development experience

Extending the reach of the GPU

Direct3D 11 New Features
• Tessellation
• Compute Shader
• Multithreading
• Shader Subroutines
• Improved Texture Compression
• Other Features

Tessellation

Direct3D 10 pipeline plus three new stages for tessellation (Hull Shader, Tessellator, Domain Shader):

Input Assembler → Vertex Shader → Hull Shader → Tessellator → Domain Shader → Geometry Shader (→ Stream Output) → Rasterizer → Pixel Shader → Output Merger

Hull Shader

[Diagram: Hull Shader → Tessellator → Domain Shader]

• HS input: patch control points
• HS output: patch control points after basis conversion
• HS output: TessFactors (how much to tessellate) and fixed tessellator mode declarations
• One Hull Shader invocation per patch

Hull Shader Syntax

[patchsize(12)]
[patchconstantfunc(MyPatchConstantFunc)]
MyOutPoint main(uint Id : SV_ControlPointID,
                InputPatch<MyInPoint, 12> InPts)
{
    MyOutPoint result;
    …
    result = TransformControlPoint( InPts[Id] );
    return result;
}

Tessellator

[Diagram: Hull Shader → Tessellator → Domain Shader]

• TS input: TessFactors (how much to tessellate) and fixed tessellator mode declarations
• TS output: U V {W} domain points
• TS output: topology (to primitive assembly)
• Note: the Tessellator does not see control points
• The Tessellator operates per patch
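To make the tessellator's role concrete, here is a minimal CPU-side C++ sketch of turning a tessellation factor into U V domain points for a quad patch. It assumes a single integer factor and a uniform grid. The hardware stage takes separate edge and inside factors, supports other partitioning modes and also emits topology, none of which is modeled here. `DomainPoint` and `tessellateQuadDomain` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct DomainPoint { float u, v; };

// Generate a uniform (n+1) x (n+1) grid of UV points over the unit square.
// Note that no control points are involved: the function sees only the factor.
std::vector<DomainPoint> tessellateQuadDomain(int tessFactor) {
    assert(tessFactor >= 1);
    std::vector<DomainPoint> points;
    points.reserve(static_cast<std::size_t>(tessFactor + 1) *
                   static_cast<std::size_t>(tessFactor + 1));
    for (int j = 0; j <= tessFactor; ++j) {
        for (int i = 0; i <= tessFactor; ++i) {
            points.push_back({ static_cast<float>(i) / tessFactor,
                               static_cast<float>(j) / tessFactor });
        }
    }
    return points;
}
```

Each generated point would then be handed to one domain shader invocation.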

Domain Shader

[Diagram: Hull Shader → Tessellator → Domain Shader]

• DS input: U V {W} domain points
• DS input: control points and TessFactors
• DS output: one vertex
• One Domain Shader invocation per point from the Tessellator

Domain Shader Syntax

void main( out MyDSOutput result,
           float2 myInputUV : SV_DomainPoint,
           MyDSInput DSInputs,
           OutputPatch<MyOutPoint, 12> ControlPts,
           MyTessFactors tessFactors )
{
    …
    result.Position = EvaluateSurfaceUV( ControlPts, myInputUV );
}
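The slide leaves `EvaluateSurfaceUV` undefined. As a sketch of what such a function might do, the C++ below evaluates a bicubic Bezier patch on the CPU. It assumes 16 control points in row-major order, which is not the 12-point patch declared in the slide's syntax example, and the `Vec3`, `bernstein3` and `evaluateSurfaceUV` names are illustrative only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

struct Vec3 { float x, y, z; };

// Cubic Bernstein basis weights for a parameter t in [0, 1].
std::array<float, 4> bernstein3(float t) {
    const float s = 1.0f - t;
    return { s * s * s, 3.0f * s * s * t, 3.0f * s * t * t, t * t * t };
}

// Evaluate a bicubic Bezier patch at (u, v).
// Assumption: cp holds 4 rows of 4 control points; u runs along a row, v across rows.
Vec3 evaluateSurfaceUV(const std::array<Vec3, 16>& cp, float u, float v) {
    assert(u >= 0.0f && u <= 1.0f && v >= 0.0f && v <= 1.0f);
    const std::array<float, 4> bu = bernstein3(u);
    const std::array<float, 4> bv = bernstein3(v);
    Vec3 p{ 0.0f, 0.0f, 0.0f };
    for (int j = 0; j < 4; ++j) {
        for (int i = 0; i < 4; ++i) {
            const float w = bu[static_cast<std::size_t>(i)] * bv[static_cast<std::size_t>(j)];
            const Vec3& c = cp[static_cast<std::size_t>(j * 4 + i)];
            p.x += w * c.x;
            p.y += w * c.y;
            p.z += w * c.z;
        }
    }
    return p;
}
```

A domain shader would run the equivalent of this once per domain point and could then offset the result along the surface normal by a displacement-map sample.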

Single Pass Example

• Input: patch control points
• Vertex shader: animate/skin control points; outputs transformed control points
• Hull shader: transform basis (Sub-D patch to Bezier patch) and determine how much to tessellate; outputs control points in Bezier patch and Tess Factors
• Tessellator: tessellate! Outputs U V {W} domain points
• Domain shader: evaluate surface including displacement, reading the displacement map

Current Authoring Pipeline
(Rocket Frog taken from Loop & Schaefer, “Approximating Catmull-Clark Subdivision Surfaces with Bicubic Patches”)

• Sub-D Modeling
• Animation
• Displacement Map
• Polygon Mesh
• Generate LODs

New Authoring Pipeline
(Rocket Frog taken from Loop & Schaefer, “Approximating Catmull-Clark Subdivision Surfaces with Bicubic Patches”)

• Sub-D Modeling
• Animation
• Displacement Map
• Optimally Tessellated Mesh (GPU)

Tessellation Summary
• Helps us get closer to eliminating “pointy heads”
• Scales visual quality across PC hardware configurations
• Supports performance increases
  ○ Coarse model = compression, faster I/O to GPU
  ○ Rendering tailored to each end user’s hardware
• Better cross-platform (Windows + Xbox 360) development experience
  ○ Xbox 360 has a subset of D3D11’s tessellation
  ○ Parity = ease of cross-platform development
  ○ Extra features = innovation for Windows gaming
• Render content as the artist created it!

More on Tessellation

GameFest 2008 slides and audio:
• “Direct3D 11 Tessellation”
  ○ Kev Gee, Microsoft
• “Advanced Topics in GPU Tessellation”
  ○ Natasha Tatarchuk, AMD/ATI
• “Water-Tight, Textured, Displaced Subdivision Surface Tessellation Using Direct3D 11”
  ○ Ignacio Castano, NVIDIA

General Purpose GPU

Data Parallel Computing
• GPU performance continues to grow
• Many applications scale well to massive parallelism without tricky code changes
• Direct3D is the API for talking to the GPU
• How do we expand Direct3D to GPGPU?

Compute Shader

Direct3D 10 pipeline plus three new stages for tessellation plus the Compute Shader:

Input Assembler → Vertex Shader → Hull Shader → Tessellator → Domain Shader → Geometry Shader (→ Stream Output) → Rasterizer → Pixel Shader → Output Merger

[Diagram: the Compute Shader and a Data Structure box shown alongside the pipeline stages]

Integrated with Direct3D
• Fully supports all Direct3D resources
• Targets graphics/media data types
• Evolution of DirectX HLSL
• Graphics pipeline updated to emit general data structures…
• …which can then be manipulated by compute shader…
• …and then rendered by Direct3D again

Target Applications

• Image/post processing:
  ○ Image reduction
  ○ Image histogram
  ○ Image convolution
  ○ Image FFT
• A-Buffer/OIT
• Ray-tracing, radiosity, etc.
• Physics
• AI

Computing a Histogram

Histogram()
{
    shared int Histograms[16][256];  // array of 16

    float3 vPixel     = load( sampler, sv_ThreadID );
    float  fLuminance = dot( vPixel, LUM_VECTOR );
    int    iBin       = fLuminance*255.0f;     // compute bin to increment
    int    iHist      = sv_ThreadIDInGroup & 15; // use thread index (mask 15 keeps it in 0..15)
    Histograms[iHist][iBin] += 1;              // update bin

    // enable all threads in group to complete
    SynchronizeThreadGroup;

Computing a Histogram 2

    // Write register histograms out to memory:
    iBin = sv_ThreadIDInGroup.x;
    if (iBin < 256)  // guard on the in-group index, which selects this thread's bin
    {
        for (iHist = 0; iHist < 16; iHist++)
        {
            int2 destAddr = int2(iHist, iBin);
            OutputResource.add(destAddr, Histograms[iHist][iBin]);  // atomic
        }
    }
}
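The two histogram slides use preview pseudocode that cannot be compiled as written. The following CPU-side C++ sketch follows the same two-phase idea for one thread group: spread increments across 16 sub-histograms, then combine them once every thread has finished. Threads are simulated with a plain loop, and the sub-histograms are merged into a single result instead of being written to an output resource with atomic adds. Those simplifications, and the names `histogramForGroup`, `kSubHistograms` and `kBins`, are assumptions made here and do not come from the deck.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kSubHistograms = 16;
constexpr std::size_t kBins = 256;

using Histogram = std::array<std::uint32_t, kBins>;

// luminance holds one value in [0, 1] per simulated thread of the group.
Histogram histogramForGroup(const std::vector<float>& luminance) {
    // Stand-in for group-shared memory, zero-initialised.
    std::array<Histogram, kSubHistograms> local{};

    // Phase 1: each "thread" bumps one bin in the sub-histogram picked by its index.
    for (std::size_t thread = 0; thread < luminance.size(); ++thread) {
        const float l = luminance[thread];
        assert(l >= 0.0f && l <= 1.0f);
        const std::size_t bin  = static_cast<std::size_t>(l * 255.0f);
        const std::size_t hist = thread % kSubHistograms;
        local[hist][bin] += 1;
    }

    // On a GPU a group barrier would sit here, before any thread reads the bins.

    // Phase 2: combine the 16 sub-histograms bin by bin.
    Histogram total{};
    for (std::size_t bin = 0; bin < kBins; ++bin) {
        for (std::size_t hist = 0; hist < kSubHistograms; ++hist) {
            total[bin] += local[hist][bin];
        }
    }
    return total;
}
```

Splitting the counts over several sub-histograms is a common way to reduce contention when many threads hit the same bin, which is presumably why the slide allocates 16 of them.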

Compute Shader Summary
• Enables much more general algorithms
• Transparent parallel processing model
• Full cross-vendor support
• Broadest possible installed base
• GameFest 2008: “Direct3D 11 Compute Shader – More Generality for Advanced Techniques”
  ○ Chas Boyd, Microsoft

Multithreading
• Enables distribution across threads of:
  ○ Application code
  ○ Runtime
  ○ Driver
• Device: free-threaded resource creation
• Immediate Context: your single primary device for state & draws
• Deferred Contexts: your per-thread devices for state & draws
• Display Lists: recorded sequence of graphics commands
• Requires a driver update
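The record-then-replay relationship between deferred contexts, display lists and the immediate context can be sketched in plain C++. The classes below are hypothetical mock-ups, not the Direct3D interfaces. In the released Direct3D 11 API the corresponding calls are `ID3D11Device::CreateDeferredContext`, `ID3D11DeviceContext::FinishCommandList` and `ID3D11DeviceContext::ExecuteCommandList`, where display lists are called command lists. Recording is done sequentially here to keep the sketch simple. In a real application each deferred context would be driven by its own worker thread.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// A display list is modeled as a recorded sequence of command descriptions.
using DisplayList = std::vector<std::string>;

// Hypothetical per-thread context: it records commands but submits nothing.
class DeferredContext {
public:
    void setState(const std::string& name) { recorded_.push_back("state:" + name); }
    void draw(int vertexCount) {
        assert(vertexCount > 0);
        recorded_.push_back("draw:" + std::to_string(vertexCount));
    }
    // Hand over everything recorded so far and start a fresh recording.
    DisplayList finish() {
        DisplayList out = std::move(recorded_);
        recorded_.clear();
        return out;
    }
private:
    DisplayList recorded_;
};

// Hypothetical single primary context: the only place commands are submitted.
class ImmediateContext {
public:
    void execute(const DisplayList& list) {
        submitted_.insert(submitted_.end(), list.begin(), list.end());
    }
    const std::vector<std::string>& submitted() const { return submitted_; }
private:
    std::vector<std::string> submitted_;
};
```

The design point is that recording touches only per-context data, so it needs no locking, while submission order is still decided in one place.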

Shader Subroutines

Details
• Calls must be fast
• Binding applies to all primitives in a Draw call
• Binding operation must be fast
• Need parameter passing mechanism
• Need access to textures, samplers, etc.

Advantages
• Reduce register usage in Über-shaders
  ○ Not worst case of all if statements
• Allows specialization of subroutines
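As a rough analogy only, the binding model for subroutines resembles choosing an implementation of an interface once per draw call and then running it for every primitive. The C++ sketch below uses virtual functions to show that shape. The `Light`, `DiffuseLight`, `AmbientLight` and `draw` names are invented for illustration and say nothing about how the GPU or HLSL implements the feature.

```cpp
#include <cassert>
#include <vector>

// Hypothetical subroutine slot: one entry point, several interchangeable bodies.
struct Light {
    virtual ~Light() = default;
    virtual float shade(float nDotL) const = 0;
};

struct DiffuseLight : Light {
    // Clamp the cosine term to zero for surfaces facing away from the light.
    float shade(float nDotL) const override { return nDotL > 0.0f ? nDotL : 0.0f; }
};

struct AmbientLight : Light {
    explicit AmbientLight(float i) : intensity(i) { assert(i >= 0.0f); }
    // Constant contribution, independent of surface orientation.
    float shade(float) const override { return intensity; }
    float intensity;
};

// One "draw call": the binding is chosen once and applies to every primitive.
std::vector<float> draw(const Light& bound, const std::vector<float>& nDotLPerPrimitive) {
    std::vector<float> shaded;
    shaded.reserve(nDotLPerPrimitive.size());
    for (float nDotL : nDotLPerPrimitive) {
        shaded.push_back(bound.shade(nDotL));
    }
    return shaded;
}
```

Compared with an Über-shader that branches over every light type, each bound body carries only its own code, which is the register-usage advantage the slide points to.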

Improved Texture Compression

Why?

• Existing block palette interpolations too simple
• Results often rife with blocking artifacts
• No high dynamic range (HDR) support

New Texture Formats

BC6 (aka BC6H)
• High dynamic range
• 6:1 compression (16 bpc RGB)
• Targeting high (not lossless) visual quality

BC7
• LDR with alpha
• 3:1 compression for RGB or 4:1 for RGBA
• High visual quality
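The quoted ratios follow from the block size. BC6H and BC7 both store each 4×4 pixel block in 128 bits, so the ratio is the uncompressed size of 16 pixels divided by 128. A short C++ check of that arithmetic, with constant and function names chosen here for illustration:

```cpp
#include <cassert>

constexpr int kBlockPixels         = 4 * 4;  // pixels per compressed block
constexpr int kCompressedBlockBits = 128;    // BC6H and BC7 block size in bits

// Compression ratio for a source format with the given uncompressed bits per pixel.
constexpr int compressionRatio(int uncompressedBitsPerPixel) {
    return (uncompressedBitsPerPixel * kBlockPixels) / kCompressedBlockBits;
}

static_assert(compressionRatio(3 * 16) == 6, "BC6H: 16 bpc RGB is 48 bpp, giving 6:1");
static_assert(compressionRatio(3 * 8)  == 3, "BC7: 8 bpc RGB is 24 bpp, giving 3:1");
static_assert(compressionRatio(4 * 8)  == 4, "BC7: 8 bpc RGBA is 32 bpp, giving 4:1");
```

Put another way, both formats cost 8 bits per pixel regardless of content, which is what a fixed compression ratio means in practice.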

Compression of New Formats
• Block compression (unchanged)
  ○ Each block independent
  ○ Fixed compression ratio
• Multiple block types (new)
  ○ Tailored to different types of content
  ○ Smooth gradients vs. noisy normal maps
  ○ Varied alpha vs. constant alpha
• Decompression results must be bit-accurate with spec

Comparison Results 1

[Figure: original vs. BC3 and original vs. BC7 image pairs, each with its absolute error]

Comparison Results 2

[Figure: original vs. BC3 and original vs. BC7 image pairs, each with its absolute error]

Comparison Results 3

[Figure: HDR original at given exposure, BC6 at given exposure, and absolute error]

Other Features

• Addressable Stream Out
• Draw Indirect
• Pull-model attribute eval
• Improved Gather4
• Min-LOD texture clamps
• 16K texture limits
• Required 8-bit subtexel, submip filtering precision
• Conservative oDepth
• 2 GB Resources
• Geometry shader instance programming model
• Optional double support
• Read-only depth or stencil views

Thanks

Allison Klein
Senior Lead Program Manager, Direct3D
Microsoft

Chas. Boyd
Architect, Windows Desktop & Gaming Technology
Microsoft

Thank you to our Sponsors!
