AMD GPU Tools for games development
Holger Gruen
European Developer Relations
AMD Graphics Products Group
Material for many slides was provided by J. Zarge and S. Sowerby from the AMD Graphics Performance Tools group
GCDC 2007, 20-22 August 2007
Holger Gruen AMD's GPU Tools for games development2 August 22, 2007
Overview
AMD CPU Tools
– CodeAnalyst, APL, ACML, …
AMD GPU Tools
– GPU PerfStudio + demo
– GPU ShaderAnalyzer + HD2000 code analysis
– Tootle
– AMD content creation tools
– Release dates
CPU Tools
AMD CodeAnalyst Performance Profiler
– Hotspot detection (various sampling methods)
– Call stack sampling
– Thread profiling, pipeline simulation, etc.
Performance Libraries
– APL (AMD Performance Library)
  Core functions (e.g. query CPU configuration)
  Image and signal processing functions
– ACML (AMD Core Math Library)
  BLAS (Basic Linear Algebra Subprograms)
  LAPACK, FFT suite, random number generators …
GPU PerfStudio
GPU PerfStudio: Overview
Monitors GPU in real-time
– API statistics
– Hardware/driver data
Visualizes data in real-time
– In plots
– In bar charts
Client/Server architecture
– Local or remote apps
– Remote app launching
Override rendering states in real-time
Non-intrusive
– No driver instrumentation
GPU PerfStudio V1.0 Features: Hardware Counters
% Hardware Utilization
% Vertex wait for Pixel
ALU:TEX instructions
Primitive counts
Pixel vs. vertex bound
– Still makes sense under DX10
etc. …
GPU PerfStudio V1.0 Features: API Statistics
Per-frame API call data
– D3D9
– D3D10 very soon
Sorting of API calls by
– Call count
– Call timing
GPU PerfStudio Features: Data Plotting
Marker lines for API state changes
Multiple data series on single plot
Plot properties customizable
GPU PerfStudio Features: Bar Charts
Customizable “alarm” level
Customizable range
Customizable Tick Labels
Customizable Colors
GPU PerfStudio Features: State Overrides
GPU PerfStudio V1.1: Upcoming Features
DX10/Vista/HD2000 support
– Ability to analyze cutting-edge applications on the newest hardware
Render target and non-RT state overrides
– Especially useful for image-space effects (depth of field, glows, render to texture)
Data filters
– Average, median, min, max, derivative
– For gaining additional insight into your data
Plot lines for user-defined markers
– Demonstrate the effect of state override on performance data
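The data filters listed above can be illustrated with a minimal C++ sketch. It is not GPU PerfStudio's implementation; the names `FilterResult`, `summarize` and `derivative` are hypothetical, and the input is assumed to be a window of per-frame counter samples (for example, % hardware utilization).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical result type: one value per filter named on the slide.
struct FilterResult {
    double average;
    double median;
    double min;
    double max;
};

// Summarize a non-empty window of per-frame samples.
// The window is taken by value so it can be sorted without touching the caller's data.
inline FilterResult summarize(std::vector<double> window) {
    assert(!window.empty());
    std::sort(window.begin(), window.end());
    const std::size_t n = window.size();
    const double sum = std::accumulate(window.begin(), window.end(), 0.0);
    // Median: middle element, or the mean of the two middle elements for even n.
    const double median = (n % 2 == 1)
        ? window[n / 2]
        : 0.5 * (window[n / 2 - 1] + window[n / 2]);
    return {sum / static_cast<double>(n), median, window.front(), window.back()};
}

// Finite-difference derivative: the change between consecutive samples.
inline std::vector<double> derivative(const std::vector<double>& samples) {
    std::vector<double> d;
    for (std::size_t i = 1; i < samples.size(); ++i) {
        d.push_back(samples[i] - samples[i - 1]);
    }
    return d;
}
```

A derivative series of this kind makes sudden jumps in a counter stand out, which is one plausible reason to plot it next to the raw data.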
GPU PerfStudio V1.1: Upcoming Features (cont.)
Even more flexible bar charts
– Control colors, sizes, alarm, range and more
Flexible table cell formatting
– Control cell size, color, font
Application remembers settings
– Recent sessions, recent server machines, recent apps, etc.
Selectable anti-aliasing
– Fine-tune the look and performance of GPU PerfStudio
GPU PerfStudio demo: DX9 SDK samples
GPU PerfStudio on XP/DX9
GPU PerfStudio demo: DX10 SDK samples
GPU PerfStudio on Vista/DX10
GPU ShaderAnalyzer
GPU ShaderAnalyzer: Feature Overview
Shader performance analysis tool
Shader tuning environment
Instant perf. feedback as you tune your shaders
GSA Features: Performance Analysis
Predicts shader perf. on a range of AMD GPUs
Analyzes the compiled hardware instruction stream
– Analysis is tied to specific Catalyst driver releases
Displays estimated cycle count & ALU:Texture ratio
Color-codes the shader's ALU:Tex ratio as
– ALU bound, texture bound, interpolator bound
GSA Features: Performance Analysis (cont.)
Considers static & dynamic flow control.
Considers the cost of each side of a branch
Calculates the minimum, maximum & average cycles
Factors in the expected flow control coherence
Currently only average cycles are displayed in the GUI
– Min & max are available from the command line and will be selectable in new versions
Estimates cost of texturing for bi-, tri-linear & aniso
– Cost based on typical texture fetch cost, not theoretical maximum
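To make the minimum, maximum and average cycle counts concrete, here is a minimal C++ sketch for a single if/else branch. It is not GSA's actual cost model, which the slides do not describe. The names `CycleEstimate` and `estimateBranch` are hypothetical, and `takenProbability` is an assumed stand-in for the expected flow control behavior; divergence inside a group of pixels is ignored.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical result type for one shader with a single if/else.
struct CycleEstimate {
    double min;
    double max;
    double average;
};

// commonCycles:     cost of the code outside the branch
// thenCycles:       cost of the 'if' side
// elseCycles:       cost of the 'else' side
// takenProbability: assumed fraction of executions that take the 'if' side
inline CycleEstimate estimateBranch(double commonCycles,
                                    double thenCycles,
                                    double elseCycles,
                                    double takenProbability) {
    assert(takenProbability >= 0.0 && takenProbability <= 1.0);
    CycleEstimate e;
    // Best case: every execution takes the cheaper side.
    e.min = commonCycles + std::min(thenCycles, elseCycles);
    // Worst case: every execution takes the more expensive side.
    e.max = commonCycles + std::max(thenCycles, elseCycles);
    // Average: weight each side by how often it is assumed to run.
    e.average = commonCycles
              + takenProbability * thenCycles
              + (1.0 - takenProbability) * elseCycles;
    return e;
}
```

With 10 common cycles, a 20-cycle `if` side, a 4-cycle `else` side and a 25% chance of taking the `if` side, this model yields 14, 30 and 18 cycles for min, max and average.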
GSA Features: Hardware Disassembly
View the actual shader as executed by the hardware
Shows the hardware shader as optimized by the shader compiler (SC)
See where your shader performance is going
Can also display D3D shader disassembly
GSA Features: Supported Shader Formats
DX9 Pixel\Vertex Shaders
SM 1.1 – SM 3.0 Assembly Shaders
SM 2.0 – SM 3.0 HLSL Shaders
DX10 Pixel\Vertex\Geometry Shaders
GLSL Pixel\Vertex Shaders
arb_fp\arb_vp programs
GSA Features: Other features
Options dialog to configure
• HLSL compiler to use
• ATI Shader Compiler version
• GPUs to analyze performance for
• Options that control code analysis
Command line support
• Performance analysis & hardware disassembly also available from the command line, as mentioned earlier
• Analyze a single shader or a directory tree
• Output analysis to .csv file for further analysis within MS Excel
GPU ShaderAnalyzer : HD2000 code analysis
Short introduction: Coding for scalar ALUs on HD2000
HD2000 has 5 scalar ALUs / shader core
Common math can execute on all ALUs
Only one ALU can do
– Integer multiplies
– Type conversions
See HD2000_programming_guide
Why you should use GSA: decoding packed rgba
float4 main( uint color : COLOR ) : SV_Target
{
    uint r = (color      ) & 0xFF;
    uint g = (color >>  8) & 0xFF;
    uint b = (color >> 16) & 0xFF;
    uint a = (color >> 24) & 0xFF;
    return float4(r, g, b, a) * (1.0 / 255.0);
}
Compiles to 8 instr. slots !!!
Why you should use GSA: decoding packed rgba - better
float4 main( uint color : COLOR ) : SV_Target
{
    uint r = color & 0x000000FF;
    uint g = color & 0x0000FF00;
    uint b = color & 0x00FF0000;
    uint a = color & 0xFF000000;
    return float4(r, g, b, a) * (1.0 / (255 * float4(1, 256, 65536, 16777216)));
}
Compiles to 5 instr. slots ☺
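The two shaders are meant to decode the same color: the second removes the shifts by masking each channel in place and folding the shift into a per-channel scale factor. The C++ sketch below mirrors both variants on the CPU to check that they agree. It is an illustration only, not shader code; `Float4`, `decodeShiftMask`, `decodeMaskOnly` and `nearlyEqual` are hypothetical names, and the slot counts quoted on the slides come from GSA, not from this code.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct Float4 {
    float r, g, b, a;
};

// Mirrors the first shader: shift each channel down, mask, scale by 1/255.
inline Float4 decodeShiftMask(std::uint32_t color) {
    const float s = 1.0f / 255.0f;
    return {
        static_cast<float>((color      ) & 0xFFu) * s,
        static_cast<float>((color >>  8) & 0xFFu) * s,
        static_cast<float>((color >> 16) & 0xFFu) * s,
        static_cast<float>((color >> 24) & 0xFFu) * s,
    };
}

// Mirrors the second shader: mask in place, then divide out the channel's
// bit position (1, 256, 65536, 16777216) together with the 255.
inline Float4 decodeMaskOnly(std::uint32_t color) {
    return {
        static_cast<float>(color & 0x000000FFu) * (1.0f / (255.0f * 1.0f)),
        static_cast<float>(color & 0x0000FF00u) * (1.0f / (255.0f * 256.0f)),
        static_cast<float>(color & 0x00FF0000u) * (1.0f / (255.0f * 65536.0f)),
        static_cast<float>(color & 0xFF000000u) * (1.0f / (255.0f * 16777216.0f)),
    };
}

// Compare two decoded colors channel by channel within a small tolerance.
inline bool nearlyEqual(const Float4& x, const Float4& y, float eps = 1e-6f) {
    return std::fabs(x.r - y.r) <= eps && std::fabs(x.g - y.g) <= eps &&
           std::fabs(x.b - y.b) <= eps && std::fabs(x.a - y.a) <= eps;
}
```

Each masked value is an 8-bit number times a power of two, so it converts to float without rounding; that is why the rewrite preserves the result while avoiding the shift instructions.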
Why you should use GSA: Which code is optimal?
int4 main(int4 a : TEXCOORD) : SV_Target
{
    return a + 1.0;
}
vs
int4 main(int4 a : TEXCOORD) : SV_Target
{
    return a + 1;
}
Compiles to …
8 slots vs 1 slot
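The difference most likely comes from type conversions, which the HD2000 introduction above says only one ALU can perform. The C++ sketch below shows the analogous arithmetic on the CPU, assuming HLSL promotes the integer operand to floating point when it is added to a floating-point literal, as C and C++ do. The function names are hypothetical and the code says nothing about the actual hardware instructions.

```cpp
#include <cassert>
#include <type_traits>

// 'a + 1': both operands are integers, so a single integer add is enough.
inline int addIntLiteral(int a) {
    return a + 1;
}

// 'a + 1.0': the integer is converted to float, added as a float,
// and converted back to int for the integer return type.
inline int addFloatLiteral(int a) {
    return static_cast<int>(static_cast<float>(a) + 1.0f);
}

// The expression types make the hidden conversions visible at compile time.
static_assert(std::is_same<decltype(1 + 1), int>::value,
              "int + int stays int");
static_assert(std::is_same<decltype(1 + 1.0f), float>::value,
              "int + float is promoted to float");
```

For small values both functions return the same result, so the extra conversions are pure overhead; on a four-component vector they are paid once per component.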
Tootle
Tootle: Overview
A Triangle Order Optimization Tool
– Improves vertex-cache hit rate
– Reduces overdraw
– View independent
Library to integrate into your tool-chain
Simple to use and free
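Vertex-cache hit rate depends on the order in which triangles reference their vertices. The C++ sketch below counts cache misses for an indexed triangle list under a simple FIFO cache model. It is not Tootle's code or API: the FIFO policy, the cache size and the names `countCacheMisses` and `averageCacheMissRatio` are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Count vertex-cache misses for an index buffer, assuming a FIFO cache
// that holds 'cacheSize' recently transformed vertices.
inline std::size_t countCacheMisses(const std::vector<std::uint32_t>& indices,
                                    std::size_t cacheSize) {
    assert(cacheSize > 0);
    std::deque<std::uint32_t> fifo;
    std::size_t misses = 0;
    for (std::uint32_t idx : indices) {
        if (std::find(fifo.begin(), fifo.end(), idx) != fifo.end()) {
            continue;  // hit: the vertex is still in the cache
        }
        ++misses;      // miss: the vertex has to be transformed again
        fifo.push_back(idx);
        if (fifo.size() > cacheSize) {
            fifo.pop_front();  // evict the oldest entry
        }
    }
    return misses;
}

// Misses per triangle; lower is better, and 3.0 means no reuse at all.
inline double averageCacheMissRatio(const std::vector<std::uint32_t>& indices,
                                    std::size_t cacheSize) {
    assert(indices.size() >= 3 && indices.size() % 3 == 0);
    return static_cast<double>(countCacheMisses(indices, cacheSize)) /
           static_cast<double>(indices.size() / 3);
}
```

Comparing this ratio before and after reordering a mesh is one simple way to see what a triangle order optimizer gains on a given cache size.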
Tootle: Background
Based on the I3D '06 paper by Nehab, Barczak and Sander
Uses D3DXOptimizeMesh for vertex cache optimization
Uses D3D for overdraw measurement
Example scene: 70k polygons, 10 materials
– Reduced overdraw by a factor of two
– 3-7% performance increase compared to D3DXOptimizeMesh
Tootle: Overdraw Reduction
AMD Content Creation Tools
AMD Content Creation Tools
RenderMonkey
– Shader development environment
– Supports HLSL, D3D asm and GLSL
The Compressonator
– Tool for compressing textures
– Creates mip-map levels
– DX10 supported
CubeMapGen
– Creates filtered seamless cube maps via angular extent filtering
– Lots of import/export and cube map assembly options
NormalMapper
– Automatic normal map generation tool
Tool Release Dates
Tools and Release Dates
GPU PerfStudio v1.1 (8/31)
GPU ShaderAnalyzer v1.30 (8/21)
Compressonator v1.4 (8/21)
RenderMonkey v1.71 (last week)
… of course dates may change ☺
Questions?
Holger Gruen
European Developer Relations
AMD Graphics Products Group