maximizing multi-gpu performance
DESCRIPTION
Maximizing Multi-GPU Performance. Thomas Fortier ISV Relations AMD Graphics Products Group [email protected]. Topics Covered in this Session. Why multi-GPU solutions matter. Hardware & driver considerations. Impact on game design. - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/1.jpg)
Maximizing Multi-GPU Performance
Thomas Fortier
ISV Relations
AMD Graphics Products Group
![Page 2: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/2.jpg)
Maximizing Multi-GPU Performance
Topics Covered in this Session
Why multi-GPU solutions matter. Hardware & driver considerations. Impact on game design. Profiling & performance gains.
![Page 3: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/3.jpg)
Maximizing Multi-GPU Performance
Dual-GPU boards Multi-board systems Hybrid graphics
Why Multi-GPU Solutions Matter
![Page 4: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/4.jpg)
Maximizing Multi-GPU Performance
Why Support Multi-GPU in Your Game
Growing market share of multi-GPU solutions. All game and hw reviews integrate multi-GPU
solutions. Expectation by gamers is that game framerate
should “just scale” with additional GPUs. The competition is doing it!
Market trend
![Page 5: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/5.jpg)
Maximizing Multi-GPU Performance
Crossfire Technical Overview
![Page 6: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/6.jpg)
Maximizing Multi-GPU Performance
Crossfire Technical Overview
Frame 1
Frame 3
Frame 5
Frame 7
Frame 2
Frame 4
Frame 6
Frame 8
![Page 7: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/7.jpg)
Maximizing Multi-GPU Performance
Crossfire Technical Overview
![Page 8: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/8.jpg)
Maximizing Multi-GPU Performance
Crossfire Technical Overview
![Page 9: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/9.jpg)
Maximizing Multi-GPU Performance
Crossfire Technical Overview
![Page 10: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/10.jpg)
Maximizing Multi-GPU Performance
Crossfire Technical Overview
![Page 11: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/11.jpg)
Maximizing Multi-GPU Performance
Crossfire Technical Overview
![Page 12: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/12.jpg)
Maximizing Multi-GPU Performance
Crossfire Technical Overview
![Page 13: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/13.jpg)
Maximizing Multi-GPU Performance
Crossfire Technical Overview
![Page 14: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/14.jpg)
Maximizing Multi-GPU Performance
Crossfire Technical Overview
![Page 15: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/15.jpg)
Maximizing Multi-GPU Performance
Alternate Frame Rendering
Alternate frame rendering leads to two types of problems:• Interframe dependencies
• CPU/GPU synchronization points
In each case, parallelism between CPU and GPUs is lost.
![Page 16: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/16.jpg)
Maximizing Multi-GPU Performance
Querying the Number of GPUs
Statically link to:• atimgpud_s_x86.lib - 32 bit version
• atimgpud_s_x64.lib - 64 bit version
Include header file:• atimgpud.h
Call this function:• INT count = AtiMultiGPUAdapters();
• In windowed mode, set Count to 1
![Page 17: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/17.jpg)
Maximizing Multi-GPU Performance
Interframe Dependencies
Frame 1
Frame 3
Frame 5
Frame 7
Frame 2
Frame 4
Frame 6
Frame 8
![Page 18: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/18.jpg)
Maximizing Multi-GPU Performance
Interframe Dependencies
![Page 19: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/19.jpg)
Maximizing Multi-GPU Performance
Interframe Dependencies
When are interframe dependencies a problem?• Depends on frequency of P2P blits.
Solutions:• Create n copies of the resource triggering P2P blits.
• Associate each copy of the resource to a specific GPU.
• resource[frame_num % num_gpus]
• Repeat resource updates for n frames.
![Page 20: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/20.jpg)
Maximizing Multi-GPU Performance
Interframe Dependencies
Frame 1
Frame 3
Frame 5
Frame 7
Frame 2
Frame 4
Frame 6
Frame 8
![Page 21: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/21.jpg)
Maximizing Multi-GPU Performance
Interframe Dependencies
Frame 1
Frame 3
Frame 5
Frame 7
Frame 2
Frame 4
Frame 6
Frame 8
![Page 22: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/22.jpg)
Maximizing Multi-GPU Performance
Interframe Dependencies
There are many ways to update resources using the GPU:• Drawing to Vertex / Index Buffers
• Stream Out
• CopyResource()
• CopySubresourceRegion()
• GenerateMips()
• ResolveSubresource()
• Etc…
![Page 23: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/23.jpg)
Maximizing Multi-GPU Performance
CPU/GPU Synchronization Points
Frame 1
Frame 3
Frame 2
Frame 4
Frame 5
![Page 24: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/24.jpg)
Maximizing Multi-GPU Performance
CPU/GPU Synchronization Points
![Page 25: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/25.jpg)
Maximizing Multi-GPU Performance
CPU/GPU Syncs - Queries
Having the driver block on a query starves the GPU queues, and limits parallelism.
Solutions:• Don’t block on query results.
• Don’t have queries straddle across frames.
• For queries issued every frame, create a query object for each GPU.
• Pick up query results n frames after it was issued.
![Page 26: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/26.jpg)
Maximizing Multi-GPU Performance
CPU/GPU Syncs – CPU Access to GPU Resources
Triggers pipeline stalls because driver blocks waiting on GPU at lock/map call.
Followed by a P2P blit at unlock/unmap call. Often results in negative scaling… Solutions:
• DX10/DX11 – Stream to and copy from staging textures.
• DX9 – Stream to and copy from sysmem textures.
• DX9 – Never lock static vertex/index buffers, textures.
![Page 27: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/27.jpg)
Maximizing Multi-GPU Performance
Multi-GPU Performance Gains
What kind of performance scaling should you expect from multi-GPU systems?• Function of CPU/GPU workload balance.
• Typical for 2 GPUs is 2X scaling.
• For 3 & 4 GPUs, varies from game to game.
![Page 28: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/28.jpg)
Maximizing Multi-GPU Performance
Crossfire Profiling
Make sure to be GPU bound.• Test framerate scaling with resolution change.
Test for multi-GPU scaling.• Rename app exe to ForceSingleGPU.exe.
Test for texture interframe dependencies.• Rename app exe to AFR-FriendlyD3D.exe.
Remove queries. Check for CPU locks of GPU resources.
![Page 29: Maximizing Multi-GPU Performance](https://reader035.vdocuments.us/reader035/viewer/2022062518/5681428d550346895daeb83d/html5/thumbnails/29.jpg)
Maximizing Multi-GPU Performance
Key Takeaways
Multi-GPU solutions matter! Test and profile with multi-GPU systems.
• Properly handle interframe dependencies.
• Check for CPU locks of GPU resources.
• Don’t block on queries.
Refer to AMD Crossfire SDK samples• ati.amd.com/developer• CrossFire Detect & AFR-Friendly projects.