Practical SPU Programming in God of War III
Jim Tilander, Vassily Filippov
Sony Santa Monica
Outline
• Motivation - why use the SPUs?
• Helping the simulation
• Helping the scene
• Helping the rendering
• Q&A
Motivation
• A typical game today contains three stages, each feeding data into the next:
• Simulation of game, joypad input etc.
• Scene traversal.
• Render scene.
[Diagram: CPU0 runs Simulation, Scene and Render one after the other.]
Motivation
• A very typical optimization is to make the game render in a double buffered mode.
• This is possible because the GPU and the CPU can do parallel execution!
• This allows us to render a scene while the next one is prepared.
• Hides the cost of the simulation!
[Diagram: timeline over frames n and n + 1; CPU0 row: Simulation, Scene; GPU row: Render.]
Motivation
• Processors are becoming increasingly parallel.
• Let’s apply the same technique to the CPU parts.
• Our total frame time is now bound only by the max of the three components: simulation, scene or render.
• This leads to combined processing of all three components in one frame!
[Diagram: timeline over frames n - 1, n and n + 1; rows CPU0, CPU1 and GPU, carrying Simulation, Scene and Render in parallel.]
Motivation
• In a parallel system we have several types of computing resources.
• Easy to think about in terms of one main CPU.
• Easy to think about one main GPU.
• Bound by one of them.
[Diagram: Main CPU and Main GPU.]
Motivation
• If either of the two parts runs too slowly, offload tasks onto the helper CPUs.
• Continue doing this until the whole system runs within frame.
• When it runs within frame, we are done!
[Diagram: Main CPU and Main GPU, both assisted by a Helper CPU.]
The PS3
• The helper CPUs consist of 6 SPUs, each a general purpose processor.
• They have an affinity towards math operations.
• 256 KB memory limitation per SPU.
[Diagram: Main CPU and Main GPU, with six SPUs as helpers.]
SPU is not a co-processor
• The SPU is not a coprocessor.
• Full general purpose processor.
• Operates independently from the other processors.
• You can lift PPU code straight over by bracketing it with DMA calls.
• They are fast enough to make this strategy work.
The SPU is fast.
• Actually, it’s super fast.
• With a little help it can run code at unbelievable speeds.
• Manual optimization can use the potential 48x speedup of the architecture to the fullest (the compiler never comes close).
• Memory is nearby.
• Leaves us to worry less about the actual computation on the SPU.
Can still win with slow SPU
• Reduce total time of the frame.
• Frame limited by max(CPU, GPU).
• Move parts to SPU.
• Even a slow SPU job can be a net win.
[Diagram: GPU and CPU time bars for two cases, one CPU bound and one GPU bound.]
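The argument above can be checked with a little arithmetic. The sketch below is only an illustration: the timings are made-up values, the SPUs are treated as a single lane, and the names `FrameCost`, `frame_time` and `offload` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// All timings are made-up illustration values, in milliseconds.
struct FrameCost {
    double cpu_ms;  // main CPU (PPU) work per frame
    double gpu_ms;  // GPU work per frame
    double spu_ms;  // helper (SPU) work per frame, treated as one lane
};

// With all units running in parallel, the slowest one sets the frame time.
inline double frame_time(const FrameCost& c) {
    return std::max({c.cpu_ms, c.gpu_ms, c.spu_ms});
}

// Move a task that costs `task_ms` on the CPU over to the SPUs, where it is
// assumed to run `slowdown` times slower than it did on the CPU.
inline FrameCost offload(FrameCost c, double task_ms, double slowdown) {
    assert(task_ms >= 0.0 && task_ms <= c.cpu_ms && slowdown > 0.0);
    c.cpu_ms -= task_ms;
    c.spu_ms += task_ms * slowdown;
    return c;
}
```

With a CPU at 40 ms and a GPU at 30 ms, moving a 10 ms task to an SPU where it takes twice as long (20 ms) still brings the frame down from 40 ms to 30 ms.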
SPU == PPU
• Keep the code compilable on both platforms with minimal changes.
• Limit the PPU version to memory behavior that the SPU version can also support.
• Swap DMA calls for memcpy on PPU.
• Enable an on-the-fly runtime switch between the SPU and PPU versions.
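A minimal sketch of the idea, assuming a single-buffered job: the kernel only touches local buffers, and the transfers around it are isolated in two helpers. Everything here (`transfer_in`, `transfer_out`, `integrate`, `run_job`, `kMaxLocalFloats`) is a hypothetical name, not the game's code. In this portable build the helpers are plain `memcpy`; an SPU build would replace their bodies with DMA transfers and a wait for completion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical transfer helpers. On the SPU these would issue DMA transfers
// and wait for them; here they are plain memcpy, as on the PPU.
inline void transfer_in(void* local, const void* main_memory, std::size_t bytes) {
    std::memcpy(local, main_memory, bytes);
}
inline void transfer_out(void* main_memory, const void* local, std::size_t bytes) {
    std::memcpy(main_memory, local, bytes);
}

// The lifted kernel: it reads and writes only the local buffers it is given.
inline void integrate(float* positions, const float* velocities,
                      std::size_t count, float dt) {
    for (std::size_t i = 0; i < count; ++i) positions[i] += velocities[i] * dt;
}

constexpr std::size_t kMaxLocalFloats = 1024;  // hypothetical local budget

// Single-buffered job: fetch the inputs, process locally, write results back.
inline void run_job(float* main_positions, const float* main_velocities,
                    std::size_t count, float dt) {
    assert(count <= kMaxLocalFloats);
    float local_pos[kMaxLocalFloats];
    float local_vel[kMaxLocalFloats];
    transfer_in(local_pos, main_positions, count * sizeof(float));
    transfer_in(local_vel, main_velocities, count * sizeof(float));
    integrate(local_pos, local_vel, count, dt);
    transfer_out(main_positions, local_pos, count * sizeof(float));
}
```

Because the kernel never sees main memory directly, the same source can serve as both versions, and a debug switch only has to choose where `run_job` executes.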
Our frame
• Normal processing with only PPU and RSX working.
• Processing is shifted, three frames in flight at the same time.
• Processing is fairly lengthy.
• Does not run within frame.
[Diagram: timeline over frames n, n + 1 and n + 2; PPU row: Simulation, Scene; GPU row: Render.]
Our frame
• Relies on SPUs to accelerate both RSX and PPU.
• Moving parts of all three systems to the SPU shortens the overall time.
• Now runs within frame.
[Diagram: timeline over frames n, n + 1 and n + 2; PPU row: Simulation, Scene; GPU row: Render; SPUs row: parts of Simulation, Scene and Render.]
The On Screen Profiler
• Both the PPU and SPU profilers are in sync.
• Allows for easy identification of parallel tasks.
• We can verify after the fact that something runs in parallel.
Systems on the SPU
• Animation
• Cloth
• Collision
• Procedural textures
• Culling
• Shadows
• Push buffer generation
• Meta tasks
• Geometry conditioning
• Sound
[Diagram: the systems above grouped under Simulation, Scene and Render.]
Offloading the Simulation
Titans
• One of the big-ticket features in the game is the Titans.
• Large scale creatures that move. Essentially moving levels.
• It quickly became apparent that collision for the Titans was a bottleneck.
• So we started moving tasks onto the SPU.
Titans
• Bracketed PPU code with DMA calls and recompiled for SPU.
• Single buffered implementation.
• One look at the profiler shows us that no more optimizations are necessary.
• Still tons of performance on the table.
[Diagram: profile of one collision job, split into DMA stalls and CPU processing.]
Titans
• We provide tech to artists and designers.
• Sometimes they run with it to places we never imagined.
• Moving ropes are “titans” from the engine’s point of view.
Cloth simulation
• Kratos has a short loin cloth.
• Enemies have various pieces of cloth.
• Independent jobs, naturally parallel.
• Fire-and-forget jobs: we can figure out early what we need for the calculation and don’t need the results until render.
Cloth simulation
• One job per cloth simulation.
• Run this wide (5 SPUs).
• Job is dominated by processing.
• Data volume is very low.
• Simply lifted over the PPU version, with DMA calls at the beginning and end.
[Diagram: profile of one cloth job, split into processing and DMA stalls.]
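The fire-and-forget pattern can be sketched on a desktop with standard C++ threads standing in for SPU jobs. This is an assumption-laden illustration: `Cloth`, `simulate_cloth`, `kick_cloth_jobs` and `wait_for_cloth` are hypothetical names, the solver is a placeholder, and the sketch does not cap the number of workers at five.

```cpp
#include <cassert>
#include <future>
#include <vector>

struct Cloth {
    std::vector<float> heights;  // placeholder state for one cloth piece
    float velocity;
};

// Stand-in for one cloth simulation step; the real solver is not shown
// in the slides.
inline void simulate_cloth(Cloth& cloth, float dt) {
    for (float& h : cloth.heights) h += cloth.velocity * dt;
}

// Kick one independent job per cloth, early in the frame.
inline std::vector<std::future<void>> kick_cloth_jobs(std::vector<Cloth>& cloths,
                                                      float dt) {
    std::vector<std::future<void>> jobs;
    jobs.reserve(cloths.size());
    for (Cloth& cloth : cloths) {
        jobs.push_back(std::async(std::launch::async,
                                  [&cloth, dt] { simulate_cloth(cloth, dt); }));
    }
    return jobs;
}

// Block only at the point where render actually needs the results.
inline void wait_for_cloth(std::vector<std::future<void>>& jobs) {
    assert(!jobs.empty());
    for (auto& job : jobs) job.get();
    jobs.clear();
}
```

The caller kicks the jobs as soon as the inputs are known, carries on with unrelated work, and calls `wait_for_cloth` just before the cloth is rendered.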
Offloading the Scene traversal
Culling
• Simple frustum checks against bounding spheres.
• Traverses the list of all potential models.
• Produces visibility bits.
• Processes both frustum and occlusion checks at the same time.
• Highly suited for the SPU.
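A minimal sketch of the frustum half of this job, assuming six inward-facing planes and one bit per model. The types and names (`Plane`, `Sphere`, `Frustum`, `cull`) are hypothetical, and occlusion checks are left out because the slides do not describe them.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Plane  { float nx, ny, nz, d; };     // inside when n.p + d >= 0
struct Sphere { float x, y, z, radius; };   // bounding sphere of one model

using Frustum = std::array<Plane, 6>;

// A sphere is rejected as soon as it lies entirely behind any one plane.
inline bool sphere_visible(const Frustum& frustum, const Sphere& s) {
    for (const Plane& p : frustum) {
        const float distance = p.nx * s.x + p.ny * s.y + p.nz * s.z + p.d;
        if (distance < -s.radius) return false;
    }
    return true;
}

// Walk the list of all candidate models and produce one visibility bit each.
inline std::vector<std::uint32_t> cull(const Frustum& frustum,
                                       const std::vector<Sphere>& spheres) {
    std::vector<std::uint32_t> bits((spheres.size() + 31) / 32, 0u);
    for (std::size_t i = 0; i < spheres.size(); ++i) {
        assert(spheres[i].radius >= 0.0f);
        if (sphere_visible(frustum, spheres[i])) bits[i / 32] |= 1u << (i % 32);
    }
    return bits;
}
```

The loop is a straight pass over a flat array with a handful of multiply-adds per element, which fits the streaming, math-heavy work the SPU is described as being good at.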
Culling
• Some parts still run on the PPU; only the heavy lifting is on the SPU.
• Occluder selection and visibility bit processing are still on the PPU.
Push buffer generation
• Generate pushbuffer commands to set vertex buffers, shader constants and textures.
• Pruning of state redundancy.
• A large gather operation with a large amount of pointer to pointer chasing.
• Can easily swamp the PPU with L2 misses.
[Diagram: five SPUs read meshes in memory and write the push buffer contents.]
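Pruning of state redundancy can be illustrated with a toy command stream. The encoding below is hypothetical and is not the RSX command format: `DrawItem`, `Op`, `Command` and `build_commands` are names made up for this sketch. A state-setting command is emitted only when the value differs from what is currently bound.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct DrawItem { std::uint32_t vertex_buffer, shader, texture; };

enum class Op : std::uint32_t { SetVertexBuffer, SetShader, SetTexture, Draw };
struct Command { Op op; std::uint32_t arg; };

// Emit commands for each item, skipping state that is already bound.
inline std::vector<Command> build_commands(const std::vector<DrawItem>& items) {
    std::vector<Command> out;
    DrawItem bound{};    // last state written to the stream
    bool first = true;   // nothing is bound before the first item
    for (const DrawItem& item : items) {
        if (first || item.vertex_buffer != bound.vertex_buffer)
            out.push_back(Command{Op::SetVertexBuffer, item.vertex_buffer});
        if (first || item.shader != bound.shader)
            out.push_back(Command{Op::SetShader, item.shader});
        if (first || item.texture != bound.texture)
            out.push_back(Command{Op::SetTexture, item.texture});
        out.push_back(Command{Op::Draw, 0u});
        bound = item;
        first = false;
    }
    assert(out.size() <= items.size() * 4);  // never worse than unpruned
    return out;
}
```

For three draws that share a shader, and partly share buffers and textures, this emits 8 commands instead of the 12 an unpruned stream would contain.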
Push buffer generation
• Each SPU fetches a small group of model references (one batch) at a time.
• Double buffer DMA, fetch model B while processing model A.
• Masks the memory access cost.
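The double-buffering loop has roughly the following shape. This is a portable sketch under stated assumptions: `start_fetch` and `wait_fetch` are hypothetical stand-ins for an asynchronous DMA get and its completion wait, and here the copy simply happens inside `start_fetch`. Summing integers stands in for the real per-model work, and `kBatch` is an arbitrary batch size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

constexpr std::size_t kBatch = 4;  // hypothetical elements per fetch

// Stand-in for kicking an asynchronous DMA get into a local buffer.
inline void start_fetch(int* local, const int* src, std::size_t count) {
    std::memcpy(local, src, count * sizeof(int));
}
// Stand-in for blocking until the transfer into `buffer_index` has landed.
inline void wait_fetch(int /*buffer_index*/) {}

inline long process_all(const std::vector<int>& models) {
    int buffers[2][kBatch];
    std::size_t counts[2] = {0, 0};
    const std::size_t total = models.size();
    std::size_t next = 0;  // first element not yet requested
    long sum = 0;

    auto fetch = [&](int buf) {
        assert(next <= total);
        counts[buf] = std::min(kBatch, total - next);
        if (counts[buf] != 0)
            start_fetch(buffers[buf], models.data() + next, counts[buf]);
        next += counts[buf];
    };

    int current = 0;
    fetch(current);                  // prime the first buffer
    while (counts[current] != 0) {
        fetch(current ^ 1);          // kick batch B into the other buffer
        wait_fetch(current);         // make sure batch A has arrived
        for (std::size_t i = 0; i < counts[current]; ++i)
            sum += buffers[current][i];  // process batch A
        current ^= 1;                // B becomes the batch to process
    }
    return sum;
}
```

With a real asynchronous transfer, the fetch of the next batch overlaps the processing of the current one, so the wait at the top of each iteration is usually short.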
Push buffer generation
• Adapted the PPU version to handle interleaved DMA.
• The SPU version is also the PPU version!
• In debug mode we can switch to the PPU version on the fly.
• PPU version still useful for handling debug-jobs too large for the memory on the SPU (e.g. very large shaders).
Push buffer generation
• We run this final generation wide on 5 SPUs.
• Allocate a chunk of memory from the pushbuffer.
Push buffer generation
• We have 5 SPUs all trying to allocate memory from the same pushbuffer.
• Synchronization is done through mfc_getllar and mfc_putllc.
• Bypasses regular DMA, goes through the atomic unit instead.
• Should be your staple synchronization mechanism, fast and no OS overhead.
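The allocation loop has the classic reserve, modify, conditionally store shape. The sketch below is a portable analogue using `std::atomic` compare-exchange, not the MFC atomic-unit calls themselves. `PushBufferAllocator` and `kFailed` are hypothetical names, and a real allocator would hand out addresses inside the push buffer rather than bare offsets.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

struct PushBufferAllocator {
    static constexpr std::uint32_t kFailed = 0xFFFFFFFFu;

    std::atomic<std::uint32_t> offset{0};  // next free byte, shared by all workers
    std::uint32_t capacity;                // total bytes available

    explicit PushBufferAllocator(std::uint32_t cap) : capacity(cap) {}

    // Returns the start offset of the reserved chunk, or kFailed when full.
    std::uint32_t allocate(std::uint32_t bytes) {
        assert(bytes > 0);
        std::uint32_t old_offset = offset.load(std::memory_order_relaxed);
        for (;;) {
            if (bytes > capacity - old_offset) return kFailed;
            // Succeeds only if nobody moved `offset` since we read it;
            // on failure old_offset is refreshed and the loop retries.
            if (offset.compare_exchange_weak(old_offset, old_offset + bytes,
                                             std::memory_order_acq_rel,
                                             std::memory_order_relaxed)) {
                return old_offset;
            }
        }
    }
};
```

Each worker gets a disjoint chunk without taking a lock or entering the operating system, which is the property the slide asks for.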
Offloading the GPU
Geometry processing
• Various techniques to offload the GPU (post processing, vertex processing, software rasterizers).
• We’ve focused on offloading the cost of the opaque pass.
• Majority of this cost comes from vertex processing and lighting.
• Moved both over to the SPU.
Geometry processing
• We pass all our vertices through the SPUs to be pre-conditioned for the GPU.
• A special purpose job handles various tasks to help the GPU.
• Relies heavily on the SDK library EDGE.
Geometry processing
• What is EDGE?
• Geometry processing library available to all PS3 developers.
• Highly optimized SPU code.
• Easy integration, you still control main().
• Can greatly improve your performance!
Geometry processing
• One job per drawcall.
• Typical frame holds about 3000 geometry jobs.
• Most of our vertex shader is in here.
• Augmented lighting calculations.
• The one place where we’ve optimized heavily!
[Diagram: job stages as listed on the slide: Decompress, Skinning, Culling, Generate Normals, Lighting code, Compress to RSX.]
Color correction
• Runs as a post effects pass to give a certain (cinematic) look to a scene.
• Basically just does an RGB lookup in a cube map for each pixel on the screen.
• For dynamic effects we want to generate the cube map.
• How do we generate the cube map?
Color correction
• Kick an SPU job early on to generate a cube map based on parametric input.
• Algorithm involves a lot of if statements, harder to do efficiently on the GPU.
• Simple lift of code from PPU.
• Job is dominated by processing, single buffered DMA.
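A rough sketch of what such a job might compute, assuming the "cube map" is a color cube indexed by R, G and B (a 3D lookup table). The grading rule, the parameters and all names (`GradeParams`, `grade`, `generate_cube`, `lookup`, `kSize`) are hypothetical; the slides do not describe the real algorithm beyond it being branch heavy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb { std::uint8_t r, g, b; };

constexpr std::size_t kSize = 16;  // hypothetical number of entries per axis

// Hypothetical parametric input: lift every channel value below a threshold.
struct GradeParams {
    std::uint8_t shadow_threshold;
    std::uint8_t shadow_lift;
};

// Branchy per-channel grading, standing in for the real algorithm.
inline std::uint8_t grade(std::uint8_t v, const GradeParams& p) {
    if (v < p.shadow_threshold) {
        const unsigned lifted = static_cast<unsigned>(v) + p.shadow_lift;
        return static_cast<std::uint8_t>(lifted > 255u ? 255u : lifted);
    }
    return v;
}

// Channel value represented by table index i (0 .. kSize - 1).
inline std::uint8_t axis_value(std::size_t i) {
    return static_cast<std::uint8_t>(i * 255 / (kSize - 1));
}

// Build the kSize^3 table; entry (r, g, b) lives at (b * kSize + g) * kSize + r.
inline std::vector<Rgb> generate_cube(const GradeParams& p) {
    std::vector<Rgb> cube;
    cube.reserve(kSize * kSize * kSize);
    for (std::size_t b = 0; b < kSize; ++b)
        for (std::size_t g = 0; g < kSize; ++g)
            for (std::size_t r = 0; r < kSize; ++r)
                cube.push_back(Rgb{grade(axis_value(r), p),
                                   grade(axis_value(g), p),
                                   grade(axis_value(b), p)});
    return cube;
}

// Per-pixel lookup using the nearest table entry (no interpolation).
inline Rgb lookup(const std::vector<Rgb>& cube, Rgb in) {
    assert(cube.size() == kSize * kSize * kSize);
    auto index = [](std::uint8_t v) { return (v * (kSize - 1) + 127) / 255; };
    return cube[(index(in.b) * kSize + index(in.g)) * kSize + index(in.r)];
}
```

The branches run once per table entry when the table is generated, so the per-pixel pass reduces to a single lookup.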
In closing
Go parallel
• You must use the parallel nature of the machine.
• Do not special-case the SPU; it is a general purpose processor.
• Offload from the currently bound system.
No premature optimizations!
• Focus on user experience.
• Optimize as needed. Really.
Measure speed
• Be scientific, measure before you jump! The on screen profiler is your first tool.
• Start with a simple implementation, even one that might seem non-optimal.
• Always keep the PPU version! Invaluable for debugging.
• Remember that the SPU is faster than you think.
Q&A!