killzone shadow fall: threading the entity update on ps4
TRANSCRIPT
Killzone Shadow Fall: Threading the Entity Update on PS4
Jorrit RouwLead Game Tech, Guerrilla Games
IntroductionKillzone Shadow Fall is a First Person ShooterPlayStation 4 launch titleIn SP up to 60 characters @ 30 FPSIn MP up to 24 players @ 60 FPSGameplay logic has lots of BranchesVirtual functionsCache missesNot suitable for PS3 SPUs but PS4 has 6 x86 cores
2
What do we cover?What is an Entity?What we did on PS3Multi threading on PS4Balancing Entities across framesPerformance issues foundPerformance results achievedDebug tools
3
What is an Entity?
What is an Entity?Base class for most game objectsE.g. Player, Enemy, Weapon, DoorNot used for static worldHas ComponentsE.g. Model, Mover, DestructibilityUpdate at a fixed frequency15, 30, 60 HzAlmost everything at 15 HzPlayer updated at higher frequency to avoid lag
5
What is a Representation?Entities and Components have RepresentationControls rendering, audio and VFXState is interpolated in frames where entity not updatedCheaper to interpolate than to updateIntroduces latencyAlways blend towards last update
6
Movie
7
Multi Threading Approach
PS3: 1 Entity = 1 FiberMost time spent on PPUNo clear concurrency modelRead partial updated stateEntities deadlock waiting for each other
yield fiberresume fiberTime
PPUSPU 1SPU 2Entity 1Entity 2Entity 3Entity 1AnimationRay CastEntity 2
9
PS4: 1 Entity = 1 Job
No fibersEntity updates as a wholeHow to solve race conditions?
TimeCPU 1CPU 2CPU 3Entity 1Entity 2Entity 3Entity 1AnimationRay CastEntity 2job
10
Make update order explicit:
Strong Dependencies
MissileMissileMissile LauncherMissile LauncherSoldier
TimeSoldierstrong dependency
No (indirect) dependency = no accessWorks two ways: Weapon can access Soldier tooCreate dependency has 1 frame latencyGlobal systems need locks
Non-dependent Entities can Execute Concurrently
Soldier1Weapon1
TimeSoldier2Weapon2CPU 2CPU 1Soldier1Weapon1Soldier2Weapon2
A few entities cause huge bottleneck
What about this?
TimeSoldier1Pistol1Soldier2Pistol2Bullet System
Soldier1Pistol1Soldier2Pistol2Bullet System
Non-exclusive DependenciesAccess to Bullet System must be lock protected
Soldier1Soldier2Bullet Systemnon-exclusive
TimeCPU 2CPU 1Soldier1Pistol1Soldier2Pistol2Bullet SystemBarrierPistol1Pistol2
Weak Dependencies2 tanks fire at each other:
Update order reversed when circular dependency occursNot used very often (< 10 per frame)
TimeMissile2Missile1Tank1Tank2Missile2Missile1Tank2Tank1Missile1Missile2orweakstrongTank1Tank2
Non-updating EntitiesEntity can skip updates (LOD)Entity can update in other frame
Do normal scheduling!
Entity1Entity3Entity1Entity2Entity3
Timenot updating
Summarizing DependenciesStrongExclusiveWeakExclusiveStrongNon-excl.WeakNon-excl.SymbolTwo way accessOrder guaranteedAllow concurrency++++++Require lock
Referencing EntitiesDev build: CheckedPtrActs as normal pointerCheck dependency on dereferenceRetail build: Entity *No overhead!Doesnt catch everythingCan use pointers to membersBugs were easy to find
18
Working with Entities without dependencyThreadSafeEntityInterfaceMostly read onlyOften used state (name, type, position, )Mutex per EntitySend message (expensive)Processed single threaded when no dependencySchedule single threaded callback (expensive)Everything can be accessed
19
Scheduling Algorithm
Scheduling AlgorithmEntities with exclusive dependencies merged to 1 jobDependencies determine sortingNon-exclusive dependencies become job dependenciesExpensive jobs kicked first!MissileMissile LauncherTankBullet SystemPistolSoldierMissileTankBullet SystemPistolSoldier
Job1Job2job dependency
not updatingweaknon-exclusivestrong
21
Scheduling Algorithm Edge Case
Entity1Entity3Entity4Entity2non-exclusiveexclusivejob dependency
Entity1Entity2
Entity3Entity4Job1Job2
22
Scheduling Algorithm Edge CaseNon cyclic dependency becomes cyclic job dependencyJob1 and Job2 need to be merged
Entity1Entity3Entity4Entity2non-exclusiveexclusiveEntity1Entity2
Entity3Entity4Job1
23
Balancing Entities Across Frames
24
Balancing Entities Across FramesPrevent all 15 Hz entities from updating in same frameEntity can move to other frameSmaller delta time for 1 updateKeep parent-child linked entities togetherWeapon of soldierSoldier on mounted gunLocked close combat
Balancing Entities In Action
Time in even frame (sum across cores, running average)
26
Balancing Entities In Action
Time in odd frame(should be equal)
27
Balancing Entities In Action
Civilian @ 15Hzupdate even frame
28
Balancing Entities In Action
Civilian @ 15Hzupdate odd frame
29
Balancing Entities In Action
Flash on switch odd / even
30
Movie
31
Movie
32
Performance Issues
Performance IssuesMemory allocation mutexEliminated many dynamic allocationsUse stack allocatorLocking physics worldR/W mutex for main simulation worldSecond bullet collision broadphase + lockLarge groups of dependent entitiesPlayer update very expensive
Cut Scene - ProblemCut scene entity requires dependencies10+ characters in cut scene creates huge job!Civilian 1Cut SceneCivilian 2PlayerCamera
Cut Scene - SolutionCreate sub cut scenes for non-interacting entitiesMaster cut scene determines time and flowScan 1 frame ahead in timeline to create dependencyCivilian 1Cut Scenenon-exclusiveSub Cut Scene 1Sub Cut Scene 2Civilian 2Sub Cut Scene 3PlayerCamera
36
Using an ObjectDependencies on useable objects not possible (too many)Get list of usable objectsGlobal system protected by lockUse icon appears on screenPlayer selectsCreate dependencyStart use animationStart interaction 1 frame later (dependency valid)Hides 1 frame delay!
GrenadeExplosion damages many entitiesCreating dependencies not option (too many)ThreadSafeEntityInterface not an optionNeed knowledge of partsRun line of sight checks inside updateUses scheduled callback to apply damage
Performance Results
5000 Crates
Performance Results - Synthetic100 Soldiers500 FlagsLevelCountsDependenciesMaxEntitiesin JobSpeedupNumberEntitiesUpdating EntitiesNumberHumansStrong ExclStrong Non-Excl5000 Crates (20 s each)501950081124132.8X100 Soldiers (700 s each)326214105212204194.2X500 Flags (160 s each)5195081124135.2X
6 cores!
40
Performance Results - LevelsLevelCountsDependenciesMaxEntitiesin JobSpeedupNumberEntitiesUpdating EntitiesNumberHumansStrong ExclStrong Non-ExclThe Helghast (You Owe Me)1141206327123204.1XThe Patriot (On Vectan Soil)43525744199107154.3XThe Remains (12p Botzone)450128149744183.7X
The HelghastThe RemainsThe Patriot
Game Frame - The Patriot
42
Game Frame - Global
Time18 ms
43
Game Frame - Global
CPU 1 (main thread)
44
Game Frame - Global
CPU 2
45
Game Frame - Global
CPU 3
46
Game Frame - Global
AIPhysicsEntity UpdateVFX & DrawBarrier
47
Game Frame Entity Update
Fix Weak Dependency Cycles
48
Game Frame Entity Update
Prepare Jobs
49
Game Frame Entity Update
Link and Order Jobs
50
Game Frame Entity Update
Execute Jobs
51
Game Frame Entity Update
Single Threaded Callbacks
52
Game Frame Entity Update
Player Job
53
Game Frame Entity Update
Player Entity
54
Game Frame Entity Update
Animation
55
Game Frame Entity Update
Capsule Collision
56
Game Frame Entity Update
Player Representation
57
Game Frame Entity Update
OWL (Helper Robot)
58
Game Frame Entity Update
Inventory
59
Game Frame Entity Update
AI Soldier
60
Game Frame Entity Update
Cloth Simulation
61
Game Frame Entity Update
Cut Scene
62
Game Frame Entity Update
Sub Cut Scenes
63
Game Frame Entity Update
Cheap Entities (Destructibles)
64
Debug Tools
Debug ToolsDependencies get complex, we use yEd to visualize!
Debug ToolsPlayer
Debug ToolsBullet System
Debug ToolsCut scenes
ConclusionsEasy to implement in existing engineGame programmers can program as if single threadedVery few multithreading issues