![Page 1: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/1.jpg)
OpenCL accelerated rigid body and collision detection
Erwin Coumans
Advanced Micro Devices
Robotics: Science and Systems Conference 2011
![Page 2: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/2.jpg)
Overview
• Intro
• GPU broadphase acceleration structures
• GPU convex contact generation and reduction
• GPU BVH acceleration for concave shapes
• GPU constraint solver
Robotics: Science and Systems Conference 2011
![Page 3: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/3.jpg)
Industry view
Game and Film physics Simulation
Hardware Platform
Games Industry
Movie Industry
Academia, Universities
Content Creation Tools
Conferences, Presentations
C++ APIs and Implementations
Data representation
Sony Imageworks
PDI Dreamworks
ILM, Disney
Rockstar, Epic
EA, Disney
PS3, Xbox 360, PC, iPhone, Android
OpenCL, CUDA
Havok, PhysX Bullet, ODE,
Newton, PhysBAM, Box2D
Custom in-house physics engines
FBX, COLLADA
binary .hkx, .bullet format
Maya, 3ds Max, Houdini, LW, Cinema 4D
Blender
GDC, SIGGRAPH Stanford, UNC
etc.
x86, PowerPC, Cell, ARM
Robotics: Science and Systems Conference 2011
![Page 4: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/4.jpg)
Our open source work
• Bullet Physics SDK, http://bulletphysics.org
• Sony Computer Entertainment Physics Effects
• OpenCL/DirectX11 GPU physics research
Robotics: Science and Systems Conference 2011
![Page 5: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/5.jpg)
OpenCL™
• Open development platform for multi-vendor heterogeneous architectures
• The power of AMD Fusion: Leverages CPUs and GPUs for balanced system approach
• Broad industry support: Created by architects from AMD, Apple, IBM, Intel, NVIDIA, Sony, etc. AMD is the first company to provide a complete OpenCL solution
• Kernels written in subset of C99
Robotics: Science and Systems Conference 2011
![Page 6: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/6.jpg)
Particle
State vector
Rigid body
xX
v
x
qX
v
/
vX
F m
1
2
/
( ) /
v
qX
F m
I I
Robotics: Science and Systems Conference
2011
![Page 7: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/7.jpg)
Rigid body simulation loop
First update the linear and angular velocity
Then the position
1
1
1
t h t
t t
v v hFm
h I
1
2
t h t t h
t h t t h t
x x hv
q q h q
Robotics: Science and Systems Conference
2011
![Page 8: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/8.jpg)
OpenCL kernel: Integration
__kernel void
interopKernel( const int startOffset, const int numNodes, __global float4
*g_vertexBuffer,__global float4 *linVel,__global float4 *pAngVel)
{
int nodeID = get_global_id(0);
float timeStep = 0.0166666;
if( nodeID < numNodes )
{
g_vertexBuffer[nodeID + startOffset/4] += linVel[nodeID]*timeStep;
float4 axis;
float4 angvel = pAngVel[nodeID];
float fAngle = native_sqrt(dot(angvel, angvel));
axis = angvel * ( native_sin(0.5f * fAngle * timeStep) / fAngle);
float4 dorn = axis;
dorn.w = native_cos(fAngle * timeStep * 0.5f);
float4 orn0 = g_vertexBuffer[nodeID + startOffset/4+numNodes];
float4 predictedOrn = quatMult(dorn, orn0);
predictedOrn = quatNorm(predictedOrn);
g_vertexBuffer[nodeID + startOffset/4+numNodes]=predictedOrn;
}
}
Robotics: Science and Systems Conference 2011
![Page 9: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/9.jpg)
OpenCL – OpenGL interop
• See http://github.com/erwincoumans/experiments
Robotics: Science and Systems Conference
2011
![Page 10: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/10.jpg)
Today’s execution model
• Single Program Multiple Data (SPMD)
• Same kernel runs on: • All compute units
• All processing elements
• Purely “data parallel” mode
• PCIe bus between CPU and GPU is a bottleneck
Robotics: Science and Systems Conference 2011
![Page 11: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/11.jpg)
Tomorrow’s execution model
• Multiple Program Multiple Data (SPMD)
• Nested data parallelism
– Kernels can enqueue work
• CPU and GPU on one die
– Unified memory
spawn
…
…
…
WG
WI
WI WI
WI
…
…
…
WG
WI
WI WI
WI
…
…
…
WG
WI
WI WI
WI
spawn
…
…
…
WG
WI
WI WI
WI
Robotics: Science and Systems Conference 2011
![Page 12: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/12.jpg)
Physics Pipeline
Forward Dynamics Computation
Collision Detection Computation
Detect
pairs
Forward Dynamics Computation
Setup
constraints
Solve
constraints
Integrate
position
Apply
gravity
Collision Data Dynamics Data
Compute
AABBs
Predict
transforms
Collision
shapes
Object
AABBs
Overlapping
pairs
World
transforms
velocities
Mass
Inertia
Constraints
(contacts,
joints)
Compute
contact
points
Contact
points
time Start End
AABB = axis aligned bounding box Robotics: Science and Systems Conference
2011
![Page 13: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/13.jpg)
Broadphase N-body problem
• Avoid brute-force N*N tests
• Input: world space BVs and unique IDs
• Output: array of potential overlapping pairs
Broadphase Collision Detection
Detect
pairs
Compute
AABBs
Robotics: Science and Systems Conference 2011
![Page 14: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/14.jpg)
Uniform grid
• Very GPU friendly, parallel radix sort
• Use modulo to make grid unbounded
Robotics: Science and Systems Conference 2011
![Page 15: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/15.jpg)
Non uniform work granularity
• Small versus small
– GPU
• Large versus small
– CPU
• Large versus large
– CPU
Robotics: Science and Systems Conference 2011
![Page 16: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/16.jpg)
Incremental sweep and prune
• Update 3 sorted axis and overlapping pairs
• Does’t parallelize easy: data dependencies
Robotics: Science and Systems Conference 2011
![Page 17: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/17.jpg)
Parallel 1 axis sweep and prune
• From scratch sort 1 axis sweep to find all pairs
• Parallel radix sort and parallel sweep
[Game Physics Pearls, 2010, AK Peters]
Robotics: Science and Systems Conference 2011
![Page 18: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/18.jpg)
Dynamic BVH tree broadphase
• Keep two dynamic trees, one for moving objects, other for objects (sleeping/static)
• Find neighbor pairs:
– Overlap M versus M and Overlap M versus S
D F
B C E A
Q P
U R S T
S: Non-moving DBVT M: Moving DBVT
Robotics: Science and Systems Conference 2011
![Page 19: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/19.jpg)
Update/move a leaf node
• If new AABB is contained by old do nothing
• Otherwise remove and re-insert leaf – Re-insert at closest ancestor that was not resized
during remove
• Expand AABB with margin – Avoid updates due to jitter or small random
motion
• Expand AABB with velocity – Handle the case of linear motion over n frames
Robotics: Science and Systems Conference
2011
![Page 20: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/20.jpg)
Parallel BVH tree traversal
• Incremental update on CPU (shared memory)
• Use parallel history traversal on GPU
00 00
00
00
10 10
00
00
10 10
00
00
10 11
00
00
10 00
00
00
10 11
00
00
10 00
00
00
11 00
00
00
Robotics: Science and Systems Conference 2011
![Page 21: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/21.jpg)
Contact generation
• Input: overlapping pairs, collision shapes
• Output: contact points
Robotics: Science and Systems Conference 2011
![Page 22: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/22.jpg)
Closest point algorithms
• Algebraic closed forms, ie. sphere-sphere
• Separating axis theorem for convex polyhedra • Only computes overlap, no positive distances
• Gilbert Johnson Kheerthi (GJK) for general convex objects
• Doesn’t compute overlap, needs a companion algorithm for penetrating case, for example the Expanding Polytope Algorithm
Robotics: Science and Systems Conference 2011
![Page 23: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/23.jpg)
Separating axis test
• Test all axis in parallel, use cube map
Robotics: Science and Systems Conference 2011
![Page 24: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/24.jpg)
GJK or EPA
Robotics: Science and Systems Conference 2011
![Page 25: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/25.jpg)
Sutherland–Hodgman clipping
• Clip incident face against reference face side planes (but not the reference face).
• Consider clip points with positive penetration.
n
clipping planes
Robotics: Science and Systems Conference 2011
![Page 26: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/26.jpg)
• GPU friendly convex versus convex
• See “Rigid body collision detection on the GPU” by Rahul Sathe et al, SIGGRAPH 2006 poster
Cube map
Robotics: Science and Systems Conference 2011
![Page 27: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/27.jpg)
Contact reduction
• Keep only 4 points
Robotics: Science and Systems Conference 2011
![Page 28: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/28.jpg)
Concave shapes
• Hierarchical approximate convex decomposition (ICIP 2009 proceedings, Khaled Mamou)
• http://sourceforge.net/projects/hacd
Robotics: Science and Systems Conference 2011
![Page 29: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/29.jpg)
Parallelizing Constraint Solver
• Projected Gauss Seidel iterations are not embarrassingly parallel
A
B D
1 4
Robotics: Science and Systems Conference 2011
![Page 30: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/30.jpg)
Reordering constraint batches
A
B D
1 4
A B C D
1 1
2 2
3 3
4 4
A B C D
1 1 3 3
4 2 2 4
Robotics: Science and Systems Conference 2011
![Page 31: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,](https://reader036.vdocuments.us/reader036/viewer/2022071219/605847b0a04ba37bcf7ce536/html5/thumbnails/31.jpg)
Thanks!
• For more information: http://bulletphysics.org
Robotics: Science and Systems Conference 2011