CONCURRENT NON-BLOCKING BVH CREATION
Adam Kavanaugh, Kris K. Rivera
INTRODUCTION TO RAY TRACING
Graphics technique for rendering objects in 3D space
Used by the movie industry to create exceptionally realistic movies
Rendering a single frame takes 30-50 hrs
3D frames take about 100 hrs
ACCELERATION STRUCTURES
Axis Aligned Bounding Box (AABB)
Uniform Spatial Subdivision (USS)
k-d Tree
Oct Tree
Bounding Volume Hierarchy (BVH)
BOUNDING VOLUME HIERARCHY
Binary tree of bounding volumes
  Boxes which encapsulate a collection of primitives
Entire scene of triangles is surrounded by one large bounding volume
Each level is split along a given axis
  The "longest" axis is usually chosen to split upon
  Location of the split is determined by a heuristic
Triangles are "sorted" into their respective child bounding volumes by their medians
Process recurs for left and right children until each triangle exists as a leaf node in its own volume
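The recursive build described above can be sketched in a few lines of Java. This is a minimal single-threaded illustration, not the presenters' implementation: the names `BvhSketch`, `Tri`, `Node`, `build`, `longestAxis` and `countLeaves` are hypothetical, and the split heuristic is assumed to be the median of the triangle centroids along the longest axis of the node's box.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class BvhSketch {
    /** Hypothetical triangle type, reduced to its bounds and centroid. */
    static final class Tri {
        final double[] min = new double[3];
        final double[] max = new double[3];
        final double[] centroid = new double[3];

        Tri(double[] a, double[] b, double[] c) {
            for (int i = 0; i < 3; i++) {
                min[i] = Math.min(a[i], Math.min(b[i], c[i]));
                max[i] = Math.max(a[i], Math.max(b[i], c[i]));
                centroid[i] = (a[i] + b[i] + c[i]) / 3.0;
            }
        }
    }

    /** Hypothetical tree node: an axis-aligned box plus children or one triangle. */
    static final class Node {
        final double[] min = {Double.POSITIVE_INFINITY, Double.POSITIVE_INFINITY, Double.POSITIVE_INFINITY};
        final double[] max = {Double.NEGATIVE_INFINITY, Double.NEGATIVE_INFINITY, Double.NEGATIVE_INFINITY};
        Node left, right;
        Tri leaf; // non-null only for leaf nodes
    }

    /** Index (0 = x, 1 = y, 2 = z) of the axis with the largest extent. */
    static int longestAxis(Node n) {
        int axis = 0;
        for (int i = 1; i < 3; i++) {
            if (n.max[i] - n.min[i] > n.max[axis] - n.min[axis]) axis = i;
        }
        return axis;
    }

    /** Builds a BVH over a non-empty list of triangles. */
    static Node build(List<Tri> tris) {
        Node node = new Node();
        for (Tri t : tris) { // grow the box to enclose every triangle
            for (int i = 0; i < 3; i++) {
                node.min[i] = Math.min(node.min[i], t.min[i]);
                node.max[i] = Math.max(node.max[i], t.max[i]);
            }
        }
        if (tris.size() == 1) { // one triangle left: this node is a leaf
            node.leaf = tris.get(0);
            return node;
        }
        final int axis = longestAxis(node);
        List<Tri> sorted = new ArrayList<>(tris);
        sorted.sort(Comparator.comparingDouble((Tri t) -> t.centroid[axis]));
        int mid = sorted.size() / 2; // assumed heuristic: median split
        node.left = build(sorted.subList(0, mid));
        node.right = build(sorted.subList(mid, sorted.size()));
        return node;
    }

    static int countLeaves(Node n) {
        return n.leaf != null ? 1 : countLeaves(n.left) + countLeaves(n.right);
    }
}
```

Calling `BvhSketch.build(triangles)` on a list of n triangles yields a binary tree with n leaves, one triangle per leaf.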
NON-BLOCKING BVH CONSTRUCTION
Tree construction is done in a pseudo-breadth-first manner
  Pseudo-breadth-first because the current build horizon expands independently as threads progress
Each thread acts as producer and consumer of tree nodes
Implemented using Java's ConcurrentLinkedQueue class
  Uses a wait-free algorithm for the class internals
Uses Java's Atomic variable classes for CAS-based counters
NON-BLOCKING BVH CONSTRUCTION
Main Algorithm
  Dispatch n (or more) threads
  Each thread spins on the task queue until the tree is completed
  When a thread gets a node from the queue, it builds the node, increments the number of built nodes, and places any child nodes onto the task queue
  Store the built node
  Generate the BVH tree structure
Shared resources
  Task queue
  BVH tree structure
  Counters
    Number of nodes, threads, etc.
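The main algorithm above might look roughly like the following Java sketch. It is an illustration under stated assumptions rather than the presenters' code: `NonBlockingBuildSketch`, `Task` and `build` are hypothetical names, "building" a node is reduced to halving a range of primitive indices, and completion is detected by comparing an `AtomicInteger` against the 2n - 1 nodes of a binary tree with one primitive per leaf. The sketch increments the counter after storing the node, so that a full counter implies every node has been stored.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

final class NonBlockingBuildSketch {
    /** Hypothetical unit of work: a node covering primitives [lo, hi). */
    static final class Task {
        final int lo, hi;
        Task(int lo, int hi) { this.lo = lo; this.hi = hi; }
    }

    /** Builds the tree with the given number of threads; returns the node count. */
    static int build(int primitiveCount, int threadCount) {
        final ConcurrentLinkedQueue<Task> tasks = new ConcurrentLinkedQueue<>();
        final ConcurrentLinkedQueue<Task> builtNodes = new ConcurrentLinkedQueue<>();
        final AtomicInteger builtCount = new AtomicInteger();
        final int totalNodes = 2 * primitiveCount - 1; // assumes one primitive per leaf

        tasks.add(new Task(0, primitiveCount)); // root node covers the whole scene

        Runnable worker = () -> {
            // Spin on the task queue until every node of the tree has been built.
            while (builtCount.get() < totalNodes) {
                Task t = tasks.poll();
                if (t == null) { // nothing available right now; let other threads run
                    Thread.yield();
                    continue;
                }
                if (t.hi - t.lo > 1) { // interior node: produce two child tasks
                    int mid = (t.lo + t.hi) >>> 1;
                    tasks.add(new Task(t.lo, mid));
                    tasks.add(new Task(mid, t.hi));
                }
                builtNodes.add(t);            // store the built node
                builtCount.incrementAndGet(); // CAS-based counter, no locks
            }
        };

        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(worker);
            threads[i].start();
        }
        try {
            for (Thread th : threads) th.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("build interrupted", e);
        }
        return builtNodes.size();
    }
}
```

For example, `NonBlockingBuildSketch.build(1000, 4)` processes 1999 nodes with four threads, each acting as both consumer and producer of tasks.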
NON-BLOCKING BVH VERSION DIFFERENCES
Version 1
  Child nodes are added to the tree as they are built
    Requires size checks for each add to get the index
  Spawn threads on demand until the build completes
Version 2
  Child nodes are stored in each parent, and then flattened into the tree once construction is completed
  Only spawn the required number of threads
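The flattening step of Version 2 could be sketched as below. The slides do not say which layout the flattened tree uses, so this example assumes a depth-first (pre-order) list; `FlattenSketch`, `Node` and `flatten` are hypothetical names chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class FlattenSketch {
    /** Hypothetical node that keeps its children until the build is finished. */
    static final class Node {
        final String name;
        Node left, right;
        int index = -1; // position in the flattened tree, assigned by flatten()

        Node(String name) { this.name = name; }
    }

    /** Flattens the linked tree into a list in pre-order (assumed layout). */
    static List<Node> flatten(Node root) {
        List<Node> flat = new ArrayList<>();
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            n.index = flat.size(); // index is known without any size check races
            flat.add(n);
            // Push right first so that the left child is visited first.
            if (n.right != null) stack.push(n.right);
            if (n.left != null) stack.push(n.left);
        }
        return flat;
    }
}
```

Because flattening runs once, after all worker threads have finished, indices can be assigned by a single thread, which avoids the per-add size checks that Version 1 needs.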
EXPERIMENT SETUP
7 builders tested
  2 single-threaded, 5 concurrent
  Using depth-first and breadth-first build strategies
17 models tested
  1010 triangles (Legoman) to 134K triangles (Halo 3 Scene)
All times averaged over 3 trials
Experiment system
  Windows 7
  Intel Core i7 processor
  4 GB RAM
RESULTS
For all builders, we see a general trend towards build times that grow exponentially with the number of triangles in the scene.
The non-blocking builder is generally as fast as the blocking methods, except for the blocking BFS builder, which produces the fastest build times.
Additional threads do not linearly decrease build time.
[Figure: All Models. Build time (ms, 0 to 25,000) versus number of triangles (0 to 160,000). Series: DepthFirstBvhBuilder(1), BreadthFirstBvhBuilder(1), BlockingBFSBvhBuilder(2), BlockingBFSBvhBuilder(4), BlockingRecursiveBvhBuilder(2), BlockingRecursiveBvhBuilder(4), BlockingRecursiveBvhBuilderV2(2), BlockingRecursiveBvhBuilderV2(4), NonBlockingBvhBuilder(2), NonBlockingBvhBuilder(4), NonBlockingBvhBuilderV2(2), NonBlockingBvhBuilderV2(4)]
[Figure: Large Models. Build time (ms, 0 to 25,000) versus number of triangles (90,000 to 140,000). Series: DepthFirstBvhBuilder(1), BreadthFirstBvhBuilder(1), BlockingBFSBvhBuilder(2), BlockingBFSBvhBuilder(4), BlockingRecursiveBvhBuilder(2), BlockingRecursiveBvhBuilder(4), BlockingRecursiveBvhBuilderV2(2), BlockingRecursiveBvhBuilderV2(4), NonBlockingBvhBuilder(2), NonBlockingBvhBuilder(4), NonBlockingBvhBuilderV2(2), NonBlockingBvhBuilderV2(4)]
[Figure: Concurrent 2 Threads. Build time (ms, 0 to 25,000) versus number of triangles (0 to 160,000). Series: BlockingBFSBvhBuilder(2), BlockingRecursiveBvhBuilder(2), BlockingRecursiveBvhBuilderV2(2), NonBlockingBvhBuilder(2), NonBlockingBvhBuilderV2(2)]
[Figure: Concurrent 4 Threads. Build time (ms, 0 to 14,000) versus number of triangles (0 to 160,000). Series: BlockingBFSBvhBuilder(4), BlockingRecursiveBvhBuilder(4), BlockingRecursiveBvhBuilderV2(4), NonBlockingBvhBuilder(4), NonBlockingBvhBuilderV2(4)]
[Figure: BFS Bvh Builder. Build time (ms, 0 to 18,000) versus number of triangles (0 to 160,000). Series: BlockingBFSBvhBuilder(2), BlockingBFSBvhBuilder(4), NonBlockingBvhBuilderV2(2), NonBlockingBvhBuilderV2(4)]
BFS BUILDER COMPARISON
The Blocking BFS Builder is faster than the Non-Blocking Builder.
  On average 11.8% faster with 2 threads
  On average 8.4% faster with 4 threads
CONCLUSION
Demonstrated the use of lock-free and wait-free data structures in BVH tree construction
The lock-free implementation is slower, but only modestly (8.4% to 11.8% on average against the Blocking BFS Builder).
  It still gains the benefits of being lock-free and of using wait-free data structures.
Implementation details matter!
QUESTIONS?