Hierarchical Penumbra Casting
Samuli Laine
Timo Aila
Helsinki University of Technology
Hybrid Graphics, Ltd.
Introduction
• Rendering soft shadows
  – As usual, area light sources are sampled with a number of light samples
  – Multiple receiver points to be shaded
• The main problem is solving the visibility
  – Which light samples are visible to which receiver points
What’s Happening?
[Figure: a light source with its light samples, a shadow caster, and a visible surface with receiver points]
On the Scale of the Problem
• With R receiver points and L light samples there are RL visibility relations to solve
  – For example, 1024×768 resolution and 256 light samples gives over 200 million relations
• Ray casting is the usual solution for solving the visibility relations
  – With T triangles, the cost of casting one shadow ray is O(log T)
  – Total cost becomes O(RL log T)
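The figure quoted above can be checked with a few lines of arithmetic. This is only a sketch: the variable names are illustrative, and it assumes one receiver point per pixel, which the slides do not state explicitly.

```python
# Number of visibility relations for the example resolution on the slide.
# Assumption: one receiver point per output pixel.
width, height = 1024, 768
light_samples = 256

receiver_points = width * height             # R
relations = receiver_points * light_samples  # R * L

print(f"R = {receiver_points:,}, RL = {relations:,}")
```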
About Ray Casting
• The standard ray casting approach considers only one ray at a time
  – This inevitably leads to linear performance with respect to RL
• However, this is highly flexible
  – We need to generate only one ray at a time
• Sub-linear complexity with respect to T is achieved by placing the triangles into an acceleration hierarchy
Transposing the Algorithm
Ray Tracing (linear to R, linear to L, sub-linear to T)
  for each receiver point r
    for each light sample l
      find triangle that blocks the ray from l to r
Our Approach (linear to T, sub-linear to RL)
  for each triangle T
    find all rays from l to r blocked by T
• Goal: sub-linear complexity w.r.t. RL
• Requires rearranging the rendering loop
About Our Algorithm
• Sub-linear complexity with respect to RL is achieved by placing the receiver points and light samples into acceleration hierarchies
  – Therefore, all receiver points must be gathered before computing the shadows
• We process one triangle at a time
  – Good: no need for triangle BSP
  – Bad: linear complexity with respect to T
About Our Algorithm, part 2
• The full rendering process goes as follows:
1. Ray-trace or rasterize the image without shadows to get the receiver points
2. Build the acceleration structures for receiver points and light samples
3. Process all triangles to solve the visibility relations between light samples and receiver points
4. Perform shading
The Acceleration Structures
• Fixed three-level bounding volume hierarchy is used for the light samples
  – Assuming a polygonal light source, bounding “volumes” are actually bounding polygons
• Standard bounding volume hierarchy is used for the receiver points
  – Axis-aligned boxes as bounding volumes
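A receiver point hierarchy with axis-aligned boxes could be built along the following lines. This is a minimal sketch, not the authors' implementation: the slides do not say how the hierarchy is constructed, so the median split along the longest axis, the `leaf_size` parameter, and all names here are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]

@dataclass
class BVHNode:
    lo: Point                             # minimum corner of the axis-aligned box
    hi: Point                             # maximum corner of the axis-aligned box
    indices: Optional[List[int]] = None   # receiver point indices (leaves only)
    left: Optional["BVHNode"] = None
    right: Optional["BVHNode"] = None

def build_bvh(points: List[Point], indices: List[int], leaf_size: int = 4) -> BVHNode:
    """Build a hierarchy over the given receiver point indices (leaf_size >= 1)."""
    lo = tuple(min(points[i][k] for i in indices) for k in range(3))
    hi = tuple(max(points[i][k] for i in indices) for k in range(3))
    if len(indices) <= leaf_size:
        return BVHNode(lo, hi, indices=list(indices))
    # Assumed split rule: median split along the longest box axis.
    axis = max(range(3), key=lambda k: hi[k] - lo[k])
    order = sorted(indices, key=lambda i: points[i][axis])
    mid = len(order) // 2
    return BVHNode(lo, hi,
                   left=build_bvh(points, order[:mid], leaf_size),
                   right=build_bvh(points, order[mid:], leaf_size))

def leaf_indices(node: BVHNode) -> List[int]:
    """Collect all receiver point indices stored below a node."""
    if node.indices is not None:
        return list(node.indices)
    return leaf_indices(node.left) + leaf_indices(node.right)
```

Because all receiver points are gathered before shadow computation, the whole point set is available when the hierarchy is built, which is what makes a top-down build like this possible.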
Light Sample Hierarchy
• Three levels
  – Root node: entire light source
  – Middle nodes: light sample groups
  – Leaf nodes: light samples
• All nodes have a bounding volume
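The three levels can be illustrated with a small sketch for a planar light source. The slides do not specify how samples are assigned to groups, so the regular grid grouping, the use of axis-aligned rectangles in the light's plane as bounding polygons, and every name below are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Sample = Tuple[float, float]               # (u, v) position in the light's plane
Rect = Tuple[float, float, float, float]   # (u_min, v_min, u_max, v_max)

def bounds(samples: List[Sample]) -> Rect:
    """Bounding rectangle of a set of light samples."""
    us = [s[0] for s in samples]
    vs = [s[1] for s in samples]
    return (min(us), min(vs), max(us), max(vs))

def build_light_hierarchy(samples: List[Sample], grid: int = 2) -> dict:
    """Root = entire light, middle = grid x grid sample groups, leaves = samples."""
    root = bounds(samples)
    u0, v0, u1, v1 = root
    du = (u1 - u0) or 1.0   # avoid division by zero for degenerate lights
    dv = (v1 - v0) or 1.0
    cells: Dict[Tuple[int, int], List[int]] = {}
    for i, (u, v) in enumerate(samples):
        gu = min(int((u - u0) / du * grid), grid - 1)
        gv = min(int((v - v0) / dv * grid), grid - 1)
        cells.setdefault((gu, gv), []).append(i)
    groups = [{"bounds": bounds([samples[i] for i in idx]), "samples": idx}
              for idx in cells.values()]
    return {"bounds": root, "groups": groups}
```

Each group stores the indices of its leaf samples, so a leaf's "bounding volume" is just the sample position itself.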
Storing the Visibility Information
• A bit mask with L bits is assigned for every receiver point
  – bit = 0: light sample is visible
  – bit = 1: light sample is occluded
• Initially, all bits are zero
• When a triangle is found to occlude a light sample from a receiver point, the corresponding bit is set to one
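The bit mask convention can be sketched with plain Python integers. The helper names and the small sample count are illustrative assumptions; the slides only define the meaning of the bits.

```python
NUM_LIGHT_SAMPLES = 8   # L, kept small for illustration
NUM_RECEIVERS = 3       # R

# One L-bit mask per receiver point; initially all bits are zero (all visible).
masks = [0] * NUM_RECEIVERS

def mark_occluded(masks, receiver, sample):
    """Set the bit for `sample` in the mask of `receiver` (1 = occluded)."""
    masks[receiver] |= 1 << sample

def visible_fraction(mask, num_samples):
    """Fraction of light samples still visible, as could be used for shading."""
    occluded = bin(mask).count("1")
    return (num_samples - occluded) / num_samples

mark_occluded(masks, 1, 0)
mark_occluded(masks, 1, 5)
mark_occluded(masks, 1, 5)   # a second blocker hitting the same ray changes nothing

print(masks, visible_fraction(masks[1], NUM_LIGHT_SAMPLES))
```

Setting a bit is idempotent, so triangles can be processed in any order without changing the final masks.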
Penumbra Volumes
• All points where a triangle may block a ray from a bounding volume are inside the corresponding penumbra volume
[Figure: a triangle, a bounding volume in the light hierarchy, and the resulting penumbra volume]
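As a worked illustration only (a two-dimensional simplification that is not from the slides), put the light on the line y = H, a blocker segment [a, b] on the line y = h with 0 < h < H, and the receiver points on the line y = 0. With t = h/H, the ray from a receiver at x_r to a light point at x_l crosses the blocker's line at (1 - t) x_r + t x_l, so the hard shadow of a single light point and the penumbra region of a bounding interval [l_0, l_1] on the light come out as:

```latex
% Hard shadow of one light point x_l (assumed flatland setup, t = h/H):
\[
a \le (1-t)\,x_r + t\,x_l \le b
\quad\Longleftrightarrow\quad
\frac{a - t\,x_l}{1-t} \;\le\; x_r \;\le\; \frac{b - t\,x_l}{1-t}
\]
% Penumbra region: union of the hard shadows over all x_l in [l_0, l_1]:
\[
\frac{a - t\,l_1}{1-t} \;\le\; x_r \;\le\; \frac{b - t\,l_0}{1-t}
\]
```

Every receiver outside the second interval sees the whole bounding interval of the light unobstructed by this blocker, which is the property the penumbra volume provides in three dimensions.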
Processing a Triangle
• First build penumbra volumes for all nodes in the light sample hierarchy
• For individual light samples (leaf nodes) these become hard shadow volumes
Processing a Triangle
• Traverse down the receiver point hierarchy
• Step 1: Test intersection between main penumbra volume and bounding volume of receiver node
[Figure: triangle, bounding volume of entire light source, main penumbra volume, and receiver node]
Processing a Triangle
• Step 2: Update the list of active light sample groups
  – At beginning of traversal, all groups are active
[Figure: triangle, receiver node, and bounding volumes of light sample groups; groups whose penumbra volumes miss the receiver node are removed from the active group list]
Processing a Triangle
• Step 3: Recurse into child nodes in receiver hierarchy
  – With pruned list of active light sample groups
[Figure: triangle, bounding volume of entire light source, main penumbra volume, and child nodes]
Processing a Triangle
• Step 4: In leaf node, test receiver points vs. hard shadow volumes of light samples
  – Update the visibility relation bits
[Figure: triangle, light samples in active groups, and receiver points]
Summary of Recursion
• Traverse down the receiver point hierarchy
  – Maintain list of active light sample groups (initially all groups are active)
  – First ensure that receiver node intersects the main penumbra volume, terminate otherwise
  – Then prune the active light sample group list by intersecting receiver node vs. penumbra volumes of active light sample groups
  – In leaf node, test receiver points against hard shadow volumes of remaining light samples
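The recursion summarized above can be sketched in a deliberately simplified "flatland" setting. Everything in this toy is an assumption for illustration: light samples lie on the line y = H, blockers are segments on y = h, receiver points lie on y = 0, and penumbra and hard shadow "volumes" reduce to intervals on the receiver line. The function names and the grouping of samples are hypothetical; the real method works with 3D triangles and volumes.

```python
# Flatland toy (an assumption, not the authors' 3D setup).
H, h = 10.0, 5.0
T = h / H   # where a receiver-to-light ray meets the blocker line y = h

def shadow_interval(l0, l1, a, b):
    """Receiver x-range where blocker [a, b] may hide a light point in [l0, l1].
    With l0 == l1 this is the hard shadow of a single light sample."""
    return ((a - l1 * T) / (1 - T), (b - l0 * T) / (1 - T))

def overlaps(lo, hi, interval):
    return lo <= interval[1] and interval[0] <= hi

def build_receivers(xs, idx, leaf_size=2):
    """Binary hierarchy over receiver indices; each node stores its x-bounds."""
    idx = sorted(idx, key=lambda i: xs[i])
    node = {"lo": xs[idx[0]], "hi": xs[idx[-1]]}
    if len(idx) <= leaf_size:
        node["points"] = idx
    else:
        mid = len(idx) // 2
        node["children"] = [build_receivers(xs, idx[:mid], leaf_size),
                            build_receivers(xs, idx[mid:], leaf_size)]
    return node

def traverse(node, blocker, main_pv, group_pvs, active, groups, lights, xs, masks):
    # Step 1: terminate unless the node meets the main penumbra volume.
    if not overlaps(node["lo"], node["hi"], main_pv):
        return
    # Step 2: prune the list of active light sample groups.
    active = [g for g in active if overlaps(node["lo"], node["hi"], group_pvs[g])]
    if not active:
        return
    # Step 3: recurse into child nodes with the pruned list.
    for child in node.get("children", []):
        traverse(child, blocker, main_pv, group_pvs, active, groups, lights, xs, masks)
    # Step 4: in a leaf, test points against hard shadows of remaining samples.
    for i in node.get("points", []):
        for g in active:
            for s in groups[g]:
                lo, hi = shadow_interval(lights[s], lights[s], *blocker)
                if lo <= xs[i] <= hi:
                    masks[i] |= 1 << s   # bit = 1: light sample is occluded

def penumbra_cast(xs, lights, blockers, group_size=4):
    groups = [list(range(k, min(k + group_size, len(lights))))
              for k in range(0, len(lights), group_size)]
    root = build_receivers(xs, list(range(len(xs))))
    masks = [0] * len(xs)
    for blocker in blockers:   # one blocker ("triangle") at a time
        main_pv = shadow_interval(min(lights), max(lights), *blocker)
        group_pvs = [shadow_interval(min(lights[s] for s in g),
                                     max(lights[s] for s in g), *blocker)
                     for g in groups]
        traverse(root, blocker, main_pv, group_pvs, list(range(len(groups))),
                 groups, lights, xs, masks)
    return masks

def ray_cast(xs, lights, blockers):
    """Reference loop: test every (receiver, light sample, blocker) combination."""
    masks = [0] * len(xs)
    for i, x in enumerate(xs):
        for s, l in enumerate(lights):
            for a, b in blockers:
                lo, hi = shadow_interval(l, l, a, b)
                if lo <= x <= hi:
                    masks[i] |= 1 << s
    return masks

xs = [0.5 * i for i in range(-10, 30)]      # receiver points
lights = [float(i) for i in range(8)]       # light samples
blockers = [(2.0, 3.0), (5.0, 5.5)]
masks = penumbra_cast(xs, lights, blockers)
print(masks == ray_cast(xs, lights, blockers))
```

In this toy the hierarchical traversal and the brute-force reference loop use the same hard shadow test, so the pruning only skips work and the resulting bit masks are identical.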
Optimizations
• Umbra bits for early traversal termination
  – With receiver hierarchy rebuilding to ensure balance
• Active plane sets
• Lazy penumbra volume and hard shadow volume construction
• On-demand bit mask allocation
• Coarse blocker sorting
Extensions
• Multiple light sample sets
  – To remove banding artifacts
• Alpha matte textures
  – Often used in e.g. vegetation textures
• Adaptive antialiasing
• Volumetric light sources
Results
• Compared against Mental Ray 3.2
• Benchmarked only the solving of the visibility relations
  – For Mental Ray, computed both with and without shadows and took the difference
• More detailed results in the paper
Grids (32K triangles, 256 light samples)
Resolution    Peak mem usage    Speedup factor
1280×960      58M               13.5
2560×1920     228M              16.7

Flowers (903K triangles, 256 light samples)
Resolution    Peak mem usage    Speedup factor
1280×960      39M               3.5
2560×1920     154M              7.8

Sponza (1.27M triangles, 256 light samples)
Resolution    Peak mem usage    Speedup factor
1280×960      62M               8.2
2560×1920     244M              11.4
Results: Analysis
• Sub-linearity with respect to R
  – Increasing output resolution gives better relative performance
  – Due to hierarchical processing of receiver points
• Sub-linearity with respect to L
  – Using more light samples gives better relative performance (results in the paper)
  – Due to using analytic penumbra volumes that represent many light samples at once
Results: More Analysis
• Somewhat high memory usage
  – Depends on the output resolution
  – Depends on the complexity of the shadows
  – Does not depend on the number of triangles in the scene
• New problem: dependence on the spatial size of light source
  – Penumbra volumes become larger
  – Leads to lower performance
Conclusions
• Nice properties
  + Exactly the same result as with ray casting
  + No need to store all triangles at any point
  + Sub-linear dependence on output resolution and number of light samples
• Not so nice properties
  – Linear dependence on triangle count
  – Memory usage can be high
  – Dependence on the spatial size of light source
Future Work
• Process multiple triangles at a time?
• Could experiment with full light sample hierarchy, which should (in theory) have better performance
Thank You
• Questions
Funding: National Technology Agency of Finland, Bitboys, Hybrid Graphics, Remedy Entertainment, Nokia, ATI