TRANSCRIPT
Progressive Photon Mapping on the Parallel GPU Architecture
Overview
Rendering techniques and the GPU
Physical light transport
Photon mapping: theory, parallel implementation on the GPU
Progressive photon mapping: theory, parallel implementation on the GPU
Results
Conclusions
Lighting Phenomena
How do we model these?
Computer graphics!
Rasterized rendering (Bouknight 1970): projects triangles to the viewing plane
Parallel over geometry
Fast: triangle projection is cheap
Real-time (games)
Ray-traced rendering (Whitted 1980): casts rays into the scene
Parallel over pixels
Slow: expensive ray casting
Photorealistic (movies)
Graphics Processing Unit (GPU)
1990s: OpenGL/DirectX
2000s: Cg/CUDA
Ray tracing on the GPU: issues
Needs spatial data structures for sub-linear bounds over geometry, whose traversal requires a software stack (increased memory bandwidth)
Higher-order rays tend to lose coherency, which causes code divergence (serialization)
Global illumination
Common ray tracers do not implement global illumination
[Figure: left, no global illumination; right, global illumination]
Photon mapping (Henrik Wann Jensen, 1996)
Two-pass hybrid global illumination algorithm
Pass 1: photon shooting and storage
Pass 2: ray tracing + photon density estimate
But first: how to model physically correct light?
Physical light transport
Light model used: geometric optics model
Radiometric quantities:
Φ: radiant power/flux (watt)
E: irradiance (watt/m²)
L: radiance (watt/(steradian · m²))
Ω: solid angle (ΔA cos α / d²)
Radiance is the quantity that describes the color of an object as perceived by the optical sensor
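Light-surface interaction
Bidirectional Reflectance Distribution Function (BRDF)
Describes the ratio of the outgoing radiance in direction Θ to the incoming irradiance from direction Ψ:
$$f_r(x, \Psi \rightarrow \Theta) = \frac{dL(x \rightarrow \Theta)}{dE(x \leftarrow \Psi)}$$
Examples: [figure]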
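Rendering equation (Kajiya 1986)
Steady-state equilibrium distribution of light
$$L(x \rightarrow \Theta) = L_e(x \rightarrow \Theta) + L_r(x \rightarrow \Theta)$$
$$L_r(x \rightarrow \Theta) = \int_{\Omega_x} f_r(x, \Psi \rightarrow \Theta)\, L(x \leftarrow \Psi)\, \cos(N_x, \Psi)\, d\omega_\Psi$$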
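As a quick sanity check of the reflected term, consider a perfectly diffuse (Lambertian) surface with albedo ρ, so that f_r = ρ/π, lit by a constant incoming radiance L_in from every direction. Both the surface model and the symbols ρ and L_in are illustrative assumptions, not taken from the slides. Because the cosine integrates to π over the hemisphere, the integral collapses to a product:

$$L_r(x \rightarrow \Theta) = \frac{\rho}{\pi}\, L_{in} \int_{\Omega_x} \cos(N_x, \Psi)\, d\omega_\Psi = \frac{\rho}{\pi}\, L_{in} \cdot \pi = \rho\, L_{in}$$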
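Photon mapping (cont'd)
Photon mapping provides a full solution to the rendering equation:
$$L(x \rightarrow \Theta) = L_e(x \rightarrow \Theta) + \int_{\Omega_x} f_r(x, \Psi \rightarrow \Theta)\, L(x \leftarrow \Psi)\, \cos(N_x, \Psi)\, d\omega_\Psi$$
By expressing the incoming radiance in flux:
$$L(x \leftarrow \Psi) = \frac{d^2\Phi(x \leftarrow \Psi)}{\cos(N_x, \Psi)\, d\omega_\Psi\, dA_i}$$
And approximating the incoming radiance using photon density estimation on the surface:
$$L(x \rightarrow \Theta) \approx L_e(x \rightarrow \Theta) + \frac{1}{\pi r^2} \sum_{p=1}^{n} f_r(x, \Psi_p \rightarrow \Theta)\, \Delta\Phi_p(x \leftarrow \Psi_p)$$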
Photon mapping pass 1 overview
Generate photons at light sources
Shoot photons using hemispherical sampling
Propagate/absorb photons using Russian roulette scheme
Upon diffuse surface interaction, save photon in photon map (spatial data structure)
Hemispherical sampling
Radiance integration over all possible directions in the solid angle is infeasible: approximate using a Monte Carlo approach
Uniform hemispherical sampling is problematic
Use linear sampling of φ ∈ [0, 0.5π] and θ ∈ [0, 2π]
However, linear sampling of polar coordinates results in a decline in sample density near the horizon (left image)
The solution to this problem is cosine weighting of φ (middle image)
Apply a Gaussian distribution for smoothing (right image)
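The cosine-weighted scheme can be sketched as follows. This is a minimal illustration, assuming the surface normal is the +z axis and using the common "sample a disk, project onto the hemisphere" construction; the function names are hypothetical and the slides do not say which construction the author used.

```python
import math
import random

def sample_cosine_hemisphere(rng):
    """Draw a unit direction around the +z normal with density cos(polar) / pi."""
    u1, u2 = rng.random(), rng.random()
    radius = math.sqrt(u1)               # uniform point on the unit disk
    azimuth = 2.0 * math.pi * u2
    x = radius * math.cos(azimuth)
    y = radius * math.sin(azimuth)
    z = math.sqrt(max(0.0, 1.0 - u1))    # lift the disk point onto the hemisphere
    return (x, y, z)

def mean_cosine(num_samples, seed=1):
    """Average cosine between sampled directions and the normal (hypothetical helper)."""
    rng = random.Random(seed)
    total = sum(sample_cosine_hemisphere(rng)[2] for _ in range(num_samples))
    return total / num_samples
```

For this distribution the expected cosine is 2/3, so `mean_cosine` with a few thousand samples should land close to that value.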
Randomized and deterministic sampling
The type of sampling used also matters
Deterministic sampling has increased bias, which is visible as lighting patterns
Randomized sampling has increased variance, which is visible as high-frequency noise
Photon propagation: Russian roulette
Photon propagation is theoretically infinite: approximate using a Monte Carlo approach
Instead of propagating forever, each surface interaction triggers exactly one of: diffuse reflection, specular reflection/refraction, or absorption
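The roulette decision can be sketched as below. This is a simplified single-channel illustration: the probabilities `p_diffuse` and `p_specular` are hypothetical material parameters, a surviving photon keeps its power unchanged, and `max_bounces` is only a safety cap added for the sketch.

```python
import random

def russian_roulette(xi, p_diffuse, p_specular):
    """Pick one event from a uniform random number xi in [0, 1)."""
    if p_diffuse < 0.0 or p_specular < 0.0 or p_diffuse + p_specular > 1.0:
        raise ValueError("probabilities must be non-negative and sum to at most 1")
    if xi < p_diffuse:
        return "diffuse"
    if xi < p_diffuse + p_specular:
        return "specular"
    return "absorb"  # remaining probability mass

def trace_photon(rng, p_diffuse, p_specular, max_bounces=64):
    """Return the sequence of events for one photon until it is absorbed."""
    events = []
    for _ in range(max_bounces):
        event = russian_roulette(rng.random(), p_diffuse, p_specular)
        events.append(event)
        if event == "absorb":
            break
    return events
```

Because each photon terminates after a random number of bounces, path lengths vary from photon to photon, which is the source of the divergence issue raised under future work.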
Photon map storage
Store photons in the global photon map
Brute-force photon queries are O(n), so make use of spatial data structures:
Uniform grid: O(n)
KD-tree: Θ(n) (unbalanced) / O(log n) (balanced)
[Figure: uniform grid; KD-tree split at (a) middle, (b) average, (c) median]
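A uniform grid for photon storage can be sketched on the CPU as follows. This is an illustrative sketch only: it assumes photons are `(position, flux)` tuples and uses a hash map keyed by integer cell coordinates, whereas a GPU implementation would more likely use flat sorted arrays. All names are hypothetical.

```python
import math
from collections import defaultdict

def cell_of(position, cell_size):
    """Integer grid cell containing a 3D position."""
    return tuple(int(math.floor(c / cell_size)) for c in position)

def build_grid(photons, cell_size):
    """Map each cell to the indices of the photons stored in it."""
    grid = defaultdict(list)
    for index, (position, _flux) in enumerate(photons):
        grid[cell_of(position, cell_size)].append(index)
    return grid

def query_radius(grid, photons, cell_size, center, radius):
    """Indices of photons within `radius` of `center`, visiting only overlapping cells."""
    lo = cell_of([c - radius for c in center], cell_size)
    hi = cell_of([c + radius for c in center], cell_size)
    found = []
    for ix in range(lo[0], hi[0] + 1):
        for iy in range(lo[1], hi[1] + 1):
            for iz in range(lo[2], hi[2] + 1):
                for index in grid.get((ix, iy, iz), ()):
                    if math.dist(photons[index][0], center) <= radius:
                        found.append(index)
    return sorted(found)
```

The query cost depends on how many photons fall into the visited cells, which is why the grid degrades toward O(n) when photons cluster.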
Photon mapping pass 2 overview
Perform photon density estimation on hit points
Use photon density estimation result to obtain approximation to the rendering equation
Apply cone or Gaussian filter
Visualize end result
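$$L(x \rightarrow \Theta) \approx L_e(x \rightarrow \Theta) + \frac{1}{\pi r^2} \sum_{p=1}^{n} f_r(x, \Psi_p \rightarrow \Theta)\, \Delta\Phi_p(x \leftarrow \Psi_p)$$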
Photon density estimation
Use the ray tracer to determine the hit points
For each hit point, perform photon density estimation:
Create an expanding sphere on each hit point
Gather the k nearest photons in the sphere and divide their flux by the projected area
Photon density estimation causes low-frequency bias
The bias is consistent, meaning it decreases to zero when:
The number of photons in the photon map grows to infinity
The number of photon samples of the photon density estimation grows to infinity
$$\frac{1}{\pi r^2} \sum_{p=1}^{n} f_r(x, \Psi_p \rightarrow \Theta)\, \Delta\Phi_p(x \leftarrow \Psi_p)$$
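The density estimate above can be sketched with a brute-force nearest-neighbor search. This is a minimal illustration assuming a Lambertian BRDF (f_r = albedo/π, independent of direction) and photons stored as `(position, flux)` tuples; both are assumptions made for the sketch, and a real implementation would query the photon map instead of scanning all photons.

```python
import heapq
import math

def radiance_estimate(x, photons, k, albedo):
    """Reflected radiance at x from the k nearest photons (Lambertian surface assumed)."""
    # nsmallest returns the k closest photons sorted by increasing distance
    nearest = heapq.nsmallest(k, photons, key=lambda p: math.dist(p[0], x))
    if not nearest:
        return 0.0
    r = math.dist(nearest[-1][0], x)     # radius of the sphere enclosing the k photons
    if r == 0.0:
        return 0.0
    brdf = albedo / math.pi              # constant f_r for a diffuse surface
    total_flux = sum(flux for _position, flux in nearest)
    return brdf * total_flux / (math.pi * r * r)
```

With three unit-flux photons at distances 1, 2 and 3 and k = 2, the sphere radius is 2 and the estimate divides the gathered flux of 2 by the projected area 4π before applying the BRDF.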
Parallel spatial data structures: issues
Brute-force uniform grid traversal for voxel inspection is seemingly parallel; however, it causes many code divergences
Better to use an approach that inspects the same number of voxels per step
Neighbor expansion approach
Terminate when no more cells in a single step contribute (full red)
Similar issue for the KD-tree, although spatial locality of depth-first traversal tends to be better because
Example: 6 nearest photons search, grey circle is max range, black circle is tightest fit
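Filters
Variance is visible as blurriness near sharp features
Use filters to reduce this blur effect
Linear cone filter:
$$w_{pc} = 1 - \frac{\mathrm{distance}_{x \leftrightarrow p}}{c_{Filter} \cdot \mathrm{maxRadius}}$$
Gaussian filter:
$$w_{pg} = \alpha \left[ 1 - \frac{1 - e^{-\beta \frac{d_p^2}{2 r^2}}}{1 - e^{-\beta}} \right]$$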
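The two filter weights can be evaluated as in the sketch below. The default constants α = 0.918 and β = 1.953 are the values commonly quoted from Jensen's description of the Gaussian filter; the slides do not state which values were used, so treat them as an assumption. Function and parameter names are hypothetical.

```python
import math

def cone_weight(distance, c_filter, max_radius):
    """Linear cone filter: weight falls off linearly with distance to the hit point."""
    return 1.0 - distance / (c_filter * max_radius)

def gaussian_weight(distance, max_radius, alpha=0.918, beta=1.953):
    """Gaussian filter weight; alpha and beta defaults are assumed, not from the slides."""
    falloff = 1.0 - math.exp(-beta * distance ** 2 / (2.0 * max_radius ** 2))
    return alpha * (1.0 - falloff / (1.0 - math.exp(-beta)))
```

Both weights are largest for photons at the hit point itself and decrease toward the edge of the search sphere, so distant photons contribute less to the estimate.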
Photon mapping result
Direct visualization of the approximated radiance for each point
The main parameters (and associated errors):
Number of photons in the photon map (variance, visible as blur)
Number of samples for the density estimation (bias, visible as bumps)
Ray tracing + photon mapping: obtaining direct illumination using ray tracing and indirect illumination using photon mapping reduces variance and bias
Photon mapping on the GPU
Memory bandwidth and size issues are the main bottlenecks on the GPU
Because of the many photon queries per photon density estimation, photon mapping is very memory dependent
Can we find an algorithm that shares the benefits of photon mapping without the memory dependency?
Introducing: Progressive Photon Mapping!
Progressive photon mapping
Same concept as photon mapping, but in reverse order:
Hit points are traced and stored into a reverse photon map
Photons are shot and propagated into the scene
For each hit point we check whether the photon resides in a predetermined inclusion range and add flux accordingly
To guarantee consistent bias, it uses an adapted progressive radiance estimate
Photons are not stored in between iterations, so there is no memory dependency
Each photon shooting iteration provides an opportunity for progressive feedback
Progressive radiance estimate 1
This approach guarantees consistent bias by:
Increasing the amount of processed photons to infinity
Decreasing the radius of inclusion per hit point to zero
Density before and after adding M photons:
$$d(x) = \frac{N(x)}{\pi R(x)^2} \qquad \hat{d}(x) = \frac{N(x) + M(x)}{\pi R(x)^2}$$
Progressive radiance estimate 2
Since the radius needs to decrease each iteration, we find the new number of photons using:
$$\hat{N}(x) = \pi \hat{R}(x)^2\, \hat{d}(x) = \pi \left( R(x) - dR(x) \right)^2 \hat{d}(x) \qquad \text{(Formula 1)}$$
In order to determine the new radius, we need to define a user-controlled ratio α of photons M that we will keep from the current iteration as follows:
$$\hat{N}(x) = N(x) + \alpha M(x) \qquad \text{(Formula 2)}$$
Now if we equate formula 1 and 2, and substitute for the new density we get:
$$\hat{R}(x) = R(x) - dR(x) = R(x) \sqrt{\frac{N(x) + \alpha M(x)}{N(x) + M(x)}} \qquad \text{(Formula 3)}$$
Using formula 3, we can now obtain the new progressive radiance estimate as follows:
$$L(x, \vec{w}) = \frac{1}{\pi \hat{R}(x)^2\, N_{emitted}} \sum_{p=1}^{N(x)+M(x)} f_r(x, \vec{w}, \vec{w}_p)\, \Phi'_p(x_p, \vec{w}_p) \cdot \frac{\pi \hat{R}(x)^2}{\pi R(x)^2}$$
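One iteration of the per-hit-point update can be sketched as below. It applies formulas 2 and 3 directly and additionally assumes that the accumulated (BRDF-weighted) flux `tau` is rescaled by the area ratio R̂(x)²/R(x)², which follows the usual progressive photon mapping formulation. The default α = 0.7 and all function and parameter names are assumptions for illustration.

```python
import math

def progressive_update(n, m, radius, tau, tau_new, alpha=0.7):
    """Update photon count, radius and accumulated flux of one hit point.

    n: photons accumulated so far, m: photons found this iteration,
    tau: accumulated flux, tau_new: flux gathered this iteration.
    """
    if m == 0:
        return n, radius, tau            # nothing landed in range: no change
    ratio = (n + alpha * m) / (n + m)    # fraction kept; also the area ratio
    n_hat = n + alpha * m                # Formula 2
    radius_hat = radius * math.sqrt(ratio)   # Formula 3
    tau_hat = (tau + tau_new) * ratio    # assumed flux correction for the smaller disc
    return n_hat, radius_hat, tau_hat

def radiance(tau, radius, n_emitted):
    """Radiance from accumulated flux, current radius and total emitted photons."""
    return tau / (math.pi * radius * radius * n_emitted)
```

The update keeps the photon density unchanged: with N = 10, M = 10, R = 2 and α = 0.5, the density before shrinking is 20/(4π) and after shrinking it is 15/(3π), both equal to 5/π.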
Progressive Photon Mapping on the GPU
Hit points are stored into spatial data structure
Photon shooting adds no flux to a hit point when the photon lies outside that hit point's maximum inclusion range
Catch-22: to find the relevant hit points efficiently we need their inclusion range, but that range is data stored with the hit points themselves
'Solution': keep track of a global maximum inclusion radius
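The global-radius workaround can be sketched on the CPU as follows. The sketch assumes hit points are dictionaries with `position`, `radius`, `flux` and `new_photons` fields and that the grid cell size equals the global maximum inclusion radius, so a photon only needs to inspect its own cell and the 26 neighbors. The layout and names are hypothetical and not the author's GPU data structure.

```python
import math
from collections import defaultdict

def cell_of(position, cell_size):
    """Integer grid cell containing a 3D position."""
    return tuple(int(math.floor(c / cell_size)) for c in position)

def build_hit_point_grid(hit_points, global_max_radius):
    """Bin hit points into cells whose size is the global maximum radius."""
    grid = defaultdict(list)
    for index, hit_point in enumerate(hit_points):
        grid[cell_of(hit_point["position"], global_max_radius)].append(index)
    return grid

def deposit_photon(grid, hit_points, global_max_radius, photon_position, photon_flux):
    """Add the photon's flux to every hit point whose own radius contains it."""
    cx, cy, cz = cell_of(photon_position, global_max_radius)
    for ix in (cx - 1, cx, cx + 1):
        for iy in (cy - 1, cy, cy + 1):
            for iz in (cz - 1, cz, cz + 1):
                for index in grid.get((ix, iy, iz), ()):
                    hit_point = hit_points[index]
                    # the per-hit-point radius is only read after the cell lookup
                    if math.dist(hit_point["position"], photon_position) <= hit_point["radius"]:
                        hit_point["flux"] += photon_flux
                        hit_point["new_photons"] += 1
```

Since no individual radius exceeds the global maximum, any hit point that can accept the photon lies at most one cell away along each axis, so the 27-cell neighborhood is sufficient.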
Summary Rendering techniques and the GPU
Physical light transport
Photon mapping
Progressive Photon Mapping
Results
Photon mapping: performance, visualization of variance & bias, ray tracing + photon mapping, end result
Progressive photon mapping: performance, visualization of progressive nature, end result
Conclusions
Future work
[Figure: frames per second (0 to 18) versus amount of shot photons (32768 to 2097152), for the uniform grid and the KD-tree]
Photon mapping: performance
Uniform grid contains a local optimum
Rise due to a decrease in the number of searched voxels
Fall due to higher occupancy of the searched voxels
KD-tree decline is fairly linear, due to constant occupancy per node and linearly increasing KD-tree complexity
Photon mapping: variance
The blurriness (caused by variance) decreases when the amount of photons in the photon map increases
Photon mapping: bias
Ray tracing + photon mapping
Direct illumination + Indirect illumination = global illumination
Benefits and drawbacks of both approaches
Photon mapping: end result
10 million shot photons, 5000 nearest photon samples
Progressive photon mapping: performance
The graph is not completely linear because each iteration updates the maximum search range, reducing the time before data structure traversal terminates
In other words, the FPS is progressively increasing!
[Figure: time taken in seconds (0 to 350) versus amount of shot photons (10M to 100M)]
Progressive photon mapping: Cornell box
Progressive photon mapping: caustics
Progressive photon mapping: end result
15.000 million shot photons, initial max search range 400
Conclusions
Adapting a global illumination algorithm (e.g. photon mapping) to the parallel GPU architecture can be a very nice fit
Global illumination algorithms require high GFLOPS and are usually parallel
Many opportunities to adapt individual components to a GPU implementation
However, adaptation can reveal bottlenecks, such as memory bandwidth and size dependencies in the case of photon mapping
Reordering the algorithm on a conceptual scale (like progressive photon mapping does) can turn it into a perfect fit for the GPU architecture
Future work
Parallel adaptation of other global illumination algorithms
Support for dynamic scenes using BVH on the GPU
Improving parallel nearest neighbor search
Improved coherency of higher-order rays
Highly variable path length due to Russian roulette causes occupancy issues due to code divergences. Research alternative approaches.
Questions