presentatie parallel progressive photon mapping 0overview rendering techniques and the gpu physical...

Progressive Photon Mapping on the Parallel GPU Architecture

Overview

Rendering techniques and the GPU

Physical light transport

Photon mapping: theory, parallel implementation on the GPU

Progressive Photon Mapping: theory, parallel implementation on the GPU

Results

Conclusions

Lighting Phenomena

How do we model these?

Computer graphics!

Rasterized rendering (Bouknight 1970): projects triangles onto the viewing plane

Parallel over geometry

Fast: triangle projection is cheap

Real-time (games)

Ray traced rendering (Whitted 1980): casts rays into the scene

Parallel over pixels

Slow: expensive ray casting

Photorealistic (movies)

Graphics Processing Unit (GPU)

1990s: OpenGL/DirectX

2000s: Cg/CUDA

Ray tracing on the GPU: issues

Needs spatial data structures to get sub-linear bounds over the geometry; traversing them requires a software-managed stack (increased memory bandwidth)

Higher-order rays tend to lose coherency, which causes code divergence (serialization)

Global illumination

Common ray tracers do not implement global illumination

[Figure panels: no global illumination; global illumination]

Photon Mapping (Henrik Wann Jensen, 1996)

Two-pass hybrid global illumination algorithm

Pass 1: photon shooting and storage

Pass 2: ray tracing + photon density estimate

But first: how to model physically correct light?

Physical Light Transport

Light model used: geometric optics model

Radiometric quantities

Φ: radiant power/flux (watt)

E: irradiance (watt/m²)

L: radiance (watt/(steradian · m²))

Ω: solid angle (ΔA cos α / d²)

Radiance is the quantity that describes the resulting color of an object perceived by the optical sensor

Light-surface interaction: Bidirectional Reflectance Distribution Function (BRDF)

Describes the ratio of outgoing radiance to incoming irradiance

\[ f_r(x, \Psi \to \Theta) = \frac{dL(x \to \Theta)}{dE(x \leftarrow \Psi)} \]

[Figure: BRDF examples]

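Rendering equation (Kajiya 1986)

Steady-state equilibrium distribution of light

\[ L(x \to \Theta) = L_e(x \to \Theta) + L_r(x \to \Theta) \]

\[ L_r(x \to \Theta) = \int_{\Omega_x} f_r(x, \Psi \to \Theta)\, L(x \leftarrow \Psi)\, \cos(N_x, \Psi)\, d\omega_\Psi \]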

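Photon mapping (cont’d)

Photon mapping provides a full solution to the rendering equation

\[ L(x \to \Theta) = L_e(x \to \Theta) + \int_{\Omega_x} f_r(x, \Psi \to \Theta)\, L(x \leftarrow \Psi)\, \cos(N_x, \Psi)\, d\omega_\Psi \]

By expressing the incoming radiance in flux

\[ L(x \leftarrow \Psi) = \frac{d^2\Phi(x \leftarrow \Psi)}{\cos(N_x, \Psi)\, d\omega_\Psi\, dA_i} \]

And approximating the incoming radiance using photon density estimation on the surface

\[ L(x \to \Theta) \approx L_e(x \to \Theta) + \frac{1}{\pi r^2} \sum_{p=1}^{n} f_r(x, \Psi_p \to \Theta)\, \Delta\Phi_p(x \leftarrow \Psi_p) \]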

Photon mapping pass 1 overview Generate photons at light sources

Shoot photons using hemispherical sampling

Propagate/absorb photons using Russian roulette scheme

Upon diffuse surface interaction, save photon in photon map (spatial data structure)

Hemispherical sampling

Radiance integration over all possible directions in the solid angle is infeasible: approximate using a Monte Carlo approach

Uniform hemispherical sampling is problematic

Use linear sampling φ ∈ [0, 0.5π] and θ ∈ [0, 2π]

However, linear sampling of polar coordinates results in a decline in sample density near the horizon (left image)

The solution to this problem is cosine weighting of φ (middle image)

Apply a Gaussian distribution for smoothing (right image)
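The two sampling strategies can be compared with a small sketch. This is an illustration under assumptions, not the presentation's implementation: the function names are hypothetical, the slides' φ is called `polar` and θ is called `azimuth`, and the cosine weighting uses the standard inversion with density cos(polar)/π.

```python
import math
import random

def sample_linear_angles(rng):
    """Linear sampling: draw the polar and azimuthal angles uniformly."""
    polar = rng.uniform(0.0, 0.5 * math.pi)
    azimuth = rng.uniform(0.0, 2.0 * math.pi)
    return polar, azimuth

def sample_cosine_weighted(rng):
    """Cosine-weighted sampling: direction density is cos(polar) / pi."""
    u1, u2 = rng.random(), rng.random()
    # Inverting the CDF sin^2(polar) = u1 gives cos(polar) = sqrt(1 - u1).
    polar = math.acos(math.sqrt(1.0 - u1))
    azimuth = 2.0 * math.pi * u2
    return polar, azimuth

def to_direction(polar, azimuth):
    """Unit vector in a local frame whose z axis is the surface normal."""
    s = math.sin(polar)
    return (s * math.cos(azimuth), s * math.sin(azimuth), math.cos(polar))

if __name__ == "__main__":
    rng = random.Random(0)
    for name, sampler in (("linear", sample_linear_angles),
                          ("cosine", sample_cosine_weighted)):
        zs = [to_direction(*sampler(rng))[2] for _ in range(10000)]
        print(name, "mean cos(polar):", sum(zs) / len(zs))
```

For the cosine-weighted sampler the mean of cos(polar) tends to 2/3, which gives a quick sanity check on the distribution.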

Randomized and deterministic sampling

The type of sampling used also matters

Deterministic sampling has increased bias, which is visible through lighting patterns

Randomized sampling has increased variance, which is visible through high-frequency noise

Photon propagation - Russian roulette

Photon propagation is theoretically infinite: approximate using a Monte Carlo approach

Instead, each surface interaction triggers one of: diffuse reflection, specular reflection/refraction, or absorption
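A minimal sketch of the Russian roulette decision, assuming each surface carries a diffuse and a specular probability whose sum is at most 1 (the function names and the `max_bounces` safety cap are hypothetical). A surviving photon keeps its power unchanged, because the termination probability already accounts for the absorbed energy.

```python
import random

def russian_roulette(diffuse, specular, rng):
    """Pick one event for a surface interaction.

    diffuse and specular are probabilities with diffuse + specular <= 1;
    the remaining probability mass is absorption.
    """
    xi = rng.random()
    if xi < diffuse:
        return "diffuse"
    if xi < diffuse + specular:
        return "specular"
    return "absorb"

def trace_photon_path(diffuse, specular, rng, max_bounces=1000):
    """Return the list of events for one photon, ending at absorption.

    Every surface uses the same probabilities here, which is a
    simplification for illustration.
    """
    events = []
    for _ in range(max_bounces):
        event = russian_roulette(diffuse, specular, rng)
        events.append(event)
        if event == "absorb":
            break
    return events

if __name__ == "__main__":
    rng = random.Random(42)
    print(trace_photon_path(0.5, 0.3, rng))
```

With an absorption probability of 0.2 the expected path length is 1/0.2 = 5 interactions, but individual paths vary widely in length.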

Photon map storage

Store photons in the global photon map

Brute-force photon queries are O(n), so make use of spatial data structures: uniform grid, O(n) in the worst case, and KD-tree, Θ(n) when unbalanced / O(log n) when balanced

[Figures: uniform grid; KD-tree split at the (a) middle, (b) average, (c) median]
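As an illustration of the uniform grid option, the sketch below hashes photons into cubic cells and answers a fixed-radius query by inspecting only the cells that overlap the search sphere. It is a sequential CPU sketch under assumptions: photons are `(position, flux)` tuples and all function names are hypothetical.

```python
import math
from collections import defaultdict

def cell_of(position, cell_size):
    """Integer cell coordinates of a 3D position."""
    return tuple(int(math.floor(c / cell_size)) for c in position)

def build_grid(photons, cell_size):
    """Map each cell to the indices of the photons stored in it."""
    grid = defaultdict(list)
    for index, (position, _flux) in enumerate(photons):
        grid[cell_of(position, cell_size)].append(index)
    return grid

def query_radius(grid, photons, cell_size, center, radius):
    """Indices of all photons within radius of center."""
    lo = cell_of([c - radius for c in center], cell_size)
    hi = cell_of([c + radius for c in center], cell_size)
    found = []
    for ix in range(lo[0], hi[0] + 1):
        for iy in range(lo[1], hi[1] + 1):
            for iz in range(lo[2], hi[2] + 1):
                # .get avoids creating empty cells in the defaultdict.
                for index in grid.get((ix, iy, iz), ()):
                    if math.dist(photons[index][0], center) <= radius:
                        found.append(index)
    return found
```

The query cost depends on how many photons fall into the inspected cells, which is why the cell size has to be tuned against the photon density.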

Photon mapping pass 2 overview

Perform photon density estimation on hit points

Use photon density estimation result to obtain approximation to the rendering equation

Apply cone or Gaussian filter

Visualize end result

\[ L(x \to \Theta) \approx L_e(x \to \Theta) + \frac{1}{\pi r^2} \sum_{p=1}^{n} f_r(x, \Psi_p \to \Theta)\, \Delta\Phi_p(x \leftarrow \Psi_p) \]

Photon density estimation

Use the ray tracer to determine the hit points

For each hit point, perform photon density estimation

Create an expanding sphere on each hit point

Gather the k nearest photons in the sphere and divide their flux by the projected area

Photon density estimation causes low-frequency bias

The bias is consistent, meaning it decreases to zero when

the number of photons in the photon map grows to infinity

the number of photon samples of the photon density estimation grows to infinity

\[ \frac{1}{\pi r^2} \sum_{p=1}^{n} f_r(x, \Psi_p \to \Theta)\, \Delta\Phi_p(x \leftarrow \Psi_p) \]
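The density estimate can be sketched with a brute-force nearest-neighbor search. This is a simplified illustration, not the GPU implementation: photons are assumed to be `(position, flux)` tuples with scalar flux, the BRDF is assumed constant (for example albedo/π for a Lambertian surface), and `radiance_estimate` is a hypothetical name.

```python
import heapq
import math

def radiance_estimate(photons, x, k, brdf):
    """Reflected radiance at x from the k nearest photons.

    Implements (1 / (pi r^2)) * sum(brdf * flux_p), where r is the
    distance to the farthest of the gathered photons.
    """
    nearest = heapq.nsmallest(k, photons, key=lambda p: math.dist(p[0], x))
    if not nearest:
        return 0.0
    # nsmallest returns the photons in ascending distance order.
    r = math.dist(nearest[-1][0], x)
    if r == 0.0:
        return 0.0
    total = sum(brdf * flux for _position, flux in nearest)
    return total / (math.pi * r * r)
```

A real photon map replaces the brute-force search with a grid or KD-tree query, but the division by the projected area π r² stays the same.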

Parallel spatial data structures: issues

Brute-force uniform grid traversal for voxel inspection is seemingly parallel; however, it causes many code divergences

Better to use an approach that inspects the same number of voxels per step

Neighbor expansion approach

Terminate when no more cells in a single step contribute (full red)

Similar issue for the KD-tree, although spatial locality of depth-first traversal tends to be better because

Example: 6 nearest photons search, grey circle is max range, black circle is tightest fit

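Filters

Variance visible as blurriness near sharp features

Use filters to reduce this blur effect

Linear cone filter:

\[ w_{pc} = 1 - \frac{\mathrm{distance}_{x \leftrightarrow p}}{c_{\mathrm{Filter}} \cdot \mathrm{maxRadius}} \]

Gaussian filter:

\[ w_{pg} = \alpha \left[ 1 - \frac{1 - e^{-\beta \frac{d_p^2}{2 r^2}}}{1 - e^{-\beta}} \right] \]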
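Both filter weights are simple to evaluate per photon. In the sketch below the function names are hypothetical, and the default values α = 0.918 and β = 1.953 are an assumption taken from Jensen's published description of the Gaussian filter; the slides do not state them. Any normalization of the filtered estimate is left out.

```python
import math

def cone_weight(distance, max_radius, c_filter):
    """Linear cone filter: 1 at the hit point, falling off with distance.

    c_filter >= 1 controls how steep the cone is.
    """
    return 1.0 - distance / (c_filter * max_radius)

def gaussian_weight(distance, max_radius, alpha=0.918, beta=1.953):
    """Gaussian filter weight for a photon at the given distance."""
    exponent = -beta * distance * distance / (2.0 * max_radius * max_radius)
    return alpha * (1.0 - (1.0 - math.exp(exponent)) / (1.0 - math.exp(-beta)))

if __name__ == "__main__":
    for d in (0.0, 0.5, 1.0):
        print(d, cone_weight(d, 1.0, 1.0), gaussian_weight(d, 1.0))
```

Each photon's flux is multiplied by its weight before the sum is divided by the projected area, so photons close to the hit point dominate the estimate.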

Photon mapping result

Direct visualization of the approximated radiance for each point

The main parameters (and associated errors):

Number of photons in the photon map (variance, visible as blur)

Number of samples for the density estimation (bias, visible as bumps)

Ray tracing + photon mapping: obtaining direct illumination using ray tracing and indirect illumination using photon mapping reduces variance and bias

Photon mapping on the GPU

Memory bandwidth and size issues are the main bottlenecks on the GPU

Because of the many photon queries per photon density estimation, photon mapping is very memory dependent

Can we find an algorithm that shares the benefits of photon mapping without the memory dependency?

Introducing: Progressive Photon Mapping!

Progressive photon mapping

Same concept as photon mapping, but in reverse order:

Hit points are traced and stored into a reverse photon map

Photons are shot and propagated into the scene

For each hit point we check whether the photon resides in a predetermined inclusion range and add flux accordingly

To guarantee consistent bias, it uses an adapted progressive radiance estimate

Photons are not stored in between iterations, so there is no memory dependency

Each photon shooting iteration provides an opportunity for progressive feedback.

Progressive radiance estimate 1

This approach guarantees consistent bias by

increasing the number of processed photons to infinity

decreasing the radius of inclusion per hit point to zero

Density before and after adding M photons:

\[ d(x) = \frac{N(x)}{\pi R(x)^2} \qquad \hat{d}(x) = \frac{N(x) + M(x)}{\pi R(x)^2} \]

Progressive radiance estimate 2

Since the radius needs to decrease each iteration, we find the new number of photons using:

\[ \hat{N}(x) = \pi \hat{R}(x)^2\, \hat{d}(x) = \pi \left( R(x) - dR(x) \right)^2 \hat{d}(x) \]

(Formula 1)

In order to determine the new radius, we need to define a user-controlled ratio α of photons M that we will keep from the current iteration as follows:

\[ \hat{N}(x) = N(x) + \alpha M(x) \]

(Formula 2)

Now if we equate formula 1 and 2, and substitute for the new density we get:

\[ \hat{R}(x) = R(x) - dR(x) = R(x) \sqrt{\frac{N(x) + \alpha M(x)}{N(x) + M(x)}} \]

(Formula 3)

Using formula 3, we can now obtain the new progressive radiance estimate as follows:

\[ \tau_{\hat{N}}(x, \vec{\omega}) = \left( \sum_{p=1}^{N(x) + M(x)} f_r(x, \vec{\omega}, \vec{\omega}_p)\, \phi'_p(x_p, \vec{\omega}_p) \right) \frac{\pi \hat{R}(x)^2}{\pi R(x)^2} \]

\[ L(x, \vec{\omega}) \approx \frac{1}{\pi \hat{R}(x)^2} \cdot \frac{\tau_{\hat{N}}(x, \vec{\omega})}{N_{emitted}} \]
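The per-hit-point update that follows from formulas 2 and 3 fits in a few lines. The sketch below is an illustration under assumptions: `progressive_update` and `radiance` are hypothetical names, flux is a scalar that already includes the BRDF weighting, and the accumulated flux is rescaled by the same area ratio as the photon count.

```python
import math

def progressive_update(radius, n, m, flux, new_flux, alpha):
    """Update one hit point after an iteration that added m photons.

    radius, n, flux: current radius R(x), photon count N(x) and
    accumulated flux; new_flux: flux of the m photons found this
    iteration; alpha: user-controlled ratio of new photons to keep.
    """
    if m == 0:
        return radius, n, flux
    ratio = (n + alpha * m) / (n + m)
    new_radius = radius * math.sqrt(ratio)  # Formula 3
    new_n = n + alpha * m                   # Formula 2
    # Flux collected in the old disc is scaled down to the new disc area.
    new_total_flux = (flux + new_flux) * ratio
    return new_radius, new_n, new_total_flux

def radiance(flux, radius, n_emitted):
    """Radiance estimate from accumulated flux and the current radius."""
    return flux / (math.pi * radius * radius * n_emitted)
```

With α = 0.5, a hit point that held 10 photons and receives 10 more keeps 15 photons and shrinks its radius by a factor of sqrt(0.75), so the radius decreases while the photon count keeps growing.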

Progressive Photon Mapping on the GPU

Hit points are stored into spatial data structure

Photon shooting doesn’t add flux to a hit point if the photon is not within the hit point’s maximum inclusion range

Catch-22: in order to efficiently find the hit point, we need data stored with the hit point.

‘Solution’: keep track of global max inclusion radius
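One way to read the global maximum radius idea is sketched below: if the grid cell size equals the global maximum inclusion radius, a photon only has to inspect its own cell and the 26 neighbors, and the per-hit-point radius is read only for those candidates. This is a sequential CPU sketch under assumptions, not the presented GPU code; hit points are dictionaries with hypothetical keys, and BRDF weighting is omitted.

```python
import math
from collections import defaultdict

def cell_of(position, cell_size):
    """Integer cell coordinates of a 3D position."""
    return tuple(int(math.floor(c / cell_size)) for c in position)

def build_hit_point_grid(hit_points, global_max_radius):
    """Hash hit points into cells whose size is the global max radius."""
    grid = defaultdict(list)
    for index, hp in enumerate(hit_points):
        grid[cell_of(hp["position"], global_max_radius)].append(index)
    return grid

def deposit_photon(grid, hit_points, global_max_radius, position, flux):
    """Add a photon's flux to every hit point whose radius contains it."""
    cx, cy, cz = cell_of(position, global_max_radius)
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                for index in grid.get((cx + dx, cy + dy, cz + dz), ()):
                    hp = hit_points[index]
                    # No hit point radius exceeds the cell size, so the
                    # 27 inspected cells cover every possible candidate.
                    if math.dist(hp["position"], position) <= hp["radius"]:
                        hp["m"] += 1
                        hp["new_flux"] += flux
```

Because the global maximum radius shrinks as the individual radii shrink, the grid can be rebuilt with smaller cells in later iterations, leaving fewer candidates per photon.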

Summary

Rendering techniques and the GPU

Physical light transport

Photon mapping

Progressive Photon Mapping

Results

Photon mapping: performance, visualization of variance & bias, ray tracing + photon mapping, end result

Progressive photon mapping: performance, visualization of progressive nature, end result

Conclusions

Future work

[Figure: frames per second (0 to 18) versus amount of shot photons (32768 to 2097152), for the uniform grid and the KD-tree]

Photon mapping: performance

The uniform grid curve contains a local optimum

Rise due to the decreasing number of searched voxels

Fall due to the higher occupancy of the searched voxels

KD-tree decline is fairly linear, due to constant occupancy per node and linearly increasing KD-tree complexity

Photon mapping: variance

The blurriness (caused by variance) decreases when the amount of photons in the photon map increases

Photon mapping: bias

Ray tracing + photon mapping

Direct illumination + Indirect illumination = global illumination

Benefits and drawbacks of both approaches

Photon mapping: end result

10 million shot photons, 5000 nearest photon samples

Progressive photon mapping: performance

The graph is not completely linear because each iteration updates the maximum search range, reducing the time before data structure traversal termination

In other words, the FPS is progressively increasing!

[Figure: time taken in seconds (0 to 350) versus amount of shot photons (10M to 100M)]

Progressive photon mapping: Cornell box

Progressive photon mapping: caustics

Progressive photon mapping: end result

15.000 million shot photons, initial max search range 400

Conclusions

Adapting a global illumination algorithm (e.g. photon mapping) to the parallel GPU architecture can be a very good fit

Global illumination algorithms require high GFLOPS and are usually parallel

Many opportunities to adapt individual components to a GPU implementation

However, adaptation can reveal bottlenecks, such as memory bandwidth and size dependencies in the case of photon mapping

Reordering the algorithm on a conceptual scale (like progressive photon mapping does) can turn it into a perfect fit for the GPU architecture

Future Work

Parallel adaptation of other global illumination algorithms

Support for dynamic scenes using BVH on GPU

Improving parallel nearest neighbor search

Improved coherency of higher-order rays

Highly variable path length due to Russian roulette causes occupancy issues due to code divergences. Research alternative approaches.

Questions
