
Post on 23-Dec-2015


1

A Hierarchical Shadow Volume Algorithm

Timo Aila (1,2)

Tomas Akenine-Möller (3)

(1) Helsinki University of Technology, (2) Hybrid Graphics, (3) Lund University

2

Outline

Brief intro to shadow volumes
  fillrate problem, existing solutions

Our solution
  idea
  implementation

Results

Q&A

3

Shadow volumes [Crow77]

Shadow volumes define closed volumes of space that are in shadow

[Figure: infinitesimal light source; shadow caster (= light cap); extruded side quads; dark cap]

4

Is point inside shadow volume?

1. Pick a reference point R outside the shadow volume (any such point is OK)

2. Span a line from R to the point to be classified

3. Compute the sum of enter (+1) and exit (-1) events along the line; a nonzero sum means the point is in shadow

[Figure, 2D illustration: reference point R, sample points P1, P2 and P3, and a shadow volume]
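The three steps above can be sketched in 2D. This is only an illustration, not code from the talk: the function names are made up, each shadow volume is assumed to be a closed polygon with counter-clockwise vertex order, and the line from R is assumed not to pass exactly through a polygon vertex.

```python
def cross(ax, ay, bx, by):
    """2D cross product (z component)."""
    return ax * by - ay * bx


def crossing_sum(ref, point, volumes):
    """Sum enter (+1) and exit (-1) events on the segment ref -> point.

    volumes: list of closed polygons, each a list of (x, y) vertices in
    counter-clockwise order (assumption of this sketch).
    """
    rx, ry = ref
    dx, dy = point[0] - rx, point[1] - ry
    total = 0
    for poly in volumes:
        n = len(poly)
        for i in range(n):
            ax, ay = poly[i]
            bx, by = poly[(i + 1) % n]
            ex, ey = bx - ax, by - ay
            denom = cross(dx, dy, ex, ey)
            if denom == 0:
                continue  # segment parallel to this edge: no event
            # Solve ref + t*d = a + u*e for the intersection parameters.
            t = cross(ax - rx, ay - ry, ex, ey) / denom
            u = cross(ax - rx, ay - ry, dx, dy) / denom
            if 0 < t < 1 and 0 <= u < 1:
                # For a counter-clockwise polygon, denom < 0 means the
                # segment crosses the edge from outside to inside.
                total += 1 if denom < 0 else -1
    return total


def in_shadow(ref, point, volumes):
    return crossing_sum(ref, point, volumes) != 0
```

With R placed to the left of a square volume, a point inside the square yields a sum of 1, and a point beyond it yields +1 - 1 = 0.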

5

Using graphics hardware

R at ∞ behind pixel (z-fail) [Bilodeau & Songy, Carmack]
  infinity is always outside SVs, so the test is robust
  must not clip to far plane of view frustum
  sum hidden events to stencil buffer, sign from backface culling

[Figure, 2D illustration: camera, reference point R, visible samples (or pixels), view frustum, and a shadow volume whose crossings are marked + and -]
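The z-fail count for a single pixel can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the hardware path: the function name and the fragment representation are hypothetical, larger depth means farther from the camera, and the usual z-fail convention is used (hidden back faces increment, hidden front faces decrement).

```python
def zfail_stencil(visible_depth, fragments):
    """Return the stencil count for one pixel using z-fail counting.

    visible_depth: depth of the visible surface at this pixel.
    fragments: (depth, front_facing) pairs, one per shadow volume
               polygon covering the pixel (hypothetical representation).
    """
    stencil = 0
    for depth, front_facing in fragments:
        if depth > visible_depth:  # depth test fails: fragment is hidden
            stencil += -1 if front_facing else 1
    return stencil


# One shadow volume whose front face is at depth 3 and back face at depth 8.
volume = [(3.0, True), (8.0, False)]
```

A surface at depth 5 lies inside the volume, so only the back face is hidden and the count is 1. A surface at depth 2 hides both faces (-1 + 1 = 0), and a surface at depth 10 hides neither, so both are lit.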

6

Amount of pixel processing

Adapted from [Chan and Durand 2004]

7

Fillrate problem

50+ fps without shadows on ATI Radeon 9800XT at 1280x1024, 1 sample/pixel

1 fps when shadow volumes are rasterized: 2.2 billion pixels per frame
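To put the 2.2 billion figure in perspective, the short calculation below divides it by the number of screen pixels. Only the resolution and the pixel count come from the slide; the "overdraw" interpretation is an added illustration.

```python
width, height = 1280, 1024
screen_pixels = width * height            # 1,310,720 pixels on screen
sv_pixels_per_frame = 2.2e9               # figure quoted on the slide

# Average number of shadow volume pixels rasterized per screen pixel.
overdraw = sv_pixels_per_frame / screen_pixels
print(f"average shadow volume overdraw: about {overdraw:.0f}x")
```

In other words, each screen pixel is touched by shadow volume polygons well over a thousand times per frame on average.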

8

Existing solutions (1/2)

CC shadow volumes [Lloyd et al. 2004]
  draw SVs only where receivers exist
  good when lots of empty space

Hybrid shadow maps and volumes [Chan & Durand 2004]
  use SVs only at shadow boundaries
  boundary pixels determined using shadow map
  artifacts due to limited shadow map resolution

9

Existing solutions (2/2)

Depth bounds [Nvidia 2003]

application supplies min & max depth values separately for each shadow volume

rasterize shadow volume only when visible geometry between [min,max]

optimal bounds hard to compute

[Figure, 2D illustration: camera, min and max depth bounds, shadow volume, visible pixels]
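The depth bounds idea can be illustrated with a small software filter. This is a sketch under assumptions: the function name and the dictionary-based depth buffer are hypothetical, and the test is applied to the depth already stored for the visible geometry at each covered pixel.

```python
def depth_bounds_pixels(depth_buffer, covered, zmin, zmax):
    """Keep only covered pixels whose visible depth lies in [zmin, zmax].

    depth_buffer: dict mapping (x, y) to the stored depth of visible geometry.
    covered: pixels covered by the shadow volume being rasterized.
    """
    return [p for p in covered if zmin <= depth_buffer[p] <= zmax]


depth_buffer = {(0, 0): 0.2, (1, 0): 0.5, (2, 0): 0.9}
kept = depth_bounds_pixels(depth_buffer, list(depth_buffer), 0.4, 0.6)
```

Here only pixel (1, 0) survives, so stencil work is skipped for the other two pixels. The savings depend entirely on how tight the application-supplied bounds are.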

10

Outline

Brief intro to shadow volumes
  fillrate problem, existing solutions

Our solution
  idea
  implementation

Results

Q&A

11

Reference image

12

Shadow volume algorithm executed once per 8x8 pixel tile

13

Green tiles may contain shadow boundary - other tiles were correct

14

Low-res (gray) + per-pixel computed boundaries (dark)

15

How to detect shadow boundaries?

Two facts about shadow volumes
  1. always closed
  2. SV triangles mark potential shadow boundaries

If a 3D volume in the scene is not intersected by shadow volume triangles
  it is fully lit or fully in shadow
  a single sample classifies the entire volume

16

Outline

Brief intro to shadow volumes
  fillrate problem, existing solutions

Our solution
  idea
  implementation

Results

Q&A

17

Detecting boundary tiles

Bound tile with axis-aligned bounding box
  8x8 pixel region
  Zmin, Zmax

Triangle vs. AA box intersection test
  1. low-resolution rasterization
  2. Zmin and Zmax tests

[Figure: tile bounding box, 8 x 8 pixels in screen space, extending from Zmin to Zmax in depth]
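A conservative software version of the two-part test might look like the sketch below. It is not the hardware implementation: the function names are hypothetical, the screen-space overlap uses a separating-axis test in place of a low-resolution rasterizer, and the depth test compares the triangle's vertex depth range against [Zmin, Zmax], which can report more boundary tiles than strictly necessary.

```python
def tri_overlaps_rect(tri, x0, y0, x1, y1):
    """2D separating-axis test: triangle vs. axis-aligned rectangle."""
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    if max(xs) < x0 or min(xs) > x1 or max(ys) < y0 or min(ys) > y1:
        return False  # separated along the x or y axis
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    for i in range(3):
        ax, ay = tri[i][0], tri[i][1]
        bx, by = tri[(i + 1) % 3][0], tri[(i + 1) % 3][1]
        cx, cy = tri[(i + 2) % 3][0], tri[(i + 2) % 3][1]
        nx, ny = ay - by, bx - ax  # normal of edge a -> b
        third = nx * (cx - ax) + ny * (cy - ay)
        lo, hi = min(0, third), max(0, third)  # triangle's extent on this axis
        proj = [nx * (px - ax) + ny * (py - ay) for px, py in corners]
        if max(proj) < lo or min(proj) > hi:
            return False  # separated along this edge normal
    return True


def is_boundary_tile(tri, tile_x, tile_y, zmin, zmax, tile_size=8):
    """tri: three (x, y, z) screen-space vertices of a shadow volume triangle."""
    x0, y0 = tile_x * tile_size, tile_y * tile_size
    if not tri_overlaps_rect(tri, x0, y0, x0 + tile_size, y0 + tile_size):
        return False
    depths = [p[2] for p in tri]
    return not (max(depths) < zmin or min(depths) > zmax)
```

A tile is flagged only if the triangle touches it in screen space and the depth ranges overlap. Tiles that no shadow volume triangle flags can be classified from a single sample.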

18

Fast update of non-boundary tiles

Copy low-res shadows to stencil buffer
  writing 64 per-pixel values would be slow

Two-level stencil buffer saves the day
  maintain [Smin, Smax] per tile
  always test the higher level first
  often no need to validate per-pixel values
  stencil values of non-boundary tiles are constant
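One way to picture the two-level stencil buffer is the small class below. It is a software sketch only: the class and method names are hypothetical, and the choice to store per-pixel values only for tiles where Smin differs from Smax is an assumption made for illustration.

```python
class TwoLevelStencil:
    """Per-tile [smin, smax] plus per-pixel values for non-constant tiles."""

    def __init__(self, tiles_x, tiles_y, tile=8):
        self.tile = tile
        self.smin = [[0] * tiles_x for _ in range(tiles_y)]
        self.smax = [[0] * tiles_x for _ in range(tiles_y)]
        self.pixels = {}  # (tx, ty) -> 2D list, only for non-constant tiles

    def add_to_tile(self, tx, ty, delta):
        """Whole-tile update (non-boundary tile): no per-pixel writes if constant."""
        self.smin[ty][tx] += delta
        self.smax[ty][tx] += delta
        block = self.pixels.get((tx, ty))
        if block is not None:  # slow path: tile already has per-pixel values
            for row in block:
                for i in range(self.tile):
                    row[i] += delta

    def add_to_pixel(self, x, y, delta):
        """Per-pixel update (boundary tile)."""
        t = self.tile
        tx, ty = x // t, y // t
        block = self.pixels.get((tx, ty))
        if block is None:  # expand a constant tile on first per-pixel write
            block = [[self.smin[ty][tx]] * t for _ in range(t)]
            self.pixels[(tx, ty)] = block
        block[y % t][x % t] += delta
        values = [v for row in block for v in row]
        self.smin[ty][tx], self.smax[ty][tx] = min(values), max(values)
        if self.smin[ty][tx] == self.smax[ty][tx]:
            del self.pixels[(tx, ty)]  # tile became constant again

    def read(self, x, y):
        """Test the higher level first; touch per-pixel data only if needed."""
        t = self.tile
        tx, ty = x // t, y // t
        if self.smin[ty][tx] == self.smax[ty][tx]:
            return self.smin[ty][tx]
        return self.pixels[(tx, ty)][y % t][x % t]
```

For a constant tile, both the update and the read touch only the two per-tile values, which is the case the slide describes for non-boundary tiles.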

19

Implementation – Stage 1

Buffers built separately for each shadow volume
  Classifications ready when entire SV processed
  application marks begin/end of shadow volumes

[Diagram labels: SV triangles; Low-resolution rasterizer; Per-tile operations; Boundary?; Low-res shadows]

20

Implementation – Stage 2

[Diagram labels: SV triangles; Low-resolution rasterizer; Boundary?; Low-res shadows; decision "boundary tile?" (Yes: Per-pixel rasterizer, Stencil ops, Update 2-level stencil; No: Copy to 2-level stencil)]
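The per-tile decision in Stage 2 can be sketched as a simple loop. This is a simplified illustration for a single shadow volume, not the pipeline itself: the function and parameter names are hypothetical, shadow state is reduced to a boolean, and the Stage 1 outputs (boundary tiles and low-res shadows) are passed in as ready-made inputs.

```python
TILE = 8


def stage2(tiles, boundary_tiles, lowres_shadow, per_pixel_shadow):
    """Resolve shadows per tile, doing per-pixel work only where needed.

    tiles: iterable of (tx, ty) tile coordinates.
    boundary_tiles: set of tiles flagged in Stage 1.
    lowres_shadow: dict tile -> bool, one sample per tile from Stage 1.
    per_pixel_shadow: callable (x, y) -> bool standing in for the
                      per-pixel rasterizer and stencil operations.
    """
    tile_values = {}    # one value per non-boundary tile
    pixel_values = {}   # per-pixel values, boundary tiles only
    per_pixel_evals = 0
    for tx, ty in tiles:
        if (tx, ty) in boundary_tiles:
            for y in range(ty * TILE, (ty + 1) * TILE):
                for x in range(tx * TILE, (tx + 1) * TILE):
                    pixel_values[(x, y)] = per_pixel_shadow(x, y)
                    per_pixel_evals += 1
        else:
            tile_values[(tx, ty)] = lowres_shadow[(tx, ty)]
    return tile_values, pixel_values, per_pixel_evals


# A 32x8 pixel strip where everything left of x = 13 is in shadow.
tiles = [(tx, 0) for tx in range(4)]
result = stage2(
    tiles,
    boundary_tiles={(1, 0)},
    lowres_shadow={(0, 0): True, (2, 0): False, (3, 0): False},
    per_pixel_shadow=lambda x, y: x < 13,
)
```

In this example only one of the four tiles needs per-pixel work (64 evaluations); the other three are resolved with one value each.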

21

Alternative implementations

Two pass
  Pass 1 = Stage 1
  Pass 2 = Stage 2
  How to keep pixel units busy during Stage 1?
    maybe assign per-tile operations to pixel shaders?

Single pass
  Separate stages using delay stream [Aila et al. 2003]
  Stage 2 of current SV executes simultaneously with next SV's Stage 1

22

Hardware resources

Two-level stencil buffer

Per-tile operations

Optionally delay stream *

Duplicate low-res rasterizer & Zmin/Zmax units *

Cache for per shadow volume buffers
  multiple buffers for pipelined operation
  allocate from external memory

* If not already there for occlusion culling purposes

23

Outline

Brief intro to shadow volumes
  fillrate problem, existing solutions

Our solution
  idea
  implementation

Results

Q&A

24

Results – Simple scene (1280x1024)

                     Depth bounds   Hierarchical   Improvement
Ratio in #pixels     1.1            12.7           11.5
Ratio in bandwidth   1.03           17.6           17.2

25

Results – Knights (1280x1024)

                     Depth bounds   Hierarchical   Improvement
Ratio in #pixels     2.6            7.4            2.8
Ratio in bandwidth   2.4            5.6            2.4

26

Results – Powerplant (1280x1024)

                     Depth bounds   Hierarchical   Improvement
Ratio in #pixels     2.4            22.9           9.5
Ratio in bandwidth   2.3            16.0           6.9

27

Summary

Hierarchical rendering method for shadow volumes
  significant fillrate savings compared to other hardware methods
  also works for soft shadow volumes

Future work
  would it make sense to extend programmability to per-tile operations?
  how many pipeline bubbles are created?
    requires chip-level simulations

28

Thank you!

Questions?

Acknowledgements
  Ville Miettinen, Jacob Ström, Eric Haines, Ulf Assarsson, Lauri Savioja, Jonas Svensson, Ulf Borgenstam, Karl Schultz, 3DR group at Helsinki University of Technology
  The National Technology Agency of Finland, Hybrid Graphics, Bitboys, Nokia and Remedy Entertainment
  ATI for granting fellowship to Timo (2004-2005)