coherent ray tracing via stream filtering

56
coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008

Upload: reyna

Post on 24-Feb-2016

20 views

Category:

Documents


0 download

DESCRIPTION

coherent ray tracing via stream filtering. christiaan gribble karthik ramani ieee / eurographics symposium on interactive ray tracing august 2008. acknowledgements. early implementation andrew kensler ( utah ) ingo wald ( intel ) solomon boulos ( stanford ) other contributors - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: coherent ray tracing via stream filtering

coherent ray tracing via stream filtering

christiaan gribblekarthik ramani

ieee/eurographics symposium on interactive ray tracing

august 2008

Page 2: coherent ray tracing via stream filtering

• early implementation– andrew kensler (utah)– ingo wald (intel) – solomon boulos (stanford)

• other contributors– steve parker & pete shirley (nvidia)– al davis & erik brunvand (utah)

acknowledgements

Page 3: coherent ray tracing via stream filtering

• ray packets SIMD processing• increasing SIMD widths

– current GPUs– intel’s larrabee– future processors

how to exploit wide SIMD units forfast ray tracing?

wide SIMD environments

Page 4: coherent ray tracing via stream filtering

• recast ray tracing algorithm– series of filter operations– applied to arbitrarily-sized groups of rays

• apply filters throughout rendering – eliminate inactive rays– improve SIMD efficiency– achieve interactive performance

stream filtering

Page 5: coherent ray tracing via stream filtering

• ray streams– groups of rays– arbitrary size– arbitrary order

• stream filters– set of conditional statements– executed across stream elements– extract only rays with certain properties

core concepts

Page 6: coherent ray tracing via stream filtering

core concepts

a b d e f

stream element

input stream

out_stream filter<test>(in_stream){ foreach e in in_stream if (test(e) == true) out_stream.push(e) return out_stream}

c

test conditional statement(s)

Page 7: coherent ray tracing via stream filtering

core concepts

a b d e finput stream

c

test output streama

a

out_stream filter<test>(in_stream){ foreach e in in_stream if (test(e) == true) out_stream.push(e) return out_stream}

Page 8: coherent ray tracing via stream filtering

core concepts

a b d e finput stream

c

test output streama

bb

out_stream filter<test>(in_stream){ foreach e in in_stream if (test(e) == true) out_stream.push(e) return out_stream}

Page 9: coherent ray tracing via stream filtering

core concepts

a b d e finput stream

c

test output streama

cb

out_stream filter<test>(in_stream){ foreach e in in_stream if (test(e) == true) out_stream.push(e) return out_stream}

Page 10: coherent ray tracing via stream filtering

core concepts

a b d e finput stream

c

test output streama

db d

out_stream filter<test>(in_stream){ foreach e in in_stream if (test(e) == true) out_stream.push(e) return out_stream}

Page 11: coherent ray tracing via stream filtering

core concepts

a b d e finput stream

c

test output streama

eb d

out_stream filter<test>(in_stream){ foreach e in in_stream if (test(e) == true) out_stream.push(e) return out_stream}

Page 12: coherent ray tracing via stream filtering

core concepts

a b d e finput stream

c

test output streama

fb d f

out_stream filter<test>(in_stream){ foreach e in in_stream if (test(e) == true) out_stream.push(e) return out_stream}

Page 13: coherent ray tracing via stream filtering

core concepts

a b d e finput stream

c

test output streama b d f

out_stream filter<test>(in_stream){ foreach e in in_stream if (test(e) == true) out_stream.push(e) return out_stream}

Page 14: coherent ray tracing via stream filtering

• process stream in groups of N elements• two steps

– N-wide groups boolean mask– boolean mask partitioned stream

SIMD filtering

Page 15: coherent ray tracing via stream filtering

SIMD filtering

a b d e finput stream

c

test boolean mask

step one

Page 16: coherent ray tracing via stream filtering

SIMD filtering

a b d e finput stream

c

test boolean maska b ct t f

step one

Page 17: coherent ray tracing via stream filtering

SIMD filtering

a b d e finput stream

c

test boolean maskt

d e ft f t f t

step one

Page 18: coherent ray tracing via stream filtering

SIMD filtering

a b d e finput stream

c

test boolean maskt t f t f t

Page 19: coherent ray tracing via stream filtering

SIMD filtering

a b d e finput stream

c

test boolean maskt t f t f t

partition

a b d e coutput stream

f

Page 20: coherent ray tracing via stream filtering

• all rays requiring same sequence of ops will always perform those ops together

independent of execution pathindependent of order within stream

• coherence defined by ensuing ops

no guessing with heuristicsadapts to geometry, etc.

key characteristics

Page 21: coherent ray tracing via stream filtering

• all rays requiring same sequence of ops will always perform those ops together

independent of execution pathindependent of order within stream

• coherence defined by ensuing ops

no guessing with heuristicsadapts to geometry, etc.

key characteristics

Page 22: coherent ray tracing via stream filtering

• all rays requiring same sequence of ops will always perform those ops together

independent of execution pathindependent of order within stream

• coherence defined by ensuing ops

no guessing with heuristicsadapts to geometry, etc.

key characteristics

Page 23: coherent ray tracing via stream filtering

• SIMD execution exploits parallelism within stream

• improves efficiency underutilization only when (# rays % N) > 0

potential benefits

Page 24: coherent ray tracing via stream filtering

• wide SIMD ops (N > 4)• scatter/gather memory ops• partition op

hardware requirements

Page 25: coherent ray tracing via stream filtering

• recast ray tracing algorithm as a sequence of filter operations

• possible to use filters in all threemajor stages of ray tracing– traversal– intersection– shading

application to ray tracing

Page 26: coherent ray tracing via stream filtering

• sequence of stream filters– extract certain rays for processing– ignore others, process later– implicit or explicit

• traversal implicit filter stack• shading explicit filter stack

filter stacks

Page 27: coherent ray tracing via stream filtering

while (!stack.empty())(node, stream) = stack.pop()substream = filter<isec_node>(stream)if (substream.size() == 0)

continueelse if (node.is_leaf())

prims.intersect(substream)else

determine front/back childrenstack.push(back_child, substream)stack.push(front_child, substream)

traversal

Page 28: coherent ray tracing via stream filtering

drop inactive rays

traversal

a b d e finput stream

c

stackcurrent node x w (0, 5)

a b d e coutput stream

f

yz

filter against node

Page 29: coherent ray tracing via stream filtering

traversal

a b d e finput stream

c

stackcurrent node x y (0, 3)

a b doutput stream

f

yz

w (0, 5)

push back child

Page 30: coherent ray tracing via stream filtering

traversal

a b d e finput stream

c

stackcurrent node x z (0, 3)

…a b doutput stream

f

yz

w (0, 5)y (0, 3)

push front child

Page 31: coherent ray tracing via stream filtering

traversal

a b d e finput stream

c

stackcurrent node x z (0, 3)

…a b doutput stream

f

yz

w (0, 5)y (0, 3)

continue to next traversal step

Page 32: coherent ray tracing via stream filtering

• explicit filter stacks– decompose test into sequence of filters

• sequence of barycentric coordinate tests• …

– too little coherence to necessitate additional filter ops

• simply apply test in N-wide SIMD fashion

intersection

Page 33: coherent ray tracing via stream filtering

• explicit filter stacks– extract & process elements

• shadow rays for explicit direct lighting• rays that miss geometry• rays whose children sample direct illumination• …

– streams are quite long– filter stacks are used to good effect

• shading achieves highest utilization

shading

Page 34: coherent ray tracing via stream filtering

• general & flexible• supports parallel execution

– process only active elements– yields highest possible efficiency– adapts to geometry, etc.

• incurs low overhead

algorithm – summary

Page 35: coherent ray tracing via stream filtering

• why a custom core?– skeptical that algorithm could perform

interactively– provides upper bound on expected

performance– explore parameter space more easily

• if successful, implement for available architectures

hardware simulation

Page 36: coherent ray tracing via stream filtering

• cycle-accurate– models stalls & data dependencies– models contention for components

• conservative– could be synthesized at 1 GHz @ 135 nm– we assume 500 MHz @ 90 nm

simulator highlights

Page 37: coherent ray tracing via stream filtering

• four major subsystems– distributed memory– N-wide SIMD execution units– dedicated address generation units– flexible, hierarchical interconnect

• additional details available in companion papers

simulator highlights

Page 38: coherent ray tracing via stream filtering

• stream filter C++ class template• filter tests C++ functors

– ray/node intersection– ray/triangle intersection– …

• scatter/gather ops used to manipulate operands, rendering state

programming model

Page 39: coherent ray tracing via stream filtering

• does sufficient coherence exist to use wide SIMD units efficiently?

focus on SIMD utilization

• is interactive performance achievable with a custom core?

initial exploration of design space

key questions

Page 40: coherent ray tracing via stream filtering

• does sufficient coherence exist to use wide SIMD units efficiently?

focus on SIMD utilization

• is interactive performance achievable with a custom core?

key questions

Page 41: coherent ray tracing via stream filtering

• does sufficient coherence exist to use wide SIMD units efficiently?

focus on SIMD utilization

• is interactive performance achievable with a custom core?

initial exploration of design space

key questions

Page 42: coherent ray tracing via stream filtering

• monte carlo path tracing– explicit direct lighting– glossy, dielectric, & lambertian materials– depth-of-field effects

• tile-based, breadth-first rendering

rendering

Page 43: coherent ray tracing via stream filtering

• 1024x1024 images• stream size 1K or 4K rays

– 1 spp 32x32 or 64x64 pixels/tile– 64 spp 4x4 or 8x8 pixels/tile

• per-frame stats– O(100s millions) rays/frame– O(100s millions) traversal ops– O(10s millions) intersection ops

experimental setup

Page 44: coherent ray tracing via stream filtering

• high geometric & illumination complexity• representative of common scenarios

test scenes

rtrt conf kala

Page 45: coherent ray tracing via stream filtering

primary rays

N = 4 N = 16 N = 640

20

40

60

80

100

kala – SIMD utilization

traversalintersection

SIMD width

SIM

D ut

iliza

tion

initial stream size = 64x64 rays

Page 46: coherent ray tracing via stream filtering

secondary rays

initial stream size = 64x64 rays

N = 8 N = 12 N = 160

20

40

60

80

100

kala – SIMD utilization

traversalintersectionshading

SIMD width

SIM

D ut

iliza

tion

Page 47: coherent ray tracing via stream filtering

predicted performance

N = 8 N = 12 N = 166

8

10

12

14

16

18

kala – frame rate

32x32 streams64x64 streams

SIMD width

fram

es p

er s

econ

d

Page 48: coherent ray tracing via stream filtering

• achieve high utilization– as high as 97%– SIMD widths of up to 16 elements– utilization increases with stream size

• achieve interactive performance– 15-25 fps– performance increases with stream size– currently requires custom core

results – summary

Page 49: coherent ray tracing via stream filtering

• too few common ops no improvement in utilization

• possible remedies– longer ray streams– parallel traversal

limitations – parallelism

Page 50: coherent ray tracing via stream filtering

• ray stream– generalized ray packet– longer streams better performance– could perform poorly

• possible remedies– carefully designed memory system– streaming memory systems

limitations – memory system

Page 51: coherent ray tracing via stream filtering

• conventional cpus– narrow SIMD (4-wide SSE & altivec)– limited support for scatter/gather ops– partition op software implementation

• possible remedies– custom core– current GPUs– time

limitations – hw support

Page 52: coherent ray tracing via stream filtering

• new approach to coherent ray tracing– process arbitrarily-sized groups of rays

in SIMD fashion with high utilization– eliminates inactive elements, process

only active rays• stream filtering provides

– sufficient coherence for wider-than-four SIMD processing

– interactive performance with custom core

conclusions

Page 53: coherent ray tracing via stream filtering

• key strengths– parallel processing– implicit reordering– modest hw requirements– general

• key challenges– hw support for required ops

conclusions

Page 54: coherent ray tracing via stream filtering

• additional hw simulation– parameter tuning– homogeneous multicore– heterogeneous multicore– …

• improved GPU-based implementation• implementations for future processors

future work

Page 55: coherent ray tracing via stream filtering

• temple of kalabsha– veronica sundstedt– patrick ledda– other members of the university of bristol

computer graphics group• financial support

– swezey scientific instrumentation fund– utah graduate research fellowship– nsf grants 0541009 & 0430063

(more) acknowledgements

Page 56: coherent ray tracing via stream filtering