Download - CS6958: HARDWARE RAY TRACING
![Page 1: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/1.jpg)
CS6958: HARDWARE RAY TRACING Spring 2014
![Page 2: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/2.jpg)
What is Ray Tracing?
¨ A computer graphic rendering technique that simulates optics ¤ Can generate very realistic-looking images ¤ Can take a long time to create those images
![Page 3: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/3.jpg)
A Tale of Two Rendering Algorithms
It was the best of times, it was the worst of times it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of Darkness…
Z-Buffer Rasterization vs. Ray Tracing
![Page 4: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/4.jpg)
A Little History
¨ In the beginning, there was Sketchpad…
If we point the light pen at the display system and press a button called “draw,” the computer will construct a straight line segment which stretches like a rubber band from the initial to the present location of the pen […] A sudden flick of the pen terminates drawing […].
1962…
![Page 5: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/5.jpg)
No Right Answer…
![Page 6: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/6.jpg)
Z-buffering: Ed Catmull, 1974
![Page 7: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/7.jpg)
Z-Buffer
¨ Take triangles as input ¨ Project on image plane ¨ Store closeness in Z-buffer ¨ Save color if closer ¨ Independent triangles processed in parallel
¤ This assumption makes highly parallel processing possible
![Page 8: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/8.jpg)
Eventually, all GPUs use Z-buffering
¨ Because of parallel nature of algorithm, huge advantage over CPUs ¤ Wide Single Instruction Multiple Data (SIMD)
n Fetch one instruction, execute on 32 different pieces of data
¤ Works brilliantly because of the assumption that all triangles are independent n Essentially a SIMD streaming throughput computation
¨ Result is HUGE, power-hungry chips… ¤ With huge graphics performance…
![Page 9: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/9.jpg)
Graphics Chip Architecture
![Page 10: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/10.jpg)
Graphics Chip Architecture
![Page 11: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/11.jpg)
Graphics Chips Performance
![Page 12: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/12.jpg)
Graphics Chips Performance
GK110 Kepler
![Page 13: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/13.jpg)
Graphics Chips Transistors
10-core Xeon Westmere
Nvidia GK110 Kepler
![Page 14: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/14.jpg)
Graphics Chips Transistors
Nvidia GK110 Kepler
![Page 15: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/15.jpg)
Graphics Cards
![Page 16: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/16.jpg)
Why do we need more?
![Page 17: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/17.jpg)
Why do we need more? Lighting!
¨ Z-buffer rasterization is great at rendering lots and lots of triangles ¤ But, the parallelism that enables the speed makes the
assumption that all triangles are independent ¤ This makes optical effects tricky
¨ Shadows, reflections, refractions etc. all need to know about other triangles in the scene
¨ Going further, “global illumination” is even trickier
![Page 18: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/18.jpg)
What are our choices?
¨ A Tale of Two Rendering Algorithms… ¤ Z-buffer Rasterizing ¤ Ray Tracing
![Page 19: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/19.jpg)
Ray Tracing vs. Rasterization
Ray Tracing (parallel on pixels)
Rasterizing (parallel on triangles)
![Page 20: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/20.jpg)
How to they compare?
¨ David Luebke (NVIDIA):
“Rasterization is fast, but needs cleverness to support complex visual effects.
Ray tracing supports complex visual effects, but needs cleverness to be fast.”
![Page 21: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/21.jpg)
Optical Effects
Turner Whitted 1979
![Page 22: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/22.jpg)
Optical Effects
Turner Whitted 1979 Animated film: The Compleat Angler
![Page 23: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/23.jpg)
RAY TRACING
¨ Color each pixel based on the radiance from each visible surface
Tom Funkhouser, Princeton
![Page 24: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/24.jpg)
RAY TRACING
¨ Color each pixel based on the radiance from each visible surface ¤ Note that these arrows
point in the direction of the radiance. We normally trace rays in the other direction.
Tom Funkhouser, Princeton
![Page 25: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/25.jpg)
Ray Tracing
¨ For each sample ¤ Construct a ray from the eye position through view plane ¤ Find the first surface hit ¤ Compute color of that surface
Tom Funkhouser, Princeton
![Page 26: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/26.jpg)
Ray Tracing
¨ For each sample ¤ Construct a ray from the eye position through view plane ¤ Find the first surface hit ¤ Compute color of that surface
Tom Funkhouser, Princeton
Processing the “Primary Rays” is sometimes called “Ray Casting”
![Page 27: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/27.jpg)
Simple Ray Casting
Tom Funkhouser, Princeton
![Page 28: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/28.jpg)
Construct a ray through a pixel
Tom Funkhouser, Princeton
![Page 29: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/29.jpg)
Construct a ray through a pixel
Tom Funkhouser, Princeton
![Page 30: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/30.jpg)
Recursive Ray Tracing
Create scene (objects, materials, lights, camera, background) Preprocess scene foreach frame foreach pixel foreach sample generate ray intersect ray with objects find normal of closest object shade intersection point
Mutually recursive Shading can generate new rays…
![Page 31: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/31.jpg)
Recursive Ray Tracing
Create scene (objects, materials, lights, camera, background) Preprocess scene foreach frame foreach pixel foreach sample generate ray intersect ray with objects find normal of closest object shade intersection point
![Page 32: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/32.jpg)
Major Components of a Ray Tracer
¨ Camera (Pixels to Rays) ¨ Objects (Rays to Intersection info) ¨ Materials (Intersection info and Light to Color) ¨ Lights ¨ Background (Rays to Color)
¨ All together: a Scene
![Page 33: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/33.jpg)
Details
Steve Parker, UofU and NVIDIA
![Page 34: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/34.jpg)
Optical Effects
Turner Whitted 1979
![Page 35: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/35.jpg)
Turner Whitted
Phong Model
Improved Model
![Page 36: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/36.jpg)
Turner Whitted
Improved Model
![Page 37: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/37.jpg)
Turner Whitted
![Page 38: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/38.jpg)
Optical Effects
Turner Whitted 1980
![Page 39: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/39.jpg)
Optical Effects
Steve Parker, UofU and NVIDIA
![Page 40: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/40.jpg)
Ray Tracers Love Glass…
![Page 41: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/41.jpg)
Global Illumination
![Page 42: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/42.jpg)
Global Illumination
![Page 43: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/43.jpg)
How real can you get?
Cornell University
![Page 44: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/44.jpg)
Car Makers Love Ray Tracing…
![Page 45: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/45.jpg)
Car Makers Love Ray Tracing
![Page 46: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/46.jpg)
Car/Film Rendering
![Page 47: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/47.jpg)
Architects Love Ray Tracing…
![Page 48: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/48.jpg)
Architectural Modeling
![Page 49: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/49.jpg)
Movie Makers Love Ray Tracing…
![Page 50: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/50.jpg)
Movie Makers Love Ray Tracing…
![Page 51: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/51.jpg)
Volume Renderers Love Ray Tracing…
Steve Parker, UofU and NVIDIA
![Page 52: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/52.jpg)
Scientific Visualization Loves RT
![Page 53: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/53.jpg)
Scientific Visualization with RT
![Page 54: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/54.jpg)
Ray Tracing is complex?
typedef struct{double x,y,z}vec;vec U,black,amb={.02,.02,.02};struct sphere{ vec cen,colour;double rad,kd,ks,kt,kl,ir}*s,*best,sph[]={0.,6.,.5,1.,1.,1.,.9, .05,.2,.85,0.,1.7,-1.,8.,-.5,1.,.5,.2,1.,.7,.3,0.,.05,1.2,1.,8.,-.5,.1,.8,.8, 1.,.3,.7,0.,0.,1.2,3.,-6.,15.,1.,.8,1.,7.,0.,0.,0.,.6,1.5,-3.,-3.,12.,.8,1., 1.,5.,0.,0.,0.,.5,1.5,};yx;double u,b,tmin,sqrt(),tan();double vdot(A,B)vec A ,B;{return A.x*B.x+A.y*B.y+A.z*B.z;}vec vcomb(a,A,B)double a;vec A,B;{B.x+=a* A.x;B.y+=a*A.y;B.z+=a*A.z;return B;}vec vunit(A)vec A;{return vcomb(1./sqrt( vdot(A,A)),A,black);}struct sphere*intersect(P,D)vec P,D;{best=0;tmin=1e30;s= sph+5;while(s-->sph)b=vdot(D,U=vcomb(-1.,P,s->cen)),u=b*b-vdot(U,U)+s->rad*s ->rad,u=u>0?sqrt(u):1e31,u=b-u>1e-7?b-u:b+u,tmin=u>=1e-7&&u<tmin?best=s,u: tmin;return best;}vec trace(level,P,D)vec P,D;{double d,eta,e;vec N,colour; struct sphere*s,*l;if(!level--)return black;if(s=intersect(P,D));else return amb;colour=amb;eta=s->ir;d= -vdot(D,N=vunit(vcomb(-1.,P=vcomb(tmin,D,P),s->cen )));if(d<0)N=vcomb(-1.,N,black),eta=1/eta,d= -d;l=sph+5;while(l-->sph)if((e=l ->kl*vdot(N,U=vunit(vcomb(-1.,P,l->cen))))>0&&intersect(P,U)==l)colour=vcomb(e ,l->colour,colour);U=s->colour;colour.x*=U.x;colour.y*=U.y;colour.z*=U.z;e=1-eta* eta*(1-d*d);return vcomb(s->kt,e>0?trace(level,P,vcomb(eta,D,vcomb(eta*d-sqrt (e),N,black))):black,vcomb(s->ks,trace(level,P,vcomb(2*d,N,D)),vcomb(s->kd, colour,vcomb(s->kl,U,black))));}main(){puts(“P3\n32 32\n255”);while(yx<32*32) U.x=yx%32-32/2,U.z=32/2-yx++/32,U.y=32/2/tan(25/114.5915590261),U=vcomb(255., trace(3,black,vunit(U)),black),printf("%.0f %.0f %.0f\n",U);}/*minray!*/
Paul Heckbert’s complete ray tracer on the back of his business card (c1989)
Does Whitted-style recursive ray tracing with reflections, refraction, two lights…
![Page 55: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/55.jpg)
Ray Tracing is complex?
typedef struct{double x,y,z}vec;vec U,black,amb={.02,.02,.02};struct sphere{ vec cen,colour;double rad,kd,ks,kt,kl,ir}*s,*best,sph[]={0.,6.,.5,1.,1.,1.,.9, .05,.2,.85,0.,1.7,-1.,8.,-.5,1.,.5,.2,1.,.7,.3,0.,.05,1.2,1.,8.,-.5,.1,.8,.8, 1.,.3,.7,0.,0.,1.2,3.,-6.,15.,1.,.8,1.,7.,0.,0.,0.,.6,1.5,-3.,-3.,12.,.8,1., 1.,5.,0.,0.,0.,.5,1.5,};yx;double u,b,tmin,sqrt(),tan();double vdot(A,B)vec A ,B;{return A.x*B.x+A.y*B.y+A.z*B.z;}vec vcomb(a,A,B)double a;vec A,B;{B.x+=a* A.x;B.y+=a*A.y;B.z+=a*A.z;return B;}vec vunit(A)vec A;{return vcomb(1./sqrt( vdot(A,A)),A,black);}struct sphere*intersect(P,D)vec P,D;{best=0;tmin=1e30;s= sph+5;while(s-->sph)b=vdot(D,U=vcomb(-1.,P,s->cen)),u=b*b-vdot(U,U)+s->rad*s ->rad,u=u>0?sqrt(u):1e31,u=b-u>1e-7?b-u:b+u,tmin=u>=1e-7&&u<tmin?best=s,u: tmin;return best;}vec trace(level,P,D)vec P,D;{double d,eta,e;vec N,colour; struct sphere*s,*l;if(!level--)return black;if(s=intersect(P,D));else return amb;colour=amb;eta=s->ir;d= -vdot(D,N=vunit(vcomb(-1.,P=vcomb(tmin,D,P),s->cen )));if(d<0)N=vcomb(-1.,N,black),eta=1/eta,d= -d;l=sph+5;while(l-->sph)if((e=l ->kl*vdot(N,U=vunit(vcomb(-1.,P,l->cen))))>0&&intersect(P,U)==l)colour=vcomb(e ,l->colour,colour);U=s->colour;colour.x*=U.x;colour.y*=U.y;colour.z*=U.z;e=1-eta* eta*(1-d*d);return vcomb(s->kt,e>0?trace(level,P,vcomb(eta,D,vcomb(eta*d-sqrt (e),N,black))):black,vcomb(s->ks,trace(level,P,vcomb(2*d,N,D)),vcomb(s->kd, colour,vcomb(s->kl,U,black))));}main(){puts(“P3\n32 32\n255”);while(yx<32*32) U.x=yx%32-32/2,U.z=32/2-yx++/32,U.y=32/2/tan(25/114.5915590261),U=vcomb(255., trace(3,black,vunit(U)),black),printf("%.0f %.0f %.0f\n",U);}/*minray!*/
Paul Heckbert’s complete ray tracer on the back of his business card (c1989)
Does Whitted-style recursive ray tracing with reflections, refraction, two lights…
![Page 56: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/56.jpg)
Andrew Kensler’s business-card C++ RT #include <stdlib.h> // card > aek.ppm!#include <stdio.h>!#include <math.h>!typedef int i;typedef float f;struct v{!f x,y,z;v operator+(v r){return v(x+r.x!,y+r.y,z+r.z);}v operator*(f r){return!v(x*r,y*r,z*r);}f operator%(v r){return!x*r.x+y*r.y+z*r.z;}v(){}v operator^(v r!){return v(y*r.z-z*r.y,z*r.x-x*r.z,x*r.!y-y*r.x);}v(f a,f b,f c){x=a;y=b;z=c;}v!operator!(){return*this*(1/sqrt(*this%*!this));}};i G[]={247570,280596,280600,!249748,18578,18577,231184,16,16};f R(){!return(f)rand()/RAND_MAX;}i T(v o,v d,f!&t,v&n){t=1e9;i m=0;f p=-o.z/d.z;if(.01!<p)t=p,n=v(0,0,1),m=1;for(i k=19;k--;)!for(i j=9;j--;)if(G[j]&1<<k){v p=o+v(-k!,0,-j-4);f b=p%d,c=p%p-1,q=b*b-c;if(q>0!){f s=-b-sqrt(q);if(s<t&&s>.01)t=s,n=!(!p+d*t),m=2;}}return m;}v S(v o,v d){f t!;v n;i m=T(o,d,t,n);if(!m)return v(.7,!.6,1)*pow(1-d.z,4);v h=o+d*t,l=!(v(9+R(!),9+R(),16)+h*-1),r=d+n*(n%d*-2);f b=l%!n;if(b<0||T(h,l,t,n))b=0;f p=pow(l%r*(b!>0),99);if(m&1){h=h*.2;return((i)(ceil(!h.x)+ceil(h.y))&1?v(3,1,1):v(3,3,3))*(b!*.2+.1);}return v(p,p,p)+S(h,r)*.5;}i!main(){printf("P6 512 512 255 ");v g=!v!(-6,-16,0),a=!(v(0,0,1)^g)*.002,b=!(g^a!)*.002,c=(a+b)*-256+g;for(i y=512;y--;)!for(i x=512;x--;){v p(13,13,13);for(i r!=64;r--;){v t=a*(R()-.5)*99+b*(R()-.5)*!99;p=S(v(17,16,8)+t,!(t*-1+(a*(R()+x)+b!*(y+R())+c)*16))*3.5+p;}printf("%c%c%c"!,(i)p.x,(i)p.y,(i)p.z);}}!
![Page 57: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/57.jpg)
Andrew Kensler’s business-card C++ RT #include <stdlib.h> // card > aek.ppm!#include <stdio.h>!#include <math.h>!typedef int i;typedef float f;struct v{!f x,y,z;v operator+(v r){return v(x+r.x!,y+r.y,z+r.z);}v operator*(f r){return!v(x*r,y*r,z*r);}f operator%(v r){return!x*r.x+y*r.y+z*r.z;}v(){}v operator^(v r!){return v(y*r.z-z*r.y,z*r.x-x*r.z,x*r.!y-y*r.x);}v(f a,f b,f c){x=a;y=b;z=c;}v!operator!(){return*this*(1/sqrt(*this%*!this));}};i G[]={247570,280596,280600,!249748,18578,18577,231184,16,16};f R(){!return(f)rand()/RAND_MAX;}i T(v o,v d,f!&t,v&n){t=1e9;i m=0;f p=-o.z/d.z;if(.01!<p)t=p,n=v(0,0,1),m=1;for(i k=19;k--;)!for(i j=9;j--;)if(G[j]&1<<k){v p=o+v(-k!,0,-j-4);f b=p%d,c=p%p-1,q=b*b-c;if(q>0!){f s=-b-sqrt(q);if(s<t&&s>.01)t=s,n=!(!p+d*t),m=2;}}return m;}v S(v o,v d){f t!;v n;i m=T(o,d,t,n);if(!m)return v(.7,!.6,1)*pow(1-d.z,4);v h=o+d*t,l=!(v(9+R(!),9+R(),16)+h*-1),r=d+n*(n%d*-2);f b=l%!n;if(b<0||T(h,l,t,n))b=0;f p=pow(l%r*(b!>0),99);if(m&1){h=h*.2;return((i)(ceil(!h.x)+ceil(h.y))&1?v(3,1,1):v(3,3,3))*(b!*.2+.1);}return v(p,p,p)+S(h,r)*.5;}i!main(){printf("P6 512 512 255 ");v g=!v!(-6,-16,0),a=!(v(0,0,1)^g)*.002,b=!(g^a!)*.002,c=(a+b)*-256+g;for(i y=512;y--;)!for(i x=512;x--;){v p(13,13,13);for(i r!=64;r--;){v t=a*(R()-.5)*99+b*(R()-.5)*!99;p=S(v(17,16,8)+t,!(t*-1+(a*(R()+x)+b!*(y+R())+c)*16))*3.5+p;}printf("%c%c%c"!,(i)p.x,(i)p.y,(i)p.z);}}!
![Page 58: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/58.jpg)
A Hierarchy of Ray Tracers
1. Ray casting 2. Ray casting with shadows 3. Whitted-style recursive ray tracing 4. Cook-style distribution ray tracing 5. Path tracing for indirect illumination
(global illumination) 6. … even more advanced techniques…
![Page 59: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/59.jpg)
1: Ray Casting
¨ A 3D line query to determine visibility ¤ Rays are cast from the eye point through each pixel into
the scene ¤ Intersection point of nearest object is returned
![Page 60: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/60.jpg)
2: Ray Casting with Shadows ¨ At each intersection point, cast another ray in the
direction of the light source ¤ Checks whether the point is in shadow
![Page 61: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/61.jpg)
3: Whitted-Style Ray Tracing
¨ Recursively cast rays to account for reflections and refractions
![Page 62: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/62.jpg)
3: Whitted-Style Ray Tracing
Ray casting with shadows Whitted-style ray tracing
![Page 63: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/63.jpg)
Classic Whitted Examples
![Page 64: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/64.jpg)
4: Distribution Ray Tracing
¨ AKA Cook-Style Ray Tracing ¤ Rays can be cast through a lens with area
(i.e. not just a pinhole) n Depth of field
¤ secondary rays directions can be perturbed n Glossy reflections
¤ Shadow rays can be aimed at area light sources n Soft shadows
¤ Can also add time to the ray n Motion blur
![Page 65: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/65.jpg)
4: Distribution Ray Tracing
![Page 66: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/66.jpg)
4: Distribution Ray Tracing
![Page 67: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/67.jpg)
4: Distribution Ray Tracing
![Page 68: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/68.jpg)
5: Path Tracing
¨ At each intersection point, cast a ray in a random direction to see if any light comes from there ¤ With enough oversampling, this results in solving the
“rendering equation” ¤ Fills in the “ambient” shadowed spaces with indirect
lighting
![Page 69: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/69.jpg)
5: Path Tracing
![Page 70: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/70.jpg)
5: Path Tracing
![Page 71: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/71.jpg)
5: Path Tracing
Whitted ray tracing Path Tracing
![Page 72: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/72.jpg)
Lots more to it…
¨ But this hierarchy helps me keep things straight ¤ Ambient occlusion, ray bundles, beam tracing, photon
mapping, metropolis light transport, etc. etc. etc. ¨ Material properties involve other huge set of
issues that can impact realism ¤ BRDF: Bidirectional Reflectance Distribution
Function ¤ BSDF: Bidirectional Scattering
Distribution Function ¤ BTDF: Bidirectional Transmission
Distribution Function ¤ BSSRDF: Bidirectional
Scattering Surface Reflectance Distribution Function
![Page 73: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/73.jpg)
So – use GPUs to ray trace…
¨ … Problem solved? ¨ Unfortunately no – Ray Tracing isn’t as friendly to
SIMD parallelism as Z-buffer rasterization
¨ Cast rays into scene ¨ Intersect with all objects, return first hit ¨ Independent rays processed in parallel
¤ Additional rays can handle optical effects
![Page 74: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/74.jpg)
So – use GPUs to ray trace…
¨ … Problem solved? ¨ Unfortunately no – Ray Tracing isn’t as friendly to
SIMD parallelism as Z-buffer rasterization
¨ Cast rays into scene ¨ Intersect with all objects, return first hit ¨ Independent rays processed in parallel
¤ Additional rays can handle optical effects
![Page 75: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/75.jpg)
Acceleration Structures
¨ Hierarchical partitions that help eliminate large numbers of primitives from that intersection step ¤ Surround scene objects with partitions that are easy to
test for intersection ¤ If you miss the partition, you don’t need to test anything
inside that partition ¤ Changes that linear search step into logarithmic search
¤ BUT – adds data-dependent branching…
![Page 76: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/76.jpg)
Acceleration Structures
¨ Partition the scene into easy to intersect units ¤ Tree-Based
n Bounding Volume Hierarchy (BVH) n Axis-aligned or Object-aligned
n KD-Tree n Binary Space Partitioning Tree (BSP Tree)
¤ Grid-Based n Oct-tree n Uniform Grids n Multi-Grids
![Page 77: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/77.jpg)
Bounding Volume Hierarchy
Tom Funkhouser, Princeton
![Page 78: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/78.jpg)
Bounding Volume Hierarchy
Tom Funkhouser, Princeton
![Page 79: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/79.jpg)
Ray Tracing Algorithm Phases
¨ Traversal ¤ Intersect the ray with bounding objects to eliminate as
much as you can
¨ Intersection ¤ At the leaf nodes, intersect the ray with actual
geometry (triangles, spheres, patches, etc.)
¨ Shading ¤ Figure out what color/light contribution that intersected
point adds to the scene
![Page 80: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/80.jpg)
Ray Tracing Algorithm Phases
¨ Traversal ¤ Tree traversal – does NOT map well to SIMD parallelism
¨ Intersection ¤ FP operations – maps fine to SIMD
¨ Shading ¤ Some trig, some FP – maps fine to SIMD
![Page 81: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/81.jpg)
Ray Tracing Algorithm Phases
¨ Traversal ¤ Tree traversal – does NOT map well to SIMD parallelism ¤ 64%-84% of run time
¨ Intersection ¤ FP operations – maps fine to SIMD ¤ 8% - 30% of run time
¨ Shading ¤ Some trig, some FP – maps fine to SIMD ¤ 1% to 8% of run time
![Page 82: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/82.jpg)
Gaming Possibilities
![Page 83: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/83.jpg)
iRay – NVIDIA’s GPU ray tracer
1 minute
![Page 84: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/84.jpg)
iRay – NVIDIA’s GPU ray tracer
30 minutes
![Page 85: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/85.jpg)
iRay – NVIDIA’s GPU ray tracer
4 hours
![Page 86: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/86.jpg)
iRay GPU ray tracing - 2011
![Page 87: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/87.jpg)
Ray Tracing Hardware?
¨ There have been a few academic projects ¤ Saarland University – SaarCor and RPU ¤ University of Illinois at Urbana-Champaign – Rigel ¤ University of Wisconsin, Madison – Copernicus ¤ KAIST, Korea – MRTP mobile RT ¤ University of Utah - TRaX
![Page 88: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/88.jpg)
TRaX: Threaded Ray eXecution
¨ If you could build a GPU that was customized for ray tracing, what would it look like? ¤ Probably have lots of floating point units ¤ NVIDA/ATI GPUs organize them as wide SIMD
n For example, 32 threads in a “warp” n Great if all 32 threads truly do the exact same thing n Not so great if they branch…
¤ TRaX takes a more MIMD/SPMD approach n Let the multiple threads each have their own PC n Letting the threads be out of sync has benefits…
![Page 89: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/89.jpg)
SIMD Execution
…!SWI! !r6,r1,232!SWI! !r6,r1,236!LWI! !r3,r1,240!ORI! !r5,r0,114!ORI! !r6,r0,106!FPINVSQRT !r5,r5!Bleid !r23,$0BB0!FPDIV !r5,r6,r5!ORI! !r7,r0,-107!FPDIV !r5,r6,r5!ORI! !r8,r0,110!ORI! !r9,r0,107!FPMUL !r7,r5,r7!SWI! !r7,r1,400!…!
![Page 90: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/90.jpg)
SIMD Execution
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
Thread Number
0 1 2 3 4 5 6 7
![Page 91: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/91.jpg)
SIMD Execution
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
Thread Number
0 1 2 3 4 5 6 7
![Page 92: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/92.jpg)
SIMD Execution
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
Thread Number
0 1 2 3 4 5 6 7
![Page 93: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/93.jpg)
SIMD Execution
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
Thread Number
0 1 2 3 4 5 6 7
![Page 94: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/94.jpg)
SIMD Execution
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
Thread Number
0 1 2 3 4 5 6 7
![Page 95: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/95.jpg)
SIMD Execution
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
Thread Number
0 1 2 3 4 5 6 7
![Page 96: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/96.jpg)
SIMD Execution – Resource Replication
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
Thread Number
0 1 2 3 4 5 6 7
![Page 97: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/97.jpg)
SIMD Execution
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
Thread Number
0 1 2 3 4 5 6 7
![Page 98: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/98.jpg)
SIMD Execution
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
Thread Number
0 1 2 3 4 5 6 7
![Page 99: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/99.jpg)
SIMD Execution
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
Thread Number
0 1 2 3 4 5 6 7
![Page 100: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/100.jpg)
SIMD Execution
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
Thread Number
0 1 2 3 4 5 6 7
![Page 101: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/101.jpg)
SIMD Execution – SIMD Efficiency
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
Thread Number
0 1 2 3 4 5 6 7
![Page 102: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/102.jpg)
SPMD Execution
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
Thread Number
0 1 2 3 4 5 6 7
![Page 103: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/103.jpg)
SPMD Execution – Issue Rate
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
Thread Number
0 1 2 3 4 5 6 7
![Page 104: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/104.jpg)
SPMD Execution – Resource Sharing
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
…!SWI!!SWI!!LWI!!ORI!!ORI!!FPINVSQRT!Bleid !!FPDIV !!ORI! !!FPDIV !!ORI! !!ORI! !!FPMUL !!SWI! !!…!
Thread Number
0 1 2 3 4 5 6 7
![Page 105: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/105.jpg)
Ray Tracing Domain Features
¨ Ray tracing is “wonderfully parallel” ¤ Minimal communication and synchronization required
¨ All memory writes are to frame buffer only ¤ We can enforce write-around policy to keep caches clean ¤ Use local scratchpad memory for temporary variables
¨ Small program size makes for small fast Icaches ¨ Threads all at different places in program (out of sync)
¤ We can share various resources since all threads won’t be using them at the same time
![Page 106: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/106.jpg)
Ray Tracing Domain Features
¨ Ray tracing is by nature divergent in control flow ¤ Pointer-chasing / tree search
¨ Consider lots of light-weight MIMD threads capable of handling divergence ¤ Designed from ground up specifically for ray tracing
¨ But what about area overhead for MIMD? ¤ SPMD is better than full MIMD…
![Page 107: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/107.jpg)
TRaX
¨ If you could build a GPU that was customized for ray tracing, what would it look like? ¤ Probably have lots of floating point units ¤ NVIDA/ATI GPUs organize them as wide SIMD
n For example, 32 threads in a “warp” n Great if all 32 threads truly do the exact same thing n Not so great if they branch…
¤ TRaX takes a more MIMD/SPMD approach n Let the multiple threads each have their own PC n Letting the threads be out of sync has benefits…
![Page 108: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/108.jpg)
TRaX Architecture
Basic tile 32 core thread multiprocessor (TM)
Chip
![Page 109: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/109.jpg)
TRaX Software Model
¨ Write a single-threaded ray tracer ¤ Copy this code to all thread processors
¨ Now make each thread use atomic increment to help it decide which rays are its responsibility ¤ Let ‘em loose on the scene
![Page 110: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/110.jpg)
How well does it work?
¨ 2 int add, 8 FP mul and FP add, 1 invsqrt, and 2 16-banked instruction caches per TM
¨ 256KB, 16-bank L2 data cache x 4 ¤ Total off-chip bandwidth: 52 GB/sec
¨ 20 TMs per L2 (80 total) ¨ Total of 2560 cores
![Page 111: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/111.jpg)
Comparison
• We compare our architecture against the best known wide-SIMD GPU ray tracer • Timo Aila (NVIDIA) et al, HPG09 • Running on NVIDIA GTX 285
• Used same scenes and rendering techniques (shaders)
• Compare performance/area and overall performance
111
![Page 112: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/112.jpg)
Benchmark Scenes
Conference Room 282K triangles
Fairy Forest 174K triangles
Sibenik Cathedral 80K triangles
• Primary rays only: ~1M rays per frame • Shading (w/secondary rays): ~34M rays per frame
112
![Page 113: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/113.jpg)
Results
MIMD/SPMD total area: 175mm^2 GTX285 total area: ~300mm^2 Both areas estimated at 65nm process
113
![Page 114: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/114.jpg)
Resource Area
SM = streaming multiprocessor, NVIDIA’s analogue to our TM
114
![Page 115: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/115.jpg)
Analysis
• SPMD and resource sharing benefit from each other • threads get out of sync, resource requests become
evenly staggered • which results in high performance, small area
• Small multi-banked icaches diminish area requirement of SPMD instruction fetch • Without constraint of synchronized threads
115
![Page 116: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/116.jpg)
Comparison Conclusion
• Wide SIMD and general purpose multi-core CPUs seem over-provisioned for ray tracing
• lightweight SPMD architecture with shared resources out-performs a highly tuned ray tracer on highly evolved GPU hardware on realistic benchmark scenes
116
![Page 117: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/117.jpg)
Architecture Research
¨ How do people study new architectures? ¤ Build them and measure? ¤ Nope… Too expensive and time consuming
¨ Instead, we build simulators ¤ Functional simulators ¤ Trace-based simulators ¤ Cycle-accurate simulators ¤ Circuit simulators ¤ Analog circuit simulators
![Page 118: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/118.jpg)
Architecture Research
¨ How do people study new architectures? ¤ Build them and measure? ¤ Nope… Too expensive and time consuming
¨ Instead, we build simulators ¤ Functional simulators – simtrax x86 mode ¤ Trace-based simulators ¤ Cycle-accurate simulators – simtrax arch-sim mode ¤ Circuit simulators ¤ Analog circuit simulators
![Page 119: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/119.jpg)
Conclusion
¨ RT is not well supported by rasterizing GPUs or parallel CPUs ¤ Wide SIMD and general purpose multi-core CPUs seem
over-provisioned for ray tracing
¨ Leveraging RT behavior can improve HW performance ¤ Lightweight SPMD architectures with shared resources
out-perform a highly tuned ray tracer on highly evolved GPU hardware and realistic benchmark scenes
![Page 120: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/120.jpg)
What’s the plan?
¨ Design and implement Ray Tracing variations on a (simulated) parallel machine designed to be good at things like RT ¤ Start with everyone getting up to speed on a basic ray
tracer ¤ Then do projects on more advanced variations
![Page 121: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/121.jpg)
What’s the Plan?
¨ Analysis! ¤ All assignments will include an analysis of how the
application is behaving on the (simulated) hardware
¨ Simtrax has lots of analysis options ¤ Many of which are hard (or impossible) to see on real HW
n Energy, area, stalls (number and type), cache hit rates, instruction counts, conflict counts, etc.
¤ Also, you can tweak various parameters to see their effect n Number and type of functional units, cache size ,organization,
and configuration, memory interfaces, processor organization, etc.
n Essentially anything you can write code to simulate…
![Page 122: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/122.jpg)
What’s the plan?
¨ Hardware Platform ¤ TRaX – a many-core architecture designed for ray
tracing ¤ Quite different than a commercial GPU ¤ Exists as a detailed simulator
¨ Software Platform ¤ Compiler based on llvm-clang ¤ Generates x86 code (for running on your machine) ¤ Also generates TRaX assembly (for the simulator)
![Page 123: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/123.jpg)
Projects
¨ Extend our understanding of HW support for advanced RT techniques… ¤ Beam tracing ¤ Ray bundles / vector operations ¤ Global illumination / Photon mapping / etc. ¤ Motion blur ¤ Ambient occlusion ¤ Animated scenes (acceleration structure rebuilds?) ¤ Participating media (fog, smoke, etc.) ¤ Alternative acceleration structures (grid, KD tree, etc.) ¤ Procedural texturing or geometry (mesh colors?) ¤ power saving techniques / hardware ¤ Rasterizing (!) ¤ Tessellations / procedural geometry
![Page 124: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/124.jpg)
Projects
¨ Extend our architecture that supports advanced RT techniques… (i.e. enhance the simulator) ¤ Memory system enhancements ¤ Address generation units ¤ New function units ¤ Communication between thread processors ¤ Function unit chaining – configurable macro instructions ¤ Support for streaming data access
![Page 125: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/125.jpg)
Questions?
¨ Danny and Konstantin are the instructors ¤ I’m just the trouble maker
¨ Web site: www.eng.utah.edu/~cs6958 ¨ Mailing list
https://sympa.eng.utah.edu/sympa/info/cs6958
![Page 126: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/126.jpg)
Extra Pretty Pictures
![Page 127: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/127.jpg)
CS6620 Fall 2013 Christensen
![Page 128: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/128.jpg)
CS6620 Fall 2013 Romero
![Page 129: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/129.jpg)
CS6620 Fall 2013 McKenna
![Page 130: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/130.jpg)
CS6620 Fall 2013 Kumar K.
![Page 131: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/131.jpg)
CS6620 Fall 2013 Gifford
![Page 132: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/132.jpg)
CS6620 Spring 08 Pegoraro
![Page 133: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/133.jpg)
CS6620 Spring 08 Gallup
![Page 134: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/134.jpg)
CS6620 Spring 08 Kensler
![Page 135: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/135.jpg)
CS6620 Spring 08 Kensler variant B
![Page 136: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/136.jpg)
CS6620 Spring 08 Bavoil
![Page 137: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/137.jpg)
CS6620 Spring 08 Steffen
![Page 138: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/138.jpg)
CS6620 Spring 08
![Page 139: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/139.jpg)
CS6620 Spring 08 Luitjens
![Page 140: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/140.jpg)
CS6620 Spring 08 Ownby
![Page 141: CS6958: HARDWARE RAY TRACING](https://reader030.vdocuments.us/reader030/viewer/2022032608/6237c228e5dcc8264c78e1bf/html5/thumbnails/141.jpg)
CS6620 Spring 08 Stratton