hardware accelerated simulated radiography

UCRL-CONF-211396

Hardware Accelerated SimulatedRadiography

D. Laney, S. Callahan, N. Max, C. Silva, S.Langer, R. Frank

April 14, 2005

IEEE VisualizationMinneapolis, MN, United StatesOctober 23, 2005 through October 28, 2005

Disclaimer

This document was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor the University of California nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or the University of California. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or the University of California, and shall not be used for advertising or product endorsement purposes.

Hardware Accelerated Simulated Radiography

Daniel Laney∗

Lawrence Livermore National Laboratory

Steven Callahan †

University of Utah

Nelson Max ‡

Lawrence Livermore National Laboratory

Claudio Silva §

University of Utah

Steven LangerLawrence Livermore National Laboratory

Randall FrankComputational Engineering International

(a) (b) (c)

Figure 1: (a) A fuel capsule for inertial confinement fusion showing the fill tube by which Hydrogen fuel is injected. (b) Simulations andexperiments indicate that during capsule compression the fill tube may cause a jet to form that may lead to reduced yield. (c) Simulatedradiographs are are used to design diagnostics that can detect the jet during experiments and lead to quantitative measurements of positionand velocity.

ABSTRACT

We present the application of hardware accelerated volume render-ing algorithms to the simulation of radiographs as an aid to scien-tists designing experiments, validating simulation codes, and under-standing experimental data. The techniques presented take advan-tage of 32 bit floating point texture capabilities to obtain validatedsolutions to the radiative transport equation for X-rays. An unsortedhexahedron projection algorithm is presented for curvilinear hexa-hedra that produces simulated radiographs in the absorption-onlyregime. A sorted tetrahedral projection algorithm is presented thatsimulates radiographs of emissive materials. We apply the tetra-hedral projection algorithm to the simulation of experimental diag-nostics for inertial confinement fusion experiments on a laser at theUniversity of Rochester. We show that the hardware acceleratedsolution is faster than the current technique used by scientists.

CR Categories: K.6.1 [Management of Computing and Infor-mation Systems]: Project and People Management—Life Cycle;K.7.m [The Computing Profession]: Miscellaneous—Ethics

Keywords: volume rendering, hardware acceleration

1 INTRODUCTION

In this paper we present modifications to hardware accelerated vol-ume rendering algorithms that enable the quantitative simulation

∗e-mail: [email protected]†e-mail: [email protected]‡e-mail: [email protected]§e-mail: [email protected]

of radiographs. Radiography is a diagnostic technique used in in-dustrial and scientific applications. In the industrial setting high-energy X-rays are passed through objects as in a medical X-ray. Inthese cases the attenuation of the X-rays dominates. In high-energyphysics applications radiographs may include both attenuation andemission of X-rays by the object being imaged. In both cases itis necessary to adjust the parameters of the radiographic set-up tooptimize the resulting image. Depending on the application, thisoptimization process may be extensive. Designing an experimen-tal diagnostic may require a simulation of the experiment, followedby iteratively modifying the diagnostic parameters in order to de-termine a set of parameters that captures the features of the experi-ment.

Fast simulation of radiographs can also aid in understanding ra-diographical images obtained experimentally. The positions andshapes of interesting features in particular radiographs may not beeasy to identify, or their causes may be hard to determine due tothe lack of depth information in the experimentally obtained image.However, if a simulation of the experiment is available, simulatedradiographs can be generated with varying orientations and mate-rial properties. Furthermore, clipping planes or other subsettingoperations may help detect the location of interesting features in animage. Interactive radiograph simulation thus enhances a scientist’scapability to glean insight from experimental data, and opens newavenues in analyzing experimentally obtained radiographs.

The addition of 32-bit IEEE 754 floating point capability to com-modity graphics hardware has opened up the possibility of usingthis hardware for simulation. GPU-based Linear algebra and com-putational fluid dynamics are two examples in an increasing setof algorithms that have been mapped to graphics hardware. AtLawrence Livermore National Laboratory these new techniqueshave not been adopted by computational scientists mainly due tothe difficulty of programming graphics hardware and the absenceof 64-bit floating point arithmetic. We have found that the solu-tion of the radiation transport equation for X-rays is a problem forwhich 32-bit floating point arithmetic is sufficient for producing

quantitatively correct solutions. Solving this equation allows us tocreate simulated radiographs of proposed experiment diagnostics,and also to accelerate the simulation of industrial radiographs.

We describe an application involving experiment design for in-ternal confinement fusion for which faster response times can en-hance the ability of scientists to design experimental diagnostics.The accelerated techniques enable faster turn-around for a physicistexploring the design space of a particular radiographic apparatus.

2 PREVIOUS WORK

Radiography simulation has been used for medical applications asa training tool. In the field of Non-destructive Evaluation (NDE)software ray tracing techniques have been used in simulations ofradiograph equipment [3]. Koenig [5] used software ray tracingtechniques of CAD geometry to study the 3-D acquisition of defectsimaged in experimental radiographs. These simulations image in-dustrial radiographs of subjects which themselves are not emittingradiation. Thus, they model attenuation of radiation as it passesthrough matter. In the present work cell projection volume render-ing techniques are applied to radiographic imaging. In addition,emissive subjects are also modeled.

Volume rendering unstructured grids has been the subject ofmuch research in the visualization community. Due to recent ad-vances in programmable graphics hardware, techniques have beendeveloped to shift much of the computation to the GPU for bet-ter performance. The Projected Tetrahedra algorithm [9] was thefirst to show how to render tetrahedral cells using the traditional3D polygon-rendering pipeline. Roettger and Ertl [8] show the ef-ficiency of the GPU for compositing ray integrals of arbitrary un-structured polyhedra using an emissive optical model. Weiler etal. [10, 11] demonstrate a ray-caster implemented entirely on theGPU with significant improvement in performance over softwarealgorithms. The ray-caster traverses the tetrahedral mesh by stor-ing the topology in textures which are kept in GPU memory. Amore recent technique by Callahan et al. [1] removes the neces-sity for topological structure by volume rendering the collection oftriangles that compose the mesh. Their algorithm works in both ob-ject and image space by approximately sorting the geometry on theCPU, then finalizing the sort and compositing the fragments on theGPU. This provides a fast, efficient, and simple solution to volumerendering unstructured grids. We extend on the work of Callahan etal. and show that our work can be mapped to hardware-assistedvolume rendering algorithms.

3 RADIOGRAPHY

The radiative transfer equation for mono-energetic photons of fre-quency ν is

dIν

ds= Jν −Kν Iν (1)

where Iν is the specific intensity (or radiance, or spectral radiance)in units of power per unit area per steradian per frequency, s is theposition, Kν is the extinction coefficient (or opacity) with units ofinverse length, and Jν is the specific emissivity in units of powerper unit volume per unit solid angle per frequency.

Introducing the differential optical thickness dτν = Kν ds gives:

dIν

dτν

= Sν − Iν (2)

where Sν = Jν/Kν is the source function with units of specific in-tensity. In a region with constant Kν and Jν the solution is:

Iν = Iν0e−τ +(1− e−τ )Sν (3)

where Iν0 is the specific intensity entering the region and the ac-cumulated attenuation, or optical thickness τ (frequency subscriptelided) is given by:

τ =∫

Kν ds (4)

Given a mesh in which the values of Kν and Jν are constant overeach cell, Iν can be integrated along a linear path through the meshby solving the following first order recurrence:

Iν [i+1] = e−τ Iν [i]+ (1− e−τ )Sν (5)

where i indexes each cell in order from farthest to nearest to de-tector, and in this context Jν and Kν are understood as predefinedper-cell quantities indexed by i.

In many cases a multi-group approximation is used to captureeffects of photons of different energies. In the present work, thefrequency specific quantities Iν , Jν , Kν , and τν are replaced bypiece-wise constant approximations over a set of energy bins. Therecurrence in equation 5 is then solved for each energy group usingthe piece-wise constant values.

4 ABSORPTION-ONLY RADIOGRAPHS

In industrial and medical settings the self emission of X-rays fromthe subject can be neglected. The subject is back-lit by an X-raysource and an image is obtained either on film or detector. Settingthe emissivity Jν to zero in equation 5 produces

Iν [i+1] = e−τ Iν [i] (6)

Solving this recurrence results in a simple product of the exponen-tial terms, which can be found from a sum of the optical thicknessesof all cells projecting onto detector pixels. The optical thickness canbe computed in hardware, using the polyhedron projection method,by summing the attenuation effects of each cell. Since additionis commutative, the global back to front visibility sort usually re-quired for the polyhedron projection method is not necessary. Fora tetrahedral mesh, we can use the method of Shirley and Tuchman[9] to divide the image plane projection of each tetrahedron intotriangle fans for hardware rendering. One can divide any polyhe-dral mesh into tetrahedra in a consistent way using the method ofMax et al. [7]. However, many of our meshes are curvilinear, andthere is a large saving in mesh storage and rendering time if thehexahedral cells can be projected directly. We use the method ofMax [6], which describes how to define triangle fans or strips forall projections of a non-degenerate hexahedron, as long as the pro-jection of each face is a non-self-intersecting quadrilateral. Hexa-hedral cells which are degenerate because they have coinciding ver-tices, or which have faces whose projections are self-intersecting,are divided up into tetrahedra.

Our users require floating point accuracy in the attenuation com-putation. Unfortunately, although the current graphics chips can dointernal fragment computations with 32-bit floating point accuracy,the compositing operations to the frame buffer, which are necessaryto add up the attenuation effects of all the cells, are not possible withthis accuracy. Therefore, the attenuation for each cell was added ina simple fragment program. The attenuation for each vertex wasthe extinction coefficient Kν times the thickness, which was com-puted in software, using the geometry of the cell. The accumulatedattenuation was taken from an image-aligned texture, and the atten-uation for the current cell was interpolated by the hardware fromthe vertex data. The floating point sum was written to a floatingpoint off-screen P-buffer, which was copied to the texture for inputto the next iteration. This copying is necessary to keep the texturecurrent, because writes to any output buffer, including a texture, arenot guaranteed to be in polygon order, and also because the tex-ture caching mechanism does not account for textures changing on

the fly. We can accumulate the attenuation for four different X-raywavelength bands in parallel, using the red, green, blue, and alphachannels.

Copies from the P-buffer to a texture are slow, so it is impracticalto do the copy after rendering every cell. Therefore we attempted torender as many non-overlapping cells to the P buffer as possible, inorder to minimize the copying. This was done in multiple passes.In the first pass, all the hexahedra were tested to see if the couldbe projected as a whole, as described above, and if not, they weresubdivided into five tetrahedra. A logically circular list of pointersto all hexahedra was formed, and for each, an image plane boundingbox was saved. Then in subsequent passes of writing the P-buffer,this list was traversed in jumps of a size relatively prime to the list’slength, so that every cell would be touched once before any wasrepeated, and yet adjacent cells would not be touched consecutively,in order to reduce the chance of overlap. Cells whose boundingboxes did not overlap any others rendered in the current pass couldbe rendered into the P-buffer.

Overlapping bounding boxes were tested as follows. At the be-ginning of each pass, a 1-bit per pixel test buffer was cleared. Thenfor each cell touched in the list, the bounding box was tested byscan converting it to check if there was any overlap with the testbuffer. Because the 1 bit buffer was represented in multi-bit words,aligned per scan line to the left of the screen, the pattern along ascan line in a box often consists of a left hand partly filled word,found from a look up table, zero or more full words, and a righthand partly filled word from another look-up table. Otherwise, itconsists of a single partly filled word which is the bitwise intersec-tion of words from the left and right look-up tables. To check foroverlaps, the words involved in each scan line for the box are testedagainst the words in the test buffer using bitwise logical ands, andif any of the results is non-zero, there is overlap of bounding boxes,and the cell is skipped in this pass. If not, bitwise logical ors areused to set the bits for the bounding box in the test buffer, and thecell is rendered into the P-buffer, and marked as completed in thelist of cells. We also tried scan converting the actual projection ofthe cell using these logical operations, instead of just its boundingbox, but the extra cost of stepping along the edges of the projectionbalanced the speed-up from fewer layers, and there was no net gain.

Once a threshold number of cells are skipped due to overlap,the current pass is terminated, the smallest rectangle in the P-buffercontaining all the rendered bounding boxes is copied into the tex-ture, and the next pass is started. Note that the P-buffer containsthe results of adding all the passes, so when no cells are left, it iscopied to the output buffer. If an X-ray image rather than an attenu-ation image is desired, the negative exponential is done during thiscopy.

The number of passes required is related to the depth complex-ity of the volume decomposition. This is another reason why it ispreferable to render a hexahedral cell as a whole, rather than as fivetetrahedra, which would increase the depth complexity.

4.1 Computing Extinction Coefficients

In order to simulate radiographs in the non-emissive regime, accu-rate extinction coefficients are required. In this section we describethe procedure for obtaining extinction coefficients from calculatedatomic cross-sections for X-rays.

The extinction coefficient Kν can be defined in terms of the coldopacities of the atomic elements:

Kν = κρ (7)

where κ is the opacity in units of cm2/g and ρ is the density of thematerial.

Elemental opacities can be calculated using publicly availabletabulations of computed atomic cross sections [2, 4]. These tables

are indexed by photon energy which is defined as E = hν where his Planck’s constant (6.6263×10−27erg/s) and ν is the frequency.The opacity for a given element can be computed from the tabulatedatomic cross sections by interpolation:

κE =σE

uA(8)

where the cross section σE is interpolated at energy E and hasunits of barns/atom (1barn = 10−24cm2), u is the atomic mass unit(1amu = 1.6605402×10−24g), and A is the atomic mass of the el-ement in amu. Photon energies are typically given in electron volts(1eV = 1.60206×10−12erg).

Figure 2: A plot of the opacity κ of Titanium for 200 energy groups.The sawtooth shape is due to the absorption edge of the K shellelectrons.

The opacity information is divided into energy groups definedby a lower and upper energies Eg and Eg+1 respectively, where gdenotes the group number and Eg < Eg+1. For each energy group aconstant approximation is computed according to:

κg =1

Eg+1 −Eg

∫ Eg+1

Eg

κE ′dE ′ (9)

Figure 2 shows a piecewise constant approximation of the absorp-tion spectrum of Titanium generated using this method. The saw-tooth shape approximates the absorption edge of the K shell elec-trons. At an absorption edge, the photon energy passes the bindingenergy of an electron in a specific shell. This allows the photon toionize the atom, and increases the opacity.

The opacities of mixtures and compounds can be estimated byassuming atomic mixing:

κmat,g =N−1

∑i=0

miκi,g (10)

where mat is a material index, g is the energy group, κi,g is theopacity of element i and mi is the mass fraction of element i in themixture or compound.

The resulting extinction coefficients can be stored in a table in-dexed by material number and energy group number.

5 RADIOGRAPHY OF EMISSIVE OBJECTS

Radiographs of emissive objects require and the full solution ofequation 5. Thus, the compositing operation is not commutativeand must be done in back to front order. We assume the extinc-tion coefficient Kν and source term Sν are provided as piecewise

constant quantities on a geometric object or mesh. We extendthe Hardware-Assisted Visibility Sorting (HAVS) volume render-ing system proposed by Callahan et al. [1]. The HAVS algorithmoperates on a list of the unique faces (triangles) in the tetrahedralmesh. In object-space, the faces are depth sorted by their centroidand then passed to the GPU for rasterization. This first sorting passplaces the geometry in an approximate depth order. Upon rasteri-zation, the fragments undergo an image-space sort via the k-bufferwhich has been implemented on programmable graphics hardwareusing fragment shaders. The k-buffer is a fragment stream sortingnetwork that keeps a fixed number (k) of fragments for each pixelin the framebuffer. As a new fragment is rasterized, it is comparedwith the other entries in the k-buffer and the two entries that arefarthest from the viewpoint (for back-to-front traversal) are usedto find the color and opacity for the fragment using a lookup ta-ble which contains the pre-integrated volume integral. The colorand opacity are composited into the framebuffer, and the remainingfragments are written back to the k-buffer.

In our case, we are interested in solving the transport equationfor each energy group encountered by a ray passing though a setof ordered cells. Unlike the previously described approach, a pre-computed lookup table would introduce considerable error due tothe high range in energy group terms. Therefore, we computethe transport equation in the fragment program which requires theopacity value and source function value for each energy group inaddition to the distance of the fragment from the viewpoint. Welimit the number of energy groups we visualize to four, one per im-age channel (R,G,B,A). Since the HAVS algorithm uses MultipleRender Targets (MRT) to keep the k-buffer entries during rasteriza-tion, the size of k depends on the number of values that are storedfor each fragment. In the original algorithm, HAVS only uses twoentries (v,d) per fragment: one for scalar value and the other fordistance from the viewpoint. On current hardware, three render tar-gets are available in addition to the framebuffer which allow up tosix fragment entries (k = 6). Unfortunately, by removing the lookuptable, we require much more information in our per-fragment com-putation. The viewpoint distance for each fragment is still neededas well as an opacity value and a source function value per energygroup. Storing all these values for each fragment would only allowone fragment entry in the k-buffer. This would limit the amount ofimage-space sorting that occurs and may affect overall image qual-ity. A more conservative approach is to perform a separate render-ing pass for each energy group g. This allows each fragment entryto be reduced to three values, the distance, opacity, and source forthe current energy group (d,og,sg). This improves the image qual-ity by sacrificing some performance.

Another important detail to consider is that our transport equa-tions provide a first order recurrence over a piece-wise constantapproximation. Thus, the opacity and source terms for the en-ergy groups are assigned per cell. Since the HAVS algorithm op-erates on the unique faces that make up the tetrahedra mesh, noconnectivity information is maintained, unlike in ray-casting ap-proaches [10, 11]. This also considerably reduces the number ofprimitives that are rasterized when compared with the ProjectedTetrahedra [9] algorithm and substantially decreases the number offragments treated by the GPU. To keep the fragment count low,these unique faces were extended to keep the energy group termsfor possibly two cells (the cell that the face belongs to and a neigh-boring cell if it is not on the boundary).

Figure 3 shows how opacity and source terms are determined us-ing only face information. A ray passing through the volume willenter a cell by passing through one face and exit the cell by pass-ing through the next face. To determine whether a ray is enteringor exiting a cell represented only by a face, a comparison can bemade of the normal direction of the face and the direction of theray. In an extra object-space step, we perform a dot product of ev-

Figure 3: Unique faces share up to two cells and are encoded withthe energy groups for both. Determining which energy group to use(shown as blue and green cells) is determined per view by a dotproduct with the face normal.

Figure 4: An overview of the ICF process. Several laser beams strikethe inside surface of a hohlraum (the gold cylindrical canister) andheat it to millions of degrees. The resulting thermal X-rays emittedby the hohlraum’s inner surface heat the outer surface of the capsule.The ”rocket exhaust” effect of material ablated from the capsule’ssurface compresses the hydrogen fuel in the center to conditionswhere fusion occurs.

ery face normal with the viewing direction and choose the energygroups for the faces based on the view. Determining the cell that aface represents can also be done in image-space, but would requireduplicate faces to be rasterized and would considerably complicatethe fragment program and affect performance.

A final consideration is that due to the nature of graphics hard-ware, the opacity and source terms are passed to the GPU per ver-tex in texture coordinates. However, since multiple faces may shareone vertex, this can cause inconsistencies in the energy group termsacross a face. Therefore we perform a preprocessing step which du-plicates the vertices that are shared by multiple faces. Though thisintroduces some redundancy in the vertices, the number of raster-ized faces remains the same and constant energy group values areguaranteed across the faces.

6 APPLICATION: DESIGNING DIAGNOSTICS FOR NIF EX-PERIMENTS

We now describe an application involving radiographs of subjectswith emissive materials for laser experiments. When completed,the National Ignition Facility (NIF) is expected to produce fusionby compressing a tiny capsule of Deuterium-Tritium (DT) fuel ina process known as inertial confinement fusion (ICF). NIF experi-ments will reproduce the physical conditions inside stars.

Figure 4 depicts the indirect-drive ICF process. A small spheri-

17ns 17.05ns 17.0875ns

Figure 5: A simulation showing a jet formed inside the fill tube penetrates the fuel as the capsule implodes. Times are shown in nano-seconds.

cal capsule with a radius of roughly 1 millimeter is placed inside ahollow cylinder called a hohlraum (a German word meaning ”cav-ity”) which is typically a few millimeters in diameter and about 10millimeters long. Multiple laser beams enter through the ends of ahohlraum and strike its inside surface. The hohlraum is composedof a heavy material like gold or lead which is heated to millions ofdegrees by the laser beams. The hohlraum wall is hot enough that itemits X-rays which bathe the inside of the hohlraum and cause theablation of the outer (plastic) surface of the capsule. This ablationturns the outer surface of the capsule into a spherical rocket, drivingthe shell of the capsule inward and compressing the DT fuel within.

Two key components of the ICF process are that the DT fuel re-main cool during the compression process and that the compressionis uniform (spherically symmetric). If the fuel is too warm, the DTwill be more difficult to compress and the fusion output will be re-duced. If the compression is not symmetric, the peak density willbe lowered, again resulting in a smaller amount of fusion.

Capsules will be filled with fuel through a tiny tube connectingthe capsule to a reservoir outside the hohlraum. Simulations and ex-periments of this system have shown that a plasma jet is formed bythe tube as it is imploded along with the capsule by the X-rays. Thepresence of the jet has been detected in experiments on a smallerlaser at the University of Rochester. Follow-up experiments arenecessary to accurately measure the position and velocity of thejet and compare this information against simulations. The primarygoal of this paper is to demonstrate GPU based radiography algo-rithms that enhance the productivity of experimentalists designingsuch experiments.

6.1 Imaging the Effects of the Fill Tube

Figure 5 shows the jet that is formed by the fill tube as the outerlayer of the capsule is ablated. A fundamental question is the effectof this jet on capsule implosion. Scientists are designing experi-ments using capsules that have shells doped with titanium. Whenthese doped capsules are imploded the titanium dopant emits X-rays at a higher intensity than the capsule material. The diagnosticsmust be designed to maximize the photon count at the detector, andto enhance the contrast between the dopant (Ti) and the shell mate-rial (primarily carbon). Simulated radiography is used optimize thefree parameters.

Figure 6 shows the set up of the experiment.There are several parameters that can be adjusted to achieve these

goals:

1. Capsule Dopant: changing the dopant and varying theamount of dopant in the capsule. The jet must be detectedalong with the capsule material in order to discover the posi-tion of the jet relative to the capsule.

2. Reference Wires: adding one or more wires of an opticallythick material to the outside of the hohlraum. The wires would

Figure 6: X-rays emitted from the capsule travel through thehohlraum and onto an array of pinholes.

cast shadows on the detector to enable accurate measurementof the position of the jet. The thickness and spacing of thesewires is critical.

3. Filtering of X-rays: introducing slabs of different materialsbetween the hohlraum and detector can cause preferential at-tenuation of X-rays at certain wavelengths. These can be usedto emphasize emission of one material over another. However,the additional of additional filtering reduces the overall inten-sity of the radiation striking the detector, resulting in morenoise in the images.

We focus on performance gains produced by using graphicshardware to solve the transport equation. Our goal is to render im-ages at speeds high enough that the process of adjusting parametersabove becomes nearly interactive.

7 RESULTS

7.1 Attenuation Only Radiographs

Figures 7 and 8 show absorption only radiographs of a simulatedhohlraum and capsule using the unsorted layer-rendering algorithmwith a low and high intensity X-ray source. This data set containedfive different materials for which extinction coefficients were com-puted based on the EPDL [2] data base. The detector was assumedto have uniform sensitivity at all photon energies.

This data set had 32 curvilinear grids domains, for a total of141,960 hexahedral cells, of which 40,656 were collapsed to a sin-gle point. There were also 9696 other cells which had at leasttwo coinciding vertices, and 924 with quadrilateral faces with self-intersecting projections, and cells of both these types were subdi-

Figure 7: A simulated radiograph using 4 energy groups of low inten-sity X-rays. The capsule and hohlraum material reduce the receivedintensity, but the ejected material is visible.

vided into tetrahedra. The remaining 90,684 hexahedra were pro-jected by the method of Max [6]. On one processor of a dual pro-cessor 800 MHz Pentium4 Xeon PC, with an nVidia 5900FXUltragraphics card, it took 0.13 seconds to read in the data, 4 seconds tocompute an attenuation table for four frequency bands and the fivematerials which may be mixed into each cell, 3.5 seconds to buildthe triangle fans, 45.4 seconds to render them in 605 separate lay-ers and copy the layers from the P-buffer into the texture, and 6.63seconds to copy from the P-buffer to the on-screen image.

A newer driver release from nVidia on Linux will allow directtexturing from P-buffers, so that this copy step should no longerbe necessary, and with ping-pong use of two draw buffers fromthe same P-buffer, we should also be able to eliminate the contextswitches. If this swapping of buffers is fast enough, we may be ableswap per cell, and not bother with the software scan conversion forthe overlap testing.

7.2 Simulated Radiographs of Capsule Experiments

Figure 9 depicts the experimental setup for detection of the jet cre-ated by the capsule fill tube. The capsule material is doped withtitanium which is more emissive than the shell material which isprimarily plastic. The X-rays pass through a window cut into thehohlraum made of a lower Z material such as beryllium (Z=4). TheX-rays pass through an array of pinholes, which are imaged at dif-ferent times to generate a time-lapse series capture the compressionof the capsule.

This apparatus is simulated in several stages. First, the cap-sule is simulated at high resolution ( 10 million cells) by a coupledradiation-hydrodynamics code. The simulation models the progressof the compression and the jet formed by the fill tube. The result ofthe simulation is a mesh defining the emissivity and opacity per cellper photon energy group. This data is used as input to a ray trac-ing package that computes the photon intensity over each detectorpixel for each photon energy group in the simulation. The effects ofthe low Z window, the pinholes, and any other filters are computed

Figure 8: Higher intensity radiation (4 energy groups) resolves thecapsule.

using the photon intensity information computed by the ray trace.

The fill-tube simulation resulted in a 2-D mesh of 47750 quadri-laterals with assumed cylindrical symmetry. The software ray traceused by the experimentalists explicitly ray traces the cylindrical ex-trusions of each quadrilateral in the data set. For the hardware-accelerated k-buffer algorithm the entire data set was be extendedabout the axial coordinate, partitioned into hexahedra, then parti-tioned into tetrahedra.

Figure 10 shows a simulated radiograph of a fill-tube jet experi-ment. The image size was 160 by 160 pixels. The software ray trac-ing package currently in use by the experimentalists took 2 minutesfor 57 energy groups on a machine with a 2.2 GHz Pentium classprocessor and 896 megabytes of RAM. The hardware accelerated k-buffer algorithm was applied to a 3-D extension of the fill tube datawith 1,719,000 tetrahedra. The system was a 3.2 GHz Pentium 4with 2 gigabytes of RAM running Windows XP. The graphics cardwas an ATI X800 Pro with 256 megabytes of RAM. The initial sorttook 0.656 seconds and the projection of all tetrahedra took 3.8 sec-onds for four energy groups. This is repeated in 15 passes (to coverall energy groups) for a total of 57.7 seconds (the sort is only doneone time).

Two caveats must be mentioned. First, our k-buffer implementa-tion requires Windows and OpenGL. The fill-tube data set was notavailable on a Windows machine, so a stand-in data set was used ofthe same size to compute timings. We validated the results by run-ning a software emulator of the k-buffer algorithm on the fill tubedata set. Second, the small image size means that a reduced reso-lution version of the data set may actually be appropriate for thisapplication, but determining the required accuracy is left for futurework.

Figure 9: The diagnostic set-up depicting each stage of simulation.The heated capsule shell emits X-rays, which travel through a windowon the hohlraum made of a material with lower atomic number thanthe hohlraum itself. The radiation then passes through an arrayof pinholes, which are imaged in quick succession to create a setof images in time. The GPU-based tet projection code is used togenerate the initial set of radiation leaving the hohlraum.

8 CONCLUSION AND FUTURE WORK

We have demonstrated modified volume rendering algorithms thatcan be used to produce accurate simulated radiographs. We de-scribed implementations of cell projection algorithms with andwith-out sorting for the attenuation-only and attenuation-emissioncases. The algorithms described require recent graphics cards anddrivers, due to the requirement of floating point accuracy and theuse of advanced texture capabilities.

Our goal of interactivity for users at their workstations will re-quire some modification to the algorithms. Accuracy must be care-fully traded off against performance. A combination of smoothingand coarsening of meshes, a reduction in the number of photon en-ergies, and more precomputation will be required. We expect tobe able to produce a visualization tool that enables interactive sub-setting, orientation, and modification of material properties so thatscientists who combine simulation with experiment can better un-derstand their data.

We also plan to investigate parallel implementations on graphicsclusters to obtain interactive performance without sacrificing accu-racy. Several strategies are possible including distributing photonenergy groups across several processors in cases where the data canbe duplicated and there are many photon energies. Domain de-composition is a trivial exercise for for absorption-only radiographsthanks to order independence. However, domain decomposition foralgorithms modeling emissive materials will require modificationdue to possible ordering complications at domain boundaries.

Finally, it may be beneficial to study GPU-based algorithms forvolume rendering 2-D simulation data. Treating such data directlyby introducing the symmetries of the problem into the algorithmsmay result in dramatic performance improvements.

UCRL-CONF-211396. This work was performed under the aus-pices of the U.S. Department of Energy by University of California,Lawrence Livermore National Laboratory under Contract W-7405-Eng-48.

REFERENCES

[1] Steven P. Callahan, Milan Ikits, Joao L.D. Comba, and Claudio T.Silva. Hardware-assisted visibility ordering for unstructured volume

Figure 10: A simulated radiograph of the fill-tube experiment creat-ing using the ray trace package used by the physicists. The originalsimulated data is two dimensional, so all features that are not onthe axis are rotated about the axial coordinate creating the banding.The jet created by the fill tube is visible on the Z-axis (vertical).

rendering. IEEE Transactions on Visualization and Computer Graph-ics, 11(3):285–295, 2005.

[2] Dermott E. Cullen, John H. Hubble, and Lynn Kissel. ”epdl97: theevaluated photon data library, ’97 version”. Technical report, UCRL-ID-50400, Lawrence Livermore National Laboratory, 1997.

[3] P. Duvauchelle, N. Freud, V. Kaftandjian, G. Peix, and D. Babot. Acomputer code to simulate x-ray imaging techniques. Nuclear Instru-ments and Methods in Physics Research B, 170:245–258, 2000.

[4] J. H. Hubble and S. M. Seltzer. ”tables of x-ray mass attenuation coef-ficients and mass-absorption coefficients”. Technical report, [Online],National Institute of Standards and Technology], Gaithersburg, MD,2004. Originally published as NISTIR 5632.

[5] A. Koenig, A. Gliere, R. Sauze, and P. Rizo. Radiograph simulationto enhance defect detection and characterization. In ”Proc. of the 7th

European Conference on Non-Destructive Testing”, volume 1, pages444–451, 1998.

[6] Nelson Max. Hexahedron projection for curvilinear grids. poster atvolume graphics 2005, LLNL Tech. Report 318089, 2005.

[7] Nelson max, Peter Williams, and Claudio Silva. Approximate volumerendering for curvilinear and unstructured grids by hardware-assistedpolyhedron projection. International Journal of Imaging Systems andTechnology, 11:53–61, 2000.

[8] Stefan Roettger and Thomas Ertl. Cell projection of convex polyhedra.In Proceedings of the 2003 Eurographics/IEEE TVCG Workshop onVolume graphics, pages 103–107, 2003.

[9] Peter Shirley and Allan Tuchman. A polygonal approximation to di-rect scalar volume rendering. Proc. San Diego Workshop on VolumeVisualization, 24(5):63–70, November 1990.

[10] Manfred Weiler, Martin Kraus, Markus Merz, and Thomas Ertl.Hardware-based ray casting for tetrahedral meshes. In Proc. IEEEVisualization, pages 333–340, October 2003.

[11] Manfred Weiler, Paula N. Mallon, Martin Kraus, and Thomas Ertl.Texture-Encoded Tetrahedral Strips. In Proceedings Symposium onVolume Visualization 2004, pages 71–78. IEEE, 2004.

hardware accelerated simulated radiography

Documents