neural light field 3d printing - quan-zheng.github.io · modern 3d printers are capable of printing...
TRANSCRIPT
Neural Light Field 3D Printing
QUAN ZHENG,Max Planck Institute for Informatics, Germany
VAHID BABAEI,Max Planck Institute for Informatics, Germany
GORDONWETZSTEIN, Stanford University, USA
HANS-PETER SEIDEL,Max Planck Institute for Informatics, Germany
MATTHIAS ZWICKER, University of Maryland, College Park, USA
GURPRIT SINGH,Max Planck Institute for Informatics, Germany
Backlight
Volume
Viewer
(a) (b) (c) (d)
Fig. 1. Our approach encodes input light field imagery (b) as a continuous representation of the volumetric attenuator using neural networks (a). Top row
shows the fabricated prototypes using our approach that optimizes both planar (c) and non-planar displays (d). Bottom row compares the central view (red
inset) of the input light field (b) with front-view photographs of printed prototypes (c, d).
Modern 3D printers are capable of printing large-size light-field displays
at high-resolutions. However, optimizing such displays in full 3D volume
for a given light-field imagery is still a challenging task. Existing light field
displays optimize over relatively small resolutions using a few co-planar
layers in a 2.5D fashion to keep the problem tractable. In this paper, we
propose a novel end-to-end optimization approach that encodes input light
field imagery as a continuous-space implicit representation in a neural net-
work. This allows fabricating high-resolution, attenuation-based volumetric
displays that exhibit the target light fields. In addition, we incorporate the
physical constraints of the material to the optimization such that the result
can be printed in practice. Our simulation experiments demonstrate that
Authors’ addresses: Quan Zheng, Max Planck Institute for Informatics, Saarbrücken,Germany, [email protected]; Vahid Babaei, Max Planck Institute for Informat-ics, Saarbrücken, Germany; Gordon Wetzstein, Stanford University, USA; Hans-PeterSeidel, Max Planck Institute for Informatics, Saarbrücken, Germany; Matthias Zwicker,University of Maryland, College Park, USA; Gurprit Singh, Max Planck Institute forInformatics, Saarbrücken, Germany.
Permission to make digital or hard copies of all or part of this work for personal orclassroom use is granted without fee provided that copies are not made or distributedfor profit or commercial advantage and that copies bear this notice and the full citationon the first page. Copyrights for components of this work owned by others than theauthor(s) must be honored. Abstracting with credit is permitted. To copy otherwise, orrepublish, to post on servers or to redistribute to lists, requires prior specific permissionand/or a fee. Request permissions from [email protected].
© 2020 Copyright held by the owner/author(s). Publication rights licensed to ACM.0730-0301/2020/12-ART207 $15.00https://doi.org/10.1145/3414685.3417879
our approach brings significant visual quality improvement compared to the
multilayer and uniform grid-based approaches. We validate our simulations
with fabricated prototypes and demonstrate that our pipeline is flexible
enough to allow fabrications of both planar and non-planar displays.
CCS Concepts: · Applied computing → Computer-aided manufactur-
ing; · Computing methodologies→ Neural networks; Volumetric models.
Additional Key Words and Phrases: Volumetric display, light field, neural
networks, 3D printing, computational fabrication
ACM Reference Format:
Quan Zheng, Vahid Babaei, Gordon Wetzstein, Hans-Peter Seidel, Matthias
Zwicker, and Gurprit Singh. 2020. Neural Light Field 3D Printing . ACM
Trans. Graph. 39, 6, Article 207 (December 2020), 12 pages. https://doi.org/
10.1145/3414685.3417879
1 INTRODUCTION
Light fields are an important visual medium capable of capturing
the visual information of a scene in a compact manner. The gen-
eration of light fields has been made possible by either synthetic
rendering approaches [Gortler et al. 1996; Levoy and Hanrahan
1996] or directly capturing with light field cameras [Marwah et al.
2013; Ng et al. 2005]. Up to now, light field data are still commonly
appreciated via arrays of 2D images or bundles of video frames.
ACM Trans. Graph., Vol. 39, No. 6, Article 207. Publication date: December 2020.
207:2 • Zheng et al.
Modern 3D printers havemade it possible to appreciate the experi-
ence of light field on fabricated displays at high-resolution. However,
optimizing such light-field displays in a full-volumetric setting at
high-resolution is still a challenging task. Previous methods [Wet-
zstein et al. 2011, 2012] have addressed this problem in a 2.5D setting
by discretizing the 3D optimization-space as multiple 2D discrete
layers. These approaches may provide an optimal solution using
linear solvers, but they are highly restrictive in display size and reso-
lution. To address this issue, we propose an end-to-end optimization
using a neural network which encodes the target light field as an
implicit continuous volume.
For a given light field, we optimize for the volumetric attenua-
tion of light by learning to predict the absorption coefficients in a
continuous 3D space. This training is facilitated by a volumetric
ray-casting process which is fully differentiable. Batches of rays are
traced through the implicit volume and their radiance is attenuated
by the medium. Before the printing, a voxel-query can be performed
from this learnt implicit representation of absorption coefficients
according to the target resolution. The resulting 3D grid is then
halftoned [Ostromoukhov 2001] and sent to the 3D printer.
We validate our approach by demonstrating the improvements
over existing mutlilayer approaches with respect to visual qual-
ity, numerical errors, and scalability. We also develop a grid-based
approach, that optimizes on a voxel grid with a fixed resolution,
as the baseline for comparisons. Unlike our neural network-based
optimization, the grid-based approach quickly exceeds the available
memory with increasing display size and resolution. We demon-
strate the practicality of our approach by fabricating a prototype (Fig-
ure 1) using an inkjet 3D printer. We go beyond established current
trends (planar displays) and, for the first time to the best of our
knowledge, showcase non-planar light-field displays. To summa-
rize, our neural light field printing approach makes the following
contributions:
(1) We propose an end-to-end 3D fabrication pipeline that optimizes
implicit volumetric representation using a neural network.
(2) The resultant neural representation is memory efficient and can
be deployed to fabricate displays at different scale and resolu-
tions.
(3) Our design benefits from an existing neural network architec-
ture [Mildenhall et al. 2020] and naturally extends it for fabricat-
ing both planar and non-planar volumetric light-field displays.
2 RELATED WORK
Multilayer Displays. As uncontrolled scattering of light within a
physical volume will undermine 3D effects, multilayer autostereo-
scopic displays are generally designed based on attenuation within
volumes.Wetzstein et al. [2011] demonstrated a successful prototype
of a light field display by stacking multiple printed 2D attenuation
layers. They computed multiple discrete attenuation layers with
well-established tomographic techniques. Recently, tomography has
also been extended to design large scale volumetric projectors [Jo
et al. 2019] for use cases like theaters and classrooms.
With the encouraging progress of LCD technologies, mutlilayer
dynamic LCD displays have been intensively studied. Following
the concept of parallax barriers, Lanman et al. [2010] designed
a dual-layer LCD display by stacking a pair of LCD panels and
adapting each panel to multi-view content. Instead of using a layered
attenuation design, the polarization field display [Lanman et al.
2011] constructed multiple polarized rotating liquid crystal layers to
emit dynamic light fields. Wetzstein et al. [2012] further proposed
tensor displays and generalized the design problem of multilayer
LCD displays. These displays are designed according to predefined
resolutions and they are limited to planar surfaces. In contrast, our
display supports resolution-free designs and we can fabricate light
fields on non-planar surface.
The multilayered fabrication fashion has also been adopted in
other different fields. Holroyd et al. [2011] fabricated mutli-layer
acrylic model to reproduce the appearance of target 3D models by
warping them onto multiple layers. Along the same line, layered
attenutator was also exploited to generate shadow projections of
images under prescribed incident light [Baran et al. 2012].
Volumetric Displays. Blundell et al. [1994] proposed an early vol-
umetric display called a cathode ray sphere. They also defined volu-
metric displays as devices that emit light fields based on emission,
absorption, or scattering of light within a physical volume [Blun-
dell et al. 1993]. Favalora [2005] summarised several common ap-
proaches to achieve volumetric displays. Swept volume displays use
rotating LCD panels [Maeda et al. 2003] or parallax barriers [Yendo
et al. 2005] to deflect light rays. Based on this principle, Cossairt
et al. [2007] designed a horizontal parallax-only swept volume dis-
play that can handle view-dependent occlusions. Jones et al. [2007]
further introduced a high-speed video projector and a spinning
holographic diffuser to achieve both horizontal and vertical parallax.
Projection media other than LCD or LED were also explored. Nayar
and Anand [2007] devised a display by making use of a cloud of
passive optical scatters in a glass cube, and water drops [Barnum
et al. 2010] were also used as voxels for constructing a multilayer
display.
For attenuation-based volumetric displays, an analysis [Gotoda
2010; Wetzstein et al. 2011] has been developed to convert contin-
uous volumes to separated masks, where masks are optimized by
solving a linear equation system. Using a similar volumetric ap-
proach, Mitra and Pauly [2009] proposed to use a shadow volume
to generate shadow art. Loukianitsa and Putilin [2002] presented an
early design of a 3D display by stacking two-layer LCD panels. They
used a neural network in their design to compute pixel values to be
shown on each 2D LCD panel. In contrast, we directly operate on
continuous 3D volumes instead of a few separated layers. Our neu-
ral network learns the continuous 3D volume representation from
input light fields. This allows us to fabricate the high-resolution
light field using 3D printers.
Neural Networks & Light Fields. Deep neural networks [Bengio
et al. 2012; Krizhevsky et al. 2012; LeCun et al. 2015] have shown
remarkable achievements in encoding highly complex functions in
a non-linear fashion. Recent works [Lombardi et al. 2019; Milden-
hall et al. 2019; Sitzmann et al. 2019] used neural networks to learn
implicit representations from imagery without requiring any ex-
plicit geometry. Simple MLP-based architectures have also shown
remarkable improvements in both view synthesis [Mildenhall et al.
2020] and in encoding complex light interactions [Kallweit et al.
ACM Trans. Graph., Vol. 39, No. 6, Article 207. Publication date: December 2020.
Neural Light Field 3D Printing • 207:3
2017]. In this paper, we employ a neural network to encode a full
3D representation of light field imagery in a scene-dependent setup.
Our model builds upon the recent MLP-based architecture [Milden-
hall et al. 2020] that extends a vanilla MLP with positional encoding
to encode the underlying light field imagery for view synthesis.
Appearance 3D Printing. Recent high-resolutionmulti-color inkjet
3D printers [Stratasys 2020] have enabled creating photorealistic 3D
prints. Brunton et al. [2015] proposed an algorithm for managing the
color reproduction workflow. Babaei et al. [2017] introduced a color
reproduction algorithm, called contoning, for 3D printing without
the need for halftoning. Shi et al. [2018] extended the contoning
framework to reproduce the spectra of oil paintings using an inkjet
3D printer. In response to the significant volume scattering of some
3D printing inks, Elek et al. [2017] proposed a color reproduction
technique that preserves the texture by simulating the subsurface
cross-talk between neighboring voxels. Sumin et al. [2019] extended
this scattering-aware approach to arbitrary 3D shapes with poten-
tially thin geometric features. In the context of light fields, Tompkin
et al. [2013] and Saberpour et al. [2020] used an inkjet 3D printer to
build all-in-one lenticular displays on flat and curved surfaces, re-
spectively. Similar to their device, we use a two-material (black and
clear) inkjet 3D printer to fabricate attenuation based volumetric
displays.
3 BACKGROUND
We aim to fabricate static 4D light fields. To understand the process
of light field formation through a volumetric attenuator, we use a
L (u,v,s,t)
t
s u
v
Display Backlight
standard two-plane light field
parameterization [Levoy and
Hanrahan 1996]. As shown
in the side figure, light rays
from a uniform backlight tra-
verse a volumetric attenuator
and form a light field. This
process is commonly referred to as the forward imaging model. Let
the initial radiance carried by a ray(𝑢, 𝑣, 𝑠, 𝑡) from the backlight be
𝐿𝑖 (𝑢, 𝑣, 𝑠, 𝑡). Our notation implies scalar radiance (gray scale), but
the formulation can trivially be applied to any type of spectral color
sampling by replacing scalar radiance with a vector. The initial ray’s
radiance is attenuated after it traverses the volume by the mate-
rial in the blue box. Given the light field is temporally static, the
attenuated radiance can be written as
𝐿𝑜 (𝑢, 𝑣, 𝑠, 𝑡) = 𝐿𝑖 (𝑢, 𝑣, 𝑠, 𝑡) ·𝑇 (𝑢, 𝑣, 𝑠, 𝑡), (1)
where 𝑇 (𝑢, 𝑣, 𝑠, 𝑡) gives the transmittance of the medium. It aggre-
gates the comprehensive interactions between light rays and the
volume, such as emission, scattering and absorption. We assume
our printing materials have no self illumination (e.g., fluorescence)
and their scattering is negligible. We further assume the volumet-
ric attenuator has a continuously varying medium density. Hence,
we approximate the modulation in the light according to the Beer-
Lambert law,
𝑇 (𝑢, 𝑣, 𝑠, 𝑡) = 𝑒−∫
𝑧
0𝜇 (𝑧)𝑑𝑧 , (2)
where 𝜇 (·) is the absorption coefficient and 𝑧 represents the distance
travelled along the ray. With Equation 1, we can compute the 4D
light field emitted by the volumetric attenuator.
4 NEURAL VOLUMETRIC DISPLAY
For the forward imaging model in Section 3, the goal is to compute
the light field emitted by the volumetric attenuator. By contrast, we
only have the target light field and we aim to find the volumetric
attenuator for printing such that the final result provides us with a
light field that is close to the target. This requires an inverse imaging
model. In this section, we firstly describe the formulation of the
inverse problem and then describe our approach to learn a neural
representation of the desired volumetric attenuator.
4.1 Formulation
We consider an outgoing ray 𝑟 = (𝑢, 𝑣, 𝑠, 𝑡) from the volumetric
display and denote its radiance as 𝐿𝑜 (𝑟 ). Based on Equation 1, we
can relate it to the incoming radiance 𝐿𝑖 (𝑟 ) from the backlight
as 𝐿𝑜 (𝑟 ) = 𝐿𝑖 (𝑟 ) · 𝑇 (𝑟 ). By plugging Equation 2 and taking the
logarithm, we obtain the accumulated material absorbance along 𝑟
as
𝛼 (𝑟 ) = − ln𝐿𝑜 (𝑟 )
𝐿𝑖 (𝑟 )=
∫ 𝑝𝑜 (𝑟 )
𝑝𝑖 (𝑟 )𝜇 (𝑧)𝑑𝑧. (3)
Here 𝑧 denotes a location along the ray 𝑟 , and 𝑝𝑖 (𝑟 ) and 𝑝𝑜 (𝑟 ) are the
points where 𝑟 enters and exits the volume, respectively. Denoting
the target light field as 𝐿𝑔𝑡 (𝑟 ), Equation 3 immediately determines
the ground truth absorbance 𝛼𝑔𝑡 (𝑟 ) = − ln(𝐿𝑔𝑡 (𝑟 )/𝐿𝑖 (𝑟 )).
To determine the unknown volumetric absorption function that
will reproduce 𝐿𝑔𝑡 (𝑟 ), let us assume that it is parameterized by a
set of variables Θ = {𝜃𝑖 }. We write the absorption coefficient 𝜇 in
Equation 3 as 𝜇Θ. In addition, we write the accumulated absorbance
corresponding to 𝜇Θ as 𝛼Θ(r) along 𝑟 . Now we can express the
problem of finding 𝜇Θ as the following optimization problem over
the unknown parameters Θ,
argminΘ
∑
𝑟 ∈𝐹
D(
𝛼𝑔𝑡 (𝑟 ), 𝛼Θ (𝑟 ))
, (4)
where 𝑟 is a ray sampled from the target light field 𝐹 , and D is a
loss function.
4.2 Grid-based Volumetric Representation
Our inkjet 3D printers are raster devices, i.e., they print a multi-
material grid of physical voxels. This motivates us to discretize the
volume of the desired volumetric attenuator as a grid of virtual
voxels. In this case the optimization variables Θ are the voxel values,
and we can directly determine them by solving Equation 4.
We construct a virtual voxel grid which has the same resolution
as the physical version for 3D printing. Each voxel has a size equal
to the physical voxel and is associated with an absorption coefficient,
which corresponds to one optimization variable 𝜃𝑖 ∈ Θ. Formally,
the continuous function 𝜇Θ is given by the continuous interpolation
of the voxel values. In our implementation we use nearest neighbor
interpolation, which could easily be replaced with other, higher
order interpolation schemes.
ACM Trans. Graph., Vol. 39, No. 6, Article 207. Publication date: December 2020.
207:4 • Zheng et al.
Encoding
ReLU
FC
FC
FC
FC
FC
FC
FC
p
ωμ
FC
ReLU
ReLU
ReLU
ReLU
ReLU
ReLU
Sigm
oid
Concat
Fig. 2. Architecture of our neural network. FC stands for fully connected
layers and ReLU means the rectified-linear-unit activation.
To print 3D models in high precision, a prohibitive number of
small voxels are needed. Since the grid-based approach has an ab-
sorption coefficient for each voxel, it will have to optimize a high
number of variables and, therefore, requires a large amount of data
to optimize with. This motivates us to seek another representa-
tion that is not limited by the final printing precision during the
optimization stage.
4.3 Neural Network-based Volumetric Representation
Neural networks are known to be general function approximators
for many problems. In theory, neural networks with adequate capac-
ity can approximate any non-linear function. Therefore, we consider
representing the function 𝜇Θ as a fully-connected neural network,
also known as a multilayer perceptron (MLP). Here, the optimization
variables Θ represent the trainable parameters of the neural net-
work. Since 𝜇 in Equation 3 is a function of a position 𝑝 = (𝑥,𝑦, 𝑧)
defined in Euclidean space, we consider using a position vector as
the input to our neural network. The output of our neural network
is the absorption coefficient 𝜇.
Our MLP architecture is shown in Figure 2. Each hidden layer
has 512 neurons. Each neuron accepts the outputs of the preceding
layer, conducts a multiply-add computation, and applies a nonlinear
ReLU activation before sending an output to next layer. Initially, we
input coordinates of position 𝑝 directly to the neural network after
scaling them to [−1, 1]. The MLP merely using original coordinates
as input, however, has difficulty in learning lighting and geometric
details from the target light field. In a recent work, Mildenhall et al.
[2020] show that positional encoding [Rahaman et al. 2018] helps
learn high frequency details for radiance fields. It transforms each
coordinate of 𝑝 to a higher dimensional space using sinusoidal func-
tions: 𝑝 → (𝑝, sin 20𝑝, cos 20𝑝, . . . , sin 2𝑀−1𝑝, cos 2𝑀−1𝑝) . A related
theoretical analysis of positional encoding, as one of the Fourier fea-
ture mappings, can be found in Tancik et al. [2020]. We leverage
positional encoding to get an encoded vector of the position. In addi-
tion, we append the original three-dimensional direction 𝜔 to the
encoded vector.
To better transport the input data to the later layers of the net-
work, we use skip connections [He et al. 2016] which concatenate
the encoded input to the following two layers. We experimentally
find this makes our deep neural network training more efficient and
slightly improves the quality.
4.4 Attenuation-based Volumetric Ray Casting
To solve the minimization problem formulated in Equation 4, we
extract ray samples from a target light field 𝐹 and cast rays into
the volume which is now represented as a neural network. Our ray
casting process is different from the standard volume rendering
technique [Drebin et al. 1988], which is adopted in recent research
work [Lombardi et al. 2019; Mildenhall et al. 2020]. Traditional
volume rendering computes an additive blending of volumes of
color and opacity, whereas our rendering process simulates the
physical process of light field formation by attenuating light in the
volumetric display.
As defined in Equation 3, we compute 𝛼Θ(𝑟 ) for each ray. Note
that the absorbance depends on the parameters Θ of our absorption
function. We use the parametric form to express a ray as 𝑝 = 𝑟𝑜 +
𝑧𝑟𝜔 , where 𝑟𝑜 denotes the origin, 𝑟𝜔 is the direction and 𝑧 is the
parametric distance along ray 𝑟 . Hence, 𝛼Θ can be rewritten as
𝛼Θ = −
∫ 𝑧𝑜
𝑧𝑖
𝜇Θ (𝑟𝑜 + 𝑧𝑟𝜔 )d𝑧. (5)
Here, 𝑧𝑖 and 𝑧𝑜 are parametric distances from the origin of 𝑟 to the
intersection points on the near surface and far surface of the volume,
respectively.
Note that our optimization over the volume of a display can ac-
commodate any display geometry. This benefit allows us to optimize
and print light fields onto displays with non-planar geometries. For
planar surfaces, we can compute parametric distances, 𝑧𝑖 and 𝑧𝑜 , by
ray-plane intersections. For non-planar surfaces, we find them by
solving two ray-surface intersections. Note that we ignore reflec-
tions of rays on surfaces as reflected rays do not enter the volume
and we simulate refraction of rays when entering and exiting a
volume with the index of refraction being 1.5.
Our neural network learns a density field of absorption coeffi-
cients. Given a trained neural network, 𝜇Θ of every location in the
volume is readily available. To solve the integral of 𝜇Θ, we consider
ray segments that are located in the volume. By using the parametric
form of a ray, the segment is defined by [𝑧𝑖 , 𝑧𝑜 ]. As configured in
the parallel beam tomography approach [Wetzstein et al. 2011], we
conduct parallel volumetric ray casting into the volume. For each
ray, we firstly find the range that is within the volume and segment
it into 𝑆 bins with equal length. We then use stratified sampling to
find intermediate 𝑆 samples from these bins. With these samples,
we can compute the integral with a numerical integration approach,
the rectangle rule. For each interval delimited by these samples, we
use the right sample of each interval and compute 𝛼Θ as
𝛼Θ = −
𝑆+1∑
𝑗=0
𝜇Θ (𝑟𝑜 + 𝑧 𝑗𝑟𝜔 )
𝑧 𝑗+1 − 𝑧 𝑗
2 , (6)
where 𝑧 𝑗 denotes sample locations generated using stratified sam-
pling in the interval [𝑧𝑖 , 𝑧𝑜 ] and ∥·∥2 calculates the L2 distance
along a ray between the two points defined by adjacent two samples
of parametric distance. This is the place where the actual thickness
of our final volumetric display comes in. Larger thickness allows
longer average traversing distance for rays entering the volume.
4.5 End-to-end Optimization
Our volumetric ray casting computes 𝛼Θ (Equation 6) as a linear
combination of weighted samples of 𝜇Θ, thus the derivatives of 𝜇Θ
ACM Trans. Graph., Vol. 39, No. 6, Article 207. Publication date: December 2020.
Neural Light Field 3D Printing • 207:5
with respect to Θ can be easily computed. We implement the volu-
metric ray casting process along with the neural network in a deep
learning library that supports automatic differentiation. This enables
an end-to-end training of our neural network using a stochastic
gradient descent (SGD) based optimizer.
During optimization, we sample rays from the target light field
to solve the minimization problem (Equation 4). For the difference
functionD, we use anL2 loss alongwith a regularization in training,
D(
𝛼𝑔𝑡 (𝑟 ), 𝛼Θ (𝑟 ))
=
1
𝑁
∑
𝑟 ∈𝐹
(
𝛼𝑔𝑡 (𝑟 ) − 𝛼Θ(𝑟 )
)2+ 𝜆 ∥Θ∥22 , (7)
where 𝑁 is the number of rays, 𝑟 is a ray sampled from the target
light field 𝐹 , 𝛼Θ (𝑟 ) is the predicted absorbance computed via volume
ray casting, and 𝛼𝑔𝑡 (𝑟 ) is the ground truth absorbance. Hyperpa-
rameter 𝜆 is assigned with 0.0001. We also tried L1 loss function for
training, but L2 performs better in terms of overall visual quality
across different test scenes.
5 FABRICATION
Our neural network learns continuous, spatially-varying volumet-
ric absorptions that generate desired light fields. For fabricating
the light field, we first sample the learned density of absorption
coefficients according to the target print resolution. In practice, we
regularly sample nine directions for the inference stage and find
a mean of the output density. We then halftone the continuous
density values slice-by-slice using an error diffusion algorithm [Os-
tromoukhov 2001]. For printing, we utilize a two-material (black
and clear) inkjet 3D printer (Objet260 Connex) to print the display.
The highest precision of the printer is 84 µm × 84 µm laterally, with
30 µm slice thickness.
Since our printer supports printing with a combination of a black
and a clear material, we print our prototypes in gray.With the recent
introduction of colored, transparent materials to inkjet 3D printing,
e.g., Vero Vivid from Stratasys, our approach could be extended to
color printing in a straightforward manner.
Unfortunately, the printer software only supports meshes with
a limited number of triangles (about 10 millions). This limits the
number of voxels we can use for printing and therefore the size
and/or resolution of the display. We show printed prototypes with
resolutions 512 × 384 × 24 (Figure 1). Simulated results with high
resolution grids spanning the full thickness range of a display can
be found in the supplemental.
5.1 Material Calibration
In our two-material setup, we use a clear material, VeroClear, to fabri-
cate transparent parts and a black material, TangoBlack, for the dark
LCD
Ramp
Camera
parts. We assume the scatter-
ing is insignificant and the
clear material has an absorp-
tion coefficient of zero. For
measuring the absorption co-
efficient of the black material,
we take a dictionary-based
approach [Gkioulekas et al.
2013] where we compare a
printed model to many of its
rendered counterparts with varying absorption coefficients. We
search for the absorption coefficient that leads to the best match to
the print.
We start by printing a ramp-shaped model in TangoBlack. This
model is composed of 15 rectangular columns with varying heights.
The illumination, provided by a LCD, comes from behind the ramp.
Discarding the color information, we take a single channel HDR
photo of the illuminated print. By dividing the photo of the print
by the background illumination image, we obtain a map of trans-
mittance. After averaging, we further obtain a single transmittance
value for each rectangle leading to a transmission profile for the
print. For creating the rendered dictionary, we handcraft a virtual
scene with the same setup as in reality, and calculate a transmission
profile per scene. We regularly sample the absorption coefficient
𝜇 ∈ [0, 10] (mm−1) with a step size 0.05 and render an image for the
scene per step. Finally, we find the nearest neighbor to the trans-
mission profile of our print from within the rendered dictionary
using an acceleration data structure. With a similar setup, Elek et al.
[2017] measures scattering parameters of material samples, whereas
our acquisition focuses on the absorption coefficients.
Photo Simulation
Fig. 3. Captured gray-scale photo of the printed ramp model (left) and
simulated result (right) of the corresponding virtual scene.
6 EXPERIMENTS
6.1 Training Settings
We implement our neural network based on the Tensorflow li-
brary [Abadi et al. 2015] and run it on a Nvidia RTX Titan GPU.
Specifically, we use the Adam [Kingma and Ba 2014] optimizer with
a learning rate 1 × 10−4 and keep other parameters as their de-
fault values. The learning rate is exponentially decayed every 1000
epochs with a rate 0.8 and then is kept constant after it drops below
1 × 10−5. Before training, we initialize kernel weights with Glorot
uniform initialization [2010] and all kernel biases with zeros. When
training the neural network, the mini-batch size is set to 1200 and a
batch of data are randomly shuffled before feeding into the neural
network. For each ray, we sample 64 points using stratified sampling
as described in Section 4.4. The encoding length𝑀 for position is
set to 10. Note that we sample batch data from all training views
instead of sequentially sampling data from individual views.
6.2 Evaluation Setup
Dataset. In our experiments, we adopt synthetic light field data
from off-the-shelf renderers. Synthetic data provides readily avail-
able camera poses which are required by our learning algorithm. We
use light fields of five scenes from publicly available datasets [Wet-
zstein et al. 2011, 2012]. Each light field has 7 × 7 views and every
ACM Trans. Graph., Vol. 39, No. 6, Article 207. Publication date: December 2020.
207:6 • Zheng et al.
Ours Grid, Linear Grid, SGD Layered3D Ours Reference
Fig. 4. Quality comparison between our, Layered3D and grid-based approach on the Buddha, Green Dragon, and Dice scene. Note that, for Ours, we demonstrate
a test view that is not part of the training dataset. For the grid-based approach, we show results obtained by the linear and the SGD solver, respectively. More
details are in Section 6.3. See also the supplemental for more results with GIF animations and rendered high-resolution views.
view has a resolution of 512 × 384. Also, we render new light fields
of the Red Dragon scene and the Book scene using the PBRT [Pharr
et al. 2016] renderer. Our light field dataset covers a wide range of
variations. The Butterfly and Buddha scenes are abundant in geom-
etry details, while the Green Dragon, Dice and Car scenes feature
depth-of-field and glossy or specular highlights. The Red Dragon
and Book scenes are configured for analysis of form-factors and
depth-of-field.
For each light field, we hold out one view for test and use the
other views in training. After optimization, we test each method on
the same test view to ensure fair comparisons. Note that test views
are not used in the optimization processes for all methods.
Evaluated methods. For the rendering simulation, we mainly com-
pare our approach with Layered3D [Wetzstein et al. 2011]. We use
its original Matlab implemention and set the number of layers to five
according to the specifications in the original work. We also imple-
ment the grid-based approach (Section 4.2) and use it as the baseline.
The grid-based approach firstly discretizes the volume into a grid
of voxels, where the resolution of the grid is pre-defined based on
the target resolution for printing and each voxel is associated with
a three-channel absorption coefficient. We can directly optimize
absorption coefficients on the voxel grid with the Adam optimizer
using default parameter settings and a batch size 500𝐾 . In addition,
we leverage a constrained large-scale trust-region reflective linear
solver [Coleman and Li 1996], which was adopted in Layered3D, to
ACM Trans. Graph., Vol. 39, No. 6, Article 207. Publication date: December 2020.
Neural Light Field 3D Printing • 207:7
Table 1. Numerical performance of all evaluated methods on the Buddha, Green Dragon and Dice scenes. This evaluation is conducted on the rendered test
views. The image quality is measured with PSNR and SSIM (higher is better), and MAPE (lower is better).
ScenePSNR SSIM MAPE
Grid, Linear Grid, SGD Layered3D Ours Grid, Linear Grid, SGD Layered3D Ours Grid, Linear Grid, SGD Layered3D Ours
Buddha 22.7003 24.9623 25.0305 36.1927 0.8597 0.8648 0.8779 0.9542 0.0939 0.0721 0.0773 0.0252
Dragon 24.5019 24.4379 25.3789 34.3736 0.9332 0.9077 0.9361 0.9578 0.1127 0.1182 0.1093 0.0537
Dice 27.6937 30.1179 30.2448 36.6209 0.9419 0.9557 0.9535 0.9727 0.0569 0.0404 0.0464 0.0221
Prototype Slice 1 Slice 9 Slice 17 Slice 21
Fig. 5. Front view of the printed planar prototype of the Butterfly scene. The prototype is printed with 24 slices. We show a visualization of absorption
coefficients of slice 1, 9, 17, and 21.
optimize for the grid-based approach. Our method, however, per-
forms continuous-space optimization and does not presume any
fixed-resolution voxel-grid structure.
Evaluationmetrics. We evaluate the quantitative quality of simula-
tion with three metrics: Peak-Signal-Noise-Ratio (PSNR), Structural
Similarity Index (SSIM) [2004] and Mean Absolute Percentage Er-
ror (MAPE). MAPE is computed over all 𝐾 pixels as 1/𝐾∑𝐾𝑖=1 |𝑝𝑖 −
𝑔𝑖 |/(𝑔𝑖 + 𝜖), where 𝑝𝑖 and 𝑔𝑖 are the pixel values of the re-rendered
image and the reference image respectively, 𝜖 is set to 0.02.
6.3 Results
Simulation. In Figure 4, we compare our approach with Lay-
ered3D and the baseline grid-based methods on the Buddha, Green
Dragon and Dice light fields. All methods share the same display
thickness 12.5 mm. For the grid-based method, we present its re-
sults from both the linear solver and the SGD solver as described
in Section 6.2. Given the specified display thickness, the grid-based
approach can hardly extend to high-resolution voxel grids due to
high storage cost. In practice, we find that the highest resolution
of the grid that can be trained on our GPU using the SGD solver is
512× 384× 60. However, a grid resolution higher than 512× 384× 46
leads to an under-determined optimization which is not supported
by the linear solver. We, therefore, set the resolution of the grid for
the baseline to 512 × 384 × 45.
In the test stage, we run all methods to render the same test view
(the 41-st view) of each scene to allow a fair comparison. As shown in
Figure 4, both solutions to the grid-based approach show blurriness
near boundaries (see insets of the Buddha scene and the Dice scene)
and exhibit objectionable color artifacts (see insets of the Dragon
scene), whereas Layered3D is plagued with the ghosting artifacts
at the rims of objects. Our approach generally provides rendered
views with higher visual quality than the compared methods. We
tabulate the numerical evaluation results in Table 1. For all scenes,
our method achieves lower errors than other methods, which is
consistent with the visual quality advantages of our approach.
Prototyping. In Figure 1, we demonstrate our approach with 3D
printed prototypes of the light field of the Butterfly. As the printing
workflow of Section 5, we sample absorption coefficients according
to the target resolution and halftone them for 3D printing. Figure 5
shows a visualization of optimized absorption coefficients from the
planar prototype. Going beyond the planar prototype, we fabricate
a prototype with a non-planar serpentine shape (Figure 1(d)).
After encoding a light field with a neural network, we can print
prototype with an arbitrary precision supported by the printer. How-
ever, the software of our Objet260 3D printer has a limit on the size
of the input meshes. This constraint limits us to printing small-scale
models. For the prototype of the Butterfly scene, we use the highest
printing precision 84 µm for the height and width dimension. For
the depth dimension, we discretize the learned representation into
24 slices and each slice has a depth 0.21 mm. Finally, the prototype
has a size 43.0 × 32.3 × 5.1 mm.
7 DISCUSSION
Comparisons with multilayer approaches. Layered3D optimizes
multiple discrete 2D layers, while our approach optimizes on a con-
tinuous volume, in which it integrates the transmittance along rays.
Figure 6 investigates the capacity of our continuous representa-
tion and the layer-based representation by comparing rendered test
views, where we vary the number of layers of Layered3D from 5
to 45. As indicated by red arrows, Layered3D suffers from ghosting
and color artifacts.
The statistics of unknowns and run-time memory footprints are
listed in Table 2. Layered3D formulates the problem as a constrained
linear system. It uses a fixed number of discrete 2D layers and
pre-computes a ray propagation matrix which it solves efficiently
with a linear solver. In our case, the training time comprehensively
hinges on hyperparameters like training epochs and mini-batch size.
ACM Trans. Graph., Vol. 39, No. 6, Article 207. Publication date: December 2020.
207:8 • Zheng et al.
Layered3D 5 layers Layered3D 20 layers Layered3D 35 layers Layered3D 45 layers Ours Reference
PSNR 29.4445 30.0561 29.0419 28.4675 34.2322SSIM 0.8793 0.9109 0.8911 0.8821 0.9240
Fig. 6. Comparisons between our method and Layered3D using different number of layers. The maximum iterations of Layered3D is 15, that is the same as in
the original paper. Our neural network is trained for 3000 epochs and each epoch has 20 batches. The display thickness is 12.5 mm.
Grid, Linear Grid, SGD Layered3D, Linear Layered3D, SGD Ours Reference
PSNR 22.9072 23.4009 23.3177 23.3769 35.9354SSIM 0.8096 0.8300 0.8194 0.8287 0.9500
Fig. 7. Comparisons of the grid-based method, the layer-based method and our approach on the 9-th test view. The display thickness is 12.5 mm.
Table 2. Comparisons of parameter counts, optimization timings and run-
ning memory. Our neural network has eight hidden layers, each with 512
neurons. CPU denotes the RAM of a workstation and GPU denotes the
VRAM of a graphics card.
Solutions Parameters Timings (h) Memory (GB)
Layered3D-5 (CPU) 2949120 0.171 2.794
Layered3D-20 (CPU) 11796480 0.699 11.245
Layered3D-25 (CPU) 14745600 0.995 14.925
Layered3D-35 (CPU) 20643840 1.109 20.298
Layered3D-45 (CPU) 26542080 1.320 24.712
Ours (GPU) 1942019 1.453 2.659
Furthermore, our approach finds the absorbance via a numerical
integration (Equation 6), where the number of samples on a ray also
affects the training time. Note that the peak usage of main memory
by Layered3D with 45 layers is higher than the VRAM size of our
GPU.
Additionally, we show the optimization time cost and quality
measurement in Figure 10, which tells that increasing the number
of layers for Layered3D does not lead to a consistent quality im-
provement. Instead, more layers lead to higher time cost and inferior
quality, although 45 layers possess 13 times more parameters than
our neural network.
Comparisons using equal number of unknowns. To demonstrate
that our neural network based implicit representation is more com-
pact, we conduct a comparison with the grid-based representation
and the layer-based representation [2011] under the condition of
equal number of unknowns. The compared methods are configured
with four RGB layers and each has a resolution of 512 × 384, which
leads to 2, 359, 296 parameters. Accordingly, we employ a neural
network with eight hidden layers and 565 neurons per layer, which
Table 3. Optimization timings and running memory of the equal-unknowns
comparisons. Our neural network uses eight hidden layers and 565 neurons
per layer.
Solutions Timings (h) Memory (GB)
Grid-based, Linear (CPU) 0.196 2.359
Grid-based, SGD (GPU) 2.440 1.362
Layer-based, Linear (CPU) 0.164 2.541
Layer-based, SGD (GPU) 2.320 1.238
Ours (GPU) 2.162 4.429
amounts to 2, 352, 663 parameters. For the two compared methods,
we solve themwith both the trust-region reflective linear solver (Lin-
ear) and an SGD algorithm as used in our approach. The maximum
iteration count of the linear solver is set to 15.
From images of the grid-based representation and the layer-based
representation in Figure 7, ghosting and blurriness artifacts (indi-
cated with cyan arrows) can be observed on the statue and the
background character. The SGD solver performs slightly better than
the linear solver in terms of numerical measurement but the advan-
tage is relatively small. Our result presents fewer visual artifacts
compared to others. Also, Table 3 lists the optimization timings
and the running memory footprints. By formulating the grid-based
approach and the layered-based approach as linear systems, the
linear solver can efficiently find solutions but it does not lead to
visually pleasing results. On the other hand, the time cost of the
SGD solver depends on epochs, number of batches per epoch, and
mini-batch size. For the SGD solver, we run 3000 epochs with 20
batches per epoch. Our batch size is 1200, whereas it is 500K for the
compared methods.
Transmission maps. In Figure 8, we investigate the optimization
evolution of our approach compared to Layered3D. Layered3D is as-
signed with five discrete layers which the method optimizes directly
whereas our approach learns absorption coefficients in a continuous
ACM Trans. Graph., Vol. 39, No. 6, Article 207. Publication date: December 2020.
Neural Light Field 3D Printing • 207:9
Layer 1 Layer 2 Layer 3 Layer 4 Layer 5 Rendering
Layered3D
I2Layered3D
I4Layered3D
I7Layered3D
I15
Ours
E100
Ours
E500
Ours
E1500
Ours
E3000
Fig. 8. Comparisons of transmission maps from Layered3D (upper four rows) and our approach (lower four rows). The thickness of the display is 12.5 mm. The
rightmost column shows the rendered test views.
space. We, therefore, get the approximate five transmission maps
by integrating learned absorption coefficients. For Layered3D, we
visualize its transmission maps and the corresponding rendering
of the test view at iteration 2, 4, 7, and 15. Accordingly, we present
our integrated transmission maps at training epoch 100, 500, 1500,
and 3000. In the transmission maps of Layered3D, ringing and blur-
riness around the contour of the butterfly can be clearly observed.
This explains the ringing artifacts in the rendered image. On the
contrary, our transmission maps gradually get sharper along the
training and the details in local regions are better preserved. The
ACM Trans. Graph., Vol. 39, No. 6, Article 207. Publication date: December 2020.
207:10 • Zheng et al.
9.6
24.622.1
29.727.1
34.5
27.9
36.1
I2 E100 I4 E500 I7 E1500 I15 E30000
10
20
30
40
Layered3D
Ours
0.54
0.75
0.910.86
0.92 0.94 0.93 0.95
I2 E100 I4 E500 I7 E1500 I15 E30000.0
0.2
0.4
0.6
0.8
1.0
Layered3D
Ours
Fig. 9. PSNR (left) and SSIM (right) of Layered3D and our approach on the
test view. Layered3D is measured at iteration (I) 2, 4, 7, and 15, and our
approach is measured at epoch (E) 100, 500, 1500, and 3000.
numerical quality measurements of rendered images are presented
in Figure 9.
Architecture. Figure 11 compares the performance of different
architectures. Here, we control the complexity of a neural network
by the number of hidden layers 𝐷 and the number of neurons𝑊
per layer. For the D8W512 architecture, we include its variant with
skip connections (Figure 2) in the comparison. For other compared
architectures, skip connections are by default not used.
For each case, we train the neural networks for 3000 epochs, test
them every 10 epochs, and evaluate rendered images in terms of
PSNR and SSIM. As shown, deep and wide architecture generally
provides better numerical performance. In addition, skip connec-
tions bring slight extra improvement. For the D8W512 architecture,
we present its training dynamics in the supplemental to measure
the quality improvement during training. Also, its average time cost
for rendering a view with resolution 512 × 384 using 64 samples on
each ray is 11.2 s.
Field of view. In Figure 12, we investigate the capability of our
neural network to learn light fields with large fields of view (FOV).
Since volumetric displays allow only finite depth of field, we render
the target light field of the Red Dragon scene with defocus blur.
Three light fields of the scene are rendered by setting the FOV angle
to 10, 20, and 40 degrees. To adapt to a higher FOV, we accordingly
provide more views to the neural network. Specifically, we render
7 × 7, 9 × 9, and 11 × 11 views of the three light fields respectively.
For each light field, we reserve its central view for test and use the
other views in training.
We separately train a D8W512 neural network on each light field
for 3000 epochs. As can be observed, the neural network trained
on FOV 10◦ has the closest appearance to the reference. As the
FOV increases, the neural network will gradually compromise the
spatial resolution to learn wider angular changes. Note the head of
the dragon in the case of FOV 40◦ gets blurrier than others. The
trade-off between angular resolution and spatial resolution confirms
the findings reported in the previous work [Tompkin et al. 2013].
Thickness of display. The number of layers and display thickness
are two crucial factors for designing multilayer displays. Since our
approach optimizes on continuous volume instead of discrete lay-
ers, we consider the impact of display thickness. Figure 13 shows
comparisons of our approach using different thickness for the Dice
scene. The edge length of a die is approximately 15 mm. As can be
seen, when encoding the light field within a display of very small
thickness, e.g., 0.8 mm, the rendered test view significantly lose
0 0.5 1 1.5
Time (hours)
20
23
26
29
32
35
PS
NR
Layered3D-5
Layered3D-20
Layered3D-25
Layered3D-35
Layered3D-45
Ours
0 0.5 1 1.5
Time (hours)
0.5
0.6
0.7
0.8
0.9
1.0
SS
IM
Layered3D-5
Layered3D-20
Layered3D-25
Layered3D-35
Layered3D-45
Ours
Fig. 10. Image quality comparisons on the test view of the Car scene. Dots
of Layered3D present its overall optimization time and final quality. For our
method, we also show the quality measurement along the training.
0 500 1000 1500 2000 2500 3000
Epoch
10
20
30
40
PS
NR
D4 W256
D4 W512
D8 W256
D8 W512
D8 W512 w/ skip
0 500 1000 1500 2000 2500 3000
Epoch
0.4
0.6
0.8
1
SS
IM D4 W256
D4 W512
D8 W256
D8 W512
D8 W512 w/ skip
Fig. 11. Evaluations of different neural network architectures by training
on the Butterfly scene. Both PSNR (left) and SSIM (right) are measured on
the test view.
contrast. Note the bright dots on dices are washed out. This issue
can be alleviated by increasing the thickness. With a thickness of
6.5 mm, the rendered result achieves the appearance and contrast
which are close to the reference.
7.1 Limitations and Future Work
One limitation of our approach is that the printed volume can only
support finite depth of field. We investigate this issue by training
our neural network on a sharply focused Book scene. We render a
7 × 7 light field of the scene with the FOV as 20◦ and set the display
thickness to 12.5 mm. It can be observed from the comparison in
Figure 14 that only finite depth of field is rendered in the test view.
The pink rabbit and rear part of the book get blurred because they
are located outside of the finite depth-of-field range.
Another practical issue is our training time cost compared to
Layered3D using only a few layers. Since our approach adopts the
mini-batch Adam optimizer, its training time is majorly affected by
mini-batch size, training epochs, and batches per epoch. In addition,
our approach learns the volume density within a continuous space
and the number of samples on rays used for evaluating absorbance
also functions as an affecting factor of the training time. As demon-
strated in multiple results, our method brings improvements in both
visual quality and numerical performance. In practice, the training
process does not hinder the application of our approach, as the
training is only conducted once before the printing.
Although the MLP serves well to our needs, it is possible to use
other sparse representations like wavelets [Mallat 1999] to have
some theoretical guarantees. Future work can also benefit from
ACM Trans. Graph., Vol. 39, No. 6, Article 207. Publication date: December 2020.
Neural Light Field 3D Printing • 207:11
FOV 10◦ FOV 20◦ FOV 40◦ Reference
Fig. 12. The capability of neural networks to encode light fields with different field of views. With the FOV increasing, the neural network tends to compromise
its spatial resolution. Note how the head of the dragon in the FOV 40◦ case gets blurrier than others.
0.8 mm 1.6 mm 6.5 mm Reference
Fig. 13. Investigation of display thickness. We train neural networks to encode a light field of the Dice scene using different thickness of the display. Small
thickness leads to contrast loss. Note the bright dots on blue and red dices are missing for thickness 0.8 mm and 1.6 mm.
Fig. 14. Comparison of a rendered test view (left) with its reference (right)
on the Book scene. The reference image has infinite depth of field. When
encoding the light field onto a display of 12.5 mm thickness, only limited
depth of field (left) is exhibited.
using improved ray casting procedures that can leverage space-
partitioning data-structures (e. g., Octrees) in a multi-resolution
manner [Liu et al. 2020; Srinivasan et al. 2020].
In this work, we assume no scattering occurs in the volume. For
the printed prototypes, the effects of scattering is rarely noticeable
so the assumption holds here. For printing materials with significant
scattering, the scattering effects need to be addressed along with
absorption at the same time and we leave it as a future step. In
addition, we use uniform back-lighting in the design and this can
be extended to directional lighting.
8 CONCLUSION
We have proposed a novel and practical approach for printing input
light field imagery as an attenuation-based physical volumetric dis-
play. We utilize a neural network that encodes the input light field
data as a continuous representation of the medium density. With our
implementation of a volumetric ray casting process, we can train
our neural network end-to-end. We have demonstrated the effective-
ness of our approach in comprehensive simulation experiments. Our
approach significantly outperforms, both qualitatively and quantita-
tively, the existing multilayer approach and the grid-based approach
with a similar number of unknowns. Essentially, our approach takes
advantage of the fact that MLPs are more memory-efficient func-
tion approximators compared to uniformly sampled grids, which
allows us to achieve higher resolution and higher quality light fields
compared to the baseline techniques. We validate our simulation
with fabricated prototypes. Our approach also allows printing light
fields as either planar or non-planar volumetric displays.
ACKNOWLEDGMENTS
We would like to thank all anonymous reviewers for their valuable
feedback. The first author is funded by the Research Executive
Agency 739578.
REFERENCESMartín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, and et al. 2015. Ten-
sorFlow: Large-Scale Machine Learning on Heterogeneous Systems. https://www.tensorflow.org/ Software available from tensorflow.org.
Vahid Babaei, Kiril Vidimče, Michael Foshey, Alexandre Kaspar, Piotr Didyk, andWojciech Matusik. 2017. Color Contoning for 3D Printing. ACM Trans. Graph. 36, 4(July 2017), 124:1ś124:15. https://doi.org/10.1145/3072959.3073605
Ilya Baran, Philipp Keller, Derek Bradley, Stelian Coros, Wojciech Jarosz, DerekNowrouzezahrai, and Markus Gross. 2012. Manufacturing layered attenuatorsfor multiple prescribed shadow images. In Computer Graphics Forum, Vol. 31. WileyOnline Library, 603ś610.
Peter C Barnum, Srinivasa G Narasimhan, and Takeo Kanade. 2010. A multi-layereddisplay with water drops. In ACM SIGGRAPH 2010 papers. 1ś7.
Yoshua Bengio, Aaron Courville, and Pascal Vincent. 2012. Representation Learning: AReview and New Perspectives. arXiv:1206.5538 [cs.LG]
ACM Trans. Graph., Vol. 39, No. 6, Article 207. Publication date: December 2020.
207:12 • Zheng et al.
BG Blundell, AJ Schwarz, and DK Horrell. 1993. Volumetric threedimensional displaysystems: their past present and future. Engineering Science & Education Journal 2, 5(1993), 196ś200.
Barry G Blundell, Adam J Schwarz, and Damon K Horrell. 1994. Cathode ray sphere:a prototype system to display volumetric three-dimensional images. Optical Engi-neering 33, 1 (1994), 180ś187.
Alan Brunton, Can Ates Arikan, and Philipp Urban. 2015. Pushing the Limits of 3DColor Printing: Error Diffusion with Translucent Materials. ACM Trans. Graph. 35,1 (Dec. 2015), 4:1ś4:13. https://doi.org/10.1145/2832905
Thomas F Coleman and Yuying Li. 1996. A reflective Newton method for minimizinga quadratic function subject to bounds on some of the variables. SIAM Journal onOptimization 6, 4 (1996), 1040ś1058.
Oliver S Cossairt, Joshua Napoli, Samuel L Hill, Rick K Dorval, and Gregg E Favalora.2007. Occlusion-capable multiview volumetric three-dimensional display. Appliedoptics 46, 8 (2007), 1244ś1250.
Robert A Drebin, Loren Carpenter, and Pat Hanrahan. 1988. Volume rendering. ACMSiggraph Computer Graphics 22, 4 (1988), 65ś74.
Oskar Elek, Denis Sumin, Ran Zhang, Tim Weyrich, Karol Myszkowski, Bernd Bickel,Alexander Wilkie, and Jaroslav Křivánek. 2017. Scattering-aware Texture Reproduc-tion for 3D Printing. ACM Trans. Graph. 36, 6, Article 241 (Nov. 2017), 15 pages.
Gregg E Favalora. 2005. Volumetric 3D displays and application infrastructure. Computer38, 8 (2005), 37ś44.
Ioannis Gkioulekas, Shuang Zhao, Kavita Bala, Todd Zickler, and Anat Levin. 2013.Inverse volume rendering with material dictionaries. ACM Transactions on Graphics(TOG) 32, 6 (2013), 1ś13.
Xavier Glorot and Yoshua Bengio. 2010. Understanding the difficulty of training deepfeedforward neural networks. In Proceedings of the thirteenth international conferenceon artificial intelligence and statistics. 249ś256.
Steven J Gortler, Radek Grzeszczuk, Richard Szeliski, and Michael F Cohen. 1996. Thelumigraph. In Proceedings of the 23rd annual conference on Computer graphics andinteractive techniques. 43ś54.
Hironobu Gotoda. 2010. A multilayer liquid crystal display for autostereoscopic 3Dviewing. In Stereoscopic Displays and Applications XXI, Vol. 7524. InternationalSociety for Optics and Photonics, 75240P.
K. He, X. Zhang, S. Ren, and J. Sun. 2016. Deep Residual Learning for Image Recognition.In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 770ś778.
Michael Holroyd, Ilya Baran, Jason Lawrence, and Wojciech Matusik. 2011. Comput-ing and fabricating multilayer models. In Proceedings of the 2011 SIGGRAPH AsiaConference. 1ś8.
Youngjin Jo, Seungjae Lee, Dongheon Yoo, Suyeon Choi, Dongyeon Kim, and ByounghoLee. 2019. Tomographic projector: large scale volumetric display with uniformviewing experiences. ACM Transactions on Graphics (TOG) 38, 6 (2019), 1ś13.
Andrew Jones, Ian McDowall, Hideshi Yamada, Mark Bolas, and Paul Debevec. 2007.Rendering for an interactive 360 light field display. In ACM SIGGRAPH 2007 papers.40śes.
Simon Kallweit, Thomas Müller, Brian Mcwilliams, Markus Gross, and Jan Novák.2017. Deep Scattering: Rendering Atmospheric Clouds with Radiance-PredictingNeural Networks. ACM Trans. Graph. 36, 6, Article 231 (Nov. 2017), 11 pages.https://doi.org/10.1145/3130800.3130880
Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization.arXiv preprint arXiv:1412.6980 (2014).
Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. ImageNet Classificationwith Deep Convolutional Neural Networks. In NIPS.
Douglas Lanman, Matthew Hirsch, Yunhee Kim, and Ramesh Raskar. 2010. Content-adaptive parallax barriers: optimizing dual-layer 3D displays using low-rank lightfield factorization. In ACM SIGGRAPH Asia 2010 papers. 1ś10.
Douglas Lanman, Gordon Wetzstein, Matthew Hirsch, Wolfgang Heidrich, and RameshRaskar. 2011. Polarization fields: dynamic light field display using multi-layer LCDs.In Proceedings of the 2011 SIGGRAPH Asia Conference. 1ś10.
Yann LeCun, Y. Bengio, and Geoffrey Hinton. 2015. Deep Learning. Nature 521 (052015), 436ś44. https://doi.org/10.1038/nature14539
Marc Levoy and Pat Hanrahan. 1996. Light field rendering. In Proceedings of the 23rdannual conference on Computer graphics and interactive techniques. 31ś42.
Lingjie Liu, Jiatao Gu, Kyaw Zaw Lin, Tat-Seng Chua, and Christian Theobalt. 2020.Neural Sparse Voxel Fields. arXiv:2007.11571
Stephen Lombardi, Tomas Simon, Jason Saragih, Gabriel Schwartz, Andreas Lehrmann,and Yaser Sheikh. 2019. Neural Volumes: Learning Dynamic Renderable Volumesfrom Images. ACM Trans. Graph. 38, 4, Article 65 (July 2019), 14 pages. https://doi.org/10.1145/3306346.3323020
A Loukianitsa and Andrey N Putilin. 2002. Stereodisplay with neural network im-age processing. In Stereoscopic Displays and Virtual Reality Systems IX, Vol. 4660.International Society for Optics and Photonics, 207ś211.
Hiroyuki Maeda, Kazuhiko Hirose, Jun Yamashita, Koichi Hirota, and Michitaka Hirose.2003. All-around display for video avatar in real world. In The Second IEEE and ACMInternational Symposium on Mixed and Augmented Reality, 2003. Proceedings. IEEE,288ś289.
Stéphane Mallat. 1999. A wavelet tour of signal processing. Elsevier.Kshitij Marwah, Gordon Wetzstein, Yosuke Bando, and Ramesh Raskar. 2013. Com-
pressive light field photography using overcomplete dictionaries and optimizedprojections. ACM Transactions on Graphics (TOG) 32, 4 (2013), 1ś12.
Ben Mildenhall, Pratul P. Srinivasan, Rodrigo Ortiz-Cayon, Nima Khademi Kalantari,Ravi Ramamoorthi, Ren Ng, and Abhishek Kar. 2019. Local Light Field Fusion:Practical View Synthesis with Prescriptive Sampling Guidelines. ACM Trans. Graph.38, 4, Article 29 (July 2019), 14 pages. https://doi.org/10.1145/3306346.3322980
Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ra-mamoorthi, and Ren Ng. 2020. NeRF: Representing Scenes as Neural Radiance Fieldsfor View Synthesis. arXiv:2003.08934 [cs.CV]
Niloy J Mitra and Mark Pauly. 2009. Shadow art. ACM Transactions on Graphics (TOG)28, 5 (2009), 1ś7.
Shree K Nayar and Vijay N Anand. 2007. 3D display using passive optical scatterers.Computer 40, 7 (2007), 54ś63.
Ren Ng, Marc Levoy, Mathieu Brédif, Gene Duval, Mark Horowitz, Pat Hanrahan, et al.2005. Light field photography with a hand-held plenoptic camera. Computer ScienceTechnical Report CSTR 2, 11 (2005), 1ś11.
Victor Ostromoukhov. 2001. A simple and efficient error-diffusion algorithm. In Proceed-ings of the 28th annual conference on Computer graphics and interactive techniques.567ś572.
Matt Pharr, Wenzel Jakob, and Greg Humphreys. 2016. Physically Based Rendering:From Theory to Implementation (3 ed.). Morgan Kaufmann.
Nasim Rahaman, Aristide Baratin, Devansh Arpit, Felix Draxler, Min Lin, Fred A.Hamprecht, Yoshua Bengio, and Aaron Courville. 2018. On the Spectral Bias ofNeural Networks. arXiv:1806.08734
Artin Saberpour, Roger D Hersch, Jiajing Fang, Rhaleb Zayer, Hans-Peter Seidel, andVahid Babaei. 2020. Fabrication of moiré on curved surfaces. Optics Express 28, 13(2020), 19413ś19427.
Liang Shi, Vahid Babaei, Changil Kim, Michael Foshey, Yuanming Hu, Pitchaya Sitthi-Amorn, Szymon Rusinkiewicz, and Wojciech Matusik. 2018. Deep multispectralpainting reproduction via multi-layer, custom-ink printing. ACM Transactions onGraphics 37, 6 (Dec. 2018), 271:1ś271:15. https://doi.org/10.1145/3272127.3275057
Vincent Sitzmann, Michael Zollhöfer, and GordonWetzstein. 2019. Scene representationnetworks: Continuous 3d-structure-aware neural scene representations. In Advancesin Neural Information Processing Systems. 1121ś1132.
Pratul P. Srinivasan, Ben Mildenhall, Matthew Tancik, Jonathan T. Barron, RichardTucker, and Noah Snavely. 2020. Lighthouse: Predicting Lighting Volumes forSpatially-Coherent Illumination. In CVPR.
Stratasys. 2020. Stratasys ultimate full-color multi-material 3D printer. https://www.stratasys.com/3d-printers/j8-series. [Online; Accessed 15-05-2020].
Denis Sumin, Tobias Rittig, Vahid Babaei, Thomas Nindel, Alexander Wilkie, PiotrDidyk, Bernd Bickel, Jaroslav Křivánek, Karol Myszkowski, and Tim Weyrich. 2019.Geometry-Aware Scattering Compensation for 3D Printing. ACM Trans. Graph. 38,4, Article 111 (July 2019), 14 pages.
Matthew Tancik, Pratul P Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, NithinRaghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan T Barron, and RenNg. 2020.Fourier features let networks learn high frequency functions in low dimensionaldomains. arXiv preprint arXiv:2006.10739 (2020).
James Tompkin, Simon Heinzle, Jan Kautz, and Wojciech Matusik. 2013. Content-adaptive lenticular prints. ACM Transactions on Graphics (TOG) 32, 4 (2013), 1ś10.
Zhou Wang, Alan C Bovik, Hamid R Sheikh, and Eero P Simoncelli. 2004. Image qualityassessment: from error visibility to structural similarity. IEEE transactions on imageprocessing 13, 4 (2004), 600ś612.
Gordon Wetzstein, Douglas Lanman, Wolfgang Heidrich, and Ramesh Raskar. 2011.Layered 3D: tomographic image synthesis for attenuation-based light field and highdynamic range displays. In ACM SIGGRAPH 2011 papers. 1ś12.
GordonWetzstein, Douglas Lanman, Matthew Hirsch, and Ramesh Raskar. 2012. Tensordisplays: compressive light field synthesis using multilayer displays with directionalbacklighting. ACM Transactions on Graphics (TOG) 31, 4 (2012), 1ś11.
Tomohiro Yendo, Naoki Kawakami, and Susumu Tachi. 2005. Seelinder: the cylindricallightfield display. In ACM SIGGRAPH 2005 Emerging technologies. 16śes.
ACM Trans. Graph., Vol. 39, No. 6, Article 207. Publication date: December 2020.