
PHOTOMETRIC STEREO FOR GENERAL REFLECTANCE AND LIGHTING

（実物体反射特性・実環境光源のための照度差ステレオ）

BY

BOXIN SHI (シ ボシン)

A DOCTORAL DISSERTATION

SUBMITTED TO THE GRADUATE SCHOOL OF

THE UNIVERSITY OF TOKYO

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS

FOR THE DEGREE OF

DOCTOR OF INFORMATION SCIENCE AND TECHNOLOGY

June 2013

© Copyright by Boxin Shi 2013

All Rights Reserved

Committee:

Kiyoharu AIZAWA (Chair)

Shin’ichi SATOH

Yoichi SATO

Takeshi NAEMURA

Takeshi OISHI

Supervisor:

Katsushi IKEUCHI

ABSTRACT

Understanding 3D shape is a fundamental problem in computer vision. Among the various shape estimation technologies, photometric stereo stands out for its ability to produce detailed surface normal directions at the resolution of the input image. It takes as input a set of images captured under varying illumination from a fixed viewpoint. Traditional photometric stereo assumes Lambertian reflectance and distant lighting. These assumptions are seldom satisfied in practical scenarios. This dissertation generalizes the assumptions of photometric stereo, specifically those on reflectance and lighting, towards creating a practical surface normal sensor. The proposed approaches serve as fundamental support for the design of future cameras that are able to record and measure 3D shapes for applications such as cultural heritage archiving, digital museums, virtual reality, and 3D scene understanding.

The first general reflectance solution exploits reflectance monotonicity to estimate the elevation angles of surface normals given their azimuth angles, thereby fully determining the surface normals. Assuming that the reflectance includes at least one lobe that is a monotonic function of the angle between the surface normal and the half-vector (the bisector of the lighting and viewing directions), we prove that the elevation angle can be uniquely determined when the surface is observed under varying directional lights densely and uniformly distributed over the hemisphere.

The second general reflectance solution is built upon a newly developed reflectance model. We observe that if the high-frequency reflectance can be neglected, the low-frequency component of general reflectance can be represented simply using low-order polynomials. Based on this observation, we propose a compact bi-polynomial reflectance model that describes general isotropic reflectance precisely in the low-frequency domain. We apply our reflectance model to two radiometric image analysis problems that estimate reflectance and shape from recorded scene radiance, namely reflectometry and photometric stereo, for surfaces with general reflectance.

Both of the proposed solutions for general reflectance have been evaluated using a densely measured reflectance database containing one hundred different materials, as well as various types of real-world data. These evaluations cover a diversity of common materials in daily life; hence the experiments demonstrate that our approaches are valid for a broad class of reflectances and useful in various practical scenarios.

The third solution addresses general lighting. We present a photometric stereo method that works with general, unknown lighting and uncontrolled sensors, given a coarse shape. We show that the coarse shape information, or shape prior, helps to resolve two difficult issues: removing the shape-light ambiguity under unknown natural lighting, and disregarding uncontrolled sensor gains and responses. Our method is well suited to a low-cost RGBD camera whose radiometric characteristics are totally unknown. We also show an application to 3D modeling from Internet images, where the illumination and sensor characteristics are unknown. The effectiveness of the proposed method is assessed by quantitative and qualitative evaluations.

The main contributions of this dissertation are threefold: we explore the monotonicity of the reflectance function and use it for surface normal estimation with general reflectance; we propose a new reflectance model in the low-frequency domain, which facilitates the reflectometry and photometric stereo problems with general reflectance; and we design a practical photometric stereo system that works without knowing the environment lighting or the camera’s radiometric response. The efforts and achievements in this thesis relax the theoretical assumptions and promote the practical capabilities of the photometric stereo technique by considering general reflectance and lighting.


Acknowledgements

I would like to express my gratitude to all those who made it possible for me to complete this dissertation.

I would first like to express my deepest gratitude to my advisor, Prof. Katsushi Ikeuchi. His wisdom and achievements attracted me to this wonderful university and lab to pursue the most important degree of my life; he showed me through his own actions how to work diligently and intelligently as an independent researcher; and his generosity, kindness, and encouragement have impressed upon me that being Katsu’s student is a source of pride and an honor for my entire academic life!

I would also like to express my sincere gratitude to Dr. Yasuyuki Matsushita, my mentor at Microsoft Research Asia. Yasu guided my research from the end of my master’s study until the end of my doctoral study. I learned from him how to do world-class research, not only technically but also philosophically. His scrupulous attitude toward every detail always reminds me of the essence of being a qualified researcher.

Prof. Ping Tan of the National University of Singapore, my close collaborator, deserves special thanks for directly helping me in my research. His significant contributions and practical suggestions have always inspired me to improve.

I have enjoyed the whole process and treasure all the achievements of working with Katsu, Yasu, and Ping: repeatedly overcoming failed experiments, debating solutions, revising papers through the night, and finally arriving at the destination of success and truth.

Many thanks go to the staff, senior members, and my colleagues at the Computer Vision Laboratory at the University of Tokyo. My special thanks go to Dr. Rei Kawakami, Dr. Bo Zheng, and Dr. Tomoaki Higo for helping me get off to a good start and sharing their valuable experience with me. I am also very proud of, and feel fortunate to have worked with, the talented people in the Photometry group. I would also like to thank Yoshihiro Sato, Keiko Motoki, Yoshiko Matsuura, Mikiko Yamaba, and Yuko Nishine for their constant and warm support. Although, due to limited space, I cannot name everyone who has helped me, I am very grateful to all the people I have met and interacted with in this lab.

I would also like to thank my committee members, Prof. Kiyoharu Aizawa, Prof. Shin’ichi Satoh, Prof. Yoichi Sato, Prof. Takeshi Naemura, and Prof. Takeshi Oishi, for giving valuable advice on this dissertation. I would like to thank Prof. Toshihiko Yamasaki for his valuable feedback at the advisor meetings.

I wish to express my gratitude to the Japanese Government’s Ministry of Education, Culture, Sports, Science and Technology (Monbukagakusho) for its generous financial support. Without the scholarship from the Global 30 program, it would have been impossible for me to finish my Ph.D. study and this dissertation. I also thank the Kyoritsu International Foundation and the Japan Student Services Organization (JASSO) for providing me with a comfortable living environment in Tokyo, where I enjoyed my life off campus.

The final acknowledgment goes to my family: my parents and my wife, Dr. Si Li. I thank them for their perpetual support and unconditional love. It is to them that I dedicate this dissertation.

June 2013


Contents

Abstract

Acknowledgements

List of Figures

List of Tables

1 Introduction
    1.1 Background
    1.2 Traditional Photometric Stereo
    1.3 Generalized Photometric Stereo
        1.3.1 Generalization of reflectance
        1.3.2 Generalization of lighting
    1.4 Chapter Organization

2 General Reflectance Solution using Reflectance Monotonicity
    2.1 Overview
    2.2 Related Works
    2.3 Elevation Angle Estimation
        2.3.1 1-lobe BRDF
        2.3.2 BRDF profile
        2.3.3 Monotonicity of the BRDF profile
    2.4 Normal Estimation Method
    2.5 Experiments
        2.5.1 Synthetic data
        2.5.2 Real data
    2.6 Summary

3 General Reflectance Solution using Bi-polynomial Reflectance Model
    3.1 Overview
    3.2 Related Works
    3.3 Low-frequency Reflectance
    3.4 The Bi-polynomial BRDF Model
        3.4.1 Relationship with other reflectance models
        3.4.2 Model validation
        3.4.3 Comparison with other parametric models
    3.5 Application to Reflectometry
    3.6 Application to Photometric Stereo
        3.6.1 An iterative normal estimation method
        3.6.2 Surface normal estimation results
        3.6.3 Effect of varying numbers of lightings
        3.6.4 Analysis on intensity threshold T_low
        3.6.5 Results using real-world data
    3.7 Summary

4 General Lighting Solution using Shape Prior
    4.1 Overview
    4.2 Related Works
    4.3 Proposed Method
        4.3.1 Linear solution
        4.3.2 Nonlinear refinement
        4.3.3 Normal from depth
        4.3.4 Surface reconstruction
    4.4 Linear Approximation of Sensor Responses
    4.5 Experiments
        4.5.1 Quantitative evaluation
        4.5.2 Application using a Kinect sensor
        4.5.3 Application using Internet images
    4.6 Summary

5 Conclusion
    5.1 Summary
        5.1.1 Photometric stereo for general reflectance by analyzing reflectance monotonicity
        5.1.2 Photometric stereo for general reflectance by bi-polynomial modeling of low-frequency reflectance
        5.1.3 Photometric stereo for general lighting by utilizing shape prior
    5.2 Contributions
    5.3 Future Directions
        5.3.1 Material-aware solution for even higher accuracy
        5.3.2 Simpler setup for the general reflectance problem
        5.3.3 Photometric stereo for both general reflectance and lighting

References

List of Figures

1.1 Examples of 3D reconstruction using photometric stereo. The top figure is courtesy of Higo et al. [HMI10], and the bottom figure is courtesy of Johnson et al. [JCRA11].

1.2 Photometric stereo: Traditional vs. General.

1.3 All 100 materials in the MERL BRDF database: pictures of spheres being measured. This figure is courtesy of Matusik et al. [MPBM03].

2.1 Coordinate system and key variables.

2.2 Lights, half-vectors, and normal on the same longitude. (a) θ ≤ π/4; (b) θ > π/4.

2.3 (a) α values for ϕ_l = 0 and θ_l1, θ_l2 ∈ [0, π/2]; (b) max(α) values for ϕ_l ∈ (0, π/2].

2.4 Example of cost functions for θ = π/6 and θ′ ∈ [0, 90] (normalized by a monotonic function f shown at the bottom right for visualization purposes). (a) Using lights on the same longitude as the normal; (b) using lights covering the whole hemisphere.

2.5 Elevation angle errors (degrees) on 100 materials (ranked by error in descending order). Rendered spheres of some representative materials and their BRDF values in the ρ-n⊤h space are shown near the curve. The spheres in the ρ-n⊤h plots show the one-dimensional projection RMS errors using all the directions on the hemisphere (dark blue means small errors, and red means large errors).

2.6 Photometric stereo results on synthetic models with measured BRDFs. On the far left, one of the input images and its corresponding materials are displayed. Next to the input images and material samples, we show the estimated azimuth angles, elevation angles, normals, and their difference maps w.r.t. the ground truth. The numbers on the difference maps show the mean angular error in degrees.

2.7 Average elevation angle errors (degrees) on 100 materials (Y-axis) varying with the number of lighting directions. Below the curve, the blue dots on the spheres show the light distribution in each case (corresponding to the X-axis of the curve).

2.8 Elevation angle errors (degrees) on 2-lobe BRDFs. (a) The Cook-Torrance model, where k_1 is the Lambertian strength and k_2 is the specular strength; (b) k_1 ρ_1(n⊤v) + k_2 ρ_2(n⊤(v + 2l)); (c) k_1 ρ_1(n⊤h) + k_2 ρ_2(n⊤k), where k is a random direction.

2.9 Results on Gourd1 (102) [AZK08] (the number in parentheses indicates the number of images in the dataset). (a) One input image; (b), (c) our normal and Lambertian shading; (d), (e) normal and shading from Lambertian photometric stereo. Note that (b) and (d) have an average angular difference of about 12 degrees.

2.10 Results on Apple (112) [AZK08]. (a) One input image; (b), (c) our normal and reconstructed surface; (d), (e) normal and surface shown in [AZK08]. Note that here we use the color mapping [(n_x + 1)/2, (n_y + 1)/2, n_z] → [R, G, B] for comparison purposes. The shapes look different partly because we do not know the exact reconstruction method and rendering parameters used in [AZK08].

2.11 Real data results. Left part: Gourd2 (98) [AZK08], Helmet side right (119) (only 119 out of 253 images in the original dataset are used) [CEJ∗06], Kneeling knight (119) [CEJ∗06]; right part: Pear (77), God (57), Dinosaur (118). We show one input image, the estimated normals, and the Lambertian shadings for each dataset.

3.1 An illustration of low-frequency reflectance. From left to right, spheres illuminated by distant lights from directions [0, 0, 1]⊤, [1/√2, 0, 1/√2]⊤, and [1, 0, 0]⊤ and viewed from [0, 0, 1]⊤ are shown. These high-dynamic-range (HDR) images are tone mapped using the method of Reinhard et al. [RSSF02].

3.2 RMSE of fitting observations under various T_low to spherical harmonics of varying orders. The result is square-rooted for visualization purposes. The legend shows the mapping between error magnitude and color. The black areas are undefined due to insufficient equations for fitting.

3.3 The definitions of θ_h and θ_d.

3.4 BRDF fitting errors of the biquadratic model to synthetic data using the Cook-Torrance model. The colors indicate error magnitudes. The columns vary with T_low, and the rows correspond to different roughnesses (m). Some rendered spheres are displayed on the left for reference.

3.5 BRDF fitting errors of the biquadratic model to all materials in the MERL database. The colors indicate error magnitudes. The columns vary with T_low, and the rows correspond to different BRDFs ordered by the mean fitting errors over columns. Some rendered spheres are displayed on the left for reference.

3.6 BRDF fitting comparison for various reflectance models and all materials in the MERL database. The Y-axis shows the RMSE values; the X-axis shows BRDF names ordered by the fitting errors of the biquadratic model, with some rendered spheres below for visualization purposes.

3.7 Results of reflectometry: the top two rows are from synthetic data, and the bottom row is from real data. BRDF plots for all models and the rendered spheres using measured data, the biquadratic model, and Lambert’s model are shown. The reconstruction errors (×10⁻⁴ for the top two examples, ×10⁻² for the bottom example) for each model are summarized in the legends. Each BRDF is visualized as a 2D curve, which is a polar plot with the angle as the elevation angle of the surface normal and the radius as the reflectance magnitude. The surface normal with zero azimuth angle is selected for visualization in 2D.

3.8 Photometric stereo results comparison for various reflectance models and all materials in the MERL database. The Y-axis shows mean angular errors (degrees); the X-axis shows BRDF names ordered by the mean angular errors of the biquadratic model, with some selected rendered spheres below for visualization purposes.

3.9 Photometric stereo results using synthetic data. One of the input images is shown under the ground truth normal map. Normal map estimates using different BRDF models are shown in the top row. The bottom row shows angular difference maps w.r.t. the ground truth. The numbers on the difference maps show mean angular errors (degrees).

3.10 Angular error (degrees) varying with the number of lighting directions. The numbers in the legend are the mean values over the X-axis. The “Biquad./noise-1, 2” cases correspond to noise levels with λ = 0.15, 0.3. Note that the biquadratic and bicubic models need at least 9 and 16 equations for fitting; therefore their curves start from 50 and 75 images, respectively, when T_low = 25%.

3.11 Angular error (degrees) varying with T_low. The numbers in the legend are the mean values over the X-axis. The “Biquad./noise-1, 2” cases correspond to noise levels with λ = 0.15, 0.3. Note that the biquadratic and bicubic models need at least 9 and 16 equations for fitting; therefore their curves start from T_low = 10% and 20%, respectively, when 100 images are given as input.

3.12 Angular error (degrees) varying with T_low for the biquadratic model. Each thin curve represents the result for one material. The thick red curve is the average over 100 materials. The materials in green and purple frames are examples that are insensitive and sensitive to T_low, respectively.

3.13 Photometric stereo results using real-world data in comparison with the results of [AZK08] and Lambertian photometric stereo. The top three rows of data are courtesy of Alldrin et al. [AZK08].

3.14 Photometric stereo results using real-world data. From left to right, the columns show one of the input images, the estimated surface normals using the biquadratic reflectance model, Lambertian shading, and the depth map (brighter intensity means closer and darker intensity means farther) reconstructed from the estimated normals. The top two rows use data from the USC light stage gallery [CEJ∗06].

3.15 Comparison between the two general reflectance solutions, showing their best three and worst three materials, respectively.

4.1 The top frame shows the pipeline of the normal estimation method (Sec. 4.3); the bottom frame illustrates the approximation of nonlinear responses (Sec. 4.4). The orange camera has a nonlinear response, while the green camera has a linear one. Please pay attention to the different general lighting conditions on the left (l) and right (lF) sides of this illustration.

4.2 Reconstruction error of Eq. (4.5) w.r.t. varying numbers of images (q) for scenes containing two spheres with different albedos. α = 1, 2, 3, 4, 5 indicates that the left/right spheres have albedo values of 0.5/0.5, 0.4/0.6, 0.3/0.7, 0.2/0.8, 0.1/0.9.

4.3 Linearly approximated images for uniform albedo (top row) and strong contrast (bottom row) cases. The input radiance r is transformed by a nonlinear response to i. Our linearly approximated rF is very close in appearance to i. The errors below show the mean relative differences of r and rF from i (the same definition as the reconstruction error in Fig. 4.2). Color-encoded images are used for better visualization.

4.4 Normal estimation accuracy (angular error in degrees) w.r.t. different albedo contrasts. α represents the same albedo ratios as in Fig. 4.2. Two different dimensions of lighting coefficients, k = 9, 16, are evaluated.

4.5 Normal estimation accuracy (angular error in degrees) w.r.t. varying noise levels in shape priors. β = 1, 2, 3, 4, 5 are labels representing corruptions, where the clean depth maps are quantized to 8, 6, 5, 4, 3 bits, with zero-mean Gaussian noise of standard deviations 0.02, 0.03, 0.04, 0.04, 0.05 added. These rough normals have angular errors from 13 to 25 degrees. Two different dimensions of lighting coefficients, k = 9, 16, are evaluated.

4.6 Surface reconstruction result using synthetic data. From left to right, the ground truth, the noisy depth used as the shape prior, the surface from the estimated normals, and the final reconstruction obtained by fusing the shape prior and the estimated normals are shown.

4.7 Experimental result using a Kinect sensor. From left to right, one of the input images, the surface normals computed from the Kinect depth (shape prior), the estimated normals after removing ambiguity, and the reconstructed surfaces from input depth, normal integration, and depth/normal fusion are shown.

4.8 Surface normal estimation results using Internet images. From top to bottom, the rows show four of the registered images, the normal prior, our results, and the results from [ARSG10].

4.9 3D reconstruction results using Internet images, with close-up views indicated by red rectangles. The top and bottom rows show the shape prior and our result, respectively.

4.10 Result using Internet images and a known shape as the prior.

List of Tables

3.1 BRDF fitting comparison (RMSE ×10⁻⁴).

3.2 Photometric stereo results comparison (degrees).


Chapter 1

Introduction

This dissertation begins by introducing the basic concepts, assumptions, and formulations of photometric stereo. Traditional photometric stereo is reviewed first. By extending the classic concept, we then propose generalized photometric stereo, whose solutions for general reflectance and lighting are the main research topics of this dissertation.

1.1 Background

The shape of a real-world object is three-dimensional (3D). When it is recorded with a still camera, the 3D scene is projected onto a 2D image plane. Given 2D pictures, it is not difficult for humans to imagine and understand the original 3D shape, but for computers the problem is ill-posed due to the loss of depth information. There are devices that accurately measure and recover 3D shape information, such as laser scanners, but they cost hundreds of times as much as a consumer-level digital camera, which limits their use in non-professional fields. Therefore, recovering shape from images taken by consumer-level cameras is of great interest and has many applications, not only in designing future cameras, but also in scene understanding, industrial modeling, robotics, virtual reality, etc.

Image-based 3D reconstruction methods are important topics in computer vision. Depending on the constraints on which the techniques rely, there are two main types of approaches: the geometric approach and the photometric approach. The key idea of the geometric approach is to find correspondences and perform triangulation. Intuitively speaking, each surface point of an object has a unique 3D position no matter from which viewpoint the camera captures it, and it is this information that reveals the 3D positions of the surface points. Multi-view stereo (please refer to [SCD∗06] for a survey) is a representative technique of the geometric approach. It takes as input many images captured from different viewpoints under (approximately) constant illumination, and it outputs sparse 3D points of the gross object shape. The sparsity (density) of the produced 3D points depends on the quality of correspondence matching. This means that for smooth surfaces with little texture, the 3D reconstruction from multi-view stereo may be of relatively low quality, because the similar appearance of the multi-view images makes it difficult to match corresponding points.

The photometric approach relies on analyzing the shading variations in images caused by the interaction of shape (surface normal orientation), reflectance (material), and lighting (illumination). Consider a white plane with a circular contour and a white sphere with the same radius as the plane, both illuminated by the same point light source. It is easy for people to distinguish them in the captured pictures and perceive the difference in shape, although both objects are projected as circles of the same size in pictures taken from the same viewpoint: the plane shows constant reflected light intensity, while the sphere appears brighter in areas facing the light source and darker in areas facing away from it. This is the basic idea of why shading information reveals 3D shape.
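The plane-versus-sphere example can be sketched numerically. The snippet below is only an illustration under stated assumptions: it uses Lambertian shading (intensity proportional to the cosine between the normal and the light direction), approximates the point source by a distant directional light, and picks a hypothetical light direction and sample positions. The function names are illustrative and do not come from this dissertation.

```python
import math

def lambertian_intensity(normal, light, albedo=1.0):
    """Lambertian shading: albedo * max(n . l, 0) for unit vectors n and l."""
    n_dot_l = sum(n * l for n, l in zip(normal, light))
    return albedo * max(n_dot_l, 0.0)

def sphere_normal(x, y):
    """Unit normal of a unit sphere at image position (x, y), viewed along +z."""
    z = math.sqrt(max(1.0 - x * x - y * y, 0.0))
    return (x, y, z)

# Hypothetical distant light, tilted toward +x (unit length).
light = (1 / math.sqrt(2), 0.0, 1 / math.sqrt(2))

# Sample the same image positions along the x-axis on both objects.
xs = [-0.6, -0.3, 0.0, 0.3, 0.6]

# The plane faces the camera everywhere, so its normal never changes.
plane_row = [lambertian_intensity((0.0, 0.0, 1.0), light) for _ in xs]

# The sphere's normal turns toward the light as x increases.
sphere_row = [lambertian_intensity(sphere_normal(x, 0.0), light) for x in xs]

print("plane :", [round(v, 3) for v in plane_row])
print("sphere:", [round(v, 3) for v in sphere_row])
```

Both objects cover the same pixels, yet the plane row is constant while the sphere row grows toward the side facing the light, which is the shading cue described above.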

Single-image 3D reconstruction by exploring shading variations is called shape from shading [Hor70], but unfortunately it is an ill-posed problem: because different shapes illuminated from different lighting directions might produce the same image appearance, there are infinitely many solutions in the single-image case. In contrast, if three or more images are provided, photometric stereo [Woo80, Sil80] is a well-known method for estimating surface normals from an image sequence taken from a fixed viewpoint under varying directional (distant) lighting. Compared to the geometric (triangulation-based) approach, the photometric (shading-based) approach is able to estimate surface normal orientation at the same resolution as the input image, i.e., to produce a pixel-wise surface normal map. This level of detail cannot be achieved by geometric approaches; hence the photometric approach is more useful when the delicate geometric structure of the target object is desired. However, due to the fixed viewpoint, only one view of the object can be reconstructed in 3D; strictly speaking, single-view photometric stereo produces a 2.5D surface reconstruction.
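To make the "three images" requirement concrete, here is a minimal per-pixel sketch of classic Lambertian photometric stereo with exactly three lights. It assumes the Lambertian model i = ρ n⊤l with known unit lighting directions, no shadows, and no noise; the helper names (`dot`, `cross`, `solve_three_lights`) and the test values are hypothetical, and the 3×3 system is solved with Cramer's rule purely to keep the example dependency-free.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def solve_three_lights(lights, intensities):
    """Solve l_k . b = i_k (k = 1, 2, 3) for b = albedo * normal."""
    l1, l2, l3 = lights
    det = dot(l1, cross(l2, l3))
    if abs(det) < 1e-12:
        raise ValueError("lighting directions must be non-coplanar")
    i1, i2, i3 = intensities
    # Inverse of the matrix with rows l1, l2, l3 has columns
    # (l2 x l3), (l3 x l1), (l1 x l2), each divided by det.
    c23, c31, c12 = cross(l2, l3), cross(l3, l1), cross(l1, l2)
    b = tuple((i1 * c23[k] + i2 * c31[k] + i3 * c12[k]) / det
              for k in range(3))
    albedo = math.sqrt(dot(b, b))
    if albedo == 0.0:
        raise ValueError("pixel is black under all lights")
    normal = tuple(x / albedo for x in b)
    return albedo, normal

# Synthetic pixel: hypothetical ground-truth albedo and unit normal.
true_albedo = 0.5
true_normal = (0.0, 0.6, 0.8)
lights = [(0.0, 0.0, 1.0), (0.6, 0.0, 0.8), (0.0, 0.6, 0.8)]
intensities = [true_albedo * dot(true_normal, l) for l in lights]

albedo, normal = solve_three_lights(lights, intensities)
print(albedo, normal)
```

With more than three images the same linear system becomes overdetermined and is typically solved in the least-squares sense; with coplanar lights the determinant vanishes and the normal cannot be recovered, which is why the lighting directions must be linearly independent.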

Two examples of outputs from successful photometric stereo methods are shown in Fig. 1.1. The top example is a relief object. Because of its textureless surface with delicate structures on a thin board, this type of object is particularly challenging for the geometric approach. By using photometric stereo, however, the method in [HMI10] accurately estimates the pixel-wise normal map, from which the 3D surface is reconstructed. The bottom example applies photometric stereo to microscopic surface sensing. Through the analysis of a carefully calibrated reflectance map, the proposed system, named GelSight (http://www.gelsight.com), is able to measure and reconstruct tiny surface structures that are invisible to the human eye.

Figure 1.1: Examples of 3D reconstruction using photometric stereo. The top figure is courtesy of Higo et al. [HMI10], and the bottom figure is courtesy of Johnson et al. [JCRA11]. (Panel labels: input image; estimated surface normal.)

Despite the advantages listed above, photometric stereo relies on several assumptions to work perfectly, and satisfying all of them in real applications is infeasible. A key problem in recent studies is how to generalize the assumptions of photometric stereo, and this dissertation focuses on exactly this topic. We attempt to generalize photometric stereo in several respects, especially to accurately and efficiently estimate surface normals (from which a 3D model can be reconstructed) under general reflectance and lighting conditions, in order to significantly extend the scenarios in which photometric stereo can be applied. The traditional (ideal) assumptions and our generalized assumptions for photometric stereo are introduced in the next two sections.

1.2 Traditional Photometric Stereo

Traditional photometric stereo [Woo80, Sil80] makes the following assumptions:

• The camera follows orthographic projection and has a linear radiometric response function;

• The surface reflectance follows Lambert’s reflectance model, i.e., the Lambertian assumption (the reflectance function is a constant; this is approximately observed on diffuse objects such as a white wall);

• The lighting consists of parallel rays with known directions and intensities (namely distant or directional lighting, which has the same lighting direction for all scene points);

• Shadows (either attached shadow or cast shadow) are neglected;

• Interreflection is neglected.

Among these assumptions, the first concerns the camera, and this issue can be avoided by using a suitable camera and settings. For example, by using a lens with a long focal length and placing the object far from the camera, the perspective effect can be safely neglected. The nonlinear response issue can be solved either by using a camera with a linear response (e.g., by checking the raw values of captured intensities) or by performing radiometric calibration (e.g., [MN99]) as a pre-processing step when the camera is accessible or controllable. Corruptions from shadows and interreflection are common sources of error for all photometric stereo solutions. Robust approaches can be applied to remove these outliers before performing normal estimation. For example, the method in [WGS∗10] casts the photometric stereo problem as recovering and completing a low-rank matrix subject to sparse errors (specularity, shadow, interreflection, etc.) in the form of corrupted and missing pixels.

However, the reflectance of the object itself and the lighting condition under which the object is illuminated vary from case to case. Different objects usually have different reflectance properties, and an object can be placed in arbitrary lighting environments. Therefore, these two assumptions are the most complicated but also the most important ones that restrict the wide applicability of photometric stereo.

The left side of Fig. 1.2 illustrates the traditional photometric stereo and its assumptions on reflectance and lighting. The estimated surface normal (at the bottom of the “Shape” block in Fig. 1.2) is visualized using an RGB-encoded color map, in which each color channel linearly encodes one of the XYZ components of the surface normal. A reference sphere is placed at the bottom-left corner near the estimated normal map. Please pay attention to the sphere normal map, since this coordinate system and color map visualization (e.g., green means almost perpendicular surfaces facing the north) will be used for all results throughout this dissertation.
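As a minimal sketch of such a visualization, the snippet below maps each normal component linearly from [−1, 1] to [0, 1] and uses the result as an RGB value. The mapping (n + 1)/2 and the function name `normal_to_rgb` are assumptions made for illustration; the exact encoding used for Fig. 1.2 is not stated in the text.

```python
import numpy as np

def normal_to_rgb(normals):
    """Map unit normals of shape (..., 3) to RGB values in [0, 1].

    Assumed mapping (not taken from the dissertation): rgb = (n + 1) / 2,
    so X -> red, Y -> green, Z -> blue.
    """
    n = np.asarray(normals, dtype=float)
    n = n / np.linalg.norm(n, axis=-1, keepdims=True)  # guard against non-unit input
    return 0.5 * (n + 1.0)
```

Under this assumed mapping, a normal pointing along +Y becomes (0.5, 1.0, 0.5), a greenish color, which is in line with the description of north-facing surfaces above.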

Figure 1.2: Photometric stereo: Traditional vs. General. (Left: known, distant lighting and Lambertian reflectance; right: unknown, natural lighting and general BRDF.)

Under the Lambertian reflectance assumption, the reflectance of each surface point can be represented by its albedo ρ, and the directional lighting assumption allows the lighting to be represented as its direction scaled by its intensity, e[lx, ly, lz]⊤, where [lx, ly, lz]⊤ is the lighting direction and e is the lighting intensity. For notational simplicity, we assume that each input image is normalized by dividing it by its lighting intensity e; thus we denote l = [lx, ly, lz]⊤ as the directional lighting vector, which is a unit vector. Then the image formation model becomes very simple, and the scene radiance i can be calculated as i = ρn⊤l, where n = [nx, ny, nz]⊤ is the surface normal direction. Given at least three different and linearly independent (non-coplanar) lighting directions and the corresponding image intensities, we obtain

\[
[i_1, i_2, i_3] = \rho\,[n_x, n_y, n_z]
\begin{bmatrix}
l_{1x} & l_{2x} & l_{3x} \\
l_{1y} & l_{2y} & l_{3y} \\
l_{1z} & l_{2z} & l_{3z}
\end{bmatrix}. \tag{1.1}
\]

The above equation is for one pixel under three different lightings [l1, l2, l3]. For an image containing p pixels and a total of q different images (lighting directions), by stacking all these observations into matrices, we have

\[
I_{p \times q} = S_{p \times 3}\, L_{3 \times q}, \tag{1.2}
\]

where S encodes the albedo-scaled normals and L stacks all lighting vectors. The matrix I can be measured using a camera; the lighting intensities and directions can be measured using a Lambertian sphere (by checking its average intensity) and a chrome sphere (by checking the direction toward which the brightest pixel points), respectively. Then the albedo-scaled surface normal is the only unknown term, and it can be estimated using linear least squares as

\[
S = I L^{+}, \tag{1.3}
\]

where the superscript “+” denotes the pseudo-inverse operation. From the estimated S, a surface normal n can be computed by normalizing its corresponding row vector in S, denoted as s:

\[
n = \frac{s}{\|s\|} = \frac{\rho n}{\rho}. \tag{1.4}
\]
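The least-squares solution of Eqs. (1.2) to (1.4) can be sketched in a few lines of NumPy. This is only an illustrative sketch under the ideal assumptions of this section: the function names and the synthetic data in `demo` are hypothetical, and shadows are ignored (intensities are not clamped at zero), so the linear model holds exactly.

```python
import numpy as np

def lambertian_photometric_stereo(I, L):
    """Recover unit normals and albedo from I (p x q) and lights L (3 x q)."""
    S = I @ np.linalg.pinv(L)            # Eq. (1.3): albedo-scaled normals, p x 3
    albedo = np.linalg.norm(S, axis=1)   # rho = ||s||
    normals = S / np.maximum(albedo[:, None], 1e-12)  # Eq. (1.4)
    return normals, albedo

def demo(p=5, q=10, seed=0):
    """Synthesize ideal Lambertian data (no shadows) and run the solver."""
    rng = np.random.default_rng(seed)
    n_true = rng.normal(size=(p, 3))
    n_true[:, 2] = np.abs(n_true[:, 2]) + 1.0         # normals facing the camera
    n_true /= np.linalg.norm(n_true, axis=1, keepdims=True)
    rho_true = rng.uniform(0.2, 1.0, size=p)
    L = rng.normal(size=(3, q))
    L[2] = np.abs(L[2])                               # lights in the upper hemisphere
    L /= np.linalg.norm(L, axis=0, keepdims=True)
    I = (rho_true[:, None] * n_true) @ L              # Eq. (1.2), not clamped
    n_est, rho_est = lambertian_photometric_stereo(I, L)
    return n_true, rho_true, n_est, rho_est
```

With noise-free data and at least three non-coplanar lights, the recovered normals and albedos match the synthetic ground truth up to floating-point error.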

As summarized above, the solution to traditional photometric stereo is simple and stable. However, few objects strictly follow the Lambertian assumption for reflectance, and even many common diffuse materials like matte plastic, wood, fabric, etc. show deviations from this simplest assumption. To satisfy the distant lighting assumption, the data capture for photometric stereo has to be conducted in a dark room, by waving a point light source far away from the object. For objects and lights that violate these assumptions, the image formation model (Eq. (1.1)) and the solution method (Eq. (1.3)) reviewed above no longer hold, or hold only approximately; thus the estimated normals will be inaccurate. Hence, generalizing these assumptions is critical for designing a practical photometric stereo solution that produces consistently accurate results for a wider range of daily-life objects.

1.3 Generalized Photometric Stereo

We refer to generalized photometric stereo as a photometric stereo setup that does not strictly follow the assumptions listed in Sec. 1.2. When these assumptions are violated in various scenarios, the estimated surface normals will suffer from inaccuracy. In this dissertation, we mainly deal with the general assumptions on reflectance and lighting, as illustrated on the right side of Fig. 1.2, and propose photometric stereo solutions that produce accurate surface normal estimates under general reflectance/lighting assumptions.

1.3.1 Generalization of reflectance

In this dissertation, we refer to the general reflectance as an isotropic Bidirectional Reflectance Distribution Function (BRDF). Much more complicated than the Lambertian case where ρ is a constant, the reflectance ρ becomes a 4D function in general. When the reflectance is assumed to be isotropic, it becomes a 3D function denoted as ρ(θi, θr, |ϕi − ϕr|), where θ and ϕ are the elevation and azimuth angles of the incident and reflected (subscripts i and r) lighting directions in the normal-centered coordinate system (the surface normal is the positive Z-direction). The reflected lighting direction is also the viewing direction. In other words, ρ is a function of the surface normal n, lighting direction l, and viewing direction v, i.e., ρ(n, l, v), because the angles for i and r can be calculated if these three vectors are known. Recall that the Lambertian ρ depends on none of these vectors (n, l, and v). The isotropic BRDF can represent a wide variety of materials, as measured and evaluated in the MERL BRDF database [MPBM03]. One of the key problems to be solved in this dissertation is how to solve the photometric stereo problem given a general reflectance described by a 3D BRDF, or equivalently, how to model such general reflectance effectively in a shape estimation scenario. We call it the “general reflectance solution” in the remainder of this dissertation. Please also refer to the “Reflectance” blocks on the left and right sides of Fig. 1.2 for intuitive explanations of the reflectance assumptions used by traditional and generalized photometric stereo.

Our general reflectance assumption is different from the dichromatic model [Sha85], which is often used by parametric reflectance models such as the Ward reflectance model [War92]. These models often include a specular term besides the Lambertian diffuse term in an additive form, like “Lambertian + Specular”, where the Lambertian term still dominates the diffuse reflection. This problem can be well solved by using robust techniques that discard specularities as outliers, as was done in [WGS∗10]. Since the Lambertian assumption for the diffuse term is not always accurate in real cases, a more general reflectance such as the 3D BRDF defined above should be further considered to expand the range of materials that can be dealt with.

Evaluation method

For all the general reflectance solutions proposed in this dissertation, we evaluate the accuracy of their normal estimates by using all 100 measured isotropic BRDFs in the MERL database [MPBM03], as shown in Fig. 1.3. This database carefully measures and stores the reflectance values of 100 types of common materials, such as plastic, wood, metal, phenolic, acrylic, etc. We apply these reflectance values to synthesize input images for photometric stereo. The calculated normal direction is evaluated by its angular error, which is defined as the angle in degrees between the estimated normal and the ground truth normal. Then the mean angular error is computed over the normal estimates of all pixels. The mean angular errors for individual materials in the database and the average performance across all materials are important indicators for validating the generality of the proposed photometric stereo solutions against reflectance variations. Thus our goal is to propose solutions that consistently perform well for almost all the materials in the database.
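The error metric described above can be sketched as follows. The function name `mean_angular_error_deg` is hypothetical; the computation itself only follows the definition in the text (angle between estimated and ground truth normals, in degrees, averaged over pixels).

```python
import numpy as np

def mean_angular_error_deg(n_est, n_gt):
    """Mean angle in degrees between estimated and ground truth normals (..., 3)."""
    n_est = np.asarray(n_est, dtype=float)
    n_gt = np.asarray(n_gt, dtype=float)
    n_est = n_est / np.linalg.norm(n_est, axis=-1, keepdims=True)
    n_gt = n_gt / np.linalg.norm(n_gt, axis=-1, keepdims=True)
    # Clip to avoid NaNs from round-off slightly outside [-1, 1].
    cos_angle = np.clip(np.sum(n_est * n_gt, axis=-1), -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_angle)).mean())
```

For example, comparing [1, 0, 0] with [0, 0, 1] gives 90 degrees, and averaging that pair with a perfectly estimated normal gives 45 degrees.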

1.3.2 Generalization of lighting

Figure 1.3: All 100 materials in the MERL BRDF database: Pictures of spheres being measured. This figure is courtesy of Matusik et al. [MPBM03].

In this dissertation, we refer to the general lighting as any lighting condition beyond the ideal case of using a single distant point light source (distant/directional lighting) in a dark room (without ambient illumination); thus general cases such as natural environment lighting, multiple point light sources with or without ambient lighting, etc. are within the scope of our discussion. These lighting conditions are much more commonly encountered than distant lighting, in both indoor and outdoor scenarios. Intuitively speaking, for one surface point illuminated by a general lighting, the effect is equivalent to the integration of distant lightings with different intensities from all spherical directions, if that point is not occluded by other surfaces in any direction. Please also refer to the “Lighting” blocks on the left and right sides of Fig. 1.2 for illustrations of the lighting assumptions for traditional and generalized photometric stereo. The generalized assumption makes the modeling of lighting, as well as the photometric stereo formulation and solution, more complicated. But solving this problem plays an important role in designing practical photometric stereo methods, i.e., in bringing the photometric stereo setup outside the laboratory.

The general lighting can be measured using a chrome sphere, as was done in [YYT∗13], but in this dissertation, we treat this problem in an uncalibrated manner (without using a calibration sphere). We model the general lighting using high-order linear lighting coefficients represented by spherical harmonics. The detailed lighting model and solution will be introduced in Chapter 4.

For the “general lighting solution”, we still need to assume that the reflectance follows Lambert’s model. This is because if both reflectance and lighting are general, the problem becomes extremely difficult due to the inherent ambiguity. In that case, both reflectance and lighting have many degrees of freedom, which means that many different shapes, each paired with an incorrectly estimated general reflectance and an incorrectly estimated general lighting, can generate the same image appearance, with a higher probability than when either the reflectance or the lighting assumption is simplified. Note that this does not mean that photometric stereo with “both” general reflectance and general lighting assumptions cannot be solved. We believe that a solution to that difficult problem exists if priors on shape, reflectance, lighting, and their interactions can be well integrated, but in this dissertation, we focus on these two categories of general problems independently.

1.4 Chapter Organization

This dissertation introduces three novel solutions to generalized photometric stereo, two of which solve the general reflectance problem and one of which solves the general lighting problem.

In Chapter 2, we propose a novel solution for photometric stereo under general isotropic reflectance and calibrated distant lighting. This chapter exploits the monotonicity of general isotropic reflectances for estimating the elevation angles of surface normals given the azimuth angles. With an assumption that the reflectance includes at least one lobe that is a monotonic function of the angle between the surface normal and the half-vector (bisector of lighting and viewing directions), we prove that elevation angles can be uniquely determined when the surface is observed under varying directional lights densely and uniformly distributed over the hemisphere. We evaluate our method by experiments using synthetic and real data to show its wide applicability, even when the assumption does not strictly hold. By combining it with an existing method for azimuth angle estimation, our method derives complete surface normal estimates for general isotropic reflectances.

In Chapter 3, we solve the same problem as in Chapter 2 with a more general concept and more diverse applications. We present a bi-polynomial reflectance model that can precisely represent the low-frequency component of reflectances. Most previous reflectance models aim at accurately representing the complete reflectance domain for photo-realistic rendering purposes. In contrast, our bi-polynomial model is developed for the purpose of accurately solving inverse problems by effectively discarding the high-frequency component while retaining non-linear variations in the low-frequency part. The bi-polynomial reflectance models are useful for estimating the reflectance and shape of an object. Experimental evaluation in comparison with other parametric reflectance models demonstrates that the proposed models achieve better performance in reflectometry and photometric stereo applications.

In Chapter 4, we present a novel solution for photometric stereo under general environment lighting for Lambertian objects. We use a given coarse 3D shape to solve the problem, and this method can also work with uncontrolled sensors. We show that the coarse shape, or a shape prior, serves to solve two difficult issues: removing the shape-light ambiguity under unknown natural lighting and disregarding uncontrolled sensor gains and responses. Our method is well-suited to work with a low-cost RGBD camera, whose radiometric characteristics are totally unknown. We also show an application to 3D modeling from Internet images, where illumination and sensor characteristics are unknown. The effectiveness of the proposed method is assessed by quantitative and qualitative evaluations.

Chapter 5 concludes this dissertation by summarizing the proposed methods and discussing potential future research directions.


Chapter 2

General Reflectance Solution using Reflectance Monotonicity

This chapter introduces a solution to photometric stereo with general isotropic reflectance. As stated in the introduction chapter, the goal of the general reflectance solution is to let photometric stereo work with 3D BRDFs, like those of the materials measured in the MERL database. To do this, we adopt a “two-step” strategy in the method proposed in this chapter. We exploit the reflectance monotonicity for estimating the elevation angles (second step) of surface normals given the azimuth angles (first step) estimated by an existing method such as the approach in [AK07], to fully determine the surface normal for a broad class of reflectances.

2.1 Overview

As described by Chandraker and Ramamoorthi [CR11], an isotropic BRDF consists of a sum of lobes, and the reflectance of each lobe monotonically decreases as the surface normal deviates from the lobe’s projection direction, along which the reflectance function is “concentrated”. Following their work, we assume that a surface reflectance contains a single dominant lobe projected on the half-vector (the bisector of the lighting and viewing directions). Under this assumption, we perform a pixel-wise one-dimensional search for the elevation angle in the range [0, π/2], given an azimuth angle for each pixel. We prove that the reflectance monotonicity is maintained only when the correct elevation angle is found, if the scene is observed under dense directional lights uniformly distributed over the whole hemisphere.

Based on the evaluations using the MERL BRDF database [MPBM03] and the conclusion in [CR11], we show that many materials such as acrylic, phenolic, metallic-paint, and some shiny plastics can be well approximated by the 1-lobe half-vector BRDF model. While some other materials, such as fabric, require two or more lobes for precise representation, in practice, our algorithm still shows robustness against the deviations and performs accurate estimation when a dominant lobe projected on the half-vector exists. We assess the applicability of our method to 2-lobe BRDFs as well and verify the effectiveness of the proposed method on various real data.

2.2 Related Works

The traditional photometric stereo algorithm [Woo80] assumes Lambert’s reflectance model and can recover surface normal directions from as few as three images. With additional images, non-Lambertian phenomena such as shadows can be handled [BP03]. With more images, various robust techniques can be applied to statistically handle non-Lambertian outliers. There are approaches based on RANSAC [MIS07], median filtering [MHI10], and rank minimization [WGS∗10]. Spatial information has also been used for robustly solving the problem by using expectation maximization [WT10]. Some photometric stereo methods explicitly model surface reflectances with parametric BRDFs. Georghiades [Geo03] uses the Torrance-Sparrow model [TS67] for specular fitting, and Goldman et al. [GCHS10] build their method on the Ward model [War92].

For surfaces with general reflectance, various properties have been exploited to estimate surface normals, such as radiance similarity [SOYS07] and attached shadow codes [OSS09]. Higo et al.’s method [HMI10] uses the monotonicity and isotropy of general diffuse reflectances. Alldrin and Kriegman [AK07] show that the azimuth angle of the normal can be reliably estimated for isotropic materials using reflectance symmetry, if the lights are located on a view-centered ring. Later, their method was extended to solve for both shape and reflectance by further restricting the BRDF to be bivariate [AZK08]. Based on the symmetry property, a comprehensive theory is developed in [TQZ11], and a surface reconstruction method using a special lighting rig is introduced in [CBR11].

We exploit reflectance monotonicity to compute the per-pixel elevation angle of the surface normal given the azimuth angles estimated using [AK07]. Unlike the approach in [AZK08], which involves complex optimization for iteratively estimating shape and reflectance, our method avoids such optimization by taking advantage of the monotonicity of 1-lobe BRDFs.

Figure 2.1: Coordinate system and key variables.

2.3 Elevation Angle Estimation

Assume that the azimuth angles of the surface normals are already known. For example, we can use the method in [AK07] to obtain the azimuth angles first, although our method is not limited to a particular azimuth angle estimation method. Given the azimuth angles, our method performs a one-dimensional search to determine the elevation angles, which range from 0 to π/2 for visible normals.

2.3.1 1-lobe BRDF

We use bold letters to denote unit 3D vectors. l and v represent the lighting and viewing directions, respectively, and h is their unit bisector. For photometric stereo, we use the view-centered coordinate system with v = [0, 0, 1]⊤. As illustrated in Fig. 2.1, we use spherical coordinates (θ, ϕ) and (θl, ϕl) to represent the surface normal n and the lighting direction l, respectively. Taking n as an example, we have n = [cos θ cos ϕ, cos θ sin ϕ, sin θ]⊤.

The semi-parametric model proposed in [CR11] suggests that BRDFs can be well represented by the summation of several lobes in the form of ρ(n⊤k), where ρ is a monotonically increasing function, and k is referred to as a projection direction. We further assume that a dominant lobe ρ(n⊤h) exists (its weight is larger than those of the other lobes), whose projection direction is k = h. We analyze the monotonicity of this lobe and use this property for determining elevation angles.


2.3.2 BRDF profile

At a scene point, each hypothesized elevation angle θ′ uniquely determines a normal direction n′, which in turn leads to multiple hypothesized BRDF values obtained by dividing the observed scene radiance by the foreshortening term n′⊤l. For simplicity, we refer to these BRDF values across varying n′⊤h as a BRDF profile. Given the observed scene radiance i at a pixel, the hypothesized BRDF profile ρ′ is computed as

\[
\rho'(n'^\top h) = \frac{i}{n'^\top l} = \rho(n^\top h)\,\frac{n^\top l}{n'^\top l}. \tag{2.1}
\]

We prove that ρ′ is monotonic w.r.t. n′⊤h only when n′ is the correct surface normal direction (i.e., θ′ is the correct elevation angle θ), except for some degenerate lighting configurations. Based on this, we find the correct elevation angle θ using the monotonicity of the BRDF profile ρ′.
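The construction of a hypothesized BRDF profile from Eq. (2.1) can be sketched as follows. The function names are hypothetical, and the sketch makes one simplifying assumption of its own: lights with n′⊤l ≤ 0 are simply dropped here, whereas the full method described later handles them explicitly.

```python
import numpy as np

VIEW = np.array([0.0, 0.0, 1.0])  # view-centered coordinates, v = [0, 0, 1]^T

def normal_from_angles(theta, phi):
    """Unit normal from elevation theta and azimuth phi (Sec. 2.3.1)."""
    return np.array([np.cos(theta) * np.cos(phi),
                     np.cos(theta) * np.sin(phi),
                     np.sin(theta)])

def half_vectors(lights):
    """Unit bisectors of the lighting directions (q x 3) and the view direction."""
    h = lights + VIEW
    return h / np.linalg.norm(h, axis=1, keepdims=True)

def brdf_profile(i, lights, n_hyp):
    """Return (x', rho') of Eq. (2.1), sorted by x' = n'^T h.

    Assumption of this sketch: lights with n'^T l <= 0 are dropped.
    """
    nl = lights @ n_hyp
    keep = nl > 0
    x = half_vectors(lights[keep]) @ n_hyp
    rho = i[keep] / nl[keep]
    order = np.argsort(x)
    return x[order], rho[order]
```

When `n_hyp` equals the true normal and the radiance comes from a single monotonic half-vector lobe, the returned profile reproduces that lobe and is therefore non-decreasing in x′.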

2.3.3 Monotonicity of the BRDF profile

Consider two different lighting directions l1 and l2, and their associated half-vectors h1 and h2. Without loss of generality, we assume a surface normal with the azimuth angle ϕ = 0 and elevation angle θ, i.e., n = [cos θ, 0, sin θ]⊤. The hypothesized elevation angle θ′ leads to n′ = [cos θ′, 0, sin θ′]⊤. For notational simplicity, we define x1 = n⊤h1, x2 = n⊤h2, x′1 = n′⊤h1, x′2 = n′⊤h2 and y1 = ρ(x1), y2 = ρ(x2), y′1 = ρ′(x′1), y′2 = ρ′(x′2).

If ρ is a monotonically increasing function, the following condition holds: for x1 < x2, we have y1 < y2. There are two cases when ρ′ becomes non-monotonic:

1. The ordering of x is swapped, but y is not swapped: x′1 > x′2 and y′1 < y′2;

2. The ordering of y is swapped, but x is not swapped: x′1 < x′2 and y′1 > y′2.

In the following, we first discuss the conditions for the reordering of x and y respectively (which we call x-swap and y-swap for short hereafter), and then analyze under what condition an incorrect estimate of the elevation angle θ′ will break the monotonicity of ρ′.

Note here: (1) We discuss only the monotonically increasing case; a similar analysis can be applied to the monotonically decreasing case. (2) We focus on the case of two observations under lighting directions l1 and l2. In a photometric stereo setting, we often have far more than two input observations. The discussion applies to any pair of observations: ρ′ becomes non-monotonic if any observation pair breaks its monotonicity.


Necessary and sufficient condition for x-swap

Suppose that the ordering of x is changed by the hypothesized θ′, i.e., (x1 − x2)(x′1 − x′2) = (n⊤h1 − n⊤h2)(n′⊤h1 − n′⊤h2) < 0. From the definition of h and v = [0, 0, 1]⊤, we have

\[
h = \frac{l + v}{\|l + v\|} = \frac{[l_x, l_y, l_z + 1]^\top}{\sqrt{l_x^2 + l_y^2 + (l_z + 1)^2}} = \frac{[l_x, l_y, l_z + 1]^\top}{\sqrt{2 + 2 l_z}}. \tag{2.2}
\]

After some simple derivations, we have

\[
n^\top h_1 = \frac{1}{\sqrt{2 + 2 l_{1z}}}\bigl((l_{1z} + 1)\sin\theta + l_{1x}\cos\theta\bigr). \tag{2.3}
\]

n⊤h2 can be computed in a similar way, and their difference becomes

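\[
n^\top h_1 - n^\top h_2 = A\sin\theta + B\cos\theta = \sqrt{A^2 + B^2}\,\sin(\theta + \alpha), \tag{2.4}
\]

where

\[
A = \frac{l_{1z} + 1}{\sqrt{2 + 2 l_{1z}}} - \frac{l_{2z} + 1}{\sqrt{2 + 2 l_{2z}}}, \qquad
B = \frac{l_{1x}}{\sqrt{2 + 2 l_{1z}}} - \frac{l_{2x}}{\sqrt{2 + 2 l_{2z}}},
\]

and α = arctan(B/A), α ∈ [−π/2, π/2]. We obtain a similar equation for n′ by replacing θ in Eq. (2.4) with θ′.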

Therefore, the necessary and sufficient condition to change the ordering of x is (n⊤h1 − n⊤h2)(n′⊤h1 − n′⊤h2) = (A² + B²) sin(θ + α) sin(θ′ + α) < 0. In other words, sin(θ + α) and sin(θ′ + α) should have different signs. This condition is true only when α < 0 and θ < −α < θ′ (or θ′ < −α < θ), since θ and θ′ are both within [0, π/2], and α is within [−π/2, π/2]. Note that α is completely determined by the two lighting directions l1 and l2, and is independent of θ and θ′. Hence, a larger difference between θ and θ′ makes an x-swap more likely to happen.
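The x-swap condition can be checked numerically with a short sketch. All function names are hypothetical. One detail worth noting: with α = arctan(B/A), the amplitude form in Eq. (2.4) equals A sin θ + B cos θ only up to the sign of A; since this sign is shared by θ and θ′, it cancels in the product, so the sign test on sin(θ + α) sin(θ′ + α) is unaffected.

```python
import numpy as np

VIEW = np.array([0.0, 0.0, 1.0])

def light_from_angles(theta_l, phi_l=0.0):
    """Unit lighting direction from elevation theta_l and azimuth phi_l."""
    return np.array([np.cos(theta_l) * np.cos(phi_l),
                     np.cos(theta_l) * np.sin(phi_l),
                     np.sin(theta_l)])

def a_b_alpha(l1, l2):
    """A, B, and alpha of Eq. (2.4); assumes A != 0."""
    s1, s2 = np.sqrt(2 + 2 * l1[2]), np.sqrt(2 + 2 * l2[2])
    A = (l1[2] + 1) / s1 - (l2[2] + 1) / s2
    B = l1[0] / s1 - l2[0] / s2
    return A, B, np.arctan(B / A)

def x_difference(theta, l1, l2):
    """n^T h1 - n^T h2 computed directly from the definition of h, for phi = 0."""
    n = np.array([np.cos(theta), 0.0, np.sin(theta)])
    h1 = (l1 + VIEW) / np.linalg.norm(l1 + VIEW)
    h2 = (l2 + VIEW) / np.linalg.norm(l2 + VIEW)
    return float(n @ h1 - n @ h2)

def x_swap(theta, theta_hyp, l1, l2):
    """True if the hypothesized elevation angle swaps the ordering of x."""
    _, _, alpha = a_b_alpha(l1, l2)
    return bool(np.sin(theta + alpha) * np.sin(theta_hyp + alpha) < 0)
```

For two lights on the longitude ϕl = 0 with elevations 0.2 and 0.6 rad, −α is about 0.985 rad, so a true θ = 0.5 with hypothesis θ′ = 1.2 produces an x-swap, whereas θ′ = 0.8 does not.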

Sufficient condition for y-swap

Next, we consider when the ordering of y will be changed, i.e., we discuss the case where y1 < y2 and y′1 > y′2. We assume that l1 is close to l2 such that ρ(n⊤h1) is close to ρ(n⊤h2). This is true when the lighting is dense, neighboring lighting directions are sampled, and the lighting directions do not cause a highlight reflection at n. Note that by focusing on nearby lighting directions, we can only derive a sufficient condition for swapping the ordering of y, because two lights far apart could also cause a y-swap. Under these assumptions and according to Eq. (2.1), y′ is mainly determined by n⊤l/n′⊤l. For l1, we obtain

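\[
\frac{n^\top l_1}{n'^\top l_1} = \frac{l_{1x}\cos\theta + l_{1z}\sin\theta}{l_{1x}\cos\theta' + l_{1z}\sin\theta'} = \frac{\sin(\theta + \beta_1)}{\sin(\theta' + \beta_1)}, \qquad \tan\beta_1 = \frac{l_{1x}}{l_{1z}}. \tag{2.5}
\]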

From the relationship sin(θ + β1)/sin(θ′ + β1) > sin(θ + β2)/sin(θ′ + β2) > 0, we can derive the sufficient condition for y-swap as

\[
(\theta' - \theta)\, l_{12y} < 0. \tag{2.6}
\]


Here, l12y is the y-component of the cross product of l1 and l2. The detailed derivation is given below.

Proof. If n⊤l > 0 and n′⊤l > 0, from Eq. (2.5), the following holds:

\[
\frac{\sin(\theta + \beta_1)}{\sin(\theta' + \beta_1)} > \frac{\sin(\theta + \beta_2)}{\sin(\theta' + \beta_2)} > 0.
\]

Therefore, we have

\[
\sin(\theta + \beta_1)\sin(\theta' + \beta_2) - \sin(\theta' + \beta_1)\sin(\theta + \beta_2) > 0.
\]

By applying the product-to-sum trigonometric identities, it is simplified as

\[
\cos(\theta - \theta' + \beta_1 - \beta_2) - \cos(\theta' - \theta + \beta_1 - \beta_2) > 0.
\]

By applying the sum-to-product identities, this further becomes

\[
\sin(\theta' - \theta)\sin(\beta_1 - \beta_2) > 0.
\]

Since θ, θ′ ∈ [0, π/2], we can replace sin(θ′ − θ) with (θ′ − θ) while retaining the inequality. From the definition of tan β, we have the following relation:

\[
\sin(\beta_1 - \beta_2) = \sin\beta_1\cos\beta_2 - \cos\beta_1\sin\beta_2
= \frac{l_{1x}}{\sqrt{l_{1x}^2 + l_{1z}^2}}\,\frac{l_{2z}}{\sqrt{l_{2x}^2 + l_{2z}^2}} - \frac{l_{1z}}{\sqrt{l_{1x}^2 + l_{1z}^2}}\,\frac{l_{2x}}{\sqrt{l_{2x}^2 + l_{2z}^2}}.
\]

This indicates that the sign of sin(β1 − β2) is the same as that of l1xl2z − l1zl2x = (l2 × l1)y = −l12y. Therefore, the inequality of Eq. (2.6) holds.
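The two trigonometric steps of the proof can be verified numerically. The following is a sketch with hypothetical function names; `arctan2` is used so that sin β and cos β keep the signs of lx and lz.

```python
import numpy as np

def beta(l):
    """Angle with tan(beta) = l_x / l_z (Eq. (2.5)); arctan2 keeps the quadrant."""
    return np.arctan2(l[0], l[2])

def y_swap_identities(theta, theta_hyp, l1, l2):
    """Return both sides of the two identities used in the proof of Eq. (2.6)."""
    b1, b2 = beta(l1), beta(l2)
    # Cross-multiplied form of the ratio inequality.
    lhs = (np.sin(theta + b1) * np.sin(theta_hyp + b2)
           - np.sin(theta_hyp + b1) * np.sin(theta + b2))
    # Product form after the product-to-sum and sum-to-product steps.
    rhs = np.sin(theta_hyp - theta) * np.sin(b1 - b2)
    # sin(beta1 - beta2), rescaled by the x-z norms, versus -l12y.
    r1, r2 = np.hypot(l1[0], l1[2]), np.hypot(l2[0], l2[2])
    scaled_sin = np.sin(b1 - b2) * r1 * r2
    minus_l12y = -np.cross(l1, l2)[1]
    return lhs, rhs, scaled_sin, minus_l12y
```

For any pair of lighting directions, `lhs` equals `rhs`, and `scaled_sin` equals `minus_l12y`, so the sign of sin(β1 − β2) is indeed the sign of −l12y.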

Eq. (2.6) indicates that the ordering of y depends on two factors: (1) the sign of (θ′ − θ), i.e., whether the hypothesized θ′ is larger or smaller than the true value, and (2) the relative positions of the two lights l1 and l2.

Sufficient condition for unique solution

For any hypothesized θ′, if it swaps the ordering of y (or x) while keeping the ordering of x (or y) unchanged, the function ρ′ will become non-monotonic. Hence, if we ensure that an incorrect θ′ will always break the monotonicity of ρ′, we are able to find the correct elevation angle θ by choosing the θ′ that makes ρ′ monotonic.

For a monotonically increasing ρ, we have x1 = n⊤h1 < x2 = n⊤h2, or equivalently θh1 > θh2. We begin with the case where lights are densely distributed along the same longitude as the normal n, i.e., ϕl = ϕ = 0. Since we require θh1 > θh2, h2 is closer to n than h1. Further, as illustrated in Fig. 2.2, h1 and h2 are restricted to the red dotted line when the lighting directions l1 and l2 move on the green dotted line. By observing the geometric relationship between h and l, we obtain the following results:


Figure 2.2: Lights, half-vectors and normal on the same longitude. (a) θ ≤ π/4; (b) θ > π/4.

Figure 2.3: (a) α values for ϕl = 0 and θl1, θl2 ∈ [0, π/2]; (b) max(α) values for ϕl ∈ (0, π/2].

1. As shown in Fig. 2.2(a), if θ ≤ π/4, we always have θl1 > θl2 (l1 is closer to v than l2). This ensures l12y > 0. Then from Eq. (2.6), y-swap always happens if the hypothesized θ′ satisfies θ′ < θ. In contrast, y-swap might not happen when θ′ > θ (because we only have a sufficient condition for y-swap);

2. As shown in Fig. 2.2(b), if θ > π/4, the relative positions of l1 and l2 cannot be determined from θh1 > θh2. There are two cases: (1) If h1 and h2 are both farther from v than n, we should have θl2 > θl1 (l2 is closer to v than l1). Hence, l12y < 0; (2) If h1 and h2 are both closer to v than n, we should have θl1 > θl2 (l1 is closer to v than l2, which is similar to the case in Fig. 2.2(a)). Hence, l12y > 0. When lights are densely distributed along the longitude, we can always find pairs of lights satisfying each of these two cases. As a result, Eq. (2.6) can hold to cause y-swap no matter what θ′ is.

Now we know that y-swap might not happen when θ ≤ π/4 and θ′ > θ. Next, we analyze when x-swap happens. If neither swap happens, an incorrect estimate of the elevation angle θ′ will not break the monotonicity of ρ′, and we cannot tell whether that θ′ is correct or not. Due to the complexity of the analytic form of α, we simulate and plot the α values for all combinations of l1 and l2 on the same longitude as n in Fig. 2.3(a). It is interesting to note that α continuously changes in [−π/2, max(α)], where max(α) ≈ −π/4. In other words, −α covers the whole range of [−max(α), π/2]. Recall that the necessary and sufficient condition for x-swap to happen is that θ′ < −α < θ (or θ < −α < θ′). Thus, if both θ′ and θ are smaller than −max(α), x-swap will never happen.
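The simulation of α over light pairs on one longitude can be reproduced with a small sketch. The sampling density and function names are assumptions for illustration, and only the ϕl = 0 case has been checked here against the range stated above.

```python
import numpy as np

def alpha_for_pair(l1, l2):
    """alpha = arctan(B / A) from Eq. (2.4); assumes A != 0."""
    s1, s2 = np.sqrt(2 + 2 * l1[2]), np.sqrt(2 + 2 * l2[2])
    A = (l1[2] + 1) / s1 - (l2[2] + 1) / s2
    B = l1[0] / s1 - l2[0] / s2
    return np.arctan(B / A)

def alpha_range_on_longitude(phi_l=0.0, n_samples=50):
    """Max and min of alpha over all pairs of distinct lights on longitude phi_l."""
    thetas = np.linspace(0.0, np.pi / 2, n_samples)  # assumed sampling of theta_l
    lights = np.stack([np.cos(thetas) * np.cos(phi_l),
                       np.cos(thetas) * np.sin(phi_l),
                       np.sin(thetas)], axis=1)
    alphas = [alpha_for_pair(lights[a], lights[b])
              for a in range(n_samples) for b in range(a + 1, n_samples)]
    return max(alphas), min(alphas)
```

For ϕl = 0, the returned maximum is slightly below −π/4 and the minimum stays above −π/2, which agrees with the range [−π/2, max(α)] with max(α) ≈ −π/4 described in the text.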

Combining the conditions under which neither x-swap nor y-swap happens, we conclude that θ′ ∈ [θ, −max(α)] is the degenerate interval when the lights lie on the longitude ϕl = 0, where the monotonicity of ρ′ is not broken by any incorrect θ′.

Next, we consider lights along other longitudes, i.e., ϕl ∈ (0, π/2]. The same analysis can be applied, and similarly, the monotonicity of ρ′ is not broken by any incorrect θ′ within the degenerate interval [θ, −max(α)ϕl]. Here, −max(α)ϕl depends on the azimuth angle ϕl of the lighting directions on that longitude. We plot max(α) as a function of ϕl in Fig. 2.3(b). We find that max(α) increases from about −π/4 to 0 as ϕl approaches π/2. Since the monotonicity of ρ′ is lost if any pair of lights on any longitude breaks it, the final degenerate interval is the intersection of the degenerate intervals [θ, −max(α)ϕl] from all longitudes. This intersection is an empty set, because −max(α)ϕl approaches 0. In other words, if we have lights on all longitudes with azimuth angles ϕl ∈ [0, π/2], we can uniquely determine the elevation angle θ for normals with ϕ = 0.

To recover the elevation angle of arbitrary surface normals, we need lights spanning all longitudes so that they cover the whole hemisphere. These lights form a dense and uniform distribution over the hemisphere. In practice, we capture images under random lighting directions that approximate this uniformity.

2.4 Normal Estimation Method

We perform a one-dimensional search for θ within [0, π/2] at each pixel. For each hypothesized value θ′, we assess the reflectance monotonicity. Specifically, we compute the hypothesized BRDF values y′ and evaluate their monotonicity w.r.t. x′. To measure the reflectance monotonicity, we calculate the derivatives (discrete differentiation) of y′ and sum up the absolute values of the negative derivatives, denoted as δ(θ′):

\[
\delta(\theta') = \sum_{x'} \max\left(-\frac{d y'}{d x'},\, 0\right). \tag{2.7}
\]

Algorithm 1 Normal estimation by using reflectance monotonicity
INPUT: Scene radiance values i, lighting directions l, thresholds i0, i∞.
Compute the azimuth angle of normal ϕ using the method in [AK07];
for each pixel do
    for θ′ ∈ [0, π/2] do
        Let n′ = [cos θ′ cos ϕ, cos θ′ sin ϕ, sin θ′]⊤;
        Calculate y′ based on Eq. (2.1);
        If any l causes y ≤ i0 (in shadow), set its corresponding y′ = i0;
        If any n′⊤l ≤ 0, set its corresponding y′ = i∞;
        Order y′ = i/(n′⊤l) w.r.t. x′ = n′⊤h;
        Evaluate the cost values using Eq. (2.7);
    end for
    θ = argmin_θ′ δ(θ′);
    n = [cos θ cos ϕ, cos θ sin ϕ, sin θ]⊤;
end for
OUTPUT: Normals for all pixels.

The cost δ penalizes monotonically decreasing sequences, i.e., a larger δ value indicates a less monotonically increasing y′. We show typical cost functions in Fig. 2.4 for a normal with θ = π/6 and ϕ = 0 under different lighting distributions. Fig. 2.4(a) is the cost computed when all lights are located on the same longitude as the normal. As we have proved in Sec. 2.3.3, there is a degenerate interval θ′ ∈ [θ, −max(α)]. Indeed, our cost function is almost constant in [π/6, π/4], and we cannot tell which value within this interval is the correct elevation angle. Fig. 2.4(b) shows the cost computed with lights distributed over the whole hemisphere. It has a clear global minimum, from which the elevation angle is estimated.

The complete normal estimation algorithm is summarized as Algorithm 1. There are a few implementation details: (1) We discard shadows by simple thresholding. For scene radiance values normalized to 1, we use a threshold i0 = 10^−6 for synthetic data, and for real data the threshold is manually chosen through cross validation (typically set to 0.02 in our experiments); (2) When calculating y′, we set it to a large value (i∞ = 10^10) if n′⊤l ≤ 0; (3) When evaluating the monotonicity, we apply a monotonic mapping y′ ← y′^γ (γ is empirically set to 5 in all experiments) for data normalization purposes.
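The per-pixel search can be sketched in NumPy as below. This is an illustrative sketch rather than the exact implementation: the function names, the 1-degree search grid, and the synthetic lobe in `demo` are assumptions. Two simplifications are also made here: shadowed observations (i ≤ i0) are removed from the profile instead of being assigned y′ = i0, and the cost sums the decreases between consecutive y′ values sorted by x′ instead of dividing by the x′ spacing.

```python
import numpy as np

VIEW = np.array([0.0, 0.0, 1.0])  # view-centered coordinates

def monotonicity_cost(x, y):
    """Sum of the decreases of y when sorted by x (a variant of Eq. (2.7))."""
    order = np.argsort(x)
    dy = np.diff(y[order])
    return float(np.sum(np.maximum(-dy, 0.0)))

def estimate_elevation(i, lights, phi, i0=1e-6, i_inf=1e10, gamma=5.0, n_steps=91):
    """1-D search for the elevation angle of one pixel with known azimuth phi."""
    i = np.asarray(i, dtype=float)
    lights = np.asarray(lights, dtype=float)
    lit = i > i0                       # assumption: shadowed observations are dropped
    i, lights = i[lit], lights[lit]
    h = lights + VIEW
    h /= np.linalg.norm(h, axis=1, keepdims=True)
    thetas = np.linspace(0.0, np.pi / 2, n_steps)
    costs = np.empty(n_steps)
    for k, t in enumerate(thetas):
        n_hyp = np.array([np.cos(t) * np.cos(phi), np.cos(t) * np.sin(phi), np.sin(t)])
        nl = lights @ n_hyp
        y = np.full(i.shape, i_inf)    # y' = i_inf where n'^T l <= 0
        valid = nl > 0
        y[valid] = i[valid] / nl[valid]            # Eq. (2.1)
        costs[k] = monotonicity_cost(h @ n_hyp, y ** gamma)
    return thetas[int(np.argmin(costs))], costs

def demo(theta_true=np.pi / 6, phi=0.0, n_lights=300, seed=0):
    """Synthetic test with an assumed monotonic 1-lobe BRDF rho(x) = 0.1 + 0.9 x^8."""
    rng = np.random.default_rng(seed)
    lights = rng.normal(size=(n_lights, 3))
    lights[:, 2] = np.abs(lights[:, 2])            # random lights on the hemisphere
    lights /= np.linalg.norm(lights, axis=1, keepdims=True)
    n = np.array([np.cos(theta_true) * np.cos(phi),
                  np.cos(theta_true) * np.sin(phi),
                  np.sin(theta_true)])
    h = lights + VIEW
    h /= np.linalg.norm(h, axis=1, keepdims=True)
    rho = 0.1 + 0.9 * np.clip(h @ n, 0.0, None) ** 8
    i = rho * np.clip(lights @ n, 0.0, None)       # attached shadows give i = 0
    theta_est, _ = estimate_elevation(i, lights, phi)
    return theta_true, theta_est
```

Calling `demo()` runs the search on noise-free synthetic data. For real data, the threshold `i0` would need to be raised as described above, and the azimuth angle `phi` would come from a separate method such as [AK07].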


Figure 2.4: Example of cost functions for θ = π/6 and θ′ ∈ [0, 90] degrees (normalized by a monotonic function f, shown at the bottom right, for visualization purposes). (a) Using lights on the same longitude as the normal; (b) Using lights covering the whole hemisphere.


2.5 Experiments

We evaluate the accuracy of elevation angle estimation using all 100 materials in the MERL database, and with a varying number of lights. Synthetic experiments using 2-lobe BRDFs are performed to assess the robustness of our method against deviation from the 1-lobe BRDF assumption. Finally, we show our normal estimates on real data.

2.5.1 Synthetic data

Performance on measured BRDFs

We sample 1620 normal directions from the visible hemisphere by uniformly choosing 36 longitudes and 45 altitudes. We generate their observations under 337 lighting directions uniformly sampled on the hemisphere. For the sampling, we use the vertices defined by an icosahedron with a tessellation order of three. The average elevation angle errors of the 1620 samples for all materials in the MERL database are summarized in Fig. 2.5. The average error over all 100 materials is 0.77° from the ground truth.

We observe that the error is relatively large for materials that cannot be well represented by a monotonic 1-lobe BRDF with the half-vector as the projection direction, for example materials A and B in Fig. 2.5 (see the ρ-n⊤h plots above the rendered spheres). For those materials, the optimal 1-lobe BRDF representations have a projection direction that differs considerably from h (see the spheres of RMS error distribution in the ρ-n⊤h plots). Therefore, if we force them to project on h and apply our method, the results have relatively larger errors.

We use two synthetic models, Bunny and Caesar, to test the accuracy of normal estimates, as shown in Fig. 2.6. Each model is rendered using two materials as an object with spatially varying BRDFs, and each dataset contains 100 images under a random lighting distribution. The Bunny model contains the Blue-metallic-paint and Yellow-phenolic materials, and the Caesar model is rendered using the Green-acrylic and Red-specular-plastic reflectances. One of the rendered images and its corresponding materials are shown on the left side of the results, and next to them we show photometric stereo results of estimated azimuth angles, elevation angles, surface normals, and their difference maps. We find that the errors reported here are slightly larger than those obtained when the true azimuth angles are given and only the elevation angle is estimated. This is because the azimuth angle estimates contain some errors. The optimal lighting condition for azimuth angle estimation should be that all the lights are uniformly placed along the same altitude (a ring-light distribution), according to [AK07], but the input lighting distribution is random here. Hence, we first need to apply some image-based rendering technique such as the “spherical linear interpolation” [Sho85] to approximate


Figure 2.5: Elevation angle errors (degree) on 100 materials (ranked by errors in a descending order). Rendered spheres of some representative materials and their BRDF values in the ρ-n⊤h space are shown near the curve. The spheres in the ρ-n⊤h plots show the one-dimensional projection RMS errors using all the directions on the hemisphere (dark blue means small, and red means large errors). [Plot: material names versus elevation angle error (deg.); legend entries L100, L337, L100 + Mapping, L337 + Mapping, with listed average errors 1.78, 1.26, 1.37, 0.77; representative materials are labeled A to E.]


Figure 2.6: Photometric stereo results on synthetic models with measured BRDFs. On the leftmost part, one of the input images and its corresponding materials are displayed. Next to the input images and material samples, we show the estimated azimuth angles, elevation angles, normals, and their difference maps w.r.t. the ground truth. The numbers on the difference maps show the mean angular error in degrees.


the input data as if they were illuminated by lights distributed along one altitude, given the images illuminated by randomly distributed lights, before computing the azimuth angles using the method in [AK07]. Besides the interpolation error, cast shadow is another reason for inaccurate azimuth angle estimation. Please note in Fig. 2.6 (see the difference maps of azimuth angles) that the biggest azimuth angle errors often appear in concave areas of the surface where cast shadow is strong. The errors embedded in the azimuth angles unavoidably deteriorate our elevation angle estimation, but removing those errors is beyond the capability of the proposed elevation angle estimation algorithm.

Performance with varying number of lights

Next, we evaluate the performance variation with different numbers of lighting directions (input images). We perform the same experiment using 25, 50, 100, and 200 random lighting directions (randomly sampled from the 337 uniformly distributed lights). The errors and light distributions are shown in Fig. 2.7. Empirically, about 100 random lights provide an average elevation angle error of around 1°. Thus, in the following tests, we fix the number of lights to 100.

Performance on 2-lobe BRDFs

To demonstrate the robustness of our method, we evaluate it on synthetic data using 2-lobe BRDFs in Fig. 2.8. The weights of the two lobes, k1 and k2, are varied from 0 to 1. Fig. 2.8(a) shows our result on the Cook-Torrance model [CT82] (roughness m = 0.5) with a Lambertian diffuse lobe and a specular lobe. Note that the specular lobe of the Cook-Torrance model is not centered at h. Nevertheless, our method gives accurate results for different relative strengths of the two lobes. The error in estimated elevation angles is the largest (about 1°) when the BRDF is completely dominated by the specular lobe.

As discussed in [CR11], some fabric materials contain lobes with projection direction v or (v + 2l). Hence, we deliberately create such a BRDF as k1ρ1(n⊤v) + k2ρ2(n⊤(v + 2l)) and evaluate our method on it. Note that neither lobe is centered at h. Here, ρ1(x) = ρ2(x) = x. In Fig. 2.8(b), we plot the elevation angle errors for different values of k1 and k2. The errors are smaller than 3° for most of the combinations.

Finally, we create a 2-lobe BRDF k1ρ1(n⊤h) + k2ρ2(n⊤k) by combining a lobe centered at h and another one with a random direction k. Here, k is a direction randomly sampled on the sphere for each pixel under each lighting direction. The elevation angle errors are shown in Fig. 2.8(c). We can observe that for more than half of all cases, our method generates errors smaller than 5°. Our method works reasonably well, with errors smaller than 3°, when the relative strength of the random lobe is below 0.3.

Figure 2.7: Average elevation angle errors (degree) on 100 materials (Y-axis) varying with the number of lighting directions. Below the curve, the blue dots on spheres show the light distribution in each case (corresponding to the X-axis of the curve). [Plotted values for 25, 50, 100, 200, and 337 lights: 3.98, 2.14, 1.37, 0.88, and 0.77.]

Figure 2.8: Elevation angle errors (degree) on 2-lobe BRDFs. (a) The Cook-Torrance model, where k1 is the Lambertian and k2 is the specular strength; (b) k1ρ1(n⊤v) + k2ρ2(n⊤(v + 2l)); (c) k1ρ1(n⊤h) + k2ρ2(n⊤k), where k is a random direction.

2.5.2 Real data

We show the estimated normals by our method on real data. As introduced in Sec. 1.2, we visualize the estimated normals by linearly encoding the XYZ components of normals in the RGB channels of an image (except for the Apple data in Fig. 2.10, where a different mapping is used for comparison purposes). First, in Fig. 2.9, our method is compared with Lambertian photometric stereo [Woo80] by showing the estimated normals and the Lambertian shadings calculated from the estimated normals with the same lighting direction as the input image. For such non-Lambertian materials, our method shows much more reasonable normal estimates. Next, we compare with the method in [AZK08] by showing the surface reconstructions (according to the method in [Kov05]) from the estimated normals in Fig. 2.10(b). Due to the lack of ground truth, we cannot make a quantitative comparison, but qualitatively, our method shows results similar to those of [AZK08]. In terms of complexity, our method is much simpler and computationally inexpensive for deriving the elevation angle.

Figure 2.9: Results on Gourd1 (102) [AZK08] (the number in parentheses indicates the number of images in the dataset). (a) One input image; (b), (c) Our normal and Lambertian shading; (d), (e) Normal and shading from Lambertian photometric stereo. Note that (b) and (d) have an average angular difference of about 12°.

Figure 2.10: Results on Apple (112) [AZK08]. (a) One input image; (b), (c) Our normal and reconstructed surface; (d), (e) Normal and surface shown in [AZK08]. Note that here we use the color mapping [(nx+1)/2, (ny+1)/2, nz] → [R, G, B] for comparison purposes. The shapes look different partly because we do not know the exact reconstruction method and rendering parameters used in [AZK08].
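The color mapping stated in the caption of Fig. 2.10, [(nx+1)/2, (ny+1)/2, nz] → [R, G, B], can be sketched in a few lines. The function name and the final clipping to [0, 1] are assumptions made for illustration; the encoding of Sec. 1.2 used for the other figures is not reproduced here.

```python
import numpy as np

def encode_normals_rgb(normals):
    """Map unit normals of shape (..., 3) to RGB values in [0, 1]."""
    normals = np.asarray(normals, dtype=float)
    rgb = np.empty_like(normals)
    rgb[..., 0] = (normals[..., 0] + 1.0) / 2.0   # R = (nx + 1) / 2
    rgb[..., 1] = (normals[..., 1] + 1.0) / 2.0   # G = (ny + 1) / 2
    rgb[..., 2] = normals[..., 2]                 # B = nz
    return np.clip(rgb, 0.0, 1.0)                 # assumed safeguard against small overshoot

# A normal facing the camera maps to a light blue.
frontal = encode_normals_rgb([0.0, 0.0, 1.0])
```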

Finally, we show our estimated normals on other materials with various reflectances, such as plastic, metal, and paint, in Fig. 2.11. The datasets on the left side are from [AZK08] and [CEJ∗06]; those on the right side were captured by ourselves with a Sony XCD-X710CR camera. The input images and the Lambertian shadings rendered using the estimated normals are quite consistent, e.g., at the attached shadow boundaries, so we claim that our estimated normals are of high accuracy. Some noisy points observed in the results are mainly caused by pixels with very dark intensities and hence a low signal-to-noise ratio.


Figure 2.11: Real data results. Left part: Gourd2 (98) [AZK08], Helmet side right (119) (only 119 out of 253 images in the original dataset are used) [CEJ∗06], Kneeling knight (119) [CEJ∗06]; Right part: Pear (77), God (57), Dinosaur (118). We show one input image, the estimated normals, and the Lambertian shadings for each dataset.


2.6 Summary

To solve the photometric stereo problem with general reflectance, we estimate the two components of a surface normal, i.e., the azimuth angle and the elevation angle, separately. Since azimuth angle estimation is a well-solved problem, in this chapter we show a method for estimating the elevation angles of surface normals by exploiting reflectance monotonicity given the azimuth angles. We assume that the BRDF contains a dominant monotonic lobe projected on the half-vector, and prove that the elevation angle can be uniquely determined under dense lights uniformly distributed on the hemisphere. In synthetic experiments, we first demonstrate the accuracy of our method on a broad category of materials. We further evaluate its robustness to deviations from our assumption about BRDFs. Various real-data experiments also show the effectiveness of the proposed method.

Limitation

Our method assumes known azimuth angles. Joint estimation of both azimuth and elevation angles makes the problem prohibitively difficult due to its highly non-linear nature. As indicated in Fig. 2.6, when the estimated azimuth angles are not accurate, the elevation angle estimation deteriorates accordingly. To be more specific, for the experiment of Fig. 2.5, the average elevation angle errors over 100 materials increase to 1.16, 2.24, 3.32, and 4.72 degrees with additive azimuth angle errors normally distributed with mean 0 and standard deviation 0.5, 1, 1.5, and 2 degrees, respectively. Making the proposed method robust against inaccurate azimuth angles is an interesting direction. Besides, while we have discussed only the sufficient condition, deriving a compact lighting configuration that uniquely determines the elevation angles is our future work.


Chapter 3

General Reflectance Solution using Bi-polynomial Reflectance Model

This chapter introduces another solution to photometric stereo with general isotropic reflectance. The method in this chapter follows the same assumptions of photometric stereo as the method in the previous chapter. Instead of solving for the azimuth angles and elevation angles separately as done in the previous chapter, this time we solve for the surface normal as a whole. In addition, the method in this chapter promotes the solution to a higher-level principle in the form of a novel reflectance model. Besides shape estimation using photometric stereo, an application to reflectance estimation, namely reflectometry, based on the proposed reflectance model is also introduced.

3.1 Overview

Parametric modeling of reflectances plays an important role in both rendering and inverse problems in radiometric image analysis. The vast majority of existing parametric reflectance models are developed for the purpose of photo-realistic rendering. They are designed to have an accurate representation of specular components and have been successfully applied to forward problems in computer graphics. However, these reflectance models are not necessarily suitable for inverse problems in computer vision, such as reflectance and shape estimation. In fact, many of these reflectance models severely complicate the inverse problems by introducing high nonlinearity when they are directly used in the computation, and as a result, the solution methods are forced to involve unstable and expensive nonlinear (or even non-convex) optimization procedures. While one could use a simplistic model to avoid such a problem, e.g., Lambert’s reflectance model, the accuracy of estimates suffers from its discrepancy from real-world reflectance. Therefore, it is desirable to develop a reflectance model that represents real-world reflectances well while retaining simplicity for inverse problems.

In forward rendering problems, one of the key challenges is to accurately model specular reflection that exhibits high-frequency reflectance variations in the incident or exitant angular domain. Since the specular component varies significantly across materials, the specular term of a reflectance model tends to become complex and highly nonlinear in order to represent it faithfully. While it is essential for photo-realistic rendering in computer graphics, explicit modeling of high-frequency specular reflections is seldom necessary for inverse problems in computer vision, particularly when sparse lighting (such as a directional light) is used. In fact, in an image, most of the pixels exhibit low-frequency reflections (close to diffuse reflectances) for most materials under sparse lighting. For example, high-frequency (strong specular) reflections are only observed in a sparse manner at points where the surface normal is close to the bisector of the viewing and lighting directions. Recent robust photometric stereo methods [WGS∗10, IWMA12] are built upon a similar observation, where high-frequency reflectances are treated as outliers.

Motivated by this observation, we develop a compact parametric Bidirectional Reflectance Distribution Function (BRDF) model for radiometric image analysis using a bi-polynomial representation. Restricting it to isotropic BRDFs, we design this model with two goals. First, it should be able to faithfully represent low-frequency reflectances of a broad class of materials. Second, it should make the solution of inverse problems tractable. Our model is built upon a factorized form of bivariate BRDF models for isotropic materials [Rus98, RVZ08], where the BRDF is represented as a product of two univariate functions of half and difference angles. We approximate these univariate functions by low-order polynomials. In the preliminary version of this work [STMI12a], we employed quadratic functions for these univariate functions. In this work, we extend it to a general bi-polynomial model and perform detailed analysis and validations across varying polynomial orders. We further perform comprehensive comparisons with various BRDF models by applying this model to reflectometry and photometric stereo. We show that accurate results can be obtained by analyzing the low-frequency reflectances with our proposed model.

The rest of this chapter is organized as follows: In the next section, we discuss the previous works in related areas. In Sec. 3.3, we define low-frequency reflectance and show its approximation by using low-intensity observations. In Sec. 3.4, we introduce the proposed bi-polynomial model. We then validate our model by fitting measured BRDFs and compare with existing parametric models. In Sec. 3.5 and Sec. 3.6, we show applications of our model to reflectometry and photometric stereo problems. Sec. 3.7 summarizes this chapter.


3.2 Related Works

To precisely represent the appearance of real-world materials, various parametric BRDF models have been developed over the decades. These BRDF models can be categorized into physically-based and empirical models. Physically-based models, such as the Torrance-Sparrow [TS67] and Cook-Torrance [CT82] models, are mostly designed based on the microfacet theory. They assume that surfaces consist of shiny V-grooves with consideration of geometric attenuation (masking, shadowing, and inter-reflections) and Fresnel effects. Based on a similar microfacet theory, the Oren-Nayar model [ON93] captures the reflectance of rough surfaces. The empirical models such as the Phong [Pho75] and Blinn-Phong [Bli77] models are widely used because of their computational efficiency. The Lafortune model [LFTG97] uses generalized cosine lobes. This model can fit measured reflectance data with high precision, but with unintuitive parameters. There are also models that bridge these two categories by partly using physically motivated terms, e.g., the Ward [War92] and Ashikhmin [AS00] models. The Ward model is also based on the microfacet theory, but omits the Fresnel and geometric attenuation terms. The Ashikhmin model is an anisotropic BRDF model with a non-Lambertian diffuse term. An experimental evaluation of various models with measured data can be found in [NDM05]. Generally, parametric models are compact and easy to use for forward problems, but they are only accurate for a limited class of materials.

A BRDF can also be represented by a 4D discrete table indexed by lighting and viewing directions. For isotropic materials, this representation can be reduced to a 3D table [DvGNK99, MWL∗99]. The 3D table can be re-arranged using half and difference vectors [MPBM03] through a re-parameterization as suggested in [Rus98]. It can be further reduced to a 2D table for a wide range of isotropic materials by omitting the rotation of the difference vector [RVZ08, AZK08, RZ10]. Although high-quality rendering can be achieved using the discrete table representations, such non-parametric forms are generally unsuitable for inverse problems because the number of parameters to be estimated becomes prohibitively large. To maintain the accuracy while reducing the complexity, recent approaches use various basis functions to represent general BRDFs [ZREB06, Nis09, CR11]. Our bi-polynomial model shares a similar goal of simplifying the BRDF representation for general materials, but we focus on modeling low-frequency reflectances with a simpler parametric form. Furthermore, we aim to solve inverse problems for radiometric image analysis rather than forward rendering.

Radiometric image analysis seeks to recover scene properties, such as reflectance and shape, from the recorded scene radiance. Here we briefly review related work in reflectometry and photometric stereo, i.e., surface reflectance and shape estimation, respectively.

Most of the works in reflectometry are based on parametric reflectance models. Yu et al. [YDMH99] use a sparse set of photos and assume the Ward reflectance model with spatially varying diffuse reflection and homogeneous specular reflection. Boivin and Gagalowicz [BG01] use the same BRDF model, but their method deals with only a single image in a hierarchical and iterative framework. Hara et al.’s method [HNI08] uses multiple point light sources to estimate both the illumination distribution and the reflectance represented by the Torrance-Sparrow model. There are recent works in reflectometry that use non-parametric bivariate BRDFs with a discrete table representation. Romeiro et al. propose reflectometry with and without measured illumination [RVZ08, RZ10] by assuming that isotropic BRDFs can be represented using a 2D table. Their evaluation shows that the 2D representation is accurate for a majority of the isotropic BRDFs.

In a shape estimation context, early photometric stereo works [Woo80, Sil80] are based on Lambert’s reflectance model. Although the computation is simple, their performance degrades on many real surfaces, which often exhibit non-Lambertian reflectance. Some methods use four light sources to avoid shadows and specularity [CJ82, SI96]. By using more images and recent robust estimation techniques, the outliers that deviate from the Lambertian assumption can be efficiently detected and discarded through rank minimization [WGS∗10] and sparse regression [IWMA12]. To make use of all the observed data, more sophisticated parametric BRDF models have also been used, such as the methods based on the Torrance-Sparrow model [KC95, Geo03], the Ward model [CJ08, GCHS10], and other multi-lobe models [TLQ08]. However, all these methods assume the diffuse reflectance to be Lambertian, which is not true for real surfaces.

To deal with more general materials, especially those with non-Lambertian diffuse reflection, some recent methods solve the photometric stereo problem with reflectance symmetries, such as isotropy and/or reciprocity. Alldrin et al. [AK07] exploit isotropy to estimate the azimuth angle of normals. By assuming bivariate BRDFs, they estimate the elevation angle of normals and surface reflectances iteratively [AZK08]. By further assuming that the BRDFs can be projected as a one-dimensional monotonic function, the elevation angle can also be estimated without using iterative optimization [STMI12b]. A theory of reflectance symmetries and its application to photometric stereo is summarized in [TQZ11]. Based on those properties, a surface reconstruction method using a special lighting rig and multi-view images is introduced in [CBR11] and [ZWT13]. Photometric stereo can also be applied to general diffuse surfaces by considering some consensus properties [HMI10]. Given hundreds of images, surface normals can be estimated by exploring the similarity of radiance profiles [SOYS07, LMS∗13] and attached shadow codes [OSS09], under unknown illumination and unknown reflectance. Given thousands of images, it is even possible to apply photometric stereo to general anisotropic surfaces [HLHZ08]. Although these methods can deal with a great range of general materials, they usually require a special imaging setup or complicated optimization. In comparison, our bi-polynomial representation is compact yet accurate for many real materials. As a result, data capture and optimization become simple with our method.

Figure 3.1: An illustration of low-frequency reflectance. From left to right, spheres illuminated by distant lights from directions [0, 0, 1]⊤, [1/√2, 0, 1/√2]⊤, [1, 0, 0]⊤ and viewed from [0, 0, 1]⊤ are shown. These high-dynamic range (HDR) images are tone mapped using the method of Reinhard et al. [RSSF02].

Given photometric stereo images, biquadratic polynomials are useful for representing the images that may include self-shadowing and interreflections, as shown in polynomial texture maps [MGW01], which are able to interpolate images and synthesize appearances under new lighting directions. A more recent approach [DHOMH12] extends the polynomial texture maps to handle specularity and shadow. Our model uses a similar mathematical form for representing the low-frequency component of general isotropic reflectance.

3.3 Low-frequency Reflectance

Let us consider the reflectance as a function of lighting, viewing, and surface normal directions. We use the term low-frequency reflectance to denote the reflectance component that does not abruptly change with the variation of lighting directions. Note that this definition does not limit the low-frequency component to diffuse reflectance, but also includes a wide and blunt specular lobe.

On many real surfaces, strong specularity is only observed when the surface normal is close to the bisector of lighting and viewing directions. Thus, the majority of pixels in an image should present low-frequency reflectances under a directional (or sparse) lighting. We show synthetic spheres in Fig. 3.1 rendered using the measured BRDFs Alum-bronze (top) and Violet-acrylic (bottom) from the MERL BRDF database [MPBM03]. The spheres are rendered under different lighting directions but a fixed viewpoint. The majority of pixels of these renderings have smoothly varying values, which we refer to as low-frequency reflectance observations. For a static scene observed from a fixed camera under a continuously moving light source, the low-frequency reflectances are observed at the pixels where intensities do not show sudden changes under varying lighting directions. We seek to model such low-frequency reflectances in a concise form by effectively discarding the high-frequency reflectances for applications to inverse problems.

To capture low-frequency reflectance, we need a method to identify these observations. As we will show below, the low-frequency reflectances have a strong correlation with low-intensity observations. Hence, we can simply use an intensity threshold Tlow to extract observations of low-frequency reflectance in practice. For example, given a set of photometric stereo images, we may draw an intensity profile for each pixel under varying lighting directions. After discarding observations in shadow, we can sort all the remaining observations in ascending order and keep only those ranked below a percentage Tlow.
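The selection rule described above can be sketched for a single pixel as follows. This is a minimal illustration: the function name, the explicit shadow threshold, and the interpretation of Tlow as a fraction of the non-shadowed observations are assumptions.

```python
import numpy as np

def select_low_intensity(profile, t_low, shadow_threshold=0.0):
    """Indices of non-shadowed observations ranked within the lowest t_low fraction."""
    profile = np.asarray(profile, dtype=float)
    valid = np.flatnonzero(profile > shadow_threshold)         # discard shadows
    order = valid[np.argsort(profile[valid], kind="stable")]   # ascending intensity
    keep = int(np.floor(t_low * order.size))                   # number of observations kept
    return order[:keep]

# One pixel observed under seven lights; the first observation is in shadow.
profile = np.array([0.0, 0.30, 0.05, 0.90, 0.10, 0.20, 0.40])
idx = select_low_intensity(profile, t_low=0.5, shadow_threshold=0.02)
```

Here the six non-shadowed observations are ranked and the three darkest are kept, so the bright value 0.90, a likely specular observation, is excluded.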

To evaluate the effectiveness of this simple method, we fit spherical harmonics of different orders to our low-intensity observations and assess the fitting error. We perform fitting for the intensity profile at each pixel, where the observation is a function of lighting directions. The intensity profile y can be fit by spherical harmonics ŷ represented as

$$\hat{y}(\theta, \phi) = \sum_{l=0}^{b} \sum_{m=-l}^{l} C_{lm} Y_{lm}(\theta, \phi), \qquad (3.1)$$

where (θ, ϕ) are the elevation and azimuth angles of the lighting direction l, b is the order of spherical harmonics, Ylm are the spherical harmonics basis functions, and Clm are the coefficients. The coefficients can be solved for by least squares. For experimental validation, we fit Eq. (3.1) to the observations thresholded by Tlow using all of the 100 measured materials in the MERL database [MPBM03]. Specifically, we fix the viewing direction as [0, 0, 1]⊤ and sample 1620 normals by uniformly choosing 36 longitudes and 45 altitudes on the hemisphere. For each of these normals, we use 100 lighting directions randomly sampled from the hemisphere. In other words, we have 1620 intensity profiles, each of length 100. We then vary Tlow from 10% to 100% with a step of 10%, and vary the order of spherical harmonics b from 0 to 5. For each intensity profile of length f, given fixed b and Tlow, we compute the relative Root Mean Square Error (RMSE), which is defined as

$$\mathrm{RMSE} = \frac{1}{f} \sqrt{\sum_{i=1}^{f} \frac{(y_i - \hat{y}_i)^2}{y_i^2}}, \qquad (3.2)$$

where y and ŷ are the original and spherical harmonics fit values, respectively. For each fixed b and Tlow, we perform this evaluation for all 1620 normals and 100 BRDFs, and visualize the mean RMSE in Fig. 3.2. From the error distribution, we can see that low-order spherical harmonics closely fit the observations selected by a small Tlow below 50%, as indicated by the blue rectangle in the figure. Note that our representation in Eq. (3.1) is different from the spherical harmonics based BRDF representation in [RH02], which is not intended for inverse problems. We use Eq. (3.1) only to verify the effectiveness of the intensity thresholding for selecting low-frequency reflectances.

Figure 3.2: RMSE of fitting observations under various Tlow to spherical harmonics with varying orders. The result is square-rooted for visualization purposes. The legend shows the mapping between the error magnitude and color. The black areas are undefined due to insufficient equations for fitting. [Axes: spherical harmonics order (0 to 5) versus Tlow.]
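The least-squares fit of Eq. (3.1) and the relative RMSE of Eq. (3.2) can be sketched as below. To stay self-contained, the sketch assumes hard-coded, unnormalized real spherical-harmonic polynomials and stops at order 2, whereas the experiment above goes up to order 5; the basis scaling changes the coefficients but not the fitted values. All function names and the test profile are hypothetical.

```python
import numpy as np

def real_sh_basis(dirs, order):
    """Unnormalized real spherical-harmonic polynomials up to `order` (0, 1, or 2)."""
    x, y, z = dirs[:, 0], dirs[:, 1], dirs[:, 2]
    cols = [np.ones_like(x)]
    if order >= 1:
        cols += [y, z, x]
    if order >= 2:
        cols += [x * y, y * z, 3.0 * z**2 - 1.0, x * z, x**2 - y**2]
    return np.stack(cols, axis=1)

def fit_profile(dirs, values, order):
    """Least-squares coefficients and fitted values for one intensity profile."""
    basis = real_sh_basis(dirs, order)
    coeffs, *_ = np.linalg.lstsq(basis, values, rcond=None)
    return coeffs, basis @ coeffs

def relative_rmse(values, fitted):
    """Relative RMSE in the form of Eq. (3.2)."""
    f = values.size
    return float(np.sqrt(np.sum((values - fitted) ** 2 / values**2)) / f)

# Hypothetical smooth profile observed under 100 random lights on the hemisphere.
rng = np.random.default_rng(0)
dirs = rng.normal(size=(100, 3))
dirs[:, 2] = np.abs(dirs[:, 2])
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
values = 0.5 + 0.3 * dirs[:, 2] + 0.1 * dirs[:, 0] * dirs[:, 2]

_, fit0 = fit_profile(dirs, values, order=0)
_, fit2 = fit_profile(dirs, values, order=2)
rmse0, rmse2 = relative_rmse(values, fit0), relative_rmse(values, fit2)
```

Because the test profile lies in the span of the order-2 basis, `rmse2` is zero up to rounding, while the constant (order-0) fit leaves a visible residual.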


Figure 3.3: The definitions of θh and θd.

3.4 The Bi-polynomial BRDF Model

This section describes the bi-polynomial BRDF model that we developed. Given surface normal, lighting, and viewing directions n, l, and v, we can calculate the half-vector as h = (l + v)/∥l + v∥, which is the bisector of l and v. Following the notations of [Rus98], we use θh to denote the half-angle between n and h, and θd for the difference-angle between l (or v) and h, as illustrated in Fig. 3.3. Hence, the following relationships hold: n⊤h = cosθh and l⊤h = cosθd.
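As a small numerical illustration of these definitions, the two cosines can be computed directly from n, l, and v. The function name is hypothetical, and the inputs are normalized inside the function as a convenience.

```python
import numpy as np

def half_and_difference_cosines(n, l, v):
    """Return (cos(theta_h), cos(theta_d)) = (n^T h, l^T h) for directions n, l, v."""
    n, l, v = (np.asarray(a, dtype=float) / np.linalg.norm(a) for a in (n, l, v))
    h = (l + v) / np.linalg.norm(l + v)   # half-vector, the bisector of l and v
    return float(n @ h), float(l @ h)

# Grazing light along +x, viewer and normal along +z: h bisects l and v at 45 degrees.
cos_h, cos_d = half_and_difference_cosines([0, 0, 1], [1, 0, 0], [0, 0, 1])
```

In this configuration both cosines equal 1/√2, and replacing l by v in the second product gives the same value, which is why θd can be defined with either direction.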

Our reflectance model is built upon the bivariate BRDF model of [Rus98], where it is shown that most of the isotropic BRDFs can be represented as a bivariate function ρ(θh, θd). This representation is evaluated by [RVZ08] with a large number of measured BRDFs [MPBM03] for the development of passive reflectometry. It is further discussed in [Rus98] that any isotropic BRDF based on the microfacet theory should consist of a univariate function of θh, and its Fresnel term should be a univariate function of θd. As shown in [AS00] and [NIK91], the masking and shadowing terms in a microfacet-based BRDF model vary smoothly and are actually close to constant. These analyses motivate us to further simplify the bivariate function ρ(θh, θd) as a factorized form ρ1(θh)ρ2(θd). A similar simplification has been used in [LBAD∗06] to assist material capturing and editing.

To obtain a compact parametric model suitable for inverse problems, we represent the factorized terms $\rho_1(\theta_h)$ and $\rho_2(\theta_d)$ as polynomial functions of $\cos\theta_h$ and $\cos\theta_d$, respectively. As a result, our BRDF model becomes a bi-polynomial function represented as

$$\rho(\theta_h, \theta_d) \simeq \rho_1(\theta_h)\,\rho_2(\theta_d) = \rho_1(\mathbf{n}^\top\mathbf{h})\,\rho_2(\mathbf{l}^\top\mathbf{h}) = \sum_{i=0}^{k} A_i (\mathbf{n}^\top\mathbf{h})^i \sum_{j=0}^{k} B_j (\mathbf{l}^\top\mathbf{h})^j, \tag{3.3}$$

where $k$ is the order of the polynomials. The above equation can be further expanded by the following relaxation

$$\rho(x, y) = \sum_{i=0}^{k} \sum_{j=0}^{k} C_{ij}\, x^i y^j, \tag{3.4}$$

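where $C_{ij} = A_i B_j$ and $x = \mathbf{n}^\top\mathbf{h}$, $y = \mathbf{l}^\top\mathbf{h}$ for notational simplicity.

In this chapter of the dissertation, we focus on the bilinear, biquadratic, and bicubic models, i.e., $k = 1$, $2$, and $3$. Let us take the biquadratic model ($k = 2$) as an example. It can be expressed as

$$\rho_1(\mathbf{n}^\top\mathbf{h})\,\rho_2(\mathbf{l}^\top\mathbf{h}) = \left(A_2(\mathbf{n}^\top\mathbf{h})^2 + A_1(\mathbf{n}^\top\mathbf{h}) + A_0\right)\left(B_2(\mathbf{l}^\top\mathbf{h})^2 + B_1(\mathbf{l}^\top\mathbf{h}) + B_0\right), \tag{3.5}$$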

with its linear relaxation as

$$\rho(x, y) = C_{22}x^2y^2 + C_{21}x^2y + C_{20}x^2 + C_{12}xy^2 + C_{11}xy + C_{10}x + C_{02}y^2 + C_{01}y + C_{00}. \tag{3.6}$$

In the biquadratic case, there are 9 reflectance parameters in total in the relaxed linear model, and we denote them in vector form as

$$\mathbf{x} = [C_{22}, C_{21}, \ldots, C_{00}]^\top \in \mathbb{R}^{9\times 1}. \tag{3.7}$$

Note that the conversion from Eq. (3.5) to Eq. (3.6) is unique, but the other direction is not, and Eq. (3.6) may not always have the product form of Eq. (3.5). Eq. (3.6) is a linear function of its parameters $[C_{22}, C_{21}, \cdots, C_{00}]^\top$, while Eq. (3.5) is a bilinear function of $[A_2, A_1, A_0]^\top$ and $[B_2, B_1, B_0]^\top$. Hence, the relaxed model is easier to fit. The bilinear and bicubic models can be defined similarly, with 4 and 16 reflectance parameters in their relaxed forms, respectively.
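The relationship between the factorized and relaxed forms can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and for convenience the coefficients are indexed by ascending power (`C[i, j]` multiplies $x^i y^j$), which differs from the descending order used in Eq. (3.7).

```python
import numpy as np

def relaxed_biquadratic(C, x, y):
    """Relaxed form: sum over i, j of C[i, j] * x**i * y**j for i, j in 0..2."""
    px = np.array([1.0, x, x**2])
    py = np.array([1.0, y, y**2])
    return float(px @ C @ py)

def factorized_biquadratic(a, b, x, y):
    """Product form: a = [A0, A1, A2] and b = [B0, B1, B2] (ascending powers)."""
    # np.polyval expects the highest-order coefficient first, hence the reversal.
    return float(np.polyval(a[::-1], x) * np.polyval(b[::-1], y))

a = np.array([0.2, 0.5, 0.3])
b = np.array([0.1, 0.4, 0.6])
C = np.outer(a, b)  # C[i, j] = A_i * B_j, so C has rank 1
same = np.isclose(relaxed_biquadratic(C, 0.7, 0.9),
                  factorized_biquadratic(a, b, 0.7, 0.9))
```

A product-form model always yields a rank-1 coefficient matrix, whereas a general 3×3 matrix of relaxed coefficients (for instance the identity) has higher rank and therefore no exact product form.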

It is also straightforward to express the factorized terms $\rho_1$ and $\rho_2$ in Eq. (3.3) by higher-order polynomials, or even to use different orders for $\rho_1$ and $\rho_2$. However, we experimentally found that models with higher orders had little advantage in modeling accuracy and suffered from instability. Additional discussion of the choice of polynomial orders is left for the experiment section.

There are also other possible parameterizations of the bivariate function. We choose polynomials mainly for two reasons: (1) As discussed in [MGW01], polynomials are good at representing smooth (low-frequency) intensity variations caused by different lightings; (2) Polynomials make the inverse problem tractable. For example, we might use the Discrete Cosine Transform (DCT) as an alternative in the domain of $(\theta_h, \theta_d)$ to represent the bivariate BRDF, which should also be able to yield high modeling precision. However, when recovering the unknown surface normal $\mathbf{n}$ in inverse problems, the equations become highly nonlinear in $\mathbf{n}$, because a DCT basis has the form of $\cos(\theta_h/k) = \cos(\arccos(\mathbf{n}^\top\mathbf{h})/k)$, where $k$ is a non-zero integer. In contrast, our polynomial model is much simpler and only involves terms like $(\mathbf{n}^\top\mathbf{h})^k$.

3.4.1 Relationship with other reflectance models

The bi-polynomial model can accurately represent the low-frequency component of conventional dichromatic reflectance models [Sha85], which represent reflectance as a sum of Lambertian diffuse and specular terms. Since the specular term is mostly concentrated in the high-frequency component, the low-frequency component of these models is largely Lambertian and can be represented well by our bi-polynomial model. For example, the biquadratic model reduces to Lambert's model if we set $A_2 = A_1 = B_2 = B_1 = 0$ in Eq. (3.5). The bi-polynomial model can also represent the low-frequency component of other BRDF models that rely only on $\mathbf{n}^\top\mathbf{h}$. For instance, the Blinn-Phong model [Bli77] can be represented with the bi-polynomial model by setting the coefficients related to $\mathbf{l}^\top\mathbf{h}$ (except $B_0$) to zero.

The Cook-Torrance model [CT82] is widely used to represent various surface reflectances. It consists of a Lambertian diffuse term and a specular term. Its specular component $S_C$ can be denoted as

$$S_C = \frac{k_s}{\pi}\,\frac{D F G}{(\mathbf{n}^\top\mathbf{l})(\mathbf{n}^\top\mathbf{v})}. \tag{3.8}$$

The terms $D$, $F$, and $G$ are the microfacet distribution, Fresnel, and geometrical attenuation terms, respectively. The microfacet distribution $D$ is represented as

$$D = \frac{1}{4m^2(\mathbf{n}^\top\mathbf{h})^4}\exp\left(\frac{1 - \frac{1}{(\mathbf{n}^\top\mathbf{h})^2}}{m^2}\right), \tag{3.9}$$

where $m$ indicates the surface roughness, and $D$ is clearly a function of $\theta_h$ [CT82]. The Fresnel term $F$ is often simplified by Schlick's approximation [Sch94, AS00], denoted as

$$F = k_s + (1 - k_s)(1 - \mathbf{l}^\top\mathbf{h})^5, \tag{3.10}$$

where $k_s$ is a constant. Hence, $F$ is a function of $\theta_d$. Although the term $G$ is relatively complicated, it varies smoothly and is close to a constant over a large range of exitant angles, as evaluated in [NIK91, AS00].
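To make the dependence of these two terms on $\theta_h$ and $\theta_d$ concrete, the following sketch evaluates $D$ and $F$ directly from $\mathbf{n}^\top\mathbf{h}$ and $\mathbf{l}^\top\mathbf{h}$. It assumes the exponent of $D$ in Eq. (3.9) is divided by $m^2$, as in the Beckmann distribution; the function names are hypothetical.

```python
import numpy as np

def microfacet_distribution(nh, m):
    """D as a function of nh = n^T h (cos theta_h) and roughness m."""
    nh = np.asarray(nh, dtype=float)
    # Assumption: the exponent (1 - 1/nh^2) is divided by m^2.
    return np.exp((1.0 - 1.0 / nh**2) / m**2) / (4.0 * m**2 * nh**4)

def schlick_fresnel(lh, ks):
    """F as a function of lh = l^T h (cos theta_d) and the constant ks."""
    lh = np.asarray(lh, dtype=float)
    return ks + (1.0 - ks) * (1.0 - lh) ** 5

# D depends only on theta_h, F only on theta_d.
nh = np.linspace(0.5, 1.0, 6)
d_values = microfacet_distribution(nh, m=0.5)
f_values = schlick_fresnel(np.linspace(0.0, 1.0, 6), ks=0.04)
```

With a larger roughness `m`, `d_values` falls off more slowly away from `nh = 1`, which corresponds to the broader specular lobe discussed in this section.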


Similar to the formulation of the Cook-Torrance model, our model has both $\theta_h$ and $\theta_d$ terms in a product form. More importantly, as we will evaluate in the next subsection, the low-frequency component of the Cook-Torrance model can be closely approximated by the bi-polynomial model (we use the biquadratic model as an example for verification). Other models based on the microfacet theory, such as the Ward model [War92], have a similar expression for modeling the specular component, written as

$$S_W = \frac{k_s}{4\pi m^2\sqrt{(\mathbf{n}^\top\mathbf{l})(\mathbf{n}^\top\mathbf{v})}}\exp\left(\frac{1 - \frac{1}{\mathbf{n}^\top\mathbf{h}}}{m^2}\right). \tag{3.11}$$

The low-frequency component can be purely Lambertian, or a mixture of Lambertian diffuse and soft specular terms. When the surface roughness $m$ is large, the specular lobe becomes wide and blunt, so the diffuse and specular components cannot be separated by simply using a threshold $T_{low}$. Our model is designed for dealing with such cases.

3.4.2 Model validation

To verify the representation power of the proposed model, we fit the biquadratic model to synthetic data generated using the Cook-Torrance model with different roughnesses, and to measured BRDFs in the MERL database. We use an experimental setup similar to that of Sec. 3.3, with the same set of $\mathbf{n}$, $\mathbf{l}$, and $\mathbf{v}$. The threshold $T_{low}$ is applied in the same manner for extracting low-frequency reflectances. Note that the fitting experiment in Sec. 3.3 is performed for each normal under varying lighting directions (i.e., for each intensity profile), while here we fit the BRDF model to observations collected from different normal directions.

To fit the extracted low-frequency reflectance observations, we first solve Eq. (3.6) via linear least squares, through which we obtain $C_{22}, C_{21}, \ldots, C_{00}$. By expanding Eq. (3.5), we then establish a system of bilinear equations: $C_{22} = A_2B_2$, $C_{21} = A_2B_1$, $\ldots$, $C_{00} = A_0B_0$, which can be written as $\mathbf{C} = \mathbf{a}\mathbf{b}^\top$ in matrix form, where $\mathbf{C} \in \mathbb{R}^{3\times 3}$ stores $C_{22}, \ldots, C_{00}$, $\mathbf{a} = [A_2, A_1, A_0]^\top$, and $\mathbf{b} = [B_2, B_1, B_0]^\top$. This system has multiple solutions. Here, we use a singular value decomposition (SVD), $\mathbf{C} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^\top$, to obtain the solution as $\mathbf{a} = \mathbf{u}_1\sqrt{\sigma_1}$ and $\mathbf{b}^\top = \sqrt{\sigma_1}\,\mathbf{v}_1^\top$, where $\mathbf{u}_1$ is the first column vector of $\mathbf{U}$, $\mathbf{v}_1^\top$ is the first row vector of $\mathbf{V}^\top$, and $\sigma_1$ is the greatest singular value in $\boldsymbol{\Sigma}$. The solution is optimal in the least squares sense. However, to better fit the measured data (not just $\mathbf{C}$), we further perform an iterative optimization for refining $\mathbf{a}$ and $\mathbf{b}$ by first fixing $\mathbf{a}$ to update $\mathbf{b}$, and then fixing $\mathbf{b}$ to update $\mathbf{a}$.
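The SVD-based initialization described above can be sketched as follows. This is a minimal illustration, not the original implementation: the function name `rank1_factor` is hypothetical, and the subsequent iterative refinement against the measured data is not shown.

```python
import numpy as np

def rank1_factor(C):
    """Best rank-1 factorization C ~ a b^T (least squares sense) via SVD."""
    U, s, Vt = np.linalg.svd(C)
    a = U[:, 0] * np.sqrt(s[0])   # first left singular vector, scaled
    b = Vt[0, :] * np.sqrt(s[0])  # first row of V^T, scaled
    return a, b

# Example with an exactly rank-1 coefficient matrix.
a_true = np.array([0.1, 0.3, 0.5])
b_true = np.array([0.2, 0.4, 1.0])
a_est, b_est = rank1_factor(np.outer(a_true, b_true))
```

The recovered pair generally differs from `a_true` and `b_true` by a reciprocal scale factor, since $(s\mathbf{a})(\mathbf{b}/s)^\top = \mathbf{a}\mathbf{b}^\top$ for any non-zero $s$; this is one way to see why the bilinear system has multiple solutions. Only the product `np.outer(a_est, b_est)` is determined.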


[Figure 3.4 (image not reproduced). Panel title: BRDF fitting error, Cook-Torrance model. Normals: thetaN 2:2:90, phiN 10:10:360; lights: 100 random directions. Color scale: MAE from 0 to 0.03.]

Figure 3.4: BRDF fitting errors of the biquadratic model to synthetic data using the Cook-Torrance model. The colors indicate error magnitudes. The columns vary with $T_{low}$, and the rows correspond to different roughnesses ($m$). Some rendered spheres are displayed on the left for reference.

Fitting to the parametric BRDFs

We first fit the biquadratic model to synthetic data rendered using the Cook-Torrance model with different surface roughnesses ($m$). We gradually change $T_{low}$ (from 10% to 100% with a step of 10%) to include high-frequency reflectances accordingly. The color-encoded mean fitting errors are summarized in Fig. 3.4 in matrix form, with varying $m$ and $T_{low}$ in rows and columns. Note that reflectance with a larger roughness value has a broader specular lobe, which is mixed with the Lambertian diffuse reflectance to form the low-frequency reflectance. From the region within the light blue rectangle in Fig. 3.4, we conclude that our model closely fits the low-frequency reflectance of the Cook-Torrance model with different roughnesses.

Fitting to the measured BRDFs

The RMSEs of fitting to all materials in the MERL database with varying $T_{low}$ are shown in Fig. 3.5. The rows indicate different BRDFs in the database, and the columns correspond to different $T_{low}$. From the region within the light blue rectangle, we can see that our model closely fits low-frequency reflectances of different materials. The general tendency is consistent with the fitting results using the Cook-Torrance model, i.e., our model has smaller fitting errors for materials with broader specular lobes. The representative materials are shown in the left part of the figure.

[Figure 3.5 (image not reproduced). Panel title: BRDF fitting error (isotropic), MERL 100 BRDFs. Normals: thetaN 2:2:90, phiN 10:10:360; lights: 100 random directions. Color scale: RMSE from 0 to 0.005. Row labels read "a(b)", where a is the sorted BRDF ID in the plot and (b) is the original ID in the database. Reference spheres: Dark-red-paint, White-fabric2, Silver-metallic-paint2, Chrome.]

Figure 3.5: BRDF fitting errors of the biquadratic model to all materials in the MERL database. The colors indicate error magnitudes. The columns vary with $T_{low}$, and the rows correspond to different BRDFs ordered by the mean fitting errors over columns. Some rendered spheres are displayed on the left for reference.

3.4.3 Comparison with other parametric models

There are parametric models with general diffuse terms, such as the Lafortune and Ashikhmin models. To focus the discussion on the low-frequency domain, we only consider the non-Lambertian diffuse terms of these two models. In the Lafortune model [LFTG97], a rotationally symmetric diffuse component $D_L$ is written as

$$D_L = C_d\,(\mathbf{n}^\top\mathbf{l})^k(\mathbf{n}^\top\mathbf{v})^k, \tag{3.12}$$

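where $C_d$ and $k$ are model parameters. The general diffuse term of the Ashikhmin model [AS00], $D_A$, is defined as

$$D_A = R\left[1 - \left(1 - \frac{\mathbf{n}^\top\mathbf{l}}{2}\right)^5\right]\left[1 - \left(1 - \frac{\mathbf{n}^\top\mathbf{v}}{2}\right)^5\right], \tag{3.13}$$

where $R$ is the model parameter.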


Table 3.1: BRDF fitting comparison (RMSE $\times 10^{-4}$).

            Bicubic  Biquadratic  Bilinear  Lafortune  Cook-Torrance  Lambert
No noise       6.71         7.18      9.01      11.01          11.97    13.00
λ = 0         16.04        16.79     18.47      18.01          29.23    23.01
λ = 0.05      16.78        17.69     18.94      18.70          28.42    23.19
λ = 0.15      18.88        19.57     20.64      20.85          26.37    24.30
λ = 0.3      276.43       275.94    271.13     270.88         271.33   299.32

Several parametric models assume a Lambertian diffuse term and use a microfacet-based specular component, such as the Cook-Torrance and Ward models. According to Eq. (3.8) and Eq. (3.11), these models can be represented as

$$\rho_S = \frac{k_d}{\pi} + k_s\,S(\mathbf{n}, \mathbf{l}, \mathbf{v}, m), \tag{3.14}$$

where $k_d$ and $k_s$ are model parameters representing the strength of the diffuse and specular terms, respectively; $S$ is a nonlinear function with $m$ as another model parameter. In the Cook-Torrance model, $m$ is encoded in the term $D$ in Eq. (3.9).

We again use the MERL database for evaluation, setting the threshold $T_{low} = 25\%$ to extract the low-frequency reflectances. Other experimental settings are the same as those used for model validation in Sec. 3.4.2. Fitting the Ashikhmin model in Eq. (3.13) is straightforward. For fitting the Lafortune model in Eq. (3.12), we take the logarithm of both sides of the equation and estimate the log parameters using linear least squares. For fitting the Cook-Torrance and Ward models as in Eq. (3.14), we adopt a strategy similar to [NDM05] and use the MATLAB function "lsqnonlin" to solve the nonlinear optimization.
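The log-domain fit of the Lafortune diffuse term can be sketched as below. Taking logarithms of $D_L = C_d(\mathbf{n}^\top\mathbf{l})^k(\mathbf{n}^\top\mathbf{v})^k$ gives $\log D_L = \log C_d + k\,[\log(\mathbf{n}^\top\mathbf{l}) + \log(\mathbf{n}^\top\mathbf{v})]$, which is linear in $(\log C_d, k)$. The function name is hypothetical, and all inputs are assumed to be strictly positive so that the logarithms exist.

```python
import numpy as np

def fit_lafortune_diffuse(nl, nv, d):
    """Fit d = Cd * nl**k * nv**k by linear least squares in the log domain.

    nl, nv: arrays of n^T l and n^T v (assumed > 0); d: observed values (> 0).
    """
    t = np.log(nl) + np.log(nv)
    A = np.stack([np.ones_like(t), t], axis=1)  # columns: [1, log nl + log nv]
    (log_cd, k), *_ = np.linalg.lstsq(A, np.log(d), rcond=None)
    return float(np.exp(log_cd)), float(k)

# Synthetic check with known parameters Cd = 0.7, k = 1.5.
rng = np.random.default_rng(0)
nl = rng.uniform(0.1, 1.0, 50)
nv = rng.uniform(0.1, 1.0, 50)
cd_est, k_est = fit_lafortune_diffuse(nl, nv, 0.7 * (nl * nv) ** 1.5)
```

Note that least squares in the log domain weights relative rather than absolute errors, so with noisy data the result can differ from a direct nonlinear fit.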

The fitting errors are summarized in Fig. 3.6, and the mean errors across all materials are listed in the first row of Table 3.1. According to Fig. 3.6, the result of the Ward model is very close to that of the Cook-Torrance model, and the error of the Ashikhmin model is much larger than the others. Hence, these two models are omitted hereafter. We take the biquadratic case as an example to test both the relaxed model of Eq. (3.6) and the original model of Eq. (3.5). Their average fitting errors are $7.18 \times 10^{-4}$ and $7.97 \times 10^{-4}$, respectively, which suggests that they are approximately equivalent in accuracy. Since the relaxed form is more efficient to compute, we use the relaxed form of the bi-polynomial model in the rest of the experiments.

Our model has a consistently smaller fitting error than the general diffuse terms of the Lafortune and Ashikhmin models. The Cook-Torrance and Ward models underperform our model in representing the low-frequency reflectance, although specular terms are included in these models. This is mainly because the Cook-Torrance and Ward models behave very similarly to Lambert's model in the low-frequency reflectance domain. The bicubic model has the highest modeling accuracy for low-frequency reflectance, while the bilinear model is less accurate than the Lafortune model. In terms of modeling complexity, the bi-polynomial model has simpler analytic forms than other parametric models, and only a linear least squares fitting is required to estimate the reflectance parameters. We have also tested a higher-order bi-polynomial model, i.e., the biquartic model. However, the performance gain was rather limited: its error curve (omitted in Fig. 3.6) was almost identical to that of the bicubic model, with an RMSE of $5.60 \times 10^{-4}$. Therefore, we limit our discussion to polynomials up to the third order.

[Figure 3.6 (image not reproduced). Panel title: BRDF fitting. Axes: RMSE ($\times 10^{-3}$, 0 to 5) versus BRDF index (0 to 100, labeled with MERL material names). Legend: Bicubic, Biquadratic, Bilinear, Lafortune, Cook-Torrance, Lambert, Ward, Ashikhmin.]

Figure 3.6: BRDF fitting comparison for various reflectance models and all materials in the MERL database. The Y-axis shows the RMSE values; the X-axis shows BRDF names ordered by the fitting errors of the biquadratic model, with some rendered spheres below for visualization purposes.

Note that the BRDF fittings here are performed using only low-frequency reflectances. The modeling accuracy of the bi-polynomial model deteriorates at high-frequency reflectances for materials with specularity. In such a case, models with specular terms, such as the Cook-Torrance model, outperform our model. We refer the readers to [NDM05] for evaluations of various parametric BRDF models in the complete BRDF domain (both low-frequency and high-frequency).

Evaluation using noisy data

The previous experiments are performed using carefully measured data (MERL BRDFs), where double precision is used for data storage. In practice, however, the image formation process involves various types of noise and quantization errors. Therefore, we simulate these factors and evaluate their influence on the performance of different BRDF models. Here, we consider Gaussian noise and 16-bit image quantization. When dealing with general BRDFs, HDR images are usually used, as done in [AZK08, ZWT13]; therefore we simulate 16-bit quantization instead of 8-bit LDR images. We apply the Gaussian noise in a signal-dependent manner as $\tilde{y} = y + \lambda y X$, which is a commonly used noise model for imaging sensors [HP05]. For simplicity, the weight of the independent noise in [HP05] is set to 0. Here, $\tilde{y}$ and $y$ are the data with and without noise, $\lambda$ is the scaling factor, and $X \sim N(0, 1)$ is a random variable following a Gaussian distribution with mean 0 and variance 1. For the quantization, we first discard the data that are smaller than the lower bound $10^{-6}$ or greater than the upper bound 1.0. The remaining data are normalized and quantized to the range of 0 to 65535. The original data are corrupted by the Gaussian noise before quantization is performed. We vary $\lambda$ to change the noise level ($\lambda = 0$ indicates that only quantization noise is applied).
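The noise and quantization procedure can be sketched as follows. This is an illustrative reading of the description above rather than the original code: the function name and defaults are hypothetical, the range check is assumed to be applied after the noise is added, and "normalized" is assumed to mean scaling by the upper bound so that 1.0 maps to 65535.

```python
import numpy as np

def simulate_sensor(y, lam, rng, lo=1e-6, hi=1.0, levels=65535):
    """Apply signal-dependent Gaussian noise, then 16-bit quantization.

    Returns the quantized values (rescaled back to [0, hi]) and a boolean
    mask marking which input samples were kept.
    """
    y = np.asarray(y, dtype=float)
    noisy = y + lam * y * rng.standard_normal(y.shape)  # y + lambda * y * X
    keep = (noisy >= lo) & (noisy <= hi)                # discard out-of-range data
    codes = np.round(noisy[keep] / hi * levels)         # integer codes in 0..65535
    return codes / levels * hi, keep

rng = np.random.default_rng(0)
values, kept = simulate_sensor([0.0, 0.5, 2.0, 1e-3], lam=0.05, rng=rng)
```

With `lam=0` the output differs from the input only by quantization error, which is at most half a quantization step.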

The BRDF fitting errors (averaged across all materials) for noisy input are summarized in the second through fifth rows of Table 3.1. Compared with the result without noise in the first row, the errors become larger with increasing noise level, as anticipated. The bicubic model shows the best performance in almost all cases. Fitting the Cook-Torrance model generates relatively larger errors with noisy input, partly due to its highly nonlinear expression.

3.5 Application to Reflectometry

Although the bi-polynomial model is designed to represent low-frequency reflectances, it can also be used for reflectometry of materials without significant specular spikes. Using the linear representation of the bi-polynomial model in Eq. (3.4), reflectometry under directional light sources only requires solving a linear equation $\mathbf{A}\mathbf{x} = \mathbf{i}$, where $\mathbf{i}$ records radiance values. For each observation, we can calculate the matrix $\mathbf{A}$ from $\mathbf{n}$, $\mathbf{l}$, and $\mathbf{v}$ when the shape and lighting are all calibrated. The matrix $\mathbf{A}$ has 4, 9, and 16 columns for the bilinear, biquadratic, and bicubic models, respectively, according to Eq. (3.4). In the biquadratic case, for instance, the matrix $\mathbf{A} \in \mathbb{R}^{p\times 9}$ and observations $\mathbf{i} \in \mathbb{R}^{p\times 1}$ are constructed from $p$ ($p \geqslant 9$) independent samples. The model parameter $\mathbf{x}$ can be determined by simply solving the linear system as $\mathbf{x} = (\mathbf{A}^\top\mathbf{A})^{-1}\mathbf{A}^\top\mathbf{i}$.
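A minimal sketch of this linear fit for the biquadratic case is given below. The function names and the optional `shading` argument are assumptions made for illustration: the text does not spell out how $\mathbf{A}$ is assembled, so here each row holds the nine monomials of the relaxed biquadratic model, optionally multiplied by the shading $\mathbf{n}^\top\mathbf{l}$ when $\mathbf{i}$ stores radiance rather than reflectance.

```python
import numpy as np

def biquadratic_design(x, y):
    """Rows [x^2y^2, x^2y, x^2, xy^2, xy, x, y^2, y, 1], matching [C22, ..., C00]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.stack([x**2 * y**2, x**2 * y, x**2,
                     x * y**2, x * y, x,
                     y**2, y, np.ones_like(x)], axis=1)

def fit_reflectance(x, y, i, shading=None):
    """Solve A c = i in the least squares sense for the 9 coefficients.

    x = n^T h and y = l^T h per sample; shading (optional) holds n^T l.
    """
    A = biquadratic_design(x, y)
    if shading is not None:
        A = A * np.asarray(shading, dtype=float)[:, None]
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(i, dtype=float), rcond=None)
    return coeffs
```

`np.linalg.lstsq` is used instead of forming $(\mathbf{A}^\top\mathbf{A})^{-1}\mathbf{A}^\top\mathbf{i}$ explicitly; for a full-rank $\mathbf{A}$ both give the same solution, but the former is numerically better behaved when the monomial columns are nearly collinear.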

If only one image of a curved surface under a directional light source is available, the reflectance shows only variations along the half-angle, and our model reduces to a univariate function as $\rho(\theta_h) \simeq \sum_{i=0}^{k} A_i(\mathbf{n}^\top\mathbf{h})^i$. In such cases, the estimated reflectance only shows the variations along the half-angle.

Reflectometry using measured BRDFs

To verify the method, we select some materials from the MERL database that do not contain strong specularities (e.g., some fabric, matte-paint, and rubber materials). We render a single image of a sphere under a directional light source as input to estimate the BRDF. We test this simple reflectometry method using various parametric BRDF models, i.e., the bilinear, biquadratic, bicubic, Lafortune, Cook-Torrance, and Lambert's models. We then reconstruct images under the same lighting and viewing directions using the estimated reflectance parameters and evaluate the reconstruction errors. The reconstruction errors are defined as the mean of pixel-wise absolute differences.

Here we show the Blue-fabric and Green-latex results as two examples in the top two rows of Fig. 3.7. For each example, we show three rendered spheres using the ground truth BRDF, the fit to the biquadratic model, and the fit to Lambert's model. The ground truth and estimated BRDFs of various models are visualized as 2D polar plots.

As summarized in the legends of Fig. 3.7, the reconstruction errors show a performance ordering similar to the result in Fig. 3.6 and Table 3.1, where the bicubic model performs the best and Lambert's model performs the worst. From the plotted BRDF slices, we can see that the bicubic and biquadratic models fit closely to the measured data even though they only require a few parameters. The biquadratic model is slightly worse than the bicubic model, but outperforms all other parametric models. The bilinear model shows some inaccuracy, and is worse than the Cook-Torrance model but better than the Lafortune model. Notice that this experimental setting is slightly different from that in Fig. 3.6. Here, we fit the models to the complete BRDF domain for materials without strong highlights, while in Fig. 3.6 we fit the models only to the low-frequency component extracted by $T_{low}$.

[Figure 3.7 (image not reproduced). Columns of rendered spheres: Ground truth, Biquadratic, Lambert. Legends give the MAE per model ($\times 10^{-4}$ for the top two rows, $\times 10^{-2}$ for the bottom row); the one legend recoverable from the extraction reads: Bicubic 4.93, Biquadratic 5.17, Bilinear 8.37, Lafortune 9.76, Cook-Torrance 6.83, Lambert 10.86.]

Figure 3.7: Results of reflectometry: the top two rows are from synthetic data, and the bottom row is from real data. BRDF plots for all models and the rendered spheres using measured data, the biquadratic model, and Lambert's model are shown. The reconstruction errors ($\times 10^{-4}$ for the top two examples, $\times 10^{-2}$ for the bottom example) for each model are summarized in the legends. Each BRDF is visualized as a 2D curve, which is a polar plot with the elevation angle of the surface normal as the angle and the reflectance magnitude as the radius. The surface normal with zero azimuth angle is selected for visualization in 2D.

Reflectometry using real data

We also test the BRDF estimation capability using real data. A sphere is recorded using a long focal-length camera to approximate an orthographic projection under directional lighting. We show the result in the bottom row of Fig. 3.7. The result is consistent with the synthetic test; the bicubic model performs the best, with a slight advantage over the biquadratic model. We find that the modeling accuracy of the biquadratic model is much better than that of the bilinear model, and is close to that of the bicubic model. Higher-order bi-polynomial models, such as bicubic and biquartic, show only negligible improvement in our experiments. Therefore, we conclude that the biquadratic model is a good trade-off between modeling accuracy and simplicity.

3.6 Application to Photometric Stereo

In this section, we apply the bi-polynomial reflectance model to photometric stereo for estimating surface normals from images captured by a fixed camera under varying lightings. We assume an orthographic camera and directional lightings. The camera-centered coordinate system is chosen such that $\mathbf{v} = [0, 0, 1]^\top$. To use the bi-polynomial model for photometric stereo problems, we fit the reflectance model at each pixel independently. Like previous methods that deal with spatially varying BRDFs [AZK08], our approach determines the BRDF for each pixel from its intensity observations, lighting directions, and estimated surface normals. Therefore, our method is able to handle spatially varying BRDFs.

3.6.1 An iterative normal estimation method

The bilinear, biquadratic, and bicubic models can be used in the same manner for solving photometric stereo. Here we use the biquadratic model as an example. From the photometric stereo images, we observe multiple radiance intensities at each pixel. For each pixel, we first use a very small intensity threshold ($10^{-6}$ in our experiment for synthetic data) to neglect shadows. Then we sort the remaining observations in ascending order and keep only those ranked below the percentage $T_{low}$, which is empirically determined within [15%, 50%]. We use $\mathbf{i}_{low}$ to denote the concatenated vector of these remaining observations and stack their corresponding lighting directions to form a matrix $\mathbf{L}_{low}$. We finally obtain the following equation:

$$\mathbf{i}_{low} = \boldsymbol{\rho}(\mathbf{n}, \mathbf{L}_{low}) \circ (\mathbf{n}^\top\mathbf{L}_{low}), \tag{3.15}$$

where "$\circ$" indicates element-wise multiplication. $\boldsymbol{\rho}$ encodes the reflectance parameter $\mathbf{x}$ in the same manner as $\rho$ in Eq. (3.6), but operates on each $\mathbf{n}$ and $\mathbf{l}$. The reflectance parameter $\mathbf{x}$ represents the 9 polynomial coefficients $[C_{22}, C_{21}, \cdots, C_{00}]^\top$ for the biquadratic model, as defined in Eq. (3.7). We use the relaxed biquadratic model of Eq. (3.6) for computational efficiency. The surface normal $\mathbf{n}$ and the BRDF parameter $\mathbf{x}$ can be determined by iteratively optimizing the following objective function:

$$(\mathbf{n}^*, \mathbf{x}^*) = \mathop{\mathrm{argmin}}_{\mathbf{n}, \mathbf{x}} \left|\boldsymbol{\rho}(\mathbf{n}, \mathbf{L}_{low}) \circ (\mathbf{n}^\top\mathbf{L}_{low}) - \mathbf{i}_{low}\right|^2. \tag{3.16}$$
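The per-pixel selection of low-frequency observations described above (shadow rejection, then keeping the darkest fraction) can be sketched as follows. The function name is hypothetical, the lighting directions are stored as rows of a $q \times 3$ array for convenience, and the fraction is assumed to be taken over the observations that survive the shadow threshold.

```python
import numpy as np

def select_low_frequency(i, L, t_low=0.25, shadow_eps=1e-6):
    """Return (i_low, L_low): the darkest t_low fraction of non-shadowed samples.

    i: (q,) intensities at one pixel; L: (q, 3) lighting directions as rows.
    """
    i = np.asarray(i, dtype=float)
    L = np.asarray(L, dtype=float)
    lit = np.flatnonzero(i > shadow_eps)       # drop shadowed observations
    order = lit[np.argsort(i[lit])]            # remaining indices, ascending intensity
    count = int(np.floor(t_low * order.size))  # number of samples ranked below t_low
    keep = order[:count]
    return i[keep], L[keep]

# Example: one shadowed sample, then keep the darkest half of the rest.
i_low, L_low = select_low_frequency([0.0, 0.4, 0.1, 0.3, 0.2],
                                    np.arange(15).reshape(5, 3), t_low=0.5)
```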

At each iteration, we first fix the normal direction $\mathbf{n}$ and refine $\mathbf{x}$ by solving a linear least squares problem. We then substitute $\mathbf{x}$ and $\mathbf{n}$ to determine $\boldsymbol{\rho}$. Once $\boldsymbol{\rho}$ is calculated, we update $\mathbf{n}$ again by linear least squares. A normalization to unit norm is performed immediately after each time $\mathbf{n}$ is solved. To initialize this iterative optimization, we first apply Lambertian photometric stereo [Woo80] using $\mathbf{i}_{low}$ and $\mathbf{L}_{low}$ to estimate the initial normal. The iterative optimization stops when the residual of Eq. (3.16) does not change. In our implementation, we stop it when the change becomes less than $10^{-7}$ or a maximum of 100 iterations is exceeded. The normal estimation algorithm is summarized as Algorithm 2. Note that we take the biquadratic model as an example in Algorithm 2, and $\boldsymbol{\rho}$ can be replaced by other parametric models.
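The alternating scheme described above can be sketched for a single pixel as follows. This is a simplified illustration under stated assumptions, not the original implementation: the function names are hypothetical, lighting directions are stored as rows, the viewing direction is fixed to $[0, 0, 1]^\top$, and the inputs are assumed to be already restricted to the low-frequency observations.

```python
import numpy as np

V = np.array([0.0, 0.0, 1.0])  # viewing direction in camera-centered coordinates

def biquadratic_basis(n, L):
    """Monomials [x^2y^2, x^2y, x^2, xy^2, xy, x, y^2, y, 1] per light (L is q x 3)."""
    H = L + V
    H = H / np.linalg.norm(H, axis=1, keepdims=True)  # half-vectors
    x = H @ n                                         # n^T h
    y = np.sum(L * H, axis=1)                         # l^T h
    return np.stack([x**2 * y**2, x**2 * y, x**2, x * y**2, x * y, x,
                     y**2, y, np.ones_like(x)], axis=1)

def estimate_normal(i_low, L_low, tol=1e-7, max_iter=100):
    """Alternate between the reflectance coefficients c and the normal n."""
    # Initialization: Lambertian photometric stereo on the selected data.
    n = np.linalg.lstsq(L_low, i_low, rcond=None)[0]
    n /= np.linalg.norm(n)
    c, prev = None, np.inf
    for _ in range(max_iter):
        # Fix n: the model is linear in c once the shading n^T l is known.
        A = biquadratic_basis(n, L_low)
        c = np.linalg.lstsq(A * (L_low @ n)[:, None], i_low, rcond=None)[0]
        # Fix rho (evaluated at the current n): the model is linear in n.
        rho = A @ c
        n = np.linalg.lstsq(rho[:, None] * L_low, i_low, rcond=None)[0]
        n /= np.linalg.norm(n)  # renormalize after every solve
        residual = np.sum((biquadratic_basis(n, L_low) @ c * (L_low @ n) - i_low) ** 2)
        if abs(prev - residual) < tol:  # stop when the residual stops changing
            break
        prev = residual
    return n, c
```

On purely Lambertian input the initialization is already exact and the loop terminates after two passes; for other reflectances this sketch, like the method it illustrates, carries no convergence guarantee.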

Theoretically, to optimize $\mathbf{n}$ with fixed $\mathbf{x}$, minimizing a multivariate polynomial system using a Gröbner basis [KBP08] is one possible solution. However, through experimentation, we found that our simpler approach outlined above produces equivalent convergence with much lighter computation. We explain this issue in detail in the Appendix at the end of this chapter.

3.6.2 Surface normal estimation results

Using the above iterative solution method, we perform surface normal estimation using the bi-polynomial model in comparison with other parametric BRDF models. For other parametric BRDF models, such as the Cook-Torrance and Lafortune models, we use the same iterative solution method for deriving the surface normal (simply replacing $\boldsymbol{\rho}$ with a designated model). We use the same dataset as in Sec. 3.4.2, and $T_{low}$ is set to 25%. Note that although the intensity thresholding is applied in the same way, for photometric stereo the BRDF fitting is performed for each pixel (intensity profile), while for the experiment in Sec. 3.4.2 the BRDF fitting is performed for all pixels with their corresponding lightings. When the Cook-Torrance and Lafortune models are used, the estimation of their BRDF parameters with fixed $\mathbf{n}$ becomes highly nonlinear. This causes some numerical instability. A similar issue arises when a high-order bi-polynomial model is used (empirically, higher than cubic).

Algorithm 2 Normal estimation by using the bi-polynomial model
INPUT: Scene radiance values i, lighting directions L, threshold Tlow.
for each pixel do
    Extract ilow and Llow using Tlow;
    Solve for initial n with ρ as constant (Eq. (3.15));
    while change in residual of Eq. (3.16) ≥ 10^−7 AND #iter. ≤ 100 do
        Update ρ by fixing n (e.g., Eq. (3.6));
        Update n by fixing ρ (Eq. (3.15));
    end while
end for
OUTPUT: Estimated surface normal n for all pixels.

The results for all 100 materials are summarized in Fig. 3.8, with mean errors listed in the first row of Table 3.2 as a quantitative evaluation. Among all the tested models, the biquadratic model performs the best on average over the 100 materials. The mean error of the bilinear model is larger than that of the biquadratic model due to its poorer modeling accuracy, and the bicubic model also has larger errors than the biquadratic model due to the instability caused by high-order polynomial fitting. For the Cook-Torrance and Lafortune models, the errors are larger than those of the bi-polynomial model, partly due to their lower modeling accuracy for low-frequency reflectances. Because it has only a Lambertian diffuse term, the Cook-Torrance model hardly improves on the initial result of Lambert's model. In addition, these two nonlinear models cause optimization difficulties, which also explains the larger error of the Lafortune model.

Our model allows a simple alternating optimization, while other reflectance models require a more sophisticated optimization technique due to their highly nonlinear nature. In our experiments, the bilinear, biquadratic, and bicubic models usually converge in a similar manner within fewer than 10 iterations. The Cook-Torrance model either converges quickly after one or two iterations or shows no decrease from the initial value, since its low-frequency term is the same as Lambert's model.

The Lafortune model shows instability during the iterations, which is caused by both the nonlinearity of the model and the simple optimization technique that is employed. Though there is no theoretical guarantee of convergence for our optimization method, we empirically find that this simple optimization technique shows better convergence for our model than for other models.

[Figure 3.8 (image not reproduced). Panel title: Normal estimates. Axes: angular error in degrees (0 to 10) versus BRDF index (reordered, 0 to 100, labeled with MERL material names). Legend: Bicubic, Biquadratic, Bilinear, Lafortune, Cook-Torrance, Lambert.]

Figure 3.8: Photometric stereo results comparison for various reflectance models and all materials in the MERL database. The Y-axis shows mean angular errors (degree); the X-axis shows BRDF names ordered by the mean angular errors of the biquadratic model, with some selected rendered spheres below for visualization purposes.

Table 3.2: Photometric stereo results comparison (degree).

            Bicubic  Biquadratic  Bilinear  Lafortune  Cook-Torrance  Lambert
No noise       1.25         1.12      1.37       4.07           2.13     2.14
λ = 0          1.42         1.34      1.58       4.17           3.18     2.28
λ = 0.05       1.97         1.86      1.66       4.17           3.17     2.33
λ = 0.15       3.68         3.30      2.10       4.48           3.43     2.86
λ = 0.3        6.82         5.58      3.82       7.22           6.45     6.23

Evaluation using noisy data

We evaluate the effect of noisy input on photometric stereo estimates across different parametric BRDF models. The quantization and Gaussian noise are applied in the same way as in Sec. 3.4.3. We summarize the mean angular errors of photometric stereo across all materials for different BRDF models and noise levels in the second through fifth rows of Table 3.2. Compared with the result for ideal data shown in the first row of the table, the bilinear model shows advantages when the noise level is high, because its simple form makes the alternating optimization more stable. In general, the bi-polynomial model outperforms other models at different noise levels.

Results on spatially varying BRDFs

We also generate synthetic images with spatially varying BRDFs to show the photometric stereo results. We mix the Alum-bronze and Cherry-235 reflectances from the MERL database on a Bunny model, and another two BRDFs, namely Blue-metallic-paint and Green-metallic-paint, on a Caesar model. We render 100 images under varying lighting directions as input. The estimated normal maps and their angular errors are shown in Fig. 3.9. As in Chapter 2, we linearly map the XYZ components of the surface normal to the RGB color channels, as indicated by the reference sphere shown in the ground truth normal map. We add the results of traditional Lambertian photometric stereo [Woo80] with all images ("Lambert (all)") to observe its failure mode. With these materials of complicated reflectances, traditional photometric stereo shows very inaccurate results, e.g., the texture boundaries on Bunny are clearly visible in the error map.
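For reference, a normal-map visualization and the mean angular error reported in this section can be computed along the following lines. The specific linear map $(\mathbf{n} + 1)/2$ is an assumption, since the text only states that the mapping is linear, and both function names are hypothetical.

```python
import numpy as np

def normals_to_rgb(normals):
    """Map unit-normal components in [-1, 1] linearly to RGB values in [0, 1]."""
    return (np.asarray(normals, dtype=float) + 1.0) / 2.0

def mean_angular_error_deg(n_est, n_gt):
    """Mean angle in degrees between corresponding unit normals (last axis = XYZ)."""
    n_est = np.asarray(n_est, dtype=float)
    n_gt = np.asarray(n_gt, dtype=float)
    cos = np.clip(np.sum(n_est * n_gt, axis=-1), -1.0, 1.0)  # guard arccos domain
    return float(np.degrees(np.arccos(cos)).mean())
```

Both functions accept arrays of any leading shape, so a full `(height, width, 3)` normal map can be passed directly.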


[Figure 3.9 panel titles: Ground truth, Bicubic, Biquadratic, Bilinear, Lafortune, Cook-Torrance, Lambert (low), Lambert (all)]

Figure 3.9: Photometric stereo results using synthetic data. One of the input images is shown under the ground truth normal map. Normal map estimates using different BRDF models are shown in the top row. The bottom row shows angular difference maps w.r.t. the ground truth. The numbers on the difference maps show mean angular errors (degree).


By only using the low-frequency reflectance, photometric stereo works much better on general reflectances even using the Lambert’s model (“Lambert (low)”), thus it can provide reasonable initial normals used for fitting all other parametric BRDF models. Further integrating BRDF fitting into the photometric stereo pipeline improves the accuracy at different scales depending on the reflectance model used (except for the Lafortune model, which shows instability during our simple alternating optimization). The performance ranking of models by mean angular errors here is consistent with Fig. 3.8 and Table 3.2, which shows that our bi-polynomial model yields the most accurate normal estimates.

3.6.3 Effect of varying numbers of lightings

We perform a similar experiment as Sec. 3.6.2 with varying numbers of lighting directions (images) to observe its effect on the accuracy. We vary the number of lighting directions from 25 to 250 with a step of 25, and perform photometric stereo using various BRDF models. The threshold Tlow is fixed to 25% in this experiment. The mean angular errors over 100 materials of the MERL database are plotted in Fig. 3.10. Here we do not compare with the Lafortune model because of its unstable convergence. Generally, a larger number of images improves the results. Compared with the Cook-Torrance model, the bi-polynomial model shows greater improvement in accuracy with the increasing number of light directions, whereas the Cook-Torrance model still shows a similar performance to the Lambert’s model. When the number of lighting directions becomes greater, the Cook-Torrance model even shows a slightly worse result than the Lambert’s model due to its numerical instability during the optimization. Note that in our experiment, the fitting is performed using the low-frequency reflectance data. The Cook-Torrance model might outperform our method when we include all the observed data (both low and high frequencies). More discussion can be found in Sec. 3.6.4.

With more input images, the low-frequency reflectances can be more reliably extracted using Tlow. Therefore, regardless of the orders of polynomials, the bi-polynomial results generally become better as the number of input images increases. Empirically, about 100 images are sufficient for photometric stereo with the biquadratic model to produce accurate results for general isotropic reflectance. On average, the angular error becomes about 1 degree for the materials in the MERL database. The major reason for requiring many images is that the method involves the estimation of BRDFs, and it requires a sufficient sampling resolution in the angular domain. Similar numbers of images have been used for other state-of-the-art techniques such as [AZK08, STMI12b, LMS∗13].

When there is noise in the input data (curves “Biquad./noise-1, 2” in Fig. 3.10), the angular errors increase in all settings of the number of input lightings, but in general more lights yield higher accuracy.


[Plot: angular error in degrees versus number of different lighting directions (25 to 250). Legend with mean values: Bicubic 1.37, Biquadratic 1.09, Bilinear 1.49, Cook-Torrance 2.31, Lambert 2.30, Biquad./noise-1 3.21, Biquad./noise-2 5.00]


Figure 3.10: Angular error (degree) varying with number of lighting directions. The numbers in the legend are the mean values over X-axis. The “Biquad./noise-1, 2” cases correspond to noise levels with λ = 0.15, 0.3. Note that the biquadratic and bicubic models need at least 9 and 16 equations for fitting, therefore their curves start from using 50 and 75 images, respectively, when Tlow = 25%.

3.6.4 Analysis on intensity threshold Tlow

Extracting the low-frequency reflectance observations plays an important role in our BRDF modeling. We now analyze the performance variation using various Tlow to see its influence on the photometric stereo results.

Performance variation with diverse Tlow

Again, we perform similar experiments as Sec. 3.6.2 and Sec. 3.6.3 with the number of lighting directions fixed to 100. We plot the mean angular errors of 100 materials with varying Tlow from 5% to 100% with a step of 5% in Fig. 3.11. The overall tendency is that all models show larger errors with increasing Tlow. This shows that as more observations of high-frequency reflectances are involved, the problem of photometric stereo becomes more difficult. This observation agrees well with our motivation of focusing only on the low-frequency reflectance. Degradation of the Lambertian photometric stereo results with increasing Tlow also influences the results of all other models, because they all rely on the Lambertian photometric stereo for initialization. For the Cook-Torrance model, we find that it performs very similarly to the Lambert’s model when Tlow is smaller than 30%, but it outperforms the bi-polynomial model when Tlow is larger than 75%. For the latter case, the specularity becomes significant, and it becomes necessary to model high-frequency components with explicit specular terms.


[Plot: angular error in degrees versus Tlow (%), from 10 to 100]


Figure 3.11: Angular error (degree) varying with Tlow. The numbers in the legend are the mean values over X-axis. The “Biquad./noise-1, 2” cases correspond to noise levels with λ = 0.15, 0.3. Note that the biquadratic and bicubic models need at least 9 and 16 equations for fitting, therefore their curves start from using Tlow = 10% and 20%, respectively, when 100 images are given as input.
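The starting points of the curves quoted in the captions of Fig. 3.10 and Fig. 3.11 follow from simple counting. The sketch below reproduces that arithmetic under one assumption that is not stated in the captions: Tlow keeps the lowest Tlow percent of the observations at each pixel, rounded down. The function names are hypothetical.

```python
def usable_observations(num_images, t_low_percent):
    """Observations kept per pixel, assuming floor(num_images * Tlow / 100)."""
    return (num_images * t_low_percent) // 100

def first_feasible(candidates, needed, count):
    """Smallest candidate whose observation count reaches `needed`, else None."""
    for c in candidates:
        if count(c) >= needed:
            return c
    return None

NEEDED = {"biquadratic": 9, "bicubic": 16}  # equations required for fitting

# Fig. 3.10: Tlow fixed to 25%, number of images swept from 25 to 250 in steps of 25.
image_counts = range(25, 251, 25)
start_images = {
    name: first_feasible(image_counts, n, lambda m: usable_observations(m, 25))
    for name, n in NEEDED.items()
}

# Fig. 3.11: 100 images, Tlow swept from 5% to 100% in steps of 5%.
thresholds = range(5, 101, 5)
start_tlow = {
    name: first_feasible(thresholds, n, lambda t: usable_observations(100, t))
    for name, n in NEEDED.items()
}
```

With 25 images and Tlow = 25%, only 6 observations remain, which is below the 9 needed by the biquadratic model; 50 images leave 12, which is enough for the biquadratic model but not for the 16 needed by the bicubic model.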

As for the difference between the bilinear, biquadratic, and bicubic models, we find that the biquadratic model still performs best on average across varying Tlow. However, when Tlow is larger than 55%, the bicubic model improves due to its greater representation ability. When Tlow is very small (below 20%), the bilinear model can be more stably estimated due to its simpler form, and the reason here is similar to the increased noise cases in the bottom three rows of Table 3.2. According to Fig. 3.11, a Tlow around 20% is a good choice for the bi-polynomial model with ideal data.

When there is noise in the input data (curves “Biquad./noise-1, 2” in Fig. 3.11), the angular errors increase accordingly. To handle the input data with large noise (λ = 0.3), a larger Tlow is desired for obtaining the optimal performance. Fortunately, as discussed above, our method is not sensitive to the choice of Tlow in the range of [20%, 50%], therefore we can pick a Tlow that could be larger than necessary.

What materials are (in)sensitive to Tlow?

From Fig. 3.11 we can see that Tlow below 50% is safe to use for various BRDFs. However, determining the best Tlow for each material is not easy in our empirical model. If we plot and check the curve of angular errors varying with Tlow for each material, we can find some material-related properties. We again take the biquadratic model as an example and plot the curves in Fig. 3.12. We plot thin curves for all materials and use a thick curve for their average in the same figure. Based on the shape of these curves, we find that the materials can be categorized into two groups according to the sensitivity of normal estimation error to Tlow. The sensitive materials show a sudden increase of errors when Tlow is greater than 50%, while the insensitive materials show almost constant error curves. To check the type of materials in each group, we look at the point of Tlow = 70% and the corresponding materials with large and small errors. We list the best nine and worst nine materials in Fig. 3.12 with rendered spheres and material names. Interestingly, the errors of metal-like materials are sensitive to Tlow and those of fabric-like materials are insensitive. Our method is not suitable for dealing with materials having relatively broad and strong highlights, like some metallic paints. But when the highlight is very focused or very weak, our method works well with relatively arbitrary selections of Tlow. This is consistent with the above discussions, since wide and strong specularities contain significant high-frequency reflectances which cannot be easily discarded by Tlow, while spiky specular lobes are easily separable.

[Plot: angular error in degrees versus Tlow (%), from 10 to 100. Material panels: PINK-FABRIC, GREEN-PLASTIC, ORANGE-PAINT, WHITE-ACRYLIC, SPECULAR-VIOLET-PHENOL, SPECULAR-MAROON-PHENOLIC, SPECULAR-ORANGE-PHENOLIC, BLUE-RUBBER, SPECULAR-RED-PHENOLIC, RED-METALLIC-PAINT, BLUE-METALLIC-PAINT2, CHROME, COLOR-CHANGING-PAINT3, COLOR-CHANGING-PAINT1, BLUE-METALLIC-PAINT, SILVER-METALLIC-PAINT, CHROME-STEEL, STEEL]

Figure 3.12: Angular error (degree) varying with Tlow of the biquadratic model. Each thin curve represents the result of one material. The thick red curve is the average of 100 materials. The materials in green and purple frames are insensitive and sensitive examples to Tlow.

3.6.5 Results using real-world data

We show the results of real-world data in Fig. 3.13. We use the biquadratic model, since it is found to be the optimal one according to the synthetic test. We compare with the method from Alldrin et al. [AZK08] using their datasets in the top three rows of Fig. 3.13. We refer to these scenes as (from top to bottom row) Gourd1 (102), Gourd2 (98), and Apple (112), with the number of input images in parentheses. Since we do not have the ground truth of those data, a quantitative evaluation cannot be performed. For a qualitative evaluation, we reconstruct the surfaces using the estimated normals and the method in [Kov05]. In Fig. 3.13, the left column shows one of the input images to the photometric stereo algorithm and a reference image (not used in calculation) of rendered results from [AZK08]. The middle and right columns show the estimated surface normals and recovered surfaces using our method and the Lambertian photometric stereo. The recovered surfaces are aligned to the reference views, and our reconstruction agrees closely with the result of [AZK08]. The data in the bottom two rows (named Post (91) and Teapot (73)) were captured with a Sony XCD-X710CR camera with a linear response function. While we did not carefully control exposures to avoid saturation, our method can naturally skip undesired strong specular and saturation regions by setting Tlow. The consistency of the reconstructed surfaces using estimated normals with the pictures of the objects taken from another viewpoint indicates the effectiveness of the proposed approach.

We further show some results with various materials in Fig. 3.14. Next to the estimated normals, we show the Lambertian shading rendered using the estimated surface normal and the same lighting direction as the input image in the leftmost column; the depth map reconstructed from the estimated normal is also given for a qualitative evaluation.

[Figure 3.13 column titles: Image (input/reference); Biquadratic (normal/surface); Lambert (all) (normal/surface)]

Figure 3.13: Photometric stereo results using real-world data in comparison with the results of [AZK08] and Lambertian photometric stereo. The top three rows of data are courtesy of Alldrin et al. [AZK08].


[Figure 3.14 column titles: Image, Normal, Shading, Depth]

Figure 3.14: Photometric stereo results using real-world data. From left to right, the columns show one of the input images, the estimated surface normal using the biquadratic reflectance model, Lambertian shading, and the depth map (brighter intensity means closer and darker intensity means further) reconstructed from the estimated normals. The top two rows used data from the USC light stage gallery [CEJ∗06].


We show four groups of results. For the data in the top two rows, we use the light stage data from [CEJ∗06], named Kneeling knight (119) and Helmet side right (119). We use only 119 images out of all the 253 images; the selected images are illuminated by the lights on the visible hemisphere (with camera as the north pole). The bottom two rows show results of our Post and Teapot data. These objects include various challenging materials for photometric stereo, such as metal, porcelain, etc. The consistency of the Lambertian shading and depth maps with the reference images of real objects shows the effectiveness and accuracy of the proposed photometric stereo method.

3.7 Summary

We present the bi-polynomial reflectance model for representing low-frequency reflectances of isotropic surfaces. The proposed reflectance model closely captures the general diffuse reflectance that spans the low-frequency domain in comparison with other parametric BRDF models and is useful for inverse problems such as reflectometry and surface normal recovery. We make comparisons with existing parametric models and demonstrate the usefulness of the proposed BRDF model. We also discuss the choices of different orders of polynomials, and conclude that the biquadratic model is generally the most suitable for inverse problems in radiometric image analysis. The current model is limited to isotropic materials. In future work, we hope to analyze the characteristics of the low-frequency component of anisotropic reflectance and extend our bi-polynomial model to a wider variety of materials.

Discussion

The two general reflectance solutions in the previous chapter and this chapter have the same goal, but they differ in the reflectance that each method ideally assumes. The method exploring reflectance monotonicity (Chapter 2) expects the BRDF to be projected on a 1D profile along the half-vector, while the bi-polynomial model (Chapter 3) requires the extensive existence and effective extraction of low-frequency reflectance observations. As we have shown in the photometric stereo evaluation using 100 materials (Fig. 2.5 and Fig. 3.8), these two methods show different rankings of materials, i.e., they have their own “favorite” materials. Here in Fig. 3.15, we list the best and worst three materials for these two methods, respectively. The rankings of photometric stereo accuracy show interesting material-related properties, which are consistent with the assumptions that each method relies on. The monotonicity-based method prefers materials with single-lobe reflectance, which usually show focused specular spikes; in contrast, the bi-polynomial-based method prefers materials bearing slow reflectance variations, which reveal abundant low-frequency components; and vice versa.

[Figure 3.15 labels, in extraction order: ORANGE-PAINT, YELLOW-PAINT, NICKEL, GREASE-COVERED-STEEL, WHITE-DIFFUSE-BBALL, PICKLED-OAK-260, YELLOW-MATTE-PLASTIC, PURPLE-PAINT, BEIGE-FABRIC, GREEN-LATEX, SPECULAR-ORANGE-PHENOLIC, PICKLED-OAK-260. Column titles: Easy materials (Best 3), Difficult materials (Worst 3). Row titles: Biquadratic [Chapter 3], Monotonic [Chapter 2]]

Figure 3.15: Comparison between the two general reflectance solutions, by showing their best-three and worst-three materials respectively.

Appendix: Theoretically Optimal Solution to Eq. (3.16)

Considering that the two sets of variables $n$ and $x$ are always interlaced in the objective function, the absolute scale of $n$, i.e., the unit-norm constraint, is not essential. When solving $n$ with fixed $x$, it is feasible to remove the unit-norm constraint temporarily. After $n$ is obtained, we can re-normalize it to be of unit norm. This is a non-convex optimization problem. Since there are only three variables and the objective function is differentiable, we can retrieve all its stationary points by solving the polynomial system arising from the first-order optimality condition. Denote the objective function in Eq. (3.16) with fixed $x$ as $f(n)$. Its first-order optimality condition is composed of three degree-five polynomials w.r.t. $n = [n_x, n_y, n_z]^\top$ (in the biquadratic case; a similar analysis naturally applies to other orders of polynomials), namely $\partial f(n) / \partial n_i = 0$, where $i = x, y, z$. We use the Gröbner basis technique in [KBP08] to solve the aforementioned multivariate polynomial system. The basic procedure is first to determine the Gröbner bases and the monomial bases of the quotient ring under the graded reverse lexicographical ordering, and then to construct the elimination template that determines which polynomials from the ideal should be added so as to build the action matrix. Finally, the solutions to the original polynomial system are extracted from the eigen-factorization of the action matrix. After retrieving all stationary points, we retain the real stationary points only, and choose the one with the smallest objective function value as the ultimate solution. This solution is the globally optimal solution to $f(n)$. Then, we re-normalize $n$ to be of unit norm.

Solving x with fixed n is a convex linear least squares problem, whose global optimum is easy to find. By iterating the above operations of solving n and x repeatedly, we finally get the optimal solution to surface normal.


Chapter 4

General Lighting Solution using Shape Prior

This chapter introduces the solution to photometric stereo with general lighting. We assume a target object with Lambertian reflectance is placed under unknown/natural illumination. This illumination can be any general lighting condition, like environment lighting, single/multiple point light sources with/without ambient lighting, area lighting, etc. Given a coarse 3D shape in the format of a depth or normal map, the proposed method is able to accurately calculate the surface normal. As another useful property, the proposed method works with uncontrolled sensors without radiometric calibration.

4.1 Overview

The photometric stereo problem without known lighting conditions is called uncalibrated photometric stereo. Due to the lack of knowledge about lightings, the solution can only be derived up to some ambiguity, such as the generalized-bas-relief (GBR) ambiguity [BKY99] in the case of unknown distant lightings. When a more general natural lighting is considered, the ambiguity becomes higher dimensional as discussed by Basri et al. [BJK07]. Fully resolving such a high-dimensional ambiguity is desired because of its relevance to real-world applications; however, it is still a challenging task. Besides the uncontrolled illumination, properly accounting for uncontrolled sensors is another important issue for making photometric stereo work under diverse settings, because automatic gain control and nonlinear sensor response deteriorate results. Therefore in most of the approaches, sensor gains and responses are either pre-calibrated or assumed to be known; however, sensor parameters are inaccessible in many situations.

In this chapter, we propose an approach of utilizing shape priors for resolving these practical issues. The shape prior in this chapter is coarse shape information of a scene that is obtained by other means, such as a low-cost depth sensor, or structure from motion (SfM) and multi-view stereo (MVS). Because of the wide availability of low-cost depth sensors and recent advances of SfM and MVS, the assumption of having coarse shape information is becoming more and more realistic. We show that the shape prior can be used for resolving the high-dimensional ambiguity that remains in the setting of unknown and general lightings and also for disregarding the effect of uncontrolled sensors. To effectively disregard the effects of nonlinear sensor responses, we show that the nonlinearity in intensity observations can be approximated by a high-dimensional linear transformation applied over the illumination component, which can be viewed as pseudo multiplexing of natural lightings. Because of its accurate linear approximation, the proposed method is able to separate the shape estimate from the pseudo-multiplexed illumination component. Based on these observations, we develop a photometric stereo method that works in a wild setting of uncontrolled illumination and sensors. We demonstrate the effectiveness of the proposed method by showing applications that use low-cost depth sensors and Internet images.

4.2 Related Works

A general/natural illumination provides useful constraints in solving the shape from shading problem for a scene containing a uniform albedo [JA11, ON12]. In the photometric stereo context, the setting of natural illumination is mainly discussed for applications of outdoor scenes. Ackermann et al. [ARSG10] apply MVS using Internet images to compute sparse surface normals and transfer them to images under varying lightings for obtaining dense normal estimates. A time-lapse video taken over months provides observations under non-coplanar sunlight for solving photometric stereo [ALFG12, AHP12]. Shen and Tan [ST09] use spherical harmonics for modeling natural illuminations as done by Basri et al. [BJK07], and apply photometric stereo to Internet images. Their method can only obtain sparse normals, but it is shown that the estimates are useful for weather estimation.

For photometric stereo with uncalibrated distant lighting, it is well known that there exists a 3 × 3 linear ambiguity [Hay94] in the recovered surface normals for general surfaces, and a three-parameter GBR ambiguity for integrable surfaces [BKY99, YSEB99]. Recent works mainly focus on estimating the three unknowns to obtain final normal estimates, such as [AMK07, SMW∗10, FP12]. Under a general unknown lighting, there is a 9 × 3 (= 27 unknowns) linear ambiguity in the estimated surface normals with illumination modeled by second order spherical harmonics. Unfortunately, this high-dimensional ambiguity cannot be completely removed without additional information [BJK07]. In this work, we use a shape prior for effectively resolving this ambiguity.

Effective fusion of shape prior and photometric cues improves the surface reconstruction. The shape prior is useful for removing the ambiguities in the settings of unknown distant/point lighting [ZCHS03, LHYK05, JK07, EVC08, HMJI09]. In addition, fusion of the shape prior and surface normal estimates gives faithful surface reconstruction [NRDR05, WWMT11, OD12, ZYY∗12, YYTL13]. We have the same goal at a high level with these existing approaches, but our method exploits the use of shape prior for addressing issues that arise in complex and unknown natural illumination with uncontrolled cameras.

Separating the effect of sensor gains and responses is necessary for shading-based shape estimation. Due to the assumption of natural illumination, it is not straightforward to apply the self-calibrating method such as [SMW∗10], the auto-calibrating method such as [MOS11], or the methods that use time-lapse video [KFP08]. When sensor parameters are inaccessible, methods that require controlled exposure time (e.g., [GN04]) are unsuitable as well. If treated as an independent problem, any solution using a single image (e.g., [LGYS04]) can be used as pre-processing. The application scenario of our method is similar to [DS11], which also relies on some available geometry information. Instead of explicitly estimating the response function, we disregard it in a self-contained pipeline.

4.3 Proposed Method

Starting from this chapter, we will use the letter $r$ to represent the scene radiance instead of $i$, which was used in the previous two chapters. Note that in the two general reflectance solutions in previous chapters, we either perform radiometric calibration of our camera or use a camera with linear response, thus the calculation is performed with the scene radiance, which is required for most photometric stereo methods. For the method in this chapter, however, we model the nonlinear response of the camera within the solution pipeline, and the calculation is directly performed with the observed image brightness, which is redefined as $i$ in this chapter. The scene radiance $r$ in this chapter is mostly used for deriving the image formation model.

We begin with a Lambertian image formation model under natural lightings. The radiance $r$ of a scene point that has the Lambertian albedo $\rho$ and surface normal $n = [n_x, n_y, n_z]^\top \in \mathbb{R}^{3 \times 1}$ is written as

$$r = \int_{\Omega} \rho L(\omega) \max(n^\top \omega, 0) \, d\omega, \tag{4.1}$$

where $\omega \in \mathbb{R}^{3 \times 1}$ is a unit vector of spherical directions $\Omega$, and $L(\omega)$ is the light intensity from the direction $\omega$. This integration can be approximated using spherical harmonics as

$$r = s^\top l, \tag{4.2}$$

where $s = [s_1, s_2, \ldots, s_k]^\top \in \mathbb{R}^{k \times 1}$ are harmonics images of surface normal $n$ and albedo $\rho$, and $k$ is the number of elements determined by the order of spherical harmonics. The vector $l \in \mathbb{R}^{k \times 1}$ is the $k$-dimensional lighting coefficients.

Given $p$ pixels observed under $q$ different illuminations, we store all these $p \times q$ radiance values into a radiance matrix $R \in \mathbb{R}^{p \times q}$. By a row-wise stacking of $p$ transposed harmonics images $s^\top$ in a shape matrix $S \in \mathbb{R}^{p \times k}$ and a column-wise stacking of $q$ lighting coefficients $l$ in a lighting matrix $L \in \mathbb{R}^{k \times q}$, Eq. (4.2) can be written in a matrix form as

$$R = SL. \tag{4.3}$$
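The matrix form can be illustrated with a small numerical sketch. It assumes a second-order representation (k = 9) with the standard unnormalized basis 1, n_x, n_y, n_z, 3n_z^2 - 1, n_x n_y, n_x n_z, n_y n_z, n_x^2 - n_y^2 scaled by albedo; the random normals, albedos, and lighting coefficients are synthetic placeholders rather than a physically valid scene, and `harmonics_images` is a hypothetical helper name.

```python
import numpy as np

def harmonics_images(normals, albedo):
    """Stack the nine harmonics images row-wise into a shape matrix S of size (p, 9)."""
    nx, ny, nz = normals[:, 0], normals[:, 1], normals[:, 2]
    basis = np.stack(
        [np.ones_like(nx), nx, ny, nz, 3 * nz**2 - 1,
         nx * ny, nx * nz, ny * nz, nx**2 - ny**2],
        axis=1,
    )
    return albedo[:, None] * basis

rng = np.random.default_rng(0)
p, q, k = 500, 20, 9

# Random unit normals facing the camera (n_z >= 0) and random albedos.
normals = rng.normal(size=(p, 3))
normals[:, 2] = np.abs(normals[:, 2])
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
albedo = rng.uniform(0.2, 1.0, size=p)

S = harmonics_images(normals, albedo)   # shape matrix, p x k
L = rng.normal(size=(k, q))             # lighting matrix, k x q (synthetic coefficients)
R = S @ L                               # radiance matrix, Eq. (4.3)

rank_R = np.linalg.matrix_rank(R)
```

However many images are stacked, the rank of R cannot exceed k, which is what makes the later SVD-based factorization possible.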

We further include the effect of sensor gains and responses in the image formation model. Under varying illumination, the exposure time for each image is likely different for an uncontrolled sensor. Each exposure time corresponds to a scaling of its lighting coefficient $l$, which is one column in the lighting matrix $L$. For simplicity of notation, we still use $L$ to represent the scaled lighting coefficient matrix. In addition, a nonlinear response function transforms the radiance $R$. Let us denote the camera’s radiometric response as $f$. For now, we assume the response function $f$ is the same for all images. The observation matrix $I \in \mathbb{R}^{p \times q}$ can be expressed using the response function $f$, which is applied in an element-wise manner using an operator “$\circ$”, as

$$I = R \circ f = (SL) \circ f. \tag{4.4}$$

Our method approximates the nonlinear response function $f$ using a high-dimensional linear transformation $F \in \mathbb{R}^{q \times q}$ as

$$I = (SL) \circ f \approx SLF. \tag{4.5}$$

The transformation $F$ varies with the response function $f$ and radiance $R$. We will explain and verify the appropriateness of this approximation in Sec. 4.4. Since our goal is to estimate the shape component $S$, we rewrite Eq. (4.5) as $I = S L_F$ with $L_F = LF$, so that the illumination component embeds the transformation caused by response functions.
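A small numerical sketch can make the approximation in Eq. (4.5) concrete. Everything below is an illustrative assumption rather than part of the method: a uniform albedo, a gamma curve standing in for the unknown response f, lighting dominated by a constant term so that the radiance stays positive, and F obtained by ordinary least squares.

```python
import numpy as np

def harmonics(normals):
    """Nine second-order harmonics images (unnormalized, unit albedo)."""
    nx, ny, nz = normals.T
    return np.stack(
        [np.ones_like(nx), nx, ny, nz, 3 * nz**2 - 1,
         nx * ny, nx * nz, ny * nz, nx**2 - ny**2],
        axis=1,
    )

rng = np.random.default_rng(1)
p, q = 400, 12

normals = rng.normal(size=(p, 3))
normals[:, 2] = np.abs(normals[:, 2])
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
S = harmonics(normals)

# Lighting: a dominant constant term per image plus weak higher-order terms,
# chosen so that every radiance value is positive.
L = rng.uniform(-0.05, 0.05, size=(9, q))
L[0] = rng.uniform(0.8, 1.2, size=q)
R = S @ L

# Assumed nonlinear response (gamma curve), applied element-wise as in Eq. (4.4).
I = R ** (1.0 / 2.2)

# Least-squares estimate of the q x q transformation F in I ~ R F, Eq. (4.5).
F, *_ = np.linalg.lstsq(R, I, rcond=None)

err_fit = np.linalg.norm(I - R @ F)       # residual with the fitted F
err_identity = np.linalg.norm(I - R)      # residual if the response is ignored
rel_err = err_fit / np.linalg.norm(I)
```

In this setting the fitted F absorbs most of the nonlinearity, so the observations remain close to a rank-9 product of the shape matrix and a modified lighting matrix.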


4.3.1 Linear solution

Similar to previous approaches [Hay94, BJK07], we perform the singular value decomposition (SVD) on the observation matrix $I$ to estimate the shape matrix $S$ up to a linear ambiguity $B \in \mathbb{R}^{k \times k}$. In other words, the ambiguous $\hat{S}$ and $\hat{L}_F$ are related to their ground truths $S$ and $L_F$ by $\hat{S}B = S$ and $B^{-1}\hat{L}_F = L_F$, respectively. As discussed in [BJK07], the surface normal is encoded in the second to fourth columns of $S$. Therefore, a $k \times 3$ matrix $A$ is sufficient for computing the normal from $\hat{S}$ as $\hat{S}A$.

Suppose we are given a coarse shape prior in the form of surface normal $\tilde{N}$. We can use $\tilde{N}$ to estimate $A$ and then obtain a precise normal from $\hat{S}A$. Since the normal $\tilde{N}$ from the shape prior is assumed to contain various types of noise, we apply smoothing to both the surface normal prior $\tilde{N}$ and the ambiguous shape matrix $\hat{S}$. Then, the estimate of the ambiguity matrix $A$ can be obtained as

$$\hat{A} = \arg\min_{A} \| G(\hat{S}) A - G(\tilde{N}) \|_F, \tag{4.6}$$

where $G$ denotes a Gaussian smoothing operator. By applying $\hat{A}$ to the original $\hat{S}$, we obtain surface normals $\hat{N}$ by $\hat{N} = O(\hat{S}\hat{A})$, where $O$ is a normalization operator forcing each row of the matrix to be a unit vector.
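The linear solution can be sketched on synthetic data as follows. The sketch makes several simplifying assumptions that are not part of the method: a uniform albedo, noise-free observations with a linear sensor, a prior generated by perturbing the true normals, and the Gaussian smoothing G replaced by the identity. Helper names are hypothetical.

```python
import numpy as np

def harmonics(normals):
    """Nine second-order harmonics images (unnormalized, unit albedo)."""
    nx, ny, nz = normals.T
    return np.stack(
        [np.ones_like(nx), nx, ny, nz, 3 * nz**2 - 1,
         nx * ny, nx * nz, ny * nz, nx**2 - ny**2],
        axis=1,
    )

def normalize_rows(M):
    """The operator O: scale each row to unit length."""
    return M / np.linalg.norm(M, axis=1, keepdims=True)

def mean_angle_deg(A, B):
    cosines = np.clip(np.sum(normalize_rows(A) * normalize_rows(B), axis=1), -1, 1)
    return float(np.degrees(np.arccos(cosines)).mean())

rng = np.random.default_rng(2)
p, q, k = 2000, 30, 9

N_true = rng.normal(size=(p, 3))
N_true[:, 2] = np.abs(N_true[:, 2])
N_true = normalize_rows(N_true)

# Observations under unknown general lighting.
I = harmonics(N_true) @ rng.normal(size=(k, q))

# Rank-k factorization by SVD; S_hat equals the true S only up to a k x k ambiguity.
U, sing, Vt = np.linalg.svd(I, full_matrices=False)
S_hat = U[:, :k] * np.sqrt(sing[:k])

# Coarse, noisy normal prior (stand-in for normals derived from a depth sensor).
N_prior = normalize_rows(N_true + 0.2 * rng.normal(size=(p, 3)))

# Eq. (4.6) with G taken as the identity: a linear least-squares problem in A.
A_hat, *_ = np.linalg.lstsq(S_hat, N_prior, rcond=None)
N_est = normalize_rows(S_hat @ A_hat)

err_prior = mean_angle_deg(N_prior, N_true)
err_est = mean_angle_deg(N_est, N_true)
```

Because only 27 unknowns are fitted against thousands of pixels, the per-pixel noise of the prior is largely averaged out in the estimate.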

4.3.2 Nonlinear refinement

Solving Eq. (4.6) can only provide a correct solution for objects with a uniform albedo. When a scene contains varying albedos, the norm of the rows of $\hat{S}A$ varies, while $\tilde{N}$ only contains unit normal vectors. To explicitly handle the albedo variations, we further optimize $A$ using the following objective function:

$$A^* = \arg\min_{A} \| O(G(\hat{S})A) - G(\tilde{N}) \|_F. \tag{4.7}$$

The above optimization problem is highly nonlinear, but we can use the linear solution $\hat{A}$ as an initial guess to solve for $A^*$. The optimization is performed by the Nelder-Mead simplex method [NM65] because of its simplicity and efficiency. While the global optimum is not guaranteed, in our experiments this nonlinear refinement works well because of the good initialization. Finally, the normal is computed by $N^* = O(\hat{S}A^*)$.
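Why the row normalization in Eq. (4.7) handles albedo variation can be checked numerically. The sketch below does not run the Nelder-Mead refinement; it only evaluates the two objectives at the ideal ambiguity matrix, under the illustrative assumptions that the ambiguity B is the identity (so the ambiguous shape matrix equals the true one), that the prior is noise-free, and that G is the identity.

```python
import numpy as np

def harmonics(normals):
    """Nine second-order harmonics images (unnormalized, unit albedo)."""
    nx, ny, nz = normals.T
    return np.stack(
        [np.ones_like(nx), nx, ny, nz, 3 * nz**2 - 1,
         nx * ny, nx * nz, ny * nz, nx**2 - ny**2],
        axis=1,
    )

def normalize_rows(M):
    """The operator O: scale each row to unit length."""
    return M / np.linalg.norm(M, axis=1, keepdims=True)

rng = np.random.default_rng(4)
p = 300

N = rng.normal(size=(p, 3))
N[:, 2] = np.abs(N[:, 2])
N = normalize_rows(N)
albedo = rng.uniform(0.2, 1.0, size=p)   # spatially varying albedo

S = albedo[:, None] * harmonics(N)

# Ideal A: pick the second to fourth columns of S, which hold albedo-scaled normals.
A_ideal = np.zeros((9, 3))
A_ideal[1:4] = np.eye(3)

linear_residual = np.linalg.norm(S @ A_ideal - N)                      # Eq. (4.6)
normalized_residual = np.linalg.norm(normalize_rows(S @ A_ideal) - N)  # Eq. (4.7)
```

Even with the ideal A, the linear objective is left with a residual caused purely by albedo, whereas the normalized objective vanishes; this is the mismatch that the refinement removes.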

4.3.3 Normal from depth

Since our method works in the surface normal domain, we convert the coarse and noisy depth measurements into a surface normal prior for solving our problem. A naive computation of derivatives over a coarse depth map results in an unstable normal map because of the severe quantization error and noise. Instead, we use a plane principal component analysis method introduced in [KADB11] for robustly computing the surface normal prior. Given a depth map and camera intrinsics, the method first projects the depth map to 3D points in the world coordinate system. For each 3D point, the method groups a set of points within a short distance $d$. For the $i$-th group that contains $q_i$ 3D points, by stacking them in a matrix $Q \in \mathbb{R}^{q_i \times 3}$, the surface normal for the $i$-th pixel is computed as

$$n = \arg\min_{n} \| (Q - \bar{Q}) n \|_2, \tag{4.8}$$

where $\bar{Q} \in \mathbb{R}^{q_i \times 3}$ is a matrix containing the centroid of $Q$ in all the rows. By stacking the estimated $n$ from all pixels, we obtain the shape prior $\tilde{N}$ in a coarse normal map format. A larger $d$ is usually necessary when the input depth is more severely distorted, and it produces smoother normal estimates.

Algorithm 3 Normal estimation with shape prior
1: Decompose observation matrix $I$ as $I = \hat{S}\hat{L}_F$ by using SVD;
2: Solve the linear equations for $\hat{A}$ using Eq. (4.6);
3: Do nonlinear refinement to obtain $A^*$ using Eq. (4.7), with $\hat{A}$ as an initial guess;
4: Compute normal by $N^* = O(\hat{S}A^*)$.
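The plane fit in Eq. (4.8) can be sketched for a single neighborhood as follows. Two details are assumptions of this sketch rather than statements from the text: the minimization is taken over unit vectors (otherwise n = 0 would be a trivial minimizer), and the sign of the normal is chosen so that it points toward +z, taken here as the camera side. The function name is hypothetical.

```python
import numpy as np

def plane_normal(points):
    """Unit normal of the best-fit plane through a (q_i, 3) array of 3D points.

    The minimizer of ||(Q - Q_bar) n|| over unit n is the right singular
    vector of the centered matrix with the smallest singular value.
    """
    Q = np.asarray(points, dtype=float)
    centered = Q - Q.mean(axis=0, keepdims=True)   # Q - Q_bar
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    n = Vt[-1]
    return n if n[2] >= 0 else -n                  # assumed orientation: +z

# Synthetic neighborhood: noisy samples of a plane with a known normal.
rng = np.random.default_rng(3)
true_n = np.array([1.0, 2.0, 2.0]) / 3.0
xy = rng.uniform(-1.0, 1.0, size=(50, 2))
z = -(true_n[0] * xy[:, 0] + true_n[1] * xy[:, 1]) / true_n[2]
points = np.column_stack([xy, z]) + 0.01 * rng.normal(size=(50, 3))

n_est = plane_normal(points)
cos_angle = float(n_est @ true_n)
```

Gathering points within a larger distance d adds rows to Q, which averages out more depth noise at the cost of smoothing fine geometric detail.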

The complete normal estimation method is summarized in Algorithm 3, and its pipeline is illustrated in the top frame of Fig. 4.1. In this figure, $\hat{S}$ is visualized as a normal map with an ambiguity, which corresponds to the second to fourth columns of $\hat{S}$ with normalization. Keeping the same format as the previous chapters, the normal maps are displayed by linearly encoding their XYZ components into RGB channels.

4.3.4 Surface reconstruction

The shape prior is beneficial for surface reconstruction as well because it can guide the surface recovery from a normal map as done in [NRDR05, JK07, ARSG10]. To estimate the optimal depth $Z^* \in \mathbb{R}^{p \times 1}$ by combining the estimated surface normal $N$ (superscript is omitted for simplicity) and a vectorized noisy depth map $\tilde{Z} \in \mathbb{R}^{p \times 1}$ (shape prior), we can form a linear system of equations as

$$\begin{bmatrix} \lambda I_d \\ \nabla^2 \end{bmatrix} Z^* = \begin{bmatrix} \lambda \tilde{Z} \\ \partial N \end{bmatrix}, \tag{4.9}$$

where $\nabla^2$ is a Laplacian operator, $I_d$ is an identity matrix, and $\lambda$ is a weighting parameter controlling the contribution of the depth constraint. $\partial N$ is the stack of $-\frac{\partial}{\partial x}\frac{n_x}{n_z} - \frac{\partial}{\partial y}\frac{n_y}{n_z}$ for each normal $n \in N$. While it forms a large linear system of equations, because the left matrix is sparse, it can be efficiently solved using existing sparse linear solvers (e.g., QR decomposition based solvers), or multigrid techniques.

[Figure 4.1 labels. Top frame, "Solve for Uncalibrated Photometric Stereo": Input images, Input depth, Input normal, Ambiguous normal, Estimated normal, True normal; steps SVD, Eq. (4.6), Eq. (4.7), Eq. (4.8). Bottom frame, "Linear Approximation of Sensor Responses": response function f plotted as i versus r]

Figure 4.1: Top frame shows the pipeline of the normal estimation method (Sec. 4.3); bottom frame illustrates the approximation of nonlinear responses (Sec. 4.4). The orange camera has a nonlinear response, while the green camera is a linear one. Please pay attention to the different general lighting conditions on the left ($l$) and right ($l_F$) sides of this illustration.
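The linear system in Eq. (4.9) can be sketched on a small grid as follows. This is an illustration under stated assumptions: a dense least-squares solve replaces the sparse solvers mentioned above (adequate only for tiny grids), the Laplacian rows use a 5-point stencil on interior pixels, derivatives are taken by finite differences with unit pixel spacing, and `fuse_depth_and_normals` is a hypothetical name.

```python
import numpy as np

def fuse_depth_and_normals(Z_prior, normals, lam):
    """Least-squares solution of [lam*I_d; Laplacian] Z = [lam*Z_prior; dN]."""
    H, W = Z_prior.shape
    p = H * W
    idx = np.arange(p).reshape(H, W)

    # dN = -d/dx(nx/nz) - d/dy(ny/nz); note that -nx/nz and -ny/nz are the depth gradients.
    gx = -normals[..., 0] / normals[..., 2]
    gy = -normals[..., 1] / normals[..., 2]
    dN = np.gradient(gx, axis=1) + np.gradient(gy, axis=0)

    rows, rhs = [], []
    for y in range(1, H - 1):
        for x in range(1, W - 1):
            row = np.zeros(p)
            row[idx[y, x]] = -4.0
            row[idx[y, x - 1]] = row[idx[y, x + 1]] = 1.0
            row[idx[y - 1, x]] = row[idx[y + 1, x]] = 1.0
            rows.append(row)
            rhs.append(dN[y, x])

    A = np.vstack([lam * np.eye(p), np.array(rows)])
    b = np.concatenate([lam * Z_prior.ravel(), np.array(rhs)])
    Z, *_ = np.linalg.lstsq(A, b, rcond=None)
    return Z.reshape(H, W)

# Synthetic test surface: a paraboloid with exact normals and a noisy depth prior.
H = W = 12
ys, xs = np.mgrid[0:H, 0:W].astype(float)
c = 0.04
Z_true = 0.5 * c * (xs**2 + ys**2)
normals = np.dstack([-c * xs, -c * ys, np.ones_like(xs)])
normals /= np.linalg.norm(normals, axis=2, keepdims=True)

rng = np.random.default_rng(5)
Z_prior = Z_true + 0.3 * rng.normal(size=(H, W))
Z_fused = fuse_depth_and_normals(Z_prior, normals, lam=0.1)

err_prior = np.linalg.norm(Z_prior - Z_true)
err_fused = np.linalg.norm(Z_fused - Z_true)
```

A small λ trusts the normals for local detail while the depth prior still fixes the overall position of the surface, which the Laplacian term alone cannot determine.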

4.4 Linear Approximation of Sensor Responses

The shape estimation method in Sec. 4.3 relies on a high-dimensional linear approximation of nonlinear responses (Eq. (4.5)), which allows us to separate the effects of unknown sensor gains and responses from the shape estimation as I = S(LF). The resulting lighting component $L_F$ (= LF) becomes different from the actual one L. However, it is a linear combination of the original lightings, and it can be viewed as pseudo multiplexing of natural lightings, which allows us to effectively account for unknown sensor responses. Intuitively, for each image, the pseudo multiplexing can be explained as the following process: a uniform surface is illuminated by a natural illumination l and captured via a nonlinear response function f that maps the radiance r = Sl to the observed image intensity i = f(r); the observed image is approximately equal to that of the same surface illuminated by $l_F$ (one column of $L_F$) and captured with a linear camera, i.e., $i \approx S l_F$. This process is illustrated in the bottom frame of Fig. 4.1.

Qualitatively, the linear approximation becomes less accurate as a surface contains more diverse albedos. For example, consider two surface points that have the same normal n but different albedos $\rho_1$ and $\rho_2$ ($\rho_1 \neq \rho_2$). Strictly speaking, for a second-order spherical harmonics representation we have the following relations between spherical harmonics images and the albedo-scaled surface normal: $s_1 = \rho$, $s_2 = \rho n_x$, $s_3 = \rho n_y$, $s_4 = \rho n_z$, $s_5 = \rho(3n_z^2 - 1)$, $s_6 = \rho n_x n_y$, $s_7 = \rho n_x n_z$, $s_8 = \rho n_y n_z$, $s_9 = \rho(n_x^2 - n_y^2)$. However, for simplicity and with a slight abuse of notation, we denote $s = \rho n$. Then the radiance values at these two points become $r_1 = s_1^\top l = \rho_1 n^\top l$ and $r_2 = s_2^\top l = \rho_2 n^\top l$, respectively. A camera response function maps these radiance values to $f(r_1)$ and $f(r_2)$. Since f is a nonlinear function, the ratio $f(\rho_1 n^\top l) : f(\rho_2 n^\top l)$ generally differs from $\rho_1 : \rho_2$. However, the linear approximation is limited in representing this nonlinear effect, and no matter how we change F, the radiance ratio at these two points stays at $\rho_1 : \rho_2$. This error becomes more obvious as the difference between $\rho_1$ and $\rho_2$ gets larger. But as we will show using experiments later, our method has a great tolerance for various albedo contrasts.
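The ratio argument can be checked numerically with a small example. The gamma curve below is a hypothetical stand-in for a camera response and is not taken from [GN04]; the shading values are likewise arbitrary illustrative numbers.

```python
import numpy as np

def f(r, gamma=2.2):
    """Hypothetical gamma-type response function (an assumption)."""
    return np.power(r, 1.0 / gamma)

rho1, rho2 = 0.1, 0.9          # strong albedo contrast (alpha = 5 in Fig. 4.2)
shading = 0.8                  # n^T l, shared by both points
r1, r2 = rho1 * shading, rho2 * shading

ratio_radiance = r1 / r2       # equals rho1 / rho2
ratio_observed = f(r1) / f(r2) # differs after the nonlinear response

# Any linear change of lighting rescales both points by the same factor,
# so the ratio produced by the linear approximation stays at rho1 / rho2.
shading_F = 0.37               # n^T l_F for some other lighting l_F
ratio_linear = (rho1 * shading_F) / (rho2 * shading_F)
```

Here the observed ratio is compressed toward 1 by the response curve, while `ratio_linear` stays at `rho1 / rho2` for any choice of `shading_F`, which is the limitation described above.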

For our method, it is not necessary to estimate the mixing matrix F; however, the approximation power of the linear transformation F is of interest because it relates to the shape estimation accuracy. We therefore assess the appropriateness of the approximation using the database of measured response functions [GN04]. Fortunately, as we will show below, the approximation error is consistently and sufficiently small for the real-world response functions [GN04] even for varying albedos, partly due to the high regularity of real response functions.

Figure 4.2: Reconstruction error of Eq. (4.5) w.r.t. varying number of images (q) for scenes containing two spheres with different albedos. α = 1, 2, 3, 4, 5 indicate that the left/right spheres have albedo values of 0.5/0.5, 0.4/0.6, 0.3/0.7, 0.2/0.8, 0.1/0.9. [Plot: reconstruction error (0 to 0.3) versus q (5 to 40), with “Nonlinear-fixed” and “Nonlinear-random” curves for each α.]

Simulation test

We use synthetic images to assess the approximation ability. The images are created by using a 9D spherical harmonics expansion of both normal and lighting (the 9D lighting coefficients are calculated by fitting to varying distant lightings plus a constant ambient lighting). We simulate the imaging process where f is applied to R in two different manners: (1) “Nonlinear-fixed”: the same response is applied to all images under varying illuminations. This case corresponds to a scenario with an uncontrolled camera. (2) “Nonlinear-random”: each image under one lighting condition is distorted by a randomly selected response function in the database. This case corresponds to a scenario with Internet images, where each image is recorded via a distinct unknown and nonlinear response. We average the results over all 201 response functions in the “Nonlinear-fixed” case, and 201 random trials are performed and averaged for the “Nonlinear-random” case. The test scene consists of two spheres with different albedos, whose values are shown at the bottom of Fig. 4.2.

Figure 4.3: Linearly approximated images for uniform albedo (top row) and strong contrast cases (bottom row). Input radiance r is transformed by the nonlinear response to i. Our linearly approximated $r_F$ appears very close to i. The errors shown below the images are the mean relative differences of r and $r_F$ from i (the same definition as the reconstruction error in Fig. 4.2). Color-encoded images are used for better visualization. [Panel labels: r, i, $r_F$; reported errors: Err. = 0.49 / 0.077 and Err. = 0.31 / 0.004.]

To assess the approximation ability, we evaluate the reconstruction error of Eq. (4.5). Given R and I, we solve for F by linear least squares as $F = R^{+} I$, where $R^{+}$ is the pseudo-inverse of R. Then, a reconstruction $R_F$ is computed as $R_F = RF$. The reconstruction error is evaluated as the mean of $|i - r_F|/i$, where i is a pixel observation of I and $r_F$ is the corresponding element in $R_F$. This is a relative error (percentage) defined for each observation. We show the reconstruction error with respect to the varying number of input images q and albedo contrast in Fig. 4.2. The error becomes low (about 1%) once the number of input images reaches q ≥ 9 for the case of uniform albedo (α = 1). On the other hand, as the albedo contrast becomes greater, the errors increase accordingly. Except for the extreme case with α = 5, which mimics a scene of almost black and white spheres, the reconstruction errors are consistently low (below 5%). Therefore, the approximation generally works well, except for scenes that exhibit significantly high contrast. According to the statistics of the measured data in [GN04], the different radiometric response functions of real cameras are highly correlated. This explains why our method works for both the “Nonlinear-fixed” and “Nonlinear-random” cases, as their reconstruction errors are always similar.
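The evaluation procedure just described (fit F by least squares, reconstruct, measure the mean relative error) can be sketched as below. This is a simplified stand-in for the actual test: it uses a 4D (first-order) lighting model with a constant ambient term instead of the 9D expansion, a single hypothetical gamma curve instead of the measured responses in [GN04], and uniform albedo. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 500, 12                      # pixels, images

# Albedo-scaled first-order basis S = rho * [1, n_x, n_y, n_z] (p x 4).
n = rng.standard_normal((p, 3))
n[:, 2] = np.abs(n[:, 2])           # normals face the camera
n /= np.linalg.norm(n, axis=1, keepdims=True)
rho = 0.5                           # uniform albedo (alpha = 1)
S = rho * np.hstack([np.ones((p, 1)), n])

# Lighting: unit ambient term plus a directional part of magnitude 0.5,
# which keeps every radiance value strictly positive.
d = rng.standard_normal((3, q))
d *= 0.5 / np.linalg.norm(d, axis=0, keepdims=True)
L = np.vstack([np.ones((1, q)), d])  # 4 x q

R = S @ L                            # radiance, p x q
I = R ** (1.0 / 2.2)                 # "Nonlinear-fixed" with an assumed gamma

F = np.linalg.pinv(R) @ I            # F = R^+ I, q x q
RF = R @ F                           # linear reconstruction R_F

recon_error = np.mean(np.abs(I - RF) / I)      # mean of |i - r_F| / i
baseline_error = np.mean(np.abs(I - R) / I)    # error if F were the identity
```

Comparing `recon_error` against `baseline_error` shows how much of the nonlinear response is absorbed by the linear map F in this toy setting; the numbers are not comparable to those in Fig. 4.2 because the setup differs.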

As an intuitive visualization, we show linearly approximated images of the uniform albedo (α = 1 in Fig. 4.2) and strong contrast (α = 5 in Fig. 4.2) cases in the top and bottom rows of Fig. 4.3, respectively. The response function used in this example has a shape similar to the one in the bottom frame of Fig. 4.1. It can be seen that i and $r_F$ differ very little visually, especially for the uniform albedo case, which verifies the effectiveness of our linear approximation.

4.5 Experiments

In this section, we first quantitatively assess the performance of the proposed method using synthetic data. We then show results that use real-world images in two scenarios: one using a Kinect RGBD sensor and the other using Internet images.

4.5.1 Quantitative evaluation

We use a synthetic scene, Caesar, to quantitatively evaluate our method. The data is synthesized in the same way as the simulation test in Sec. 4.4. We fix the number of distinct lightings to q = 40 for all the following tests.

Effect of albedo contrast

We evaluate the effect of albedo contrast on normal estimation accuracy. To exclude factors other than nonlinear sensor responses, in this test we use the ground truth normal as N in Eq. (4.6) to remove the ambiguity. In Fig. 4.4, we show the normal estimation accuracy with respect to varying albedo contrast α for different dimensions of lighting coefficients k. As we have observed in Sec. 4.4, the accuracy generally degrades with greater albedo contrast, and the errors become smaller with a larger k. This indicates that higher-order lightings make the pseudo multiplexing more effective. The “Nonlinear-fixed” cases show larger errors than the “Nonlinear-random” cases due to large errors caused by some response functions with unusual shapes that are difficult to approximate.
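The angular error used as the accuracy measure can be computed as in the following sketch. Averaging over pixels is assumed here, and the function name `mean_angular_error_deg` is illustrative.

```python
import numpy as np

def mean_angular_error_deg(n_est, n_true):
    """Mean angle in degrees between corresponding normals (last axis = XYZ)."""
    a = np.asarray(n_est, dtype=float)
    b = np.asarray(n_true, dtype=float)
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    # Clip guards against round-off pushing the cosine outside [-1, 1].
    cos = np.clip(np.sum(a * b, axis=-1), -1.0, 1.0)
    return float(np.degrees(np.arccos(cos)).mean())

n_true = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]])
n_est = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])
err = mean_angular_error_deg(n_est, n_true)   # one exact, one off by 90 degrees
```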

Effect of noise in shape priors

In Fig. 4.5, we show the variation of normal estimation errors with different noise levels in shape priors. The input depth values are quantized to 3 bits in the worst case, and Gaussian noise with standard deviations up to 0.05 is further added in order to simulate real-world shape priors. The computed surface normal priors have errors up to 25 degrees. In this test, the albedo is set to be uniform to remove the effect of albedo variations. The errors increase with the roughness of the shape priors. Except for the extreme case (β = 5), the normal estimation accuracy is consistently high even with nonlinear responses. Under severe noise, a large k lowers the normal estimation accuracy, because it allows too much freedom in the ambiguity matrix, which makes the solution sensitive to noise.

Figure 4.4: Normal estimation accuracy (angular error in degrees) w.r.t. different albedo contrasts. α represents the same albedo ratios as in Fig. 4.2. Two different dimensions of lighting coefficients k = 9, 16 are evaluated. [Plot: angular error (degrees) versus albedo ratio index α, with curves for k = 9 and k = 16 under “Nonlinear-fixed” and “Nonlinear-random”.]
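The corruption of the depth prior (quantization followed by additive Gaussian noise) might be simulated as in the sketch below. The text does not specify the depth range or the quantizer; uniform quantization of depth normalized to [0, 1] is an assumption, and `corrupt_depth` is an illustrative name.

```python
import numpy as np

def corrupt_depth(depth, bits, sigma, rng):
    """Quantize normalized depth to `bits` bits, then add Gaussian noise."""
    z = np.asarray(depth, dtype=float)
    z_min, z_max = z.min(), z.max()
    z01 = (z - z_min) / (z_max - z_min)          # normalize to [0, 1]
    levels = 2 ** bits - 1
    quantized = np.round(z01 * levels) / levels  # uniform quantizer
    return quantized + rng.normal(0.0, sigma, size=z.shape)

rng = np.random.default_rng(0)
ys, xs = np.mgrid[0:32, 0:32] / 31.0
# A smooth dome-like synthetic depth map used only for illustration.
clean = np.sqrt(np.clip(1.0 - 0.5 * (xs - 0.5) ** 2 - 0.5 * (ys - 0.5) ** 2, 0.0, None))

quantized_only = corrupt_depth(clean, bits=3, sigma=0.0, rng=rng)
worst_case = corrupt_depth(clean, bits=3, sigma=0.05, rng=rng)   # 3 bits, std 0.05
```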

Surface reconstruction

We take a scene with β = 4 from Fig. 4.5 as an example to show the surface reconstruction in Fig. 4.6. The surface reconstruction error is defined as the mean value of $|z - z^*|/z$ across pixels, where z is the true depth of one pixel and $z^*$ is the corresponding depth estimate. The original rough depth (second column) is noisy, but it still provides useful positional information. On the other hand, directly integrating a surface from the normals results in a distorted reconstruction with a larger bias (third column). By fusing the normal and depth information, a more accurate surface can be reconstructed (fourth column), as pointed out by previous work [NRDR05].

Figure 4.5: Normal estimation accuracy (angular error in degrees) w.r.t. varying noise levels in shape priors. β = 1, 2, 3, 4, 5 are labels representing corruption levels, where the clean depth maps are quantized to 8, 6, 5, 4, 3 bits, with zero-mean Gaussian noise of standard deviations 0.02, 0.03, 0.04, 0.04, 0.05 added. These rough normals have angular errors from 13 to 25 degrees. Two different dimensions of lighting coefficients k = 9, 16 are evaluated. [Plot: angular error (degrees) versus noise level index β, with curves for k = 9 and k = 16 under “Nonlinear-fixed” and “Nonlinear-random”.]

4.5.2 Application using a Kinect sensor

Our method can directly work with a Kinect, with which coarse depth measurements and RGB images are recorded while the sensor gain and response function are unknown. The normal estimates and surface reconstruction results are summarized in Fig. 4.7. Each scene is recorded under 40 varying lighting conditions by moving light sources in a random manner in an office (top two datasets) or a bedroom (bottom dataset) environment with ambient illumination. We set the dimension of lighting coefficients k according to the complexity of scene albedo variations. We use k = 9 for the scene in the top row, and k = 16 for the other two scenes containing large albedo variations, as we find these settings work well for our real data. For example, if we set k = 9 for the bottom scene, the estimated normal becomes flatter, partly due to the improper approximation of the nonlinear response under strong albedo variations as discussed in Sec. 4.5.1. The results show that our method can faithfully estimate the normal and the surface.

Figure 4.6: Surface reconstruction result using synthetic data (noise std. = 0.05, λ = 0.03). From left to right, the ground truth, the noisy depth used as the shape prior (Err. = 0.10), the surface from the estimated normal (Err. = 0.17), and the final reconstruction by fusing the shape prior and the estimated normal (Err. = 0.06) are shown.

4.5.3 Application using Internet images

Recent advances in SfM and MVS show that sparse yet reliable 3D points can be recovered from Internet images [SSS06, GSC∗07, FP10]. This result offers a shape prior in our context for recovering detailed 3D shapes. We gather Internet images of a few scenes and apply SfM [SSS06] and PMVS [FP10] to obtain sparse point clouds. A Poisson reconstruction method [KBH06] is used for creating a watertight depth prior, and the multiview images are aligned to the reference view via 3D warping based on the depth prior to form an observation matrix I. To remove the ambiguity, we use as N only the normal priors calculated from regions where the 3D points produced by MVS are dense. We use k = 9 for all the scenes in this experiment.

Figure 4.7: Experimental result using a Kinect sensor. From left to right, one of the input images, surface normal computed from the Kinect depth (shape prior), estimated normal after removing ambiguity, and reconstructed surfaces from input depth, normal integration, and depth/normal fusion are shown. (* Cat/boy data use k = 16.)

We show the results of the Mount Rushmore and Kamakura Buddha scenes in the left and right columns of Fig. 4.8. These two datasets contain 42 and 16 registered images, respectively. For reference, we compare with the method of [ARSG10] using the same input. Our normal estimates show more meaningful shape information than the result from [ARSG10], because of the capability of handling natural lightings and variations of camera responses. We also show the reconstructed surfaces obtained by fusing our estimated normal and the shape prior from MVS in Fig. 4.9, where more details can be observed thanks to the surface normal estimates.

For scenes with (partly) known and regular shapes, we can directly use the knowledge as a shape prior. We show such an example of TajMahal, where the shape of the dome has a comprehensive structure. The Internet images of TajMahal are registered to a reference view via homography using SIFT [Low04] features in this case, as the 3D shape information is unavailable. This dataset contains 66 registered images, and 4 of them are shown at the top left corner of Fig. 4.10. We manually assign a surface normal map to the dome part as a shape prior in this example, as shown at the top right corner. The normal estimation and relighting results are shown in the bottom row of Fig. 4.10. The relighting result is rendered using the estimated normal under a distant lighting $[\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}]^\top$ for verification.
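Relighting a normal map under this distant light can be sketched as follows. The shading model used for the rendering is not stated in the text; Lambertian shading with unit albedo and clamping of negative values is assumed here, and `relight` is an illustrative name.

```python
import numpy as np

def relight(normals, light):
    """Lambertian shading max(0, n . l) for unit normals (last axis = XYZ)."""
    l = np.asarray(light, dtype=float)
    l = l / np.linalg.norm(l)
    return np.clip(np.asarray(normals, dtype=float) @ l, 0.0, None)

light = np.ones(3) / np.sqrt(3.0)       # the distant light [1, 1, 1] / sqrt(3)
normals = np.array([[0.0, 0.0, 1.0],    # facing the camera
                    [1.0, 0.0, 0.0],    # facing +x
                    [-1.0, 0.0, 0.0]])  # facing away from the light
shading = relight(normals, light)
```

The same function applies unchanged to an h x w x 3 normal map, returning an h x w shading image.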

4.6 Summary

We present an uncalibrated photometric stereo method that works with general lighting and an uncontrolled sensor. We propose the use of shape priors to fully remove the ambiguity in the uncalibrated setting and, in addition, to avoid the effect of the uncontrolled sensor. We demonstrate two practical application scenarios that use a Kinect sensor and Internet images for high-quality 3D modeling.

Limitation

As a limitation of our method, visibility and cast shadows are not handled in our current solution. Due to the ambiguities in the matrix factorization, it is difficult to explicitly calculate the visibility map as in [WWMT11]. We have investigated some robust algorithms [WGS∗10, ZLS∗12] to handle the cast shadows as outliers by forcing the input matrix to be rank 9. However, we found that this showed almost no improvement over the current results in our context, because the ideal rank-9 matrix seldom exists for real data when using spherical harmonics for approximating illumination. Properly modeling cast shadows in our framework is left as future work.

Figure 4.8: Surface normal estimation results using Internet images. From top to bottom rows, we show four of the registered images, the normal prior, our results, and the results from [ARSG10].

Figure 4.9: 3D reconstruction results using Internet images with close-up views indicated by red rectangles. The top and bottom rows show the shape prior and our result, respectively.

Figure 4.10: Result using Internet images and a known shape as the prior. [Panels: registered images, normal prior, estimated normal, relighting result.]

Chapter 5

Conclusion

5.1 Summary

Photometric stereo is one of the important 3D shape estimation methods. By exploring the shading variations under different illuminations, photometric stereo is able to estimate the pixel-wise surface normal. The estimated normals can further be applied to reconstruct the 3D shape of a target object. The applicability of photometric stereo is greatly restricted by the assumptions that it relies on, particularly the Lambertian reflectance and distant lighting assumptions, which seldom hold in real applications. When photometric stereo is applied to scenarios deviating far from these ideal conditions, the estimated surface normal will have a large bias.

This dissertation focuses on generalizing the reflectance and lighting assumptions of the photometric stereo method to make it work for more diverse and practical scenarios. To accomplish this, we have proposed two new solutions for general isotropic reflectance by exploring the monotonicity of BRDF profiles and by modeling the low-frequency reflectance using a newly designed bi-polynomial model. Both methods estimate surface normals for diverse types of materials to efficiently recover 3D shapes. Moreover, we have proposed a photometric stereo solution that works in a wild setup assuming unknown and general environment lightings as well as uncontrolled sensors, provided with a shape prior in the form of a coarse depth measurement. All these efforts greatly improve the applicability of photometric stereo to various applications, especially for reconstructing 3D surfaces with fine details.

5.1.1 Photometric stereo for general reflectance by analyzing reflectance monotonicity

In Chapter 2, we proposed a photometric stereo method for general isotropic reflectance through the analysis of reflectance monotonicity. Of the two key variables of a surface normal, azimuth angle and elevation angle, the estimation of the former is a well-solved problem under general isotropic reflectance and a ring-light distribution [AK07]. Thus we focused on solving the latter to answer under what reflectance assumption and what lighting distribution the problem can be solved.

Through mathematical analysis, we proved that if the lights cover the complete hemisphere and the BRDF has a dominant 1-lobe projection along the half-vector, the elevation angle of the normal can be uniquely determined. Through experiments, we verified that the theoretical assumptions on reflectance and lighting can be greatly relaxed, i.e., our method can reasonably estimate the elevation angle for surfaces with general isotropic BRDFs, given a hundred randomly distributed lights. The proposed method was further verified using real-world data that are challenging for traditional methods.

5.1.2 Photometric stereo for general reflectance by bi-polynomial modeling of low-frequency reflectance

In Chapter 3, we presented a reflectance model for inverse problems like photometric stereo and reflectometry. Unlike BRDF models designed for rendering purposes, which emphasize the accurate description of specular lobes, we focused on low-frequency reflectance, which typically shows slowly varying reflection. Low-frequency reflectances are densely observed on various materials, and more importantly, the low-frequency nature provides the possibility to model them with low-order polynomials.

We successfully modeled the low-frequency reflectance using a bi-polynomial representation. The proposed model was verified to outperform various existing BRDF models through BRDF fitting experiments. Our model has only a few parameters in a very simple analytic form, which allows our BRDF fitting process to be solved simply using linear least squares. Thanks to this highly concise formulation with little loss of modeling accuracy, our method shows improvement when it is integrated into a photometric stereo algorithm that considers BRDF fitting. Experimental results on both synthetic and real data show the advantages in stability and accuracy of our method compared with applying existing parametric BRDF models. Therefore, the bi-polynomial model is verified to be an effective solution for photometric stereo with general reflectance.

5.1.3 Photometric stereo for general lighting by utilizing shape prior

In Chapter 4, we proposed a photometric stereo solution in a highly convenient setup, with unknown general lighting and an uncontrolled camera. The only extra input of our method is a coarse depth produced by a low-cost depth sensor or by structure from motion plus multi-view stereo. This shape information benefits our solution in two ways: resolving the shape-lighting ambiguity embedded in photometric stereo under unknown general lighting, and disregarding the nonlinear camera response by using pseudo multiplexing of natural lighting in a linear transform.

By removing the restrictions of the distant lighting assumption and of linear-response sensors, which are unavailable in many scenarios, our solution enables photometric stereo to work for more diverse applications such as normal estimation using a single Kinect sensor and photometric stereo using Internet images. We provided theoretical analyses and synthetic verifications of the mechanism of using a shape prior for both shape disambiguation and nonlinear response approximation. The high-quality 3D reconstructions that combine both estimated normals and the input depth prior demonstrate the effectiveness of the proposed method.

5.2 Contributions

The main contributions of this dissertation are summarized as follows:

• Solution methods for photometric stereo with general reflectances:

Lambert’s reflectance model is widely used in most photometric stereo solutions due to its simplicity. However, among real-world objects, a purely Lambertian one can seldom be found. We have extended photometric stereo from this classic model to work with general isotropic BRDFs, which cover the reflectance properties of a much wider variety of objects. Our proposed solutions can achieve highly accurate normal estimation for various challenging materials, an accuracy that cannot be achieved under traditional assumptions.

• A deeper understanding of how photometric stereo works for general reflectances:

Both general reflectance solutions proposed in this dissertation are evaluated with all 100 materials in the MERL BRDF database. To the best of our knowledge, we are the first (starting from [STMI12a]) to perform such comprehensive evaluations of how normal estimation accuracy varies with the reflectance properties of target objects. Our results not only show that our solutions are valid across sufficiently many types of materials, but also reveal interesting regularities in how the photometric stereo results are affected by reflectance properties. Our analysis and conclusions, on the one hand, provide useful hints on how to select an appropriate photometric stereo solution for achieving higher accuracy given a designated material; on the other hand, one can infer the confidence in the accuracy of an estimated normal map by checking our material-related ranking of photometric stereo results.

• Development of a simple BRDF model for general diffuse reflectance:

Our bi-polynomial model also works as a simple yet effective BRDF model for materials which do not show strong specularity. Compared with existing models, estimating the reflectance parameters for our model only requires solving linear least squares for a few parameters, but it can still capture the majority of the reflectance variations in the low-frequency domain.

• A theoretical analysis of the elevation angle estimation under general reflectance:

We have provided a strict proof for the problem of elevation angle estimation under a 1-lobe monotonic BRDF. This discussion provides a novel point of view on understanding general BRDFs and their relationship to lighting distribution as well as normal estimation.

• A practical algorithm for photometric stereo under general and unknown lighting:

We fully explore the usefulness of a coarse shape prior to resolve the ambiguities embedded in photometric stereo under general and unknown lighting. Based on this, the proposed practical solution enables photometric stereo to work with a cheap depth sensor. When further combined with existing structure from motion and multi-view stereo algorithms, photometric stereo can, surprisingly, even work well with Internet images.

• A novel understanding of nonlinear radiometric response applied to photometric stereo:

With the availability of a shape prior, we attempt to disregard the nonlinear response functions by treating them as high-dimensional linear transformations over lighting coefficients. This novel attempt allows operating directly on observed pixel values in the photometric stereo pipeline, which significantly saves the labor and time of performing radiometric calibration. More importantly, when the sensors are inaccessible and radiometric calibration is infeasible, e.g., the Kinect sensor does not support exposure adjustment and the Internet images are taken by unknown cameras, our solution can tackle this difficult but practical issue, which existing methods cannot handle.

5.3 Future Directions

This dissertation concludes by mentioning several open problems and future improvements that we believe are important to pursue.

5.3.1 Material-aware solution for even higher accuracy

We have provided and briefly compared our two general reflectance solutions by using the same database of diverse materials, and the results show that each method has its own strong and weak materials. Given photometric stereo images, it might not be easy to completely know the reflectance; otherwise it would be easy to decide the optimal solution that suits the reflectance properties of target objects. However, the input photometric stereo images provide partial variations in reflectance, which might be useful to infer which photometric stereo solution we should trust: Does the target reflectance show a 1D monotonic BRDF profile, or does it contain sufficient low-frequency information? If the answer to the former question is “yes”, we should investigate the monotonicity-based method; otherwise, involving the bi-polynomial model is expected to achieve better performance. In other words, by analyzing those properties revealed in incomplete reflectance estimation, we might be able to automatically decide what types of approaches should produce the optimal results. A similar idea can be extended to support other existing methods for general reflectance, like [AZK08]. Therefore, further comparing the merits and demerits of our methods and existing methods with the same input and output is an interesting direction.

5.3.2 Simpler setup for the general reflectance problem

Our current solution for the general reflectance problem requires about 100 images under varied lightings as input. Due to the highly nonlinear properties of the original problem, we usually need such a number of images to ensure that we can draw a (relatively) complete BRDF profile for monotonicity analysis or that we have sufficient observations in the low-frequency domain for BRDF fitting. This cannot be achieved if the input lighting shows a biased distribution in its spherical directions (e.g., the lights are concentrated on a small area of the hemisphere). Some image-based rendering techniques might be introduced to perform interpolation and render virtual images as if they were observed under desired lighting directions, thus reducing the number of images required as input. It is well known that in the Lambertian case, if shadows can be neglected, any three images under non-coplanar lighting directions can interpolate any image under a novel lighting direction [Sha92]. It is quite possible that such a low-dimensional linear subspace exists for low-frequency reflectance as well. If well explored, it might provide useful hints for both image-based rendering and photometric stereo.
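The Lambertian three-image property cited from [Sha92] can be illustrated with a short numerical sketch. It assumes the purely linear model i = ρ nᵀl with no clamping (shadows neglected, as in the statement above); the light directions and variable names are arbitrary choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 100                                        # number of pixels
normals = rng.standard_normal((p, 3))
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
albedo = rng.uniform(0.2, 1.0, size=(p, 1))
B = albedo * normals                           # albedo-scaled normals, p x 3

# Three non-coplanar light directions stored as the columns of L3.
L3 = np.array([[0.5, -0.5, 0.0],
               [0.0, 0.5, -0.5],
               [1.0, 1.0, 1.0]])
images = B @ L3                                # three Lambertian images, p x 3

# A novel light is a linear combination of the three known lights,
# so the novel image is the same combination of the three images.
l_new = np.array([0.2, 0.3, 1.0])
coeffs = np.linalg.solve(L3, l_new)
predicted = images @ coeffs
actual = B @ l_new
```

The prediction matches exactly because Lambertian images without shadows span a 3D linear subspace; whether a comparably low-dimensional subspace exists for low-frequency reflectance is the open question raised above.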

5.3.3 Photometric stereo for both general reflectance and lighting

As indicated in the introduction chapter of this dissertation, this topic is quite a difficult one due to its complexity and inherent ambiguity. But recent advances show it is possible to solve this problem simultaneously by assuming that the object contains only one material, even if there is only a single image as input [ON12]. It is reported that general lighting actually follows some statistical regularity and provides more constraints on the problem, although it increases the complexity at the same time. The success of the method in [ON12] also relies on the assumption that real-world reflectances can be somehow expressed using the combinations and distributions of the materials in the MERL database. So we believe that, under a photometric stereo setup, the problem should have a reasonable solution if some appropriate priors (such as the statistics of environment illumination, real-world reflectance, commonly available shapes, etc.) can be incorporated as strong constraints to narrow down the solution space. Conquering this problem will finally make photometric stereo work everywhere for every common object.

References

[AHP12] Abrams A., Hawley C., Pless R.: Heliometric stereo. In Proc. of European Conference on Computer Vision (ECCV) (2012), pp. 357–370.

[AK07] Alldrin N., Kriegman D.: Toward reconstructing surfaces with arbitrary isotropic reflectance: A stratified photometric stereo approach. In Proc. of International Conference on Computer Vision (ICCV) (2007).

[ALFG12] Ackermann J., Langguth F., Fuhrmann S., Goesele M.: Photometric stereo for outdoor webcams. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2012), pp. 262–269.

[AMK07] Alldrin N., Mallick S., Kriegman D.: Resolving the generalized bas-relief ambiguity by entropy minimization. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2007).

[ARSG10] Ackermann J., Ritz M., Stork A., Goesele M.: Removing the example from example-based photometric stereo. In Proc. of European Conference on Computer Vision (ECCV) workshops (2010), pp. 197–210.

[AS00] Ashikhmin M., Shirley P.: An anisotropic Phong BRDF model. Journal of Graphics Tools 5, 2 (2000), 25–32.

[AZK08] Alldrin N., Zickler T., Kriegman D.: Photometric stereo with non-parametric and spatially-varying reflectance. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2008).

[BG01] Boivin S., Gagalowicz A.: Image-based rendering of diffuse, specular and glossy surfaces from a single image. In Proc. of SIGGRAPH (2001), pp. 107–116.

[BJK07] Basri R., Jacobs D., Kemelmacher I.: Photometric stereo with general unknown lighting. International Journal of Computer Vision 72, 3 (2007), 239–257.

[BKY99] Belhumeur P., Kriegman D. J., Yuille A. L.: The bas-relief ambiguity. International Journal of Computer Vision 35, 1 (1999), 33–44.

[Bli77] Blinn J.: Models of light reflection for computer synthesized pictures.In Proc. of SIGGRAPH (1977), pp. 192–198.

[BP03] Barsky S., Petrou M.: The 4-source photometric stereo technique forthree-dimensional surfaces in the presence of highlights and shadows.IEEE Transactions on Pattern Analysis Machine Intelligence 25, 10 (2003),1239–1252.

[CBR11] Chandraker M., Bai J., Ramamoorthi R.: A theory of photometric reconstruction for unknown isotropic reflectances. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2011), pp. 2505–2512.

[CEJ∗06] Chabert C., Einarsson P., Jones A., Lamond B., Ma W., Sylwan S., Hawkins T., Debevec P.: Relighting human locomotion with flowed reflectance fields. In Proc. of SIGGRAPH (Sketches) (2006).

[CJ82] Coleman E., Jain R.: Obtaining 3-dimensional shape of textured and specular surfaces using four-source photometry. Computer Graphics and Image Processing 18, 4 (1982), 309–328.

[CJ08] Chung H., Jia J.: Efficient photometric stereo on glossy surfaces with wide specular lobes. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2008).

[CR11] Chandraker M., Ramamoorthi R.: What an image reveals about material reflectance. In Proc. of International Conference on Computer Vision (ICCV) (2011), pp. 1076–1083.

[CT82] Cook R., Torrance K.: A reflectance model for computer graphics. ACM Transactions on Graphics 1, 1 (1982), 7–24.

[DHOMH12] Drew M., Hel-Or Y., Malzbender T., Hajari N.: Robust estimation of surface properties and interpolation of shadow/specularity components. Image and Vision Computing 30, 4-5 (2012), 317–331.

[DS11] Diaz M., Sturm P.: Radiometric calibration using photo collections. In Proc. of IEEE International Conference on Computational Photography (ICCP) (2011).

[DvGNK99] Dana K., van Ginneken B., Nayar S., Koenderink J.: Reflectance and texture of real world surfaces. ACM Transactions on Graphics 18, 1 (1999), 1–34.

[EVC08] Esteban C., Vogiatzis G., Cipolla R.: Multiview photometric stereo. IEEE Transactions on Pattern Analysis and Machine Intelligence 30, 3 (2008), 548–554.

[FP10] Furukawa Y., Ponce J.: Accurate, dense, and robust multiview stereopsis. IEEE Transactions on Pattern Analysis and Machine Intelligence 32, 8 (2010), 1362–1376.

[FP12] Favaro P., Papadhimitri T.: A closed-form solution to uncalibrated photometric stereo via diffuse maxima. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2012), pp. 821–828.

[GCHS10] Goldman D., Curless B., Hertzmann A., Seitz S.: Shape and spatially-varying BRDFs from photometric stereo. IEEE Transactions on Pattern Analysis and Machine Intelligence 32, 6 (2010), 1060–1071.

[Geo03] Georghiades A.: Incorporating the Torrance and Sparrow model of reflectance in uncalibrated photometric stereo. In Proc. of International Conference on Computer Vision (ICCV) (2003), pp. 816–823.

[GN04] Grossberg M., Nayar S.: Modeling the space of camera response functions. IEEE Transactions on Pattern Analysis and Machine Intelligence 26, 10 (2004), 1272–1282.

[GSC∗07] Goesele M., Snavely N., Curless B., Hoppe H., Seitz S.: Multi-view stereo for community photo collections. In Proc. of International Conference on Computer Vision (ICCV) (2007).

[Hay94] Hayakawa H.: Photometric stereo under a light source with arbitrary motion. Journal of the Optical Society of America 11, 11 (1994), 3079–3089.

[HLHZ08] Holroyd M., Lawrence J., Humphreys G., Zickler T.: A photometric approach for estimating normals and tangents. In Proc. of SIGGRAPH Asia (ACM Transactions on Graphics) (2008), pp. 133–141.

[HMI10] Higo T., Matsushita Y., Ikeuchi K.: Consensus photometric stereo. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2010), pp. 1157–1164.

[HMJI09] Higo T., Matsushita Y., Joshi N., Ikeuchi K.: A hand-held photometric stereo camera for 3-D modeling. In Proc. of International Conference on Computer Vision (ICCV) (2009), pp. 1234–1241.

[HNI08] Hara K., Nishino K., Ikeuchi K.: Mixture of spherical distributions for single-view relighting. IEEE Transactions on Pattern Analysis and Machine Intelligence 30, 1 (2008), 25–35.

[Hor70] Horn B.: Shape from shading: A method for obtaining the shape of a smooth opaque object from one view. Ph.D. thesis, Massachusetts Institute of Technology (1970).

[HP05] Hirakawa K., Parks T.: Image denoising for signal-dependent noise. In Proc. of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2005), pp. 29–32.

[IWMA12] Ikehata S., Wipf D., Matsushita Y., Aizawa K.: Robust photometric stereo using sparse regression. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2012), pp. 318–325.

[JA11] Johnson M., Adelson E.: Shape estimation in natural illumination. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2011), pp. 2553–2560.

[JCRA11] Johnson M., Cole F., Raj A., Adelson E.: Microgeometry capture using an elastomeric sensor. In Proc. of SIGGRAPH (ACM Transactions on Graphics) (2011), pp. 46:1–46:8.

[JK07] Joshi N., Kriegman D.: Shape from varying illumination and viewpoint. In Proc. of International Conference on Computer Vision (ICCV) (2007).

[KADB11] Klasing K., Althoff D., Wollherr D., Buss M.: Comparison of surface normal estimation methods for range sensing applications. In Proc. of IEEE International Conference on Robotics and Automation (ICRA) (2011), pp. 3206–3211.

[KBH06] Kazhdan M., Bolitho M., Hoppe H.: Poisson surface reconstruction. In Proc. of Eurographics Symposium on Geometry Processing (SGP) (2006), pp. 61–70.

[KBP08] Kukelova Z., Bujnak M., Pajdla T.: Automatic generator of minimal problem solvers. In Proc. of European Conference on Computer Vision (ECCV) (2008), pp. 302–315.

[KC95] Kay G., Caelli T.: Estimating the parameters of an illumination model using photometric stereo. Graphical Models and Image Processing 57, 5 (1995), 365–388.

[KFP08] Kim S., Frahm J., Pollefeys M.: Radiometric calibration with illumination change for outdoor scene analysis. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2008).

[Kov05] Kovesi P.: Shapelets correlated with surface normals produce surfaces. In Proc. of International Conference on Computer Vision (ICCV) (2005), pp. 994–1001.

[LBAD∗06] Lawrence J., Ben-Artzi A., DeCoro C., Matusik W., Pfister H., Ramamoorthi R., Rusinkiewicz S.: Inverse shade trees for non-parametric material representation and editing. In Proc. of SIGGRAPH (ACM Transactions on Graphics) (2006), pp. 735–745.

[LFTG97] Lafortune E., Foo S., Torrance K., Greenberg D.: Non-linear approximation of reflectance functions. In Proc. of SIGGRAPH (1997).

[LGYS04] Lin S., Gu J., Yamazaki S., Shum H.: Radiometric calibration from a single image. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2004), pp. 938–945.

[LHYK05] Lim J., Ho J., Yang M., Kriegman D.: Passive photometric stereo from motion. In Proc. of International Conference on Computer Vision (ICCV) (2005), pp. 1635–1642.

[LMS∗13] Lu F., Matsushita Y., Sato I., Okabe T., Sato Y.: Uncalibrated photometric stereo for unknown isotropic reflectances. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2013).

[Low04] Lowe D.: Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60, 2 (2004), 91–110.

[MGW01] Malzbender T., Gelb D., Wolters H.: Polynomial texture maps. In Proc. of SIGGRAPH (2001), pp. 519–528.

[MHI10] Miyazaki D., Hara K., Ikeuchi K.: Median photometric stereo as applied to the Segonko tumulus and museum objects. International Journal of Computer Vision 86, 2 (2010), 229–242.

[MIS07] Mukaigawa Y., Ishii Y., Shakunaga T.: Analysis of photometric factors based on photometric linearization. Journal of the Optical Society of America 24, 10 (2007), 3326–3334.

[MN99] Mitsunaga T., Nayar S. K.: Radiometric self-calibration. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (1999), pp. 374–380.

[MOS11] Mongkulmann W., Okabe T., Sato Y.: Photometric stereo with auto-radiometric calibration. In Proc. of International Conference on Computer Vision (ICCV) workshops (2011).

[MPBM03] Matusik W., Pfister H., Brand M., McMillan L.: A data-driven reflectance model. In Proc. of SIGGRAPH (ACM Transactions on Graphics) (2003), pp. 759–769.

[MWL∗99] Marschner S., Westin S., Lafortune E., Torrance K., Greenberg D.: Image-based BRDF measurement including human skin. In Proc. of Eurographics Symposium on Rendering (1999), pp. 139–152.

[NDM05] Ngan A., Durand F., Matusik W.: Experimental analysis of BRDF models. In Proc. of Eurographics Symposium on Rendering (2005), pp. 117–126.

[NIK91] Nayar S., Ikeuchi K., Kanade T.: Surface reflection: Physical and geometrical perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence 13, 7 (1991), 611–634.

[Nis09] Nishino K.: Directional statistics BRDF model. In Proc. of International Conference on Computer Vision (ICCV) (2009), pp. 476–483.

[NM65] Nelder J., Mead R.: A simplex method for function minimization. Computer Journal 7, 4 (1965), 308–313.

[NRDR05] Nehab D., Rusinkiewicz S., Davis J., Ramamoorthi R.: Efficiently combining positions and normals for precise 3D geometry. In Proc. of SIGGRAPH (ACM Transactions on Graphics) (2005), pp. 536–543.

[OD12] Okatani T., Deguchi K.: Optimal integration of photometric and geometric surface measurements using inaccurate reflectance/illumination knowledge. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2012), pp. 254–261.

[ON93] Oren M., Nayar S.: Diffuse reflectance from rough surfaces. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (1993), pp. 763–764.

[ON12] Oxholm G., Nishino K.: Shape and reflectance from natural illumination. In Proc. of European Conference on Computer Vision (ECCV) (2012), pp. 528–541.

[OSS09] Okabe T., Sato I., Sato Y.: Attached shadow coding: estimating surface normals from shadows under unknown reflectance and lighting conditions. In Proc. of International Conference on Computer Vision (ICCV) (2009), pp. 1693–1700.

[Pho75] Phong B.: Illumination for computer generated pictures. In Proc. of SIGGRAPH (1975), pp. 117–126.

[RH02] Ramamoorthi R., Hanrahan P.: Frequency space environment map rendering. In Proc. of SIGGRAPH (ACM Transactions on Graphics) (2002), pp. 517–526.

[RSSF02] Reinhard E., Stark M., Shirley P., Ferwerda J.: Photographic tone reproduction for digital images. In Proc. of SIGGRAPH (ACM Transactions on Graphics) (2002), pp. 267–276.

[Rus98] Rusinkiewicz S.: A new change of variables for efficient BRDF representation. In Rendering Techniques (Proc. of Eurographics Workshop on Rendering) (1998), pp. 11–22.

[RVZ08] Romeiro F., Vasilyev Y., Zickler T.: Passive reflectometry. In Proc. of European Conference on Computer Vision (ECCV) (2008), pp. 859–872.

[RZ10] Romeiro F., Zickler T.: Blind reflectometry. In Proc. of European Conference on Computer Vision (ECCV) (2010), pp. 45–58.

[SCD∗06] Seitz S., Curless B., Diebel J., Scharstein D., Szeliski R.: A comparison and evaluation of multi-view stereo reconstruction algorithms. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2006), pp. 519–528.

[Sch94] Schlick C.: An inexpensive BRDF model for physically-based rendering. Computer Graphics Forum 13, 3 (1994), 233–246.

[Sha85] Shafer S.: Using color to separate reflection components. Color Research 10, 4 (1985), 210–218.

[Sha92] Shashua A.: Geometry and photometry in 3D visual recognition. Ph.D. thesis, Massachusetts Institute of Technology (1992).

[Sho85] Shoemake K.: Animating rotation with quaternion curves. In Proc. of SIGGRAPH (1985).

[SI96] Solomon F., Ikeuchi K.: Extracting the shape and roughness of specular lobe objects using four light photometric stereo. IEEE Transactions on Pattern Analysis and Machine Intelligence 18, 4 (1996), 449–454.

[Sil80] Silver W.: Determining shape and reflectance using multiple images. Master’s thesis, Massachusetts Institute of Technology (1980).

[SMW∗10] Shi B., Matsushita Y., Wei Y., Xu C., Tan P.: Self-calibrating photometric stereo. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2010), pp. 1118–1125.

[SOYS07] Sato I., Okabe T., Yu Q., Sato Y.: Shape reconstruction based on similarity in radiance changes under varying illumination. In Proc. of International Conference on Computer Vision (ICCV) (2007).

[SSS06] Snavely N., Seitz S., Szeliski R.: Photo tourism: exploring photo collections in 3D. In Proc. of SIGGRAPH (ACM Transactions on Graphics) (2006), pp. 835–846.

[ST09] Shen L., Tan P.: Photometric stereo and weather estimation using internet images. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2009), pp. 1850–1857.

[STMI12a] Shi B., Tan P., Matsushita Y., Ikeuchi K.: A biquadratic reflectance model for radiometric image analysis. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2012), pp. 230–237.

[STMI12b] Shi B., Tan P., Matsushita Y., Ikeuchi K.: Elevation angle from reflectance monotonicity: Photometric stereo for general isotropic reflectances. In Proc. of European Conference on Computer Vision (ECCV) (2012), pp. 455–468.

[TLQ08] Tan P., Lin S., Quan L.: Subpixel photometric stereo. IEEE Transactions on Pattern Analysis and Machine Intelligence 30, 8 (2008), 1460–1471.

[TQZ11] Tan P., Quan L., Zickler T.: The geometry of reflectance symmetries. IEEE Transactions on Pattern Analysis and Machine Intelligence 33, 12 (2011), 2506–2520.

[TS67] Torrance K., Sparrow E.: Theory for off-specular reflection from roughened surfaces. Journal of the Optical Society of America 57, 9 (1967), 1105–1114.

[War92] Ward G.: Measuring and modeling anisotropic reflection. Computer Graphics 26, 2 (1992), 265–272.

[WGS∗10] Wu L., Ganesh A., Shi B., Matsushita Y., Wang Y., Ma Y.: Robust photometric stereo via low-rank matrix completion and recovery. In Proc. of Asian Conference on Computer Vision (ACCV) (2010), pp. 703–717.

[Woo80] Woodham R.: Photometric method for determining surface orientation from multiple images. Optical Engineering 19, 1 (1980), 139–144.

[WT10] Wu T., Tang C.: Photometric stereo via expectation maximization. IEEE Transactions on Pattern Analysis and Machine Intelligence 32, 3 (2010), 546–560.

[WWMT11] Wu C., Wilburn B., Matsushita Y., Theobalt C.: High-quality shape from multi-view stereo and shading under general illumination. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2011), pp. 969–976.

[YDMH99] Yu Y., Debevec P., Malik J., Hawkins T.: Inverse global illumination: Recovering reflectance models of real scenes from photographs. In Proc. of SIGGRAPH (1999), pp. 215–224.

[YSEB99] Yuille A., Snow D., Epstein R., Belhumeur P.: Determining generative models of objects under varying illumination: Shape and albedo from multiple images using SVD and integrability. International Journal of Computer Vision 35, 3 (1999), 203–222.

[YYT∗13] Yu L., Yeung S., Tai Y., Terzopoulos D., Chan T.: Outdoor photometric stereo. In Proc. of IEEE International Conference on Computational Photography (ICCP) (2013).

[YYTL13] Yu L., Yeung S., Tai Y., Lin S.: Depth-assisted shape-from-shading. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2013).

[ZCHS03] Zhang L., Curless B., Hertzmann A., Seitz S.: Shape and motion under varying illumination: Unifying structure from motion, photometric stereo, and multi-view stereo. In Proc. of International Conference on Computer Vision (ICCV) (2003), p. 618.

[ZLS∗12] Zheng Y., Liu G., Sugimoto S., Yan S., Okutomi M.: Practical low-rank matrix approximation under robust L1-norm. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2012), pp. 1410–1417.

[ZREB06] Zickler T., Ramamoorthi R., Enrique S., Belhumeur P.: Reflectance sharing: Predicting appearance from a sparse set of images. IEEE Transactions on Pattern Analysis and Machine Intelligence 28, 8 (2006), 1287–1302.

[ZWT13] Zhou Z., Wu Z., Tan P.: Multi-view photometric stereo with spatially varying isotropic materials. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2013).

[ZYY∗12] Zhang Q., Ye M., Yang R., Matsushita Y., Wilburn B., Yu H.: Edge-preserving photometric stereo via depth fusion. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2012), pp. 2472–2479.
