

Colored Point Cloud Registration Revisited
Supplementary Material

Jaesik Park    Qian-Yi Zhou    Vladlen Koltun
Intel Labs

A. RGB-D Image Alignment

Section 3 introduced a joint photometric and geometric objective for RGB-D image alignment. This appendix presents an algorithm that optimizes the introduced objective. This algorithm is used in the reconstruction system presented in Section 5 of the paper.

A.1. Objective

The joint photometric and geometric objective for RGB-D image alignment is a nonlinear least-squares objective

    E(\mathbf{T}) = (1 - \sigma) \sum_x \big( r_I^x(\mathbf{T}) \big)^2 + \sigma \sum_x \big( r_D^x(\mathbf{T}) \big)^2,  (A1)

where r_I^x and r_D^x are the photometric and geometric residuals, respectively:

    r_I^x(\mathbf{T}) = I_i(g_{uv}(s(h(x, D_j(x)), \mathbf{T}))) - I_j(x),  (A2)
    r_D^x(\mathbf{T}) = D_i(g_{uv}(s(h(x, D_j(x)), \mathbf{T}))) - g_d(s(h(x, D_j(x)), \mathbf{T})).  (A3)

The definitions of I_i, D_i, s, h, g are given in Section 3.
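The weighted combination in (A1) is straightforward to evaluate once the per-pixel residuals are available. Below is a minimal NumPy sketch; the function name `joint_objective` and the toy residual arrays are illustrative stand-ins for the actual per-pixel terms (A2) and (A3), not code from the paper:

```python
import numpy as np

def joint_objective(r_I, r_D, sigma):
    """Objective (A1): a convex combination of the summed squared
    photometric residuals r_I and geometric residuals r_D."""
    return (1.0 - sigma) * np.sum(r_I ** 2) + sigma * np.sum(r_D ** 2)

# Toy residual arrays standing in for the per-pixel terms (A2), (A3).
E = joint_objective(np.array([0.1, -0.2]), np.array([0.05, 0.0]), sigma=0.5)
```

Setting sigma = 0 reduces (A1) to a purely photometric objective and sigma = 1 to a purely geometric one, which is what makes the balance parameter of Appendix B meaningful.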

A.2. Optimization

As in Section 4.3, this objective is minimized using the Gauss-Newton method. Specifically, we start from an initial transformation \mathbf{T}^0 and perform optimization iteratively. In each iteration, we locally parameterize \mathbf{T} with a 6-vector \xi, evaluate the residual r and Jacobian J_r at \mathbf{T}^k, solve the linear system in (21) to compute \xi, and use \xi to update \mathbf{T}. To compute the Jacobian, we need the partial derivatives of the residuals. They are

    \nabla r_I^x(\mathbf{T}) = \frac{\partial}{\partial \xi_i} (I_i \circ g_{uv} \circ s)  (A4)
                             = \nabla I_i(g_{uv}) \, J_{g_{uv}}(s) \, J_s(\xi),  (A5)
    \nabla r_D^x(\mathbf{T}) = \frac{\partial}{\partial \xi_i} (D_i \circ g_{uv} \circ s - g_d \circ s)  (A6)
                             = \nabla D_i(g_{uv}) \, J_{g_{uv}}(s) \, J_s(\xi) - J_{g_d}(s) \, J_s(\xi).  (A7)

Steps (A5) and (A7) apply the chain rule. \nabla I_i and \nabla D_i are the gradients of I_i and D_i, respectively. They are computed by applying a normalized Scharr kernel over I_i and D_i. J_{g_{uv}} and J_{g_d} are the Jacobian matrices of g_{uv} and g_d, derived from (5). J_s is the Jacobian of s with respect to \xi, derived from (4) and (20).
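The per-iteration update described above can be sketched as follows. This is a hedged illustration, not the paper's implementation: `hat` maps a 6-vector to a 4x4 twist matrix, the ordering of rotational and translational components in \xi is an assumption, and the update uses a first-order approximation of the exponential map rather than the exact parameterization of (20):

```python
import numpy as np

def hat(xi):
    """Map a 6-vector (w, t) to a 4x4 twist matrix (rotation components
    first, translation last -- this ordering is an assumption)."""
    wx, wy, wz, tx, ty, tz = xi
    return np.array([[0.0, -wz,  wy, tx],
                     [ wz, 0.0, -wx, ty],
                     [-wy,  wx, 0.0, tz],
                     [0.0, 0.0, 0.0, 0.0]])

def gauss_newton_step(r, J, T):
    """One Gauss-Newton iteration: solve the normal equations
    (J^T J) xi = -J^T r, then apply a first-order exponential update."""
    xi = np.linalg.solve(J.T @ J, -(J.T @ r))
    return (np.eye(4) + hat(xi)) @ T, xi
```

With the residuals r and the Jacobian J_r stacked from (A5) and (A7), repeating this step from \mathbf{T}^0 until \xi becomes negligible yields the converged transformation.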

A.3. Correspondence pruning

Equation (2) constructs a correspondence from pixel x in image (I_j, D_j) to pixel x' in image (I_i, D_i). Since the two images are viewed from different perspectives, x' can be occluded in image (I_i, D_i). In this case the correspondence is invalid and can hinder the optimization. We compare D_i(x') and g_d(s(h(x, D_j(x)), \mathbf{T})). If x' is occluded, the two depth values are far apart. We use this criterion to create an image mask M that prunes invalid correspondences:

    M = \{\, x \mid x \in (I_j, D_j) \text{ and } |r_D^x(\mathbf{T}^k)| < \delta \,\}.  (A8)

r_D^x is defined in (A3). \delta is an empirical threshold: 7 centimeters. In each iteration, we recompute M and optimize objective (A1) over correspondences that fall within M.
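The occlusion test of (A8) amounts to a simple thresholding of the geometric residuals. A minimal sketch, with a toy residual array and an illustrative function name:

```python
import numpy as np

def prune_correspondences(r_D, delta=0.07):
    """Occlusion test of (A8): keep a correspondence only if the
    geometric residual |r_D| is below delta (7 cm by default)."""
    return np.abs(r_D) < delta

# Toy residuals in meters; the second and fourth pixels would be
# treated as occluded and masked out of the next iteration.
mask = prune_correspondences(np.array([0.02, -0.10, 0.05, 0.30]))
```

Because the residuals depend on the current estimate \mathbf{T}^k, the mask is recomputed at every iteration rather than fixed once up front.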

A.4. Coarse-to-fine processing

As in Section 4.4, we apply the optimization in a coarse-to-fine manner: an RGB-D image pyramid is built and the optimization is performed from the coarsest pyramid level to the finest. This makes the algorithm more robust to bad initialization. Similar ideas have been exploited in [6, 4].

Algorithm 1 summarizes the RGB-D image alignment.
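The pyramid construction can be sketched as follows. This assumes simple 2x2 average-pooling for downsampling, which is one common choice; the paper does not specify the filter, so treat the details as an assumption:

```python
import numpy as np

def build_pyramid(img, levels=3):
    """Build an image pyramid by 2x2 average-pooling, returned
    coarsest-first so optimization starts at the coarsest level."""
    pyramid = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        h, w = pyramid[-1].shape
        cropped = pyramid[-1][:h // 2 * 2, :w // 2 * 2]
        pyramid.append(cropped.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3)))
    return pyramid[::-1]

# An 8x8 image yields levels of shape (2, 2), (4, 4), (8, 8); alignment
# runs once per level, warm-starting from the previous level's result.
levels = build_pyramid(np.ones((8, 8)))
```

Starting at the coarsest level smooths the objective, so a poor initial transformation is less likely to fall into a bad local minimum before the finer levels refine it.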

B. Parameter σ

The joint photometric and geometric optimization objectives (7) and (12) have a parameter σ that balances the photometric term and the geometric term. We find its optimal value by grid search.
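The grid search itself is a one-dimensional sweep: run the registration benchmark at each candidate σ and keep the value with the lowest average RMSE. A minimal sketch, where `evaluate_rmse` is a hypothetical stand-in for running registration on the perturbed test set and the toy error curve below is purely illustrative:

```python
import numpy as np

def grid_search_sigma(evaluate_rmse, sigmas=None):
    """Return the sigma with the lowest average RMSE over a 1-D grid.
    `evaluate_rmse` stands in for running the full benchmark."""
    if sigmas is None:
        sigmas = np.linspace(0.0, 1.0, 11)
    errors = [evaluate_rmse(s) for s in sigmas]
    return float(sigmas[int(np.argmin(errors))])

# Toy stand-in error curve with its minimum near sigma = 0.9.
best = grid_search_sigma(lambda s: (s - 0.9) ** 2)
```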

To find the optimal σ for colored point cloud registration, we take the experimental setup in Section 7.1 and perturb the true pose in the rotational components by 10°. The average RMSE as a function of σ is shown in Figure 1. This


[Figure 2 plots: error (m) vs. rotation perturbation (degrees, left) and translation perturbation (m, right). Top-row legend: Color 6D ICP, Color 4D ICP, Generalized 6D ICP, Ours. Bottom-row legend: Sparse ICP (point to point), Sparse ICP (point to plane), Generalized ICP, PCL ICP (point to point), PCL ICP (point to plane), Our geometric term, Ours.]

Figure 2. Evaluation of LiDAR point cloud alignment on the Cathedral scene [5]. The presented algorithm is compared to prior algorithms that use color (top) and to algorithms that do not (bottom). The algorithms are initialized with transformations that are perturbed away from the true pose in the rotational component (left) and the translational component (right). The plot shows the median RMSE at convergence (bold curve) and the 40%-60% range of RMSE across trials (shaded region). Lower is better.

[Figure 3 plots: F-score (%) vs. distance threshold τ (mm) for the panels Apartment, Bedroom, Boardroom, Lobby, Loft, and Mean. Legend: ElasticFusion, CZK, Ours.]

Figure 3. Accuracy of different reconstruction systems on the five scenes in the presented dataset, measured by F-score with varying distance thresholds τ (in millimeters). The last plot shows the mean F-score for each threshold over all five scenes in the dataset.

[Figure 4 row labels: Apartment, Bedroom, Boardroom, Lobby, Loft.]

Figure 4. Visualization of the dataset presented in Section 6 of the paper. Left: ground-truth model of each scene, acquired using an industrial laser scanner. For this visualization, the ground-truth point clouds were meshed using Poisson surface reconstruction, and the renderings exhibit meshing artifacts that are not present in the ground-truth point clouds themselves. Right: 360° panoramic images of the physical scenes that were scanned.

[Figure 5 rows: SceneNN scenes #14, #16, #21, #57, #73. Columns: ElasticFusion [7], CZK [1], Ours.]

Figure 5. Reconstructions of five more scenes from the SceneNN dataset [2]. This extends Figure 4 in the paper. Prior systems suffer from inaccurate surface alignment and produce broken geometry. Our system produces much cleaner results.