TRANSCRIPT
A Real-Time RGB-D Registration and Mapping Approach by Heuristically Switching Between
Photometric And Geometric Information
The 17th International Conference on Information Fusion (Fusion 2014)
Khalid Yousif, Alireza Bab-Hadiashar, Reza Hoseinnezhad
School of Aerospace, Mechanical, and Manufacturing Engineering
RMIT University
July, 2014
RMIT University© SAMME 1
1
Introduction & Literature Review
Dense 3D SLAM
• SLAM – simultaneous estimation of the camera pose and construction of a map of an unknown environment
• 3D maps are very informative:
• Allow improved path planning and navigation methods
• Provide enhanced functionality for robots
• Augmented reality applications
RGB-D Mapping – Literature Review

Method | Authors
RANSAC + ICP refinement + global optimization | Henry et al. 2010; Endres et al. 2012; Du et al. 2011
Optical flow RGB-D SLAM | Audras et al. 2011
Dense ICP | Newcombe et al. 2011; Whelan et al. 2012
RGB-D SLAM + monocular SLAM combination | Hu et al. 2012
RGB-D SLAM in dynamic environments | Keller et al. 2013
Use of both photometric and geometric information | Kerl et al. 2013; Yousif et al. 2014
2
Methodology
Selection Between Photometric and Geometric Features
• Matching photometric features is ~5x faster than matching geometric features
• Photometric features are used as the default
• 3D (geometric) features are used if the number of photometric features is below a threshold
• We selected the threshold that provided the best balance between accuracy and efficiency
Fig. 1 ORB features. Fig. 2 Proposed method (IS3D).
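The switching heuristic above can be sketched as follows; the threshold value and the extractor callbacks are illustrative placeholders, not the values or functions used in the paper.

```python
# Heuristic switch between photometric (ORB) and geometric (IS3D) features.
# MIN_PHOTOMETRIC_FEATURES is a hypothetical threshold, not the paper's value.
MIN_PHOTOMETRIC_FEATURES = 100

def select_features(frame, extract_orb, extract_is3d):
    """Prefer photometric features (cheaper to match, ~5x faster);
    fall back to geometric 3D features when too few are found."""
    orb = extract_orb(frame)
    if len(orb) >= MIN_PHOTOMETRIC_FEATURES:
        return "photometric", orb
    return "geometric", extract_is3d(frame)
```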
Photometric Feature Extraction
• Extract ORB features from sequential frames.
• ORB features are based on FAST features.
• ORB is 2x faster than SIFT and achieves similar accuracy.
• 3D projection using the standard pinhole camera model:
    X = (u − c_x) · Z / f_x,  Y = (v − c_y) · Z / f_y
• (u, v): image coordinates of the visual feature
• (X, Y, Z): projected 3D coordinates
• f_x, f_y: the focal lengths
• (c_x, c_y): the 2D coordinates of the camera optical center
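The pinhole back-projection is straightforward to implement; the intrinsics used below are typical Kinect-style values, assumed for illustration only.

```python
import numpy as np

def backproject(u, v, depth, fx, fy, cx, cy):
    """Project a pixel (u, v) with measured depth Z into a 3D point
    using the standard pinhole camera model."""
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])
```

A pixel at the optical center maps straight down the optical axis, which is a quick sanity check for the intrinsics.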
Pre-processing the Point Cloud
• Remove points with no information (NaN).
• Remove points further than 5 metres away.
• Uniformly downsample the point cloud.
• Assign a variable search radius to obtain around 4000 points.
Normal vector estimation:
• Fit a plane to a point and its neighbours using a least-squares (LS) method.
• Use a large search radius.
Fig. 3 Normal estimation using a small search radius (left) and a large search radius (right).
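A minimal sketch of the pre-processing and LS normal estimation steps, using NumPy only; the paper's pipeline presumably uses a point-cloud library, and the fixed downsampling step here stands in for the variable-radius sampling.

```python
import numpy as np

def preprocess(points, max_range=5.0, step=4):
    """Drop NaN points, cut everything beyond max_range metres,
    then uniformly downsample by keeping every `step`-th point."""
    pts = points[~np.isnan(points).any(axis=1)]
    pts = pts[np.linalg.norm(pts, axis=1) <= max_range]
    return pts[::step]

def estimate_normal(neighbourhood):
    """LS plane fit: the normal is the right singular vector of the
    centred neighbourhood with the smallest singular value."""
    centred = neighbourhood - neighbourhood.mean(axis=0)
    _, _, vt = np.linalg.svd(centred)
    return vt[-1]
```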
Informatively Sampled Geometric 3D Features
• Novel geometric feature extraction method (IS3D).
• Informative sampling – choose the best points for registration.
• A robust estimator segments the points into orientation groups (based on their normal vectors).
• Selected keypoints are those not part of any dominant normal orientation group.
Fig. 4 Uniformly sampled point cloud. Fig. 5 Sampled points using IS3D.
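One simplified reading of the informative sampling idea, grouping normals with a plain azimuth histogram rather than the robust MSSE-based segmentation the method actually uses; the bin count and dominance fraction are made-up parameters.

```python
import numpy as np

def is3d_sample(points, normals, n_bins=8, dominant_frac=0.2):
    """Keep points whose normal does NOT fall in a dominant orientation group.
    Sketch only: orientation groups come from quantising the azimuth of each
    normal, not from the robust estimator used in the paper."""
    azimuth = np.arctan2(normals[:, 1], normals[:, 0])
    bins = ((azimuth + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
    counts = np.bincount(bins, minlength=n_bins)
    dominant = counts > dominant_frac * len(points)
    return points[~dominant[bins]]
```

Points on large planar regions share one dominant normal direction and are discarded; the survivors carry the orientation variety that constrains registration.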
Informatively Sampled Geometric 3D Features (cont.)
• Angle between normal vectors n_i and n_j:
    φ = cos⁻¹(n_i · n_j)
• MSSE constraint:
    |r_{k+1}| > T · σ_k,  where σ_k² = (1 / (k − d)) Σ_{i=1}^{k} r_i²
• k is the number of points included
• d is the model dimension
• T is a constant factor (T = 2.5 is usually used to indicate an inclusion of around 99% of inliers based on a normal distribution)
Fig. 6 MSSE segmentation.
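The MSSE constraint can be applied by growing the inlier set over sorted residuals until the constraint fires; `k0`, the minimum inlier count used to stabilise the scale estimate, is an assumed parameter.

```python
import numpy as np

def msse_inliers(residuals, d=3, T=2.5, k0=10):
    """Grow the inlier set over sorted absolute residuals until the MSSE
    constraint |r_{k+1}| > T * sigma_k fires; returns the inlier count k."""
    r = np.sort(np.abs(residuals))
    for k in range(k0, len(r)):
        sigma_k = np.sqrt(np.sum(r[:k] ** 2) / (k - d))
        if r[k] > T * sigma_k:
            return k
    return len(r)
```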
Feature Matching
• Photometric features are assigned BRIEF descriptors.
• Geometric features are assigned SHOT descriptors.
• BRIEF matching: Hamming distance.
• SHOT matching: nearest neighbour in descriptor space.
• Mutual consistency check: only pairs of corresponding points that are mutually matched to each other are considered as the initial matches.
Fig. 7 Initial matching.
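The mutual consistency check can be sketched as follows, using brute-force L2 nearest neighbours for illustration (BRIEF descriptors would use Hamming distance instead):

```python
import numpy as np

def mutual_matches(desc_a, desc_b):
    """Nearest-neighbour matching with a mutual consistency check:
    keep (i, j) only if j is the NN of i in B AND i is the NN of j in A."""
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    a_to_b = dists.argmin(axis=1)   # best match in B for each row of A
    b_to_a = dists.argmin(axis=0)   # best match in A for each row of B
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]
```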
Outlier Removal and Transformation Estimation using MSSE
• Least K-th order statistics (LKS) model fitting, based on rank ordering of the residuals.
• Relaxes the fixed error threshold assumption used in RANSAC.
• The cost function to be minimized is the K-th smallest squared residual:
    F_LKS = r²_(K)
• Modified Selective Statistical Estimator (MSSE) for estimating the scale:
    σ_K² = (1 / (K − d)) Σ_{i=1}^{K} r_i²
• Works well with multiple structures.
• Find the 6DOF transformation T = [R t; 0 1] using the detected inliers.
Fig. 8 Line segmentation using MSSE.
Fig. 9 Initial matches (top), good matches (bottom).
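Given the detected inlier correspondences, the 6DOF transformation has the standard closed-form Kabsch/Umeyama SVD solution; this is a generic solver, not necessarily the authors' exact estimator.

```python
import numpy as np

def rigid_transform(src, dst):
    """Closed-form least-squares 6DOF alignment (Kabsch/Umeyama):
    returns (R, t) such that dst ~= R @ src + t.
    src, dst: (N, 3) arrays of inlier correspondences."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)          # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # reflection guard: force det(R) = +1
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = mu_d - R @ mu_s
    return R, t
```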
Global Pose Estimation and Mapping
• The previous steps estimate the transformation between two consecutive frames.
• Concatenate all the transformations to obtain a global pose:
    T_global,n = T_1,2 · T_2,3 · … · T_n−1,n
• The map is obtained by transforming the points from the current frame to the global reference frame using:
    [X_g; 1] = [R t; 0 1] [X_c; 1]
Fig. 10 Example of the constructed map of an office scene using the proposed registration method.
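The concatenation and mapping steps above translate directly into homogeneous-matrix code:

```python
import numpy as np

def to_homogeneous(R, t):
    """Pack (R, t) into a 4x4 homogeneous transform [R t; 0 1]."""
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

def global_pose(relative_transforms):
    """Concatenate frame-to-frame 4x4 transforms into a global pose."""
    T = np.eye(4)
    for T_rel in relative_transforms:
        T = T @ T_rel
    return T

def map_points(points, T_global):
    """Transform current-frame points into the global reference frame."""
    homog = np.hstack([points, np.ones((len(points), 1))])
    return (T_global @ homog.T).T[:, :3]
```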
3
Experimental Results
Evaluation
• We used a publicly available RGB-D benchmark dataset.
• The evaluation metric is the absolute trajectory error (ATE) between a sequence of estimated camera poses P_1, …, P_n and the ground truth trajectory Q_1, …, Q_n.
Fig. Visualization of the absolute trajectory error (ATE) using: (a) the 'freiburg3_nostructure_texture_near_withloop' sequence, (b) 'freiburg3_structure_texture_far'.
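The ATE metric can be computed as the RMSE of translational differences; this minimal version assumes the two trajectories are already time-associated and aligned (the benchmark tooling also performs a rigid alignment first).

```python
import numpy as np

def ate_rmse(estimated, ground_truth):
    """Absolute trajectory error: RMSE over translational differences.
    estimated, ground_truth: (N, 3) position arrays, time-associated
    and already expressed in a common frame."""
    err = estimated - ground_truth
    return np.sqrt((np.linalg.norm(err, axis=1) ** 2).mean())
```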
Evaluation (cont.)
• Texture vs. structure
• Comparison with other methods
• Computational performance
4
Conclusion
Conclusion and Future Work
• We presented a method that uses both depth and visual information.
• Works well in low-structure as well as low-texture scenes.
• The method automatically switches between photometric and geometric features.
• Novel informative sampling method (IS3D) that selects only points carrying important information.
• Our method was evaluated using a publicly available RGB-D benchmark.

Future work:
• Achieving global consistency by employing pose graph optimization or bundle adjustment.
• Mapping in dynamic environments: segmenting multiple motions and using camera motion only for registration.
• Possibly tracking the moving objects that are in the camera's field of view.
Thank you
References
[1] J. Sturm, N. Engelhard, F. Endres, W. Burgard, and D. Cremers, "A benchmark for the evaluation of RGB-D SLAM systems," in 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2012, pp. 573–580.
[2] H. Durrant-Whyte, D. Rye, and E. Nebot, "Localization of autonomous guided vehicles," Robotics Research: International Symposium, vol. 7, pp. 613–625, 1996.
[3] S. Thrun, "Robotic mapping: A survey," Exploring Artificial Intelligence in the New Millennium, pp. 1–35, 2002.
[4] S. Izadi, D. Kim, O. Hilliges, D. Molyneaux, R. Newcombe, P. Kohli, J. Shotton, S. Hodges, D. Freeman, A. Davison, et al., "KinectFusion: Real-time 3D reconstruction and interaction using a moving depth camera," in Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology, 2011, pp. 559–568.
[5] P. Henry, M. Krainin, E. Herbst, X. Ren, and D. Fox, "RGB-D mapping: Using Kinect-style depth cameras for dense 3D modeling of indoor environments," The International Journal of Robotics Research, vol. 31, no. 5, pp. 647–663, 2012.
[6] F. Endres, J. Hess, N. Engelhard, J. Sturm, D. Cremers, and W. Burgard, "An evaluation of the RGB-D SLAM system," in 2012 IEEE International Conference on Robotics and Automation (ICRA), 2012, pp. 1691–1696.
[7] A. Bab-Hadiashar and D. Suter, "Robust segmentation of visual data using ranked unbiased scale estimate," Robotica, vol. 17, no. 6, pp. 649–660, 1999.
[8] E. Rosten and T. Drummond, "Machine learning for high-speed corner detection," Computer Vision–ECCV 2006, pp. 430–443, 2006.
[9] P. Besl and N. McKay, "A method for registration of 3-D shapes," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 2, pp. 239–256, 1992.
[10] M. Lourakis and A. Argyros, "SBA: A software package for generic sparse bundle adjustment," ACM Transactions on Mathematical Software (TOMS), vol. 36, no. 1, p. 2, 2009.
[11] D. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.
[12] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: Speeded up robust features," Computer Vision–ECCV 2006, pp. 404–417, 2006.
[13] E. Rublee, V. Rabaud, K. Konolige, and G. Bradski, "ORB: An efficient alternative to SIFT or SURF," in 2011 IEEE International Conference on Computer Vision (ICCV), 2011, pp. 2564–2571.
[14] H. Du, P. Henry, X. Ren, M. Cheng, D. Goldman, S. Seitz, and D. Fox, "Interactive 3D modeling of indoor environments with a consumer depth camera," in Proceedings of the 13th International Conference on Ubiquitous Computing, 2011, pp. 75–84.
[15] C. Audras, A. Comport, M. Meilland, and P. Rives, "Real-time dense appearance-based SLAM for RGB-D sensors," in Australasian Conference on Robotics and Automation, 2011.
[16] R. Newcombe, S. Izadi, O. Hilliges, D. Molyneaux, D. Kim, A. Davison, P. Kohli, J. Shotton, S. Hodges, and A. Fitzgibbon, "KinectFusion: Real-time dense surface mapping and tracking," in 2011 10th IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 2011, pp. 127–136.
[17] T. Whelan, M. Kaess, M. Fallon, H. Johannsson, J. Leonard, and J. McDonald, "Kintinuous: Spatially extended KinectFusion," 2012.
[18] A. Bachrach, S. Prentice, R. He, P. Henry, A. Huang, M. Krainin, D. Maturana, D. Fox, and N. Roy, "Estimation, planning, and mapping for autonomous flight using an RGB-D camera in GPS-denied environments," The International Journal of Robotics Research, vol. 31, no. 11, pp. 1320–1343, 2012.
[19] G. Hu, S. Huang, L. Zhao, A. Alempijevic, and G. Dissanayake, "A robust RGB-D SLAM algorithm," in 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2012, pp. 1714–1719.
[20] K. Yousif, A. Bab-Hadiashar, and R. Hoseinnezhad, "3D registration in dark environments using RGB-D cameras," in 2013 International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2013, pp. 1–8.
[21] K. Yousif, A. Bab-Hadiashar, and R. Hoseinnezhad, "Real-time RGB-D registration and mapping in texture-less environments using ranked order statistics," in 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), in press.
References (continued)
[22] C. Kerl, J. Sturm, and D. Cremers, "Dense visual SLAM for RGB-D cameras," in 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013, pp. 2100–2106.
[23] L. Douadi, M.-J. Aldon, and A. Crosnier, "Pair-wise registration of 3D/color data sets with ICP," in 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2006, pp. 663–668.
[24] S. Druon, M.-J. Aldon, and A. Crosnier, "Color constrained ICP for registration of large unstructured 3D color data sets," in 2006 IEEE International Conference on Information Acquisition, 2006, pp. 249–255.
[25] F. Tombari, S. Salti, and L. Di Stefano, "Unique signatures of histograms for local surface description," in Computer Vision–ECCV 2010, Springer, 2010, pp. 356–369.
[26] M. Calonder, V. Lepetit, C. Strecha, and P. Fua, "BRIEF: Binary robust independent elementary features," Computer Vision–ECCV 2010, pp. 778–792, 2010.
[27] R. B. Rusu, "Semantic 3D object maps for everyday manipulation in human living environments," KI – Künstliche Intelligenz, vol. 24, no. 4, pp. 345–348, 2010.
[28] F. Fraundorfer and D. Scaramuzza, "Visual odometry: Part II: Matching, robustness, optimization, and applications," IEEE Robotics & Automation Magazine, vol. 19, no. 2, pp. 78–90, 2012.
[29] M. Quigley, K. Conley, B. P. Gerkey, J. Faust, T. Foote, J. Leibs, R. Wheeler, and A. Y. Ng, "ROS: An open-source robot operating system," in ICRA Workshop on Open Source Software, 2009.