TRANSCRIPT
A Real-Time RGB-D Registration and Mapping Approach by Heuristically Switching Between
Photometric And Geometric Information
The 17th International Conference on Information Fusion (Fusion 2014)
Khalid Yousif, Alireza Bab-Hadiashar, Reza Hoseinnezhad
School of Aerospace, Mechanical, and Manufacturing Engineering
RMIT University
July, 2014
RMIT University© SAMME 1
1
Introduction & Literature Review
Dense 3D SLAM
• SLAM – simultaneous estimation of the camera pose and construction of a map of an unknown environment
• 3D maps are very informative:
• Allow improved path planning and navigation methods
• Provide enhanced functionality for robots
• Augmented reality applications
RGB-D Mapping – Literature Review

Method | Authors
RANSAC + ICP refinement + global optimization | Henry et al. 2010; Endres et al. 2012; Du et al. 2011
Optical flow RGB-D SLAM | Audras et al. 2011
Dense ICP | Newcombe et al. 2011; Whelan et al. 2012
RGB-D SLAM + monocular SLAM combination | Hu et al. 2012
RGB-D SLAM in dynamic environments | Keller et al. 2013
Use of both photometric and geometric information | Kerl et al. 2013; Yousif et al. 2014
2
Methodology
Selection Between Photometric and Geometric Features
• Matching photometric features is ~5x faster than matching geometric features
• Photometric features are used as the default
• 3D (geometric) features are used if the number of photometric features is below a threshold
• We selected the threshold that provided the best balance between accuracy and efficiency
Fig. 1 ORB features. Fig. 2 Proposed method (IS3D).
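The switching heuristic above can be sketched as follows; the threshold value and the extractor callbacks are illustrative placeholders, not the values or functions used in the paper.

```python
# Heuristic switch between photometric (ORB) and geometric (IS3D) features.
# MIN_PHOTOMETRIC_FEATURES is a hypothetical threshold, not the paper's value.
MIN_PHOTOMETRIC_FEATURES = 100

def select_features(frame, extract_orb, extract_is3d):
    """Prefer photometric features (cheaper to match, ~5x faster);
    fall back to geometric 3D features when too few are found."""
    orb = extract_orb(frame)
    if len(orb) >= MIN_PHOTOMETRIC_FEATURES:
        return "photometric", orb
    return "geometric", extract_is3d(frame)
```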
Photometric Feature Extraction
• Extract ORB features from sequential frames.
• ORB features are based on FAST features.
• ORB is 2x faster than SIFT and achieves similar accuracy.
• 3D projection using the standard pinhole camera model:
    X = (u − c_x) · Z / f_x,  Y = (v − c_y) · Z / f_y
• (u, v): image coordinates of the visual feature
• (X, Y, Z): projected 3D coordinates
• f_x, f_y: the focal lengths
• (c_x, c_y): the 2D coordinates of the camera optical center
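The pinhole back-projection is straightforward to implement; the intrinsics used below are typical Kinect-style values, assumed for illustration only.

```python
import numpy as np

def backproject(u, v, depth, fx, fy, cx, cy):
    """Project a pixel (u, v) with measured depth Z into a 3D point
    using the standard pinhole camera model."""
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])
```

A pixel at the optical center maps straight down the optical axis, which is a quick sanity check for the intrinsics.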
Pre-processing the Point Cloud
• Remove points with no information (NaN).
• Remove points further than 5 metres away.
• Uniformly downsample the point cloud.
• Assign a variable search radius to obtain around 4000 points.
Normal vector estimation:
• Fit a plane to a point and its neighbours using a least-squares (LS) method.
• Use a large search radius.
Fig. 3 Normal estimation using a small search radius (left) and a large search radius (right).
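A minimal sketch of the pre-processing and LS normal estimation steps, using NumPy only; the paper's pipeline presumably uses a point-cloud library, and the fixed downsampling step here stands in for the variable-radius sampling.

```python
import numpy as np

def preprocess(points, max_range=5.0, step=4):
    """Drop NaN points, cut everything beyond max_range metres,
    then uniformly downsample by keeping every `step`-th point."""
    pts = points[~np.isnan(points).any(axis=1)]
    pts = pts[np.linalg.norm(pts, axis=1) <= max_range]
    return pts[::step]

def estimate_normal(neighbourhood):
    """LS plane fit: the normal is the right singular vector of the
    centred neighbourhood with the smallest singular value."""
    centred = neighbourhood - neighbourhood.mean(axis=0)
    _, _, vt = np.linalg.svd(centred)
    return vt[-1]
```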
Informatively Sampled Geometric 3D Features
• Novel geometric feature extraction method (IS3D).
• Informative sampling – choose the best points for registration.
• A robust estimator segments the points into orientation groups (based on their normal vectors).
• Selected keypoints are those not part of any dominant normal orientation group.
Fig. 4 Uniformly sampled point cloud. Fig. 5 Sampled points using IS3D.
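One simplified reading of the informative sampling idea, grouping normals with a plain azimuth histogram rather than the robust MSSE-based segmentation the method actually uses; the bin count and dominance fraction are made-up parameters.

```python
import numpy as np

def is3d_sample(points, normals, n_bins=8, dominant_frac=0.2):
    """Keep points whose normal does NOT fall in a dominant orientation group.
    Sketch only: orientation groups come from quantising the azimuth of each
    normal, not from the robust estimator used in the paper."""
    azimuth = np.arctan2(normals[:, 1], normals[:, 0])
    bins = ((azimuth + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
    counts = np.bincount(bins, minlength=n_bins)
    dominant = counts > dominant_frac * len(points)
    return points[~dominant[bins]]
```

Points on large planar regions share one dominant normal direction and are discarded; the survivors carry the orientation variety that constrains registration.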
Informatively Sampled Geometric 3D Features (cont.)
• Angle between normal vectors n_i and n_j:
    φ = cos⁻¹(n_i · n_j)
• MSSE constraint:
    |r_{k+1}| > T · σ_k,  where σ_k² = (1 / (k − d)) Σ_{i=1}^{k} r_i²
• k is the number of points included
• d is the model dimension
• T is a constant factor (T = 2.5 is usually used to indicate an inclusion of around 99% of inliers based on a normal distribution)
Fig. 6 MSSE segmentation.
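The MSSE constraint can be applied by growing the inlier set over sorted residuals until the constraint fires; `k0`, the minimum inlier count used to stabilise the scale estimate, is an assumed parameter.

```python
import numpy as np

def msse_inliers(residuals, d=3, T=2.5, k0=10):
    """Grow the inlier set over sorted absolute residuals until the MSSE
    constraint |r_{k+1}| > T * sigma_k fires; returns the inlier count k."""
    r = np.sort(np.abs(residuals))
    for k in range(k0, len(r)):
        sigma_k = np.sqrt(np.sum(r[:k] ** 2) / (k - d))
        if r[k] > T * sigma_k:
            return k
    return len(r)
```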
Feature Matching
• Photometric features are assigned BRIEF descriptors.
• Geometric features are assigned SHOT descriptors.
• BRIEF matching: Hamming distance.
• SHOT matching: nearest neighbour in descriptor space.
• Mutual consistency check: only pairs of corresponding points that are mutually matched to each other are considered as the initial matches.
Fig. 7 Initial matching.
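The mutual consistency check can be sketched as follows, using brute-force L2 nearest neighbours for illustration (BRIEF descriptors would use Hamming distance instead):

```python
import numpy as np

def mutual_matches(desc_a, desc_b):
    """Nearest-neighbour matching with a mutual consistency check:
    keep (i, j) only if j is the NN of i in B AND i is the NN of j in A."""
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    a_to_b = dists.argmin(axis=1)   # best match in B for each row of A
    b_to_a = dists.argmin(axis=0)   # best match in A for each row of B
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]
```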
Outlier Removal and Transformation Estimation using MSSE
• Least K-th order statistics (LKS) model fitting, based on rank ordering of the residuals.
• Relaxes the fixed error threshold assumption used in RANSAC.
• The cost function to be minimized is the K-th smallest squared residual:
    F_LKS = r²_(K)
• Modified Selective Statistical Estimator (MSSE) for estimating the scale:
    σ_K² = (1 / (K − d)) Σ_{i=1}^{K} r_i²
• Works well with multiple structures.
• Find the 6DOF transformation T = [R t; 0 1] using the detected inliers.
Fig. 8 Line segmentation using MSSE.
Fig. 9 Initial matches (top), good matches (bottom).
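Given the detected inlier correspondences, the 6DOF transformation has the standard closed-form Kabsch/Umeyama SVD solution; this is a generic solver, not necessarily the authors' exact estimator.

```python
import numpy as np

def rigid_transform(src, dst):
    """Closed-form least-squares 6DOF alignment (Kabsch/Umeyama):
    returns (R, t) such that dst ~= R @ src + t.
    src, dst: (N, 3) arrays of inlier correspondences."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)          # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # reflection guard: force det(R) = +1
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = mu_d - R @ mu_s
    return R, t
```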
Global Pose Estimation and Mapping
• The previous steps estimate the transformation between two consecutive frames.
• Concatenate all the transformations to obtain a global pose:
    T_global,n = T_1,2 · T_2,3 · … · T_n−1,n
• The map is obtained by transforming the points from the current frame to the global reference frame using:
    [X_g; 1] = [R t; 0 1] [X_c; 1]
Fig. 10 Example of the constructed map of an office scene using the proposed registration method.
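The concatenation and mapping steps above translate directly into homogeneous-matrix code:

```python
import numpy as np

def to_homogeneous(R, t):
    """Pack (R, t) into a 4x4 homogeneous transform [R t; 0 1]."""
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

def global_pose(relative_transforms):
    """Concatenate frame-to-frame 4x4 transforms into a global pose."""
    T = np.eye(4)
    for T_rel in relative_transforms:
        T = T @ T_rel
    return T

def map_points(points, T_global):
    """Transform current-frame points into the global reference frame."""
    homog = np.hstack([points, np.ones((len(points), 1))])
    return (T_global @ homog.T).T[:, :3]
```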
3
Experimental Results
Evaluation
• We used a publicly available RGB-D benchmark dataset.
• The evaluation metric is the absolute trajectory error (ATE) between a sequence of estimated camera poses P_1, …, P_n and the ground truth trajectory Q_1, …, Q_n.
Fig. Visualization of the absolute trajectory error (ATE) using: (a) the 'freiburg3_nostructure_texture_near_withloop' sequence, (b) 'freiburg3_structure_texture_far'.
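The ATE metric can be computed as the RMSE of translational differences; this minimal version assumes the two trajectories are already time-associated and aligned (the benchmark tooling also performs a rigid alignment first).

```python
import numpy as np

def ate_rmse(estimated, ground_truth):
    """Absolute trajectory error: RMSE over translational differences.
    estimated, ground_truth: (N, 3) position arrays, time-associated
    and already expressed in a common frame."""
    err = estimated - ground_truth
    return np.sqrt((np.linalg.norm(err, axis=1) ** 2).mean())
```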
Evaluation (cont.)
• Texture vs. structure
• Comparison with other methods
• Computational performance
4
Conclusion
Conclusion and Future Work
• We presented a method that uses both depth and visual information.
• Works well in low-structure as well as low-texture scenes.
• The method automatically switches between photometric and geometric features.
• Novel informative sampling method (IS3D) that selects only points carrying important information.
• Our method was evaluated using a publicly available RGB-D benchmark.

Future work:
• Achieving global consistency by employing pose graph optimization or bundle adjustment.
• Mapping in dynamic environments: segmenting multiple motions and using camera motion only for registration.
• Possibly tracking the moving objects that are in the camera's field of view.
Thank you
References
[1] J. Sturm, N. Engelhard, F. Endres, W. Burgard, and D. Cremers, "A benchmark for the evaluation of RGB-D SLAM systems," in 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2012, pp. 573–580.
[2] H. Durrant-Whyte, D. Rye, and E. Nebot, "Localization of autonomous guided vehicles," Robotics Research: International Symposium, vol. 7, pp. 613–625, 1996.
[3] S. Thrun, "Robotic mapping: A survey," Exploring Artificial Intelligence in the New Millennium, pp. 1–35, 2002.
[4] S. Izadi, D. Kim, O. Hilliges, D. Molyneaux, R. Newcombe, P. Kohli, J. Shotton, S. Hodges, D. Freeman, A. Davison, et al., "KinectFusion: Real-time 3D reconstruction and interaction using a moving depth camera," in Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology, 2011, pp. 559–568.
[5] P. Henry, M. Krainin, E. Herbst, X. Ren, and D. Fox, "RGB-D mapping: Using Kinect-style depth cameras for dense 3D modeling of indoor environments," The International Journal of Robotics Research, vol. 31, no. 5, pp. 647–663, 2012.
[6] F. Endres, J. Hess, N. Engelhard, J. Sturm, D. Cremers, and W. Burgard, "An evaluation of the RGB-D SLAM system," in 2012 IEEE International Conference on Robotics and Automation (ICRA), 2012, pp. 1691–1696.
[7] A. Bab-Hadiashar and D. Suter, "Robust segmentation of visual data using ranked unbiased scale estimate," Robotica, vol. 17, no. 6, pp. 649–660, 1999.
[8] E. Rosten and T. Drummond, "Machine learning for high-speed corner detection," Computer Vision–ECCV 2006, pp. 430–443, 2006.
[9] P. Besl and N. McKay, "A method for registration of 3-D shapes," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 2, pp. 239–256, 1992.
[10] M. Lourakis and A. Argyros, "SBA: A software package for generic sparse bundle adjustment," ACM Transactions on Mathematical Software (TOMS), vol. 36, no. 1, p. 2, 2009.
[11] D. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.
[12] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: Speeded up robust features," Computer Vision–ECCV 2006, pp. 404–417, 2006.
[13] E. Rublee, V. Rabaud, K. Konolige, and G. Bradski, "ORB: An efficient alternative to SIFT or SURF," in 2011 IEEE International Conference on Computer Vision (ICCV), 2011, pp. 2564–2571.
[14] H. Du, P. Henry, X. Ren, M. Cheng, D. Goldman, S. Seitz, and D. Fox, "Interactive 3D modeling of indoor environments with a consumer depth camera," in Proceedings of the 13th International Conference on Ubiquitous Computing, 2011, pp. 75–84.
[15] C. Audras, A. Comport, M. Meilland, and P. Rives, "Real-time dense appearance-based SLAM for RGB-D sensors," in Australasian Conference on Robotics and Automation, 2011.
[16] R. Newcombe, S. Izadi, O. Hilliges, D. Molyneaux, D. Kim, A. Davison, P. Kohli, J. Shotton, S. Hodges, and A. Fitzgibbon, "KinectFusion: Real-time dense surface mapping and tracking," in 2011 10th IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 2011, pp. 127–136.
[17] T. Whelan, M. Kaess, M. Fallon, H. Johannsson, J. Leonard, and J. McDonald, "Kintinuous: Spatially extended KinectFusion," 2012.
[18] A. Bachrach, S. Prentice, R. He, P. Henry, A. Huang, M. Krainin, D. Maturana, D. Fox, and N. Roy, "Estimation, planning, and mapping for autonomous flight using an RGB-D camera in GPS-denied environments," The International Journal of Robotics Research, vol. 31, no. 11, pp. 1320–1343, 2012.
[19] G. Hu, S. Huang, L. Zhao, A. Alempijevic, and G. Dissanayake, "A robust RGB-D SLAM algorithm," in 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2012, pp. 1714–1719.
[20] K. Yousif, A. Bab-Hadiashar, and R. Hoseinnezhad, "3D registration in dark environments using RGB-D cameras," in 2013 International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2013, pp. 1–8.
[21] K. Yousif, A. Bab-Hadiashar, and R. Hoseinnezhad, "Real-time RGB-D registration and mapping in texture-less environments using ranked order statistics," in 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), in press.
References (continued)
[22] C. Kerl, J. Sturm, and D. Cremers, "Dense visual SLAM for RGB-D cameras," in 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013, pp. 2100–2106.
[23] L. Douadi, M.-J. Aldon, and A. Crosnier, "Pair-wise registration of 3D/color data sets with ICP," in 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2006, pp. 663–668.
[24] S. Druon, M.-J. Aldon, and A. Crosnier, "Color constrained ICP for registration of large unstructured 3D color data sets," in 2006 IEEE International Conference on Information Acquisition, 2006, pp. 249–255.
[25] F. Tombari, S. Salti, and L. Di Stefano, "Unique signatures of histograms for local surface description," in Computer Vision–ECCV 2010, Springer, 2010, pp. 356–369.
[26] M. Calonder, V. Lepetit, C. Strecha, and P. Fua, "BRIEF: Binary robust independent elementary features," Computer Vision–ECCV 2010, pp. 778–792, 2010.
[27] R. B. Rusu, "Semantic 3D object maps for everyday manipulation in human living environments," KI – Künstliche Intelligenz, vol. 24, no. 4, pp. 345–348, 2010.
[28] F. Fraundorfer and D. Scaramuzza, "Visual odometry: Part II: Matching, robustness, optimization, and applications," IEEE Robotics & Automation Magazine, vol. 19, no. 2, pp. 78–90, 2012.
[29] M. Quigley, K. Conley, B. P. Gerkey, J. Faust, T. Foote, J. Leibs, R. Wheeler, and A. Y. Ng, "ROS: An open-source robot operating system," in ICRA Workshop on Open Source Software, 2009.