high speed obstacle avoidance using monocular vision and reinforcement learning jeff michels...
Post on 21-Dec-2015
216 views
TRANSCRIPT
High Speed Obstacle Avoidance using Monocular Vision and Reinforcement Learning
Jeff MichelsAshutosh SaxenaAndrew Y. Ng
Stanford University
ICML 2005.
http://ai.stanford.edu/~asaxena/rccar/
ICML 2005. Jeff Michels, Ashutosh Saxena & Andrew Y. Ng
Problem Drive a remote control car
at high speeds Unstructured outdoor
environments Off the shelf hardware,
inexpensive cameras and little processing power
Vision and Driving ControlQuickTime™ and a
TIFF (Uncompressed) decompressorare needed to see this picture.
ICML 2005. Jeff Michels, Ashutosh Saxena & Andrew Y. Ng
Prior Work: Vision Estimating depth from multiple
images: Stereovision (e.g., Scharstein &
Szeliski, 2002) Depth from Defocus (e.g.,
Klarquist et al., 1995) Optical Flow/Structure from
motion (e.g., Barron et al., 1994)
Motivation #1: Monocular vision. Stereo vision has limits
baseline distance between cameras vibration and blur
We would like to explore the use of monocular cues.
ICML 2005. Jeff Michels, Ashutosh Saxena & Andrew Y. Ng
Prior Work: Driving Control
Driving Stereo-vision for driving (LeCun, 2003) Highways with clear lane markings (Pomerleau, 1989) Single camera for indoor robot, but known color and texture of ground
(Gini & Marchi, 2002)
Motivation #2: Reinforcement learning Many past successes used model-based RL. Does model-based RL still make sense even for tasks requiring
complex perception? (To simulate vision input, we need to use computer graphics!)
ICML 2005. Jeff Michels, Ashutosh Saxena & Andrew Y. Ng
Approach
Vision System Estimate distance to nearest obstacle in each
possible steering direction.
Driving Control Map from the output of the vision system into
steering commands for the car. Use reinforcement learning to learn the policy.
ICML 2005. Jeff Michels, Ashutosh Saxena & Andrew Y. Ng
Vision System: Training Data Image divided into vertical
columns corresponding to possible steering directions.
Image labeled with depth for each vertical column
Laser range finder -- ground truth distances
ICML 2005. Jeff Michels, Ashutosh Saxena & Andrew Y. Ng
Vision System: Monocular Cues Monocular Cues used by
humans for depth perception Texture Variations - Laws’ Texture Gradient (Linear
Perspective) - Radon, Harris Haze - Color Occlusion Known Object Size
(Loomis, Nature 2001)
ICML 2005. Jeff Michels, Ashutosh Saxena & Andrew Y. Ng
Feature Vector: Monocular Cues Texture Variation Texture Gradient Occlusion, Object Size,
Global structure Overlapping windows Appending adjacent stripe’s
vectors
The feature vector size is 858 dimension
ICML 2005. Jeff Michels, Ashutosh Saxena & Andrew Y. Ng
Learning Algorithm
Supervised learning to estimate the distance d in each column of the image.
Learn weights w via ordinary least squares with quadratic cost.
arg minw ∑i (di - wT xi )2
Other regression methods (SVR, robust regression) gave similar results
weightsdepth
featuresi = columns, images
ICML 2005. Jeff Michels, Ashutosh Saxena & Andrew Y. Ng
Results: Learning Depth Estimates
Errors on a log scaleE = ∑ | log10(d) – log10(destimated) |
Able to predict depth with a average error of 0.26 orders of magnitude.
0.2
0.25
0.3
0.35
0.4
Radon(TextureGradient)
Harris(TextureGradient)
Laws(Texture
Variations)
All
ICML 2005. Jeff Michels, Ashutosh Saxena & Andrew Y. Ng
Synthetic Graphics Data
Graphics images for training the vision system.
Variable degree of graphical realism
Can a system trained on synthetic images predict distances on real images?
ICML 2005. Jeff Michels, Ashutosh Saxena & Andrew Y. Ng
Results: Combined Vision System
0
5
10
15
20
25
Random Graphics Real Combined
Haz
ard
Rate
(%)
When the distance to nearest obstacle in the chosen direction is less than 5 m, then it is a hazard.
Hazard rate improves by combining the real and synthetic trained system.
24% hazard rate reduction over using only real images.
ICML 2005. Jeff Michels, Ashutosh Saxena & Andrew Y. Ng
Control: Reinforcement Learning Model based RL -- hard perception problem Randomly generated environment in
graphics simulator Pegasus (Ng & Jordan, 2000) to learn
control policy Car initialized at (0,0) and ran for fixed time
horizon. Learning algorithm converged after 1674
iterations of policy search.
ICML 2005. Jeff Michels, Ashutosh Saxena & Andrew Y. Ng
Reinforcement Learning: Parameters
1: spatial smoothing of predicted distances
2: threshold distance for evasive action
3: steering angle parameter
4, 5: evasive action parameters
6: throttle parameter
ICML 2005. Jeff Michels, Ashutosh Saxena & Andrew Y. Ng
QuickTime™ and a decompressor
are needed to see this picture.
ICML 2005. Jeff Michels, Ashutosh Saxena & Andrew Y. Ng
QuickTime™ and aCinepak decompressor
are needed to see this picture.
ICML 2005. Jeff Michels, Ashutosh Saxena & Andrew Y. Ng
QuickTime™ and a decompressor
are needed to see this picture.
ICML 2005. Jeff Michels, Ashutosh Saxena & Andrew Y. Ng
Results: Driving Times
QuickTime™ and aTIFF (LZW) decompressor
are needed to see this picture.
ICML 2005. Jeff Michels, Ashutosh Saxena & Andrew Y. Ng
Summary
Monocular depth estimation is an interesting and important problem.
Supervised learning for depth estimation. Model-based RL, using computer graphics
simulator, to learn controller.
ICML 2005. Jeff Michels, Ashutosh Saxena & Andrew Y. Ng
Extensions/Future Work
Learn complete depth maps
Markov Random Field (MRF) to estimate depths.
Image Ground Truth Predicted
[also with Sung Chung.]
Learning depth from single monocular images,
Ashutosh Saxena, Sung H. Chung, Andrew Y. Ng.
In NIPS 2005.
ICML 2005. Jeff Michels, Ashutosh Saxena & Andrew Y. Ng
Contact:
Ashutosh Saxena, [email protected]
http://ai.stanford.edu/~asaxena/rccar/
http://ai.stanford.edu/~asaxena/learningdepth/