robot vision for the visually impaired
TRANSCRIPT
Robot Vision for the Visually Impaired
Vivek Pradeep, Gerard Medioni, James Weiland
presented by Phongsathorn Eakamongul
Department of Computer Science
Asian Institute of Technology
2010, December 7
Phongsathorn (AIT) Robot Vision for the Visually Impaired Short Occasion 1 / 18
Outline
1 Abstracts
2 System Description
3 Result
Abstracts
head-mounted: gives wide-field information, compared with the shoulder- or waist-mounted designs in the literature, which require body rotations
stereo vision
navigational assistance device
visual odometry: dense 3D data with 2D elevation grids
metric-topological SLAM
builds a vicinity map
3D traversability analysis to steer subjects away from obstacles in the path
microvibration motors provide cues for taking evasive action: tactile cues are used instead of audio, since audio imposes a greater cognitive load on the subject, and blind users rely on hearing to perform a wide variety of other tasks
runs at 10 Hz in the experiments
Introduction
visual impairment: mobility relies on a long cane or a guide dog
in the US, 109,000 people use long canes and 7,000 use dog guides
only 1,500 graduate from a dog-guide user program
electronic travel aids (ETAs) leverage ultrasonic, laser, or vision sensors
a wearable array of microvibration motors provides tactile cues and guides the user along a safe path
Online SLAM + obstacle detection
Stereo Vision Odometry
from matched correspondences across ($P^{t-1}_L$, $P^{t-1}_R$, $P^{t}_L$) or ($P^{t-1}_L$, $P^{t-1}_R$, $P^{t}_R$), camera motion can be computed using the three-point algorithm in a RANSAC setting
for robustness, feature matching and reprojection errors are measured across four views
Sparse Bundle Adjustment
feature covariances can be propagated to get the motion uncertainty for use in the SLAM filter
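The RANSAC loop around a three-point minimal solver can be sketched as follows. This is only an illustration under stated assumptions: the slide does not say which three-point solver is used, so the sketch substitutes a simple 3D-3D frame alignment on three matched triangulated points, and it omits the four-view reprojection check and the bundle adjustment. All function names, the tolerance and the iteration count are hypothetical.

```python
import math
import random

def sub(a, b):
    return [a[i] - b[i] for i in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def unit(a):
    n = math.sqrt(sum(x * x for x in a))
    return [x / n for x in a]  # raises ZeroDivisionError for a zero vector

def frame(p0, p1, p2):
    """Orthonormal axes attached to three non-collinear 3D points."""
    e1 = unit(sub(p1, p0))
    e3 = unit(cross(e1, sub(p2, p0)))
    return [e1, cross(e3, e1), e3]

def apply_motion(R, t, p):
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

def motion_from_three(src, dst):
    """Rigid motion (R, t) such that dst is approximately R * src + t."""
    fs, fd = frame(*src), frame(*dst)
    # R maps each source axis onto the matching destination axis.
    R = [[sum(fd[k][i] * fs[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = sub(dst[0], apply_motion(R, [0.0, 0.0, 0.0], src[0]))
    return R, t

def ransac_motion(src_pts, dst_pts, iters=200, tol=0.05, seed=0):
    """Keep the three-point hypothesis that explains the most matches."""
    rng = random.Random(seed)
    best_R, best_t, best_inliers = None, None, []
    for _ in range(iters):
        idx = rng.sample(range(len(src_pts)), 3)
        try:
            R, t = motion_from_three([src_pts[i] for i in idx],
                                     [dst_pts[i] for i in idx])
        except ZeroDivisionError:
            continue  # degenerate (collinear or repeated) sample
        inliers = [i for i in range(len(src_pts))
                   if math.dist(apply_motion(R, t, src_pts[i]), dst_pts[i]) < tol]
        if len(inliers) > len(best_inliers):
            best_R, best_t, best_inliers = R, t, inliers
    return best_R, best_t, best_inliers
```

Here `src_pts` and `dst_pts` would be the same landmarks triangulated at time t-1 and at time t; mismatched features end up outside the returned inlier set.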
SLAM
Rao-Blackwellised particle filter (RBPF) in the FastSLAM framework, which uses KLT and SIFT tracking
constructs 2 maps
SLAM map: collection of sparse landmarks that is propagated every frame to yield consistent camera pose estimates, for SLAM purposes only
traversability map: dense 3D point cloud from triangulation
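As a rough illustration of how a dense 3D cloud comes out of stereo triangulation, the sketch below back-projects a disparity map with the standard rectified pinhole relation Z = f B / d. The rectified-stereo assumption, the function names and the parameter names (`f`, `cx`, `cy`, `baseline`) are not from the slides.

```python
def triangulate(u, v, disparity, f, cx, cy, baseline):
    """Back-project one pixel of a rectified stereo pair into the left camera frame.

    f is the focal length in pixels, (cx, cy) the principal point,
    baseline the distance between the two cameras in metres.
    """
    if disparity <= 0:
        return None  # no valid match for this pixel
    z = f * baseline / disparity
    x = (u - cx) * z / f
    y = (v - cy) * z / f
    return (x, y, z)

def point_cloud(disparity_map, f, cx, cy, baseline):
    """Triangulate every valid pixel of a disparity map given as rows of values."""
    cloud = []
    for v, row in enumerate(disparity_map):
        for u, d in enumerate(row):
            p = triangulate(u, v, d, f, cx, cy, baseline)
            if p is not None:
                cloud.append(p)
    return cloud
```

Depth grows as disparity shrinks, so distant points are triangulated with larger uncertainty than nearby ones.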
Metric-Topological SLAM
environments with several thousand landmarks
two levels of environment representation
local, metric (submap): estimates state information
  six-dimensional camera trajectory $s_t$
  sparse map $m_t$
  feature observations (KLT/SIFT) $z^t$
  camera motion estimates $u_t$
  RBPF: $p(s_t, m_t \mid z^t, u_t) \approx p(s_t \mid z^t, u_t) \prod_i p(m_t(i) \mid s_t, z^t, u_t)$
  $m_t(i)$: $i$th landmark in the map, represented by $N(\mu_i, \sigma_i)$
  each time a feature is observed, the corresponding landmark is updated using an EKF
  the RBPF enables us to update only the observed landmark instead of the whole map
global, topological: the map is represented as a collection of submaps
  annotated graph $G = (\{{}^{i}M\}_{i \in \Omega_t}, \{{}^{b}_{a}\Lambda\}_{a,b \in \Omega_t})$
  ${}^{i}M$: annotated submaps
  $\Omega_t$: set of computed submaps
  ${}^{b}_{a}\Lambda$: coordinate transformations between adjacent maps
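The point that only the observed landmark is updated can be sketched per particle as below. This is a deliberately simplified illustration, not the filter from the paper: the camera pose is reduced to a position, the observation is assumed to be the landmark position relative to the camera (so the measurement model is linear and the EKF update reduces to a plain Kalman update), and each landmark carries a single isotropic variance. `Particle`, `update_landmark` and `meas_var` are hypothetical names.

```python
import math

class Particle:
    """One pose hypothesis with its own independent landmark estimates."""
    def __init__(self, position):
        self.position = list(position)  # camera position (rotation omitted here)
        self.landmarks = {}             # id -> (mean [x, y, z], isotropic variance)
        self.weight = 1.0

def update_landmark(particle, lm_id, z, meas_var):
    """Fuse one observation z (landmark position relative to the camera)."""
    world = [particle.position[k] + z[k] for k in range(3)]
    if lm_id not in particle.landmarks:
        particle.landmarks[lm_id] = (world, meas_var)  # initialise a new landmark
        return
    mean, var = particle.landmarks[lm_id]
    innov = [world[k] - mean[k] for k in range(3)]
    s = var + meas_var            # innovation variance
    gain = var / s                # Kalman gain for the isotropic case
    new_mean = [mean[k] + gain * innov[k] for k in range(3)]
    particle.landmarks[lm_id] = (new_mean, (1.0 - gain) * var)
    # Reweight the particle by the Gaussian likelihood of the innovation.
    d2 = sum(e * e for e in innov)
    particle.weight *= math.exp(-0.5 * d2 / s) / (2.0 * math.pi * s) ** 1.5
```

Because each landmark is conditionally independent given the particle's pose, the update touches one dictionary entry, so its cost does not grow with the size of the map.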
Traversability Map
5 radius sphere
multi-surface elevation map: the point cloud is quantized into a 2D grid
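A minimal sketch of quantizing a point cloud into a 2D grid that can hold several surfaces per cell is given below. The cell size, the height gap that separates two surfaces, the choice of z as the vertical axis and the function name are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def elevation_grid(points, cell=0.2, gap=0.3):
    """Map each (ix, iy) grid cell to a list of (bottom, top) height intervals.

    points: iterable of (x, y, z) with z pointing up (assumed convention).
    cell:   grid resolution in metres (assumed value).
    gap:    heights further apart than this start a new surface (assumed value).
    """
    cells = defaultdict(list)
    for x, y, z in points:
        cells[(math.floor(x / cell), math.floor(y / cell))].append(z)

    grid = {}
    for key, heights in cells.items():
        heights.sort()
        patches = [[heights[0], heights[0]]]
        for h in heights[1:]:
            if h - patches[-1][1] > gap:
                patches.append([h, h])   # a new surface above the previous one
            else:
                patches[-1][1] = h       # extend the current surface
        grid[key] = [tuple(p) for p in patches]
    return grid
```

Keeping several intervals per cell lets the grid represent, for example, a floor and an overhanging obstacle at the same 2D location.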
Motion Prediction and Cue Generation
if the magnitude of the translation with respect to the previous position exceeds a certain threshold, the direction of motion and the reference position are updated
little translation -> no update
cue generation: most continuous traversable path (green in the picture)
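The thresholded update and one possible reading of "most continuous traversable path" can be sketched as follows. The threshold value, the 2D simplification, the idea of splitting the view into angular sectors, and both function names are assumptions; the slides do not give these details.

```python
import math

def update_motion(ref_pos, heading, cur_pos, min_step=0.25):
    """Update reference position and heading only after a large enough translation.

    ref_pos, cur_pos: (x, y) positions on the ground plane.
    heading: unit vector of the last accepted direction of motion.
    min_step: translation threshold in metres (assumed value).
    """
    dx, dy = cur_pos[0] - ref_pos[0], cur_pos[1] - ref_pos[1]
    dist = math.hypot(dx, dy)
    if dist <= min_step:
        return ref_pos, heading          # little translation: no update
    return cur_pos, (dx / dist, dy / dist)

def longest_free_run(free):
    """Return (start, length) of the longest run of traversable sectors."""
    best_start, best_len, start = 0, 0, None
    for i, ok in enumerate(list(free) + [False]):  # sentinel closes the last run
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start > best_len:
                best_start, best_len = start, i - start
            start = None
    return best_start, best_len
```

Ignoring small translations keeps head jitter from changing the predicted walking direction; the cue would then point toward the middle of the longest free run.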
Result
Green: traversable
Red: not traversable
error of the camera frame-to-frame heading (yaw), compared with readings from a commercial Inertial Measurement Unit (IMU)
camera motion: slow (< 5 degrees/s), medium (5-20 degrees/s), fast (20-30 degrees/s)
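A comparison of this kind could be computed as sketched below: the heading difference is wrapped so that values near 0 and 360 degrees are treated as close, and each sample is binned by angular speed. How the boundary values 5 and 20 degrees/s are assigned is an assumption, as are the function names.

```python
def wrap_deg(angle):
    """Wrap an angle in degrees into the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def yaw_error(cam_yaw_deg, imu_yaw_deg):
    """Signed heading difference between the camera estimate and the IMU."""
    return wrap_deg(cam_yaw_deg - imu_yaw_deg)

def speed_class(rate_deg_per_s):
    """Bin an angular speed into the three motion classes of the slide."""
    r = abs(rate_deg_per_s)
    if r < 5.0:
        return "slow"
    if r <= 20.0:
        return "medium"
    if r <= 30.0:
        return "fast"
    return None  # outside the ranges listed on the slide
```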
SLAM result
Traversability Map
one frame exp
patch that has thickness > 30 cm is labeled as vertical
5 horizontal patches is labeled as traversable
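The thickness rule can be sketched as a one-line test per patch. The (bottom, top) interval representation, the "horizontal" label for thin patches and the function name are assumptions; the rule involving 5 horizontal patches is not reproduced because the slide does not spell it out.

```python
def label_patches(patches, max_thickness=0.30):
    """Label each (bottom, top) height interval, in metres, by its thickness.

    A patch thicker than max_thickness (30 cm on the slide) is 'vertical';
    anything thinner is called 'horizontal' here (assumed label).
    """
    return ["vertical" if (top - bottom) > max_thickness else "horizontal"
            for bottom, top in patches]
```

For example, a floor patch spanning 0.00 to 0.05 m would be labeled horizontal, while a wall segment spanning 0.0 to 1.2 m would be labeled vertical.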
Experiment
Manually generated cues: wireless remote control
Autonomously generated cues, like group 4