


Pulling Things out of Perspective

Single View Depth Estimation

Ľubor Ladický¹, Jianbo Shi², Marc Pollefeys¹

¹ ETH Zürich, Switzerland   ² University of Pennsylvania, Philadelphia, USA

Standard approaches

1. Model fitting [Barinova et al., ECCV08]
   • Requires strong prior knowledge
   • Ignores small objects
2. 3D-detection based [Hoiem et al., CVPR06]
   • Works only for foreground objects (things)
3. Depth from semantic labels [Liu et al., CVPR10]
   • Requires strong priors about semantic classes
4. Data driven [Saxena et al., NIPS05]
   • Requires lots of data
   • Balancing the data is a problem

General problem

• No common structure of the scene
• Ground plane not always visible
• Large variation of viewpoints and of objects in the scene
• Both things and stuff in the scene
• Impossible?

Our classifier

1. Pixel-wise classifier
   • superpixels are not necessarily planar
2. Translation invariant
3. Depth transforms with inverse scaling
   • sufficient to train a classifier for a single canonical depth d_C
   • for any other depth d, the same classifier is evaluated on the image rescaled by the factor d / d_C (see the sketch below)
4. Multiple semantic classes

Classifier response: for a pixel x at depth d and a semantic label, the response is computed from features in a window w_h around the point x in the (rescaled) image I.
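A minimal sketch of this inverse-scaling evaluation, assuming a hypothetical trained single-depth classifier (respond_at_d_c is a placeholder, and the canonical depth value and window size are illustrative, not taken from the poster):

    import numpy as np
    from scipy.ndimage import zoom

    D_C = 10.0  # hypothetical canonical training depth in metres

    def respond_at_d_c(image, x, y):
        # Placeholder for the trained single-depth multi-class classifier:
        # it would score the window w_h around (x, y) for every label.
        h = 16  # half-size of the window, illustrative
        window = image[max(y - h, 0):y + h, max(x - h, 0):x + h]
        return window.mean(axis=(0, 1))  # dummy per-channel score

    def response(image, x, y, d):
        # An object at depth d appears d_C / d times larger than at d_C,
        # so rescaling the image by s = d / d_C restores the canonical
        # appearance; the classifier is then queried at the scaled pixel.
        s = d / D_C
        scaled = zoom(image, (s, s, 1), order=1)  # bilinear, H x W x C
        return respond_at_d_c(scaled, int(round(x * s)), int(round(y * s)))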

Training of the classifier

1. Image pyramid is built
2. Training data randomly sampled
3. Samples of each class at d_C used as positives
4. Samples of other classes or at d ≠ d_C used as negatives
5. Multi-class classifier trained
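A compressed sketch of steps 1-5 under simplifying assumptions: random vectors stand in for the real patch descriptors, per-sample class and depth-bin labels are assumed given, and depth-mismatched samples are folded into a single background label. AdaBoost is the poster's classifier choice (listed below).

    import numpy as np
    from sklearn.ensemble import AdaBoostClassifier

    rng = np.random.default_rng(0)

    # Hypothetical training pool: feature vector, semantic class, depth bin.
    # In the real pipeline these come from randomly sampled pyramid patches.
    n, dim, n_classes, d_c_bin = 10_000, 64, 12, 7
    feats = rng.normal(size=(n, dim))
    classes = rng.integers(0, n_classes, size=n)
    depth_bins = rng.integers(0, 15, size=n)

    # Steps 3-4: a sample is a positive for its class only at the canonical
    # depth d_C; everything else (wrong depth, or another class) is treated
    # here as one extra negative/background label.
    labels = np.where(depth_bins == d_c_bin, classes, n_classes)

    # Step 5: train the multi-class classifier.
    clf = AdaBoostClassifier(n_estimators=100).fit(feats, labels)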

• Dense features: SIFT, LBP, self-similarity, textons
• Representation: soft BOW representations over a set of rectangles
• Classifier: AdaBoost
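A minimal sketch of the soft BOW pooling over rectangles, assuming the dense features have already been softly assigned to a K-word dictionary; the integral-image trick is an implementation choice, not something stated on the poster:

    import numpy as np

    def soft_bow_over_rectangles(soft_assign, rects):
        # soft_assign: H x W x K soft codeword assignments for one dense
        # feature (e.g. SIFT quantised against a K-word dictionary).
        # rects: list of (x0, y0, x1, y1) rectangles inside the window.
        ii = soft_assign.cumsum(0).cumsum(1)        # H x W x K integral image
        ii = np.pad(ii, ((1, 0), (1, 0), (0, 0)))   # zero row/col for easy sums
        hists = []
        for x0, y0, x1, y1 in rects:
            h = ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]
            hists.append(h / max(h.sum(), 1e-9))    # normalised BOW histogram
        return np.stack(hists)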

Experiments

Patch classification

KITTI dataset

• 30 training & 30 test images
• 12 semantic labels
• depth range 2-50 m (except sky)
• neighbouring depths: d_{i+1} / d_i = 1.25

NYU2 dataset

• 725 training & 724 test images
• 40 semantic labels
• depth range 1-10 m
• neighbouring depths: d_{i+1} / d_i = 1.25
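For concreteness, the geometric depth discretisation implied by the ratio d_{i+1} / d_i = 1.25 can be generated as follows; the bin counts in the comments are derived from the quoted ranges, not read off the poster:

    import numpy as np

    def depth_bins(d_min, d_max, ratio=1.25):
        # Geometric discretisation with d_{i+1} / d_i = ratio.
        n = int(np.floor(np.log(d_max / d_min) / np.log(ratio))) + 1
        return d_min * ratio ** np.arange(n)

    print(depth_bins(2.0, 50.0))   # KITTI: 15 bins, 2 m up to ~45.5 m
    print(depth_bins(1.0, 10.0))   # NYU2: 11 bins, 1 m up to ~9.3 m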

[Figures: quantitative results on the KITTI and NYU2 datasets, plotting the ratio of pixels below the relative error threshold; semantic segmentation results]
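The plotted quantity is presumably the standard threshold metric: the fraction of pixels whose relative depth error falls below a given level. A minimal sketch of one common variant (the exact error definition used on the poster is not recoverable from the transcript):

    import numpy as np

    def ratio_below_relative_error(d_pred, d_gt, threshold):
        # Fraction of pixels with max(d_pred/d_gt, d_gt/d_pred) < threshold,
        # the common delta-threshold metric for single-view depth.
        rel = np.maximum(d_pred / d_gt, d_gt / d_pred)
        return float((rel < threshold).mean())

    # e.g. ratio_below_relative_error(pred, gt, 1.25) for the first threshold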