Transcript
Pulling Things out of Perspective

Ľubor Ladický¹, Jianbo Shi², Marc Pollefeys¹

¹ ETH Zürich, Switzerland   ² University of Pennsylvania, Philadelphia, USA

Single View Depth Estimation


Standard approaches

1. Model fitting [Barinova et al. ECCV08]

• Requires strong prior knowledge

• Ignores small objects

2. 3D-Detection based [Hoiem et al. CVPR06]

• Works only for foreground objects (things)

3. Depth from semantic labels [Liu et al. CVPR10]

• Requires strong priors about semantic classes

4. Data driven [Saxena et al. NIPS05]

• Requires lots of data

• Balancing the training data is problematic

General problem

• No common structure of the scene

• Ground plane not always visible

• Large variation of viewpoints and of objects in the scene

• Both things and stuff in the scene

• Impossible?

Our classifier

1. Pixel-wise classifier (superpixels are not necessarily planar)

2. Translation invariant

3. Depth transforms with inverse scaling: it is sufficient to train a classifier for a single canonical depth d_c; for other depths d the image is rescaled accordingly (see the sketch after this list)

4. Multiple semantic classes
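A tiny worked example of the inverse-scaling property (item 3 above): under a pinhole camera, apparent size is proportional to 1 / depth, so an object at depth d appears scaled by d_c / d relative to its appearance at the canonical depth d_c. The canonical depth of 20 m and the 64 × 64 window below are illustrative values, not taken from the poster.

# Inverse scaling of apparent size with depth (pinhole-camera assumption).
def window_scale(depth, canonical_depth):
    """Factor by which an object at `depth` appears scaled relative to its size at d_c."""
    return canonical_depth / depth

d_c, w, h = 20.0, 64, 64  # illustrative canonical depth and window size
for d in (2.0, 10.0, 20.0, 50.0):
    s = window_scale(d, d_c)
    print(f"d = {d:4.0f} m -> scale {s:5.2f}, equivalent window {s * w:.0f} x {s * h:.0f}")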

Classifier response for a pixel x_i at a depth d

The classifier predicts a semantic label from features computed in a window of size w × h around the point x_i. Because an object at depth d appears scaled by the factor d_c / d relative to its appearance at the canonical depth d_c, the response at depth d can be obtained by evaluating the canonical-depth classifier on the image rescaled by d / d_c (equivalently, over a window of size (d_c / d) w × (d_c / d) h around x_i in the original image).
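A minimal sketch, assuming the response at a hypothesised depth is read off a precomputed image pyramid: the image is rescaled by d / d_c and the canonical-depth classifier is evaluated at the rescaled coordinates. The `classify` callable, the choice of d_c as the largest depth, and the use of OpenCV for resizing are assumptions, not the authors' implementation.

import numpy as np
import cv2  # assumed available; any image-resizing routine would do


def build_pyramid(image, num_levels, ratio=1.25):
    """Image pyramid whose scale ratio matches the depth quantisation."""
    levels = [image]
    for _ in range(1, num_levels):
        h, w = levels[-1].shape[:2]
        levels.append(cv2.resize(levels[-1], (max(1, int(w / ratio)), max(1, int(h / ratio)))))
    return levels


def response(pyramid, classify, x, depth, d_c, ratio=1.25):
    """Response for pixel x = (row, col) at a hypothesised depth.

    An object at `depth` looks like a canonical-depth object once the image is
    rescaled by depth / d_c, so we pick the pyramid level whose scale is closest
    to that factor and evaluate the caller-supplied canonical-depth classifier
    `classify(image, point)` at the correspondingly rescaled coordinates.
    Assumes d_c is the largest depth, so every other depth maps to a downscaled level.
    """
    scale = depth / d_c
    level = int(round(np.log(1.0 / scale) / np.log(ratio)))
    level = max(0, min(level, len(pyramid) - 1))
    level_scale = ratio ** (-level)
    x_level = tuple(int(round(c * level_scale)) for c in x)
    return classify(pyramid[level], x_level)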

Training of the classifier

1. An image pyramid is built

2. Training data is randomly sampled

3. Samples of each class at the canonical depth d_c are used as positives

4. Samples of other classes, or at depths d ≠ d_c, are used as negatives

5. A multi-class classifier is trained

• Dense features: SIFT, LBP, self-similarity, textons

• Representation: soft bag-of-words (BOW) over a set of rectangles

• Classifier: AdaBoost
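A simplified sketch of steps 1-5 above, with scikit-learn's AdaBoostClassifier standing in for the boosting used on the poster and a generic extract_features callable standing in for the dense SIFT / LBP / self-similarity / texton soft-BOW descriptors. The tolerance around the canonical depth and the single background class for negatives are assumptions made to keep the sketch short.

import numpy as np
from sklearn.ensemble import AdaBoostClassifier  # stand-in for the boosting used on the poster


def train_patch_classifier(samples, extract_features, d_c, depth_tol=1.05):
    """Train the multi-class patch classifier from randomly sampled patches.

    `samples` is an iterable of (patch, semantic_label, depth) triples drawn at
    random from the pyramid levels of the training images (steps 1-2);
    `extract_features` maps a patch to its descriptor vector.
    """
    X, y = [], []
    for patch, label, depth in samples:
        X.append(extract_features(patch))
        # Steps 3-4: samples of a class seen (approximately) at the canonical
        # depth are positives for that class; other classes, or the right class
        # at a different depth, act as negatives via a single background class
        # (a simplification of the poster's setup).
        at_canonical = max(depth / d_c, d_c / depth) < depth_tol
        y.append(label if at_canonical else "background")
    clf = AdaBoostClassifier(n_estimators=200)  # step 5: multi-class boosting
    clf.fit(np.asarray(X), np.asarray(y))
    return clf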

Experiments

Patch classification

KITTI dataset

• 30 training & 30 test images

• 12 semantic labels

• depth range 2-50m (except sky)

• ratio of neighbouring depths d_{i+1} / d_i = 1.25

NYU2 dataset

• 725 training & 724 test images

• 40 semantic labels

• depth range 1-10 m

• ratio of neighbouring depths d_{i+1} / d_i = 1.25
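Both datasets quantise depth geometrically, with a ratio of 1.25 between neighbouring levels. A small sketch of how many discrete depth levels that implies; the exact placement of the endpoints is an assumption, since the poster only gives the range and the ratio.

import math

def depth_levels(d_min, d_max, ratio=1.25):
    """Discrete depths d_i = d_min * ratio**i covering the range [d_min, d_max]."""
    n = math.ceil(math.log(d_max / d_min, ratio)) + 1
    return [d_min * ratio ** i for i in range(n)]

print(len(depth_levels(2.0, 50.0)))  # KITTI, 2-50 m -> 16 levels
print(len(depth_levels(1.0, 10.0)))  # NYU2, 1-10 m -> 12 levels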

Results

KITTI and NYU2 datasets: plots of the ratio of pixels below the relative error

Semantic segmentation results
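The depth plots report the ratio of pixels whose relative error stays below a threshold. A minimal sketch of that metric; defining the relative error as |pred - gt| / gt is an assumption, since the poster does not spell out the exact formula.

import numpy as np

def ratio_below_relative_error(pred, gt, threshold):
    """Fraction of pixels whose relative depth error is below `threshold`."""
    rel = np.abs(pred - gt) / gt
    return float(np.mean(rel < threshold))

# Usage with hypothetical predicted and ground-truth depth maps:
# for t in (0.25, 0.5, 1.0):
#     print(t, ratio_below_relative_error(pred_depth, gt_depth, t))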
