open-world vision · vision.stanford.edu/teaching/cs131_fall1617/lectures/lecture_amir.pdf · amir...

Open-World Vision

Amir R. Zamir

What, Why, and How of “Representation”

Things... Our Knowledge...

“Transcript”

Macbeth was guilty.

[ 81 20 84 64 58 39 17 54 72 15]

Representation → Mathematical Model (e.g., classifier)

~12 lbs

~8 lbs

-5 0 +20 7 15 11

Weight (w)

Representation → Mathematical Model (Classifier)

Type B

Type A
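The weight example can be sketched in a few lines: the representation keeps a single number, the weight w, and the mathematical model is a threshold on it. This is only an illustration. The typical weights (~12 lbs and ~8 lbs) come from the slide, but which type is the heavier one, the midpoint threshold, and the sample animals are assumptions.

```python
# Typical weights in lbs, as shown on the slide. Assigning the heavier
# weight to "Type A" is an assumption made for illustration.
TYPICAL_WEIGHT = {"Type A": 12.0, "Type B": 8.0}

# Assumed decision rule: split halfway between the two typical weights.
THRESHOLD = sum(TYPICAL_WEIGHT.values()) / 2.0  # 10.0 lbs


def represent(animal):
    """Representation: keep only the weight w and discard everything else."""
    return animal["weight_lbs"]


def classify(w, threshold=THRESHOLD):
    """Mathematical model: a one-parameter threshold classifier on w."""
    return "Type A" if w >= threshold else "Type B"


# Hypothetical animals; color is present but ignored by the representation.
animals = [
    {"weight_lbs": 11.0, "color": "grey"},
    {"weight_lbs": 7.0, "color": "black"},
]
labels = [classify(represent(a)) for a in animals]
print(labels)
```

The point of the sketch is that the classifier only ever sees what the representation keeps, so a poorly chosen representation cannot be rescued by a better model.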

Represent these cats for a cat detector!

Represent these cats for a cat detector! (II)

Represent these cats for a cat detector! (III)

Represent these cats for a cat detector! (IV)

Color Histograms

Deformable Part-based Models

Histogram of Gradients

Shape-based Models

Felzenszwalb et al. 2010. Dalal and Triggs, 2005. Beis and Lowe, 1997.
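As a concrete instance of a hand-designed representation, a color histogram can be sketched as follows. This is a minimal version under assumptions not stated on the slide: 8-bit RGB pixels given as a flat list, a hypothetical `bins_per_channel` parameter, and normalization so the histogram sums to 1.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Count pixels per quantized (R, G, B) bin and normalize to sum to 1."""
    step = 256 // bins_per_channel
    top = bins_per_channel - 1
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Quantize each 8-bit channel; clamp in case 256 is not divisible.
        rq, gq, bq = min(r // step, top), min(g // step, top), min(b // step, top)
        hist[(rq * bins_per_channel + gq) * bins_per_channel + bq] += 1.0
    total = len(pixels)
    return [h / total for h in hist] if total else hist


# Toy "image": three orange pixels and one black pixel.
image = [(255, 128, 0), (250, 130, 10), (240, 140, 5), (0, 0, 0)]
hist = color_histogram(image)
print(len(hist), max(hist))
```

The histogram discards where each pixel sits in the image, which makes it robust to pose changes but blind to shape.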

Not always as easy (Happy vs Sad)

Not always as easy (Sad)

Learning Representations

Convolutional Neural Network

Autoencoder

LeCun et al. 1998. Hinton et al. 2006.

Two approaches to learning

Unsupervised: representation constrained on reconstruction.

Supervised: representation constrained on task(s).
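The difference between the two constraints can be made concrete by writing both losses over the same encoder. The sketch below is illustrative only: the linear encoder, the weight matrices, and the toy input are all assumptions, and real networks would add nonlinearities and train the weights by gradient descent.

```python
import math


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def encode(x, W_enc):
    """Representation: a linear projection of the input (no nonlinearity)."""
    return matvec(W_enc, x)


def reconstruction_loss(x, W_enc, W_dec):
    """Unsupervised objective: squared error between x and its reconstruction."""
    x_hat = matvec(W_dec, encode(x, W_enc))
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def task_loss(x, label, W_enc, W_cls):
    """Supervised objective: cross-entropy of a softmax classifier on the code."""
    logits = matvec(W_cls, encode(x, W_enc))
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_norm - logits[label]


x = [1.0, 2.0, 0.0]
W_enc = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]    # 3-D input -> 2-D code
W_dec = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # 2-D code -> 3-D reconstruction
W_cls = [[1.0, 0.0], [0.0, 1.0]]              # 2-D code -> 2 class scores

rec = reconstruction_loss(x, W_enc, W_dec)
ce = task_loss(x, 1, W_enc, W_cls)
print(rec, ce)
```

Both losses shape the same code `encode(x, W_enc)`; they differ only in what that code is asked to be good for.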

Stanford CS231n

Lightning overview of Neural Networks

Neural Net

A Neuron
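A single artificial neuron can be sketched as a weighted sum followed by a nonlinearity. The sigmoid activation and the example numbers are assumptions chosen for illustration; the slide does not specify them.

```python
import math


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical inputs and weights; the weighted sum here is exactly zero.
out = neuron([1.0, 2.0], [0.5, -0.25], 0.0)
print(out)
```

A layer of a neural net applies many such neurons to the same inputs, each with its own weights and bias.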

Convolutional
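The operation at the core of a convolutional layer can be sketched as a small filter slid over the image. This is a minimal version assuming no padding, stride 1, and a single channel; the toy image and the edge filter are hypothetical. As in most deep learning libraries, the kernel is not flipped, so strictly speaking this is cross-correlation.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge = [[-1, 1]]  # responds where intensity increases from left to right
response = conv2d(image, edge)
print(response)
```

The same few weights are reused at every image location, which is what distinguishes a convolutional layer from a fully connected one.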

Understanding Representations

Embedding

van der Maaten and Hinton, 2008.

Inverting a representation

[ 81 20 84 64 58 39 17 54 72 15]

Dosovitskiy and Brox, 2015.

Representations in NLP, Brain, Speech, etc.

Word2Vec (NLP)

fMRI Scan (brain)

Mikolov et al. 2013.

“Transcript”

Macbeth was guilty.

[ 81 20 84 64 58 39 17 54 72 15]

Representation → Mathematical Model (e.g., classifier)

CS231n

CS331b CS229

Now that we’re done with background building…

Open-World (Generic) Representations

Open-World (Generic) Computer Vision

An Exciting Time!

Fully Supervised Learning

• Fully supervised learning is task specific.
• It will not lead to human-like, comprehensive perception.
• Human-like perception is characterized by Generalization & Abstraction.

➔ How to develop a system/representation with Generalization & Abstraction?

How to achieve generalization & abstraction?

• Proposition: instead of providing supervision over the desired tasks, provide supervision over a set of selected foundational tasks ⇒ generalization to novel tasks and abstraction capabilities.

Held & Hein. 1963.

• But how to pick the foundational tasks?
• Biology!
• Inspiration from the developmental stages of visual skills in the brain.

Generic 3D Representation via Pose Estimation and Matching. Amir Zamir, Tilman Wekel, Pulkit Agrawal, Colin Wei, Jitendra Malik, Silvio Savarese. ECCV 2016.

Generic 3D Representation Learning

Learn it from the world!

Dataset Coverage

The 3D Representation

Evaluations

Supervised Tasks
• Camera pose estimation
• Matching (wide-baseline)
State-of-the-art, human-level

Unsupervised Tasks
• Surface Normal
• 3D Object Pose
• 3D Scene Layout
• Visual Abstraction
State-of-the-art unsupervised

Pose (surface normal) Embedding

Krizhevsky et al. 2012. Russakovsky et al. 2015

MIT Places

ImageNet (AlexNet) Generic 3D Rep.

Zhou et al. 2014 Krizhevsky et al. 2012. Russakovsky et al. 2015

3D Object Pose - ImageNet

Generic 3D Rep.

Wang & Gupta. 2015 Krizhevsky et al. 2012. Russakovsky et al. 2015

3D Object Pose - ImageNet

http://3drepresentation.stanford.edu/

Query Image

Generic 3D Representation

ImageNet (AlexNet)

3D Object Pose Estimation - Abstraction

3D Object Pose Estimation – Cross Category

Wang & Gupta. 2015 Agrawal et al. 2015. Russakovsky et al. 2015 Xiang et al. 2014.

3D Layout Estimation - Abstraction

Unsupervised Evaluations

Surface Normal Estimation (NYUv2)

Scene Layout Classification (LSUN)

Scene Layout Estimation (LSUN)

Object Pose Estimation (PASCAL3D)

What’s under the hood – Vanishing Points?

Mahendran & Vedaldi. 2015. Angladon et al. 2015. Denis et al. 2008. Li et al. 2010.

Matching Evaluation

Zagoruyko & Komodakis. 2015. Simo-Serra et al. 2015. Lowe. 2004. Arandjelovic & Zisserman. 2012. Wu et al. 2011. Simonyan & Zisserman. 2014. Tola et al. 2008. Morel et al. 2009.
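One way a matching evaluation of this kind can be scored is sketched below: each pair of patches is compared by the distance between its two descriptors, and the false-positive rate is read off at the threshold that accepts 95% of the true matches. The slides do not state which metric was used; this metric is common in the patch-matching literature and is an assumption here, as are the toy descriptors and the helper names.

```python
import math


def distance(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fpr_at_recall(pairs, recall=0.95):
    """pairs: list of (distance, is_match) tuples.

    Returns the false-positive rate at the smallest distance threshold
    that accepts at least `recall` of the true matches.
    """
    pos = sorted(d for d, m in pairs if m)
    neg = [d for d, m in pairs if not m]
    k = math.ceil(recall * len(pos))  # number of true matches to accept
    threshold = pos[k - 1]
    false_pos = sum(1 for d in neg if d <= threshold)
    return false_pos / len(neg)


# Hypothetical descriptor pairs with ground-truth match labels.
descriptor_pairs = [
    ([0.0, 0.0], [0.0, 0.1], True),
    ([1.0, 0.0], [1.0, 0.2], True),
    ([0.0, 0.0], [1.0, 1.0], False),
    ([0.0, 1.0], [0.1, 1.0], False),  # hard negative: close but not a match
]
scored = [(distance(a, b), m) for a, b, m in descriptor_pairs]
rate = fpr_at_recall(scored)
print(rate)
```

Lower is better: a descriptor that separates matches from non-matches well lets the threshold admit nearly all true matches while rejecting the hard negatives.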

Matching Results

Pose Regression Results

GT / Estimated

Pose Estimation Evaluation

Wu. 2011. Geiger et al. 2011. Wu et al. 2011.

• Task Taxonomy
  • Problem space is unknown.
  • Essential for the 3D-complete representation
• Proper Fusion Techniques
  • Beyond simplistic late-fusion or ConvNet fine-tuning
  • Essential for the vision-complete representation
• Proper Data!
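For reference, the "simplistic late-fusion" baseline mentioned in the list above can be sketched as averaging the final outputs of independently trained models. The model names, scores, and the optional `weights` parameter are hypothetical and only illustrate the idea.

```python
def late_fusion(score_lists, weights=None):
    """Combine per-model class scores by a (weighted) average of final outputs."""
    n_models = len(score_lists)
    weights = weights or [1.0 / n_models] * n_models
    n_classes = len(score_lists[0])
    return [
        sum(w * scores[c] for w, scores in zip(weights, score_lists))
        for c in range(n_classes)
    ]


# Hypothetical class scores from two independently trained models.
pose_model = [0.2, 0.8]
layout_model = [0.6, 0.4]
fused = late_fusion([pose_model, layout_model])
print(fused)
```

Because the models interact only at their outputs, nothing learned by one can inform the internal representation of the other, which is the limitation the slide points at.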

Generic 3D Representation via Pose Estimation and Matching. Amir Zamir, Tilman Wekel, Pulkit Agrawal, Colin Wei, Jitendra Malik, Silvio Savarese. ECCV 2016.

zamir@cs.stanford.edu

http://www.cs.stanford.edu/~amirz/

http://3DRepresentation.stanford.edu/
