cvpr tutorial: first person vision - university of minnesotahspark... · side wall . side wall...

CVPR Tutorial: First Person Vision

Post on 15-Aug-2020

INSIDE OUT: Riley’s First Date?, PIXAR

Third person camera

External space

What is where? - D. Marr

[Figure labels: person detection, ground plane, side walls, object detection, surface normal estimation, object affordance, semantic segmentation, tracking]

First person vision is not a moving third person camera.

What is first person vision?

Internal space

We move in order to see and we see in order to move.

- J. J. Gibson

[Figure labels: vanishing line, my orientation, interaction with me, my motion, person detection, ground plane, side walls]

“First person vision is an embedded human-system symbiosis.”

- Takeo Kanade

First person vision is all about me.

Why is first person vision ideal for human behavior understanding?

[Figure: distance from camera, d (axis ticks: 0.03m, 0.1m, 1m, 10m, 30m) against number of pixels for head pose at HD resolution (axis ticks: $10^2$, $10^3$, $10^4$, $10^5$, $10^6$ pixels), with number of pixels $\propto 1/d^2$]

Third person: target at 3-30m

Second person: target at 0.5-3m

First person: target at < 0.3m
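
The inverse-square relation on the chart above (number of pixels for head pose proportional to 1/d^2) can be illustrated with a small sketch. It assumes a simple pinhole camera model; the function name `head_pixel_count`, the head size of 0.2m, and the focal length of 1000 pixels are hypothetical values chosen for illustration, not figures from the slides.

```python
def head_pixel_count(distance_m, head_size_m=0.2, focal_px=1000.0):
    """Approximate pixel area covered by a head under a pinhole camera model.

    head_size_m and focal_px are illustrative assumptions, not slide values.
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    # Projected side length in pixels shrinks linearly with distance.
    side_px = focal_px * head_size_m / distance_m
    # Area in pixels therefore shrinks with the square of the distance.
    return side_px ** 2


# Distances loosely spanning the first, second, and third person ranges.
for d in (0.3, 1.0, 3.0, 10.0, 30.0):
    print(f"d = {d:5.1f} m -> about {head_pixel_count(d):10.0f} pixels")
```

Under these assumed values, moving the camera from 10m to 1m multiplies the pixel count by 100. The sketch does not model the head being clipped by the image frame at very short distances.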

[Figure: first person, second person, and third person camera placements relative to the target. Labels: noninvasiveness; measurement accuracy (3D estimation error < 5cm); prediction; learning]

First person vs. Third person

1. Attention Following

Group Attention Following

2. Egocentric Spatial Organization

[Figure labels: 2.3m, 30cm, orientation]

Egocentric action-object detection

[Figure panels: w/ prior, w/o prior]

Graphical Representation via Kinematics

Nodes V1, V2, V3, V4: position, orientation, pose, velocity, role

Edges E12, E13, E14, E23, E24, E34: distance, relative orientation, relative velocity, social relation

Coach’s note
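
The node and edge attributes listed above could be organized as a small graph data structure. The following is a minimal sketch under stated assumptions, not the tutorial's implementation: the names `Person`, `edge_attributes`, and `build_graph` are hypothetical, positions and velocities are assumed to be 2D, pose is omitted for brevity, and the social relation is left out because the slides do not say how it is derived.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Person:
    """One graph node (V1..V4 on the slide); fields follow the listed attributes."""
    name: str
    position: tuple      # (x, y) in metres, assumed 2D
    orientation: float   # heading in radians
    velocity: tuple      # (vx, vy) in metres per second
    role: str = "unknown"


def edge_attributes(a, b):
    """Pairwise attributes for the edge between two nodes."""
    dx = b.position[0] - a.position[0]
    dy = b.position[1] - a.position[1]
    diff = b.orientation - a.orientation
    return {
        "distance": math.hypot(dx, dy),
        # Wrap the heading difference into [-pi, pi].
        "relative_orientation": math.atan2(math.sin(diff), math.cos(diff)),
        "relative_velocity": (b.velocity[0] - a.velocity[0],
                              b.velocity[1] - a.velocity[1]),
    }


def build_graph(people):
    """Fully connected graph: one edge per unordered pair of nodes."""
    return {(a.name, b.name): edge_attributes(a, b)
            for a, b in combinations(people, 2)}


people = [
    Person("V1", (0.0, 0.0), 0.0, (0.0, 0.0)),
    Person("V2", (3.0, 4.0), math.pi / 2, (1.0, 0.0)),
    Person("V3", (3.0, 0.0), math.pi, (0.0, 1.0)),
    Person("V4", (0.0, 4.0), 0.0, (0.0, 0.0)),
]
graph = build_graph(people)
print(len(graph), "edges;", "E12 distance =", graph[("V1", "V2")]["distance"])
```

Four nodes give six edges, matching E12, E13, E14, E23, E24, and E34 on the slide.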

What can first person cameras tell us about me?

1. Attention

Personal attention: what am I looking at? [Li ICCV13]

Social attention: what are we looking at?

What can first person cameras tell us about me?

1. Attention
2. Kinematics

Human kinematics I: Where are my body and the object?

Human kinematics II: What does that mean to me?

What can first person cameras tell us about me?

1. Attention
2. Kinematics
3. Control (sensorimotor)

Visual Sensorimotor I: How do I control?

Visual Sensorimotor II: What do I feel?

[Figure label: 3D reconstruction]

What can first person cameras tell us about me?

1. Attention
2. Kinematics
3. Control (sensorimotor)