project 4 results representation – sift and hog are popular and successful. data – hugely...

40
Project 4 Results • Representation – SIFT and HoG are popular and successful. • Data – Hugely varying results from hard mining. • Learning – Non-linear classifier usually better. Zachary, Hung-I , Paul , Emanuel

Upload: asher-lyons

Post on 23-Dec-2015

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Project 4 Results

• Representation– SIFT and HoG are popular and successful.

• Data– Hugely varying results from hard mining.

• Learning– Non-linear classifier usually better.

Zachary, Hung-I, Paul, Emanuel

Page 2: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Project 5

• Project 5 handout

Page 3: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Final Project

• Proposals due Friday.– One paragraph with one paper link.

• Can opt out of project 5.• Higher expectations; must be able to present

during final exam slot.

Page 4: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

11/14/2011

CS143, Brown

James Hays

Stereo:Epipolar geometry

Slides by Kristen Grauman

Page 5: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Multiple views

Hartley and Zisserman

Lowe

Multi-view geometry, matching, invariant features, stereo vision

Page 6: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Why multiple views?• Structure and depth are inherently ambiguous from

single views.

Images from Lana Lazebnik

Page 7: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Why multiple views?• Structure and depth are inherently ambiguous from

single views.

Optical center

P1P2

P1’=P2’

Page 8: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

• What cues help us to perceive 3d shape and depth?

Page 9: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Shading

[Figure from Prados & Faugeras 2006]

Page 10: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Focus/defocus

[figs from H. Jin and P. Favaro, 2002]

Images from same point of view, different camera parameters

3d shape / depth estimates

Page 11: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Texture

[From A.M. Loh. The recovery of 3-D structure using visual texture patterns. PhD thesis]

Page 12: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Perspective effects

Image credit: S. Seitz

Page 13: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Motion

Figures from L. Zhang http://www.brainconnection.com/teasers/?main=illusion/motion-shape

Page 14: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Estimating scene shape

• “Shape from X”: Shading, Texture, Focus, Motion…

• Stereo: – shape from “motion” between two views– infer 3d shape of scene from two (multiple)

images from different viewpoints

scene pointscene point

optical centeroptical center

image planeimage plane

Main idea:

Page 15: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Outline

• Human stereopsis

• Stereograms

• Epipolar geometry and the epipolar constraint

– Case example with parallel optical axes

– General case with calibrated cameras

Page 16: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Human eye

Fig from Shapiro and Stockman

Pupil/Iris – control amount of light passing through lens

Retina - contains sensor cells, where image is formed

Fovea – highest concentration of cones

Rough analogy with human visual system:

Page 17: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Human stereopsis: disparity

Human eyes fixate on point in space – rotate so that corresponding images form in centers of fovea.

Page 18: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Disparity occurs when eyes fixate on one object; others appear at different visual angles

Human stereopsis: disparity

Page 19: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Disparity: d = r-l = D-F.

Human stereopsis: disparity

Forsyth & Ponce

Page 20: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Random dot stereograms

• Julesz 1960: Do we identify local brightness patterns before fusion (monocular process) or after (binocular)?

• To test: pair of synthetic images obtained by randomly spraying black dots on white objects

Page 21: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Random dot stereograms

Forsyth & Ponce

Page 22: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Random dot stereograms

Page 23: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Random dot stereograms

• When viewed monocularly, they appear random; when viewed stereoscopically, see 3d structure.

• Conclusion: human binocular fusion not directly associated with the physical retinas; must involve the central nervous system

• Imaginary “cyclopean retina” that combines the left and right image stimuli as a single unit

Page 24: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Stereo photography and stereo viewers

Invented by Sir Charles Wheatstone, 1838 Image from fisher-price.com

Take two pictures of the same subject from two slightly different viewpoints and display so that each eye sees only one of the images.

Page 25: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

http://www.johnsonshawmuseum.org

Page 26: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

http://www.johnsonshawmuseum.org

Page 27: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Public Library, Stereoscopic Looking Room, Chicago, by Phillips, 1923

Page 28: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

http://www.well.com/~jimg/stereo/stereo_list.html

Page 29: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Autostereograms

Images from magiceye.com

Exploit disparity as depth cue using single image.

(Single image random dot stereogram, Single image stereogram)

Page 30: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Images from magiceye.com

Autostereograms

Page 31: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Estimating depth with stereo

• Stereo: shape from “motion” between two views

• We’ll need to consider:• Info on camera pose (“calibration”)• Image point correspondences

scene pointscene point

optical optical centercenter

image planeimage plane

Page 32: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Two cameras, simultaneous views

Single moving camera and static scene

Stereo vision

Page 33: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Camera parameters

Camera frame 1

Intrinsic parameters:Image coordinates relative to camera Pixel coordinates

Extrinsic parameters:Camera frame 1 Camera frame 2

Camera frame 2

• Extrinsic params: rotation matrix and translation vector

• Intrinsic params: focal length, pixel sizes (mm), image center point, radial distortion parameters

We’ll assume for now that these parameters are given and fixed.

Page 34: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Outline

• Human stereopsis

• Stereograms

• Epipolar geometry and the epipolar constraint

– Case example with parallel optical axes

– General case with calibrated cameras

Page 35: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Geometry for a simple stereo system

• First, assuming parallel optical axes, known camera parameters (i.e., calibrated cameras):

Page 36: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

baseline

optical center (left)

optical center (right)

Focal length

World point

image point (left)

image point (right)

Depth of p

Page 37: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

• Assume parallel optical axes, known camera parameters (i.e., calibrated cameras). What is expression for Z?

Similar triangles (pl, P, pr) and (Ol, P, Or):

Geometry for a simple stereo system

Z

T

fZ

xxT rl

lr xx

TfZ

disparity

Page 38: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Depth from disparity

image I(x,y) image I´(x´,y´)Disparity map D(x,y)

(x´,y´)=(x+D(x,y), y)

So if we could find the corresponding points in two images, we could estimate relative depth…

Page 39: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Summary• Depth from stereo: main idea is to triangulate from

corresponding image points.• Epipolar geometry defined by two cameras

– We’ve assumed known extrinsic parameters relating their poses

• Epipolar constraint limits where points from one view will be imaged in the other– Makes search for correspondences quicker

• Terms: epipole, epipolar plane / lines, disparity, rectification, intrinsic/extrinsic parameters, essential matrix, baseline

Page 40: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier

Coming up

– Computing correspondences– Non-geometric stereo constraints