cos 429: computer vision thanks to chris bregler
Post on 19-Dec-2015
218 views
TRANSCRIPT
COS 429: Computer VisionCOS 429: Computer Vision
• Instructor: Szymon RusinkiewiczInstructor: Szymon Rusinkiewicz [email protected]@cs.princeton.edu
• TA: Nathaniel DirksenTA: Nathaniel Dirksen [email protected]@cs.princeton.edu
• Course web pageCourse web page
http://www.cs.princeton.edu/courses/cs496/http://www.cs.princeton.edu/courses/cs496/
What is Computer Vision?What is Computer Vision?
• Input: images or videoInput: images or video
• Output: description of the worldOutput: description of the world
• But also: measuring, classifying, But also: measuring, classifying, interpreting visual informationinterpreting visual information
Low-Level or “Early” VisionLow-Level or “Early” Vision
• Considers local Considers local properties of an properties of an imageimage
““There’s an edge!”There’s an edge!”
Mid-Level VisionMid-Level Vision
• Grouping and Grouping and segmentationsegmentation
““There’s an object There’s an object and a background!”and a background!”
Big Question #1: Who Cares?Big Question #1: Who Cares?
• Applications of computer visionApplications of computer vision– In AI: vision serves as the “input stage”In AI: vision serves as the “input stage”
– In medicine: understanding human visionIn medicine: understanding human vision
– In engineering: model extractionIn engineering: model extraction
Vision and Other FieldsVision and Other Fields
Computer VisionComputer VisionArtificial IntelligenceArtificial Intelligence
Cognitive PsychologyCognitive Psychology Signal ProcessingSignal Processing
Computer GraphicsComputer Graphics
Pattern AnalysisPattern Analysis
MetrologyMetrology
Big Question #2: Does It Work?Big Question #2: Does It Work?
• Situation much the same as AI:Situation much the same as AI:– Some fundamental algorithmsSome fundamental algorithms
– Large collection of hacks / heuristicsLarge collection of hacks / heuristics
• Vision is hard!Vision is hard!– Especially at high level, physiology unknownEspecially at high level, physiology unknown
– Requires integrating many different Requires integrating many different methodsmethods
– Requires reasoning and understanding:Requires reasoning and understanding:“AI completeness”“AI completeness”
Computer and Human VisionComputer and Human Vision
• Emulating effects of human visionEmulating effects of human vision
• Understanding physiology of human Understanding physiology of human visionvision
• Analogues of human vision atAnalogues of human vision atlow, mid, and high levelslow, mid, and high levels
Image FormationImage Formation
• Human: lens forms Human: lens forms image on retina,image on retina,sensors (rods and sensors (rods and cones) respond to cones) respond to lightlight
• Computer: lens Computer: lens system forms image,system forms image,sensors (CCD, sensors (CCD, CMOS) respond to CMOS) respond to lightlight
Low-Level VisionLow-Level Vision
• Retinal ganglion cellsRetinal ganglion cells
• Lateral Geniculate Nucleus – function unknownLateral Geniculate Nucleus – function unknown(visual adaptation?)(visual adaptation?)
• Primary Visual CortexPrimary Visual Cortex– Simple cells: orientational sensitivitySimple cells: orientational sensitivity
– Complex cells: directional sensitivityComplex cells: directional sensitivity
• Further processingFurther processing– Temporal cortex: what is the object?Temporal cortex: what is the object?
– Parietal cortex: where is the object? How do I get it?Parietal cortex: where is the object? How do I get it?
Low-Level VisionLow-Level Vision
• Net effect: low-level human visionNet effect: low-level human visioncan be (partially) modeled as a set ofcan be (partially) modeled as a set ofmultiresolution, orientedmultiresolution, oriented filters filters
Low-Level Depth CuesLow-Level Depth Cues
• FocusFocus
• VergenceVergence
• StereoStereo
• Not as important as popularly believedNot as important as popularly believed
Low-Level Computer VisionLow-Level Computer Vision
• Filters and filter banksFilters and filter banks– Implemented via convolutionImplemented via convolution
– Detection of edges, corners, and other local featuresDetection of edges, corners, and other local features
– Can include multiple orientationsCan include multiple orientations
– Can include multiple scales: “filter pyramids”Can include multiple scales: “filter pyramids”
• ApplicationsApplications– First stage of segmentationFirst stage of segmentation
– Texture recognition / classificationTexture recognition / classification
– Texture synthesisTexture synthesis
Texture Analysis / SynthesisTexture Analysis / Synthesis
MultiresolutionMultiresolutionOrientedOrientedFilter BankFilter Bank
OriginalOriginalImageImage
ImageImagePyramidPyramid
Texture Analysis / SynthesisTexture Analysis / Synthesis
OriginalOriginalTextureTexture
SynthesizedSynthesizedTextureTexture
Heeger and BergenHeeger and Bergen
Low-Level Computer VisionLow-Level Computer Vision
• Optical flowOptical flow– Detecting frame-to-frame motionDetecting frame-to-frame motion
– Local operator: looking for gradientsLocal operator: looking for gradients
• ApplicationsApplications– First stage of trackingFirst stage of tracking
Low-Level Computer VisionLow-Level Computer Vision
• Shape from XShape from X– StereoStereo
– MotionMotion
– ShadingShading
– Texture foreshorteningTexture foreshortening
3D Reconstruction3D Reconstruction
Tomasi+KanadeTomasi+Kanade
Debevec,Taylor,MalikDebevec,Taylor,Malik Phigin et al.Phigin et al.
Forsyth et al.Forsyth et al.
Mid-Level VisionMid-Level Vision
• Physiology unclearPhysiology unclear
• Observations by Gestalt psychologistsObservations by Gestalt psychologists– ProximityProximity
– SimilaritySimilarity
– Common fateCommon fate
– Common regionCommon region
– ParallelismParallelism
– ClosureClosure
– SymmetrySymmetry
– ContinuityContinuity
– Familiar configurationFamiliar configuration WertheimerWertheimer
Mid-Level Computer VisionMid-Level Computer Vision
• TechniquesTechniques– Clustering based on similarityClustering based on similarity
– Limited work on other principlesLimited work on other principles
• ApplicationsApplications– Segmentation / groupingSegmentation / grouping
– TrackingTracking
Snakes: Active ContoursSnakes: Active Contours
Contour Evolution forContour Evolution forSegmenting an ArterySegmenting an Artery
Bayesian MethodsBayesian Methods
• Prior probabilityPrior probability– Expected distribution of modelsExpected distribution of models
• Conditional probability P(A|B)Conditional probability P(A|B)– Probability of observation AProbability of observation A
given model Bgiven model B
Bayesian MethodsBayesian Methods
• Prior probabilityPrior probability– Expected distribution of modelsExpected distribution of models
• Conditional probability P(A|B)Conditional probability P(A|B)– Probability of observation AProbability of observation A
given model Bgiven model B
• Bayes’s RuleBayes’s RuleP(B|A) = P(A|B) P(B|A) = P(A|B) P(B) / P(A) P(B) / P(A)
– Probability of model B given observation AProbability of model B given observation A
Thomas BayesThomas Bayes(c. 1702-1761)(c. 1702-1761)
Bayesian MethodsBayesian Methods
)|( aXP )|( aXP
)|( bXP )|( bXP
# black pixels# black pixels
# black pixels# black pixels
High-Level VisionHigh-Level Vision
• Computational mechanismsComputational mechanisms– Bayesian networksBayesian networks
– TemplatesTemplates
– Linear subspace methodsLinear subspace methods
– Kinematic modelsKinematic models
DataData
PCAPCA
New Basis VectorsNew Basis Vectors
Kirby et al.Kirby et al.
Principal Components Analysis Principal Components Analysis (PCA)(PCA)
Kinematic ModelsKinematic Models
• Optical Flow/Feature tracking: no constraints
• Layered Motion: rigid constraints
• Articulated: kinematic chain constraints
• Nonrigid: implicit / learned constraints
Course OutlineCourse Outline
• Image formation and captureImage formation and capture
• Filtering and feature detectionFiltering and feature detection
• Motion estimationMotion estimation
• Segmentation and clusteringSegmentation and clustering
• PCAPCA
• 3D projective geometry3D projective geometry
• Shape acquisitionShape acquisition
• Shape analysisShape analysis
Image-Based Modeling and Image-Based Modeling and RenderingRendering
Debevec et al.Debevec et al.
ManexManex
Course MechanicsCourse Mechanics
• 65%: 4 written / programming assignments65%: 4 written / programming assignments
• 35%: Final group project35%: Final group project
Course MechanicsCourse Mechanics
• Book:Book:Introductory Techniques for 3-D Computer Introductory Techniques for 3-D Computer VisionVisionEmanuele Trucco and Alessandro VerriEmanuele Trucco and Alessandro Verri
• PapersPapers
COS 429: Computer VisionCOS 429: Computer Vision
• Instructor: Szymon RusinkiewiczInstructor: Szymon Rusinkiewicz [email protected]@cs.princeton.edu
• TA: Nathaniel DirksenTA: Nathaniel Dirksen [email protected]@cs.princeton.edu
• Course web pageCourse web page
http://www.cs.princeton.edu/courses/cs496/http://www.cs.princeton.edu/courses/cs496/