Visual Computing Theory and Engineering Applications - Marr’s Vision and Beyond Prof. Li Song(宋利) http://medialab.sjtu.edu.cn Shanghai Jiao tong University


Post on 06-Jul-2020


Page 1: Visual Computing Theory and Engineering …medialab.sjtu.edu.cn/teaching/CV/Lec/lec2-MarrModel.pdfMarr’s computational analysis of visual system •Two basic conclusions from his

Visual Computing Theory and Engineering

Applications

- Marr’s Vision and Beyond

Prof. Li Song(宋利)

http://medialab.sjtu.edu.cn

Shanghai Jiao Tong University


Overview

Marr’s Vision Theory


Marr’s approach

• Distinguished different explanatory tasks at different levels

• Gave a general theoretical framework for combining them

• Applied the framework in considerable detail to a single example: the early visual system


Marr’s three levels

• 3 different types of analysis of an information-processing system

• Computational

• Algorithmic

• Implementational


Computational analysis

• Form of task analysis of a cognitive system

▪ (a) Identify the specific information-processing problem that the system is configured to solve

▪ (b) Identify general constraints upon any solution to that problem


Algorithmic analysis

• Explains how the cognitive system actually performs the information-processing task

• Identify input information and output information

• Identify algorithm for transforming input into required output

• Specify how information is encoded


Implementational analysis

• Finds a physical realization for the algorithm

• Identify neural structures realizing the basic representational states to which the algorithm applies [e.g. populations of neurons]

• Identify neural mechanisms that transform those representational states according to the algorithm


Marr’s computational analysis of visual system

• Two basic conclusions from his task analysis

• The visual system’s job is to provide a 3D representation of the visual environment that can serve as input to recognition and classification processes: primarily information about the shape of objects and their spatial distribution

• This 3D representation is in an object-centered rather than a viewer-centered frame of reference


Experimental evidence

• Possibility of double dissociations between perceptual abilities and recognition abilities

▪ Right parietal lesions: recognition abilities preserved, but problems in perceiving shapes from unusual perspectives

▪ Left parietal lesions: shape perception intact, but recognition and identification impaired

▪ Suggested to Marr that the visual system provides input to recognition systems


Theoretical considerations

• Recognition abilities are constant across changes in how things look to the perceiver due to

▪ orientation of the object

▪ its distance from the perceiver

▪ partial occlusion by other objects

• The visual system provides information to recognition systems that abstracts away from these perspectival features: an observer-independent representation


Algorithmic analysis

• Input = light arriving at retina

• Output = 3D representation of environment

• Questions:

▪ What sort of information is extracted from the light at the retina?

▪ How does the system get from this information to a 3D representation of the environment?


The challenge

• “From an information-processing point of view, our primary purpose is to define a representation of the image of reflectance changes on a surface that is suitable for detecting changes in the image’s geometrical organization that are due to changes in the reflectance of the surface itself or to changes in the surface’s orientation or distance from the viewer” (Marr, Vision p. 44)

• Need to find representational primitives that allow inference backwards from structure of image to structure of environment


Representational primitives

• Basic information at retina = intensity value of light at each point in the retinal image

▪ Changes in intensity value provide clues as to surface boundaries

• Primitives allow structure to be imposed on patterns of intensity changes

▪ E.g. zero crossings (sudden intensity changes)


Zero crossings

• If we plot the second derivative of intensity on a graph, then sharp intensity changes will be signaled by the curve crossing zero

• Marr proposed a Laplacian of Gaussian (LoG) filter to detect zero crossings
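As an illustration of the idea on this slide, here is a minimal one-dimensional sketch in plain Python. It filters a step signal with the second derivative of a Gaussian (the 1D analogue of the Laplacian of Gaussian) and reports where the response changes sign. The helper names, the test signal, the kernel radius of 4σ, and the tolerance `tol` are all assumptions made for this example, not part of the lecture.

```python
import math

def log_kernel_1d(sigma):
    """Second derivative of a Gaussian (up to a constant factor), sampled at integers."""
    radius = int(math.ceil(4 * sigma))
    kernel = []
    for x in range(-radius, radius + 1):
        g = math.exp(-x * x / (2 * sigma * sigma))
        kernel.append((x * x - sigma * sigma) / sigma ** 4 * g)
    # Subtract the residual mean so that constant regions give (numerically) zero response.
    mean = sum(kernel) / len(kernel)
    return [k - mean for k in kernel]

def convolve_valid(signal, kernel):
    """Convolution restricted to positions where the kernel fits entirely."""
    m = len(kernel)
    flipped = kernel[::-1]
    return [sum(signal[i + j] * flipped[j] for j in range(m))
            for i in range(len(signal) - m + 1)]

def zero_crossings(response, offset, tol=1e-9):
    """Signal indices just before a sign change, ignoring numerically zero values."""
    crossings = []
    for i in range(len(response) - 1):
        a, b = response[i], response[i + 1]
        if min(abs(a), abs(b)) > tol and a * b < 0:
            crossings.append(i + offset)
    return crossings

signal = [10.0] * 30 + [50.0] * 30      # step edge between samples 29 and 30
kernel = log_kernel_1d(sigma=2.0)
radius = len(kernel) // 2
response = convolve_valid(signal, kernel)
edges = zero_crossings(response, offset=radius)
print(edges)  # [29]
```

The response is positive on the dark side of the step and negative on the bright side, so the single sign change marks the edge location.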


Primal sketch

• identifies intensity changes in the 2D image

• basic information about the geometric organization of those intensity changes

• Primitives include: zero-crossings, virtual lines, groups


2.5D sketch

• Displays orientation of visible surfaces in viewer-centered coordinates

• Represents distance of each point in visual field from viewer

• Also orientation of each point and contours of discontinuities

• Very basic information about depth
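To make the content of the 2.5D sketch concrete, the following minimal sketch stores viewer-centered depth per pixel and derives a surface orientation (unit normal) from it by central differences. The function name, the list-of-rows depth map, and the sign convention (normal proportional to (-∂z/∂x, -∂z/∂y, 1)) are assumptions chosen for this illustration.

```python
import math

def surface_normals(depth):
    """Unit normals at interior pixels of a depth map given as a list of rows."""
    rows, cols = len(depth), len(depth[0])
    normals = {}
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            # Central differences approximate the depth gradient.
            dz_dx = (depth[y][x + 1] - depth[y][x - 1]) / 2.0
            dz_dy = (depth[y + 1][x] - depth[y - 1][x]) / 2.0
            nx, ny, nz = -dz_dx, -dz_dy, 1.0
            length = math.sqrt(nx * nx + ny * ny + nz * nz)
            normals[(x, y)] = (nx / length, ny / length, nz / length)
    return normals

# A planar surface receding to the right: depth grows by 0.5 per pixel in x.
depth = [[5.0 + 0.5 * x for x in range(5)] for y in range(4)]
normals = surface_normals(depth)
n = normals[(2, 1)]
print(n)
```

Every interior pixel of the plane gets the same normal, tilted away from the viewing direction along x. A depth discontinuity would show up as an abrupt change in these values.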


3D sketch

• characterizes shapes and their spatial organization

• object-centered

• basic volumetric and surface primitives are schematic (facilitates recognition)


Representation in the 3D sketch

• depends upon many shapes being recognizable as ensembles of generalized cones

• Generalized cones are easy to represent

▪ vector describing path of the figure’s axis of symmetry

▪ vector specifying perpendicular distance from every point on axis to shape’s surface
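The two-vector description above can be sketched as follows: one list gives sample points along the axis, the other gives the perpendicular distance to the surface at each of those points. This is a simplified illustration that assumes circular cross-sections and a straight axis parallel to z. The function name and the sampling density are hypothetical.

```python
import math

def generalized_cone(axis_points, radii, samples=8):
    """Surface points of a generalized cone with circular cross-sections.

    Assumes the axis is parallel to z, so each cross-section lies in an x-y plane.
    """
    surface = []
    for (ax, ay, az), r in zip(axis_points, radii):
        ring = []
        for k in range(samples):
            theta = 2 * math.pi * k / samples
            ring.append((ax + r * math.cos(theta), ay + r * math.sin(theta), az))
        surface.append(ring)
    return surface

axis = [(0.0, 0.0, float(z)) for z in range(5)]   # path of the axis of symmetry
radii = [2.0 - 0.4 * z for z in range(5)]         # distance from axis to surface
rings = generalized_cone(axis, radii)
print(len(rings), len(rings[0]))  # 5 8
```

Changing only the `radii` list turns the same axis into a cylinder, a cone, or a vase-like shape, which is why the representation is compact.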


20 Years After Marr

Recovering 3D shape of object by exploiting more constraints


Let us go back to Marr’s theory

The basic processing is as follows:



Modules

• Vision processing is organized into functional modules that are almost independent.

• Thus we can focus on one specific function or algorithm at each step

• Let’s begin with image representation


Image – math viewpoint


Image-DSP viewpoint


Image-vision viewpoint


Image-storage viewpoint

• Let’s open an image file in its “raw” format:

P6 (this is a PPM image)

Resolution: 512x512

Depth: 255 (maximum sample value, i.e. 8 bits per pixel in each channel)
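The header fields listed above can be read with a few lines of Python. The sketch below parses a binary PPM (P6) header from a byte string and returns the offset at which the pixel data starts. The function name and the tiny 2x1 example image are assumptions for illustration. For the 512x512 image on the slide, the pixel block would hold 512 × 512 × 3 = 786432 bytes.

```python
def parse_ppm_header(data):
    """Parse a binary PPM (P6) header; return (width, height, maxval, pixel_offset)."""
    tokens = []
    pos = 0
    while len(tokens) < 4:
        # Skip whitespace between header fields.
        while data[pos:pos + 1].isspace():
            pos += 1
        if data[pos:pos + 1] == b"#":
            # Comment line: skip to the end of the line.
            while data[pos:pos + 1] not in (b"\n", b""):
                pos += 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])
    magic, width, height, maxval = tokens
    if magic != b"P6":
        raise ValueError("not a binary PPM file")
    # Exactly one whitespace byte separates maxval from the pixel data.
    return int(width), int(height), int(maxval), pos + 1

pixels = bytes([255, 0, 0, 0, 255, 0])            # one red pixel, one green pixel
data = b"P6\n# tiny example\n2 1\n255\n" + pixels
width, height, maxval, offset = parse_ppm_header(data)
rgb = data[offset:offset + 3 * width * height]
print(width, height, maxval, list(rgb[:3]))  # 2 1 255 [255, 0, 0]
```

With a maximum value below 256, each channel occupies one byte, so pixel (x, y) starts at `offset + 3 * (y * width + x)`.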


Image-computing viewpoint


From Image to Representation

• Intensity is affected by: geometry, reflection, lighting, and the observation point

• A representation consists of tokens (points, lines, edges, and their combinations) and the values of their attributes; the tokens correspond to real physical changes on the surfaces of objects and can be used to infer structure

• The human eye scans an unknown object by tracking its contours, which are connected edges

• If we can successfully extract the edges of an object, recognizing it could be easy.


From token to sketch

• Physical assumptions:

▪ Surface, levels, similarity, continuity

• Properties of the primal sketch:

▪ Consists of primitives and reflects local image structure

▪ Steps: zero crossing -> primal sketch -> full primal sketch

• Zero crossing

▪ The 2nd derivative is zero; has a biological explanation

• Raw primal sketch

▪ Tokens: edge, blob, bar, and discontinuity

▪ Local configurations: similarity and configuration between tokens

• Full primal sketch

▪ Selection, combination, discrimination

▪ Forms a meaningful representation in a multi-scale way


Edge is the most important token


Edge’s mathematical features


Edge extraction algorithm

• Hundreds of methods

▪ 1959: Julesz, “A Method of Coding TV Signals Based on Edge Detection,” Compression, Video, Television.

▪ 1963: L. G. Roberts pioneered systematic research on edge detection

▪ …


Edge detection by filtering


Convolution(1)


Convolution(2)


Convolution(3)


Convolution(4)


Convolution(4)
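As a reminder of the operation these slides refer to, here is a minimal sketch of discrete 2D convolution in plain Python, applied with a Sobel-style horizontal-gradient kernel to a small image containing one vertical edge. The function name, the kernel choice, and the test image are assumptions for this example and are not taken from the slides.

```python
def convolve2d_valid(image, kernel):
    """2D convolution over positions where the kernel fits entirely inside the image."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # Flip the kernel in both directions (true convolution, not correlation).
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            acc = 0.0
            for j in range(kh):
                for i in range(kw):
                    acc += image[y + j][x + i] * flipped[j][i]
            row.append(acc)
        out.append(row)
    return out

sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
image = [[0, 0, 0, 9, 9, 9] for _ in range(4)]   # dark left half, bright right half
response = convolve2d_valid(image, sobel_x)
print(response)  # [[0.0, -36.0, -36.0, 0.0], [0.0, -36.0, -36.0, 0.0]]
```

The response is zero in uniform regions and large in magnitude where the window straddles the edge. The negative sign comes from the kernel flip; taking the absolute value gives an edge strength.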


Edge Filter(1)


Edge Filtering(2)


Edge Filtering(3)


Edge Filtering(4)


Edge Filtering(5)


Canny detector(1)


Canny detector(2)


Canny detector(3)


Canny detector(4)


Canny detector(5)


2nd order edge filter(1)


2nd order edge filter(2)


2nd order edge filter(3)


2nd order edge filter(4)


Marr-Hildreth Edge Detector


Marr-Hildreth Edge Detector


Marr-Hildreth Edge Detector


Marr’s and Canny’s


But edge extraction is hard

• A very difficult problem! In natural images, people can easily see many obvious edges, yet algorithms fail to extract them or return useless edges!

• There is still no general and robust edge detection algorithm!

• Why?

▪ Many factors; one of them is multiscale…
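A small one-dimensional experiment illustrates why scale matters. The sketch below smooths a piecewise-constant signal with Gaussians of two widths and counts local maxima of the derivative magnitude above a threshold. The signal (one step of height 10 and four steps of height 1), the threshold of 0.2, and all function names are assumptions chosen for this illustration.

```python
import math

def gaussian_kernel(sigma):
    radius = int(math.ceil(4 * sigma))
    weights = [math.exp(-d * d / (2 * sigma * sigma)) for d in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, kernel):
    """Gaussian smoothing, keeping only positions where the kernel fits."""
    m = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(m))
            for i in range(len(signal) - m + 1)]

def count_edges(signal, sigma, threshold):
    """Count local maxima of |first derivative| above threshold at scale sigma."""
    s = smooth(signal, gaussian_kernel(sigma))
    mag = [abs(s[i + 1] - s[i - 1]) / 2.0 for i in range(1, len(s) - 1)]
    count = 0
    for i in range(1, len(mag) - 1):
        if mag[i] > threshold and mag[i] > mag[i - 1] and mag[i] >= mag[i + 1]:
            count += 1
    return count

def step_signal(length, steps):
    """Piecewise-constant signal; steps maps position -> height added from there on."""
    level, out = 0.0, []
    for i in range(length):
        level += steps.get(i, 0.0)
        out.append(level)
    return out

signal = step_signal(120, {20: 1.0, 35: 1.0, 60: 10.0, 85: 1.0, 100: 1.0})
fine = count_edges(signal, sigma=1.0, threshold=0.2)
coarse = count_edges(signal, sigma=4.0, threshold=0.2)
print(fine, coarse)  # 5 1
```

At the fine scale all five steps are reported; at the coarse scale only the large one survives. Neither answer is wrong, which is one reason a single fixed-scale detector cannot be robust in general.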


Multiscale


Multiscale everywhere


Multi-scale edge


Homework

• Further Reading

▪ Learning to Detect Natural Image Boundaries Using Local Brightness, Color, and Texture Cues, IEEE TPAMI 2004

▪ Contour Detection and Hierarchical Image Segmentation, IEEE TPAMI 2011

▪ DeepEdge: A Multi-Scale Bifurcated Deep Network for Top-Down Contour Detection, CVPR 2015

▪ DeepContour: A Deep Convolutional Feature Learned by Positive-sharing Loss for Contour Detection, CVPR 2015

▪ Object Contour Detection with a Fully Convolutional Encoder-Decoder Network, CVPR 2016