Video Object Recognition Chenyi Chen

Post on 17-Dec-2015

TRANSCRIPT

Video Object Recognition

Chenyi Chen

Motion is important

• How important?

• Let’s first look at “Visual Parsing After Recovery From Blindness”

• This is a real “vision” paper

Background

• Study how three Indian patients (subjects) develop object recognition ability after long-term blindness

• Give treatment to the subjects

• During recovery, test the subjects to see how they perform on recognition tasks

Background

• The subjects are:

• S.K.: age 29, male, blind from birth, M.A. in political science

• J.A.: age 13, male, blind from birth, never received education

• P.B.: age 7, male, blind from birth

• Control group: 4 normally sighted adults, similar social background

Subjects’ parsing of static images

S.K. versus simple region partition algorithm

Dynamic information in object segregation

Motility rating and object recognition results

Follow-up testing after several months

What do we learn about developing visual parsing skill

• Early stages: integrative impairments and overfragmentation of images compromise recognition performance

• However, motion effectively mitigates these integrative difficulties

• Motion appears to be instrumental both in segregating objects and in binding their constituents into representations for recognition

• So we have some insight into how people develop visual recognition ability

• Can we reproduce the visual learning process on a robot?

• Let’s look at “Learning about Humans During the First 6 Minutes of Life”

A baby robot

Hypothesis in social development

• The infant brain is particularly sensitive to the presence of contingencies

• The contingency drives the definition and recognition of caregivers

• Human faces become attractive because they tend to occur in high contingency situations

Goal

• Test whether acoustic contingency information (sound) would be sufficient for the robot to develop preferences for human faces

• If so, get a sense for the time scale of the learning problem

A baby robot

Settings

• The baby robot interacted with the lab members while recording the images it saw

• A contingency detection engine analyzes the sound signal for the presence of contingencies

• Whether people were present is not specified

• Whether people were of any particular relevance is not specified

• The only training label is the acoustic contingency signal

Visual learning engine

• Probabilistic model

• Only needs the images to be weakly labeled as containing the object of interest with high or low probability; there is no need to indicate where the objects are located on the image plane

• Implementable in a neural network

• Runs in real time at video frame rate

Hardware

• Plush baby doll

• IEEE1394a webcam (captures images; only grayscale images used for training)

• Microphone (receives auditory signal)

• Loudspeaker (baby makes excited noise)

Collecting data

• Record the auditory and visual signals for 88 minutes

• 2877 positive examples

• 824 negative examples

• Baby robot was placed in a chair, a stroller, and a crib, with bright or dim lighting conditions

• 9 persons interacted with the baby robot

Collecting data

• Select 34 positive examples and 200 negative examples for training (approx. 5 min 34 sec). The rest are used for testing

• The label is noisy

Results

• Evaluation: 2-Alternative Forced Choice Task (2AFC)

• 86.17% correct on the face detection task (i.e., deciding which of two images contained a face)

• 89.7% correct on the contingency task (i.e., deciding which of two images was more likely to be associated with an auditory contingency)

• 92.3% correct on the person detection task (i.e., deciding which image contained a person).
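
The 2AFC scores above can be read as pairwise comparisons: the model is shown one positive and one negative image and is correct when it scores the positive one higher. A minimal sketch of that computation follows; the function name `two_afc_accuracy`, the tie handling, and the toy scores are assumptions for illustration, not taken from the paper.

```python
from itertools import product


def two_afc_accuracy(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Ties are counted as half correct (assumption: a tie is a coin flip).
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    correct = 0.0
    for p, n in product(pos_scores, neg_scores):
        if p > n:
            correct += 1.0
        elif p == n:
            correct += 0.5
    return correct / (len(pos_scores) * len(neg_scores))


if __name__ == "__main__":
    # Hypothetical detector scores for images with and without a face.
    faces = [0.9, 0.8, 0.4]
    non_faces = [0.7, 0.3]
    print(two_afc_accuracy(faces, non_faces))
```

Chance level on this task is 50%, which is the baseline the reported percentages should be compared against.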

Results

• Example images and their pixel-wise probability images

Results

• Infants showed a significant order of tracking preference in favor of the face stimulus, followed by the scrambled stimulus, followed by the empty stimulus

• The robot reproduces the preference order

• Video usually contains more data for object detector training

• There is a domain difference between video and still image

• So “Analysing domain shift factors between videos and images for object detection” is necessary

Goal

• For a given target test domain (image or video), the performance of the detector depends on the domain it was trained on.

• Examine the reasons behind this performance gap.

• Train an object detector with samples either from still images or from video frames and then test the detector on both domains.

Dataset

• Still images (VOC)

• PASCAL VOC 2007

• 10 classes of moving objects chosen

Dataset

• Video frames (VID)

• YouTube-Objects dataset

• 10 classes of moving objects

• A few more images were annotated so that the dataset has labels comparable to VOC

Equalizing the number of samples per class

• Equalize the training samples of VOC and VID

• 3097 in total over the 10 classes (Table 1)

• Only the equalized training sets are used

• trainVOC

• trainVID

Domain shift factors

• Spatial location accuracy: accuracy of bounding box

• Appearance diversity: consecutive frames in video are similar, thus less diverse

• Image quality: compression, motion blur etc. in video images

• Object detector: DPM

Spatial location accuracy

• Method of getting bounding box on video:

• PRE: worst

• FVS: better

• Manual label: best

Spatial location accuracy

• Reduces the gap by almost 4% (test on VOC)

Spatial location accuracy

• Equalization: using the ground truth (human labeled) bounding box on trainVID

Appearance diversity

• Near identical samples of an object in video

Appearance diversity

• Measure diversity:

• Clustering (agglomerative clustering, L2 distance of HOG features): each cluster contains visually very similar samples

• Measure appearance diversity by counting the number of clusters

• Equalization: resample training sets so the number of images and clusters (of trainVOC and trainVID) are equal
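
The diversity measure above can be sketched as a cluster count. The example below is only an illustration under stated assumptions: it uses complete-linkage merging with a distance cutoff `max_dist`, and short hand-written vectors stand in for HOG features. The linkage rule, the cutoff, and the names `l2` and `count_clusters` are hypothetical and not taken from the paper.

```python
import math


def l2(a, b):
    """Euclidean (L2) distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def count_clusters(features, max_dist):
    """Agglomerative clustering with complete linkage.

    Repeatedly merges the two closest clusters and stops when the closest
    pair is farther apart than max_dist. Returns the number of clusters,
    used here as a proxy for appearance diversity.
    """
    clusters = [[f] for f in features]

    def linkage(c1, c2):
        # Complete linkage: distance between the two farthest members.
        return max(l2(a, b) for a in c1 for b in c2)

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_dist:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return len(clusters)


if __name__ == "__main__":
    # Toy stand-ins for HOG vectors: near-duplicate frames vs. varied images.
    video_like = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
    image_like = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0), (5.0, 5.0)]
    print(count_clusters(video_like, 1.0), count_clusters(image_like, 1.0))
```

With the same number of samples, the video-like set collapses into fewer clusters, which is the effect the equalization step is meant to remove.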

Appearance diversity

Appearance diversity

• Bridges the gap by 3.5% (test on VOC)

Image quality

• Gradient energy: sum of gradient magnitudes in HOG cells

• Equalization: blur trainVOC by applying a Gaussian filter
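
To make the image quality measure concrete, here is a rough sketch showing that Gaussian blurring lowers gradient energy. It is a simplification under stated assumptions: the energy is summed over the whole image with forward differences rather than per HOG cell, borders are handled by clamping, and all function names are hypothetical.

```python
import math


def gradient_energy(img):
    """Sum of gradient magnitudes over a grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            # Forward differences, clamped at the image border.
            gx = img[y][min(x + 1, w - 1)] - img[y][x]
            gy = img[min(y + 1, h - 1)][x] - img[y][x]
            total += math.hypot(gx, gy)
    return total


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [wt / total for wt in weights]


def blur_rows(img, kernel):
    """Convolve every row with the kernel, clamping indices at the border."""
    r = len(kernel) // 2
    w = len(img[0])
    return [[sum(k * row[min(max(x + i - r, 0), w - 1)]
                 for i, k in enumerate(kernel))
             for x in range(w)]
            for row in img]


def transpose(img):
    return [list(col) for col in zip(*img)]


def gaussian_blur(img, sigma):
    """Separable Gaussian blur: filter the rows, then the columns."""
    k = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(img, k)), k))


if __name__ == "__main__":
    # Sharp vertical stripes: a toy stand-in for a crisp still image.
    sharp = [[float(x % 2) for x in range(8)] for _ in range(8)]
    blurred = gaussian_blur(sharp, 1.0)
    print(gradient_energy(sharp), gradient_energy(blurred))
```

Blurring the still images until their gradient energy matches that of the video frames is the idea behind the equalization bullet above.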

Image quality

Image quality

• Closes the gap by 1% (test on VOC)

Training-test set correlation

• The final 7% performance gap

• Domain-specific correlation/bias

• Find nearest neighbor of testing images in both training sets

Training-test set correlation

• According to nearest neighbor criterion

• testVOC is most similar to trainVOC

• testVID is most similar to trainVID

• Such correlation leads to the final performance gap

• Now we understand the gap between video domain and still image domain

• We still want to try transferring the knowledge learnt in video domain to image domain

• OK, then let’s look at “Learning Object Class Detectors from Weakly Annotated Video”

Benefits of video

• Easier to automatically segment the object from the background based on motion information

• Shows significant appearance variations of an object

• Provide a large number of training images

Pipeline

• Each video contains an object as indicated by the video tag

• Automatically localize object in video clips, output one bounding box for each video

• Learn a detector from the video images and corresponding bounding boxes

• Domain adaptation

• Test the detector on the PASCAL 07 dataset

Localizing objects in real-world videos

• Extract shots of coherent motion from each video

• Robustly fit spatio-temporal bounding boxes (tubes) to each shot (3 to 15 tubes per shot)

• Jointly select one tube per video by minimizing an energy function of similarity

• The selected tubes are the output of the algorithm (used to train a detector)

Localizing objects in real-world videos

Localizing objects in real-world videos

• Temporal partitioning into shots

• Abrupt changes of the visual content of the video

• Thresholding color histogram differences in consecutive frames
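
The shot partitioning above can be sketched as a histogram comparison between consecutive frames. This is a minimal illustration under stated assumptions: frames are flat lists of grayscale values instead of color images, the histogram has 4 bins, and the L1 distance with a threshold of 0.5 is an arbitrary choice. The function names are hypothetical.

```python
def histogram(frame, bins=4, max_value=256):
    """Normalized intensity histogram of a frame (flat list of pixel values)."""
    counts = [0] * bins
    for v in frame:
        counts[min(v * bins // max_value, bins - 1)] += 1
    n = len(frame)
    return [c / n for c in counts]


def shot_boundaries(frames, threshold=0.5):
    """Indices i such that a new shot starts at frame i.

    A boundary is declared when the L1 distance between the histograms
    of frame i-1 and frame i exceeds the threshold.
    """
    if not frames:
        return []
    cuts = []
    prev = histogram(frames[0])
    for i in range(1, len(frames)):
        cur = histogram(frames[i])
        diff = sum(abs(a - b) for a, b in zip(prev, cur))
        if diff > threshold:
            cuts.append(i)
        prev = cur
    return cuts


if __name__ == "__main__":
    # Two dark frames followed by two bright frames: one abrupt change.
    frames = [[10] * 16, [12] * 16, [200] * 16, [205] * 16]
    print(shot_boundaries(frames))
```

In practice the histograms would be computed per color channel, and the threshold would be tuned on real video.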

Localizing objects in real-world videos

• Forming candidate tubes

• Large-displacement optical flow (LDOF)

• Clustering the dense point tracks based on the similarity in their motion and proximity in location

• Fit a spatio-temporal bounding box to each motion segment

Localizing objects in real-world videos

• Example of tubes

Localizing objects in real-world videos

• Joint selection of tubes

• Energy function

• Each shot s has multiple candidate tubes

• Select one tube from the candidates for each shot s

• Selected tubes over all the shots

• Coefficient α

Localizing objects in real-world videos

• The pairwise potential

• Measure appearance dissimilarity

• Encourage selecting tubes that look similar over time

• Tubes l_s, l_q in two different shots s, q

• Dissimilarity functions Δ (two types of features) compare the appearance of the two tubes

Localizing objects in real-world videos

• The unary potential

• Measure the cost of selecting tube l_s in shot s

• Δ: prefer tubes that are visually homogeneous

• Γ: percentage of bounding-box perimeter touching the border of the image

• Ω: objectness probability of the bounding box

Localizing objects in real-world videos

• Minimization

• Find the configuration L* of tubes over all shots that minimizes energy E

• L* is the final output and will be used to train an object detector
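
As an illustration of the minimization step, the sketch below picks one candidate tube per shot by exhaustively minimizing a sum of unary costs and pairwise dissimilarities. Everything in it is an assumption for illustration: the energy layout, the `weight` parameter, the scalar "appearance" values, and the name `select_tubes` are hypothetical, and exhaustive search is a stand-in for whatever optimizer the paper actually uses.

```python
from itertools import product


def select_tubes(unary, pairwise, weight=1.0):
    """Pick one tube per shot by brute-force energy minimization.

    unary[s][i]          : cost of choosing candidate tube i in shot s
    pairwise(s, i, q, j) : dissimilarity between tube i of shot s and
                           tube j of shot q
    Returns (best configuration as a tuple of tube indices, its energy).
    """
    best_cfg, best_energy = None, float("inf")
    for cfg in product(*(range(len(costs)) for costs in unary)):
        energy = sum(unary[s][i] for s, i in enumerate(cfg))
        energy += weight * sum(
            pairwise(s, cfg[s], q, cfg[q])
            for s in range(len(cfg))
            for q in range(s + 1, len(cfg))
        )
        if energy < best_energy:
            best_cfg, best_energy = cfg, energy
    return best_cfg, best_energy


# Toy data: one scalar "appearance" per candidate tube in three shots.
APPEARANCE = [[0.0, 5.0], [5.1, 0.2], [0.1, 9.0]]
UNARY = [[1.0, 0.9], [0.5, 0.5], [0.3, 0.3]]


def appearance_gap(s, i, q, j):
    return abs(APPEARANCE[s][i] - APPEARANCE[q][j])


if __name__ == "__main__":
    print(select_tubes(UNARY, appearance_gap))
```

Brute force is exponential in the number of shots, so it only works for toy inputs like this one; it is meant to show what "jointly select" means, not how to do it at scale.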

Results

• Automatic tube selection

• Compare with ground truth bounding box (IoU>=50%)

• The automatic tube selection technique selects the best available tube most of the time
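
The IoU criterion used above is the standard intersection-over-union of two boxes. A minimal sketch, assuming boxes are given as `(x_min, y_min, x_max, y_max)` in continuous coordinates (the box format and function names are assumptions):

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_correct(predicted, ground_truth, threshold=0.5):
    """A localization counts as correct when IoU reaches the threshold."""
    return iou(predicted, ground_truth) >= threshold


if __name__ == "__main__":
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(is_correct((0, 0, 10, 10), (1, 1, 10, 10)))
```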

Learning a detector from the selected tubes

• Sampling bounding boxes

• Reduce the number to a manageable quantity

• Select samples more likely to contain relevant objects (using Γ and Ω exactly)

• Train the object detector

• DPM

• SPM

Results

• Models:

• VOC: model trained on PASCAL dataset

• VMA: model trained on manually annotated frames from video

• VID: model trained on video with the proposed automatic pipeline

• Test on PASCAL dataset

Results

• Object detector without domain adaptation

• The bounding boxes generated by the proposed pipeline are close to manually labeled ones

• Performance gap across domains is large

Domain adaptation

• Domain difference:

• Higher HOG gradient energy in images

• SVM based on GIST feature to distinguish video from images, accuracy 83%

Domain adaptation

• Large quantity of video (source domain) training data, small number of PASCAL image (target domain) training data

• Adaptation methods:

• All: directly train a single classifier using the union of all available training data

• Pred: use the output of the source classifier as an additional feature for training the target classifier

Domain adaptation

• Adaptation methods (cont.):

• Prior: the parameters of the source classifier are used as a prior when learning the target classifier

• LinInt: first train two separate classifiers f_s(x), f_t(x) from the source and the target training data, and then linearly interpolate their predictions on new target data at test time
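
The LinInt idea can be sketched in a few lines. The mixing weight `lam` and the convention that it multiplies the source classifier are assumptions for illustration; the two toy scoring functions are hypothetical stand-ins for trained classifiers.

```python
def lin_int(f_source, f_target, lam):
    """Return a classifier that linearly interpolates two score functions.

    lam = 1 uses only the source (video) classifier,
    lam = 0 uses only the target (image) classifier.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")

    def combined(x):
        return lam * f_source(x) + (1.0 - lam) * f_target(x)

    return combined


def toy_source(x):
    """Hypothetical score of a classifier trained on video data."""
    return 2.0 * x


def toy_target(x):
    """Hypothetical score of a classifier trained on image data."""
    return x - 1.0


if __name__ == "__main__":
    f = lin_int(toy_source, toy_target, lam=0.25)
    print(f(4.0))
```

Because the two classifiers are trained separately, setting `lam` to 0 recovers the target-only model, which fits the later remark that LinInt avoids negative transfer. In practice `lam` would be chosen on held-out target data.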

Results

• Object detector with domain adaptation

• Improvement w.r.t. VOC model

• Most methods (those that combine VOC training data at an early stage) degrade performance

• LinInt is immune to negative transfer

• Knowledge can be transferred from the video domain to the image domain

• We can not only automatically output bounding boxes on video, but also automatically segment the video into background and foreground object

• Let’s look at “Fast object segmentation in unconstrained video”

Goal

• Propose an automatic technique for separating foreground objects from the background in a video

• Two main stages:

• 1) Efficient initial foreground estimation

• 2) Foreground-background labeling refinement

Efficient initial foreground estimation

• Optical flow: supports large displacement and efficient GPU implementation

• Motion boundaries:

• magnitude of the gradient of the optical flow field

• difference in direction between the motion of pixel p and its neighbors N (if p is moving in a different direction than all its neighbors, it is likely to be a motion boundary)

Efficient initial foreground estimation

Efficient initial foreground estimation

• Problems with the motion boundaries:

• Do not completely cover the whole object boundary

• Subject to false positives

Efficient initial foreground estimation

• Inside-outside map (e.g. pixel level, 0: outside, 1: inside)

• Estimates whether a pixel is inside the object based on the point-in-polygon problem

• Any ray originating inside a closed curve intersects it an odd number of times. Any ray originating outside intersects it an even number of times.

Efficient initial foreground estimation

• Inside-outside map (cont.)

• Incomplete motion boundary

• Shooting 8 rays spaced by 45 degrees

• Majority vote for final decision

• Optimized data structure for linear time implementation when computing the map
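
The ray-casting vote can be sketched on a tiny binary boundary image. This is a naive version for illustration only: it walks every ray pixel by pixel instead of using the optimized linear-time data structure, it counts a crossing whenever a ray enters a run of boundary pixels, and all names are hypothetical.

```python
# Eight ray directions, 45 degrees apart, as (dy, dx) steps on the pixel grid.
DIRECTIONS = [(-1, 0), (-1, 1), (0, 1), (1, 1),
              (1, 0), (1, -1), (0, -1), (-1, -1)]


def count_crossings(boundary, y, x, dy, dx):
    """Number of times a ray leaving (y, x) enters a run of boundary pixels."""
    h, w = len(boundary), len(boundary[0])
    crossings = 0
    on_boundary = bool(boundary[y][x])
    y, x = y + dy, x + dx
    while 0 <= y < h and 0 <= x < w:
        now = bool(boundary[y][x])
        if now and not on_boundary:
            crossings += 1
        on_boundary = now
        y, x = y + dy, x + dx
    return crossings


def inside_outside_map(boundary):
    """Label a pixel 1 (inside) when most of its 8 rays cross an odd number of times."""
    h, w = len(boundary), len(boundary[0])
    result = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            odd_votes = sum(count_crossings(boundary, y, x, dy, dx) % 2
                            for dy, dx in DIRECTIONS)
            if 2 * odd_votes > len(DIRECTIONS):
                result[y][x] = 1
    return result


def square_with_gap():
    """7x7 grid with a square outline from (1, 1) to (5, 5) and one missing pixel."""
    grid = [[0] * 7 for _ in range(7)]
    for i in range(1, 6):
        grid[1][i] = grid[5][i] = grid[i][1] = grid[i][5] = 1
    grid[1][3] = 0  # simulate an incomplete motion boundary
    return grid


if __name__ == "__main__":
    for row in inside_outside_map(square_with_gap()):
        print(row)
```

The gap in the outline flips one ray for a few interior pixels, but the majority vote still labels them as inside, which is why several rays are cast instead of one.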

Foreground-background labelling refinement

• Pixel labelling problem with two labels (foreground and background)

• Oversegment each frame into superpixels

• Assign labels to superpixels

• Superpixel i at frame t takes a label

• All superpixels’ labels in all frames

Foreground-background labelling refinement

• Energy function

• The output segmentation minimizes this energy

• Minimize with graph-cut

Foreground-background labelling refinement

• In the energy function:

• A: appearance model, one for foreground, one for background. Estimated based on the inside-outside map

• L: location model, propagate the per-frame inside-outside maps over time to build a more complete location prior

• V: spatial smoothness potential defined over edges

• W: temporal smoothness potential defined over edges

Experiment evaluation

• SegTrack dataset: 6 videos

• Evaluation: number of wrongly labeled pixels averaged over all frames

Experiment evaluation

• Considerably outperforms [6, 4, 18]

• On par with [14], which is remarkable, given that the proposed approach is simpler

• [27] achieves lower error, but is much slower

• The SegTrack dataset is saturated

Experiment evaluation

• YouTube-Objects dataset:

• 10 diverse object classes

• Ground truth bounding box provided for some frames

• Fit bounding box to largest connected component of the segmentation output

• Evaluation: PASCAL criterion (IoU>=0.5)
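
Fitting a box to the largest connected component of a segmentation mask can be sketched with a breadth-first search. The 4-connectivity, the `(x_min, y_min, x_max, y_max)` pixel-index box format, and the function name are assumptions for illustration.

```python
from collections import deque


def largest_component_bbox(mask):
    """Bounding box (x_min, y_min, x_max, y_max) of the largest 4-connected
    foreground component of a binary mask, or None if the mask is empty."""
    h = len(mask)
    w = len(mask[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    best_size, best_box = 0, None
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Flood-fill one component, tracking its size and extent.
            queue = deque([(y, x)])
            seen[y][x] = True
            size = 0
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cy, cx = queue.popleft()
                size += 1
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w \
                            and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if size > best_size:
                best_size, best_box = size, (x0, y0, x1, y1)
    return best_box


if __name__ == "__main__":
    mask = [
        [1, 1, 0, 0, 0],
        [1, 1, 0, 0, 1],
        [0, 0, 0, 0, 1],
        [0, 0, 0, 0, 0],
    ]
    print(largest_component_bbox(mask))
```

The resulting box can then be compared against the ground truth box with the IoU criterion from the bullet above.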

Experiment evaluation

Experiment evaluation

• Runtime:

• Intel Core i7 2.0GHz machine

• Given optical flow and superpixels, it takes 0.5 sec/frame

• Considerably faster than the other strong baselines (typically >100 sec/frame)

• With videos, we are able to extract objects, but we can do something even crazier: reveal subtle movements of objects

• Finally, let’s look at “Eulerian Video Magnification for Revealing Subtle Changes in the World”

Goal

• Reveal temporal variations in videos that are difficult or impossible to see with the naked eye and display them in an indicative manner

• Input: standard video sequence

• Output: amplified signal to reveal hidden information

Results

• https://www.youtube.com/watch?v=e9ASH8IBJ2U

It’s amazing, right?

How it works?

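• First-order motion example:

• $I(x, t)$: image intensity at position x and time t

• $\delta(t)$: displacement function, with $I(x, t) = f(x + \delta(t))$ and $I(x, 0) = f(x)$

• Motion magnification: synthesize the signal $\hat{I}(x, t) = f(x + (1 + \alpha)\,\delta(t))$

• α: amplification factor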

How it works?

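• $B(x, t)$: the result of applying a broadband temporal bandpass filter to $I(x, t)$, picking out everything except $f(x)$; to first order, $B(x, t) = \delta(t)\,\partial f(x) / \partial x$

• Define $\tilde{I}(x, t) = I(x, t) + \alpha B(x, t)$

• Then $\tilde{I}(x, t) \approx f(x) + (1 + \alpha)\,\delta(t)\,\partial f(x) / \partial x$

• And $\tilde{I}(x, t) \approx f(x + (1 + \alpha)\,\delta(t))$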
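
A small numeric check can make the first-order argument tangible. The sketch below assumes a 1-D signal f(x) = sin(x), an observed signal f(x + δ), and an idealized temporal filter whose output is exactly the observed signal minus f(x). These choices and all function names are assumptions for illustration, not the paper's setup.

```python
import math


def f(x):
    """Static 1-D intensity profile (assumption: a sine wave)."""
    return math.sin(x)


def observed(x, delta):
    """Observed intensity after the profile is displaced by delta."""
    return f(x + delta)


def magnified(x, delta, alpha):
    """Eulerian-style magnification: add alpha times the filtered signal.

    The band-passed signal is idealized as everything except f(x).
    """
    band = observed(x, delta) - f(x)
    return observed(x, delta) + alpha * band


def true_magnified(x, delta, alpha):
    """What an exactly magnified motion would look like."""
    return f(x + (1.0 + alpha) * delta)


if __name__ == "__main__":
    x, delta = 0.3, 0.01
    for alpha in (10.0, 200.0):
        error = abs(magnified(x, delta, alpha) - true_magnified(x, delta, alpha))
        print(alpha, error)
```

For a small amplification the two signals nearly agree. For a very large amplification the linear approximation breaks down, which is where the limits on α discussed in the following slides come from.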

How it works?

How it works?

• General case: δ(t) is not entirely within the passband of the temporal filter B(x,t)

• $\delta_k(t)$: the different temporal spectral components of δ(t); each is attenuated by the filter by a factor $\gamma_k$

• Then, let $\alpha_k = \gamma_k \alpha$

How it works?

• We need $(1 + \alpha)\,\delta(t) < \lambda / 8$, where λ is the spatial wavelength

• For high spatial frequencies and a large amplification factor α, the first-order Taylor expansion may not hold

• So: the higher the spatial frequency, the smaller α must be

How it works?

• Artifacts

How it works?

• Multiscale analysis

• Scale-varying process

• Assign different spatial frequencies with different magnification factor α
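
One way to picture the scale-varying process is to cap α separately for each spatial wavelength. The sketch below assumes the bound (1 + α)·δ < λ/8 from the Eulerian video magnification paper (the formula is not legible in this transcript) and simply clamps the requested α at that limit. The clamping rule and the function names are assumptions, not necessarily the paper's exact schedule.

```python
def max_alpha(wavelength, delta):
    """Limit on alpha implied by (1 + alpha) * delta < wavelength / 8."""
    if wavelength <= 0 or delta <= 0:
        raise ValueError("wavelength and delta must be positive")
    return wavelength / (8.0 * delta) - 1.0


def alpha_for_band(alpha, wavelength, delta):
    """Requested alpha, reduced where the spatial wavelength is too short."""
    return max(0.0, min(alpha, max_alpha(wavelength, delta)))


if __name__ == "__main__":
    requested_alpha, delta = 20.0, 0.1
    # Short wavelengths (high spatial frequencies) get a smaller alpha.
    for wavelength in (2.0, 8.0, 80.0):
        print(wavelength, alpha_for_band(requested_alpha, wavelength, delta))
```

Coarse pyramid levels with long wavelengths keep the full amplification, while fine levels are attenuated, which matches the bullet above about assigning different factors to different spatial frequencies.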

Pipeline

Pipeline

• Spatial decomposition: video pyramid constructed by a separable binomial filter of size five

• Example temporal filter B(x,t), task specific

Thank you!