
  • Visual Tracking course @ UCU

    Day 1: Intro to tracking

    Dmytro Mishkin

    Center for Machine Perception

    Czech Technical University in Prague

    Lviv, Ukraine

    2017

  • Heavily based on tutorial:

    Visual Tracking

    by Jiri Matas

    “…Although tracking itself is by and large a solved problem…”

    -- Jianbo Shi & Carlo Tomasi, CVPR 1994

  • Course plan

    Day 1: Basics

    Visual tracking: not one, but many problems.

    The KLT tracker and Optical Flow

    HW: KLT and a simple tracking-by-detection tracker

    Day 2: CNN (mostly not about tracking)

    General CNN finetuning

    CNN design choices

    SotA: CNN-based trackers

    HW: Finetuning VGGNet for MDNet tracker

    Day 3: State of the art (continued)

    Discriminative Correlation Filters

    How to evaluate a tracker

    HW: KCF implementation

  • Outline of Day 1

    1. Visual tracking: not one, but many problems.

    2. The KLT tracker. Optical Flow. SIFT Flow. Flock of trackers.

    3. The Mean-Shift tracker

    4. Tracking by detection (basic)

  • Application domains of Visual Tracking

    • monitoring, assistance, surveillance, control, defense

    • robotics, autonomous car driving, rescue

    • measurements: medicine, sport, biology, meteorology

    • human computer interaction

    • augmented reality

    • film production and postproduction: motion capture, editing

    • management of video content: indexing, search

    • action and activity recognition

    • image stabilization

    • mobile applications

    • camera “tracking”

    Slide credit: Jiri Matas

  • Applications, applications, applications, …

    Slide credit: Jiri Matas

  • Tracking Applications…

    – Team sports: game analysis, player statistics, video annotation, …

    Slide credit: Jiri Matas

  • Sport examples

    http://cvlab.epfl.ch/~lepetit/
    http://www.dartfish.com/en/media-gallery/videos/index.htm

    Slide credit: Patrick Perez

  • Model-based Tracking: People and Faces

    http://cvlab.epfl.ch/research/completed/realtime_tracking/
    http://www.cs.brown.edu/~black/3Dtracking.html

    Slide credit: Patrick Perez

  • Tracking is commonly used in practice…

    • Tracking has been a popular research topic for decades:
      see CVPR, ICCV, ECCV, …

    • But there is no online course devoted to tracking…

    • nor is there much coverage in computer vision courses...

    • nor is it well covered in textbooks

  • Is it clear what tracking is?

    Video credit: Helmut Grabner

    Slide credit: Jiri Matas

  • Tracking: Formulation - Literature

    Surprisingly little is said about tracking in standard textbooks.

    Limited to optic flow, plus some basic trackers, e.g. Lucas-Kanade.

    Definition (0):

    [Forsyth and Ponce, Computer Vision: A modern approach, 2003]

    “Tracking is the problem of generating an inference about the motion of an object given a sequence of images. Good solutions of this problem have a variety of applications…”

    Slide credit: Jiri Matas

  • Formulation (1): Tracking

    Establishing point-to-point correspondences in consecutive frames of an image sequence.

    Notes:

    • The concept of an “object” in the F&P definition disappeared.

    • If an algorithm correctly established such correspondences, would that be a perfect tracker?

    • tracking = motion estimation?

    Slide credit: Jiri Matas

  • Tracking is Motion Estimation / Optic Flow?

    Slide credit: Jiri Matas

  • Tracking is Motion Estimation / Optic Flow?

  • Tracking is Motion Estimation / Optic Flow?

    Motion “pattern” / Camera tracking

    Dense motion field

    http://www.cs.cmu.edu/~saada/Projects/CrowdSegmentation/

    http://www.youtube.com/watch?v=ckVQrwYIjAs

    Sparse motion field estimate

    Slide credit: Jiri Matas

  • Tracking is Motion Estimation / Optic Flow?

    Motion field

    • The motion field is the projection of the 3D scene motion into the image.

    Slide credit: James Hays

  • Tracking is Motion Estimation / Optic Flow?

    • Definition: optical flow is the apparent motion of brightness patterns in the image

    • Ideally, optical flow would be the same as the motion field

    • Have to be careful: apparent motion can be caused by lighting changes without any actual motion

    • Think of a uniform rotating sphere under fixed lighting vs. a stationary sphere under moving illumination

    Slide credit: James Hays

  • Tracking is Motion Estimation / Optic Flow?

    Slide credit: Jason Corso

  • Optic Flow

    Standard formulation:

    • At every pixel, a 2D displacement is estimated between consecutive frames

    Missing:

    • occlusion – disocclusion handling: pixels visible in one image only
      – in the standard formulation, “don’t know” is not an answer

    • considering the 3D nature of the world

    • large displacement handling – only recently addressed (EpicFlow 2015)

    Practical issues hindering progress in optic flow:

    • is the ground truth ever known?
      – learning and performance evaluation problematic (synthetic sequences ...)

    • requires generic regularization (smoothing)

    • failure (assumption validity) not easy to detect

    In certain applications, tracking is motion estimation on a part of the image with specific constraints: augmented reality, sports analysis

  • Formulation (1): Tracking

    Establishing point-to-point correspondences in consecutive frames of an image sequence.

    Consider the Bolt sequence:

  • Formulation (1*): Tracking

    Establishing point-to-point correspondences between all pairs of frames in an image sequence

    • which leads to the concept of long-term tracking, to be discussed later

  • Definition (2): Tracking

    Given an initial estimate of its position, locate X in a sequence of images,

    where X may mean:

    • A (rectangular) region

    • An “interest point” and its neighbourhood

    • An “object”

    This definition is adopted e.g. in a recent book by Maggio and Cavallaro, Video Tracking, 2011.

    Smeulders T-PAMI13:
    Tracking is the analysis of video sequences for the purpose of establishing the location of the target over a sequence of frames (time), starting from the bounding box given in the first frame.

  • Formulation (3): Tracking as Segmentation

    J. Fan et al. Closed-Loop Adaptation for Robust Tracking, ECCV 2010


  • Tracking as model-based segmentation


  • Tracking as segmentation

    http://vision.ucsd.edu/~kbranson/research/cvpr2005.html

    http://www2.imm.dtu.dk/~aam/tracking/

    • heart


  • A “standard” CV tracking method output

    Approximate motion estimation, approximate segmentation. Neither good optic flow nor precise segmentation is required.

  • Rotated B-Boxes – Interpretation?

  • Formulation (4): Tracking

    Given an initial estimate of the pose and state of X:

    In all images in a sequence (in a causal manner),

    1. estimate the pose and state of X

    2. (optionally) update the model of X

    • Pose: any geometric parameter (position, scale, …)

    • State: appearance, shape/segmentation, visibility, articulations

    • Model update: essentially a semi-supervised learning problem
      – a priori information (appearance, shape, dynamics, …)
      – labeled data (“track this”) + unlabeled data = the sequences

    • Causal: for estimation at time T, use only information from times t ≤ T

  • Tracking for Black Mirror Blocking


  • Tracking in 6D.


  • Tracking-Learning-Detection (TLD)


  • A “miracle”: Tracking a Transparent Object

    Video credit: Helmut Grabner

    H. Grabner, H. Bischof, On-line boosting and vision, CVPR, 2006.

  • Tracking the “Invisible”

    H. Grabner, J. Matas, L. Van Gool, P. Cattin, Tracking the invisible: learning where the object might be, CVPR 2010.

  • Video: Octopus hunting and changing color

  • Formulation (5): Tracking

    Given an estimate of the pose (and state) of X in “key” images (and a priori information about X),

    In all images in a sequence (in a causal manner):

    1. estimate the pose and state of X

    2. (optionally) estimate the state of the scene [e.g. “supporters”]

    3. (optionally) update the model of X

    Out: a sequence of poses (and states), (and/or the learned model of X)

    Notes:

    • Often, not all parameters of pose/state are of interest, and the state is estimated as a side-effect.

    • If model acquisition is the desired output, the pose/state estimation is a side-effect.

    • The model may include: relational constraints and dynamics, appearance change as a function of pose and state.

  • Short-term v. Long-term Tracking v. OF

    Short-term Trackers:

    • primary objective: “where is X?” = precise estimation of pose

    • secondary: be fast; don’t lose track

    • evaluation methodology: frame number where failure occurred

    • examples: Lucas-Kanade tracker, mean-shift tracker

    Long-term Tracker-Detectors:

    • primary objective: unsupervised learning of a detector, since every (short-term) tracker fails, sooner or later (disappearance from the field of view, full occlusion)

    • avoid the “first failure means lost forever” problem

    • close to an online-learned detector, but assumes and exploits the fact that a sequence with temporal pose/state dependence is available

    • evaluation methodology: precision/recall, false positive/negative rates (i.e. like detectors)

    • note: the detector part may help even for short-term tracking problems; it provides robustness to fast, unpredictable motions.

    Optic Flow, Motion estimation: establish all correspondences in a sequence

  • Other Tracking Problems:

    … multiple object tracking …

  • Multi-object Tracking

    • ant tracking 1

    • result 1

    Tracking as detection and identification

  • Other Tracking Problems:

    Cell division.

    http://www.youtube.com/watch?v=rgLJrvoX_qo

    Three rounds of cell division in Drosophila melanogaster.

    http://www.youtube.com/watch?v=YFKA647w4Jg

    … splitting and merging events …

  • The World of Fast Moving Objects

    ● FMO – an object that moves over a distance exceeding its size within the exposure time

    ● Standard datasets (VOT, OTB, ALOV) do not include FMOs

  • FMO Examples

    ● Ping pong, tennis, frisbee, volleyball, badminton, squash, darts, arrows, softball

    ● Some FMOs are nearly homogeneous, while others have coloured texture

    ● SOTA trackers fail...

  • Applications: deblurring, temporal super-resolution

  • Estimation of Appearance

    ● Reconstruction of FMOs blurred by motion and rotation
    ● Axis of rotation, angle of rotation, full 3D appearance, …

  • Motion Estimation from a Single Image


  • Tracking problem variations:

    • multiple cameras

    • RGBD sensors

    • combination of sensors (accelerometer + visual, IMU + visual, IMU + GPS + visual)

  • Tracking problems

    • motion estimation (establishing point-to-point correspondences) v. segmentation (region-to-region correspondences)

    • long-term v. short-term

    • one object v. multiple objects

    • causal v. non-causal (= offline video analysis)

    • single v. multi-camera

    • static v. moving camera

  • The KLT tracker


  • The KLT Tracker

    Slide credit: Kris Kitani

  • KLT Tracker

    Slide credit: Tomas Svoboda

    Importance in Computer Vision

    • First published in 1981 as an image registration method [3].

    • Improved many times, most importantly by Carlo Tomasi [5, 4].

    • Free implementation(s) available¹.

    • After more than two decades, a project² at CMU was dedicated to this single algorithm, with results published in a premium journal [1].

    • Part of a plethora of computer vision algorithms.

    ¹ http://www.ces.clemson.edu/~stb/klt/
    ² http://www.ri.cmu.edu/projects/project_515.html

  • Image alignment

    Slide credit: Tomas Svoboda

    The goal is to align a template image T(x) to an input image I(x); x is a column vector containing the image coordinates [x, y]. I(x) could also be a small subwindow within an image.
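    The alignment equations on the following slides did not survive extraction; as a reference, the standard Lucas-Kanade objective and Gauss-Newton update (in the notation of Baker & Matthews) for a warp W(x; p) are

    $$\min_{\Delta p} \sum_{x} \big[ I(W(x;\, p + \Delta p)) - T(x) \big]^2 .$$

    Linearizing around the current parameters p gives the closed-form update

    $$\Delta p = H^{-1} \sum_{x} \Big[ \nabla I \,\frac{\partial W}{\partial p} \Big]^{\top} \big[ T(x) - I(W(x; p)) \big], \qquad
    H = \sum_{x} \Big[ \nabla I \,\frac{\partial W}{\partial p} \Big]^{\top} \Big[ \nabla I \,\frac{\partial W}{\partial p} \Big],$$

    which is iterated until the norm of Δp falls below a threshold.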

  • Original Lucas-Kanade algorithm I

    Slide credit: Tomas Svoboda

  • Original Lucas-Kanade algorithm II

    Slide credit: Tomas Svoboda

  • Original Lucas-Kanade algorithm III

    Slide credit: Tomas Svoboda

  • Original Lucas-Kanade algorithm IV

    Slide credit: Tomas Svoboda

  • Original Lucas-Kanade summary

    Slide credit: Tomas Svoboda

  • KLT Tracker

    For good conditioning, the patch must be textured/structured enough:

    • Uniform patch: no information

    • Contour element: aperture problem (one-dimensional information)

    • Corners, blobs and texture: best estimate

    [Lucas and Kanade 1981] [Tomasi and Shi, CVPR’94]

    Slide credit: Patrick Perez

  • Aperture Problem: Example

    Image from: Gary Bradski slides

    If we look through a small hole…

    Video by Olha Mishkina

  • Multi-resolution Lucas-Kanade

    – Arbitrary displacement

    • Multi-resolution approach: Gauss-Newton-like approximation down the image pyramid

    Slide credit: James Hays
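    To make the multi-resolution scheme concrete, here is a minimal sketch using OpenCV's pyramidal Lucas-Kanade (cv2.calcOpticalFlowPyrLK); the frame paths and parameter values are illustrative, not from the slides:

```python
# Minimal pyramidal KLT sketch with OpenCV; frame paths are placeholders.
import cv2

prev = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

# Well-conditioned (textured) points, cf. Shi & Tomasi "Good Features to Track"
pts = cv2.goodFeaturesToTrack(prev, maxCorners=200, qualityLevel=0.01,
                              minDistance=7)

# Coarse-to-fine Lucas-Kanade over maxLevel+1 pyramid levels;
# at each level, Gauss-Newton-like iterations run until eps or max count.
nxt, status, err = cv2.calcOpticalFlowPyrLK(
    prev, curr, pts, None,
    winSize=(21, 21), maxLevel=3,
    criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))

good_new = nxt[status.ravel() == 1]   # successfully tracked points
print("tracked %d of %d points" % (len(good_new), len(pts)))
```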

  • Monitoring quality

    – Translation is usually sufficient for small fragments, but:

    • Perspective transforms and occlusions cause drift and loss

    – Two complementary options:

    • Kill tracklets when the minimum SSD is too large

    • Compare as well with the initial patch under an affine transform (warp) assumption

    Slide credit: Patrick Perez

  • Characteristics of KLT

    • cost function: sum of squared intensity differences between template and window

    • optimization technique: gradient descent

    • model learning: no update / last frame / convex combination

    • attractive properties:

    – fast

    – easily extended to image-to-image transformations with multiple parameters

  • From KLT to whole image optical flow

    KLT:

    • cost function: sum of squared intensity differences between template and window

    • optimization technique: gradient descent

    Optical Flow (Horn & Schunck, 1981):

    • cost function: sum of squared intensity differences between two images + smoothness constraint

    • optimization technique: Gauss–Seidel
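    For reference (the formula on the next slide was lost in extraction), the classical Horn–Schunck energy over the flow field (u, v) is

    $$E(u, v) = \iint \big( I_x u + I_y v + I_t \big)^2 + \alpha^2 \big( \|\nabla u\|^2 + \|\nabla v\|^2 \big)\, dx\, dy,$$

    whose Euler–Lagrange equations are solved iteratively (Gauss–Seidel / Jacobi):

    $$u^{k+1} = \bar{u}^{k} - \frac{I_x \big( I_x \bar{u}^{k} + I_y \bar{v}^{k} + I_t \big)}{\alpha^2 + I_x^2 + I_y^2}, \qquad
    v^{k+1} = \bar{v}^{k} - \frac{I_y \big( I_x \bar{u}^{k} + I_y \bar{v}^{k} + I_t \big)}{\alpha^2 + I_x^2 + I_y^2},$$

    where $\bar{u}, \bar{v}$ denote local neighbourhood averages of the flow.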

  • Optical flow

    Slide credit: Cordelia Schmid

  • SIFT Flow: replace pixels with dense SIFT

    • cost function: sum of absolute SIFT differences between template and window + small displacement term + smoothness term

    • optimization technique: dual-layer loopy belief propagation

    w(p) = (u(p), v(p)) is the displacement vector at pixel location p = (x, y)

    Image from Liu et al. 2011, TPAMI

    https://cseweb.ucsd.edu/classes/sp06/cse151/lectures/belief-propagation.pdf
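    The energy being minimized (reproduced here following Liu et al. 2011, since the slide's formula was lost) combines the three terms above, with truncated robust penalties:

    $$E(\mathbf{w}) = \sum_{p} \min\!\big( \|s_1(p) - s_2(p + \mathbf{w}(p))\|_1,\, t \big)
    + \sum_{p} \eta \big( |u(p)| + |v(p)| \big)
    + \sum_{(p,q) \in \varepsilon} \min\!\big( \alpha |u(p) - u(q)|,\, d \big) + \min\!\big( \alpha |v(p) - v(q)|,\, d \big),$$

    where $s_1, s_2$ are the dense SIFT images and $\varepsilon$ is the 4-neighbourhood edge set.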

  • SIFT Flow: replace pixels with dense SIFT

    Image from Liu et al. 2011, TPAMI

    SIFT is not critical – it could be any “descriptor flow”

  • EpicFlow

    • Avoids the coarse-to-fine scheme

    • Other matching-based approaches [Chen’13, Lu’13, Leordeanu’13]:

    – less robust matching algorithms (e.g. PatchMatch)

    – complex schemes to handle occlusions & motion discontinuities

    – no geodesic distance

    (SED) (DeepMatching)

    EpicFlow, Revaud et al., 2015. Slide credit: Jerome Revaud

  • FlowNet

    Fischer et al. 2015, FlowNet: Learning Optical Flow with Convolutional Networks. Slide credit: Thomas Brox

  • FlowNet 2.0: Refine and refine

    Ilg et al. 2016, FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks.

  • Sintel benchmark

    Butler et al. 2012, A Naturalistic Open Source Movie for Optical Flow Evaluation.

  • SotA Optical Flow: qualitative results

    Ilg et al. 2016, FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks.

  • The Flock of Trackers

    (with error prediction)

    Tomas Vojir


  • The Flock of Trackers

    • An n×m grid (say 10×10) of Lucas-Kanade / ZSP trackers

    • Trackers initialised on a regular grid

    • Robust estimation of global motion: either “median” direction/scale or RANSAC (up to a homography)

    • Each tracker has a failure predictor
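    A minimal sketch of this scheme, assuming OpenCV: a regular grid of local LK trackers inside the bounding box, with the global motion taken as the median of the surviving per-cell displacements (the grid size, window size and the helper name fot_step are illustrative):

```python
# Flock-of-Trackers step sketch: grid of LK trackers + median global motion.
import cv2
import numpy as np

def fot_step(prev, curr, box, grid=10):
    """box = (x, y, w, h); returns the box shifted by the median displacement."""
    x, y, w, h = box
    gx, gy = np.meshgrid(np.linspace(x, x + w, grid),
                         np.linspace(y, y + h, grid))
    pts = np.float32(np.dstack([gx, gy]).reshape(-1, 1, 2))  # regular grid
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev, curr, pts, None,
                                              winSize=(15, 15), maxLevel=2)
    d = (nxt - pts).reshape(-1, 2)[status.ravel() == 1]
    if len(d) == 0:
        return box                      # every local tracker failed
    dx, dy = np.median(d, axis=0)       # robust "median" global motion
    return (x + dx, y + dy, w, h)
```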

    Two classical Failure Predictors

    Normalized Cross-correlation:

    • Compute the normalized cross-correlation between each local tracker's patch at time t and at time t+1

    • Sort local trackers according to the NCC response

    • Filter out the bottom 50% (median rule)

    Forward-Backward [1]:

    • Compute correspondences of local trackers from time t to t+k and from t+k back to t, and measure the k-step error

    • Sort local trackers according to the k-step error

    • Filter out the bottom 50% (median rule)

    [1] Z. Kalal, K. Mikolajczyk, and J. Matas. Forward-Backward Error: Automatic Detection of Tracking Failures. ICPR, 2010.
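    A minimal sketch of the forward-backward predictor for k = 1, assuming OpenCV; the median filtering rule follows the slide, while names and parameters are illustrative:

```python
# Forward-backward failure prediction: track t -> t+1 -> t and keep the
# half of the local trackers with the smaller FB error.
import cv2
import numpy as np

def fb_filter(prev, curr, pts):
    fwd, st1, _ = cv2.calcOpticalFlowPyrLK(prev, curr, pts, None)
    bwd, st2, _ = cv2.calcOpticalFlowPyrLK(curr, prev, fwd, None)
    # FB error: distance between each point and its back-tracked position
    fb_err = np.linalg.norm((pts - bwd).reshape(-1, 2), axis=1)
    ok = (st1.ravel() == 1) & (st2.ravel() == 1)
    fb_err[~ok] = np.inf
    keep = fb_err <= np.median(fb_err[ok])   # filter out the bottom 50%
    return pts[keep], fwd[keep]
```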

  • Failure Predictor: Neighborhood Consistency

    • For each local tracker i, a neighbourhood consistency score is computed (reconstructing the slide's formula from its legend) as

    $$S_i^{Nh} = \sum_{j \in N_i} \big[\, \|\Delta_i - \Delta_j\| < \varepsilon \,\big],$$

    where $N_i$ is the four-neighbourhood of local tracker i, $\Delta$ is the displacement, and $\varepsilon$ is the displacement error threshold.

    • Local trackers with $S_i^{Nh} < \theta_{Nh}$ are filtered out

    • Setting: $\varepsilon$ = 0.5 px, $\theta_{Nh}$ = 1
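    A minimal numpy sketch of this predictor, assuming the local trackers form a regular grid with a displacement field of shape (rows, cols, 2); border handling (np.roll wraps around) is simplified for brevity:

```python
# Neighbourhood-consistency score: count 4-neighbours whose displacement
# agrees within eps; trackers with score below theta are filtered out.
import numpy as np

def neighbourhood_consistency(disp, eps=0.5, theta=1):
    rows, cols, _ = disp.shape
    score = np.zeros((rows, cols), dtype=int)
    for di, dj in [(-1, 0), (1, 0), (0, -1), (0, 1)]:   # 4-neighbourhood
        shifted = np.roll(disp, (di, dj), axis=(0, 1))  # wraps at borders
        score += (np.linalg.norm(disp - shifted, axis=2) < eps)
    return score >= theta   # True = consistent, kept; False = filtered out
```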

  • Failure Predictors: Temporal consistency

    • The Markov Model predictor (MMp) models each local tracker as a two-state (inlier, outlier) probabilistic automaton with transition probabilities pᵢ(sₜ₊₁ | sₜ)

    • MMp estimates the probability of being an inlier for all local trackers ⇒ filter by:
      1) a static threshold
      2) a dynamic threshold

    • Learning is done incrementally (what is learned are the transition probabilities between states)

    • Can be extended by “forgetting”, which allows faster response to object appearance change

  • The combined outlier filter

    Combining three indicators of failure:

    – Local appearance (NCC)

    – Neighbourhood consistency (Nh) (similar to the smoothness assumption used in optic flow estimation)

    – Temporal consistency using a Markov Model predictor (MMp)

    • Together they form a stronger predictor than the popular forward-backward

    • Negligible computational cost (less than 10%)

    T. Vojir and J. Matas. Robustifying the flock of trackers. CVWW ’11.

  • FoT Error Prediction: Bike, tight box (ext. viewer)

  • FoT Error Prediction: Bike, loose box (ext. viewer)

  • FoT Error Prediction (ext. viewer)

  • The Mean-shift Tracker

    (colour-based tracking)


  • Iterative tracker for histogram-based descriptors

    The Mean-shift Tracker (colour-based tracking)

    Look for a local maximum (mean shift, KLT, particle filtering, etc.)

    Slide credit: Robert Collins

  • Histogram Appearance Models

    • Motivation – to track non-rigid objects (like a walking person), it is hard to specify an explicit 2D parametric motion model.

    • Appearances of non-rigid objects can sometimes be modeled with color distributions.

    Slide credit: Robert Collins

  • Histogram Appearance Models

    Slide credit: Robert Collins

  • Histogram Appearance Models

    Slide credit: Robert Collins

  • Mean Shift Algorithm

    The mean-shift algorithm is an efficient approach to tracking objects whose appearance is defined by color.

    (Not limited to color, however – it could also use edge orientations, texture, motion: any histogram.)

    Slide credit: Robert Collins

  • Mean Shift Algorithm

    (a sequence of animation slides illustrating the mode-seeking iterations) …repeat

    Slide credit: Kris Kitani

  • Mean Shift Tracker

    Now all together


  • Color-based tracking

    – Global description of the tracked region: a color histogram

    – Reference histogram with B bins, set at track initialization

    – Candidate histogram at the current instant, gathered in a region of the current image

    – At each instant:

    • search around the previous position estimate

    • iterative search, initialized there: mean-shift-like iteration

    Slide credit: Patrick Perez


  • Color distributions and similarity

    – Color histogram weighted by a kernel

    • The kernel's elliptic support sits on the object

    • Central pixels contribute more

    • Makes differentiation possible

    • H: “bandwidth” symmetric positive-definite matrix, related to the bounding box dimensions

    • k: “profile” of the kernel (Gaussian or Epanechnikov)

    – Histogram dissimilarity measure

    • Bhattacharyya measure

    • Symmetric, bounded, null only for equality

    • 1 − dot product on the positive quadrant of the unit hypersphere

    Slide credit: Patrick Perez
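    In symbols (the slide's formulas were lost; these follow Comaniciu et al. 2003): the kernel-weighted histogram of a region centred at c, and the Bhattacharyya coefficient and distance between a candidate p and the reference q, are

    $$q_b = C \sum_{i} k\!\left( \| H^{-1}(x_i - c) \|^2 \right) \delta\big[ b(x_i) - b \big], \qquad
    \rho(p, q) = \sum_{b=1}^{B} \sqrt{p_b\, q_b}, \qquad
    d(p, q) = \sqrt{1 - \rho(p, q)},$$

    where $b(x_i)$ is the histogram bin of the pixel at $x_i$, C normalizes the histogram to sum to one, and k is the kernel profile.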

  • Color distributions and similarity

    Slide credit: Kris Kitani

  • Iterative ascent

    – Non-quadratic minimization: iterative ascent with linearizations

    – Setting the move to (g = −h′) yields a simple algorithm…

  • Mean-shift tracker

    • In frame t+1:

    – Start the search at the previous position

    – Until stop:

    • Compute the candidate histogram

    • Weight pixels inside the kernel support

    • Move the kernel

    • Check overshooting

    • If converged, stop; else iterate

    Slide credit: Patrick Perez
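    For reference, a minimal colour-based mean-shift loop with OpenCV's built-in cv2.meanShift; the video path, initial box and bin count are placeholders:

```python
# Mean-shift tracking sketch: hue histogram + back-projection + cv2.meanShift.
import cv2

cap = cv2.VideoCapture("sequence.avi")
ok, frame = cap.read()
x, y, w, h = 200, 150, 60, 80                      # initial box (by hand)

# Reference histogram with B = 16 bins, set at track initialization
roi = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
hist = cv2.calcHist([roi], [0], None, [16], [0, 180])
cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)

crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
box = (x, y, w, h)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Back-projection: per-pixel likelihood under the reference histogram
    bp = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
    # Mean-shift iterations climb to the nearest mode of the back-projection
    _, box = cv2.meanShift(bp, box, crit)
```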

  • Mean Shift tracking example

    Feature space: 16×16×16 quantized RGB. Target: manually selected in the 1st frame.

    Average mean-shift iterations: 4

  • Mean Shift tracking example

    D. Comaniciu, V. Ramesh, P. Meer: Kernel-Based Object Tracking, TPAMI, 2003.

    http://comaniciu.net/Papers/KernelTracking.pdf

  • Some examples of MS tracking

    http://www.youtube.com/watch?v=RG5uV_h50b0


  • Pros and cons

    – Low computational cost (easily real-time)

    – Surprisingly robust:

    • Invariant to pose and viewpoint

    • Often no need to update the reference color model

    – Invariance comes at a price:

    • Position estimate prone to fluctuation

    • Scale and orientation not well captured

    • Sensitive to color clutter (e.g., teammates in team sports)

    – Local search by gradient descent

    – Problems:

    • abrupt moves

    • occlusions

    Slide credit: Patrick Perez

  • Variants

    – Remove background corruption in the reference:

    • Simple segmentation based on surrounding color at initialization

    • Re-estimation of the foreground model

    • Amounts to zeroing out bins for colors more frequent in the surroundings than in the selection

    – Scale/orientation estimation:

    • Originally: greedy search around the current scale/orientation

    • Afterwards: incorporate loose spatial layout (via multiple spatial kernels or spatial partitioning with sub-models)

    – Robustness to camera movement:

    • Robust estimation of the dominant apparent motion

    • Start the search at the previous position displaced according to the dominant motion

    Slide credit: Patrick Perez

  • Variants (2)

    – Improved discrimination for improved robustness:

    • Selection of the best color space or color combinations to distinguish foreground from background

    • Alternative or complementary features (intensity, textures, co-occurrences)

    – Improved accuracy:

    • Coupling with more precise though more fragile tracking (fragment-based in particular)

    – Smoother similarity measure:

    • Kullback-Leibler

    – Alternative search algorithms:

    • Trust region

    Slide credit: Patrick Perez

  • Further variants

    – On-the-fly adaptation:

    • Radical pose or illumination changes require adaptation

    • Usually linear mixing with exponential forgetting

    • Still an open dilemma: adapt when required, not during occlusions…

    – Probabilistic version:

    • Coupled with a Kalman or particle filter for better handling of occlusions

    – Automatic multi-object tracking:

    • Detection of a category of objects

    • Sequential tracking or batch “tracking”

    • Handling of multiple trackers (exclusion principle, multi-object occlusion)

    Slide credit: Patrick Perez

  • Tracking by detection


    YOLO: Real-Time Object Detection, https://www.youtube.com/watch?v=VOC3huqHrss

  • Tracking by detection and correspondence

    • What if we combine an object (or local feature) detector with matching logic? Recall the wide baseline stereo problem.

    • The wide baseline stereo problem – the automatic estimation of a geometric transformation (flow?) and the selection of consistent correspondences between view pairs separated by a wide (temporal?) baseline.

    MODS: Fast and Robust Method for Two-View Matching. Mishkin et al., 2015.

  • Tracking by detection

    • If the object does not move and only the camera moves, then we can track the whole scene:
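    A minimal sketch of this idea with OpenCV: detect ORB features in every frame, match them against a template view, and select consistent correspondences with RANSAC, as in two-view matching (names and thresholds are illustrative; MODS itself is a considerably more robust pipeline):

```python
# Tracking-by-detection via per-frame wide-baseline-style matching.
import cv2
import numpy as np

orb = cv2.ORB_create(1000)
template = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)
kp0, des0 = orb.detectAndCompute(template, None)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def locate(frame_gray):
    """Estimate the homography mapping the template into this frame."""
    kp, des = orb.detectAndCompute(frame_gray, None)
    if des is None:
        return None
    matches = matcher.match(des0, des)
    if len(matches) < 8:
        return None
    src = np.float32([kp0[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    # RANSAC selects the geometrically consistent correspondences
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    return H
```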

  • THANK YOU.

    Questions, please?
