human-computer interaction human-computer interaction tracking hanyang university jong-il park
TRANSCRIPT
Human-Computer InteractionHuman-Computer Interaction Tracking
Hanyang University
Jong-Il Park
Tracking in computer visionTracking in computer vision Definition
Problem of generating an inference about the motion of an object given a sequence of images
Application Motion capture Recognition from motion Surveillance
Who is doing what? Eg. Security, HCI(Kinect…)
TrackingTracking Establish state of object using time sequence
state could be: position; position+velocity; position+velocity+acceleration or more complex, Eg. all joint angles for a person
Biggest problem -- Data Association which image pixels are informative, which are not?
Key ideas Tracking by detection
if we know what an object looks like, that selects the pixels to use
Tracking through flow if we know how an object moves, that selects the pixels
to use
Appearance vs. FlowAppearance vs. Flow
Tracking by DetectionTracking by Detection Assume
a very reliable detector (e.g. faces; back of heads) detections that are well spaced in images (or have
distinctive properties) e.g. news anchors; heads in public
Link detects across time only one - easy multiple - weighted bipartite matching but what if one is missing?
Better: create abstract tracks link detects to track create tracks, reap tracks as required clean up spacetime paths
Tracking by Known AppearanceTracking by Known Appearance Even if we don’t have a detector Know rectangle in image (n) Want to find corresponding rectangle in (n+1) Search over nearby rectangles
to find one that minimizes SSD error
where sum is over pixels in rectangle
Application stabilize players in TV sport
Buidling TracksBuidling Tracks
Start at scattered points in image 1 perhaps corner detector responses
For each in image (n), compute position in (n+1) as in previous slide
Now check tracks patch in (n+1) should look like an affine transform
of patch in 1 Prune bad tracks
What if the patch deforms?What if the patch deforms? Eg a football player’s jersey
Colors are “similar” but SSD won’t work Idea: patch histogram is stable
To track: repeat
predict location of new patch search nearby for patch whose histogram matches
original the best
When are motions easy?When are motions easy? Current procedure
predict state obtaining measurement from prediction by search correct state
Easy When the object is close to where you expect it to be
eg Object guaranteed to move a little
Large motions can be easy When they’re “predictable”
e.g. ballistic motion e.g. constant velocity
Need a theory to fuse this procedure with motion model
General ModelGeneral Model We assume there are moving objects, which
have an underlying state X There are measurements Y, some of which
are functions of this state Eg. There is a clock
at each tick, the state changes at each tick, we get a new observation
object is ball, state is 3D position+velocity, measurements are stereo pairs
object is person, state is body configuration, measurements are frames, clock is in camera (30 fps)
Tracking as Tracking as
an Abstract Inference Probleman Abstract Inference Problem Internal state: X Measurement: Y Major Steps:
1. Prediction
2. Data association: determining which data are informative
3. Correction
Independence AssumptionIndependence Assumption Only the immediate past matters
* X: Markov process
Measurements depend only on the current state
Bayes RuleBayes Rule
PredictionPrediction
CorrectionCorrection
Linear Dynamic ModelLinear Dynamic Model Model
D: Transition matrix
M: Measurement matrix
Kalman filter is well-suited for this type of tracking
Read Ch.11.3, Forsyth & Ponce, Computer Vision, 2012.
Kalman filterKalman filter
Eg. Kalman filterEg. Kalman filter
*: predictedx: measured+: estimatedo: truebar: 3
Nonlinear DynamicsNonlinear Dynamics
Problems tend not to be normal
may not be Gaussian(quite common in vision problem)
multiple, well-separated modes
Eg. Nonlinear modelEg. Nonlinear model
Eg.Eg. Nonlinear modelNonlinear model Time Evolution & pdfTime Evolution & pdf
Difficulties Difficulties in Complicated Modelsin Complicated Models
In order to maintain handling multiple peaks handling multi-dimensional state vector
Particle filtering is a useful approach.
How?
Sampled RepresentationSampled Representation Representation of pdf is
NOT for representing a pdf itself BUT for computing some expectation
Computing expectation using sampled representation
Monte Carlo IntegrationMonte Carlo Integration
where : sampling distribution
: weight
Obtaining a sampled representation Obtaining a sampled representation of probability distributionof probability distribution
Computing an expectation using Computing an expectation using a set of samplesa set of samples
TransformationTransformation Transforming a sampled representation of Transforming a sampled representation of
a prior into that of a posteriora prior into that of a posterior
Naive particle filterNaive particle filter
Eg. Naive particle filterEg. Naive particle filter
Poor results due to sample impoverishment! Most of the weights get small very fast
The way to get accurate estimates = to have samples lie where the p is likely to
be large
Overcoming Sample ImpoverishmentOvercoming Sample Impoverishment
Equivalent to maintaining a set of good particles
Then, how?
Resampling the prior1. expand the sample set non-uniformly using
the weights Form a new set of samples consisting of a
union of Nk copies of (sk,1) for each k.
2. Subsample the sample set uniformly
Practical particle filterPractical particle filter Initialization Prediction Correction Resampling:
1. Normalise the weights so that 2. Compute the variance of the
normalised weights. If(var >Th) construct a new set of
samples by drawing, with replacement, N samples from the old set, using the weights as the probability that a sample will be drawn.
The weight of each sample is now 1/N.
Naive particle filter
Consequence of resamplingConsequence of resampling
Particles that tend to reflect the state rather well usually reappear in the resampled set
Many particles lie within one standard deviation of the mean of the posterior
Particle filtersParticle filters Different community -> different name
statistics particle filter
AI survival of the fittest
computer vision condensation
Tracking peopleTracking people Essential components
Motion model Likelihood model
P(image features|person present at given configuration)
Motion model Strong motion model: markers, angles,… Weak motion model: drift model
Likelihood model SSD, edges, other features,…
Likelihood computationLikelihood computation
Boundaryinformation
Non-backgroundinformation
Sample points
Annealed particle filterAnnealed particle filter To overcome the problems of
High dimensionality of the state Many local peaks in the likelihood
Annealing Starts from smooth approximations to the
likelihood to less smooth approximation Repeats weighting and resampling
[Deutscher,Blake,Reid, CVPR2000]
Finding people Finding people As an initialization of people tracker There is no person tracker that represents
the configuration of the body and can start automatically
Known approaches1. Template matching
2. Finding faces
3. Search over correspondence
* Challenging topic: learning models from data