References:
- Forsyth / Ponce: Computer Vision
- Horn: Robot Vision
- Schunck: Machine Vision
- University of Edinburgh online image processing reference: http://www.cee.hw.ac.uk/hipr/html/hipr_top.html
- The Computer Vision Homepage: http://www.cs.cmu.edu/~cil/vision.html
- Rice University Eigenface Group: http://www.owlnet.rice.edu/~elec301/Projects99/faces/code.html
- OpenCV: http://opencv.willowgarage.com/wiki/ and http://opencv.willowgarage.com/wiki/CvReference
Machine Vision: Tracking I, version 1.3, MediaRobotics Lab, March 2009
Tracking
The idea is old. Tracking is just keeping note of things...
General requirements:
- something to detect
- a way of representing that object to your system
- a way to tally the results
- a way to find previous results
- a way to recover from mistakes
Non-image based tracking
Biometrics: fingerprints
skin pattern
facial thermogram
gait
DNA
Governmental: social security numbers
tax records
credit reports
retrospective surveillance:
The goal of these systems is to review the captured scenes from other sites in order to validate whether a hint of threat detected at the local site is part of a larger pattern. Imagine our proposed infrastructure deployed to monitor all important landmarks in the United States... Analyzing the images from multiple cameras peering into the crowds can allow detection algorithms to potentially make more reliable identification of terrorists than single cameras. More importantly, we can develop recognition algorithms that, when triggered by the suspicious activity of one tourist, analyze the stored streams from other landmarks to see if this same tourist exhibited suspicious behavior in those other sites; activity which may be missed by each site locally. Analyzing the streams in concert can also help identify more complex threat behavioral patterns.
Hydra, University of Notre Dame
...One can imagine recognition algorithms that identify a threat event that involves multiple actors; identified not only because each of these actors exhibit similar suspicious behavior but also by the fact that they all scoped out the landmark sites without overlapping with each other. While one person was identified as video taping the Empire State building and the Statue of Liberty in NYC and Sears tower in Chicago within a week of each other, another individual was also noticed video taping the Brooklyn bridge in New York and Navy pier in Chicago in the same week. Note that tourists video taping landmark sites itself is not the threat; rather the specific pattern and choice of sites might give clues to suspicious behavior...
Hydra, University of Notre Dame
Image based tracking
Direct tracking:
-image analysis
Indirect tracking:
-image differencing
-optical flow
Visual servoing:
-feed results back into a controller to
steer a moving vehicle
Difference images: pyramid of change > nth derivative
Stream of images in time
Time is here a variable (function of #frames)
[figure: stack of image frames at times t = 3, 2, 1, 0]
Effective for global properties
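The difference-image idea can be sketched in a few lines of NumPy. The frames below are synthetic (a bright patch drifting right stands in for camera input); the per-step sum of the absolute difference is the kind of global property the slide refers to.

```python
import numpy as np

# hypothetical stream of 8x8 grayscale frames: a bright 2x2 patch moving right
frames = []
for t in range(4):
    f = np.zeros((8, 8), dtype=np.int16)
    f[2:4, t:t+2] = 255          # patch starts at column t
    frames.append(f)

# first derivative of the stream: difference between consecutive frames
diffs = [np.abs(frames[t + 1] - frames[t]) for t in range(3)]

# a global property: total change per step (motion "energy")
energy = [int(d.sum()) for d in diffs]
print(energy)
```

Higher "levels of the pyramid" would difference the difference images again, approximating higher derivatives of the stream.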
Visual tracking of moving objects (multi-camera)
The first step in tracking objects is for the system to distinguish moving objects from stationary ones through feature selection and detection.
The next step requires the system to make note of the location, speed, size, and shape of moving objects.
Finally, the system must learn to recognize and track the same object as it moves out of the visual field of one camera and into the next. The computer is able to do this as long as the visual fields of its cameras overlap at least somewhat.
Visual Tracking with a single camera: algorithm
feature selection
while (streaming)
{
feature extraction
feature location
}
Advanced Outdoor Line Following: First results in vision-based crop line tracking. Mark Ollis & Anthony Stentz, Robotics Institute (1996)
“The color segmentation algorithm has two parts: a discriminant and a segmentor. The discriminant computes a function d(i,j) of individual pixels whose output provides some information about whether that pixel is in the cut region or the uncut region; the segmentor then uses the discriminant to produce a segmentation.”
http://www.ri.cmu.edu/pub_files/pub1/ollis_mark_1996_1/ollis_mark_1996_1.pdf
Indoor Line Following: Vision-based Line Tracking and Navigation in Structured Environments. G. Beccari, S. Caselli, F. Zanichelli, A. Calafiore
Features:
- color
- brightness
- texture
- location
- size
- form (local and global geometry)
- other (eigenfaces for face detection)
> isolated and combined
image template
Occurrences of the template in the image
In OpenCV, under C++:
cvMatchTemplate(image, template, result, CV_TM_CCOEFF_NORMED);
http://www.cs.rit.edu/~gsp8334/OCVT/OpenCVTutorial_III.pdf
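To show what CV_TM_CCOEFF_NORMED computes, here is a naive pure-NumPy sketch of mean-subtracted normalized cross-correlation (not the OpenCV implementation, just the same idea, on a toy image):

```python
import numpy as np

def match_template(image, templ):
    """Slide the template over the image; at each offset, score the window
    by normalized cross-correlation after subtracting the means.
    The maximum of the score map marks the best match."""
    th, tw = templ.shape
    ih, iw = image.shape
    t = templ - templ.mean()
    tnorm = np.sqrt((t * t).sum())
    out = np.zeros((ih - th + 1, iw - tw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            w = image[y:y+th, x:x+tw]
            w = w - w.mean()
            denom = tnorm * np.sqrt((w * w).sum())
            out[y, x] = (t * w).sum() / denom if denom > 0 else 0.0
    return out

# toy image with the template pasted at row 3, column 5
img = np.zeros((12, 12))
tmpl = np.arange(9, dtype=float).reshape(3, 3)
img[3:6, 5:8] = tmpl
scores = match_template(img, tmpl)
y, x = np.unravel_index(scores.argmax(), scores.shape)
print(y, x)   # best-match location
```

A perfect match scores 1.0; the normalization makes the score insensitive to uniform brightness and contrast changes, which is why this variant is usually preferred over plain correlation.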
Rotation, Scale, Translation-Invariant Template Matching
http://www.lps.usp.br/~hae/software/cirateg/index.html
Texture
What is texture?
- a feature that repeats with some variation
- need to separate the repeating elements from the constant elements
- often approached with probabilistic distributions
- also: wavelets and neural nets
- example: Anil K. Jain, Kalle Karu, "Learning Texture Discrimination Masks", IEEE Trans. PAMI, February 1996 (Vol. 18, No. 2), pp. 195-205
Eigenfaces
Eigenfaces are a set of eigenvectors used in the computer vision problem of human face recognition. These eigenvectors are derived from the covariance matrix of the probability distribution of the high-dimensional vector space of possible faces of human beings.
The technique has been used for handwriting, lip reading, voice recognition, and medical imaging.
Eigenfaces can be imagined as a set of "standardized face ingredients", derived from statistical analysis of many pictures of faces. Any human face can be considered to be a combination of these standard faces (everyone has eyes, a nose, a mouth..). One person's face might be made up of 10% from face 1, 24% from face 2 and so on.
Eigenfaces
Practically, eigenfaces are created by finding feature vectors based on deviations from an averaged training set.
a) collect some images
b) find the average image (sum and divide)
c) find the deviated images (differences between individual images and the average image)
d) calculate the covariance matrix (measure of how much variables vary together)
e) calculate the eigenvectors of the covariance matrix (vectors whose direction the matrix leaves unchanged, only scaling them)
f) construct eigenfaces by combining N eigenvectors
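The steps above can be sketched directly with NumPy. Random vectors stand in for face images here; in practice each row of X would be a flattened face photo.

```python
import numpy as np

rng = np.random.default_rng(0)

# a) collect some "images": 10 samples, each a flattened 6x6 image (36 pixels)
X = rng.normal(size=(10, 36))

# b) average image
mean = X.mean(axis=0)

# c) deviated images
A = X - mean

# d) covariance matrix of the pixels
cov = A.T @ A / (len(X) - 1)

# e) eigenvectors of the covariance matrix
vals, vecs = np.linalg.eigh(cov)      # eigh: cov is symmetric
order = np.argsort(vals)[::-1]        # strongest variation first
vals, vecs = vals[order], vecs[:, order]

# f) keep the top N eigenvectors as "eigenfaces"
N = 5
eigenfaces = vecs[:, :N]

# each face is now described by N weights instead of 36 pixels
weights = A @ eigenfaces
print(weights.shape)
```

For real face images the pixel covariance is enormous (e.g. 10,000 x 10,000 for 100x100 images), so practical implementations take eigenvectors of the much smaller A A^T, or use an SVD, instead of forming the full covariance matrix.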
check Wikipedia for an intuitive description of eigenvectors: http://en.wikipedia.org/wiki/Eigenvector
Rice University's Eigenface Group has Python face recognition code implementing eigenfaces:
http://www.owlnet.rice.edu/~elec301/Projects99/faces/code.html
University of Pittsburgh has an online face detection program: http://demo.pittpatt.com/
Problems with features
-perspective
-underdefinition (3d, content)
-lighting
-occlusion
-distance
-image resolution
-test data without training data
General approach
feature selection
while (streaming)
{
feature extraction
feature location
}
feature location: choose an invariant property
for example: center of mass
For two bodies with internal forces Fi1, Fi2 and external forces F1, F2, Newton's second law gives:

Fi1 + F1 = a1m1
Fi2 + F2 = a2m2

According to Newton's third law the two internal forces are equal and opposite (Fi1 = -Fi2). Adding the equations then cancels them:

F1 + F2 = a1m1 + a2m2

The two bodies are combined to one body with mass m1 + m2. This body is acted upon by the same external forces as our two bodies. This imaginary body then gets an acceleration which we call the acceleration of the center of mass, acm, given by:

a1m1 + a2m2 = (m1 + m2)acm

Acceleration is the second derivative of the position, so the same relation holds for the positions:

x1m1 + x2m2 = (m1 + m2)xcm

The center of mass 'balances' the two different weights:

xcm = (m1x1 + m2x2) / (m1 + m2)
Center of Mass
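A quick numeric sanity check of the balance-point formula (note the denominator is the total mass m1 + m2): with a heavier mass on one side, the center of mass lands proportionally closer to it.

```python
# two point masses on a line: m1 = 1 kg at x1 = 0, m2 = 3 kg at x2 = 4
m1, x1 = 1.0, 0.0
m2, x2 = 3.0, 4.0

xcm = (m1 * x1 + m2 * x2) / (m1 + m2)
print(xcm)   # 3.0 -- three quarters of the way toward the heavier mass
```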
Calculating the CG in matlab
% find the centers of gravity of image Y (k rows, j columns)
% m = 1:k and n = 1:j are the row and column index vectors;
% C and D are assumed pre-seeded with one placeholder element (hence 2:end)
for i = 1 : j
    C = [C, (Y(:,i)' * m') / sum(Y(:,i))];   % vertical centroid of column i
end
for i = 1 : k
    D = [D, (Y(i,:) * n') / sum(Y(i,:))];    % horizontal centroid of row i
end
G = C(2:end);
E = D(2:end);
PY = round(median(G));   % Y coordinate of the CoG
PX = round(median(E));   % X coordinate of the CoG
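The MATLAB fragment computes per-row and per-column centroids and takes their median. The basic intensity-weighted center of gravity can be sketched more directly in NumPy (a sketch of the same idea, not a literal port of the MATLAB code):

```python
import numpy as np

def center_of_gravity(Y):
    """Intensity-weighted centroid of a 2-D image array."""
    rows, cols = np.indices(Y.shape)
    total = Y.sum()
    cy = (rows * Y).sum() / total   # Y coordinate
    cx = (cols * Y).sum() / total   # X coordinate
    return cy, cx

# toy image: all intensity concentrated in one pixel
img = np.zeros((5, 5))
img[1, 3] = 10.0
print(center_of_gravity(img))   # (1.0, 3.0)
```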
Color based tracking
xstart, ystart, xnew, ynew, xprevious, yprevious
while (streaming) {
feature extraction > hue > binarize > area
feature location > get CG coordinates > xnew, ynew
xstart = xnew //set only once
ystart = ynew //set only once
xstart – xprevious //check distance travelled
ystart – yprevious //check distance travelled
xprevious = xnew //update continuously
yprevious = ynew //update continuously
}
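The loop above can be sketched in Python with NumPy. A synthetic "hue" frame stands in for camera input, and the threshold values are made up for the example:

```python
import numpy as np

def locate(frame, hue_lo=0.9, hue_hi=1.1):
    """feature extraction: threshold the hue channel and binarize;
    feature location: centroid (CG) of the binary mask."""
    mask = (frame >= hue_lo) & (frame <= hue_hi)
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None                       # target lost; caller must recover
    return xs.mean(), ys.mean()

# synthetic stream: a patch of hue ~1.0 drifting right by one pixel per frame
frames = []
for t in range(3):
    f = np.zeros((10, 10))
    f[4:6, t:t+2] = 1.0
    frames.append(f)

xstart = ystart = xprev = yprev = None
for f in frames:
    loc = locate(f)
    if loc is None:
        continue
    xnew, ynew = loc
    if xstart is None:                    # set only once
        xstart, ystart = xnew, ynew
    if xprev is not None:                 # check distance travelled
        print("moved:", xnew - xprev, ynew - yprev)
    xprev, yprev = xnew, ynew             # update continuously
```

The `None` return is one simple way to meet the "recover from mistakes" requirement from the start of the lecture: when the hue mask comes up empty, the tracker keeps the previous coordinates instead of reporting garbage.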
Simple Hack with PIL
Calculate the bounding box around a region of interest
#binarize the final image
mask6 = mask5.convert("1", dither=Image.NONE)

#find outer corners of remaining area
for i in range(x):
    for j in range(y):
        if mask6.getpixel((i, j)):
            if i < xmin:
                xmin = i
            if i > xmax:
                xmax = i
            if j < ymin:
                ymin = j
            if j > ymax:      # the original tested y here: a bug
                ymax = j

box = (xmin, ymin, xmax, ymax)
draw.rectangle(box, outline=(255, 0, 0))
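With NumPy the same bounding box falls out of the nonzero coordinates without the pixel-by-pixel double loop (a sketch, assuming the binarized mask is already a boolean array):

```python
import numpy as np

mask = np.zeros((8, 8), dtype=bool)
mask[2:5, 3:6] = True        # region of interest

ys, xs = np.nonzero(mask)    # coordinates of all set pixels
box = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
print(box)   # (3, 2, 5, 4)
```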
Simple tracking based on color
xstart, ystart, xnew, ynew, xprevious, yprevious
while (streaming) {
feature extraction > hue
binarize
area(s)
feature location > get CoM coordinates > xnew, ynew
xstart – xnew
ystart – ynew
xstart – xprevious
ystart – yprevious
xprevious = xnew
yprevious = ynew
}
Additional topics:
industrial applications: moving line tracking: http://www.braintech.com/videos-honda.php
clustering: k-means: http://www.indiana.edu/~dll/B657/B657_lec_kmeans.pdf
decomposition: principal component analysis: http://www.indiana.edu/~dll/B657/B657_lec_pca.pdf