
References:

Forsyth / Ponce: Computer Vision

Horn: Robot Vision

Schunk: Machine Vision

University of Edinburgh online image processing reference
http://www.cee.hw.ac.uk/hipr/html/hipr_top.html

The Computer Vision Homepage
http://www.cs.cmu.edu/~cil/vision.html

Rice University Eigenface Group
http://www.owlnet.rice.edu/~elec301/Projects99/faces/code.html

OpenCV
http://opencv.willowgarage.com/wiki/
http://opencv.willowgarage.com/wiki/CvReference

Machine Vision: Tracking I, version 1.3, MediaRobotics Lab, March 2009

Tracking

The idea is old. Tracking simply means keeping note of things over time.

General requirements:

- something to detect
- a way of representing that object to your system
- a way to tally the results
- a way to find previous results
- a way to recover from mistakes

Non-image based tracking

Biometrics: fingerprints

skin pattern

facial thermogram

gait

dna

Governmental: social security numbers

tax records

credit reports

Hydra, University of Notre Dame

retrospective surveillance:

The goal of these systems is to review the captured scenes from other sites in order to validate whether a hint of threat detected at the local site is part of a larger pattern. Imagine our proposed infrastructure deployed to monitor all important landmarks in the United States...Analyzing the images from multiple cameras peering into the crowds can allow detection algorithms to potentially make more reliable identification of terrorists than single cameras. More importantly, we can develop recognition algorithms that, when triggered by the suspicious activity of one tourist, analyze the stored streams from other landmarks to see if this same tourist exhibited suspicious behavior in those other sites; activity which may be missed by each site locally. Analyzing the streams in concert can also help identify more complex threat behavioral patterns.

Hydra, University of Notre Dame

...One can imagine recognition algorithms that identify a threat event that involves multiple actors; identified not only because each of these actors exhibit similar suspicious behavior but also by the fact that they all scoped out the landmark sites without overlapping with each other. While one person was identified as video taping the Empire State building and the Statue of Liberty in NYC and Sears tower in Chicago within a week of each other, another individual was also noticed video taping the Brooklyn bridge in New York and Navy pier in Chicago in the same week. Note that tourists video taping landmark sites itself is not the threat; rather the specific pattern and choice of sites might give clues to suspicious behavior...

Hydra, University of Notre Dame

Also: http://turbulence.org/Works/swipe/barcode.html

http://www.takeawayfestival.com/taxonomy/term/81

Image based tracking

Direct tracking:

-image analysis

Indirect tracking:

-image differencing

-optical flow

Visual servoing:

-feed results back into a controller to

steer a moving vehicle

Difference images: pyramid of change > nth derivative

A stream of images in time; time here is a variable (a function of the frame number).

[Figure: image frames numbered 0 through 3 stacked along the time axis]

Effective for global properties
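The nth-derivative idea can be sketched directly in NumPy. The following is illustrative only: the 8x8 synthetic frames and the threshold of 25 are invented for the demo. It computes first- and second-order difference images of a small block moving across the frame:

```python
import numpy as np

def difference_image(frame_a, frame_b, threshold=25):
    """First-order difference: mark pixels that changed between two frames."""
    diff = np.abs(frame_a.astype(np.int16) - frame_b.astype(np.int16))
    return (diff > threshold).astype(np.uint8)

# Three synthetic 8x8 grayscale frames: a bright 2x2 block moving right.
frames = []
for t in range(3):
    f = np.zeros((8, 8), dtype=np.uint8)
    f[3:5, t:t + 2] = 255
    frames.append(f)

# First derivative: change between consecutive frames.
d1 = [difference_image(frames[t], frames[t + 1]) for t in range(2)]

# Second derivative: change of the change (next level of the pyramid).
d2 = difference_image(d1[0] * 255, d1[1] * 255)

print(d1[0].sum(), d2.sum())  # 4 8
```

Only the leading and trailing edges of the moving block survive the first difference, which is why this is effective for global properties (is anything moving?) rather than for identifying what moves.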

Visual tracking of moving objects (multi-camera)

The first step in tracking objects is for the system to distinguish moving objects from stationary ones through feature selection and detection.

The next step requires the system to make note of the location, speed, size, and shape of moving objects.

Finally, the system must learn to recognize and track the same object as it moves out of the visual field of one camera and into the next. The computer can do this as long as the visual fields of its cameras overlap at least somewhat.

Visual Tracking with a single camera: algorithm

feature selection

while (streaming)

{

feature extraction

feature location

}

Advanced Outdoor Line Following: First results in vision-based crop line tracking. Mark Ollis & Anthony Stentz, Robotics Institute (1996)

“The color segmentation algorithm has two parts: a discriminant and a segmentor. The discriminant computes a function d(i,j) of individual pixels whose output provides some information about whether that pixel is in the cut region or the uncut region; the segmentor then uses the discriminant to produce a segmentation.”

http://www.ri.cmu.edu/pub_files/pub1/ollis_mark_1996_1/ollis_mark_1996_1.pdf

Indoor Line Following: Vision-based Line Tracking and Navigation in Structured Environments. G. Reccari, Y. Caselli, F. Zanichelli, A. Calafiore

Features

-color

-brightness

-texture

-location

-size

-form: local and global geometry

-other (eigenfaces for face detection)

> isolated and combined

Outlines and Templates

-> cvMatchTemplate (C/C++ only)

[Figure: an image, a template, and the occurrences of the template in the image]

In OpenCV under C/C++:

cvMatchTemplate(image, template, result, CV_TM_CCOEFF_NORMED);

http://www.cs.rit.edu/~gsp8334/OCVT/OpenCVTutorial_III.pdf

[Figure: a point (x, y) in one frame maps to (x', y') after rotation, scale, and translation]

Rotation, Scale, Translation-Invariant Template Matching

http://www.lps.usp.br/~hae/software/cirateg/index.html

Texture

What is texture?

- a feature that repeats with some variation
- need to separate the repeating elements from the constant elements
- often approached with probabilistic distributions
- also: wavelets and neural nets
- example: Anil K. Jain, Kalle Karu, Learning Texture Discrimination Masks (February 1996, Vol. 18, No. 2, pp. 195-205)

Eigenfaces

Eigenfaces are a set of eigenvectors used in the computer vision problem of human face recognition. These eigenvectors are derived from the covariance matrix of the probability distribution of the high-dimensional vector space of possible faces of human beings.

The technique has been used for handwriting, lip reading, voice recognition, and medical imaging.

Eigenfaces can be imagined as a set of "standardized face ingredients", derived from statistical analysis of many pictures of faces. Any human face can be considered to be a combination of these standard faces (everyone has eyes, a nose, a mouth..). One person's face might be made up of 10% from face 1, 24% from face 2 and so on.

Eigenfaces

Practically, eigenfaces are created by finding feature vectors based on deviations from an averaged training set.

a) collect some images
b) find the average image (sum and divide)
c) find the deviation images (differences between individual images and the average image)
d) calculate the covariance matrix (a measure of how much variables vary in the same way)
e) calculate the eigenvectors of the covariance matrix (vectors that the matrix only scales, without changing their direction)
f) construct eigenfaces by combining N eigenvectors

Check Wikipedia for an intuitive description of eigenvectors:
http://en.wikipedia.org/wiki/Eigenvector
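Steps a) through f) fit in a few lines of NumPy. The example below is a sketch only: the 20 random 8x8 "faces" and the choice N = 5 are invented for the demo, and real eigenface code would use actual face images:

```python
import numpy as np

rng = np.random.default_rng(0)

# a) collect some images: 20 training "faces", each 8x8, flattened to 64-vectors
faces = rng.random((20, 64))

# b) the average image (sum and divide)
mean_face = faces.mean(axis=0)

# c) the deviation images
deviations = faces - mean_face

# d) the covariance matrix of the 64 pixel variables
cov = np.cov(deviations, rowvar=False)

# e) eigenvectors of the covariance matrix, sorted by decreasing variance
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvecs = eigvecs[:, order]

# f) keep the top N eigenvectors as the eigenfaces
N = 5
eigenfaces = eigvecs[:, :N]          # each column is one eigenface

# Any face is now described by N weights instead of 64 pixels:
weights = deviations @ eigenfaces
print(weights.shape)  # (20, 5)
```

The weights are the "10% from face 1, 24% from face 2" mix described above; comparing weight vectors is how recognition is done.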

The Rice University Eigenface Group has Python face recognition code implementing eigenfaces:

http://www.owlnet.rice.edu/~elec301/Projects99/faces/code.html

University of Pittsburgh has an online face detection program:
http://demo.pittpatt.com/

Problems with features

-perspective

-underdefinition (3d, content)

-lighting

-occlusion

-distance

-image resolution

-test data without training data

General approach

feature selection

while (streaming)

{

feature extraction

feature location

}

feature location: choose an invariant property

for example: center of mass

Newton's second law for each body, with internal forces Fi1, Fi2 and external forces F1, F2:

Fi1 + F1 = m1a1

Fi2 + F2 = m2a2

According to Newton's third law the two internal forces are equal and opposite, so adding the equations cancels them:

F1 + F2 = m1a1 + m2a2

The two bodies are combined into one body with mass m1 + m2. This body is acted upon by the same external forces as our two bodies, and gets an acceleration which we call the acceleration of the center of mass, acm:

m1a1 + m2a2 = (m1 + m2) acm

Acceleration is the second derivative of position, so the same relation holds for the positions:

m1x1 + m2x2 = (m1 + m2) xcm

The center of mass 'balances' the two different weights:

xcm = (m1x1 + m2x2) / (m1 + m2)

Center of Mass

Calculating the CG in matlab

%find the center of gravity of image Y (k rows x j columns)

[k, j] = size(Y);
m = 1:k;        % row indices
n = 1:j;        % column indices
C = []; D = [];

for i = 1:j
    C = [C, (Y(:,i)' * m') / sum(Y(:,i))];    % weighted mean row index of column i
end

for i = 1:k
    D = [D, (Y(i,:) * n') / sum(Y(i,:))];     % weighted mean column index of row i
end

PY = round(median(C));    % Y coordinate of the COG
PX = round(median(D));    % X coordinate of the COG
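A NumPy equivalent can be written with image moments, i.e. intensity-weighted means over the whole image, rather than the per-column/per-row medians of the Matlab snippet; the blob image below is invented for the demo:

```python
import numpy as np

def center_of_gravity(Y):
    """Intensity-weighted center of a grayscale image Y (rows x cols)."""
    rows, cols = np.indices(Y.shape)
    total = Y.sum()
    py = (rows * Y).sum() / total   # Y coordinate (row)
    px = (cols * Y).sum() / total   # X coordinate (column)
    return round(px), round(py)

# A single bright blob centered at column 6, row 3:
Y = np.zeros((10, 10))
Y[2:5, 5:8] = 1.0
print(center_of_gravity(Y))  # (6, 3)
```

For a single blob the two approaches agree; the median trick in the Matlab version is slightly more robust to stray noise pixels.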

Color based tracking

xstart, ystart, xnew, ynew, xprevious, yprevious

while (streaming) {

feature extraction > hue >binarize >area

feature location > get CG coordinates > xnew, ynew

xstart = xnew //set only once

ystart = ynew //set only once

xstart - xprevious //check distance travelled

ystart - yprevious //check distance travelled

xprevious = xnew //update continuously

yprevious = ynew //update continuously

}

Simple Hack with PIL

Calculate the bounding box around a region of interest

#binarize the final image
mask6 = mask5.convert("1", dither=Image.NONE)

#find outer corners of the remaining area
for i in range(x):
    for j in range(y):
        if mask6.getpixel((i, j)):
            if i < xmin: xmin = i
            if i > xmax: xmax = i
            if j < ymin: ymin = j
            if j > ymax: ymax = j    # bug fix: the original tested y instead of j

box = (xmin, ymin, xmax, ymax)
draw.rectangle(box, outline=(255, 0, 0))
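The hack above assumes mask5, draw, x, y and the min/max variables already exist. A self-contained version of the same idea, with a synthetic 50x50 mask invented for the demo, looks like this:

```python
from PIL import Image, ImageDraw

# Synthetic grayscale mask with a white blob at columns 30..39, rows 10..19.
mask = Image.new("L", (50, 50), 0)
for i in range(30, 40):        # x
    for j in range(10, 20):    # y
        mask.putpixel((i, j), 255)

mask_bin = mask.convert("1")   # binarize (input is already pure 0/255)

# Scan all pixels for the outer corners of the white region.
x, y = mask_bin.size
xmin, ymin, xmax, ymax = x, y, 0, 0
for i in range(x):
    for j in range(y):
        if mask_bin.getpixel((i, j)):
            xmin, xmax = min(xmin, i), max(xmax, i)
            ymin, ymax = min(ymin, j), max(ymax, j)

box = (xmin, ymin, xmax, ymax)
print(box)  # (30, 10, 39, 19)

# Draw the bounding box on an RGB copy.
out = mask_bin.convert("RGB")
ImageDraw.Draw(out).rectangle(box, outline=(255, 0, 0))
```

The pixel-by-pixel scan is slow for large images; it is fine as a classroom hack, which is all the slide claims.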

Simple tracking based on color

xstart, ystart, xnew, ynew, xprevious, yprevious

while (streaming) {

feature extraction > hue

binarize

area(s)

feature location > get CoM coordinates > xnew, ynew

xstart - xnew

ystart - ynew

xstart - xprevious

ystart - yprevious

xprevious = xnew

yprevious = ynew

}

http://www.ri.cmu.edu/projects/project_320.html

Advanced form tracker: car finder

Additional topics:

industrial applications: moving line tracking
http://www.braintech.com/videos-honda.php

clustering: k-means
http://www.indiana.edu/~dll/B657/B657_lec_kmeans.pdf

decomposition: principal component analysis
http://www.indiana.edu/~dll/B657/B657_lec_pca.pdf