Computer Science Technical Report
ANALYSIS OF PCA-BASED AND FISHER
DISCRIMINANT-BASED IMAGE RECOGNITION
ALGORITHMS
Wendy S. Yambor
July 2000
Technical Report CS-00-103
Computer Science Department
Colorado State University
Fort Collins, CO 80523-1873
Phone: (970) 491-5792  Fax: (970) 491-2466
WWW: http://www.cs.colostate.edu
THESIS
ANALYSIS OF PCA-BASED AND FISHER DISCRIMINANT-BASED
IMAGE RECOGNITION ALGORITHMS
Submitted by
Wendy S. Yambor
Department of Computer Science
In Partial Fulfillment of the Requirements
For the Degree of Master of Science
Colorado State University
Fort Collins, Colorado
Summer 2000
COLORADO STATE UNIVERSITY
July 6, 2000
WE HEREBY RECOMMEND THAT THE THESIS PREPARED
UNDER OUR SUPERVISION BY WENDY S. YAMBOR ENTITLED
ANALYSIS OF PCA-BASED AND FISHER DISCRIMINANT-BASED
IMAGE RECOGNITION ALGORITHMS BE ACCEPTED AS FULFILLING IN
PART REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE.
Committee on Graduate Work
Advisor
Co-Advisor
Department Head
ABSTRACT OF THESIS
ANALYSIS OF PCA-BASED AND FISHER DISCRIMINANT-BASED IMAGE RECOGNITION ALGORITHMS
One method of identifying images is to measure the similarity between images. This is
accomplished by using measures such as the L1 norm, L2 norm, covariance,
Mahalanobis distance, and correlation. These similarity measures can be calculated on
the images in their original space or on the images projected into a new space. I discuss
two alternative spaces in which these similarity measures may be calculated: the subspace
created by the eigenvectors of the covariance matrix of the training data and the subspace
created by the Fisher basis vectors of the data. Variations of these spaces will be
discussed as well as the behavior of similarity measures within these spaces.
Experiments are presented comparing recognition rates for different similarity measures
and spaces using hand labeled imagery from two domains: human face recognition and
classifying an image as a cat or a dog.
Wendy S. Yambor
Computer Science Department
Colorado State University
Fort Collins, CO 80523
Summer 2000
Acknowledgments
I thank my committee, Ross Beveridge, Bruce Draper, Michael Kirby, and Adele Howe,
for their support and knowledge over the past two years. Every member of my
committee has been involved in some aspect of this thesis. It is through their interest and
persuasion that I gained knowledge in this field.
I thank Jonathon Phillips for providing me with the results and images from the FERET
evaluation. Furthermore, I thank Jonathon for patiently answering numerous questions.
Table of Contents
1. Introduction
  1.1 Previous Work
  1.2 A General Algorithm
  1.3 Why Study These Subspaces?
  1.4 Organization of Following Sections
2. Eigenspace Projection
  2.1 Recognizing Images Using Eigenspace, Tutorial on Original Method
  2.2 Tutorial for Snapshot Method of Eigenspace Projection
  2.3 Variations
3. Fisher Discriminants
  3.1 Fisher Discriminants Tutorial (Original Method)
  3.2 Fisher Discriminants Tutorial (Orthonormal Basis Method)
4. Variations
  4.1 Eigenvector Selection
  4.2 Ordering Eigenvectors by Like-Image Difference
  4.3 Similarity & Distance Measures
  4.4 Are similarity measures the same inside and outside of eigenspace?
5. Experiments
  5.1 Datasets
    5.1.1 The Cat & Dog Dataset
    5.1.2 The FERET Dataset
    5.1.3 The Restructured FERET Dataset
  5.2 Bagging and Combining Similarity Measures
    5.2.1 Adding Distance Measures
    5.2.2 Distance Measure Aggregation
    5.2.3 Correlating Distance Metrics
  5.3 Like-Image Difference on the FERET Dataset
  5.4 Cat & Dog Experiments
  5.5 FERET Experiments
6. Conclusion
  6.1 Experiment Summary
  6.2 Future Work
Appendix I
References
1. Introduction
Two image recognition systems are examined, eigenspace projection and Fisher
discriminants. Each of these systems examines images in a subspace. The eigenvectors
of the covariance matrix of the training data create the eigenspace. The basis vectors
calculated by Fisher discriminants create the Fisher discriminants subspace. Variations
of these subspaces are examined. The first variation is the selection of vectors used to
create the subspaces. The second variation is the measurement used to calculate the
difference between images projected into these subspaces. Experiments are performed to
test hypotheses regarding the relative performance of subspace and difference measures.
Neither eigenspace projection nor Fisher discriminants are new ideas. Both have been
examined by researchers for many years. It is the work of these researchers that has helped
to revolutionize image recognition and bring face recognition to the point where it is now
usable in industry.
1.1 Previous Work
Projecting images into eigenspace is a standard procedure for many appearance-based
object recognition algorithms. A basic explanation of eigenspace projection is provided
by [20]. Michael Kirby was the first to introduce the idea of the low-dimensional
characterization of faces. Examples of his use of eigenspace projection can be found in
[7,8,16]. Turk & Pentland worked with eigenspace projection for face recognition [21].
More recently, Shree Nayar used eigenspace projection to identify objects, using a
turntable to view them at different angles, as explained in [11].
R.A. Fisher developed Fisher's linear discriminant in the 1930s [5]. Not until recently
have Fisher discriminants been utilized for object recognition. An explanation of Fisher
discriminants can be found in [4]. Swets and Weng used Fisher discriminants to cluster
images for the purpose of identification in 1996 [18,19,23]. Belhumeur, Hespanha, and
Kriegman also used Fisher discriminants to identify faces by training and testing with
several faces under different lighting conditions [1].
1.2 A General Algorithm
An image may be viewed as a vector of pixels where the value of each entry in the vector
is the grayscale value of the corresponding pixel. For example, an 8x8 image may be
unwrapped and treated as a vector of length 64. The image is said to sit in N-dimensional
space, where N is the number of pixels (and the length of the vector). This vector
representation of the image is considered to be the original space of the image.
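The unwrapping described above can be sketched in a few lines. This is a hypothetical illustration using NumPy; the thesis itself does not prescribe an implementation:

```python
import numpy as np

# A hypothetical 8x8 grayscale image (values would be 0-255 in practice).
image = np.arange(64, dtype=np.float64).reshape(8, 8)

# "Unwrap" the image row by row into a vector of length N = 64.
# This vector is the image's representation in its original space.
vector = image.flatten()

print(vector.shape)  # (64,)
```

The same idea scales to realistic images: a 128x128 image becomes a point in a 16384-dimensional space.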
The original space of an image is just one of infinitely many spaces in which the image
can be examined. Two specific subspaces are the subspace created by the eigenvectors of
the covariance matrix of the training data and the basis vectors calculated by Fisher
discriminants. Most subspaces, including eigenspace, do not optimize discrimination
characteristics: eigenspace optimizes variance among the images. The exception is the
Fisher discriminant subspace, which does optimize discrimination characteristics.
Although some of the details may vary, there is a basic algorithm for identifying images
by projecting them into a subspace. First one selects a subspace on which to project the
images. Once this subspace is selected, all training images are projected into this
subspace. Next, each test image is projected into this subspace and compared to all the
training images by a similarity or distance measure; the training image found to be most
similar or closest to the test image is used to identify the test image.
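The basic algorithm above can be sketched as a nearest-neighbor search in the subspace. This is a generic illustration (the basis matrix, function names, and the L2 distance are assumptions for the sketch, not the exact procedure used in the experiments):

```python
import numpy as np

def identify(test_image, train_images, train_labels, basis):
    """Identify a test image by nearest neighbor in a subspace.

    basis: (N, k) matrix whose columns span the chosen subspace,
    e.g. eigenvectors of the covariance matrix or Fisher basis vectors.
    """
    # Project all training images and the test image into the subspace.
    train_proj = train_images @ basis   # (m, k)
    test_proj = test_image @ basis      # (k,)
    # Compare with a distance measure (L2 norm here); the closest
    # training image identifies the test image.
    dists = np.linalg.norm(train_proj - test_proj, axis=1)
    return train_labels[int(np.argmin(dists))]

# Toy example: 4 training "images" of length 6, identity basis.
rng = np.random.default_rng(0)
train = rng.normal(size=(4, 6))
labels = ["a", "b", "c", "d"]
basis = np.eye(6)
print(identify(train[2] + 0.01, train, labels, basis))  # "c"
```

Swapping the distance measure or the basis matrix changes the variations studied in later chapters without changing this overall structure.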
1.3 Why Study These Subspaces?
Projecting images into subspaces has been studied for many years as discussed in the
previous work section. The research into these subspaces has helped to revolutionize
image recognition algorithms, specifically face recognition. When studying these
subspaces, an interesting question arises: under what conditions does projecting an image
into a subspace improve performance? The answer to this question is not an easy one.
Which subspace (if any) improves performance depends on the specific problem.
Furthermore, variations within the subspace also affect performance. For example, the
selection of vectors used to create the subspace and the measures used to decide which
images are the closest match both affect performance.
1.4 Organization of Following Sections
I discuss two alternative spaces commonly used to identify images. In chapter 2, I
discuss eigenspaces. Eigenspace projection, also known as the Karhunen-Loeve (KL)
transform and Principal Component Analysis (PCA), projects images into a subspace such
that the first orthogonal dimension of this subspace captures the greatest amount of
variance among the images and the last dimension captures the least. Two methods of
creating an eigenspace are examined: the original method and a method designed for
high-resolution images known as the snapshot method.
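As a rough sketch of this idea (a minimal illustration assuming mean-centered row-vector images; the original and snapshot methods are detailed in chapter 2), the eigenspace is spanned by the eigenvectors of the covariance matrix, ordered by decreasing eigenvalue:

```python
import numpy as np

def eigenspace(images, k):
    """Compute a k-dimensional eigenspace from row-vector images.

    Eigenvectors of the covariance matrix of the mean-centered
    training data, sorted so the first dimension captures the
    greatest variance and the last the least.
    """
    centered = images - images.mean(axis=0)
    cov = centered.T @ centered / (len(images) - 1)  # (N, N) covariance
    vals, vecs = np.linalg.eigh(cov)                 # ascending eigenvalues
    order = np.argsort(vals)[::-1]                   # descending variance
    return vecs[:, order[:k]]                        # (N, k) orthonormal basis

rng = np.random.default_rng(1)
basis = eigenspace(rng.normal(size=(10, 5)), k=3)
print(basis.shape)  # (5, 3)
```

Note that forming the full N x N covariance matrix is impractical for high-resolution images, which is exactly what motivates the snapshot method.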
In chapter 3, Fisher discriminants are discussed. Fisher discriminants project images such
that images of the same class are close to each other while images of different classes are
far apart. Two methods of calculating Fisher discriminants are examined. On