TRANSCRIPT

Face Recognition: An Introduction
Face Recognition
• Face is the most common biometric used by humans
• Applications range from static mug-shot verification to dynamic, uncontrolled face identification in a cluttered background
• Challenges:
  • automatically locate the face
  • recognize the face from a general viewpoint under different illumination conditions, facial expressions, and aging effects
Authentication vs. Identification
• Face Authentication/Verification (1:1 matching)
• Face Identification/Recognition (1:N matching)
Applications
• Access Control
  www.viisage.com
  www.visionics.com

Applications
• Video Surveillance (on-line or off-line)
• Face Scan at Airports
  www.facesnap.de
Why is Face Recognition Hard?
Many faces of Madonna
Face Recognition Difficulties
• Identify similar faces (inter-class similarity)
• Accommodate intra-class variability due to:
  • head pose
  • illumination conditions
  • expressions
  • facial accessories
  • aging effects
• Cartoon faces
Inter-class Similarity
• Different persons may have very similar appearance

Twins (www.marykateandashley.com)    Father and son (news.bbc.co.uk/hi/english/in_depth/americas/2000/us_elections)
Intra-class Variability
• Faces with intra-subject variations in pose, illumination, expression, accessories, color, occlusions, and brightness
Sketch of a Pattern Recognition Architecture

[Image (window) → Feature Extraction → Feature Vector → Classification → Object Identity]
Example: Face Detection
• Scan window over image
• Classify window as either:
  – Face
  – Non-face

[Window → Classifier → Face / Non-face]
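The scan-and-classify loop above can be sketched as follows. `classify` stands in for any binary window classifier; the mean-intensity test used below is a toy stand-in for illustration, not a real face model:

```python
import numpy as np

def sliding_window_detect(image, classify, win=24, step=4):
    """Slide a win x win window over a grayscale image and keep every
    window the binary classifier labels as a face."""
    h, w = image.shape
    detections = []
    for y in range(0, h - win + 1, step):
        for x in range(0, w - win + 1, step):
            window = image[y:y + win, x:x + win]
            if classify(window):  # True -> face, False -> non-face
                detections.append((x, y, win, win))
    return detections

# Toy stand-in classifier: call a window a "face" if its mean intensity
# falls in a mid-gray band (illustrative only).
toy = np.zeros((48, 48))
toy[12:36, 12:36] = 0.5
boxes = sliding_window_detect(toy, lambda w: 0.3 < w.mean() < 0.7)
```

A real detector would also scan over scale (an image pyramid) and merge overlapping detections.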
Detection Test Sets
Profile views
Schneiderman’s Test Set
Face Detection: Experimental Results
Test sets: two CMU benchmark data sets
• Test set 1: 125 images with 483 faces
• Test set 2: 20 images with 136 faces
[See also work by Viola & Jones, Rehg, and more recent work by Schneiderman]
Example: Finding Skin
Non-parametric Representation of CCD
• Skin has a very small range of (intensity-independent) colors, and little texture
  – Compute an intensity-independent color measure, check if the color is in this range, and check if there is little texture (median filter)
  – See this as a classifier: we can set up the tests by hand, or learn them
  – Get class-conditional densities (histograms) and priors from data (counting)
• Classifier: label a pixel as skin if the posterior P(skin | color), computed from the class-conditional densities and priors, exceeds a threshold
Figure from “Statistical color models with application to skin detection,” M.J. Jones and J. Rehg, Proc. Computer Vision and Pattern Recognition, 1999. © 1999 IEEE.
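A minimal sketch of the histogram approach just described, assuming pixels are given as intensity-independent (r, g) chromaticity pairs in [0, 1]; the bin count and threshold are arbitrary illustrative choices, not values from the paper:

```python
import numpy as np

def fit_skin_histograms(skin_px, nonskin_px, bins=8):
    """Class-conditional densities as normalized 2-D chroma histograms,
    and the skin prior from counting (skin_px, nonskin_px: (N, 2) arrays)."""
    rng = [[0, 1], [0, 1]]
    h_skin, _, _ = np.histogram2d(skin_px[:, 0], skin_px[:, 1], bins=bins, range=rng)
    h_non, _, _ = np.histogram2d(nonskin_px[:, 0], nonskin_px[:, 1], bins=bins, range=rng)
    prior_skin = len(skin_px) / (len(skin_px) + len(nonskin_px))
    return h_skin / h_skin.sum(), h_non / h_non.sum(), prior_skin

def classify_pixel(rg, h_skin, h_non, prior_skin, theta=0.5):
    """Label a pixel skin if the posterior P(skin | x) exceeds theta."""
    i = min(int(rg[0] * h_skin.shape[0]), h_skin.shape[0] - 1)
    j = min(int(rg[1] * h_skin.shape[1]), h_skin.shape[1] - 1)
    p_x_skin, p_x_non = h_skin[i, j], h_non[i, j]
    evidence = p_x_skin * prior_skin + p_x_non * (1 - prior_skin)
    return evidence > 0 and (p_x_skin * prior_skin / evidence) > theta
```

This is exactly the "histograms plus counting" recipe from the bullets: no parametric model is fit, the histograms themselves are the densities.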
Face Detection
Face Detection Algorithm

[Input Image → Lighting Compensation → Color Space Transformation → Skin Color Detection → Variance-based Segmentation → Connected Component & Grouping → Face Localization → Eye/Mouth Detection → Face Boundary Detection → Verifying/Weighting Eyes-Mouth Triangles → Facial Feature Detection → Output Image]
Canon PowerShot
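The flowchart can be read as a chain of stages. The skeleton below mirrors that wiring; every stage body is a trivial placeholder (the transcript does not give the actual operations), so only the structure is meaningful:

```python
import numpy as np

def lighting_compensation(img):
    # Placeholder: normalize intensities to [0, 1].
    return img / max(float(img.max()), 1e-8)

def color_space_transformation(img):
    # Placeholder for e.g. an RGB -> chroma transform.
    return img

def skin_color_detection(img):
    # Placeholder: threshold to a binary skin mask.
    return img > 0.5

def connected_component_grouping(mask):
    # Placeholder: treat the whole mask as one candidate region.
    return [mask]

def detect_faces(img):
    """Input image -> candidate face regions (detection half of the pipeline)."""
    img = lighting_compensation(img)
    img = color_space_transformation(img)
    mask = skin_color_detection(img)
    return connected_component_grouping(mask)
```

The remaining stages (eye/mouth detection through facial feature detection) would consume the candidate regions in the same fashion.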
Face Recognition: 2-D and 3-D

[Diagram: recognition data (2-D, 3-D, or time/video) is compared against a 2-D/3-D face database, using prior knowledge of the face class]
Taxonomy of Face Recognition Algorithms
• Pose-dependency: pose-dependent vs. pose-invariant
• Face representation: viewer-centered images vs. object-centered models
• Matching features:
  – Appearance-based (holistic): PCA, LDA, LFA
  – Feature-based (analytic): EBGM
  – Hybrid
• References cited on the slide: Gordon et al., 1995; Atick et al., 1996; Lengagne et al., 1996; Yan et al., 1996; Zhao et al., 2000; Zhang et al., 2000
Image as a Feature Vector
• Consider an n-pixel image to be a point in an n-dimensional space, x ∈ R^n
• Each pixel value is a coordinate of x
Nearest Neighbor Classifier
• {R_j} is the set of training images
• ID = argmin_j dist(R_j, I)
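ID = argmin_j dist(R_j, I) in code, here with Euclidean distance on raw pixel vectors (any of the distance variants mentioned in the comments that follow could be swapped in):

```python
import numpy as np

def nn_classify(train_imgs, train_ids, test_img):
    """Return the identity of the training image nearest to the test image."""
    R = np.stack([im.ravel() for im in train_imgs])   # one row per R_j
    d = np.linalg.norm(R - test_img.ravel(), axis=1)  # dist(R_j, I) for all j
    return train_ids[int(np.argmin(d))]
```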
Comments
• Sometimes called “Template Matching”
• Variations on distance function (e.g. L1, robust distances)
• Multiple templates per class (perhaps many training images per class)
• Expensive to compute k distances, especially when each image is big (N-dimensional)
• May not generalize well to unseen examples of a class
• Some solutions:
  – Bayesian classification
  – Dimensionality reduction
Eigenfaces (Turk, Pentland, 91) - 1
• Use Principal Component Analysis (PCA) to reduce the dimensionality
How do you construct Eigenspace?

[ x1 x2 x3 x4 x5 ] → W

Construct the data matrix by stacking the vectorized images as columns, then apply Singular Value Decomposition (SVD); the leading left singular vectors form the basis W.
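One way to sketch this construction with NumPy, centering the columns on the mean image first:

```python
import numpy as np

def eigenspace_basis(images, k):
    """Stack vectorized images as columns, center them, and take the top-k
    left singular vectors of the data matrix as the eigenspace basis W."""
    A = np.stack([im.ravel() for im in images], axis=1)  # data matrix [x1 x2 ...]
    mean = A.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(A - mean, full_matrices=False)
    return U[:, :k], mean.ravel()  # basis W (n x k) and the mean image
```

The left singular vectors of the centered data matrix are exactly the eigenvectors of the sample covariance matrix, so the SVD avoids ever forming the large n × n covariance explicitly.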
Eigenfaces
• Modeling
  1. Given a collection of n labeled training images,
  2. Compute the mean image and covariance matrix.
  3. Compute the k eigenvectors (note that these are images) of the covariance matrix corresponding to the k largest eigenvalues.
  4. Project the training images to the k-dimensional eigenspace.
• Recognition
  1. Given a test image, project it to the eigenspace.
  2. Perform classification against the projected training images.
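Step 4 and the two recognition steps, sketched assuming a basis W (n × k, columns of eigenvectors) and the mean image are already available:

```python
import numpy as np

def project(img, W, mean):
    """Project a vectorized image into the k-dimensional eigenspace."""
    return W.T @ (img.ravel() - mean)

def recognize(test_img, W, mean, train_coords, train_labels):
    """Project the test image, then nearest neighbor among the projected
    training images (train_coords: k x n_train matrix of their codes)."""
    y = project(test_img, W, mean)
    d = np.linalg.norm(train_coords - y[:, None], axis=0)
    return train_labels[int(np.argmin(d))]
```

Nearest neighbor here is the same classifier as before, only applied in the k-dimensional eigenspace instead of raw pixel space.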
Eigenfaces: Training Images
[Turk & Pentland, 1991]
Eigenfaces
Mean Image Basis Images
Difficulties with PCA
• Projection may suppress important detail
  – the smallest-variance directions may not be unimportant
• The method does not take the discriminative task into account
  – typically, we wish to compute features that allow good discrimination
  – not the same as largest variance
Fisherfaces: Class-specific Linear Projection
• An n-pixel image x ∈ R^n can be projected to a low-dimensional feature space y ∈ R^m by
    y = W^T x
  where W is an n-by-m matrix.
• Recognition is performed using nearest neighbor in R^m.
• How do we choose a good W?
PCA & Fisher’s Linear Discriminant
• Between-class scatter:
    S_B = \sum_{i=1}^{c} |\chi_i| (\mu_i - \mu)(\mu_i - \mu)^T
• Within-class scatter:
    S_W = \sum_{i=1}^{c} \sum_{x_k \in \chi_i} (x_k - \mu_i)(x_k - \mu_i)^T
• Total scatter:
    S_T = \sum_{i=1}^{c} \sum_{x_k \in \chi_i} (x_k - \mu)(x_k - \mu)^T = S_B + S_W
• Where
  – c is the number of classes
  – μ is the mean of all samples
  – μ_i is the mean of class χ_i
  – |χ_i| is the number of samples of class χ_i
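The scatter matrices just defined, in code (X holds one sample per row, y the class labels); the identity S_T = S_B + S_W lets the total scatter be returned as their sum:

```python
import numpy as np

def scatter_matrices(X, y):
    """S_B, S_W, and S_T = S_B + S_W for samples X (N x n) with labels y."""
    n = X.shape[1]
    mu = X.mean(axis=0)  # overall mean
    S_B = np.zeros((n, n))
    S_W = np.zeros((n, n))
    for c in np.unique(y):
        Xc = X[y == c]
        mu_c = Xc.mean(axis=0)  # class mean mu_i
        d = (mu_c - mu)[:, None]
        S_B += len(Xc) * (d @ d.T)          # |chi_i| (mu_i - mu)(mu_i - mu)^T
        S_W += (Xc - mu_c).T @ (Xc - mu_c)  # sum over x_k in chi_i
    return S_B, S_W, S_B + S_W
```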
PCA & Fisher’s Linear Discriminant
• PCA (Eigenfaces):
    W_{pca} = \arg\max_W |W^T S_T W|
  Maximizes projected total scatter.
• Fisher’s Linear Discriminant (FLD):
    W_{fld} = \arg\max_W \frac{|W^T S_B W|}{|W^T S_W W|}
  Maximizes the ratio of projected between-class scatter to projected within-class scatter.
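W_fld can be found as the leading eigenvectors of S_W^{-1} S_B; a sketch assuming S_W is invertible (in the Fisherface method, a preceding PCA step is what guarantees this):

```python
import numpy as np

def fisher_directions(S_B, S_W, m):
    """Columns of W_fld: the m leading eigenvectors of S_W^{-1} S_B,
    which maximize |W^T S_B W| / |W^T S_W W|."""
    vals, vecs = np.linalg.eig(np.linalg.solve(S_W, S_B))
    order = np.argsort(-vals.real)  # sort by decreasing eigenvalue
    return np.real(vecs[:, order[:m]])
```

With c classes, S_B has rank at most c - 1, so at most c - 1 useful discriminant directions exist.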
Four Fisherfaces From ORL Database
Eigenfaces and Fisherfaces