TRANSCRIPT

Face Recognition: An Introduction
Face Recognition
• Face is the most common biometric used by humans
• Applications range from static mug-shot verification to dynamic, uncontrolled face identification in a cluttered background
• Challenges:
  • automatically locate the face
  • recognize the face from a general viewpoint under different illumination conditions, facial expressions, and aging effects
Authentication vs. Identification
• Face Authentication/Verification (1:1 matching)
• Face Identification/Recognition (1:N matching)
Applications
• Access Control
  www.viisage.com
  www.visionics.com

Applications
• Video Surveillance (on-line or off-line)
• Face Scan at Airports
  www.facesnap.de
Why is Face Recognition Hard?
Many faces of Madonna
Face Recognition Difficulties
• Identify similar faces (inter-class similarity)
• Accommodate intra-class variability due to:
  • head pose
  • illumination conditions
  • expressions
  • facial accessories
  • aging effects
• Cartoon faces
Inter-class Similarity
• Different persons may have very similar appearance

Twins (www.marykateandashley.com)    Father and son (news.bbc.co.uk/hi/english/in_depth/americas/2000/us_elections)
Intra-class Variability
• Faces with intra-subject variations in pose, illumination, expression, accessories, color, occlusions, and brightness
Sketch of a Pattern Recognition Architecture

[Image (window) → Feature Extraction → Feature Vector → Classification → Object Identity]
Example: Face Detection
• Scan window over image
• Classify window as either:
  – Face
  – Non-face

[Window → Classifier → Face / Non-face]
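The scan-and-classify loop above can be sketched as follows. `classify` stands in for any binary window classifier; the mean-intensity test used below is a toy stand-in for illustration, not a real face model:

```python
import numpy as np

def sliding_window_detect(image, classify, win=24, step=4):
    """Slide a win x win window over a grayscale image and keep every
    window the binary classifier labels as a face."""
    h, w = image.shape
    detections = []
    for y in range(0, h - win + 1, step):
        for x in range(0, w - win + 1, step):
            window = image[y:y + win, x:x + win]
            if classify(window):  # True -> face, False -> non-face
                detections.append((x, y, win, win))
    return detections

# Toy stand-in classifier: call a window a "face" if its mean intensity
# falls in a mid-gray band (illustrative only).
toy = np.zeros((48, 48))
toy[12:36, 12:36] = 0.5
boxes = sliding_window_detect(toy, lambda w: 0.3 < w.mean() < 0.7)
```

A real detector would also scan over scale (an image pyramid) and merge overlapping detections.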
Detection Test Sets
Profile views
Schneiderman’s Test Set
Face Detection: Experimental Results
Test sets: two CMU benchmark data sets
• Test set 1: 125 images with 483 faces
• Test set 2: 20 images with 136 faces
[See also work by Viola & Jones, Rehg, and more recent work by Schneiderman]
Example: Finding Skin
Non-parametric Representation of CCD
• Skin has a very small range of (intensity-independent) colors, and little texture
  – Compute an intensity-independent color measure, check if the color is in this range, and check if there is little texture (median filter)
  – See this as a classifier: we can set up the tests by hand, or learn them
  – Get class-conditional densities (histograms) and priors from data (counting)
• Classifier: label a pixel as skin if the posterior P(skin | color), computed from the class-conditional densities and priors, exceeds a threshold
Figure from “Statistical color models with application to skin detection,” M.J. Jones and J. Rehg, Proc. Computer Vision and Pattern Recognition, 1999. © 1999 IEEE.
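A minimal sketch of the histogram approach just described, assuming pixels are given as intensity-independent (r, g) chromaticity pairs in [0, 1]; the bin count and threshold are arbitrary illustrative choices, not values from the paper:

```python
import numpy as np

def fit_skin_histograms(skin_px, nonskin_px, bins=8):
    """Class-conditional densities as normalized 2-D chroma histograms,
    and the skin prior from counting (skin_px, nonskin_px: (N, 2) arrays)."""
    rng = [[0, 1], [0, 1]]
    h_skin, _, _ = np.histogram2d(skin_px[:, 0], skin_px[:, 1], bins=bins, range=rng)
    h_non, _, _ = np.histogram2d(nonskin_px[:, 0], nonskin_px[:, 1], bins=bins, range=rng)
    prior_skin = len(skin_px) / (len(skin_px) + len(nonskin_px))
    return h_skin / h_skin.sum(), h_non / h_non.sum(), prior_skin

def classify_pixel(rg, h_skin, h_non, prior_skin, theta=0.5):
    """Label a pixel skin if the posterior P(skin | x) exceeds theta."""
    i = min(int(rg[0] * h_skin.shape[0]), h_skin.shape[0] - 1)
    j = min(int(rg[1] * h_skin.shape[1]), h_skin.shape[1] - 1)
    p_x_skin, p_x_non = h_skin[i, j], h_non[i, j]
    evidence = p_x_skin * prior_skin + p_x_non * (1 - prior_skin)
    return evidence > 0 and (p_x_skin * prior_skin / evidence) > theta
```

This is exactly the "histograms plus counting" recipe from the bullets: no parametric model is fit, the histograms themselves are the densities.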
Face Detection
Face Detection Algorithm

[Input Image → Lighting Compensation → Color Space Transformation → Skin Color Detection → Variance-based Segmentation → Connected Component & Grouping → Face Localization → Eye/Mouth Detection → Face Boundary Detection → Verifying/Weighting Eyes-Mouth Triangles → Facial Feature Detection → Output Image]
Canon PowerShot
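The flowchart can be read as a chain of stages. The skeleton below mirrors that wiring; every stage body is a trivial placeholder (the transcript does not give the actual operations), so only the structure is meaningful:

```python
import numpy as np

def lighting_compensation(img):
    # Placeholder: normalize intensities to [0, 1].
    return img / max(float(img.max()), 1e-8)

def color_space_transformation(img):
    # Placeholder for e.g. an RGB -> chroma transform.
    return img

def skin_color_detection(img):
    # Placeholder: threshold to a binary skin mask.
    return img > 0.5

def connected_component_grouping(mask):
    # Placeholder: treat the whole mask as one candidate region.
    return [mask]

def detect_faces(img):
    """Input image -> candidate face regions (detection half of the pipeline)."""
    img = lighting_compensation(img)
    img = color_space_transformation(img)
    mask = skin_color_detection(img)
    return connected_component_grouping(mask)
```

The remaining stages (eye/mouth detection through facial feature detection) would consume the candidate regions in the same fashion.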
Face Recognition: 2-D and 3-D

[Diagram: recognition data (2-D, 3-D, or time/video) is compared against a 2-D/3-D face database, using prior knowledge of the face class]
Taxonomy of Face Recognition Algorithms
• Pose-dependency: pose-dependent vs. pose-invariant
• Face representation: viewer-centered images vs. object-centered models
• Matching features:
  – Appearance-based (holistic): PCA, LDA, LFA
  – Feature-based (analytic): EBGM
  – Hybrid
• References cited on the slide: Gordon et al., 1995; Atick et al., 1996; Lengagne et al., 1996; Yan et al., 1996; Zhao et al., 2000; Zhang et al., 2000
Image as a Feature Vector
• Consider an n-pixel image to be a point in an n-dimensional space, x ∈ R^n
• Each pixel value is a coordinate of x
Nearest Neighbor Classifier
• {R_j} is the set of training images
• ID = argmin_j dist(R_j, I)
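ID = argmin_j dist(R_j, I) in code, here with Euclidean distance on raw pixel vectors (any of the distance variants mentioned in the comments that follow could be swapped in):

```python
import numpy as np

def nn_classify(train_imgs, train_ids, test_img):
    """Return the identity of the training image nearest to the test image."""
    R = np.stack([im.ravel() for im in train_imgs])   # one row per R_j
    d = np.linalg.norm(R - test_img.ravel(), axis=1)  # dist(R_j, I) for all j
    return train_ids[int(np.argmin(d))]
```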
Comments
• Sometimes called “Template Matching”
• Variations on distance function (e.g. L1, robust distances)
• Multiple templates per class (perhaps many training images per class)
• Expensive to compute k distances, especially when each image is big (N-dimensional)
• May not generalize well to unseen examples of a class
• Some solutions:
  – Bayesian classification
  – Dimensionality reduction
Eigenfaces (Turk, Pentland, 91) - 1
• Use Principal Component Analysis (PCA) to reduce the dimensionality
How do you construct Eigenspace?

[ x1 x2 x3 x4 x5 ] → W

Construct the data matrix by stacking the vectorized images as columns, then apply Singular Value Decomposition (SVD); the leading left singular vectors form the basis W.
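One way to sketch this construction with NumPy, centering the columns on the mean image first:

```python
import numpy as np

def eigenspace_basis(images, k):
    """Stack vectorized images as columns, center them, and take the top-k
    left singular vectors of the data matrix as the eigenspace basis W."""
    A = np.stack([im.ravel() for im in images], axis=1)  # data matrix [x1 x2 ...]
    mean = A.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(A - mean, full_matrices=False)
    return U[:, :k], mean.ravel()  # basis W (n x k) and the mean image
```

The left singular vectors of the centered data matrix are exactly the eigenvectors of the sample covariance matrix, so the SVD avoids ever forming the large n × n covariance explicitly.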
Eigenfaces
• Modeling
  1. Given a collection of n labeled training images,
  2. Compute the mean image and covariance matrix.
  3. Compute the k eigenvectors (note that these are images) of the covariance matrix corresponding to the k largest eigenvalues.
  4. Project the training images to the k-dimensional eigenspace.
• Recognition
  1. Given a test image, project it to the eigenspace.
  2. Perform classification against the projected training images.
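Step 4 and the two recognition steps, sketched assuming a basis W (n × k, columns of eigenvectors) and the mean image are already available:

```python
import numpy as np

def project(img, W, mean):
    """Project a vectorized image into the k-dimensional eigenspace."""
    return W.T @ (img.ravel() - mean)

def recognize(test_img, W, mean, train_coords, train_labels):
    """Project the test image, then nearest neighbor among the projected
    training images (train_coords: k x n_train matrix of their codes)."""
    y = project(test_img, W, mean)
    d = np.linalg.norm(train_coords - y[:, None], axis=0)
    return train_labels[int(np.argmin(d))]
```

Nearest neighbor here is the same classifier as before, only applied in the k-dimensional eigenspace instead of raw pixel space.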
Eigenfaces: Training Images
[Turk & Pentland, 1991]
Eigenfaces
Mean Image Basis Images
Difficulties with PCA
• Projection may suppress important detail
  – the smallest-variance directions may not be unimportant
• The method does not take the discriminative task into account
  – typically, we wish to compute features that allow good discrimination
  – not the same as largest variance
Fisherfaces: Class-specific Linear Projection
• An n-pixel image x ∈ R^n can be projected to a low-dimensional feature space y ∈ R^m by
    y = W^T x
  where W is an n-by-m matrix.
• Recognition is performed using nearest neighbor in R^m.
• How do we choose a good W?
PCA & Fisher’s Linear Discriminant
• Between-class scatter:
    S_B = \sum_{i=1}^{c} |\chi_i| (\mu_i - \mu)(\mu_i - \mu)^T
• Within-class scatter:
    S_W = \sum_{i=1}^{c} \sum_{x_k \in \chi_i} (x_k - \mu_i)(x_k - \mu_i)^T
• Total scatter:
    S_T = \sum_{i=1}^{c} \sum_{x_k \in \chi_i} (x_k - \mu)(x_k - \mu)^T = S_B + S_W
• Where
  – c is the number of classes
  – μ is the mean of all samples
  – μ_i is the mean of class χ_i
  – |χ_i| is the number of samples of class χ_i
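The scatter matrices just defined, in code (X holds one sample per row, y the class labels); the identity S_T = S_B + S_W lets the total scatter be returned as their sum:

```python
import numpy as np

def scatter_matrices(X, y):
    """S_B, S_W, and S_T = S_B + S_W for samples X (N x n) with labels y."""
    n = X.shape[1]
    mu = X.mean(axis=0)  # overall mean
    S_B = np.zeros((n, n))
    S_W = np.zeros((n, n))
    for c in np.unique(y):
        Xc = X[y == c]
        mu_c = Xc.mean(axis=0)  # class mean mu_i
        d = (mu_c - mu)[:, None]
        S_B += len(Xc) * (d @ d.T)          # |chi_i| (mu_i - mu)(mu_i - mu)^T
        S_W += (Xc - mu_c).T @ (Xc - mu_c)  # sum over x_k in chi_i
    return S_B, S_W, S_B + S_W
```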
PCA & Fisher’s Linear Discriminant
• PCA (Eigenfaces):
    W_{pca} = \arg\max_W |W^T S_T W|
  Maximizes projected total scatter.
• Fisher’s Linear Discriminant (FLD):
    W_{fld} = \arg\max_W \frac{|W^T S_B W|}{|W^T S_W W|}
  Maximizes the ratio of projected between-class scatter to projected within-class scatter.
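W_fld can be found as the leading eigenvectors of S_W^{-1} S_B; a sketch assuming S_W is invertible (in the Fisherface method, a preceding PCA step is what guarantees this):

```python
import numpy as np

def fisher_directions(S_B, S_W, m):
    """Columns of W_fld: the m leading eigenvectors of S_W^{-1} S_B,
    which maximize |W^T S_B W| / |W^T S_W W|."""
    vals, vecs = np.linalg.eig(np.linalg.solve(S_W, S_B))
    order = np.argsort(-vals.real)  # sort by decreasing eigenvalue
    return np.real(vecs[:, order[:m]])
```

With c classes, S_B has rank at most c - 1, so at most c - 1 useful discriminant directions exist.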
Four Fisherfaces From ORL Database
Eigenfaces and Fisherfaces