Lecture 8 - University of California, Berkeley
University of California, Berkeley
College of Engineering
Department of Electrical Engineering and Computer Sciences
Professors: N. Morgan / B. Gold
EE 225D, Spring 1999
Pattern Classification
Lecture 8
Speech Pattern Recognition
• Soft pattern classification plus temporal sequence integration
• Supervised pattern classification: class labels used in training
• Unsupervised pattern classification: class labels not available or used
[Block diagram: Pattern → Feature Extraction → Feature Vector (x1, x2, ..., xd) → Classification → class ωk, 1 ≤ k ≤ K]
• Training: learning parameters of classifier
• Testing: classify independent test set, compare with labels and score
Feature Extraction Criteria
• Class discrimination
• Generalization
• Parsimony (efficiency)
[Figure: energy contours E(t) vs. t for plosive + vowel energies at 2 different gains]
The time derivative of the log energy is invariant to a constant gain C:

∂/∂t log(C·E(t)) = ∂/∂t (log C + log E(t)) = ∂/∂t log E(t)
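This gain invariance can be checked numerically; a minimal NumPy sketch (the energy contour here is a hypothetical stand-in, not data from the lecture):

```python
import numpy as np

# Hypothetical smooth energy contour E(t) > 0.
t = np.linspace(0.01, 1.0, 100)
E = t**2 + 0.5

C = 10.0  # constant gain applied to the signal energy

# Finite-difference approximation of d/dt log E(t), with and without the gain.
d_logE = np.diff(np.log(E))
d_logCE = np.diff(np.log(C * E))

# The constant log C term differences away, so the two derivatives agree.
print(np.allclose(d_logE, d_logCE))  # True
```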
Feature Vector Size
• Best representations for discrimination on training set are large (highly dimensioned)
• Best representations for generalization to test set are (typically) succinct
Dimensionality Reduction
• Principal components (i.e., SVD, KL transform, eigenanalysis ...)
• Linear Discriminant Analysis (LDA)
• Application-specific knowledge
• Feature Selection via PR Evaluation
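As a sketch of the first item, principal components can be computed from the SVD of the mean-centered data matrix; the data here is synthetic and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))  # hypothetical feature vectors, one per row

# Center the data, then take the SVD; rows of Vt are the principal directions.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 3  # keep the k leading components
X_reduced = Xc @ Vt[:k].T  # project each feature vector onto the top-k directions

print(X_reduced.shape)  # (200, 3)
```

The singular values `s` come out in descending order, so the retained directions are those with the most variance.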
[Figure: two classes, plotted as x's and o's, forming separate clusters in the (f1, f2) feature plane]
PR Methods
• Minimum Distance
• Discriminant Functions
• Linear Discriminant
• Nonlinear Discriminant (e.g., quadratic, neural networks)
• Statistical Discriminant Functions
Minimum Distance
• Vector or matrix representing element
• Define a distance function
• Choose the class of stored element closest to new input
• Choice of distance equivalent to implicit statistical assumptions
• For speech, temporal variability complicates this
z_i = template vector (prototype)
x = input vector

Choose i to minimize distance:

argmin_i (x − z_i)^T (x − z_i) = argmin_i (x^T x + z_i^T z_i − 2 x^T z_i)
                              = argmax_i (z_i^T z_i − 2 x^T z_i) / (−2)
                              = argmax_i (x^T z_i − (1/2) z_i^T z_i)

(the x^T x term is the same for every i, so it can be dropped)

If z_i^T z_i = 1 for all i  ⇒  argmax_i (x^T z_i)
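The equivalence between minimum distance and maximum inner product for unit-norm templates can be demonstrated in a few lines; the templates and input here are random stand-ins:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical template vectors z_i, normalized so z_i^T z_i = 1 for all i.
Z = rng.normal(size=(5, 8))
Z /= np.linalg.norm(Z, axis=1, keepdims=True)

x = rng.normal(size=8)  # input vector

# Minimum-distance choice ...
i_dist = np.argmin(np.sum((x - Z) ** 2, axis=1))
# ... matches the maximum-inner-product choice when templates are unit norm.
i_dot = np.argmax(Z @ x)

print(i_dist == i_dot)  # True
```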
Problems with Min Distance
• Proper scaling of dimensions (size, discrimination)
• For high dim, sparsely sampled space
Decision Rule for Min Distance
• Nearest Neighbor (NN): in the limit of infinite samples, at most twice the error of optimum classifier
• k-Nearest Neighbor (kNN)
• Lots of storage for large problems; potentially large searches
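A minimal kNN classifier along these lines might look as follows; the training points and the function name are illustrative, not from the lecture:

```python
import numpy as np
from collections import Counter

def knn_classify(x, train_X, train_y, k=3):
    """Label x by majority vote among its k nearest training samples."""
    dists = np.sum((train_X - x) ** 2, axis=1)  # squared Euclidean distances
    nearest = np.argsort(dists)[:k]             # indices of the k closest
    return Counter(train_y[nearest]).most_common(1)[0][0]

# Hypothetical 2-D data: class 0 near the origin, class 1 near (5, 5).
train_X = np.array([[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0],
                    [5.0, 5.1], [4.9, 5.0], [5.1, 4.8]])
train_y = np.array([0, 0, 0, 1, 1, 1])

print(knn_classify(np.array([0.1, 0.1]), train_X, train_y))  # 0
print(knn_classify(np.array([5.0, 5.0]), train_X, train_y))  # 1
```

Note the storage cost the slide warns about: every training sample is kept, and every query scans all of them.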
Some Opinions
• Better to throw away bad data than to reduce its weight
• Dimensionality-reduction based on variance often a bad choice for supervised pattern recognition
Discriminant Analysis
• Discriminant functions max for correct class, min for others
• Decision surface between classes
• Linear decision surface for 2-dim is line, for 3 is plane; generally called hyperplane
• For 2 classes, surface at ω^T x + ω0 = 0
• 2-class quadratic case, surface at x^T W x + ω^T x + ω0 = 0
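For the 2-class linear case, classification reduces to checking which side of the hyperplane an input falls on; the weights below are arbitrary illustrative values:

```python
import numpy as np

# Hypothetical linear discriminant g(x) = w^T x + w0.
# The decision surface g(x) = 0 is a hyperplane; the sign of g picks the class.
w = np.array([1.0, -1.0])
w0 = 0.5

def classify(x):
    g = w @ x + w0
    return 1 if g > 0 else 2

print(classify(np.array([2.0, 0.0])))  # g = 2.5 > 0, class 1
print(classify(np.array([0.0, 2.0])))  # g = -1.5 < 0, class 2
```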
Training Discriminant Functions
• Minimum distance
• Fisher linear discriminant
• Gradient learning
Generalized Discriminators - ANNs
• McCulloch-Pitts neural model
• Rosenblatt Perceptron
• Multilayer Systems
The Perceptron
McCulloch-Pitts Neuron - Rosenblatt Perceptron
[Figure: inputs x1, x2, ..., xd weighted by w1, w2, ..., wd, summed (+) together with a bias term to produce output yo]
Perceptron Convergence
If classes are linearly separable, the following rule will converge in a finite number of steps:

For each pattern x at time step k:
  if x(k) ∈ class 1 and ω^T(k) x(k) ≤ 0  ⇒  ω(k+1) = ω(k) + c·x(k)
  if x(k) ∈ class 2 and ω^T(k) x(k) ≥ 0  ⇒  ω(k+1) = ω(k) − c·x(k)
  else ω(k+1) = ω(k)
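The update rule above can be sketched directly in code; the data, the bias-augmentation trick, and the function name are assumptions for illustration:

```python
import numpy as np

def train_perceptron(X, labels, c=1.0, max_epochs=100):
    """Perceptron rule: add c*x on class-1 errors, subtract c*x on
    class-2 errors, leave w unchanged otherwise. A constant 1 is
    appended to each x so the last entry of w acts as the bias."""
    Xa = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xa.shape[1])
    for _ in range(max_epochs):
        errors = 0
        for x, cls in zip(Xa, labels):
            if cls == 1 and w @ x <= 0:
                w = w + c * x
                errors += 1
            elif cls == 2 and w @ x >= 0:
                w = w - c * x
                errors += 1
        if errors == 0:  # converged: every pattern on the correct side
            break
    return w

# Hypothetical linearly separable data.
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -2.0], [-2.0, -1.0]])
labels = [1, 1, 2, 2]
w = train_perceptron(X, labels)
# After convergence, w^T x > 0 for class 1 and w^T x < 0 for class 2.
```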
Multilayer Perceptron
• Heterogeneous, "hard" nonlinearities: (DAID, 1961)
  [Block diagram: feature subsets, Gaussian class estimates, Perceptron]
• Homogeneous, "soft" nonlinearity ("modern" MLP)
f(y) = 1 / (1 + e^(−y))   (sigmoid)

0 < f(y) < 1

[Figure: f(y) plotted vs. y]
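A minimal sketch of the sigmoid and its bounds, using only the standard library:

```python
import math

def sigmoid(y):
    """Soft nonlinearity of the 'modern' MLP: f(y) = 1 / (1 + e^-y)."""
    return 1.0 / (1.0 + math.exp(-y))

# Output is bounded, 0 < f(y) < 1, and monotonically increasing; f(0) = 0.5.
print(sigmoid(0.0))  # 0.5
print(0.0 < sigmoid(-5.0) < sigmoid(5.0) < 1.0)  # True
```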