Prof NB Venkateswarlu
Head, IT, GVPCOE, Visakhapatnam
www.ritchcenter.com/nbv
First, let me say a hearty welcome to you all.
Also, let me congratulate the
Chairman,
Secretary/Correspondent,
Principal,
Prof. Ravindra Babu, Vice-Principal,
and the other organizers for planning such a nice workshop with excellent themes.
Feature Extraction/Selection
My Talk
A Typical Image Processing System contains:
Image Acquisition
Image Pre-Processing
Image Enhancement
Image Segmentation
Image Feature Extraction
Image Classification
Image Understanding
Two Aspects of Feature Extraction
Extracting useful features from images or any other measurements.
Identifying transformed variables which are functions of the original variables and which have some desirable characteristics.
Feature Selection
Selecting Important Variables is Feature Selection
Some Features Used in I.P Applications
• Shape based
• Contour based
• Area based
• Transform based
• Projections
• Signature
• Problem specific
Perimeter, length, etc. First, the convex hull is extracted.
Skeletons
Averaged Radial density
Radial Basis functions
Rose Plots
Chain Codes
Crack code - 32330300
Signature
Bending Energy
Chord Distribution
Fourier Descriptors
Structure
Splines
Horizontal and vertical projections
Elongatedness
Convex Hull
Compactness (a small sketch follows this list)
RGB, R, G and B bands
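To make one of these descriptors concrete, here is a minimal sketch of compactness (perimeter² / 4π·area, close to 1 for a disc) computed from a binary mask; the crude boundary-pixel perimeter estimate and the `compactness` helper name are illustrative assumptions, not a standard API.

```python
import numpy as np

def compactness(mask):
    """Compactness = perimeter^2 / (4*pi*area); about 1 for a disc.

    `mask` is a binary image (1 = object).  The perimeter is crudely
    estimated as the number of object pixels with a background
    4-neighbour, which is enough for a rough illustration.
    """
    m = mask.astype(bool)
    area = m.sum()
    p = np.pad(m, 1)                       # zero-pad so border objects are handled
    interior = p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]
    perimeter = (m & ~interior).sum()      # object pixels touching background
    return perimeter ** 2 / (4 * np.pi * area)

square = np.zeros((20, 20), dtype=int)
square[5:15, 5:15] = 1                     # a filled 10x10 square
print(compactness(square))                 # slightly above 1; a disc is the minimum
```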
Classification/Pattern Recognition
• Statistical
• Syntactical/Linguistic
• Discriminant function
• Fuzzy
• Neural
• Hybrid
Dimensionality Reduction
• Feature selection (i.e., attribute subset selection):
– Select a minimum set of features such that the probability distribution of different classes given the values for those features is as close as possible to the original distribution given the values of all features
– Reduces the number of attributes in the patterns, making them easier to understand
• Heuristic methods (due to the exponential number of choices):
– step-wise forward selection
– step-wise backward elimination
– combining forward selection and backward elimination
– decision-tree induction
Example of Decision Tree Induction
Initial attribute set: {A1, A2, A3, A4, A5, A6}

A4?
├─ A1? → Class 1 / Class 2
└─ A6? → Class 1 / Class 2

=> Reduced attribute set: {A1, A4, A6}
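A hedged sketch of decision-tree induction used for attribute selection, in the spirit of the example above: attributes that never appear in the fitted tree are dropped. The synthetic data, and the rule that only A1, A4 and A6 matter, are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))            # attributes A1..A6
# Only A1 (col 0), A4 (col 3) and A6 (col 5) influence the class label.
y = (((X[:, 3] > 0) & (X[:, 0] > -0.5)) |
     ((X[:, 3] <= 0) & (X[:, 5] > 0.5))).astype(int)

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
# Keep only the attributes the induced tree actually split on.
kept = [f"A{i+1}" for i, imp in enumerate(tree.feature_importances_) if imp > 0]
print("Reduced attribute set:", kept)    # typically ['A1', 'A4', 'A6']
```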
Heuristic Feature Selection Methods
• There are 2^d possible sub-features of d features
• Several heuristic feature selection methods:
– Best single features under the feature independence assumption: choose by significance tests.
– Best step-wise feature selection (a forward-selection sketch follows below):
• The best single feature is picked first
• Then the next best feature conditioned on the first, ...
– Step-wise feature elimination:
• Repeatedly eliminate the worst feature
– Best combined feature selection and elimination
– Optimal branch and bound:
• Use feature elimination and backtracking
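The step-wise forward procedure from the list above can be sketched in a few lines. Cross-validated k-NN accuracy stands in for the significance test, and the iris data is just a convenient stand-in; neither is prescribed by the slides.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
remaining = list(range(X.shape[1]))
selected = []
best_score = 0.0

# Greedy forward selection: repeatedly add the single feature that most
# improves cross-validated accuracy; stop when no feature helps.
while remaining:
    scores = [(cross_val_score(KNeighborsClassifier(),
                               X[:, selected + [f]], y, cv=5).mean(), f)
              for f in remaining]
    score, f = max(scores)
    if score <= best_score:
        break
    best_score = score
    selected.append(f)
    remaining.remove(f)

print("Selected feature indices:", selected, "accuracy:", round(best_score, 3))
```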
Why Do We Need Feature Selection?
• A classifier's performance depends on:
• the number of features
• feature distinguishability
• the number of groups
• group characteristics in multidimensional space
• the required response time
• memory requirements
Feature Extraction Methods
We will find transformed variables which are functions of original variables.
A good example: though we may conduct more than one test (K-D), final grading is done based on total marks (1-D).
Principal Component Analysis
• Given N data vectors in k dimensions, find c <= k orthogonal vectors that can best be used to represent the data
– The original data set is reduced to one consisting of N data vectors on c principal components (reduced dimensions)
• Each data vector is a linear combination of the c principal component vectors
• Works for numeric data only
• Used when the number of dimensions is large
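A minimal numpy sketch of the idea, using a plain eigendecomposition of the covariance matrix; the helper name `pca_reduce` is illustrative:

```python
import numpy as np

def pca_reduce(X, c):
    """Project N k-dimensional vectors onto the top-c principal components."""
    Xc = X - X.mean(axis=0)                 # centre the data
    cov = np.cov(Xc, rowvar=False)          # k x k covariance matrix
    vals, vecs = np.linalg.eigh(cov)        # eigh: for symmetric matrices
    order = np.argsort(vals)[::-1][:c]      # indices of the top-c eigenvalues
    W = vecs[:, order]                      # k x c projection matrix
    return Xc @ W                           # each row: linear combo of c PCs

X = np.random.default_rng(1).normal(size=(100, 5))
print(pca_reduce(X, 2).shape)               # (100, 2)
```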
Principal Component Analysis
[Figure: data in the original coordinates X1, X2 with the principal-component axes Y1, Y2 overlaid]
Principal Component Analysis
Aimed at finding a new coordinate system which has some desirable characteristics.
Mean M = [4.5, 4.25]
Covariance matrix:
[ 2.57 1.86 ]
[ 1.86 6.21 ]
Eigenvalues = 6.99, 1.79
Eigenvectors = [ 0.387, 0.922 ] and [ -0.922, 0.387 ]
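The numbers above can be checked with a few lines of numpy (eigenvector signs may differ, which is harmless):

```python
import numpy as np

cov = np.array([[2.57, 1.86],
                [1.86, 6.21]])
vals, vecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
print(vals[::-1])                  # approx [6.99, 1.79]
print(vecs[:, ::-1])               # columns approx [0.387, 0.922] and
                                   # [-0.922, 0.387], up to sign
```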
However, in some cases PCA does not work well: the directions of maximum variance need not be the directions that separate the classes.
Canonical Analysis
Unlike PCA, which uses the global mean and covariance, canonical analysis uses the between-group and within-group covariance matrices and then calculates the canonical axes.
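A sketch of that computation under one common formulation (eigenvectors of Sw⁻¹Sb, i.e., Fisher's discriminant axes); the helper name and the numpy-based approach are assumptions for illustration:

```python
import numpy as np

def canonical_axes(X, y):
    """Directions maximising between-group scatter relative to
    within-group scatter: eigenvectors of inv(Sw) @ Sb."""
    mean = X.mean(axis=0)
    k = X.shape[1]
    Sw, Sb = np.zeros((k, k)), np.zeros((k, k))
    for g in np.unique(y):
        Xg = X[y == g]
        mg = Xg.mean(axis=0)
        Sw += (Xg - mg).T @ (Xg - mg)       # within-group scatter
        d = (mg - mean).reshape(-1, 1)
        Sb += len(Xg) * (d @ d.T)           # between-group scatter
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(vals.real)[::-1]     # strongest axis first
    return vecs.real[:, order]
```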
Standard Deviation – A Simple Indicator
Correlation Coefficient
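Both indicators are cheap to compute, which is their appeal: a feature with near-zero standard deviation carries little information, and a low correlation with the class label suggests low relevance. A small sketch, with purely synthetic data:

```python
import numpy as np

X = np.random.default_rng(2).normal(size=(200, 4))
y = (X[:, 0] + 0.1 * X[:, 1] > 0).astype(float)   # label driven mainly by feature 0

# Standard deviation: (near-)zero spread means the feature is uninformative.
print("std per feature:", X.std(axis=0).round(2))

# |correlation with the class label|: a crude per-feature relevance score.
corr = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
print("|corr(feature, label)|:", np.round(corr, 2))  # feature 0 dominates
```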
Feature Selection – Group Separability Indices
Feature Selection Through Clustering
Selecting From 4 variables
Multi-Layer Perceptron
[Figure: a multi-layer perceptron with input nodes, hidden nodes and output nodes; the input vector x_i feeds forward through weights w_ij to the output vector]

Net input to unit j: $I_j = \sum_i w_{ij} O_i + \theta_j$
Sigmoid output: $O_j = \dfrac{1}{1 + e^{-I_j}}$
Error at an output unit: $Err_j = O_j (1 - O_j)(T_j - O_j)$
Error at a hidden unit: $Err_j = O_j (1 - O_j) \sum_k Err_k\, w_{jk}$
Weight update: $w_{ij} = w_{ij} + (l)\, Err_j\, O_i$
Bias update: $\theta_j = \theta_j + (l)\, Err_j$
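A compact sketch that implements exactly these update equations on a toy XOR problem; the network size, learning rate and epoch count are arbitrary illustrative choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 3)), np.zeros(3)    # input -> hidden (w_ij, theta_j)
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)    # hidden -> output
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)
l = 0.5                                           # learning rate

for _ in range(20000):
    for x, t in zip(X, T):
        O_h = sigmoid(x @ W1 + b1)                # I_j = sum_i w_ij O_i + theta_j
        O_o = sigmoid(O_h @ W2 + b2)
        err_o = O_o * (1 - O_o) * (t - O_o)       # Err_j = O_j(1-O_j)(T_j-O_j)
        err_h = O_h * (1 - O_h) * (W2 @ err_o)    # Err_j = O_j(1-O_j) sum_k Err_k w_jk
        W2 += l * np.outer(O_h, err_o)            # w_ij = w_ij + (l) Err_j O_i
        b2 += l * err_o                           # theta_j = theta_j + (l) Err_j
        W1 += l * np.outer(x, err_h)
        b1 += l * err_h

print(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2).round(2))  # should approach [0, 1, 1, 0]
```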
Network Pruning and Rule Extraction
• Network pruning
– A fully connected network will be hard to articulate
– N input nodes, h hidden nodes and m output nodes lead to h(m+N) weights
– Pruning: remove some of the links without affecting the classification accuracy of the network
• Extracting rules from a trained network
– Discretize activation values; replace each individual activation value by its cluster average, maintaining the network accuracy
– Enumerate the outputs from the discretized activation values to find rules between activation values and outputs
– Find the relationship between the inputs and activation values
– Combine the above two to obtain rules relating the outputs to the inputs
Neural Networks for Feature Extraction
Self-organizing feature maps (SOMs)
• Clustering is also performed by having several units competing for the current object
• The unit whose weight vector is closest to the current object wins
• The winner and its neighbors learn by having their weights adjusted
• SOMs are believed to resemble processing that can occur in the brain
• Useful for visualizing high-dimensional data in 2- or 3-D space
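A minimal SOM sketch along these lines: units on an 8×8 grid compete for each input, the winner and its neighbours move toward it, and the grid gives a 2-D view of higher-dimensional data. The grid size and the learning-rate and neighbourhood schedules are illustrative choices, not prescribed values.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(500, 3))           # high-dimensional inputs (3-D here)
grid = 8                                   # 8x8 map for 2-D visualisation
W = rng.normal(size=(grid, grid, 3))       # one weight vector per map unit
coords = np.stack(np.meshgrid(np.arange(grid), np.arange(grid),
                              indexing="ij"), -1)

for t, x in enumerate(data):
    lr = 0.5 * (1 - t / len(data))                    # decaying learning rate
    sigma = max(grid / 2 * (1 - t / len(data)), 0.5)  # shrinking neighbourhood
    # Winner: the unit whose weight vector is closest to the current object.
    d = ((W - x) ** 2).sum(axis=2)
    bmu = np.unravel_index(d.argmin(), d.shape)
    # Winner and its neighbours learn, weighted by distance on the grid.
    g = np.exp(-((coords - np.array(bmu)) ** 2).sum(axis=2) / (2 * sigma ** 2))
    W += lr * g[..., None] * (x - W)

print(W.shape)   # (8, 8, 3): each 3-D input now maps to a cell on a 2-D grid
```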
Other Model-Based Clustering Methods
• Neural network approaches
– Represent each cluster as an exemplar, acting as a "prototype" of the cluster
– New objects are assigned to the cluster whose exemplar is the most similar, according to some distance measure
• Competitive learning
– Involves a hierarchical architecture of several units (neurons)
– Neurons compete in a "winner-takes-all" fashion for the object currently being presented
Model-Based Clustering Methods
SVM
SVM constructs nonlinear decision functions by training a classifier to perform a linear separation in some high-dimensional space which is nonlinearly related to the input space. A Mercer kernel is used for the mapping.
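A short sketch of that idea with scikit-learn's SVC: the concentric-circles data is not linearly separable in the input space, but the RBF kernel (a Mercer kernel) makes a linear separation possible in the induced feature space. The dataset and gamma value are illustrative assumptions.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric circles: not linearly separable in the input space.
X, y = make_circles(n_samples=400, noise=0.1, factor=0.4, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# The RBF kernel implicitly maps the data to a high-dimensional space
# where a linear separation is possible.
clf = SVC(kernel="rbf", gamma=2.0).fit(Xtr, ytr)
print("test accuracy:", clf.score(Xte, yte))   # close to 1.0 on this toy data
```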