What is pattern recognition (lecture 3 of 6)

ERI SUMMER TRAINING, COMPUTERS & SYSTEMS DEPT. Dr. Randa Elanwar, Lecture 3


TRANSCRIPT

Page 1: What is pattern recognition (lecture 3 of 6)

ERI SUMMER TRAINING, COMPUTERS & SYSTEMS DEPT.

Dr. Randa Elanwar, Lecture 3

Page 2: What is pattern recognition (lecture 3 of 6)

Content

• Nonlinear problems

• Learning methods

  • Supervised learning

  • Unsupervised learning

  • Reinforcement learning

Page 3: What is pattern recognition (lecture 3 of 6)

Nonlinear problems

• So far we have seen that a NN can handle problems with:

  • feature vectors of dimension > 2 (i.e., a hyper feature space)

  • multiple classes (>= 2)

  • linear problems (linearly separable classes)

• Sometimes the nature of the problem, or the features chosen for the feature space, means the classes cannot be separated by a first-degree polynomial, as in the following example.


Page 4: What is pattern recognition (lecture 3 of 6)

Nonlinear problems

• Assume a shoe factory is packing pairs of shoes of a certain size on a moving belt. The machine should check that the two shoes in a pair are not alike. The factory labels a left shoe "0" and a right shoe "1".

• The factory needs a solution that checks the labels and, whenever it finds a matching pair (two left or two right shoes), stops the moving belt to prevent packing it.

• So we want to implement a NN that acts as a logical XOR function. In other words, whenever the input is "00" or "11" the output is "0" and the belt stops; otherwise the output is "1" and the belt keeps moving.

Page 5: What is pattern recognition (lecture 3 of 6)

Nonlinear problems

• The XOR problem

• No single line can ever separate the samples correctly. The only way to separate the positive from the negative examples is to draw two lines (i.e., we need two straight-line equations) or to draw a nonlinear region that captures one class only.

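The claim can be checked by brute force. A small sketch in plain Python: scan a grid of candidate line weights and confirm none of them classifies all four XOR samples (the particular weight grid is an illustrative choice; in fact no real-valued line separates XOR at all):

```python
import itertools

# XOR truth table: inputs -> class label
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def separates(w1, w2, b):
    """True if the line w1*x1 + w2*x2 + b = 0 classifies all four XOR samples."""
    return all((w1 * x1 + w2 * x2 + b > 0) == bool(y)
               for (x1, x2), y in XOR.items())

# scan a small grid of candidate weights; none of them separates XOR
found = any(separates(w1, w2, b)
            for w1, w2, b in itertools.product([-2, -1, -0.5, 0.5, 1, 2], repeat=3))
print(found)  # False: no single line works
```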

Page 6: What is pattern recognition (lecture 3 of 6)

Nonlinear problems

• To implement the nonlinearity we need to insert one or more extra layers of nodes, called hidden layers, between the input layer and the output layer.


Page 7: What is pattern recognition (lecture 3 of 6)

Nonlinear problems

• The nonlinearity helps re-shape the straight-line decision boundary into a higher-order polynomial that can successfully separate the samples of each class.


Page 8: What is pattern recognition (lecture 3 of 6)

Nonlinear problems

• The higher the polynomial order, the more overfitting we get. But here we have to answer two questions:


• Why are the samples interfering in the feature space?

• How far should we pursue raising the order of the decision boundary (i.e., adding hidden layers to the network)?

Page 9: What is pattern recognition (lecture 3 of 6)

Nonlinear problems

• Well! First, you will notice that discriminative features are rare and have a limit.

• Real samples usually have similarities that cause class interference/overlap within any feature space.

• Moreover, some features may depend on other features used and not be distinct. This dependency usually allows more interference.


Page 10: What is pattern recognition (lecture 3 of 6)

Nonlinear problems

• Sometimes the number of features used is insufficient to distinctly separate the samples in the feature space.

• Increasing the number of distinct features helps widen the space between the samples of each class.

• However, increasing the number of features more than needed introduces noise in the form of irrelevant features, which does not improve the separation and also increases system complexity (remember that the number of nodes in the NN input layer equals the number of features used).

Page 11: What is pattern recognition (lecture 3 of 6)

Nonlinear problems

• Using a feed-forward network alone to solve the overlapping problem will not help, because the delta rule only works if the set of <input, output> pairs is learnable (representable). In that case the delta rule will find the necessary weights:

  • in a finite number of steps

  • independent of the initial weights

• In the case of interfering samples in the feature space, the delta rule will run in an infinite loop because the error will never diminish to zero, i.e., the sample pairs are not learnable. What we need is a nonlinear decision boundary (e.g., a hyperbola).

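For contrast, here is a small sketch of a delta-rule-style (perceptron) update on the linearly separable AND problem, where it does converge in a finite number of steps. The learning rate and the zero initial weights are illustrative choices, not values from the lecture:

```python
import numpy as np

# linearly separable AND problem: only input (1,1) is positive
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0, 0, 0, 1], dtype=float)

w = np.zeros(2)   # initial weights (convergence is independent of these)
b = 0.0
lr = 0.5

for epoch in range(100):
    errors = 0
    for xi, ti in zip(X, t):
        out = 1.0 if xi @ w + b > 0 else 0.0
        err = ti - out                 # the error drives the weight update
        if err != 0.0:
            w += lr * err * xi
            b += lr * err
            errors += 1
    if errors == 0:                    # converged: every sample classified
        break

preds = [1.0 if xi @ w + b > 0 else 0.0 for xi in X]
print(preds)  # [0.0, 0.0, 0.0, 1.0]
```

On interfering samples such as XOR, the same loop would never reach `errors == 0` and would exhaust its epoch budget instead.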

Page 12: What is pattern recognition (lecture 3 of 6)

Nonlinear problems

• This solution is offered in either of two ways:

1. Adding hidden layers (a Multi-Layer Perceptron, MLP NN) to add nonlinearity to the decision boundary, up to some acceptable limit.

2. Making a sample transformation by kernels; in other words, multiplying the feature vectors of the patterns/samples by a set of orthogonal functions to re-locate them in space in a way that makes a linear solution possible (a Radial Basis Function, RBF NN).
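As a tiny illustration of the transformation idea (not an actual RBF network; the mapping here is a simple polynomial feature chosen for illustration): appending the product x1*x2 to the XOR feature vector relocates the samples so that a single line separates them:

```python
# feature map phi(x1, x2) = (x1, x2, x1*x2): XOR becomes linearly separable
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

# one linear boundary in the mapped space: x1 + x2 - 2*(x1*x2) - 0.5 = 0
preds = {}
for (x1, x2), label in XOR.items():
    z = 1 * x1 + 1 * x2 - 2 * (x1 * x2) - 0.5
    preds[(x1, x2)] = 1 if z > 0 else 0

print(preds == XOR)  # True: the single line classifies all four samples
```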

Page 13: What is pattern recognition (lecture 3 of 6)

Nonlinear problems

• The learning of MLP and RBF networks depends on a method other than the delta rule, called the back-propagation algorithm. This algorithm also depends on minimizing the error function of the misclassified patterns. It has a long derivation that we will not discuss now; maybe later.

• But to imagine how things work: back propagation tries to transform the training patterns to make them almost linearly separable, and then uses a linear network.

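A compact sketch of back propagation in practice: a 2-4-1 sigmoid network trained by gradient descent on XOR. The layer sizes, learning rate, epoch count, and random seed are illustrative assumptions, not values from the lecture:

```python
import numpy as np

np.random.seed(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# 2 inputs -> 4 hidden units -> 1 output
W1 = np.random.randn(2, 4); b1 = np.zeros(4)
W2 = np.random.randn(4, 1); b2 = np.zeros(1)
lr = 0.5

losses = []
for _ in range(5000):
    H = sigmoid(X @ W1 + b1)            # forward pass: hidden activations
    Y = sigmoid(H @ W2 + b2)            # forward pass: network output
    losses.append(float(np.mean((Y - T) ** 2)))
    # backward pass: propagate the output error through the layers
    dY = (Y - T) * Y * (1 - Y)
    dH = (dY @ W2.T) * H * (1 - H)
    W2 -= lr * H.T @ dY; b2 -= lr * dY.sum(axis=0)
    W1 -= lr * X.T @ dH; b1 -= lr * dH.sum(axis=0)
```

After training, the mean squared error is far below its initial value, i.e., the hidden layer has re-shaped the patterns so the output unit can separate them.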

Page 14: What is pattern recognition (lecture 3 of 6)

Nonlinear problems

• In other words, if we need more than one straight line to separate the +ve and -ve patterns, we solve the problem in two phases:

• In phase 1: we first represent each straight line with a single perceptron and classify the training patterns (outputs).

• In phase 2: these outputs are then transformed into new patterns which are now linearly separable, and can be classified by an additional perceptron giving the final result.
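The two phases can be written out by hand for the XOR case (the particular line weights below are illustrative choices; other lines work too):

```python
# step activation shared by all perceptrons
step = lambda z: 1 if z > 0 else 0

# phase 1: two perceptrons, one per straight line
line1 = lambda x1, x2: step(x1 + x2 - 0.5)    # fires above the line x1 + x2 = 0.5 (an OR)
line2 = lambda x1, x2: step(1.5 - x1 - x2)    # fires below the line x1 + x2 = 1.5 (a NAND)

# phase 2: the transformed patterns (line1, line2) are linearly separable;
# one more perceptron (an AND) gives the final XOR result
xor = lambda x1, x2: step(line1(x1, x2) + line2(x1, x2) - 1.5)

results = [xor(0, 0), xor(0, 1), xor(1, 0), xor(1, 1)]
print(results)  # [0, 1, 1, 0]
```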

Page 15: What is pattern recognition (lecture 3 of 6)

Learning Methods

• Learning/training is the process of modifying the weights of the connections between network layers, with the objective of achieving the expected output.

• This is achieved through:

  • Supervised learning

  • Unsupervised learning

  • Reinforcement learning


Page 16: What is pattern recognition (lecture 3 of 6)

Supervised learning

• Each input vector requires a corresponding target vector: training pair = [input vector, target vector].

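For the shoe-belt example above, the training pairs would look like this (a plain-data sketch; the list layout is an illustrative choice):

```python
# training pairs [input vector, target vector] for the XOR-style shoe-pair check:
# input = (label of shoe 1, label of shoe 2), target = 1 means keep the belt moving
training_pairs = [
    ([0, 0], [0]),   # two left shoes  -> stop the belt
    ([0, 1], [1]),   # a proper pair   -> keep moving
    ([1, 0], [1]),   # a proper pair   -> keep moving
    ([1, 1], [0]),   # two right shoes -> stop the belt
]

inputs  = [p[0] for p in training_pairs]
targets = [p[1] for p in training_pairs]
```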

Page 17: What is pattern recognition (lecture 3 of 6)

Supervised learning

• During learning, the produced output is compared with the desired output.

• The difference between the two outputs is used to modify the learning weights according to the learning algorithm.

• Learning cases: pattern recognition problems.

• Neural network models using supervised learning: multi-layer perceptron, feed-forward networks, radial basis function networks, support vector machines.


Page 18: What is pattern recognition (lecture 3 of 6)

Unsupervised learning

• All similar input patterns are grouped together as clusters. If a matching input pattern is not found, a new cluster is formed.

• In unsupervised learning there is no error feedback, because targets are not provided.

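The "match an existing cluster or form a new one" behaviour can be sketched as a simple leader-clustering loop (the distance threshold and the sample points are assumed parameters for illustration):

```python
def leader_cluster(points, threshold=1.0):
    """Assign each point to the nearest existing cluster centre,
    or start a new cluster if no centre lies within `threshold`."""
    centres, labels = [], []
    for p in points:
        best, best_dist = None, float("inf")
        for i, c in enumerate(centres):
            d = sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5
            if d < best_dist:
                best, best_dist = i, d
        if best is None or best_dist > threshold:
            centres.append(p)               # no match: form a new cluster
            labels.append(len(centres) - 1)
        else:
            labels.append(best)             # match: join the existing cluster
    return labels, centres

labels, centres = leader_cluster([(0, 0), (0.2, 0), (5, 5), (5.1, 4.9)])
print(labels)  # [0, 0, 1, 1]
```

Note that no targets appear anywhere: the grouping emerges purely from similarities among the inputs.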

Page 19: What is pattern recognition (lecture 3 of 6)

Unsupervised learning

• The network must discover patterns, regularities, and features in the input data on its own, without reference to the output. This process is called self-organizing.

• Learning cases: appropriate for clustering tasks, like finding similar groups of documents on the web, content-addressable memory, and clustering in general.

• Neural network models using unsupervised learning: Kohonen self-organizing maps, Hopfield networks.

Page 20: What is pattern recognition (lecture 3 of 6)

Reinforcement learning

• Feedback is provided, but the exact desired output is absent. I.e., unlike unsupervised learning, there is feedback, but it comes in the form of "good/bad", "greater/less", etc.; no exact value is given for the desired output or the class membership.

• The net is only provided with guidance to determine whether the produced output is acceptable or not.
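A minimal sketch of learning from evaluative feedback only: the learner adjusts a single weight, and a critic answers only "better or not"; it never reveals the target value. The target, step size, and trial count here are illustrative assumptions:

```python
import random

random.seed(0)
TARGET = 0.7                     # known only to the critic, never to the learner

def critic(old_w, new_w):
    """Evaluative feedback only: is the new weight better than the old one?"""
    return abs(TARGET - new_w) < abs(TARGET - old_w)

w = 0.0
for _ in range(200):
    trial = w + random.uniform(-0.1, 0.1)   # propose a small random change
    if critic(w, trial):                    # keep only changes judged "good"
        w = trial
```

After the loop, `w` sits close to the hidden target even though the learner never saw an error value, only accept/reject verdicts.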

Page 21: What is pattern recognition (lecture 3 of 6)

Reinforcement learning

• Weights are modified in the units that have errors.


Page 22: What is pattern recognition (lecture 3 of 6)

Reinforcement learning

• When is reinforcement learning used?

• When less information is available about the target output values (critic information).

• Feedback in this case is only evaluative, not instructive.