© negnevitsky, pearson education, 2005 1 will neural network work for my problem? will neural...

36
1 Will neural network work for my Will neural network work for my problem? problem? Character recognition neural Character recognition neural networks networks Prediction neural networks Prediction neural networks Classification neural netrorks with Classification neural netrorks with competitiv competitiv e e learning learning Summary Summary ecture 14 ecture 14 Knowledge engineering: Knowledge engineering: Building neural network Building neural network based systems based systems

Upload: kerrie-daniel

Post on 29-Dec-2015

222 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

1

Will neural network work for my problem?Will neural network work for my problem?

Character recognition neural networksCharacter recognition neural networks

Prediction neural networksPrediction neural networks

Classification neural netrorks with competitivClassification neural netrorks with competitive e learninglearning

SummarySummary

Lecture 14Lecture 14

Knowledge engineering:Knowledge engineering:

Building neural network based Building neural network based systemssystems

Page 2: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

2

Will a neural network work for myWill a neural network work for my problem?problem?Neural networks represent a class of veryNeural networks represent a class of very powerful, general-purpose tools that have beenpowerful, general-purpose tools that have been successfully applied to prediction, classificationsuccessfully applied to prediction, classification and clustering problems. They are used in aand clustering problems. They are used in a variety of areas, from speech variety of areas, from speech and characterand character recognition to detecting recognition to detecting fraudulent transactions,fraudulent transactions, from medical from medical diagnosis of heart attacks to processdiagnosis of heart attacks to process control and robotics, from predicting foreigncontrol and robotics, from predicting foreign exchange rates to detecting and identifying exchange rates to detecting and identifying radarradar targets.targets.

Page 3: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

3

Case study 4Case study 4:: Character recognition reural networksCharacter recognition reural networks

Recognition of both printed and handwrittenRecognition of both printed and handwritten characters is a typical domain where neuralcharacters is a typical domain where neural networks have been networks have been successfully appliedsuccessfully applied..

Optical character recognitionOptical character recognition systems weresystems were among the first commercial applications ofamong the first commercial applications of neural networksneural networks

Page 4: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

4

We demonstrate an application of a multilayerWe demonstrate an application of a multilayer feedforward network for printed feedforward network for printed charactercharacter recognition.recognition.

For simplicity, we can limit our task to theFor simplicity, we can limit our task to the recognition of digits from 0 to 9. Each digit isrecognition of digits from 0 to 9. Each digit is represented by a 5 represented by a 5 9 bit map.9 bit map.

In commercial applications, where a betterIn commercial applications, where a better resolution is required, at least 16 resolution is required, at least 16 16 bit 16 bit mapsmaps are usedare used..

Page 5: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

5

Bit maps for digit recognitionBit maps for digit recognition

7 8 9 10

12 13 14 15

17 18 19 20

26 27 28 29

31 32 33 34

36 37 38 39

6

2 3 4 51

16

11

22 23 24 2521

42 43 44 4541

35

40

30

Page 6: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

6

How do we choose the architecture of a How do we choose the architecture of a neural network? neural network?

The number of neurons in the input layer is The number of neurons in the input layer is decided by the number of pixels in the bit decided by the number of pixels in the bit map. The bit map in our example map. The bit map in our example consists of 45 pixels, and thus we consists of 45 pixels, and thus we need 45 input neurons.need 45 input neurons. The output layer has 10 neurons – one neuron The output layer has 10 neurons – one neuron for each digit to be recognised.for each digit to be recognised.

Page 7: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

7

How do we determine an optimal number How do we determine an optimal number of hidden neurons? of hidden neurons? Complex patterns cannot be detected by a small Complex patterns cannot be detected by a small

number of hidden neurons; number of hidden neurons; however too many of them however too many of them can dramatically increase the can dramatically increase the computational burden.computational burden.

Another problem is Another problem is overfittingoverfitting. The greater the . The greater the number of hidden neurons, the greater number of hidden neurons, the greater the ability of the network to the ability of the network to recognise existing patterns. recognise existing patterns. However, if the number of hidden However, if the number of hidden neurons is too big, the network might simply neurons is too big, the network might simply memorise all training examples. memorise all training examples.

Page 8: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

8

Neural network for printed digit recognitionNeural network for printed digit recognition

0

1

1

1

0

1

1

1

1

1

41

42

43

44

45

1

2

3

4

5

1

2

3

4

5

0

1

2

3

4

5

6

7

8

9

10

0

0

0

0

0

0

0

1

0

Page 9: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

9

What are the test examples for character What are the test examples for character recognition? recognition?

A test set has to be strictly independent from the A test set has to be strictly independent from the training examples. training examples.

To test the character recognition network, we To test the character recognition network, we present it with examples that present it with examples that include “noise” – the distortion include “noise” – the distortion of the input patterns.of the input patterns.

We evaluate the performance of the printed digit We evaluate the performance of the printed digit recognition recognition networks with 1000 test examples networks with 1000 test examples (100 for each digit to be (100 for each digit to be recognised).recognised).

Page 10: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

10

Learning curves of the digit recognition Learning curves of the digit recognition three-layer three-layer neural networksneural networks

0 50 150 2001-4

1-3

1-2

1-1

10

101

Epochs250

102

234

1

1 – two hidden neurons;2 – fivehiddenneurons;3 – ten hidden neurons;4 – twentyhiddenneurons

Page 11: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

11

Performance evaluation of the digit Performance evaluation of the digit

recognition neural networksrecognition neural networks

0 0.1 0.2 0.3 0.4 0.50

25

Page 12: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

12

Can we improve the performance of the Can we improve the performance of the character recognition neural character recognition neural network?network?A neural network is as good as the examples A neural network is as good as the examples used to train it. used to train it.

Therefore, we can attempt to improve digit Therefore, we can attempt to improve digit recognition by feeding the recognition by feeding the network with network with “noisy” examples of digits from 0 to 9. “noisy” examples of digits from 0 to 9.

Page 13: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

13

Performance evaluation of the digit recognition Performance evaluation of the digit recognition network trained with network trained with “noisy” examples“noisy” examples

Noise Level0 0.05 0.1 0.15 0.2 0.25 0.3 0.35

0

2

4

6

8

10

12

14

0.4 0.45 0.

16

network trained with“perfect” examplesnetwork trained with“noisy” examples

Page 14: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

14

Case study 5Case study 5: : Prediction neural networksPrediction neural networks

As an example, we consider a problem of As an example, we consider a problem of predicting the market predicting the market value of a given house value of a given house based on the knowledge of the sales prices of based on the knowledge of the sales prices of similar houses. similar houses.

Page 15: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

15

In this problem, the inputs (the house location, In this problem, the inputs (the house location, living area, number of bedrooms, living area, number of bedrooms, number of bathrooms, land number of bathrooms, land size, type of heating system, size, type of heating system, etc.) are well-defined, and even standardised etc.) are well-defined, and even standardised for sharing the housing for sharing the housing market information between market information between different real estate agencies.different real estate agencies.

The output is also well-defined – we know what The output is also well-defined – we know what we are trying to predict. we are trying to predict.

The features of recently sold houses and their The features of recently sold houses and their sales prices are examples, sales prices are examples, which we use for which we use for training the neural network.training the neural network.

Page 16: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

16

Network generalisationNetwork generalisation

An appropriate number of training examples can An appropriate number of training examples can be estimated with be estimated with Widrow’s rule of Widrow’s rule of thumbthumb, which suggests , which suggests that, for a good generalisation, we that, for a good generalisation, we need to satisfy the following condition: need to satisfy the following condition:

where where N N is the number of training examples, is the number of training examples, nnww is is

the number of synaptic weights the number of synaptic weights in the network, and in the network, and e e is the network error permitted on test.is the network error permitted on test.

en

N w

Page 17: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

17

Massaging the dataMassaging the data

Data can be divided into three main types: Data can be divided into three main types: continuous, discrete continuous, discrete and categorical .and categorical . Continuous data Continuous data vary between two pre-set vary between two pre-set values – minimum and values – minimum and maximum, and can be maximum, and can be mapped, or massaged, to the range between 0 mapped, or massaged, to the range between 0 and 1 as: and 1 as:

value-valuemaximum

valueminimum-valueactualvaluemassaged

Page 18: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

18

Discrete dataDiscrete data, such as the number of bedrooms , such as the number of bedrooms and the number of bathrooms, and the number of bathrooms, also have maximum and also have maximum and minimum values. For example, the minimum values. For example, the number of bedrooms usually ranges from 0 to 4.number of bedrooms usually ranges from 0 to 4.

Massaging the dataMassaging the data

Number of bedrooms

0

1

2

3

4

no bedrooms

one bedroom

two bedrooms

three bedrooms

four or more bedrooms

Page 19: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

19

Categorical dataCategorical data , such as gender and marital , such as gender and marital status, can be massaged status, can be massaged by using by using 1 of N coding1 of N coding. . This method implies that each This method implies that each categorical value is categorical value is handled as a separate input. handled as a separate input.For example, marital status, which can be either For example, marital status, which can be either single, divorced, single, divorced, married or widowed, would be married or widowed, would be represented by four inputs. Each of these represented by four inputs. Each of these inputs can have a inputs can have a value of either 0 or 1. Thus, a married value of either 0 or 1. Thus, a married person would be represented by an person would be represented by an input vector [0 0 1 0].input vector [0 0 1 0].

Page 20: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

20

Feedforward neural network for Feedforward neural network for real- real-

estate appraisalestate appraisal1Location rating 7 0.7000

House condition 8 0.80003

5

7

10

Number of bedrooms 3 0.7500Number of bathrooms 1 0.5000Living area 121.00 m2 0.3550Land size 682.00 m2 0.4682Garage double 1.0000

1.0000Type of heating system 0.0000

0.0000

wood heater

2

4

6

8

9

$120,9200.3546

Page 21: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

21

How do we validate results?How do we validate results?

To validate results, we use a set of examples To validate results, we use a set of examples never seen by the network. never seen by the network.

Before training, all the available data are Before training, all the available data are randomly divided into a training set and a randomly divided into a training set and a test set.test set.Once the training phase is complete, the Once the training phase is complete, the network’s ability to network’s ability to generalise is tested against generalise is tested against examples of the test set. examples of the test set.

Page 22: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

22

As an example, we will consider an iris plant As an example, we will consider an iris plant classification problem. classification problem.

Suppose, we are given a data set with several Suppose, we are given a data set with several variables but we have no idea variables but we have no idea how to separate it into how to separate it into different classes because we cannot different classes because we cannot find any unique or distinctive features in the find any unique or distinctive features in the data. data.

Case study 6Case study 6: : Classification neural Classification neural networks with networks with competitive learningcompetitive learning

Page 23: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

23

Clusters and clusteringClusters and clustering

Neural networks can discover significant Neural networks can discover significant features in input patterns and learn features in input patterns and learn how to separate input data how to separate input data into different classes. A into different classes. A neural network with competitive learning is a neural network with competitive learning is a suitable tool to accomplish this task.suitable tool to accomplish this task.

The competitive learning rule enables a The competitive learning rule enables a single-layer neural network to single-layer neural network to combine similar input combine similar input data into groups or data into groups or clustersclusters. . This process is called This process is called clusteringclustering. Each . Each cluster is represented by a cluster is represented by a single output.single output.

Page 24: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

24

Each plant in the data set is represented by Each plant in the data set is represented by four variables: sepal four variables: sepal length, sepal width, length, sepal width, petal length and petal width. The petal length and petal width. The sepal sepal length ranges between 4.3 and 7.9 cm, sepal length ranges between 4.3 and 7.9 cm, sepal width between width between 2.0 and 4.4 cm, petal length 2.0 and 4.4 cm, petal length between 1.0 and 6.9 cm, and petal between 1.0 and 6.9 cm, and petal width width between 0.1 and 2.5 cm. between 0.1 and 2.5 cm.

For this case study, we will use a data set of For this case study, we will use a data set of 150 elements that contains three classes of 150 elements that contains three classes of iris plants – iris plants – setosasetosa, , versicolor versicolor and and virginicavirginica..

Page 25: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

25

Neural network for iris plant classificationNeural network for iris plant classification

16.1 cm 0.5000

22.8 cm 0.33330

2

4.0 cm 0.5085

Sepal length

Sepal width

Petal length

Petal width 1.3 cm 0.5000

3

4

Versicolor1

1

03

Virginica

Setosa

Page 26: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

26

Massaging the dataMassaging the data Before the network is trained, the data must be Before the network is trained, the data must be

massaged and then divided into training and massaged and then divided into training and test sets.test sets.

The Iris plant data are continuous, vary between The Iris plant data are continuous, vary between some minimum and maximum some minimum and maximum values, and thus can easily be values, and thus can easily be massaged to the range between 0 and 1 massaged to the range between 0 and 1 using the following equation. using the following equation.

Massaged values can then be fed to the network as Massaged values can then be fed to the network as its inputs. its inputs.

valueminimum-valuemaximumvalueminimum-valueactual

valuemassaged

Page 27: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

27

The next step is to generate training and test The next step is to generate training and test sets from the available sets from the available data. The 150- data. The 150- element Iris data is randomly divided into a element Iris data is randomly divided into a training set of 100 elements and training set of 100 elements and a test set of 50 a test set of 50 elements.elements.

Now we can train the competitive neural Now we can train the competitive neural network to divide input vectors into network to divide input vectors into three classes.three classes.

Page 28: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

28

Competitive learning in the neural network Competitive learning in the neural network for iris plant classification for iris plant classification

(a) initial weights; (b) weights after 100 iterations(a) initial weights; (b) weights after 100 iterations

0

0.5

1

0

0.5

10

0.2

0.4

0.6

0.8

1

Sepal lengthSepal width 0

0.5

1

0

0.5

10

0.2

0.4

0.6

0.8

1

Sepal lengt hSepal width

(a) (b)

Page 29: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

29

Competitive learning in the neural network Competitive learning in the neural network for iris plant classification for iris plant classification

(c) weights after 2,000 iterations(c) weights after 2,000 iterations

0

0.5

1

0

0.5

10

0.2

0.4

0.6

0.8

1

Sepal lengthSepal width 0

0.5

1

0

0.5

10

0.2

0.4

0.6

0.8

1

Petal widthPetal length

(c)

Page 30: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

30

How do we know when the learning How do we know when the learning process is complete? process is complete?

In a competitive neural network, there is no obvious In a competitive neural network, there is no obvious way of knowing whether the way of knowing whether the learning process is complete or learning process is complete or not. We do not know what desired not. We do not know what desired outputs are, and thus cannot compute the sum of outputs are, and thus cannot compute the sum of squared errors – a criterion squared errors – a criterion used by the back- used by the back- propagation algorithm.propagation algorithm.

Therefore, we should use the Therefore, we should use the Euclidean distance Euclidean distance criterion instead. When no noticeable criterion instead. When no noticeable changes occur in the changes occur in the weight vectors of competitive neurons, weight vectors of competitive neurons, a network can be considered to have converged. a network can be considered to have converged.

Page 31: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

31

Learning curves for competitive neurons of Learning curves for competitive neurons of the iris classification neural network the iris classification neural network

(a) Learning rate 0.01 (b) Learning rate 0.5

0 400 800 1200 16000

0.05

0.10

0.15

0.20

0.25

0.30

0.35

0.40

Epoch

Entirecompetitivelayer

Competitiveneuron1Competitiveneuron2Competitiveneuron3

00 400 800 1200 1600 2000

Epoch

0.4

0.6

0.8

1.0

1.2

1.4

1.6

1.8

0.2

Competitiveneuron1

Entirecompetitivelayer

Competitiveneuron2Competitiveneuron3

Page 32: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

32

How do we associate an output neuron How do we associate an output neuron with a particular class? with a particular class?

Competitive neural networks enable us to Competitive neural networks enable us to identify clusters in input data. identify clusters in input data. However, since clustering is However, since clustering is an unsupervised process, we cannot an unsupervised process, we cannot use it directly for labelling output use it directly for labelling output neurons.neurons.

In most practical applications, the distribution of In most practical applications, the distribution of data that belong to the same data that belong to the same cluster is rather dense, cluster is rather dense, and there are usually natural valleys and there are usually natural valleys between different clusters. As a between different clusters. As a result, the position result, the position of the centre of a cluster often reveals of the centre of a cluster often reveals distinctive features of the distinctive features of the corresponding class.corresponding class.

Page 33: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

33

1

Neuron

2

3

Weights Dimensions of the Iris plant, cmSepallengthSepal widthPetal lengthPetal width

5.92.74.41.4

0.43550.30220.56580.5300

Versicolor

Class of the Iris plant

SepallengthSepal widthPetal lengthPetal width

6.73.05.52.0

0.65140.43480.76200.7882

Virginica

SepallengthSepal widthPetal lengthPetal width

5.03.51.60.3

0.20600.60560.09400.0799

Setosa

w11w21w31w41w12w22w32w42w13w23w33w43

Labelling the competitive neuronsLabelling the competitive neurons

Page 34: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

34

How do we decode weights into Iris How do we decode weights into Iris dimensions? dimensions?

To decode the weights of the competitive To decode the weights of the competitive neurons into dimensions of the iris plant we neurons into dimensions of the iris plant we simply reverse the procedure used simply reverse the procedure used for massaging the iris data. For for massaging the iris data. For example,example,Sepal length Sepal length ww 1111 = 0.4355 = 0.4355 (7.9 (7.9 4.3) 4.3) 4.3 4.3

= 5.9 cm = 5.9 cm

Once the weights are decoded, we can ask an Once the weights are decoded, we can ask an iris plant expert to label the output neurons.iris plant expert to label the output neurons.

Page 35: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

35

Can we label the competitive neurons Can we label the competitive neurons automatically without having to automatically without having to ask the expert?ask the expert?

We can use a test data set for labelling We can use a test data set for labelling competitive neurons automatically. Once competitive neurons automatically. Once training of a neural network is training of a neural network is complete, a set of input complete, a set of input samples representing the same class, samples representing the same class, say class say class VersicolorVersicolor, is fed to the , is fed to the network, and the output network, and the output neuron that wins the neuron that wins the competition most of the time receives a label competition most of the time receives a label of the corresponding class. of the corresponding class.

Page 36: © Negnevitsky, Pearson Education, 2005 1 Will neural network work for my problem? Will neural network work for my problem? Character recognition neural

36

Although a competitive network has only one Although a competitive network has only one layer of competitive neurons, it can layer of competitive neurons, it can classify input patterns that classify input patterns that are not linearly separable.are not linearly separable.

In classification tasks, competitive networks In classification tasks, competitive networks learn much faster than multilayer learn much faster than multilayer perceptrons trained with the perceptrons trained with the back-propagation algorithm, back-propagation algorithm, but they usually provide less accurate results. but they usually provide less accurate results.