381 Self-Organization Map: Learning without Examples

Upload: susanna-cook

Post on 21-Jan-2016


Page 1

Self-Organization Map

Learning without Examples

Page 2

Learning Objectives

1. Introduction
2. Hamming Net and MAXNET
3. Clustering
4. Feature Map
5. Self-organizing Feature Map
6. Conclusion

Page 3

Introduction

1. Learning without examples (unsupervised)

2. Data are presented to the system one at a time

3. The data are mapped into a one- or higher-dimensional space

4. Competitive learning

Page 4

Hamming Distance (1/5)

1. Define the Hamming distance HD between two binary codes A and B of the same length as the number of places in which a_i and b_i differ.

2. Ex:
A: [ 1 -1  1  1  1 ]
B: [ 1  1 -1 -1  1 ]
=> HD(A, B) = 3
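This definition, together with the scalar-product identity derived on the following slides, can be checked with a short sketch (plain Python; the vectors are the A and B above):

```python
def hamming_distance(a, b):
    """Number of positions in which two equal-length codes differ."""
    return sum(ai != bi for ai, bi in zip(a, b))

A = [1, -1, 1, 1, 1]
B = [1, 1, -1, -1, 1]

hd = hamming_distance(A, B)                    # 3, as on the slide
dot = sum(ai * bi for ai, bi in zip(A, B))     # scalar product A.B

# For bipolar vectors of length n: A.B = n - 2 * HD(A, B)
n = len(A)
assert dot == n - 2 * hd
print(hd, dot)  # 3 -1
```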

Page 5

Hamming Distance (2/5)

Page 6

Hamming Distance (3/5)

Page 7

Hamming Distance (4/5)

3. Or, HD can be defined as the lowest number of edges that must be traversed between the two relevant codes.

4. If A and B have bipolar binary components, then the scalar product of A and B is:

Page 8

Hamming Distance (5/5)

A^t B = (bits agreed) - (bits different)
      = (n - HD(A, B)) - HD(A, B)
      = n - 2 HD(A, B)

so HD(A, B) = (n - A^t B) / 2

Page 9

Hamming Net and MAXNET (1/30)

1. For a two-layer classifier of bipolar binary vectors, with p classes and p output neurons, the strongest response of a neuron indicates it has the minimum HD value to the category this neuron represents.

Page 10

Hamming Net and MAXNET (2/30)

Page 11

Hamming Net and MAXNET (3/30)

2. The MAXNET operates to suppress all but the maximum of the output values of the Hamming network.

3. For the Hamming net in the next figure, we have input vector X,
p classes => p neurons for output,
output vector Y = [ y_1, ..., y_p ]

Page 12

Hamming Net and MAXNET (4/30)

Page 13

Hamming Net and MAXNET (5/30)

4. For any output neuron m, m = 1, ..., p, we have
W_m = [ w_m1, w_m2, ..., w_mn ]^t, m = 1, 2, ..., p
as the weights between input X and each output neuron.

5. Also, assume that for each class m one has the prototype vector S^(m) as the standard to be matched.

Page 14

Hamming Net and MAXNET (6/30)

6. For classifying p classes, one can say the m'th output is 1 if and only if X = S^(m), which happens only when W^(m) = S^(m).
The outputs of the classifier are
X^t S^(1), X^t S^(2), ..., X^t S^(m), ..., X^t S^(p)
So when X = S^(m), the m'th output is n and the other outputs are smaller than n.

Page 15

Hamming Net and MAXNET (7/30)

7. X^t S^(m) = (n - HD(X, S^(m))) - HD(X, S^(m))
∴ ½ X^t S^(m) = n/2 - HD(X, S^(m))

So the weight matrix is W_H = ½ S, i.e.

W_H = ½ [ s_1^(1)  s_2^(1)  ...  s_n^(1)
          s_1^(2)  s_2^(2)  ...  s_n^(2)
          ...
          s_1^(p)  s_2^(p)  ...  s_n^(p) ]

Page 16

Hamming Net and MAXNET (8/30)

8. By giving a fixed bias n/2 to the input, then
net_m = ½ X^t S^(m) + n/2 for m = 1, 2, ..., p
or
net_m = n - HD(X, S^(m))

Page 17

Hamming Net and MAXNET (9/30)

9. To scale the outputs from the range 0~n down to 0~1, one can apply the transfer function
f(net_m) = net_m / n for m = 1, 2, ..., p

Page 18

Hamming Net and MAXNET (10/30)

Page 19

Hamming Net and MAXNET (11/30)

10. So the node with the highest output is the node with the smallest HD between the input and the prototype vectors S^(1) ... S^(p), i.e.
f(net_m) = 1
and for the other nodes
f(net_m) < 1

Page 20

Hamming Net and MAXNET (12/30)

11. MAXNET is employed as a second layer only for the cases where an enhancement of the initial dominant response of the m'th node is required.
i.e., the purpose of MAXNET is to let max{ y_1, ..., y_p } equal 1 and let the others equal 0.

Page 21

Hamming Net and MAXNET (13/30)

Page 22

Hamming Net and MAXNET (14/30)

12. To achieve this, one can let
y_i^(k+1) = f( y_i^(k) - ε Σ_{j≠i} y_j^(k) )
where i = 1, ..., p; j = 1, ..., p; j ≠ i
∵ y_i is bounded by 0 ≦ y_i ≦ 1 for i = 1, ..., p
(y_i^(0) is the output of the Hamming net, and y_i^(k) ≤ 1; after convergence the outputs can only be 1 or 0.)

Page 23

Hamming Net and MAXNET (15/30)

13. So ε is bounded by 0 < ε < 1/p
ε: lateral interaction coefficient

W_M = [  1  -ε  ...  -ε
        -ε   1  ...  -ε
        ...
        -ε  -ε  ...   1 ]   (p × p)

Page 24

Hamming Net and MAXNET (16/30)

And
net^(k) = W_M y^(k)

Page 25

Hamming Net and MAXNET (17/30)

14. So the transfer function
f(net) = net  for net ≥ 0
f(net) = 0    for net < 0

Page 26

Hamming Net and MAXNET (18/30)

Page 27

Hamming Net and MAXNET (19/30)

Ex: To have a Hamming net for classifying C, I, T, then
S^(1) = [ 1 1 1 1 -1 -1 1 1 1 ]^t
S^(2) = [ -1 1 -1 -1 1 -1 -1 1 -1 ]^t
S^(3) = [ 1 1 1 -1 1 -1 -1 1 -1 ]^t

Page 28

Hamming Net and MAXNET (20/30)

Page 29

Hamming Net and MAXNET (21/30)

So
W_H = ½ [  1  1  1  1 -1 -1  1  1  1
          -1  1 -1 -1  1 -1 -1  1 -1
           1  1  1 -1  1 -1 -1  1 -1 ]
and
net = W_H X + (n/2) [ 1 1 1 ]^t,  with n = 9

Page 30

Hamming Net and MAXNET (22/30)

For X = [ 1 1 1 1 1 1 1 1 1 ]^t,
net = [ 7 3 5 ]^t
and
Y = f(net) = [ 7/9 3/9 5/9 ]^t = [ 0.777 0.333 0.555 ]^t
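The forward pass on this slide (W_H = ½S, bias n/2, scaling by 1/n) can be reproduced numerically. A minimal sketch assuming NumPy, using the C, I, T prototypes given earlier:

```python
import numpy as np

S = np.array([
    [ 1, 1, 1,  1, -1, -1,  1, 1,  1],   # S^(1): C
    [-1, 1, -1, -1, 1, -1, -1, 1, -1],   # S^(2): I
    [ 1, 1, 1, -1,  1, -1, -1, 1, -1],   # S^(3): T
])
n = S.shape[1]            # 9
W_H = 0.5 * S             # Hamming-net weight matrix, W_H = 1/2 S

X = np.ones(n)            # test input, all ones
net = W_H @ X + n / 2     # net_m = 1/2 X^t S^(m) + n/2 = n - HD(X, S^(m))
Y = net / n               # scale the range 0..n down to 0..1

print(net)  # [7. 3. 5.]
print(Y)    # [0.777... 0.333... 0.555...]
```

The largest entry of net counts the bits that agree with each prototype, so class C (the first neuron) responds most strongly.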

Page 31

Hamming Net and MAXNET (23/30)

Input Y to MAXNET and select ε = 0.2 < 1/3 (= 1/p)
So
W_M = [  1    -1/5  -1/5
        -1/5   1    -1/5
        -1/5  -1/5   1   ]

Page 32

Hamming Net and MAXNET (24/30)

And
net^(k) = W_M Y^(k)
Y^(k+1) = f( net^(k) )

Page 33

Hamming Net and MAXNET (25/30)

k = 0:
net^(0) = W_M Y^(0) = [ 1 -0.2 -0.2 ; -0.2 1 -0.2 ; -0.2 -0.2 1 ] [ 0.777 0.333 0.555 ]^t = [ 0.599 0.067 0.333 ]^t
Y^(1) = f( net^(0) ) = [ 0.599 0.067 0.333 ]^t

Page 34

Hamming Net and MAXNET (27/30)

k = 1:
net^(1) = [ 0.520 -0.120 0.200 ]^t
Y^(2) = f( net^(1) ) = [ 0.520 0 0.200 ]^t

Page 35

Hamming Net and MAXNET (28/30)

k = 2:
net^(2) = [ 0.480 -0.144 0.096 ]^t
Y^(3) = f( net^(2) ) = [ 0.480 0 0.096 ]^t

Page 36

Hamming Net and MAXNET (29/30)

k = 3:
net^(3) = [ 0.461 -0.115 0 ]^t
Y^(4) = f( net^(3) ) = [ 0.461 0 0 ]^t
so only the first neuron (class C) remains active.
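The four MAXNET iterations above can be reproduced by applying Y^(k+1) = f(W_M Y^(k)) until only one positive output remains. A sketch assuming NumPy (small differences in the third decimal appear because the slides round Y^(0) to three digits):

```python
import numpy as np

eps = 0.2                              # lateral interaction coefficient, 0 < eps < 1/p
W_M = np.array([[ 1.0, -eps, -eps],
                [-eps,  1.0, -eps],
                [-eps, -eps,  1.0]])

y = np.array([7/9, 3/9, 5/9])          # Hamming-net output Y^(0)
for k in range(4):
    y = np.maximum(W_M @ y, 0.0)       # Y^(k+1) = f(W_M Y^(k)), f = ramp
    print(k, np.round(y, 3))

# after k = 3 only the first neuron (class C) is still active
```

Each iteration subtracts ε times the competitors' activity, so the weakest responses are driven below zero and clipped, and only the dominant node survives.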

Page 37

Hamming Net and MAXNET (30/30)

Summary:
The Hamming network only tells which class the input most likely belongs to; it does not restore the distorted pattern.

Page 38

Clustering—Unsupervised Learning (1/20)

1. Introduction
a. to categorize or cluster data
b. grouping similar objects and separating dissimilar ones

2. Assume a pattern set { X_1, X_2, ..., X_N } is submitted to determine the decision function required to identify possible clusters => by similarity rules

Page 39

Clustering—Unsupervised Learning (2/20)

Euclidean distance:
|| X - X_i || = sqrt( ( X - X_i )^t ( X - X_i ) )

Angle:
cos ψ = X^t X_i / ( || X || || X_i || )
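Both similarity rules can be written directly. A minimal sketch assuming NumPy (the two sample vectors are illustrative):

```python
import numpy as np

def euclidean(x, xi):
    """||X - X_i|| = sqrt((X - X_i)^t (X - X_i))"""
    d = x - xi
    return float(np.sqrt(d @ d))

def cos_psi(x, xi):
    """cos of the angle between X and X_i."""
    return float(x @ xi / (np.linalg.norm(x) * np.linalg.norm(xi)))

x  = np.array([1.0, 0.0])
xi = np.array([0.0, 1.0])
print(euclidean(x, xi))   # sqrt(2) = 1.414...
print(cos_psi(x, xi))     # 0.0 (orthogonal)
```

Note that for normalized vectors the two measures agree in ranking: a smaller distance means a larger cosine.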

Page 40

Clustering—Unsupervised Learning (3/20)

Page 41

Clustering—Unsupervised Learning (4/20)

3. Winner-Take-All learning
Assume the input vectors are to be classified into one of a specified number p of categories, according to the clusters detected in the training set { X_1, X_2, ..., X_N }.

Page 42

Clustering—Unsupervised Learning (5/20)

Page 43

Clustering—Unsupervised Learning (6/20)

Kohonen Network
Y = f( W X )
for vector size n:
W = [ W_1^t
      W_2^t
      ...
      W_p^t ]
and W_i = [ w_i1, ..., w_in ]^t for i = 1, ..., p

Page 44

Clustering—Unsupervised Learning (7/20)

Prior to the learning, normalization of all weight vectors is required.
So
Ŵ_i = W_i / || W_i ||

Page 45

Clustering—Unsupervised Learning (8/20)

The meaning of training is to find Ŵ_m from all the Ŵ_i such that
|| X - Ŵ_m || = min_{i=1,...,p} || X - Ŵ_i ||
i.e., the m'th neuron, with weight vector Ŵ_m, is the closest approximation of the current X.

Page 46

Clustering—Unsupervised Learning (9/20)

From
|| X - Ŵ_m || = min_{i=1,...,p} || X - Ŵ_i ||
and
|| X - Ŵ_i ||² = X^t X - 2 Ŵ_i^t X + 1
we get
min_{i=1,...,p} || X - Ŵ_i ||  <=>  max_{i=1,...,p} ( Ŵ_i^t X )

Page 47

Clustering—Unsupervised Learning (10/20)

So
Ŵ_m^t X = max_{i=1,...,p} ( Ŵ_i^t X )
The m'th neuron is the winning neuron and has the largest value of net_i, i = 1, ..., p.

Page 48

Clustering—Unsupervised Learning (11/20)

Thus W_m should be adjusted such that || X - W_m || is reduced.
Then || X - W_m || can be reduced along the direction of the negative gradient:
∇_{W_m} || X - W_m ||² = -2 ( X - W_m )
Or: increase W_m in the ( X - W_m ) direction.

Page 49

Clustering—Unsupervised Learning (12/20)

Since any single X is just one step, we want only a fraction of ( X - W_m ); thus
for neuron m:
ΔW_m = α ( X - W_m ),  0.1 < α < 0.7
for the other neurons:
ΔW_i = 0

Page 50

Clustering—Unsupervised Learning (13/20)

In general:
W_m^(k+1) = W_m^(k) + α^(k) ( X - W_m^(k) ),  where m: winning neuron
W_i^(k+1) = W_i^(k) for i ≠ m
α: learning constant
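One step of this winner-take-all rule, with the weight normalization from the earlier slide, might look like the following sketch (assumes NumPy; α and the sample data are illustrative choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(w):
    """Normalize each weight row to unit length."""
    return w / np.linalg.norm(w, axis=1, keepdims=True)

def wta_step(W, x, alpha=0.4):
    """One winner-take-all update: only the winning row moves toward x."""
    W = normalize(W)
    m = int(np.argmax(W @ x))          # winner = largest net_i = W_i^t x
    W[m] += alpha * (x - W[m])         # W_m^(k+1) = W_m^(k) + alpha (x - W_m^(k))
    return normalize(W), m             # renormalize for the next stage

W = rng.random((3, 2))                 # p = 3 neurons, n = 2 inputs, from U(0,1)
x = np.array([0.0, 1.0])
W2, m = wta_step(W, x)
# the winner's weight vector rotated toward x; the other rows are unchanged
```

As the geometric interpretation on the next slides says, the update is mostly a rotation of W_m toward X, which is why the renormalization at the end changes little.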

Page 51

Clustering—Unsupervised Learning (15/20)

4. Geometrical interpretation
The input vector X is normalized, and the weight vector Ŵ_m is the winner
=> Ŵ_m^t X = max_{i=1,...,p} ( Ŵ_i^t X )

Page 52

Clustering—Unsupervised Learning (16/20)

Page 53

Clustering—Unsupervised Learning (17/20)

So W_m' is created:
W_m' = W_m + ΔW_m
and
ΔW_m = α ( X - W_m )

Page 54

Clustering—Unsupervised Learning (18/20)

Thus the weight adjustment is mainly a rotation of the weight vector toward the input vector, without a significant length change.

W_m' is not normalized, so in the new training stage W_m' must be normalized again. Each Ŵ_i is a vector pointing to the center of gravity of its cluster on the unit sphere.

Page 55

Clustering—Unsupervised Learning (19/20)

5. In case some patterns are of known class, then
W_m' = W_m + α ( X - W_m )
with α > 0 for the correct node and α < 0 otherwise. This will accelerate the learning process significantly.

Page 56

Clustering—Unsupervised Learning (20/20)

6. Another modification is to adjust the weights of both the winner and the losers: leaky competitive learning.

7. Recall: Y = f( W X )

8. Initialization of weights:
randomly choose weights from U(0, 1), or use the convex combination
W_i = [ 1/√n  1/√n  ...  1/√n ]^t for i = 1, ..., p

Page 57

Feature Map (1/6)

1. Transform from a high-dimensional pattern space to a low-dimensional feature space.

2. Feature extraction:
two categories: natural structure / no natural structure
-- depends on whether the pattern is similar to human perception or not

Page 58

Feature Map (2/6)

Page 59

Feature Map (3/6)

3. Another important aspect is to represent the feature as naturally as possible.

4. A self-organizing neural array can map features from the pattern space.

Page 60

Feature Map (4/6)

5. Use a one-dimensional or a two-dimensional array.

6. Example:
X: input vector
i: neuron i
w_ij: weight from input x_j to neuron i
y_i: output of neuron i

Page 61

Feature Map (5/6)

Page 62

Feature Map (6/6)

7. So define
y_i = f( S( X, W_i ) )
where S( X, W_i ) means the similarity between X and W_i.

8. Thus, (b) is the result of (a).

Page 63

Self-organizing Feature Map (1/6)

1. One-dimensional mapping
Given a set of patterns X_i, i = 1, 2, ..., use a linear array of neurons i_1 > i_2 > i_3 > ...
such that, if X_1 > X_2 > ...,
y_{i1} = max y_i ( X_1 ),  y_{i2} = max y_i ( X_2 ),  i = 1, 2, ...
i.e., the ordering of the winning neurons follows the ordering of the inputs.

Page 64

Self-organizing Feature Map (2/6)

2. Thus, this learning is to find the best-matching neuron cells, which can activate their spatial neighbors to react to the same input.

3. Or, to find W_c such that
|| X - W_c || = min_i || X - W_i ||

Page 65

Self-organizing Feature Map (3/6)

4. In case of two winners, choose the lower i.

5. If c is the winning neuron, then define N_c as the neighborhood around c; N_c keeps changing as learning goes on.

6. Then
Δw_ij = α ( x_j - w_ij )  if neuron i is in the neighborhood N_c
Δw_ij = 0                 otherwise

Page 66

Self-organizing Feature Map (4/6)

7. α(t):
α(t) = α_0 ( 1 - t/T )
where t: current training iteration, T: total # of training steps to be done.

8. α starts from α_0 and is decreased until it reaches the value 0.

Page 67

Self-organizing Feature Map (5/6)

9. If a rectangular neighborhood is used, then
c - d < x < c + d
c - d < y < c + d
and as learning continues d is decreased from d_0 to 1:
d(t) = d_0 ( 1 - t/T )
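Putting items 5-9 together, a minimal one-dimensional SOFM with a shrinking rectangular neighborhood and the decaying gain α(t) = α_0(1 - t/T) could be sketched as follows (assumes NumPy; the array size p and the values α_0, d_0 are illustrative choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)

def train_sofm_1d(data, p=10, T=2000, alpha0=0.5, d0=3):
    """1-D SOFM: winner c, rectangular neighborhood |i - c| <= d."""
    W = rng.random((p, data.shape[1]))                   # random initial weights
    for t in range(T):
        x = data[rng.integers(len(data))]                # patterns one at a time
        c = int(np.argmin(np.linalg.norm(x - W, axis=1)))  # winning neuron
        alpha = alpha0 * (1 - t / T)                     # gain decays toward 0
        d = max(1, int(round(d0 * (1 - t / T))))         # neighborhood shrinks to 1
        for i in range(max(0, c - d), min(p, c + d + 1)):
            W[i] += alpha * (x - W[i])                   # update whole neighborhood N_c
    return W

data = rng.uniform(-1, 1, size=(500, 2))  # 500 points on the bipolar square
W = train_sofm_1d(data)
# after training, the weight rows trace an ordered curve through the square
```

Because neighbors of the winner are updated together, adjacent neurons end up with adjacent weight vectors, which is the topology-preserving effect shown in the example figures that follow.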

Page 68

Self-organizing Feature Map—Example

10. Example:
a. Input patterns are chosen randomly from a uniform distribution.
b. The data employed in the experiment comprised 500 points distributed uniformly over the bipolar square [-1, 1] × [-1, 1].
c. The points thus describe a geometrically square topology.
d. Thus, initial weights can be plotted as in the next figure.
e. A "-" connection line connects two adjacent neurons in the competitive layer.

Page 69

Self-organizing Feature Map—Example

Page 70

Self-organizing Feature Map—Example

Page 71

Self-organizing Feature Map—Example

Page 72

Self-organizing Feature Map—Example