

TopoART: A Topology Learning Hierarchical ART Network

Marko Tscherepanow

Bielefeld University, Applied Informatics, Universitätsstraße 25, 33615 Bielefeld, Germany

[email protected]

Abstract. In this paper, a novel unsupervised neural network combining elements from Adaptive Resonance Theory and topology learning neural networks, in particular the Self-Organising Incremental Neural Network, is introduced. It enables stable on-line clustering of stationary and non-stationary input data. In addition, two representations reflecting different levels of detail are learnt simultaneously. Furthermore, the network is designed in such a way that its sensitivity to noise is diminished, which renders it suitable for the application to real-world problems.

Keywords: on-line learning, topology learning, Adaptive Resonance Theory, hierarchical clustering

1 Introduction

For numerous tasks, the traditional off-line learning approach with separate training, validation, and test phases is not sufficient. The diagnosis of genetic abnormalities [1], interactive teaching of a humanoid robot [2], and the subcellular localisation of proteins [3] constitute some examples for such problems. As a consequence, incremental on-line learning has become more popular in recent years, since such machine learning techniques are required to gradually complete knowledge or adapt to non-stationary input distributions.

In this paper, a novel neural network is introduced which combines incremental and fast on-line clustering with topology learning. Based on its origins in Adaptive Resonance Theory (ART) networks, it is called TopoART. Thanks to these origins, TopoART creates stable representations while retaining its ability to learn new data. In order to render TopoART more suitable for real-world applications, it was designed in such a way that it becomes insensitive to noise. Furthermore, TopoART creates a hierarchical representation of the input distribution reflecting different levels of detail.

In Sect. 2, related learning methods are outlined. Afterwards, details of the TopoART algorithm are introduced in Sect. 3. Then, clustering results for artificial and real-world data are presented in Sect. 4. Here, the ability of TopoART to cope with noise and to incrementally learn new input data from non-stationary distributions will be demonstrated as well. Finally, Sect. 5 summarises the most important points of this paper.

K. Diamantaras, W. Duch, L.S. Iliadis (Eds.): ICANN 2010, Part III, LNCS 6354, pp. 157–167, 2010. The original publication is available at www.springerlink.com. © Springer-Verlag Berlin Heidelberg 2010


2 Related Work

The k-means algorithm [4], which constitutes a very well-known unsupervised learning technique, determines a partitioning of an input distribution into k regions or rather clusters. Each cluster is represented by a reference vector. The determination of the number of required clusters constitutes a crucial problem. For this reason, the Linde-Buzo-Gray (LBG) algorithm [5] was developed. Based on a fixed training set, it successively computes sets of reference vectors of increasing size until a stopping criterion is fulfilled. The topological structure of the input data is not considered by this algorithm.

In 1982, the Self-Organising Feature Maps (SOFMs) [6], which map input data to a lattice of neurons, were introduced. Here, the reference vectors are encoded by the weights of the neurons. The lattice possesses a predefined topological structure, the dimension of which is usually lower than or equal to the dimension of the input space. If the input distribution is not known in advance, an appropriate lattice structure is very difficult to choose. This problem was solved by the Growing Neural Gas (GNG) algorithm [7]. It allows for the incremental incorporation of new neurons and the learning of the input distribution's topology by adding and deleting edges between different neurons. But the representation of the input distribution is not stable: a continuing presentation of input data, even if they have already been learnt, results in a continuing adaptation of the neurons' weights, i.e. the reference vectors, and the network topology. As a consequence, already acquired knowledge gets lost due to further training. This problem has been termed the stability-plasticity dilemma [8]. It occurs, for instance, if the input distribution is complex. But small changes of the input probabilities or the sequencing of the input data may cause a similar effect.

Adaptive Resonance Theory (ART) networks have been proposed as a solution to the stability-plasticity dilemma [8]. These networks learn top-down expectations which are matched with bottom-up input. The expectations, which are called categories, summarise sets of input data to clusters. Depending on the type of ART network, the categories exhibit different shapes such as a hyperspherical shape [9], an elliptical shape [10], or a rectangular shape [11]. Besides enabling ART networks to create stable and plastic representations, the categories allow for an easy novelty detection. But in contrast to SOFMs and GNG, ART networks do not capture the topology of the input data. Furthermore, their ability of stable learning leads to an increased sensitivity to noise.

In 2006, the Self-Organising Incremental Neural Network (SOINN) was introduced [12]. Similar to GNG, SOINN clusters input data by incrementally adding neurons, the weights of which represent reference vectors, and representing the topology by edges between nodes. But it has several additional features: Firstly, SOINN has a two-layered structure representing the input distribution at different levels of detail. Additionally, this structure decreases the sensitivity to noise. The second layer is trained after the training of the first layer has been finished. Secondly, novelty detection can be performed based on an adaptive threshold. Thirdly, each neuron has an individual learning rate which decays if the amount of input samples it represents increases. By this means, a more stable representation is achieved. But the weights of the neurons do not stabilise completely. Furthermore, a high number of relevant parameters (8 parameters per layer) has to be selected in order to apply SOINN.

The Enhanced Self-Organising Incremental Neural Network (ESOINN) [13] solves some of the above-mentioned problems: By removing the second layer and one condition for the insertion of new neurons, the number of required parameters is considerably decreased (4 in total). Furthermore, the whole network can be trained on-line. But similar to SOINN, the weights do not stabilise completely. Moreover, ESOINN loses the ability to create hierarchical representations.

TopoART combines the advantages of ART and topology learning networks. From its ART ancestors, it inherits the ability of fast and stable on-line learning using expectations (categories). But the categories are extended by edges reflecting the topology of the input distribution, enabling the formation of arbitrarily shaped clusters. In addition, it adopts the ability to represent input data at different levels of detail from SOINN; but unlike SOINN, TopoART learns both levels simultaneously.

3 The TopoART Algorithm

The basic structure and the computational framework of TopoART are strongly related to Fuzzy ART [11] – a very efficient ART network utilising rectangular categories. Fuzzy ART constitutes a three-layered neural network (see Fig. 1). Input is presented to the initial layer F0. Here, the input vectors x(t) are encoded using complement coding. The encoded input vectors are denoted by x^{F1}(t). As a consequence of the usage of complement coding, each component x_i(t) of an input vector x(t) has to lie in the interval [0, 1].

x(t) = [x_1(t), ..., x_d(t)]^T    (1)

x^{F1}(t) = [x_1(t), ..., x_d(t), 1 - x_1(t), ..., 1 - x_d(t)]^T    (2)
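The complement-coding step of Eqs. (1) and (2) can be sketched in a few lines of Python; this is an illustrative reconstruction, not the author's implementation:

```python
def complement_code(x):
    """Complement coding (Eqs. 1-2): append 1 - x_i for every component,
    mapping a vector from [0, 1]^d to [0, 1]^(2d)."""
    assert all(0.0 <= xi <= 1.0 for xi in x), "components must lie in [0, 1]"
    return list(x) + [1.0 - xi for xi in x]

print(complement_code([0.25, 0.5]))  # [0.25, 0.5, 0.75, 0.5]
```

A useful side effect of this encoding is that the city block norm of every encoded vector equals d, which keeps the match function (4) uniformly scaled across inputs.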

The encoded input vectors x^{F1}(t) are transmitted to the comparison layer F1. From here, they activate the neurons of the representation layer F2:

z_i^{F2}(t) = |x^{F1}(t) ∧ w_i^{F2}(t)|_1 / (α + |w_i^{F2}(t)|_1)    (3)
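As an illustration (a sketch using the notation above, not the original code), the choice function (3) can be computed as follows; since all components are non-negative, the city block norm reduces to a plain sum:

```python
def fuzzy_and(a, b):
    """Component-wise minimum (the ∧ operation)."""
    return [min(ai, bi) for ai, bi in zip(a, b)]

def choice(x_c, w, alpha=0.001):
    """Choice function (Eq. 3): |x ∧ w|_1 / (alpha + |w|_1)."""
    return sum(fuzzy_and(x_c, w)) / (alpha + sum(w))
```

Because |w_i^{F2}(t)|_1 appears in the denominator, a small, tightly fitting category reaches an activation close to 1, whereas a large category covering the same input is penalised.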

The activation z_i^{F2}(t) (choice function) constitutes a measure for the similarity between x^{F1}(t) and the category represented by neuron i. |·|_1 and ∧ denote the city block norm and a component-wise minimum operation, respectively. The parameter α must be set slightly higher than zero. The choice of the actual value is not crucial.¹ In general, z_i^{F2}(t) prefers small categories to large ones.

After all F2 neurons have been activated, the best-matching neuron bm, i.e. the neuron with the highest activation, is selected. But the category represented by its weights w_bm^{F2}(t) is only allowed to grow and enclose a presented input vector if resonance occurs, i.e. if the match function (4) is fulfilled.

|x^{F1}(t) ∧ w_bm^{F2}(t)|_1 / |x^{F1}(t)|_1 ≥ ρ    (4)

¹ α was set to 0.001 for all experiments presented in this paper.
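The resonance test of Eq. (4) amounts to the following check (again an illustrative sketch):

```python
def match(x_c, w):
    """Match function (Eq. 4, left-hand side): |x ∧ w|_1 / |x|_1."""
    return sum(min(xi, wi) for xi, wi in zip(x_c, w)) / sum(x_c)

def resonates(x_c, w, rho):
    """Resonance occurs iff the match reaches the vigilance parameter rho."""
    return match(x_c, w) >= rho
```

For a neuron whose weights equal the encoded input (a point category), the match is exactly 1, so resonance occurs for any admissible vigilance value.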

In case of resonance, the weights w_bm^{F2} of the chosen neuron are adapted and the output y(t) of the network is set:

w_bm^{F2}(t+1) = β (x^{F1}(t) ∧ w_bm^{F2}(t)) + (1 - β) w_bm^{F2}(t)    (5)

y_i(t) = { 0 if i ≠ bm;  1 if i = bm }    (6)

β denotes the learning rate. Setting β to 1 trains the network in fast-learning mode; i.e., each learnt input is enclosed by the category that matches it best. As the categories cannot shrink according to (5), the formed representations are stable. The current size S_i(t) of a category can be derived from the weights w_i^{F2}(t) of the corresponding neuron i:

S_i(t) = Σ_{j=1}^{d} |(1 - w_{i,d+j}^{F2}(t)) - w_{i,j}^{F2}(t)|    (7)

The growth of the categories is limited by the vigilance parameter ρ and the dimension of the input space d, which determine the maximum category size S_max:

S_max = d(1 - ρ)    (8)
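Equations (7) and (8) can be illustrated as follows; `category_size` and `max_category_size` are hypothetical helper names introduced only for this sketch:

```python
def category_size(w):
    """Category size (Eq. 7): sum of the edge lengths of the hyper-rectangle
    encoded by the complement-coded weight vector w of length 2d."""
    d = len(w) // 2
    return sum(abs((1.0 - w[d + j]) - w[j]) for j in range(d))

def max_category_size(d, rho):
    """Maximum size (Eq. 8) that the vigilance parameter rho permits."""
    return d * (1.0 - rho)
```

For example, a category spanning the rectangle [0.1, 0.4] × [0.2, 0.5] is stored as w = (0.1, 0.2, 1 - 0.4, 1 - 0.5) and has size 0.3 + 0.3 = 0.6.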

Assuming a neuron was not able to fulfil (4), its activation is reset. Then a new best-matching node is chosen. If no suitable best-matching neuron is found, a new neuron representing x(t) is incorporated and resonance occurs.

TopoART is composed of two Fuzzy ART-like components (TopoART a and TopoART b) using a common F0 layer performing complement coding (see Fig. 1). But in contrast to Fuzzy ART, each F2 neuron i of both components has a counter denoted by n_i^a and n_i^b, respectively, which counts the number of input samples it has learnt. An encoded input vector is only propagated to TopoART b if resonance of TopoART a occurred and n_bm^a ≥ φ. Every τ learning cycles, all neurons with a counter smaller than φ are removed. Therefore, such neurons are called node candidates. Once n_i equals or surpasses φ, the corresponding neuron cannot be removed anymore; i.e., it becomes a permanent node. By means of this procedure, the network becomes insensitive to noise but is still able to learn stable representations.
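The candidate mechanism can be paraphrased in code; the function below is a hypothetical sketch of the removal step that runs every τ learning cycles, with `counters` mapping node ids to the counters n_i:

```python
def prune_candidates(counters, phi):
    """Remove node candidates: neurons whose counter n_i is still below phi
    are discarded; neurons with n_i >= phi are permanent and survive.
    Since counters never decrease, a node that once reached phi stays."""
    return {i: n for i, n in counters.items() if n >= phi}

print(prune_candidates({0: 7, 1: 2, 2: 5}, phi=5))  # {0: 7, 2: 5}
```

Between two pruning checkpoints, candidates still participate in learning, so a genuine but rarely sampled region can accumulate enough counts to become permanent.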

In order to allow for fast on-line learning, the learning rate β_bm used for adapting the weights of the best-matching neuron is always set to 1, yielding:

w_bm^{F2}(t+1) = x^{F1}(t) ∧ w_bm^{F2}(t)    (9)


[Figure 1: Fuzzy ART (left) and TopoART (right) architectures]

Fig. 1. Comparison of Fuzzy ART and TopoART. TopoART consists of two Fuzzy ART-like components using the same input layer. Furthermore, the F2 nodes of TopoART are connected by edges defining a topological structure. In order to reduce its sensitivity to noise, TopoART evaluates the benefit of neurons (node candidates) before they are fully incorporated.

Rather than only determining the best-matching neuron bm and modifying its weights, the neuron sbm with the second highest activation that fulfils (4) is adapted as well (10). Here, β_sbm should be chosen smaller than 1, as neuron sbm – in contrast to neuron bm – is only intended to partly learn x^{F1}(t).

w_sbm^{F2}(t+1) = β_sbm (x^{F1}(t) ∧ w_sbm^{F2}(t)) + (1 - β_sbm) w_sbm^{F2}(t)    (10)
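Both update rules (9) and (10) are instances of one convex combination; a sketch:

```python
def update_weights(x_c, w, beta):
    """Weight update: w <- beta * (x ∧ w) + (1 - beta) * w.
    beta = 1 gives the fast-learning rule for bm (Eq. 9);
    beta = beta_sbm < 1 gives the partial update for sbm (Eq. 10)."""
    return [beta * min(xi, wi) + (1.0 - beta) * wi for xi, wi in zip(x_c, w)]
```

Since min(x_i, w_i) ≤ w_i, every weight component can only decrease, i.e. categories can only grow and never shrink; this monotonicity is what makes the learnt representations stable.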

As a result of this procedure, the insensitivity to noise is further increased, since the categories are more likely to grow in relevant areas of the input space.

If φ=1 and β_sbm=0, inserted nodes immediately become permanent, all input vectors are propagated to TopoART b, and only the weights of the best-matching neuron bm are adapted during a learning cycle. In this case, the categories of TopoART a and TopoART b equal the categories of Fuzzy ART networks trained in fast-learning mode with the vigilance parameters ρ_a and ρ_b, respectively.

In order to enable TopoART to learn topologies, a lateral connection or rather edge between the neurons bm and sbm is created, if a second best-matching neuron can be found. These edges define a topological structure. They are not used for activating other neurons. If the neurons bm and sbm have already been connected by an edge, it remains unchanged, since the edges do not possess an age parameter in contrast to the edges in ESOINN, SOINN, and GNG. They are only removed, if one of the adjacent neurons is removed. As a consequence, edges between permanent nodes are stable. But new edges can still be created. This mechanism constitutes an extension of Fuzzy ART's solution to the stability-plasticity dilemma, which enables the representation of new input while retaining the already learnt representations.

In order to refine the representation of TopoART a by means of TopoART b, ρ_b should be higher than ρ_a. Within the scope of this paper, ρ_b is determined according to (11), which diminishes the maximum category size S_max by 50%.


ρ_b = (ρ_a + 1) / 2    (11)

In this way, TopoART b learns a more detailed representation which is less influenced by noise. Connections between categories of TopoART a can be split in TopoART b, resulting in a hierarchical representation of the input data.

The output y(t) of both TopoART components can be computed in a similar way as with Fuzzy ART. In addition, the connected components or rather clusters are labelled (cf. [12] and [13]): Starting from an unlabelled neuron, all connected neurons receive a specific label. Then, a new unlabelled neuron is searched for. This process is repeated until no unlabelled neurons remain. The vector c(t) provides the cluster labels of all F2 neurons. For reasons of stability, only permanent nodes are considered for the computation of y(t) and c(t). By this, the clusters can grow and fuse, but they are prevented from shrinking.
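The labelling procedure corresponds to a connected-component search over the permanent nodes; the sketch below uses hypothetical data structures (node id lists and edge tuples), not the original implementation:

```python
def label_clusters(permanent, edges):
    """Assign a cluster label to every permanent node: nodes joined by a
    path of edges share a label. Edges touching non-permanent nodes are
    ignored, mirroring the restriction of c(t) to permanent nodes."""
    adjacency = {i: set() for i in permanent}
    for a, b in edges:
        if a in adjacency and b in adjacency:
            adjacency[a].add(b)
            adjacency[b].add(a)
    labels, next_label = {}, 0
    for start in permanent:
        if start in labels:
            continue  # already reached from an earlier start node
        stack = [start]
        while stack:
            node = stack.pop()
            if node not in labels:
                labels[node] = next_label
                stack.extend(adjacency[node])
        next_label += 1
    return labels

print(label_clusters([0, 1, 2, 3], [(0, 1), (2, 3), (1, 9)]))
# node 9 is not permanent, so the edge (1, 9) is ignored
```

Because permanent nodes and their mutual edges are never deleted, repeated labelling can only merge clusters over time, never split them.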

4 Results

TopoART was evaluated using three different types of data: (i) stationary artificial data, (ii) non-stationary artificial data, and (iii) stationary real-world data. As stationary artificial input distribution, a dataset copying the one used for the evaluation of SOINN [12] was chosen (see Fig. 2(a)). It comprises two Gaussian components (A and B), two ring-shaped components (C and D), and a sinusoidal component (E) composed from three subcomponents (E1, E2, and E3). Each component encompasses 18,000 individual samples. Additionally, the input distribution includes uniformly distributed random noise amounting to 10% of the total sample number (10,000 samples).

[Figure 2: (a) data distribution with components A, B, C, D, E1, E2, E3; (b) SOINN layer 1: λ=400, age_dead=400, c=1; (c) SOINN layer 2: λ=100, age_dead=100, c=0.05]

Fig. 2. Input distribution and clustering results of SOINN. The input distribution (a) is successfully clustered by SOINN. Here, the representation is refined from layer 1 (b) to layer 2 (c). Reference vectors belonging to the same cluster share a common colour and symbol.

This dataset was used to train a SOINN system. The results of both of its layers are depicted in Fig. 2(b) and Fig. 2(c), respectively. The values for λ, age_dead, and c were selected in such a way that results comparable to those published in [12] were achieved. In contrast, the settings for α1, α2, α3, β, and γ were directly adopted from [12] for both layers (1/6, 1/4, 1/4, 2/3, 3/4).

Figure 2 shows that SOINN was able to create a hierarchical representation of the input distribution: The two clusters in layer 1 were refined in layer 2, which distinguishes five clusters. Additionally, the second layer reduced the influence of noise.

The dataset shown in Fig. 2(a) cannot be used directly to train a Fuzzy ART or a TopoART network. Rather, all samples have to be scaled to the interval [0, 1] so as to render them compatible with complement coding.

As Fuzzy ART constitutes the basis of TopoART, it was analysed first. For comparison reasons, β was set to 1 like β_bm in TopoART. ρ was selected in such a way as to roughly fit the thickness of the elliptic and sinusoidal components of the input distribution. As this network does not possess any means to decrease the sensitivity to noise, virtually the whole input space was covered by categories (see Fig. 3(a)).

[Figure 3: (a) Fuzzy ART: β=1, ρ=0.92; (b) TopoART a: β_sbm=0.6, ρ_a=0.92, τ=100, φ=5; (c) TopoART b: β_sbm=0.6, ρ_b=0.96, τ=100, φ=5]

Fig. 3. Comparison of the ART networks' results. Fuzzy ART (a) learnt rectangular categories covering virtually the complete input space. In contrast, TopoART learnt a noise-insensitive representation in which the categories have been summarised to arbitrarily shaped clusters. The representation of TopoART a (b) was further refined by TopoART b (c). Here, all categories of an individual cluster are painted with the same colour.

In contrast to Fuzzy ART, both TopoART components created representations reflecting the relevant regions of the input distribution very well (see Figs. 3(b) and 3(c)). This is remarkable since the value of ρ_a was equal to the value of the vigilance parameter ρ of the Fuzzy ART network. Similar to the layers of SOINN, the representation of TopoART was refined from TopoART a to TopoART b: While TopoART a comprises two clusters, TopoART b distinguishes five clusters corresponding to the five components of the input distribution. By virtue of the filtering of samples by TopoART a and due to the fact that ρ_b is higher than ρ_a (cf. (11)), the categories of TopoART b reflect the input distribution in more detail. This property is particularly useful, if small areas of the input space have to be clustered with high accuracy. Here, TopoART a could filter input from other regions and TopoART b could create the desired detailed representation.

The learning of non-stationary data was also investigated using the input distribution shown in Fig. 2(a). But now, input from different regions was presented successively (see Fig. 4).

[Figure 4: three columns (A+E3, B+E2, C+D+E1); top row: TopoART a: β_sbm=0.6, ρ_a=0.92, τ=100, φ=6; bottom row: TopoART b: β_sbm=0.6, ρ_b=0.96, τ=100, φ=6]

Fig. 4. Training of TopoART with non-stationary artificial data. A TopoART network was consecutively trained using three different input distributions: A+E3, B+E2, and C+D+E1. Each column shows the results after finishing training with data from the given regions. All categories of an individual cluster are painted with the same colour.

Figure 4 shows that both components of TopoART incrementally learnt the presented input. Already created representations remained stable when the input distribution changed. Similar to the stationary case, TopoART b performed a refinement of the representation of TopoART a. But here, the sub-regions E1, E2, and E3 were separated, since the corresponding input samples were presented independently and could not be linked. TopoART a was able to compensate for this effect, as its lower vigilance parameter ρ_a allowed for larger categories which could form connections between the sub-regions.

Finally, a dataset originally used to investigate methods for the direct imitation of human facial expressions by the user-interface robot iCat was applied [14]. From this dataset, the images of all 32 subjects (12 female, 20 male) that were associated with predefined facial expressions were selected, which resulted in a total of 1783 images. These images had been acquired using two lighting conditions: daylight and artificial light. They were cropped, scaled to a size of 64×64 pixels, and successively processed by principal component analysis keeping 90% of the total variance (cf. Fig. 5(a)).


After training a TopoART network with these data, the resulting clusters were compared to the partitionings based on labels reflecting the subjects and the lighting conditions. Here, two standard measures were used: the Jaccard coefficient J and the Rand index R [15]. Both provide values between 0 and 1, with higher values indicating a higher degree of similarity. As always more than one criterion (subjects, lighting conditions, facial expressions, gender, usage of glasses, etc.) for partitioning these images influences the clustering process, values considerably lower than 1 may also indicate similarity. The splitting of clusters caused by these additional criteria, for example, results in a decrease of the Jaccard coefficient.

Due to the filtering process controlled by φ, several input samples were regarded as noise and excluded from cluster formation. Within the scope of the evaluation, they were associated with the cluster containing the permanent node with the highest activation. Since the original activation (3) depends on the category size, an alternative activation (12) was used to analyse the formed clusters.² It constitutes a normalised version of the city block distance between an input sample and the respective category, which was successfully applied in [3] and [16].

z_i^{F2}(t) = 1 - |(x^{F1}(t) ∧ w_i^{F2}(t)) - w_i^{F2}(t)|_1 / d    (12)
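A sketch of the alternative activation (12); the d in the denominator is the input dimension before complement coding, recovered here from the length of the encoded vector:

```python
def activation_alt(x_c, w):
    """Alternative activation (Eq. 12): one minus the normalised city block
    distance between the encoded input and the category; unlike Eq. (3),
    it does not depend on the size of the category."""
    d = len(x_c) // 2
    distance = sum(abs(min(xi, wi) - wi) for xi, wi in zip(x_c, w))
    return 1.0 - distance / d
```

An input lying inside a category gives x^{F1} ∧ w = w, hence distance 0 and activation exactly 1, regardless of how large the category is.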

Figure 5 shows the results for both TopoART components. Here, the parameter τ was adopted from the previous experiments; β_sbm, φ, and ρ_a were iterated over representative values from the relevant intervals and selected in such a manner as to maximise the Jaccard coefficient for the partitioning according to the subjects. Then β_sbm and φ were fixed, while the influence of ρ_a is illustrated. Here, no results for ρ_a<0.75 are displayed, since a further decrease of ρ_a does not cause any changes.

[Figure 5: (a) preprocessing pipeline: cropping and removal of colour information, manual selection of the eyes' centres, removal of irrelevant pixels, PCA, yielding 45-dimensional feature vectors with labels (subject, lighting condition); (b) J_a, R_a and J_b, R_b over ρ_a for the partitioning according to the subjects (β_sbm=0.2, φ=3); (c) the same measures for the partitioning according to the lighting conditions]

Fig. 5. Clustering of stationary real-world data. After preprocessing facial images (a), they were clustered by a TopoART network. Then, it was analysed how accurately the resulting clusters reflect the different subjects (b) and the two lighting conditions (c).

² The original activation (3) was still utilised for training.


According to Fig. 5(b), the partitioning with respect to the 32 subjects is best represented if ρ_a is high. Here, J_b and R_b increase for lower values of ρ_a than J_a and R_a, which reflects the finer clustering of TopoART b. In contrast, the partitioning according to the two lighting conditions (cf. Fig. 5(c)) is impaired for high values of ρ_a, which is indicated by the decrease of J_a and J_b. The more detailed representation of TopoART b amplifies this effect. Choosing ρ_a appropriately, e.g. ρ_a=0.96, the representations of both partitionings can be combined by a single TopoART network. In this case, TopoART a represents the coarser partitioning with respect to the lighting conditions and TopoART b the finer partitioning according to the subjects.

For the results given in Fig. 5, the relevant parameters were selected in such a way as to maximise the Jaccard coefficient for the partitioning according to the subjects. Hence, the representation of the lighting conditions is not optimal. If the parameters are optimised to provide the maximum Jaccard coefficient for this partitioning, the following maximum values are reached: J_a=0.779 and J_b=0.671. This confirms the previous results, according to which TopoART a creates a less detailed representation of the input data which is more suited to represent the two lighting conditions.

5 Conclusion

TopoART – the neural network proposed in this paper – successfully combines properties from ART and topology learning approaches: The categories originating from ART systems are connected by means of edges. In this way, clusters of arbitrary shapes can be formed. In addition, a filtering mechanism decreases the sensitivity to noise. Similar to SOINN, representations exhibiting different levels of detail are formed. But TopoART enables parallel learning at both levels requiring only 4 parameters (β_sbm, φ, ρ_a, τ) to be set, which constitutes a reduction of 75% compared to SOINN. Moreover, representations created by TopoART are stable.

Although the number of parameters to be set is small, they must be chosen using some kind of prior knowledge about the input data distribution. This task is very difficult, in particular, as TopoART is trained on-line. In order to solve this problem, the hierarchical structure of TopoART can be exploited, since it provides alternative clusterings of the input data distribution. By means of interaction during the learning process, these clusterings could be evaluated with respect to the current task or other criteria.

The capability of TopoART to capture hierarchical relations and the topology of presented data might be of interest for numerous tasks, e.g. the representation of complex sensory and semantic information in robots. In principle, TopoART could even be extended to a multi-level structure that captures hierarchical relations more comprehensively.

Acknowledgements. This work has partly been supported by the German Research Foundation (DFG) within the Center of Excellence 'Cognitive Interaction Technology', Research Area D (Memory and Learning). The author particularly wishes to thank Sina Kühnel and Marc Kammer for many fruitful discussions during the preparation of this manuscript.

References

1. Vigdor, B., Lerner, B.: Accurate and fast off and online fuzzy ARTMAP-based image classification with application to genetic abnormality diagnosis. IEEE Transactions on Neural Networks 17(5) (2006) 1288–1300

2. Goerick, C., Schmudderich, J., Bolder, B., Janßen, H., Gienger, M., Bendig, A., Heckmann, M., Rodemann, T., Brandl, H., Domont, X., Mikhailova, I.: Interactive online multimodal association for internal concept building in humanoids. In: Proceedings of the IEEE-RAS International Conference on Humanoid Robots. (2009) 411–418

3. Tscherepanow, M., Jensen, N., Kummert, F.: An incremental approach to automated protein localisation. BMC Bioinformatics 9(445) (2008)

4. MacQueen, J.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability. Volume 1. (1967) 281–297

5. Linde, Y., Buzo, A., Gray, R.M.: An algorithm for vector quantizer design. IEEE Transactions on Communications COM-28 (1980) 84–95

6. Kohonen, T.: Self-organized formation of topologically correct feature maps. Biological Cybernetics 43(1) (1982) 59–69

7. Fritzke, B.: A growing neural gas network learns topologies. In: Neural Information Processing Systems. (1994) 625–632

8. Grossberg, S.: Competitive learning: From interactive activation to adaptive resonance. Cognitive Science 11 (1987) 23–63

9. Anagnostopoulos, G.C., Georgiopoulos, M.: Hypersphere ART and ARTMAP for unsupervised and supervised incremental learning. In: Proceedings of the International Joint Conference on Neural Networks. Volume 6. (2000) 59–64

10. Anagnostopoulos, G.C., Georgiopoulos, M.: Ellipsoid ART and ARTMAP for incremental clustering and classification. In: Proceedings of the International Joint Conference on Neural Networks. Volume 2. (2001) 1221–1226

11. Carpenter, G.A., Grossberg, S., Rosen, D.B.: Fuzzy ART: Fast stable learning and categorization of analog patterns by an adaptive resonance system. Neural Networks 4 (1991) 759–771

12. Furao, S., Hasegawa, O.: An incremental network for on-line unsupervised classification and topology learning. Neural Networks 19 (2006) 90–106

13. Furao, S., Ogura, T., Hasegawa, O.: An enhanced self-organizing incremental neural network for online unsupervised learning. Neural Networks 20 (2007) 893–903

14. Tscherepanow, M., Hillebrand, M., Hegel, F., Wrede, B., Kummert, F.: Direct imitation of human facial expressions by a user-interface robot. In: Proceedings of the IEEE-RAS International Conference on Humanoid Robots. (2009) 154–160

15. Xu, R., Wunsch II, D.C.: Clustering. Wiley–IEEE Press (2009)

16. Tscherepanow, M.: Image analysis methods for location proteomics. PhD thesis, Faculty of Technology, Bielefeld University (2008)