hierarchical and pyramidal clustering for symbolic...

30
1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito Univ. Porto, Portugal 2 OUTLINE The hierachical model The pyramidal model Numerical hierarchical / pyramidal clustering Symbolic Clustering The property of completeness The generality degree The clustering algorithm Examples The HIPYR Module

Upload: others

Post on 22-Jul-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

1

1

HIERARCHICAL AND PYRAMIDAL CLUSTERING

FOR SYMBOLIC DATA

Paula Brito

Univ. Porto, Portugal

2

OUTLINE

•The hierachical model

• The pyramidal model

• Numerical hierarchical / pyramidal clustering

•Symbolic Clustering

•The property of completeness

• The generality degree

• The clustering algorithm

•Examples

•The HIPYR Module

Page 2: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

2

3

Hierarchical Model :set of nested partitions

• E ∈∈∈∈ H

• ∀∀∀∀ a ∈∈∈∈ ΕΕΕΕ , { a } ∈∈∈∈ H

• ∀∀∀∀h, h' ∈∈∈∈ H, h ∩∩∩∩ h' = Ø or h ⊆⊆⊆⊆ h' or h' ⊆⊆⊆⊆ h

Let ΕΕΕΕ be the observations set (the set being clustered)

Hierarchy on E :

Family H on non-empty subsets of E such that

4

x1 x2 x3 x4 x5

h

h'

x3

x4 x5

x1 x2

h

h'

Page 3: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

3

5

PYRAMIDAL REPRESENTATION

PYRAMID : set of nested overlappings

P = { {x1} , {x2} , {x3} , {x4} , {x5} , {x1 , x2} , {x3 , x4} ,

{x4 , x5} , {x1 , x2 , x3} , {x1 , x2 , x3 , x4} , {x3 , x4 , x5} ,

{x1 , x2 , x3 , x4 , x5 } }

x1 x2 x3 x4 x5

x3

x5

x1 x2

x4

Diday (1984, 1986), Bertrand, Diday (1985), Bertrand,(1986)

6

PYRAMID P on ΕΕΕΕ

Family on non-empty subsets of E such that :

• E ∈∈∈∈ P

• ∀∀∀∀ a ∈∈∈∈ ΕΕΕΕ , { a } ∈∈∈∈ P

• ∀∀∀∀ p, p' ∈∈∈∈ P, p ∩∩∩∩ p' = Ø or p ∩∩∩∩ p' ∈∈∈∈ P

• there exists a linear order θθθθ / every element of P is an

interval of θθθθ

Pyramidal model :

x1 x2 x3 x4 x5

Hierarchy : nested partitions

Pyramid : nested overlappings

Clustering

Seriation

Page 4: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

4

7

SUCCESSOR AND PREDECESSOR

p SUCCESSOR of p’ if

- p ⊆⊆⊆⊆ p’

- ¬¬¬¬∃∃∃∃ p’’ / p ⊆⊆⊆⊆ p” ⊆⊆⊆⊆ p’

p’ is a PREDECESSOR of p

Pyramid : Each cluster has at most TWO predecessors

x1 x2 x3 x4 x5

Hierarchy : Each cluster has at most ONE predecessor

x1 x2 x3 x4 x5

h

h'

8

ASCENDING CLUSTERING ALGORITHM

Starting with the one cluster elements,

merge at each step the MERGEABLE clusters for which

the dissimilarity (aggregation index) is MINIMUM

MERGEABLE CLUSTERS :

→ if the structure is a hierarchy : none of them has

been aggregated before ;

→ if the structure is a pyramid : none of them has been

aggregated twice, and p is an interval of a total order θθθθ

on E

Page 5: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

5

9

Given a set of symbolic objects ,

build an hierarchical / pyramidal clustering

such that each cluster is a pair

SYMBOLIC CLUSTERING

EACH CLUSTER HAS AN AUTOMATIC SYMBOLIC

REPRESENTATION

BY A SYMBOLIC OBJECT

ndescriptio its - INTENSION

members its - EXTENSION

OBJECTIVE :

10

Symbolic clustering methods need:

• Generalization Operator

C ⊆⊆⊆⊆ C’

s’ (representing C’) is more general than

s (representing C)

• Generality degree measure

Page 6: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

6

11

GENERALISATION

s is more general than s’ if its extent contains

the extent of s’

s’ is more specific than s

Generalisation of two symbolic objects s and s’ :

determining s’’ : s’’ is more general than both s and s’.

This procedure differs according to the variable type :

s ≤≤≤≤ s ∪∪∪∪ s’ and s’≤≤≤≤ s ∪∪∪∪ s’

ext (s ∪∪∪∪ s’) ⊇⊇⊇⊇ ext (s) and ext (s ∪∪∪∪s’) ⊇⊇⊇⊇ ext (s’)

12

1) Interval variables

s1 = [ y ∈∈∈∈ [a1, b1] ]

s2 = [ y ∈∈∈∈ [a2, b2] ]

s1 ∪∪∪∪ s2 = [ y ∈∈∈∈ [min {a1,a2}, max {b1,b2}]]

s1 = [ time ∈∈∈∈ [5, 15] ]

s2 = [ time ∈∈∈∈ [10, 20] ]

s1 ∪∪∪∪ s2 = [ time ∈∈∈∈ [ 5, 20 ]]

Example :

Page 7: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

7

13

2) Multi-valued categorical variables

s1 = [ y ∈∈∈∈ V1]

s2 = [ y ∈∈∈∈ V2 ]

s1 ∪∪∪∪ s2 = [ y1 ∈∈∈∈ V1 ∪∪∪∪ V2 ]

s1 = [job ∈∈∈∈{secretary, teacher}]

s2 = [job ∈∈∈∈ {employee}]

s1 ∪∪∪∪ s2 = [job ∈∈∈∈{secretary, teacher, employee}]

Example :

14

3) Modal variables

Two possibilities proposed :

take for each category the Minimum of its frequencies

take for each category the Maximum of its frequencies

Page 8: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

8

15

a) Generalisation by the Maximum

= ] } )(pm , ,)(pm [y ] } )(pm , ,)(pm [y 2

kk

2

11

1

kk

1

11 …………{{{{====∪∪∪∪…………{{{{====

] } )(pm , ,)(pm [y kk11 …………{{{{====

}}}}2

j

1

jj p , p {Max p ====with

[Type of Job ∈∈∈∈{ (0.3) administration, (0.7) teaching}] ∪∪∪∪

[Type of Job ∈∈∈∈{ (0.6) admin., (0.2) teaching , (0.2) secretary}]

= [Type of Job ∈∈∈∈{ (0.6) admin., (0.7) teaching , (0.2) secret.}]

Example :

Extension : k} , 1,=j , p p : a { j

a

j …………≤≤≤≤

“at most” principle

16

b) Generalisation by the Minimum

= ] } )(pm , ,)(pm [y ] } )(pm , ,)(pm [y 2

kk

2

11

1

kk

1

11 …………{{{{====∪∪∪∪…………{{{{====

] } )(pm , ,)(pm [y kk11 …………{{{{====

with }}}}2

j

1

jj p , p { Min p ====

[[Type of Job ∈∈∈∈{ (0.3) administration, (0.7) teaching}] ∪∪∪∪

[Type of Job ∈∈∈∈{ (0.6) admin., (0.2) teaching , (0.2) secretary}]

= [Type of Job ∈∈∈∈{ (0.3) admin., (0.2) teaching}]

Example :

Extension :

“at least” principle

k} , 1,=j , p p : a { j

a

j …………≥≥≥≥

Page 9: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

9

17

COMPLETENESS AND COMPLETE OBJECTS

COMPLETE symbolic object

• defined by all the properties that characterize its

extension

• the most specific to fulfill this condition

Galois connections, lattice theory

Union preservs completeness

Barbut & Monjardet (1970); Wille (1982); Ganter (1984)

18

y1

y2

y3

Age Weight Sex

w120 45 F

w2

50 55 F

w3

30 50 F

w4

60 60 M

a = [ Weight = [ 40, 50 ] ] ∧∧∧∧ [ Age = [ 20, 50 ] ]

ext a = {w1 , w3}

a IS NOT COMPLETE

a = [Weight = [45, 50]] ∧∧∧∧ [Age = [20, 30]] ∧∧∧∧ [Sex ={F}]

IS COMPLETE

Example:

Page 10: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

10

19

THE BASIC ALGORITHM

• p1, p2 can be merged together

• s is complete

• s is more general than s1, s2 ⇒⇒⇒⇒ union s1 ∪∪∪∪ s2

• extE s = p

Non - uniqueness ⇒⇒⇒⇒ numerical criterion

Clusters with more specific descriptions

are formed first

STARTING WITH THE ONE-OBJECT CLUSTERS

{ai}, i = 1,…,n

At each step, FORM A CLUSTER p union of p1 , p2 ,

REPRESENTED BY s such that

20

GENERALITY DEGREE

a = ∧∧∧∧ [ yi ∈∈∈∈ Vi ], Vi ⊆⊆⊆⊆ Oi Oi bounded

i

e = a if

)i

(eG = )

i(O m

)i

(V m

= (a)G

p

1=i

p

1=i

∧∧∧∧

∏∏∏∏∏∏∏∏

PROPORTION of the description space covered by a

the more possible members of the extension of a ,

the greater the generality degree of a

Page 11: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

11

21

Interval variables

m(Vi) = max Vi – min Vi (range)

55,045

25

1560

2045)11e(G ==

−=

2,010000

2000

010000

10003000)12e(G ==

−=

11,02,0*55,0)1s(G ==

Example :

Describing groups of people

defined on variables age and salary.

Age ranges from 15 to 60 , salary ranges from 0 to 10000.

Consider a group described by a symbolic object

s1 = [age ∈∈∈∈ [ 20 , 45]] ∧∧∧∧ [salary ∈∈∈∈ [1000 , 3000]] = e11 ∧∧∧∧ e12

22

Multi-valued categorical variables

m(Vi) = card Vi

Example:

Describing grous of people from the UE,

defined on variables sex and nationality.

5,02

1)11e(G == 08,0

25

2)12e(G ==

04,008,05,0)1s(G =×=

Consider one group described by :

s1 = [sex ∈∈∈∈ { M }] ∧∧∧∧ [nationality ∈∈∈∈ {French, English}]

Page 12: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

12

23

∑∑∑∑∏∏∏∏========

====jk

i

ij

p

j j

pk

aG11

1

1)(

which is the affinity coefficient (Matusita, 1951) between

(p1,…,pk) and the uniform distribution

This means that we consider an object

the more general the more similar it is

to the uniform distribution

G1(a) is maximum (=1) when pi = 1/k, i=1,…k : uniform

Modal variables

a) Generalising by the Maximum

24

∑∑∑∑∏∏∏∏========

−−−−−−−−

====jk

i

ij

p

j jj

pkk

aG11

2 )1()1(

1)(

Again, G2(a) is maximum (=1)

when pi = 1/k, i=1,…k : uniform

b) Generalising by the minimum

Page 13: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

13

25

ALGORITHM

• p1, p2 can be merged together

• s = s1 ∪∪∪∪ s2 (complete)

• extE s = p

• G(s) is minimum

Starting with the one-object clusters {ai}, i = 1,…,n

At each step, FORM A CLUSTER p union of p1 , p2 ,

REPRESENTED BY s such that

26

each cluster is represented by a

"complete" symbolic object

whose extension is the cluster itself

The algorithm builds a hierarchy / pyramid on E, such that

AUTOMATIC SYMBOLIC REPRESENTATION

OF THE CLUSTERS

CLUSTER = (p, s) p = ext s

Page 14: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

14

27

y1 y2 y3 y4

w1 1 1 1 2

w2 1 2 1 3

w3 1 2 2 2

w4 2 1 1 2

w5 3 3 2 1

28

w 4 w 1 w2 w3 w5

2/54

4 /54

8/54

16/54

24 /54

32/54

1

P 1P2

P3

P4P5 P 6

P7

P8

P9

P10

P6 : : : : s6 = [ y1 = {1} ] ∧∧∧∧ [ y2 = {1,2} ] ∧∧∧∧ [ y4 = {2,3} ]

P7 : : : : s7 = [ y1 = {1,2} ] ∧∧∧∧ [ y2 = {1,2}] ∧∧∧∧ [y4 = {2,3} ]

Page 15: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

15

29

Each of the 12 groups is described

by interval variables, representing the variation

of the underlying variables in the class.

Application

INE Labour Force Survey data :

12 sex * age groups

30

Page 16: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

16

31

32

P36 [Looking_for_job=[2.010,3.595]U[4.579,6.707]

]^

[Financial Act.=[3.639,5.851]]^

[Public_ Admin=[1.336,2.620]U[3.004,4.871]]^

[Hotels_Rest_=[1.874,3.497]]^

[Commerce=[9.582,13.161]]^

[Construction=[11.244,14.526]]^

[Education=[4.589,8.829]]^

[Elect__gas_water=[0.347,1.757]]^

[Industry=[32.700,37.277]U[38.671,43.205]]^

[Other_Serv_=[4.316,7.551]]^

[Primary=[5.921,9.799]]^

[Health=[1.857,3.622]]^

[Transp_Comunic_=[2.352,4.675]]^

[None=[2.603,4.285]U[4.609,6.839]]^...

Page 17: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

17

33

its members

Man 15 to 54

Women 15 to 24

are the only elements

fulfilling these conditions.

34

Application

Cultural activities of 11 socio-professional groups

1509 individual observations grouped

Page 18: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

18

35

36

Page 19: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

19

37

The HIPYR Module

Objective :

Perform Hierarchical or Pyramidal clustering

on a set of SO’s

� from a dissimilarity matrix

→→→→ numerical clustering

� directly based on the data set

→→→→ symbolic clustering: clusters are "concepts "

38

The HIPYR module

Page 20: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

20

39

Main Parameters

Structure :

Data Source :

Aggregation Index :

Hierarchy or Pyramid

� Dissimilarity Matrix (Numerical Clustering)

� Symbolic objects (Symbolic Clustering)

� Numerical Clustering : Maximum, Minimum, Average,

Diameter

� Symbolic Clustering : Minimum Generality

Minimum Increase in Generality

40

Main Parameters

�Use Taxonomies for generalization

(nominal or categorical multi-valued variables) : Y, N

� Select “best” classes : Y, N

� Write induced dissimilarity/generality matrix : Y, N

� Order Variable (optional) : quantitative single variable;

to impose an order compatible with the pyramid

� Modal variables generalization :

� Maximum

� Minimum

Page 21: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

21

41

Main Parameters

42

Main Parameters

Page 22: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

22

43

Induced dissimilarity/generality matrix

For each pair of SO, si, sj, i, j, =1,..., n,

d*(si, sj,) = index (height) of the “smallest” class

that contains si and sj

d*(si, sj,) = Min {f(C), si ∈∈∈∈ C, sj, ∈∈∈∈ C}

Evaluation of the obtained indexed hierarchy /

pyramid:

Comparision between the initial and the induced

dissimilarity/generality matrices.

44

Evaluation value

For si, sj, i, j, =1,..., n, d(si, sj) :

�the given dissimilarity matrix (numerical clustering)

�generality degree of si ∪∪∪∪ sj (symbolic clustering)

∑∑∑∑ ∑∑∑∑

∑∑∑∑ ∑∑∑∑

−−−−

==== ++++====

−−−−

==== ++++====

−−−−

====1n

1i

n

1ijji

1n

1i

n

1ij

2jiji

)s,s(d

))s,s(*d)s,s(d(

EV

Page 23: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

23

45

Cluster selectionIdentify the most interesting clusters :

A cluster is “interesting” if its variability is

small as compared to its predecessors.

Variability indicated by index values f(h).

Compute mean value and standard deviation

of height increase values.

A class is selected if the corresponding increase value

is more than 2 stand. dev. over the mean value.

46

Cluster selection

C

Page 24: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

24

47

HIPYROUTPUT

� Text file

� Sodas file

� Interactive Graphical Representation (VPYR)

48

� The labels of the individuals

� The labels of the variables

� The description of each node :

� the symbolic object associated to each node

� its extension

� Evaluation value

�Selected clusters, if asked for

�The induced matrix, if asked for

The output listing contains:

Page 25: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

25

49

Graphical Representation

50

Options of the Graphical

Representation

A cluster is selected by clicking on it.

�description of the cluster in terms of

list of chosen variables

�representation by a Zoom Star

Page 26: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

26

51

Graphical Representation

52

Graphical Representation

Page 27: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

27

53

HIPYR HIPYR -- VPYRVPYR

54

Pruning the hierarchy or pyramid using the

aggregation heights as a criterion.

Suppressing cluster p if :

f(p’) - f(p) < αααα f(E) ∧∧∧∧ p has a single predecessor

Options of the Graphical

Representation

Rate of simplification αααα

chosen by the user,

new graphic window with

the simplified structure.

Page 28: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

28

55

Pruning

56

Rule Generation

If the hierarchy/pyramid are built from a symbolic

data table, rules may be generated and saved in a

specified file.

(p1,s1) (p2,s2)

(p,s)

(p1,s1) (p2,s2)

(p,s)

Fission method :

s ⇒⇒⇒⇒ s1 ∨∨∨∨ s2

Fussion method

(pyramids only) :

s1 ∧∧∧∧ s2⇒⇒⇒⇒ s

Page 29: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

29

57

Rule generation

58

Reduction

Should the user be interested in a particular cluster,

he may obtain a window with the structure

restricted to this cluster and its successors.

Options of the Graphical

Representation

Page 30: HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC …vlado.fmf.uni-lj.si/info/ifcs06/symbolic/HIPYR.pdf · 1 1 HIERARCHICAL AND PYRAMIDAL CLUSTERING FOR SYMBOLIC DATA Paula Brito

30

59

Reduction