Set-valued prototypes through Consensus Analysis

Post on 14-Apr-2017


Prototypes definition, Consensus Analysis, Partitioning methods, Simulated data examples, Application on real data, Conclusions

Set-valued prototypes through Consensus Analysis

M. Fordellone¹, F. Palumbo²

¹ Department of Statistical Sciences, University of Padua (Italy)
email: fordellone@stat.unipd.it

² Department of Political Sciences, University of Naples (Italy)
email: fpalumbo@unina.it

IFCS Conference, July 6th 2015, Bologna (Italy)

M. Fordellone, F. Palumbo Set-valued prototypes through Consensus Analysis


Outline

1 Prototypes definition
    What is a prototype?

2 Consensus Analysis
    Consensus clustering
    Consensus measurement

3 Partitioning methods
    k-Means
    Fuzzy criterion
    Fuzzy c-Means (FCM) and Archetypal Analysis (AA)

4 Simulated data examples
    Eight experimental contexts

5 Application on real data
    I.P.I.P. test


What is a prototype?

According to Rosch (1975, 1999), prototypes are the elements that represent a category better than others.

Smith and Medin (1981) refer to the concept of category as the highest order of genera that cannot be defined by a mere listing of properties shared by all elements.

A prototype is not necessarily a real element of the category; it can be an observed or an unobserved (abstract) entity (Medin and Schaffer, 1978).


Consensus concept

Finding and measuring the agreement between two or more partitions of the same data set is of substantial interest in cluster analysis. This particular case of consensus analysis is also known as consensus clustering.


Comparing partitions

Let X be an N × J data matrix, and let T and V be two partitions of X. Then $n_{rc}$ (r = 1, ..., R; c = 1, ..., C) is the number of objects assigned both to class $t_r$ and to class $v_c$ under the two partitioning criteria. Consensus between the partitions T and V is evaluated starting from the entries of the cross-classifying contingency table.

Table : Contingency table

                      Partition V
Partition T    v1     v2     ···    vC     Total
    t1         n11    n12    ···    n1C    n1·
    t2         n21    n22    ···    n2C    n2·
    ...        ...    ...    ···    ...    ...
    tR         nR1    nR2    ···    nRC    nR·
    Total      n·1    n·2    ···    n·C    n
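As an illustration of the table above, the following minimal sketch builds the counts n_rc from two label vectors with NumPy. The function name `contingency_table` and the toy labels are assumptions made for this example; they do not come from the talk.

```python
import numpy as np

def contingency_table(t_labels, v_labels):
    """Cross-classify two partitions of the same units.

    Returns the R x C matrix of counts n_rc; rows follow the sorted
    distinct labels of T, columns the sorted distinct labels of V.
    """
    t_labels = np.asarray(t_labels)
    v_labels = np.asarray(v_labels)
    if t_labels.shape != v_labels.shape:
        raise ValueError("both partitions must label the same units")
    # Map arbitrary labels to 0..R-1 and 0..C-1.
    _, t_idx = np.unique(t_labels, return_inverse=True)
    _, v_idx = np.unique(v_labels, return_inverse=True)
    table = np.zeros((t_idx.max() + 1, v_idx.max() + 1), dtype=int)
    np.add.at(table, (t_idx, v_idx), 1)  # n_rc += 1 for every unit
    return table

# Hypothetical toy labels for six units (not data from the talk).
T = [0, 0, 0, 1, 1, 1]
V = [0, 0, 1, 1, 2, 2]
n_rc = contingency_table(T, V)
row_totals = n_rc.sum(axis=1)  # n_r. (margins of partition T)
col_totals = n_rc.sum(axis=0)  # n_.c (margins of partition V)
```

The row and column sums of the returned matrix give the margins n_r· and n_·c, and the grand total equals n.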


Measure of Consensus

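Number of ways that n units can pair:

\[ S = \binom{n}{2} = \frac{n(n-1)}{2} \]

Total number of Agreements:

\[ A = \binom{n}{2} + \sum_{r=1}^{R} \sum_{c=1}^{C} n_{rc}^{2} - \frac{1}{2} \left[ \sum_{r=1}^{R} n_{r\cdot}^{2} + \sum_{c=1}^{C} n_{\cdot c}^{2} \right] \]

Total number of Disagreements:

\[ D = \frac{1}{2} \left[ \sum_{r=1}^{R} n_{r\cdot}^{2} + \sum_{c=1}^{C} n_{\cdot c}^{2} \right] - \sum_{r=1}^{R} \sum_{c=1}^{C} n_{rc}^{2} \]

Table : Measures of Consensus

Authors                 Measure        Range
Rand (1971)             A/S            ∈ [0, 1]
Arabie et al. (1973)    D/S            ∈ [0, 1]
Hubert (1977)           (A − D)/S      ∈ [−1, 1]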
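The quantities S, A and D can be computed directly from a contingency table. The sketch below is a minimal illustration, assuming the table is given as a nested list or NumPy array; the function name `consensus_measures` and the toy table are hypothetical. Every pair of units is either an agreement or a disagreement, so A + D = S, which gives a quick sanity check.

```python
import numpy as np

def consensus_measures(table):
    """Compute S, A, D and the three ratios from an R x C contingency table."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    S = n * (n - 1) / 2                      # number of pairs of units
    sum_sq_cells = (table ** 2).sum()        # sum over r, c of n_rc^2
    sum_sq_margins = (table.sum(axis=1) ** 2).sum() + (table.sum(axis=0) ** 2).sum()
    A = S + sum_sq_cells - 0.5 * sum_sq_margins   # agreements
    D = 0.5 * sum_sq_margins - sum_sq_cells       # disagreements
    return {
        "S": S, "A": A, "D": D,
        "rand": A / S,           # Rand (1971)
        "arabie": D / S,         # Arabie et al. (1973)
        "hubert": (A - D) / S,   # Hubert (1977)
    }

# Hypothetical 2 x 3 table for six units (not data from the talk).
toy_table = [[2, 1, 0],
             [0, 1, 2]]
measures = consensus_measures(toy_table)
```

For this toy table, 10 of the 15 pairs are treated the same way by both partitions, so the Rand measure is 10/15.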


k-Means method

The k-Means method was developed by MacQueen (1967), who suggested the name k-Means for an algorithm that assigns each unit to the group having the nearest centroid (mean). The iterative procedure consists of four principal steps:

1 Randomly select K group centers;

2 Calculate the distance between each data point and the group centers;

3 Assign each data point to the group whose center is nearest among all the group centers;

4 Recalculate the new group centers.

The procedure repeats from step 2 until no more reassignments take place.
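The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in the talk: the function name `k_means`, the use of squared Euclidean distance, the choice of K data points as starting centers, and the handling of empty groups are all assumptions made for the example.

```python
import numpy as np

def k_means(X, K, max_iter=100, seed=0):
    """Plain k-Means following the four steps listed above."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Step 1: randomly select K distinct data points as starting centers.
    centers = X[rng.choice(len(X), size=K, replace=False)]
    labels = np.full(len(X), -1)
    for _ in range(max_iter):
        # Step 2: squared Euclidean distance of every point to every center.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Step 3: assign each point to its nearest center.
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # no reassignment took place: stop
        labels = new_labels
        # Step 4: recompute each center as the mean of its members
        # (an empty group keeps its previous center).
        for k in range(K):
            members = X[labels == k]
            if len(members) > 0:
                centers[k] = members.mean(axis=0)
    return labels, centers

# Hypothetical toy data: two well-separated groups of three points.
X_toy = np.array([[0, 0], [0, 1], [1, 0],
                  [10, 10], [10, 11], [11, 10]], dtype=float)
labels, centers = k_means(X_toy, K=2)
```

Because the starting centers are random, the numbering of the groups can change with `seed`, while the partition itself stays the same on data this well separated.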


Fuzzy clustering

In fuzzy clustering, data elements can belong to more than one group, according to a measure of association given by a set of membership levels. The memberships, which lie in [0, 1], indicate the strength of the association between each data element and each group.

In our case, each unit can be univocally assigned to the group for which its membership degree is highest.
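The maximum-membership rule amounts to a row-wise argmax over the membership matrix. The matrix below is a hypothetical example with four units and three groups; its rows are assumed to sum to one, in line with the constraint stated for the Fuzzy c-Means memberships.

```python
import numpy as np

# Hypothetical membership matrix: 4 units (rows) by 3 groups (columns).
memberships = np.array([
    [0.80, 0.15, 0.05],
    [0.10, 0.70, 0.20],
    [0.30, 0.30, 0.40],
    [0.05, 0.05, 0.90],
])

# Each row is assumed to sum to one.
assert np.allclose(memberships.sum(axis=1), 1.0)

# Crisp assignment: the group with the highest membership for each unit.
hard_labels = memberships.argmax(axis=1)
# The membership degree attached to that assignment.
max_degree = memberships.max(axis=1)
```

The third unit shows why the maximum degree is worth keeping: it is assigned to the last group with a membership of only 0.40, a much weaker association than that of the fourth unit.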


FCM and AA

Fuzzy c-Means (Bezdek et al., 1984) and Archetypal Analysis (Cutler and Breiman, 1994) can be seen as fuzzy approaches to k-Means, under different constraints.

Fuzzy c-Means minimizes the membership-weighted sum of squared distances between each point and a set of K centers; Archetypal Analysis minimizes the sum of squared distances between each point and a set of K archetypes, each defined as a convex combination of extreme points.


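Fuzzy c-Means

\[ W = \sum_{i=1}^{n} \sum_{k=1}^{K} \gamma_{ik}^{2} \, \| x_i - c_k \|^{2} \]

$\gamma_{ik}$ is the membership level of the i-th unit in the k-th group.
$c_k$ is the center of the k-th group.
Constraints: $\sum_{k=1}^{K} \gamma_{ik} = 1$; $\gamma_{ik} \geq 0$, $\forall k \in \{1, 2, \dots, K\}$.

Archetypal Analysis

\[ J = \sum_{i=1}^{n} \sum_{k=1}^{K} \| x_i - \delta_{ik} a_k \|^{2} \]

$\delta_{ik}$ is the membership level of the i-th unit in the k-th group.
$a_k = \sum_{i=1}^{n} x_i \beta_{ik}$ is the archetype of the k-th group.
Constraints: $\sum_{k=1}^{K} \delta_{ik} = 1$; $\delta_{ik} \geq 0$; $\sum_{k=1}^{K} \beta_{ik} = 1$; $\beta_{ik} \geq 0$, $\forall k \in \{1, 2, \dots, K\}$.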
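To make the Fuzzy c-Means criterion concrete, the sketch below evaluates W and minimizes it by alternating between the centers and the memberships. The update rules are the standard ones for squared memberships (fuzzifier equal to 2); they are not stated on the slides. The function names, the random initialisation and the toy data are assumptions for this example only. Archetypal Analysis is not covered here.

```python
import numpy as np

def fcm_objective(X, centers, gamma):
    """W = sum_i sum_k gamma_ik^2 * ||x_i - c_k||^2."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return float((gamma ** 2 * d2).sum())

def fuzzy_c_means(X, K, n_iter=100, tol=1e-8, seed=0):
    """Alternating minimisation of W under sum_k gamma_ik = 1, gamma_ik >= 0."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Random memberships, normalised so that each row sums to one.
    gamma = rng.random((len(X), K))
    gamma /= gamma.sum(axis=1, keepdims=True)
    history = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by gamma_ik^2.
        w = gamma ** 2
        centers = (w.T @ X) / w.sum(axis=0)[:, None]
        # Membership update: proportional to 1 / squared distance.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        inv = 1.0 / d2
        gamma = inv / inv.sum(axis=1, keepdims=True)
        history.append(fcm_objective(X, centers, gamma))
        if len(history) > 1 and abs(history[-2] - history[-1]) < tol:
            break
    return centers, gamma, history

# Hypothetical toy data: two well-separated groups of three points.
X_toy = np.array([[0, 0], [0, 1], [1, 0],
                  [10, 10], [10, 11], [11, 10]], dtype=float)
centers, gamma, history = fuzzy_c_means(X_toy, K=2)
```

The list `history` records W after each pass; since each update minimizes W with the other block held fixed, the recorded values should not increase.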


Simulated data

Three groups of units in different experimental contexts have been generated from a multivariate Gaussian distribution with eight dimensions (four variables are white noise).

Table : Experimental contexts

          Size    Correlation    Kurtosis
Case 1    900     0.2 − 0.4      β = 3
Case 2    300     0.2 − 0.4      β = 3
Case 3    900     0.2 − 0.4      β < 3
Case 4    300     0.2 − 0.4      β < 3
Case 5    900     0.6 − 0.8      β = 3
Case 6    300     0.6 − 0.8      β = 3
Case 7    900     0.6 − 0.8      β < 3
Case 8    300     0.6 − 0.8      β < 3
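A data set in the spirit of the Gaussian cases (β = 3) could be generated as sketched below. Several details are assumptions, because the slides do not give them: the size is read as the total number of units and split equally over the three groups, a single correlation of 0.3 is used for all pairs of informative variables, and the group means are invented for illustration. The cases with β < 3 are not covered by this sketch.

```python
import numpy as np

def simulate_case(n_total=300, rho=0.3, seed=0):
    """Three Gaussian groups in 8 dimensions: 4 informative variables, 4 noise."""
    rng = np.random.default_rng(seed)
    n_per_group = n_total // 3  # assumption: equal group sizes
    # Equicorrelated covariance matrix for the four informative variables.
    cov = np.full((4, 4), rho)
    np.fill_diagonal(cov, 1.0)
    # Hypothetical group means (not given in the slides).
    means = np.array([[0, 0, 0, 0],
                      [4, 4, 0, 0],
                      [0, 4, 4, 0]], dtype=float)
    blocks, labels = [], []
    for g in range(3):
        signal = rng.multivariate_normal(means[g], cov, size=n_per_group)
        noise = rng.standard_normal((n_per_group, 4))  # white-noise variables
        blocks.append(np.hstack([signal, noise]))
        labels.append(np.full(n_per_group, g))
    return np.vstack(blocks), np.concatenate(labels)

# Settings loosely matching Case 2 (size 300, correlation between 0.2 and 0.4).
X_sim, y_sim = simulate_case(n_total=300, rho=0.3)
```

Changing `n_total` to 900 or `rho` to a value between 0.6 and 0.8 gives settings comparable to the other Gaussian cases in the table.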


[Figures: Simulated data, Cases 1 to 8. K-means groups]


[Figures: Simulated data, Cases 1 to 8. Memberships FCM and AA]


[Figures: Simulated data, Cases 1 to 8. Consensus Analysis between FCM and AA]

M. Fordellone, F. Palumbo Set-valued prototypes through Consensus Analysis

Prototypes definitionConsensus Analysis

Partitioning methodsSimulated data examplesApplication on real data

Conclusions

Eight experimental contexts

Simulated data: Cases 1-8. Consensus groups FCM and AA [one slide per case]

Simulated data: Summary

Table: Results of Consensus Analysis and definition of the prototypes

Column groups: Experimental Conditions (N, Corr., Kurt.); Prototyping Results (K, Size); Consensus Measuring (Rand, Arabie, Hubert)

N   | Corr.   | Kurt. | K | Size         | Rand  | Arabie | Hubert
900 | 0.2-0.4 | β = 3 | 3 | 900 (100.0%) | 1.000 | 0.000  | 1.000
300 | 0.2-0.4 | β = 3 | 3 | 300 (100.0%) | 1.000 | 0.000  | 1.000
900 | 0.2-0.4 | β < 3 | 3 | 625 (69.4%)  | 0.725 | 0.275  | 0.449
300 | 0.2-0.4 | β < 3 | 3 | 185 (61.7%)  | 0.683 | 0.317  | 0.365
900 | 0.6-0.8 | β = 3 | 3 | 599 (66.6%)  | 0.753 | 0.247  | 0.506
300 | 0.6-0.8 | β = 3 | 3 | 202 (67.3%)  | 0.758 | 0.242  | 0.517
900 | 0.6-0.8 | β < 3 | 3 | 533 (59.2%)  | 0.698 | 0.302  | 0.397
300 | 0.6-0.8 | β < 3 | 3 | 189 (63.0%)  | 0.720 | 0.280  | 0.439
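In the table above, the Arabie column equals 1 - Rand in every row, and the Hubert column agrees with 2 * Rand - 1 up to rounding. The following is a minimal sketch of how three such pair-counting measures could be computed from two hard partitions. It assumes those two relationships as definitions; the function name `pair_agreement_indices` and the dictionary keys are illustrative and are not taken from the slides.

```python
from itertools import combinations


def pair_agreement_indices(labels_a, labels_b):
    """Pair-counting agreement between two hard partitions of the same units.

    Assumed definitions (inferred from the summary table, not stated by
    the authors): "arabie" = 1 - Rand and "hubert" = 2 * Rand - 1.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same units")
    if len(labels_a) < 2:
        raise ValueError("at least two units are needed to form a pair")

    agreements = 0
    total_pairs = 0
    for i, j in combinations(range(len(labels_a)), 2):
        together_in_a = labels_a[i] == labels_a[j]
        together_in_b = labels_b[i] == labels_b[j]
        # A pair agrees when both partitions join it or both separate it.
        if together_in_a == together_in_b:
            agreements += 1
        total_pairs += 1

    rand = agreements / total_pairs
    return {"rand": rand, "arabie": 1.0 - rand, "hubert": 2.0 * rand - 1.0}


if __name__ == "__main__":
    # Same grouping with swapped label names: perfect agreement.
    print(pair_agreement_indices([0, 0, 1, 1], [1, 1, 0, 0]))
```

Because only pairs of units are compared, the result does not depend on how each method names its groups, which is convenient when comparing FCM and AA output.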

5 Application on real data: I.P.I.P. test

I.P.I.P. test: About data

Web Site: http://personality-testing.info/_rawdata/

Four different scales were used as part of an experimental DISC personality test. The scales are from the International Personality Item Pool (http://ipip.ori.org/newCPIKey.htm). The scales used are:

Assertiveness: the quality of being self-assured and confident without being aggressive

Social confidence: generally described as a state of being certain

Adventurousness: represented by activities with some potential for physical danger

Dominance: conceptualized as a measure of individual differences in levels of group-based discrimination

The dataset consists of 40 items (10 for each scale) and 898 individuals. The items were rated on a 5-point scale where:

1 = Strongly disagree,

2 = Disagree,

3 = Neither agree nor disagree,

4 = Agree,

5 = Strongly agree.
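As an illustration of the data layout described above, the sketch below validates one respondent's 40 answers and averages them per scale. It rests on several assumptions that the slides do not state: that the items are stored scale by scale in the order listed, that no item is reverse-keyed, and that every answer is one of the five rating codes. The names `SCALES`, `ITEMS_PER_SCALE` and `scale_scores` are hypothetical.

```python
# Hypothetical layout: 4 scales x 10 items, stored scale by scale.
SCALES = ("Assertiveness", "Social confidence", "Adventurousness", "Dominance")
ITEMS_PER_SCALE = 10
VALID_RATINGS = {1, 2, 3, 4, 5}  # 1 = Strongly disagree ... 5 = Strongly agree


def scale_scores(responses):
    """Return the mean rating per scale for one respondent.

    Assumes no reverse-keyed items and no missing answers; both
    assumptions should be checked against the real codebook.
    """
    expected = len(SCALES) * ITEMS_PER_SCALE
    if len(responses) != expected:
        raise ValueError(f"expected {expected} answers, got {len(responses)}")
    if any(answer not in VALID_RATINGS for answer in responses):
        raise ValueError("answers must be integers from 1 to 5")

    scores = {}
    for index, name in enumerate(SCALES):
        start = index * ITEMS_PER_SCALE
        block = responses[start:start + ITEMS_PER_SCALE]
        scores[name] = sum(block) / ITEMS_PER_SCALE
    return scores


if __name__ == "__main__":
    example = [5] * 10 + [1] * 10 + [3] * 10 + [4] * 10
    print(scale_scores(example))
```

Applied to all 898 individuals, a function of this kind would yield an 898 x 4 matrix of scale scores; whether the authors worked on scale scores or directly on the 40 items is not stated in the slides.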

I.P.I.P. test: Principal Component Analysis

I.P.I.P. test: Scree-plots FCM and AA

I.P.I.P. test: K-means groups

I.P.I.P. test: Memberships FCM and AA

I.P.I.P. test: Consensus Analysis between FCM and AA

I.P.I.P. test: Consensus groups FCM and AA

I.P.I.P. test: Description of prototypes

Conclusions

The results of the applications confirm the following hypotheses:

When the groups are well defined and do not overlap, the consensus analysis between the two partitioning methods highlights the presence of the groups;

The simulation was useful for studying the factors that can strongly affect the consensus between the two approaches: first, the correlation between variables; second, the presence of multivariate outliers (different kurtosis levels).

We believe that defining prototypes through the consensus approach is more reliable than the classical approaches: finding the groups with respect to the consensus criterion guarantees more homogeneous prototypes.
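One way to read the consensus criterion is sketched below: each unit is assigned to its highest-membership group under each method, the group labels of the two methods are aligned, and only the units on which both methods agree are kept as the set-valued prototype of that group. This is an illustrative simplification under stated assumptions, not necessarily the authors' exact rule. The membership matrices are plain lists of rows, ties are broken by taking the first group, and the names `crisp_labels` and `consensus_groups` are hypothetical.

```python
from itertools import permutations


def crisp_labels(memberships):
    """Assign each unit to the group with the largest membership degree."""
    return [max(range(len(row)), key=row.__getitem__) for row in memberships]


def consensus_groups(memberships_a, memberships_b):
    """Units that two fuzzy partitions place in the same (aligned) group.

    Group labels of the second method are aligned to the first by brute
    force over all permutations, which is only practical for small K.
    Returns the agreed units per group and the share of agreed units.
    """
    labels_a = crisp_labels(memberships_a)
    labels_b = crisp_labels(memberships_b)
    k = len(memberships_a[0])

    best_perm, best_hits = None, -1
    for perm in permutations(range(k)):
        hits = sum(a == perm[b] for a, b in zip(labels_a, labels_b))
        if hits > best_hits:
            best_perm, best_hits = perm, hits

    groups = {group: [] for group in range(k)}
    for unit, (a, b) in enumerate(zip(labels_a, labels_b)):
        if a == best_perm[b]:
            groups[a].append(unit)
    return groups, best_hits / len(labels_a)


if __name__ == "__main__":
    # Toy memberships for 4 units and K = 2 (values are made up).
    fcm = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
    aa = [[0.2, 0.8], [0.1, 0.9], [0.7, 0.3], [0.6, 0.4]]
    print(consensus_groups(fcm, aa))
```

In this toy example the two methods name their groups in opposite order; after alignment they agree on three of the four units, and the fourth unit is left out of every prototype.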
