Introduction Problem Definition Bandedness and Bi-Clustering MMBS Algorithm Experimental Results Conclusion
Mining Maximally Banded Matrices in Binary Data
Faris Alqadah, Raj Bhatnagar, Anil Jegga
University of Cincinnati, Cincinnati Children’s Hospital
Outline
1 Introduction: Motivation
2 Problem Definition: Preliminaries
3 Bandedness and Bi-Clustering: Formal Concept Analysis, Concept Lattice Paths
4 MMBS Algorithm: Three Steps
5 Experimental Results: Synthetic Data, Real-World Data
6 Conclusion
Banded Matrices in Data
  A B C D E
1 1 1 1 0 0
2 0 1 1 0 0
3 0 0 1 0 0
4 0 0 1 1 0
5 0 0 0 1 1
Banded structures in binary matrices have natural interpretations
Bioinformatics (overlapping roles of genes)
Paleontology (patterns of species in space)
Social Networks (community structures)
Motivating Example
     | k-means | multi-way | EM | bi-cluster | subspace
doc1 | 1 | 0 | 1 | 0 | 1
doc2 | 0 | 1 | 0 | 1 | 0
doc3 | 0 | 0 | 0 | 0 | 1
doc4 | 0 | 0 | 0 | 1 | 1
doc5 | 0 | 0 | 1 | 0 | 1
Motivating Example
     | k-means | EM | subspace | bi-cluster | multi-way
doc1 | 1 | 1 | 1 | 0 | 0
doc5 | 0 | 1 | 1 | 0 | 0
doc3 | 0 | 0 | 1 | 0 | 0
doc4 | 0 | 0 | 1 | 1 | 0
doc2 | 0 | 0 | 0 | 1 | 1
Bi-Clustering Problem
Banded sub-matrices are a form of bi-clusters
Bi-clustering in binary data focuses on maximal rectangles full (or almost full) of 1s
Related Work
Nestedness and segmented nestedness [6]
MBS algorithm [2]
Fix column permutations
Solve the consecutive ones problem
Only find a single band
Contributions
1 Establish correspondence between banded structures and bi-clustering in binary data
2 Introduce the novel MMBS algorithm to uncover multiple, possibly overlapping banded sub-matrices
3 Empirical evaluation verifying advantage of MMBS over previous approaches
Basic Notation
Matrix K with row labels G and column labels M
Think of K as K = (G,M, I)
π permutation of G and τ permutation of M
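$K^{\pi}_{\tau}$: K with rows permuted by π and columns permuted by τ
$g^{\pi}_{i}$ and $m^{\tau}_{j}$: the i-th row and the j-th column under these permutations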
Fully Banded Matrix
Definition
A binary matrix K = (G, M, I) is fully banded if there exist a permutation π of G and a permutation τ of M such that (1) for every row i in $K^{\pi}_{\tau}$ the entries with 1s occur in consecutive column indices $\{m_i, m_i + 1, \ldots, m_i^{\star}\}$, and (2) the starting and ending indices of the 1s in successive rows (i and i + 1) satisfy the conditions $m_i \le m_{i+1}$ and $m_i^{\star} \le m_{i+1}^{\star}$.
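The two conditions of the definition can be checked mechanically once a row and column order has been chosen. The sketch below is only an illustration, not part of the talk: the function names `permute` and `is_banded_as_ordered` are hypothetical, all-zero rows are assumed not to break the band, and the code tests one given ordering rather than searching for π and τ. It uses the document matrix from the Motivating Example.

```python
def permute(matrix, row_order, col_order):
    """Reorder rows and columns of a 0/1 matrix given as a list of lists."""
    return [[matrix[i][j] for j in col_order] for i in row_order]


def is_banded_as_ordered(matrix):
    """Check conditions (1) and (2) for the row/column order as given."""
    prev_start, prev_end = None, None
    for row in matrix:
        ones = [j for j, value in enumerate(row) if value == 1]
        if not ones:
            continue  # assumption: an all-zero row does not break the band
        start, end = ones[0], ones[-1]
        if end - start + 1 != len(ones):
            return False  # condition (1): the 1s are not consecutive
        if prev_start is not None and (start < prev_start or end < prev_end):
            return False  # condition (2): band boundaries must not move left
        prev_start, prev_end = start, end
    return True


# Rows doc1..doc5; columns k-means, multi-way, EM, bi-cluster, subspace.
docs = [
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 0, 1],
]
row_order = [0, 4, 2, 3, 1]  # doc1, doc5, doc3, doc4, doc2
col_order = [0, 2, 4, 3, 1]  # k-means, EM, subspace, bi-cluster, multi-way

print(is_banded_as_ordered(docs))                                 # False
print(is_banded_as_ordered(permute(docs, row_order, col_order)))  # True
```

Finding suitable permutations is the hard part; this check only confirms that a proposed ordering exposes the band.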
Relaxation of Fully Banded
Real data has noise
Subspaces may encompass banded structure
$e(K^{\pi}_{\tau})$: number of 1s or 0s that must be flipped to achieve banded structure
Maximal banded sub-matrix: no more rows or columns can be added while still preserving bandedness
Problem Statement
Given a binary matrix K and a noise threshold ε, find all sub-matrices of K that are ε-banded and maximal.
Bi-clustering
Bi-clusters in binary data defined as Formal Concepts
For A ⊆ G, A′ = {m ∈ M | gIm for all g ∈ A}
For B ⊆ M, B′ = {g ∈ G | gIm for all m ∈ B}
Formal Concept: C = (A,B) such that A′ = B and B′ = A
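As a minimal sketch of the two derivation operators, the code below stores a context as a mapping from each row to the set of columns where it has a 1, using the 5 × 5 banded matrix shown earlier. The names `CONTEXT`, `common_columns`, `common_rows`, and `is_formal_concept` are illustrative assumptions, not notation from the talk.

```python
# Context K = (G, M, I) stored as: row label -> set of columns holding a 1.
CONTEXT = {
    1: {"A", "B", "C"},
    2: {"B", "C"},
    3: {"C"},
    4: {"C", "D"},
    5: {"D", "E"},
}
ATTRIBUTES = {"A", "B", "C", "D", "E"}


def common_columns(rows):
    """A' : the columns shared by every row in `rows`."""
    shared = set(ATTRIBUTES)
    for g in rows:
        shared &= CONTEXT[g]
    return shared


def common_rows(cols):
    """B' : the rows that have a 1 in every column of `cols`."""
    return {g for g, attrs in CONTEXT.items() if set(cols) <= attrs}


def is_formal_concept(rows, cols):
    """(A, B) is a formal concept when A' = B and B' = A."""
    return common_columns(rows) == set(cols) and common_rows(cols) == set(rows)


print(is_formal_concept({1, 2}, {"B", "C"}))  # True
print(is_formal_concept({1, 2, 3}, {"C"}))    # False: row 4 also contains C
```

The second call fails because the rectangle {1, 2, 3} × {C} is not maximal: it can be extended by row 4.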
Formal Concepts
   m1 m2 m3 m4
g1 0  1  0  1
g2 0  0  1  1
g3 0  0  0  1
g4 1  0  0  0
g5 1  1  1  0
g7 1  1  0  0
g6 0  0  1  0
Maximal rectangles of 1s
Maximal bicliques
Bi-clusters may be ordered by the subset/superset relationship and form a complete lattice
B(G,M, I) denotes the concept or bi-cluster lattice
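For a matrix this small, all maximal rectangles can be listed by brute force: close every subset of columns and keep the distinct (extent, intent) pairs. The sketch below does this for the table on this slide. It is exponential in the number of columns and is meant only to make the definition concrete; it is not one of the lattice construction algorithms cited in the talk, and the helper names are hypothetical.

```python
from itertools import combinations

# The slide's context: row label -> set of columns holding a 1.
CONTEXT = {
    "g1": {"m2", "m4"},
    "g2": {"m3", "m4"},
    "g3": {"m4"},
    "g4": {"m1"},
    "g5": {"m1", "m2", "m3"},
    "g7": {"m1", "m2"},
    "g6": {"m3"},
}
COLUMNS = ["m1", "m2", "m3", "m4"]


def rows_with_all(cols):
    """B' : rows that have a 1 in every column of `cols`."""
    return frozenset(g for g, attrs in CONTEXT.items() if set(cols) <= attrs)


def cols_shared_by(rows):
    """A' : columns shared by every row in `rows`."""
    shared = set(COLUMNS)
    for g in rows:
        shared &= CONTEXT[g]
    return frozenset(shared)


def all_concepts():
    """Collect (B', B'') for every column subset B; each pair is a concept."""
    concepts = set()
    for size in range(len(COLUMNS) + 1):
        for cols in combinations(COLUMNS, size):
            extent = rows_with_all(cols)
            concepts.add((extent, cols_shared_by(extent)))
    return concepts


for extent, intent in sorted(all_concepts(), key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

Each printed pair is a maximal rectangle of 1s, including the two trivial concepts with an empty extent or an empty intent.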
Splintering Bands
Trivially a bi-cluster is fully banded
  A B C D E
1 1 1 1 0 0
2 0 1 1 0 0
3 0 0 1 0 0
4 0 0 1 1 0
5 0 0 0 1 1
Intuitively, any fully banded matrix can be splintered exactly into maximal rectangles of 1s or bi-clusters
Ordering Splintered Bands
Let $K^{\pi}_{\tau}$ be fully banded
Γ(g) is a mapping from row g to the bi-clusters g appears in
The union of all Γ(g) can always be ordered
n-tuple of bi-clusters $\{C_1, \ldots, C_n\}$ having total orderings $\{<_{\pi_1,\tau_1}, \ldots, <_{\pi_n,\tau_n}\}$
Define lexicographical order $<_{\pi,\tau}$ on $C_1 \times C_2 \times \cdots \times C_n$.
Considering $\{C_1, \ldots, C_n\}$ in order completely specifies the permutations π and τ
Bands as Sequences of Concepts
Proposition
Given a context K, if permutations π and τ exist such that $K^{\pi}_{\tau}$ is fully banded, then there exists a sequence of bi-clusters $C_1 = (A_1, B_1), \ldots, C_n = (A_n, B_n)$ s.t.
$\pi = \{A_1, A_2 \setminus A_1, \ldots, A_n \setminus A_{n-1}\}$
$\tau = \{B_1 \setminus B_2, \ldots, B_{n-1} \setminus B_n, B_n\}$
An Example
  A B C D E
1 1 1 1 0 0
2 0 1 1 0 0
3 0 0 1 0 0
4 0 0 1 1 0
5 0 0 0 1 1

g | Γ(g)
1 | {(1, ABC), (12, BC), (1234, C)}
2 | {(12, BC), (1234, C)}
3 | {(1234, C)}
4 | {(4, CD), (45, D)}
5 | {(5, DE), (45, D)}

$F(K^{\pi}_{\tau})$ = {(1, ABC) < (12, BC) < (1234, C) < (4, CD) < (45, D) < (5, DE)}
π = {1, 12 \ 1, . . . , 5 \ 45} = {1, 2, 3, 4, 5}
τ = {ABC \ BC, . . . , D \ DE, DE} = {A, B, C, D, E}
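The set differences in the proposition can be computed directly from the ordered concept sequence. The following sketch applies them to the six concepts of this example. The function name `orders_from_concepts` is hypothetical, and two details are assumptions rather than part of the talk: ties inside one difference are broken by sorting, and a row or column is placed only the first time it appears.

```python
def orders_from_concepts(concepts):
    """Derive a row order and a column order from an ordered concept sequence.

    Rows come from A_1, A_2 \\ A_1, ..., A_n \\ A_{n-1};
    columns come from B_1 \\ B_2, ..., B_{n-1} \\ B_n, B_n.
    """
    row_order, col_order = [], []
    for i, (extent, intent) in enumerate(concepts):
        prev_extent = concepts[i - 1][0] if i > 0 else set()
        next_intent = concepts[i + 1][1] if i + 1 < len(concepts) else set()
        for g in sorted(extent - prev_extent):
            if g not in row_order:  # assumption: place each row once
                row_order.append(g)
        for m in sorted(intent - next_intent):
            if m not in col_order:  # assumption: place each column once
                col_order.append(m)
    return row_order, col_order


band = [
    ({1}, set("ABC")),
    ({1, 2}, set("BC")),
    ({1, 2, 3, 4}, set("C")),
    ({4}, set("CD")),
    ({4, 5}, set("D")),
    ({5}, set("DE")),
]
rows, cols = orders_from_concepts(band)
print(rows)  # [1, 2, 3, 4, 5]
print(cols)  # ['A', 'B', 'C', 'D', 'E']
```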
Paths in the lattice
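Represent B(G, M, I) as a graph G = (V, E)
Edge set defined as: $(C_1, C_2) \in E \leftrightarrow C_1 \prec C_2 \lor C_2 \prec C_1$
Concept lattice order enforces: $A_{i+1} \subseteq A_i$ and $B_i \subseteq B_{i+1}$ if $C_i \prec C_{i+1}$
Dual: $A_i \subseteq A_{i+1}$ and $B_{i+1} \subseteq B_i$ if $C_i \succ C_{i+1}$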
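To make the graph view concrete, the sketch below builds undirected edges between neighboring concepts for the 5 × 5 example matrix and confirms that the band's concept sequence is a path in that graph. Reading ≺ as the cover (immediate neighbor) relation on extents is an assumption of this sketch, as are the names `CONCEPTS` and `cover_edges`.

```python
# The eight concepts (extent, intent) of the 5 x 5 example matrix.
CONCEPTS = [
    (frozenset(), frozenset("ABCDE")),
    (frozenset({1}), frozenset("ABC")),
    (frozenset({1, 2}), frozenset("BC")),
    (frozenset({1, 2, 3, 4}), frozenset("C")),
    (frozenset({4}), frozenset("CD")),
    (frozenset({4, 5}), frozenset("D")),
    (frozenset({5}), frozenset("DE")),
    (frozenset({1, 2, 3, 4, 5}), frozenset()),
]


def cover_edges(concepts):
    """Undirected edges between concepts that are immediate neighbors.

    Two concepts are neighbors when one extent strictly contains the other
    and no third concept's extent lies strictly between them.
    """
    edges = set()
    for lo in concepts:
        for hi in concepts:
            if lo[0] < hi[0]:  # strict subset of extents
                if not any(lo[0] < mid[0] < hi[0] for mid in concepts):
                    edges.add(frozenset((lo, hi)))
    return edges


EDGES = cover_edges(CONCEPTS)
band_path = CONCEPTS[1:7]  # (1,ABC), (12,BC), (1234,C), (4,CD), (45,D), (5,DE)
is_path = all(
    frozenset((a, b)) in EDGES for a, b in zip(band_path, band_path[1:])
)
print(len(EDGES), is_path)  # 10 True
```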
Construct Partial Bands Via Paths
[Figure: concept lattice of the example matrix, with nodes (∅, ABCDE), (1, ABC), (5, DE), (4, CD), (12, BC), (45, D), (1234, C), (12345, ∅)]
Bound on the error
Key Fact
Each individual edge in a path P is guaranteed to produce a banded structure
Proposition
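$$
e(P_n) \le
\begin{cases}
0 & \text{if } n \le 1 \\
e(P_{n-1}) + \sum_{a \in A} |a' \cap B| & \text{if } C_{n+1} \succ C_n \\
e(P_{n-1}) + \sum_{b \in B} |b' \cap A| & \text{if } C_{n+1} \prec C_n
\end{cases}
$$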
Overview
Weigh edges of concept lattice with upper bound of error
Bad news: weights change depending on path
Good news: error is monotonic along a path, so pruning with backtracking works!
Three steps:
1 Compute G
2 Search paths of G
3 Determine top bands
Compute G
Many existing algorithms [1, 5, 3, 4, 7]
Incremental vs. non-incremental
Assume availability of G
Search Paths
Potentially exponential number of paths
Any bi-cluster is a valid starting point... but initiate with upper neighbors of null-element
At each edge add concept to path utilizing previous procedure
Utilize backtracking, mark previously visited edges
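The search can be pictured as a depth-first traversal that abandons a path as soon as its accumulated error bound exceeds ε, which is safe because the bound never decreases along a path. The sketch below is a generic illustration under several assumptions, not the MMBS implementation: `step_error` is a hypothetical callback standing in for the error bound, it must return non-negative values, and a concept is not revisited within one path (a simplification of the edge marking mentioned above).

```python
def search_paths(graph, starts, step_error, eps):
    """Enumerate maximal simple paths whose accumulated error stays <= eps.

    graph: dict mapping each node to an iterable of neighboring nodes.
    step_error(path, nxt): non-negative error added by extending path to nxt.
    """
    results = []

    def extend(path, err):
        extended = False
        for nxt in graph[path[-1]]:
            if nxt in path:
                continue  # assumption: no concept is revisited in a path
            new_err = err + step_error(path, nxt)
            if new_err > eps:
                continue  # prune: error can only grow further along this path
            path.append(nxt)
            extend(path, new_err)
            path.pop()  # backtrack
            extended = True
        if not extended:
            results.append((tuple(path), err))

    for start in starts:
        extend([start], 0)
    return results


# Toy graph with hypothetical edge costs.
toy_graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
toy_cost = {frozenset("ab"): 1, frozenset("bc"): 1, frozenset("ac"): 5}


def toy_step_error(path, nxt):
    return toy_cost[frozenset((path[-1], nxt))]


print(search_paths(toy_graph, ["a"], toy_step_error, eps=2))
# [(('a', 'b', 'c'), 2)]
```

With ε = 2 the direct step from a to c (cost 5) is pruned, while the route through b stays within the threshold.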
Top Bands
Allow user to specify: minRows, minCols, maxOvlp
Quality measure: q(P) = |r(P)| × |c(P)| − w × e(P)
If the overlap of two bands exceeds maxOvlp, select the higher-quality one
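One way to read this selection step is as a greedy filter: rank candidate bands by q(P) and keep a band only if it does not overlap too much with a better one already kept. In the sketch below, r(P) and c(P) are taken to be the row and column sets of a band, and the overlap measure (shared cells divided by the area of the smaller band) is an assumption, since the talk does not define it. The names `Band`, `quality`, `overlap`, and `top_bands` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Band:
    rows: frozenset
    cols: frozenset
    error: int  # e(P): bound on the number of flipped entries


def quality(band, w=1.0):
    """q(P) = |r(P)| * |c(P)| - w * e(P)."""
    return len(band.rows) * len(band.cols) - w * band.error


def overlap(a, b):
    """Assumed measure: shared cells relative to the smaller band's area."""
    shared = len(a.rows & b.rows) * len(a.cols & b.cols)
    smaller = min(len(a.rows) * len(a.cols), len(b.rows) * len(b.cols))
    return shared / smaller if smaller else 0.0


def top_bands(bands, min_rows, min_cols, max_ovlp, w=1.0):
    """Keep bands greedily by quality, skipping ones that overlap too much."""
    candidates = [
        b for b in bands if len(b.rows) >= min_rows and len(b.cols) >= min_cols
    ]
    candidates.sort(key=lambda b: quality(b, w), reverse=True)
    kept = []
    for band in candidates:
        if all(overlap(band, other) <= max_ovlp for other in kept):
            kept.append(band)
    return kept


b1 = Band(frozenset({1, 2, 3}), frozenset("ABC"), error=1)  # q = 8
b2 = Band(frozenset({2, 3, 4}), frozenset("BCD"), error=0)  # q = 9
b3 = Band(frozenset({7, 8, 9}), frozenset("XYZ"), error=2)  # q = 7
print(top_bands([b1, b2, b3], min_rows=3, min_cols=3, max_ovlp=0.1) == [b2, b3])
# True: b1 shares 4 of 9 cells with the better band b2 and is dropped
```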
Analysis and Improvements
Running time: O(|U| × |E| × max{X, Y})
|U|: number of initial concepts
X, Y: largest symmetric difference between neighboring concepts
Speed up by reducing |U|
Perform simple clustering of U based on the maxOvlp parameter
Good experimental results with this speed-up.
Setup
Single band and segmented bands planted in synthetic data
All experiments:
w = 1
maxOvlp = 0.1
minRows = 5
minCols = 5
ε = 99
Results
[Figure: Planted Bands (three matrix plots of the synthetic data)]
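Dataset name | Dataset size | p | Num. planted bands | Algorithm | Quality of top-ranked band | Num. bands mined
SynBand100_001 | 100 × 100 | 0.01 | 1 | MMBS | 3590 | 6
SynBand100_001 | 100 × 100 | 0.01 | 1 | MMBS_Fast | 3406 | 4
SynBand100_001 | 100 × 100 | 0.01 | 1 | MBS_BD | 2507 | 1
SynBand100_001 | 100 × 100 | 0.01 | 1 | MBS_SD | 438 | 1
SynBand100_005 | 100 × 100 | 0.05 | 1 | MMBS | 2278 | 9
SynBand100_005 | 100 × 100 | 0.05 | 1 | MMBS_Fast | 1503 | 8
SynBand100_005 | 100 × 100 | 0.05 | 1 | MBS_BD | 1050 | 1
SynBand100_005 | 100 × 100 | 0.05 | 1 | MBS_SD | 1201 | 1
SynBand500_001 | 500 × 500 | 0.01 | 1 | MMBS | 8918 | 7
SynBand500_001 | 500 × 500 | 0.01 | 1 | MMBS_Fast | 8261 | 6
SynBand500_001 | 500 × 500 | 0.01 | 1 | MBS_BD | 2822 | 1
SynBand500_001 | 500 × 500 | 0.01 | 1 | MBS_SD | 2145 | 1
SynMultiBand100_001 | 100 × 100 | 0.01 | 2 | MMBS | 3367 | 2
SynMultiBand100_001 | 100 × 100 | 0.01 | 2 | MMBS_Fast | 3367 | 2
SynMultiBand100_001 | 100 × 100 | 0.01 | 2 | MBS | 4101 | 1
SynMultiBand100_001 | 100 × 100 | 0.01 | 2 | MBS_SD | 4045 | 1
SynMultiBand100_001 | 100 × 100 | 0.05 | 2 | MMBS | 4054 | 2
SynMultiBand100_001 | 100 × 100 | 0.05 | 2 | MMBS_Fast | 3933 | 2
SynMultiBand100_001 | 100 × 100 | 0.05 | 2 | MBS_BD | 3910 | 1
SynMultiBand100_001 | 100 × 100 | 0.05 | 2 | MBS_SD | 3736 | 1
SynMultiBand500_001 | 500 × 500 | 0.01 | 2 | MMBS | 28242 | 8
SynMultiBand500_001 | 500 × 500 | 0.01 | 2 | MMBS_Fast | 21346 | 5
SynMultiBand500_001 | 500 × 500 | 0.01 | 2 | MBS_BD | 17498 | 1
SynMultiBand500_001 | 500 × 500 | 0.01 | 2 | MBS_SD | 430 | 1
SynRandom100_005 | 100 × 100 | 0.05 | unknown | MMBS | 3311 | 17
SynRandom100_005 | 100 × 100 | 0.05 | unknown | MMBS_Fast | 3220 | 14
SynRandom100_005 | 100 × 100 | 0.05 | unknown | MBS_BD | 2801 | 1
SynRandom100_005 | 100 × 100 | 0.05 | unknown | MBS_SD | 1949 | 1
SynRandom500_001 | 500 × 500 | 0.01 | unknown | MMBS | 18635 | 73
SynRandom500_001 | 500 × 500 | 0.01 | unknown | MMBS_Fast | 16163 | 64
SynRandom500_001 | 500 × 500 | 0.01 | unknown | MBS_BD | 16771 | 1
SynRandom500_001 | 500 × 500 | 0.01 | unknown | MBS_SD | 5229 | 1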
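Dataset | Size | Sparsity | Algorithm | Quality of top-ranked band | Num. bands mined
Genes_Phenotypes | 1910 × 3965 | 0.008 | MMBS | 6665 | 56
Genes_Phenotypes | 1910 × 3965 | 0.008 | MMBS_Fast | 6665 | 43
Genes_Phenotypes | 1910 × 3965 | 0.008 | MBS_BD | 5204 | 1
Genes_Phenotypes | 1910 × 3965 | 0.008 | MBS_SD | 3578 | 1
Genes_Drugs | 1608 × 49 | 0.042 | MMBS | 6423 | 18
Genes_Drugs | 1608 × 49 | 0.042 | MMBS_Fast | 6423 | 13
Genes_Drugs | 1608 × 49 | 0.042 | MBS_BD | 5346 | 1
Genes_Drugs | 1608 × 49 | 0.042 | MBS_SD | 3047 | 1
NewsGroups_Mideast_Religion | 2000 × 890 | 0.003 | MMBS | 72906 | 42
NewsGroups_Mideast_Religion | 2000 × 890 | 0.003 | MMBS_Fast | 61410 | 31
NewsGroups_Mideast_Religion | 2000 × 890 | 0.003 | MBS_BD | 59781 | 1
NewsGroups_Mideast_Religion | 2000 × 890 | 0.003 | MBS_SD | 58713 | 1
NewsGroups_AllPC | 5000 × 2805 | 0.0001 | MMBS | 93368 | 5
NewsGroups_AllPC | 5000 × 2805 | 0.0001 | MMBS_Fast | 93368 | 5
NewsGroups_AllPC | 5000 × 2805 | 0.0001 | MBS_BD | 89106 | 1
NewsGroups_AllPC | 5000 × 2805 | 0.0001 | MBS_SD | 74125 | 1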
[Figure: Genes_Phenotypes band. Row labels 1 to 10: early eyelid opening; eyelids open at birth; abnormal timing of postnatal eyelid opening; abnormal eyelid morphology; abnormal eye morphology; abnormal homeostasis; abnormal ear physiology; abnormal hearing physiology; abnormal brainstem auditory evoked potential; deafness]
[Figure: Genes_Drugs band]
[Figure: MideastReligion_SubjectLines band]
[Figure: AllPC_SubjectLines band]
Performance
[Figure: four log-scale plots of CPU time (seconds) versus epsilon (0 to 100) for MMBS_fast, MMBS, and MBS]
Conclusion
Explored connection between bi-clustering and banded structures in matrices
Banded sub-matrices correspond to paths in the bi-cluster lattice
MMBS algorithm is based on this correspondence and ability to bound error
Future work: More efficient search methodologies, stronger bounds on error
Future work: Quantitative measures of bandedness, different types of bands desirable in different applications
References
[1] B. Ganter and R. Wille. Formal Concept Analysis: Mathematical Foundations. Springer-Verlag, Berlin, 1999.
[2] G. C. Garriga, E. Junttila, and H. Mannila. Banded structure in binary matrices. In KDD ’08: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 292–300, New York, NY, USA, 2008. ACM.
[3] H. Bian and R. Bhatnagar. An algorithm for lattice-structured subspace clustering. Proceedings of the SIAM International Conference on Data Mining, 2005.
[4] S. O. Kuznetsov and S. A. Obiedkov. Algorithms for the construction of concept lattices and their diagram graphs. In PKDD ’01: Proceedings of the 5th European Conference on Principles of Data Mining and Knowledge Discovery, pages 289–300, London, UK, 2001. Springer-Verlag.
[5] C. Lindig. Fast concept analysis. 8th International Conference on Conceptual Structures, 2000.
[6] H. Mannila and E. Terzi. Nestedness and segmented nestedness. In KDD ’07: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 480–489, New York, NY, USA, 2007. ACM.
[7] M. J. Zaki and C.-J. Hsiao. Efficient algorithms for mining closed itemsets and their lattice structure. IEEE Transactions on Knowledge and Data Engineering, 17(4), 2005.