Post on 12-Jan-2020
Maximum and Maximal Cliques in Multipartite Graphs
Charles Phillips
Department of Electrical Engineering and Computer Science
University of Tennessee
3/11/2015
Cliques in Multipartite Graphs
Cliques
Maximum Clique
Maximal Clique Enumeration
Bipartite and multipartite graphs
Maximal Biclique Enumeration
Maximum k-partite cliques
Maximal k-partite clique enumeration
Applications
Molecular Biology, Telecommunications, Natural Language Processing, Social Network Analysis, Transportation, Operations Research, Chemistry, Textile Manufacturing, Drug Discovery, Phylogeny, Ad Hoc Networking, Computational Biology, Fault Diagnosis, Computer Vision
History
Richard Karp
R. Duncan Luce
Pál Turán
Leo Moser
René Peeters
Etsuji Tomita
Definitions
NP-hard, NP-complete
– Any exact algorithm must take exponential time in the worst case (as far as we know).
– “In complexity theory we make the following sweeping generalizations: If an algorithm runs in polynomial time, it is fast; otherwise it is slow. Fast is good; slow is bad. A problem that we can solve by a fast algorithm is easy; a problem that we can't is hard.” (Tovey 2002)
Definitions
Clique
– A set of vertices with all possible edges.
– A complete subgraph.
Maximum Clique
– The largest clique in a graph
Maximal Clique
– A clique to which no vertex can be added to form a larger clique.
– A clique that is not a proper subset of another clique.
[Figure: example graph on the vertices a through h.]
Maximum clique: { a, b, d, e }
Maximal cliques: { a, b, d, e }, { c, f, g }, { c, g, h }, { e, f }
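The definitions can be checked by brute force on this example. The sketch below assumes the edge list implied by the maximal cliques listed above (every edge of a graph lies in some maximal clique), since the figure itself is not reproduced here. The helper names are illustrative only.

```python
from itertools import combinations

# Edge list reconstructed from the listed maximal cliques (an assumption:
# the original figure is unavailable, but each edge lies in a maximal clique).
EDGES = ["ab", "ad", "ae", "bd", "be", "de", "cf", "cg", "fg", "ch", "gh", "ef"]
VERTICES = sorted({v for e in EDGES for v in e})
ADJ = {v: set() for v in VERTICES}
for u, v in EDGES:
    ADJ[u].add(v)
    ADJ[v].add(u)

def is_clique(vertices):
    """True if every pair of the given vertices is joined by an edge."""
    return all(v in ADJ[u] for u, v in combinations(vertices, 2))

def is_maximal_clique(vertices):
    """True if the set is a clique and no outside vertex is adjacent to all of it."""
    s = set(vertices)
    if not is_clique(s):
        return False
    return not any(s <= ADJ[w] for w in VERTICES if w not in s)

# Brute force over all non-empty vertex subsets (exponential; fine for 8 vertices).
maximal = [set(c) for k in range(1, len(VERTICES) + 1)
           for c in combinations(VERTICES, k) if is_maximal_clique(c)]
maximum = max(maximal, key=len)
```

Every maximum clique is maximal, but not the reverse: here `maximal` holds four cliques and only one of them has the maximum size.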
Background
Maximum clique cannot be approximated to within a factor substantially better than linear.
No polynomial-time algorithm can approximate maximum clique within a factor of $O(n^{1-\varepsilon})$, for any ε > 0, unless P = NP.
Best known approximation algorithm: $O(n (\log\log n)^2 / \log^3 n)$ (Feige 2004)
Definitions
Bipartite Graph
– A graph whose vertices can be divided into two non-empty disjoint sets U and V such that every edge connects a vertex in U with a vertex in V.
Tripartite Graph
Multipartite Graph
– A k-partite graph, k ≥ 2
Definitions
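Partite set – One of the k disjoint vertex sets of a k-partite graph
Interpartite edge – An edge between vertices in different partite sets
Intrapartite edge – An edge between vertices in the same partite set
Complete k-partite graph – A k-partite graph with all possible interpartite edges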
Background
Bipartite Graph
How many ways can a connected bipartite graph be partitioned into two partite sets?
Only one!
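Choose an arbitrary vertex v and place it in U. All neighbors of v must then be in V, all neighbors of those neighbors must be in U, and so on. Because the graph is connected, this process reaches every vertex, so the partition is forced (up to swapping the names U and V).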
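The argument above is a breadth-first traversal that alternates sides. A minimal sketch, assuming the graph is given as a dictionary from each vertex to its neighbor set (the function name and the example path are hypothetical):

```python
from collections import deque

def bipartition(adj, start):
    """Return (U, V) for a connected bipartite graph, or None if it is not bipartite.

    adj maps each vertex to the set of its neighbors.
    """
    side = {start: 0}                    # 0 means U, 1 means V
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in side:
                side[w] = 1 - side[v]    # neighbors go to the opposite set
                queue.append(w)
            elif side[w] == side[v]:
                return None              # edge inside one set: not bipartite
    U = {v for v, s in side.items() if s == 0}
    V = {v for v, s in side.items() if s == 1}
    return U, V

# Hypothetical example: the path 1 - a - 2 - b - 3.
path = {1: {"a"}, "a": {1, 2}, 2: {"a", "b"}, "b": {2, 3}, 3: {"b"}}
U, V = bipartition(path, 1)
```

Starting from a different vertex can only swap the roles of U and V; the two sets themselves do not change.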
Definitions
k-clique
– A clique with k vertices.
Biclique
– A complete bipartite graph
– $K_{m,n}$
– All possible interpartite edges
Triclique
– A complete tripartite graph
– All possible interpartite edges
k-partite clique
– A complete k-partite graph
– All possible interpartite edges
Definitions
Vertex and Edge Maximum
Bipartite Graph
– Vertex-Maximum Biclique (8 vertices, 7 edges): Polynomial Time
– Edge-Maximum Biclique (6 vertices, 9 edges): NP-hard
Tripartite Graph
– Vertex-Maximum Triclique (7 vertices, 10 edges): NP-hard
– Edge-Maximum Triclique (6 vertices, 12 edges): NP-hard
Turán's Theorem
We cannot add an edge to $K_{3,2,2}$ without creating a $K_4$.
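$K_{3,2,2}$ is the Turán graph T(7,3): 7 vertices split into 3 partite sets whose sizes differ by at most one, with all possible interpartite edges. Any added edge would join two vertices of the same partite set, and those two vertices plus one vertex from each of the other two sets would form a $K_4$. The sketch below computes the part sizes and the edge count; the function names are illustrative.

```python
def turan_parts(n, r):
    """Part sizes of the Turán graph T(n, r): n vertices split into r parts
    whose sizes differ by at most one."""
    q, rem = divmod(n, r)
    return [q + 1] * rem + [q] * (r - rem)

def complete_multipartite_edges(parts):
    """Number of edges in the complete multipartite graph with these part sizes."""
    n = sum(parts)
    # All ordered pairs of vertices, minus ordered pairs inside one part
    # (which includes the n pairs (v, v)), halved to count each edge once.
    return (n * n - sum(p * p for p in parts)) // 2

parts = turan_parts(7, 3)                   # [3, 2, 2]
edges = complete_multipartite_edges(parts)  # 3*2 + 3*2 + 2*2 = 16
```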
Maximal Clique Enumeration
List all maximal cliques
First methods (1957) used induction on 3-cliques
Methods were developed (1964-1970) using the vertex sequence method (aka point removal method)
– Produces maximal cliques of G from maximal cliques of G \ {v}, v ∈ V
– Must maintain all cliques in memory
Bron-Kerbosch Algorithm
Algorithm 1: Bron-Kerbosch with pivot.
Input: a graph G = (V, E)
R := Ø, P := V, X := Ø
BronKerbosch(R, P, X)
    if P and X are both empty
        report R as a maximal clique and return
    choose a pivot vertex u in P ∪ X
    for each vertex v in P \ N(u)
        BronKerbosch(R ∪ {v}, P ∩ N(v), X ∩ N(v))
        P := P \ {v}
        X := X ∪ {v}
R: The current clique
P: Vertices that can extend the current clique
X: Vertices that have already been used to extend the current clique
Pseudocode adapted from http://en.wikipedia.org/wiki/Bron%E2%80%93Kerbosch_algorithm
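A minimal runnable sketch of the pseudocode, assuming the graph is stored as a dictionary from each vertex to its neighbor set. The pivot rule used here (the vertex of P ∪ X with the most neighbors in P) is one common choice and is an assumption, since the pseudocode allows any pivot. The example edge list is the one implied by the maximal cliques of the earlier eight-vertex example.

```python
def bron_kerbosch(adj):
    """Return all maximal cliques of a graph given as {vertex: set of neighbors}."""
    cliques = []

    def expand(R, P, X):
        if not P and not X:
            cliques.append(R)          # R cannot be extended: report it
            return
        # Pivot with the most neighbors in P (a common heuristic, not required).
        u = max(P | X, key=lambda w: len(P & adj[w]))
        for v in list(P - adj[u]):
            expand(R | {v}, P & adj[v], X & adj[v])
            P = P - {v}                # v is finished: move it from P to X
            X = X | {v}

    expand(set(), set(adj), set())
    return cliques

# Edges implied by the maximal cliques of the earlier example (an assumption).
EDGES = ["ab", "ad", "ae", "bd", "be", "de", "cf", "cg", "fg", "ch", "gh", "ef"]
adj = {}
for u, v in EDGES:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)
cliques = bron_kerbosch(adj)
```

Unlike the vertex sequence method, only the current R, P and X need to be held in memory along each branch of the recursion.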
Maximal Biclique Enumeration
List all maximal bicliques
All possible subsets (Koch)
MBEA
Maximal Triclique Enumeration
Will maximal bicliques help us obtain maximal tricliques?
Does a maximal triclique contain a maximal biclique as a subgraph?
What about maximal k-partite clique enumeration?
[Figure: a tripartite graph with partite sets A, B and C.]
[Figure: the red vertices are a maximal biclique in partite sets A and B.]
[Figure: the red vertices are a maximal triclique in the graph, but the red vertices in A and B are not a maximal biclique.]
Transaction Data
Also known as market basket data.
Each transaction is a set of items.
{Bread, Milk, Cheese}
{Milk, Cheese, Eggs, Sugar}
{Bread, Milk, Eggs}
{Cheese, Eggs, Sugar, Flour}
Itemset – A set of items.
Support – The number of transactions in which an itemset occurs.
Frequent Itemset – An itemset whose support is at or above some specified threshold.
Closed Itemset – An itemset that has no proper superset with the same support.
Maximal Itemset – A frequent itemset that has no proper superset that is frequent.
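These definitions can be worked through on the four example transactions. The sketch below is a brute-force illustration, not an efficient miner; the support threshold of 2 and the function names are assumptions made for the example.

```python
from itertools import combinations

TRANSACTIONS = [
    {"Bread", "Milk", "Cheese"},
    {"Milk", "Cheese", "Eggs", "Sugar"},
    {"Bread", "Milk", "Eggs"},
    {"Cheese", "Eggs", "Sugar", "Flour"},
]
ITEMS = sorted(set().union(*TRANSACTIONS))

def support(itemset):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in TRANSACTIONS if itemset <= t)

def frequent_itemsets(min_support):
    """All non-empty itemsets with support >= min_support (brute force)."""
    return [frozenset(c) for k in range(1, len(ITEMS) + 1)
            for c in combinations(ITEMS, k)
            if support(frozenset(c)) >= min_support]

def closed_itemsets(frequent):
    """Frequent itemsets with no proper superset of equal support."""
    return [s for s in frequent
            if not any(s < t and support(t) == support(s) for t in frequent)]

def maximal_itemsets(frequent):
    """Frequent itemsets with no frequent proper superset."""
    return [s for s in frequent if not any(s < t for t in frequent)]

frequent = frequent_itemsets(2)        # threshold 2 is an assumed example value
closed = closed_itemsets(frequent)
maximal = maximal_itemsets(frequent)
```

With this threshold, {Cheese, Eggs} is frequent but not closed, because {Cheese, Eggs, Sugar} occurs in the same two transactions. Every maximal itemset is closed, so `maximal` is a subset of `closed`.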
Problem Equivalence
Transaction  Items
1            ABDEF
2            ABF
3            BCDE
4            ABCE
5            CDE
6            CDEF
7            AEF
[Figure: bipartite graph with transaction vertices 1 through 7 and item vertices A through F.]
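One common reading of this equivalence, assumed here because the slide shows only the table and the graph, is that transactions and items form the two partite sets, a transaction is adjacent to each item it contains, and a closed itemset together with the transactions that contain it is a maximal biclique. The brute-force sketch below enumerates those pairs for the table above; the function names are illustrative.

```python
from itertools import combinations

TRANSACTIONS = {1: "ABDEF", 2: "ABF", 3: "BCDE", 4: "ABCE",
                5: "CDE", 6: "CDEF", 7: "AEF"}

def common_items(tids):
    """Items shared by all the given transactions."""
    return frozenset(set.intersection(*(set(TRANSACTIONS[t]) for t in tids)))

def containing(items):
    """Transactions that contain every one of the given items."""
    return frozenset(t for t, row in TRANSACTIONS.items() if items <= set(row))

def maximal_bicliques():
    """Brute force: take the shared items of every non-empty transaction set,
    then widen the transaction side to everything containing those items."""
    found = set()
    for k in range(1, len(TRANSACTIONS) + 1):
        for tids in combinations(TRANSACTIONS, k):
            items = common_items(tids)
            if items:
                found.add((containing(items), items))
    return found

bicliques = maximal_bicliques()
```

For example, the itemset CDE is contained in exactly transactions 3, 5 and 6, and those three transactions share exactly C, D and E, so neither side can be extended. The size of the transaction side is the support of the itemset.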
Results
[Figure: wallclock run times in seconds versus number of vertices in the smaller vertex set, for MBEA and LCM.]
Results
[Figure: log(wallclock run time in seconds) versus p-value threshold, for MBEA, LCM and MICA.]
Results
What if we add all possible intrapartite edges, then run Bron-Kerbosch?
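The idea can be sketched for a bipartite graph as follows. Once every intrapartite edge is added, a vertex set is a clique exactly when all interpartite edges between its two sides are present, so the maximal cliques that touch both partite sets are the maximal bicliques. A maximal clique lying entirely inside one partite set can also appear and is filtered out. The example graph and all names below are hypothetical.

```python
def maximal_cliques(adj):
    """Bron-Kerbosch with pivoting on a graph given as {vertex: set of neighbors}."""
    out = []

    def expand(R, P, X):
        if not P and not X:
            out.append(R)
            return
        u = max(P | X, key=lambda w: len(P & adj[w]))
        for v in list(P - adj[u]):
            expand(R | {v}, P & adj[v], X & adj[v])
            P, X = P - {v}, X | {v}

    expand(set(), set(adj), set())
    return out

def bicliques_via_cliques(U, V, edges):
    """Add every intrapartite edge, enumerate maximal cliques, and keep the
    cliques that use both partite sets."""
    adj = {v: set() for v in U | V}
    for u, v in edges:                   # original interpartite edges
        adj[u].add(v)
        adj[v].add(u)
    for part in (U, V):                  # added intrapartite edges
        for v in part:
            adj[v] |= part - {v}
    return [(c & U, c & V) for c in maximal_cliques(adj) if c & U and c & V]

# Hypothetical example graph.
U = {"u1", "u2", "u3"}
V = {"v1", "v2", "v3"}
edges = [("u1", "v1"), ("u1", "v2"), ("u2", "v1"),
         ("u2", "v2"), ("u3", "v2"), ("u3", "v3")]
result = bicliques_via_cliques(U, V, edges)
```

In this example the set V alone is a maximal clique of the augmented graph, because no vertex of U is adjacent to all of V; the filter on `c & U and c & V` discards it.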
Results
[Figure: four panels of runtime (sec, log scale) versus density for MBEA and BK-K, one panel each for partite set ratios 1:1, 2:1, 3:1 and 4:1.]
Results
[Figure: density above which BK-K beats MBEA, plotted against partite set size ratio (1 through 10).]
Results
Preprocessing
– Interpartite rule: remove all edges between vertices in different partite sets that are not part of a 3-clique
– Intrapartite rule: remove all edges between vertices in the same partite set that are not part of a 3-clique
– We added all possible intrapartite edges so we could use Bron-Kerbosch
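Both rules apply the same test, whether an edge lies in a 3-clique, to different kinds of edges. A minimal sketch, assuming the graph is a dictionary of neighbor sets and that a second dictionary gives each vertex its partite set; the function name, the `rule` parameter and the example graph are hypothetical.

```python
def prune_non_triangle_edges(adj, part, rule):
    """Delete edges that lie in no 3-clique, testing only one kind of edge.

    adj  : {vertex: set of neighbors}, modified in place
    part : {vertex: label of its partite set}
    rule : "inter" tests edges between different partite sets,
           "intra" tests edges inside one partite set
    """
    for u in list(adj):
        for v in list(adj[u]):
            is_intra = part[u] == part[v]
            if is_intra != (rule == "intra"):
                continue                   # not the kind of edge this rule tests
            if not (adj[u] & adj[v]):      # no common neighbor, so no triangle
                adj[u].discard(v)
                adj[v].discard(u)
    return adj

# Hypothetical 3-partite example: one triangle plus two edges outside any triangle.
part = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}
edges = [("a1", "b1"), ("a1", "c1"), ("b1", "c1"), ("a2", "b1"), ("a2", "b2")]
adj = {v: set() for v in part}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)
pruned = prune_non_triangle_edges(adj, part, "inter")
```

A single pass is enough: an edge that is deleted lies in no triangle, so deleting it cannot remove a triangle that some other edge depends on.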
Results
The speedup achieved by interpartite and intrapartite preprocessing on random 3-partite graphs with 2000 vertices in each partite set and varying density. Interpartite preprocessing is very effective on low-density graphs, and the lower the density, the more effective it is. It gradually becomes ineffective as density increases, although at no point does its overhead add substantially to the runtime. Intrapartite preprocessing, by contrast, is never effective: it results in much longer runtimes at low density, and although its overhead eventually becomes insubstantial as density increases, at no point does it provide any benefit.
Results
MBEA does better than BK-P on sparser graphs
MBEA does better than BK-P on unbalanced graphs
BK-P does better as the size gets larger
Problems
If k complete graphs, each having exactly k vertices, have the property that every pair of complete graphs has at most one shared vertex, then the union of the graphs can be colored with k colors.
Alternate Formulation
– k committees
– Each committee has k members
– The committees all use the same room, which has k chairs
– At most one person belongs to the intersection of any two committees.
Is it possible to assign the committee members to chairs in such a way that each member sits in the same chair for all the different committees to which he or she belongs?
In this model of the problem, the people correspond to graph vertices, committees correspond to complete graphs, and chairs correspond to vertex colors
Erdős–Faber–Lovász conjecture
Adapted from http://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93Faber%E2%80%93Lov%C3%A1sz_conjecture
Future Directions
Modelling large heterogeneous data
Maximum k-partite clique enumeration
– Vertex Maximum
– Edge Maximum
References
I. Bomze, M. Budinich, P. Pardalos, M. Pelillo, “The Maximum Clique Problem,” in D.-Z. Du and P. M. Pardalos (eds.), Handbook of Combinatorial Optimization, vol. 4. Kluwer Academic Publishers, 1999.
D. Eppstein, “Arboricity and bipartite subgraph listing algorithms,” Inf. Process. Lett., vol. 51, no. 4, pp. 207–211, 1994.
U. Feige, “Approximating maximum clique by removing subgraphs,” SIAM Journal on Discrete Mathematics 18 (2): 219–225, 2004.
M. R. Garey and D. S. Johnson, Computers and Intractability. New York: W. H. Freeman, 1979.
R. D. Luce, R. Perry, and D. Albert, “A method of matrix analysis of group structure,” Psychometrika 14 (2): 95–116, 1949.
K. Makino and T. Uno, “New algorithms for enumerating all maximal cliques,” in Proceedings, 9th Scandinavian Workshop on Algorithm Theory (SWAT), pp. 260–272, 2004.
R. E. Miller and D. E. Muller, “A problem of maximum consistent subsets,” IBM Research Report RC-240, J. T. Watson Research Center, Yorktown Heights, NY, 1960.
R. Peeters, “The maximum edge biclique problem is NP-complete,” Discrete Appl. Math., vol. 131, no. 3, pp. 651–654, 2003.
E. Tomita, A. Tanaka, H. Takahashi, “The worst-case time complexity for generating all maximal cliques and computational experiments,” Theoretical Computer Science 363 (1): 28–42, 2006.
C. A. Tovey, “Tutorial on computational complexity,” Interfaces 32 (3): 30–61, 2002.
T. Uno, M. Kiyomi, and H. Arimura, “LCM ver.2: Efficient mining algorithms for frequent/closed/maximal itemsets,” in Proceedings, FIMI’04: Workshop on Frequent Itemset Mining Implementations, Brighton UK, November 2004.
M. J. Zaki and M. Ogihara, “Theoretical foundations of association rules,” in Proceedings, 3rd SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, 1998.
Y. Zhang, C. A. Phillips, G. L. Rogers, E. J. Baker, E. J. Chesler, M. A. Langston, “On finding bicliques in bipartite graphs: a novel algorithm and its application to the integration of diverse biological data types,” BMC Bioinformatics 15, 110, 2014.
Homework
1. List all maximal cliques in the following graph. Identify which cliques in your list are maximum, if any.
[Figure: graph on the vertices a through j.]
2. What is the smallest bipartite graph in which a vertex-maximum biclique is not an edge-maximum biclique?
3. We saw that the Turán graph T(7,3) is $K_{3,2,2}$.
a. What is T(13,4)?
b. What is T(17,5)?
Questions