Part I: Introductory Materials Introduction to Graph Theory Dr. Nagiza F. Samatova Department of Computer Science North Carolina State University and Computer Science and Mathematics Division Oak Ridge National Laboratory



Graphs

[Figure: graph with 7 nodes and 16 edges, with the nodes/vertices and edges labeled; an undirected and a directed example]

$G = (V, E)$
$V = \{v_1, v_2, \ldots, v_n\}$
$E = \{e_k = (v_i, v_j) \mid v_i, v_j \in V,\ k = 1, \ldots, m\}$

Undirected: $(v_i, v_j) = (v_j, v_i)$
Directed: $(v_i, v_j) \neq (v_j, v_i)$
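The definition above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the vertex names and the helper functions `has_edge_directed` and `has_edge_undirected` are hypothetical. Directed edges are modeled as ordered pairs and undirected edges as unordered pairs.

```python
# Hypothetical example graph; the vertex names are illustrative only.
V = {"v1", "v2", "v3"}

# Directed edges are ordered pairs: ("v1", "v2") and ("v2", "v1") differ.
E_directed = {("v1", "v2"), ("v2", "v3")}

# Undirected edges are unordered pairs, modeled here as frozensets.
E_undirected = {frozenset(e) for e in E_directed}


def has_edge_directed(u, v):
    """True if the directed edge u -> v is present."""
    return (u, v) in E_directed


def has_edge_undirected(u, v):
    """True if the undirected edge {u, v} is present (order is irrelevant)."""
    return frozenset((u, v)) in E_undirected
```

With this encoding, reversing the endpoints changes the answer only in the directed case.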


Types of Graphs

• Undirected vs. Directed
• Attributed/Labeled (e.g., vertex, edge) vs. Unlabeled
• Weighted vs. Unweighted
• General vs. Bipartite (Multipartite)
• Trees (no cycles)
• Hypergraphs
• Simple vs. w/ loops vs. w/ multi-edges


Labeled Graphs and Induced Subgraphs

[Figure: labeled graph w/ loops; the subgraph induced by vertices b, c and d is shown in bold]
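An induced subgraph keeps the chosen vertices together with every edge of the original graph whose endpoints are both chosen. The sketch below illustrates this; the edge list is a made-up example (the graph on the slide is not recoverable from the transcript), and `induced_subgraph` is a hypothetical helper name.

```python
def induced_subgraph(edges, subset):
    """Return the edges whose endpoints all lie in `subset`."""
    subset = set(subset)
    return {e for e in edges if set(e) <= subset}


# Hypothetical labeled graph with one loop at vertex "a".
edges = {("a", "b"), ("b", "c"), ("c", "d"), ("b", "d"), ("a", "a")}

# Subgraph induced by vertices b, c and d.
sub = induced_subgraph(edges, {"b", "c", "d"})
```

Edges touching `a`, including the loop, are dropped because `a` is not in the chosen vertex set.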

Graph Isomorphism

Which graphs are isomorphic?

[Figure: candidate graphs (A), (B), (C); answer shown on the slide: C]
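Two graphs are isomorphic if some one-to-one relabeling of the vertices of one graph turns its edge set into the edge set of the other. A brute-force sketch of that definition follows, assuming simple undirected graphs with sortable vertex names; `are_isomorphic` is a hypothetical name, and trying all $n!$ relabelings is only practical for very small graphs.

```python
from itertools import permutations


def are_isomorphic(vertices1, edges1, vertices2, edges2):
    """Brute-force isomorphism test for small undirected graphs."""
    v1, v2 = sorted(vertices1), sorted(vertices2)
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    if len(v1) != len(v2) or len(e1) != len(e2):
        return False
    # Try every bijection from the vertices of graph 1 to those of graph 2.
    for perm in permutations(v2):
        mapping = dict(zip(v1, perm))
        mapped = {frozenset(mapping[u] for u in e) for e in e1}
        if mapped == e2:
            return True
    return False
```

For example, a star and a path on four vertices have the same number of vertices and edges but are not isomorphic, which the exhaustive search confirms.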

Graph Automorphism

Which graphs are automorphic?

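An automorphism is an isomorphism from a graph onto itself; in a labeled graph it must also preserve the labels.

[Figure: candidate graphs (A), (B), (C); answer shown on the slide: B]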


Vertex degree, in-degree, out-degree

[Figure: a directed edge from its tail t to its head h]

In-degree of a vertex is the number of incoming edges.
Out-degree of a vertex is the number of outgoing edges.

Degree of a vertex is the total number of edges incident to it (in-degree plus out-degree).
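These three quantities can be computed in one pass over the edge list. The sketch below assumes directed edges stored as `(tail, head)` pairs; the function name `degrees` and the sample edges are illustrative, not from the slides.

```python
from collections import Counter


def degrees(edges):
    """Return (in_degree, out_degree, degree) counters for a directed graph."""
    out_deg = Counter(tail for tail, head in edges)
    in_deg = Counter(head for tail, head in edges)
    # Total degree is in-degree plus out-degree.
    total = in_deg + out_deg
    return in_deg, out_deg, total


# Hypothetical directed graph: a -> b, a -> c, c -> b.
edges = [("a", "b"), ("a", "c"), ("c", "b")]
in_deg, out_deg, total = degrees(edges)
```

A `Counter` returns 0 for vertices with no incoming (or outgoing) edges, so `in_deg["a"]` is 0 here.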


Graph Representation and Formats

• Adjacency Matrix (vertex vs. vertex)
• Incidence Matrix (vertex vs. edge)
• Sparse vs. Dense Matrices
• DIMACS file format
• In R: igraph object
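As a rough illustration of two of these formats, the sketch below reads a graph in the DIMACS edge format (comment lines start with `c`, the problem line is `p edge <n> <m>`, and each edge line is `e <u> <v>` with 1-based vertex numbers) and converts it to a dense adjacency matrix. The function names and the sample text are assumptions made for this example; the slides themselves use R's igraph.

```python
def parse_dimacs(text):
    """Parse DIMACS edge-format text into (vertex count, set of edges)."""
    n = 0
    edges = set()
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0] == "c":   # blank or comment line
            continue
        if parts[0] == "p":                # problem line: p edge <n> <m>
            n = int(parts[2])
        elif parts[0] == "e":              # edge line: e <u> <v>
            u, v = int(parts[1]), int(parts[2])
            edges.add((min(u, v), max(u, v)))
    return n, edges


def to_adjacency_matrix(n, edges):
    """Dense, symmetric 0/1 matrix for an undirected graph."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u - 1][v - 1] = 1
        A[v - 1][u - 1] = 1
    return A


sample = """c tiny example
p edge 3 2
e 1 2
e 2 3
"""
n, edges = parse_dimacs(sample)
A = to_adjacency_matrix(n, edges)
```

The edge set is the sparse representation; the matrix is the dense one and needs $n^2$ entries regardless of how many edges exist.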


Adjacency Matrix Representation

[Figure: a graph on vertices A(1)–A(4) and B(5)–B(8), drawn under two different vertex orderings]

Adjacency matrix, first ordering:

       A(1) A(2) A(3) A(4) B(5) B(6) B(7) B(8)
A(1)    1    1    1    0    1    0    0    0
A(2)    1    1    0    1    0    1    0    0
A(3)    1    0    1    1    0    0    1    0
A(4)    0    1    1    1    0    0    0    1
B(5)    1    0    0    0    1    1    1    0
B(6)    0    1    0    0    1    1    0    1
B(7)    0    0    1    0    1    0    1    1
B(8)    0    0    0    1    0    1    1    1

Adjacency matrix, second ordering:

       A(1) A(2) A(3) A(4) B(5) B(6) B(7) B(8)
A(1)    1    1    0    1    0    1    0    0
A(2)    1    1    1    0    0    0    1    0
A(3)    0    1    1    1    1    0    0    0
A(4)    1    0    1    1    0    0    0    1
B(5)    0    0    1    0    1    0    1    1
B(6)    1    0    0    0    0    1    1    1
B(7)    0    1    0    0    1    1    1    0
B(8)    0    0    0    1    1    1    0    1

Representation is NOT unique. Algorithms can be order-sensitive.

Source: "Introduction to Data Mining" by Tan, Steinbach, and Kumar
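The non-uniqueness is easy to reproduce on a tiny graph. In the sketch below (a made-up three-vertex path, not the graph from the slide), the same edge set yields two different adjacency matrices depending only on the order in which the vertices are listed; `adjacency_matrix` is a hypothetical helper.

```python
def adjacency_matrix(order, edges):
    """Adjacency matrix of an undirected graph for a given vertex ordering."""
    index = {v: i for i, v in enumerate(order)}
    A = [[0] * len(order) for _ in order]
    for u, v in edges:
        A[index[u]][index[v]] = 1
        A[index[v]][index[u]] = 1
    return A


# Hypothetical path graph a - b - c.
edges = [("a", "b"), ("b", "c")]
A1 = adjacency_matrix(["a", "b", "c"], edges)
A2 = adjacency_matrix(["b", "a", "c"], edges)
```

An algorithm that scans rows in order will visit the vertices differently for `A1` and `A2`, which is one way order sensitivity arises.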

Families of Graphs

• Cliques
• Path and simple path
• Cycle
• Tree
• Connected graphs

Read the book chapter for definitions and examples.
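As a small companion to the book's definitions, the sketch below tests two of these properties for a simple undirected graph: connectivity via breadth-first search, and treeness via the standard characterization that a tree is a connected graph with exactly $n - 1$ edges. The function names are illustrative, and the edge list is assumed to contain no loops or duplicate edges.

```python
from collections import deque


def is_connected(vertices, edges):
    """True if every vertex is reachable from an arbitrary start vertex."""
    vertices = set(vertices)
    if not vertices:
        return True
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    start = next(iter(vertices))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u] - seen:
            seen.add(w)
            queue.append(w)
    return seen == vertices


def is_tree(vertices, edges):
    """A simple graph is a tree iff it is connected with n - 1 edges."""
    return is_connected(vertices, edges) and len(edges) == len(set(vertices)) - 1
```

A cycle on three vertices is connected but has three edges, so it fails the tree test; a path on three vertices passes both.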


Complete Graph, or Clique

Each pair of vertices is connected.

[Figure: a clique]
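The definition translates directly into a check over all vertex pairs. This is a minimal sketch for undirected graphs; `is_clique` is a hypothetical helper name and the sample edges are made up.

```python
from itertools import combinations


def is_clique(nodes, edges):
    """True if every pair of distinct vertices in `nodes` is joined by an edge."""
    edge_set = {frozenset(e) for e in edges}
    return all(frozenset(pair) in edge_set for pair in combinations(nodes, 2))


# Hypothetical graph: a triangle 1-2-3 with a pendant vertex 4.
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
```

Checking a set of $k$ vertices takes $k(k-1)/2$ edge lookups.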


The CLIQUE Problem

Maximum Clique of Size 5

Clique: a complete subgraph

Maximal Clique: a clique that cannot be enlarged by adding any more vertices
Maximum Clique: the largest maximal clique in the graph

$\mathrm{CLIQUE} = \{\langle G, k \rangle \mid G \text{ has a clique of size } k\}$
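The difference between maximal and maximum is easiest to see on a small example. The brute-force sketch below is for illustration only (it is exponential in the number of vertices); the helper names and the sample graph are assumptions, not taken from the slides.

```python
from itertools import combinations


def is_clique(nodes, edge_set):
    return all(frozenset(p) in edge_set for p in combinations(nodes, 2))


def is_maximal_clique(nodes, vertices, edge_set):
    """A clique is maximal if no outside vertex can be added to it."""
    nodes = set(nodes)
    if not is_clique(nodes, edge_set):
        return False
    return not any(is_clique(nodes | {v}, edge_set) for v in set(vertices) - nodes)


def maximum_clique(vertices, edge_set):
    """Try subsets from largest to smallest; the first clique found is maximum."""
    for k in range(len(vertices), 0, -1):
        for cand in combinations(sorted(vertices), k):
            if is_clique(cand, edge_set):
                return set(cand)
    return set()


# Hypothetical graph: triangle 1-2-3 plus the path 3-4-5.
vertices = {1, 2, 3, 4, 5}
edge_set = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]}
```

In this graph `{3, 4}` is a maximal clique (nothing can be added to it) but not a maximum clique, since the triangle `{1, 2, 3}` is larger.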


Does this graph contain a 4-clique?

Indeed it does!

But, if it had not, what evidence would have been needed?
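One way to think about this question: a "yes" answer has a short certificate (the $k$ vertices themselves), whereas the obvious evidence for "no" is exhaustive, showing that each of the $\binom{n}{k}$ candidate subsets is missing at least one edge. The sketch below builds that exhaustive evidence; the function name `no_clique_evidence` and the sample graphs are hypothetical.

```python
from itertools import combinations


def no_clique_evidence(vertices, edges, k):
    """Return None if a k-clique exists; otherwise map every k-subset
    of the vertices to one pair of its vertices that is not an edge."""
    edge_set = {frozenset(e) for e in edges}
    evidence = {}
    for subset in combinations(sorted(vertices), k):
        missing = next(
            (p for p in combinations(subset, 2) if frozenset(p) not in edge_set),
            None,
        )
        if missing is None:      # subset is a clique, so "no" cannot be proven
            return None
        evidence[subset] = missing
    return evidence


# Hypothetical 4-cycle 1-2-3-4-1: it contains no triangle.
cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]
evidence = no_clique_evidence([1, 2, 3, 4], cycle, 3)
```

The size of this evidence grows with the number of $k$-subsets, in contrast to the $k$-vertex certificate for a "yes" answer.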


Problem: Decision, Optimization or Search

Versions of a problem:
• Decision: "Yes"/"No" answer for a given parameter k
• Optimization: max/min
• Search: actual solution (self-reduction)
• Enumeration: all solutions

Formulate each version for the CLIQUE problem.

• Which problem is harder to solve?
• If we solve the Decision problem, can we use it for the others?
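For CLIQUE, a decision procedure can indeed be reused for the other versions. The sketch below treats `has_clique` as a black-box decision oracle (implemented here by brute force, purely as a stand-in), obtains the optimization answer by increasing k, and recovers an actual clique by self-reduction: a vertex is deleted whenever a k-clique still exists without it. All function names and the sample graph are hypothetical.

```python
from itertools import combinations


def has_clique(vertices, edge_set, k):
    """Decision oracle (brute-force stand-in): is there a clique of size k?"""
    return any(
        all(frozenset(p) in edge_set for p in combinations(s, 2))
        for s in combinations(sorted(vertices), k)
    )


def max_clique_size(vertices, edge_set):
    """Optimization via decision: the largest k with a "yes" answer."""
    k = 0
    while k < len(vertices) and has_clique(vertices, edge_set, k + 1):
        k += 1
    return k


def find_clique(vertices, edge_set, k):
    """Search via self-reduction: drop each vertex that is not needed."""
    if not has_clique(vertices, edge_set, k):
        return None
    keep = set(vertices)
    for v in sorted(vertices):
        if has_clique(keep - {v}, edge_set, k):
            keep.discard(v)
    return keep          # every remaining vertex is in the k-clique


# Hypothetical graph: triangle 1-2-3 plus the path 3-4-5.
vertices = {1, 2, 3, 4, 5}
edge_set = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]}
```

The search routine calls the oracle once per vertex, so a polynomial-time decision procedure would give polynomial-time optimization and search as well.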


Refresher: Class P and Class NP

Definition: P (NP) is the class of languages/problems that are decidable in polynomial time on a (non-)deterministic single-tape Turing machine.

Class P: $P = \bigcup_k \mathrm{DTIME}(n^k)$
Class NP: $NP = \bigcup_k \mathrm{NTIME}(n^k)$

P ???? NP

NP does not stand for "non-polynomial". It stands for:
• Non-deterministic polynomial
• Polynomially verifiable
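"Polynomially verifiable" can be made concrete for CLIQUE: given a proposed set of k vertices (the certificate), checking it takes a number of edge lookups quadratic in k. The sketch below is one such verifier; the function name and argument layout are assumptions for this example.

```python
from itertools import combinations


def verify_clique(vertices, edges, k, certificate):
    """Polynomial-time check that `certificate` is a k-clique of the graph."""
    edge_set = {frozenset(e) for e in edges}
    nodes = set(certificate)
    if len(nodes) != k or not nodes <= set(vertices):
        return False
    # k * (k - 1) / 2 edge lookups.
    return all(frozenset(p) in edge_set for p in combinations(nodes, 2))


# Hypothetical graph: triangle 1-2-3 with a pendant vertex 4.
vertices = [1, 2, 3, 4]
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
```

Verifying a given certificate is fast; the difficulty of CLIQUE lies in finding one.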


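P vs. NP

The Classic Complexity Theory View:

[Figure: nested complexity classes. P is labeled "easy", NP is labeled "hard", and the classes beyond (Σ2P, …, PSPACE, …) are labeled "forget about it".]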

“About ten years ago some computer scientists came by and said they heard we have some really cool problems. They showed that the problems are NP-complete and went away!”


Classical Graph Theory Problems
CSC505: Algorithms, CSC707: Complexity Theory, CSC5??: Graph Theory

NP-hard Problems:
• Longest Path
• Maximum Clique
• Minimum Vertex Cover
• Hamiltonian Path/Cycle
• Traveling Salesman (TSP)
• Maximum Independent Set
• Minimum Dominating Set
• Graph/Subgraph Isomorphism (Subgraph Isomorphism is NP-hard; Graph Isomorphism is in NP but is not known to be NP-hard)
• Maximum Common Subgraph
• …


Graph Mining Problems
CSC 422/522 and Our Book

• Clustering + Maximal Clique Enumeration
• Classification
• Association Rule Mining + Frequent Subgraph Mining
• Anomaly Detection
• Similarity/Dissimilarity/Distance Measures
• Graph-based Dimension Reduction
• Link Analysis
• …

Many graph mining problems have to deal with classical graph problems as part of their data mining pipelines.


Dealing with Computational Intractability

• Exact Algorithms:
  – Small graph problems
  – Small parameters to graph problems
  – Special classes of graphs (e.g., bounded tree-width)
• Polynomial-Time Approximation Algorithms ($O(n^c)$)
  – Guaranteed error bound on the solution
• Heuristic Polynomial-Time Algorithms
  – No guarantee on the quality of the solution
  – Low-degree polynomial solutions

Our focus
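To make the heuristic category concrete, here is a sketch of one simple greedy heuristic for the clique problem: repeatedly pick the candidate vertex with the most neighbors among the remaining candidates, then restrict the candidates to its neighbors. It runs in low-degree polynomial time and always returns a maximal clique, but with no guarantee that the clique is maximum. This particular heuristic and the name `greedy_clique` are illustrative choices, not taken from the slides; the graph is assumed to be simple (no loops).

```python
def greedy_clique(vertices, edges):
    """Greedy heuristic: returns a maximal (not necessarily maximum) clique."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    clique = set()
    candidates = set(vertices)   # vertices adjacent to everything in `clique`
    while candidates:
        # Pick the candidate with the most neighbors among the candidates;
        # sorting makes tie-breaking deterministic.
        v = max(sorted(candidates), key=lambda x: len(adj[x] & candidates))
        clique.add(v)
        candidates &= adj[v]
    return clique


# Hypothetical graph: triangle 1-2-3 plus the path 3-4-5.
vertices = [1, 2, 3, 4, 5]
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
result = greedy_clique(vertices, edges)
```

On this small graph the heuristic happens to find the maximum clique; on other inputs an early greedy choice can lock it into a smaller maximal clique.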