Part I: Introductory Materials Introduction to Graph Theory Dr. Nagiza F. Samatova Department of Computer Science North Carolina State University and Computer Science and Mathematics Division Oak Ridge National Laboratory


Post on 13-Feb-2017


Page 1: Part I: Introductory Materials

Part I: Introductory Materials

Introduction to Graph Theory

Dr. Nagiza F. Samatova

Department of Computer Science, North Carolina State University

and

Computer Science and Mathematics Division, Oak Ridge National Laboratory

Page 2: Part I: Introductory Materials


Graphs

Graph with 7 nodes and 16 edges

Figure labels: nodes / vertices, edges

$G = (V, E)$

$V = \{v_1, v_2, \ldots, v_n\}$

$E = \{e_k = (v_i, v_j) \mid v_i, v_j \in V,\ k = 1, \ldots, m\}$

Undirected: $(v_i, v_j) = (v_j, v_i)$

Directed: $(v_i, v_j) \neq (v_j, v_i)$
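The definition of a graph as a vertex set plus an edge set can be sketched in a few lines of Python. This is only an illustration: the four-vertex graph and the names `V`, `E_directed`, and `E_undirected` are assumptions and do not come from the slides.

```python
# Hypothetical 4-vertex example; all names here are illustrative only.
V = {1, 2, 3, 4}

# Directed edges are ordered pairs: (1, 2) and (2, 1) are different edges.
E_directed = {(1, 2), (2, 3), (3, 1), (3, 4)}

# Undirected edges are unordered pairs; frozenset makes {1, 2} equal to {2, 1}.
E_undirected = {frozenset(e) for e in E_directed}


def has_edge_directed(u, v):
    """True if the ordered pair (u, v) is an edge."""
    return (u, v) in E_directed


def has_edge_undirected(u, v):
    """True if the unordered pair {u, v} is an edge."""
    return frozenset((u, v)) in E_undirected


print(has_edge_directed(1, 2), has_edge_directed(2, 1))
print(has_edge_undirected(1, 2), has_edge_undirected(2, 1))
```

Storing undirected edges as `frozenset` objects is one simple way to make the order of the endpoints irrelevant.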

Page 3: Part I: Introductory Materials


Types of Graphs

• Undirected vs. Directed

• Attributed/Labeled (e.g., vertex, edge) vs. Unlabeled

• Weighted vs. Unweighted

• General vs. Bipartite (Multipartite)

• Trees (no cycles)

• Hypergraphs

• Simple vs. w/ loops vs. w/ multi-edges

Page 4: Part I: Introductory Materials


Labeled Graphs and Induced Subgraphs

Bold: A subgraph induced by vertices b, c and d

Labeled graph w/ loops
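An induced subgraph keeps a chosen subset of vertices and every edge whose endpoints both lie in that subset. The sketch below assumes a small hypothetical edge list (it is not the graph drawn on the slide) and a helper named `induced_subgraph` introduced only for illustration.

```python
def induced_subgraph(edges, keep):
    """Return the edges whose two endpoints are both in `keep`."""
    keep = set(keep)
    return {(u, v) for (u, v) in edges if u in keep and v in keep}


# Hypothetical labeled graph with one loop at vertex "a".
edges = {("a", "a"), ("a", "b"), ("b", "c"), ("c", "d"), ("b", "d"), ("d", "e")}

sub = induced_subgraph(edges, {"b", "c", "d"})
print(sorted(sub))
```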

Page 5: Part I: Introductory Materials

Graph Isomorphism


Which graphs are isomorphic?

(A) (B) (C)C
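For very small graphs, isomorphism can be checked by brute force: try every bijection between the two vertex sets and test whether it maps one edge set onto the other. The following is a minimal sketch under that assumption; it is exponential in the number of vertices, and the function names and example graphs are hypothetical.

```python
from itertools import permutations


def normalize(edges):
    """Represent undirected edges as frozensets so endpoint order is ignored."""
    return {frozenset(e) for e in edges}


def are_isomorphic(vertices_g, edges_g, vertices_h, edges_h):
    """Brute-force isomorphism test for small undirected graphs."""
    vg, vh = list(vertices_g), list(vertices_h)
    eg, eh = normalize(edges_g), normalize(edges_h)
    if len(vg) != len(vh) or len(eg) != len(eh):
        return False
    for perm in permutations(vh):
        mapping = dict(zip(vg, perm))
        mapped = {frozenset(mapping[x] for x in e) for e in eg}
        if mapped == eh:
            return True
    return False


path = [(1, 2), (2, 3), (3, 4)]                 # path on 4 vertices
relabeled = [("a", "c"), ("c", "b"), ("b", "d")]  # same path, other names
star = [("a", "b"), ("a", "c"), ("a", "d")]     # star on 4 vertices

print(are_isomorphic([1, 2, 3, 4], path, "abcd", relabeled))
print(are_isomorphic([1, 2, 3, 4], path, "abcd", star))
```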

Page 6: Part I: Introductory Materials

Graph Automorphism


Which graphs are automorphic?

An automorphism is an isomorphism of a graph onto itself; for labeled graphs, it must also preserve the labels.

(A) (B) (C)B
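As a rough illustration, the automorphisms of a small graph can be enumerated by testing every permutation of its vertices. The sketch assumes an undirected graph, an optional hypothetical `label` dictionary, and a 4-cycle chosen only as an example.

```python
from itertools import permutations


def automorphisms(vertices, edges, label=None):
    """List vertex permutations that map the edge set onto itself.

    If `label` is given, only label-preserving permutations are kept.
    """
    vs = list(vertices)
    es = {frozenset(e) for e in edges}
    found = []
    for perm in permutations(vs):
        m = dict(zip(vs, perm))
        if label is not None and any(label[v] != label[m[v]] for v in vs):
            continue
        if {frozenset(m[x] for x in e) for e in es} == es:
            found.append(m)
    return found


cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]  # hypothetical 4-cycle
labels = {1: "A", 2: "B", 3: "A", 4: "B"}

print(len(automorphisms([1, 2, 3, 4], cycle)))
print(len(automorphisms([1, 2, 3, 4], cycle, labels)))
```

Adding labels can only shrink the set of automorphisms, since every label-preserving mapping is also a structural one.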

Page 7: Part I: Introductory Materials

Vertex degree, in-degree, out-degree

Figure: a directed edge pointing from its tail (t) to its head (h)

In-degree of a vertex is the number of incoming edges.

Out-degree of a vertex is the number of outgoing edges.

Degree of a vertex is the number of edges incident to it (in a directed graph, in-degree plus out-degree).
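The three quantities can be computed directly from a list of (tail, head) pairs. The edge list below is a hypothetical example, not the graph from the slide.

```python
from collections import Counter

# Each directed edge is a (tail, head) pair; this edge list is illustrative.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")]

out_degree = Counter(tail for tail, head in edges)
in_degree = Counter(head for tail, head in edges)

vertices = {v for edge in edges for v in edge}
degree = {v: in_degree[v] + out_degree[v] for v in vertices}

for v in sorted(vertices):
    print(v, in_degree[v], out_degree[v], degree[v])
```

Every edge contributes one to some in-degree and one to some out-degree, so the degrees sum to twice the number of edges.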

Page 8: Part I: Introductory Materials


Graph Representation and Formats

• Adjacency Matrix (vertex vs. vertex)

• Incidence Matrix (vertex vs. edge)

• Sparse vs. Dense Matrices

• DIMACS file format

• In R: igraph object
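Two of the representations listed above can be sketched for a small undirected graph: a vertex-by-edge incidence matrix and a DIMACS-style text dump. The slides are R-oriented, so Python is used here only for illustration. The DIMACS family has several variants; the `p edge` / `e u v` layout below is the form commonly used for clique and coloring benchmarks and is an assumption about which variant is meant. The graph and function names are hypothetical.

```python
def incidence_matrix(n, edges):
    """Rows are vertices 1..n, columns are edges; entry is 1 if incident."""
    matrix = [[0] * len(edges) for _ in range(n)]
    for k, (u, v) in enumerate(edges):
        matrix[u - 1][k] = 1
        matrix[v - 1][k] = 1
    return matrix


def to_dimacs(n, edges):
    """Write the graph in a DIMACS-style edge format (assumed variant)."""
    lines = ["c hypothetical example graph", f"p edge {n} {len(edges)}"]
    lines += [f"e {u} {v}" for u, v in edges]
    return "\n".join(lines)


n = 4
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]

for row in incidence_matrix(n, edges):
    print(row)
print(to_dimacs(n, edges))
```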

Page 9: Part I: Introductory Materials


Adjacency Matrix Representation

Figure: labeled graph with vertices A(1), A(2), A(3), A(4), B(5), B(6), B(7), B(8)

       A(1)  A(2)  A(3)  A(4)  B(5)  B(6)  B(7)  B(8)
A(1)    1     1     1     0     1     0     0     0
A(2)    1     1     0     1     0     1     0     0
A(3)    1     0     1     1     0     0     1     0
A(4)    0     1     1     1     0     0     0     1
B(5)    1     0     0     0     1     1     1     0
B(6)    0     1     0     0     1     1     0     1
B(7)    0     0     1     0     1     0     1     1
B(8)    0     0     0     1     0     1     1     1

Figure: the graph drawn with a second vertex numbering

       A(1)  A(2)  A(3)  A(4)  B(5)  B(6)  B(7)  B(8)
A(1)    1     1     0     1     0     1     0     0
A(2)    1     1     1     0     0     0     1     0
A(3)    0     1     1     1     1     0     0     0
A(4)    1     0     1     1     0     0     0     1
B(5)    0     0     1     0     1     0     1     1
B(6)    1     0     0     0     0     1     1     1
B(7)    0     1     0     0     1     1     1     0
B(8)    0     0     0     1     1     1     0     1

Representation is NOT unique. Algorithms can be order-sensitive.

Src: “Introduction to Data Mining” by Kumar et al
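The non-uniqueness of the adjacency matrix can be reproduced on a tiny example: the same edge set yields different matrices under different vertex orderings. This is a sketch with a hypothetical four-vertex path and a helper named `adjacency_matrix`, not the eight-vertex graph from the slide.

```python
def adjacency_matrix(order, edges):
    """Build a symmetric 0/1 matrix with rows and columns in `order`."""
    index = {v: i for i, v in enumerate(order)}
    n = len(order)
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1
    return matrix


edges = [("p", "q"), ("q", "r"), ("r", "s")]  # hypothetical path p-q-r-s

m1 = adjacency_matrix(["p", "q", "r", "s"], edges)
m2 = adjacency_matrix(["q", "s", "p", "r"], edges)

print(m1)
print(m2)
print(m1 == m2)
```

The two matrices differ entry by entry, yet order-independent quantities such as the sorted list of row sums (the degree sequence) agree.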

Page 10: Part I: Introductory Materials

Families of Graphs

• Cliques

• Path and simple path

• Cycle

• Tree

• Connected graphs

Read the book chapter for definitions and examples.
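Two of these families are easy to test programmatically. The sketch below assumes a simple undirected graph stored as a dictionary of neighbor sets. It checks connectivity with breadth-first search and uses the standard fact that a connected graph on n vertices is a tree exactly when it has n - 1 edges. Function names and example graphs are hypothetical.

```python
from collections import deque


def is_connected(adj):
    """Breadth-first search from an arbitrary vertex reaches every vertex."""
    if not adj:
        return True
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(adj)


def is_tree(adj):
    """A tree is connected and has exactly |V| - 1 edges (so no cycles)."""
    edge_count = sum(len(nbrs) for nbrs in adj.values()) // 2
    return is_connected(adj) and edge_count == len(adj) - 1


path = {1: {2}, 2: {1, 3}, 3: {2}}
triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
two_parts = {1: {2}, 2: {1}, 3: {4}, 4: {3}}

print(is_tree(path), is_tree(triangle), is_connected(two_parts))
```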

Page 11: Part I: Introductory Materials


Complete Graph, or Clique

Each pair of vertices is connected.

Clique

Page 12: Part I: Introductory Materials


The CLIQUE Problem

Maximum Clique of Size 5

Clique: a complete subgraph

Maximal Clique: a clique that cannot be enlarged by adding any more vertices

Maximum Clique: the largest maximal clique in the graph

$\mathrm{CLIQUE} = \{\langle G, k \rangle \mid G \text{ has a clique of size } k\}$
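The difference between a clique, a maximal clique, and a maximum clique can be made concrete with a brute-force sketch. It assumes a simple undirected graph stored as a dictionary of neighbor sets and runs in exponential time, so it is suitable only for tiny examples. The graph and function names are hypothetical.

```python
from itertools import combinations


def is_clique(adj, nodes):
    """Every pair of the given vertices is joined by an edge."""
    return all(v in adj[u] for u, v in combinations(nodes, 2))


def has_clique(adj, k):
    """Decision version of CLIQUE: is there a clique of size k?"""
    return any(is_clique(adj, c) for c in combinations(adj, k))


def is_maximal_clique(adj, nodes):
    """A clique that no outside vertex can extend."""
    nodes = set(nodes)
    return is_clique(adj, nodes) and not any(
        is_clique(adj, nodes | {w}) for w in adj if w not in nodes
    )


def maximum_clique(adj):
    """Largest clique, found by trying sizes from |V| downward."""
    for k in range(len(adj), 0, -1):
        for c in combinations(adj, k):
            if is_clique(adj, c):
                return set(c)
    return set()


# Hypothetical graph: triangle 1-2-3 with a tail 3-4-5.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}

print(has_clique(adj, 3), has_clique(adj, 4))
print(is_maximal_clique(adj, {4, 5}), is_maximal_clique(adj, {1, 2}))
print(maximum_clique(adj))
```

In this example {4, 5} is maximal but not maximum, while {1, 2} is a clique that is not maximal because vertex 3 extends it.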

Page 13: Part I: Introductory Materials


Does this graph contain a 4-clique?

Indeed it does!

But, if it had not,

what evidence would have been needed?

Page 14: Part I: Introductory Materials


Problem: Decision, Optimization or Search

Problem versions:

• Decision: “Yes”/“No” answer

• Optimization: parameter k → max/min

• Search: actual solution

• Enumeration: all solutions

Formulate each version for the CLIQUE problem.

• Which problem is harder to solve?

• If we solve the Decision problem, can we use it for the others? (self-reduction)
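One way to see how the decision version can serve the others is the standard self-reduction for CLIQUE. Given only a yes/no oracle, the optimization version finds the largest k that the oracle accepts. The search version then deletes vertices one at a time, keeping each deletion only if a clique of that size survives. The sketch below uses a brute-force oracle as a stand-in. The oracle, the helper names, and the example graph are assumptions for illustration.

```python
from itertools import combinations


def decide_clique(adj, k):
    """Stand-in decision oracle (brute force): is there a clique of size k?"""
    return any(
        all(v in adj[u] for u, v in combinations(c, 2))
        for c in combinations(adj, k)
    )


def remove_vertex(adj, x):
    """Copy of the graph with vertex x and its incident edges deleted."""
    return {u: nbrs - {x} for u, nbrs in adj.items() if u != x}


def optimize_clique(adj):
    """Optimization via decision: the largest k the oracle accepts."""
    k = 0
    while k < len(adj) and decide_clique(adj, k + 1):
        k += 1
    return k


def search_clique(adj):
    """Search via decision: drop every vertex that is not needed."""
    k = optimize_clique(adj)
    current = adj
    for v in list(adj):
        trial = remove_vertex(current, v)
        if decide_clique(trial, k):
            current = trial  # a k-clique survives without v
    return set(current)


adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
print(optimize_clique(adj), search_clique(adj))
```

Both routines call the oracle only a linear number of times, so a polynomial-time decision procedure would give polynomial-time optimization and search as well.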

Page 15: Part I: Introductory Materials


Refresher: Class P and Class NP

Definition: P (NP) is the class of languages/problems that are decidable in polynomial time on a (non-)deterministic single-tape Turing machine.

Class: P ? NP

$P = \bigcup_k \mathrm{DTIME}(n^k)$

$NP = \bigcup_k \mathrm{NTIME}(n^k)$

NP does not stand for “non-polynomial”; it stands for non-deterministic polynomial, i.e., polynomially verifiable.
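"Polynomially verifiable" can be illustrated with CLIQUE: given a proposed set of k vertices (a certificate), checking it takes only about k² edge lookups, even though finding such a set may be hard. This is a minimal sketch with a hypothetical verifier name and example graph.

```python
from itertools import combinations


def verify_clique(adj, k, certificate):
    """Check a claimed k-clique in time polynomial in the input size."""
    nodes = set(certificate)
    return (
        len(nodes) == k
        and nodes <= set(adj)
        and all(v in adj[u] for u, v in combinations(nodes, 2))
    )


adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}

print(verify_clique(adj, 3, [1, 2, 3]))
print(verify_clique(adj, 3, [2, 3, 4]))
```

A "yes" answer has a short certificate of this kind. No comparably short evidence is known for a "no" answer.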

Page 16: Part I: Introductory Materials

P vs. NP

The Classic Complexity Theory View:

Figure: complexity classes P (“easy”), NP (“hard”), and $\Sigma_2^P$, …, PSPACE (“forget about it”)

“About ten years ago some computer scientists came by and said they heard we have some really cool problems. They showed that the problems are NP-complete and went away!”

Page 17: Part I: Introductory Materials

Classical Graph Theory Problems

CSC505: Algorithms, CSC707: Complexity Theory, CSC5??: Graph Theory

• Longest Path

• Maximum Clique

• Minimum Vertex Cover

• Hamiltonian Path/Cycle

• Traveling Salesman (TSP)

• Maximum Independent Set

• Minimum Dominating Set

• Graph/Subgraph Isomorphism

• Maximum Common Subgraph

• …

NP-hard Problems
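Several problems in this list are closely related. For a simple graph G with complement G', a vertex set S is a clique in G exactly when S is an independent set in G', and exactly when the remaining vertices V \ S form a vertex cover of G'. The sketch below checks this on one hypothetical graph. It assumes no loops or multi-edges, and all names are illustrative.

```python
from itertools import combinations


def complement(adj):
    """Complement of a simple graph: join exactly the non-adjacent pairs."""
    vertices = set(adj)
    return {u: vertices - adj[u] - {u} for u in adj}


def is_clique(adj, s):
    return all(v in adj[u] for u, v in combinations(s, 2))


def is_independent_set(adj, s):
    return all(v not in adj[u] for u, v in combinations(s, 2))


def is_vertex_cover(adj, s):
    """Every edge has at least one endpoint in s."""
    s = set(s)
    return all(u in s or v in s for u in adj for v in adj[u])


adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
co = complement(adj)
S = {1, 2, 3}

print(is_clique(adj, S), is_independent_set(co, S), is_vertex_cover(co, set(adj) - S))
```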

Page 18: Part I: Introductory Materials

Graph Mining Problems

CSC 422/522 and Our Book

• Clustering + Maximal Clique Enumeration

• Classification

• Association Rule Mining + Frequent Subgraph Mining

• Anomaly Detection

• Similarity/Dissimilarity/Distance Measures

• Graph-based Dimension Reduction

• Link Analysis

• …

Many graph mining problems have to deal with classical graph problems as part of their data mining pipelines.

Page 19: Part I: Introductory Materials


Dealing with Computational Intractability

• Exact Algorithms:

– Small graph problems

– Small parameters to graph problems

– Special classes of graphs (e.g., bounded tree-width)

• Approximation Polynomial-Time Algorithms ($O(n^c)$)

– Guaranteed error bound on the solution

• Heuristic Polynomial-Time Algorithms

– No guarantee on the quality of the solution

– Low degree polynomial solutions

Our focus
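The contrast between an approximation algorithm and a heuristic can be sketched with two textbook examples. These are chosen here for illustration and are not necessarily the methods the course uses. The first is the classic matching-based algorithm for Minimum Vertex Cover, which is guaranteed to return a cover at most twice the optimal size. The second is a greedy clique heuristic that runs quickly but carries no quality guarantee. Function names and the example graph are hypothetical.

```python
def vertex_cover_2approx(edges):
    """Take both endpoints of each edge that is not yet covered.

    The chosen edges form a matching, so the cover has size <= 2 * optimum.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover


def greedy_clique(adj):
    """Heuristic: scan vertices by decreasing degree, keep compatible ones.

    The result is always a clique, but it may be far from maximum.
    """
    clique = set()
    for v in sorted(adj, key=lambda x: len(adj[x]), reverse=True):
        if all(v in adj[u] for u in clique):
            clique.add(v)
    return clique


edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}

print(vertex_cover_2approx(edges))
print(greedy_clique(adj))
```

On this graph the smallest vertex cover has three vertices (for example {1, 3, 4}), and the approximation returns four, within the factor-two bound.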