1 data structures and algorithms graphs i: representation and search gal a. kaminka computer science...

53
1 Data Structures and Algorithms Graphs I: Representation and Search Gal A. Kaminka Computer Science Department

Post on 20-Dec-2015

219 views

Category:

Documents


2 download

TRANSCRIPT

1

Data Structures and Algorithms

Graphs I:

Representation and Search

Gal A. Kaminka

Computer Science Department

2

Outline

Reminder: Graphs Directed and undirected

Matrix representation of graphs Directed and undirected

Sparse matrices and sparse graphs Adjacency list representation

3

Graphs Tuple <V,E> V is set of vertices E is a binary relation on V

Each edge is a tuple < v1,v2 >, where v1,v2 in V |E| =< |V|2

4

Directed and Undirected Graphs

Directed graph: < v1, v2 > in E is ordered, i.e., a relation (v1,v2)

Undirected graph: < v1, v2 > in E is un-ordered, i.e., a set { v1, v2 }

Degree of a node X: Out-degree: number of edges < X, v2 > In-degree: number of edges < v1, X > Degree: In-degree + Out-degree In undirected graph: number of edges { X, v2 }

5

Examples of graphs

6

Paths

Path from vertex v0 to vertex vk: A sequence of vertices < v0, v1, …. vk >

For all 0 =< i < k, edge < vi, vi+1 > exists.

Path is of length k Two vertices x, y are adjacent if < x, y > is an edge Path is simple if vertices in sequence are distinct. Cycle: if v0 = vk

< v, v > is cycle of length 1

7

Connected graphs

Undirected Connected graph: For any vertices x, y there exists a path xy (= yx)

Directed connected graph: If underlying undirected graph is connected

Strongly connected directed graph: If for any two vertices x, y there exist path xy

and path yx Clique: a strongly connected component

|V|-1 =< |E| =< |V|2

8

Cycles and trees

Graph with no cycles: acyclic Directed Acyclic Graph: DAG Undirected forest:

Acyclic undirected graph Tree: undirected acyclic connected graph

one connected component

9

Representing graphs

Adjacency matrix: When graph is dense |E| close to |V|2

Adjacency lists: When graph is sparse |E| << |V|2

10

Adjacency Matrix

Matrix of size |V| x |V| Each row (column) j correspond to a distinct vertex j

“1” in cell < i, j > if there is exists an edge <i,j> Otherwise, “0”

In an undirected graph, “1” in <i,j> => “1” in <j,i> “1” in <j,j> means there’s a self-loop in vertex j

11

Examples

1 2 3

1 0 0 1

2 0 1 0

3 1 1 0

1

3

2

32

1

4

1 2 3 4

1 0 1 1 0

2 1 0 0 0

3 1 0 0 0

4 0 0 0 0

12

Adjacency matrix features

Storage complexity: O(|V|2) But can use bit-vector representation

Undirected graph: symmetric along main diagonal AT transpose of A Undirected: A=AT

In-degree of X: Sum along column X O(|V|) Out-degree of X: Sum along row X O(|V|) Very simple, good for small graphs

Edge existence query: O(1)

13

But, ….

Many graphs in practical problems are sparse Not many edges --- not all pairs x,y have edge xy

Matrix representation demands too much memory We want to reduce memory footprint

Use sparse matrix techniques

14

Adjacency List

An array Adj[ ] of size |V| Each cell holds a list for associated vertex Adj[u] is list of all vertices adjacent to u

List does not have to be sorted

Undirected graphs: Each edge is represented twice

15

Examples

1 3

2 2

3 1 2

1

3

2

32

1

4

1 2 3

2 1

3 1

4

16

Adjacency list features

Storage Complexity: O(|V| + |E|) In undirected graph: O(|V|+2*|E|) = O(|V|+|E|)

Edge query check: O(|V|) in worst case

Degree of node X: Out degree: Length of Adj[X] O(|V|) calculation In degree: Check all Adj[] lists O(|V|+|E|) Can be done in O(1) with some auxiliary information!

17

?שאלות

18

Graph Traversals (Search)

We have covered some of these with binary trees Breadth-first search (BFS) Depth-first search (DFS)

A traversal (search): An algorithm for systematically exploring a graph Visiting (all) vertices Until finding a goal vertex or until no more vertices

Only for connected graphs

19

Breadth-first search

One of the simplest algorithms Also one of the most important

It forms the basis for MANY graph algorithms

20

BFS: Level-by-level traversal

Given a starting vertex s Visit all vertices at increasing distance from s

Visit all vertices at distance k from s Then visit all vertices at distance k+1 from s Then ….

21

BFS in a binary tree (reminder)

BFS: visit all siblings before their descendents

5

2

1 3

8

6 10

7 95 2 8 1 3 6 10 7 9

22

BFS(tree t)1. q new queue

2. enqueue(q, t)

3. while (not empty(q))

4. curr dequeue(q)

5. visit curr // e.g., print curr.datum

6. enqueue(q, curr.left)

7. enqueue(q, curr.right)

This version for binary trees only!

23

BFS for general graphs

This version assumes vertices have two children left, right This is trivial to fix

But still no good for general graphs It does not handle cycles

Example.

24

Start with A. Put in the queue (marked red)

A

B

G C

E

D

F

Queue: A

25

B and E are next

A

B

G C

E

D

F

Queue: A B E

26

When we go to B, we put G and C in the queue

When we go to E, we put D and F in the queue

A

B

G C

E

D

F

Queue: A B E C G D F

27

When we go to B, we put G and C in the queue

When we go to E, we put D and F in the queue

A

B

G C

E

D

F

Queue: A B E C G D F

28

Suppose we now want to expand C.

We put F in the queue again!

A

B

G C

E

D

F

Queue: A B E C G D F F

29

Generalizing BFS

Cycles: We need to save auxiliary information Each node needs to be marked

Visited: No need to be put on queue Not visited: Put on queue when found

What about assuming only two children vertices? Need to put all adjacent vertices in queue

30

BFS(graph g, vertex s)1. unmark all vertices in G

2. q new queue

3. mark s

4. enqueue(q, s)

5. while (not empty(q))

6. curr dequeue(q)

7. visit curr // e.g., print its data

8. for each edge <curr, V>

9. if V is unmarked

10. mark V

11. enqueue(q, V)

31

The general BFS algorithm Each vertex can be in one of three states:

Unmarked and not on queue Marked and on queue Marked and off queue

The algorithm moves vertices between these states

32

Handling vertices

Unmarked and not on queue: Not reached yet

Marked and on queue: Known, but adjacent vertices not visited yet (possibly)

Marked and off queue: Known, all adjacent vertices on queue or done with

33

Start with A. Mark it.

A

B

G C

E

D

F

Queue: A

34

Expand A’s adjacent vertices.

Mark them and put them in queue.

A

B

G C

E

D

F

Queue: A B E

35

Now take B off queue, and queue its neighbors.

A

B

G C

E

D

F

Queue: A B E C G

36

Do same with E.

A

B

G C

E

D

F

Queue: A B E C G D F

37

Visit C.

Its neighbor F is already marked, so not queued.

A

B

G C

E

D

F

Queue: A B E C G D F

38

Visit G.

A

B

G C

E

D

F

Queue: A B E C G D F

39

Visit D. F, E marked so not queued.

A

B

G C

E

D

F

Queue: A B E C G D F

40

Visit F.

E, D, C marked, so not queued again.

A

B

G C

E

D

F

Queue: A B E C G D F

41

Done. We have explored the graph in order:

A B E C G D F.

A

B

G C

E

D

F

Queue: A B E C G D F

42

Interesting features of BFS

Complexity: O(|V| + |E|) All vertices put on queue exactly once For each vertex on queue, we expand its edges In other words, we traverse all edges once

BFS finds shortest path from s to each vertex Shortest in terms of number of edges Why does this work?

43

Depth-first search

Again, a simple and powerful algorithm Given a starting vertex s Pick an adjacent vertex, visit it.

Then visit one of its adjacent vertices ….. Until impossible, then backtrack, visit another

44

DFS(graph g, vertex s)Assume all vertices initially unmarked

1. mark s

2. visit s // e.g., print its data

3. for each edge <s, V>

4. if V is not marked

5. DFS(G, V)

45

Start with A. Mark it.

A

B

G C

E

D

F

Current vertex: A

46

Expand A’s adjacent vertices. Pick one (B).

Mark it and re-visit.

A

B

G C

E

D

F

Current: B

47

Now expand B, and visit its neighbor, C.

A

B

G C

E

D

F

Current: C

48

Visit F.

Pick one of its neighbors, E.

A

B

G C

E

D

F

Current: F

49

E’s adjacent vertices are A, D and F.

A and F are marked, so pick D.

A

B

G C

E

D

F

Current: E

50

Visit D. No new vertices available. Backtrack to

E. Backtrack to F. Backtrack to C. Backtrack to B

A

B

G C

E

D

F

Current: D

51

Visit G. No new vertices from here. Backtrack to

B. Backtrack to A. E already marked so no new.

A

B

G C

E

D

F

Current: G

52

Done. We have explored the graph in order:

A B C F E D G

A

B

G C

E

D

F

Current:1

2

3

4

6

7

5

53

Interesting features of DFS

Complexity: O(|V| + |E|) All vertices visited once, then marked For each vertex on queue, we examine all edges In other words, we traverse all edges once

DFS does not necessarily find shortest path Why?