cosc 3101a - design and analysis of algorithms 10 bfs, dfs topological sort strongly connected...
TRANSCRIPT
COSC 3101A - Design and Analysis of Algorithms
10
BFS, DFS
Topological Sort
Strongly Connected Components
Many of these slides are taken from Monica Nicolescu, Univ. of Nevada, Reno, [email protected]
7/6/2004 Lecture 10 COSC3101A 2
Searching in a Graph
• Graph searching = systematically follow the edges of the graph so as to visit the vertices of the graph
• Two basic graph searching algorithms:– Breadth-first search
– Depth-first search
– The difference between them is in the order in which they explore the unvisited edges of the graph
• Graph algorithms are typically elaborations of the basic graph-searching algorithms
7/6/2004 Lecture 10 COSC3101A 3
Breadth-First Search (BFS)
• Input:– A graph G = (V, E) (directed or undirected)– A source vertex s V
• Goal:– Explore the edges of G to “discover” every vertex
reachable from s, taking the ones closest to s first
• Output:– d[v] = distance (smallest # of edges) from s to v, for
all v V– A “breadth-first tree” rooted at s that contains all
reachable vertices
7/6/2004 Lecture 10 COSC3101A 4
Breadth-First Search (cont.)
• Discover vertices in increasing order of distance from the source s – search in breadth not depth– Find all vertices at 1 edge from s, then all vertices at 2
edges from s, and so on
1 2
5 4
3
7
7
9
11
12
6
7/6/2004 Lecture 10 COSC3101A 5
Breadth-First Search (cont.)
• Keeping track of progress:
– Color each vertex in either white,
gray or black
– Initially, all vertices are white
– When being discovered a vertex
becomes gray
– After discovering all its adjacent
vertices the node becomes black
– Use FIFO queue Q to maintain the
set of gray vertices
1 2
5 4
3
1 2
5 4
3
source
1 2
5 4
3
7/6/2004 Lecture 10 COSC3101A 6
Breadth-First Tree
• BFS constructs a breadth-first tree
– Initially contains the root (source vertex s)
– When vertex v is discovered while scanning
the adjacency list of a vertex u vertex v
and edge (u, v) are added to the tree
– u is the predecessor (parent) of v in the
breadth-first tree
– A vertex is discovered only once it has at
most one parent
1 2
5 4
3
source
7/6/2004 Lecture 10 COSC3101A 7
BFS Additional Data Structures
• G = (V, E) represented using adjacency lists
• color[u] – the color of the vertex for all u
V
[u] – predecessor of u
– If u = s (root) or node u has not yet been
discovered [u] = NIL
• d[u] – the distance from the source s to
vertex u
• Use a FIFO queue Q to maintain the set of
gray vertices
1 2
5 4
3
d=1=1
d=1=1
d=2=5
d=2=2
source
7/6/2004 Lecture 10 COSC3101A 8
BFS(G, s)
1. for each u V[G] - {s}
2. do color[u] WHITE
3. d[u] ←
4. [u] = NIL
5. color[s] GRAY
6. d[s] ← 0
7. [s] = NIL
8. Q
9. Q ← ENQUEUE(Q, s) Q: s
0
r s t u
v w x y
r s t u
v w x y
r s t u
v w x y
7/6/2004 Lecture 10 COSC3101A 9
BFS(G, s)
10. while Q
11. do u ← DEQUEUE(Q)
12. for each v Adj[u]
13. do if color[v] = WHITE
14. then color[v] ← GRAY
15. d[v] ← d[u] + 1
16. [v] = u
17. ENQUEUE(Q, v)
18. color[u] BLACK
0
1
r s t u
v w x y
Q: w
Q: s 0
r s t u
v w x y
1 0
1
r s t u
v w x y
Q: w, r
7/6/2004 Lecture 10 COSC3101A 10
Example
1 0
1
r s t u
v w x yQ: s
0
r s t u
v w x yQ: w, r
v w x y
1 0 2
1 2
r s t u
Q: r, t, x
1 0 2
2 1 2
r s t u
v w x yQ: t, x, v
1 0 2 3
2 1 2
r s t u
v w x yQ: x, v, u
1 0 2 3
2 1 2 3
r s t u
v w x yQ: v, u, y
1 0 2 3
2 1 2 3
r s t u
v w x yQ: u, y
1 0 2 3
2 1 2 3
r s t u
v w x yQ: y
r s t u1 0 2 3
2 1 2 3
v w x yQ:
7/6/2004 Lecture 10 COSC3101A 11
Analysis of BFS
1. for each u V - {s}
2. do color[u] WHITE
3. d[u] ←
4. [u] = NIL
5. color[s] GRAY
6. d[s] ← 0
7. [s] = NIL
8. Q
9. Q ← ENQUEUE(Q, s)
O(V)
(1)
7/6/2004 Lecture 10 COSC3101A 12
Analysis of BFS
10. while Q
11. do u ← DEQUEUE(Q)
12. for each v Adj[u]
13. do if color[v] = WHITE
14. then color[v] = GRAY
15. d[v] ← d[u] + 1
16. [v] = u
17. ENQUEUE(Q, v)
18. color[u] BLACK
(1)
(1)
Scan Adj[u] for all vertices in the graph• Each vertex is scanned only once, when the vertex is dequeued
• Sum of lengths of all adjacency lists = (E)• Scanning operations: O(E)
• Total running time for BFS = O(V + E)
7/6/2004 Lecture 10 COSC3101A 13
Shortest Paths Property
• BFS finds the shortest-path distance from the source vertex s V to each node in the graph
• Shortest-path distance = (s, u)– Minimum number of edges in any path from s to u
r s t u
1 0 2 3
2 1 2 3
v w x y
source
7/6/2004 Lecture 10 COSC3101A 14
Depth-First Search• Input:
– G = (V, E) (No source vertex given!)
• Goal:
– Explore the edges of G to “discover” every vertex in V starting at the most current visited node
– Search may be repeated from multiple sources
• Output:
– 2 timestamps on each vertex:• d[v] = discovery time
• f[v] = finishing time (done with examining v’s adjacency list)
– Depth-first forest
1 2
5 4
3
7/6/2004 Lecture 10 COSC3101A 15
Depth-First Search
• Search “deeper” in the graph whenever possible
• Edges are explored out of the most recently discovered vertex v that still has unexplored edges
• After all edges of v have been explored, the search “backtracks” from the parent of v
• The process continues until all vertices reachable from the original source have been discovered
• If undiscovered vertices remain, choose one of them as a new source and repeat the search from that vertex
• DFS creates a “depth-first forest”
1 2
5 4
3
7/6/2004 Lecture 10 COSC3101A 16
DFS Additional Data Structures
• Global variable: time-step– Incremented when nodes are discovered/finished
• color[u] – similar to BFS– White before discovery, gray while processing and
black when finished processing
[u] – predecessor of u• d[u], f[u] – discovery and finish times
GRAYWHITE BLACK
0 2Vd[u] f[u]
1 ≤ d[u] < f [u] ≤ 2 |V|
7/6/2004 Lecture 10 COSC3101A 17
DFS(G)
1. for each u V[G]2. do color[u] ← WHITE
3. [u] ← NIL
4. time ← 0
5. for each u V[G]6. do if color[u] = WHITE
7. then DFS-VISIT(u)
• Every time DFS-VISIT(u) is called, u becomes the root of a new tree in the depth-first forest
u v w
x y z
7/6/2004 Lecture 10 COSC3101A 18
DFS-VISIT(u)
1. color[u] ← GRAY
2. time ← time+13. d[u] ← time4. for each v Adj[u] 5. do if color[v] = WHITE
6. then [v] ← u7. DFS-VISIT(v)
8. color[u] ← BLACK
9. time ← time + 110.f[u] ← time
1/
u v w
x y z
u v w
x y z
time = 1
1/ 2/
u v w
x y z
7/6/2004 Lecture 10 COSC3101A 19
Example
1/ 2/
u v w
x y z
1/
u v w
x y z
1/ 2/
3/
u v w
x y z
1/ 2/
4/ 3/
u v w
x y z
1/ 2/
4/ 3/
u v w
x y z
B
1/ 2/
4/5 3/
u v w
x y z
B
1/ 2/
4/5 3/6
u v w
x y z
B
1/ 2/7
4/5 3/6
u v w
x y z
B
1/ 2/7
4/5 3/6
u v w
x y z
BF
7/6/2004 Lecture 10 COSC3101A 20
Example (cont.)
1/8 2/7
4/5 3/6
u v w
x y z
BF
1/8 2/7 9/
4/5 3/6
u v w
x y z
BF
1/8 2/7 9/
4/5 3/6
u v w
x y z
BFC
1/8 2/7 9/
4/5 3/6 10/
u v w
x y z
BFC
1/8 2/7 9/
4/5 3/6 10/
u v w
x y z
BFC
B
1/8 2/7 9/
4/5 3/6 10/11
u v w
x y z
BFC
B
1/8 2/7 9/12
4/5 3/6 10/11
u v w
x y z
BFC
B
The results of DFS may depend on:• The order in which nodes are explored in procedure DFS• The order in which the neighbors of a vertex are visited in DFS-VISIT
7/6/2004 Lecture 10 COSC3101A 21
Edge Classification
• Tree edge (reaches a WHITE
vertex): – (u, v) is a tree edge if v was first
discovered by exploring edge (u, v)
• Back edge (reaches a GRAY
vertex): – (u, v), connecting a vertex u to an
ancestor v in a depth first tree
– Self loops (in directed graphs) are
also back edges
1/
u v w
x y z
1/ 2/
4/ 3/
u v w
x y z
B
7/6/2004 Lecture 10 COSC3101A 22
Edge Classification
• Forward edge (reaches a BLACK
vertex & d[u] < d[v]): – Non-tree edges (u, v) that connect a vertex
u to a descendant v in a depth first tree
• Cross edge (reaches a BLACK vertex
& d[u] > d[v]): – Can go between vertices in same depth-first
tree (as long as there is no ancestor /
descendant relation) or between different
depth-first trees
1/ 2/7
4/5 3/6
u v w
x y z
BF
1/8 2/7 9/
4/5 3/6
u v w
x y z
BFC
7/6/2004 Lecture 10 COSC3101A 23
Analysis of DFS(G)
1. for each u V[G]2. do color[u] ← WHITE
3. [u] ← NIL
4. time ← 0
5. for each u V[G]6. do if color[u] = WHITE
7. then DFS-VISIT(u)
(V)
(V) – exclusive of time for DFS-VISIT
7/6/2004 Lecture 10 COSC3101A 24
Analysis of DFS-VISIT(u)
1. color[u] ← GRAY
2. time ← time+13. d[u] ← time4. for each v Adj[u] 5. do if color[v] = WHITE
6. then [v] ← u7. DFS-VISIT(v)
8. color[u] ← BLACK
9. time ← time + 110.f[u] ← time
Each loop takes |Adj[v]|
DFS-VISIT is called exactly once for each vertex
Total: ΣvV |Adj[v]| + (V) =
(E)
(V + E)
7/6/2004 Lecture 10 COSC3101A 25
Properties of DFS
• u = [v] DFS-VISIT(v) was
called during a search of u’s
adjacency list
• Vertex v is a descendant of vertex u
in the depth first forest v is
discovered during the time in which
u is gray
1/ 2/
3/
u v w
x y z
7/6/2004 Lecture 10 COSC3101A 26
Parenthesis Theorem
In any DFS of a graph G, for
all u, v, exactly one of the
following holds:
1. [d[u], f[u]] and [d[v], f[v]]
are disjoint, and neither of u
and v is a descendant of the
other
2. [d[v], f[v]] is entirely within
[d[u], f[u]] and v is a
descendant of u
3. [d[u], f[u]] is entirely within
[d[v], f[v]] and u is a
descendant of v
3/6 2/9 1/10
4/5 7/8 12/13
uvwx
y z s11/16
14/15
t
1 2 3 4 5 6 7 8 9 10 1311 12 14 15 16
s
z
t
v u
y w
x
(s (z (y (x x) y) (w w) z) s) u)(t (v (u u) t)
Well-formed expression: parenthesis areproperly nested
7/6/2004 Lecture 10 COSC3101A 27
Other Properties of DFS
Corollary
Vertex v is a proper descendant of u
d[u] < d[v] < f[v] < f[u]
Theorem (White-path Theorem)
In a depth-first forest of a graph G, vertex
v is a descendant of u if and only if at
time d[u], there is a path u v consisting
of only white vertices.
1/ 2/
u
v
1/8 2/7 9/12
4/5 3/6 10/11
u
v
BFC
B
7/6/2004 Lecture 10 COSC3101A 28
Topological Sort
Topological sort of a directed acyclic graph G = (V, E): a linear order of vertices such that if there exists an edge (u, v), then u appears before v in the ordering.
• Directed acyclic graphs (DAGs)– Used to represent precedence of events or processes
that have a partial order
a before b b before c
b before c a before ca before c
What about a and b?
Topological sort helps us establish a total order
7/6/2004 Lecture 10 COSC3101A 29
Topological Sort
undershorts
pants
belt
socks
shoes
watch
shirt
tie
jacket
TOPOLOGICAL-SORT(V, E)
1. Call DFS(V, E) to compute
finishing times f[v] for each
vertex v2. When each vertex is finished,
insert it onto the front of a
linked list
3. Return the linked list of
vertices
1/
2/
3/4
5
6/78
9/10
11/
12/13/14
15
16 17/18
jackettiebeltshirtwatchshoespantsundershortssocks
Running time: (V + E)
7/6/2004 Lecture 10 COSC3101A 30
Topological Sort
undershorts
pants
belt
socks
shoes
watch
shirt
tie
jacket
1/
2/
3/4
5
6/78
9/10
11/
12/13/14
15
16 17/18
jackettiebeltshirtwatchshoespantsundershortssocks
Topological sort:an ordering of vertices along a horizontal line so that all directededges go from left to right.
7/6/2004 Lecture 10 COSC3101A 31
Lemma
A directed graph is acyclic a DFS on G yields
no back edges.
Proof:
“”: acyclic no back edge
– Assume back edge prove cycle
– Assume there is a back edge (u, v)
v is an ancestor of u
there is a path from v to u in G (v u)
v u + the back edge (u, v) yield a cycle
v
u
(u, v)
7/6/2004 Lecture 10 COSC3101A 32
Lemma
A directed graph is acyclic a DFS on G yields no back edges.
Proof:
“”: no back edge acyclic– Assume cycle prove back edge– Suppose G contains cycle c– Let v be the first vertex discovered in c, and (u, v) be
the preceding edge in c– At time d[v], vertices of c form a white path v u– u is descendant of v in depth-first forest (by white-path
theorem) (u, v) is a back edge
v
u
(u, v)
7/6/2004 Lecture 10 COSC3101A 33
Strongly Connected Components
Given directed graph G = (V, E):
A strongly connected component (SCC) of G is a maximal set of vertices C V such that for every pair of vertices u, v C, we have both u v and v u.
7/6/2004 Lecture 10 COSC3101A 34
The Transpose of a Graph
• GT = transpose of G– GT is G with all edges reversed
– GT = (V, ET), ET = {(u, v) : (v, u) E}
• If using adjacency lists: we can create GT in (V + E) time
1 2
5 4
3
1 2
5 4
3
7/6/2004 Lecture 10 COSC3101A 35
Finding the SCC
• Observation: G and GT have the same SCC’s– u and v are reachable from each other in G they are
reachable from each other in GT
• Idea for computing the SCC of a DAG G = (V, E):– Make two depth first searches: one on G and one on GT
1 2
5 4
3
1 2
5 4
3
7/6/2004 Lecture 10 COSC3101A 36
STRONGLY-CONNECTED-COMPONENTS(G)
1. call DFS(G) to compute finishing times f[u] for
each vertex u
2. compute GT
3. call DFS(GT), but in the main loop of DFS,
consider vertices in order of decreasing f[u] (as computed in first DFS)
4. output the vertices in each tree of the depth-
first forest formed in second DFS as a separate
SCC
7/6/2004 Lecture 10 COSC3101A 37
Example
a b c d
e f g h
1/
2/3/4 65/7
8/11/
12/
13/ 91014
15
16
f4
h6
g7
d9
c10
a14
e15
b16
DFS on the initial graph G
DFS on GT:
• start at b: visit a, e• start at c: visit d• start at g: visit f• start at h
Strongly connected components: C1 = {a, b, e}, C2 = {c, d}, C3 = {f, g}, C4 = {h}
a b c d
e f g h
7/6/2004 Lecture 10 COSC3101A 38
Component Graph
• The component graph GSCC = (VSCC, ESCC):
– VSCC = {v1, v2, …, vk}, where vi corresponds to each strongly connected component Ci
– There is an edge (vi, vj) ESCC if G contains a directed edge (x, y) for some x Ci and y Cj
• The component graph is a DAG
a b c d
e f g h
a b e
c d
f g h
7/6/2004 Lecture 10 COSC3101A 39
Lemma 1
Let C and C’ be distinct SCC’s in G
Let u, v C, and u’, v’ C’
Suppose there is a path u u’ in G
Then there cannot also be a path v’ v in G.
u
vu’
v’
Proof
• Suppose there is a path v’ v
• There exists u u’ v’
• There exists v’ v u• u and v’ are reachable from
each other, so they are not in separate SCC’s: contradiction!
C C’
7/6/2004 Lecture 10 COSC3101A 40
Notations
• Extend notation for d (starting time) and f (finishing time) to sets of vertices U V:– d(U) = minuU { d[u] } (earliest discovery time)
– f(U) = maxuU { f[u] } (latest finishing time)
a b c d
e f g h
1/
2/3/4 65/7
8/11/
12/
13/ 91014
15
16
C1 C2
C3 C4
d(C1)
f(C1)
d(C2)
f(C2)
d(C3)
f(C3)
d(C4)
f(C4)
=11
=16
=1
=10
=2
=7
=5
=6
7/6/2004 Lecture 10 COSC3101A 41
Lemma 2
• Let C and C’ be distinct SCCs in a directed graph G = (V, E). If there is an edge (u, v) E, where u C and v C’ then f(C) > f(C’).
• Consider C1 and C2, connected by edge (b, c)
a b c d
e f g h
1/
2/3/4 65/7
8/11/
12/
13/ 91014
15
16
C1 C2
C3 C4
d(C1)
f(C1)
d(C2)
f(C2)
d(C3)
f(C3)
d(C4)
f(C4)
=11
=16
=1
=10
=3
=7
=5
=6
7/6/2004 Lecture 10 COSC3101A 42
Corollary
• Let C and C’ be distinct SCCs in a directed graph G = (V, E). If there is an edge (u, v) ET, where u C and v C’ then f(C) < f(C’).
• Consider C2 and C1, connected by edge (c, b)C1 = C’ C2 = C
C3 C4
a b c d
e f g h
• Since (c, b) ET (b, c) E
• From previous lemma:
f(C1) > f(C2)
f(C’) > f(C)
7/6/2004 Lecture 10 COSC3101A 43
Discussion
f(C) < f(C’)• Each edge in GT that goes between different
components goes from a component with an earlier finish time (in the DFS) to one with a later finish time
C1 = C’ C2 = C
C3 C4
a b c d
e f g h
7/6/2004 Lecture 10 COSC3101A 44
Why does SCC Work?• When we do the second DFS, on GT, we start with a
component C such that f(C) is maximum (b, in our case)
• We start from b and visit all vertices in C1
• From corollary: f(C) > f(C’) for all C C’ there are no edges from C to any other SCCs in GT
DFS will visit only vertices in C1
The depth-first tree rooted at b contains exactly the vertices of C1
C1 C2
C3 C4
a b c d
e f g h
f4
h6
g7
d9
c10
a14
e15
b16
7/6/2004 Lecture 10 COSC3101A 45
Why does SCC Work? (cont.)
• The next root chosen in the second DFS is in SCC C2 such that f(C) is maximum over all SCC’s other than C1
• DFS visits all vertices in C2
– the only edges out of C2 go to C1, which we’ve already visited
The only tree edges will be to vertices in C2
• Each time we choose a new root it can reach only:– vertices in its own component – vertices in components already visitedC1 C2
C3 C4
a b c d
e f g h
f4
h6
g7
d9
c10
a14
e15
b16
7/6/2004 Lecture 10 COSC3101A 46
Readings
• Chapter 22• Appendix B