Chapter 8: Graphs
Objectives
Looking ahead – in this chapter, we’ll consider
• Graph Representation
• Graph Traversals
• Shortest Paths
• Cycle Detection
• Spanning Trees
• Connectivity
Data Structures and Algorithms in C++, Fourth Edition
Objectives (continued)
• Topological Sort
• Networks
• Matching
• Eulerian and Hamiltonian Graphs
• Graph Coloring
• NP-Complete Problems in Graph Theory
Introductory Remarks
• Although trees are quite flexible, they have an inherent limitation in that they can only express hierarchical structures
• Fortunately, we can generalize a tree to form a graph, in which this limitation is removed
• Informally, a graph is a collection of nodes and the connections between them
• Figure 8.1 illustrates some examples of graphs; notice there is typically no limitation on the number of vertices or edges
• Consequently, graphs are extremely versatile and applicable to a wide variety of situations
• Graph theory has developed into a sophisticated field of study since its origins in the early 1700s
Introductory Remarks (continued)
Fig. 8.1 Examples of graphs: (a–d) simple graphs; (c) a complete graph K4; (e) a multigraph; (f) a pseudograph; (g) a circuit in a digraph; (h) a cycle in the digraph
Introductory Remarks (continued)
• And, while many results are theoretical, the applications of graphs are numerous and worth consideration
• First, though, we need to consider some definitions
• A simple graph G = (V, E) consists of a (finite) set V and a collection E of unordered pairs {u, v} of distinct elements from V
• Each element of V is called a vertex or a point or a node, and each element of E is called an edge or a line or a link
• The number of vertices, the cardinality of V, is called the order of the graph and is denoted by |V|
• The cardinality of E, called the size of the graph, is denoted by |E|
Introductory Remarks (continued)
• A graph G = (V, E) is directed if the edge set is composed of ordered vertex (node) pairs
• Now these definitions restrict the number of edges that can occur between any two vertices to one
• If we allow multiple edges between any two vertices, we have a multigraph (Figure 8.1e)
• Formally, a multigraph is defined as G = (V, E, f) where V is the set of vertices, E is the set of edges, and f : E → {{vi, vj} : vi, vj ∈ V and vi ≠ vj} is a function defining edges as pairs of distinct vertices
• A pseudograph is a multigraph that drops the vi ≠ vj condition, allowing the graph to have loops (Figure 8.1f)
Introductory Remarks (continued)
• A path between vertices v1 and vn is a sequence of edges denoted v1, v2, …, vn-1, vn
• If v1 = vn and no edge is repeated, the path is a circuit (Figure 8.1g); if, in addition, all the vertices in a circuit are different, it is a cycle (Figure 8.1h)
• A weighted graph assigns a value to each edge, based on contextual usage
• A complete graph of n vertices, denoted Kn, has exactly one edge between each pair of vertices (Figure 8.1c)
• The edge count is $|E| = \binom{|V|}{2} = \frac{|V|!}{2!\,(|V|-2)!} = \frac{|V|(|V|-1)}{2} = O(|V|^2)$
Introductory Remarks (continued)
• A subgraph of a graph G, designated G’, is the graph (V’, E’) where V’ ⊆ V and E’ ⊆ E
• If E’ contains every edge e ∈ E whose endpoints are both in V’, then the subgraph is said to be induced by its vertices V’
• Two vertices are adjacent if the edge defined by them is in E
• That edge is called incident with the vertices
• The number of edges incident with a vertex v is the degree of the vertex; if the degree is 0, v is called isolated
• Notice that the definition of a graph allows the set E to be empty, so a graph may be composed of isolated vertices
Graph Representation
• Graphs can be represented in a number of ways
• One of the simplest is an adjacency list, where each vertex adjacent to a given vertex is listed
• This can be designed as a table (known as a star representation) or a linked list, shown in Figure 8.2b-c on page 393
• Another representation is as a matrix, which can be designed in two ways
• An adjacency matrix is a |V| x |V| binary matrix where:
$$a_{ij} = \begin{cases} 1 & \text{if there exists an edge } (v_i v_j) \\ 0 & \text{otherwise} \end{cases}$$
Graph Representation (continued)
• An example of an adjacency matrix is shown in Figure 8.2d
• The order of the vertices in the matrix is arbitrary, so there are n! possible vertex orderings, and hence up to n! adjacency matrices, for a graph of n vertices
• It is also possible to generalize an adjacency matrix definition to handle a multigraph by defining aij = number of edges between vi and vj
• A second matrix representation is based on incidences, hence the name incidence matrix
• An incidence matrix is a |V| x |E| binary matrix where:
$$a_{ij} = \begin{cases} 1 & \text{if edge } e_j \text{ is incident with vertex } v_i \\ 0 & \text{otherwise} \end{cases}$$
Graph Representation (continued)
• An example of an incidence matrix is shown in Figure 8.2e
• For a multigraph, some columns are the same, and a column with a single 1 represents a loop
• As far as usage, the proper structure depends to a great extent on the kinds of operations that need to be done
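The three representations described above can be sketched in C++ as follows. This is a minimal illustration, not the book's code: the `Edge` type and the function names are assumptions made for the example, vertices are assumed to be numbered from 0, and the graph is assumed to be undirected.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper type: an undirected edge joining two 0-based vertices.
typedef std::pair<int, int> Edge;

// Adjacency list: adj[v] holds every vertex adjacent to v.
std::vector<std::vector<int>> adjacencyList(int n, const std::vector<Edge>& edges) {
    std::vector<std::vector<int>> adj(n);
    for (const Edge& e : edges) {
        adj[e.first].push_back(e.second);
        if (e.first != e.second) adj[e.second].push_back(e.first);
    }
    return adj;
}

// Adjacency matrix: a[i][j] counts the edges between v_i and v_j
// (0 or 1 in a simple graph, possibly more in a multigraph).
std::vector<std::vector<int>> adjacencyMatrix(int n, const std::vector<Edge>& edges) {
    std::vector<std::vector<int>> a(n, std::vector<int>(n, 0));
    for (const Edge& e : edges) {
        ++a[e.first][e.second];
        if (e.first != e.second) ++a[e.second][e.first];
    }
    return a;
}

// Incidence matrix: a[i][j] is 1 when edge e_j is incident with vertex v_i.
std::vector<std::vector<int>> incidenceMatrix(int n, const std::vector<Edge>& edges) {
    std::vector<std::vector<int>> a(n, std::vector<int>(edges.size(), 0));
    for (std::size_t j = 0; j < edges.size(); ++j) {
        a[edges[j].first][j] = 1;
        a[edges[j].second][j] = 1;  // same cell as above when e_j is a loop
    }
    return a;
}
```

The adjacency list uses space proportional to |V| + |E|, while both matrices allocate a full table, which is one reason the choice depends on the operations needed.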
Graph Traversals
• Like tree traversals, graph traversals visit each node once
• However, we cannot apply tree traversal algorithms to graphs because of cycles and isolated vertices
• One algorithm for graph traversal, called the depth-first search, was developed by John Hopcroft and Robert Tarjan in 1974
• In this algorithm, each vertex is visited and then all the unvisited vertices adjacent to that vertex are visited
• If the vertex has no adjacent vertices, or if they have all been visited, we backtrack to that vertex’s predecessor
• This continues until we return to the vertex where the traversal started
Graph Traversals (continued)
• If any vertices remain unvisited at this point, the traversal restarts at one of the unvisited vertices
• Although not strictly necessary, the algorithm assigns a unique number to each vertex as it is visited, so the vertices are renumbered in visiting order
• Pseudocode for this algorithm is shown on page 395
• Figure 8.3 shows an example of this traversal; the numbers indicate the order in which the nodes are visited; the solid lines indicate the edges traversed during the search
Fig. 8.3 An example of application of the depthFirstSearch() algorithm to a graph
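The traversal just described might be sketched as below. This is not the pseudocode from page 395; it is a minimal C++ version that assumes an adjacency-list input with 0-based vertices, and the names `DfsResult` and `dfsVisit` are made up for the illustration.

```cpp
#include <utility>
#include <vector>

// num[v] is the 1-based order in which v was visited;
// treeEdges lists the edges followed to reach a previously unvisited vertex.
struct DfsResult {
    std::vector<int> num;
    std::vector<std::pair<int, int>> treeEdges;
};

// Visit v, then recursively visit each still unvisited neighbor.
void dfsVisit(const std::vector<std::vector<int>>& adj, int v, int& counter, DfsResult& r) {
    r.num[v] = ++counter;
    for (int u : adj[v]) {
        if (r.num[u] == 0) {              // 0 means "not visited yet"
            r.treeEdges.emplace_back(v, u);
            dfsVisit(adj, u, counter, r);
        }
    }
    // Returning from this call is the backtrack to v's predecessor.
}

DfsResult depthFirstSearch(const std::vector<std::vector<int>>& adj) {
    DfsResult r;
    r.num.assign(adj.size(), 0);
    int counter = 0;
    // The outer loop restarts the traversal while unvisited vertices remain.
    for (int v = 0; v < static_cast<int>(adj.size()); ++v)
        if (r.num[v] == 0) dfsVisit(adj, v, counter, r);
    return r;
}
```

Recursion plays the role of the explicit stack here; for very deep graphs an explicit stack would avoid exhausting the call stack.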
Graph Traversals (continued)
• The algorithm guarantees that we will create a tree (or a forest, which is a set of trees) including the graph’s vertices
• Such a tree is called a spanning tree
• The guarantee is based on the algorithm not processing any edge that leads to an already visited node
• Consequently, some edges are not included in the tree (marked with dashed lines)
• The edges included in the tree are called forward edges; those omitted are called back edges
• In Figure 8.4, we can see this algorithm applied to a digraph, which is a graph where the edges have a direction
Graph Traversals (continued)
Fig. 8.4 The depthFirstSearch() algorithm applied to a digraph
• Notice in this case we end up with a forest of three trees, because the traversal must follow the direction of the edges
• There are a number of algorithms based on depth-first searching
• However, some are more efficient if the underlying mechanism is breadth-first instead
Graph Traversals (continued)
• Recall from our consideration of tree traversals that depth-first traversals used a stack, while breadth-first used queues
• This can be extended to graphs, as the pseudocode on page 397 illustrates
• Figure 8.5 shows this applied to a graph; Figure 8.6 shows the application to a digraph
• In both, the basic operation is to mark all the vertices accessible from a given vertex, placing them in a queue as they are visited
• The first vertex in the queue is then removed, and the process repeated
• No visited nodes are revisited; if a node has no accessible nodes, the next node in the queue is removed and processed
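A queue-based traversal along these lines could look like the following sketch. It is not the pseudocode from page 397; the adjacency-list input and 0-based vertex numbering are assumptions of the example.

```cpp
#include <queue>
#include <vector>

// Returns num[v], the 1-based order in which breadth-first search visits v.
std::vector<int> breadthFirstSearch(const std::vector<std::vector<int>>& adj) {
    std::vector<int> num(adj.size(), 0);  // 0 means "not visited yet"
    int counter = 0;
    std::queue<int> q;
    for (int s = 0; s < static_cast<int>(adj.size()); ++s) {
        if (num[s] != 0) continue;        // restart only from unvisited vertices
        num[s] = ++counter;
        q.push(s);
        while (!q.empty()) {
            int v = q.front();
            q.pop();
            // Mark every unvisited vertex accessible from v and enqueue it.
            for (int u : adj[v]) {
                if (num[u] == 0) {
                    num[u] = ++counter;
                    q.push(u);
                }
            }
        }
    }
    return num;
}
```

Vertices are numbered when they enter the queue, so no vertex is ever enqueued twice.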
Graph Traversals (continued)
Fig. 8.5 An example of application of the breadthFirstSearch() algorithm to a graph
Fig. 8.6 The breadthFirstSearch() algorithm applied to a digraph
Shortest Paths
• A classical problem in graph theory is finding the shortest path between two nodes, with numerous approaches suggested
• The edges of the graph are associated with values denoting such things as distance, time, costs, amounts, etc.
• If we’re determining the distance from a vertex v to a vertex u, we need to keep track of the distances to the intermediate vertices w along the path
• This can be recorded as a label associated with the vertices
• The label may simply be the distance between vertices, or the distance along with the current node’s predecessor in the path
• Methods for finding shortest paths depend on these labels
Shortest Paths (continued)
• Based on how many times the labels are updated, solutions to the shortest path problem fall into two groups
• In label-setting methods, each pass through the vertices that remain to be processed sets one vertex’s label to a value that then remains unchanged
• The main drawback to this is that we cannot process graphs that have negative weights on any edges
• In label-correcting methods, any label can be changed
• This means they can be applied to graphs with negative weights as long as they don’t have negative cycles (a cycle where the sum of the edge weights is a negative value)
Shortest Paths (continued)
• However, a label-correcting method guarantees only that, after processing is complete, the current distances of all vertices indicate the shortest paths
• Most of these forms (both label-setting and label-correcting) can be looked at as part of the same general process, however
• That is the task of finding the shortest paths from one vertex to all the other vertices, the pseudocode being on page 399
• In this algorithm, a label is defined as: label(v) = (currDist(v), predecessor(v))
• Two open issues in the code are the design of the set called toBeChecked and the order new values are assigned to v
• It is the design of the set that impacts both the choice of v and the efficiency of the algorithm
Shortest Paths (continued)
• The distinction between label-setting and label-correcting algorithms is the way the value for vertex v is chosen
• In label-setting algorithms, this is the vertex in the set toBeChecked with the smallest current distance
• In considering label-setting algorithms, one of the first was developed by Edsger Dijkstra in 1956
• In this algorithm, a number of paths from a vertex, v, are tried, and each time the shortest among them is chosen
• This means that a particular path may be extended by adding one more edge to it each time v is checked
• However, if the path turns out to be longer than another path that can be tried, it is abandoned, and the other path is extended instead
Shortest Paths (continued)
• Since the vertices may have more than one outgoing edge, each new edge adds possible paths for exploration
• Thus each vertex is visited, the new paths are started, and the vertex is then not used anymore
• Once all the vertices are visited, the algorithm is done
• Dijkstra’s algorithm is shown on page 400; it is derived from the general algorithm by changing the line
    v = a vertex in toBeChecked;
to
    v = a vertex in toBeChecked with minimal currDist(v);
• It also extends the condition in the if to make permanent the current distance of vertices eliminated from the set
Shortest Paths (continued)
• Notice that the set’s structure is not indicated; recall it is the structure that determines efficiency
• Figure 8.7 illustrates this for the graph in part (a)
Fig. 8.7 An execution of DijkstraAlgorithm()
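Since the structure of toBeChecked is left open, the sketch below picks one common option, a binary heap via `std::priority_queue`, purely as an assumption. The `Arc` and `Labels` types are hypothetical names for this example, and all weights are assumed to be non-negative.

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

struct Arc { int to; long long weight; };  // hypothetical edge record
const long long INF = std::numeric_limits<long long>::max();

// label(v) = (currDist(v), predecessor(v)), stored as two parallel arrays.
struct Labels {
    std::vector<long long> currDist;
    std::vector<int> predecessor;
};

Labels dijkstra(const std::vector<std::vector<Arc>>& adj, int first) {
    Labels l;
    l.currDist.assign(adj.size(), INF);
    l.predecessor.assign(adj.size(), -1);
    l.currDist[first] = 0;
    typedef std::pair<long long, int> Item;  // (currDist, vertex)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> toBeChecked;
    toBeChecked.push(Item(0, first));
    while (!toBeChecked.empty()) {
        Item top = toBeChecked.top();         // vertex with minimal currDist
        toBeChecked.pop();
        int v = top.second;
        if (top.first > l.currDist[v]) continue;  // outdated entry, label already set
        for (const Arc& a : adj[v]) {
            long long candidate = l.currDist[v] + a.weight;
            if (candidate < l.currDist[a.to]) {
                l.currDist[a.to] = candidate;
                l.predecessor[a.to] = v;
                toBeChecked.push(Item(candidate, a.to));
            }
        }
    }
    return l;
}
```

Instead of decreasing a key inside the heap, this version pushes a fresh entry and skips outdated ones when they surface, which keeps the code short at the cost of a somewhat larger heap.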
Shortest Paths (continued)
• As a label-setting algorithm, Dijkstra’s approach may fail when negative weights are used in graphs
• To deal with that, a label-correcting algorithm is needed
• One of the first label-correcting algorithms was developed by Lester R. Ford, Jr. in the late 1950s
• It uses the same technique as Dijkstra’s method to set the current distances, but postpones determining the shortest distance for any vertex until the entire graph is processed
• While it is capable of handling graphs with negative weights, it cannot deal with negative cycles
• In the algorithm, all edges are watched in an attempt to find an improvement for the current distance of the vertices
Shortest Paths (continued)
• The pseudocode for the algorithm is shown on page 402
• To impose an order on the monitoring, an alphabetically ordered sequence can be used
• That way the algorithm can go through the list repeatedly and adjust any vertex’s current distance as needed
• Figure 8.8 contains an example of this; note that the graph does include negatively weighted edges
• While a vertex may change its current distance more than once during the same iteration, when the algorithm is done each vertex can be reached by the shortest path from the starting vertex
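The repeated scan over all edges can be sketched as follows. This is an illustration rather than the pseudocode from page 402: the `WeightedEdge` type is hypothetical, and the cap of |V| passes is an added safeguard (an assumption of this sketch) so that the function still terminates if a negative cycle is present.

```cpp
#include <limits>
#include <vector>

struct WeightedEdge { int from, to; long long weight; };  // hypothetical edge record
const long long INF = std::numeric_limits<long long>::max();

// Repeatedly scans every edge and corrects currDist until nothing improves.
std::vector<long long> fordAlgorithm(int n, const std::vector<WeightedEdge>& edges, int first) {
    std::vector<long long> currDist(n, INF);
    currDist[first] = 0;
    bool improved = true;
    // Without negative cycles the labels settle within |V| - 1 passes.
    for (int pass = 0; improved && pass < n; ++pass) {
        improved = false;
        for (const WeightedEdge& e : edges) {
            if (currDist[e.from] == INF) continue;  // e.from not reached yet
            if (currDist[e.from] + e.weight < currDist[e.to]) {
                currDist[e.to] = currDist[e.from] + e.weight;
                improved = true;
            }
        }
    }
    return currDist;
}
```

Note that a label can be corrected several times within one pass, which is exactly what distinguishes this from a label-setting method.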
Shortest Paths (continued)
Fig. 8.8 FordAlgorithm() applied to a digraph with negative weights
• In the case of Dijkstra’s algorithm, we observed that the efficiency can be improved by the choice of data structure
• This in turn impacts the way the edges and vertices are scanned
Shortest Paths (continued)
• This observation also holds for label-correcting algorithms; in particular, FordAlgorithm() specifies no order for edge checking
• In the example of Figure 8.8, the approach was to visit all adjacency lists of all vertices in each iteration
• However, this requires that all the edges are checked every time, which is inefficient
• A more sensible organization of the vertices can reduce the number of visits per vertex
• The generic algorithm on page 399 suggests an improvement by explicitly accessing toBeChecked
• In FordAlgorithm() this structure is used implicitly, and then only as the set of all vertices
Shortest Paths (continued)
• So based on this, we can derive a general label-correcting algorithm, shown in pseudocode on page 403
• As indicated before, the efficiency of the algorithm depends directly on the data structure used for toBeChecked
• One possibility is a queue, which was the basis for one of the earliest implementations
• With a queue, as a vertex v is removed, the current distance to each of its neighbors is checked
• If any of those distances is updated, the vertex whose distance was changed is added to the queue
• While straightforward, this approach can sometimes reevaluate the same labels excessively
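A queue-based version of the label-correcting idea might look like this sketch. It is not the pseudocode from page 403; the `Arc` type and the `inQueue` flags are assumptions of the example, and the graph is assumed to contain no negative cycle reachable from the start vertex.

```cpp
#include <limits>
#include <queue>
#include <vector>

struct Arc { int to; long long weight; };  // hypothetical edge record
const long long INF = std::numeric_limits<long long>::max();

std::vector<long long> labelCorrectingQueue(const std::vector<std::vector<Arc>>& adj, int first) {
    std::vector<long long> currDist(adj.size(), INF);
    std::vector<bool> inQueue(adj.size(), false);
    std::queue<int> toBeChecked;
    currDist[first] = 0;
    toBeChecked.push(first);
    inQueue[first] = true;
    while (!toBeChecked.empty()) {
        int v = toBeChecked.front();
        toBeChecked.pop();
        inQueue[v] = false;
        for (const Arc& a : adj[v]) {
            if (currDist[v] + a.weight < currDist[a.to]) {
                currDist[a.to] = currDist[v] + a.weight;
                // Re-examine a.to later, unless it is already waiting.
                if (!inQueue[a.to]) {
                    toBeChecked.push(a.to);
                    inQueue[a.to] = true;
                }
            }
        }
    }
    return currDist;
}
```

Only vertices whose labels changed are revisited, but a vertex can still re-enter the queue many times.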
Shortest Paths (continued)
• Figure 8.9 illustrates this problem for the graph of Figure 8.8a
Fig. 8.9 An execution of labelCorrectingAlgorithm(), which uses a queue
• As can be seen, a number of vertices are updated multiple times
Shortest Paths (continued)
• To avoid this situation, a deque can be used in place of the queue
• In this approach, vertices needing to be checked for the first time are added at the end; otherwise they are placed in front
• The reasoning behind this is that if a given vertex, v, is included for the first time, the vertices accessible from it have yet to be processed, so they will be processed after v
• However, if v has been processed, those vertices are likely still in the list awaiting processing, so putting v in front may avoid unnecessary updates
• Figure 8.10 shows the result of using a deque instead of a queue
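The deque rule can be sketched as a small change to a queue-based label-correcting loop. The `Arc` type and the `seenBefore` and `inDeque` flags are assumptions made for this illustration, and the graph is assumed to have no negative cycle reachable from the start vertex.

```cpp
#include <deque>
#include <limits>
#include <vector>

struct Arc { int to; long long weight; };  // hypothetical edge record
const long long INF = std::numeric_limits<long long>::max();

std::vector<long long> labelCorrectingDeque(const std::vector<std::vector<Arc>>& adj, int first) {
    std::vector<long long> currDist(adj.size(), INF);
    std::vector<bool> inDeque(adj.size(), false);
    std::vector<bool> seenBefore(adj.size(), false);
    std::deque<int> toBeChecked;
    currDist[first] = 0;
    toBeChecked.push_back(first);
    inDeque[first] = seenBefore[first] = true;
    while (!toBeChecked.empty()) {
        int v = toBeChecked.front();
        toBeChecked.pop_front();
        inDeque[v] = false;
        for (const Arc& a : adj[v]) {
            if (currDist[v] + a.weight >= currDist[a.to]) continue;
            currDist[a.to] = currDist[v] + a.weight;
            if (inDeque[a.to]) continue;       // already waiting to be checked
            if (seenBefore[a.to])
                toBeChecked.push_front(a.to);  // returning vertex goes in front
            else
                toBeChecked.push_back(a.to);   // first-time vertex goes to the end
            inDeque[a.to] = seenBefore[a.to] = true;
        }
    }
    return currDist;
}
```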
Shortest Paths (continued)
Fig. 8.10 An execution of labelCorrectingAlgorithm(), which applies a deque
• The use of a deque does suffer from one problem, however
• Its worst case performance is exponential in the number of vertices
Shortest Paths (continued)
• However, the average case is about 60% better than the queue version of the same algorithm
• A variation of this approach uses two queues separately, rather than combined in a deque
• In this variation, vertices enqueued for the first time are placed in the first queue; otherwise they are placed in the second
• Vertices are then dequeued from the first queue if it is not empty; otherwise they are taken from the second
• The threshold algorithm is another variation of the label-correcting method that uses two lists
• Vertices are removed from the first list for processing
Shortest Paths (continued)
• A vertex will be added to the end of the first list if the value of its label is below the threshold level
• Otherwise it will be added to the second list
• If the first list becomes empty, the threshold is modified to a value greater than the minimum label value of all vertices in the second list
• Then those vertices whose labels are less than the new threshold are moved from the second list to the first list
• Yet another approach is the small label first method
• In this method, a vertex is placed at the front of the deque if its label is smaller than the label of the current front of the deque; otherwise it is placed at the rear
Shortest Paths (continued)
• All-to-All Shortest Path Problem
– Given the issues of finding the shortest path from one vertex to another, the problem of finding the shortest paths between every pair of vertices might seem daunting
– However, a method developed by Stephen Warshall in 1962 does it fairly easily, as long as an adjacency matrix that provides edge weights is available
– This technique can also handle negative edge weights and the algorithm is shown on page 406
– An example of the algorithm’s application, together with the accompanying adjacency matrix, is shown in Figure 8.11 on page 407
– The algorithm can also detect cycles if the diagonal of the matrix is initialized to ∞ instead of 0
– If any of the diagonal values get changed, the graph contains a cycle
Shortest Paths (continued)
• All-to-All Shortest Path Problem (continued)
– As it turns out, if an initial value of ∞ is not changed during processing, then one vertex cannot reach the other
– The algorithm’s simplicity is reflected in the determination of its complexity; there are three nested loops, each executed |V| times, so it is $O(|V|^3)$
– This is adequate for dense, near-complete graphs, but if they are sparse, it may be better to use a one-to-all method applied to each vertex
– Generally this should be a label-setting algorithm, but recall that these types of routines cannot handle negative edge weights
– Fortunately, there are transformations available that eliminate the negative weights while preserving the shortest paths of the original
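The three nested loops of the all-to-all method can be sketched as below. This is a minimal illustration rather than the code from page 406; it assumes the matrix holds edge weights with a sentinel `INF` for missing edges, and it works in place on that matrix.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

const long long INF = std::numeric_limits<long long>::max();  // stands for "no path"

// weight[i][j] starts as the weight of the edge from v_i to v_j (INF if absent)
// and ends as the length of the shortest path from v_i to v_j.
void warshallFloyd(std::vector<std::vector<long long>>& weight) {
    const std::size_t n = weight.size();
    for (std::size_t k = 0; k < n; ++k)          // allow v_k as an intermediate vertex
        for (std::size_t i = 0; i < n; ++i) {
            if (weight[i][k] == INF) continue;   // v_i cannot reach v_k
            for (std::size_t j = 0; j < n; ++j) {
                if (weight[k][j] == INF) continue;
                if (weight[i][k] + weight[k][j] < weight[i][j])
                    weight[i][j] = weight[i][k] + weight[k][j];
            }
        }
}
```

If the diagonal is initialized to `INF` rather than 0, a diagonal entry that changes signals a cycle through that vertex, and any entry still equal to `INF` at the end marks an unreachable pair.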
Cycle Detection
• Numerous algorithms rely on their ability to detect cycles in graphs
• Our consideration of the Warshall-Floyd algorithm in the previous section demonstrated that it can detect cycles
• However, its cubic order makes it too inefficient to use in all circumstances, so other methods have to be considered
• One algorithm, based on the depthFirstSearch() routine, works well for undirected graphs
• The pseudocode for this is shown on page 408
• Digraphs complicate matters, because the spanning subtrees might have edges between them (called side edges)
Cycle Detection (continued)
• A cycle is indicated only if two vertices already included in the same subtree are joined by a back edge
• To take this case into account, a number greater than any number generated in subsequent searches is assigned to the current vertex after all its descendants have been visited
• A cycle is then detected if a vertex is about to be joined by an edge with a vertex having a lower number
• With this modification, the algorithm appears in pseudocode as the algorithm on page 409
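The numbering trick for digraphs can be sketched as follows. This is not the pseudocode from page 409; the constant `DONE` stands in for the "number greater than any other" and, like the function names, is an assumption of this example.

```cpp
#include <limits>
#include <vector>

// Assigned to a vertex once all its descendants have been visited.
const int DONE = std::numeric_limits<int>::max();

// Returns true if a cycle is reachable from v.
bool cycleFrom(const std::vector<std::vector<int>>& adj, int v, int& counter,
               std::vector<int>& num) {
    num[v] = ++counter;
    for (int u : adj[v]) {
        if (num[u] == 0) {
            if (cycleFrom(adj, u, counter, num)) return true;
        } else if (num[u] != DONE) {
            return true;  // u is still being processed, so edge vu closes a cycle
        }
        // num[u] == DONE: a side or forward edge into a finished subtree, no cycle.
    }
    num[v] = DONE;
    return false;
}

bool digraphHasCycle(const std::vector<std::vector<int>>& adj) {
    std::vector<int> num(adj.size(), 0);  // 0 means "not visited yet"
    int counter = 0;
    for (int v = 0; v < static_cast<int>(adj.size()); ++v)
        if (num[v] == 0 && cycleFrom(adj, v, counter, num)) return true;
    return false;
}
```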
Cycle Detection (continued)
• Union-Find Problem
– We’ve seen that the depth-first search guarantees creating a spanning tree with no cycles
– However, a problem occurs when the depth-first search algorithm is modified to determine if a specific edge is part of a cycle
– If the modified algorithm is applied to each edge separately, the algorithm could become $O(|V|^4)$ for dense graphs
– This is unacceptable, and a better approach needs to be investigated
– The basic task is to determine if two vertices are members of the same set
– Two procedures are needed for this: first, to find the set to which a vertex v belongs, and second, to unite two sets into one if v belongs to one set and vertex w belongs to another
Cycle Detection (continued)
• Union-Find Problem (continued)
– This task is known as the union-find problem
– Circular-linked lists are used to implement the sets involved in solving the union-find problem
– The lists are identified by a vertex which is the root of the tree containing the vertices in that list
– The vertices are numbered from 0 to |V| - 1, which become indices to three arrays
• root[] stores the index of a vertex identifying a set of vertices
• next[] indicates the next vertex on a list
• length[] indicates the number of vertices in a list
– The circular lists are used to enable combining the lists immediately
– This is shown in Figure 8.12
Cycle Detection (continued)
• Union-Find Problem (continued)
Fig. 8.12 Concatenating two circular linked lists
– The two lists are merged into one by interchanging next pointers
– However, all the vertices now have to have the same root, so the vertices of one of the lists need to have their root indicators changed
– This should be the shorter of the two lists, which can be determined by the length[] array
– Since the union operation performs all the needed tasks, the find operation is trivial
Cycle Detection (continued)
• Union-Find Problem (continued)
– By constantly updating the root[] array, the set to which a vertex v belongs can be identified immediately because it is the set identified by root[v]
– Thus after initializations, the union algorithm can be defined as shown in pseudocode on page 410
– An application of this is shown in Figure 8.13
– After the initialization completes, the |V| one-node lists are as shown in Figure 8.13a
– These smaller ones are merged into larger ones by repeated execution of the union algorithm, and the arrays updated as seen in Figure 8.13b-d
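The three arrays and the list-splicing step might be put together as in the sketch below. It is an illustration, not the pseudocode from page 410; the class name `UnionFind` and the member function names are assumptions of the example.

```cpp
#include <utility>
#include <vector>

class UnionFind {
public:
    // Initially every vertex forms its own one-node circular list.
    explicit UnionFind(int n) : root(n), next(n), length(n, 1) {
        for (int i = 0; i < n; ++i) root[i] = next[i] = i;
    }

    // find is trivial because root[] is kept up to date by unite.
    int find(int v) const { return root[v]; }

    // Merges the sets of u and v; returns false if they were already the same set.
    bool unite(int u, int v) {
        int shorter = root[u], longer = root[v];
        if (shorter == longer) return false;
        if (length[longer] < length[shorter]) std::swap(shorter, longer);
        length[longer] += length[shorter];
        // Change the root indicator of every vertex on the shorter list.
        root[shorter] = longer;
        for (int j = next[shorter]; j != shorter; j = next[j]) root[j] = longer;
        // Interchanging the two next pointers concatenates the circular lists.
        std::swap(next[shorter], next[longer]);
        return true;
    }

private:
    std::vector<int> root, next, length;
};
```

A `false` result from `unite` is the signal used for cycle detection: the edge joins two vertices that are already in the same set.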
Cycle Detection (continued)
• Union-Find Problem (continued)
Fig. 8.13 An example of application of union() to merge lists
Spanning Trees
• Consider an airline that has routes between seven cities represented as the graph in Figure 8.14a
Fig. 8.14 A graph representing (a) the airline connections between seven cities and (b–d) three possible sets of connections
• If economic hardships force the airline to cut routes, which ones should be kept to preserve a route to each city, if only indirectly?
• One possibility is shown in Figure 8.14b
Spanning Trees (continued)
• However, we want to make sure we have the minimum connections necessary to preserve the routes
• To accomplish this, a spanning tree should be used, specifically one created using depthFirstSearch()
• There is a possibility of multiple spanning trees (Figure 8.14c-d), but each of these has the minimum number of edges
• We don’t know which of these might be optimal, since we haven’t taken distances into account
• The airline, wanting to minimize costs, will want to use the shortest distances for the connections
• So what we want to find is the minimum spanning tree, where the sum of the edge weights is minimal
Spanning Trees (continued)
• The problem we looked at earlier involving finding a spanning tree in a simple graph is a case of this where every edge weight = 1
• So each spanning tree is a minimum spanning tree in a simple graph
• There are a number of solutions to the minimum spanning tree problem, and we will consider two
• One popular algorithm is Kruskal’s algorithm, developed by Joseph Kruskal in 1956
• It orders the edges by weight, and then checks each in turn to see if it can be added to the tree under construction
• An edge is added if its inclusion doesn’t create a cycle
Spanning Trees (continued)
• The algorithm is as follows:
KruskalAlgorithm(weighted connected undirected graph)
    tree = null;
    edges = sequence of all edges of graph sorted by weight;
    for (i = 1; i ≤ |E| and |tree| < |V| – 1; i++)
        if ei from edges does not form a cycle with edges in tree
            add ei to tree;
• A step-by-step example of the application of this algorithm is shown in Figure 8.15ba-bf on page 413
• It is not necessary to order the edges in order to build a minimum spanning tree; any order of edges can be used
• An algorithm developed by Dijkstra in 1960 (and independently by Robert Kalaba) pursues this approach
Spanning Trees (continued)
• This algorithm is shown below:
DijkstraMethod(weighted connected undirected graph)
    tree = null;
    edges = an unsorted sequence of all edges of graph;
    for i = 1 to |E|
        add ei to tree;
        if there is a cycle in tree
            remove an edge with maximum weight from this only cycle;
• In this algorithm, edges are added to the tree one-by-one
• If a cycle results, the edge in the cycle with maximum weight is removed
• The use of this method is shown in Figure 8.15ca-cl on page 414
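Of the two methods in this section, Kruskal's maps most directly onto code once a cycle test is available. The sketch below is one possible rendering: the `WeightedEdge` type is hypothetical, and the cycle test uses a small parent-array set structure with path halving, which is an assumption of this example rather than the circular-list design discussed under cycle detection.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

struct WeightedEdge { int u, v; int weight; };  // hypothetical edge record

// Follows parent links to the representative of x's set (with path halving).
int findSet(std::vector<int>& parent, int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

// Returns the edges of a minimum spanning tree of a connected graph on n vertices.
std::vector<WeightedEdge> kruskal(int n, std::vector<WeightedEdge> edges) {
    std::sort(edges.begin(), edges.end(),
              [](const WeightedEdge& a, const WeightedEdge& b) { return a.weight < b.weight; });
    std::vector<int> parent(n);
    std::iota(parent.begin(), parent.end(), 0);  // every vertex starts in its own set
    std::vector<WeightedEdge> tree;
    for (const WeightedEdge& e : edges) {
        if (static_cast<int>(tree.size()) == n - 1) break;  // tree is complete
        int ru = findSet(parent, e.u), rv = findSet(parent, e.v);
        if (ru != rv) {        // different sets, so e cannot close a cycle
            parent[ru] = rv;
            tree.push_back(e);
        }
    }
    return tree;
}
```

Sorting dominates the running time, so the sketch runs in O(|E| log |E|) apart from the near-constant set operations.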
Connectivity
• In many graph problems we want to find a path from a given vertex to any other vertex
• In undirected graphs this means there are no separate pieces in the graph (subgraphs)
• In a digraph, we may be able to get to some vertices in a particular direction, but not return to the starting vertex
Connectivity (continued)
• Connectivity in Undirected Graphs
– An undirected graph is considered to be connected if there is a path between any two vertices of the graph
– We can use the depth-first search algorithm to determine connectivity if the while loop heading is removed
– When the algorithm completes, we check the edges list to see if it contains all the vertices of the graph
– Connectivity is described in terms of degrees; a graph is more or less connected depending on the number of different paths between vertices
– An n-connected graph has at least n different paths between any two vertices
– This means there are n paths between the vertices that have no intermediate vertices in common
Connectivity (continued)
• Connectivity in Undirected Graphs (continued)
– One special type of graph is the biconnected (or 2-connected) graph, which has at least two non-overlapping paths between any two vertices
– If we can find a vertex that always has to be included in the path between vertices a and b, then the graph is not biconnected
– Removing this vertex, and its incident edges, will split the graph into separate subgraphs
– These vertices are referred to as cut-vertices or articulation points
– If the graph can be split by removing an edge, the edge is referred to as a cut-edge or bridge
– If connected subgraphs have no articulation points or bridges, they are called blocks (if there are at least two vertices, they are biconnected components)
Connectivity (continued)
• Connectivity in Undirected Graphs (continued)
– We can detect articulation points by extending the depth-first algorithm to create a tree with forward and back edges
– A vertex in the resulting tree is an articulation point if it has at least one subtree unconnected with any of its predecessors by a back edge
– This is illustrated in Figure 8.16 on page 417
– A special case of articulation points occurs when the vertex involved is a root with more than one child in the tree
– In the case of the graph in Figure 8.16, a is the root, and has three incident edges; however, only one becomes a forward edge
– This is because the other two are processed by the depth-first search before it backtracks to a
Connectivity (continued)
• Connectivity in Undirected Graphs (continued)
– Consequently, if a is reached again, there will be no untried edge, whereas if a were a cut-vertex there would be at least one such edge
– So for a given vertex, v, the vertex is an articulation point if:
• v is the root of a depth-first tree and has more than one child in the tree OR
• at least one of v’s subtrees includes no vertex connected by a back edge with any of v’s predecessors
– To find articulation points, a parameter pred(v) is used, defined as the smallest of num(v) and the numbers of the vertices connected by a back edge with either v or a descendant of v
– A stack is used to store the currently processed edges; after the cut-vertex is identified, the graph edges comprising the block are output
– The pseudocode for the algorithm is on pages 416 and 418
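The two conditions for an articulation point can be checked during a single depth-first search, as in the sketch below. This is a simplified illustration, not the algorithm from pages 416 and 418: it only collects the cut-vertices and leaves out the edge stack used to output blocks. It assumes a simple undirected graph given as adjacency lists, and the function names are made up for the example.

```cpp
#include <algorithm>
#include <set>
#include <vector>

// num[v]: visiting order of v; pred[v]: smallest number reachable from v's
// subtree through tree edges followed by at most one back edge.
void findCutVertices(const std::vector<std::vector<int>>& adj, int v, int parent,
                     int& counter, std::vector<int>& num, std::vector<int>& pred,
                     std::set<int>& cut) {
    num[v] = pred[v] = ++counter;
    int children = 0;
    for (int u : adj[v]) {
        if (num[u] == 0) {                       // tree (forward) edge
            ++children;
            findCutVertices(adj, u, v, counter, num, pred, cut);
            pred[v] = std::min(pred[v], pred[u]);
            // u's subtree has no back edge to a predecessor of v.
            if (parent != -1 && pred[u] >= num[v]) cut.insert(v);
        } else if (u != parent) {                // back edge
            pred[v] = std::min(pred[v], num[u]);
        }
    }
    if (parent == -1 && children > 1) cut.insert(v);  // root with several children
}

std::set<int> articulationPoints(const std::vector<std::vector<int>>& adj) {
    std::vector<int> num(adj.size(), 0), pred(adj.size(), 0);
    std::set<int> cut;
    int counter = 0;
    for (int v = 0; v < static_cast<int>(adj.size()); ++v)
        if (num[v] == 0) findCutVertices(adj, v, -1, counter, num, pred, cut);
    return cut;
}
```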
Connectivity (continued)
• Connectivity in Directed Graphs
– With directed graphs, defining connectedness depends on whether or not the direction of the edges is considered
– A weakly connected digraph is one where the undirected graph with the same edges and vertices is connected
– A strongly connected digraph has, for every pair of vertices, a path between them in both directions
– A digraph may not be strongly connected, yet contain strongly connected components (SCCs)
– These are subsets of vertices in the digraph that of themselves represent a strongly connected digraph
Connectivity (continued)
• Connectivity in Directed Graphs (continued)
– Depth-first search can also be used in determining SCCs
– The root of the SCC is the first vertex of the SCC for which the depth-first search is applied
– Because every vertex in the SCC is reachable from this root, the number assigned to the root will be less than the number of any other vertex in the SCC
– Only after those vertices are visited will the depth-first search backtrack to the root
– At that point the SCC that is accessible from this root can be output
– The problem then is how to find these roots in the digraph, which is a problem similar to finding cut-vertices in an undirected graph
Connectivity (continued)
• Connectivity in Directed Graphs (continued)
– To do this, the pred(v) parameter is used, which is the lower of num(v) and pred(u), u being a vertex reachable from v and in the same SCC
– Of course this leads to the question of how we can determine whether two vertices are in the same SCC before the SCC itself has been determined
– This can be done using a stack to store the vertices of all SCCs under construction
– The topmost vertices will be in the current SCC
– This way we know what vertices are already in the SCC even though the construction isn’t finished
– The algorithm, attributed to Robert Tarjan, is shown on page 419; an example of the execution is shown in Figure 8.17 on page 420
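The stack-based idea above can be sketched in C++ as follows. This is an illustrative version of Tarjan's approach under stated assumptions, not a transcription of the book's page 419 pseudocode: SccFinder and stronglyConnectedComponents are hypothetical names, and the digraph is assumed to be stored as adjacency lists over vertices 0..n-1.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: collects the SCCs of a digraph.
struct SccFinder {
    const std::vector<std::vector<int>>& adj;
    std::vector<int> num, pred;          // num(v) and pred(v) as described above
    std::vector<int> stack;              // vertices of all SCCs under construction
    std::vector<bool> onStack;
    std::vector<std::vector<int>> components;
    int counter = 0;

    explicit SccFinder(const std::vector<std::vector<int>>& g)
        : adj(g), num(g.size(), 0), pred(g.size(), 0), onStack(g.size(), false) {}

    void dfs(int v) {
        num[v] = pred[v] = ++counter;
        stack.push_back(v);
        onStack[v] = true;
        for (int u : adj[v]) {
            assert(u >= 0 && u < (int)adj.size());
            if (num[u] == 0) {                   // unvisited: search deeper
                dfs(u);
                pred[v] = std::min(pred[v], pred[u]);
            } else if (onStack[u]) {             // u is in an SCC still being built
                pred[v] = std::min(pred[v], num[u]);
            }
        }
        if (pred[v] == num[v]) {                 // v is the root of an SCC
            std::vector<int> scc;
            int w;
            do {                                 // pop the topmost vertices
                w = stack.back();
                stack.pop_back();
                onStack[w] = false;
                scc.push_back(w);
            } while (w != v);
            std::sort(scc.begin(), scc.end());
            components.push_back(scc);
        }
    }
};

std::vector<std::vector<int>> stronglyConnectedComponents(
        const std::vector<std::vector<int>>& adj) {
    SccFinder finder(adj);
    for (int v = 0; v < (int)adj.size(); ++v)
        if (finder.num[v] == 0) finder.dfs(v);
    return finder.components;
}
```

An SCC is output only when the search backtracks to its root, so components appear in the order their roots finish.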
Topological Sort
• A topological sort of a directed graph is a linear ordering of its vertices so that, for every edge uv, u comes before v in the ordering
• For instance, the vertices of the graph may represent tasks to be performed
• The edges may represent constraints that one task must be performed before another
• In this application, a topological ordering is just a valid sequence for the tasks
• A topological ordering is possible if and only if the graph has no directed cycles, that is, if it is a directed acyclic graph (DAG)
Topological Sort (continued)
• The algorithm for the topological sort is a simple one:
topologicalSort(digraph)
    for i = 1 to |V|
        find a minimal vertex v;
        num(v) = i;
        remove from digraph vertex v and all edges incident with v;
• As can be seen, we locate a vertex, v, with no outgoing edges
• Such a vertex is called a minimal vertex or sink
• We then remove any edges leading from a vertex to v
• Figure 8.18 shows this process; the graph in Figure 8.18a goes through a series of deletions (Figure 8.18b-f) to produce the sequence g, e, b, f, d, c, a
• Because sinks are numbered first, this sequence lists every vertex after its successors; reading it in reverse gives an ordering in which u comes before v for every edge uv
Topological Sort (continued)
Fig. 8.18 Executing a topological sort
Topological Sort (continued)
• It is not actually necessary to delete the edges and vertices from a digraph during this processing
• If we can determine that all successors of the vertex v have been processed, they can be considered deleted
• This is once again handled by applying the depth-first search techniques seen earlier
• Basically, if the search backtracks to v, then all its successors can be assumed to have already been searched
• The pseudocode for this algorithm is shown on pages 421 and 423
• The table (Figure 8.18h) shows how the numbers are assigned for each vertex of the graph of Figure 8.18a
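The depth-first variant can be sketched in C++ as follows. This is a minimal illustration rather than the book's pseudocode: the function names visit and topologicalSort are assumptions, the digraph is stored as adjacency lists over vertices 0..n-1, and the sketch also reports failure when it meets a directed cycle. A vertex is recorded when the search backtracks to it, and the recorded sequence is reversed so that u precedes v for every edge uv.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// state: 0 = unvisited, 1 = on the current search path, 2 = finished
static bool visit(int v, const std::vector<std::vector<int>>& adj,
                  std::vector<int>& state, std::vector<int>& finished) {
    state[v] = 1;
    for (int u : adj[v]) {
        assert(u >= 0 && u < (int)adj.size());
        if (state[u] == 1) return false;     // edge back into the path: a cycle
        if (state[u] == 0 && !visit(u, adj, state, finished)) return false;
    }
    state[v] = 2;
    finished.push_back(v);                   // all successors of v are processed
    return true;
}

// Fills `order` with a topological ordering and returns true,
// or returns false if the digraph contains a directed cycle.
bool topologicalSort(const std::vector<std::vector<int>>& adj,
                     std::vector<int>& order) {
    std::vector<int> state(adj.size(), 0);
    order.clear();
    for (int v = 0; v < (int)adj.size(); ++v) {
        if (state[v] == 0 && !visit(v, adj, state, order)) {
            order.clear();
            return false;
        }
    }
    std::reverse(order.begin(), order.end()); // sinks were recorded first
    return true;
}
```

Reversing at the end is a design choice of this sketch; keeping the unreversed sequence would correspond to numbering the sinks first, as in the deletion-based version.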
Networks
• Maximum Flows
– A network is a directed graph where each edge has a capacity and each edge receives a flow
– The amount of flow on an edge cannot exceed the capacity of the edge
– A flow must satisfy the restriction that the amount of flow into a node equals the amount of flow out of it, except when it is a source, which has more outgoing flow, or sink, which has more incoming flow
– A network can be used to model traffic in a road system, fluids in pipes, currents in an electrical circuit, or anything similar in which something travels through a network of nodes
– Delbert R. Fulkerson and Lester R. Ford, Jr. developed the first computational models of these flow problems in 1954
Networks (continued)
• Maximum Flows (continued)
– The central problem of these network models is to maximize the flow over the edges from the source to the sink
– This is referred to as the maximum flow (or max-flow) problem
– Figure 8.19 illustrates this problem for a small water-flow network of 8 pipes and 6 pumping stations; the edges are labeled with the capacity of the pipes in thousands of gallons
Figure 8.19 A pipeline with eight pipes and six pumping stations
Networks (continued)
• Maximum Flows (continued)
– A central aspect of the Ford-Fulkerson approach is the concept of a cut
– A cut separating s and t is a set of edges between two sets, X and X̄
– Every vertex of the graph is a member of one of these two sets; the source, s, is in X and the sink, t, in X̄
– In Figure 8.19, if we choose X = {s, a}, then X̄ = {b, c, d, t}, and the cut is the set of edges {(a, b), (s, c), (s, d)}
– Thus, if all these edges are cut, there is no way to get from s to t
– Now we can define the capacity of the cut as the sum of the capacities of the edges in this cut set, so
cap{(a,b),(s,c),(s,d)} = cap(a,b) + cap(s,c) + cap(s,d) = 19
Networks (continued)
• Maximum Flows (continued)
– From this, we can infer the max-flow min-cut theorem:
Theorem: In any network, the maximal flow from s to t is equal to the minimal capacity of any cut.
– This makes it fairly clear that while there may be cuts with larger capacity, it is the cut with the smallest capacity that determines the flow of the network
– For instance, although the capacity of our earlier cut was 19, the two edges coming to the sink can’t transfer more than 9 units
– So we have to search all the cuts to find the one with the smallest capacity, and transfer through this as many units as the capacity allows
– To achieve this, we’ll utilize a new idea
Networks (continued)
• Maximum Flows (continued)
– A flow-augmenting path is a sequence of edges from s to t such that, for every edge e in the path, the flow f(e) is less than the capacity cap(e) if e is a forward edge, and greater than 0 if e is a backward edge
– This means the path has excess capacity that isn’t being used
– However, if the flow for any edge in that path reaches capacity, the flow cannot be augmented
– The path also does not have to exclusively use forward edges, so in Figure 8.19, we have paths s, a, b, t and s, d, b, t
– Backward edges push back against the flow, decreasing the total flow of the network
– Eliminating them can increase the overall flow in the network, so the goal of augmenting isn’t finished until the flows for those edges are 0
Networks (continued)
• Maximum Flows (continued)
– The task now is to find an augmenting path; however there may be a large number of paths from s to t, so this is a nontrivial problem
– Ford and Fulkerson devised the first systematic algorithm for this in 1957
– The first phase of the algorithm, labeling, assigns each vertex of the graph a label, defined as the pair label(v) = (parent(v), flow(v))
– parent(v) is the node accessing v, and flow(v) is the flow amount from s to v
– Forward and backward edges are treated differently; if v accesses vertex u via a forward edge, label(u) = (v+,min(flow(v),slack(edge(vu))))
– Here, slack(edge(vu)) = cap(edge(vu)) – f(edge(vu)); this is the difference between the capacity of the edge vu and its current flow
Networks (continued)
• Maximum Flows (continued)
– Now if the edge between v and u is backward, then label(u) = (v–, min(flow(v), f(edge(uv)))), where
flow(v) = min(flow(parent(v)), slack(edge(parent(v)v)))
– Once a vertex is labeled, it is stored for subsequent processing
– An edge vu is labeled only if it allows some more flow to be added
– This is the case for forward edges when slack(edge(vu)) > 0, and for backward edges when f(edge(uv)) > 0
– However, finding this path may not complete the whole procedure
– It is only finished if we are stuck somewhere in the network and unable to label any more edges
Networks (continued)
• Maximum Flows (continued)
– If we reach the sink, the flows in the augmenting path are adjusted by increasing flows on the forward edges, and decreasing them on the backward ones
– Then we restart the task and look for another augmenting path
– The pseudocode for the algorithm is presented on page 425
– In examining the algorithm, notice there is no particular mechanism specified for scanning the graph
– The question is in what order vertices should be added to the labeled structure and detached from it; this implementation uses push and pop operations to process it depth-first
– The operation of this algorithm is shown in Figure 8.20 on pages 426 and 427
Networks (continued)
• Maximum Flows (continued)
– A major issue with this implementation is the depth-first approach, which has a significant impact on its efficiency
– Since the depth-first algorithm tries to reach the sink as soon as possible, we may end up choosing the same augmenting path several times as the algorithm proceeds
– A better approach is to try and find the shortest augmenting path, which suggests a breadth-first approach
– This concept was developed by Jack Edmonds and Richard Karp in 1972
– It uses the same approach as the Ford-Fulkerson algorithm, but the labeled structure is now a queue
– This modified approach is illustrated in Figure 8.22 on page 429
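The breadth-first variant can be sketched in C++ as follows. This is a minimal illustration under stated assumptions, not the book's implementation: the function name maxFlowEdmondsKarp is hypothetical, the network is given as a matrix of capacities (0 where no edge exists), and instead of explicit (parent, flow) labels the sketch keeps a matrix of residual capacities, in which a positive entity in the reverse direction plays the role of a backward edge that already carries flow.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <queue>
#include <vector>

// residual[v][u] starts as cap(edge(vu)); it is taken by value and updated.
int maxFlowEdmondsKarp(std::vector<std::vector<int>> residual, int s, int t) {
    const int n = (int)residual.size();
    assert(s != t && s >= 0 && s < n && t >= 0 && t < n);
    int total = 0;
    for (;;) {
        std::vector<int> parent(n, -1);      // parent(v) in the current search
        parent[s] = s;
        std::queue<int> labeled;             // a queue gives breadth-first order
        labeled.push(s);
        while (!labeled.empty() && parent[t] == -1) {
            int v = labeled.front();
            labeled.pop();
            for (int u = 0; u < n; ++u) {
                if (parent[u] == -1 && residual[v][u] > 0) {
                    parent[u] = v;
                    labeled.push(u);
                }
            }
        }
        if (parent[t] == -1) break;          // the sink can no longer be labeled
        int bottleneck = INT_MAX;            // smallest slack along the path
        for (int u = t; u != s; u = parent[u])
            bottleneck = std::min(bottleneck, residual[parent[u]][u]);
        for (int u = t; u != s; u = parent[u]) {
            residual[parent[u]][u] -= bottleneck;   // use up slack
            residual[u][parent[u]] += bottleneck;   // allow later push-back
        }
        total += bottleneck;
    }
    return total;
}
```

Replacing the queue with a stack would give the depth-first behavior discussed earlier; the rest of the loop would stay the same.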
Networks (continued)
• Maximum Flows (continued)
– Although this approach overcomes the problems associated with the depth-first search, it has its own inefficiencies
– When we perform a breadth-first search, a large number of vertices are labeled in each iteration in order to find the shortest path
– However, these labels are all discarded, only to be re-created when we start looking for another augmenting path
– So to address this shortcoming we turn our attention to an algorithm developed by Efim Dinic in 1970
– His approach used breadth-first search first to avoid the repetitive loops with the same paths and to make sure the depth-first search takes the shortest path
– Once that was done, the depth-first component takes over to reach the sink
Networks (continued)
• Maximum Flows (continued)
– The algorithm makes up to |V| – 1 passes through the network, in each pass resolving all augmenting paths of the same length from source to sink
– All the augmenting paths form a layered (or level) network
– Starting from the lowest values, we first extract layered networks of length one if they exist, then length two, etc.
– This is illustrated in Figure 8.23a-b on page 431
– The augmenting paths in this layered network are all of length three; a single path of length one and paths of length two do not exist
– Breadth-first processing is used to create the layered network, and it includes only forward edges with unused capacity and backward edges that already carry some flow
Networks (continued)
• Maximum Flows (continued)
– Since the paths in a layered network are of the same length, we can avoid redundant edges that are in augmenting paths
– If we cannot reach any of the neighbors of a vertex v in a layered network, the same situation will exist in that network in later tests
– Consequently, we won’t need to check the neighbors of v again
– So if we run into a dead-end node v, we mark incident edges as blocked so we can’t get to v from any direction
– Any saturated edges (those already at full capacity) are also blocked; these are shown as dashed lines in Figure 8.23
– Because of the way this works, the layered network is built from the sink to the source
Networks (continued)
• Maximum Flows (continued)
– Next, the depth-first search proceeds to find as many augmenting paths as possible from the layered network
– For each of these paths, one edge will become saturated, so eventually no more augmenting paths will be found
– This process is illustrated in Figure 8.23c-f
– Once no more augmenting paths are found, a higher-level layered network is created, and the search for augmenting paths begins again, eventually stopping when no layered network can be formed
– Figure 8.23g-j shows this, as first a four-edge and then a five-edge path are created
– The algorithm itself is shown on pages 432 and 433
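The two phases of Dinic's approach can be sketched in C++ as follows. This is an illustrative sketch, not the book's pages 432-433 code, and it differs in two labeled ways: the layered network is built by a breadth-first search from the source rather than from the sink, and dead ends are skipped with a per-vertex cursor instead of explicit blocked marks. The struct name Dinic and its members are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <queue>
#include <vector>

struct Dinic {
    struct Edge { int to, cap; };        // cap = remaining (residual) capacity
    std::vector<Edge> edges;             // edges i and i^1 form a forward/backward pair
    std::vector<std::vector<int>> adj;   // adj[v] holds indices into edges
    std::vector<int> level, cursor;

    explicit Dinic(int n) : adj(n), level(n), cursor(n) {}

    void addEdge(int u, int v, int cap) {
        assert(cap >= 0);
        adj[u].push_back((int)edges.size()); edges.push_back({v, cap});
        adj[v].push_back((int)edges.size()); edges.push_back({u, 0});
    }

    // Breadth-first phase: assign each reachable vertex its layer.
    bool buildLayers(int s, int t) {
        std::fill(level.begin(), level.end(), -1);
        level[s] = 0;
        std::queue<int> q;
        q.push(s);
        while (!q.empty()) {
            int v = q.front(); q.pop();
            for (int id : adj[v]) {
                if (edges[id].cap > 0 && level[edges[id].to] == -1) {
                    level[edges[id].to] = level[v] + 1;
                    q.push(edges[id].to);
                }
            }
        }
        return level[t] != -1;
    }

    // Depth-first phase: push flow along one path of the layered network.
    int push(int v, int t, int limit) {
        if (v == t) return limit;
        for (int& i = cursor[v]; i < (int)adj[v].size(); ++i) {
            int id = adj[v][i];
            int u = edges[id].to;
            if (edges[id].cap > 0 && level[u] == level[v] + 1) {
                int sent = push(u, t, std::min(limit, edges[id].cap));
                if (sent > 0) {
                    edges[id].cap -= sent;
                    edges[id ^ 1].cap += sent;
                    return sent;
                }
            }
        }
        return 0;                        // dead end: cursor[v] stays exhausted
    }

    int maxFlow(int s, int t) {
        int total = 0;
        while (buildLayers(s, t)) {      // one pass per layered network
            std::fill(cursor.begin(), cursor.end(), 0);
            while (int sent = push(s, t, INT_MAX)) total += sent;
        }
        return total;
    }
};
```

Because the cursor of a vertex only moves forward within one pass, an edge that led to a dead end or became saturated is not tried again until the next layered network is built.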
Networks (continued)
• Maximum Flows of Minimum Cost
– Edges in the previous examples had two parameters, capacity and flow
– Choice of maximum flow was dictated by the algorithm used, even though there might be many maximum flows
– This is illustrated in Figure 8.24
Fig. 8.24 Two possible maximum flows for the same network
– In Figure 8.24a, the edge ab isn’t used at all, whereas in Figure 8.24b all the edges are carrying flow
– Yet our breadth-first algorithm only yields the first result, then halts
Networks (continued)
• Maximum Flows of Minimum Cost (continued)
– However this may not be the best choice; not all paths of maximum flow are equally good ones
– If we look at the example as road distances between locations, then capacity and flow may not be sufficient information to properly determine a route
– For example, the distance from a to t may be quite long, while the distance from a to b and b to t may be shorter, making the second route preferable
– But distance may not be the sole criterion; there may be many other factors that influence the choice of route
– This leads us to consider a third factor in evaluating edges, the cost of moving a unit of flow through the edge
Networks (continued)
• Maximum Flows of Minimum Cost (continued)
– The problem now becomes how to find the maximum flow at minimum cost
– Finding all the possible maximum flows and then comparing their costs is extremely inefficient
– What is needed is an algorithm that can find a maximum flow while also determining the minimum cost
– One possible approach is based on the following theorem:
Theorem. If f is a minimal-cost flow with the flow value v and p is the minimum cost augmenting path sending a flow of value 1 from the source to the sink, then the flow f + p is minimal and its flow value is v + 1.
Networks (continued)
• Maximum Flows of Minimum Cost (continued)
– The theorem says we first start with the cheapest way to move v units through the network
– Then we find a path that is the cheapest way of sending a single unit from the source to the sink
– On combining these, we have the route previously determined and the path just found, which transmits v + 1 units
– Now if this augmenting path sends 1 unit at minimum cost, it can send 2, 3, …, n units, where n is the capacity of the path
– This also suggests a process for finding the cheapest maximum flow
– Starting with all flows 0, we find the cheapest way to send 1 unit and then maximize the flow along this path
Networks (continued)
• Maximum Flows of Minimum Cost (continued)
– In the next round, the least-cost path for sending 1 unit is determined again, and as many units as it can hold are sent, and so on
– This continues until we can’t send anything more from the source, or the sink can’t receive any more flows
– This is something like finding the shortest path, because it can be looked at as the path with minimum cost
– So we want an algorithm to find the shortest path so we can send the maximum flow through the path
– So a modification of Dijkstra’s one-to-one shortest path algorithm can be used
– The pseudocode for this procedure is shown on page 435
Networks (continued)
• Maximum Flows of Minimum Cost (continued)
– The label for each vertex in this algorithm is the triple label(u) = (parent(u), flow(u), cost(u)) since it has to track three items
– First, it records u’s predecessor, v, which is how s accesses u
– Then, for the path from s to u, it records the maximum flow
– Finally, it stores the cost of passing all the edges from the source to u
– cost(u), for the forward edge(vu), is the sum of accumulated costs in v plus the additional cost of pushing a unit through edge(vu)
– The unit cost of passing through backward edge(vu) is subtracted from cost(v) and stored in cost(u)
– The process is illustrated in Figure 8.25 on page 437
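The repeated cheapest-path idea can be sketched in C++ as follows. This is a sketch under stated assumptions and not the book's page 435 procedure: where the slides use a modification of Dijkstra's algorithm, this version swaps in plain Bellman-Ford relaxation, which copes directly with the negative costs that backward edges contribute. The struct name MinCostFlow and its members are hypothetical, and costs are assumed to be nonnegative integers on the original edges.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <utility>
#include <vector>

struct MinCostFlow {
    struct Edge { int to, cap, cost; };  // cap = remaining capacity, cost per unit
    std::vector<Edge> edges;             // edges i and i^1 form a forward/backward pair
    std::vector<std::vector<int>> adj;

    explicit MinCostFlow(int n) : adj(n) {}

    void addEdge(int u, int v, int cap, int cost) {
        assert(cap >= 0);
        adj[u].push_back((int)edges.size()); edges.push_back({v, cap, cost});
        // pushing flow back along the edge refunds its cost
        adj[v].push_back((int)edges.size()); edges.push_back({u, 0, -cost});
    }

    // Returns {maximum flow value, minimum cost of such a flow}.
    std::pair<int, int> solve(int s, int t) {
        const int n = (int)adj.size();
        int flow = 0, cost = 0;
        for (;;) {
            // cheapest way to send one more unit from s (Bellman-Ford)
            std::vector<int> dist(n, INT_MAX), via(n, -1);
            dist[s] = 0;
            for (int pass = 0; pass + 1 < n; ++pass) {
                bool changed = false;
                for (int v = 0; v < n; ++v) {
                    if (dist[v] == INT_MAX) continue;
                    for (int id : adj[v]) {
                        const Edge& e = edges[id];
                        if (e.cap > 0 && dist[v] + e.cost < dist[e.to]) {
                            dist[e.to] = dist[v] + e.cost;
                            via[e.to] = id;      // edge used to reach e.to
                            changed = true;
                        }
                    }
                }
                if (!changed) break;
            }
            if (dist[t] == INT_MAX) break;       // nothing more can be sent
            int amount = INT_MAX;                // then send as much as the path holds
            for (int v = t; v != s; v = edges[via[v] ^ 1].to)
                amount = std::min(amount, edges[via[v]].cap);
            for (int v = t; v != s; v = edges[via[v] ^ 1].to) {
                edges[via[v]].cap -= amount;
                edges[via[v] ^ 1].cap += amount;
            }
            flow += amount;
            cost += amount * dist[t];
        }
        return {flow, cost};
    }
};
```

Here dist plays the role of cost(u) and via the role of parent(u); the flow(u) component of the label is recomputed by walking the path back from the sink.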
Matching
• A particular company has a set of jobs {a, b, c, d, e}, and a set of applicants {p, q, r, s, t}
• However, applicant p is only qualified for jobs a, b, and c; applicant q is only qualified for jobs b and d; similar restrictions exist for the other applicants
• Our problem is how to match the applicants to the jobs such that each applicant has a job and all jobs are assigned
• Numerous problems like this exist, and they are conveniently modeled using bipartite graphs
• A bipartite graph is one where the vertices can be divided into two sets, such that any edge has one vertex in each set
Matching (continued)
• For the company, we can construct a bipartite graph where each edge relates an applicant to the job(s) they qualify for
• This is shown in Figure 8.26
Fig. 8.26 Matching five applicants with five jobs
• The task is to match each applicant with a job; this may not always be possible, so we want to match as many as possible
• For a given graph G = (V, E), a matching M is defined as a subset of edges M ⊆ E, where no two edges are adjacent
Matching (continued)
• A maximum matching is a matching where the number of unmatched vertices is minimal
• Consider Figure 8.27
Fig. 8.27 A graph with matchings M1 = {edge(cd), edge(ef)}and M2 = {edge(cd), edge(ge), edge(fh)}
• Sets M1 = {edge(cd), edge(ef)} and M2 = {edge(cd), edge(ge), edge(fh)} are matchings, but M2 is a maximum matching
• A perfect matching is one where all vertices in the graph are paired
Matching (continued)
• A matching problem is the task of finding a maximum matching for a given graph
• An alternating path for M is a sequence of edges that alternately belong to M and the set of edges not in M
• An augmenting path for M is an alternating path where the end vertices are not incident with any edge in matching M
• Augmenting paths have an odd number of edges, 2k + 1, where k are in M and k + 1 are not in M
• The symmetric difference of two sets, X ⊕ Y, is the set
X ⊕ Y = (X – Y) ∪ (Y – X) = (X ∪ Y) – (X ∩ Y)
• In other words, the symmetric difference of two sets is the set of elements in their union, less the intersection
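The symmetric difference maps directly onto the standard library. The sketch below is illustrative: the helper names symmetricDifference and augment are assumptions, and each edge is represented simply by a two-letter string such as "cd", always written the same way. Applying it to a matching and the edge set of an augmenting path swaps the k matched edges on the path for the k + 1 unmatched ones, so the result has one more edge.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>

// An edge is named by its two end vertices, e.g. "cd" (illustrative encoding).
using EdgeSet = std::set<std::string>;

// X (+) Y: elements that belong to exactly one of the two sets.
EdgeSet symmetricDifference(const EdgeSet& x, const EdgeSet& y) {
    EdgeSet result;
    std::set_symmetric_difference(x.begin(), x.end(), y.begin(), y.end(),
                                  std::inserter(result, result.begin()));
    return result;
}

// Symmetric difference of a matching and an augmenting path for it.
// Precondition (not checked in full): `path` really is augmenting for `matching`.
EdgeSet augment(const EdgeSet& matching, const EdgeSet& path) {
    EdgeSet result = symmetricDifference(matching, path);
    // k matched edges leave, k + 1 unmatched edges enter
    assert(result.size() == matching.size() + 1);
    return result;
}
```

With the matchings of Figure 8.27, taking M1 = {edge(cd), edge(ef)} and the path made of edge(ge), edge(ef), edge(fh) gives {edge(cd), edge(ge), edge(fh)}, which is M2.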
Matching (continued)
• This leads us to the following lemma, the proof of which is shown on page 439:
Lemma 1. If for two matchings M and N in a graph G = (V,E) we define a set of edges M ⊕ N ⊆ E, then each connected component of the subgraph G′ = (V, M ⊕ N) is either (a) a single vertex, (b) a cycle with an even number of edges alternately in M and N, or (c) a path whose edges are alternately in M and N and such that each end vertex of the path is matched only by one of the two matchings M and N (i.e., the whole path should be considered, not just part, to cover the entire connected component)
• Figure 8.28 shows an example of this
• The symmetric difference between matching M (dashed lines) and matching N (dotted lines) contains one path and a cycle (Figure 8.28b)
Matching (continued)
• Notice that the vertices of the graph G not incident with any edges in the symmetric difference are isolated vertices in G′
Fig. 8.28 (a) Two matchings M and N in a graph G = (V,E) and (b) the graph G′ = (V, M ⊕ N)
• Now consider the next lemma:
Lemma 2. If M is a matching and P is an augmenting path for M, then M ⊕ P is a matching of cardinality |M| + 1
Matching (continued)
• The proof of this is on page 440; Figure 8.29 illustrates it
Fig. 8.29 (a) Augmenting path P and a matching M and (b) the matching M ⊕ P
• For matching M (dashed lines) and augmenting path P for M (c, b, f, h, g, i, j, e), the resulting matching M ⊕ P is {edge(bc), edge(ej), edge(fh), edge(gi)}
• This includes all the edges from P originally excluded from M
Matching (continued)
• These two lemmas can then be used to construct the proof of the following important theorem:
Theorem (Claude Berge 1957). A matching M in a graph G is maximum iff there is no augmenting path connecting two unmatched vertices in G
• The proof of this theorem is shown on page 441
• This suggests an approach for finding a maximum matching
• Starting from an initial matching (possibly empty), we repeatedly find new augmenting paths to increase the cardinality of the matching until no such path can be found
• This means we need an algorithm to determine augmenting paths
• Fortunately, this is easier to do for bipartite graphs, so we’ll start with them
Matching (continued)
• To find an augmenting path, the breadth-first algorithm is modified to allow for always finding the shortest path
• A tree, called a Hungarian tree, is constructed with an unmatched vertex in the root
• It consists of alternating paths, and success is determined as soon as another unmatched vertex is found
• This indicates the presence of an augmenting path
• The augmenting path increases the size of matching; once no such path can be found, the algorithm is finished
• The algorithm is shown on pages 441 and 442; an example of this is shown in Figure 8.30 on page 443
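For bipartite graphs, the breadth-first search for an augmenting path can be sketched in C++ as follows. This is an illustration under stated assumptions rather than the book's findMaximumMatching(): the function name maximumBipartiteMatching is hypothetical, the two vertex sets are numbered separately from 0, and adj[u] lists the right-side vertices adjacent to left-side vertex u. Each left vertex in turn becomes the root of an alternating tree, and the search stops as soon as an unmatched right vertex is reached.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Returns matchOfRight: for each right vertex, its matched left vertex or -1.
std::vector<int> maximumBipartiteMatching(
        const std::vector<std::vector<int>>& adj, int rightCount) {
    const int leftCount = (int)adj.size();
    std::vector<int> matchOfLeft(leftCount, -1), matchOfRight(rightCount, -1);
    for (int root = 0; root < leftCount; ++root) {
        // Grow a tree of alternating paths from the unmatched vertex root.
        std::vector<int> parentOfRight(rightCount, -1);
        std::vector<bool> seenRight(rightCount, false);
        std::queue<int> q;
        q.push(root);
        int freeRight = -1;
        while (!q.empty() && freeRight == -1) {
            int u = q.front();
            q.pop();
            for (int w : adj[u]) {
                assert(w >= 0 && w < rightCount);
                if (seenRight[w]) continue;
                seenRight[w] = true;
                parentOfRight[w] = u;            // reached by an edge not in M
                if (matchOfRight[w] == -1) {     // unmatched: augmenting path found
                    freeRight = w;
                    break;
                }
                q.push(matchOfRight[w]);         // continue along the edge in M
            }
        }
        // Flip the edges along the augmenting path, back toward the root.
        while (freeRight != -1) {
            int u = parentOfRight[freeRight];
            int previous = matchOfLeft[u];       // -1 once u is the root
            matchOfLeft[u] = freeRight;
            matchOfRight[freeRight] = u;
            freeRight = previous;
        }
    }
    return matchOfRight;
}
```

The flipping loop is the symmetric difference of the matching and the path, carried out edge by edge.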
Matching (continued)
• Stable Matching Problem
– In the example of matching applicants with jobs, any successful maximum matching was fine
– However, this is typically not the case, since applicants have preferences among jobs and employers have preferences among applicants
– The stable matching (also called stable marriage) problem uses two non-overlapping sets with the same cardinality, U and W
– The elements of U have a ranking list of elements of W, and those of W have a preference list of elements of U
– The ideal matching is to place elements with their highest preference, but because of possible conflicts, a stable matching is sought
– A matching is unstable if two elements, one from each set, rank each other higher than those with which they are currently matched; otherwise it is stable
Matching (continued)
• Stable Matching Problem (continued)
– If we consider the two sets U = {u1, u2, u3, u4} and W = {w1, w2, w3, w4}, and the following ranking lists:
u1: w2 > w1 > w3 > w4 w1: u3 > u2 > u1 > u4
u2: w3 > w2 > w1 > w4 w2: u1 > u3 > u4 > u2
u3: w3 > w4 > w1 > w2 w3: u4 > u2 > u3 > u1
u4: w2 > w3 > w4 > w1 w4: u2 > u1 > u3 > u4
then we can see the matching (u1, w1), (u2, w2), (u3, w4), (u4, w3) is unstable because u1 and w2 prefer each other over the current match
– David Gale and Lloyd Shapley designed a matching algorithm in 1962, and also showed that a stable matching always exists
– This algorithm is shown on page 444, together with a discussion of its application to the sets and table above
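A proposal-based procedure in the style of Gale and Shapley can be sketched in C++ as follows. This is a minimal sketch, not the book's page 444 pseudocode: the function name stableMatching is hypothetical, elements are numbered from 0 (so u1 is 0, w1 is 0, and so on), elements of U propose in order of preference, and each element of W keeps the best proposer seen so far.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// prefU[u] lists elements of W from most to least preferred; prefW[w] likewise for U.
// Returns matchOfU, where matchOfU[u] is the element of W assigned to u.
std::vector<int> stableMatching(const std::vector<std::vector<int>>& prefU,
                                const std::vector<std::vector<int>>& prefW) {
    const int n = (int)prefU.size();
    assert((int)prefW.size() == n);
    // rankW[w][u]: position of u in w's list (smaller means preferred)
    std::vector<std::vector<int>> rankW(n, std::vector<int>(n));
    for (int w = 0; w < n; ++w)
        for (int pos = 0; pos < n; ++pos)
            rankW[w][prefW[w][pos]] = pos;

    std::vector<int> matchOfU(n, -1), matchOfW(n, -1), nextChoice(n, 0);
    std::queue<int> freeU;
    for (int u = 0; u < n; ++u) freeU.push(u);
    while (!freeU.empty()) {
        int u = freeU.front();
        freeU.pop();
        int w = prefU[u][nextChoice[u]++];       // u's best choice not yet tried
        int rival = matchOfW[w];
        if (rival == -1 || rankW[w][u] < rankW[w][rival]) {
            matchOfW[w] = u;                     // w accepts u
            matchOfU[u] = w;
            if (rival != -1) {                   // w's previous partner is free again
                matchOfU[rival] = -1;
                freeU.push(rival);
            }
        } else {
            freeU.push(u);                       // rejected: u will try the next choice
        }
    }
    return matchOfU;
}
```

With the ranking lists above and U proposing, this sketch pairs (u1, w2), (u2, w1), (u3, w4), (u4, w3); no u and w in that result prefer each other to their assigned partners.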
Matching (continued)
• Stable Matching Problem (continued)
– There is an asymmetry associated with the algorithm based on which rankings are considered more important
– As given, the algorithm favors set U
– If the roles of the two sets U and W are reversed, then the w’s will have their preferred choices immediately, instead of the u’s
Matching (continued)
• Assignment Problem
– Finding suitable matches becomes more difficult in a weighted graph
– In these cases we want to find a matching with a maximum total weight
– This is known as the assignment problem
– If we consider complete bipartite graphs with two sets of vertices that are equal in size, then it is known as the optimal assignment problem
– An algorithm known as the Hungarian algorithm was developed by Harold Kuhn in 1955, and further investigated by James Munkres in 1957
– Kuhn chose the name in honor of the work done by Dénes Kőnig and Jenő Egerváry on this problem in 1931
Matching (continued)
• Assignment Problem (continued)
– The algorithm is shown on pages 445 and 446
– An example of its application is shown in Figure 8.31, together with a detailed treatment of its application on pages 446 and 447
Matching (continued)
• Matching in Nonbipartite Graphs
– The algorithm findMaximumMatching() (pages 441 and 442) is not general enough to correctly handle nonbipartite graphs
– Considering the graph in Figure 8.32 and using breadth-first search to construct a tree to determine an augmenting path, we run into a problem
– Starting at vertex c, d is on an even level, e is odd, and a and f are even
– a is then expanded by adding b, and f by adding g and then i, creating an augmenting path c, d, e, f, g, i
– If i were not in the graph, however, the only augmenting path would not be detected because g, being labeled, blocks access to f and h
– A similar problem would occur if we relied on depth-first search instead
Matching (continued)
• Matching in Nonbipartite Graphs (continued)
Fig. 8.32 Application of the findMaximumMatching() algorithm to a nonbipartite graph
– The problem is caused by certain cycles possessing an odd number of edges
– It isn’t the odd number of edges alone that leads to this; the graph in Figure 8.32b can be successfully processed
Matching (continued)
• Matching in Nonbipartite Graphs (continued)
– The type of cycle for which the problems occur is called a blossom
– A technique for determining augmenting paths for graphs with blossoms was developed by Jack Edmonds in 1961 and published in 1965
– A blossom is an alternating cycle where the first and last edges of the cycle are not in matching
– In these cycles, the first vertex is called the base of the blossom
– An alternating path of even length is called a stem, as is a path of length zero consisting of a single vertex
– If a blossom has a stem whose edge in matching is incident with the base, it is called a flower
Matching (continued)
• Matching in Nonbipartite Graphs (continued)
– In Figure 8.32a, path c, d, e and path e are stems; cycle e, a, b, g, f, e forms a blossom with base e
– Blossoms cause problems when the potential augmenting path leads to a blossom through the base
– Depending on the edge chosen to continue the path, an augmenting path may not be derived
– If the blossom is entered through any other vertex, however, the problem is averted because only one of the two edges of the vertex can be chosen
– So the idea is to detect that a blossom is being entered through its base
– We can then temporarily remove the blossom by replacing it with a vertex and attach to this all edges connected to the blossom
Matching (continued)
• Matching in Nonbipartite Graphs (continued)
– At this point the search for an augmenting path continues
– If one is found and it includes a vertex representing a blossom, the blossom is re-inserted
– The path through it is then determined by going backwards from the edge that led to the blossom to an edge incident with the base
– So first, we need to detect that a blossom has been entered through its base
– The Hungarian tree in Figure 8.33a was generated using a breadth-first search on the graph of Figure 8.32a
– Trying to find neighbors of b leads us to g, because edge(ab) is in matching, so only edges not in matching can be included starting from b
Matching (continued)
• Matching in Nonbipartite Graphs (continued)
– These edges lead to vertices on an even level in the tree, but g has already been labeled and is on an odd level, signaling a blossom
– Thus, we trace paths back in the tree from g and b until we reach a common root, which is vertex e; this is the base of the blossom
– We then replace the blossom with a vertex, A, leading to the graph of Figure 8.33b
– The augmenting path search is then resumed, and continues until the path is found, which is c, d, A, h
– Then the blossom is expanded, and the path traced through the blossom
– This is done by starting from edge(hA) (now edge(hf))
Matching (continued)
• Matching in Nonbipartite Graphs (continued)
– That edge is not in matching, so from f only edge(fg) can be chosen so the augmenting path remains alternating
– By moving through the vertices f, g, b, a, e, the part of the augmenting path corresponding to A is determined, as seen in Figure 8.33c
– So the full augmenting path is c, d, e, a, b, g, f, h
– Once the path is processed, a new matching is determined, shown in Figure 8.33d
Matching (continued)
• Matching in Nonbipartite Graphs (continued)
Fig. 8.33 Processing a graph with a blossom
Eulerian and Hamiltonian Graphs
• Eulerian Graphs
– A trail in a graph which visits every edge exactly once is called an Eulerian trail (or Eulerian path)
– Similarly, an Eulerian trail which starts and ends on the same vertex is called an Eulerian circuit or Eulerian cycle
– They were first discussed by Leonhard Euler while solving the famous Seven Bridges of Königsberg problem in 1736
– Euler proved that if every vertex of a connected graph is incident to an even number of edges, then the graph is Eulerian
– In addition, if the graph has exactly two vertices incident with an odd number of edges, it contains an Eulerian trail
– An algorithm developed by M. Fleury in 1883 is the oldest that allows us to find an Eulerian cycle if this is possible; it appears on page 450
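The two degree conditions can be checked with a short C++ sketch. It is illustrative only: the names EulerKind and classifyEulerian are assumptions, the graph is given as an edge list over vertices 0..n-1 (parallel edges allowed), and connectivity is assumed rather than tested. The sketch only classifies the graph; it does not construct the cycle as Fleury's algorithm does.

```cpp
#include <cassert>
#include <utility>
#include <vector>

enum class EulerKind { Circuit, Trail, None };

// Assumes the undirected graph is connected; only degree parity is examined.
EulerKind classifyEulerian(int vertexCount,
                           const std::vector<std::pair<int, int>>& edges) {
    std::vector<int> degree(vertexCount, 0);
    for (const auto& e : edges) {
        assert(e.first >= 0 && e.first < vertexCount);
        assert(e.second >= 0 && e.second < vertexCount);
        ++degree[e.first];
        ++degree[e.second];
    }
    int oddVertices = 0;
    for (int d : degree)
        if (d % 2 != 0) ++oddVertices;
    if (oddVertices == 0) return EulerKind::Circuit;  // every degree is even
    if (oddVertices == 2) return EulerKind::Trail;    // exactly two odd vertices
    return EulerKind::None;
}
```

For the Seven Bridges of Königsberg, modeled as four land masses joined by seven edges, the degrees are 5, 3, 3, and 3, so the sketch returns EulerKind::None.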
Eulerian and Hamiltonian Graphs (continued)
• Eulerian Graphs (continued)
– Figure 8.34 shows an example of finding an Eulerian cycle
Fig. 8.34 Finding an Eulerian cycle
– A test needs to be made, before an edge is chosen, to see if that edge is a bridge in the untraversed subgraph
– If it is, it could lead to the inability to complete the path because certain vertices could become unreachable
Eulerian and Hamiltonian Graphs (continued)
• Eulerian Graphs (continued) – The Chinese Postman Problem
– The Chinese postman problem is to find a shortest closed path or circuit that visits every edge of a (connected) undirected graph
– Alan Goldman of the U.S. National Bureau of Standards first coined the name 'Chinese Postman Problem' for this problem, as it was originally studied by the Chinese mathematician Mei-Ku Kwan in 1962
– When the graph has an Eulerian circuit, that circuit is an optimal solution
– If it doesn’t, the graph can be amplified by including each edge as many times as it appears in the postman’s walk
– If this is done, we need to construct the amplified graph in such a way as to minimize the sum of the distances of the added edges
Eulerian and Hamiltonian Graphs (continued)
• Eulerian Graphs (continued) – The Chinese Postman Problem
– First we group odd-degree vertices into pairs and add a path of new edges to the already existing path between the vertices of each pair
– The problem now is to find a grouping of odd-degree vertices such that the total distance of the added paths is minimum
– An algorithm to solve this was developed by Jack Edmonds and Ellis L. Johnson in 1973, based on earlier work by Edmonds in 1965
– The pseudocode for this algorithm is shown on page 451
– The task of finding a postman tour is illustrated in Figure 8.35 on page 452
– The graph has six odd-degree vertices, c, d, f, g, h, and j
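For a small number of odd-degree vertices, the minimum-cost grouping into pairs can be found by exhaustive search. The sketch below is not the Edmonds and Johnson algorithm; it is a simple dynamic program over subsets, swapped in for illustration. It assumes that dist[i][j] already holds the shortest-path distance between odd-degree vertices i and j, and the name minPairingCost is hypothetical.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// dist[i][j]: shortest-path distance between odd-degree vertices i and j.
// Assumption: dist.size() is even and small (the table has 2^n entries).
int minPairingCost(const std::vector<std::vector<int>>& dist) {
    const int n = static_cast<int>(dist.size());
    const int full = (1 << n) - 1;
    const int INF = std::numeric_limits<int>::max() / 2;
    std::vector<int> best(1 << n, INF);        // best[mask]: cost to pair up 'mask'
    best[0] = 0;
    for (int mask = 0; mask < full; ++mask) {
        if (best[mask] == INF) continue;       // this subset cannot be paired up
        int i = 0;
        while (mask & (1 << i)) ++i;           // lowest vertex not yet paired
        for (int j = i + 1; j < n; ++j) {
            if (mask & (1 << j)) continue;
            int next = mask | (1 << i) | (1 << j);
            best[next] = std::min(best[next], best[mask] + dist[i][j]);
        }
    }
    return best[full];
}
```

The returned value is the total length of the paths that would be added to the graph; with six odd-degree vertices, as in the example, there are only 15 possible groupings.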
Eulerian and Hamiltonian Graphs (continued)
• Eulerian Graphs (continued) – The Chinese Postman Problem
– In Figure 8.35b-c the shortest paths between all pairs of these vertices are determined
– A complete bipartite graph, H, is then found (Figure 8.35d), and an optimal assignment, M, is determined
– A matching in an initial equality subgraph is found by using the optimalAssignment() algorithm (Figure 8.35e)
– Two matchings are found (Figure 8.35f–g), and then a perfect matching (Figure 8.35h)
– Using this, we amplify the original graph by adding new edges (dashed lines in Figure 8.35i), so there are no odd-degree vertices
– Consequently, finding an Eulerian trail is possible
Eulerian and Hamiltonian Graphs (continued)
• Hamiltonian Graphs
– A Hamiltonian path is a path in an undirected graph that visits each vertex exactly once
– A Hamiltonian cycle is a Hamiltonian path that is a cycle
– Determining whether such paths and cycles exist in graphs is the Hamiltonian path problem, which is NP-complete
– Hamiltonian graphs have no characterizing formula, but all complete graphs are Hamiltonian
– Hamiltonian paths and cycles are named after William Rowan Hamilton, who studied them in 1857
– The following theorem will prove useful in discussing Hamiltonian graphs
Eulerian and Hamiltonian Graphs (continued)
• Hamiltonian Graphs (continued)
Theorem (Bondy and Chvátal 1976; Ore 1960). If edge(vu) ∉ E, graph G* = (V, E ∪ {edge(vu)}) is Hamiltonian, and deg(v) + deg(u) ≥ |V|, then graph G = (V, E) is also Hamiltonian
– The proof of this is shown on page 453; the theorem essentially says that some Hamiltonian graphs can be created from others by eliminating edges
– This process leads to an algorithm that first expands the graph with more edges until finding a Hamiltonian cycle is easy
– Then the cycle is manipulated by adding and removing edges until a Hamiltonian cycle is found based on the edges of the original graph
– The algorithm is presented on pages 453 and 454
– Figure 8.37 on page 455 shows an example of this
Eulerian and Hamiltonian Graphs (continued)
• Hamiltonian Graphs (continued) – The Traveling Salesman Problem
– The traveling salesman problem consists of finding the shortest possible route that visits each city (in a set of cities) exactly once and returns to the origin city
– If the distance between each pair of cities is known, there are (n – 1)! possible routes
– The problem is then to find a minimum Hamiltonian cycle
– Many versions of this problem use the triangle inequality, dist(vivj) ≤ dist(vivk) + dist(vkvj)
– A possibility is to add to an already constructed path v1, …, vj a vertex vj+1 that is closest to vj
– The problem is that the last edge added may be as long as the total distance of the remaining edges
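The closest-vertex strategy just described can be sketched as below. This is a minimal illustration, assuming a complete symmetric distance matrix; the function names nearestNeighborTour and tourLength are hypothetical. The closing edge back to the start is added only when the length is computed, which is exactly where the heuristic can go wrong.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

using Distances = std::vector<std::vector<double>>;

// Extend the path by the unvisited city closest to its current last city.
std::vector<int> nearestNeighborTour(const Distances& dist, int start) {
    const int n = static_cast<int>(dist.size());
    std::vector<bool> visited(n, false);
    std::vector<int> tour{start};
    visited[start] = true;
    for (int step = 1; step < n; ++step) {
        const int last = tour.back();
        int next = -1;
        double best = std::numeric_limits<double>::infinity();
        for (int c = 0; c < n; ++c)
            if (!visited[c] && dist[last][c] < best) { best = dist[last][c]; next = c; }
        visited[next] = true;
        tour.push_back(next);
    }
    return tour;
}

// Length of the closed tour, including the edge from the last city to the first.
double tourLength(const Distances& dist, const std::vector<int>& tour) {
    double total = 0.0;
    for (std::size_t i = 0; i < tour.size(); ++i)
        total += dist[tour[i]][tour[(i + 1) % tour.size()]];
    return total;
}
```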
Eulerian and Hamiltonian Graphs (continued)
• Hamiltonian Graphs (continued) – The Traveling Salesman Problem
– Another possibility uses a minimum spanning tree
– The length of the tree is defined to be the sum of the lengths of all the edges of the tree
– Since removing an edge from the tour creates a spanning tree, the length of the tour cannot be less than the length of the minimum spanning tree
– Also, each edge of the tree is traversed twice in a depth-first search, so the length of the tour is at most twice the length of the tree
– However, a path that includes each edge twice includes some vertices twice, and each vertex should be included only once
Eulerian and Hamiltonian Graphs (continued)
• Hamiltonian Graphs (continued) – The Traveling Salesman Problem
– So if a vertex is already in such a path, its second occurrence is eliminated, and the path contracted
– This shortens the length of the path due to the triangle inequality
– For example, Figure 8.38b (pages 456 and 457) shows the minimum spanning tree for the graph that connects the cities a through h in Figure 8.38a
– Depth-first search yields 8.38c, and applying the triangle inequality repeatedly (Figure 8.38c-i) transforms the path into the path in 8.38i
– This final path can be obtained directly from the minimum spanning tree in Figure 8.38b using preorder traversal
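The observation that the final path is a preorder traversal of the minimum spanning tree suggests a direct implementation. The sketch below is an assumption-laden illustration, not the book's code: it builds the tree with a simple O(n²) version of Prim's method on a complete distance matrix and then lists the vertices in preorder. The name mstPreorderTour is hypothetical, and ties are broken by vertex index.

```cpp
#include <limits>
#include <vector>

// Visit order of a tour obtained by preorder traversal of a minimum
// spanning tree rooted at 'root'; the tour closes by returning to 'root'.
std::vector<int> mstPreorderTour(const std::vector<std::vector<double>>& dist, int root) {
    const int n = static_cast<int>(dist.size());
    std::vector<double> key(n, std::numeric_limits<double>::infinity());
    std::vector<int> parent(n, -1);
    std::vector<bool> inTree(n, false);
    std::vector<std::vector<int>> children(n);
    key[root] = 0.0;
    for (int iter = 0; iter < n; ++iter) {
        int v = -1;                            // cheapest vertex not yet in the tree
        for (int c = 0; c < n; ++c)
            if (!inTree[c] && (v == -1 || key[c] < key[v])) v = c;
        inTree[v] = true;
        if (parent[v] != -1) children[parent[v]].push_back(v);
        for (int c = 0; c < n; ++c)            // update connection costs
            if (!inTree[c] && dist[v][c] < key[c]) { key[c] = dist[v][c]; parent[c] = v; }
    }
    std::vector<int> tour;
    std::vector<int> stack{root};
    while (!stack.empty()) {                   // iterative preorder traversal
        int v = stack.back();
        stack.pop_back();
        tour.push_back(v);
        for (auto it = children[v].rbegin(); it != children[v].rend(); ++it)
            stack.push_back(*it);
    }
    return tour;
}
```

As the slides note, the result depends on the choice of root, so trying several roots and keeping the shortest tour is a cheap improvement.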
Eulerian and Hamiltonian Graphs (continued)
• Hamiltonian Graphs (continued) – The Traveling Salesman Problem
– The tour in Figure 8.38i is obtained by considering a as the root of the tree, so the cities are visited in the order a, d, e, f, h, g, c, b, from which we return to a
– This tour is minimum, which won’t always be the case
– For example, if d is considered to be the root, the algorithm yields the path in Figure 8.38j, clearly not minimal
– In another version of the algorithm, a tour is extended by adding to it the closest city
– Since the tour is kept in one piece, it resembles a method developed by Vojtěch Jarník in 1930 (and separately by Robert C. Prim in 1957)
Eulerian and Hamiltonian Graphs (continued)
• Hamiltonian Graphs (continued) – The Traveling Salesman Problem
– This algorithm is shown on page 458
– An example of its application is shown in Figure 8.39 on pages 458 and 459
Graph Coloring
• Occasionally, we want to determine the minimum number of disjoint sets of vertices, where the vertices in each set are independent
• By this we mean that the vertices are not connected by any edge
• For example, we may have several tasks to be performed by several people
• If one task can be performed by one person at one time, the tasks must be scheduled so that this is possible
• We can let the tasks be represented by vertices of a graph, and join with an edge two tasks that require the same person
Graph Coloring (continued)
• Then we try to construct the minimum number of sets of independent tasks
• Because all the tasks in a given set can be done concurrently, the number of sets indicates the number of time slots needed
• As a variation of this, we could join with an edge those tasks that cannot be performed concurrently
• As before, the independent sets indicate the tasks that can be performed at the same time
• However, in this case the minimum number of sets indicates the minimum number of people needed to perform the tasks
• In general, two vertices are joined by an edge if they cannot be members of the same class
Graph Coloring (continued)
• We can restate the problem to say that vertices of a graph are assigned colors so that vertices joined by an edge have different colors
• So the task amounts to coming up with a graph coloring using a minimum number of colors
• More formally, given a set of colors, C, we determine a function f : V → C so that if edge(vw) exists, f(v) ≠ f(w) and C is of minimum cardinality
• The chromatic number of a graph G is the minimum number of colors needed to color the graph, denoted χ(G)
• A graph for which χ(G) ≤ k is called k-colorable
Graph Coloring (continued)
• There may be many colorings that use the minimum number of colors; no general formula exists for the chromatic number of an arbitrary graph
• There are some special cases, however:
– A complete graph, Kn, has the chromatic number χ(Kn) = n
– For a cycle with an even number of edges, C2n, χ(C2n) = 2
– For a cycle with an odd number of edges, C2n+1, χ(C2n+1) = 3
– For a bipartite graph, G, χ(G) ≤ 2
• The determination of a graph’s chromatic number is an NP-complete problem
• Consequently, techniques need to be used that can color a graph with a number of colors close to the chromatic number
Graph Coloring (continued)
• Sequential coloring is an approach that establishes sequences of vertices and colors before coloring the vertices
• Then the next vertex in sequence is colored with the lowest number possible
• This algorithm appears on page 460
• The algorithm does not specify any ordering criteria for the vertices (order of colors makes no difference)
• One possibility is to use the indices assigned to the vertices before the algorithm is executed, as shown in Figure 8.40b
• This can result in a wide disparity between the coloring and the chromatic number, however
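A minimal sketch of sequential coloring follows; it is an illustration under stated assumptions, not the pseudocode from page 460. Colors are the integers 0, 1, 2, …, the graph is given as adjacency lists, and the vertex sequence is passed in explicitly as order so that different orderings can be compared. The name sequentialColoring is hypothetical.

```cpp
#include <vector>

// Color the vertices in the given order, giving each vertex the lowest
// color number not already used by one of its colored neighbors.
std::vector<int> sequentialColoring(const std::vector<std::vector<int>>& adj,
                                    const std::vector<int>& order) {
    std::vector<int> color(adj.size(), -1);            // -1 means uncolored
    for (int v : order) {
        std::vector<bool> used(adj.size() + 1, false);  // colors taken by neighbors
        for (int u : adj[v])
            if (color[u] != -1) used[color[u]] = true;
        int c = 0;
        while (used[c]) ++c;                            // lowest free color
        color[v] = c;
    }
    return color;
}
```

On the path 0-1-2-3, the order 0, 1, 2, 3 uses two colors, while the order 0, 3, 1, 2 uses three, which illustrates how much the ordering matters.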
Graph Coloring (continued)
Fig. 8.40 (a) A graph used for coloring; (b) colors assigned to vertices with the sequential coloring algorithm that orders vertices by index number; (c) vertices are put in the largest first sequence; (d) graph coloring obtained with the Brélaz algorithm
Graph Coloring (continued)
• A theorem due to Dominic Welsh and M. B. Powell (1967) will be of use (the proof is on page 460)
Theorem: For the sequential coloring algorithm, the number of colors needed to color the graph satisfies χ(G) ≤ max_i min(i, deg(vi) + 1), where vi is the i-th vertex in the sequence
• Applying this to the graph of Figure 8.40a, we have χ(G) ≤ max(min(1,4), min(2,4), min(3,3), min(4,3), min(5,3), min(6,5), min(7,6), min(8,4)) = max(1, 2, 3, 3, 3, 5, 6, 4) = 6
• The theorem suggests that vertices of higher degree be placed first, so the min value is their position in the sequence
• Vertices of lower degree get placed last, so their minimum value is the degree of the vertex
• This leads to the largest first approach, where the vertices are ordered in descending order by degree
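The bound and the largest first ordering can both be computed directly from the vertex degrees. The sketch below uses hypothetical names (largestFirstOrder, welshPowellBound) and 0-based vertex indices. The degrees used in the usage note are inferred from the min(i, deg + 1) terms listed above for Figure 8.40a, not read from the figure itself.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Vertices sorted by descending degree; ties keep index order (stable sort).
std::vector<int> largestFirstOrder(const std::vector<int>& degree) {
    std::vector<int> order(degree.size());
    std::iota(order.begin(), order.end(), 0);
    std::stable_sort(order.begin(), order.end(),
                     [&](int a, int b) { return degree[a] > degree[b]; });
    return order;
}

// max over positions i (1-based) of min(i, deg(v_i) + 1) for the given sequence.
int welshPowellBound(const std::vector<int>& degree, const std::vector<int>& order) {
    int bound = 0;
    for (std::size_t i = 0; i < order.size(); ++i) {
        int position = static_cast<int>(i) + 1;
        bound = std::max(bound, std::min(position, degree[order[i]] + 1));
    }
    return bound;
}
```

With degrees 3, 3, 2, 2, 2, 4, 5, 3 for v1 through v8, the index ordering gives a bound of 6, and the largest first ordering gives a bound of 4.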
Graph Coloring (continued)
• Doing it this way gives us the order v7, v6, v1, v2, v8, v3, v4, v5, where v7 gets colored first, as seen in Figure 8.40c
• This also gives us a better estimate of the chromatic number, because with this ordering χ(G) ≤ 4
• Although this ordering method uses a single criterion, there is no restriction on the number of criteria that can be applied
• This can be helpful in breaking ties, since in our example, two vertices with the same degree are chosen by their index order
• In 1979, Daniel Brélaz proposed an algorithm where the saturation degree of a vertex (the number of different colors used on the vertex’s neighbors) determines which vertex is colored next
Graph Coloring (continued)
• If a tie occurs, it is broken by choosing the vertex with the largest uncolored degree, which is the number of uncolored vertices adjacent to the vertex
• This algorithm is on page 462, and is applied in Figure 8.40d
• First we choose v7 because it has the highest degree; then vertices 1, 3, 4, 6 and 8 have their saturations set to 1
• From these, v6 is chosen, since it has the most uncolored neighbors
• The saturations of vertices 1 and 8 are changed to 2, and since their saturations and numbers of uncolored neighbors are equal, we rely on the index to select v1; the remainder are as shown in the figure
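The selection rule described above (highest saturation, then largest uncolored degree, then lowest index) can be sketched as follows. This is an illustrative version, not the pseudocode from page 462; the name brelazColoring is hypothetical and colors are numbered from 0.

```cpp
#include <set>
#include <vector>

std::vector<int> brelazColoring(const std::vector<std::vector<int>>& adj) {
    const int n = static_cast<int>(adj.size());
    std::vector<int> color(n, -1);                       // -1 means uncolored
    std::vector<std::set<int>> neighborColors(n);        // distinct colors seen nearby
    for (int step = 0; step < n; ++step) {
        int best = -1, bestSaturation = -1, bestUncolored = -1;
        for (int v = 0; v < n; ++v) {
            if (color[v] != -1) continue;
            int saturation = static_cast<int>(neighborColors[v].size());
            int uncolored = 0;
            for (int u : adj[v])
                if (color[u] == -1) ++uncolored;
            // Strict comparisons leave index order as the final tie-breaker.
            if (saturation > bestSaturation ||
                (saturation == bestSaturation && uncolored > bestUncolored)) {
                best = v;
                bestSaturation = saturation;
                bestUncolored = uncolored;
            }
        }
        int c = 0;
        while (neighborColors[best].count(c)) ++c;       // lowest color not used nearby
        color[best] = c;
        for (int u : adj[best]) neighborColors[u].insert(c);
    }
    return color;
}
```

In the first step every saturation is 0, so the vertex of highest degree is chosen, matching the choice of v7 in the example.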
NP-Complete Problems in Graph Theory
• The Clique Problem
– A clique in an undirected graph is a subset of its vertices such that every two vertices in the subset are connected by an edge
– The clique problem is to determine, for some graph G, whether or not it contains a clique Km for a given integer m
– The problem is in NP because we can check in polynomial time whether a set of m vertices forming a subgraph is a clique
– To show it is NP-complete, we can reduce the 3-satisfiability problem to the clique problem
– The reduction is performed by showing that for a Boolean expression BE in CNF with three variables (or their negations) in each alternative, we can construct a graph such that the expression is satisfiable if and only if there is a clique of m vertices in the graph
NP-Complete Problems in Graph Theory (continued)
• The Clique Problem (continued)
– We will let m be the number of alternatives in BE, such that we have BE = A1 ∧ A2 ∧ … ∧ Am
– Each Ai = (p ∨ q ∨ r), where p, q, and r are the three Boolean variables or their negations
– A graph is constructed where the vertices represent all the variables and their negations found in BE, with one vertex for each occurrence in an alternative
– An edge will join two vertices if they are not complements and they are in different alternatives
– The expression BE = (x ∨ y ∨ ¬z) ∧ (x ∨ ¬y ∨ ¬z) ∧ (w ∨ ¬x ∨ ¬y) corresponds to the graph in Figure 8.41
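The construction can be sketched in code. The representation below is an assumption made for illustration: a literal is a nonzero integer (k for variable k, -k for its negation), an alternative is an array of three literals, and vertex 3·i + j stands for literal j of alternative i. The names Clause, buildCliqueGraph, extendClique, and satisfiableViaClique are hypothetical. The clique search is a plain exponential backtracking search; the reduction shows equivalence, not an efficient algorithm.

```cpp
#include <array>
#include <vector>

using Clause = std::array<int, 3>;   // k > 0: variable k; k < 0: its negation
using Matrix = std::vector<std::vector<bool>>;

// Join two vertices if they lie in different alternatives and are not complements.
Matrix buildCliqueGraph(const std::vector<Clause>& be) {
    const int n = 3 * static_cast<int>(be.size());
    Matrix adj(n, std::vector<bool>(n, false));
    for (int a = 0; a < n; ++a)
        for (int b = 0; b < n; ++b) {
            bool differentAlternatives = (a / 3 != b / 3);
            bool complements = (be[a / 3][a % 3] == -be[b / 3][b % 3]);
            adj[a][b] = differentAlternatives && !complements;
        }
    return adj;
}

// Try to pick one vertex per alternative so that all picks are pairwise adjacent.
bool extendClique(const Matrix& adj, int m, std::vector<int>& chosen) {
    const int i = static_cast<int>(chosen.size());
    if (i == m) return true;                       // an m-clique has been found
    for (int j = 0; j < 3; ++j) {
        const int v = 3 * i + j;
        bool adjacentToAll = true;
        for (int u : chosen)
            if (!adj[u][v]) adjacentToAll = false;
        if (!adjacentToAll) continue;
        chosen.push_back(v);
        if (extendClique(adj, m, chosen)) return true;
        chosen.pop_back();
    }
    return false;
}

bool satisfiableViaClique(const std::vector<Clause>& be) {
    std::vector<int> chosen;
    return extendClique(buildCliqueGraph(be), static_cast<int>(be.size()), chosen);
}
```

Because vertices of the same alternative are never adjacent, any m-clique must take exactly one vertex from each alternative, which is why the search only needs to try one literal per alternative.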
NP-Complete Problems in Graph Theory (continued)
• The Clique Problem (continued)
Fig. 8.41 A graph corresponding to the Boolean expression (x ∨ y ∨ ¬z) ∧ (x ∨ ¬y ∨ ¬z) ∧ (w ∨ ¬x ∨ ¬y)
– An edge between variables represents the possibility that both variables are true at the same time
NP-Complete Problems in Graph Theory (continued)
• The Clique Problem (continued)
– An m-clique represents the possibility that a variable from each alternative is true, making the BE true
– Each triangle in Figure 8.41 represents a 3-clique
– This way, if BE is satisfiable, an m-clique exists, and if an m-clique exists, BE is satisfiable
– So the satisfiability problem is reduced to the clique problem
– Since the satisfiability problem is NP-complete, the clique problem is NP-complete as well
NP-Complete Problems in Graph Theory (continued)
• The 3-Colorability Problem
– The 3-colorability problem is the question of whether or not a graph can be colored with three colors
– As with the clique problem, we’ll show this is NP-complete by reducing the 3-satisfiability problem to it
– The problem is in NP because we can guess a coloring of the vertices in three colors and check that the coloring is correct in quadratic time
– We will use an auxiliary 9-subgraph to reduce the 3-satisfiability problem to the 3-colorability problem
– The 9-subgraph takes 3 vertices from an existing graph and adds 6 new vertices and 10 edges, as can be seen in Figure 8.42a
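The quadratic-time check mentioned above can be sketched as a scan over all vertex pairs. This is a minimal illustration assuming an adjacency-matrix representation and colors numbered from 0; the name isProperColoring is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Returns true if every vertex has a color in [0, numColors) and no edge
// joins two vertices of the same color. Runs in O(|V|^2) time.
bool isProperColoring(const std::vector<std::vector<bool>>& adj,
                      const std::vector<int>& color, int numColors = 3) {
    const std::size_t n = adj.size();
    for (std::size_t v = 0; v < n; ++v)
        if (color[v] < 0 || color[v] >= numColors) return false;
    for (std::size_t v = 0; v < n; ++v)
        for (std::size_t u = v + 1; u < n; ++u)
            if (adj[v][u] && color[v] == color[u]) return false;
    return true;
}
```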
NP-Complete Problems in Graph Theory (continued)
• The 3-Colorability Problem (continued)
Fig. 8.42 (a) A 9-subgraph; (b) a graph corresponding to the Boolean expression (¬w ∨ x ∨ y) ∧ (¬w ∨ ¬y ∨ z) ∧ (w ∨ ¬y ∨ ¬z)
NP-Complete Problems in Graph Theory (continued)
• The 3-Colorability Problem (continued)
– Now, consider the set of three colors {f, t, n} corresponding to (fuchsia/false, turquoise/true, nasturtium/neutral) used to color the graph
– The following lemma will help us in demonstrating the reducibility of the 3-satisfiability problem to the 3-colorability problem
Lemma. 1) If all three vertices, v1, v2, and v3, of a 9-subgraph are colored with f, then vertex v4 must also be colored with f to have the 9-subgraph colored correctly. 2) If only colors t and f can be used to color vertices v1, v2, and v3 of a 9-subgraph, and at least one is colored with t, then vertex v4 can be colored with t
NP-Complete Problems in Graph Theory (continued)
• The 3-Colorability Problem (continued)
– Now the graph for the given Boolean expression BE of k alternatives is constructed in the following way
– There are two special vertices, a and b, and edge(ab) in the graph; also there is a vertex for each variable in BE and for its negation
– The graph includes edge(ax), edge(a(¬x)), and edge(x(¬x)) for each vertex, x, and its negation, ¬x
– Now, for each alternative p ∨ q ∨ r included in BE, the graph has a 9-subgraph whose vertices v1, v2, and v3 correspond to the three Boolean variables or their negations p, q, and r
– Lastly, the graph includes edge(v4b) for each 9-subgraph
NP-Complete Problems in Graph Theory (continued)
• The 3-Colorability Problem (continued)
– The graph corresponding to (¬w ∨ x ∨ y) ∧ (¬w ∨ ¬y ∨ z) ∧ (w ∨ ¬y ∨ ¬z) is shown in Figure 8.42b
– Now we can claim that if a Boolean expression BE is satisfiable, the graph corresponding to it is 3-colorable
– For every variable x in BE, if x is true we set color(x) = t and color(¬x) = f; otherwise color(x) = f and color(¬x) = t
– If each alternative in BE is satisfiable, then the Boolean expression is satisfiable
– This takes place when at least one variable or its negation is true in each alternative
NP-Complete Problems in Graph Theory (continued)
• The 3-Colorability Problem (continued)– Since each neighbor of a has color t or f, and since at least one of the
three vertices of each 9-subgraph has color t, each 9-subgraph is 3-colorable
– Thus color(v4) = t, and the entire graph is 3-colorable by setting color(a) = n and color(b) = f
– Now, suppose a graph as in Figure 8.42b is 3-colorable and that color(a) = n and color(b) = f
– Since color(a) = n, the neighbors of a have color f or t, and this can be interpreted as the Boolean variable associated with the vertices being true or false
NP-Complete Problems in Graph Theory (continued)
• The 3-Colorability Problem (continued)
– Only if all three vertices v1, v2, and v3 of any 9-subgraph have color f can vertex v4 have color f, but this would conflict with color f of vertex b
– So no 9-subgraph’s vertices can all have color f; one must be t
– As a consequence, the alternative corresponding to each 9-subgraph is true, so the entire Boolean expression is satisfiable
NP-Complete Problems in Graph Theory (continued)
• The Vertex Cover Problem
– A vertex cover of a graph is a set of vertices such that each edge of the graph is incident to at least one vertex of the set
– In this way the vertices in the set cover all the edges
– The problem of determining whether a graph, G, has a vertex cover containing at most k vertices for some integer k is NP-complete
– This problem is in NP because a solution can be guessed and checked in polynomial time
– To show it is NP-complete, we’ll reduce the clique problem to the vertex cover problem
NP-Complete Problems in Graph Theory (continued)
• The Vertex Cover Problem (continued)
– The first thing to do is define a complement graph of G that has the same vertices, but whose edges are exactly those not in G
– The reduction algorithm converts a graph G with a (|V| – k)-clique into its complement with a vertex cover of size k
– If C = (VC, EC) is a clique in G, vertices from V – VC cover all the edges in the complement, because the complement has no edges with both endpoints in VC
– As a result, V – VC is a vertex cover in the complement graph
– Figure 8.43a shows a graph with a clique and 8.43b shows a complement graph with a vertex cover
– Now suppose there is a vertex cover W for the complement graph
NP-Complete Problems in Graph Theory (continued)
• The Vertex Cover Problem (continued)
– If W contains none of the endpoints of an edge, that edge must be in G, meaning these endpoints are in V – W
– Therefore, VC = V – W forms a clique
– As a result, this proves a positive answer to the clique problem is a positive answer to the vertex cover problem through the conversion
– And since the former is NP-complete, so is the latter
Fig. 8.43 (a) A graph with a clique; (b) a complement graph
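The correspondence between a clique in G and a vertex cover in the complement can be checked on small examples with a sketch like the one below. The adjacency-matrix representation and the names complementGraph, isClique, and isVertexCover are assumptions made for illustration; vertex sets are passed as membership flags.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<bool>>;

// Same vertices as g; an edge joins two distinct vertices iff g has no such edge.
Matrix complementGraph(const Matrix& g) {
    const std::size_t n = g.size();
    Matrix c(n, std::vector<bool>(n, false));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            if (i != j) c[i][j] = !g[i][j];
    return c;
}

// Every two vertices of the set must be joined by an edge.
bool isClique(const Matrix& g, const std::vector<bool>& inSet) {
    for (std::size_t i = 0; i < g.size(); ++i)
        for (std::size_t j = i + 1; j < g.size(); ++j)
            if (inSet[i] && inSet[j] && !g[i][j]) return false;
    return true;
}

// Every edge must have at least one endpoint in the cover.
bool isVertexCover(const Matrix& g, const std::vector<bool>& inCover) {
    for (std::size_t i = 0; i < g.size(); ++i)
        for (std::size_t j = i + 1; j < g.size(); ++j)
            if (g[i][j] && !inCover[i] && !inCover[j]) return false;
    return true;
}
```

For a graph with edges 0-1, 0-2, 1-2, and 2-3, the set {0, 1, 2} is a clique, and its complement set {3} covers both edges (0-3 and 1-3) of the complement graph.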
NP-Complete Problems in Graph Theory (continued)
• The Hamiltonian Cycle Problem
– That the Hamiltonian cycle problem is NP-complete can be shown by reducing the vertex cover problem to the Hamiltonian cycle problem
– We will make use of an auxiliary 12-subgraph, as shown in Figure 8.44a
– Each edge(vu) of the graph G is converted into a 12-subgraph so that one side of the subgraph (vertices a and b) corresponds to a vertex v of G and the other side (vertices c and d) corresponds to vertex u
– After entering a side of the 12-subgraph at vertex a, we can go through all 12 vertices in order a, c, d, b and exit at b on the same side
– We can also go directly from a to b, and if there is a Hamiltonian cycle in the entire graph, vertices c and d are traversed in another visit of the 12-subgraph
NP-Complete Problems in Graph Theory (continued)
• The Hamiltonian Cycle Problem (continued)
Fig. 8.44 (a) A 12-subgraph; (b) a graph G and (c) its transformation, graph GH
NP-Complete Problems in Graph Theory (continued)
• The Hamiltonian Cycle Problem (continued)
– Any other path through the 12-subgraph would render building a Hamiltonian cycle of the entire graph impossible
– Now, assuming we have a graph G, we can proceed to build another graph, GH, in the following manner
– We first create a set of vertices u1, u2, …, uk, where the value k is the parameter that corresponds to the vertex cover problem for graph G
– Next, for each edge of G, we create a 12-subgraph, and those 12-subgraphs associated with a vertex v are connected together on the sides corresponding to v
– Finally, each endpoint of such a string of 12-subgraphs is connected to the vertices u1, u2, …, uk
NP-Complete Problems in Graph Theory (continued)
• The Hamiltonian Cycle Problem (continued)
– The result of this transformation from G to GH for k = 3 is shown in Figure 8.44b-c
– Figure 8.44c only shows some of the connections, to avoid clutter; the small segments from the other vertices indicate other connections
– Now the claim is that there is a Hamiltonian cycle in GH if and only if there is a vertex cover of size k in G
– We’ll start by assuming there is a vertex cover in G, designated by the set W = {v1, v2, …, vk}
– Next, we’ll show there is a Hamiltonian cycle in GH, which is formed by the following procedure
NP-Complete Problems in Graph Theory (continued)
• The Hamiltonian Cycle Problem (continued)– Starting at u1, we go through the sides of 12-subgraphs corresponding
to v1
– We will go through all the 12 vertices of a particular 12-subgraph if the other side of it does not correspond to a vertex in set W
– Otherwise we go straight through the 12-subgraph, which means we won’t traverse 6 of the vertices corresponding to a vertex w
– However, we will traverse them when we process that part of the Hamiltonian cycle corresponding to w
– Once we reach the end of the string of 12-subgraphs, we go to vertex u2 and repeat this process for vertex v2, etc.
– For the last vertex uk, we process vk and end the path at u1
NP-Complete Problems in Graph Theory (continued)
• The Hamiltonian Cycle Problem (continued)
– The result of this is the creation of a Hamiltonian cycle
– The thick line in Figure 8.44c represents the part of the Hamiltonian cycle corresponding to v1 that starts at u1 and ends at u2
– Because the cover in this case is W = {v1, v2, v6}, this processing continues at u2 and ends at u3 for v2, and then for v6 from u3 to u1
– Conversely, if GH has a Hamiltonian cycle, the cycle includes k subpaths through strings of 12-subgraphs, and these subpaths correspond to the k vertices in G that form a cover
– Consequently, we have shown the reducibility of the vertex cover problem to the Hamiltonian cycle problem, and since the former is NP-complete, so is the latter
NP-Complete Problems in Graph Theory (continued)
• The Hamiltonian Cycle Problem (continued)
– As an afterthought, now consider the traveling salesman problem
– Given a graph with a distance assigned to each edge, we try to identify a cycle through all the vertices with a total distance not greater than some integer, k
– We can demonstrate this problem is NP-complete by reducing the Hamiltonian cycle problem to it
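One standard way to carry out such a reduction, which the slides do not spell out and which is therefore an assumption here, is to build a complete graph on the same vertices, give distance 1 to pairs joined by an edge in G and distance 2 to all other pairs, and ask for a tour of total distance at most k = |V|. The sketch below uses hypothetical names (toTspInstance, hasTourWithin, hasHamiltonianCycle), assumes at least three vertices, and decides the resulting instance by brute force over all permutations, so it is only usable on very small graphs.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Distance 1 for edges of g, 2 for non-edges (assumed construction).
std::vector<std::vector<int>> toTspInstance(const std::vector<std::vector<bool>>& g) {
    const std::size_t n = g.size();
    std::vector<std::vector<int>> w(n, std::vector<int>(n, 0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            if (i != j) w[i][j] = g[i][j] ? 1 : 2;
    return w;
}

// Decision version of the TSP: is there a tour of total distance <= k?
bool hasTourWithin(const std::vector<std::vector<int>>& w, int k) {
    const std::size_t n = w.size();
    if (n < 3) return false;                       // assumption: at least 3 vertices
    std::vector<int> perm(n);
    std::iota(perm.begin(), perm.end(), 0);
    do {                                           // vertex 0 stays first
        int total = 0;
        for (std::size_t i = 0; i < n; ++i)
            total += w[perm[i]][perm[(i + 1) % n]];
        if (total <= k) return true;
    } while (std::next_permutation(perm.begin() + 1, perm.end()));
    return false;
}

// G has a Hamiltonian cycle iff the derived instance has a tour of length |V|.
bool hasHamiltonianCycle(const std::vector<std::vector<bool>>& g) {
    return hasTourWithin(toTspInstance(g), static_cast<int>(g.size()));
}
```

A tour of length exactly |V| can use only distance-1 pairs, so it is a Hamiltonian cycle of G; any tour that uses a non-edge has length at least |V| + 1.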