Download - Chapter 3 Graphs, Trees, and Tours

Chapter 3

Graphs, Trees, and Tours

Presented by Qibin Cai

Overview

Terminology in graph theory

Trees - Minimum spanning tree (MST) - Shortest path tree (SPT)

Tours - TSP tours

Overview cont’d

Building trees

- Kruskal’s algorithm- Prim’s algorithm- Dijkstra’s algorithm- Prim-Dijkstra

algorithm

Building tours

- Nearest-neighbor algorithm

- Improved nearest-neighbor heuristic

- Divide and conquer strategy

Terminology

What is a graph?

Observation:

A graph is a set of points in a plane (or in 3-space) and a set of line segments (possibly curved), each of which either joins two points or joins a point to itself.

Some definitionsGraphsGraphs

A,B,C etc. are vertices(nodes)A,B,C etc. are vertices(nodes) (A,X), (X,Y) etc. are edges(A,X), (X,Y) etc. are edges P,Q,Z is a cycleP,Q,Z is a cycle Degree of a node is the number of edges at the nodeDegree of a node is the number of edges at the node

– Degree Y =3, degree C=1Degree Y =3, degree C=1

X

Y

Z

P Q

A

B

D

C

Terminology cont’d

Definition in mathematical language?

A graph G = (V, E) is a mathematical structure consisting of two sets V and E. The elements of V are called vertices (or nodes), and the elements of E are called edges.

Digragh : a directed graph


Endpoints : a set of one or two vertices associated to each edge.Loop: an edge where both endpoints are the same. Also called a self-loop.Parallel edges: a collection of two or more edges having identical end.

Also called a multi-edge.


A graph is simple if it has no loops or parallel edges.

Most of our discussions will involve simple graphs. Sometimes, when we considering reliability, we will introduce parallel edges if the network has parallel links.


The degree of a node: the number of edges in the graph that have the node as an endpoint (plus twice the number of self-loops).

Indegree

Outdegree


Adjacent vertices:

Two nodes are adjacent if there is an edge that has them as endpoints.

Incidence:

The relationship between an edge and its endpoints.


Walk from vertex u to vertex v:

an alternating sequence of vertices and edges, representing a continuous traversal from vertex u to vertex v.

Trail: a walk with no repeated edges.

Path: a walk with no repeated vertices.


Cycle:

a closed path with at least an edge.

Connected graph:

a graph in which every pair of distinct vertices has a walk between them.


Subgraph: A graph G’=(N’,A’) is a subgraph of G=(N,A) if N’ N and

A’ A.

Component:

a maximal connected subgraph of a graph.


Isomorphism:

Two graphs G1 and G2 are isomorphic if there is a 1-to-1 mapping f: v1 -> v2 such that (v1, v2) E1 if and only if (f(v1), f(v2)) E2.

Trees

Tree: a connected, simple graph without cycles.

Star: a tree in which only 1 node has degree greater than 1.

Chain: a tree in which no node has degree greater than 2.

Any tree with n nodes has n-1 edges.

Trees

A tree is a connected simple graph with no cycles e.g.

XY

Z

P Q

A

B

D

C

Star

A tree is a star if only 1 node has degree >1

X

Y

Z

PQ

A

B

D

C

Chains

A chain is a tree with no nodes of degree >2

XY

Z

P Q

A

B

D

C

Weighted Graph

A graph G is weighted if there is a value associated with each edge (e.g. link speed, cost, etc.) Weight of the edge ei = w(ei)

We often denote this graph (G, w). If G’ is any subgraph of G, then w(G’) = '

)(Ge

ew

To optimise a connected graph find the graph with the minimum weight

The Minimal Spanning Tree (MST)

Minimal Spanning Trees

Let G be a connected weighted graph.

A spanning subgraph includes all the nodes of G.

A tree T is a spanning tree of G if T is a spanning subgraph of G.

MST: A spanning tree of G whose total edge-weight is a minimum.

Finding the MST

Two algorithms Kruskal and Prim

Kruskal achieves the MST by starting with a graph and cutting out edges

Prim starts by selecting a node, adding the “least expensive edge” iterates until tree is built

Example MST

Use of MSTs

Small design problems - few nodes

Highly reliable links with low “downtime”or network can tolerate unreliability

Nodes ‘v’ reliabilityAs the number of nodes increases

reliability decreases (exponentially!)

Kruskal’s Algorithm (1956)

1. Check that the graph G is connected. If it is not connected, abort.2. Sort the edges of the graph G in ascending order of weight.3. Mark each node as a separate component.4. Examine each of the sorted edges:

if the edge connects two separate components, add it ; otherwise, discard.

Prim’s Algorithm (1957)

Input : a weighted connected graph G=(N,E).

Output : a minimum spanning tree T.

U = set of all nodes in MST

V = set of all nodes that are NOT yet in MST, but they’re adjacent to nodes in U.

Prim’s Algorithm (cont’d)

1. Place any node in U, and update V .

2. Find the edge with smallest weight that connects a node in V to a node in U3. Add that edge to the tree,

and update U & V.4. Repeat 2 & 3 until all nodes are included, i.e., | U | = | N |.

How to use Delite to Calculate MST’s

Invoke the code for Prim’s algorithm from the Design menu.

Select to produce a trace file

Demonstration

Tree Designs

Overview• Squareworld• Coordinate systems:• - V & H• - L & L• MSTs do not scale• Definitions: hops(n1,n2), hops

Square World

We will create a little world with several properties that make it a nice place to work on network design problem.

The world is 1000 miles by 1000 miles. 1 type of transmission line with a capacity of 1,000,000 bps. Given 2 sites, S1 at location (X1, Y1) and S2 at location (X2,Y2),

the cost of a link between them is

($1000 + $10 x d) / month

where

212

212 )()( yyxxd

2 Coordinate Systems

We will use a problem generator to set up a series of network design problems. Before we can do this, we need to know something about the methods of locating sites are used in the real world.

Vertical and horizontal (V&H) - a grid of lines, or more accurately curves is drawn. - allows for a simplified computation of distances.

Latitude and longitude (L&L) - defined for all locations on the surface of the earth. - The distance calculation is essentially an exercise in

spherical geometry.

MSTs Do Not scale

Why? First look at an example.Figure 3.2 (An MST for 5 nodes in square world)

N2

N1 N5

N4

N3 MAX_UTIL=0.6%

Figure 3.3 (An MST for 10 nodes in square world) N6

N2 N7 N10

N9 N1 N5

N4 N8 N3MAX_UTIL=2.5%

The network is beginning to have a leggy look, which means that the traffic is taking a circuitous route between its source and destination.

To qualify the legginess in the network, we make the following definition.

Two Definitions

Definitions 3.17 - The number of hops

between node n1 and n2 is the number of edges in the path chosen by the routing algorithm for the traffic flowing from n1 to n2.

Denoted by hops (n1,n2)

Definitions 3.18

- The average number of hops in a network is:

Denoted by

2,1

2,1

)2,1(

)2,1()2,1(

nn

nn

nnTraffic

nnhopsnnTraffic

hops

MSTs Do Not scale (cont’d)

We summarize the values of as below:

Number of nodes

5 1.8

10 3.1778

20 4.4158

50 8.5159

100 13.9479

hops

hops #hops grow past a reasonable level, and MSTs are not good solutions as # nodes and the traffic grow.

Then we will consider if we can design better trees.

Shortest-Path Trees (SPT)

Definition 3.19 Given a weighted graph (G,W) and nodes n1 and n2, the

shortest path from n1 to n2 is a path P such that is a minimum.

Definition 3.20 Given a weighted graph (G,W) and a node n1, a shortest –

path tree rooted at n1 is a tree T such that, for any other node n2 G, the path from n1 to n2 in the tree T is a shortest path between the nodes.

Pe

ew )(

Dijkstra’s Algorithm

1. Mark every node as unscanned and give each node a label of

2. Set the label of the root to 0 and the predecessor of the root to itself. The root will be the only node that is its own predecessor.

Dijkstra’s Algorithm (cont’d)

3. Loop until you have scanned all the nodes.

-Find the node n with the smallest label. Since the label represents the distance to the root we call it d_min.

-Mark the node as scanned.

-Scan all the adjacent nodes m and see if the distance to the root through n is better than the distance stored in the label of m. If it is, update the label and update pred[m]=n.

4. When the loop finishes, we have a tree stored in pred format rooted at root.

SPT vs. MST

Lower utilization of the links

More cost

Important:

Smaller average number of hops

Actually we will compare star (a kind of SPT) with MST, because …

Star vs. MST

If we run Dijkstra’s algorithm on a sparse graph, we will get a tree with a fair number of nodes not connected directly to the root.

If we run Dijkstra’s algorithm on a complete graph (exactly what we’re studying now), then we usually get a star.

SPT vs. MST

Star vs. MST (cont’d)

Design name MAX_UTIL Cost

MST 13.9479 0.493 $325.516

Star 1.9800 0.09 $453.861

hops

Prim’s algorithm produces much shorter paths but can produce very expensive networks. SPT is not good, either.

Is there some middle ground between MST and SPT?

Prim – Dijkstra Trees

Algorithm : Label

1) Prim’s:

2) Dijkstra’s:

3) Prim-Dijkstra’s:

),(min neighbornodedistneighbors

)),(),((min nodeneighbordistneighborrootdistneighbors

)),(),((min nodeneighbordistneighborrootdistneighbors

10

Prim – Dijkstra Trees (cont’d)

If , we build a MST.

If , we build a SPT.

The delay, and cost for various Prim-Dijkstra trees.

01

Design Link delay Cost

0(MST) N0 13.9479 0.3066 $325,516

0.1 N1 10.5717 0.1451 $280,162

0.2 N2 7.8640 0.1067 $247,217

0.3 N3 6.7762 0.0913 $243,551

0.4 N4 5.6679 0.0746 $248,650

0.5 N5 4.6303 0.0598 $253,579

0.6 N6 3.7063 0.0467 $273,742

0.7 N7 3.0186 0.0380 $295,012

0.8 N8 2.2879 0.0277 $378,792

0.9 N9 1.9800 0.0233 $453,861

hops

If we have such a large set of designs shown in the table, how to select the best?

Dominance among Designs

If we have a large set of designs, the problem is to decide which merit consideration and which should be discarded.

To help us do this we impose a partial ordering on the designs.

Definition 3.21:

Given a set S and an operator that maps S x S {TRUE,FALSE}, then we call S a partially ordered set, or poset, if

1) For any s S, s s is FALSE.

2) For any s1, s2 S, s1 s2, if s1 s2 is TRUE, then s2 s1 is FALSE.

3) If s1 s2 and s2 s3 are TRUE, then s1 s3 is TRUE.

Dominance among Designs(cont’d)


Definition 3.22:

Suppose design D1 has cost C1 and performance P1. Suppose design D2 has cost C2 and performance P2. We will say D1 dominates D2, or D1 D2, if C1 < C2 and P1 > P2.


Design Dominates Link delay Cost

N0 0.3066 $325,516

N1 N0 0.1451 $280,162

N2 N0,N1 0.1067 $247,217

N3 N0,N1,N2 0.0913 $243,551

N4 N0,N1 0.0746 $248,650

N5 N0,N1 0.0598 $253,579

N6 N0,N1 0.0467 $273,742

N7 N0 0.0380 $295,012

N8 0.0277 $378,792

N9 0.0233 $453,861


Show dominance relationships as a directed graph:

A directed graph is a graph G=(V,E) in which each edge e has been given an orientation. If the edge has endpoints v1 and v2, we shall denote the edge e=(v1,v2) if the orientation of v1 is the source vertex.


Think of the designs (N0,N1,N2,…,N9) as the nodes of a graph.

A directed edge runs from Ni to Nj if Ni Nj.

We can see we don’t want to consider N0, N1, or N2.

N9

N5

N7

N3

N0

N1

N2

N6

N4

N8

Further Analysis ofPrim-Dijkstra Trees

Given a pair of nondominating designs S1 and S2, 1 must be cheaper and 1 must have lower delay.

After rejecting the dominated designs, we still have 7 designs left to choose from. One way to clarify their differences further is to discuss the marginal cost of delay:

C1 - C2

P2 - P1

Using Delite to ProducePrim-Dijkstra Trees

Unlike Prim’s algorithm, the choice of the node at the center of the tree is important in the Prim-Dijkstra algorithm.

The value of .

Create trace file.

Tours

Sometimes a tree is just too unreliable to be a good network design.

Tours are far more reliable yet only have 1 additional link.

In graph theory, a tour refers to a possible solution of the traveling salesman problem (TSP).

Tours (cont’d)

Definition 3.24:

Given a set of vertices , a tour T is a set of n edges E such that each vertex v has degree 2 and the graph is connected.

1tv },...,,{ 21 nvvv

Tours (cont’d)

The number of tours is

Proof :

1) Represent the tour as a permutation:

2) There are n! such permutations, but the reverse permutation also gives the same tour.

2/)!1( n

),...,,(21 nttt vvv

TSP

Definition 3.25:

Given a set of vertices

and a distance function ,

the traveling salesman problem is to find the tour T such that

is a minimum.

In this notation we identify by

),...,,(21 nttt vvv

VVd :

n

itt iivvd

1

),(1

1ntv

1tv

Reliability

Definition 3.26: The reliability of a network is the probability that the functioning nodes are connected by working links.

Assumption:

1) The probability of each node working is 1;

2) The probability of a link failing is p .

Typically, p is a small number.

3) There is no correlation between the link failures.

Reliability (cont’d)

For the 5-node tree:

b d

a e

c

P(no_link_failures) =

P(failure) =

=

4)1( p4)1(1 p

432 4641 pppp p4


For the 5-node ring network :

b d

a e

c

If q = 1-p, then

45 51)( pqqfailurePring 542332 51010 pqpqpqp 3210 qp


p 4p (tree)

0.1 0.4 0.0729

0.01 0.04 0.00097

0.001 0.004

… … …

)(10 32 ringqp

510

410 4104 710

Building Tours

TSP is NP-hard.

No polynomial – time algorithm.

We will use heuristic algorithms.

For our purpose a heuristic algorithm for tours will build some tour but not necessarily the TSP tour.

Nearest-neighbor Algorithm

1. Start at a distinguished node we call root and set current_node = root.

2. Loop until we have all the nodes in the tour.

Now loop through the nodes and find the node closest to the current_node that is not in the tour. We call this best_node.

Create an edge between current_node and best_node.

Reset the current_node to the best_node.

3. Finally create an edge between the last node and the root to complete the tour.

Nearest-neighbor Algorithm(cont’d)

Observation: * Good (?): We are trying to produce a short tour, we will always

move to the best possible next location.

* Bad (?): When we look at the figure produced, we can see the

lines may cross frequently.

* Can we improve it? How can we measure the goodness (creditability) of an algorithm?

Creditable Algorithms

After a little analysis the lines would be uncrossed by hand, and the creditability of all the work would be brought into question.

Definition 3.27:

A heuristic optimization algorithm produces a creditable result if the result is a local optimum for the problem. Otherwise, it produces an uncreditable result.

Creditable Algorithms(cont’)

Example:

Prim’s and Dijkstra’s algorithms solve the MST and SPT problems. Their results are always creditable.

The nearest-neighbor tour-building algorithm frequently produces uncreditable designs.


Definition 3.28 A suite of network design problems S is a set of triples for

Definition 3.29 A creditability test a program test (net, traffic, cost) that takes a

network problem as input and returns OK or FAIL depending on whether or not test () can manipulate net into another valid network of lower cost.

Definition 3.30 Given a suite of network design problems S, a design algorithm A,

and a creditability test t ( ) then

),,( iii CostTrafficLocation Si ,...,

S

OKnettSnetACs

)()(


First creditability test on the nearest-neighbor heuristic. (Refer to the code in Appendix B “Computing creditability of Simple Nearest Neighbor”)

The creditability list : Sites 50 trials 500 trials 5000 trials

6 64.0% 59.4% 58.0%

8 56.0% 52.4% 47.94%

10 34.0% 37.8% 39.84%

15 22.0% 21.8% 22.62%

20 10.0% 13.8% 12.58%

30 8.0% 3.2% 3.84%

40 0.0% 0.8% 1.50%

A more creditable nearest-neighbor algorithm

It differs from the simpler algorithm in two ways:

First, the closest node doesn’t mean the closest node to the last added node; it means the closest node to any node in the partial tour we have built.

Second, when we add best_node to the tour, we don’t add it at the end of the tour, rather, we do a test to find the best place to add it, such that

is the smallest possible value among all nodes and that are adjacent in the partially built tour.

),(),_()_,( jiji NNdistNnodebestdistnodebestNdist

iN

iN jN

A more creditable nearest-neighbor algorithm (cont’d)

The increasing of the creditability of the tour built by the improved nearest-neighbor heuristic :

Sites 50 trials 500 trials 5000 trials 6 64.0% -> 98.0% 59.4%->94.8% 58.0%->95.32% 8 56.0% -> 92.0% 52.4% ->92.6% 47.94%->92.78% 10 34.0% -> 90.0% 37.8% ->90.8% 39.84%->91.04% 15 22.0% -> 86.0% 21.8% ->82.6% 22.62%->82.18% 20 10.0% -> 80.0% 13.8% ->72.6% 12.58%->74.06% 30 8.0% -> 54.0% 3.2% ->60.4% 3.84%->58.94% 40 0.0% -> 56.0% 0.8% ->55.0% 1.50%->48.84%

Time Complexity

Simple nearest-neighbor algorithm:

Improved nearest-neighbor algorithm:

Both heuristics have the same computational complexity.

)( 2n

)()2( 22 nn

A related heuristic

The furthest-neighbor heuristic

vs

the nearest-neighbor heuristic :

The nearest-neighbor heuristic has a tendency to strand the furthest sites, because it brings the site with the smallest dtour into the tour at the best site.

The furthest-neighbor heuristic brings the site with the largest value of dtour into the tour.

TSP Tours Do Not Scale

Theorem 3.3: Given uniform traffic any TSP tour of n nodes has if n is odd and if n is even. (Proof on text pp.86)

Comparison of average number of hops for MST and TSP:

Number of nodes 5 1.8 1.5 10 3.1778 2.777 20 4.4158 5.263 50 8.5159 12.755 100 13.9479 25.252

4

1n

hops

)1(4

2

nn

MSThops TSPhops

Intolerable !

Tour Building and Delite

The Delite tool can build tours using both the improved nearest-neighbor heuristic and the furthest-neighbor heuristic.

Starting node.

Trace file.

Return to Graph Theory:2 - Connectivity

Connectivity: Two nodes i and j are connected if the

graph contains at least one path from node i and j.

A graph is connected if every pair of its nodes is connected.

Return to Graph Theory:2 – Connectivity (cont’d)

The attraction of tours is that they are 2-connected.

Definition 3.31:

Given a connected graph G=(V,E), the vertex v is an articulation point if removing the vertex and all the attached edges disconnects the graph.

Definition 3.32:

If a connected graph G=(V,E) has no articulation points, then the graph is 2-connected.

* Our goal is to produce a design that is 2-connected and that has fewer hops. To this end we quote a helpful theorem.

Return to Graph Theory:2 – Connectivity (cont’d)

Theorem 3.4: Suppose and are 2-connected

graphs with . Let and .

Then the graph G with vertices and edges

is 2-connected.

How can we use this theorem on joining together 2-connected graphs to help us with the hop count problem in tours?

),( 111 EVG ),( 222 EVG

21 VV 121, Vvv 243 , Vvv

21 VV ),(),( 423121 vvvvEE

The answer is a divide-and-conquer strategy.

Divide and Conquer

In the figure, we have a TSP tour on 20 nodes. The average number of hops is 5.263. We want to reduce the average hop count but keep the 2-connectivity.

N20

N13 N6

N2

N7

N15 N9

N14

N10 N1 N5

N9 N12

N16 N18

N17 N4

N8 N11

N3

Divide and Conquer (cont’d)

1. Divide the 20 nodes into 2 “compact” clusters of 10 nodes each. Call these clusters C1 and C2.

(We might divide the 20 nodes by ranges of their coordinates, for example, to create the 2 clusters.)

2. Use the nearest-neighbor algorithm to design 2 TSP tours on each cluster.

3. Select v1 C1 and v2 C2 to be the 2 nodes such that the distance is the minimum.

4. Now select v3 C1-v1 and v4 C2-v2 to be the 2 nodes such that the distance is the minimum.

5. Add the edges (v1,v2), (v3,v4) to the design.


Many questions about the procedure will be addressed later in the book. At this point, we will focus on the generalization to more clusters.

One approach to building networks composed of multiple 2-connected clusters can be stated in the following theorem.


Theorem 3.5 Suppose that is a 2-connected graph with

. Suppose that each node is replaced by a 2-connected graph . Suppose each edge is replaced by an edge from to . Then if no 2 of these replacement edges have a common vertex, the graph

is a 2-connected graph.

),( EVG

2V Vvi iG

Ewue ),(,e

uGu ' wGv'

)',( EEVH iiii


G=(V,E)


H

Summary

If the traffic is small when compared to link size, then the optimal networks are MSTs and TSP tours, depending on the reliability desired.

Both MSTs and TSP tours do not scale.

The growth in the average # of hops is at the heart of the problem. It’s better to build a Prim-Dijkstra tree or a “ring of rings” to control the length of the routes.

The notion of creditability.

The End

Thank you !

Download - Chapter 3 Graphs, Trees, and Tours

Top Related