2101692 analytical methods in construction management

2101692 Analytical Methods in Construction Management Lectures 10 & 11 Dynamic Programming Academic Year 2021 First Term Associate Prof. Veerasak Likhitruangsilp, Ph.D. Department of Civil Engineering Faculty of Engineering Chulalongkorn University Bangkok, Thailand


Post on 13-Apr-2022



Chapter 8

Dynamic Programming

1. Introduction

Dynamic programming (DP) is a powerful optimization method for making a sequence of optimal decisions (i.e., multiperiod or multistage decision making). As presented in Chapter 7, linear programming (LP) models can be applied to problems that are linear and have a convex feasible region, but they are not designed for sequential decision problems.

In contrast, DP can be applied to problems that are nonlinear, have a nonconvex feasible region, or involve discontinuous variables, and it is well suited to sequential decision problems. However, DP is limited to problems with relatively few constraints.

General Concepts

LP is based on the use of optimality conditions. The consequence of linearity is that we can

define in advance the nature of the optimum (i.e., it is at a corner point, from which all

departures worsen the objective function). It consists of an organized search through a set of

possible solutions until the optimality criterion is met.

DP has no optimality criterion we can define in advance. It is based on the concept of

enumeration. The basic idea is that we systematically enumerate possible solutions, calculate

how well each performs, and select the best. Since we enumerate the possible solutions, it

does not matter whether the feasible region is convex or whether the problem is linear. Thus, enumeration is

the principle that gives dynamic programming the power to deal with nonlinear and

nonconvex problems that LP cannot.

The difficulty with a complete enumeration is that the number of possible solutions is

an exponential function of the number of possible decisions, the number of values they may

have, and the number of periods over which the system is considered. For example,

$$\text{Maximum No. of Possible Designs} = (\text{No. of Values})^{(\text{No. of Periods})(\text{No. of Variables})}$$
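
As a rough illustration with hypothetical numbers (they are not taken from any problem in this chapter), consider 2 decision variables that can each take 3 values in each of 10 periods. Under the counting rule above, complete enumeration would have to examine:

```latex
\[
(\text{No. of Values})^{(\text{No. of Periods})(\text{No. of Variables})}
  = 3^{(10)(2)} = 3^{20} = 3{,}486{,}784{,}401 \ \text{designs}
\]
```

Even this small case runs to billions of candidates, which is why complete enumeration quickly becomes impractical.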

This number is large for even relatively small problems, and astronomical for really

significant problems such as the design of a regional telephone network.

The special approach used by DP is implicit enumeration. It considers all possible

combinations in principle, but not in fact. Based on certain crucial assumptions about a

problem, it systematically eliminates many sets of possible combinations because they can be

shown to be inferior.

Implicit enumeration works by doing a partial optimization at each stage. It selects

the best way to reach any point at that stage, and discards all the other possibilities.

Solution Procedure

DP is not based on a standard formulation, as LP is. It is more an approach or a concept

than a formula. The analyst wishing to use DP will often have to use imagination and

skill to organize a problem so that it can be solved efficiently. The fact that DP problems can

differ from each other substantially leads to a great practical difficulty.

Optimizing a system using DP needs to deal with four elements (de Neufville 1990):

organization of the problem

formula to be used in the partial optimizations

constraints

solutions

Organization

The first and essential part of the solution is to organize the problem into states and stages.

The state is the description of the situation attained as the result of a transition from one stage to

another. The corresponding return functions must then be defined. Each stage in a sequential

problem might represent:

A division of time – days, hours, and seconds.

A division of space – distance

A sequence of choices

The stages of nonsequential problems are the several parts of a system such as the

different possible investments in a portfolio of loans or shares in companies. For

nonsequential problems, the order we assign to the different stages will be absolutely

arbitrary.

The stages are generally easier to imagine, whereas defining the states can be

difficult. A simple example is that stage i represents the construction of an electric plant at a

specific site and state Xi is the level of investment in that plant. The states may be nominal

(e.g., names rather than degrees or levels).

The definition of the return functions (or cost functions) for each stage can be

extremely complex. For example, the return function could represent the profits or the

amount of energy obtained from an investment of level (state) Xi in a particular project.

The return function of a sequential problem represents the change in state from one

stage to the next. It can be expressed as a formula or, more commonly, defined by a table giving

each return in terms of the state $X_{i-1}$ of the previous stage and the state $X_i$ of stage $i$.

Formula

The solution procedure for DP consists of two parts:

Partial optimization at each stage

Repetition of this process through all stages

The objective of the partial optimization is to define the best result that can be

obtained at any state at the end of a specific number of stages. This quantity is given by the

cumulative return function, by analogy with the return function that relates to only one stage.

The cumulative return function is denoted by $f_n(i)$, which designates the effect of being in

state i, having passed through the first n stages.

The cumulative return function is built up iteratively, from stage to stage. It is

defined in terms of the return function of the current stage and the cumulative return function

for previous stages. Sometimes, the cumulative return function can be defined by a closed-

form equation (called a recurrence function). Often, this function is defined by examination

of the various possibilities as entered in a table.

Constraints

Constraints in DP are dealt with quite differently than they are in LP; for example they are

almost never written as equations. As a matter of fact, a DP problem may not be described in

terms of equations. Most constraints in DP are embedded directly in the organization of the

problem. For example, if there is a restriction on the maximum number of states for any

stage, only that number is provided.

Solution

The optimal solution is provided by the cumulative return function over all the stages. In

addition to the optimal solution, we can also know the optimal policy by determining how we

reached the optimal solution. This is in fact quite easy because the cumulative return

function is defined in terms of the return function for the last stage. Knowing the optimal

solution, we then know the transition from the previous stage that brought us to that point;

from the cumulative return function for the previous stage we can trace back one more stage,

and so on.

Applications

DP models can be applied to various problems including equipment replacement,

capacity expansion, resource allocation, inventory planning, and route guidance.

Classifications

DP models can be classified in many ways. First, if we consider the randomness of

the systems, they can be classified into:

Deterministic DP – no uncertainty or non-random systems

Stochastic or probabilistic DP – uncertain or random systems

Second, if we consider the time horizon of the systems, the models can be classified into two types, namely, finite-horizon DP and infinite-horizon DP.

In this chapter, we focus only on deterministic, finite-horizon DP problems, which are the most fundamental type of DP model.

2. Model Formulation by Decision Tree

Decision trees, which are commonly used in decision analysis (see Chapter 3), can be adopted

as a tool for sequential decision problems. A decision tree for DP problems, as shown in

Figure 8.1, consists of:

Decision node, representing the opportunity to make a decision

Arc, showing a decision

Root, a beginning node without “in” arcs

Terminal node, the end of each sequence of decisions

It should be noted that the cost associated with a particular decision in the decision

tree is conditional on all previous decisions along the path leading to that decision, as shown

in Figure 8.2.

Figure 8.1 Decision tree for deterministic DP problems

Figure 8.2 Costs associated with decisions along the path

Example 8.1 Resource Allocation

We want to allocate the maximum of 2 units of resources to 3 activities so as to maximize our

total profit. The profit matrix is as follows.

Profits (× $1,000) by number of resource units allocated

Activity      0      1      2
   1          0     10     20
   2          0      5     20
   3          0     10     12

Objective function: Maximize the total profit

Observations:

The problem seems to be static but it is actually dynamic because it involves a

sequence of decisions.

There are many possible ways to model this problem. Figure 8.3 shows only one

possible way to model it.

The directed paths from the root to the terminal nodes are all feasible decision

sequences.

The total profit associated with a particular path is the sum of the profits for all the

corresponding decisions along that path.

If we represent the cost (i.e., negative profit) for each arc by the length of that arc,

the cost of a feasible decision sequence will be the total length of all the

corresponding arcs along that path. Thus, our problem can be viewed as finding a

shortest path in the decision tree network from the root to the terminal nodes.

Figure 8.3 Decision tree for the resource allocation problem

Answers: Solving the problem using the decision tree yields two results: (1) the maximum profit is $20,000, and (2) there are three optimal policies: (a) allocating two units of resource to Activity 2, (b) allocating one unit of resource to Activity 1 and one unit to Activity 3, and (c) allocating two units of resource to Activity 1.
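
The stated answers can be checked by complete enumeration, which is exactly what the decision tree does path by path. The following is a minimal sketch, not the author's procedure; the names `PROFIT`, `MAX_UNITS`, and `enumerate_allocations` are illustrative assumptions, and the profit data come from the table of Example 8.1.

```python
from itertools import product

# Profit (x $1,000) for allocating 0, 1, or 2 units to each activity (Example 8.1).
PROFIT = {1: [0, 10, 20], 2: [0, 5, 20], 3: [0, 10, 12]}
MAX_UNITS = 2  # at most 2 units of resource in total


def enumerate_allocations(profit, max_units):
    """Return (best_profit, optimal_allocations) by complete enumeration."""
    activities = sorted(profit)
    best, optimal = None, []
    for alloc in product(range(max_units + 1), repeat=len(activities)):
        if sum(alloc) > max_units:      # infeasible: uses too many units
            continue
        total = sum(profit[a][x] for a, x in zip(activities, alloc))
        if best is None or total > best:
            best, optimal = total, [alloc]
        elif total == best:             # keep ties: several optimal policies
            optimal.append(alloc)
    return best, optimal


best_profit, optimal_allocations = enumerate_allocations(PROFIT, MAX_UNITS)
print(best_profit, sorted(optimal_allocations))
# 20 [(0, 2, 0), (1, 0, 1), (2, 0, 0)]
```

Each tuple lists the units given to Activities 1, 2, and 3, so the three tuples correspond to the three optimal policies listed above.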

Example 8.2 Equipment Replacement

The current equipment is one year old. The company wants to develop a cost-minimizing equipment replacement strategy over a 3-year horizon. Assume that at the beginning of each year it decides either to keep (K) the current equipment or to buy (B) a new one. New equipment costs $1,000. The other costs of this equipment are shown in the following table.

Costs (× $10^2) by age of equipment (at the beginning of the year)

Age                    0     1     2     3     4
Maintenance costs      0     1     2     3     -
Salvage values         -     8     7     6     5

The cost function of this particular equipment can be expressed by:

Cost = f(Salvage value , Purchase cost , Maintenance cost)

Objective function: Minimize the total cost

Observations:

This DP problem is associated with three decision stages (i.e., the beginning of years

1, 2, and 3) and two possible choices for each stage (i.e., buy a new machine or keep the

current one). Thus, there is a total of 8 (= $2^3$) paths in this decision tree, as shown in Figure

8.4.

Answers: Solving this problem on the decision tree provides two results:

(1) Minimum total cost = –$200 (i.e., a profit of $200)

(2) There are five optimal policies: (B,B,B), (B,B,K), (B,K,B), (K,B,B), and (K,B,K), where each triple lists the decisions for (year 1, year 2, year 3).

Figure 8.4. Decision tree for the equipment replacement problem
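
The eight paths of this decision tree can also be enumerated in a few lines. The text gives the cost function only as f(salvage value, purchase cost, maintenance cost), so the sketch below rests on stated assumptions: a "buy" decision trades in the old unit at its salvage value, the unit in service is sold at its salvage value at the end of year 3, and nothing is discounted. These assumptions reproduce the answers quoted above; the names `policy_cost`, `PURCHASE`, `MAINTENANCE`, and `SALVAGE` are illustrative.

```python
from itertools import product

PURCHASE = 10                            # new equipment, x $100
MAINTENANCE = {0: 0, 1: 1, 2: 2, 3: 3}   # by age at the beginning of the year
SALVAGE = {1: 8, 2: 7, 3: 6, 4: 5}       # by age when the unit is sold


def policy_cost(policy, start_age=1):
    """Total cost (x $100) of a sequence of 'B' (buy) / 'K' (keep) decisions."""
    age, cost = start_age, 0
    for decision in policy:
        if decision == "B":
            cost += PURCHASE - SALVAGE[age]   # assumed: old unit is traded in
            age = 0
        cost += MAINTENANCE[age]              # run the unit for one year
        age += 1
    return cost - SALVAGE[age]                # assumed: sold at end of horizon


costs = {p: policy_cost(p) for p in product("BK", repeat=3)}
min_cost = min(costs.values())
optimal_policies = sorted(p for p, c in costs.items() if c == min_cost)
print(min_cost, ["".join(p) for p in optimal_policies])
# -2 ['BBB', 'BBK', 'BKB', 'KBB', 'KBK']
```

A cost of -2 in units of $100 is the –$200 reported above, and the five policies match the listed ones.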

3. Model Formulation by Dynamic Programming Network

Several nodes of the decision tree in Figure 8.4 are identical, and can be aggregated together

into 10 groups (e.g., a, b, etc.), as shown in Figure 8.5. This process is called the aggregation

process.

Based on these groups of identical nodes, we can represent the above decision tree in

the form of a new network called a Dynamic Programming Network, as shown in Figure 8.6.

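The dynamic programming network (DPN) is an equivalent representation of the decision tree (DT), but it is more economical. Because identical nodes are merged, different decision paths rejoin at shared nodes (forming cycles of decision making in the undirected sense), which reduces the size of the problem.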

Some important observations are as follows.

(1) There are fewer nodes in the Dynamic Programming Network (DPN) than those in

the Decision Tree (DT) (i.e., 10 nodes in Figure 8.6 as compared to 15 nodes in

Figure 8.4).

(2) There is a one-to-one correspondence between the paths from left to right in DT and DPN

(i.e., the number of paths in DT = the number of paths in DPN).

(3) The nodes in the DPN can be interpreted as the ages of the equipment at each of

the periods. These nodes can be called dynamic programming states.

Thus, the above DPN can be presented by substituting each node by a dynamic

programming state, consisting of two sets of information: (Beginning of year, Age of

machine). For example, state (3,2) represents the equipment at the beginning of year 3 with

the age of 2 years. Figure 8.7 presents a DPN using the DP states.

The construction of the DPN concludes the formulation phase of DP problems.

Finding a shortest or longest path (i.e., a least cost or highest profit path) is the next phase of

DP problems, the solution phase.
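
The node counts quoted above (10 states versus 15 tree nodes) can be reproduced by generating the DP states directly. This is a small sketch under the state definition used in this section, (beginning of year, age of machine); the function name `dpn_states` is an illustrative assumption.

```python
def dpn_states(start_age=1, n_years=3):
    """List the states (beginning of year, age) reachable by buy/keep decisions."""
    layers = [{start_age}]                 # year 1: the machine is 1 year old
    for _ in range(n_years):
        next_ages = set()
        for age in layers[-1]:
            next_ages.add(1)               # buy: the new unit is 1 year old next year
            next_ages.add(age + 1)         # keep: the current unit ages by one year
        layers.append(next_ages)
    return [(year, age)
            for year, ages in enumerate(layers, start=1)
            for age in sorted(ages)]


states = dpn_states()
tree_nodes = sum(2 ** k for k in range(4))   # 1 + 2 + 4 + 8 decision-tree nodes
print(len(states), tree_nodes)               # 10 15
```

The set-based layers are what performs the aggregation: two tree nodes with the same year and age collapse into one state, such as (3,2).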

4. Solution Procedure

The main objective of DP solution is to efficiently find a shortest path (e.g., lowest cost) in a

finite acyclic directed network.

Figure 8.5 Aggregation process for the equipment replacement problem

Figure 8.6 Dynamic programming network for the equipment replacement problem

Figure 8.7 Dynamic programming network (DPN) represented by the states

Definition:

Directed network: $S$ is a set of nodes ($|S| < \infty$, i.e., the network is finite)

$T \subseteq S \times S$ is the set of directed arcs [e.g., $(i, j) \in T$]

Start node: $s \in S$

Successor set: $SCS(i) = \{ j : (i, j) \in T \}$

A path is a sequence $(i_1, i_2, \ldots, i_n)$ such that $(i_k, i_{k+1}) \in T$, $k = 1, 2, \ldots, n-1$

A path with $i_1 = i_n$ is a cycle.

A network is cyclic if it contains at least one cycle; otherwise, it is acyclic.

Let $t_{ij}$ = length of $(i, j) \in T$

Path length = $\sum_{k=1}^{n-1} t_{i_k i_{k+1}}$ for path $(i_1, i_2, \ldots, i_n)$

A finite acyclic directed network can have nodes labeled so that $(i, j) \in T \Rightarrow i < j$.

Deterministic DP models can be solved by a variety of algorithms, two of which are

backward DP and forward DP.

5. Backward Dynamic Programming

The following examples are used to illustrate how to develop the solution procedure for

backward DP.

Example 8.3 Shortest Path for Diamond Network

Find the shortest path from A to B for the diamond network shown in Figure 8.8.

Solution: Determine the shortest path from B to A (backward), as shown in Figure 8.9.

Answer: The shortest path is the bold line in Figure 8.9 for a path length of 13 units.

Observations:

We cannot improve the formulation of this problem because this diamond network

is the dynamic programming network of the problem, which requires three ups

and three downs to go from A to B.

There are a total of $\binom{6}{3} = 20$ possible paths in this network.

However, we can improve the solution procedure by using the principle of optimality.

Figure 8.8 Shortest path problem for the diamond network (arc labels are distances)

Figure 8.9 Solution of the shortest path problem for the diamond network (arc labels are distances; node labels are shortest distances to B, e.g., min(2+2, 1+8) = 4 at C)

Principle of Optimality (Backward DP)

“If the shortest path from node A to node B passes through the intermediate node C, then the

path followed from C to B must be a shortest path from C to B,” as shown in Figure 8.10.

Figure 8.10 Principle of optimality (backward DP)

The principle means that a subpath of the optimal path is also optimal. This principle can be proved easily by contradiction. Suppose there were another path from C to B shorter than the one shown above. Replacing the C-to-B portion of the path with it would then give a path from A to B that is shorter than the shortest path, which is a contradiction.

The principle of optimality provides the concept of the solution procedure by

backward DP: “If we found ourselves at a particular node, what would be the shortest path

from that node to the end?”

Thus, we can solve Example 8.3 by using this principle of optimality. For example,

at C we must find the shortest path from that point to B by comparing between the upper path

(i.e., up and down) with the length of 4 units (2 + 2) and the lower path (i.e., down and up)

with the length of 9 units (8 + 1). Clearly, the shortest path is the upper path.

Based on this concept, we can eliminate calculations that the direct enumeration approach would repeat unnecessarily. For example, whether we are calculating the cost of path 1-2-3-6-8-9 or 1-3-4-6-8-9, we need to calculate the subpath 6-8-9 only once. Thus, DP decreases the number of calculations by breaking a large problem into small problems (decomposition).

Example 8.4 Shortest Path for Acyclic Directed Network

Find the shortest path from node 1 to node 9 for the network in Figure 8.11.

Figure 8.11 Acyclic directed network problem

Calculation: Using backward DP minimization

Let $f_i$ = minimum distance from node $i \in S$ to the terminal node (node 9)

$f_9 = 0$ (called the boundary condition)

$f_8 = 10 + f_9 = 10 + 0 = 10$ (to node 9)

$f_7 = 3 + f_9 = 3 + 0 = 3$ (to node 9)

$f_6 = \min\{7 + f_8, 15 + f_9\} = \min\{7 + 10, 15 + 0\} = \min\{17, 15\} = 15$ (to node 9)

$f_5 = 7 + f_7 = 7 + 3 = 10$ (to node 7)

$f_4 = \min\{4 + f_5, 3 + f_6, 15 + f_7, 7 + f_8\} = \min\{4 + 10, 3 + 15, 15 + 3, 7 + 10\} = \min\{14, 18, 18, 17\} = 14$ (to node 5)

$f_3 = \min\{3 + f_4, 4 + f_6\} = \min\{3 + 14, 4 + 15\} = \min\{17, 19\} = 17$ (to node 4)

$f_2 = \min\{6 + f_4, 12 + f_5\} = \min\{6 + 14, 12 + 10\} = \min\{20, 22\} = 20$ (to node 4)

$f_1 = \min\{1 + f_2, 2 + f_3\} = \min\{1 + 20, 2 + 17\} = \min\{21, 19\} = 19$ (to node 3)

By tracing back, the shortest path in this network is 1 → 3 → 4 → 5 → 7 → 9, with a total length of 19 units.

A shortest path tree can be used to illustrate a shortest path from a certain node to the

terminal node. This tree can be constructed by leaving in only the arcs that accrue the

shortest paths. For the above example, the shortest path tree is shown in Figure 8.12. For

example, the shortest path from node 2 to the terminal node (node 9) is 2 → 4 → 5 → 7 → 9

(= 6+4+7+3 = 20).

Figure 8.12 Shortest path tree
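
The backward sweep of Example 8.4 and its shortest path tree can be sketched in code. The arc lengths in `ARCS` are read off the hand calculation above (Figure 8.11 itself is not reproduced here), and the function names `backward_dp` and `trace_path` are illustrative assumptions rather than part of the original notes.

```python
# Arc lengths t[i][j], read off the calculations in Example 8.4.
ARCS = {
    1: {2: 1, 3: 2},
    2: {4: 6, 5: 12},
    3: {4: 3, 6: 4},
    4: {5: 4, 6: 3, 7: 15, 8: 7},
    5: {7: 7},
    6: {8: 7, 9: 15},
    7: {9: 3},
    8: {9: 10},
    9: {},
}


def backward_dp(arcs, terminal):
    """Return (f, successor): f[i] is the shortest distance from i to terminal."""
    f = {terminal: 0}                      # boundary condition
    successor = {}                         # arcs kept in the shortest path tree
    # Nodes are labeled so that every arc (i, j) has i < j, so a descending
    # sweep guarantees f[j] is known before f[i] is computed.
    for i in sorted(arcs, reverse=True):
        if i == terminal:
            continue
        j_best = min(arcs[i], key=lambda j: arcs[i][j] + f[j])
        f[i] = arcs[i][j_best] + f[j_best]
        successor[i] = j_best
    return f, successor


def trace_path(successor, start, terminal):
    """Follow the if-then rule 'at node i, go to successor[i]' until the end."""
    path = [start]
    while path[-1] != terminal:
        path.append(successor[path[-1]])
    return path


f, successor = backward_dp(ARCS, terminal=9)
print(f[1], trace_path(successor, 1, 9))   # 19 [1, 3, 4, 5, 7, 9]
print(f[2], trace_path(successor, 2, 9))   # 20 [2, 4, 5, 7, 9]
```

The `successor` dictionary holds one outgoing arc per node, which is the same information the shortest path tree conveys.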

Optimality Equation (Backward DP)

The calculation procedure of DP above can be expressed algebraically by an optimality equation

(also called a functional equation or optimal value function), consisting of three components:

(1) Recursive equation

(2) Boundary conditions

(3) Answer

For example, the optimality equation of Example 8.4 can be written as:

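$$f_i = \begin{cases} \min_{j} \{ t_{ij} + f_j \} & \text{for } (i, j) \in T \\ 0 & \text{for } i = 9 \end{cases}$$

Answer: $f_1$, where $f_i$ is the length of the shortest path from node $i$ to the terminal node (node 9), and $t_{ij}$ is the distance between node $i$ and node $j$.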

General Form of Backward DP

Figure 8.13 illustrates the formulation of the backward DP optimality equation for any state i.

The general form of backward DP can be expressed by:

Define $f(i)$ = the length of a shortest path from $i$ to $n$, $i = 1, 2, \ldots, n$, and $C(i, j)$ = the cost associated with the transition from state $i$ to state $j$.

$$f(i) = \begin{cases} \min_{j > i} \{ C(i, j) + f(j) \} & \text{for } (i, j) \in T \\ 0 & \text{for } i = n \end{cases}$$

Answer: $f(1)$

Solving this optimality equation yields two main results:

Minimum cost from each node to the end, especially the minimum cost $f(1)$

Optimal policy or action – If-then rule (e.g., if we are at node i, then we should go

to node j) from the shortest path tree

Figure 8.13 Backward DP formulation

6. Forward Dynamic Programming

Another way to solve deterministic sequential decision problems is by using forward dynamic

programming. The concepts of forward DP are very similar to those of backward DP. The

only difference is that solving by forward DP will start from the beginning node and continue

to the terminal node.

Principle of Optimality (Forward DP)

“If a shortest path from node A to node B passes through the intermediate node C, the path

followed from A to C must be a shortest path from A to C,” as shown in Figure 8.14.

Figure 8.14 Principle of optimality (forward DP)

Again, this principle can be proved by contradiction (similar to that in backward DP).

It also provides the concept of the solution procedure by forward DP: “If we found ourselves

at a particular node, what would be the shortest path from the beginning to that node?”

Example 8.5 Shortest Path for Acyclic Directed Network (Revisit Example 8.4)

Find the shortest path of the network in Example 8.4 by using forward DP minimization.

Calculation:

Let $g_i$ = minimum distance from the start node (node 1) to node $i$

$g_1 = 0$ (called the boundary condition)

$g_2 = g_1 + 1 = 0 + 1 = 1$ (from node 1)

$g_3 = g_1 + 2 = 0 + 2 = 2$ (from node 1)

$g_4 = \min\{g_2 + 6, g_3 + 3\} = \min\{1 + 6, 2 + 3\} = \min\{7, 5\} = 5$ (from node 3)

$g_5 = \min\{g_2 + 12, g_4 + 4\} = \min\{1 + 12, 5 + 4\} = \min\{13, 9\} = 9$ (from node 4)

$g_6 = \min\{g_3 + 4, g_4 + 3\} = \min\{2 + 4, 5 + 3\} = \min\{6, 8\} = 6$ (from node 3)

$g_7 = \min\{g_4 + 15, g_5 + 7\} = \min\{5 + 15, 9 + 7\} = \min\{20, 16\} = 16$ (from node 5)

$g_8 = \min\{g_4 + 7, g_6 + 7\} = \min\{5 + 7, 6 + 7\} = \min\{12, 13\} = 12$ (from node 4)

$g_9 = \min\{g_6 + 15, g_7 + 3, g_8 + 10\} = \min\{6 + 15, 16 + 3, 12 + 10\} = \min\{21, 19, 22\} = 19$ (from node 7)

Thus, tracing back from node 9, the shortest path in this network is 9 ← 7 ← 5 ← 4 ← 3 ← 1 (i.e., 1 → 3 → 4 → 5 → 7 → 9), with a total length of 19 units (the same answer as from backward DP).

The optimality equation for the forward DP of this problem is:

$$g_j = \begin{cases} \min_{i} \{ g_i + t_{ij} \} & \text{for } (i, j) \in T \\ 0 & \text{for } j = 1 \end{cases}$$

Answer: $g_9$
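
The forward calculation of Example 8.5 can be sketched the same way, sweeping the nodes in ascending order and recording the best predecessor of each node. The arc list repeats the lengths used in the hand calculation; `ARC_LIST` and `forward_dp` are illustrative names, not part of the original notes.

```python
# (i, j, t_ij) for every arc, read off the calculations in Example 8.5.
ARC_LIST = [
    (1, 2, 1), (1, 3, 2), (2, 4, 6), (2, 5, 12), (3, 4, 3), (3, 6, 4),
    (4, 5, 4), (4, 6, 3), (4, 7, 15), (4, 8, 7), (5, 7, 7),
    (6, 8, 7), (6, 9, 15), (7, 9, 3), (8, 9, 10),
]


def forward_dp(arc_list, start):
    """Return (g, predecessor): g[j] is the shortest distance from start to j."""
    incoming = {}
    for i, j, t in arc_list:
        incoming.setdefault(j, []).append((i, t))
    g = {start: 0}                         # boundary condition
    predecessor = {}
    for j in sorted(incoming):             # ascending sweep: g[i] is known for all i < j
        i_best, t_best = min(incoming[j], key=lambda arc: g[arc[0]] + arc[1])
        g[j] = g[i_best] + t_best
        predecessor[j] = i_best
    return g, predecessor


g, predecessor = forward_dp(ARC_LIST, start=1)

# Trace back from the terminal node, then reverse to read the path forward.
path = [9]
while path[-1] != 1:
    path.append(predecessor[path[-1]])
path.reverse()
print(g[9], path)   # 19 [1, 3, 4, 5, 7, 9]
```

Forward DP yields the shortest distance from the start to every node as a by-product, whereas backward DP yields the shortest distance from every node to the end.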

General Form of Forward DP

Figure 8.15 illustrates the formulation of the forward DP optimality equation for any state j.

The general form of forward DP can be expressed by:

Define $g(j)$ = the length of a shortest path from 1 to $j$, $j = 1, 2, \ldots, n$, and $C(i, j)$ = the cost associated with the transition from state $i$ to state $j$.

$$g(j) = \begin{cases} \min_{i < j} \{ g(i) + C(i, j) \} & \text{for } (i, j) \in T \\ 0 & \text{for } j = s \end{cases}$$

Answer: $g(n)$

Figure 8.15 Forward DP formulation

7. Maximization Problems

The maximization problems are very similar to the minimization problems discussed in the

previous sections. We only change the definition of the optimality function and maximize

this function rather than minimize it.

Backward DP

Define $f_i$ = the maximum distance from node $i$ to the end (node $n$)

Optimality equation:

$$f_i = \begin{cases} \max_{j > i} \{ C_{ij} + f_j \} & \text{for } (i, j) \in T \\ 0 & \text{for } i = n \end{cases}$$

Answer: $f_1$

Forward DP

Define $g_j$ = the maximum distance from the start (node $s$) to node $j$

Optimality equation:

$$g_j = \begin{cases} \max_{i < j} \{ g_i + C_{ij} \} & \text{for } (i, j) \in T \\ 0 & \text{for } j = s \end{cases}$$

Answer: $g_n$
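
To see how little changes between minimization and maximization, the sketch below runs one backward sweep with the comparison operator passed in as a parameter. It reuses the arc lengths of Example 8.4 purely for illustration (the notes do not pose a longest path problem on that network), and the `choose` parameter is an assumption of this sketch.

```python
# Arc lengths reused from Example 8.4 purely for illustration.
ARCS = {
    1: {2: 1, 3: 2},
    2: {4: 6, 5: 12},
    3: {4: 3, 6: 4},
    4: {5: 4, 6: 3, 7: 15, 8: 7},
    5: {7: 7},
    6: {8: 7, 9: 15},
    7: {9: 3},
    8: {9: 10},
    9: {},
}


def backward_dp(arcs, terminal, choose=max):
    """Backward sweep; choose=max gives longest paths, choose=min shortest."""
    f = {terminal: 0}                       # boundary condition f_n = 0
    successor = {}
    for i in sorted(arcs, reverse=True):    # labels satisfy i < j on every arc
        if i == terminal:
            continue
        j_best = choose(arcs[i], key=lambda j: arcs[i][j] + f[j])
        f[i] = arcs[i][j_best] + f[j_best]
        successor[i] = j_best
    return f, successor


f_max, succ_max = backward_dp(ARCS, terminal=9, choose=max)
f_min, _ = backward_dp(ARCS, terminal=9, choose=min)

node, longest = 1, [1]
while node != 9:
    node = succ_max[node]
    longest.append(node)
print(f_max[1], longest)   # 27 [1, 2, 4, 6, 8, 9]
print(f_min[1])            # 19
```

Swapping `min` for `max` is the only change, which works here because the network is acyclic.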

8. Applications

The following three problems are adapted from Stark and Nicholls (1972) to illustrate some

applications of DP models in various civil engineering systems.

Example 8.6 Combinatorial Problem

An aggregate producer has four identical mobile crushing-screening plants and four sources

of raw material, which he can use during the coming construction season. Given the profit

matrix below, how many plants should he assign to each site?

Profit Matrix

No. of plants         Raw material site
assigned             1       2       3       4
    1               47      39      24      35
    2               81      62      47      51
    3              105      84      72      61
    4              132      91      87      68

Let $x_n$ = number of plants assigned to site $n$

$p_n(x_n)$ = profit obtained from allocating $x_n$ plants to site $n$

Thus, Maximize $\sum_{n=1}^{4} p_n(x_n)$

subject to $\sum_{n=1}^{4} x_n \le 4$

It should be noted that since the profit function increases monotonically, we should use all resources.

Let $f(n, y)$ = maximum profit obtained from assigning $y$ plants to the raw material sites 1 through $n$, where $y \le 4$, integer

The recursive equation can be written as:

$f(n, y) = \max_{x_n} \{ p_n(x_n) + f(n-1, y - x_n) \}$, $x_n \le y$, integer

Answer: $f(4, 4)$

Calculation:

(1) Tabular method

$f(1, y) = \max \{ p_1(x_1) : x_1 \le y \}$, which is the boundary condition.

y      f(1,y)   x1*
0         0      0
1        47      1
2        81      2
3       105      3
4       132      4

$f(2, y) = \max_{x_2 \le y} \{ p_2(x_2) + f(1, y - x_2) \}$

       p2(x2) + f(1, y - x2) for x2 =
y      0             1             2             3             4            f(2,y)   x2*
0      0             -             -             -             -              0      0
1      0+47=47       39+0=39       -             -             -             47      0
2      0+81=81       39+47=86      62+0=62       -             -             86      1
3      0+105=105     39+81=120     62+47=109     84+0=84       -            120      1
4      0+132=132     39+105=144    62+81=143     84+47=131     91+0=91      144      1

$f(3, y) = \max_{x_3 \le y} \{ p_3(x_3) + f(2, y - x_3) \}$

       p3(x3) + f(2, y - x3) for x3 =
y      0             1             2             3             4            f(3,y)   x3*
0      0             -             -             -             -              0      0
1      0+47=47       24+0=24       -             -             -             47      0
2      0+86=86       24+47=71      47+0=47       -             -             86      0
3      0+120=120     24+86=110     47+47=94      72+0=72       -            120      0
4      0+144=144     24+120=144    47+86=133     72+47=119     87+0=87      144      0,1

$f(4, y) = \max_{x_4 \le y} \{ p_4(x_4) + f(3, y - x_4) \}$

       p4(x4) + f(3, y - x4) for x4 =
y      0             1             2             3             4            f(4,y)   x4*
0*     0             -             -             -             -              0      0
1*     0+47=47       35+0=35       -             -             -             47      0
2*     0+86=86       35+47=82      51+0=51       -             -             86      0
3*     0+120=120     35+86=121     51+47=98      61+0=61       -            121      1
4      0+144=144     35+120=155    51+86=137     61+47=108     68+0=68      155      1

* Unnecessary calculation

Answer: The maximum profit equals 155. The optimal assignment is (2,1,0,1), that is, two plants at site 1, one plant at site 2, no plants at site 3, and one plant at site 4 (read from the table or graph).

(2) Graphical method

Figure 8.16 shows the solution of this problem using the graphical method.

Figure 8.16 Solving the combinatorial problem using the graphical method

Example 8.7 Replacement of tractor-scraper units

A contractor has a fleet of tractor-drawn scrapers for moving earth on short hauls. As a

tractor or scraper ages, the annual profit of the tractor-scraper unit decreases due to increased

maintenance costs. To simplify for illustrative purposes, assume that the contractor keeps a

tractor no more than 4 years and a scraper no more than 2 years. The contractor estimates his

annual net return per tractor-scraper unit as:

Tractor age (years)                    0     1     2     3     0     1     2     3
Scraper age (years)                    0     0     0     0     1     1     1     1
Estimated net return of
tractor-scraper unit for the
following year ($10^3)                11     9     6     3     9     7     4     2

The cost of a new tractor is $9,000, and of a new scraper $3,000. Replacements at the

end of year i are charged to the ith year. At the end of each year, the contractor chooses one

of the following policies for each tractor-scraper unit:

(1) Replace neither

(2) Replace tractor only

(3) Replace scraper only

(4) Replace both

What replacement policy maximizes profit for a given period (planning horizon)?

Calculation:

In this problem, we determine the optimal replacement policy for the eight different age

combinations (states) of the two machines over 1-, 2-, 3-, and 4-year time horizons (stages).

The cost function consists of the estimated net return of the tractor-scraper unit (from

the above table) and the cost of a new machine (if the replacement is being considered).

The solution procedure is recursive. The optimal policy for the 1-year horizon will be

used for the optimization of the 2-year horizon, and so on. All the calculations can be

summarized in Table 8.1.

Table 8.1 DP solution for the tractor-scraper replacement problem

Ages, yr
Tractor       0             1             2             3             0             1             2             3
Scraper       0             0             0             0             1             1             1             1

1-yr horizon
Policy 1      11            9             6             3             9             7             4             2
Optimal       1             1             1             1             1             1             1             1

2-yr horizon
Policy 1      11+0+7=18     9+0+4=13      6+0+2=8       ---           ---           ---           ---           ---
Policy 2      11-9+9=11     9-9+9=9       6-9+9=6       3-9+9=3       ---           ---           ---           ---
Policy 3      11-3+9=17     9-3+6=12      6-3+3=6       ---           9-3+9=15      7-3+6=10      4-3+3=4       ---
Policy 4      11-12+11=10   9-12+11=8     6-12+11=5     3-12+11=2     9-12+11=8     7-12+11=6     4-12+11=3     2-12+11=1
Optimal       1             1             1             2             3             3             3             4

3-yr horizon
Policy 1      11+0+10=21    9+0+4=13      6+0+1=7       ---           ---           ---           ---           ---
Policy 2      11-9+15=17    9-9+15=15     6-9+15=12     3-9+15=9      ---           ---           ---           ---
Policy 3      11-3+13=21    9-3+8=14      6-3+3=6       ---           9-3+13=19     7-3+8=12      4-3+3=4       ---
Policy 4      11-12+18=17   9-12+18=15    6-12+18=12    3-12+18=9     9-12+18=15    7-12+18=13    4-12+18=10    2-12+18=8
Optimal       1,3           2,4           2,4           2,4           3             4             4             4

4-yr horizon
Policy 1      11+0+13=24    9+0+10=19     6+0+8=14      ---           ---           ---           ---           ---
Policy 2      11-9+19=21    9-9+19=19     6-9+19=16     3-9+19=13     ---           ---           ---           ---
Policy 3      11-3+15=23    9-3+12=18     6-3+9=12      ---           9-3+15=21     7-3+12=16     4-3+9=10      ---
Policy 4      11-12+21=20   9-12+21=18    6-12+21=15    3-12+21=12    9-12+21=18    7-12+21=16    4-12+21=13    2-12+21=11
Optimal       1             1,2           2             2             3             3,4           4             4

Note: All profits are in thousands of dollars.
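
The recursion behind Table 8.1 can be sketched as follows. Each state is a (tractor age, scraper age) pair, and the value for an h-year horizon is this year's return, minus any replacement cost, plus the (h-1)-year value of the resulting state. The feasibility rule (a next state must be one of the eight tabulated age combinations) and the names `next_options` and `solve` are assumptions of this sketch, chosen to match the pattern of "---" entries in the table.

```python
# (tractor age, scraper age) -> net return for the following year, x $1,000
RETURN = {
    (0, 0): 11, (1, 0): 9, (2, 0): 6, (3, 0): 3,
    (0, 1): 9, (1, 1): 7, (2, 1): 4, (3, 1): 2,
}
TRACTOR_COST, SCRAPER_COST = 9, 3


def next_options(state):
    """List (policy, replacement cost, next state) for every feasible policy."""
    t, s = state
    candidates = [
        (1, 0, (t + 1, s + 1)),                      # replace neither
        (2, TRACTOR_COST, (0, s + 1)),               # replace tractor only
        (3, SCRAPER_COST, (t + 1, 0)),               # replace scraper only
        (4, TRACTOR_COST + SCRAPER_COST, (0, 0)),    # replace both
    ]
    # Age limits (tractor 4 yr, scraper 2 yr): the next state must be tabulated.
    return [c for c in candidates if c[2] in RETURN]


def solve(horizon):
    """Return (value, optimal policies) per state for the given horizon in years."""
    value = dict(RETURN)                       # 1-year horizon: this year's return
    policy = {state: [1] for state in RETURN}
    for _ in range(horizon - 1):
        new_value, policy = {}, {}
        for state, r in RETURN.items():
            totals = {p: r - cost + value[nxt]
                      for p, cost, nxt in next_options(state)}
            new_value[state] = max(totals.values())
            policy[state] = [p for p, v in totals.items() if v == new_value[state]]
        value = new_value
    return value, policy


value4, policy4 = solve(4)
print([value4[s] for s in RETURN])   # [24, 19, 16, 13, 21, 16, 13, 11]
print(policy4[(1, 0)], policy4[(1, 1)])   # [1, 2] [3, 4]
```

The printed values are the column maxima of the 4-yr horizon block, and the tied policies match the "Optimal" row of the table.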

Example 8.8 Design of An Irrigation System

In this example, dynamic programming will be used to size an irrigation aqueduct and

allocate water from it among several irrigation districts. A total anticipated annual profit

objective will be used.

An irrigation canal is to be designed to serve three districts whose head gates lie 30,

50, and 75 miles, respectively, downstream from the point of diversion. The available annual

water supply is 800,000 acre-ft. The table below shows the anticipated annual irrigation

benefits and the estimated annual costs of the aqueduct as functions of annual water delivery.

The annual water delivery to each irrigation district and corresponding aqueduct are to be

chosen to maximize anticipated annual profit.

We use the notation as follows.

i = irrigation district (i = 1 corresponds to the most upstream district, etc.)

qi = water delivered to irrigation district i, acre-ft/yr

Bi = annual irrigation benefit to district i

bi = annual net benefit to district i

c = amortized cost of aqueduct, $/mile-yr

Ci = total amortized cost of aqueduct from head gate (i – 1) to head gate i, $/yr

The anticipated annual benefits and costs are presented in the following table.

q_i               B1        B2        B3        c
(10^3 acre-ft)    ($10^3)   ($10^3)   ($10^3)   ($10^3/mile)
200                 600       400       900      7.6
400                 980       760     1,250     10.7
600               1,310     1,090     1,500     13.2
800               1,600     1,380     1,690     15.2

Calculation:

We will solve this problem by backward DP, considering the last district (district 3) first and then proceeding to districts 2 and 1, respectively.
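The backward pass can be written compactly as a recursion. The following is only a sketch, in notation that does not appear in the handout: s_i (the total water delivered to districts i through 3, which is the flow the aqueduct must carry over reach i), L_i (the reach lengths 30, 20, and 25 miles), and f_i (the best net benefit obtainable from districts i through 3) are symbols assumed here for illustration, with B_i(0) = 0 and c(0) = 0.

```latex
\begin{aligned}
f_3(s_3) &= B_3(s_3) - L_3\,c(s_3), \qquad L_3 = 75 - 50 = 25,\\
f_i(s_i) &= \max_{0 \le q_i \le s_i}
            \bigl[\, B_i(q_i) - L_i\,c(s_i) + f_{i+1}(s_i - q_i) \,\bigr],
            \qquad i = 2, 1,\\
L_2 &= 50 - 30 = 20, \qquad L_1 = 30 .
\end{aligned}
```

Under these assumptions, the three tables of the example tabulate f_3, f_2, and f_1 in turn, for s_i = 200, 400, 600, and 800 (in 10^3 acre-ft).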


Costs and Benefits for District 3

q3                B3        c              C3 = (75-50)c = 25c   b3
(10^3 acre-ft)    ($10^3)   ($10^3/mile)   ($10^3)               ($10^3)
200                 900      7.6           190                     710
400               1,250     10.7           268                     983
600               1,500     13.2           330                   1,170
800               1,690     15.2           380                   1,310
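As a check on the District 3 table, the short sketch below recomputes C3 = 25c and b3 = B3 - C3 from the data table. The names `C_TENTHS`, `REACH_3_MILES`, and `district3_table` are hypothetical, and the unit cost is stored in tenths so that the arithmetic stays exact. For q3 = 400 the unrounded values are 267.5 and 982.5, which the table shows rounded as 268 and 983.

```python
# Data copied from the example's table: deliveries in 10^3 acre-ft, B3 in $10^3.
B3 = {200: 900, 400: 1250, 600: 1500, 800: 1690}
# Unit aqueduct cost c in tenths of $10^3/mile (7.6 -> 76) to keep arithmetic exact.
C_TENTHS = {200: 76, 400: 107, 600: 132, 800: 152}
REACH_3_MILES = 75 - 50  # head gate 2 to head gate 3


def district3_table():
    """Return rows (q3, B3, C3, b3) with C3 = 25c and b3 = B3 - C3, in $10^3."""
    rows = []
    for q3 in sorted(B3):
        cost = REACH_3_MILES * C_TENTHS[q3] / 10
        rows.append((q3, B3[q3], cost, B3[q3] - cost))
    return rows


for row in district3_table():
    print(row)
# (200, 900, 190.0, 710.0)
# (400, 1250, 267.5, 982.5)
# (600, 1500, 330.0, 1170.0)
# (800, 1690, 380.0, 1310.0)
```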

Costs and Benefits for Districts 3 and 2

(Water quantities in 10^3 acre-ft; benefits and costs in $10^3. C2 = (50-30)c = 20c, with c taken at the combined flow q2 + q3. The optimal allocation is shown on the row that attains the largest b2+b3 for each total.)

q2+q3    q2     q3     b3       B2       C2     b2       b2+b3    Optimal allocation
200        0    200      710        0    152     -152      558    q2 = 0, q3 = 200
200      200      0        0      400    152      248      248

400        0    400      983        0    214     -214      769
400      200    200      710      400    214      186      896    q2 = 200, q3 = 200
400      400      0        0      760    214      546      546

600        0    600    1,170        0    264     -264      906
600      200    400      983      400    264      136    1,119
600      400    200      710      760    264      496    1,206    q2 = 400, q3 = 200
600      600      0        0    1,090    264      826      826

800        0    800    1,310        0    304     -304    1,006
800      200    600    1,170      400    304       96    1,266
800      400    400      983      760    304      456    1,439
800      600    200      710    1,090    304      786    1,496    q2 = 600, q3 = 200
800      800      0        0    1,380    304    1,076    1,076

Costs and Benefits for Districts 3, 2, and 1

(Water quantities in 10^3 acre-ft; benefits and costs in $10^3. C1 = 30c, with c taken at the combined flow q1 + q2 + q3. The optimal allocation is shown on the row that attains the largest b1+b2+b3 for each total.)

q1+q2+q3   q1     q2+q3   b2+b3    B1       C1     b1       b1+b2+b3   Optimal allocation
200          0    200       558        0    228     -228      330
200        200      0         0      600    228      372      372      q1 = 200, q2 = 0, q3 = 0

400          0    400       896        0    321     -321      575
400        200    200       558      600    321      279      837      q1 = 200, q2 = 0, q3 = 200
400        400      0         0      980    321      659      659

600          0    600     1,206        0    396     -396      810
600        200    400       896      600    396      204    1,100
600        400    200       558      980    396      584    1,142      q1 = 400, q2 = 0, q3 = 200
600        600      0         0    1,310    396      914      914

800          0    800     1,496        0    456     -456    1,040
800        200    600     1,206      600    456      144    1,350
800        400    400       896      980    456      524    1,420      q1 = 400, q2 = 200, q3 = 200
800        600    200       558    1,310    456      854    1,412
800        800      0         0    1,600    456    1,144    1,144


Answer: The optimal design of this irrigation system is (q1, q2, q3) = (400,000, 200,000, 200,000) acre-ft/yr; that is, District 1 receives 400,000 acre-ft, District 2 receives 200,000 acre-ft, and District 3 receives 200,000 acre-ft, with a total annual net benefit of $1,420,000.
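The whole calculation can be cross-checked with a short script. This is a minimal sketch of backward DP for this example, not the handout's own procedure. The function name `solve_irrigation` and the helper tables are hypothetical, deliveries are restricted to the tabulated levels (0, 200, 400, 600, 800), and a reach that carries no water is assumed to cost nothing.

```python
# Water in 10^3 acre-ft, benefits in $10^3 (from the example's table).
LEVELS = (0, 200, 400, 600, 800)
BENEFIT = {
    1: {0: 0, 200: 600, 400: 980, 600: 1310, 800: 1600},
    2: {0: 0, 200: 400, 400: 760, 600: 1090, 800: 1380},
    3: {0: 0, 200: 900, 400: 1250, 600: 1500, 800: 1690},
}
# Unit cost c in tenths of $10^3/mile (7.6 -> 76) so the arithmetic is exact.
# Assumption: carrying no water costs nothing.
C_TENTHS = {0: 0, 200: 76, 400: 107, 600: 132, 800: 152}
REACH_MILES = {1: 30, 2: 50 - 30, 3: 75 - 50}


def solve_irrigation(supply=800):
    """Backward DP over districts 3, 2, 1.

    Returns (total net benefit in $10^3, (q1, q2, q3)).
    """
    states = [s for s in LEVELS if s <= supply]
    future = {s: 0 for s in states}  # nothing lies downstream of district 3
    choice = {}
    for i in (3, 2, 1):
        current = {}
        for s in states:  # s = flow entering reach i
            cost = REACH_MILES[i] * C_TENTHS[s]
            # District 3 takes whatever reaches it; upstream districts choose q.
            options = [s] if i == 3 else [q for q in states if q <= s]
            best_q = max(
                options,
                key=lambda q: 10 * BENEFIT[i][q] - cost + future[s - q],
            )
            current[s] = 10 * BENEFIT[i][best_q] - cost + future[s - best_q]
            choice[i, s] = best_q
        future = current
    # Pick the best total diversion, then trace the decisions downstream.
    s = max(states, key=lambda t: future[t])
    total = future[s] / 10
    allocation = []
    for i in (1, 2, 3):
        allocation.append(choice[i, s])
        s -= choice[i, s]
    return total, tuple(allocation)


print(solve_irrigation())  # (1420.0, (400, 200, 200))
```

The result matches the answer above: 400, 200, and 200 thousand acre-ft for districts 1, 2, and 3, with a net benefit of $1,420 thousand per year.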

9. Summary

This chapter introduces dynamic programming (DP) models for optimizing construction engineering and management systems, focusing on deterministic DP. The general concepts of DP are first discussed by comparing it with linear programming (LP), which was presented in Chapter 7. Two methods of model formulation are presented, namely, the decision tree and the dynamic programming network. The chapter then discusses two algorithms for solving DP models (minimization and maximization problems): backward DP and forward DP. The applications of DP are illustrated with calculation examples.

REFERENCES

de Neufville, R. (1990). Applied Systems Analysis: Engineering Planning and Technology Management. McGraw-Hill, Chapter 7.

Hillier, F. S. and Lieberman, G. J. (1995). Introduction to Operations Research (6th Edition). McGraw-Hill, NY, Chapter 10.

Murty, K. G. (1995). Operations Research: Deterministic Optimization Models. Prentice Hall, NJ, Chapter 12.

Stark, R. M. and Nicholls, R. L. (1972). Mathematical Foundations for Design: Civil Engineering Systems. McGraw-Hill, Chapter 6.


2101692 Analytical Methods in Construction Management Department of Civil Engineering

Chulalongkorn University Academic Year 2021

Assignment 9

Dynamic Programming

Out: October 20, 2021

Due: November 3, 2021

Problem 1

Draw a Dynamic Programming Network (DPN) of Example 8.1 Resource Allocation and verify the answers given in the handout. Define the state as follows.

(n, i) = (nth activity, i resources remaining)

Problem 2

The safety engineer for a factory has estimated the number of employee disabilities (measured in sick days/year) that can be avoided by four different measures: (1) covering dangerous machinery (CDM), (2) providing protective clothing (PPC), (3) improving ventilation (IV), and (4) lowering noise levels (LNL). The information is summarized in the following table.

Investment     Measures
($10^3)        CDM    PPC    IV     LNL
 0               0      0      0      0
10               5     15      2      5
20              10     22     12      7
30              15     25     15     15
40              20     25     17     25
50              22     30     25     26

(a) Use dynamic programming to determine how a $70,000 budget could be most effectively employed to reduce disabilities.

(b) How much of a difference would it make if management cut the safety budget back to $60,000? How much money would be saved for each additional sick day the budget cut would cause?

[From Engulf and Devour by de Neufville (1990), Chapter 7, Problem 7.1, page 164]

Problem 3

A Boston rat crawls out of its hole at Beacon and Arlington Streets and decides to raid its favorite garbage behind a store at Newbury and Exeter Streets. Based on previous experience with cats, traffic, and lighting conditions, the rat estimates travel time (in minutes) for each block, as shown below.

(a) Use backward dynamic programming to find the rat’s minimum-time route to the garbage can and the corresponding travel time.

(b) Use forward dynamic programming to find the rat’s minimum-time route to the garbage can and the corresponding travel time, and compare the answer with that of question (a).


(c) If, instead, the rat decided to chew on a telephone pole at Marlboro & Dartmouth, how long would that trip take from its hole? Identify this route as well.

[From Rat Patrol by de Neufville (1990), Chapter 7, Problem 7.5, page 166]

Problem 4

A company manufacturing large industrial equipment projects sales of 3 units/month in January, February, and March. Beginning inventory in January (ending inventory in December) is 2 units. The company desires that ending inventory in March also be 2 units. Demand for any month may be satisfied from beginning inventory or that month’s production. Ending inventory in any month is limited to 5 units, and production during any month is limited to 5 units. There is a $10,000 holding cost for any unit left in inventory at the end of a month. The production cost depends on the number of units manufactured in a month as follows.

Units/Month     0     1     2     3     4     5
Cost ($000)     0    80   100   120   130   140

(a) Define the objective, decision variables, objective function, constraints, and the return functions for each stage.

(b) Draw a dynamic programming network of this problem.

(c) Determine the optimal production schedule for January, February, and March, and its minimum cost, by using dynamic programming.

[From Winter Inventory by de Neufville (1990), Chapter 7, Problem 7.6, page 166]

Problem 5

Cheryl Consultant has an opportunity to invest in one or more of four proposal-writing projects A, B, C, and D. Any investment in a project must be made in $1,000 increments. Moreover, there are limits to the amounts Cheryl might invest effectively in any one project: $7,000 for A; $5,000 for B and D; and $6,000 for C. Cheryl is also willing not to invest at all in a project if her money might be invested more profitably in the other projects. Cheryl estimates her returns for investment as follows.


Project    Level of investment (in $1000)
            1     2     3     4     5     6     7
A           5    10    15    25    35    50    55
B           3     6    12    18    30    30    30
C          20    35    45    55    60    65    65
D           9    16    29    37    45    45    45

(a) If Cheryl has $8000, what is her optimal strategy?

(b) What will her returns from this strategy be?

(c) Suppose Cheryl’s friend Jill offers her an extra $1000 if Cheryl agrees to repay Jill $3000. Should Cheryl accept the offer? Explain your answer.

(d) If Cheryl only has $7000, how should her investment plan change?

[From Cheryl Consultant by de Neufville (1990), Chapter 7, Problem 7.8, page 167]

Problem 6

Your construction company currently has four projects in progress. You have recently purchased 7 new machines (say, mobile cranes) to be used in these projects. Each project needs up to 3 machines (i.e., 0, 1, 2, or 3 machines can be allocated to each project). The following table shows the estimated profits from allocating different numbers of machines to each project.

Project    Estimated profit (in $1,000) for allocating i machines
           i = 0     1      2      3
A             30    70    100    140
B             20    45     80    115
C             10    20     40     65
D             20    40     90    110

By using the dynamic programming methods presented in this course,

(1) Determine the optimal allocation of these seven machines to the four projects and the maximum profit associated with the plan. Show all work.

(2) Assume that you have an option to rent an additional unit of this machine from a rental company (i.e., you will have a total of 8 machines). What is the maximum total rental (in dollars) you would be willing to pay for the extra machine? Explain your answer clearly.