2101692 Analytical Methods in
Construction Management
Lectures 10 & 11 Dynamic Programming
Academic Year 2021 First Term
Associate Prof. Veerasak Likhitruangsilp, Ph.D. Department of Civil Engineering
Faculty of Engineering Chulalongkorn University
Bangkok, Thailand
Dynamic Programming
119
Chapter 8
Dynamic Programming
1. Introduction
Dynamic programming (DP) is a powerful optimization method for making a
sequence of optimal decisions (i.e., multiperiod or multistage decision making). As
presented in Chapter 7, linear programming (LP) models can be applied to problems that are
linear and have a convex feasible region, but LP is not designed for sequential decision problems.
In contrast, DP can be applied to problems that are nonlinear, have a nonconvex
feasible region, or involve discontinuous variables, and it is well suited to sequential decision
problems. However, DP is limited to dealing with problems with relatively few constraints.
General Concepts
LP is based on the use of optimality conditions. The consequence of linearity is that we can
define in advance the nature of the optimum (i.e., it is at a corner point, from which all
departures worsen the objective function). It consists of an organized search through a set of
possible solutions until the optimality criterion is met.
DP has no optimality criterion that we can define in advance. It is based on the concept of
enumeration. The basic idea is that we systematically enumerate possible solutions, calculate
how well each performs, and select the best. Since we enumerate the possible solutions, it
does not matter whether the feasible region is convex or whether the problem is linear. Enumeration
is thus the principle that gives dynamic programming the power to deal with nonlinear and
nonconvex problems that LP cannot.
The difficulty with a complete enumeration is that the number of possible solutions is
an exponential function of the number of possible decisions, the number of values they may
have, and the number of periods over which the system is considered. For example,
Maximum No. of Possible Designs = (No. of Values)^((No. of Periods) × (No. of Variables))
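This growth can be made concrete with a few lines of code; the specific counts below are hypothetical, chosen only to show the scale:

```python
# Size of the solution space that a complete enumeration must search:
# (No. of Values) raised to the power (No. of Periods * No. of Variables).
def max_possible_designs(num_values: int, num_periods: int, num_variables: int) -> int:
    return num_values ** (num_periods * num_variables)

# A toy problem: 3 possible values, 2 variables, 2 periods.
print(max_possible_designs(3, 2, 2))               # 81
# A modest problem: 10 values, 5 variables, 20 periods.
print(len(str(max_possible_designs(10, 20, 5))))   # 101 digits, i.e., 10^100 designs
```

Even at a billion evaluations per second, the second case is far beyond reach, which is why implicit enumeration matters.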
This number is large for even relatively small problems, and astronomical for really
significant problems such as the design of a regional telephone network.
The special approach used by DP is implicit enumeration. It considers all possible
combinations in principle, but not in fact. Based on certain crucial assumptions about a
problem, it systematically eliminates many sets of possible combinations because they can be
shown to be inferior.
Implicit enumeration works by doing a partial optimization at each stage. It selects
the best way to reach any point at that stage, and discards all the other possibilities.
Solution Procedure
DP is not based on a standard formulation, as LP is. It is more an approach or a concept
rather than a formula. The analyst wishing to use DP will often have to use imagination and
skill to organize a problem so that it can be solved efficiently. The fact that DP problems can
differ from each other substantially leads to a great practical difficulty.
Optimizing a system using DP needs to deal with four elements (de Neufville 1990):
organization of the problem
formula to be used in the partial optimizations
constraints
solutions
Organization
The first and essential part of the solution is to organize the problem into states and stages.
The state is the description of the situation attained as the result of a transition from one stage to
another. The corresponding return functions must then be defined. Each stage in a sequential
problem might represent:
A division of time – days, hours, and seconds.
A division of space – distance
A sequence of choices
The stages of nonsequential problems are the several parts of a system such as the
different possible investments in a portfolio of loans or shares in companies. For
nonsequential problems, the order we assign to the different stages will be absolutely
arbitrary.
The stages are generally easier to imagine, whereas defining the states can be
difficult. A simple example is that stage i represents the construction of an electric plant at a
specific site and state Xi is the level of investment in that plant. The states may be nominal
(e.g., names rather than degrees or levels).
The definition of the return functions (or cost functions) for each stage can be
extremely complex. For example, the return function could represent the profits or the
amount of energy obtained from an investment of level (state) Xi in a particular project.
The return function of a sequential problem represents the change in state from one
stage to the next. It can be expressed as a formula or, more commonly, defined by a table providing
each return in terms of the state of the previous stage, Xi-1, and the state Xi of stage i.
Formula
The solution procedure for DP consists of two parts:
Partial optimization at each stage
Repetition of this process through all stages
The objective of the partial optimization is to define the best result that can be
obtained at any state at the end of a specific number of stages. This quantity is given by the
cumulative return function, by analogy with the return function that relates to only one stage.
The cumulative return function is denoted by fn(i), which designates the effect of being in
state i, having passed through the first n stages.
The cumulative return function is built up iteratively, from stage to stage. It is
defined in terms of the return function of the current stage and the cumulative return function
for previous stages. Sometimes, the cumulative return function can be defined by a closed-
form equation (called a recurrence function). Often, this function is defined by examination
of the various possibilities as entered in a table.
Constraints
Constraints in DP are dealt with quite differently than they are in LP; for example, they are
almost never written as equations. As a matter of fact, a DP problem may not be described in
terms of equations. Most constraints in DP are embedded directly in the organization of the
problem. For example, if there is a restriction on the maximum number of states for any
stage, only that number is provided.
Solution
The optimal solution is provided by the cumulative return function over all the stages. In
addition to the optimal solution, we can also know the optimal policy by determining how we
reached the optimal solution. This is in fact quite easy because the cumulative return
function is defined in terms of the return function for the last stage. Knowing the optimal
solution, we then know the transition from the previous stage that brought us to that point;
from the cumulative return function for the previous stage we can trace back one more stage,
and so on.
Applications
DP models can be applied to various problems including equipment replacement,
capacity expansion, resource allocation, inventory planning, and route guidance.
Classifications
DP models can be classified in many ways. First, if we consider the randomness of
the systems, they can be classified into:
Deterministic DP – no uncertainty or non-random systems
Stochastic or probabilistic DP – uncertain or random systems
Second, if we consider the time horizon of the systems, the models can be classified
into two types, namely, finite-horizon DP and infinite-horizon DP.
In this chapter, we will focus only on deterministic, finite-horizon DP problems,
which are the most fundamental type of DP models.
2. Model Formulation by Decision Tree
Decision trees, which are commonly used in decision analysis (see Chapter 3), can be adopted
as a tool for sequential decision problems. A decision tree for DP problems, as shown in
Figure 8.1, consists of:
Decision node, representing the opportunity to make a decision
Arc, showing a decision
Root, a beginning node without “in” arcs
Terminal node, the end of each sequence of decisions
It should be noted that the cost associated with a particular decision in the decision
tree is conditional on all previous decisions along the path leading to that decision, as shown
in Figure 8.2.
Figure 8.1 Decision tree for deterministic DP problems
Figure 8.2 Costs associated with decisions along the path
Example 8.1 Resource Allocation
We want to allocate a maximum of 2 units of resources among 3 activities so as to maximize our
total profit. The profit matrix is as follows.

Profits ($1,000) associated with the number of resources allocated:

Activity    0 units    1 unit    2 units
   1           0         10        20
   2           0          5        20
   3           0         10        12
Objective function: Maximize the total profit
Observations:
The problem seems to be static, but it is actually dynamic because it involves a
sequence of decisions.
There are many possible ways to model this problem. Figure 8.3 shows only one
possible way to model it.
The directed paths from the root to the terminal nodes are all feasible decision
sequences.
The total profit associated with a particular path is the sum of the profits for all the
corresponding decisions along that path.
If we represent the cost (i.e., negative profit) for each arc by the length of that arc,
the cost of a feasible decision sequence will be the total length of all the
corresponding arcs along that path. Thus, our problem can be viewed as finding a
shortest path in the decision tree network from the root to the terminal nodes.
Figure 8.3 Decision tree for the resource allocation problem
Answers: Solving the problem using the decision tree yields two sets of answers: (1) the
maximum profit = $20,000 and (2) three possible optimal policies, which are 1)
allocating two units of resources to Activity 2; 2) allocating one unit of resources to Activity 1
and one unit to Activity 3; and 3) allocating two units of resources to Activity 1.
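These answers can be verified by brute-force enumeration of the decision tree (a sketch; the profit figures are those of the matrix above, in $1,000):

```python
from itertools import product

# Profit ($1,000) for allocating 0, 1, or 2 resource units to each activity.
profit = {1: [0, 10, 20], 2: [0, 5, 20], 3: [0, 10, 12]}

best, best_allocations = -1, []
# Enumerate every allocation (a1, a2, a3) that uses at most 2 units in total.
for alloc in product(range(3), repeat=3):
    if sum(alloc) <= 2:
        total = sum(profit[i + 1][a] for i, a in enumerate(alloc))
        if total > best:
            best, best_allocations = total, [alloc]
        elif total == best:
            best_allocations.append(alloc)

print(best)                      # 20, i.e., $20,000
print(sorted(best_allocations))  # [(0, 2, 0), (1, 0, 1), (2, 0, 0)]
```

The three maximizing allocations correspond exactly to the three optimal policies listed above.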
Example 8.2 Equipment Replacement
The current equipment is one year old. The company wants to develop a cost-minimizing
equipment replacement strategy over a 3-year horizon. Assume that they make a
decision at the beginning of each year either to keep (K) the current equipment or to buy (B)
a new one. New equipment costs $1,000. Other costs of this equipment are shown in
the following table.
Age of equipment (at the beginning of the year):   0   1   2   3   4
Maintenance costs ($10^2):                         0   1   2   3   -
Salvage values ($10^2):                            -   8   7   6   5
The cost function of this particular equipment can be expressed by:
Cost = f(Salvage value , Purchase cost , Maintenance cost)
Objective function: Minimize the total cost
Observations:
This DP problem is associated with three decision stages (i.e., the beginning of years
1, 2, and 3) and two possible choices for each stage (i.e., buy a new machine or keep the
current one). Thus, there is a total of 8 (= 2^3) paths in this decision tree, as shown in Figure
8.4.
Answers: Solving this problem on the decision tree provides two sets of answers:
(1) Minimum total cost = –$200 (profit)
(2) Five optimal policies are (B,B,B), (B,B,K), (B,K,B), (K,B,B), and (K,B,K), where
(optimal policy for year 1, optimal policy for year 2, optimal policy for year 3).
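A brute-force check over the 2^3 = 8 keep/buy sequences reproduces these answers. The sketch below assumes, consistent with the stated results, that a purchase trades in the current machine at its salvage value and that the machine on hand is salvaged at the end of the horizon (all amounts in $10^2):

```python
from itertools import product

# Costs in units of $100, as in the table above.
MAINT = {0: 0, 1: 1, 2: 2, 3: 3}     # maintenance cost by age at the start of a year
SALVAGE = {1: 8, 2: 7, 3: 6, 4: 5}   # salvage value by age
NEW_COST = 10                         # new equipment costs $1,000

def total_cost(policy):
    """Cost of a 3-year (K)eep/(B)uy sequence, starting with a 1-year-old machine."""
    age, cost = 1, 0
    for decision in policy:
        if decision == "B":
            cost += NEW_COST - SALVAGE[age]  # buy new, trade in the old machine
            age = 0
        cost += MAINT[age]                   # operate for one year
        age += 1
    return cost - SALVAGE[age]               # salvage the machine at the horizon's end

costs = {p: total_cost(p) for p in product("KB", repeat=3)}
best = min(costs.values())
optimal = sorted("".join(p) for p, c in costs.items() if c == best)
print(best)     # -2, i.e., -$200 (a profit)
print(optimal)  # ['BBB', 'BBK', 'BKB', 'KBB', 'KBK']
```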
Figure 8.4. Decision tree for the equipment replacement problem
3. Model Formulation by Dynamic Programming Network
Several nodes of the decision tree in Figure 8.4 are identical and can be aggregated
into 10 groups (e.g., a, b, etc.), as shown in Figure 8.5. This process is called aggregation.
Based on these groups of identical nodes, we can represent the above decision tree in
the form of a new network called a Dynamic Programming Network, as shown in Figure 8.6.
The dynamic programming network (DPN) is an equivalent representation of the decision
tree (DT), but it is more economical. It creates cycles of decision making that reduce the size
of problem.
Some important observations are as follows.
(1) There are fewer nodes in the Dynamic Programming Network (DPN) than in
the Decision Tree (DT) (i.e., 10 nodes in Figure 8.6 as compared to 15 nodes in
Figure 8.4).
(2) There is a 1-1 correspondence between the paths from left to right in DT and DPN
(i.e., the number of paths in DT = the number of paths in DPN).
(3) The nodes in the DPN can be interpreted as the ages of the equipment at each of
the periods. These nodes can be called dynamic programming states.
Thus, the above DPN can be presented by substituting each node by a dynamic
programming state, consisting of two sets of information: (Beginning of year, Age of
machine). For example, state (3,2) represents the equipment at the beginning of year 3 with
the age of 2 years. Figure 8.7 presents a DPN using the DP states.
The construction of the DPN concludes the formulation phase of DP problems.
Finding a shortest or longest path (i.e., a least cost or highest profit path) is the next phase of
DP problems, the solution phase.
4. Solution Procedure
The main objective of DP solution is to efficiently find a shortest path (e.g., lowest cost) in a
finite acyclic directed network.
Figure 8.5 Aggregation process for the equipment replacement problem
Figure 8.6 Dynamic programming network for the equipment replacement problem
Figure 8.7 Dynamic programming network (DPN) represented by the states
Definitions:

Directed network: S is a set of nodes (S, and hence the network, is finite).
T ⊆ S × S is the set of directed arcs [e.g., (i, j) ∈ T].
Start node: s ∈ S.
Successor set: SCS(i) = {j : (i, j) ∈ T}.
A path is a sequence (i1, i2, ..., in) such that (ik, ik+1) ∈ T for k = 1, 2, ..., n-1.
A path with i1 = in is a cycle.
A network is cyclic if it contains at least one cycle; otherwise, it is acyclic.
Let tij = the length of arc (i, j) ∈ T.
Path length = t(i1,i2) + t(i2,i3) + ... + t(in-1,in) for path (i1, i2, ..., in).
The nodes of a finite acyclic directed network can be labeled so that (i, j) ∈ T implies i < j.
Deterministic DP models can be solved by a variety of algorithms, two of which are
backward DP and forward DP.
5. Backward Dynamic Programming
The following examples are used to illustrate how to develop the solution procedure for
backward DP.
Example 8.3 Shortest Path for Diamond Network
Find the shortest path from A to B for the diamond network shown in Figure 8.8.
Solution: Determine the shortest path from B to A (backward), as shown in Figure 8.9.
Answer: The shortest path is the bold line in Figure 8.9 for a path length of 13 units.
Observations:
We cannot improve the formulation of this problem because this diamond network
is the dynamic programming network of the problem, which requires three ups
and three downs to go from A to B.
There are a total of C(6,3) = 20 possible paths in this network.
However, we can improve the solution procedure by using the principle of optimality.
Figure 8.8 Shortest path problem for the diamond network
Figure 8.9 Solution of the shortest path problem for the diamond network
Principle of Optimality (Backward DP)
“If the shortest path from node A to node B passes through the intermediate node C, then the
path followed from C to B must be a shortest path from C to B,” as shown in Figure 8.10.
Figure 8.10 Principle of optimality (backward DP)
The principle means that a subpath of an optimal path is itself optimal. This principle
can be proved easily by contradiction: if there were a path from C to B shorter than the one on
the optimal path, then replacing the C-to-B segment with it would yield a path from A to B
shorter than the shortest path, which is clearly a contradiction.
The principle of optimality provides the concept of the solution procedure by
backward DP: “If we found ourselves at a particular node, what would be the shortest path
from that node to the end?”
Thus, we can solve Example 8.3 by using this principle of optimality. For example,
at C we must find the shortest path from that point to B by comparing the upper path
(i.e., up and down), with a length of 4 units (2 + 2), and the lower path (i.e., down and up),
with a length of 9 units (8 + 1). Clearly, the shortest path is the upper path.
Based on this concept, we can eliminate calculations that would be unnecessary under the
direct enumeration approach. For example, whether we are calculating the cost of path
1-2-3-6-8-9 or of path 1-3-4-6-8-9, we need to calculate the subpath 6-8-9 only once. Thus, DP
decreases the number of calculations by breaking a large problem into small problems
(decomposition).
Example 8.4 Shortest Path for Acyclic Directed Network
Find the shortest path from node 1 to node 9 for the network in Figure 8.11.
Figure 8.11 Acyclic directed network problem
Calculation: Using backward DP minimization

Let fi = the minimum distance from node i to the terminal node (node 9).

f9 = 0 (called the boundary condition)
f8 = 10 + f9 = 10 + 0 = 10 (to node 9)
f7 = 3 + f9 = 3 + 0 = 3 (to node 9)
f6 = min{7 + f8, 15 + f9} = min{7 + 10, 15 + 0} = min{17, 15} = 15 (to node 9)
f5 = 7 + f7 = 7 + 3 = 10 (to node 7)
f4 = min{4 + f5, 3 + f6, 15 + f7, 7 + f8} = min{4 + 10, 3 + 15, 15 + 3, 7 + 10} = min{14, 18, 18, 17} = 14 (to node 5)
f3 = min{3 + f4, 4 + f6} = min{3 + 14, 4 + 15} = min{17, 19} = 17 (to node 4)
f2 = min{6 + f4, 12 + f5} = min{6 + 14, 12 + 10} = min{20, 22} = 20 (to node 4)
f1 = min{1 + f2, 2 + f3} = min{1 + 20, 2 + 17} = min{21, 19} = 19 (to node 3)

By tracing back, the shortest path in this network is 1 → 3 → 4 → 5 → 7 → 9, with a
total length of 19 units.
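The backward recursion above can be reproduced in a few lines of code; the arc lengths are those used in the calculations (taken from Figure 8.11):

```python
# Arcs (i, j): t_ij for the network of Example 8.4 (from Figure 8.11).
t = {(1, 2): 1, (1, 3): 2, (2, 4): 6, (2, 5): 12, (3, 4): 3, (3, 6): 4,
     (4, 5): 4, (4, 6): 3, (4, 7): 15, (4, 8): 7, (5, 7): 7,
     (6, 8): 7, (6, 9): 15, (7, 9): 3, (8, 9): 10}

f = {9: 0}    # boundary condition: f9 = 0
succ = {}     # best successor of each node, for the traceback
for i in range(8, 0, -1):   # nodes are labeled so that (i, j) in T implies i < j
    f[i], succ[i] = min((t[i, j] + f[j], j) for (a, j) in t if a == i)

path = [1]
while path[-1] != 9:
    path.append(succ[path[-1]])
print(f[1], path)   # 19 [1, 3, 4, 5, 7, 9]
```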
A shortest path tree can be used to illustrate the shortest path from any node to the
terminal node. This tree is constructed by keeping only the arcs that lie on the
shortest paths. For the above example, the shortest path tree is shown in Figure 8.12. For
example, the shortest path from node 2 to the terminal node (node 9) is 2 → 4 → 5 → 7 → 9
(length = 6 + 4 + 7 + 3 = 20).
Figure 8.12 Shortest path tree
Optimality Equation (Backward DP)
The calculation procedure of DP above can be expressed algebraically by an optimality equation
(also called a functional equation or optimal value function), consisting of three components:
(1) Recursive equation
(2) Boundary conditions
(3) Answer
For example, the optimality equation of Example 8.4 can be written as:
fi = min over j of {tij + fj} for (i, j) ∈ T
fi = 0 for i = 9

Answer: f1, where fi is the length of the shortest path from node i to the terminal node (node 9),
and tij is the distance between node i and node j.
General Form of Backward DP
Figure 8.13 illustrates the formulation of the backward DP optimality equation for any state i.
The general form of backward DP can be expressed by:
Define f(i) = the length of a shortest path from node i to node n, for i = 1, 2, ..., n, and
C(i, j) = the cost associated with the transition from state i to state j.

f(i) = min over j of {C(i, j) + f(j)} for (i, j) ∈ T
f(i) = 0 for i = n

Answer: f(1)
Solving this optimality equation yields two main results:
Minimum cost from each node to the end, especially the minimum cost (1)f
Optimal policy or action – If-then rule (e.g., if we are at node i, then we should go
to node j) from the shortest path tree
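The general form translates directly into a memoized recursion. In this sketch, the network is represented as a dict mapping each node i to a dict of its successors SCS(i) with costs C(i, j); the small example network is hypothetical:

```python
from functools import lru_cache

def shortest_path_value(succ_costs, start, end):
    """Backward DP: f(i) = min over j of {C(i, j) + f(j)}, with f(end) = 0.

    succ_costs maps each node i to a dict {j: C(i, j)} of its successors.
    Works on any finite acyclic directed network.
    """
    @lru_cache(maxsize=None)
    def f(i):
        if i == end:
            return 0
        return min(c + f(j) for j, c in succ_costs[i].items())

    return f(start)

# Hypothetical 4-node acyclic network, just to exercise the code.
network = {1: {2: 1, 3: 4}, 2: {3: 2, 4: 6}, 3: {4: 3}, 4: {}}
print(shortest_path_value(network, 1, 4))  # 6, via 1-2-3-4 (1 + 2 + 3)
```

The `lru_cache` decorator is what implements the decomposition: each f(i) is computed once and reused, exactly as in the tabular procedure.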
Figure 8.13 Backward DP formulation
6. Forward Dynamic Programming
Another way to solve deterministic sequential decision problems is by using forward dynamic
programming. The concepts of forward DP are very similar to those of backward DP. The
only difference is that solving by forward DP will start from the beginning node and continue
to the terminal node.
Principle of Optimality (Forward DP)
“If a shortest path from node A to node B passes through the intermediate node C, the path
followed from A to C must be a shortest path from A to C,” as shown in Figure 8.14.
Figure 8.14 Principle of optimality (forward DP)
Again, this principle can be proved by contradiction (similar to that in backward DP).
It also provides the concept of the solution procedure by forward DP: “If we found ourselves
at a particular node, what would be the shortest path from the beginning to that node?”
Example 8.5 Shortest Path for Acyclic Directed Network (Revisit Example 8.4)
Find the shortest path of the network in Example 8.4 by using forward DP minimization.
Calculation:

Let gi = the minimum distance from the start node (node 1) to node i.

g1 = 0 (called the boundary condition)
g2 = g1 + 1 = 0 + 1 = 1 (from node 1)
g3 = g1 + 2 = 0 + 2 = 2 (from node 1)
g4 = min{g2 + 6, g3 + 3} = min{1 + 6, 2 + 3} = min{7, 5} = 5 (from node 3)
g5 = min{g2 + 12, g4 + 4} = min{1 + 12, 5 + 4} = min{13, 9} = 9 (from node 4)
g6 = min{g3 + 4, g4 + 3} = min{2 + 4, 5 + 3} = min{6, 8} = 6 (from node 3)
g7 = min{g4 + 15, g5 + 7} = min{5 + 15, 9 + 7} = min{20, 16} = 16 (from node 5)
g8 = min{g4 + 7, g6 + 7} = min{5 + 7, 6 + 7} = min{12, 13} = 12 (from node 4)
g9 = min{g6 + 15, g7 + 3, g8 + 10} = min{6 + 15, 16 + 3, 12 + 10} = min{21, 19, 22} = 19 (from node 7)

Tracing back 9 ← 7 ← 5 ← 4 ← 3 ← 1, the shortest path in this network is 1 → 3 → 4 → 5 → 7 → 9,
with a total length of 19 units (the same answer as obtained by backward DP).
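The forward recursion can be coded the same way, sweeping the nodes in increasing label order (arc lengths again from Figure 8.11):

```python
# Arcs (i, j): t_ij for the network of Examples 8.4/8.5 (from Figure 8.11).
t = {(1, 2): 1, (1, 3): 2, (2, 4): 6, (2, 5): 12, (3, 4): 3, (3, 6): 4,
     (4, 5): 4, (4, 6): 3, (4, 7): 15, (4, 8): 7, (5, 7): 7,
     (6, 8): 7, (6, 9): 15, (7, 9): 3, (8, 9): 10}

g = {1: 0}    # boundary condition: g1 = 0
pred = {}     # best predecessor of each node, for the traceback
for j in range(2, 10):   # every arc goes from a lower to a higher label
    g[j], pred[j] = min((g[i] + t[i, j], i) for (i, b) in t if b == j)

# Trace back from node 9 to node 1, then reverse.
path = [9]
while path[-1] != 1:
    path.append(pred[path[-1]])
path.reverse()
print(g[9], path)   # 19 [1, 3, 4, 5, 7, 9]
```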
The optimality equation for the forward DP of this problem is:

gj = min over i of {gi + tij} for (i, j) ∈ T
gj = 0 for j = 1

Answer: g9
General Form of Forward DP
Figure 8.15 illustrates the formulation of the forward DP optimality equation for any state j.
The general form of forward DP can be expressed by:
Define g(j) = the length of a shortest path from node 1 to node j, for j = 1, 2, ..., n, and
C(i, j) = the cost associated with the transition from state i to state j.

g(j) = min over i of {g(i) + C(i, j)} for (i, j) ∈ T
g(j) = 0 for j = s

Answer: g(n)
Figure 8.15 Forward DP formulation
7. Maximization Problems
The maximization problems are very similar to the minimization problems discussed in the
previous sections. We only change the definition of the optimality function and maximize
this function rather than minimize it.
Backward DP
Define fi = the maximum distance from node i to the end (node n).

Optimality equation:

fi = max over j of {Cij + fj} for (i, j) ∈ T
fi = 0 for i = n

Answer: f1
Forward DP
Define gj = the maximum distance from the start (node s) to node j.

Optimality equation:

gj = max over i of {gi + Cij} for (i, j) ∈ T
gj = 0 for j = s

Answer: gn
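As a check that only the operator changes, the sketch below finds the longest path on the acyclic network of Example 8.4 by replacing min with max in the backward recursion:

```python
# Arcs (i, j): t_ij for the network of Example 8.4 (from Figure 8.11).
t = {(1, 2): 1, (1, 3): 2, (2, 4): 6, (2, 5): 12, (3, 4): 3, (3, 6): 4,
     (4, 5): 4, (4, 6): 3, (4, 7): 15, (4, 8): 7, (5, 7): 7,
     (6, 8): 7, (6, 9): 15, (7, 9): 3, (8, 9): 10}

# Backward maximization: f_i = max over j of {t_ij + f_j}, with f_9 = 0.
f = {9: 0}
for i in range(8, 0, -1):
    f[i] = max(t[i, j] + f[j] for (a, j) in t if a == i)
print(f[1])  # 27
```

The longest path is 1-2-4-6-8-9, with length 1 + 6 + 3 + 7 + 10 = 27.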
8. Applications
The following three problems are adapted from Stark and Nicholls (1972) to illustrate some
applications of DP models in various civil engineering systems.
Example 8.6 Combinatorial Problem
An aggregate producer has four identical mobile crushing-screening plants and four sources
of raw material, which he can use during the coming construction season. Given the profit
matrix below, how many plants should he assign to each site?
Profit Matrix

No. of plants        Raw material site
assigned             1      2      3      4
    1               47     39     24     35
    2               81     62     47     51
    3              105     84     72     61
    4              132     91     87     68
Let xn = the number of plants assigned to site n, and
pn(xn) = the profit obtained from allocating xn plants to site n.

Thus,

Maximize p1(x1) + p2(x2) + p3(x3) + p4(x4)
subject to x1 + x2 + x3 + x4 ≤ 4

It should be noted that since the profit function increases monotonically, we should
use all resources.

Let f(n, y) = the maximum profit obtained from assigning y plants to the raw material
sites 1 through n, where y ≤ 4 and y is an integer.

The recursive equation can be written as:

f(n, y) = max over xn ≤ y of {pn(xn) + f(n-1, y - xn)}, xn integer

Answer: f(4, 4)
Calculation:
(1) Tabular method

f(1, y) = max{p1(x1) : x1 ≤ y}, which gives the boundary conditions.

 y    f(1, y)    x1*
 0       0       0
 1      47       1
 2      81       2
 3     105       3
 4     132       4

f(2, y) = max over x2 ≤ y of {p2(x2) + f(1, y - x2)}

 y \ x2   0            1            2           3           4         f(2, y)   x2*
 0        0            -            -           -           -            0       0
 1        0+ 47= 47    39+  0= 39   -           -           -           47       0
 2        0+ 81= 81    39+ 47= 86   62+ 0= 62   -           -           86       1
 3        0+105=105    39+ 81=120   62+47=109   84+ 0= 84   -          120       1
 4        0+132=132    39+105=144   62+81=143   84+47=131   91+0=91    144       1

f(3, y) = max over x3 ≤ y of {p3(x3) + f(2, y - x3)}

 y \ x3   0            1            2           3           4         f(3, y)   x3*
 0        0            -            -           -           -            0       0
 1        0+ 47= 47    24+  0= 24   -           -           -           47       0
 2        0+ 86= 86    24+ 47= 71   47+ 0= 47   -           -           86       0
 3        0+120=120    24+ 86=110   47+47= 94   72+ 0= 72   -          120       0
 4        0+144=144    24+120=144   47+86=133   72+47=119   87+0=87    144      0, 1

f(4, y) = max over x4 ≤ y of {p4(x4) + f(3, y - x4)}

 y \ x4   0            1            2           3           4         f(4, y)   x4*
 0*       0            -            -           -           -            0       0
 1*       0+ 47= 47    35+  0= 35   -           -           -           47       0
 2*       0+ 86= 86    35+ 47= 82   51+ 0= 51   -           -           86       0
 3*       0+120=120    35+ 86=121   51+47= 98   61+ 0= 61   -          121       1
 4        0+144=144    35+120=155   51+86=137   61+47=108   68+0=68    155       1

* Unnecessary calculation (only y = 4 is needed at the final stage)
Answer: The maximum profit equals 155. The optimal assignment is (2, 1, 0, 1); that is, two
plants at site 1, one plant at site 2, no plants at site 3, and one plant at site 4 (obtained by
reading back through the tables or the graph).
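The tabular recursion f(n, y) = max{pn(xn) + f(n-1, y - xn)} can be checked directly in code (a sketch using the profit matrix above):

```python
# Profit matrix: p[n][x] = profit from assigning x plants to site n (x = 0..4).
p = {1: [0, 47, 81, 105, 132],
     2: [0, 39, 62, 84, 91],
     3: [0, 24, 47, 72, 87],
     4: [0, 35, 51, 61, 68]}

f = {(0, y): 0 for y in range(5)}   # boundary: no sites, no profit
best_x = {}
for n in range(1, 5):
    for y in range(5):
        f[n, y], best_x[n, y] = max((p[n][x] + f[n - 1, y - x], x) for x in range(y + 1))

# Read the optimal assignment back from the tables (the traceback).
assignment, y = [], 4
for n in range(4, 0, -1):
    x = best_x[n, y]
    assignment.append(x)
    y -= x
assignment.reverse()
print(f[4, 4], tuple(assignment))  # 155 (2, 1, 0, 1)
```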
(2) Graphical method
Figure 8.16 shows the solution of this problem using the graphical method.
Figure 8.16 Solving the combinatorial problem using the graphical method
Example 8.7 Replacement of tractor-scraper units
A contractor has a fleet of tractor-drawn scrapers for moving earth on short hauls. As a
tractor or scraper ages, the annual profit of the tractor-scraper unit decreases due to increased
maintenance costs. To simplify for illustrative purposes, assume that the contractor keeps a
tractor no more than 4 years and a scraper no more than 2 years. The contractor estimates his
annual net return per tractor-scraper unit as:
Estimated net return of the tractor-scraper unit for the following year ($10^3):

Tractor age (years):   0   1   2   3   0   1   2   3
Scraper age (years):   0   0   0   0   1   1   1   1
Net return:           11   9   6   3   9   7   4   2
The cost of a new tractor is $9,000, and of a new scraper $3,000. Replacements at the
end of year i are charged to the ith year. At the end of each year, the contractor chooses one
of the following policies for each tractor-scraper unit:
(1) Replace neither
(2) Replace tractor only
(3) Replace scraper only
(4) Replace both
What replacement policy maximizes profit for a given period (planning horizon)?
Calculation:
In this problem, we determine the optimal replacement policy for the eight different age
combinations (states) of the two machines over 1-, 2-, 3-, and 4-year time horizons (stages).
The cost function consists of the estimated net return of the tractor-scraper unit (from
the above table) and the cost of a new machine (if the replacement is being considered).
The solution procedure is recursive. The optimal policy for the 1-year horizon will be
used for the optimization of the 2-year horizon, and so on. All the calculations can be
summarized in Table 8.1.
Table 8.1 DP solution for the tractor-scraper replacement problem

State (tractor age, scraper age), yr:
          (0,0)        (1,0)       (2,0)       (3,0)       (0,1)       (1,1)       (2,1)       (3,1)

1-yr horizon
Policy 1  11           9           6           3           9           7           4           2
Optimal   1            1           1           1           1           1           1           1

2-yr horizon
Policy 1  11+0+7=18    9+0+4=13    6+0+2=8     ---         ---         ---         ---         ---
Policy 2  11-9+9=11    9-9+9=9     6-9+9=6     3-9+9=3     ---         ---         ---         ---
Policy 3  11-3+9=17    9-3+6=12    6-3+3=6     ---         9-3+9=15    7-3+6=10    4-3+3=4     ---
Policy 4  11-12+11=10  9-12+11=8   6-12+11=5   3-12+11=2   9-12+11=8   7-12+11=6   4-12+11=3   2-12+11=1
Optimal   1            1           1           2           3           3           3           4

3-yr horizon
Policy 1  11+0+10=21   9+0+4=13    6+0+1=7     ---         ---         ---         ---         ---
Policy 2  11-9+15=17   9-9+15=15   6-9+15=12   3-9+15=9    ---         ---         ---         ---
Policy 3  11-3+13=21   9-3+8=14    6-3+3=6     ---         9-3+13=19   7-3+8=12    4-3+3=4     ---
Policy 4  11-12+18=17  9-12+18=15  6-12+18=12  3-12+18=9   9-12+18=15  7-12+18=13  4-12+18=10  2-12+18=8
Optimal   1,3          2,4         2,4         2,4         3           4           4           4

4-yr horizon
Policy 1  11+0+13=24   9+0+10=19   6+0+8=14    ---         ---         ---         ---         ---
Policy 2  11-9+19=21   9-9+19=19   6-9+19=16   3-9+19=13   ---         ---         ---         ---
Policy 3  11-3+15=23   9-3+12=18   6-3+9=12    ---         9-3+15=21   7-3+12=16   4-3+9=10    ---
Policy 4  11-12+21=20  9-12+21=18  6-12+21=15  3-12+21=12  9-12+21=18  7-12+21=16  4-12+21=13  2-12+21=11
Optimal   1            1,2         2           2           3           3,4         4           4

Note: All profits are in thousands of dollars.
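Table 8.1 can be reproduced by a short recursion. The sketch below encodes the assumptions implied by the table: each year earns the net return for the current ages, replacements cost 9, 3, or 12 (in $10^3) with no trade-in credit, and a policy is feasible only if next year's ages stay within the limits (tractor at most 3 years, scraper at most 1 year):

```python
from functools import lru_cache

# Net return r(t, s) ($10^3) for one year of operation with tractor age t, scraper age s.
R = {(0, 0): 11, (1, 0): 9, (2, 0): 6, (3, 0): 3,
     (0, 1): 9, (1, 1): 7, (2, 1): 4, (3, 1): 2}

# Policy number -> (end-of-year replacement cost, next-year ages).
POLICIES = {1: (0,  lambda t, s: (t + 1, s + 1)),   # replace neither
            2: (9,  lambda t, s: (0, s + 1)),       # replace tractor only
            3: (3,  lambda t, s: (t + 1, 0)),       # replace scraper only
            4: (12, lambda t, s: (0, 0))}           # replace both

@lru_cache(maxsize=None)
def value(n, t, s):
    """Maximum profit over an n-year horizon starting with ages (t, s)."""
    if n == 1:
        return R[t, s]   # last year: operate, no end-of-horizon replacement
    best = None
    for cost, nxt in POLICIES.values():
        t2, s2 = nxt(t, s)
        if t2 <= 3 and s2 <= 1:   # next year's ages must be feasible
            v = R[t, s] - cost + value(n - 1, t2, s2)
            best = v if best is None else max(best, v)
    return best

print(value(4, 0, 0))  # 24, matching the 4-yr horizon entry for state (0,0)
print(value(4, 1, 1))  # 16, matching the 4-yr horizon entry for state (1,1)
```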
Example 8.8 Design of an Irrigation System
In this example, dynamic programming will be used to size an irrigation aqueduct and
allocate water from it among several irrigation districts. A total anticipated annual profit
objective will be used.
An irrigation canal is to be designed to serve three districts whose head gates lie 30,
50, and 75 miles, respectively, downstream from the point of diversion. The available annual
water supply is 800,000 acre-ft. The table below shows the anticipated annual irrigation
benefits and the estimated annual costs of the aqueduct as functions of annual water delivery.
The annual water delivery to each irrigation district and the corresponding aqueduct size are
to be chosen to maximize the anticipated annual profit.
We use the notation as follows.
i = irrigation district (i = 1 corresponds to the most upstream district, etc.)
qi = water delivered to irrigation district i, acre-ft/yr
Bi = annual irrigation benefit to district i
bi = annual net benefit to district i
c = amortized cost of aqueduct, $/mile-yr
Ci = total amortized cost of the aqueduct section from head gate i-1 to head gate i, $/yr
The anticipated annual benefits and costs are presented in the following table.
qi (10^3 acre-ft)   B1 ($10^3)   B2 ($10^3)   B3 ($10^3)   c ($10^3/mile)
      200              600          400          900            7.6
      400              980          760        1,250           10.7
      600            1,310        1,090        1,500           13.2
      800            1,600        1,380        1,690           15.2
Calculation:
We will solve this problem by backward DP by considering the last district (district 3) first,
and then proceeding to districts 2 and 1, respectively.
Costs and Benefits for District 3

q3 (10^3 acre-ft)   B3 ($10^3)   c ($10^3/mile)   C3 = (75-50)c = 25c ($10^3)   b3 ($10^3)
      200              900            7.6                    190                    710
      400            1,250           10.7                    268                    983
      600            1,500           13.2                    330                  1,170
      800            1,690           15.2                    380                  1,310
Costs and Benefits for Districts 3 and 2
(quantities in 10^3 acre-ft; money in $10^3; C2 = (50-30)c = 20c)

q2+q3    q2     q3     b3      B2      C2     b2      b2+b3    Optimal allocation
 200      0    200     710       0     152    -152      558    q2 = 0, q3 = 200
         200     0       0     400     152     248      248
 400      0    400     983       0     214    -214      769
         200   200     710     400     214     186      896    q2 = 200, q3 = 200
         400     0       0     760     214     546      546
 600      0    600   1,170       0     264    -264      906
         200   400     983     400     264     136    1,119
         400   200     710     760     264     496    1,206    q2 = 400, q3 = 200
         600     0       0   1,090     264     826      826
 800      0    800   1,310       0     304    -304    1,006
         200   600   1,170     400     304      96    1,266
         400   400     983     760     304     456    1,439
         600   200     710   1,090     304     786    1,496    q2 = 600, q3 = 200
         800     0       0   1,380     304   1,076    1,076
Costs and Benefits for Districts 3, 2, and 1
(quantities in 10^3 acre-ft; money in $10^3; C1 = 30c)

q1+q2+q3    q1    q2+q3   b2+b3    B1      C1     b1      b1+b2+b3   Optimal allocation
   200       0     200      558      0     228    -228       330
            200      0        0    600     228     372       372     q1 = 200, q2 = 0, q3 = 0
   400       0     400      896      0     321    -321       575
            200    200      558    600     321     279       837     q1 = 200, q2 = 0, q3 = 200
            400      0        0    980     321     659       659
   600       0     600    1,206      0     396    -396       810
            200    400      896    600     396     204     1,100
            400    200      558    980     396     584     1,142    q1 = 400, q2 = 0, q3 = 200
            600      0        0  1,310     396     914       914
   800       0     800    1,496      0     456    -456     1,040
            200    600    1,206    600     456     144     1,350
            400    400      896    980     456     524     1,420    q1 = 400, q2 = 200, q3 = 200
            600    200      558  1,310     456     854     1,412
            800      0        0  1,600     456   1,144     1,144
Answer: The optimal design of this irrigation system is (400,000, 200,000, 200,000); that is,
District 1: 400,000 acre-ft, District 2: 200,000 acre-ft, and District 3: 200,000 acre-ft, with a
total net benefit of $1,420,000.
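The three-stage backward calculation can be checked in code. The sketch below assumes deliveries in multiples of 200,000 acre-ft, a zero aqueduct cost for zero flow, and the section costs used in the tables (25c, 20c, and 30c for the sections feeding districts 3, 2, and 1):

```python
# Benefits B_i(q) and unit aqueduct cost c(q) ($10^3), q in 10^3 acre-ft/yr.
B = {1: {0: 0, 200: 600, 400: 980, 600: 1310, 800: 1600},
     2: {0: 0, 200: 400, 400: 760, 600: 1090, 800: 1380},
     3: {0: 0, 200: 900, 400: 1250, 600: 1500, 800: 1690}}
c = {0: 0, 200: 7.6, 400: 10.7, 600: 13.2, 800: 15.2}
MILES = {1: 30, 2: 20, 3: 25}   # length of the section feeding each district

Q = range(0, 1000, 200)

# Stage 3: f3(q3) = B3(q3) - 25 c(q3); this section carries only q3.
f3 = {w: B[3][w] - MILES[3] * c[w] for w in Q}

# Stage 2: f2(w) = max over q2 of {B2(q2) - 20 c(w) + f3(w - q2)}, w = q2 + q3.
f2 = {w: max(B[2][q2] - MILES[2] * c[w] + f3[w - q2] for q2 in Q if q2 <= w)
      for w in Q}

# Stage 1: f1(w) = max over q1 of {B1(q1) - 30 c(w) + f2(w - q1)}, w = q1 + q2 + q3.
f1 = {w: max(B[1][q1] - MILES[1] * c[w] + f2[w - q1] for q1 in Q if q1 <= w)
      for w in Q}

print(round(f1[800]))  # 1420, i.e., $1,420,000 at (q1, q2, q3) = (400, 200, 200)
```

Intermediate values can differ from the tables by a few hundred dollars because the tables round section costs such as 25c to whole thousands; the optimum is unaffected.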
9. Summary
This chapter introduces dynamic programming (DP) models for optimizing construction
engineering and management systems, focusing on deterministic DP. The general
concepts of DP are first discussed by comparing DP with linear programming (LP), which was
presented in Chapter 7.
decision tree and the dynamic programming network. The chapter then discusses two
algorithms for solving DP models (minimization and maximization problems): backward DP
and forward DP. The applications of DP are illustrated by using calculation examples.
REFERENCES
de Neufville, R. (1990). Applied Systems Analysis: Engineering Planning and Technology
Management. McGraw-Hill, Chapter 7.
Hillier, F. S. and Lieberman, G. J. (1995). Introduction to Operations Research (6th Edition).
McGraw-Hill, NY, Chapter 10.
Murty, K. G. (1995). Operations Research: Deterministic Optimization Models. Prentice
Hall, NJ, Chapter 12.
Stark, R. M. and Nicholls, R. L. (1972). Mathematical Foundations for Design: Civil
Engineering Systems. McGraw-Hill, Chapter 6.
2101692 Analytical Methods in Construction Management
Department of Civil Engineering, Chulalongkorn University
Academic Year 2021
Assignment 9
Dynamic Programming
Out: October 20, 2021
Due: November 3, 2021

Problem 1
Draw a dynamic programming network (DPN) of Example 8.1 (Resource Allocation) and verify the answers given in the handout. Define the state as follows:
(n, i) = (nth activity, i resources remaining)

Problem 2
The safety engineer for a factory has estimated the number of employee disabilities (measured in sick days/year) that can be avoided by four different measures: (1) covering dangerous machinery (CDM), (2) providing protective clothing (PPC), (3) improving ventilation (IV), and (4) lowering noise levels (LNL). The information is summarized in the following table.
Investment     Sick days/year avoided by measure
($10^3)       CDM    PPC     IV    LNL
    0           0      0      0      0
   10           5     15      2      5
   20          10     22     12      7
   30          15     25     15     15
   40          20     25     17     25
   50          22     30     25     26
(a) Use dynamic programming to determine how a $70,000 budget could be most effectively employed to reduce disabilities.
(b) How much of a difference would it make if management cut back the safety budget to $60,000? How much money would be saved for each additional sick day the budget cut would cause?
[From Engulf and Devour by de Neufville (1990), Chapter 7, Problem 7.1, page 164]

Problem 3
A Boston rat crawls out of its hole at Beacon and Arlington Streets and decides to raid its favorite garbage can behind a store at Newbury and Exeter Streets. Based on previous experience with cats, traffic, and lighting conditions, the rat estimates the travel time (in minutes) for each block, as shown below.
(a) Use backward dynamic programming to find the rat's minimum-time route to the garbage can and the corresponding travel time.
(b) Use forward dynamic programming to find the same route and travel time, and compare the answer with that of part (a).
(c) If, instead, the rat decided to chew on a telephone pole at Marlboro & Dartmouth, how long would that trip take from its hole? Identify this route as well.
[From Rat Patrol by de Neufville (1990), Chapter 7, Problem 7.5, page 166]
Problem 4
A company manufacturing large industrial equipment projects sales of 3 units/month in January, February, and March. Beginning inventory in January (i.e., ending inventory in December) is 2 units. The company desires that ending inventory in March also be 2 units. Demand for any month may be satisfied from beginning inventory or that month's production. Ending inventory in any month is limited to 5 units, and production during any month is limited to 5 units. There is a $10,000 holding cost for any unit left in inventory at the end of a month. The production cost depends on the number of units manufactured in a month as follows.

Units/month      0     1     2     3     4     5
Cost ($10^3)     0    80   100   120   130   140
(a) Define the objective, decision variables, objective function, constraints, and the return functions for each stage.
(b) Draw a dynamic programming network of this problem.
(c) Determine the optimal production schedule for January, February, and March, and its minimum cost, by using dynamic programming.
[From Winter Inventory by de Neufville (1990), Chapter 7, Problem 7.6, page 166]
Problem 5
Cheryl Consultant has an opportunity to invest in one or more of four proposal-writing projects: A, B, C, and D. Any investment in a project must be made in $1,000 increments. Moreover, there are limits to the amounts Cheryl might invest effectively in any one project: $7,000 for A; $5,000 for B and D; and $6,000 for C. Cheryl is also willing not to invest at all in a project if her money might be invested more profitably in the other projects. Cheryl estimates her returns for investment as follows.
             Level of investment (in $1,000)
Project      1     2     3     4     5     6     7
   A         5    10    15    25    35    50    55
   B         3     6    12    18    30    30    30
   C        20    35    45    55    60    65    65
   D         9    16    29    37    45    45    45
(a) If Cheryl has $8,000, what is her optimal strategy?
(b) What will her returns from this strategy be?
(c) Suppose Cheryl's friend Jill offers her an extra $1,000 if Cheryl agrees to repay Jill $3,000. Should Cheryl accept the offer? Explain your answer.
(d) If Cheryl only has $7,000, how should her investment plan change?
[From Cheryl Consultant by de Neufville (1990), Chapter 7, Problem 7.8, page 167]
Problem 6
Your construction company currently has four projects in progress. You have recently purchased 7 new machines (say, mobile cranes) to be used in these projects. Each project needs up to 3 machines (i.e., 0, 1, 2, or 3 machines can be allocated to each project). The following table shows the estimated profits from allocating different numbers of machines to each project.

             Estimated profit (in $1,000) for allocating i machines
Project      i = 0      1      2      3
   A           30      70    100    140
   B           20      45     80    115
   C           10      20     40     65
   D           20      40     90    110
By using the dynamic programming methods presented in this course:
(1) Determine the optimal allocation of these seven machines to the four projects and the maximum profit associated with the plan. Show all work.
(2) Assume that you have an option to rent one additional machine from a rental company (i.e., you would have a total of 8 machines). What would be the maximum total rental fee (in dollars) you would be willing to pay for the extra machine? Explain your answer clearly.