Dynamic Programming
Textbook: Principles of Operations Research for Management, by Frank S. Budnick, Dennis McLeavey, and Richard Mojena
Dynamic programming
Dynamic programming (DP) is a useful mathematical technique for making a sequence of interrelated decisions. It provides a systematic procedure for determining the optimal combination of decisions.
DP vs LP
LP is iterative, i.e., each step represents a complete solution that is non-optimal until the final iteration.
DP is recursive, i.e., it optimizes on a step-by-step basis using information from the preceding step. A single step is sequentially related to the preceding steps and is not by itself a solution to the problem.
Puzzle: River crossing
A farmer went to the market and purchased a fox, a goose, and a bag of beans. On his way home, the farmer came to the bank of a river and hired a boat. But in crossing the river by boat, the farmer could carry only himself and a single one of his purchases. If left alone together, the goose would eat the beans and the fox would eat the goose. How can the farmer carry them all across safely?
Approach of DP
The fundamental approach of DP involves:
1. Breaking a multistage problem down into its subparts or single stages, a process called DECOMPOSITION.
2. RECURSIVE decision making, i.e., one decision at each stage, according to a specific optimization objective of that stage.
3. Combining the results of the stages to solve the entire problem, a process called COMPOSITION.
Computational approaches: Forward Recursion and Backward Recursion
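The two recursions can be contrasted on a small hypothetical staged network (the nodes, arcs and costs below are illustrative, not from the text). Backward recursion folds costs from the destination toward the origin; forward recursion accumulates best costs outward from the origin; both reach the same optimal value:

```python
# Hypothetical 3-stage network: each stage maps (state_in, state_out) -> cost.
cost = [
    {("s", "a"): 2, ("s", "b"): 5},                                 # first stage
    {("a", "c"): 4, ("a", "d"): 1, ("b", "c"): 1, ("b", "d"): 3},   # middle stage
    {("c", "t"): 3, ("d", "t"): 6},                                 # last stage
]

def backward(cost):
    """Fold costs from the destination 't' back toward the source 's'."""
    f = {"t": 0}
    for stage in reversed(cost):
        g = {}
        for (i, j), c in stage.items():
            if j in f:
                g[i] = min(g.get(i, float("inf")), c + f[j])
        f = g
    return f["s"]

def forward(cost):
    """Accumulate the best cost to reach each state, starting from 's'."""
    f = {"s": 0}
    for stage in cost:
        g = {}
        for (i, j), c in stage.items():
            if i in f:
                g[j] = min(g.get(j, float("inf")), f[i] + c)
        f = g
    return f["t"]

print(backward(cost), forward(cost))   # -> 9 9: both recursions agree
```

Either direction works for problems like these; backward recursion is the convention followed in the rest of these notes.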
Symbolic representation of a stage of analysis (model elements and notation):

    input state Si --> [ Stage i : decision xi, return ri ] --> output state Si-1

Symbolic representation of n stages of analysis using backward recursion (a sequence of n decisions):

    Sn --> [ Stage n : xn, rn(Sn, xn), fn(Sn, xn) ] --> Sn-1 --> [ Stage n-1 : xn-1, rn-1(Sn-1, xn-1), fn-1(Sn-1, xn-1) ] --> Sn-2 --> ... --> S1 --> [ Stage 1 : x1, r1(S1, x1), f1(S1, x1) ] --> S0
Si - state of the system prior to stage i
ri(Si, xi) - direct criterion return from stage i
fi(Si, xi) - cumulative criterion return from stages 1 through i
Bellman’s Principle of Optimality
An optimal set of decision rules has the property that, regardless of the ith decision, the remaining decisions must be optimal with respect to the outcome that results from the ith decision.
Shortest Route problem
The objective is to determine the path from the origin to the destination that minimizes the sum of the numbers along the directed arcs of the path. Typically, the number associated with each arc represents the distance, cost, or time of travelling along that particular segment of the journey.
[Network diagram: nodes 1 to 8, with node 1 the origin and node 8 the destination. Arc distances: 1-2: 5, 1-3: 3, 1-4: 4; 2-5: 6, 2-6: 2, 3-5: 7, 3-7: 4, 4-7: 5; 5-8: 4, 6-8: 1, 7-8: 3.]
Shortest Route problem
DP decomposes this problem into three stages, one for each leg of the journey.
[The same network marked by legs: Leg 1 (node 1 to nodes 2, 3, 4) is Stage 3; Leg 2 (to nodes 5, 6, 7) is Stage 2; Leg 3 (to node 8) is Stage 1.]

Input nodes by stage:
Stage 3: node 1
Stage 2: nodes 2, 3, 4
Stage 1: nodes 5, 6, 7
Backward recursion - Stage 1
Entering state S1    Decision x1 (travel to 8)    Optimal Policy
(travel from)        f1 = r1 + f0*                Decision x1*    Cumulative return f1*
5                    4+0=4                        8               4
6                    1+0=1                        8               1
7                    3+0=3                        8               3
Stage 2
Entering state S2    Decision x2 (travel to): f2 = r2 + f1*    Optimal Policy
(travel from)        x2=5       x2=6       x2=7                Decision x2*    Cumulative return f2*
2                    6+4=10     2+1=3      -                   6               3
3                    7+4=11     -          4+3=7               7               7
4                    -          -          5+3=8               7               8

Stage 3
Entering state S3    Decision x3 (travel to): f3 = r3 + f2*    Optimal Policy
(travel from)        x3=2       x3=3       x3=4                Decision x3*    Cumulative return f3*
1                    5+3=8      3+7=10     4+8=12              2               8
Solution
[Network diagram with the optimal path highlighted.]

Path: 1 - 2 - 6 - 8; Cost: 5 + 2 + 1 = 8
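The backward recursion above can be sketched in a few lines of Python (a minimal sketch; the dictionary layout and variable names are my own, while the arc distances are those of the network):

```python
# Backward-recursion sketch for the shortest-route network.
arcs = {
    (1, 2): 5, (1, 3): 3, (1, 4): 4,                         # leg 1 (stage 3)
    (2, 5): 6, (2, 6): 2, (3, 5): 7, (3, 7): 4, (4, 7): 5,   # leg 2 (stage 2)
    (5, 8): 4, (6, 8): 1, (7, 8): 3,                         # leg 3 (stage 1)
}
stages = [[5, 6, 7], [2, 3, 4], [1]]   # states, processed from stage 1 back to stage 3

f = {8: (0, None)}   # f*[node] = (cumulative return, best next node); f0* = 0
for nodes in stages:
    for i in nodes:
        f[i] = min((d + f[j][0], j) for (a, j), d in arcs.items() if a == i)

path, node = [1], 1          # read the optimal policy off from node 1
while node != 8:
    node = f[node][1]
    path.append(node)
print(path, f[1][0])         # -> [1, 2, 6, 8] 8
```

Each entry of `f` is exactly one row of the stage tables: the cumulative return fi* and the optimal decision xi* for that entering state.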
Exercises to do
1. Change the distance of arc 2-6 to 5 and completely solve for the shortest route using backward recursion.
2. Using DP, determine the longest route for the same problem.
Resource allocation problem
A company has 5 salesmen to be allocated to 3 marketing zones. The return or profit depends upon the number of salesmen working in the zone. The expected returns for different numbers of salesmen in the different zones, as estimated from past records, are shown below. Determine the optimal allocation policy.
No. of salesmen    Zone 1    Zone 2    Zone 3
0                  45        30        35
1                  58        45        45
2                  70        60        52
3                  82        70        64
4                  93        79        72
5                  101       90        82
Let s be the number of salesmen available,
xj be the number of salesmen allocated to zone j, and
Pj(xj) be the return from zone j when xj salesmen are allocated to it.
Formulation
Stage 1 = Zone 3: input state S1, decision x1, return r1, output state S0 = 0
Stage 2 = Zone 2: input state S2, decision x2, return r2, output state S1
Stage 3 = Zone 1: input state S3 = 5, decision x3, return r3, output state S2

Max z = P1(x1) + P2(x2) + P3(x3)
subject to x1 + x2 + x3 <= 5, x1, x2, x3 >= 0
Stage 1
Entering state S1    Decision return    Optimal policy
                                        x1*    f1*
0                    35                 0      35
1                    45                 1      45
2                    52                 2      52
3                    64                 3      64
4                    72                 4      72
5                    82                 5      82

Recursion equation: f2(s2, x2) = r2(x2) + f1*(s1)
f2*(s2) = opt over x2 of { r2(x2) + f1*(s2 - x2) }
Transformation equation: s1 = s2 - x2
Stage 2
Entering      Decision x2: r2(x2) + f1*(s2 - x2)                                    Optimal policy
state S2      x2=0       x2=1       x2=2        x2=3        x2=4       x2=5        x2*    f2*
0             30+35=65                                                              0      65
1             30+45=75   45+35=80                                                   1      80
2             30+52=82   45+45=90   60+35=95                                        2      95
3             30+64=94   45+52=97   60+45=105   70+35=105                           2,3    105
4             30+72=102  45+64=109  60+52=112   70+45=115   79+35=114               3      115
5             30+82=112  45+72=117  60+64=124   70+52=122   79+45=124  90+35=125    5      125
Stage 3
Entering      Decision x3: r3(x3) + f2*(s3 - x3)                                           Optimal policy
state S3      x3=0         x3=1         x3=2         x3=3        x3=4        x3=5          x3*    f3*
5             45+125=170   58+115=173   70+105=175   82+95=177   93+80=173   101+65=166    3      177
Solution (tracing back through the stages):
Stage 3 (Zone 1): S3 = 5, x3 = 3, r3 = 82; S2 = 5 - 3 = 2
Stage 2 (Zone 2): S2 = 2, x2 = 2, r2 = 60; S1 = 2 - 2 = 0
Stage 1 (Zone 3): S1 = 0, x1 = 0, r1 = 35; S0 = 0

Total return = 82 + 60 + 35 = 177
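The same stage-by-stage recursion can be sketched in Python (variable names are my own; the returns P[zone][x] are the table values above):

```python
# DP sketch for allocating 5 salesmen across 3 zones.
P = {
    1: [45, 58, 70, 82, 93, 101],   # zone 1 returns for x = 0..5 salesmen
    2: [30, 45, 60, 70, 79, 90],    # zone 2
    3: [35, 45, 52, 64, 72, 82],    # zone 3
}

# Stage 1 (zone 3): with s salesmen left, allot all of them (returns rise with x).
f = {s: P[3][s] for s in range(6)}
# Stages 2 and 3 (zones 2, then 1): f*(s) = max over x of P(x) + f*(s - x)
for zone in (2, 1):
    f = {s: max(P[zone][x] + f[s - x] for x in range(s + 1)) for s in range(6)}

print(f[5])   # -> 177, matching f3* in the stage 3 table
```

Keeping the intermediate `f` dictionaries (f1*, f2*) would let you trace back the allocation 3, 2, 0 exactly as done in the solution above.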
Cargo Loading Problem
A truck can carry cargo up to 10 tonnes. There are three different items available to be loaded in the truck, each with a different utility value. The objective is to find the number of units of each item to load in the truck so as to maximize the total utility. The details are given in the table below.

Item    Weight    Utility or benefit
A       4         11
B       3         7
C       5         5
Stage 1 = Item C: input state S1, decision x1, return r1, output state S0 = 0
Stage 2 = Item B: input state S2, decision x2, return r2, output state S1
Stage 3 = Item A: input state S3 = 10, decision x3, return r3, output state S2

Each state Si is the available space entering the stage, xi is the loading decision for that stage's item, and ri is the utility due to loading that item.
Stage 1 (Item C: weight 5, utility 5)
Entering state S1    Decision x1    f1 = r1 + f0*    Optimal Policy
                                                     Decision x1*    Cumulative return f1*
0, 1, 2, 3, 4        0              0                0               0
5, 6, 7, 8, 9        1              5                1               5
10                   2              10               2               10
Stage 2 (Item B: weight 3, utility 7)
Entering      Decision x2: f2 = r2 + f1*                   Optimal policy
state S2      x2=0       x2=1       x2=2       x2=3        x2*    f2*
0, 1, 2       0                                            0      0
3             0          7+0=7                             1      7
4             0          7+0=7                             1      7
5             0+5=5      7+0=7                             1      7
6             0+5=5      7+0=7      14+0=14                2      14
7             0+5=5      7+0=7      14+0=14                2      14
8             0+5=5      7+5=12     14+0=14                2      14
9             0+5=5      7+5=12     14+0=14    21+0=21     3      21
10            0+10=10    7+5=12     14+0=14    21+0=21     3      21
Stage 3 (Item A: weight 4, utility 11)
Entering      Decision x3: f3 = r3 + f2*          Optimal policy
state S3      x3=0       x3=1        x3=2         x3*    f3*
10            0+21=21    11+14=25    22+0=22      1      25
Solution (tracing back through the stages):
Stage 3 (Item A): S3 = 10, x3 = 1, r3 = 11; S2 = 10 - 4 = 6
Stage 2 (Item B): S2 = 6, x2 = 2, r2 = 14; S1 = 6 - 6 = 0
Stage 1 (Item C): S1 = 0, x1 = 0, r1 = 0; S0 = 0

Maximum utility = 11 + 14 + 0 = 25
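This is an unbounded-knapsack recursion, and the three stage tables collapse into a short Python sketch (the layout is my own; the weights, utilities and capacity are the problem data):

```python
# Knapsack-style sketch of the cargo-loading DP: capacity 10 tonnes,
# items A(4 t, utility 11), B(3 t, 7), C(5 t, 5), any number of each.
items = [("C", 5, 5), ("B", 3, 7), ("A", 4, 11)]   # stage 1 first, as above

f = {s: 0 for s in range(11)}   # f0*(s) = 0: no item considered yet
for name, weight, utility in items:
    # f*(s) = max over x of utility*x + f_prev*(s - weight*x)
    f = {s: max(utility * x + f[s - weight * x] for x in range(s // weight + 1))
         for s in f}

print(f[10])   # -> 25, matching f3* in the stage 3 table
```

After each pass of the loop, `f` reproduces one of the stage tables above (f1*, then f2*, then f3*).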
Production Schedule Problem
A company has to meet the following demand for an item in the months May, June, July and August.

Month     May    June    July    August
Demand    30     40      20      30

The item is to be delivered at the end of each month. The production cost associated with manufacturing the item depends on the number of units produced:

No. of units produced    0    10      20      30       40
Production cost (Rs.)    0    7000    9000    10000    11000

The maximum storage capacity is 30 units. Items that are not delivered in the same month may be stored in inventory at a cost of Rs. 100/unit/month. The beginning inventory in May is 20 units, and the ending inventory in August must be zero. For practical purposes, items can be produced, stored and delivered in batches of 10 units. Determine the production-inventory schedule that minimizes the total production and inventory cost while meeting the demand requirements.
Formulation (backward recursion)
Stage 1 = August: input state S1, decision x1, return r1, output state S0 = 0
Stage 2 = July: input state S2, decision x2, return r2, output state S1
Stage 3 = June: input state S3, decision x3, return r3, output state S2
Stage 4 = May: input state S4 = 20, decision x4, return r4, output state S3

Let Si be the initial inventory in month i,
xi be the number of units produced in month i,
Di be the demand in month i, and
C(xi) be the cost of producing xi units in month i.

Stage transformation equation: Si-1 = Si + xi - Di
ri(Si, xi) = C(xi) + 100 Si
fi*(Si) = min over xi of { C(xi) + 100 Si + fi-1*(Si-1) }
Stage 1 (August, demand 30; the ending inventory S0 must be 0, so x1 = 30 - S1)
Entering state S1    Feasible decision x1    f1 = C(x1) + 100 S1    x1*    f1*
0                    30                      10000 + 0 = 10000      30     10000
10                   20                      9000 + 1000 = 10000    20     10000
20                   10                      7000 + 2000 = 9000     10     9000
30                   0                       0 + 3000 = 3000        0      3000
Stage 2 (July, demand 20): f2 = C(x2) + 100 S2 + f1*(S2 + x2 - 20)
S2 = 0:  x2=20: 9000+0+10000=19000; x2=30: 10000+0+10000=20000; x2=40: 11000+0+9000=20000  ->  x2* = 20, f2* = 19000
S2 = 10: x2=10: 7000+1000+10000=18000; x2=20: 9000+1000+10000=20000; x2=30: 10000+1000+9000=20000; x2=40: 11000+1000+3000=15000  ->  x2* = 40, f2* = 15000
S2 = 20: x2=0: 0+2000+10000=12000; x2=10: 7000+2000+10000=19000; x2=20: 9000+2000+9000=20000; x2=30: 10000+2000+3000=15000  ->  x2* = 0, f2* = 12000
S2 = 30: x2=0: 0+3000+10000=13000; x2=10: 7000+3000+9000=19000; x2=20: 9000+3000+3000=15000  ->  x2* = 0, f2* = 13000
Stage 3 (June, demand 40): f3 = C(x3) + 100 S3 + f2*(S3 + x3 - 40)
S3 = 0:  x3=40: 11000+0+19000=30000  ->  x3* = 40, f3* = 30000
S3 = 10: x3=30: 10000+1000+19000=30000; x3=40: 11000+1000+15000=27000  ->  x3* = 40, f3* = 27000
S3 = 20: x3=20: 30000; x3=30: 27000; x3=40: 25000  ->  x3* = 40, f3* = 25000
S3 = 30: x3=10: 29000; x3=20: 27000; x3=30: 25000; x3=40: 27000  ->  x3* = 30, f3* = 25000
S3 = 40: x3=0: 23000; x3=10: 26000; x3=20: 25000; x3=30: 27000  ->  x3* = 0, f3* = 23000
Stage 4 (May, demand 30): f4 = C(x4) + 100 S4 + f3*(S4 + x4 - 30)
S4 = 20: x4=10: 7000+2000+30000=39000; x4=20: 9000+2000+27000=38000; x4=30: 10000+2000+25000=37000; x4=40: 11000+2000+25000=38000  ->  x4* = 30, f4* = 37000
Solution (tracing back through the stages):
Stage 4 (May): S4 = 20, x4 = 30, production cost 10000; S3 = 20 + 30 - 30 = 20
Stage 3 (June): S3 = 20, x3 = 40, production cost 11000; S2 = 20 + 40 - 40 = 20
Stage 2 (July): S2 = 20, x2 = 0, production cost 0; S1 = 20 + 0 - 20 = 0
Stage 1 (August): S1 = 0, x1 = 30, production cost 10000; S0 = 0

Total cost = production (10000 + 11000 + 0 + 10000) + inventory (2000 + 2000 + 2000 + 0) = 37000
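The four stage tables can also be generated by a short backward-recursion sketch in Python (the variable names and dictionary layout are my own; the costs, demands and the Rs. 100/unit/month holding charge are the problem data):

```python
# Sketch of the production-inventory DP, in batches of 10 units.
C = {0: 0, 10: 7000, 20: 9000, 30: 10000, 40: 11000}   # production cost C(x)
demand = [30, 40, 20, 30]                              # May, June, July, August

f = {0: 0}                      # ending inventory in August must be zero
for D in reversed(demand):      # backward recursion: August is stage 1
    g = {}
    for S in range(0, 40, 10):  # entering inventory; storage capacity is 30
        costs = [C[x] + 100 * S + f[S + x - D]
                 for x in C if S + x - D in f]   # keep feasible decisions only
        if costs:
            g[S] = min(costs)
    f = g

print(f[20])   # -> 37000, the minimum total cost starting May with 20 units
```

Each pass of the outer loop produces one stage table (f1* for August first, f4* for May last); the membership test `S + x - D in f` enforces both demand satisfaction and the 30-unit storage limit.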
Thank You