dynamic programming[2003]

Dynamic Programming. Textbook: Principles of Operations Research for Management, by Frank S. Budnick, Dennis McLeavey, and Richard Mojena.


Page 1: Dynamic Programming[2003]

Dynamic Programming

Text book: Principles of Operations Research for Management

Frank S. Budnick, Dennis McLeavey, Richard Mojena

Page 2: Dynamic Programming[2003]

Dynamic programming

Dynamic programming (DP) is a useful mathematical technique for making a sequence of interrelated decisions. It provides a systematic procedure for determining the optimal combination of decisions.

Page 3: Dynamic Programming[2003]

DP vs LP

LP is iterative: each step produces a complete but non-optimal solution, which is improved at the next iteration.

DP is recursive: it optimizes on a step-by-step basis, using information from the preceding step. A single step is sequentially related to the preceding steps and is not by itself a solution to the problem.

Page 4: Dynamic Programming[2003]

Puzzle: River crossing

A farmer went to the market and purchased a fox, a goose, and a bag of beans. On his way home, the farmer came to the bank of a river and hired a boat. But in crossing the river by boat, the farmer could carry only himself and a single one of his purchases. If left alone together, the goose will eat the beans and the fox will eat the goose. How can the farmer carry them all across safely?
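
The puzzle is itself a small sequential-decision problem and can be solved mechanically by searching its state space. A minimal sketch in Python (the state encoding and function names are illustrative, not from the text):

```python
from collections import deque

# A state is (items on the near bank, is the farmer on the near bank).
# Breadth-first search returns a shortest sequence of crossings.
ITEMS = {"fox", "goose", "beans"}
UNSAFE = [{"goose", "beans"}, {"fox", "goose"}]  # pairs that cannot be left alone

def solve():
    start = (frozenset(ITEMS), True)   # everything on the near bank
    goal = (frozenset(), False)        # everything (and the farmer) across
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (near, farmer_near), path = queue.popleft()
        if (near, farmer_near) == goal:
            return path
        side = near if farmer_near else ITEMS - near
        for cargo in [None] + sorted(side):     # cross alone or with one item
            new_near = set(near)
            if cargo is not None:
                (new_near.discard if farmer_near else new_near.add)(cargo)
            state = (frozenset(new_near), not farmer_near)
            # the bank the farmer just left must contain no unsafe pair
            left_behind = state[0] if farmer_near else ITEMS - state[0]
            if any(pair <= left_behind for pair in UNSAFE):
                continue
            if state not in seen:
                seen.add(state)
                queue.append((state, path + [cargo if cargo else "alone"]))
    return None
```

BFS finds the classic seven-crossing solution: the goose must cross first and last.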

Page 5: Dynamic Programming[2003]
Page 6: Dynamic Programming[2003]

Approach of DP

The fundamental approach of DP involves:
1. Breaking a multistage problem down into its subparts or single stages, a process called DECOMPOSITION.
2. RECURSIVE decision making, i.e., one decision at each stage, according to a specific optimization objective of that stage.
3. Combining the results of each stage to solve the entire problem, a process called COMPOSITION.

Computational approaches: Forward Recursion and Backward Recursion

Page 7: Dynamic Programming[2003]

Symbolic representation of n stages of analysis using backward recursion

Model elements for a stage of analysis: stage i receives an input state Si, a decision xi is made, a return ri is generated, and an output state Si-1 is passed on.

[Notational diagram for a single stage]

Page 8: Dynamic Programming[2003]

Sequence of n decisions: symbolic representation of n-stage analysis, backward recursion.

[Diagram: Stage n receives input state Sn and decision xn, producing return rn(Sn, xn), cumulative return fn(Sn, xn), and output state Sn-1; Stage n-1 receives Sn-1 and xn-1, producing rn-1(Sn-1, xn-1), fn-1(Sn-1, xn-1), and Sn-2; and so on down to Stage 1, which receives S1 and x1, producing r1(S1, x1), f1(S1, x1), and output state S0.]

Si - state of the system prior to stage i
ri(Si, xi) - direct criterion return from stage i
fi(Si, xi) - cumulative criterion return from stages 1 through i

Page 9: Dynamic Programming[2003]

Bellman’s Principle of Optimality

An optimal set of decision rules has the property that, regardless of the ith decision, the remaining decisions must be optimal with respect to the outcome that results from the ith decision.
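
The principle translates directly into a memoized recursion: the best cumulative return from a stage onward depends only on the entering state, not on how that state was reached. A toy sketch (the return function and state transformation below are made up purely to illustrate the shape of the recursion, not taken from the text):

```python
from functools import lru_cache

# Hypothetical stage data: r_i(S_i, x_i) and the state transformation.
RETURN = lambda stage, state, x: (stage + 1) * x   # toy direct return
NEXT_STATE = lambda state, x: state - x            # toy transformation S_{i-1}

@lru_cache(maxsize=None)
def f(stage, state):
    # f(i, S_i) = opt over x of { r_i(S_i, x) + f(i-1, S_{i-1}) }
    if stage == 0:
        return 0                                   # f0 = 0: no stages remain
    return max(RETURN(stage, state, x) + f(stage - 1, NEXT_STATE(state, x))
               for x in range(state + 1))
```

Because later stages pay (stage + 1) per unit in this toy, the whole state is spent at the last stage: f(3, 5) = 4 * 5 = 20.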

Page 10: Dynamic Programming[2003]

Shortest Route problem

The objective is to determine the path from the origin to the destination that minimizes the sum of the numbers along the directed arcs of the path. Typically, the number associated with each arc represents the distance, cost, or time of travelling along that particular segment of the journey.

[Network diagram: eight nodes, with node 1 the origin and node 8 the destination; the number on each directed arc is its length.]

Page 11: Dynamic Programming[2003]

Shortest Route problem

[Network diagram repeated.]

Page 12: Dynamic Programming[2003]

DP decomposes this problem into three stages, one for each leg of the journey.

[Network diagram partitioned into legs: Leg 1, from node 1 to nodes 2, 3, 4, is Stage 3; Leg 2, from nodes 2, 3, 4 to nodes 5, 6, 7, is Stage 2; Leg 3, from nodes 5, 6, 7 to node 8, is Stage 1.]

Page 13: Dynamic Programming[2003]

[Network diagram with legs and stages marked.]

Stage 3 input node: 1
Stage 2 input nodes: 2, 3, 4
Stage 1 input nodes: 5, 6, 7

Page 14: Dynamic Programming[2003]

Backward recursion - Stage 1

[Stage 1 notation: input state S1, decision x1, return r1(S1, x1), cumulative return f1(S1, x1), output state S0.]

Entering state S1   Decision x1 (travel to 8):   Optimal policy
(travel from)       f1 = r1 + f0*                x1*    f1*
5                   4+0=4                        8      4
6                   1+0=1                        8      1
7                   3+0=3                        8      3

Page 15: Dynamic Programming[2003]

Stage 1 (repeated):

Entering state S1   Decision x1 (travel to 8):   Optimal policy
(travel from)       f1 = r1 + f0*                x1*    f1*
5                   4+0=4                        8      4
6                   1+0=1                        8      1
7                   3+0=3                        8      3

Stage 2:

Entering state S2   Decision x2 (travel to): f2 = r2 + f1*   Optimal policy
(travel from)       5          6          7                  x2*    f2*
2                   6+4=10     2+1=3      -                  6      3
3                   7+3=10     -          4+3=7              7      7
4                   -          -          5+3=8              7      8

Page 16: Dynamic Programming[2003]

Stage 3:

Entering state S3   Decision x3 (travel to): f3 = r3 + f2*   Optimal policy
(travel from)       2          3          4                  x3*    f3*
1                   5+3=8      3+7=10     4+8=12             2      8

(The Stage 2 table is repeated for reference.)

Page 17: Dynamic Programming[2003]

All three stage tables (Stage 1, Stage 2, Stage 3) shown together for reference.

Page 18: Dynamic Programming[2003]

Solution

[Network diagram with the optimal route 1 - 2 - 6 - 8 highlighted.]

Path: 1 - 2 - 6 - 8; Cost: 5 + 2 + 1 = 8
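
The three stage tables can be sketched as one backward recursion in code. The arc list below is only a partial reconstruction (the diagram is garbled in this copy, so only arcs recoverable from the stage tables are included); the optimal path and cost agree with the solution above:

```python
# Backward-recursion shortest route: f[node] holds the minimum distance
# from that node to the destination, built from the last leg backwards.
ARCS = {
    (1, 2): 5, (1, 3): 3, (1, 4): 4,     # leg 1 (stage 3)
    (2, 5): 6, (2, 6): 2,                # leg 2 (stage 2)
    (3, 7): 4, (4, 7): 5,
    (5, 8): 4, (6, 8): 1, (7, 8): 3,     # leg 3 (stage 1)
}

def shortest_route(origin=1, destination=8):
    f = {destination: 0}                 # f0: zero distance at the destination
    best = {}                            # optimal decision x* at each node
    # process nodes in decreasing order so every successor is already solved
    for node in sorted({a for a, _ in ARCS}, reverse=True):
        choices = {b: d + f[b] for (a, b), d in ARCS.items() if a == node}
        best[node] = min(choices, key=choices.get)
        f[node] = choices[best[node]]
    # composition: follow the optimal policy from the origin
    path, node = [origin], origin
    while node != destination:
        node = best[node]
        path.append(node)
    return path, f[origin]
```

Running `shortest_route()` reproduces the table values f(2) = 3, f(3) = 7, f(4) = 8 and the optimal path 1 - 2 - 6 - 8 with cost 8.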

Page 19: Dynamic Programming[2003]

Exercises

1. Change the distance of arc 2-6 to 5 and completely solve for the shortest route using backward recursion.

2. Using DP, determine the longest route for the same problem.

Page 20: Dynamic Programming[2003]

Resource allocation problem

A company has 5 salesmen to be allocated to 3 marketing zones. The return or profit depends on the number of salesmen working in a zone. The expected returns for different numbers of salesmen in the different zones, as estimated from past records, are shown below. Determine the optimal allocation policy.

No. of salesmen    Marketing zone
                   1      2      3
0                  45     30     35
1                  58     45     45
2                  70     60     52
3                  82     70     64
4                  93     79     72
5                  101    90     82

Page 21: Dynamic Programming[2003]

Let s be the number of salesmen available,
xj be the number of salesmen allocated to zone j, and
Pj(xj) be the return from zone j when xj salesmen are allocated to it.

Page 22: Dynamic Programming[2003]

Formulation

[Stage diagram: Stage 3 = Zone 1 (input state S3 = 5, decision x3, return r3), Stage 2 = Zone 2 (S2, x2, r2), Stage 1 = Zone 3 (S1, x1, r1, output state S0 = 0).]

Max z = P1(x1) + P2(x2) + P3(x3)
subject to x1 + x2 + x3 <= 5; x1, x2, x3 >= 0

Page 23: Dynamic Programming[2003]

Stage 1 (Zone 3):

Entering state S1   Decision return   Optimal policy
                                      x1*    f1*
0                   35                0      35
1                   45                1      45
2                   52                2      52
3                   64                3      64
4                   72                4      72
5                   82                5      82

Page 24: Dynamic Programming[2003]

Stage 2

f2(s2, x2) = r2(x2) + f1*(s1)
f2*(s2) = opt over x2 { r2(x2) + f1*(s2 - x2) }

Transformation equation: s1 = s2 - x2

Page 25: Dynamic Programming[2003]

The Stage 1 table, the given returns table, and the stage diagram, repeated from the preceding slides for reference.

Page 26: Dynamic Programming[2003]

Stage 2 (Zone 2): f2 = P2(x2) + f1*(s2 - x2)

Entering     Decision x2                                                               Optimal policy
state S2     0           1           2           3           4          5              x2*    f2*
0            30+35=65                                                                  0      65
1            30+45=75    45+35=80                                                      1      80
2            30+52=82    45+45=90    60+35=95                                          2      95
3            30+64=94    45+52=97    60+45=105   70+35=105                             2,3    105
4            30+72=102   45+64=109   60+52=112   70+45=115   79+35=114                 3      115
5            30+82=112   45+72=117   60+64=124   70+52=122   79+45=124   90+35=125     5      125

Page 27: Dynamic Programming[2003]

Given data and stage diagram repeated; Stage 2 results summarized:

S2    Decision x2                               Optimal
      0      1      2      3      4      5      x2*    f2*
0     65                                        0      65
1     75     80                                 1      80
2     82     90     95                          2      95
3     94     97     105    105                  2,3    105
4     102    109    112    115    114           3      115
5     112    117    124    122    124    125    5      125

Page 28: Dynamic Programming[2003]

Stage 3 (Zone 1): f3 = P1(x3) + f2*(5 - x3)

Entering     Decision x3                                                                  Optimal policy
state S3     0            1            2            3           4           5             x3*    f3*
5            45+125=170   58+115=173   70+105=175   82+95=177   93+80=173   101+65=166    3      177

Page 29: Dynamic Programming[2003]

Tracing the optimal policy: S3 = 5; x3* = 3 (Zone 1, r3 = 82), leaving S2 = 2; x2* = 2 (Zone 2, r2 = 60), leaving S1 = 0; x1* = 0 (Zone 3, r1 = 35), S0 = 0.

Total return = 82 + 60 + 35 = 177
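
The three-stage allocation recursion above can be sketched as follows (a sketch; the function and variable names are illustrative):

```python
# RETURNS[zone][x] = profit when x salesmen work in `zone` (given data).
RETURNS = {
    1: [45, 58, 70, 82, 93, 101],
    2: [30, 45, 60, 70, 79, 90],
    3: [35, 45, 52, 64, 72, 82],
}

def allocate(total=5):
    stage_order = [3, 2, 1]                # backward recursion: zone 3 is stage 1
    f = [0] * (total + 1)                  # f0(s) = 0 for every state s
    tables = {}
    for zone in stage_order:
        new_f = [0] * (total + 1)
        best_x = [0] * (total + 1)
        for s in range(total + 1):
            # try every allocation x to this zone given s salesmen remain
            for x in range(s + 1):
                v = RETURNS[zone][x] + f[s - x]
                if v > new_f[s]:
                    new_f[s], best_x[s] = v, x
        f, tables[zone] = new_f, best_x
    # composition: recover the policy from zone 1 (stage 3) downward
    plan, s = {}, total
    for zone in reversed(stage_order):     # zones 1, 2, 3
        plan[zone] = tables[zone][s]
        s -= plan[zone]
    return plan, f[total]
```

Running `allocate()` reproduces the stage tables' optimum: 3 salesmen to zone 1, 2 to zone 2, 0 to zone 3, for a total return of 177.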

Page 30: Dynamic Programming[2003]

Cargo Loading Problem

Page 31: Dynamic Programming[2003]

A truck can carry cargo up to 10 tonnes. There are three different items available to be loaded in the truck, each with a different utility value. The objective is to find the number of units of each item to load in the truck so as to maximize total utility. The details are given in the table below.

Item    Weight    Utility or benefit
A       4         11
B       3         7
C       5         5

Page 32: Dynamic Programming[2003]

[Stage diagram: Stage 3 = Item A (input state S3 = 10, the available space; decision x3, the loading decision for Item A; return r3, the utility due to loading Item A), Stage 2 = Item B (S2, x2, r2), Stage 1 = Item C (S1, x1, r1, output state S0).]

Page 33: Dynamic Programming[2003]

Stage 1 (Item C: weight 5, utility 5):

Entering state S1   Decision x1:    Optimal policy
                    f1 = r1 + f0*   x1*    Cumulative return f1*
0, 1, 2, 3, 4       0               0      0
5, 6, 7, 8, 9       1               1      5
10                  2               2      10

Page 34: Dynamic Programming[2003]

Stage 2 (Item B: weight 3, utility 7): f2 = 7 x2 + f1*(s2 - 3 x2)

Entering     Decision x2                                   Optimal policy
state S2     0           1          2           3          x2*    f2*
0, 1, 2      0                                             0      0
3            0           7+0=7                             1      7
4            0           7+0=7                             1      7
5            0+5=5       7+0=7                             1      7
6            0+5=5       7+0=7      14+0=14                2      14
7            0+5=5       7+0=7      14+0=14                2      14
8            0+5=5       7+5=12     14+0=14                2      14
9            0+5=5       7+5=12     14+0=14     21+0=21    3      21
10           0+10=10     7+5=12     14+0=14     21+0=21    3      21

Page 35: Dynamic Programming[2003]

Stage 3 (Item A: weight 4, utility 11): f3 = 11 x3 + f2*(s3 - 4 x3)

Entering     Decision x3                          Optimal policy
state S3     0           1            2           x3*    f3*
10           0+21=21     11+14=25     22+0=22     1      25

Page 36: Dynamic Programming[2003]

Tracing the optimal policy: S3 = 10; x3* = 1 (Item A, r3 = 11), leaving S2 = 6; x2* = 2 (Item B, r2 = 14), leaving S1 = 0; x1* = 0 (Item C, r1 = 0).

Maximum utility = 11 + 14 + 0 = 25

Page 37: Dynamic Programming[2003]

Production Schedule Problem

Page 38: Dynamic Programming[2003]

A company has to meet the following demand for an item in the months May, June, July and August. The item is to be delivered at the end of each month.

Month     May    June    July    August
Demand    30     40      20      30

The production cost depends on the number of units produced:

No. of units produced    0    10      20      30       40
Production cost          0    7000    9000    10000    11000

Maximum storage capacity is 30 units. Items that are not delivered in the same month may be stored in inventory at a cost of Rs. 100/unit/month. The beginning inventory in May is 20 units, and the ending inventory in August must be zero. For practical purposes, items can be produced, stored and delivered in batches of 10 units. Determine the production-inventory schedule that minimizes the total production and inventory cost while meeting the demand requirements.

Page 39: Dynamic Programming[2003]

[Stage diagram: Stage 4 = May (input state S4 = 20, decision x4, return r4), Stage 3 = June (S3, x3, r3), Stage 2 = July (S2, x2, r2), Stage 1 = August (S1, x1, r1, output state S0 = 0).]

Page 40: Dynamic Programming[2003]

Let Si be the initial inventory in month i,
xi be the number of units produced in month i,
Di be the demand in month i, and
C(xi) be the cost of producing xi units in month i.

Stage transformation equation: Si-1 = Si + xi - Di

ri(Si, xi) = C(xi) + 100 Si

fi*(Si) = min over xi { C(xi) + 100 Si + fi-1*(Si-1) }

Page 41: Dynamic Programming[2003]

Stage 1 (August, demand 30): f1 = C(x1) + 100 S1, with S0 = S1 + x1 - 30 = 0.

S1 = 0:  x1 = 30: 10000.  Optimal: x1* = 30, f1* = 10000
S1 = 10: x1 = 20: 9000+1000=10000.  Optimal: x1* = 20, f1* = 10000
S1 = 20: x1 = 10: 7000+2000=9000.  Optimal: x1* = 10, f1* = 9000
S1 = 30: x1 = 0: 0+3000=3000.  Optimal: x1* = 0, f1* = 3000

Page 42: Dynamic Programming[2003]

Stage 2 (July, demand 20): f2 = C(x2) + 100 S2 + f1*(S1), with S1 = S2 + x2 - 20.

S2 = 0:  x2=20: 9000+10000=19000;  x2=30: 10000+10000=20000;  x2=40: 11000+9000=20000.  Optimal: x2* = 20, f2* = 19000
S2 = 10: x2=10: 7000+10000+1000=18000;  x2=20: 9000+10000+1000=20000;  x2=30: 10000+9000+1000=20000;  x2=40: 11000+3000+1000=15000.  Optimal: x2* = 40, f2* = 15000
S2 = 20: x2=0: 0+10000+2000=12000;  x2=10: 7000+10000+2000=19000;  x2=20: 9000+9000+2000=20000;  x2=30: 10000+3000+2000=15000.  Optimal: x2* = 0, f2* = 12000
S2 = 30: x2=0: 0+10000+3000=13000;  x2=10: 7000+9000+3000=19000;  x2=20: 9000+3000+3000=15000.  Optimal: x2* = 0, f2* = 13000

Page 43: Dynamic Programming[2003]

Stage 3 (June, demand 40): f3 = C(x3) + 100 S3 + f2*(S2), with S2 = S3 + x3 - 40.

S3 = 0:  x3=40: 11000+19000=30000.  Optimal: x3* = 40, f3* = 30000
S3 = 10: x3=30: 10000+19000+1000=30000;  x3=40: 11000+15000+1000=27000.  Optimal: x3* = 40, f3* = 27000
S3 = 20: x3=20: 9000+19000+2000=30000;  x3=30: 10000+15000+2000=27000;  x3=40: 11000+12000+2000=25000.  Optimal: x3* = 40, f3* = 25000
S3 = 30: x3=10: 7000+19000+3000=29000;  x3=20: 9000+15000+3000=27000;  x3=30: 10000+12000+3000=25000;  x3=40: 11000+13000+3000=27000.  Optimal: x3* = 30, f3* = 25000
S3 = 40: x3=0: 0+19000+4000=23000;  x3=10: 7000+15000+4000=26000;  x3=20: 9000+12000+4000=25000;  x3=30: 10000+13000+4000=27000.  Optimal: x3* = 0, f3* = 23000

Page 44: Dynamic Programming[2003]

Stage 4 (May, demand 30): f4 = C(x4) + 100 S4 + f3*(S3), with S3 = S4 + x4 - 30.

S4 = 20: x4=10: 7000+30000+2000=39000;  x4=20: 9000+27000+2000=38000;  x4=30: 10000+25000+2000=37000;  x4=40: 11000+25000+2000=38000.  Optimal: x4* = 30, f4* = 37000

Page 45: Dynamic Programming[2003]

Solution

Tracing the optimal policy forward: May: S4 = 20, x4* = 30, leaving S3 = 20; June: x3* = 40, leaving S2 = 20; July: x2* = 0, leaving S1 = 0; August: x1* = 30, ending inventory S0 = 0.

Total cost: f4*(20) = 37000 (production 10000 + 11000 + 0 + 10000 = 31000, plus inventory 2000 + 2000 + 2000 + 0 = 6000)
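
The four-stage production-inventory recursion can be sketched as follows (a sketch; it assumes, as in ri = C(xi) + 100 Si, that the holding charge applies to the inventory entering each month, and the names are illustrative):

```python
# Backward recursion over months; state = inventory entering the month,
# in batches of 10, capped at the 30-unit storage capacity.
COST = {0: 0, 10: 7000, 20: 9000, 30: 10000, 40: 11000}  # production cost
DEMAND = {"May": 30, "June": 40, "July": 20, "August": 30}
HOLD = 100          # Rs per unit per month, on entering inventory
CAP = 30            # maximum storage

def schedule(start_inventory=20):
    months = ["August", "July", "June", "May"]   # backward recursion order
    f = {0: 0}                                   # ending inventory in August = 0
    tables = {}
    for month in months:
        new_f, best_x = {}, {}
        states = [start_inventory] if month == "May" else range(0, CAP + 1, 10)
        for s in states:
            for x in COST:
                s_next = s + x - DEMAND[month]   # inventory passed onward
                if s_next in f:                  # feasible next state only
                    v = COST[x] + HOLD * s + f[s_next]
                    if s not in new_f or v < new_f[s]:
                        new_f[s], best_x[s] = v, x
        f, tables[month] = new_f, best_x
    # composition: trace the decisions forward from May
    plan, s = {}, start_inventory
    for month in ["May", "June", "July", "August"]:
        plan[month] = tables[month][s]
        s += plan[month] - DEMAND[month]
    return plan, f[start_inventory]
```

Running `schedule()` reproduces the stage tables: produce 30 in May, 40 in June, 0 in July, and 30 in August, for a minimum total cost of 37000.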

Page 46: Dynamic Programming[2003]

Thank You