mit and james orlin © 2003 1 dynamic programming 1 –recursion –principle of optimality
TRANSCRIPT
MIT and James Orlin © 2003
1
Dynamic Programming 1
– Recursion– Principle of Optimality
MIT and James Orlin © 2003
2
Dynamic Programming
Transforms a complex optimization problem into a sequence of simpler ones.
Usually begins at the end and works backwards to the beginning.
Can handle a wide range of problems Relies on recursion, and on the principle of
optimality Developed by Richard Bellman
MIT and James Orlin © 2003
3
Recursion example
There are 11 people in a room. How many ways can one select exactly 6 of them?
Let f(n,k) denote the number of subgroups of size k out of n people. We want f(11,6)
1 2 3 4 5 6 7 8 9 10 11
The number of subgroups containing “1” is f(10,5).
The number of subgroups not containing “1” is f(10,6).
MIT and James Orlin © 2003
4
f(n,k) = f(n-1,k-1) + f(n-1, k)f(n,n)=f(n,0)=1
f(0,0)
f(1,0)
f(2,0)
f(3,0)
f(4,0)
f(5,0)
f(6,0)
f(7,0)
f(8,0)
f(1,1)
f(2,1) f(2,2)
f(3,1) f(3,2) f(3,3)
f(4,1) f(4,2) f(4,3) f(4,4)
f(5,1) f(5,2) f(5,3) f(5,4) f(5,5)
f(6,1) f(6,2) f(6,3) f(6,4) f(6,5) f(6,6)
f(7,1) f(7,2) f(7,3) f(7,4) f(7,5) f(7,6) f(7,7)
f(8,1) f(8,2) f(8,3) f(8,4) f(8,5) f(8,6) f(8,7) f(8,8)
1
1
1
1
1
1
1
1
1
1
2 1
3 3 1
4 6 4 1
5 10 10 5 1
6 15 20 15 6 1
7 21 35 35 21 7 1
8 28 56 70 56 28 8 1
Reduces each problem to previously solved problem(s).
MIT and James Orlin © 2003
5
A warm-up example
Suppose that there are 20 matches on a table, and the person who picks up the last match wins. At each alternating turn, my opponent or I can pick up 1 or 2 matches. Assuming that I go first, how can I be sure of winning the game?
(Play with partner, and then discuss).
MIT and James Orlin © 2003
6
20 Matches
Choose a person to go first. Alternate in eliminating 1 or 2 matches. The person who takes the last match wins.
MIT and James Orlin © 2003
7
20 Matches strategy
What is the optimal strategy?
MIT and James Orlin © 2003
8
An Example for Dynamic Programming: 20 matches Version 2
Choose a person to go first. Alternate in eliminating 1 or 2 or 6 matches. The person who takes the last match wins.
MIT and James Orlin © 2003
9
20 Matches
MIT and James Orlin © 2003
10
Winning and losing positions (in the game in 20 matches version 2)
Suppose you have k matches left. Identify whether you have a Winning strategy (assuming your opponent plays optimally)or whether you will Lose.
1 2 3 4 5 6 7 8 9 10
W W L
11 12 13 14 15 16 17 18 19 20
MIT and James Orlin © 2003
11
Determining the strategy using DP n = number of matches left f(n) = W if you can force a win at n matches.
f(n) = L otherwise
Determine f(1), f(2), …, f(6)
For n > 6, determine f(n) from previously solved problems.
f(n) = W if f(n-1) = L or f(n-2) = L or f(n-6) = L;
f(n) = L if f(n-1) = W and f(n-2) = W and f(n-6) = W.
MIT and James Orlin © 2003
12
Determining the strategy using DP n = number of matches left (n is the state/stage) f(n) = W if you can force a win at n matches.
f(n) = L otherwise f(n) = optimal value function.
Determine f(1), f(2), …, f(6) (Boundary conditions)
For n > 6, determine f(n) using a dynamic programming recursion
f(n) = W if f(n-1) = L or f(n-2) = L or f(n-6) = L;
f(n) = L if f(n-1) = W and f(n-2) = W and f(n-6) = W.
MIT and James Orlin © 2003
13
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
Running the Recursion
Who sees the pattern determining the losing hands?
MIT and James Orlin © 2003
14
1 2 3 4 5 6 7
8 9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30 31 32 33 34 35
36 37 38 39 40 41 42
43 44 45 46 47 48 49 50
The same table
MIT and James Orlin © 2003
15
Examples
Suppose there are 30 matches on a table. I begin by picking up 1, 2, or 3 matches. Then my opponent must pick up 1, 2, or 3 matches. We continue in this fashion until the last match is picked up. The player who picks up the last match is the loser. How can I (the first player) be sure of winning the game?
MIT and James Orlin © 2003
16
Examples
Suppose there are 40 matches on a table. I begin by picking up 1, 2, 3, or 4 matches. Then my opponent must pick up 1, 2, 3, or 4 matches. We continue until the last match is picked up. The player who picks up the last match is the loser. Can I be sure of victory? If so, how?
MIT and James Orlin © 2003
17
Examples
I have a 9-oz cup and a 4-oz cup. My mother has ordered me to bring home exactly 6 oz of milk. How can I accomplish this goal?
We have 21 coins and are told that one is heavier than any of the other coins. How many weighings on a balance will it take to find the heaviest coin?
MIT and James Orlin © 2003
18
Finding the shortest path from node s to all other nodes in an acyclic network.
s
MIT and James Orlin © 2003
19
Shortest Paths
Find the shortest path from node 1 to each other node, in order.–What is the shortest path from node 1 to
node 2, 3, 4, … ?– Determine these values one at a time.
Shortest path
MIT and James Orlin © 2003
20
Finding the shortest path to node t from all other nodes in an acyclic network.
t
MIT and James Orlin © 2003
21
Shortest Paths
Find the shortest path to node 19 from each node in order.–What is the shortest path to node 19
from node 18, 17, 16….?– Determine these values one at a time.
Shortest path
MIT and James Orlin © 2003
22
Dynamic Programming in General Break up a complex decision problem into a
sequence of smaller decision subproblems.
Stages: one solves decision problems one “stage” at a time. Stages often can be thought of as “time” in most instances. – Not every DP has stages– The previous shortest path problem has 7
stages– The match problem does not have stages.
MIT and James Orlin © 2003
23
Dynamic Programming in General States: The smaller decision subproblems are
often expressed in a very compact manner. The description of the smaller subproblems is often referred to as the state. – match problem: “state” is the number of
matches left– shortest path problem: “state” is the node
At each state-stage, there are one or more decisions. The DP recursion determines the best decision. – match problem: how many matches to remove– shortest path example: which arc to choose
MIT and James Orlin © 2003
24
Principle of Optimality
Any optimal policy has the property that whatever the current state and decision, the remaining decisions must constitute an optimal policy with regard to the state resulting from the current decision. (As formulated by Richard Bellman.)
Whatever node j is selected, the remaining path from j to the end is the shortest path starting at j.
MIT and James Orlin © 2003
25
Forwards recursion
Start with stage 1 and determine values for larger stages using recursion
In the match examples, determine f(n) from f(k) for k < n.
MIT and James Orlin © 2003
26
Backwards Recursion
Boundary conditions: optimum values for states at the last stage
Determine optimum values for other stages using backwards recursion
MIT and James Orlin © 2003
27
Summary for Dynamic Programming
Recursion Principle of optimality states and stages and decisions useful in a wide range of situations– games – shortest paths–more will be presented in next lecture.