© 2005 Warren B. Powell Slide 1
The Optimizing-Simulator: Merging Optimization and Simulation Using
Approximate Dynamic Programming
Winter Simulation Conference
December 5, 2005
Warren Powell
CASTLE Laboratory
Princeton University
http://www.castlelab.princeton.edu
© 2004 Warren B. Powell, Princeton University
© 2005 Warren B. Powell Slide 2
Yellow Freight System
© 2005 Warren B. Powell Slide 3
Yellow Freight System
© 2005 Warren B. Powell Slide 4
The fractional jet ownership industry
© 2005 Warren B. Powell Slide 5
NetJets Inc.
© 2005 Warren B. Powell Slide 6
© 2005 Warren B. Powell Slide 7
© 2005 Warren B. Powell Slide 8
Schneider National
© 2005 Warren B. Powell Slide 9
Schneider National
© 2005 Warren B. Powell Slide 10
© 2005 Warren B. Powell Slide 11
Air Mobility Command
[Figure: Air Mobility Command, surrounded by Fuel, Cargo Handling, Ramp Space, Maintenance, Cargo Holding]
© 2005 Warren B. Powell Slide 13
The challenges
Needs for simulation:
» Are we using the right mix of people and equipment?
» What is the effect of new policies regarding the management of people and equipment?
» What is the marginal contribution from serving customers?
» What is the effect of last-minute demands on the system?
© 2005 Warren B. Powell Slide 14
The challenges
We need simulation technology that accomplishes the following:
» Decisions have to handle high-dimensional states and actions (assigning different types of resources to different types of tasks).
» The simulator has to produce decisions that are “good” not just at a point in time, but over time (decisions have to think about the future).
» Performance statistics must match historical performance.
© 2005 Warren B. Powell Slide 15
Outline
Modeling and problem representation
© 2005 Warren B. Powell Slide 16
Modeling
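Resources can have a number of attributes:
$$a = \begin{bmatrix} \text{Location} \\ \text{Equipment type} \end{bmatrix}$$
$$a = \begin{bmatrix} \text{Location} \\ \text{ETA} \\ \text{Equipment type} \\ \text{Train priority} \\ \text{Pool} \\ \text{Due for maint} \\ \text{Home shop} \end{bmatrix}$$
$$a = \begin{bmatrix} \text{Location} \\ \text{ETA} \\ \text{A/C type} \\ \text{Fuel level} \\ \text{Home shop} \\ \text{Crew} \\ \text{Eqpt1} \\ \vdots \\ \text{Eqpt100} \end{bmatrix}$$
$$a = \begin{bmatrix} \text{Location} \\ \text{ETA} \\ \text{Bus. segment} \\ \text{Single/team} \\ \text{Domicile} \\ \text{Drive hours} \\ \text{Duty hours} \\ \text{8 day history} \\ \text{Days from home} \end{bmatrix}$$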
© 2005 Warren B. Powell Slide 17
Modeling
The attribute vector
$$a_t = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}$$
The resource state variable
$$R_{ta} = \text{Number of resources with attribute } a \text{ at time } t$$
$$R_t = (R_{ta})_{a \in \mathcal{A}} = \text{Resource state variable}$$
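As a minimal sketch of the resource state variable, the snippet below counts how many resources share each attribute vector. The fleet, the two-element attribute tuple (location, equipment type), and the function name `resource_state` are illustrative assumptions, not part of the talk.

```python
from collections import Counter

# Hypothetical fleet: each resource is described by an attribute vector a,
# here a tuple (location, equipment_type). Values are illustrative only.
resources = [
    ("NJ", "C-5"),
    ("NJ", "C-5"),
    ("CA", "C-17"),
    ("DE", "C-17"),
    ("CA", "C-17"),
]

def resource_state(resources):
    """Return R_t as a mapping a -> R_ta, the number of resources with attribute a."""
    return Counter(resources)

R_t = resource_state(resources)
for a, count in sorted(R_t.items()):
    print(a, count)
```

Storing only the attributes that actually occur keeps the vector sparse, which matters once the attribute space becomes large.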
© 2005 Warren B. Powell Slide 18
Modeling
Decision set function:
$$\mathcal{D}(a) = \text{Set of decision types we can use to act on a resource with attribute } a$$
[Figure: decision $d$ transforms the attribute vector $a_t = (a_1, a_2, \ldots, a_n)^T$ into the modified resource $a_{t+1}$]
© 2005 Warren B. Powell Slide 19
Modeling
The “modify” function
$$M(a_{t-1}, W_t, d) = (a_t, c_t)$$
The information process
$$W_t = \text{Vector of information arriving during time interval } t$$
Ex: new customer requests, equipment failures, weather delays.
© 2005 Warren B. Powell Slide 20
Modeling
Decisions
$$x_{tad} = \text{Number of resources with attribute } a \text{ that we can act on with decision } d \text{ using the information available at time } t$$
$$x_t = (x_{tad})_{a \in \mathcal{A},\, d \in \mathcal{D}}$$
The decision function
$$x_t = X^\pi_t(I_t)$$
$$\pi \in \Pi = \text{Set of decision functions (policies)}$$
$$I_t = \text{Information available for making a decision}$$
© 2005 Warren B. Powell Slide 21
Approximate dynamic programming
Information and decision processes:
[Timeline: decisions determined by a policy, $x_0, x_1, \ldots, x_6$, alternate over time with the exogenous information process $W_1, W_2, \ldots, W_6$]
© 2005 Warren B. Powell Slide 22
Modeling
System dynamics (classical view):
Given a decision function (policy) $X^\pi_t(S_t)$ and exogenous information process $W_t$, we can model the evolution of the state of our system using:
$$S_{t+1} = f(S_t, X^\pi_t(S_t), W_{t+1})$$
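The transition equation can be sketched as a short simulation loop. This is a toy illustration only: the state (a count of idle vehicles), the dispatch policy, and the arrival distribution are all assumptions made up for the example.

```python
import random

def simulate(S0, policy, transition, sample_W, T, seed=0):
    """Step S_{t+1} = f(S_t, X^pi(S_t), W_{t+1}) forward for T periods."""
    rng = random.Random(seed)
    S, path = S0, [S0]
    for t in range(T):
        x = policy(S)             # decision from the policy X^pi
        W = sample_W(rng)         # exogenous information W_{t+1}
        S = transition(S, x, W)   # system dynamics f
        path.append(S)
    return path

# Toy instance (hypothetical): S is the number of idle vehicles, the policy
# dispatches at most 2 per period, and W is the number of vehicles returning.
policy = lambda S: min(S, 2)
transition = lambda S, x, W: S - x + W
sample_W = lambda rng: rng.randint(0, 3)

path = simulate(S0=5, policy=policy, transition=transition, sample_W=sample_W, T=10)
print(path)
```

Keeping the policy, the transition function, and the information process as separate arguments mirrors the split between what the user provides and the decision function being designed.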
© 2005 Warren B. Powell Slide 23
Modeling
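[Figure: the state $S_t$, the decision $X^\pi_t(S_t)$, the post-decision state $S^x_t$, the new information $W_{t+1}$, and the next state $S_{t+1}$]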
© 2005 Warren B. Powell Slide 24
Modeling
User provides: Model of physical system
Data:
» Resource vector $R_t$
» Information process $W_t$
Software:
» Decision set function $\mathcal{D}(a)$
» Modify function $M(a_t, d_t, W_{t+1})$
Our research goal: The decision function
Decision functions $X^\pi_t(I_t)$
© 2005 Warren B. Powell Slide 25
Outline
The optimizing simulator
© 2005 Warren B. Powell Slide 26
Optimizing over time
Resources
© 2005 Warren B. Powell Slide 27
Optimizing over time
Tasks
© 2005 Warren B. Powell Slide 28
Optimizing over time
t t+1 t+2
Optimizing at a point in time
Optimizing over time
© 2005 Warren B. Powell Slide 29
The optimizing simulator
[Flowchart: set t = 0; while t < T, make decision at time t, update system state at t+1, then set t = t + 1]
Classical simulation:
» Simple
» Extremely flexible
But . . .
» Limited solution quality
» Often requires extensive user-defined tables to guide the simulation.
» Can respond to changes in inputs in an unpredictable way.
© 2005 Warren B. Powell Slide 30
The optimizing simulator
Optimization
» Intelligent
» Responds naturally to new datasets.
But . . .
» Struggles to handle complexity of real operations.
» Does not model evolution of information.
» Might be “too intelligent”?
$$\min \sum_t c_t x_t$$
$$A_t x_t - B_{t-1} x_{t-1} = b_t$$
$$D_t x_t \le u_t$$
$$x_t \ge 0$$
© 2005 Warren B. Powell Slide 31
Multicommodity flow
[Figure: network with axes Time, Space, and Type]
© 2005 Warren B. Powell Slide 32
The optimizing simulator
Simulation
» Strengths
• Extremely flexible
• High level of detail
» Weaknesses
• Low level of “intelligence”
• Lower solution quality
• May have difficulty “behaving” properly with new scenarios.
• Difficulty adapting to random outcomes.
Optimization
» Strengths
• High level of intelligence
• System behaves “optimally” even with new datasets
• Reduces data set preparation.
» Weaknesses
• Strict rules on problem structure
• Low level of detail
• Inflexible!
To simulate or to optimize . . .
. . . Why are we asking this question?
© 2005 Warren B. Powell Slide 33
Decision-making technologies
Cost-based
» The standard assumption of math programming.
» Easily handles tradeoffs.
» Easily handles high dimensions.
» Can be difficult to tune to get the right behavior.
Rule-based
» Typically associated with AI.
» Very flexible.
» Difficult coding tradeoffs.
» Struggles with higher dimensional states.
© 2005 Warren B. Powell Slide 34
The four information classes
Knowledge $K_t$
Forecasts of exogenous events $\Omega_t$
Forecasts of impacts on others $V_t$
Expert knowledge $\rho$
© 2005 Warren B. Powell Slide 35
The four information classes
Knowledge $K_t$
© 2005 Warren B. Powell Slide 36
Knowledge
Rule-based: one aircraft and one requirement
California
Germany
New Jersey
Colorado
Taiwan
England
New Jersey
Aircraft Requirements
© 2005 Warren B. Powell Slide 37
Knowledge
Cost based: one requirement and multiple aircraft
California
Germany
New Jersey
Colorado
Taiwan
England
New Jersey
Aircraft Requirements
© 2005 Warren B. Powell Slide 38
Knowledge
Costs allow you to make tradeoffs:
California
Germany
Issue                             “cost”/“bonus”
Repositioning cost                -$17,000
Appropriate a/c type              +5000
Utilization                       +8000
Requires modifications            -3000
Special maintenance at airbase    -1000
Total “cost”                      -8000
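The total on this slide is simply the sum of the individual costs and bonuses. A small sketch of that arithmetic, using the figures from the slide (the dictionary name is an assumption for illustration):

```python
# Cost and bonus entries as read from the slide (dollars).
adjustments = {
    "Repositioning cost": -17000,
    "Appropriate a/c type": 5000,
    "Utilization": 8000,
    "Requires modifications": -3000,
    "Special maintenance at airbase": -1000,
}

total = sum(adjustments.values())
print(total)  # -8000
```

Because every issue is expressed in the same units, a cost-based policy can trade one off against another with a single comparison of totals.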
© 2005 Warren B. Powell Slide 39
Knowledge
Cost based: multiple requirements and aircraft
California
Germany
New Jersey
Colorado
Taiwan
England
New Jersey
Aircraft Requirements
© 2005 Warren B. Powell Slide 40
The information classes
Knowledge $K_t$
Forecasts of exogenous events $\Omega_t$
© 2005 Warren B. Powell Slide 41
Forecasts of exogenous information
California
Germany
New Jersey
Colorado
Taiwan
England
New Jersey
$X^\pi(I)$ involves solving a linear program/network model.
Aircraft Requirements
Resources that are known now…
© 2005 Warren B. Powell Slide 42
Forecasts of exogenous information
Aircraft Requirements
California
Germany
New Jersey
Colorado
Taiwan
England
New Jersey
$X^\pi(I)$ involves solving a linear program/network model.
California
Germany
New Jersey
Colorado
Taiwan
England
New Jersey
Aircraft Requirements
Resources that are known now…
© 2005 Warren B. Powell Slide 43
Forecasts of exogenous information
Aircraft Requirements
California
Germany
New Jersey
Colorado
Taiwan
England
New Jersey
[Figure: the assignment network is shown twice, once for the resources known now, $R_t$, and once for the resources forecasted for the future, $(R_{t'})_{t' > t}$]
… and are forecasted for the future.
© 2005 Warren B. Powell Slide 44
The information classes
Knowledge $K_t$
Forecasts of exogenous events $\Omega_t$
Forecasts of impacts on others $V_t$
© 2005 Warren B. Powell Slide 45
Approximate dynamic programming
Decisions now may need to know the impact on future decisions:
» What is the cost of assigning this type of aircraft to move a requirement?
» What is the value of having a certain number of aircraft in a region?
» Should this requirement be satisfied now? Later? Never?

For these questions, it is important that we optimize over time.
[Figure: at time $t$, a resource with attribute $a$ can be moved to attributes $a'$ or $a''$ with downstream values $V(a')$ and $V(a'')$; with two resources $a_1$ and $a_2$, the downstream values are $V(a_1')$ and $V(a_2')$]
© 2005 Warren B. Powell Slide 48
The optimization challenge
?
© 2005 Warren B. Powell Slide 49
State variables
Systems evolve through a cycle of exogenous and endogenous information
[Timeline: decisions $x_0, x_1, \ldots, x_6$ alternate over time with the exogenous information $\omega = (\hat{R}_1, \hat{R}_2, \ldots, \hat{R}_6)$]
© 2005 Warren B. Powell Slide 50
State variables
Systems evolve through a cycle of exogenous and endogenous information
[Timeline: decisions $x_0, x_1, \ldots, x_6$, exogenous information $\hat{R}_1, \hat{R}_2, \ldots, \hat{R}_6$, and resource states $R_0, R_1, \ldots, R_6$]
© 2005 Warren B. Powell Slide 51
Approximate dynamic programming
Using this state variable, we obtain the optimality equations:
$$V_t(R_t) = \max_{x \in \mathcal{X}} \left( C_t(R_t, x_t) + E\{ V_{t+1}(R_{t+1}) \mid R_t \} \right)$$
Problem: Curse of dimensionality
Three curses
» State space
» Outcome space
» Action space (feasible region)
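For intuition, the optimality equation can be solved exactly by backward recursion when all three spaces are tiny. The sketch below does this for a made-up problem: a scalar resource level, a concave contribution function, and a two-point arrival distribution are all illustrative assumptions. The three nested loops (states, actions, outcomes) are exactly what stops scaling when each becomes a high-dimensional vector.

```python
def backward_dp(T, R_max, contribution, outcomes):
    """Solve V_t(R) = max_x { C(R, x) + E[V_{t+1}(R_{t+1}) | R] } by backward recursion."""
    V = [[0.0] * (R_max + 1) for _ in range(T + 1)]  # terminal condition V_T = 0
    policy = [[0] * (R_max + 1) for _ in range(T)]
    for t in reversed(range(T)):
        for R in range(R_max + 1):                   # loop over the state space
            best_x, best_val = 0, float("-inf")
            for x in range(R + 1):                   # loop over the action space
                expected = sum(                      # loop over the outcome space
                    p * V[t + 1][min(R - x + w, R_max)] for w, p in outcomes
                )
                val = contribution(R, x) + expected
                if val > best_val:
                    best_x, best_val = x, val
            V[t][R], policy[t][R] = best_val, best_x
    return V, policy

# Hypothetical data: dispatching x of R resources earns 10x - x^2, and
# 0 or 1 resources arrive each period with equal probability.
V, policy = backward_dp(
    T=5,
    R_max=3,
    contribution=lambda R, x: 10 * x - x * x,
    outcomes=[(0, 0.5), (1, 0.5)],
)
print(V[0])
```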
© 2005 Warren B. Powell Slide 52
Approximate dynamic programming
The computational challenge:
$$V_t(R_t) = \max_{x \in \mathcal{X}} \left( C_t(R_t, x_t) + E\{ V_{t+1}(R_{t+1}) \mid R_t \} \right)$$
How do we find $V_{t+1}(R_{t+1})$?
How do we compute the expectation?
How do we find the optimal solution?
© 2005 Warren B. Powell Slide 53
Approximate dynamic programming
A possible approximation strategy:
We start with:
$$V_t(R_t) = \max_{x_t} \; C_t(R_t, x_t) + E\{ V_{t+1}(R_{t+1}) \mid R_t \}$$
Can’t compute this (the expectation)!!!
We solve this for a sample realization:
$$V_t(R_t, \omega) = \max_{x_t} \; C_t(R_t, x_t) + V_{t+1}(R_{t+1}(\omega))$$
Don’t know what this ($V_{t+1}$) is!
Now substitute in function approximations:
$$V_t(R_t, \omega) = \max_{x_t} \; C_t(R_t, x_t) + \bar{V}_{t+1}(R_{t+1}(\omega))$$
Need to approximate V
© 2005 Warren B. Powell Slide 54
Approximate dynamic programming
One big problem….
$$V_t(R_t, \omega) = \max_{x_t} \; C_t(R_t, x_t) + \bar{V}_{t+1}(R_{t+1}(\omega))$$
Seeing $R_{t+1}$ is cheating!
© 2005 Warren B. Powell Slide 55
Approximate dynamic programming
Alternative: Change the definition of the state variable:
[Timeline: decisions $x_0, x_1, \ldots, x_6$ and exogenous information $\hat{R}_1, \hat{R}_2, \ldots, \hat{R}_6$; the resource states $R_0, R_1, \ldots, R_6$ are repositioned along the timeline over successive animation steps]
© 2005 Warren B. Powell Slide 56
Approximate dynamic programming
Now our optimality equation looks like:
$$V_{t-1}(R^x_{t-1}) = E\left\{ \max_{x \in \mathcal{X}_t} \left( C_t(R_t, x_t) + V_t(R^x_t(\omega_t, x_t)) \right) \,\middle|\, R^x_{t-1} \right\}$$
Expectation outside of the “max” operator.
Post-decision state variable: $R^x_t$
We drop the expectation and solve the conditional problem:
$$V_{t-1}(R^x_{t-1}, \hat{R}_t(\omega)) = \max_{x \in \mathcal{X}_t(\omega)} \left( C_t(R_t(\omega), x_t(\omega)) + V_t(R^x_t(\omega, x_t(\omega))) \right)$$
Finally, we substitute in our approximation:
$$V_{t-1}(R^x_{t-1}, \hat{R}_t(\omega)) = \max_{x \in \mathcal{X}_t(\omega)} \left( C_t(R_t(\omega), x_t(\omega)) + \bar{V}_t(R^x_t(\omega, x_t(\omega))) \right)$$
“Convenient” value function approximation.
© 2005 Warren B. Powell Slide 57
Approximate dynamic programming
Approximating the value function:
» We choose approximations of the form:
Linear (in the resource state):
$$\bar{V}_t(R_t) = \sum_{a \in \mathcal{A}} \bar{v}_{ta} \cdot R_{ta}$$
Best when assets are complex, which means that $R_{ta}$ is small (typically 0 or 1).
Piecewise linear, separable:
$$\bar{V}_t(R_t) = \sum_{a \in \mathcal{A}} \bar{V}_{ta}(R_{ta})$$
Best when assets are simple, which means that $R_{ta}$ may be larger.
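The two forms can be compared on a small example. In this sketch the attribute is just a location, and the slopes are hypothetical numbers; representing each piecewise-linear function by its list of marginal values (one per additional resource) is an assumed data layout, not something specified in the talk.

```python
def linear_value(v_bar, R):
    """Linear approximation: sum over attributes a of v_bar[a] * R[a]."""
    return sum(v_bar[a] * R.get(a, 0) for a in v_bar)

def piecewise_linear_value(slopes, R):
    """Separable piecewise-linear approximation.

    slopes[a][k] is the marginal value of the (k+1)-th resource with
    attribute a, so V_ta(R_ta) is the sum of the first R_ta slopes.
    """
    return sum(sum(slopes[a][: R.get(a, 0)]) for a in slopes)

R = {"NJ": 3, "CA": 1}                       # resource state R_ta
v_bar = {"NJ": 5.0, "CA": 3.0}               # one slope per attribute
slopes = {"NJ": [6.0, 4.0, 1.0], "CA": [3.0, 2.0, 0.5]}  # declining marginal values

print(linear_value(v_bar, R))            # 18.0
print(piecewise_linear_value(slopes, R)) # 14.0
```

The linear form values the third New Jersey resource as highly as the first, while the piecewise-linear form captures declining marginal value, which is why it suits cases where $R_{ta}$ can be larger than 1.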
© 2005 Warren B. Powell Slide 58
Approximate dynamic programming
A myopic decision rule (policy):
$$x^n_t = \arg\max_{x \in \mathcal{X}_t(\omega)} C_t(R_t(\omega), x_t(\omega))$$
A decision rule that looks into the future:
$$x^n_t = \arg\max_{x \in \mathcal{X}_t(\omega)} \left( C_t(R_t(\omega), x_t(\omega)) + \bar{V}_t(R^x_t(\omega, x_t(\omega))) \right)$$
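The difference between the two rules shows up as soon as a high immediate contribution leads to a low-value downstream state. The sketch below enumerates a handful of candidate decisions for a single resource; the decision names, contributions, and value estimates are hypothetical, and a real implementation would solve a linear program over all resources instead of enumerating.

```python
def myopic_decision(decisions):
    """Pick the decision with the largest immediate contribution."""
    return max(decisions, key=lambda d: d["contribution"])["name"]

def lookahead_decision(decisions, v_bar):
    """Pick the decision maximizing contribution plus approximate downstream value."""
    return max(
        decisions,
        key=lambda d: d["contribution"] + v_bar[d["next_attribute"]],
    )["name"]

# Hypothetical candidate decisions for one aircraft.
decisions = [
    {"name": "move load to Germany", "contribution": 900.0, "next_attribute": "Germany"},
    {"name": "move load to Colorado", "contribution": 700.0, "next_attribute": "Colorado"},
    {"name": "hold in New Jersey", "contribution": 0.0, "next_attribute": "New Jersey"},
]
# Hypothetical value estimates for ending up at each location.
v_bar = {"Germany": 100.0, "Colorado": 500.0, "New Jersey": 300.0}

print(myopic_decision(decisions))            # move load to Germany
print(lookahead_decision(decisions, v_bar))  # move load to Colorado
```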
© 2005 Warren B. Powell Slide 59
Approximate dynamic programming
Simulating a myopic policy:
t    t+1    t+2
© 2005 Warren B. Powell Slide 60
Approximate dynamic programming
A myopic decision rule (policy):
$$x^n_t = \arg\max_{x \in \mathcal{X}_t(\omega)} C_t(R_t(\omega), x_t(\omega))$$
A decision rule that looks into the future:
$$x^n_t = \arg\max_{x \in \mathcal{X}_t(\omega)} \left( C_t(R_t(\omega), x_t(\omega)) + \bar{V}_t(R^x_t(\omega, x_t(\omega))) \right)$$
© 2005 Warren B. Powell Slide 61
Approximate dynamic programming
[Figure: resources with attributes $a_1$ and $a_2$ and the downstream values $V(a_1')$ and $V(a_2')$]
© 2005 Warren B. Powell Slide 62
Option 1: Send directly to customers
Option 2: Send to regional depots
Option 3: Send to classification yards
Classification yards
© 2005 Warren B. Powell Slide 64
Approximate dynamic programming
Two-stage resource allocation under uncertainty
© 2005 Warren B. Powell Slide 65
Approximate dynamic programming
We obtain piecewise linear recourse functions for each region.
© 2005 Warren B. Powell Slide 66
Approximate dynamic programming
The function is piecewise linear on the integers.
We approximate the value of cars in the future using a separable approximation.
[Figure: profits vs. number of vehicles at a location (0 to 5)]
© 2005 Warren B. Powell Slide 67
Approximate dynamic programming
To capture nonlinear behavior:
Each link captures the marginal reward of an additional car.
© 2005 Warren B. Powell Slide 68
Approximate dynamic programming
© 2005 Warren B. Powell Slide 69
Approximate dynamic programming
© 2005 Warren B. Powell Slide 70
Approximate dynamic programming
[Figure: resource flows $R^n_1, R^n_2, \ldots, R^n_5$ entering the network]
© 2005 Warren B. Powell Slide 71
Approximate dynamic programming
We estimate the functions by sampling from our distributions.
[Figure: resource flows $R^n_1, \ldots, R^n_5$, sampled demands $D^n_1(\omega), D^n_2(\omega), D^n_3(\omega), \ldots, D^n_C(\omega)$, and the resulting marginal values $v^n_1(\omega), v^n_2(\omega), \ldots, v^n_5(\omega)$]
© 2005 Warren B. Powell Slide 72
Approximate dynamic programming
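The time t subproblem:
[Figure: resources $R_{1t}$, $R_{2t}$, $R_{3t}$ at time $t$ feed the subproblem $\bar{V}^n_{ta}(R_{1t}, R_{2t}, R_{3t})$, which connects to downstream nodes $(i-1, t+3)$, $(i, t+1)$, and $(i+1, t+5)$]
Gradients:
$$(\hat{v}^{n-}_{1t}, \hat{v}^{n+}_{1t})$$
$$(\hat{v}^{n-}_{2t}, \hat{v}^{n+}_{2t})$$
$$(\hat{v}^{n-}_{3t}, \hat{v}^{n+}_{3t})$$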
© 2005 Warren B. Powell Slide 73
Approximate dynamic programming
Left and right gradients are found by solving flow augmenting path problems.
[Figure: resource $R_{3t}$ at node $i$, time $t$, in the subproblem $\bar{V}^n_{ta}(R_{1t}, R_{2t}, R_{3t})$, with downstream node $(i-1, t+3)$ and gradient $\hat{v}^{n+}_{3t}$]
The right derivative (the value of one more unit of that resource) is a flow augmenting path from that node to the supersink.
© 2005 Warren B. Powell Slide 74
Approximate dynamic programming
Left and right derivatives are used to build up a nonlinear approximation of the subproblem.
[Figure: the approximation $\bar{V}^k_{1t}(R_{1t})$ plotted against $R_{1t}$, evaluated at the point $R^k_{1t}$]
© 2005 Warren B. Powell Slide 75
Approximate dynamic programming
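Left and right derivatives are used to build up a nonlinear approximation of the subproblem.
[Figure: the approximation $\bar{V}^k_{1t}(R_{1t})$ plotted against $R_{1t}$, with the left derivative $v^{k-}_t$ and the right derivative $v^{k+}_t$ at the point $R^k_{1t}$]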
© 2005 Warren B. Powell Slide 76
Approximate dynamic programming
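Each iteration adds new segments, as well as refining old ones.
[Figure: the approximation $\bar{V}^k_{1t}(R_{1t})$ plotted against $R_{1t}$, with new derivatives $v^{(k+1)-}_t$ and $v^{(k+1)+}_t$ at the point $R^{k+1}_{1t}$]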
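One simplified way to refine a piecewise-linear approximation from sampled derivatives is sketched below. It is an assumption-laden illustration, not the method from the talk: it uses a single sampled marginal value per iteration (instead of separate left and right derivatives), smooths it into the existing slope with a fixed stepsize, and then restores concavity by leveling neighboring slopes. The function name and all numbers are hypothetical.

```python
def update_slopes(slopes, r, v_hat, stepsize):
    """Smooth a sampled marginal value into slopes[r], then restore concavity.

    slopes[k] approximates the value of the (k+1)-th resource; a concave
    function requires these slopes to be nonincreasing in k.
    """
    new = list(slopes)
    new[r] = (1 - stepsize) * slopes[r] + stepsize * v_hat
    for i in range(r):                    # slopes to the left must not be smaller
        new[i] = max(new[i], new[r])
    for i in range(r + 1, len(new)):      # slopes to the right must not be larger
        new[i] = min(new[i], new[r])
    return new

slopes = [0.0] * 5
# Hypothetical observations: (resource level visited, sampled marginal value).
for r, v_hat in [(2, 4.0), (0, 10.0), (4, 1.0), (1, 3.0)]:
    slopes = update_slopes(slopes, r, v_hat, stepsize=0.5)

print(slopes)  # [6.0, 2.5, 2.0, 0.5, 0.5]
```

Keeping the slopes monotone after every update means the approximation stays concave, so the decision problem that uses it remains easy to solve.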
© 2005 Warren B. Powell Slide 77
Approximate dynamic programming
[Figure: the exact function $f(s) = \ln(1+s)$ and its approximations after 1, 2, 5, 10, 15, and 20 iterations; x-axis: variable value $s$ (number of resources), 0 to 10; y-axis: functional value (approximate value function), 0.0 to 2.5]
© 2005 Warren B. Powell Slide 78
Simulating a myopic policy
Approximate dynamic programming
t
© 2005 Warren B. Powell Slide 79
Simulating a myopic policy
Approximate dynamic programming
© 2005 Warren B. Powell Slide 80
Using value functions to anticipate the future
Approximate dynamic programming
t
“Here and now” Downstream impacts
© 2005 Warren B. Powell Slide 81
Approximate dynamic programming
Using value functions to anticipate the future
© 2005 Warren B. Powell Slide 82
Approximate dynamic programming
Using value functions to anticipate the future
© 2005 Warren B. Powell Slide 83
Approximate dynamic programming
Using value functions to anticipate the future
© 2005 Warren B. Powell Slide 84
© 2005 Warren B. Powell Slide 85
© 2005 Warren B. Powell Slide 86
© 2005 Warren B. Powell Slide 87
© 2005 Warren B. Powell Slide 88
Approximate dynamic programming
Approximate DP vs. LP
[Figure: % of objective upper bound (80 to 100) vs. iteration no. (1 to 100) for Agg_PWLinear_1, Agg_PWLinear_2, Agg_PWLinear_3, DisAgg_Linear, DisAgg_PWLinear, and Decomp_Location, with “The mathematical optimum” marked]
© 2005 Warren B. Powell Slide 89
Downloadable at www.castlelab.princeton.edu
© 2005 Warren B. Powell Slide 90
The information classes
Knowledge $K_t$
Forecasts of exogenous events $\Omega_t$
Forecasts of impacts on others $V_t$
Expert knowledge $\rho$
© 2005 Warren B. Powell Slide 91
Low dimensional patterns
Old modeling approach: Engineering costs
$$x^* = \arg\min_x \; cx \quad \text{subject to: } Ax = b, \; x \ge 0$$
Objectives
“Physics”
“Behavior”
© 2005 Warren B. Powell Slide 92
Flows from history
© 2005 Warren B. Powell Slide 93
Flows from history
Flows from the model
© 2005 Warren B. Powell Slide 94
Low dimensional patterns
Bottom up/top down modeling:
Patterns: Specify the behaviors you want at a general level.
Engineering: Specify costs, driver availability, work rules, routing preferences, load avail.
© 2005 Warren B. Powell Slide 95
Low dimensional patterns
Pattern matching
$$x^* = \arg\min_x \; cx + \theta H(x, \rho)$$
Cost function
“Behavior”
The “happiness” function measures the degree to which model behavior agrees with a knowledgeable expert:
$$H(x, \rho) = \| G(x) - \rho \|, \text{ where } G(x) \text{ is an aggregation function}$$
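A small sketch of how the penalty term shifts the chosen solution. Everything specific here is an assumption for illustration: the two candidate flow vectors, the per-load costs, the aggregation (share of long-haul loads moved by teams), the Euclidean norm, and the target pattern of 0.6. A real model would minimize over all feasible flows rather than compare two candidates.

```python
import math

def happiness_penalty(x, rho, aggregate):
    """H(x, rho) = ||G(x) - rho||, using the Euclidean norm (an assumed choice)."""
    g = aggregate(x)
    return math.sqrt(sum((g[k] - rho[k]) ** 2 for k in rho))

def objective(x, costs, rho, aggregate, theta):
    """c x + theta * H(x, rho), the quantity minimized over x."""
    cx = sum(costs[d] * flow for d, flow in x.items())
    return cx + theta * happiness_penalty(x, rho, aggregate)

def aggregate(x):
    """G(x): share of long-haul loads moved by team drivers (hypothetical pattern)."""
    long_total = sum(f for (_, haul), f in x.items() if haul == "long")
    return {"team_long_share": x[("team", "long")] / long_total}

costs = {("team", "long"): 120, ("team", "short"): 60,
         ("solo", "long"): 100, ("solo", "short"): 50}
rho = {"team_long_share": 0.6}   # pattern from history or an expert

x1 = {("team", "long"): 2, ("team", "short"): 2, ("solo", "long"): 4, ("solo", "short"): 2}
x2 = {("team", "long"): 4, ("team", "short"): 0, ("solo", "long"): 2, ("solo", "short"): 4}

def best(theta):
    return min([("x1", x1), ("x2", x2)],
               key=lambda c: objective(c[1], costs, rho, aggregate, theta))[0]

print(best(theta=0))    # x1: cheapest on engineering costs alone
print(best(theta=200))  # x2: closer to the pattern once the penalty is weighted in
```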
© 2005 Warren B. Powell Slide 96
Low dimensional patterns
Patterns and aggregation:
» What we do:
• We define patterns based on an aggregation of the attributes of a single vehicle.
• Patterns indicate the desirability of a single decision.
» Patterns can be expressed at different levels of aggregation, simultaneously.
• Don’t send C-5’s into Saudi Arabia
• Don’t send C-5’s needing maintenance into Saudi Arabia
• Don’t send C-5’s needing maintenance loaded with freight to southeast Asia into Saudi Arabia.
» Patterns are not hard rules: they express desirable or undesirable patterns of behavior.
© 2005 Warren B. Powell Slide 97
Flows from history
Flows from the model
© 2005 Warren B. Powell Slide 98
Flows from history
Flows from the model
© 2005 Warren B. Powell Slide 99
Low dimensional patterns
Length of haul calibration: teams
[Figure: length of haul (600 to 850) vs. iteration (1 to 10); series: Min, Solo w/ pattern, Solo w/o pattern, Max; curves labeled “Without pattern” and “With pattern”]
© 2005 Warren B. Powell Slide 100
Low dimensional patterns
Patterns can come from history:
© 2005 Warren B. Powell Slide 101
Low dimensional patterns
… or an expert:
© 2005 Warren B. Powell Slide 102
The information classes
Knowledge $K_t$
Forecasts of exogenous events $\Omega_t$
Forecasts of impacts on others $V_t$
Expert knowledge $\rho$
© 2005 Warren B. Powell Slide 103
The military airlift problem
© 2005 Warren B. Powell Slide 104
Policy, decision function, and information classes, in order of increasing information sets:
(RB:R-A) Rule-based: $I_t = R_{tt}$
(MP:R-AL/KNAN) Myopic cost-based, one requirement to a list of aircraft, known now and actionable now: $I_t = (R_{tt}, c_t)$
(MP:R-AL/KNAF) Myopic cost-based, one requirement to a list of aircraft, known now and actionable in the future: $I_t = ((R_{tt'})_{t' \ge t}, c_t)$
(MP:RL-AL/KNAN) Myopic cost-based, a list of requirements to a list of aircraft, known now and actionable now: $I_t = (R_{tt}, c_t)$
(MP:RL-AL/KNAF) Myopic cost-based, a list of requirements to a list of aircraft, known now and actionable in the future: $I_t = ((R_{tt'})_{t' \ge t}, c_t)$
(RH) Rolling horizon: $I_t$ contains the resources $R$ and costs $c$ for each $t'$ in the planning horizon $\mathcal{T}^{ph}$
(ADP) Approximate Dynamic Programming: $I_t$ contains the resources $R$, costs $c$, and value functions $V$ for each $t'$ in the planning horizon $\mathcal{T}^{ph}$
(EK) Expert knowledge: $I_t$ contains the resources $R$, costs $c$, value functions $V$, and expert knowledge $\rho$ for each $t'$ in the planning horizon $\mathcal{T}^{ph}$
Optimizing simulator
© 2005 Warren B. Powell Slide 105
Costs of different policies
[Figure: transportation cost, late delivery cost, repair cost, and total cost in million dollars (0 to 250) by policy, from (RB:R-A) through (MP:RL-AL/KNAN) to (ADP); the policies are labeled Rule based, Choice of aircraft, Actionable now, Actionable future, and Value functions, in order of increasing information sets toward the optimizing simulator]
© 2005 Warren B. Powell Slide 106
Throughput curves of policies
[Figure: millions of pounds (0 to 50) vs. time periods (0 to 210); series: cumulative expected thruput, (RB:R-A), (MP:R-AL/KNAN), (MP:RL-AL/KNAN), (MP:RL-AL/KNAF), (ADP); policies ordered by increasing information sets toward the optimizing simulator]
© 2005 Warren B. Powell Slide 107
Throughput curves of policies
[Figure: millions of pounds (0 to 50) vs. time periods (0 to 210); series: cumulative expected thruput, (RB:R-A), (MP:R-AL/KNAN), (MP:RL-AL/KNAN), (MP:RL-AL/KNAF), (ADP); optimizing simulator highlighted]
© 2005 Warren B. Powell Slide 108
Areas between the cumulative expected thruput curve and different policy thruput curves
[Figure: millions of pound * days (0 to 400) by policy: (RB:R-A), (MP:R-AL/KNAN), (MP:RL-AL/KNAN), (MP:RL-AL/KNAF), (ADP); policies ordered by increasing information sets toward the optimizing simulator]
© 2005 Warren B. Powell Slide 109
Outline
Recent experiments with modeling airlift operations
© 2005 Warren B. Powell Slide 110
Random demands and equipment failures
© 2005 Warren B. Powell Slide 111
Pilots
Aircraft
Customers
© 2005 Warren B. Powell Slide 112
Case study
Questions:
» What is the effect of uncertain demands on a military airlift schedule?
» What is the effect of equipment failures?
» How does adaptive learning change the effect of randomness on the performance of the simulation?
» What is the effect of advance information?
© 2005 Warren B. Powell Slide 113
Iterative learning
[Figure: total contribution (250,000 to 330,000) vs. iteration (1 to 97) for eight scenarios: deterministic or random demand, with or without breaks, with or without learning]
© 2005 Warren B. Powell Slide 114
Deterministic demands, no failures
[Figure: total contribution (250,000 to 330,000) vs. iteration (1 to 97) for the eight scenarios, highlighting the curves with learning and without learning]
© 2005 Warren B. Powell Slide 115
Deterministic demands, with failures
[Figure: total contribution (250,000 to 330,000) vs. iteration (1 to 97) for the eight scenarios, highlighting the curves with learning and without learning]
© 2005 Warren B. Powell Slide 116
Random demands, no failures
[Figure: total contribution (250,000 to 330,000) vs. iteration (1 to 99) for the eight scenarios, highlighting the curves with learning and without learning]
© 2005 Warren B. Powell Slide 117
Random demands, with failures
[Figure: total contribution (250,000 to 330,000) vs. iteration (1 to 97) for the eight scenarios, highlighting the curves with learning and without learning]
© 2005 Warren B. Powell Slide 118
Effect of advance booking
[Figure: effect of advance notice; percent coverage (86 to 100) for prebook 0 hours, prebook 2 hours, and prebook 6 hours, without learning]
© 2005 Warren B. Powell Slide 119
Effect of advance booking
[Figure: effect of advance notice; percent coverage (86 to 100) for prebook 0 hours, prebook 2 hours, and prebook 6 hours, without learning and with learning]
© 2005 Warren B. Powell Slide 120
Midair refueling: initial solution
© 2005 Warren B. Powell Slide 121
Midair refueling: initial solution
Path followed by tanker (moves up and down Atlantic).
© 2005 Warren B. Powell Slide 122
Midair refueling: initial solution
Second plane crashes
First plane refuels
Green: full of fuel
Yellow to red: nearing empty
Black: empty (plane crashes)
© 2005 Warren B. Powell Slide 123
Midair refueling: exploration
Learning over many iterations.
© 2005 Warren B. Powell Slide 124
Planes learn to meet in the middle so both can refuel.
Midair refueling: final solution
© 2005 Warren B. Powell Slide 125
Outline
Calibrating a model for a major truckload motor carrier
© 2005 Warren B. Powell Slide 126
Schneider National
© 2005 Warren B. Powell Slide 127
Schneider National
© 2005 Warren B. Powell Slide 128
© 2005 Warren B. Powell Slide 129
Truckload trucking
Questions for the model:
» What types of drivers should they hire?
• Domicile?
• Single drivers vs. teams?
» What is the value of knowing about customer requests farther in the future?
» What is the profitability of different customers?
» What is the value of increasing terminal capacity?
© 2005 Warren B. Powell Slide 130
Truckload trucking
LOH
[Figure: LOH (0 to 1600) by capacity category (US_SOLO, US_IC, US_TEAM): historical maximum, simulation, historical minimum]
© 2005 Warren B. Powell Slide 131
Truckload trucking
Revenue per WU
[Figure: revenue per WU (0 to 1400) by capacity category (US_SOLO, US_IC, US_TEAM): historical maximum, simulation, historical minimum]
Utilization
[Figure: utilization (0 to 1200) by capacity category (US_SOLO, US_IC, US_TEAM): historical maximum, simulation, historical minimum]
© 2005 Warren B. Powell Slide 132
Truckload trucking
Challenge
» We want to know the marginal value of each type of driver.
» A driver type is determined by:
$$a = \begin{bmatrix} \text{Location} \\ \text{Domicile} \\ \text{Driver type} \end{bmatrix} = \begin{bmatrix} 100 \\ 100 \\ 3 \end{bmatrix}$$
» There are 30,000 driver “types”!!!
» We need to take the “derivative” of our simulation for each type.
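The count of driver types follows from multiplying the number of values each attribute can take. A one-line check of that arithmetic (the dictionary name is an assumption for illustration):

```python
from math import prod

# Number of possible values for each element of the driver attribute vector.
attribute_sizes = {"Location": 100, "Domicile": 100, "Driver type": 3}

num_driver_types = prod(attribute_sizes.values())
print(num_driver_types)  # 30000
```

Adding even one more attribute multiplies this count again, which is why estimating a separate marginal value for each type by brute-force resimulation is impractical.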
© 2005 Warren B. Powell Slide 133
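Multistage problems
[Figure: resource state/type vs. time ($t$, $t+1$, $t+2$); applying $X^\pi_t(R_t)$ at time $t$ yields gradients $\hat{v}^n_{t1}$, $\hat{v}^n_{t2}$, $\hat{v}^n_{t3}$]
© 2005 Warren B. Powell Slide 134
Multistage problems
[Figure: resource state/type vs. time ($t+1$, $t+2$); applying $X^\pi_{t+1}(R_{t+1})$ at time $t+1$ yields gradients $\hat{v}^n_{t+1,1}$, $\hat{v}^n_{t+1,2}$, $\hat{v}^n_{t+1,3}$]
© 2005 Warren B. Powell Slide 135
Multistage problems
[Figure: resource state/type vs. time ($t+2$); applying $X^\pi_{t+2}(R_{t+2})$ at time $t+2$ yields gradients $\hat{v}^n_{t+2,1}$, $\hat{v}^n_{t+2,2}$, $\hat{v}^n_{t+2,3}$]
© 2005 Warren B. Powell Slide 136
Multistage problems
[Figure: resource state/type vs. time ($t$, $t+1$, $t+2$); applying $X^\pi_t(R_t)$ at time $t$ yields gradients $\hat{v}^n_{t1}$, $\hat{v}^n_{t2}$, $\hat{v}^n_{t3}$]
© 2005 Warren B. Powell Slide 137
Multistage problems
[Figure: resource state/type vs. time ($t+1$, $t+2$); applying $X^\pi_{t+1}(R_{t+1})$ at time $t+1$ yields gradients $\hat{v}^n_{t+1,1}$, $\hat{v}^n_{t+1,2}$, $\hat{v}^n_{t+1,3}$]
© 2005 Warren B. Powell Slide 138
Multistage problems
[Figure: resource state/type vs. time ($t+2$); applying $X^\pi_{t+2}(R_{t+2})$ at time $t+2$ yields gradients $\hat{v}^n_{t+2,1}$, $\hat{v}^n_{t+2,2}$, $\hat{v}^n_{t+2,3}$]
© 2005 Warren B. Powell Slide 139
Backward pass
[Figure: the decision functions $X^\pi_t(R_t)$, $X^\pi_{t+1}(R_{t+1})$, and $X^\pi_{t+2}(R_{t+2})$ in sequence]
© 2005 Warren B. Powell Slide 140
Backward pass
[Figure: resource state/type vs. time ($t+2$), showing gradient $\hat{v}^n_{t+2,1}$]
© 2005 Warren B. Powell Slide 141
Backward pass
[Figure: resource state/type vs. time ($t+1$, $t+2$), showing gradient $\hat{v}^n_{t+1,2}$]
© 2005 Warren B. Powell Slide 142
Backward pass
[Figure: resource state/type vs. time ($t$, $t+1$, $t+2$), showing gradient $\hat{v}^n_{t3}$]
© 2005 Warren B. Powell Slide 143
Backward pass
[Figure: resource state/type vs. time ($t$, $t+1$, $t+2$), showing gradient $\hat{v}^n_{t3}$]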
© 2005 Warren B. Powell Slide 144
Driver fleet optimization
[Figure: simulation objective function (1,800,000 to 1,900,000) vs. # of drivers (580 to 650); series s1 to s10, avg, and pred; points labeled Base case, +5, +10, +20, +30, +40, +50, and +60 resources]
© 2005 Warren B. Powell Slide 145
Driver fleet optimization
[Figure: simulation objective function (1,800,000 to 1,900,000) vs. # of drivers (580 to 650); series s1 to s10, avg, and pred]
© 2005 Warren B. Powell Slide 146
Driver fleet optimization
[Figure: simulation objective function (1,800,000 to 1,900,000) vs. # of drivers (580 to 650); series s1 to s10, avg, and pred]
© 2005 Warren B. Powell Slide 147
Driver fleet optimization
[Figure: values from -500 to 3500 for driver types 1 to 20]
© 2005 Warren B. Powell Slide 148
Add drivers
© 2005 Warren B. Powell Slide 149
Reduce drivers
© 2005 Warren B. Powell Slide 150
Questions?