Approximate Dynamic Programming for High-Dimensional Problems in Energy Modeling
Warren B. Powell, CASTLE Laboratory, Princeton University
Ohio State University, October 7, 2009
http://www.castlelab.princeton.edu
© 2009 Warren B. Powell, Princeton University


  • Slide 1
  • Slide 1 Approximate Dynamic Programming for High-Dimensional Problems in Energy Modeling. Ohio State University, October 7, 2009. Warren Powell, CASTLE Laboratory, Princeton University. http://www.castlelab.princeton.edu © 2009 Warren B. Powell, Princeton University
  • Slide 2
  • Slide 2 Goals for an energy policy model Potential questions Policy questions How do we design policies to achieve energy goals (e.g. 20% renewables by 2015) with a given probability? How does the imposition of a carbon tax change the likelihood of meeting this goal? What might happen if ethanol subsidies are reduced or eliminated? What is the impact of a breakthrough in batteries? Energy economics What is the best mix of energy generation technologies? How is the economic value of wind affected by the presence of storage? What is the best mix of storage technologies? How would climate change impact our ability to use hydroelectric reservoirs as a regulating source of power?
  • Slide 3
  • Slide 3 Goals for an energy policy model Designing energy supply and storage portfolios to work with wind: The marginal value of wind and solar farms depends on the ability to work with intermittent supply. The impact of intermittent supply will be mitigated by the use of storage. Different storage technologies (batteries, flywheels, compressed air, pumped hydro) are each designed to serve different types of variations in supply and demand. The need for storage (and the value of wind and solar) depends on the entire portfolio of energy producing technologies.
  • Slide 4
  • Slide 4 Intermittent energy sources Wind speed Solar energy
  • Slide 5
  • Slide 5 Wind 30 days 1 year
  • Slide 6
  • Slide 6 Storage Batteries Ultracapacitors Flywheels Hydroelectric
  • Slide 7
  • Slide 7 Long-term uncertainties, 2010 through 2030: tax policy, batteries, solar panels, carbon capture and sequestration, price of oil, climate change.
  • Slide 8
  • Slide 8 Goals for an energy policy model. Model capabilities we are looking for: Multi-scale: multiple time scales (hourly, daily, seasonal, annual, decade); multiple spatial scales; multiple technologies (different coal-burning technologies, new wind turbines, ...); multiple markets: transportation (commercial, commuter, home activities) and electricity use (heavy industrial, light industrial, business, residential). Stochastic (handles uncertainty): hourly fluctuations in wind, solar and demands; daily variations in prices and rainfall; seasonal changes in weather; yearly changes in supplies, technologies and policies.
  • Slide 9
  • Slide 9 Outline Modeling stochastic resource allocation problems An introduction to ADP ADP and the post-decision state variable A blood management example The SMART energy policy model
  • Slide 10
  • Slide 10 A resource allocation model Attribute vectors:
  • Slide 11
  • Slide 11 A resource allocation model Modeling resources: The attributes of a single resource: The resource state vector: The information process:
  • Slide 12
  • Slide 12 A resource allocation model Modeling demands: The attributes of a single demand: The demand state vector: The information process:
  • Slide 13
  • Slide 13 Energy resource modeling The system state:
  • Slide 14
  • Slide 14 Energy resource modeling The decision variables:
  • Slide 15
  • Slide 15 Energy resource modeling Exogenous information:
  • Slide 16
  • Slide 16 Energy resource modeling The transition function. Known as the: transition function, transfer function, system model, plant model, or simply the model.
  • Slide 17
  • Slide 17 Energy resource modeling Demands Resources
  • Slide 18
  • Slide 18 Energy resource modeling t t+1 t+2
  • Slide 19
  • Slide 19 Energy resource modeling t t+1 t+2 Optimizing at a point in time Optimizing over time
  • Slide 20
  • Slide 20 Energy resource modeling The objective function: how do we find the best policy? Myopic policies, rolling horizon policies, simulation-optimization, dynamic programming. (Equation callouts on the slide: decision function (policy), state variable, contribution function, finding the best policy, expectation over all random outcomes; a reconstruction follows.)
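For reference, the objective function described on this slide has the general form (reconstructed here in standard notation, since the slide shows it only graphically):

    \max_{\pi \in \Pi} \; \mathbb{E} \left\{ \sum_{t=0}^{T} \gamma^{t} \, C_t\big(S_t, X_t^{\pi}(S_t)\big) \right\}

where X_t^{\pi}(S_t) is the decision function (policy), S_t is the state variable, C_t is the contribution function, the expectation is over all random outcomes, and finding the best policy means searching over the class of policies \Pi.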
  • Slide 21
  • Slide 21 Outline Modeling stochastic resource allocation problems An introduction to ADP ADP and the post-decision state variable A blood management example The SMART energy policy model
  • Slide 22
  • Slide 22 Introduction to dynamic programming Bellman's optimality equation: assume this is known; compute this for each state S (see the reconstruction below).
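The equation on this slide is Bellman's optimality equation; a standard statement (a reconstruction, since the slide shows it only as an image) is

    V_t(S_t) = \max_{x_t \in \mathcal{X}_t} \Big( C_t(S_t, x_t) + \gamma \, \mathbb{E}\big[ V_{t+1}(S_{t+1}) \mid S_t, x_t \big] \Big)

where "assume this is known" refers to the downstream value function V_{t+1}, and "compute this for each state S" refers to V_t(S_t) on the left-hand side.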
  • Slide 23
  • Slide 23 Introduction to dynamic programming Bellman's optimality equation: Problem: the curse of dimensionality. Three curses: the state space, the outcome space, and the action space (feasible region).
  • Slide 24
  • Slide 24 Introduction to dynamic programming The computational challenges: How do we find the value function? How do we compute the expectation? How do we find the optimal solution?
  • Slide 25
  • Slide 25 Introduction to ADP Classical ADP Most applications of ADP focus on the challenge of handling multidimensional state variables. Start with Bellman's equation, then replace the value function with some sort of approximation, which may draw from the entire field of statistics/machine learning.
  • Slide 26
  • Slide 26 Introduction to ADP Other statistical methods: Regression trees combine regression with techniques for discrete variables. Data mining is good for categorical data. Neural networks: engineers like these for low-dimensional continuous problems. Kernel/locally polynomial regression approximates portions of the value function locally using simple functions. Dirichlet mixture models aggregate portions of the function and fit approximations around these aggregations.
  • Slide 27
  • Slide 27 Introduction to ADP But this does not solve our problem. Assume we have an approximate value function. We still have to solve a problem of the form sketched below, which means we still face a maximization problem (it might be a linear, nonlinear or integer program) containing an expectation.
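Concretely, with an approximation \bar{V}_{t+1} in place of the true value function, the problem still has the form (a sketch)

    \max_{x_t \in \mathcal{X}_t} \Big( C_t(S_t, x_t) + \gamma \, \mathbb{E}\big[ \bar{V}_{t+1}(S_{t+1}) \mid S_t, x_t \big] \Big)

so both the expectation and the (possibly high-dimensional) maximization remain.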
  • Slide 28
  • Slide 28 Outline Modeling stochastic resource allocation problems An introduction to ADP ADP and the post-decision state variable A blood management example The SMART energy policy model
  • Slide 29
  • Decision tree: do not use the weather report vs. use the weather report. With the report, the forecast is sunny (.6), cloudy (.3) or rain (.1); after each forecast (and without the report) the decision is to schedule or cancel the game. If the game is scheduled, the outcomes rain, clouds and sun pay -$2000, $1000 and $5000; cancelling pays -$200 regardless of the weather. The conditional weather probabilities (rain, clouds, sun) on the outcome branches are (.8, .2, .0), (.1, .5, .4), (.1, .2, .7) and (.2, .3, .5). Legend: decision nodes and outcome nodes; information, action, information, action define the state.
  • Slide 30
  • Slide 30 The post-decision state New concept: The pre-decision state variable: Same as a decision node in a decision tree. The post-decision state variable: Same as an outcome node in a decision tree.
  • Slide 31
  • Slide 31 The post-decision state An inventory problem: Our basic inventory equation: Using pre- and post-decision states:
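The inventory equations on this slide appear only as images; a standard form consistent with the discussion (an assumption, following the notation Powell usually uses) is

    R_{t+1} = \max\{0, \; R_t + x_t - \hat{D}_{t+1}\}                          (basic inventory equation)
    R_t^x = R_t + x_t, \qquad R_{t+1} = \max\{0, \; R_t^x - \hat{D}_{t+1}\}    (pre- and post-decision form)

where R_t is the inventory on hand, x_t the order quantity, \hat{D}_{t+1} the random demand, and R_t^x the post-decision inventory.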
  • Slide 32
  • Slide 32 The post-decision state Pre-decision, state-action, and post-decision Pre-decision state State Action Post-decision state
  • Slide 33
  • Slide 33 The post-decision state Pre-decision: resources and demands
  • Slide 34
  • Slide 34 The post-decision state
  • Slide 35
  • Slide 35 The post-decision state
  • Slide 36
  • Slide 36 The post-decision state
  • Slide 37
  • Slide 37 The post-decision state Classical form of Bellman's equation; Bellman's equations around pre- and post-decision states: the optimization problem (making the decision), which is deterministic, and the expectation problem (incorporating uncertainty). See the reconstruction below.
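Written out in standard notation (a reconstruction; the slide shows these as images):

    Classical form:        V_t(S_t) = \max_{x_t} \big( C_t(S_t, x_t) + \gamma \, \mathbb{E}[ V_{t+1}(S_{t+1}) \mid S_t, x_t ] \big)
    Optimization problem:  V_t(S_t) = \max_{x_t} \big( C_t(S_t, x_t) + \gamma \, V_t^x(S_t^x) \big)            (deterministic)
    Expectation problem:   V_t^x(S_t^x) = \mathbb{E}\big[ V_{t+1}(S_{t+1}) \mid S_t^x \big]                    (incorporating uncertainty)

where S_t^x is the post-decision state reached from S_t by taking decision x_t.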
  • Slide 38
  • Slide 38 Introduction to ADP We first use the value function around the post-decision state variable, removing the expectation: We then replace the value function with an approximation that we estimate using machine learning techniques:
  • Slide 39
  • Slide 39 The post-decision state Value function approximations: Linear (in the resource state): Piecewise linear, separable: Indexed PWL separable:
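As an illustration, a piecewise linear, separable approximation stores, for each resource attribute, a non-increasing sequence of marginal values ("slopes"); the value of a post-decision resource vector is the sum over attributes of the accumulated slopes. A minimal Python sketch (all names are illustrative, not code from the talk):

```python
import numpy as np

class PiecewiseLinearVFA:
    """Separable, piecewise linear value function approximation.

    slopes[a][r] is the estimated marginal value of holding the (r+1)-st
    unit of a resource with attribute a; concavity means slopes[a] is
    non-increasing in r.
    """

    def __init__(self, attributes, max_units):
        self.slopes = {a: np.zeros(max_units) for a in attributes}

    def value(self, resource_vector):
        """V(R^x) = sum over attributes a of sum_{r < R^x_a} slopes[a][r]."""
        return sum(self.slopes[a][: int(r)].sum()
                   for a, r in resource_vector.items())

    def marginal(self, a, r):
        """Slope used when deciding whether to hold one more unit of a."""
        return self.slopes[a][int(r)]
```

An indexed version would simply keep one such slope array per value of the exogenous index (for example, time of day).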
  • Slide 40
  • Slide 40 The post-decision state Value function approximations: Ridge regression (Klabjan and Adelman) Benders cuts
  • Slide 41
  • Slide 41 Making decisions Following an ADP policy
  • Slide 42
  • Slide 42 Making decisions Following an ADP policy
  • Slide 43
  • Slide 43 Making decisions Following an ADP policy
  • Slide 44
  • Slide 44 Making decisions Following an ADP policy
  • Slide 45
  • Slide 45 Approximate dynamic programming With luck, the objective function will improve steadily
  • Slide 46
  • Slide 46 The post-decision state Comparison to other methods: Classical MDP (value iteration) Classical ADP (pre-decision state): Updating around post-decision state: Expectation No expectation
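To make the comparison concrete (a sketch in standard notation rather than the slide's exact formulas): classical value iteration loops over every state and evaluates an expectation; classical ADP around the pre-decision state samples states but still needs the expectation inside the max; updating around the post-decision state needs neither:

    Value iteration:    V(s) \leftarrow \max_{x} \big( C(s, x) + \gamma \sum_{s'} p(s' \mid s, x) \, V(s') \big)  for all states s
    Pre-decision ADP:   \hat{v}_t^n = \max_{x} \big( C(S_t^n, x) + \gamma \, \mathbb{E}[\, \bar{V}_{t+1}^{n-1}(S_{t+1}) \mid S_t^n, x \,] \big)
    Post-decision ADP:  \hat{v}_t^n = \max_{x} \big( C(S_t^n, x) + \gamma \, \bar{V}_t^{n-1}(S_t^{x,n}) \big), \quad \bar{V}_{t-1}^{n}(S_{t-1}^{x,n}) \leftarrow (1 - \alpha_n) \, \bar{V}_{t-1}^{n-1}(S_{t-1}^{x,n}) + \alpha_n \hat{v}_t^n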
  • Slide 47
  • Slide 47 Approximate dynamic programming Step 1: Start with a pre-decision state. Step 2: Solve the deterministic optimization using an approximate value function to obtain a decision. Step 3: Update the value function approximation. Step 4: Obtain a Monte Carlo sample of the exogenous information and compute the next pre-decision state. Step 5: Return to step 1. (Callouts: step 2 is a deterministic optimization, step 3 uses recursive statistics, step 4 is simulation.)
  • Slide 48
  • Slide 48 Approximate dynamic programming Step 1: Start with a pre-decision state. Step 2: Solve the deterministic optimization using an approximate value function to obtain a decision. Step 3: Update the value function approximation. Step 4: Obtain a Monte Carlo sample of the exogenous information and compute the next pre-decision state. Step 5: Return to step 1. (Callouts: step 2 is a deterministic optimization, step 3 uses recursive statistics, step 4 is simulation; see the sketch below.)
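A minimal, runnable Python sketch of this five-step loop; the functions solve_deterministic, sample_exogenous and transition, and the vfa object, are hypothetical stand-ins for the problem-specific pieces rather than code from the talk:

```python
def adp_forward_pass(S0, T, vfa, solve_deterministic, sample_exogenous,
                     transition, stepsize=0.1, n_iterations=100):
    """Generic ADP loop around the post-decision state (a sketch).

    Assumed interfaces (all hypothetical):
      solve_deterministic(S, vfa) -> (x, v_hat, S_post)   # step 2
      vfa.update(S_post, v_hat, stepsize)                 # step 3
      sample_exogenous(t)         -> W_{t+1}              # step 4 (simulation)
      transition(S_post, W)       -> next pre-decision state
    """
    for n in range(n_iterations):
        S = S0                      # step 1: start with a pre-decision state
        prev_post = None
        for t in range(T):
            # Step 2: deterministic optimization using the current VFA.
            x, v_hat, S_post = solve_deterministic(S, vfa)
            # Step 3: smooth v_hat into the value of the previous
            # post-decision state (recursive statistics).
            if prev_post is not None:
                vfa.update(prev_post, v_hat, stepsize)
            prev_post = S_post
            # Step 4: Monte Carlo sample of exogenous information, then the
            # transition function gives the next pre-decision state.
            W = sample_exogenous(t)
            S = transition(S_post, W)
    return vfa
```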
  • Slide 49
  • Slide 49 Outline Modeling stochastic resource allocation problems An introduction to ADP ADP and the post-decision state variable A blood management example The SMART energy policy model
  • Slide 50
  • Slide 50 Blood management Managing blood inventories
  • Slide 51
  • Slide 51 Blood management Managing blood inventories over time (timeline: weeks 0 through 3, t = 0, 1, 2, 3)
  • Slide 52
  • Network diagram (slide 52): blood supplies indexed by type and age (AB+,0 through AB+,3; O-,0 through O-,3) can either satisfy a demand (demands are listed by blood type: AB+, AB-, A+, A-, B+, B-, O+, O-) or be held to the next period.
  • Slide 53
  • Network diagram (slide 53): blood inventory nodes indexed by type and age (AB+,0 through AB+,3; O-,0 through O-,3).
  • Slide 54
  • Network diagram (slide 54), continued: the same blood inventory nodes by type and age.
  • Slide 55
  • Network diagram (slide 55), continued: solve this as a linear program.
  • Slide 56
  • Network diagram (slide 56), continued: the dual variables give the value of an additional unit of blood of each type and age.
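Each period of the blood problem is a small assignment/transportation LP, and the duals on the supply constraints are the gradients that feed the value function approximation. A toy illustration with made-up data (not the model from the talk), using SciPy's LP solver, which reports those duals as constraint marginals:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical single-period data: 2 supply nodes (blood type, age) and
# 2 demand nodes; reward[i, j] is the value of assigning supply i to demand j.
supply = np.array([10.0, 5.0])
demand = np.array([8.0, 6.0])
reward = np.array([[30.0, 0.0],
                   [30.0, 25.0]])

n_s, n_d = reward.shape
c = -reward.flatten()                      # linprog minimizes

# Supply rows: sum_j x[i, j] <= supply[i]; demand rows: sum_i x[i, j] <= demand[j]
A_supply = np.zeros((n_s, n_s * n_d))
for i in range(n_s):
    A_supply[i, i * n_d:(i + 1) * n_d] = 1.0
A_demand = np.zeros((n_d, n_s * n_d))
for j in range(n_d):
    A_demand[j, j::n_d] = 1.0

res = linprog(c,
              A_ub=np.vstack([A_supply, A_demand]),
              b_ub=np.concatenate([supply, demand]),
              bounds=(0, None), method="highs")

# Marginals on the supply rows: value of one more unit of each blood type/age.
marginal_value_of_supply = -res.ineqlin.marginals[:n_s]
print("allocation:\n", res.x.reshape(n_s, n_d))
print("marginal value of supply:", marginal_value_of_supply)
```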
  • Slide 57
  • Slide 57 Updating the value function approximation Estimate the gradient at
  • Slide 58
  • Slide 58 Updating the value function approximation Update the value function at
  • Slide 59
  • Slide 59 Updating the value function approximation Update the value function at
  • Slide 60
  • Slide 60 Updating the value function approximation Update the value function at
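Slides 57 through 60 show a sampled gradient (for example an LP dual) being smoothed into the piecewise linear approximation. A simplified sketch of such an update, assuming the slope-array representation sketched earlier; the concavity projection here is a deliberately crude stand-in for methods such as SPAR or CAVE:

```python
def update_slope(slopes, r, v_hat, alpha):
    """Smooth a sampled marginal value into slopes[r], then restore concavity.

    slopes : list or array of marginal values for one attribute
    r      : resource level at which the gradient was observed
    v_hat  : sampled marginal value (e.g., an LP dual)
    alpha  : stepsize in (0, 1]
    """
    slopes[r] = (1.0 - alpha) * slopes[r] + alpha * v_hat
    # Simple projection: slopes must stay non-increasing in r.
    for i in range(r - 1, -1, -1):
        if slopes[i] < slopes[i + 1]:
            slopes[i] = slopes[i + 1]
    for i in range(r + 1, len(slopes)):
        if slopes[i] > slopes[i - 1]:
            slopes[i] = slopes[i - 1]
    return slopes
```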
  • Slide 61
  • Slide 61 Outline Modeling stochastic resource allocation problems An introduction to ADP ADP and the post-decision state variable A blood management example The SMART energy policy model
  • Slide 62
  • Slide 62 SMART-Stochastic, multiscale model. SMART: A Stochastic, Multiscale Allocation model for energy Resources, Technology and policy. Stochastic: able to handle different types of uncertainty, both fine-grained (daily fluctuations in wind, solar, demand, prices, ...) and coarse-grained (major climate variations, new government policies, technology breakthroughs). Multiscale: able to handle different levels of detail: time scales (hourly to yearly), spatial scales (aggregate to fine-grained disaggregate), activities (different types of demand patterns), and decisions (hourly dispatch decisions, yearly investment decisions). Takes as input parameters characterizing government policies, performance of technologies, and assumptions about climate.
  • Slide 63
  • Slide 63 The annual investment problem 2008 2009
  • Slide 64
  • Slide 64 The hourly dispatch problem Hourly electricity dispatch problem
  • Slide 65
  • Slide 65 The hourly dispatch problem Hourly model Decisions at time t impact t+1 through the amount of water held in the reservoir. Hour t Hour t+1
  • Slide 66
  • Slide 66 The hourly dispatch problem Hourly model Decisions at time t impact t+1 through the amount of water held in the reservoir. Value of holding water in the reservoir for future time periods. Hour t
  • Slide 67
  • Slide 67 The hourly dispatch problem
  • Slide 68
  • Slide 68 The hourly dispatch problem 2008 Hour 1 2 3 4 8760 2009 1 2
  • Slide 69
  • Slide 69 The hourly dispatch problem 2008 Hour 1 2 3 4 8760 2009 1 2
  • Slide 70
  • Slide 70 SMART-Stochastic, multiscale model 2008 2009
  • Slide 71
  • Slide 71 SMART-Stochastic, multiscale model 2008 2009
  • Slide 72
  • Slide 72 SMART-Stochastic, multiscale model 2008 2009 2010 2011 2038
  • Slide 73
  • Slide 73 SMART-Stochastic, multiscale model 2008 2009 2010 2011 2038
  • Slide 74
  • Slide 74 SMART-Stochastic, multiscale model 2008 2009 2010 2011 2038 ~5 seconds
  • Slide 75
  • Slide 75 SMART-Stochastic, multiscale model Use statistical methods to learn the value of resources in the future. Resources may be: stored energy (hydro, flywheel energy); storage capacity (batteries, flywheels, compressed air); energy transmission capacity (transmission lines, gas lines, shipping capacity); energy production sources (wind mills, solar panels, nuclear power plants). (Plot: value vs. amount of resource.)
  • Slide 76
  • Slide 76 SMART-Stochastic, multiscale model Approximating continuous functions The algorithm performs very fine discretization over a small range of the function which is visited most often.
  • Slide 77
  • Slide 77 SMART-Stochastic, multiscale model Benchmarking Compare ADP to optimal LP for a deterministic problem. Annual model: 8,760 hours over a single year; focus on ability to match hydro storage decisions. 20 year model: 24 hour time increments over 20 years; focus on investment decisions. Comparisons on stochastic model: stochastic rainfall analysis (how does the ADP solution compare to the LP?) and carbon tax policy analysis (demonstrate nonanticipativity).
  • Slide 78
  • Slide 78 Benchmarking on hourly dispatch (plot: percentage error from optimal vs. iterations; ADP objective function relative to optimal LP; 0.06% over optimal).
  • Slide 79
  • Slide 79 Benchmarking on hourly dispatch Optimal from linear program Reservoir level Demand Rainfall
  • Slide 80
  • Slide 80 Benchmarking on hourly dispatch ADP solution Reservoir level Demand Rainfall Approximate dynamic programming
  • Slide 81
  • Slide 81 Benchmarking on hourly dispatch Optimal from linear program Reservoir level Demand Rainfall
  • Slide 82
  • Slide 82 Benchmarking on hourly dispatch ADP solution Reservoir level Demand Rainfall Approximate dynamic programming
  • Slide 83
  • Slide 83 Multidecade energy model Optimal vs. ADP daily model over 20 years 0.24% over optimal
  • Slide 84
  • Slide 84 Energy policy modeling Traditional optimization models tend to produce all-or-nothing solutions. (Plot: investment in IGCC vs. the cost differential between IGCC and pulverized coal, covering the regions where pulverized coal is cheaper and where IGCC is cheaper; curves for traditional optimization and for approximate dynamic programming.)
  • Slide 85
  • Slide 85 Stochastic rainfall (plot: precipitation vs. time period; sample paths).
  • Slide 86
  • Slide 86 Stochastic rainfall (plot: reservoir level vs. time period; optimal for individual scenarios vs. the ADP solution).
  • Slide 87
  • Slide 87 Energy policy modeling Following sample paths: demands, prices, weather, technology, policies, ... Need to consider both fine-grained noise (wind, rain, demand, prices, ...) and coarse-grained noise (technology, policy, climate, ...). (Plot: a metric such as % renewable through 2030; the goal is achieved with probability 0.70.)
  • Slide 88
  • Slide 88 Energy policy modeling Policy study: what is the effect of a potential (but uncertain) carbon tax in year 8? (Plot: carbon tax by year, years 1 through 9.)
  • Slide 89
  • Slide 89 Energy policy modeling Renewable technologies Carbon-based technologies No carbon tax
  • Slide 90
  • Slide 90 Energy policy modeling With carbon tax Carbon-based technologies Renewable technologies Carbon tax policy unknown Carbon tax policy determined
  • Slide 91
  • Slide 91 Energy policy modeling With carbon tax Carbon-based technologies Renewable technologies
  • Slide 92
  • Slide 92 Conclusions Capabilities SMART can handle problems with over 300,000 time periods so that it can model hourly variations in a long-term energy investment model. It can simulate virtually any form of uncertainty, either provided through an exogenous scenario file or sampled from a probability distribution. Accurate modeling of climate, technology and markets requires access to exogenously provided scenarios. It properly models storage processes over time. Current tests are on an aggregate model, but the modeling framework (and library) is set up for spatially disaggregate problems.
  • Slide 93
  • Slide 93 Conclusions Limitations More research is needed to test the ability of the model to use multiple storage technologies. Extension to spatially disaggregate model will require significant engineering and data. Run times will start to become an issue for a spatially disaggregate model. Value function approximations capture the resource state vector, but are limited to very simple exogenous state variations.
  • Slide 94
  • Slide 94 Outline Modeling stochastic resource allocation problems An introduction to ADP ADP and the post-decision state variable A blood management example The SMART energy policy model Merging machine learning and optimization
  • Slide 95
  • Slide 95 Merging machine learning and optimization The challenge of coarse-grained uncertainty Fine-grained uncertainty can generally be modeled as memoryless (even if it is not). Coarse-grained uncertainty affects what might be called the state of the world. The value of a resource depends on the state of the world. Is there a carbon tax? What is the state of battery research? Have there been major new oil discoveries? What is the price of oil? Did the international community adopt strict limits on carbon emissions? Have there been advances in our understanding of climate change?
  • Slide 96
  • Slide 96 Merging machine learning and optimization Modeling the state of the world We can use powerful machine learning algorithms to overcome these new curses of dimensionality. Instead of one piecewise linear value function for each resource and time period, we need one for each state of the world. There can be thousands of these.
  • Slide 97
  • Slide 97 Merging machine learning and optimization Strategy 1: Locally polynomial regression Widely used in statistics Approximate complex functions locally using simple functions. Estimate of the function is a weighted sum of these local approximations. But cannot handle categorical variables.
  • Slide 98
  • Slide 98 Merging machine learning and optimization Strategy 2: Dirichlet process mixtures of generalized linear models
  • Slide 99
  • Slide 99 Merging machine learning and optimization Strategy 3: Hierarchical learning models Estimate piecewise constant functions at different levels of aggregation:
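As an illustration of the hierarchical idea (a simplified sketch; the weighting actually used combines levels based on estimated bias and variance), one can maintain running means at several levels of aggregation and fall back to a coarser level whenever the finer level has too little data:

```python
from collections import defaultdict

class HierarchicalEstimator:
    """Piecewise constant estimates at several levels of aggregation.

    aggregators: functions mapping a state to progressively coarser keys,
    ordered from most disaggregate to most aggregate.
    """

    def __init__(self, aggregators):
        self.aggregators = aggregators
        self.means = [defaultdict(float) for _ in aggregators]
        self.counts = [defaultdict(int) for _ in aggregators]

    def update(self, state, observation):
        # Every observation updates the running mean at every level.
        for level, g in enumerate(self.aggregators):
            key = g(state)
            self.counts[level][key] += 1
            n = self.counts[level][key]
            self.means[level][key] += (observation - self.means[level][key]) / n

    def estimate(self, state, n_min=5):
        # Use the most disaggregate level with enough observations; this
        # crude fallback stands in for bias/variance-weighted averaging.
        for level, g in enumerate(self.aggregators):
            key = g(state)
            if self.counts[level][key] >= n_min:
                return self.means[level][key]
        return self.means[-1][self.aggregators[-1](state)]
```

For example, a "state of the world" such as (carbon tax in force, oil price regime, battery technology level) could be aggregated by dropping one component at a time.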
  • Slide 100
  • Slide 100 Merging machine learning and optimization Next steps: We need to transition these machine learning techniques into an ADP setting: Can they be adapted to work within a linear or nonlinear optimization algorithm? All three methods are asymptotically unbiased, but this depends on unbiased observations. In an ADP algorithm, observations are biased. We need to design an effective exploration strategy so that the solution does not become stuck. Other issues Will the methods provide fast, robust solutions for effective policy analysis?
  • Slide 101
  • Slide 101
  • Slide 102
  • Slide 102
  • Slide 103
  • Slide 103
  • Slide 104
  • Slide 104 Demand modeling Commercial electric demand 7 days