
Online Optimization for the Smart (Micro) Grid

Balakrishnan Narayanaswamy, Vikas K. Garg and T.S. Jayram

IBM Research, Bangalore, India

{murali.balakrishnan,vikaskga,t.s.jayram}@in.ibm.com

ABSTRACT

Growing environmental awareness and new government directives have set the stage for an increase in the fraction of energy supplied using renewable resources. The fast variation in renewable power, coupled with uncertainty in availability, emphasizes the need for algorithms for intelligent online generation scheduling. These algorithms should allow us to compensate for the renewable resource when it is not available and should also account for physical generator constraints. We apply and extend recent work in the field of online optimization to the scheduling of generators in smart (micro) grids and derive bounds on the performance of asymptotically good algorithms in terms of the generator parameters. We also design online algorithms that intelligently leverage available information about the future, such as predictions of wind intensity, and show that they can be used to guarantee near optimal performance under mild assumptions. This allows us to quantify the benefits of resources spent on prediction technologies and different generation sources in the smart grid. Finally, we empirically show how both classes of online algorithms (with or without predictions of future availability) significantly outperform certain natural algorithms.

Categories and Subject Descriptors

G.1.6 [Optimization]: Convex programming, Gradient methods

General Terms

Intelligent generator scheduling, Online gradient descent, Regret, Online convex optimization (OCO), Economic dispatch

1. INTRODUCTION

Growing environmental awareness and government directives have set the stage for an increase in the fraction of electricity supplied using renewable sources [30]. Distributed generation [2], especially solar and wind power collected across different small generation locations, is gaining considerable importance, and its deployment is perceived as vital in achieving carbon reduction goals [24]. Extracting the maximum value from a time varying and intermittent renewable energy resource requires intelligent scheduling of both generation [4] and loads [17].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. e-Energy 2012, May 9-11 2012, Madrid, Spain. Copyright 2012 ACM 978-1-4503-1055-0/12/05 ...$10.00.

Intelligent generation scheduling (involving unit commitment [14] and economic dispatch [10]) is the process of scheduling different generation sources to minimize cost while meeting physical constraints of the electricity system. It is a highly non-linear problem, usually solved using genetic programming or other non-convex optimization techniques [10]. Conventional economic dispatch, a very well researched methodology, is typically conducted 24 hours in advance (offline, day ahead) and uses the fact that the system load can be reasonably well predicted a day in advance. However, in a (micro) grid with high levels of wind penetration, this no longer holds due to the intermittent and unpredictable nature of wind power (which can only be reliably predicted a few minutes in advance [22]). This introduces practical challenges such as ramping constraints, which limit how fast a generation source can increase or decrease its output over successive steps. Thus, the need for techniques with a firm theoretical basis and worst case guarantees that enable online generation scheduling subject to such physical constraints becomes very important, and our work is a major step in bridging this gap in the literature.

Recent advances in wind prediction [22] offer hope that a reduction in the uncertainty of wind availability will lead to an increase in its value. Online versions of generation scheduling have been studied recently [4]. These algorithms, almost invariably, make assumptions regarding the stochastic nature of wind resources. For example, Xie and Ilic [31] use Model Predictive Control (MPC) for economic dispatch, where a model is constructed that predicts future renewable availability (assuming that the availability arises from some stochastic process) and then this prediction is used for generator optimization. These methods are computationally complex but seem to be effective in practice. This raises some interesting questions that motivate this paper: (i) Are there computationally simple algorithms that are still provably effective under non-stationary or arbitrary renewable availability? (ii) Can we build a theoretical basis for the success of these online and MPC based algorithms? (iii) Can we use the theory to design algorithms that optimally incorporate available information about the future?

We address these issues in the context of intelligent online generator scheduling in microgrids with large, unpredictable renewable energy penetrations. Specifically,

We demonstrate how, even in the harsh scenario where no prediction of the future is available and wind availability is chosen in an arbitrary manner, recent advances in online convex optimization [7] can be fruitfully applied to generator scheduling in the next generation of smart (micro) grids. We also show how to exploit the special structure of the cost function for generator scheduling to obtain performance guarantees in terms of parameters governing the generation sources.

We describe online algorithms [13] that leverage information about the near future, such as predictions of wind availability and intensity, more effectively. Interestingly, these algorithms use a strategy that discounts the future costs appropriately in order to prove guarantees on un-discounted future performance.

We extend the work in online optimization to prescribe computationally simple online algorithms that model practical constraints of the generation sources, such as ramping constraints and multiple generation sources.

We empirically show how both classes of online algorithms, i.e. with or without lookahead, significantly outperform the existing natural algorithms in the literature. For example, we show that discounting the future costs (perhaps counter-intuitively) can outperform algorithms with lookahead that do not discount the future. Thus, the theoretical techniques can be used to inform algorithm design.

Our work conclusively establishes the value of the proposed online algorithms and their theoretical analysis for the smart grid. The results equip us with a strong theoretical framework to quantify the benefits of resources spent on prediction technologies and multiple generation sources in the smart grid. While our algorithms can be used for both generator and load scheduling in a general grid, we describe our results in the context of economic dispatch for microgrids.

1.1 Need for intelligent online algorithms in smart (micro) grids

Recent policy amendments and new technology have necessitated the design of new algorithms for intelligent scheduling in smart grids. For example, in addition to the Kyoto Protocol, in 2007, Europe made a unilateral commitment to cutting its emissions by at least 20% of the 1990 levels by 2020. Since fossil fuel-based electricity is projected to account for more than 40% of global greenhouse gas emissions by 2020, renewable integration into the power grid will play a crucial role in meeting these goals.

Incorporating a large penetration of intermittent Distributed Energy Resources (DERs), while ensuring grid stability, is a hard problem that requires a rethinking of how the grid is operated [29]. Traditional power grids are used to supply power from a few central generators to a large customer base. In contrast, the next generation smart grid, which incorporates distributed generation, must allow two-way flow of electricity and information in order to create an automated and distributed energy delivery network [9].

The primary motivation for the problem we study is the ongoing project to establish a research microgrid at the Kuala Belalong Field Studies Centre (KBFSC)¹. The remote location and limited resource availability at this location make it an ideal platform to test new algorithms and technologies for the next generation of microgrids.

The concept of microgrids, which are (semi-) autonomous entities that co-ordinate DERs and loads in a decentralized manner, has been put forth to tackle the problem of large scale control for renewable integration. A microgrid usually comprises a Low Voltage (LV, ≤ 1 kV), locally-controlled cluster of DERs and loads that behaves, from the grid's perspective, as a single producer or load both electrically and in the energy markets [11]. A salient feature of the microgrid lies in its ability to island [18]: it can continue to locally generate and consume electricity, possibly at a reduced level, even when disconnected from the grid. To meet carbon reduction goals and minimize electricity generation costs, it is imperative that microgrids incorporate as large a fraction of the renewable energy generated as possible.

Intelligent scheduling of generation sources and loads is essential to the operation of a microgrid, to allow the integration of volatile DERs such as wind [17] while ensuring stability and reliability. A major limitation of intelligent scheduling concerns the infeasible requirement of constant human intervention during the scheduling of loads and generators, motivating the need for algorithms that automatically modulate the generation or consumption levels under uncertain renewable power availability. In the sequel, while we motivate and describe our techniques in the context of generator scheduling in microgrids, we again remark that our results can be readily adapted to handle general grids as well.

2. RESEARCH METHODOLOGY AND CONTRIBUTIONS

We model the intelligent generation scheduling problem as an online optimization problem where the objective function is defined to be the sum of time-dependent cost functions of the various time steps. The cost at each time step is determined by several components. The first is the cost of electricity generation due to the current generation level chosen by the algorithm. This is subject to the ramping constraints imposed by the generation source(s). In addition, there may also be uncertainty in the available wind, so that the net effect is that the generated electricity may either be insufficient to meet the current demand or create a surplus. This is modelled as an additional cost function (which could be negative) as determined by the external market prices.

We propose online algorithms for the optimization problems that arise in the smart grid and analyse them in the strong adversarial model [5]. This is a powerful paradigm that makes no assumptions regarding the distributions, as in stochastic optimization, or ranges, as in robust optimization, characterizing the uncertainty of the unknown future. Therefore the results tend to have wider applicability.

The standard way to measure the performance of an online algorithm is with respect to an offline optimization strategy that knows all the problem parameters with certainty a priori. The key performance measure is regret, which measures the difference between the online and the offline costs. It may seem unreasonable to expect any interesting guarantees, because the adversary can simply make the online algorithm pay heavily for its lack of knowledge of the future, resulting in poor performance. One way to circumvent this is to place restrictions on the offline algorithm's choices. The remarkable achievement of this theory is that, under reasonable restrictions, we can design online algorithms that are intuitive, simple to implement, and yield good performance guarantees. The analysis techniques are quite involved, drawing on tools from convex optimization, Markov decision processes, and stability theory.

In this paper, we consider two possible approaches to tackle the uncertainty in the future. In the first setting, also known as online convex optimization (OCO) [7], the cost function for each time step is fully known only after the algorithm chooses the generation level for that step. In this case, we restrict the offline algorithm to make one fixed choice of the generation level for the entire duration, but we stress that this choice is made in hindsight. The central result of OCO theory is that it is possible to design online algorithms that achieve regret sub-linear in the number of time steps $T$. Specifically, taking into account the structure of the cost function, we derive a regret bound of $O(\log T)$ for the generation scheduling problem. We note that the average regret per time step is $O\left(\frac{\log T}{T}\right)$, which vanishes as $T$ tends to $\infty$. In addition to the theoretical bounds, we show in simulation that this simple algorithm has substantially better performance than a forecaster commonly used in the economic dispatch literature. An extension of this setup is one where the adversary's choice of the point is not fixed but allowed to vary slightly. Our methods do apply to this problem setting and still yield the same $O(\log T)$ regret bound mentioned above, albeit with slightly worse constants.

¹ http://ubdestate.blogspot.com/2009/06/kuala-belalong-field-studies-centre.html

In the second setting, we consider a more practical problem where the wind information is available for a limited horizon in the future, drawing on recent work in short term wind forecasting [22, 1]. This allows the cost functions to be known with certainty for the next $L$ steps, for some sufficiently large parameter $L$ called the lookahead. In this case, we design a variant of the greedy algorithm that uses the available information to decide the generation level for the next time step. The novelty comes from the fact that the strategy discounts the future available costs in a geometrically decreasing fashion. We stress, however, that the goal is to optimize the sum of costs and not the sum of discounted costs, which is just an artifact of the strategy.

We analyze the regret of this algorithm with respect to the strongest possible offline algorithm: one that is allowed to change the generation levels in hindsight. The main result is again a sub-linear regret algorithm for an appropriately large but reasonable lookahead. Such approaches have been considered before and are reminiscent of Model Predictive Control (MPC) algorithms used in control theory [8, 20, 23]. Our analysis provides some theoretical justification for such algorithms. Recall that, at the basic level, an MPC algorithm is a sequence of open-loop policies where in each step the algorithm uses a model to compute an optimal trajectory and takes just the first step of that path. Then the model is recomputed with respect to the feedback provided at that step. Our algorithm closely mirrors the MPC approach, except that we use the discounted paths to argue in support of our algorithm, with respect to the un-discounted total reward for a finite horizon $T$, against an all powerful adversary that can choose an arbitrary path at the end of the finite horizon. Our algorithm generalizes the work in [13] to account for the ramping conditions which, stated in the language of [13], applies to a more general setting where not all state-state transitions are legal. The theoretical bounds also quantify the improvement in performance with lookahead and the interaction between the amount of lookahead, the ramping constraints of the generator and the discounting factor that should be used. Through simulations we show that discounting the future reward, which was a proof strategy used to obtain our bounds, actually performs better than an algorithm that does not discount the future. Thus, the analysis methods we use inform algorithm design.

The rest of the paper is organized as follows. We describe an abstract model of a (micro) grid with schedulable generators and loads and introduce some notation in Section 3. In Section 4, we show how OCO can be used in a simple economic dispatch model with one generation source and how the particular structure of the cost function in electric power systems allows us to derive strong bounds on performance. We also describe how OCO techniques can be used to handle practical considerations like the ramping constraints of a practical generator or the scheduling of multiple generators. In Section 5, we describe algorithms that efficiently incorporate predictions of the future availability of intermittent resources. Finally, we summarize, and describe some theoretical, algorithmic and practical directions for future research in Section 7.

3. MODEL DESCRIPTION

We first describe the microgrid scenario in more detail and introduce some necessary notation. We begin with a microgrid with a single generation source (say a turbine-boiler generator), though we will describe extensions to multiple generators in Section 4.4. Generators can be modeled as having a quadratic cost curve [26], so that the cost of generating $\theta_t$ using a generator can be expressed as

$$C_G(\theta_t) = a\theta_t^2 + b\theta_t + c \tag{1}$$

For a typical 76 MW coal generator², $a = 0.002$, $b = 2.680$ and $c = 35.385$, when the step size is 10 min and $\theta_t$ is in MW. We model a discrete time version with slot size comparable to the rate of variation in wind power (say 5 to 15 min).
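As a quick numerical illustration of the quadratic cost curve (1), the following sketch evaluates the cost with the coefficients quoted above. The function name `generation_cost` and the sample generation levels are assumptions made purely for illustration.

```python
def generation_cost(theta, a=0.002, b=2.680, c=35.385):
    """Quadratic generation cost (1) for a generation level theta (in MW)."""
    return a * theta ** 2 + b * theta + c


# Evaluate the cost at a few hypothetical generation levels.
for theta in (0.0, 38.0, 76.0):
    print(theta, generation_cost(theta))
```

The constant term `c` is paid even at zero output, so the curve never passes through the origin.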

The microgrid has a time varying source of (free) renewable power $r_t$ and has to satisfy a time varying load $l_t$. The load can be predicted quite accurately; the wind power, however, usually cannot be, or can be predicted accurately only a few slots in advance. We are interested in algorithms for both these situations. In our model, we treat the wind power generated as a negative load, resulting in a net demand at every slot $d_t = \mu_t + x_t$, where $\mu_t$ is the value of $l_t - r_t$ predicted one day ahead and $x_t$ is the (unknown) error due to the unpredictable nature of the wind and deviation in demand from the predicted value. This modeling is quite natural considering the day-ahead nature of electricity markets [26].

Note: Since the base load $\mu_t$ is settled in the day ahead market or can be satisfied cheaply using slower generation, for notational simplicity we do not consider it below. It can, however, be included in the analysis if required. The only difference would be that the allowable generation level $\theta_t$ might also take negative values, corresponding to a decrease in comparison to the pre-committed level. In the sequel, we will be concerned only with the $x_t$ process.

² http://pscal.ece.gatech.edu/testsys/generators.html

At every time slot, the microgrid schedules some generation $\theta_t$ at cost $C_G(\theta_t)$. Once demand (load + wind power) is revealed, the microgrid has to buy the shortfall at $\lambda^{buy}_t$ per unit of electricity or sell the surplus on the spot market at $\lambda^{sell}_t$ per unit of electricity. Thus, the net cost to satisfy the load is

$$C^t_{net}(\theta_t) = a\theta_t^2 + b\theta_t + c + \lambda^{buy}_t (x_t - \theta_t)^+ - \lambda^{sell}_t (\theta_t - x_t)^+ \tag{2}$$

where $\theta_t \ge 0$, and $(x_t - \theta_t)^+$ is $(x_t - \theta_t)$ if $x_t - \theta_t \ge 0$ and 0 otherwise.
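The net cost (2) can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the names `net_cost`, `lam_buy` and `lam_sell` are hypothetical, and the default coefficients are the coal generator values from Section 3.

```python
def net_cost(theta, x, lam_buy, lam_sell, a=0.002, b=2.680, c=35.385):
    """Net cost (2) of scheduling theta when the realized net demand is x.

    A shortfall (x > theta) is bought at lam_buy per unit; a surplus
    (theta > x) is sold at lam_sell per unit and reduces the cost.
    """
    shortfall = max(x - theta, 0.0)
    surplus = max(theta - x, 0.0)
    generation = a * theta ** 2 + b * theta + c
    return generation + lam_buy * shortfall - lam_sell * surplus


# Under-scheduling buys 5 units; over-scheduling sells 5 units.
print(net_cost(10.0, 15.0, lam_buy=5.0, lam_sell=1.0))
print(net_cost(15.0, 10.0, lam_buy=5.0, lam_sell=1.0))
```

At most one of `shortfall` and `surplus` is non-zero in any slot, which is what produces the kink in the cost at `theta == x`.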

The offline version of the generation scheduling problem, where all the $x_t$'s are known a priori, is a convex problem with linear constraints and can be solved using any of the standard convex optimization methods [6]. However, the structure of the constraints allows us to solve the optimization problem more efficiently, as outlined in Appendix C.

4. INTELLIGENT GENERATOR SCHEDULING: LOW REGRET DISPATCH

We will see how OCO provides simple algorithms for generator scheduling with no knowledge of the future of the process $x_t$. We will first show that the cost function (2) has a special structure that improves the bounds on the performance of OCO algorithms.

The net cost paid by the microgrid for serving a load $x_t$ when a generation $\theta_t$ is scheduled is given by (2). The agent that schedules the generation is essentially computing a prediction of the load $x_t$ and is sometimes also called a forecaster. We would like a forecaster that generates a sequence $\theta_1, \ldots, \theta_T$ that has low regret, that is, low values of

$$R_T = \max_{\theta^*} \sum_{t=1}^{T} \left[ C^t_{net}(\theta_t) - C^t_{net}(\theta^*) \right] \tag{3}$$

We are thus comparing our forecaster against a hypothetical algorithm that has access to the entire sequence of $x_t$'s but is constrained to select a fixed generation value $\theta^*$ for the entire duration. We are particularly interested in forecasters that have sub-linear regret, so that the average per-slot regret $\frac{1}{T}R_T$ goes (as quickly as possible) to zero as $T \to \infty$. In such situations our algorithms are essentially as good as the best fixed forecast in hindsight. We assume that $|\theta_t| \le \theta_{\max}$ and $|x_t| \le X$, as otherwise the regret can be made unbounded.
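The regret in (3) can be estimated numerically. The sketch below is an illustration under stated assumptions: the best fixed level is searched on a uniform grid over the feasible interval (an approximation introduced here, not part of the paper), prices are held constant, and the names `net_cost` and `regret` are hypothetical.

```python
def net_cost(theta, x, lam_buy, lam_sell, a=0.002, b=2.680, c=35.385):
    """Net cost (2): generation cost plus purchases minus spot-market sales."""
    shortfall = max(x - theta, 0.0)
    surplus = max(theta - x, 0.0)
    return a * theta ** 2 + b * theta + c + lam_buy * shortfall - lam_sell * surplus


def regret(schedule, xs, lam_buy, lam_sell, theta_max, grid=1001):
    """Regret (3) of a schedule against the best fixed level on a uniform grid."""
    online = sum(net_cost(th, x, lam_buy, lam_sell) for th, x in zip(schedule, xs))
    # Candidate fixed generation levels in [0, theta_max].
    candidates = [theta_max * k / (grid - 1) for k in range(grid)]
    best_fixed = min(
        sum(net_cost(u, x, lam_buy, lam_sell) for x in xs) for u in candidates
    )
    return online - best_fixed
```

Because the summed cost is convex in the fixed level, a finer grid (or a one-dimensional convex solver) tightens the estimate of the comparator cost.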

4.1 Interpreting the cost function

The first observation we make is that the cost function in (2) is convex only for

$$\lambda^{buy}_t \ge \lambda^{sell}_t \tag{4}$$

This is also sensible: if the profit from selling were larger than the cost of buying, there would be an arbitrage opportunity and infinite profit could be made. This situation should never arise in practice.

In addition, the microgrid will not generate electricity if

$$a\theta^2 + b\theta \ge \lambda^{buy}_t \theta \tag{5}$$

In particular, the microgrid will never generate if $b \ge \lambda^{buy}_t$. So we assume $\lambda^{buy}_t \ge b$ and (4) throughout this paper. While the above two conditions are simple, they are important to keep in mind.

We would like to see how the practical aspects of the problem, such as the nature of the cost function and physical constraints, affect the solution and guarantees. The cost function in (2) is a convex loss function, which guarantees a single global minimum. In fact, it is strongly convex, which has implications for the rate of decrease of the regret of our algorithms.

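Definition. A cost function $C_{net}$ is strongly convex with parameter $\alpha > 0$ if

$$C_{net}(u) \ge C_{net}(x) + \nabla C_{net}(x)\,(u - x) + \frac{\alpha}{2}\|u - x\|_2^2 \quad \forall (u, x) \tag{6}$$

We define

$$G \triangleq \sup\{\|\nabla_t\|_2 : \nabla_t \in \partial C^t_{net}(\theta),\ 0 \le \theta \le \theta_{\max}\} \tag{7}$$

where $\partial C^t_{net}(\theta)$ is the set of subgradients of $C^t_{net}$ at $\theta$.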

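Lemma 1. The cost function in (2) is strongly convex with $\|\nabla_t\|_2 \le G$, where $G$ is given by

$$G \le \max\left(2a\theta_{\max} + b,\ \max_t \lambda^{buy}_t\right) \tag{8}$$

and $\alpha = 2a$ and $\lambda^{buy}_t \ge \lambda^{sell}_t$.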

For simplicity we focus on a particularly simple online gradient descent type algorithm due to [32], though there are many online algorithms with different properties that may be useful [7].

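4.2 Online generation optimization

The algorithm proceeds as follows. At time $t$ we need to decide how much to generate from the generators at time $t + 1$. The Zinkevich update [32] suggests that we should generate

$$y_{t+1} = \theta_t - \eta_t \nabla C^t_{net}(\theta)\big|_{\theta=\theta_t} \tag{9}$$

projected onto the feasible set. For our cost function (2) this reduces to

$$y_{t+1} = \begin{cases} \theta_t - \eta_t\left[2a\theta_t + b - \lambda^{buy}_t\right] & \text{if } \theta_t \le x_t \\ \theta_t - \eta_t\left[2a\theta_t + b - \lambda^{sell}_t\right] & \text{if } \theta_t > x_t \end{cases} \tag{10}$$

Since the allowed generation lies in the ball $K = [0, \theta_{\max}]$,

$$\theta_{t+1} = \min(\max(0, y_{t+1}), \theta_{\max}) \tag{11}$$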
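The update in (9) to (11) can be sketched as a short loop. This is a minimal illustration under assumptions: prices are held constant, the initial level is zero, and the step size is taken to be the reciprocal of the strong convexity parameter ($2a$) times $t$, the choice used in the proof of the regret bound. The function name `ogd_schedule` is hypothetical.

```python
def ogd_schedule(xs, lam_buy, lam_sell, theta_max, a=0.002, b=2.680, theta0=0.0):
    """Online gradient descent schedule following (9)-(11)."""
    alpha = 2 * a  # strong convexity parameter of the quadratic cost
    theta = theta0
    schedule = []
    for t, x in enumerate(xs, start=1):
        schedule.append(theta)  # level committed before x is revealed
        # Subgradient of the net cost (2) at the committed level, as in (10).
        price = lam_buy if theta <= x else lam_sell
        grad = 2 * a * theta + b - price
        eta = 1.0 / (alpha * t)  # assumed step size
        y = theta - eta * grad
        theta = min(max(0.0, y), theta_max)  # projection (11)
    return schedule


# With a constant net demand the schedule settles near that demand.
levels = ogd_schedule([10.0] * 2000, lam_buy=5.0, lam_sell=1.0, theta_max=20.0)
print(levels[-1])
```

Early steps are large because `eta` starts at `1 / alpha`, so the first few levels tend to hit the boundaries of the feasible interval before the schedule settles.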

For such an update, we can prove that,

Theorem 2. The regret of the online generation scheduling algorithm can be bounded as

$$R_T \le \frac{G^2}{\alpha}(\log T + O(1)) \tag{12}$$

where $G$ is as in (8), $\alpha = 2a$ and $D = \theta_{\max}$. Thus, the per-slot regret $\frac{R_T}{T}$ goes to zero as $O\left(\frac{\log T}{T}\right)$.

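Proof. While theorems of this form are known in the OCO literature [7], we present the proof for our strongly convex cost function for completeness. Basically, the strongly convex nature of the economic dispatch cost function allows an intelligent choice of the learning rate $\eta_t$, which gives us better bounds where the total regret increases only logarithmically with $T$.

From the strong convexity condition (6), $\forall u \in K$,

$$\sum_{t=1}^{T} \left[ C^t_{net}(\theta_t) - C^t_{net}(u) \right] \le \sum_{t=1}^{T} \left[ \nabla_t(\theta_t - u) - \frac{\alpha}{2}(u - \theta_t)^2 \right] \le \sum_{t=1}^{T} \left( \frac{1}{\eta_t} - \frac{1}{\eta_{t-1}} - \alpha \right) \frac{1}{2}(u - \theta_t)^2 + \sum_{t=1}^{T} \eta_t \nabla_t^2 \tag{13}$$

where (13) comes from using Lemma 8. Now, substituting $\eta_t = \frac{1}{\alpha t}$ (with the convention $1/\eta_0 = 0$), each coefficient $\frac{1}{\eta_t} - \frac{1}{\eta_{t-1}} - \alpha$ vanishes, and we conclude

$$\sum_{t=1}^{T} \left[ C^t_{net}(\theta_t) - C^t_{net}(u) \right] \le \sum_{t=1}^{T} \eta_t \nabla_t^2 \le \frac{G^2}{\alpha} \sum_{t=1}^{T} \frac{1}{t} \tag{14}$$

The theorem follows using the standard bound for the harmonic series.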

This guarantees that, in the long run, the online scheduling algorithm given by (10) and (11) performs essentially as well as the best fixed generation level in hindsight.

4.3 Ramping constraints

While the algorithm in the previous section is simple and effective, it may become infeasible in practice if rapid variations in the wind cause the forecasts to vary drastically across slots. This is because, in general, generators have ramping constraints [10], i.e. constraints of the form

$$|\theta_{t+1} - \theta_t| \le R \quad \forall t \tag{15}$$

These ramping constraints become important because of the possibility of a very high slew rate in wind power availability [19]: on 11th February 2007, Irish wind power fell steadily from 415 MW at midnight to 79 MW at 4 am (about 1.5 MW/min). In comparison, a typical thermal generator would have a ramping rate of about 5-15% of capacity/min.

In order to ensure that the updates in (10) and (11) satisfy the ramping constraints (15), we need

$$\eta_t \le \frac{R}{G} \quad \forall t \tag{16}$$

With this constraint on $\eta_t$ we can state the following result on the regret of generation scheduling with ramping constraints.
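One way to respect the step size cap in (16) is sketched below. Several choices here are assumptions for illustration: the step size is the smaller of the decaying rate used for the unconstrained update and the cap, prices are constant, and the gradient bound `G` is computed by checking both linear branches of (10) at the endpoints of the feasible interval. The name `ramp_limited_schedule` is hypothetical.

```python
def ramp_limited_schedule(xs, lam_buy, lam_sell, theta_max, ramp,
                          a=0.002, b=2.680):
    """Online gradient descent with the step size capped as in (16)."""
    alpha = 2 * a
    # Each branch of (10) is linear in theta, so the largest gradient
    # magnitude over [0, theta_max] is attained at an endpoint.
    G = max(abs(2 * a * th + b - lam)
            for th in (0.0, theta_max)
            for lam in (lam_buy, lam_sell))
    theta, schedule = 0.0, []
    for t, x in enumerate(xs, start=1):
        schedule.append(theta)
        price = lam_buy if theta <= x else lam_sell
        grad = 2 * a * theta + b - price
        eta = min(1.0 / (alpha * t), ramp / G)  # cap keeps |step| <= ramp
        theta = min(max(0.0, theta - eta * grad), theta_max)
    return schedule
```

Since the gradient magnitude never exceeds `G` and the projection cannot lengthen a step, consecutive levels differ by at most `ramp`.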

Theorem 3. The regret of the online generation scheduling algorithm, with ramping constraints, can be bounded as

$$R_T \le \frac{G^3}{\alpha R}(\log T + O(1)) \tag{17}$$

where $G$ is as in (8), $\alpha = 2a$, $D = \theta_{\max}$ and $R$ is the ramping rate as in (15). The per-slot regret $\frac{R_T}{T}$ goes to zero as $O\left(\frac{\log T}{T}\right)$.

4.4 Multiple generation sources

One possible solution that has been suggested to enable faster ramping is to have multiple generation sources. While the ramping as a fraction of the total generation remains in the same range, having multiple generators allows faster response to wind events. We show how the online gradient descent algorithm from the previous sub-section can easily be extended to this situation as well.

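With multiple generators $i = 1, 2, \ldots, N_G$, each with its own cost coefficients $a_i, b_i, c_i$, the total cost function becomes

$$C^t_{net}(\theta_t) = \sum_{i=1}^{N_G} a_i (\theta^i_t)^2 + \sum_{i=1}^{N_G} b_i \theta^i_t + \sum_{i=1}^{N_G} c_i + \lambda^{buy}_t \Big( x_t - \sum_{i=1}^{N_G} \theta^i_t \Big)^+ - \lambda^{sell}_t \Big( \sum_{i=1}^{N_G} \theta^i_t - x_t \Big)^+ \tag{18}$$

Each generator $i$ has a ramping constraint of the form $|\theta^i_t - \theta^i_{t+1}| \le R_i \ \forall t$.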

Theorem 4. The regret of the online generation scheduling algorithm with multiple constrained generators can be bounded as

$$R_T \le \frac{G^3}{\alpha R_{\min}}(\log T + O(1)) \tag{19}$$

where $G$ is as in (8), $\alpha = 2a$, $D = \theta_{\max}$ and $R$ is the ramping rate as in (15). The per-slot regret $\frac{R_T}{T}$ goes to zero as $O\left(\frac{\log T}{T}\right)$, where $R_{\min} = \min_i R_i$.

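Proof. For this cost function, the Zinkevich update (9) for each generator $i$ reduces to

$$y^i_{t+1} = \begin{cases} \theta^i_t - \eta_t\left[2a_i\theta^i_t + b_i - \lambda^{buy}_t\right] & \text{if } \sum_i \theta^i_t \le x_t \\ \theta^i_t - \eta_t\left[2a_i\theta^i_t + b_i - \lambda^{sell}_t\right] & \text{if } \sum_i \theta^i_t > x_t \end{cases} \tag{20}$$

To account for the fact that generation lies in the ball $\theta^i \in [0, \theta^i_{\max}]$, we have

$$\theta^i_{t+1} = \min(\max(0, y^i_{t+1}), \theta^i_{\max}) \tag{21}$$

To satisfy the ramping constraints $R_i$ for each generator we require

$$\eta_t \le \frac{R_{\min}}{G} \quad \forall t \tag{22}$$

where $G$ is as in (8), with $\theta_{\max} = \sum_i \theta^i_{\max}$.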
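A single step of the per-generator update in (20) and (21) might look as follows. This is a sketch under assumptions: the step size `eta` is supplied by the caller (for instance, capped as in (22)), and the function name `multi_generator_step` and the example coefficients for the second generator are hypothetical.

```python
def multi_generator_step(thetas, x, lam_buy, lam_sell, coeffs, theta_maxes, eta):
    """One update of every generator level following (20) and (21).

    thetas      -- current levels, one per generator
    coeffs      -- list of (a_i, b_i) cost coefficients
    theta_maxes -- per-generator capacity limits
    """
    # The buy/sell branch depends on the total generation versus demand.
    price = lam_buy if sum(thetas) <= x else lam_sell
    updated = []
    for theta, (a_i, b_i), cap in zip(thetas, coeffs, theta_maxes):
        y = theta - eta * (2 * a_i * theta + b_i - price)
        updated.append(min(max(0.0, y), cap))  # projection (21)
    return updated


print(multi_generator_step([0.0, 0.0], 10.0, 5.0, 1.0,
                           [(0.002, 2.68), (0.004, 3.0)], [20.0, 20.0], 1.0))
```

All generators share the same price branch in a given slot, so cheaper generators (smaller `b_i`) move up faster when there is a shortfall.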

Note that, using this approach, the regret bound depends on the ramping constraint of the most constrained generator, indicating that, at least in the worst case, the benefits of multiple constrained generators are limited. Further analysis using specific statistics of wind or solar power availability would be an interesting direction for further investigation, to identify when multiple generators are an economical decision.

4.5 Simulations

We now consider some simulations to highlight aspects of the performance of the algorithms that may be hidden by the proofs. For simplicity we assume that the microgrid operator is interested in using all the wind power generated, and thus schedules for the largest possible wind output. Thus, only shortfalls are possible, so that $x_t \ge 0$. This is more realistic in light of recent laws enacted in European countries (especially Germany) and recommendations of the Global Wind Energy Council³ that require that all the wind power generated be utilized.

We consider a simple ramping model of wind availability to demonstrate the effectiveness of the ramp constrained OCO updates with $\eta_t$ as in (16). In this model, wind event $i$ occurs after time $T^{st}_i$ and continues at a peak power value for $T^{p}_i$. Each event also has a ramp up time $t^{up}_i$ and a ramp down time $t^{dn}_i$. During the ramp time the wind power changes linearly from its initial value to its final value.

We simulate the case where $T^{st}_i$ and $T^{p}_i$ are drawn from an exponential distribution with parameter $\beta_T$, and $t^{up}_i$ and $t^{dn}_i$ are drawn from an exponential distribution with parameter $\beta_t$. A typical wind power output sequence is shown in Figure 1.
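The wind event model described above can be sketched as a small generator of normalized traces. The details below are assumptions for illustration only: durations are rounded to whole slots, the exponential parameters are interpreted as means, the peak is normalized to 1 and the baseline to 0, and each ramp lasts at least one slot. The function name `wind_trace` is hypothetical.

```python
import random


def wind_trace(n_slots, mean_T, mean_t, seed=0):
    """Normalized wind power trace built from ramp-up, peak and ramp-down events."""
    rng = random.Random(seed)
    trace = []
    while len(trace) < n_slots:
        # Start delay and peak duration share one exponential distribution,
        # ramp-up and ramp-down times share another.
        wait = int(round(rng.expovariate(1.0 / mean_T)))
        up = max(1, int(round(rng.expovariate(1.0 / mean_t))))
        peak = int(round(rng.expovariate(1.0 / mean_T)))
        down = max(1, int(round(rng.expovariate(1.0 / mean_t))))
        trace += [0.0] * wait                               # calm period
        trace += [(k + 1) / up for k in range(up)]          # linear ramp up
        trace += [1.0] * peak                               # peak output
        trace += [1.0 - (k + 1) / down for k in range(down)]  # linear ramp down
    return trace[:n_slots]


print(wind_trace(20, mean_T=5.0, mean_t=2.0, seed=1))
```

Smaller values of `mean_t` give steeper ramps, which is the regime where the ramping constraint on the generator binds most often.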

³ See for example ftp://ftp.sni.technion.ac.il/events/2011-12-19/levon.pdf

Figure 1: Sample wind events generated from the model. (Plot title: typical wind pattern from distribution; x-axis: time slots; y-axis: normalized magnitude.)

To reduce the number of parameters we fix $\lambda^{buy}_t = \lambda^{buy}$ (a constant) and $\lambda^{sell}_t = 0$. This is the special case where the microgrid cannot sell back to the main grid at a profit. In order to understand the effect of wind ramps we fix $\beta_T$ and vary $\beta_t$ (the exponential parameter for the ramp times of wind events). We also use the thermal generation parameters from Section 3. Finally, in order to remove as much explicit dependence on parameter choices as possible, we plot the ratio between the total cost incurred by different algorithms and the cost of the OCO update algorithm.

In Figure 2 we compare the performance of the OCO update, a simple greedy predictor and the best fixed generation level, chosen in hindsight. The greedy algorithm is essentially a persistence based forecaster [22] that schedules the generation optimally assuming that the wind availability in the next slot will be the same as the wind availability in the current slot. This naive method has been seen to be hard to beat 1 to 6 hours ahead, but we see how intelligent scheduling leads to substantially improved performance.
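A persistence forecaster of this kind can be sketched as follows. The closed-form minimizer of the net cost (2) for a known demand is derived here for illustration and is not stated in the paper: the cost decreases below the demand while the marginal generation cost is under the buy price and increases above it once the marginal cost exceeds the sell price, so the minimizer is the demand clipped to those two break-even levels. The names `best_response` and `persistence_schedule`, and the choice of zero generation in the first slot, are assumptions.

```python
def best_response(x, lam_buy, lam_sell, theta_max, a=0.002, b=2.680):
    """Generation level minimizing the net cost (2) when demand x is known."""
    sell_level = (lam_sell - b) / (2 * a)  # marginal cost equals sell price
    buy_level = (lam_buy - b) / (2 * a)    # marginal cost equals buy price
    theta = min(max(x, sell_level), buy_level)
    return min(max(theta, 0.0), theta_max)


def persistence_schedule(xs, lam_buy, lam_sell, theta_max):
    """Schedule each slot optimally for the previous slot's observed demand."""
    schedule = [0.0]  # nothing observed before the first slot (assumption)
    for x_prev in xs[:-1]:
        schedule.append(best_response(x_prev, lam_buy, lam_sell, theta_max))
    return schedule


print(persistence_schedule([3.0, 7.0, 12.0], 5.0, 0.0, 20.0))
```

The forecaster always lags the demand by one slot, which is why fast wind ramps hurt it more than slow ones.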

5. INTELLIGENT GENERATOR SCHEDULING WITH LOOKAHEAD

The online optimization framework discussed in the preceding section guarantees a low regret with respect to an adversary that chooses a best fixed point (in hindsight) for the entire time horizon. In this section, we consider a much more powerful offline baseline that is free to choose a different point at each time step. We show that the availability of some extra information, in terms of lookahead or a glimpse into the future demands, can be efficiently used to obtain almost as good a performance as this strong offline oracle.

5.1 Generator scheduling with discounted future rewards

Our system with lookahead can be abstracted (without loss of generality) to work as follows. At each step $t$, with a lookahead of $L$, the online algorithm has access to the net loads for each of the next $L$ steps, in addition to that for the current step. Each action incurs a cost, and the objective of the online algorithm is to minimize the average cost (or equivalently, maximize the average reward) over the specified time horizon $T$.

Figure 2: Cost ratio of OCO and baseline algorithms. A higher ratio shows better performance of OCO. Lower ramp times indicate higher wind volatility. (x-axis: mean of ramp time; y-axis: cost ratio; curves: cost of persistence forecast / cost of OCO, and cost of best fixed in hindsight / cost of OCO.)

Let there be an expected reward rt associated with eacht choice. Define a L-strategy, during any time step, to be adecision regarding the amount of electricity to be producedin sequence for the next L+1 steps. Note that the successivechoices prescribed by any strategy must obey the rampingconstraint. We now present a L-lookahead based determinis-tic algorithm ON that is asymptotically optimal as L grows.For any strategy that collects L + 1 rewards r0, r1, . . . , rl in

the next L + 1 slots, we define its (discounted) anticipatedreward to be r0 + r1 + . . . +

LrL, for some (0, 1). Thealgorithm ON greedily follows at each time step the strategywith maximum anticipated reward. In other words, at everytime step that strategy is chosen for the next L + 1 stepswhose anticipated reward is maximal, and the online algo-rithm takes action dictated by this strategy in the currentstep. Since the algorithm recomputes the maximal strategyat each step, the strategies at successive time steps may bedifferent. In effect, we solve the following optimization prob-lem at each time step t:

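\[
\max_{\mu_t, \mu_{t+1}, \ldots, \mu_{t+L}} \; r_t(\mu_t, x_t) + \sum_{i=1}^{L} \beta^i \, r_{t+i}(\mu_{t+i}, x_{t+i})
\]

subject to

\[
|\mu_{i+1} - \mu_i| \le R \quad \forall i,
\]

where

\[
r_i = 1 - \frac{C^i_{net}(\mu_i)}{C_{max}} \in [0, 1] \quad \forall i,
\qquad \text{and} \qquad
C_{max} = \max_t \max_{\mu_t} C^t_{net}(\mu_t).
\]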
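As a small illustration of the reward normalization and the discounted anticipated reward defined above, the following Python sketch maps a sequence of per-step costs to rewards in [0, 1] and adds them with geometric discounting. The function name `anticipated_reward` and the arguments `costs`, `c_max` and `beta` (the discount factor in (0, 1)) are illustrative assumptions and are not taken from the paper.

```python
def anticipated_reward(costs, c_max, beta):
    """Discounted anticipated reward of one candidate strategy.

    costs[i] is the (hypothetical) net cost the strategy incurs i steps ahead,
    c_max is an upper bound on any single-step cost, and beta is the discount
    factor in (0, 1).
    """
    # Normalize each cost to a reward in [0, 1]: r_i = 1 - C_i / C_max.
    rewards = [1.0 - c / c_max for c in costs]
    # Weight the reward i steps ahead by beta**i and add them up.
    return sum(beta ** i * r for i, r in enumerate(rewards))


# Example: a strategy with costs 0, 5 and 10 over the next three slots.
value = anticipated_reward([0.0, 5.0, 10.0], c_max=10.0, beta=0.5)
```

Because the rewards are an affine, decreasing function of the costs, the strategy with the largest anticipated reward is also the one with the smallest discounted cost over the same window.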



[Plot for Figure 3. x-axis: time slot; y-axis: normalized wind magnitude.]

Figure 3: A simple periodic wind pattern with period 6.

This, in turn, can be used to solve our problem of interest:

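\[
\min_{\mu_t, \mu_{t+1}, \ldots, \mu_{t+L}} \; C^t_{net}(\mu_t) + \sum_{i=1}^{L} \beta^i \, C^{t+i}_{net}(\mu_{t+i})
\]

subject to

\[
|\mu_{i+1} - \mu_i| \le R \quad \forall i.
\]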
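The per-step problem above can be sketched in Python as a small backward recursion over the lookahead window. This is only a minimal sketch under stated assumptions: generation is restricted to a finite grid `levels`, the net cost is supplied as a callable `cost(mu, x)`, and the previous output `mu_prev` lies within one ramp step of at least one grid level. The names `lookahead_step` and `quadratic_cost` and the example numbers are hypothetical.

```python
import numpy as np


def lookahead_step(mu_prev, loads, cost, levels, ramp, beta):
    """Pick the current generation level by minimizing discounted window cost.

    loads[0] is the current net load and loads[1:] are the looked-ahead loads.
    cost(mu, x) is an assumed per-step net cost; levels is an assumed grid.
    """
    levels = np.asarray(levels, dtype=float)
    # feasible[i, j] is True when moving from levels[i] to levels[j] obeys the ramp limit.
    feasible = np.abs(levels[:, None] - levels[None, :]) <= ramp + 1e-12
    # value[j]: discounted cost-to-go when sitting at levels[j] in the last window step.
    value = np.array([cost(m, loads[-1]) for m in levels])
    for k in range(len(loads) - 2, -1, -1):
        # Cheapest feasible continuation from each level, discounted by one step.
        cont = np.where(feasible, value[None, :], np.inf).min(axis=1)
        stage = np.array([cost(m, loads[k]) for m in levels])
        value = stage + beta * cont
    # The first move must also respect the ramp limit from the previous output.
    first_ok = np.abs(levels - mu_prev) <= ramp + 1e-12
    return float(levels[np.argmin(np.where(first_ok, value, np.inf))])


def quadratic_cost(mu, x):
    # Stand-in cost for illustration only: penalize mismatch with the net load.
    return (mu - x) ** 2


# With a lookahead of 2, the controller starts ramping before the load jumps.
next_mu = lookahead_step(0.0, [0.0, 3.0, 3.0], quadratic_cost,
                         levels=[0, 1, 2, 3], ramp=1.0, beta=0.9)
```

Only the first action of the minimizing strategy is returned, since the algorithm recomputes the strategy at every step.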

We show that our online algorithm can use the lookahead to achieve a low regret with respect to an optimal offline algorithm that has access to the entire $x_t$ sequence a priori.

Theorem 5. The online algorithm with a lookahead $L$ on future costs is asymptotically optimal for any $L = f(T) + \frac{\mu_{max}}{R}$ that satisfies $f(T) \to \infty$ as $T \to \infty$.

Proof. Essentially, we show that there exists an optimal choice of the discounting factor $\beta$ (that depends on the amount of lookahead $L$) such that the (future discounted) online algorithm has good performance in terms of undiscounted total reward. We then bound the sub-optimality of the algorithm. See Appendix B for the details of the proof.

This theorem shows that, for example, even $L = \log\log T + \frac{\mu_{max}}{R}$ is sufficient. Thus, it is the ratio $\frac{\mu_{max}}{R}$ that primarily determines the amount of lookahead needed for (asymptotic) optimality, though the rate at which optimality is achieved will increase with increasing $L$.
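To give a feel for how slowly the horizon-dependent term grows, the following Python sketch evaluates the example lookahead $\log\log T$ plus the ramp ratio for two horizons. It assumes natural logarithms and rounds up to a whole number of steps; both choices, as well as the function name `sufficient_lookahead` and the example values, are illustrative. The theorem itself is asymptotic, so these numbers are not guarantees for any finite $T$.

```python
import math


def sufficient_lookahead(horizon, mu_max, ramp):
    """Evaluate log log T + mu_max / R and round up to an integer step count."""
    return math.ceil(math.log(math.log(horizon)) + mu_max / ramp)


# Hypothetical generator: maximum output 10 units, ramp limit 2 units per step.
short_horizon = sufficient_lookahead(10 ** 6, mu_max=10.0, ramp=2.0)
long_horizon = sufficient_lookahead(10 ** 12, mu_max=10.0, ramp=2.0)
```

In this example the ramp ratio contributes 5 steps, while squaring the horizon adds only a single step, which is consistent with the remark that the ratio primarily determines the lookahead needed.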

5.2 Simulations

We now consider some simulations to quantify the benefits of lookahead and also to highlight some interesting differences between the future discounted algorithm analysed above and the undiscounted ($\beta = 1$) version. Recall that the undiscounted version was suggested for generator scheduling in [31].

For the purpose of the simulation we consider the simple wind power availability pattern shown in Figure 3. We consider the performance of three algorithms on this pattern: (i) the OCO algorithm from Section 4, (ii) the future discounted algorithm with lookahead from Section 5.1, and (iii) an algorithm that computes the optimal generation schedule with the same lookahead but without a discounting factor. The results of the simulation are shown in Figure 4.

[Plot for Figure 4. x-axis: lookahead $L$; y-axis: cost ratio. Curves: normalized cost of discounted opt; normalized cost of undiscounted opt; normalized cost of OCO without lookahead.]

Figure 4: A comparison of the performance of (i) the OCO algorithm from Section 4, (ii) the future discounted algorithm with lookahead from Section 5.1, and (iii) an algorithm that considers the optimal generation schedule with the same lookahead but without a discounting factor.

From Figure 4 we see that the performance of the OCO algorithm is reasonable even without any lookahead. One very interesting observation is that the performance of the default scheduling-with-lookahead algorithm, which does not discount future rewards, is non-monotonic in the amount of lookahead. That is, in some cases the performance of the algorithm may become worse with increasing lookahead. In Figure 4 this corresponds to increasing the lookahead from 2 to 3. Here the performance degrades at lookahead 3 because of the particular periodicity inherent in the signal in Figure 3. However, the same degradation is observed at different lookahead with the wind patterns in Figure 4. In contrast, the future discounted algorithm with optimal $\beta$ performs well across the different lookahead lengths.

6. A NOTE ON ONLINE OPTIMIZATION FOR THE DEMAND SIDE

Demand side management (DSM), the process of modification of loads by users [15], is seen as an important step in improving efficiency [27], reducing costs and risks to the market participants [12], increasing stability [25] and allowing larger amounts of renewable energy to be incorporated into the next generation smart grid [17].

The problem of intelligent microgrid scheduling may also be approached by allowing intelligent agents to schedule loads, subject to availability and user convenience. We need to develop simple load scheduling algorithms that are accompanied by performance guarantees. For such problems, our (non-stochastic) techniques could compare favorably to recent approaches that use stochastic control techniques for load scheduling [28, 16].

Rational users who participate in DSM programs would naturally optimize their usage to minimize cost while maximizing user utility [21]. However, since they are faced with volatile renewable availability and real time prices, there is a need for the design of online optimization algorithms for demand management that provide utility to the user under arbitrary fluctuations in supply, load and prices. The online optimization algorithms we study for generation scheduling have natural counterparts for the problems faced by DSM agents, and exploring these ideas is a promising direction of future work we intend to pursue.

7. CONCLUSIONS AND FUTURE WORK

In this paper we have demonstrated that the theory and algorithms developed for online optimization are useful for generator scheduling problems in the smart grid, with suitable extensions to account for practical constraints such as generator parameters and ramping constraints. We designed simple algorithms, derived guarantees on their performance under mild assumptions and showed that they perform well even when no predictions of the future are available. We showed how to incorporate predictions of the future renewable availability effectively into online generator scheduling algorithms, and quantified the benefits of lookahead. Interestingly, we showed that discounting the future is useful both as a proof technique and as a strategy for generator scheduling with ramping constraints.

In addition to load scheduling for cost minimization (based on reviewer comments), we are particularly interested in understanding online optimization algorithms for other applications, including voltage support, distribution losses, and energy storage management, that are particularly important in smart grids with a large penetration of renewable energy.

Finally, on the theoretical side, new online algorithms or time-varying discounting strategies that have better performance guarantees would be an interesting direction of future research.

Acknowledgements

We would like to thank Shivkumar Kalyanaraman for helpful comments during the preparation of this paper. We would also like to thank the anonymous reviewers who pointed out useful extensions of these ideas that we hope to explore in future work.

8. REFERENCES

[1] N. Abdel-Karim, M. Small, and M. Ilic. Short term wind speed prediction by finite and infinite impulse response filters: A state space model representation using discrete Markov process. In IEEE PowerTech, pages 1–8, July 2009.

[2] T. Ackermann, G. Andersson, and L. Soder. Distributed generation: a definition. Electric Power Systems Research, 57(3):195–204, 2001.

[3] D. P. Bertsekas. Dynamic Programming and Optimal Control, Vol. I, 2nd Ed. Athena Scientific, Belmont, MA, 2001.

[4] R. Bhuvaneswari, C. S. Edrington, D. A. Cartes, and S. Subramanian. Online economic environmental optimization of a microgrid using an improved fast evolutionary programming technique. In North American Power Symposium (NAPS), 2009, pages 1–6, Oct. 2009.

[5] Allan Borodin and Ran El-Yaniv. Online computation and competitive analysis. Cambridge University Press, 1998.

[6] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, New York, NY, USA, 2004.

[7] N. Cesa-Bianchi and G. Lugosi. Prediction, learning, and games. Cambridge University Press, Cambridge, England, 2006.

[8] D. Ernst, M. Glavic, F. Capitanescu, and L. Wehenkel. Reinforcement learning versus model predictive control: A comparison on a power system problem. IEEE Transactions on Systems, Man, and Cybernetics - Part B, 39(2):517–529, 2009.

[9] X. Fang, S. Misra, G. Xue, and D. Yang. Smart grid: The new and improved power grid: A survey. IEEE Communications Surveys & Tutorials, PP(99):1–37, 2011.

[10] Zwe-Lee Gaing. Particle swarm optimization to solving the economic dispatch considering the generator constraints. IEEE Transactions on Power Systems, 18(3):1187–1195, Aug. 2003.

[11] N. Hatziargyriou, H. Asano, R. Iravani, and C. Marnay. Microgrids. IEEE Power and Energy Magazine, 5(4):78–94, July-Aug. 2007.

[12] International Energy Agency. The power to choose - enhancing demand response in liberalised electricity markets - findings of IEA demand response project. 2003.

[13] T. S. Jayram, Tracy Kimbrel, Robert Krauthgamer, Baruch Schieber, and Maxim Sviridenko. Online server allocation in a server farm via benefit task systems (extended abstract). In Proceedings of the 33rd annual ACM Symp. on Theory Of Computing, pages 540–549, 2001.

[14] S.A. Kazarlis, A.G. Bakirtzis, and V. Petridis. A genetic algorithm solution to the unit commitment problem. IEEE Transactions on Power Systems, 11(1):83–92, Feb. 1996.

[15] D.S. Kirschen. Demand-side view of electricity markets. IEEE Transactions on Power Systems, 18(2):520–527, May 2003.

[16] I. Koutsopoulos and L. Tassiulas. Control and optimization meet the smart power grid - scheduling of power demands for optimal energy management. In 2nd International Conference on Energy-Efficient Computing and Networking, 2011.

[17] A.S. Kowli and S.P. Meyn. Supporting wind generation deployment with demand response. In IEEE Power and Energy Society General Meeting, pages 1–8, July 2011.

[18] J.A.P. Lopes, C.L. Moreira, and A.G. Madureira. Defining control strategies for microgrids islanded operation. IEEE Transactions on Power Systems, 21(2):916–924, May 2006.

[19] D.J.C. Mackay. Sustainable Energy - without the hot air. UIT, Cambridge, England, 2007.

[20] D. Mayne and J. Rawlings. Constrained model predictive control: Stability and optimality. Automatica, 36(6):789–814, 2000.

[21] A.-H. Mohsenian-Rad and A. Leon-Garcia. Optimal residential load control with price prediction in real-time electricity pricing environments. IEEE Transactions on Smart Grid, 1(2):120–133, Sept. 2010.

[22] C. Monteiro, R. Bessa, V. Miranda, A. Botterud, J. Wang, and G. Conzelmann. Wind power forecasting: State-of-the-art 2009. Technical report, Argonne National Laboratory, Decision and Information Sciences Division.

[23] M. Morari and J.H. Lee. Model predictive control: Past, present and future. Comput. Chem. Eng., 23(4):667–682, 1999.

[24] G. Pepermans, J. Driesen, D. Haeseldonckx, R. Belmans, and W. Dhaeseleer. Distributed generation: definition, benefits and issues. Energy Policy, 33(6):787–798, 2005.

[25] F. Rahimi and A. Ipakchi. Demand response as a market resource under the smart grid paradigm. IEEE Transactions on Smart Grid, 1(1):82–88, June 2010.

[26] M. Shahidehpour, H. Yamin, and Zuyi Li. Market Operations in Electric Power Systems: Forecasting, Scheduling, and Risk Management. Wiley-IEEE Press, 1st edition, March 2002.

[27] Kathleen Spees and Lester B. Lave. Demand response and electricity market efficiency. The Electricity Journal, 20(3):69–85, April 2007.

[28] R. Urgaonkar, B. Urgaonkar, M.J. Neely, and A. Sivasubramaniam. Optimal power cost management using stored energy in data centers. In Proceedings of the ACM SIGMETRICS, pages 221–232. ACM, 2011.

[29] P.P. Varaiya, F.F. Wu, and J.W. Bialek. Smart operation of smart grid: Risk-limiting dispatch. Proceedings of the IEEE, 99(1):40–57, Jan. 2011.

[30] R. Wiser and G. Barbose. Renewables portfolio standards in the United States - a status report with data through 2007. Technical report, Lawrence Berkeley National Laboratory.

[31] L. Xie and M.D. Ilic. Model predictive dispatch in electric energy systems with intermittent resources. In IEEE International Conference on Systems, Man and Cybernetics, pages 42–47, Oct. 2008.

[32] M. Zinkevich. Online convex programming and generalized infinitesimal gradient ascent. In ICML, pages 928–936, 2003.

9. ACKNOWLEDGEMENTS

We would like to thank the reviewers for their insightful comments that we hope to address in follow-up work.

APPENDIX

A. ADDITIONAL LEMMAS FOR OCO

We first give a high level overview of the proof structure. Lemma 6 and Lemma 7 together bound the size of the update step used by the online gradient descent algorithm. Lemma 8 uses this bound to select an optimal learning rate $\eta_t$, such that the regret grows as slowly as possible. This is used in Theorem 2 to obtain the final regret bound.

Lemma 6. For all $t \ge 1$, $u \in K$,

\[
\nabla C^t_{net}(y_{t+1})(\mu_{t+1} - u) \le \frac{(\mu_t - u)^2}{2} - \frac{(\mu_{t+1} - u)^2}{2} - \frac{(\mu_{t+1} - \mu_t)^2}{2}.
\]

Proof. Since $y_{t+1}$ is the unconstrained minimizer of $C^t_{net}(y) + \frac{(\mu_t - y)^2}{2}$, $\nabla C^t_{net}(y_{t+1}) = (\mu_t - y_{t+1})$. We then have

\[
\nabla C^t_{net}(y_{t+1})(\mu_{t+1} - u) = (\mu_t - y_{t+1})(\mu_{t+1} - u) \tag{23}
\]

\[
= \frac{(\mu_t - u)^2}{2} + \frac{(\mu_{t+1} - y_{t+1})^2}{2} - \frac{(y_{t+1} - u)^2}{2} - \frac{(\mu_t - \mu_{t+1})^2}{2}.
\]

By combining the middle two terms this gives the lemma.
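The inequality of Lemma 6 can be checked numerically in the scalar case. The sketch below assumes that the feasible set $K$ is an interval `[lo, hi]` and that the next iterate is the projection (clipping) of the unconstrained point $y_{t+1}$ onto that interval; both are assumptions made for illustration, as are the function name `lemma6_slack` and the sampled ranges. The argument `grad` stands for the gradient of the net cost at $y_{t+1}$.

```python
import random


def lemma6_slack(mu_t, grad, u, lo, hi):
    """Right-hand side minus left-hand side of the Lemma 6 inequality."""
    y_next = mu_t - grad                  # unconstrained point, so grad = mu_t - y_next
    mu_next = min(max(y_next, lo), hi)    # assumed projection onto K = [lo, hi]
    lhs = grad * (mu_next - u)
    rhs = ((mu_t - u) ** 2 - (mu_next - u) ** 2 - (mu_next - mu_t) ** 2) / 2.0
    return rhs - lhs


# Sample many feasible (mu_t, u) pairs and arbitrary gradients.
random.seed(0)
worst = min(
    lemma6_slack(random.uniform(0, 10), random.uniform(-20, 20),
                 random.uniform(0, 10), 0.0, 10.0)
    for _ in range(10000)
)
```

A non-negative `worst` (up to floating point error) is consistent with the lemma. The slack is zero whenever the unconstrained point already lies inside the interval, and positive only when the projection is active.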

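Lemma 7. For all $t \ge 1$ and $u \in K$,

\[
\nabla C^t_{net}(y_{t+1})(\mu_t - u) \le \frac{(u - \mu_t)^2}{2} - \frac{(u - \mu_{t+1})^2}{2} + \|\nabla C^t_{net}(y_{t+1})\|^2.
\]

Proof. Substitute $u = \mu_t$ in Lemma 6 and simplify,

\[
\nabla C^t_{net}(y_{t+1})(\mu_{t+1} - \mu_t) \le -(\mu_t - \mu_{t+1})^2. \tag{24}
\]

But, by Hölder's inequality we have

\[
\nabla C^t_{net}(y_{t+1})(\mu_{t+1} - u) \le |\nabla C^t_{net}(y_{t+1})| \, |(\mu_{t+1} - u)|. \tag{25}
\]

Thus, $\nabla C^t_{net}(y_{t+1})(\mu_{t+1} - u) \le \|\nabla C^t_{net}(y_{t+1})\|^2$. Combining this with the statement of Lemma 6 completes the proof.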

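Lemma 8. Relationship between learning rate $\eta_t$ and strong convexity parameter.

\[
\sum_{t=1}^{T} \nabla_t(\mu_t - u) \le \sum_{t=1}^{T} \left(\frac{1}{\eta_t} - \frac{1}{\eta_{t-1}}\right) \frac{1}{2}(u - \mu_t)^2 + \sum_{t=1}^{T} \eta_t \|\nabla_t\|^2.
\]

Proof. In Lemma 7, choose $\nabla_t = \frac{\nabla C_{net}(\mu_t)}{\eta_t}$. We have

\[
\sum_{t=1}^{T} \nabla_t(\mu_t - u) = \sum_{t=1}^{T} \frac{\nabla C_{net}(\mu_t)}{\eta_t}(\mu_t - u) \tag{26}
\]

\[
\le \sum_{t=1}^{T} \frac{1}{\eta_t}\left[\frac{1}{2}(\mu_t - u)^2 - \frac{1}{2}(\mu_{t+1} - u)^2 + \|\nabla C_{net}(\mu_t)\|^2\right].
\]

Simplifying, and using the non-negativity of the square terms gives us the final result.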

B. ONLINE SCHEDULING WITH LOOKAHEAD

The proof shows that the online algorithm with discounting appropriately considers the possible good paths so that the optimal offline algorithm cannot be much better.

Consider the strategies considered by ON at time $t$. Let $r_0, r_1, \ldots, r_L$ be the rewards associated with the $L$-lookahead strategy having the maximum anticipated reward at time $t$. Define $ON_t = r_0$ and $ON^L_{t+1} = \beta r_1 + \ldots + \beta^L r_L$. That is, $ON_t + ON^L_{t+1}$ represents the maximum anticipated reward across all $(L+1)$-length strategies that are available to ON at time $t$. Moreover, since ON follows this maximum anticipated reward strategy in the current step, a reward of $ON_t$ is amassed by ON at time $t$. Fix an optimal offline algorithm, and let $OPT_t$ denote the reward collected by this algorithm at time $t$. Further, let $OPT^L_{t+1}$ be a shorthand for $\beta \, OPT_{t+1} + \beta^2 \, OPT_{t+2} + \ldots + \beta^L \, OPT_{t+L}$. Note that one strategy competing with the strategy having the maximum anticipated reward, at time $t$, is to join the offline algorithm at time $t+1$ and then follow it for the next $L$ steps. As a consequence of ramping, it may not be possible for the online algorithm to mimic the optimal algorithm after one step; in particular, in the worst case, the online algorithm may have to wait for $\Delta = \frac{\mu_{max}}{R}$ steps (during which, in the worst case, it may not register any reward). Thus, one strategy competing with the strategy having the maximum anticipated reward, at time $t$, is to join the offline algorithm at time $t + \Delta$ and then follow it for the next $L - \Delta + 1$ steps. Therefore,

\[
ON_t + ON^L_{t+1} \ge OPT^{L-\Delta+1}_{t+\Delta}. \tag{27}
\]

Another strategy available with ON is to follow the strategy that had the maximum anticipated reward during the previous time step $t-1$. The contribution of the rewards in this strategy to the anticipated reward at time $t$ is larger by a factor $\alpha = \frac{1}{\beta}$ than their contribution at time $t-1$. Since ON follows the strategy having the maximum anticipated reward at time $t$,

\[
ON_t + ON^L_{t+1} \ge \alpha \, ON^L_t. \tag{28}
\]

Combining (27) and (28), we obtain

\[
ON_t + ON^L_{t+1} \ge ON^L_t + \left(1 - \frac{1}{\alpha}\right) OPT^{L-\Delta+1}_{t+\Delta}. \tag{29}
\]

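Summing over all time steps $t$ (and assuming suitable zero padding), we obtain the following telescopic sum:

\[
\sum_t ON_t \ge \left(1 - \frac{1}{\alpha}\right) \sum_t OPT^{L-\Delta+1}_{t+\Delta}
= \left(1 - \frac{1}{\alpha}\right) \left(\frac{1}{\alpha^{\Delta}} + \frac{1}{\alpha^{\Delta+1}} + \ldots + \frac{1}{\alpha^{L}}\right) \sum_t OPT_t
\]

\[
\Rightarrow \frac{\sum_t OPT_t}{\sum_t ON_t} \le \frac{\alpha^{L+1}}{\alpha^{L-\Delta+1} - 1} = g(\alpha, L). \tag{30}
\]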

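Expressing (30) in terms of cost (with $ON_t$ and $OPT_t$ now denoting the costs incurred at time $t$), we obtain

\[
\frac{\sum_t \left(1 - \frac{OPT_t}{C_{max}}\right)}{\sum_t \left(1 - \frac{ON_t}{C_{max}}\right)} \le g(\alpha, L).
\]

Assuming a time horizon of $T$, we get

\[
T - \frac{\sum_{t=1}^{T} OPT_t}{C_{max}} \le T g(\alpha, L) - g(\alpha, L) \frac{\sum_{t=1}^{T} ON_t}{C_{max}}
\]

\[
\Rightarrow \sum_{t=1}^{T} ON_t \le T C_{max} - \frac{T C_{max}}{g(\alpha, L)} + \frac{\sum_{t=1}^{T} OPT_t}{g(\alpha, L)}.
\]

Therefore, the total regret can be expressed as

\[
\sum_{t=1}^{T} ON_t - \sum_{t=1}^{T} OPT_t \le T C_{max} - \frac{T C_{max}}{g(\alpha, L)} - \sum_{t=1}^{T} OPT_t \left(1 - \frac{1}{g(\alpha, L)}\right),
\]

whence the average per-slot regret can be bounded as shown below:

\[
\frac{\sum_{t=1}^{T} ON_t - \sum_{t=1}^{T} OPT_t}{T} \le C_{max} \left(1 - \frac{1}{T}\right) \left(1 - \frac{1}{g(\alpha, L)}\right).
\]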

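In order to minimize the regret, we want to determine a value of $\alpha$ that minimizes $g(\alpha, L)$ for $L \ge \Delta$. It is straightforward to compute the optimal $\alpha$ using (30):

\[
\alpha^* = \left(\frac{L+1}{\Delta}\right)^{\frac{1}{L-\Delta+1}}
\]

which, in turn, implies

\[
\frac{1}{g(\alpha^*, L)} = 1 - \Theta\left(\frac{\log L}{L - \Delta + 1}\right).
\]

Hence, the average regret can be expressed in terms of $L$ as,

\[
\frac{\sum_{t=1}^{T} ON_t - \sum_{t=1}^{T} OPT_t}{T} \le C_{max} \left(1 - \frac{1}{T}\right) \Theta\left(\frac{\log L}{L - \Delta + 1}\right)
= C_{max} \left(1 - \frac{1}{T}\right) \Theta\left(\frac{\log L}{L - \frac{\mu_{max}}{R} + 1}\right) \tag{31}
\]

(using the fact that $\Delta = \frac{\mu_{max}}{R}$, since $|\mu_t| \le \mu_{max}$ and $|\mu_{t+1} - \mu_t| \le R \;\; \forall t \in [T]$). This gives the theorem.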
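The choice of discounting in this proof can be explored numerically. The Python sketch below reads the ratio bound of (30) as $\alpha^{L+1} / (\alpha^{L-\Delta+1} - 1)$, where $\alpha > 1$ is the reciprocal of the discount factor and $\Delta$ is the worst-case number of steps needed to catch up with the offline schedule. This reading, the symbol names, and the example values are assumptions made for illustration.

```python
def g(alpha, lookahead, delta):
    """Ratio bound from (30), valid for alpha > 1 and lookahead >= delta."""
    return alpha ** (lookahead + 1) / (alpha ** (lookahead - delta + 1) - 1.0)


def alpha_star(lookahead, delta):
    """Stationary point of g in alpha: alpha**(L - delta + 1) = (L + 1) / delta."""
    return ((lookahead + 1) / delta) ** (1.0 / (lookahead - delta + 1))


# Hypothetical setting: lookahead of 10 steps and a 2-step catch-up time.
L, delta = 10, 2
a = alpha_star(L, delta)
nearby = [g(a + d, L, delta) for d in (-0.05, -0.01, 0.01, 0.05)]

# The guaranteed fraction of the offline reward, 1 / g, for short and long lookahead.
fraction_short = 1.0 / g(a, L, delta)
fraction_long = 1.0 / g(alpha_star(1000, delta), 1000, delta)
```

Evaluating `g` at nearby values of `alpha` is a quick way to confirm that the closed form is a minimizer, and comparing the two fractions shows the guarantee approaching 1 as the lookahead grows with the catch-up time held fixed.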

C. OFFLINE OPTIMIZATION FOR GENERATOR SCHEDULING

We describe how the optimal offline algorithm that has access to perfect predictions of demand and renewable availability would minimize $\sum_{t=1}^{T} C^t_{net}(\mu_t)$. Though this is not realistic, we describe the ideal problem we would like to solve primarily owing to the interesting structure of the solution. The offline generator scheduling optimization problem is

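\[
\underset{\mu_1, \ldots, \mu_T}{\text{minimize}} \;\; \sum_{t=1}^{T} \left[ a\mu_t^2 + b\mu_t + c + \rho^{buy}_t (x_t - \mu_t)^+ - \rho^{sell}_t (\mu_t - x_t)^+ \right]
\]

subject to

\[
|\mu_t - \mu_{t-1}| \le R, \quad t = 1, \ldots, T.
\]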

The constraints only couple consecutive generation values $\mu_t, \mu_{t+1}$, $\forall t$. Based on this structure, we follow a dynamic programming [3] approach: we work backward from the last time step $T$ and recursively generate the cost function to be solved at the first time step $t = 1$, given the knowledge of the $x_t$'s. However, this approach has a major shortcoming in that the solution space is continuous, thereby requiring us to discretize the set of allowable $\mu_t$'s. We mention here that the piecewise quadratic nature of the cost function enables us to sidestep the continuous optimization problem, since the optimal point is always guaranteed to be among a finite set of points, and that the number of such points increases only as the number of time steps $T$.
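The backward recursion described above can be sketched in Python for the discretized variant, in which generation is restricted to a finite grid. This is only a minimal sketch under stated assumptions: it does not implement the finite-candidate-point refinement mentioned in the text, the per-step cost is passed in as a callable, and the names `offline_schedule`, `mismatch_cost`, `levels` and `mu0` (the generator output before the first step) are hypothetical.

```python
import numpy as np


def offline_schedule(loads, levels, ramp, cost, mu0):
    """Backward dynamic program over a grid of generation levels.

    loads[t] is the known net load at step t, cost(mu, x) an assumed
    per-step net cost, and mu0 the output before the first step.
    """
    levels = np.asarray(levels, dtype=float)
    n, horizon = len(levels), len(loads)
    # feasible[i, j]: moving from levels[i] to levels[j] obeys the ramp limit.
    feasible = np.abs(levels[:, None] - levels[None, :]) <= ramp + 1e-12
    value = np.zeros(n)                       # cost-to-go after the last step
    best_next = np.zeros((horizon, n), dtype=int)
    for t in range(horizon - 1, -1, -1):
        stage = np.array([cost(m, loads[t]) for m in levels])
        # q[i, j]: cost of choosing levels[j] at step t when coming from levels[i].
        q = np.where(feasible, (stage + value)[None, :], np.inf)
        best_next[t] = q.argmin(axis=1)
        value = q.min(axis=1)
    i = int(np.abs(levels - mu0).argmin())    # snap the initial output to the grid
    total = float(value[i])
    schedule = []
    for t in range(horizon):                  # roll the stored decisions forward
        i = best_next[t, i]
        schedule.append(float(levels[i]))
    return schedule, total


def mismatch_cost(mu, x):
    # Stand-in cost for illustration only: penalize mismatch with the net load.
    return (mu - x) ** 2


schedule, total = offline_schedule([0.0, 3.0, 3.0], [0, 1, 2, 3],
                                   ramp=1.0, cost=mismatch_cost, mu0=0.0)
```

The running time of this sketch is proportional to the horizon times the square of the grid size, which is the price of discretization that the finite-set observation in the text is meant to avoid.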
