
Evaluation of Costs for Procuring Reserve Capacity in a Deregulated Power System Using Multi-Agent Model

SATOSHI SUZUKI,1 HIROYUKI KITA,1 EIICHI TANAKA,1 and JUN HASEGAWA2

1Hokkaido University, Japan

2Hakodate National College of Technology, Japan

SUMMARY

In this paper, we assume two models for securing reserve capacity. One is the “Commitment-based Security Model” and the other is the “Reserve Market-based Security Model.” In the commitment-based security model, the ISO commits procurement of reserve energy to a particular generation company. In the reserve market-based security model, the ISO procures reserve energy through a reserve market. The main objective of this research is to investigate which model is preferable from the viewpoint of the consumer’s cost. To compare these models, two things are considered. One is the bidding behavior of the agents that bid in the energy market and the reserve market; to model this, multi-agent Q-learning is used. The other is unit commitment (UC), which is used to calculate the generation cost so that the cost of securing reserve power can be evaluated more precisely. © 2009 Wiley Periodicals, Inc. Electr Eng Jpn, 167(1): 18–25, 2009; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/eej.20705

Key words: operating reserve; commitment-based security model; reserve market-based security model; unit commitment; Q-learning.

1. Introduction

Due to deregulation of the power market, generation companies other than common electric power suppliers (utility providers) can participate in power transactions, and power trade is implemented via the transmission/distribution networks of the utility providers. Presently, power system operation is under the control of the utility providers, and the operating reserve, too, is assured by the utility providers [1]. In the USA and Europe, power system operation has already achieved autonomy in the form of independent system operators (ISOs), and reserve power is often procured via the market [2, 3].

We examine an optimal way of providing reserve power in a liberalized power environment. We consider two models: in the first, procurement of reserve power is entrusted to particular generation companies (commitment-based security model), and in the second, reserve power is procured on the market (reserve market-based security model). Both models have already been compared in terms of the consumer’s power procurement cost by calculation of the total profit of the generation companies [4]. However, no attention was devoted to the profit maximization strategies chosen by competitors, and therefore there still remains the issue of the equilibrium state when all generation companies maximize their profits. That is, a Nash equilibrium solution (in the game-theoretic sense) has not been obtained.

In this study, we perform simulations of the bidding behavior of market participants using multi-agent Q-learning [5–7], a numerical technique for finding the Nash equilibrium solution. Thus, we aim at quantitative estimation of how the number and share of market newcomers affect costs in the two reserve power procurement models. In addition, we propose a simulation model for more realistic evaluation of reserve power by introducing unit commitment [8] into the generation cost calculation.

Most previous research on reserve power procurement in a competitive environment deals with the bidding strategies of generation companies [9, 10] or with institutional design of the reserve power market [11, 12]. Few, if any, researchers consider the evaluation of reserve power from the standpoint of the consumer’s cost.

2. Reserve Power Procurement

The concept of the simulation model used in this study is illustrated in Fig. 1. This model assumes generation companies of two kinds: utility providers that possess multiple generators, and newcomers that have only a few generators. In addition, the demand includes three components: regulated energy demand, deregulated energy demand, and reserve demand.

Electrical Engineering in Japan, Vol. 167, No. 1, 2009. Translated from Denki Gakkai Ronbunshi, Vol. 127-B, No. 3, March 2007, pp. 459–466.

In the case of regulated energy demand, the utility providers have supply obligations, while in the case of deregulated energy demand, the utility providers and newcomers make bids in the market. In this study, we assume for simplicity that there are only two time slots in the market, on-peak (9:00 to 21:00) and off-peak (21:00 to 9:00). In addition, the bidding volume is expressed as a fraction of the deregulated energy demand in the on-peak and off-peak time slots, and its unit is the rate: for example, to secure a demand of 0.1 rate in the on-peak market means supplying 10% of the deregulated demand from 9:00 to 21:00.

The reserve demand is set to 10% of the whole system demand (that is, the sum of regulated and deregulated demand) to be purchased by independent system operators (ISOs). In the commitment-based security model, ISOs pay the additional cost (here assumed to be known) borne by the utility providers to assure the operating reserve. In the reserve market-based security model, ISOs pay the cost of assuring reserve power in the reserve market. The actual cost of producing an operating reserve is not considered. In the reserve market, too, the bidding volume of the operating reserve is expressed in rates (the reserve market volume is taken as 1), and the same share of the operating reserve is assured throughout the day. In addition, bidding in the reserve market is performed simultaneously with bidding in the energy market, and settlement is performed later.

3. Determination of Bidding Strategies Based on Q-Learning

A calculation flowchart of multi-agent Q-learning is shown in Fig. 2. Q-learning is a reinforcement learning technique based on numerical rewards given by the environment. In particular, actions are corrected gradually so as to maximize the expected rewards [5–7]. In Q-learning, the expected reward obtained by action selection is called the action-value function or Q-value. There are various methods of selecting actions with respect to the Q-value. In this study, the epsilon-greedy algorithm is used: an action is selected at random with probability ε, and the action offering the highest Q-value is selected with probability (1 – ε).

When the actions of all agents have been determined,market clearing is performed, and a reward R is given toevery agent. Based on the reward, an agent corrects theQ-value [Q(a)] by the following expression [5–7], where ηis the learning speed and k is the number of iterations:
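As an illustration only, the following minimal Python sketch shows epsilon-greedy selection together with a stateless Q-value update. It assumes the common form Q_{k+1}(a) = (1 − η) Q_k(a) + η R, which may differ in detail from Eq. (1); the function names, the three hypothetical bidding options, and the noisy reward are assumptions made for the example and stand in for the market clearing of the actual model.

```python
import random

def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    best = max(q_values)
    # Break ties among equally good actions at random.
    return rng.choice([a for a, q in enumerate(q_values) if q == best])

def update_q(q_values, action, reward, eta):
    """Assumed stateless update: Q_{k+1}(a) = (1 - eta) * Q_k(a) + eta * R."""
    q_values[action] = (1.0 - eta) * q_values[action] + eta * reward

if __name__ == "__main__":
    rng = random.Random(0)
    true_profit = [1.0, 3.0, 2.0]  # hypothetical mean profit of each bidding option
    q = [0.0] * len(true_profit)
    for k in range(5000):
        a = select_action(q, epsilon=0.1, rng=rng)
        reward = true_profit[a] + rng.gauss(0.0, 0.5)  # noisy stand-in for market clearing
        update_q(q, a, reward, eta=0.05)
    print("learned best option:", q.index(max(q)))
```

In the multi-agent setting, every agent would keep its own Q-table and the reward of each agent would depend on the actions chosen by all the others.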

When Q-learning is used to determine the bidding behavior of the generation companies in both models, the available bidding options must be specified to the participants in advance. In this study, for both the energy market and the reserve market, each agent (utility provider or newcomer) j submits a straight line that expresses the relationship between the price λ ($/rate) and the volume P (rate) as in Eq. (2), and takes an action by selecting the coefficients a_j and b_j (see Fig. 3):

Here PRICE_MAX is a predefined upper limit of the market price. In addition, upper limits $P^{\mathrm{on}}_{\mathrm{bid\,max},j}$ and $P^{\mathrm{off}}_{\mathrm{bid\,max},j}$ for the bidding volume in the on-peak and off-peak energy markets can be selected arbitrarily as shown in Eq. (3). Here $P^{\mathrm{on}}_{\mathrm{bid\,max},j}$ and $P^{\mathrm{off}}_{\mathrm{bid\,max},j}$ are the supply capacities of agent j for the respective markets:

Fig. 1. Simulation model.

Fig. 2. Flowchart of Q-learning.
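The bidding rule described above can be sketched as follows. This is a rough illustration, not the paper's market model: it assumes the bid line has the form λ = a_j P + b_j, that demand is a fixed 1.0 rate (the demand curves of Fig. 6 are not modeled), and that the market is cleared at a uniform price found by bisection. The function names and the example bids are hypothetical.

```python
PRICE_MAX = 500_000.0  # upper price limit ($/rate), as in the simulation settings

def supply_at_price(price, a, b, p_max):
    """Volume (rate) offered at `price` by a bid line price = a * P + b, limited to [0, p_max]."""
    if a <= 0.0:
        # Flat bid: the whole volume is offered once the price reaches b.
        return p_max if price >= b else 0.0
    return min(max((price - b) / a, 0.0), p_max)

def clear_market(bids, demand=1.0, tol=1e-9):
    """Uniform-price clearing by bisection; `bids` is a list of (a, b, p_max) tuples.

    Simplification: no pro-rata rationing is applied if flat bids oversupply.
    """
    total = lambda price: sum(supply_at_price(price, *bid) for bid in bids)
    if total(PRICE_MAX) < demand:
        price = PRICE_MAX  # not enough supply: the price hits the cap
    else:
        lo, hi = 0.0, PRICE_MAX
        while hi - lo > tol * PRICE_MAX:
            mid = 0.5 * (lo + hi)
            if total(mid) >= demand:
                hi = mid
            else:
                lo = mid
        price = hi
    volumes = [supply_at_price(price, *bid) for bid in bids]
    return price, volumes

if __name__ == "__main__":
    # Hypothetical bids: one large supplier and one small newcomer capped at 0.3 rate.
    bids = [(200_000.0, 100_000.0, 1.0), (100_000.0, 50_000.0, 0.3)]
    price, volumes = clear_market(bids)
    print(round(price), [round(v, 3) for v in volumes])
```

In this example the newcomer's capacity limit binds first, so the larger supplier sets the clearing price for the remaining 0.7 rate.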

In Q-learning, the reward R required to update the agent’s Q-value is defined as revenue less generation cost. The latter is found by solving the following optimization problem, which takes unit commitment into account [8].

° Objective function

Here

° Constraints
• Upper/lower limit of generation output
• Minimum up/down time constraint
• Demand–supply balance constraint
• Operating reserve constraint

Here GC is the generation cost, i is the generator number, I is the total number of generators, t is the time slot number, T is the number of time slots, F is the fuel cost, ST is the starting cost, $P_i^t$ is the output of generator i in time slot t, $U_{i,t}$ is the state of generator i in time slot t (operation: 1, shutdown: 0), HST is the hot start cost, CST is the cold start cost, $T_{i,\mathrm{off}}$ is the shutdown time, $T_{i,\mathrm{down}}$ is the minimum down time, and $T_{i,\mathrm{cold}}$ is the shutdown time for cold start.
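To make the roles of these symbols concrete, the sketch below evaluates the generation cost of a given on/off schedule. It is an illustration under assumptions that follow the standard unit commitment test formulation rather than a confirmed copy of Eqs. (4) to (6): a quadratic fuel cost F = a + bP + cP², hourly time slots, and a hot start when the unit has been off for at most T_down + T_cold hours (cold start otherwise). The dictionary keys and function names are hypothetical, and the sketch only prices a schedule; it does not search for the optimal one or check the constraints.

```python
def startup_cost(hours_off, t_down, t_cold, hst, cst):
    """Hot start if the unit was off for at most t_down + t_cold hours, else cold start."""
    return hst if hours_off <= t_down + t_cold else cst

def generation_cost(schedule, outputs, units, initial_off_hours):
    """Total cost GC of a given schedule.

    schedule[i][t] is U_{i,t} (1 = running, 0 = shut down), outputs[i][t] is P_i^t,
    and units[i] holds the hypothetical keys a, b, c (fuel-cost coefficients),
    hst, cst, t_down, and t_cold.
    """
    total = 0.0
    for i, unit in enumerate(units):
        hours_off = initial_off_hours[i]  # 0 if the unit was running beforehand
        for t, on in enumerate(schedule[i]):
            if on:
                p = outputs[i][t]
                total += unit["a"] + unit["b"] * p + unit["c"] * p * p  # fuel cost F
                if hours_off > 0:  # the unit was off in the previous slot: start-up
                    total += startup_cost(hours_off, unit["t_down"], unit["t_cold"],
                                          unit["hst"], unit["cst"])
                hours_off = 0
            else:
                hours_off += 1
    return total

if __name__ == "__main__":
    unit = {"a": 100.0, "b": 10.0, "c": 0.01, "hst": 50.0, "cst": 100.0,
            "t_down": 2, "t_cold": 1}
    # One unit, off for 1 hour beforehand, off in slot 0, then running at 10 and 20 MW.
    print(generation_cost([[0, 1, 1]], [[0.0, 10.0, 20.0]], [unit], [1]))
```

An optimizer such as the evolutionary programming method of [8] would call a cost function of this kind repeatedly while searching over feasible schedules.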

The operating reserve is defined as the difference between the total maximum output of the paralleled (committed) generators and their total actual output. In every time slot, this margin must be at least the reserve power demand acquired in the reserve market. In this study, the aforementioned optimization is performed by evolutionary programming [8].

Hence, the Q-learning procedure of a generation company can be summarized as in Fig. 4. Based on Q-learning, the generation companies submit bidding curves for on-peak demand, off-peak demand, and reserve demand (step 1). As a result of bidding by all companies, the settlement price (yen/rate) and the obtained demand (rate) are determined (step 2), and the revenue is evaluated (step 3). At the same time, every company sets its optimal unit commitment schedule so as to meet the obtained demand at minimum cost (step 4). The cost is then evaluated (step 5). Based on the evaluations in steps 3 and 5, the profit is calculated (step 6) and used as feedback to update the Q-value. By repetition of this procedure, the generation companies can learn bidding strategies that are optimized for the given number of market newcomers and the market demand.

Solving the above unit commitment optimization inside every Q-learning iteration would take a large amount of computing time. Therefore, in this study, we perform the optimization for assumed values of on/off-peak demand and reserve demand prior to Q-learning, and store the resulting costs in memory. Then, in the course of Q-learning, the stored cost values are looked up according to the obtained demand. Because the cost cannot be precomputed for every possible demand, the calculation is performed at appropriate intervals (for example, 0.1 rate), and linear interpolation is used to obtain the intermediate values.
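The precompute-and-interpolate idea can be sketched in one dimension as follows. In the study the stored cost depends on three quantities (on-peak demand, off-peak demand, and reserve demand), so the real table would be three-dimensional; the single-argument version below, with hypothetical function names and a stand-in cost function, is only meant to show the mechanism.

```python
def build_cost_table(cost_fn, step=0.1, upper=1.0):
    """Evaluate the expensive cost function once on a regular grid of demand levels."""
    n = round(upper / step)
    grid = [k * step for k in range(n + 1)]
    return grid, [cost_fn(x) for x in grid]

def interpolate_cost(grid, costs, demand):
    """Piecewise-linear lookup of the stored costs; demand is clamped to the grid."""
    demand = min(max(demand, grid[0]), grid[-1])
    step = grid[1] - grid[0]
    k = min(int((demand - grid[0]) / step), len(grid) - 2)
    weight = (demand - grid[k]) / (grid[k + 1] - grid[k])
    return (1.0 - weight) * costs[k] + weight * costs[k + 1]

if __name__ == "__main__":
    # Stand-in for the unit commitment optimization (hypothetical cost curve).
    expensive_cost = lambda demand: 100.0 * demand * demand
    grid, costs = build_cost_table(expensive_cost, step=0.1)
    # During learning, only the cheap lookup is executed.
    print(interpolate_cost(grid, costs, 0.25))
```

The trade-off is the usual one: a finer grid costs more optimization runs up front but reduces the interpolation error seen by the learning agents.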

4. Analysis of Bidding Strategies for Both Reserve Power Models

We performed simulations for a single utility provider, while varying the number of newcomers from 0 to 30. Based on the optimal actions of every agent acquired by learning, we found the market price, generation company profit, and social cost as explained below.

Fig. 3. Bidding curve.

Fig. 4. Learning of bidding strategies of generation companies.

4.1 Input data

(1) Demand. The assumed maximum values of the regulated demand, on/off-peak deregulated demand, and reserve demand are shown in Fig. 5. The data were produced using the standard load model of IEE Japan [13]. In addition, the demand curves are given in Fig. 6. The demand presented in Fig. 5 corresponds to 1.0 on the horizontal axis in Fig. 6. In the simulation, the upper price limit PRICE_MAX for the on/off-peak energy market and the reserve market was set to 500,000 $/rate.

(2) Generator characteristics. We assume that the utility provider has 10 generators, and that every newcomer has 1 generator, with the respective characteristics (units 1 to 10 for the utility provider and unit 8 for each newcomer) shown in Table 1. Thus, we consider that the utility provider has all kinds of generators, from basic large-capacity units to small-capacity units for peak-time (additional) power supply, and that the newcomers have only comparatively small-capacity units. In addition, the data are arranged so that the total demand including reserve can be satisfied by the utility provider alone.

The characteristics listed in Table 1 are often used as test data for unit commitment scheduling by the IEEE [8]. Here the initial status expresses how many hours the generator has been operated or shut down continuously since the previous day.

(3) Bidding behavior. The coefficients A, B, C related to the intervals of bidding behavior of the utility provider and the newcomers [see Eq. (2)] in the energy market (EM) and the coefficients A, B related to the reserve market (RM) were set as follows.

Utility provider
EM: (A, B, C) = (2, 5, 0)
RM: (A, B) = (0, 10)

Newcomer
EM: (A, B, C) = (0, 5, 3)
RM: (A, B) = (0, 10)

In addition, the number of iterations in Q-learning was set to 10,000,000. However, considering the possibility of insufficient convergence, the results of three calculations were averaged.

4.2 Results and discussion

(1) Market price. The market prices obtained for the commitment-based security model and the reserve market-based security model are shown in Figs. 7 and 8, respectively. In both cases, the market price approaches the set upper limit (500,000 $/rate) when the number of newcomers is small, and decreases as more newcomers become involved in the market. Thus, it appears that when multiple generation companies are given the opportunity to utilize their supply capacity efficiently by competition, the total supply capacity for the same price increases, and the market price goes down. In further simulations for various upper price limits, the same pattern was observed as in Figs. 7 and 8.

Fig. 5. Assumed demand.

Table 1. Characteristics of generators

Fig. 6. Demand curves.

(2) Generation company profit. The profits of the newcomers and the utility provider are shown in Figs. 9 and 10, respectively; in the former case, the average profit per newcomer is given as well. Here a negative profit of the utility provider appears because the utility provider also supplies the regulated energy market, and the profit thus gained is not considered here.

From the above results, one can conclude that

• In both market models, the profit of the utility provider and the average profit of the newcomers decrease as the number of newcomers increases.

This can be attributed to lowered market prices due to competition, as mentioned above. In addition, it is clear that

• When there are few newcomers, the reserve market-based security model yields higher profits for both the utility provider and the newcomers.

For the utility provider, the main reason for the profit increase is the trading of reserve power in addition to large-scale energy sales. On the other hand, in the case of a newcomer with limited generation capacity, we may consider the following two reasons. First, when part of the generator capacity is kept as an operating reserve, the generated output decreases accordingly, resulting in a saving of generation cost. Second, when the market prices for energy and reserve power are the same, the reference demand volume corresponding to 1 rate is smaller for the operating reserve than for energy (on-peak energy demand: 6608 MWh, off-peak energy demand: 8916 MWh, reserve demand: 2953.8 MWh). Therefore, the price per unit output ($/MWh) is higher for the operating reserve than for energy, and revenue is increased by selling part of the output as operating reserve.

Fig. 7. Market price (commitment-based security model).

Fig. 8. Market price (reserve market-based security model).

Fig. 9. Profit (newcomer).

Fig. 10. Profit (utility provider).
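The second reason can be checked with simple arithmetic. The short Python sketch below converts a common price in $/rate into $/MWh using the reference volumes quoted in the text; choosing the upper limit of 500,000 $/rate as the common price is an assumption made for the example, and the function name is hypothetical.

```python
# Reference energy volumes corresponding to 1.0 rate in each market (MWh), from the text.
REFERENCE_MWH = {"on-peak": 6608.0, "off-peak": 8916.0, "reserve": 2953.8}

def price_per_mwh(price_per_rate, market):
    """Convert a market price in $/rate into $/MWh for the given market."""
    return price_per_rate / REFERENCE_MWH[market]

if __name__ == "__main__":
    common_price = 500_000.0  # same $/rate price assumed in all three markets
    for market in REFERENCE_MWH:
        print(f"{market:9s} {price_per_mwh(common_price, market):8.2f} $/MWh")
```

At this common price the conversion gives roughly 76 $/MWh on-peak, 56 $/MWh off-peak, and 169 $/MWh for reserve, so a unit of capacity sold as reserve earns more than twice what it would earn in either energy market.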

In addition, the total profit of the newcomers in the commitment-based security model drops sharply as their number increases from 7 to 10, and the profit of the utility provider increases. Therefore,

• The profit of the utility provider is higher in the reserve market-based security model when there are few newcomers, and the commitment-based security model proves more profitable as the number of newcomers increases.

The hourly generation unit cost of the utility provider and newcomers including only the fuel cost (without the starting cost) is illustrated in Fig. 11. As can be seen from the diagram, comparing a utility provider with 10 generators and newcomers having just one generator each, the unit generation cost of the utility provider varies only slightly with the output, while that of a newcomer decreases substantially as the output grows. This is because the generation efficiency of a thermal power plant usually drops as the output is decreased [14], but the generation efficiency of multiple generators can be maintained at a certain level by a unit commitment schedule adjusted to demand. Therefore, in the commitment-based security model, where newcomers participate only in the energy market, procuring as much demand as possible within the capacity limit is desirable, and obtaining profit in the market becomes impossible when the price falls so low that bidding cannot be successful even at the maximum output price (because of increasing competition). In this case, the newcomers are eliminated from the market, and the utility providers recapture demand, thus increasing their profits. On the other hand, in the reserve market-based security model, even though a newcomer can procure demand only for part of its generation capacity, the rest can be used for bidding in the reserve market. As a result, the newcomers obtain a steadier position in the market.

(3) Consumer’s cost. Now we compare the two models in terms of the cost of energy consumption by customers (here referred to as the consumer’s cost). We consider the energy purchase cost plus the ancillary service cost, that is, the cost of reserve power borne by the ISO. We assume that the ancillary service cost is related to energy consumption, and is eventually borne by the customers. Therefore, the consumer’s costs TC in the commitment-based security model and reserve market-based security model are defined as follows:

Here EC and RC are, respectively, the consumer’s energy purchase cost in a deregulated market and the ISO’s operating reserve procurement cost, defined as follows:

Here the subscripts CSM and RMSM denote the commitment-based security model and the reserve market-based security model, respectively.

Similarly to the above expression,

Here λ and P denote the market price and the settlement quantity, respectively, and the subscripts onP, offP, and res denote the on-peak energy market, the off-peak energy market, and the reserve market. In addition, $GC_U(P_{U,\mathrm{onP}}, P_{U,\mathrm{offP}}, P_R)$ expresses the generation cost when the utility provider obtains on-peak and off-peak demand of $P_{U,\mathrm{onP}}$ (rate) and $P_{U,\mathrm{offP}}$ (rate), respectively, while submitting an operating reserve of $P_R$ (rate).
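The following Python sketch shows one way the quantities described above could fit together. It is a reading of the prose, not a reproduction of Eqs. (7) to (11): it assumes TC = EC + RC in both models, EC as the sum of price times settled quantity over the on-peak and off-peak markets, RC in the reserve market-based model as the reserve price times the settled reserve quantity, and RC in the commitment-based model as the utility provider's additional generation cost of holding the reserve. All function names and the example cost function are hypothetical.

```python
def energy_cost(price_on, volume_on, price_off, volume_off):
    """EC: on-peak plus off-peak payments in the deregulated energy market."""
    return price_on * volume_on + price_off * volume_off

def reserve_cost_csm(gc_utility, p_on, p_off, p_res):
    """RC (commitment-based), assumed to be the utility's extra cost of holding p_res."""
    return gc_utility(p_on, p_off, p_res) - gc_utility(p_on, p_off, 0.0)

def reserve_cost_rmsm(price_res, volume_res):
    """RC (reserve market-based), assumed to be the reserve-market payment."""
    return price_res * volume_res

def consumer_cost(ec, rc):
    """TC: energy purchase cost plus reserve procurement cost."""
    return ec + rc

if __name__ == "__main__":
    # Hypothetical utility cost function GC_U(P_onP, P_offP, P_R); not from the paper.
    gc_u = lambda p_on, p_off, p_res: 100.0 * p_on + 80.0 * p_off + 30.0 * p_res
    ec = energy_cost(400.0, 1.0, 300.0, 1.0)
    tc_csm = consumer_cost(ec, reserve_cost_csm(gc_u, 0.6, 0.7, 1.0))
    tc_rmsm = consumer_cost(ec, reserve_cost_rmsm(250.0, 1.0))
    print(tc_csm, tc_rmsm)
```

Separating EC from RC in this way mirrors the comparison made in the results, where the two models differ mainly in the reserve procurement term.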

TC, EC, and RC are shown in Fig. 12. The diagram indicates that

• When newcomers are few, the commitment-based security model results in lower consumer costs.

This is because all market prices rise to the upper limit, and the reserve market-based security model with one additional market produces a higher consumer cost.

In addition,

• The difference in consumer cost between the two models decreases as more newcomers become involved in the market, but the reserve market-based security model does not show a lower consumer cost even in a more competitive environment.

Fig. 11. Fuel cost of generation company.

EC is almost the same in the two models, but RC is significantly greater in the reserve market-based security model. We may consider the following two reasons. First, the utility provider can provide an operating reserve at a small cost because when operating multiple generators, it can use the excess generated power at its discretion. Second, the newcomers’ generation price increases significantly as the output is reduced. Since EC is nearly the same in both models, the energy price is nearly the same as well. As explained above, in the commitment-based security model the utility provider is attempting to eliminate newcomers from the market, and therefore we may assume that the energy price is close to the newcomers’ generation price at maximum output. In the reserve market-based security model, newcomers participate in the reserve market, and hence they submit less power to the energy market. As a result, the newcomers’ generation price increases, and they would incur losses by dealing only in the energy market. Hence, the newcomers must compensate for their losses by dealing in the reserve market, which may result in an increase of the ISO’s operating reserve procurement cost.

5. Conclusions

We assumed two models of reserve power procurement, the commitment-based security model and the reserve market-based security model. We used these models to simulate the bidding behavior of generation companies, and improved the calculation of the generation cost by the introduction of unit commitment. By simulations, we estimated quantitatively how the market price decreases due to competition. We also found that increasing competition in the commitment-based security model results in the elimination of newcomers from the market. On the other hand, in the reserve market-based security model, newcomers have a steadier position. We assumed that with a fixed number of newcomers, they can efficiently utilize their excess power in the reserve market-based security model. However, even when many newcomers participate in the market, the commitment-based security model results in lower consumer costs due to the influence of the generation price.

In this study we considered two energy markets (on-peak demand for 12 hours and off-peak demand for 12 hours) and one reserve market (daily reserve demand). Therefore, a generation company must adjust its power generation schedule according to the hourly variation of the obtained demand. More flexible operation would be possible in the case of an hourly demand market. That is, the market models employed in this study impose stricter operation constraints than those occurring in actual practice. However, these constraints apply equally to the utility providers and newcomers, and therefore our conclusions would hold true for the hourly demand market as well.

In addition, in this study we assumed that the reserve power procurement (minimum) cost of the utility provider is known. This is because when estimating the cost of the reserve market-based security model, we considered reserve power procurement on a noncompetitive cost basis as a reference for comparison. Whether this reserve power procurement cost is actually available is a different issue to be examined.

We further assumed that newcomers have only small-scale generators for peak-time supply, and a similar estimate is also needed for other types of generators.

REFERENCES

1. Electricity Industry Committee, URL: http://www.enecho.meti.go.jp/denkihp/index.html

2. Yokoyama R. Liberalization of electricity market and technological issues. TDU; 2001. (in Japanese)

3. Nambu T. A design for liberalizing electricity markets. University of Tokyo Press; 2003. (in Japanese)

4. Monden T, Kita H, Nishiya K, Hasegawa J. A study on the cost evaluation of securing the reserve capacity under a competitive environment. IEEJ Trans PE 2005;125:73–80. (in Japanese)

5. Shimomura T, Saisho Y, Fujii Y, Yamaji K. Analysis of pricing process in electricity market using multi-agent model. IEEJ Trans PE 2004;124:281–290. (in Japanese)

6. Oouchi A, Yamamoto M, Kawamura H. Theory and application of multi-agent systems: Computing paradigm from complex systems engineering. Corona Publishing; 2002. (in Japanese)

7. Takadama K. Multiagent learning: Exploring potentials embedded in interaction among agents. Corona Publishing; 2003. (in Japanese)

Fig. 12. Consumer’s cost.

8. Juste KA et al. An evolutionary programming solution to the unit commitment problem. IEEE Trans Power Syst 1999;14:1452–1459.

9. Wen F, David AK. Coordination of bidding strategies in day-ahead energy and spinning reserve markets. Electrical Power & Energy Systems 2002;24:251–261.

10. Attaviriyanupap P, Kita H, Tanaka E, Hasegawa J. A hybrid LR-EP for solving new profit-based UC problem under competitive environment. IEEE Trans Power Syst 2003;18:229–237.

11. Wang J, Wang X. Operating reserve model in the power market. IEEE Trans Power Syst 2005;20:223–229.

12. Gan D, Litvinov E. Energy and reserve market designs with explicit consideration to lost opportunity costs. IEEE Trans Power Syst 2003;18:53–59.

13. IEEJ Standard System Models: Suburban System Model, Table 2.5(a) “Load Curve Data”, URL: http://www.iee.or.jp/pes/model

14. JSER energy and resources handbook. Ohm Press; 1996.

AUTHORS (from left to right)

Satoshi Suzuki (student member) completed the M.E. program at Hokkaido University in 2006 and joined Sumitomo Corp. His student research interest concerned reserve power procurement in power systems.

Hiroyuki Kita (member) completed the M.E. program at Hokkaido University in 1988 and joined the faculty as a research associate in 1989. He was appointed a professor in 2005. He received a 1997 IEEJ Paper Award. His research interests are planning, operation, and control of power systems. He holds a D.Eng. degree, and is a member of IEEE, ORSJ, IEIEJ, and IEIJ.

Eiichi Tanaka (member) completed the M.E. program at Hokkaido University in 1977 and joined the faculty as a research associate. His research interests are analysis and control of power systems. He is a member of ORSJ, IEIEJ, and SICE.

Jun Hasegawa (member) completed the doctoral program at Hokkaido University in 1971 and joined the faculty as a lecturer. He was appointed a professor in 1985. Since 2005 he has been president of Hakodate National College of Technology. He received an IEEJ Paper Award in 1997. He was IEEJ Society B president in 1994, IEEJ Acting Chairperson in 2004, and IEEJ Chairperson in 2005. His research interests are planning, operation, analysis, and control of power systems, and energy engineering. He holds a D.Eng. degree, and is a member of IEEE, IEIJ, JSER, CAJ, ORSJ, IEIEJ, and others.
