16. Markovian Demand Inventory Model

Optimality of (s, S) Policies in Inventory Models with Markovian Demand

Suresh P. Sethi and Feng Cheng
Faculty of Management, University of Toronto
246 Bloor St. W., Toronto, Ontario, Canada M5S 1V4

September 1, 1993; First Revision November 28, 1994; Second Revision February 3, 1995; Third Revision November 23, 1995

Abstract

This paper is concerned with a generalization of classical inventory models (with fixed ordering costs) that exhibit (s, S) policies. In our model, the distribution of demands in successive periods is dependent on a Markov chain. The model includes the case of cyclic or seasonal demand. The model is further extended to incorporate some other realistic features such as no-ordering periods and storage and service level constraints. Both finite and infinite horizon nonstationary problems are considered. We show that (s, S) policies are optimal for the generalized model as well as for its extensions.

To appear in Operations Research.

(DYNAMIC INVENTORY MODEL, MARKOV CHAIN, DYNAMIC PROGRAMMING, FINITE HORIZON, NONSTATIONARY INFINITE HORIZON, CYCLIC DEMAND, (s, S) POLICY)

* This research was supported in part by NSERC grant A4619 and the Canadian Centre for Marketing Information Technologies. The authors would like to thank Vinay Kanetkar, Dmitry Krass, Ernst Presman, Dirk Beyer, four anonymous referees, and two anonymous Associate Editors for their comments on earlier versions of this paper.


One of the most important developments in inventory theory has been to show that (s, S) policies are optimal for a class of dynamic inventory models with random periodic demands and fixed ordering costs. Under an (s, S) policy, if the inventory level at the beginning of a period is less than the reorder point s, then a sufficient quantity must be ordered to achieve an inventory level S, the order-up-to level, upon replenishment. However, in working with some real-life inventory problems, we have observed that some of the assumptions required for inventory models exhibiting (s, S) policies are too restrictive. It is our purpose, therefore, to relax these assumptions toward more realism and still demonstrate the optimality of (s, S) policies.

The nature of the demand process is an important factor that affects the type of optimal policy in a stochastic inventory model. With the possible exceptions of Karlin and Fabens (1959) and Iglehart and Karlin (1962), classical inventory models have assumed demand in each period to be a random variable independent of demands in other periods and of environmental factors other than time. However, as elaborated recently in Song and Zipkin (1993), many randomly changing environmental factors, such as fluctuating economic conditions and uncertain market conditions in different stages of a product life-cycle, can have a major effect on demand. For such situations, the Markov chain approach provides a natural and flexible alternative for modeling the demand process. In such an approach, environmental factors are represented by the demand state or the state-of-the-world of a Markov process, and demand in a period is a random variable with a distribution function dependent on the demand state in that period.
Furthermore, the demand state can also affect other parameters of the inventory system, such as the cost functions.

Another feature that is not usually treated in classical inventory models but is often observed in real life is the presence of various constraints on ordering decisions and inventory levels. For example, there may be periods, such as weekends and holidays, during which deliveries cannot take place. Also, the maximum inventory that can be accommodated is often limited by finite storage space. On the other hand, one may wish to keep the amount of inventory above a certain level to reduce the chance of a stock-out and ensure satisfactory service to customers.

While some of these features are dealt with in the literature in a piecemeal fashion, we shall formulate a sufficiently general model that has models with one or more of these features as special cases and that retains an optimal policy of (s, S) type. Thus, our model considers more general demands, costs, and constraints than most of the fixed-cost inventory models in the literature.

The plan of the paper is as follows. The next section contains a review of relevant models and how our model relates to them. In Section 2, we develop a general finite horizon inventory model with a Markovian demand process. In Section 3, we state the dynamic programming equations for the problem and the results on the uniqueness of the solution and the existence of an optimal feedback or Markov policy. In Section 4, we derive some properties of K-convex functions, which represent important extensions of the existing results. These properties allow us to show more generally and simply that the optimal policy for the finite horizon model is still of (s, S) type, with s and S dependent on the demand state and the time remaining. The analysis of models incorporating no-ordering periods and those with shelf capacity and service level constraints is presented in Section 5.
The nonstationary infinite horizon version of the model is examined in Section 6. The cyclic demand case is treated in Section 7. Section 8 concludes the paper.

1 Review of Literature and Its Relation to Our Model

Classical papers on the optimality of (s, S) policies in dynamic inventory models with stochastic demands and fixed setup costs are those of Arrow et al. (1951), Dvoretzky et al. (1953), Karlin (1958), Scarf (1960), Iglehart (1963), and Veinott (1966). Scarf develops the concept of K-convexity and uses it to show that (s, S) policies are optimal for finite horizon inventory problems with fixed ordering costs. That a stationary (s, S) policy is optimal for the stationary infinite horizon problem is proved by Iglehart (1963). Furthermore, Bensoussan et al. (1983) provide a rigorous formulation of the problem with nonstationary but stochastically independent demand. They also deal with the issue of the existence of optimal feedback policies along with a proof of the optimality of an


(s, S)-type policy in the nonstationary finite as well as infinite horizon cases. Kumar (1992) has attempted to extend the classical inventory model by incorporating service level and storage capacity constraints, but without a rigorous proof.

The effect of a randomly changing environment in inventory models with fixed costs received only limited attention in the earlier literature. Karlin and Fabens (1959) introduced a Markovian demand model similar to ours. They indicate that, given the Markovian demand structure in their model, it appears reasonable to postulate an inventory policy of (s, S) type with a different set of critical numbers for each demand state. But they considered the analysis to be complex, and concentrated instead on optimizing only over the restricted class of ordering policies, each characterized by a single pair of critical numbers s and S irrespective of the demand state.

Recently and independently, Song and Zipkin (1993) have presented a continuous-time, discrete-state formulation with a Markov-modulated Poisson demand and with linear costs of inventory and backlogging. They show that the optimal policy is of state-dependent (s, S) type when the ordering cost consists of both a fixed cost and a linear cost. An algorithm for computing the optimal policy is also developed using a modified value iteration approach.

The basic model presented in the next section extends the classical Karlin and Fabens model in two significant ways: it generalizes the cost functions that are involved, and it optimizes over the natural class of all history-dependent ordering policies. The model and the methods used here are essentially more general than those of Song and Zipkin (1993) in that we consider general demands (Remark 4.5), state-dependent convex inventory/backlog costs without the restrictive assumption relating backlog and purchase costs (see Remark 4.2), and extended properties of K-convex functions (Remark 4.4).
The constrained models discussed in Section 5 generalize Kumar (1992) with respect to both demands and costs. The nonstationary infinite horizon model extends Bensoussan et al. (1983) to allow for Markovian demands and more general asymptotic behavior of the shortage cost as the shortage becomes large (Remark 4.2).

2 Formulation of the Model

In order to specify the discrete-time inventory problem under consideration, we introduce the following notation and basic assumptions:

⟨0, N⟩ = {0, 1, 2, …, N}, the horizon of the inventory problem;
(Ω, F, P) = the probability space;
I = {1, 2, …, L}, a finite collection of possible demand states;
i_k = the demand state in period k;
{i_k} = a Markov chain with the (L × L) transition matrix P = {p_ij};
ξ_k = the demand in period k, ξ_k ≥ 0, ξ_k dependent on i_k;
φ_{i,k}(·) = the conditional density function of ξ_k when i_k = i, with E[ξ_k | i_k = i] ≤ M < ∞;
Φ_{i,k}(·) = the distribution function corresponding to φ_{i,k};
u_k = the nonnegative order quantity in period k;
x_k = the surplus (inventory/backlog) level at the beginning of period k;
c_k(i, u) = the cost of ordering u ≥ 0 units in period k when i_k = i;
f_k(i, x) = the surplus cost when i_k = i and x_k = x; f_k(i, x) ≥ 0 and f_k(i, 0) ≡ 0;
δ(z) = 0 when z ≤ 0 and 1 when z > 0.

We suppose that orders are placed at the beginning of a period, delivered instantaneously, and followed by the period's demand. Unsatisfied demands are fully backlogged. Furthermore,

    c_k(i, u) = K_i^k δ(u) + c_i^k u,    (2.1)
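The demand-state chain {i_k} and the ordering cost (2.1) are straightforward to render in code. The two-state chain, transition probabilities, and all cost figures below are illustrative assumptions, not data from the paper.

```python
import random

# Two-state Markov chain of demand states and the ordering cost (2.1).
# The transition matrix and all cost figures are assumed for illustration.
P = [[0.8, 0.2],   # row i: probabilities p_ij of moving from state i to j
     [0.3, 0.7]]
K = [10.0, 6.0]    # fixed ordering cost K_i (assumed time-invariant here)
c = [2.0, 2.5]     # variable ordering cost c_i

def ordering_cost(i, u):
    """Equation (2.1): c(i, u) = K_i * delta(u) + c_i * u, where
    delta(u) = 1 for u > 0 and 0 otherwise."""
    return (K[i] if u > 0 else 0.0) + c[i] * u

def next_state(i, rng):
    """Sample the next demand state from row i of the transition matrix."""
    r, acc = rng.random(), 0.0
    for j, p in enumerate(P[i]):
        acc += p
        if r < acc:
            return j
    return len(P[i]) - 1

rng = random.Random(0)
states = [0]
for _ in range(5):
    states.append(next_state(states[-1], rng))
```

Note that the fixed cost is incurred only when an order is actually placed, which is the source of the (s, S) structure studied below.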


where the fixed ordering costs K_i^k ≥ 0 and the variable costs c_i^k ≥ 0, and the surplus cost functions f_k(i, ·) are convex and asymptotically linear, i.e.,

    f_k(i, x) ≤ C(1 + |x|) for some C > 0.    (2.2)

The objective function to be minimized is the expected value of all the costs incurred during the interval ⟨n, N⟩ with i_n = i and x_n = x:

    J_n(i, x; U) = E[ Σ_{k=n}^{N} (c_k(i_k, u_k) + f_k(i_k, x_k)) ],    (2.3)

where U = (u_n, …, u_{N−1}) is a history-dependent or nonanticipative admissible decision (order quantities) for the problem and u_N = 0. The inventory balance equations are given by

    x_{k+1} = x_k + u_k − ξ_k, k ∈ ⟨n, N−1⟩.    (2.4)

Finally, we define the value function for the problem over ⟨n, N⟩ with i_n = i and x_n = x to be

    v_n(i, x) = inf_{U ∈ 𝒰} J_n(i, x; U),    (2.5)

where 𝒰 denotes the class of all admissible decisions. Note that the existence of an optimal policy is not required to define the value function. Of course, once existence is established, the "inf" in (2.5) can be replaced by "min".

3 Dynamic Programming and Optimal Feedback Policy

In this section we give the dynamic programming equations satisfied by the value function. We then provide a verification theorem stating that the cost associated with the feedback or Markov policy obtained from the solution of the dynamic programming equations equals the value function of the problem on ⟨0, N⟩. The proofs of these results require some higher mathematics; they are available in Sethi and Cheng (1993); see also Bertsekas and Shreve (1976).

Let B_0 denote the class of all continuous functions from I × R into R_+ together with the pointwise limits of sequences of these functions (see Feller (1971)). Note that it includes piecewise-continuous functions. Let B_1 be the space of functions in B_0 that are of linear growth, i.e., for any b ∈ B_1, 0 ≤ b(i, x) ≤ C_b(1 + |x|) for some C_b > 0. Let C_1 be the subspace of functions in B_1 that are uniformly continuous with respect to x ∈ R.
For any b ∈ B_1, we define the notation

    F_{n+1}(b)(i, y) = Σ_{j=1}^{L} p_ij ∫_0^∞ b(j, y − z) φ_{i,n}(z) dz.    (3.1)

Using the principle of optimality, we can write the following dynamic programming equations for the value function:

    v_n(i, x) = f_n(i, x) + inf_{u ≥ 0} { c_n(i, u) + E[v_{n+1}(i_{n+1}, x + u − ξ_n) | i_n = i] }
              = f_n(i, x) + inf_{u ≥ 0} { c_n(i, u) + F_{n+1}(v_{n+1})(i, x + u) }, n ∈ ⟨0, N−1⟩;
    v_N(i, x) = f_N(i, x).    (3.2)

We can now state our existence results in the following two theorems.

Theorem 3.1 The dynamic programming equations (3.2) define a sequence of functions in C_1. Moreover, there exists a function u_n(i, x) in B_0 which provides the infimum in (3.2) for any x.
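The backward recursion (3.2) can be sketched numerically on a discretized surplus grid. All model data below (chain, demand distributions, costs, horizon) are assumed for illustration; demand is taken to be discrete so that the integral in (3.1) becomes a sum, and off-grid surplus values are clamped to the boundary, a crude truncation of the real line.

```python
# Backward induction for the dynamic programming equations (3.2) on an
# integer surplus grid. All model data are illustrative assumptions.
XMIN, XMAX = -20, 20
GRID = range(XMIN, XMAX + 1)
P = [[0.8, 0.2], [0.3, 0.7]]                       # transition matrix p_ij
demand_pmf = [{0: 0.5, 1: 0.5}, {1: 0.5, 2: 0.5}]  # Pr{xi_k = z | i_k = i}
K, c = [5.0, 5.0], [1.0, 1.0]                      # fixed / variable order costs
N, UMAX = 4, 10                                    # horizon, largest order tried

def f(i, x):
    """Convex surplus cost: holding at rate 1, backlog at rate 3 (assumed)."""
    return 1.0 * x if x >= 0 else -3.0 * x

def F_next(v, i, y):
    """The operator (3.1), with the density integral replaced by a pmf sum;
    off-grid arguments are clamped to the grid boundary."""
    return sum(pij * sum(pz * v[j][max(min(y - z, XMAX), XMIN)]
                         for z, pz in demand_pmf[i].items())
               for j, pij in enumerate(P[i]))

v = [{x: f(i, x) for x in GRID} for i in range(2)]  # terminal value v_N = f_N
for n in range(N - 1, -1, -1):                       # recursion (3.2)
    v = [{x: f(i, x) + min((K[i] if u else 0.0) + c[i] * u
                           + F_next(v, i, x + u) for u in range(UMAX + 1))
          for x in GRID} for i in range(2)]
```

The minimization over u is exactly the infimum in (3.2); the minimizing u for each (i, x) is the feedback policy u_n(i, x) of Theorem 3.1.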


To solve the problem of minimizing J_0(i, x; U), we use u_n(i, x) of Theorem 3.1 to define

    û_k = u_k(i_k, x_k), k ∈ ⟨0, N−1⟩ with i_0 = i;
    x_{k+1} = x_k + û_k − ξ_k, k ∈ ⟨0, N−1⟩ with x_0 = x.    (3.3)

Theorem 3.2 (Verification Theorem) Set Û = (û_0, û_1, …, û_{N−1}). Then Û is an optimal decision for the problem J_0(i, x; U). Moreover,

    v_0(i, x) = min_{U ∈ 𝒰} J_0(i, x; U).    (3.4)

Taken together, Theorems 3.1 and 3.2 establish the existence of an optimal feedback policy. This means that there exists a policy in the class of all admissible (or history-dependent) policies whose objective function value equals the value function defined by (2.5), and that there is a Markov (or feedback) policy which gives the same objective function value.

Remark 3.1. The results corresponding to Theorems 3.1 and 3.2, as proved in Sethi and Cheng (1993), hold under more general cost functions than those specified by (2.1) and (2.2). In particular, f_k(i, ·) need only be uniformly continuous with linear growth.

4 Optimality of (s, S) Ordering Policies

We make additional assumptions under which the optimal feedback policy u_n(i, x) turns out to be an (s, S)-type policy. For n ∈ ⟨0, N−1⟩ and i ∈ I, let

    K_i^n ≥ K̄_i^{n+1} ≡ Σ_{j=1}^{L} p_ij K_j^{n+1} ≥ 0,    (4.1)

and

    c_i^n x + F_{n+1}(f_{n+1})(i, x) → +∞ as x → ∞.    (4.2)

Remark 4.1. Condition (4.1) means that the fixed cost of ordering in a given period with demand state i should be no less than the expected fixed cost of ordering in the next period. The condition is a generalization of similar conditions used in the standard models. It includes the cases of constant ordering costs (K_i^n = K for all i, n) and nonincreasing ordering costs (K_i^n ≥ K_j^{n+1} for all i, j, n). The latter case may arise on account of the learning-curve effect associated with fixed ordering costs over time.
Moreover, when all future costs are calculated in terms of their present values, Condition (4.1) still holds even if the undiscounted fixed cost increases over time, as long as the rate of increase of the fixed cost over time is less than or equal to the discount rate.

Remark 4.2. Condition (4.2) means that either the unit ordering cost c_i^n > 0, or the expected holding cost F_{n+1}(f_{n+1})(i, x) → +∞ as x → ∞, or both. Condition (4.2) is borne out of practical considerations and is not very restrictive. In addition, it rules out such unrealistic trivial cases as the one with c_i^n = 0 and f_n(i, x) = 0, x ≥ 0, for each i and n, which implies ordering an infinite amount whenever an order is placed. The condition generalizes the usual assumption made by Scarf (1960) and others that the unit inventory carrying cost h > 0. Furthermore, we need not impose a condition like (4.2) on the backlog side, as assumed in Bensoussan et al. (1983), because of an essential asymmetry between the inventory side and the backlog side. Whereas we can order any number of units to decrease backlog or build inventory, it is not possible to sell anything more than the demand to decrease inventory or increase backlog. If it were possible, then a condition like (4.2) as x → −∞ would be needed to make backlog asymptotically more expensive than the revenue obtained by the sale of units. In the special case of stationary linear backlog costs, this would imply p > c (or p > αc if costs are discounted at the rate α, 0 < α ≤ 1), where p is the unit backlog cost. But since revenue-producing sales are not allowed, we are able to dispense with a condition like (4.2) on the backlog side, the standard assumption p > c (or p > αc) as in Scarf (1960) and others, and the strong assumption p > αc_i for each i as in Song and Zipkin (1993).
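Condition (4.1) compares each period's fixed cost with its one-step expectation under the chain. A quick numeric check of the condition, with assumed cost and transition data, looks like:

```python
# Check K_i^n >= sum_j p_ij K_j^{n+1} (condition (4.1)) for assumed data.
P = [[0.8, 0.2], [0.3, 0.7]]
# K[n][i]: fixed ordering cost in period n, state i (assumed, nonincreasing).
K = [[10.0, 8.0],
     [ 9.0, 7.0],
     [ 8.0, 6.0]]

def satisfies_41(P, K):
    """True iff every period/state pair meets condition (4.1)."""
    for n in range(len(K) - 1):
        for i in range(len(P)):
            expected_next = sum(P[i][j] * K[n + 1][j] for j in range(len(P)))
            if K[n][i] < expected_next:
                return False
    return True
```

Nonincreasing fixed costs, as in the data above, always pass; a schedule whose fixed cost jumps faster than the discount rate would fail.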


Remark 4.3. In the standard model with L = 1, Veinott (1966) gives an alternate proof to the one by Scarf (1960) based on K-convexity. For this, he does not need a condition like (4.2), but requires other assumptions instead.

Definition 4.1. A function g : R → R is said to be K-convex, K ≥ 0, if it satisfies the property

    K + g(z + y) ≥ g(y) + z [g(y) − g(y − b)]/b, for all z ≥ 0, b > 0, y.    (4.3)

Definition 4.2. A function g : D → R, where D is a convex subset of R, is K-convex if (4.3) holds whenever y + z, y, and y − b are in D.

The required well-known results on K-convex functions, together with their extensions, are collected in the following two propositions (cf. Bertsekas (1978) or Bensoussan et al. (1983)).

Proposition 4.1 (i) If g : R → R is K-convex, it is L-convex for any L ≥ K. In particular, if g is convex, i.e., 0-convex, it is also K-convex for any K ≥ 0.
(ii) If g_1 is K-convex and g_2 is L-convex, then for α, β ≥ 0, αg_1 + βg_2 is (αK + βL)-convex.
(iii) If g is K-convex, and ξ is a random variable such that E|g(x − ξ)| < ∞, then Eg(x − ξ) is also K-convex.
(iv) The restriction of g to any convex set D ⊂ R is K-convex.

Proof. Proposition 4.1 (i)–(iii) is proved in Bertsekas (1978). The proof of (iv) is straightforward. □

Proposition 4.2 Let g : R → R be a K-convex lower semicontinuous (l.s.c.) function such that g(x) → +∞ as x → +∞. Let A and B, A ≤ B, be two extended real numbers, with the understanding that the closed interval [A, B] becomes open at A (or B) if A = −∞ (or B = ∞). Let the notation g(−∞) denote the extended real number lim inf_{x→−∞} g(x). Let

    g* = inf_{A ≤ x ≤ B} g(x) > −∞.    (4.4)

Define the extended real numbers S and s, S ≥ s ≥ −∞, as follows:

    S = min{x ∈ R ∪ {−∞} | g(x) = g*, A ≤ x ≤ B},    (4.5)
    s = min{x ∈ R ∪ {−∞} | g(x) ≤ K + g(S), A ≤ x ≤ S}.    (4.6)

Then

(i) g(x) ≥ g(S) for all x ∈ [A, B];
(ii) g(x) ≤ g(y) + K for all x, y with s ≤ x ≤ y ≤ B;
(iii) h(x) ≡ inf_{y ≥ x, A ≤ y ≤ B} [K δ(y − x) + g(y)] = { K + g(S) for x < s; g(x) for s ≤ x ≤ B }, and h : (−∞, B] → R is l.s.c.; moreover, if g is continuous, A = −∞, and B = ∞, then h : R → R is continuous;
(iv) h is K-convex on (−∞, B].

Moreover, if s > A, then

(v) g(s) = K + g(S);
(vi) g(x) is strictly decreasing on (A, s].
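On a finite grid, the construction (4.5)–(4.6) of the critical numbers S and s, and the resulting function h of Proposition 4.2(iii), can be carried out directly. The quadratic test function below is convex (hence K-convex) and is an assumed example, not one from the paper.

```python
# Construction of S and s from a K-convex function g on a grid, following
# (4.5)-(4.6), and the resulting function h of Proposition 4.2(iii).
# The quadratic g below is convex, hence K-convex, and is an assumed example.
K = 4.0
xs = [0.5 * x for x in range(-20, 21)]      # grid over [A, B] = [-10, 10]
g = {x: (x - 2.0) ** 2 for x in xs}

g_star = min(g.values())                    # (4.4): g* = inf g
S = min(x for x in xs if g[x] == g_star)    # (4.5): smallest minimizer of g
s = min(x for x in xs if x <= S and g[x] <= K + g[S])   # (4.6)

def h(x):
    """Proposition 4.2(iii): pay K and move to S below s; stay put above."""
    return K + g[S] if x < s else g[x]
```

For this g, the minimizer is S = 2 and s is the smallest grid point at which g is within K of the minimum, which is exactly the (s, S) threshold structure exploited in Theorem 4.1.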


Proof. When s > A, it is easy to see from (4.6) that (v) holds and that g(x) > g(s) for x ∈ (A, s). Furthermore, for A < x_1 < x_2 < s, K-convexity implies

    K + g(S) ≥ g(x_2) + [(S − x_2)/(x_2 − x_1)] [g(x_2) − g(x_1)].

According to (4.6), g(x_2) > K + g(S) when A < x_2 < s. Then it is easy to conclude that g(x_2) < g(x_1). This proves (vi).

As for (i), it follows directly from (4.4) and (4.5). Property (ii) holds trivially for x = y, holds for x = S in view of (i), and holds for x = s since g(s) ≤ K + g(S) ≤ K + g(y) from (4.6) and (i). We now need to examine two other possibilities: (a) S < x < y ≤ B and (b) s < x < S, x < y ≤ B.

In case (a), let z = S if S > −∞, and z ∈ (S, x) if S = −∞. By K-convexity of g, we have

    K + g(y) ≥ g(x) + [(y − x)/(x − z)] [g(x) − g(z)].

Use (i) to conclude K + g(y) ≥ g(x) if S > −∞. If S = −∞, let z → −∞. Since lim inf g(z) = g* < ∞, we can once again conclude K + g(y) ≥ g(x).

In case (b), if s = −∞, let z ∈ (−∞, x). Then

    K + g(S) ≥ g(x) + [(S − x)/(x − z)] [g(x) − g(z)].

Let z → −∞. Since lim inf g(z) ≤ K + g* < ∞, we can use (i) to conclude g(x) ≤ K + g(S) ≤ K + g(y). If s > −∞, then by K-convexity of g and (v),

    K + g(S) ≥ g(x) + [(S − x)/(x − s)] [g(x) − g(s)] ≥ g(x) + [(S − x)/(x − s)] [g(x) − g(S) − K],

which leads to

    [1 + (S − x)/(x − s)] [K + g(S)] ≥ [1 + (S − x)/(x − s)] g(x).

Dividing both sides of the above inequality by 1 + (S − x)/(x − s) and using (i), we obtain g(x) ≤ K + g(S) ≤ K + g(y), which completes the proof of (ii).

To prove (iii), it follows easily from (i) and (ii) that h(x) equals the right-hand side. Moreover, since g is l.s.c. and g(s) ≤ K + g(S), h is also l.s.c. The last part of (iii) is obvious.

Next, if s = −∞, then h(x) = g(x) for x ∈ (−∞, B], and (iv) follows from Proposition 4.1 (iv). Now let s > −∞; therefore, S > −∞. Note from (4.6) that s ≥ A. To show (iv), we need to verify

    K + h(y + z) − [h(y) + z (h(y) − h(y − b))/b] ≥ 0

in the following four cases: (a) s ≤ y − b < y ≤ y + z ≤ B, (b) y − b < y ≤ y + z < s, (c) y − b < s ≤ y ≤ y + z ≤ B, and (d) y − b < y < s ≤ y + z ≤ B.
The proof in cases (a) and (b) is obvious from the definition of h(x).

In case (c), the definition of h(x) and the fact that g(s) ≤ g(S) + K imply

    K + h(y + z) − [h(y) + z (h(y) − h(y − b))/b]
        = K + g(y + z) − [g(y) + z (g(y) − g(S) − K)/b]
        ≥ K + g(y + z) − [g(y) + z (g(y) − g(s))/b].    (4.7)

Clearly, if g(y) ≤ g(s), then the right-hand side of (4.7) is nonnegative in view of (ii). If g(y) > g(s), then y > s and b > y − s, and we can use K-convexity of g to conclude

    K + g(y + z) − [g(y) + z (g(y) − g(s))/b] > K + g(y + z) − [g(y) + z (g(y) − g(s))/(y − s)] ≥ 0.


In case (d), we have from the definition of h(x) and (i) that

    K + h(y + z) − [h(y) + z (h(y) − h(y − b))/b] = K + g(y + z) − [g(S) + K] = g(y + z) − g(S) ≥ 0. □

Remark 4.4. To our knowledge, Definition 4.2 of K-convex functions defined on an interval of the real line, rather than on the whole real line, is new. Proposition 4.2 is an important extension of similar results found in the literature (see Lemma (d) in Bertsekas (1978), p. 85). The extension allows us to prove easily the optimality of an (s, S)-type policy without imposing a condition like (4.2) for x → −∞, and with the capacity constraints discussed later in Section 5. Note that Proposition 4.2 allows the possibility of s = −∞ or s = S = −∞; such an (s, S) pair simply means that it is optimal not to order.

We can now derive the following result.

Theorem 4.1 Assume (4.1) and (4.2) in addition to the assumptions made in Section 2. Then there exists a sequence of numbers s_i^n, S_i^n, n ∈ ⟨0, N−1⟩, i ∈ I, with s_i^n ≤ S_i^n, such that the optimal feedback policy is

    u_n(i, x) = (S_i^n − x) δ(s_i^n − x).    (4.8)

Proof. The dynamic programming equations (3.2) can be written as follows:

    v_n(i, x) = f_n(i, x) − c_i^n x + h_n(i, x), for n ∈ ⟨0, N−1⟩, i ∈ I;
    v_N(i, x) = f_N(i, x), i ∈ I,    (4.9)

where

    h_n(i, x) = inf_{y ≥ x} [K_i^n δ(y − x) + z_n(i, y)], and    (4.10)
    z_n(i, y) = c_i^n y + F_{n+1}(v_{n+1})(i, y).    (4.11)

From (2.1) and (3.2), we have v_n(i, x) ≥ f_n(i, x) for all n ∈ ⟨0, N−1⟩. From Theorem 3.1, we know that v_n ∈ C_1. These facts, along with (4.2), ensure for n ∈ ⟨0, N−1⟩ and i ∈ I that z_n(i, y) → +∞ as y → ∞ and that z_n(i, y) is uniformly continuous.

In order to apply Proposition 4.2 to obtain (4.8), we need only prove that z_n(i, x) is K_i^n-convex. According to Proposition 4.1, it is sufficient to show that v_{n+1}(i, x) is K_i^{n+1}-convex. This is done by induction. First, v_N(i, x) is convex by definition and, therefore, K-convex for any K ≥ 0. Let us now assume that for a given n ≤ N−1 and i, v_{n+1}(i, x) is K_i^{n+1}-convex. By Proposition 4.1 and Assumption (4.1), it is easy to see that z_n(i, x) is K̄_i^{n+1}-convex, hence also K_i^n-convex. Then, Proposition 4.2 implies that h_n(i, x) is K_i^n-convex. Therefore, v_n(i, x) is K_i^n-convex. This completes the induction argument.

Thus, it follows that z_n(i, x) is K_i^n-convex for each n and i. Since z_n(i, y) → +∞ when y → ∞, we apply Proposition 4.2 to obtain the desired s_i^n and S_i^n. According to Theorem 3.2, the (s, S)-type policy defined in (4.8) is optimal. □

Remark 4.5. Theorem 4.1 can be extended easily to allow for a constant lead time in the delivery of orders. The usual approach is to replace the surplus level by the so-called surplus position. It can also be generalized to Markovian demands with discrete components and countably many states.

5 Constrained Models

In this section we incorporate some additional constraints that arise often in practice. We show that (s, S) policies continue to remain optimal for the extended models.


5.1 An (s, S) Model with No-ordering Periods

Consider the special situation in which ordering is not possible in certain periods (e.g., suppliers do not accept orders on weekends). We shall show that the following theorem holds in such a situation.

Theorem 5.1 In the dynamic inventory problem with some no-ordering periods, the optimal policy is still of (s, S) type in any period except those in which ordering is not allowed.

Proof. To stay with our earlier notation, there is no loss of generality in continuing to assume the setup cost to be K_i^m in a no-ordering period m with demand state i; clearly, setup costs are of no consequence in no-ordering periods. The definition (4.10) is revised as

    h_n(i, x) = { inf_{y ≥ x} [K_i^n δ(y − x) + z_n(i, y)], if ordering is allowed in period n;
                  z_n(i, x), if ordering is not allowed in period n,    (5.1)

and z_n(i, y) is defined as before in (4.11). Using the same induction argument as in the proof of Theorem 4.1, we can show that h_n(i, x) and v_n(i, x) are K_i^n-convex if ordering is allowed in period n. If ordering is disallowed in period n, then h_n(i, x) = z_n(i, x), which is K̄_i^{n+1}-convex and, therefore, also K_i^n-convex. In both cases, therefore, v_n(i, x) is K_i^n-convex. □

Remark 5.1. Theorem 5.1 can be easily generalized to allow for supply uncertainty as in Parlar, Wang and Gerchak (1993). One needs to replace u_k in (2.4) by a_k u_k, where Pr{a_k = 1 | i_k = i} = q_i^k and Pr{a_k = 0 | i_k = i} = 1 − q_i^k, and to modify (3.2) appropriately.

5.2 An (s, S) Model with Storage and Service Constraints

Let B < ∞ denote an upper bound on the inventory level. Moreover, to guarantee a reasonable measure of service, we shall follow Kumar (1992) and introduce a chance constraint requiring that the probability of the ending inventory falling below a certain penetration level, say C_min, in any given period does not exceed a given α, 0 < α < 1.
Thus,

    Pr{x_{k+1} ≤ C_min} ≤ α, k ∈ ⟨0, N−1⟩.

Given the demand state i in period k, it is easy to convert the above condition into x_k + u_k ≥ A_i^k ≡ C_min + Φ⁻¹_{i,k}(1 − α), where Φ⁻¹_{i,k}(·) is defined as Φ⁻¹_{i,k}(z) = inf{x | Φ_{i,k}(x) ≥ z}.

The dynamic programming equations can be written as (4.9), where z_n(i, y) is as in (4.11) and

    h_n(i, x) = inf_{y ≥ x, A_i^n ≤ y ≤ B} [K_i^n δ(y − x) + z_n(i, y)],    (5.2)

provided A_i^n ≤ B, n ∈ ⟨0, N−1⟩, i ∈ I; if not, then there is no feasible solution, and h_n(i, x) = inf ∅ ≡ ∞. This time, since y is bounded by B < ∞, Theorem 3.1 can be relaxed as follows.

Theorem 5.2 The dynamic programming equations (4.9) with (5.2) define a sequence of l.s.c. functions on (−∞, B]. Moreover, there exists a function u_n(i, x) in B_0 which attains the infimum in (4.9) for any x ∈ (−∞, B].

With u_n(i, x) of Theorem 5.2, it is possible to prove Theorem 3.2 as the verification theorem for the constrained case as well. We now show that the optimal policy is of (s, S) type.

Theorem 5.3 There exists a sequence of numbers s_i^n, S_i^n, n ∈ ⟨0, N−1⟩, i ∈ I, with s_i^n ≤ S_i^n, A_i^n ≤ s_i^n, and S_i^n ≤ B, such that the feedback policy u_n(i, x) = (S_i^n − x) δ(s_i^n − x) is optimal for the model with capacity and service constraints defined above.
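The conversion of the chance constraint into the lower bound A = C_min + Φ⁻¹(1 − α) on the post-order surplus can be sketched concretely. For illustration only, demand is assumed exponentially distributed, since its quantile function has a closed form; the paper itself makes no such distributional assumption.

```python
import math

# Conversion of the chance constraint Pr{x_{k+1} <= C_min} <= alpha into
# the lower bound A = C_min + Phi^{-1}(1 - alpha) on x_k + u_k, assuming
# (for illustration only) exponentially distributed demand.
def exp_quantile(mean, z):
    """Inverse distribution function Phi^{-1}(z) of an Exponential demand."""
    return -mean * math.log(1.0 - z)

def lower_bound(c_min, alpha, mean_demand):
    """A_i^k: the smallest post-order surplus meeting the service constraint."""
    return c_min + exp_quantile(mean_demand, 1.0 - alpha)
```

If the post-order surplus equals this bound, the probability that demand drives the ending inventory below C_min is exactly α; any larger surplus makes it smaller.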


Proof. First note that Proposition 4.2 holds when g is l.s.c. and K-convex on (−∞, B], B < ∞. Also, by Proposition 4.1 (iii) and (iv), one can see that Eg(x − ξ) is K-convex on (−∞, B] since ξ ≥ 0. Because g is l.s.c., it is easily seen that Eg(x − ξ) is l.s.c. on (−∞, B]. Furthermore, by Theorem 5.2, v_n is l.s.c. on (−∞, B]. With these observations in mind, the proof of Theorem 4.1 can be easily modified to complete the proof. □

Remark 5.2. A constant integer lead time λ ≥ 1 can also be included in this model, with the surplus level replaced by the surplus position and with the lower bound A_i^k properly redefined in terms of the distribution of the total demand during the lead time.

6 The Nonstationary Infinite Horizon Problem

We now consider an infinite horizon version of the problem formulated in Section 2. By letting N = ∞ and U = (u_n, u_{n+1}, …), the extended real-valued objective function of the problem becomes

    J_n(i, x; U) = Σ_{k=n}^{∞} α^{k−n} E[c_k(i_k, u_k) + f_k(i_k, x_k)],    (6.1)

where α is a given discount factor, 0 < α ≤ 1. The dynamic programming equations are

    v_n(i, x) = f_n(i, x) + inf_{u ≥ 0} {c_n(i, u) + α F_{n+1}(v_{n+1})(i, x + u)}, n = 0, 1, 2, ….    (6.2)

In what follows, we shall show that there exists a solution of (6.2) in class C_1, which is the value function of the infinite horizon problem; see also Remark 6.1. Moreover, the decision that attains the infimum in (6.2) is an optimal feedback policy. Our method is that of successive approximation of the infinite horizon problem by longer and longer finite horizon problems.

Let us, therefore, examine the finite horizon approximation J_{n,k}(i, x; U) of (6.1), which is obtained by the first k-period truncation of the infinite horizon problem of minimizing J_n(i, x; U), i.e.,

    J_{n,k}(i, x; U) = Σ_{l=n}^{n+k−1} α^{l−n} E[c_l(i_l, u_l) + f_l(i_l, x_l)].    (6.3)

Let v_{n,k}(i, x) be the value function of the truncated problem, i.e.,

    v_{n,k}(i, x) = inf_{U ∈ 𝒰} J_{n,k}(i, x; U).    (6.4)

Since (6.4) is a finite horizon problem on the interval ⟨n, n + k⟩, we may apply Theorems 3.1 and 3.2 and obtain its value function by solving the dynamic programming equations

    v_{n,k+1}(i, x) = f_n(i, x) + inf_{u ≥ 0} {c_n(i, u) + α F_{n+1}(v_{n+1,k})(i, x + u)},
    v_{n+k,0}(i, x) = 0.    (6.5)

Moreover, v_{n,0}(i, x) = 0, v_{n,k} ∈ C_1, and the infimum in (6.4) is attained.

It is not difficult to see that the value function v_{n,k} increases in k. In order to take its limit as k → ∞, we need to establish an upper bound on v_{n,k}. One possible upper bound on inf_{U ∈ 𝒰} J_n(i, x; U) can be obtained by computing the objective function value associated with the policy of never ordering. With the notation 0 = {0, 0, …}, let us write

    w_n(i, x) = J_n(i, x; 0) = f_n(i, x) + E[ Σ_{k=n+1}^{∞} α^{k−n} f_k(i_k, x − Σ_{j=n}^{k−1} ξ_j) | i_n = i ].    (6.6)
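The successive approximation scheme, solving longer and longer truncations (6.3) until the value functions settle, can be sketched for a stationary discounted instance. All data below are assumed, and the finite grid with clamped boundaries is a crude truncation of the real line.

```python
# Successive approximation for the discounted infinite horizon problem:
# iterate the dynamic programming operator of (6.2) starting from v = 0
# and stop when successive iterates agree. All data are assumed.
ALPHA = 0.9                                        # discount factor
XMIN, XMAX = -15, 15
GRID = range(XMIN, XMAX + 1)
P = [[0.8, 0.2], [0.3, 0.7]]
demand_pmf = [{0: 0.5, 1: 0.5}, {1: 0.5, 2: 0.5}]
K, c = [5.0, 5.0], [1.0, 1.0]

def f(i, x):
    return 1.0 * x if x >= 0 else -3.0 * x

def clamp(y):
    return max(min(y, XMAX), XMIN)

def apply_T(v):
    """One application of the stationary version of the operator in (6.2)."""
    return [{x: f(i, x)
             + min((K[i] if u else 0.0) + c[i] * u
                   + ALPHA * sum(pij * sum(pz * v[j][clamp(x + u - z)]
                                           for z, pz in demand_pmf[i].items())
                                 for j, pij in enumerate(P[i]))
                   for u in range(11))
             for x in GRID} for i in range(2)]

v = [{x: 0.0 for x in GRID} for i in range(2)]     # v_{n,0} = 0, as in (6.5)
for k in range(200):
    new = apply_T(v)
    diff = max(abs(new[i][x] - v[i][x]) for i in range(2) for x in GRID)
    v = new
    if diff < 1e-4:                                # iterates have settled
        break
```

By (6.8) the iterates increase monotonically toward the infinite horizon value function; the loop is the computational counterpart of letting k → ∞.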


In a way similar to Bensoussan et al. (1983, pp. 299–300), it is easy to see that, given (2.2), w_n(i, x) is well defined and is in C_1. Furthermore, in class C_1, w_n is the unique solution of

    w_n(i, x) = f_n(i, x) + α F_{n+1}(w_{n+1})(i, x).    (6.7)

We can state the following result for the infinite horizon problem; see the Appendix for its proof.

Theorem 6.1 Assume (2.1) and (2.2). Then we have

    0 = v_{n,0} ≤ v_{n,1} ≤ … ≤ v_{n,k} ≤ w_n    (6.8)

and

    v_{n,k} ↑ v_n, a solution of (6.2) in B_1.    (6.9)

Furthermore, v_n ∈ C_1, and we can obtain Û = {û_n, û_{n+1}, …} for which the infimum in (6.2) is attained. Moreover, Û is an optimal feedback policy, i.e.,

    v_n(i, x) = min_{U ∈ 𝒰} J_n(i, x; U) = J_n(i, x; Û).    (6.10)

Remark 6.1. We should indicate that Theorem 6.1 does not imply that there is a unique solution of the dynamic programming equations (6.2). There may well be other solutions. Moreover, one can show that the value function is the minimal positive solution of (6.2). It is also possible to obtain a uniqueness proof under additional assumptions.

With Theorem 6.1 in hand, we can now prove the optimality of an (s, S) policy for the nonstationary infinite horizon problem.

Theorem 6.2 Assume (2.1), (2.2), and (4.2) hold for the infinite horizon problem. Then there exists a sequence of numbers s_i^n, S_i^n, n = 0, 1, …, with s_i^n ≤ S_i^n for each i ∈ I, such that the optimal feedback policy is u_n(i, x) = (S_i^n − x) δ(s_i^n − x).

Proof. Let v_n denote the value function. Define the functions z_n and h_n as in Section 4. We know that z_n(i, x) → ∞ as x → +∞ and z_n(i, x) ∈ C_1 for all n and i ∈ I.

We now prove that v_n is K_i^n-convex. Using the same induction as in Section 4, we can show that v_{n,k}(i, x) defined in (6.4) is K_i^n-convex. This induction is possible since we know that v_{n,k}(i, x) satisfies the dynamic programming equations (6.5).
It is clear from the definition of K-convexity, on taking the limit as k → ∞, that the value function v_n(i, x) is also K_i^n-convex.

From Theorem 6.1, we know that v_n ∈ C_1 and that v_n satisfies the dynamic programming equations (6.2). Therefore, we can obtain an optimal feedback policy Û = {û_n, û_{n+1}, …} that attains the infimum in (6.2). Because v_n is K_i^n-convex, û_n can be expressed as in Theorem 6.2. □

7 The Cyclical Demand Model

Cyclic or seasonal demand often arises in practice. Such demand represents a special case of Markovian demand, in which the number of demand states L is given by the cycle length and

    p_ij = { 1, if j = i + 1, i = 1, …, L−1, or i = L, j = 1;
             0, otherwise.

Furthermore, we assume that the cost functions and density functions are all time invariant. The result is a considerably simplified optimal policy, i.e., only L pairs of (s_n, S_n) need to be computed. We can state the following corollary to Theorem 6.2.

Corollary 7.1 In the infinite horizon inventory problem with a demand cycle of L periods, let n_1 and n_2 (n_1 < n_2) be any two periods such that n_2 = n_1 + m·L, m = 1, 2, …. Then we have s_{n_1} = s_{n_2} and S_{n_1} = S_{n_2}.
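The deterministic cyclic chain of Section 7 can be written down directly; the cycle length L = 4 below is an assumed example (states are 0-indexed in code, whereas the paper indexes them from 1).

```python
# Cyclic demand: p_{i,i+1} = 1 for i = 1..L-1 and p_{L,1} = 1 in the
# paper's 1-indexed notation; 0-indexed here.
def cyclic_matrix(L):
    """L x L transition matrix of the deterministic demand cycle."""
    return [[1.0 if j == (i + 1) % L else 0.0 for j in range(L)]
            for i in range(L)]

def step(P, i):
    """Next state under a deterministic 0/1 transition matrix."""
    return P[i].index(1.0)

L = 4
P = cyclic_matrix(L)
state = 0
for _ in range(L):
    state = step(P, state)
# After L steps the chain returns to its starting state, which is why
# only L pairs (s_n, S_n) need to be computed (Corollary 7.1).
```

The period-L recurrence of the chain is exactly the structure behind Corollary 7.1: the optimal critical numbers repeat with the cycle.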


8 Concluding Remarks

This paper develops several more realistic extensions of the classical dynamic inventory model with stochastic demands. The models consider demands that depend on a finite-state Markov chain, including demands that are cyclic. Some constraints commonly encountered in practice, namely no-ordering periods, finite storage capacities, and service levels, are also treated. Both finite and infinite horizon cases are studied. It is shown that all these models, not unlike the classical model, exhibit the optimality of $(s,S)$ policies.

In some real-life situations, unsatisfied demand is often lost rather than backlogged, as assumed in this paper and frequently in the literature on $(s,S)$ models. An extension of the $(s,S)$-type results presented in this paper to the lost sales case is given in Sethi and Cheng (1993).

References

Arrow, K.J., T. Harris and J. Marschak. 1951. Optimal Inventory Policy. Econometrica 19, 250-272.

Bensoussan, A., M. Crouhy and J. Proth. 1983. Mathematical Theory of Production Planning. North-Holland, Amsterdam.

Bertsekas, D. 1978. Dynamic Programming and Stochastic Control. Academic Press, New York.

Bertsekas, D. and S. Shreve. 1976. Stochastic Optimal Control: The Discrete Time Case. Academic Press, New York.

Dvoretzky, A., J. Kiefer and J. Wolfowitz. 1953. On the Optimal Character of the (s,S) Policy in Inventory Theory. Econometrica 21, 586-596.

Feller, W. 1971. An Introduction to Probability Theory and Its Applications. Vol. II, 2nd Edition, Wiley, New York.

Iglehart, D. 1963. Optimality of (s,S) Policies in the Infinite Horizon Dynamic Inventory Problem. Management Science 9, 259-267.

Iglehart, D. and S. Karlin. 1962. Optimal Policy for Dynamic Inventory Process with Nonstationary Stochastic Demands. In Studies in Applied Probability and Management Science. K. Arrow, S. Karlin and H. Scarf (eds.), Ch. 8, Stanford Univ. Press, Stanford, CA.

Karlin, S. 1958. Optimal Inventory Policy for the Arrow-Harris-Marschak Dynamic Model.
In Studies in the Mathematical Theory of Inventory and Production. K. Arrow, S. Karlin and H. Scarf (eds.), Ch. 9, 135-154, Stanford Univ. Press, Stanford, CA.

Karlin, S. and A. Fabens. 1959. The (s,S) Inventory Model under Markovian Demand Process. In Mathematical Methods in the Social Sciences. K. Arrow, S. Karlin and P. Suppes (eds.), Stanford Univ. Press, Stanford, CA.

Kumar, A. 1992. Optimal Replenishment Policies for the Dynamic Inventory Problem with Service Level and Storage Capacity Constraints. Working Paper, Case Western Reserve Univ.

Parlar, M., Y. Wang and Y. Gerchak. 1993. A Periodic Review Inventory Model with Markovian Supply Availability: Optimality of (s,S) Policies. Working Paper, McMaster Univ.


Scarf, H. 1960. The Optimality of (S,s) Policies in the Dynamic Inventory Problem. In Mathematical Methods in the Social Sciences. K. Arrow, S. Karlin and P. Suppes (eds.), Stanford Univ. Press, Stanford, CA.

Sethi, S. and F. Cheng. 1993. Optimality of (s,S) Policies in Inventory Models with Markovian Demand Processes. C2MIT Working Paper, Univ. of Toronto.

Song, J.-S. and P. Zipkin. 1993. Inventory Control in a Fluctuating Demand Environment. Operations Research 41, 351-370.

Veinott, A. Jr. 1966. On the Optimality of (s,S) Inventory Policies: New Conditions and a New Proof. SIAM J. Appl. Math. 14, 1067-1083.

Appendix: Proof of Theorem 6.1

By definition, $v_{n,0} = 0$. Let $\tilde{U}_{n,k} = \{\tilde{u}_n, \tilde{u}_{n+1}, \ldots, \tilde{u}_{n+k}\}$ be a minimizer of (6.3). Thus,
\[ v_{n,k}(i,x) = J_{n,k}(i,x;\tilde{U}_{n,k}) \ge J_{n,k-1}(i,x;\tilde{U}_{n,k}) \ge \min_{U \in \mathcal{U}} J_{n,k-1}(i,x;U) = v_{n,k-1}(i,x). \]
It is also obvious from (6.3) and (6.6) that $v_{n,k}(i,x) \le J_{n,k}(i,x;0) \le w_n(i,x)$. This proves (6.8).

Since $v_{n,k} \in C_1$, we have
\[ v_{n,k}(i,x) \uparrow v_n(i,x) \le w_n(i,x), \tag{A.1} \]
with $v_n(i,x)$ l.s.c., and hence in $B_1$. Next, we show that $v_n$ satisfies the dynamic programming equations (6.2). Observe from (6.5) and (6.8) that for each $k$, we have
\[ v_{n,k}(i,x) \le f_n(i,x) + \inf_{u \ge 0}\{c_n(i,u) + \alpha F_{n+1}(v_{n+1,k})(i,x+u)\}. \]
Thus, in view of (A.1), we obtain
\[ v_n(i,x) \le f_n(i,x) + \inf_{u \ge 0}\{c_n(i,u) + \alpha F_{n+1}(v_{n+1})(i,x+u)\}. \tag{A.2} \]
In order to obtain the reverse inequality, let $u_{n,k}$ attain the infimum on the right-hand side of (6.5). From Assumptions (2.1) and (2.2), we obtain
\[ c_n^i u_{n,k}(i,x) \le \alpha F_{n+1}(v_{n+1,k})(i,x) \le \alpha(1+M)\,\|w_n\|\,(1+|x|), \]
where $\|\cdot\|$ is the norm defined on $B_1$, i.e., for $b \in B_1$,
\[ \|b\| = \max_i \sup_x \frac{b(i,x)}{1+|x|}. \]
This provides us with the bound
\[ 0 \le u_{n,k}(i,x) \le M_n(1+|x|). \tag{A.3} \]
Let $k < l$, so that we can deduce from (6.8):
\[ \begin{aligned} v_{n,l+1}(i,x) &= f_n(i,x) + c_n(i,u_{n,l}(x)) + \alpha F_{n+1}(v_{n+1,l})(i,x+u_{n,l}(x)) \\ &\ge f_n(i,x) + c_n(i,u_{n,l}(x)) + \alpha F_{n+1}(v_{n+1,k})(i,x+u_{n,l}(x)). \end{aligned} \tag{A.4} \]


Fix $k$ and let $l \to \infty$. In view of (A.3), we can, for any given $n$, $i$ and $x$, extract a subsequence $u_{n,l'}(i,x)$ such that $u_{n,l'}(i,x) \to \bar{u}_n(i,x)$. Since $v_{n+1,k}$ is uniformly continuous and $c_n$ is l.s.c., we can pass to the limit on the right-hand side of (A.4). We obtain (noting that the left-hand side converges as well)
\[ \begin{aligned} v_n(i,x) &\ge f_n(i,x) + c_n(i,\bar{u}_n(x)) + \alpha F_{n+1}(v_{n+1,k})(i,x+\bar{u}_n(x)) \\ &\ge f_n(i,x) + \inf_{u \ge 0}\{c_n(i,u) + \alpha F_{n+1}(v_{n+1,k})(i,x+u)\}. \end{aligned} \]
This, along with (A.1), (A.2) and the fact that $v_n(i,x) \in B_1$, proves (6.9).

Next we prove that $v_n \in C_1$. Let us consider the problem (6.3) again. By the definition (6.1),
\[ J_n(i,x';U) - J_n(i,x;U) = \sum_{l=n}^{\infty} \alpha^{l-n} E\left[ f_l\Big(i_l,\, x' + \sum_{j=n}^{l-1} u_j - \sum_{j=n+1}^{l} \xi_j\Big) - f_l\Big(i_l,\, x + \sum_{j=n}^{l-1} u_j - \sum_{j=n+1}^{l} \xi_j\Big) \right]. \]
From Assumption (2.2), we have
\[ |J_{n,k}(i,x';U) - J_{n,k}(i,x;U)| \le \sum_{l=n}^{\infty} \alpha^{l-n} C|x'-x| = C|x'-x|/(1-\alpha), \]
which implies $|v_{n,k}(i,x') - v_{n,k}(i,x)| \le C|x'-x|/(1-\alpha)$. By taking the limit as $k \to \infty$, we have $|v_n(i,x') - v_n(i,x)| \le C|x'-x|/(1-\alpha)$, from which it follows that $v_n \in C_1$. Therefore, there exists a function $\hat{u}_n(i,x)$ in $B_0$ such that
\[ c_n(i,\hat{u}_n(i,x)) + \alpha F_{n+1}(v_{n+1})(i,x+\hat{u}_n(i,x)) = \inf_{u \ge 0}\{c_n(i,u) + \alpha F_{n+1}(v_{n+1})(i,x+u)\}. \]
Hence, we have
\[ v_n(i,x) = J_n(i,x;\hat{U}) \ge \inf_{U \in \mathcal{U}} J_n(i,x;U). \]
But for any arbitrary admissible control $U$, we also know that $v_n(i,x) \le J_n(i,x;U)$. Therefore, we conclude that
\[ v_n(i,x) = J_n(i,x;\hat{U}) = \min_{U \in \mathcal{U}} J_n(i,x;U). \qquad \Box \]
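The monotone approximation scheme at the heart of this proof, starting from $v_{n,0} = 0$ with $v_{n,k} \uparrow v_n$ as in (6.8), can be checked numerically on a toy instance: with nonnegative costs the discounted Bellman operator is monotone and maps zero to a nonnegative function, so the iterates never decrease. A minimal single-demand-state sketch, with all parameter values hypothetical and the inventory grid truncated:

```python
import numpy as np

# Toy numerical check of the monotone scheme in (6.8): from v_0 = 0, repeated
# application of a discounted, nonnegative-cost Bellman operator T yields a
# nondecreasing sequence v_k converging to a fixed point. One demand state;
# all numbers are hypothetical.
alpha, K, c, h, p = 0.9, 3.0, 1.0, 1.0, 3.0
x = np.arange(-10, 16)                   # truncated inventory grid
q = np.array([0.2, 0.5, 0.3])            # hypothetical demand pmf on {0, 1, 2}
f = h * np.maximum(x, 0) + p * np.maximum(-x, 0)   # surplus cost

def T(v):
    # G(y) = c*y + E_d[f(y-d) + alpha*v(y-d)], with y-d clipped to the grid
    idx = np.clip(np.searchsorted(x, x[None, :] - np.arange(3)[:, None]),
                  0, x.size - 1)
    G = c * x + q @ (f[idx] + alpha * v[idx])
    m = np.minimum.accumulate(G[::-1])[::-1]      # min over order-up-to y >= x
    return -c * x + np.minimum(G, K + m)          # not ordering vs. ordering

v = np.zeros(x.size)                     # v_0 = 0
monotone = True
for k in range(300):
    v_next = T(v)
    monotone &= bool(np.all(v_next >= v - 1e-12))   # v_{k+1} >= v_k, as in (6.8)
    v = v_next
print(monotone, float(np.abs(T(v) - v).max()))
```

Since $T$ is a contraction with modulus $\alpha$ on this bounded grid, the residual $\|T(v) - v\|$ shrinks geometrically, mirroring the convergence $v_{n,k} \uparrow v_n$ established above.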
