analysis of the effects of delays on load sharing 1989

8/3/2019 Analysis of the Effects of Delays on Load Sharing 1989

1/13

IEEE TRANSACTIONS ON COMPUTERS, VOL. 38 , NO 11 . N O V E MB E R 1989 1513

Analysis of the Effects of Delays on Load Sharing

Abstract-In this paper, we study the performance character-istics of simple load sharing algorithms for distributed systems.In the system under consideration, it is assumed that nonnegligi-ble delays are encountered in transferring tasks from one node toanother and in gathering remote state information. Because ofthese delays, the state information gathered by the load sharingalgorithms is out of date by the time the load sharing decisionsare taken. This paper analyzes the effects of these delays onthe performance of three algorithms that we call Forward, Re-verse, and Symmetric. We formulate queueing theoretic modelsfor each of the algorithms operating in a homogeneous sys-tem under the assumption that the task arrival process at eachnode is Poisson and the service times and task transfer timesare exponentially distributed. Each of the models is solved us-ing the Matrix-Geometric solution technique and the importantperformance metrics are derived and studied.

Index Terms- Communication delays, distributed systems,load sharing, Matrix-Geometric solution technique, queueingmodels.

I . INTRODUCTIONISTRIBUTED computer systems possess many poten-D ially attractive fe atures. Som e of these are the capabilityto share processing of tasks in the event of overloads and

failures, reliability through replication, and modularity. Thisstudy focuses on the issue of sharing computation power be-tween nodes of a distributed system in response to imbalancesin loads.It will be assumed that tasks arrive at the nodes in a ran-dom fashion. Thus, situations can develop whereby some of

the nodes are excessively busy while others are idle at thesame time [6]. This kind of situation is detrimental to perfor-mance because tasks at the busy nodes experience very highwaiting times, while the less busy nodes have idle cycles atthe same time. T he function of load sharing is to minimize theoccurrence of such scenarios by moving tasks from overloadednodes to less busy ones.Distributed load balancing has been an active area of re-search for some time. For example, Stone [lo], [ l l ] , Bokharj[11, and Towsley [141 examined static algorithm s that utilizedinformation only about the average behavior of tasks in decid-

an optimal probabilistic assignment scheme. Silva and Gerla[ 2 ] determine an optimal load sharing strategy under the as-sumption that the nodes and the communication network canbe modeled as product form queueing networks. Recently,Le e [5] studied the effects of task transfer delays on simplealgorithms that do not utilize any remote state information.Eager et al. [3] evaluated three simple load sharingschemes. They assume that the entire overhead due to loadsharing is transferred onto the CPU and is modeled as an in-creased load on the sam e. Furthermore, the nodes are assumedto be p art of a local area network connected by a high band-width m edium. Thus, there are no delays in transferring tasksand remote state information is always perfectly accurate.While these works provided significant insight into variousaspects of load sharing, the problem of communication de-lays and out of date state information and its impact on loadsharing has not been investigated in any great detail. In thispap er, we focus on the effect of communication delays uponthe performance of simple load sharing algorithms. We feelthat this problem is interesting in that there exist a sufficientnumber of system architectures that will generate significantdelays in task transfers. For instance, Theimer et al . [13],report their concerns with task transfer delays. Fu rthermore,they have acknowledged that if the files used by a task wereto be transferred (as they might have to be if the nodes weredisk-based), the effect of delays would become even moreprominent (the V-System is currently comprised of disklessworkstations). Also, the question of how to deal with out ofdate state information has been one of the many interestingdevelopments in designing algorithms for distributed systemsas investigated in Stankovic [9].In this connection, we have developed analytical modelsthat help us better understand the above issues. Various rele-vant performance m etrics are derived from these models andthe load sharing algorithms are com pared on the basis of thesemetrics. By studying the results obtained from the model so-lution, we are able to determine the exact effects of delaysand out of date state information on load sharing in general.Furthermore, we are able to determine the range of delaysand traffic intensities over which state information is worth- ' gathering and useful load sharing can be p erformed.ing their assignments. Tantawi and Towsley I121 investigated The remainder of this paper is organized as fol lows~ nSection 11. we Drovide a brief de scrb tion of the svstem archi-IManuscript received March 3, 1987; revised July 16 , 1987 and November

14 , 1987.This work was supported in part by the N ational Science Foundationunder Grant ECS-8406402 and bv RADC under Contract RI-44896X.

tecture and the load sharing algorithms. Section 111comprisesthe description of the Markov process corresponding to theR. Mirchandaney is with the Department of Computer Science, Yale Uni-versity, New Haven, CT 06520.D. Towsley and J. A. Stankovic are with the Department of Computer and

IEEE Log Number 8930169.

Sym metric algorithm and its Matrix-Geometric solution . Theanalysis corresponding to the Forward and Reverse probingthe analysis of Symmetric subsumes that of Forward and Re-lnformation Science, University of Massachusetts, Amherst, MA 01003. Only be described in brief. This is because

0018-9340/89/1100-1513$01.00O 1989 IEEE


2/13

1514 IEEE TRANSACTIONS ON COMPUTERS, VOL. 38, NO . 11 , NOVEMBER 1989verse. In Section IV, we describe the important results of this the nod e waits for an other local a rrival before it can p roberesearch and we summarize our work in Section V. Finally, again.Appendices A and B describe the internals of the matrices 0 Reverse: This algorithm is ac tivated every time a taskinvolved with the solution of the Marko v processes. comp letes at a node and the total num ber of tasks at the nodeis less than T + 1 and the node is not already waiting for a

remote task to arrive. If so , the node probes a subset of sizeL, remote nodes at random to try and acquire a remote task.Only nodes that possess mo re than T + 1 tasks (inclu ding theProcessing and transmission of communication messages currently executing one) can respond posit ively. If more thanfor state updates (Probes) and for tasks can Potentially F n - one node can transfer a task, the probing node chooses oneerate considerable overhead at the nodes. Different system of these at random from which it requests a task.architectures can impose very different costs for these over- 0 Symmetric: This algorithm combines the two schemes ofheads. At one end of the Spectrum , nodes can have dedicated Forward and Reverse. Thus, if a node goes above T + 1 uponProcessors to handle COmmunication overheads, supported bY the arrival of a local task, i t at tempts to transfer a task and ifa very high bandw idth fiber-optic bus commu nication. On the it drops below T + 1 upon a task completion, it attempts to

other end of the spectrum, nodes can be multiplexed between acquire a remote task.application tasks and comm unication packet processing. In all the algorithms described above, it is assume d thatassumptions about the system probing takes zero time. This is based on the initial assumptionthat we will be considering . The architecture of the individual that probes are much entities than are tasks. Thus, thenodes includes a powerful bus interface unit P IU ), which is overhead for processing a probe at the BIU is much smallerwed to process most Of the Overhead generated by task and than for tasks. Furthermore, probes occupy much less of theprobe movement. For instance, the BIU have a DMA comm unication bandw idth than tasks. Thus , the entire delaycapability to access main memory without much interference is assumed to occur during task transfer. Furthermore,we have seen in s eparate studies (not described in this paper)o the CPU.the Of the Overhead processing for task transfer that as long as the ratio of task transfer times to probe transferis transferred to the BIU, delays will nevertheless occur dur- times is sufficiently large (>50), the system essentially be-ing this processing * There will also b e network in the haves as if the probes actually take zero time. We are currentlytransmission of probes and tasks. We are interested in studying investigating this phenom enon in greater detail.the comb ined effects of these delays. Furthermo re, we believethat it is reasonable to assume that the relative sizes of tasksand probes will be quite different. The physical transfer ofa task may require tens of comm unication packets, while a It is assumed that the task arrival process at each node isprobe or a response to one would in all likelihood need at Poisson, with parame ter X. Also, the service times and taskmost one packet. Thus, it is reasonab le to imagine a ratio of transfer times a re assu med to be e xponen tially distributed,50:l or more in the relative sizes of tasks versus probes. Co n- with means 1/p an d lly, respectively. The task transfer timesequently, it appears that the delays incurred by tasks in the includes the time between the initiation of a transfer from aBIUs and the ne twork will be sign ificantly larger than tho se node and the successful reception of the task at the destinationincurred by the probes. In our analysis, the delays incurred node. The nodes are assumed to be homogeneous, i .e. , theby probes will be a ssumed to be negligible when comp ared nodes have identical processing power and the arrival processto those incurred by task transfers. at each node is the same. Tasks are assumed to be executedon a first-com e-first-served (FCFS) basis at each no de.B . Load Sharing Algorithms Le t N: be the number of tasks at node i at time t an d

The three algorithms that we have studied in the context J? be the probe state of node i at time t . The probe stateof this research are called Forward, Reverse, and Symm etric. indicates whethe r the node is probing or being probe d, etc.Each algorithm is provided with a threshold T. The algorithms For examp le, in a system of M nodes, th e instantan eous stateare described in the following few paragraph s. of the network can be represented by the 2M-tup le

0 Forward: The algo rithm is activated each time a localtask arrives at the node. If the number of tasks at this node(including the task currently being executed) is greater thanT+ , an attempt is made to transfer the newly arrived task to If the probe state Jf is defined appropriately then, due to theanother node. A finite number L,, of nodes (usually L, = 2 Poisson arrival assum ption and the expon ential service andor 3 is adequate) is probed at random to determine a placement task transfer t imes, the process corresponding to the abovefor the task. A probed node responds positively if the number state description is Marko vian.of tasks it possesses is less than T + 1 and it is not already It is clear that the mod el has a very large state space and iswaiting for som e other remote task. If more than on e node difficult to solve, even for mode rately sized systems. Conse-responds positively, the sender node transfers the task to o ne quently, we decompose the mod el such that the model for eachof these responde nts, picked at random . If none of the probed node can be solve d independ ently of the others [3]. The inter-nodes resp onds po sitively, i.e., this prob e was un successful, actions between the nodes which result in task transfers for the

11. SYSTEMRCHITECTUREND LOAD HARINGLGORITHMSA . System Architecture and M otivation

we have made the

III. MATHEMATICALNALYSIS

( i v y ,e,. .,Ivy; y , J?, . . .It (M ) .


3/13

MIRCHANDANEY et al.. E FFE C T S OF DELAYS ON LOAD SHARING

-bo C O 0 0 . . .. . .a1 61 C l 0

0 a2 b2 c2am-2 bm - 2 cm - 2 0. . . . .

0 a m - I bm-I Cm- I. . . . .0 0 a , b,. . . . .

1515

Qs =

Fig . 1. State diagram of Symmetric probing algorithm

Bo1 " . Bo1 Bo1 A0 A0 " 'Boo B11 . . . B11 BZ1 AI AI . . .BlO B1o ' . ' B1o A* A2 A2 " '


4/13

1516 IEEE TRANSACTIONS ON COMPUTERS, VOL. 38 , NO . 1 1 , NOVEMBER 1989

.

where we define the matrices B O O , el, Blo, B11, Bz1, A2,A l , an d A. in Appendix A.In the subsequent discussion, h is the probability of failurein finding an assignment for a spare task in response to a setof forward probes. Thus, f i = I - h is the prob ability that atleast one of the probed nodes will accept a remote task. Also,q is the probability of failure in finding a remote task for aset of reverse probes, and q = 1 - q.

The effect of a node sending a forward probe when i t goesabove T + 1 is represented by the transition Ah. When thenode m akes a transit ion anywhere below T + 1 on the com-pletion of a task, it sends out reverse probes in order to geta remote task. A successful transition is represented by pqand an unsuccessful set of reverse probes is represented bythe transition p q .Thus, on the com pletion of a task when the node goes belowT + 1, it sends out reverse pro bes, if it is not alread y waitingfor a remote task to arrive in response to an earlier reverseprobe. A transition of this type is represented from (n, ) to(n - I , 1) or (n, ) to (n - 1, 3), where 0 < n 5 T + 1 .When a remote node sends a forward probe into this node, i tmakes the transition from (n, ) to ( n, ) or (n, ) to (n, ),where 0 5 n 5 T . This means that the remote node is goingto transfer a task to this node, on the basis of a successfulprobe. The rate of receiving forward probes is denoted by a.The rate at which this node sends out tasks in response tonodes that asked it for tasks is p . Thus, the rate at which anode makes the transition (n,j) o (n - 1, j ) , or n 2 T + 2equals p +p .As can be seen from the generator Qs , the Markov pro-cess has a regular structure comprised of the Ao,A l an d A2matrices, preceded, however, by the irregular boundary con-ditions. The size of the irregular portion of the matrix depe ndsupon the threshold at which the process is operating. Therewill be exactly T - column s of the matrices (B O ,B1 I ,Blo) .

Neuts [8] examined Markov processes with such genera-tors and determined the conditions for positive recurrence

Bo1 . Bo1 Bo1

BlOlO BlO . . .BOO B11 . . . B11 B21 +RA2 = O

r4,3 = r4.2where 6 = (Ah +p + p ) .Thus, the diagonal elements of R can be writ ten explic-itly in terms of the parameters of the Markov process. Oncethe diagonal elements are determined, the elements below thediagonal are computed recursively from the solution of thediagonal elements.

By adapting Theorem 1.5.1 from Neuts [8], he Markovprocess QS is positive recurrent if and only if s p ( R ) < 1and the matrix M (defined below) possesses a positive leftinvariant probability vector. Because R is lower triangular,its eigenvalues are its diagonal elements. One can show thats p ( R ) < 1 if

Ah < p f p .The matrix M , given by

when the infinitesim al generator A = A0 +A l + A2, cor-responding to the geometric part of the Markov process, isirreducible. However, for our problem, A is lower triangularand reducible. In such cases, the stability criterion has to bedetermine d explicitly.

is an irreducible, aperiodic matrix. The second condi-t ion holds because of the irreducibility of M . The vectorWO,p l , . . , P ~ + ~ )s the left eigenvector of M .Intuitively, the stability condition means that the rate ofprocessing tasks (inc luding the one s that are sent out of thisnode) is greater than the total arrival rate of tasks into thisConsider the nonlinear matrix equation

Ao+RAl +R2A2 = Osuch that R is its minimal nonnegative solution. It can beshown that R is lower triangular, given the structure of Ao,A I , nd A2 [81. Furthermore, R = [ r i j ] ,where

r i , = 0, Vi < j6 - A2 - 4(p +p)Ah)22(P + P >r1,1 =


5/13

MIRCHANDANEY et al.: EFFECTS OF DELAYS O N LOAD SHARING 1517where the number of columns in the matrix is exactly T +1. We know from Neuts [8] that if the process is positiverecurrent

4 . Solve the linear system corresponding to the boundary5 . Determ ine FFRO('") and RFRO ('+') from the modelconditions

, V i > T + l .T+ I -ipi =PT+lThus,

Also,

E [ M , he expected number of tasks at a node, and E[D],heexpected response time of a task, are given by the followingexpressions:

solution6 . If ABS (FFRO('+') - FFRO")) 5 E and ABS(RFRO('+') - RFRO")) 5 , where E is an arbitrarysmall number, stop, else7. Let i = i + 1 . G o to 2 .We have observed from experiments that the solution wasinsensitive to the initial values chosen for the unknown quan-tities. Consequently, we conjecture that there exists a uniquesolution to the model. Furthermore, the number of iterationswas usually small, ranging between 10 and 30.Because of the assumption of homogeneity and because ofthe symmetric nature of the algorithm

FFRO = FFRI and RFRO = RFRI.E[NI: i p i e To determine CY , e use the relation FFRO =FFRI, where

\Total-Flow-In)h N I +FFRO = xh C p i e ,

i >T

FFR I = ~ ~ ~ p , [ 1 1 0 0 ] ~ .i T+1To determine q , the probability of a set of reverse probes

FFRO(O), RFRO'O)resulting in failure, we use the following procedure:


6/13

1518 IEEE TRANSACTIONS ON COMPUTERS, VOL. 38 , NO. 1 1 , NOVEMBER 1989............e . . .

Fig. 2 . State diagram of Forward probing algorithm.

.....i""

.....

Le tY = p i e .i


7/13

MIRCHANDANEY et a l . : EFFECTS OF DELAYS ON LOAD SHARING

with zero costs, i.e ., the M /M/K m odel. Wherever relevant,we will also compare the algorithms against a Random as-signment algorithm, which transfers tasks based only uponlocal state information. This algorithm is similar to Forwardin the sense that a node that goes above T + 1 transfers atask. However, the node does not send any probes. Instead,it picks a destination node at random and transfers a task tothis node. The key performance metric for comparison is themean response time of tasks.A large number of parameters such as the service time,the threshold T , he probe limit L , , the comm unication delayl /y , the number of nodes in the network, etc., can affect theperformance of load sharing algorithms. In this connection,we will try to present the results that we believe are the mostrelevant. The presentation will be in the following sequence:

validation of the analytical results with simulations,nominal comparisons between the algorithms,0 relation between delays and thresholds,0 optimal response times as a function of delays,0 optimal thresholds as a function of delays.

Unless specifically mentioned otherwise, L, = 2 in all theruns. Also, S = 1 p an d C = 1/y re the means of the servicetime and task transfer delay, respectively. Furthermore, it willbe assumed that S:unit and all measurements of responsetimes will be in terms of this unit.Validation with Sim ulat ions : We mentioned in Section I1that the decomposition used in this paper is only an approx-imate solution which is conjectured to be exact for infinitelylarge systems. Thus, it is important to determine how wellthis approximation com pares to simulations of finite-sized sys-tems. The simulation model consisted of 10 nodes in all casesexcept when p = 0.9, where the model consisted of 20 nodes.Fig. 4 depicts a representative set of curves regarding thisstudy.

Because the simulation results were almost identical to theanalytical model, we have chosen not to depict the actual sam-ple means of the respo nse times from the simulations. Instead,the 95 percent confidence intervals of the simulation results arepresented, as computed by the student-t tests. On th e average,the confidence interval for the response time is about 10.015units about the sample mean. The only exception to this is atp = 0.9, when the confidence interval is about h0.165 unitsabout the sample mean.We have observed (results not presented here) that in mostof the cases, the variation between the simulation results andthe analytical models is less than 2 percent. Furthermore, themodel is almost invariably optimistic, compared to the simu-lation results. The maximum variation that we observed wasabout 15 percent, and such numbers were very infrequent andwere seen to occur at low communication delays and highloads ( p 2 0.9). As the delays increase, however, the modeltends to become more accurate. In any case, for loads 50.8,the model is a very good approximation, even for reasonablysmall systems. In cases where the variation was more than 2percent, it was seen that by increasing the size of the simula-tion system to 20 nodes, the results generated better agreement

s 4.:; .1519

T

sr

1.d I0.3 0.4 0.5 0.6 0.7 0.8 0. 9L o a d

~ D e l a y = 0.1s. .. . .. D e l a y = SD e l a y = 10s..._

I Confidence IntervalFig. 4. Validation with simulations.

at p = 0.9, C = O.IS,which was ab out 15 percent, decreasesto about 5 percent for a system of 20 nodes.Comparison of the Algorithms: In an earlier study byWang and Morris [15], it was postulated that at low trafficintensities, Forward probing is likely to perform best, whileat high traffic intensities, Reverse would be more suitable.However, it was not known exactly where one policy becamebetter than the others, especially when there are significantcommunication delays involved.Another factor that takes on a degree of importance in thiscomparison between algorithms is that of probe overhead.While we have assumed that probes take zero time, there isthe potential for the probes to interfere with other messages,especially if they are generated in large enough numbers. Ithas been shown in [7] that the Symmetric algorithm g eneratesprobes at a higher rate than do Forward and Reverse. Whilewe have not included the effects of such overhead in ou r modelthus far, this aspect of the study is currently under progress.Fig. 5shows the p erformance curves of the algorithms forC = 0.1s an d T = 0. From this figure, we can make thefollowing observations.

0 At low delays and low loads ( p 5 OS),Forward performsessentially like Symmetric but Reverse is worse by as muchas 30 percent. This can be explained by the fact that in mostcases, Reverse is ineffective in load sharing as most nodeswill not have a spare task. Thus, the Reverse component of- -with those of the analytical mod el. For instance, the variation Symmetric does not improve its perform ance over Forward.


8/13

15201C-

0 -

8-

7.-RePn

5

0 6se

I

IEEE TRANSACTIONS ON COMPUTERS, VOL. 38, NO. 1 1 , NOVEMBER 1989I

L ,0.2 0:3 0:4 0:5 0:6 0.'7 018 0:QL oad L oad- y m m e t r i cF o r w a r dB e v e r s eNLB-___._..._

Fig. 5 . Comparison of algorithms (delay = 0.1s).

0 At moderate loads, Sym metric performs much better thanboth Forward and Reverse, by as much as 20 percent, whileForward and R everse are about the same.0 At high loads (p > 0.9), it is seen that Reverse is betterthan Forward by a substan tial margin of abo ut 25 percent whileSymmetric is still the best overall, being better than Reverse

by about 25 percent.0 At all the loads tested, there appears to be a substantialgain in load sharing as opposed to NLB. This is true for all

three algorithms. Ho wever, the improvement is much morepronounced as the load increases. For instance, at p = 0.9,the response t ime for Sym metric is about 2 units whereas theNLB response t ime is 10units, a significant difference.

0 As may b e expected, the algorithms perform w orse thanthe exact M / M K model. However, Symm etric generates closeperformance to the M / M K model. For instance, at p = 0.9,M / M K results in a response t ime of 1.3 units while Symm et-ric generates 2 units.

Fig. 6shows the performance curves of the algorithms forC = 2s an d (T = 2). From Fig. 6, we can reach the followingconclusions.

0 For moderate comm unication delays and low to moder-ate loads ( p 5 0.7), the behavior of the three algorithms isvirtually the same. It would appear that the delay overheadpredominates at these loads.

0 At moderate loads, (p = 0.8), Symmetric is about 10percent better than Reverse but almost identical to Forward.

__ S y m m e t r i cF o r w a r dR e v e r s eNLB__..__..._Fig. 6. Comparison of algorithms (delay = 2s).

0 Only at very high loads (p 2 .9) does Symmetric actu-ally perform significantly better than both Forward and Re-verse.0 In comparison to NLB, it is seen that at low loads

(p 5 O S ) , there is little if no imp rovem ent by load sha ring.However, as the load increases, load sharing becomes moreviable. At p = 0.9, Symmetric generates a response t ime of3.5 units as opposed to 10 units for NLB.

0 The comparison against the M N K model is not veryflattering at high delays, as might be expected. For instance,Symmetric at p = 0.9 is about 2.5 times worse than theM/M/K value of about 1.3 units.

Thus, one can conclude that at moderately high delays,the performance of the three algorithms is virtually identical.A surprising result though is that Symmetric is significantlybetter at very high loads.All the sub sequent discussion is based on the results ob-tained from the Symmetric algorithm. Unless explicitly men-tioned otherwise, the conclusions reached are also applicableto Forward and Reverse. In cases where the performance ofthese algorithms is markedly different from tha t of Symm etric,a separate discussion will be provided.Delays Versus Thresholds: Figs. 7-9 show the responsetimes for the Symmetric algorithm tested over a wide rangeof comm unication delays and thresho lds, for the traffic in-tensities of 0.5, 0.7, and 0.9. It can be seen from Fig. 7that at low delays (C = O.lS), the optimal threshold is 0


9/13

MI R C H A N D A N E Y et a l . : E F F E C T S OF D E L A Y S O N L O A D S H A R I N G

1 l0 1 2 3 4 5 6 7 8 9 10T h r e s h o l d

~ rh o = 0 5r h o = 0 7r h o = 0 9....

Fi g 7 Variation of thresholds (delay = 0.1s).

and the performance is a monotonically increasing functionof the threshold. Also, the response time generated at T = 0is only about 20 percent worse than the exact M/M/K valuefor moderate loads (p 5 0.7). For example, at p = 0.7, theSymm etric response time is about 1.3 units whereas the exactMIMIK value is approximately 1 .04 units. Fu rthermore, theNLB response time for this load is 3.3 units, which ismuch worse than the performance of the Symmetric algo-rithm. The performance improvement due to load sharingin this case can be explained by the following arguments.

0 At low delays, the cost of transferring a task is muchlower than the potential improvement due to the effect of loadsharing. Thus, T = 0 permits very active load sharing.0 Because the delays are small, much greater certainty ex-ists in the knowledge that an idle node will continue to remainidle during the time it takes to transfer a task to it. Thus, insome sense, T = 0 ensures that all task transfers ar e useful inthat a remote task arrives at the node soon after it becomesidle.For moderate delays (C = S , Fig. 8) , the behavior is asfollows. Even at p 10.5, there is a gain of about 22 percentfrom load sharing. For instance, the best response time atthis load is about 1.56 units while the corresponding NL Bperformance is 2 units. The improvements over NLB by loadsharing at higher loads are even more substantial, being ashigh as about 73 percent for p = 0.9. The NLB response time

1521

T h r e s h o l d~ r h o = 0.5

r ho = 0.7r h o = 0.9....

Fig. 8. Variation of thresholds (delay =S) .

in this case is 10 units whereas the optimal Symmetric valueis about 2.7 units, as can be seen from Fig. 8 . Furthermore,T = 1 for p = 0.5 and 0.7, while T = 2 fo r p = 0.9, are theoptimal thresholds.When the communication delays increase to the order of10s (Fig. 9), it is seen that the best that can be achieved forp = 0.5 is the NLB performance which is 2.0 units responsetime. Thus, it would be appropriate to turn off load sharinghere. For p = 0.7, a small gain of about 5 percent is seen, atT = 5 . This improvement is sm all enough that if the interfer-ence of probes could be accounted fo r, the best strategy mightvery likely be to turn off load sharing. However, at p = 0.9,the reduction in response time from the NLB is about 40 per-cent and this occurs at T = 6 , where the Sym metric responsetime is about 6.0 units. In any case, the response times at highdelays are significantly worse than the M/M /K values a s mightbe expected. For instance, at p = 0.7, the MIMIK responsetime is 1.04 units whereas the best load sharing value is about3.1 units.Optimal Response Times: The purpose of this set of testsis to determine the best possible performance of the algorithmsunder a very large range of transfer delays, ranging from assmall as 1/1OOS, to as large as 100s. Thus, in this study, onecan assume very fast local area networks will form one endof the spectrum and slow, long-haul networks the other end.Fig. 10shows the results of the tests for the algorithm.Th e response time in each case is normalized by the M /M /1


10/13

15228.0-

73',

7.Q- '\6.5-

6.Q-Re 5.5-P0 5.0.-e

S

sr

0.m-

'.

._,..I..I'

.

................. ..... -. ....................................................

0 1 2 3 4 5 6 7 8 Q 10Threshold- ho = 0 . 5rh o = 0 . 7

rh o = 0.9. ..____.

Fig. 9. Variation of thresholds (delay = 10s).

response times. Th us, a lower ratio indicates greater improv e-ments as a consequence of load sharing. Corresponding toeach curve representing a particular traffic intensity, there isa curve for the performance of the Random assignment al-gorithm, to be used as a baseline. From the figure, one cansee that at low delays (51/25'), the gain from load sharingis quite sub stantial, at all traffic intensities considered . Fur-thermore, the gains are greater for higher loads. At loads of0.9, the response t imes are 0.25 t imes those for the no loadsharing case.As can be seen from the curves representing the perfor-mance of the random assignment, there is a definite advan-tage in probing. However, as the delays increase, ( > 1 5 ' ) ,this advantage of probing seems to disappear. Random witha suitable threshold is able to perform a s well as any probingpolicy, giving the impression that the state informa tion due toprobing is so out of date as to not really be useful. Also, th ebest that can be achieved in lower traffic intensities (50.5)is no better than the M/M/l response t ime at these delays.However, there is still a marked improvement in the perfor-mance of load sharing at higher loads, for example at 0.8 an d0.9. The rem arkable fact that should be noticed here is thateven at delays as high as loos, there is about an 8 percentimprove ment over no lo ad sh aring for traffic intensity 0.9. Wepostulate that at higher loads, this effect will be even moreprominent.Optimal Thresholds:Fig. 11indicates the variation of theowtimal thresholds corresDonding to the owtimal remonse

IEEE TRANSACTIONSON COMPUTERS,VOL. 38, NO. 11, NOVEMBER 1989

100.0.

N0 0.r. .a; .se 0.dR O': .P0 0.n0.

T 0.1 _ - -m 0.

D e l a y as a Fraction of S- ho = 0.5randrh o = 0.7randrh o = 0.7rand

_____......... .__Fig. 10. Optimal normalized response times.

times indicated in Fig. 10.Note that the thresholds are low atlower delays and get higher as the delays increase. Further-more, this effect is seen to be more promine nt at higher trafficintensities. At p = 0.9, the optimal threshold varies between0 when the delay is l/lOS an d 25, when the delay is 100s.The variation is significantly lower at low lo ads.

v. SUMMARYN D CONCLUSIONSThis study was concerned with the performance analysisof simple load sharing algorithms in the presence of signifi-cant task transfer delays. The three algorithms that we tested

were called Forward, Reverse, and Sym metric. The analysisof the algorithms was carried ou t using the Matrix-Ge ometricsolution technique.The M arkov process of the entire network appeared to becomp utationally intractable. Thus, we employed a decomp osi-tion technique to solve the Markov process. While this resultedin an appro ximate solution of the original system, it was seen,by means of simu lation studies, that the variation between theexact and approximate solutions was minimal for systems of10-20 nodes. Conseq uently, the analytical solution is likely tobe m ore accurate for larger systems. This leads us to hypoth-esize that the decomp osition is an exact solution of the systemin the limit as the numb er of n odes tends to infinity.The three load sharing algorithms were tested over a largerange of parameter values. Some of the salient observationsthat we made were as follows.

" ~ ~ There is considerable difference between the perfor-


11/13

MIRCHANDANEY et al. : EFFECTS OF DELAYS O N LOAD SHARING

Bo1 =

1523

- A 0 0 0-y x o oy O h O

- 0 Y Y

_,/-

BlO =

_ - -_ _ - -_ - -_ _ - -_ -

PL4 CLq 0 0o p o o0 0 CLq

- 0 0 0 p

_ --_ _ - - -

05 0 o s om o f f i o m 0.75 om 0.85 0.80Load

~ D e l a y = 0 1 sD e l a y = 2 5D e l a y = 209..... D e l a y = 4 0 5D e l a y = 60s...- - - D e l a y = 1009

Fig 11 Optimal thresholdsmance of the three algorithms at low to moderate delays ( I S ) ,with Sym metric providing the best results. As delays increase,the algorithms tend to provide almost identical performance,especially when (D 2 10s).Furthermore, at such delays,Random assignment performs as well as any probing scheme,leading us to believe that at moderate to high delays, probingis wasted effo rt.

0 At high delays (2109, the optimal response times areno better than those for the NLB case, leading us to believethat load sharing is not useful in such situations, for low tomoderate loads. However, at high loads ( p 2 0.9), substantialbenefits accrue from load sharing even at these delays.0 Reverse probing is outperformed by Forward over mostof the range of loads tested, except when p 2 0.9. While Sym-metric is the best of the three algorithms tested, it does havethe potential for generating high probing overheads. Given

these observations, Forward would appear to have even greaterapplicability if realistic overhead costs might be assigned toprobes.0 The benefits of load sharing are m ore pronoun ced at highloads ( p 2 0.8). This is evidenced by the fact that the per-centage reduction in response times in these cases is greatestover the corresponding NLB values.0 At extremely high loads p = 0.9, it is seen that about

8 percent reduction is achieved over the corresponding NLBresponse time, even when the delays are as high as 100s.0 The optimal threshold was seen to be a function of theload and the task transfer delay. At low delays, the optimal

threshold was 0 for all the loads tested. However, a s the delaysincreased, the optimal threshold increased correspondingly,becoming about 24 fo r p = 0.9 and delay = 100s.APPENDIX

In this Appendix, we give closed form representations ofthe matrices Ao, A I , and A2 and the matrices BOO,BO^, Blo,B11, and & I , for the Sym metric probing algorithm.[ (CY + 0 CY 0

l o 0 0

0 0

[- 8 0 0 0

A2 = CL + C L v 4s = (Ah +p + p ) ,

= A + p) ,where

and I4 is the identity matrix of size 4.APPENDIX

In this Appendix, we provide closed form representationsfor the matrices in the case of the Forward and Reverse prob-ing algorithms.


12/13

1524 IEEE TRANSACTIONS ON COMPUTERS, VOL. 38, NO. 1 1 , NOVEMBER 1989Forward:

-(Y IA)B o o = [4+ A)Bo1 = [; ]

by q = y L p ,an d q = 1-q is the pro bability that at least on eof the reverse probes is successful.B o o = [ ; 0 ]

-(A + Y)

11

0-(P + Y + A)1 B11 =CY-(P +Y + A)4 1 =x = C p i [ o 1 B+ C p i e

which is the probability that a node will respond negativelyto a forward probe. Thus, X = 1 - x is the proba bility that anode will respond posit ively to a forward probe.If a node probes L, nodes, then the probability that the set

i ST i>T A.[: ;]0

of probes results in failure is - (P I + A + Y)A2 = p + p O Z 2 .r l , 1 = A / ( P + P I )

Also, R = [ r i j ] an be writ ten as follows:

A2 = P I 2 r1.2 = owhere Z2 is the identity matrix of size 2.

Also, R = [ r i j ] an be writ ten as follows:rl,1 = U / P

r1.2 = 0Yr2.1 = 0 - r1.1 + r2,2)LL

where 0 = Ah +p . It can b e show n that the stability criterionfor the Forward probing algorithm is

Ah < p .Reverse: To determine q, the probability of a set of reverseLe t

probes re sulting in failure, we use the following procedu re.

Y = Pie.i < T + l

If the node probes L, nodes to receive a remote task, thenthe probab ility that all of them will be u nsuccessfu l is denoted

where 4 = X + p + p . It can be shown that the stabilitycriterion for the Reverse probing algorithm isX < p + p .

ACKNOWLEDGMENTThe authors would l ike to thank the referees fo r their valu-

able comm ents. In particular, o ne referee pointed out the flawin our initial analysis pertaining to the reducibility of A an dsuggested the correction that appears in this paper.

REFERENCES[ 1121

S. Bokhari, Dual processor scheduling with dynamic reassignment,IEEE Trans, Software E n g . , vol. SE-5, no. 4, July 1979.E. de Souza e Silva and M. Gerla, Load balancing in distributed sys-tems with multiple classes and site constraints, Perform. 84, 1984,D. Eager, E. Lazowska, and J. Zahorjan, Adaptive load sharing inhomogeneous distributed systems, IEEE Trans. Software E n g . , vol.SE-12, no. 5, pp. 662-675, May 1986.[4] G. Latouche, Algorithmic analysis of a multiprogramming-multiprocessor computer system, J . ACM, vol. 28, Oct. 1981.[5] K . J. Lee, Load balancing in distributed computer system s, Ph.D.dissertation, ECE De p., Univ. of Massachusetts, Feb. 1987.

[6] M . Livny and M. Melm an, L oad balancing in homogeneous broadcastdistributed systems, Perform. Eva/. Rev., vol. 1 1 , no . 1 , pp. 47-55,1982.

pp. 17-33.[3]


13/13

MI R C H A N D A N E Y el al.: E F F E C T S OF DELAYS ON LOAD SHARING

R. Mirchandaney, L. Sha, and J . A. Stankovic. Load sharing in thepresence of non-negligible delay s, 1987, in preparation.M. F . Neuts. Matrix-Geometric Solutions in Stochastic Models: A nAlgorithmic Approach, Mathematical Sciences. Baltimore, MD:Johns Hopkins Univ. Press, 1981.J . Stankovic, Bayesian d ecision theory and its application to decentral-ized control of task scheduling, IEEE Trans. Compu t . . vol. C-34,no. 2, pp. 117-130, Feb. 1985.H. Stone, Critical load factors in two processor distributed systems,IEEE Trans. Software Eng., vol. SE-4. no. 3, May 1978.~, Multiprocessor scheduling with the aid of network flow algo-rithms, IEEE Trans. Software E n g . , vol. SE-3, no. 1, May 1978.A . Tantawi and D. Towsley, Optimal static load balancing in dis-tr ibuted computer systems, J . A C M , vol. 32, pp . 445-465, Apr.1985.M. The imer , K . Lantz, and D. Cheriton, Preemptable remote exe-cution facilities for the V-System, in Proc. 10th Symp. Oper. Sys t .Principles, Dec . 1985.D. Towsley, The allocation of programs containing loops andbranches on a multiple processor sy stem, IEEE Trans. SoftwareE n g . , vol. SE-12, pp. 1018-1024, Oct. 1986.Y. Wang and R. M orris, Load sharing in distr ibuted systems, IE E ETrans. Comput . , vol. C-34, Mar. 1985.

Ravi Mirchandaney (S48 2-M 89) received the B .E. degrees in electricalengineering from the University of Bombay, India, in 197 9, and the M. S. andPh.D. degrees in electrical and computer engineering from the University ofMassachusetts, Amherst, in 1984 and 1987, respectively.Currently, he is an Associate Research Scientist in the Department of Com -puter Science. Yale University, New Haven, CT. His current research inter-ests are in distributed processing systems, parallel processing systems, andperformance evaluation.

Don Towsley (M7 8) received the B. A. degree in physics and the Ph.D.degree in computer sciences from the University of Texas at Austin in 1971and 1975, respectively.

1525From 19 76 to 1985 he was a member of the faculty of the Departmentof Electrical and Computer Engineering at the University of Massachusetts,Amherst, where he achieved the rank of Associate Professor. He is currentlyan Associate Professor of Com puter and Information Scien ce at the Universityof Massachusetts. During 1982-1983, he was a Visiting Scientist at the IBMT . J . Watson Research Center, Yorktown Heights, NY. His research interestsare in computer networks, distributed computer systems, and performanceevaluation.Dr. Towsley is currently an Associate Editor of Networks and IEEE TRANS-

ACTIONS ON C O M M U N I C A T I O ~ Snd is a member of the Association for Comput-ing Machinery and the Operations Research Society of America.

John A. Stankovic (S77-M79-SM86) receivedth e B S degree in electrical engineering, and theM S. and Ph.D degrees in computer science, allfrom Brown University, Providence, RI , in 1970,1976 , and 1979, respectivelyHe is currently an Associate Professor in theComputer and Information Science Department,University of Massachusetts at Amherst. He hasheld visiting positions in the Computer ScienceDepartment at Carnegie-Mellon University, Pitts-burgh, PA, and at INRIA in France. His currentresearch interests include investigating various approaches to xheduling onlocal area networks and multiprocessors, and developin g flexible, distributed,hard real-time systems He is currently building a hard real-time kernel,Spring. based on a new scheduling paradigm and on ensuring predictabil-ity He is also performing research on a distributed database testbed calledCARAT The CARAT testbed has been operational for several yearsDr Stankovic received an Outstanding Scholar Award from the School ofEngineering, University of Ma\sachusetts He has published a tutorial textentitled Reliable Distributed System Software, and is an editor fo r IEEET R A N S A C T I O N SN C O M P U T E R Se was a Guest Editor fo r a special issue ofIEEE TRANSACTIOVSN C O M P L T E R Sn Parallel and Distributed ComputingHe currently serves as a Distinguished Visitor of the IEEE Computer Societyand is a member of the Association for Computing Mdchinery and Sigma Xi

analysis of the effects of delays on load sharing 1989

Documents