converging to periodic schedules for cyclic scheduling problems with resources and deadlines

Author's Accepted Manuscript Converging to Periodic Schedules for Cyclic Scheduling Problems with Resources and Deadlines Benoît Dupont de Dinechin, Alix Munier Kordon PII: S0305-0548(14)00053-7 DOI: http://dx.doi.org/10.1016/j.cor.2014.03.004 Reference: CAOR3517 To appear in: Computers & Operations Research Cite this article as: Benoît Dupont de Dinechin, Alix Munier Kordon, Converging to Periodic Schedules for Cyclic Scheduling Problems with Resources and Deadlines, Computers & Operations Research, http://dx.doi.org/ 10.1016/j.cor.2014.03.004 This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting galley proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. www.elsevier.com/locate/caor



Converging to Periodic Schedules for Cyclic Scheduling Problems with Resources and Deadlines

Benoît Dupont de Dinechin ∗ Alix Munier Kordon †

Abstract

Cyclic scheduling has been widely studied because of the importance of applications in manufacturing systems and in computer science. For this class of problems, a finite set of tasks with precedence relations and resource constraints must be executed repetitively while maximizing the throughput. Many applications also require that execution schedules be periodic, i.e. the execution of each task is repeated with a fixed global period $w$.

The present paper develops a new method to build periodic schedules with cumulative resource constraints, periodic release dates and deadlines. The main idea is to fix the period $w$, to unwind the cyclic scheduling problem for some number of iterations, and to add precedence relations so that the minimum time lag between two successive executions of any task equals $w$. Then, using any usual (not cyclic) scheduling algorithm to compute task starting times for the unwound problem, we prove that either the method converges to a periodic schedule of period $w$, or it fails to compute a schedule. A non-polynomial upper bound on the number of iterations to unwind in order to guarantee that cyclic precedence relations and resource constraints are fulfilled is also provided. This method is successfully applied to a real-life problem, namely the software pipelining of inner loops on an embedded VLIW processor core by using a Graham list scheduling algorithm.

Keywords: Cyclic Scheduling, Throughput Maximization, Resource Constrained Project Scheduling Problem, Software Pipelining.

1 Introduction

A cyclic scheduling problem is usually defined by a set of tasks that has to be repeated infinitely. This class of problems has been widely studied in the last few years because of the importance of practical applications in different fields (see the surveys [19, 24]). For manufacturing systems, they may be found in mass production (Crama [9]; Armstrong et al. [2]; Chen et al. [5]; Kim et al. [20]). Tasks usually represent operations of the production process of a manufactured object for which a large number of copies are to be produced. In computer science, cyclic scheduling appears with software pipelining (Gasperoni and Schwiegelshohn [15]; Allan et al. [1]; Rau [28]), that is, compiler instruction scheduling of the inner program loops on instruction-level parallel or pipelined processor cores. Cyclic scheduling problems were also considered in the context of real-time systems (Cucu and Sorel [10]; Šůcha et al. [30]).

∗ Kalray, 445 rue Lavoisier, 38330 Montbonnot Saint Martin, France.
† Sorbonne Universités, UPMC Univ Paris 06, UMR 7606, LIP6, F-75005, France.


Despite originating from different application domains, two common assumptions appear in most previous works. First, relationships between a finite set of generic tasks are modeled with a bi-valued oriented graph $G = (T, A, \ell, h)$ defined as follows: $T$ is a set of generic tasks, each of them with a fixed duration $p_i \ge 0$. Each arc $e = (i, j) \in A$ is associated with a pair of values $(\ell_{ij}, h_{ij}) \in \mathbb{Z}^2$ and defines the infinite set of precedence constraints $\forall k > \max\{0, -h_{ij}\}$, $t(i, k) + \ell_{ij} \le t(j, k + h_{ij})$, where $t(i, k)$ (resp. $t(j, k + h_{ij})$) is the starting time of the $k$th (resp. $(k + h_{ij})$th) execution of generic task $i$ (resp. $j$). Resource constraints (number of machines, parallel or dedicated processors, bandwidth, etc.) are usually fixed. The objective is to find a feasible schedule with the maximum throughput.

The second common assumption is that solutions are constrained to periodic schedules, that is, there exists a period $w \in \mathbb{Q}^+ - \{0\}$ such that, for every pair $(i, k) \in T \times (\mathbb{N} - \{0\})$, $t(i, k) = t(i, 1) + (k - 1)w$. Even if this limitation may lead to sub-optimal solutions (since in the presence of resources, periodic schedules are not dominant [19]), it has obvious implementation advantages; thus, most authors dealing with a practical application have limited their study to this simple class of schedules. In this paper, we limit our study to periodic schedules with the objective of minimizing the period, which is equivalent to maximizing the throughput.

The determination of a periodic schedule with the maximum throughput for a bi-valued graph (without resource limitations) is solved polynomially: indeed, Ramchandani [27] solved it for $h_{ij} \ge 0$ and $\ell_{ij} = p_i > 0$, $\forall e = (i, j) \in A$. Chrétienne in [6] and Cohen et al. in [8] proved that the throughput of the earliest schedule equals the maximum throughput of a periodic schedule. All these results were extended separately by Lee and Park [22, 23] and Munier [25] to potentially negative values of $\ell_{ij}$ and $h_{ij}$. Note that Chrétienne in [7] also studied the existence of a cyclic (not necessarily periodic) schedule with deadlines and the structure of the latest schedule for a bi-valued graph with $h_{ij} \ge 0$ and $\ell_{ij} = p_i > 0$, $\forall e = (i, j) \in A$.

The computation of a periodic schedule in the presence of resource limitations (with or without release dates and deadlines) is a difficult problem. The problem is clearly strongly NP-hard, since it includes the computation of a classical (acyclic) schedule of a given length under resource constraints. Many authors have noticed that it may be modeled with Integer Linear Programming. In the setting of periodic cyclic scheduling of inner loop instructions on a Very Long Instruction Word (VLIW in short) processor core, Govindarajan et al. [16] expressed their problem by using a time-indexed formulation. This formulation was improved by Eichenberger and Davidson [13] in order to solve practical cases with a commercial solver. Dupont de Dinechin developed another formulation [12] inspired by the classic non-preemptive time-indexed formulation of Pritsker et al. [26] for the Resource Constrained Project Scheduling Problem (RCPSP in short) [3].

The major drawback of the time-indexed formulations is that the numbers of variables and equations grow with the period. However, Dupont de Dinechin showed that large neighborhood search techniques were effective for heuristically solving problem instances with hundreds of generic tasks [11]. Other formulations were also developed to model resource limitations more efficiently. Hanen in [18] observed that the problem for dedicated processors may be modeled by Integer Linear Programming with the starting time of the first execution of the generic tasks. This formulation was considered by Brucker and Kampmeyer in [4] to test its efficiency for several particular classes of cyclic scheduling problems. Another formulation was developed and tested by Šůcha and Hanzálek in [29] for typed task systems.

Heuristics based on list scheduling are popular among practitioners to handle resource constraints: when a resource is available, it is allocated to a ready task of highest priority [17]. In particular, Gasperoni and Schwiegelshohn in [15] proposed a simple technique to build periodic schedules in the presence of resource constraints, assuming $\ell_{ij} = p_i > 0$ and $h_{ij} \ge 0$ for any arc $(i, j) \in A$. A periodic schedule $t^\infty(i, 1)$, $i \in T$ of period $w^\infty$ is first computed from the bi-valued precedence graph ignoring resource constraints. An acyclic (non cyclic) precedence graph $G^\star = (T, A^\star, \ell)$ is then built by considering only arcs $e = (i, j) \in A$ of null height (i.e. with $h_{ij} = 0$) such that $t^\infty_i + \ell_{ij} \le t^\infty_j$ with $t^\infty_i = t^\infty(i, 1) \bmod w^\infty$. A list schedule of $G^\star$ with the original resource constraints yields a non cyclic schedule of makespan $w$. A periodic schedule is then built by repeating the (acyclic) schedule obtained with period $w$. The performance ratio is close to 2 for identical machines.

Other cyclic scheduling heuristics that build periodic schedules have been proposed for loop software pipelining on VLIW processor cores. In particular, the modulo scheduling framework [28] uses a job-based list scheduling (i.e. at each step of the algorithm, available tasks are listed and the algorithm chooses a task with the highest priority) extended with backtracking. Assuming a period $w$, the starting times of the scheduled operations are considered modulo $w$ for handling the resources (i.e. any task $i$ whose first execution $t(i, 1)$ is fixed requires its resources at time $t(i, 1) \bmod w$). This heuristic is attempted for increasingly larger values of the period $w$ until it succeeds.

The main contribution of this paper is to present a new method to build periodic schedules with periodic release dates, deadlines and complex resource limitations. A feasible periodic schedule is built using a scheduler (heuristic or optimal) for the associated non cyclic scheduling problem. An arbitrary period $w$ is fixed and the scheduler computes the successive starting times of the generic tasks. In favorable cases, our algorithm converges to a feasible periodic schedule of period $w$. Otherwise, it must be restarted with a larger value for the period. Resource constraints are a cyclic extension of the RCPSP, where resources are divided into $P$ classes, each of them composed of a fixed number of identical machines. Each generic task $i \in T$ requires a subset of resources during each execution, defined by a vector $b_i$ of size $P$.

The bi-valued graph $G$ is supposed to be strongly connected and to comprise a fictitious task 0 from $T$ that is scheduled periodically with fixed period $w$. We show that this assumption allows the association to any other task from $T$ of periodic release dates and deadlines, which are characterized by critical paths of $G$ for the period $w$.

The main idea developed in this paper is to add fictitious precedence relations $(i, i)$, $\forall i \in T$, with $\ell_{ii} = w$ and $h_{ii} = 1$, called regularizing precedence relations, and to compute a feasible schedule (using for example a list scheduling algorithm) after unwinding the scheduling problem. The minimum time lag between two consecutive executions of a same generic task is then $w$. Because of the presence of periodic release dates and deadlines, the number of iterations for which this difference is strictly greater than $w$ is bounded in any feasible schedule; we then show that after a limited number of iterations, for any integer $\Delta > 0$, the difference between $\Delta$ consecutive executions of each generic task is exactly $w$.

Afterward, two lower bounds on $\Delta$ are expressed to ensure that a feasible periodic schedule of period $w$ may be built from the unwound schedule. The first bound $\Delta_1$ is the minimum for fulfilling the precedence relations. The other bound $\Delta_2$ is the minimum for satisfying the resource constraints.

A case study coming from an actual industrial problem is lastly presented to illustrate our method. The problem is to find a periodic schedule for inner loop operations on an embedded VLIW processor core of the Lx/ST200 family [14]. Benchmarks are extracted from real-life programs, each of them with several inner loops. The acyclic scheduling heuristic used here is a Graham list scheduling algorithm with a priority proportional to an upper bound of the longest path to a final task. Experimental results are compared to the near-to-optimal modulo scheduling method developed by Dupont de Dinechin et al. in [12] and with lower bounds of the period.

The paper is organized as follows. Section 2 presents some additional notations and the computation of release dates and deadlines. An example coming from [29] illustrates our notations. Section 3 is devoted to the convergence proof of any feasible schedule after the regularizing precedence relations are added. Section 4 shows how to build a feasible periodic schedule from the feasible (not necessarily periodic) schedule obtained in the previous section. A minimum number of iterations required to fulfill the cyclic precedence relations and resource constraints is also evaluated. Section 5 presents our case study. Section 6 is our conclusion.

2 Basic Notations

The aim of this section is to present formally the problem tackled in this paper.

2.1 Generic Tasks and Precedence Relations

Let $T = \{0, \cdots, n\}$ be the set of generic tasks, $n \ge 1$. For any integer $\nu > 0$ and task $i \in T$, $\langle i, \nu \rangle$ denotes the $\nu$th execution of $i$, each of them with duration $p_i \ge 0$.

Precedence relations are defined by a bi-valued directed and strongly connected graph $G = (T, A, \ell, h)$: each arc $e = (i, j) \in A$ is associated with a pair of values $(\ell_{ij}, h_{ij}) \in \mathbb{Z}^2$ and corresponds to the (infinite) set of precedence relations

$$\forall k > \max\{0, -h_{ij}\}, \quad t(i, k) + \ell_{ij} \le t(j, k + h_{ij}).$$

For every path $\mu$ of $G$, we set $L(\mu) = \sum_{e=(j,q) \in \mu} \ell_{jq}$ and $H(\mu) = \sum_{e=(j,q) \in \mu} h_{jq}$.

Figure 1 pictures a bi-valued strongly connected graph $G$. As an example, the arc $(1, 2)$ models the precedence relations $\forall k > 1$, $t(1, k) + 2 \le t(2, k - 1)$.

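[Figure: circuit on tasks 0, 1, 2 with arcs $(0, 1)$, $(1, 2)$ and $(2, 0)$ valued $(2, 2)$, $(2, -1)$ and $(-1, 1)$ respectively]

Figure 1: A bi-valued strongly connected graph $G = (T, A, \ell, h)$ with $T = \{0, 1, 2\}$. The duration of each task is unitary (i.e. $p_0 = p_1 = p_2 = 1$).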


2.2 Periodic Schedules

If resources are not considered, a schedule $\sigma$ is entirely defined by $\sigma = \{t(i, k), i \in T, k \in \mathbb{N} - \{0\}\}$, where $t(i, k)$ is the starting time of $\langle i, k \rangle$ following $\sigma$. The throughput of a schedule $\sigma$ is

$$\lambda(\sigma) = \min_{i \in T} \lim_{k \to \infty} \frac{k}{t(i, k)}.$$

A schedule is periodic if there exists a period $w \in \mathbb{Q}^+ - \{0\}$ such that, for every pair $(i, k) \in T \times (\mathbb{N} - \{0\})$, $t(i, k) = t(i, 1) + (k - 1)w$. Note that the throughput of a periodic schedule is exactly $\frac{1}{w}$. We also assume in this paper that any periodic schedule verifies $w \ge \max_{i \in T} p_i$, ensuring that there is at most one execution of any task $i$ in every time interval of length $w$.

As stated before, the aim of our work is to build, if possible, a periodic schedule for a fixed period $w$. Thus, we will assume that task $0 \in T$ is always scheduled periodically, that is, $t(0, k) = (k - 1)w$ for any value $k > 0$.

Figure 2 presents the first executions of a feasible periodic schedule of period $w = 3$ for the bi-valued graph of Figure 1. The corresponding starting time sequences are, for any integer $k > 0$, $t(0, k) = (k - 1) \times 3$, $t(1, k) = -1 + (k - 1) \times 3$ and $t(2, k) = 4 + (k - 1) \times 3$.

[Figure: Gantt chart with one row per task (task 0, task 1, task 2) showing their first executions]

Figure 2: A feasible periodic schedule without resource limitation for the bi-valued graph of Figure 1. The period is $w = 3$ and the starting times of the first executions are $t(0, 1) = 0$, $t(1, 1) = -1$ and $t(2, 1) = 4$.

One may check that all precedence relations are fulfilled. As an example, the precedence relations induced by the arc $(1, 2)$ become $\forall k > 1$, $-1 + (k - 1) \times 3 + 2 \le 4 + (k - 1) \times 3 - 3$, which is clearly true.
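The check above can be repeated mechanically for every arc. The following Python sketch is an illustration only: the function names and the finite `horizon` are assumptions, and the arc values of Figure 1 are the ones used in the derivations of this section. It tests the constraints execution by execution, and also through the closed form $t(j, 1) - t(i, 1) \ge \ell_{ij} - w h_{ij}$, obtained by substituting the periodic starting times into each constraint.

```python
# Arcs of Figure 1 as (i, j, l_ij, h_ij).
ARCS = [(0, 1, 2, 2), (1, 2, 2, -1), (2, 0, -1, 1)]


def periodic_start(first_starts, w, i, k):
    """Starting time t(i, k) = t(i, 1) + (k - 1) * w of a periodic schedule."""
    return first_starts[i] + (k - 1) * w


def satisfies_precedences(first_starts, w, arcs, horizon=50):
    """Check t(i,k) + l_ij <= t(j,k+h_ij) for max(0,-h_ij) < k <= horizon."""
    for i, j, l, h in arcs:
        for k in range(max(0, -h) + 1, horizon + 1):
            lhs = periodic_start(first_starts, w, i, k) + l
            if lhs > periodic_start(first_starts, w, j, k + h):
                return False
    return True


def satisfies_precedences_closed_form(first_starts, w, arcs):
    """Same test without unrolling: t(j,1) - t(i,1) >= l_ij - w * h_ij."""
    return all(first_starts[j] - first_starts[i] >= l - w * h
               for i, j, l, h in arcs)


first_starts = {0: 0, 1: -1, 2: 4}  # schedule of Figure 2
print(satisfies_precedences(first_starts, 3, ARCS))              # True
print(satisfies_precedences_closed_form(first_starts, 3, ARCS))  # True
```

With the same first starting times and $w = 2$, the arc $(2, 0)$ is violated, so both functions return `False`.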

2.3 Resource Limitations and Feasible Schedules

The resource limitations considered can be viewed as a generalization of the Resource Constrained Project Scheduling Problem (RCPSP in short). Resources are usually partitioned into $P \ge 1$ classes $M_1, \cdots, M_P$. For any $j \in \{1, \cdots, P\}$, $m_j$ denotes the number of resources available from $M_j$.

Each generic task $i \in T$ requires a subset of resources during its execution, described by a vector $b_i$ of size $P$. For any $j \in \{1, \cdots, P\}$, $b_{ij}$ is then the number of resources from $M_j$ required for each execution of $i$.

A schedule $\sigma$ can be completely defined by the starting times $t(i, k)$, $i \in T$ and $k \in \mathbb{N} - \{0\}$, even in the presence of resource constraints. Indeed, let us denote by $\Lambda(\mu, i)$ the number of executions of $i \in T$ performed at time $\mu \in \mathbb{R}$: $\Lambda(\mu, i) = |\{q \in \mathbb{N} - \{0\}, t(i, q) \le \mu < t(i, q) + p_i\}|$. A schedule $\sigma$ fulfils resource limitations if for any $\mu \in \mathbb{R}$ and any $j \in \{1, \cdots, P\}$, $\sum_{i \in T} \Lambda(\mu, i) \times b_{ij} \le m_j$. One can notice that whenever the resource limitation is verified, an allocation function (that indicates the resources allocated to each execution of a task) may easily be built.
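As an illustration of this definition, the Python sketch below evaluates $\Lambda(\mu, i)$ and the resource inequality on a finite list of starting times. The function names and the three-task data set are hypothetical. The test is restricted to start times, under the observation that usage is piecewise constant and can only increase when an execution starts.

```python
def active_count(starts, duration, mu):
    """Lambda(mu, i): number of executions with t(i, q) <= mu < t(i, q) + p_i."""
    return sum(1 for s in starts if s <= mu < s + duration)


def fulfils_resources(schedule, durations, demand, capacity):
    """Check sum_i Lambda(mu, i) * b_ij <= m_j at every start time of the schedule.

    Usage is piecewise constant and only increases at a start time, so
    testing the start times covers every instant mu of the finite schedule.
    """
    events = sorted({s for starts in schedule.values() for s in starts})
    for mu in events:
        for j, m_j in enumerate(capacity):
            used = sum(
                active_count(schedule[i], durations[i], mu) * demand[i][j]
                for i in schedule
            )
            if used > m_j:
                return False
    return True


# Hypothetical data: three unit-duration tasks sharing one class of 2 machines.
durations = {1: 1, 2: 1, 3: 1}
demand = {1: (1,), 2: (1,), 3: (1,)}
capacity = (2,)

staggered = {1: [0, 2, 4], 2: [0, 2, 4], 3: [1, 3, 5]}
clashing = {1: [0, 2, 4], 2: [0, 2, 4], 3: [0, 2, 4]}

print(fulfils_resources(staggered, durations, demand, capacity))  # True
print(fulfils_resources(clashing, durations, demand, capacity))   # False
```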

An interesting question concerns the structure of the allocation function for periodic schedules. The example pictured by Figure 3 shows that a periodic allocation of the resources (i.e. all the executions of a task performed by the same machine) may not exist, even for a periodic schedule.

For our study, we consider that a schedule (periodic or not) is feasible if it fulfils all the constraints, i.e. precedence relations and resource limitations, with no assumption on the structure of the allocation function.

[Figure: bi-valued graph on tasks 1, 2, 3 with arc values $(1, 0)$, $(2, 0)$ and $(2, 2)$, and a Gantt chart of the executions $\langle i, k \rangle$ on $m = 2$ identical processors over the time axis 0 to 10, with $w = 2.5$]

Figure 3: Task starting times are periodic. There is no periodic schedule of period $w = 2.5$ such that successive executions of any generic task are performed by the same processor.

2.4 Evaluation of Periodic Release Dates and Deadlines

As assumed previously, task 0 follows a periodic schedule of fixed period $w$. Since $G$ is strongly connected, this defines the release dates and the deadlines for the executions of every task.

Consider as an example the bi-valued graph of Figure 1. Setting $w = 3$, the starting times of task 0 are set to $t(0, k) = (k - 1) \times 3$ for any $k > 0$.

The precedence relations modelled by the arc $(0, 1)$ are then $\forall k > 0$, $(k - 1) \times 3 + 2 \le t(1, k + 2)$. Thus, for any $k > 2$, $(k - 3) \times 3 + 2 \le t(1, k)$. Similarly, the precedence relations associated with the arcs $(1, 2)$ and $(2, 0)$ are respectively $\forall k > 1$, $t(1, k) + 2 \le t(2, k - 1)$ and $\forall k > 0$, $t(2, k) - 1 \le 3 + (k - 1) \times 3$. Thus, $t(1, k + 1) + 1 \le 3 + (k - 1) \times 3$, which is equivalent to $t(1, k) \le 2 + (k - 2) \times 3$ for $k > 2$.

Sequences $r(1, k) = 2 + (k - 3) \times 3$ and $d(1, k) = 3 + (k - 2) \times 3$ defined for $k > 2$ are respectively the periodic release dates and deadlines for $\langle 1, k \rangle$; the deadline bounds the completion time $t(1, k) + p_1$ with $p_1 = 1$. Lemmas 1 and 2 generalize these definitions by using some critical paths of the bi-valued graph $G$.


Let $\mu$ be a path of $G$ from task $i$ to task $j$. For every integer $k$ sufficiently large, $\mu$ is associated with a set of precedence relations following a path from $\langle i, k \rangle$ to $\langle j, k + H(\mu) \rangle$ and expresses that

$$t(i, k) + L(\mu) \le t(j, k + H(\mu)).$$

The integer $k$ must be greater than or equal to the minimum positive integer value $k(\mu)$ such that, for every task $p$ from $\mu$, the sub-path $\mu_p$ of $\mu$ from $i$ to $p$ verifies $k(\mu) + H(\mu_p) > 0$. Clearly,

$$k(\mu) = \max\{1, \max_{p \in \mu \cap T} (1 - H(\mu_p))\}.$$
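A short Python sketch may help to read this formula. It is illustrative only, and the name `k_min` is an assumption. The path is represented by the list of heights of its consecutive arcs, so that the running sum is $H(\mu_p)$ for each task $p$ met along the path, the empty sub-path (of height 0) included.

```python
def k_min(heights):
    """k(mu) for a path given by the heights h of its consecutive arcs."""
    best = 1
    prefix = 0          # H(mu_p) for the empty sub-path (p = i)
    for h in heights:
        prefix += h     # H(mu_p) for the next task p on the path
        best = max(best, 1 - prefix)
    return best


print(k_min([-1]))     # arc (1, 2) of Figure 1 alone: 2
print(k_min([2, -1]))  # path 0, 1, 2 of Figure 1: 1
```

The first value agrees with the condition $k > 1$ stated for the arc $(1, 2)$ of Figure 1.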

Lemma 1. For every task $i \in T - \{0\}$, let us denote by $\mu^\star_i$ the path of $G$ from task 0 to task $i$ whose value $L(\mu^\star_i) - wH(\mu^\star_i)$ is maximum. The sequence $r(i, q) = t(0, q) - wH(\mu^\star_i) + L(\mu^\star_i)$ defined for $q \ge \max\{1, k(\mu^\star_i) + H(\mu^\star_i)\}$ is periodic of period $w$ and is a release date for $\langle i, q \rangle$.

Proof. For every integer $q \ge \max\{1, k(\mu^\star_i) + H(\mu^\star_i)\}$, we have $q - H(\mu^\star_i) \ge k(\mu^\star_i) > 0$ and the starting time $t(i, q)$ for any feasible schedule verifies

$$t(i, q) \ge t(0, q - H(\mu^\star_i)) + L(\mu^\star_i).$$

As task 0 follows a periodic schedule of period $w$, we get $t(0, q - H(\mu^\star_i)) = t(0, q) - wH(\mu^\star_i)$ and then

$$t(i, q) \ge t(0, q) - wH(\mu^\star_i) + L(\mu^\star_i),$$

which concludes the proof.

Lemma 2. For every task $i \in T - \{0\}$, let us denote by $\nu^\star_i$ the path of $G$ from task $i$ to task 0 whose value $L(\nu^\star_i) - wH(\nu^\star_i)$ is maximum. The sequence $d(i, q) = t(0, q) + wH(\nu^\star_i) - L(\nu^\star_i) + p_i$ defined for $q \ge k(\nu^\star_i)$ is periodic of period $w$ and is a deadline for $\langle i, q \rangle$.

Proof. For every integer $q \ge k(\nu^\star_i)$, we get

$$t(i, q) + p_i \le t(0, q + H(\nu^\star_i)) - L(\nu^\star_i) + p_i.$$

Since task 0 follows a periodic schedule of period $w$, $t(0, q + H(\nu^\star_i)) = t(0, q) + wH(\nu^\star_i)$ and the lemma is proved.

In the following, we set

$$q^\star = \max_{i \in T - \{0\}} \{k(\nu^\star_i),\ k(\mu^\star_i) + H(\mu^\star_i)\}$$

and, for every task $i \in T - \{0\}$, $\alpha(i) = d(i, q) - r(i, q) - p_i$ is defined for $q \ge q^\star$.

Release dates and deadlines may be easily computed by considering the simple valued graph $g(w)$ built from $G$ where each arc $e = (i, j)$ of $G$ is valued by $v(e) = \ell_{ij} - wh_{ij}$. For any task $i \in T$, the longest paths $\mu^\star_i$ and $\nu^\star_i$ may be computed by using the Bellman-Ford algorithm. Periodic release dates and deadlines are then expressed by Lemmas 1 and 2.

The values of the arcs for the example of Figure 1 are respectively $v((0, 1)) = -4$, $v((1, 2)) = 5$ and $v((2, 0)) = -4$ for a period $w = 3$. Table 1 presents the release dates and deadlines obtained.
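The computation can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used for the experiments: the function names are hypothetical, tasks are numbered from 0 to $n$, and the longest paths towards task 0 are obtained by running Bellman-Ford from task 0 on the reversed graph. Each task $i$ receives offsets $(r_i, d_i, \alpha(i))$ such that $r(i, q) = t(0, q) + r_i$ and $d(i, q) = t(0, q) + d_i$.

```python
def longest_paths_from(source, n_tasks, arcs, w):
    """Bellman-Ford longest paths in g(w), where arc (i, j) is valued l_ij - w * h_ij."""
    neg_inf = float("-inf")
    dist = [neg_inf] * n_tasks
    dist[source] = 0
    for _ in range(n_tasks - 1):
        for i, j, l, h in arcs:
            if dist[i] != neg_inf and dist[i] + l - w * h > dist[j]:
                dist[j] = dist[i] + l - w * h
    # One more pass: any further improvement reveals a circuit of positive value.
    for i, j, l, h in arcs:
        if dist[i] != neg_inf and dist[i] + l - w * h > dist[j]:
            raise ValueError("positive circuit in g(w): no schedule of period w")
    return dist


def release_deadline_offsets(arcs, durations, w):
    """Return {i: (r_i, d_i, alpha_i)} following Lemmas 1 and 2."""
    n = len(durations)
    to_i = longest_paths_from(0, n, arcs, w)             # value of mu*_i
    reversed_arcs = [(j, i, l, h) for i, j, l, h in arcs]
    from_i = longest_paths_from(0, n, reversed_arcs, w)  # value of nu*_i
    return {
        i: (to_i[i], durations[i] - from_i[i], -from_i[i] - to_i[i])
        for i in range(1, n)
    }


FIGURE_1_ARCS = [(0, 1, 2, 2), (1, 2, 2, -1), (2, 0, -1, 1)]
print(release_deadline_offsets(FIGURE_1_ARCS, [1, 1, 1], w=3))
# {1: (-4, 0, 3), 2: (1, 5, 3)}
```

The printed offsets correspond to the rows of tasks 1 and 2 in Table 1.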

| Release dates | Deadlines | $\alpha(i)$ |
|---|---|---|
| $r(0, q) = (q - 1) \times 3$ | $d(0, q) = 1 + (q - 1) \times 3$ | 0 |
| $r(1, q) = -4 + (q - 1) \times 3$ | $d(1, q) = (q - 1) \times 3$ | 3 |
| $r(2, q) = 1 + (q - 1) \times 3$ | $d(2, q) = 5 + (q - 1) \times 3$ | 3 |

Table 1: Periodic release dates, deadlines and values $\alpha(i)$ with $w = 3$ for the example of Figure 1.

Note that periodic release dates and deadlines may be easily modeled by considering some additional precedence relations. Indeed, suppose that the release dates of the executions $\langle i, k \rangle$ are $r_i + w(k - 1)$ for a fixed $i \in T$, $\forall k > 0$. This is modeled by an arc $e = (0, i)$ with $\ell_{0i} = r_i$ and $h_{0i} = 0$. Similarly, if $d_i + (k - 1)w$ is a deadline for every execution $\langle i, k \rangle$ of $i \in T$ with $k > 0$, then any feasible schedule fulfills $t(i, k) + p_i \le t(0, k) + d_i$, $\forall k > 0$. It may be modeled by an arc $(i, 0)$ with $\ell_{i0} = p_i - d_i$ and $h_{i0} = 0$. This last point will be illustrated by the example presented in the next subsection.
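The two modeling rules can be written as small helper functions. The names `release_arc` and `deadline_arc` are hypothetical, and arcs are encoded as tuples $(i, j, \ell_{ij}, h_{ij})$; this encoding is an assumption of the sketch, not notation from the paper.

```python
def release_arc(i, r_i):
    """Arc (0, i) with l_0i = r_i and h_0i = 0: t(0, k) + r_i <= t(i, k)."""
    return (0, i, r_i, 0)


def deadline_arc(i, d_i, p_i):
    """Arc (i, 0) with l_i0 = p_i - d_i and h_i0 = 0: t(i, k) + p_i <= t(0, k) + d_i."""
    return (i, 0, p_i - d_i, 0)


# A unit task released at offset 0 with deadline offset 1 is pinned to task 0.
print(release_arc(1, 0))      # (0, 1, 0, 0)
print(deadline_arc(1, 1, 1))  # (1, 0, 0, 0)
```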

2.5 Example

The example presented in Figure 4 comes from [29] and presents the computation loop of a lattice wave digital filter. This loop computes a vector $Y(k)$, $k \in \{1, \cdots, N\}$ from a vector $X(k)$, $k \in \{0, \cdots, N - 1\}$. The loop body comprises 5 statements, each one corresponding to a unique operation numbered from 1 to 5. Each operation is associated with a generic task. All task durations are unitary, i.e. $\forall i \in \{1, \cdots, 5\}$, $p_i = 1$.

The bi-valued graph that models data transfers is also pictured in Figure 4. Each arc corresponds to a precedence between two operations. As an example, for any value $k \in \{1, \cdots, N\}$, task $\langle 4, k \rangle$ needs the result of both $\langle 2, k \rangle$ and $\langle 3, k - 2 \rangle$. Thus, the arcs $(2, 4)$ and $(3, 4)$ are added to $G$ with $h_{24} = 0$ and $h_{34} = 2$.

Task 2 is a multiplication, whereas all other tasks are additions. It is supposed that one multiplier and two adders are available. Thus the number of classes is $P = 2$ with $m_1 = 2$ (the adders) and $m_2 = 1$ (the multiplier). Moreover, $\forall i \in \{1, 3, 4, 5\}$, $b_i = (1, 0)$ and $b_2 = (0, 1)$.

An additional delay of 1 time unit must be added to transfer data from the multiplier to an adder. Thus, $\ell_{23} = \ell_{24} = 2$, whereas all other values of $\ell$ equal 1, which corresponds exactly to the processing time of arithmetic operations. The aim here is to compute a periodic feasible schedule with period $w = 2$.

Since the solution computed must be periodic, we suppose without loss of generality that task 1 must be executed periodically from time 0 with a period $w$. So, its release dates and deadlines are respectively $r(1, q) = 0 + 2 \times (q - 1)$ and $d(1, q) = 1 + 2 \times (q - 1)$ for $q > 0$. These constraints are respectively modeled by arcs $(0, 1)$ and $(1, 0)$ with values $(0, 0)$.

As we shall see later, the bi-valued graph $G$ must be strongly connected to ensure the convergence of our method. At this point, the bi-valued graph is still not strongly connected. To make it so, an arc $(5, 0)$ may be added to define artificial deadlines for tasks 4 and 5. The periodic deadline for task 5 is arbitrarily set to $d(5, q) = 8 + (q - 1)w$, which corresponds to a bi-valuation equal to $(-7, 0)$ for this last arc. The bi-valued graph $G$ obtained is pictured in Figure 5.

for k = 1 to N do
    1 : a(k) = X(k) − c(k − 2)
    2 : b(k) = a(k) × α
    3 : c(k) = b(k) − X(k)
    4 : d(k) = b(k) − c(k − 2)
    5 : Y(k) = X(k − 1) − d(k)

[Figure: bi-valued graph on tasks 1 to 5 with arcs $(1, 2)$, $(2, 3)$, $(3, 1)$, $(2, 4)$, $(3, 4)$ and $(4, 5)$ valued $(1, 0)$, $(2, 0)$, $(1, 2)$, $(2, 0)$, $(1, 2)$ and $(1, 0)$ respectively]

Figure 4: A computation loop of a lattice wave digital filter presented in [29] and the corresponding bi-valued graph.

Table 2 summarizes the obtained periodic release dates, deadlines and values $\alpha(i)$, $i \in T - \{0\}$. Since there is no arc with negative height $h_{ij}$, $q^\star = 1$. Release dates and deadlines are thus defined for $q \in \mathbb{N} - \{0\}$. As an example, for task 4, the longest paths of $g(w)$ in Figure 5 from task 0 to task 4 and from task 4 to task 0 are respectively $\mu^\star_4 = (0, 1, 2, 4)$ and $\nu^\star_4 = (4, 5, 0)$. Thus, $r(4, q) = t(0, q) + 3 = (q - 1) \times 2 + 3$, $d(4, q) = t(0, q) + 6 + 1 = (q - 1) \times 2 + 7$ and $\alpha(4) = d(4, q) - r(4, q) - 1 = 3$.

| Release dates | Deadlines | $\alpha(i)$ |
|---|---|---|
| $r(1, q) = (q - 1) \times w$ | $d(1, q) = 1 + (q - 1) \times w$ | 0 |
| $r(2, q) = 1 + (q - 1) \times w$ | $d(2, q) = 2 + (q - 1) \times w$ | 0 |
| $r(3, q) = 3 + (q - 1) \times w$ | $d(3, q) = 4 + (q - 1) \times w$ | 0 |
| $r(4, q) = 3 + (q - 1) \times w$ | $d(4, q) = 7 + (q - 1) \times w$ | 3 |
| $r(5, q) = 4 + (q - 1) \times w$ | $d(5, q) = 8 + (q - 1) \times w$ | 3 |

Table 2: Periodic release dates, deadlines and values $\alpha(i)$ for the example of Figure 5. Release dates and deadlines are defined for $q \ge 1$.

3 Convergence of Feasible Unwound Schedules

The key idea of this paper is to force the construction of periodic schedules by using acyclic scheduling. As an example, consider the instance $T = \{0, 1, 2, 3\}$ with $p_0 = 0$, $p_1 = p_2 = p_3 = 1$, arcs $(0, i)$ valued by $(0, 0)$ and arcs $(i, 0)$ valued by $(-1, 0)$, $i \in \{1, 2, 3\}$. The corresponding periodic release dates and deadlines are $r(i, q) = (q - 1) \times w$ and $d(i, q) = 2 + (q - 1) \times w$. Two identical machines are available. The aim is to build a feasible cyclic schedule with period $w = 2$.

Figure 6 shows that a non-periodic unwound feasible schedule may be easily built. Note that this schedule does not even tend to a periodic schedule. However, if precedence relations $(i, i)$, $i \in \{1, 2, 3\}$, bi-valued by $(2, 1)$ are added, then any feasible unwound schedule for this example tends to be periodic. This point is proved for any bi-valued graph by Theorem 3.
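To make the unwinding step concrete, the following Python sketch lists the acyclic precedence constraints between the first executions of the tasks once the regularizing loops $(i, i)$ valued by $(w, 1)$ are added for every task of $T$. The function name `unwind` and the tuple encoding of arcs and constraints are assumptions made for illustration. The resulting constraints could then be handed to any acyclic scheduler, which is not shown here.

```python
def unwind(arcs, tasks, w, n_iter):
    """List the constraints t(i,k) + l <= t(j,k+h) between the first n_iter executions.

    Each constraint is returned as ((i, k), (j, k + h), l). Regularizing
    loops (i, i) with l_ii = w and h_ii = 1 are added for every task.
    """
    regularized = list(arcs) + [(i, i, w, 1) for i in tasks]
    constraints = []
    for i, j, l, h in regularized:
        for k in range(max(0, -h) + 1, n_iter + 1):
            if 1 <= k + h <= n_iter:
                constraints.append(((i, k), (j, k + h), l))
    return constraints


# Instance of this section: arcs (0, i) valued (0, 0) and (i, 0) valued (-1, 0).
tasks = [0, 1, 2, 3]
arcs = [(0, i, 0, 0) for i in (1, 2, 3)] + [(i, 0, -1, 0) for i in (1, 2, 3)]

constraints = unwind(arcs, tasks, w=2, n_iter=2)
print(len(constraints))                     # 16
print(((1, 1), (1, 2), 2) in constraints)   # True: t(1,1) + 2 <= t(1,2)
```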

[Figure: left, the graph $G$ on tasks 0 to 5 with arcs $(0, 1)$, $(1, 0)$, $(1, 2)$, $(2, 3)$, $(3, 1)$, $(2, 4)$, $(3, 4)$, $(4, 5)$ and $(5, 0)$ valued $(0, 0)$, $(0, 0)$, $(1, 0)$, $(2, 0)$, $(1, 2)$, $(2, 0)$, $(1, 2)$, $(1, 0)$ and $(-7, 0)$ respectively; right, the graph $g(w)$ on the same arcs valued $0$, $0$, $1$, $2$, $-3$, $2$, $-3$, $1$ and $-7$]

Figure 5: On the left, the bi-valued graph $G$ built from the graph presented in Figure 4 with additional hypotheses. On the right, the corresponding simple valued graph $g(w)$ for $w = 2$.

Theorem 3. Suppose that there exists a feasible unwound schedule $\sigma$ for the strongly connected bi-valued graph $G_w$ built from $G$ by adding loops $(i, i)$ with values $\ell_{ii} = w$ and $h_{ii} = 1$ and for which task 0 is scheduled periodically with a period $w$. Then, for every integer $\Delta \ge 1$, there exists an integer $q \le q^\star + \Delta \sum_{i \in T - \{0\}} \alpha(i)$ such that, for every task $i \in T - \{0\}$, $t(i, q + \Delta) = t(i, q) + w\Delta$.

Proof. For every task $i \in T - \{0\}$ and for every $q \ge q^\star$, we get by Lemmas 1 and 2 that $t(i, q) \in \{r(i, q), \cdots, d(i, q) - p_i\}$. Now, from loops $(i, i)$,

$$\forall k \ge 1, \quad t(i, k) + w \le t(i, k + 1).$$

Since $r(i, q + 1) = r(i, q) + w$, we conclude that the sequence $t(i, q) - r(i, q)$ is non-decreasing and its value is in $\{0, \cdots, \alpha(i)\}$.

By contradiction, assume that for every $q \le q^\star + \Delta \sum_{i \in T - \{0\}} \alpha(i)$ there exists a task $i \in T - \{0\}$ such that $t(i, q + \Delta) > t(i, q) + w\Delta$. We build a lower bound $t_b(i, q)$ of $t(i, q)$ for $q \ge q^\star$ as follows:

1. $t(i, q^\star)$ is lower bounded by $r(i, q^\star)$, $\forall i \in T - \{0\}$, following Lemma 1. Thus, we set $t_b(i, q^\star) = r(i, q^\star)$;

2. For any value $k \in \{0, \cdots, \sum_{i \in T - \{0\}} \alpha(i)\}$, there is at least one task $i_k \ne 0$ such that $t(i_k, q^\star + (k + 1)\Delta) > t(i_k, q^\star + k\Delta) + w\Delta$. Thus, $t(i_k, q^\star + (k + 1)\Delta) \ge t(i_k, q^\star + k\Delta) + w\Delta + 1$ and the lower bound $t_b$ of $t$ may be defined as follows:

• $\forall i \in T - \{i_k\}$, $\forall q \in \{q^\star + k\Delta, \cdots, q^\star + (k + 1)\Delta - 1\}$, $t_b(i, q + 1) = t_b(i, q) + w$;

• $\forall q \in \{q^\star + k\Delta, \cdots, q^\star + (k + 1)\Delta - 2\}$, $t_b(i_k, q + 1) = t_b(i_k, q) + w$ and $t_b(i_k, q^\star + (k + 1)\Delta) = t_b(i_k, q^\star + (k + 1)\Delta - 1) + w + 1$.

Setting $k^\star = \sum_{i \in T - \{0\}} \alpha(i)$, $k^\star + 1$ values of $k$ were defined and thus,

$$\sum_{i \in T} \big( t_b(i, q^\star + (k^\star + 1)\Delta) - r(i, q^\star + (k^\star + 1)\Delta) \big) > k^\star.$$

By definition of $\alpha$, there is at least one task $i$ with $t_b(i, q^\star + (k^\star + 1)\Delta) + p_i > d(i, q^\star + (k^\star + 1)\Delta)$, and thus $t_b$ is not a lower bound of a feasible schedule, a contradiction.

[Figure: two Gantt charts on two machines showing executions $\langle i, k \rangle$ of tasks 1, 2, 3; top panel: a feasible non-periodic schedule; bottom panel: a feasible periodic schedule]

Figure 6: Two feasible unwound schedules for 3 independent tasks with $w = 2$, periodic release dates and deadlines. Executions of task 0 are not presented. The schedule on top does not converge to a periodic schedule. The schedule on the bottom is periodic.

Note that if $G_w$ is not strongly connected, there exists at least one task $i$ with no deadlines and the value $\alpha(i)$ is thus not bounded. Theorem 3 clearly does not apply in this case.

For the example of Figure 5, we get $q^\star = 1$. Setting $\Delta = 3$, $\Delta \sum_{i \in T - \{0\}} \alpha(i) = 18$. Figure 7 shows a feasible schedule for $G_w$ with $w = 2$ for which $q = 4$. Indeed, one may check that, for any task $i \in T$, $t(i, q + \Delta) = t(i, 7) = t(i, 4) + 6 = t(i, q) + w\Delta$.
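The search for such an index $q$ in an unwound schedule can be sketched as follows. The function name `find_periodic_window` and the starting times used in the usage example are hypothetical (they are not those of Figure 7). Starting times are stored per task in a list whose entry `k - 1` holds $t(i, k)$.

```python
def find_periodic_window(starts, w, delta, q_min=1):
    """Smallest q >= q_min with t(i, q + delta) == t(i, q) + w * delta for all tasks.

    starts[i][k - 1] holds t(i, k). Only executions present in every list
    are examined; None is returned when no such q is found among them.
    """
    n_exec = min(len(times) for times in starts.values())
    for q in range(q_min, n_exec - delta + 1):
        if all(times[q + delta - 1] == times[q - 1] + w * delta
               for times in starts.values()):
            return q
    return None


# Hypothetical unwound schedule with w = 2: both tasks are delayed once.
starts = {"a": [0, 3, 5, 7, 9, 11], "b": [0, 2, 5, 7, 9, 11]}
print(find_periodic_window(starts, w=2, delta=3))  # 3
```

When `None` is returned, more iterations have to be unwound before concluding; Theorem 3 bounds how far the search needs to go for a feasible schedule.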

The parameter $\Delta$ is of importance in the next section: we will show that it must be at least a minimum value in order to compute a feasible periodic schedule from the acyclic schedule $\sigma$.

4 From Feasible Unwound Schedules to Periodic Schedules

The aim of this section is to study how to build a feasible periodic schedule from a feasible unwound schedule whose starting times are periodic during $\Delta$ consecutive iterations. The description of the periodic schedule is detailed first. Two lower bounds for $\Delta$ are then expressed, respectively to fulfill the cyclic precedence relations and the resource constraints. The last subsection presents the main theorem, which proves the validity of our algorithm.


[Figure 7: chart of the executions 〈1, 1〉 to 〈1, 8〉, 〈2, 1〉 to 〈2, 8〉, 〈3, 1〉 to 〈3, 7〉, 〈4, 1〉 to 〈4, 5〉 and 〈5, 1〉 to 〈5, 5〉; resource labels: Add, Add, Mult.]

Figure 7: A feasible unwound schedule for the bi-valued graph $G_w$ with $w = 2$ for $G$ in Figure 5. Note that, if $\Delta = 4$, then setting $q = 4$, $t(i, q+4) = t(i, q) + w\Delta$, for any task $i \in T-\{0\}$.

4.1 Definition of the Periodic Schedule

Let us start with a technical lemma that allows us to build a feasible periodic schedule.

Lemma 4. Suppose that $\sigma$ is a feasible unwound schedule of $G_w$ with starting times $t(i, r)$ for $i \in T$ and $r \in \mathbb{N}-\{0\}$. Also assume $\Delta > 0$ and $q > 0$ such that, $\forall i \in T$, $t(i, q+\Delta) = t(i, q) + \Delta w$. Then, $\forall \beta \in \{0, \cdots, \Delta\}$, $t(i, q+\beta) = t(i, q) + \beta w$.

Proof. By definition of $G_w$, $\forall \beta \in \{0, \cdots, \Delta\}$, $t(i, q+\beta) + w \le t(i, q+\beta+1)$. Chaining these inequalities gives $t(i, q+\beta) \ge t(i, q) + \beta w$. Now, if there exists $(i, \beta) \in (T-\{0\}) \times \{0, \cdots, \Delta\}$ with $t(i, q+\beta) > t(i, q) + \beta w$, then

$$t(i, q+\Delta) \ge t(i, q+\beta) + (\Delta-\beta) w > t(i, q) + \Delta w,$$

a contradiction.

Let $\Delta$ be a fixed value. The idea of the algorithm is to first compute a feasible unwound schedule $\sigma$ for the graph $G_w$ until iteration $q+\Delta$ such that, $\forall i \in T-\{0\}$, $t(i, q+\Delta) = t(i, q) + \Delta w$. By Theorem 3, $q$ is defined and bounded by $q^\star + \Delta \sum_{i \in T-\{0\}} \alpha(i)$.
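The stopping test of this step can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `find_convergence_iteration`, the layout `t[i][r]` of the start times, and the two-task data are hypothetical and do not come from the paper.

```python
def find_convergence_iteration(t, w, delta):
    """Return the smallest q >= 1 such that t[i][q + delta] == t[i][q] + delta * w
    for every task i, or None if the unwound prefix contains no such q.

    t[i][r] is the start time of execution <i, r>; index 0 is unused so that
    iterations are numbered from 1 as in the text.
    """
    # Last iteration for which every task has a known start time.
    last = min(len(starts) for starts in t.values()) - 1
    for q in range(1, last - delta + 1):
        if all(starts[q + delta] == starts[q] + delta * w for starts in t.values()):
            return q
    return None


# Hypothetical start times for two tasks, with period w = 2.
t = {
    "a": [None, 0, 3, 5, 7, 9, 11],
    "b": [None, 1, 3, 6, 8, 10, 12],
}
q = find_convergence_iteration(t, w=2, delta=2)
```

In practice the unwound schedule would be extended one iteration at a time and the test repeated, rather than applied to a fixed prefix.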

Setting $\delta = \min_{i \in T} t(i, q)$, a periodic schedule $\bar{\sigma}$ of period $w$, with starting times $\bar{t}$, is defined by, $\forall i \in T$:

1. $\bar{t}(i, 1) = t(i, q) - \delta$;

2. $\forall p > 0$, $\bar{t}(i, p) = \bar{t}(i, 1) + w(p-1)$.

Note that $\sigma$ and $\bar{\sigma}$ have the same throughput, equal to $1/w$. In the following, minimum values for $\Delta$ are established in order to obtain for $\bar{\sigma}$ a feasible periodic schedule.

For the example of Figure 7 and $\Delta = 4$, we may set $q = 4$. Then, $\delta = 6$ and the first iteration of $\bar{\sigma}$ is defined by $\bar{t}(1, 1) = 0$, $\bar{t}(2, 1) = 1$, $\bar{t}(3, 1) = 3$, $\bar{t}(4, 1) = 6$, and $\bar{t}(5, 1) = 7$. $\bar{\sigma}$ is pictured in Figure 8.
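The two defining rules of the periodic schedule translate into a few lines of Python. The sketch below is illustrative: the name `periodic_schedule` is hypothetical, and the start times of iteration q = 4 are not read from Figure 7 but recovered from the example by adding the shift 6 back to the first-iteration values quoted above.

```python
def periodic_schedule(t_q, w):
    """Build the periodic schedule from the start times t_q[i] = t(i, q)
    of iteration q of the unwound schedule."""
    shift = min(t_q.values())  # the value called delta in the text
    first = {i: start - shift for i, start in t_q.items()}

    def start_time(i, p):
        """Start time of execution <i, p>, p >= 1, for period w."""
        return first[i] + w * (p - 1)

    return first, start_time


# Start times of iteration q = 4, recovered as (first-iteration value) + 6.
t_q = {1: 6, 2: 7, 3: 9, 4: 12, 5: 13}
first, start_time = periodic_schedule(t_q, w=2)
```

Here `first` reproduces the first iteration given in the example, and `start_time(i, p)` gives any later execution.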

4.2 Lower Bound of Δ for Precedence Relations

Lemma 5 provides a lower bound for $\Delta$ such that the periodic schedule $\bar{\sigma}$ fulfills all the precedence relations.


[Figure 8: chart of the executions 〈1, 1〉 to 〈1, 8〉, 〈2, 1〉 to 〈2, 8〉, 〈3, 1〉 to 〈3, 7〉, 〈4, 1〉 to 〈4, 5〉 and 〈5, 1〉 to 〈5, 5〉; resource labels: Add, Add, Mult.]

Figure 8: First iterations of $\bar{\sigma}$ associated with the feasible unwound schedule presented in Figure 7 with $\Delta = 4$ and $q = 4$.

Lemma 5. Let $\Delta_1 = \max_{e=(i,j) \in A} |h_{ij}|$. If $\Delta \ge \Delta_1$, then $\bar{\sigma}$ fulfills all the precedence relations from $G$.

Proof. If $\Delta \ge \Delta_1$, then for every arc $e = (i, j) \in A$, there is at least one pair of integers $(q_i, q_j) \in \{q, \cdots, q+\Delta\} \times \{q, \cdots, q+\Delta\}$ with $q_j - q_i = h_{ij}$. Moreover, since $t$ is feasible,

$$t(i, q_i) + \ell_{ij} \le t(j, q_j).$$

Now, by Lemma 4, $t(i, q_i) = t(i, q) + (q_i - q) w$ and $t(j, q_j) = t(j, q) + (q_j - q) w$. Thus,

$$\bar{t}(i, 1) - \bar{t}(j, 1) = t(i, q) - t(j, q) = (t(i, q_i) - t(j, q_j)) + w (q_j - q_i) \le -\ell_{ij} + w h_{ij}.$$

We get

$$\bar{t}(i, 1) + \ell_{ij} \le \bar{t}(j, 1) + w h_{ij}$$

and, since $\bar{t}$ is periodic,

$$\forall k > 0, \quad \bar{t}(i, k) + \ell_{ij} \le \bar{t}(j, k + h_{ij}),$$

which is the result.

For the example from Figure 5, we get Δ1 = 2.
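The bound of Lemma 5 and the precedence condition on the first iteration are easy to check mechanically. The following Python sketch assumes arcs are given as tuples (i, j, length, height); the function names and the three-task graph are hypothetical and are not the graph of Figure 5.

```python
def delta_1(arcs):
    """Largest absolute height over arcs given as (i, j, length, height)."""
    return max(abs(height) for (_, _, _, height) in arcs)


def satisfies_precedences(first, arcs, w):
    """Check first[i] + length <= first[j] + w * height for every arc.

    For a schedule of period w this is equivalent to the precedence
    constraint between executions <i, k> and <j, k + height> for all k.
    """
    return all(first[i] + length <= first[j] + w * height
               for (i, j, length, height) in arcs)


# Hypothetical graph with one circuit 1 -> 2 -> 3 -> 1 of total height 2.
arcs = [(1, 2, 1, 0), (2, 3, 2, 0), (3, 1, 1, 2)]
first = {1: 0, 2: 1, 3: 3}
ok_period_2 = satisfies_precedences(first, arcs, w=2)
ok_period_1 = satisfies_precedences(first, arcs, w=1)
```

With these values the first iteration is valid for period 2 but violates the back arc for period 1.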

4.3 Lower Bound of Δ for Resource Constraints

We suppose in this subsection that the unwound schedule $\sigma$ is feasible, i.e. it satisfies the precedence and resource constraints. Let also $d_{max} = \max_{i \in T} d(i, q^\star)$ and $r_{min} = \min_{i \in T} r(i, q^\star)$.

Now, let $q \in \mathbb{N}-\{0\}$ and the functions $D(q) = \max_{i \in T} \{t(i, q) + p_i\}$ and $R(q) = \min_{i \in T} t(i, q)$. Observe that, by definition of $\bar{\sigma}$, $\delta = \min_{i \in T} t(i, q) = R(q)$. Lemma 6 expresses a simple upper bound of $D(q) - R(q)$:

Lemma 6. For any $q \ge q^\star$, $d_{max} - r_{min} \ge D(q) - R(q)$.

Proof. Let $i \in T$ such that $D(q) = t(i, q) + p_i$. By Lemma 2, as $q \ge q^\star$, $D(q) \le d(i, q) \le d_{max} + (q - q^\star) w$. Similarly, let $j \in T$ such that $R(q) = t(j, q)$. By Lemma 1, $R(q) \ge r(j, q) \ge r_{min} + (q - q^\star) w$ and thus $d_{max} - r_{min} \ge D(q) - R(q)$, which is the result.


Lemma 7. Let $\Delta_2 = \frac{d_{max} - r_{min}}{w}$. If $\Delta \ge \Delta_2$, then for any $i \in T$, $t(i, q+\Delta) \ge D(q)$.

Proof. Let us suppose by contradiction that there exists $i \in T$ with $t(i, q+\Delta) < D(q)$. As $t(i, q+\Delta) \ge t(i, q) + \Delta w$ because of the additional precedence constraints, and $t(i, q) \ge R(q)$ by definition of $R(q)$, we get that

$$D(q) > t(i, q) + \Delta w \ge R(q) + \Delta w.$$

Thus $D(q) - R(q) > \Delta w$. Following Lemma 6,

$$d_{max} - r_{min} \ge D(q) - R(q) > \Delta w$$

and then $\Delta < \Delta_2$, a contradiction.

Lemma 8. If $\Delta \ge \Delta_2$, then for any value $\varepsilon \in \{0, \cdots, w-1\}$, each task $i \in T$ has a unique execution $\langle i, q_i \rangle$ starting during the interval $[D(q)+\varepsilon+1-w, D(q)+\varepsilon+1)$ following the unwound schedule $\sigma$. Moreover, $q_i \in \{q, \cdots, q+\Delta\}$.

Proof. Consider an execution $\langle i, q_i \rangle$ starting in the interval $[D(q)+\varepsilon+1-w, D(q)+\varepsilon+1)$. Then

$$D(q)+\varepsilon+1-w \le t(i, q_i) < D(q)+\varepsilon+1.$$

• The left inequality can be rewritten as

$$D(q) \le t(i, q_i) + w - (\varepsilon+1) \le t(i, q_i+1) - (\varepsilon+1) \le t(i, q_i+1) - 1.$$

As $\sigma$ is feasible, $q_i + 1 > q$ and then $q_i \ge q$.

• The right inequality corresponds to

$$t(i, q_i) < D(q)+\varepsilon+1 \le D(q) + w.$$

Now, as $t(i, q_i) - w \ge t(i, q_i-1)$, we get $t(i, q_i-1) < D(q)$ and thus, by Lemma 7, $q_i - 1 < q+\Delta$, hence $q_i \le q+\Delta$.

Lastly, there exists at most one execution of $i$ starting in the time interval considered (because of the additional precedence constraints). Let us suppose by contradiction that no execution of $i$ following $\sigma$ starts in the time interval $[D(q)+\varepsilon+1-w, D(q)+\varepsilon+1)$. By Lemma 4, $t(i, q'+1) = t(i, q') + w$ for any $q' \in \{q, \cdots, q+\Delta-1\}$, thus two cases may occur:

• The case $t(i, q) \ge D(q)+\varepsilon+1$ is impossible since $\sigma$ is feasible.

• Else, $t(i, q+\Delta) < D(q)+\varepsilon+1-w \le D(q)$, which contradicts Lemma 7.

Thus, the lemma.

Lemma 9 shows that the same lower bound $\Delta_2$ is sufficient for the periodic schedule to satisfy the resource limitations:

Lemma 9. Suppose that the unwound schedule $\sigma$ satisfies resource limitations. If $\Delta \ge \Delta_2$, then $\bar{\sigma}$ satisfies resource limitations.


Proof. Let $\mu$ be an integer value with $\mu \ge \min_{i \in T} \bar{t}(i, 1)$. It can be decomposed into $\mu = D(q) - R(q) + \varepsilon + kw$ with $\varepsilon \in \{0, \cdots, w-1\}$ and $k \in \mathbb{Z}$.

Now, let us suppose that the execution $\langle i, q' \rangle$ is performed by $\bar{\sigma}$ during $\mu$. Thus, $\bar{t}(i, q') \le \mu < \bar{t}(i, q') + p_i$. Recall that $\bar{t}(i, q') = t(i, q) - R(q) + (q'-1) w$, so the inequality becomes

$$t(i, q) - R(q) + (q'-1) w \le D(q) - R(q) + \varepsilon + kw < t(i, q) - R(q) + (q'-1) w + p_i$$

and thus

$$t(i, q) + (q'-1-k) w \le D(q) + \varepsilon < t(i, q) + (q'-1-k) w + p_i.$$

Now, following Lemma 8, there exists a unique execution $\langle i, q+K_0 \rangle$ of $i$ starting during the interval $[D(q)+\varepsilon+1-w, D(q)+\varepsilon+1)$ for the unwound schedule $\sigma$, with $K_0 \in \{0, \cdots, \Delta\}$. Moreover, by Lemma 4, $t(i, q) + K_0 w = t(i, q+K_0)$. Thus $K_0 = q'-1-k \in \{0, \cdots, \Delta\}$ and the last inequality becomes

$$t(i, q+K_0) \le D(q) + \varepsilon < t(i, q+K_0) + p_i.$$

Consequently, $\langle i, q+K_0 \rangle$ is performed during $D(q)+\varepsilon$ by $\sigma$.

Hence every task $i \in T$ executed during a time instant $\mu$ by $\bar{\sigma}$ can be associated with an execution of $i$ executed during time instant $D(q)+\varepsilon$ by $\sigma$. As $\sigma$ fulfills resource limitations, so does $\bar{\sigma}$, which proves the lemma.

For the example from Figure 7, $d_{max} = 8$, $r_{min} = 0$ and thus $\Delta_2 = \frac{8}{2} = 4$.
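The resource condition on a periodic schedule can also be checked directly, by accumulating the resource usage of every task per time slot modulo w. The Python sketch below is an illustration only: the function names, the data layout and the three-task instance with resources "Add" and "Mult" are assumptions, not the instance of Figure 7.

```python
from collections import Counter


def periodic_resource_usage(first, durations, demands, w):
    """Steady-state usage of each resource in each time slot modulo w.

    first[i] is the start of the first execution of task i, durations[i] its
    processing time, and demands[i] maps resource names to the amount used
    while task i executes.
    """
    usage = [Counter() for _ in range(w)]
    for i, start in first.items():
        for offset in range(durations[i]):
            slot = (start + offset) % w
            for resource, amount in demands[i].items():
                usage[slot][resource] += amount
    return usage


def satisfies_resources(first, durations, demands, capacity, w):
    """True if no slot modulo w exceeds the capacity of any resource."""
    usage = periodic_resource_usage(first, durations, demands, w)
    return all(slot[r] <= capacity[r] for slot in usage for r in capacity)


# Hypothetical instance: one adder and one multiplier, period w = 2.
durations = {1: 1, 2: 2, 3: 1}
demands = {1: {"Add": 1}, 2: {"Mult": 1}, 3: {"Add": 1}}
capacity = {"Add": 1, "Mult": 1}
feasible = satisfies_resources({1: 0, 2: 1, 3: 3}, durations, demands, capacity, 2)
conflict = satisfies_resources({1: 0, 2: 1, 3: 2}, durations, demands, capacity, 2)
```

A task longer than the period is counted once per overlapping execution, which is why the inner loop walks over every time unit of its duration.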

4.4 Main Result

The main theorem is a simple outcome of Lemmas 5 and 9:

Theorem 10. Suppose that the unwound schedule $\sigma$ satisfies the precedence relations and resource constraints from $G_w$. Then, for every integer $\Delta \ge \max\{\Delta_1, \Delta_2\}$, the periodic schedule $\bar{\sigma}$ satisfies the precedence relations and resource constraints.

Proof. $\bar{\sigma}$ fulfills the precedence relations if $\Delta \ge \Delta_1$ by Lemma 5. It also fulfills the resource limitations if $\Delta \ge \Delta_2$ by Lemma 9. Thus, the theorem.
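The smallest integer value of Δ allowed by the theorem can be computed as sketched below in Python. The function name `minimum_delta` is hypothetical, and the list of arc heights is an assumption chosen only so that its largest absolute value is 2, as stated for Figure 5; the deadline and release values are those quoted for Figure 7.

```python
def minimum_delta(heights, d_max, r_min, w):
    """Smallest integer that is at least max(Delta_1, Delta_2)."""
    delta_1 = max(abs(h) for h in heights)
    # Integer ceiling of (d_max - r_min) / w without floating point.
    delta_2 = -((r_min - d_max) // w)
    return max(delta_1, delta_2)


# Assumed arc heights with maximum 2; d_max = 8, r_min = 0 and w = 2.
delta = minimum_delta([0, 1, 2], d_max=8, r_min=0, w=2)
```

Rounding the second bound up is needed only because Δ counts iterations and must be an integer.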

Schedule $\bar{\sigma}$ in Figure 8 was built with $\Delta = 4 = \max\{\Delta_1, \Delta_2\}$. Thus, it is feasible.

From a theoretical point of view, building $\bar{\sigma}$ from $\sigma$ is not polynomial: indeed, unwinding up to $q^\star + \max\{\Delta_1, \Delta_2\} \times \sum_{i \in T-\{0\}} \alpha(i)$ iterations of $\sigma$ may be needed to build $\bar{\sigma}$. However, the convergence of $\sigma$ to a periodic schedule may be quite fast for particular real-life cases, as illustrated by the next section.

5 Scheduling Loops on an Embedded VLIW Processor Core

This section illustrates the application of our method to the software pipelining of inner loops for an embedded VLIW processor core widely used in consumer electronics. Resource constraints induced by the processor core execution units are briefly presented first. We then describe how to efficiently compute a priority for the Graham list scheduling algorithm without explicitly unwinding the precedence graph $G$. The last part discusses our experimental results.


5.1 The ST200 VLIW Processor

The ST200 processor core is an implementation of the Lx VLIW architecture jointly developed by STMicroelectronics and HP [14]. This core executes up to 4 operations per time unit with a maximum of one control operation (goto, jump, call, return), one memory operation (load, store, prefetch), and two multiply operations. More precisely, resources are divided into $P = 4$ classes: Issue models the maximum number of simultaneous operations; Control is for control operations; Memory is for memory accesses; Align is an artificial resource shared between the multipliers and the immediate operand generators.

Operations are partitioned into 7 classes according to their resource requirements. The ALU, MUL, MEM and CTL classes contain respectively the arithmetic, multiply, memory and control operations; the ALUX, MULX and MEMX classes contain respectively the arithmetic, multiply and memory operations requiring an extended immediate operand.

Table 3 presents the resource requirements of each operation class. Any resource is used for at most one time unit by any operation. As an example, one operation from ALUX and one from MEMX can be executed at the same time. However, it is impossible to perform simultaneously two operations from classes MEM or MEMX because of the contention for resource Memory.

Resource       Issue  Memory  Control  Align
Availability       4       1        1      2
ALU                1       0        0      0
ALUX               2       0        0      1
MUL                1       0        0      1
MULX               2       0        0      1
MEM                1       1        0      0
MEMX               2       1        0      1
CTL                1       0        1      1

Table 3: Resource requirements and resource availabilities for the ST200 VLIW core.
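The compatibility rule encoded by Table 3 can be sketched in Python: a set of operations may be issued in the same time unit if, for every resource, the summed requirements do not exceed the availability. The numbers are copied from Table 3; the names `AVAILABILITY`, `REQUIREMENTS` and `fits_in_one_time_unit` are illustrative choices.

```python
RESOURCES = ("Issue", "Memory", "Control", "Align")
AVAILABILITY = dict(zip(RESOURCES, (4, 1, 1, 2)))
REQUIREMENTS = {
    "ALU":  dict(zip(RESOURCES, (1, 0, 0, 0))),
    "ALUX": dict(zip(RESOURCES, (2, 0, 0, 1))),
    "MUL":  dict(zip(RESOURCES, (1, 0, 0, 1))),
    "MULX": dict(zip(RESOURCES, (2, 0, 0, 1))),
    "MEM":  dict(zip(RESOURCES, (1, 1, 0, 0))),
    "MEMX": dict(zip(RESOURCES, (2, 1, 0, 1))),
    "CTL":  dict(zip(RESOURCES, (1, 0, 1, 1))),
}


def fits_in_one_time_unit(op_classes):
    """True if the operation classes can all be issued in the same time unit."""
    return all(
        sum(REQUIREMENTS[c][r] for c in op_classes) <= AVAILABILITY[r]
        for r in RESOURCES
    )


alux_with_memx = fits_in_one_time_unit(["ALUX", "MEMX"])
mem_with_memx = fits_in_one_time_unit(["MEM", "MEMX"])
```

The two calls reproduce the examples given in the text: ALUX with MEMX is accepted, whereas MEM with MEMX is rejected because of the Memory resource.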

5.2 Acyclic Scheduling Algorithm

The acyclic scheduler we use is a natural extension of the basic Graham list scheduling algorithm [17] with a job priority proportional to the longest path to a final task. A task is said to be schedulable at time $\delta$ if it fulfills precedence relations, release dates and deadlines. As soon as a resource is free, the algorithm performs the highest priority schedulable job (if any). It fails if some jobs cannot be scheduled properly.
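A minimal time-stepped list scheduler in this spirit can be sketched in Python. It is not the authors' implementation: the function name, the data layout (tuples of duration, release date and deadline; predecessor lists with latencies), the explicit `horizon` bound and the three-task instance are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def list_schedule(tasks, preds, priority, demands, capacity, horizon):
    """Greedy list scheduling; returns {task: start} or None on failure.

    tasks[i] = (duration, release, deadline); preds[i] lists (pred, latency)
    pairs meaning start[pred] + latency <= start[i].
    """
    start = {}
    usage = defaultdict(Counter)  # usage[time][resource]
    for now in range(horizon):
        # Examine unscheduled tasks by decreasing priority.
        pending = sorted((i for i in tasks if i not in start),
                         key=lambda i: -priority[i])
        for i in pending:
            duration, release, deadline = tasks[i]
            ready = release <= now and all(
                p in start and start[p] + latency <= now
                for p, latency in preds.get(i, []))
            if not ready:
                continue
            if now + duration > deadline:
                return None  # this task can no longer meet its deadline
            slots = range(now, now + duration)
            if all(usage[s][r] + demands[i].get(r, 0) <= capacity[r]
                   for s in slots for r in capacity):
                start[i] = now
                for s in slots:
                    usage[s].update(demands[i])
        if len(start) == len(tasks):
            return start
    return None


# Hypothetical instance: three unit tasks sharing one unit resource.
tasks = {"a": (1, 0, 10), "b": (1, 0, 10), "c": (1, 0, 10)}
preds = {"c": [("a", 1)]}
priority = {"a": 2, "b": 1, "c": 1}
demands = {i: {"unit": 1} for i in tasks}
schedule = list_schedule(tasks, preds, priority, demands, {"unit": 1}, horizon=10)
```

Ties between equal priorities are broken here by insertion order, which is an arbitrary choice of this sketch.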

The priority considered is equivalent, up to a constant, to the longest path in the reversed precedence graph from an initial task. This is the same as computing the earliest schedule date of a task in the reversed precedence graph without resource constraints. In the setting of our method, the priority should be computed for all tasks of the unwound problem. However, we save significant effort by instead periodically scheduling the reversed bi-valued graph without resource constraints. The only drawback is that upper bounds of the longest path to a final task are computed, since periodic values are sought.


Consider a bi-valued graph $G = (T, A, \ell, h)$ and a fixed period $w$. We denote by $\bar{G}$ the bi-valued graph built from $G$ by reversing arcs: each arc $e = (i, j)$ from $G$ is replaced in $\bar{G}$ by an arc $\bar{e} = (j, i)$ bi-valued by the pair $(\bar{\ell}_{ji}, \bar{h}_{ji}) = (\ell_{ij}, h_{ij})$. Any arc $e = (i, j)$ of $\bar{G}$ is associated with the constraint $t(i, k) + \bar{\ell}_{ij} \le t(j, k + \bar{h}_{ij})$. As stated before, we only consider periodic schedules: task 0 is performed following the sequence $t(0, k) = (k-1) w$ and, for any task $i \in T-\{0\}$, $t(i, k) = t_i + (k-1) w$. The constraint associated with any arc $e = (i, j)$ thus becomes $t_j - t_i \ge \bar{\ell}_{ij} - \bar{h}_{ij} w$. The minimum values of $t$ may then be computed by using the Bellman-Ford algorithm.
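The Bellman-Ford computation on such difference constraints can be sketched in Python as a longest-path relaxation from task 0. The function name, the arc tuples (i, j, length, height) and the three-node graph are hypothetical; the sketch also assumes that every task is reachable from task 0.

```python
def earliest_periodic_starts(n, arcs, w, source=0):
    """Smallest values t with t[source] = 0 and t[j] - t[i] >= length - w * height
    for every arc (i, j, length, height), or None if a circuit has positive
    value, in which case no periodic schedule of period w exists."""
    minus_inf = float("-inf")
    t = [minus_inf] * n
    t[source] = 0
    for _ in range(n - 1):
        changed = False
        for i, j, length, height in arcs:
            value = length - w * height
            if t[i] != minus_inf and t[i] + value > t[j]:
                t[j] = t[i] + value
                changed = True
        if not changed:
            break
    # One extra pass: any further improvement reveals a positive circuit.
    for i, j, length, height in arcs:
        if t[i] != minus_inf and t[i] + length - w * height > t[j]:
            return None
    return t


# Hypothetical graph: 0 -> 1 -> 2 and a back arc 2 -> 1 of height 2.
arcs = [(0, 1, 0, 0), (1, 2, 3, 0), (2, 1, 1, 2)]
starts_w2 = earliest_periodic_starts(3, arcs, w=2)
starts_w1 = earliest_periodic_starts(3, arcs, w=1)
```

For period 2 the circuit through tasks 1 and 2 has value 0, so a solution exists; for period 1 its value is positive and the function reports infeasibility.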

The computation of the values $t_i$, $i \in T$, is illustrated by Figure 9 for the initial graph $G$ of Figure 5. Since $w = 2$, the corresponding sequences are thus $t(0, k) = t(1, k) = -2k$, $t(2, k) = -1-2k$, $t(3, k) = -3-2k$, $t(4, k) = -6-2k$ and $t(5, k) = -7-2k$.

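[Figure 9: on the left, a graph over the tasks 0 to 5 with valued arcs; on the right, the values $t_0 = 0$, $t_1 = 0$, $t_2 = -1$, $t_3 = -3$, $t_4 = -6$, $t_5 = -7$.]

Figure 9: On the left, the graph built from the graph $\bar{G}$ associated with the graph $G$ of Figure 5. Each arc $e = (i, j)$ of $\bar{G}$ is valued by $v(e) = \bar{\ell}_{ij} - w \bar{h}_{ij}$. On the right, possible values for $t$.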

5.3 Experimental Results

Tables 4 and 5 summarize our experimental results. The benchmarks consist of ten real-life programs extracted from the MediaBench suite [21]. A program is partitioned into blocks of operations, where each block defines a scheduling problem instance. A block that corresponds to the body of an inner program loop defines a cyclic scheduling problem instance: each operation is a generic task assumed to be executed a large number of times, and a periodic schedule is sought for such loops as presented in Figure 5. Outside inner loops, the scheduling problem is assumed to be acyclic and each operation corresponds to a task.

Table 4 quantifies the overhead of our method compared to acyclic scheduling. Note that acyclic scheduling does not build a periodic schedule. The first column displays the benchmark names. The second column corresponds to the number of generic cyclic tasks. The third column gives the number of tasks scheduled with the Graham list scheduling algorithm until the convergence of our method to a periodic schedule. On average, a generic cyclic task is scheduled 13 times by the Graham list scheduling algorithm, and up to 20 times in the adpcm and texgen cases.

The last two columns of Table 4 quantify the overhead of our method when we include the cost of acyclically scheduling non-loop code with the same Graham list scheduling algorithm: the fourth column is the total number of tasks in the benchmarks (generic cyclic + acyclic); the last column displays the total number of tasks scheduled. Since the number of generic cyclic tasks is a small fraction of the total number of tasks, the average number of tasks scheduled by the Graham list scheduling algorithm is about $\frac{438404}{291540} \approx 1.5$ times the total number of tasks.

Benchmark    Cyclic tasks               All tasks
             # tasks    # scheduled     # tasks    # scheduled
adpcm            228           4683         322           4777
epic              89            203        3384           4024
g721dec          258            998        2329           3069
g721enc          226            860        1963           2597
gsm             1412           6360       10407          16049
jpeg6a            73            210        1076           1213
pegwit           593           4331       13522          18931
pgp             2096          17811       60541          81880
rasta             79            623       18177          20559
texgen          4067          82323      179819         285305
Total           9121         118402      291540         438404

Table 4: Number of tasks scheduled in MediaBench.
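The averages quoted for Table 4 can be reproduced from its rows. The short Python check below copies the table values; the variable names are illustrative.

```python
# (cyclic tasks, cyclic scheduled, all tasks, all scheduled), copied from Table 4.
table4 = {
    "adpcm":   (228, 4683, 322, 4777),
    "epic":    (89, 203, 3384, 4024),
    "g721dec": (258, 998, 2329, 3069),
    "g721enc": (226, 860, 1963, 2597),
    "gsm":     (1412, 6360, 10407, 16049),
    "jpeg6a":  (73, 210, 1076, 1213),
    "pegwit":  (593, 4331, 13522, 18931),
    "pgp":     (2096, 17811, 60541, 81880),
    "rasta":   (79, 623, 18177, 20559),
    "texgen":  (4067, 82323, 179819, 285305),
}

totals = tuple(sum(row[k] for row in table4.values()) for k in range(4))

# Average number of times a generic cyclic task is scheduled.
average_unwinding = totals[1] / totals[0]

# Overall scheduling effort relative to the total number of tasks.
overall_overhead = totals[3] / totals[2]

# Benchmarks in which a cyclic task is scheduled at least 20 times on average.
heavy = sorted(name for name, row in table4.items() if row[1] / row[0] >= 20)
```

The totals match the last row of the table, and the two ratios correspond to the figures of about 13 and about 1.5 given in the text.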

Table 5 compares the performance of our method coupled to a Graham list scheduling algorithm versus the near-optimal modulo scheduler of Dupont de Dinechin et al. [12]. That scheduler solves an Integer Linear Programming formulation with large neighborhood search using the CPLEX 9.0 solver. Only cyclic scheduling problem instances are considered here.

The second column displays the number of inner loops in each benchmark whose body is a single block of operations. For each corresponding cyclic scheduling problem, a lower bound of the optimal period is computed, taking into account the critical circuit of $G$, i.e.

$$w_\infty = \max_{\text{circuit } c} \frac{L(c)}{H(c)},$$

and the resources required, i.e.

$$w_R = \max_{j \in \{1, \cdots, P\}} \frac{\sum_{i \in T} b_{ij}}{m_j}.$$

Its value is then fixed to $w_{min} = \max(w_\infty, w_R)$. The third column counts the number of times the near-optimal modulo scheduler computes a schedule with a period equal to $w_{min}$; the fourth column is the sum of the periods. We observe that the periods computed by near-optimal modulo scheduling are very close to the lower bounds. Columns five and six report the corresponding values for our method coupled to the Graham list scheduling algorithm.
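The resource part of this lower bound is straightforward to compute. The Python sketch below uses requirement values copied from Table 3 for a subset of the operation classes; the loop body, the function names, and the rounding up to an integer period are assumptions of this illustration, and the circuit bound is taken as a given input rather than computed.

```python
from math import ceil

AVAILABILITY = {"Issue": 4, "Memory": 1, "Control": 1, "Align": 2}
REQUIREMENTS = {  # subset of Table 3
    "ALU":  {"Issue": 1, "Memory": 0, "Control": 0, "Align": 0},
    "MUL":  {"Issue": 1, "Memory": 0, "Control": 0, "Align": 1},
    "MEM":  {"Issue": 1, "Memory": 1, "Control": 0, "Align": 0},
    "MEMX": {"Issue": 2, "Memory": 1, "Control": 0, "Align": 1},
    "CTL":  {"Issue": 1, "Memory": 0, "Control": 1, "Align": 1},
}


def resource_bound(op_classes):
    """Largest ratio of total requirement to availability over all resources,
    rounded up (assumption: periods are integers)."""
    return max(
        ceil(sum(REQUIREMENTS[c][r] for c in op_classes) / m)
        for r, m in AVAILABILITY.items()
    )


def period_lower_bound(w_inf, w_res):
    """Combine the circuit bound and the resource bound."""
    return max(w_inf, w_res)


# Hypothetical loop body with seven operations.
loop_body = ["ALU", "ALU", "MUL", "MEM", "MEM", "MEMX", "CTL"]
w_res = resource_bound(loop_body)
w_min = period_lower_bound(2, w_res)  # circuit bound 2 is an assumed value
```

For this loop body the three memory operations compete for the single Memory resource, which determines the bound.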

Our periodic scheduling method behaves surprisingly well on the benchmarks, given the simplicity of the priority function supplied to the Graham list scheduling algorithm. Indeed, our method reaches the lower bound for about 60 percent of loops. Moreover, its sum of periods is less than 9 percent greater than that of the near-optimal modulo scheduler. Its implementation is also very simple compared to the near-optimal modulo scheduler, which relies on an external solver (CPLEX). Lastly, the number of scheduling steps required for its convergence is low enough that the practical time complexity remains comparable to scheduling with the same acyclic scheduler. This property is important in contexts such as just-in-time program compilation, where software pipelining is not yet used because of its compilation costs.

Benchmark    # Loops    Near-opt. mod. sched.          Method + Graham list
                        # reached    Sum of periods    # reached    Sum of periods
adpcm              2            1                61            0                62
epic               2            2                60            2                60
g721dec           11           11                63           11                63
g721enc            9            9                55            9                55
gsm               32           32               437           27               449
jpeg6a             3            3                45            3                45
pegwit            22           22               225           17               231
pgp               75           73               715           58               736
rasta              4            4                34            2                38
texgen           135          134              1291           46              1492
Total            295          291              2986          175              3231

Table 5: Near-optimal modulo scheduler vs. our method with a Graham list scheduler.
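The percentages quoted in the discussion follow from the totals of Table 5. The Python check below copies those totals; the variable names are illustrative.

```python
# Totals copied from the last row of Table 5.
loops = 295
near_opt_reached, near_opt_periods = 291, 2986
ours_reached, ours_periods = 175, 3231

# Fraction of loops for which each scheduler reaches the lower bound w_min.
ours_reached_ratio = ours_reached / loops
near_opt_reached_ratio = near_opt_reached / loops

# Relative increase of the sum of periods with respect to the near-optimal scheduler.
extra_period_ratio = ours_periods / near_opt_periods - 1
```

The first ratio is slightly above 59 percent, consistent with "about 60 percent", and the relative increase of the sum of periods is slightly above 8 percent, consistent with "less than 9 percent".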

6 Conclusion

This paper presents a new method for periodic scheduling in the presence of precedence relations and resource constraints with periodic release dates and deadlines. Its principle is to apply any acyclic scheduler to a related scheduling problem obtained by suitably regularizing and unwinding the original cyclic scheduling problem.

For any fixed value of the period, we provide a bound on the number of iterations to unwind to ensure that the acyclic schedule fulfills the precedence relations and the resource constraints of the original cyclic scheduling problem. Under these conditions, our method either converges to a periodic schedule, or fails to schedule the unwound problem. If the acyclic scheduler is non-heuristic, such failure implies that periodic scheduling is not feasible for the given period.

Our method is successfully applied to an industrial problem, namely the software pipelining of inner loops on an embedded VLIW processor core. This illustrates its simple implementation and its practical effectiveness, in the case where the acyclic scheduler used is the Graham list scheduling algorithm with job priority set to an upper bound of the longest path to a final task. Our method may also be considered for finding feasible solutions to wider classes of cyclic scheduling problems with precedence relations and resource constraints (such as hoist problems or real-time scheduling).

References

[1] Vicki H. Allan, Reese B Jones, Randall M Lee, and Stephen J. Allan. Software pipelining. ACM Comput. Surv., 23(7):367–432, 1995.

[2] Ronald Armstrong, Lei Lei, and Shanhong Gu. A bounding scheme for deriving the minimal cycle time of a single-transporter n-stage process with time-window constraints. European Journal of Operational Research, 78(1):130–140, 1994.

[3] P. Brucker, A. Drexl, R. Mohring, K. Neumann, and E. Pesch. Resource-Constrained Project Scheduling: Notation, Classification, Models and Methods. European Journal of Operational Research, 112:3–41, 1999.

[4] Peter Brucker and Thomas Kampmeyer. A general model for cyclic machine scheduling problems. Discrete Applied Mathematics, 156(13):2561–2572, 2008.

[5] Haoxun Chen, Chengbin Chu, and Jean-Marie Proth. Cyclic scheduling of a hoist with time window constraints. IEEE Transactions on Robotics and Automation, 14(1):144–152, 2002.

[6] Philippe Chretienne. Timed event graphs: A complete study of their controlled executions. In International Workshop on Timed Petri Nets, pages 47–54, Torino, Italy, July 1985. IEEE Computer Society.

[7] Philippe Chretienne. The basic cyclic scheduling problem with deadlines. Discrete Applied Mathematics, 30:109–123, 1991.

[8] Guy Cohen, Pierre Moller, Jean-Pierre Quadrat, and Michel Viot. Algebraic tools for the performance evaluation of discrete event systems. IEEE Proceeding: special issue on Discrete Event Systems, 77(1):39–58, 1989.

[9] Y Crama, V Kats, J van de Klundert, and E Levner. Cyclic scheduling in robotic flowshops. Annals of Operations Research, 96:97–124, 2000.

[10] Liliana Cucu and Yves Sorel. Schedulability condition for systems with precedence and periodicity constraints without preemption. In RTS2003 11th conference on Real-time and embedded systems, 2003.

[11] Benoit Dupont de Dinechin. Time-indexed formulations and a large neighborhood search for the resource-constrained modulo scheduling problem. In MISTA2007, 3rd Multidisciplinary International Scheduling Conference: Theory and Applications, August 2007.

[12] Benoit Dupont de Dinechin, Christian Artigues, and Sadia Azem. Resource constrained modulo scheduling. In C. Artigues, S. Demassey, and E. Neron, editors, Resource-Constrained Project Scheduling, pages 267–277. ISTE and John Wiley, London, 2008.

[13] Alexandre E. Eichenberger and Edward S. Davidson. Efficient formulation for optimal modulo schedulers. In PLDI ’97: Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation, pages 194–205, New York, NY, USA, 1997. ACM.

[14] Paolo Faraboschi, Geoffrey Brown, Joseph A. Fisher, Giusseppe Desoli, and Fred Homewood. Lx: a Technology Platform for Customizable VLIW Embedded Processing. In ISCA’00: Proc. of the 27th annual Int. Symposium on Computer Architecture, pages 203–213, 2000.

[15] Franco Gasperoni and Uwe Schwiegelshohn. Generating close to optimum loop schedules on parallel processors. Parallel Processing Letters, 4:391–403, 1994.

[16] Ramaswamy Govindarajan, Erik R. Altman, and Guang R. Gao. A framework for resource-constrained rate-optimal software pipelining. IEEE Transactions on Parallel Distributed Systems, 7(11):1133–1149, 1996.

[17] R.L. Graham. Bounds on the performance of scheduling algorithms. In E.G. Coffman, editor, Computer and job-shop scheduling theory. John Wiley Ltd, 1976.

[18] Claire Hanen. Study of a NP-hard cyclic scheduling problem: the recurrent job-shop. European Journal of Operational Research, 72:82–101, 1994.

[19] Claire Hanen and Alix Munier. Cyclic scheduling on parallel processors: An overview. In Philippe Chretienne, Edward G. Coffman, Jan Karel Lenstra, and Zhen Liu, editors, Scheduling theory and its applications. J. Wiley and sons, 1994.

[20] Ja-Hee Kim, Tae-Eog Lee, Hwan-Yong Lee, and Doo-Byeong Park. Scheduling analysis of time-constrained dual-armed cluster tools. IEEE Transactions on Semiconductor manufacturing, 16(3):521–534, 2003.

[21] Chunho Lee, Miodrag Potkonjak, and William H. Mangione-Smith. Mediabench: a tool for evaluating and synthesizing multimedia and communications systems. In Microarchitecture, 1997. Proceedings., Thirtieth Annual IEEE/ACM International Symposium on, pages 330–335, Dec 1997.

[22] Tae-Eog Lee and Seong-Ho Park. An extended event graph with negative places and tokens for time window constraints. IEEE Transactions on Automation Science and Engineering, 2(4):319–332, October 2005.

[23] Tae-Eog Lee and Seong-Ho Park. Steady state analysis of a timed event graph with time window constraints. In IEEE International Conference on Automation Science and Engineering, pages 404–409, Edmonton, Canada, August 2005.

[24] Martin Middendorf and Vadim G. Timkovsky. On scheduling cycle shops: Classification, complexity and approximation. Journal of Scheduling, 5(2):135–169, 2002.

[25] Alix Munier Kordon. A graph-based analysis of the cyclic scheduling problem with time constraints: schedulability and periodicity of the earliest schedule. Journal of Scheduling, 14:103–117, February 2011.

[26] A. Alan B. Pritsker, Lawrence J. Waiters, and Philip M. Wolfe. Multiproject scheduling with limited resources: A zero-one programming approach. Management Science, 16(1):93–108, 1969.

[27] Chander Ramchandani. Analysis of asynchronous concurrent systems by Petri nets. PhD thesis, Massachusetts Institute of Technology, Cambridge, USA, 1973.

[28] B. Ramakrishna Rau. Iterative modulo scheduling. International Journal of Parallel Processing, 24(1):3–64, Feb 1996.

[29] Premsyl Sucha and Zdenek Hanzalek. A cyclic scheduling problem with an undetermined number of parallel identical processors. Computational Optimization and Applications, 48:71–90, 2011.

[30] Premsyl Sucha, Zdenek Pohl, and Zdenek Hanzalek. Scheduling of iterative algorithms on fpga with pipelined arithmetic unit. In 10th IEEE Real-Time and Embedded Technology and Applications Symposium, RTAS, pages 404–412, 2004.