harmonic aware multi-core scheduling for fixed-priority real-time systems

13
Harmonic-Aware Multi-Core Scheduling for Fixed-Priority Real-Time Systems Ming Fan, Student Member, IEEE, and Gang Quan, Senior Member, IEEE Abstract—This paper presents a new semipartitioned approach to schedule sporadic tasks on multicore platforms based on the Rate Monotonic Scheduling policy. To improve the schedulability, our approach exploits the fact that the utilization bound of a task set increases as task periods become closer to harmonic on single processor platforms. The challenge for our approach, however, is how to take advantage of this fact to assign and split appropriate tasks on different processors in the semipartitioned approach, and how to guarantee the schedulability of real-time tasks. We formally prove that our scheduling approach can successfully schedule any task set with a system utilization bounded by Liu&Layland’s bound for N tasks, that is, Nð2 1=N 1Þ. Our extensive experimental results demonstrate that the proposed algorithm can significantly improve the scheduling performance compared with the previous work. Index Terms—Harmonic, real-time semipartitioned scheduling, fixed-priority, rate monotonic scheduling (RMS) Ç 1 INTRODUCTION A S embedded applications become more and more complicated, embedded system designers rely more on multi-processor or multi-core platforms to obtain high computing performance [1], [2]. Meanwhile, due to the power/thermal constraints, the memory bottleneck, as well as the limitation of the instructional level parallelism in programs [3], industry is changing its gear toward the multi-core architecture rather than continuing to pursue high performance uniprocessor architecture. Conceivably, most of the future embedded systems will be built upon multi-core architectures. A major issue in developing multi- core computing systems is how to utilize the available computing resources most effectively. This is particularly critical for real-time systems with stringent timing con- straints. It is a well known fact that scheduling real-time tasks on multiprocessor platform is NP-hard [4]. Traditionally, the well-known RT scheduling algo- rithms, such as the Rate Monotonic Scheduling (RMS) and Earliest Deadline First (EDF) scheduling, have been proven to be optimal for uniprocessor scheduling [5]. However, when the problem comes to multi-core platform, these optimal algorithms are no longer optimal any more [6]. There have been extensive literature published on real- time scheduling for multi-core systems [7], [8], [9], [10]. These scheduling algorithms can be largely categorized into two classes [6], [11]: the partitioned approach (e.g., [7]) and the global (or non-partitioned) approach (e.g., [8]). In the partitioned scheduling approach, each real-time task is assigned to a dedicated processor. All instances from the same task will be executed solely on that particular processor. In the global scheduling approach, all jobs from different tasks first enter a global queue, and thus each task can be potentially executed on any processor. Both approaches have their own pros and cons, and none of them dominates the other one in terms of schedulability [11]. Recently, a new multi-core scheduling approach, i.e., so called semi-partitioned approach [12], [13], [14], [10], [9], [15], [16], has been proposed. In the semi-partitioned scheduling approach, most tasks are assigned to one particular processor, i.e., the same as the partitioned scheduling approach. However, a few of tasks (i.e., no more than ðM 1Þ tasks, where M is the number of processors) are allowed to be split into several subtasks and assigned to different processors. From a different perspective, these tasks can migrate among different processors. The semi- partitioned approach not only outperforms the traditional partitioned approach and global approach theoretically [10], [15], [17], but also has been shown as sound and practical in the real implementation [16]. Furthermore, by implementing the semi-partitioned scheduling method in the Linux operating system, and running experiments on an Intel Core-i7 4-cores computer, Zhang et al. [18] showed that the overhead in the task migration can be relatively low, and thus its impact on the schedulability is small. In this paper, we present a new semi-partitioned strategy and related schedulability analysis for sporadic tasks on multi-core platform based on RMS. Compared with the existing work on semi-partitioning of real-time tasks, we have made a number of novel contributions. First, we take the harmonic relation among tasks into consider- ation for fixed-priority semi-partitioned scheduling strat- egy on multi-core platform. As shown in our motivational example, taking advantage of harmonic property in semi- partitioned scheduling is non trivial. Two new semi- partitioned algorithms are developed. The first algorithm, namely Harmonic Semi-Partition for Light tasks (HSP-light), is . The authors are with the Department of Electrical and Computer Engineering, Florida International University, Miami, FL 33174 USA. E-mail: {mfan001, gaquan}@fiu.edu. Manuscript received 13 Apr. 2012; revised 10 Jan. 2013; accepted 19 Feb. 2013. Date of publication 18 Mar. 2013; date of current version 16 May 2014. Recommended for acceptance by H. Jiang. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference the Digital Object Identifier below. Digital Object Identifier no. 10.1109/TPDS.2013.71 1045-9219 Ó 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 25, NO. 6, JUNE 2014 1476

Upload: wingz-technologies

Post on 16-Apr-2017

322 views

Category:

Education


0 download

TRANSCRIPT

Page 1: Harmonic aware multi-core scheduling for fixed-priority real-time systems

Harmonic-Aware Multi-Core Scheduling forFixed-Priority Real-Time Systems

Ming Fan, Student Member, IEEE, and Gang Quan, Senior Member, IEEE

Abstract—This paper presents a new semipartitioned approach to schedule sporadic tasks on multicore platforms based on theRate Monotonic Scheduling policy. To improve the schedulability, our approach exploits the fact that the utilization bound of atask set increases as task periods become closer to harmonic on single processor platforms. The challenge for our approach,however, is how to take advantage of this fact to assign and split appropriate tasks on different processors in the semipartitionedapproach, and how to guarantee the schedulability of real-time tasks. We formally prove that our scheduling approach cansuccessfully schedule any task set with a system utilization bounded by Liu&Layland’s bound for N tasks, that is, Nð21=N � 1Þ.Our extensive experimental results demonstrate that the proposed algorithm can significantly improve the scheduling performancecompared with the previous work.

Index Terms—Harmonic, real-time semipartitioned scheduling, fixed-priority, rate monotonic scheduling (RMS)

Ç

1 INTRODUCTION

AS embedded applications become more and morecomplicated, embedded system designers rely more

on multi-processor or multi-core platforms to obtain highcomputing performance [1], [2]. Meanwhile, due to thepower/thermal constraints, the memory bottleneck, as wellas the limitation of the instructional level parallelism inprograms [3], industry is changing its gear toward themulti-core architecture rather than continuing to pursuehigh performance uniprocessor architecture. Conceivably,most of the future embedded systems will be built uponmulti-core architectures. A major issue in developing multi-core computing systems is how to utilize the availablecomputing resources most effectively. This is particularlycritical for real-time systems with stringent timing con-straints. It is a well known fact that scheduling real-timetasks on multiprocessor platform is NP-hard [4].

Traditionally, the well-known RT scheduling algo-rithms, such as the Rate Monotonic Scheduling (RMS) andEarliest Deadline First (EDF) scheduling, have been provento be optimal for uniprocessor scheduling [5]. However,when the problem comes to multi-core platform, theseoptimal algorithms are no longer optimal any more [6].

There have been extensive literature published on real-time scheduling for multi-core systems [7], [8], [9], [10]. Thesescheduling algorithms can be largely categorized into twoclasses [6], [11]: the partitioned approach (e.g., [7]) and theglobal (or non-partitioned) approach (e.g., [8]). In the partitionedscheduling approach, each real-time task is assigned to a

dedicated processor. All instances from the same task will beexecuted solely on that particular processor. In the globalscheduling approach, all jobs from different tasks first enter aglobal queue, and thus each task can be potentially executedon any processor. Both approaches have their own pros andcons, and none of them dominates the other one in terms ofschedulability [11].

Recently, a new multi-core scheduling approach, i.e., socalled semi-partitioned approach [12], [13], [14], [10], [9], [15],[16], has been proposed. In the semi-partitioned schedulingapproach, most tasks are assigned to one particularprocessor, i.e., the same as the partitioned schedulingapproach. However, a few of tasks (i.e., no more thanðM � 1Þ tasks, where M is the number of processors) areallowed to be split into several subtasks and assigned todifferent processors. From a different perspective, thesetasks can migrate among different processors. The semi-partitioned approach not only outperforms the traditionalpartitioned approach and global approach theoretically[10], [15], [17], but also has been shown as sound andpractical in the real implementation [16]. Furthermore, byimplementing the semi-partitioned scheduling method inthe Linux operating system, and running experiments on anIntel Core-i7 4-cores computer, Zhang et al. [18] showed thatthe overhead in the task migration can be relatively low,and thus its impact on the schedulability is small.

In this paper, we present a new semi-partitionedstrategy and related schedulability analysis for sporadictasks on multi-core platform based on RMS. Comparedwith the existing work on semi-partitioning of real-timetasks, we have made a number of novel contributions. First,we take the harmonic relation among tasks into consider-ation for fixed-priority semi-partitioned scheduling strat-egy on multi-core platform. As shown in our motivationalexample, taking advantage of harmonic property in semi-partitioned scheduling is non trivial. Two new semi-partitioned algorithms are developed. The first algorithm,namely Harmonic Semi-Partition for Light tasks (HSP-light), is

. The authors are with the Department of Electrical and Computer Engineering,Florida International University, Miami, FL 33174 USA. E-mail: {mfan001,gaquan}@fiu.edu.

Manuscript received 13 Apr. 2012; revised 10 Jan. 2013; accepted 19 Feb.2013. Date of publication 18 Mar. 2013; date of current version 16 May 2014.Recommended for acceptance by H. Jiang.For information on obtaining reprints of this article, please send e-mail to:[email protected], and reference the Digital Object Identifier below.Digital Object Identifier no. 10.1109/TPDS.2013.71

1045-9219 � 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 25, NO. 6, JUNE 20141476

Page 2: Harmonic aware multi-core scheduling for fixed-priority real-time systems

intended for task sets with utilization factor of each taskno more than 0.5. The second one, namely Harmonic Semi-Partition (HSP), is developed for more general task sets, i.e.,the utilization factor of each task is no more than 1. Second,we present new schedulability analysis results for the semi-partitioned scheduling algorithms developed in this paper.Note that, to maximally utilize a processor such thatadding more high priority tasks will cause deadline missdoes not immediately imply the validity of Liu&Layland’sbound [5] for semi-partitioned scheduling, since when atask needs to migrate to a different processor, its deadlinebecomes smaller than its period. We formally prove thatthe proposed algorithms can guarantee the schedulabilityfor task sets with utilizations no larger than theLiu&Layland’s bound. Moreover, different from the ap-proach in [15], task sets with utilizations higher than theLiu&Layland’s bound may also be schedulable with ourapproaches. Third, we conducted extensive experiments tostudy the performance of our approach, and our experi-mental results demonstrate that our proposed algorithmscan significantly outperform previous work. A preliminaryversion of this paper has been published in [19].

The rest of the paper is organized as follows. Section 2discusses the related work. Section 3 introduces systemmodels and other background information necessary forthis paper. Sections 4 and 5 present two semi-partitionedalgorithms we developed. Experiments and results arediscussed in Section 6, and we present the conclusions inSection 7.

2 RELATED WORK

In this section, we discuss the related work from two aspects:the work that exploit the harmonic property for periodictasks and the work on semi-partitioned scheduling.

The property of harmonic tasks, i.e., the tasks withperiods being integer multiples of each other, has beenwidely studied on uniprocessor systems. Compared withthe Liu&Layland’s bound, many researchers have proposedmore efficient bound for RMS uniprocessor scheduling.One known result is that if all tasks are harmonic in a taskset, the utilization bound can be as high as 1 [20]. Han et al.[21] proposed a polynomial-time method to determinethe task set schedulability through testing the schedul-ability of a harmonic task set derived from the original taskset. They proved that any task set that can pass the schedul-ability test by Liu&Layland’s bound can pass the proposedtest. Kuo et al. [22] presented another polynomial-timeschedulability test method. By combining harmonic tasksinto one task, the method can reduce the effective number oftasks and then the Liu&Layland’s bound can be used to testthe schedulability. There are also a number of otherresearches that study the relationship between systemschedulability and task periods under RMS for uniproces-sor scheduling [23], [24], [25]. For multiple processor RMSscheduling, Jung [26] et al. studied the problem of sched-uling harmonic tasks on a uniform multiprocessor platform.Muller [27] adopted the schedulability test by Han et al. [21]to minimize the number of processors, and Fan et al. [28]proposed a scheduling technique that improves thesystem schedulability by taking advantage of the harmonic

relation among tasks. All these work indicate that systemschedulability can be greatly improved if harmonicrelations among different tasks can be appropriatelyexploited for RMS scheduling on both single and multiplecore platforms.

Semi-partitioned scheduling, by splitting a few tasks,has been shown as an effective and practical schedulingmethod to improve the system utilization significantlycompared with the traditional global scheduling andpartitioned scheduling (e.g., [12], [13], [14], [19], [10], [9],[15], [16].) As an example, the best known utilization boundfor either global or partitioned fixed-priority schedule is nomore than 50 percent [7], [29], [8], while the utilizationbound can reach much higher using semi-partitionedscheduling. For instance, Lakshmanan et al. [10] haveshown an utilization bound of 65 percent, and Guan et al.[15], [30] improved this bound to the traditionalLiu&Layland’s bound, i.e., 69.3 percent as the number oftasks goes to infinite, or any valid utilization bounds (suchas the K-bound [22] or R-bound [31]) established on singleprocessor platforms. Kandhalu et al. [32] proposed twosemi-partitioned scheduling algorithms. They show that,for task sets with each individual task utilization factor nomore than 0.5, the utilization bound can increase with thenumber of cores and approach 100 percent.

We believe that taking advantage of the harmonicrelationship among task periods can greatly improve theschedulability of a semi-partitioned algorithm. Some of theexisting approaches (such as the ones in [30], [32]) exploitthis relationship by using the R-Bound [31], i.e., a utilizationbound that takes the possible harmonic relationship intoconsideration. However, employing R-bound cannot deter-mine the schedulability of a task set as accurate as the worstcase analysis. Moreover, in order to use R-bound, all taskshave to go through a period transformation process. Afterthe transformation, Kandhalu et al. [32] proposed toallocate the tasks with the smallest periods together.Unfortunately, these tasks do not necessarily form a taskset closest to harmonic. In our approach, we developed ametric to quantitatively measure how harmonic a task setis, and based on this metric, to effectively allocate taskscloser to harmonic to the same processor. In addition, wecan still employ the worst case analysis to determine themaximal capacity of a processor when adding a task to itand thus has a much better scheduling performance. Theproposed scheduling algorithm can guarantee a utilizationbound the same as Liu&Layland’s bound.

3 PRELIMINARY

We are interested in the problem of semi-partitionedscheduling of sporadic tasks on multicore platform basedon RMS, which is known as an NP-hard problem [4]. In thissection, we first present our system models used in thispaper, and then we introduce some pertinent backgroundinformation and concepts necessarily for our research. Wethen use an example to motivate our research.

3.1 System ModelsThe real-time system considered in this paper consists of Nsporadic tasks, denoted as G ¼ f�1; �2; . . . ; �Ng, and executed

FAN AND QUAN: HARMONIC-AWARE MULTI-CORE SCHEDULING FOR FIXED-PRIORITY REAL-TIME SYSTEMS 1477

Page 3: Harmonic aware multi-core scheduling for fixed-priority real-time systems

on M identical processors, i.e., P ¼ fP1; P2; . . . ; PMg. Eachtask �i 2 G, is characterized by a tuple ðCi; TiÞ, where Ci isthe worst-case execution time of �i, and Ti is the minimum inter-arrival time between any two consecutive jobs of �i. Ti is alsocalled the period of �i in this paper. For the sake of simplicity,we use GPm to denote the task set on processor Pm. For therest of this paper, we make two assumptions: 1) thedeadline of each task is equal to its period; 2) G is sortedwith decreasing priority order, i.e., task �i has a higherpriority than �j if i G j.

The utilization factor of a task �i is denoted as ui where

ui ¼CiTi: (1)

Based on its utilization factor, a task can be light or heavy,which we formally defined below:

Definition 1. Task �i is called a light task if ui � 12, or a heavy

task otherwise.

Note that, even though we used the same terminology asthat in [15], our definitions of light and heavy tasks aretotally different. The total utilization of a task set G is denotedas UðGÞ where

UðGÞ ¼X�i2G

ui: (2)

The system utilization of task set G on a multi-core platformwith M processors is denoted as UMðGÞ, where

UMðGÞ ¼UðGÞM

: (3)

Liu and Layland [5] showed that a task set G can be feasiblyscheduled by RMS on a uniprocessor if

UðGÞ � QðNÞ ¼ N 21=N � 1� �

: (4)

QðNÞ is also traditionally referred to as the Liu&Layland’sbound.

3.2 On Semi-Partitioned SchedulingA semi-partitioned scheduling algorithm consists of twophases: the partitioning phase and the scheduling phase.

In the partitioning phase, most tasks will be assigned toone processor and can be executed only at that particularprocessor during running time. These tasks are called non-split tasks [15]. A few other tasks, so called split tasks, areallowed to be split into several subtasks and assigned todifferent processors with the purpose of maximallyutilizing the processor. Let task �i be a task that is splitinto three subtasks, i.e., �b1

i , �b2i , and �ti , executed on pro-

cessor P1, P2, and P3, respectively. The total execution timeof �b1

i , �b2i , and �ti equals to Ci. Specifically, the last subtask

of �i, i.e., �ti is called tail task, and other subtasks of �i, i.e.,�b1i and �b2

i , are called body tasks. For ease of presentation, weuse CB

i and uBi to represent the total execution time andutilization of all body tasks from a split task �i, respectively.Note that, once the partitioning phase is done, the assign-ment of a subtask to a processor is permanent and thesubtask can only run on that designated processor.

In the scheduling phase, the scheduling strategy for eachprocessor is determined. In our case, all tasks assigned tothe same processor are scheduled strictly conforming toRMS policy, i.e., the task with a smaller period always has ahigher priority. One complexity, however, is to executemultiple subtasks assigned to different processors accord-ing to the original logical order sequentially. Since thescheduler at the operating system level does not necessarilyknow the nature of a real-time process, to execute multiplesubtasks from the same task concurrently may violate thedata or control dependency and thus leads to invalidcomputing results. Therefore, it is vital to make sure thateach subtask is executed according to its logical order andwithout overlapping with other subtasks.

We adopt an existing approach [9], [15], [17] to solvethis problem and assume that an appropriate timer isavailable to monitor the execution of body/tail tasks.Specifically, the scheduler will assign a timer to a split task,e.g., �i in the above example. When �i arrives, the schedulerdispatches �b1i to processor P1 immediately and sets thetimer to Cb1

i . After the timer expires, the scheduler thendispatches �b2i to processor P2 and sets the timer to Cb2

i .Then if the timer expires again, the scheduler releases �ti toprocessor P3. As such, all subtasks split from the same taskcan only run sequentially following their logical orders toensure the correctness of program. Therefore, the body/tail tasks from the same task can be viewed as tasks withthe same periods but different starting times, and thesynchronization problem for split tasks from the same taskcan be easily resolved in practice. For more details aboutthe semi-partitioned scheduling, readers can refer to [15],[9], [10], [33].

3.3 Motivation ExamplesBefore we present our approach in detail, we first use anexample to motivate our research. Since tasks withharmonic relationship have much higher schedulabilityon a single processor, an intuitive approach wouldtherefore be the one that groups harmonic tasks togetherand assigns them to one processor. Unfortunately, such anaive approach may not work in the semi-partitionedapproach.

Consider a two-processor platform with a task set shownin Table 1. Since �1 and �3 are harmonic, we can group �1 and�3 to one processor, i.e., Processor 1. Similarly, we can group�2 and �4 to the other processor, i.e., Processor 2. Since noprocessor can accommodate �5 entirely, we have to split �5

between these two processors. There are two problems withthis assignment. First, as shown in Fig. 1a, the maximumcapacity that can be accommodated in Processor 1 is 10.Since the subtasks from �5 cannot be executed concurrentlyon two processors, at most 4 time units from Processor 2 can

TABLE 1Task Set with Five Real-Time Tasks

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 25, NO. 6, JUNE 20141478

Page 4: Harmonic aware multi-core scheduling for fixed-priority real-time systems

be utilized by �5 as shown in Fig. 1b. As a result, �5 cannotcomplete before its deadline even if all available time unitsare used for its execution. Second, in order to use all 4 timeunits on Processor 2, we need complicated process migra-tion controls and synchronization mechanisms, whichincrease not only the switching overhead, but also thecontrol complexity among different processors. Note that, ifwe assign �1 and �5 to one processor, and the other tasks toanother processor, it is not difficult to verify that theschedule is feasible.

As indicated by this example, to take the advantage ofharmonic relationship among tasks to improve the sche-dulability, a critical problem is how to judiciously choosethe task to split and to synchronize among differentprocessors. To solve this problem, we present two novelsemi-partitioned algorithms, i.e., HSP-light and HSP, in thefollowing sections.

4 THE HSP-LIGHT ALGORITHM

The HSP-light algorithm is a harmonic semi-partitionedalgorithm developed for light tasks. When employing theharmonic relationship to improve the scheduling perfor-mance, it is not necessary that all tasks in the same task setare strictly harmonic. To this end, we first introduce ametric, namely the harmonic index, to quantify the degree ofharmonicity for a task set. We then discuss our newalgorithm that employs this metric. Finally, we study theschedulability of this algorithm.

4.1 Quantifying the HarmonicitySince not all tasks in a given task set are harmonic, it isdesirable that we can quantify the harmonicity of a task set,i.e., how close a task set is to a harmonic task set.Conceivably, the higher the harmonicity of a task set, thehigher the system utilization can be. To achieve this goal,we first introduce the following concept.

Definition 2. Given a task set G ¼ f�1; �2; . . . ; �Ng sorted withdecreasing priority order under RMS, where �i ¼ ðCi; TiÞ, letG0 ¼ f� 01; � 02; . . . ; � 0Ng where � 0i ¼ ðCi; T 0i Þ, T 0i � Ti, and T 0i jT 0j

if i G j. (Note ajb means ‘‘a divides b’’ or ‘‘b is an integermultiple of a’’.) Then G0 is called a sub harmonic task set of G.Moreover, for any sub harmonic task set of G, let

DU 0 ¼ UðG0Þ � UðGÞ; if UðG0Þ � 1,þ1; otherwise.

�(5)

From equation (5), DU 0 defines the ‘‘distance’’ of a task setto the corresponding sub harmonic task set in terms of itstotal utilization factor. If the utilization of that subharmonic task set is greater than 1, then the ‘‘distance’’ isset to be infinity.

Given a task set, there may be more than one subharmonic task sets. One type of sub harmonic task sets thatis of most interest to us, which we call the primary harmonictask set, is formally defined as follows.

Definition 3. Let G0 be a sub harmonic task set of G. Then G0 iscalled a primary harmonic task set of G if there exists no othersub harmonic task set G00 such that T 0i � T 00i for all 1 � i � N .

We are now ready to define a metric, i.e., the harmonicindex, to measure the harmonicity of a real-time task set.

Definition 4. Given a task set G, let GðGÞ represent all primaryharmonic task sets of G. Then the harmonic index of G,denoted as HðGÞ, is defined as

HðGÞ ¼ minG02GðGÞ

DU 0: (6)

From equation (6), the harmonic index essentially definesthe minimal ‘‘distance’’ of a task set to its primaryharmonic task sets in terms of its total utilization factor.If no primary harmonic task set satisfies UðG0Þ � 1, thenthe ‘‘distance’’ is set to infinity. In this paper, we adopt theDCT algorithm [21] to find primary harmonic task setswith a complexity of OðN2Þ.

For a real-time task set and its primary harmonic tasksets, it is not difficult to prove the following theorem [21].

Theorem 1: [21]. Let G0 be a primary harmonic task set of G.Then G is feasible on uniprocessor under RMS if UðG0Þ � 1.

In what follows, we introduce how we develop the HSP-light algorithm based on this index.

4.2 Algorithm DetailsHSP-light algorithm assigns tasks to processors from lowerpriority to higher priority ones. A task is assigned to aprocessor that can accommodate it and also with theresulting task set having the lowest harmonic index. Inother words, a task will be assigned to a feasible processorwith the highest harmonic relationship for the resultingtask set. The schedulability of the result task set can beguaranteed by performing the exact timing analysis [34] onthe corresponding synchronized task set, i.e., assuming alltasks start at the same time. If a task cannot be accommo-dated entirely by any processor, then split occurs.

To split a task, we adopt a simple heuristic that assignssubtasks to the processor with the highest availablecapacity. There are two advantages using this splitting

Fig. 1. Allocation fails when simply grouping harmonic tasks andassigning them to the same processor. (a) Processor 1. (b) Processor 2.

FAN AND QUAN: HARMONIC-AWARE MULTI-CORE SCHEDULING FOR FIXED-PRIORITY REAL-TIME SYSTEMS 1479

Page 5: Harmonic aware multi-core scheduling for fixed-priority real-time systems

strategy: 1) It reduces the total split times by efficientlymaximizing the workload for each split subtask. 2) Itguarantees the priority of each body task to be the highestone on its host processor. After the split is done, the valueto set up the timer for enabling the sub-task is alsodetermined. Algorithm 1 shows the salient aspects of theHSP-light algorithm.

Algorithm 1 HSP-light Algorithm.

Require: 8�i 2 G, uiG ¼ 1=2;

1: while G 6¼ ; do

2: �i :¼ the task with the lowest priority in G;

3: Pm :¼ the processor with minimum HðGPm þ �iÞ in P;

4: if GPm þ �i is feasible then

5: Assign �i to processor Pm;

6: Continue;

7: end if

8: Pm :¼ the processor with the maximum capacity(greater than 0) for �i;

9: if Pm does not exist, then break, end if

10: if{GPm þ �i is feasible then

11: Assign �i to processor Pm;

12: else

13: Split �i into �i1 and �i2 such that GPm þ �i1 canmaximally utilize Pm;

14: Assign �i1 to processor Pm;

15: Replace �i by �i2, and move �i back to G;

16: end if

17: end while

18: if G ¼ ; then

19: Return ‘‘Success!’’;

20: else

21: Return ‘‘Fail!’’;

22: end if

Given a task set G and a multiprocessor system P, HSP-light makes the assignment decision for each task throughthe ‘‘while’’ loop from line 1 to line 17. Among allunassigned tasks left in G, the task �i with the lowestpriority is selected (line 2). �i is assigned to the processorwith the minimum harmonic index as long as thatprocessor has enough capacity for the task on each pro-cessor (from line 4 to line 7). If this assignment fails, wesplit task �i and make the assignment (from line 8 to line 16).We choose the processor with the maximum executioncapacity for �i. If the corresponding capacity is largeenough, then �i is assigned entirely. Otherwise, we split �iand assign part of �i to the processor until it is maximallyutilized, i.e., no other higher priority tasks can be assignedto that processor without causing other tasks to missdeadlines. Note that, to check the schedulability of a taskset (line 4, line 10) and to calculate the maximum executioncapacity available for splitting a task (line 13), we can usethe traditional exactly timing analysis method [34] on thecorresponding synchronized task set, i.e., tasks with thesame starting time. The algorithm succeeds if all tasks areallocated, and fails otherwise. In what follows, we furtherstudy the schedulability of Algorithm 1.

4.3 Schedulability Analysis of HSP-LightIn this subsection, we are interested in examining howeffective the algorithm HSP-light can be when schedulingreal-time tasks on multi-core platforms. From the Algorithm 1,it is easy to conclude the following property.

Lemma 1. If a task set G is successfully partitioned by HSP-light onM processors, then there is at most one body task on each pro-cessor; and on all processors, there are at most ðM � 1Þ tasks to besplit.

Proof. In HSP-light, splitting occurs only when no processor -can accommodate one task completely. After splittingand assigning a task, the processor that accommodatesthe body task becomes full for higher priority tasks, andno other higher priority tasks can be assigned to it anymore. The body task is the last task assigned to its hostprocessor. Therefore, there is at most one body task oneach processor. Since there are M processors, at mostðM � 1Þ tasks will be split. Ì

Lemma 1 constrains the maximum number of tasks thatcan be split and migrated among different processors, andthus, the extra cost associated with the migrations. FromLemma 1, we can derive the following property.

Lemma 2. Each body task has the highest priority on its hostprocessor.

Proof. According to Lemma 1, we know that there is at mostone body task on each processor. Moreover, Algorithm 1guarantees that any body is the last task assigned to its hostprocessor. Since tasks are assigned from the lowest priorityto the highest priority, the priority of any body task ishigher than any other tasks on its host processor. Ì

More importantly, if a task set can be successfully allocatedby HSP-light, all tasks can satisfy their deadlines. The con-clusion is formally formulated in the following theorem.

Theorem 2. If a task set G is successfully partitioned by HSP-light on M processors and scheduled according to RMS, thenall tasks can meet their deadlines.

Proof. For each body task, it has the highest priority at itshost processor (Lemma 2). Therefore, it can always meetits deadline unless the worst case execution time of theoriginal task is larger than its deadline, which is im-possible. For tail tasks or any other regular tasks addedto a processor, the schedulability of the entire task set isguaranteed based on the worst case response timeanalysis for the corresponding synchronous task set asstated above (lines 4, 10, and 13). Ì

From Theorem 2, HSP-light is not only an allocationmethod but also can serve as a schedulability test methodas well. It is not surprising HSP-light is only a sufficientschedulability test method for multi-core schedulingproblem. On the other hand, however, HSP-light is toocomplex to be used effectively as a schedulability checking

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 25, NO. 6, JUNE 20141480

Page 6: Harmonic aware multi-core scheduling for fixed-priority real-time systems

method. Theorem 3 presents a faster schedulability check-ing method for our proposed algorithm.

Theorem 3. Given a light task set G consisting of N tasks to bescheduled on M processors, if

UMðGÞ � QðNÞ; (7)

then G is feasible by HSP-light under RMS.

The proof of Theorem 3 is rather complicated. Interestedreaders can refer to Appendix A for details. Theorem 3shows that a light task set with system utilization boundedby the well-known Liu&Layland’s bound is guaranteed to befeasible using our proposed approach, i.e., Algorithm 1.

It is worth mentioning that Theorem 2 is valid for anygeneral task set, which implies that if a task set can besuccessfully allocated using HSP-light, all tasks can meettheir deadlines. However, Theorem 3 works only for lighttask sets. In other word, HSP-light cannot guarantee theschedulability of a general task set (which contains heavytasks), even if its total utilization is less than Liu&Layland’sbound. In the next section, we introduce a more advancedalgorithm, i.e., HSP, that can guarantee the schedulabilityfor any task sets with system utilizations no more than theutilization bound.

5 THE HSP ALGORITHM

The reason that HSP-light cannot guarantee the schedul-ability of an arbitrary task set with utilization lower thanthe utilization bound is that, if a split task is a heavy taskand the tail task is very light, the overall system utilizationcan be very low. We use an example to explain thisobservation.

Consider a task set with four tasks, �1 ¼ ð4; 100Þ, �2 ¼ð4; 90Þ, �3 ¼ ð49; 50Þ, �4 ¼ ð2; 50Þ, to be scheduled on 2 pro-cessors. As shown in Fig. 2a, even though the system utili-zation is very small, i.e., ð2=50þ 49=50þ 4=90 þ 4=100Þ=2 ¼

0:55 G 0:69, HSP-light cannot schedule this task set success-fully. Note that the tail task from �2 can be viewed as a taskwith worst case execution time of 1 and deadline of 2. Addingany higher priority task with execution time more than 1 willmake �2 infeasible. On the other hand, if we pre-assign theheavy task �2 to a processor, we can see that the task set can besuccessfully scheduled as shown in Fig. 2b. Therefore, inorder to take the advantage of harmonic property to schedulegeneral task sets, a special operation, i.e., the pre-assignment,needs to be performed for heavy tasks.

As discussed before, HSP-light can guarantee all tasks(light or heavy) meet their deadlines if all tasks can beassigned to a processor successfully. At the same time, Fig. 2implies that heavy task pre-assignment can greatly improvethe schedulability of the scheduling algorithm. The ques-tion becomes which heavy tasks should be pre-assignedand how other tasks should be assigned accordingly.

In HSP, the pre-assignment for heavy tasks follows thesame strategy as introduced in [15]. Specifically, for anyheavy task �i, let PEmpi denote the set of empty processorsbefore �i’s assignment and jPEmpi j denote the number ofprocessors in this set. Then a heavy task �i needs to be pre-assigned to an empty processor if

Xj9i

uj � PEmpi

��� ���� 1� �

� QðNÞ: (8)

The detailed procedure of HSP is shown in Algorithm 2.HSP is very similar to HSP-light, except for two importantdifferences:

. At the beginning of semi-partitioning procedure,heavy tasks are pre-assigned to empty processor set,denoted as PPre, if they satisfy the criteria as statedin (8) (from line 1 to line 8);

. To ensure that a body task always has the highestpriority on a processor, a processor with heavy taskpre-assignment may be excluded from the semi-partitioning process. According to Algorithm 2, atask can be assigned to a processor with heavy taskassignment only after the heavy task pre-assigned inthe processor has a lower priority (from line 12 toline 15).

Similar to Theorem 2, for HSP, the schedulability oftasks are guaranteed as stated in the following theorem.

Theorem 4. If a task set G is successfully partitioned by HSP onM processors and scheduled according to RMS, then all taskscan meet their deadlines.

Moreover, the Liu&Layland’s bound for single processorcan also be applied to HSP for schedulability checking. Thisconclusion is formally formulated in Theorem 5.

Theorem 5. Given a task set G consisting of N tasks to be sched-uled on M processors, if

UMðGÞ � QðNÞ; (9)

then G is feasible by HSP under RMS.

For the proof of Theorem 5, please refer to Appendix B.Theorem 5 provides a very efficient schedulability checking

Fig. 2. (a) Task set is failed to be scheduled according to HSP-light.(b) Task set is schedulable if the heavy task �2 is pre-assigned.

FAN AND QUAN: HARMONIC-AWARE MULTI-CORE SCHEDULING FOR FIXED-PRIORITY REAL-TIME SYSTEMS 1481

Page 7: Harmonic aware multi-core scheduling for fixed-priority real-time systems

method for real-time task sets scheduled by HSP.Given any task set G, if the total utilization of G satisfiesEquation (9), then G can be successfully scheduled by HSPon M processors. Different from Theorem 3, Theorem 5 worksfor arbitrary task sets instead of light task sets alone. It isworth mentioning that, based on our proofs in the Appendix,Theorem 3 and Theorem 5 hold true even without theconsideration of period relationships, i.e., lines 3-7 of Al-gorithm HSP-ligh and lines 16-19 of Algorithm HSP.To study if our approach can lead to a better utilizationbound is an interesting problem and will be our futurestudy. In what follows, we use experiments to study thepotential improvement that can be achieved using ourmethods.

Algorithm 2 HSP Algorithm.

Require:

1) Task set: G ¼ f�1; �2; . . . �Ng;2) Multiprocessor: P ¼ fP1; P2; . . . ; PMg;

1: // pre-assign heavy tasks;

2: PPre ¼ ;;3: for i ¼ 1 to N do

4: if ui 9 1=2 andP

j9i uj � ðjPEmpi j � 1Þ � QðNÞ then

5: Assign �i to processor Pm, where m ¼ jPj;6: Move Pm from P to PPre;7: end if

8: end for

9: // assign other tasks;

10: while G 6¼ ; do

11: �i :¼ the task with the lowest priority in G;12: �j :¼ the task with the lowest priority in GPPre ;13: if �i has higher priority than �j then

14: Move P ð�jÞ from PPre to P;

15: end if

16: Pm :¼ the processor with minimum HðGPm þ �iÞ in P;

17: if GPm þ �i is feasible then

18: Assign �i to processor Pm;

19: Continue;20: end if

21: Pm :¼ the processor with maximum capacity for �i inP;

22: if Pm does not exist, then Break, end if

23: if GPm þ �i is feasible then

24: Assign �i to processor Pm;

25: else

26: Split �i into �i1 and �i2 such that GPm þ �i1 canmaximally utilize Pm;

27: Assign �i1 to processor Pm;

28: Replace �i by �i2, and move �i back to G;29: end if

30: end while

31: if G ¼ ; then

32: Return ‘‘success’’;

33: else

34: Return ‘‘fail’’;

35: end if

6 EXPERIMENTS AND RESULTS

In this section, we investigate the performance of our pro-posed algorithms with experiments. Five algorithms are im-plemented in our experiments.

. SPA: The SPA algorithm [15] assigns the priority ofeach task by RMS, and splits a task to feed theprocessor until ‘‘full’’ (e.g., utilization equal to theLiu&Layland’s bound). However, as long as the uti-lization of a task set exceeds the Liu&Layland’s bound, itsimply aborts.

. DM_PM: The DM_PM algorithm [9] assigns taskpriorities by deadline monotonic scheduling (DMS)policy, and splits a task and assigns as large portion ofthe task as possible to a processor by computing themaximum interference to the task on each processor.

. PUB: The PUB algorithm [30], similarly to SPA,assigns tasks based on a parametric utilizationbound, but uses exact timing analysis method fortask splitting. In the following experiments, R-Bound[31] is applied with this algorithm.

. pCOMPATS: The pCOMPATS algorithm [32] exploresthe R-Bound [31] for task partitioning and splitting.R-Bound can only be applied to task sets with ratioof any two periods no smaller than 1 and no largerthan 2. In our experiments, we used the same al-gorithm as that in [32] to scale a general task set.

. HSP: Our proposed algorithm. Note that HSP is thesame as HSP-light when the task set is light, and canaccommodate task sets containing heavy tasks.

We conducted two groups of experiments to study howperformance of each algorithm changes with differentnumbers of tasks and different system utilizations, respec-tively. For each group of experiments, we tested on differentnumber of processors, i.e.,M ¼ 4, 8, and 16. For each testingpoint in the experiments, we randomly generated 500 tasksets as test cases. The utilization of each task set varied from0.5 to 1 (since task sets with smaller utilizations could beeasily schedulable by all approaches). The minimum inter-arrival time of each task was set to have a uniformdistribution within [50, 1000]. The scheduling performancefor different approaches are compared using the successratios, i.e., the number of feasible tasks over the number oftotal tasks generated under a specific test point.

6.1 Performance vs. Number of TasksIn this group of experiments, we varied the number oftasks, i.e., N , in a task set from 2�M to 10�M with anincrement ofM (where M is the number of processors). Thesuccess ratios of all five approaches were recorded andplotted in Fig. 3.

From Fig. 3, we can observe that HSP can achievesuccess ratios much better than other four approaches. Forexample, in Fig. 3a, when the number of tasks is equal to 20,HSP can achieve a success ratio of 78 percent, animprovement of 1.7 times of that by SPA (45 percent),1.1 times of that by DM_PM (71 percent), 1.2 times of thatby PUB (64 percent), and 1.1 times of that by pCOMPATS(68 percent). The improvement of HSP comes from the factthat HSP takes the harmonic relationship among tasks

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 25, NO. 6, JUNE 20141482

Page 8: Harmonic aware multi-core scheduling for fixed-priority real-time systems

aggressively into consideration and tries to allocate taskscloser to harmonic together among multiple processors.The exploitation of harmonicity is limited to that theutilization bounds for different processors may be differentdepends on how existing tasks are close to harmonic.

From Fig. 3, we can see that, for the same number ofprocessors ðMÞ, the success ratio of HSP in general decreaseswith the increase of task numbers ðNÞ. For example, in Fig. 3c(as M ¼ 16), the success ratio of HSP achieves 91 percentwhenN ¼ 32, but it decreases to 71 percent whenN increasesto 160. The larger the number of task is, the lower theutilization bound can be. As a result, a task set becomes moredifficult to be schedulable. From Fig. 3, it is also interesting tosee that, if we assume similar average number of tasks foreach processor (i.e., assuming N=M as a constant), the suc-cess ratio by HSP largely increases in general. For example,when N=M ¼ 5, the success ratios for M ¼ 4, 8, 16 are 78percent (see Fig. 3a at N ¼ 20), 80 percent (see Fig. 3b atN ¼ 40) and 83 percent (see Fig. 3c at N ¼ 80), respectively.The reason for this is that the more processors are available,there are more opportunities that can be exploited by HSP totake advantage of the harmonic property among tasks toimprove the processor utilization.

6.2 Performance vs. System UtilizationsTo study the performance differences by different schedulingapproaches under different system utilizations, we con-ducted three sub-groups of experiments, for light and generaltask sets, respectively. In light task sets, the utilization of each

task was evenly distributed within [0, 0.5], while in generaltask sets, the utilization of each task was evenly distributedwithin [0, 1]. For each experiment, we varied the systemutilization from 0.5 to 1.0 with an increment of 0.025. Theexperimental results for all approaches are collected andshown in Figs. 4 and 5.

Fig. 4 shows our experimental results for task setscontaining only light tasks. From Fig. 4, we can observe thatHSP can achieve success ratios significantly better thanother four approaches. Compared with SPA, all other fourapproaches, i.e., DM_PM, PUB, pCOMPATS, and HSP canguarantee the schedulability of any task set with utilizationbelow Liu&Layland’s bound, the same as SPA. The successratio by SPA drops sharply when system utilization around0.7. This is because that while SPA can guarantee any tasksets with utilizations no more than the Liu&Layland’s bound,it rejects any task set with system utilization exceeding theLiu&Layland’s bound. While DM_PM, PUB and pCOMPATSmay potentially schedule task sets with utilization higherthan the Liu&Layland’s bound, HSP can achieve a muchhigher performance, especially when the system utilizationis high. For example, in Fig. 4a, when the system utilizationis around 0.9, HSP can still achieve a success ratio up to30 percent, while that of DM_PM is 10 percent, and that ofPUB and pCOMPATS are no more than 5 percent. Similarto our first group of experiments, we can see that theperformance improvement by HSP tends to increase as thenumber of processors increases. Under the system utiliza-tion of 0.9, HSP can achieve a success ratio of 30 percent

Fig. 3. Experimental results for general task sets by different number of tasks. (a) Number of processors:M ¼ 4. (b) Number of processors: M ¼ 8.(c) Number of processors: M ¼ 16.

Fig. 4. Experimental results for light task sets, u 2 ½0; 0:5�. (a) Number of processors: M ¼ 4. (b) Number of processors: M ¼ 8. (c) Number ofprocessors: M ¼ 16.

FAN AND QUAN: HARMONIC-AWARE MULTI-CORE SCHEDULING FOR FIXED-PRIORITY REAL-TIME SYSTEMS 1483

Page 9: Harmonic aware multi-core scheduling for fixed-priority real-time systems

with 4 processors, 40 percent with 8 processors, and in-creased up to 60 percent with 16 processors.

Fig. 5 shows our experimental results for general tasksets containing both heavy and light tasks. From Fig. 5, wecan also observe that HSP performs significantly betterthan other four approaches. In Fig. 5c, HSP can achieve asuccess ratio four times of that by DM-PM and PUB whenthe system utilization is around 0.925.

Our experimental results clearly show, by exploiting theharmonic relationship among tasks more aggressively, HSPcan significantly improve the schedulability of semi-partitionedscheduling compared with the existing algorithms.

7 CONCLUSION

In this paper, we have presented a new semi-partitionedapproach for scheduling real-time sporadic tasks on multi-core platform under RMS. Our approach can take advan-tage of the harmonic relations among task periods andimprove the schedulability. To achieve this goal, weintroduced a metric to quantify how close a task set is to aharmonic task set. Two algorithms, i.e., HSP-light and HSP,were presented to schedule light and general task sets,respectively. We have formally analyzed the schedulabilityfor both algorithms, and presented a simple schedulabilitytest method for each one. Specifically, we formally provedthat our scheduling algorithms can successfully scheduleany task set with a system utilization bounded by theLiu&Layland’s bound. The experimental results demonstrat-ed that the proposed algorithm can significantly improvethe scheduling performance compared with previous work.

APPENDIX APROOF OF THEOREM 3

Before we introduce the theorem formally, we first studythe schedulability of a task set containing a critical task,with its formal definition presented in Definition 5.

Definition 5. Let G ¼ f�1; . . . ; �i; . . . ; �Ng be a task set that isschedulable by RMS on a single processor. �i is called thecritical task if when increasing the execution time of thehighest priority task, �i is the first task to miss its deadline.

In addition, for ease of presentation, we introduce thefollowing definition.

Definition 6. A processor is called to be maximally utilized by atask set if any increase of the execution time for its highestpriority task will cause at least one task on the same processor tomiss its deadline.

In a semi-partitioned system, after partitioning, wedivide the tasks into three types: non-split task, body taskand tail task. According to Lemma 2, a body task alwayshas the highest priority on its host processor. Thus, fromDefinition 5, no body task can be a critical task.

Lemma 3. The critical task on each processor can only be a non-split task or a tail task.

In what follows, we want to study the schedulabilitycharacteristics for processors containing non-split or tailtasks that are critical tasks. We assume that a split task �iis split into Bi body tasks and one tail task, denoted as�bji ðj 2 ½1; Bi�Þ and �ti , respectively.

For two different types of critical tasks, i.e., non-splittasks and tail tasks, we introduce two important properties,which are formulated in the following lemmas.

Lemma 4. Let GPm be the task set allocated to processor Pm inHSP-light. If the critical task is a non-split task and Pm ismaximally utilized by GPm , then UðGPmÞ 9 QðNÞ.

Proof. By contradiction. Assume that processor Pm is maxi-mally utilized by GPm but

U GPmð Þ � QðNÞ: (10)

Let Nm denote the number of tasks on Pm, and let a non-split task �j be the critical task on Pm. Then we know thatNm G N . Since QðNÞ is a monotonically decreasingfunction with respect to N , we have QðNÞ G QðNmÞ.According to our assumption in equation (10), we get

U GPmð Þ � QðNÞ G QðNmÞ:

Note that GPm may contain some tail tasks with deadlinesless than their periods. Given GPm , we can alwaysconstruct another G0Pm such that any tail task in G0Pm hasits deadline equal to its original period. As such, we have

U G0Pm

� �¼ UðGPmÞ � QðNÞ G QðNmÞ:

Also, since �j is a non-split critical task, processor Pm isalso maximally utilized by G0Pm .

Fig. 5. Experimental results for general task sets, u 2 ½0; 1�. (a) Number of processors: M ¼ 4. (b) Number of processors: M ¼ 8. (c) Number ofprocessors: M ¼ 16.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 25, NO. 6, JUNE 20141484

Page 10: Harmonic aware multi-core scheduling for fixed-priority real-time systems

Now consider the critical task �j. Let us keep its periodðTjÞ the same, but increase its execution time such that

Duj ¼ min QðNmÞ � UðGPmÞ; 1� uj� �

:

After the above transformation, the new utilization on Pm,denoted as UðG00PmÞ still satisfies that UðG00PmÞ � QðNmÞ,which implies that G00Pm is feasible by RMS on processor Pmeven though �j’s execution time increases. This contra-dicts that Pm has been maximally utilized by G0Pm and �j isthe critical task. Ì

Lemma 5. Let GPm be the task set allocated to processor Pm inHSP-light. If the critical task is a tail task and Pm is maximallyutilized by GPm , then UðGPmÞ 9 QðNÞ.

Proof. Let �ti be the critical tail task on Pm. To simplify thedescription below, let Ut

X ðUtY Þ denote the total utiliza-

tion of tasks with priorities higher (lower) than �i on Pm(see Fig. 6). From HSP-light, the processor containingthe first body task �b1

i of �i has the largest capacity toaccommodate �i. Thus, we have

UtY þ u

b1i � QðNÞ:

Otherwise, �b1i would be assigned to Pm instead. More-

over, since �i is a light task, we have that ub1i G ui � 1=2,

from the above inequality we can derive that

UtY 9 QðNÞ � 1

2: (11)

On the other hand, for processor Pm, since �ti is thecritical task, there will be no idle time within interval½0; Ti � CB

i �, where CBi is the total execution time of �i’s

body tasks. Therefore, for �ti and all higher prioritytasks on Pm, we have

XjGi

CjTi � CB

i

Tj

� þ Ct

i � Ti � CBi : (12)

Divide ðTi � CBi Þ on both side of the above, we can get that

XjGi

uj �Ti � CB

i

Tj

� � Tj

Ti � CBi

þ uti �Ti

Ti � CBi

� 1:

Split the sum of the above into two parts, and rewrite as

XjGi;TjGTi�CB

i

uj �Ti � CB

i

Tj

� � TjTi � CB

i

þX

jGi;Tj�Ti�CBi

uj �Ti � CB

i

Tj

� � Tj

Ti � CBi

þ uti �Ti

Ti � CBi

� 1:

(13)

For the first part on the left side of equation (13), since

dTi�CBi

Tje � Ti�CB

i

Tjþ 1, we can derive that

Xj;i;TjGTi�CB

i

uj �Ti � CB

i

Tj

� � Tj

Ti � CBi

�X

jGi;TjGTi�CBi

uj � 1þ Tj

Ti � CBi

�:

Moreover, in the above, since Tj G Ti � CBi , we have

TjTi�CB

i

G 1. Then we can further derive

XjGi;TjGTi�CB

i

uj �Ti � CB

i

Tj

� � Tj

Ti � CBi

�X

jGi;TjGTi�CBi

2 � uj: (14)

For the second part on the left side of equation (13), since

Tj � Ti � CBi , we have dTi�C

Bi

Tje ¼ 1. Thus we can derive

XjGi;Tj�Ti�CB

i

uj �Ti � CB

i

Tj

� � Tj

Ti � CBi

¼X

jGi;Tj�Ti�CBi

uj �Tj

Ti � CBi

: (15)

And further since uBi G 1=2, then

Tj

Ti � CBi

� Ti

Ti � CBi

G 2; if Tj � Ti � CBi : (16)

Put equation (16) into (15), we can deriveX

jGi;Tj�Ti�CBi

ujGX

jGi;Tj�Ti�CBi

2 � uj: (17)

For the third part on the left side of equation (13), byapplying (16), we have

uti �Ti

Ti � CBi

G 2 � uti: (18)

Apply equation (14), (17) and (18) into (13), we can get

XjGi;TjGTi�CB

i

uj þX

jGi;Tj�Ti�CBi

uj þ uti 91

2

or

UtX þ uti 9

1

2: (19)

Finally, sum up equations (11) and (19), and replaceðUt

Y þ UtX þ utiÞ by UðGPmÞ, we obtain that

U GPmð Þ 9 QðNÞ:g

Based on Lemma 4 and Lemma 5, we can derive thefollowing property.

Lemma 6. If all processors in P are maximally utilized accor-ding to HSP-light, then we have

XPm2P

UðGPmÞ 9 jPj � QðNÞ: (20)

Fig. 6. Illustration of UtX and Ut

Y .

FAN AND QUAN: HARMONIC-AWARE MULTI-CORE SCHEDULING FOR FIXED-PRIORITY REAL-TIME SYSTEMS 1485

Page 11: Harmonic aware multi-core scheduling for fixed-priority real-time systems

Proof. Let PA denote the processors with critical tasks asnon-split tasks, and PB denote the processors withcritical tasks as tail tasks. According to Lemma 6, wehave that P ¼ PA

SPB and PA

TPB ¼ ;. Thus, we have

XPm2P

UðGPmÞ ¼XPm2PA

UðGPmÞ þX

Pm2PBUðGPmÞ: (21)

Moreover, for any Pm 2 PA or PB, from Lemma 4 andLemma 5, we know that UðGPmÞ 9 QðNÞ. Applying thisto the above equation, we get

XPm2P

UðGPmÞ 9 jPAj � QðNÞ þ jPBj � QðNÞ (22)

orXPm2P

UðGPmÞ 9 jPj � QðNÞ: (23)

g

We are now ready to formally prove Theorem 3 inSection 4.3.

Proof (For Theorem 3). By contradiction. Assume that G isnot feasible by HSP-light, thus we know every proces-sor is maximally utilized.From the given condition (equation (7)) we have that

UðGÞ �M � QðNÞ: (24)

On the other hand, since all processors are maximallyutilized, according to Lemma 6, we know

XPm2P

UðGPmÞ 9 jPj � QðNÞ:

Since jPj ¼M, the above can be rewritten asXPm2P

UðGPmÞ 9 M � QðNÞ: (25)

This contradicts equation (24). g

APPENDIX BPROOF OF THEOREM 5

Two important observations, similar to that in Lemma 4and Lemma 5, are also true and formulated in the followingtwo lemmas.

Lemma 7. Let GPm be the task set allocated to processor Pm inHSP. If the critical task is a non-split task and Pm is maximallyutilized, then UðGPmÞ 9 QðNÞ.

Lemma 8. Let GPm be the task set allocated to processor Pm inHSP. If the critical task is a tail task from a light task andPm is maximally utilized, then UðGPmÞ 9 QðNÞ.

Lemma 7 and Lemma 8 can be proved in the same wayas that for Lemma 4 and Lemma 5. Moreover, if a tail taskfrom a heavy task is the critical task, we have a very

important observation which is formulated in the followinglemma.

Lemma 9. Let GPk be the task set allocated to processor Pk inHSP. If the critical task is a tail task from a heavy task �iand Pk is maximally utilized, then

XPm2PR

UðGPmÞ 9 jPRj � QðNÞ (26)

where PR ¼ fP ð�jÞjj 2 ½i; N �g.

Proof. For all tasks assigned to processors in PR, we dividethem into two groups: 1) tasks with priorities lower than�i, denoted as GY , 2) tasks with priorities equal or higherthan �i, denoted as GX . Then we have

XPm2PR

UðGPmÞ ¼X�j2GY

uj þX�j2GX

uj: (27)

One one hand, since �i is heavy but not pre-assigned,according to equation (8), we have

X�j2GY

uj 9 ðjPRj � 1Þ � QðNÞ: (28)

Since �ti is the critical task on its host processor Pm,there will be no idle time within interval ½0; Ti � CB

i �.Therefore, for �ti and all higher priority tasks on Pm,we have

XjGi;�j2GPm

CjTi � CB

i

Tj

� þ Ct

i � Ti � CBi

or

XjGi;�j2GPm

ujTi � CB

i

Tj

� � TjTiþ ui � 1: (29)

Note that 1) Tj � Ti for j G i, 2) and Ti � CBi � 1

2Ti, sinceuBi � 1

2. By putting them into the above, we can deriveX

jGi;�j2GPm

uj þ ui � 1: (30)

Therefore, for all tasks in GX we haveX�j2GX

uj � 1: (31)

Finally, apply equation (31) and (28) into (27), sinceQðNÞ � 1, we get

XPm2PR

UðGPmÞ 9 jPRj � QðNÞ:

g

Lemma 10. If a system is maximally utilized through HSP, thenfor all processors in P, we have

XPm2P

UðGPmÞ 9 jPj � QðNÞ: (32)

Proof. Select the heavy task, i.e., �i, that is not pre-assignedand has the highest priority among the ones with its tail

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 25, NO. 6, JUNE 20141486

Page 12: Harmonic aware multi-core scheduling for fixed-priority real-time systems

task being the critical task on its host processor. Let PAdenote the processors to which �i and other lowerpriority tasks are assigned. Let PB denote the rest ofprocessors besides PA. From Lemma 9 we know that

XPm2PA

UðGPmÞ 9 jPAj � QðNÞ: (33)

From Lemma 7 and Lemma 8, we have thatX

Pm2PBUðGPmÞ 9 jPBj � QðNÞ: (34)

Sum up equations (33) and (34), sinceP

Pm2P UðGPmÞ ¼PPm2PA UðGPmÞ þ

PPm2PB UðGPmÞ, we can derive

XPm2P

UðGPmÞ 9 jPj � QðNÞ: (35)

g

With the above conclusions, Theorem 5 in Section 5 canbe proved as below.

Proof (For Theorem 5). By contradiction. Assume that G isnot feasible by HSP. With equation(9), we have

UðGÞ �M � QðNÞ: (36)

Since all processors are maximally utilized, from Lemma10, we have that

XPm2P

UðGPmÞ 9 jPj � QðNÞ (37)

orXPm2P

UðGPmÞ 9 M � QðNÞ: (38)

This contradicts equation (36). g

ACKNOWLEDGMENTS

This work was supported in part by NSF under ProjectsCNS-0969013, CNS-0917021, and CNS-1018108.

REFERENCES

[1] S. Chaudhry, R. Cypher, M. Ekman, M. Karlsson, A. Landin, S. Yip,H. Zeffer, and M. Tremblay, ‘‘Rock: A High-PerformanceSparc CMT Processor,’’ IEEE Micro, vol. 29, no. 2, pp. 6-16, Mar./Apr. 2009.

[2] W. Wolf, A.A. Jerraya, and G. Martin, ‘‘Multiprocessor System-On-Chip (MPSOC) Technology,’’ IEEE Trans. Comput.-AidedDesign Integr. Circuits Syst., vol. 27, no. 10, pp. 1701-1713, Oct. 2008.

[3] K. Asanovic, R. Bodik, B.C. Catanzaro, J.J. Gebis, P. Husbands,K. Keutzer, D.A. Patterson, W.L. Plishker, J. Shalf, S.W. Williams,and K.A. Yelick, ‘‘The Landscape of Parallel ComputingResearch: A View From Berkeley,’’ Univ. California, Berkeley,Berkeley, CA, USA, Tech. Rep. UCB/EECS-2006-183, 2006.

[4] K.G. Shin and P. Ramanathan, ‘‘Real-Time Computing: A NewDiscipline of Computer Science and Engineering,’’ Proc. IEEE,vol. 82, no. 1, pp. 6-24, Jan. 1994.

[5] C.L. Liu and J.W. Layland, ‘‘Scheduling Algorithms for Multi-programming in a Hard-Real-Time Environment,’’ J. ACM, vol. 20,no. 1, pp. 46-61, Jan. 1973.

[6] S.K. Dhall and C.L. Liu, ‘‘On a Real-Time Scheduling Problem,’’Oper. Res., vol. 26, no. 1, pp. 127-140, Jan./Feb. 1978.

[7] B. Andersson, S. Baruah, and J. Jonsson, ‘‘Static-Priority Schedulingon Multiprocessors,’’ in Proc. IEEE RTSS, Dec. 2001, pp. 193-202.

[8] B. Andersson, ‘‘Global Static-Priority Preemptive MultiprocessorScheduling with Utilization Bound 38 percent,’’ in Proc. ACMInt’l Conf. OPODIS, 2008, vol. 5401, pp. 73-88.

[9] S. Kato and N. Yamasaki, ‘‘Semi-Partitioned Fixed-PriorityScheduling on Multiprocessors,’’ in Proc. IEEE RTAS, Apr. 2009,pp. 23-32.

[10] K. Lakshmanan, R. Rajkumar, and J. Lehoczky, ‘‘PartitionedFixed-PriorityPreemptiveSchedulingforMulti-CoreProcessors,’’in Proc. ECRTS, July 2009, pp. 239-248.

[11] J. Carpenter, S. Funk, P. Holman, A. Srinivasan, J. Anderson,and S. Baruah, ‘‘A Categorization of Real-Time MultiprocessorScheduling Problems and Algorithms,’’ in Handbook on Schedul-ing Algorithms, Methods, Models. Boca Raton, FL, USA: CRCPress, 2004.

[12] J.H. Anderson, V. Bud, and U.C. Devi, ‘‘An EDF-Based SchedulingAlgorithm for Multiprocessor Soft Real-Time Systems,’’ in Proc.ECRTS, July 2005, pp. 199-208.

[13] S. Kato and N. Yamasaki, ‘‘Real-Time Scheduling with TaskSplitting on Multiprocessors,’’ in Proc. IEEE Int’l Conf. EmbeddedRTCSA, Aug. 2007, pp. 441-450.

[14] B. Andersson, K. Bletsas, and S. Baruah, ‘‘Scheduling Arbitrary-Deadline Sporadic Task Systems on Multiprocessors,’’ in Proc.IEEE RTSS, Dec. 2008, pp. 385-394.

[15] N. Guan, M. Stigge, W. Yi, and G. Yu, ‘‘Fixed-Priority Multipro-cessor Scheduling with Liu and Layland’s Utilization Bound,’’in Proc. IEEE RTAS, Apr. 2010, pp. 165-174.

[16] A. Bastoni, B.B. Brandenburg, and J.H. Anderson, ‘‘Is Semi-Partitioned Scheduling Practical?’’ in Proc. ECRTS, July 2011,pp. 125-135.

[17] N. Guan, M. Stigge, W. Yi, and G. Yu, ‘‘Fixed-Priority Multipro-cessor Scheduling: Beyond Liu and Layland’s UtilizationBound,’’ in Proc. IEEE RTSS, Dec. 2010, pp. 165-174.

[18] Y. Zhang, N. Guan, and W. Yi, ‘‘Towards the Implementationand Evaluation of Semi-Partitioned Multi-Core Scheduling,’’ inProc. DATE Workshop Predictability Perform. Embedded Syst., 2011,vol. 18, pp. 42-46.

[19] M. Fan and G. Quan, ‘‘Harmonic Semi-Partitioned Schedulingfor Fixed-Priority Real-Time Tasks on Multi-Core Platform,’’ inProc. DATE Conf. Exhib., Mar. 2012, pp. 503-508.

[20] J.W.S. Liu, Real-Time Systems. Englewood Cliffs, NJ, USA:Prentice-Hall, 2000.

[21] C.-C. Han and H.-Y. Tyan, ‘‘A Better Polynomial-Time Schedul-ability Test for Real-Time Fixed-Priority Scheduling Algo-rithms,’’ in Proc. IEEE RTSS, Dec. 1997, pp. 36-45.

[22] T.-W. Kuo and A.K. Mok, ‘‘Load Adjustment in Adaptive Real-Time Systems,’’ in Proc. IEEE RTSS, Dec. 1991, pp. 160-170.

[23] E. Bini, G.C. Buttazzo, and G.M. Buttazzo, ‘‘Rate MonotonicAnalysis: The Hyperbolic Bound,’’ IEEE Trans. Comput., vol. 52,no. 7, pp. 933-942, July 2003.

[24] S. Lauzac, R. Melhem, and D. Mosse, ‘‘An Improved Rate-Monotonic Admission Control and Its Applications,’’ IEEETrans. Comput., vol. 52, no. 3, pp. 337-350, Mar. 2003.

[25] W.-C. Lu, H.-W. Wei, and K.-J. Lin, ‘‘Rate Monotonic Schedul-ability Conditions Using Relative Period Ratios,’’ in Proc. IEEEInt’l Conf. Embedded RTCSA, 2006, pp. 3-9.

[26] M.-J. Jung, Y. R. Seong, and C.-H. Lee, ‘‘Optimal RM Schedulingfor Simply Periodic Tasks on Uniform Multiprocessors,’’ in Proc.ACM ICHIT, Aug. 2009, vol. 321, pp. 383-389.

[27] D. Muller, ‘‘Accelerated Simply Periodic Task Sets for RMScheduling,’’ in Proc. Embedded Real Time Softw. Syst., May 2010,p. 42.

[28] M. Fan and G. Quan, ‘‘Harmonic-Fit Partitioned Scheduling forFixed-Priority Real-Time Tasks on the Multiprocessor Platform,’’in Proc. IFIP 9th Int’l Conf. EUC, Oct. 2011, pp. 27-32.

[29] B. Andersson and J. Jonsson, ‘‘The Utilization Bounds ofPartitioned and PFAIR Static-Priority Scheduling on Multi-processors are 50 percent,’’ in Proc. ECRTS, July 2003, pp. 33-40.

[30] N. Guan, M. Stigge, W. Yi, and G. Yu, ‘‘Parametric UtilizationBounds for Fixed-Prioirity Multiprocessor Scheduling,’’ in Proc.IEEE IPDPS, May 2012, pp. 261-272.

[31] S. Lauzac, R. Melhem, and D. Mosse, ‘‘An Efficient RMSAdmission Control and Its Application to MultiprocessorScheduling,’’ in Proc. IPPS/SPDP, Mar. 1998, pp. 511-518.

[32] A. Kandhalu, K. Lakshmanan, J. Kim, and R. Rajkumar,‘‘pCOMPATS: Period-Compatible Task Allocation and Splitting onMulti-Core Processors,’’ in Proc. IEEE RTAS, Apr. 2012, pp. 307-316.

FAN AND QUAN: HARMONIC-AWARE MULTI-CORE SCHEDULING FOR FIXED-PRIORITY REAL-TIME SYSTEMS 1487

Page 13: Harmonic aware multi-core scheduling for fixed-priority real-time systems

[33] S. Kato and N. Yamasaki, ‘‘Portioned Static-Priority Schedulingon Multiprocessors,’’ in Proc. IEEE IPDPS, Apr. 2008, pp. 1-12.

[34] J. Lehoczky, L. Sha, and Y. Ding, ‘‘The Rate MonotonicScheduling Algorithm: Exact Characterization and AverageCase Behavior,’’ in Proc. RTSS, Dec. 1989, pp. 166-171.

Ming Fan is a PhD candidate at the Departmentof Electrical and Computer Engineering at theFlorida International University, FL, USA. He re-ceived both the BS and MS degrees in ComputerEngineering fromBeiHangUniversity,Beijing,China,in 2006 and 2009, respectively. His research inter-ests include real-time systems, power-/thermal-awarecomputing and fault-tolerant systems. He is aStudent Member of the IEEE.

Gang Quan received the PhD degree from theDepartment of Computer Science and Engineer-ing, University of Notre Dame, USA, the MSdegree from the Chinese Academy of Sciences,China, and the BS degree from theDepartment ofElectronic Engineering, Tsinghua University,Beijing, China. He is currently an associateprofessor at the Department of Electrical andComputing Engineering, Florida InternationalUniversity. Before he joined the department, hewas an assistant professor at the Department of

Computer Science and Engineering, University of South Carolina. Hisresearch interests include real-time systems, embedded system design,power-/thermal-aware computing, advanced computer architecture,reconfigurable computing. He is a Senior Member of the IEEE.

. For more information on this or any other computing topic,please visit our Digital Library at www.computer.org/publications/dlib.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 25, NO. 6, JUNE 20141488