EECS 388: Embedded Systems
9. Real-Time Scheduling (2)
Heechul Yun
So far
• Job, Task
• Periodic task model: τi = (Ci, Pi) or (Ci, Pi, Di)
• Static/dynamic priority scheduling: RM, EDF
• Utilization: Ui = Ci / Pi; total utilization U = Σ(i=1..n) Ci / Pi
Agenda
• Utilization Bound
• Exact Schedulability Analysis
• POSIX Scheduling Interface
Recall: RM Example
• τ1 (C1 = 4, T1 = 8), high prio; τ2 (C2 = 6, T2 = 12), low prio
• Utilization: U = 4/8 + 6/12 = 1

[Timeline figure: τ2 misses its deadline. Unschedulable]
RM Example
• τ1 (C1 = 4, T1 = 8), high prio; τ2 (C2 = 4, T2 = 12), low prio
• Utilization: U = 4/8 + 4/12 = 10/12 = 0.83

[Timeline figure: both tasks meet their deadlines]

Schedulable! Is there an easy way to know whether a taskset is schedulable or not?
Liu & Layland, JACM, Jan. 1973
Liu & Layland Bound
• A set of n periodic tasks is schedulable under RM if

  Σ(i=1..n) ci/pi ≤ UB(n) = n(2^(1/n) − 1)

– UB(1) = 1.0
– UB(2) = 0.828
– UB(3) = 0.779
– …
– UB(n) → ln(2) ≈ 0.693 as n → ∞

Q. If the test fails, does that mean the taskset is unschedulable?
A. Not necessarily. It is a sufficient condition, but not a necessary one.
Sample Problem
Task | C   | T   | U
t1   | 20  | 100 | 0.200
t2   | 40  | 150 | 0.267
t3   | 100 | 350 | 0.286

• Are all tasks schedulable?
– U1 + U2 + U3 = 0.753 < UB(3) = 0.779 → Schedulable!
• What if we double the C of t1?
– 0.2 × 2 + 0.267 + 0.286 = 0.953 > UB(3) = 0.779
– We don't know yet.
Sample Problem
[Timeline figures, three tasksets:
 RM with (20, 100), (40, 150), (100, 350);
 RM with (40, 100), (40, 150), (110, 350);
 EDF with (40, 100), (40, 150), (110, 350)]
Sample Problem

[Timeline figures: RM with (40, 100), (40, 150), (110, 350) has a deadline miss; EDF schedules the same taskset, and RM schedules (20, 100), (40, 150), (100, 350), without misses]
Critical Instant Theorem
• If a task meets its first deadline when it and all higher-priority tasks are released at the same time, then this task's future deadlines will always be met.
[Figure: a task set and its timeline schedule, with all tasks released together at time 0]
Exact Schedulability Test
• For each task, check whether it can meet its first deadline

[Timeline figure: (C1 = 4, T1 = 10), U1 = 0.4; (C2 = 4, T2 = 15), U2 = 0.27; (C3 = 10, T3 = 35), U3 = 0.28]
Exact Schedulability Test
• For each task τi, compute its worst-case response time iteratively:

  ri^0 = c1 + c2 + … + ci
  ri^(k+1) = ci + Σ(j=1..i−1) ⌈ri^k / pj⌉ · cj,  where ⌈x⌉ is the ceiling function

• The test terminates when ri^(k+1) > pi (not schedulable)
  or when ri^(k+1) = ri^k ≤ pi (schedulable).
Exact Schedulability Test
• For task 3
– First iteration:

  r3^0 = c1 + c2 + c3 = 4 + 4 + 10 = 18
Exact Schedulability Test
• For task 3
– Second iteration:

  r3^1 = c3 + ⌈18/10⌉·c1 + ⌈18/15⌉·c2 = 10 + 2·4 + 2·4 = 26
Exact Schedulability Test
• For task 3
– Third iteration:

  r3^2 = c3 + ⌈26/10⌉·c1 + ⌈26/15⌉·c2 = 10 + 3·4 + 2·4 = 30
Exact Schedulability Test
• For task 3
– Fourth iteration:

  r3^3 = c3 + ⌈30/10⌉·c1 + ⌈30/15⌉·c2 = 10 + 3·4 + 2·4 = 30 = r3^2

– Converged. Done!
Exact Schedulability Test
• All tasks meet their deadlines (r3 = 30 ≤ p3 = 35) → schedulable
Caveats: Assumptions
• So far, the theory assumes:
– All tasks are periodic
– Tasks are scheduled according to RMS
– All tasks are independent and do not share resources (data)
– Tasks do not self-suspend during their execution
– Scheduler overhead (context switch) is negligible
POSIX Scheduling Interface
• The POSIX.4 Real-Time Extension supports real-time scheduling policies
• Each process can run with a particular scheduling policy and associated scheduling attributes. Both the policy and the attributes can be changed independently.
• POSIX.4-defined policies:
– SCHED_FIFO: preemptive, priority-based scheduling
– SCHED_RR: preemptive, priority-based scheduling with quanta
– SCHED_OTHER: an implementation-defined scheduler (Linux's default scheduler, CFS)
SCHED_FIFO
• Preemptive, priority-based scheduling
• Priority range: 1 (lowest) to 99 (highest)
• When a SCHED_FIFO process becomes runnable, it always immediately preempts any currently running normal SCHED_OTHER process. SCHED_FIFO is a simple scheduling algorithm without time slicing.
• A process calling sched_yield will be put at the end of the list for its priority. No other event will move a process scheduled under the SCHED_FIFO policy in the list of runnable processes with equal static priority. A SCHED_FIFO process runs until it is blocked by an I/O request, it is preempted by a higher-priority process, it calls sched_yield, or it finishes.
SCHED_RR
• Same as SCHED_FIFO except for the following.
• Time slicing among tasks of the same priority:
– If a SCHED_RR process has been running for a time period equal to or longer than the time quantum, it will be put at the end of the list for its priority.
– A SCHED_RR process that has been preempted by a higher-priority process and subsequently resumes execution as a running process will complete the unexpired portion of its round-robin time quantum. The length of the time quantum can be retrieved with sched_rr_get_interval.
SCHED_OTHER
• An implementation-defined scheduler, not specified by POSIX.4
• In Linux, this class is handled by the default CFS scheduler.
Linux Scheduling Framework
[Diagram: Linux scheduling classes.
 CFS (sched/fair.c) handles SCHED_OTHER (SCHED_NORMAL) and SCHED_BATCH;
 the real-time class (sched/rt.c) handles SCHED_FIFO and SCHED_RR;
 SCHED_DEADLINE has its own class]

• Completely Fair Scheduler (CFS) for general-purpose workloads
• Real-time schedulers for real-time apps
• Why not create a single scheduler for both?
Completely Fair Scheduler (CFS)
• SCHED_OTHER class
• Linux's default scheduler, focusing on fairness
• Each task owns a fraction of the CPU time share
– E.g., A = 10%, B = 30%, C = 60%
• Scheduling algorithm:
– Each task maintains its virtual runtime
  • Virtual runtime = executed time × (1 / weight)
– Pick the task with the smallest virtual runtime
  • Tasks are sorted according to their virtual runtimes
CFS Example
[Figure: virtual time (y-axis) vs. ticks (x-axis) for gcc (weight 2/3) and bigsim (weight 1/3): fair in the long run]
kernel/sched/fair.c (CFS)
• Priority to CFS weight conversion table
– Priority (Nice value): -20 (highest) ~ +19 (lowest)
– kernel/sched/core.c
const int sched_prio_to_weight[40] = {
/* -20 */ 88761, 71755, 56483, 46273, 36291,
/* -15 */ 29154, 23254, 18705, 14949, 11916,
/* -10 */ 9548, 7620, 6100, 4904, 3906,
/* -5 */ 3121, 2501, 1991, 1586, 1277,
/* 0 */ 1024, 820, 655, 526, 423,
/* 5 */ 335, 272, 215, 172, 137,
/* 10 */ 110, 87, 70, 56, 45,
/* 15 */ 36, 29, 23, 18, 15,
};
Agenda
• Priority Inversion
– Priority Inheritance Protocol (PIP)
– Priority Ceiling Protocol (PCP)
• Multicore Scheduling
– Scheduling anomalies
Priority Inversion
• A situation in which a higher priority thread is delayed by a lower priority thread.
[Timeline figure: the high-priority thread calls lock() and is blocked because the low-priority thread holds the lock in its critical section; meanwhile the medium-priority thread preempts the low-priority one]
Mars Pathfinder
• Landed on Mars, July 4, 1997
• After operating for a while, it kept rebooting itself repeatedly
The Bug
• Three threads with priorities:
– Weather data thread (low priority)
– Communication thread (medium priority)
– Information bus thread (high priority)
• Each thread obtains a lock to write data to the shared memory
• High-priority thread can't acquire the lock for a very long time → something must be wrong. Let's reboot!
Priority Inversion
• The high-priority thread is delayed by the medium-priority thread, potentially indefinitely!

[Timeline figure: the Information bus thread (high) blocks on lock() held by the Weather thread (low), while the Communication thread (medium) preempts Weather. More reading: "What really happened on Mars?"]
Sha, Rajkumar, Lehoczky, TC, 1990
L. Sha, R. Rajkumar, and J. P. Lehoczky. Priority Inheritance Protocols: An Approach to Real-Time Synchronization. In IEEE Transactions on Computers, vol. 39, pp. 1175-1185, Sep. 1990.
• John Lehoczky (CMU)
• Raj Rajkumar (CMU)
• Lui Sha (UIUC)
Priority Inheritance Protocol (PIP)
• If a high priority thread is waiting on a lock, boost the priority of the lock owner thread (low priority) to that of the high priority thread
Priority Inheritance Protocol (PIP)
[Timeline figures, old vs. new behavior: without PIP, the high-priority thread stays blocked while the medium-priority thread runs; with PIP, the low-priority lock holder's priority is boosted, so it finishes its critical section and unlocks, letting the high-priority thread proceed]
Priority Inheritance Protocol (PIP)
• Solved the Pathfinder's problem
– Remotely patched the code to use the priority inheritance protocol in the lock
– First-ever interplanetary remote debugging (?)
• But…
Deadlock

#include <pthread.h>
...
pthread_mutex_t lock_a, lock_b;
void* thread_1_function(void* arg) {
pthread_mutex_lock(&lock_b);
...
pthread_mutex_lock(&lock_a);
...
pthread_mutex_unlock(&lock_a);
...
pthread_mutex_unlock(&lock_b);
...
}
void* thread_2_function(void* arg) {
pthread_mutex_lock(&lock_a);
...
pthread_mutex_lock(&lock_b);
...
pthread_mutex_unlock(&lock_b);
...
pthread_mutex_unlock(&lock_a);
...
}
The lower priority task starts first and acquires lock a, then gets preempted by the higher priority task, which acquires lock b and then blocks trying to acquire lock a. The lower priority task then blocks trying to acquire lock b, and no further progress is possible.
Priority Ceiling Protocol (PCP)
• Every lock or semaphore is assigned a priority ceiling equal to the priority of the highest-priority task that can lock it.
• A task T can acquire a lock only if T's priority is strictly higher than the priority ceilings of all locks currently held by other tasks.
• Intuition: T will not later try to acquire those locks held by other tasks.
– Locks that are not held by any task don't affect the task.
• This means that a task cannot acquire a lock when any lock it requires has been acquired by another task.
• This prevents deadlocks.
Priority Ceiling Protocol (PCP)
In this version, locks a and b have priority ceilings equal to the priority of task 1. At time 3, task 1 attempts to lock b, but it can’t because task 2 currently holds lock a, which has priority ceiling equal to the priority of task 1.
(The code is the same as in the deadlock example above.)
Multicore
• All CPUs (cores) are equal; memory is shared
• A task can run on any CPU (core)

[Diagram: CPU1..CPU4 connected to shared Memory]
Multicore Scheduling
• Priority-based scheduling on multicore is brittle
• Theorem (Richard Graham, 1976): If a task set with fixed priorities, execution times, and precedence constraints is scheduled according to priorities on a fixed number of processors, then increasing the number of processors, reducing execution times, or weakening precedence constraints can increase the schedule length.
Richard’s Anomalies
• What happens if you increase the number of processors to four?
[Figure: 9 tasks with precedence constraints and execution times C1 = 3, C2 = C3 = C4 = 2, C5 = C6 = C7 = C8 = 4, C9 = 9, where lower-numbered tasks have higher priority than higher-numbered tasks; a priority-based 3-processor schedule is shown]
Richard’s Anomalies
[Figure: the same 9-task set scheduled on four processors]

The priority-based schedule with four processors has a longer execution time.
Richard’s Anomalies
• What happens if you reduce all computation times by 1?
[Figure: the same 9-task set and its priority-based 3-processor schedule]
Richard’s Anomalies
[Figure: the same task set with all computation times reduced by 1, scheduled on three processors]

Reducing the computation times by 1 also results in a longer execution time.
Richard’s Anomalies
• What happens if you remove the precedence constraints (4,8) and (4,7)?
[Figure: the same 9-task set and its priority-based 3-processor schedule]
Richard’s Anomalies
[Figure: the same task set without the precedence constraints (4,8) and (4,7), scheduled on three processors]

Weakening precedence constraints can also result in a longer schedule.
Summary
• Priority inversion
– A low-priority task blocks a higher-priority one
– Unbounded priority inversion → starvation (e.g., Mars Pathfinder)
– Solutions: PIP and PCP
• Multicore scheduling
– Priority-based scheduling can result in unexpected behaviors (anomalies)
– E.g., more cores → longer response time
• Real-time guarantees are hard
– WCET assumptions are problematic, especially on complex, high-performance CPUs
Acknowledgements
• These slides draw on materials developed by
– Lui Sha and Marco Caccamo (UIUC)
– Rodolfo Pellizzoni (U. Waterloo)
– Edward A. Lee and Prabal Dutta (UCB) for EECS149/249A