tu dortmund and university of saarlands€¦ · mirror: symmetric timing analysis for real-time...
TRANSCRIPT
MIRROR: Symmetric Timing Analysis for Real-Time Taskson Multicore Platforms with Shared Resources
Wen-Hung Kevin Huang, Jian-Jia Chen, and Jan Reineke
TU Dortmund and University of Saarlands
09,06,2016 at DAC, Austin, USA
1 / 14
Periodic Control System (Liu and Layland 1973)
Pseudo-code for this system
set timer to interrupt periodicallywith period T ;
at each timer interruptdo
• perform analog-to-digitalconversion to get y ;
• compute control output u;
• output u and dodigital-to-analog conversion;
od
Control System
A/D
(The system
being controlled)
plant
actuator
D/AControl−law
computation
sensor
y(t) u(t)
ukyk
Liu and Layland Model:• Ti : period of task τi
• Di : relative deadline of task τi
• Ci : worst-case execution timeof task τi
• Ui : utilization Ci/Ti
2 / 14
Periodic Control System (Liu and Layland 1973)
Pseudo-code for this system
set timer to interrupt periodicallywith period T ;
at each timer interruptdo
• perform analog-to-digitalconversion to get y ;
• compute control output u;
• output u and dodigital-to-analog conversion;
od
Control System
A/D
(The system
being controlled)
plant
actuator
D/AControl−law
computation
sensor
y(t) u(t)
ukyk
Liu and Layland Model:• Ti : period of task τi
• Di : relative deadline of task τi
• Ci : worst-case execution timeof task τi
• Ui : utilization Ci/Ti
2 / 14
Predictability Due to Resource Sharing
3 / 14
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache Main Memory
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache
Multi Core CPU 1 Multi Core CPU 2
Task is executed on Core 1
Predictability Due to Resource Sharing
3 / 14
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache Main Memory
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache
Multi Core CPU 1 Multi Core CPU 2
L1 cache misses
Predictability Due to Resource Sharing
3 / 14
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache Main Memory
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache
Multi Core CPU 1 Multi Core CPU 2
L2 cache misses
Predictability Due to Resource Sharing
3 / 14
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache Main Memory
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache
Multi Core CPU 1 Multi Core CPU 2
Access memory
Predictability Due to Resource Sharing
3 / 14
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache Main Memory
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache
Multi Core CPU 1 Multi Core CPU 2
Task is executed on Core 2
Predictability Due to Resource Sharing
3 / 14
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache Main Memory
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache
Multi Core CPU 1 Multi Core CPU 2
L1 cache misses
Predictability Due to Resource Sharing
3 / 14
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache Main Memory
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache
Multi Core CPU 1 Multi Core CPU 2
L2 is blocked
Predictability Due to Resource Sharing
3 / 14
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache Main Memory
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache
Multi Core CPU 1 Multi Core CPU 2
Task is executed on Core 3
Predictability Due to Resource Sharing
3 / 14
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache Main Memory
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache
Multi Core CPU 1 Multi Core CPU 2
L1 cache misses
Predictability Due to Resource Sharing
3 / 14
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache Main Memory
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache
Multi Core CPU 1 Multi Core CPU 2
L2 cache misses
Predictability Due to Resource Sharing
3 / 14
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache Main Memory
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache
Multi Core CPU 1 Multi Core CPU 2
Memory access is blocked
Predictability Due to Resource Sharing
3 / 14
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache Main Memory
Core 1
Core 2
L1 Cache
L1 Cache
L2 Cache
Multi Core CPU 1 Multi Core CPU 2
Memory access for core 1 finishes
Typical Two-Phase Analysis Approaches
• Phase 1: Worst-case execution time (WCET) of a stand-aloneprogram
• WCET analyzers such as aiT or Chronous.
• Phase 2: Worst-case response time (WCRT) of aperiodic/sporadic task by considering the competition withthe other tasks
• worst-case interference from the other tasks• utilization-based tests, response time analysis, busy-interval
techniques, real-time calculus, max-plus algebra, etc.
• The notion of WCET is destroyed in multicore systems due toshared resources.
• WCET depends on how the tasks on the other cores areco-executed
• Assume the worst-case interference is too pessimistic
4 / 14
Typical Two-Phase Analysis Approaches
• Phase 1: Worst-case execution time (WCET) of a stand-aloneprogram
• WCET analyzers such as aiT or Chronous.
• Phase 2: Worst-case response time (WCRT) of aperiodic/sporadic task by considering the competition withthe other tasks
• worst-case interference from the other tasks• utilization-based tests, response time analysis, busy-interval
techniques, real-time calculus, max-plus algebra, etc.
• The notion of WCET is destroyed in multicore systems due toshared resources.
• WCET depends on how the tasks on the other cores areco-executed
• Assume the worst-case interference is too pessimistic
4 / 14
Self-Suspending Behavior
t
SharedResource
Core 2
Core 1
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
τ2 τ2 τ2
τ1 τ1 τ1 τ1
τ1 τ1 τ1 τ1τ2 τ2 τ2
c-executionc-suspension
r-executionr-suspension
• Multiple cores may share a bus
• The contention on the bus can be considered as a suspensionproblem (with respect to the bus access)
5 / 14
Self-Suspension Tasks in Real-Time Systems
Suppose that we know the suspension time of each τi and would like to analyzethe schedulability of the tasks on a core. (Constrained-deadline Di ≤ Ti )
τ1
τ2
τ3
0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30
τ4
• A self-suspending task τi is a periodic task with jitter (PJD task)
• Period: Ti
• Self-suspension-time: Si
• Schedulability test of task τk :
6 / 14
Self-Suspension Tasks in Real-Time Systems
Suppose that we know the suspension time of each τi and would like to analyzethe schedulability of the tasks on a core. (Constrained-deadline Di ≤ Ti )
τ1
τ2
τ3
0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30
τ4
• A self-suspending task τi is a periodic task with jitter (PJD task)
• Period: Ti
• Self-suspension-time: Si• Schedulability test of task τk :
∃t with 0 < t ≤ Tk and Ck + Sk +k−1∑j=1
⌈t + Sj
Tj
⌉Cj ≤ t.
6 / 14
Self-Suspension Tasks in Real-Time Systems
Suppose that we know the suspension time of each τi and would like to analyzethe schedulability of the tasks on a core. (Constrained-deadline Di ≤ Ti )
τ1
τ2
τ3
0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30
τ4
• A self-suspending task τi is a periodic task with jitter (PJD task)
• Period: Ti
• Self-suspension-time: Si• Schedulability test of task τk :
∃t with 0 < t ≤ Tk and Ck + Sk +k−1∑j=1
⌈t + Dj
Tj
⌉Cj ≤ t.
6 / 14
Self-Suspension Tasks in Real-Time Systems
Suppose that we know the suspension time of each τi and would like to analyzethe schedulability of the tasks on a core. (Constrained-deadline Di ≤ Ti )
τ1
τ2
τ3
0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30
τ4
• A self-suspending task τi is a periodic task with jitter (PJD task)
• Period: Ti
• Self-suspension-time: Si• Schedulability test of task τk :
∃t with 0 < t ≤ Tk and Ck + Sk +k−1∑j=1
⌈t + Tj
Tj
⌉Cj ≤ t.
6 / 14
Platform Model
• Multicore with a share resource
• For example, atomic (non-split-transaction) bus
• Bus sits idle while memory processes the request and sends theresponse
• Fixed-priority arbitration
Resp τ2
R RReq τ2
RReq τ1
Resp τ1
0 1 2 3 4 5 6 7 8 9 10
B
CLK
7 / 14
Task and Scheduling Model
• Resource access task τi(Ci ,Ai ,Ti ,Di , σi )• Ci : upper bound on local
computation• Ai : upper bound on
resource accesses• Ti : period• Di : relative deadline
(Di ≤ Ti )• σi : the maximum number
segments of consecutiveresource accesses
• Path analysis
• Fixed-priority scheduling (weuse deadline-monotinicscheduling)
A
B
C
D
E
F
G
H
10
10
7 10
8 20
10
10
Local execution
Resource access
Ci = 35
Ai = 40
σi = 3
Assume compositional properties:75 is a safe upper bound.
8 / 14
Task and Scheduling Model
• Resource access task τi(Ci ,Ai ,Ti ,Di , σi )• Ci : upper bound on local
computation• Ai : upper bound on
resource accesses• Ti : period• Di : relative deadline
(Di ≤ Ti )• σi : the maximum number
segments of consecutiveresource accesses
• Path analysis
• Fixed-priority scheduling (weuse deadline-monotinicscheduling)
A
B
C
D
E
F
G
H
10
10
7 10
8 20
10
10
Criticalpath
markedforCi
Local execution
Resource access
Ci = 35
Ai = 40
σi = 3
Assume compositional properties:75 is a safe upper bound.
8 / 14
Task and Scheduling Model
• Resource access task τi(Ci ,Ai ,Ti ,Di , σi )• Ci : upper bound on local
computation• Ai : upper bound on
resource accesses• Ti : period• Di : relative deadline
(Di ≤ Ti )• σi : the maximum number
segments of consecutiveresource accesses
• Path analysis
• Fixed-priority scheduling (weuse deadline-monotinicscheduling)
A
B
C
D
E
F
G
H
10
10
7 10
8 20
10
10
Critical path marked for Ai
Criticalpath
markedforCi
Local execution
Resource access
Ci = 35
Ai = 40
σi = 3
Assume compositional properties:75 is a safe upper bound.
8 / 14
Task and Scheduling Model
• Resource access task τi(Ci ,Ai ,Ti ,Di , σi )• Ci : upper bound on local
computation• Ai : upper bound on
resource accesses• Ti : period• Di : relative deadline
(Di ≤ Ti )• σi : the maximum number
segments of consecutiveresource accesses
• Path analysis
• Fixed-priority scheduling (weuse deadline-monotinicscheduling)
A
B
C
D
E
F
G
H
10
10
7 10
8 20
10
10
Critical path marked for Ai
Criticalpath
markedforCi
Local execution
Resource access
Ci = 35
Ai = 40
σi = 3
Assume compositional properties:75 is a safe upper bound.
8 / 14
Key Observations: Symmetric Property
t
SharedResource
Core 2
Core 1
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
τ2 τ2 τ2
τ1 τ1 τ1 τ1
τ1 τ1 τ1 τ1τ2 τ2 τ2
c-executionc-suspension
r-executionr-suspension
• From the core perspectives for τ2
• accessing or waiting: [3,4), [8,12), [15, 16)• suspension: [4,8), [12, 15)
• From the shared resource perspectives for τ2
• executing or waiting: [4,8), [12, 15)• suspension:[3,4), [8,12), [15, 16)
9 / 14
Schedulability Test for Task τ k
• WCRT is upper bounded by the minimum t|0 < t ≤ Dk
(Ck + exec core(t)) + (Ak + exec resource(t)) ≤ t
Ck +∑
τi∈hp(τk ,c)
⌈t + Ti
Ti
⌉Ci
+σkB+
Ak +∑
τi∈hp(τk ,r)
⌈t + Ti
Ti
⌉Ai
≤ t
• σk B: the maximum blocking time by the lower priority taskson the shared resource
• hp(τ k , c): higher-priority tasks thanτ k on the same core• hp(τ k , r): higher-priority tasks than τ k on shared resource
• Pessimism of the above response time analysis: number of resourceaccess segments was not exploited
• In our paper, we explain how to calculate and utilize the information σk
in a symmetric and more precise manner
10 / 14
Schedulability Test for Task τ k
• WCRT is upper bounded by the minimum t|0 < t ≤ Dk
Ck +∑
τi∈hp(τk ,c)
⌈t + Ti
Ti
⌉Ci
+σkB+
Ak +∑
τi∈hp(τk ,r)
⌈t + Ti
Ti
⌉Ai
≤ t
• σk B: the maximum blocking time by the lower priority taskson the shared resource
• hp(τ k , c): higher-priority tasks thanτ k on the same core• hp(τ k , r): higher-priority tasks than τ k on shared resource
• Pessimism of the above response time analysis: number of resourceaccess segments was not exploited
• In our paper, we explain how to calculate and utilize the information σk
in a symmetric and more precise manner
10 / 14
Schedulability Test for Task τ k
• WCRT is upper bounded by the minimum t|0 < t ≤ Dk
Ck +∑
τi∈hp(τk ,c)
⌈t + Ti
Ti
⌉Ci
+σkB+
Ak +∑
τi∈hp(τk ,r)
⌈t + Ti
Ti
⌉Ai
≤ t
• σk B: the maximum blocking time by the lower priority taskson the shared resource
• hp(τ k , c): higher-priority tasks thanτ k on the same core• hp(τ k , r): higher-priority tasks than τ k on shared resource
• Pessimism of the above response time analysis: number of resourceaccess segments was not exploited
• In our paper, we explain how to calculate and utilize the information σk
in a symmetric and more precise manner
10 / 14
Task Assignment (Partition)
τ1 τ2 τ3 τ4 τ5 τ6 τ7 τ8
P1 P2 P3
• Schedulability tests are based on the previous slide.
• Fitting can be First-Fit (FF), Worst-Fit (WF), Best-Fit (BF)
11 / 14
Task Assignment (Partition)
τ1 τ2 τ3 τ4 τ5 τ6 τ7 τ8
P1 P2 P3
• Schedulability tests are based on the previous slide.
• Fitting can be First-Fit (FF), Worst-Fit (WF), Best-Fit (BF)
11 / 14
Task Assignment (Partition)
τ1 τ2 τ3 τ4 τ5 τ6 τ7 τ8
P1
τ1
P2 P3
• Schedulability tests are based on the previous slide.
• Fitting can be First-Fit (FF), Worst-Fit (WF), Best-Fit (BF)
11 / 14
Task Assignment (Partition)
τ1 τ2 τ3 τ4 τ5 τ6 τ7 τ8
P1
τ1
P2
τ2
P3
• Schedulability tests are based on the previous slide.
• Fitting can be First-Fit (FF), Worst-Fit (WF), Best-Fit (BF)
11 / 14
Task Assignment (Partition)
τ1 τ2 τ3 τ4 τ5 τ6 τ7 τ8
P1
τ1
P2
τ2
P3
τ3
• Schedulability tests are based on the previous slide.
• Fitting can be First-Fit (FF), Worst-Fit (WF), Best-Fit (BF)
11 / 14
Task Assignment (Partition)
τ1 τ2 τ3 τ4 τ5 τ6 τ7 τ8
P1
τ1
P2
τ2
P3
τ3
τ4
• Schedulability tests are based on the previous slide.
• Fitting can be First-Fit (FF), Worst-Fit (WF), Best-Fit (BF)
11 / 14
Task Assignment (Partition)
τ1 τ2 τ3 τ4 τ5 τ6 τ7 τ8
P1
τ1
P2
τ2
τ5
P3
τ3
τ4
• Schedulability tests are based on the previous slide.
• Fitting can be First-Fit (FF), Worst-Fit (WF), Best-Fit (BF)
11 / 14
Task Assignment (Partition)
τ1 τ2 τ3 τ4 τ5 τ6 τ7 τ8
P1
τ1
τ6
P2
τ2
τ5
P3
τ3
τ4
• Schedulability tests are based on the previous slide.
• Fitting can be First-Fit (FF), Worst-Fit (WF), Best-Fit (BF)
11 / 14
Task Assignment (Partition)
τ1 τ2 τ3 τ4 τ5 τ6 τ7 τ8
P1
τ1
τ6
P2
τ2
τ5
τ7
P3
τ3
τ4
• Schedulability tests are based on the previous slide.
• Fitting can be First-Fit (FF), Worst-Fit (WF), Best-Fit (BF)
11 / 14
Task Assignment (Partition)
τ1 τ2 τ3 τ4 τ5 τ6 τ7 τ8
P1
τ1
τ6
P2
τ2
τ5
τ7
P3
τ3
τ4
τ8
• Schedulability tests are based on the previous slide.
• Fitting can be First-Fit (FF), Worst-Fit (WF), Best-Fit (BF)
11 / 14
Experiments
• Configuration
• 4-core platform (m=4)• 20 tasks• Periods [10-1000ms]• Each utilization level:100 task sets
• Existing results:
• Exact-MC (Bonifaci et al. in RTNS 2015): do memory accessand then do execution
• MIRROR-SPIN (This resembles the test from Altmeyer et al.in RTNS 2015)
• Evaluation Metrics:
• The acceptance ratio of a level: the number of task sets thatare schedulable by the test divided by the number of task sets.
12 / 14
Experiments
0.0 0.2 0.4 0.6 0.8 1.0UC/m
0.0
0.2
0.4
0.6
0.8
1.0
Acce
ptan
ce R
atio
0.0 0.2 0.4 0.6 0.8 1.00.00.20.40.60.81.0 (a) UA=40%, type=F
MIRROR-SPINMIRROR
0.0 0.2 0.4 0.6 0.8 1.00.00.20.40.60.81.0 (b) UA=70%, type=F 0.0 0.2 0.4 0.6 0.8 1.00.0
0.20.40.60.81.0 (c) UA=40%, type=M
MIRROR-FFMIRROR-BFMIRROR-WF
0.0 0.2 0.4 0.6 0.8 1.00.00.20.40.60.81.0 (d) UA=70%, type=M 0.0 0.2 0.4 0.6 0.8 1.00.0
0.20.40.60.81.0 (e) UA=40%, type=R
MIRROR-SPINMIRROR-RASMIRROR-MCexact-MC
0.0 0.2 0.4 0.6 0.8 1.00.00.20.40.60.81.0 (f) UA=70%, type=R
The number of resource access segments σi : 1 (rare access, type=R), 2(moderate access, type=M), and 10 (frequent access, type=F).
13 / 14
Conclusion and Extensions
• Fixed-priority, deadline-monotonic scheduling bus + bus-awaretiming analysis + FFDM = high schedulability
• A general treatment to handle multicore resource accesses• The treatment is compatible with existing task partitioning
methods• The view points are symmetric• First result with worst-case resource augmentation guarantees
(i.e., speedup factors) for this research line
• Extensions
• Similar techniques can be applied for multiple shared resources• The pessimism can be further reduced by counting the
interference more precisely
14 / 14