tu dortmund and university of saarlands€¦ · mirror: symmetric timing analysis for real-time...

MIRROR: Symmetric Timing Analysis for Real-Time Taskson Multicore Platforms with Shared Resources

Wen-Hung Kevin Huang, Jian-Jia Chen, and Jan Reineke

TU Dortmund and University of Saarlands

09,06,2016 at DAC, Austin, USA

1 / 14

Periodic Control System (Liu and Layland 1973)

Pseudo-code for this system

set timer to interrupt periodicallywith period T ;

at each timer interruptdo

• perform analog-to-digitalconversion to get y ;

• compute control output u;

• output u and dodigital-to-analog conversion;

od

Control System

A/D

(The system

being controlled)

plant

actuator

D/AControl−law

computation

sensor

y(t) u(t)

ukyk

Liu and Layland Model:• Ti : period of task τi

• Di : relative deadline of task τi

• Ci : worst-case execution timeof task τi

• Ui : utilization Ci/Ti

2 / 14

Predictability Due to Resource Sharing

3 / 14

Core 1

Core 2

L1 Cache

L1 Cache

L2 Cache Main Memory

Core 1

Core 2

L1 Cache

L1 Cache

L2 Cache

Multi Core CPU 1 Multi Core CPU 2

Task is executed on Core 1


3 / 14

Core 1

Core 2

L1 Cache

L1 Cache


Core 1

Core 2

L1 Cache

L1 Cache

L2 Cache


L1 cache misses


3 / 14

Core 1

Core 2

L1 Cache

L1 Cache


Core 1

Core 2

L1 Cache

L1 Cache

L2 Cache


L2 cache misses


3 / 14

Core 1

Core 2

L1 Cache

L1 Cache


Core 1

Core 2

L1 Cache

L1 Cache

L2 Cache


Access memory


3 / 14

Core 1

Core 2

L1 Cache

L1 Cache


Core 1

Core 2

L1 Cache

L1 Cache

L2 Cache




3 / 14

Core 1

Core 2

L1 Cache

L1 Cache


Core 1

Core 2

L1 Cache

L1 Cache

L2 Cache


L1 cache misses


3 / 14

Core 1

Core 2

L1 Cache

L1 Cache


Core 1

Core 2

L1 Cache

L1 Cache

L2 Cache


L2 is blocked


3 / 14

Core 1

Core 2

L1 Cache

L1 Cache


Core 1

Core 2

L1 Cache

L1 Cache

L2 Cache




3 / 14

Core 1

Core 2

L1 Cache

L1 Cache


Core 1

Core 2

L1 Cache

L1 Cache

L2 Cache


L1 cache misses


3 / 14

Core 1

Core 2

L1 Cache

L1 Cache


Core 1

Core 2

L1 Cache

L1 Cache

L2 Cache


L2 cache misses


3 / 14

Core 1

Core 2

L1 Cache

L1 Cache


Core 1

Core 2

L1 Cache

L1 Cache

L2 Cache


Memory access is blocked


3 / 14

Core 1

Core 2

L1 Cache

L1 Cache


Core 1

Core 2

L1 Cache

L1 Cache

L2 Cache


Memory access for core 1 finishes

Typical Two-Phase Analysis Approaches

• Phase 1: Worst-case execution time (WCET) of a stand-aloneprogram

• WCET analyzers such as aiT or Chronous.

• Phase 2: Worst-case response time (WCRT) of aperiodic/sporadic task by considering the competition withthe other tasks

• worst-case interference from the other tasks• utilization-based tests, response time analysis, busy-interval

techniques, real-time calculus, max-plus algebra, etc.

• The notion of WCET is destroyed in multicore systems due toshared resources.

• WCET depends on how the tasks on the other cores areco-executed

• Assume the worst-case interference is too pessimistic

4 / 14

Self-Suspending Behavior

t

SharedResource

Core 2

Core 1

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17

τ2 τ2 τ2

τ1 τ1 τ1 τ1

τ1 τ1 τ1 τ1τ2 τ2 τ2

c-executionc-suspension

r-executionr-suspension

• Multiple cores may share a bus

• The contention on the bus can be considered as a suspensionproblem (with respect to the bus access)

5 / 14

Self-Suspension Tasks in Real-Time Systems

Suppose that we know the suspension time of each τi and would like to analyzethe schedulability of the tasks on a core. (Constrained-deadline Di ≤ Ti )

τ1

τ2

τ3

0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30

τ4

• A self-suspending task τi is a periodic task with jitter (PJD task)

• Period: Ti

• Self-suspension-time: Si

• Schedulability test of task τk :

6 / 14



τ1

τ2

τ3

0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30

τ4


• Period: Ti

• Self-suspension-time: Si• Schedulability test of task τk :

∃t with 0 < t ≤ Tk and Ck + Sk +k−1∑j=1

⌈t + Sj

Tj

⌉Cj ≤ t.

6 / 14



τ1

τ2

τ3

0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30

τ4


• Period: Ti



⌈t + Dj

Tj

⌉Cj ≤ t.

6 / 14



τ1

τ2

τ3

0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30

τ4


• Period: Ti



⌈t + Tj

Tj

⌉Cj ≤ t.

6 / 14

Platform Model

• Multicore with a share resource

• For example, atomic (non-split-transaction) bus

• Bus sits idle while memory processes the request and sends theresponse

• Fixed-priority arbitration

Resp τ2

R RReq τ2

RReq τ1

Resp τ1

0 1 2 3 4 5 6 7 8 9 10

B

CLK

7 / 14

Task and Scheduling Model

• Resource access task τi(Ci ,Ai ,Ti ,Di , σi )• Ci : upper bound on local

computation• Ai : upper bound on

resource accesses• Ti : period• Di : relative deadline

(Di ≤ Ti )• σi : the maximum number

segments of consecutiveresource accesses

• Path analysis

• Fixed-priority scheduling (weuse deadline-monotinicscheduling)

A

B

C

D

E

F

G

H

10

10

7 10

8 20

10

10

Local execution

Resource access

Ci = 35

Ai = 40

σi = 3

Assume compositional properties:75 is a safe upper bound.

8 / 14







• Path analysis


A

B

C

D

E

F

G

H

10

10

7 10

8 20

10

10

Criticalpath

markedforCi

Local execution

Resource access

Ci = 35

Ai = 40

σi = 3


8 / 14







• Path analysis


A

B

C

D

E

F

G

H

10

10

7 10

8 20

10

10

Critical path marked for Ai

Criticalpath

markedforCi

Local execution

Resource access

Ci = 35

Ai = 40

σi = 3


8 / 14

Key Observations: Symmetric Property

t

SharedResource

Core 2

Core 1

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17

τ2 τ2 τ2

τ1 τ1 τ1 τ1

τ1 τ1 τ1 τ1τ2 τ2 τ2

c-executionc-suspension

r-executionr-suspension

• From the core perspectives for τ2

• accessing or waiting: [3,4), [8,12), [15, 16)• suspension: [4,8), [12, 15)

• From the shared resource perspectives for τ2

• executing or waiting: [4,8), [12, 15)• suspension:[3,4), [8,12), [15, 16)

9 / 14

Schedulability Test for Task τ k

• WCRT is upper bounded by the minimum t|0 < t ≤ Dk

(Ck + exec core(t)) + (Ak + exec resource(t)) ≤ t

Ck +∑

τi∈hp(τk ,c)

⌈t + Ti

Ti

⌉Ci

+σkB+

Ak +∑

τi∈hp(τk ,r)

⌈t + Ti

Ti

⌉Ai

≤ t

• σk B: the maximum blocking time by the lower priority taskson the shared resource

• hp(τ k , c): higher-priority tasks thanτ k on the same core• hp(τ k , r): higher-priority tasks than τ k on shared resource

• Pessimism of the above response time analysis: number of resourceaccess segments was not exploited

• In our paper, we explain how to calculate and utilize the information σk

in a symmetric and more precise manner

10 / 14

Schedulability Test for Task τ k

• WCRT is upper bounded by the minimum t|0 < t ≤ Dk

Ck +∑

τi∈hp(τk ,c)

⌈t + Ti

Ti

⌉Ci

+σkB+

Ak +∑

τi∈hp(τk ,r)

⌈t + Ti

Ti

⌉Ai

≤ t

• σk B: the maximum blocking time by the lower priority taskson the shared resource

• hp(τ k , c): higher-priority tasks thanτ k on the same core• hp(τ k , r): higher-priority tasks than τ k on shared resource

• Pessimism of the above response time analysis: number of resourceaccess segments was not exploited

• In our paper, we explain how to calculate and utilize the information σk

in a symmetric and more precise manner

10 / 14

Task Assignment (Partition)

τ1 τ2 τ3 τ4 τ5 τ6 τ7 τ8

P1 P2 P3

• Schedulability tests are based on the previous slide.

• Fitting can be First-Fit (FF), Worst-Fit (WF), Best-Fit (BF)

11 / 14


τ1 τ2 τ3 τ4 τ5 τ6 τ7 τ8

P1

τ1

P2 P3



11 / 14


τ1 τ2 τ3 τ4 τ5 τ6 τ7 τ8

P1

τ1

P2

τ2

P3



11 / 14


τ1 τ2 τ3 τ4 τ5 τ6 τ7 τ8

P1

τ1

P2

τ2

P3

τ3



11 / 14


τ1 τ2 τ3 τ4 τ5 τ6 τ7 τ8

P1

τ1

P2

τ2

P3

τ3

τ4



11 / 14


τ1 τ2 τ3 τ4 τ5 τ6 τ7 τ8

P1

τ1

P2

τ2

τ5

P3

τ3

τ4



11 / 14


τ1 τ2 τ3 τ4 τ5 τ6 τ7 τ8

P1

τ1

τ6

P2

τ2

τ5

P3

τ3

τ4



11 / 14


τ1 τ2 τ3 τ4 τ5 τ6 τ7 τ8

P1

τ1

τ6

P2

τ2

τ5

τ7

P3

τ3

τ4



11 / 14


τ1 τ2 τ3 τ4 τ5 τ6 τ7 τ8

P1

τ1

τ6

P2

τ2

τ5

τ7

P3

τ3

τ4

τ8



11 / 14

Experiments

• Configuration

• 4-core platform (m=4)• 20 tasks• Periods [10-1000ms]• Each utilization level:100 task sets

• Existing results:

• Exact-MC (Bonifaci et al. in RTNS 2015): do memory accessand then do execution

• MIRROR-SPIN (This resembles the test from Altmeyer et al.in RTNS 2015)

• Evaluation Metrics:

• The acceptance ratio of a level: the number of task sets thatare schedulable by the test divided by the number of task sets.

12 / 14

Experiments

0.0 0.2 0.4 0.6 0.8 1.0UC/m

0.0

0.2

0.4

0.6

0.8

1.0

Acce

ptan

ce R

atio

0.0 0.2 0.4 0.6 0.8 1.00.00.20.40.60.81.0 (a) UA=40%, type=F

MIRROR-SPINMIRROR

0.0 0.2 0.4 0.6 0.8 1.00.00.20.40.60.81.0 (b) UA=70%, type=F 0.0 0.2 0.4 0.6 0.8 1.00.0

0.20.40.60.81.0 (c) UA=40%, type=M

MIRROR-FFMIRROR-BFMIRROR-WF

0.0 0.2 0.4 0.6 0.8 1.00.00.20.40.60.81.0 (d) UA=70%, type=M 0.0 0.2 0.4 0.6 0.8 1.00.0

0.20.40.60.81.0 (e) UA=40%, type=R

MIRROR-SPINMIRROR-RASMIRROR-MCexact-MC

0.0 0.2 0.4 0.6 0.8 1.00.00.20.40.60.81.0 (f) UA=70%, type=R

The number of resource access segments σi : 1 (rare access, type=R), 2(moderate access, type=M), and 10 (frequent access, type=F).

13 / 14

Conclusion and Extensions

• Fixed-priority, deadline-monotonic scheduling bus + bus-awaretiming analysis + FFDM = high schedulability

• A general treatment to handle multicore resource accesses• The treatment is compatible with existing task partitioning

methods• The view points are symmetric• First result with worst-case resource augmentation guarantees

(i.e., speedup factors) for this research line

• Extensions

• Similar techniques can be applied for multiple shared resources• The pessimism can be further reduced by counting the

interference more precisely

14 / 14

tu dortmund and university of saarlands€¦ · mirror: symmetric timing analysis for real-time...

Documents