Measuring Execution Times

Peter Puschner, Benedikt Huber
Slide credits: P. Puschner, R. Kirner, B. Huber
VU 2.0 182.101, SS 2015

Upload: others

Post on 17-Oct-2019

8 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Measuring Execution Times - Institute of Computer ... · • Test-coverage metrics for functional tests are not sufficient for a measurement-based WCET assessment • Random data


Contents

Why (not to) measure execution-times
...
Instrumentation

Measurement-based approaches
•  Industrial practice
•  Evolutionary algorithms
•  Probabilistic WCET analysis
•  Measurement-based timing analysis


Measuring -- a Simple Solution!?

Why not obtain a WCET estimate by measuring the execution time?

[Figure: start timing measurement → execute program on target HW → stop timing measurement → WCET estimate?]


Why not just Measure WCET?

•  Measuring all different traces is intractable (e.g., 10^40 different paths in a mid-size task)

•  Selected test data for measurement may fail to trigger the longest execution trace

•  Test data generation: rare execution scenarios may be missed (e.g., exception handling, …)

•  Partitioning: combining WCET of parts does not necessarily yield the global WCET (anomalies)


Why not just Measure WCET? (2)

•  Problem of setting the processor state to the worst-case start state.

Conclusions:

•  Measurements in general underestimate the worst-case execution time.

•  More systematic WCET analysis techniques are required to obtain a trustworthy WCET bound!
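The underestimation effect can be illustrated with a toy model. The sketch below is not a real measurement setup: `execution_time` is a hypothetical cost function standing in for a program on target hardware, and all cycle counts are made up. It shows that the maximum observed over a sample of inputs can never exceed the true maximum over all inputs, and will fall short whenever the sample misses the rare slow scenario.

```python
import random

def execution_time(x: int) -> int:
    """Hypothetical cost model (cycles): one rare input triggers a slow path."""
    cycles = 100 + (x % 7) * 5        # ordinary data-dependent variation
    if x == 4242:                     # rare scenario, e.g. exception handling
        cycles += 1000
    return cycles

def measured_maximum(inputs) -> int:
    """Largest execution time observed over the given test inputs."""
    return max(execution_time(x) for x in inputs)

INPUT_SPACE = range(10_000)
# Exhaustive search is only possible because this is a toy model.
true_wcet = measured_maximum(INPUT_SPACE)

rng = random.Random(1)
sample = [rng.randrange(10_000) for _ in range(50)]
estimate = measured_maximum(sample)

print(f"true WCET = {true_wcet}, measured maximum = {estimate}")
```

In a real program the input space cannot be enumerated, so the gap between the two numbers is unknown.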


On the other hand ...

•  Not all applications require a safe WCET bound
 − Soft real-time systems (e.g., multimedia)
 − Fail-safe hard real-time systems (e.g., windmill)

•  Easy to adapt to new platform (limited platform support by tools for static analysis)

•  Low annotation effort ⇒ get a quick rough estimate of the execution time

•  Complement to static analysis, produces “hard evidence” about correct timing

•  Feedback for improving static WCET analysis


Measurement-Based WCET Analysis

•  Key Idea: Timing information is acquired by measuring the execution time of the code executed dynamically on the (physical) target hardware.

•  Instrumentation Points (IP): observable events required to trigger timing measurements

•  Trace: timing and path information (path = sequence of basic blocks) are gathered in combination
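A minimal sketch of how instrumentation points can yield a trace, assuming pure software instrumentation on a host machine. The `Tracer` class, the IP names, and the example `task` are hypothetical, and the instrumentation itself perturbs the timing it records.

```python
import time

class Tracer:
    """Collects (instrumentation point id, timestamp in ns) pairs."""

    def __init__(self, clock=time.perf_counter_ns):
        self.clock = clock
        self.events = []

    def ip(self, ip_id: str) -> None:
        """Instrumentation point: record which point was hit and when."""
        self.events.append((ip_id, self.clock()))

    def path(self):
        """Path information: the sequence of instrumentation points hit."""
        return [ip_id for ip_id, _ in self.events]

    def segment_times(self):
        """Timing information: elapsed ns between consecutive points."""
        pairs = zip(self.events, self.events[1:])
        return [(a, b, tb - ta) for (a, ta), (b, tb) in pairs]

def task(x: int, trace: Tracer) -> int:
    trace.ip("entry")
    if x > 0:
        trace.ip("then")
        result = sum(range(x))
    else:
        trace.ip("else")
        result = 0
    trace.ip("exit")
    return result

trace = Tracer()
task(1000, trace)
print(trace.path())
print(trace.segment_times())
```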

[Figure: S-task + hardware drawn as a box with input, output, state (before and after execution), and the recorded trace.]


Execution Time Measurements

Goal: Obtain execution time for a path

[Figure: the system under test (hardware running the code below, with instrumentation points P1 and P2) is connected through the instrumentation interface to the execution time measurement system.]

Code (assembly excerpt from the slide, partly truncated):
  FST: set 2
  .dcall df
  ldab _S1
  ldab OFST-1,s
  bitb #15
  bne L22
  ldab #1
  stab L5r
  L22: leas 2,s
  rti
  _quit:

Hardware interfaces:
•  Simple I/O ports
•  Address lines
•  Debug interfaces
•  Communication devices


Instrumentation Methods

[Figure: measurement setup with user, host, target computer, and timer; labeled data flows: configuration data, input data, start/stop signals, execution times.]

•  Pure hardware instrumentation

•  External execution time measurements using software triggering

•  Pure internal (software) instrumentation


Instrumentation Decisions

Persistent vs. non-persistent instrumentation
Code instrumentation vs. hardware instrumentation

Possible design decisions:
•  Counter location? Interface data?
•  Control flow manipulation? Input data generation?
•  Number of measurement runs?
•  Resource consumption?
•  Required devices?
•  Installation effort?


Measurement Considerations

How to measure what we want to measure:

•  Instrumentation (IPs) must not alter program flow or execution time in an unknown or unpredictable way. IPs that change either must be persistent, i.e., remain in the code of the deployed system.

•  How can we make sure that executions always start from the same (known) state (cache, pipeline, branch prediction, ...)?


Measurement-Based Methods

•  Industrial practice

•  Evolutionary algorithms

•  Probabilistic WCET analysis (pWCET)

•  Measurement-based timing analysis (mTime)


Industrial Approach

Example of Industrial Design Flow

•  Design: Matlab + Simulink, Matlab + RTW
•  Test: prototype boards, custom hardware
•  Deploy: custom automotive hardware


Industrial Approach – Input Vectors


Industrial Approach – Random Data


Industrial Approach – Pitfalls

•  Test-coverage metrics for functional tests are not sufficient for a measurement-based WCET assessment

•  Random data may miss the path with the longest execution time

•  The state of the system is typically not taken into consideration


Evolutionary Algorithms (EA)

Gene = an independent property of an individual
Individual = vector of genes
Population = set of individuals
Fitness value = chance of survival of an individual
Recombination = mating of two individuals, exchange of genes
Mutation = random change of a gene

[Wegener et al., 96]


Process of Evolutionary Computation

•  Selection: survival of the fittest; stochastic selection and modification of the fittest individuals to form a new generation

•  Recombination: exchange of genes between individuals, e.g., 1-point crossover, n-point crossover

•  Mutation: probabilistic changing of genes

[Figure: flowchart of an evolutionary algorithm with the steps Initialization, Evaluation, "Break condition met?", Selection, Recombination, Mutation, Evaluation, Reinsertion, and Result.]


WCET by Evolutionary Algorithms

•  Gene = input or state variable

•  Fitness value = measured execution time (longer execution time ⇒ higher fitness)

•  Result = “fittest individual” = individual with the longest execution time

•  Gives good estimates of the execution time but no safe upper bound on the WCET
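The mapping above can be sketched as a small evolutionary search. Everything here is an assumption for illustration: `execution_time` is a made-up cost model replacing a real measurement, selection simply keeps the fitter half instead of selecting stochastically, and the run stops after a fixed number of generations.

```python
import random

def execution_time(x: int, y: int) -> int:
    """Hypothetical cost model in cycles; stands in for a measurement on the target."""
    cycles = 20
    cycles += 25 if x % 3 == 0 else 15
    cycles += 25 if y > 200 else 15
    cycles += bin(x ^ y).count("1")   # data-dependent part
    return cycles

def evolve(rng, generations=30, pop_size=20, mutation_rate=0.1):
    """Search for an individual (x, y) with a long execution time."""
    def fitness(individual):
        return execution_time(*individual)

    population = [(rng.randrange(256), rng.randrange(256)) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half as parents.
        parents = sorted(population, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            child = [a[0], b[1]]                  # recombination: 1-point crossover
            for gene in range(2):                 # mutation: flip one random bit
                if rng.random() < mutation_rate:
                    child[gene] ^= 1 << rng.randrange(8)
            children.append(tuple(child))
        population = parents + children           # reinsertion
    best = max(population, key=fitness)
    return best, fitness(best)

best_input, best_time = evolve(random.Random(0))
# Exhaustive maximum, available only because the model is tiny.
true_wcet = max(execution_time(x, y) for x in range(256) for y in range(256))
print(best_input, best_time, true_wcet)
```

The search result is always at most `true_wcet`; nothing in the algorithm tells us how close it is.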


WCET by Evolutionary Algorithms

•  Start:
 [0] x=0, y=0 → ET: 40
 [1] x=1, y=1 → ET: 40

•  Crossover:
 [2] x=0, y=1 → ET: 50
 [3] x=1, y=0 → ET: 30

Algorithm terminates if fitness does not improve for a given number of iterations

Program under test:

if (x) { fast(); } else { slow(); }
if (y) { slow(); } else { fast(); }
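The four execution times can be reproduced with a short calculation. The block costs `FAST = 15` and `SLOW = 25` are assumptions: they are the values that match the ET figures above if the branch tests themselves cost nothing.

```python
FAST, SLOW = 15, 25   # assumed block costs, chosen to reproduce the ET values above

def et(x: int, y: int) -> int:
    first = FAST if x else SLOW     # if (x) fast(); else slow();
    second = SLOW if y else FAST    # if (y) slow(); else fast();
    return first + second

parents = [(0, 0), (1, 1)]
# 1-point crossover between the two genes: the parents exchange their y genes.
children = [
    (parents[0][0], parents[1][1]),
    (parents[1][0], parents[0][1]),
]

for x, y in parents + children:
    print(f"x={x}, y={y} -> ET: {et(x, y)}")
```

Both parents have the same fitness, yet one child (x=0, y=1) takes the slow block twice and becomes the fittest individual.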


Results of Applying an EA

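Program    WCET         WCET SA      WCET EA
Matrix     13,190,619   15,357,471   13,007,019
Sort       11,872,718   24,469,014   11,826,117
Graphics   N/A          2,602        2,176
Railroad   N/A          23,466       22,626
Defense    N/A          72,350       35,226

SA .... Static analysis
EA .... Evolutionary algorithms

Annotations on the slide:
•  Under-estimation (the EA results lie below the known WCET for Matrix and Sort)
•  Not tight (the SA bound for Sort is about twice the known WCET)
•  Tightness? No idea! (programs whose WCET is N/A)

[Mueller, Wegener, RTAS 1998]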


Malignant Scenarios

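Functions with many local execution-time (XT) maxima (e.g., sorting)

Example: xt_i = f_i(x, y), with x, y ∈ [0..129]; wcet(f_3) = f_3(127, 129)

[Figure: execution time xt plotted over y, with ticks at 124, 127, 128, and 129; in binary, 127 = 01111111 and 129 = 10000001.]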


Probabilistic WCET Analysis

•  Goal: determine the probability distribution of the WCET

•  Solution: syntax tree representation of the program and a probabilistic timing schema

− Tree leaves are basic blocks

− Inner nodes: sequential composition, conditional composition, iterative composition

− Timing measurements between IPs
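A syntax tree of this kind can be evaluated bottom-up. The sketch below does this for fixed (non-probabilistic) times with the usual timing-schema rules: a sequence adds its parts, and a conditional takes the test plus the slower branch. The class names, the loop rule with an explicit iteration bound, and all numbers are assumptions for illustration and are not taken from the pWCET tool.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Block:            # leaf: basic block with a measured or assumed WCET
    wcet: int

@dataclass
class Seq:              # sequential composition A;B
    first: "Node"
    second: "Node"

@dataclass
class If:               # conditional composition: if E then A else B
    cond: "Node"
    then: "Node"
    orelse: "Node"

@dataclass
class Loop:             # iterative composition with a known iteration bound
    cond: "Node"
    body: "Node"
    bound: int

Node = Union[Block, Seq, If, Loop]

def W(node: Node) -> int:
    """Evaluate the timing schema recursively over the syntax tree."""
    if isinstance(node, Block):
        return node.wcet
    if isinstance(node, Seq):
        return W(node.first) + W(node.second)
    if isinstance(node, If):
        return W(node.cond) + max(W(node.then), W(node.orelse))
    if isinstance(node, Loop):
        # Assumed rule: `bound` rounds of test + body, plus the final failing test.
        return node.bound * (W(node.cond) + W(node.body)) + W(node.cond)
    raise TypeError(f"unknown node type: {node!r}")

program = Seq(
    Block(5),
    Loop(cond=Block(1), body=If(Block(2), Block(10), Block(4)), bound=3),
)
print(W(program))
```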


Probabilistic WCET Analysis

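•  Timing Schema
 − W(A) = WCET of A
 − W(A;B) = W(A) + W(B)
 − W(if E then A else B) = W(E) + max(W(A), W(B))
 − ...

•  Probabilistic Schema
 − X, Y random variables for the execution times of A, B
 − Distribution functions F(x) = P[X ≤ x], G(y) = P[Y ≤ y]
 − Sequence: A;B ⇒ Z = X + Y ⇒ H(z) = P[X + Y ≤ z]
 − In case of independence: standard convolution

$$H(z) = \int_{x} F(x)\, G(z - x)\, dx$$

(Strictly, the convolution in this form applies when F and G are read as the probability densities of X and Y rather than as the cumulative distribution functions defined above.)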
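For measured execution times the distributions are discrete, and the convolution becomes a double sum over probability mass functions. A minimal sketch, assuming X and Y are independent; the two distributions are made-up numbers and the function names are hypothetical.

```python
from collections import defaultdict

def convolve(p, q):
    """Distribution of X + Y for independent X ~ p and Y ~ q (dicts: time -> probability)."""
    result = defaultdict(float)
    for x, px in p.items():
        for y, qy in q.items():
            result[x + y] += px * qy
    return dict(result)

def cdf(dist, z):
    """H(z) = P[Z <= z] for a discrete distribution."""
    return sum(prob for t, prob in dist.items() if t <= z)

A = {10: 0.5, 12: 0.5}      # hypothetical execution-time distribution of block A
B = {20: 0.75, 30: 0.25}    # hypothetical execution-time distribution of block B

Z = convolve(A, B)          # distribution of the sequence A;B
print(sorted(Z.items()))
print(cdf(Z, 32))
```

The largest value with non-zero probability, here 42, equals W(A) + W(B) from the deterministic timing schema; the distribution adds how likely the smaller totals are.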


Probabilistic WCET Tool (pWCET)

•  RapiTime
 –  Convenient reporting, “hot spot analysis”
 –  Probabilistic model using extreme-value statistics
 –  Cumulative probability that a randomly chosen sample from the end-to-end execution times during testing exceeds a given time budget
 ⇒ Quality depends on chosen test data

[Figure: 1 − cumulative probability (logarithmic axis, 0.00001 to 1) plotted over execution time in cycles (0 to 20000).]
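The quantity on the vertical axis can be computed directly from a set of measurements. The sketch below is purely empirical and uses made-up samples; it does not implement the extreme-value model, so it says nothing about budgets beyond the largest observed time.

```python
def exceedance(samples, budget):
    """Fraction of measured execution times that exceed the given time budget."""
    return sum(1 for t in samples if t > budget) / len(samples)

# Hypothetical end-to-end measurements in cycles.
samples = [5200, 5100, 5300, 9800, 5250, 5150, 12000, 5400]

for budget in (5000, 6000, 10000, 15000):
    print(budget, exceedance(samples, budget))
```

Because the result depends only on the samples, a test set that misses the slow scenarios yields an optimistic curve.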


Path-Based Timing Analysis (mTime) with Program Segmentation

[Figure: mTime tool chain from C source to WCET bound, in three phases: analysis phase (analyzer tool), measurement phase (execution time measurement framework), and calculation phase (calculation tool).]


Path-Based Timing Analysis (mTime) with Program Segmentation

mTime performs the following five steps in the analysis:

1. Static program analysis at C source code level
2. Automatic control-flow graph partitioning
3. Test data generation (using model checking)
4. Execution time measurements
5. WCET bound calculation step


Control-Flow Graph Partitioning (Step 2)

•  Decomposing CFG into smaller units

•  Program segmentation (PSG)
 − Set of program segments (PS)
 − Each PS_i has a start node s_i, a termination node t_i, and a set of paths Π_i

•  “Good” program segmentation balances
 − number of PSs and
 − average number of paths per PS

•  Maximum number of paths within PSs can be configured by the path bound parameter
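Why segmentation reduces the measurement effort can be seen by counting paths. A minimal sketch on a hypothetical acyclic CFG of three if-else diamonds in sequence; the graph, the node names, and the chosen segment boundaries are assumptions, and this is not mTime's partitioning algorithm.

```python
from functools import lru_cache

# Hypothetical acyclic CFG: node -> list of successors.
CFG = {
    "s": ["a", "b"], "a": ["m1"], "b": ["m1"],
    "m1": ["c", "d"], "c": ["m2"], "d": ["m2"],
    "m2": ["e", "f"], "e": ["t"], "f": ["t"],
    "t": [],
}

def count_paths(cfg, start, end):
    """Number of distinct paths from start to end in an acyclic CFG."""
    @lru_cache(maxsize=None)
    def paths_from(node):
        if node == end:
            return 1
        return sum(paths_from(succ) for succ in cfg[node])
    return paths_from(start)

whole = count_paths(CFG, "s", "t")                        # one segment for everything
segments = [("s", "m1"), ("m1", "m2"), ("m2", "t")]       # (s_i, t_i) pairs
per_segment = [count_paths(CFG, s_i, t_i) for s_i, t_i in segments]

print(whole, per_segment, sum(per_segment))
```

Path counts multiply along a sequence inside one segment but only add up across segments, which is the effect the path bound parameter exploits.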


Control-Flow Graph Partitioning (Step 2)

•  Results of applying the partitioning algorithm for case study application

Path bound   PS    Paths
1000         5     1455
100          7     336
50           8     242
20           11    130
15           13    106
10           14    92
6            21    83
4            38    84
2            88    117
1            171   171


Control-Flow Graph Partitioning (Step 2)

Example of generated code:

Entity          Value
#Paths          2.19e+32
Path bound      5,000
Identified PS   25
#Measurements   30,000


Test Data Generation (Step 3)

•  4-stage process
 – Level 4: Model Checking
 – Level 3: Heuristics
 – Level 2: Random Search
 – Level 1: Cache

•  Full path coverage (determinism)

•  Key challenge: identification of infeasible paths


Model Checking (1) (Step 3)

Model checkers are used for automatic formal verification of concurrent finite automata

[Figure: the C source code and assertions are turned into a model, which is passed to the model checker. If the model is safe, path ‘x’ is infeasible and will not be executed. If the model is unsafe, the counterexample holds the data to reach path ‘x’.]


Model Checking (2) (Step 3)

•  Enforcement of execution paths

•  Each path π requires an individual model M

•  Model checking answers for M:
 − π is infeasible, or
 − a counterexample including the input data to enforce the execution of π

[Figure: control-flow graph with nodes numbered 1 to 13.]


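Execution Time Measurements (Step 4)

Internal or external devices for execution-time measurements

[Figure: two measurement setups. In one, the host PC is connected to the target hardware via RS232. In the other, the host PC is connected via USB to a runtime measurement device, which is attached to the target hardware by 2 lines.]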


WCET Bound Calculation Step (Step 5)

•  Program segment execution times are combined by integer linear programming (ILP) or longest path search

•  Advantage: only feasible paths within each PS contribute

•  Deficiency: lack of global path information ⇒ refinement possible
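The longest-path variant of this step can be sketched as follows. The segment graph and the per-segment maxima are made-up values, the graph is assumed to be acyclic with every segment reaching the exit, and the code is an illustration rather than the mTime calculation tool.

```python
def wcet_bound(segment_graph, segment_wcet, entry, exit_):
    """Longest path through an acyclic graph of program segments."""
    memo = {}

    def longest_from(ps):
        if ps not in memo:
            if ps == exit_:
                tail = 0
            else:
                tail = max(longest_from(nxt) for nxt in segment_graph[ps])
            memo[ps] = segment_wcet[ps] + tail
        return memo[ps]

    return longest_from(entry)

# Hypothetical segmentation: PS1 is followed by either PS2 or PS3, then PS4.
segment_graph = {"PS1": ["PS2", "PS3"], "PS2": ["PS4"], "PS3": ["PS4"], "PS4": []}
# Maximum measured execution time per segment (cycles, made-up numbers).
segment_wcet = {"PS1": 120, "PS2": 300, "PS3": 180, "PS4": 90}

print(wcet_bound(segment_graph, segment_wcet, "PS1", "PS4"))
```

The bound adds the per-segment maxima even if no single program run attains all of them together, which is the lack of global path information noted above.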


Experiments

Model-checker performance: SAL vs. CBMC

Program                #Paths MC   Time Analysis [s]
                                   CBMC      SAL        SAL BMC
TestNicePartitioning   63          11.2      109.6      259.3
ActuatorMotorControl   280         1202.2    N.A. (1)   N.A. (1)
ADCConv                136         65.2      7202.5     2325.5
ActuatorSysCtrl        96          32.7      507.4      491.3

(1) Model size is too big, memory error of the model checker (core dump)


Column legend: PB = Path Bound; Paths = #Paths (∑|πj|); PS = #Program Segments; Rnd = #Paths Random; MC = #Paths MC; Cov = Coverage (#Paths); WCET = WCET Bound; tA = Time (Analysis) [s]; tETM = Time (ETM) [s]; tAll = Overall Time [s]; tA/MC = Time Analysis / Path MC [s]; tETM/Cov = Time ETM / Covered Path [s]; P/PS = #Paths / Program Segment.

Program               PB    Paths  PS   Rnd  MC    Cov  WCET  tA     tETM  tAll   tA/MC  tETM/Cov  P/PS
ActuatorMotorControl  1     171    171  165  6     165  N.A.  468    1289  1757   78.00  7.8       1.0
                      10    92     14   63   29    68   3445  841    116   957    29.00  1.7       6.6
                      100   336    7    57   279   89   3323  7732   62    7794   27.71  0.7       48.0
                      1000  1455   5    82   1373  130  3298  41353  49    41402  30.12  0.4       291.0
ADCConv               1     31     31   31   0     31   872   24     192   216    N.A.   6.2       1.0
                      10    17     3    8    9     9    870   31     22    53     3.44   2.4       5.7
                      100   74     2    8    66    14   872   220    17    237    3.33   1.2       37.0
                      1000  144    1    12   132   12   872   483    11    494    3.66   0.9       144.0
ActuatorSysCtrl       1     54     54   54   0     54   173   26     318   344    N.A.   5.9       1.0
                      10    36     14   36   0     36   173   10     85    95     N.A.   2.4       2.6
                      100   97     1    18   79    25   131   191    10    201    2.42   0.4       97.0
TestNicePartitioning  1     30     30   6    24    30   151   34     175   209    1.42   5.8       1.0
                      5     14     6    4    10    14   151   15     39    54     1.50   2.8       2.3
                      10    14     3    3    11    14   151   16     21    37     1.45   1.5       4.7
                      20    18     2    2    16    15   150   22     16    38     1.38   1.1       9.0
                      100   72     1    1    71    26   129   106    12    118    1.49   0.5       72.0


Experiments and Results

Case study: nice_partitioning

Column legend: PB = Path Bound; Paths = Paths; PS = Program Segments; Rnd = Paths Random; MC = Paths MC; Cov = Coverage; WCET = WCET Bound; tA = Time (Analysis) [s]; tETM = Time (ETM) [s]; tAll = Overall Time [s]; tA/MC = Time Analysis / Path MC [s]; tETM/Cov = Time ETM / Covered Path [s]; P/PS = Paths / Program Segment.

PB    Paths  PS   Rnd  MC   Cov  WCET  tA    tETM  tAll  tA/MC  tETM/Cov  P/PS
1     30     30   6    24   30   151   34    175   209   1.42   5.8       1.0
5     14     6    4    10   14   151   15    39    54    1.50   2.8       2.3
10    14     3    3    11   14   151   16    21    37    1.45   1.5       4.7
20    18     2    2    16   15   150   22    16    38    1.38   1.1       9.0
100   72     1    1    71   26   129   106   12    118   1.49   0.5       72.0


Path-based Measurements

•  Customizable fast CFG partitioning (segmentation)
 − Trade-off: quality vs. #paths

•  Automated test data generation using mix of strategies
 − Full path coverage per segment
 − Scalability

•  Over- or under-estimation of WCET?
 − pessimistic as well as optimistic effects
 − measurements ⇒ under-estimation
 − combination of segment results ⇒ over-estimation


Summary

•  WCET measurements play an important role

•  Measuring WCET is far from trivial
 − Identifying the right inputs
 − Dealing with state
 − Instrumentation

•  Pure measurements

•  Hybrid approaches combine measurements and elements from static analysis (pWCET, mTime)

•  Research: HW randomization to obtain independence


Reading Material

•  Joachim Wegener, Harmen Sthamer, Bryan F. Jones, and David E. Eyres. Testing real-time systems using genetic algorithms. In Software Quality Journal 6, 1997.

•  Frank Mueller and Joachim Wegener. A Comparison of Static Analysis and Evolutionary Testing for the Verification of Timing Constraints. In Proc. 4th IEEE Real-Time Technology and Application Symposium, 1998.

•  Guillem Bernat, Antoine Colin, and Stefan M. Petters. pWCET: A Tool for Probabilistic Worst-Case Execution Time Analysis of Real-Time Systems. Technical Report, CS Dept., University of York, 2003.


Reading Material (2)

•  Ingomar Wenzel, Raimund Kirner, Bernhard Rieder, and Peter Puschner. Measurement-Based Timing Analysis. In Proc. 3rd Int’l Symposium on Leveraging Applications of Formal Methods, Verification and Validation, T. Margaria and B. Steffen eds, Springer, 2008.
