Timing Analysis and Timing Predictability Reinhard Wilhelm

Page 1: Timing Analysis and Timing Predictability Reinhard Wilhelm

Timing Analysis and

Timing Predictability

Reinhard Wilhelm

Page 2: Timing Analysis and Timing Predictability Reinhard Wilhelm

Real Hard Real-Time

Hard real-time systems, often in safety-critical applications, abound: aeronautics, automotive, train industries, industrial automation.

• Wing vibration of an airplane: sensing every 5 ms

• Side airbag in a car: reaction in < 10 ms

Page 3: Timing Analysis and Timing Predictability Reinhard Wilhelm

Hard Real-Time Systems

• Embedded controllers with hard deadlines.

• Need to statically know upper bounds on the execution times of all tasks

• Commonly called the Worst-Case Execution Time (WCET)

• Analogously, Best-Case Execution Time (BCET)

Page 4: Timing Analysis and Timing Predictability Reinhard Wilhelm

Quite a gallery!

Who needs this Timing Analysis?

• TTA

• Synchronous languages

• Stream-oriented people

• UML real-time

• Hand coders

• Timed automata

Page 5: Timing Analysis and Timing Predictability Reinhard Wilhelm

Basic Notions

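[Figure: execution-time axis t with the marks lower bound, best case, worst case, and upper bound, in this order; the upper bound provides the worst-case guarantee.]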

Page 6: Timing Analysis and Timing Predictability Reinhard Wilhelm

Structure of the Talk

1. Timing Analysis – a good story, only slightly cheating
• Prediction of Cache Behavior – but there’s more to it!

2. Timing Predictability
• Variability of Execution Times – mostly in the memory hierarchy
• Language constructs and their timing behavior

3. Components
– Component-wise cache behavior prediction
– RT CORBA
– Components and Hard Real Time – a Challenge

4. Conclusion

Page 7: Timing Analysis and Timing Predictability Reinhard Wilhelm

Non-exhaustive Analysis

• Assumption: System under analysis too big for an exhaustive analysis

• Approximation/abstraction necessary

• Resulting uncertainty produces intervals of execution times

Page 8: Timing Analysis and Timing Predictability Reinhard Wilhelm

Timing Analysis

Page 9: Timing Analysis and Timing Predictability Reinhard Wilhelm

Industrial Practice

• Measurements: computing the maximum over some executions. This does not guarantee an upper bound on all executions.

• Measurement has acquired a bad reputation; its result is now called “observed worst-case execution time”. Heavily used outside of Old Europe.

Page 10: Timing Analysis and Timing Predictability Reinhard Wilhelm

Once upon a Time, the World was Compositional

u_bound(if c then s1 else s2) = u_bound(c) + max{u_bound(s1), u_bound(s2)}

u_bound(x := y + z) = time(mv y R1) + time(mv z R2) + time(add R1 R2) + time(mv R1 x)

Instruction execution times:

add           4
mv m Reg     12
mv Reg m     14
mv Reg Reg    1
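The two rules above can be evaluated mechanically. The following is a minimal sketch, assuming that "mv m Reg" is a load from memory into a register and "mv Reg m" is a store from a register to memory; the tuple encoding of statements and the sequencing rule are illustrative assumptions, not part of the slide.

```python
# Instruction times copied from the table on the slide (unit as on the slide).
TIME = {"add": 4, "mv m Reg": 12, "mv Reg m": 14, "mv Reg Reg": 1}


def u_bound(stmt):
    """Compositional upper bound for a tiny, hypothetical statement encoding."""
    kind = stmt[0]
    if kind == "assign_add":
        # x := y + z  ->  load y, load z, add, store x
        # (assumes "mv m Reg" is a load and "mv Reg m" is a store)
        return 2 * TIME["mv m Reg"] + TIME["add"] + TIME["mv Reg m"]
    if kind == "cond":
        # ("cond", bound_of_condition, s1, s2)
        _, cond_bound, s1, s2 = stmt
        return cond_bound + max(u_bound(s1), u_bound(s2))
    if kind == "seq":
        # ("seq", [s1, s2, ...]); summing is an assumed rule for sequences
        return sum(u_bound(s) for s in stmt[1])
    raise ValueError(f"unknown statement kind: {kind}")


assignment = ("assign_add",)
two_assignments = ("seq", [assignment, assignment])
conditional = ("cond", TIME["mv Reg Reg"], assignment, two_assignments)

print(u_bound(assignment))   # 12 + 12 + 4 + 14
print(u_bound(conditional))  # condition bound + the more expensive branch
```

The bound of the conditional depends only on the bounds of its parts, which is exactly the compositionality that the later slides show to be lost on modern hardware.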

Page 11: Timing Analysis and Timing Predictability Reinhard Wilhelm

Modern Hardware Features

• Modern processors increase (average) performance by: Caches, Pipelines, Branch Prediction

• These features make
– execution times history dependent and
– WCET computation difficult

• Execution times of instructions vary widely
– Best case - everything goes smoothly: no cache miss, operands ready, needed resources free, branch correctly predicted
– Worst case - everything goes wrong: all loads miss the cache, resources needed are occupied, operands are not ready
– Span may be several hundred cycles

Page 12: Timing Analysis and Timing Predictability Reinhard Wilhelm

(Concrete) Instruction Execution

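[Figure: execution of a mul instruction from state s1 to state s2 through four stages, each with a question that determines its cycle count: Fetch (I-Cache miss?), Issue (Unit occupied?), Execute (Multicycle?), Retire (Pending instructions?).]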

Page 13: Timing Analysis and Timing Predictability Reinhard Wilhelm

Timing Accidents and Penalties

Timing Accident – cause for an increase of the execution time of an instruction

Timing Penalty – the associated increase

• Types of timing accidents
– Cache misses
– Pipeline stalls
– Branch mispredictions
– Bus collisions
– Memory refresh of DRAM
– TLB miss

Page 14: Timing Analysis and Timing Predictability Reinhard Wilhelm

Overall Approach: Natural Modularization

1. Processor-Behavior Prediction:
• Uses Abstract Interpretation
• Excludes as many Timing Accidents as possible
• Determines WCET for basic blocks (in contexts)

2. Worst-case Path Determination
• Maps control flow graph to an integer linear program
• Determines upper bound and associated path
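To make the second phase concrete without an ILP solver, the sketch below swaps in a simpler technique: a longest-path search over a loop-free control flow graph. This is only adequate when there are no loops; the slide's approach uses an integer linear program, which also handles loop bounds. The block names and per-block bounds are hypothetical.

```python
from functools import lru_cache


def worst_case_path(cfg, bound, entry):
    """Return (upper bound, path) for an acyclic CFG.

    cfg maps a basic block to its successors; bound maps a basic block to
    the upper bound computed for it by processor-behavior prediction.
    """

    @lru_cache(maxsize=None)
    def best(block):
        # Most expensive continuation among the successors (none at the exit).
        tails = [best(succ) for succ in cfg.get(block, ())]
        cost, path = max(tails, default=(0, ()))
        return bound[block] + cost, (block,) + path

    return best(entry)


# Hypothetical if-then-else shaped CFG with per-block bounds.
cfg = {"entry": ("then", "else"), "then": ("exit",), "else": ("exit",), "exit": ()}
bound = {"entry": 5, "then": 20, "else": 12, "exit": 3}

total, path = worst_case_path(cfg, bound, "entry")
print(total, path)
```

As in the ILP formulation, the result is both an upper bound and a path on which that bound is attained.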

Page 15: Timing Analysis and Timing Predictability Reinhard Wilhelm

Overall Structure

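[Figure: tool structure. Input: executable program. Components: CFG Builder, Loop Trafo, Value Analyzer and Cache/Pipeline Analyzer (static analyses, forming the Processor-Behavior Prediction); ILP-Generator, LP-Solver and Evaluation (path analysis, forming the Worst-case Path Determination). Intermediate files and data: CRL file, AIP file, PER file, loop bounds. Output: WCET visualization.]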

Page 16: Timing Analysis and Timing Predictability Reinhard Wilhelm

Murphy’s Law in Timing Analysis

• Naïve, but safe guarantee accepts Murphy’s Law: Any accident that may happen will happen

• Consequence: hardware overkill necessary to guarantee timeliness

• Example: Alfred Rosskopf, EADS Ottobrunn, measured the performance of a PPC with all caches switched off (this corresponds to the assumption ‘all memory accesses miss the cache’). Result: slowdown by a factor of 30!

Page 17: Timing Analysis and Timing Predictability Reinhard Wilhelm

Fighting Murphy’s Law

• Static Program Analysis allows the derivation of Invariants about all execution states at a program point

• Derive Safety Properties from these invariants: Certain timing accidents will never happen. Example: At program point p, instruction fetch will never cause a cache miss

• The more accidents excluded, the lower the upper bound

• (and the more accidents predicted, the higher the lower bound)

Warning: This story is good, but not always true!

Page 18: Timing Analysis and Timing Predictability Reinhard Wilhelm

True Benchmark Results

• Airbus with flight-control system,

• Mälardalen Univ. in industry projects,

• Univ. Dortmund

have found overestimations of ~10% by aiT.

Page 19: Timing Analysis and Timing Predictability Reinhard Wilhelm

Caches: Fast Memory on Chip

• Caches are used, because
– Fast main memory is too expensive
– The speed gap between CPU and memory is too large and increasing

• Caches work well in the average case:
– Programs access data locally (many hits)
– Programs reuse items (instructions, data)
– Access patterns are distributed evenly across the cache

Page 20: Timing Analysis and Timing Predictability Reinhard Wilhelm

Caches: How they work

CPU wants to read/write at memory address a, sends a request for a to the bus

Cases:

• Block m containing a in the cache (hit): request for a is served in the next cycle

• Block m not in the cache (miss): m is transferred from main memory to the cache, m may replace some block in the cache, request for a is served asap while transfer still continues

• Several replacement strategies (LRU, PLRU, FIFO, ...) determine which line to replace

Page 21: Timing Analysis and Timing Predictability Reinhard Wilhelm

A-Way Set Associative Cache

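[Figure: A-way set associative cache. The address is split into address prefix, set number, and byte in line. The set number selects one set; each set holds A lines (1 2 … A), each with address prefix (tag), replacement bits (Rep), and data block. The address prefix is compared with the tags of the selected set; if none is equal, the block is fetched from main memory. Byte select & align delivers the data out to the CPU.]

Set: Fully associative subcache of A elements with LRU, FIFO, or random replacement strategy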
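The address split in the figure can be written down directly. A minimal sketch, assuming a line size and a number of sets chosen only for illustration (neither value comes from the talk):

```python
def split_address(addr, line_size, num_sets):
    """Split an address into (address prefix, set number, byte in line)."""
    byte_in_line = addr % line_size   # position inside the cache line
    block = addr // line_size         # memory block number
    set_number = block % num_sets     # set the block is mapped to
    prefix = block // num_sets        # tag compared within that set
    return prefix, set_number, byte_in_line


# Hypothetical geometry: 32-byte lines, 128 sets.
print(split_address(0x1234, line_size=32, num_sets=128))
```

Two addresses with the same set number compete for the A lines of one set, which is why the analysis on the following slides can be explained per set.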

Page 22: Timing Analysis and Timing Predictability Reinhard Wilhelm

LRU Strategy

• Each cache set has its own replacement logic => Cache sets are independent: Everything explained in terms of one set

• LRU-Replacement Strategy:
– Replace the block that has been Least Recently Used
– Modeled by Ages

• In the following: 4-way set associative cache
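As a reference point for the abstract analyses, here is a minimal sketch of one concrete LRU set modeled by ages. The list encoding (youngest block first) is an assumption made for illustration.

```python
def lru_access(cache_set, block, ways=4):
    """Access `block` in one LRU set; the list is ordered youngest first.

    Returns the new set contents and whether the access was a hit.
    """
    hit = block in cache_set
    # The accessed block becomes the youngest; on a miss in a full set
    # the oldest block falls off the end.
    others = [b for b in cache_set if b != block]
    return ([block] + others)[:ways], hit


print(lru_access(["z", "y", "x", "t"], "s"))  # miss: t is evicted
print(lru_access(["z", "s", "x", "t"], "s"))  # hit: only z ages
```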

Page 24: Timing Analysis and Timing Predictability Reinhard Wilhelm

Cache Analysis

Static precomputation of cache contents at each program point:

• Must Analysis: which blocks are always in the cache. Determines safe information about cache hits. Each predicted cache hit reduces the upper bound.

• May Analysis: which blocks may be in the cache. Complement says what is never in the cache. Determines safe information about cache misses. Each predicted cache miss increases the lower bound.

Must Analysis → Cache Hits → Upper Bound

May Analysis → Cache Misses → Lower Bound
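How the two analyses are used at a single memory access can be sketched as follows. The function name and the three labels are illustrative assumptions; the inputs are the sets of blocks that the must and may analyses report at that program point.

```python
def classify(block, must, may):
    """Classify one access from the must and may information at its program point."""
    if block in must:
        return "always hit"      # guaranteed to be in the cache
    if block not in may:
        return "always miss"     # guaranteed not to be in the cache
    return "not classified"      # neither analysis gives a guarantee


must = {"a", "b"}
may = {"a", "b", "c"}
for block in ("a", "c", "d"):
    print(block, classify(block, must, may))
```

Only the first outcome lowers the upper bound and only the second raises the lower bound; the third leaves both unchanged.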

Page 25: Timing Analysis and Timing Predictability Reinhard Wilhelm

Cache with LRU Replacement: Transfer for must

Set in concrete cache (ages from “young” to “old”), access [ s ]:

z y x t → s z y x
z s x t → s z x t

Set in abstract cache (ages from “young” to “old”), access [ s ]:

{ x } { } { s, t } { y } → { s } { x } { t } { y }
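The abstract transfer shown above can be sketched in a few lines. The encoding is an assumption for illustration: an abstract set state is a list of Python sets, where index i holds the blocks whose age is at most i.

```python
def must_update(state, block):
    """Must-analysis transfer for an access to `block` in one abstract LRU set."""
    ways = len(state)
    # Upper bound on the age of the accessed block; `ways` if it is not guaranteed cached.
    age = next((i for i, blocks in enumerate(state) if block in blocks), ways)
    new = [set() for _ in range(ways)]
    new[0] = {block}
    for i, blocks in enumerate(state):
        others = blocks - {block}
        # Blocks younger than the accessed one age by one; the rest keep their age.
        target = i + 1 if i < age else i
        if target < ways:          # aging past the last line means eviction
            new[target] |= others
    return new


before = [{"x"}, set(), {"s", "t"}, {"y"}]
print(must_update(before, "s"))
```

Applied to the abstract state on the slide, the sketch reproduces the state after the access to s.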

Page 26: Timing Analysis and Timing Predictability Reinhard Wilhelm

Cache Analysis: Join (must)

{ a } { } { c, f } { d }   joined with   { c } { e } { a } { d }

yields   { } { } { a, c } { d }

“intersection + maximal age”

Access to memory block a is cache hit
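The rule “intersection + maximal age” can be sketched as follows, again assuming an abstract set state is a list of Python sets indexed by age bound.

```python
def must_join(state1, state2):
    """Join of two must states: keep common blocks at their maximal age."""
    def ages(state):
        return {block: i for i, blocks in enumerate(state) for block in blocks}

    a1, a2 = ages(state1), ages(state2)
    joined = [set() for _ in state1]
    for block in a1.keys() & a2.keys():          # intersection
        joined[max(a1[block], a2[block])].add(block)  # maximal age
    return joined


left = [{"a"}, set(), {"c", "f"}, {"d"}]
right = [{"c"}, {"e"}, {"a"}, {"d"}]
print(must_join(left, right))
```

Block a survives the join, so an access to a after the join is still a guaranteed hit, while e and f are dropped because they are known on only one path.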

Page 27: Timing Analysis and Timing Predictability Reinhard Wilhelm

Cache with LRU Replacement: Transfer for may

Concrete set (ages from “young” to “old”), access [ s ]:

z y x t → s z y x
z s x t → s z x t

Abstract set (ages from “young” to “old”), access [ s ]:

{ x } { } { s, t } { y } → { s } { x } { } { y, t }
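A minimal sketch of the may transfer, assuming an abstract set state is a list of Python sets where index i holds the blocks whose age is at least i. The difference to the must transfer is that blocks sharing the age bound of the accessed block also age.

```python
def may_update(state, block):
    """May-analysis transfer for an access to `block` in one abstract LRU set."""
    ways = len(state)
    # Lower bound on the age of the accessed block; `ways` if it cannot be cached.
    age = next((i for i, blocks in enumerate(state) if block in blocks), ways)
    new = [set() for _ in range(ways)]
    new[0] = {block}
    for i, blocks in enumerate(state):
        others = blocks - {block}
        # Blocks at most as old as the accessed one age by one; older ones stay.
        target = i + 1 if i <= age else i
        if target < ways:          # aging past the last line means eviction
            new[target] |= others
    return new


before = [{"x"}, set(), {"s", "t"}, {"y"}]
print(may_update(before, "s"))
```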

Page 28: Timing Analysis and Timing Predictability Reinhard Wilhelm

Cache Analysis: Join (may)
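For the may analysis the join is usually described as “union + minimal age”, the dual of the must join. A minimal sketch under that assumption, with an abstract set state encoded as a list of Python sets indexed by age bound (the example states are illustrative):

```python
def may_join(state1, state2):
    """Join of two may states: keep every block at its minimal age."""
    def ages(state):
        return {block: i for i, blocks in enumerate(state) for block in blocks}

    a1, a2 = ages(state1), ages(state2)
    joined = [set() for _ in state1]
    for block in a1.keys() | a2.keys():          # union
        age = min(a1.get(block, len(state1)), a2.get(block, len(state2)))
        joined[age].add(block)                   # minimal age
    return joined


left = [{"a"}, set(), {"c", "f"}, {"d"}]
right = [{"c"}, {"e"}, {"a"}, {"d"}]
print(may_join(left, right))
```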

Page 29: Timing Analysis and Timing Predictability Reinhard Wilhelm

Cache Analysis

Approximation of the Collecting Semantics

• the semantics determines: set of all cache states for each program point

• “cache” semantics determines: set of all cache states for each program point

• abstract semantics determines: abstract cache states for each program point

[Figure labels: PAG, conc]

Page 30: Timing Analysis and Timing Predictability Reinhard Wilhelm

Timing Predictability

Page 31: Timing Analysis and Timing Predictability Reinhard Wilhelm

Variability of Execution Times

• is at the heart of timing unpredictability,

• is introduced at all levels of granularity
– Memory reference
– Instruction execution
– Function
– Task
– Distributed system of tasks
– Service

Page 32: Timing Analysis and Timing Predictability Reinhard Wilhelm

Penalties for Memory Accesses (in #cycles for PowerPC 755)

cache miss                                    40
cache miss + write back                       80
TLB-miss and loading (12 reads, 1 write)     500
Memory-mapped I/O                            800
Page fault                                  2000

Tendency increasing, since clocks are getting faster more quickly than everything else

Remember: Penalties have to be assumed for uncertainties!
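The last remark can be illustrated with a small calculation. The sketch below uses the cache-miss penalty from the table; the hit cost of 1 cycle and the classification labels are assumptions made for illustration.

```python
# Penalties in cycles, copied from the table for the PowerPC 755.
PENALTY = {
    "cache miss": 40,
    "cache miss + write back": 80,
    "TLB miss": 500,
    "memory-mapped I/O": 800,
    "page fault": 2000,
}


def access_upper_bound(classification, hit_cost=1, penalty=PENALTY["cache miss"]):
    """Upper bound for one access: only a guaranteed hit avoids the penalty."""
    if classification == "always hit":
        return hit_cost
    # "always miss" and "not classified" both have to pay the penalty.
    return hit_cost + penalty


accesses = ["always hit", "not classified", "always miss"]
print(sum(access_upper_bound(c) for c in accesses))
```

An access that the analysis cannot classify costs as much in the bound as a certain miss, which is why reducing uncertainty directly tightens the upper bound.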

Page 33: Timing Analysis and Timing Predictability Reinhard Wilhelm

Further Penalties - Processor periphery

• Bus protocol

• DMA

Page 34: Timing Analysis and Timing Predictability Reinhard Wilhelm

Cache Impact of Language Constructs

• Pointer to data

• Function pointer

• Dynamic method invocation

• Service demultiplexing CORBA

Page 35: Timing Analysis and Timing Predictability Reinhard Wilhelm

Cache with LRU Replacement: Transfer for must under unknown access, e.g. unresolved data pointer

Set of abstract cache, access [ ? ]:

{ x } { } { s, t } { y } → { } { x } { } { s, t }

If address is completely undetermined, same loss of information in every cache set!

Analogously for multiple unknown accesses, e.g. unknown function pointer; assume maximal cache damage
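The transfer for an unknown access is simpler than the regular must transfer: nothing is known to become young, and everything ages by one. A minimal sketch, assuming an abstract set state is a list of Python sets indexed by age bound:

```python
def must_update_unknown(state):
    """Must transfer for an access to an unknown block in one abstract set."""
    # Every block ages by one; the blocks in the oldest line are lost.
    return [set()] + [set(blocks) for blocks in state[:-1]]


def must_update_unknown_all_sets(cache):
    """With a completely undetermined address, every set suffers the same loss."""
    return [must_update_unknown(state) for state in cache]


before = [{"x"}, set(), {"s", "t"}, {"y"}]
print(must_update_unknown(before))
```

After as many unknown accesses as the associativity, the must information of a set is empty.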

Page 36: Timing Analysis and Timing Predictability Reinhard Wilhelm

Dynamic Method Invocation

• Traversal of a data structure representing the class hierarchy

• Corresponding worst-case execution time and resulting cache damage

Page 37: Timing Analysis and Timing Predictability Reinhard Wilhelm

Components

• Component-wise cache-behavior prediction
– a pragmatic, very simplistic notion of Component, i.e. unit of analysis (or compilation)
– A DAG of components defined by calling relationship – cycles only inside components

• RT CORBA – just to frighten you

• A Challenge: Components with predictable timing behavior

Page 38: Timing Analysis and Timing Predictability Reinhard Wilhelm

Component-wise I-Cache Analysis

• So far, analysis done on fully linked executables, i.e. all allocation information available

• Allocation sensitivity
– Placing module into executable at different address changes the mapping from memory blocks to sets
=> Analyze component under some allocation assumption; enforce cache-equivalent allocation by influencing linker

• Cache damage due to calls to a different component
– Caller’s memory blocks can be evicted by callee’s blocks
– Callee’s blocks stay in the cache after return
=> Cache-damage analysis

Page 39: Timing Analysis and Timing Predictability Reinhard Wilhelm

Cache Damage Analysis

• Caller’s memory blocks can be evicted from the cache during the call – the cache damage

• Callee’s memory blocks are in the cache returning from the call – the cache residue

• Cache damage analysis computes safe bounds on number of replacements in sets
– Must analysis: upper-bound
– May analysis: lower-bound

Page 40: Timing Analysis and Timing Predictability Reinhard Wilhelm

Cache Damage Analysis

• Bound of replacements in a set is increased, when accessed memory block mapped to this set is not yet in the cache

• Combined update and join functions
– Use these new functions for fixed point computation

• Cache damage update
– combines results at the function return

Page 41: Timing Analysis and Timing Predictability Reinhard Wilhelm

Cache Damage and Residue

foo() { ... call bar() ... }

bar() { ... }

[Figure: must and may abstract cache states of foo() before the call, of bar(), and of foo() after the return, annotated with replacement bounds per set.]

Page 42: Timing Analysis and Timing Predictability Reinhard Wilhelm

Proposed Analysis Method

• Input: DAG of inter-module call relationship

• Bottom-up analysis
– Start from non-calling module

• For each module:
– Analyze all functions
  • Initial assumptions:
    – Must analysis: cache is empty
    – May analysis: everything can be in the cache with age 0
  • For external calls: use results of cache-damage analysis
– Store results of module analysis
  • Will be used during analysis of calling modules

• Compose analysis results of all modules
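The bottom-up order over the module DAG can be sketched with the standard library's topological sorter. The module names and the stand-in `analyze_module` function are hypothetical; a real analysis would run the must and may analyses there and return cache-damage results.

```python
from graphlib import TopologicalSorter


def analyze_bottom_up(calls, analyze_module):
    """Analyze modules so that every callee is finished before its callers.

    calls maps each module to the set of modules it calls (a DAG).
    """
    results = {}
    # TopologicalSorter yields a node only after all nodes it depends on,
    # so non-calling modules come first.
    for module in TopologicalSorter(calls).static_order():
        callee_results = {callee: results[callee] for callee in calls.get(module, ())}
        results[module] = analyze_module(module, callee_results)
    return results


# Hypothetical call DAG and a stand-in analysis that records the visiting order.
calls = {"main": {"ctrl", "io"}, "ctrl": {"math"}, "io": set(), "math": set()}
order = []


def analyze_module(module, callee_results):
    order.append(module)
    return {"uses": sorted(callee_results)}


results = analyze_bottom_up(calls, analyze_module)
print(order)
```

Because cycles are allowed only inside components, the call relationship between modules is acyclic and such an order always exists.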

Page 43: Timing Analysis and Timing Predictability Reinhard Wilhelm

Real-Time CORBA

• Attempt to achieve end-to-end middleware predictability for distributed real-time systems

• Real-time CORBA is a middleware standard

• Real-Time Specification for Java (RTSJ)
– new memory management models, no GC,
– access to physical memory,
– strong guarantees on thread semantics

Page 44: Timing Analysis and Timing Predictability Reinhard Wilhelm

D. Schmidt et al.: Towards Predictable Real-Time Java … 2003

RT CORBA

Page 45: Timing Analysis and Timing Predictability Reinhard Wilhelm

Making Demultiplexing Predictable

Dynamics in CORBA:

• POAs activated/deactivated dynamically

• Servants within a POA activated/deactivated dynamically

Interface definitions and sets of names of operations are static => use perfect hashing for demultiplexing
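Since the operation names are static, a collision-free table can be built offline. The sketch below is one simple way to obtain a perfect hash, by searching for a seed that maps every name to its own slot; it is not the scheme used by any particular ORB, SHA-256 is chosen only because it is deterministic across runs, and the operation names are hypothetical.

```python
import hashlib


def slot_of(name, seed, size):
    """Deterministic slot for `name` under the given seed."""
    digest = hashlib.sha256(f"{seed}:{name}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % size


def build_perfect_hash(names, max_seed=100_000):
    """Search offline for a seed under which all names get distinct slots."""
    names = list(names)
    size = len(names)
    for seed in range(max_seed):
        if len({slot_of(n, seed, size) for n in names}) == size:
            table = [None] * size
            for n in names:
                table[slot_of(n, seed, size)] = n
            return seed, table
    raise ValueError("no collision-free seed found")


def demultiplex(name, seed, table):
    """Lookup with one hash and one comparison, independent of table content."""
    slot = slot_of(name, seed, len(table))
    return slot if table[slot] == name else None


operations = ["get_altitude", "set_heading", "read_sensor", "reset", "status"]
seed, table = build_perfect_hash(operations)
print(seed, [demultiplex(op, seed, table) for op in operations])
```

The search cost is paid once at build time; at run time each request takes the same number of steps, which is what makes the demultiplexing step predictable.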

Page 46: Timing Analysis and Timing Predictability Reinhard Wilhelm

D. Schmidt et al. Enhancing RT-CORBA …

Page 47: Timing Analysis and Timing Predictability Reinhard Wilhelm

Timing Predictability

Reconciling Predictability with X

• X = (average-case) performance

• X = fault tolerance

• X = reusability/implementation independence

Page 48: Timing Analysis and Timing Predictability Reinhard Wilhelm

Components with Predictable Timing-Behavior - a Challenge -

• Needs HW and tool support
– decreasing variability of execution times by combining static with dynamic mechanisms, e.g. cache freezing, cache + scratchpad memory

• Needs Occam’s razor for the language-concept design
– Hard real-time systems, often safety critical, have different requirements and priorities than systems realized with middleware and components, e.g. less frequent updates, no easy exchangeability of components

Page 49: Timing Analysis and Timing Predictability Reinhard Wilhelm

Acknowledgements

• Christian Ferdinand, whose thesis started all this

• Reinhold Heckmann, Mister Cache

• Florian Martin, Mister PAG

• Stephan Thesing, Mister Pipeline

• Michael Schmidt, Value Analysis

• Henrik Theiling, Mister Frontend + Path Analysis

• Jörn Schneider, OSEK

• Oleg Parshin, Components

Page 50: Timing Analysis and Timing Predictability Reinhard Wilhelm

Recent Publications

• R. Heckmann et al.: The Influence of Processor Architecture on the Design and the Results of WCET Tools, IEEE Proc. on Real-Time Systems, July 2003

• C. Ferdinand et al.: Reliable and Precise WCET Determination of a Real-Life Processor, EMSOFT 2001

• H. Theiling: Extracting Safe and Precise Control Flow from Binaries, RTCSA 2000

• M. Langenbach et al.: Pipeline Modeling for Timing Analysis, SAS 2002

• St. Thesing et al.: An Abstract Interpretation-based Timing Validation of Hard Real-Time Avionics Software, IPDS 2003

• R. Wilhelm: AI + ILP is good for WCET, MC is not, nor ILP alone, VMCAI 2004

• O. Parshin et al.: Component-wise Data-cache Behavior Prediction, ATVA 2004

• L. Thiele, R. Wilhelm: Design for Timing Predictability, 25th Anniversary edition of the Kluwer Journal Real-Time Systems, Dec. 2004

• R. Wilhelm: Determination of Bounds on Execution Times, CRC Handbook on Embedded Systems, 2005