copilot: a hard real-time runtime monitor lee pike | galois, inc. | [email protected] joint work...
Post on 22-Dec-2015
216 views
TRANSCRIPT
QuickTime™ and a decompressor
are needed to see this picture.
Copilot:A Hard Real-Time Runtime Monitor
Lee Pike | Galois, Inc. | [email protected] work with
Alwyn Goodloe | National Institute of AerospaceRobin Morisset | École Normale Supérieure
Sebastian Niller | Technische Universität Ilmenau
QuickTime™ and a decompressor
are needed to see this picture.
QuickTime™ and a decompressor
are needed to see this picture.
QuickTime™ and a decompressor
are needed to see this picture.
© 2010 Galois, Inc.
NeedNeed
How do you know your embedded SW won’t fail? Certification (e.g., DO-178B) is largely process-oriented Probably not formally verified
• Even if so, may make bad assumptions
• E.g., faults, hardware behavior, program semantics, timing, etc.
Unanticipated faults lead to unexpected behavior Unanticipated behavior can put life at risk
Need to detect/respond at runtime
© 2010 Galois, Inc.
Just the FaCTS, Ma’amThe ConstraintsJust the FaCTS, Ma’amThe Constraints
A runtime monitoring solution must satisfy the FaCTS: Functionality: don’t change the target’s behavior Certifiability: don’t require re-certification (e.g., DO-178B)
of the target Timing: don’t interfere with the target’s timing SWaP: don’t exhaust size, weight, power reserves
© 2010 Galois, Inc.
State of Affairsof Embedded SW Runtime VerificationState of Affairsof Embedded SW Runtime Verification
Most runtime monitoring approaches violate at least one FaCTS constraints:• Inlining monitors changes timing properties and makes a new
program (re-certification?)
• Few constant-time/space monitor synthesis tools
Embedded C has not received the same attention as Java
© 2010 Galois, Inc.
OutlineOutline
1. A solution: the Copilot language and compiler
2. Pilot-study1: injecting software faults in a fault-tolerant air-speed system
3. Monitor correctness
4. Conclusions
1Pun intended.
© 2010 Galois, Inc.
Copilot Design (1/2)Copilot Design (1/2)
Target: hard real-time embedded systems
• Common for guidance, navigation, and control systems
Monitoring by sampling data at fixed time period
• Could be C variables/arrays, memory-mapped registers, etc.
• Adequate for hard real-time target programs, driven primarily by time, not events
• Doesn’t require inlining monitors
© 2010 Galois, Inc.
Copilot Design (2/2)Copilot Design (2/2)
Stream / data-flow specification language
• Think: Haskell infinite lists
Generates embedded ANSI C99 programs
• Can run “bare” on minimal microprocessors
Does not modify or instrument target software
• Generates it's own schedule---no RTOS needed
– Distributes monitor processing across the schedule
• But could run as one high-priority RTOS task
Constant-time & constant-space code• Very small data footprints
• Easier WCET estimation
Side effects impossible• A tenet of dataflow languages
© 2010 Galois, Inc.
Copilot Implementation:the eDSL ApproachCopilot Implementation:the eDSL Approach
A Haskell embedded domain-specific language (eDSL) -- e.g., a library
Types, parser, lexer, macro language... for free• ...And probably correct
“Compile” and “interpret” are functions• Interpreter (almost) for free
Atom eDSL is the backend• Originally developed by Tom
Hawkins (huge thanks to Tom!)• Very useful itself for writing more
procedural real-time C code
http://hackage.haskell.org/package/atom
Haskell
Macrolanguage
Front-end
Atom
Copilot
Interpreter
© 2010 Galois, Inc.
Stream Semantics (Append)Stream Semantics (Append)
x .= [0, 1, 2] ++ (varW64 x + 3) (in Copilot) f [0, 1, 2] (in Haskell)
where f :: [Word64] -> [Word64] f x = x ++ f (map (+3) x)
x = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ...
x = [0, 1, 2]
(+3)
[3, 4, 5]
(+3)
[6, 7, 8]
...
all operators arelifted in Copilot
© 2010 Galois, Inc.
Stream Semantics(Drop)Stream Semantics(Drop)
x .= [0, 1, 2] ++ (var x + 3) y = drop 2 (var x) x = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ... y = 2, 3, 4, 5, 6, 7, 8, 9, 10 ...
© 2010 Galois, Inc.
Timed SemanticsTimed Semantics
Period: duration between discrete events Phase: offsets into the period Example:
• x: period 4 phase 1
• y: period 4 phase 3
Copilot ensures synchronization between streams• Assuming synchronization of phases in distributed systems: no
non-faulty processor reaches the start of phase p+1 until every non-faulty processor has started phase p
x1(); x2();
© 2010 Galois, Inc.
initial “don’t care”values
Example Copilot SpecificationExample Copilot Specification
“If the temperature rises more than 2.3 degrees within 2 seconds, then the engine has been shut off.”
(period == 1 sec)
engine :: Streamsengine = do temps .= [0, 0, 0] ++ extF temp 1 overTempRise .= drop 2 (varF temps) > 2.3 + var temps trigger .= var overTempRise ==> extB shutoff 2
externalvariable
phase tosample in
variables arestringsx :: Stringx = “x”
© 2010 Galois, Inc.
Sample Code Generated(Incomplete)Sample Code Generated(Incomplete)
/* engine.sample__shutoff_2 */static void __r6() { bool __0 = true; bool __1 = shutoff; if (__0) { } engine.tmpSampleVal__shutoff_2 = __1;}
/* engine.updateOutput__trigger */static void __r0() { bool __0 = true; bool __1 = engine.tmpSampleVal__shutoff_2; bool __2 = ! __1; float __3 = 2.3F; uint64_t __4 = 0ULL; uint64_t __5 = engine.outputIndex__temps; uint64_t __6 = __4 + __5; uint64_t __7 = 4ULL; uint64_t __8 = __6 % __7; float __9 = engine.prophVal__temps[__8]; float __10 = __3 + __9; uint64_t __11 = 2ULL; uint64_t __12 = __11 + __5; uint64_t __13 = __12 % __7; float __14 = engine.prophVal__temps[__13]; bool __15 = __10 < __14; bool __16 = __2 && __15; bool __17 = ! __16; if (__0) { } engine.outputVal__trigger = __17;}
state-update functionfor trigger stream
external variablesample function
engine :: Streamsengine = do ... trigger = (var overTempRise) ==> (extB shutoff 2)
© 2010 Galois, Inc.
TypesTypes
Types: Int & Word (8, 32, 64), Float, Double Each stream has a unique type:
x .= ([0, 1] ++ (varW64 x + 3) :: Spec Word64) Type-inference to minimize “type hints”: x .= extI32 a 1 > extI32 b 2
y .= [0, 1] ++ (varW64 y + 3 + var y)
Casting• Implicit casting is a type-error:
x .= varW64 y + varW32 z
• Explicit casting guarantees:– signs never lost (no Int --> Word casts)
– No overflow (no cast to a smaller width)
x .= [True] ++ not (var x)
y .= castI32 (varB x) + castI32 (castW8 (varB x))
inferredtypes
© 2010 Galois, Inc.
Don’t like the stream language interface? Then define another DSL over top.
We’ve done just that for• LTL
• past-time LTL
• Statistical properties (in-progress)
Think of these as both new DSLs or as Copilot Libraries. Because they’re all in the same host-language, they can be
intermingled.
“Sir, it’s eDSLs all the way down!” (1/2)“Sir, it’s eDSLs all the way down!” (1/2)
© 2010 Galois, Inc.
“Sir, it’s eDSLs all the way down!” (2/2)“Sir, it’s eDSLs all the way down!” (2/2)
Example: "If the engine temperature exeeds 250 degrees, then the engine is shut off within one second, and in the 0.1 second following the shutoff, the cooler is engaged and remains engaged.”
(period == 0.1 sec)
engine :: Streamsengine = do temp `ptltl` alwaysBeen (extW8 engineTemp 1 > 250) cnt .= [0] ++ mux (varB temp && varW8 cnt < 10) (varW8 cnt + 1) (varW8 cnt) off .= (varW8 cnt >= 10 ==> extB engineOff 1) cooler `ptltl` (extB coolerOn 1 `since` extB engineOff 1) monitor .= varB off && varB cooler
© 2010 Galois, Inc.
Copilot Language RestrictionsCopilot Language Restrictions
Design goal: make memory usage constant and “obvious” to the programmer
No anonymous streams No lazily-computed values
• E.g. x .= [0] + (varW16 x) + 1
y .= drop 2 (varW16 x)
Other restrictions (see paper) Upshot: “WYSIWYG memory usage”
• Memory constrained by number of streams
• Memory for each stream is essentially the LHS of ++
• Doesn’t include stack variables
© 2010 Galois, Inc.
Timing Info & Expression CountsTiming Info & Expression CountsPeriod Phase Exprs Rule------ ----- ----- ---- 3 0 18 engine.updateOutput__trigger 3 0 14 engine.updateOutput__overTempRise 3 0 3 engine.update__temps 3 1 7 engine.output__temps 3 1 2 engine.sample__temp_1 3 2 6 engine.incrUpdateIndex__temps 3 2 2 engine.sample__shutoff_2 ----- 52
Hierarchical Expression Count
Total Local Rule ------ ------ ---- 52 0 engine 6 6 incrUpdateIndex__temps 7 7 output__temps 2 2 sample__shutoff_2 2 2 sample__temp_1 14 14 updateOutput__overTempRise 18 18 updateOutput__trigger 3 3 update__temps
Generated engine.c and engine.hMoving engine.c and engine.h to ./ ...Calling the C compiler ...gcc ./engine.c -o ./engine -Wall
Timinginfo
Expressioncount
helps withWCET analysis
© 2010 Galois, Inc.
Interpreter / Semantics(in one slide)Interpreter / Semantics(in one slide)
interpret :: Streamable a => Vars -> Vars -> Spec a -> [a]interpret inVs moVs s = case s of Const c -> repeat c Var v -> getElem v inVs PVar _ v _ -> getElem v moVs PArr _ (v,s') _ -> map (\i -> getElem v moVs !! fromIntegral i) (interpret inVs moVs s') Append ls s' -> ls ++ interpret inVs moVs s' Drop i s' -> drop i $ interpret inVs moVs s' F f _ s' -> map f $ interpret inVs moVs s' F2 f _ s0 s1 -> zipWith f (interpret inVs moVs s0) (interpret inVs moVs s1) F3 f _ s0 s1 s2 -> zipWith3 f (interpret inVs moVs s0) (interpret inVs moVs s1) (interpret inVs moVs s2)
© 2010 Galois, Inc.
Download, Develop, UseDownload, Develop, Use http://leepike.github.com/Copilot/
BSD3
BSD3Just a cabal
install away
© 2010 Galois, Inc.
UsageUsage compile spec “c-name” [opts] baseOpts interpret spec rounds [opts] baseOpts test rounds [opts] baseOpts
• quickChecking the compiler/interpreter
verify filepath int • SAT solving on the generated C program
help (commands and options) [spec] (parser) Opts (incomplete list):
• C trigger functions
• Ad-hoc C code (library included for writing this)
• Hardware clock
• Verbosity
• GCC options
© 2010 Galois, Inc.
Failures cited in
Northwest Orient Airlines Flight 6231 (1974)---3 killed
• Increased climb/speed until uncontrollable stall
Birgenair Flight 301, Boeing 757 (1996)---189 killed
• One of three pitot tubes blocked; faulty air speed indicator
Aeroperú Flight 603, Boeing 757 (1996)---70 killed
• Tape left on the static port(!) gave erratic data
Líneas Aèreas Flight 2553, Douglas DC-9 (1997)---74 killed
• Freezing caused spurious low reading, compounded with a failed alarm system
• Speed increased beyond the plane’s capabilities
Air France Flight 447, Airbus A330 (2009)---228 killed
• Airspeed “unclear” to pilots
• Still under investigation
Not an exhaustive list!
Interlude: Pitot FailuresInterlude: Pitot Failures
QuickTime™ and a decompressor
are needed to see this picture.
© 2010 Galois, Inc.
Test BedTest Bed
Representative of fault-tolerant systems
4 X STM microcontrollers
ARM Cortex M3 cores clocked at 72 Mhz
MPXV5004DP differential pressure sensor
• Senses dynamic and static pitot tube pressure
• Pitot tubes measure airspeed
Designed to fit UAS (unpiloted air system)
• Size, power, weight,...
T
QuickTime™ and a decompressor
are needed to see this picture.
© 2010 Galois, Inc.
Test Bed ArchitectureTest Bed Architecture
QuickTime™ and a decompressor
are needed to see this picture.
© 2010 Galois, Inc.
Aircraft Configuration (1/2)Edge 540T-R1 Aircraft Configuration (1/2)Edge 540T-R1
QuickTime™ and aBMP decompressor
are needed to see this picture.
© 2010 Galois, Inc.
Aircraft Configuration (2/2)Aircraft Configuration (2/2)
QuickTime™ and aBMP decompressor
are needed to see this picture.
© 2010 Galois, Inc.
Copilot MonitorsCopilot Monitors
Introduced software faults to be caught by Copilot monitors: Abrupt airspeed change:
airspeed > 100 m/s Majority assumption:
• Used the Boyer-Moore majority vote algorithm
• Finds majority in exactly one pass...
• ...If a majority exists, arbitrary value otherwise
• Monitor checks if returned value is majority
Voting agreement:• Check agreement between the voted values
• Uses coordinating distributed monitors
© 2010 Galois, Inc.
Now Execute the Test SuiteNow Execute the Test Suite
QuickTime™ and aYUV420 codec decompressor
are needed to see this picture.
© 2010 Galois, Inc.
Monitoring ResultsMonitoring Results
Monitoring approach worked and did not disrupt the FaCTS properties of the observed system• Under 100 C expressions
• No more than 16 expressions per phase
• Binaries on the order of 10k
Monitoring via sampling works for periodic tasks Off-by-one (phase) in one monitor, so spurious data
• Don’t write monitors at 2am before the flight test :)
© 2010 Galois, Inc.
Correctness (1/2)Correctness (1/2)
Who watches the watchmen? -Wall (Haskell) -Wall (gcc)
• Of course, you can use lint, valgrind, etc. if you like
Copilot is typed (for free!) by Haskell:• Statically typed: type-checking at runtime
• Strongly typed: no implicit type conversions– Makes hacks in the compiler hard---this is good
“QuickCheck” equivalence checker between the Copilot interpreter and generated C code.• Caught a subtle dependency bug
© 2010 Galois, Inc.
Correctness (2/2)Correctness (2/2)
Using CBMC “out of the box” Bounded Model Checker for ANSI-C
• Google: cbmc ansi-c
Provide a depth to unroll the Copilot program Proves absence of
• buffer overflows (both lower and upper)
• pointer deferences of NULL
• division by zero
• floating point computations resulting in not-a-number (NaN)
• uninitialized local variables use
© 2010 Galois, Inc.
“Why not just use Esterel?”“Why not just use Esterel?”
x Not certified Open-source (BSD3), free
• A good platform for open research & education
• Extensible (e.g., LTL libraries)
Focused on monitoring, not control Correctness:
• Copilot: ~3k LOCs, Atom: ~2k LOCs
• Working on full proof of correspondence
Need to confirm:• Smaller memory footprint
• More control over memory usage, scheduling, etc.
© 2010 Galois, Inc.
Future WorkFuture Work
Equivalence checking: interpreter <--> C code• Using SAT-solving on AIG structures, à la Cryptol (e.g., Erkök,
Carlsson, Wick. Hardware/software co-verification of cryptographic algorithms using Cryptol, FMCAD, 2009)
The steering problem• Mode change
Fault-tolerant monitor generation• Monitors need 10-9 failures/hour reliability, too!
Programmer assistance with scheduling Sampling rate vs. data history MC/DC coverage (should be nearly trivial)
© 2010 Galois, Inc.
ConclusionsConclusions
Problem space: hard real-time embedded C• Just the FaCTS, ma’am:
– Functionality, certifiability, timing, SWaP
• Monitoring by sampling
Goal: build a language/compiler that• Is expressive enough -- e.g., ptLTL properties
• But enforces austere memory usage requirements
Use an eDSL approach• Improve the probability of correctness
• Reduce the overhead of a new compiler
Shameless plug: We’re looking for a 2010 summer intern: email <[email protected]>
Acknowledgements: Thanks to LaRC’s Safety Critical Aeronautics Systems Branch/D320 allowing us to fly with them. This work is supported by Contract NNL08AD13T and monitored by Dr. Ben Di Vito.
© 2010 Galois, Inc.
Appendix
© 2010 Galois, Inc.
Monitoring By SamplingMonitoring By Sampling
Without inlining monitors, we must sample: Property (011)* False positive (monitor misses an fault):
• Values are 0111011 but sampling 011011
False negative (monitor signals a fault that didn’t occur): • Values are 011011 but sampling 0111011
Observation: with fixed periodic schedule and shared clock• False negatives impossible
• We don’t want to re-steer an unbroken system
• False positives possible, but requires constrained misbehavior
© 2010 Galois, Inc.
Pitot DataPitot Data
QuickTime™ and a decompressor
are needed to see this picture.
© 2010 Galois, Inc.
Distributed MonitorsDistributed Monitors