Model Checking Software Artifacts
SAnToS Laboratory, Kansas State University, USA
http://www.cis.ksu.edu/cadena
http://www.cis.ksu.edu/bandera
http://www.cis.ksu.edu/bogor
Principal Investigators
Matt Dwyer, John Hatcliff, Gurdip Singh
Students
William Deng, Georg Jung, Oksana Tkachuk, Robby, Venkatesh Ranganath, Jesse Greenwald, Todd Wallentine
Support
• US National Science Foundation (NSF)
• US National Aeronautics and Space Administration (NASA)
• US Department of Defense Advanced Research Projects Agency (DARPA)
• US Army Research Office (ARO)
• Rockwell-Collins ATC
• Honeywell Technology Center and NASA Langley
• Sun Microsystems
• Intel
For the past decade …
We’ve been developing program analysis frameworks
Standard tensions
• Scalable versus Precise: How semantic is the analysis?
• Property-specific versus Language-based: How rich are the properties?
• Push-button versus Configurable: How usable is the technology?
Analyzing Source Code
Worked on a broad range of case studies
• SPIN, SMV, …
• Extracting models by hand
Developed a series of tool frameworks for analyzing safety properties of concurrent programs
• FLAVERS (Ada)
• INCA Translators (Ada)
• Bandera (Java)
Successful?
Tools are widely used
• For education
• As a basis for further work by us and others
Tools have been used to find bugs in real systems
• 1000-10000 LOC
• <10 threads
• Bugs that eluded significant testing efforts
Whole Program Analysis
Will never scale to large code bases
• even for highly abstract analyses (e.g., control flow)
• even for simple properties (e.g., def-use)
Must perform modular analyses
• Hard to do for truly global properties?
• Hard to do in presence of concurrency?
• What are the natural module boundaries?
• How big can a module be?
A Solution …
Target the full range of software artifacts
• Requirements models
• Architectural descriptions
• Designs (at various levels of refinement)
• Code
Use semantic analyses
• within artifacts (properties)
• across different artifacts (conformance)
Features of our Vision
Early and varied semantic modeling
• structural modeling is useful as well
Analysis driven feedback and refinement
Artifact generating analyses
• Proofs, reachable modes, …
Synthesize code wherever possible
Aspects of an agile process
• continuous delivery of working artifacts
Exploit "domain information" throughout
• ultimately meta-tools may be useful
Development Flow
[Figure, built up over several slides. Recoverable labels, in order of appearance:]
• User’s informal requirements; Requirements Models; query checker, visualization tools; consistency, completeness, … checker
• Functional Model, Functional Model’, Performance Model; model-specific analysis; inter-model consistency, completeness, … checking
• Design Models; conformance checker(s)
• Design layers: Structural Design Model, Synchronization Policy Spec, Quality of Service Spec, Abstract Behavioral Model, …; multi-layer conformance checking
• Code; conformance checker(s)
• Model/spec dependent synthesis procedures (proof generating); Domain-appropriate Implementation Framework
Lessons
Adapt methods to developers
• Ease of use, leverage domain abstractions
Use layered, incremental methods
• Low entry barrier, early and focused feedback
Focus technology on the hard part
• Synchronization, timing, global properties
Synthesize as much code as possible
• Developer buy-in, reduce code-level reasoning
• Developers won’t write specs, so tell them they are writing code
and now for Bogor …
Model Checking in Cadena
Steps toward our vision
Hard problems here are
• not component coding (localized)
• Inter-component coordination (sequencing, synchronization, timing, …)
Theme
• exploit domain semantics
• exploit implementation infrastructures
An Overview of …
Component modeling
Middleware modeling
• Develop an abstract model that captures semantics of actual middleware
Environment modeling
• Exploit environment information to reduce state space
Property specification
Structural reductions
• Exploit structure of state space of periodic RT systems
Modal SP
Component Behavior
component BMModal {
  uses ReadData dataIn;
  consumes DataAvailable inDataAvailable;
  publishes DataAvailable outDataAvailable;
  provides ReadData dataOut;
  provides ChangeMode modeChange;

  enum Modes (enabled,disabled);
  Modes m;

  behavior {
    handles dataInReady (DataAvailable e) {
      case m of
        enabled {
          dataOut::data <- dataIn.getData();
          push {} dataOutReady;
        }
        disabled {}
    }
    …
Callouts highlighted on successive slides:
• mode declaration using CORBA IDL (enum Modes, Modes m)
• behavior for events on dataInReady port (handles dataInReady …)
• behavior mode cases (case m of …)
• data flow specification (dataOut::data <- dataIn.getData())
• publish event (push {} dataOutReady)
Towards a Complete Model
We have transition semantics for intra-component behavior.
How should we model the communication layer?
Middleware/Service Semantics
Weak CCM and Event Services Specs (OMG)
• Informal: English and examples
• Intentionally under-specified to allow implementor freedom
Looked at implemented semantics of existing ORBs and Event Services
• ACE/TAO, OpenCCM, K-State
Developed a family of semantic models that captured their behavior
Outline of Real System
[Figure: event channel with internal thread pool. Recoverable labels:]
• Thread pool with rate groups at 60Hz, 20Hz, 5Hz, 1Hz
• Dispatch queues for each rate group
• Proxy consumer holds list of consumer references
• publish; correlation & filtering; consumer refs
• Threads run call-backs associated with event consumer ports
• Passive components; getData
System Observations
[Figure: the event channel with internal thread pool (60Hz, 20Hz, 5Hz, 1Hz rate groups), annotated with these observation labels:]
• invoke[m,c]
• publish[e,c]
• dispatch[e]
• accept[e,s]
• eof[r]
• c.m == v
Modeling of Components
Component behavior spec:
handles dataInReady (DataAvailable e) {
  case m of
    enabled {
      dataOut::data <- dataIn.getData();
      push {} dataOutReady;
    }
    disabled {}
}
BOGOR model:
function tacticalSteering_push_inDataAvailable(CAD.Event event)
{
  Data d;
  loc loc0: live {}
    when tacticalSteeringMode do {} goto loc1;
    when !tacticalSteeringMode do {} return;
  loc loc1: live {d}
    when true do {
      d := CAD.getField<Data>(AirFrame, "ReadData.data");
    } goto loc2;
  loc loc2: live {}
    when true do {
      CAD.setField<Data>(TacticalSteering, "dataOut.data", d);
    } goto loc3;
  loc loc3: live {}
    invoke pushOfProxy(TacticalSteering, "dataOutReady")
    return;
}
Structure follows component behavior spec and connection representation closely
Modeling of Connections
Connection spec:
instance AirFrame of BMLazyActive on l2 {
  connect dataAvailable to GPS.dataCurrent atRate 20
  connect dataIn to GPS.dataOut
  …
BOGOR model:
CAD.connectEvent(GPS, "dataCurrent", AirFrame, "inDataAvailable", 20, false);
Modeled very directly in BOGOR
Modeling Middleware (Threads)
[Figure: thread pool with 60Hz, 20Hz, 5Hz, 1Hz rate groups]
thread threadgroup5() {
  Pair.type<EventHandlerEnum, CAD.Event> pair;
  EventHandlerEnum handler;
  CAD.Event event;

  loc loc0: live { handler, event }
    when Queue.size<Pair.type<EventHandlerEnum, CAD.Event>>(Q5) > 0
    do invisible {
      pair := Queue.getFront<Pair.type<EventHandlerEnum, CAD.Event>>(Q5);
      Queue.dequeue<Pair.type<EventHandlerEnum, CAD.Event>>(Q5);
      handler := Pair.first<EventHandlerEnum, CAD.Event>(pair);
      event := Pair.second<EventHandlerEnum, CAD.Event>(pair);
    } goto loc1;

  loc loc1: live {}
    invoke virtual f(handler, event)
    goto loc0;
}
Callouts:
• Dispatch queue polling
• Extend model checker types
• Polymorphic extension
Modeling Middleware (Queues)
extension Queue for edu.ksu.cis.cadena.bogor.ext.Queue {
  typedef type<'a>;
  expdef int size<'a>(Queue.type<'a>);
  expdef int capacity<'a>(Queue.type<'a>);
  expdef boolean isFull<'a>(Queue.type<'a>);
  expdef boolean isEmpty<'a>(Queue.type<'a>);
  expdef Queue.type<'a> create<'a>(int);
  actiondef enqueue<'a>(Queue.type<'a>, 'a);
  expdef 'a getFront<'a>(Queue.type<'a>);
  actiondef dequeue<'a>(Queue.type<'a>);
  expdef boolean containsPair<'a>(Queue.type<'a>, 'a);
}
Data in state space, operations implemented as Java code
Modeling Middleware (Scheduling)
[Figure: thread pool with 60Hz, 20Hz, 5Hz, 1Hz rate groups]
Typically model checkers use non-deterministic scheduling
• i.e., choose from set of enabled transitions in a state
• set of all such schedules contains all real schedules
Bold Stroke systems are scheduled based on RMA
• run highest-priority (i.e., rate) enabled action
• many fewer schedules, contains all real schedules
BOGOR allows encoding specific schedules
• Java plugin filters enabled actions in state exploration algorithm
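The difference between the two scheduling policies can be sketched as a filter over the enabled actions of a state. This is a minimal Python illustration, not the Bogor plugin itself (which the slides say is written in Java); the `Action` class, its `rate_hz` field, and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """An enabled transition, tagged with the rate (Hz) of the thread that owns it."""
    name: str
    rate_hz: int


def nondeterministic_schedule(enabled):
    """Default model-checker policy: every enabled action is explored."""
    return list(enabled)


def priority_schedule(enabled):
    """Priority policy (assumed here: higher rate = higher priority):
    keep only the enabled actions of the highest rate."""
    if not enabled:
        return []
    top = max(a.rate_hz for a in enabled)
    return [a for a in enabled if a.rate_hz == top]


enabled = [
    Action("t60.dispatch", 60),
    Action("t20.dispatch", 20),
    Action("t5.dispatch", 5),
]
print([a.name for a in nondeterministic_schedule(enabled)])
print([a.name for a in priority_schedule(enabled)])
```

The search explores one successor per action returned by the filter, so the priority policy prunes every interleaving in which a lower-rate thread runs while a higher-rate one is enabled.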
Modeling of Environment
Model time directly
• expensive (state space becomes acyclic)
• hard to get accurate timing info (platform specific)
• Boeing isn’t very interested in real-time properties other than schedulability (?)
Abstract time modeling strategies
• Timeouts can happen at any time
• Bound number of timeouts in hyper-period
• Bound relative number of timeouts in adjacent rate groups
• Approximate passage of time
System behavior is driven by periodic time-triggered events
Relative Timeout Counts
[Figure: timeline of timeouts for rate groups R1, R2, R3, with labels R1; R1; R1, R2; R1, R2, R3]
Assume that the worst-case execution time of Ri work fits within the period of Ri
There is a pattern to the number of timeouts in a frame
• e.g., in a frame of Ri there are two timeouts of Ri-1
Relative Timeout Counts
Enforce only the relative # of timeouts for adjacent rates
Timeout for Ri is enabled after
• work for Ri is complete
• proper number of timeouts for Ri-1 are performed
[Figure: timeline of R1 and R2 timeouts]
Problem: Don’t know how long R2 work takes
• Next R1 timeout could fall in the middle of R2 work
• Must consider all interleavings of R1 timeout and actions performed in R2 work (or R3 work, …)
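The enabling rule for timeouts can be sketched as a small guard over per-rate counters. This Python sketch is an illustration under stated assumptions, not the Cadena/Bogor encoding: the class and method names are hypothetical, rate 0 is the fastest, and it additionally assumes that a faster rate may not time out again once its quota for the current slower frame is used up.

```python
class RelativeTimeouts:
    """Tracks timeouts of adjacent rate groups; index 0 is the fastest rate."""

    def __init__(self, ratios):
        # ratios[i] = number of timeouts of rate i-1 per frame of rate i
        # (ratios[0] is unused because rate 0 has no faster neighbour).
        self.ratios = ratios
        n = len(ratios)
        self.work_done = [True] * n   # no work pending before the first timeout
        self.count = [0] * n          # timeouts of rate i since the last timeout of rate i+1

    def enabled(self, i):
        if not self.work_done[i]:
            return False              # work for Ri must be complete
        if i > 0 and self.count[i - 1] != self.ratios[i]:
            return False              # proper number of R(i-1) timeouts not yet seen
        if i + 1 < len(self.ratios) and self.count[i] >= self.ratios[i + 1]:
            return False              # assumption: quota for the slower frame is used up
        return True

    def fire(self, i):
        assert self.enabled(i)
        self.work_done[i] = False     # the timeout releases new work for Ri
        self.count[i] += 1
        if i > 0:
            self.count[i - 1] = 0     # a new frame of Ri starts

    def complete_work(self, i):
        self.work_done[i] = True


rt = RelativeTimeouts([None, 2])      # two R1 timeouts per R2 frame
for _ in range(2):
    rt.fire(0)
    rt.complete_work(0)
rt.fire(1)                            # R2 times out; its work is now pending
print(rt.enabled(0))                  # R1 may time out while R2 work is still running
```

Because the R1 timeout stays enabled while R2 work is pending, a search over this model explores the interleavings of that timeout with the individual actions of R2 work.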
Modeling of Environment
Previous model does not relate component execution with passage of time
Assume we have worst-case execution bounds for event handlers
• e.g., from schedulability analysis
Keep track of intra-hyper-period time (ihp) normalized by duration of shortest action
Increment ihp by duration bounds for handlers as they are executed
One tricky issue …
Lazily-Timed Components
[Figure: ihp timelines showing frame boundaries and the handler duration t_handler]
Handler duration fits in before next frame boundary:
1. ihp += t_handler
2. execute handler to completion
Handler duration overruns next frame boundary:
1. ihp += …
2. choose prefix of handler to execute
3. assign residual duration to handler suffix
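The two cases can be sketched as one function that advances ihp and returns the residual duration. This is a hedged Python illustration: the function name, the `frame_len` parameter, and the integer time units are assumptions, and in the overrun case it assumes that ihp advances exactly to the frame boundary (the increment itself is not legible in the transcript). The nondeterministic choice of handler prefix is not modeled.

```python
def advance_ihp(ihp, t_handler, frame_len):
    """Return (new_ihp, residual) after charging a handler's duration bound.

    ihp        -- intra-hyper-period time, in units of the shortest action
    t_handler  -- worst-case duration bound of the handler
    frame_len  -- frame length of the fastest rate group (assumed parameter)
    """
    next_boundary = (ihp // frame_len + 1) * frame_len
    remaining = next_boundary - ihp
    if t_handler <= remaining:
        # Fits before the boundary: charge the whole bound, handler runs to completion.
        return ihp + t_handler, 0
    # Overruns the boundary: advance to the boundary (assumption) and
    # carry the rest of the bound over to the handler suffix.
    return next_boundary, t_handler - remaining


print(advance_ihp(1, 1, 4))
print(advance_ihp(3, 2, 4))
```

A nonzero residual would be charged when the suffix of the handler runs, after the timeout at the boundary has been taken into account.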
Preliminary Results (Full Search)
System       Configuration                                         ND          Priority     Lazily-Timed
Basic        1 rate, 3 components, 2 events per hyper-period       20, .12s    12, .11s     14, .11s
Multi-Rate   2 rates, 6 components, 6 events per hyper-period      120k, 5m    100, .38s    33, .19s
Modal        3 rates, 8 components, 125 events per hyper-period    3M+, ?      9.1k, 8.6s   900, 1.3s
Medium       2 rates, 50 components, 820 events per hyper-period   13M+, ?     740k, 29m    4k, 8.6s
Functional Properties
Property I: System never reaches a state where TacticalSteering and NavSteering are both disabled
Property II: If navSteering is enabled when 20Hz timeout occurs, then airFrame should fetch navSteering data [before end of frame]
Event-based Specifications
Many properties of interest in this domain are event oriented
• Some require access to state information
A state qualified event (written e + f) defines the occurrence of an observable event, e, in a state that satisfies a given formula, f.
For example,
If component c1 is in mode A when component c2 produces event E, then component c3 will consume event F.
No trace of form
.*; publish[E, c2] + c1.mode == A; [- invoke[Fhandler, c3]]*
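One way to read the forbidden-trace pattern is as a check over a finite trace of (event, state) pairs. The Python sketch below is an illustration only: the function name, the string encoding of events, and the dictionary encoding of states are all assumptions, and it treats the trace as finite (for example, one frame).

```python
def violates(trace, trigger, guard, response):
    """True if the trace matches the forbidden form: some `trigger` event occurs
    in a state satisfying `guard`, and `response` never occurs afterwards.

    trace -- list of (event, state) pairs; state is a dict of observed values
    """
    for i, (event, state) in enumerate(trace):
        if event == trigger and guard(state):
            if all(later != response for later, _ in trace[i + 1:]):
                return True
    return False


def in_mode_a(state):
    return state["c1.mode"] == "A"


ok_trace = [
    ("publish[E,c2]", {"c1.mode": "A"}),
    ("dispatch[E]", {"c1.mode": "A"}),
    ("invoke[Fhandler,c3]", {"c1.mode": "A"}),
]
bad_trace = ok_trace[:2]  # the publish is never followed by the invoke

print(violates(ok_trace, "publish[E,c2]", in_mode_a, "invoke[Fhandler,c3]"))
print(violates(bad_trace, "publish[E,c2]", in_mode_a, "invoke[Fhandler,c3]"))
```

The guard is evaluated only at the state in which the event occurs, which is what distinguishes a state qualified event from a plain event pattern.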
Exploiting System Structure
Symmetry reductions
• Equivalence between states
Partial order reductions
• Commutativity between transitions
Collapse compression
• Sharing between state components
Can we exploit the time-triggered nature of real-time systems?
A Simple Transition System
l1: y = 0; goto l2;
l2: x = 0; goto l3;
l3: true -> x = 2; goto l4;
    true -> x = 3; goto l4;
l4: y = y + x; goto l5;
l5: y>5 -> skip; goto end;
    y<=5 -> skip; goto l2;
end:
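The state space of this transition system is small enough to enumerate directly. The Python sketch below encodes a state as (location, x, y) and explores it with a plain depth-first search; the initial values x = 0 and y = 0 are an assumption, since the slide does not give them.

```python
def successors(state):
    """Successor states of the simple transition system; state = (pc, x, y)."""
    pc, x, y = state
    if pc == "l1":
        return [("l2", x, 0)]
    if pc == "l2":
        return [("l3", 0, y)]
    if pc == "l3":
        return [("l4", 2, y), ("l4", 3, y)]   # both guards are true: nondeterministic choice
    if pc == "l4":
        return [("l5", x, y + x)]
    if pc == "l5":
        return [("end", x, y)] if y > 5 else [("l2", x, y)]
    return []                                  # "end" has no successors


def reachable(init):
    """Plain DFS that stores every visited state."""
    seen, stack = {init}, [init]
    while stack:
        for nxt in successors(stack.pop()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


states = reachable(("l1", 0, 0))               # assumed initial values x = 0, y = 0
final_y = sorted({y for pc, _, y in states if pc == "end"})
print(len(states), final_y)
```

Every pass through l2 resets x to 0 while y keeps growing, which is the repeating structure that the decomposition exploits.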
System State Space
State Space Decomposition
Synopsis
A system is quasi-cyclic if
• A subset of its state variables repeatedly reaches certain fixed values (e.g., initial values)
• The rest of the variables can vary freely
Decompose DFS of quasi-cyclic system
• BFS of quasi-cyclic regions
• DFS within regions
Memory bounded by region DFS
Time penalty due to redundant state visits
Cadena models are quasi-cyclic
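The decomposition can be sketched on the simple transition system shown earlier in the talk. This Python sketch is an illustration under assumptions, not Bogor's implementation: the function names are hypothetical, a region boundary is taken to be any state at location l3 (where x has just been reset to 0), and the initial values are x = 0, y = 0.

```python
from collections import deque


def successors(state):
    """The simple transition system from the earlier slide; state = (pc, x, y)."""
    pc, x, y = state
    if pc == "l1":
        return [("l2", x, 0)]
    if pc == "l2":
        return [("l3", 0, y)]
    if pc == "l3":
        return [("l4", 2, y), ("l4", 3, y)]
    if pc == "l4":
        return [("l5", x, y + x)]
    if pc == "l5":
        return [("end", x, y)] if y > 5 else [("l2", x, y)]
    return []


def region_dfs(root, is_boundary):
    """DFS inside one region; boundary states are not expanded, they root later regions."""
    seen, stack, next_roots = {root}, [root], set()
    while stack:
        for nxt in successors(stack.pop()):
            if is_boundary(nxt):
                next_roots.add(nxt)
            elif nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen, next_roots


def quasi_cyclic_search(init, is_boundary):
    """BFS over region roots; each region's visited set is dropped after its DFS."""
    roots_seen, frontier = {init}, deque([init])
    regions, peak = 0, 0
    while frontier:
        region, next_roots = region_dfs(frontier.popleft(), is_boundary)
        regions += 1
        peak = max(peak, len(region))          # memory is bounded by the largest region
        for root in next_roots - roots_seen:
            roots_seen.add(root)
            frontier.append(root)
    return regions, peak


regions, peak = quasi_cyclic_search(("l1", 0, 0), lambda s: s[0] == "l3")
print(regions, peak)
```

Only the set of region roots persists across regions. In a larger system two regions can reach the same interior state, and that state is then explored twice, which is the time penalty noted above.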
Parallel Quasi-cyclic Search
Region DFS are completely independent
• Embarrassingly parallel
Naïve implementation on 4 processors overcomes overhead
This summer check realistic scenario
• ~400 components, ~130 modal components
Ultimate Modeling View
CCM IDL
Model Layer
• Check mode behaviors, temporal properties, timing constraints
Code Layer
• Check that implementations satisfy/refine high-level specs: leverage the fact that code skeletons are automatically generated
• Generate code, fill in skeletons, check for refinement
We don’t do all of this yet!
Some Ongoing Work
Integration of specification support into Cadena
• Model generation from Cadena
• Counter-example display
• Refinement checking (e.g., dependences against state machines)
• Incorporating synchronization info
• Modeling distributed nodes
• Incorporating time into spec language