Language Workbenches, Embedded Software and
Formal Verification
Markus Voelter independent/itemis
www.voelter.de voelterblog.blogspot.de
@markusvoelter +Markus Voelter
Daniel Ratiu ForTISS GmbH
www.fortiss.org
mbeddr
1
An extensible version of the C programming language for Embedded Programming
C the Difference – C the Future
Funded by the BMBF (German Federal Ministry of Education and Research), grant number 01IS11014
An extensible C with support for formal methods,
requirements and PLE.
IDE for Everything
A debugger for all of that
SDK for building your own Language
Extensions!
JetBrains
MPS Open Source Language Workbench
Challenges in embedded software
development
Abstraction without
Runtime Cost
C considered unsafe
Program Annotations
Static Checks and
Verification
Product Lines and
Requirement Traces
Separate, hard to integrate Tools
Subset of Available
Extensions
All of C (cleaned-up)
Retargetable Build Integration
Native Support for Unit Testing
and Logging
Physical Units
Components Interfaces Contracts Instances
Mocks & Stubs
State Machines +
Model Checking
Decision Tables +
Consistency and Completeness
Checks
Support for Frama-C
+ High-level Iterators
IDE Support for Frama-C Annotations
Generate Frama-C Annotations from Higher-level Constructs
Requirements Traceability
Product Line Variability
Status and
Availability
http://mbeddr.com
Developed within LWES (Language Workbenches for Embedded Systems)
Most is Open Source (EPL); the
rest will follow this year.
support for graphical early
2013
integration in early 2013
Language Engineering
w/ Language Workbenches
2
Introduction
A DSL is a focussed, processable language for describing a specific
concern when building a system in a specific domain. The abstractions
and notations used are natural/suitable for the stakeholders who
specify that particular concern.
Concepts (abstract syntax)
(concrete) Syntax
semantics (generators)
Tools and IDE
                         more in GPLs                     more in DSLs
Domain Size              large and complex                smaller and well-defined
Designed by              guru or committee                a few engineers and domain experts
Language Size            large                            small
Turing-completeness      almost always                    often not
User Community           large, anonymous and widespread  small, accessible and local
In-language abstraction  sophisticated                    limited
Lifespan                 years to decades                 months to years (driven by context)
Evolution                slow, often standardized         fast-paced
Incompatible Changes     almost impossible                feasible
C
LEGO Robot Control
Components
State Machines
Sensor Access
General Purpose
Domain Specific
Big Language: L with many first-class concepts (a, b, c, …)!
Small Language: L with a few, orthogonal and powerful concepts (α, β, λ, ω, δ)
Modular Language: my L with many optional, composable modules
Model
an abstraction or simplification of reality (ecosurvey.gmu.edu/glossary.htm)
which ones?
what should we leave out?
Model Purpose
… code generation
… analysis and checking
… platform independence
… stakeholder integration
… drives design of language!
Programs Languages Domains
Domain
body of knowledge in the real world: deductive, top down
existing software (family): inductive, bottom up
Domain Hierarchy
all programs
embedded software
automotive avionics
Extended C Example: Heterogeneous
Heterogeneous: C, Statemachines, Testing
Domain Hierarchy
more specialized domains, more specialized languages
Reification
Dn, Dn+1
Reification == Language Definition
Transformation/Generation
?: Overspecification! Requires Semantic Analysis!
!: Declarative! Directly represents Semantics.
Linguistic Abstraction
Def: DSL
A DSL is a language at D that provides linguistic abstractions for common patterns and idioms of a language at D-1 when used within the domain D.
Def: DSL cont’d
A good DSL does not require the use of patterns and idioms to express semantically interesting concepts in D. Processing tools do not have to do “semantic recovery” on D programs.
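The difference between an idiom and a linguistic abstraction can be sketched in a few lines of Python. This is an illustration only, not mbeddr code: the names `step_idiom`, `Transition`, `StateMachine` and the traffic-light example are all hypothetical. The first function hides the state machine in control flow, so a tool would need semantic recovery to find the states. The second form declares states and transitions as first-class data.

```python
from dataclasses import dataclass

# Idiom at level D-1: the state machine is encoded in if-statements.
# A tool has to recover the states and transitions by analysing the code.
def step_idiom(state, event):
    if state == "red" and event == "timer":
        return "green"
    if state == "green" and event == "timer":
        return "red"
    return state

# Linguistic abstraction at level D: states and transitions are declared.
@dataclass(frozen=True)
class Transition:
    source: str
    event: str
    target: str

@dataclass(frozen=True)
class StateMachine:
    states: tuple
    transitions: tuple

    def step(self, state, event):
        # Take the first transition that matches; otherwise stay in place.
        for t in self.transitions:
            if t.source == state and t.event == event:
                return t.target
        return state

LIGHT = StateMachine(
    states=("red", "green"),
    transitions=(
        Transition("red", "timer", "green"),
        Transition("green", "timer", "red"),
    ),
)

def declared_states(machine):
    # Trivial for a tool: the states are data, not hidden in control flow.
    return set(machine.states)
```

Both forms behave the same, but only the declarative one lets a checker or IDE read the states directly.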
Declarative!
Another Example (Extended C Example)
Turing Complete! Requires Semantic Analysis!
Linguistic Abstraction (Extended C Example)
Linguistic Abstraction vs. In-Language Abstraction
In-Language Abstraction: Libraries, Classes, Frameworks
In-Language Abstraction: User-Definable, Simpler Language
Linguistic Abstraction: Analyzable, Better IDE Support
Special Treatment!
Semantics
Static Semantics
Execution Semantics
Static Semantics
Constraints, Type Systems
Extended C Example
Unique State Names
Unreachable States
Dead End States
…
Easier to do on a declarative level!
Thinking of all constraints is a coverage problem!
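The three constraints named above are easy to check once the state machine is declarative data. The following is a minimal sketch under assumptions: the function name `check_state_machine`, the data layout (a list of state names plus `(source, target)` pairs) and the example machine are hypothetical and do not reflect how mbeddr or MPS implement these checks.

```python
from collections import Counter

def check_state_machine(states, initial, transitions):
    """Return constraint violations for a declarative state machine.

    states: list of state names; transitions: list of (source, target).
    """
    errors = []
    # Unique state names: every name may be declared only once.
    for name, count in Counter(states).items():
        if count > 1:
            errors.append(f"duplicate state name: {name}")
    # Unreachable states: graph search starting at the initial state.
    reachable, todo = {initial}, [initial]
    while todo:
        current = todo.pop()
        for source, target in transitions:
            if source == current and target not in reachable:
                reachable.add(target)
                todo.append(target)
    for name in sorted(set(states) - reachable):
        errors.append(f"unreachable state: {name}")
    # Dead end states: states without any outgoing transition.
    sources = {source for source, _ in transitions}
    for name in sorted(set(states) - sources):
        errors.append(f"dead end state: {name}")
    return errors

errors = check_state_machine(
    states=["idle", "running", "stopped", "orphan", "idle"],
    initial="idle",
    transitions=[
        ("idle", "running"),
        ("running", "idle"),
        ("running", "stopped"),
        ("orphan", "idle"),
    ],
)
```

For this example machine, `errors` reports the duplicate `idle`, the unreachable `orphan` and the dead end `stopped`. The list of checks is only as complete as the constraints someone thought of, which is the coverage problem mentioned above.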
What does a type system do?
Assign fixed types
Derive Types
Calculate Common Types
Check Type Consistency
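The four tasks of a type system can be shown on a toy expression language. This is a sketch under assumptions: the tuple-based expression encoding, the type names and the functions `type_of`, `common_type` and `check_assignment` are invented for illustration and are not taken from any particular language workbench.

```python
# Numeric types ordered from narrow to wide.
NUMERIC_RANK = {"int": 0, "float": 1}

def common_type(left, right):
    """Calculate the common type of two numeric types (the wider one)."""
    if left in NUMERIC_RANK and right in NUMERIC_RANK:
        return left if NUMERIC_RANK[left] >= NUMERIC_RANK[right] else right
    raise TypeError(f"no common type for {left} and {right}")

def type_of(expr, env):
    """Compute the type of an expression given declared variable types."""
    kind = expr[0]
    if kind == "int_lit":       # assign fixed types to literals
        return "int"
    if kind == "float_lit":
        return "float"
    if kind == "bool_lit":
        return "bool"
    if kind == "var":           # derive the type from the declaration
        return env[expr[1]]
    if kind == "plus":          # calculate the common type of both operands
        return common_type(type_of(expr[1], env), type_of(expr[2], env))
    raise ValueError(f"unknown expression kind: {kind}")

def check_assignment(var_name, expr, env):
    """Check type consistency: the value must fit the declared type."""
    declared, actual = env[var_name], type_of(expr, env)
    if declared == actual:
        return True
    # Widening (int to float) is accepted, narrowing is not.
    return (declared in NUMERIC_RANK and actual in NUMERIC_RANK
            and common_type(declared, actual) == declared)

ENV = {"x": "int", "y": "float", "flag": "bool"}
```

With this environment, `x + 1.5` has type `float`, assigning an `int` literal to `y` is accepted, and assigning a `float` literal to `x` is rejected.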
Intent + Check: more code, better error messages, better performance
Derive: more convenient, more complex checkers, harder to understand for users
Refrigerators Example
Execution Semantics
What does it all mean?
Def: Semantics … via mapping to lower level
OB: Observable Behaviour (Test Cases)
LD → LD-1: Transformation or Interpretation
Dn, Dn+1: Transformation
Extended C Example
LD → LD-1: Transformation. Known Semantics!
LD → LD-1: Transformation. Correct!?
Tests (D), Tests (D-1): Run tests on both levels; all pass. Coverage Problem!
LD → LD-1: Transformation, Tests, Simulators, Documentation
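Running the same test cases on both levels can be sketched as follows. Everything here is hypothetical: a toy arithmetic DSL stands in for LD, generated Python source stands in for LD-1, and `interpret`, `generate` and `run_generated` are illustrative names. The tests compare observable behaviour only, so passing them raises confidence in the transformation but does not prove it correct.

```python
def interpret(expr, env):
    """Reference semantics at level D: evaluate the DSL tree directly."""
    kind = expr[0]
    if kind == "num":
        return expr[1]
    if kind == "var":
        return env[expr[1]]
    if kind == "add":
        return interpret(expr[1], env) + interpret(expr[2], env)
    if kind == "mul":
        return interpret(expr[1], env) * interpret(expr[2], env)
    raise ValueError(f"unknown node: {kind}")

def generate(expr):
    """Transformation to level D-1: emit Python source text."""
    kind = expr[0]
    if kind == "num":
        return repr(expr[1])
    if kind == "var":
        return expr[1]
    if kind == "add":
        return f"({generate(expr[1])} + {generate(expr[2])})"
    if kind == "mul":
        return f"({generate(expr[1])} * {generate(expr[2])})"
    raise ValueError(f"unknown node: {kind}")

def run_generated(expr, env):
    """Execute the generated code with the lower level's known semantics."""
    return eval(generate(expr), {"__builtins__": {}}, dict(env))

# The DSL program 2 * x + 1 and its test cases (input, expected output).
PROGRAM = ("add", ("mul", ("num", 2), ("var", "x")), ("num", 1))
TEST_CASES = [({"x": 0}, 1), ({"x": 3}, 7), ({"x": -2}, -3)]

def all_tests_pass():
    # Observable behaviour must agree on both levels for every test case.
    return all(
        interpret(PROGRAM, env) == expected == run_generated(PROGRAM, env)
        for env, expected in TEST_CASES
    )
```

Three test cases say nothing about inputs that were not tried, which is the coverage problem noted above.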
Transformation: Multi-Stage
L3 → L2 → L1 → L0
Modularization
Multi-Stage: Reuse
Reusing Later Stages: L5 reuses the later stages of the L3 → L2 → L1 → L0 chain
Optimizations!
Extended C Example: Robot Control → Components → State Machine → C (MPS tree) → C Text
C Text: Syntactic Correctness, Headers
C (MPS tree): C Type System
State Machine: Consistency, Model Checking
Components: Efficient Mappings
Multi-Stage: Reuse
Reusing Early Stages: L3 → L2, then L1 → L0 or, alternatively, L1b → L0b
Portability
Extended C Example: Java, C#
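Reusing later stages of a multi-stage transformation can be sketched with plain function composition. The stage names follow the L-numbering above, but the languages themselves are invented for this illustration: `l2_to_l1` lowers a hypothetical `repeat` construct, `l5_to_l1` lowers a hypothetical settings language, and both share the final `l1_to_l0` text generation stage.

```python
def l1_to_l0(statements):
    """Last stage: render a flat list of statements as text."""
    return "\n".join(f"{s};" for s in statements)

def l2_to_l1(program):
    """L2 adds a 'repeat' construct; lower it by unrolling the body."""
    flat = []
    for node in program:
        if isinstance(node, tuple) and node[0] == "repeat":
            _, times, body = node
            flat.extend(body * times)
        else:
            flat.append(node)
    return flat

def l5_to_l1(pairs):
    """L5 is a different language (name/value settings) mapped to L1."""
    return [f"{name} = {value}" for name, value in pairs]

def compile_l2(program):
    # L2 -> L1 -> L0
    return l1_to_l0(l2_to_l1(program))

def compile_l5(pairs):
    # L5 -> L1 -> L0: the last stage is reused unchanged.
    return l1_to_l0(l5_to_l1(pairs))
```

Any improvement to `l1_to_l0` benefits both source languages, and a second back end (the L1b/L0b case) would amount to swapping in a different final stage.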
Multi-Stage: Preprocess
Adding an optional, modular
emergency stop feature
Reduced Expressiveness
bad? maybe.
good? maybe!
Model Checking SMT Solving
Exhaustive Search, Proof!
Language Modularity
Language Modularity, Composition and Reuse (LMR&C)
increase efficiency of DSL development
4 ways of composition: Referencing, Reuse, Extension, Embedding
distinguished regarding dependencies and fragment structure
Dependencies: do we have to know about the reuse when designing the languages?
Fragment Structure: homogeneous vs. heterogeneous ("mixing languages")
Dependencies & Fragment Structure:
Referencing
Dependent, No containment
Used in Viewpoints
Fragments reference other fragments
Refrigerators Example
Extension
Dependent, Containment
more specialized domains, more specialized languages
Dn, Dn+1
Good for bottom-up (inductive) domains, and for use by technical DSLs (people)
Drawbacks: tightly bound to base; potentially hard to analyze the combined program
Extended C Example
Reuse
Independent, No containment
Often the referenced language is built expecting it will be reused.
Hooks may be added.
Embedding
Independent, Containment
Pension Plans Example
Embedding often uses Extension to extend the embedded language to adapt it to its new context.
Challenges - Syntax
Extension and Embedding require modular concrete syntax
Many tools/formalisms cannot do that
Challenges - Type Systems
Extension: the type system of the base language must be designed to be extensible/overridable
Reuse and Embedding: rules that affect the interplay can reside in the adapter language.
Challenges - Trafo & Gen
Referencing (I): Two separate, dependent single-source transformations. One can be reused, the other is written specifically for the combination.
Referencing (II): A single multi-sourced transformation
Referencing (III): A preprocessing transformation that changes the referenced fragment in a way specified by the referencing fragment
Extension: Transformation by assimilation, i.e. generating code in the host language from code expressed in the extension language. (Extended C Example)
Reuse (I): Reuse of existing transformations for both fragments plus generation of adapter code
Reuse (II): composing transformations
Reuse (III): generating separate artifacts plus a weaving specification
Embedding (I): Assimilation (as with Extension). A purely embeddable language may not come with a generator.
Embedding (II): The adapter language can coordinate the transformations for the host and for the embedded languages.
Formal Verification
3
Can language engineering increase the adoption of
formal verification?
Can we make formal verification
more usable and agile?
Our goal: formal verification
for everyone
Challenges with using formal analyses
1) Writing the formal model
2) Specifying the property to be verified
3) Interpreting the analysis results
Addressing the challenges
1) Wrap the language of the analysis tool into higher-level languages
2) Define out-of-the-box analysis goals that can be automated
3) Lift the analysis results to the abstraction level of the domain
Challenges with Building Formal Analysis Tools
"[...] model construction problem: the semantic gap between the artifacts produced by software developers and those accepted by current verification tools. [...]
In order to use a verification tool on a real program, the developer must extract an abstract mathematical model of the program's salient properties and specify this model in the input language of the verification tool. This process is both error-prone and time-consuming."
ICSE 2000, Corbett et al., Bandera ...
Addressing the challenges
Define sub-languages that are easier to analyze and embed them in more expressive languages.
Allow developers to write programs directly in a sub-language that is easier to analyze.
– Analyses are simpler to define
– The degree of automation grows / analyses are (computationally) more feasible
– The results of analyses can be presented in a more adequate form
Adequate Languages Make Life Simpler
Today's state of practice: "Write some program (e.g. in C) and then try (very hard) to analyze it."
Today's state of the art: "Write programs that can be analyzed"
The mbeddr approach: "… by using adequate language (fragments)"
Allow Software Developers to Make Informed Decisions ...
Either write a sub-system in a restricted (but verifiable) language or use the full power of a GPL
Get immediate feedback on whether you are (not) in the verifiable subset
Characteristics of Analyzable Languages
• High modularisation and encapsulation – Small and well-defined interfaces
• Clean up or restrict "problematic" features – Access to global state, side effects, etc.
• Raise the level of abstraction – Be able to leave out unnecessary details
• Eliminate the "accidental complexity" – Be able to directly express what we want without any "encoding"
Code-based, Model-based and DSL-based Analyses
Formal analysis tools, e.g. model checkers, SMT solvers
Code-based: GPL code → program abstraction → formal analysis tools
Challenges: program abstraction; identification of invariants; fighting accidental complexity
Model-based: abstract models, e.g. Statecharts → generation → GPL code
Challenges: integrate with existing systems; for many tasks the models are not expressive enough
DSL-based: clean, easy to analyze DSLs (DSL1, DSL2, DSL3, ...) → m2m transformation, code generation → C code
Challenges: find adequate language fragments and corresponding analyses
mbeddr Agile Formal Analyses © Daniel Ratiu
Paradigm Change
… for analysis users: decide which parts of programs will be analyzed and use adequate language fragments that allow analysis
... for analysis developers: it is easier to extend/restrict languages than to extend analyses to deal with the intricacies of all language features
Verifying State Machines
Model Checking
Out-of-the-box verification conditions: check the sanity of the code
Unreachable States, Dead End States, Live States
Transitions: Nondeterminism, Dead Transitions
Variables out-of-bounds
User-Defined Properties
Define business-domain-specific verification conditions
… that restrict a certain basic property: P, not P, S responds to P
Have a (temporal) scope: Global, Before R, After Q, Between Q and R, After Q Until R
Extended C Example
Restricted State Machines: "float" variables are not allowed
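The idea behind checking "variables out-of-bounds" and a global "not P" property can be sketched with an explicit enumeration of all reachable configurations. This is a simplified stand-in for a real model checker, and all names (`explore`, `make_step`, `out_of_bounds`, `holds_globally_not`) and the counter machine are assumptions made for the example. The enumeration terminates only because the machine uses a small integer range, so the set of configurations is finite.

```python
def explore(initial, events, step):
    """Exhaustively enumerate all configurations reachable from initial."""
    seen, todo = {initial}, [initial]
    while todo:
        config = todo.pop()
        for event in events:
            nxt = step(config, event)
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen

def make_step(limit):
    """Hypothetical machine: counts ticks while 'on', resets at the limit."""
    def step(config, event):
        state, count = config
        if state == "off" and event == "start":
            return ("on", 0)
        if state == "on" and event == "tick":
            return ("on", count + 1) if count < limit else ("off", 0)
        return config
    return step

def out_of_bounds(configs, low, high):
    """Configurations whose counter leaves the declared range."""
    return sorted(c for c in configs if not low <= c[1] <= high)

def holds_globally_not(configs, p):
    """User-defined property 'not P' with scope Global."""
    return not any(p(c) for c in configs)

EVENTS = ("start", "tick")
ok = explore(("off", 0), EVENTS, make_step(3))
bad = explore(("off", 0), EVENTS, make_step(4))
```

With a declared counter range of 0..3, the first machine stays within bounds, while the second (with an off-by-one guard) reaches `("on", 4)`, which the search reports as a concrete counterexample.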
Verifying Decision Tables
Extended C Example
Decision Tables
Completeness: did we cover all cases?
Consistency: are there overlapping cases?
Easy to answer using an SMT solver
SMT = Satisfiability Modulo Theories: an extension of boolean satisfiability with additional theories like linear arithmetic
Language restriction: non-linear expressions are not allowed
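The two questions, completeness and consistency, can be made concrete with a small sketch. An SMT solver answers them symbolically for all inputs; the code below substitutes brute-force enumeration over a small finite domain, which is an assumption made to keep the example self-contained. The function `analyze_decision_table`, the row labels and the `speed` variable are hypothetical.

```python
from itertools import combinations, product

def analyze_decision_table(rows, domains):
    """Check a decision table over a finite input domain.

    rows: list of (label, predicate); domains: dict of variable -> values.
    Returns (uncovered inputs, overlapping (label, label, input) triples).
    """
    names = sorted(domains)
    uncovered, overlaps = [], []
    for values in product(*(domains[n] for n in names)):
        point = dict(zip(names, values))
        matching = [label for label, pred in rows if pred(point)]
        if not matching:
            # Completeness violation: no row handles this input.
            uncovered.append(point)
        for a, b in combinations(matching, 2):
            # Consistency violation: two rows claim the same input.
            overlaps.append((a, b, point))
    return uncovered, overlaps

ROWS = [
    ("slow", lambda p: p["speed"] < 30),
    ("medium", lambda p: 30 <= p["speed"] <= 60),
    ("fast", lambda p: p["speed"] > 50),
]
uncovered, overlaps = analyze_decision_table(ROWS, {"speed": range(0, 101)})
```

This example table is complete but inconsistent: the rows `medium` and `fast` both match speeds 51 to 60. Enumeration scales poorly with the number and range of variables, which is where a solver with a theory of linear arithmetic pays off.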
New Challenges
Identify language fragments relevant to developers that can be easily analyzed
Ensure equivalence with the translation to C or, at least, increase the confidence that they are "close enough" for a certain analysis goal
The End.
Most of this material is part of Markus' upcoming (early 2013) book DSL Engineering. Stay in touch, it may become a free eBook.
www.voelter.de
voelterblog.blogspot.de
@markusvoelter +Markus Voelter