Monitoring Extended Regular Expressions

Grigore Rosu

University of Illinois at Urbana-Champaign, USA

Joint work with Mahesh Viswanathan and Koushik Sen

Increasing Software Reliability

Current solutions
– Human review of code and testing
  • Most used in practice
  • Usually ad hoc, with intensive human support
– (Advanced) Static analysis
  • Often scales up
  • False positives and negatives, annotations
– (Traditional) Formal methods
  • Model checking and theorem proving
  • General, good confidence, do not always scale up

Runtime Verification and Monitoring

Idea: Let the system run and observe its execution trace. If the trace violates, or appears to violate, the requirements, then report an error or guide the program to avoid or to hit the error.

Runtime Verification and Monitoring

PathExplorer – developed jointly with Havelund
– Used on 70,000 lines of C++ code (K9 Rover)
– Found a deadlock in ~10 seconds
– Confirmed a data race suspicion

Runtime Verification Workshop
– ’01 France (CAV), ’02 Denmark (CAV), ’03 USA (CAV)
– ’04 Spain (ETAPS), …

PathExplorer - Overview

[Diagram: Running program → (socket) → Events → Observer]

(Joint work with Klaus Havelund of NASA Ames)

PathExplorer – the Observer

[Diagram: the event stream enters a dispatcher, which feeds the datarace, deadlock, temporal and ERE modules; each module emits warnings. Group labels: Predictive Analysis, Specification-Based Monitoring.]

paxmodules
  module datarace = ‘java pax.Datarace’;
  module deadlock = ‘java pax.Deadlock’;
  module temporal = ‘java pax.Temporal spec’;
  module ERE = ‘java pax.Ere spec’;
end

Why (Extended) Regular Expressions?

Ordinary programmers and software engineers understand and use regular expressions
– Perl, Python, etc.

Safety policies are often regular patterns on sequences of states/events:
– (idle* open (read + write)* close)*
– Complementation is needed to say what should not happen: ¬(any* start1 (¬end1)* start2 any*)
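The first policy above can be tried out with an ordinary regular-expression library. The following is a minimal sketch using Python's standard `re` module; the encoding of a trace as space-terminated event names and the helper name `obeys_policy` are illustrative assumptions, not part of the talk. Python's `re` offers no general complement operator, so the second policy cannot be transcribed the same way.

```python
import re

# (idle* open (read + write)* close)* over space-terminated event names.
POLICY = re.compile(r"(?:(?:idle )*open (?:(?:read|write) )*close )*")

def obeys_policy(trace):
    """Return True if the complete trace (a list of event names) matches the policy."""
    text = "".join(event + " " for event in trace)
    return POLICY.fullmatch(text) is not None

if __name__ == "__main__":
    print(obeys_policy(["idle", "open", "read", "write", "close"]))
    print(obeys_policy(["open", "read"]))
```

Note that this checks a whole stored trace at once, which is exactly what a monitor would like to avoid.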

Extended Regular Expressions (ERE)

Regular expressions with complement

  R ::= Φ | ε | A | R + R | R · R | R* | ¬R

Language of an ERE

  L(Φ) = ∅          L(R + R’) = L(R) ∪ L(R’)
  L(ε) = {ε}        L(R · R’) = {ww’ | w ∈ L(R), w’ ∈ L(R’)}
  L(A) = {A}        L(R*) = (L(R))*
                    L(¬R) = Σ* \ L(R)

Intersection: R ∩ R’ := ¬(¬R + ¬R’)

ERE Membership Problem

Given w ∈ Σ* and R, is it the case that w ∈ L(R)?

Patterns in strings; many applications
– Programming languages (Perl, Python)
– Molecular biology (Knight-Myers95)
– Monitoring

Efficient solutions are of great practical interest

From now on, n is the length of the word/trace w and m is the size of the ERE R
– n is typically much, much larger than m

What is known (I)

If R does not contain negations, then
– Transform R into an NFA of size O(m) (Aho’90)
  • Solution in time O(nm) and space O(m)
  • Improved by Myers’92 (JACM): time/space O(nm / log n)
– Transform R into a DFA of size O(2^m) (Aho’90)
  • Solution in time O(nm) and space O(2^m)
  • Note: transitions in a DFA take logarithmic time

Negations and their nesting make the membership problem highly non-trivial

Problems with Negation (I)

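How to complement an NFA?
– Just complementing the set of final states is wrong!

[Figure: an NFA A with a- and b-transitions and L(A) = {ab}; the automaton A’ obtained by swapping final and non-final states has L(A’) = {ab, a, ε}, which still contains ab.]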
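The pitfall can be reproduced in a few lines. The sketch below uses a hypothetical five-state NFA chosen so that its languages agree with the ones on the slide (L(A) = {ab}, and {ab, a, ε} after swapping final states); the exact automaton drawn on the slide is not recoverable from the transcript, so the state numbering and transition table are assumptions.

```python
def nfa_accepts(delta, start, finals, word):
    """Simulate an NFA by tracking the set of states reachable so far."""
    current = {start}
    for symbol in word:
        current = {t for s in current for t in delta.get((s, symbol), ())}
    return bool(current & finals)

# Hypothetical NFA: two a-branches out of state 0, only one leads to a final state.
STATES = {0, 1, 2, 3, 4}
DELTA = {(0, "a"): {1, 2}, (1, "b"): {3}, (2, "b"): {4}}
FINALS = {3}
FLIPPED = STATES - FINALS          # naive "complement": swap final and non-final states

def naive_complement_accepts(word):
    return nfa_accepts(DELTA, 0, FLIPPED, word)

def true_complement_accepts(word):
    # Negating the subset-tracking simulation amounts to complementing
    # the determinized automaton, which is the safe construction.
    return not nfa_accepts(DELTA, 0, FINALS, word)

if __name__ == "__main__":
    for w in ["", "a", "b", "ab"]:
        print(repr(w), naive_complement_accepts(w), true_complement_accepts(w))
```

The naive version still accepts ab (through the branch that ends in state 4) and rejects b, so it is not the complement of L(A).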

Problems with Negation (II)

DFAs can be complemented safely by just complementing the set of final states, but
– NFA -> DFA implies exponential state blowup!
– For k nested negations, 2^(2^(…(2^m)…)) states (k nested exponentials)
– This makes the membership problem non-elementary in the presence of (nested) negations

What is known (II)

Dynamic programming algorithm (Hopcroft-Ullman ’79)
– Time O(n^3 m) and space O(n^2 m)

Special synchronized alternating automata
– (Yamamoto ’02): intersection but not negation
– (Kupferman-Zuhovitzky ’02): general ERE
– Time O(n^2 m) and space O(nm + kn^2), where k is the number of negations and intersections

Algorithms above store the word; this is unacceptable in many practical situations
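To make the "stores the word" objection concrete, here is a minimal sketch in the spirit of the dynamic programming approach: a memoized table indexed by (subexpression, i, j) that records whether w[i:j] belongs to the language of the subexpression. The tuple encoding of EREs and all function names are assumptions made for this illustration; it is not the algorithm as published.

```python
from functools import lru_cache

EMPTY, EPS = ("empty",), ("eps",)
def ev(a): return ("ev", a)
def alt(r, s): return ("alt", r, s)
def cat(r, s): return ("cat", r, s)
def star(r): return ("star", r)
def neg(r): return ("not", r)
def inter(r, s): return neg(alt(neg(r), neg(s)))   # R ∩ R' := ¬(¬R + ¬R')

def member(r, word):
    w = tuple(word)                  # the whole word must be kept in memory

    @lru_cache(maxsize=None)         # the table has O(n^2 m) entries
    def m(r, i, j):
        tag = r[0]
        if tag == "empty":
            return False
        if tag == "eps":
            return i == j
        if tag == "ev":
            return j == i + 1 and w[i] == r[1]
        if tag == "alt":
            return m(r[1], i, j) or m(r[2], i, j)
        if tag == "cat":             # try every split point
            return any(m(r[1], i, k) and m(r[2], k, j) for k in range(i, j + 1))
        if tag == "star":            # first factor is non-empty, so recursion terminates
            return i == j or any(m(r[1], i, k) and m(r, k, j)
                                 for k in range(i + 1, j + 1))
        return not m(r[1], i, j)     # tag == "not": complement w.r.t. all words

    return m(r, 0, len(w))

if __name__ == "__main__":
    any_ev = alt(ev("a"), ev("b"))
    has_a = cat(star(any_ev), cat(ev("a"), star(any_ev)))
    has_b = cat(star(any_ev), cat(ev("b"), star(any_ev)))
    print(member(inter(has_a, neg(has_b)), "aaa"))
```

Negation is trivial here because each table entry is a plain Boolean; the price is that nothing can be decided until the complete word is available.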

Desired Behavior - Monitoring

[Diagram: Running program → (socket) → Events → Observer]

Algorithms that process and then discard each event are desired in practice, since words or execution traces can be extremely long

Challenges and Talk Overview

What is the lower space/time bound of the ERE monitoring problem (to process one event)?
– Ω(2^(c√m)) for space

What is a reasonable upper bound for the ERE monitoring problem (to process one event)?

– Rewriting algorithm in O(22m2) space/time

How to generate optimal monitors for ERE?
– Optimal monitor generation by coinduction

Lower Bound for ERE Monitoring (I)

Consider the language
(Chandra-Kozen-Stockmeyer81 in alternation)
(Kupferman-Vardi98 in model checking)

Lk = {u # w # u’ $ w | w ∈ {0,1}^k and u, u’ ∈ {0,1,#}*}

We show that
• There is an ERE Rk of size Θ(k^2) with L(Rk) = Lk
• Any monitoring algorithm for Lk needs Ω(2^k) space

Since m = |Rk| = Θ(k^2) gives k = Θ(√m), we can conclude that the space lower bound for ERE monitoring is Ω(2^(c√m))

Lower Bound for ERE Monitoring (II)

Lk = {u # w # u’ $ w | w ∈ {0,1}^k and u, u’ ∈ {0,1,#}*}

Note that the size of Rk is Θ(k^2) and L(Rk) = Lk

Rk = ??? (¬$)* $ (¬$)*  ∩
     ??? (0+1+#)* # ???
     ⋂ (i = 0 .. k) [ (0+1)^i 0 (0+1)^(k-i-1) # (0+1+#)* $ (0+1)^i 0 (0+1)^(k-i-1)
                    + (0+1)^i 1 (0+1)^(k-i-1) # (0+1+#)* $ (0+1)^i 1 (0+1)^(k-i-1) ]

Callouts:
– There should be exactly one $ symbol, and …
– There should be some sequence of 0, 1, #, followed by a # and then by a w …
– Each letter in w should appear after $ at exactly the same position …

Lower Bound for ERE Monitoring (III)

Lk = {u # w # u’ $ w | w ∈ {0,1}^k and u, u’ ∈ {0,1,#}*}

• Let A be a monitor for Lk
• When A reads symbol $, it should “remember” exactly those w that have been seen so far
• There are 2^(2^k) possible distinct situations to remember (one per set of k-bit words), so A needs at least 2^k bits of memory to encode each of these situations
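The counting argument can be illustrated with a straightforward event-by-event monitor for Lk. This is a sketch written for this transcript, not code from the talk; the class name `LkMonitor` and its fields are assumptions. Its `seen` set is the "situation" the slide refers to: any subset of {0,1}^k can occur, hence 2^(2^k) situations and about 2^k bits of state.

```python
class LkMonitor:
    """Streaming monitor for Lk = {u # w # u' $ w | w in {0,1}^k, u, u' in {0,1,#}*}."""

    def __init__(self, k):
        self.k = k
        self.seen = set()     # blocks w in {0,1}^k that had '#' on both sides
        self.block = None     # symbols since the last '#'; None before the first '#'
        self.suffix = None    # symbols after '$'; None until '$' is read
        self.dead = False     # True once no extension of the trace can be in Lk

    def step(self, c):
        if self.dead:
            return
        if self.suffix is not None:             # after '$': at most k bits allowed
            if c in "01" and len(self.suffix) < self.k:
                self.suffix += c
            else:
                self.dead = True
        elif c == "#":
            if self.block is not None and len(self.block) == self.k:
                self.seen.add(self.block)       # block closed by '#' on both sides
            self.block = ""
        elif c == "$":
            self.suffix = ""
        elif c in "01":
            # Keep at most k+1 symbols: longer blocks can never count as a w.
            if self.block is not None and len(self.block) <= self.k:
                self.block += c
        else:
            self.dead = True

    def accepts(self):
        return not self.dead and self.suffix in self.seen


def run(k, trace):
    monitor = LkMonitor(k)
    for c in trace:
        monitor.step(c)
    return monitor.accepts()

if __name__ == "__main__":
    print(run(2, "01#10#0$10"), run(2, "01#10#0$01"))
```

Everything except `seen` is bounded by O(k) symbols, so the set of remembered blocks is what dominates the memory, in line with the lower-bound argument.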

Idea of an Event-Consuming Algorithm

“Consume” each event as it arrives, generating a new ERE monitoring requirement

Use the notion of derivative
– R{a} is the ERE that should hold after seeing event a, in order for R to hold now
– Algorithm A stores an ERE R, and when an event a arrives it replaces R by R{a}; at the end of the trace, A checks whether ε ∈ R
– How can we generate R{a} efficiently?
– How can we store R{a} compactly?

ERE Syntax

Sorts Ere and Event; subsort Event < Ere

Operations
  Φ   : -> Ere
  ε   : -> Ere
  _+_ : Ere Ere -> Ere [assoc comm id: empty]
  _ _ : Ere Ere -> Ere [assoc id: nil]
  _*  : Ere -> Ere
  ¬_  : Ere -> Ere

Derivatives

Operations
  _{_}  : Ere Event -> Ere
  _?_:_ : Bool Ere Ere -> Ere
  ε ∈ _ : Ere -> Bool

Equations
  (R1 + R2){a} = R1{a} + R2{a}
  (R1 R2){a}   = R1{a} R2 + (ε ∈ R1) ? R2{a} : Φ
  (R*){a}      = R{a} R*
  (¬R){a}      = ¬(R{a})
  ε{a}         = Φ
  Φ{a}         = Φ
  b{a}         = (b == a) ? ε : Φ

Obvious!

• Related work:
  • Antimirov and Mosses
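The derivative equations translate almost line by line into executable form. The sketch below is an illustration under assumptions: EREs are encoded as nested Python tuples, `nullable` plays the role of ε ∈ _, and `monitor` is the event-consuming loop described earlier. No simplification is applied, so the stored expression keeps growing with the trace.

```python
EMPTY, EPS = ("empty",), ("eps",)          # Φ and ε
def ev(a): return ("ev", a)
def alt(r, s): return ("alt", r, s)        # R + R'
def cat(r, s): return ("cat", r, s)        # R R'
def star(r): return ("star", r)
def neg(r): return ("not", r)

def nullable(r):
    """ε ∈ R, computed structurally."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "ev"):
        return False
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    if tag == "cat":
        return nullable(r[1]) and nullable(r[2])
    return not nullable(r[1])              # tag == "not"

def deriv(r, a):
    """R{a}: what must hold after event a for R to hold now."""
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "ev":
        return EPS if r[1] == a else EMPTY
    if tag == "alt":
        return alt(deriv(r[1], a), deriv(r[2], a))
    if tag == "cat":                       # R1{a} R2 + (ε ∈ R1 ? R2{a} : Φ)
        tail = deriv(r[2], a) if nullable(r[1]) else EMPTY
        return alt(cat(deriv(r[1], a), r[2]), tail)
    if tag == "star":
        return cat(deriv(r[1], a), r)
    return neg(deriv(r[1], a))             # tag == "not"

def monitor(r, trace):
    """Consume events one at a time; the trace itself is never stored."""
    for a in trace:
        r = deriv(r, a)
    return nullable(r)

if __name__ == "__main__":
    rw = star(alt(ev("read"), ev("write")))
    policy = star(cat(star(ev("idle")), cat(ev("open"), cat(rw, ev("close")))))
    print(monitor(policy, ["idle", "open", "read", "close"]))
    print(monitor(neg(policy), ["open"]))
```

Complement costs nothing extra in this style: the derivative of ¬R is just ¬ of the derivative, and the final check flips the Boolean.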

Three Important Simplifying Rules

Without any other rules, R{a1}{a2}…{an} can grow to unbounded size

Simplifying rules

  Φ R = Φ
  R + R = R
  R1 R + R2 R = (R1 + R2) R

Let R be the rewriting system defined so far
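One way to experiment with the simplifying rules is to build them into "smart constructors", so that every derivative is normalized as it is created. The sketch below is an assumption-laden illustration rather than the rewriting system of the talk: sums are kept as sets (which gives associativity, commutativity and R + R = R), Φ and ε are treated as identities of + and concatenation, and concatenation is not normalized modulo associativity.

```python
EMPTY, EPS = ("empty",), ("eps",)
def ev(a): return ("ev", a)
def star(r): return ("star", r)
def neg(r): return ("not", r)

def cat(r, s):
    if r == EMPTY or s == EMPTY:
        return EMPTY                          # Φ R = Φ (R Φ = Φ is also sound)
    if r == EPS:
        return s                              # ε is the identity of concatenation
    if s == EPS:
        return r
    return ("cat", r, s)

def alt(*rs):
    items = set()
    for r in rs:                              # flatten nested sums, drop Φ, dedupe
        if r[0] == "alt":
            items |= r[1]
        elif r != EMPTY:
            items.add(r)
    by_right, rest = {}, set()
    for r in items:                           # R1 R + R2 R = (R1 + R2) R
        if r[0] == "cat":
            by_right.setdefault(r[2], []).append(r[1])
        else:
            rest.add(r)
    for right, lefts in by_right.items():
        rest.add(cat(alt(*lefts), right))
    if not rest:
        return EMPTY
    if len(rest) == 1:
        return next(iter(rest))
    return ("alt", frozenset(rest))

def nullable(r):
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "ev"):
        return False
    if tag == "alt":
        return any(nullable(x) for x in r[1])
    if tag == "cat":
        return nullable(r[1]) and nullable(r[2])
    return not nullable(r[1])

def deriv(r, a):
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "ev":
        return EPS if r[1] == a else EMPTY
    if tag == "alt":
        return alt(*(deriv(x, a) for x in r[1]))
    if tag == "cat":
        tail = deriv(r[2], a) if nullable(r[1]) else EMPTY
        return alt(cat(deriv(r[1], a), r[2]), tail)
    if tag == "star":
        return cat(deriv(r[1], a), r)
    return neg(deriv(r[1], a))

def reachable(r, alphabet):
    """All normalized EREs obtainable from r by consuming events."""
    seen, todo = {r}, [r]
    while todo:
        cur = todo.pop()
        for a in alphabet:
            nxt = deriv(cur, a)
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen

if __name__ == "__main__":
    s = star(alt(ev("a"), ev("b")))
    t = cat(s, cat(ev("b"), s))               # "some b occurs"
    print(len(reachable(neg(t), "ab")))
```

With the rules in place, repeatedly deriving (a + b)* returns the very same term, and the expression for "no b occurs" only ever takes two distinct shapes, however long the trace is.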

Theorems (RTA’03)

R is terminating and ground Church-Rosser modulo AC of _+_ and A of _ _

L(nf_AC(R{a})) = {w | aw ∈ L(R)} for all EREs R

a1a2…an ∈ L(R) iff ε ∈ R{a1}{a2}…{an}

R{a1}{a2}…{an} requires O(22m2) space and

O(n22m2) time, where m = |R|

Problems …

Previous algorithm is not synchronous!
– Unless we check for emptiness after processing each event, which is very expensive

How to generate a minimal monitor for ERE avoiding the highly exponential state explosion?

Solution: Circular Coinduction
– Related work by Rutten: no negation

Hidden Logic: Behavioral Specification

Behavioral specification
– Tuple (V, H, Γ, Σ, E), or simply (Γ, Σ, E)
– Sorts S = V ∪ H
  • V = visible sorts (stand for data: integers, reals, chars, etc.)
  • H = hidden sorts (stand for states, objects, black boxes, etc.)
– Operations Γ ⊆ Σ
  • Σ is an S-signature
  • Γ is a subsignature of Σ of behavioral operations
– E is a set of Σ-equations

Contexts and Experiments

Γ-context is a Γ-term with a hidden “slot”
Γ-experiment is a Γ-context of visible result

[Figure: a term built from operations in Γ over a hidden slot z : h; it is a Γ-experiment if its result is visible.]

Behavioral Equivalence

Models are called hidden Σ-algebras: A, A’, …

Behavioral equivalence on A: a ≡ a’
– Identity on visible carriers
– a ≡h a’ iff Aξ(a) = Aξ(a’) for any Γ-experiment ξ

[Figure: a and a’ are plugged into the same Γ-experiment; the visible results Aξ(a) and Aξ(a’) must be equal.]

Behavioral Satisfaction

Let (∀X) t = t’ be a Σ-equation and A a hidden Σ-algebra

A behaviorally satisfies (∀X) t = t’, written A |≡ (∀X) t = t’, iff θ(t) ≡h θ(t’) for any map θ : X → A

The same notation is used with specifications: A |≡ (Γ, Σ, E) and B |≡ (∀X) t = t’

Proving Behavioral Equivalence

Behavioral satisfaction is known to be Π⁰₂-hard, so
– No way to automatically prove any truth
– No way to automatically disprove any falsity
– Hidden logics are incomplete

Coinduction and context induction are very strong
– Both require human support

Circular coinduction is an automatic procedure
– Tuned and tested on hundreds of examples
  • Streams, protocols (ABP), Peterson’s mutual exclusion, etc.
– Supported by BOBJ, prototyped in Maude

Circular Coinduction in a Nutshell

“Derive” the original proof goal until we end up in circles

[Figure: a proof graph rooted at the goal ▲ = ♥. Deriving along the operations a, m1, m2 yields visible equalities (5 = 5, 9 = 9, 0 = 0) and hidden goals (☺ = ☼, ♣ = ►) that eventually repeat.]

Modulo substitutions, “special” contexts and equational reasoning

Moreover, all the behavioral equalities on the proof graph are true: lemma discovery!

“Explanation?”
(1) All possibilities to distinguish the two are exhaustively explored
(2) Any experiment can be “consumed” bottom-up, ending in a “visible” node
(3) A congruent binary relation R is built; but behavioral equivalence is the largest!
(4) Context induction: nodes above form the “induction hypothesis”

zip(zero, one) = blink

Cobasis {h, t}

zip(zero, one) = blink
  h: 0 = 0
  t: zip(one, zero) = t(blink)
    h: 1 = 1
    t: zip(zero, one) = blink

zip(zero, one) = blink

Cobasis {h, ht, tt}

zip(zero, one) = blink
  h:  0 = 0
  ht: 1 = 1
  tt: zip(zero, one) = blink

zip(odd(S), even(S)) = S

Cobasis {h, t}

zip(odd(S), even(S)) = S
  h: h(S) = h(S)
  t: zip(even(S), even(t(S))) = t(S)
    h: h(t(S)) = h(t(S))
    t: zip(even(t(S)), even(t(t(S)))) = t(t(S))

zip(odd(S), even(S)) = S

Cobasis {h, odd, even}

zip(odd(S), even(S)) = S
  h:    h(S) = h(S)
  odd:  odd(S) = odd(S)
  even: even(S) = even(S)

One can prove by {h,t}-circular coinduction that
  odd(zip(S, S’)) = S
  even(zip(S, S’)) = S’
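The stream identities used in these examples can be sanity-checked on finite prefixes with lazy iterators. This is only a test on the first few elements, not a proof (that is what circular coinduction provides), and the helper names as well as the reading of odd and even as "elements at positions 1, 3, 5, …" and "2, 4, 6, …" are assumptions inferred from the derivation steps shown.

```python
from itertools import count, cycle, islice, repeat

def zip_s(s, t):
    """Interleave two infinite streams: s0, t0, s1, t1, ..."""
    for x, y in zip(s, t):
        yield x
        yield y

def odd(s):
    """Elements at positions 1, 3, 5, ... (counting from 1)."""
    return islice(s, 0, None, 2)

def even(s):
    """Elements at positions 2, 4, 6, ... (counting from 1)."""
    return islice(s, 1, None, 2)

def prefix(s, n=20):
    return list(islice(s, n))

if __name__ == "__main__":
    # zero = 0 0 0 ..., one = 1 1 1 ..., blink = 0 1 0 1 ...
    print(prefix(zip_s(repeat(0), repeat(1)), 6))
    print(prefix(zip_s(odd(count()), even(count())), 6))
```

Each call to `count()` creates a fresh copy of the stream 0, 1, 2, …, so `odd` and `even` do not compete for elements of a shared iterator.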

Behavioral Specification of EREs

B = (V, H, Γ, Σ, E) where
– V contains Event and Bool
– H contains Ere
– Σ contains Φ, ε, _+_, _ _, _*, ¬_
– E contains all equations defined before
– Γ contains
    ε ∈ _ : Ere -> Bool
    _{_}  : Ere Event -> Ere

Theorem: B behaviorally satisfies R = R’ iff L(R) = L(R’)

(a + b)* = (a* b*)*

Proof graph, deriving each goal with ε ∈ _, _{a} and _{b}:

(a + b)* = (a* b*)*
  ε ∈ _: true = true
  _{a}:  (a + b)* = a* b* (a* b*)*
  _{b}:  (a + b)* = b* (a* b*)*

(a + b)* = a* b* (a* b*)*
  ε ∈ _: true = true
  _{a}:  (a + b)* = a* b* (a* b*)*
  _{b}:  (a + b)* = b* (a* b*)*

(a + b)* = b* (a* b*)*
  ε ∈ _: true = true
  _{a}:  (a + b)* = a* b* (a* b*)*
  _{b}:  (a + b)* = b* (a* b*)*

Moreover, all the equivalences in the proof graph are true!

Theorem: Circular Coinduction is a decision procedure for ERE language equality
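For EREs, the "derive until the goals repeat" procedure can be sketched as a search over pairs of expressions: compare the two sides with the experiment ε ∈ _, then derive both by every event and stop on pairs already encountered. The code below is a simplified illustration, not the BOBJ or Maude implementation; the tuple encoding, the function names, and the normalization (sums as sets, Φ and ε as identities) are assumptions. The normalization is what makes goals repeat.

```python
EMPTY, EPS = ("empty",), ("eps",)
def ev(a): return ("ev", a)
def star(r): return ("star", r)
def neg(r): return ("not", r)

def cat(r, s):
    if r == EMPTY or s == EMPTY: return EMPTY
    if r == EPS: return s
    if s == EPS: return r
    return ("cat", r, s)

def alt(*rs):
    items = set()
    for r in rs:                        # a set gives assoc, comm and R + R = R
        if r[0] == "alt": items |= r[1]
        elif r != EMPTY: items.add(r)
    if not items: return EMPTY
    if len(items) == 1: return next(iter(items))
    return ("alt", frozenset(items))

def nullable(r):                        # the experiment ε ∈ _
    tag = r[0]
    if tag in ("eps", "star"): return True
    if tag in ("empty", "ev"): return False
    if tag == "alt": return any(nullable(x) for x in r[1])
    if tag == "cat": return nullable(r[1]) and nullable(r[2])
    return not nullable(r[1])

def deriv(r, a):                        # the behavioral operation _{a}
    tag = r[0]
    if tag in ("empty", "eps"): return EMPTY
    if tag == "ev": return EPS if r[1] == a else EMPTY
    if tag == "alt": return alt(*(deriv(x, a) for x in r[1]))
    if tag == "cat":
        tail = deriv(r[2], a) if nullable(r[1]) else EMPTY
        return alt(cat(deriv(r[1], a), r[2]), tail)
    if tag == "star": return cat(deriv(r[1], a), r)
    return neg(deriv(r[1], a))

def equivalent(r, s, alphabet):
    """Return the set of circularities if L(r) = L(s), otherwise None."""
    circ, todo = set(), [(r, s)]
    while todo:
        p, q = todo.pop()
        if p == q or (p, q) in circ:
            continue                    # trivial goal, or a circle was closed
        if nullable(p) != nullable(q):
            return None                 # a visible experiment tells them apart
        circ.add((p, q))
        todo.extend((deriv(p, a), deriv(q, a)) for a in alphabet)
    return circ

if __name__ == "__main__":
    a, b = ev("a"), ev("b")
    lhs = star(alt(a, b))
    rhs = star(cat(star(a), star(b)))
    print(len(equivalent(lhs, rhs, "ab")))
```

On the example from the slide, the returned set contains exactly the three goals of the proof graph, which matches the remark that every equality met along the way is itself true.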

Generating Minimal DFAs for EREs

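[Figure: a DFA under construction. From R, transitions a and b lead to R{a} and R{b}; further derivatives R’, R’’, R’{a}, … are added as states, and each new ERE is compared with the existing ones (“equivalent?”).]

(1) Maintain a set C of pairs of equivalent EREs
(2) Check each new ERE for equivalence with already existing EREs in the DFA
  • First in C
  • Then by CC. If an equivalent ERE is found, then add the new circularities to C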
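Steps (1) and (2) can be put together in a short sketch: explore the derivatives of R, and before adding a new state, test it for equivalence against the states already present, reusing previously found circularities as the set C. As before, this is a hedged illustration with assumed names and a tuple encoding of EREs, not the specialized Maude implementation.

```python
EMPTY, EPS = ("empty",), ("eps",)
def ev(a): return ("ev", a)
def star(r): return ("star", r)
def neg(r): return ("not", r)

def cat(r, s):
    if r == EMPTY or s == EMPTY: return EMPTY
    if r == EPS: return s
    if s == EPS: return r
    return ("cat", r, s)

def alt(*rs):
    items = set()
    for r in rs:
        if r[0] == "alt": items |= r[1]
        elif r != EMPTY: items.add(r)
    if not items: return EMPTY
    if len(items) == 1: return next(iter(items))
    return ("alt", frozenset(items))

def nullable(r):
    tag = r[0]
    if tag in ("eps", "star"): return True
    if tag in ("empty", "ev"): return False
    if tag == "alt": return any(nullable(x) for x in r[1])
    if tag == "cat": return nullable(r[1]) and nullable(r[2])
    return not nullable(r[1])

def deriv(r, a):
    tag = r[0]
    if tag in ("empty", "eps"): return EMPTY
    if tag == "ev": return EPS if r[1] == a else EMPTY
    if tag == "alt": return alt(*(deriv(x, a) for x in r[1]))
    if tag == "cat":
        tail = deriv(r[2], a) if nullable(r[1]) else EMPTY
        return alt(cat(deriv(r[1], a), r[2]), tail)
    if tag == "star": return cat(deriv(r[1], a), r)
    return neg(deriv(r[1], a))

def equivalent(r, s, alphabet, known):
    """Circularities proving L(r) = L(s), or None; pairs in `known` are trusted."""
    circ, todo = set(), [(r, s)]
    while todo:
        p, q = todo.pop()
        if p == q or (p, q) in circ or (p, q) in known:
            continue
        if nullable(p) != nullable(q):
            return None
        circ.add((p, q))
        todo.extend((deriv(p, a), deriv(q, a)) for a in alphabet)
    return circ

def build_dfa(r, alphabet):
    states, delta, known = [r], {}, set()      # `known` is the set C

    def find(t):
        for i, s in enumerate(states):
            circ = equivalent(s, t, alphabet, known)
            if circ is not None:
                known.update(circ)             # add the new circularities to C
                return i
        states.append(t)                       # no equivalent state: new DFA state
        return len(states) - 1

    i = 0
    while i < len(states):
        for a in alphabet:
            delta[(i, a)] = find(deriv(states[i], a))
        i += 1
    accepting = {i for i, s in enumerate(states) if nullable(s)}
    return states, delta, accepting

if __name__ == "__main__":
    a, b = ev("a"), ev("b")
    states, delta, accepting = build_dfa(star(cat(star(a), star(b))), "ab")
    print(len(states), delta, accepting)
```

Because a state is only added when it is inequivalent to every existing one, the resulting states are reachable and pairwise distinguishable, which is what makes the generated DFA minimal.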

Implementation

BOBJ cannot be used because it does not return the set of circularities

Implemented a specialized circular coinduction algorithm in Maude

Web server at http://fsl.cs.uiuc.edu
– A Perl CGI script which calls Maude
– Generates JPEG, PS, and DOT versions of the DFA

Conclusion and Future Work

Exponential complexity unavoidable when negation is added to regular expressions (EREs)

Few rewriting rules provide the best trace membership algorithm known for EREs

Generation of minimal DFAs for EREs by circular coinduction (CC) avoids state explosion

– To be part of PathExplorer at NASA Ames

Behavioral Maude with circular coinduction
Inductive/Coinductive Theorem Prover (ICTP)
Behavioral Rewriting Logic