QEST'12 paper seminar



DESCRIPTION

A quick seminar based on a paper published at QEST'12.

TRANSCRIPT

Page 1: QEST'12 paper seminar

The PMC Problem
Resolving Non-Determinism
Algorithm
Implementation and Results
Conclusions

Statistical Model Checking for Markov Decision Processes

D. Henriques, J. Martins, P. Zuliani, A. Platzer, E. M. Clarke

Computer Science Department, Carnegie Mellon University

June 6, 2012

D. Henriques, J. Martins, P. Zuliani, A. Platzer, E. M. Clarke: Statistical Model Checking for Markov Decision Processes

Page 2: QEST'12 paper seminar


Summary

1 The PMC Problem

2 Resolving Non-Determinism

3 Algorithm

4 Implementation and Results

5 Conclusions


Page 3: QEST'12 paper seminar


Model Checking

Given:

Property ϕ in temporal logic

A transition system M

Does ϕ hold in M, or M |= ϕ?

Example

Is one car always safely behind another, where x1 and x2 are their positions:

G (x1 < x2)
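Such a bounded safety check can be evaluated directly on a finite trace; a minimal Python sketch (the position data is hypothetical, not from the paper):

```python
# Check the bounded safety property G (x1 < x2) on one finite trace.
# Each trace element is a (x1, x2) pair of car positions.

def globally_safe(trace):
    """True iff x1 < x2 holds in every state of the trace."""
    return all(x1 < x2 for x1, x2 in trace)

safe_trace = [(0, 5), (1, 5), (2, 6)]
unsafe_trace = [(0, 5), (4, 5), (5, 5)]  # positions collide at the end

print(globally_safe(safe_trace))    # True
print(globally_safe(unsafe_trace))  # False
```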

State of the art can handle millions of states. Used in the hardware and software industry.


Page 4: QEST'12 paper seminar


Probabilistic Model Checking

Given:

Property ϕ in temporal logic

A probabilistic transition system M

A probability threshold θ

Is the probability that M satisfies ϕ smaller than θ, P≤θ(ϕ)?

Example

Is it very unlikely that cars collide?

P≤0.00001(Fx1 = x2)


Page 5: QEST'12 paper seminar


Probabilistic Model Checking: exact approach

Exact methods: pros

Can currently handle relatively complex scenarios

Handles systems with non-determinism

Mature tools such as PRISM

They are exact...

Exact methods: cons

State explosion problem greatly reduces applicability

Time-consuming

Possibly hard to parallelise (e.g. PRISM)


Page 6: QEST'12 paper seminar


Probabilistic Model Checking: statistical approach

Statistical Model Checking: pros

Can currently handle very complex scenarios

Highly parallelisable

Only requires bounded memory

Comes in two flavours: hypothesis testing and interval estimation

Statistical Model Checking: cons

Not exact (but converges to the correct solution)

Requires a bounded number of “steps”, i.e. bounded properties

Requires fully probabilistic systems

But most interesting systems feature non-determinism!
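The interval-estimation flavour above, for a fully probabilistic system, can be sketched as follows; `satisfies_once` is a hypothetical stand-in for one bounded simulation of the system, and the sample count comes from the standard Chernoff-Hoeffding bound:

```python
import math
import random

def smc_estimate(satisfies_once, eps=0.05, delta=0.01):
    """Estimate P(phi) to within +/- eps with confidence >= 1 - delta.

    satisfies_once() runs one bounded simulation and returns True/False.
    The Chernoff-Hoeffding bound gives the required number of samples:
    n >= ln(2/delta) / (2 * eps^2).
    """
    n = math.ceil(math.log(2 / delta) / (2 * eps ** 2))
    hits = sum(satisfies_once() for _ in range(n))
    return hits / n

# Toy fully probabilistic "system": phi holds with true probability 0.3.
random.seed(0)
p_hat = smc_estimate(lambda: random.random() < 0.3)
print(p_hat)  # an estimate close to 0.3
```

Each sample is independent, which is why the approach parallelises so well.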


Page 7: QEST'12 paper seminar


Summary

1 The PMC Problem

2 Resolving Non-Determinism

3 Algorithm

4 Implementation and Results

5 Conclusions


Page 8: QEST'12 paper seminar


Markov decision processes (MDPs) & schedulers

[Figure: example MDP. From state s, one action branches with probabilities 0.99/0.01, another with 0.5/0.5, and a further edge has probability 1.]

MDP chooses action non-deterministically

Each action has a distribution of target states

Schedulers σ : States → Actions are used to resolve non-determinism

General schedulers ≠ memoryless schedulers for bounded properties
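One lightweight way to represent an MDP and let a memoryless scheduler resolve its non-determinism (the dictionary encoding and state names are illustrative, not the paper's implementation):

```python
import random

# A toy MDP: state -> {action: [(probability, successor_state), ...]}
mdp = {
    "s0": {"a": [(0.99, "s1"), (0.01, "s2")],
           "b": [(0.5, "s1"), (0.5, "s2")]},
    "s1": {"a": [(1.0, "s1")]},
    "s2": {"a": [(1.0, "s2")]},
}

def step(state, scheduler):
    """Resolve non-determinism with the (memoryless) scheduler, then
    sample the chosen action's distribution over successor states."""
    action = scheduler[state]      # depends on the current state only
    r, acc = random.random(), 0.0
    for p, succ in mdp[state][action]:
        acc += p
        if r < acc:
            return succ
    return mdp[state][action][-1][1]   # guard against rounding error

# Fixing a scheduler turns the MDP into a Markov chain we can simulate.
scheduler = {"s0": "a", "s1": "a", "s2": "a"}
random.seed(1)
print(step("s0", scheduler))  # "s1" (the 0.99-probability branch)
```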


Page 9: QEST'12 paper seminar


Probabilistic Model Checking: resolving non-determinism

Property P≤θ(ϕ) actually asks: is the probability that model M satisfies property ϕ smaller than θ for all schedulers?

Thus, we need only check optimal schedulers, i.e. those that maximise P(ϕ)

If an optimal scheduler generates a probability above θ, the property is false.

True otherwise.

How do we find optimal schedulers?


Page 10: QEST'12 paper seminar


Summary

1 The PMC Problem

2 Resolving Non-Determinism

3 Algorithm

4 Implementation and Results

5 Conclusions


Page 11: QEST'12 paper seminar


[Figure: algorithm flowchart. Starting from a uniform scheduler σ, alternate scheduler evaluation (producing Q-estimates) and scheduler improvement (producing an improved σ); once a candidate σ emerges, determinise it and run SMC, which returns True or False.]


Page 12: QEST'12 paper seminar


Scheduler Evaluation

This step estimates:

How good the current scheduler is

How much each choice contributed to the satisfaction of ϕ

It does this by:

Turning the MDP into a Markov chain using scheduler σ

Sampling from the Markov chain repeatedly

For each satisfying trace, give a positive point to each action taken; for each falsifying trace, a negative point

At the end, for each action, we have an estimate of the probability of satisfying ϕ
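The evaluation step can be sketched as follows; `sample_trace` and `satisfies` are hypothetical placeholders for the induced Markov-chain simulator and the bounded trace checker:

```python
from collections import defaultdict

def evaluate_scheduler(sample_trace, satisfies, num_samples):
    """Estimate, for each (state, action) pair, the fraction of
    sampled traces through it that satisfy phi.

    sample_trace() -> list of (state, action) pairs, one bounded run
    of the Markov chain induced by sigma; satisfies(trace) -> bool.
    """
    hits = defaultdict(int)    # traces through (s, a) satisfying phi
    total = defaultdict(int)   # all traces through (s, a)
    for _ in range(num_samples):
        trace = sample_trace()
        sat = satisfies(trace)
        for pair in set(trace):      # count each pair once per trace
            total[pair] += 1
            if sat:
                hits[pair] += 1
    return {pair: hits[pair] / total[pair] for pair in total}

# Toy example: two fixed traces, only the first satisfying.
traces = iter([[("s0", "a")], [("s0", "b")]])
q = evaluate_scheduler(lambda: next(traces), lambda t: t[0][1] == "a", 2)
print(q)  # {('s0', 'a'): 1.0, ('s0', 'b'): 0.0}
```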


Page 13: QEST'12 paper seminar


Scheduler Improvement

This step provably improves scheduler σ by:

For each state, choosing the “best” action with probability 1 − ε

Choosing each of the other actions with probability ε/(n − 1), where n is the number of possible actions

This ensures that:

Search efforts are largely directed at the promising regions of the state space

All states remain explorable/reachable (p > 0)
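The improvement rule above can be sketched as an ε-greedy update (function and variable names are ours, not the paper's):

```python
def improve_scheduler(q, actions, eps=0.1):
    """Build an improved randomized scheduler from Q-estimates.

    q maps (state, action) to the estimated P(phi); actions maps each
    state to its available actions. The "best" action gets probability
    1 - eps; each of the other n - 1 actions gets eps / (n - 1), so
    every action keeps positive probability and stays explorable.
    """
    sigma = {}
    for state, acts in actions.items():
        n = len(acts)
        if n == 1:
            sigma[state] = {acts[0]: 1.0}
            continue
        best = max(acts, key=lambda a: q.get((state, a), 0.0))
        sigma[state] = {a: 1 - eps if a == best else eps / (n - 1)
                        for a in acts}
    return sigma

sigma = improve_scheduler({("s0", "a"): 0.9, ("s0", "b"): 0.2},
                          {"s0": ["a", "b"]}, eps=0.1)
print(sigma)  # {'s0': {'a': 0.9, 'b': 0.1}}
```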


Page 14: QEST'12 paper seminar


Putting it all together

The entire algorithm is thus very simple:

Start with an uninformed (i.e. uniform) scheduler

Estimate best actions

Improve scheduler with this new information

Rinse & repeat

When the scheduler is “good enough” (or a time limit is reached), determinise it

Run Statistical Model Checking using the determinised scheduler
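The loop above, with each step abstracted as a function the caller supplies, might look like:

```python
def smc_for_mdp(init_uniform, evaluate, improve, determinise, smc_check,
                rounds=50):
    """Top-level loop: start from a uniform scheduler, alternate
    evaluation and improvement, then determinise the result and run
    statistical model checking on the induced Markov chain."""
    sigma = init_uniform()
    for _ in range(rounds):
        q = evaluate(sigma)        # estimate P(phi) per (state, action)
        sigma = improve(sigma, q)  # eps-greedy shift toward best actions
    return smc_check(determinise(sigma))
```

A usage with trivial stand-ins (purely to show the wiring): `smc_for_mdp(lambda: "u", lambda s: {}, lambda s, q: s, lambda s: s, lambda s: s == "u", rounds=3)` returns True.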


Page 15: QEST'12 paper seminar


Properties

This algorithm is a false-biased Monte Carlo algorithm:

If it finds a counterexample, it returns false (up to the SMC guarantees)

If it does not, it returns true with arbitrarily high probability

It has the following nice properties:

Provides counter-example

Converges

Statistically correct

Highly parallelisable


Page 16: QEST'12 paper seminar


Summary

1 The PMC Problem

2 Resolving Non-Determinism

3 Algorithm

4 Implementation and Results

5 Conclusions


Page 17: QEST'12 paper seminar


Implementation

Integrated with PRISM simulation engine

Works with PRISM MDP benchmarks

Parallel sampling

Synchronisation of data structures during evaluation

Ran on 32-core and 48-core machines


Page 18: QEST'12 paper seminar


Scheduler improvement

[Figure: fraction of satisfied traces out of total traces (y-axis, 0–1) vs. number of learning rounds (x-axis, 0–90), for 10, 50 and 100 processes.]


Page 19: QEST'12 paper seminar


Parallelisability

[Figure: runtime in seconds (0–250) vs. number of threads (1–32), for Mutex Bugged 10 and Mutex Bugged 30.]


Page 20: QEST'12 paper seminar


Correctness

[Figure: verification time in seconds (0–350) for SMC vs. PRISM on instances with 10 (8K states), 15 (393K states), 20 (16M states) and 25 (654M states) processes.]

[Companion poster: “Learning Optimal Policies for Model Checking”, João Martins and David Henriques. Panels: Introduction, Background, The Algorithm, Results (Convergence and Efficiency), References.]

Markov decision processes (MDPs) are expressive models, popular for modeling systems that exhibit both probabilistic and non-deterministic behaviour.

Useful quantitative properties over MDPs can be automatically verified with probabilistic model checking (PMC), a popular formal verification technique.

Unfortunately, PMC suffers from the state explosion problem. Statistical methods can be used to approximate the desired result without need for complete state space exploration.

One well-identified shortcoming is that statistical methods have been limited to fully probabilistic systems.

Bounded LTL: BLTL is an expressive probabilistic logic for reasoning about dynamic systems. Its syntax is given by

ϕ ::= p | ¬ϕ | ϕ ∧ ϕ | F<n ϕ | G<n ϕ | ϕ U<n ϕ | ϕ W<n ϕ

It allows us to express properties such as “a request is acknowledged within n time steps” or “the process enters the critical region only after the flag is set”.

PRISM: PRISM is the reference, state-of-the-art probabilistic model checker.

It answers the question P>p(ϕ) using value iteration: is the probability of satisfying ϕ greater than p for all resolutions of non-determinism? P>p(ϕ) is known as the probabilistic property and ϕ as the temporal formula.

It requires the entire state space in memory.

Objectives: develop a Reinforcement Learning algorithm to learn optimal policies for model checking PLTL in MDPs that does not require computation over the entire state space.

Using the above technique, apply Statistical Model Checking (SMC) to systems with non-determinism. To the best of our knowledge, this is the first attempt to solve this problem in a general setting.

Integrate the algorithm with the PRISM model checker; in particular, allow the use of its extensive benchmark suite.


Top-Level Algorithm

1. Initialise a policy σ such that actions are chosen uniformly from each state
2. Do K times:
   a. Sample a set P of N paths from the MDP using policy σ
   b. For each path π ∈ P:
      If π ⊨ ϕ, positively reinforce the actions along π
      If π ⊭ ϕ, negatively reinforce them
   c. Update the policy based on the reinforcement
3. Determinise the policy
4. Use SMC to check the probabilistic property

Leader Election Protocol (Error): randomized leader election protocol. The graphic shows the probability of electing a leader within x steps.

[Figure (Results – Efficiency): probability of electing a leader (y-axis, 0–1) vs. number of steps (x-axis, 25–75), with lower bound, upper bound and real probability curves.]

Mutex Protocol (Error): several mutual exclusion processes, one having a small probability of entering the critical zone illegally. We check the worst-case probability of error.

Reinforcement: for each state-action pair (s, a), the reinforcement R is

R(s, a) = |{π : (s, a) ∈ π and π ⊨ ϕ}| − |{π : (s, a) ∈ π and π ⊭ ϕ}|

Policy Update: the new probability distributions are multinomials with parameters given by the MLE from the reinforcement information, R(s, a) / Σ_a′ R(s, a′).

To avoid transitions with probability 0 and to minimize harmful runs, the actual policy update is a mixture of the previous distribution and this new distribution.
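The mixture update can be sketched as follows (the mixing weight `alpha` is a hypothetical parameter; the text does not fix its value):

```python
def update_policy(old, mle, alpha=0.5):
    """Mix the previous action distribution with the MLE distribution
    from reinforcement, so no transition probability collapses to 0."""
    return {a: (1 - alpha) * old[a] + alpha * mle.get(a, 0.0)
            for a in old}

print(update_policy({"a": 0.5, "b": 0.5}, {"a": 1.0, "b": 0.0}))
# {'a': 0.75, 'b': 0.25}
```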

Convergence and Stopping Criteria: since optimal policies are deterministic, every once in a while we determinise the policy and check the probabilistic property using SMC.

Negative answers from SMC are (probabilistically) guaranteed to mean the probabilistic property is false, since there is at least one policy achieving the value in question.

Positive answers from SMC may be false positives. We run the algorithm several times to minimize the probability of always converging to local maxima.

[Figures: Mutex Protocol – Efficiency (probability, 0–1.2, vs. number of steps/processes, 1–21, for deterministic vs. probabilistic policies) and Mutex Protocol – Convergence (over learning rounds K).]

Wireless Network (Efficiency): IEEE 802.11 Wireless LAN standard for collision avoidance. Several stations broadcast at the same time and enact a back-off protocol when collisions are detected.

[Figures: estimated vs. true probability (0.7–1) for 10, 15, 20 and 25 processes, and runtime in seconds (0–70) of Time (PRISM) vs. Time (SMC) against the number of stations: 2 (204K states), 3 (616K states), 4 (1.9M states), 5 (6.2M states), 6 (19.8M states).]


Page 21: QEST'12 paper seminar


Benchmark results. For each model, "out" is the SMC answer at threshold θ and "t" its runtime in seconds; the PRISM column gives the exact probability and PRISM's runtime.

CSMA 34
  θ      0.5    0.8    0.85   0.9     0.95   | PRISM
  out    F      F      F      T       T      | 0.86
  t      1.7    11.5   35.9   115.7   111.9  | 136

CSMA 36
  θ      0.3    0.4    0.45   0.5     0.8    | PRISM
  out    F      F      F      T       T      | 0.48
  t      2.5    9.4    18.8   133.9   119.3  | 2995

CSMA 44
  θ      0.5    0.7    0.8    0.9     0.95   | PRISM
  out    F      F      F      F       T      | 0.93
  t      3.5    3.7    17.5   69.0    232.8  | 16244

CSMA 46
  θ      0.5    0.7    0.8    0.9     0.95   | PRISM
  out    F      F      F      F       F∗     | memout
  t      3.7    4.1    4.2    26.2    258.9  | memout

WLAN 5
  θ      0.1    0.15   0.2    0.25    0.5    | PRISM
  out    F      F      T      T       T      | 0.18
  t      4.9    11.1   124.7  104.7   103.2  | 1.6

WLAN 6
  θ      0.1    0.15   0.2    0.25    0.5    | PRISM
  out    F      F      T      T       T      | 0.18
  t      5.0    11.3   127.0  104.9   102.9  | 1.6


Page 22: QEST'12 paper seminar


Summary

1 The PMC Problem

2 Resolving Non-Determinism

3 Algorithm

4 Implementation and Results

5 Conclusions


Page 23: QEST'12 paper seminar


Conclusions

Trading absolute correctness for statistical correctness gives us more applicability

Faster than traditional exact approaches for not completely structured systems

Statistical correctness

Integration with PRISM


Page 24: QEST'12 paper seminar


Thank you, questions?
