
Yuxin Deng

Semantics of Probabilistic Processes

An Operational Approach

Author-prepared version. The final publication is available at www.springer.com/978-3-662-45197-7

Springer


Preface

Probabilistic concurrency theory aims to specify and analyse the quantitative behaviour of concurrent systems, which necessarily builds on solid semantic foundations of probabilistic processes. This book adopts an operational approach to describing the behaviour of nondeterministic and probabilistic processes, and the semantic comparison of different systems is based on appropriate behavioural relations such as bisimulation equivalence and testing preorders.

It mainly consists of two parts. The first part provides an elementary account of bisimulation semantics for probabilistic processes from metric, logical and algorithmic perspectives. The second part sets up a general testing framework and specialises it to probabilistic processes with nondeterministic behaviour. The resulting testing semantics is treated in depth. A few variants of it are shown to coincide, and they can be characterised in terms of modal logics and co-inductively defined simulation relations. Although in the traditional (non-probabilistic) setting simulation semantics is in general finer than testing semantics, because it distinguishes more processes, for a large class of probabilistic processes the gap between simulation and testing semantics disappears. Therefore, in this case we have a semantics where both negative and positive results can be easily proved: to show that two processes are not related in the semantics we just give a witness test, and to prove that two processes are related we only need to establish a simulation relation.

While most of the results have been announced before, they are spread over several papers published between 2007 and 2014, sometimes with different terminology and notation, which stands in the way of a comprehensive understanding of the bisimulation and testing semantics of probabilistic processes. To improve the situation, the current work brings together all the related concepts and proof techniques in a coherent and self-contained text.

Besides presenting recent research advances in probabilistic concurrency theory, the book exemplifies the use of many mathematical techniques to solve problems in computer science. It is intended to be accessible to postgraduate students in computer science and mathematics, and can also be used by researchers and practitioners either for advanced study or for technical reference. The reader is assumed to have some basic knowledge of discrete mathematics. Familiarity with real analysis is not a prerequisite, but would be helpful.

Most of the work reported in this book was carried out during the last few years with a number of colleagues. The testing semantics for probabilistic processes was developed in conjunction with Rob van Glabbeek, Matthew Hennessy, Carroll Morgan and Chenyi Zhang. The various characterisations of probabilistic bisimulation in Chapter 3 are based on joint work with Wenjie Du.

The BASICS laboratory at Shanghai Jiao Tong University has offered a creative and pleasant working atmosphere, and I would like to express my gratitude to Yuxi Fu and all other members of the laboratory. Thanks go also to Barry Jay, Matthew Hennessy and Carroll Morgan for having read parts of the first draft and provided useful feedback. My research on probabilistic concurrency theory has been sponsored by the National Natural Science Foundation of China under grants 61173033 and 61261130589, as well as ANR 12IS02001 "PACE".

Finally, my special gratitude goes to my family, for their unfailing support.

Shanghai, September 2014
Yuxin Deng


List of Symbols

P(X) 9    P+(O) 71    N 9    R 13    R≥0 15    R^n 15
⊕_{i∈I} pi·ϕi 24    ϕ1 p⊕ ϕ2 44    〈a〉ψ 44    [a]ψ 48    □ 103    ⊓ 103    p⊕ 103    |A 74    rec x.P 150    F(R) 196
∼ 41    ∼n 41    ≺ 42    ≍ 42    =L 44    ≤Ho 72    ≤Sm 72
⊑may 72    ⊑must 72    ≃may 72    ≃must 72    ⊑pmay 76    ⊑pmust 76    ⊑Ω,pmay 76    ⊑Ω,pmust 76
⊑Ω,ermay 86    ⊑Ω,ermust 86    ⊑Ω,nrmay 83    ⊑Ω,nrmust 84    ⊑Ω,rrmay 224    ⊑Ω,rrmust 224
⊑Emay 138    ⊑Emust 138    ⊑L 104    ⊑F 104    ⊳S 120    ⊳FS 120    ⊑S 104    ⊑FS 104    ≃S 121    ≃FS 121
⊳o,FS 128    ⊑e,FS 128    ⊳k,FS 206    ⊑∞,FS 210    ⊑k,S 151    ⊳e,FS 192    ⊳s,FS 196    ⊳c,FS 199
≈ 231    ≈s 232    →R 74    →ep 90
a−→ 24    α=⇒ 104    τ−→ 119    =⇒ 159    a=⇒ 120    α=⇒ω 128    =⇒≻ 168    =⇒δ 180    =⇒δ,dp 183    τ−→p 200
X−/→ 104    ref(X) 104    =⇒ A−/→ 120    C 75    C^h 83    C^{δ,h} 92    C^h_min 85    C^h_max 85
V 75    V^h 83    V^h_min 86    V^h_max 86    V^{δ,h} 92    V^f 110    V^{δ,h}_min 92    V^{δ,h}_max 92
A(T,P) 71    A^h(T,P) 83    A^h_min 86    A^h_max 86    A^d(T,P) 167    R 11    R† 24
lX 16    lωX 157    lR 154    ⌈∆⌉ 19    |∆| 19    s̄ 19    ||f|| 33    D(S) 19    Dsub(S) 19    P(X) 33    [ϕ] 45
m⋆ 35    m 34    Hd(X,Y) 53    Ω 73    Imgf(Θ) 73    Exp∆(f), f(∆) 73    T 74    Tn 74    h·O 81
ϕP 104    Tϕ 104    vϕ 134    [P] 107    Act^ω_τ 109    −→ω 129    Q[x ↦ P] 152    ∆×k 159    ∆→k 159
Ch(R) 155    ε 150    [s] 168    [∆] 168    dp(s)↓ 176    dp(s)↑ 176    $Θ 169
P_rmax 177    P_{δ,rmax} 181    P_{δ,dp,r} 183    V(∆) 205    F_{δ,dp,r} 181    Der_dp 176    div 211


Contents

List of Symbols  vii

1 Introduction  1
  1.1 Background  1
  1.2 Synopsis  3
  References  5

2 Mathematical Preliminaries  7
  2.1 Lattice theory  7
  2.2 Induction and co-induction  10
  2.3 Topological spaces  13
  2.4 Metric spaces  15
  2.5 Probability spaces  18
  2.6 Linear programming  19
  References  21

3 Probabilistic Bisimulation  23
  3.1 Introduction  23
  3.2 Probabilistic labelled transition systems  25
  3.3 Lifting relations  27
  3.4 Justifying the lifting operation  32
    3.4.1 Justification by the Kantorovich metric  32
    3.4.2 Justification by network flow  38
  3.5 Probabilistic bisimulation  41
  3.6 Logical characterisation  44
    3.6.1 An adequate logic  44
    3.6.2 An expressive logic  48
  3.7 Metric characterisation  52
  3.8 Algorithmic characterisation  56
    3.8.1 A partition refinement algorithm  56
    3.8.2 An "on the fly" algorithm  59
  3.9 Bibliographic notes  63
    3.9.1 Probabilistic models  63
    3.9.2 Probabilistic bisimulation  64
  References  65

4 Probabilistic Testing Semantics  71
  4.1 A general testing framework  71
  4.2 Testing probabilistic processes  73
  4.3 Bounded continuity  77
  4.4 Reward testing  80
    4.4.1 A geometric property  81
    4.4.2 Nonnegative rewards  83
  4.5 Extremal reward testing  85
  4.6 Extremal reward testing versus resolution-based reward testing  87
    4.6.1 Must testing  88
    4.6.2 May testing  91
  4.7 Vector-based testing versus scalar testing  96
  4.8 Bibliographic notes  98
  References  99

5 Testing Finite Probabilistic Processes  103
  5.1 Introduction  103
  5.2 The language pCSP  105
    5.2.1 The syntax  105
    5.2.2 The operational semantics  106
    5.2.3 The precedence of probabilistic choice  108
    5.2.4 Graphical representation of pCSP processes  108
    5.2.5 Testing pCSP processes  109
  5.3 Counterexamples  113
  5.4 Must versus may testing  117
  5.5 Forward and failure simulation  118
    5.5.1 The simulation preorders  120
    5.5.2 The simulation preorders are precongruences  125
    5.5.3 Simulations are sound for testing preorders  127
  5.6 A modal logic  132
  5.7 Characteristic tests  134
  5.8 Equational theories  136
  5.9 Inequational theories  138
  5.10 Completeness  140
  5.11 Bibliographic notes  143
    5.11.1 Probabilistic equivalences  143
    5.11.2 Probabilistic simulations  144
  References  146

6 Testing Finitary Probabilistic Processes  149
  6.1 Introduction  149
  6.2 The language rpCSP  151
  6.3 A general definition of weak derivations  153
    6.3.1 Lifting relations  154
    6.3.2 Weak transitions  159
    6.3.3 Properties of weak transitions  161
  6.4 Testing rpCSP processes  167
    6.4.1 Testing with extremal derivatives  167
    6.4.2 Comparison with resolution-based testing  171
  6.5 Generating weak derivatives in a finitary pLTS  175
    6.5.1 Finite generability  175
    6.5.2 Realising payoffs  180
    6.5.3 Consequences  186
  6.6 The failure simulation preorder  191
    6.6.1 Two equivalent definitions and their rationale  191
    6.6.2 A simple failure similarity for finitary processes  196
    6.6.3 Precongruence  198
    6.6.4 Soundness  204
  6.7 Failure simulation is complete for must testing  206
    6.7.1 Inductive characterisation  206
    6.7.2 The modal logic  211
    6.7.3 Characteristic tests for formulae  213
  6.8 Simulations and may testing  219
    6.8.1 Soundness  220
    6.8.2 Completeness  222
  6.9 Real-reward testing  222
  6.10 Summary  229
  References  230

7 Weak probabilistic bisimulation  231
  7.1 Introduction  231
  7.2 A simple bisimulation  232
  7.3 Compositionality  236
  7.4 Reduction barbed congruence  238
  7.5 Bibliographic notes  241
  References  242

Index  243


Chapter 1
Introduction

Abstract This introduction briefly reviews the history of probabilistic concurrency theory and three approaches to the semantics of concurrent systems: the denotational, axiomatic and operational approaches. This book focuses on the last one, and more specifically on (bi)simulation semantics and testing semantics. The second section surveys the contents and main results of the other chapters of the book.

Keywords: Probabilistic concurrency theory; Semantics; Bisimulation; Testing

1.1 Background

Computer science aims to explain in a rigorous way how computational systems should behave, and then to design them so that they do behave as expected. Nowadays the notion of computational systems includes not only sequential systems but also concurrent systems. The attention of computer scientists goes beyond single programs in free-standing computers. For example, computer networks, particles in physics and even proteins in biology can all be considered as concurrent systems. Some classical mathematical models (e.g. the λ-calculus [3]) are successful for describing sequential systems, but they turn out to be insufficient for reasoning about concurrent systems, because what is more important now is how different components of a system interact with each other rather than their input-output behaviour.

In the 1980s, process calculi (sometimes also called process algebras), notably CCS [14], CSP [12] and ACP [4, 2], were proposed for describing and analysing concurrent systems. All of them were designed around the central idea of interaction between processes. In those formalisms, complex systems are built from simple subcomponents, using a small set of primitive operators such as prefix, nondeterministic choice, restriction, parallel composition and recursion. Those traditional process calculi were designed to specify and verify qualitative behaviour of concurrent systems.



Since the 1990s, there has been a trend to study the quantitative behaviour of concurrent systems. Many probabilistic algorithms have been developed in order to gain efficiency or to solve problems that are otherwise impossible to solve by deterministic algorithms. For instance, probabilities are introduced to break symmetry in distributed coordination problems (e.g. the dining philosophers' problem, leader election and consensus problems). Probabilistic modelling has helped to analyse and reason about the correctness of probabilistic algorithms, to predict system behaviour based on the calculation of performance characteristics, and to represent and quantify other forms of uncertainty. The study of probabilistic model checking techniques has been a rich research area.

A great many probabilistic variants of the classical process calculi have also appeared in the literature. The typical approach is to add probabilities to existing models and techniques that have already proved successful in the non-probabilistic setting. The distinguishing feature of probabilistic process calculi is the presence of a probabilistic-choice operator, as in the probabilistic extensions of CCS [7, 8], probabilistic CSP [13], probabilistic ACP [1] and the probabilistic asynchronous π-calculus [10].

In order to study a programming language or a process calculus, one needs to assign a consistent meaning to each program or process under consideration. This meaning is the semantics of the language or calculus. Semantics is essential to verify or prove that programs behave as intended. Generally speaking, there are three major approaches for giving semantics to a programming language. The denotational approach [17] seeks a valuation function that maps a program to its mathematical meaning. This approach has been very successful in modelling many sequential languages: programs are interpreted as functions from the domain of input values to the domain of output values. However, the nature of interaction is much more complex than a mapping from inputs to outputs, and so far the denotational interpretation of concurrent programs has not been as satisfactory as the denotational treatment of sequential programs.

The axiomatic approach [6, 11] aims at understanding a language through a few axioms and inference rules that help to reason about the properties of programs. It offers an elegant way of gaining insight into the nature of the operators and the equivalences involved. For example, the difference between two notions of program equivalence may be characterised by a few axioms, particularly if adding these axioms to a complete system for one equivalence gives a complete system for the other equivalence. However, it is often difficult, or even impossible, to achieve a fully complete axiomatic semantics once the language in question is beyond a certain expressiveness.

The operational approach has been shown to be very useful for giving semantics to concurrent systems. The behaviour of a process is specified by its structural operational semantics [16], described via a set of labelled transition rules inductively defined on the structure of a term. In this way each process corresponds to a labelled transition graph. The shortcoming of operational semantics is that it is too concrete: a transition graph may contain many states that intuitively should be identified. Thus a great number of equivalences have been proposed, and different transition graphs are compared modulo some equivalence relation. Usually there is no agreement on which is the best equivalence relation; in formal verification, different equivalences might be suitable for different applications. Sometimes an equivalence is not defined directly but is induced by a preorder relation, obtained by taking the intersection of the preorder with its inverse.

Among the various equivalences, bisimilarity [15, 14] is one of the most important, as it admits beautiful characterisations in terms of fixed points, modal logics, co-algebras, pseudometrics, games, decision algorithms, etc. In this book we will characterise bisimilarity for probabilistic processes from metric, logical and algorithmic perspectives.

Preorders can be used to formalise a "better than" relation between programs or processes, one that has its origins in the early work on assigning meanings to programs and associating a logic with those meanings [6, 11]. Usually that relation is expressed in two different ways: either by providing a witness for the relation, or by providing a testing context that makes it apparent that one program is actually not better than another.

Two important kinds of preorders are testing preorders [5, 9] and simulation preorders. They give rise to testing semantics and simulation semantics, respectively. In a testing semantics, two processes are compared by experimenting with a class of tests. Process P is deemed "better" than process Q if the former passes every test that the latter can pass. In contrast, to show that P is not "better" than Q it suffices to find a test that Q can pass but P cannot. In a simulation semantics, process P can simulate Q if, whenever Q performs an action and evolves into Q′, P is able to exhibit the same action and evolve into some P′ such that P′ can simulate Q′ in the next round of the simulation game. Simulation is co-inductively defined and comes along with a proof principle called co-induction: to show that two processes are related it suffices to exhibit a simulation relation containing the pair consisting of the two processes. In the non-probabilistic setting, simulation semantics is in general finer than testing semantics in that it can distinguish more processes. However, in this book we will see that for a large class of probabilistic processes the gap between simulation and testing semantics disappears. Therefore, in this case we have a semantics where both negative and positive results can easily be proved: to show that two processes are not related in the semantics we just give a witness test, while to show that two processes are related we construct a relation and argue that it is a simulation relation.

1.2 Synopsis

The remainder of the book is organised as follows. Chapter 2 collects some fundamental concepts and theorems from a few mathematical subjects such as lattice theory, topology and linear programming. They are briefly reviewed and meant to be used as references for later chapters. Most of the theorems are classic results and are thus stated without proofs, as they can easily be found in many standard textbooks in mathematics. It is not necessary to go through the whole chapter; readers can refer to the relevant parts of this chapter when it is mentioned elsewhere in the book.

Chapter 3 introduces an operational model of probabilistic systems called probabilistic labelled transition systems. In this model, a state might make a nondeterministic choice among a set of available actions. Once an action is taken, the state evolves into a distribution over successor states. Then, in order to compare the behaviour of two states, we need to know how to compare two distributions. There is a nice lifting operation that turns a relation between states into a relation between distributions. This operation is closely related to the Kantorovich metric in mathematics and the network flow problem in optimisation theory. We give an elementary account of the lifting operation because it entails a neat notion of probabilistic bisimulation that can be characterised by behavioural pseudometrics and decided by polynomial algorithms over finitary systems. We also provide modal characterisations of probabilistic bisimulation in terms of probabilistic extensions of the Hennessy-Milner logic and the modal mu-calculus.

Starting from Chapter 4 we investigate the testing semantics of probabilistic processes. We first set up a general testing framework that can be instantiated into a vector-based or scalar testing approach, depending on the number of actions used to indicate success states. A fundamental theorem is that for finitary systems the two approaches are equally powerful. In order to prove this result we make use of a notion of reward testing as a stepping stone. The separation theorem from discrete geometry (Theorem 2.7) plays an important role in the proof.

Chapter 5 investigates the connection between testing and simulation semantics. For finite processes, i.e. processes that correspond to probabilistic labelled transition systems with finite tree structures, testing semantics is not only sound but also complete for simulation semantics. More specifically, the may testing preorder coincides with the simulation preorder, and the must testing preorder coincides with the failure simulation preorder. Therefore, unlike the traditional (non-probabilistic) setting, here there is no gap between testing and simulation semantics. To prove this result we make use of logical characterisations of the testing preorders. For example, each state s has a characteristic formula ϕs, in the sense that another state t can simulate s if and only if t satisfies ϕs. We can then turn this formula ϕs into a characteristic test Ts so that, if t is not related to s via the may testing preorder, then Ts is a witness test that distinguishes t from s. Similarly for the case of failure simulation and must testing. We also give a complete axiom system for the testing preorders in the finite fragment of a probabilistic CSP. This chapter paves the way for the next one.

In Chapter 6 we extend the results of the previous chapter from finite processes to finitary processes, i.e. processes that correspond to probabilistic labelled transition systems that are finite-state and finitely branching, possibly with loops. The soundness and completeness proofs inherit the general schemata from the previous chapter, but the technicalities are much more subtle and more interesting. For example, we make significant use of subdistributions. A key topological property is that, from any given subdistribution, the set of subdistributions reachable from it by weak transitions can be finitely generated. The proof is highly non-trivial and involves techniques from Markov decision processes such as rewards and static policies. This result enables us to approximate co-inductively defined relations by stratified inductive relations. As a consequence, if two processes behave differently, we can tell them apart by a finite test.

We also introduce a notion of real-reward testing that allows for negative rewards. It turns out that the real-reward may preorder is the inverse of the real-reward must preorder, and vice versa. More interestingly, for finitary convergent processes, the real-reward must testing preorder coincides with the nonnegative-reward testing preorder.

In Chapter 7 we introduce a notion of weak probabilistic bisimulation, simply by taking the symmetric form of the simulation preorder given in Chapter 6. It provides a sound and complete proof methodology for an extensional behavioural equivalence, a probabilistic variant of the traditional reduction barbed congruence well known in concurrency theory.

References

1. Andova, S.: Process algebra with probabilistic choice. Tech. Rep. CSR 99-12, Eindhoven University of Technology (1999)
2. Baeten, J.C.M., Weijland, W.P.: Process Algebra, Cambridge Tracts in Theoretical Computer Science, vol. 18. Cambridge University Press (1990)
3. Barendregt, H.: The Lambda Calculus: Its Syntax and Semantics. North-Holland (1984)
4. Bergstra, J.A., Klop, J.W.: Process algebra for synchronous communication. Information and Computation 60, 109–137 (1984)
5. De Nicola, R., Hennessy, M.: Testing equivalences for processes. Theoretical Computer Science 34, 83–133 (1984)
6. Floyd, R.W.: Assigning meanings to programs. Proceedings of the American Mathematical Society Symposia on Applied Mathematics 19, 19–31 (1967)
7. Giacalone, A., Jou, C.C., Smolka, S.A.: Algebraic reasoning for probabilistic concurrent systems. In: Proceedings of the IFIP TC2 Working Conference on Programming Concepts and Methods (1990)
8. Hansson, H., Jonsson, B.: A calculus for communicating systems with time and probabilities. In: Proceedings of the IEEE Real-Time Systems Symposium, pp. 278–287. IEEE Computer Society Press (1990)
9. Hennessy, M.: Algebraic Theory of Processes. The MIT Press (1988)
10. Herescu, O.M., Palamidessi, C.: Probabilistic asynchronous pi-calculus. Tech. rep., INRIA Futurs and LIX (2004)
11. Hoare, C.A.R.: An axiomatic basis for computer programming. Communications of the ACM 12(10), 576–580 (1969)
12. Hoare, C.A.R.: Communicating Sequential Processes. Prentice Hall (1985)
13. Lowe, G.: Probabilities and priorities in timed CSP. Ph.D. thesis, Oxford (1991)
14. Milner, R.: Communication and Concurrency. Prentice Hall (1989)
15. Park, D.: Concurrency and automata on infinite sequences. In: Proceedings of the 5th GI-Conference on Theoretical Computer Science, Lecture Notes in Computer Science, vol. 104, pp. 167–183. Springer (1981)
16. Plotkin, G.: A structural approach to operational semantics. Tech. Rep. DAIMI FN-19, Computer Science Department, Aarhus University (1981)
17. Scott, D., Strachey, C.: Toward a mathematical semantics for computer languages (1971)


Chapter 2
Mathematical Preliminaries

Abstract We briefly introduce some mathematical concepts and associated important theorems that will be used in the subsequent chapters to study the semantics of probabilistic processes. The main topics covered in this chapter include the Knaster-Tarski fixed-point theorem, continuous functions over complete lattices, induction and co-induction proof principles, compact sets in topological spaces, the Separation theorem, the Banach fixed-point theorem, the π-λ theorem, and the duality theorem in linear programming. Most of the theorems are stated without proofs because they can be found in many textbooks.

Keywords: Lattice; Induction; Co-induction; Topology; Metric; Probability; Linear programming

2.1 Lattice theory

A very useful tool for studying the semantics of formal languages is the Knaster-Tarski fixed-point theorem, which will be used many times in subsequent chapters. We begin with some basic notions from lattice theory.

Definition 2.1. A set X with a binary relation ⊑ is called a partially ordered set if the following holds for all x, y, z ∈ X:

1. x ⊑ x (reflexivity);
2. if x ⊑ y and y ⊑ x then x = y (antisymmetry);
3. if x ⊑ y and y ⊑ z then x ⊑ z (transitivity).

An element x ∈ X is called an upper bound for a subset Y ⊆ X if y ⊑ x for all y ∈ Y. Dually, x is a lower bound for Y if x ⊑ y for all y ∈ Y. Note that a subset might not have any upper bound or lower bound.

Example 2.1. Consider the set X = {0,1,2} with the binary relation ⊑ defined by ⊑ = IdX ∪ {(0,1),(0,2)}, where IdX is the identity relation {(0,0),(1,1),(2,2)}. It constitutes a partially ordered set. The subset {1,2} has no upper bound, as Figure 2.1(a) shows.

[Fig. 2.1 Partially ordered sets: Hasse diagrams (a) for Example 2.1, with 0 below 1 and 2, and (b) for Example 2.2, with 0 and 1 below 2 and 3.]

Often we can use Hasse diagrams to describe finite partially ordered sets. For a partially ordered set (X,⊑) we represent each element of X as a vertex and draw an arrow that goes upward from x to y if y is an immediate successor of x, i.e. x ⊏ y and there is no z with x ⊏ z ⊏ y, where ⊏ is obtained from ⊑ by removing the elements (x,x) for all x. For instance, the partially ordered set given in the above example can be depicted by the diagram in Figure 2.1(a).

If Y has an upper bound that is also an element of Y, then it is said to be the greatest element of Y. We can dually define the least element. In the presence of a least element we speak of a pointed partially ordered set. If the set of upper bounds for Y has a least element x, then x is called the supremum or join of Y, written ⊔Y. Dually we have the infimum or meet, written ⊓Y. Note that a subset might not have a supremum even if it has upper bounds; the same goes for the infimum.

Example 2.2. Consider the set X = {0,1,2,3} with the binary relation ⊑ defined by ⊑ = IdX ∪ {(0,2),(0,3),(1,2),(1,3)}. It forms the partially ordered set depicted in Figure 2.1(b). The subset {0,1} has two upper bounds, namely 2 and 3, but it has no supremum because the two upper bounds are incomparable, in the sense that neither 2 ⊑ 3 nor 3 ⊑ 2 holds.

We call X a lattice if suprema and infima exist for all two-element subsets of X, i.e. if ⊔{x,y} and ⊓{x,y} exist for any x, y ∈ X. We call X a complete lattice if suprema and infima exist for all subsets of X.

Remark 2.1. A few characteristics of complete lattices are worth mentioning because they are useful in many applications.

1. It suffices to define complete lattices in terms of either suprema or infima only. For example, a candidate definition is to say that X is a complete lattice if suprema exist for all subsets of X. Then the infimum of any subset Y also exists because

   ⊓Y = ⊔{x ∈ X | ∀y ∈ Y : x ⊑ y}.

2. Let X be a complete lattice. By definition X itself has a supremum, which is the greatest element of X, written ⊤, and an infimum, which is the least element of X, written ⊥. It can be checked that the empty set ∅ has supremum ⊥ and infimum ⊤.
3. If (X,⊑) is a complete lattice, then so is (X,⊒), where ⊒ is the inverse relation of ⊑, i.e. x ⊒ y iff y ⊑ x for any elements x, y ∈ X.

Example 2.3. The set X = {1,2,3,5,6,10,15,30} of all divisors of 30, partially ordered by divisibility, constitutes a complete lattice.

Example 2.4. The set N of all natural numbers, with the usual order ≤ on natural numbers, forms a lattice that is not a complete lattice.

Example 2.5. Let X be any set. Its powerset P(X) = {Y | Y ⊆ X} with the inclusion relation ⊆ forms a complete lattice whose join and meet are set union and intersection, respectively.

Given a function f : X → Y and a set Z ⊆ X, we write f(Z) for the image of Z under f, i.e. the set {f(z) | z ∈ Z}. Given a partially ordered set X and a function f : X → X, we say x ∈ X is a fixed point (resp. prefixed point, postfixed point) of f if x = f(x) (resp. f(x) ⊑ x, x ⊑ f(x)).

Definition 2.2. Let X and Y be partially ordered sets. A function f : X → Y is called monotone if x ⊑ y implies f(x) ⊑ f(y) for all x, y ∈ X.

Theorem 2.1 (Knaster-Tarski fixed-point theorem). If X is a complete lattice then every monotone function f from X to X has a fixed point. The least of these is given by

   lfp(f) = ⊓{x ∈ X | f(x) ⊑ x},

and the greatest by

   gfp(f) = ⊔{x ∈ X | x ⊑ f(x)}.

Proof. Let X′ = {x ∈ X | f(x) ⊑ x} and x∗ = ⊓X′. For each x ∈ X′ we have x∗ ⊑ x and then by monotonicity f(x∗) ⊑ f(x) ⊑ x. Taking the infimum over x we get f(x∗) ⊑ ⊓f(X′) ⊑ ⊓X′ = x∗, thus x∗ ∈ X′ and it is the least prefixed point. On the other hand, x ∈ X′ implies f(x) ∈ X′ by monotonicity. Applying this to x∗ yields f(x∗) ∈ X′, which implies x∗ ⊑ f(x∗). Therefore we obtain x∗ = f(x∗). In fact, x∗ is the least fixed point because we have just shown that it is the least prefixed point.

The case for the greatest fixed point is dual and thus omitted. ⊓⊔

Definition 2.3. Given a complete lattice X, the function f : X → X is continuous if it preserves increasing chains, i.e. for all sequences x0 ⊑ x1 ⊑ ... we have

   f(⊔_{n≥0} xn) = ⊔_{n≥0} f(xn).

Dually, f is co-continuous if it preserves decreasing chains.

Page 20: Semantics of Probabilistic Processes - SJTU

10 2 Mathematical Preliminaries

Notice that continuity and co-continuity both imply monotonicity. For example, if f is continuous and x ⊑ y, then from the increasing sequence x ⊑ y ⊑ y ⊑ ... we obtain that f(⊔{x,y}) = f(y) = ⊔{f(x), f(y)}, which means f(x) ⊑ f(y). With continuity and co-continuity we can construct the least and greatest fixed points, respectively, in a more tractable way.

Proposition 2.1. Let X be a complete lattice.

1. Every continuous function f on X has a least fixed point, given by ⊔_{n≥0} f^n(⊥), where ⊥ is the bottom element of the lattice, and f^n(⊥) is the n-th iteration of f on ⊥: f^0(⊥) := ⊥ and f^{n+1}(⊥) := f(f^n(⊥)) for n ≥ 0.
2. Every co-continuous function f on X has a greatest fixed point, given by ⊓_{n≥0} f^n(⊤), where ⊤ is the top element of the lattice.

Proof. We only prove the first clause, since the second one is dual.

We notice that ⊥ ⊑ f(⊥), and then monotonicity of f yields an increasing sequence

   ⊥ ⊑ f(⊥) ⊑ f^2(⊥) ⊑ ...

By continuity of f we have f(⊔_{n≥0} f^n(⊥)) = ⊔_{n≥0} f^{n+1}(⊥), and the latter is equal to ⊔_{n≥0} f^n(⊥), so this supremum is a fixed point.

And in fact that limit is the least fixed point: for if some other x were also a fixed point, then we have ⊥ ⊑ x and moreover f^n(⊥) ⊑ x for all n by induction. So x is an upper bound of all the f^n(⊥), and hence ⊔_{n≥0} f^n(⊥) ⊑ x. ⊓⊔

Let (X,⊑) be a partially ordered set. We say X is a complete partially ordered set (CPO) if it has suprema for all increasing chains. A CPO with bottom is a CPO with a least element ⊥. The least fixed point of a continuous function f on a CPO with bottom can be characterised in the same way as in Proposition 2.1(1), namely ⊔_{n≥0} f^n(⊥).
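To see the construction of Proposition 2.1 in action, here is a small sketch (our own illustration, not from the text; the transition relation is invented) that computes a least fixed point on the finite powerset lattice (P(X), ⊆) by Kleene iteration. On a finite lattice every monotone function is continuous, so iterating from the bottom element must stabilise at lfp(f).

    # Kleene iteration on the powerset lattice: x, f(x), f(f(x)), ...
    # starting from the bottom element (the empty set).
    def lfp(f, bottom=frozenset()):
        x = bottom
        while True:
            y = f(x)
            if y == x:          # a fixed point has been reached
                return x
            x = y

    # Example: the states reachable from state 0 in a small (made-up)
    # transition relation, as the least fixed point of
    # f(S) = {0} | {t : s in S, (s, t) an edge}.
    edges = {(0, 1), (1, 2), (3, 4)}

    def step(s):
        return frozenset({0} | {t for (u, t) in edges if u in s})

    print(lfp(step))  # frozenset({0, 1, 2}); state 4 is unreachable

The successive values of the loop variable mirror the chain ⊥ ⊑ f(⊥) ⊑ f^2(⊥) ⊑ ... from the proof above.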

2.2 Induction and co-induction

We observe that Theorem 2.1 provides two proof principles: the induction principle says that to show lfp(f) ⊑ x it is enough to prove f(x) ⊑ x; the co-induction principle says that to show x ⊑ gfp(f) it suffices to prove x ⊑ f(x). These are two important proof principles in concurrency theory. In this section we briefly introduce inductive and co-inductive definitions by rules and their associated proof techniques. For more detailed accounts, we refer the reader to [4, 3].

Given a set X, a ground rule on X is a pair (S, x) ∈ P(X) × X, meaning that from the premises S we can derive the conclusion x. Usually, we write the rule (S, x) as

   -----           x1 ... xn
     x      or     ---------        (2.1)
                       x

according to whether S = ∅ or S = {x1, ..., xn}. A rule without any premise, i.e. of the form (∅, x), is said to be an axiom.

Quite often we define a set of objects by rules. For example, let Var be a set of variables ranged over by x. In the lambda calculus, the set Λ of all lambda terms can be expressed by a BNF (Backus-Naur Form) grammar:

M,N ::= x | λx.M | MN

which says that Λ is the least set satisfying the following three rules:

   -----        M             M    N
     x        -------        --------
               λx.M             MN

A set of ground rules R determines an operator R : P(X) → P(X), which maps a set S to the set

   R(S) = {x | ∃S′ ⊆ S : (S′, x) ∈ R},

i.e. all the elements derivable from those in S by using the rules in R.

Definition 2.4. A set S is forward-closed under R iff R(S) ⊆ S; a set S is backward-closed under R iff S ⊆ R(S).

In other words, if S is forward-closed, then

   ∀x ∈ X : (∃S′ ⊆ S : (S′, x) ∈ R) ⇒ x ∈ S.

If S is backward-closed, then

   ∀x ∈ X : x ∈ S ⇒ (∃S′ ⊆ S : (S′, x) ∈ R).

Given a set of rules R on X, we let

   Sf = ⋂{S ⊆ X | R(S) ⊆ S}   and   Sb = ⋃{S ⊆ X | S ⊆ R(S)}.

So Sf is the intersection of all forward-closed sets and Sb is the union of all backward-closed sets. Note that R is monotone on the complete lattice (P(X), ⊆). It follows from Theorem 2.1 that Sf is in fact the least fixed point of R and Sb is the greatest fixed point of R. Usually, an object is defined inductively (resp. co-inductively) if it is the least (resp. greatest) fixed point of a function. So the set Sf (resp. Sb) is inductively (resp. co-inductively) defined by the set of rules R.

Example 2.6. The set of finite lists with elements from a set A is the set List inductively defined by the following two rules, i.e. the least set closed forward under these rules:

   ------------        a ∈ A    l ∈ List
   nil ∈ List          ------------------        (2.2)
                          a · l ∈ List

In contrast, the set of all finite or infinite lists is co-inductively defined by the same rules, i.e. the greatest set closed backward under the two rules above; the set of all infinite lists is the set co-inductively defined by the second rule alone, i.e. the greatest set closed backward under the second rule.

The two rules in (2.2) are not ground because they are not in the form shown in (2.1). However, we can convert them into ground rules. Take X to be the set of all (finite or infinite) strings with elements from A ∪ {nil}. The ground rules determined by the two rules in (2.2) are

   -----                                            l
    nil     and, for each l ∈ X and a ∈ A,       -------
                                                   a · l

Let R be a set of ground rules on X. In general R is not necessarily continuous or co-continuous on the complete lattice (P(X), ⊆). For example, consider the rule

   x1  · · ·  xn  · · ·
   --------------------
            x

whose premise is an infinite set. For any n ≥ 1, let Xn = {x1, ..., xn}. Then we have x ∈ R(⋃_n Xn), but x ∉ ⋃_n R(Xn). Thus R is not continuous.

Below we impose some conditions on R to recover continuity and co-continuity.

Definition 2.5. A set of rules R is finite in the premises (FP) if in each rule (S, x) ∈ R the premise set S is finite. A set of rules R is finite in the conclusions (FC) if for each x the set {S | (S, x) ∈ R} is finite; that is, there are finitely many rules whose conclusion is x, though each premise set S may be infinite.

Proposition 2.2. Let R be a set of ground rules.

1. If R is FP, then R is continuous;
2. If R is FC, then R is co-continuous.

Proof. The two statements are dual, so we only prove the second one.

Let S0 ⊇ S1 ⊇ ··· ⊇ Sn ⊇ ··· be a decreasing chain. We need to show that

   ⋂_{n≥0} R(Sn) = R(⋂_{n≥0} Sn).

One direction is easy. Since ⋂_{n≥0} Sn ⊆ Sn and R is monotone, we have that

   R(⋂_{n≥0} Sn) ⊆ ⋂_{n≥0} R(Sn).

The converse inclusion follows from the condition that R is FC. Let x be any element in ⋂_{n≥0} R(Sn). Then x ∈ R(Sn) for each n, which means that for each n ≥ 0 there is some S′n ⊆ Sn with (S′n, x) ∈ R. Since R is FC, the sets S′n range over a finite collection, so there exists some k ≥ 0 with S′k = S′n for infinitely many n. For any n we can thus pick some m ≥ n with S′k = S′m ⊆ Sm ⊆ Sn. Therefore S′k ⊆ ⋂_{n≥0} Sn, and we have that x ∈ R(⋂_{n≥0} Sn). Thus we have proved that

   ⋂_{n≥0} R(Sn) ⊆ R(⋂_{n≥0} Sn).   ⊓⊔

Corollary 2.1. Let R be a set of ground rules on X.

1. If R is FP, then lfp(R) = ⋃_{n≥0} R^n(∅);
2. If R is FC, then gfp(R) = ⋂_{n≥0} R^n(X).

Proof. Combine Propositions 2.1 and 2.2. ⊓⊔


Intuitively, R^0(∅) = ∅, and R^1(∅) = R(∅) consists of all the conclusions of instances of axioms. In general, the set R^{n+1}(∅) contains all objects that immediately follow by ground rules with premises in R^n(∅). The above corollary states that if R is FP, each element of lfp(R) can be derived via a derivation tree of finite depth whose leaves are instances of axioms; if R is FC, elements of gfp(R) can always be destructed as conclusions of ground rules whose premises can also be destructed similarly.
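To make the iteration R^n(∅) concrete, the following sketch (our own illustration with made-up rules, not from the text) represents ground rules as (premises, conclusion) pairs and iterates the operator R from the empty set until it stabilises; since every rule below is FP, Corollary 2.1(1) guarantees the limit is lfp(R).

    # Ground rules as (frozenset_of_premises, conclusion) pairs:
    # an axiom derives 0, and from n we derive n + 2 (up to a bound).
    BOUND = 20
    rules = [(frozenset(), 0)] + \
            [(frozenset({n}), n + 2) for n in range(0, BOUND, 2)]

    def apply_rules(s):
        # The operator R: everything derivable in one step from premises in s.
        return {x for (prems, x) in rules if prems <= s}

    s = set()                       # R^0(emptyset)
    while apply_rules(s) != s:      # iterate R^{n+1} = R(R^n) until stable
        s = apply_rules(s)

    print(sorted(s))  # [0, 2, 4, ..., 20], the inductively defined set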

2.3 Topological spaces

In this section we review some fundamental concepts in general topology, such as continuous functions, compact sets and some other related properties. They will be used in Chapter 6.

Definition 2.6. Let X be a non-empty set. A collection T of subsets of X is a topology on X iff T satisfies the following axioms.

1. X and ∅ belong to T;
2. the union of any number of sets in T belongs to T;
3. the intersection of any two sets in T belongs to T.

The members of T are called open sets, and the pair (X, T) is called a topological space.

Example 2.7. The open intervals in the real line R generate a topology, whose open sets are the unions of open intervals; it is called the usual topology on R.

Let (X, T) be a topological space. A point x ∈ X is an accumulation point or limit point of a subset Y of X iff every open set Z containing x also contains a point of Y different from x, that is,

   Z open, x ∈ Z implies (Z \ {x}) ∩ Y ≠ ∅.

Definition 2.7. A subset Y of a topological space (X, T) is closed iff Y contains each of its limit points.

Definition 2.8. Let Y be a subset of a topological space (X, T). The closure of Y is the union of Y and the set of all its limit points. We say Y is dense if the closure of Y is X, i.e. every point of X is a limit point of Y.

It is immediate that Y coincides with its closure if and only if Y is closed.

Definition 2.9. A topological space (X, T) is said to be separable if it contains a countable dense subset; that is, if there exists a finite or denumerable subset Y of X such that the closure of Y is the entire space.


Definition 2.10. Let (X, T) and (X⋆, T⋆) be topological spaces. A function f from X into X⋆ is continuous iff the inverse image f⁻¹(Y) of every open subset Y of X⋆ is an open subset of X, i.e.

   Y ∈ T⋆ implies f⁻¹(Y) ∈ T.

Example 2.8. The projection mappings from the plane R² into the line R are continuous with respect to the usual topologies. For example, consider the projection π : R² → R defined by π(〈x, y〉) = y. The inverse image of any open interval (a, b) is an infinite open strip parallel to the x-axis.

Continuous functions can also be characterised by their behaviour with respect to closed sets.

Theorem 2.2. A function f : X → Y is continuous if and only if the inverse image of every closed subset of Y is a closed subset of X. ⊓⊔

Let 𝒴 = {Yi} be a class of subsets of X such that Y ⊆ ⋃_i Yi for some Y ⊆ X. Then 𝒴 is called a cover of Y, and an open cover if each Yi is open. Furthermore, if a finite subclass of 𝒴 is also a cover of Y, i.e. if

   ∃ Yi1, ..., Yim ∈ 𝒴 such that Y ⊆ Yi1 ∪ ··· ∪ Yim,

then 𝒴 is said to contain a finite subcover of Y.

Example 2.9. The classical Heine-Borel theorem says that every open cover of a closed and bounded interval [a, b] on the real line contains a finite subcover.

Definition 2.11. A subset Y of a topological space X is compact if every open cover of Y contains a finite subcover.

Example 2.10. By the Heine-Borel theorem, every closed and bounded interval [a, b] on the real line R is compact.

Theorem 2.3. Continuous images of compact sets are compact. ⊓⊔

A class {Xi} of sets is said to have the finite intersection property if every finite subclass {Xi1, ..., Xim} has a non-empty intersection, i.e. Xi1 ∩ ··· ∩ Xim ≠ ∅.

Example 2.11. Consider the following class of open intervals:

   X = {(0, 1), (0, 1/2), (0, 1/2²), ...}.

Clearly, it has the finite intersection property. Observe, however, that X has an empty intersection.

Theorem 2.4. A topological space X is compact if and only if every class {Xi} of closed subsets of X which satisfies the finite intersection property has, itself, a non-empty intersection. ⊓⊔


2.4 Metric spaces

The main subject of probabilistic concurrency theory is the quantitative analysis of system behaviour, so we will meet metric spaces from time to time. This section gives some background knowledge on them.

Let R≥0 be the set of all nonnegative real numbers.

Definition 2.12. A metric space is a pair (X, d) consisting of a set X and a distance function d : X × X → R≥0 satisfying:

1. for all x, y ∈ X, d(x, y) = 0 iff x = y (isolation);
2. for all x, y ∈ X, d(x, y) = d(y, x) (symmetry);
3. for all x, y, z ∈ X, d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality).

If we replace the first clause with ∀x ∈ X : d(x, x) = 0, we obtain the definition of a pseudometric space.

A metric d is c-bounded if ∀x, y ∈ X : d(x, y) ≤ c, where c is a positive real number.

Example 2.12. Let X be a set. The discrete metric d : X × X → [0, 1] is defined by

   d(x, y) = 0 if x = y, and d(x, y) = 1 otherwise.

Example 2.13. Let Rⁿ denote the product set of n copies of the set R of real numbers, i.e. it consists of all n-tuples 〈a1, a2, ..., an〉 of real numbers. The function d defined by

   d(x, y) = √((a1 − b1)² + ··· + (an − bn)²),

where x = 〈a1, ..., an〉 and y = 〈b1, ..., bn〉, is a metric, called the Euclidean metric on Rⁿ. The metric space (Rⁿ, d) is called Euclidean n-space.

Let (X, d) be a metric space. For any point x ∈ X and any real number ε > 0, we let S(x, ε) denote the set of points within distance ε of x:

   S(x, ε) := {y | d(x, y) < ε}.

We call S(x, ε) the open sphere with centre x and radius ε.

Theorem 2.5. Let (X, d) be a metric space. The collection of open spheres in X generates a topology, called the metric topology, whose open sets are the unions of open spheres. ⊓⊔

Definition 2.13. A sequence (xn) in a metric space (X, d) is convergent to x ∈ X if for an arbitrary ε > 0 there exists N ∈ N such that d(xn, x) < ε whenever n > N.

Definition 2.14. A sequence (xn) in a metric space (X, d) is called a Cauchy sequence if for an arbitrary ε > 0 there exists N ∈ N such that d(xm, xn) < ε whenever m, n > N.


Definition 2.15. A metric space is complete if every Cauchy sequence is convergent.

For example, the space of real numbers with the usual metric is complete.

Example 2.14. Let X be a non-empty set and let F denote the collection of functions from X to the interval [0, 1]. A metric is defined on F as follows:

   d(f, g) := sup_{x∈X} |f(x) − g(x)|.

In fact, (F, d) is a complete metric space. Let (fn) be a Cauchy sequence in F. Then for every x ∈ X the sequence (fn(x)) is Cauchy, and since [0, 1] is complete, it converges to some ax ∈ [0, 1]. Let f be the function defined by f(x) = ax; one can check that the convergence of (fn) to f is uniform, so (fn) converges to f in the metric d.

Example 2.15. Similarly to Example 2.14, it can be seen that Euclidean n-space Rⁿ is complete.

Let Y be a subset of a metric space (X, d) and let ε > 0. A finite set of points Z = {z1, z2, ..., zm} is called an ε-net for Y if for every point y ∈ Y there exists a z ∈ Z with d(y, z) < ε.

Definition 2.16. A subset Y of a metric space X is totally bounded if Y possesses an ε-net for every ε > 0.

Theorem 2.6. Let (X, d) be a complete metric space. Then Y ⊆ X is compact if and only if Y is closed and totally bounded. ⊓⊔

Definition 2.17 (Convex set). A set X ⊆ Rⁿ is convex if for every two points x, y ∈ X the whole segment between x and y is also contained in X. In other words, for every p ∈ [0, 1] the point px + (1 − p)y belongs to X. We write lX for the convex closure of X, the smallest convex set containing X.

Given two n-dimensional vectors x = 〈a1, ..., an〉 and y = 〈b1, ..., bn〉, we use the usual definition of dot product: x · y = ∑_{i=1}^{n} ai · bi.

A basic result about convex sets is the separability of disjoint convex sets by a hyperplane [2].

Theorem 2.7 (Separation theorem). Let X, Y ⊆ Rⁿ be convex sets with X ∩ Y = ∅. Then there is a hyperplane whose normal is h ∈ Rⁿ and a number c ∈ R such that

   for all x ∈ X and y ∈ Y, we have h · x ≤ c ≤ h · y

or

   for all x ∈ X and y ∈ Y, we have h · x ≥ c ≥ h · y.

If X and Y are closed and at least one of them is bounded, they can be separated strictly, i.e. in such a way that

   for all x ∈ X and y ∈ Y, we have h · x < c < h · y

or

   for all x ∈ X and y ∈ Y, we have h · x > c > h · y. ⊓⊔

Here the hyperplane is the set {z ∈ Rⁿ | h · z = c}.
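As a small illustration (ours, not from the text): in R², take X and Y to be the closed unit discs centred at (−2, 0) and (2, 0). Both are convex, closed, bounded and disjoint, and the hyperplane with normal h = (1, 0) and c = 0 (the vertical axis) separates them strictly: every x ∈ X satisfies h · x ≤ −1 < 0, while every y ∈ Y satisfies h · y ≥ 1 > 0.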

Definition 2.18. Let (X, d) be a metric space. A function f : X → X is said to be a contraction mapping if there is a constant δ with 0 ≤ δ < 1 such that

   d(f(x), f(y)) ≤ δ · d(x, y)

for all x, y ∈ X.

In the above definition, the constant δ is strictly less than 1. This entails the following property, whose proof crucially relies on that constraint on δ and is worth writing down.

Theorem 2.8 (Banach fixed-point theorem). Every contraction on a complete metric space has a unique fixed point.

Proof. Let (X, d) be a complete metric space, and f a contraction mapping on (X, d) with constant δ. For any x0 ∈ X, define the sequence (xn) by xn+1 := f(xn) for n ≥ 0. Let a := d(x0, x1). It is easy to show that

   d(xn, xn+1) ≤ δ^n · a

by repeated application of the property d(f(x), f(y)) ≤ δ · d(x, y). For any ε > 0, it is possible to choose a natural number N such that (δ^n/(1 − δ)) · a < ε for all n ≥ N. Now, for any m, n ≥ N with m ≤ n,

   d(xm, xn) ≤ d(xm, xm+1) + d(xm+1, xm+2) + ... + d(xn−1, xn)
            ≤ δ^m · a + δ^{m+1} · a + ... + δ^{n−1} · a
            = δ^m · ((1 − δ^{n−m})/(1 − δ)) · a
            < (δ^m/(1 − δ)) · a < ε

by repeated application of the triangle inequality. So the sequence (xn) is a Cauchy sequence. Since (X, d) is complete, the sequence has a limit in (X, d). We define x∗ to be this limit and show that it is a fixed point of f. Suppose it is not, i.e. a∗ := d(x∗, f(x∗)) > 0. Since (xn) converges to x∗, there exists some N ∈ N such that d(xn, x∗) < a∗/2 for all n ≥ N. Then

   d(x∗, f(x∗)) ≤ d(x∗, xN+1) + d(xN+1, f(x∗))
               ≤ d(x∗, xN+1) + δ · d(xN, x∗)
               < a∗/2 + a∗/2 = a∗,

which is a contradiction. So x∗ is a fixed point of f. It is also unique: otherwise, suppose there is another fixed point x′; we have d(x′, x∗) > 0 since x′ ≠ x∗. But then we would have

   d(x′, x∗) = d(f(x′), f(x∗)) ≤ δ · d(x′, x∗) < d(x′, x∗).

Therefore x∗ is the unique fixed point of f. ⊓⊔
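The proof is constructive: iterating f from any starting point converges to the unique fixed point. As a quick sketch (ours, not from the text), the map f(x) = cos x is a contraction on the complete metric space [0, 1], since its derivative is bounded in absolute value by sin 1 < 1, so the iteration below converges.

    import math

    # Iterate x_{n+1} = f(x_n), as in the proof of Theorem 2.8,
    # until successive terms are eps-close.
    def banach_iterate(f, x0, eps=1e-12, max_steps=10_000):
        x = x0
        for _ in range(max_steps):
            y = f(x)
            if abs(y - x) < eps:
                return y
            x = y
        raise RuntimeError("no convergence; is f really a contraction?")

    print(banach_iterate(math.cos, 0.5))  # ~0.7390851332, the fixed point of cos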

2.5 Probability spaces

In this section, we recall some basic concepts from probability and measure theory. More details can be found in many excellent textbooks, for example [1].

Definition 2.19. Let X be an arbitrary non-empty set and 𝒳 a collection of subsets of X. We say that 𝒳 is a field on X if

1. the empty set ∅ ∈ 𝒳;
2. whenever A ∈ 𝒳, then the complement X \ A ∈ 𝒳;
3. whenever A, B ∈ 𝒳, then the union A ∪ B ∈ 𝒳.

A field 𝒳 is a σ-algebra if it is closed under countable union: whenever Ai ∈ 𝒳 for i ∈ N, then ⋃_{i∈N} Ai ∈ 𝒳.

The elements of a σ-algebra are called measurable sets, and (X, 𝒳) is called a measurable space. A measurable space (X, 𝒳) is called discrete if 𝒳 is the powerset P(X). The σ-algebra generated by a family of sets 𝒳, denoted σ(𝒳), is the smallest σ-algebra that contains 𝒳. The existence of σ(𝒳) is ensured by the following proposition.

Proposition 2.3. For any non-empty set X and any collection 𝒳 of subsets of X, there exists a unique smallest σ-algebra containing 𝒳. ⊓⊔

The Borel σ-algebra on a topological space (X, 𝒳) is the smallest σ-algebra containing 𝒳. The elements of the Borel σ-algebra are called Borel sets. If we have a topological space, then we can always consider its Borel σ-algebra and regard (X, σ(𝒳)) as a measurable space.

Let 𝒳 be a collection of subsets of a set X. We say 𝒳 is a π-class if it is closed under finite intersections; 𝒳 is a λ-class if it is closed under complementations and countable disjoint unions.

Theorem 2.9 (The π-λ theorem). If 𝒳 is a π-class, then σ(𝒳) is the smallest λ-class containing 𝒳.

Definition 2.20. Let (X, 𝒳) be a measurable space. A function µ : 𝒳 → [−∞, ∞] is a measure on 𝒳 if it satisfies the following conditions:

1. µ(A) ≥ 0 for all A ∈ 𝒳;
2. µ(∅) = 0;
3. if A1, A2, ... are in 𝒳, with Ai ∩ Aj = ∅ for i ≠ j, then µ(⋃_i Ai) = ∑_i µ(Ai).


The triple (X, 𝒳, µ) is called a measure space. A Borel measure is a measure on a Borel σ-algebra. If µ(X) = 1, then the measure space (X, 𝒳, µ) is called a probability space, µ a probability measure, also called probability distribution, the set X a sample space, and the elements of 𝒳 events. If µ(X) ≤ 1, then we obtain a sub-probability measure, also called sub-probability distribution or simply subdistribution. A measure over a discrete measurable space (X, P(X)) is called a discrete measure over X.

A (discrete) probability subdistribution over a set S can also be considered as a function ∆ : S → [0, 1] with ∑s∈S ∆(s) ≤ 1; the support of such a ∆ is the set ⌈∆⌉ := {s ∈ S | ∆(s) > 0}, and its mass |∆| is ∑s∈⌈∆⌉ ∆(s). A subdistribution is a (total, or full) distribution if |∆| = 1. The point distribution s̄ assigns probability 1 to s and 0 to all other elements of S, so that ⌈s̄⌉ = {s}. With Dsub(S) we denote the set of subdistributions over S, and with D(S) its subset of full distributions. For ∆, Θ ∈ Dsub(S) we write ∆ ≤ Θ iff ∆(s) ≤ Θ(s) for all s ∈ S.

Let {∆k | k ∈ K} be a set of subdistributions, possibly infinite. Then ∑k∈K ∆k is the real-valued function in S → R defined by (∑k∈K ∆k)(s) := ∑k∈K ∆k(s). This is a partial operation on subdistributions because for some states the sum of ∆k(s) might exceed 1. If the index set is finite, say {1..n}, we often write ∆1 + ... + ∆n. For p a real number from [0, 1] we use p·∆ to denote the subdistribution given by (p·∆)(s) := p·∆(s). Finally we use ε to denote the everywhere-zero subdistribution, which thus has empty support. These operations on subdistributions do not readily adapt themselves to distributions; yet if ∑k∈K pk = 1 for some collection of probabilities pk, and the ∆k are distributions, then so is ∑k∈K pk·∆k. In general when 0 ≤ p ≤ 1 we write x p⊕ y for p·x + (1−p)·y where that makes sense, so that for example ∆1 p⊕ ∆2 is always defined, and is full if ∆1 and ∆2 are. Finally, the product of two probability distributions ∆, Θ over S, T is the distribution ∆ × Θ over S × T defined by (∆ × Θ)(s, t) := ∆(s) · Θ(t).
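These operations are straightforward to mechanise. A small Python sketch follows, representing subdistributions as finite-support dictionaries; all helper names (support, mass, point, combine, choice, product) are ours and not from any standard library.

def support(d):   return {s for s, p in d.items() if p > 0}
def mass(d):      return sum(d.values())
def point(s):     return {s: 1.0}                     # the point distribution s-bar

def combine(weighted):
    """sum_k p_k * Delta_k; partial, since the mass may exceed 1."""
    out = {}
    for p, d in weighted:
        for s, q in d.items():
            out[s] = out.get(s, 0.0) + p * q
    assert mass(out) <= 1.0 + 1e-9, "sum exceeds 1: not a subdistribution"
    return out

def choice(d1, p, d2):   return combine([(p, d1), (1 - p, d2)])    # d1 p(+) d2
def product(d1, d2):     return {(s, t): p * q for s, p in d1.items() for t, q in d2.items()}

delta = choice(point('s'), 0.5, {'t': 0.5})    # 1/2 s-bar + 1/2 (a half-mass subdistribution)
print(support(delta), mass(delta))             # {'s', 't'} 0.75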

2.6 Linear programming

A linear programming problem is the problem of maximising or minimising a linear function subject to linear constraints. The constraints may be equalities or inequalities.

Example 2.16 (The transportation problem). There are m factories that supply n customers with a particular product. Maximum production at factory i is si. The demand for the product from customer j is rj. Let cij be the cost of transporting one unit of the product from factory i to customer j. The problem is to determine the quantity of product to be supplied from each factory to each customer so as to minimise total costs.

Let yij be the quantity of product shipped from factory i to customer j. The total transportation cost is


∑_{i=1}^{m} ∑_{j=1}^{n} yij cij.    (2.3)

The amount sent from factory i is ∑_{j=1}^{n} yij and since the amount available at factory i is si, we must have

∑_{j=1}^{n} yij ≤ si   for i = 1, ..., m.    (2.4)

The amount sent to customer j is ∑_{i=1}^{m} yij and since the amount required there is rj, we must have

∑_{i=1}^{m} yij ≥ rj   for j = 1, ..., n.    (2.5)

It is assumed that we cannot send a negative amount from factory i to customer j, so we also have

yij ≥ 0   for i = 1, ..., m and j = 1, ..., n.    (2.6)

Our problem is to minimise (2.3) subject to (2.4), (2.5) and (2.6).
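The transportation problem is directly solvable with an off-the-shelf LP solver. The following Python sketch (assuming scipy is available; the supply, demand and cost data are made up for illustration) encodes (2.3)-(2.6) for scipy.optimize.linprog, flattening yij row-major into a single variable vector.

import numpy as np
from scipy.optimize import linprog

s = [30, 40]                  # supplies s_i      (illustrative data)
r = [20, 25, 25]              # demands  r_j
c = np.array([[4, 6, 9],      # unit costs c_ij
              [5, 3, 8]])
m, n = c.shape

A_ub, b_ub = [], []
for i in range(m):            # (2.4): sum_j y_ij <= s_i
    row = np.zeros(m * n); row[i*n:(i+1)*n] = 1
    A_ub.append(row); b_ub.append(s[i])
for j in range(n):            # (2.5): sum_i y_ij >= r_j, written as -sum_i y_ij <= -r_j
    row = np.zeros(m * n); row[j::n] = -1
    A_ub.append(row); b_ub.append(-r[j])

res = linprog(c.reshape(-1), A_ub=np.array(A_ub), b_ub=b_ub, bounds=(0, None))  # (2.6)
print(res.fun, res.x.reshape(m, n))   # minimal cost and one optimal shipping plan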

Two classes of problems, the standard maximum problem and the standard minimum problem, play a special role in linear programming. We are given an m-vector b = (b1, ..., bm)^T, an n-vector c = (c1, ..., cn)^T, and an m×n matrix

A = ( a11 a12 ··· a1n
      a21 a22 ··· a2n
       .    .  .    .
      am1 am2 ··· amn )

of real numbers.

The standard maximum problem: Find an n-vector x = (x1, ..., xn)^T that maximises

c^T x = c1x1 + ··· + cnxn

subject to the constraints Ax ≤ b, i.e.

a11x1 + a12x2 + ··· + a1nxn ≤ b1
a21x1 + a22x2 + ··· + a2nxn ≤ b2
...
am1x1 + am2x2 + ··· + amnxn ≤ bm

and x ≥ 0, i.e. x1 ≥ 0, x2 ≥ 0, ..., xn ≥ 0.

The standard minimum problem: Find an m-vector y = (y1, ..., ym)^T that minimises

b^T y = b1y1 + ··· + bmym

subject to the constraints A^T y ≥ c, i.e.

a11y1 + a21y2 + ··· + am1ym ≥ c1
a12y1 + a22y2 + ··· + am2ym ≥ c2
...
a1ny1 + a2ny2 + ··· + amnym ≥ cn

and y ≥ 0, i.e. y1 ≥ 0, y2 ≥ 0, ..., ym ≥ 0.

In a linear programming problem, the function to be maximised or minimised is called the objective function. A vector, e.g. x for the standard maximum problem or y for the standard minimum problem, is said to be feasible if it satisfies the corresponding constraints. The set of feasible vectors is called the constraint set. A linear programming problem is said to be feasible if the constraint set is not empty; otherwise it is said to be infeasible. A feasible maximum (resp. minimum) problem is said to be unbounded if the objective function can assume arbitrarily large positive (resp. negative) values at feasible vectors; otherwise, it is said to be bounded. The value of a bounded feasible maximum (resp. minimum) problem is the maximum (resp. minimum) value of the objective function as the variables range over the constraint set. A feasible vector at which the objective function achieves the value is called optimal.

Proposition 2.4. All linear programming problems can be converted to a standard form. ⊓⊔

Definition 2.21. The dual of the standard maximum problem

    maximise c^T x
    subject to the constraints Ax ≤ b and x ≥ 0

is defined to be the standard minimum problem

    minimise b^T y
    subject to the constraints A^T y ≥ c and y ≥ 0

and vice versa.

Theorem 2.10 (The duality theorem). If a standard linear programming problem is bounded feasible, then so is its dual, their values are equal, and there exist optimal vectors for both problems. ⊓⊔
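The duality theorem can be checked numerically on small instances. The sketch below (again assuming scipy; the data are arbitrary) solves a standard maximum problem and its dual and prints the two optimal values, which coincide as Theorem 2.10 predicts.

import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 2.0], [3.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([3.0, 5.0])

# Primal: maximise c^T x s.t. Ax <= b, x >= 0; linprog minimises, so negate c.
primal = linprog(-c, A_ub=A, b_ub=b, bounds=(0, None))
# Dual: minimise b^T y s.t. A^T y >= c, y >= 0, i.e. -A^T y <= -c.
dual = linprog(b, A_ub=-A.T, b_ub=-c, bounds=(0, None))

print(-primal.fun, dual.fun)   # both print the common optimal value 10.8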

References

1. Billingsley, P.: Probability and Measure. Wiley (1995)
2. Matousek, J.: Lectures on Discrete Geometry. Springer (2002)


3. Sangiorgi, D.: Introduction to Bisimulation and Coinduction. Cambridge University Press (2012)

4. Winskel, G.: The Formal Semantics of Programming Languages: An Introduction. The MIT Press (1993)


Chapter 3
Probabilistic Bisimulation

Abstract We introduce the operational model of probabilistic labelled transition systems, where a state evolves into a distribution after performing some action. To define relations between distributions, we need to lift a relation on states to be a relation on distributions of states. There is a natural lifting operation that nicely corresponds to the Kantorovich metric, a fundamental concept used in mathematics to lift a metric on states to a metric on distributions of states, which is also related to the maximum flow problem in optimisation theory.

The lifting operation yields a neat notion of probabilistic bisimulation, for which we provide logical, metric, and algorithmic characterisations. Specifically, we extend the Hennessy-Milner logic and the modal mu-calculus with a new modality, resulting in an adequate and an expressive logic for probabilistic bisimilarity. The correspondence of the lifting operation and the Kantorovich metric leads to a characterisation of bisimulations as pseudometrics that are postfixed points of a monotone function. Probabilistic bisimilarity also admits both partition refinement and “on the fly” decision algorithms; the latter exploits the close relationship between the lifting operation and the maximum flow problem.

Keywords: Probabilistic labelled transition system; Lifting operation; Probabilisticbisimulation; Modal logic; Algorithm

3.1 Introduction

In recent years, probabilistic constructs have been proven useful for giving quantitative specifications of system behaviour. The first papers on probabilistic concurrency theory [42, 15, 61] proceed by replacing nondeterministic with probabilistic constructs. The reconciliation of nondeterministic and probabilistic constructs starts with [46] and has received a lot of attention in the literature [89, 82, 63, 81, 68, 47, 51, 64, 4, 55, 69, 16, 85, 65, 26, 27, 24, 25]. It could be argued that it is one of the central problems of the area.


We shall also work in a framework that features the co-existence of probability and nondeterminism. More specifically, we deal with probabilistic labelled transition systems (pLTS's) [26] that are an extension of the usual labelled transition systems (LTS's), so that a step of transition is of the form s −a→ ∆, meaning that state s can perform action a and evolve into a distribution ∆ over some successor states. In this setting state s is related to state t by a relation R, say probabilistic simulation, written s R t, if for each transition s −a→ ∆ from s there exists a transition t −a→ Θ from t such that Θ can somehow mimic the behaviour of ∆ according to R. To formalise the mimicking of ∆ by Θ, we have to lift R to be a relation R† between distributions over states so that we can require ∆ R† Θ.

Various approaches of lifting relations have appeared in the literature; see e.g. [60, 82, 26, 21, 25]. We will show that although those approaches appear different, they can be reconciled. Essentially, there is only one lifting operation, which has been presented in different forms. Moreover, we argue that the lifting operation is interesting in itself. This is justified by its intrinsic connection with some fundamental concepts in mathematics, notably the Kantorovich metric [57]. For example, it turns out that our lifting of binary relations from states to distributions nicely corresponds to the lifting of metrics from states to distributions by using the Kantorovich metric. In addition, the lifting operation is closely related to the maximum flow problem in optimisation theory, as observed by Baier et al. [2].

A good scientific concept is often elegant, even when seen from many different perspectives. Among the wealth of behavioural equivalences proposed in traditional concurrency theory during the last three decades, bisimilarity [66, 73] is probably the most studied one, as it admits a suitable semantics, a nice co-inductive proof technique, and efficient decision algorithms. In our opinion, bisimulation is a good concept because it can be characterised in a great many ways, such as fixed-point theory, modal logics, game theory, co-algebras etc., and all the characterisations are very natural. We believe that probabilistic bisimulation is such a concept in probabilistic concurrency theory. As evidence, we will provide in this chapter three characterisations, from the perspectives of modal logics, metrics, and decision algorithms.

1. Our logical characterisation of probabilistic bisimulation consists of two aspects: adequacy and expressivity [75]. A logic L is adequate when two states are bisimilar if and only if they satisfy exactly the same set of formulae in L. The logic is expressive when each state s has a characteristic formula ϕs in L such that t is bisimilar to s if and only if t satisfies ϕs. We will introduce a probabilistic-choice modality to capture the behaviour of distributions. Intuitively, distribution ∆ satisfies the formula ⊕i∈I pi·ϕi if there is a decomposition of ∆ into a convex combination of some distributions, ∆ = ∑i∈I pi·∆i, and each ∆i conforms to the property specified by ϕi. When the new modality is added to the Hennessy-Milner logic [49] we obtain an adequate logic for probabilistic bisimilarity; when it is added to the modal mu-calculus [59] we obtain an expressive logic.


2. By metric characterisation of probabilistic bisimulation, we mean to give a pseudometric¹ such that two states are bisimilar if and only if their distance is 0 when measured by the pseudometric. More specifically, we show that bisimulations correspond to pseudometrics that are postfixed points of a monotone function, and in particular bisimilarity corresponds to a pseudometric that is the greatest fixed point of the monotone function.

3. As to the algorithmic characterisation, we will see that a partition refinement algorithm can be used to check whether two states are bisimilar. We also propose an “on the fly” algorithm that checks whether two states are related by probabilistic bisimilarity. The schema of the algorithm is to approximate probabilistic bisimilarity by iteratively accumulating information about state pairs (s, t) where s and t are not bisimilar. In each iteration we dynamically construct a relation R as an approximant. Then we verify that every transition from one state can be matched by a transition from the other state, and that their resulting distributions are related by the lifted relation R†. The latter involves solving the maximum flow problem of an appropriately constructed network, taking advantage of the close relationship between our lifting operation and the above-mentioned maximum flow problem.

3.2 Probabilistic labelled transition systems

There are various ways of generalising the usual (non-probabilistic) labelled transition systems to a probabilistic setting. Here we choose one that is widely used in the literature.

Definition 3.1. A probabilistic labelled transition system (pLTS) is defined as a triple 〈S, L, →〉, where

1. S is a set of states,
2. L is a set of transition actions,
3. the relation → is a subset of S × L × D(S).

A non-probabilistic LTS may be viewed as a degenerate pLTS — one in which only point distributions are used. As with LTS's, we write s −α→ ∆ in place of (s, α, ∆) ∈ →. If there exists some ∆ with s −α→ ∆, then the predicate s −α→ holds. Similarly, s → holds if there is some α with s −α→. A pLTS is finitely branching if the set {〈α, ∆〉 ∈ L × D(S) | s −α→ ∆} is finite for all states s; if moreover S is finite, then the pLTS is said to be finitary.

¹ We use a pseudometric rather than a proper metric because two distinct states can still be at distance zero if they are bisimilar.


Convention: All the pLTS's considered in this book are assumed to be finitary, unless otherwise stated.

In order to visualise pLTS's, we often draw them as directed graphs. Given that in a pLTS transitions go from states to distributions, we need to introduce additional edges to connect distributions back to states, thereby obtaining a bipartite graph. States are therefore represented by nodes of the form • and distributions by nodes of the form ◦. For any state s and distribution ∆ with s −α→ ∆ we draw an edge from s to ∆, labelled with α. Consequently, the edges leaving a •-node are all labelled with actions from L. For any distribution ∆ and state s in ⌈∆⌉, the support of ∆, we draw an edge from ∆ to s, labelled with ∆(s). Consequently, the edges leaving a ◦-node are labelled with positive real numbers that sum to 1. Sometimes we partially unfold this graph by drawing the same nodes multiple times; in doing so, all outgoing edges of a given instance of a node are always drawn, but not necessarily all incoming edges. Edges labelled by probability 1 occur so frequently that it makes sense to omit them, together with the associated nodes representing point distributions.

Two example pLTS's are described this way in Figure 3.1, where diagram (b) depicts the initial part of the pLTS obtained by unfolding the one in diagram (a).

[Fig. 3.1 Example pLTS's — two bipartite diagrams (not reproduced); diagram (b) partially unfolds diagram (a). Edges are labelled with the actions a and τ and with the probabilities 1/3 and 2/3.]

For each state s, the outgoing transitions s −α→ ∆ represent the nondeterministic alternatives available in the state s. The nondeterministic choices provided by s are supposed to be resolved by the environment, which is formalised by a scheduler or


an adversary. On the other hand, the probabilistic choices in the underlying distribution ∆ are made by the system itself. Therefore, for each state s, the environment chooses some outgoing transition s −α→ ∆. Then the action α is performed, the system resolves the probabilistic choice, and subsequently with probability ∆(s′) the system reaches state s′.

If we impose the constraint that for any state s and action α at most one outgoing transition from s is labelled α, then we obtain the special class of pLTS's called reactive (or deterministic) pLTS's, which are the probabilistic counterpart to deterministic LTS's. Formally, a pLTS is reactive if for each s ∈ S, α ∈ L we have that s −α→ ∆ and s −α→ ∆′ imply ∆ = ∆′.
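Definition 3.1 is easy to represent concretely. The Python sketch below (our own encoding, not taken from any library) stores the transition relation as a set of triples, with distributions as finite-support maps, and checks the reactivity condition just defined.

# A pLTS as a set of triples from S x L x D(S); distributions are given as
# (hashable) tuples of (state, probability) pairs. Names are illustrative.
steps = {
    ('s', 'a', (('s1', 0.5), ('s2', 0.5))),   # s --a--> 1/2 s1 + 1/2 s2
    ('s', 'a', (('s1', 1.0),)),               # a second a-move: nondeterminism
    ('t', 'b', (('t', 1.0),)),                # point distribution: t --b--> t-bar
}

def outgoing(s, a):
    return [dict(d) for (u, b, d) in steps if u == s and b == a]

def is_reactive(steps):
    """Reactive: per state and action, at most one outgoing distribution."""
    seen = {}
    for (s, a, d) in steps:
        if seen.setdefault((s, a), d) != d:
            return False
    return True

print(outgoing('s', 'a'))     # two a-successors of s, hence ...
print(is_reactive(steps))     # ... this pLTS is not reactive: False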

3.3 Lifting relations

In the probabilistic setting, formal systems are usually modelled as distributions over states. To compare two systems involves the comparison of two distributions. So we need a way of lifting relations on states to relations on distributions. This is used, for example, to define probabilistic bisimulation as we shall see in Section 3.5. A few approaches of lifting relations have appeared in the literature. We will take the one from [25], and show its coincidence with two other approaches.

Definition 3.2. Given two sets S and T and a binary relation R ⊆ S × T, the lifted relation R† ⊆ D(S) × D(T) is the smallest relation that satisfies:

(1) s R t implies s̄ R† t̄;
(2) (Linearity) ∆i R† Θi for all i ∈ I implies (∑i∈I pi·∆i) R† (∑i∈I pi·Θi), where I is a finite index set and ∑i∈I pi = 1.

There are alternative presentations of Definition 3.2. The following is used in many proofs.

Proposition 3.1. Let ∆ and Θ be two distributions over S and T, respectively, and R ⊆ S × T. Then ∆ R† Θ if and only if there are two collections of states, {si}i∈I and {ti}i∈I, and a collection of probabilities {pi}i∈I, for some finite index set I, such that ∑i∈I pi = 1 and ∆, Θ can be decomposed as follows:

1. ∆ = ∑i∈I pi·s̄i;
2. Θ = ∑i∈I pi·t̄i;
3. for each i ∈ I we have si R ti.

Proof. (⇐) Suppose we can decompose ∆ and Θ as follows: (i) ∆ = ∑i∈I pi·s̄i, (ii) Θ = ∑i∈I pi·t̄i, and (iii) si R ti for each i ∈ I. By (iii) and the first rule in Definition 3.2, we have s̄i R† t̄i for each i ∈ I. By the second rule in Definition 3.2 we obtain that (∑i∈I pi·s̄i) R† (∑i∈I pi·t̄i), that is ∆ R† Θ.

(⇒) We proceed by rule induction.

• If ∆ R† Θ because ∆ = s̄, Θ = t̄ and s R t, then we can simply take I to be a singleton set {i} with pi = 1, si = s and ti = t.


• If ∆ R† Θ because of the conditions ∆ = ∑i∈I pi·∆i, Θ = ∑i∈I pi·Θi for some index set I, and ∆i R† Θi for each i ∈ I, then by the induction hypothesis there are index sets Ji such that ∆i = ∑j∈Ji pij·s̄ij, Θi = ∑j∈Ji pij·t̄ij, and sij R tij for each i ∈ I and j ∈ Ji. It follows that ∆ = ∑i∈I ∑j∈Ji pi pij·s̄ij, Θ = ∑i∈I ∑j∈Ji pi pij·t̄ij, and sij R tij for each i ∈ I and j ∈ Ji. Therefore, it suffices to take {ij | i ∈ I, j ∈ Ji} to be the index set and {pi pij | i ∈ I, j ∈ Ji} to be the collection of probabilities.
⊓⊔

An important point here is that in the decomposition of ∆ into ∑i∈I pi·s̄i, the states si are not necessarily distinct: that is, the decomposition is not in general unique. Thus when establishing the relationship between ∆ and Θ, a given state s in ∆ may play a number of different roles. This is reflected in the following property.

Proposition 3.2. s̄ R† Θ iff s R t for all t ∈ ⌈Θ⌉. ⊓⊔

The lifting construction satisfies the following useful property.

Proposition 3.3 (Left-decomposable). Suppose R ⊆ S × T and ∑i∈I pi = 1. If (∑i∈I pi·∆i) R† Θ then Θ = ∑i∈I pi·Θi for some set of distributions {Θi}i∈I such that ∆i R† Θi for each i ∈ I.

Proof. Suppose ∆ = ∑i∈I pi·∆i and ∆ R† Θ. We have to find a family of Θi such that

(i) ∆i R† Θi for each i ∈ I;
(ii) Θ = ∑i∈I pi·Θi.

From the alternative characterisation of lifting, Proposition 3.1, we know that

∆ = ∑j∈J qj·s̄j,   sj R tj,   Θ = ∑j∈J qj·t̄j.

Define Θi to be

∑s∈⌈∆i⌉ ∆i(s) · ( ∑{j∈J | s=sj} (qj/∆(s)) · t̄j ).

Note that ∆(s) can be written as ∑{j∈J | s=sj} qj and therefore

∆i = ∑s∈⌈∆i⌉ ∆i(s) · ( ∑{j∈J | s=sj} (qj/∆(s)) · s̄j ).

Since sj R tj this establishes (i) above.

To establish (ii) above let us first abbreviate the sum ∑{j∈J | s=sj} (qj/∆(s))·t̄j to X(s). Then ∑i∈I pi·Θi can be written as


∑s∈⌈∆⌉ ∑i∈I pi·∆i(s)·X(s)
= ∑s∈⌈∆⌉ (∑i∈I pi·∆i(s))·X(s)
= ∑s∈⌈∆⌉ ∆(s)·X(s).

The last equation is justified by the fact that ∆(s) = ∑i∈I pi·∆i(s). Now ∆(s)·X(s) = ∑{j∈J | s=sj} qj·t̄j and therefore we have

∑i∈I pi·Θi = ∑s∈⌈∆⌉ ∑{j∈J | s=sj} qj·t̄j = ∑j∈J qj·t̄j = Θ.
⊓⊔

From Definition 3.2, the next two properties follow. In fact, they are sometimes used in the literature as definitions of lifting relations instead of being properties (see e.g. [82, 60]).

Theorem 3.1. 1. Let ∆ and Θ be distributions over S and T, respectively. Then ∆ R† Θ if and only if there is a probability distribution on S × T, with support a subset of R, such that ∆ and Θ are its marginal distributions. In other words, there exists a weight function w : S × T → [0, 1] such that

a. ∀s ∈ S : ∑t∈T w(s, t) = ∆(s);
b. ∀t ∈ T : ∑s∈S w(s, t) = Θ(t);
c. ∀(s, t) ∈ S × T : w(s, t) > 0 ⇒ s R t.

2. Let ∆ and Θ be distributions over S and let R be an equivalence relation. Then ∆ R† Θ if and only if ∆(C) = Θ(C) for all equivalence classes C ∈ S/R, where ∆(C) stands for the accumulation probability ∑s∈C ∆(s).

Proof. 1. (⇒) Suppose ∆ R† Θ. By Proposition 3.1, we can decompose ∆ and Θ such that ∆ = ∑i∈I pi·s̄i, Θ = ∑i∈I pi·t̄i, and si R ti for all i ∈ I. We define the weight function w by letting

w(s, t) = ∑{pi | si = s, ti = t, i ∈ I}

for any s ∈ S, t ∈ T. This weight function can be checked to meet our requirements.

a. For any s ∈ S, we have

∑t∈T w(s, t) = ∑t∈T ∑{pi | si = s, ti = t, i ∈ I}
             = ∑{pi | si = s, i ∈ I}
             = ∆(s)


b. Similarly, we have ∑s∈S w(s, t) = Θ(t).
c. For any s ∈ S, t ∈ T, if w(s, t) > 0 then there is some i ∈ I such that pi > 0, si = s, and ti = t. It follows from si R ti that s R t.

(⇐) Suppose there is a weight function w satisfying the three conditions in the hypothesis. We construct the index set I = {(s, t) | w(s, t) > 0, s ∈ S, t ∈ T} and probabilities p(s,t) = w(s, t) for each (s, t) ∈ I.

a. We have ∆ = ∑(s,t)∈I p(s,t)·s̄ because, for any s′ ∈ S,

(∑(s,t)∈I p(s,t)·s̄)(s′) = ∑{w(s′, t) | (s′, t) ∈ I}
                        = ∑{w(s′, t) | w(s′, t) > 0, t ∈ T}
                        = ∑{w(s′, t) | t ∈ T}
                        = ∆(s′)

b. Similarly, we have Θ = ∑(s,t)∈I p(s,t)·t̄.
c. For each (s, t) ∈ I, we have w(s, t) > 0, which implies s R t.

Hence, the above decompositions of ∆ and Θ meet the requirement of the lifting ∆ R† Θ.

2. (⇒) Suppose ∆ R† Θ. By Proposition 3.1, we can decompose ∆ and Θ such that ∆ = ∑i∈I pi·s̄i, Θ = ∑i∈I pi·t̄i, and si R ti for all i ∈ I. For any equivalence class C ∈ S/R, we have that

∆(C) = ∑s∈C ∆(s) = ∑s∈C ∑{pi | i ∈ I, si = s}
     = ∑{pi | i ∈ I, si ∈ C}
     = ∑{pi | i ∈ I, ti ∈ C}
     = Θ(C)

where the equality in the third line is justified by the fact that si ∈ C iff ti ∈ C, since si R ti and C ∈ S/R.

(⇐) Suppose ∆(C) = Θ(C) for each equivalence class C ∈ S/R. We construct the index set I = {(s, t) | s R t and s, t ∈ S} and probabilities p(s,t) = ∆(s)·Θ(t)/∆([s]) for each (s, t) ∈ I, where [s] stands for the equivalence class that contains s.

a. We have ∆ = ∑(s,t)∈I p(s,t)·s̄ because, for any s′ ∈ S,

(∑(s,t)∈I p(s,t)·s̄)(s′) = ∑(s′,t)∈I p(s′,t)
                        = ∑{∆(s′)·Θ(t)/∆([s′]) | s′ R t, t ∈ S}
                        = ∑{∆(s′)·Θ(t)/∆([s′]) | t ∈ [s′]}
                        = (∆(s′)/∆([s′])) · ∑{Θ(t) | t ∈ [s′]}
                        = (∆(s′)/∆([s′])) · Θ([s′])
                        = (∆(s′)/∆([s′])) · ∆([s′])
                        = ∆(s′)

b. Similarly, we have Θ = ∑(s,t)∈I p(s,t)·t̄.


c. For each (s, t) ∈ I, we have s R t.

Hence, the above decompositions of ∆ and Θ meet the requirement of the lifting ∆ R† Θ. ⊓⊔

The lifting operation given in Definition 3.2 is monotone and also preserves the transitivity of the relation being lifted. Moreover, it is distributive with respect to the composition of relations.

Proposition 3.4. 1. If R1 ⊆ R2 then R1† ⊆ R2†.
2. (R1·R2)† = R1†·R2†.
3. If R is a transitive relation, then so is R†.

Proof. 1. By Definition 3.2, it is straightforward to show that if ∆1 R1† ∆2 and R1 ⊆ R2 then ∆1 R2† ∆2.

2. We first show that (R1·R2)† ⊆ R1†·R2†. Suppose there are two distributions ∆1, ∆2 such that ∆1 (R1·R2)† ∆2. Then we have that

∆1 = ∑i∈I pi·s̄i,   si R1·R2 ti,   ∆2 = ∑i∈I pi·t̄i.    (3.1)

The middle part of (3.1) implies the existence of some states s′i such that si R1 s′i and s′i R2 ti. Let Θ be the distribution ∑i∈I pi·s̄′i. It is clear that ∆1 R1† Θ and Θ R2† ∆2. It follows that ∆1 R1†·R2† ∆2.

Then we show the inverse inclusion R1†·R2† ⊆ (R1·R2)†. Given any three distributions ∆1, ∆2 and ∆3, we show that if ∆1 R1† ∆2 and ∆2 R2† ∆3 then ∆1 (R1·R2)† ∆3.

First, ∆1 R1† ∆2 means that

∆1 = ∑i∈I pi·s̄i,   si R1 s′i,   ∆2 = ∑i∈I pi·s̄′i;    (3.2)

also ∆2 R2† ∆3 means that

∆2 = ∑j∈J qj·t̄′j,   t′j R2 tj,   ∆3 = ∑j∈J qj·t̄j;    (3.3)

and we can assume without loss of generality that all the coefficients pi, qj are non-zero. Now define Ij = {i ∈ I | s′i = t′j} and Ji = {j ∈ J | t′j = s′i}, so that trivially

{(i, j) | i ∈ I, j ∈ Ji} = {(i, j) | j ∈ J, i ∈ Ij}    (3.4)

and note that

∆2(s′i) = ∑j∈Ji qj   and   ∆2(t′j) = ∑i∈Ij pi.    (3.5)


[Fig. 3.2 The transportation problem (illustration not reproduced)]

Because of (3.5) we have

∆1 = ∑i∈I pi·s̄i = ∑i∈I pi · ∑j∈Ji (qj/∆2(s′i))·s̄i
   = ∑i∈I ∑j∈Ji (pi·qj/∆2(s′i))·s̄i.

Similarly,

∆3 = ∑j∈J qj·t̄j = ∑j∈J qj · ∑i∈Ij (pi/∆2(t′j))·t̄j
   = ∑j∈J ∑i∈Ij (pi·qj/∆2(t′j))·t̄j
   = ∑i∈I ∑j∈Ji (pi·qj/∆2(t′j))·t̄j    by (3.4).

Now for each j in Ji we know that in fact t′j = s′i, and so from the middle parts of (3.2) and (3.3), we obtain ∆1 (R1·R2)† ∆3.

3. By Clause 2 above we have R†·R† = (R·R)† for any relation R. If R is transitive, then R·R ⊆ R. By Clause 1 above, we obtain (R·R)† ⊆ R†. It follows that R†·R† ⊆ R†, thus R† is transitive.
⊓⊔

3.4 Justifying the lifting operation

The lifting operation given in Definition 3.2 is not only concise but also intrinsically related to some fundamental concepts in mathematics, notably the Kantorovich metric.

3.4.1 Justification by the Kantorovich metric

We begin with some historical notes. The transportation problem plays an important role in linear programming due to its general formulation and methods of solution.


The original transportation problem, formulated by the French mathematician G. Monge in 1781 [67], consists of finding an optimal way of shovelling a pile of sand into a hole of the same volume; see Figure 3.2. In the 1940s, the Russian mathematician and economist L.V. Kantorovich, who was awarded a Nobel prize in economics in 1975 for the theory of optimal allocation of resources, gave a relaxed formulation of the problem and proposed a variational principle for solving it [57]. Unfortunately, Kantorovich's work went unrecognised for a long time. The later-known Kantorovich metric has appeared in the literature under different names, because it has been rediscovered historically several times from different perspectives. Many metrics known in measure theory, ergodic theory, functional analysis, statistics, etc. are special cases of the general definition of the Kantorovich metric [87]. The elegance of the formulation, the fundamental character of the optimality criterion, as well as the wealth of applications that keep arising, place the Kantorovich metric in a prominent position among the mathematical works of the 20th century. In addition, this formulation can be computed in polynomial time [71], which is an appealing feature for its use in solving applied problems. For example, it is widely used to solve a variety of problems in business and economy such as market distribution, plant location, scheduling problems etc. In recent years the metric attracted the attention of computer scientists [22]: it has been used in various different areas in computer science such as probabilistic concurrency, image retrieval, data mining, bioinformatics, etc.

Roughly speaking, the Kantorovich metric provides a way of measuring the distance between two distributions. Of course, this requires first a notion of distance between the basic elements that are aggregated into the distributions, which is often referred to as the ground distance. In other words, the Kantorovich metric defines a “lifted” distance between two distributions of mass in a space that is itself endowed with a ground distance. There is a host of metrics available in the literature (see e.g. [43]) to quantify the distance between probability measures; see [78] for a comprehensive review of metrics in the space of probability measures. The Kantorovich metric has an elegant formulation and a natural interpretation in terms of the transportation problem.

We now recall the mathematical definition of the Kantorovich metric. Let (X, m) be a separable metric space. (This condition will be used by Theorem 3.2 below.)

Definition 3.3. Given any two Borel probability measures ∆ and Θ on X, the Kantorovich distance between ∆ and Θ is defined by

K(∆, Θ) = sup{ |∫ f d∆ − ∫ f dΘ| : ||f|| ≤ 1 },

where ||·|| is the Lipschitz semi-norm defined by ||f|| = sup{x≠y} |f(x) − f(y)|/m(x, y) for a function f : X → R.

The Kantorovich metric has an alternative characterisation. We denote by P(X) the set of all Borel probability measures ∆ on X such that ∫X m(x, z) d∆(x) < ∞ for all z ∈ X. We write M(∆, Θ) for the set of all Borel probability measures on the product space X × X with marginal measures ∆ and Θ, i.e. if Γ ∈ M(∆, Θ) then ∫y∈X dΓ(x, y) = d∆(x) and ∫x∈X dΓ(x, y) = dΘ(y) hold.

Definition 3.4. For ∆, Θ ∈ P(X), we define the metric L as follows:

L(∆, Θ) = inf{ ∫ m(x, y) dΓ(x, y) | Γ ∈ M(∆, Θ) }.

Lemma 3.1. If (X, m) is a separable metric space then K and L are metrics on P(X). ⊓⊔

The famous Kantorovich-Rubinstein duality theorem gives a dual representation of K in terms of L.

Theorem 3.2 (Kantorovich-Rubinstein [58]). If (X, m) is a separable metric space then for any two distributions ∆, Θ ∈ P(X) we have K(∆, Θ) = L(∆, Θ). ⊓⊔

In view of the above theorem, many papers in the literature directly take Definition 3.4 as the definition of the Kantorovich metric. Here we keep the original definition, but it is helpful to understand K by using L. Intuitively, a probability measure Γ ∈ M(∆, Θ) can be understood as a transportation from one unit mass distribution ∆ to another unit mass distribution Θ. If the distance m(x, y) represents the cost of moving one unit of mass from location x to location y then the Kantorovich distance gives the optimal total cost of transporting the mass of ∆ to Θ. We refer the reader to [88] for an excellent exposition on the Kantorovich metric and the duality theorem.

Many problems in computer science only involve finite state spaces, so discrete distributions with finite supports are sometimes more interesting than continuous distributions. For two discrete distributions ∆ and Θ with finite supports {x1, ..., xn} and {y1, ..., yl}, respectively, minimising the total cost of a discretised version of the transportation problem reduces to the following linear programming problem:

minimise ∑_{i=1}^{n} ∑_{j=1}^{l} m(xi, yj)·Γ(xi, yj)
subject to
• ∀1 ≤ i ≤ n : ∑_{j=1}^{l} Γ(xi, yj) = ∆(xi)
• ∀1 ≤ j ≤ l : ∑_{i=1}^{n} Γ(xi, yj) = Θ(yj)
• ∀1 ≤ i ≤ n, 1 ≤ j ≤ l : Γ(xi, yj) ≥ 0.    (3.6)

Since (3.6) is a special case of the discrete mass transportation problem, some well-known polynomial time algorithms like [71] can be employed to solve it, which is an attractive feature for computer scientists.

Recall that a pseudometric is a function that yields a nonnegative real number for each pair of elements and satisfies the following: m(s, s) = 0, m(s, t) = m(t, s), and m(s, t) ≤ m(s, u) + m(u, t), for any s, t, u ∈ S. We say a pseudometric m is 1-bounded if m(s, t) ≤ 1 for any s and t. Let ∆ and Θ be distributions over a finite set S of states. In [9] a 1-bounded pseudometric m on S is lifted to be a 1-bounded pseudometric m̂ on D(S) by setting the distance m̂(∆, Θ) to be the value of the following linear programming problem:


maximise ∑s∈S (∆(s) − Θ(s))·xs
subject to
• ∀s, t ∈ S : xs − xt ≤ m(s, t)
• ∀s ∈ S : 0 ≤ xs ≤ 1.    (3.7)

This problem can be dualised and then simplified to yield the following problem:

minimise ∑s,t∈S m(s, t)·yst
subject to
• ∀s ∈ S : ∑t∈S yst = ∆(s)
• ∀t ∈ S : ∑s∈S yst = Θ(t)
• ∀s, t ∈ S : yst ≥ 0.    (3.8)

Now (3.8) is in exactly the same form as (3.6). This way of lifting pseudometrics via the Kantorovich metric, as given in (3.8), has an interesting connection with the lifting of binary relations given in Definition 3.2.

Theorem 3.3. Let R be a binary relation and m a pseudometric on a state space S satisfying

s R t iff m(s, t) = 0    (3.9)

for any s, t ∈ S. Then it holds that

∆ R† Θ iff m̂(∆, Θ) = 0

for any distributions ∆, Θ ∈ D(S).

Proof. Suppose ∆ R† Θ. From Theorem 3.1(1) we know there is a weight function w such that

1. ∀s ∈ S : ∑t∈S w(s, t) = ∆(s);
2. ∀t ∈ S : ∑s∈S w(s, t) = Θ(t);
3. ∀s, t ∈ S : w(s, t) > 0 ⇒ s R t.

By substituting w(s, t) for yst in (3.8), the three constraints there can be satisfied. For any s, t ∈ S we distinguish two cases:

1. either w(s, t) = 0,
2. or w(s, t) > 0. In this case we have s R t, which implies m(s, t) = 0 by (3.9).

Therefore, we always have m(s, t)·w(s, t) = 0 for any s, t ∈ S. Consequently, we get ∑s,t∈S m(s, t)·w(s, t) = 0 and the optimal value of the problem in (3.8) must be 0, i.e. m̂(∆, Θ) = 0, and the optimal solution is determined by w.

The above reasoning can be reversed to show that the optimal solution of (3.8) determines a weight function, thus m̂(∆, Θ) = 0 implies ∆ R† Θ. ⊓⊔

The above property will be used in Section 3.7 to give a metric characterisation of probabilistic bisimulation (cf. Theorem 3.11).

In the remainder of this subsection we do a sanity check and show that (3.7) and (3.8) do entail the same metric on D(S). Given a metric m on S, we write m⋆ for the metric lifted by using the linear programming problem in (3.8).


Proposition 3.5. Let m be a metric over S. Then m̂ is a metric over D(S).

Proof. We verify that (D(S), m̂) satisfies the definition of pseudometric space.

1. It is clear that m̂(∆, ∆) = 0.
2. We observe that

∑s∈S (∆(s) − ∆′(s))·xs = ∑s∈S (∆′(s) − ∆(s))·(1 − xs) + ∑s∈S ∆(s) − ∑s∈S ∆′(s)
                       = ∑s∈S (∆′(s) − ∆(s))·(1 − xs).

Now x′s = 1 − xs also satisfies the constraints on xs in (3.7), hence the symmetry of m̂ can be shown.

3. Let ∆1, ∆2, ∆3 ∈ D(S). We have

∑s∈S (∆1(s) − ∆3(s))·xs = ∑s∈S (∆1(s) − ∆2(s))·xs + ∑s∈S (∆2(s) − ∆3(s))·xs.

By taking the maximum over the xs for the left-hand side, we obtain

m̂(∆1, ∆3) ≤ m̂(∆1, ∆2) + m̂(∆2, ∆3).
⊓⊔

Proposition 3.6. Let m be a metric over S. Then m⋆ is a metric over D(S).

Proof. We verify that (D(S), m⋆) satisfies the definition of pseudometric space.

1. It is easy to see that m⋆(∆, ∆) = 0, by letting ys,s = ∆(s) and ys,t = 0 for all s ≠ t in (3.8).

2. Suppose m⋆(∆, ∆′) is obtained by using some real numbers ys,t in (3.8). Let y′s,t = yt,s for all s, t ∈ S. We have that

∑s,t∈S m(s, t)·ys,t = ∑s,t∈S m(s, t)·y′t,s = ∑t,s∈S m(t, s)·y′t,s.

Observe that ∑s∈S y′t,s = ∑s∈S ys,t = ∆′(t) for all t ∈ S, and similarly we have the dual ∑t∈S y′t,s = ∆(s) for all s ∈ S. It follows that m⋆(∆, ∆′) = m⋆(∆′, ∆).

3. Suppose m⋆(∆1, ∆2) is obtained by using some real numbers xs,t in (3.8), and m⋆(∆2, ∆3) is obtained by using ys,t. Let zs,t = ∑r∈⌈∆2⌉ xs,r·yr,t/∆2(r). Note that if ∆2(r) = 0 for some r ∈ S, then we must have that xs,r = yr,t = 0 for any s, t ∈ S. For any s ∈ S, we have that

∑t∈S zs,t = ∑t∈S ∑r∈⌈∆2⌉ xs,r·yr,t/∆2(r)
          = ∑r∈⌈∆2⌉ (xs,r/∆2(r))·(∑t∈S yr,t)
          = ∑r∈⌈∆2⌉ (xs,r/∆2(r))·∆2(r)
          = ∑r∈⌈∆2⌉ xs,r
          = ∑r∈S xs,r
          = ∆1(s)


and for any t ∈ S, we have that

∑s∈S zs,t = ∑s∈S ∑r∈⌈∆2⌉ xs,r·yr,t/∆2(r)
          = ∑r∈⌈∆2⌉ (∑s∈S xs,r)·yr,t/∆2(r)
          = ∑r∈⌈∆2⌉ ∆2(r)·yr,t/∆2(r)
          = ∑r∈⌈∆2⌉ yr,t
          = ∑r∈S yr,t
          = ∆3(t)

Therefore, the real numbers zs,t satisfy the constraints in (3.8) and we obtain that

m⋆(∆1, ∆3) ≤ ∑s,t∈S m(s, t)·zs,t
          = ∑s,t∈S ∑r∈⌈∆2⌉ m(s, t)·xs,r·yr,t/∆2(r)
          ≤ ∑s,t∈S ∑r∈⌈∆2⌉ (m(s, r) + m(r, t))·xs,r·yr,t/∆2(r)
          = ∑s,t∈S ∑r∈⌈∆2⌉ m(s, r)·xs,r·yr,t/∆2(r) + ∑s,t∈S ∑r∈⌈∆2⌉ m(r, t)·xs,r·yr,t/∆2(r)
          = ∑s∈S ∑r∈⌈∆2⌉ m(s, r)·xs,r·(∑t∈S yr,t)/∆2(r) + ∑s,t∈S ∑r∈⌈∆2⌉ m(r, t)·xs,r·yr,t/∆2(r)
          = ∑s∈S ∑r∈⌈∆2⌉ m(s, r)·xs,r·∆2(r)/∆2(r) + ∑s,t∈S ∑r∈⌈∆2⌉ m(r, t)·xs,r·yr,t/∆2(r)
          = ∑s∈S ∑r∈S m(s, r)·xs,r + ∑s,t∈S ∑r∈⌈∆2⌉ m(r, t)·xs,r·yr,t/∆2(r)
          = m⋆(∆1, ∆2) + ∑t∈S ∑r∈⌈∆2⌉ m(r, t)·(∑s∈S xs,r)·yr,t/∆2(r)
          = m⋆(∆1, ∆2) + ∑t∈S ∑r∈⌈∆2⌉ m(r, t)·∆2(r)·yr,t/∆2(r)
          = m⋆(∆1, ∆2) + ∑t∈S ∑r∈S m(r, t)·yr,t
          = m⋆(∆1, ∆2) + m⋆(∆2, ∆3)
⊓⊔

Proposition 3.7. m̂ coincides with m⋆.

Proof. The problem in (3.7) can be rewritten in the following form:

maximise ∑s∈S ∆(s)·xs − ∑s∈S ∆′(s)·ys
subject to
• ∀s, t ∈ S : xs − yt ≤ m(s, t)
• ∀s ∈ S : ys − xs ≤ 0
• ∀s ∈ S : xs ≤ 1, ys ≤ 1
• ∀s ∈ S : xs ≥ 0, ys ≥ 0.    (3.10)

Dualising (3.10) yields:


minimise ∑s,t∈S m(s, t)·zs,t + ∑s∈S (αs + βs)
subject to
• ∀s ∈ S : ∑t∈S zs,t − αs + βs ≥ ∆(s)
• ∀s ∈ S : ∑t∈S zt,s − αs − βs ≤ ∆′(s)
• ∀s, t ∈ S : zs,t ≥ 0, αs ≥ 0, βs ≥ 0.    (3.11)

The duality theorem of linear programming (Theorem 2.10) tells us that the dual problem has the same value as the primal problem. We now observe that in the optimal vector of (3.11) it must be the case that αs = βs = 0 for all s ∈ S. In addition, we have ∑s∈S ∆(s) = ∑s∈S ∆′(s) = 1. So we can simplify (3.11) to (3.8). ⊓⊔

3.4.2 Justification by network flow

The lifting operation discussed in Section 3.3 is also related to the maximum flow problem in optimisation theory. This was already observed by Baier et al. in [2].

We briefly recall the basic definitions of networks. More details can be found in e.g. [35]. A network is a tuple N = (N, E, s⊥, s⊤, c) where (N, E) is a finite directed graph (i.e. N is a set of nodes and E ⊆ N × N is a set of edges) with two special nodes s⊥ (the source) and s⊤ (the sink) and a capacity c, i.e. a function that assigns to each edge (v, w) ∈ E a nonnegative number c(v, w). A flow function f for N is a function that assigns to each edge e a real number f(e) such that

• 0 ≤ f(e) ≤ c(e) for all edges e.
• Let in(v) be the set of incoming edges to node v and out(v) the set of outgoing edges from node v. Then, for each node v ∈ N\{s⊥, s⊤},

∑e∈in(v) f(e) = ∑e∈out(v) f(e).

The flow F(f) of f is given by

F(f) = ∑e∈out(s⊥) f(e) − ∑e∈in(s⊥) f(e).

The maximum flow in N is the supremum (maximum) over the flows F(f), where f is a flow function in N.

We will see that the question whether ∆ R† Θ holds can be reduced to a maximum flow problem in a suitably chosen network. Suppose R ⊆ S × S and ∆, Θ ∈ D(S). Let S′ = {s′ | s ∈ S} where the s′ are pairwise distinct new states, i.e. (·)′ : S → S′ is a bijective function. We create two states s⊥ and s⊤ not contained in S ∪ S′ with s⊥ ≠ s⊤. We associate with the pair (∆, Θ) the following network N(∆, Θ, R).

• The nodes are N = S ∪ S′ ∪ {s⊥, s⊤}.
• The edges are E = {(s, t′) | (s, t) ∈ R} ∪ {(s⊥, s) | s ∈ S} ∪ {(s′, s⊤) | s ∈ S}.
• The capacity c is defined by c(s⊥, s) = ∆(s), c(t′, s⊤) = Θ(t) and c(s, t′) = 1 for all s, t ∈ S.


[Fig. 3.3 The network N(∆, Θ, R) — the source s⊥ has an edge of capacity ∆(si) to each node si; there is an edge of capacity cij = 1 from si to s′j whenever (si, sj) ∈ R; and each node s′i has an edge of capacity Θ(si) to the sink s⊤ (diagram not reproduced).]

The network is depicted in Figure 3.3. The next lemma appeared as Lemma 5.1 in [2].

Lemma 3.2. Let S be a finite set, ∆, Θ ∈ D(S) and R ⊆ S × S. The following statements are equivalent.

(i) There exists a weight function w for (∆, Θ) with respect to R.
(ii) The maximum flow in N(∆, Θ, R) is 1.

Proof. (i) ⇒ (ii): Let w be a weight function for (∆, Θ) with respect to R. We define a flow function f as follows: f(s⊥, s) = ∆(s), f(t′, s⊤) = Θ(t) and moreover f(s, t′) = w(s, t) for all s, t ∈ S. Then F(f) = ∑s∈S f(s⊥, s) = ∑s∈S ∆(s) = 1. For each outgoing edge (s⊥, s) from node s⊥, its maximum capacity is reached, so the maximum flow of N(∆, Θ, R) is 1.

(ii) ⇒ (i): Let f be a flow function such that F(f) = 1. We observe that f(s⊥, s) ≤ c(s⊥, s) = ∆(s) and

∑s∈S f(s⊥, s) = F(f) = 1 = ∑s∈S ∆(s),

so it must be the case that f(s⊥, s) = ∆(s) for all s ∈ S. Similarly, we have the dual f(t′, s⊤) = Θ(t) for all t ∈ S. Let w be the weight function defined by letting w(s, t) = f(s, t′) for all (s, t) ∈ R and w(s, t) = 0 if (s, t) ∉ R. We can check that

∑t∈S w(s, t) = ∑t∈S f(s, t′) = f(s⊥, s) = ∆(s)

and similarly, ∑s∈S w(s, t) = Θ(t). So w is a weight function for (∆, Θ) with respect to R. ⊓⊔


Since the lifting operation given in Definition 3.2 can also be stated in terms of weight functions, we obtain the following characterisation using network flow.

Theorem 3.4. Let S be a finite set, ∆, Θ ∈ D(S) and R ⊆ S × S. Then ∆ R† Θ if and only if the maximum flow in N(∆, Θ, R) is 1.

Proof. Combine Theorem 3.1(1) and Lemma 3.2. ⊓⊔
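Theorem 3.4 yields a simple decision procedure for the lifting. The sketch below (in Python, assuming the networkx library for the maximum-flow computation; the relation and distributions are made-up examples) builds the network N(∆, Θ, R) of Figure 3.3 and tests whether the maximum flow is 1.

import networkx as nx

def lifted(delta, theta, R):
    """Decide Delta R-dagger Theta via the max flow of N(Delta, Theta, R)."""
    g = nx.DiGraph()
    for s, p in delta.items():
        g.add_edge('src', ('L', s), capacity=p)        # c(s_bot, s) = Delta(s)
    for t, q in theta.items():
        g.add_edge(('R', t), 'snk', capacity=q)        # c(t', s_top) = Theta(t)
    for (s, t) in R:
        g.add_edge(('L', s), ('R', t), capacity=1.0)   # c(s, t') = 1
    return abs(nx.maximum_flow_value(g, 'src', 'snk') - 1.0) < 1e-9

R = {('s', 's'), ('s', 't'), ('t', 't')}               # some relation on {s, t}
print(lifted({'s': 0.5, 't': 0.5}, {'s': 0.25, 't': 0.75}, R))   # True
print(lifted({'s': 0.5, 't': 0.5}, {'s': 0.75, 't': 0.25}, R))   # False: t only relates to t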

The above property will play an important role in Section 3.8.2 to give an “on the fly” algorithm for checking probabilistic bisimilarity.

Besides Theorems 3.1 and 3.4, there are other equivalent ways of lifting relations. Given a binary relation R ⊆ S × T and a set A ⊆ S, we write R(A) for the set {t ∈ T | ∃s ∈ A : s R t}. A set A is R-closed if R(A) ⊆ A.

Theorem 3.5. Let ∆ and Θ be distributions over finite sets S and T, respectively.

1. ∆ R† Θ if and only if ∆(A) ≤ Θ(R(A)) for all A ⊆ S.
2. If R is a preorder, then ∆ R† Θ if and only if ∆(A) ≤ Θ(A) for each R-closed set A ⊆ S.

Proof. 1. (⇒) Since ∆ R† Θ, by Proposition 3.1 we can decompose ∆ and Θ as follows:

∆ = ∑i∈I pi·s̄i,   si R ti,   Θ = ∑i∈I pi·t̄i.

Note that {si}i∈I = ⌈∆⌉. We define an index set J := {i ∈ I | si ∈ A}. Then ∆(A) = ∑j∈J pj. For each j ∈ J we have sj R tj, i.e. tj ∈ R({sj}). It follows that {tj}j∈J ⊆ R(A). Therefore, we can infer that

∆(A) = ∆({sj}j∈J) = ∑j∈J pj ≤ Θ({tj}j∈J) ≤ Θ(R(A)).

(⇐) In view of Theorem 3.4 it suffices to show that the maximum flow of the network N(∆, Θ, R) is 1. According to the Maximum Flow Minimum Cut Theorem by Ford and Fulkerson [40], the maximum flow equals the capacity of a minimal cut. We show that {s⊥} is a minimal cut of capacity 1. Clearly this cut has capacity 1. To see the minimality, let C ≠ {s⊥} be some minimal cut, with capacity c(C) = ∑{c(v, w) | (v, w) ∈ E, v ∈ C, w ∉ C}. If the cut C involves an edge of capacity 1, i.e. (s, t) ∈ R with s ∈ C and t′ ∉ C, then {s⊥} is a cut of smaller or equal capacity since its capacity is 1. Let B = C ∩ S; we can thus assume that if s ∈ B then {t′ | t ∈ R({s})} ⊆ C. Hence the capacity of C is c(C) = ∆(S\B) + Θ(R(B)). Since ∆(B) ≤ Θ(R(B)), we have

c(C) ≥ ∆(S\B) + ∆(B) = ∆(S) = 1.

Therefore, the capacity of C is greater than or equal to 1, which means that the minimum cut has capacity 1.

2. When R is a preorder, we can show that the following two conditions are equivalent: (i) ∆(A) ≤ Θ(R(A)) for all A ⊆ S; (ii) ∆(A) ≤ Θ(A) for each R-closed set A ⊆ S. For one direction, suppose (i) holds and A is an R-closed set. Then R(A) ⊆ A, and thus Θ(R(A)) ≤ Θ(A). Combining this with (i) we see that ∆(A) ≤ Θ(A). For the other direction, suppose (ii) holds and pick any A ⊆ S. Since R is reflexive, we have A ⊆ R(A), and thus ∆(A) ≤ ∆(R(A)). By the transitivity of R we know that R(A) is R-closed. By (ii) we get ∆(R(A)) ≤ Θ(R(A)). Combining the previous two inequalities, we obtain ∆(A) ≤ Θ(R(A)).
⊓⊔

Remark 3.1. Note that in Theorem 3.5(2) it is important to require R to be a preorder, which is used in showing the equivalence of the two conditions (i) and (ii) in the above proof. If R is not a preorder, the implication from (ii) to (i) is invalid in general. For example, let S = {s, t}, R = {(s, t)}, ∆ = 1/2·s̄ + 1/2·t̄ and Θ = 1/3·s̄ + 2/3·t̄. There are only two non-empty R-closed sets: {t} and S. Clearly, we have both ∆({t}) ≤ Θ({t}) and ∆(S) ≤ Θ(S). However,

∆({t}) = 1/2 ≰ 0 = Θ(∅) = Θ(R({t})).

Note also that Theorem 3.5 can be generalised to countable state spaces. The proof is almost the same except that the Maximum Flow Minimum Cut Theorem for countable networks [1] is used. See [79] for more details.
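For small S, the condition of Theorem 3.5(1) can be checked by brute force over all subsets. The following Python sketch replays the example of Remark 3.1 and exposes the failing witness set A = {t}.

from itertools import chain, combinations

S = ['s', 't']
R = {('s', 't')}
Delta = {'s': 0.5, 't': 0.5}
Theta = {'s': 1/3, 't': 2/3}

def mu(d, A):    return sum(d[x] for x in A)            # measure of a set
def image(R, A): return {t for (s, t) in R if s in A}   # R(A)

subsets = chain.from_iterable(combinations(S, k) for k in range(len(S) + 1))
for A in map(set, subsets):
    ok = mu(Delta, A) <= mu(Theta, image(R, A)) + 1e-9
    print(sorted(A), ok)    # A = {'t'} fails: Delta({t}) = 1/2 > 0 = Theta(R({t}))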

3.5 Probabilistic bisimulation

With a solid base of the lifting operation, we can proceed to define a probabilistic version of bisimulation.

Let s and t be two states in a pLTS. We say t can simulate the behaviour of s if whenever the latter can exhibit action a and lead to distribution ∆ then the former can also perform a and lead to a distribution, say Θ, which then in turn can mimic ∆ in successor states. We are interested in a relation between two states, but it is expressed by invoking a relation between two distributions. To formalise the mimicking of one distribution by the other, we make use of the lifting operation investigated in Section 3.3.

Definition 3.5. A relation R ⊆ S × S is a probabilistic simulation if s R t implies

• if s −a→ ∆ then there exists some Θ such that t −a→ Θ and ∆ R† Θ.

If both R and R⁻¹ are probabilistic simulations, then R is a probabilistic bisimulation. The largest probabilistic bisimulation, denoted by ∼, is called probabilistic bisimilarity.

As in the non-probabilistic setting, probabilistic bisimilarity can be approximated by a family of inductively defined relations.

Definition 3.6. Let S be the state set of a pLTS. We define:

• ∼0 := S × S;
• s ∼n+1 t, for n ≥ 0, if

1. whenever s −a→ ∆, there exists some Θ such that t −a→ Θ and ∆ (∼n)† Θ;
2. whenever t −a→ Θ, there exists some ∆ such that s −a→ ∆ and ∆ (∼n)† Θ;

• ∼ω := ⋂n≥0 ∼n.

In general, ∼ is a strictly finer relation than ∼ω. However, the two relations coincide when limited to image-finite pLTS's, where for any state s and action a the set {∆ ∈ D(S) | s −a→ ∆} is finite.

Proposition 3.8. On image-finite pLTS's, ∼ω coincides with ∼.

Proof. It is trivial to show by induction that s ∼ t implies s ∼n t for all n ≥ 0, and thus that s ∼ω t.

Now we show that ∼ω is a bisimulation. Suppose s ∼ω t and s −a→ ∆. By assumption, for all n ≥ 0 there exists some Θn with t −a→ Θn and ∆ (∼n)† Θn. Since we are considering image-finite pLTS's, there are only finitely many different Θn's. Then for at least one of them, say Θ, we have ∆ (∼n)† Θ for infinitely many different n's. By a straightforward induction we can show that s ∼n t implies s ∼m t for all m, n ≥ 0 with n > m. It follows that ∆ (∼n)† Θ for all n ≥ 0. We now claim that

∆ (∼ω)† Θ.    (3.12)

To see this, let A be any subset of the (possibly countable) state space. For any n ≥ 0, since ∆ (∼n)† Θ we know from Theorem 3.5(1) that ∆(A) ≤ Θ(∼n(A)). Therefore,

∆(A) ≤ inf_{n≥0} Θ(∼n(A)) = Θ((⋂n≥0 ∼n)(A)) = Θ(∼ω(A)).

Using Theorem 3.5(1) again, we obtain (3.12).

By symmetry we also have that if t −a→ Θ then there is some ∆ with s −a→ ∆ and ∆ (∼ω)† Θ. ⊓⊔

Let ≺ be the largest probabilistic simulation, called probabilistic similarity, and ≍ the kernel of probabilistic similarity, i.e. ≺ ∩ ≺⁻¹, called simulation equivalence. In general, simulation equivalence is coarser than bisimilarity. However, for reactive pLTS's, the two relations do coincide. Recall that in a reactive pLTS, for each state s and action a there is at most one distribution ∆ with s −a→ ∆. To prove that result, we need a technical lemma.

Lemma 3.3. Let R be a preorder on a set S and ∆, Θ ∈ D(S). If ∆ R† Θ and Θ R† ∆ then ∆(C) = Θ(C) for all equivalence classes C with respect to the kernel R ∩ R⁻¹ of R.

Proof. Let us write ≡ for R ∩ R⁻¹. For any s ∈ S, let [s]≡ be the equivalence class that contains s. Let As be the set {t ∈ S | s R t ∧ ¬(t R s)}. It is easy to see that

R(s) = {t ∈ S | s R t} = {t ∈ S | s R t ∧ t R s} ⊎ {t ∈ S | s R t ∧ ¬(t R s)} = [s]≡ ⊎ As

where ⊎ stands for a disjoint union. Therefore, we have

∆(R(s)) = ∆([s]≡) + ∆(As)   and   Θ(R(s)) = Θ([s]≡) + Θ(As).    (3.13)

We now check that both R(s) and As are R-closed sets, that is R(R(s)) ⊆ R(s) and R(As) ⊆ As. Suppose u ∈ R(R(s)). Then there exists some t ∈ R(s) such that t R u, which means that s R t and t R u. As a preorder, R is a transitive relation. So we have s R u, which implies u ∈ R(s). Therefore we can conclude that R(R(s)) ⊆ R(s).

Suppose u ∈ R(As). Then there exists some t ∈ As such that t R u, which means that s R t, ¬(t R s) and t R u. By transitivity again, s R u. Note that we also have ¬(u R s). Otherwise we would have u R s, which means, together with t R u and the transitivity of R, that t R s, a contradiction to the hypothesis ¬(t R s). It then follows that u ∈ As, and we conclude that R(As) ⊆ As.

We have verified that R(s) and As are R-closed sets. Since ∆ R† Θ and Θ R† ∆, we apply Theorem 3.5(2) and obtain that ∆(R(s)) ≤ Θ(R(s)) and Θ(R(s)) ≤ ∆(R(s)), that is

∆(R(s)) = Θ(R(s)).    (3.14)

Similarly, using the fact that As is R-closed we obtain that

∆(As) = Θ(As).    (3.15)

It follows from (3.13)-(3.15) that

∆([s]≡) = Θ([s]≡)

as desired. ⊓⊔

Remark 3.2. Note that in the above proof the equivalence classes [s]≡ are not necessarily R-closed. For example, let S = {s, t}, IdS = {(s, s), (t, t)} and the relation R = IdS ∪ {(s, t)}. The kernel of R is ≡ = R ∩ R⁻¹ = IdS and then [s]≡ = {s}. We have R(s) = S ⊈ [s]≡. So a more direct attempt to apply Theorem 3.5(2) to those equivalence classes would not work.

Theorem 3.6. For reactive pLTS's, ≍ coincides with ∼.

Proof. It is obvious that ∼ is included in ≍. For the other direction, we show that ≍ is a bisimulation. Let s, t ∈ S and s ≍ t. Suppose that s −a→ ∆. There exists a transition t −a→ Θ with ∆ (≺)† Θ. Since we are considering reactive probabilistic systems, the transition t −a→ Θ from t must be matched by the unique outgoing transition s −a→ ∆ from s, with Θ (≺)† ∆. Note that by using Proposition 3.4 it is easy to show that ≺ is a preorder on S. It follows from Lemma 3.3 that ∆(C) = Θ(C) for any C ∈ S/≍. By Theorem 3.1(2) we see that ∆ (≍)† Θ. Therefore, ≍ is indeed a probabilistic bisimulation relation. ⊓⊔


3.6 Logical characterisation

Let L be a logic and 〈S, L, →〉 be a pLTS. For any s ∈ S, we use the notation L(s) to stand for the set of formulae that state s satisfies. This induces an equivalence relation on states: s =L t iff L(s) = L(t). Thus, two states are equivalent when they satisfy exactly the same set of formulae.

In this section we consider two kinds of logical characterisations of probabilistic bisimilarity.

Definition 3.7 (Adequacy and expressivity).

1. L is adequate with respect to ∼ if for any states s and t,

s =L t iff s ∼ t.

2. L is expressive with respect to ∼ if for each state s there exists a characteristic formula ϕs ∈ L such that, for any state t,

t satisfies ϕs iff s ∼ t.

We will propose a probabilistic extension of the Hennessy-Milner logic, showing its adequacy, and then a probabilistic extension of the modal mu-calculus, showing its expressivity. In general the latter is more expressive than the former because it has fixed-point operators to describe infinite behaviour. But for finite processes where no infinite behaviour occurs, an adequate logic will also be expressive for those processes.

3.6.1 An adequate logic

We extend the Hennessy-Milner logic by adding a probabilistic-choice modality to express the behaviour of distributions.

Definition 3.8. The class Lhm of modal formulae over L, ranged over by ϕ, is defined by the following grammar:

ϕ := ⊤ | ϕ1 ∧ ϕ2 | ¬ϕ | 〈a〉ψ
ψ := ϕ1 p⊕ ϕ2

where p ∈ [0, 1]. We call ϕ a state formula and ψ a distribution formula. Note that a distribution formula ψ only appears as the continuation of a diamond modality 〈a〉ψ. We sometimes use the finite conjunction ⋀i∈I ϕi as syntactic sugar.

The satisfaction relation |= ⊆ S × Lhm is defined by

• s |= ⊤ for all s ∈ S.
• s |= ϕ1 ∧ ϕ2 if s |= ϕi for i = 1, 2.
• s |= ¬ϕ if it is not the case that s |= ϕ.


• s |= 〈a〉ψ if for some ∆ ∈ D(S), s −a→ ∆ and ∆ |= ψ.
• ∆ |= ϕ1 p⊕ ϕ2 if there are ∆1, ∆2 ∈ D(S) with ∆ = p·∆1 + (1−p)·∆2 and, for each i = 1, 2 and t ∈ ⌈∆i⌉, we have t |= ϕi.

With a slight abuse of notation, we write ∆ |= ψ above to mean that ∆ satisfies the distribution formula ψ. For any ϕ ∈ Lhm, the set of states that satisfy ϕ is denoted by [[ϕ]], i.e. [[ϕ]] = {s ∈ S | s |= ϕ}.

Example 3.1. Let ∆ be a distribution and ϕ be a state formula. We consider the distribution formula

ψ := ϕ p⊕ ⊤

where p = ∆([[ϕ]]). It is not difficult to see that ∆ |= ψ holds. In the case p = 0 the formula ψ degenerates to ⊤. In the case p > 0 the distribution ∆ can be decomposed as follows:

∆ = ∑s∈S ∆(s)·s̄ = p · ∑s∈[[ϕ]] (∆(s)/p)·s̄ + (1−p) · ∑s∉[[ϕ]] (∆(s)/(1−p))·s̄.

So ∆ can be written as the linear combination of two distributions; the first one has as support all the states that satisfy ϕ. In general, for any Θ ∈ D(S) we have Θ |= ψ if and only if Θ([[ϕ]]) ≥ p.

It turns out that Lhm is adequate with respect to probabilistic bisimilarity.

Theorem 3.7 (Adequacy). Let s and t be any two states in a finite-state pLTS. Then s ∼ t if and only if s =Lhm t.

Proof. (⇒) Suppose s ∼ t. We show that s |= ϕ ⇔ t |= ϕ by structural induction on ϕ.

• Let s |= ⊤. Then we clearly have t |= ⊤.
• Let s |= ϕ1 ∧ ϕ2. Then s |= ϕi for i = 1, 2. So by induction t |= ϕi, and we have t |= ϕ1 ∧ ϕ2. By symmetry, t |= ϕ1 ∧ ϕ2 also implies s |= ϕ1 ∧ ϕ2.
• Let s |= ¬ϕ. So s ⊭ ϕ, and by induction we have t ⊭ ϕ. Thus t |= ¬ϕ. By symmetry, t ⊭ ϕ also implies s ⊭ ϕ.
• Let s |= 〈a〉(ϕ1 p⊕ ϕ2). Then s −a→ ∆ and ∆ |= ϕ1 p⊕ ϕ2 for some ∆. So we have ∆ = p·∆1 + (1−p)·∆2 and, for all i = 1, 2 and s′ ∈ ⌈∆i⌉, s′ |= ϕi. Since s ∼ t, there is some Θ with t −a→ Θ and ∆ ∼† Θ. By Proposition 3.3 we have that Θ = p·Θ1 + (1−p)·Θ2 and ∆i ∼† Θi for i = 1, 2. It follows that for each t′ ∈ ⌈Θi⌉ there is some s′ ∈ ⌈∆i⌉ with s′ ∼ t′. So by induction we have t′ |= ϕi for all t′ ∈ ⌈Θi⌉ with i = 1, 2. Therefore, we have Θ |= ϕ1 p⊕ ϕ2. It follows that t |= 〈a〉(ϕ1 p⊕ ϕ2). By symmetry, t |= 〈a〉(ϕ1 p⊕ ϕ2) also implies s |= 〈a〉(ϕ1 p⊕ ϕ2).

(⇐) We show that the relation =Lhm is a probabilistic bisimulation. Obviously =Lhm is an equivalence relation. Let E = {Ui | i ∈ I} be the set of all equivalence classes of =Lhm. We first claim that, for any equivalence class Ui, there exists a formula ϕi satisfying [[ϕi]] = Ui. This can be proved as follows:


• If E contains only one equivalence class U1, then U1 = S. So we can take the required formula to be ⊤ because [[⊤]] = S.
• If E contains more than one equivalence class, then for any i, j ∈ I with i ≠ j, there exists a formula ϕij such that si |= ϕij and sj ⊭ ϕij for any si ∈ Ui and sj ∈ Uj. Otherwise, for any formula ϕ, si |= ϕ implies sj |= ϕ. Since negation exists in the logic Lhm, we also have that si |= ¬ϕ implies sj |= ¬ϕ, which means sj |= ϕ implies si |= ϕ. Then si |= ϕ ⇔ sj |= ϕ for any ϕ ∈ Lhm, which contradicts the fact that si and sj are taken from different equivalence classes. For each index i ∈ I, define ϕi = ⋀j≠i ϕij; then by construction [[ϕi]] = Ui. Let us check the last equality. On the one hand, if s′ ∈ [[ϕi]], then s′ |= ϕi, which means that s′ |= ϕij for all j ≠ i. That is, s′ ∉ Uj for all j ≠ i, and this in turn implies that s′ ∈ Ui. On the other hand, if s′ ∈ Ui then s′ |= ϕi as si |= ϕi, which means that s′ ∈ [[ϕi]].

This completes the proof of the claim that for each equivalence class Ui we can find a formula ϕi with [[ϕi]] = Ui.

Now let s =Lhm t and s −a→ ∆. By Theorem 3.1(2) it remains to show that t −a→ Θ for some Θ with

∆(Ui) = Θ(Ui) for any i ∈ I.    (3.16)

Let

ϕ := 〈a〉 ⋀i∈I (ϕi pi⊕ ⊤)

where pi = ∆([[ϕi]]). It is easy to see that s |= ϕ, which implies that t |= ϕ. Therefore, there exists a distribution Θ with t −a→ Θ and Θ |= ⋀i∈I (ϕi pi⊕ ⊤). Then for each i ∈ I we have Θ |= ϕi pi⊕ ⊤, implying that Θ(Ui) = Θ([[ϕi]]) ≥ pi = ∆([[ϕi]]) = ∆(Ui). Note that ∑i∈I pi = 1. Thus we have ∆(Ui) = Θ(Ui) for each i ∈ I, the goal set in (3.16).

By symmetry each transition of t can be matched by some transition of s. ⊓⊔

The above theorem still holds if the pLTS has a countable state space but is image-finite. The proof is a bit subtle; see [50] for more details.

When restricted to reactive pLTS's, probabilistic bisimulations can be characterised by simpler forms of logics [60, 30, 74] or a simple test language [7]. Most notably, in the absence of nondeterminism, there is no need of negation to characterise probabilistic bisimulations.

Let us fix a reactive pLTS 〈S, L, →〉 where the state space S may be countable. Let Lrhm be the sublogic of Lhm generated by the following grammar:

ϕ := ⊤ | ϕ1 ∧ ϕ2 | 〈a〉(ϕ1 p⊕ ⊤)

where p is a rational number in the unit interval [0, 1]. Recall that s |= 〈a〉(ϕ p⊕ ⊤) iff s −a→ ∆ and ∆([[ϕ]]) ≥ p (cf. Example 3.1). The logic above induces a logical equivalence relation =Lrhm between states.

The following lemma says that the transition probabilities to sets of the form [[ϕ]] are completely determined by the formulae. It has appeared as Lemma 7.7.6 in [80].


Lemma 3.4. If s =Lrhm t and s a−→ ∆, then some Θ exists with t a−→ Θ, and for any formula ϕ ∈ Lrhm we have ∆([[ϕ]]) = Θ([[ϕ]]).

Proof. First of all, the existence of Θ is obvious, because otherwise the formula 〈a〉(⊤ 1⊕ ⊤) would be satisfied by s but not by t.

Let us assume, without loss of generality, that there exists a formula ϕ such that ∆([[ϕ]]) < Θ([[ϕ]]). Then we can always squeeze in a rational number p with ∆([[ϕ]]) < p ≤ Θ([[ϕ]]). It follows that t |= 〈a〉(ϕ p⊕ ⊤) but s ⊭ 〈a〉(ϕ p⊕ ⊤), which contradicts the hypothesis that s =Lrhm t. ⊓⊔

We will show that the logic Lrhm can characterise bisimulation for reactive pLTS's. The completeness proof of the characterisation crucially relies on the π-λ theorem (cf. Theorem 2.9). The next proposition is a typical application of that theorem [34]; it tells us that when two probability distributions agree on a π-class they also agree on the generated σ-algebra.

Proposition 3.9. Let A0 = {[[ϕ]] | ϕ ∈ Lrhm} and A = σ(A0). For any two distributions ∆, Θ ∈ D(S), if ∆(A) = Θ(A) for any A ∈ A0, then ∆(B) = Θ(B) for any B ∈ A.

Proof. Let X = {A ∈ A | ∆(A) = Θ(A)}. Then X is closed under countable disjoint unions because probability distributions are σ-additive. Since ∆ and Θ are distributions, we have ∆(S) = Θ(S) = 1. It follows that if A ∈ X then ∆(S\A) = ∆(S) − ∆(A) = Θ(S) − Θ(A) = Θ(S\A), i.e. S\A ∈ X. Thus X is closed under complementation as well. It follows that X is a λ-class. Note that A0 is a π-class in view of the equation [[ϕ1 ∧ ϕ2]] = [[ϕ1]] ∩ [[ϕ2]]. Since A0 ⊆ X, we can apply the π-λ theorem to obtain that A = σ(A0) ⊆ X ⊆ A, i.e. A = X. Therefore, ∆(B) = Θ(B) for any B ∈ A. ⊓⊔

The following theorem appeared in [28]; it is obtained by simplifying the results of [30] to reactive pLTS's with countable state spaces.

Theorem 3.8. Let s and t be two states in a reactive pLTS. Then s ∼ t iff s =Lrhm t.

Proof. The proof of soundness is carried out by a routine induction on the structure of formulae. Below we focus on the completeness. It suffices to show that =Lrhm is a bisimulation. Note that =Lrhm is clearly an equivalence relation. For any u ∈ S, the equivalence class in S/=Lrhm that contains u is

[u] = ⋂{[[ϕ]] | u |= ϕ} ∩ ⋂{S\[[ϕ]] | u ⊭ ϕ}.    (3.17)

In (3.17) only countable intersections are used because the set of all the formulae in the logic Lrhm is countable. Let A0 be defined as in Proposition 3.9. Then each equivalence class of S/=Lrhm is a member of σ(A0).

On the other hand, s =Lrhm t and s a−→ ∆ imply that some distribution Θ exists with t a−→ Θ and, for any ϕ ∈ Lrhm, ∆([[ϕ]]) = Θ([[ϕ]]), by Lemma 3.4. Thus by Proposition 3.9 we have


∆([u]) = Θ([u])    (3.18)

where [u] is any equivalence class of S/=Lrhm. Then it follows from Theorem 3.1(2) that ∆ (=Lrhm)† Θ. Symmetrically, any transition of t can be mimicked by a transition from s. Therefore, the relation =Lrhm is a bisimulation. ⊓⊔

Theorem 3.8 tells us that Lrhm can characterise bisimulation for reactive pLTS's, and this logic has neither negation nor infinite conjunction. Moreover, the above result holds for general reactive pLTS's that may have countable state spaces and are not necessarily finitely branching.

3.6.2 An expressive logic

In this section we add the probabilistic-choice modality introduced in Section 3.6.1 to the modal mu-calculus and show that the resulting probabilistic mu-calculus is expressive with respect to probabilistic bisimilarity.

3.6.2.1 Probabilistic modal mu-calculus

Let Var be a countable set of variables. We define a set Lµ of modal formulae in positive normal form by the following grammar:

ϕ := ⊤ | ⊥ | 〈a〉ψ | [a]ψ | ϕ1 ∧ ϕ2 | ϕ1 ∨ ϕ2 | X | µX.ϕ | νX.ϕ
ψ := ⊕i∈I pi · ϕi

where a ∈ L, I is a finite index set and ∑i∈I pi = 1. Here we still write ϕ for a state formula and ψ for a distribution formula. Sometimes we also use the finite conjunction ∧i∈I ϕi and disjunction ∨i∈I ϕi. As usual, we have ∧i∈∅ ϕi = ⊤ and ∨i∈∅ ϕi = ⊥.

The two fixed-point operators µX and νX bind the respective variable X. We apply the usual terminology of free and bound variables in a formula and write fv(ϕ) for the set of free variables in ϕ.

We use environments, which bind free variables to sets of states, in order to give semantics to formulae. We fix a finitary pLTS and let S be its state set. Let

Env = {ρ | ρ : Var → P(S)}

be the set of all environments, ranged over by ρ. For a set V ⊆ S and a variable X ∈ Var, we write ρ[X 7→ V] for the environment that maps X to V and Y to ρ(Y) for all Y ≠ X.

The semantics of a formula ϕ can be given as the set of states satisfying it. This entails a semantic function [[·]] : Lµ → Env → P(S), defined inductively in Figure 3.4, where we also apply [[·]] to distribution formulae; [[ψ]] is interpreted as the set of distributions that satisfy ψ. As the meaning of a closed formula ϕ does not depend on the environment, we write [[ϕ]] for [[ϕ]]ρ, where ρ is an arbitrary environment.

[[⊤]]ρ = S
[[⊥]]ρ = ∅
[[ϕ1 ∧ ϕ2]]ρ = [[ϕ1]]ρ ∩ [[ϕ2]]ρ
[[ϕ1 ∨ ϕ2]]ρ = [[ϕ1]]ρ ∪ [[ϕ2]]ρ
[[〈a〉ψ]]ρ = {s ∈ S | ∃∆ : s a−→ ∆ ∧ ∆ ∈ [[ψ]]ρ}
[[[a]ψ]]ρ = {s ∈ S | ∀∆ : s a−→ ∆ ⇒ ∆ ∈ [[ψ]]ρ}
[[X]]ρ = ρ(X)
[[µX.ϕ]]ρ = ⋂{V ⊆ S | [[ϕ]]ρ[X 7→V] ⊆ V}
[[νX.ϕ]]ρ = ⋃{V ⊆ S | [[ϕ]]ρ[X 7→V] ⊇ V}
[[⊕i∈I pi · ϕi]]ρ = {∆ ∈ D(S) | ∆ = ⊕i∈I pi · ∆i ∧ ∀i ∈ I, ∀t ∈ ⌈∆i⌉ : t ∈ [[ϕi]]ρ}

Fig. 3.4 Semantics of the probabilistic modal mu-calculus

The semantics of the probabilistic modal mu-calculus is the same as that of the modal mu-calculus [59], except for the probabilistic-choice modality, which is used to represent decompositions of distributions. The characterisation of the least fixed-point formula µX.ϕ and the greatest fixed-point formula νX.ϕ follows from the well-known Knaster-Tarski fixed-point theorem [84] (cf. Theorem 2.1).
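On a finite state set the two fixed points can be computed by plain iteration, since a monotone function on a finite lattice reaches its least (greatest) fixed point by iterating from the bottom (top) element. The sketch below illustrates this; the function sem, standing for V 7→ [[ϕ]]ρ[X 7→V], and the toy example are assumptions of the sketch.

# A minimal sketch of fixed-point evaluation on a finite state set S,
# assuming sem(V) computes [[phi]] in the environment rho[X -> V];
# sem must be monotone, as it is for formulae in positive normal form.

def lfp(sem, S):
    """[[muX.phi]]: iterate from the empty set up to the least fixed point."""
    V = set()
    while True:
        W = sem(V)
        if W == V:
            return V
        V = W

def gfp(sem, S):
    """[[nuX.phi]]: iterate from S down to the greatest fixed point."""
    V = set(S)
    while True:
        W = sem(V)
        if W == V:
            return V
        V = W

# Example: on S = {0,1,2}, sem(V) = "0, or a successor of something in V";
# its least fixed point is all of S.
S = {0, 1, 2}
sem = lambda V: {0} | {s + 1 for s in V if s + 1 in S}
assert lfp(sem, S) == {0, 1, 2}
assert gfp(sem, S) == {0, 1, 2}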

We shall consider (closed) equation systems of formulae of the form

E : Xs1 = ϕ1
    ...
    Xn = ϕn

where X1, ..., Xn are mutually distinct variables and ϕ1, ..., ϕn are formulae having at most X1, ..., Xn as free variables. Here E can be viewed as a function E : Var → Lµ defined by E(Xi) = ϕi for i = 1, ..., n and E(Y) = Y for other variables Y ∈ Var.

An environment ρ is a solution of an equation system E if ∀i : ρ(Xi) = [[ϕi]]ρ. The existence of solutions for an equation system can be seen from the following arguments. The set Env, which includes all candidates for solutions, together with the partial order ≤ defined by

ρ ≤ ρ′ iff ∀X ∈ Var : ρ(X) ⊆ ρ′(X)

forms a complete lattice. The equation function E : Env → Env, given in λ-calculus notation by

E := λρ.λX.[[E(X)]]ρ,

is monotone. Thus the Knaster-Tarski fixed-point theorem guarantees the existence of solutions; the largest solution is

ρE := ⊔{ρ | ρ ≤ E(ρ)}.


3.6.2.2 Characteristic equation systems

As studied in [83], the behaviour of a process can be characterised by an equation system of modal formulae. Below we show that this idea also applies in the probabilistic setting.

Definition 3.9. Given a finitary pLTS, its characteristic equation system consists of one equation for each of the states s1, ..., sn ∈ S:

E : Xs1 = ϕs1
    ...
    Xsn = ϕsn

where

ϕs := (∧s a−→∆ 〈a〉X∆) ∧ (∧a∈L [a] ∨s a−→∆ X∆)    (3.19)

with X∆ := ⊕s∈⌈∆⌉ ∆(s) · Xs.
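To make the shape of these equations concrete, the sketch below builds the right-hand side ϕs of (3.19) as a syntax tree for a given state. The tuple encoding of formulae and the dictionary encoding of the pLTS are assumptions of the sketch, mirroring the grammar of Lµ.

# Sketch: build phi_s of (3.19) as a syntax tree. Formulae are tuples;
# steps[(s, a)] = [Delta, ...] encodes a finitary pLTS, with each Delta
# a dict from states to probabilities.

def conj(fs):
    """Finite conjunction; the empty conjunction is top."""
    return ('top',) if not fs else fs[0] if len(fs) == 1 else ('and', fs[0], conj(fs[1:]))

def disj(fs):
    """Finite disjunction; the empty disjunction is bottom."""
    return ('bot',) if not fs else fs[0] if len(fs) == 1 else ('or', fs[0], disj(fs[1:]))

def x_delta(delta):
    """X_Delta := (+)_{s in support of Delta} Delta(s) . X_s."""
    return ('psum', tuple((p, ('var', s)) for s, p in delta.items()))

def phi(s, labels, steps):
    """(3.19): one diamond per transition of s, plus one box per action."""
    diamonds = [('dia', a, x_delta(d))
                for a in labels for d in steps.get((s, a), [])]
    boxes = [('box', a, disj([x_delta(d) for d in steps.get((s, a), [])]))
             for a in labels]
    return ('and', conj(diamonds), conj(boxes))

Note how an action a that s cannot perform contributes the conjunct [a]⊥, exactly the case used in the proof of Theorem 3.9 below.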

Theorem 3.9. Suppose E is a characteristic equation system. Then s ∼ t if and only if t ∈ ρE(Xs).

Proof. (⇐) Let R = {(s, t) | t ∈ ρE(Xs)}. We first show that

Θ ∈ [[X∆]]ρE implies ∆ R† Θ.    (3.20)

Let ∆ = ⊕i∈I pi · si; then X∆ = ⊕i∈I pi · Xsi. Suppose Θ ∈ [[X∆]]ρE. We have that Θ = ⊕i∈I pi · Θi and, for all i ∈ I and t′ ∈ ⌈Θi⌉, that t′ ∈ [[Xsi]]ρE, i.e. si R t′. It follows that si R† Θi and thus ∆ R† Θ.

Now we show that R is a bisimulation.

1. Suppose s R t and s a−→ ∆. Then t ∈ ρE(Xs) = [[ϕs]]ρE. It follows from (3.19) that t ∈ [[〈a〉X∆]]ρE. So there exists some Θ such that t a−→ Θ and Θ ∈ [[X∆]]ρE. Now we apply (3.20).

2. Suppose s R t and t a−→ Θ. Then t ∈ ρE(Xs) = [[ϕs]]ρE. It follows from (3.19) that t ∈ [[[a] ∨s a−→∆ X∆]]ρE. Notice that it must be the case that s can enable action a, since otherwise t ∈ [[[a]⊥]]ρE and thus t could not enable a either, in contradiction with the assumption t a−→ Θ. Therefore, Θ ∈ [[∨s a−→∆ X∆]]ρE, which implies Θ ∈ [[X∆]]ρE for some ∆ with s a−→ ∆. Now we apply (3.20).

(⇒) We define the environment ρ∼ by

ρ∼(Xs) := {t | s ∼ t}.

It suffices to show that ρ∼ is a postfixed point of E, i.e.

ρ∼ ≤ E(ρ∼),    (3.21)


because in that case we have ρ∼ ≤ ρE, and thus s ∼ t implies t ∈ ρ∼(Xs), which in turn implies t ∈ ρE(Xs).

We first show that

∆ ∼† Θ implies Θ ∈ [[X∆]]ρ∼.    (3.22)

Suppose that ∆ ∼† Θ. Using Proposition 3.1 we have that (i) ∆ = ⊕i∈I pi · si, (ii) Θ = ⊕i∈I pi · ti, and (iii) si ∼ ti for all i ∈ I. We know from (iii) that ti ∈ [[Xsi]]ρ∼. Using (ii) we have that Θ ∈ [[⊕i∈I pi · Xsi]]ρ∼. Using (i) we obtain Θ ∈ [[X∆]]ρ∼.

Now we are in a position to show (3.21). Suppose t ∈ ρ∼(Xs). We must prove that t ∈ [[ϕs]]ρ∼, i.e.

t ∈ (⋂s a−→∆ [[〈a〉X∆]]ρ∼) ∩ (⋂a∈L [[[a] ∨s a−→∆ X∆]]ρ∼)

by (3.19). This can be done by showing that t belongs to each of the two parts of this intersection.

1. In the first case, we assume that s a−→ ∆. Since s ∼ t, there exists some Θ such that t a−→ Θ and ∆ ∼† Θ. By (3.22), we get Θ ∈ [[X∆]]ρ∼. It follows that t ∈ [[〈a〉X∆]]ρ∼.

2. In the second case, we suppose t a−→ Θ for an action a ∈ L and a distribution Θ. Then by s ∼ t there exists some ∆ such that s a−→ ∆ and ∆ ∼† Θ. By (3.22), we get Θ ∈ [[X∆]]ρ∼. As a consequence, t ∈ [[[a] ∨s a−→∆ X∆]]ρ∼. Since this holds for an arbitrary action a, our desired result follows.

⊓⊔

3.6.2.3 Characteristic formulae

So far we know how to construct the characteristic equation system for a finitary pLTS. As introduced in [70], the three transformation rules in Figure 3.5 can be used to obtain from an equation system E a formula whose interpretation coincides with the interpretation of X1 in the greatest solution of E. The formula thus obtained from a characteristic equation system is called a characteristic formula.

Theorem 3.10. Given a characteristic equation system E, there is a characteristic formula ϕs such that ρE(Xs) = [[ϕs]] for any state s. ⊓⊔

The above theorem, together with the results in Section 3.6.2.2, gives rise to the following corollary.

Corollary 3.1. For each state s in a finitary pLTS, there is a characteristic formula ϕs such that s ∼ t iff t ∈ [[ϕs]]. ⊓⊔


1. Rule 1: E → F
2. Rule 2: E → G
3. Rule 3: E → H if Xn ∉ fv(ϕ1, ..., ϕn)

E : X1 = ϕ1          F : X1 = ϕ1          G : X1 = ϕ1[ϕn/Xn]          H : X1 = ϕ1
    ...                  ...                  ...                          ...
    Xn−1 = ϕn−1          Xn−1 = ϕn−1          Xn−1 = ϕn−1[ϕn/Xn]          Xn−1 = ϕn−1
    Xn = ϕn              Xn = νXn.ϕn          Xn = ϕn

Fig. 3.5 Transformation rules

3.7 Metric characterisation

In the bisimulation game, probabilities are treated as labels, since they are matched only when they are identical. One might argue that this does not provide a robust relation: processes that differ with a very small probability, for instance, would be considered just as different as processes that perform completely different actions. This is particularly relevant to security systems, where specifications can be given as perfect but impractical processes, and other, practical processes are considered safe if they differ from the specification only with a negligible probability.

To find a more flexible way to differentiate processes, we borrow from pure mathematics the notion of metric² and measure the difference between two processes that are not quite bisimilar. Since different processes may behave the same, they will be given distance zero in our metric semantics. So we are more interested in pseudometrics than metrics.

In the rest of this section, we fix a finite-state pLTS (S, L, −→) and provide the set of pseudometrics on S with the following partial order.

Definition 3.10. The relation ⊑ on the set M of 1-bounded pseudometrics on S is defined by

m1 ⊑ m2 if ∀s, t : m1(s, t) ≥ m2(s, t).

Here we reverse the ordering with the purpose of characterising bisimilarity by a greatest fixed point (cf. Corollary 3.2).

Lemma 3.5. (M, ⊑) is a complete lattice.

Proof. The top element ⊤ is given by ∀s, t : ⊤(s, t) = 0; the bottom element ⊥ is given by ⊥(s, t) = 1 if s ≠ t, and 0 otherwise. Greatest lower bounds are given by

(⊓X)(s, t) = sup{m(s, t) | m ∈ X}

for any X ⊆ M. Finally, least upper bounds are given by

⊔X = ⊓{m ∈ M | ∀m′ ∈ X : m′ ⊑ m}. ⊓⊔

² For simplicity, in this section we use the term metric to denote both metrics and pseudometrics. All the results are based on pseudometrics.

In order to define the notion of state-metrics (which will correspond to bisimulations) and the monotone transformation on metrics, we need to lift metrics from S to D(S). Here we use the lifting operation based on the Kantorovich metric [57] on probability measures (cf. Section 3.4.1), which has been used by van Breugel and Worrell for defining metrics on fully probabilistic systems [9] and reactive probabilistic systems [10], and by Desharnais et al. for labelled Markov chains [32] and labelled concurrent Markov chains [33].

Definition 3.11. m ∈ M is a state-metric if, for all ε ∈ [0,1), m(s, t) ≤ ε implies:

• if s a−→ ∆ then there exists some ∆′ such that t a−→ ∆′ and m̂(∆, ∆′) ≤ ε.

Note that the condition is in fact symmetric: if m is a state-metric then by m(s, t) ≤ ε we also have m(t, s) ≤ ε, which implies

• if t a−→ ∆′ then there exists some ∆ such that s a−→ ∆ and m̂(∆′, ∆) ≤ ε.

In the above definition, we prohibit ε from being 1 because we use 1 to represent the distance between any two incomparable states, including the case where one state may perform a transition and the other may not.

The greatest state-metric is defined as

mmax = ⊔{m ∈ M | m is a state-metric}.

It turns out that state-metrics correspond to bisimulations and the greatest state-metric corresponds to bisimilarity. To make the analogy closer, in what follows we will characterise mmax as a fixed point of a suitable monotone function on M. First we recall the definition of the Hausdorff distance.

Definition 3.12. Given a 1-bounded metric d on Z, the Hausdorff distance between two subsets X, Y of Z is defined as follows:

Hd(X, Y) = max{ supx∈X infy∈Y d(x, y), supy∈Y infx∈X d(y, x) }

where inf ∅ = 1 and sup ∅ = 0.

Next we define a function F on M by using the Hausdorff distance.

Definition 3.13. Let der(s, a) = {∆ | s a−→ ∆}. For any m ∈ M, F(m) is a pseudometric given by

F(m)(s, t) = maxa∈L Hm̂(der(s, a), der(t, a)).

Thus we have the following property.

Lemma 3.6. For all ε ∈ [0,1), F(m)(s, t) ≤ ε if and only if:

• if s a−→ ∆ then there exists some ∆′ such that t a−→ ∆′ and m̂(∆, ∆′) ≤ ε;
• if t a−→ ∆′ then there exists some ∆ such that s a−→ ∆ and m̂(∆′, ∆) ≤ ε. ⊓⊔

The above lemma can be proved by directly checking the definition of F, as can the next lemma.

Lemma 3.7. m is a state-metric iff m ⊑ F(m). ⊓⊔

Consequently we have the following characterisation:

mmax = ⊔{m ∈ M | m ⊑ F(m)}.

Lemma 3.8. F is monotone on M. ⊓⊔

Because of Lemmas 3.5 and 3.8, we can apply Theorem 2.1, which tells us that mmax is the greatest fixed point of F. Furthermore, by Lemma 3.7 we know that mmax is indeed a state-metric, and it is the greatest state-metric.

In addition, if our pLTS is image-finite, i.e. for all a ∈ L, s ∈ S the set der(s, a) is finite, the closure ordinal of F is ω. Therefore one can proceed in a standard way to show that

mmax = ⊓{Fⁱ(⊤) | i ∈ N}

where ⊤ is the top metric in M and F⁰(⊤) = ⊤.
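This iteration can be carried out directly on small finite pLTS's. The following sketch is an illustration only, under assumptions not fixed by the text: the pLTS is encoded as a dictionary, the lifted metric m̂ is computed as a Kantorovich transportation problem with scipy's linprog routine, and the iteration stops when two successive metrics agree up to a numerical tolerance (appropriate since the closure ordinal is ω, so the fixed point is in general only reached in the limit).

# Illustrative computation of mmax on a finite pLTS. Assumed encoding:
# steps[(s, a)] = [Delta, ...] with Delta a dict from states to probabilities.
from itertools import product
from scipy.optimize import linprog

def lift(m, states, D1, D2):
    """Kantorovich lifting of m: minimise sum w(s,t)*m(s,t) over couplings
    w whose marginals are D1 and D2."""
    n = len(states)
    cost = [m[s, t] for s, t in product(states, states)]
    A_eq, b_eq = [], []
    for i, s in enumerate(states):          # row marginals equal D1
        row = [0.0] * (n * n)
        for j in range(n):
            row[i * n + j] = 1.0
        A_eq.append(row); b_eq.append(D1.get(s, 0.0))
    for j, t in enumerate(states):          # column marginals equal D2
        row = [0.0] * (n * n)
        for i in range(n):
            row[i * n + j] = 1.0
        A_eq.append(row); b_eq.append(D2.get(t, 0.0))
    return linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1)).fun

def hausdorff(dist, X, Y):
    """Hausdorff distance of Definition 3.12, with inf {} = 1 and sup {} = 0."""
    d1 = max((min((dist(A, B) for B in Y), default=1.0) for A in X), default=0.0)
    d2 = max((min((dist(B, A) for A in X), default=1.0) for B in Y), default=0.0)
    return max(d1, d2)

def F(m, states, labels, steps):
    """F(m)(s,t): max over actions of the Hausdorff distance between
    der(s,a) and der(t,a), measured with the lifted metric."""
    lifted = lambda D1, D2: lift(m, states, D1, D2)
    return {(s, t): max(hausdorff(lifted, steps.get((s, a), []),
                                  steps.get((t, a), [])) for a in labels)
            for s, t in product(states, states)}

def mmax(states, labels, steps, eps=1e-9):
    m = {(s, t): 0.0 for s, t in product(states, states)}   # the top metric
    while True:
        m2 = F(m, states, labels, steps)
        if all(abs(m[k] - m2[k]) <= eps for k in m):
            return m2
        m = m2

By Corollary 3.2 below, on a finite-state pLTS two states are bisimilar exactly when the distance computed this way is 0 (up to the tolerance).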

Lemma 3.9. For image-finite pLTS's, the closure ordinal of F is ω.

Proof. Let mmax(s, t) ≤ ε and s a−→ ∆. For each mi = Fⁱ(⊤) there is a Θi such that t a−→ Θi and m̂i(∆, Θi) ≤ ε. Since the pLTS is image-finite, there is a Θ such that for all but finitely many i, t a−→ Θ and m̂i(∆, Θ) ≤ ε. ⊓⊔

We now show the correspondence between our state-metrics and bisimulations.

Theorem 3.11. Given a binary relation R and a pseudometric m ∈ M on a finite-state pLTS such that

m(s, t) = 0 if s R t, and m(s, t) = 1 otherwise.    (3.23)

Then R is a probabilistic bisimulation if and only if m is a state-metric.

Proof. The result can be proved by using Theorem 3.3, which in turn relies on Theorem 3.1(1). Below we give an alternative proof that uses Theorem 3.1(2) instead.

Given two distributions ∆, ∆′ over S, let us consider how to compute m̂(∆, ∆′) if R is an equivalence relation. Since S is finite, we may assume that V1, ..., Vn ∈ S/R are all the equivalence classes of S under R. If s, t ∈ Vi for some i ∈ 1..n, then m(s, t) = 0, which implies xs = xt by the first constraint of (3.7). So for each i ∈ 1..n there exists some xi such that xi = xs for all s ∈ Vi. Thus, some summands of (3.7) can be grouped together and we obtain the following linear program:

∑i∈1..n (∆(Vi) − ∆′(Vi)) xi    (3.24)

with the constraints xi − xj ≤ 1 for any i, j ∈ 1..n with i ≠ j. Briefly speaking, if R is an equivalence relation then m̂(∆, ∆′) is obtained by maximising the linear program in (3.24).

(⇒) Suppose R is a bisimulation and m(s, t) = 0. From the assumption in (3.23) we know that R is an equivalence relation. By the definition of m we have s R t. If s a−→ ∆ then t a−→ ∆′ for some ∆′ such that ∆ R† ∆′. To show that m is a state-metric it suffices to prove m̂(∆, ∆′) = 0. We know from ∆ R† ∆′ and Theorem 3.1(2) that ∆(Vi) = ∆′(Vi) for each i ∈ 1..n. It follows that (3.24) is maximised to be 0, and thus m̂(∆, ∆′) = 0.

(⇐) Suppose m is a state-metric and satisfies (3.23). Notice that R is an equivalence relation. We show that it is a bisimulation. Suppose s R t, which means m(s, t) = 0. If s a−→ ∆ then t a−→ ∆′ for some ∆′ such that m̂(∆, ∆′) = 0. To ensure that m̂(∆, ∆′) = 0, the following two conditions must be satisfied in (3.24).

1. No coefficient is positive. Otherwise, if ∆(Vi) − ∆′(Vi) > 0 then (3.24) would be maximised to a value not less than ∆(Vi) − ∆′(Vi), which is greater than 0.

2. It is not the case that at least one coefficient is negative and the other coefficients are either negative or 0. Otherwise, by summing up all the coefficients, we would get

∆(S) − ∆′(S) < 0,

which contradicts the assumption that ∆ and ∆′ are distributions over S.

Therefore the only possibility is that all the coefficients in (3.24) are 0, that is, ∆(Vi) = ∆′(Vi) for any equivalence class Vi ∈ S/R. It follows from Theorem 3.1(2) that ∆ R† ∆′. So we have shown that R is indeed a bisimulation. ⊓⊔

Corollary 3.2. Let s and t be two states in a finite-state pLTS. Then s ∼ t if and only if mmax(s, t) = 0.

Proof. (⇒) Since ∼ is a bisimulation, by Theorem 3.11 there exists some state-metric m such that s ∼ t iff m(s, t) = 0. By the definition of mmax we have m ⊑ mmax. Therefore mmax(s, t) ≤ m(s, t) = 0.

(⇐) From mmax we construct a pseudometric m as follows:

m(s, t) = 0 if mmax(s, t) = 0, and m(s, t) = 1 otherwise.

Since mmax is a state-metric, it is easy to see that m is also a state-metric. Now we construct a binary relation R such that ∀s, s′ : s R s′ iff m(s, s′) = 0. It follows from Theorem 3.11 that R is a bisimulation. If mmax(s, t) = 0, then m(s, t) = 0 and thus s R t. Therefore we have the required result s ∼ t, because ∼ is the largest bisimulation. ⊓⊔


B := {S}
repeat
    Bold := B
    B := Refine(B)
until Bold = B
return B

Fig. 3.6 Schema for the partition refinement algorithm

3.8 Algorithmic characterisation

Bisimulation is useful for verifying formal systems and is the foundation of state-aggregation algorithms that compress models by merging bisimilar states. State ag-gregation is routinely used as a preprocessing step before model checking [3]. Inthis section we present two algorithms for computing probabilistic bisimulation.

3.8.1 A partition refinement algorithm

We first introduce an algorithm that, given a pLTS (S, L, →) where both S and L are finite, iteratively computes bisimilarity. The idea was originally proposed by Kanellakis and Smolka [56] for computing non-probabilistic bisimulation and is commonly known as a partition refinement algorithm (see Figure 3.6). The point is to represent the state space as a set of blocks, where a block is a set of states standing for an equivalence class; the equivalence of two given states can then be tested by checking whether they belong to the same block. The blocks form a partition of the state space. Starting from the partition {S}, the algorithm iteratively refines the partition by splitting each block into two smaller blocks if two states in one block are found to exhibit different behaviour. Eventually, when no further refinement is possible, the algorithm terminates, and all states in a block of the resulting partition are bisimilar.

Let B = {B1, ..., Bn} be a partition consisting of a set of blocks. The algorithm tries to refine the partition by splitting each block. A splitter for a block B ∈ B is a block B′ ∈ B such that some states in B have a-transitions, for some a ∈ L, into B′ and others do not. In this case B′ splits B with respect to a into two blocks B1 and B2, where B1 = {s ∈ B | ∃s′ ∈ B′ : s a−→ s′} and B2 = B − B1. This is illustrated in Figure 3.7.

The refinement operator Refine(B) yields the partition

Refine(B) = ⋃B∈B, a∈L Split(B, a, B)    (3.25)

Fig. 3.7 Splitting a block: B′ splits B into B1 and B2 according to which states of B have an a-transition into B′

where Split(B, a, B) is the splitting procedure that detects whether the partition B contains a splitter for a given block B ∈ B with respect to action a ∈ L. If such a splitter exists, B is split into two blocks B1 and B2; otherwise, B itself is returned.

We will introduce a partition refinement algorithm for pLTS's that was presented in [2]. Before that, we briefly sketch how the above partitioning technique can be modified for reactive pLTS's and explain why this method fails for general pLTS's.

The partitioning technique for reactive pLTS’s

Let 〈S, L, →〉 be a reactive pLTS. For any a ∈ L and B ⊆ S, we define the equivalence relation ∼(a,B) by letting s ∼(a,B) t if s a−→ ∆ and t a−→ Θ with ∆(B) = Θ(B). We still use the schema shown in Figure 3.6 and the refinement operator in (3.25), but change the splitting procedure as follows:

Split(B, a, B) = ⋃C∈B Split(B, a, C)    where Split(B, a, C) = C/∼(a,B).    (3.26)

An implementation of the algorithm using some tricks on data structures yields the following complexity result.

Theorem 3.12. The bisimulation equivalence classes of a reactive pLTS with n states and m transitions can be computed in time O(mn log n) and space O(mn).

⊓⊔
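Before moving on to general pLTS's, here is a direct, unoptimised reading of the schema of Figure 3.6 with the splitting procedure (3.26), for a finite reactive pLTS. It is a sketch only; the encoding of the pLTS and the use of exact Fractions for probabilities are assumptions, and none of the data structures behind Theorem 3.12 are used.

# Sketch: partition refinement on a finite reactive pLTS. Assumed encoding:
# steps[(s, a)] = Delta, a dict from states to Fraction probabilities
# (exact arithmetic, since the splitter compares masses for equality).
from fractions import Fraction

def mass(s, a, B, steps):
    """Delta(B) for the unique a-derivative of s, or None if s has no a-move."""
    delta = steps.get((s, a))
    if delta is None:
        return None
    return sum((p for t, p in delta.items() if t in B), Fraction(0))

def split(C, a, B, steps):
    """C / ~(a,B): group the states of C by their probability mass in B."""
    groups = {}
    for s in C:
        groups.setdefault(mass(s, a, B, steps), set()).add(s)
    return list(groups.values())

def refine(partition, labels, steps):
    new = []
    for C in partition:
        pieces = [set(C)]
        for a in labels:
            for B in partition:              # every block may act as a splitter
                pieces = [D for piece in pieces
                            for D in split(piece, a, B, steps)]
        new.extend(pieces)
    return new

def bisim_classes(states, labels, steps):
    """Figure 3.6: refine the trivial partition {S} until it is stable."""
    partition = [set(states)]
    while True:
        new = refine(partition, labels, steps)
        if len(new) == len(partition):
            return partition
        partition = new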

The splitter technique in (3.26) does not work for general pLTS's when we use the obvious modification of the equivalence relation ∼(a,B), where s ∼(a,B) t iff for any transition s a−→ ∆ there is a transition t a−→ Θ with ∆(B) = Θ(B), and vice versa.

Example 3.2. Consider the pLTS described in Figure 3.8. We have s ≁ s′ because the transition s a−→ 1/2·t + 1/2·u cannot be matched by any transition from s′.

Fig. 3.8 Non-bisimilar states s and s′

However, s and s′ cannot be distinguished by using the above partitioning technique. The problem is that after one round of refinement we obtain the blocks {s, s′} and {t, u, v, w}, and then no further refinement can split the block {s, s′}.

The partitioning technique for pLTS’s

To compute bisimulation equivalence classes in general pLTS's, we can keep the schema sketched in Figure 3.6 but use two partitions: a partition B for states and a partition M for transitions. By a transition partition we mean a set M consisting of pairs (a, M) where a ∈ L and M ⊆ Ma, with Ma = ⋃s∈S {∆ | s a−→ ∆}, such that, for any action a, the set {M | (a, M) ∈ M} is a partition of the set Ma.

The algorithm works as follows. We skip the first refinement step and start with the state partition

Binit = S/∼L, where s ∼L t iff {a | s a−→} = {a | t a−→},

which identifies those states that can perform the same actions immediately. The initial transition partition

Minit = {(a, Ma) | a ∈ L}

identifies all transitions with the same label. In each iteration, we try to refine the state partition B according to an equivalence class (a, M) of M, or the transition partition M according to a block B ∈ B. The refinement of B by (a, M) is done by the operation Split(M, a, B), which divides each block B of B into the two subblocks B(a,M) = {s ∈ B | s a−→ M} and B\B(a,M). In other words,

Split(M, a, B) = ⋃B∈B Split(M, a, B)

B := S/∼L where s ∼L t iff {a | s a−→} = {a | t a−→}
M := {(a, Ma) | a ∈ L} where Ma = ⋃s∈S {∆ | s a−→ ∆}
As long as B or M can be modified, perform one of the following steps:
• either choose some B ∈ B and put M := Split(B, M)
• or choose some (a, M) ∈ M and put B := Split(M, a, B)
Return B

Fig. 3.9 Computing bisimulation equivalence classes in pLTS's

where Split(M, a, B) = {B(a,M), B\B(a,M)} and B(a,M) = {s ∈ B | s a−→ M}. The refinement of M by B is done by the operation Split(B, M), which divides any block (a, M) ∈ M into the subblocks (a, M1), ..., (a, Mn), where {M1, ..., Mn} = M/∼B and ∆ ∼B Θ iff ∆(B) = Θ(B). Formally,

Split(B, M) = ⋃(a,M)∈M Split(B, (a, M))

where Split(B, (a, M)) = {(a, M′) | M′ ∈ M/∼B}. If no further refinement of B and M is possible, then we have B = S/∼. The algorithm is sketched in Figure 3.9. See [2] for the correctness proof of the algorithm and the suitable data structures used to obtain the following complexity result.

Theorem 3.13. The bisimulation equivalence classes of a pLTS with n states and m transitions can be decided in time O(mn(log m + log n)) and space O(mn). ⊓⊔
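The two-partition scheme can likewise be prototyped directly. The sketch below follows Figure 3.9 naively, refining the transition partition by every state block and the state partition by every transition block until neither changes; the encoding of the pLTS is an assumption, and the data structures behind Theorem 3.13 are not used.

# Sketch of the scheme in Figure 3.9 for a finite pLTS. Assumed encoding:
# steps[(s, a)] = [Delta, ...], each Delta a dict of (exact) probabilities.

def initial(states, labels, steps):
    by_menu = {}
    for s in states:                 # B_init groups states by enabled actions
        menu = frozenset(a for a in labels if steps.get((s, a)))
        by_menu.setdefault(menu, set()).add(s)
    B = list(by_menu.values())
    M = [(a, [d for s in states for d in steps.get((s, a), [])])
         for a in labels]            # M_init groups transitions by label
    return B, M

def split_states(B, a, M, steps):
    """Refine each state block by 'has an a-transition into the set M'."""
    new = []
    for blk in B:
        inside = {s for s in blk
                  if any(d in M for d in steps.get((s, a), []))}
        new.extend(p for p in (inside, blk - inside) if p)
    return new

def split_transitions(M, B):
    """Refine each transition block by the probability mass given to B."""
    new = []
    for a, blk in M:
        groups = {}
        for d in blk:
            groups.setdefault(sum(d.get(t, 0) for t in B), []).append(d)
        new.extend((a, g) for g in groups.values())
    return new

def bisim_classes(states, labels, steps):
    B, M = initial(states, labels, steps)
    while True:
        M2 = M
        for blk in B:                      # refine transitions by state blocks
            M2 = split_transitions(M2, blk)
        B2 = B
        for a, mblk in M2:                 # refine states by transition blocks
            B2 = split_states(B2, a, mblk, steps)
        if len(B2) == len(B) and len(M2) == len(M):
            return B2
        B, M = B2, M2

On Example 3.2 the transition partition separates the two a-derivatives of s from those of s′ (they give different masses to the block {t, u, v, w} only after that block is itself split by transition blocks), which is exactly the extra discriminating power the state-only technique lacks.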

3.8.2 An “on the fly” algorithm

We now propose an “on the fly” algorithm for checking whether two states in a finitary pLTS are bisimilar.

An important ingredient of the algorithm is to check whether two distributions are related via a lifted relation. Fortunately, Theorem 3.4 already provides us with a method for deciding whether ∆ R† Θ, for two given distributions ∆, Θ and a relation R. We construct the network N(∆, Θ, R) and compute the maximum flow with well-known methods, as sketched in the procedure Check shown in Figure 3.10.

As shown in [14], computing the maximum flow in a network can be done in time O(n³/log n) and space O(n²), where n is the number of nodes in the network. So we immediately have the following result.

Lemma 3.10. The test whether ∆ R† Θ can be done in time O(n³/log n) and space O(n²). ⊓⊔
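For concreteness, here is a sketch of the procedure Check using networkx for the maximum flow. The network is built in the standard way suggested by Theorem 3.4: a source feeds each state in the support of ∆ with capacity ∆(s), an edge of ample capacity links s to t whenever s R t, and each t in the support of Θ drains into a sink with capacity Θ(t); then ∆ R† Θ iff the maximum flow equals 1. The function and node names are assumptions of the sketch.

# Sketch of Check(Delta, Theta, R) via maximum flow, using networkx.
import networkx as nx

def check(delta, theta, R, eps=1e-9):
    """Return True iff delta is related to theta under the lifting of R."""
    g = nx.DiGraph()
    for s, p in delta.items():                     # source -> left copy
        g.add_edge('source', ('L', s), capacity=p)
    for t, q in theta.items():                     # right copy -> sink
        g.add_edge(('R', t), 'sink', capacity=q)
    for s in delta:                                # edges for the relation R
        for t in theta:
            if (s, t) in R:
                g.add_edge(('L', s), ('R', t), capacity=1.0)
    flow, _ = nx.maximum_flow(g, 'source', 'sink')
    return flow >= 1.0 - eps

# Example: Delta = 1/2 s1 + 1/2 s2, Theta the point distribution on t,
# with both s1 and s2 related to t.
assert check({'s1': 0.5, 's2': 0.5}, {'t': 1.0}, {('s1', 't'), ('s2', 't')})

The two tagged copies ('L', s) and ('R', t) keep the two sides of the network apart even when the supports of ∆ and Θ overlap.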

We now present a bisimilarity-checking algorithm by adapting the algorithm proposed in [62] for value-passing processes, which in turn was inspired by [37].

Bisim(s, t) =
    NotBisim := {}
    fun Bis(s, t) =
        Visited := {}; Assumed := {}
        Match(s, t) handle WrongAssumption ⇒ Bis(s, t)
    return Bis(s, t)

Match(s, t) =
    Visited := Visited ∪ {(s, t)}
    b = ∧a∈L MatchAction(s, t, a)
    if b = false then
        NotBisim := NotBisim ∪ {(s, t)}
        if (s, t) ∈ Assumed then
            raise WrongAssumption
        end if
    end if
    return b

MatchAction(s, t, a) =
    for all s a−→ ∆i do
        for all t a−→ Θj do
            bij = MatchDistribution(∆i, Θj)
        end for
    end for
    return (∧i (∨j bij)) ∧ (∧j (∨i bij))

MatchDistribution(∆, Θ) =
    Assume ⌈∆⌉ = {s1, ..., sn} and ⌈Θ⌉ = {t1, ..., tm}
    R := {(si, tj) | Close(si, tj) = true}
    return Check(∆, Θ, R)

Close(s, t) =
    if (s, t) ∈ NotBisim then
        return false
    else if (s, t) ∈ Visited then
        Assumed := Assumed ∪ {(s, t)}
        return true
    else
        return Match(s, t)
    end if

Check(∆, Θ, R) =
    Construct the network N(∆, Θ, R)
    Compute the maximum flow F in N(∆, Θ, R)
    return (F = 1)

Fig. 3.10 Check whether s is bisimilar to t. © [2009] IEEE. Reprinted, with permission, from [23].


The main procedure in the algorithm is Bisim(s, t), shown in Figure 3.10. It starts with the initial state pair (s, t), trying to find the smallest bisimulation relation containing the pair by matching transitions from each pair of states it reaches. It uses three auxiliary data structures:

• NotBisim collects all state pairs that have already been detected as not bisimilar.
• Visited collects all state pairs that have already been visited.
• Assumed collects all state pairs that have already been visited and assumed to be bisimilar.

The core procedure, Match, is called from the function Bis inside the main procedure Bisim. Whenever a new pair of states is encountered, it is inserted into Visited. If two states fail to match each other's transitions, then they are not bisimilar and the pair is added to NotBisim. If the current state pair has been visited before, we check whether it is in NotBisim. If this is the case, we return false. Otherwise, a loop has been detected and we make the assumption that the two states are bisimilar, by inserting the pair into Assumed, and return true. Later on, if we find that the two states are not bisimilar after finishing searching the loop, then the assumption is wrong, so we first add the pair to NotBisim and then raise the exception WrongAssumption, which forces the function Bis to run again, with the new information that the two states in this pair are not bisimilar. In this case, the size of NotBisim has increased by at least one. Hence, Bis can only be called finitely many times. Therefore, the procedure Bisim(s, t) terminates. If it returns true, then the set of state pairs (Visited − NotBisim) constitutes a bisimulation relation containing the pair (s, t).

The main difference from the algorithm for checking non-probabilistic bisimilarity in [62] is the introduction of the procedure MatchDistribution(∆, Θ), where we approximate ∼ by a binary relation R that is in general coarser than ∼, and we check the validity of ∆ R† Θ. If ∆ R† Θ does not hold, then ∆ ∼† Θ does not hold either, and MatchDistribution(∆, Θ) correctly returns false. Otherwise, the two distributions ∆ and Θ are considered equivalent with respect to R and we move on to match other pairs of distributions. The correctness of the algorithm is stated in the following theorem.

Theorem 3.14. Given two states s0, t0 in a finitary pLTS, the function Bisim(s0, t0) terminates, and it returns true if and only if s0 ∼ t0.

Proof. Let Bisi stand for the i-th execution of the function Bis, and let Assumedi and NotBisimi be the sets Assumed and NotBisim at the end of Bisi. When Bisi is finished, either a WrongAssumption is raised or not. In the former case, Assumedi ∩ NotBisimi ≠ ∅; in the latter case, the execution of the function Bisim is completed. By examining the function Close we see that Assumedi ∩ NotBisimi−1 = ∅. Now it follows from the fact NotBisimi−1 ⊆ NotBisimi that NotBisimi−1 ⊂ NotBisimi. Since we are considering finitary pLTS's, there is some j such that NotBisimj−1 = NotBisimj, when all the non-bisimilar state pairs reachable from s0 and t0 have been found, and Bisim must terminate.

For the correctness of the algorithm, we consider the relation


Ri = Visitedi − NotBisimi,

where Visitedi is the set Visited at the end of Bisi. Let Bisk be the last execution of Bis. For each i ≤ k, the relation Ri can be regarded as an approximation of ∼, as far as the states appearing in Ri are concerned. Moreover, Ri is a coarser approximation, because if two states s, t are re-visited but their relation is unknown, they are assumed to be bisimilar. Therefore, if Bisk(s0, t0) returns false, then s0 ≁ t0. On the other hand, if Bisk(s0, t0) returns true, then Rk constitutes a bisimulation relation containing the pair (s0, t0). This follows because Match(s0, t0) = true, which basically means that whenever s Rk t and s a−→ ∆ there exists some transition t a−→ Θ such that Check(∆, Θ, Rk) = true, i.e. ∆ Rk† Θ. Indeed, this rules out the possibility that s0 ≁ t0: otherwise we would have s0 ≁ω t0 by Proposition 3.8, that is s0 ≁n t0 for some n > 0. The latter means that some transition s0 a−→ ∆ exists such that for all t0 a−→ Θ we do not have ∆ (∼n−1)† Θ, or symmetrically with the roles of s0 and t0 exchanged; i.e. ∆ and Θ can be distinguished at level n, so a contradiction arises. ⊓⊔

Below we consider the time and space complexities of the algorithm.

Theorem 3.15. Let s and t be two states in a pLTS with n states in total. The function Bisim(s, t) terminates in time O(n⁷/log n) and space O(n²).

Proof. The number of state pairs is bounded by n². In the worst case, each execution of the function Bis(s, t) yields only one new pair of states that are not bisimilar. The number of state pairs examined in the first execution of Bis(s, t) is at most O(n²), in the second execution at most O(n² − 1), and so on. Therefore, the total number of state pairs examined is at most O(n² + (n² − 1) + ··· + 1) = O(n⁴). When a state pair (s, t) is examined, each transition of s is compared with all transitions of t labelled with the same action. Since the pLTS is finitely branching, we may assume that each state has at most c outgoing transitions. Therefore, for each state pair, the number of comparisons of transitions is bounded by c². As a comparison of two transitions calls the function Check once, which requires time O(n³/log n) by Lemma 3.10, examining each state pair takes time O(c²n³/log n). Finally, the worst-case time complexity of executing Bisim(s, t) is O(n⁷/log n).

The space requirement of the algorithm is easily seen to be O(n²), in view of Lemma 3.10. ⊓⊔

Remark 3.3. With a mild modification, the above algorithm can be adapted to check probabilistic similarity. We simply remove the underlined part in the function MatchAction (the second conjunct ∧j(∨i bij) of its return expression); the rest of the algorithm remains unchanged. Similarly to the analysis in Theorems 3.14 and 3.15, the new algorithm can be shown to correctly check probabilistic similarity over finitary pLTS's; its worst-case time and space complexities are still O(n⁷/log n) and O(n²), respectively.


3.9 Bibliographic notes

3.9.1 Probabilistic models

Models for probabilistic concurrent systems have been studied for a long time [77, 29, 86, 52]. One of the first models, obtained as a simple adaptation of the traditional labelled transition systems from concurrency theory, appears in [60]. Their probabilistic transition systems are classical labelled transition systems where, in addition, every transition is labelled with a probability, a real number in the interval [0,1], such that for every state s and every action a, the probabilities of all a-labelled transitions leaving s sum to either 0 or 1.

In [42] a similar model is proposed, but where the probabilities of all transitions leaving s sum to either 0 or 1. [45] proposes the terminology reactive for the type of model studied in [60], and generative for the type of model studied in [42]. In a generative model, a process can be considered to spontaneously generate actions, unless restricted by the environment; in generating actions, a probabilistic choice is made between all transitions that can be taken from a given state, even if they have different labels. In a reactive model, on the other hand, processes are supposed to perform actions only in response to requests by the environment. The choice between two different actions is therefore not under the control of the process itself. When the environment requests a specific action, a probabilistic choice is made between all transitions (if any) that are labelled with the requested action.

In the above-mentioned models, the nondeterministic choice that can be modelled by non-probabilistic labelled transition systems is replaced by a probabilistic choice (and in the generative model also a deterministic choice, a choice between different actions, is made probabilistic). Hence reactive and generative probabilistic transition systems do not generalise non-probabilistic labelled transition systems. A model, or rather a calculus, that features both nondeterministic and reactive probabilistic choice is proposed in [46]. It is slightly reformulated in [82] under the name simple probabilistic automata, which is akin to our pLTS model. Note that essentially the same model has appeared in the literature under different names such as NP-systems [53], probabilistic processes [54], probabilistic transition systems [55], etc. Furthermore, there are strong structural similarities with Markov decision processes [76].

Following the classification above, our model is reactive rather than generative. The reactive model of [60] can be reformulated by saying that a state s has at most one outgoing transition for any action a, and this transition ends in a probability distribution over its successor states. The generalisation of [82], which we use here as well, is that a state can have multiple outgoing transitions with the same label, each ending in a probability distribution. Simple probabilistic automata are a special case of the probabilistic automata of [82], which also generalise the generative models of probabilistic processes to a setting with nondeterministic choice.


3.9.2 Probabilistic bisimulation

Probabilistic bisimulation was first introduced by Larsen and Skou [60]. Later on, it has been investigated in a great many probabilistic models. An adequate logic for probabilistic bisimulation in a setting similar to our pLTS's has been studied in [74, 50] and then used in [18] to characterise a notion of bisimulation defined on an LTS whose states are distributions. It is also based on a probabilistic extension of the Hennessy-Milner logic. The main difference from our logic in Section 3.6.1 is the introduction of the operator [·]p. Intuitively, a distribution ∆ satisfies the formula [ϕ]p when the set of states satisfying ϕ is measured by ∆ with probability at least p. So the formula [ϕ]p can be expressed in our logic in terms of the probabilistic choice ϕ p⊕ ⊤, as we have seen in Example 3.1.

An expressive logic for non-probabilistic bisimulation has been proposed in [83]. In Section 3.6.2 we partially extend the results of [83] to a probabilistic setting that admits both probabilistic and nondeterministic choice. We present a probabilistic extension of the modal mu-calculus [59], where a formula is interpreted as the set of states satisfying it. This is in contrast to the probabilistic semantics of the mu-calculus studied in [51, 64, 65], where formulae denote lower bounds of probabilistic evidence of properties, and to the semantics of the generalised probabilistic logic of [16], where a mu-calculus formula is interpreted as a set of deterministic trees that satisfy it.

The probabilistic bisimulation studied in this chapter (cf. Definition 3.5) is a relation between states. In [48] a notion of probabilistic bisimulation is defined directly on distributions, and it is shown that the distribution-based bisimulations correspond to a lifting of state-based bisimulations with combined transitions. By combined transitions we mean that if a state can perform an action and then nondeterministically evolve into two different distributions, it can also evolve into any linear combination of the two distributions. The logical characterisation of the distribution-based bisimilarity in [48] is similar to ours in Section 3.6.1, but formulae are interpreted on distributions rather than on states. Another notion of distribution-based bisimulation is given in [36]. Unlike the other bisimulations we have discussed so far, it is essentially a linear-time semantics rather than a branching-time semantics [44].

The Kantorovich metric has been used by van Breugel et al. for defining behavioural pseudometrics on fully probabilistic systems [9, 12, 8] and reactive probabilistic systems [10, 11, 5, 6]; by Desharnais et al. for labelled Markov chains [31, 33] and labelled concurrent Markov chains [32]; later on by Ferns et al. for Markov decision processes [38, 39]; and by Deng et al. for action-labelled quantitative transition systems [20]. In this chapter we are mainly interested in the correspondence of our lifting operation to the Kantorovich metric. The metric characterisation of probabilistic bisimulation in Section 3.7 is merely a direct consequence of this correspondence. In [13] a general family of bisimulation metrics is proposed that can be used to specify some properties in security and privacy. In [90] a different kind of approximate reasoning technique is proposed that measures the distances of labels used in (non-probabilistic) bisimulation games rather than the distances of states in probabilistic bisimulation games.


Decision algorithms for probabilistic bisimilarity and similarity were first investigated by Baier et al. in [2]. In [91] Zhang et al. improved the original algorithm for computing probabilistic simulation by exploiting the parametric maximum flow algorithm [41] to compute the maximum flows of networks, resulting in an algorithm that computes strong simulation in time O(m²n) and in space O(m²). In [19] a framework based on abstract interpretation [17] is proposed to design algorithms that compute bisimulation and simulation. It entails the bisimulation algorithm by Baier et al. as an instance and leads to a simulation algorithm that improves the one for pLTS's by Zhang et al. All the algorithms above are global in the sense that the whole state space has to be fully generated in advance. In contrast, “on the fly” algorithms are local in the sense that the state space is generated dynamically, which is often more efficient when determining that one state fails to be related to another. Our algorithm in Section 3.8.2 is inspired by [2], because we also reduce the problem of checking whether two distributions are related by a lifted relation to the maximum flow problem of a suitable network. We generalise the local algorithm for checking non-probabilistic bisimilarity [37, 62] to the probabilistic setting.

An important line of research is on denotational semantics of probabilistic processes, where bisimulations are interpreted in analytic spaces. This topic is well covered in [72, 34], and thus omitted from this book.

References

1. Aharoni, R., Berger, E., Georgakopoulos, A., Perlstein, A., Sprüssel, P.: The max-flow min-cut theorem for countable networks. Journal of Combinatorial Theory, Series B 101(1), 1–17 (2011)
2. Baier, C., Engelen, B., Majster-Cederbaum, M.E.: Deciding bisimilarity and similarity for probabilistic processes. Journal of Computer and System Sciences 60(1), 187–231 (2000)
3. Baier, C., Katoen, J.P.: Principles of Model Checking. The MIT Press (2008)
4. Bandini, E., Segala, R.: Axiomatizations for probabilistic bisimulation. In: Proceedings of the 28th International Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science, vol. 2076, pp. 370–381. Springer (2001)
5. van Breugel, F., Hermida, C., Makkai, M., Worrell, J.: An accessible approach to behavioural pseudometrics. In: Proceedings of the 32nd International Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science, vol. 3580, pp. 1018–1030. Springer (2005)
6. van Breugel, F., Hermida, C., Makkai, M., Worrell, J.: Recursively defined metric spaces without contraction. Theoretical Computer Science 380(1-2), 143–163 (2007)
7. van Breugel, F., Mislove, M., Ouaknine, J., Worrell, J.: Domain theory, testing and simulation for labelled Markov processes. Theoretical Computer Science 333(1-2), 171–197 (2005)
8. van Breugel, F., Sharma, B., Worrell, J.: Approximating a behavioural pseudometric without discount for probabilistic systems. In: Proceedings of the 10th International Conference on Foundations of Software Science and Computational Structures, Lecture Notes in Computer Science, vol. 4423, pp. 123–137. Springer (2007)
9. van Breugel, F., Worrell, J.: An algorithm for quantitative verification of probabilistic transition systems. In: Proceedings of the 12th International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 2154, pp. 336–350. Springer (2001)


10. van Breugel, F., Worrell, J.: Towards quantitative verification of probabilistic transition systems. In: Proceedings of the 28th International Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science, vol. 2076, pp. 421–432. Springer (2001)
11. van Breugel, F., Worrell, J.: A behavioural pseudometric for probabilistic transition systems. Theoretical Computer Science 331(1), 115–142 (2005)
12. van Breugel, F., Worrell, J.: Approximating and computing behavioural distances in probabilistic transition systems. Theoretical Computer Science 360(1-3), 373–385 (2006)
13. Chatzikokolakis, K., Gebler, D., Palamidessi, C., Xu, L.: Generalized bisimulation metrics. In: Proceedings of the 25th International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 8704, pp. 32–46. Springer (2014)
14. Cheriyan, J., Hagerup, T., Mehlhorn, K.: Can a maximum flow be computed in O(nm) time? In: Proceedings of the 17th International Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science, vol. 443, pp. 235–248. Springer (1990)
15. Christoff, I.: Testing equivalences and fully abstract models for probabilistic processes. In: Proceedings of the 1st International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 458, pp. 126–140. Springer (1990)
16. Cleaveland, R., Iyer, S.P., Narasimha, M.: Probabilistic temporal logics via the modal mu-calculus. Theoretical Computer Science 342(2-3), 316–350 (2005)
17. Cousot, P., Cousot, R.: Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In: Proceedings of the 4th ACM Symposium on Principles of Programming Languages, pp. 238–252. ACM (1977)
18. Crafa, S., Ranzato, F.: A spectrum of behavioral relations over LTSs on probability distributions. In: Proceedings of the 22nd International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 6901, pp. 124–139. Springer (2011)
19. Crafa, S., Ranzato, F.: Bisimulation and simulation algorithms on probabilistic transition systems by abstract interpretation. Formal Methods in System Design 40(3), 356–376 (2012)
20. Deng, Y., Chothia, T., Palamidessi, C., Pang, J.: Metrics for action-labelled quantitative transition systems. Electronic Notes in Theoretical Computer Science 153(2), 79–96 (2006)
21. Deng, Y., Du, W.: Probabilistic barbed congruence. Electronic Notes in Theoretical Computer Science 190(3), 185–203 (2007)
22. Deng, Y., Du, W.: Kantorovich metric in computer science: A brief survey. Electronic Notes in Theoretical Computer Science 353(3), 73–82 (2009)
23. Deng, Y., Du, W.: A local algorithm for checking probabilistic bisimilarity. In: Proceedings of the 4th International Conference on Frontier of Computer Science and Technology, pp. 401–409. IEEE Computer Society (2009)
24. Deng, Y., van Glabbeek, R., Hennessy, M., Morgan, C.C.: Characterising testing preorders for finite probabilistic processes. Logical Methods in Computer Science 4(4), 1–33 (2008)
25. Deng, Y., van Glabbeek, R., Hennessy, M., Morgan, C.: Testing finitary probabilistic processes (extended abstract). In: Proceedings of the 20th International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 5710, pp. 274–288. Springer (2009)
26. Deng, Y., van Glabbeek, R., Hennessy, M., Morgan, C.C., Zhang, C.: Remarks on testing probabilistic processes. Electronic Notes in Theoretical Computer Science 172, 359–397 (2007)
27. Deng, Y., van Glabbeek, R., Morgan, C.C., Zhang, C.: Scalar outcomes suffice for finitary probabilistic testing. In: Proceedings of the 16th European Symposium on Programming, Lecture Notes in Computer Science, vol. 4421, pp. 363–378. Springer (2007)
28. Deng, Y., Wu, H.: Modal characterisations of probabilistic and fuzzy bisimulations. In: Proceedings of the 16th International Conference on Formal Engineering Methods, Lecture Notes in Computer Science, vol. 8829, pp. 123–138. Springer (2014)
29. Derman, C.: Finite State Markovian Decision Processes. Academic Press (1970)
30. Desharnais, J., Edalat, A., Panangaden, P.: A logical characterization of bisimulation for labelled Markov processes. In: Proceedings of the 13th Annual IEEE Symposium on Logic in Computer Science, pp. 478–489. IEEE Computer Society Press (1998)


31. Desharnais, J., Jagadeesan, R., Gupta, V., Panangaden, P.: Metrics for labeled Markov systems. In: Proceedings of the 10th International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 1664, pp. 258–273. Springer (1999)
32. Desharnais, J., Jagadeesan, R., Gupta, V., Panangaden, P.: The metric analogue of weak bisimulation for probabilistic processes. In: Proceedings of the 17th Annual IEEE Symposium on Logic in Computer Science, pp. 413–422. IEEE Computer Society (2002)
33. Desharnais, J., Jagadeesan, R., Gupta, V., Panangaden, P.: Metrics for labelled Markov processes. Theoretical Computer Science 318(3), 323–354 (2004)
34. Doberkat, E.E.: Stochastic Coalgebraic Logic. Springer (2010)
35. Even, S.: Graph Algorithms. Computer Science Press (1979)
36. Feng, Y., Zhang, L.: When equivalence and bisimulation join forces in probabilistic automata. In: Proceedings of the 19th International Symposium on Formal Methods, Lecture Notes in Computer Science, vol. 8442, pp. 247–262. Springer (2014)
37. Fernandez, J.C., Mounier, L.: Verifying bisimulations “on the fly”. In: Proceedings of the 3rd International Conference on Formal Description Techniques for Distributed Systems and Communication Protocols, pp. 95–110. North-Holland (1990)
38. Ferns, N., Panangaden, P., Precup, D.: Metrics for finite Markov decision processes. In: Proceedings of the 20th Conference in Uncertainty in Artificial Intelligence, pp. 162–169. AUAI Press (2004)
39. Ferns, N., Panangaden, P., Precup, D.: Metrics for Markov decision processes with infinite state spaces. In: Proceedings of the 21st Conference in Uncertainty in Artificial Intelligence, pp. 201–208. AUAI Press (2005)
40. Ford, L., Fulkerson, D.: Flows in Networks. Princeton University Press (2010)
41. Gallo, G., Grigoriadis, M.D., Tarjan, R.E.: A fast parametric maximum flow algorithm and applications. SIAM Journal on Computing 18(1), 30–55 (1989)
42. Giacalone, A., Jou, C.C., Smolka, S.A.: Algebraic reasoning for probabilistic concurrent systems. In: Proceedings of the IFIP TC2 Working Conference on Programming Concepts and Methods (1990)
43. Gibbs, A.L., Su, F.E.: On choosing and bounding probability metrics. International Statistical Review 70(3), 419–435 (2002)
44. van Glabbeek, R.: The linear time – branching time spectrum I; the semantics of concrete, sequential processes. In: Handbook of Process Algebra, Chapter 1, pp. 3–99. Elsevier (2001)
45. van Glabbeek, R., Smolka, S.A., Steffen, B., Tofts, C.: Reactive, generative, and stratified models of probabilistic processes. In: Proceedings of the 5th Annual IEEE Symposium on Logic in Computer Science, pp. 130–141. Computer Society Press (1990)
46. Hansson, H., Jonsson, B.: A calculus for communicating systems with time and probabilities. In: Proceedings of the IEEE Real-Time Systems Symposium, pp. 278–287. IEEE Computer Society Press (1990)
47. He, J., Seidel, K., McIver, A.K.: Probabilistic models for the guarded command language. Science of Computer Programming 28, 171–192 (1997)
48. Hennessy, M.: Exploring probabilistic bisimulations, part I. Formal Aspects of Computing 24(4-6), 749–768 (2012)
49. Hennessy, M., Milner, R.: Algebraic laws for nondeterminism and concurrency. Journal of the ACM 32(1), 137–161 (1985)
50. Hermanns, H., Parma, A., et al.: Probabilistic logical characterization. Information and Computation 209(2), 154–172 (2011)
51. Huth, M., Kwiatkowska, M.: Quantitative analysis and model checking. In: Proceedings of the 12th Annual IEEE Symposium on Logic in Computer Science, pp. 111–122. IEEE Computer Society (1997)
52. Jones, C., Plotkin, G.: A probabilistic powerdomain of evaluations. In: Proceedings of the 4th Annual IEEE Symposium on Logic in Computer Science, pp. 186–195. Computer Society Press (1989)
53. Jonsson, B., Ho-Stuart, C., Yi, W.: Testing and refinement for nondeterministic and probabilistic processes. In: Proceedings of the 3rd International Symposium on Formal Techniques in Real-Time and Fault-Tolerant Systems, Lecture Notes in Computer Science, vol. 863, pp. 418–430. Springer (1994)


54. Jonsson, B., Yi, W.: Compositional testing preorders for probabilistic processes. In: Proceedings of the 10th Annual IEEE Symposium on Logic in Computer Science, pp. 431–441. Computer Society Press (1995)
55. Jonsson, B., Yi, W.: Testing preorders for probabilistic processes can be characterized by simulations. Theoretical Computer Science 282(1), 33–51 (2002)
56. Kanellakis, P., Smolka, S.A.: CCS expressions, finite state processes, and three problems of equivalence. Information and Computation 86(1), 43–65 (1990)
57. Kantorovich, L.: On the transfer of masses (in Russian). Doklady Akademii Nauk 37(2), 227–229 (1942)
58. Kantorovich, L.V., Rubinshtein, G.S.: On a space of totally additive functions. Vestn. Lening. Univ. 13(7), 52–59 (1958)
59. Kozen, D.: Results on the propositional mu-calculus. Theoretical Computer Science 27, 333–354 (1983)
60. Larsen, K.G., Skou, A.: Bisimulation through probabilistic testing. Information and Computation 94(1), 1–28 (1991)
61. Larsen, K.G., Skou, A.: Compositional verification of probabilistic processes. In: Proceedings of the 3rd International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 630, pp. 456–471. Springer (1992)
62. Lin, H.: “On-the-fly” instantiation of value-passing processes. In: Proceedings of FORTE'98, IFIP Conference Proceedings, vol. 135, pp. 215–230. Kluwer (1998)
63. Lowe, G.: Probabilistic and prioritized models of timed CSP. Theoretical Computer Science 138, 315–352 (1995)
64. McIver, A., Morgan, C.: An expectation-based model for probabilistic temporal logic. Tech. Rep. PRG-TR-13-97, Oxford University Computing Laboratory (1997)
65. McIver, A., Morgan, C.: Results on the quantitative mu-calculus. ACM Transactions on Computational Logic 8(1) (2007)
66. Milner, R.: Communication and Concurrency. Prentice Hall (1989)
67. Monge, G.: Mémoire sur la théorie des déblais et des remblais. Histoire de l'Académie des Sciences de Paris, p. 666 (1781)
68. Morgan, C.C., McIver, A.K., Seidel, K.: Probabilistic predicate transformers. ACM Transactions on Programming Languages and Systems 18(3), 325–353 (1996)
69. Mislove, M.M., Ouaknine, J., Worrell, J.: Axioms for probability and nondeterminism. Electronic Notes in Theoretical Computer Science 96, 7–28 (2004)
70. Müller-Olm, M.: Derivation of characteristic formulae. Electronic Notes in Theoretical Computer Science 18, 159–170 (1998)
71. Orlin, J.B.: A faster strongly polynomial minimum cost flow algorithm. In: Proceedings of the 20th ACM Symposium on the Theory of Computing, pp. 377–387. ACM (1988)
72. Panangaden, P.: Labelled Markov Processes. Imperial College Press (2009)
73. Park, D.: Concurrency and automata on infinite sequences. In: Proceedings of the 5th GI-Conference on Theoretical Computer Science, Lecture Notes in Computer Science, vol. 104, pp. 167–183. Springer (1981)
74. Parma, A., Segala, R.: Logical characterizations of bisimulations for discrete probabilistic systems. In: Proceedings of the 10th International Conference on Foundations of Software Science and Computational Structures, Lecture Notes in Computer Science, vol. 4423, pp. 287–301. Springer (2007)
75. Pnueli, A.: Linear and branching structures in the semantics and logics of reactive systems. In: Proceedings of the 12th International Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science, vol. 194, pp. 15–32. Springer (1985)
76. Puterman, M.L.: Markov Decision Processes. Wiley (1994)
77. Rabin, M.O.: Probabilistic automata. Information and Control 6, 230–245 (1963)
78. Rachev, S.: Probability Metrics and the Stability of Stochastic Models. Wiley, New York (1991)
79. Sack, J., Zhang, L.: A general framework for probabilistic characterizing formulae. In: Proceedings of the 13th International Conference on Verification, Model Checking, and Abstract Interpretation, Lecture Notes in Computer Science, vol. 7148, pp. 396–411. Springer (2012)


80. Sangiorgi, D., Rutten, J. (eds.): Advanced Topics in Bisimulation and Coinduction. Cambridge University Press (2011)
81. Segala, R.: Modeling and verification of randomized distributed real-time systems. Tech. Rep. MIT/LCS/TR-676, PhD thesis, MIT, Dept. of EECS (1995)
82. Segala, R., Lynch, N.: Probabilistic simulations for probabilistic processes. In: Proceedings of the 5th International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 836, pp. 481–496. Springer (1994)
83. Steffen, B., Ingolfsdottir, A.: Characteristic formulae for processes with divergence. Information and Computation 110, 149–163 (1994)
84. Tarski, A.: A lattice-theoretical fixpoint theorem and its applications. Pacific Journal of Mathematics 5, 285–309 (1955)
85. Tix, R., Keimel, K., Plotkin, G.: Semantic domains for combining probability and non-determinism. Electronic Notes in Theoretical Computer Science 129, 1–104 (2005)
86. Vardi, M.: Automatic verification of probabilistic concurrent finite-state programs. In: Proceedings of the 26th Annual Symposium on Foundations of Computer Science, pp. 327–338 (1985)
87. Vershik, A.: Kantorovich metric: Initial history and little-known applications. Journal of Mathematical Sciences 133(4), 1410–1417 (2006)
88. Villani, C.: Topics in Optimal Transportation, Graduate Studies in Mathematics, vol. 58. American Mathematical Society (2003)
89. Yi, W., Larsen, K.G.: Testing probabilistic and nondeterministic processes. In: Proceedings of the IFIP TC6/WG6.1 12th International Symposium on Protocol Specification, Testing and Verification, IFIP Transactions, vol. C-8, pp. 47–61. North-Holland (1992)
90. Ying, M.: Bisimulation indexes and their applications. Theoretical Computer Science 275, 1–68 (2002)
91. Zhang, L., Hermanns, H., Eisenbrand, F., Jansen, D.N.: Flow faster: Efficient decision algorithms for probabilistic simulations. Logical Methods in Computer Science 4(4:6) (2008)

Page 80: Semantics of Probabilistic Processes - SJTU
Page 81: Semantics of Probabilistic Processes - SJTU

Chapter 4Probabilistic Testing Semantics

Abstract In this chapter we extend the traditional testing theory of De Nicola andHennessy to the probabilistic setting. We first set up a general testing framework,and then introduce a vector-based testing approach that employs multiple successactions. It turns out that for finitary systems, i.e. finite-state and finitely branchingsystems, vector-based testing is equivalent to scalar testing that uses only one suc-cess action. Other variants, such as reward testing and extremal reward testing arealso discussed. They all coincide with vector-based testing as far as finitary systemsare concerned.

Keywords: Testing semantics; Vector-based testing; Scalar testing;Reward testing

4.1 A general testing framework

It is natural to view the semantics of processes as being determined by their abilityto pass tests [13, 18, 31, 28]; processesP1 andP2 are deemed to be semanticallyequivalent unless there is a test that can distinguish them.The actual tests usedtypically represent the ways in which users, or indeed otherprocesses, can interactwith Pi .

Let us first set up a general testing scenario, within which this idea can be for-mulated. It assumes

• a set of processesProc• a set of testsT , which can be applied to processes• a set of outcomesO, the possible results from applying a test to a process• a functionA : T ×Proc→P+(O), representing the possible results of apply-

ing a specific test to a specific process.

HereP+(O) denotes the collection of non-empty subsets ofO; so the result ofapplying a testT to a processP, A (T,P), is in general anon-empty setof out-comes, representing the fact that the behaviour of processes, and indeed tests, maybe nondeterministic.

71

Page 82: Semantics of Probabilistic Processes - SJTU

72 4 Probabilistic Testing Semantics

Moreover, some outcomes are considered better then others;for example, theapplication of a test may simply succeed, or it may fail, withsuccess being betterthan failure. So we can assume thatO is endowed with a partial order, in whicho1 ≤ o2 means thato2 is a better outcome thano1.

When comparing the results of applying tests to processes weneed to comparesubsets ofO. There are two standard approaches to make this comparison,based onviewing these sets as elements of either the Hoare or Smyth powerdomain [17, 2] ofO. ForO1,O2 ∈ P+(O) we let

(i) O1 ≤Ho O2 if for everyo1 ∈ O1 there exists someo2 ∈ O2 such thato1 ≤ o2

(ii) O1 ≤Sm O2 if for everyo2 ∈ O2 there exists someo1 ∈ O1 such thato1 ≤ o2.

Using these two comparison methods we obtain two different semantic preordersfor processes:

(i) For P,Q∈ Proc let P⊑mayQ if A (T,P)≤Ho A (T,Q) for every testT(ii) Similarly, let P⊑mustQ if A (T,P)≤Sm A (T,Q) for every testT.

We useP≃may Q andP≃mustQ to denote the associated equivalences.The terminologymayandmustrefers to the following reformulation of the same

idea. LetPass⊆ O be anupwards-closedsubset ofO, i.e. satisfying thato′ ≥ oando ∈ Passimply o′ ∈ Pass. It is thought of as the set of outcomes that can beregarded aspassinga test. Then we say that a processP maypass a testT withan outcome inPass, notation “P may Pass T”, if there is an outcomeo∈ A (P,T)with o∈ Pass, and likewiseP mustpass a testT with an outcome inPass, notation“P must Pass T”, if for all o∈ A (P,T) one haso∈ Pass. Now

P⊑may Q iff ∀T ∈ T ∀Pass∈ P↑(O)(P may Pass T ⇒ Q may Pass T)

P⊑mustQ iff ∀T ∈ T ∀Pass∈ P↑(O)(P must Pass T ⇒ Q must Pass T)

whereP↑(O) is the set of upwards-closed subsets ofO.Let us have a look at some typical instances of the setO and its associated partial

order≤.

1. The original theory of testing [13, 18] is obtained by using as the set of out-comesO the two-point lattice

with ⊤ representing the success of a test application, and⊥ failure.2. However, for probabilistic processes we consider an application of a test to a

process to succeed with a given probability. Thus we take as the set of outcomesthe unit interval[0,1], with the standard mathematical ordering; if 0≤ p< q≤ 1then succeeding with probabilityq is considered better than succeeding withprobability p. This yields two preorders for probabilistic processes, which for

Page 83: Semantics of Probabilistic Processes - SJTU

4.2 Testing probabilistic processes 73

convenience we rename⊑pmay and⊑pmust, with the associated equivalences≃pmay and≃pmust respectively. These preorders, and their associated equiva-lences, were first defined by Wang and Larsen [31]. We will refer to this ap-proach asscalar testing.

3. Another approach of testing [28] employs a countable set of special actionsΩ = ω1,ω2, ... to report success. When applied to probabilistic processes,this approach uses the function space[0,1]Ω as the set of outcomes and thestandard partial order for real functions: for anyo1,o2 ∈ O, we haveo1 ≤ o2

if and only if o1(ω) ≤ o2(ω) for everyω ∈ Ω . WhenΩ is fixed, an outcomeo∈ O can be considered as a vector〈o(ω1),o(ω2), ...〉, with o(ωi) representingthe probability of success observed by actionωi . Therefore, this approach iscalledvector-based testing.

For the second instance above, there is a useful simplification: the Hoare andSmyth preorders onclosedsubsets of[0,1], in particular on finite subsets of[0,1],are determined by their maximum and minimum elements respectively.

Proposition 4.1.Let O1, O2 be non-empty closed subsets of[0,1] we have

(i) O1 ≤Ho O2 if and only if max(O1)≤ max(O2)(ii) O1 ≤Sm O2 if and only if min(O1)≤ min(O2).

Proof. Straightforward calculations. ⊓⊔

Remark 4.1.Another formulation of the Hoare and Smyth preorders can be givenas follows. Let(X,≤) be a partially ordered set. IfA ⊆ X, then theupper-setandthe lower-setof A are defined by↑A= a′ ∈ X | a≤ a′ for somea∈ A and dually↓A= a′ ∈X | a′ ≤a for somea∈ A, respectively. Then for any nonemptyA,B⊆Xwe have

1. A≤Ho B if and only if ↓A ⊆ ↓B;2. A≤Sm B if and only if ↑A ⊇ ↑B.

As in the non-probabilistic case [13], we could also define a testing preordercombining the may- must-preorders; we will not study this combination here.

4.2 Testing probabilistic processes

We start with some notation. ForΘ a distribution overR and functionf : R→ S,then Imgf (Θ) is the distribution overSgiven by

Imgf (Θ)(s) = ∑Θ(r) | r ∈ Rand f (r) = s

for anys∈ S. For∆ a distribution overSand functionf : S→ X into a vector spaceX; we sometimes write Exp∆ ( f ) or simply f (∆) for ∑s∈S∆(s) · f (s), theexpectedvalueof f . Our primary use of this notation is withX being the vector space of

Page 84: Semantics of Probabilistic Processes - SJTU

74 4 Probabilistic Testing Semantics

reals, but we will also use it with tuples of reals, or distributions over some set. Inthe latter case this amounts to the notation∑i∈I pi ·∆i , whereI is a finite index setand∑i∈I pi = 1. Whenp∈ [0,1], we also writef1 p⊕ f2 for p · f1+(1− p) · f2. Moregenerally, for functionF : S→ P+(X) with P+(X) being the collection of non-empty subsets ofX, we define Exp∆ F := Exp∆ ( f ) | f ∈F ; heref ∈F means thatf : S→X is achoice function, that is it satisfies the constraint thatf (s) ∈ F(s) forall s∈ S. Recall that a setO is convexif o1,o2 ∈ O andp∈ [0,1] then the weightedaverageo1 p⊕ o2 is also inO.

Definition 4.1. A (probabilistic) processis defined by a tuple〈S,∆,Actτ ,→〉,whereActτ is a set of actionsAct augmented with a special actionτ, 〈S,Actτ ,→〉is a pLTS and∆ is a distribution overS, called theinitial distribution of the pLTS.We sometimes identify a process with its initial distribution when the underlyingpLTS is clear from the context.

We now define the parallel composition of two processes.

Definition 4.2. Let P1 = 〈S1,∆1 ,Actτ ,→1〉 andP2 = 〈S2,∆

2 ,Actτ ,→2〉 be two pro-cesses, andA a set of visible actions. Theparallel composition P1|AP2 is the com-posite process〈S1×S2,∆

1 ×∆2 ,Actτ ,→〉 where→ is the least relation satisfying

the following rules:

s1α−→1 ∆ α 6∈ A

(s1,s2)α−→ ∆ × s2

s2α−→2 ∆ α 6∈ A

(s1,s2)α−→ s1×∆

s1a−→1 ∆1, s2

a−→2 ∆2 a∈ A

(s1,s2)τ−→ ∆1×∆2

Parallel composition is the basis of testing: it models the interaction of the ob-server with the process being tested; and it models the observer himself — as aprocess. LetΩ := ω1,ω2, · · · be a countable set ofsuccess actions, disjoint fromActτ , define anΩ -testto be a process〈S,∆,Actτ ∪Ω ,→〉 with the constraint thats ωi−→ ands ω j−→ imply i = j. Let T be the class of all such tests, and writeTn forthe subclass ofT that uses at mostn success actionsω1, ...,ωn; we writeTN for⋃

n∈NTn.To apply test T to processP we first form the compositionT |Act P and then

resolve all nondeterministic choices into probabilistic choices. Thus, we obtain a setof resolutions as in Definition 4.3 below. For each resolution, any particular successaction ωi will have some probability of occurring; and those probabilities, takentogether, give us a singlesuccess tuplefor the whole resolution, so that ifo is thetuple theno(ωi) is the recorded probability ofωi ’s occurrence. The set of all thosetuples, i.e. over all resolutions ofT |Act P, is then the complete outcome of applyingtestT to processP: as such, it will be a subset of[0,1]Ω .

Definition 4.3. A resolutionof a distribution∆ ∈ D(S) in a pLTS〈S,Ωτ ,→〉 is atriple 〈R,Θ ,→R〉 where〈R,Ωτ ,→R〉 is a deterministic pLTS andΘ ∈ D(R), suchthat there exists aresolving function f∈ R→ Ssatisfying

Page 85: Semantics of Probabilistic Processes - SJTU

4.2 Testing probabilistic processes 75

(i) Img f (Θ) = ∆(ii) if r α−→R Θ ′ for α ∈ Ωτ then f (r) α−→ Imgf (Θ ′)

(iii) if f (r) α−→ for α ∈ Ωτ thenr α−→R .

A resolution〈R,Ωτ ,→R〉 is said to bestatic if its resolving functionfR is injective.

A resolution has as its main effect the choosing in any state of a single outgoingtransition from all available ones with a same label; butf can be non-injective, sothat the choice can vary between different departures from that state, depending e.g.on the history of states and actions that led there. Further,since a single state ofS can be split into a distribution over several states ofR, all mapped to it byf ,probabilistic interpolation between distinct choices is obtained automatically.

Static restrictions are particularly simple, in that they do not allow states to beresolved into distributions, or computation steps to be interpolated.

Example 4.1.Consider the pLTS in Figure 4.1(a), with initial distributions1. One ofits resolutions is described in (b). The associated resolving functionf is the follow-ing.

f (r1) = f (r2) = f (r4) = f (r5) = s1

f (r3) = f (r6) = s2

f (r7) = s3

f (r8) = s4

From states1 there are three outgoing transitions all labelled withτ. According tothe resolution in (b), the first times1 is visited, the transitions1

τ−→ s1 is chosenwith probability 2/3 and the transitions1

τ−→ s2 probability 1/3. The second times1 is visited, the transitions1

τ−→ s2 is taken with probability 1/4 and the transitions1

τ−→ s3 12⊕ s4 probability 3/4.

We now explain how to associate an outcome with a particular resolution, whichin turn will associate a set of outcomes with a distribution in a pLTS. Given a deter-ministic pLTS〈R,Ωτ ,→〉 consider the functionC : (R→ [0,1]Ω )→ (R→ [0,1]Ω )defined by

C ( f )(r)(ω) :=

1 if r ω−→

0 if r 6ω−→ andr 6τ−→

Exp∆ ( f )(ω) if r 6ω−→ andr τ−→ ∆ .

(4.1)

We view the unit interval[0,1] ordered in the standard manner as a complete lattice;this induces the structure of a complete lattice on the product [0,1]Ω and in turn onthe set of functionsR→ [0,1]Ω via the partial order≤ defined pointwise by lettingf ≤ g iff f (r)≤ g(r) for all r ∈ R. The functionC is easily seen to be monotone andtherefore has a least fixed point, which we denote byV〈R,Ωτ ,→〉; this is abbreviatedtoV when the resolution in question is understood.

Now letA (T,P) denote the set of vectors

A (T,P) := ExpΘ (V〈R,Ωτ ,→〉) | 〈R,Θ ,→〉 is a resolution of[T |Act P (4.2)

Page 86: Semantics of Probabilistic Processes - SJTU

76 4 Probabilistic Testing Semantics

1/21/2

τ τ

1/3 2/3

τ

3/41/4

τ τ

1/2 1/2

τ

s1

ss

1r r

2

r3

r4 r

5

r6

r7r

8

ω1

(b) (a)

ω2

ω1

ω2

4

τ

2s

3

Fig. 4.1 A resolution

where[T |Act P stands for the initial distribution of the processT |Act P.We note that the result setA (T,P) is convex.

Lemma 4.1.For any test T and process P, if o1,o2 ∈ A (T,P), then their weightedaverage o1 p⊕ o2 is also inA (T,P) for any p∈ [0,1].

Proof. Let 〈R1,Θ1,→1〉,〈R2,Θ2,→2〉 be the resolutions of[T |Act P that give riseto o1,o2. We take their disjoint union, except initially we letΘ :=Θ1 p⊕Θ2, to definethe new resolution〈R1∪R2,Θ ,→1 ∪→2〉. It is easy to see that the new resolutiongenerates the interpolated tupleo1 p⊕ o2 as required. ⊓⊔

Definition 4.4 (Probabilistic testing preorders).

(i) P⊑ΩpmayQ if for everyΩ -testT, A (T,P)≤Ho A (T,Q).

(ii) P⊑ΩpmustQ if for everyΩ -testT, A (T,P)≤Sm A (T,Q).

These preorders are abbreviated toP⊑pmay Q, andP ⊑pmustQ, when|Ω |= 1, andtheir kernels are denoted by≃pmay and≃pmustrespectively.

If |Ω | = 1 then vector-based testing degenerates into scalar testing. In general,if |Ω | > 1 then vector-based testing appears more discriminating than scalar test-ing. Surprisingly, if we restrict ourselves to finitary processes, then scalar testing isequally powerful as vector-based testing. A key ingredientof the proof is to exploita method of testing calledreward testing, which we will introduce in Section 4.4.

Page 87: Semantics of Probabilistic Processes - SJTU

4.3 Bounded continuity 77

4.3 Bounded continuity

In this section we introduce a notion of bounded continuity for real-valued functions.We will exploit it to show the continuity of the functionC defined in (4.1), and thenuse it again in Chapter 6. We begin with a handy lemma below.

Lemma 4.2 (Exchange of suprema).Let function g: N×N→R be such that it is

(i) monotone in both of its arguments, so that i1 ≤ i2 implies g(i1, j) ≤ g(i2, j),and j1 ≤ j2 implies g(i, j1)≤ g(i, j2), for all i, i1, i2, j, j1, j2 ∈ N, and

(ii) bounded above, so that there is a c∈R≥0 with g(i, j) ≤ c for all i, j ∈N.

Thenlimi→∞

limj→∞

g(i, j) = limj→∞

limi→∞

g(i, j).

Proof. Conditions (i) and (ii) guarantee the existence of all the limits. Moreover, fora non-decreasing sequence its limit and supremum agree, andboth sides equal thesupremum of allg(i, j) for i, j ∈ N. In fact,(R∪∞,≤) is a CPO, and it is a basicresult of CPO’s [30] that

i∈N

(⊔

j∈N

g(i, j)) =⊔

j∈N

(⊔

i∈N

g(i, j)). ⊓⊔

The following proposition states that some real functions satisfy the property ofbounded continuity, which allows the exchange of limit and sum operations.

Proposition 4.2 (Bounded continuity - nonnegative function). Suppose a functionf : N×N→R≥0 satisfies the following conditions:

C1. f is monotone in the second parameter, i.e. j1 ≤ j2 implies f(i, j1) ≤ f (i, j2)for all i , j1, j2 ∈ N;

C2. for any i∈ N, the limit lim j→∞ f (i, j) exists;C3. the partial sums Sn = ∑n

i=0 lim j→∞ f (i, j) are bounded, i.e. there exists somec∈ R≥0 such that Sn ≤ c for all n≥ 0.

Then the following equality holds:

∑i=0

limj→∞

f (i, j) = limj→∞

∑i=0

f (i, j).

Proof. Let g : N×N → R≥0 be the function defined byg(n, j) = ∑ni=0 f (i, j). It

is easy to see thatg is monotone in both arguments. ByC1 andC2, we have thatf (i, j) ≤ lim j→∞ f (i, j) for anyi, j ∈N. So for anyj,n∈ N we have that

g(n, j) =n

∑i=0

f (i, j) ≤n

∑i=0

limj→∞

f (i, j) ≤ c

according toC3. In other words,g is bounded above. Therefore we can applyLemma 4.2 and obtain

Page 88: Semantics of Probabilistic Processes - SJTU

78 4 Probabilistic Testing Semantics

limn→∞

limj→∞

n

∑i=0

f (i, j) = limj→∞

limn→∞

n

∑i=0

f (i, j). (4.3)

For any j ∈N, the sequenceg(n, j)n≥0 is nondecreasing and bounded, so its limit∑∞

i=0 f (i, j) exists. That is,

limn→∞

n

∑i=0

f (i, j) =∞

∑i=0

f (i, j). (4.4)

In view of C2, we have that, for any givenn∈N, the limit lim j→∞ ∑ni=0 f (i, j) exists

andn

∑i=0

limj→∞

f (i, j) = limj→∞

n

∑i=0

f (i, j). (4.5)

By C3 the sequenceSnn≥0 is bounded. Since it is also nondecreasing, it converges

to∞

∑i=0

limj→∞

f (i, j). That is,

limn→∞

n

∑i=0

limj→∞

f (i, j) =∞

∑i=0

limj→∞

f (i, j). (4.6)

Hence the left-hand side of the desired equality exists. By combining (4.3)-(4.6) we

obtain the result that∞

∑i=0

limj→∞

f (i, j) = limj→∞

∑i=0

f (i, j). ⊓⊔

Proposition 4.3 (Bounded continuity - general function).Let f : N×N→R be afunction that satisfies the following conditions

C1. j1 ≤ j2 implies| f (i, j1)| ≤ | f (i, j2)| for all i , j1, j2 ∈ N;C2. for any i∈ N, the limit lim j→∞ | f (i, j)| exists;C3. the partial sums Sn = ∑n

i=0 lim j→∞ | f (i, j)| are bounded, i.e. there exists somec∈ R≥0 such that Sn ≤ c, for all n≥ 0;

C4. for all i , j1, j2 ∈ N, if j1 ≤ j2 and f(i, j1)> 0 then f(i, j2)> 0.

Then the following equality holds:

∑i=0

limj→∞

f (i, j) = limj→∞

∑i=0

f (i, j).

Proof. By Proposition 4.2 and conditionsC1, C2 andC3, we infer that

limj→∞

∑i=0

| f (i, j)| =∞

∑i=0

limj→∞

| f (i, j)|. (4.7)

C4 implies that the functiong : N×N→ R≥0 given byg(i, j) := f (i, j)+ | f (i, j)|satisfies conditionsC1, C2 and C3 of Proposition 4.2. In particular, the limitlim j→∞ g(i, j) = 0 if f (i, j) ≤ 0 for all j ∈ N, and limj→∞ g(i, j) = 2lim j→∞ | f (i, j)|otherwise. Hence

Page 89: Semantics of Probabilistic Processes - SJTU

4.3 Bounded continuity 79

limj→∞

∑i=0

( f (i, j)+ | f (i, j)|) =∞

∑i=0

limj→∞

( f (i, j)+ | f (i, j)|). (4.8)

Since∑∞i=0 f (i, j) = ∑∞

i=0( f (i, j)+ | f (i, j)|)−∑∞i=0 | f (i, j)|, we then have

lim j→∞ ∑∞i=0 f (i, j) = lim j→∞(∑∞

i=0( f (i, j)+ | f (i, j)|)−∑∞i=0 | f (i, j)|)

[existence of the two limits by (4.7) and (4.8)]= lim j→∞ ∑∞

i=0( f (i, j)+ | f (i, j)|)− lim j→∞ ∑∞i=0 | f (i, j)|

[by (4.7) and (4.8)]= ∑∞

i=0 lim j→∞( f (i, j)+ | f (i, j)|)−∑∞i=0 lim j→∞ | f (i, j)|

= ∑∞i=0(lim j→∞( f (i, j)+ | f (i, j)|)− lim j→∞ | f (i, j)|)

= ∑∞i=0 lim j→∞( f (i, j)+ | f (i, j)|− | f (i, j)|)

= ∑∞i=0 lim j→∞ f (i, j)

⊓⊔

Lemma 4.3.The functionC defined in (4.1) is continuous.

Proof. Let f0 ≤ f1 ≤ ... be an increasing chain inR→ [0,1]Ω . We need to show that

C (⊔

n≥0

fn) =⊔

n≥0

C ( fn) (4.9)

For anyr ∈ R, we are in one of the following three cases:

1. r ω−→ for someω ∈ Ω . We have

C (⊔

n≥0 fn)(r)(ω) = 1 by (4.1)=⊔

n≥01=⊔

n≥0C ( fn)(r)(ω)= (

n≥0C ( fn))(r)(ω)

andC (⊔

n≥0

fn)(r)(ω ′) = 0= (⊔

n≥0

C ( fn))(r)(ω ′)

for all ω ′ 6= ω .2. r 6−→. Similar to last case. We have

C (⊔

n≥0

fn)(r)(ω) = 0= (⊔

n≥0

C ( fn))(r)(ω)

for all ω ∈ Ω .3. Otherwise,r τ−→ ∆ for some∆ ∈ D(R). Then we infer that, for anyω ∈ Ω ,

Page 90: Semantics of Probabilistic Processes - SJTU

80 4 Probabilistic Testing Semantics

C (⊔

n≥0 fn)(r)(ω) = (⊔

n≥0 fn)(∆)(ω) by (4.1)= ∑r∈⌈∆⌉ ∆(r) · (

n≥0 fn)(r)(ω)= ∑r∈⌈∆⌉ ∆(r) · (

n≥0 fn(r))(ω)= ∑r∈⌈∆⌉

n≥0 ∆(r) · fn(r)(ω)= ∑r∈⌈∆⌉ limn→∞ ∆(r) · fn(r)(ω)= limn→∞ ∑r∈⌈∆⌉∆(r) · fn(r)(ω) by Proposition 4.2=⊔

n≥0 ∑r∈⌈∆⌉ ∆(r) · fn(r)(ω)=⊔

n≥0 fn(∆)(ω)=⊔

n≥0C ( fn)(r)(ω)= (

n≥0C ( fn))(r)(ω)

In the above reasoning, Proposition 4.2 is applicable because we can define thefunction f : R×N→ R≥0 by letting f (r,n) = ∆(r) · fn(r)(ω) and check thatfsatisfies the three conditions in Proposition 4.2. IfR is finite, we can extend itto a countable setR′ ⊇ Rand requiref (r ′,n) = 0 for all r ′ ∈ R′\Randn∈N.

a. f satisfies conditionC1. For anyr ∈ R and n1,n2 ∈ N, if n1 ≤ n2 thenfn1 ≤ fn2. It follows that

f (r,n1) = ∆(r) · fn1(r)(ω) ≤ ∆(r) · fn2(r)(ω) = f (r,n2).

b. f satisfies conditionC2. For anyr ∈ R, the sequence∆(r) · fn(r)(ω)∞n=0

is nondecreasing and bounded by∆(r). Therefore, the limit limn→∞ f (r,n)exists.

c. f satisfies conditionC3. For anyR′′ ⊆ R, we can see that the partial sum∑r∈R′′ limn→∞ f (r,n) is bounded because

∑r∈R′′

limn→∞

f (r,n) = ∑r∈R′′

limn→∞

∆(r) · fn(r) ≤ ∑r∈R′′

∆(r) ≤ ∑r∈R

∆(r) = 1.

⊓⊔

Because of Lemma 4.3 and Proposition 2.1, the least fixed point of C can bewrittenV=

n∈NC n(⊥), where⊥(r) = 0 for all r ∈ R.

4.4 Reward testing

In this section we introduce an alternative testing approach based on the probabilistictesting discussed in Section 4.2. The idea is to associate each success actionω ∈ Ωa reward, and performing a success action means accumulating some reward. Theoutcomes of this reward testing are expected rewards.

Page 91: Semantics of Probabilistic Processes - SJTU

4.4 Reward testing 81

4.4.1 A geometric property

We have seen from Proposition 4.1 that the comparison of two sets with respect tothe Hoare and Smyth preorders can be simplified if they are closed subsets of[0,1]as it suffices to consider their maximum and minimum elements. This simplificationdoes not apply if we want to compare two subsets of[0,1]n, even if they are closed,because maximum and minimum elements might not exist for sets of vectors. How-ever, we can convert a set of vectors into a set of scalars by taking the expectedreward entailed by each vector with respect to a reward vector. Interestingly, thecomparison of two sets of vectors is related to the comparison of suprema and infimaof two sets of scalars, provided some closure conditions areimposed. Therefore, tosome extent we generalise Proposition 4.1 from the comparison of sets of scalars tothe comparison of sets of vectors. Mathematically, the result can be viewed as ananalytic property in geometry, which could be of independent interest.

Suppose that in vector-based testing we use at mostn success actions taken fromthe setΩ = ω1, ...,ωn with n > 1. A testing outcomeo∈ [0,1]Ω can be viewedas then-dimensional vector〈o(ω1), ...,o(ωn)〉 whereo(ωi) is the probability of suc-cessfully reaching success actionωi . Similarly, areward vector h∈ [0,1]Ω can beregarded as the vector〈h(ω1), ...,h(ωn)〉 whereh(ωi) is the reward given toωi . Wesometimes take the dot-product of a particular vectorh∈ [0,1]n and a set of vectorsO⊆ [0,1]n, resulting in a set of scalars given byh ·O := h ·o | o∈ O.

Definition 4.5. A subsetO of the n-dimensional Euclidean space isp-closed(forprobabilisticallyclosed) iff

• It is convex, and• It is Cauchy closed, that is it contains all its limit points in the usual Euclidean

metric, and it isbounded. 1

We are now ready to generalise Proposition 4.1 in order to compare two setsof vectors. Here we require one set to be p-closed, which allows us to appeal to theSeparation theorem from discrete geometry (cf. Theorem2.7); Cauchy closure aloneis not enough.

Theorem 4.1.Let A,B be subsets of[0,1]n; then we have

A≤Ho B iff ∀h∈ [0,1]n :⊔

h ·A≤⊔

h ·B if B is p-closed, andA≤Sm B iff ∀h∈ [0,1]n :

dh ·A≤

dh ·B if A is p-closed.

Proof. We consider first theonly-if -direction for the Smyth case:

A≤Sm Biff ∀b∈ B : ∃a∈ A : a≤ b Definition of≤Sm

implies ∀h∈ [0,1]n : ∀b∈ B : ∃a∈ A : h ·a≤ h ·b h≥ 0implies ∀h∈ [0,1]n : ∀b∈ B :

dh ·A≤ h ·b

dh ·A≤ h ·a

implies ∀h∈ [0,1]n :d

h ·A≤d

h ·B Definition of infimum

1 Cauchy closure and boundedness together amounts tocompactness.

Page 92: Semantics of Probabilistic Processes - SJTU

82 4 Probabilistic Testing Semantics

Fig. 4.2 Separation by a hyperplane. Reprinted from [12], with kind permission from SpringerScience+Business Media.

For theif -direction we use separating hyperplanes, proving the contrapositive:

A 6≤Sm Biff ∀a∈ A : ¬(a≤ b) Definition of≤Sm; for someb∈ Biff A∩B′ = /0 defineB′ := b′ ∈ R

n | b′ ≤ biff ∃h∈ R

n,c∈ R :∀a∈ A,b′ ∈ B′ :

h ·b′ < c< h ·a

In the last step of the reasoning above we have used Theorem 2.7 asA is p-closedand B′ is convex and Cauchy-closed by construction; see Figure 4.2. Moreover,without loss of generality the inequality can be in the direction shown, else wesimply multiplyh,c by−1.

We now argue thath is nonnegative, whence by scaling ofh,c we obtain withoutloss of generality thath ∈ [0,1]n. Assume for a contradiction thathi < 0. Choosescalard ≥ 0 large enough so that the pointb′ := (b1, · · · ,bi − d, · · · ,bn) falsifiesh · b′ < c; sinceb′ is still in B′, however, that contradicts the separation. Thus wecontinue

Page 93: Semantics of Probabilistic Processes - SJTU

4.4 Reward testing 83

iff∃h∈ [0,1]n,c∈R :∀a∈ A,b′ ∈ B′ :

h ·b′ < c< h ·aabove comments concerningd

iff ∃h∈ [0,1]n,c∈ R : ∀a∈ A : h ·b< c< h ·a setb′ to b; noteb∈ B′

implies∃h∈ [0,1]n,c∈ R : h ·b< c≤d

h ·A property of infimumimplies∃h∈ [0,1]n,c∈R :

dh ·B< c≤

dh ·A b∈ B, hence

dh ·B≤ h ·b

implies¬(∀h∈ [0,1]n :d

h ·A≤d

h ·B)

The proof for the Hoare-case is analogous. ⊓⊔

4.4.2 Nonnegative rewards

Reward testing is obtained from the probabilistic testing in Section 4.2 by associ-ating each success actionω ∈ Ω a reward, which is a nonnegative number in theunit interval[0,1], and a run of a probabilistic process in parallel with a test yieldsan expected reward accumulated by those states that can enable success actions. Areward tupleh∈ [0,1]Ω is used to assign rewardh(ω) to success actionω , for eachω ∈ Ω . Due to the presence of nondeterminism, the application of atestT to a pro-cessP produces a set of expected rewards. Two sets of rewards can becompared byexamining their supremum/infimum elements; this gives us two methods of testingcalled reward may/must testing. In general, a reward could also be a negative realnumber. But in this chapter we only consider nonnegative rewards; the general casewill be discussed in Section 6.9.

Formally, leth : Ω → [0,1] be a reward vector that assigns a nonnegative rewardto each success action. In analogy to the functionC in (4.1), we now define a func-tion C h : (R→ [0,1]) → (R→ [0,1]) with respect to reward vectorh in order toassociate a reward with a deterministic process, which in turn will associate a set ofrewards with a process.

Ch( f )(r) :=

h(ω) if r ω−→

0 if r 6ω−→ andr 6τ−→

Exp∆ ( f ) if r 6ω−→ andr τ−→ ∆ .

(4.10)

The functionC h is also continuous, thus has a least fixed pointVh. Let A h(T,P)

denote the set of rewards

ExpΘ (Vh) | 〈R,Θ ,→〉 is a resolution of[T |Act P .Definition 4.6. Let P andQ be two probabilistic processes. We define two rewardtesting preorders.

(i) P ⊑Ωnrmay Q if for every Ω -testT and nonnegative reward tupleh ∈ [0,1]Ω ,

⊔A h(T,P)≤

⊔A h(T,Q).

Page 94: Semantics of Probabilistic Processes - SJTU

84 4 Probabilistic Testing Semantics

(ii) P ⊑Ωnrmust Q if for every Ω -testT and nonnegative reward tupleh ∈ [0,1]Ω ,d

A h(T,P)≤d

A h(T,Q).

These preorders are abbreviated toP⊑nrmay Q andP⊑nrmustQ, when|Ω |= 1.

Example 4.2.Let us use the CCS notationa.ω to mean the test that can performactiona followed by ω before reaching a deadlock state. By applying this test tothe processQ1 in Figure 4.3(a) we obtain the pLTS in Figure 4.3(b) that is alreadydeterministic, hence has only one resolution, itself. Moreover the outcomeVh as-sociated with it is determined by its value at the states0. This in turn is the leastsolution of the equation

Vh(s0) =

12·Vh(s0)+

12

h(ω)

In fact this equation has a unique solution in[0,1], namely h(ω). Therefore,A h(a.ω ,Q1) = h(ω).

τ

1/21/2

τ

1/2 1/2

ττ

a

ω

τ

(b) (a)

s

ss

s

s

2

3

4

1

0

Fig. 4.3 Testing the processQ1

Example 4.3.Consider the processQ2 depicted in Figure 4.4(a) and the applicationof the testT = a.ω to it; this is outlined in Figure 4.4(b). In the pLTS ofT |Act Q2,for eachk≥ 1 there is a resolutionRk such thatVh(Rk) = (1− 1

2k )h(ω); intuitively itgoes around the loop(k−1) times before at last taking the right handτ action. ThusA h(T,Q2) contains(1− 1

2k )h(ω) for everyk≥ 1. But it also containsh(ω), because

of the resolution that takes the left handτ-move every time. Therefore,A h(T,Q2)includes the set

(1−12)h(ω), (1−

122 )h(ω), . . . ,(1−

12k )h(ω), . . . h(ω)

Page 95: Semantics of Probabilistic Processes - SJTU

4.5 Extremal reward testing 85

1/2 1/2

τ

a

τ τ τ

1/2 1/2

τ

ω

(a) (b)

τ τ

1/2 1/2

s0

s1

s3 4

a

1/2

ω

τ

1/2

s2s

Fig. 4.4 Testing the processQ2

From later results it will follow thatA h(T,Q2) is actually the convex closure of thisset, namely[1

2h(ω),h(ω)].

4.5 Extremal reward testing

In the previous section our approach to testing consists of two parts:

(1) For each testT, processP, and reward vectorh calculate a set of outcomesA h(T,P), which is a subset of[0,1].

(2) For each pair of processesP, Q compare the corresponding sets of outcomes,A h(T,P) andA h(T,Q) for every testT and reward vectorh, in terms of theirsuprema and infima.

But our methods for comparing sets of outcomes does not necessarily require usto calculate the entire set of outcomes. For example, Proposition 4.1 says that forclosed sets it suffices to compare extremal outcomes. Here wepropose an alternativeapproach to testing based on calculating directly the extremal values of possibleoutcomes.

Note that the results in Sections 4.5 and 4.6 are not used in the rest of this book,except for a notion of extreme policy (Definition 4.8) that isreferred to in Sec-tion 6.5.1. So the reader may skip these two sections in the first reading, and returnback when necessary.

The functionC h used to associate a reward with a resolution, is only defined,in (4.10) above, for deterministic pLTS’s. Here we considergeneralisations to anarbitrary finitely branching pLTS〈S,Ωτ ,→〉.

Now consider the functionC hmin : (S→[0,1])→ (S→[0,1]) defined by:

Page 96: Semantics of Probabilistic Processes - SJTU

86 4 Probabilistic Testing Semantics

Chmin( f )(s) =

h(ω) if s ω−→

0 if s 6ω−→ ands 6τ−→

minExp∆ ( f ) | s τ−→ ∆ if s 6ω−→ ands τ−→

In a similar fashion we can define the functionC hmax : (S→[0,1])→ (S→[0,1]) that

uses themaxfunction in place ofmin. Both these functions are monotone, and there-fore have least fixed points, which we abbreviate toV

hmin, V

hmax respectively.

Lemma 4.4.For any finitely branching pLTS and reward vector h,

(a) both functionsC hmin andC h

max are continuous;(b) both results functionsVh

min andVhmax are continuous.

Proof. Again the proof of part (a) is non-trivial. See Lemma 4.3 for the continuityof C . The continuity ofC h

min andC hmin can be similarly shown. However part (b) is

an immediate consequence. ⊓⊔

So in analogy with the evaluation functionVh these results functions can be capturedby a chain of approximants:

Vhmin =

n∈NVh,nmin and V

hmax=

n∈NVh,nmax (4.11)

whereVh,0min(s) = V

h,0max(s) = 0 for every states∈ S, and

• Vh,(k+1)min = Cmin(V

h,kmin)

• Vh,(k+1)max = Cmax(V

h,kmax)

Now for a testT, a processP, and a reward vectorh, we have two ways ofdefining the outcome of the application ofT to P:

Ah

min(T,P) = Vhmin([T |Act P)

Ah

max(T,P) = Vhmax([T |Act P)

HereAmin(T,P) returns a single probabilityp, estimating the minimal probabilityof success; it is a pessimistic estimate. On the other handAmax(T,P) is optimistic,in that it gives the maximal probability of success.

Definition 4.7. The extremal reward testing preorders⊑Ωermay and⊑Ω

ermust are de-fined as follows:

(i) P⊑ΩermayQ if for every Ω -test T and nonnegative reward tupleh ∈ [0,1]Ω ,

A hmax(T,P)≤ A h

max(T,Q).(ii) P ⊑Ω

ermust Q if for every Ω -testT and nonnegative reward tupleh ∈ [0,1]Ω ,A h

min(T,P)≤ A hmin(T,Q).

These preorders are abbreviated toP⊑ermayQ andP ⊑ermust Q when|Ω | = 1. Weuse the obvious notation for the kernels of these preorders.

Page 97: Semantics of Probabilistic Processes - SJTU

4.6 Extremal reward testing versus resolution-based reward testing 87

Example 4.4.Let T be the testa.ω . By applying it to the processP in Figure 4.3(a)we obtain the pLTS in (b) that is deterministic and thereforeall three functionsV

hmax, V

hmin, V

h coincide, givingA hmax(a.ω ,P) = A h

min(T,P) = h(ω).Again it is straightforward to establish

P≃ermay a.0 P≃ermusta.0

wherea.0 is the process with the only behaviour of performing actiona beforehalting.

Example 4.5.Consider the pLTS from Figure 4.4 resulting from the application ofthe testT = a.ω to the processQ. It is easy to see that the functionVh

max satisfies

Vhmax(s0) = max

12

h(ω), x (4.12)

x =12

h(ω)+12·Vh

max(s0)

It is not difficult to show that this has a unique solution, namely Vhmax(s0) = h(ω).

With further analysis one can conclude that

Q≃ermay a.0

If max is replaced bymin in (4.12) above then the resulting equation also has aunique solution, givingVh

min(s0) =12h(ω). It follows that

a.0 6⊑ermustQ

becauseVhmin([T |Act a.0) = h(ω). Again further analysis will show

Q⊑ermusta.0

4.6 Extremal reward testing versus resolution-based rewardtesting

In this section we compare the two approaches of testing introduced in the previ-ous two subsections. Our first result is that in the most general setting they lead todifferent testing preorders.

Example 4.6.Consider the infinite-state pLTS in Figure 4.5.We compare the states1 with the processa.0. With the testa.ω , using resolu-

tions, we get:

Ah(a.ω ,s1) = l0,(1−

12)h(ω), . . . ,(1−

12k )h(ω), . . .

Ah(a.ω ,a.0) = h(ω)

(4.13)

Page 98: Semantics of Probabilistic Processes - SJTU

88 4 Probabilistic Testing Semantics

τ

a

1/21/2

a a

τ τ

τ τ τ

1/4 3/4 1/8 7/8

. . . . . .s1 s s2 3

Fig. 4.5 An infinite-state pLTS

which means thata.0 6⊑1pmays1.

However when we use extremal testing, the testa.ω cannot distinguish theseprocesses. It is straightforward to see thatV

hmax(a.ω |Act a.0) = h(ω). To see that

Vhmax(a.ω |Act s1) also evaluates toh(ω), we letxk =V

hmax(a.ω |Act sk), for all k≥ 1,

and we have the following infinite equation system.

x1 = max 12h(ω),x2

x2 = max(1− 14)h(ω),x3

...xk = max(1− 1

2k )h(ω),xk+1...

We havexk = h(ω) for all k≥ 1 as the least solution of the above equation system.With some more work one can go on to show that no test can distinguish between

these processes using optimistic extremal testing, meaning thata.0⊑ermays1.

In the remainder of this section we show that provided some finitary constraintsare imposed on the pLTS extremal reward testing and resolution-based reward test-ing coincide. First we examinemusttesting, which is easier than themaycase; thisin turn is treated in the following section.

4.6.1 Must testing

Here we show that provided we restrict our attention to finitely branching processesthere is no difference between extremalmusttesting, and resolution-basedmusttest-ing.

Let us consider a pLTS〈S,Ωτ ,→〉, obtained perhaps from applying a testT toa processP in (T |Act P). We have two ways of obtaining a result for a distribution

Page 99: Semantics of Probabilistic Processes - SJTU

4.6 Extremal reward testing versus resolution-based reward testing 89

of states fromS, by applying the functionVhmin, or by using resolutions of the pLTS

to realiseVh. Our first result says that regardless of the actual resolution used, thevalue obtained from the latter will always dominate the former.

But first we need a technical lemma.

Lemma 4.5.Let g: S→ [0,1], g′ : R→ [0,1] and f : R→ S be three functions sat-isfying g( f (r)) ≤ g′(r) for every r∈ R. Then for every subdistributionΘ over R,Exp∆ (g)≤ ExpΘ (g′) where∆ denotes the subdistribution ImgΘ ( f ).

Proof. A straightforward calculation. ⊓⊔

Proposition 4.4.If 〈R,Θ ,→R〉 is a resolution of a subdistribution∆ then for anyreward vector h it holds thatVh

min(∆)≤ Vh(Θ).

Proof. Let f denote the resolving function. First we show by induction onn that forevery stater ∈ R

Vh,nmin( f (r)) ≤ V

h,n(r) (4.14)

For n= 0, this is trivial. We consider the inductive step; note thatby the previouslemma the inductive hypothesis implies that

ExpΓ (Vh,nmin)≤ ExpΘ (Vh,n) (4.15)

for any pair of subdistributions satisfying whereΓ = ImgΘ ( f ).If r ω−→R Θ , then f (r) ω−→, and thusVh,n+1

min ( f (r)) = h(ω) =Vh,n+1(r). A similar

argument applies ifr 6−→, that isr 6τ−→ ands 6ω−→. So the remaining possibility is thatr τ−→R Θ for someΘ , andr 6ω−→, where we knowf (r) τ−→ ImgΘ ( f ).

Vh,n+1min ( f (r)) = minExp∆ (V

h,nmin)| f (r)

τ−→ ∆

≤ ExpΓ (Vh,nmin) whereΓ denotes ImgΘ ( f )

≤ Vh,n(Θ) by induction and (4.15) above

= Vh,n+1(r)

Now by continuity we have from (4.14) that

Vhmin( f (r)) ≤ V

h(r) (4.16)

The result now follows by the previous lemma, since if〈R,Θ ,→R〉 is a resolutionof a subdistribution∆ with resolving functionf then by definition∆ = ImgΘ ( f ). ⊓⊔

Our next result says that in any finitely branching computation structure we canfind a resolution that realises the functionVmin. Moreover this resolution will static.

Definition 4.8. A (static)extreme policyfor a pLTS〈S,Ωτ ,→〉 is a partial functionep : S D(S) satisfying:

(a) s ω−→ impliess ω−→ ep(s)(b) otherwise, ifs τ−→ thens τ−→ ep(s)

Page 100: Semantics of Probabilistic Processes - SJTU

90 4 Probabilistic Testing Semantics

Intuitively an extreme policyep determines a computation through the pLTS. Butthis set of possible computations, unlike resolutions as defined in Definition 4.3,are very restrictive. Policyep decides at each state, once and for all, which of theavailableτ-choices to take; it does not interpolate, and since it is a function of thestate, it makes the same choice on every visit. But there are two constraints:

(i) Condition (a) ensures an in-built preference for reporting success; if the stateis successful the policy must also report success;

(ii) Condition (b), together with (a), means thatep(s) is defined whenevers−→.This ensures that the policy cannot decide to stop at a states if there is apossibility of proceeding froms; the computation must proceed, if it is possibleto proceed.

An extreme policyep determines a deterministic pLTS〈S,Ωτ ,→ep〉, where→ep

is determined bys→ep ep(s). Moreover for any subdistribution∆ overS it deter-mines the obvious resolution〈S,∆ ,→ep〉, with the identity as the associated resolv-ing function. Indeed it is possible to show that every staticresolution is determinedin this manner by some extreme policy.

Proposition 4.5.Let∆ be a subdistribution in a finitely branching pLTS〈S,Ωτ ,→〉.Then there exists a static resolution of∆ , say〈R,Θ ,→R〉 such that

ExpΘ (Vh) = Exp∆ (Vhmin)

for any reward vector h.

Proof. We exhibit the required resolution by defining an extreme policy overS; inother words the resolution will take the form〈S,Θ ,→ep〉 for some extreme policyep(−).

We say the extreme policyep(−) is min-seekingif its domain iss∈ S | s−→and it satisfies:

if s 6ω−→ but s τ−→ thenVhmin(ep(s))≤ V

hmin(∆) whenevers τ−→ ∆

Note that by design a min-seeking policy satisfies:

if s 6ω−→ but s τ−→ thenVhmin(s) = V

hmin(ep(s)) (4.17)

In a finitely branching pLTS it is straightforward to define a min-seeking extremepolicy:

(i) If s ω−→ then letep(s) be any∆ such thats ω−→ ∆ .(ii) Otherwise, ifs τ−→ let ∆1, . . .∆n be the finite non-empty set∆ | s τ−→ ∆ .

Now letep(s) be any∆k satisfying the propertyVhmin(∆k)≤V

hmin(∆ j ) for every

1≤ j ≤ n; at least one such∆k must exist.

We now show that the static resolution determined by such a policy, 〈S,Θ ,→ep〉satisfies the requirements of the proposition. For the sake of clarity let us writeV

hep(∆) for the value realised for∆ in this resolution.

Page 101: Semantics of Probabilistic Processes - SJTU

4.6 Extremal reward testing versus resolution-based reward testing 91

We already know, from Proposition 4.4, thatVhmin(∆) ≤ V

hep(∆) and so we con-

centrate on the converse,Vhep(∆)≤V

hmin(∆). Recall that the functionVh

ep is the leastfixed point of the functionC h defined in (4.10) above, and interpreted in the aboveresolution. So the result follows if we can show that the functionV

hmin is also a fixed

point. This amounts to proving

Vhmin(s) =

h(ω) if s ω−→

0 if s 6−→

Vhmin(ep(s)) otherwise

However this is a straightforward consequence of (4.17) above. ⊓⊔

Theorem 4.2.Let P and Q be any finitely branching processes. Then P⊑ΩermustQ if

and only if P⊑ΩnrmustQ

Proof. It follows from the two previous propositions that

Vhmin([T |Act P) = minVh(Θ) | 〈R,Θ ,→〉 is a resolution of[T |Act P

for any testT, processP, and reward vectorh. Thus, it is immediate that⊑ermust

coincides with⊑nrmust. ⊓⊔

4.6.2 May testing

Here we can try to apply the same proof strategy as in the previous section. Theanalogue to Proposition 4.4 goes through:

Proposition 4.6.If 〈R,Θ ,→R〉 is a resolution of∆ , then for any reward vector h wehaveVh(∆) ≤V

hmax(Θ).

Proof. Similar to the proof of Proposition 4.4 ⊓⊔

However the proof strategy used in Proposition 4.5 cannot beused to show thatVhmax

can be realised by some static resolution, as the following example shows.

Example 4.7.In analogy with the definition used in the proof of Proposition 4.5, wesay that an extreme policyep(−) is max-seekingif its domain is precisely the sets∈ S | s−→, and

if s 6ω−→ buts τ−→ thenVhmax(∆)≤ V

hmax(ep(s)) whenevers τ−→ ∆

This ensures thatVhmax(s) = V

hmax(ep(s)), whenevers τ−→ ands 6ω−→, and again it is

straightforward to define a max-seeking extreme policy in a finitely branching pLTS.However the resulting resolution does not in general realise the functionVh

max.To see this, let us consider the (finitely branching) pLTS used in Example 4.6.

Here in addition to the two statesω .0 and0 there is the infinite sets1, . . .sk, . . .and the transitions

Page 102: Semantics of Probabilistic Processes - SJTU

92 4 Probabilistic Testing Semantics

• skτ−→ sk+1

• skτ−→ [0 1

2k⊕ ω .0 .

We have calculated thatVhmax(sk) to beh(ω) for everyk, and a max-seeking ex-

treme policy is determined byep(sk) = sk+1; indeed this is essentially the only suchpolicy. However the resolution associated with this policydoes not realiseVh

max, asV

hep(sk) = 0.

Nevertheless we will show that if we restrict attention to finitary pLTS’s, thenthere will always exist some static resolution that realisesVh

max. The proof relies ontechniques used in Markov process theory [27], and unlike that of Proposition 4.5 isnon-constructive; we simply prove that some such resolution exists, without actuallyshowing how to construct it. Although such techniques are relatively standard inthe theory of Markov decision processes, see [27] for example, they are virtuallyunknown in concurrency theory. So we give a detailed exposition that cumulates inTheorem 4.3.

Consider the set of all functions from a finite setR to [0,1], denoted by[0,1]R,and the distance functiond over [0,1]R defined byd( f ,g) = max| f (r)− g(r)|r∈R.We can check that([0,1]R,d) constitutes a complete metric space. Letδ ∈ (0,1) bea discount factor. The discounted version of the functionC h in (4.10),C δ ,h : (R→ [0,1])→ (R→ [0,1]) defined by

Cδ ,h( f )(r) =

h(ω) if r ω−→0 if r 6ω−→ andr 6τ−→δ ·Exp∆ ( f ) if r 6ω−→ andr τ−→ ∆

(4.18)

is a contraction mapping with constantδ . It follows from the Banach fixed-pointtheorem (cf. Theorem 2.8) thatC δ ,h has a unique fixed point whenδ < 1, whichwe denote byVδ ,h. On the other hand, it can be shown thatC δ ,h is a continuousfunction over the complete lattice[0,1]R. SoVδ ,h, as the least fixed point ofC δ ,h,has the characterisationVδ ,h =

n∈NVδ ,h,n, whereVδ ,h,n is the n-th iteration of

C δ ,h over⊥. Note that if there is no discount, i.e.δ = 1, we see thatC δ ,h,Vδ ,h

coincides withC h,Vh respectively. Similarly, we can defineVδ ,hmin andVδ ,h

max.

The functionsC δ ,h andVδ ,hmax have the following properties.

Lemma 4.6.Let h∈ [0,1]Ω be any reward vector.

1. For anyδ ∈ (0,1], the functionsC δ ,h andCδ ,hmax are continuous;

2. If δ1,δ2 ∈ (0,1] andδ1 ≤ δ2, then we haveC δ1,h ≤ C δ2,h andCδ1,hmax ≤ C

δ2,hmax ;

3. Letδnn≥1 be a nondecreasing sequence of discount factors convergingto 1.

It holds that⊔

n∈NC δn,h = C h and⊔

n∈NCδn,hmax = C h

max.

Proof. We only considerC , the case forCmax is similar.

1. Similar to the proof of Lemma 4.3.2. Straightforward by the definition ofC .

Page 103: Semantics of Probabilistic Processes - SJTU

4.6 Extremal reward testing versus resolution-based reward testing 93

3. For anyf ∈ S→ [0,1] ands∈ Swe show that

Ch( f )(s) = (

n∈NC

δn,h)( f )(s). (4.19)

We focus on the non-trivial case thats α−→ ∆ for some actionα and distribution∆ ∈ D(S).

(⊔

n∈NC δn,h)( f )(s) =⊔

n∈NC δn,h( f )(s)=⊔

n∈Nδn · f (∆)= f (∆) · (

n∈Nδn)= f (∆) ·1= C h( f )(s)

⊓⊔

Lemma 4.7.Let h∈ [0,1]Ω be a reward vector andδnn≥1 be a nondecreasingsequence of discount factors converging to1.

• Vh =

n∈NVδn,h

• Vhmax=

n∈NVδn,hmax

Proof. We only considerVh; the case forVhmax is similar. We use the notationlfp( f )

for the least fixed point of the functionf over a complete lattice. Recall thatVh and

Vδn,h are the least fixed points ofC h andC δn,h respectively, so we need to prove

thatlfp(C h) =

n∈Nlfp(C δn,h) (4.20)

We now show two inequations.For anyn∈ N, we haveδn ≤ 1, so Lemma 4.6 (2) yieldsC δn,h ≤ C h. It follows

that lfp(C δn,h)≤ lfp(C h), thus⊔

n∈Nlfp(C δn,h)≤ lfp(C h).For the other direction, that islfp(C h)≤

n∈Nlfp(C δn,h), it suffices to show that⊔

n∈Nlfp(C δn,h) is a prefixed point ofC h, i.e.

Ch(⊔

n∈Nlfp(C δn,h)) ≤

n∈Nlfp(C δn,h),

which we derive as follows. Letδnn≥1 be a nondecreasing sequence of discountfactors converging to 1.

C h(⊔

n∈Nlfp(C δn,h))

= (⊔

m∈NC δm,h)(⊔

n∈Nlfp(C δn,h)) by Lemma 4.6 (3)=⊔

m∈NC δm,h(⊔

n∈Nlfp(C δn,h))

=⊔

m∈N⊔

n∈NC δm,h(lfp(C δn,h)) by Lemma 4.6 (1)=⊔

m∈N⊔

n≥mC δm,h(lfp(C δn,h))

≤⊔

m∈N⊔

n≥mC δn,h(lfp(C δn,h)) by Lemma 4.6 (2)=⊔

n∈NC δn,h(lfp(C δn,h))

=⊔

n∈Nlfp(C δn,h)

This completes the proof of (4.20). ⊓⊔

Page 104: Semantics of Probabilistic Processes - SJTU

94 4 Probabilistic Testing Semantics

Lemma 4.8.Supposeδ ∈ (0,1] and∆ is a subdistribution in a pLTS〈S,Ωτ ,→〉. If

〈T,Θ ,→〉 is a resolution of∆, then we haveVδ ,h(Θ )≤Vδ ,hmax(∆) for any reward

vector h.

Proof. Let f : T → S be the resolving function associated with the resolution〈T,Θ ,→〉, we show by induction onn that

Vδ ,h,nmax ( f (t)) ≥ V

δ ,h,n(t) for anyt ∈ T (4.21)

The base casen = 0 is trivial. We consider the inductive step. Ift ω−→ Θ , thenf (t) ω−→ f (Θ), thusVδ ,h,n

max ( f (t)) = h(ω) = Vδ ,h,n(t). Now suppose thatt 6ω−→ and

t τ−→Θ . Then f (t) 6ω−→ and f (t) τ−→ f (Θ). We can infer that

Vδ ,h,(n+1)max ( f (t)) = δ ·maxVδ ,h,n

max (∆)| f (t) τ−→ ∆

≥ δ ·Vδ ,h,nmax ( f (Θ))

= δ ·∑s∈S f (Θ)(s) ·Vδ ,h,nmax (s)

= δ ·∑t′∈T Θ(t ′) ·Vδ ,h,nmax ( f (t ′))

≥ δ ·∑t′∈T Θ(t ′) ·Vδ ,h,n(t ′) by induction= δ ·Vδ ,h,n(Θ)

= Vδ ,h,(n+1)(t)

So we have proved (4.21), from which it follows that

Vδ ,hmax( f (t))≥ V

δ ,h(t) for anyt ∈ T (4.22)

Therefore, we have that

Vδ ,hmax(∆) = V

δ ,hmax( f (Θ ))

= ∑s∈S f (Θ )(s) ·Vδ ,hmax(s)

= ∑t∈T Θ (t) ·Vδ ,hmax( f (t))

≥ ∑t∈T Θ (t) ·Vδ ,h(t) by (4.22)= V

δ ,h(Θ )

⊓⊔

Lemma 4.9.Supposeδ < 1 and∆ is a subdistribution in a finitary pLTS given by〈S,Ωτ ,→〉. There exists a static resolution〈T,Θ ,→〉 such that

Vδ ,hmax(∆) = V

δ ,h(Θ )

for any reward vector h.

Proof. Let 〈T,Θ ,→〉 be a resolution with an injective resolving functionf suchthat

if t τ−→Θ thenVδ ,hmax( f (Θ)) = maxVδ ,h

max(∆) | f (t) τ−→ ∆.

Page 105: Semantics of Probabilistic Processes - SJTU

4.6 Extremal reward testing versus resolution-based reward testing 95

A finitary pLTS is finitely branching, which ensures the existence of such resolvingfunction f .

Let g : T → [0,1] be the function defined byg(t) = Vδ ,hmax( f (t)) for all t ∈ T.

Below we show thatg is a fixed point ofC δ ,h. If t ω−→ then f (t) ω−→. There-

fore, C δ (g)(t) = h(ω) = Vδ ,hmax( f (t)) = g(t). Now supposet 6ω−→ and t τ−→ Θ .

By the definition of f , we have f (t) 6ω−→, f (t) τ−→ f (Θ) such that the condition

Vδ ,hmax( f (Θ)) = maxVδ ,h

max(∆) | f (t) τ−→ ∆ holds. Therefore,

C δ ,h(g)(t) = δ ·g(Θ)= δ ·∑t∈T Θ(t) ·g(t)

= δ ·∑t∈T Θ(t) ·Vδ ,hmax( f (t))

= δ ·∑s∈S f (Θ)(s) ·Vδ ,hmax(s)

= δ ·Vδ ,hmax( f (Θ))

= δ ·maxVδ ,hmax(∆)| f (t) τ−→ ∆

= Vδ ,hmax( f (t))

= g(t)

SinceC δ has a unique fixed pointVδ ,h, we derive thatg coincides withVδ ,h, i.e.V

δ ,h(t) = g(t) = Vδ ,hmax( f (t)) for all t ∈ T, from which we can obtain the required

resultVδ ,h(Θ ) = Vδ ,hmax(∆). ⊓⊔

Theorem 4.3.Let∆ be a subdistribution in a finitary pLTS〈S,Ωτ ,→〉. There existsa static resolution〈T,Θ ,→〉 such that ExpΘ (Vh) = Exp∆(Vh

max).

Proof. By Lemma 4.9, for every discount factord ∈ (0,1) there exists a static reso-lution that achieves the maximum expected reward. Since thepLTS is finitary, thereare finitely many different static resolutions. There must exist a static resolutionthat achieves the maximum expected reward for infinitely many discount factors. Inother words, for every nondecreasing sequenceδnn≥1 converging to 1, there existsa subsequenceδnkk≥1 and a static resolution〈T,Θ ,−→〉 with resolving function

f0 such thatVδnk ,h(t) = Vδnk ,hmax ( f0(t)) for all t ∈ T andk = 1,2, .... By Lemma 4.7,

we have that, for everyt ∈ T,

Vh(t) =

k∈NVδnk ,h(t)

=⊔

k∈NVδnk ,hmax ( f0(t))

= Vhmax( f0(t))

It follows thatVh(Θ ) = Vhmax(∆). ⊓⊔

Theorem 4.4.For finite-state processes, P⊑ΩermayQ if and only if P⊑Ω

nrmayQ.

Proof. Similar to that of Theorem 4.2 but employing Theorem 4.3 in place of Propo-sition 4.5. ⊓⊔

Page 106: Semantics of Probabilistic Processes - SJTU

96 4 Probabilistic Testing Semantics

4.7 Vector-based testing versus scalar testing

In this section we show that for finitary processes scalar testing is as powerful asvector based testing. As a stepping stone we use resolution-based reward testingthat is shown to be equivalent to vector-based testing.

Theorem 4.5.For anyΩ and finitary processes P,Q we have

P⊑ΩpmayQ iff P⊑Ω

nrmayQP⊑Ω

pmustQ iff P⊑ΩnrmustQ

Proof. Given testT, processP, and reward vectorh we introduce the followingnotation

A f (T,P) := lExpΘ (V) | 〈R,Θ ,→〉 is a static resolution of[T |Act PA h

f (T,P) := lExpΘ (Vh) | 〈R,Θ ,→〉 is a static resolution of[T |Act P .We have the following two claims:

Claim 1. For any testT, processP, and reward vectorh, we always have thatA f (T,P)⊆ A (T,P). Moreover, ifP andT are finitary, thenA f (T,P) is p-closed.Claim 2. Leth∈ [0,1]m be a reward tuple,T ∈Tm andP are finitary test and process,respectively.

⊔A h

f (T,P) =⊔

A h(T,P)dA h

f (T,P) =d

A h(T,P)

For the first claim, we observe that static resolutions are still resolutions and byLemma 4.1 the setA (T,P) is convex. IfP andT are finitary, their compositionP |Act T is finitary too. The setA f (T,P) is the convex closure of a finite number ofpoints, so it is clearly Cauchy closed.

For the second claim, we observe that Proposition 4.6 and Theorem 4.3 implythat

Ahf (T,P) = V

hmax([T |Act P) =

Ah(T,P) (4.23)

Similarly, Propositions 4.4 and 4.5 imply that

lA

hf (T,P) = V

hmin([T |Act P) =

lA

h(T,P) (4.24)

We also note that for any deterministic process〈R,Θ ,→〉, it holds that

h ·V(r) = Vh(r) (4.25)

for anyr ∈ R, since an easy inductive proof establishes that

h ·Vn(r) = Vh,n(r)

for all n∈ N.

Page 107: Semantics of Probabilistic Processes - SJTU

4.7 Vector-based testing versus scalar testing 97

For theonly-if-direction, we apply Theorem 4.1; this direction does not requirep-closure.

For theif -direction, we prove the must-case in the contrapositive; the may-caseis similar.

P 6⊑ΩpmustQ

iff A (T,P) 6≤Sm A (T,Q) for someΩ -testTimpliesA f (T,P) 6≤Sm A (T,Q) by Claim 1iff

dh ·A f (T,P) 6≤

dh ·A (T,Q) for someh∈ [0,1]Ω ; Claim 1; Thm 4.1

iffd

A hf (T,P) 6≤

dA h(T,Q) by (4.25)

iffd

A h(T,P) 6≤d

A h(T,Q) Claim 2impliesP 6⊑Ω

nrmustQ

⊓⊔

We now show that for finitary processes scalar testing is equally powerful asfinite-dimensional reward testing.

Theorem 4.6.For any n∈N and finitary processes P,Q we have

P⊑ΩnrmayQ iff P⊑nrmayQ

P⊑ΩnrmustQ iff P⊑nrmustQ

Proof. Theonly-if-direction is trivial in both cases. Forif we prove the must-casein the contrapositive; the may-case is similar.

Suppose thus thatP 6⊑ΩnrmustQ, thenP,Q are distinguished by someΩ -testT ∈Tn

and rewardh∈ [0,1]Ω , so that

lA

h(T,P) 6≤l

Ah(T,Q). (4.26)

Without loss of generality, assume that the success actionsin T areω1, · · · ,ωn.We construct a new testT ′ with only one success actionω as follows. For eachtransitions α−→ ∆ in T, if no state in⌈∆⌉ can perform a success action, we keep thistransition; otherwise we form a distribution∆ ′ to substitute for∆ in this transition.

1. First we partition⌈∆⌉ into two disjoint setssii∈I andsj j∈J. For eachi ∈ Iwe havesi

ωi′−→ ∆i for some distribution∆i and success actionωi′ ; for eachj ∈ Jno success action is possible fromsj .

2. Next, we introduce a new states′ to replace the states in the first state set, to-gether with a deadlock stateu. So we set⌈∆ ′⌉ := sj j∈J∪s′,u and

∆ ′(s′) := ∑i∈I h(ωi′) ·∆(si)∆ ′(sj ) := ∆(sj ) for eachj ∈ J∆ ′(u) := 1−∆ ′(s′)−∑ j∈J ∆ ′(sj)

3. Finally, a new transitions′ ω−→ u is added.

Page 108: Semantics of Probabilistic Processes - SJTU

98 4 Probabilistic Testing Semantics

We do similar modifications for other transitions inT. Since there are only finitelymany transitions in total, the above procedure will terminate and result in a new testT ′.

The effect of changingT into T ′ is to replace each occurrence ofωi′ by ω withdiscount factorh(ωi′) (the other part 1−h(ωi′) is consumed by a deadlock). For anyprocessP, the overall probability ofω ’s occurrence, in any resolution of[T ′ |Act P,is therefore theh-weighted rewardh·o for the tupleo in the corresponding resolutionof [T |Act P.

Thus from (4.26) we have thatP,Q can be distinguished using the scalar testT ′

with its single success actionω ; that is, we achieveP 6⊑nrmustQ as required. ⊓⊔

We are now in a position to prove that scalar testing is as powerful as finite-dimensional vector-based testing.

Theorem 4.7.For any n∈N and finitary processes P,Q we have

P⊑ΩpmayQ iff P⊑pmayQ

P⊑ΩpmustQ iff P⊑pmustQ

Proof. Combining Theorems 4.5 and 4.6 yields the coincidence of⊑Ωpmay with

⊑nrmay, no matter what is the size ofΩ , as long as it is finite. It follows that⊑Ωpmay

is the same as⊑pmay. The must case is similar. ⊓⊔

4.8 Bibliographic notes

Probabilistic extensions of testing equivalences [13] have been widely studied.There are two different proposals on how to include probabilistic choice: (i) a testshould be non-probabilistic, i.e., there is no occurrence of probabilistic choice in atest [24, 9, 19, 23, 16]; or (ii) a test can be probabilistic, i.e., probabilistic choicemay occur in tests as well as processes [10, 31, 26, 20, 28, 22,6]. This book adoptsthe second approach.

Some work [24, 9, 10, 26] does not consider nondeterminism but deals exclu-sively with fully probabilisticprocesses. In this setting a process passes a test with aunique probability instead of a set of probabilities, and testing preorders in the styleof [13] have been characterised in terms ofprobabilistic traces[10] andprobabilis-tic acceptance trees[26].

Generalisations of the testing theory of [13] to probabilistic systems first appearin [9] and [11], for generative processes without nondeterministic choice. The ap-plication of testing to the probabilistic processes we consider here stems from [31].The idea of vector-based testing originally comes from [28].

In [5] reactive processes are tested against three classes of tests: reactive prob-abilistic tests, fully nondeterministic tests, and nondeterministic and probabilistictests. Thus three testing equivalences are obtained. It is shown that the one based onthe third class of tests are strictly finer than the first two, which are incomparablewith each other.


In [19], a testing theory is proposed that associates a reward, a nonnegative real number, to every success-state in a test process; in calculating the set of results of applying a test to a process, the probabilities of reaching a success-state are multiplied by the reward associated to that state. [19] allows non-probabilistic tests only, but applies these to arbitrary nondeterministic probabilistic processes, and provides a trace-like denotational characterisation of the resulting may-testing preorder. Denotational characterisations of the variant of our testing preorders in which only τ-free processes are allowed as test-processes appear in [20, 21]. These characterisations are improved in [22].

In [6] a testing theory for nondeterministic probabilistic processes is developed in which, as in [25], all probabilistic choices are resolved first. A consequence of this is that the idempotence of internal choice must be sacrificed. The work [6] extends the results of [26] with nondeterminism, but suffers from the same problems as [25]. Similarly, in [1] probabilistic choices are resolved first and the resulting resolutions are compared under the traditional testing semantics of [13]. Some papers distill preorders for probabilistic processes by means of testing scenarios in which the premise that a test is itself a process is given up. These include [24, 23] and [29].

In our testing framework given in Section 4.1, applying a test to a process yields a composite process from which success probabilities are extracted cumulatively on all resolutions. In [4] a different testing framework is presented where success probabilities are considered in a trace-by-trace fashion. Different variants of testing equivalences are compared and placed in a spectrum of trace and testing equivalences.

In Definition 4.3 we use resolving functions to model the behaviour of schedulers or environments to resolve nondeterminism. The schedulers are powerful enough to examine the structure of a probabilistic process. For certain applications one might like to restrict the power of the schedulers [3, 8, 7, 15, 14], so as to obtain coarser testing preorders or equivalences than those in Definition 4.4. Then a natural question is: to what extent should a scheduler be restricted? There are proposals for making probabilistic and/or internal choice unobservable to the scheduler. But so far we have not seen widely accepted criteria.

References

1. Acciai, L., Boreale, M., De Nicola, R.: Linear-time and may-testing semantics in a probabilistic reactive setting. In: Proceedings of FMOODS-FORTE'11, Lecture Notes in Computer Science, vol. 6722, pp. 29–43. Springer (2011)

2. Abramsky, S., Jung, A.: Domain theory. In: Handbook of Logic in Computer Science, vol. 3, pp. 1–168. Clarendon Press (1994)

3. de Alfaro, L., Henzinger, T., Jhala, R.: Compositional methods for probabilistic systems. In: Proceedings of the 12th International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 2154, pp. 351–365. Springer (2001)

4. Bernardo, M., De Nicola, R., Loreti, M.: Revisiting trace and testing equivalences for nondeterministic and probabilistic processes. Logical Methods in Computer Science 10(1:16), 1–42 (2014)

5. Bernardo, M., Sangiorgi, D., Vignudelli, V.: On the discriminating power of testing equivalences for reactive probabilistic systems: results and open problems. In: Proceedings of the 11th International Conference on Quantitative Evaluation of Systems, Lecture Notes in Computer Science, vol. 8567, pp. 281–296. Springer (2014)

6. Cazorla, D., Cuartero, F., Ruiz, V., Pelayo, F., Pardo, J.: Algebraic theory of probabilistic and nondeterministic processes. Journal of Logic and Algebraic Programming 55(1-2), 57–103 (2003)

7. Chatzikokolakis, K., Palamidessi, C.: Making random choices invisible to the scheduler. In: Proceedings of the 18th International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 4703, pp. 42–58. Springer (2007)

8. Cheung, L., Lynch, N., Segala, R., Vaandrager, F.: Switched PIOA: parallel composition via distributed scheduling. Theoretical Computer Science 365(1-2), 83–108 (2006)

9. Christoff, I.: Testing equivalences and fully abstract models for probabilistic processes. In: Proceedings of the 1st International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 458, pp. 126–140. Springer (1990)

10. Cleaveland, R., Dayar, Z., Smolka, S.A., Yuen, S.: Testing preorders for probabilistic processes. Information and Computation 154(2), 93–148 (1999)

11. Cleaveland, R., Smolka, S.A., Zwarico, A.: Testing preorders for probabilistic processes. In: Proceedings of the 19th International Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science, vol. 623, pp. 708–719. Springer (1992)

12. Deng, Y., van Glabbeek, R., Morgan, C.C., Zhang, C.: Scalar outcomes suffice for finitary probabilistic testing. In: Proceedings of the 16th European Symposium on Programming, Lecture Notes in Computer Science, vol. 4421, pp. 363–378. Springer (2007)

13. De Nicola, R., Hennessy, M.: Testing equivalences for processes. Theoretical Computer Science 34, 83–133 (1984)

14. Georgievska, S., Andova, S.: Probabilistic may/must testing: retaining probabilities by restricted schedulers. Formal Aspects of Computing 24, 727–748 (2012)

15. Giro, S., D'Argenio, P.: On the expressive power of schedulers in distributed probabilistic systems. Electronic Notes in Theoretical Computer Science 253(3), 45–71 (2009)

16. Gregorio-Rodríguez, C., Núñez, M.: Denotational semantics for probabilistic refusal testing. Electronic Notes in Theoretical Computer Science 22, 111–137 (1999)

17. Hennessy, M.: Powerdomains and nondeterministic recursive definitions. In: Proceedings of the 5th International Symposium on Programming, Lecture Notes in Computer Science, vol. 137, pp. 178–193. Springer (1982)

18. Hennessy, M.: An Algebraic Theory of Processes. The MIT Press (1988)

19. Jonsson, B., Ho-Stuart, C., Yi, W.: Testing and refinement for nondeterministic and probabilistic processes. In: Proceedings of the 3rd International Symposium on Formal Techniques in Real-Time and Fault-Tolerant Systems, Lecture Notes in Computer Science, vol. 863, pp. 418–430. Springer (1994)

20. Jonsson, B., Yi, W.: Compositional testing preorders for probabilistic processes. In: Proceedings of the 10th Annual IEEE Symposium on Logic in Computer Science, pp. 431–441. Computer Society Press (1995)

21. Jonsson, B., Yi, W.: Fully abstract characterization of probabilistic may testing. In: Proceedings of the 5th International AMAST Workshop on Formal Methods for Real-Time and Probabilistic Systems, Lecture Notes in Computer Science, vol. 1601, pp. 1–18. Springer (1999)

22. Jonsson, B., Yi, W.: Testing preorders for probabilistic processes can be characterized by simulations. Theoretical Computer Science 282(1), 33–51 (2002)

23. Kwiatkowska, M.Z., Norman, G.: A testing equivalence for reactive probabilistic processes. Electronic Notes in Theoretical Computer Science 16(2) (1998)

24. Larsen, K.G., Skou, A.: Bisimulation through probabilistic testing. Information and Computation 94(1), 1–28 (1991)

25. Morgan, C.C., McIver, A.K., Seidel, K., Sanders, J.: Refinement-oriented probability for CSP. Formal Aspects of Computing 8(6), 617–647 (1996)


26. Núñez, M.: Algebraic theory of probabilistic processes. Journal of Logic and Algebraic Programming 56, 117–177 (2003)

27. Puterman, M.L.: Markov Decision Processes. Wiley (1994)

28. Segala, R.: Testing probabilistic automata. In: Proceedings of the 7th International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 1119, pp. 299–314. Springer (1996)

29. Stoelinga, M., Vaandrager, F.W.: A testing scenario for probabilistic automata. In: Proceedings of the 30th International Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science, vol. 2719, pp. 407–418. Springer (2003)

30. Winskel, G.: The Formal Semantics of Programming Languages: An Introduction. The MIT Press (1993)

31. Yi, W., Larsen, K.G.: Testing probabilistic and nondeterministic processes. In: Proceedings of the IFIP TC6/WG6.1 12th International Symposium on Protocol Specification, Testing and Verification, IFIP Transactions, vol. C-8, pp. 47–61. North-Holland (1992)


Chapter 5
Testing Finite Probabilistic Processes

Abstract In this chapter we focus on finite processes and understand testing semantics from three different aspects. Firstly, we co-inductively define simulation relations. Unlike the non-probabilistic setting, where there is a clear gap between testing and simulation semantics, here testing semantics is as strong as simulation semantics. Secondly, a probabilistic logic is presented to completely determine testing preorders. Therefore, both positive and negative results can be established. For example, if two finite processes P and Q are related by the may preorder then we can construct a simulation relation to witness this; otherwise a modal formula can be constructed that is satisfied by P but not by Q. Moreover, the distinguishing formula can be turned into a test that P can pass but Q cannot. Finally, for finite processes, both may and must testing preorders can be completely axiomatised.

Keywords: Finite processes; Probabilistic simulation; Testing preorders; Modal logic; Axiomatisation

5.1 Introduction

To succinctly represent the behaviour of probabilistic processes, we introduce a simple process calculus that is a probabilistic extension of Hoare's CSP [17], called pCSP. It has three choice operators: external P □ Q, internal P ⊓ Q, and probabilistic choice P p⊕ Q. So a semantic theory for pCSP will have to provide a coherent account of the precise relationships between these operators.

We aim to give alternative characterisations of the testing preorders ⊑pmay and ⊑pmust (cf. Definition 4.4). This problem was addressed previously by Segala in [33], but using testing preorders (⊑Ωpmay and ⊑Ωpmust) that differ in two ways from the ones in [11, 16, 36, 7] and ours. First of all, in [33] the success of a test is achieved by the actual execution of a predefined success action, rather than the reaching of a success state. We call this an action-based approach, as opposed to the state-based approach used in this book. Secondly, [33] employs a countable number of success actions instead of a single one, so this is vector-based, as opposed to scalar, testing. Segala's results in [33] depend crucially on this form of testing. To achieve our current results, we need vector-based testing preorders as a stepping stone. We relate them to ours by using Theorem 4.7 from the last chapter, saying that for finitary processes the preorders ⊑Ωpmay and ⊑Ωpmust coincide with ⊑pmay and ⊑pmust. We will proceed in two steps: finite processes are considered in this chapter and finitary processes, which may have loops, are dealt with in Chapter 6.

We will first introduce the syntax and operational semantics of the language pCSP and then instantiate the general testing framework of Section 4.1 to pCSP processes. In Section 5.5 we use the transitions s −α→ ∆ to define two co-inductive preorders, the forward simulation (or simply simulation) preorder ⊑S [32, 25], and the failure simulation preorder ⊑FS over pCSP processes. The latter extends the failure simulation preorder of [12] to probabilistic processes. Both preorders use a natural generalisation of the transitions, first to take the form ∆ −α→ ∆′, and then to weak versions ∆ =α⇒ ∆′. The second preorder differs from the first one in the use of a failure predicate s ↛X, indicating that in the state s none of the actions in X can be performed.

Both preorders are preserved by all the operators in pCSP, and are sound with respect to the testing preorders; that is, P ⊑S Q implies P ⊑pmay Q and P ⊑FS Q implies P ⊑pmust Q. Soundness is proved by a very standard method. But completeness, that the testing preorders imply the respective simulation preorders, requires some ingenuity. We prove it indirectly, via a characterisation of the testing and simulation preorders in terms of a modal logic.

Our modal logic, defined in Section 5.6, uses conjunction ϕ1 ∧ ϕ2, the modality ⟨a⟩ϕ from the Hennessy-Milner Logic, and a probabilistic construct ϕ1 p⊕ ϕ2. A satisfaction relation between processes and formulae then gives, in a natural manner, a logical preorder between processes: P ⊑L Q means that every L-formula satisfied by P is also satisfied by Q. We establish that ⊑L coincides with ⊑S (and hence with ⊑pmay also).

To capture failures, we add, for every set of actions X, a formula ref(X) to our logic, satisfied by any process that, after it can do no further internal actions, can perform none of the actions in X either. The constructs ∧, ⟨a⟩ and ref stem from the modal characterisation of the non-probabilistic failure simulation preorder, given in [12]. We show that ⊑pmust, as well as ⊑FS, can be characterised in a similar manner with this extended modal logic.

We prove these characterisation results through two cycles of inclusions:

⊑L ⊆ ⊑S ⊆ ⊑pmay = ⊑Ωpmay ⊆ ⊑L
⊑F ⊆ ⊑FS ⊆ ⊑pmust = ⊑Ωpmust ⊆ ⊑F

where the first inclusion in each cycle is established in Section 5.6, the second in Section 5.5 (with the testing preorders coming from Section 4.1), the equality in Section 4.7, and the final inclusion in Section 5.7.

In Section 5.6 we show that P ⊑L Q implies P ⊑S Q (and hence P ⊑pmay Q), and likewise for ⊑F and ⊑FS; the proof involves constructing, for each pCSP process P, a characteristic formula ϕP. To obtain the other direction, in Section 5.7 we show how every modal formula ϕ can be captured, in some sense, by a test Tϕ; essentially the ability of a pCSP process to satisfy ϕ is determined by its ability to pass the test Tϕ. We capture the conjunction of two formulae by a probabilistic choice between the corresponding tests; in order to prevent the results of these tests getting mixed up, we employ the vector-based tests of [33], so that we can use different success actions in the separate probabilistic branches. Therefore, we complete our proof by recalling Theorem 4.7, which says that the scalar testing preorders imply the vector-based ones.

It is well known that may- and must-testing for standard CSP can be captured equationally [11, 5, 16]. In Section 5.3 we show that most of the standard equations are no longer valid in the probabilistic setting of pCSP. However, we show in Section 5.9 that both P ⊑pmay Q and P ⊑pmust Q can still be captured equationally over full pCSP. In the may case the essential (in)equation required is

a.(P p⊕ Q) ⊑ a.P p⊕ a.Q

The must case is more involved: in the absence of the distributivity of the external and internal choices over each other, to obtain completeness we require a complicated inequational schema.

5.2 The language pCSP

We first define the language and its operational semantics. Then we show how the general probabilistic testing theory outlined in Section 4.1 can be applied to processes from this language.

5.2.1 The syntax

Let Act be a set of visible (or external) actions, ranged over by a, b, c, …, which processes can perform. Then the finite probabilistic CSP processes are given by the following two-sorted syntax:

P ::= S | P p⊕ P
S ::= 0 | a.P | P ⊓ P | S □ S | S |A S

We write pCSP, ranged over by P, Q, for the set of process terms defined by this grammar, and sCSP, ranged over by s, t, for the subset comprising only the state-based process terms (the sub-sort S above).

The process P p⊕ Q, for 0 ≤ p ≤ 1, represents a probabilistic choice between P and Q: with probability p it will act like P and with probability 1−p it will act like Q. Any process is a probabilistic combination of state-based processes built by repeated application of the operator p⊕. The state-based processes have a CSP-like syntax, involving the stopped process 0, action prefixing a._ for a ∈ Act, internal and external choices ⊓ and □, and a parallel composition |A for A ⊆ Act.

The process P ⊓ Q will first do a so-called internal action τ ∉ Act, choosing nondeterministically between P and Q. Therefore ⊓, like a._, acts as a guard, in the sense that it converts any process arguments into a state-based process.

The process s □ t on the other hand does not perform actions itself but rather allows its arguments to proceed, disabling one argument as soon as the other has done a visible action. In order for this process to start from a state rather than a probability distribution of states, we require its arguments to be state-based as well; the same requirement applies to |A.

Finally, the expression s |A t, where A ⊆ Act, represents processes s and t running in parallel. They may synchronise by performing the same action from A simultaneously; such a synchronisation results in τ. In addition s and t may independently do any action from Actτ \ A, where Actτ := Act ∪ {τ}.

Although formally the operators □ and |A can only be applied to state-based processes, informally we use expressions of the form P □ Q and P |A Q, where P and Q are not state-based, as syntactic sugar for expressions in the above syntax obtained by distributing □ and |A over p⊕. Thus for example s □ (t1 p⊕ t2) abbreviates the term (s □ t1) p⊕ (s □ t2).

The full language of CSP [5, 17, 29] has many more operators; we have simply chosen a representative selection, and have added probabilistic choice. Our parallel operator is not a CSP primitive, but it can easily be expressed in terms of them; in particular P |A Q = (P ‖A Q) \ A, where ‖A and \A are the parallel composition and hiding operators of [29]. It can also be expressed in terms of the parallel composition, renaming and restriction operators of CCS. We have chosen this (non-associative) operator for convenience in defining the application of tests to processes.

As usual we may elide 0; the prefixing operator a._ binds stronger than any binary operator; and precedence between binary operators is indicated via brackets or spacing. We will also sometimes use indexed binary operators, such as ⊕i∈I pi·Pi with Σi∈I pi = 1 and all pi > 0, and ⊓i∈I Pi, for some finite index set I.
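The two-sorted syntax can be rendered directly as a datatype. The following sketch, in Python, is our own illustration and not part of the calculus; names such as pre and pch are hypothetical helpers. It also implements the syntactic sugar of distributing □ over p⊕ described above.

from fractions import Fraction

NIL = ('nil',)
def pre(a, P=NIL):  return ('pre', a, P)                  # a.P
def intc(P, Q):     return ('int', P, Q)                  # P ⊓ Q
def ext(s, t):      return ('ext', s, t)                  # s □ t (state-based arguments)
def par(A, s, t):   return ('par', frozenset(A), s, t)    # s |A t
def pch(p, P, Q):   return ('pch', Fraction(p), P, Q)     # P p⊕ Q

def ext_sugar(P, Q):
    # P □ Q for arbitrary terms: distribute □ over p⊕, so that □ is
    # only ever applied to state-based terms, as explained above.
    if P[0] == 'pch':
        _, p, P1, P2 = P
        return pch(p, ext_sugar(P1, Q), ext_sugar(P2, Q))
    if Q[0] == 'pch':
        _, p, Q1, Q2 = Q
        return pch(p, ext_sugar(P, Q1), ext_sugar(P, Q2))
    return ext(P, Q)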

5.2.2 The operational semantics

The above intuitive reading of the various operators can be formalised by an operational semantics that associates with each process term a graph-like structure representing the manner in which it may react to users' requests. Let us briefly recall this procedure for non-probabilistic CSP.

The operational semantics of CSP is obtained by endowing the set of terms with the structure of an LTS. Specifically,

(i) the set of states S is taken to be all terms from the language CSP;
(ii) the action relations P −α→ Q are defined inductively on the syntax of terms.


a.P −a→ ⟦P⟧        P ⊓ Q −τ→ ⟦P⟧        P ⊓ Q −τ→ ⟦Q⟧

s1 −a→ ∆  implies  s1 □ s2 −a→ ∆        s2 −a→ ∆  implies  s1 □ s2 −a→ ∆

s1 −τ→ ∆  implies  s1 □ s2 −τ→ ∆ □ s2        s2 −τ→ ∆  implies  s1 □ s2 −τ→ s1 □ ∆

s1 −α→ ∆ and α ∉ A  implies  s1 |A s2 −α→ ∆ |A s2
s2 −α→ ∆ and α ∉ A  implies  s1 |A s2 −α→ s1 |A ∆
s1 −a→ ∆1, s2 −a→ ∆2 and a ∈ A  implies  s1 |A s2 −τ→ ∆1 |A ∆2

Fig. 5.1 Operational semantics of pCSP

A precise definition may be found in [29]. In order to interpret the full pCSP operationally we need to use pLTSs, the probabilistic generalisation of LTSs (see Section 3.2). We mimic the operational interpretation of CSP as an LTS by associating with pCSP a particular pLTS ⟨sCSP, Actτ, →⟩ in which sCSP is the set of states and Actτ is the set of transition labels. However there are two major differences:

(i) only a subset of terms in pCSP will be used as the set of states in the pLTS;
(ii) terms in pCSP will be interpreted as distributions over sCSP, rather than as elements of sCSP.

We interpret pCSP processes P as distributions ⟦P⟧ ∈ D(sCSP) via the function ⟦·⟧ : pCSP → D(sCSP) defined by ⟦s⟧ := s for s ∈ sCSP (the point distribution at s), and ⟦P p⊕ Q⟧ := ⟦P⟧ p⊕ ⟦Q⟧. The definition of the relations −α→ is given in Figure 5.1, where a ranges over Act and α over Actτ.

These rules are very similar to the standard ones used to interpret CSP as a labelled transition system [29], but are modified so that the result of an action is a distribution. The rules for external choice and parallel composition use an obvious notation for distributing an operator over a distribution; for example ∆ □ s represents the distribution given by

(∆ □ s)(t) = ∆(s′) if t = s′ □ s, and 0 otherwise.

We sometimes write τ.P for P ⊓ P, thus giving τ.P −τ→ ⟦P⟧.
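The rules of Figure 5.1 and the interpretation ⟦·⟧ can be turned into executable form. Below is a minimal sketch building on the tuple encoding given earlier; it is our own illustration, with distributions represented as dicts from states to Fractions.

from fractions import Fraction
from itertools import product

TAU = 'tau'

def interp(P):
    # [[.]] : pCSP -> D(sCSP); states map to point distributions,
    # and p⊕ mixes the interpretations of its two branches.
    if P[0] == 'pch':
        _, p, P1, P2 = P
        out = {}
        for s, q in interp(P1).items():
            out[s] = out.get(s, Fraction(0)) + p * q
        for s, q in interp(P2).items():
            out[s] = out.get(s, Fraction(0)) + (1 - p) * q
        return out
    return {P: Fraction(1)}

def point(s):
    return {s: Fraction(1)}

def lift2(con, d1, d2):
    # distribute a binary state constructor over two distributions,
    # e.g. Delta □ s or Delta1 |A Delta2 in the text
    out = {}
    for (s, p), (t, q) in product(d1.items(), d2.items()):
        st = con(s, t)
        out[st] = out.get(st, Fraction(0)) + p * q
    return out

def steps(s):
    # all transitions s -alpha-> Delta of a state, as (alpha, dist) pairs
    kind = s[0]
    if kind == 'nil':
        return []
    if kind == 'pre':                               # a.P -a-> [[P]]
        return [(s[1], interp(s[2]))]
    if kind == 'int':                               # P ⊓ Q -tau-> [[P]] or [[Q]]
        return [(TAU, interp(s[1])), (TAU, interp(s[2]))]
    if kind == 'ext':
        _, s1, s2 = s
        mk = lambda x, y: ('ext', x, y)
        res = []
        for a, d in steps(s1):    # visible moves resolve the choice, tau keeps it open
            res.append((a, d) if a != TAU else (TAU, lift2(mk, d, point(s2))))
        for a, d in steps(s2):
            res.append((a, d) if a != TAU else (TAU, lift2(mk, point(s1), d)))
        return res
    if kind == 'par':
        _, A, s1, s2 = s
        mk = lambda x, y: ('par', A, x, y)
        res = [(a, lift2(mk, d, point(s2))) for a, d in steps(s1) if a not in A]
        res += [(a, lift2(mk, point(s1), d)) for a, d in steps(s2) if a not in A]
        res += [(TAU, lift2(mk, d1, d2))    # synchronisation on A yields tau
                for (a1, d1), (a2, d2) in product(steps(s1), steps(s2))
                if a1 == a2 and a1 in A]
        return res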


5.2.3 The precedence of probabilistic choice

Our operational semantics entails that □ and |A distribute over probabilistic choice:

⟦P □ (Q p⊕ R)⟧ = ⟦(P □ Q) p⊕ (P □ R)⟧
⟦P |A (Q p⊕ R)⟧ = ⟦(P |A Q) p⊕ (P |A R)⟧

These identities are not a consequence of our testing methodology: they are hard-wired in our interpretation ⟦·⟧ of pCSP expressions as distributions.

A consequence of our operational semantics is that, for example, in the process a.(b ½⊕ c) |∅ d the action d can be scheduled either before a, or after the probabilistic choice between b and c, but it cannot be scheduled after a and before this probabilistic choice. We justify this by thinking of P p⊕ Q not as a process that starts with making a probabilistic choice, but rather as one that has just made such a choice, and with probability p is no more and no less than the process P. Thus a.(P p⊕ Q) is a process that in doing the a-step makes a probabilistic choice between the alternative targets P and Q.

This design decision is in full agreement with previous work featuring nondeterminism, probabilistic choice and parallel composition [15, 36, 32]. Moreover, a probabilistic choice between processes P and Q that does not take precedence over actions scheduled in parallel can simply be written as τ.(P p⊕ Q). Here τ.P is an abbreviation for P ⊓ P. Using the operational semantics of ⊓ in Figure 5.1, τ.P is a process whose sole initial transition is τ.P −τ→ ⟦P⟧, hence τ.(P p⊕ Q) is a process that starts with making a probabilistic choice, modelled as an internal action, and with probability p proceeds as P. Any activity scheduled in parallel with τ.(P p⊕ Q) can now be scheduled before or after this internal action, hence before or after the making of the choice. In particular, a.τ.(b ½⊕ c) |∅ d allows d to happen between a and the probabilistic choice.

5.2.4 Graphical representation of pCSP processes

The set of states reachable from a subdistribution ∆ is the smallest set that contains ⌈∆⌉ and is closed under transitions, meaning that if some state s is reachable and s −α→ Θ then every state in ⌈Θ⌉ is reachable as well. We graphically depict the operational semantics of a pCSP expression P by drawing the part of the pLTS defined above that is reachable from ⟦P⟧ as a finite acyclic directed graph in the way described in Section 3.2.

Two examples are described in Figure 5.2. The interpretation of the simple process (b ⊓ c) □ (d ½⊕ a) is the distribution (b ⊓ c) □ ∆, where ∆ is the distribution resulting from the interpretation of d ½⊕ a, itself a two-point distribution mapping both the states d.0 and a.0 to the probability ½. The result is again a two-point distribution, this time mapping the two states (b ⊓ c) □ d and (b ⊓ c) □ a to ½. The end result in (a) is obtained by further interpreting these two states using the rules in Figure 5.1.

Page 119: Semantics of Probabilistic Processes - SJTU

5.2 The languagepCSP 109

1/2

τ τ τ τad

b d c d b a c a

1/2 1/31/3

1/3

aτ τd

b d c d b a

τa

τ

c a

(a) (b)

Fig. 5.2 The pLTS’s of example processes

In (b) we show the graphical representation that results when this term is combined probabilistically with the simple process a.0.

To sum up, the operational semantics endows pCSP with the structure of a pLTS, and the function ⟦·⟧ interprets process terms in pCSP as distributions in this pLTS, which can be represented by finite acyclic directed graphs (typically drawn as trees), with edges labelled either by probabilities or actions, so that edges labelled by actions and probabilities alternate (although in pictures we may suppress 1-labelled edges and point distributions).

5.2.5 Testing pCSP processes

Let us now turn to applying the testing theory from Section 4.1 to pCSP. As with the standard theory [11, 16], we use as tests any process from pCSP itself, which in addition can use a special symbol ω to report success. For example, the term a.ω ¼⊕ (b □ c.ω) is a probabilistic test, which 25% of the time requests an a-action, and 75% of the time requests that c can be performed. If it is used as a must test, the 75% that requests a c-action additionally requires that b is not possible. As in [11, 16], it is not the execution of ω that constitutes success, but the arrival in a state where ω is possible. The introduction of the ω-action is simply a method of defining a success predicate on states without having to enrich the language of processes explicitly with such predicates.

Formally, let ω ∉ Actτ and write Actω for Act ∪ {ω} and Actωτ for Act ∪ {τ, ω}. In Figure 5.1 we now let a range over Actω and α over Actωτ. Tests may have subterms ω.P, but other processes may not. We write pCSPω for the set of all tests. To apply the test T to the process P we run them in parallel, tightly synchronised; that is, we run the combined process T |Act P. Here P can only synchronise with T, and in turn the test T can only perform the success action ω, in addition to synchronising with the process being tested; of course both tester and testee can also perform internal actions.


A( a.ω ¼⊕ (b □ c.ω), b □ c □ d ) = ¼·{0} + ¾·{0, 1} = {0, ¾}

Fig. 5.3 Example of testing

An example is provided in Figure 5.3, where the test a.ω ¼⊕ (b □ c.ω) is applied to the process b □ c □ d. We see that 25% of the time the test is unsuccessful, in that it does not reach a state where ω is possible, and 75% of the time it may be successful, depending on how the now internal choice between the actions b and c is resolved, but it is not the case that it must be successful. ⟦T |Act P⟧ is representable as a finite graph that encodes all possible interactions of the test T with the process P. It only contains the actions τ and ω. Each occurrence of τ represents a nondeterministic choice, either in T or P themselves, or as a nondeterministic response by P to a request from T, while the distributions represent the resolution of underlying probabilities in T and P. We use the structure ⟦T |Act P⟧ to define A(T, P), a non-empty subset of [0,1] representing the set of probabilities that applying T to P will be a success.

Definition 5.1. We define a function Vf : sCSP → P+([0,1]), which when applied to any state in sCSP returns a finite subset of [0,1]; it is extended to the type D(sCSP) → P+([0,1]) via the convention Vf(∆) := Exp∆ Vf (cf. Section 4.2).

Vf(s) = {1}                        if s −ω→
Vf(s) = ⋃{ Vf(∆) | s −τ→ ∆ }       if s ↛ω and s −τ→
Vf(s) = {0}                        otherwise

We will tend to write the expected value of Vf explicitly and use the convenient notation

Vf(∆) = ∆(s1)·Vf(s1) + … + ∆(sn)·Vf(sn)


∆s = ½·s1 + ½·s2,  ∆t = ¼·t1 + ¾·t2;  Vf(∆s) = {½, 1},  Vf(∆t) = {0, ¼, ¾, 1}

Fig. 5.4 Collecting results

where ⌈∆⌉ = {s1, …, sn}. Note that Vf(·) is indeed a well-defined function, because the pLTS ⟨sCSP, Actτ, →⟩ is finitely branching and well-founded.

For example, consider the transition systems in Figure 5.4, where for reference we have labelled the nodes. Then Vf(s1) = {1, 0} while Vf(s2) = {1}, and therefore Vf(∆s) = ½·{1,0} + ½·{1}, which, since there are only two possible choices, evaluates further to {½, 1}. Similarly Vf(t1) = Vf(t2) = {0,1} and we calculate that

Vf(∆t) = ¼·{0,1} + ¾·{0,1} = {0, ¼, ¾, 1}.

Definition 5.2. For any process P ∈ pCSP and test T let A(T, P) = Vf(⟦T |Act P⟧). With this definition we now have two testing preorders for pCSP, one based on may testing, P ⊑pmay Q, and the other on must testing, P ⊑pmust Q.
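Definitions 5.1 and 5.2 can likewise be phrased on top of the earlier sketch; the following is our own illustration (OMEGA encodes ω as an ordinary action), and it terminates because the processes of this chapter are finite, so the pLTS is well-founded.

from fractions import Fraction

OMEGA = 'omega'

def vf(s):
    # V_f on states: {1} if s can do omega; otherwise the union over the
    # tau-derivatives if s can do tau; {0} for deadlocked states
    ts = steps(s)
    if any(a == OMEGA for a, _ in ts):
        return {Fraction(1)}
    taus = [d for a, d in ts if a == TAU]
    if taus:
        return set().union(*(vf_dist(d) for d in taus))
    return {Fraction(0)}

def vf_dist(delta):
    # V_f on a distribution: Exp_Delta(V_f), i.e. all weighted sums
    # Delta(s1)*v1 + ... + Delta(sn)*vn with vi in V_f(si)
    sums = {Fraction(0)}
    for s, p in delta.items():
        sums = {acc + p * v for acc in sums for v in vf(s)}
    return sums

def apply_test(T, P, Act):
    # A(T, P) = V_f([[ T |Act P ]])   (Definition 5.2)
    A = frozenset(Act)
    return vf_dist(lift2(lambda t, s: ('par', A, t, s), interp(T), interp(P)))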

Comparing the results-gathering function Vf in Definition 5.1 with the one given in Section 4.2, we notice that the former only records those testing outcomes obtained by using static resolutions of a pLTS, while the latter records the outcomes of all resolutions. Here we prefer the former because it is simpler and we are dealing with scalar testing: as a matter of fact, applying convex closure to subsets of the one-dimensional interval [0,1] (such as those arising from applying scalar tests to processes) has no effect on the Hoare and Smyth orders between these subsets.

Lemma 5.1. Suppose X, Y ⊆ [0,1], and let ⇃X denote the convex closure of X. Then

1. X ≤Ho Y if and only if ⇃X ≤Ho ⇃Y.
2. X ≤Sm Y if and only if ⇃X ≤Sm ⇃Y.

Proof. We restrict attention to the first clause; the proof of the second one goes likewise. It suffices to show that (i) X ≤Ho ⇃X and (ii) ⇃X ≤Ho X. We only prove (ii) since (i) is obvious. Suppose x ∈ ⇃X; then x = Σi∈I pi·xi for a finite set I with Σi∈I pi = 1 and xi ∈ X. Let x* = max{xi | i ∈ I}. Then

x = Σi∈I pi·xi ≤ Σi∈I pi·x* = x* ∈ X. ⊓⊔

It follows that for scalar testing it makes no difference whether convex closure is employed or not. Therefore, vector-based testing is also a conservative extension of scalar testing without employing convex closure.

Corollary 5.1. Suppose Ω is the singleton set {ω}. Then

1. P ⊑Ωpmay Q if and only if P ⊑pmay Q.
2. P ⊑Ωpmust Q if and only if P ⊑pmust Q.

Proof. The result follows from Lemma 5.1. ⊓⊔

Lemma 5.1 does not generalise to [0,1]Ω when |Ω| > 1, as the following example demonstrates.

Example 5.1. Let X, Y denote {(0.5, 0.5)} and {(1,0), (0,1)}, respectively. Then it is easy to show that ⇃X ≤Ho ⇃Y although obviously X ≰Ho Y.

This example can be exploited to show that for vector-based testing it does make a difference whether convex closure is employed.

Example 5.2. Consider the two processes

P := a ½⊕ b and Q := a ⊓ b.

Take Ω = {ω1, ω2}. Employing a results-gathering function without convex closure, with the test T := a.ω1 □ b.ω2 we would obtain

A(T, P) = {(0.5, 0.5)}
A(T, Q) = {(1,0), (0,1)}.

As pointed out in Example 5.1, this entails A(T,P) ≰Ho A(T,Q), although their convex closures are related under the Hoare preorder.

Convex closure is a uniform way of ensuring that internal choice can simulate an arbitrary probabilistic choice. For the processes P and Q of Example 5.2 we will see later on that P ⊑S Q and P ⊑pmay Q. This fits with the intuition that a probabilistic choice is an acceptable implementation of a nondeterministic choice occurring in a specification. Considering that we use ⊑Ωpmay as a stepping stone in showing the coincidence of ⊑S and ⊑pmay, we must have P ⊑Ωpmay Q. For this reason we used convex closure in Definition 4.3.
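The Hoare and Smyth comparisons on finite outcome sets are easy to state in code. The sketch below, again our own illustration, reproduces the failure from Example 5.1; outcomes are tuples compared pointwise.

def leq(x, y):
    return all(a <= b for a, b in zip(x, y))

def hoare(X, Y):   # X <=Ho Y: every x in X is dominated by some y in Y
    return all(any(leq(x, y) for y in Y) for x in X)

def smyth(X, Y):   # X <=Sm Y: every y in Y dominates some x in X
    return all(any(leq(x, y) for x in X) for y in Y)

X = {(0.5, 0.5)}
Y = {(1.0, 0.0), (0.0, 1.0)}
print(hoare(X, Y))   # False: no single point of Y dominates (0.5, 0.5),
# although 1/2*(1,0) + 1/2*(0,1) = (0.5, 0.5) lies in the convex closure of Y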


R1 = a.(b ½⊕ c)    R2 = a.b ½⊕ a.c    T = a.b.ω ⊓ a.c.ω

A(T, R1) = {½}    A(T, R2) = {0, ½, 1}

Fig. 5.5 Action prefix does not distribute over probabilistic choice

5.3 Counterexamples

We will see in this section that many of the standard testing axioms are not valid in the presence of probabilistic choice. We also provide counterexamples for a few distributive laws involving probabilistic choice that might appear plausible at first sight. In all cases we establish a statement P ≄pmay Q by exhibiting a test T such that max(A(T,P)) ≠ max(A(T,Q)), and a statement P ≄pmust Q by exhibiting a test T such that min(A(T,P)) ≠ min(A(T,Q)). In case max(A(T,P)) > max(A(T,Q)) we find in particular that P ⋢pmay Q, and in case min(A(T,P)) > min(A(T,Q)) we obtain P ⋢pmust Q.

Example 5.3. The axiom a.(P p⊕ Q) = a.P p⊕ a.Q is unsound.

Consider the example in Figure 5.5. In R1 the probabilistic choice between b and c is taken after the action a, while in R2 the choice is made before the action has happened. These processes can be distinguished by the nondeterministic test T = a.b.ω ⊓ a.c.ω. First consider running this test on R1. There is an immediate choice made by the test, effectively running either the test a.b.ω on R1 or the test a.c.ω; in fact the effect of running either test is exactly the same.


R1 = a.(b ⊓ c)    R2 = a.b ⊓ a.c    T = a.(b.ω ½⊕ c.ω)

A(T, R1) = {0, ½, 1}    A(T, R2) = {½}

Fig. 5.6 Action prefix does not distribute over internal choice

Consider a.b.ω. When run on R1 the a action immediately happens, and there is a probabilistic choice between running b.ω on either b or c, giving as possible results {1} or {0}; combining these according to the definition of the function Vf(·) we get ½·{0} + ½·{1} = {½}. Since running the test a.c.ω has the same effect, A(T,R1) turns out to be the same set {½}.

Now consider running the test T on R2. Because R2, and hence also T |Act R2, starts with a probabilistic choice, due to the definition of the function Vf(·) the test must first be applied to the probabilistic components, a.b and a.c, respectively, and the results subsequently combined probabilistically. When the test is run on a.b, immediately a nondeterministic choice is made in the test, to run either a.b.ω or a.c.ω. With the former we get the result {1}, with the latter {0}, so overall, for running T on a.b, we get the possible results {0,1}. The same is true when we run it on a.c, and therefore A(T,R2) = ½·{0,1} + ½·{0,1} = {0, ½, 1}.

So we have R2 ⋢pmay R1 and R1 ⋢pmust R2.
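With the apply_test sketch from Section 5.2 this calculation can be reproduced mechanically (an illustration only, under the encoding assumptions stated there):

from fractions import Fraction

b, c = pre('b'), pre('c')
R1 = pre('a', pch(Fraction(1, 2), b, c))             # a.(b ½⊕ c)
R2 = pch(Fraction(1, 2), pre('a', b), pre('a', c))   # a.b ½⊕ a.c
T  = intc(pre('a', pre('b', pre(OMEGA))),            # a.b.ω ⊓ a.c.ω
          pre('a', pre('c', pre(OMEGA))))

print(apply_test(T, R1, {'a', 'b', 'c'}))   # {1/2}
print(apply_test(T, R2, {'a', 'b', 'c'}))   # {0, 1/2, 1}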

Example 5.4. The axiom a.(P ⊓ Q) = a.P ⊓ a.Q is unsound.

It is well known that this axiom is valid in the standard theory of testing, for non-probabilistic processes. However, consider the processes R1 and R2 in Figure 5.6, and note that these processes do not contain any probabilistic choice. But they can be differentiated by the probabilistic test T = a.(b.ω ½⊕ c.ω); the details are in Figure 5.6. There is only one possible outcome from applying T to R2, the probability ½, because the nondeterministic choice is made before the probabilistic choice. On the other hand, when T is applied to R1 there are three possible outcomes, 0, ½ and 1, because effectively the probabilistic choice takes precedence over the nondeterministic choice. So we have R1 ⋢pmay R2 and R2 ⋢pmust R1.

Example 5.5. The axiom a.(P □ Q) = a.P □ a.Q is unsound.

This axiom is valid in the standard may-testing semantics. However, let us consider the two processes R1 = a.(b □ c) and R2 = a.b □ a.c. By applying the probabilistic test T = a.(b.ω ½⊕ c.ω) we see that A(T,R1) = {1} and A(T,R2) = {½}. Therefore R1 ⋢pmay R2 and R1 ⋢pmust R2.

Example 5.6. The axiom P = P □ P is unsound.

Let R1 and R2 denote a ½⊕ b and (a ½⊕ b) □ (a ½⊕ b), respectively. It is easy to calculate that A(a.ω, R1) = {½} but, because of the way we interpret external choice as an operator over distributions of states in a pLTS, it turns out that ⟦R2⟧ = ⟦((a □ a) ½⊕ (a □ b)) ½⊕ ((b □ a) ½⊕ (b □ b))⟧ and so A(a.ω, R2) = {¾}. Therefore R2 ⋢pmay R1 and R2 ⋢pmust R1.

Example 5.7. The axiom P p⊕ (Q ⊓ R) = (P p⊕ Q) ⊓ (P p⊕ R) is unsound.

Consider the processes R1 = a ½⊕ (b ⊓ c) and R2 = (a ½⊕ b) ⊓ (a ½⊕ c), and the test T1 = a.ω ⊓ (b.ω ½⊕ c.ω). In the best of possible worlds, when we apply T1 to R1 we obtain probability 1, that is max(A(T1,R1)) = 1. Informally this is because half the time, when it is applied to the subprocess a of R1, optimistically the sub-test a.ω is actually run. The other half of the time, when it is applied to the subprocess b ⊓ c, optimistically the sub-test Tr = b.ω ½⊕ c.ω is actually used. And here again, optimistically, we obtain probability 1: whenever the test b.ω is used it might be applied to the subprocess b, while when c.ω is used it might be applied to c. Formally we have

A(T1,R1) = ½·A(T1,a) + ½·A(T1, b ⊓ c)
         = ½·(A(a.ω,a) ∪ A(Tr,a)) + ½·(A(T1,b) ∪ A(T1,c) ∪ A(a.ω, b⊓c) ∪ A(Tr, b⊓c))
         = ½·({1} ∪ {0}) + ½·({0,½} ∪ {0,½} ∪ {0} ∪ {0,½,1})
         = {0, ¼, ½, ¾, 1}

However, no matter how optimistic we are when applying T1 to R2 we can never get probability 1; the most we can hope for is ¾, which might occur when T1 is applied to the subprocess a ½⊕ b. Specifically, when the subprocess a is being tested the sub-test a.ω might be used, giving probability 1, and when the subprocess b is being tested the sub-test b.ω ½⊕ c.ω might be used, giving probability ½. We leave the reader to check that formally

A(T1,R2) = {0, ¼, ½, ¾}

from which we can conclude R1 ⋢pmay R2. We can also show that R2 ⋢pmust R1, using the test

T2 = (b.ω □ c.ω) ⊓ (a.ω ⅓⊕ (b.ω ½⊕ c.ω)).

Reasoning pessimistically, the worst that can happen when applying T2 to R1 is that we get probability 0. Each time the subprocess a is tested the worst probability will occur when the sub-test b.ω □ c.ω is used; this results in probability 0. Similarly, when the subprocess b ⊓ c is being tested the sub-test a.ω ⅓⊕ (b.ω ½⊕ c.ω) will give probability 0. In other words, min(A(T2,R1)) = 0. When applying T2 to R2, things can never be as bad. The worst probability will occur when T2 is applied to the subprocess a ½⊕ b, namely probability ⅙. We leave the reader to check that formally A(T2,R1) = {0, ⅙, ⅓, ½, ⅔} and A(T2,R2) = {⅙, ⅓, ½, ⅔}.

Example 5.8. The axiom P ⊓ (Q p⊕ R) = (P ⊓ Q) p⊕ (P ⊓ R) is unsound.

Let R1 = a ⊓ (b ½⊕ c), R2 = (a ⊓ b) ½⊕ (a ⊓ c) and T be the test a.(ω ½⊕ 0) □ b.ω. One can check that A(T,R1) = {½} and A(T,R2) = ½·{½,1} + ½·{½,0} = {¼, ½, ¾}. Therefore we have R2 ⋢pmay R1 and R1 ⋢pmust R2.

Example 5.9. The axiom P □ (Q ⊓ R) = (P □ Q) ⊓ (P □ R) is unsound.

Let R1 = (a ½⊕ b) □ (c ⊓ d), R2 = ((a ½⊕ b) □ c) ⊓ ((a ½⊕ b) □ d) and T be the test (a.ω ½⊕ c.ω) ⊓ (b.ω ½⊕ d.ω). This time we get A(T,R1) = {0, ¼, ½, ¾, 1} and A(T,R2) = {¼, ¾}. So R1 ⋢pmay R2 and R2 ⋢pmust R1.

Example 5.10. The axiom P ⊓ (Q □ R) = (P ⊓ Q) □ (P ⊓ R) is unsound.

Let R1 = (a ½⊕ b) ⊓ ((a ½⊕ b) □ 0) and R2 = ((a ½⊕ b) ⊓ (a ½⊕ b)) □ ((a ½⊕ b) ⊓ 0). One obtains A(a.ω, R1) = {½} and A(a.ω, R2) = {½, ¾}. So R2 ⋢pmay R1. Let R3 and R4 result from substituting a ½⊕ b for each of P, Q and R in the axiom above. Now A(a.ω, R3) = {½, ¾} and A(a.ω, R4) = {¾}. So R4 ⋢pmust R3.

Example 5.11. The axiom P p⊕ (Q □ R) = (P p⊕ Q) □ (P p⊕ R) is unsound.

Let R1 = a ½⊕ (b □ c), R2 = (a ½⊕ b) □ (a ½⊕ c) and R3 = (a □ b) ½⊕ (a □ c). R1 is an instance of the left-hand side of the axiom, and R2 an instance of the right-hand side. Here we use R3 as a tool to reason about R2, but in Section 5.11.2 we will need R3 in its own right. Note that ⟦R2⟧ = ½·⟦R1⟧ + ½·⟦R3⟧. Let T = a.ω. It is easy to see that A(T,R1) = {½} and A(T,R3) = {1}. Therefore A(T,R2) = {¾}. So we have R2 ⋢pmay R1 and R2 ⋢pmust R1.

Of all the examples in this section, this is the only one for which we can show that ⊑pmay and ⊒pmay both fail, i.e. both inequations that can be associated with the axiom are unsound for may testing. Let T = a.(ω ½⊕ 0) ⊓ (b.ω ½⊕ c.ω). It is not hard to check that A(T,R1) = {0, ¼, ½, ¾} and A(T,R3) = {½}. It follows that A(T,R2) = {¼, ⅜, ½, ⅝}. Therefore, we have R1 ⋢pmay R2.

For future reference, we also observe that R1 ⋢pmay R3 and R3 ⋢pmay R1.

5.4 Must versus may testing

On pCSP there are two differences between the preorders ⊑pmay and ⊑pmust:

• Must testing is more discriminating.
• The preorders ⊑pmay and ⊑pmust are oriented in opposite directions.

In this section we substantiate these claims by proving that P ⊑pmust Q implies Q ⊑pmay P, and by providing a counterexample that shows the implication is strict. We are only able to obtain the implication since our language is for finite processes and does not feature divergence, infinite sequences of τ-actions. It is well known from the non-probabilistic theory of testing [11, 16] that in the presence of divergence ≃may and ≃must are incomparable.

To establish a relationship between must testing and may testing, we define the context C[_] = _ |ω (ω □ (ν ⊓ ν)), so that for every test T we obtain a new test C[T], by considering ν instead of ω as success action.
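In the tuple encoding used earlier, this context can be written as follows (a sketch under the same assumptions; NU encodes the fresh success action ν):

NU = 'nu'

def C(T):
    # C[T] = T |{omega} (omega □ (nu ⊓ nu)); success is now nu, not omega
    return par({OMEGA}, T, ext(pre(OMEGA), intc(pre(NU), pre(NU))))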

Lemma 5.2. For any process P and test T, it holds that

1. if p ∈ A(T,P) then (1−p) ∈ A(C[T],P);
2. if p ∈ A(C[T],P) then there exists some q ∈ A(T,P) such that 1−q ≤ p.

Proof. A state of the form C[s] |Act t can always do a τ-move, and never directly a success action ν. The τ-steps that C[s] |Act t can do fall into three classes: the resulting distribution is either

• a point distribution u with u −ν→; we call this a successful τ-step, because it contributes 1 to the set Vf(C[s] |Act t);
• a point distribution u with u a state from which the success action ν is unreachable; we call this an unsuccessful τ-step, because it contributes 0 to the set Vf(C[s] |Act t);
• or a distribution of the form C[Θ] |Act ∆.

Note that

• C[s] |Act t can always do a successful τ-step;
• C[s] |Act t can do an unsuccessful τ-step iff s |Act t can do an ω-step;
• and C[s] |Act t −τ→ C[Θ] |Act ∆ iff s |Act t −τ→ Θ |Act ∆.

Using this, both claims follow by a straightforward induction on T and P. ⊓⊔

Proposition 5.1. If P ⊑pmust Q then Q ⊑pmay P.


Proof. Suppose P ⊑pmust Q. We must show that, for any test T, if p ∈ A(T,Q) then there exists a q ∈ A(T,P) such that p ≤ q. So suppose p ∈ A(T,Q). By the first clause of Lemma 5.2, we have (1−p) ∈ A(C[T],Q). Given that P ⊑pmust Q, there must be an x ∈ A(C[T],P) such that x ≤ 1−p. By the second clause of Lemma 5.2, there exists a q ∈ A(T,P) such that 1−q ≤ x. It follows that p ≤ q. Therefore Q ⊑pmay P. ⊓⊔

Example 5.12. To show that must testing is strictly more discriminating than may testing, consider the processes a □ b and a ⊓ b, and expose them to the test a.ω. It is not hard to see that A(a.ω, a □ b) = {1}, whereas A(a.ω, a ⊓ b) = {0,1}. Since min(A(a.ω, a □ b)) = 1 and min(A(a.ω, a ⊓ b)) = 0, using Proposition 4.1 we obtain that (a □ b) ⋢pmust (a ⊓ b).

Since max(A(a.ω, a □ b)) = max(A(a.ω, a ⊓ b)) = 1, as a may test, the test a.ω does not differentiate between the two processes a □ b and a ⊓ b. In fact, we have (a ⊓ b) ⊑pmay (a □ b), and even (a □ b) ≃pmay (a ⊓ b), but this cannot be shown so easily, as we would have to consider all possible tests. In Section 5.5 we will develop a tool to prove statements P ⊑pmay Q, and apply it to derive the equality above (axiom (May0) in Figure 5.8).

5.5 Forward and failure simulation

The examples of Section 5.3 have all been negative: one can easily demonstrate an inequivalence between two processes by exhibiting a test that distinguishes them in the appropriate manner. A direct application of the definition of the testing preorders is usually unsuitable for establishing positive results, as this involves a universal quantification over all possible tests that can be applied. To give positive results of the form P ⊑pmay Q (and similarly for P ⊑pmust Q) we need to come up with a preorder ⊑finer such that (P ⊑finer Q) ⇒ (P ⊑pmay Q) and statements P ⊑finer Q can be obtained by exhibiting a single witness.

In this section we introduce co-inductive relations: forward simulations and failure simulations. For may testing we use forward simulations as our witnesses, and for must testing we use failure simulations. The definitions are somewhat complicated by the fact that in a pLTS transitions go from states to distributions; consequently if we are to use sequences of transitions, or weak transitions =a⇒ that abstract from sequences of internal actions that might precede or follow the a-transition, then we need to generalise transitions so that they go from distributions to distributions. We first recall the mathematical machinery developed in Section 3.3, where we discussed various ways of lifting a relation R ⊆ S×S to a relation R† ⊆ D(S)×D(S). Exactly the same idea can be used to lift a relation R ⊆ S×D(S) to a relation R† ⊆ D(S)×D(S). This justifies our slight abuse of notation in continuing to write R† for the lifted relation.

Definition 5.3. Let R ⊆ S×D(S) be a relation from states to subdistributions. Then R† ⊆ D(S)×D(S) is the smallest relation that satisfies:

(1) s R Θ implies s R† Θ (viewing the state s as a point distribution), and
(2) (Linearity) ∆i R† Θi for all i ∈ I implies (Σi∈I pi·∆i) R† (Σi∈I pi·Θi), where I is a finite index set and Σi∈I pi = 1.

From the above definition we immediately get the following property.

Lemma 5.3. ∆ R† Θ if and only if there is a collection of states {si}i∈I, a collection of distributions {Θi}i∈I, and a collection of probabilities {pi}i∈I, for some finite index set I, such that Σi∈I pi = 1 and ∆, Θ can be decomposed as follows:

1. ∆ = Σi∈I pi·si
2. Θ = Σi∈I pi·Θi
3. For each i ∈ I we have si R Θi.

Proof. Similar to the proof of Proposition 3.1. ⊓⊔
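Lemma 5.3 suggests a direct way of building lifted pairs from witnesses: given data (pi, si, Θi) with si R Θi, the pair (∆, Θ) is obtained by summation. A small sketch, again our own illustration with distributions as dicts:

from fractions import Fraction

def lift_pair(witnesses):
    # witnesses: list of (p_i, s_i, Theta_i) with the p_i summing to 1 and
    # s_i R Theta_i; returns (Delta, Theta) with Delta R† Theta
    assert sum(p for p, _, _ in witnesses) == 1
    delta, theta = {}, {}
    for p, s, th in witnesses:
        delta[s] = delta.get(s, Fraction(0)) + p            # Delta = sum p_i . s_i
        for t, q in th.items():
            theta[t] = theta.get(t, Fraction(0)) + p * q    # Theta = sum p_i . Theta_i
    return delta, theta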

We apply this definition to the action relations −α→ ⊆ sCSP×D(sCSP) in the operational semantics of pCSP, and obtain lifted relations between D(sCSP) and D(sCSP), which to ease the notation we write as ∆ −α→ Θ; then, using pCSP terms to represent distributions, a simple instance of a transition between distributions is given by

(a.b □ a.c) ½⊕ a.d −a→ b ½⊕ d

Note that we also have

(a.b □ a.c) ½⊕ a.d −a→ (b ½⊕ c) ½⊕ d    (5.1)

because, viewed as a distribution, the term (a.b □ a.c) ½⊕ a.d may be re-written as ((a.b □ a.c) ½⊕ (a.b □ a.c)) ½⊕ a.d, representing the sum of point distributions

¼·(a.b □ a.c) + ¼·(a.b □ a.c) + ½·a.d

from which the move (5.1) can easily be derived using the three moves from states

a.b □ a.c −a→ b    a.b □ a.c −a→ c    a.d −a→ d

The lifting construction can also be used to define the concept of a partial internal move between distributions, one where part of the distribution does an internal move and the remainder stays unchanged. Write s −τ̂→ ∆ if either s −τ→ ∆ or ∆ = s. This relation between states and distributions can be lifted to one between distributions and distributions, and again for notational convenience we use ∆1 −τ̂→ ∆2 to denote the lifted relation. As an example, again using process terms to denote distributions, we have

(a ⊓ b) ½⊕ (a ⊓ c) −τ̂→ a ½⊕ ((a ⊓ b) ½⊕ c)

This follows because as a distribution (a ⊓ b) ½⊕ (a ⊓ c) may be written as

¼·(a ⊓ b) + ¼·(a ⊓ b) + ¼·(a ⊓ c) + ¼·(a ⊓ c)

and we have the four moves from states to distributions:

(a ⊓ b) −τ̂→ a    (a ⊓ b) −τ̂→ (a ⊓ b)
(a ⊓ c) −τ̂→ a    (a ⊓ c) −τ̂→ c
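The four hatted moves combine, via the lift_pair sketch given earlier, into exactly the partial internal move claimed (the state names below are just illustrative labels):

from fractions import Fraction

ab, ac, a, b, c = 'a⊓b', 'a⊓c', 'a', 'b', 'c'
q = Fraction(1, 4)
delta, target = lift_pair([
    (q, ab, {a: Fraction(1)}),    # (a⊓b) -tau^-> a
    (q, ab, {ab: Fraction(1)}),   # (a⊓b) stays unchanged
    (q, ac, {a: Fraction(1)}),    # (a⊓c) -tau^-> a
    (q, ac, {c: Fraction(1)}),    # (a⊓c) -tau^-> c
])
# delta  == {'a⊓b': 1/2, 'a⊓c': 1/2}
# target == {'a': 1/2, 'a⊓b': 1/4, 'c': 1/4}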

5.5.1 The simulation preorders

Following tradition it would be natural to define simulations as relations between states in a pLTS [21, 34], as we did in Section 3.5. However, technically it is more convenient to use relations in sCSP×D(sCSP). One reason may be understood through the example in Figure 5.5. Although in Example 5.3 we found out that R2 ⋢pmay R1, we do have R1 ⊑pmay R2. If we are to relate these processes via a simulation-like relation, then the initial state of R1 needs to be related to the initial distribution of R2, containing the two states a.b and a.c.

Our definition of simulation uses weak transitions [26], which have the standard definitions except that they now apply to distributions, and −τ̂→ is used instead of −τ→. This reflects the understanding that if a distribution may perform a sequence of internal moves before or after executing a visible action, different parts of the distribution may perform different numbers of internal actions:

• Let ∆1 =τ⇒ ∆2 whenever ∆1 (−τ̂→)* ∆2.
• Similarly, ∆1 =a⇒ ∆2 denotes ∆1 (−τ̂→)* −a→ (−τ̂→)* ∆2 whenever a ∈ Act.
• We write s ↛A, with A ⊆ Act, when ∀α ∈ A∪{τ}: s ↛α, and also ∆ ↛A when ∀s ∈ ⌈∆⌉: s ↛A.
• More generally, write ∆ ⇒↛A if ∆ ⇒ ∆pre for some ∆pre such that ∆pre ↛A.

Definition 5.4. A relation R ⊆ S×D(S) is said to be a failure simulation if s R Θ implies

• if s −α→ ∆ then there exists some Θ′ such that Θ =α⇒ Θ′ and ∆ R† Θ′;
• if s ↛A then Θ ⇒↛A.

We write s ⊳FS Θ if there is some failure simulation R such that s R Θ. Similarly, we define (forward) simulation and s ⊳S Θ by dropping the second clause in Definition 5.4.

Definition 5.5. The (forward) simulation preorder ⊑S and the failure simulation preorder ⊑FS on pCSP are defined as follows:

P ⊑S Q iff ⟦Q⟧ =τ⇒ Θ for some Θ with ⟦P⟧ (⊳S)† Θ
P ⊑FS Q iff ⟦P⟧ =τ⇒ Θ for some Θ with ⟦Q⟧ (⊳FS)† Θ.

We have chosen the orientation of the preorder symbol to match that of must testing, which goes back to the work of De Nicola and Hennessy [11]. This orientation also matches the one used in CSP [17] and related work, where we have SPECIFICATION ⊑ IMPLEMENTATION. At the same time, we like to stick to the convention popular in the CCS community of writing the simulated process to the left of the preorder symbol and the simulating process (that mimics moves of the simulated one) on the right. This is the reason for the orientation of the symbol ⊳FS.

P1 = a.(b ½⊕ (c ½⊕ c)): state s0 −a→ ∆1 = ½·s1 + ¼·s2 + ¼·s3, where s1 can do b and s2, s3 can do c
P2 = a.((b ½⊕ b) ½⊕ c): state t0 −a→ ∆2 = ¼·t1 + ¼·t2 + ½·t3, where t1, t2 can do b and t3 can do c

Fig. 5.7 Two simulation equivalent processes

The equivalences generated by ⊑S and ⊑FS are called (forward) simulation equivalence and failure simulation equivalence, denoted ≃S and ≃FS, respectively.

If P ∈ sCSP, that is if P is a state in the pLTS of pCSP and so ⟦P⟧ = P, then to establish P ⊑S Q it is sufficient to exhibit a simulation between the state P and the distribution ⟦Q⟧, because trivially s ⊳S ∆ implies s (⊳S)† ∆.

Example 5.13. Consider the two processes Pi in Figure 5.7. To show P1 ⊑S P2 it is sufficient to exhibit a simulation R such that s0 R t0. Let R ⊆ sCSP×D(sCSP) be defined by

s0 R t0    s1 R ∆t    s2 R t3    s3 R t3    0 R 0

where ∆t is the two-point distribution mapping both t1 and t2 to the probability ½. Then it is straightforward to check that it satisfies the requirements of a simulation: the only non-trivial requirement is that ∆1 R† ∆2. But this follows from the fact that

∆1 = ½·s1 + ¼·s2 + ¼·s3
∆2 = ½·∆t + ¼·t3 + ¼·t3

As another example, reconsider R1 = a.(b ½⊕ c) and R2 = a.b ½⊕ a.c from Figure 5.5, where for convenience we use process terms to denote their semantic interpretations. It is easy to see that R1 ⊑S R2 because of the simulation

R1 R ⟦R2⟧    b R b    c R c    0 R 0


The transition R1 −a→ (b ½⊕ c) is matched by the transition R2 −a→ (b ½⊕ c), with (b ½⊕ c) R† (b ½⊕ c).

Similarly (a ½⊕ c) ⊓ (b ½⊕ c) ⊑S (a ⊓ b) ½⊕ c because it is possible to find a simulation between the state (a ½⊕ c) ⊓ (b ½⊕ c) and the distribution (a ⊓ b) ½⊕ c.
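The decomposition used in Example 5.13 is again an instance of the lift_pair sketch from above (with illustrative state names s1..s3, t1..t3 from Figure 5.7):

from fractions import Fraction

half, quarter = Fraction(1, 2), Fraction(1, 4)
delta_t = {'t1': half, 't2': half}
d1, d2 = lift_pair([
    (half,    's1', delta_t),               # s1 R ∆t
    (quarter, 's2', {'t3': Fraction(1)}),   # s2 R t3
    (quarter, 's3', {'t3': Fraction(1)}),   # s3 R t3
])
# d1 == {'s1': 1/2, 's2': 1/4, 's3': 1/4}   (= ∆1)
# d2 == {'t1': 1/4, 't2': 1/4, 't3': 1/2}   (= ∆2)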

In case P ∉ sCSP, a statement P ⊑S Q cannot always be established by a simulation R such that ⟦P⟧ R† ⟦Q⟧.

Example 5.14. Compare the processes P = a ½⊕ b and P ⊓ P. Note that ⟦P⟧ is the distribution ½a + ½b, whereas ⟦P ⊓ P⟧ is the point distribution P ⊓ P. The relation R given by

(P ⊓ P) R (½a + ½b)    a R a    b R b    0 R 0

is a simulation, because the τ-step P ⊓ P −τ→ (½a + ½b) can be matched by the idle transition (½a + ½b) =τ⇒ (½a + ½b), and we have (½a + ½b) R† (½a + ½b). Thus (P ⊓ P) ⊳S (½a + ½b) = ⟦P⟧, hence ⟦P ⊓ P⟧ (⊳S)† ⟦P⟧, and therefore P ⊓ P ⊑S P.

This type of reasoning does not apply to the other direction. Any simulation R with (½a + ½b) R† ⟦P ⊓ P⟧ would have to satisfy a R ⟦P ⊓ P⟧ and b R ⟦P ⊓ P⟧. However, the move a −a→ 0 cannot be matched by the process P ⊓ P, as the only transition the latter process can do is P ⊓ P −τ→ (½a + ½b), and only half of that distribution can match the a-move. As a consequence, no such simulation exists, and we find that ⟦P⟧ (⊳S)† ⟦P ⊓ P⟧ does not hold. Nevertheless, we still have P ⊑S P ⊓ P. Here, the transition =τ⇒ from Definition 5.5 comes to the rescue. As ⟦P ⊓ P⟧ =τ⇒ ⟦P⟧ and ⟦P⟧ (⊳S)† ⟦P⟧, we obtain P ⊑S P ⊓ P.

Example 5.15. Let P = a ½⊕ b and Q = P □ P. We can establish that P ⊑S Q because ⟦P⟧ (⊳S)† ⟦Q⟧, which follows from the observations:

1. ⟦P⟧ = ½a + ½b
2. ⟦Q⟧ = ½(½(a □ a) + ½(a □ b)) + ½(½(a □ b) + ½(b □ b))
3. a ⊳S ½(a □ a) + ½(a □ b)
4. b ⊳S ½(a □ b) + ½(b □ b)

This kind of reasoning does not apply to ⊳FS. For example, we have

a ⋫FS ½(a □ a) + ½(a □ b)

because the state on the left-hand side can refuse to do action b while the distribution on the right-hand side cannot. Indeed, it holds that Q ⋢FS P.

Because of the asymmetric use of distributions in the definition of simulations, it is not immediately obvious that ⊑S and ⊑FS are actually preorders (reflexive and transitive relations), and hence that ≃S and ≃FS are equivalence relations. In order to show this, we first need to establish some properties of ⊳S and ⊳FS.


Lemma 5.4. Suppose Σi∈I pi = 1 and ∆i =α⇒ Φi for each i ∈ I, with I a finite index set. Then

Σi∈I pi·∆i =α⇒ Σi∈I pi·Φi

Proof. We first prove the case α = τ. For each i ∈ I there is a number ki such that ∆i = ∆i0 −τ̂→ ∆i1 −τ̂→ ∆i2 −τ̂→ ⋯ −τ̂→ ∆iki = Φi. Let k = max{ki | i ∈ I}, using that I is finite. Since we have Φ −τ̂→ Φ for any Φ ∈ D(S), we can add spurious transitions to these sequences until all ki equal k. After this preparation the lemma follows by k applications of the linearity of the lifting operation (cf. Definition 5.3), taking −τ̂→ for R.

The case α ∈ Act now follows by one more application of the linearity of the lifting operation, this time with R = −a→, preceded and followed by an application of the case α = τ. ⊓⊔

Lemma 5.5. Suppose ∆ (⊳S)† Φ and ∆ −α→ ∆′. Then Φ =α⇒ Φ′ for some Φ′ such that ∆′ (⊳S)† Φ′.

Proof. First, ∆ (⊳S)† Φ means that

∆ = Σi∈I pi·si,  si ⊳S Φi,  Φ = Σi∈I pi·Φi;    (5.2)

also ∆ −α→ ∆′ means

∆ = Σj∈J qj·tj,  tj −α→ Θj,  ∆′ = Σj∈J qj·Θj,    (5.3)

and we can assume without loss of generality that all the coefficients pi, qj are non-zero. Now define Ij = {i ∈ I | si = tj} and Ji = {j ∈ J | tj = si}, so that trivially

{(i, j) | i ∈ I, j ∈ Ji} = {(i, j) | j ∈ J, i ∈ Ij}    (5.4)

and note that

∆(si) = Σj∈Ji qj and ∆(tj) = Σi∈Ij pi    (5.5)

Because of (5.5) we have

Φ = Σi∈I pi·Φi = Σi∈I pi · Σj∈Ji (qj/∆(si))·Φi = Σi∈I Σj∈Ji (pi·qj/∆(si))·Φi

Now for each j in Ji we know that in fact tj = si, and so from the middle parts of (5.2) and (5.3) we obtain Φi =α⇒ Φij such that Θj (⊳S)† Φij. Lemma 5.4 yields

Φ =α⇒ Φ′ = Σi∈I Σj∈Ji (pi·qj/∆(si))·Φij

where within the summations si = tj, so that, using (5.4), Φ′ can also be written as

Σj∈J Σi∈Ij (pi·qj/∆(tj))·Φij    (5.6)

All that remains is to show that ∆′ (⊳S)† Φ′, which we do by manipulating ∆′ so that it takes on a form similar to that in (5.6):

∆′ = Σj∈J qj·Θj
   = Σj∈J qj · Σi∈Ij (pi/∆(tj))·Θj    using (5.5) again
   = Σj∈J Σi∈Ij (pi·qj/∆(tj))·Θj

Comparing this with (5.6) above we see that the required result, ∆′ (⊳S)† Φ′, follows from an application of the linearity of the lifting operation. ⊓⊔

Lemma 5.6. Suppose ∆ (⊳S)† Φ and ∆ =α⇒ ∆′. Then Φ =α⇒ Φ′ for some Φ′ such that ∆′ (⊳S)† Φ′.

Proof. First we consider two claims:

(i) If ∆ (⊳S)† Φ and ∆ −τ̂→ ∆′, then Φ =τ⇒ Φ′ for some Φ′ such that ∆′ (⊳S)† Φ′.
(ii) If ∆ (⊳S)† Φ and ∆ =τ⇒ ∆′, then Φ =τ⇒ Φ′ for some Φ′ such that ∆′ (⊳S)† Φ′.

The proof of claim (i) is similar to the proof of Lemma 5.5. Claim (ii) follows from claim (i) by induction on the length of the derivation of =τ⇒. By combining claim (ii) with Lemma 5.5, we obtain the required result. ⊓⊔

Proposition 5.2. The relation (⊳S)† is both reflexive and transitive on distributions.

Proof. We leave reflexivity to the reader; it relies on the fact that s ⊳S s for every state s.

For transitivity, let R ⊆ sCSP×D(sCSP) be given by: s R Φ iff s ⊳S ∆ (⊳S)† Φ for some intermediate distribution ∆. Transitivity follows from the two claims:

(i) Θ (⊳S)† ∆ (⊳S)† Φ implies Θ R† Φ;
(ii) R is a simulation, hence R† ⊆ (⊳S)†.

Claim (ii) is a straightforward application of Lemma 5.6, so let us look at (i). From Θ (⊳S)† ∆ we have

Θ = Σi∈I pi·si,  si ⊳S ∆i,  ∆ = Σi∈I pi·∆i

Since ∆ (⊳S)† Φ, in analogy to Proposition 3.3 we can show that Φ = Σi∈I pi·Φi such that ∆i (⊳S)† Φi. So for each i we have si R Φi, from which it follows that Θ R† Φ. ⊓⊔

Proposition 5.3. ⊑S and ⊑FS are preorders, i.e. they are reflexive and transitive.

Proof. By combining Lemma 5.6 and Proposition 5.2, we obtain that ⊑S is a preorder. The case for ⊑FS can be established similarly, by proving the counterparts of Lemma 5.6 and Proposition 5.2. ⊓⊔

5.5.2 The simulation preorders are precongruences

In Theorem 5.1 of this section we establish that the pCSP operators are monotone with respect to the simulation preorders, i.e. that both ⊑S and ⊑FS are precongruences for pCSP. This implies that the pCSP operators are compositional for them or, equivalently, that ≃S and ≃FS are congruences for pCSP. The following two lemmas gather some facts we need in the proof of this theorem. Their proofs are straightforward, although somewhat tedious.

Lemma 5.7. (i) If Φ =τ⇒ Φ′ then Φ □ ∆ =τ⇒ Φ′ □ ∆ and ∆ □ Φ =τ⇒ ∆ □ Φ′.
(ii) If Φ −a→ Φ′ then Φ □ ∆ −a→ Φ′ and ∆ □ Φ −a→ Φ′.
(iii) (∑j∈J pj·Φj) □ (∑k∈K qk·∆k) = ∑j∈J ∑k∈K (pj·qk)·(Φj □ ∆k).
(iv) Let R, R′ ⊆ sCSP × D(sCSP) be two binary relations satisfying s R′ ∆ whenever s = s1 □ s2 and ∆ = ∆1 □ ∆2 with s1 R ∆1 and s2 R ∆2. Then Φi R† ∆i for i = 1,2 implies (Φ1 □ Φ2) R′† (∆1 □ ∆2). ⊓⊔

Lemma 5.8. (i) If Φ =τ⇒ Φ′ then Φ |A ∆ =τ⇒ Φ′ |A ∆ and ∆ |A Φ =τ⇒ ∆ |A Φ′.
(ii) If Φ −a→ Φ′ and a ∉ A then Φ |A ∆ −a→ Φ′ |A ∆ and ∆ |A Φ −a→ ∆ |A Φ′.
(iii) If Φ −a→ Φ′, ∆ −a→ ∆′ and a ∈ A then ∆ |A Φ −τ→ ∆′ |A Φ′.
(iv) (∑j∈J pj·Φj) |A (∑k∈K qk·∆k) = ∑j∈J ∑k∈K (pj·qk)·(Φj |A ∆k).
(v) Let R, R′ ⊆ sCSP × D(sCSP) be two binary relations satisfying s R′ ∆ whenever s = s1 |A s2 and ∆ = ∆1 |A ∆2 with s1 R ∆1 and s2 R ∆2. Then Φi R† ∆i for i = 1,2 implies (Φ1 |A Φ2) R′† (∆1 |A ∆2). ⊓⊔

Theorem 5.1. Let ⊑ ∈ {⊑S, ⊑FS}. Suppose Pi ⊑ Qi for i = 1,2. Then

1. a.P1 ⊑ a.Q1
2. P1 ⊓ P2 ⊑ Q1 ⊓ Q2
3. P1 □ P2 ⊑ Q1 □ Q2
4. P1 p⊕ P2 ⊑ Q1 p⊕ Q2
5. P1 |A P2 ⊑ Q1 |A Q2

Proof. We first consider the case for ⊑S.

1. Since P1 ⊑S Q1, there must be a ∆1 such that ⟦Q1⟧ =τ⇒ ∆1 and ⟦P1⟧ (⊳S)† ∆1. It is easy to see that a.P1 ⊳S ⟦a.Q1⟧, because the transition a.P1 −a→ ⟦P1⟧ can be matched by a.Q1 −a→ ⟦Q1⟧ =τ⇒ ∆1. Thus ⟦a.P1⟧ (⊳S)† ⟦a.Q1⟧.


2. Since Pi ⊑S Qi, there must be a ∆i such that ⟦Qi⟧ =τ⇒ ∆i and ⟦Pi⟧ (⊳S)† ∆i. It is easy to see that P1 ⊓ P2 ⊳S ⟦Q1 ⊓ Q2⟧, because the transition P1 ⊓ P2 −τ→ ⟦Pi⟧, for i = 1 or i = 2, can be matched by Q1 ⊓ Q2 −τ→ ⟦Qi⟧ =τ⇒ ∆i. Thus ⟦P1 ⊓ P2⟧ (⊳S)† ⟦Q1 ⊓ Q2⟧.

3. Let R ⊆ sCSP × D(sCSP) be defined by s R ∆ iff either s ⊳S ∆, or s = s1 □ s2 and ∆ = ∆1 □ ∆2 with s1 ⊳S ∆1 and s2 ⊳S ∆2. We show that R is a simulation. Suppose s1 ⊳S ∆1, s2 ⊳S ∆2 and s1 □ s2 −a→ Θ with a ∈ Act. Then si −a→ Θ for i = 1 or i = 2. Thus ∆i =a⇒ ∆ for some ∆ with Θ (⊳S)† ∆, and hence Θ R† ∆. By Lemma 5.7 we have ∆1 □ ∆2 =a⇒ ∆.
   Now suppose that s1 ⊳S ∆1, s2 ⊳S ∆2 and s1 □ s2 −τ→ Θ. Then either s1 −τ→ Φ and Θ = Φ □ s̄2, or s2 −τ→ Φ and Θ = s̄1 □ Φ. By symmetry we may restrict attention to the first case. Thus ∆1 =τ⇒ ∆ for some ∆ with Φ (⊳S)† ∆. By Lemma 5.7 we have (Φ □ s̄2) R† (∆ □ ∆2) and ∆1 □ ∆2 =τ⇒ ∆ □ ∆2.
   The case that s ⊳S ∆ is trivial, so we have checked that R is indeed a simulation. Using this, we proceed to show that P1 □ P2 ⊑S Q1 □ Q2. Since Pi ⊑S Qi, there must be a ∆i such that ⟦Qi⟧ =τ⇒ ∆i and ⟦Pi⟧ (⊳S)† ∆i. By Lemma 5.7 we have ⟦P1 □ P2⟧ = (⟦P1⟧ □ ⟦P2⟧) R† (∆1 □ ∆2). Therefore ⟦P1 □ P2⟧ (⊳S)† (∆1 □ ∆2). By Lemma 5.7 we also obtain
   ⟦Q1 □ Q2⟧ = ⟦Q1⟧ □ ⟦Q2⟧ =τ⇒ ∆1 □ ⟦Q2⟧ =τ⇒ ∆1 □ ∆2,
   so the required result is established.

4. Since Pi ⊑S Qi, there must be a ∆i such that ⟦Qi⟧ =τ⇒ ∆i and ⟦Pi⟧ (⊳S)† ∆i. Thus ⟦Q1 p⊕ Q2⟧ = p·⟦Q1⟧ + (1−p)·⟦Q2⟧ =τ⇒ p·∆1 + (1−p)·∆2 by Lemma 5.4, and ⟦P1 p⊕ P2⟧ = p·⟦P1⟧ + (1−p)·⟦P2⟧ (⊳S)† p·∆1 + (1−p)·∆2 by the linearity of the lifting operation. Hence P1 p⊕ P2 ⊑S Q1 p⊕ Q2.

5. Let R ⊆ sCSP × D(sCSP) be defined by s R ∆ iff s = s1 |A s2 and ∆ = ∆1 |A ∆2 with s1 ⊳S ∆1 and s2 ⊳S ∆2. We show that R is a simulation. There are three cases to consider.

   a. Suppose s1 ⊳S ∆1, s2 ⊳S ∆2 and s1 |A s2 −α→ Θ1 |A s̄2 because of the transition s1 −α→ Θ1 with α ∉ A. Then ∆1 =α⇒ ∆′1 for some ∆′1 with Θ1 (⊳S)† ∆′1. By Lemma 5.8 we have ∆1 |A ∆2 =α⇒ ∆′1 |A ∆2, and it can also be seen that (Θ1 |A s̄2) R† (∆′1 |A ∆2).
   b. The symmetric case can be analysed similarly.
   c. Suppose s1 ⊳S ∆1, s2 ⊳S ∆2 and s1 |A s2 −τ→ Θ1 |A Θ2 because of the transitions s1 −a→ Θ1 and s2 −a→ Θ2 with a ∈ A. Then for i = 1 and i = 2 we have ∆i =τ⇒ ∆′i −a→ ∆″i =τ⇒ ∆‴i for some ∆′i, ∆″i, ∆‴i with Θi (⊳S)† ∆‴i. By Lemma 5.8 we have ∆1 |A ∆2 =τ⇒ ∆′1 |A ∆′2 −τ→ ∆″1 |A ∆″2 =τ⇒ ∆‴1 |A ∆‴2 and (Θ1 |A Θ2) R† (∆‴1 |A ∆‴2).

   So we have checked that R is a simulation. Since Pi ⊑S Qi, there must be a ∆i such that ⟦Qi⟧ =τ⇒ ∆i and ⟦Pi⟧ (⊳S)† ∆i. By Lemma 5.8 we have ⟦P1 |A P2⟧ = (⟦P1⟧ |A ⟦P2⟧) R† (∆1 |A ∆2). Therefore ⟦P1 |A P2⟧ (⊳S)† (∆1 |A ∆2). By Lemma 5.8 we also obtain

⟦Q1 |A Q2⟧ = ⟦Q1⟧ |A ⟦Q2⟧ =τ⇒ ∆1 |A ⟦Q2⟧ =τ⇒ ∆1 |A ∆2,

which had to be established.

The case for ⊑FS is analogous. As an example, we show that ⊑FS is preserved under parallel composition. The key step is to show that the binary relation R ⊆ sCSP × D(sCSP) defined by

R := {(s1 |A s2, ∆1 |A ∆2) | s1 ⊳FS ∆1 ∧ s2 ⊳FS ∆2}

is a failure simulation. Suppose si ⊳FS ∆i for i = 1,2 and s1 |A s2 ↛X for some X ⊆ Act. For each a ∈ X there are two possibilities:

• If a ∉ A then s1 ↛a and s2 ↛a, since otherwise we would have s1 |A s2 −a→.
• If a ∈ A then either s1 ↛a or s2 ↛a, since otherwise we would have s1 |A s2 −τ→.

Hence we can partition the set X into three subsets X0, X1 and X2, such that X0 = X\A and X1 ∪ X2 ⊆ A with s1 ↛X1 and s2 ↛X2, while still allowing s1 ↛a for some a ∈ X2 and s2 ↛a for some a ∈ X1. We then have si ↛X0∪Xi for i = 1,2. By the assumption that si ⊳FS ∆i for i = 1,2, there is a ∆′i with ∆i =τ⇒ ∆′i ↛X0∪Xi. Therefore ∆′1 |A ∆′2 ↛X as well. By Lemma 5.8(i) we have ∆1 |A ∆2 =τ⇒ ∆′1 |A ∆′2. Hence ∆1 |A ∆2 can match the failures of s1 |A s2.

The matching of transitions, and the use of R to prove the preservation property of ⊑FS under parallel composition, are similar to those in the corresponding proof for simulations above, so we omit them. ⊓⊔

5.5.3 Simulations are sound for testing preorders

In this section we show that simulation is sound for may testing and failure simulation is sound for must testing. That is, we aim to prove that (i) P ⊑S Q implies P ⊑pmay Q, and (ii) P ⊑FS Q implies P ⊑pmust Q.

Originally the relation ⊑S was defined on pCSP; we now extend it to pCSPω, keeping Definition 5.5 unchanged.

Theorem 5.2. If P ⊑S Q then P ⊑pmay Q.

Proof. For any test T ∈ pCSPω and process P ∈ pCSP the set Vf(⟦T |Act P⟧) is finite, so

P ⊑pmay Q iff max(Vf(⟦T |Act P⟧)) ≤ max(Vf(⟦T |Act Q⟧)) for every test T.   (5.7)

The following properties, for ∆1, ∆2 ∈ D(sCSPω) and α ∈ Actτ, are not hard to establish:

∆1 =α⇒ ∆2 implies max(Vf(∆1)) ≥ max(Vf(∆2)).   (5.8)


∆1 (⊳S)† ∆2 implies max(Vf(∆1)) ≤ max(Vf(∆2)).   (5.9)

Now suppose P ⊑S Q. Since ⊑S is preserved by the parallel operator, we have T |Act P ⊑S T |Act Q for an arbitrary test T. By definition, this means that there is a distribution ∆ such that ⟦T |Act Q⟧ =τ⇒ ∆ and ⟦T |Act P⟧ (⊳S)† ∆. By (5.8) and (5.9) we infer that max(Vf(⟦T |Act P⟧)) ≤ max(Vf(⟦T |Act Q⟧)). The result now follows from (5.7). ⊓⊔

It is tempting to use the same idea to prove that P ⊑FS Q implies P ⊑pmust Q, but now using the function min in place of max. However, the min-analogue of Property (5.8) is in general invalid. For example, let R be the process a |Act (a □ ω). We have min(Vf(R)) = 1, yet R −τ→ 0 |Act 0 and min(Vf(0 |Act 0)) = 0. Therefore, it is not the case that ∆1 =τ⇒ ∆2 implies min(Vf(∆1)) ≤ min(Vf(∆2)). Examining this example reveals that the culprit is the "scooting" τ-transitions: τ-transitions of a state that can enable an ω-action at the same time.

Our strategy is therefore as follows: when comparing two states, "scooting" transitions are purposefully ignored. Write s −α→ω ∆ if both s ↛ω and s −α→ ∆ hold. We define the relation −τ→ω between distributions just as −τ→, but using the state-level relation −τ→ω in place of −τ→. Similarly we define =⇒ω and =α⇒ω. Thus the subscript ω on a transition of any kind indicates that no state is passed through in which ω is enabled. A version of failure simulation adapted to these transition relations is then defined as follows.
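On a concrete finite pLTS, ignoring scooting transitions amounts to a one-line filter: a state contributes its moves to −α→ω only if it does not enable ω. The following Python fragment is our own illustration, with an ad-hoc encoding of a pLTS as a dict from states to labelled moves; the state R mimics the culprit process a |Act (a □ ω) above.

    # Each state maps to a list of (label, target distribution) moves.
    plts = {
        'R': [('omega', {'R': 1.0}),      # success enabled at R ...
              ('tau',   {'S': 1.0})],     # ... plus a "scooting" tau-move
        'S': [('a',     {'T': 1.0})],
        'T': [],
    }

    def enables_omega(s):
        return any(lab == 'omega' for lab, _ in plts[s])

    def moves_omega(s, alpha):
        """s --alpha-->_omega Delta: only from states not enabling omega."""
        if enables_omega(s):
            return []
        return [d for lab, d in plts[s] if lab == alpha]

    print(moves_omega('R', 'tau'))   # []  -- the scooting move is ignored
    print(moves_omega('S', 'a'))     # [{'T': 1.0}]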

Definition 5.6. Let ⊳oFS ⊆ sCSPω × D(sCSPω) be the largest binary relation such that s ⊳oFS Θ implies

• if s −α→ω ∆ then there is some Θ′ with Θ =α⇒ω Θ′ and ∆ (⊳oFS)† Θ′;
• if s ↛X with ω ∈ X then there is some Θ′ with Θ =τ⇒ω Θ′ and Θ′ ↛X.

Let P ⊑oFS Q iff ⟦P⟧ =τ⇒ω Θ for some Θ with ⟦Q⟧ (⊳oFS)† Θ.

Note that for processes P, Q in pCSP (as opposed to pCSPω) we have P ⊑FS Q iff P ⊑oFS Q.

Proposition 5.4. If P, Q are processes in pCSP with P ⊑FS Q, and T is a process in pCSPω, then T |Act P ⊑oFS T |Act Q.

Proof. Similar to the proof of Theorem 5.1. ⊓⊔

Proposition 5.5. The following properties hold, with ∆1, ∆2 ∈ D(sCSPω):

P ⊑pmust Q iff min(Vf(⟦T |Act P⟧)) ≤ min(Vf(⟦T |Act Q⟧)) for every test T.   (5.10)
∆1 =α⇒ω ∆2 for α ∈ Actτ implies min(Vf(∆1)) ≤ min(Vf(∆2)).   (5.11)
∆1 (⊳oFS)† ∆2 implies min(Vf(∆1)) ≥ min(Vf(∆2)).   (5.12)

Proof. Property (5.10) is again straightforward, and Property (5.11) can be established just as the proof of (5.8), but with all ≤-signs reversed. Property (5.12) follows by structural induction, simultaneously with the property, for s ∈ sCSPω and ∆ ∈ D(sCSPω), that


s ⊳oFS ∆ implies min(Vf(s)) ≥ min(Vf(∆)).   (5.13)

The reduction of Property (5.12) to (5.13) proceeds exactly as that for (5.9). For (5.13) itself we distinguish three cases:

• If s −ω→, then min(Vf(s)) = 1 ≥ min(Vf(∆)) trivially.
• If s ↛ω but s →, then each "non-scooting" transition from s will be matched by a "non-scooting" transition from ∆. Whenever s −α→ω Θ, for α ∈ Actτ and Θ ∈ D(sCSPω), then s ⊳oFS ∆ implies the existence of some ∆Θ such that ∆ =α⇒ω ∆Θ and Θ (⊳oFS)† ∆Θ. By induction, using (5.12), it follows that min(Vf(Θ)) ≥ min(Vf(∆Θ)). Consequently, we have that

min(Vf(s)) = min{min(Vf(Θ)) | s −α→ Θ}
           ≥ min{min(Vf(∆Θ)) | s −α→ Θ}
           ≥ min{min(Vf(∆)) | s −α→ Θ}   (by (5.11))
           = min(Vf(∆)).

• If s ↛, that is s ↛Actω, then there is some ∆′ such that ∆ =τ⇒ω ∆′ and ∆′ ↛Actω. By the definition of Vf, min(Vf(∆′)) = 0. Using (5.11), we have min(Vf(∆)) ≤ min(Vf(∆′)), so min(Vf(∆)) = 0 as well. Thus, also in this case, min(Vf(s)) ≥ min(Vf(∆)). ⊓⊔

Theorem 5.3. If P ⊑FS Q then P ⊑pmust Q.

Proof. Similar to the proof of Theorem 5.2, using (5.10)–(5.12). ⊓⊔

The next two sections are devoted to proving the converse of Theorems 5.2 and 5.3. That is, we will establish:

Theorem 5.4. Let P and Q be pCSP processes.

• P ⊑pmay Q implies P ⊑S Q, and
• P ⊑pmust Q implies P ⊑FS Q.

Because of Theorem 4.7, it will suffice to show that

• P ⊑Ωpmay Q implies P ⊑S Q, and
• P ⊑Ωpmust Q implies P ⊑FS Q.

This shift from scalar testing to vector-based testing is motivated by the fact that the latter enables us to use more informative tests, allowing us to discover more intensional properties of the processes being tested.

The crucial characteristics of A needed for the above implications are summarised in Lemmas 5.9 and 5.10. For convenience of presentation, we write ω⃗


for the vector in [0,1]^Ω defined by ω⃗(ω) = 1 and ω⃗(ω′) = 0 for ω′ ≠ ω. We also have the vector 0⃗ ∈ [0,1]^Ω with 0⃗(ω) = 0 for all ω ∈ Ω. Sometimes we treat a distribution ∆ of finite support as the pCSP expression ⊕s∈⌈∆⌉ ∆(s)·s, so that A(T, ∆) := Exp∆ A(T, ·).
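To make the vector notation concrete, here is a small Python sketch of our own (the encodings are ad hoc): outcome vectors are maps from success actions to [0,1], and Exp∆ applied to a set-valued function picks one outcome per state of the support and averages, which is the reading used in case (b) of the proof of Lemma 5.10 below.

    from itertools import product

    OMEGA = ['w1', 'w2']                  # a finite set of success actions

    def unit(w):                          # the vector  w-arrow
        return {o: (1.0 if o == w else 0.0) for o in OMEGA}

    ZERO = {o: 0.0 for o in OMEGA}        # the vector  0-arrow

    def expected_set(dist, results):
        """Exp over a distribution of *sets* of vectors: choose one
        element per state of the support, then average."""
        states = list(dist)
        out = []
        for choice in product(*(results[s] for s in states)):
            v = dict(ZERO)
            for s, o in zip(states, choice):
                for w in OMEGA:
                    v[w] += dist[s] * o[w]
            out.append(v)
        return out

    # Example: Delta = 1/2·s + 1/2·t, A(T,s) = {w1-arrow}, A(T,t) = {0-arrow, w2-arrow}.
    dist = {'s': 0.5, 't': 0.5}
    results = {'s': [unit('w1')], 't': [ZERO, unit('w2')]}
    print(expected_set(dist, results))
    # two outcome vectors: {w1: .5, w2: 0} and {w1: .5, w2: .5}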

Lemma 5.9. Let P be a pCSP process, and T, Ti be tests.

1. o ∈ A(ω, P) iff o = ω⃗.
2. 0⃗ ∈ A(□a∈X a.ω, P) iff ∃∆: ⟦P⟧ =τ⇒ ∆ ↛X.
3. Suppose the action ω does not occur in the test T. Then o ∈ A(τ.ω □ a.T, P) with o(ω) = 0 iff there is a ∆ ∈ D(sCSP) with ⟦P⟧ =a⇒ ∆ and o ∈ A(T, ∆).
4. o ∈ A(T1 p⊕ T2, P) iff o = p·o1 + (1−p)·o2 for some oi ∈ A(Ti, P).
5. o ∈ A(T1 ⊓ T2, P) if there are some probability q ∈ [0,1] and ∆1, ∆2 ∈ D(sCSP) such that ⟦P⟧ =τ⇒ q·∆1 + (1−q)·∆2 and o = q·o1 + (1−q)·o2 for some oi ∈ A(Ti, ∆i).

Proof. Straightforward, by induction on the structure of P. ⊓⊔

The converse of Lemma 5.9(5) also holds, as the following lemma says; however, the proof is less straightforward.

Lemma 5.10. Let P be a pCSP process, and Ti be tests. If o ∈ A(⊓i∈I Ti, P) then there are probabilities qi ∈ [0,1] and ∆i ∈ D(sCSP), for i ∈ I, with ∑i∈I qi = 1, such that ⟦P⟧ =τ⇒ ∑i∈I qi·∆i and o = ∑i∈I qi·oi for some oi ∈ A(Ti, ∆i).

Proof. Given that the states of our pLTS are sCSP expressions, there exists a well-founded order on the combination of states in sCSP and distributions in D(sCSP), such that s −α→ ∆ implies that s is larger than ∆, and any distribution is larger than the states in its support. Intuitively, this order corresponds to the usual order on natural numbers if we graphically depict a pCSP process as a finite tree (cf. Section 5.2.4) and assign to each node a number indicating its level in the tree. Let T = ⊓i∈I Ti. We prove the following two claims:

(a) If s is a state-based process and o ∈ A(T, s) then there are some {qi}i∈I with ∑i∈I qi = 1 such that s̄ =τ⇒ ∑i∈I qi·∆i, o = ∑i∈I qi·oi, and oi ∈ A(Ti, ∆i).
(b) If ∆ ∈ D(sCSP) and o ∈ A(T, ∆) then there are some {qi}i∈I with ∑i∈I qi = 1 such that ∆ =τ⇒ ∑i∈I qi·∆i, o = ∑i∈I qi·oi, and oi ∈ A(Ti, ∆i).

We proceed by simultaneous induction on the order mentioned above, applied to s and ∆.

(a) We have two sub-cases, depending on whether s can make an initial τ-move or not.

• If s cannot make a τ-move, that is s ↛τ, then the only possible moves from T |Act s are τ-moves originating in T; T has no non-τ moves, and any non-τ moves that might be possible for s on its own are inhibited by the alphabet Act of the composition. Suppose o ∈ A(T, s). Then there are some {qi}i∈I with ∑i∈I qi = 1 such that o = ∑i∈I qi·oi and oi ∈ A(Ti, s). Obviously we also have s̄ =τ⇒ ∑i∈I qi·s̄.


• If s can make one or more τ-moves, say s −τ→ ∆′j for j ∈ J, where without loss of generality J can be assumed to be a non-empty finite set disjoint from I, the index set for T, then the possible first moves of T |Act s are τ-moves either of T or of s, because T cannot make initial non-τ moves, and that prevents a proper synchronisation from occurring on the first step. Suppose that o ∈ A(T, s). Then there are some {pk}k∈I∪J with ∑k∈I∪J pk = 1 and

o = ∑k∈I∪J pk·o′k   (5.14)
o′i ∈ A(Ti, s) for all i ∈ I   (5.15)
o′j ∈ A(T, ∆′j) for all j ∈ J.   (5.16)

For each j ∈ J, we know by the induction hypothesis that

∆′j =τ⇒ ∑i∈I pji·∆′ji   (5.17)
o′j = ∑i∈I pji·o′ji   (5.18)
o′ji ∈ A(Ti, ∆′ji)   (5.19)

for some {pji}i∈I with ∑i∈I pji = 1. Let

qi = pi + ∑j∈J pj·pji
∆i = (1/qi)·(pi·s̄ + ∑j∈J pj·pji·∆′ji)
oi = (1/qi)·(pi·o′i + ∑j∈J pj·pji·o′ji)

for each i ∈ I, except that ∆i and oi are chosen arbitrarily in case qi = 0. It can be checked by arithmetic that qi, ∆i, oi have the required properties, viz. that ∑i∈I qi = 1, that o = ∑i∈I qi·oi, and that

s̄ =τ⇒ ∑i∈I pi·s̄ + ∑j∈J pj·∆′j
   =τ⇒ ∑i∈I pi·s̄ + ∑j∈J pj·∑i∈I pji·∆′ji   (by (5.17) and Lemma 5.4)
   = ∑i∈I qi·∆i.

Finally, it follows from (5.15) and (5.19) that oi ∈ A(Ti, ∆i) for each i ∈ I.

(b) Let ⌈∆⌉ = {sj}j∈J and rj = ∆(sj). Without loss of generality we may assume that J is a non-empty finite set disjoint from I. Using the condition that A(T, ∆) := Exp∆ A(T, ·), if o ∈ A(T, ∆) then


o = ∑j∈J rj·o′j   (5.20)
o′j ∈ A(T, sj)   (5.21)

For each j ∈ J, we know by the induction hypothesis that

sj =τ⇒ ∑i∈I qji·∆′ji   (5.22)
o′j = ∑i∈I qji·o′ji   (5.23)
o′ji ∈ A(Ti, ∆′ji)   (5.24)

for some {qji}i∈I with ∑i∈I qji = 1. Thus let

qi = ∑j∈J rj·qji
∆i = (1/qi)·∑j∈J rj·qji·∆′ji
oi = (1/qi)·∑j∈J rj·qji·o′ji

again choosing ∆i and oi arbitrarily in case qi = 0. As in the first case, it can be shown by arithmetic that the collection qi, ∆i, oi has the required properties. ⊓⊔

5.6 A modal logic

In this section we present logical characterisations ⊑L and ⊑F of our testing preorders. Besides their intrinsic interest, these logical preorders also serve as a stepping stone in proving Theorem 5.4. In this section we show that the logical preorders are sound with respect to the simulation and failure simulation preorders, and hence with respect to the testing preorders; in the next section we establish completeness. To start, we define a set F of modal formulae, inductively, as follows:

• ⊤ ∈ F,
• ref(X) ∈ F when X ⊆ Act,
• ⟨a⟩ϕ ∈ F when ϕ ∈ F and a ∈ Act,
• ϕ1 ∧ ϕ2 ∈ F when ϕ1 ∈ F and ϕ2 ∈ F,
• ϕ1 p⊕ ϕ2 ∈ F when ϕ1, ϕ2 ∈ F and p ∈ [0,1].

We often use the generalised probabilistic choice operator ⊕i∈I pi·ϕi, where I is a non-empty finite index set and ∑i∈I pi = 1. This can be expressed in our language by nested use of the binary probabilistic choice.
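The nesting is the same trick as axiom (P3) in Figure 5.8 below: peel off the first branch with its own weight and renormalise the rest. The following Python lines (our own illustration; names are invented) compute the binary probabilities and check that they reproduce the intended weights.

    def nest(weights):
        """Turn [p1, ..., pn] (summing to 1) into the binary probabilities
        used in  phi1 q1-plus (phi2 q2-plus (... phin))."""
        qs, rest = [], 1.0
        for p in weights[:-1]:
            qs.append(p / rest)   # weight of the left branch, renormalised
            rest -= p             # probability mass still to distribute
        return qs

    def unfold(qs):
        """Recover the flat weights from the nested binary ones."""
        ws, rest = [], 1.0
        for q in qs:
            ws.append(q * rest)
            rest *= (1 - q)
        return ws + [rest]

    qs = nest([0.5, 0.3, 0.2])            # [0.5, 0.6]
    assert all(abs(a - b) < 1e-9 for a, b in zip(unfold(qs), [0.5, 0.3, 0.2]))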

The satisfaction relation |= ⊆ D(sCSP) × F is given by:


• ∆ |= ⊤ for any ∆ ∈ D(sCSP),
• ∆ |= ref(X) iff there is a ∆′ with ∆ =τ⇒ ∆′ and ∆′ ↛X,
• ∆ |= ⟨a⟩ϕ iff there is a ∆′ with ∆ =a⇒ ∆′ and ∆′ |= ϕ,
• ∆ |= ϕ1 ∧ ϕ2 iff ∆ |= ϕ1 and ∆ |= ϕ2,
• ∆ |= ϕ1 p⊕ ϕ2 iff there are ∆1, ∆2 ∈ D(sCSP) with ∆1 |= ϕ1 and ∆2 |= ϕ2, such that ∆ =τ⇒ p·∆1 + (1−p)·∆2.

Let L be the subclass of F obtained by skipping the ref(X) clause. We shall write P ⊑L Q just when ⟦P⟧ |= ϕ implies ⟦Q⟧ |= ϕ for all ϕ ∈ L, and P ⊑F Q just when ⟦P⟧ |= ϕ is implied by ⟦Q⟧ |= ϕ for all ϕ ∈ F. (Note the opposing directions.)

Remark 5.1. Compared with the two-sorted logic in Section 3.6.1, the logic F and its sublogic L drop state formulae and contain only distribution formulae. The reason is that we will characterise the failure simulation preorder and the simulation preorder; both of them are distribution-based and strictly coarser than the state-based bisimilarity investigated in Chapter 3.

In order to obtain the main result of this section, Theorem 5.5, we introduce the following tool.

Definition 5.7. The F-characteristic formula ϕs or ϕ∆ of a process s ∈ sCSP or ∆ ∈ D(sCSP) is defined inductively:

• ϕs := ∧{s −a→ ∆} ⟨a⟩ϕ∆ ∧ ref({a | s ↛a})   if s ↛τ,
• ϕs := ∧{s −a→ ∆} ⟨a⟩ϕ∆ ∧ ∧{s −τ→ ∆} ϕ∆   otherwise,
• ϕ∆ := ⊕s∈⌈∆⌉ ∆(s)·ϕs.

Here the conjunctions ∧{s −a→ ∆} range over suitable pairs a, ∆, and ∧{s −τ→ ∆} ranges over suitable ∆. The L-characteristic formulae ψs and ψ∆ are defined likewise, but omitting the conjuncts ref({a | s ↛a}).

We write ϕ ⇛ ψ, with ϕ, ψ ∈ F, if for each distribution ∆ one has that ∆ |= ϕ implies ∆ |= ψ. Then it is easy to see that ϕs ⇛ ψs and that ∧i∈I ϕi ⇛ ϕi for any i ∈ I; furthermore, the following property can be established by an easy inductive proof.

Lemma 5.11. For any ∆ ∈ D(sCSP) we have ∆ |= ϕ∆, as well as ∆ |= ψ∆. ⊓⊔

This lemma and the following one help to establish Theorem 5.5.

Lemma 5.12. For any processes P, Q ∈ pCSP we have that ⟦P⟧ |= ϕ⟦Q⟧ implies P ⊑FS Q, and likewise that ⟦Q⟧ |= ψ⟦P⟧ implies P ⊑S Q.

Proof. To establish the first statement, we define the relation R by s R Θ iff Θ |= ϕs; to show that it is a failure simulation we first prove the following technical result:

Θ |= ϕ∆ implies ∃Θ′: Θ =τ⇒ Θ′ ∧ ∆ R† Θ′.   (5.25)

Suppose Θ |= ϕ∆ with ϕ∆ = ⊕i∈I pi·ϕsi, so that we have ∆ = ∑i∈I pi·s̄i, and for all i ∈ I there are Θi ∈ D(sCSP) with Θi |= ϕsi such that Θ =τ⇒ Θ′ with Θ′ := ∑i∈I pi·Θi. Since si R Θi for all i ∈ I we have ∆ R† Θ′.

Now we show that R is a failure simulation. Assume that s R Θ.


• Suppose s −τ→ ∆. Then from Definition 5.7 we have ϕs ⇛ ϕ∆, so that Θ |= ϕ∆. Applying (5.25) gives us Θ =τ⇒ Θ′ with ∆ R† Θ′ for some Θ′.
• Suppose s −a→ ∆ with a ∈ Act. Then ϕs ⇛ ⟨a⟩ϕ∆, so Θ |= ⟨a⟩ϕ∆. Hence there exists some Θ′ with Θ =a⇒ Θ′ and Θ′ |= ϕ∆. Again apply (5.25).
• Suppose s ↛X with X ⊆ Act. Then ϕs ⇛ ref(X), so Θ |= ref(X). Hence there exists some Θ′ with Θ =τ⇒ Θ′ and Θ′ ↛X.

Thus R is indeed a failure simulation. By our assumption ⟦P⟧ |= ϕ⟦Q⟧, using (5.25), there exists a Θ′ such that ⟦P⟧ =τ⇒ Θ′ and ⟦Q⟧ R† Θ′, which gives P ⊑FS Q via Definition 5.5.

To establish the second statement, define the relation S by s S Θ iff Θ |= ψs; exactly as above one obtains

Θ |= ψ∆ implies ∃Θ′: Θ =τ⇒ Θ′ ∧ ∆ S† Θ′.   (5.26)

Just as above it follows that S is a simulation. By the assumption ⟦Q⟧ |= ψ⟦P⟧, using (5.26), there exists a Θ′ such that ⟦Q⟧ =τ⇒ Θ′ and ⟦P⟧ S† Θ′. Hence P ⊑S Q via Definition 5.5. ⊓⊔

Theorem 5.5.

1. If P ⊑L Q then P ⊑S Q.
2. If P ⊑F Q then P ⊑FS Q.

Proof. Suppose that P ⊑F Q. By Lemma 5.11 we have ⟦Q⟧ |= ϕ⟦Q⟧ and therefore ⟦P⟧ |= ϕ⟦Q⟧. Lemma 5.12 gives P ⊑FS Q.

Now assume that P ⊑L Q. We have ⟦P⟧ |= ψ⟦P⟧, hence ⟦Q⟧ |= ψ⟦P⟧, and thus P ⊑S Q. ⊓⊔

5.7 Characteristic tests

Our final step towards Theorem 5.4 is taken in this section, where we show that every modal formula ϕ can be characterised by a vector-based test Tϕ, with the property that any pCSP process satisfies ϕ just when it passes the test Tϕ.

Lemma 5.13. For every ϕ ∈ F there exists a pair (Tϕ, vϕ) with Tϕ an Ω-test and vϕ ∈ [0,1]^Ω, such that

∆ |= ϕ iff ∃o ∈ A(Tϕ, ∆): o ≤ vϕ   (5.27)

for all ∆ ∈ D(sCSP), and in case ϕ ∈ L we also have

∆ |= ϕ iff ∃o ∈ A(Tϕ, ∆): o ≥ vϕ.   (5.28)

Tϕ is called a characteristic test of ϕ, and vϕ its target value.


Proof. First of all note that if a pair (Tϕ, vϕ) satisfies the requirements above, then so does any pair obtained from (Tϕ, vϕ) by bijectively renaming the elements of Ω. Hence a characteristic test can always be chosen in such a way that there is a success action ω ∈ Ω that does not occur in the (finite) test Tϕ. Moreover, any countable collection of characteristic tests can be assumed to be Ω-disjoint, meaning that no ω ∈ Ω occurs in two different elements of the collection.

The required characteristic tests and target values are obtained as follows.

• Let ϕ = ⊤. Take Tϕ := ω for some ω ∈ Ω, and vϕ := ω⃗.
• Let ϕ = ref(X) with X ⊆ Act. Take Tϕ := □a∈X a.ω for some ω ∈ Ω, and vϕ := 0⃗.
• Let ϕ = ⟨a⟩ψ. By induction, ψ has a characteristic test Tψ with target value vψ. Take Tϕ := τ.ω □ a.Tψ, where ω ∈ Ω does not occur in Tψ, and vϕ := vψ.
• Let ϕ = ϕ1 ∧ ϕ2. Choose an Ω-disjoint pair (Ti, vi) of characteristic tests Ti with target values vi, for i = 1,2. Furthermore, let p ∈ (0,1] be chosen arbitrarily, and take Tϕ := T1 p⊕ T2 and vϕ := p·v1 + (1−p)·v2.
• Let ϕ = ϕ1 p⊕ ϕ2. Again choose an Ω-disjoint pair (Ti, vi) of characteristic tests Ti with target values vi, i = 1,2, this time ensuring that there are two distinct success actions ω1, ω2 that do not occur in any of these tests. Let T′i := Ti ½⊕ ωi and v′i := ½·vi + ½·ω⃗i. Note that for i = 1,2 we have that T′i is also a characteristic test of ϕi, with target value v′i. Take Tϕ := T′1 ⊓ T′2 and vϕ := p·v′1 + (1−p)·v′2.
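The construction is a straightforward recursion on ϕ, with two bookkeeping obligations: fresh success actions and Ω-disjointness. The Python sketch below is our own illustration, under stated assumptions: test syntax is encoded as nested tuples of our invention, and freshness is handled with a counter. It mirrors the five cases and returns (Tϕ, vϕ), with vϕ a map from the success actions used so far to [0,1] (absent actions implicitly have target value 0).

    import itertools

    fresh = (f"w{n}" for n in itertools.count())   # unbounded supply of success actions

    def char_test(phi):
        """phi is one of: ('top',), ('ref', X), ('diam', a, psi),
        ('and', p1, p2), ('prob', p, p1, p2).  Returns (T, v)."""
        kind = phi[0]
        if kind == 'top':
            w = next(fresh)
            return ('omega', w), {w: 1.0}
        if kind == 'ref':
            w = next(fresh)
            return ('extsum', [('prefix', a, ('omega', w)) for a in phi[1]]), {w: 0.0}
        if kind == 'diam':
            T, v = char_test(phi[2])
            w = next(fresh)
            return ('ext', ('tau', ('omega', w)), ('prefix', phi[1], T)), {**v, w: 0.0}
        if kind == 'and':
            (T1, v1), (T2, v2) = char_test(phi[1]), char_test(phi[2])
            p = 0.5                        # any p in (0,1] will do
            v = {w: p * v1.get(w, 0) + (1 - p) * v2.get(w, 0) for w in {**v1, **v2}}
            return ('pchoice', p, T1, T2), v
        if kind == 'prob':
            p, (T1, v1), (T2, v2) = phi[1], char_test(phi[2]), char_test(phi[3])
            w1, w2 = next(fresh), next(fresh)
            T1p = ('pchoice', 0.5, T1, ('omega', w1))
            T2p = ('pchoice', 0.5, T2, ('omega', w2))
            v1p = {**{w: x / 2 for w, x in v1.items()}, w1: 0.5}
            v2p = {**{w: x / 2 for w, x in v2.items()}, w2: 0.5}
            v = {w: p * v1p.get(w, 0) + (1 - p) * v2p.get(w, 0) for w in {**v1p, **v2p}}
            return ('intchoice', T1p, T2p), v
        raise ValueError(kind)

    # the test and target value for <a>(top AND ref({b}))
    print(char_test(('diam', 'a', ('and', ('top',), ('ref', ['b'])))))

Because every recursive call draws its success actions from the shared counter, the subtests produced are automatically Ω-disjoint, as the proof requires.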

Note that vϕ(ω) = 0 whenever ω ∈ Ω does not occur in Tϕ. By induction on ϕ we now check (5.27) above.

• Let ϕ = ⊤. For all ∆ ∈ D(sCSP) we have ∆ |= ϕ and ∃o ∈ A(Tϕ, ∆): o ≤ vϕ, using Lemma 5.9(1).
• Let ϕ = ref(X) with X ⊆ Act. Suppose ∆ |= ϕ. Then there is a ∆′ with ∆ =τ⇒ ∆′ and ∆′ ↛X. By Lemma 5.9(2), 0⃗ ∈ A(Tϕ, ∆).
  Now suppose ∃o ∈ A(Tϕ, ∆): o ≤ vϕ. This implies o = 0⃗, so by Lemma 5.9(2) there is a ∆′ with ∆ =τ⇒ ∆′ and ∆′ ↛X. Hence ∆ |= ϕ.
• Let ϕ = ⟨a⟩ψ with a ∈ Act. Suppose ∆ |= ϕ. Then there is a ∆′ with ∆ =a⇒ ∆′ and ∆′ |= ψ. By induction, ∃o ∈ A(Tψ, ∆′): o ≤ vψ. By Lemma 5.9(3), we know that o ∈ A(Tϕ, ∆).
  Now suppose ∃o ∈ A(Tϕ, ∆): o ≤ vϕ. This implies o(ω) = 0, so by using Lemma 5.9(3) we see that there is a ∆′ with ∆ =a⇒ ∆′ and o ∈ A(Tψ, ∆′). By induction, ∆′ |= ψ, so ∆ |= ϕ.
• Let ϕ = ϕ1 ∧ ϕ2 and suppose ∆ |= ϕ. Then ∆ |= ϕi for i = 1,2, and hence, by induction, ∃oi ∈ A(Ti, ∆): oi ≤ vi. Thus o := p·o1 + (1−p)·o2 ∈ A(Tϕ, ∆) by Lemma 5.9(4), and o ≤ vϕ.
  Now suppose ∃o ∈ A(Tϕ, ∆): o ≤ vϕ. Then, using Lemma 5.9(4), we have that o = p·o1 + (1−p)·o2 for certain oi ∈ A(Ti, ∆). Note that T1, T2 are Ω-disjoint tests. One has oi ≤ vi for i = 1,2, for if oi(ω) > vi(ω) for some i = 1 or 2 and ω ∈ Ω, then ω must occur in Ti and hence cannot occur in T3−i. This implies v3−i(ω) = 0 and thus o(ω) > vϕ(ω), in contradiction with the assumption. By induction, ∆ |= ϕi for i = 1,2, and hence ∆ |= ϕ.


• Let ϕ = ϕ1 p⊕ ϕ2. Suppose ∆ |= ϕ. Then for i = 1,2 there are ∆i ∈ D(sCSP) with ∆i |= ϕi such that ∆ =τ⇒ p·∆1 + (1−p)·∆2. By induction, for i = 1,2 there are oi ∈ A(Ti, ∆i) with oi ≤ vi. Hence there are o′i ∈ A(T′i, ∆i) with o′i ≤ v′i. Thus o := p·o′1 + (1−p)·o′2 ∈ A(Tϕ, ∆) by Lemma 5.9(5), and o ≤ vϕ.
  Now suppose ∃o ∈ A(Tϕ, ∆): o ≤ vϕ. Then, by Lemma 5.10, there are q ∈ [0,1] and ∆1, ∆2 such that ∆ =τ⇒ q·∆1 + (1−q)·∆2 and o = q·o′1 + (1−q)·o′2 for some o′i ∈ A(T′i, ∆i). Now for i = 1,2 we have o′i(ωi) = v′i(ωi) = ½, so, using that T1, T2 are Ω-disjoint tests, we obtain ½q = q·o′1(ω1) = o(ω1) ≤ vϕ(ω1) = p·v′1(ω1) = ½p and ½(1−q) = (1−q)·o′2(ω2) = o(ω2) ≤ vϕ(ω2) = (1−p)·v′2(ω2) = ½(1−p). Together, these inequalities say that q = p. Exactly as in the previous case one obtains o′i ≤ v′i for i = 1,2. Given that T′i = Ti ½⊕ ωi, using Lemma 5.9(4), it must be that o′i = ½·oi + ½·ω⃗i for some oi ∈ A(Ti, ∆i) with oi ≤ vi. By induction, ∆i |= ϕi for i = 1,2, and hence ∆ |= ϕ.

In case ϕ ∈ L, the formula cannot be of the form ref(X). Then a straightforward induction yields that ∑ω∈Ω vϕ(ω) = 1 and, for all ∆ ∈ D(sCSP) and o ∈ A(Tϕ, ∆), that ∑ω∈Ω o(ω) = 1. Therefore, o ≤ vϕ iff o ≥ vϕ iff o = vϕ, yielding (5.28). ⊓⊔

Theorem 5.6.

1. If P ⊑Ωpmay Q then P ⊑L Q.
2. If P ⊑Ωpmust Q then P ⊑F Q.

Proof. Suppose P ⊑Ωpmust Q and ⟦Q⟧ |= ϕ for some ϕ ∈ F. Let Tϕ be a characteristic test of ϕ with target value vϕ. Then Lemma 5.13 yields ∃o ∈ A(Tϕ, ⟦Q⟧): o ≤ vϕ, and hence, given that P ⊑Ωpmust Q and A(Tϕ, ⟦R⟧) = A(Tϕ, R) for any R ∈ pCSP, by the Smyth preorder we have ∃o′ ∈ A(Tϕ, ⟦P⟧): o′ ≤ vϕ. Thus ⟦P⟧ |= ϕ.

The may-case goes likewise, via the Hoare preorder. ⊓⊔
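For finite outcome sets the Hoare and Smyth preorders used here are easy to spell out: O1 is Hoare-below O2 if every outcome in O1 is dominated by some outcome in O2, and Smyth-below if every outcome in O2 dominates some outcome in O1. A small Python sketch of our own (outcome vectors as dicts over Ω):

    def leq(o1, o2):                      # pointwise order on outcome vectors
        return all(o1.get(w, 0) <= o2.get(w, 0) for w in set(o1) | set(o2))

    def hoare_below(O1, O2):              # the comparison behind may testing
        return all(any(leq(o1, o2) for o2 in O2) for o1 in O1)

    def smyth_below(O1, O2):              # the comparison behind must testing
        return all(any(leq(o1, o2) for o1 in O1) for o2 in O2)

    O1 = [{'w': 0.5}]
    O2 = [{'w': 0.2}, {'w': 1.0}]
    print(hoare_below(O1, O2), smyth_below(O1, O2))   # True False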

Combining Theorems 4.7, 5.6 and 5.5, we obtain Theorem 5.4, the goal we set ourselves in Section 5.5.3. Thus, with Theorems 5.2 and 5.3, we have shown that the may preorder coincides with simulation, and that the must preorder coincides with failure simulation. These results also imply the converse of both statements in Theorem 5.6, and thus that the logics L and F give logical characterisations of the simulation and failure simulation preorders ⊑S and ⊑FS.

5.8 Equational theories

Having settled the problem of characterising the may preorder in terms of simulation, and the must preorder in terms of failure simulation, we now turn to complete axiomatisations of the preorders.

In order to focus on the essentials we consider just those pCSP processes that do not use the parallel operator |A; we call the resulting sub-language nCSP.

Let us write P =E Q for equivalences that can be derived using the equations given in the upper part of Figure 5.8. Given the way we defined the syntax of pCSP,


Equations:

(P1) P p⊕ P = P
(P2) P p⊕ Q = Q 1−p⊕ P
(P3) (P p⊕ Q) q⊕ R = P p·q⊕ (Q ((1−p)·q)/(1−p·q)⊕ R)
(I1) P ⊓ P = P
(I2) P ⊓ Q = Q ⊓ P
(I3) (P ⊓ Q) ⊓ R = P ⊓ (Q ⊓ R)
(E1) P □ 0 = P
(E2) P □ Q = Q □ P
(E3) (P □ Q) □ R = P □ (Q □ R)
(EI) a.P □ a.Q = a.P ⊓ a.Q
(D1) P □ (Q p⊕ R) = (P □ Q) p⊕ (P □ R)
(D2) a.P □ (Q ⊓ R) = (a.P □ Q) ⊓ (a.P □ R)
(D3) (P1 ⊓ P2) □ (Q1 ⊓ Q2) = (P1 □ (Q1 ⊓ Q2)) ⊓ (P2 □ (Q1 ⊓ Q2)) ⊓ ((P1 ⊓ P2) □ Q1) ⊓ ((P1 ⊓ P2) □ Q2)

May:

(May0) a.P □ b.Q = a.P ⊓ b.Q
(May1) P ⊑ P ⊓ Q
(May2) 0 ⊑ P
(May3) a.(P p⊕ Q) ⊑ a.P p⊕ a.Q

Must:

(Must1) P ⊓ Q ⊑ Q
(Must2) R ⊓ ⊓i∈I ⊕j∈Ji pj·(ai.Qij □ Pij) ⊑ □i∈I ai.⊕j∈Ji pj·Qij,
        provided inits(R) ⊆ {ai}i∈I

Fig. 5.8 Equations and inequations. © [2007] IEEE. Reprinted, with permission, from [6].
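Axiom (P3) is the re-association of probabilistic choice, and its side probabilities are easy to mis-derive. The following Python lines (a check of the arithmetic only, not of the process semantics) confirm that both sides of (P3) give the same three-way distribution over P, Q and R.

    p, q = 0.3, 0.6

    # left side: (P p-plus Q) q-plus R  gives  P: p·q, Q: (1-p)·q, R: 1-q
    left = (p * q, (1 - p) * q, 1 - q)

    # right side: P pq-plus (Q r-plus R)  with  r = (1-p)·q / (1-p·q)
    r = (1 - p) * q / (1 - p * q)
    right = (p * q, (1 - p * q) * r, (1 - p * q) * (1 - r))

    assert all(abs(a - b) < 1e-12 for a, b in zip(left, right))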

axiom (D1) is merely a case of abbreviation-expansion; thanks to (D1) there is no need for (meta-)variables ranging over the sub-sort of state-based processes anywhere in the axioms. Many of the standard equations for CSP [17] are missing; they are not sound for ≃FS. Typical examples include:

a.(P ⊓ Q) = a.P ⊓ a.Q
P = P □ P
P □ (Q ⊓ R) = (P □ Q) ⊓ (P □ R)
P ⊓ (Q □ R) = (P ⊓ Q) □ (P ⊓ R)

A detailed discussion of the standard equations for CSP in the presence of probabilistic processes has been given in Section 5.3.

Proposition 5.6. Suppose P =E Q. Then P ≃FS Q.


Proof. Because of Theorem 5.1, stating that ⊑FS is a precongruence, it is sufficient to exhibit appropriate witness failure simulations for the axioms in the upper part of Figure 5.8. ⊓⊔

As ≃S is a less discriminating equivalence than ≃FS, it follows that P =E Q implies P ≃S Q.

This equational theory allows us to reduce terms to a form in which the external choice operator is applied to prefix terms only.

Definition 5.8 (Normal forms). The set of normal forms N is given by the following grammar:

N ::= N1 p⊕ N2 | N1 ⊓ N2 | □i∈I ai.Ni

Proposition 5.7. For every P ∈ nCSP there is a normal form N such that P =E N.

Proof. A fairly straightforward induction, relying heavily on (D1)–(D3). ⊓⊔

We can also show that the axioms (P1)–(P3) and (D1) are in some sense all that are required to reason about probabilistic choice. Let P =prob Q denote that the equivalence of P and Q can be derived using those axioms alone. Then we have the following property.

Lemma 5.14. Let P, Q ∈ nCSP. Then ⟦P⟧ = ⟦Q⟧ implies P =prob Q.

Here ⟦P⟧ = ⟦Q⟧ says that ⟦P⟧ and ⟦Q⟧ are the very same distributions of state-based processes in sCSP; this is a much stronger prerequisite than P and Q being testing equivalent.

Proof. The axioms (P1)–(P3) and (D1) essentially allow any process to be written in the unique form ⊕i∈I pi·si, where the si ∈ sCSP are all different. ⊓⊔

5.9 Inequational theories

In order to characterise the simulation preorders, and the associated testing preorders, we introduce inequations. We write P ⊑Emay Q when P ⊑ Q is derivable from the inequational theory obtained by adding the four may inequations in the middle part of Figure 5.8 to the equations in its upper part. The first three additions, (May0)–(May2), are used in the standard testing theory of CSP [17, 11, 16]. For the must case, we write P ⊑Emust Q when P ⊑ Q is derivable from the equations and inequations in the upper and lower parts of Figure 5.8. In addition to the standard inequation (Must1), we require an inequational schema, (Must2); this uses the notation inits(P) to denote the (finite) set of initial actions of P. Formally,


inits(0) = ∅
inits(a.P) = {a}
inits(P p⊕ Q) = inits(P) ∪ inits(Q)
inits(P □ Q) = inits(P) ∪ inits(Q)
inits(P ⊓ Q) = {τ}
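The definition of inits is a one-line recursion on syntax. Here is a Python transcription of our own, with nCSP terms encoded ad hoc as nested tuples, for experimentation:

    def inits(P):
        """Initial actions of an nCSP term, following the clauses above.
        Terms: ('nil',), ('prefix', a, P), ('pchoice', p, P, Q),
               ('ext', P, Q), ('int', P, Q)."""
        kind = P[0]
        if kind == 'nil':
            return set()
        if kind == 'prefix':
            return {P[1]}
        if kind in ('pchoice', 'ext'):
            return inits(P[-2]) | inits(P[-1])
        if kind == 'int':
            return {'tau'}
        raise ValueError(kind)

    # inits(a.0 [] (b.0 1/2-plus 0)) = {'a', 'b'}
    print(inits(('ext', ('prefix', 'a', ('nil',)),
                 ('pchoice', 0.5, ('prefix', 'b', ('nil',)), ('nil',)))))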

The axiom (Must2) can equivalently be formulated as follows:

⊕k∈K □ℓ∈Lk akℓ.Rkℓ ⊓ ⊓i∈I ⊕j∈Ji pj·(ai.Qij □ Pij) ⊑ □i∈I ai.⊕j∈Ji pj·Qij,
provided {akℓ | k ∈ K, ℓ ∈ Lk} ⊆ {ai | i ∈ I}.

This is the case because a term R satisfies inits(R) ⊆ {ai}i∈I iff it can be converted into the form ⊕k∈K □ℓ∈Lk akℓ.Rkℓ by means of axioms (D1), (P1)–(P3) and (E1)–(E3) of Figure 5.8. The axiom can also be reformulated in an equivalent but more semantic style:

(Must2′) R ⊓ ⊓i∈I Pi ⊑ □i∈I ai.Qi,
         provided ⟦Pi⟧ −ai→ ⟦Qi⟧ and ⟦R⟧ ↛X with X = Act\{ai}i∈I.

This is the case because ⟦P⟧ −a→ ⟦Q⟧ iff, up to the axioms in Figure 5.8, P has the form ⊕j∈J pj·(a.Qj □ Pj) and Q has the form ⊕j∈J pj·Qj for certain Pj, Qj and pj, for j ∈ J.

Note that (Must2) can be used, together with (I1), to derive the dual of (May3) via the following inference:

a.P p⊕ a.Q =E (a.P p⊕ a.Q) ⊓ (a.P p⊕ a.Q) ⊑Emust a.(P p⊕ Q)

An important inequation that follows from (May1) and (P1) is

(May4) P p⊕ Q ⊑Emay P ⊓ Q

saying that any probabilistic choice can be simulated by an internal choice. It is derived as follows:

P p⊕ Q ⊑Emay (P ⊓ Q) p⊕ (P ⊓ Q) =E P ⊓ Q

Likewise, we have P ⊓ Q ⊑Emust P p⊕ Q.

Theorem 5.7. For P, Q in nCSP, it holds that

(i) P ⊑S Q if and only if P ⊑Emay Q;
(ii) P ⊑FS Q if and only if P ⊑Emust Q.


Proof. For one direction it is sufficient to check that the inequations and the inequational schema in Figure 5.8 are sound. The converse, completeness, is established in the next section. ⊓⊔

5.10 Completeness

The completeness proof of Theorem 5.7 depends on the following variation on the Derivative lemma of [26]:

Lemma 5.15 (Derivative lemma). Let P, Q ∈ nCSP.

(i) If ⟦P⟧ =τ⇒ ⟦Q⟧ then P ⊑Emust Q and Q ⊑Emay P.
(ii) If ⟦P⟧ =a⇒ ⟦Q⟧ then a.Q ⊑Emay P.

Proof. The proof of (i) proceeds in four stages. We only deal with ⊑Emay, as the proof for ⊑Emust is entirely analogous.

First we show by structural induction on s ∈ sCSP ∩ nCSP that s −τ→ ⟦Q⟧ implies Q ⊑Emay s. So suppose s −τ→ ⟦Q⟧. In case s has the form P1 ⊓ P2, it follows by the operational semantics of pCSP that Q = P1 or Q = P2. Hence Q ⊑Emay s by (May1). The only other possibility is that s has the form s1 □ s2. In that case there must be a distribution ∆ such that either s1 −τ→ ∆ and ⟦Q⟧ = ∆ □ s̄2, or s2 −τ→ ∆ and ⟦Q⟧ = s̄1 □ ∆. Using symmetry, we may restrict attention to the first case. Let R be a term such that ⟦R⟧ = ∆. Then ⟦R □ s2⟧ = ∆ □ s̄2 = ⟦Q⟧, so Lemma 5.14 yields Q =prob R □ s2. By induction we have R ⊑Emay s1, hence R □ s2 ⊑Emay s1 □ s2, and thus Q ⊑Emay s.

Now we show that s −τ̂→ ⟦Q⟧ implies Q ⊑Emay s. This follows because s −τ̂→ ⟦Q⟧ means that either s −τ→ ⟦Q⟧ or ⟦Q⟧ = s̄, and in the latter case Lemma 5.14 yields Q =prob s.

Next we show that ⟦P⟧ −τ̂→ ⟦Q⟧ implies Q ⊑Emay P. So suppose ⟦P⟧ −τ̂→ ⟦Q⟧, that is

⟦P⟧ = ∑i∈I pi·s̄i,   si −τ̂→ ⟦Qi⟧,   ⟦Q⟧ = ∑i∈I pi·⟦Qi⟧

for some I, pi ∈ (0,1], si ∈ sCSP ∩ nCSP and Qi ∈ nCSP. Now

1. ⟦P⟧ = ⟦⊕i∈I pi·si⟧. By Lemma 5.14 we have P =prob ⊕i∈I pi·si.
2. ⟦Q⟧ = ⟦⊕i∈I pi·Qi⟧. Again Lemma 5.14 yields Q =prob ⊕i∈I pi·Qi.
3. si −τ̂→ ⟦Qi⟧ implies Qi ⊑Emay si. Therefore ⊕i∈I pi·Qi ⊑Emay ⊕i∈I pi·si.

Combining (1), (2) and (3) we obtain Q ⊑Emay P.

Finally, the general case, when ⟦P⟧ −τ→* ⟦Q⟧, follows by a simple inductive argument on the length of the derivation.

The proof of (ii) is similar: first we treat the case s −a→ ⟦Q⟧ by structural induction, using (May2); then the case ⟦P⟧ −a→ ⟦Q⟧, exactly as above; and finally we use part (i) to derive the general case. ⊓⊔

The completeness result now follows from the following two propositions.


Proposition 5.8. Let P and Q be in nCSP. Then P ⊑S Q implies P ⊑Emay Q.

Proof. The proof is by structural induction on P and Q, and we may assume that both P and Q are in normal form because of Proposition 5.7. So take P, Q ∈ nCSP and suppose the claim has been established for all subterms P′ of P and Q′ of Q, of which at least one of the two is a strict subterm. We start by proving that if P ∈ sCSP then

P ⊳S ⟦Q⟧ implies P ⊑Emay Q.   (5.29)

There are two cases to consider.

1. P has the form P1 ⊓ P2. Since Pi ⊑Emay P we know Pi ⊑S P ⊑S Q. We use induction to obtain Pi ⊑Emay Q, from which the result follows using (I1).

2. P has the form □i∈I ai.Pi. If I contains two or more elements then P may also be written as ⊓i∈I ai.Pi, using (May0) and (D2), and we may proceed as in case (1) above. If I is empty, that is P is 0, then we can use (May2). So we are left with the possibility that P is a.P′. Thus suppose that a.P′ ⊳S ⟦Q⟧. We proceed by a case analysis on the structure of Q.

• Q is a.Q′. We know from a.P′ ⊳S ⟦a.Q′⟧ that ⟦P′⟧ (⊳S)† Θ for some Θ with ⟦Q′⟧ =τ⇒ Θ, thus P′ ⊑S Q′. Therefore we have P′ ⊑Emay Q′ by induction. It follows that a.P′ ⊑Emay a.Q′.
• Q is □j∈J aj.Qj with at least two elements in J. We use (May0) and then proceed as in the next case.
• Q is Q1 ⊓ Q2. We know from a.P′ ⊳S ⟦Q1 ⊓ Q2⟧ that ⟦P′⟧ (⊳S)† Θ for some Θ such that one of the following two conditions holds:
  a. ⟦Qi⟧ =a⇒ Θ for i = 1 or 2. In this case a.P′ ⊳S ⟦Qi⟧, hence a.P′ ⊑S Qi. By induction we have a.P′ ⊑Emay Qi; then we apply (May1).
  b. ⟦Q1⟧ =a⇒ Θ1 and ⟦Q2⟧ =a⇒ Θ2 such that Θ = p·Θ1 + (1−p)·Θ2 for some p ∈ (0,1). Let Θi = ⟦Q′i⟧ for i = 1,2. By the Derivative lemma, we have a.Q′1 ⊑Emay Q1 and a.Q′2 ⊑Emay Q2. Clearly ⟦Q′1 p⊕ Q′2⟧ = Θ, thus we have P′ ⊑S Q′1 p⊕ Q′2. By induction, we infer that P′ ⊑Emay Q′1 p⊕ Q′2. So

     a.P′ ⊑Emay a.(Q′1 p⊕ Q′2)
          ⊑Emay a.Q′1 p⊕ a.Q′2    (May3)
          ⊑Emay Q1 p⊕ Q2
          ⊑Emay Q1 ⊓ Q2          (May4)

• Q is Q1 p⊕ Q2. We know from a.P′ ⊳S ⟦Q1 p⊕ Q2⟧ that ⟦P′⟧ (⊳S)† Θ for some Θ such that ⟦Q1 p⊕ Q2⟧ =a⇒ Θ. From Lemma 5.4 we know that Θ must take the form p·⟦Q′1⟧ + (1−p)·⟦Q′2⟧, where ⟦Qi⟧ =a⇒ ⟦Q′i⟧ for i = 1,2. Hence P′ ⊑S Q′1 p⊕ Q′2, and by induction we get P′ ⊑Emay Q′1 p⊕ Q′2. Then we can derive a.P′ ⊑Emay Q1 p⊕ Q2 as in the previous case.

Now we use (5.29) to show that P ⊑S Q implies P ⊑Emay Q. Suppose P ⊑S Q. Applying Definition 5.5, with the understanding that any distribution Θ ∈ D(sCSP) can be written as ⟦Q′⟧ for some Q′ ∈ pCSP, this basically means that ⟦P⟧ (⊳S)† ⟦Q′⟧ for


some ⟦Q⟧ =τ⇒ ⟦Q′⟧. The Derivative lemma yields Q′ ⊑Emay Q, so it suffices to show P ⊑Emay Q′. We know that ⟦P⟧ (⊳S)† ⟦Q′⟧ means that

⟦P⟧ = ∑k∈K rk·t̄k,   tk ⊳S ⟦Q′k⟧,   ⟦Q′⟧ = ∑k∈K rk·⟦Q′k⟧

for some K, rk ∈ (0,1], tk ∈ sCSP and Q′k ∈ pCSP. Now

1. ⟦P⟧ = ⟦⊕k∈K rk·tk⟧. By Lemma 5.14 we have P =prob ⊕k∈K rk·tk.
2. ⟦Q′⟧ = ⟦⊕k∈K rk·Q′k⟧. Again Lemma 5.14 yields Q′ =prob ⊕k∈K rk·Q′k.
3. tk ⊳S ⟦Q′k⟧ implies tk ⊑Emay Q′k by (5.29). Therefore ⊕k∈K rk·tk ⊑Emay ⊕k∈K rk·Q′k.

Combining (1), (2) and (3) we obtain P ⊑Emay Q′, hence P ⊑Emay Q. ⊓⊔

Proposition 5.9. Let P and Q be in nCSP. Then P ⊑FS Q implies P ⊑Emust Q.

Proof. Similar to the proof of Proposition 5.8, but using a reversed orientation of the preorders. The only real difference is case (2), which we consider now. So assume Q ⊳FS ⟦P⟧, where Q has the form □i∈I ai.Qi. Let X be any set of actions such that X ∩ {ai}i∈I = ∅; then □i∈I ai.Qi ↛X. Therefore, there exists a P′ such that ⟦P⟧ =τ⇒ ⟦P′⟧ ↛X. By the Derivative lemma,

P ⊑Emust P′   (5.30)

Since □i∈I ai.Qi −ai→ ⟦Qi⟧, there exist Pi, P′i, P″i such that ⟦P⟧ =τ⇒ ⟦Pi⟧ −ai→ ⟦P′i⟧ =τ⇒ ⟦P″i⟧ and ⟦Qi⟧ (⊳FS)† ⟦P″i⟧. Now

P ⊑Emust Pi   (5.31)

using the Derivative lemma, and P′i ⊑FS Qi by Definition 5.5. By induction, we have P′i ⊑Emust Qi, hence

□i∈I ai.P′i ⊑Emust □i∈I ai.Qi   (5.32)

The desired result is now obtained as follows:

P ⊑Emust P′ ⊓ ⊓i∈I Pi       by (I1), (5.30) and (5.31)
  ⊑Emust □i∈I ai.P′i        by (Must2′)
  ⊑Emust □i∈I ai.Qi         by (5.32)   ⊓⊔

Propositions 5.8 and 5.9 give us the completeness result stated in Theorem 5.7.


5.11 Bibliographic notes

In this chapter we have studied three different aspects of may- and must-testing preorders for finite processes: (i) we have shown that the may preorder can be characterised as a co-inductive simulation relation, and the must preorder as a failure simulation relation; (ii) we have given a characterisation of both preorders in a finitary modal logic; and (iii) we have also provided complete axiomatisations for both preorders over a probabilistic version of recursion-free CSP. Although we omitted our parallel operator |A from the axiomatisations, it and similar CSP and CCS-like parallel operators can be handled using standard techniques, in the must case at the expense of introducing auxiliary operators. A generalisation of results (i) and (ii) above to a probabilistic π-calculus, with similar proof techniques of characteristic formulae and characteristic tests, has appeared in [10].

5.11.1 Probabilistic equivalences

Whereas the testing semantics explored in the present chapter is based on the idea that processes should be distinguished only when there is a compelling reason to do so, (strong) bisimulation semantics [26] is based on the idea that processes should be identified only when there is a compelling reason to do so. It has been extended to reactive probabilistic processes in [23], to generative ones in [13], and to processes combining nondeterminism and probability in [15]. The latter paper also features a complete axiomatisation of a probabilistic extension of recursion-free CCS.

Weak and branching bisimulation [26, 14] are versions of strong bisimulation that respect the hidden nature of the internal action τ. Generalisations of these notions to nondeterministic probabilistic processes appear, amongst others, in [34, 32, 31, 1, 4, 8, 3], with complete axiomatisations reported in [4, 8, 9, 2]. The authors of these papers tend to distinguish whether they work in an alternating [31, 1, 3, 2] or a non-alternating model of probabilistic processes [34, 32, 8, 9], the two approaches being compared in [4]. The non-alternating model stems from [34] and is similar to our model of Section 5.2.2. The alternating model is attributed to [15], and resembles our graphical representation of processes in Section 5.2.4. It is easy to see that mathematically the alternating and non-alternating models can be translated into each other without loss of information [4]. The difference between the two is one of interpretation. In the alternating interpretation, the intermediate (probabilistic) nodes in our graphical representations are interpreted as actual states a process can be in, whereas in the non-alternating representation they are not. Take for example the process R1 = a.(b ½⊕ c) depicted in Figure 5.5. In the alternating representation this process passes through a state in which a has already happened, but the probabilistic choice between b and c has not yet been made. In the non-alternating interpretation, on the other hand, the execution of a is what constitutes this probabilistic choice; after doing a there is a fifty-fifty chance of ending up in either state. Although in


strong bisimulation semantics the alternating and non-alternating interpretations lead to the same semantic equivalence, in weak and branching bisimulation semantics the resulting equivalences are different, as illustrated in [31, 4, 3]. Our testing and simulation preorders as presented here can be classified as non-alternating; however, we believe that an alternating approach would lead to the very same preorders.

Early additions of probability to CSP include work by Lowe [24], Seidel [35] and Morgan et al. [28]; but all of them are forced to make compromises of some kind in order to address the potentially complicated interactions between the three forms of choice. The last [28], for example, applies the Jones/Plotkin probabilistic powerdomain [18] directly to the failures model of CSP [5], the resulting compromise being that probability distributes outwards through all other operators; one controversial result of that is that internal choice is no longer idempotent, and that it is "clairvoyant" in the sense that it can adapt to probabilistic-choice outcomes that have not yet occurred. Mislove addresses this problem in [27] by presenting a denotational model in which internal choice distributes outwards through probabilistic choice. However, the distributivities of both [28] and [27] constitute identifications that cannot be justified by our testing approach; see [7].

In Jou and Smolka [22], as in [24, 35], probabilistic equivalences based on traces, failures and readies are defined. These equivalences are coarser than ≃pmay. For example, let

P := a.((b.d □ c.e) ½⊕ (b.f □ c.g))
Q := a.((b.d □ c.g) ½⊕ (b.f □ c.e)).

The two processes cannot be distinguished by the equivalences of [22, 24, 35]. However, we can tell them apart by the test

T := a.((b.d.ω ½⊕ c.e.ω) ⊓ (b.f.ω ½⊕ c.g.ω)),

since A(T, P) = {0, ½, 1} and A(T, Q) = {½}; that is, P ̸⊑pmay Q.

5.11.2 Probabilistic simulations

Four different notions of simulation for probabilistic processes occur in the literature, each a generalisation of the well-known concept of simulation for nondeterministic processes [30]. The most straightforward generalisation [19] is defined as in Definition 3.5. This simulation induces a preorder strictly finer than ⊑S and ⊑pmay. For example, it does not satisfy the law

a.(P p⊕ Q) ⊑ a.P □ a.Q

that holds in probabilistic may-testing semantics. The reason is that the process a.P □ a.Q can answer the initial a-move of a.(P p⊕ Q) by taking either the a-move


to P, or the a-move to Q, but not by a probabilistic combination of the two. Such probabilistic combinations are allowed in the probabilistic simulation of [34], which induces a coarser preorder on processes, satisfying the above law. In our terminology it can be defined by changing the requirement above into:

if s R t and s −α→ Θ then there is a ∆′ with t −α→ ∆′ and Θ R† ∆′.

A weak version of this probabilistic simulation, abstracting from the internal action τ, weakens this requirement into:

if s R t and s −α→ Θ then there is a ∆′ with t =α⇒ ∆′ and Θ R† ∆′.

Nevertheless, this probabilistic simulation also fails to satisfy all the laws we have shown to hold for probabilistic may testing. In particular, it does not satisfy the law (May3). Consider for instance the processes R1 = a.b.c.(d ½⊕ e) and R2 = a.(b.c.d ½⊕ b.c.e). The law (May3), which holds for probabilistic may testing, would yield R1 ⊑ R2. If we are to relate these processes via a probabilistic simulation à la [34], the state c.(d ½⊕ e) of R1, reachable after an a- and a b-step, needs to be related to the distribution (c.d ½⊕ c.e) of R2, containing the two states c.d and c.e. This relation cannot be obtained through lifting, as this would entail relating the single state c.(d ½⊕ e) to each of the states c.d and c.e. Such a relation would not be sound, because c.(d ½⊕ e) is able to perform the sequence of actions c e half of the time, whereas the process c.d cannot mimic this.

In [20], another notion of simulation is proposed, whose definition is too complicated to explain in a few sentences. The authors show, for a class of probabilistic processes that do not contain τ-actions, that probabilistic may testing is captured exactly by their notion of simulation. Nevertheless, their notion of simulation makes strictly more identifications than ours. As an example, let us consider the processes R1 = a ½⊕ (b □ c) and R3 = (a □ b) ½⊕ (a □ c) of Example 5.11, which also appear in Section 5 of [20]. There it is shown that R1 ⊑ R3 holds in their semantics. However, in our framework we have R1 ̸⊑pmay R3, as demonstrated in Example 5.11. The difference can only be explained by the circumstance that in [20] processes, and hence also tests, may not have internal actions. So this example shows that tests with internal moves can distinguish more processes than tests without internal moves, even when applied to processes that have no internal moves themselves.

Our notion of forward simulation first appears in [32], although the preorder ⊑S of Definition 5.5 is new. Segala has no expressions that denote distributions and consequently is only interested in the restriction of the simulation preorder to states (automata in his framework). It turns out that for states s and t (or expressions in the set sCSP in our framework) we have s ⊑S t iff s ⊳S t̄, so on their common domain of definition the simulation preorder of [32] agrees with ours. This notion of simulation is strictly more discriminating than the simulation of [20], and strictly less discriminating than the ones from [34] and [19].


References

1. Andova, S., Baeten, J.C.: Abstraction in probabilistic process algebra. In: Proceedings of the 7th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, Lecture Notes in Computer Science, vol. 2031, pp. 204–219. Springer (2001)
2. Andova, S., Baeten, J.C., Willemse, T.A.: A complete axiomatisation of branching bisimulation for probabilistic systems with an application in protocol verification. In: Proceedings of the 17th International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 4137, pp. 327–342. Springer (2006)
3. Andova, S., Willemse, T.A.: Branching bisimulation for probabilistic systems: Characteristics and decidability. Theoretical Computer Science 356(3), 325–355 (2006)
4. Bandini, E., Segala, R.: Axiomatizations for probabilistic bisimulation. In: Proceedings of the 28th International Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science, vol. 2076, pp. 370–381. Springer (2001)
5. Brookes, S., Hoare, C., Roscoe, A.: A theory of communicating sequential processes. Journal of the ACM 31(3), 560–599 (1984)
6. Deng, Y., van Glabbeek, R., Hennessy, M., Morgan, C.C., Zhang, C.: Characterising testing preorders for finite probabilistic processes. In: Proceedings of the 22nd Annual IEEE Symposium on Logic in Computer Science, pp. 313–325. IEEE Computer Society (2007)
7. Deng, Y., van Glabbeek, R., Hennessy, M., Morgan, C.C., Zhang, C.: Remarks on testing probabilistic processes. Electronic Notes in Theoretical Computer Science 172, 359–397 (2007)
8. Deng, Y., Palamidessi, C.: Axiomatizations for probabilistic finite-state behaviors. In: Proceedings of the 8th International Conference on Foundations of Software Science and Computation Structures, Lecture Notes in Computer Science, vol. 3441, pp. 110–124. Springer (2005)
9. Deng, Y., Palamidessi, C., Pang, J.: Compositional reasoning for probabilistic finite-state behaviors. In: Processes, Terms and Cycles: Steps on the Road to Infinity, Essays Dedicated to Jan Willem Klop, on the Occasion of His 60th Birthday, Lecture Notes in Computer Science, vol. 3838, pp. 309–337. Springer (2005)
10. Deng, Y., Tiu, A.: Characterisations of testing preorders for a finite probabilistic π-calculus. Formal Aspects of Computing 24(4-6), 701–726 (2012)
11. De Nicola, R., Hennessy, M.: Testing equivalences for processes. Theoretical Computer Science 34, 83–133 (1984)
12. van Glabbeek, R.: The linear time – branching time spectrum II; the semantics of sequential systems with silent moves. In: Proceedings of the 4th International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 715, pp. 66–81. Springer (1993)
13. van Glabbeek, R., Smolka, S.A., Steffen, B., Tofts, C.: Reactive, generative, and stratified models of probabilistic processes. In: Proceedings of the 5th Annual IEEE Symposium on Logic in Computer Science, pp. 130–141. Computer Society Press (1990)
14. van Glabbeek, R., Weijland, W.: Branching time and abstraction in bisimulation semantics. Journal of the ACM 43(3), 555–600 (1996)
15. Hansson, H., Jonsson, B.: A calculus for communicating systems with time and probabilities. In: Proceedings of the IEEE Real-Time Systems Symposium, pp. 278–287. IEEE Computer Society Press (1990)
16. Hennessy, M.: An Algebraic Theory of Processes. The MIT Press (1988)
17. Hoare, C.A.R.: Communicating Sequential Processes. Prentice Hall (1985)
18. Jones, C., Plotkin, G.: A probabilistic powerdomain of evaluations. In: Proceedings of the 4th Annual IEEE Symposium on Logic in Computer Science, pp. 186–195. Computer Society Press (1989)
19. Jonsson, B., Larsen, K.G.: Specification and refinement of probabilistic processes. In: Proceedings of the 6th Annual IEEE Symposium on Logic in Computer Science, pp. 266–277. Computer Society Press (1991)


20. Jonsson, B., Yi, W.: Testing preorders for probabilistic processes can be characterized by simulations. Theoretical Computer Science 282(1), 33–51 (2002)
21. Jonsson, B., Yi, W., Larsen, K.G.: Probabilistic extensions of process algebras. In: Handbook of Process Algebra, chap. 11, pp. 685–710. Elsevier (2001)
22. Jou, C.C., Smolka, S.A.: Equivalences, congruences, and complete axiomatizations for probabilistic processes. In: Proceedings of the 1st International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 458, pp. 367–383. Springer (1990)
23. Larsen, K.G., Skou, A.: Bisimulation through probabilistic testing. Information and Computation 94(1), 1–28 (1991)
24. Lowe, G.: Representing nondeterminism and probabilistic behaviour in reactive processes. Tech. Rep. TR-11-93, Computing Laboratory, Oxford University (1993)
25. Lynch, N., Segala, R., Vaandrager, F.W.: Observing branching structure through probabilistic contexts. SIAM Journal on Computing 37(4), 977–1013 (2007)
26. Milner, R.: Communication and Concurrency. Prentice Hall (1989)
27. Mislove, M.W.: Nondeterminism and probabilistic choice: Obeying the laws. In: Proceedings of the 11th International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 1877, pp. 350–364. Springer (2000)
28. Morgan, C.C., McIver, A.K., Seidel, K., Sanders, J.: Refinement-oriented probability for CSP. Formal Aspects of Computing 8(6), 617–647 (1996)
29. Olderog, E.R., Hoare, C.: Specification-oriented semantics for communicating processes. Acta Informatica 23, 9–66 (1986)
30. Park, D.: Concurrency and automata on infinite sequences. In: Proceedings of the 5th GI Conference, Lecture Notes in Computer Science, vol. 104, pp. 167–183. Springer (1981)
31. Philippou, A., Lee, I., Sokolsky, O.: Weak bisimulation for probabilistic systems. In: Proceedings of the 11th International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 1877, pp. 334–349. Springer (2000)
32. Segala, R.: Modeling and verification of randomized distributed real-time systems. Tech. Rep. MIT/LCS/TR-676, PhD thesis, MIT, Dept. of EECS (1995)
33. Segala, R.: Testing probabilistic automata. In: Proceedings of the 7th International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 1119, pp. 299–314. Springer (1996)
34. Segala, R., Lynch, N.: Probabilistic simulations for probabilistic processes. In: Proceedings of the 5th International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 836, pp. 481–496. Springer (1994)
35. Seidel, K.: Probabilistic communicating processes. Theoretical Computer Science 152(2), 219–249 (1995)
36. Yi, W., Larsen, K.G.: Testing probabilistic and nondeterministic processes. In: Proceedings of the IFIP TC6/WG6.1 12th International Symposium on Protocol Specification, Testing and Verification, IFIP Transactions, vol. C-8, pp. 47–61. North-Holland (1992)


Chapter 6
Testing Finitary Probabilistic Processes

Abstract In this chapter we extend the results of Chapter 5 from finite to finitary processes, i.e. finite-state and finitely branching processes. Testing preorders can still be characterised as simulation preorders and admit modal characterisations, but proving these results demands more advanced techniques. A new notion of weak derivation is introduced and some of its fundamental properties are established. Of particular importance is the finite generability property of the set of derivatives of any distribution, which enables us to approximate co-inductive simulation relations by stratified inductive relations. This opens the way to characterising the behaviour of finitary processes by a finite modal logic. Therefore, if two processes are related it suffices to construct a simulation relation; otherwise a finite test can be constructed to tell them apart.

We also introduce a notion of real-reward testing that allows for negative rewards. Interestingly, for finitary convergent processes, the real-reward must-testing preorder coincides with the nonnegative-reward testing preorder.

Keywords: Finitary processes; Testing preorder; Simulation preorder; Modal logic; Weak derivation; Finite generability; Real-reward testing

6.1 Introduction

In order to describe the infinite behaviour of processes, we extend the language pCSP to a version rpCSP with recursive process descriptions: we add a construct rec x.P for recursion, and extend the intensional semantics of Figure 5.1 in a straightforward manner. We restrict ourselves to finitary rpCSP processes, having finitely many states and displaying finite branching.

The simulation relations ⊑S and ⊑FS given in Section 5.5.1 were defined in terms of weak transitions =τ⇒ between distributions, obtained as the transitive closure of a relation −τ→ between distributions that allows one part of a distribution to make a τ-move while the other part remains in place.



[Figure: two pLTS diagrams, (a) for Q1 and (b) for Q2, each with τ-loops of probability ½ and a-transitions; the drawing itself is omitted in this text-only version.]

Fig. 6.1 The pLTS's of processes Q1 and Q2. Reprinted from [1], with kind permission from Springer Science+Business Media

equate for processes that can do an unbounded number ofτ-steps. The problemis highlighted by the processQ1 = recx.(τ.x 1

2⊕ a.0) illustrated in Figure 6.1(a).

ProcessQ1 is indistinguishable, using tests, from the simple processa.0: we haveQ1 ≃pmay a.0 andQ1 ≃pmusta.0. This is because the processQ1 will eventuallyperform the actiona with probability 1. However, the action[a.0 a−→ [0 cannotbe simulated by a corresponding move[Q1 τ=⇒ a−→. No matter which distribution∆ we obtain from executing a finite sequence of internal moves[Q1 τ=⇒ ∆ , stillpart of it is unable to subsequently perform the actiona.

To address this problem we propose a new relation ∆ =⇒ Θ, to indicate that Θ can be derived from ∆ by performing an unbounded sequence of internal moves; we call Θ a weak derivative of ∆. For example ⟦a.0⟧ will turn out to be a weak derivative of ⟦Q1⟧, i.e. ⟦Q1⟧ =⇒ ⟦a.0⟧, via the infinite sequence of internal moves

⟦Q1⟧ τ−→ ⟦Q1 1/2⊕ a.0⟧ τ−→ ⟦Q1 1/2²⊕ a.0⟧ τ−→ ... τ−→ ⟦Q1 1/2ⁿ⊕ a.0⟧ τ−→ ...

Here we make significant use of "subdistributions" that sum to no more than one [7, 10]. For example, the empty subdistribution ε elegantly represents the chaotic behaviour of processes that in CSP and in must-testing semantics is tantamount to divergence, because we have ε α−→ ε for any action α, and a process like rec x.x that diverges via an infinite τ path gives rise to the weak transition rec x.x =⇒ ε. So the process Q2 = Q1 1/2⊕ rec x.x illustrated in Figure 6.1(b) will enable the weak transition ⟦Q2⟧ =⇒ 1/2·⟦a.0⟧, where intuitively the latter is a proper subdistribution mapping the state a.0 to the probability 1/2. Our weak transition relation =⇒ can be regarded as an extension of the weak hyper-transition from [9] to partial distributions; the latter, although defined in a very different way, can be represented in terms of ours by requiring weak derivatives to be total distributions.

We end this introduction with a brief glimpse at our proof strategy. In Chapter 5 the characterisations for finite rpCSP processes were obtained using a probabilistic extension of the Hennessy-Milner logic. Moving to recursive processes, we know that process behaviour can be captured by a finite modal logic only if the underlying LTS is finitely branching, or at least image-finite [11]. Thus to take advantage of a finite probabilistic Hennessy-Milner logic we need a property of pLTS's corresponding to finite branching in LTS's; this is topological compactness, whose relevance we now sketch.

Subdistributions over (derivatives of) finitary rpCSP processes inherit the standard (complete) Euclidean metric. One of our key results is that

Theorem 6.1. For every finitary rpCSP process P, the set {∆ | ⟦P⟧ =⇒ ∆} is convex and compact.

Indeed, using techniques from the theory of Markov decision processes [12] we can show that the potentially uncountable set {∆ | ⟦P⟧ =⇒ ∆} is nevertheless the convex closure of a finite set of subdistributions, from which Theorem 6.1 follows.

This key result allows an inductive characterisation of the simulation preorders ⊑S and ⊑FS, here defined using our novel weak derivation relation =⇒.¹ We first construct a sequence of approximations ⊑kS for k ≥ 0 and, using Theorem 6.1, we prove

Theorem 6.2. For every finitary rpCSP process P, and for every k ∈ N, the set {∆ | ⟦P⟧ ⊑kS ∆} is convex and compact.

This in turn enables us to use the Finite Intersection Property of compact sets to prove

Theorem 6.3. For finitary rpCSP processes we have P ⊑S Q iff P ⊑kS Q for all k ≥ 0.

Our main characterisation results can then be obtained by extending the probabilistic modal logic used in Section 5.6, so that for example

• it characterises ⊑kS for every k ≥ 0, and therefore it also characterises ⊑S;
• every probabilistic modal formula can be mimicked by a may-test.

¹ When restricted to finite processes, the new definitions of simulation and failure-simulation preorders degenerate into the preorders in Section 5.5.1. So the extension is conservative and justifies our use of the same symbols ⊑S and ⊑FS in this chapter.

Similar results accrue for must testing: details are given in Section 6.7.

6.2 The language rpCSP

Let Act be a set of visible actions that a process can perform, and let Var be an infinite set of variables. The language rpCSP of probabilistic CSP processes is given by the following two-sorted syntax, in which p ∈ [0,1], a ∈ Act and A ⊆ Act:

P ::= S | P p⊕ P
S ::= 0 | x ∈ Var | a.P | P ⊓ P | S □ S | S |A S | rec x.P

This is essentially the finite language pCSP given in Section 5.2 plus the recursive construct rec x.P, in which x is a variable and P a term. The notions of free and bound variables are standard; by Q[x ↦ P] we indicate substitution of term P for variable x in Q, with renaming if necessary. We write rpCSP for the set of closed P-terms defined by this grammar, and still use sCSP for its state-based subset of closed S-terms.

The operational semantics of rpCSP is defined in terms of a particular pLTS 〈sCSP, Actτ, →〉, in which sCSP is the set of states and Actτ is the set of transition labels. We interpret rpCSP processes P as distributions ⟦P⟧ ∈ D(sCSP), as we did for pCSP processes in Section 5.2.2.

The transition relation → is defined in Figure 6.2, where A ranges over subsets of Act, and actions a, α are elements of Act, Actτ respectively. This is a slight extension of the rules in Figure 5.1 for finite processes: one new rule is required to interpret recursive processes. The process rec x.P performs an internal action when unfolding. As our testing semantics will abstract from internal actions, these τ-steps are harmless and merely simplify the semantics.

a.P a−→ ⟦P⟧        rec x.P τ−→ ⟦P[x ↦ rec x.P]⟧        P ⊓ Q τ−→ ⟦P⟧        P ⊓ Q τ−→ ⟦Q⟧

s1 a−→ ∆ implies s1 □ s2 a−→ ∆            s2 a−→ ∆ implies s1 □ s2 a−→ ∆

s1 τ−→ ∆ implies s1 □ s2 τ−→ ∆ □ s2       s2 τ−→ ∆ implies s1 □ s2 τ−→ s1 □ ∆

s1 α−→ ∆ and α ∉ A imply s1 |A s2 α−→ ∆ |A s2
s2 α−→ ∆ and α ∉ A imply s1 |A s2 α−→ s1 |A ∆
s1 a−→ ∆1, s2 a−→ ∆2 and a ∈ A imply s1 |A s2 τ−→ ∆1 |A ∆2

Fig. 6.2 Operational semantics of rpCSP

We graphically depict the operational semantics of an rpCSP expression P by drawing the part of the pLTS reachable from ⟦P⟧ as a directed graph with states represented by filled nodes • and distributions by open nodes ◦, as described in Section 3.2.

Note that for each P ∈ rpCSP the distribution ⟦P⟧ has finite support. Moreover, our pLTS is finitely branching in the sense that for each state s ∈ sCSP there are only finitely many pairs (α,∆) ∈ Actτ × D(sCSP) with s α−→ ∆. In spite of ⟦P⟧'s finite support, and the finite branching of our pLTS, it is possible for there to be infinitely many states reachable from ⟦P⟧; when there are only finitely many, then P is said to be finitary.

Definition 6.1. A subdistribution ∆ ∈ Dsub(S) in a pLTS 〈S,L,→〉 is finitary if only finitely many states are reachable from ∆; an rpCSP expression P is finitary if ⟦P⟧ is.

6.3 A general definition of weak derivations

In this section we develop a new definition of what it means for a recursive process to evolve by silent activity into another process; it allows the simulation and failure-simulation preorders given in Definition 5.5 to be adapted to characterise the testing preorders for at least finitary probabilistic processes.

Recall for example the process Q1 depicted in Figure 6.1(a). It turns out that in our testing framework this process is indistinguishable from a.0: both processes can do nothing else than an a-action, possibly after some internal moves, and in both cases the probability that the process will never do the a-action is 0. In Section 5.5, where we did not deal with recursive processes like Q1, we defined a weak transition relation a=⇒ in such a way that P a=⇒ iff there is a finite number of τ-moves after which the entire distribution ⟦P⟧ will have done an a-action. Lifting this definition verbatim to a setting with recursion would create a difference between a.0 and Q1, for only the former admits such a weak transition a=⇒. The purpose of this section is to propose a new definition of weak transitions, with which we can capture the intuition that the process Q1 can perform the action a with probability 1, provided it is allowed to run for an unbounded amount of time.

We construct our generalised definition of weak move by revising what it means for a probabilistic process to execute an indefinite sequence of (internal) τ moves. The key technical innovation is to change the focus from distributions to subdistributions, which enable us to express divergence very conveniently.

First some relatively standard terminology. For any subset X of Dsub(S), with S a set, recall that ⇃X stands for the convex closure of X. As the smallest convex set containing X, it satisfies that ∆ ∈ ⇃X if and only if ∆ = ∑i∈I pi·∆i, where ∆i ∈ X and pi ∈ [0,1], for some index set I such that ∑i∈I pi = 1.

In case S is a finite set, it makes no difference whether we restrict I to being finite or not; in fact, index sets of size 2 will suffice. However, in general they do not:

Example 6.1. Let S = {si | i ∈ N}. Then ⇃{si | i ∈ N} consists of all total distributions whose support is included in S. However, with a definition of convex closure that requires only binary interpolations of distributions to be included, ⇃{si | i ∈ N} would merely consist of all such distributions with finite support.

Convex closure is a closure operator in the standard sense, in that it satisfies

• X ⊆ ⇃X
• X ⊆ Y implies ⇃X ⊆ ⇃Y
• ⇃⇃X = ⇃X.

We say a binary relation R ⊆ Y × Dsub(S) is convex whenever the set {∆ | y R ∆} is convex for every y in Y, and let ⇃R denote the smallest convex relation containing R.
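For finite X over a finite state space, membership in the convex closure ⇃X amounts to a small linear-programming feasibility question. The Python sketch below is our own illustration, not the book's; it assumes NumPy and SciPy are available, and the helper name in_convex_closure is ours.

# Sketch: decide membership in the convex closure of a finite set X of
# subdistributions (as vectors), by checking feasibility of weights
# p_i >= 0 with sum(p_i) = 1 and sum_i p_i * X_i = Delta.
import numpy as np
from scipy.optimize import linprog

def in_convex_closure(delta, X):
    """delta: length-n vector; X: list of length-n vectors."""
    A = np.array(X, dtype=float).T               # n x |X|: columns are the X_i
    A_eq = np.vstack([A, np.ones(len(X))])       # extra row enforces sum p = 1
    b_eq = np.append(np.asarray(delta, float), 1.0)
    res = linprog(c=np.zeros(len(X)), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.success

# The midpoint of two subdistributions lies in their convex closure:
print(in_convex_closure([0.5, 0.25], [[1.0, 0.0], [0.0, 0.5]]))  # True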

6.3.1 Lifting relations

In a pLTS actions are only performed by states, in that actions are given by relations from states to distributions. But rpCSP processes in general correspond to distributions over states, so in order to define what it means for a process to perform an action, we need to lift these relations so that they also apply to distributions. In fact we will find it convenient to lift them to subdistributions, by a straightforward generalisation of Definition 5.3.

Definition 6.2. Let 〈S,L,→〉 be a pLTS and R ⊆ S × Dsub(S) be a relation from states to subdistributions. Then R† ⊆ Dsub(S) × Dsub(S) is the smallest relation that satisfies:

(1) s R Θ implies s̄ R† Θ, and
(2) (Linearity) ∆i R† Θi for all i ∈ I implies (∑i∈I pi·∆i) R† (∑i∈I pi·Θi), where I is a finite index set and ∑i∈I pi ≤ 1.

Remark 6.1. By construction R† is convex. Moreover, because s (⇃R) Θ implies s̄ R† Θ we have R† = (⇃R)†, which means that when considering a lifted relation we can without loss of generality assume the original relation to have been convex. In fact when R is indeed convex, we have that s̄ R† Θ and s R Θ are equivalent.

An application of this notion is when the relation is α−→ for α ∈ Actτ; in that case we also write α−→ for (α−→)†. Thus, as source of a relation α−→ we now also allow distributions, and even subdistributions. A subtlety of this approach is that for any action α, we have

ε α−→ ε    (6.1)

simply by taking I = ∅ or ∑i∈I pi = 0 in Definition 6.2. That will turn out to make ε especially useful for modelling the "chaotic" aspects of divergence, in particular that in the must-case a divergent process can simulate any other.

We have the following variant of Lemma 5.3.

Lemma 6.1. ∆ R† Θ if and only if there is a collection of states {si}i∈I, a collection of subdistributions {Θi}i∈I, and a collection of probabilities {pi}i∈I, for some finite index set I, such that ∑i∈I pi ≤ 1 and ∆, Θ can be decomposed as follows:

1. ∆ = ∑i∈I pi·s̄i
2. Θ = ∑i∈I pi·Θi
3. For each i ∈ I we have si R Θi. ⊓⊔
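Lemma 6.1 is effectively a recipe for computing lifted pairs. The following Python sketch (encoding and helper names ours, not the book's) represents subdistributions as dictionaries from states to masses and builds a pair ∆ R† Θ from a finite decomposition, exactly as in clauses 1-3 of the lemma.

def weighted_sum(pairs):
    """pairs: list of (p_i, subdistribution); returns sum_i p_i * Theta_i."""
    out = {}
    for p, theta in pairs:
        for s, m in theta.items():
            out[s] = out.get(s, 0.0) + p * m
    return out

def lift(decomposition):
    """decomposition: list of (p_i, s_i, Theta_i) with s_i R Theta_i assumed.
    Returns the lifted pair (Delta, Theta)."""
    delta = weighted_sum([(p, {s: 1.0}) for p, s, _ in decomposition])
    theta = weighted_sum([(p, th) for p, _, th in decomposition])
    return delta, theta

# One unit of mass on state 'u', split 50/50 over two R-successors of 'u':
print(lift([(0.5, 'u', {'a': 1.0}), (0.5, 'u', {'b': 0.5})]))
# ({'u': 1.0}, {'a': 0.5, 'b': 0.25})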

A simple but important property of this lifting operation is the following:

Lemma 6.2. Suppose ∆ R† Θ, where R is any relation in S × Dsub(S). Then

(i) |∆| ≥ |Θ|.
(ii) If R is a relation in S × D(S) then |∆| = |Θ|.

Proof. This follows immediately from the characterisation in Lemma 6.1. ⊓⊔

So for example if ε R† Θ then 0 = |ε| ≥ |Θ|, whence Θ is also ε.

Remark 6.2. From Lemma 6.1 it also follows that lifting enjoys the following two properties:

(i) (Scaling) If ∆ R† Θ, p ∈ R≥0 and |p·∆| ≤ 1 then p·∆ R† p·Θ.
(ii) (Additivity) If ∆i R† Θi for i ∈ I and |∑i∈I ∆i| ≤ 1 then (∑i∈I ∆i) R† (∑i∈I Θi), where I is a finite index set.

In fact, we could have presented Definition 6.2 using scaling and additivity instead of linearity.

The lifting operation has yet another characterisation, this time in terms of choice functions.

Definition 6.3. Let R ⊆ S × Dsub(S) be a binary relation from states to subdistributions. Then f : S → Dsub(S) is a choice function for R if s R f(s) for every s ∈ dom(R). We write Ch(R) for the set of all choice functions of R.

Note that if f is a choice function of R then f behaves properly at each state s in the domain of R, but for each state s outside the domain of R, the value f(s) can be arbitrarily chosen.

Proposition 6.1. Suppose R ⊆ S × Dsub(S) is a convex relation. Then for any subdistribution ∆ ∈ Dsub(S), ∆ R† Θ if and only if there is some choice function f ∈ Ch(R) such that Θ = Exp∆(f).

Proof. First suppose Θ = Exp∆(f) for some choice function f ∈ Ch(R), that is Θ = ∑s∈⌈∆⌉ ∆(s)·f(s). It now follows from Lemma 6.1 that ∆ R† Θ since s R f(s) for each s ∈ dom(R).

Conversely suppose ∆ R† Θ; we have to find a choice function f ∈ Ch(R) such that Θ = Exp∆(f). Applying Lemma 6.1 we know that

(i) ∆ = ∑i∈I pi·s̄i, for some index set I, with ∑i∈I pi ≤ 1
(ii) Θ = ∑i∈I pi·Θi for some Θi satisfying si R Θi.

Now let us define the function f : S → Dsub(S) as follows:

• if s ∈ ⌈∆⌉ then f(s) = ∑{i∈I | si=s} (pi/∆(s))·Θi;
• if s ∈ dom(R)\⌈∆⌉ then f(s) = Θ′ for any Θ′ with s R Θ′;
• otherwise, f(s) = ε, where ε is the empty subdistribution.

Note that if s ∈ ⌈∆⌉ then ∆(s) = ∑{i∈I | si=s} pi and therefore by convexity s R f(s); so f is a choice function for R as s R f(s) for each s ∈ dom(R). Moreover, a simple calculation shows that Exp∆(f) = ∑i∈I pi·Θi, which by (ii) above is Θ. ⊓⊔
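The expectation Exp∆(f) used above is just a weighted sum, as the following small sketch shows (encoding ours; f is any choice function, supplied here as a dictionary lookup).

def expectation(delta, f):
    """Exp_Delta(f) = sum over s in the support of Delta of Delta(s) * f(s)."""
    out = {}
    for s, mass in delta.items():
        for t, m in f(s).items():
            out[t] = out.get(t, 0.0) + mass * m
    return out

# A two-state Delta whose choice function picks one successor per state:
delta = {'s1': 0.5, 's2': 0.5}
f = {'s1': {'t': 1.0}, 's2': {'t': 0.5, 'u': 0.5}}.get
print(expectation(delta, f))  # {'t': 0.75, 'u': 0.25}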

An important further property is the following:

Proposition 6.2 (Left-decomposable). If (∑i∈I pi·∆i) R† Θ then Θ = ∑i∈I pi·Θi for some subdistributions Θi such that ∆i R† Θi for each i ∈ I.

Proof. It is possible to adapt the proof of Proposition 3.3. But here we provide another proof that takes advantage of choice functions.

Let ∆ R† Θ where ∆ = ∑i∈I pi·∆i. By Proposition 6.1, using that R† = (⇃R)†, there is a choice function f ∈ Ch(⇃R) such that Θ = Exp∆(f). Take Θi := Exp∆i(f) for i ∈ I. Using that ⌈∆i⌉ ⊆ ⌈∆⌉, Proposition 6.1 yields ∆i R† Θi for i ∈ I. Finally,

∑i∈I pi·Θi = ∑i∈I pi·∑s∈⌈∆i⌉ ∆i(s)·f(s)
           = ∑s∈⌈∆⌉ ∑i∈I pi·∆i(s)·f(s)
           = ∑s∈⌈∆⌉ ∆(s)·f(s)
           = Exp∆(f)
           = Θ. ⊓⊔

The converse to the above is not true in general: from ∆ R† (∑i∈I pi·Θi) it does not follow that ∆ can correspondingly be decomposed. For example, we have a.(b 1/2⊕ c) a−→ 1/2·b̄ + 1/2·c̄, yet a.(b 1/2⊕ c) cannot be written as 1/2·∆1 + 1/2·∆2 such that ∆1 a−→ b̄ and ∆2 a−→ c̄.

In fact a simplified form of Proposition 6.2 holds for unlifted relations, provided they are convex:

Corollary 6.1. If (∑i∈I pi·s̄i) R† Θ and R is convex, then Θ = ∑i∈I pi·Θi for subdistributions Θi with si R Θi for i ∈ I.

Proof. Take ∆i to be s̄i in Proposition 6.2, whence Θ = ∑i∈I pi·Θi for some subdistributions Θi such that s̄i R† Θi for i ∈ I. Because R is convex, we then have si R Θi from Remark 6.1. ⊓⊔

As we have seen in Proposition 3.4, the lifting operation is monotone, that is R1 ⊆ R2 implies R1† ⊆ R2†, and satisfies the following monadic property with respect to composition.

Lemma 6.3. Let R1, R2 ⊆ S × Dsub(S). Then the forward relational composition R1†·R2† is equal to the lifted composition (R1·R2†)†.

Proof. Suppose ∆ R1†·R2† Φ. Then there is some Θ such that ∆ R1† Θ R2† Φ. By Lemma 6.1 we have the decomposition ∆ = ∑i∈I pi·s̄i and Θ = ∑i∈I pi·Θi with si R1 Θi for each i ∈ I. By Proposition 6.2 we obtain Φ = ∑i∈I pi·Φi and for each i ∈ I we have Θi R2† Φi. It follows that si R1·R2† Φi, and thus ∆ (R1·R2†)† Φ.

So we have shown that R1†·R2† ⊆ (R1·R2†)†. The other direction can be proved similarly. ⊓⊔

Notice that in the second part of Definition 6.2 we have required the index set I to be finite. In fact, this constraint is unnecessary if the state space S is finite. Then we say that the lifting operation has the property of infinite linearity. To show the property we need two technical lemmas.

Let us write ⇃ωX for the set of subdistributions of the form ∑i≥0 pi·∆i, where ∆i ∈ X and ∑i≥0 pi = 1.

Lemma 6.4. If the set S is finite then ⇃X = ⇃ωX for any subset X of Dsub(S).

Proof. It is clear that ⇃X ⊆ ⇃ωX, so we prove the inverse inclusion, ⇃ωX ⊆ ⇃X. The basic idea is to view a subdistribution over S as a point in Euclidean space of dimension |S| and give a geometric proof, by induction on the size of S. More specifically we prove, by induction on k, that if X is a subset in a space of dimension k, then ⇃X = ⇃ωX. The base case, when |S| = 1, is trivial. Let us consider the inductive case, where the dimension is k+1.

Suppose there is a point x ∈ ⇃ωX but x ∉ ⇃X. Then by the Separation theorem (cf. Theorem 2.7) there exists a hyperplane H that separates x from ⇃X. If h is the normal of H we can assume without loss of generality that there is a constant c satisfying

h·x ≥ c  and  h·x′ ≤ c for all x′ ∈ X,

where with a slight abuse of notation we write · for the dot product of two vectors of dimension k+1. Note that the separation here may not be strict, because ⇃X is convex but not necessarily Cauchy closed.

Since x ∈ ⇃ωX, there is a sequence of probabilities pi with ∑i≥0 pi = 1 and a sequence of points xi ∈ X such that x = ∑i≥0 pi·xi. We then have

(i) c ≤ h·x = ∑i≥0 pi·(h·xi)
(ii) h·xi ≤ c for all i ≥ 0.

It follows from (i) and (ii) that actually h·xi = c for all i ≥ 0, which means that all the points xi lie in H; in other words, the separation of x from ⇃X cannot be strict. Therefore we have x ∈ ⇃ω(X∩H), since ⇃ω{xi | i ≥ 0} ⊆ ⇃ω(X∩H).

On the other hand, since x ∉ ⇃X we have x ∉ ⇃(X∩H). However X∩H can be described as a subset of a space of one dimension lower than X, that is of dimension k. We have now contradicted the induction hypothesis. ⊓⊔

In order to use the above lemma, we need to rephrase the lifting operation in terms of the closure operator ⇃(·). To this end let us use R(s) to denote the set {∆ ∈ Dsub(S) | s R ∆}, for any R ⊆ S × Dsub(S).

Lemma 6.5. For subdistributions over a finite set S, ∆ R† Θ if and only if Θ can be written in the form ∑s∈⌈∆⌉ ∆(s)·Θs where each Θs ∈ ⇃R(s).

Proof. Suppose Θ = ∑s∈⌈∆⌉ ∆(s)·Θs with Θs ∈ ⇃R(s). To show that ∆ R† Θ, it suffices to prove that s̄ R† Θs for each s ∈ ⌈∆⌉, as R† is linear. Since Θs ∈ ⇃R(s), we can rewrite Θs as Θs = ∑i∈I pi·Θis where Θis ∈ R(s), for some finite index set I. The fact that s̄ = ∑i∈I pi·s̄ and s R Θis yields that s̄ R† Θs.

Conversely suppose ∆ R† Θ. By Lemma 6.1 we have that

∆ = ∑i∈I pi·s̄i,  si R Θi,  Θ = ∑i∈I pi·Θi.    (6.2)

For each s ∈ ⌈∆⌉, let Is = {i ∈ I | si = s}. Note that ∆(s) = ∑i∈Is pi. Hence, we can rewrite Θ as follows:

Θ = ∑s∈⌈∆⌉ ∑i∈Is pi·Θi = ∑s∈⌈∆⌉ ∆(s)·(∑i∈Is (pi/∆(s))·Θi).

Since the subdistribution ∑i∈Is (pi/∆(s))·Θi is a convex combination of {Θi | i ∈ Is}, it must be in ⇃R(s) due to (6.2), and the result follows. ⊓⊔

Theorem 6.4 (Infinite linearity). Suppose R is a relation over S × Dsub(S), where S is finite, and ∑i≥0 pi = 1. Then ∆i R† Θi for all i ≥ 0 implies (∑i≥0 pi·∆i) R† (∑i≥0 pi·Θi).

Proof. Suppose ∑i≥0 pi = 1 and ∆i R† Θi for each i ≥ 0. Let ∆, Θ denote ∑i≥0 pi·∆i and ∑i≥0 pi·Θi respectively. We have to show ∆ R† Θ. By Lemma 6.5 it is sufficient to show

Θ = ∑s∈⌈∆⌉ ∆(s)·Γs    (6.3)

where Γs ∈ ⇃R(s) for each s ∈ ⌈∆⌉.

By the same lemma we know that for each i ≥ 0, since ∆i R† Θi,

Θi = ∑s∈⌈∆i⌉ ∆i(s)·Θis  with Θis ∈ ⇃R(s).    (6.4)

Therefore,

Θ = ∑i≥0 pi·(∑s∈⌈∆i⌉ ∆i(s)·Θis) = ∑s∈⌈∆⌉ ∑i≥0 (pi·∆i(s))·Θis.

Let wsi denote pi·∆i(s) and note that ∆(s) is the infinite sum ∑i≥0 wsi. Therefore we can continue:

Θ = ∑s∈⌈∆⌉ ∑i≥0 wsi·Θis = ∑s∈⌈∆⌉ ∆(s)·(∑i≥0 (wsi/∆(s))·Θis).

The required (6.3) above will follow if we can show (∑i≥0 (wsi/∆(s))·Θis) ∈ ⇃R(s) for each s ∈ ⌈∆⌉.

From (6.4) we know Θis ∈ ⇃R(s), and therefore by construction we have that (∑i≥0 (wsi/∆(s))·Θis) ∈ ⇃ω⇃R(s). But now an application of Lemma 6.4 yields ⇃ω⇃R(s) = ⇃⇃R(s), and this coincides with ⇃R(s) because ⇃(·) is a closure operator. ⊓⊔

Consequently, for finite-state pLTS's we can freely use the infinite linearity property of the lifting operation.

6.3.2 Weak transitions

We now formally define a notion of weak derivatives.

Definition 6.4 (Weak τ moves to derivatives). Suppose we have subdistributions ∆, ∆→k, ∆×k, for k ≥ 0, with the following properties:

∆ = ∆→0 + ∆×0
∆→0 τ−→ ∆→1 + ∆×1
⋮
∆→k τ−→ ∆→k+1 + ∆×k+1
⋮

The τ−→ moves above with subdistribution sources are lifted in the sense of the previous section. Then we call ∆′ := ∑k≥0 ∆×k a weak derivative of ∆, and write ∆ =⇒ ∆′ to mean that ∆ can make a weak τ move to its derivative ∆′.

There is always at least one derivative of any distribution (the distribution itself) and there can be many. Using Lemma 6.2 it is easily checked that Definition 6.4 is well defined, in that derivatives do not sum to more than one.
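Definition 6.4 can be read as an algorithm once something resolves the two choices it leaves open: how much of each state's mass stops at a given stage, and which τ-transition the moving part takes. The sketch below is ours, not the book's; for simplicity the scheduler interface (split, step) is stationary, i.e. independent of the stage k, and the infinite scheme is truncated after finitely many stages, so the result only approximates a weak derivative. Mass still moving at the end is the part lost to divergence.

def weak_derivative(delta0, split, step, n_steps=50):
    """split(s) in [0,1]: fraction of s's mass that stops at each stage.
    step(s): the subdistribution Delta with s --tau--> Delta, for moving mass."""
    stopped, moving = {}, dict(delta0)
    for _ in range(n_steps):
        nxt = {}
        for s, mass in moving.items():
            keep = split(s) * mass                # part of s's mass stopping now
            if keep > 0.0:
                stopped[s] = stopped.get(s, 0.0) + keep
            if mass > keep:                       # the rest takes the tau-move
                for t, m in step(s).items():
                    nxt[t] = nxt.get(t, 0.0) + (mass - keep) * m
        moving = nxt
    return stopped

# rec x.x only loops, so nothing ever stops: its weak derivative is epsilon.
print(weak_derivative({'recx.x': 1.0}, split=lambda s: 0.0,
                      step=lambda s: {'recx.x': 1.0}))  # {} i.e. epsilon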

Example 6.2. Let τ−→⋆ denote the reflexive transitive closure of the relation τ−→ over subdistributions. By the judicious use of the empty subdistribution ε in the definition of =⇒, and property (6.1) above, it is easy to see that

∆ τ−→⋆ Θ implies ∆ =⇒ Θ,

because ∆ τ−→⋆ Θ means the existence of a finite sequence of subdistributions ∆ = ∆0, ∆1, ..., ∆k = Θ, k ≥ 0, for which we can write

∆    = ∆0   + ε
∆0   τ−→ ∆1 + ε
⋮
∆k−1 τ−→ ε  + ∆k
ε    τ−→ ε  + ε
⋮
In total: Θ

This implies that =⇒ is indeed a generalisation of the standard notion, for non-probabilistic transition systems, of performing an indefinite sequence of internal τ moves.

In Section 5.5 we wrote s τ̂−→ ∆ if either s τ−→ ∆ or ∆ = s̄. Hence the lifted relation τ̂−→ satisfies ∆ τ̂−→ ∆′ if and only if there are ∆→, ∆× and ∆1 with ∆ = ∆→ + ∆×, ∆→ τ−→ ∆1 and ∆′ = ∆1 + ∆×. Clearly, ∆ τ̂−→ ∆′ implies ∆ =⇒ ∆′. With a little effort, one can also show that ∆ τ̂−→⋆ ∆′ implies ∆ =⇒ ∆′. In fact, this follows directly from the reflexivity and transitivity of =⇒; the latter will be established in Theorem 6.6.

Conversely, in Section 5.5.1 we dealt with recursion-free rpCSP processes P, and these have the property that in a sequence as in Definition 6.4 with ∆ = ⟦P⟧ we necessarily have ∆k = ε for some k ≥ 0. On such processes the relations τ̂−→⋆ and =⇒ coincide.

In Definition 6.4 we can see that ∆′ = ε iff ∆×k = ε for all k. Thus ∆ =⇒ ε iff there is an infinite sequence of subdistributions ∆k such that ∆ = ∆0 and ∆k τ−→ ∆k+1, that is, ∆ can give rise to a divergent computation.

Example 6.3. Consider the process rec x.x, which recall is a state, and for which we have rec x.x τ−→ ⟦rec x.x⟧ and thus ⟦rec x.x⟧ τ−→ ⟦rec x.x⟧. Then ⟦rec x.x⟧ =⇒ ε.

Example 6.4. Recall the process Q1 = rec x.(τ.x 1/2⊕ a.0) from Section 6.1. We have ⟦Q1⟧ =⇒ ⟦a.0⟧ because

⟦Q1⟧ = ⟦Q1⟧ + ε
⟦Q1⟧ τ−→ 1/2·⟦τ.Q1⟧ + 1/2·⟦a.0⟧
1/2·⟦τ.Q1⟧ τ−→ 1/2·⟦Q1⟧ + ε
1/2·⟦Q1⟧ τ−→ 1/2²·⟦τ.Q1⟧ + 1/2²·⟦a.0⟧
⋮
1/2^k·⟦Q1⟧ τ−→ 1/2^(k+1)·⟦τ.Q1⟧ + 1/2^(k+1)·⟦a.0⟧
⋮

which means that by definition we have

⟦Q1⟧ =⇒ ε + ∑k≥1 1/2^k·⟦a.0⟧,

thus generating the weak derivative ⟦a.0⟧ as claimed.

Example 6.5. Consider the (infinite) collection of states sk and probabilities pk, for k ≥ 2, such that

sk τ−→ ⟦a.0⟧ pk⊕ s̄k+1,

where we choose pk so that starting from any sk the probability of eventually taking a left-hand branch, and so reaching ⟦a.0⟧ ultimately, is just 1/k in total. Thus pk must satisfy 1/k = pk + (1−pk)·1/(k+1), whence by arithmetic we have that pk := 1/k² will do. Therefore in particular s2 =⇒ 1/2·⟦a.0⟧, with the remaining 1/2 lost in divergence.

Our final example demonstrates that derivatives of (interpretations of) rpCSP processes may have infinite support, and hence that we can have ⟦P⟧ =⇒ ∆′ such that there is no P′ ∈ rpCSP with ⟦P′⟧ = ∆′.

Example 6.6. Let P denote the process rec x.(b.0 1/2⊕ (x |∅ 0)). Then we have the derivation:

⟦P⟧ = ⟦P⟧ + ε
⟦P⟧ τ−→ 1/2·⟦P |∅ 0¹⟧ + 1/2·⟦b.0⟧
1/2·⟦P |∅ 0¹⟧ τ−→ 1/2²·⟦P |∅ 0²⟧ + 1/2²·⟦b.0 |∅ 0¹⟧
⋮
1/2^k·⟦P |∅ 0^k⟧ τ−→ 1/2^(k+1)·⟦P |∅ 0^(k+1)⟧ + 1/2^(k+1)·⟦b.0 |∅ 0^k⟧
⋮

where 0^k represents k instances of 0 running in parallel. This implies that ⟦P⟧ =⇒ Θ where

Θ = ∑k≥0 1/2^(k+1)·⟦b.0 |∅ 0^k⟧,

a distribution with infinite support.

6.3.3 Properties of weak transitions

Here we develop some properties of the weak move relation =⇒ that will be important later on. We wish to use weak derivations as much as possible in the same way as the lifted action relations α−→, and therefore we start with showing that =⇒ enjoys two of the most crucial properties of α−→: linearity of Definition 6.2 and the decomposition property of Proposition 6.2. To this end, we first establish that weak derivations do not increase the mass of distributions, and are preserved under scaling.

Lemma 6.6. For any subdistributions ∆, Θ, Γ, Λ, Π we have

(i) If ∆ =⇒ Θ then |∆| ≥ |Θ|.
(ii) If ∆ =⇒ Θ and p ∈ R≥0 such that |p·∆| ≤ 1, then p·∆ =⇒ p·Θ.
(iii) If Γ + Λ =⇒ Π then Π = ΠΓ + ΠΛ with Γ =⇒ ΠΓ and Λ =⇒ ΠΛ.

Proof. By definition ∆ =⇒ Θ means that some ∆k, ∆×k, ∆→k exist for all k ≥ 0 such that

∆ = ∆0,  ∆k = ∆×k + ∆→k,  ∆→k τ−→ ∆k+1,  Θ = ∑k≥0 ∆×k.

A simple inductive proof shows that

|∆| = |∆→i| + ∑k≤i |∆×k|  for any i ≥ 0.    (6.5)

The sequence {∑k≤i |∆×k|}i≥0 is nondecreasing and by (6.5) each element of the sequence is not greater than |∆|. Therefore, the limit of this sequence is bounded by |∆|. That is,

|∆| ≥ lim i→∞ ∑k≤i |∆×k| = |Θ|.

Now suppose p ∈ R≥0 such that |p·∆| ≤ 1. From Remark 6.2(i) it follows that

p·∆ = p·∆0,  p·∆k = p·∆→k + p·∆×k,  p·∆→k τ−→ p·∆k+1,  p·Θ = ∑k p·∆×k.

Hence Definition 6.4 yields p·∆ =⇒ p·Θ.

Next suppose Γ + Λ =⇒ Π. By Definition 6.4 there are subdistributions Πk, Π→k, Π×k for k ∈ N such that

Γ + Λ = Π0,  Πk = Π→k + Π×k,  Π→k τ−→ Πk+1,  Π = ∑k Π×k.

For any s ∈ S, define

Γ→0(s) := min(Γ(s), Π→0(s))
Γ×0(s) := Γ(s) − Γ→0(s)
Λ×0(s) := min(Λ(s), Π×0(s))
Λ→0(s) := Λ(s) − Λ×0(s),    (6.6)

and check that Γ→0 + Γ×0 = Γ and Λ→0 + Λ×0 = Λ. To show that Λ→0 + Γ→0 = Π→0 and Λ×0 + Γ×0 = Π×0 we fix a state s and distinguish two cases: either (a) Π→0(s) ≥ Γ(s) or (b) Π→0(s) < Γ(s). In case (a) we have Π×0(s) ≤ Λ(s) and the definitions (6.6) simplify to Γ→0(s) = Γ(s), Γ×0(s) = 0, Λ×0(s) = Π×0(s) and Λ→0(s) = Λ(s) − Π×0(s), whence immediately Γ→0(s) + Λ→0(s) = Π→0(s) and Γ×0(s) + Λ×0(s) = Π×0(s). Case (b) is similar.

Since Λ→0 + Γ→0 τ−→ Π1, by Proposition 6.2 we find Γ1, Λ1 with Γ→0 τ−→ Γ1, Λ→0 τ−→ Λ1 and Π1 = Γ1 + Λ1. Being now in the same position with Π1 as we were with Π0, we can continue this procedure to find Λk, Γk, Λ→k, Γ→k, Λ×k and Γ×k with

Γ = Γ0,  Γk = Γ→k + Γ×k,  Γ→k τ−→ Γk+1,
Λ = Λ0,  Λk = Λ→k + Λ×k,  Λ→k τ−→ Λk+1,
Γk + Λk = Πk,  Γ→k + Λ→k = Π→k,  Γ×k + Λ×k = Π×k.

Let ΠΓ := ∑k Γ×k and ΠΛ := ∑k Λ×k. Then Π = ΠΓ + ΠΛ and Definition 6.4 yields Γ =⇒ ΠΓ and Λ =⇒ ΠΛ. ⊓⊔

Together, Lemma 6.6(ii) and (iii) imply the binary counterpart of the decomposition property of Proposition 6.2. We now generalise this result to infinite (but still countable) decomposition, and also establish linearity.

Theorem 6.5. Let pi ∈ [0,1] for i ∈ I with ∑i∈I pi ≤ 1. Then

(i) If ∆i =⇒ Θi for all i ∈ I then ∑i∈I pi·∆i =⇒ ∑i∈I pi·Θi.
(ii) If ∑i∈I pi·∆i =⇒ Θ then Θ = ∑i∈I pi·Θi for some subdistributions Θi such that ∆i =⇒ Θi for all i ∈ I.

Proof. (i) Suppose ∆i =⇒ Θi for all i ∈ I. By Definition 6.4 there are subdistributions ∆ik, ∆→ik, ∆×ik such that

∆i = ∆i0,  ∆ik = ∆→ik + ∆×ik,  ∆→ik τ−→ ∆i(k+1),  Θi = ∑k ∆×ik.

We compose the relevant subdistributions and obtain ∑i∈I pi·∆i = ∑i∈I pi·∆i0, ∑i∈I pi·∆ik = ∑i∈I pi·∆→ik + ∑i∈I pi·∆×ik, and ∑i∈I pi·∆→ik τ−→ ∑i∈I pi·∆i(k+1) by Theorem 6.4; moreover ∑i∈I pi·Θi = ∑i∈I pi·∑k ∆×ik = ∑k (∑i∈I pi·∆×ik). By Definition 6.4 we obtain ∑i∈I pi·∆i =⇒ ∑i∈I pi·Θi.

(ii) In the light of Lemma 6.6(ii) it suffices to show that if ∑i≥0 ∆i =⇒ Θ then Θ = ∑i≥0 Θi for subdistributions Θi such that ∆i =⇒ Θi for all i ≥ 0.

Since ∑i≥0 ∆i = ∆0 + ∑i≥1 ∆i and ∑i≥0 ∆i =⇒ Θ, by Lemma 6.6(iii) there are Θ0, Θ≥1 such that

∆0 =⇒ Θ0,  ∑i≥1 ∆i =⇒ Θ≥1,  Θ = Θ0 + Θ≥1.

Using Lemma 6.6(iii) once more, we have Θ1, Θ≥2 such that

∆1 =⇒ Θ1,  ∑i≥2 ∆i =⇒ Θ≥2,  Θ≥1 = Θ1 + Θ≥2,

thus in combination Θ = Θ0 + Θ1 + Θ≥2. Continuing this process we have that

∆i =⇒ Θi,  ∑j≥i+1 ∆j =⇒ Θ≥i+1,  Θ = ∑j=0..i Θj + Θ≥i+1

for all i ≥ 0. Lemma 6.6(i) ensures that |∑j≥i+1 ∆j| ≥ |Θ≥i+1| for all i ≥ 0. But since ∑i≥0 ∆i is a subdistribution, we know that the tail sum ∑j≥i+1 ∆j converges to ε when i approaches ∞, and therefore lim i→∞ Θ≥i = ε. Thus by taking that limit we conclude that Θ = ∑i≥0 Θi. ⊓⊔

With Theorem 6.5, the relation =⇒ ⊆ Dsub(S) × Dsub(S) can be obtained as the lifting of a relation =⇒S from S to Dsub(S), which is defined by writing s =⇒S Θ just when s̄ =⇒ Θ.

Proposition 6.3. (=⇒S)† = (=⇒).

Proof. That ∆ (=⇒S)† Θ implies ∆ =⇒ Θ is a simple application of part (i) of Theorem 6.5. For the other direction, suppose ∆ =⇒ Θ. Given that ∆ = ∑s∈⌈∆⌉ ∆(s)·s̄, part (ii) of the same theorem enables us to decompose Θ into ∑s∈⌈∆⌉ ∆(s)·Θs where s̄ =⇒ Θs for each s in ⌈∆⌉. But the latter actually means that s =⇒S Θs, and so by definition this implies ∆ (=⇒S)† Θ. ⊓⊔

It is immediate that the relation =⇒ is convex, because it is a lifting.

We proceed with the important properties of reflexivity and transitivity of weak derivations. First note that reflexivity is straightforward; in Definition 6.4 it suffices to take ∆→0 to be the empty subdistribution ε.

0 to be the empty subdistributionε.

Theorem 6.6 (Transitivity of =⇒). If ∆ =⇒ Θ and Θ =⇒ Λ then ∆ =⇒ Λ.

Proof. By definition ∆ =⇒ Θ means that some ∆k, ∆×k, ∆→k exist for all k ≥ 0 such that

∆ = ∆0,  ∆k = ∆×k + ∆→k,  ∆→k τ−→ ∆k+1,  Θ = ∑k≥0 ∆×k.    (6.7)

Since Θ = ∑k≥0 ∆×k and Θ =⇒ Λ, by Theorem 6.5(ii) there are Λk for k ≥ 0 such that Λ = ∑k≥0 Λk and ∆×k =⇒ Λk for all k ≥ 0.

Now for each k ≥ 0, we know that ∆×k =⇒ Λk gives us some ∆kl, ∆×kl, ∆→kl for l ≥ 0 such that

∆×k = ∆k0,  ∆kl = ∆×kl + ∆→kl,  ∆→kl τ−→ ∆k,l+1,  Λk = ∑l≥0 ∆×kl.    (6.8)

Therefore we can put all this together with

Λ = ∑k≥0 Λk = ∑k,l≥0 ∆×kl = ∑i≥0 (∑{k,l | k+l=i} ∆×kl),    (6.9)

where the last step is a straightforward diagonalisation.

Now from the decompositions above we re-compose an alternative trajectory of ∆′i's to take ∆ via =⇒ to Λ directly. Define

∆′i = ∆′×i + ∆′→i,  ∆′×i = ∑{k,l | k+l=i} ∆×kl,  ∆′→i = (∑{k,l | k+l=i} ∆→kl) + ∆→i,    (6.10)

so that from (6.9) we have immediately that

Λ = ∑i≥0 ∆′×i.    (6.11)

We now show that

(i) ∆ = ∆′0
(ii) ∆′→i τ−→ ∆′i+1,

from which, with (6.11), we will have ∆ =⇒ Λ as required. For (i) we observe that

∆ = ∆0                                                          (6.7)
  = ∆×0 + ∆→0                                                   (6.7)
  = ∆00 + ∆→0                                                   (6.8)
  = ∆×00 + ∆→00 + ∆→0                                           (6.8)
  = (∑{k,l | k+l=0} ∆×kl) + (∑{k,l | k+l=0} ∆→kl) + ∆→0         index arithmetic
  = ∆′×0 + ∆′→0                                                 (6.10)
  = ∆′0.                                                        (6.10)

For (ii) we observe that

∆′→i = (∑{k,l | k+l=i} ∆→kl) + ∆→i                              (6.10)
 τ−→ (∑{k,l | k+l=i} ∆k,l+1) + ∆i+1                             (6.7), (6.8), additivity
  = (∑{k,l | k+l=i} (∆×k,l+1 + ∆→k,l+1)) + ∆×i+1 + ∆→i+1        (6.7), (6.8)
  = (∑{k,l | k+l=i} ∆×k,l+1) + ∆×i+1 + (∑{k,l | k+l=i} ∆→k,l+1) + ∆→i+1   rearrange
  = (∑{k,l | k+l=i} ∆×k,l+1) + ∆i+1,0 + (∑{k,l | k+l=i} ∆→k,l+1) + ∆→i+1   (6.8)
  = (∑{k,l | k+l=i} ∆×k,l+1) + ∆×i+1,0 + ∆→i+1,0 + (∑{k,l | k+l=i} ∆→k,l+1) + ∆→i+1   (6.8)
  = (∑{k,l | k+l=i+1} ∆×kl) + (∑{k,l | k+l=i+1} ∆→kl) + ∆→i+1   index arithmetic
  = ∆′×i+1 + ∆′→i+1                                             (6.10)
  = ∆′i+1,                                                      (6.10)

which concludes the proof. ⊓⊔

Finally, we need a property that is the converse of transitivity: if one executes a given weak derivation partly, by stopping more often and moving on less often, one makes another weak transition that can be regarded as an initial segment of the given one. We need the property that after executing such an initial segment, it is still possible to complete the given derivation.

Definition 6.5. A weak derivation Φ =⇒ Γ is called an initial segment of a weak derivation Φ =⇒ Ψ if for k ≥ 0 there are Γk, Γ→k, Γ×k, Ψk, Ψ→k, Ψ×k ∈ Dsub(S) such that Γ0 = Ψ0 = Φ and

Γk = Γ→k + Γ×k        Ψk = Ψ→k + Ψ×k        Γ→k ≤ Ψ→k
Γ→k τ−→ Γk+1          Ψ→k τ−→ Ψk+1          Γk ≤ Ψk
Γ = ∑k≥0 Γ×k          Ψ = ∑k≥0 Ψ×k          (Ψ→k − Γ→k) τ−→ (Ψk+1 − Γk+1).

Page 176: Semantics of Probabilistic Processes - SJTU

166 6 Testing Finitary Probabilistic Processes

Intuitively, in the derivationΦ =⇒Ψ , for eachk ≥ 0, we only allow a portion ofΨ→

k to makeτ moves, and the rest remains unmoved even if it can enableτ moves,so as to obtain an initial segmentΦ =⇒ Γ . Accordingly, eachΓ ×

k includes the cor-responding unmoved part ofΨ→

k , which is eventually collected inΓ . Now fromΓif we let those previously unmoved parts perform exactly thesameτ moves as inΦ =⇒Ψ , we will end up with a derivation leading toΨ . This is formulated in thefollowing proposition.

Proposition 6.4. If Φ =⇒ Γ is an initial segment of Φ =⇒ Ψ, then Γ =⇒ Ψ.

Proof. For any subdistributions ∆, Θ ∈ Dsub(S) we define two new subdistributions: ∆∩Θ ∈ Dsub(S) by letting (∆∩Θ)(s) := min(∆(s), Θ(s)), and ∆−Θ ∈ Dsub(S) by (∆−Θ)(s) := max(∆(s)−Θ(s), 0). So we have ∆−Θ = ∆−(∆∩Θ). Observe that in case Θ ≤ ∆, and only then, we have (∆−Θ)+Θ = ∆.

Let Γk, Γ→k, Γ×k, Ψk, Ψ→k, Ψ×k ∈ Dsub(S) be as in Definition 6.5. By induction on k ≥ 0 we define ∆ki, ∆→ki and ∆×ki, for 0 ≤ i ≤ k, such that

(i) ∆k0 = Γ×k
(ii) Ψk = ∑i=0..k ∆ki + Γ→k
(iii) Ψ×k = ∑i=0..k ∆×ki
(iv) ∆ki = ∆→ki + ∆×ki
(v) ∆→ki τ−→ ∆(k+1)(i+1)

Induction base: Let ∆00 := Γ×0 = Γ0 − Γ→0 = Ψ0 − Γ→0. This way the first two equations are satisfied for k = 0. All other statements will be dealt with fully by the induction step.

Induction step: Suppose ∆ki for 0 ≤ i ≤ k are already known, and moreover we have Ψk = ∑i=0..k ∆ki + Γ→k. With induction on i we define ∆×ki := ∆ki ∩ (Ψ×k − ∑j=0..i−1 ∆×kj) and establish ∑j=0..i ∆×kj ≤ Ψ×k. Namely, writing Θki for ∑j=0..i−1 ∆×kj, surely Θk0 = ε ≤ Ψ×k, and when assuming that Θki ≤ Ψ×k and defining ∆×ki := ∆ki ∩ (Ψ×k − Θki) we obtain Θk(i+1) = ∆×ki + Θki ≤ (Ψ×k − Θki) + Θki = Ψ×k. So in particular ∑i=0..k ∆×ki ≤ Ψ×k. Using that Γ→k ≤ Ψ→k we find

∆kk = (Ψk − Γ→k) − ∑i=0..k−1 ∆ki = (Ψ×k + (Ψ→k − Γ→k)) − ∑i=0..k−1 ∆ki ≥ Ψ×k − ∑i=0..k−1 ∆ki,

hence ∆×kk = ∆kk ∩ (Ψ×k − ∑i=0..k−1 ∆×ki) = Ψ×k − ∑i=0..k−1 ∆×ki, and thus Ψ×k = ∑i=0..k ∆×ki.

Now define ∆→ki := ∆ki − ∆×ki. This yields ∆ki = ∆→ki + ∆×ki and thereby

Ψ→k = Ψk − Ψ×k = (∑i=0..k ∆ki + Γ→k) − ∑i=0..k ∆×ki = ∑i=0..k ∆→ki + Γ→k.

Since ∑i=0..k ∆→ki = (Ψ→k − Γ→k) τ−→ (Ψk+1 − Γk+1), by Proposition 6.2 we have Ψk+1 − Γk+1 = ∑i=0..k ∆(k+1)(i+1) for some subdistributions ∆(k+1)(i+1) that form the transitions ∆→ki τ−→ ∆(k+1)(i+1) for i = 0, ..., k. Furthermore, let us define ∆(k+1)0 := Γ×k+1 = Γk+1 − Γ→k+1. It follows that

Ψk+1 = ∑i=0..k ∆(k+1)(i+1) + Γk+1 = ∑i=1..k+1 ∆(k+1)i + (∆(k+1)0 + Γ→k+1) = ∑i=0..k+1 ∆(k+1)i + Γ→k+1.

This ends the inductive definition and proof. Now let us define Θi := ∑k≥i ∆ki, Θ→i := ∑k≥i ∆→ki and Θ×i := ∑k≥i ∆×ki. We have that Θ0 = ∑k≥0 ∆k0 = ∑k≥0 Γ×k = Γ, that Θi = Θ→i + Θ×i, and, using Remark 6.2(ii), that Θ→i τ−→ Θi+1. Moreover,

∑i≥0 Θ×i = ∑i≥0 ∑k≥i ∆×ki = ∑k≥0 ∑i=0..k ∆×ki = ∑k≥0 Ψ×k = Ψ.

Definition 6.4 yields Γ =⇒ Ψ. ⊓⊔

6.4 Testing rpCSP processes

Applying a test to a process results in a nondeterministic, but possibly probabilistic, computation structure. The main conceptual issue is how to associate outcomes with these nondeterministic structures. In Section 4.2 we have seen an approach to testing in which we explicitly associate with a nondeterministic structure a set of deterministic computations called resolutions, each of which determines a possible outcome. In this section we describe an alternative approach in which intuitively the nondeterministic choices are resolved implicitly in a dynamic manner. We show that although these approaches are formally quite different they lead to exactly the same testing outcomes.

6.4.1 Testing with extremal derivatives

A test is simply a process in the language rpCSP, except that it may in addition use special success actions for reporting outcomes: these are drawn from a set Ω of fresh actions not already in Actτ. We refer to the augmented language as rpCSPΩ. Formally a test T is some process from that language, and to apply test T to process P we form the process T |Act P in which all visible actions of P must synchronise with T. The resulting composition is a process whose only possible actions are τ and the elements of Ω. We will define the result A^d(T,P) of applying the test T to the process P to be a set of testing outcomes, exactly one of which results from each resolution of the choices in T |Act P. Each testing outcome is an Ω-tuple of real numbers in the interval [0,1], i.e. a function o : Ω → [0,1], and its ω-component o(ω), for ω ∈ Ω, gives the probability that the resolution in question will reach an ω-success state, one in which the success action ω is possible.

We will now give a definition of A^d(T,P), which is intended to be an alternative to A(T,P). Our definition has three ingredients. First of all, to simplify the presentation we normalise our pLTS by removing all τ-transitions that leave a success state. This way an ω-success state will only have outgoing transitions labelled ω.

Definition 6.6 (ω-respecting). Let 〈S,L,→〉 be a pLTS such that the set of labels L includes Ω. It is said to be ω-respecting whenever s ω−→, for any ω ∈ Ω, implies s τ↛.

It is straightforward to modify an arbitrary pLTS so that it is ω-respecting. Here we outline how this is done for our pLTS for rpCSP.

Definition 6.7 (Pruning). Let [·] be the unary operator on Ω-test states given by the operational rules

s ω−→ ∆ implies [s] ω−→ [∆]    (ω ∈ Ω)
s ω↛ for all ω ∈ Ω, together with s α−→ ∆, implies [s] α−→ [∆]    (α ∈ Actτ)

Just as □ and |A, this operator extends as syntactic sugar to Ω-tests by distributing [·] over p⊕; likewise, it extends to distributions by [∆]([s]) = ∆(s). Clearly, this operator does nothing else than removing all outgoing transitions of a success state other than the ones labelled with some ω ∈ Ω.

Next, using Definition 6.4, we get a collection of subdistributions Θ reachable from ⟦[T |Act P]⟧. Then we isolate a class of special weak derivatives called extreme derivatives.

Definition 6.8 (Extreme derivatives). A state s in a pLTS is called stable if s τ↛, and a subdistribution Θ is called stable if every state in its support is stable. We write ∆ =⇒≻ Θ whenever ∆ =⇒ Θ and Θ is stable, and call Θ an extreme derivative of ∆.

Referring to Definition 6.4, we see this means that in the extreme derivation of Θ from ∆ at every stage a state must move on if it can, so that every stopping component can contain only states that must stop: for s ∈ ⌈∆→k + ∆×k⌉ we have that s ∈ ⌈∆×k⌉ if, and now also only if, s τ↛. Moreover if the pLTS is ω-respecting then whenever s ∈ ⌈∆→k⌉, that is whenever it marches on, it is not successful: s ω↛ for every ω ∈ Ω.

Lemma 6.7 (Existence of extreme derivatives).

(i) For every subdistribution ∆ there exists some (stable) ∆′ such that ∆ =⇒≻ ∆′.
(ii) In a deterministic pLTS, if ∆ =⇒≻ ∆′ and ∆ =⇒≻ ∆′′ then ∆′ = ∆′′.

Proof. We construct a derivation as in Definition 6.4 of a stable ∆′ by defining the components ∆k, ∆×k and ∆→k using induction on k. Let us assume that the subdistribution ∆k has been defined; in the base case k = 0 this is simply ∆. The decomposition of this ∆k into the components ∆×k and ∆→k is carried out by defining the former to pick out precisely those states that must stop, i.e. those s for which s τ↛. Formally ∆×k is determined by:

∆×k(s) = ∆k(s) if s τ↛, and 0 otherwise.

Then ∆→k is given by the remainder of ∆k, namely those states that can perform a τ action:

∆→k(s) = ∆k(s) if s τ−→, and 0 otherwise.

Note that these definitions divide the support of ∆k into two disjoint sets, namely the support of ∆×k and the support of ∆→k. Moreover by construction we know that ∆→k τ−→ Θ for some Θ; we let ∆k+1 be an arbitrary such Θ.

This completes our definition of an extreme derivative as in Definition 6.4, and so we have established (i).

For (ii) we observe that in a deterministic pLTS the above choice of ∆k+1 is unique, so that the whole derivative construction becomes unique. ⊓⊔

It is worth pointing out that the use of subdistributions, rather than distributions, is essential here. If ∆ diverges, that is if there is an infinite sequence of derivations ∆ τ−→ ∆1 τ−→ ... ∆k τ−→ ..., then one extreme derivative of ∆ is the empty subdistribution ε. For example the only transition of rec x.x is rec x.x τ−→ ⟦rec x.x⟧, and therefore rec x.x diverges; consequently its unique extreme derivative is ε.

The final ingredient in the definition of the set of outcomes A^d(T,P) is to use this notion of extreme derivative to formalise the subdistributions that can be reached from ⟦[T |Act P]⟧. Note that all states s ∈ ⌈Θ⌉ in the support of an extreme derivative either satisfy s ω−→ for a unique ω ∈ Ω, or have s ↛.

Definition 6.9 (Outcomes). The outcome $Θ ∈ [0,1]Ω of a stable subdistribution Θ is given by

$Θ(ω) := ∑{ Θ(s) | s ∈ ⌈Θ⌉, s ω−→ }.

Putting all three ingredients together, we arrive at a definition of A^d(T,P):

Definition 6.10. Let P be an rpCSP process and T an Ω-test. Then

A^d(T,P) := { $Θ | ⟦[T |Act P]⟧ =⇒≻ Θ }.

The role of pruning in the above definition can be seen via the following example.

Example 6.7. Let P = a.b.0 and T = a.(b.0 □ ω.0). The pLTS generated by applying T to P can be described by the process τ.(τ.0 □ ω.0). Then ⟦T |Act P⟧ has a unique extreme derivation ⟦T |Act P⟧ =⇒≻ ⟦0⟧, whereas ⟦[T |Act P]⟧ has the unique extreme derivation ⟦[T |Act P]⟧ =⇒≻ ⟦ω.0⟧. The outcome in A^d(T,P) shows that process P passes test T with probability 1, which is what we expect for the state-based testing used in this book. Without pruning we would get an outcome saying that P passes T with probability 0, which is what would be expected for action-based testing.

Using the two standard methods for comparing two sets of outcomes, the Hoare and Smyth preorders, we define the may- and must-testing preorders; they are decorated with ·Ω for the repertoire Ω of testing actions they employ.

Definition 6.11.

1. P ⊑Ωpmay Q if for every Ω-test T, A^d(T,P) ≤Ho A^d(T,Q).
2. P ⊑Ωpmust Q if for every Ω-test T, A^d(T,P) ≤Sm A^d(T,Q).

These preorders are abbreviated to P ⊑pmay Q and P ⊑pmust Q when |Ω| = 1.

Example 6.8. Consider the process Q1 = rec x.(τ.x 1/2⊕ a.0), which was already discussed in Section 6.1. When we apply the test T = a.ω.0 to it we get the pLTS in Figure 4.3(b), which is deterministic and unaffected by pruning; from part (ii) of Lemma 6.7 it follows that s0 has a unique extreme derivative Θ. Moreover Θ can be calculated to be

∑k≥1 1/2^k·s̄3,

which simplifies to the distribution s̄3. Thus it gives the same set of results {ω⃗} gained by applying T to a.0 on its own; and in fact it is possible to show that this holds for all tests, giving

Q1 ≃pmay a.0    and    Q1 ≃pmust a.0.

Example 6.9. Consider the process Q2 = rec x.((x 1/2⊕ a.0) ⊓ (0 1/2⊕ a.0)) and the application of the same test T = a.ω.0 to it, as outlined in Figure 4.4(a) and (b).

Consider any extreme derivative ∆′ from ⟦[T |Act Q2]⟧, which we have abbreviated to s0; note that here again pruning actually has no effect. Using the notation of Definition 6.4, it is clear that ∆×0 and ∆→0 must be ε and s̄0 respectively. Similarly, ∆×1 and ∆→1 must be ε and s̄1 respectively. But s1 is a nondeterministic state, having two possible transitions:

(i) s1 τ−→ Θ0 where Θ0 has support {s0, s2} and assigns each of them the weight 1/2;
(ii) s1 τ−→ Θ1 where Θ1 has support {s3, s4}, again dividing the mass equally among them.

So there are many possibilities for ∆2; Lemma 6.1 shows that in fact ∆2 can be of the form

p·Θ0 + (1−p)·Θ1    (6.12)

for any choice of p ∈ [0,1].

Let us consider one possibility, an extreme one where p is chosen to be 0; only the transition (ii) above is used. Here ∆→2 is the subdistribution 1/2·s̄4, and ∆→k = ε whenever k > 2. A simple calculation shows that in this case the extreme derivative generated is Θe1 = 1/2·s̄3 + 1/2·⟦ω.0⟧, which implies that 1/2·ω⃗ ∈ A^d(T,Q2).

Another possibility for ∆2 is Θ0, corresponding to the choice of p = 1 in (6.12) above. Continuing with this derivation leads to ∆3 being 1/2·s̄1 + 1/2·⟦ω.0⟧; in other words ∆×3 = 1/2·⟦ω.0⟧ and ∆→3 = 1/2·s̄1. Now in the generation of ∆4 from ∆→3 once more we have to resolve a transition from the nondeterministic state s1, by choosing some arbitrary p ∈ [0,1] in (6.12). Suppose that each time this arises we systematically choose p = 1, that is, we ignore completely the transition (ii) above. Then it is easy to see that the extreme derivative generated is

Θe0 = ∑k≥1 1/2^k·⟦ω.0⟧,

which simplifies to the distribution ⟦ω.0⟧. This in turn means that ω⃗ ∈ A^d(T,Q2).

We have seen two possible derivations of extreme derivatives from s0. But there are many others. In general whenever ∆→k is of the form q·s̄1 we have to resolve the nondeterminism by choosing a p ∈ [0,1] in (6.12) above; moreover each such choice is independent. However it will follow from later results, specifically Corollary 6.4, that every extreme derivative ∆′ of s0 is of the form

q·Θe0 + (1−q)·Θe1

for some choice of q ∈ [0,1]; this is explained in Example 6.11. Consequently it follows that A^d(T,Q2) = { q·ω⃗ | q ∈ [1/2, 1] }.

Since A^d(T,a.0) = {ω⃗} it follows that

A^d(T,a.0) ≤Ho A^d(T,Q2)    and    A^d(T,Q2) ≤Sm A^d(T,a.0).

Again it is possible to show that these inequalities result from any test T, and that therefore we have

a.0 ⊑pmay Q2    and    Q2 ⊑pmust a.0.

6.4.2 Comparison with resolution-based testing

The derivation of extreme derivatives, via the schema in Definition 6.4, involves the systematic dynamic resolution of nondeterministic states, in each transition from ∆→k to ∆k+1. In the literature various mechanisms have been proposed for making these choices; for example policies are used in [12], adversaries in [13], schedulers in [14], and so on. Here we concentrate not on any such mechanism but rather on the results of their application. In general they reduce a nondeterministic structure, typically a pLTS, to a set of deterministic structures. To describe these deterministic structures, in Section 4.2 we adapted the notion of resolution, defined in [15, 3] for probabilistic automata, to pLTS's. Therefore, we have now seen at least two ways of associating sets of outcomes with the application of a test to a process. The first, in Section 6.4.1, uses extreme derivations in which nondeterministic choices are resolved dynamically as the derivation proceeds, while in the second, in Section 4.2, we associate with a test and a process a set of deterministic structures called resolutions. In

First let us see how an extreme derivation can be viewed as a method for dynam-ically generating a resolution.

Theorem 6.7 (Resolutions from extreme derivatives). Suppose ∆ =⇒≻ ∆′ in a pLTS 〈S,Ωτ,→〉. Then there is a resolution 〈R,Θ,→R〉 of ∆, with resolving function f, such that Θ =⇒≻R Θ′ for some Θ′ with ∆′ = Imgf(Θ′).

Proof. Consider an extreme derivation of ∆ =⇒≻ ∆′, as given in Definition 6.4, where all ∆×k are assumed to be stable. To define the corresponding resolution 〈R,Θ,→R〉 we refer to Definition 4.3. First let the set of states R be S×N and the resolving function f : R → S be given by f(s,k) = s. To complete the description we must define the partial functions α−→R, for α = ω and α = τ. These are always defined so that if (s,k) α−→R Γ then the only states in the support of Γ are of the form (s′,k+1). In the definition we use Θ↓k, for any subdistribution Θ over S, to be the subdistribution over R given by

Θ↓k(t) = Θ(s) if t = (s,k), and 0 otherwise.

Note that by definition

(a) Imgf(Θ↓k) = Θ
(b) (∆k)↓k = (∆→k)↓k + (∆×k)↓k

The definition of ω−→R is straightforward: its domain consists of the states (s,k) where s ∈ ⌈∆×k⌉, and it is defined by letting (s,k) ω−→R ∆s↓k+1 for some arbitrarily chosen s ω−→ ∆s.

The definition of τ−→R is more complicated, and is determined by the moves ∆→k τ−→ ∆k+1. For a given k this move means that

∆→k = ∑i∈I pi·s̄i,  ∆k+1 = ∑i∈I pi·Γi,  si τ−→ Γi.

So for each k we let

(s,k) τ−→R ∑{i∈I | si=s} (pi/∆→k(s))·Γi↓k+1.

This definition ensures

(c) (∆→k)↓k τ−→R (∆k+1)↓k+1
(d) (∆×k)↓k is stable.

This completes our definition of the deterministic pLTS underlying the required resolution; it remains to find subdistributions Θ, Θ′ over R such that ∆ = Imgf(Θ), ∆′ = Imgf(Θ′) and Θ =⇒≻ Θ′.

Because of (b), (c) and (d) we have the following extreme derivation, which by part (ii) of Lemma 6.7 is the unique one from ∆↓0:

∆↓0 = (∆→0)↓0 + (∆×0)↓0
(∆→0)↓0 τ−→R (∆→1)↓1 + (∆×1)↓1
⋮
(∆→k)↓k τ−→R (∆→k+1)↓k+1 + (∆×k+1)↓k+1
⋮
Θ′ = ∑k≥0 (∆×k)↓k

Letting Θ be ∆↓0, we see that (a) above ensures ∆ = Imgf(Θ); the same note and the linearity of f applied to distributions also gives ∆′ = Imgf(Θ′). ⊓⊔

The converse is somewhat simpler.

Proposition 6.5 (Extreme derivatives from resolutions). Suppose 〈R,Θ,→R〉 is a resolution of a subdistribution ∆ in a pLTS 〈S,Ωτ,→〉 with resolving function f. Then Θ =⇒≻R Θ′ implies ∆ =⇒≻ Imgf(Θ′).

Proof. Consider any derivation of Θ =⇒≻R Θ′ along the lines of Definition 6.4. By systematically applying the function f to the component subdistributions in this derivation we get a derivation Imgf(Θ) =⇒ Imgf(Θ′), that is ∆ =⇒ Imgf(Θ′). To show that Imgf(Θ′) is actually an extreme derivative it suffices to show that s is stable for every s ∈ ⌈Imgf(Θ′)⌉. But if s ∈ ⌈Imgf(Θ′)⌉ then by definition there is some t ∈ ⌈Θ′⌉ such that s = f(t). Since Θ =⇒≻R Θ′ the state t must be stable. The stability of s now follows from requirement (iii) of Definition 4.3. ⊓⊔

Our next step is to relate the outcomes extracted from extreme derivatives to those extracted from the corresponding resolutions. By Lemma 4.3 we know that the function C : (R → [0,1]Ω) → (R → [0,1]Ω) defined in (4.1) for a deterministic pLTS is continuous. Then its least fixed point V : R → [0,1]Ω is also continuous and can be captured by a chain of approximants: the functions Vn, n ≥ 0, are defined by induction on n:

V0(r)(ω) = 0
Vn+1 = C(Vn)

Then V = ⊔n≥0 Vn. This is used in the following result.

Lemma 6.8. Let ∆ be a subdistribution in an ω-respecting deterministic pLTS. If ∆ =⇒≻ ∆′ then V(∆) = V(∆′).

Proof. Since the pLTS is ω-respecting we know that s τ−→ ∆ implies s ω↛ for any ω. Therefore from the definition of the function C we have that s τ−→ ∆ implies Vn+1(s) = Vn(∆), whence by lifting and linearity we get:

If ∆ τ−→ ∆′ then Vn+1(∆) = Vn(∆′), for all n ≥ 0.    (6.13)

Now suppose ∆ =⇒≻ ∆′. Referring to Definition 6.4 and carrying out a straightforward induction based on (6.13), we have

Vn+1(∆) = V0(∆n+1) + ∑k=0..n Vn−k+1(∆×k)    (6.14)

for all n ≥ 0. This can be simplified further by noting

(i) V0(∆)(ω) = 0 for every ∆
(ii) Vm+1(∆) = V(∆) for every m ≥ 0, provided ∆ is stable.

Applying these remarks to (6.14) above, since all ∆×k are stable, we obtain

Vn+1(∆) = ∑k=0..n V(∆×k)    (6.15)

We conclude by reasoning as follows:

V(∆) = ⊔n≥0 Vn+1(∆)
     = ⊔n≥0 ∑k=0..n V(∆×k)      from (6.15) above
     = ⊔n≥0 V(∑k=0..n ∆×k)      by linearity of V
     = V(⊔n≥0 ∑k=0..n ∆×k)      by continuity of V
     = V(∑k≥0 ∆×k)
     = V(∆′). ⊓⊔

We are now ready to compare the two methods for calculating the set of outcomes associated with a subdistribution:

• using extreme derivatives and the reward function $ from Definition 6.9;
• using resolutions and the evaluation function V from Section 4.2.

Corollary 6.2. In an ω-respecting pLTS 〈S,Ωτ,→〉, the following statements hold.

(a) If ∆ =⇒≻ ∆′ then there is a resolution 〈R,Θ,→R〉 of ∆ such that V(Θ) = $(∆′).
(b) For any resolution 〈R,Θ,→R〉 of ∆, there exists an extreme derivative ∆′ such that ∆ =⇒≻ ∆′ and V(Θ) = $(∆′).

Proof. Suppose ∆ =⇒≻ ∆′. By Theorem 6.7, there is a resolution 〈R,Θ,→R〉 of ∆ with resolving function f and subdistribution Θ such that Θ =⇒≻ Θ′ and moreover ∆′ = Imgf(Θ′). By Lemma 6.8, we have V(Θ) = V(Θ′).

Since Θ′ and ∆′ are extreme derivatives, all the states in their supports are stable. Therefore a simple calculation, using the fact that ∆′ = Imgf(Θ′), will show that V(Θ′) = $(∆′), from which the required V(Θ) = $(∆′) follows.

To prove part (b), suppose that 〈R,Θ,→R〉 is a resolution of ∆ with resolving function f, so that ∆ = Imgf(Θ). We know from Lemma 6.7 that there exists a (unique) subdistribution Θ′ such that Θ =⇒≻ Θ′. By Proposition 6.5 we have that ∆ = Imgf(Θ) =⇒≻ Imgf(Θ′). The same arguments as in the other direction show that V(Θ) = $(Imgf(Θ′)). ⊓⊔

Corollary 6.3. For any process P and test T, we have A^d(T,P) = A(T,P). ⊓⊔

6.5 Generating weak derivatives in a finitary pLTS

Now let us restrict our attention to a finitary pLTS whose state space is assumed to be S = {s1, ..., sn}. Here by definition the sets {Θ | s α−→ Θ} are finite, for every state s and label α. This is no longer true for the weak arrows; the sets {Θ | s α=⇒ Θ} are in general not finite, because of the infinitary nature of the weak derivative relation =⇒. The purpose of this section is to show that nevertheless they can be finitely represented, at least for finitary pLTS's.

This is explained in Section 6.5.1, and the ramifications are then explored in the following subsection. These include a very useful topological property of these sets of derivatives: they are closed in the sense (from analysis) of containing all their limit points, where, in turn, limit depends on a Euclidean-style metric defining the distance between two subdistributions in a straightforward way. Another consequence is that we can find, in any derivation that partially diverges (by no matter how small an amount), a point at which the divergence is distilled into a state that wholly diverges; we call this distillation of divergence.

6.5.1 Finite generability

A subdistribution over the finite state space S can now be viewed as a point in Rⁿ, and therefore a set of subdistributions, such as the set of weak derivatives {∆ | s̄ =⇒ ∆}, corresponds to a subset of Rⁿ. We endow Rⁿ with the standard Euclidean metric and proceed to establish useful topological properties of such sets of subdistributions. Recall that a set X ⊆ Rⁿ is (Cauchy) closed if for every Cauchy sequence {xn | n ≥ 0} with limit x, if xn ∈ X for every n ≥ 0 then x is also in X.

Lemma 6.9. If X is a finite subset of Rⁿ then ⇃X is closed.

Proof. Straightforward. ⊓⊔

In Definition 4.8 we gave a definition of extreme policies for pLTS's of the form 〈S,Ωτ,→〉 and showed how they determine resolutions. Here we generalise these to derivative policies and show that these generalised policies can also be used to generate arbitrary weak derivatives of subdistributions over S.

Definition 6.12. A (static) derivative policy for a pLTS 〈S,Actτ,→〉 is a partial function dp : S ⇀ D(S) with the property that dp(s) = ∆ implies s τ−→ ∆. If dp is undefined at s, we write dp(s)↑; otherwise, we write dp(s)↓.

A derivative policy dp, as its name suggests, can be used to guide the derivation of a weak derivative. Suppose s̄ =⇒ ∆, using a derivation as given in Definition 6.4. Then we write s =⇒dp ∆ whenever, for all k ≥ 0,

(a) ∆→k(s) = ∆k(s) if dp(s)↓, and 0 otherwise;
(b) ∆k+1 = ∑s∈⌈∆→k⌉ ∆→k(s)·dp(s).

Intuitively these conditions mean that the derivation of ∆ from s is guided at each stage by the policy dp:

• Condition (a) implies that the division of ∆k into ∆→k, the subdistribution that will continue marching, and ∆×k, the subdistribution that will stop, is determined by the domain of the derivative policy dp.
• Condition (b) ensures that the derivation of the next stage ∆k+1 from ∆→k is determined by the action of the function dp on the support of ∆→k.

Lemma 6.10. Let dp be a derivative policy in a pLTS. Then

(a) If s =⇒dp ∆ and s =⇒dp Θ then ∆ = Θ.
(b) For every state s there exists some ∆ such that s =⇒dp ∆.

Proof. To prove part (a) consider the derivations of s =⇒ ∆ and s =⇒ Θ as in Definition 6.4, via the subdistributions ∆k, ∆→k, ∆×k and Θk, Θ→k, Θ×k respectively. Because both derivations are guided by the same derivative policy dp it is easy to show by induction on k that

  ∆k = Θk    ∆→k = Θ→k    ∆×k = Θ×k

from which ∆ = Θ follows immediately.
To prove (b) we use dp to generate subdistributions ∆k, ∆→k, ∆×k for each k ≥ 0 satisfying the constraints of Definition 6.4 and simultaneously those in Definition 6.12 above. The result will then follow by letting ∆ be ∑k≥0 ∆×k. ⊓⊔

The net effect of this lemma is that a derivative policy dp determines a total function from states to derivatives. Let Derdp : S → Dsub(S) be defined by letting Derdp(s) be the unique ∆ such that s =⇒dp ∆.
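
Operationally, Derdp can be computed by simply unfolding the derivation of Definition 6.12. The following is a minimal sketch under a hypothetical encoding (not fixed by the text) in which a subdistribution is a Python dict from states to probabilities and dp maps a state either to such a dict or to None (for dp(s)↑); since mass may diverge, the unfolding is truncated after a fixed number of rounds.

```python
# Sketch: the total function Der_dp of Lemma 6.10, computed by unfolding
# the policy-guided derivation of Definition 6.12.
# Hypothetical encoding: a subdistribution is a dict {state: probability};
# dp maps each state either to such a dict (dp(s) defined) or to None.

def der(dp, s, rounds=1000):
    """Approximate Der_dp(s): the unique Delta with s ==>_dp Delta.
    Mass still marching after `rounds` steps is dropped, so the result
    underapproximates Der_dp(s) when dp lets mass diverge."""
    marching = {s: 1.0}                  # Delta_k, restricted to its support
    stopped = {}                         # running sum of the Delta^x_k
    for _ in range(rounds):
        if not marching:
            break
        nxt = {}
        for state, mass in marching.items():
            if dp.get(state) is None:    # condition (a): this mass stops now
                stopped[state] = stopped.get(state, 0.0) + mass
            else:                        # condition (b): march via dp(state)
                for t, p in dp[state].items():
                    nxt[t] = nxt.get(t, 0.0) + mass * p
        marching = nxt
    return stopped
```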


It should be clear that the use of derivative policies limits considerably the scope for deriving weak derivations. Each particular policy can derive only one weak derivative, and moreover in a finitary pLTS there are only a finite number of derivative policies. Nevertheless we will show that this limitation is more apparent than real. In Section 4.6.1 we saw how the more restrictive extreme policies ep could in fact realise the maximum value attainable by any resolution of a finitely branching pLTS. Here we generalise this result by replacing resolutions with arbitrary reward functions.

Definition 6.13 (Rewards and payoffs). Let S be a set of states. A reward function is a function r : S → [−1,1]. With S = {s1, . . . , sn} we often consider a reward function as the n-dimensional vector 〈r(s1), . . . , r(sn)〉. In this way, we can use the notation r·∆ to stand for the inner product of two vectors.

Given such a reward function, we define the payoff function Prmax : S → R by

  Prmax(s) = sup{ r·∆ | s =⇒ ∆ }.

A priori these payoff functions for a given state s are determined by the set of weak derivatives of s. However the main result of this section is that they can in fact always be realised by derivative policies.

Theorem 6.8 (Realising payoffs). In a finitary pLTS, for every reward function r there exists some derivative policy dp such that Prmax(s) = r·Derdp(s).

Proof. As with Theorem 4.3 there is a temptation to give a constructive proof here, defining the effect of the required derivative policy dp at state s by considering the application of the reward function r to both s and all of its derivatives. However this is not possible, as the example below explains.
Instead the proof is non-constructive, requiring discounted policies. The overall structure of the proof is similar to that of Theorem 4.3, but the use of (discounted) derivative policies rather than extreme policies makes the details considerably different. Consequently the proof is spelled out in some detail in Section 6.5.2, culminating in Theorem 6.10. ⊓⊔

Example 6.10. Let us say that a derivative policy dp is max-seeking with respect to a reward function r if for all s ∈ S the following requirements are met.

1. If dp(s)↑ then r(s) ≥ Prmax(∆1) for all s τ−→ ∆1.
2. If dp(s) = ∆ then
   a. Prmax(∆) ≥ r(s) and
   b. Prmax(∆) ≥ Prmax(∆1) for all s τ−→ ∆1.

What a max-seeking policy does is to evaluate Prmax in advance, for a given reward function r, and then label each state s with the payoff value Prmax(s). The policy at any state s is then to compare r(s) with the expected label values Prmax(∆′) (i.e. Exp∆′(Prmax)) for each outgoing transition s τ−→ ∆′ and then to select the greatest among all those values. Note that for the policy to be well defined, we require that the pLTS under consideration is finitely branching.


[Figure: a pLTS with two states s0 and s1, where s0 has a τ self-loop and a transition s0 τ−→ (s0 1/2⊕ s1).]

Fig. 6.3 Max-seeking policies

In case that seems obvious, we now consider the pLTS in Figure 6.3 and apply the above definition of max-seeking policies to the reward function given by r(s0) = 0, r(s1) = 1. For both states a payoff of 1 is attainable eventually, thus Prmax(s0) = Prmax(s1) = 1, because we have s0 =⇒ s1 and s1 =⇒ s1. Hence, both states will be Prmax-labelled with 1. At state s0 the policy then makes a choice among three options: (1) to stay unmoved, yielding immediate payoff r(s0) = 0; (2) to take the transition s0 τ−→ s0; (3) to take the transition s0 τ−→ (s0 1/2⊕ s1). Clearly one of the latter two is chosen — but which? If it is option (3), then indeed the maximum payoff 1 can be achieved. If it is option (2), then in fact the overall payoff will be 0 because of divergence, so the policy would fail to attain the maximum payoff 1.

However, for properly discounted max-seeking policies, we show in Proposition 6.6 that they always attain the maximum payoffs.
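
To make this concrete, here is a quick supplementary calculation, anticipating the discounted payoff Pδ,rmax of Definition 6.15 below. Writing x for Pδ,rmax(s0) with δ < 1, we have Pδ,rmax(s1) = 1 and

  x = max( r(s0), δ·x, δ·(1/2·x + 1/2) ) = δ·(x+1)/2,   i.e.   x = δ/(2−δ),

since δ·(1/2·x + 1/2) > δ·x whenever x < 1. So for every δ < 1 a max-seeking policy with respect to δ and r is forced to choose option (3), and as δ approaches 1 its payoff δ/(2−δ) converges to the undiscounted maximum 1.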

With Theorem 6.8 at hand, we are in a position to prove the main result of this section, which says that in a finitary pLTS the set of weak derivatives from any state s, {∆ | s =⇒ ∆}, is generable by the convex closure of a finite set. The proof makes use of the separation theorem, Theorem 2.7.

Theorem 6.9 (Finite generability). Let P = {dp1, . . . , dpk} be the finite set of derivative policies in a finitary pLTS. For any state s and subdistribution ∆ in the pLTS, s =⇒ ∆ if and only if ∆ ∈ l{Derdpi(s) | 1 ≤ i ≤ k}.

Proof. (⇒) For convenience let X denote the set l{Derdpi(s) | 1 ≤ i ≤ k}. Suppose, for a contradiction, that s =⇒ ∆ for some ∆ not in X. Recall that we are assuming that the underlying state space is S = {s1, . . . , sn}, so that X is a subset of R^n. It is trivially bounded by [−1,1]^n, and by definition it is convex; by Lemma 6.9 it follows that X is (Cauchy) closed.
By the separation theorem, ∆ can be separated from X by a hyperplane H. What this means is that there is some function rH : S → R and constant c ∈ R such that either

(a) rH·Θ < c for all Θ ∈ X and rH·∆ > c,
(b) or, rH·Θ > c for all Θ ∈ X and rH·∆ < c.


In fact from case (b) we can obtain case (a) by negating both the constant c and the components of the function rH; so we can assume (a) to be true. Moreover, by scaling with respect to the largest |rH(si)|, 1 ≤ i ≤ n, we can assume that rH is actually a reward function.
In particular (a) means that rH·Derdpi(s) < c, and therefore that

  rH·Derdpi(s) < rH·∆

for each of the derivative policies dpi. But this contradicts Theorem 6.8, which claims that there must be some 1 ≤ i ≤ k such that rH·Derdpi(s) = PrHmax(s) ≥ rH·∆.
(⇐) Note that by definition s =⇒ Derdpi(s) for every derivative policy dpi with 1 ≤ i ≤ k. By Proposition 6.3 and Remark 6.1, the relation =⇒ is convex. It follows that if ∆ ∈ l{Derdp1(s), . . . , Derdpk(s)} then s =⇒ ∆. ⊓⊔
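
Theorem 6.9 reduces the infinite question "does s =⇒ ∆ hold?" to a finite one: membership of ∆ in the convex closure of the k points Derdpi(s). The following is a minimal sketch of that membership test as a linear-programming feasibility problem, assuming the points have already been computed (e.g. with the der sketch above) and encoded as vectors over S; it uses scipy, which is an implementation choice, not anything fixed by the text.

```python
# Sketch: is `target` in the convex closure of the finitely many `points`?
# By Theorem 6.9, with points = [Der_dp_i(s)], this decides s ==> target.
import numpy as np
from scipy.optimize import linprog

def in_convex_closure(points, target):
    P = np.asarray(points, dtype=float)   # shape (k, n): one row per policy
    t = np.asarray(target, dtype=float)   # shape (n,)
    k = P.shape[0]
    # feasibility: weights w >= 0 with sum(w) = 1 and sum_i w_i * P_i = t
    A_eq = np.vstack([P.T, np.ones((1, k))])
    b_eq = np.concatenate([t, [1.0]])
    res = linprog(c=np.zeros(k), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * k, method="highs")
    return res.success
```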

Extreme policies, as given in Definition 4.8, are particular kinds of derivative policies, designed for pLTS's of the form 〈R,Ωτ,→R〉. The significant constraint on extreme policies is that for any state s, if s τ−→ then ep(s) must be defined. As a consequence, in the computation determined by ep, if a state can contribute to the computation at any stage it must contribute.

Lemma 6.11. Let ep be any extreme policy. Then s =⇒ep ∆ implies s =⇒≻ ∆.

Proof. Consider the derivation of ∆ as in Definition 6.4, determined by the extreme policy ep. Since ∆ = ∑k≥0 ∆×k it is sufficient to show that each ∆×k is stable, that is, s τ−→ implies s ∉ ⌈∆×k⌉.
So suppose s τ−→. Since ep is an extreme policy, Definition 4.8 ensures that ep(s) is defined. From the definition of a computation, Definition 6.4, we know ∆k = ∆→k + ∆×k, and since the computation is guided by the policy ep we have that ∆→k(s) = ∆k(s). An immediate consequence is that ∆×k(s) = 0. ⊓⊔

As a consequence the finite generability result, Theorem 6.9, specialises to extreme derivatives.

Corollary 6.4. Let {ep1, . . . , epk} be the finite set of extreme policies of a finitary ω-respecting pLTS 〈S,Ωτ,→R〉. For any state s and subdistribution ∆ in the pLTS, s =⇒≻ ∆ if and only if ∆ ∈ l{Derepi(s) | 1 ≤ i ≤ k}.

Proof. One direction follows immediately from the previous lemma. Conversely suppose s =⇒≻ ∆. By Theorem 6.9, ∆ = ∑1≤i≤n pi·Derdpi(s) for some finite collection of derivative policies dpi, where we can assume that each pi ≥ 0. Because ∆ is stable, that is, s 6τ−→ for every s ∈ ⌈∆⌉, we show that each derivative policy dpi can be transformed into an extreme policy epi such that Derepi(s) = Derdpi(s), from which the result will follow.
First note it is sufficient to define epi on the set of states t accessible from s via the policy dpi; on the remaining states in S, epi can be defined arbitrarily, so as to satisfy the requirements of Definition 4.8. So consider the derivation of Derdpi(s) as in Definition 6.4, determined by dpi, and suppose t ∈ ⌈∆k⌉ for some k ≥ 0. There are three cases:


(i) Suppose t τ−→. Since ∆ is stable we know t ∉ ⌈∆×k⌉, and therefore by definition dpi(t) is defined. So in this case we let epi(t) be the same as dpi(t).
(ii) Suppose t ω−→, in which case, since the pLTS is ω-respecting, we know t 6τ−→, and therefore dpi(t) is not defined. Here we choose epi(t) arbitrarily so as to satisfy t ω−→ epi(t).
(iii) Otherwise we leave epi(t) undefined.

By definition epi is an extreme policy, since it satisfies conditions (a) and (b) in Definition 4.8, and by construction Derepi(s) = Derdpi(s). ⊓⊔

This corollary gives a useful method for calculating the set of extreme derivatives of a given state, and therefore the result of applying a test to a process.

Example 6.11. Consider again Figure 4.4, discussed in Example 6.9, where we have the ω-respecting pLTS obtained by applying the test a.ω to the process Q2. There are only two extreme policies for this pLTS, denoted by ep0 and ep1. They differ only for the state s1, with ep0(s1) = Θ0 and ep1(s1) = Θ1. The discussion in Example 6.9 explained how

  Derep0(s1) = ω.0    Derep1(s1) = 1/2·s3 + 1/2·ω.0

By Corollary 6.4 we know that every possible extreme derivative of [[T |Act Q2]] takes the form

  q·ω.0 + (1−q)·(1/2·s3 + 1/2·ω.0)

for some 0 ≤ q ≤ 1. Since $(ω.0) = −→ω and $(1/2·s3 + 1/2·ω.0) = 1/2·−→ω it follows that

  Ad(T,Q2) = { q·−→ω | q ∈ [1/2, 1] }.

6.5.2 Realising payoffs

In this section we expound some less obvious properties of derivations, relating to their behaviour at infinity. One important property is that if we associate each state with a reward, which is a value in [−1,1], then the maximum payoff realisable by following all possible weak derivations can in fact be achieved by some static derivative policy, as stated in Theorem 6.8. The property depends on our working within finitary pLTS's — that is, ones in which the state space is finite and the (unlifted) transition relation is finite-branching. We first need to formalise some concepts such as discounted weak derivation and discounted payoff.

Definition 6.14 (Discounted weak derivation). The discounted weak derivation ∆ =⇒δ ∆′ for discount factor δ (0 ≤ δ ≤ 1) is obtained from a weak derivation by discounting each τ transition by δ. That is, there is a collection of ∆→k and ∆×k satisfying

  ∆ = ∆→0 + ∆×0
  ∆→0 τ−→ ∆→1 + ∆×1
  ...
  ∆→k τ−→ ∆→k+1 + ∆×k+1
  ...

such that ∆′ = ∑∞k=0 δ^k·∆×k.

It is trivial that the relation =⇒1 coincides with =⇒ given in Definition 6.4.
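
For a small concrete instance (a hypothetical one-transition system, not one drawn from the text), suppose a state s whose only transition is s τ−→ (0 1/2⊕ s), with 0 stable. The maximal derivation has ∆→k = 1/2^k·s and ∆×k = 1/2^k·0 for k ≥ 1, so

  s =⇒δ ∑∞k=1 (δ/2)^k·0 = (δ/(2−δ))·0,

which for δ = 1 yields the ordinary weak derivation s =⇒ 0.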

Definition 6.15 (Discounted payoff). Given a pLTS with state space S, a discount δ, and reward function r, we define the discounted payoff function Pδ,rmax : S → R by

  Pδ,rmax(s) = sup{ r·∆′ | s =⇒δ ∆′ }

and we will generalise it to be of type Dsub(S) → R by letting

  Pδ,rmax(∆) = ∑s∈⌈∆⌉ ∆(s)·Pδ,rmax(s).

Definition 6.16 (Max-seeking policy). Given a pLTS, discount δ and reward function r, we say a static derivative policy dp is max-seeking with respect to δ and r if for all s the following requirements are met.

1. If dp(s)↑, then r(s) ≥ δ·Pδ,rmax(∆1) for all s τ−→ ∆1.
2. If dp(s) = ∆ then
   a. δ·Pδ,rmax(∆) ≥ r(s) and
   b. Pδ,rmax(∆) ≥ Pδ,rmax(∆1) for all s τ−→ ∆1.

Lemma 6.12. Given a finitely branching pLTS, discount δ and reward function r, there always exists a max-seeking policy.

Proof. Given a pLTS, discount δ and reward function r, the discounted payoff Pδ,rmax(s) can be calculated for each state s. Then we can define a derivative policy dp in the following way. For any state s, if r(s) ≥ δ·Pδ,rmax(∆1) for all s τ−→ ∆1, then we set dp undefined at s. Otherwise, we choose a transition s τ−→ ∆ among the finite number of outgoing transitions from s such that Pδ,rmax(∆) ≥ Pδ,rmax(∆1) for all other transitions s τ−→ ∆1, and we set dp(s) = ∆. ⊓⊔
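
The proof of Lemma 6.14 below shows that Pδ,rmax satisfies the fixed-point equation Pδ,rmax(s) = max( r(s), δ·sup{Pδ,rmax(∆1) | s τ−→ ∆1} ), and for δ < 1 the contraction argument of Lemma 6.13 justifies computing it by plain iteration. A minimal sketch, under the same hypothetical dict-based encoding as before (trans[s] lists the τ-successors of s, each a dict from states to probabilities):

```python
# Sketch: compute P^{delta,r}_max by iterating its fixed-point equation,
# then read off a max-seeking policy in the manner of Lemma 6.12.
# Encoding as before: trans[s] is the list of tau-successor distributions
# of s; reward[s] lies in [-1, 1]; 0 <= delta < 1.

def exp_val(dist, v):
    return sum(p * v[t] for t, p in dist.items())

def discounted_payoff(trans, reward, delta, eps=1e-12):
    v = dict(reward)                               # any initial guess works
    while True:
        new = {s: max([reward[s]] + [delta * exp_val(d, v) for d in trans[s]])
               for s in trans}
        if max(abs(new[s] - v[s]) for s in trans) < eps:
            return new
        v = new

def max_seeking_policy(trans, reward, delta):
    v = discounted_payoff(trans, reward, delta)
    dp = {}
    for s in trans:
        if trans[s]:
            best = max(trans[s], key=lambda d: exp_val(d, v))
            if reward[s] < delta * exp_val(best, v):
                dp[s] = best                       # else dp(s) stays undefined
    return dp
```

On the Figure 6.3 system with r(s0) = 0, r(s1) = 1 and δ = 0.9, this returns Pδ,rmax(s0) = 0.9/1.1 ≈ 0.818 = δ/(2−δ), and the policy chooses the branching transition, matching the calculation after Example 6.10.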

Given a pLTS, discount δ, reward function r, and derivative policy dp, we define the function Fδ,dp,r : (S → R) → (S → R) by letting

  Fδ,dp,r(f)(s) = r(s) if dp(s)↑, and Fδ,dp,r(f)(s) = δ·f(∆) if dp(s) = ∆   (6.16)

where f(∆) = ∑s∈⌈∆⌉ ∆(s)·f(s).


Lemma 6.13. Given a pLTS, discount δ < 1, reward function r, and derivative policy dp, the function Fδ,dp,r has a unique fixed point.

Proof. We first show that the function Fδ,dp,r is a contraction mapping. Let f, g be any two functions of type S → R.

  |Fδ,dp,r(f) − Fδ,dp,r(g)|
  = sup{ |Fδ,dp,r(f)(s) − Fδ,dp,r(g)(s)| | s ∈ S }
  = sup{ |Fδ,dp,r(f)(s) − Fδ,dp,r(g)(s)| | s ∈ S and dp(s)↓ }
  = δ·sup{ |f(∆) − g(∆)| | s ∈ S and dp(s) = ∆ for some ∆ }
  ≤ δ·sup{ |f(s′) − g(s′)| | s′ ∈ S }
  = δ·|f − g|

So Fδ,dp,r is a contraction with factor δ < 1. By the Banach unique fixed-point theorem (Theorem 2.8), the function Fδ,dp,r has a unique fixed point. ⊓⊔

Lemma 6.14. Given a pLTS, discount δ, reward function r, and max-seeking policy dp, the function Pδ,rmax is a fixed point of Fδ,dp,r.

Proof. We need to show that Fδ,dp,r(Pδ,rmax)(s) = Pδ,rmax(s) holds for any state s. We distinguish two cases.

1. If dp(s)↑, then Fδ,dp,r(Pδ,rmax)(s) = r(s) = Pδ,rmax(s) as expected.
2. If dp(s) = ∆, then the arguments are more involved. First we observe that if s =⇒δ ∆′, then by Definition 6.14 there exist some ∆→0, ∆×0, ∆1, ∆′′ such that s = ∆→0 + ∆×0, ∆→0 τ−→ ∆1, ∆1 =⇒δ ∆′′ and ∆′ = ∆×0 + δ·∆′′. So we can do the following calculation.


  Pδ,rmax(s)
  = sup{ r·∆′ | s =⇒δ ∆′ }
  = sup{ r·(∆×0 + δ·∆′′) | s = ∆→0 + ∆×0, ∆→0 τ−→ ∆1, and ∆1 =⇒δ ∆′′, for some ∆→0, ∆×0, ∆1, ∆′′ }
  = sup{ r·∆×0 + δ·(r·∆′′) | s = ∆→0 + ∆×0, ∆→0 τ−→ ∆1, and ∆1 =⇒δ ∆′′, for some ∆→0, ∆×0, ∆1, ∆′′ }
  = sup{ r·∆×0 + δ·sup{ r·∆′′ | ∆1 =⇒δ ∆′′ } | s = ∆→0 + ∆×0 and ∆→0 τ−→ ∆1, for some ∆→0, ∆×0, ∆1 }
  = sup{ r·∆×0 + δ·Pδ,rmax(∆1) | s = ∆→0 + ∆×0 and ∆→0 τ−→ ∆1, for some ∆→0, ∆×0, ∆1 }
  = sup{ (1−p)·r(s) + p·δ·Pδ,rmax(∆1) | p ∈ [0,1] and s τ−→ ∆1, for some ∆1 }   [s can be split into p·s + (1−p)·s only]
  = sup{ (1−p)·r(s) + p·δ·sup{ Pδ,rmax(∆1) | s τ−→ ∆1 } | p ∈ [0,1] }
  = max( r(s), δ·sup{ Pδ,rmax(∆1) | s τ−→ ∆1 } )
  = δ·Pδ,rmax(∆)   [as dp is max-seeking]
  = Fδ,dp,r(Pδ,rmax)(s)   ⊓⊔

Definition 6.17. Let ∆ be a subdistribution and dp a static derivative policy. We define a collection of subdistributions ∆k as follows.

  ∆0 = ∆
  ∆k+1 = ∑{ ∆k(s)·dp(s) | s ∈ ⌈∆k⌉ and dp(s)↓ }   for all k ≥ 0.

Then ∆×k is obtained from ∆k by letting

  ∆×k(s) = 0 if dp(s)↓, and ∆×k(s) = ∆k(s) otherwise,

for all k ≥ 0. Then we write ∆ =⇒δ,dp ∆′ for the discounted weak derivation that determines a unique subdistribution ∆′ with ∆′ = ∑∞k=0 δ^k·∆×k.

In other words, if ∆ =⇒δ,dp ∆′ then ∆′ comes from the discounted weak derivation ∆ =⇒δ ∆′ that is constructed by following the derivative policy dp when choosing τ transitions from each state. In the special case where the discount factor δ = 1, we see that =⇒1,dp becomes =⇒dp as defined on page 176.

Definition 6.18 (Policy-following payoff). Given a discount δ, reward function r, and derivative policy dp, the policy-following payoff function Pδ,dp,r : S → R is defined by

  Pδ,dp,r(s) = r·∆′

where ∆′ is determined by the discounted weak derivation s =⇒δ,dp ∆′.
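
This payoff, too, is directly computable by unfolding Definition 6.17; the sketch below truncates the unfolding, which for δ < 1 loses at most a δ^steps tail. Same hypothetical dict-based encoding as in the earlier sketches.

```python
# Sketch: approximate P^{delta,dp,r}(s) of Definition 6.18 by unfolding the
# discounted derivation s ==>_{delta,dp} Delta' for finitely many rounds.
# dp maps a state to a dict {state: probability} or to None (undefined).

def policy_payoff(dp, reward, s, delta, steps=1000):
    cur = {s: 1.0}                     # Delta_k, starting from state s
    total, factor = 0.0, 1.0           # factor = delta ** k at round k
    for _ in range(steps):
        if not cur:
            break
        nxt = {}
        for state, mass in cur.items():
            if dp.get(state) is None:
                total += factor * mass * reward[state]   # Delta^x_k stops
            else:
                for t, p in dp[state].items():           # mass follows dp
                    nxt[t] = nxt.get(t, 0.0) + mass * p
        cur, factor = nxt, factor * delta
    return total
```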


Lemma 6.15. For any discount δ, reward function r, and derivative policy dp, the function Pδ,dp,r is a fixed point of Fδ,dp,r.

Proof. We need to show that Fδ,dp,r(Pδ,dp,r)(s) = Pδ,dp,r(s) holds for any state s. There are two cases.

1. If dp(s)↑, then s =⇒δ,dp ∆′ implies ∆′ = s. Therefore,

  Pδ,dp,r(s) = r(s) = Fδ,dp,r(Pδ,dp,r)(s)

as required.
2. Suppose dp(s) = ∆1. If s =⇒δ,dp ∆′ then s τ−→ ∆1, ∆1 =⇒δ,dp ∆′′ and ∆′ = δ·∆′′ for some subdistribution ∆′′. Therefore,

  Pδ,dp,r(s) = r·∆′ = r·(δ·∆′′) = δ·(r·∆′′) = δ·Pδ,dp,r(∆1) = Fδ,dp,r(Pδ,dp,r)(s)   ⊓⊔

Proposition 6.6. Let δ ∈ [0,1) be a discount and r a reward function. If dp is a max-seeking policy with respect to δ and r, then Pδ,rmax = Pδ,dp,r.

Proof. By Lemma 6.13, the function Fδ,dp,r has a unique fixed point. By Lemmas 6.14 and 6.15, both Pδ,rmax and Pδ,dp,r are fixed points of the same function Fδ,dp,r, which means that Pδ,rmax and Pδ,dp,r coincide. ⊓⊔

Lemma 6.16. Suppose s =⇒ ∆′ with ∆′ = ∑∞i=0 ∆×i for some properly related ∆×i. Let {δj}∞j=0 be a nondecreasing sequence of discount factors converging to 1. Then for any reward function r it holds that

  r·∆′ = limj→∞ ∑∞i=0 δj^i·(r·∆×i).

Proof. Let f : N×N → R be the function defined by f(i, j) = δj^i·(r·∆×i). We check that f satisfies the four conditions in Proposition 4.3.

1. f satisfies condition C1. For all i, j1, j2 ∈ N, if j1 ≤ j2 then δj1^i ≤ δj2^i. It follows that

  |f(i, j1)| = |δj1^i·(r·∆×i)| ≤ |δj2^i·(r·∆×i)| = |f(i, j2)|.

2. f satisfies condition C2. For any i ∈ N, we have

  limj→∞ |f(i, j)| = limj→∞ |δj^i·(r·∆×i)| = |r·∆×i|.   (6.17)


3. f satisfies condition C3. For any n ∈ N, the partial sum Sn = ∑ni=0 limj→∞ |f(i, j)| is bounded because

  ∑ni=0 limj→∞ |f(i, j)| = ∑ni=0 |r·∆×i| ≤ ∑∞i=0 |r·∆×i| ≤ ∑∞i=0 |∆×i| = |∆′|

where the first equality is justified by (6.17).
4. f satisfies condition C4. For any i, j1, j2 ∈ N with j1 ≤ j2, suppose we have f(i, j1) = δj1^i·(r·∆×i) > 0. Then r·∆×i > 0 and it follows immediately that f(i, j2) = δj2^i·(r·∆×i) > 0.

Therefore, we can use Proposition 4.3 to do the following inference.

  limj→∞ ∑∞i=0 δj^i·(r·∆×i)
  = ∑∞i=0 limj→∞ δj^i·(r·∆×i)
  = ∑∞i=0 r·∆×i
  = r·∑∞i=0 ∆×i
  = r·∆′   ⊓⊔

Corollary 6.5. Let {δj}∞j=0 be a nondecreasing sequence of discount factors converging to 1. For any derivative policy dp and reward function r, it holds that P1,dp,r = limj→∞ Pδj,dp,r.

Proof. We need to show that P1,dp,r(s) = limj→∞ Pδj,dp,r(s) for any state s. Note that for any discount δj, each state s enables a unique discounted weak derivation s =⇒δj,dp ∆j such that ∆j = ∑∞i=0 δj^i·∆×i for some properly related subdistributions ∆×i. Let ∆′ = ∑∞i=0 ∆×i. We have s =⇒1,dp ∆′. Then we can infer that

  limj→∞ Pδj,dp,r(s)
  = limj→∞ r·∆j
  = limj→∞ r·∑∞i=0 δj^i·∆×i
  = limj→∞ ∑∞i=0 δj^i·(r·∆×i)
  = r·∆′   [by Lemma 6.16]
  = P1,dp,r(s)   ⊓⊔

Theorem 6.10. In a finitary pLTS, for any reward function r there exists a derivative policy dp such that P1,rmax = P1,dp,r.

Proof. Let r be a reward function. By Proposition 6.6, for every discount factor δ < 1 there exists a max-seeking derivative policy dp with respect to δ and r such that

  Pδ,rmax = Pδ,dp,r.   (6.18)

Since the pLTS is finitary, there are finitely many different static derivative policies. There must exist a derivative policy dp such that (6.18) holds for infinitely many


discount factors. In other words, for every nondecreasing sequence {δn}∞n=0 converging to 1, there exists a subsequence {δnj}∞j=0 and a derivative policy dp⋆ such that

  Pδnj,rmax = Pδnj,dp⋆,r   for all j ≥ 0.   (6.19)

For any state s, we infer as follows.

  P1,rmax(s)
  = sup{ r·∆′ | s =⇒ ∆′ }
  = sup{ limj→∞ ∑∞i=0 δnj^i·(r·∆×i) | s =⇒ ∆′ with ∆′ = ∑∞i=0 ∆×i }   [by Lemma 6.16]
  = limj→∞ sup{ ∑∞i=0 δnj^i·(r·∆×i) | s =⇒ ∆′ with ∆′ = ∑∞i=0 ∆×i }
  = limj→∞ sup{ r·∑∞i=0 δnj^i·∆×i | s =⇒ ∆′ with ∆′ = ∑∞i=0 ∆×i }
  = limj→∞ sup{ r·∆′′ | s =⇒δnj ∆′′ }
  = limj→∞ Pδnj,rmax(s)
  = limj→∞ Pδnj,dp⋆,r(s)   [by (6.19)]
  = P1,dp⋆,r(s)   [by Corollary 6.5]   ⊓⊔

6.5.3 Consequences

In this section we outline two major consequences of Theorem 6.9, which informally means that the set of weak derivatives from a given state is the convex closure of a finite set. The first is straightforward, and is explained in the following two results.

Lemma 6.17 (Closure of =⇒). For any state s in a finitary pLTS the set of derivatives {∆ | s =⇒ ∆} is closed and convex.

Proof. Let {dp1, . . . , dpn} (n ≥ 1) be all the derivative policies in the finitary pLTS. Consider the two sets C = l{Derdpi(s) | 1 ≤ i ≤ n} and D = {∆′ | s =⇒ ∆′}. By Theorem 6.9, D coincides with C, the convex closure of a finite set. By Lemma 6.9, it is also Cauchy closed. ⊓⊔

The restriction here to finitary pLTS's is essential, as the following examples demonstrate.

Example 6.12. Consider the finite-state but infinitely branching pLTS containing three states s1, s2, s3 and the countable set of transitions given by

  s1 τ−→ (s2 1/2^n⊕ s3)   for n ≥ 1.

For convenience let ∆n denote the distribution (s2 1/2^n⊕ s3). Then {∆n | n ≥ 1} is a Cauchy sequence with limit s3. Trivially the set {∆ | s1 =⇒ ∆} contains every ∆n, but it does not contain the limit of the sequence, thus it is not closed.
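
Concretely, viewing these distributions as points of R³ over {s1, s2, s3}, we have ∆n = (0, 1/2^n, 1−1/2^n) and s3 = (0, 0, 1), so the Euclidean distance |∆n − s3| = √2·(1/2^n), which indeed tends to 0.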


Example 6.13. By adapting Example 6.12, we obtain the following pLTS that is finitely branching but has infinitely many states. Let t1 and t2 be two distinct states. Moreover, for each n ≥ 1 there is a state sn with two outgoing transitions: sn τ−→ sn+1 and sn τ−→ (t1 1/2^n⊕ t2). Let ∆n denote the distribution (t1 1/2^n⊕ t2). Then {∆n | n ≥ 1} is a Cauchy sequence with limit t2. The set {∆ | s1 =⇒ ∆} is not closed because it contains each ∆n but not the limit t2.

Corollary 6.6 (Closure of a=⇒). For any state s in a finitary pLTS the set

  {∆ | s a=⇒ ∆}

is closed and convex.

Proof. We first introduce a preliminary concept. We say a subset D ⊆ Dsub(S) is finitely generable whenever there is some finite set F ⊆ Dsub(S) such that D = lF. A relation R ⊆ X × Dsub(S) is finitely generable if for every x in X the set x·R is finitely generable. We observe that

(i) if a set is finitely generable, then it is closed and convex;
(ii) if R1, R2 ⊆ Dsub(S) × Dsub(S) are finitely generable then so is their composition R1·R2.

The first property is a direct consequence of the definition of finite generability. To prove the second property, we let BiΦ be a finite set of subdistributions such that Φ·Ri = lBiΦ for i = 1, 2. Then one can check that

  ∆·(R1·R2) = l(∪{B2Θ | Θ ∈ B1∆})

which implies that finite generability is preserved under composition of relations. Notice that the relation a=⇒ is a composition of three stages: =⇒ · a−→ · =⇒. In the proof of Lemma 6.17 we have shown that =⇒ is finitely generable. In a finitary pLTS, the relation a−→ is also finitely generable. It follows from property (ii) that a=⇒ is finitely generable. By property (i) we have that a=⇒ is closed and convex. ⊓⊔

Corollary 6.7. In a finitary pLTS, the relation a=⇒ is the lifting of the closed and convex relation =⇒S a−→=⇒, where s =⇒S ∆ means s =⇒ ∆.

Proof. The relation =⇒S a−→=⇒ is a=⇒ restricted to point distributions. We have shown that a=⇒ is closed and convex in Corollary 6.6. Therefore, =⇒S a−→=⇒ is closed and convex. Its lifting coincides with a=⇒, which can be shown by arguments analogous to those in the proof of Proposition 6.3. ⊓⊔

The second consequence of Theorem 6.9 concerns the manner in which divergent computations arise in pLTS's. Consider again the infinite-state pLTS given in Example 6.5. There is no state s that wholly diverges, that is, satisfying s =⇒ ε, yet there are many partially divergent computations. In fact for every k ≥ 2 we have sk =⇒ (1/k)·[[a.0]]. This cannot arise in a finitary pLTS: if there is any partial derivation in a finitary pLTS, ∆ =⇒ ∆′ with |∆| > |∆′|, then there is some state in the pLTS that wholly diverges.

We say a pLTS is convergent if s =⇒ ε for no state s ∈ S.


Lemma 6.18. Let ∆ be a subdistribution in a finite-state, convergent and deterministic pLTS. If ∆ =⇒ ∆′ then |∆| = |∆′|.

Proof. Since the pLTS is convergent, s =⇒ ε for no state s ∈ S. In other words, each τ sequence from a state s is finite and ends with a distribution that cannot enable a τ transition. In a deterministic pLTS, each state has at most one outgoing transition. So from each s there is a unique τ sequence, of some length ns ≥ 0:

  s τ−→ ∆1 τ−→ ∆2 τ−→ ··· τ−→ ∆ns 6τ−→

Let ps be ∆ns(s′), where s′ is any state in the support of ∆ns such that s′ 6τ−→. We set

  n = max{ ns | s ∈ S }    p = min{ ps | s ∈ S }

where n and p are well defined as S is assumed to be a finite set. Now let ∆ =⇒ ∆′ be any weak derivation constructed by a collection of ∆→k, ∆×k such that

  ∆ = ∆→0 + ∆×0
  ∆→0 τ−→ ∆→1 + ∆×1
  ...
  ∆→k τ−→ ∆→k+1 + ∆×k+1
  ...

with ∆′ = ∑∞k=0 ∆×k. From each ∆→kn+i with k, i ∈ N, the block of n steps of τ transitions leads to ∆→(k+1)n+i such that |∆→(k+1)n+i| ≤ |∆→kn+i|·(1−p). It follows that

  ∑∞j=0 |∆→j| = ∑n−1i=0 ∑∞k=0 |∆→kn+i| ≤ ∑n−1i=0 ∑∞k=0 |∆→i|·(1−p)^k = ∑n−1i=0 |∆→i|·(1/p) ≤ |∆→0|·(n/p).

Therefore, we have that limk→∞ |∆→k| = 0, which in turn means that |∆′| = |∆|. ⊓⊔

Corollary 6.8 (Zero-one law - deterministic case). If for some static derivative policy dp over a finite-state pLTS there is for some s a derivation s =⇒dp ∆′ with |∆′| < 1, then in fact for some (possibly different) state sε we have sε =⇒dp ε.

Proof. Suppose that for no state s do we have s =⇒dp ε. Then the pLTS induced by dp is convergent. Since it is obviously finite-state and deterministic, we apply Lemma 6.18 and obtain |∆′| = |s| = 1, contradicting the assumption that |∆′| < 1. Therefore, there must exist some state sε that wholly diverges. ⊓⊔
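
In the policy-induced pLTS this has a simple graph-theoretic reading: a state wholly diverges under dp exactly when no dp-undefined state is reachable from it through dp, since any reachable undefined state would retain positive stopped mass. A minimal sketch of that observation, under the same hypothetical dict-based encoding as before:

```python
# Sketch: the states with s ==>_dp eps, found by reachability in the graph
# induced by a static policy dp ({state: prob} per defined state, else None).

def wholly_diverging(dp, states):
    result = set()
    for s in states:
        seen, stack, leaks = set(), [s], False
        while stack:
            u = stack.pop()
            if u in seen:
                continue
            seen.add(u)
            if dp.get(u) is None:
                leaks = True          # some mass stops at u
                break
            stack.extend(dp[u])       # visit the support of dp(u)
        if not leaks:
            result.add(s)
    return result
```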

Although it is possible to have processes that diverge with some probability strictly between zero and one, in a finitary system it is possible to "distill" that divergence in the sense that in many cases we can limit our analyses to processes


that either wholly diverge (can do so with probability one) or wholly converge (can diverge only with probability zero). This property is based on the zero-one law for finite-state probabilistic systems, and we present the aspects of it that we need here.

Lemma 6.19 (Distillation of divergence - deterministic case). For any state s and static derivative policy dp over a finite-state pLTS, if there is a derivation s =⇒dp ∆′ then there is a probability p and full distributions ∆′1, ∆′ε such that s =⇒ (∆′1 p⊕ ∆′ε) and ∆′ = p·∆′1 and ∆′ε =⇒ ε.

Proof. We modify dp so as to obtain a static policy dp′ by setting dp′(t) = dp(t), except when t =⇒dp ε, in which case we set dp′(t)↑. The new policy determines a unique weak derivation s =⇒dp′ ∆′′ for some subdistribution ∆′′, and induces a sub-pLTS of the pLTS induced by dp. Note that this sub-pLTS is deterministic and convergent. By Lemma 6.18, we know that |∆′′| = |s| = 1. We split ∆′′ up into ∆′′1 + ∆′′ε so that each state in ⌈∆′′ε⌉ is wholly divergent under policy dp and ∆′′1 is supported by all other states. From ∆′′ε the policy dp determines the weak derivation ∆′′ε =⇒dp ε. Combining the two weak derivations we have s =⇒dp′ ∆′′1 + ∆′′ε =⇒dp ∆′′1. As we only divide the original weak derivation into two stages, and do not change the τ transition from each state, the final subdistribution will not change; thus ∆′′1 = ∆′. Finally we determine p, ∆′1 and ∆′ε by letting p = |∆′|, ∆′1 = (1/p)·∆′ and ∆′ε = (1/(1−p))·∆′′ε. ⊓⊔

Theorem 6.11 (Distillation of divergence - general case). For any s, ∆′ in a finitary pLTS with s =⇒ ∆′ there is a probability p and full distributions ∆′1, ∆′ε such that s =⇒ (∆′1 p⊕ ∆′ε) and ∆′ = p·∆′1 and ∆′ε =⇒ ε.

Proof. Let {dpi | i ∈ I} (I a finite index set) be all the static derivative policies in the finitary pLTS. Each policy determines a weak derivation s =⇒dpi ∆′i. From Theorem 6.9 we know that if s =⇒ ∆′ then ∆′ = ∑i∈I pi·∆′i for some pi with ∑i∈I pi = 1. By Lemma 6.19, for each i ∈ I, there is a probability qi and full distributions ∆′i,1, ∆′i,ε such that s =⇒ (∆′i,1 qi⊕ ∆′i,ε), ∆′i = qi·∆′i,1, and ∆′i,ε =⇒ ε. Finally we determine p, ∆′1 and ∆′ε by letting p = |∑i∈I pi·qi·∆′i,1|, ∆′1 = (1/p)·∆′, and ∆′ε = (1/(1−p))·∑i∈I pi·(1−qi)·∆′i,ε. They satisfy our requirements just by noting that

  s =⇒ ∑i∈I pi·(∆′i,1 qi⊕ ∆′i,ε) = ∆′1 p⊕ ∆′ε   ⊓⊔

The requirement that the pLTS be finitary is essential for this distillation of divergence, as we explain in the following examples.

Example 6.14 (Revisiting Example 6.5). The pLTS in Example 6.5 is an infinite-state system over states sk for all k ≥ 2, where the probability of convergence is 1/k from any state sk: a situation where distillation of divergence fails because all the states partially diverge, yet there is no single state that wholly diverges.

Example 6.15. Consider the finite-state but infinitely branching pLTS described in Figure 6.4; this consists of two states s and 0 together with a k-indexed set of transitions


[Figure: a "flower" of τ-cycles through s, one petal for each k ≥ 2; on petal k the cycle exits to 0 with probability 1/k² (so 1/4 and 3/4 for k = 2, 1/9 and 8/9 for k = 3, and so on).]

There are two states s and 0. To diverge from s with probability 1−1/k, start at "petal" k and take successive τ-loops anti-clockwise from there. Yet, although divergence with arbitrarily high probability is present, complete probability-1 divergence is nowhere possible. Either infinite states or infinite branching is necessary for this anomaly.

Fig. 6.4 Infinitely branching flower.

  s τ−→k ([[0]] 1/k²⊕ s)   for k ≥ 2.   (6.20)

This pLTS is obtained from the infinite-state pLTS described in Example 6.5 by identifying all of the states si and replacing the state a.0 with 0.
As we have seen, by taking transitions s τ−→k · τ−→k+1 · τ−→k+2 ··· we have s =⇒ (1/k)·[[0]] for any k ≥ 2; but crucially s 6=⇒ ε. Since trivially 0 6=⇒ ε, there is no full distribution ∆ such that ∆ =⇒ ε.
Now to contradict the distillation of divergence for this pLTS, note that s =⇒ (1/2)·[[0]], but this derivation cannot be factored in the required manner into s =⇒ (∆′1 p⊕ ∆′ε), because no possible full distribution ∆′ε can exist satisfying ∆′ε =⇒ ε.
Corollary 6.8 and Lemma 6.19 are not affected by infinite branching, because they are restricted to the deterministic case (i.e. the case of no branching at all). What fails is the combination of a number of deterministic distillations to make a nondeterministic one, in Theorem 6.11: it depends on Theorem 6.9, which in turn requires finite branching.


Corollary 6.9 (Zero-one law - general case). If in a finitary pLTS we have ∆, ∆′ with ∆ =⇒ ∆′ and |∆| > |∆′|, then there is some state s′, reachable with non-zero probability from ∆, such that s′ =⇒ ε. That is, the pLTS based on ∆ must have a wholly diverging state somewhere.

Proof. Assume at first that |∆| = 1; then the result is immediate from Theorem 6.11, since any s′ ∈ ⌈∆′ε⌉ will do. The general result is obtained by dividing the given derivation by |∆|. ⊓⊔

6.6 The failure simulation preorder

We have already defined a failure simulation preorder in Definition 5.5, which looks natural for finite processes. However, for general processes divergence often exists, which makes it subtle to formulate a good notion of failure simulation preorder.
This section is divided in four: the first subsection presents a definition of the failure simulation preorder in an arbitrary pLTS, taking divergence into account, together with some explanatory examples. It gives two equivalent characterisations of this preorder: a co-inductive one, as the largest relation between subdistributions satisfying certain transfer properties, and one that is obtained through lifting and an additional closure property from a relation between states and subdistributions that we call failure similarity. It also investigates some elementary properties of the failure simulation preorder and of failure similarity. In the second subsection we restrict attention to finitary processes, and on this realm characterise the failure simulation preorder in terms of a simple failure similarity. All further results on the failure simulation preorder, in particular precongruence for the operators of rpCSP and soundness and completeness with respect to the must testing preorder, are in terms of this characterisation, and hence pertain to finitary processes only. The third subsection establishes monotonicity of the operators of rpCSP with respect to the failure simulation preorder. In other words, we show that the failure simulation preorder is a precongruence with respect to these operators. The last subsection is devoted to showing soundness with respect to must testing. Completeness is the subject of Section 6.7.

6.6.1 Two equivalent definitions and their rationale

Let ∆ and its variants be subdistributions in a pLTS 〈S,Actτ,→〉. For a ∈ Act write ∆ a=⇒ ∆′ whenever ∆ =⇒ ∆pre a−→ ∆post =⇒ ∆′. Extend this to Actτ by allowing as a special case that τ=⇒ is simply =⇒, i.e. including identity (rather than requiring at least one τ−→). For example, referring to Example 6.4 we have [[Q1]] a=⇒ [[0]], while in Example 6.5 we have [[s2]] a=⇒ (1/2)·[[0]] and [[s2]] =⇒ 6A−→ for any set A not containing a, because s2 =⇒ (1/2)·[[a.0]].


Definition 6.19 (Failure simulation preorder). Define ⊑FS to be the largest relation in Dsub(S) × Dsub(S) such that if Θ ⊑FS ∆ then

1. whenever ∆ α=⇒ (∑i pi·∆′i), for α ∈ Actτ and certain pi with ∑i pi ≤ 1, then there are Θ′i ∈ Dsub(S) with Θ α=⇒ (∑i pi·Θ′i) and Θ′i ⊑FS ∆′i for each i, and
2. whenever ∆ =⇒ 6A−→ then also Θ =⇒ 6A−→.

Sometimes we write ∆ ⊒FS Θ for Θ ⊑FS ∆.

In the first case of the above definition the summation is allowed to be empty, which has the following useful consequence.

Lemma 6.20. If Θ ⊑FS ∆ and ∆ diverges, then also Θ diverges.

Proof. Divergence of ∆ means that ∆ =⇒ ε, whence with Θ ⊑FS ∆ we can take the empty summation in Definition 6.19 to conclude that also Θ =⇒ ε. ⊓⊔

Although the regularity of Definition 6.19 is appealing — for example it is trivial to see that ⊑FS is reflexive and transitive, as it should be — in practice, for specific processes, it is easier to work with a characterisation of the failure simulation preorder in terms of a relation between states and distributions.

Definition 6.20 (Failure similarity). Define ⊳eFS to be the largest binary relation in S × Dsub(S) such that if s ⊳eFS Θ then

1. whenever s α=⇒ ∆′, for α ∈ Actτ, then there is a Θ′ ∈ Dsub(S) with Θ α=⇒ Θ′ and ∆′ (⊳eFS)† Θ′, and
2. whenever s =⇒ 6A−→ then Θ =⇒ 6A−→.

Any relation R ⊆ S × Dsub(S) that satisfies the two clauses above is called a failure simulation.

This is very close to Definition 5.4. The main difference is the use of the weak transitions α=⇒ ∆′. In particular, if s τ=⇒ ε, then Θ has to respond similarly by the transition Θ τ=⇒ ε.
Obviously, for any failure simulation R we have R ⊆ ⊳eFS. The following two lemmas show that the lifted failure similarity relation (⊳eFS)† ⊆ Dsub(S) × Dsub(S) has simulating properties analogous to 1 and 2 above.

Lemma 6.21. Suppose ∆ (⊳eFS)† Θ and ∆ α=⇒ ∆′ for α ∈ Actτ. Then Θ α=⇒ Θ′ for some Θ′ such that ∆′ (⊳eFS)† Θ′.

Proof. By Lemma 6.1, ∆ (⊳eFS)† Θ implies that ∆ = ∑i∈I pi·si and si ⊳eFS Θi, as well as Θ = ∑i∈I pi·Θi. By Corollary 6.7 and Proposition 6.2 we know from ∆ α=⇒ ∆′ that si α=⇒ ∆′i for some ∆′i ∈ Dsub(S) such that ∆′ = ∑i∈I pi·∆′i. For each i ∈ I we infer from si ⊳eFS Θi and si α=⇒ ∆′i that there is a Θ′i ∈ Dsub(S) with Θi α=⇒ Θ′i and ∆′i (⊳eFS)† Θ′i. Let Θ′ := ∑i∈I pi·Θ′i. Then Definition 6.2(2) and Theorem 6.5(i) yield ∆′ (⊳eFS)† Θ′ and Θ α=⇒ Θ′. ⊓⊔


Lemma 6.22. Suppose ∆ (⊳eFS)† Θ and ∆ =⇒ 6A−→. Then Θ =⇒ 6A−→.

Proof. Suppose ∆ (⊳eFS)† Θ and ∆ =⇒ ∆′ 6A−→. By Lemma 6.21 there exists some subdistribution Θ′ such that Θ =⇒ Θ′ and ∆′ (⊳eFS)† Θ′. From Lemma 6.1 we know that ∆′ = ∑i∈I pi·si, si ⊳eFS Θi and Θ′ = ∑i∈I pi·Θi, with si ∈ ⌈∆′⌉ for all i ∈ I. Since ∆′ 6A−→, we have that si 6A−→ for all i ∈ I. It follows from si ⊳eFS Θi that Θi =⇒ Θ′i 6A−→. By Theorem 6.5(i) we obtain that ∑i∈I pi·Θi =⇒ ∑i∈I pi·Θ′i 6A−→. By the transitivity of =⇒ we have that Θ =⇒ 6A−→. ⊓⊔

The next result shows how the failure simulation preorder can alternatively be defined in terms of failure similarity. This is consistent with Definition 5.5 for finite processes.

Proposition 6.7. For ∆, Θ ∈ Dsub(S) we have Θ ⊑FS ∆ just when there is a Θmatch with Θ =⇒ Θmatch and ∆ (⊳eFS)† Θmatch.

Proof. Let ⊳′FS ⊆ S × Dsub(S) be the relation given by s ⊳′FS Θ iff Θ ⊑FS s. Then ⊳′FS is a failure simulation; hence ⊳′FS ⊆ ⊳eFS. Now suppose Θ ⊑FS ∆. Let ∆ := ∑i pi·si. Then there are Θi with Θ =⇒ ∑i pi·Θi and Θi ⊑FS si for each i, whence si ⊳′FS Θi, and thus si ⊳eFS Θi. Take Θmatch := ∑i pi·Θi. Definition 6.2 yields ∆ (⊳eFS)† Θmatch.
For the other direction it suffices to show that the relation (⊳eFS)† · ⇐= satisfies the two clauses of Definition 6.19, yielding (⊳eFS)† · ⇐= ⊆ ⊒FS. Here we write ⇐= for the inverse of =⇒. So suppose, for given ∆, Θ ∈ Dsub(S), there is a Θmatch with Θ =⇒ Θmatch and ∆ (⊳eFS)† Θmatch.
Suppose ∆ α=⇒ ∑i∈I pi·∆′i for some α ∈ Actτ. By Lemma 6.21 there is some Θ′ such that Θmatch α=⇒ Θ′ and (∑i∈I pi·∆′i) (⊳eFS)† Θ′. From Proposition 6.2 we know that Θ′ = ∑i∈I pi·Θ′i for subdistributions Θ′i such that ∆′i (⊳eFS)† Θ′i for i ∈ I. Thus Θ α=⇒ ∑i pi·Θ′i by the transitivity of =⇒ (Theorem 6.6), and ∆′i ((⊳eFS)† · ⇐=) Θ′i for each i ∈ I by the reflexivity of ⇐=.
Suppose ∆ =⇒ 6A−→. By Lemma 6.22 we have Θmatch =⇒ 6A−→. It follows that Θ =⇒ 6A−→ by the transitivity of =⇒. ⊓⊔

Note the appearance of the "anterior step" Θ =⇒ Θmatch in Proposition 6.7 immediately above. For the same reason explained in Example 5.14, defining ⊒FS simply to be (⊳eFS)† (i.e. without the anterior step) would not have been suitable.

Remark 6.3. For s ∈ S and Θ ∈ Dsub(S) we have s ⊳eFS Θ iff Θ ⊑FS s; here no anterior step is needed. One direction of this statement has been obtained at the beginning of the proof of Proposition 6.7; for the other, note that s ⊳eFS Θ implies s (⊳eFS)† Θ by Definition 6.2(1), which implies Θ ⊑FS s by Proposition 6.7 and the reflexivity of =⇒.

Example 5.14 also shows that ⊑FS cannot be obtained as the lifting of any relation: it lacks the decomposition property of Proposition 6.2. Nevertheless, ⊑FS enjoys the property of linearity, as occurs in Definition 6.2:


Lemma 6.23. If Θi ⊑FS ∆i for i ∈ I then ∑i∈I pi·Θi ⊑FS ∑i∈I pi·∆i for any pi ∈ [0,1] (i ∈ I) with ∑i∈I pi ≤ 1.

Proof. This follows immediately from the linearity of (⊳eFS)† and =⇒ (cf. Theorem 6.5(i)), using Proposition 6.7. ⊓⊔

Example 6.16 (Divergence). From Example 6.3 we know that [[recx.x]] =⇒ ε. This, together with (6.1) in Section 6.3.1, and the fact that ε 6A−→ for any set of actions A, ensures that s ⊳eFS [[recx.x]] for any s, hence Θ (⊳eFS)† [[recx.x]] for any Θ, and thus that [[recx.x]] ⊑FS Θ. Indeed similar reasoning applies to any ∆ with

  ∆ = ∆0 τ−→ ∆1 τ−→ ··· τ−→ ···

because — as explained right before Example 6.3 — this also ensures that ∆ =⇒ ε. In particular, we have ε =⇒ ε and hence [[recx.x]] ≃FS ε.
Yet 0 6⊑FS [[recx.x]], because the move [[recx.x]] =⇒ ε cannot be matched by a corresponding move from [[0]] — see Lemma 6.20.

Example 6.16 shows again that the anterior move in Proposition 6.7 is necessary: although [[recx.x]] ⊑FS ε we do not have ε (⊳eFS)† [[recx.x]], since by Lemma 6.2 any Θ with ε (⊳eFS)† Θ must have |Θ| = 0.

Example 6.17. Referring to the process Q1 of Example 6.4, with Proposition 6.7 we easily see that Q1 ⊑FS a.0, because we have a.0 ⊳eFS [[Q1]]. Note that the move [[Q1]] =⇒ [[a.0]] is crucial, since it enables us to match the move [[a.0]] a−→ [[0]] with [[Q1]] =⇒ [[a.0]] a−→ [[0]]. It also enables us to match refusals: if [[a.0]] 6A−→ then A cannot contain the action a, and therefore also [[Q1]] =⇒ 6A−→.
The converse, that a.0 ⊑FS Q1, is also true because it is straightforward to verify that the relation

  {(Q1, [[a.0]]), (τ.Q1, [[a.0]]), (a.0, [[a.0]]), (0, [[0]])}

is a failure simulation and thus a subset of ⊳eFS. We therefore have Q1 ≃FS a.0.

Example 6.18. Let P be the process a.0 1/2⊕ recx.x and consider the state s2 introduced in Example 6.5. First note that [[P]] (⊳eFS)† (1/2)·[[a.0]], since recx.x ⊳eFS ε. Then, because s2 =⇒ (1/2)·[[a.0]], we have s2 ⊑FS [[P]]. The converse, that [[P]] ⊑FS s2 holds, is true because s2 ⊳eFS [[P]] follows from the fact that the relation

  {(sk, [[a.0]] 1/k⊕ [[recx.x]]) | k ≥ 2} ∪ {(a.0, [[a.0]]), (0, [[0]])}

is a failure simulation that contains the pair (s2, [[P]]).

Our final examples pursue the consequences of the fact that the empty distribution ε is behaviourally indistinguishable from divergent processes like [[recx.x]].

Example 6.19 (Subdistributions formally unnecessary). For any subdistribution ∆, let ∆e denote the (full) distribution defined by


  ∆e := ∆ + (1 − |∆|)·[[recx.x]].

Intuitively it is obtained from ∆ by padding the missing support with the divergent state [[recx.x]].
Then ∆ ≃FS ∆e. This follows because ∆e =⇒ ∆, which is sufficient to establish ∆e ⊑FS ∆; but also ∆e (⊳eFS)† ∆, because [[recx.x]] ⊳eFS ε, and that implies the converse ∆ ⊑FS ∆e. The equivalence shows that formally we have no need for subdistributions, and that our technical development could be carried out using (full) distributions only.

But abandoning subdistributions comes at a cost: the definition of weak transition, Definition 6.4, would be much more complex if expressed with full distributions, as would syntactic manipulations such as those used in the proof of Theorem 6.6.
More significant, however, is that diverging processes have a special character in failure simulation semantics. Placing them at the bottom of the ⊑FS preorder — as we do — requires that they failure-simulate every process, thus allowing all visible actions and all refusals and so behaving in a sense "chaotically"; yet applying the operational semantics of Figure 6.2 to recx.x literally would suggest exactly the opposite, since recx.x allows no visible actions (all its derivatives enable only τ) and no refusals (all its derivatives have τ enabled). The case analyses that discrepancy would require are entirely escaped by allowing subdistributions, as the chaotic behaviour of the diverging ε follows naturally from the definitions, as we saw in Example 6.16.

We conclude with an example involving divergence and subdistributions.

Example 6.20. For 0 ≤ c ≤ 1 let Pc be the process 0 c⊕ recx.x. We show that [[Pc]] ⊑FS [[Pc′]] just when c ≤ c′. (Refusals can be ignored, since Pc refuses every set of actions, for all c.)
Suppose first that c ≤ c′, and split the two processes as follows:

  [[Pc]] = c·[[0]] + (c′−c)·[[recx.x]] + (1−c′)·[[recx.x]]
  [[Pc′]] = c·[[0]] + (c′−c)·[[0]] + (1−c′)·[[recx.x]].

Because 0 ⊳eFS [[recx.x]] (the middle terms), we have immediately [[Pc′]] (⊳eFS)† [[Pc]], whence [[Pc]] ⊑FS [[Pc′]].
For the other direction, note that [[Pc′]] =⇒ c′·[[0]]. If [[Pc]] ⊑FS [[Pc′]] then from Definition 6.19 we would have to have [[Pc]] =⇒ c′·Θ′ for some subdistribution Θ′, a derivative of weight no more than c′. But the smallest weight Pc can reach via =⇒ is just c, so that we must have in fact c ≤ c′.

We end this subsection with two properties of failure similarity that will be useful later on.

Proposition 6.8. The relation ⊳eFS is convex.

Proof. Suppose s ⊳eFS Θi and pi ∈ [0,1] for i ∈ I, with ∑i∈I pi = 1. We need to show that s ⊳eFS ∑i∈I pi·Θi.


If s α=⇒ ∆′, then there exist Θ′i for each i ∈ I such that Θi α=⇒ Θ′i and ∆′ (⊳eFS)† Θ′i. By Corollary 6.7 and Theorem 6.4, we obtain that ∑i∈I pi·Θi α=⇒ ∑i∈I pi·Θ′i and ∆′ (⊳eFS)† ∑i∈I pi·Θ′i.
If s =⇒ 6A−→ for some A ⊆ Act, then Θi =⇒ Θ′i 6A−→ for all i ∈ I. By definition we have ∑i∈I pi·Θ′i 6A−→, and Theorem 6.5(i) yields ∑i∈I pi·Θi =⇒ ∑i∈I pi·Θ′i.
So we have checked that s ⊳eFS ∑i∈I pi·Θi. It follows that ⊳eFS is convex. ⊓⊔

Proposition 6.9. The relation (⊳eFS)† ⊆ Dsub(S) × Dsub(S) is reflexive and transitive.

Proof. Reflexivity is easy; it relies on the fact that s ⊳eFS s for every state s.
For transitivity, we first show that ⊳eFS·(⊳eFS)† is a failure simulation. Suppose that s ⊳eFS Θ (⊳eFS)† Φ. If s α=⇒ ∆′ then there is a Θ′ such that Θ α=⇒ Θ′ and ∆′ (⊳eFS)† Θ′. By Lemma 6.21, there exists a Φ′ such that Φ α=⇒ Φ′ and Θ′ (⊳eFS)† Φ′. Hence, ∆′ (⊳eFS)†·(⊳eFS)† Φ′. By Lemma 6.3 we know that

  (⊳eFS)†·(⊳eFS)† = (⊳eFS·(⊳eFS)†)†   (6.21)

Therefore, we obtain ∆′ (⊳eFS·(⊳eFS)†)† Φ′.
If s =⇒ 6A−→ for some A ⊆ Act, then Θ =⇒ 6A−→ and hence Φ =⇒ 6A−→ by applying Lemma 6.22.
So we have established that ⊳eFS·(⊳eFS)† ⊆ ⊳eFS. It now follows from the monotonicity of the lifting operation and (6.21) that (⊳eFS)†·(⊳eFS)† ⊆ (⊳eFS)†. ⊓⊔

6.6.2 A simple failure similarity for finitary processes

Here we present a simpler characterisation of failure similarity, valid when considering finitary processes only. It is in terms of this characterisation that we will establish soundness and completeness of the failure simulation preorder with respect to the must testing preorder; consequently we have these results for finitary processes only.

Definition 6.21 (Simple failure similarity). Let F be the function on S × Dsub(S) such that for any binary relation R ⊆ S × Dsub(S), state s and subdistribution Θ, if s F(R) Θ then

1. whenever s =⇒ ε then also Θ =⇒ ε, otherwise
2. whenever s α−→ ∆′, for some α ∈ Actτ, then there is a Θ′ with Θ α=⇒ Θ′ and ∆′ R† Θ′, and
3. whenever s 6A−→ then Θ =⇒ 6A−→.

Let ⊳sFS be the greatest fixed point of F.

The above definition is obtained by factoring out divergence from Clause 1 in Definition 6.20, and it conservatively extends Definition 5.4 to compare processes that


might not be finite. We first note that the relation ⊳sFS is not interesting for infinitary processes, since its lifted form (⊳sFS)† is not a transitive relation for those processes.

Example 6.21. Consider the process defined by the following two transitions: t0 τ−→ (0 1/2⊕ t1) and t1 τ−→ t1. We compare state t0 with state s in Example 6.15 and have that t0 ⊳sFS s. The transition t0 τ−→ (0 1/2⊕ t1) can be matched by s =⇒ (1/2)·0, because (0 1/2⊕ t1) (⊳sFS)† (1/2)·0 by noticing that t1 ⊳sFS ε.
It also holds that s ⊳sFS 0, because the relation {(s, 0), (0, 0)} is a simple failure simulation. The transition s τ−→k (0 1/k²⊕ s) for any k ≥ 2 is matched by 0 =⇒ 0.
However, we do not have t0 ⊳sFS 0. The only candidate to simulate the transition t0 τ−→ (0 1/2⊕ t1) is 0 =⇒ 0, but we do not have (0 1/2⊕ t1) (⊳sFS)† 0, because the divergent state t1 cannot be simulated by 0.
Therefore, we have t0 (⊳sFS)† s (⊳sFS)† 0 but not t0 (⊳sFS)† 0; thus transitivity of the relation (⊳sFS)† fails. Here the state s is not finitely branching. As a matter of fact, transitivity of (⊳sFS)† also fails for finitely branching but infinite-state processes. Consider an infinite-state pLTS consisting of a collection of states sk for k ≥ 2 such that

  sk τ−→ (0 1/k²⊕ sk+1).   (6.22)

This pLTS is obtained from that in Example 6.5 by replacing a.0 with 0. One can check that t0 (⊳sFS)† s2 (⊳sFS)† 0 but we already know that t0 (⊳sFS)† 0 does not hold. Again, we lose the transitivity of (⊳sFS)†.

If we restrict our attention to finitary processes, then ⊳sFS provides a simpler characterisation of failure similarity.

Theorem 6.12 (Equivalence of failure and simple failure similarity). For finitary distributions ∆, Θ ∈ Dsub(S) in a pLTS 〈S,Actτ,→〉 we have ∆ ⊳sFS Θ if and only if ∆ ⊳eFS Θ.

Proof. Because s α−→ ∆′ implies s α=⇒ ∆′, and s 6A−→ implies s =⇒ 6A−→, it is trivial that ⊳eFS satisfies the conditions of Definition 6.21, so that ⊳eFS ⊆ ⊳sFS.
For the other direction we need to show that ⊳sFS satisfies Clause 1 of Definition 6.20 with α = τ, that is:

  if s ⊳sFS Θ and s =⇒ ∆′ then there is some Θ′ ∈ Dsub(S) with Θ =⇒ Θ′ and ∆′ (⊳sFS)† Θ′.

Once we have this, the relation ⊳sFS clearly satisfies both clauses of Definition 6.20, so that we have ⊳sFS ⊆ ⊳eFS.

So suppose that s ⊳sFS Θ and that s =⇒ ∆′ where — for the moment — we assume |∆′| = 1. Referring to Definition 6.4, there must be ∆k, ∆→k and ∆×k for k ≥ 0 such that s = ∆0, ∆k = ∆→k + ∆×k, ∆→k τ−→ ∆k+1 and ∆′ = ∑∞k=0 ∆×k. Since it holds that ∆×0 + ∆→0 = s (⊳sFS)† Θ, using Proposition 6.2 we can define Θ =: Θ×0 + Θ→0 so that ∆×0 (⊳sFS)† Θ×0 and ∆→0 (⊳sFS)† Θ→0. Since ∆→0 τ−→ ∆1 and ∆→0 (⊳sFS)† Θ→0 it follows that Θ→0 =⇒ Θ1 with ∆1 (⊳sFS)† Θ1.


Repeating the above procedure gives us inductively a series Θk, Θ→k, Θ×k of subdistributions, for k ≥ 0, such that Θ0 = Θ, ∆k (⊳sFS)† Θk, Θk = Θ→k + Θ×k, ∆×k (⊳sFS)† Θ×k, ∆→k (⊳sFS)† Θ→k and Θ→k τ=⇒ Θk+1. We define Θ′ := ∑i Θ×i. By Additivity (Remark 6.2) we have ∆′ (⊳sFS)† Θ′. It remains to be shown that Θ =⇒ Θ′.
For that final step, since the set {Θ′′ | Θ =⇒ Θ′′} is closed according to Lemma 6.17, we can establish Θ =⇒ Θ′ by exhibiting a sequence Θ′i with Θ =⇒ Θ′i for each i and with the Θ′i's being arbitrarily close to Θ′. Induction establishes for each i that

  Θ =⇒ Θ′i := Θ→i + ∑k≤i Θ×k.

Since |∆′| = 1, we are guaranteed to have limi→∞ |∆→i| = 0, whence by Lemma 6.2, using that ∆→i (⊳sFS)† Θ→i, also limi→∞ |Θ→i| = 0. Thus these Θ′i's form the sequence we needed.

That concludes the case |∆′| = 1. If on the other hand ∆′ = ε, i.e. we have |∆′| = 0, then Θ =⇒ ε follows immediately from s ⊳sFS Θ, and ε (⊳sFS)† ε trivially.
In the general case, if s =⇒ ∆′ then by Theorem 6.11 we have s =⇒ (∆′1 p⊕ ∆′ε) for some probability p and distributions ∆′1, ∆′ε, with ∆′ = p·∆′1 and ∆′ε =⇒ ε. From the mass-1 case above we have Θ =⇒ (Θ′1 p⊕ Θ′ε) with ∆′1 (⊳sFS)† Θ′1 and ∆′ε (⊳sFS)† Θ′ε; from the mass-0 case we have Θ′ε =⇒ ε and hence (Θ′1 p⊕ Θ′ε) =⇒ p·Θ′1 by Theorem 6.5(i); thus transitivity yields Θ =⇒ p·Θ′1, with ∆′ = p·∆′1 (⊳sFS)† p·Θ′1 as required, using Definition 6.2(2). ⊓⊔

The proof of Theorem 6.12 refers to Theorem 6.11, where the underlying pLTS is assumed to be finitary. As we would expect, Theorem 6.12 fails for infinitary pLTS's.

Example 6.22. We have seen in Example 6.21 that the state s from (6.20) is related to 0 via the relation ⊳sFS. We now compare s with 0 according to ⊳eFS. From state s we have the weak transition s =⇒ (0 1/2⊕ ε), which cannot be matched by any transition from 0; thus s 6⊳eFS 0. This means that Theorem 6.12 fails for infinitely branching processes.
If we replace state s by the state s2 from (6.22), a similar phenomenon occurs. Therefore, Theorem 6.12 also fails for finitely branching but infinite-state processes.

6.6.3 Precongruence

The purpose of this section is to show that the semantic relation ⊑FS is preserved by the constructs of rpCSP. The proofs follow closely the corresponding proofs in Section 5.5.2, but here there is a significant extra proof obligation: in order to relate two processes we have to demonstrate that if the first diverges then so does the second.


Here, in order to avoid such complications, we introduce yet another version of failure simulation; it modifies Definition 6.21 by checking divergence co-inductively instead of using a predicate.

Definition 6.22. Define ⊳cFS to be the largest relation in S × Dsub(S) such that if s ⊳cFS Θ then

1. whenever s =⇒ ε, there are some ∆′, Θ′ such that s =⇒ τ−→ =⇒ ∆′ =⇒ ε, Θ =⇒ τ−→ =⇒ Θ′ and ∆′ (⊳cFS)† Θ′; otherwise
2. whenever s α−→ ∆′, for some α ∈ Actτ, then there is a Θ′ with Θ α=⇒ Θ′ and ∆′ (⊳cFS)† Θ′, and
3. whenever s 6A−→ then Θ =⇒ 6A−→.

Lemma 6.24. The following statements about divergence are equivalent.

(1) ∆ =⇒ ε.
(2) There is an infinite sequence ∆ τ−→ ∆1 τ−→ ∆2 τ−→ ···.
(3) There is an infinite sequence ∆ =⇒ τ−→ =⇒ ∆1 =⇒ τ−→ =⇒ ∆2 =⇒ τ−→ =⇒ ···.

Proof. By the definition of weak transition, it is immediate that (1) ⇔ (2). Clearly we have (2) ⇒ (3). To show that (3) ⇒ (2), we introduce another characterisation of divergence. Let ∆ be a subdistribution in a pLTS L. A pLTS induced by ∆ is a pLTS whose states and transitions are subsets of those in L, and all of whose states are reachable from ∆.

(4) There is a pLTS induced by ∆ in which all states have outgoing τ transitions.

It holds that (3) ⇒ (4) because we can construct a pLTS whose states and transitions are just those used in deriving the infinite sequence in (3). For this pLTS, each state has an outgoing τ transition, which gives (4) ⇒ (2). ⊓⊔

The next lemma shows the usefulness of the relation ⊳cFS by checking divergence in a co-inductive way.

Lemma 6.25. Suppose ∆ (⊳cFS)† Θ and ∆ =⇒ ε. Then there exist ∆′, Θ′ such that ∆ =⇒ τ−→ =⇒ ∆′ =⇒ ε, Θ =⇒ τ−→ =⇒ Θ′, and ∆′ (⊳cFS)† Θ′.

Proof. Suppose ∆ (⊳cFS)† Θ and ∆ =⇒ ε. In analogy with Proposition 6.8 we can show that ⊳cFS is convex. By Corollary 6.1, we can decompose Θ as ∑s∈⌈∆⌉ ∆(s)·Θs with s ⊳cFS Θs for each s ∈ ⌈∆⌉. Now each s must also diverge. So there exist ∆′s, Θ′s such that s =⇒ τ−→ =⇒ ∆′s =⇒ ε, Θs =⇒ τ−→ =⇒ Θ′s and ∆′s (⊳cFS)† Θ′s for each s ∈ ⌈∆⌉. Let ∆′ = ∑s∈⌈∆⌉ ∆(s)·∆′s and Θ′ = ∑s∈⌈∆⌉ ∆(s)·Θ′s. By Definition 6.2 and Theorem 6.5(i), we have ∆′ (⊳cFS)† Θ′, ∆ =⇒ τ−→ =⇒ ∆′, and Θ =⇒ τ−→ =⇒ Θ′. We also have that ∆′ =⇒ ε, because each state in ⌈∆′⌉ lies in ⌈∆′s⌉ for some ∆′s with ∆′s =⇒ ε, and hence itself diverges. ⊓⊔

Lemma 6.26. ⊳cFS coincides with ⊳sFS.


Proof. We only need to check that the first clause in Definition 6.21 is equivalent to the first clause in Definition 6.22. For one direction, we consider the relation

  R := {(s, Θ) | s =⇒ ε and Θ =⇒ ε}

and show R ⊆ ⊳cFS. Suppose s R Θ. By Lemma 6.24 there are two infinite sequences s τ−→ ∆1 τ−→ ∆2 τ−→ ··· and Θ τ−→ Θ1 τ−→ ···. Then we have both ∆1 =⇒ ε and Θ1 =⇒ ε. Note that ∆1 =⇒ ε if and only if t =⇒ ε for each t ∈ ⌈∆1⌉. Therefore, ∆1 R† Θ1, as we have ∆1 = ∑t∈⌈∆1⌉ ∆1(t)·t, Θ1 = ∑t∈⌈∆1⌉ ∆1(t)·Θ1, and t R Θ1. Here |∆1| = 1 because ∆1, like s, is a distribution.
For the other direction, we show that ∆ (⊳cFS)† Θ and ∆ =⇒ ε imply Θ =⇒ ε. Then, as a special case, we get that s ⊳cFS Θ and s =⇒ ε imply Θ =⇒ ε. By repeated application of Lemma 6.25, we can obtain two infinite sequences

  ∆ =⇒ τ−→ =⇒ ∆1 =⇒ τ−→ =⇒ ···   and   Θ =⇒ τ−→ =⇒ Θ1 =⇒ τ−→ =⇒ ···

such that ∆i (⊳cFS)† Θi for all i ≥ 1. By Lemma 6.24 this implies Θ =⇒ ε. ⊓⊔

The advantage of this new relation ⊳cFS over ⊳sFS is that in order to check s ⊳cFS Θ when s diverges it is sufficient to find a single matching move Θ =⇒ τ−→ =⇒ Θ′, rather than an infinite sequence of moves. However, to construct this matching move we cannot rely on clause 2 in Definition 6.22, as the move generated there might actually be empty, as we have seen in Example 6.2. Instead we need a method for generating weak moves that contain at least one occurrence of a τ-action.

Definition 6.23 (Productive moves). Let us write s |A t τ−→p Θ whenever we can infer s |A t τ−→ Θ from the last two rules in Figure 6.2. In effect this means that t must contribute to the action.
These productive actions are extended to subdistributions in the standard manner, giving ∆ τ−→p Θ.

The following lemma is adapted from Lemma 5.8 in the last chapter, which still holds in our current setting.

Lemma 6.27. (1) If Φ =⇒ Φ′ then Φ |A ∆ =⇒ Φ′ |A ∆ and ∆ |A Φ =⇒ ∆ |A Φ′.
(2) If Φ a−→ Φ′ and a ∉ A then Φ |A ∆ a−→ Φ′ |A ∆ and ∆ |A Φ a−→ ∆ |A Φ′.
(3) If Φ a−→ Φ′, ∆ a−→ ∆′ and a ∈ A then ∆ |A Φ τ−→ ∆′ |A Φ′.
(4) (∑j∈J pj·Φj) |A (∑k∈K qk·∆k) = ∑j∈J ∑k∈K (pj·qk)·(Φj |A ∆k).
(5) Let R, R′ ⊆ S × Dsub(S) be two relations satisfying u R Ψ whenever u = s |A t and Ψ = Θ |A t with s R′ Θ and t ∈ S. Then ∆ R′† Θ and Φ ∈ Dsub(S) implies (∆ |A Φ) R† (Θ |A Φ). ⊓⊔

Proposition 6.10. Suppose ∆ (⊳cFS)† Θ and ∆ |A t τ−→p Γ. Then Θ |A t =⇒ τ−→ =⇒ Ψ for some Ψ such that Γ R† Ψ, where R is the relation {(s |A t, Θ |A t) | s ⊳cFS Θ}.

Proof. We first show a simplified version of the result. Suppose that s ⊳cFS Θ and s |A t τ−→p Γ; we prove that this entails Θ |A t =⇒ τ−→ =⇒ Ψ for some Ψ such that Γ R† Ψ. There are only two possibilities for inferring the above productive move from s |A t:

Page 211: Semantics of Probabilistic Processes - SJTU

6.6 The failure simulation preorder 201

(i) Γ = s |A Φ wheret τ−→ Φ(ii) or Γ = ∆ |A Φ where for somea∈ A, s a−→ ∆ andt a−→ Φ.

In the first case we haveΘ |A t τ−→ Θ |A Φ by using Lemma 6.27(2) and also that(s |A Φ) R

† (Θ |A Φ) by Lemma 6.27(5), whereas in the second cases ⊳cFS

Θimplies Θ =⇒ a−→=⇒ Θ ′ for someΘ ′ ∈ Dsub(S) with ∆ (⊳c

FS)† Θ ′, and we have

Θ |A t =⇒ τ−→=⇒ Θ ′ |A Φ by Lemma 6.27(1) and (3), and(∆ |A Φ) R† (Θ ′ |A Φ)

by Lemma 6.27(5).The general case now follows using a standard decomposition/recomposition ar-

gument. Since∆ |A t τ−→p Γ , Lemma 6.1 yields

∆ = ∑i∈I

pi ·si , si |A t τ−→p Γi , Γ = ∑i∈I

pi ·Γi ,

for certainsi ∈ S, Γi ∈ Dsub(S) and∑i∈I pi ≤ 1. In analogy with Proposition 6.8 wecan show that⊳c

FSis convex. Hence, since∆ (⊳c

FS)† Θ , Corollary 6.1 yields that

Θ = ∑i∈I pi ·Θi for someΘi ∈ Dsub(S) such thatsi ⊳cFS

Θi for i ∈ I . By the aboveargument we haveΘi |A t =⇒ τ−→=⇒Ψi for someΨi ∈ Dsub(S) such thatΓi R

† Ψi .The requiredΨ can be taken to be∑i∈I pi ·Ψi as Definition 6.2(2) yieldsΓ R

† Ψand Theorem 6.5(i) and Definition 6.2(2) yieldΘ |A t =⇒ τ−→=⇒Ψ . ⊓⊔

Our next result shows that we can always factor productive moves out of an arbitrary action of a parallel process.

Lemma 6.28. Suppose ∆ |A t τ−→ Γ. Then there exist subdistributions ∆→, ∆×, ∆next, Γ× (possibly empty) such that

(i) ∆ = ∆→ + ∆×
(ii) ∆→ τ−→ ∆next
(iii) ∆× |A t τ−→_p Γ×
(iv) Γ = ∆next |A t + Γ×

Proof. By Lemma 6.1, ∆ |A t τ−→ Γ implies that

∆ = ∑_{i∈I} p_i·s_i,  s_i |A t τ−→ Γ_i,  Γ = ∑_{i∈I} p_i·Γ_i,

for certain s_i ∈ S, Γ_i ∈ Dsub(S) and ∑_{i∈I} p_i ≤ 1. Let J = {i ∈ I | s_i |A t τ−→_p Γ_i}. Note that for each i ∈ (I−J) the subdistribution Γ_i has the form Γ′_i |A t, where s_i τ−→ Γ′_i. Now let

∆→ = ∑_{i∈(I−J)} p_i·s_i,  ∆× = ∑_{i∈J} p_i·s_i,
∆next = ∑_{i∈(I−J)} p_i·Γ′_i,  Γ× = ∑_{i∈J} p_i·Γ_i.

By construction (i) and (iv) are satisfied, and (ii) and (iii) follow by property (2) of Definition 6.2. ⊓⊔


Lemma 6.29. If ∆ |A t =⇒ ε then there is a ∆′ ∈ Dsub(S) such that ∆ =⇒ ∆′ and ∆′ |A t τ−→_p =⇒ ε.

Proof. Suppose ∆_0 |A t =⇒ ε. By Lemma 6.24 there is an infinite sequence

∆_0 |A t τ−→ Ψ_1 τ−→ Ψ_2 τ−→ ...   (6.23)

By induction on k ≥ 0, we find distributions Γ_{k+1}, ∆→_k, ∆×_k, ∆_{k+1}, Γ×_{k+1} such that

(i) ∆_k |A t τ−→ Γ_{k+1}
(ii) Γ_{k+1} ≤ Ψ_{k+1}
(iii) ∆_k = ∆→_k + ∆×_k
(iv) ∆→_k τ−→ ∆_{k+1}
(v) ∆×_k |A t τ−→_p Γ×_{k+1}
(vi) Γ_{k+1} = ∆_{k+1} |A t + Γ×_{k+1}

Induction base: take Γ_1 := Ψ_1 and apply Lemma 6.28.
Induction step: assume we already have Γ_k, ∆_k and Γ×_k. Since ∆_k |A t ≤ Γ_k ≤ Ψ_k and Ψ_k τ−→ Ψ_{k+1}, Proposition 6.2 gives us a Γ_{k+1} such that ∆_k |A t τ−→ Γ_{k+1} and Γ_{k+1} ≤ Ψ_{k+1}. Now apply Lemma 6.28.

Let ∆′ := ∑_{k=0}^∞ ∆×_k. By (iii) and (iv) above we obtain a weak τ move ∆_0 =⇒ ∆′. Since ∆′ |A t = ∑_{k=0}^∞ (∆×_k |A t), by (v) and Definition 6.2 we have ∆′ |A t τ−→_p ∑_{k=1}^∞ Γ×_k. Note that here it does not matter if ∆′ = ε. Since Γ×_k ≤ Γ_k ≤ Ψ_k and Ψ_k =⇒ ε, it follows from Theorem 6.5(ii) that Γ×_k =⇒ ε. Hence, by using Theorem 6.5(i) we obtain that ∑_{k=1}^∞ Γ×_k =⇒ ε. ⊓⊔

We are now ready to prove the main result of this section, namely that ⊑FS is preserved by the parallel operator.

Proposition 6.11. In a finitary pLTS, if Θ ⊑FS ∆ then Θ |A Φ ⊑FS ∆ |A Φ.

Proof. We first construct the relation

R := {(s |A t, Θ |A t) | s ⊳^c_FS Θ}

and check that R ⊆ ⊳^c_FS. As in the proof of Theorem 5.1, one can check that each strong transition from s |A t can be matched by a transition from Θ |A t, and the matching of failures can also be established. So we concentrate on the requirement involving divergence.

Suppose s ⊳^c_FS Θ and s |A t =⇒ ε. We need to find some Γ, Ψ such that

(a) s |A t =⇒ τ−→ =⇒ Γ =⇒ ε,
(b) Θ |A t =⇒ τ−→ =⇒ Ψ and Γ R† Ψ.

By Lemma 6.29 there are ∆′, Γ ∈ Dsub(S) with s =⇒ ∆′ and ∆′ |A t τ−→_p Γ =⇒ ε. Since for finitary processes ⊳^c_FS coincides with ⊳^s_FS and ⊳^e_FS, by Lemma 6.26 and Theorem 6.12, there must be a Θ′ ∈ Dsub(S) such that Θ =⇒ Θ′ and ∆′ (⊳^c_FS)† Θ′. By Proposition 6.10 we have Θ′ |A t =⇒ τ−→ =⇒ Ψ for some Ψ such that Γ R† Ψ. Now s |A t =⇒ ∆′ |A t τ−→ Γ =⇒ ε and Θ |A t =⇒ Θ′ |A t =⇒ τ−→ =⇒ Ψ with Γ R† Ψ, which had to be shown.

Therefore, we have shown that R ⊆ ⊳^c_FS. Now let us focus our attention on the statement of the proposition, which involves ⊑FS. Suppose Θ ⊑FS ∆. By Proposition 6.7 this means that there is some Θmatch such that Θ =⇒ Θmatch and ∆ (⊳^e_FS)† Θmatch. By Theorem 6.12 and Lemma 6.26 we have ∆ (⊳^c_FS)† Θmatch. Then Lemma 6.27(5) yields (∆ |A Φ) R† (Θmatch |A Φ). Therefore, we have (∆ |A Φ) (⊳^c_FS)† (Θmatch |A Φ), i.e. (∆ |A Φ) (⊳^e_FS)† (Θmatch |A Φ) by Lemma 6.26 and Theorem 6.12. By using Lemma 6.27(1) we also have that (Θ |A Φ) =⇒ (Θmatch |A Φ), which had to be established according to Proposition 6.7. ⊓⊔

In the proof of Proposition 6.11 we use the characterisation of ⊳^e_FS as ⊳^s_FS, which assumes the pLTS to be finitary. In general, the relation ⊳^s_FS is not closed under parallel composition.

Example 6.23. We use a modification of the infinite-state pLTS of Example 6.5 that as before has states s_k with k ≥ 2, but we add an extra a-looping state s_a, to give all together the system

s_k τ−→ (s_a 1/k²⊕ s_{k+1}) for k ≥ 2,  and  s_a a−→ s_a.

There is a failure simulation such that s_k ⊳^s_FS (s_a 1/k⊕ 0), because from state s_k the transition s_k τ−→ (s_a 1/k²⊕ s_{k+1}) can be matched by a transition to (s_a 1/k²⊕ (s_a 1/(k+1)⊕ 0)), which simplifies to just (s_a 1/k⊕ 0) again; i.e. a sufficient simulating transition would be the identity instance of =⇒.

Now s_2 |a s_a wholly diverges even though s_2 itself does not, and (recall from above) we have s_2 ⊳^s_FS (s_a 1/2⊕ 0). Yet (s_a 1/2⊕ 0) |a s_a does not diverge, and therefore s_2 |a s_a ⋪^s_FS (s_a 1/2⊕ 0) |a s_a.

Note that this counter-example does not go through if we use failure similarity ⊳^e_FS instead of simple failure similarity ⊳^s_FS, since s_2 ⋪^e_FS (s_a 1/2⊕ 0): the former has the transition s_2 =⇒ s_a 1/2⊕ ε, which cannot be matched by s_a 1/2⊕ 0.

Proposition 6.12 (Precongruence). In a finitary pLTS, if P ⊑FS Q then α.P ⊑FS α.Q for any α ∈ Act; similarly, if P_1 ⊑FS Q_1 and P_2 ⊑FS Q_2 then P_1 ⊙ P_2 ⊑FS Q_1 ⊙ Q_2, for ⊙ any of the operators ⊓, □, p⊕ and |A.

Proof. The most difficult case is the closure of failure simulation under parallel composition, which is proved in Proposition 6.11. The other cases are simpler and thus omitted. ⊓⊔

Lemma 6.30. In a finitary pLTS, if P ⊑FS Q then [P |Act T] ⊑FS [Q |Act T] for any test T.

Proof. We first construct the relation

R := {(s |Act t, Θ |Act t) | s ⊳^c_FS Θ},

where s |Act t is a state in [P |Act T] and Θ |Act t is a subdistribution in [Q |Act T], and show that R ⊆ ⊳^c_FS.

1. The matching of divergence between s |Act t and Θ |Act t is almost the same as in the proof of Proposition 6.11, except that we need to check that the requirements t ω↛ and Γ ω↛ are always met there.

2. We now consider the matching of transitions.

• If s |Act t ω−→ then this action is actually performed by t. Suppose t ω−→ Γ. Then s |Act t ω−→ s |Act Γ and Θ |Act t ω−→ Θ |Act Γ. Obviously we have (s |Act Γ, Θ |Act Γ) ∈ R†.
• If s |Act t τ−→ then we must have s |Act t ω↛, for otherwise the τ transition would be a “scooting” transition and the pLTS would not be ω-respecting. It follows that t ω↛. There are three subcases.
– t τ−→ Γ. The transition s |Act t τ−→ s |Act Γ can simply be matched by Θ |Act t τ−→ Θ |Act Γ.
– s τ−→ ∆. Since s ⊳^c_FS Θ, there exists some Θ′ such that Θ =⇒ Θ′ and ∆ (⊳^c_FS)† Θ′. Note that in this case t ω↛. Hence Θ |Act t =⇒ Θ′ |Act t, which can match the transition s |Act t τ−→ ∆ |Act t because we also have (∆ |Act t, Θ′ |Act t) ∈ R†.
– s a−→ ∆ and t a−→ Γ for some action a ∈ Act. Since s ⊳^c_FS Θ, there exists some Θ′ such that Θ a=⇒ Θ′ and ∆ (⊳^c_FS)† Θ′. Note that in this case t ω↛. It easily follows that Θ |Act t =⇒ Θ′ |Act Γ, which can match the transition s |Act t τ−→ ∆ |Act Γ because (∆ |Act Γ, Θ′ |Act Γ) ∈ R†.
• Suppose s |Act t A↛ for some A ⊆ Act ∪ {ω}. There are two possibilities.
– If s |Act t ω↛, then t ω↛ and there are two subsets A_1, A_2 of A such that s A_1↛, t A_2↛ and A = A_1 ∪ A_2. Since s ⊳^c_FS Θ, there exists some Θ′ such that Θ =⇒ Θ′ and Θ′ A_1↛. Therefore we have Θ |Act t =⇒ Θ′ |Act t, with Θ′ |Act t A↛.
– If s |Act t ω−→, then t ω−→ and ω ∉ A. Therefore we have Θ |Act t ω−→ and Θ |Act t τ↛, because there is no “scooting” transition in Θ |Act t. It follows that Θ |Act t A↛.

Therefore, we have shown that R ⊆ ⊳^c_FS, from which the expected result can be established by using arguments similar to those in the last part of the proof of Proposition 6.11. ⊓⊔

6.6.4 Soundness

In this section we prove that failure simulations are sound for showing that processes are related via the failure-based testing preorder. We assume initially that we are using only one success action ω, so that |Ω| = 1.


Because we prune our pLTS's before extracting values from them, we will be concerned mainly with ω-respecting structures.

Definition 6.24. Let ∆ be a subdistribution in a pLTS ⟨S, {τ,ω}, →⟩. We write V(∆) for the set of testing outcomes {$∆′ | ∆ =⇒≻ ∆′}.

Lemma 6.31. Let ∆ and Θ be subdistributions in an ω-respecting pLTS ⟨S, {τ,ω}, →⟩. If ∆ is stable and ∆ (⊳^e_FS)† Θ, then V(Θ) ≤_Sm V(∆).

Proof. We first show that if s is stable and s ⊳^e_FS Θ then V(Θ) ≤_Sm V(s). Since s is stable, we have only two cases:

(i) s ↛. Here V(s) = {0}, and since s ⊳^e_FS Θ we have Θ =⇒ Θ′ with Θ′ ↛, whence in fact Θ =⇒≻ Θ′ and $Θ′ = 0. Therefore 0 ∈ V(Θ), which means V(Θ) ≤_Sm V(s).
(ii) s ω−→ ∆′ for some ∆′. Here V(s) = {1}, and Θ =⇒ Θ′ ω−→ with $Θ′ = |Θ′|. Because the pLTS is ω-respecting, in fact Θ =⇒≻ Θ′, and so again we have V(Θ) ≤_Sm V(s).

Now for the general case we suppose ∆ (⊳^e_FS)† Θ. Use Proposition 6.2 to decompose Θ into ∑_{s∈⌈∆⌉} ∆(s)·Θ_s such that s ⊳^e_FS Θ_s for each s ∈ ⌈∆⌉, and recall that each such state s is stable. From above we have V(Θ_s) ≤_Sm V(s) for those s, and so V(Θ) = ∑_{s∈⌈∆⌉} ∆(s)·V(Θ_s) ≤_Sm ∑_{s∈⌈∆⌉} ∆(s)·V(s) = V(∆). ⊓⊔

Lemma 6.32. Let ∆ be a subdistribution in an ω-respecting pLTS ⟨S, {τ,ω}, →⟩. If ∆ =⇒ ∆′ then V(∆′) ⊆ V(∆).

Proof. Note that if ∆′ =⇒≻ ∆″ then ∆ =⇒ ∆′ =⇒≻ ∆″, so that every extreme derivative of ∆′ is also an extreme derivative of ∆. ⊓⊔

Lemma 6.33. Let ∆ and Θ be subdistributions in an ω-respecting pLTS ⟨S, {τ,ω}, →⟩. If Θ ⊑FS ∆, then V(Θ) ≤_Sm V(∆).

Proof. We first claim that

if ∆ (⊳^e_FS)† Θ, then V(Θ) ≤_Sm V(∆).

Assume ∆ (⊳^e_FS)† Θ. For any ∆ =⇒≻ ∆′ we have a matching transition Θ =⇒ Θ′ such that ∆′ (⊳^e_FS)† Θ′. It follows from Lemmas 6.31 and 6.32 that

V(Θ) ⊇ V(Θ′) ≤_Sm V(∆′).

Consequently, we obtain V(Θ) ≤_Sm V(∆).

Now suppose Θ ⊑FS ∆. By Proposition 6.7, there exists some Θ′ with Θ =⇒ Θ′ and ∆ (⊳^e_FS)† Θ′. By the above claim and Lemma 6.32 we obtain

V(Θ) ⊇ V(Θ′) ≤_Sm V(∆),

and thus V(Θ) ≤_Sm V(∆). ⊓⊔


Theorem 6.13. For any finitary processes P and Q, if P ⊑FS Q then P ⊑pmust Q.

Proof. We reason as follows.

P ⊑FS Q
implies [P |Act T] ⊑FS [Q |Act T]          (Lemma 6.30, for any test T)
implies V([P |Act T]) ≤_Sm V([Q |Act T])   ([·] is ω-respecting; Lemma 6.33)
iff A^d(T, P) ≤_Sm A^d(T, Q)               (Definition 6.24)
iff P ⊑pmust Q.                            (Definition 6.11)
⊓⊔

In the proof of the soundness result above we use Lemma 6.30, which holds for finitary processes only. For infinitary processes, the preorder induced by ⊳^s_FS is not sound for must testing.

Example 6.24. We have seen in Example 6.21 that the state s from (6.20) is related to 0 via the relation ⊳^s_FS. If we apply the test τ.ω to both s and 0, we obtain {q·ω⃗ | q ∈ [1/2, 1]} as the outcome set for the former and {ω⃗} for the latter. Although s (⊳^s_FS)† 0, we have A^d(τ.ω, 0) ≰_Sm A^d(τ.ω, s).

If we replace the state s by the state s_2 from (6.22), a similar phenomenon occurs. Although s_2 (⊳^s_FS)† 0, we have

A^d(τ.ω, 0) = {ω⃗} ≰_Sm {q·ω⃗ | q ∈ [1/2, 1]} = A^d(τ.ω, s_2).

6.7 Failure simulation is complete for must testing

This section establishes the completeness of the failure simulation preorder with respect to the must-testing preorder. It does so in three steps. First we provide a characterisation of the preorder relation ⊑FS by an inductively defined relation. Secondly, using this, we develop a modal logic that can be used to characterise the failure simulation preorder on finitary pLTS's. Finally, we adapt the results of Section 5.7 to show that the modal formulae can in turn be characterised by tests; again this result depends on the underlying pLTS being finitary. From this, completeness follows.

6.7.1 Inductive characterisation

The relation ⊳^s_FS of Definition 6.21 is given co-inductively: it is the largest fixed point of an equation R = F(R). An alternative approach therefore is to use that F(−) to define ⊳^s_FS as a limit of approximants:

Definition 6.25. For every k ≥ 0 we define the relations ⊳^k_FS ⊆ S×Dsub(S) as follows:

(i) ⊳^0_FS := S×Dsub(S)
(ii) ⊳^{k+1}_FS := F(⊳^k_FS)

Finally, let ⊳^∞_FS := ⋂_{k=0}^∞ ⊳^k_FS.
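The approximant construction is just Kleene iteration from the top of a lattice of relations. The following Python sketch (ours, not from the text) shows the shape of that computation; the functional F is passed in as a parameter, and the finite candidate space is an assumption of the illustration, since in the text the second components range over all of Dsub(S).

def approximants(top, F):
    """Yield R_0 = top and R_{k+1} = F(R_k); stop once the sequence stabilises."""
    R = frozenset(top)
    while True:
        yield R
        R_next = frozenset(F(R))
        if R_next == R:          # greatest fixed point of F reached
            return
        R = R_next

# Toy usage with a shrinking functional on pairs of small integers;
# for the failure-simulation functional the limit corresponds to ⊳^∞_FS.
F = lambda R: {(x, y) for (x, y) in R if x <= y}
print(list(approximants({(0, 1), (2, 1)}, F))[-1])   # frozenset({(0, 1)})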

A simple inductive argument ensures that ⊳^s_FS ⊆ ⊳^k_FS for every k ≥ 0, and therefore that ⊳^s_FS ⊆ ⊳^∞_FS. The converse is however not true in general.

A (non-probabilistic) counterexample is well known in the literature; it makes essential use of infinite branching. Let P be the process rec x. a.x and s a state in a pLTS that starts by making an infinitary choice: for each k ≥ 1 it has the option to perform a sequence of a actions of length k in succession and then deadlock. This can be described by the infinitary CSP expression ⊓_{k=1}^∞ a^k. Then [P] ⋪^s_FS s, because the move [P] a−→ [P] cannot be matched by s. However, an easy inductive argument shows that [P] ⊳^k_FS [a^k] for every k, and therefore that [P] ⊳^∞_FS s.

Once we restrict our non-probabilistic systems to be finitely branching, however, a simple counting argument shows that ⊳^s_FS coincides with ⊳^∞_FS; see [6, Theorem 2.1] for the argument applied to bisimulation equivalence. In the probabilistic case we restrict to systems that are both finite-state and finitely branching, and the effect of that is captured by topological compactness. Finiteness is lost unavoidably when we remember that, for example, the process a.0 ⊓ b.0 can move via =⇒ to a distribution [a.0] p⊕ [b.0] for any of the uncountably many probabilities p ∈ [0,1]. Nevertheless, those uncountably many weak transitions can be generated by arbitrary interpolation of the two transitions [a.0 ⊓ b.0] τ−→ [a.0] and [a.0 ⊓ b.0] τ−→ [b.0], and that is the key structural property that compactness captures.

Because compactness follows from closure and boundedness, we approach this topic via closure.

Note that the metric spaces (Dsub(S), d_1), with d_1(∆, Θ) = max_{s∈S} |∆(s) − Θ(s)|, and (S → Dsub(S), d_2), with d_2(f, g) = max_{s∈S} d_1(f(s), g(s)), are complete. Let X be a subset of either Dsub(S) or S → Dsub(S). Clearly, X is bounded. So if X is closed, it is also compact.
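On a finite state space these metrics are immediate to compute. The small Python sketch below (our illustration, with made-up data) spells out the definitions, representing a subdistribution as a dictionary from states to probabilities with total mass at most 1.

def d1(Delta, Theta, S):
    """d_1(∆,Θ) = max over s in S of |∆(s) − Θ(s)|."""
    return max(abs(Delta.get(s, 0.0) - Theta.get(s, 0.0)) for s in S)

def d2(f, g, S):
    """d_2(f,g) = max over s in S of d_1(f(s), g(s))."""
    return max(d1(f[s], g[s], S) for s in S)

S = ["s", "t"]
print(d1({"s": 0.5, "t": 0.25}, {"s": 0.25}, S))   # 0.25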

Definition 6.26. A relation R ⊆ S×Dsub(S) is said to be closed if for every s ∈ S the set s·R = {∆ | s R ∆} is closed.

Two examples of closed relations are =⇒ and a=⇒ for any action a, as shown by Lemma 6.17 and Corollary 6.6. Our next step is to show that each of the relations ⊳^k_FS is closed. This requires some preliminary results.

Lemma 6.34. Let R ⊆ S×Dsub(S) be closed. Then Ch(R) is also closed.

Proof. Straightforward. ⊓⊔

Corollary 6.10. Let R ⊆ S×Dsub(S) be closed and convex. Then R† is also closed.

Proof. For any ∆ ∈ Dsub(S), we know from Proposition 6.1 that

∆·R† = {Exp_∆(f) | f ∈ Ch(R)}.

The function Exp_∆(−) is continuous. By Lemma 6.34 the set of choice functions of R is closed, and it is also bounded, thus compact. Its image is then also compact, thus closed. ⊓⊔

Lemma 6.35. Let R ⊆ S×Dsub(S) be closed and convex, and let C ⊆ Dsub(S) be closed. Then the set {∆ | ∆·R† ∩ C ≠ ∅} is also closed.

Proof. First define E : Dsub(S)×(S → Dsub(S)) → Dsub(S) by E(Θ, f) = Exp_Θ(f), which is obviously continuous. Then we know from the previous lemma that Ch(R) is closed. Finally, let

Z = π_1(E^{−1}(C) ∩ (Dsub(S)×Ch(R)))

where π_1 is the projection onto the first component of a pair. We observe that the continuity of E ensures that the inverse image of the closed set C is closed. Furthermore, E^{−1}(C) ∩ (Dsub(S)×Ch(R)) is compact because it is both closed and bounded. Its image under the continuous function π_1 is also compact. It follows that Z is closed. But Z = {∆ | ∆·R† ∩ C ≠ ∅} because

∆ ∈ Z iff (∆, f) ∈ E^{−1}(C) for some f ∈ Ch(R)
      iff E(∆, f) ∈ C for some f ∈ Ch(R)
      iff Exp_∆(f) ∈ C for some f ∈ Ch(R)
      iff ∆ R† ∆′ for some ∆′ ∈ C.

The reasoning in the last line is an application of Proposition 6.1, which requires the convexity of R. ⊓⊔

An immediate corollary of this last result is:

Corollary 6.11. In a finitary pLTS the following sets are closed:

(i) {∆ | ∆ =⇒ ε}
(ii) {∆ | ∆ =⇒ A↛}

Proof. By Lemma 6.17, the relation =⇒ is closed and convex. Therefore, we can apply the previous lemma with C = {ε} to obtain the first result. To obtain the second we apply it with C = {Θ | Θ A↛}, which is easily seen to be closed. ⊓⊔

The result is also used in the proof of:

Proposition 6.13. In a finitary pLTS, for every k ≥ 0 the relation ⊳^k_FS is closed and convex.

Proof. By induction on k. For k = 0 it is obvious. So let us assume that ⊳^k_FS is closed and convex. We have to show that

s·⊳^{k+1}_FS is closed and convex, for every state s.   (6.24)

If s =⇒ ε then this follows from the corollary above, since in this case s·⊳^{k+1}_FS coincides with {∆ | ∆ =⇒ ε}. So let us assume that this is not the case.

For every A ⊆ Act let R_A = {∆ | ∆ =⇒ A↛}, which we know by the corollary above to be closed, and which is obviously convex. Also, for every Θ and α we let

G_{Θ,α} := {∆ | (∆·α=⇒) ∩ (Θ·(⊳^k_FS)†) ≠ ∅}.

By Corollary 6.7, the relation α=⇒ is lifted from a closed convex relation. By Corollary 6.10, the assumption that ⊳^k_FS is closed and convex implies that (⊳^k_FS)† is also closed. So we can appeal to Lemma 6.35 and conclude that each G_{Θ,α} is closed. By Definition 6.2(2) it is also easy to see that G_{Θ,α} is convex. But it follows that s·⊳^{k+1}_FS is also closed and convex, as it can be written as

⋂{R_A | s A↛} ∩ ⋂{G_{Θ,α} | s α−→ Θ}.   ⊓⊔

Before the main result of this section we need one more technical lemma.

Lemma 6.36. Let S be a finite set of states. Suppose R_k ⊆ S×Dsub(S) is a sequence of closed convex relations such that R_{k+1} ⊆ R_k. Then

⋂_{k=0}^∞ (R_k)† ⊆ (⋂_{k=0}^∞ R_k)†.

Proof. Let R_∞ denote ⋂_{k=0}^∞ R_k, and suppose ∆ (R_k)† Θ for every k ≥ 0. We have to show that ∆ (R_∞)† Θ.

Let G = {f : S → Dsub(S) | Θ = Exp_∆(f)}, which is easily seen to be a closed set. For each k we know from Lemma 6.34 that the set Ch(R_k) is closed. Finally, consider the collection of closed sets H_k = Ch(R_k) ∩ G; since ∆ (R_k)† Θ, Proposition 6.1 assures us that all of these are non-empty. Also H_{k+1} ⊆ H_k, and therefore by the finite-intersection property (Theorem 2.4) the set ⋂_{k=0}^∞ H_k is also non-empty.

Let f be an arbitrary element of this intersection. For any state s ∈ dom(R_∞), since dom(R_∞) ⊆ dom(R_k) we have s R_k f(s) for every k ≥ 0, that is, s R_∞ f(s). So f is a choice function for R_∞, i.e. f ∈ Ch(R_∞). From convexity and Proposition 6.1 it follows that ∆ (R_∞)† Exp_∆(f). But from the definition of G we know that Θ = Exp_∆(f), and the required result follows. ⊓⊔

Theorem 6.14. In a finitary pLTS, s ⊳^s_FS Θ if and only if s ⊳^∞_FS Θ.

Proof. Since ⊳^s_FS ⊆ ⊳^∞_FS it is sufficient to show the opposite inclusion, which by definition holds if ⊳^∞_FS is a failure simulation, viz. if ⊳^∞_FS ⊆ F(⊳^∞_FS). Suppose s ⊳^∞_FS Θ, which means that s ⊳^k_FS Θ for every k ≥ 0. According to Definition 6.21, in order to show s F(⊳^∞_FS) Θ we have to establish three properties, the first and last of which are trivial (for they are independent of the argument of F).

So suppose s α−→ ∆′. We have to show that Θ α=⇒ Θ′ for some Θ′ such that ∆′ (⊳^∞_FS)† Θ′. For every k ≥ 0 there exists some Θ′_k such that Θ α=⇒ Θ′_k and ∆′ (⊳^k_FS)† Θ′_k. Now construct the sets

D_k = {Θ′ | Θ α=⇒ Θ′ and ∆′ (⊳^k_FS)† Θ′}.

From Lemma 6.17 and Proposition 6.13 we know that these are closed. They are also non-empty, and D_{k+1} ⊆ D_k. So by the finite-intersection property the set ⋂_{k=0}^∞ D_k is non-empty. For any Θ′ in it we know Θ α=⇒ Θ′ and ∆′ (⊳^k_FS)† Θ′ for every k ≥ 0. By Proposition 6.13, the relations ⊳^k_FS are all closed and convex. Therefore, Lemma 6.36 may be applied to them, which enables us to conclude ∆′ (⊳^∞_FS)† Θ′. ⊓⊔

For Theorem 6.14 to hold, it is crucial that the pLTS is assumed to be finitary.

Example 6.25. Consider an infinitely branching pLTS with states s, t, u, v and 0, whose transitions are

• s a−→ 0 1/2⊕ s
• t a−→ 0 and t a−→ t
• u a−→ u
• v τ−→ u p⊕ t for all p ∈ (0,1).

This is a finite-state but not finitely branching system, due to the infinite branching in v. We have that s ⊳^k_FS v for all k ≥ 0, but we do not have s ⊳^s_FS v.

We first observe that s ⊳^s_FS v does not hold, because s will eventually deadlock with probability 1, whereas a fraction of v will go to u and never deadlock.

We now show that s ⊳^k_FS v for all k ≥ 0. For any k we start the simulation by choosing the move v τ−→ (u 1/2^k⊕ t). By induction on k we show that

s ⊳^k_FS (u 1/2^k⊕ t).   (6.25)

The base case k = 0 is trivial. So suppose we already have (6.25). We now show that s ⊳^{k+1}_FS (u 1/2^{k+1}⊕ t). Neither s nor t nor u can diverge or refuse {a}, so the only relevant move is the a-move. We know that s can do the move s a−→ 0 1/2⊕ s. This can be matched by (u 1/2^{k+1}⊕ t) a−→ (0 1/2⊕ (u 1/2^k⊕ t)).

Analogously to what we did for ⊳^s_FS, we also give an inductive characterisation of ⊑FS: for every k ≥ 0, let Θ ⊑^k_FS ∆ if there exists a transition Θ =⇒ Θmatch such that ∆ (⊳^k_FS)† Θmatch, and let ⊑^∞_FS denote ⋂_{k=0}^∞ ⊑^k_FS.

Corollary 6.12. In a finitary pLTS, Θ ⊑FS ∆ if and only if Θ ⊑^∞_FS ∆.

Proof. Since ⊳^s_FS ⊆ ⊳^k_FS for every k ≥ 0, it is straightforward to prove one direction: Θ ⊑FS ∆ implies Θ ⊑^∞_FS ∆. For the converse, Θ ⊑^∞_FS ∆ means that for every k we have some Θ^k satisfying Θ =⇒ Θ^k and ∆ (⊳^k_FS)† Θ^k. By Proposition 6.7 we have to find some Θ^∞ such that Θ =⇒ Θ^∞ and ∆ (⊳^k_FS)† Θ^∞ for each k ≥ 0. This can be done exactly as in the proof of Theorem 6.14. ⊓⊔


6.7.2 The modal logic

We add to the modal language F given in Section 5.6 a new constant div, representing the ability of a process to diverge. The extended language, still written F in this chapter, has its set of modal formulae defined inductively as follows:

• div, ⊤ ∈ F,
• ref(X) ∈ F when X ⊆ Act,
• ⟨a⟩ϕ ∈ F when ϕ ∈ F and a ∈ Act,
• ϕ_1 ∧ ϕ_2 ∈ F when ϕ_1, ϕ_2 ∈ F,
• ϕ_1 p⊕ ϕ_2 ∈ F when ϕ_1, ϕ_2 ∈ F and p ∈ [0,1].

Relative to a given pLTS ⟨S, Act_τ, →⟩, the satisfaction relation |= ⊆ Dsub(S)×F is given by:

• ∆ |= ⊤ for any ∆ ∈ Dsub(S),
• ∆ |= div iff ∆ =⇒ ε,
• ∆ |= ref(X) iff ∆ =⇒ X↛,
• ∆ |= ⟨a⟩ϕ iff there is a ∆′ with ∆ a=⇒ ∆′ and ∆′ |= ϕ,
• ∆ |= ϕ_1 ∧ ϕ_2 iff ∆ |= ϕ_1 and ∆ |= ϕ_2,
• ∆ |= ϕ_1 p⊕ ϕ_2 iff there are ∆_1, ∆_2 ∈ Dsub(S) with ∆_1 |= ϕ_1 and ∆_2 |= ϕ_2, such that ∆ =⇒ p·∆_1 + (1−p)·∆_2.
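To fix intuitions, here is a Python sketch (ours, not part of the text) rendering the formulae of F as a datatype, together with the skeleton of the satisfaction relation. The three oracles on the pLTS are assumptions of the illustration: deciding ∆ =⇒ ε and ∆ =⇒ X↛, and enumerating the ∆′ with ∆ a=⇒ ∆′ or the splittings ∆ =⇒ p·∆_1 + (1−p)·∆_2, all need the weak-derivation machinery of Definition 6.4 and are not shown.

from dataclasses import dataclass
from typing import FrozenSet

class Formula: pass

@dataclass(frozen=True)
class Top(Formula): pass                  # ⊤

@dataclass(frozen=True)
class Div(Formula): pass                  # div

@dataclass(frozen=True)
class Ref(Formula):                       # ref(X)
    X: FrozenSet[str]

@dataclass(frozen=True)
class Diamond(Formula):                   # ⟨a⟩ϕ
    a: str
    phi: Formula

@dataclass(frozen=True)
class Conj(Formula):                      # ϕ1 ∧ ϕ2
    left: Formula
    right: Formula

@dataclass(frozen=True)
class PChoice(Formula):                   # ϕ1 p⊕ ϕ2
    p: float
    left: Formula
    right: Formula

def sat(delta, phi, lts):
    """Skeleton of ∆ |= ϕ; 'lts' must supply the weak-move oracles."""
    if isinstance(phi, Top):
        return True
    if isinstance(phi, Div):
        return lts.can_become_empty(delta)             # ∆ =⇒ ε
    if isinstance(phi, Ref):
        return lts.can_refuse(delta, phi.X)            # ∆ =⇒ X↛
    if isinstance(phi, Diamond):                       # ∆ a=⇒ ∆′ with ∆′ |= ϕ
        return any(sat(d, phi.phi, lts) for d in lts.weak_moves(delta, phi.a))
    if isinstance(phi, Conj):
        return sat(delta, phi.left, lts) and sat(delta, phi.right, lts)
    if isinstance(phi, PChoice):                       # ∆ =⇒ p·∆1 + (1−p)·∆2
        return any(sat(e1, phi.left, lts) and sat(e2, phi.right, lts)
                   for (e1, e2) in lts.weak_splittings(delta, phi.p))
    raise TypeError(f"unknown formula {phi!r}")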

We write Θ ⊑F ∆ when ∆ |= ϕ implies Θ |= ϕ for all ϕ ∈ F; note the opposing directions. This is because the modal formulae express “bad” properties of our processes, ultimately divergence and refusal: thus Θ ⊑F ∆ means that any bad thing the implementation ∆ does must have been allowed by the specification Θ.

For rpCSP processes we use P ⊑F Q to abbreviate [P] ⊑F [Q] in the pLTS given in Section 6.2.

The set of formulae used here is obtained from that in Section 5.6 by adding one operator, div. But the interpretation is quite different, as it uses the new silent-move relation =⇒. As a result our satisfaction relation no longer enjoys a natural, and expected, property. In the non-probabilistic setting, if a recursive CCS process P satisfies a modal formula from the Hennessy-Milner logic, then there is a recursion-free finite unwinding of P that also satisfies it. Intuitively this reflects the fact that if a non-probabilistic process does a bad thing, then at some (finite) point it must actually do it. But this is not true in our new, probabilistic setting: for example, the process Q_1 given in Example 6.8 can do an a and then refuse anything; but all finite unwindings of it achieve that with probability strictly less than one. That is, whereas [Q_1] |= ⟨a⟩⊤, no finite unwinding of Q_1 satisfies ⟨a⟩⊤.

Our first task is to show that the interpretation of the logic is consistent with the operational semantics of processes.

Theorem 6.15. If Θ ⊑FS ∆ then Θ ⊑F ∆.


Proof. We must show that if Θ ⊑FS ∆ then whenever ∆ |= ϕ we have Θ |= ϕ. The proof proceeds by induction on ϕ:

• The case ϕ = ⊤ is trivial.
• Suppose ϕ is div. Then ∆ |= div means that ∆ =⇒ ε, and we have to show Θ =⇒ ε, which is immediate from Lemma 6.20.
• Suppose ϕ is ⟨a⟩ϕ_a. In this case we have ∆ a=⇒ ∆′ for some ∆′ that satisfies ∆′ |= ϕ_a. The existence of a corresponding Θ′ is immediate from Definition 6.19, Clause 1, and the induction hypothesis.
• The case where ϕ is ref(X) follows by Definition 6.19, Clause 2, and the case ϕ_1 ∧ ϕ_2 by induction.
• When ϕ is ϕ_1 p⊕ ϕ_2 we appeal again to Definition 6.19, Clause 1, using α := τ, to infer the existence of suitable Θ′_1 and Θ′_2. ⊓⊔

We proceed to show that the converse of this theorem also holds, so that the failure simulation preorder ⊑FS coincides with the logical preorder ⊑F.

The idea is to mimic the development in Section 5.7, by designing characteristic formulae that capture the behaviour of states in a pLTS. But here the behaviour is characterised not relative to ⊳^s_FS, but rather to the sequence of approximating relations ⊳^k_FS.

Definition 6.27. In a finitary pLTS ⟨S, Act_τ, →⟩, the kth characteristic formulae ϕ^k_s and ϕ^k_∆ of states s ∈ S and subdistributions ∆ ∈ Dsub(S) are defined inductively as follows:

• ϕ^0_s = ⊤ and ϕ^0_∆ = ⊤,
• ϕ^{k+1}_s = div, provided s =⇒ ε,
• ϕ^{k+1}_s = ref(X) ∧ ⋀_{s a−→ ∆} ⟨a⟩ϕ^k_∆, where X = {a ∈ Act | s a↛}, provided s τ↛,
• ϕ^{k+1}_s = ⋀_{s a−→ ∆} ⟨a⟩ϕ^k_∆ ∧ ⋀_{s τ−→ ∆} ϕ^k_∆ otherwise,
• and ϕ^{k+1}_∆ = div 1−|∆|⊕ (⨁_{s∈⌈∆⌉} (∆(s)/|∆|)·ϕ^{k+1}_s).
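Reusing the formula datatype sketched after the satisfaction relation above, Definition 6.27 transcribes directly into a recursion. The fragment below is our own illustration: the pLTS oracles diverges, stable, steps, actions and enables are assumptions, and the n-ary conjunction and the weighted probabilistic combination are folded into the binary connectives of F.

from functools import reduce

def conj(formulas):
    """Fold a finite family into nested binary conjunctions; empty gives ⊤."""
    formulas = list(formulas)
    return reduce(Conj, formulas) if formulas else Top()

def weighted(pairs):
    """Fold a weighted family (positive weights summing to 1) into p⊕ nodes."""
    (p, phi), rest = pairs[0], pairs[1:]
    if not rest:
        return phi
    return PChoice(float(p), phi,
                   weighted([(q / (1 - p), psi) for (q, psi) in rest]))

def char_state(s, k, lts):
    """ϕ^k_s of Definition 6.27."""
    if k == 0:
        return Top()
    if lts.diverges(s):                                   # s =⇒ ε
        return Div()
    diamonds = [Diamond(a, char_dist(d, k - 1, lts))
                for (a, d) in lts.steps(s) if a != "tau"]
    if lts.stable(s):                                     # s τ↛
        X = frozenset(a for a in lts.actions if not lts.enables(s, a))
        return Conj(Ref(X), conj(diamonds))
    taus = [char_dist(d, k - 1, lts)
            for (a, d) in lts.steps(s) if a == "tau"]
    return Conj(conj(diamonds), conj(taus))

def char_dist(delta, k, lts):
    """ϕ^k_∆ of Definition 6.27, for a subdistribution ∆ as a dict."""
    if k == 0:
        return Top()
    m = sum(delta.values())                               # |∆|
    if m == 0:
        return Div()
    body = weighted([(w / m, char_state(s, k, lts))
                     for (s, w) in delta.items()])
    return PChoice(1.0 - m, Div(), body)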

Lemma 6.37. For every k ≥ 0, s ∈ S and ∆ ∈ Dsub(S) we have s |= ϕ^k_s and ∆ |= ϕ^k_∆.

Proof. By induction on k, the case k = 0 being trivial. The inductive case of the first statement proceeds by an analysis of the possible moves from s, from which that of the second statement follows immediately. ⊓⊔

Lemma 6.38. For k ≥ 0,

(i) Θ |= ϕ^k_s implies s ⊳^k_FS Θ,
(ii) Θ |= ϕ^k_∆ implies Θ =⇒ Θmatch for some Θmatch such that ∆ (⊳^k_FS)† Θmatch,
(iii) Θ |= ϕ^k_∆ implies Θ ⊑^k_FS ∆.


Proof. For every k, part (iii) follows trivially from (ii). We prove (i) and (ii) simultaneously, by induction on k, with the case k = 0 being trivial. The inductive case, for k+1, follows the argument in the proof of Lemma 5.12.

(i) First suppose s =⇒ ε. Then ϕ^{k+1}_s = div and therefore Θ |= div, which gives the required Θ =⇒ ε.
Now suppose s τ−→ ∆. Here there are two cases. If in addition s =⇒ ε, we have already seen that Θ =⇒ ε, and this is the required matching move from Θ, since ∆ (⊳^k_FS)† ε. So let us assume that s =⇒ ε does not hold. Then by the definition of ϕ^{k+1}_s we must have Θ |= ϕ^k_∆, and we obtain the required matching move from Θ by the inductive hypothesis: induction on part (ii) gives some Θ′ such that Θ =⇒ Θ′ and ∆ (⊳^k_FS)† Θ′.
The matching move for s a−→ ∆ is obtained in a similar manner.
Finally suppose s X↛. Since this implies s τ↛, by the definition of ϕ^{k+1}_s we must have Θ |= ref(X), which actually means that Θ =⇒ X↛.

(ii) Note that ϕ^{k+1}_∆ = div 1−|∆|⊕ (⨁_{s∈⌈∆⌉} (∆(s)/|∆|)·ϕ^{k+1}_s), and therefore by definition Θ =⇒ (1−|∆|)·Θ_div + ∑_{s∈⌈∆⌉} ∆(s)·Θ_s with Θ_div |= div and Θ_s |= ϕ^{k+1}_s. By definition, Θ_div =⇒ ε, so by Theorem 6.5(i) and the reflexivity and transitivity of =⇒ we obtain Θ =⇒ ∑_{s∈⌈∆⌉} ∆(s)·Θ_s. By part (i) we have s ⊳^{k+1}_FS Θ_s for every s in ⌈∆⌉, which in turn means that ∆ (⊳^{k+1}_FS)† ∑_{s∈⌈∆⌉} ∆(s)·Θ_s. ⊓⊔

Theorem 6.16. In a finitary pLTS, Θ ⊑F ∆ if and only if Θ ⊑FS ∆.

Proof. One direction follows immediately from Theorem 6.15. For the opposite direction, suppose Θ ⊑F ∆. By Lemma 6.37 we have ∆ |= ϕ^k_∆, and hence Θ |= ϕ^k_∆, for all k ≥ 0. By part (iii) of the previous lemma we thus know that Θ ⊑^∞_FS ∆. That Θ ⊑FS ∆ now follows from Corollary 6.12. ⊓⊔

6.7.3 Characteristic tests for formulae

The import of Theorem 6.16 is that we can obtain completeness of the failure simulation preorder with respect to the must-testing preorder by designing, for each formula ϕ, a test that in some sense characterises the property of a process satisfying ϕ. This has been achieved for the pLTS generated by the recursion-free fragment of rpCSP in Section 5.7. Here we generalise that technique to the pLTS generated by the set of finitary rpCSP terms.

As in Section 5.7, the generation of these tests depends on crucial characteristics of the testing function A^d(−,−), which are summarised in Lemmas 6.39 and 6.42 below, corresponding to Lemmas 5.9 and 5.10, respectively.

Lemma 6.39. Let ∆ be an rpCSP process, and T, T_i be tests.

1. o ∈ A^d(ω, ∆) iff o = |∆|·ω⃗.
2. 0⃗ ∈ A^d(τ.ω, ∆) iff ∆ =⇒ ε.
3. 0⃗ ∈ A^d(⊓_{a∈X} a.ω, ∆) iff ∆ =⇒ X↛.
4. Suppose the action ω does not occur in the test T. Then o ∈ A^d(τ.ω □ a.T, ∆) with o(ω) = 0 iff there is a ∆′ ∈ Dsub(sCSP) with ∆ a=⇒ ∆′ and o ∈ A^d(T, ∆′).
5. o ∈ A^d(T_1 p⊕ T_2, ∆) iff o = p·o_1 + (1−p)·o_2 for certain o_i ∈ A^d(T_i, ∆).
6. o ∈ A^d(T_1 ⊓ T_2, ∆) if there are a q ∈ [0,1] and ∆_1, ∆_2 ∈ Dsub(sCSP) such that ∆ =⇒ q·∆_1 + (1−q)·∆_2 and o = q·o_1 + (1−q)·o_2 for certain o_i ∈ A^d(T_i, ∆_i).

Proof.
1. Since ω |Act ∆ ω−→, the states in the support of [ω |Act ∆] have no outgoing transitions other than ω. Therefore [ω |Act ∆] is the unique extreme derivative of itself, and as $[ω |Act ∆] = |∆|·ω⃗ we have A^d(ω, ∆) = {|∆|·ω⃗}.

2. (⇐) Assume ∆ =⇒ ε. By Lemma 6.27(1) we have τ.ω |Act ∆ =⇒ τ.ω |Act ε. All states involved in this derivation (that is, all states u in the support of the intermediate distributions ∆→_i and ∆×_i of Definition 6.4) have the form τ.ω |Act s, and thus satisfy u ω↛ for all ω ∈ Ω. Therefore it follows that [τ.ω |Act ∆] =⇒ [τ.ω |Act ε]. Trivially, [τ.ω |Act ε] = ε is stable, and hence an extreme derivative of [τ.ω |Act ∆]. Moreover, $ε = 0⃗, so 0⃗ ∈ A^d(τ.ω, ∆).
(⇒) Suppose 0⃗ ∈ A^d(τ.ω, ∆), i.e. there is some extreme derivative Γ of [τ.ω |Act ∆] such that $Γ = 0⃗. Given the operational semantics of rpCSP, all states u ∈ ⌈Γ⌉ must have one of the forms u = [τ.ω |Act t] or u = [ω |Act t]. As $Γ = 0⃗, the latter possibility cannot occur. It follows that the transitions contributing to the derivation [τ.ω |Act ∆] =⇒≻ Γ do not require any action from τ.ω, and in fact Γ has the form [τ.ω |Act ∆′] for some distribution ∆′ with ∆ =⇒ ∆′. As Γ must be stable, yet none of the states in its support are, it follows that ⌈Γ⌉ = ∅, i.e. ∆′ = ε.

3. Let T := ⊓_{a∈X} a.ω.
(⇐) Assume ∆ =⇒ ∆′ X↛ for some ∆′. Then T |Act ∆ =⇒ T |Act ∆′ by Lemma 6.27(1), and by the same argument as in the previous case we have [T |Act ∆] =⇒ [T |Act ∆′]. All states in the support of T |Act ∆′ are deadlocked. So [T |Act ∆] =⇒≻ [T |Act ∆′] and $(T |Act ∆′) = 0⃗. Thus 0⃗ ∈ A^d(T, ∆).
(⇒) Suppose 0⃗ ∈ A^d(T, ∆). By the very same reasoning as in case 2 we find that ∆ =⇒ ∆′ for some ∆′ such that T |Act ∆′ is stable. This implies ∆′ X↛.

4. Let T be a test in which the success action ω does not occur, and let U be an abbreviation for τ.ω □ a.T.
(⇐) Assume there is a ∆′ ∈ Dsub(sCSP) with ∆ a=⇒ ∆′ and o ∈ A^d(T, ∆′). Without loss of generality we may assume that ∆ =⇒ ∆pre a−→ ∆post =⇒ ∆′. Using Lemma 6.27(1) and (3), and the same reasoning as in the previous cases, [U |Act ∆] =⇒ [U |Act ∆pre] τ−→ [T |Act ∆post] =⇒ [T |Act ∆′] =⇒ Γ for a stable subdistribution Γ with $Γ = o. It follows that o ∈ A^d(U, ∆).
(⇒) Suppose o ∈ A^d(U, ∆) with o(ω) = 0. Then there is a stable subdistribution Γ such that [U |Act ∆] =⇒ Γ and $Γ = o. Since o(ω) = 0, there is no state in the support of Γ of the form ω |Act t. Hence there must be a ∆′ ∈ Dsub(sCSP) such that ∆ =⇒ a−→ ∆′ and [T |Act ∆′] =⇒ Γ. It follows that o ∈ A^d(T, ∆′).

5. (⇐) Assume o_i ∈ A^d(T_i, ∆) for i = 1,2. Then [T_i |Act ∆] =⇒ Γ_i for some stable Γ_i with $Γ_i = o_i. By Theorem 6.5(i) we have

[(T_1 p⊕ T_2) |Act ∆] = p·[T_1 |Act ∆] + (1−p)·[T_2 |Act ∆] =⇒ p·Γ_1 + (1−p)·Γ_2,

and p·Γ_1 + (1−p)·Γ_2 is stable. Moreover, $(p·Γ_1 + (1−p)·Γ_2) = p·o_1 + (1−p)·o_2, so o ∈ A^d(T_1 p⊕ T_2, ∆).
(⇒) Suppose o ∈ A^d(T_1 p⊕ T_2, ∆). Then there is a stable Γ with $Γ = o such that [(T_1 p⊕ T_2) |Act ∆] = p·[T_1 |Act ∆] + (1−p)·[T_2 |Act ∆] =⇒ Γ. By Theorem 6.5(ii) there are Γ_i for i = 1,2 such that [T_i |Act ∆] =⇒ Γ_i and Γ = p·Γ_1 + (1−p)·Γ_2. As Γ_1 and Γ_2 are stable, we have $Γ_i ∈ A^d(T_i, ∆) for i = 1,2. Moreover, o = $Γ = p·$Γ_1 + (1−p)·$Γ_2.

6. Suppose q ∈ [0,1] and ∆_1, ∆_2 ∈ Dsub(rpCSP) with ∆ =⇒ q·∆_1 + (1−q)·∆_2 and o_i ∈ A^d(T_i, ∆_i). Then there are stable Γ_i with [T_i |Act ∆_i] =⇒ Γ_i and $Γ_i = o_i. Now

[(T_1 ⊓ T_2) |Act ∆] =⇒ q·[(T_1 ⊓ T_2) |Act ∆_1] + (1−q)·[(T_1 ⊓ T_2) |Act ∆_2]
                     τ−→ q·[T_1 |Act ∆_1] + (1−q)·[T_2 |Act ∆_2]
                     =⇒ q·Γ_1 + (1−q)·Γ_2.

The latter subdistribution is stable and satisfies $(q·Γ_1 + (1−q)·Γ_2) = q·o_1 + (1−q)·o_2. Hence q·o_1 + (1−q)·o_2 ∈ A^d(T_1 ⊓ T_2, ∆). ⊓⊔

We also have the converse to part (6) of this lemma, obtained by mimicking Lemma 5.10. For that purpose we use two technical lemmas, whose proofs are similar to those of Lemmas 6.28 and 6.29, respectively.

Lemma 6.40. Suppose ∆ |A (T_1 ⊓ T_2) τ−→ Γ. Then there exist subdistributions ∆→, ∆×_1, ∆×_2, ∆next (possibly empty) such that

(i) ∆ = ∆→ + ∆×_1 + ∆×_2
(ii) ∆→ τ−→ ∆next
(iii) Γ = ∆next |A (T_1 ⊓ T_2) + ∆×_1 |A T_1 + ∆×_2 |A T_2

Proof. By Lemma 6.1, ∆ |A (T_1 ⊓ T_2) τ−→ Γ implies that

∆ = ∑_{i∈I} p_i·s_i,  s_i |A (T_1 ⊓ T_2) τ−→ Γ_i,  Γ = ∑_{i∈I} p_i·Γ_i,

for certain s_i ∈ S, Γ_i ∈ Dsub(sCSP) and ∑_{i∈I} p_i ≤ 1. Let J_1 = {i ∈ I | Γ_i = s_i |A T_1} and J_2 = {i ∈ I | Γ_i = s_i |A T_2}. Note that for each i ∈ (I−J_1−J_2) the subdistribution Γ_i has the form Γ′_i |A (T_1 ⊓ T_2), where s_i τ−→ Γ′_i. Now let

∆→ = ∑_{i∈(I−J_1−J_2)} p_i·s_i,  ∆×_k = ∑_{i∈J_k} p_i·s_i (for k = 1,2),  ∆next = ∑_{i∈(I−J_1−J_2)} p_i·Γ′_i.

By construction (i) and (iii) are satisfied, and (ii) follows by property (2) of Definition 6.2. ⊓⊔

Lemma 6.41. If ∆ |A (T_1 ⊓ T_2) =⇒≻ Ψ then there are Φ_1 and Φ_2 such that

(i) ∆ =⇒ Φ_1 + Φ_2
(ii) Φ_1 |A T_1 + Φ_2 |A T_2 =⇒≻ Ψ

Proof. Suppose ∆_0 |A (T_1 ⊓ T_2) =⇒≻ Ψ. We know from Definition 6.4 that there is a collection of subdistributions Ψ_k, Ψ→_k, Ψ×_k, for k ≥ 0, satisfying the properties

∆_0 |A (T_1 ⊓ T_2) = Ψ_0 = Ψ→_0 + Ψ×_0
Ψ→_0 τ−→ Ψ_1 = Ψ→_1 + Ψ×_1
...
Ψ→_k τ−→ Ψ_{k+1} = Ψ→_{k+1} + Ψ×_{k+1}
...
Ψ = ∑_{k=0}^∞ Ψ×_k

and Ψ is stable. Take Γ_0 := Ψ_0. By induction on k ≥ 0, we find distributions Γ_{k+1}, ∆→_k, ∆×_{k1}, ∆×_{k2}, ∆_{k+1} such that

(i) ∆_k |A (T_1 ⊓ T_2) τ−→ Γ_{k+1}
(ii) Γ_{k+1} ≤ Ψ_{k+1}
(iii) ∆_k = ∆→_k + ∆×_{k1} + ∆×_{k2}
(iv) ∆→_k τ−→ ∆_{k+1}
(v) Γ_{k+1} = ∆_{k+1} |A (T_1 ⊓ T_2) + ∆×_{k1} |A T_1 + ∆×_{k2} |A T_2

Induction step: assume we already have Γ_k and ∆_k. Note that

∆_k |A (T_1 ⊓ T_2) ≤ Γ_k ≤ Ψ_k = Ψ→_k + Ψ×_k

and T_1 ⊓ T_2 can make a τ move. Since Ψ is stable, we know that there are two possibilities: either Ψ×_k = ε or Ψ×_k τ↛. In both cases it holds that

∆_k |A (T_1 ⊓ T_2) ≤ Ψ→_k.

Proposition 6.2 gives a subdistribution Γ_{k+1} ≤ Ψ_{k+1} such that there exists the transition ∆_k |A (T_1 ⊓ T_2) τ−→ Γ_{k+1}. Now apply Lemma 6.40.

Let Φ_1 = ∑_{k=0}^∞ ∆×_{k1} and Φ_2 = ∑_{k=0}^∞ ∆×_{k2}. By (iii) and (iv) above we obtain a weak τ move ∆ =⇒ Φ_1 + Φ_2. For k ≥ 0, let Γ→_k := ∆_k |A (T_1 ⊓ T_2), let Γ×_0 := ε and let Γ×_{k+1} := ∆×_{k1} |A T_1 + ∆×_{k2} |A T_2. Moreover, let Γ := Φ_1 |A T_1 + Φ_2 |A T_2. Now all conditions of Definition 6.5 are fulfilled, so ∆_0 |A (T_1 ⊓ T_2) =⇒ Γ is an initial segment of ∆_0 |A (T_1 ⊓ T_2) =⇒ Ψ. By Proposition 6.4 we have Φ_1 |A T_1 + Φ_2 |A T_2 =⇒≻ Ψ. ⊓⊔


Lemma 6.42. If o ∈ A^d(T_1 ⊓ T_2, ∆) then there are a q ∈ [0,1] and subdistributions ∆_1, ∆_2 ∈ Dsub(sCSP) such that ∆ =⇒ q·∆_1 + (1−q)·∆_2 and o = q·o_1 + (1−q)·o_2 for certain o_i ∈ A^d(T_i, ∆_i).

Proof. If o ∈ A^d(T_1 ⊓ T_2, ∆) then there is an extreme derivative of [(T_1 ⊓ T_2) |Act ∆], say Ψ, such that $Ψ = o. By Lemma 6.41 there are Φ_1, Φ_2 such that

(i) ∆ =⇒ Φ_1 + Φ_2
(ii) [T_1 |Act Φ_1] + [T_2 |Act Φ_2] =⇒≻ Ψ.

By Theorem 6.5(ii) there are subdistributions Ψ_1 and Ψ_2 such that Ψ = Ψ_1 + Ψ_2 and [T_i |Act Φ_i] =⇒≻ Ψ_i for i = 1,2. Let o′_i = $Ψ_i. As Ψ_i is stable, we obtain that o′_i ∈ A^d(T_i, Φ_i). We also have o = $Ψ = $Ψ_1 + $Ψ_2 = o′_1 + o′_2.

We now distinguish two cases:

• If Ψ_1 = ε, then we take ∆_i = Φ_i and o_i = o′_i for i = 1,2, and q = 0. Symmetrically, if Ψ_2 = ε, then we take ∆_i = Φ_i and o_i = o′_i for i = 1,2, and q = 1.
• If Ψ_1 ≠ ε and Ψ_2 ≠ ε, then we let q = |Φ_1|/|Φ_1+Φ_2|, ∆_1 = (1/q)·Φ_1, ∆_2 = (1/(1−q))·Φ_2, o_1 = (1/q)·o′_1 and o_2 = (1/(1−q))·o′_2.

It is easy to check that q·∆_1 + (1−q)·∆_2 = Φ_1 + Φ_2, that q·o_1 + (1−q)·o_2 = o′_1 + o′_2, and that o_i ∈ A^d(T_i, ∆_i) for i = 1,2. ⊓⊔

Proposition 6.14. For every formula ϕ ∈ F there exists a pair (T_ϕ, v_ϕ), with T_ϕ an Ω-test and v_ϕ ∈ [0,1]^Ω, such that

∆ |= ϕ if and only if ∃o ∈ A^d(T_ϕ, ∆) : o ≤ v_ϕ.   (6.26)

T_ϕ is called a characteristic test of ϕ and v_ϕ its target value.

Proof. The proof is adapted from that of Lemma 5.13, from where we take the following remarks. As in vector-based testing, Ω is assumed to be countable (cf. page 74) and Ω-tests are finite expressions, so for every Ω-test there is an ω ∈ Ω not occurring in it. Furthermore, if a pair (T_ϕ, v_ϕ) satisfies requirement (6.26), then any pair obtained from (T_ϕ, v_ϕ) by bijectively renaming the elements of Ω also satisfies that requirement. Hence two given characteristic tests can be assumed to be Ω-disjoint, meaning that no ω ∈ Ω occurs in both of them.

Our modal logic F is identical to that used in Section 5.7, with the addition of one extra constant div. So we need a new characteristic test and target value for this latter formula, and reuse those from Section 5.7 for the rest of the language.

• Let ϕ = ⊤. Take T_ϕ := ω for some ω ∈ Ω, and v_ϕ := ω⃗.
• Let ϕ = div. Take T_ϕ := τ.ω for some ω ∈ Ω, and v_ϕ := 0⃗.
• Let ϕ = ref(X) with X ⊆ Act. Take T_ϕ := ⊓_{a∈X} a.ω for some ω ∈ Ω, and set v_ϕ := 0⃗.
• Let ϕ = ⟨a⟩ψ. By induction, ψ has a characteristic test T_ψ with target value v_ψ. Take T_ϕ := τ.ω □ a.T_ψ, where ω ∈ Ω does not occur in T_ψ, and v_ϕ := v_ψ.
• Let ϕ = ϕ_1 ∧ ϕ_2. Choose an Ω-disjoint pair (T_i, v_i) of characteristic tests T_i with target values v_i, for i = 1,2. Furthermore, let p ∈ (0,1] be chosen arbitrarily, and take T_ϕ := T_1 p⊕ T_2 and v_ϕ := p·v_1 + (1−p)·v_2.
• Let ϕ = ϕ_1 p⊕ ϕ_2. Again choose an Ω-disjoint pair (T_i, v_i) of characteristic tests T_i with target values v_i, i = 1,2, this time ensuring that there are two distinct success actions ω_1, ω_2 that do not occur in any of these tests. Let T′_i := T_i 1/2⊕ ω_i and v′_i := (1/2)·v_i + (1/2)·ω⃗_i. Note that for i = 1,2 we have that T′_i is also a characteristic test of ϕ_i, with target value v′_i. Take T_ϕ := T′_1 ⊓ T′_2 and v_ϕ := p·v′_1 + (1−p)·v′_2.
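This construction is purely syntax-directed, so it transcribes directly into code. The sketch below (ours, reusing the formula datatype from Section 6.7.2) builds the pair (T_ϕ, v_ϕ): tests are rendered as rpCSP-like strings with |~|, [] and p(+) standing for ⊓, □ and p⊕, target values are dictionaries with zero default, and fresh() allocates success actions never used before, which is what yields the Ω-disjointness used in the proof; all of these representation choices are assumptions of the illustration.

import itertools

_fresh = itertools.count()
def fresh():
    """A success action not occurring in any test built so far."""
    return f"w{next(_fresh)}"

def char_test(phi, p=0.5):
    """Return (T_ϕ, v_ϕ) as in Proposition 6.14; p is the arbitrary
    element of (0,1] used in the conjunction case."""
    if isinstance(phi, Top):
        w = fresh()
        return w, {w: 1.0}
    if isinstance(phi, Div):
        return f"tau.{fresh()}", {}
    if isinstance(phi, Ref):
        w = fresh()
        return " |~| ".join(f"{a}.{w}" for a in sorted(phi.X)), {}
    if isinstance(phi, Diamond):
        T, v = char_test(phi.phi)
        return f"(tau.{fresh()} [] {phi.a}.({T}))", v   # fresh ω not in T
    if isinstance(phi, Conj):
        (T1, v1), (T2, v2) = char_test(phi.left), char_test(phi.right)
        v = {w: p * v1.get(w, 0.0) + (1 - p) * v2.get(w, 0.0)
             for w in {*v1, *v2}}
        return f"({T1} {p}(+) {T2})", v
    if isinstance(phi, PChoice):
        (T1, v1), (T2, v2) = char_test(phi.left), char_test(phi.right)
        w1, w2, q = fresh(), fresh(), phi.p
        # T'_i = T_i 1/2⊕ ω_i  and  v'_i = 1/2·v_i + 1/2·ω_i
        v = {w: q * 0.5 * v1.get(w, 0.0) + (1 - q) * 0.5 * v2.get(w, 0.0)
             for w in {*v1, *v2}}
        v[w1], v[w2] = q * 0.5, (1 - q) * 0.5
        return f"(({T1} 0.5(+) {w1}) |~| ({T2} 0.5(+) {w2}))", v
    raise TypeError(f"unknown formula {phi!r}")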

Note that v_ϕ(ω) = 0 whenever ω ∈ Ω does not occur in T_ϕ.

As in the proof of Lemma 5.13 we now check, by induction on ϕ, that (6.26) above holds; the proof relies on Lemmas 6.39 and 6.42.

holds; the proof relies on Lemmas 6.39 and 6.42.

• Let ϕ = ⊤. For all ∆ ∈ Dsub(sCSP) we have∆ |= ϕ and it always holds that∃o∈ A d(Tϕ ,∆ ) : o≤ vϕ , using Lemma 6.39(1).

• Let ϕ = div. Suppose∆ |= ϕ . Then we have that∆ =⇒ ε. By Lemma 6.39(2),−→0 ∈ A d(Tϕ ,∆).

Now suppose∃o ∈ A d(Tϕ ,∆) : o ≤ vϕ . This implieso =−→0 , so we apply

Lemma 6.39(2) and obtain∆ =⇒ ε . Hence∆ |= ϕ .

• Let ϕ = ref(X) with X ⊆ Act. Suppose∆ |= ϕ . Then∆ =⇒ 6X−→. We can useLemma 6.39(3) and obtain

−→0 ∈ A d(Tϕ ,∆).

Now suppose∃o ∈ A d(Tϕ ,∆) : o ≤ vϕ . This implieso =−→0 , so∆ =⇒ 6A−→ by

Lemma 6.39(3). Hence∆ |= ϕ .

• Let ϕ = 〈a〉ψ with a∈ Act. Suppose∆ |= ϕ . Then there is a∆ ′ with ∆ a=⇒ ∆ ′

and∆ ′ |= ψ . By induction,∃o∈ A d(Tψ ,∆ ′) : o≤ vψ . By Lemma 6.39(4), wegeto∈ A d(Tϕ ,∆ ).

Now suppose∃o ∈ A d(Tϕ ,∆ ) : o ≤ vϕ . This implieso(ω) = 0, hence byLemma 6.39(4) there is a∆ ′ with ∆ a

=⇒ ∆ ′ ando∈ A d(Tψ ,∆ ′). By induction,∆ ′ |=ψ , so∆ |=ϕ .

• Let ϕ = ϕ1∧ϕ2 and suppose∆ |= ϕ . Then∆ |= ϕi for i =1,2 and hence, byinduction,∃oi ∈ A d(Ti ,∆) : oi ≤ vi . Thuso := p·o1+(1−p)·o2 ∈ A d(Tϕ ,∆ )by Lemma 6.39(5), ando≤ vϕ .

Now suppose∃o∈ A d(Tϕ ,∆) : o≤ vϕ . Then, using Lemma 6.39(5), we knowthato= p·o1+(1−p)·o2 for certainoi ∈ A d(Ti ,∆). Recall thatT1,T2 areΩ -disjoint tests. One hasoi ≤ vi for both i = 1,2, for if oi(ω) > vi(ω) for somei =1 or 2 andω ∈ Ω , thenω must occur inTi and hence cannot occur inT3−i .This impliesv3−i(ω) = 0 and thuso(ω) > vϕ(ω), in contradiction with theassumption. By induction,∆ |= ϕi for i = 1,2, and hence∆ |= ϕ .

• Let ϕ = ϕ1 p⊕ ϕ2. Suppose∆ |= ϕ . Then there are∆1,∆2 ∈ Dsub(sCSP) with∆1 |= ϕ1 and∆2 |= ϕ2 such that∆ =⇒ p·∆1 +(1−p)·∆2. By induction, fori =1,2 there areoi ∈ A d(Ti ,∆i) with oi ≤ vi . Hence, there areo′i ∈ A d(T ′

i ,∆i)with o′i ≤ v′i . Thuso := p·o′1+(1−p)·o′2 ∈ A d(Tϕ ,∆) by Lemma 6.39(6), ando≤ vϕ .

Page 229: Semantics of Probabilistic Processes - SJTU

6.8 Simulations and may testing 219

Now suppose∃o ∈ A d(Tϕ ,∆ ) : o ≤ vϕ . Then, by Lemma 6.42, there areq ∈ [0,1] and ∆1,∆2 ∈ Dsub(sCSP) such that∆ =⇒ q·∆1 + (1−q)·∆2 ando= q·o′1+(1−q)·o′2 for certaino′i ∈ A d(T ′

i ,∆i). Now∀i : o′i(ωi)= v′i(ωi)=12,

so, using thatT1,T2 areΩ -disjoint tests,

12

q= q·o′1(ω1) = o(ω1)≤ vϕ(ω1) = p·v′1(ω1) =12

p

and likewise

12(1−q) = (1−q)·o′2(ω2) = o(ω2)≤ vϕ(ω2) = (1−p)·v′2(ω2) =

12(1−p).

Together, these inequalities say thatq= p. Exactly as in the previous case oneobtainso′i ≤ v′i for bothi = 1,2. Given thatT ′

i = Ti 12⊕ωi , using Lemma 6.39(5),

it must be thato′i =12oi +

12−→ωi for someoi ∈ A d(Ti ,∆i) with oi ≤ vi . By induc-

tion, ∆i |= ϕi for i = 1,2, and hence∆ |= ϕ .

⊓⊔

Theorem 6.17. If Θ ⊑^Ω_pmust ∆ then Θ ⊑F ∆.

Proof. Suppose Θ ⊑^Ω_pmust ∆ and ∆ |= ϕ for some ϕ ∈ F. Let T_ϕ be a characteristic test of ϕ with target value v_ϕ. Then Proposition 6.14 yields ∃o ∈ A^d(T_ϕ, ∆) : o ≤ v_ϕ, and hence, given that Θ ⊑^Ω_pmust ∆, by the Smyth preorder we are guaranteed to have ∃o′ ∈ A^d(T_ϕ, Θ) : o′ ≤ v_ϕ. Thus Θ |= ϕ by Proposition 6.14 again. ⊓⊔

Corollary 6.13. For any finitary processes P and Q, if P ⊑pmust Q then P ⊑FS Q.

Proof. From Theorems 6.17 and 6.16 we know that if P ⊑^Ω_pmust Q then P ⊑FS Q. Theorem 4.7 tells us that Ω-testing is reducible to scalar testing. So the required result follows. ⊓⊔

Remark 6.4. Note that in our testing semantics we have allowed tests to be finitary. The proof of Proposition 6.14 actually tells us that if two processes are behaviourally different, then they can be distinguished by characteristic tests that are always finite. Therefore, Corollary 6.13 still holds if tests are required to be finite.

6.8 Simulations and may testing

In this section we follow the same strategy as for failure simulations and testing (Section 6.6), except that we restrict our treatment to full distributions. This is possible because partial distributions are not necessary for this case, and it is desirable because the approach becomes simpler as a result.

Definition 6.28 (Simulation preorder). Define ⊑S to be the largest relation in D(S)×D(S) such that if ∆ ⊑S Θ then whenever ∆ α=⇒ (∑_i p_i·∆′_i), for finitely many p_i with ∑_i p_i = 1, there are Θ′_i with Θ α=⇒ (∑_i p_i·Θ′_i) and ∆′_i ⊑S Θ′_i for each i.


Note that, unlike in the corresponding clause for the failure-simulation preorder, this summation cannot be empty.

Again it is trivial to see that ⊑S is reflexive and transitive; and again it is sometimes easier to work with an equivalent formulation based on a state-level “simulation”, defined as follows.

Definition 6.29 (Simulation). Define ⊳S to be the largest relation in S×D(S) such that if s ⊳S Θ then whenever s α−→ ∆′ there is some Θ′ such that Θ α=⇒ Θ′ and ∆′ (⊳S)† Θ′.

Remark 6.5. Definition 6.29 yields the same simulation as Definition 5.4 when applied to finite processes. It differs from the analogous Definition 6.21 in three ways: it is missing the clause for divergence and the clause for refusal, and it is (implicitly) limited to α=⇒-transitions that simulate by producing full distributions only.² Without that latter limitation, any simulation relation could be scaled down uniformly without losing its simulation properties, for example allowing, counter-intuitively, a.0 to be simulated by a.0 1/2⊕ ε.

² Even though, for simplicity of presentation, in Definition 6.2 the relation =⇒ was defined by using subdistributions, it can equivalently be defined by using full distributions.

Lemma 6.43. The above preorder and simulation are equivalent in the following sense: for distributions ∆ and Θ we have ∆ ⊑S Θ just when there is a distribution Θmatch with Θ =⇒ Θmatch and ∆ (⊳S)† Θmatch.

Proof. The proof is as for the failure case, except that in Theorem 6.12 we can assume total distributions, and so do not need the second part of its proof, where divergence is treated. ⊓⊔

6.8.1 Soundness

In this section we prove that simulations are sound for showing that processes are related via the may-testing preorder. We assume initially that we are using only one success action ω, so that |Ω| = 1.

Because we prune our pLTS's before extracting values from them, we will be concerned mainly with ω-respecting structures, and for those we have the following.

Lemma 6.44. Let ∆ and Θ be two distributions. If ∆ is stable and ∆ (⊳S)† Θ, then V(∆) ≤_Ho V(Θ).

Proof. We first show that if s is stable and s ⊳S Θ then V(s) ≤_Ho V(Θ). Since s is stable, we have only two cases:

(i) s ↛. Here V(s) = {0}, and since V(Θ) is not empty we clearly have V(s) ≤_Ho V(Θ).
(ii) s ω−→ ∆′ for some ∆′. Here V(s) = {1}, and Θ =⇒ Θ′ ω−→ with V(Θ′) = {1}. By Lemma 6.32 specialised to full distributions, we have 1 ∈ V(Θ). Therefore V(s) ≤_Ho V(Θ).

Now for the general case we suppose ∆ (⊳S)† Θ. Use Proposition 6.2 to decompose Θ into ∑_{s∈⌈∆⌉} ∆(s)·Θ_s such that s ⊳S Θ_s for each s ∈ ⌈∆⌉, and recall that each such state s is stable. From above we have V(s) ≤_Ho V(Θ_s) for those s, and so V(∆) = ∑_{s∈⌈∆⌉} ∆(s)·V(s) ≤_Ho ∑_{s∈⌈∆⌉} ∆(s)·V(Θ_s) = V(Θ). ⊓⊔

Lemma 6.45. Let ∆ and Θ be distributions in an ω-respecting finitary pLTS ⟨S, {τ,ω}, →⟩. If ∆ (⊳S)† Θ, then V(∆) ≤_Ho V(Θ).

Proof. Since ∆ (⊳S)† Θ, we consider subdistributions ∆″ with ∆ =⇒≻ ∆″; by distillation of divergence (Theorem 6.11) we have full distributions ∆′, ∆′_1 and ∆′_2 and a probability p such that ∆ =⇒ ∆′ = (∆′_1 p⊕ ∆′_2), with ∆″ = p·∆′_1 and ∆′_2 =⇒ ε. There is thus a matching transition Θ =⇒ Θ′ such that ∆′ (⊳S)† Θ′. By Proposition 6.2, we can find distributions Θ′_1, Θ′_2 such that Θ′ = Θ′_1 p⊕ Θ′_2, ∆′_1 (⊳S)† Θ′_1 and ∆′_2 (⊳S)† Θ′_2. Since ⌈∆′_1⌉ = ⌈∆″⌉, we have that ∆′_1 is stable. It follows from Lemma 6.44 that V(∆′_1) ≤_Ho V(Θ′_1). Thus we finish off with

V(∆″) = V(p·∆′_1)       since ∆″ = p·∆′_1
      = p·V(∆′_1)       by linearity of V
      ≤_Ho p·V(Θ′_1)    by the above argument based on distillation
      = V(p·Θ′_1)       by linearity of V
      ≤_Ho V(Θ′)        since Θ′ = Θ′_1 p⊕ Θ′_2
      ≤_Ho V(Θ)         by Lemma 6.32 specialised to full distributions.

Since ∆″ was arbitrary, we have our result. ⊓⊔

Lemma 6.46. Let ∆ and Θ be distributions in an ω-respecting finitary pLTS ⟨S, {τ,ω}, →⟩. If ∆ ⊑S Θ, then V(∆) ≤_Ho V(Θ).

Proof. Suppose ∆ ⊑S Θ. By Lemma 6.43, there exists some distribution Θmatch such that Θ =⇒ Θmatch and ∆ (⊳S)† Θmatch. By Lemmas 6.45 and 6.32 we obtain V(∆) ≤_Ho V(Θmatch) ⊆ V(Θ). ⊓⊔

Theorem 6.18. For any finitary processes P and Q, if P ⊑S Q then P ⊑pmay Q.

Proof. We reason as follows.

P ⊑S Q
implies [P |Act T] ⊑S [Q |Act T]           (the counterpart of Lemma 6.30 for simulation)
implies V([P |Act T]) ≤_Ho V([Q |Act T])   ([·] is ω-respecting; Lemma 6.46)
iff A^d(T, P) ≤_Ho A^d(T, Q)               (Definition 6.24)
iff P ⊑pmay Q.                             (Definition 6.11)
⊓⊔


6.8.2 Completeness

Let L be the subclass of F obtained by skipping the div and ref(X) clauses. In other words, the formulae are exactly the same as those in the logic for characterising the simulation preorder in Section 5.6. The semantic interpretation is different now, because the weak transition relation τ=⇒ used there has been replaced in this chapter by the more general form =⇒ given in Definition 6.4. We continue to write P ⊑L Q just when [P] |= ϕ implies [Q] |= ϕ for all ϕ ∈ L.

We have the counterparts of Theorems 6.16 and 6.17, with similar proofs.

Theorem 6.19. In a finitary pLTS, ∆ ⊑L Θ if and only if ∆ ⊑S Θ. ⊓⊔

Theorem 6.20. If ∆ ⊑^Ω_pmay Θ then ∆ ⊑L Θ. ⊓⊔

Corollary 6.14. Suppose P and Q are finitary rpCSP processes. If P ⊑pmay Q then P ⊑S Q.

Proof. From Theorems 6.19 and 6.20 we know that if P ⊑^Ω_pmay Q then P ⊑S Q. Theorem 4.7 says that Ω-testing is reducible to scalar testing. So the required result follows. ⊓⊔

As one would expect, the completeness result in Corollary 6.14 fails for infinitary processes.

Example 6.26. Consider the state s_2 that we saw in Example 6.5. It turns out that

τ.(0 1/2⊕ a.0) ⊑pmay s_2.

However, we do not have

τ.(0 1/2⊕ a.0) ⊳S s_2,

because the transition

τ.(0 1/2⊕ a.0) τ−→ (0 1/2⊕ a.0)

cannot be matched by a transition from s_2, as there is no full distribution ∆ such that s_2 =⇒ ∆ and (0 1/2⊕ a.0) (⊳S)† ∆.

6.9 Real-reward testing

In Section 4.4 we introduced a notion of reward testing inspired by [8]. The idea is to associate with each success action a nonnegative reward, and performing a success action means accumulating some reward. The outcomes of this reward testing are nonnegative expected rewards.

On certain occasions it is very natural to introduce negative rewards; for example, this is the case in the theory of Markov decision processes [12]. Intuitively, we could understand negative rewards as costs, while positive rewards are often viewed as benefits or profits. Consider for instance the (non-probabilistic) processes Q_1 and Q_2 with initial states q_1 and q_2, respectively, in Figure 6.5. Here a represents the action of making an investment. Assuming that the investment is made by bidding for some commodity, the τ-action represents an unsuccessful bid; if this happens one simply tries again. Now b represents the action of reaping the benefits of this investment. Whereas Q_1 models a process in which making the investment is always followed by an opportunity to reap the benefits, the process Q_2 allows, nondeterministically, for the possibility that the investment is unsuccessful, so that a does not always lead to a state where b is enabled. The test T with initial state t, which will be explained later, allows us to give a negative reward to action ω_1 (its cost) and a positive reward to ω_2.

This leads to the question: if both negative and positive rewards are allowed, how would the original reward-testing semantics change?³ We refer to the more relaxed form of testing as real-reward testing and the original one as nonnegative-reward testing.

³ One might suspect no change at all, for any assignment of rewards from the interval [−1,+1] can be converted into a nonnegative assignment simply by adding 1 to all of them. But that would not preserve the testing order in the case of zero outcomes that result from a process's failing to reach any success state at all: those zeroes would remain zero.

Fig. 6.5 Two processes with divergence and a test. Reprinted from [2], with kind permission from Elsevier.

The power of real-reward testing is illustrated in Figure 6.5. The two (non-probabilistic) processes in the left and central diagrams are equivalent under (probabilistic) may- as well as must testing; the τ-loops in the initial states cause both processes to fail any nontrivial must test. Yet, if a reward of −2 is associated with performing the action ω_1, and a reward of 4 with the subsequent performance of ω_2, it turns out that in the first process the net reward is either 0, if the process remains stuck in its initial state, or positive, whereas running the second process may yield a loss. See Example 6.27 for details of how these rewards are assigned, and how net rewards are associated with the application of tests such as T. This example shows that for processes that may exhibit divergence, real-reward testing is more discriminating than nonnegative-reward testing, or other forms of probabilistic testing. It also illustrates that the extra power is relevant in applications.

We will show that for real-reward testing the may and must preorders are the inverse of each other, i.e. for any processes P and Q,

P ⊑^Ω_rrmay Q iff Q ⊑^Ω_rrmust P.   (6.27)

A more surprising result is that for finitary convergent processes the real-reward must preorder coincides with the nonnegative-reward must preorder, i.e. for any finitary convergent processes P and Q,

P ⊑^Ω_rrmust Q iff P ⊑^Ω_nrmust Q.   (6.28)

Here by convergence we mean that in the pLTS generated by a process there is no infinite sequence of internal transitions between distributions of the form

∆_0 τ−→ ∆_1 τ−→ ···

Although it is easy to see that in (6.28) the former is included in the latter, proving that the latter is included in the former is far from trivial. Our proof strategy is to use the failure simulation preorder as a stepping stone, and to adapt the soundness proof of failure simulation with respect to must testing (Theorem 6.13).

We now recall our testing framework. A test is simply a finite process in the language pCSP, except that it may in addition use special success actions for reporting outcomes: these are drawn from a set Ω of fresh actions not already in Act_τ. Here we require tests to be finite processes because we will consider convergent processes; if P and T are finitary convergent, their parallel composition P |Act T is not necessarily convergent unless T is very finite. As we have seen from Remark 6.4, restricting to finite tests does not weaken our testing semantics, as far as finitary processes are concerned. As in Section 4.2, to apply test T to process P we form the process T |Act P, in which all visible actions of P must synchronise with T. The resulting composition is a process whose only possible actions are τ and the elements of Ω. Applying the test T to the process P gives rise to the set of testing outcomes A(T, P) defined in (4.2), exactly one of which results from each resolution of the choices in T |Act P. Each testing outcome is an Ω-tuple of real numbers in the interval [0,1], i.e. a function o : Ω → [0,1], and its ω-component o(ω), for ω ∈ Ω, gives the probability that the resolution in question will reach an ω-success state, one in which the success action ω is possible.

In Section 4.4 two reward-testing preorders are obtained by associating with each success action ω ∈ Ω a nonnegative reward. We refer to that approach as nonnegative-reward testing. If we also allow negative rewards, which intuitively can be understood as costs, then we obtain an approach called real-reward testing. Technically, we simply let reward tuples h range over the set [−1,1]^Ω. If o ∈ [0,1]^Ω, we use the dot product h·o = ∑_{ω∈Ω} h(ω)·o(ω). It can be applied to a set O ⊆ [0,1]^Ω, so that h·O = {h·o | o ∈ O}. For A ⊆ [−1,1] we use the notation ⊔A for the supremum of the set A, and ⊓A for the infimum.

Definition 6.30 (Real-reward testing preorders).

(i) P ⊑^Ω_rrmay Q if for every Ω-test T and real-reward tuple h ∈ [−1,1]^Ω, ⊔ h·A(T, P) ≤ ⊔ h·A(T, Q).
(ii) P ⊑^Ω_rrmust Q if for every Ω-test T and real-reward tuple h ∈ [−1,1]^Ω, ⊓ h·A(T, P) ≤ ⊓ h·A(T, Q).

Note that for any testT and processP it is easy to see that

h ·A (T,P) = Ah(T,P).

Therefore, the nonnegative-reward testing preorders presented in Definition 4.6 canbe equivalently formulated in the following way:

(i) P ⊑Ω_nrmay Q if for every Ω-test T and nonnegative-reward tuple h ∈ [0,1]^Ω, ⊔ h·A(T,P) ≤ ⊔ h·A(T,Q).
(ii) P ⊑Ω_nrmust Q if for every Ω-test T and nonnegative-reward tuple h ∈ [0,1]^Ω, ⊓ h·A(T,P) ≤ ⊓ h·A(T,Q).
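To make the arithmetic behind these definitions concrete, here is a minimal Python sketch (the function and variable names are ours, purely for illustration) that represents outcomes as Ω-tuples, forms the dot-product h·o, and compares two finite outcome sets under the may (supremum) and must (infimum) views. For the infinite, convex outcome sets that arise in general, max and min would be replaced by genuine suprema and infima.

# Outcomes o : Omega -> [0,1] and reward tuples h : Omega -> [-1,1],
# both represented as dicts over the same finite set of success actions.

def dot(h, o):
    # h . o = sum over omega of h(omega) * o(omega)
    return sum(h[w] * o[w] for w in h)

def may_value(h, outcomes):
    # supremum of h . A(T,P) over a finite outcome set
    return max(dot(h, o) for o in outcomes)

def must_value(h, outcomes):
    # infimum of h . A(T,P) over a finite outcome set
    return min(dot(h, o) for o in outcomes)

# A small, made-up example over Omega = {w1, w2}.
A_P = [{'w1': 0.5, 'w2': 0.5}, {'w1': 0.0, 'w2': 0.0}]
A_Q = [{'w1': 1.0, 'w2': 0.0}]
h = {'w1': -1.0, 'w2': 1.0}          # negative rewards act as costs

print(may_value(h, A_P), must_value(h, A_P))   # 0.0 0.0
print(may_value(h, A_Q), must_value(h, A_Q))   # -1.0 -1.0

Restricting h to [0,1]^Ω recovers the nonnegative-reward preorders above; allowing negative components is exactly what gives real-reward testing its extra discriminating power.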

Although the two nonnegative-reward testing preorders are in general incomparable, the two real-reward testing preorders are simply the inverse relations of each other.

Theorem 6.21. For any processes P and Q, it holds that P ⊑Ω_rrmay Q if and only if Q ⊑Ω_rrmust P.

Proof. We first notice that for any nonempty set A ⊆ [0,1]^Ω and any reward tuple h ∈ [−1,1]^Ω,

⊔ h·A = −(⊓ (−h)·A),    (6.29)

where −h is the negation of h, i.e. (−h)(ω) = −(h(ω)) for any ω ∈ Ω. We consider the “if” direction; the “only if” direction is similar. Let T be any Ω-test and h be any real-reward tuple in [−1,1]^Ω. Clearly, −h is also a real-reward tuple. Suppose Q ⊑Ω_rrmust P; then

⊓ (−h)·A(T,Q) ≤ ⊓ (−h)·A(T,P).    (6.30)

Therefore, we can infer that

⊔ h·A(T,P) = −(⊓ (−h)·A(T,P))    by (6.29)
           ≤ −(⊓ (−h)·A(T,Q))    by (6.30)
           = ⊔ h·A(T,Q)          by (6.29)

⊓⊔
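Identity (6.29) is the only arithmetic fact the proof needs, and it is easy to sanity-check numerically. The following fragment (illustrative only, reusing the dict representation from the sketch above) verifies it on a randomly generated finite outcome set.

import random

random.seed(1)
Omega = ['w1', 'w2']
# A random finite subset of [0,1]^Omega and a reward tuple in [-1,1]^Omega.
A = [{w: random.random() for w in Omega} for _ in range(20)]
h = {w: random.uniform(-1.0, 1.0) for w in Omega}

sup_h = max(sum(h[w] * o[w] for w in Omega) for o in A)
inf_neg_h = min(sum(-h[w] * o[w] for w in Omega) for o in A)
# (6.29): the supremum under h is the negated infimum under -h.
assert abs(sup_h - (-inf_neg_h)) < 1e-12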


Our next task is to compare ⊑Ω_rrmust with ⊑Ω_nrmust. The former is included in the latter, which follows directly from Definition 6.30. Surprisingly, it turns out that for finitary convergent processes the latter is also included in the former, thus the two preorders are in fact the same. The rest of this section is devoted to proving this result. However, we first show that this result does not extend to divergent processes.

Example 6.27. Consider the processes Q1 and Q2 depicted in Figure 6.5. Using the characterisations of ⊑Ω_pmay and ⊑Ω_pmust in Sections 6.6–6.8, it is easy to see that these processes cannot be distinguished by probabilistic may- and must testing, and hence not by nonnegative-reward testing either. However, let T be the test in the right diagram of Figure 6.5 that first synchronises on the action a, and then with probability 1/2 reaches a state in which a reward of −2 is allocated, and with the remaining probability 1/2 synchronises with the action b and reaches a state that yields a reward of 4. Thus the test employs two success actions ω1 and ω2, and we use the reward tuple h with h(ω1) = −2 and h(ω2) = 4. Then the resolution of q1 that does not involve the τ-loop contributes the value −2·(1/2) + 4·(1/2) = 1 to the set h·A(T,Q1), whereas the resolution that only involves the τ-loop contributes the value 0. Due to interpolation, h·A(T,Q1) is in fact the entire interval [0,1]. On the other hand, the resolution corresponding to the a-branch of q2 contributes the value −1, and h·A(T,Q2) = [−1,1]. Thus ⊓ h·A(T,Q1) = 0 > −1 = ⊓ h·A(T,Q2), and hence q1 ⋢Ω_rrmust q2.
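The arithmetic of Example 6.27 can be replayed mechanically. In the sketch below the outcome tuples are transcribed by hand from the resolutions described above, so they are illustrative rather than computed from the pLTS: the two extreme resolutions of Q1 span the interval [0,1], while the a-branch of q2 pushes the infimum for Q2 down to −1.

h = {'w1': -2.0, 'w2': 4.0}            # the reward tuple of Example 6.27

value = lambda o: sum(h[w] * o[w] for w in h)

# Q1: avoiding the tau-loop reaches w1 and w2 with probability 1/2 each;
# staying in the loop forever yields the empty outcome.
print(value({'w1': 0.5, 'w2': 0.5}))   # 1.0
print(value({'w1': 0.0, 'w2': 0.0}))   # 0.0; interpolation fills [0, 1]

# Q2: its a-branch reaches only the reward -2 state, with probability 1/2,
# so this resolution contributes -1; other resolutions stretch the set
# h.A(T,Q2) to [-1, 1], and the must-values 0 and -1 differ.
print(value({'w1': 0.5, 'w2': 0.0}))   # -1.0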

For convergent pLTS’s, the results in Lemmas 6.31 and 6.33 aswell as Theo-rem 6.13 can be strengthened.

Lemma 6.47. Let ∆ and Θ be distributions in an ω-respecting convergent pLTS ⟨S, Ωτ, →⟩. If the distribution ∆ is stable and ∆ (⊳e_FS)† Θ, then $∆ ∈ V(Θ).

Proof. We first show that if s is stable and s ⊳e_FS Θ then $s ∈ V(Θ). Since s is stable, we have only two cases:

(i) s ↛. Here $s = 0⃗, the everywhere-zero tuple. Since s ⊳e_FS Θ we have Θ =⇒ Θ′ with Θ′ ↛, whence in fact Θ =⇒≻ Θ′ and $Θ′ = 0⃗. Thus it holds that $s = 0⃗ ∈ V(Θ).

(ii) s ω−→ ∆′ for some ∆′. Here $s = ω⃗, the tuple that is 1 at ω and 0 elsewhere; and since s ⊳e_FS Θ we have Θ =⇒ Θ′ ω−→. As the pLTS we are considering is convergent and Θ is a distribution, we know that Θ′ is also a distribution. Hence, we have $Θ′ = ω⃗. Because the pLTS is ω-respecting, in fact Θ =⇒≻ Θ′, and so again we have $s = ω⃗ ∈ V(Θ).

Now for the general case we suppose ∆ (⊳e_FS)† Θ. It is not hard to show that we can decompose Θ into ∑s∈⌈∆⌉ ∆(s)·Θs such that s ⊳e_FS Θs for each s ∈ ⌈∆⌉, and recall that each such state s is stable. From above we have that $s ∈ V(Θs) for those s, and so $∆ = ∑s∈⌈∆⌉ ∆(s)·$s ∈ ∑s∈⌈∆⌉ ∆(s)·V(Θs) = V(Θ). ⊓⊔

Lemma 6.48. Let ∆ and Θ be distributions in an ω-respecting convergent pLTS ⟨S, Ωτ, →⟩. If Θ ⊑FS ∆, then it holds that V(Θ) ⊇ V(∆).

Proof. Let ∆ and Θ be distributions in an ω-respecting convergent pLTS ⟨S, Ωτ, →⟩. We note that

(i) If ∆ =⇒ ∆′ then V(∆′) ⊆ V(∆).
(ii) If ∆ (⊳e_FS)† Θ, then we have V(∆) ⊆ V(Θ).

Here (i) follows from Lemma 6.32. For (ii), let us assume ∆ (⊳e_FS)† Θ. For any ∆ =⇒≻ ∆′ we have the matching transition Θ =⇒ Θ′ such that ∆′ (⊳e_FS)† Θ′. It follows from Lemma 6.47 and (i) that $∆′ ∈ V(Θ′) ⊆ V(Θ). Consequently, we obtain V(∆) ⊆ V(Θ).

Now suppose Θ ⊑FS ∆. By definition there exists some Θ′ such that Θ =⇒ Θ′ and ∆ (⊳e_FS)† Θ′. By (i) and (ii) above we obtain V(∆) ⊆ V(Θ′) ⊆ V(Θ). ⊓⊔

Theorem 6.22. For any finitary convergent processes P and Q, if P ⊑FS Q then P ⊑Ω_rrmust Q.

Proof. We reason as follows.

P ⊑FS Q
implies  [P |Act T] ⊑FS [Q |Act T]         Lemma 6.30, for any Ω-test T
implies  V([P |Act T]) ⊇ V([Q |Act T])     [·] is ω-respecting; Lemma 6.48
iff      A^d(T,P) ⊇ A^d(T,Q)               Definitions 6.24 and 6.10
iff      A(T,P) ⊇ A(T,Q)                   Corollary 6.3
implies  h·A(T,P) ⊇ h·A(T,Q)               for any h ∈ [−1,1]^Ω
implies  ⊓ h·A(T,P) ≤ ⊓ h·A(T,Q)           for any h ∈ [−1,1]^Ω
iff      P ⊑Ω_rrmust Q.

Note that in the second line above, both [P |Act T] and [Q |Act T] are convergent, since for any convergent process R and very finite process T, by induction on the structure of T, it can be shown that the composition R |Act T is also convergent. ⊓⊔

We are now ready to prove the main result of the section, which states that nonnegative-reward must testing is as discriminating as real-reward must testing.

Theorem 6.23. For any finitary convergent processes P, Q, it holds that P ⊑Ω_rrmust Q if and only if P ⊑Ω_nrmust Q.

Proof. The “only if” direction is obvious. Let us consider the “if” direction. Suppose P and Q are finitary convergent processes. We reason as follows.

P ⊑Ω_nrmust Q
iff      P ⊑Ω_pmust Q      Theorem 4.5
iff      P ⊑FS Q           Theorems 4.7 and 6.13, Corollary 6.13
implies  P ⊑Ω_rrmust Q.    Theorem 6.22

⊓⊔

In the presence of divergence, ⊑Ω_rrmust is strictly included in ⊑Ω_nrmust. For example, let P and Q be the processes rec x.x and a.0, respectively. It holds that P ⊑FS Q because P =⇒ ε and the empty subdistribution can failure-simulate any process. It follows that P ⊑Ω_nrmust Q, by recalling the first two steps of reasoning in the proof of Theorem 6.23. However, if we apply the test T = a.ω and reward tuple h with h(ω) = −1, then


⊓ h·A(T,P) = ⊓ h·{ε} = ⊓ {0} = 0
⊓ h·A(T,Q) = ⊓ h·{ω⃗} = ⊓ {−1} = −1

As ⊓ h·A(T,P) ≰ ⊓ h·A(T,Q), we see that P ⋢Ω_rrmust Q.

Below we give a characterisation of ⊑Ω_rrmust in terms of the set inclusion relation between testing outcome sets. As a similar characterisation for ⊑Ω_nrmust does not hold in general for finitary (non-convergent) processes, this indicates the subtle difference between ⊑Ω_rrmust and ⊑Ω_nrmust, and we see more clearly why our proof of Theorem 6.23 involves the failure simulation preorder.

Theorem 6.24. Let P and Q be any finitary processes. Then P ⊑Ω_rrmust Q if and only if A(T,P) ⊇ A(T,Q) for any Ω-test T.

Proof. (⇐) Let T be any Ω-test and h ∈ [−1,1]^Ω be any real-reward tuple. Suppose A(T,P) ⊇ A(T,Q). It is obvious that h·A(T,P) ⊇ h·A(T,Q), from which it easily follows that

⊓ h·A(T,P) ≤ ⊓ h·A(T,Q).

As this holds for an arbitrary real-reward tuple h, we see that P ⊑Ω_rrmust Q.

(⇒) Suppose for a contradiction there is some Ω-test T with A(T,P) ⊉ A(T,Q). Then there exists some outcome o ∈ A(T,Q) lying outside A(T,P), i.e.

o ∉ A(T,P).    (6.31)

Since T is finite, it contains only finitely many elements of Ω, so that we may assume without loss of generality that Ω is finite. Since P and T are finitary, it is easy to see that the pruned composition [P |Act T] is also finitary. By Theorem 6.1, the set {Φ | [P |Act T] =⇒ Φ} is convex and compact. With an analogous proof, using Corollary 6.4, it can be shown that so is the set {Φ | [P |Act T] =⇒≻ Φ}. It follows that the set

{$Φ | [P |Act T] =⇒≻ Φ},

i.e. A^d(T,P), is also convex and compact. By Corollary 6.3 the set A(T,P) is thus convex and compact. Combining this with (6.31), and using the Separation theorem, Theorem 2.7, we infer the existence of some hyperplane whose normal is h ∈ R^Ω such that h·o′ > h·o for all o′ ∈ A(T,P). By scaling h, we obtain without loss of generality that h ∈ [−1,1]^Ω. It follows that

⊓ h·A(T,P) > h·o ≥ ⊓ h·A(T,Q),

which is a contradiction to the assumption that P ⊑Ω_rrmust Q. ⊓⊔
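The separation step can be visualised on a toy instance. In the sketch below (a made-up two-point Ω; the normal h is read off the geometry rather than computed) a point o outside a convex outcome set is strictly separated from the set's generators, and scaling h into [−1,1]^Ω preserves the strict inequality, exactly as in the proof.

# Generators of a convex, compact outcome set over Omega = {w1, w2},
# together with an outcome o lying outside the set.
A_gens = [(0.5, 0.5), (1.0, 0.0), (0.0, 1.0)]
o = (0.0, 0.0)

h = (1.0, 1.0)                       # a separating normal for this geometry
m = max(abs(x) for x in h)
h = tuple(x / m for x in h)          # scale into [-1,1]^Omega

dot = lambda u, v: sum(a * b for a, b in zip(u, v))
# Strict separation on the generators extends to the whole convex hull,
# giving inf h.A(T,P) > h.o >= inf h.A(T,Q) as in the proof.
assert all(dot(h, g) > dot(h, o) for g in A_gens)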

Note that in the above proof the normal of the separating hyperplane belongs to [−1,1]^Ω rather than [0,1]^Ω. So we cannot repeat the above proof for ⊑Ω_nrmust. In general, we do not have that P ⊑Ω_nrmust Q implies A(T,P) ⊇ A(T,Q) for any Ω-test T and for arbitrary finitary processes P and Q, that is, finitary processes that might not be convergent. However, when we restrict ourselves to finitary convergent processes, this property does indeed hold, as can be seen from the first five lines in the proof of Theorem 6.22. Note that in that proof there is an essential use of the failure simulation preorder, in particular the pleasing property stated in Lemma 6.48. Even for finitary convergent processes we cannot give a direct and simple proof of that property for ⊑Ω_nrmust, analogous to that of Theorem 6.24.

Although for finitary convergent processes real-reward must testing is no more powerful than nonnegative-reward must testing, a similar result does not hold for may testing. This follows immediately from our result that (the inverse of) real-reward may testing is as powerful as real-reward must testing, which is known not to hold for nonnegative-reward may and must testing. Thus, real-reward may testing is strictly more discriminating than nonnegative-reward may testing, even in the absence of divergence.

6.10 Summary

We have generalised the results in Chapter 5 of characterising the may preorder as a simulation relation and the must preorder as a failure-simulation relation, from finite processes to finitary processes. Although the general proof schema is inherited from Chapter 5, the details here are much more complicated. One important reason is the inapplicability of structural induction, an important proof principle used in proving some fundamental properties of finite processes, when we shift to finitary processes. So we have to make use of more advanced mathematical tools such as fixed points on complete lattices, compact sets in topological spaces, especially in complete metric spaces, etc. Technically, we develop weak transitions between probabilistic processes, elaborate their topological properties, and capture divergence in terms of partial distributions. In order to obtain the characterisation results of testing preorders as simulation relations, we found it necessary to investigate fundamental structural properties of derivation sets (finite generability) and similarities (infinite approximations), which are of independent interest. The use of Markov decision processes and Zero-One laws was essential in obtaining our results.

We have studied a notion of real-reward testing that extends nonnegative-reward testing (cf. Section 4.4) with negative rewards. It turned out that the real-reward may preorder is the inverse of the real-reward must preorder, and vice versa. More interestingly, for finitary convergent processes, the real-reward must testing preorder coincides with the nonnegative-reward must testing preorder.

There is a great amount of work about probabilistic testing semantics and simulation semantics, as we have seen in Sections 4.8 and 5.11. Here we mention the closely related work [15], where Segala defined two preorders called trace distribution precongruence (⊑TD) and failure distribution precongruence (⊑FD). He proved that the former coincides with an action-based version of ⊑Ω_pmay and that for “probabilistically convergent” systems the latter coincides with an action-based version of ⊑Ω_pmust. The condition of probabilistic convergence amounts in our framework to the requirement that for ∆ ∈ D(S) and ∆ =⇒ ∆′ we have |∆′| = 1. In [9] it has been shown that ⊑TD coincides with a notion of simulation akin to ⊑S.

In [5], by restricting the power of schedulers, a testing preorder is proposed and shown to coincide with a probabilistic ready-trace preorder that is strictly coarser than our simulation preorder but is still a precongruence [4].

References

1. Deng, Y., van Glabbeek, R., Hennessy, M., Morgan, C.: Testing finitary probabilistic processes (extended abstract). In: Proceedings of the 20th International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 5710, pp. 274–288. Springer (2009)
2. Deng, Y., van Glabbeek, R., Hennessy, M., Morgan, C.: Real-reward testing for probabilistic processes. Theoretical Computer Science 538, 16–36 (2014)
3. Deng, Y., van Glabbeek, R., Morgan, C.C., Zhang, C.: Scalar outcomes suffice for finitary probabilistic testing. In: Proceedings of the 16th European Symposium on Programming, Lecture Notes in Computer Science, vol. 4421, pp. 363–378. Springer (2007)
4. Georgievska, S., Andova, S.: Composing systems while preserving probabilities. In: Proceedings of the 7th European Performance Engineering Workshop, Lecture Notes in Computer Science, vol. 6342, pp. 268–283. Springer (2010)
5. Georgievska, S., Andova, S.: Probabilistic may/must testing: retaining probabilities by restricted schedulers. Formal Aspects of Computing 24, 727–748 (2012)
6. Hennessy, M., Milner, R.: Algebraic laws for nondeterminism and concurrency. Journal of the ACM 32(1), 137–161 (1985)
7. Jones, C.: Probabilistic non-determinism. Ph.D. thesis, University of Edinburgh (1990)
8. Jonsson, B., Ho-Stuart, C., Yi, W.: Testing and refinement for nondeterministic and probabilistic processes. In: Proceedings of the 3rd International Symposium on Formal Techniques in Real-Time and Fault-Tolerant Systems, Lecture Notes in Computer Science, vol. 863, pp. 418–430. Springer (1994)
9. Lynch, N., Segala, R., Vaandrager, F.W.: Observing branching structure through probabilistic contexts. SIAM Journal on Computing 37(4), 977–1013 (2007)
10. McIver, A.K., Morgan, C.C.: Abstraction, Refinement and Proof for Probabilistic Systems. Springer (2005)
11. Milner, R.: Communication and Concurrency. Prentice Hall (1989)
12. Puterman, M.L.: Markov Decision Processes. Wiley (1994)
13. Rutten, J., Kwiatkowska, M., Norman, G., Parker, D.: Mathematical Techniques for Analyzing Concurrent and Probabilistic Systems. CRM Monograph Series, vol. 23. American Mathematical Society (2004)
14. Segala, R.: Modeling and verification of randomized distributed real-time systems. Tech. Rep. MIT/LCS/TR-676, PhD thesis, MIT, Dept. of EECS (1995)
15. Segala, R.: Testing probabilistic automata. In: Proceedings of the 7th International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 1119, pp. 299–314. Springer (1996)


Chapter 7
Weak probabilistic bisimulation

Abstract By taking the symmetric form of the simulation preorder, we obtain a notion of weak probabilistic bisimulation. It provides a sound and complete proof methodology for an extensional behavioural equivalence, a probabilistic variant of the traditional reduction barbed congruence.

Keywords: Weak probabilistic bisimulation; Reduction barbed congruence; Compositionality

7.1 Introduction

In Section 6.8 we considered the simulation preorder. By taking the symmetric form of Definition 6.28 we easily obtain a notion of weak probabilistic bisimulation.

Definition 7.1 (Weak probabilistic bisimulation). Let S be the set of states in a pLTS. A relation R ⊆ D(S)×D(S) is a weak (probabilistic) bisimulation if ∆ R Θ implies, for each α ∈ Actτ and all finite sets of probabilities {pi | i ∈ I} satisfying ∑i∈I pi = 1,

(i) whenever ∆ α=⇒ ∑i∈I pi·∆i, for any distributions ∆i, there are some distributions Θi with Θ α=⇒ ∑i∈I pi·Θi, such that ∆i R Θi for each i ∈ I;
(ii) symmetrically, whenever Θ α=⇒ ∑i∈I pi·Θi, for any distributions Θi, there are some distributions ∆i with ∆ α=⇒ ∑i∈I pi·∆i, such that ∆i R Θi for each i ∈ I.

The largest weak probabilistic bisimulation, which is guaranteed to exist using standard arguments, is called weak probabilistic bisimilarity and denoted by ≈.

It is easy to see that ≈ is an equivalence relation. Moreover, it turns out that this provides a sound and complete proof methodology for a natural extensional behavioural equivalence between probabilistic systems, a generalisation of reduction barbed congruence, the well-known behavioural equivalence for a large variety of process description languages. Intuitively, reduction barbed congruence is defined to be the coarsest relation that


• is compositional; that is, preserved by some natural operators for constructing systems;
• preserves barbs; barbs are simple experiments which observers may perform on systems;
• is reduction-closed; this is a natural condition on the reduction semantics of systems which ensures that nondeterministic choices are in some sense preserved.

The three criteria chosen above are very robust because they can be formalised in a similar way across different process description languages. In our setting, compositions of pLTS's and reduction semantics can be easily defined. For barbs, we use predicates like P ⇓≥p_a, which means that process P can expose the visible action a with probability at least p. The details are given in Section 7.4.

7.2 A simple bisimulation

Due to the use of weak arrows and the quantification over sets of probabilities, it is difficult to directly apply Definition 7.1 and exhibit witness bisimulations. We therefore give an alternative characterisation of ≈ in terms of a relation between states and distributions by adapting Definition 6.29.

Definition 7.2 (Simple bisimulation). A relation R ⊆ S×D(S) is a simple (weak probabilistic) bisimulation if s R Θ implies, for each α ∈ Actτ,

(i) whenever s α−→ ∆′, there is some Θ α=⇒ Θ′ such that ∆′ R† Θ′;
(ii) there exists some ∆ ∈ D(S) such that s τ=⇒ ∆ and Θ R† ∆.

We use ≈s to denote the largest simple bisimulation. As in Remark 6.5, the bisimulation game is implicitly limited to weak transitions that simulate by producing full distributions only.
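The lifted relation R† that this definition appeals to is itself decidable for finite systems: ∆ R† Θ asks for nonnegative weights, one per pair in R, whose first marginal is ∆ and whose weighted mixture of second components is Θ. This is a linear feasibility problem. The sketch below is one possible encoding, assuming R is handed to us as an explicit finite list of pairs (s, Θ′) and that scipy is available; none of these names come from the text.

import numpy as np
from scipy.optimize import linprog

def lifted(R, Delta, Theta, states):
    """Decide Delta R† Theta, where R is a finite list of pairs (s, Th)
    with s a state and Th a distribution, and distributions are dicts."""
    pairs = [(s, th) for (s, th) in R if Delta.get(s, 0.0) > 0.0]
    n = len(pairs)
    if n == 0:
        return all(abs(v) < 1e-12
                   for v in list(Delta.values()) + list(Theta.values()))
    A_eq, b_eq = [], []
    # First marginal: the weights attached to s must sum to Delta(s).
    for s in Delta:
        A_eq.append([1.0 if p[0] == s else 0.0 for p in pairs])
        b_eq.append(Delta[s])
    # Mixture: the weighted second components must reconstruct Theta.
    for t in states:
        A_eq.append([p[1].get(t, 0.0) for p in pairs])
        b_eq.append(Theta.get(t, 0.0))
    res = linprog(c=np.zeros(n), A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=[(0.0, None)] * n, method="highs")
    return res.success

# Point distribution on s, related to the point distribution on t:
R = [('s', {'t': 1.0})]
print(lifted(R, {'s': 1.0}, {'t': 1.0}, ['t']))   # True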

The precise relationship between the two forms of bisimulation is given by:

Theorem 7.1. Let ∆ and Θ be two distributions in a finitary pLTS.

(i) If ∆ ≈ Θ then there is some Θ′ with Θ τ=⇒ Θ′ and ∆ (≈s)† Θ′.
(ii) If ∆ (≈s)† Θ then ∆ ≈ Θ.

The remainder of this subsection is devoted to the proof of this theorem; it involves first developing a number of subsidiary results.

Proposition 7.1. Suppose ∆ (≈s)† Θ and ∆ α−→ ∆′ in an arbitrary pLTS. Then there exists some Θ′ such that Θ α=⇒ Θ′ and ∆′ (≈s)† Θ′.

Proof. Suppose ∆ (≈s)† Θ and ∆ α−→ ∆′. By Lemma 6.1 there is a finite index set I such that (i) ∆ = ∑i∈I pi·si, (ii) Θ = ∑i∈I pi·Θi, and (iii) si ≈s Θi for each i ∈ I. By the condition ∆ α−→ ∆′, (i) and Proposition 6.2, we can decompose ∆′ into ∑i∈I pi·∆′i for some ∆′i such that si α−→ ∆′i. By Lemma 6.1 again, for each i ∈ I there is an index set Ji such that ∆′i = ∑j∈Ji qij·∆′ij and si α−→ ∆′ij for each j ∈ Ji and ∑j∈Ji qij = 1. By (iii) there is some Θ′ij such that Θi α=⇒ Θ′ij and ∆′ij (≈s)† Θ′ij. Let Θ′ = ∑i∈I,j∈Ji pi qij·Θ′ij. By Theorem 6.5(i) the relation =⇒ is linear. Then it is easy to see that α=⇒ is also linear. It follows that Θ = ∑i∈I pi ∑j∈Ji qij·Θi α=⇒ Θ′. By the linearity of (≈s)†, we notice that ∆′ = (∑i∈I pi ∑j∈Ji qij·∆′ij) (≈s)† Θ′. ⊓⊔

Theorem 7.2. In a finitary pLTS, if s ≈s Θ and s τ=⇒ ∆′ then there is some Θ′ with Θ τ=⇒ Θ′ and ∆′ (≈s)† Θ′.

Proof. The arguments are similar to those in the proof of Theorem 6.12. Suppose s is a state and Θ a distribution in a finitary pLTS such that s ≈s Θ and s τ=⇒ ∆′. Referring to Definition 6.4, there must be ∆k, ∆→k and ∆×k for k ≥ 0 such that s = ∆0, ∆k = ∆→k + ∆×k, ∆→k τ−→ ∆k+1 and ∆′ = ∑∞k=0 ∆×k. Since ∆×0 + ∆→0 = s (≈s)† Θ, using Proposition 6.2 there exist some Θ×0 and Θ→0 such that Θ = Θ×0 + Θ→0, ∆×0 (≈s)† Θ×0 and ∆→0 (≈s)† Θ→0. Since ∆→0 τ−→ ∆1 and ∆→0 (≈s)† Θ→0, by Proposition 7.1 we have Θ→0 =⇒ Θ1 with ∆1 (≈s)† Θ1.

Repeating the above procedure gives us inductively a series Θk, Θ→k, Θ×k of subdistributions, for k ≥ 0, such that Θ0 = Θ, ∆k (≈s)† Θk, Θk = Θ→k + Θ×k, ∆×k (≈s)† Θ×k, ∆→k (≈s)† Θ→k and Θ→k τ=⇒ Θk+1. We define Θ′ := ∑i Θ×i. By Additivity (Remark 6.2) we have ∆′ (≈s)† Θ′. It remains to be shown that Θ =⇒ Θ′.

For that final step, since the set {Θ′′ | Θ =⇒ Θ′′} is closed by Lemma 6.17, we can establish Θ =⇒ Θ′ by exhibiting a sequence Θ′i with Θ =⇒ Θ′i for each i and with the Θ′i's being arbitrarily close to Θ′. Induction establishes for each i the weak transition Θ =⇒ Θ′i := (Θ→i + ∑k≤i Θ×k). Since ∆′ is a full distribution (cf. Definition 7.2), whose mass is 1, i.e. |∆′| = 1, we must have limi→∞ |∆→i| = 0. It is easy to see that for any two subdistributions Γ1, Γ2, if Γ1 (≈s)† Γ2 then they have the same mass. Therefore, it follows from the condition ∆→i (≈s)† Θ→i that limi→∞ |Θ→i| = 0. Thus these Θ′i's form the sequence we needed. ⊓⊔

Corollary 7.1. In a finitary pLTS, suppose ∆ (≈s)† Θ and ∆ α=⇒ ∆′. Then there is some Θ′ with Θ α=⇒ Θ′ and ∆′ (≈s)† Θ′.

Proof. Given the two previous results this is fairly straightforward. Suppose that ∆ α=⇒ ∆′ and ∆ (≈s)† Θ. If α is τ then the required Θ′ follows by an application of Theorem 7.2, since the relation τ=⇒ is actually defined to be =⇒.

Otherwise, by definition we know ∆ =⇒ ∆1, ∆1 α−→ ∆2 and ∆2 =⇒ ∆′. An application of Theorem 7.2 gives a Θ1 such that Θ =⇒ Θ1 and ∆1 (≈s)† Θ1. An application of Proposition 7.1 gives a Θ2 such that Θ1 α=⇒ Θ2 and ∆2 (≈s)† Θ2. Finally another application of Theorem 7.2 gives Θ2 =⇒ Θ′ such that ∆′ (≈s)† Θ′.

The result now follows since the transitivity of =⇒, Theorem 6.6, gives the transition Θ α=⇒ Θ′. ⊓⊔

Theorem 7.3. In a finitary pLTS, ∆ (≈s)† Θ implies ∆ ≈ Θ.

Proof. Let R denote the relation (≈s)† ∪ ((≈s)†)−1. We show that R is a bisimulation relation, from which the result follows.

Suppose that ∆ R Θ. There are two possibilities:


(a) ∆ (≈s)† Θ.
First suppose ∆ α=⇒ ∑i∈I pi·∆′i. By Corollary 7.1 there is some distribution Θ′ with Θ α=⇒ Θ′ and (∑i∈I pi·∆′i) (≈s)† Θ′. But by Proposition 6.2 we know that the relation (≈s)† is left-decomposable. This means that Θ′ = ∑i∈I pi·Θ′i for some distributions Θ′i such that ∆′i (≈s)† Θ′i for each i ∈ I. We hence have the required matching move from Θ.
For the converse suppose Θ α=⇒ ∑i∈I pi·Θ′i. We have to find a matching transition, ∆ α=⇒ ∑i∈I pi·∆′i, such that ∆′i R Θ′i. In fact it is sufficient to find a transition ∆ α=⇒ ∆′ such that (∑i∈I pi·Θ′i) (≈s)† ∆′, since ((≈s)†)−1 ⊆ R and the decomposition of ∆′ into the required sum ∑i∈I pi·∆′i will again follow from the fact that (≈s)† is left-decomposable. To this end let us abbreviate ∑i∈I pi·Θ′i to simply Θ′.
We know from ∆ (≈s)† Θ, using the left-decomposability of (≈s)†, the convexity of ≈s and Remark 6.1, that Θ = ∑s∈⌈∆⌉ ∆(s)·Θs for some Θs with s ≈s Θs. Then by the definition of ≈s, s τ=⇒ ∆s for some ∆s such that Θs (≈s)† ∆s. Now using Theorem 6.5(ii) it is easy to show the left-decomposability of weak actions α=⇒. Then from Θ α=⇒ Θ′ we can derive that Θ′ = ∑s∈⌈∆⌉ ∆(s)·Θ′s such that Θs α=⇒ Θ′s, for each s in the support of ∆. Applying Corollary 7.1 to Θs (≈s)† ∆s we have, again for each s in the support of ∆, a matching move ∆s α=⇒ ∆′s such that Θ′s (≈s)† ∆′s. But, since s τ=⇒ ∆s, this gives s α=⇒ ∆′s for each s ∈ ⌈∆⌉; using the linearity of weak actions α=⇒, these moves from the states s in the support of ∆ can be combined to obtain the action ∆ α=⇒ ∑s∈⌈∆⌉ ∆(s)·∆′s. The required ∆′ is this sum, ∑s∈⌈∆⌉ ∆(s)·∆′s, since the linearity of (≈s)† gives ∆′ ((≈s)†)−1 Θ′.

(b) The second possibility is that ∆ ((≈s)†)−1 Θ, that is Θ (≈s)† ∆. But in this case the proof that the relevant moves from Θ and ∆ can be properly matched is exactly the same as in case (a).

⊓⊔

We also have a partial converse to Theorem 7.3:

Proposition 7.2. In a finitary pLTS, s ≈ Θ implies s ≈s Θ.

Proof. Let ≈s_bis be the restriction of ≈ to S×D(S), in the sense that s ≈s_bis Θ whenever s ≈ Θ. We show that ≈s_bis is a simple bisimulation. Suppose s ≈s_bis Θ.

(i) First suppose s α−→ ∆′. Then since s ≈ Θ there must exist some Θ α=⇒ Θ′ such that ∆′ ≈ Θ′. Now consider the degenerate action ∆′ τ=⇒ ∑t∈⌈∆′⌉ ∆′(t)·t. There must be a matching move from Θ′, namely Θ′ τ=⇒ Θ′′ = ∑t∈⌈∆′⌉ ∆′(t)·Θ′t, such that t ≈ Θ′t, that is t ≈s_bis Θ′t, for each t ∈ ⌈∆′⌉. By linearity, this means ∆′ (≈s_bis)† Θ′′, and by the transitivity of =⇒ we have the required matching move Θ α=⇒ Θ′′.
(ii) To establish the second requirement, consider the trivial move Θ τ=⇒ Θ. Since s ≈ Θ there must exist a corresponding move s τ=⇒ ∆ such that ∆ ≈ Θ. Since ≈ is a symmetric relation, we also have Θ ≈ ∆. Now by an argument symmetric to that used in part (i) we can show that this implies the existence of some ∆′ such that ∆ τ=⇒ ∆′, that is s τ=⇒ ∆′, and Θ (≈s_bis)† ∆′.

⊓⊔

But in general the relations ≈ and (≈s)† do not coincide for arbitrary distributions. For example, consider the two processes P = a.0 ½⊕ b.0 and Q = P ⊓ P. It is easy to see that ⟦P⟧ ≈ ⟦Q⟧ but not ⟦P⟧ (≈s)† ⟦Q⟧; the latter fails because the point distribution ⟦Q⟧ cannot be decomposed as ½·Θa + ½·Θb for some Θa and Θb such that ⟦a.0⟧ ≈s Θa and ⟦b.0⟧ ≈s Θb.

The nearest to a general converse to Theorem 7.3 is the following:

Proposition 7.3. Suppose ∆ ≈ Θ in a finitary pLTS. Then there is some Θ′ with Θ τ=⇒ Θ′ and ∆ (≈s)† Θ′.

Proof. Suppose ∆ ≈ Θ. We can rewrite ∆ as ∑s∈⌈∆⌉ ∆(s)·s, and the reflexivity of τ=⇒ gives ∆ τ=⇒ ∑s∈⌈∆⌉ ∆(s)·s. Since ≈ is a bisimulation this move can be matched by some Θ τ=⇒ Θ′ = ∑s∈⌈∆⌉ ∆(s)·Θs such that s ≈ Θs. But we have just shown in the previous proposition that this means s ≈s Θs. By Definition 6.2, ∆ (≈s)† Θ′, and therefore Θ τ=⇒ Θ′ is the required move. ⊓⊔

The weak bisimilarity ≈ from Definition 7.1 is our primary behavioural equivalence, but we will often develop properties of it via the connection we have just established with ≈s from Definition 7.2; the latter is more amenable as it only requires strong moves to be matched. However, we can also prove properties of ≈s by using this connection to weak bisimilarity; a simple example is the following:

Corollary 7.2. Suppose s ≈s Θ in a finitary pLTS, where s has no τ-transition. Then whenever Θ τ=⇒ Θ′ it follows that s ≈s Θ′.

Proof. Suppose s ≈s Θ, which means s (≈s)† Θ and therefore, by Theorem 7.3, s ≈ Θ. The move Θ τ=⇒ Θ′ must be matched by a corresponding move from s. However, since s has no τ-transition, the only possibility is the empty move, giving s ≈ Θ′. Now by Proposition 7.2 we have the required s ≈s Θ′. ⊓⊔

Corollary 7.3. In any finitary pLTS, the relation ≈ is linear.

Proof. Consider any collection of probabilities pi with ∑i∈I pi = 1, where I is a finite index set. Suppose further that ∆i ≈ Θi for each i ∈ I. We need to show that ∆ ≈ Θ, where ∆ = ∑i∈I pi·∆i and Θ = ∑i∈I pi·Θi.

By Proposition 7.3, there is some Θ′i with Θi τ=⇒ Θ′i and ∆i (≈s)† Θ′i. By Theorem 6.6(i) and Definition 6.2, both τ=⇒ and (≈s)† are linear. Therefore, we have Θ τ=⇒ Θ′ and ∆ (≈s)† Θ′, where Θ′ = ∑i∈I pi·Θ′i. It follows from Theorem 7.3 that ∆ ≈ Θ′.

Now for any transition ∆ α=⇒ (∑j∈J qj·∆j), where J is finite, there is a matching transition Θ′ α=⇒ (∑j∈J qj·Θj) such that ∆j ≈ Θj for each j ∈ J. Note that we also have the transition Θ α=⇒ (∑j∈J qj·Θj) according to the transitivity of τ=⇒. By symmetrical arguments, any transition Θ α=⇒ (∑j∈J qj·Θj) can be matched by some transition ∆ α=⇒ (∑j∈J qj·∆j) such that ∆j ≈ Θj for each j ∈ J. ⊓⊔


7.3 Compositionality

The main operator of interest in modelling concurrent systems is parallel composition. We will use a CCS-style parallel composition |. It is convenient for the proofs in Section 7.4, so we add it to our language rpCSP presented in Section 6.2. Its behaviour is specified by the following three rules.

    s1 α−→ ∆
    ─────────────────────
    s1 | s2 α−→ ∆ | s2

    s2 α−→ ∆
    ─────────────────────
    s1 | s2 α−→ s1 | ∆

    s1 a−→ ∆1    s2 a−→ ∆2
    ───────────────────────
    s1 | s2 τ−→ ∆1 | ∆2

Intuitively, in the parallel composition of two processes, each of them can proceed independently, or they can synchronise on any visible action they share with each other. The rules use the obvious extension of the function | on pairs of states to pairs of distributions. To be precise, ∆ | Θ is the distribution defined by:

(∆ | Θ)(s) = ∆(s1)·Θ(s2)   if s = s1 | s2
             0              otherwise

This construction can also be explained as follows:

Lemma 7.1. For any state t and distributions ∆, Θ,

(i) ∆ | t = ∑s∈⌈∆⌉ ∆(s)·(s | t);
(ii) ∆ | Θ = ∑t∈⌈Θ⌉ Θ(t)·(∆ | t).

Proof. Straightforward calculation. ⊓⊔
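Both the product of distributions and the three transition rules above are straightforward to animate. The following sketch uses our own notation, not the text's: states are strings, a parallel state is a pair, and the step relation is a dict from states to lists of (action, distribution) moves; par_step generates the moves of s1 | s2 from the moves of its components.

def par_dist(D, E):
    # (D | E)(s1 | s2) = D(s1) * E(s2), with pairs standing for s1 | s2
    return {(s1, s2): p * q for s1, p in D.items() for s2, q in E.items()}

def point(s):
    return {s: 1.0}

def par_step(s1, s2, steps):
    """All moves of s1 | s2: the two interleaving rules plus
    synchronisation on shared visible actions (tau never synchronises)."""
    moves = []
    for a, D in steps.get(s1, []):
        moves.append((a, par_dist(D, point(s2))))      # left component moves
    for a, E in steps.get(s2, []):
        moves.append((a, par_dist(point(s1), E)))      # right component moves
    for a, D in steps.get(s1, []):
        for b, E in steps.get(s2, []):
            if a == b and a != 'tau':
                moves.append(('tau', par_dist(D, E)))  # synchronisation
    return moves

# s performs a into a fair coin flip; t offers a single a.
steps = {'s': [('a', {'u': 0.5, 'v': 0.5})], 't': [('a', point('w'))]}
for move in par_step('s', 't', steps):
    print(move)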

We show that both ≈s and ≈ are closed under parallel composition. This requires some preliminary results, particularly on composing actions from the components of a parallel composition, as in Lemma 6.27.

Lemma 7.2. In a pLTS,

(i) ∆ α−→ ∆′ implies ∆ | Θ α−→ ∆′ | Θ, for α ∈ Actτ;
(ii) ∆1 a−→ ∆′1 and ∆2 a−→ ∆′2 implies ∆1 | ∆2 τ−→ ∆′1 | ∆′2, for a ∈ Act.

Proof. Each case follows by straightforward linearity arguments. As an example we outline the proof of (i). ∆ α−→ ∆′ means that

∆ = ∑i∈I pi·si,    si α−→ ∆i,    ∆′ = ∑i∈I pi·∆i.

For any state t, we have si | t α−→ ∆i | t. By linearity we can immediately infer that ∑i∈I pi·(si | t) α−→ ∑i∈I pi·(∆i | t), and this may be rendered as

∆ | t α−→ ∆′ | t.


By the second part of Lemma 7.1, (∆ | Θ) may be written as ∑t∈⌈Θ⌉ Θ(t)·(∆ | t), and therefore another application of linearity gives ∆ | Θ α−→ ∑t∈⌈Θ⌉ Θ(t)·(∆′ | t); by the same result this residual coincides with (∆′ | Θ). ⊓⊔

Lemma 7.3. In a pLTS,

(i) ∆ =⇒ ∆′ implies ∆ | Θ =⇒ ∆′ | Θ;
(ii) ∆ α=⇒ ∆′ implies ∆ | Θ α=⇒ ∆′ | Θ, for α ∈ Actτ;
(iii) ∆1 a=⇒ ∆′1 and ∆2 a=⇒ ∆′2 implies ∆1 | ∆2 τ=⇒ ∆′1 | ∆′2, for a ∈ Act.

Proof. Parts (ii) and (iii) follow from (i) and the corresponding result in the previous lemma.

For (i) suppose ∆ =⇒ ∆′. First note that a weak move from ∆ to ∑∞k=0 ∆×k = ∆′, as in Definition 6.4, can easily be transformed into a weak transition from (∆ | t) to ∑∞k=0 (∆×k | t). This means that for any state t we have (∆ | t) =⇒ (∆′ | t).

By the second part of Lemma 7.1, (∆ | Θ) can be written as ∑t∈⌈Θ⌉ Θ(t)·(∆ | t), and since =⇒ is linear, Theorem 6.5(i), this means (∆ | Θ) =⇒ ∑t∈⌈Θ⌉ Θ(t)·(∆′ | t), and again Lemma 7.1 renders this residual to be (∆′ | Θ). ⊓⊔

Theorem 7.4 (Compositionality of ≈s). Let s, t be states and Θ a distribution in an arbitrary pLTS. If s ≈s Θ then s | t ≈s Θ | t.

Proof. We construct the relation

R = {(s | t, Θ | t) | s ≈s Θ}

and check that R is a simple bisimulation in the associated pLTS. This will imply that R ⊆ ≈s, from which the result follows. Note that by construction we have that

(a) ∆1 (≈s)† ∆2 implies (∆1 | Θ) R† (∆2 | Θ) for any distribution Θ.

We use this property throughout the proof. Let (s | t, Θ | t) ∈ R. We first prove property (ii) in Definition 7.2, which turns out to be straightforward. Since s ≈s Θ, there is some ∆ such that s τ=⇒ ∆ and Θ (≈s)† ∆. An application of Lemma 7.3(ii) gives s | t τ=⇒ ∆ | t, and property (a) gives (Θ | t) R† (∆ | t).

has a matching weak transition fromΘ | t. Assume thats | t α−→ Γ for some actionα and distributionΓ . There are three possibilities:

• Γ is ∆ ′ | t for some actionα and distribution∆ ′ with s α−→ ∆ ′. Here we haveΘ α

=⇒Θ ′ such that∆ ′ (≈s)† Θ ′, sinces≈s Θ . Moreover by Lemma 7.3(ii), we can

deduceΘ | t α=⇒Θ ′ | t. Again by (a) we have(∆ ′ | t,Θ ′ | t) ∈ R

†, and thereforea matching transition.

• SupposeΓ is s | ∆ ′ wheret α−→ ∆ ′. Here a symmetric version of Lemma 7.2(i)givesΘ | t α−→Θ | ∆ ′. This is the required matching transition since we can use(a) above to deduce(s | ∆ ′,Θ | ∆ ′) ∈ R

†.

Page 248: Semantics of Probabilistic Processes - SJTU

238 7 Weak probabilistic bisimulation

• The final possibility forα is τ andΓ is (∆1 | ∆2) wheres a−→ ∆1 andt a−→ ∆2

for somea ∈ A. Here, sinces≈s Θ , we have a transitionΘ a=⇒ Θ ′ such that

∆1 (≈s)† Θ ′. By combining these transitions using part (iii) of Lemma 7.3 we

obtainΘ | t τ=⇒ Θ ′ | ∆2. Again this is the required matching transition since an

application of (a) above gives(∆1 | ∆2,Θ ′ | ∆2) ∈ R†.

⊓⊔

Corollary 7.4. In an arbitrary pLTS, ∆ (≈s)† Θ implies (∆ | Γ) (≈s)† (Θ | Γ).

Proof. A simple consequence of the previous compositionality result, using a straightforward linearity argument. ⊓⊔

Theorem 7.5 (Compositionality of ≈). Let ∆, Θ and Γ be any distributions in a finitary pLTS. If ∆ ≈ Θ then ∆ | Γ ≈ Θ | Γ.

Proof. We show that the relation

R = {(∆ | Γ, Θ | Γ) | ∆ ≈ Θ} ∪ ≈

is a bisimulation, from which the result follows.

Suppose (∆ | Γ, Θ | Γ) ∈ R. Since ∆ ≈ Θ, we know from Theorem 7.1 that some Θ′ exists such that Θ τ=⇒ Θ′ and ∆ (≈s)† Θ′, and the previous corollary implies that (∆ | Γ) (≈s)† (Θ′ | Γ); by Theorem 7.1 this gives (∆ | Γ) ≈ (Θ′ | Γ).

We now show that R is a weak bisimulation. Consider the transitions from both (∆ | Γ) and (Θ | Γ); by symmetry it is sufficient to show that the transitions of the former can be matched by the latter. Suppose that (∆ | Γ) α=⇒ (∑i pi·∆′i). Then (Θ′ | Γ) α=⇒ (∑i pi·Θ′i) with ∆′i ≈ Θ′i for each i. But by part (i) of Lemma 7.3, (Θ | Γ) τ=⇒ (Θ′ | Γ), and therefore by the transitivity of τ=⇒ we have the required matching transition (Θ | Γ) α=⇒ (∑i pi·Θ′i). ⊓⊔

We now introduce an extensional behavioural equivalence called reduction barbedcongruence and show that weak bisimilarity is both sound andcomplete for it.

Definition 7.3 (Barbs).For ∆ ∈ D(S) anda∈ Act let Va(∆) = ∑∆(s) | s a−→.We write∆ ⇓≥p

a whenever∆ =⇒ ∆ ′, whereVa(∆ ′)≥ p. We also we use the notation∆ 6⇓>0

a to mean that∆ ⇓≥pa does not hold for anyp> 0.

Then we say a relationR is barb-preservingif ∆ ⇓≥pa iff Θ ⇓≥p

a whenever∆ R Θ .It is reduction-closedif ∆ R Θ implies

(i) whenever∆ =⇒ ∆ ′, there is aΘ =⇒Θ ′ such that∆ ′ R Θ ′

(ii) wheneverΘ =⇒Θ ′, there is a∆ =⇒ ∆ ′ such that∆ ′ R Θ ′.

Finally, we say that in a binary relationR is compositionalif ∆1 R ∆2 implies(∆1 |Θ) R (∆2 |Θ) for any distributionΘ .

Page 249: Semantics of Probabilistic Processes - SJTU

7.4 Reduction barbed congruence 239

Definition 7.4. In a pLTS, let≈rbc be the largest relation over the states that is barb-preserving, reduction-closed and compositional.

Theorem 7.6 (Soundness).In a finitary pLTS, if∆ ≈Θ then∆ ≈rbc Θ .

Proof. Theorem 7.5 says that≈ is compositional. It is also easy to see that≈ isreduction-closed. So it is sufficient to prove that≈ is barb-preserving.

Suppose∆ ≈Θ and∆ ⇓≥pa , for any actiona and probabilityp; we need to show

thatΘ ⇓≥pa . We see from∆ ⇓≥p

a that∆ =⇒ ∆ ′ for some∆ ′ with Va(∆ ′) ≥ p. Sincethe relation≈ is reduction-closed, there existsΘ ′ such thatΘ =⇒Θ ′ and∆ ′ ≈Θ ′.The degenerate weak transition∆ ′ τ

=⇒ ∑s∈⌈∆ ′⌉∆ ′(s) · s must be matched by sometransition

Θ ′ τ=⇒ ∑

s∈⌈∆ ′⌉

∆ ′(s) ·Θ ′s (7.1)

such thats≈Θ ′s. By Proposition 7.2 we know thats≈s Θ ′

s for eachs∈ ⌈∆ ′⌉. Now ifs a−→Γs for some distributionΓs, thenΘ ′

=⇒Θ ′′s

a−→ τ=⇒Θ ′′′

s for some distributionsΘ ′′

s andΘ ′′′s with Γs (≈s)

† Θ ′′′s . It follows that|Θ ′′

s | ≥ |Θ ′′′s |= |Γs|= 1. LetSa be the

set of statess∈ ⌈∆ ′⌉ | s a−→, andΘ ′′ be the distribution

( ∑s∈Sa

∆ ′(s) ·Θ ′′s )+ ( ∑

s∈⌈∆ ′⌉\Sa

∆ ′(s) ·Θ ′s).

By the linearity and reflexivity of τ=⇒, Theorem 6.6, we have

( ∑s∈⌈∆ ′⌉

∆ ′(s) ·Θ ′s)

τ=⇒Θ ′′ (7.2)

By (7.1), (7.2) and the transitivity ofτ=⇒, we obtainΘ ′ τ

=⇒ Θ ′′, thusΘ =⇒ Θ ′′. Itremains to show thatVa(Θ ′′)≥ p.

Note that for each s ∈ Sa we have Θ′′s a−→, which means that Va(Θ′′s) = 1. It follows that

Va(Θ′′) = ∑s∈Sa ∆′(s)·Va(Θ′′s) + ∑s∈⌈∆′⌉\Sa ∆′(s)·Va(Θ′s)
        ≥ ∑s∈Sa ∆′(s)·Va(Θ′′s)
        = ∑s∈Sa ∆′(s)
        = Va(∆′)
        ≥ p.

⊓⊔

In order to establish a converse to Theorem 7.6, completeness, we need to work in a pLTS that is expressive enough to provide appropriate contexts and barbs in order to distinguish processes that are not bisimilar. For this purpose we use the pLTS determined by the language rpCSP. For the remainder of this section we focus on this particular pLTS.

We will eventually establish completeness by showing that ≈rbc is a bisimulation, but this requires that we first develop a few auxiliary properties of ≈rbc in this setting. The technique used normally involves examining the barbs of processes in certain contexts; the following lemma gives extra power to this technique. If an action name c does not appear in the processes under consideration, we say c is fresh.

Lemma 7.4. In rpCSP suppose (∆ | c.0) p⊕ ∆′ ≈rbc (Θ | c.0) p⊕ Θ′, where p > 0 and c is a fresh action name. Then ∆ ≈rbc Θ.

Proof. Consider the relation

R = {(∆, Θ) | (∆ | c.0) p⊕ ∆′ ≈rbc (Θ | c.0) p⊕ Θ′ for some ∆′, Θ′ and fresh c}.

We show that R ⊆ ≈rbc, by showing that R satisfies the three defining properties of ≈rbc.

(1) R is compositional. Suppose ∆ R Θ; we have to show (∆ | Φ) R (Θ | Φ), for any distribution Φ. Since ∆ R Θ there are some ∆′, Θ′ and fresh c such that

Λ ≈rbc Γ, where Λ = (∆ | c.0) p⊕ ∆′ and Γ = (Θ | c.0) p⊕ Θ′.    (7.3)

Since ≈rbc is compositional, we have (Λ | Φ) ≈rbc (Γ | Φ). Therefore, it follows that (∆ | Φ | c.0) p⊕ (∆′ | Φ) ≈rbc (Θ | Φ | c.0) p⊕ (Θ′ | Φ), which means, by definition, that (∆ | Φ) R (Θ | Φ).

(2) R is barb-preserving. Suppose ∆ ⇓≥q_a for some action a and probability q, where ∆ R Θ. Again we may assume (7.3) above. Consider the testing process a.c.b.0, where b is fresh. Since ≈rbc is compositional, we have

(Λ | a.c.b.0) ≈rbc (Γ | a.c.b.0).

Note that (Λ | a.c.b.0) ⇓≥pq_b, which implies (Γ | a.c.b.0) ⇓≥pq_b. Since c is fresh for Θ′, the latter has no potential to enable the action c, and thus Θ′ | a.c.b.0 is not able to fire the action b. Therefore, the action is triggered by Θ, and it must be the case that (Θ | c.0 | a.c.b.0) ⇓≥q_b, which implies Θ ⇓≥q_a.

(3) R is reduction-closed. Suppose ∆ R Θ and ∆ τ=⇒ ∆′′ for some distribution ∆′′. Let Λ and Γ be determined as in (7.3) above. Then Λ τ=⇒ (∆′′ | c.0) p⊕ ∆′. Since Λ ≈rbc Γ, there is some Γ′ with Γ =⇒ Γ′ and (∆′′ | c.0) p⊕ ∆′ ≈rbc Γ′. In other words, there are some Θ′′, Θ′′′ such that Γ′ ≡ (Θ′′ | c.0) p⊕ Θ′′′ with Θ τ=⇒ Θ′′ and Θ′ τ=⇒ Θ′′′. Thus (∆′′ | c.0) p⊕ ∆′ ≈rbc (Θ′′ | c.0) p⊕ Θ′′′. It follows that ∆′′ R Θ′′.

⊓⊔

Proposition 7.4. In rpCSP, if (∑i∈I pi·∆i) ≈rbc Θ with ∑i∈I pi = 1, then there are some Θi such that Θ τ=⇒ ∑i∈I pi·Θi and ∆i ≈rbc Θi for each i ∈ I.

Proof. Without loss of generality, we assume that pi ≠ 0 for all i ∈ I. Suppose that (∑i∈I pi·∆i) ≈rbc Θ. Consider the testing process T = ⊓i∈I (ai.0 ⊓ b.0), where all the ai and b are fresh and pairwise distinct actions. By the compositionality of ≈rbc, we have (∑i∈I pi·∆i) | T ≈rbc Θ | T. Now (∑i∈I pi·∆i) | T τ=⇒ ∑i∈I pi·(∆i | ai.0). Since ≈rbc is reduction-closed, there must exist some Γ such that Θ | T =⇒ Γ and ∑i∈I pi·(∆i | ai.0) ≈rbc Γ.

The barbs of ∑i∈I pi·(∆i | ai.0) severely constrain the possible structure of Γ. For example, since Γ ⇓̸>0_b, we have Γ ≡ ∑k∈K qk·(Θk | aki.0) for some index set K, where Θ τ=⇒ ∑k qk·Θk and ki ∈ I. For any indices k1 and k2, if ak1 = ak2, we can combine the two summands qk1·Θk1 and qk2·Θk2 into one (qk1 + qk2)·Θk12, where Θk12 = (qk1/(qk1+qk2))·Θk1 + (qk2/(qk1+qk2))·Θk2. In this way, we see that Γ can be written as ∑i∈I qi·(Θi | ai.0). Since Γ ⇓≥pi_ai, qi ≥ pi, and ∑i∈I pi = 1, we have pi = qi for each i ∈ I.

Therefore the required matching move is Θ τ=⇒ ∑i∈I pi·Θi. This follows because ∑i∈I pi·(∆i | ai.0) ≈rbc ∑i∈I pi·(Θi | ai.0), from which Lemma 7.4 implies the required ∆i ≈rbc Θi for each i ∈ I. ⊓⊔

Proposition 7.5. Suppose that ∆ ≈rbc Θ in rpCSP. If ∆ α=⇒ ∆′ with α ∈ Actτ, then Θ α=⇒ Θ′ for some Θ′ such that ∆′ ≈rbc Θ′.

Proof. We distinguish two cases.

(1) α is τ. This case is trivial because ≈rbc is reduction-closed.
(2) α is a, for some a ∈ Act. Let T be the process b.0 ✷ a.c.0, where b and c are fresh actions. Then ∆ | T =⇒ ∆′ | c.0 by Lemma 7.3(iii). Since ∆ ≈rbc Θ it follows that ∆ | T ≈rbc Θ | T. Since ≈rbc is reduction-closed, there is some Γ such that Θ | T =⇒ Γ and ∆′ | c.0 ≈rbc Γ. Since ≈rbc is barb-preserving we have Γ ⇓̸>0_b and Γ ⇓≥1_c. By the construction of the test T it must be the case that Γ has the form Θ′ | c.0 for some Θ′ with Θ a=⇒ Θ′. By Lemma 7.4 and ∆′ | c.0 ≈rbc Θ′ | c.0, it follows that ∆′ ≈rbc Θ′.

⊓⊔

Theorem 7.7 (Completeness). In rpCSP, ∆ ≈rbc Θ implies ∆ ≈ Θ.

Proof. We show that ≈rbc is a bisimulation. Because of symmetry it is sufficient to show that if ∆ α=⇒ ∑i∈I pi·∆i with ∑i∈I pi = 1, where α ∈ Actτ and I is a finite index set, then there is a matching move Θ α=⇒ ∑i∈I pi·Θi for some Θi such that ∆i ≈rbc Θi. In fact, because of Proposition 7.4, it is sufficient to match a simple move ∆ α=⇒ ∆′ with a simple move Θ α=⇒ Θ′ such that ∆′ ≈rbc Θ′. But this can easily be established using Proposition 7.5. ⊓⊔

7.5 Bibliographic notes

We have considered a notion of weak bisimilarity, which is induced from the simulation preorder. It turns out that this provides both a sound and a complete proof methodology for an extensional behavioural equivalence that is a probabilistic version of the well-known reduction barbed congruence. This result is extracted from [2],


where the proofs are carried out for a more general model called Markov automata [4], which describe systems in terms of events that may be nondeterministic, may occur probabilistically, or may be subject to time delays. As a distinguishing feature, the weak bisimilarity is based on distributions. It is strictly coarser than the state-based bisimilarity investigated in Section 3.5, even in the absence of invisible actions [6]. A decision algorithm for the weak bisimilarity is presented in [3]. For a state-based weak bisimilarity, a decision algorithm is given in [1]. Note that by taking the symmetric form of Definition 6.19 we can obtain a notion of weak failure bisimulation. Its characteristics are yet to be explored.

The idea of using barbs as simple experiments [10] and then deriving reduction barbed congruence originated in [7], but has now been widely used for different process description languages; for example, see [8, 11] for its application to higher-order process languages, [9] for mobile ambients and [5] for asynchronous languages. Incidentally, there are also minor variations on the formulation of reduction barbed congruence, often called contextual equivalence or barbed congruence, in the literature. See [5, 12] for a discussion of the differences.

References

1. Cattani, S., Segala, R.: Decision algorithms for probabilistic bisimulation. In: Proceedings of the 13th International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 2421, pp. 371–385. Springer (2002)
2. Deng, Y., Hennessy, M.: On the semantics of Markov automata. Information and Computation 222, 139–168 (2013)
3. Eisentraut, C., Hermanns, H., Kramer, J., Turrini, A., Zhang, L.: Deciding bisimilarities on distributions. In: Proceedings of the 10th International Conference on Quantitative Evaluation of Systems, Lecture Notes in Computer Science, vol. 8054, pp. 72–88. Springer (2013)
4. Eisentraut, C., Hermanns, H., Zhang, L.: On probabilistic automata in continuous time. In: Proceedings of the 25th Annual IEEE Symposium on Logic in Computer Science, pp. 342–351. IEEE Computer Society (2010)
5. Fournet, C., Gonthier, G.: A hierarchy of equivalences for asynchronous calculi. Journal of Logic and Algebraic Programming 63(1), 131–173 (2005)
6. Hennessy, M.: Exploring probabilistic bisimulations, part I. Formal Aspects of Computing 24(4-6), 749–768 (2012)
7. Honda, K., Yoshida, N.: On reduction-based process semantics. Theoretical Computer Science 151(2), 437–486 (1995)
8. Jeffrey, A., Rathke, J.: Contextual equivalence for higher-order pi-calculus revisited. Logical Methods in Computer Science 1(1:4) (2005)
9. Rathke, J., Sobocinski, P.: Deriving structural labelled transitions for mobile ambients. In: Proceedings of the 19th International Conference on Concurrency Theory, Lecture Notes in Computer Science, vol. 5201, pp. 462–476. Springer (2008)
10. Rathke, J., Sobocinski, P.: Making the unobservable, unobservable. Electronic Notes in Theoretical Computer Science 229(3), 131–144 (2009)
11. Sangiorgi, D., Kobayashi, N., Sumii, E.: Environmental bisimulations for higher-order languages. In: Proceedings of the 22nd IEEE Symposium on Logic in Computer Science, pp. 293–302. IEEE Computer Society (2007)
12. Sangiorgi, D., Walker, D.: The π-calculus: a theory of mobile processes. Cambridge University Press (2001)
