[ieee 2012 international conference on high performance computing & simulation (hpcs) - madrid,...



Page 1: [IEEE 2012 International Conference on High Performance Computing & Simulation (HPCS) - Madrid, Spain (2012.07.2-2012.07.6)] 2012 International Conference on High Performance Computing

Verification of P2P Live Streaming Systems Using Symmetry-based Semiautomatic Abstractions

Pedro de Carvalho Gomes
KTH Royal Institute of Technology
Stockholm, Sweden
[email protected]

Sergio Vale Aguiar Campos
Universidade Federal de Minas Gerais
Belo Horizonte, Brazil
[email protected]

Alex Borges Vieira
Universidade Federal de Juiz de Fora
Juiz de Fora, Brazil
[email protected]

Abstract—P2P systems are one of the most efficient data transport technologies in use today. In particular, P2P live streaming systems have been growing in popularity recently. However, analyzing such systems is difficult. Developers cannot test them completely, due to system size and complex dynamic behavior. This may lead to protocols that contain errors, are unfair, or perform poorly. One way of performing such an analysis is to use formal methods. Model Checking is one such method that can be used for the formal verification of P2P systems. However, it suffers from the combinatorial explosion of states. The problem can be mitigated with techniques such as abstraction and symmetry reduction. This work combines both techniques to produce reduced models that can be verified in feasible time. We present a methodology to generate abstract models of reactive systems semi-automatically, based on the model's symmetry. It defines modeling premises to make the abstraction procedure semiautomatic, i.e., without modification of the model. Moreover, it presents abstraction patterns based on the system symmetry and shows which properties are consistent with each pattern. The reductions obtained by the methodology were significant. In our test case of a P2P network, it enabled the verification of liveness properties over the abstract models, whereas the same verification did not finish on the original model after more than two weeks of intensive computation. Our results indicate that the use of model checking for the verification of P2P systems is feasible, and that our modeling methodology can increase the efficiency of the verification algorithms enough to enable the analysis of real complex P2P live streaming systems.

I. INTRODUCTION

P2P systems are extremely efficient in disseminating data to a large number of users. In particular, P2P live streaming systems can be used to broadcast video to very large numbers of viewers without overloading centralized servers [19]. However, making sure that these systems behave as expected can be difficult, due to their complex dynamic behavior.

Some of the most commonly used methods to check the behavior of a P2P live streaming system are simulation and data collection from real transmissions. Both methods are very useful, but do not solve all problems. Simulation is very efficient but, due to its nature, can miss important but uncommon events; as a consequence, its results are not guaranteed [17]. The same is true for analyzing real transmissions, which have the additional disadvantage of being complex to set up and run.

In this work, we propose the use of the formal method Model Checking to check the behavior and correctness of P2P live streaming systems. We also provide an exhaustive analysis of a P2P system model using our approach. We focus on liveness properties, which are the ones that guarantee that some condition may happen. We use, for this purpose, the existential subset of the Computation Tree Logic (CTL), named ∃CTL.

Despite the success of Model Checking in the verification of a large number of computational systems, it is susceptible to the state-space explosion problem. This problem is caused by the fact that the number of states in a model is exponential in the number of variables it contains [17]. This severely limits the complexity of models that can be verified.

The use of abstractions is one of the main techniques to mitigate the state-space explosion. It generates smaller abstract models by removing or grouping states and transitions from the original model [1]. Another technique to produce compact models is symmetry reduction [9]. It is applicable to models that present replicated structures. Its key idea is that some states present symmetrical behavior for the verification of some properties. So, instead of exploring all states, the exploration of only one of these symmetric states is sufficient.

Although such techniques have been widely studied, and even combined, they still have limitations. Symmetry reduction has a high computational cost to determine the symmetry between states [8]. Abstraction can lead to many inconclusive results if performed naively.

This work aims to address such limitations and assist the model checking task by using the intuitive knowledge that the software engineer has about the model. We present a methodology that helps the person checking the model to identify the symmetry in a formal and simple way.

The main contribution of this work is a semiautomatic technique to remove components from models of reactive systems. This new methodology removes parts (components) of the model by human intervention, but without altering its description; we do it by providing logical constraints that eliminate transitions, thus reducing the reachable states. The decision about which components of the model to eliminate is made using the identified symmetry.

We validate the methodology on a model of GridMedia [19], a well-known P2P live streaming system. The methodology may also be applied to other P2P systems, such as file sharing and tree-based live streaming applications. The quantitative results show that the abstractions used in this work have achieved considerable reductions in both verification time and model size. In fact, it was not possible to complete the computation of the number of reachable states for the original GridMedia model after two weeks. The same computation finished in less than 3 minutes using the new methodology. Moreover, the maximum number of reachable states found was less than 10^6, many orders of magnitude lower than the approximately 10^22 possible states, which demonstrates the power of model reduction through elimination of components.

978-1-4673-2362-8/12/$31.00 ©2012 IEEE

The remainder of this paper is organized as follows: we present related work in Section II. Section III discusses the theoretical foundations for the performed reductions. We present our new reduction methodology in Section IV. In Section V we present the GridMedia model and quantitative results. Finally, Section VI concludes our work.

II. RELATED WORK

The use of abstractions in Model Checking has a rich literature. Over-approximations are used in [1] to verify properties in ∀CTL logic. A dual result is presented in [2] for the verification of abstract models by under-approximation, using ∃CTL. We based much of our work on [3], which describes the verification of liveness properties for reactive systems. [4] presents a technique for automatic abstraction and refinement when verifying ∀CTL properties. More recently, [5] adapted the consolidated techniques of abstraction to Directed Model Checking. Our approach differs from the cited ones in the way we implement such abstractions, by removing components using direct specifications.

The exploitation of symmetry in models was proposed at the same time by [6] and [7]. Both use the concept of permutations over the states of a model to determine its symmetry. We borrow the theoretical definitions from [9], which are also used in [10]. The present work differs from the latter in some ways: first, we define the symmetry over components, not over states. The number of components is far smaller than the number of states. Thus, the computation of symmetry is significantly easier. Moreover, we do not perform symmetry reduction; we use symmetry only as a guide to perform the abstractions.

Previous works combine symmetry and abstractions. In [11], over-approximations and symmetry are combined to check ∀CTL formulas. Our work checks properties in ∃CTL. Under-approximation and symmetry are combined in [13], focusing on the falsification of properties. The technique is applied to On-the-fly Model Checking, which constructs the model at the same time it verifies the property. This work differs from ours mainly because we construct the abstract model before checking.

Although there are several works about formal verification of network protocols, there are few that focus on P2P networks and/or use Model Checking. Among these, [14] and [15] verify distributed hash table protocols; the first uses automated theorem proving, and the latter uses Model Checking. Moreover, Probabilistic Model Checking was used in [16] to verify a reorganizing protocol for P2P networks. To the best of our knowledge, the present work is the first one to verify a P2P Live Streaming system.

III. PRELIMINARIES

The following sections present the theoretical foundation for the performed reductions. Some explanations omit formal details, to help the intuitive comprehension.

A. Under-approximations

Abstraction by under-approximation systematically eliminates states and transitions. It creates abstract models that, although they have fewer behaviors, do not admit behaviors that are nonexistent in the original model. Let $M = (S, S_0, R, L)$ be the standard definition of a Kripke structure, where $S$ is the set of states, $S_0 \subseteq S$ is the set of initial states, $R \subseteq S \times S$ is the transition relation, and $L : S \to \mathcal{P}(AP)$ is the mapping from a state to its set of atomic propositions. An abstract Kripke structure is defined as $\hat{M} = (\hat{S}, \hat{S}_0, \hat{R}, \hat{L})$, where:
• $\hat{S} \subseteq S$.
• $\hat{S}_0 = \hat{S} \cap S_0$.
• $\hat{R} \subset R$, where $(s, s') \in \hat{R} \Rightarrow s, s' \in \hat{S}$.
• $\hat{L} = L$.

By the definition of $\hat{R}$, it is easy to see that not all paths in the original model also exist in the abstract model. Conversely, if a path exists in an abstract model, it necessarily also exists in the original model. Notice that the definition above also holds for the case where only transitions are eliminated. The use of ∃CTL [2], the fragment of CTL with existential operators only, guarantees the correctness of model checking with abstractions in this scenario. Intuitively, if a ∃CTL formula φ is verified to hold in the abstract model, the verification found at least one path that satisfies φ; thus φ also holds in the original model, since all the paths in an abstract model also exist in the original.

Checking ∃CTL formulas over under-approximated models is also valid for the purpose of falsification, i.e., for proving that an undesirable condition may happen.
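The definition above can be illustrated with a minimal Python sketch. The four-state model, the proposition name `done`, and all function names are hypothetical and chosen only for illustration; the property checked is the simple ∃CTL formula EF done, evaluated by plain reachability rather than by a real model checker.

```python
from typing import Dict, Set, Tuple

State = str
Transition = Tuple[State, State]


def reachable(initial: Set[State], transitions: Set[Transition]) -> Set[State]:
    """Return every state reachable from `initial` through `transitions`."""
    seen = set(initial)
    frontier = list(initial)
    while frontier:
        current = frontier.pop()
        for src, dst in transitions:
            if src == current and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return seen


def under_approximate(states, initial, transitions, kept):
    """Keep only the states in `kept` and the transitions between them."""
    abs_states = states & kept
    abs_initial = abs_states & initial
    abs_transitions = {
        (s, t) for (s, t) in transitions if s in abs_states and t in abs_states
    }
    return abs_states, abs_initial, abs_transitions


def holds_ef(initial, transitions, labels: Dict[State, Set[str]], prop: str) -> bool:
    """EF prop: some reachable state is labelled with `prop`."""
    return any(prop in labels[s] for s in reachable(initial, transitions))


# Hypothetical original model: two alternative paths from s0 to s3.
S = {"s0", "s1", "s2", "s3"}
S0 = {"s0"}
R = {("s0", "s1"), ("s0", "s2"), ("s1", "s3"), ("s2", "s3")}
L = {"s0": set(), "s1": set(), "s2": set(), "s3": {"done"}}

# Abstract model obtained by eliminating state s2.
abs_S, abs_S0, abs_R = under_approximate(S, S0, R, kept={"s0", "s1", "s3"})
print(holds_ef(abs_S0, abs_R, L, "done"))  # a witness path survives in the abstraction
```

If the abstraction is too aggressive (for instance, keeping only s0 and s1), EF done no longer holds in the abstract model although it holds in the original one, which is the inconclusive case.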

B. Permutation Groups

A permutation σ over a set $A$ is a bijective function $\sigma : A \to A$. The identity permutation $e$ is the one that maps each element of the set $A$ to itself. A permutation of the form $a_1 \mapsto a_2, a_2 \mapsto a_3, \ldots, a_{k-1} \mapsto a_k, a_k \mapsto a_1$ is called a cycle and is written as $(a_1\, a_2\, a_3 \ldots a_k)$. Any permutation can be written as the functional composition of disjoint cycles. The usual notation in works about symmetry in Model Checking is to represent functional composition by concatenation. Therefore, $(f \circ g)(x) = f(g(x))$ and $f \circ g$ is written $fg$.

A permutation group $G$ is a set of permutations associated with the operation of functional composition, such that:
• The identity permutation $e \in G$.
• For every permutation $\sigma \in G$ there exists a permutation $\sigma^{-1} \in G$ such that $\sigma\sigma^{-1} = e$.
• For all permutations $\sigma_1, \sigma_2 \in G$, $\sigma_1\sigma_2$ is also in $G$.

Permutations $\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_k$ are said to be the generators of a permutation group $G$ if $G$ is the closure of the set $\{\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_k\}$ under the operation of functional composition. Formally, $G = \langle \sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_k \rangle$.

In this paper we use permutations over the finite set of states $S$ of the Kripke structure $M$ that represents the system. However, a permutation can also be stated in terms of the components of the model. Let $p_i$ and $p_j$ be two identical components of a system. The function $\sigma(p_i) = p_j$ indicates the permutation of all the states where the variables of $p_i$ and $p_j$ have the same valuations.
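The notion of a group generated by a set of permutations can be sketched as follows. This is only an illustration under stated assumptions: permutations are represented as Python dictionaries over four hypothetical component names, and the closure is computed by repeated composition, which is sufficient for finite sets (inverses then appear automatically).

```python
def compose(f, g):
    """Return the permutation fg, i.e. the map x -> f(g(x))."""
    return {x: f[g[x]] for x in g}


def freeze(perm):
    """Hashable form of a permutation, used to detect duplicates."""
    return tuple(sorted(perm.items()))


def generate_group(generators, elements):
    """Closure of `generators` under composition, starting from the identity."""
    identity = {x: x for x in elements}
    group = {freeze(identity): identity}
    frontier = [identity]
    while frontier:
        perm = frontier.pop()
        for gen in generators:
            candidate = compose(gen, perm)
            key = freeze(candidate)
            if key not in group:
                group[key] = candidate
                frontier.append(candidate)
    return list(group.values())


# Hypothetical components and two transpositions, written (A B) and (C D).
COMPONENTS = "ABCD"
swap_ab = {"A": "B", "B": "A", "C": "C", "D": "D"}
swap_cd = {"A": "A", "B": "B", "C": "D", "D": "C"}

group = generate_group([swap_ab, swap_cd], COMPONENTS)
print(len(group))  # identity, (A B), (C D) and (A B)(C D)
```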

C. Symmetry Groups

A permutation group $G$ is also a symmetry group with respect to a Kripke structure $M$ if all its permutations preserve the transition relation and the initial states of $M$. Formally, $\forall s, s' \in S, \forall \sigma \in G\, [((s, s') \in R \iff (\sigma(s), \sigma(s')) \in R) \wedge (s \in S_0 \iff \sigma(s) \in S_0)]$.

Given a symmetry group $G$, we can partition the set of states $S$ into equivalence classes called orbits. Formally, the orbit of a state $s$ is the set of states $\theta(s) = \{t \mid \exists \sigma \in G, \sigma(s) = t\}$. Thus, we can generate a reduced model from the original one by selecting one representative from each orbit $\theta(s)$, which we call $rep(\theta(s))$. This reduced model is known as the abstract model.

Formally, the abstract model of a Kripke structure $M$ with respect to a symmetry group $G$ is defined as $M_G = (S_G, S^0_G, R_G, L_G)$, where:
• $S_G = \{\theta(s) \mid s \in S\}$ is the set of orbits of the states in $S$.
• $S^0_G = \{\theta(s) \mid s \in S_0 \wedge s = rep(\theta(s))\}$.
• $R_G = \{(\theta(s), \theta(s')) \mid (s, s') \in R\}$.
• $L_G(\theta(s)) = L(rep(\theta(s)))$.

The fact that $G$ is a symmetry group ensures that every initial state $s \in S_0$ of $M$ has the representative of its orbit in $S^0_G$. For the same reason, $R_G$ is well-defined and independent of the chosen representatives for each orbit.
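The construction of orbits and of the abstract model can be sketched in Python. The example is hypothetical: the system has two identical one-bit components, the symmetry group contains the identity and the swap of the two components, the representative of an orbit is assumed to be its smallest state, and each orbit is identified with its representative for readability.

```python
from itertools import product


def identity(state):
    return state


def swap(state):
    """Exchange the valuations of the two identical components."""
    a, b = state
    return (b, a)


def orbit(state, group):
    """theta(s): all states reachable from `state` by a permutation in `group`."""
    return frozenset(sigma(state) for sigma in group)


def rep(orb):
    """Representative of an orbit (assumption: the smallest state)."""
    return min(orb)


def quotient(states, initial, transitions, group):
    """Abstract model M_G, with each orbit identified by its representative."""
    states_g = {rep(orbit(s, group)) for s in states}
    initial_g = {rep(orbit(s, group)) for s in initial}
    transitions_g = {
        (rep(orbit(s, group)), rep(orbit(t, group))) for (s, t) in transitions
    }
    return states_g, initial_g, transitions_g


# Hypothetical model: each component may set its bit from 0 to 1, one at a time.
GROUP = [identity, swap]
STATES = set(product((0, 1), repeat=2))
INITIAL = {(0, 0)}
TRANSITIONS = {
    ((0, 0), (1, 0)),
    ((0, 0), (0, 1)),
    ((1, 0), (1, 1)),
    ((0, 1), (1, 1)),
}

states_g, initial_g, transitions_g = quotient(STATES, INITIAL, TRANSITIONS, GROUP)
print(sorted(states_g))
```

The states (0, 1) and (1, 0) fall into the same orbit, so the abstract model has three states instead of four.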

IV. METHODOLOGY

In this section we present the premises for the application of our reduction methodology. Moreover, we present an overview of how to perform the formal symmetry identification and the abstraction of components. The definitions presented here are illustrated in Section V.

A. Modeling

The application of our methodology relies on the presence of some modeling patterns, such as the separation between the internal behavior of a component and its interface with other components. In this section we present how each of these patterns is important either to implement the semiautomatic abstractions or to formally identify the symmetry. Fortunately, these patterns are common in the models of many reactive systems. For these systems, a component is an independent logical entity that communicates with other entities and acts based on its interactions. In a typical symmetrical model we have components of the same type. This is the case of the P2P Live Streaming network presented in Section V, where components are the participants of the transmission.

1) Internal Representation: The internal representation refers to the description of the finite state machines of the components in the model. It contains the declarations of the state variables and the transition rules for their values based on the current and next states.

Its definition must support semiautomatic abstractions, because removing a component from the model requires disabling the transitions that modify the values of its state variables. In other words, a component which never changes the values of its state variables is considered to be removed from the model. Section IV-C shows the assumptions that the internal representation must meet to enable the semiautomatic abstractions.

2) Communication Interfaces: A communication interface defines how the components interact. The input of a logic signal in a circuit or a communication message on a network illustrates this concept. Interfaces are a common medium between the internal and external representations. Typically they are implemented as a state variable shared between the communicating components.

3) External Representation: The external representation refers to the declaration of the components of the model and how they are organized. Some examples are the definitions of the connections of logical components in a circuit or the topology of a network. The analysis of the external representation is what determines the symmetry between the components. Moreover, the semiautomatic abstractions alter the external representation by logically eliminating connections between components, or by removing an entire component by eliminating all its communication.

B. Generation of the symmetry group

In this work, the symmetry guides which parts should be removed. Its use is justified by the smaller loss of behaviors it causes [8]. Intuitively, if the property under analysis preserves the symmetry of the model, we can consider only one of the symmetrical components. This representative contains the necessary information for the verification. We can also illustrate this concept in terms of variables and states. If the property preserves the symmetry of the model, then the valuations for the state variables of the representative component represent the same valuations of variables from the removed symmetric components.

The symmetry between the components is defined by generating the symmetry groups. As mentioned in Section III-B, these groups are sets of permutations that preserve the transition relation. The problem is equivalent to the classical graph isomorphism problem [17], and the check must be repeated for all candidate permutations. Our method is efficient because it performs such computation at the component level, in contrast to the state level. As a consequence, it manipulates a significantly smaller number of elements.

The model's external representation is analyzed to determine the symmetry group. If the system has been modeled as described in Section IV-A, then the task becomes easier because the model can be viewed as a graph where the components are the vertices and connections between components are the edges. If the permutation of two components maintains the transition relation in the graph of the external representation, then this permutation will be part of the symmetry group. Recall that a permutation between two components means the permutation among the states that have symmetric valuations for the state variables of both, as defined in Section III-B.

C. Semiautomatic abstractions

Semiautomatic abstractions are implemented at the level of components, by human intervention, but without modifying the model's description. This is accomplished by using direct specifications, which are semantic constructions of modeling languages (e.g. SMV [18]). Direct specifications are stated as propositional formulas, and the valid transitions are the ones for which such formulas evaluate to true. We use direct specifications to limit the transitions of the state variables of a component to only the transitions that do not alter the variables. The fact that the values of the state variables will not be altered implies that logically the component is no longer part of the model. Consequently, the number of reachable states is reduced, since the state variables of the removed component do not contribute to the number of possible valuations in the model.

The methodology requires that the internal representation of the components provide a default value for the communication variable. The key idea is that if there is no communication, then the variables of a component never change their values. In other words, the state of a component never changes. This is a valid and fairly common situation for many reactive systems, where components remain idle (keep the current state) until receiving data from another component. Consequently, if there is no communication, the component will always keep the same values for the state variables, and hence the states defined by other valuations become unreachable.

The semiautomatic abstractions eliminate states and transitions that impact the verification of the model. To illustrate, we can think of a model with two components, each with a single variable ranging from zero to three. If we force one of the components to always keep its variable at zero, the valuations of the system where one component's variable is two and the other component's variable is three would not be represented.
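The two-component illustration can be reproduced with a small reachability computation. This is a sketch under stated assumptions: each variable is assumed to range over 0 to 3 with 0 as the idle default value, components evolve asynchronously, and the `frozen` parameter plays the role of a direct specification that forbids any change to a component's variable. None of these names come from the GridMedia model.

```python
VALUES = range(4)  # assumption: variables range over 0..3, with 0 as the idle value


def successors(state, frozen=()):
    """One non-frozen component changes its variable to any other value."""
    for index, current in enumerate(state):
        if index in frozen:
            continue  # the constraint keeps this component's variable unchanged
        for value in VALUES:
            if value != current:
                yield state[:index] + (value,) + state[index + 1:]


def reachable(initial, frozen=()):
    """All system valuations reachable from `initial` under the constraint."""
    seen = {initial}
    frontier = [initial]
    while frontier:
        state = frontier.pop()
        for nxt in successors(state, frozen):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


original = reachable((0, 0))
abstract = reachable((0, 0), frozen={1})  # second component logically removed
print(len(original), len(abstract))
```

The abstract model keeps only the valuations in which the second variable is 0, so a valuation such as (2, 3) is reachable in the original model but not in the abstract one.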

So, the abstractions presented in this work are under-approximations. Consequently, the methodology of this study uses only formulas in the ∃CTL logic. As we mentioned in Section III-A, if a property holds for the abstract model, it also holds for the original model. But if it does not hold, the result is inconclusive. In this case the abstractions should be removed one by one, and the check should be performed again until one of two conditions holds: the property is proved to be true, which is a valid result; or there are no more abstractions to be removed and the property must be checked over the original model, which may be unfeasible.
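The iterative procedure described above can be sketched as a simple loop. The function `check` stands for a call to a model checker and is purely hypothetical, as are the abstraction names; the order in which abstractions are dropped (last one first) is also an assumption, since the text does not prescribe one.

```python
def verify_with_abstractions(check, abstractions):
    """Drop abstractions one by one until the ∃CTL property is proved.

    `check(active)` is assumed to return True when the property holds on the
    model constrained by the abstractions in `active`.
    Returns (result, abstractions still active when the loop stopped).
    """
    active = list(abstractions)
    while True:
        if check(active):
            return True, active  # holds in the abstract model, hence in the original
        if not active:
            return False, active  # checked on the original model: conclusive
        active.pop()  # inconclusive: remove one abstraction and try again


# Hypothetical checker: the property needs the connection between A and C.
def toy_check(active):
    return "cut_A_C" not in active


result, remaining = verify_with_abstractions(toy_check, ["cut_B_D", "cut_A_C"])
print(result, remaining)
```

In the worst case the loop ends with an empty list of abstractions, which corresponds to checking the original model and may be unfeasible in practice.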

Finally, only formulas that do not reference the components being removed are valid. Although the semiautomatic abstractions logically remove a component, it is in fact still present in the model. So, if the ∃CTL formula being verified has any reference to variables of the removed component, the model checker will still accept it as a valid formula. Consequently, it may yield incorrect results, because the formula refers to a component that will never change the values of its variables.

V. THE GRIDMEDIA MODEL

Recently the P2P architecture began to be used by applications broadcasting live content. Such applications are called P2P Live Streaming. They inherit all the characteristics of P2P systems for file sharing, in addition to a strong constraint of low latency on the transmission. Most of these systems implement an algorithm based on the BitTorrent network to distribute data. In it, the original information is fragmented into several pieces and distributed among the participants of the network; then the participants try to rebuild the original information by exchanging data with their partners. In this work we model a subset of the GridMedia network [19] (see footnote 1).

In GridMedia, information is generated solely by a special node, called the server. Unlike the concept of the same name in the client-server model, this node does not serve all participants. It just generates the live transmission data, divides it into small pieces called chunks, and distributes each chunk to a few participants. Each participant tries to reconstruct the original information by requesting the missing chunks from its neighbors, and forwards to them the chunks that it has but they do not.

The delivery of the information takes place in three steps, and is known as the pull method. First, each node constantly informs its partners which chunks it has and which chunks it needs through a data structure called the chunk map. After receiving the partner's chunk map, the node asks for a missing chunk. The partner receives the request and then replies by sending the requested chunk. GridMedia has a complementary method to pull, called push, which was not considered in our model. Figure 1 illustrates the pull scheme.

Fig. 1. Data delivery by the pull method
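The three steps of the pull method can be sketched in Python. This is a simplified illustration and not the GridMedia implementation: the class and method names are hypothetical, the chunk map is reduced to the set of chunks a node has, and a node is assumed to request the lowest-numbered missing chunk.

```python
class Node:
    """A participant holding a set of chunk identifiers."""

    def __init__(self, name, chunks=()):
        self.name = name
        self.chunks = set(chunks)

    def chunk_map(self):
        """Step 1: advertise which chunks this node has."""
        return frozenset(self.chunks)

    def pick_request(self, partner_map):
        """Step 2: choose a chunk the partner has and this node is missing."""
        missing = sorted(partner_map - self.chunks)
        return missing[0] if missing else None

    def serve(self, chunk_id):
        """Step 3: reply with the requested chunk, if available."""
        return chunk_id if chunk_id in self.chunks else None


def pull_round(requester, partner):
    """Run one chunk-map / request / reply exchange between two partners."""
    advertised = partner.chunk_map()
    wanted = requester.pick_request(advertised)
    if wanted is None:
        return None  # nothing to ask for
    delivered = partner.serve(wanted)
    if delivered is not None:
        requester.chunks.add(delivered)
    return delivered


node_a = Node("A", chunks={0, 1})
node_c = Node("C")
pull_round(node_c, node_a)
pull_round(node_c, node_a)
print(sorted(node_c.chunks))
```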

A. Internal Representation and Interfaces

The internal representation refers to the description of the state machines of the network participants. The model of the GridMedia network contains two types of participants: the server, which generates the content to be distributed and splits it into chunks; and nodes, which receive and forward chunks to their neighbors to reconstruct the original information. The model defines six types of messages plus the special value null, which indicates that no message was received.

Footnote 1: Available at http://www.csc.kth.se/~pedrodcg/files/gridmedia.smv

B. External Representation

The topology of the GridMedia network implemented in this work is shown in Figure 2. It has the server connected to nodes A and B in a first level. In a second level there are the nodes C and D, which have connections to both A and B, and between themselves.

Note that many relevant interactions are represented: the direct communication between the server and clients is represented by (server → A) and (server → B); the communication between clients on the first level and clients on the second level happens in (A ↔ C), (A ↔ D), (B ↔ C) and (B ↔ D); finally, the connection between two clients at the same level is represented by (C ↔ D). Note that there are no incoming edges to the server; this is because the server only sends data, and does not request anything. The topology was modeled in the SMV language and it contains the asynchronous composition of five processes.

Although this is a simple example, it is easy to see that this topology is highly symmetrical. As the P2P network grows, we notice more and more groups similar to the one in Figure 2 appearing. Even with a large number of clients, there are still symmetric parts in the P2P network, which allows our technique to be applied.

Fig. 2. Topology of the GridMedia network

C. Symmetry Group

The diagram representing the GridMedia network containsfive components, being four of them identical. They are con-nected by twelve edges only. Thus, to generate the symmetrygroup we simply exchange the symmetrical components andverify if the transition relation is preserved. Table I shows thecomputation of the symmetry group.

Some permutations of identical components do not preserve the transition relation, i.e., not all edges in these permutations are also present in the original model. This was the case for (NodeA NodeC) and (NodeA NodeB NodeC NodeD). The edges marked with an asterisk in Table I do not exist in the transition relation of the original model. The permutations (NodeA NodeD), (NodeB NodeC) and (NodeB NodeD) do not preserve the transition relation by analogy to (NodeA NodeC), and were omitted.

TABLE I
GENERATION OF THE SYMMETRY GROUP

Edge        Permutation
            (NodeA NodeB)   (NodeC NodeD)   (NodeA NodeC)   (NodeA NodeB NodeC NodeD)
S → A       S → B           S → A           S → C*          S → B
S → B       S → A           S → B           S → B           S → C*
A → C       B → C           A → D           C → A           B → D
C → A       C → B           D → A           A → C           D → B
A → D       B → D           A → C           C → D           B → A*
D → A       D → B           C → A           D → C           A → B*
B → C       A → C           B → D           B → A*          C → D
C → B       C → A           D → B           A → B*          D → C
B → D       A → D           B → C           B → D           C → A
D → B       D → A           C → B           D → B           A → C
C → D       C → D           D → C           A → D           D → A
D → C       D → C           C → D           D → A           A → D

Preserves the transition relation:
            Yes             Yes             No              No

The permutations between nodes A and B, and between nodes C and D, preserve the transition relation. Thus we conclude that the permutations 〈(NodeA NodeB), (NodeC NodeD)〉 are generators of the symmetry group G = {e, (NodeA NodeB), (NodeC NodeD), (NodeA NodeB)(NodeC NodeD)} with respect to the original model.
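The check summarized in Table I can be reproduced mechanically. The following Python sketch is only an illustration, not the tooling used in this work: it assumes that preserving the transition relation can be approximated by preserving the edge set of Fig. 2, and the names `EDGES`, `preserves_edges` and `symmetry_group` are hypothetical.

```python
from itertools import permutations

# Directed edges of the topology as described in the text.
# "S" is the server; it has no incoming edges.
EDGES = {
    ("S", "A"), ("S", "B"),
    ("A", "C"), ("C", "A"), ("A", "D"), ("D", "A"),
    ("B", "C"), ("C", "B"), ("B", "D"), ("D", "B"),
    ("C", "D"), ("D", "C"),
}
CLIENTS = ["A", "B", "C", "D"]


def preserves_edges(mapping):
    """True if renaming nodes by `mapping` maps the edge set onto itself."""
    image = {(mapping.get(u, u), mapping.get(v, v)) for u, v in EDGES}
    return image == EDGES


def symmetry_group():
    """All permutations of the four clients that preserve the edge set."""
    group = []
    for perm in permutations(CLIENTS):
        mapping = dict(zip(CLIENTS, perm))
        if preserves_edges(mapping):
            group.append(mapping)
    return group


def moved(mapping):
    """Nodes that a permutation actually moves (empty for the identity)."""
    return sorted(n for n in CLIENTS if mapping[n] != n)


group = symmetry_group()
for g in group:
    print(moved(g) or "identity")
```

Enumerating all 24 permutations is affordable here because only four components are interchangeable; for larger networks one would test candidate generators only, as done in Table I.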

D. Introduction of Abstractions

The abstractions in this work refer to the removal of entire components from the model and the consequent removal of states. The removal of a component may be the elimination of a connection between two participants, represented by the pair of variables that simulate message boxes, or the removal of a participant in the network, represented by the removal of all the connections that it maintains. The connection between two participants is abstracted by adding TRANS statements that restrict the valid transitions to those where the values of the communication variables are unchanged. The TRANS statement below removes the connection between node A and node C. The abstraction of a participant is made similarly, by adding constraints that prevent any change to all of its communication variables.

TRANS (next(node_A.connection2.event) = null)
    & (next(node_C.connection1.event) = null)
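Because the constraint has the same shape for every abstracted connection, it can be generated rather than written by hand. The Python sketch below is a hypothetical helper (`freeze_connections` is not part of SMV or of the model); it only assembles the constraint text from a list of communication variables, using the two variable names that appear in the statement above.

```python
def freeze_connections(variables):
    """Build a TRANS constraint that pins each listed event variable to null."""
    if not variables:
        raise ValueError("at least one communication variable is required")
    terms = [f"(next({var}) = null)" for var in variables]
    return "TRANS " + " & ".join(terms)


# The A-C connection from the text: one message-box variable on each side.
a_c_link = ["node_A.connection2.event", "node_C.connection1.event"]
print(freeze_connections(a_c_link))
```

Abstracting a whole participant would pass all of its communication variables in one call; their names would have to be taken from the actual model, since only the two above are given in the text.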

E. Abstract Models

In this section we construct the abstract models based on the symmetry group and on the property being checked. In the properties, nodes are represented by a process named Node_{A,B,C,D}. Each node has a variable named chunk_buffer. This is an array of two bits, where each bit represents one chunk of data: a bit set to 0 means that the node does not have that chunk; a bit set to 1 means that it does.


1) Property φ1: Partial Data Reconstruction: The first property checks the existence of a scenario where one client at distance two from the server can recreate the original information before another node, at the same distance, receives any part of it. The formula φ1 below captures this case, represented by the chunk buffer of NodeD containing both bits of data while the buffer of NodeC is empty:

φ1 = E [ node_C.chunk_buffer = 0B2_00 U node_D.chunk_buffer = 0B2_11 ]

As described in Section IV-C, only components that are not referenced in φ1 can be abstracted. Thus, the orbit generated by the permutation (NodeC NodeD) cannot be used, since both of its nodes are part of the formula. The permutation (NodeA NodeB) is the only valid one for generating the abstract model, and either of its nodes can be the representative of the orbit. We chose NodeA; consequently NodeB is abstracted. The abstract model is shown in Figure 3.

Fig. 3. Abstract model for φ1

The property φ1 holds for the abstract model. Thus it also holds for the original model.
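The step from the abstract model to the original one relies on the fact that restricting transitions can only remove witness paths, never add them. The Python sketch below illustrates this on a toy system under stated assumptions: it is an explicit-state least-fixpoint computation of E[p U q], not the symbolic algorithm of the model checker, and the three-component system with states (b, c, d) is hypothetical and far smaller than the GridMedia model.

```python
from itertools import product


def check_eu(transitions, states, p, q):
    """States satisfying E[p U q], computed as a least fixpoint."""
    sat = {s for s in states if q(s)}
    changed = True
    while changed:
        changed = False
        for src, dst in transitions:
            if src not in sat and dst in sat and p(src):
                sat.add(src)
                changed = True
    return sat


def step(state, i):
    """Successor in which component i receives its chunk, or None."""
    if state[i] == 1:
        return None
    nxt = list(state)
    nxt[i] = 1
    return tuple(nxt)


# Toy state: (b, c, d), each 0 (no chunk) or 1 (has chunk).
STATES = set(product((0, 1), repeat=3))
FULL = {(s, step(s, i)) for s in STATES for i in range(3)
        if step(s, i) is not None}
# Abstraction: component b is frozen, so its transitions are dropped.
ABSTRACT = {(s, t) for (s, t) in FULL if s[0] == t[0]}


def p(s):
    return s[1] == 0  # c still has nothing


def q(s):
    return s[2] == 1  # d has its chunk


sat_abs = check_eu(ABSTRACT, STATES, p, q)
sat_full = check_eu(FULL, STATES, p, q)
print((0, 0, 0) in sat_abs, sat_abs <= sat_full)
```

Every state that satisfies the formula under the restricted relation also satisfies it under the full relation, which is why a positive answer on the abstract model carries over.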

2) Property φ2: Multiple Paths: An expected property of P2P networks is that nodes not directly connected to the server may receive all data before the clients connected to the server do. This should be possible since the server partitions the original information and distributes different parts to the nodes connected directly to it. It is therefore expected that there is more than one path that information can take to reach any participant in the network. The formula φ2 below checks the case where the nodes connected to the server hold distinct data parts, but a node not connected to the server already has the full data:

φ2 = EF ( (node_A.chunk_buffer = 0B2_10) & (node_B.chunk_buffer = 0B2_01) & (node_C.chunk_buffer = 0B2_11) )

Both elements of the permutation (NodeA NodeB) appear in φ2, so it cannot be used to abstract components. The only element of the permutation (NodeC NodeD) mentioned in the formula is NodeC. Consequently we chose it as the representative of the orbit, and abstracted NodeD. The abstract model is shown in Figure 4.

The verification showed that φ2 holds for the abstract model, and we can conclude that it also holds for the original model.

Fig. 4. Abstract model for φ2

3) Property φ3: Transmission Error: As mentioned in Section III-A, checking ∃CTL properties is also valid for falsification, i.e., proving that a bad condition may happen. This is the case for the property φ3 below. It checks whether there exists a situation where two nodes at different distances from the server, e.g. NodeB and NodeD, never receive any data:

φ3 = EG ( node_B.chunk_buffer = 0B2_00 & node_D.chunk_buffer = 0B2_00 )

The permutations (NodeA NodeB) and (NodeC NodeD) may both be used to generate the abstract model, as each has a member that is not in φ3. In this case, NodeB and NodeD are automatically selected as representatives of the orbits, and NodeA and NodeC are abstracted. The abstract model is presented in Figure 5.

Fig. 5. Abstract model for φ3

The temporal formula φ3 holds for the abstract model. Thus we conclude that transmission can fail, and it is possible that no data is received at all.
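The choice of representatives in the three cases above follows one rule: an orbit is usable only if at least one of its members is absent from the formula, and a member that does appear in the formula is kept as the representative. The Python sketch below encodes this rule as we read it from the three examples; `ORBITS` and `plan_abstraction` are hypothetical names, and the tie-break of keeping the first member when none is referenced merely mirrors the choice of NodeA for φ1.

```python
# Orbits induced by the generators (NodeA NodeB) and (NodeC NodeD).
ORBITS = [("A", "B"), ("C", "D")]


def plan_abstraction(orbits, referenced):
    """Return (representatives, abstracted) for one property."""
    representatives, abstracted = [], []
    for orbit in orbits:
        in_formula = [n for n in orbit if n in referenced]
        free = [n for n in orbit if n not in referenced]
        if not free:
            continue  # every member is referenced: the orbit cannot be used
        # Prefer a node referenced in the formula as the representative;
        # otherwise any member will do, so take the first one.
        keep = in_formula[0] if in_formula else orbit[0]
        representatives.append(keep)
        abstracted.extend(n for n in free if n != keep)
    return representatives, abstracted


print(plan_abstraction(ORBITS, {"C", "D"}))       # nodes referenced in phi1
print(plan_abstraction(ORBITS, {"A", "B", "C"}))  # nodes referenced in phi2
print(plan_abstraction(ORBITS, {"B", "D"}))       # nodes referenced in phi3
```

The three calls reproduce the abstractions of Figures 3, 4 and 5: NodeB is removed for φ1, NodeD for φ2, and NodeA and NodeC for φ3.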

F. Comparison of the reductions

The original model has a total of approximately 10^22 states, counting all possible valuations of the state variables. Its complexity did not allow the calculation of the graph diameter (the maximum shortest path between any pair of vertices) or of the number of reachable states, even after more than two weeks of intensive computation on a dedicated processing server.

TABLE II
COMPARISON BETWEEN THE ABSTRACT MODELS

Abstract Model   Diameter   Reachable States   Total Time (s)   CPU Time (s)
Property φ1      41         500079             37.267           0.048
Property φ2      41         814135             143.137          0.100
Property φ3      28         793                0.201            0.016

By contrast, the computations performed for all abstract models finished in less than three minutes in all cases. Table II shows a quantitative comparison between the verification of the abstract models. It presents the diameter of the obtained graph, the number of reachable states, the CPU time spent to verify the property, and the total time between the beginning and the end of the model checker process.

The most remarkable fact was the reduction by orders of magnitude in the number of reachable states, from about 10^6 in the case of the abstract model for φ2 to approximately 10^3 in the abstract model for φ3. This shows the magnitude of the reduction in a model when a single additional component is abstracted. A similar result can be observed by comparing the diameters. In the abstract models for φ1 and φ2, where only one component was abstracted, the diameter was 41; in the model for φ3, which had two components removed, the value dropped to 28.
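The size of this reduction can be checked directly from the reachable-state counts in Table II. The short Python calculation below uses only those figures; the dictionary keys are illustrative labels.

```python
import math

# Reachable states per abstract model, taken from Table II.
reachable = {"phi1": 500079, "phi2": 814135, "phi3": 793}

ratio = reachable["phi2"] / reachable["phi3"]
orders = math.log10(ratio)
print(f"{ratio:.0f}x fewer reachable states, about {orders:.1f} orders of magnitude")
```

The ratio is slightly above one thousand, which agrees with the drop from roughly 10^6 to roughly 10^3 reachable states stated above.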

VI. CONCLUSION

The methodology presented in this paper makes it possible to effectively use formal methods, in particular model checking, in the analysis of large realistic P2P systems. It reduces models in terms of components, in contrast to reduction by states. Reasoning about components is more intuitive, since the modeling is done at this level, and more practical, because it deals with a smaller number of objects. Consequently the identification of symmetry in the model is simpler.

Furthermore, the quantitative results show that the removal of components can result in a significant reduction in the number of reachable states and can make possible the verification of models that were previously intractable. This was the case in our example, where the verification of properties in the original model did not finish after two weeks of intensive computation on a dedicated server. The same computations finished in less than three minutes for all the abstract models.

The identification of the symmetry proved to be a strong guide for choosing the components to be abstracted. It eliminated redundant behaviors and minimized the information loss. This is evidenced by the fact that it was never necessary to refine an abstract model because of an inconclusive result.

One limitation of the technique is that it applies only to highly symmetric models. Fortunately this is the case for several real models, such as the GridMedia network. A second limitation is that the abstractions remove information, which makes it possible to check only properties in ∃CTL.

Future work includes applying the methodology we propose to other kinds of P2P systems, such as file sharing, and to social networks. Moreover, we intend to extend our methodology to capture heterogeneous characteristics such as processing capacity and upload bandwidth. Finally, we intend to evaluate our approach in a dynamic scenario, following traces of real-world P2P applications.

ACKNOWLEDGMENT

The authors thank Dilian Gurov for the valuable feedback.

REFERENCES

[1] E. M. Clarke, O. Grumberg, and D. E. Long, “Model checking and abstraction,” ACM Transactions on Programming Languages and Systems, vol. 16, no. 5, pp. 1512–1542, September 1994.

[2] W. Lee, A. Pardo, J.-Y. Jang, G. Hachtel, and F. Somenzi, “Tearing based automatic abstraction for CTL model checking,” in ICCAD ’96: Proceedings of the 1996 IEEE/ACM International Conference on Computer-Aided Design. Washington, DC, USA: IEEE Computer Society, 1996, pp. 76–81.

[3] D. Dams, R. Gerth, and O. Grumberg, “Abstract interpretation of reactive systems,” ACM Trans. Program. Lang. Syst., vol. 19, pp. 253–291, March 1997.

[4] E. Clarke, O. Grumberg, S. Jha, Y. Lu, and H. Veith, “Counterexample-guided abstraction refinement for symbolic model checking,” J. ACM, vol. 50, no. 5, pp. 752–794, 2003.

[5] K. Drager, B. Finkbeiner, and A. Podelski, “Directed model checking with distance-preserving abstractions,” Int. J. Softw. Tools Technol. Transf., vol. 11, no. 1, pp. 27–37, 2009.

[6] C. N. Ip and D. L. Dill, “Better verification through symmetry,” in CHDL ’93: Proceedings of the 11th IFIP WG10.2 International Conference sponsored by IFIP WG10.2 and in cooperation with IEEE COMPSOC on Computer Hardware Description Languages and their Applications. Amsterdam, The Netherlands: North-Holland Publishing Co., 1993, pp. 97–111.

[7] E. A. Emerson, A. P. Sistla, and H. Weyl, “Symmetry and model checking,” 5th International Conference on Computer-Aided Verification, vol. 697, 1993.

[8] S. Jha, “Symmetry and induction in model checking,” Ph.D. dissertation, Carnegie Mellon University, Pittsburgh, PA, USA, 1996.

[9] A. Miller, A. Donaldson, and M. Calder, “Symmetry in temporal logic model checking,” ACM Comput. Surv., vol. 38, no. 3, p. 8, 2006.

[10] A. Donaldson and A. Miller, “A computational group theoretic symmetry reduction package for the spin model checker,” Lecture Notes in Computer Science, vol. 4019, pp. 374–380, September 2006.

[11] E. A. Emerson and R. J. Trefler, “From asymmetry to full symmetry: New techniques for symmetry reduction in model checking,” in Conference on Correct Hardware Design and Verification Methods. Springer, 1999, pp. 142–156.

[12] A. P. Sistla and P. Godefroid, “Symmetry and reduced symmetry in model checking,” ACM Trans. Program. Lang. Syst., vol. 26, no. 4, pp. 702–734, 2004.

[13] S. Barner and O. Grumberg, “Combining symmetry reduction and under-approximation for symbolic model checking,” Form. Methods Syst. Des., vol. 27, no. 1-2, pp. 29–66, 2005.

[14] R. Bakhshi and D. Gurov, “Verification of peer-to-peer algorithms: A case study,” Electron. Notes Theor. Comput. Sci., vol. 181, pp. 35–47, 2007.

[15] T. Lu, S. Merz, and C. Weidenbach, “Model Checking the Pastry Routing Protocol,” in 10th International Workshop Automated Verification of Critical Systems, J. Bendisposto, M. Leuschel, and M. Roggenbach, Eds. Dusseldorf, Germany: Universitat Dusseldorf, Sep. 2010, pp. 19–21, short communication.

[16] R. Heckel, “Stochastic analysis of graph transformation systems: A case study in P2P networks,” in Theoretical Aspects of Computing – ICTAC 2005, ser. Lecture Notes in Computer Science, D. Van Hung and M. Wirsing, Eds. Springer Berlin / Heidelberg, 2005, vol. 3722, pp. 53–69.

[17] E. M. Clarke, O. Grumberg, and D. Peled, Model Checking. The MIT Press, 1999.

[18] R. Cavada, A. Cimatti, C. A. Jochim, G. Keighren, E. Olivetti, M. Pistore, M. Roveri, and A. Tchaltsev, NuSMV 2.4 User Manual, 2005.

[19] L. Zhao, J.-G. Luo, M. Zhang, W.-J. Fu, J. Luo, Y.-F. Zhang, and S.-Q. Yang, “Gridmedia: A practical peer-to-peer based live video streaming system,” in Multimedia Signal Processing, 2005 IEEE 7th Workshop, Oct 2005, pp. 1–4.
