

IEEE TRANSACTIONS ON RELIABILITY, VOL. 66, NO. 2, JUNE 2017 529

An Approach for Identifying and Analyzing Implicit Interactions in Distributed Systems

Jason Jaskolka, Member, IEEE, and John Villasenor, Senior Member, IEEE

Abstract—Safety-critical system domains such as critical infrastructures, aerospace, automotive, and industrial manufacturing and control are becoming increasingly dependent on the use of distributed systems to achieve their functionality. These distributed systems can contain many complex interactions among their constituent components. Despite extensive testing and verification of individual components, security vulnerabilities resulting from unintended and unforeseen component interactions (so-called implicit interactions) often remain undetected and can have an impact on the safety, security, and reliability of a system. This paper presents an approach for identifying and analyzing the existence and severity of implicit interactions in distributed systems. The approach is based on the modeling framework known as communicating concurrent Kleene algebra (C2KA). Experimental results confirm that this approach can successfully identify and analyze dependencies in system designs that would otherwise be very hard to find. More broadly, the methods presented in this paper can help address the growing need for rigorous and practical methods and techniques for assuring the safe, secure, and reliable operation of distributed systems in critical domains.

Index Terms—Assurance, communicating concurrent Kleene algebra (C2KA), distributed systems, implicit interactions, modeling.

NOMENCLATURE

A. Acronyms and Abbreviations

C2KA    Communicating concurrent Kleene algebra.
CKA     Concurrent Kleene algebra.

B. Notations

K          CKA structure.
S          Stimulus structure.
≤K         Sub-behavior relation.
◦          Next behavior mapping.
λ          Next stimulus mapping.
A ↦ ⟨a⟩    Agent with name A and behavior a.
A          Set of agents.
→S         Potential for direct communication via stimuli.
→E         Potential for direct communication via shared environments.
⇝⁺         Potential for communication.
P          Set of possible agent interactions.
Pintended  Set of intended system interactions.
lcs(p, q)  Longest common substring of p and q.
|p|        Length of interaction p.
σ(p)       Severity of interaction p.
≼          Less severe relation.

Manuscript received July 11, 2016; revised October 14, 2016 and December 9, 2016; accepted February 2, 2017. Date of publication March 10, 2017; date of current version May 31, 2017. This work was supported by the U.S. Department of Homeland Security under Grant 2015-ST-061-CIRC01. This paper presents a revised and extended version of the material in [1]. Associate Editor: S.-Y. Hsieh.

J. Jaskolka is with the Center for International Security and Cooperation, Stanford University, Stanford, CA 94305 USA (e-mail: [email protected]).

J. Villasenor is with the Schools of Engineering, Public Policy, and Management at the University of California Los Angeles, Los Angeles, CA 90095 USA. He is also with the Center for International Security and Cooperation, Stanford University, Stanford, CA 94305 USA (e-mail: [email protected]).

Digital Object Identifier 10.1109/TR.2017.2665164

I. INTRODUCTION AND MOTIVATION

DISTRIBUTED systems consisting of numerous interacting software and/or hardware components are commonplace in many of today's critical systems, including critical infrastructures, industrial control systems, automobiles, airplanes, and spacecraft. For example, a modern automobile can have nearly 100 million lines of code distributed on 70 to 100 microprocessor-based electronic control units [2]. Due to the importance of such systems, there is a growing need for assurance that the systems will operate as expected. Individual system components are often subject to rigorous testing and verification to ensure that they perform their intended function. Despite these efforts, it is often not guaranteed that the overall system will always have its expected functionality once the components are combined.

In distributed systems, a significant number of issues can result from interactions among components that each satisfy their individual requirements [3]. These issues are not prevented simply by ensuring high reliability of individual components because the root problem is not caused by the components themselves, but rather by the overall system design. Methods that simply ensure the reliability or correctness of the individual system components often have little or no impact on ensuring system-wide safety or security. Vulnerabilities in distributed systems derive from the complex, and often nonlinear, interactions among the significant number of interconnected components. Normal system operation does not allow for the detection of these kinds of interactions [4]. Furthermore, design techniques aimed at addressing these vulnerabilities and issues, including redundancy and over-design, often overlook unwanted or unexpected component interactions entirely and, in some cases, can even contribute to the component interaction problem because they often add to the complexity of the design [3]. Recognizing that modern systems do not operate in isolation reflects a need for the development of detection and mitigation approaches that capture the intricate relationships existing between system components [5].



It is possible, and rather simple, to build systems with behaviors that can be hard to understand or predict [3]. This reflects the importance of component interactions that may be unfamiliar, unplanned, or unexpected, and either not visible or not immediately comprehensible by the system designers [6]. These kinds of complex interactions are referred to as implicit interactions, encompassing the means by which components in a system can be potentially influenced by other components in the system. In the literature, these kinds of interactions have also been referred to as hidden interactions. However, because it is not necessarily the case that these kinds of interactions are intentionally hidden from the view of system designers or operators, and because they are not part of the explicit system design, we refer to them as implicit interactions. The existence of implicit interactions can indicate unforeseen flaws in the design of a system that allow these kinds of interactions to be present. Further, an implicit interaction constitutes a linkage within a system of which designers are generally unaware, and that, therefore, can constitute a security vulnerability. These vulnerabilities can be exploited to mount a cyberattack at a later time if a user can gain access to the component from which the implicit interaction originates. In turn, this can have severe consequences in terms of the safety, security, and reliability of the system. All too often, these kinds of implicit interactions are only made visible or known when a system experiences some kind of attack or failure. Therefore, this notion of implicit interactions must be carefully managed to have systems that operate as intended, and that are resilient to cyberattacks and failures.

As the complexity of modern computer systems and networks continues to increase, assuring the safety, security, and reliability of distributed systems remains among the top priorities for governments and providers of communications, financial, electric, and other services (e.g., [7], [8]). There is an increasingly important need, particularly in light of the growth in complexity of distributed systems in critical infrastructure and other sectors, for the development of rigorous and practical methods and tools for determining whether such systems are protected from cyberthreats [7]. Moreover, formal verification and analytic tools are critical to building systems with improved security and safety assurance [8]. Achieving this goal requires the ability to detect undesirable interactions among system components [9].

In an effort to address this challenge, we present an approach for identifying and analyzing the existence and severity of implicit interactions in distributed systems. This work helps to develop an understanding of how and why implicit interactions can exist in distributed systems. Additionally, it aids in identifying deficiencies in important existing system components, allowing for better assessment of the risks being taken by using such components in critical systems. The presented approach is based on the specification and analysis of the communication among system components using the modeling framework C2KA [10], [11]. The approach presented in this paper aims to provide a new and systematic way of addressing the threat posed by implicit interactions, and it takes a step towards assuring the safe, secure, and reliable operation of distributed systems at early stages in their development.

The rest of this paper is organized as follows. Section II discusses the related work and compares and contrasts our contributions with the existing literature. Section III provides the necessary preliminary knowledge. Section IV outlines the specification of distributed systems using the C2KA modeling framework and an illustrative example. Section V details the proposed approach for identifying implicit interactions. Section VI provides a technique for analyzing identified implicit interactions to classify and measure their severity. Section VII discusses the proposed approach. Finally, Section VIII concludes and highlights our future work.

II. RELATED WORK

There is a large body of existing work in the area of analyzing component interactions in complex systems. In this section, we compare and contrast existing formalisms and approaches for modeling and analyzing component interactions, and we differentiate our contributions from this existing literature.

A. Formalisms for Modeling Complex Distributed Systems

Many formalisms for modeling complex distributed systems have been proposed in the literature. A significant proportion of these formalisms have aimed to capture the communication, concurrency, and dynamics of the components that comprise a given system. Examples of these existing formalisms include the Actor Model (e.g., [12]–[16]), process algebras (e.g., CCS [17], CSP [18], ACP [19], and π-calculus [20]), architectural formal modeling languages (e.g., AADL [21], EAST-ADL [22], and SysML [23]), Petri nets [24], labelled transition systems [25], action algebras [26]–[28], and Hoare et al.'s Concurrent Kleene Algebra (CKA) [29], [30].

While each of the abovementioned formalisms and languages has its merits, in this paper we elect to use C2KA [10], [11] for modeling and specifying the behaviors of distributed systems. In order to be capable of specifying systems at various levels of abstraction, we prefer a formalism that encompasses the characteristics of both state-based and event-based models, thereby providing a hybrid view of communication and concurrency. In contrast to C2KA, other formalisms for capturing the communication, concurrency, and dynamics of complex distributed systems do not directly, if at all, provide such a hybrid view.

B. Noninterference Approaches

The study of noninterference [31] is closely related to the issue of implicit interactions. When considering noninterference, a system is often modeled as a machine with inputs and outputs, each classified as either low-level or high-level. A system is said to have the noninterference property if and only if any sequence of low-level inputs will produce the same low-level outputs, regardless of what the high-level inputs are. This means that, in a system with the noninterference property, the behaviors of the low-level components of the system are not influenced by the behaviors of the high-level components. The examination of noninterference arose from the need to develop an


understanding of why undesirable interactions among components in systems were possible [32].

A wide variety of approaches have been proposed for ensuring that systems satisfy noninterference properties (e.g., [31], [33]–[37]). While the study of noninterference can help to understand why undesirable component interactions may exist in complex systems, there is a cost to characterizing system cybersecurity in terms of noninterference assertions [38]. This cost is attributed to the relatively complicated induction required to verify whether a noninterference policy is satisfied for a given system. In contrast, we aim to develop an approach that is primarily concerned with examining the influence of systems and components on one another through their communication.

C. Information Flow Analysis

One of the most prominent approaches for studying the interactions of components in complex distributed systems and networks has been information flow analysis [39]. Numerous approaches based on the study of information flow have been proposed, including those by Denning [40], McDermid and Shi [41], and Shaffer et al. [42]. Other approaches targeted the specification and analysis of information flow security requirements using various formalisms such as state machines (e.g., [43]), Petri nets (e.g., [44]), process algebras (e.g., [39], [45]), typing systems (e.g., [46]–[49]), as well as axiomatic approaches (e.g., [50], [51]).

Information flow analyses often attempt to express all of the possible ways in which information can be composed using fine-grained views of a system. These kinds of analyses are also typically conducted at later stages in software and system development, such as the implementation stage. For example, typing systems are capable of analyzing information flows within program code, but not at an earlier stage of development. Instead, we target an approach that can identify implicit interactions at much earlier stages in system development. A similar idea has been carried out by Alghathbar et al. [52] with the proposal of FlowUML, which aimed at detecting information flow violations at earlier stages of system development.

D. Other Approaches for Studying System and Component Interactions

A number of approaches for studying the interactions of systems and their components have been proposed using a variety of formalisms, methods, and techniques. For example, the problem of undesired component interactions has been well documented in the areas of hazard analysis and system safety with the development of procedures for identifying potential failure modes and events in critical systems (e.g., [3], [53], [54]). Similarly, a variety of risk formulations and analysis approaches for critical systems have been proposed, including those based on network analysis and fault trees (e.g., [55]), anomaly detection (e.g., [56]), and the examination of access control mechanisms (e.g., [57]). However, probabilistic risk assessments typically place an emphasis on identifying and dealing with failure events, with design errors only being considered indirectly through the probability of the failure event. Problems resulting from unwanted or unexpected component interactions and systemic factors are typically not considered.

As an alternative, a large proportion of existing work has aimed to provide formal analysis and verification of concurrent systems (e.g., [58]–[61]), as well as the formal verification of dynamic and parametrized systems and networks (e.g., [62]–[65]). These publications have laid important groundwork for approaches aimed at providing assurances that systems operate as expected as they continue to grow in size and complexity. Similarly, recent work aiming to formally identify risks and provide solutions for safely managing the complexity in the design and operation of flight-critical systems using a category theoretical approach has been proposed [66]. However, much of this work does not explicitly focus on addressing the issue of implicit interactions at the design stage of system development. In contrast, in this paper, we provide an approach that examines the interactions of components in distributed systems by analyzing the potential communication paths that arise from the system design and specification using an alternative modeling framework.

While many formalisms and approaches aiming to discuss and address issues related to implicit interactions exist, we propose an alternative approach meant to aid the designers of distributed systems in systematically assessing their designs by helping to identify potential vulnerabilities and risks at early stages of system development. This work aims to provide a different and complementary perspective on studying the interactions of systems and their components from that offered by existing approaches and formalisms.

III. PRELIMINARIES

A. Communicating Concurrent Kleene Algebra

C2KA [10], [11] is an algebraic framework for capturing the concurrent and communicating behavior of agents¹ in a distributed multiagent system. C2KA extends the algebraic model of CKA, first proposed by Hoare et al. [29], [30], to provide the capability to model open systems by allowing for the separation of communicating and concurrent behavior in a system and its environment and for the expression of the influence of stimuli on agent behaviors.

A C2KA is a mathematical system consisting of two semimodules which describe how a stimulus structure S and a CKA K mutually act upon one another to characterize the response invoked by a stimulus on an agent behavior as a next behavior and a next stimulus. The left S-semimodule (S K, +) describes how the stimulus structure S acts upon the CKA K via the next behavior mapping ◦, and the right K-semimodule (S K, ⊕) describes how the CKA K acts upon the stimulus structure S via the next stimulus mapping λ. We refer the reader to the Appendix for a summary of the algebraic structures mentioned in this discussion.

Formally, a C2KA is defined as shown in Definition 1.

¹ Throughout this paper, we use the term agent in the sense used by Milner [67] to refer to any system, component, or process whose behavior consists of discrete actions. When speaking of agents and agent behaviors, we write A ↦ ⟨a⟩ to indicate that A is the name given to the agent and a ∈ K is the agent behavior (see Definition 1).


Definition 1 (C2KA – e.g., [11]): A C2KA is a system (S, K), where S = (S, ⊕, ⊙, d, n) is a stimulus structure and K = (K, +, ∗, ;, (∗), (;), 0, 1) is a CKA such that (S K, +) is a unitary and zero-preserving left S-semimodule with mapping ◦ : S × K → K and (S K, ⊕) is a unitary and zero-preserving right K-semimodule with mapping λ : S × K → S, and where the following axioms are satisfied for all a, b, c ∈ K and s, t ∈ S:

1) s ◦ (a ; b) = (s ◦ a) ; (λ(s, a) ◦ b);
2) a ≤K c ∨ b = 1 ∨ (s ◦ a) ; (λ(s, c) ◦ b) = 0;
3) λ(s ⊙ t, a) = λ(s, (t ◦ a)) ⊙ λ(t, a);
4) s = d ∨ s ◦ 1 = 1;
5) a = 0 ∨ λ(n, a) = n. □

C2KA offers three levels of specification with which we can specify a distributed multiagent system. Depending on which level of specification we are working at, the model can be viewed as either event-based or state-based. This gives flexibility in allowing us to choose the level that is most suitable for the given problem. The stimulus-response specification of agents gives the specification of the next behavior and next stimulus mappings for each agent in the system. The abstract behavior specification specifies each agent behavior as a CKA term. The concrete behavior specification provides the state-level specification of each agent behavior. At this level, the concrete programs for each of the CKA terms which specify each agent behavior are given using any suitable programming or specification language.

B. Potential for Communication

Distributed systems contain a significant number of interactions among their constituent agents. Any interaction, direct or indirect, of an agent with its neighboring agents can be understood as a communication [67]. Therefore, any potential for communication between two system agents can be characterized by the existence of a communication path allowing for the transfer of data or control from one agent to another. In this paper, we examine the influence of system agents on each other through their potential for communication. The study of agent influence allows for the determination of the overall structure of the distributed system that the agents comprise. A full treatment of the potential for communication within distributed multiagent systems specified using C2KA has been given in [11] and [68] and is highlighted below.

Consider a distributed system formed by a set A of agents with A, B ∈ A such that A ≠ B. Communication via stimuli from agent A to agent B is said to have taken place only when a stimulus generated by A influences (i.e., causes an observable change in, directly or indirectly) the behavior of B. Note that it is possible that more than one agent is influenced by the generation of the same stimulus by another agent in the system. Formally, we say that agent A ↦ ⟨a⟩ has the potential for direct communication via stimuli with agent B ↦ ⟨b⟩ (denoted by A →S B) if and only if ∃(s, t | s, t ∈ Sb ∧ t ≤S λ(s, a) : t ◦ b ≠ b), where Sb is the set of all basic stimuli.²

² A stimulus is called basic if it is indivisible with regard to the sequential composition operator of a stimulus structure.

Similarly, we say that agent A has the potential for communication via stimuli with agent B using at most n basic stimuli (denoted by A →Sⁿ B) if and only if ∃(C | C ∈ A ∧ C ≠ A ∧ C ≠ B : A →S⁽ⁿ⁻¹⁾ C ∧ C →S B). More generally, we say that agent A has the potential for communication via stimuli with agent B (denoted by A →S⁺ B) if and only if ∃(n | n ≥ 1 : A →Sⁿ B). When A →S⁺ B, there is a sequence of stimuli of arbitrary length which allows for the transfer of data or control from agent A to agent B in the system.

Communication via shared environments from agent A to agent B is said to have taken place only when A has the ability to alter an element of the environment that it shares with B, such that B is able to observe the alteration that was made. Formally, we say that agent A ↦ ⟨a⟩ has the potential for direct communication via shared environments with agent B ↦ ⟨b⟩ (denoted by A →E B) if and only if a R b, where R is a given dependence relation (see Appendix). More generally, agent A has the potential for communication via shared environments with agent B (denoted by A →E⁺ B) if and only if a R⁺ b, where R⁺ is the transitive closure of the given dependence relation. This means that if two agents respect the given dependence relation, then there is a potential for communication via shared environments. For the purpose of this paper, the dependence relation R is generated as a definition-reference relation between program variables in the concrete behavior specifications of agents.
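For concreteness, the following minimal Haskell sketch shows how such a definition-reference dependence relation could be computed from define/reference sets extracted from concrete behavior specifications. The type and function names, and the example define/reference sets, are illustrative assumptions; this is not the prototype tool's implementation.

-- Sketch of the definition-reference dependence relation R between agent
-- behaviors (illustrative names only; not the prototype tool's API).
import qualified Data.Set as Set

type Var = String

data Behavior = Behavior
  { behaviorName :: String
  , defines      :: Set.Set Var   -- variables assigned by the concrete program
  , references   :: Set.Set Var   -- variables read by the concrete program
  }

-- a R b holds when behavior a defines a variable that behavior b references.
dependsOn :: Behavior -> Behavior -> Bool
dependsOn a b = not (Set.null (defines a `Set.intersection` references b))

-- Hypothetical behaviors: a assigns the shared variable x, which b reads.
a1, b1 :: Behavior
a1 = Behavior "a" (Set.fromList ["x"]) Set.empty
b1 = Behavior "b" Set.empty (Set.fromList ["x"])

main :: IO ()
main = print (a1 `dependsOn` b1)  -- prints True, i.e., a R b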

By combining the definitions of potential for communication via stimuli and via shared environments, a formulation of the potential for communication between two agents is obtained. Agent A is said to have the potential for direct communication with agent B (denoted by A ⇝ B) if and only if A →S B ∨ A →E B. Moreover, agent A is said to have the potential for communication with agent B (denoted by A ⇝⁺ B) if and only if A ⇝ B ∨ ∃(C | C ∈ A : A ⇝ C ∧ C ⇝⁺ B). For a given distributed system, if there exists a sequence of agents, starting with an agent A and ending with an agent B, that have the potential for direct communication either via stimuli or via shared environments, then agent A has the potential for communication with agent B.
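For readers who prefer an operational reading, the following minimal Haskell sketch expresses the potential for communication as reachability over the union of the two direct relations. The data types and function names are illustrative assumptions; in particular, the sketch presumes that the direct relations →S and →E have already been computed from the C2KA specification, and it is not the prototype tool's implementation.

-- Sketch: A ⇝ B as a direct labeled edge (via stimuli or shared environment)
-- and A ⇝+ B as reachability through intermediate agents.
import Data.List (nub)

data EdgeKind = ViaStimuli | ViaEnv deriving (Eq, Show)

-- A directed, labeled edge of the underlying communication graph.
type Edge agent = (agent, EdgeKind, agent)

-- A ⇝ B: potential for direct communication.
directly :: Eq agent => [Edge agent] -> agent -> agent -> Bool
directly es a b = any (\(x, _, y) -> x == a && y == b) es

-- A ⇝+ B: potential for communication (direct, or through intermediaries).
communicates :: Eq agent => [Edge agent] -> agent -> agent -> Bool
communicates es a b = go [a] []
  where
    go [] _ = False
    go (x : xs) seen
      | directly es x b = True
      | x `elem` seen   = go xs seen
      | otherwise       = go (xs ++ successors x) (x : seen)
    successors x = nub [y | (x', _, y) <- es, x' == x]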

This notion of the potential for communication between agents in a distributed system will be used to formulate the existence of implicit interactions in Section V.

C. Network and Graph Concepts

Let A be a set of agents. A communication graph is a directed graph G = (A, E), where E ⊆ A × A, and (A, B) ∈ E indicates that agent A can directly communicate with agent B [69]. Throughout this paper, we consider communication graphs with two different kinds of edges, denoting potential for communication via stimuli and potential for communication via shared environments.

In a graph, a walk is an alternating sequence of vertices and connecting edges that may travel over any edge and any vertex any number of times. In contrast, a path is a walk that does not include any vertex more than once, except in the case that the path begins and ends on the same vertex.


Fig. 1. Collaboration diagram depicting the expected communication for the manufacturing cell control system.

Throughout the remainder of this paper, we represent walks and paths of a communication graph as sequences of agents in the following form: A1 →X1 A2 →X2 ⋯ →X(k−1) Ak, where for all 1 ≤ i < k, Xi ∈ {S, E}, such that →S denotes potential for direct communication via stimuli and →E denotes potential for direct communication via shared environments. The length of a path (or walk) p of the form described above (denoted |p|) is counted by the number of direct communications of which it is comprised (i.e., |p| = k − 1). A subpath of a path (or walk) p = A1 →X1 A2 →X2 ⋯ →X(k−1) Ak is a contiguous subsequence of its vertices and edges. That is, for any i, j such that 1 ≤ i ≤ j ≤ k, the path (or walk) q = Ai →Xi A(i+1) →X(i+1) ⋯ →X(j−1) Aj is a subpath of p.

It is well known in graph theory that every walk from a vertex u to a vertex v contains a path from u to v [70]. This important attribute will be used in Section V to identify and characterize the possible interactions that exist among agents in a given distributed multiagent system.
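The following short Haskell sketch makes this reduction concrete: shortcutting a labeled walk at every repeated vertex yields a path with the same endpoints. The representation and names are illustrative assumptions, not part of the formalism or the prototype tool.

-- Sketch: every walk contains a path with the same endpoints, obtained by
-- cutting out the cycle between repeated occurrences of a vertex.
-- A walk is a start vertex followed by labeled steps.
type Walk v e = (v, [(e, v)])

toPath :: Eq v => Walk v e -> Walk v e
toPath (v0, steps) = (v0, go v0 steps)
  where
    go v ss =
      case break ((== v) . snd) ss of
        (_, _ : afterCycle) -> go v afterCycle       -- v occurs again later: drop the cycle
        _ -> case ss of
               []            -> []
               (e, w) : rest -> (e, w) : go w rest   -- keep this step and continue from w

-- e.g., toPath ("A", [("s","B"), ("e","A"), ("s","C")]) == ("A", [("s","C")])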

D. Tool Support

In order to support the automated analysis of distributed multiagent systems specified using C2KA for the existence of implicit interactions, we use the C2KA prototype tool described in [11]. The tool is implemented using the functional programming language Haskell and makes use of the Maude term rewriting system [71]. It admits the specification of agent behaviors using C2KA and automatically provides a list of all potential communication paths between each pair of agents in a given system. For the purpose of this paper, the tool has been extended to automatically provide a list of the identified implicit interactions in the given system along with a measure of the severity of each. The usage of the extended prototype tool for automating the analysis and verification of the existence of implicit interactions will be demonstrated and discussed throughout the remainder of this paper.

IV. MODELING SYSTEMS USING C2KA

In order to analyze a given system for the presence of implicit interactions, we first need a suitable model of the system. For this, we use the C2KA modeling framework.

A. Illustrative Example: Manufacturing Cell

To illustrate the proposed approach for identifying and analyzing implicit interactions in distributed multiagent systems, we consider an example of a distributed manufacturing cell control system, adapted for illustration from [72], consisting of four primary agents (components): Control Agent C, Storage Agent S, Handling Agent H, and Processing Agent P. The Control Agent C is responsible for coordinating the activities of the other agents in the system, and for maintaining the overall system state. The Storage Agent S is responsible for storing materials required for the manufacturing assembly, and for maintaining a record of its empty/full status. The Handling Agent H is responsible for moving the materials from storage so that they can be processed, and for recording the readiness of the material for processing. Finally, the Processing Agent P is responsible for processing the material to its manufactured state. The operation of the manufacturing cell control system can be visualized as shown in the collaboration diagram given in Fig. 1, where the solid arrows denote message-passing communication (i.e., communication via stimuli) and the dashed arrows denote shared variable communication (i.e., communication via shared environments).

When the system is ready to begin manufacturing, a start event is triggered. C begins the manufacturing process by sending a load request to S, which responds by entering its loaded behavior (FULL) and by assigning the shared variable status the value 1. When the loading is complete, S broadcasts a loaded


message. C responds by transitioning to its preparation behavior (PREP) which assigns the shared variable state the value 1 to indicate the system is in the preparing state. It also sends a prepare request. H responds by transitioning to its moving behavior (MOVE) which verifies that the material storage is loaded (i.e., by checking that status = 1) before assigning the shared variable material the value 1 to indicate that the material is ready to be processed. If the material storage is not loaded, then material is assigned the value 0. H also sends an unload request to S, which responds by entering its unloaded behavior (EMPTY) and by assigning status the value 0. After unloading, S broadcasts an unloaded message that causes C to transition to its initializing behavior (INIT) which assigns state the value 2 to indicate the system is in the initializing state, and a setup event is issued. P responds by entering its setup behavior (SET) which ensures that the material is ready to be processed, the system is in its initializing state, and the material storage is empty. If the condition is satisfied, then the shared variable ready is assigned the value 1 to indicate that the system is set for processing; otherwise, ready is assigned the value 0. P then sends a ready message which causes H to transition to its waiting behavior (WAIT) and to send a process event. Both P and C respond by moving to their working (WORK) and processing (PROC) behaviors, respectively. The working behavior of P verifies that the material is ready for processing (i.e., by checking that material = 1) before executing the PROCESS() procedure. If the material is not ready for processing, then P does not process the part (i.e., part is null). The processing behavior of C assigns state the value 3 to indicate the system is in the processing state. When C is finished processing, it issues a done message that causes P to return to its standby behavior (STBY). Similarly, once P is finished working, it issues a processed event that causes C to return to its idle behavior (IDLE) which assigns state the value 0 to indicate the system is in the idle state. C then sends an end message that may indicate to other systems connected to this manufacturing cell that the manufacturing process has been completed. In this way, the end message may act as the stimulus that initiates the behavior of some other system in the manufacturing assembly plant. At this point, the control system awaits another start event to begin the manufacturing process again.

This distributed manufacturing cell control system will serve as a running example throughout the remainder of this paper to show how to use C2KA to specify the system, and to demonstrate the proposed approach for identifying and analyzing the existence and severity of implicit interactions. Although this example is presented in the context of manufacturing, analogous communication and dependencies are found in nearly all distributed systems.

B. Motivation for the Use of C2KA

Most complex distributed systems are open systems, meaning that they participate in intensive communication and exchange with their environment, which often includes other systems. For example, many systems need input in terms of energy, resources, information, etc., and as a result, the interactions between a system and its environment need to be carefully taken into account when modeling such systems [73]. C2KA allows for the separation of communicating and concurrent behavior in a system and its environment and for the expression of the influence of stimuli on agent behaviors, thereby providing the capability to model open systems.

Existing formalisms for capturing the concurrent and communicating behavior of agents do not directly, if at all, provide a hybrid view of communication and concurrency encompassing the characteristics of both state-based and event-based models. Usually, formalisms are either state-based or event-based. Even with a formalism such as CKA [29], [30], which can be seen as a hybrid model for concurrency, the notion of communication is not directly captured. Communication can only be perceived when programs are given in terms of the dependencies of shared events, thereby requiring the instantiation of a low-level model of programs and traces for CKA to define any sort of communication [74]. Instead, we wanted a way to specify communication without the need to articulate the state-based system of each action (i.e., at a convenient abstract level).

Furthermore, other formalisms do not directly deal with describing how the behaviors of agents are influenced by stimuli in a system. When considering open systems, stimuli are required to initiate agent behaviors. This is to say that agents in an open system need an external influence from the world in which they reside to begin their operation. Existing formalisms, such as CKA and process calculi, deal primarily with closed systems where there is no external influence on the behaviors of agents, and they do not directly, if at all, consider agent behaviors in open systems. In contrast, C2KA offers an algebraic setting capable of capturing both the influence of stimuli on agent behavior as well as the communication and concurrency of agents at the abstract algebraic level, thereby allowing it to capture the dynamic behavior of complex distributed systems.

C. Specifying Agent Behavior Using C2KA

When specifying a given distributed system using the C2KA framework, we first need to identify the set of agents that exist in the system. This can often be guided by examining the components that comprise the overall system to be specified. In our running example, we consider the manufacturing cell control system formed by a set A consisting of the four agents {C, S, H, P}. Next, we need to identify the set of basic stimuli that can be introduced and the set of basic agent behaviors for the system. These sets will be used to generate the support sets of the stimulus structure S and the CKA K that comprise the C2KA to be used for the specification of the system agents. Stimuli are abstract representations of messages that can be exchanged among the system agents. Therefore, to determine the set of basic stimuli, we need to consider what messages need to be passed from one agent to another to achieve their desired functionality. This can often be extracted from the system requirements and the message-passing communication shown in the collaboration diagram in Fig. 1. With respect to the system description provided in Section IV-A, the set S is generated using the operations of the stimulus structure S and the set of basic stimuli {start, load, loaded, prepare, done, unload, unloaded, setup, ready, process, processed, end, d, n}. Similarly, the set K is generated using the operations of the CKA K and the set of basic behaviors {IDLE, PREP, INIT, PROC, EMPTY, FULL, WAIT, MOVE, STBY, SET, WORK, 0, 1}. The identification of the set of basic agent behaviors can often be made simpler by considering the behavior of each agent one-by-one. Once these sets have been identified, the C2KA constructed from the stimulus structure S and the CKA K captures all of the possible stimuli and agent behaviors that can exist in the given system, and the specification of each agent can be determined. This involves developing the three levels of specification offered by the C2KA framework (see Section III-A). In practice, such an activity will likely be done in consultation with domain experts and system designers.

Fig. 2. Abstract behavior specification of the manufacturing cell control system agents.

Using the C2KA constructed above, the behavior of each system agent is abstractly represented as shown in Fig. 2. For example, the abstract behavior specification of the Control Agent C shows that, at any given time, C can exhibit any one of the four behaviors of idle, preparing, initializing, or processing. This is reflected in the use of the nondeterministic choice operator + from the CKA K in the term representing the behavior of the Control Agent C. When modeling and specifying more complex agent behaviors, the abstract behavior specification for agents may include more complex CKA terms involving additional CKA operators, such as ; or ∗, to indicate sequential or parallel composition of behaviors from the CKA K, for example. In this way, the expressiveness of the C2KA framework allows for the modeling and specification of agent behaviors within a wide range of complexity.
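For illustration (the full Fig. 2 is not reproduced in this text), such a specification for the Control Agent could be written as the CKA term

    C ↦ ⟨ IDLE + PREP + INIT + PROC ⟩

with the remaining agents specified analogously over their own basic behaviors.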

The stimulus-response specification of agents specifies the next behavior mapping ◦ and the next stimulus mapping λ for each system agent and is derived from the system description. With respect to the system described in Section IV-A, the stimulus-response specifications of the manufacturing cell control system agents are compactly specified as shown in Tables I–IV. In each table, the row header shows the possible basic behaviors that the given agent can have in the system, and the column header shows the basic stimuli to which the given agent may be subjected in the system. These sets of behaviors and stimuli are dictated by the CKA and the stimulus structure of the considered C2KA, respectively, as well as the abstract behavior specification of each agent. Each table cell shows the next behavior or next stimulus (according to the operator shown in the top left cell) that results when the stimulus in the column header is applied to the behavior in the row header. Note that together Tables I–IV define a single next behavior mapping ◦ and next stimulus mapping λ; however, separating the tables to show the stimulus-response specification for the behavior of each agent aids in improving the readability and reviewability of the specifications.
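To show how such a table is read, the following minimal Haskell sketch encodes a stimulus-response specification as a pair of total mappings. The two non-trivial rows are hypothetical, inferred from the narrative of Section IV-A; they are not the actual entries of Tables I–IV, which are not reproduced here.

-- Sketch: a stimulus-response specification as a next-behavior mapping (◦)
-- and a next-stimulus mapping (λ) for the Control Agent C. The non-trivial
-- rows below are hypothetical and inferred from Section IV-A.
data Stimulus = Start | Load | Loaded | Prepare | Neutral
  deriving (Eq, Show)

data Behavior = IDLE | PREP | INIT | PROC
  deriving (Eq, Show)

-- Next behavior mapping: the behavior C moves to when a stimulus arrives.
nextBehavior :: Stimulus -> Behavior -> Behavior
nextBehavior Loaded IDLE = PREP     -- on "loaded", C enters its preparation behavior
nextBehavior _      b    = b        -- all other cells shown here leave the behavior unchanged

-- Next stimulus mapping: the stimulus C issues in response.
nextStimulus :: Stimulus -> Behavior -> Stimulus
nextStimulus Loaded IDLE = Prepare  -- while preparing, C sends a "prepare" request
nextStimulus _      _    = Neutral  -- otherwise only the neutral stimulus is emitted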

The concrete behavior specification of the system agents provides the state-level specification of each agent behavior (i.e., each program). For this purpose, we use a fragment of Dijkstra's guarded command language [75]. With respect to the system described in Section IV-A, the concrete behavior specifications of the agent behaviors are specified as shown in Fig. 3.

Using the prototype tool (see Section III-D), we can provide the specification of each agent in the manufacturing cell control system example. This involves configuring the tool with the set of system agents, the set of basic stimuli, and the set of basic agent behaviors to define the stimulus structure and the CKA of the C2KA to be used for the specification. It also involves the input of the stimulus-response specification, abstract behavior specification, and concrete behavior specification for each agent in the system. The tool then generates a representation of the system specification using the Maude language. Once we have the specification of the system generated by the prototype tool, we can formally verify the existence of implicit interactions.

V. FORMULATING AND IDENTIFYING IMPLICIT INTERACTIONS

A. Intended System Interactions

Systems typically have intended sequences of communication and interaction among their constituent agents to coordinate their behaviors to perform their functions. In what follows, let Pintended represent the set of intended interactions for a given system. This set of intended interactions can be derived from the system description and requirements explicitly provided by the system designer. For example, this set may be represented as a collaboration diagram, similar to that shown in Fig. 1, or alternatively, as a message-passing or sequence diagram. This articulation of the expected behavior and operation of the system is typically part of any sound systems engineering process.

Consider the example of the manufacturing cell control system described in Section IV-A. As mentioned above, the collaboration diagram given in Fig. 1 provides a representation of the intended sequence of communication among the system agents to perform the intended functionality of the manufacturing cell. By extracting the possible sequences of communication from the collaboration diagram (or similar notation), the intended system interactions that comprise Pintended can be derived. For example, the sequences depicted in Fig. 1 can be unravelled to identify the sequence of control or data transferred among the system agents as shown in Fig. 4, where the solid arrows denote communication via stimuli and the dashed arrows denote communication via shared environments. Since some agents in the given system respond to the same stimulus at the same time, the expansion of the concurrent interaction of its agents is captured by branches in its execution trace. This concurrent interaction is translated and embodied as a set of possible walks of the system's underlying communication graph. In other words, the set of intended system interactions provides a characterization of the possible execution traces representing the interleavings of the concurrent behaviors of the system agents.


TABLE I: STIMULUS-RESPONSE SPECIFICATION OF THE CONTROL AGENT C

TABLE II: STIMULUS-RESPONSE SPECIFICATION OF THE STORAGE AGENT S

TABLE III: STIMULUS-RESPONSE SPECIFICATION OF THE HANDLING AGENT H

TABLE IV: STIMULUS-RESPONSE SPECIFICATION OF THE PROCESSING AGENT P

Fig. 3. Concrete behavior specification of the manufacturing cell control system agent behaviors.


Fig. 4. Expected execution trace captured by Pintended for the manufacturing cell control system.

With respect to Fig. 4, the set of intended interactions can be characterized by the set of walks shown in Fig. 5.

We acknowledge that, in some cases, a complete specification or characterization of the set of intended system interactions may not be provided or easily derived. In such cases, it may be possible to alternatively characterize the set of intended system interactions as a collection of properties of the modeled system. For example, we can have a property that expresses that agent A should not be able to communicate via stimuli with agent B (formally ¬(A →S⁺ B)). However, characterizing the set of intended system interactions as a collection of properties requires that each property is carefully specified to ensure that it is not overly restrictive or relaxed. It is also a challenge to ensure that the collection of properties does in fact completely characterize the intended system interactions. For this reason, we have elected to take a more systematic and rigorous approach for characterizing the intended system interactions, despite the fact that for reasonably large systems this set may be quite large. We also note that the process of identifying the set of intended system interactions that comprise Pintended can be automated if we assume that we are given a specific representation of the expected (designed) system behavior. Such automation can help to alleviate the amount of manual effort required by system analysts in determining the set of intended system interactions. This automation is expected as part of our future work.

B. Formulating the Existence of Implicit Interactions

An implicit interaction is any potential for communication in a system that is unfamiliar, unplanned, or unexpected, and is either not visible or not immediately comprehensible by the system designers. Implicit interactions are those potential communications that are not explicitly stated as part of the intended system functionality. The existence of an implicit interaction in a distributed system specified using C2KA is formally defined in Definition 2.

Definition 2 (Existence of Implicit Interactions): An implicit interaction exists in a distributed system formed by a set A of agents if and only if for any two agents A, B ∈ A with A ≠ B: ∃(p | p ⟹ (A ⇝⁺ B) : ∀(q | q ∈ Pintended : ¬SubPath(p, q))), where SubPath(p, q) is a predicate indicating that p is a subpath of q. □

Definition 2 states that if there exists a path p indicating a potential for communication from agent A to agent B that is not a subpath of any of the intended interactions characterized by the set Pintended, then p is an implicit interaction and at least one implicit interaction exists in the system.

The existence of implicit interactions in a distributed system indicates that there is an aspect of the system design

Fig. 5. Set of walks characterizing the intended system interactions for the manufacturing cell control system.

(whether accidental or intentional, innocuous or malicious) allowing for this kind of interaction to be present. For example, the existence of implicit interactions may be due to an accidental oversight by the designers of a system. As mentioned previously, it is often quite easy to build large and complex systems that are not very well understood, even by those responsible for designing and building such systems. Implicit interactions may manifest in a system simply because its complexity leads to its designers being unable to comprehend all of the possible ways in which the components may interact with each other. This situation becomes considerably more complicated when we also consider the fact that there may be malicious actors specifically designing system components in such a way that they do not begin to exhibit unintended or unexpected behaviors until they are composed with other system components. With a significant number of distributed systems in critical domains depending on a large and growing array of suppliers of software and hardware, such systems are becoming increasingly susceptible to intentionally compromised components introduced in their supply chains. By formulating the existence of implicit interactions in a distributed system using a mathematical framework, we can begin to better understand how and why implicit interactions manifest themselves in a given system.

C. Identifying Implicit Interactions

The identification of the implicit interactions that are present in a given system modeled using the C2KA framework involves verifying, through the application of Definition 2, whether each possible interaction in the given system is an implicit interaction.

1) Determining Possible Agent Interactions: In order to identify the implicit interactions in a given system modeled using C2KA, we first need to identify all of the possible interactions (communication paths) among each pair of agents by performing an analysis of the potential for communication of the given system specification. In essence, at this stage, we are deriving the underlying communication graph from the C2KA specification of the given system.


Fig. 6. Underlying communication graph derived from the specification of the manufacturing cell control system.

This is done with the help of the prototype tool described in Section III-D. The prototype tool automatically provides a list of all potential communication paths between each pair of agents in a given system specified using C2KA. For example, consider the specification of the manufacturing cell control system provided in Section IV-C. The following program fragment shows the analysis of the potential for communication (function pfc) from the Storage Agent S (agentS) to the Processing Agent P (agentP) with respect to the manufacturing cell control system specification (sys), where the output is a pair indicating whether there is a potential for communication from S to P (i.e., cond ⟺ S ⇝⁺ P) and a list of all of the possible communication paths from S to P (paths):

> let (cond, paths) = pfc sys agentS agentP
> print $ cond
True
> printPaths $ paths
S ->S C ->S H ->S P
S ->S C ->S H ->E P
S ->S C ->S P
S ->S C ->E P
S ->E H ->S C ->S P
S ->E H ->S C ->E P
S ->E H ->S P
S ->E H ->E P
S ->E P

From the results of the prototype tool, it is easy to see that there are multiple interactions that allow the Storage Agent S to influence the behavior of the Processing Agent P. A similar analysis can be done for each other pair of agents in the given system (see the Appendix for the output of the prototype tool for the full system analysis). After completing this analysis, the underlying communication graph can be derived as mentioned above. The resulting communication graph for the manufacturing cell control system is shown in Fig. 6, where the solid edges denote direct communication via stimuli and the dashed edges denote direct communication via shared environments. The dotted edge from the Processing Agent P to the Storage Agent S denotes indirect communication. Recall from Section III-B that in a distributed system formed by a set A of agents, an agent A has the potential to indirectly communicate with agent B if and only if ∃(C | C ∈ A : A ⇝ C ∧ C ⇝⁺ B). This effectively means that if agent B is reachable from agent A in a communication graph by traversing through at least one intermediate agent, then there is an indirect communication from A to B. By this definition of indirect communication, Fig. 6 contains numerous other indirect communications. In fact, given the communication graph depicted in Fig. 6, there is an indirect communication between each pair of agents. However, because there is no direct communication from the Processing Agent P to the Storage Agent S, we use a dotted edge to denote that P can only indirectly communicate with S.

Note that we are not identifying all of the possible walks of the communication graph. Because a communication graph may contain cycles, it is possible that there is an infinite number of walks. While the proposed approach can also be applied to identify whether a walk of a given length is implicit, we instead use the fact that every walk of a graph contains a path, and we reduce the problem space to studying the potential paths in the communication graph. By identifying all of the possible communication paths for a given system, any walk that can be identified will contain at least one of those paths. This allows us to study the ways in which one agent can influence the behavior of another agent in the system without having redundant information about the potential for agents to influence an arbitrarily long sequence of repeated intermediate agents.
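To make the reduction to paths concrete, the following fragment sketches one way to enumerate the simple communication paths between two agents once the communication graph has been extracted as a set of labeled edges. It is written in the same language as the prototype tool, but the Agent, Label, and simplePaths names are our own illustrative assumptions and are not part of the published tool.

type Agent = String
data Label = Stim | Env deriving Eq          -- an ->S or an ->E edge
type Edge  = (Agent, Label, Agent)

-- All simple paths (no repeated agents) from src to dst, each returned as
-- the list of edges traversed.
simplePaths :: [Edge] -> Agent -> Agent -> [[Edge]]
simplePaths edges src dst = go [src] src
  where
    go visited a
      | a == dst  = [[]]                     -- reached the target agent
      | otherwise = [ e : rest
                    | e@(a', _, b) <- edges
                    , a' == a
                    , b `notElem` visited
                    , rest <- go (b : visited) b ]

For the manufacturing cell, the edge list would contain entries such as ("S", Stim, "C") and ("S", Env, "H"), one per solid or dashed edge of Fig. 6.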

Although the number of communication paths that may exist in a large-scale, real-world distributed system may be enormous, we conjecture that such an analysis is feasible with the support of automated tools and parallel processing techniques. The scalability of the proposed approach is discussed further in Section VII.

2) Determining if a Possible Interaction is Implicit: Once we have identified the possible interactions among each pair of agents in the given system, we need to determine whether each potential communication path is an implicit interaction. This is done by verifying whether each interaction exists as a subpath of any of the walks in Pintended through a direct application of Definition 2.

Algorithmically, this verification is equivalent to a string matching problem which aims to determine whether a string can be found within another string (i.e., whether one string is a substring of another). In the context of agent interactions, this problem can be recast as determining whether a communication path is a subpath of another communication path (or walk). Given two interactions p and q, algorithms such as Knuth–Morris–Pratt, Boyer–Moore, or two-way string matching can be adapted to identify whether p is a subpath of q with complexity O(|p| + |q|) (e.g., [76]).
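As a minimal illustration of this recasting (assumed names, not the published tool), the following fragment checks whether one path occurs as a contiguous subpath of another and filters out the interactions that are subpaths of no intended interaction, mirroring the findImplicit function used below. For brevity it relies on Data.List.isInfixOf, which is quadratic in the worst case; a Knuth–Morris–Pratt matcher could be substituted to obtain the O(|p| + |q|) bound cited above.

import Data.List (isInfixOf)

-- A communication path is treated as the sequence of its steps.
isSubpathOf :: Eq step => [step] -> [step] -> Bool
isSubpathOf = isInfixOf

-- Definition 2: a possible interaction is implicit when it is a subpath of
-- no intended system interaction.
findImplicitPaths :: Eq step => [[step]] -> [[step]] -> [[step]]
findImplicitPaths possible intended =
  [ p | p <- possible, not (any (p `isSubpathOf`) intended) ]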

To demonstrate the approach for identifying implicit interactions in a distributed multiagent system, consider the manufacturing cell control system example from Section IV-A, the potential communication paths from the Storage Agent S to the Processing Agent P shown in Section V-C1, and the set of intended interactions Pintended as shown in Fig. 5. The following program fragment of the extended prototype tool shows the identification of the implicit interactions (function findImplicit) from the Storage Agent S to the Processing Agent P (set of possible paths from S to P: paths) with respect to the set of intended interactions (intended), where the output is a list of the implicit interactions from S to P (implicit):

> let implicit = findImplicit paths intended
> printPaths $ implicit
S ->S C ->S H ->S P
S ->E H ->S C ->S P
S ->E H ->S C ->E P
S ->E H ->S P

With respect to the potential for communication from the Storage Agent S to the Processing Agent P for the manufacturing cell control system, the extended prototype tool finds a potential communication path denoted as S →S C →S H →E P. In this case, it is easy to see that this communication path exists as a subpath of one of the intended system interactions captured by Pintended (see Fig. 5), meaning this communication path is intended and expected as part of the system behavior. However, there also exists a potential communication path denoted as S →E H →S P, which does not exist as a subpath of one of the intended system interactions captured by Pintended. Therefore, this path represents an implicit interaction.

This procedure can be repeated for each pair of agents to obtain a full analysis of the given system. Additionally, or alternatively, it can be repeated for only those pairs of agents for which there is interest in the analysis. This can provide a way to reduce the problem space by ignoring the potential for communication between agents for which there is no interest.
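One way such a sweep could be organized is sketched below. The fragment stands alone by taking the potential-for-communication query and the implicit-interaction filter (the pfc and findImplicit functions of the prototype tool, or any equivalents) as parameters; the fullAnalysis name is our own assumption.

-- Repeat the identification step for every ordered pair of distinct agents,
-- or for any subset of pairs of interest.
fullAnalysis
  :: Eq agent
  => (agent -> agent -> (Bool, [path]))  -- potential-for-communication query
  -> ([path] -> [path])                  -- implicit-interaction filter
  -> [agent]
  -> [(agent, agent, [path])]
fullAnalysis pfcQuery implicitOf agents =
  [ (a, b, implicitOf paths)
  | a <- agents, b <- agents, a /= b
  , let (possible, paths) = pfcQuery a b
  , possible ]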

D. Experimental Results

After performing a full system analysis by analyzing each of the potential communication paths between each pair of agents in the illustrative example system, we found that 29 of the 65 total possible communication paths are not found as a subpath of one of the intended system interactions in Pintended and are therefore implicit interactions. Using the extended prototype tool, the full system analysis can be performed in approximately 18 s running on a machine with a 2.7 GHz Intel Core i5 processor and 8 GB of RAM. A summary of our experimental results is given in Table V, and the detailed output of the extended prototype tool can be found in the Appendix.

The existence of implicit interactions is possible due to the potential for out-of-sequence stimuli to be issued, or out-of-sequence reads from and/or writes to shared variables by system agents. Such unexpected behavior could be a consequence of agents experiencing failures or being subject to malicious activity, for example. It is important to note that these implicit interactions are not easily found without the use of the systematic analysis of the system based on its C2KA specification. For example, when examining the potential for communication from the Processing Agent P to the Storage Agent S, the proposed approach finds a potential communication path denoted as P →S C →S S. Once again, it is easy to see that this communication path does not exist as a subpath of one of the intended system interactions in Pintended. This means that this path represents an implicit interaction and indicates that the Processing Agent P can indirectly influence the behavior of the Storage Agent S, despite having no interaction (direct or indirect) with respect to the intended system behavior (see Figs. 1 and 4).

TABLE V
SUMMARY OF EXPERIMENTAL RESULTS FOR IDENTIFYING IMPLICIT INTERACTIONS IN THE MANUFACTURING CELL CONTROL SYSTEM

Interaction    # Implicit Interactions    # Total Possible Paths
C →+ H         1                          5
C →+ P         3                          8
C →+ S         2                          4
H →+ C         2                          5
H →+ P         2                          8
H →+ S         3                          4
P →+ C         1                          3
P →+ H         2                          3
P →+ S         4                          4
S →+ C         4                          6
S →+ H         1                          6
S →+ P         4                          9

TOTAL          29                         65

This is to say that P should not be able to influence the behavior of S because the system design has no need for such an interaction.

This illustrative example shows that, even for a small system, there is hidden complexity and coupling among agents that can lead to the potential for unexpected system behaviors. We note, however, that after having identified the presence of implicit interactions in a given system, each implicit interaction should be validated to ensure that it is indeed an unwanted behavior, rather than merely an undocumented expected behavior. Such validation can be done by incorporating system designers in the loop, for example. Furthermore, an investigation into the potential severity of the identified implicit interactions can be used as part of a validation process and is the subject of Section VI.

VI. ANALYZING THE SEVERITY OF IDENTIFIED IMPLICIT INTERACTIONS

The severity of an implicit interaction serves as a measure to indicate the interactions that, if exploited in a malicious manner, have the potential to most negatively impact the safety, security, and/or reliability of the system in which they exist, and that should be granted the highest priority for mitigation. We outline a framework for measuring the severity of an implicit interaction based on the idea of longest common substrings. The longest common substring problem aims to find the longest string that is a substring of two or more strings. By recasting this problem in the context of agent interactions, we can find the longest communication path that is a subpath of two or more communication paths (or walks). Algorithms for finding the longest common substring exist with complexity O(|p||q|) using dynamic programming approaches (e.g., [76]).
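For illustration, the following fragment (an assumed helper, not the published tool) computes the longest common contiguous subsequence of two step sequences. It is written with plain list functions for clarity and is not optimized; a dynamic-programming formulation achieves the O(|p||q|) bound cited above.

import Data.List (inits, tails, isPrefixOf, maximumBy)
import Data.Ord (comparing)

-- Longest common contiguous subsequence (substring/subpath) of p and q.
lcsOf :: Eq step => [step] -> [step] -> [step]
lcsOf p q = maximumBy (comparing length)
              [ c | c <- contiguous p, c `occursIn` q ]
  where
    contiguous xs = [] : [ s | t <- tails xs, s <- drop 1 (inits t) ]
    occursIn s ys = any (s `isPrefixOf`) (tails ys)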

A. Measuring the Severity of Possible Interactions

Let p and q be two interactions among agents in a given system. The length of the longest common substring between p and q, denoted |lcs(p, q)|, gives a measure of the overlap between p and q. This notion is an integral component of the measure of the severity of an interaction, as shown in Definition 3.

Definition 3 (Severity Measure): Let p be a possible interaction in a given system with intended system interactions Pintended. The severity of p (denoted σ(p)) is calculated as follows:

σ(p) = 1 − max_{q ∈ Pintended} ( |lcs(p, q)| / |p| )

where lcs(p, q) denotes the longest common substring of interactions p and q. ■

Definition 3 shows how to compute the severity of a possible interaction in a given system. The severity measure of a possible interaction p is a numeric value σ(p) such that 0 ≤ σ(p) ≤ 1. The severity measure can be interpreted as the relative nonoverlap between a possible interaction and the intended interactions of a system. The examination of the relative nonoverlap of a possible interaction with an intended interaction provides a measure of how much of the possible interaction is unexpected in the system. Intuitively, the less that a possible interaction overlaps with the intended system interactions, the more unexpected that interaction is. In this way, a possible interaction with a high severity measure indicates that the interaction can pose a higher threat to the system in which it exists than an interaction with a low severity measure.
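A direct transcription of Definition 3 is sketched below (assumed names, not the published tool). The longest-common-subpath helper, such as the lcsOf fragment above, is passed in as a parameter so that the fragment stands alone.

-- Severity of a possible interaction p against the intended interactions
-- (assumes a nonempty set of intended interactions and |p| > 0).
severityOf :: ([step] -> [step] -> [step])  -- longest common subpath helper
           -> [[step]]                      -- intended system interactions
           -> [step]                        -- possible interaction p
           -> Double
severityOf lcs intended p =
  1 - maximum [ fromIntegral (length (lcs p q)) / fromIntegral (length p)
              | q <- intended ]

By Proposition 1 and Corollary 1 below, an interaction is implicit exactly when this value is positive.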

Proposition 1: Let p be an interaction such that |p| > 0. Then, p is not an implicit interaction if and only if σ(p) = 0.

Proof: The proof involves the recasting of Definition 2 in the context of longest common substrings. It is straightforward to see that for a possible interaction p in a given system, and for the set of intended interactions Pintended, p is an implicit interaction if and only if ∀(q | q ∈ Pintended : lcs(p, q) ≠ p), where lcs(p, q) denotes the longest common substring (subpath) of p and q. The proof also involves the application of Definition 3 and the fact that max_{q ∈ Pintended} ( |lcs(p, q)| / |p| ) = 1 ⇐⇒ ∃(q | q ∈ Pintended : |lcs(p, q)| = |p|). The detailed proof can be found in the Appendix. ■

Corollary 1: Let p be an interaction such that |p| > 0. Then, p is an implicit interaction if and only if σ(p) > 0.

Proof: The proof results from the application of Proposition 1. The detailed proof can be found in the Appendix. ■

Proposition 1 and Corollary 1 provide a connection between the notion of the severity of a possible interaction (see Definition 3) and the notion of implicit interactions (see Definition 2). Any intended, or expected, interaction in a given system has a severity measure of 0, and conversely any implicit interaction has a severity measure greater than 0. This can be interpreted as meaning that intended interactions present no (additional) threat to the system in terms of safety and/or security, because they are known and expected by the system designers and it is assumed that the designers are aware of any risks inherently present with the behavior associated with these interactions.

TABLE VI
EXPERIMENTAL RESULTS OF THE SEVERITY ANALYSIS FOR THE POSSIBLE INTERACTIONS FROM STORAGE AGENT S TO PROCESSING AGENT P, WHERE A HIGHER SEVERITY MEASURE INDICATES A HIGHER THREAT WITHIN THE SYSTEM

ID    Possible Interaction    Severity (0 ≤ σ(pi) ≤ 1)
p1    S →S C →S H →S P        0.33
p2    S →S C →S H →E P        0.00
p3    S →S C →S P             0.00
p4    S →S C →E P             0.00
p5    S →E H →S C →S P        0.33
p6    S →E H →S C →E P        0.67
p7    S →E H →S P             0.50
p8    S →E H →E P             0.00
p9    S →E P                  0.00

B. Comparing the Severity of Possible Interactions

In order to create a classification that can be used to compare the severity of the identified implicit interactions in a system, we define a binary relation, denoted ⪯ and interpreted as "less severe," on the set of possible agent interactions, denoted P, as shown in Definition 4.

Definition 4 (Less Severe Relation): Let P be a set of possible agent interactions for a given system and let p1, p2 ∈ P. We define a binary relation ⪯ on P as

p1 ⪯ p2 ⇐⇒ σ(p1) ≤ σ(p2)

and we say that p1 is less severe than p2. ■

Quite simply, Definition 4 states that an interaction is less severe than another if and only if it has a lesser or equal severity measure. By definition, the relation ⪯ is a partial order because ≤ is a partial order. Further, we can conversely define another partial order on the set of possible agent interactions P, denoted ⪰ and interpreted as "more severe," in the natural way. In short, the severity relation provides information that can be of great use to analysts in determining where to allocate resources for mitigating the threat of implicit interactions.
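As a small illustration (assumed names, using only standard library functions), implicit interactions with precomputed severities can be ranked from most to least severe to guide where mitigation effort is allocated first:

import Data.List (sortOn)
import Data.Ord (Down (..))

-- Rank (interaction, severity) pairs from most to least severe.
prioritize :: [(path, Double)] -> [(path, Double)]
prioritize = sortOn (Down . snd)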

C. Experimental Results

Using the extended prototype tool, we can compute the severity of each of the possible interactions derived from the specification of a given system according to Definition 3. The results of the analysis from the Storage Agent S to the Processing Agent P of the illustrative manufacturing cell control system described in Section IV-A are automatically generated by the extended prototype tool and shown in Table VI. The results of the full system analysis are given in the Appendix.

When comparing the severities of possible interactions, or more specifically implicit interactions, with respect to the less severe relation as defined in Definition 4, some interactions are more severe than others because they contain a longer sequence of agent interactions that are not expected or intended as part of the system behavior. This means that these interactions use more intermediate agent interactions that are not expected or foreseen as part of the intended system behavior and, therefore, present a greater risk because the system designers are generally less aware of such potential sequences of communication.

VII. DISCUSSION

Implicit interactions are a means for system agents to interact in unintended, and often undesirable, ways. As the complexity of system designs continues to increase, accidents resulting from component interactions are becoming more common [3]. In the past, the designs of systems were more manageable from an intellectual perspective and, therefore, the possible interactions among the system components could be thoroughly planned, understood, anticipated, and guarded against [6]. In the case of cyberaccidents resulting from implicit interactions, the system design flaws that give rise to this unpredictable (and potentially unsafe and insecure) behavior are often not random events. Rather, these flaws are often oversights by system designers arising from the inability to adequately characterize the full range of system behaviors, and to analyze and assess the quality of their designs. Implicit interactions may also be the result of malicious actions to intentionally introduce software and/or hardware design flaws. For example, consider the illustrative manufacturing cell control system described above. It is quite easy to imagine a scenario that exploits an implicit interaction to cause an undesirable system behavior. One specific case is to imagine an agent using an implicit interaction to send the deactivation stimulus d, rather than the stimulus it is expected to send, to force all of the agents along the path to transition to their inactive behavior 0 (see Definition 1 and the Appendix). In this way, the entire system, due to its connectedness, can be effectively shut down. From the perspective of an observer, it may not be entirely comprehensible how or why this was possible.

Furthermore, implicit interactions can lead to other challenges in terms of the safety, security, and reliability of information collected by the system in which they exist. For instance, exploiting an implicit interaction can provide a means for exfiltrating sensitive information to agents that are unauthorized to know or possess that information, or it can provide a means for compromising the integrity or availability of critical information required for the safe and reliable operation of the system.

The proposed approach for identifying and analyzing implicit interactions provides a step towards uncovering potential cybersecurity vulnerabilities resulting from the existence of implicit interactions in distributed multiagent systems. It is a rigorous and systematic approach using a modeling framework capable of identifying the implicit interactions that may be present in a system with respect to a given specification. This is in contrast to prior related work (see Section II). The use of the C2KA framework for specifying the communicating and concurrent behavior of distributed multiagent systems makes it straightforward to ascertain the potential communication paths that exist in a given system. It also provides facilities to perceive the overall topology of a given system with respect to its specification. Having an idea of the topology of the system allows for the abstraction of components of the overall system behavior. This kind of abstraction can aid in separating the communicating and concurrent behavior in a system and its environment, thereby allowing an analyst to focus on particular aspects of the identified implicit interactions that may be of interest. The proposed approach also provides a mechanism for analyzing the severity of each of the identified interactions. This enables system designers to perform a systematic analysis of their system designs to uncover potential vulnerabilities early in the system development life-cycle. In turn, this gives them insight and an enhanced understanding of the hidden complexity and coupling in the systems that they design and build. Furthermore, the proposed approach provides a solid foundation upon which mitigation approaches can be developed, and it can serve as the basis for developing guidelines for designing and implementing distributed systems that are resilient to cyberthreats, and that can offer increased safety, security, and reliability.

Another important issue is scalability, which presents both challenges and opportunities. Direct applicability of the approach we present here to a system with millions of agents would be impractical. However, we have applied the proposed approach to analyze a larger system, a batch chemical reactor adapted from [77] that includes 11 agents, 19 basic stimuli, and 43 basic behaviors, and found that 4815 of the 5969 total possible communication paths are implicit interactions. The full analysis for this system using the extended prototype tool can be performed in approximately 7 min using a 2.7 GHz Intel Core i5 processor and 8 GB of RAM. Still, larger systems can be successfully modeled by aggregating behaviors of a group of agents into a single agent. This is reflected in the manufacturing cell control system example presented above: the Handling Agent H is itself composed of the many different agents (e.g., within a robotic manufacturing system) necessary to intermediate between the Storage Agent S and the Processing Agent P, yet for the purposes of analyzing interactions, it was appropriate to treat the Handling Agent as a single entity. More generally, many of the systems that can be analyzed using the proposed approach can be decomposed and structured hierarchically to reduce the number of distinct agents. Moreover, the proposed algorithms and problem data sets for identifying the implicit interactions, as mentioned in Section V-C2, can be parallelized in a straightforward manner.
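As one possible realization of that parallelization (an assumption on our part, not a feature of the published tool), the per-pair analyses are independent and could be evaluated with the evaluation strategies of the Haskell parallel and deepseq packages:

import Control.Parallel.Strategies (parMap, rdeepseq)
import Control.DeepSeq (NFData)

-- Evaluate an independent per-pair analysis over all agent pairs in parallel.
analyzePairsInParallel :: NFData result => (pair -> result) -> [pair] -> [result]
analyzePairsInParallel analyze = parMap rdeepseq analyze

Compiling with GHC's -threaded flag and running with +RTS -N would then spread the pair analyses across the available cores.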

VIII. CONCLUSION AND FUTURE WORK

In this paper, we presented an approach for identifying implicit interactions in distributed multiagent systems. The approach is based on the specification of a system using C2KA and an analysis of the potential for communication among its agents. It is an endeavor to address an important open question (e.g., [7], [9]) relating to the development of rigorous and practical methods and techniques for assuring the safe, secure, and reliable operation of distributed systems at early stages in their development. Our experimental results, using an illustrative example of a manufacturing cell control system, have verified the applicability of C2KA by showing that a substantial fraction of the possible interactions, represented as communication paths, can exist as implicit interactions that may be unintended or unforeseen by the system designer.


We also presented a framework for analyzing the severity of identified implicit interactions in an effort to aid system designers in assessing their designs and to be used as a guide for developing approaches for eliminating or mitigating the existence and threat of implicit interactions in distributed systems. In addition, we demonstrated the automation of the proposed approach for identifying and analyzing implicit interactions with the use of an existing prototype tool that we extended to additionally support the identification of implicit interactions in a given system and to compute their severity. We acknowledge, however, that due to the level of abstraction of the given system specification using C2KA, it is possible that the proposed approach identifies interactions which are considered to be implicit interactions, but that are very unlikely to manifest in the real world. For this reason, we identify the need for methods and techniques for analyzing the identified implicit interactions, such as measuring their severity, but also more sophisticated approaches that delve into the semantics of the identified implicit interactions to validate their existence, and to more accurately assess the threat that they pose to the overall safety, security, and reliability of the given system.

In our future work, we aim to further develop the approach and apply it to a range of case studies using larger and more complex models inspired by critical infrastructure systems. Subsequently, we intend to investigate solutions, like those proposed in [78], for mitigating the existence of implicit interactions in distributed systems. We also plan to enhance the prototype tool with additional functionality to support the automated extraction of the set of intended system interactions Pintended from a given design, as mentioned in Section V-A, and to support parallel processing techniques to help in feasibly analyzing larger and more complex distributed systems.

APPENDIX

ALGEBRAIC STRUCTURES

For ease of reference, we summarize the most relevant algebraic structures mentioned in this paper.

1) A monoid is a mathematical structure (S, ·, 1), where S is a nonempty set, · is an associative binary operation, and 1 is the identity with respect to · (i.e., a · 1 = 1 · a = a for all a ∈ S).
   a) A monoid is called commutative if · is commutative (i.e., a · b = b · a for all a, b ∈ S).
   b) A monoid is called idempotent if · is idempotent (i.e., a · a = a for all a ∈ S).

2) A semiring is a mathematical structure (S, +, ·, 0, 1), where (S, +, 0) is a commutative monoid and (S, ·, 1) is a monoid such that · distributes over + (i.e., a · (b + c) = a · b + a · c and (a + b) · c = a · c + b · c for all a, b, c ∈ S).
   a) The element 0 is called multiplicatively absorbing if it annihilates S with respect to · (i.e., a · 0 = 0 · a = 0 for all a ∈ S).
   b) A semiring is called idempotent if + is idempotent.
   c) Every idempotent semiring has a natural partial order ≤ on S defined by a ≤ b ⇐⇒ a + b = b.

3) A Kleene algebra is a mathematical structure (K, +, ·, ∗, 0, 1), where (K, +, ·, 0, 1) is an idempotent semiring with a multiplicatively absorbing 0 and identity 1, and where the following axioms are satisfied for all a, b, c ∈ K:
   a) 1 + a · a∗ = a∗;
   b) 1 + a∗ · a = a∗;
   c) b + a · c ≤ c =⇒ a∗ · b ≤ c;
   d) b + c · a ≤ c =⇒ b · a∗ ≤ c.

4) Let S = (S, ⊕, ⊙, 0S, 1S) be a semiring and K = (K, +, 0K) be a commutative monoid. We call (K, +) a left S-semimodule if there exists a mapping ◦ : S × K → K such that for all s, t ∈ S and a, b ∈ K:
   a) s ◦ (a + b) = s ◦ a + s ◦ b;
   b) (s ⊕ t) ◦ a = s ◦ a + t ◦ a;
   c) (s ⊙ t) ◦ a = s ◦ (t ◦ a);
   d) (K, +) is unitary if also 1S ◦ a = a;
   e) (K, +) is zero-preserving if also 0S ◦ a = 0K.
   An analogous right K-semimodule is denoted by (S, ⊕). In this paper, we use λ : S × K → S to denote the semimodule mapping for (S, ⊕).

5) A CKA is a mathematical structure (K, +, ∗, ;, (∗), (;), 0, 1) such that (K, +, ∗, (∗), 0, 1) and (K, +, ;, (;), 0, 1) are Kleene algebras linked by the exchange axiom given by (a ∗ b) ; (c ∗ d) ≤ (b ; c) ∗ (a ; d).
   a) K represents a set of possible behaviors.
   b) + is interpreted as a choice of two behaviors.
   c) ; is interpreted as a sequential composition of two behaviors.
   d) ∗ is interpreted as a concurrent composition of two behaviors.
   e) (;) is interpreted as a finite sequential iteration of a behavior.
   f) (∗) is interpreted as a finite concurrent iteration of a behavior.
   g) 0 represents the behavior of the inactive agent.
   h) 1 represents the behavior of the idle agent.

6) A stimulus structure S = (S, ⊕, ⊙, d, n) is an idempotent semiring with a multiplicatively absorbing d and identity n.
   a) S is the set of stimuli which may be introduced in a system.
   b) ⊕ is interpreted as a choice of two stimuli.
   c) ⊙ is interpreted as a sequential composition of two stimuli.
   d) d represents the deactivation stimulus, which influences all agents to become inactive.
   e) n represents the neutral stimulus, which has no influence on the behavior of all agents.

7) A dependence relation on a set K with an operator + is a bilinear relation R ⊆ K × K (i.e., [(a + b) R c ⇐⇒ (a R c ∨ b R c)] and [a R (b + c) ⇐⇒ (a R b ∨ a R c)] for all a, b, c ∈ K). If a R b, we say that b depends on a.


8) A partial order is a binary relation ≤ on a set S that is, for all a, b, c ∈ S:
   a) reflexive (i.e., a ≤ a);
   b) antisymmetric (i.e., a ≤ b ∧ a ≠ b =⇒ ¬(b ≤ a));
   c) transitive (i.e., a ≤ b ∧ b ≤ c =⇒ a ≤ c).

DETAILED PROOFS

Detailed Proof of Proposition 1

      p is not an implicit interaction ⇐⇒ σ(p) = 0
⇐⇒  〈 Recasting of Definition 2 using lcs(p, q) and Definition 3 〉
      ¬∀(q | q ∈ Pintended : lcs(p, q) ≠ p) ⇐⇒ 1 − max_{q ∈ Pintended} ( |lcs(p, q)| / |p| ) = 0
⇐⇒  〈 Generalized De Morgan and Arithmetic 〉
      ∃(q | q ∈ Pintended : lcs(p, q) = p) ⇐⇒ max_{q ∈ Pintended} ( |lcs(p, q)| / |p| ) = 1
⇐⇒  〈 max_{q ∈ Pintended} ( |lcs(p, q)| / |p| ) = 1 ⇐⇒ ∃(q | q ∈ Pintended : |lcs(p, q)| = |p|) 〉
      ∃(q | q ∈ Pintended : |lcs(p, q)| = |p|) ⇐⇒ ∃(q | q ∈ Pintended : lcs(p, q) = p)
⇐=  〈 lcs(p, q) = p =⇒ |lcs(p, q)| = |p| 〉
      ∃(q | q ∈ Pintended : lcs(p, q) = p) ⇐⇒ ∃(q | q ∈ Pintended : lcs(p, q) = p)
⇐⇒  〈 Reflexivity of ⇐⇒ 〉
      true

Detailed Proof of Corollary 1

      p is an implicit interaction ⇐⇒ σ(p) > 0
⇐⇒  〈 Double Negation 〉
      ¬¬(p is an implicit interaction ⇐⇒ σ(p) > 0)
⇐⇒  〈 Distributivity of ¬ over ⇐⇒ 〉
      ¬(p is not an implicit interaction ⇐⇒ σ(p) > 0)
⇐⇒  〈 Proposition 1 〉
      ¬(false)
⇐⇒  〈 Negation of false 〉
      true

OUTPUT OF THE PROTOTYPE TOOL

This section contains the complete output of the extended prototype tool for the identification and analysis of implicit interactions in the manufacturing cell control system described and specified in Section IV.


ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their thorough review that helped to improve the quality of the paper.

The views and conclusions contained in this paper are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the U.S. Department of Homeland Security.

REFERENCES

[1] J. Jaskolka and J. Villasenor, “Identifying implicit component interactions in distributed cyber-physical systems,” in Proc. 50th Hawaii Int. Conf. Syst. Sci., 2017, pp. 5988–5997.
[2] D. L. Dvorak, “NASA study on flight software complexity,” Amer. Inst. Aeronaut. Astronaut., Apr. 2009, no. AIAA 2009–1882.
[3] N. G. Leveson, Engineering a Safer World: Systems Thinking Applied to Safety. Cambridge, MA, USA: MIT Press, 2011.
[4] T. Xu and A. J. Masys, “Critical infrastructure vulnerabilities: Embracing a network mindset,” in Exploring the Security Landscape: Non-Traditional Security Challenges. New York, NY, USA: Springer, 2016, pp. 177–193.
[5] S. M. Rinaldi, J. P. Peerenboom, and T. K. Kelly, “Identifying, understanding, and analyzing critical infrastructure interdependencies,” IEEE Control Syst., vol. 21, no. 6, pp. 11–25, Dec. 2001.
[6] C. Perrow, Normal Accidents: Living with High-Risk Technologies. Princeton, NJ, USA: Princeton Univ. Press, 1984.


[7] C. Bennett, “Feds lack method to grade critical infrastructure cybersecurity,” Nov. 2015. [Online]. Available: http://thehill.com/policy/cybersecurity/260963-feds-lack-method-to-grade-critical-infrastructure-cybersecurity
[8] U.S.A. Department of Homeland Security, “A Roadmap for Cybersecurity Research,” Dept. Homeland Secur. Sci. Technol. Directorate, Washington, DC, USA, Nov. 2009.
[9] S. Jackson and T. L. J. Ferris, “Infrastructure resilience: Past, present, and future,” CIP Rep., vol. 11, no. 6, pp. 6–13, Dec. 2012.
[10] J. Jaskolka, R. Khedri, and Q. Zhang, “Endowing concurrent Kleene algebra with communication actions,” in Relational and Algebraic Methods in Computer Science (series Lecture Notes in Computer Science), vol. 8428, P. Höfner, P. Jipsen, W. Kahl, and M. Müller, Eds. Basel, Switzerland: Springer, 2014, pp. 19–36.
[11] J. Jaskolka, “On the modelling, analysis, and mitigation of distributed covert channels,” Ph.D. dissertation, McMaster Univ., Hamilton, ON, Canada, Mar. 2015. [Online]. Available: http://hdl.handle.net/11375/16872
[12] C. Hewitt, P. Bishop, and R. Steiger, “A universal modular ACTOR formalism for artificial intelligence,” in Proc. 3rd Int. Joint Conf. Artif. Intell., 1973, pp. 235–245.
[13] I. Greif, “Semantics of communicating parallel processes,” Ph.D. dissertation, Dept. Elect. Eng. Comput. Sci., Massachusetts Inst. Technol., Cambridge, MA, USA, Aug. 1975.
[14] H. Baker and C. Hewitt, “Laws for communicating parallel processes,” Massachusetts Inst. Technol., Cambridge, MA, USA, AI Working Paper 134A, May 1977.
[15] W. D. Clinger, “Foundations of actor semantics,” Massachusetts Inst. Technol., Cambridge, MA, USA, Tech. Rep. 633, Jun. 1981.
[16] G. A. Agha, “ACTORS: A model of concurrent computation in distributed systems,” Massachusetts Inst. Technol., Cambridge, MA, USA, Tech. Rep. 844, 1985.
[17] R. Milner, A Calculus of Communicating Systems (series Lecture Notes in Computer Science), vol. 92. Berlin, Germany: Springer-Verlag, 1980.
[18] C. Hoare, “Communicating sequential processes,” Commun. ACM, vol. 21, no. 8, pp. 666–677, Aug. 1978.
[19] J. Bergstra and J. Klop, “Process algebra for synchronous communication,” Inf. Control, vol. 60, no. 1–3, pp. 109–137, 1984.
[20] R. Milner, J. Parrow, and D. Walker, “A calculus of mobile processes, Part I,” Inf. Comput., vol. 100, no. 1, pp. 1–40, Sep. 1992.
[21] P. H. Feiler, D. P. Gluch, and J. J. Hudak, “The architecture analysis & design language (AADL): An introduction,” Software Eng. Inst., Carnegie Mellon Univ., Pittsburgh, PA, USA, Tech. Rep. CMU/SEI-2006-TN-011, Feb. 2006.
[22] V. Debruyne, F. Simonot-Lion, and Y. Trinquet, “EAST-ADL—An Architecture Description Language,” in Architecture Description Languages. Boston, MA, USA: Springer, 2005, pp. 181–195.
[23] S. Friedenthal, A. Moore, and R. Steiner, A Practical Guide to SysML: The Systems Modeling Language. Amsterdam, The Netherlands: Elsevier, 2011.
[24] C. A. Petri, “Kommunikation mit Automaten,” Ph.D. dissertation, Institut für instrumentelle Mathematik, Bonn, Germany, 1962 (English translation available as: “Communication with automata,” Applied Data Research, Princeton, NJ, USA, Tech. Rep. RADC-TR-65-377, vol. 1, suppl. 1, 1966).
[25] R. M. Keller, “Formal verification of parallel programs,” Commun. ACM, vol. 19, no. 7, pp. 371–384, Jul. 1976.
[26] D. Kozen, “On action algebras,” in Logic and Information Flow. Cambridge, MA, USA: MIT Press, 1993, pp. 78–88.
[27] A. Letichevsky and D. Gilbert, “A general theory of action languages,” Cybern. Syst. Anal., vol. 34, pp. 12–30, 1998.
[28] V. Pratt, “Action logic and pure induction,” in Logics in AI (series Lecture Notes in Computer Science), vol. 478, J. Eijck, Ed. Berlin, Germany: Springer, 1991, pp. 97–120.
[29] C. Hoare, B. Möller, G. Struth, and I. Wehrman, “Concurrent Kleene algebra,” in CONCUR 2009—Concurrency Theory (series Lecture Notes in Computer Science), vol. 5710, M. Bravetti and G. Zavattaro, Eds. Berlin, Germany: Springer, 2009, pp. 399–414.
[30] C. Hoare, B. Möller, G. Struth, and I. Wehrman, “Concurrent Kleene algebra and its foundations,” J. Logic Algebr. Program., vol. 80, no. 6, pp. 266–296, 2011.
[31] J. Goguen and J. Meseguer, “Security policies and security models,” in Proc. Symp. Secur. Privacy, New York, NY, USA, 1982, pp. 11–20.
[32] P. Ryan, J. McLean, J. Millen, and V. Gligor, “Non-interference: Who needs it?” in Proc. 14th IEEE Workshop Comput. Secur. Found., Washington, DC, USA, 2001, pp. 237–238.
[33] D. Volpano and G. Smith, “Eliminating covert flows with minimum typings,” in Proc. 10th Comput. Secur. Found. Workshop, Los Alamitos, CA, USA, 1997, pp. 156–168.
[34] P. Y. A. Ryan and S. A. Schneider, “Process algebra and non-interference,” in Proc. 12th IEEE Comput. Secur. Found. Workshop, 1999, pp. 214–227.
[35] G. Lowe, “Quantifying information flow,” in Proc. 15th IEEE Comput. Secur. Found. Workshop, Los Alamitos, CA, USA, 2002, pp. 18–31.
[36] R. van der Meyden, “What, indeed, is intransitive noninterference?” in Proc. 12th Eur. Symp. Res. Comput. Secur., 2007, pp. 235–250.
[37] S. Chong and R. van der Meyden, “Using architecture to reason about information security,” ACM Trans. Inf. Syst. Secur., vol. 18, no. 2, pp. 1–30, Dec. 2015. [Online]. Available: http://doi.acm.org/10.1145/2829949
[38] J. T. Haigh, R. A. Kemmerer, J. McHugh, and W. D. Young, “An experience using two covert channel analysis techniques on a real system design,” IEEE Trans. Softw. Eng., vol. SE-13, no. 2, pp. 157–168, Feb. 1987.
[39] R. Focardi, R. Gorrieri, and F. Martinelli, “Real-time information flow analysis,” IEEE J. Sel. Areas Commun., vol. 21, no. 1, pp. 20–35, Jan. 2003.
[40] D. E. Denning, “A lattice model of secure information flow,” Commun. ACM, vol. 19, no. 5, pp. 236–243, May 1976.
[41] J. McDermid and Q. Shi, “A formal model of security dependency for analysis and testing of secure systems,” in Proc. Comput. Secur. Found. Workshop IV, 1991, pp. 188–200.
[42] A. Shaffer, M. Auguston, C. Irvine, and T. Levin, “Toward a security domain model for static analysis and verification of information systems,” in Proc. 7th OOPSLA Workshop Domain Specific Model., Oct. 2007, pp. 160–171.
[43] J. Shen and S. Qing, “A dynamic information flow model of secure systems,” in Proc. 2nd ACM Symp. Inf., Comput. Commun. Secur., 2007, pp. 341–343.
[44] V. Varadharajan, “Petri net based modelling of information flow security requirements,” in Proc. Comput. Secur. Found. Workshop III, Jun. 1990, pp. 51–61.
[45] R. Focardi and R. Gorrieri, “A classification of security properties for process algebras,” J. Comput. Secur., vol. 3, no. 1, pp. 5–33, Nov. 1994.
[46] R. Hähnle, J. Pan, P. Rümmer, and D. Walter, “Integration of a security type system into a program logic,” Theor. Comput. Sci., vol. 402, no. 2/3, pp. 172–189, 2008.
[47] K. Hristova, T. Rothamel, Y. A. Liu, and S. D. Stoller, “Efficient type inference for secure information flow,” in Proc. Workshop Program. Lang. Anal. Secur., Oct. 2006, pp. 85–94.
[48] N. Kobayashi, “Type-based information flow analysis for the π-calculus,” Acta Informatica, vol. 42, no. 4, pp. 291–347, 2005.
[49] D. Volpano, G. Smith, and C. Irvine, “A sound type system for secure flow analysis,” J. Comput. Secur., vol. 4, no. 2/3, pp. 167–187, 1996.
[50] G. R. Andrews and R. P. Reitman, “An axiomatic approach to information flow in programs,” ACM Trans. Program. Lang. Syst., vol. 2, no. 1, pp. 56–76, Jan. 1980.
[51] K. E. Sabri, R. Khedri, and J. Jaskolka, “Verification of information flow in agent-based systems,” in Proc. 4th Int. MCETECH Conf. e-Technol., May 2009, vol. 26, pp. 252–266.
[52] K. Alghathbar, C. Farkas, and D. Wijesekera, “Securing UML information flow using FlowUML,” J. Res. Pract. Inf. Technol., vol. 38, no. 1, pp. 111–120, Feb. 2006.
[53] United States Department of Defense, “Procedures for performing a failure mode effect and criticality analysis,” Dept. Defense, Washington, DC, USA, Tech. Rep. MIL-STD-1629A, Nov. 1980.
[54] N. G. Leveson, Safeware: System Safety and Computers. New York, NY, USA: ACM, 1995.
[55] T. G. Lewis, Critical Infrastructure Protection in Homeland Security: Defending a Networked Nation. Hoboken, NJ, USA: Wiley, 2006.
[56] M. Iturbe, I. Garitano, U. Zurutuza, and R. Uribeetxeberri, “Visualizing network flows and related anomalies in industrial networks using chord diagrams and whitelisting,” in Proc. 11th Joint Conf. Comput. Vis., Imag. Comput. Graph. Theory Appl., 2016, vol. 2, pp. 99–106.
[57] A. A. E. Kalam, Y. Deswarte, A. Baïna, and M. Kaâniche, “PolyOrBAC: A security framework for critical infrastructures,” Int. J. Critical Infrastructure Prot., vol. 2, no. 4, pp. 154–169, 2009.
[58] L. Lamport, “Proving the correctness of multiprocess programs,” IEEE Trans. Softw. Eng., vol. SE-3, no. 2, pp. 125–143, Mar. 1977.
[59] S. M. German and A. P. Sistla, “Reasoning about systems with many processes,” J. ACM, vol. 39, no. 3, pp. 675–735, Jul. 1992.
[60] R. Alur and T. A. Henzinger, “Reactive modules,” Form. Methods Syst. Des., vol. 15, no. 1, pp. 7–48, 1999.


[61] S. F. Siegel and G. Gopalakrishnan, “Formal analysis of message passing,” in Verification, Model Checking, and Abstract Interpretation (series Lecture Notes in Computer Science), vol. 6538, R. Jhala and D. Schmidt, Eds. Berlin, Germany: Springer, 2011, pp. 2–18.
[62] E. Clarke, M. Talupur, and H. Veith, “Environment abstraction for parameterized verification,” in Verification, Model Checking, and Abstract Interpretation (series Lecture Notes in Computer Science), vol. 3855, E. A. Emerson and K. S. Namjoshi, Eds. Berlin, Germany: Springer, 2006, pp. 126–141.
[63] P. A. Abdulla, F. Haziza, and L. Holík, “All for the price of few (parameterized verification through view abstraction),” in Verification, Model Checking, and Abstract Interpretation (series Lecture Notes in Computer Science), vol. 7737, R. Giacobazzi, J. Berdine, and I. Mastroeni, Eds. Berlin, Germany: Springer, 2013, pp. 476–495.
[64] K. S. Namjoshi and R. J. Trefler, “Analysis of dynamic process networks,” in Tools and Algorithms for the Construction and Analysis of Systems (series Lecture Notes in Computer Science), vol. 9035, C. Baier and C. Tinelli, Eds. Berlin, Germany: Springer, 2015, pp. 164–178.
[65] K. S. Namjoshi and R. J. Trefler, “Parameterized compositional model checking,” in Tools and Algorithms for the Construction and Analysis of Systems (series Lecture Notes in Computer Science), vol. 9636, M. Chechik and J.-F. Raskin, Eds. Berlin, Germany: Springer, 2016, pp. 589–606.
[66] K. Schweiker, S. Varadarajan, D. Spivak, P. Schultz, R. Wisnesky, and M. Perez, “Operadic analysis of distributed systems,” Nat. Aeronautics Space Admin., Washington, DC, USA, Tech. Rep. NASA/CR–2015–xxxxxx, Sep. 2015.
[67] R. Milner, Communication and Concurrency (Prentice-Hall International Series in Computer Science). Englewood Cliffs, NJ, USA: Prentice-Hall, 1989.
[68] J. Jaskolka and R. Khedri, “A formulation of the potential for communication condition using C2KA,” in Games, Automata, Logics and Formal Verification (series Electronic Proceedings in Theoretical Computer Science), vol. 161, A. Peron and C. Piazza, Eds. Verona, Italy: Open Publishing Assoc., Sep. 2014, pp. 161–174.
[69] E. Pacuit and R. Parikh, “The logic of communication graphs,” in Declarative Agent Languages and Technologies II (series Lecture Notes in Computer Science), vol. 3476, J. Leite, A. Omicini, P. Torroni, and P. Yolum, Eds. Berlin, Germany: Springer, 2005, pp. 256–269.
[70] L.-H. Hsu and C.-K. Lin, Graph Theory and Interconnection Networks. Boca Raton, FL, USA: CRC Press, 2008.
[71] M. Clavel et al., “The Maude 2.0 system,” in Rewriting Techniques and Applications (series Lecture Notes in Computer Science), vol. 2706, R. Nieuwenhuis, Ed. Berlin, Germany: Springer, 2003, pp. 76–87.
[72] J. Gu, J.-L. Zhou, S.-S. Yu, J. Zhang, and Z.-C. Duan, “Study on the architecture of control system for manufacturing cells,” J. Shanghai Univ., vol. 5, no. 2, pp. 130–135, Jun. 2001.
[73] W. Kröger and E. Zio, “Challenges to methods for the vulnerability analysis of critical infrastructures,” in Vulnerable Systems. London, U.K.: Springer, 2011, pp. 33–39.
[74] C. Hoare and J. Wickerson, “Unifying models of data flow,” in Proc. 2010 Marktoberdorf Summer School Softw. Syst. Safety, Aug. 2011, pp. 211–230.
[75] E. Dijkstra, “Guarded commands, nondeterminacy and formal derivation of programs,” Commun. ACM, vol. 18, no. 8, pp. 453–457, Aug. 1975.
[76] D. Gusfield, Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge, U.K.: Cambridge Univ. Press, 1997.
[77] Center for Chemical Process Safety, “Example problem: Batch chemical reactor,” in Guidelines for Design Solutions for Process Equipment Failures. New York, NY, USA: Amer. Inst. Chem. Eng., Aug. 1998, pp. 179–202.
[78] J. Jaskolka and R. Khedri, “Mitigating covert channels based on analysis of the potential for communication,” Theor. Comput. Sci., vol. 643, pp. 1–37, Aug. 2016.

Jason Jaskolka (M’15) received his Ph.D. in software engineering from McMaster University, Hamilton, ON, Canada, in 2015. He is a U.S. Department of Homeland Security Cybersecurity Postdoctoral Scholar at Stanford University, Stanford, CA, USA, within the Center for International Security and Cooperation. As of July 2017, he will be an Assistant Professor in the Department of Systems and Computer Engineering at Carleton University, Ottawa, ON, Canada. His research interests include cybersecurity assurance, distributed multiagent systems, and algebraic approaches to software engineering.

John Villasenor (SM’97) is a Professor of electrical engineering, public policy, and management, and a Visiting Professor of law at the University of California, Los Angeles, CA, USA. He is also a nonresident senior fellow at the Brookings Institution, a Visiting Fellow at the Hoover Institution, a member of the World Economic Forum’s Global Agenda Council on Cybersecurity, a member of the Council on Foreign Relations, and an affiliate of the Center for International Security and Cooperation, Stanford University, Stanford, CA, USA.