e-informatica software engineering journal...e-informatica software engineering journal, volume 4,...


  • Editors

    Zbigniew Huzar ([email protected])
    Lech Madeyski ([email protected], http://madeyski.e-informatyka.pl/)

    Wrocław University of Technology
    Institute of Applied Informatics
    Wrocław University of Technology, 50-370 Wrocław, Poland

    e-Informatica Software Engineering Journal
    http://www.e-informatyka.pl/wiki/e-Informatica/

    All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, transmitted in any form, or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publishers.

    Printed in the camera ready form

    © Copyright by Oficyna Wydawnicza Politechniki Wrocławskiej, Wrocław 2010

    OFICYNA WYDAWNICZA POLITECHNIKI WROCŁAWSKIEJ
    Wybrzeże Wyspiańskiego 27, 50-370 Wrocław

    ISSN 1897-7979

    Drukarnia Oficyny Wydawniczej Politechniki Wrocławskiej. Order No. 418/2010.

  • Editorial Board

    Editor-in-Chief

    Zbigniew Huzar (Wrocław University of Technology, Poland)

    Associate Editor-in-Chief

    Lech Madeyski (Wrocław University of Technology, Poland)

    Editorial Board Members

    Pekka Abrahamsson (VTT Technical Research Centre, Finland)
    Sami Beydeda (ZIVIT, Germany)
    Miklós Biró (Corvinus University of Budapest, Hungary)
    Joaquim Filipe (Polytechnic Institute of Setúbal/INSTICC, Portugal)
    Thomas Flohr (University of Hannover, Germany)
    Félix García (University of Castilla-La Mancha, Spain)
    Janusz Górski (Gdańsk University of Technology, Poland)
    Andreas Jedlitschka (Fraunhofer IESE, Germany)
    Pericles Loucopoulos (The University of Manchester, UK)
    Kalle Lyytinen (Case Western Reserve University, USA)
    Leszek A. Maciaszek (Macquarie University Sydney, Australia)
    Jan Magott (Wrocław University of Technology, Poland)
    Zygmunt Mazur (Wrocław University of Technology, Poland)
    Bertrand Meyer (ETH Zurich, Switzerland)
    Matthias Müller (IDOS Software AG, Germany)
    Jürgen Münch (Fraunhofer IESE, Germany)
    Jerzy Nawrocki (Poznań Technical University, Poland)
    Krzysztof Sacha (Warsaw University of Technology, Poland)
    Rini van Solingen (Drenthe University, The Netherlands)
    Miroslaw Staron (IT University of Göteborg, Sweden)
    Tomasz Szmuc (AGH University of Science and Technology Kraków, Poland)
    Iwan Tabakow (Wrocław University of Technology, Poland)
    Rainer Unland (University of Duisburg-Essen, Germany)
    Sira Vegas (Polytechnic University of Madrid, Spain)
    Corrado Aaron Visaggio (University of Sannio, Italy)
    Bartosz Walter (Poznań Technical University, Poland)
    Jaroslav Zendulka (Brno University of Technology, The Czech Republic)
    Krzysztof Zieliński (AGH University of Science and Technology Kraków, Poland)

  • Contents

    Editorial
    Zbigniew Huzar, Lech Madeyski

    Regular Papers

    Deriving RT^T Credentials for Role-Based Trust Management
    Anna Felkner, Krzysztof Sacha

    Hierarchical Model for Evaluating Software Design Quality
    Pawel Martenka, Bartosz Walter

    Pattern-Based Software Architecture for Service-Oriented Software Systems
    Claus Pahl, Ronan Barrett

    The Evolution of Complexity in Apple Darwin: A Common Coupling Point of View
    Liguo Yu

    Integration of Application Business Logic and Business Rules with DSL and AOP
    Bogumiła Hnatkowska, Krzysztof Kasprzyk

    A Case Study on Behavioural Modelling of Service-Oriented Architectures
    Marek Rychlý

    Defect Inflow Prediction in Large Software Projects
    Miroslaw Staron, Wilhelm Meding

    Automatic Test Cases Generation from Software Specifications
    Aysh Alhroob, Keshav Dahal, Alamgir Hossain

  • Editorial

    It is a pleasure to present to our readers the fourth issue of the e-Informatica Software Engineering Journal (ISEJ). The mission of the e-Informatica Software Engineering Journal is to be a prime international journal to publish research findings and IT industry experiences related to theory, practice and experimentation in software engineering. The scope of the journal includes methodologies, practices, architectures, technologies and tools used in processes along the software development lifecycle, but particular interest is in empirical evaluation.

    The current issue of the journal includes eight papers. The first paper, by Felkner and Sacha, defines a formal language that enables handling trust in distributed control systems. A sound and complete deductive system deriving credentials from initial credentials is presented and explained.

    The second paper, by Martenka and Walter, extends the factor-strategy model proposed by Marinescu. It gives the designer more comprehensive and traceable information concerning detected potential anomalies, resembling the human way of cognition.

    The third paper, by Pahl and Barrett, presents a modelling and transformation technique for service-centric distributed systems. The authors capture behavioural aspects and associate quality of architectural structures at different levels of abstraction through patterns. The positive effect of applying the technique is illustrated by a case study including design, maintenance and evolution of a system that has been developed by more than 20 people and maintained for more than ten years.

    The objective of the fourth paper, by Yu, is to understand the changing patterns of software complexity. Common coupling is a measure of system complexity, but it also gives insight into software flexibility. How the coupling changes with the evolution of a software system is studied on Apple Darwin, an open-source operating system.

    The fifth paper, by Hnatkowska and Kasprzyk, proposes an approach to business logic implementation that enables easy response to business rule changes. The core of the idea is the separation of the business logic layer from the business rule layer by introducing an integration layer. A proof-of-concept implementation of the integration layer is presented in an aspect-oriented language.

    The sixth paper, by Rychlý, is an interesting application of Milner's π-calculus to describe the behaviour of components in a service-oriented architecture. A case study of an architecture for functional testing of complex safety-critical systems is presented.

    The seventh paper, by Staron and Meding, presents methods for constructing prediction models of trends in defect inflow in large software projects. Two models are considered. The first one, the so-called short-term prediction model, is used to predict the number of defects discovered in the code up to three weeks in advance. The second one, the long-term prediction model, provides the possibility of predicting the defect inflow for the whole project. The initial evaluation of these methods in a large software project at Ericsson shows that the models are sufficiently accurate and easy to deploy.

    In the last paper, Alhroob, Dahal and Hossain present a new technique of test case generation extending the Integrated Classification Tree Methodology. The stress is put on extraction of legitimate test cases by removing the duplicate test cases and those incompatible with the software specifications. Large amounts of time would be needed to execute all of the test cases; therefore, the methodology aims to select the best testing path, which guarantees the highest coverage of system units and avoids using all generated test cases.

    We look forward to receiving quality contributions from researchers and practitioners in software engineering for the next issue of the journal.

    Editors
    Zbigniew Huzar
    Lech Madeyski

  • e-Informatica Software Engineering Journal, Volume 4, Issue 1, 2010

    Deriving RT^T Credentials for Role-Based Trust Management

    Anna Felkner*, Krzysztof Sacha**

    *Research and Academic Computer Network
    **Warsaw University of Technology

    [email protected], [email protected]

    Abstract

    Role-based trust management languages define a formalism, which uses credentials to handle trust in decentralized, distributed access control systems. A credential provides information about the privileges of users and the security policies issued by one or more trusted authorities. The main topic of this paper is RT^T, a language which supports manifold roles and role-product operators to express threshold and separation of duties policies. The core part of the paper defines a relational, set-theoretic semantics for the language, and introduces a deductive system, in which credentials can be derived from an initial set of credentials using a set of inference rules. The soundness and the completeness of the deductive system with respect to the semantics of RT^T is proved.

    1. Introduction

    The problem of guaranteeing that confidential data and services offered by a computer system are not made available to unauthorized users is a challenging issue, which must be solved by reliable software technologies that are used for building high-integrity applications. The traditional solution to this problem is an implementation of some access control techniques, by which users are identified, and granted or denied access to system data and other resources, depending on their individual or group identity. Examples of such solutions are Mandatory Access Control (MAC) facilities, Discretionary Access Control (DAC) and Role-Based Access Control (RBAC) systems. Such an approach fits well into closed and centralized environments, in which the identity of users is known in advance.

    Quite new challenges arise in decentralized and open systems, where the identity of users is not known in advance and the set of users can change. For example, consider a university in which the students are enrolled and registered in particular faculties, and no central registry of all the students of that university exists. The policy of the university is such that a student is eligible to attend a lecture given by a faculty, regardless of the faculty in which he or she is actually registered. However, how could a faculty (the lecture owner) know that Peter Pan is eligible to attend the lecture, if his name is unknown to this faculty? The identity of the student itself does not help in deciding whether he or she is eligible to attend. What is needed to make such a decision is information about the privileges assigned to Peter Pan by other authorities (is he registered in a faculty), as well as trust information about the authority itself (is the faculty a part of this university).

    A trust-management system is a standardized solution for controlling security-critical services in high-integrity applications (Figure 1). It helps answer questions related to the conformance of potentially dangerous operations to a security policy of an organization, and provides the users with a language for writing the policies and controlling access to system services and resources. The policies are no longer hard-coded into applications and therefore can be much easier to change. A designer of an application must only identify the security issues in the application and formulate appropriate queries to the trust-management system.

    Figure 1. Trust management system (diagram labels: remote clients, local servers, TM system, requests for resources, security queries)

    Such a conception of trust management, introduced in [2], has evolved since that time to a much broader context of assessing the reliability and developing trustworthiness for other systems and individuals [9]. In this paper, however, we will use the term trust management only in a meaning restricted to the field of access control.

    The paper is organized as follows. An overview of the work related to role-based trust management systems and languages is given in Section 2. Section 4 describes the relational semantics of the RT^T language. Section 6, which is the core part of our contribution, presents a deductive system, in which credentials can be derived from an initial set of credentials using a set of inference rules. A proof of the soundness and the completeness of the deductive system with respect to the semantics of RT^T is presented as well. Sections 3 and 5 provide the reader with illustrative examples. Final remarks and plans for future research are given in the conclusions.

    2. Related Work

    Traditional access control systems usually rely on the Role-Based Access Control model [14, 6, 7], which groups the access rights by the role name and limits the access to a resource to those users who are assigned to a particular role. RBAC systems provide authorization decisions based on the identity of the users, and work well in the centralized environment of an enterprise.

    The trust management model represents quite another approach to access control, in which decisions are based on credentials (certificates) issued by multiple principals. A credential is an attestation of qualification, competence or authority, issued to an individual by a third party. Examples of credentials in real life include identification documents, social security cards, driver's licenses, membership cards, academic diplomas, certifications, security clearances, passwords and user names, keys, etc. A credential in a computer system can be a digitally signed document.

    The potential and flexibility of the trust management approach stems from the possibility of delegation: A principal may transfer limited authority over a resource to other principals. Such a delegation can be implemented by means of an appropriate credential. This way, a set of credentials can define the access control strategy and allow deciding who is authorized to access a resource, and who is not. A side-effect of delegation is that a number of authorizing principals can be distributed over a network. A variety of problems arises if the credentials are stored in a decentralized manner.

    The term trust management was first applied in the context of distributed access control in [2]. The first trust management system described in the literature was PolicyMaker [3], which defined a special assertion language capable of expressing policy statements, which were locally trusted, and credentials, which had to be signed using a private key. The next generation of trust management languages included KeyNote [1], which was an enhanced version of PolicyMaker, SPKI/SDSI [4] and a few other languages. All those languages allowed assigning privileges to entities and used credentials to delegate permissions from their issuers to their subjects. What was missing in those languages was the possibility of delegation based on attributes of the entities and not on their identity.


    Role-based trust management (RT) languages use roles to represent attributes [12]. The meaning of a role is a set of entities who have the attribute represented by the role. This meaning of roles captures the notion of groups of users in many systems and has been borrowed from the Role-Based Access Control approach. The core language of the RT family is RT0, described in detail in [13]. It allows describing localized authorities for roles, role hierarchies, delegation of authority over roles and role intersections. All the subsequent languages add new features to RT0.

    RT1 introduces parametrized roles, i.e. roles that are described using additional parameters, which can represent relationships between entities. RT2 adds to RT1 logical objects, which can represent permissions given to entities with respect to groups of logically related objects (resources). Those extensions can help in keeping the notation concise, but do not increase the expressive power of the language, because each combination of parameters in RT1 and each permission to a logical object in RT2 can be defined alternatively as a separate role in RT0.

    RT^T provides manifold roles and role-product operators, which can express threshold and separation of duties policies. A manifold role is a role that can be satisfied by a set of cooperating entities. A singleton role can be treated as a special case of a manifold role, whose set of cooperating entities is a singleton set. This way, RT0 credentials can also be expressed in RT^T. A threshold policy requires a specified minimum number of entities to agree on some fact, e.g. in a requirement that two different bank cashiers must authorize a transaction. A separation of duties policy requires a set of entities, each of which fulfils a specific role, to agree before access is granted. Both types of policies mean that some transactions cannot be completed by a single entity, because no single entity has all the access rights required to complete the transaction.

    RT^D provides mechanisms to describe delegation of role activations and selective use of role membership. This language is not covered in this paper. The features of RT^T and RT^D can be combined together with the features of RT0, RT1 or RT2. A more detailed treatment of the role-based trust management family of languages can be found in [12].

    2.1. The Language RT0

    Basic elements of all the RT languages are entities, role names, roles and credentials. Entities represent principals that can define roles and issue credentials, and requesters that can make requests to access resources. An entity can be identified by a user account in a computer system or by a public key. Role names represent permissions that can be issued by entities to other entities or groups of entities. Roles represent sets of entities that have permissions issued by particular issuers. A role is defined as a pair composed of an entity (role issuer) and a role name. Credentials define roles by naming a new member of the role or by delegating authority to the members of other roles.

    In this paper, we use nouns beginning with a capital letter or just capital letters, e.g. A, B, C, to denote entities and sets of entities. Role names are denoted as identifiers beginning with a small letter or just small letters, e.g. r, s, t. Roles take the form of an entity (the issuer of this role) followed by a role name separated by a dot, e.g. A.r. Credentials are statements in the language. A credential consists of a role, a left arrow symbol and a role expression.

    There are four types of credentials in RT0, which should be interpreted in the following way:

    A.r ← B – simple membership: Entity B is a member of role A.r.

    A.r ← B.s – simple inclusion: Role A.r includes (all members of) role B.s. This is a delegation of authority over r from A to B, because B may cause new entities to become members of the role A.r by issuing credentials that define B.s.

    A.r ← B.s.t – linking inclusion: Role A.r includes role C.t for each C, which is a member of role B.s. This is a delegation of authority from A to all the members of the role B.s. The expression B.s.t is called a linked role.

    A.r ← B.s ∩ C.t – intersection inclusion: Role A.r includes all the entities who are members of both roles B.s and C.t. This is a partial delegation from A to B and C. The expression B.s ∩ C.t is called an intersection role.

    A formal, set-theoretic semantics of RT0 has been defined in a slightly different manner in [13] and [8].

    Let E be a set of entities, R a set of role names and P a set of RT0 credentials. The semantics of the set P of RT0 credentials is a function S_P:

    \[ S_{\mathcal{P}} : \mathcal{E} \times \mathcal{R} \to 2^{\mathcal{E}} \]

    such that S_P is the least fixpoint of the following sequence of functions R_i, which map roles to sets of entity names [8]:

    1. R_0 maps each role to the empty set ∅
    2. R_{i+1} = ⊕_{c ∈ P} f(R_i, c)

    where ⊕ is the point-wise extension of a function and f is a function that, given a (partial) semantics R_i and a credential A.r ← e, returns all the entities that should be added to R_i(A.r), as governed by e:

    \begin{align*}
    f(R_i, A.r \leftarrow B) &= \{A.r \mapsto \{B\}\} \\
    f(R_i, A.r \leftarrow B.s) &= \{A.r \mapsto R_i(B.s)\} \\
    f(R_i, A.r \leftarrow B.s.t) &= \{A.r \mapsto \textstyle\bigcup_{C \in R_i(B.s)} R_i(C.t)\} \\
    f(R_i, A.r \leftarrow B.s \cap C.t) &= \{A.r \mapsto R_i(B.s) \cap R_i(C.t)\}
    \end{align*}

    2.2. The Language RT^T

    At the syntax level, RT^T adopts all the four types of RT0 credentials, and adds two new types of credentials. These are:

    A.r ← B.s ⊙ C.t – role A.r includes one member of role B.s and one member of role C.t. This allows expressing threshold policies.

    A.r ← B.s ⊗ C.t – role A.r includes one member of role B.s and one member of role C.t, but those members of roles have to be different. This allows for expressing separation of duties policies.

    The changes at the semantics level are greater, because the requesters as well as the issuers of RT^T credentials are no longer entities, but sets of entities, who can jointly fulfil a role. Such a change applies to all six types of credentials, including those adopted from RT0.

    A formal definition of the semantics of RT^T is given in Section 4.
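    The difference between the two role-product operators can be illustrated with a small Python sketch. It anticipates the set-based reading formalised in Section 4: a role is represented by the collection of entity sets that can satisfy it. The function names `odot` and `otimes` and the cashier names are illustrative assumptions, not part of the RT^T definition.

```python
from itertools import product
from typing import FrozenSet, Set

EntitySet = FrozenSet[str]


def odot(left: Set[EntitySet], right: Set[EntitySet]) -> Set[EntitySet]:
    """B.s (.) C.t: one member set of each role; the two may overlap."""
    return {x | y for x, y in product(left, right)}


def otimes(left: Set[EntitySet], right: Set[EntitySet]) -> Set[EntitySet]:
    """B.s (x) C.t: one member set of each role; the two must be disjoint."""
    return {x | y for x, y in product(left, right) if not (x & y)}


# Illustrative role with two singleton members.
cashiers = {frozenset({"Mary"}), frozenset({"Doris"})}
threshold = odot(cashiers, cashiers)    # a single cashier may count twice
separated = otimes(cashiers, cashiers)  # two different cashiers are required
```

    In this sketch `threshold` still contains the singleton sets, because the same cashier can fill both positions, whereas `separated` contains only the two-element set.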

    3. Examples

    The models discussed in this paper can be, in general, very complex. Therefore, we present here only simplified examples, with the intention to illustrate the basic notions and the notation. The first example demonstrates the use of RT0 credentials, while the second one presents the use of RT^T credentials.

    Example 1 (RT0)

    A person has the right to attend a lecture given at a university U when he or she is a student registered to a faculty of this university. To be able to fulfil the role of a faculty, an organization ought to be a division of the university and should conduct research activities. John is a student registered to F, which is a division of U and which conducts research activities. The following credentials prove that John has the right to attend a lecture:

    U.lecture ← U.faculty.student          (1)
    U.faculty ← U.division ∩ U.research    (2)
    U.division ← F                         (3)
    U.research ← F                         (4)
    F.student ← John                       (5)
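    The least-fixpoint construction of Section 2.1 can be applied to credentials (1)-(5) mechanically. The Python sketch below is a minimal illustration under stated assumptions: the tuple encoding of credentials and the function name `rt0_semantics` are hypothetical, and the loop updates role memberships in place rather than building each R_{i+1} from R_i, which reaches the same least fixpoint because every rule only adds members.

```python
from typing import Dict, List, Set, Tuple

Role = Tuple[str, str]  # (issuer entity, role name), e.g. ("U", "lecture")

# Hypothetical tuple encoding of the four RT0 credential types:
#   ("member",    A, r, B)           A.r <- B
#   ("include",   A, r, B, s)        A.r <- B.s
#   ("link",      A, r, B, s, t)     A.r <- B.s.t
#   ("intersect", A, r, B, s, C, t)  A.r <- B.s ∩ C.t


def rt0_semantics(credentials: List[tuple]) -> Dict[Role, Set[str]]:
    """Iterate the credential meanings until no role gains a new member."""
    members: Dict[Role, Set[str]] = {}

    def get(role: Role) -> Set[str]:
        return members.get(role, set())

    changed = True
    while changed:
        changed = False
        for cred in credentials:
            kind, a, r = cred[0], cred[1], cred[2]
            if kind == "member":
                new = {cred[3]}
            elif kind == "include":
                new = set(get((cred[3], cred[4])))
            elif kind == "link":
                new = set()
                for c in get((cred[3], cred[4])):
                    new |= get((c, cred[5]))
            elif kind == "intersect":
                new = get((cred[3], cred[4])) & get((cred[5], cred[6]))
            else:
                raise ValueError(f"unknown credential type: {kind}")
            target = members.setdefault((a, r), set())
            if not new <= target:
                target |= new
                changed = True
    return members


# Credentials (1)-(5) of Example 1.
EXAMPLE_1 = [
    ("link", "U", "lecture", "U", "faculty", "student"),
    ("intersect", "U", "faculty", "U", "division", "U", "research"),
    ("member", "U", "division", "F"),
    ("member", "U", "research", "F"),
    ("member", "F", "student", "John"),
]
semantics = rt0_semantics(EXAMPLE_1)
```

    Under these assumptions, `semantics[("U", "lecture")]` contains John, which is the conclusion the example argues for informally.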

    Example 2 (RT^T)

    The following example has been adapted from [11]. A bank B has three roles: manager, cashier and auditor. The security policy of the bank requires an approval of certain transactions from a manager, two cashiers, and an auditor. The two cashiers must be different. However, a manager who is also a cashier can serve as one of the two cashiers. The auditor must be different from the other parties in the transaction.


    Such a policy can be described using the following credentials:

    B.twoCashiers ← B.cashier ⊗ B.cashier            (6)
    B.managerCashiers ← B.manager ⊙ B.twoCashiers    (7)
    B.approval ← B.auditor ⊗ B.managerCashiers       (8)

    Now, assume that the following credentials have been added:

    B.cashier ← Mary     (9)
    B.cashier ← Doris    (10)
    B.cashier ← Alice    (11)
    B.cashier ← Kate     (12)
    B.manager ← Alice    (13)
    B.auditor ← Kate     (14)

    Then one can conclude that, according to the policy of B, the following sets of entities can cooperatively approve a transaction: {Mary, Doris, Alice, Kate}, {Mary, Alice, Kate} and {Doris, Alice, Kate}.

    4. The Semantics of RT^T

    The syntax of a language defines language expressions, which are used to communicate information. The primary expressions of role-based trust management languages are credentials and sets of credentials, which are used as a means for defining roles.

    The semantics of a language defines the meaning of expressions. Such a definition consists of two parts [10]: A semantic domain and a semantic mapping from the syntax to the semantic domain. The meaning of a language expression must be an element in the semantic domain.

    The semantics of RT0, which defines the meaning of a set of credentials as a function from the set of roles into the power set of entities, has no potential to describe the meaning of RT^T, which supports manifold roles and role-product operators. Therefore, we define in this section the meaning of a set of credentials as a relation over the set of roles and the power set of entities. Thus, we use a Cartesian product of the set of roles and the power set of entities as the semantic domain of a role-based trust management language. The semantic mapping would associate a specific relation between roles and entities with each set of credentials. Such a relational approach allows us to define a formal semantics of the RT^T language [5].

    Let E be the set of entities and R be the set of role names. P is a set of RT credentials, which describe the assignment of sets of entities to roles, issued by other entities (or rather sets of entities).

    The semantics of P, denoted by S_P, is defined as a relation:

    \[ S_{\mathcal{P}} \subseteq 2^{\mathcal{E}} \times \mathcal{R} \times 2^{\mathcal{E}} \]

    An instance of this relation, e.g. (A, r, X), maps the role A.r to a set of entities X ∈ 2^E. If the cardinality of set X is greater than one, then the role A.r is a manifold role and the entities of set X must cooperate in order to satisfy the role. The cardinality of set A can also be greater than one, which would mean that the role A.r is governed jointly by the entities of set A.

    If all the sets of entities are singleton sets, the semantics of RT^T reduces to the semantics of RT0. This way, our definition covers all the RT languages including RT0 through RT^T.

    Denote the power set of entities by F = 2^E. Each element in F is a set of entities from E (a subset of E). Each element in 2^F is a set composed of sets of entities from E.

    The semantics of P can now be described in an alternative way as a function:

    \[ \tilde{S}_{\mathcal{P}} : 2^{\mathcal{E}} \times \mathcal{R} \to 2^{\mathcal{F}} \]

    which maps each role from 2^E × R into a set of subsets of entities. The members of each subset must cooperate in order to satisfy the role.

    Knowing the relation S_P, one can define the function S̃_P as follows:

    \[ \tilde{S}_{\mathcal{P}}(A, r) = \{X \in 2^{\mathcal{E}} : (A, r, X) \in S_{\mathcal{P}}\} \]

    The semantics of RT^T can now be defined formally in the following way.


    Definition 1. The semantics of a set P of RT^T credentials, denoted by S_P, is the smallest relation S_i, such that:

    1. S_0 = ∅
    2. S_{i+1} = ⋃_{c ∈ P} f(S_i, c) for i = 0, 1, ...

    which is closed with respect to function f, which describes the meaning of credentials in the following way (A, B, C, X, Y are sets of entities, which may be singletons):

    \begin{align*}
    f(S_i, A.r \leftarrow X) &= \{(A, r, X)\} \tag{D1} \\
    f(S_i, A.r \leftarrow B.s) &= \{(A, r, X) : (B, s, X) \in S_i\} \tag{D2} \\
    f(S_i, A.r \leftarrow B.s.t) &= \bigcup_{C : (B, s, C) \in S_i} \{(A, r, X) : (C, t, X) \in S_i\} \tag{D3} \\
    f(S_i, A.r \leftarrow B.s \cap C.t) &= \{(A, r, X) : (B, s, X) \in S_i \wedge (C, t, X) \in S_i\} \tag{D4} \\
    f(S_i, A.r \leftarrow B.s \odot C.t) &= \{(A, r, X \cup Y) : (B, s, X) \in S_i \wedge (C, t, Y) \in S_i\} \tag{D5} \\
    f(S_i, A.r \leftarrow B.s \otimes C.t) &= \{(A, r, X \cup Y) : (B, s, X) \in S_i \wedge (C, t, Y) \in S_i \wedge (X \cap Y) = \emptyset\} \tag{D6}
    \end{align*}
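    Definition 1 translates almost directly into code. The following Python sketch is one possible reading, not the authors' implementation: the tuple encoding of credentials and the helper names (`fs`, `members`, `f`, `rtt_semantics`) are assumptions made for illustration. It computes S_{i+1} from S_i as the union of f(S_i, c) over all credentials and stops when the relation no longer changes, then evaluates the bank policy given by credentials (6)-(14) in Section 3.

```python
from typing import FrozenSet, List, Set, Tuple

Entities = FrozenSet[str]
Triple = Tuple[Entities, str, Entities]  # (A, r, X): the set X satisfies role A.r


def fs(*names: str) -> Entities:
    return frozenset(names)


def members(rel: Set[Triple], issuer: Entities, name: str) -> Set[Entities]:
    """All entity sets X such that (issuer, name, X) is in the relation."""
    return {x for (a, r, x) in rel if a == issuer and r == name}


def f(rel: Set[Triple], cred: tuple) -> Set[Triple]:
    """Meaning of one credential with respect to rel, following (D1)-(D6)."""
    kind, a, r = cred[0], cred[1], cred[2]
    if kind == "member":                                    # (D1)
        return {(a, r, cred[3])}
    if kind == "include":                                   # (D2)
        return {(a, r, x) for x in members(rel, cred[3], cred[4])}
    if kind == "link":                                      # (D3)
        return {(a, r, x)
                for c in members(rel, cred[3], cred[4])
                for x in members(rel, c, cred[5])}
    left = members(rel, cred[3], cred[4])
    right = members(rel, cred[5], cred[6])
    if kind == "intersect":                                 # (D4)
        return {(a, r, x) for x in left & right}
    if kind == "odot":                                      # (D5)
        return {(a, r, x | y) for x in left for y in right}
    if kind == "otimes":                                    # (D6)
        return {(a, r, x | y) for x in left for y in right if not (x & y)}
    raise ValueError(f"unknown credential type: {kind}")


def rtt_semantics(credentials: List[tuple]) -> Set[Triple]:
    """Iterate S_0 = {} and S_{i+1} = union of f(S_i, c) until stable."""
    rel: Set[Triple] = set()
    while True:
        nxt: Set[Triple] = set().union(*(f(rel, c) for c in credentials))
        if nxt == rel:
            return rel
        rel = nxt


B = fs("B")
BANK_POLICY = [
    ("otimes", B, "twoCashiers", B, "cashier", B, "cashier"),        # (6)
    ("odot", B, "managerCashiers", B, "manager", B, "twoCashiers"),  # (7)
    ("otimes", B, "approval", B, "auditor", B, "managerCashiers"),   # (8)
    ("member", B, "cashier", fs("Mary")),                            # (9)
    ("member", B, "cashier", fs("Doris")),                           # (10)
    ("member", B, "cashier", fs("Alice")),                           # (11)
    ("member", B, "cashier", fs("Kate")),                            # (12)
    ("member", B, "manager", fs("Alice")),                           # (13)
    ("member", B, "auditor", fs("Kate")),                            # (14)
]
approval = members(rtt_semantics(BANK_POLICY), B, "approval")
```

    With this encoding, `approval` holds the three entity sets named at the end of Example 2 in Section 3. Representing every issuer and member as a `frozenset` keeps singleton and manifold roles uniform, at the cost of slightly more verbose credentials.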

    5. Examples

    We use the example sets of credentials from Section 3 to illustrate the definition of the RT^T semantics.

    Example 1 (RT0)

    The starting relation S_0 is, by definition, empty. The sequence of steps to compute consecutive relations S_i can be described as follows:

    S_0 = ∅
    S_1 = {({U}, division, {F}), ({U}, research, {F}), ({F}, student, {John})}
    S_2 = {({U}, division, {F}), ({U}, research, {F}), ({F}, student, {John}),
           ({U}, faculty, {F})}
    S_3 = {({U}, division, {F}), ({U}, research, {F}), ({F}, student, {John}),
           ({U}, faculty, {F}), ({U}, lecture, {John})}

    The resulting relation S_3 cannot be changed using the given set of credentials, hence: S_P = S_3.

    Because the RT language considered in this example is RT0, all the sets of entities are singleton sets.

    Example 2 (RT^T)

    The sequence of steps to compute consecutive relations S_i starts from an empty set, S_0 = ∅, and proceeds as follows. Credentials 9 through 14 are mapped in S_0 into relation S_1:

    S_1 = {({B}, cashier, {Mary}), ({B}, cashier, {Doris}),
           ({B}, cashier, {Alice}), ({B}, cashier, {Kate}),
           ({B}, manager, {Alice}), ({B}, auditor, {Kate})}

    Credential 6 adds the following instances to relation S_2:

    S_2 = S_1 ∪ {
           ({B}, twoCashiers, {Mary, Doris}), ({B}, twoCashiers, {Mary, Alice}),
           ({B}, twoCashiers, {Mary, Kate}), ({B}, twoCashiers, {Doris, Alice}),
           ({B}, twoCashiers, {Doris, Kate}), ({B}, twoCashiers, {Alice, Kate})}

    Credential 7 is resolved in S_3:

    S_3 = S_2 ∪ {
           ({B}, managerCashiers, {Mary, Doris, Alice}),
           ({B}, managerCashiers, {Mary, Alice}),
           ({B}, managerCashiers, {Mary, Kate, Alice}),
           ({B}, managerCashiers, {Doris, Alice}),
           ({B}, managerCashiers, {Doris, Kate, Alice}),
           ({B}, managerCashiers, {Alice, Kate})},

    and credential 8 in S_4:

    S_4 = S_3 ∪ {
           ({B}, approval, {Mary, Doris, Alice, Kate}),
           ({B}, approval, {Mary, Alice, Kate}),
           ({B}, approval, {Doris, Alice, Kate})}

    The resulting relation S_4 cannot be changed using the given set of credentials, hence: S_P = S_4. Because the RT language considered in this example is RT^T, there is a set of sets of entities assigned to each role.

    6. Deductive System over RT^T Credentials

    RT^T credentials are used to define roles and roles are used to represent permissions. The semantics of a given set P of RT^T credentials defines for each role A.r the set of entities which are members of this role. The member sets of roles can also be calculated in a more convenient way using a deductive system, which defines an operational semantics of the RT^T language.

    A deductive system consists of an initial set of formulae that are considered to be true, and a set of inference rules that can be used to derive new formulae from the known ones.

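    Let P be a given set of RT^T credentials. The application of inference rules of the deductive system will create new credentials, derived from credentials of the set P. A derived credential c will be denoted using a formula:

    \[ \mathcal{P} \succ c \]

    which should be read: "credential c can be derived from a set of credentials P".

    Definition 2. The initial set of formulae of a deductive system over a set P of RT^T credentials consists of all the formulae:

    \[ c \in \mathcal{P} \]

    for each credential c in P. The inference rules of the system are the following:

    \begin{gather*}
    \frac{c \in \mathcal{P}}{\mathcal{P} \succ c} \tag{W1} \\[1ex]
    \frac{\mathcal{P} \succ A.r \leftarrow B.s \qquad \mathcal{P} \succ B.s \leftarrow X}{\mathcal{P} \succ A.r \leftarrow X} \tag{W2} \\[1ex]
    \frac{\mathcal{P} \succ A.r \leftarrow B.s.t \qquad \mathcal{P} \succ B.s \leftarrow C \qquad \mathcal{P} \succ C.t \leftarrow X}{\mathcal{P} \succ A.r \leftarrow X} \tag{W3} \\[1ex]
    \frac{\mathcal{P} \succ A.r \leftarrow B.s \cap C.t \qquad \mathcal{P} \succ B.s \leftarrow X \qquad \mathcal{P} \succ C.t \leftarrow X}{\mathcal{P} \succ A.r \leftarrow X} \tag{W4} \\[1ex]
    \frac{\mathcal{P} \succ A.r \leftarrow B.s \odot C.t \qquad \mathcal{P} \succ B.s \leftarrow X \qquad \mathcal{P} \succ C.t \leftarrow Y}{\mathcal{P} \succ A.r \leftarrow X \cup Y} \tag{W5} \\[1ex]
    \frac{\mathcal{P} \succ A.r \leftarrow B.s \otimes C.t \qquad \mathcal{P} \succ B.s \leftarrow X \qquad \mathcal{P} \succ C.t \leftarrow Y \qquad X \cap Y = \emptyset}{\mathcal{P} \succ A.r \leftarrow X \cup Y} \tag{W6}
    \end{gather*}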
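    The rules W1 through W6 can be read operationally as a forward-chaining procedure. The Python sketch below is an illustration under stated assumptions rather than a definitive implementation: the credential encoding and the name `derive` are hypothetical, only derived credentials of the form A.r ← X are stored, and the other credentials of P are read directly from the policy, which corresponds to deriving them with W1. For each derived credential the sketch records the rule that first produced it, using credentials (1)-(5) of Example 1 as input.

```python
from typing import Dict, FrozenSet, List, Tuple

Entities = FrozenSet[str]
Fact = Tuple[Entities, str, Entities]  # derived credential A.r <- X


def fs(*names: str) -> Entities:
    return frozenset(names)


def derive(policy: List[tuple]) -> Dict[Fact, str]:
    """Forward-chain W1-W6; map each derived A.r <- X to the rule that derived it first."""
    proofs: Dict[Fact, str] = {}

    def facts(issuer: Entities, name: str) -> List[Entities]:
        return [x for (a, r, x) in list(proofs) if a == issuer and r == name]

    changed = True
    while changed:
        changed = False
        for cred in policy:
            kind, a, r = cred[0], cred[1], cred[2]
            if kind == "member":
                new = [("W1", cred[3])]
            elif kind == "include":
                new = [("W2", x) for x in facts(cred[3], cred[4])]
            elif kind == "link":
                new = [("W3", x)
                       for c in facts(cred[3], cred[4])
                       for x in facts(c, cred[5])]
            else:
                xs = facts(cred[3], cred[4])
                ys = facts(cred[5], cred[6])
                if kind == "intersect":
                    new = [("W4", x) for x in xs if x in ys]
                elif kind == "odot":
                    new = [("W5", x | y) for x in xs for y in ys]
                elif kind == "otimes":
                    new = [("W6", x | y) for x in xs for y in ys if not (x & y)]
                else:
                    raise ValueError(f"unknown credential type: {kind}")
            for rule, x in new:
                if (a, r, x) not in proofs:
                    proofs[(a, r, x)] = rule
                    changed = True
    return proofs


U, F = fs("U"), fs("F")
UNIVERSITY_POLICY = [
    ("link", U, "lecture", U, "faculty", "student"),             # (1)
    ("intersect", U, "faculty", U, "division", U, "research"),   # (2)
    ("member", U, "division", F),                                # (3)
    ("member", U, "research", F),                                # (4)
    ("member", F, "student", fs("John")),                        # (5)
]
derived = derive(UNIVERSITY_POLICY)
```

    In this run the credential U.lecture ← {John} is obtained by W3 after U.faculty ← {F} has been obtained by W4, and the set of derived triples coincides with the relation S_3 computed for Example 1 in Section 5.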


    There could be a number of deductive systems defined over a given language. To be useful for practical purposes a deductive system must exhibit two properties. First, it should be sound, which means that the inference rules could derive only formulae that are valid with respect to the semantics of the language. Second, it should be complete, which means that each formula, which is valid according to the semantics, should be derivable in the system.

    All the credentials, which can be derived in the system, either belong to set P (rule W1) or are of the type: P ≻ A.r ← X (rules W2 through W6). To prove the soundness of the deductive system, one must prove that for each new formula P ≻ A.r ← X, the triple (A, r, X) belongs to the semantics S_P of the set P.

    Let us first note that all the formulae P ≻ A.r ← X, such that A.r ← X ∈ P, are sound. This is proved in Lemma 1.

    Lemma 1. If A.r ← X ∈ P then (A, r, X) ∈ S_P.

    Proof. The relation S_P, which defines the semantics of P, is a limit of a monotonically increasing sequence of sets S_0, S_1, ... such that S_0 = ∅. According to Definition 1: f(S_0, A.r ← X) = {(A, r, X)}. Hence, (A, r, X) ∈ S_1 and because S_1 ⊆ S_P then (A, r, X) ∈ S_P. □

    To prove the soundness of the deductive system over P, we must prove the soundness of each formula P ≻ A.r ← X that can be derived from the set P. This is proved in Theorem 1.

    Theorem 1. If P ≻ A.r ← X then (A, r, X) ∈ S_P.

    Proof. By induction with respect to the number n of inference steps needed to derive a formula P ≻ A.r ← X.

    If n = 1, then the formula P ≻ A.r ← X could be derived only using rule W1, because only the premises of this rule belong to the initial set of formulae of the deductive system. Hence, the thesis is true according to Lemma 1.

    Let n ≥ 1 and assume for the inductive step that the thesis is true if the number of inference steps is not greater than n. We will show that it is also true when the number of inference steps equals n + 1.

    Each of the rules W2 through W6 could be used in the last, (n + 1)-th, step of inference. All those five cases are discussed separately.

    [W2] The first premise of W2 cannot be derived otherwise than using W1. Hence, A.r ← B.s ∈ P. The second premise of W2, P ≻ B.s ← X, was derived from P using at most n steps of inference, hence (B, s, X) ∈ S_P according to the inductive hypothesis. By Definition 1, there exists such S_i that (B, s, X) ∈ S_i, and (A, r, X) ∈ f(S_i, A.r ← B.s) according to (D2). Because f(S_i, A.r ← B.s) ⊆ S_{i+1} ⊆ S_P, then (A, r, X) ∈ S_P.

    [W3] The first premise of W3 cannot be derived otherwise than using W1. Hence, A.r ← B.s.t ∈ P. The second premise of W3, P ≻ B.s ← C, was derived from P using at most n steps of inference, hence (B, s, C) ∈ S_P according to the inductive hypothesis. By Definition 1, there exists such S_i that (B, s, C) ∈ S_i. Similarly, in the case of the third premise of W3, P ≻ C.t ← X, there exists such S_j that (C, t, X) ∈ S_j. Let k be the maximum of i and j. Then (B, s, C) ∈ S_k and (C, t, X) ∈ S_k, and (A, r, X) ∈ f(S_k, A.r ← B.s.t) according to (D3). Because f(S_k, A.r ← B.s.t) ⊆ S_{k+1} ⊆ S_P, then (A, r, X) ∈ S_P.

    [W4] The first premise of W4 cannot be derived otherwise than using W1. Hence, A.r ← B.s ∩ C.t ∈ P. The second premise of W4, P ≻ B.s ← X, was derived from P using at most n steps of inference, hence (B, s, X) ∈ S_P according to the inductive hypothesis. By Definition 1, there exists such S_i that (B, s, X) ∈ S_i. Similarly, in the case of the third premise of W4, P ≻ C.t ← X, there exists such S_j that (C, t, X) ∈ S_j. Let k be the maximum of i and j. Then (B, s, X) ∈ S_k, (C, t, X) ∈ S_k and (A, r, X) ∈ f(S_k, A.r ← B.s ∩ C.t) according to (D4). Because f(S_k, A.r ← B.s ∩ C.t) ⊆ S_{k+1} ⊆ S_P, then (A, r, X) ∈ S_P.

    [W5] The conclusion of W5 is a formula P ≻ A.r ← X ∪ Y, which states that the set of entities that can play the role A.r is a union of two other sets of entities X and Y. To prove the thesis we must show that (A, r, X ∪ Y) ∈ S_P.

    The first premise of W5 cannot be derived otherwise than using W1. Hence, A.r ← B.s ⊙ C.t ∈ P. Similarly as in the case of W4, the second and the third premises of W5 were derived from P using at most n steps of inference. So, (B, s, X) ∈ S_P and (C, t, Y) ∈ S_P. Then, there exists such k that (B, s, X) ∈ S_k and (C, t, Y) ∈ S_k, and (A, r, X ∪ Y) ∈ f(S_k, A.r ← B.s ⊙ C.t) according to (D5). Because f(S_k, A.r ← B.s ⊙ C.t) ⊆ S_{k+1} ⊆ S_P, then (A, r, X ∪ Y) ∈ S_P.

  • Deriving RTT Credentials for Role-Based Trust Management 17

    [W6] The conclusion of W6 is a formula P ≻ A.r ← X ∪ Y, which states that the set of entities that can play the role A.r is a union of two other, disjoint sets of entities X and Y. To prove the thesis we must show that (A, r, X ∪ Y) ∈ S_P.

    The first premise of W6 cannot be derived otherwise than using W1. Hence, A.r ← B.s ⊗ C.t ∈ P. Similarly as in the case of W4, the second and the third premises of W6 were derived from P using at most n steps of inference. So, (B, s, X) ∈ S_P and (C, t, Y) ∈ S_P. Then, there exists such k that (B, s, X) ∈ S_k and (C, t, Y) ∈ S_k. The fourth premise of W6, X ∩ Y = ∅, does not depend on the number of inference steps and is always true if W6 could be applied. Hence, (A, r, X ∪ Y) ∈ f(S_k, A.r ← B.s ⊗ C.t) according to (D6). Because f(S_k, A.r ← B.s ⊗ C.t) ⊆ S_{k+1} ⊆ S_P, then (A, r, X ∪ Y) ∈ S_P. □

    To prove the completeness of the deductive system over a set P of RT^T credentials, we must prove that a formula P ≻ A.r ← X can be derived using inference rules for each element (A, r, X) ∈ S_P. This is proved in Theorem 2.

    Theorem 2. If (A, r, X) ∈ S_P then P ≻ A.r ← X.

    Proof. Assume (A, r, X) ∈ S_P. By Definition 1, there exists such i ≥ 0 and such c ∈ P that (A, r, X) ∈ f(S_i, c). The proof of the thesis is by induction with respect to the value of index i.

    If i = 0, then credential c must take the form of A.r ← X. This is because S_0 = ∅ and f(S_0, d) = ∅ for each credential d that is not of the form A.r ← X. Hence, A.r ← X ∈ P and the formula P ≻ A.r ← X can be derived using rule W1.

    Let i > 0. Assume for the inductive step that the thesis is true if the value of index i in the expression (A, r, X) ∈ f(S_i, c) is not greater than n. We will show that it is also true when the value of index i equals n + 1.

    Assume (A, r, X) ∈ S_P and (A, r, X) ∈ f(S_{n+1}, c) for a certain c ∈ P. The credential c can take one of the six forms allowed in RT^T. Each of these types of credentials will be discussed separately.

    [c = A.r ← X] If this is the case, then the formula P ≻ A.r ← X can be derived using rule W1.

    [c = A.r ← B.s] If (A, r, X) ∈ f(S_{n+1}, A.r ← B.s), then (B, s, X) ∈ S_{n+1} according to (D2) of Definition 1. Hence, there exists a credential c_1 ∈ P such that (B, s, X) ∈ f(S_n, c_1). This implies that (B, s, X) ∈ S_P and P ≻ B.s ← X according to the inductive hypothesis. Then P ≻ A.r ← B.s (by rule W1, because c ∈ P) and P ≻ B.s ← X, hence P ≻ A.r ← X is a conclusion of rule W2.

    [c = A.r ← B.s.t] If (A, r, X) ∈ f(S_{n+1}, A.r ← B.s.t), then according to (D3) of Definition 1, there exists a set of entities C such that (B, s, C) ∈ S_{n+1} and (C, t, X) ∈ S_{n+1}. Hence, there exists a credential c_1 ∈ P such that (B, s, C) ∈ f(S_n, c_1) and there exists a credential c_2 ∈ P such that (C, t, X) ∈ f(S_n, c_2). This implies that (B, s, C) ∈ S_P and (C, t, X) ∈ S_P, hence P ≻ B.s ← C and P ≻ C.t ← X according to the inductive hypothesis. P ≻ A.r ← X is a conclusion of rule W3.

    [c = A.r ← B.s ∩ C.t] If (A, r, X) ∈ f(S_{n+1}, A.r ← B.s ∩ C.t), then (B, s, X) ∈ S_{n+1} and (C, t, X) ∈ S_{n+1} according to (D4) of Definition 1. Hence, there exist credentials c_1, c_2 such that (B, s, X) ∈ f(S_n, c_1) and (C, t, X) ∈ f(S_n, c_2). This implies that (B, s, X) ∈ S_P and (C, t, X) ∈ S_P, hence P ≻ B.s ← X and P ≻ C.t ← X according to the inductive hypothesis. P ≻ A.r ← X is a conclusion of rule W4.

    [c = A.r ← B.s ⊙ C.t] If (A, r, X) ∈ f(S_{n+1}, A.r ← B.s ⊙ C.t), then according to (D5) of Definition 1, there exist two sets of entities Z, Y such that Z ∪ Y = X and (B, s, Z) ∈ S_{n+1} and (C, t, Y) ∈ S_{n+1}. Hence, there exist credentials c_1, c_2 such that (B, s, Z) ∈ f(S_n, c_1) and (C, t, Y) ∈ f(S_n, c_2). This implies that (B, s, Z) ∈ S_P and (C, t, Y) ∈ S_P, hence P ≻ B.s ← Z and P ≻ C.t ← Y according to the inductive hypothesis. P ≻ A.r ← Z ∪ Y, that is P ≻ A.r ← X, is a conclusion of rule W5.


    [c = A.r ← B.s ⊗ C.t] If (A, r, X) ∈ f(S_{n+1}, A.r ← B.s ⊗ C.t), then according to (D6) of Definition 1, there exist two sets of entities Z, Y such that Z ∪ Y = X and Z ∩ Y = ∅ and (B, s, Z) ∈ S_{n+1} and (C, t, Y) ∈ S_{n+1}. Hence, there exist credentials c_1, c_2 such that (B, s, Z) ∈ f(S_n, c_1) and (C, t, Y) ∈ f(S_n, c_2). This implies that (B, s, Z) ∈ S_P and (C, t, Y) ∈ S_P, hence P ≻ B.s ← Z and P ≻ C.t ← Y according to the inductive hypothesis. P ≻ A.r ← Z ∪ Y, that is P ≻ A.r ← X, is a conclusion of rule W6. □

    Theorem 1 and Theorem 2 together show that the deductive system of Definition 2 is sound and complete with respect to the semantics of RT^T credentials. This way, the deductive system gives an operational definition of RT^T semantics.

    7. Conclusions

    This paper deals with modelling of trust management systems in decentralized and distributed environments. The modelling framework is the RT^T language from the family of role-based trust management languages. Two types of semantics for a set of RT^T credentials have been introduced in the paper.

    A set-theoretic semantics of RT^T is defined as a relation over a set of roles and a power set (set of sets) of entities. All the members of a set of entities related to a role must cooperate in order to satisfy the role. This way, our definition covers the full potential of RT^T, which supports the notion of manifold roles and is able to express the structure of threshold and separation-of-duty policies.

    An operational semantics of RT^T is defined as a deductive system, in which credentials can be derived from an initial set of credentials using a set of inference rules. The semantics is given by the set of resulting credentials of the type A.r ← X, which explicitly show a mapping between roles and sets of entities.

    The properties of soundness and completeness of the deductive system with respect to the semantics of RT^T are proved.

    References

    [1] M. Blaze, J. Feigenbaum, J. Ioannidis, and A. Keromytis. The role of trust management in distributed systems security. In Secure Internet Programming, pages 185–210. 1999.

    [2] M. Blaze, J. Feigenbaum, and J. Lacy. Decentralized trust management. In Proceedings of the IEEE Conference on Security and Privacy, pages 164–173, 1996.

    [3] M. Blaze, J. Feigenbaum, and M. Strauss. Compliance checking in the PolicyMaker trust management system. In Financial Cryptography, pages 1439–1456, 1998.

    [4] D. Clarke, J. E. Elien, C. Ellison, M. Fredette, A. Morcos, and R. L. Rivest. Certificate chain discovery in SPKI/SDSI. Journal of Computer Security, 9(4):285–322, 2001.

    [5] A. Felkner and K. Sacha. The semantics of role-based trust management languages. In Proc. Central and Eastern European Conference on Software Engineering Techniques CEE-SET, pages 195–206, 2009.

    [6] D. Ferraiolo and D. Kuhn. Role-based access control. In Proc. 15th National Computer Security Conference, pages 554–563, 1992.

    [7] D. F. Ferraiolo, R. Sandhu, S. Gavrila, D. R. Kuhn, and R. Chandramouli. Proposed NIST standard for role-based access control. ACM Transactions on Information and System Security (TISSEC), 4(3):224–274, 2001.

    [8] D. Gorla, M. Hennessy, and V. Sassone. Inferring dynamic credentials for role-based trust management. In Proceedings of the 8th ACM SIGPLAN international conference on Principles and practice of declarative programming, page 224, 2006.

    [9] W. M. Grudzewski, I. K. Hejduk, A. Sankowska, and M. Wantuchowicz. Trust Management in Virtual Work Environments: A Human Factors Perspective. CRC Press, 2008.

    [10] D. Harel and B. Rumpe. Modeling languages: Syntax, semantics and all that stuff. 2000.

    [11] N. Li and J. Mitchell. RT: a role-based trust-management framework. In Proc. 3rd DARPA Information Survivability Conference and Exposition, pages 201–212. IEEE Computer Society Press, 2003.

    [12] N. Li, J. C. Mitchell, and W. H. Winsborough. Design of a role-based trust-management framework. In Proceedings of 2002 IEEE Symposium on Security and Privacy, pages 114–130, Oakland CA, 2002. IEEE Computer Society Press.

    [13] N. Li, W. H. Winsborough, and J. C. Mitchell. Distributed credential chain discovery in trust management. Journal of Computer Security, 11(1):35–86, 2003.

    [14] R. S. Sandhu, E. J. Coyne, H. L. Feinstein, and C. E. Youman. Role-based access control models. Computer, 29(2):38–47, 1996.

  • e-Informatica Software Engineering Journal, Volume 4, Issue 1, 2010

    Hierarchical Model for Evaluating Software Design Quality

    Paweł Martenka∗, Bartosz Walter∗

    ∗Institute of Computing Science, Poznań University of Technology


    Abstract

    Quality of software design has a decisive impact on several quality attributes of the resulting product. However, simple metrics, despite their popularity, fail to deliver comprehensive information about the reasons for the anomalies and the relation between them and metric values. More complex models that combine multiple metrics to detect a given anomaly are still only partially useful without proper interpretation. In the paper we propose a hierarchical model that extends the Factor-Strategy model defined by Marinescu in two ways: by embedding a new interpretation delivery mechanism into the model and by extending the spectrum of data providing input to the model.

    1. Introduction

    Software design is considered one of the most complex human creative activities [13]. As such, the design process is prone to errors, which significantly affect the quality of a software product resulting from the design. Therefore, there is a continuous search for models and approaches that could help both in improving the design process and in evaluating its quality.

    Since software design is a quantifiable process, well-known code metrics are advocated as the primary solution for that problem. They are easy to compute, and there is also plenty of experimental data showing the correlation between various metrics and desired quality attributes. However, metrics are just numbers, which often do not point to the design flaws, but rather provide rough and aggregate data. There are three main drawbacks of using isolated metrics as direct providers of quality-related information:

    1. There is no direct traceable connection between an actual cause and the value of a metric; usually it is the designer who is required to examine the values and identify the problem.

    2. A vector of metric values has no meaning for the designer without a proper interpretation. Aggregate metrics are not subject to a straightforward interpretation.

    3. Code metrics are unable to deliver complete information about software design. They need to be combined with a diverse set of data to provide a more complete view.

    Thus, there is a need for more holistic approaches. One of them is the two-stage Factor-Strategy model proposed by Marinescu [17], which is still based on metrics, but also addresses some of their weaknesses. It is a framework for building rule-based descriptions of design anomalies, which builds a navigable path between metrics and actual violations of high-level design principles. Unfortunately, this approach also has drawbacks. Such principles usually refer to abstract notions like cohesion or coupling, which still do not point directly to actual flaws. Moreover, actual code anomalies often result from multiple violations of different nature, for which the rules might not be properly configured. For example, the Large Class bad smell [12], which describes classes bearing too much responsibility, typically denotes an overly complex, low-cohesive class with lots of members. Due to a large number of symptoms suggesting the presence of the flaw, metrics pointing to them must be combined and evaluated in a non-linear and fuzzy manner to deliver an effective and useful measurement mechanism. Thus, the Factor-Strategy model, which is based on simple and strict rules, still does not provide a flexible abstraction for such flaws.

    In this paper we propose a hierarchical model for evaluating design quality which is based on the Factor-Strategy concept, but extends it in several ways. It provides designers with hierarchical, custom-grained information, which helps in tracing the causes of flaws, and also enriches the spectrum of utilized sources of data.

    The paper is structured as follows. Section 2 provides an overview of existing literature and approaches used for similar problems. In Section 3 we present the Factor-Strategy model in a more detailed way, and in Section 4 we propose the hierarchical model. Section 5 contains a simple exemplary instance of the model, along with early experimental evaluation results. Section 6 summarizes our findings and proposes further extensions to the model.

    2. Related Work

    Historically, the first attempts to quantitatively evaluate the design quality of object-oriented software were directly derived from code metrics. Metric suites proposed by Chidamber and Kemerer [6], Brito e Abreu [9] and others were designed to capture the most important internal characteristics of object-oriented software, like cohesion and coupling, and the use of mechanisms embedded in the object paradigm. Strong evidence has been collected pointing to a correlation between these metrics and external quality characteristics.

    These characteristics were further investigated by Briand et al. [3, 2], who noted that they are too ambiguous to be effectively captured by generalized, aggregate metrics. As an effect, they proposed several specific metrics, which analysed different flavours of cohesion and coupling.

    Some researchers went in the opposite direction, building more holistic approaches to modelling design anomalies. Beck, the author of the eXtreme Programming methodology, coined the term “code bad smell” as a general label for structures in the code that suggest the possibility of refactoring [11]. Since specific smells describe anomalies that can result from many initial causes, they should also be backed by several symptoms [23], e.g. diverse sets of metrics. Moonen et al. [22] proposed a method for automating smell detection based on analysis of source code abstract syntax trees. Kothari et al. in [16] defined a framework for building tools that perform partially automated code inspections and transformations.

    Dhambri et al. in [8] went a step further and employed visualisation techniques for detecting anomalies. The main idea was based on presenting some software quality attributes (e.g. measured by metrics) to a software design expert, who made the final decision. Another work, by Simon and Lewerentz [21], focused on refactorings driven by distance-based cohesion. The distance between members of classes (fields and methods) was visualised in a 3D space, so that an expert could decide on an appropriate assignment of class members and possibly suggest refactorings.

    Based on criticism of the simplistic metric-based quality models, Marinescu proposed the Factor-Strategy model [17], composed of two stages: the detection strategies stage, responsible for identifying an anomaly, and the composition stage, which evaluates the impact of suspects found in the previous step on the high-level quality factors.

    This model was further extended. Ratiu [20] encapsulated the detection strategies with a new model which incorporated code change history into the classification mechanism. The new model has two main advantages:

    1. it removes false positives from the set of detected suspects,

    2. it emphasizes the most harmful suspects.


    A similar concept, the use of historical data, was also exploited by Graves et al. [14] and Khoshgoftaar et al. [15]. Graves presented a few models to predict fault incidence, and Khoshgoftaar introduced a regression model to predict software reliability, both based on the code history.

    3. The Factor-Strategy Model

    As Marinescu noted, classical models of design quality evaluation do not provide an explicit mapping between metrics and quality criteria, so the rules behind quality quantification are implicit and informal. The metrics-based models can provide information about the existence of a problem, but they do not reveal the actual cause of a problem. Hence, there is a need for a more comprehensive and holistic model.

    The Factor-Strategy model has been proposed as a response to the above-mentioned weaknesses. It is composed of two main elements: the Detection Strategy and the composition step.

    The Detection Strategy (DS) is defined as a quantifiable expression of a rule by which design fragments that conform to that rule can be detected in the source code.

    Rules are configured by a set of selected and suitable metrics. In consequence, a DS provides a more abstract level of interpretation than individual metrics do, so that the numeric values of these metrics do not need to be interpreted in isolation.

    Metrics are combined into rules using two basic mechanisms: filtering and composition. Filters transform metric values, whereas the composition operators aggregate them into a rule. Marinescu gives the following example of a Detection Strategy instance for the Feature Envy smell:

    FeatureEnvy := ((AID, HigherThan(4)) and (AID, TopValues(10%)) and (ALD, LowerThan(3)) and (NIC, LowerThan(3)))

    This exemplary rule uses three metrics: Access of Import-Data (AID), Access of Local Data (ALD) and Number of Import Classes (NIC), processed with the HigherThan, TopValues and LowerThan filters, and composed with the and composition operator.
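    The Feature Envy rule above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions rather than Marinescu's implementation: the function names, the reading of TopValues(10%) as "the 10% of entities with the highest values" (at least one entity, ties broken arbitrarily), the reading of "and" as set intersection, and the sample metric values are all hypothetical.

```python
import math


def higher_than(threshold):
    """Filter: keep entities whose metric value is above the threshold."""
    return lambda values: {e for e, v in values.items() if v > threshold}


def lower_than(threshold):
    """Filter: keep entities whose metric value is below the threshold."""
    return lambda values: {e for e, v in values.items() if v < threshold}


def top_values(fraction):
    """Filter: keep the given fraction of entities with the highest values."""
    def apply(values):
        k = max(1, math.ceil(fraction * len(values)))
        ranked = sorted(values, key=values.get, reverse=True)
        return set(ranked[:k])
    return apply


def feature_envy(metrics):
    """Compose the filtered sets with 'and', read here as set intersection."""
    def column(name):
        return {e: m[name] for e, m in metrics.items()}

    rule = [("AID", higher_than(4)), ("AID", top_values(0.10)),
            ("ALD", lower_than(3)), ("NIC", lower_than(3))]
    suspects = set(metrics)
    for name, keep in rule:
        suspects &= keep(column(name))
    return suspects


# Hypothetical metric values for four classes.
metrics = {
    "Order":    {"AID": 9, "ALD": 1, "NIC": 2},
    "Invoice":  {"AID": 5, "ALD": 2, "NIC": 1},
    "Customer": {"AID": 2, "ALD": 6, "NIC": 1},
    "Report":   {"AID": 7, "ALD": 5, "NIC": 4},
}
suspects = feature_envy(metrics)
```

    With these values only the class Order passes all four filters, so it is the single suspect reported by the strategy.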

    Application of a DS on a set of software entities (e.g. classes) results in:

    1. a set of detected suspects,

    2. a vector of metric values for each suspect.

    Using this data, a score for a DS is calculated and mapped to a normalised value (a ranked score). The score can be interpreted as a higher-level metric for the strategy. Marinescu provides a few exemplary formulas for computing the score; the simplest one is the number of suspects for a given DS.

    Quantification of high-level quality factors is based on an aggregation of ranked strategies and rules. Formulas for aggregation can vary from a simple mean value, where DS and the rules have equal weight, to more sophisticated, weighted methods. Selection of a method for aggregation depends on the measurement goals. The aggregated value, which is a score for the quality factor, is also mapped to the ranked score to provide qualitative information (labelled ranked scores).
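    The two aggregation variants mentioned above, a simple mean and a weighted mean, can be sketched as follows. The function name, the strategy names and the numeric scores are hypothetical and serve only as an illustration; the mapping of the result to a labelled ranked score is not shown because its details are not given here.

```python
def quality_factor_score(ranked_scores, weights=None):
    """Weighted mean of ranked scores; equal weights give the simple mean."""
    if weights is None:
        weights = {name: 1.0 for name in ranked_scores}
    total_weight = sum(weights[name] for name in ranked_scores)
    weighted_sum = sum(ranked_scores[name] * weights[name]
                       for name in ranked_scores)
    return weighted_sum / total_weight


# Hypothetical ranked scores of two detection strategies.
scores = {"FeatureEnvy": 4, "OtherStrategy": 8}
simple_mean = quality_factor_score(scores)
weighted_mean = quality_factor_score(
    scores, weights={"FeatureEnvy": 1, "OtherStrategy": 3})
```

    Here the simple mean is (4 + 8) / 2 and the weighted mean is (4 · 1 + 8 · 3) / 4, which shows how the choice of weights shifts the score towards the strategies considered more important for a given measurement goal.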

    4. Hierarchical Model

    The Factor-Strategy model overcomes major problems of the classical solutions but still has a few drawbacks. The first doubt refers to the completeness of the suite of strategies: they need to be configured for every anomaly, so even the biggest set of strategies does not cover all possible flaws.

    The second weakness concerns limiting the data sources to metrics only. As noted in [23], anomalies typically require multi-criteria detection mechanisms, including data from dynamic execution, the configuration management repository, analysis of Abstract Syntax Tree patterns etc. Ratiu and others [20, 14, 15] demonstrated the usefulness of historical data for quality evaluation. Van Emden [22] and Baxter [1] presented examples of how Abstract Syntax Trees (ASTs) could be exploited as a source of quality-related data. An extended spectrum of sensor types, embedded into the Factor-Strategy model, may improve its sensitivity, accuracy and correctness.


    The final remark refers to the fact that operators used for defining detection rules are strict, i.e. they define a borderline, which may classify very similar entities into different categories. Given that the borderline is set arbitrarily, it can significantly affect the results of evaluation.

    The goal of this research is to develop a hierarchical model which tackles the mentioned problems and weaknesses. It extends the Factor-Strategy model mainly in two areas:

    1. diverse data sources are used instead of metrics only,

    2. a simple mechanism for dealing with fuzzy problems is proposed.

    4.1. Structure of the Model

    The structure of the hierarchical model and its relation to the Factor-Strategy approach is shown in Fig. 1. At the top of the model there are high-level quality criteria (or characteristics), which are combined with detected lower-level patterns and rule violations. Pattern and rule detection methods are supported by data coming from various data sources, e.g. metrics, historical data, results of dynamic behaviour and abstract syntax trees (AST), which improves the accuracy of the detection mechanism.

    The model schema shows a hierarchy of elements, but also a hierarchy of information. The evaluation criteria provide the most abstract and the most aggregated information. A designer can track down the hierarchy to get more detailed information and find the cause of a problem indicated by the criteria.

    4.2. Analysis of Detection Rules and Design Principles

    Detection strategies, which are the core part of the original Factor-Strategy model, are configurable sets of rules aiming at capturing violations of the well-known principles of design, based on quantified data. However, actual design anomalies present in code do not always match the predicted and configured set of strategies. They can also violate multiple principles concurrently or, on the other hand, remain ignored by existing strategies.

    The analysis mechanism present in the hierarchical model can be divided into three parts:

    1. new data selection approach,

    2. metrics quantisation,

    3. entity-level aggregation.

    4.2.1. Data Selection

    Classical quality models employ a set of selected metrics for the evaluation of a quality factor (or factors). For example, a model presented by Briand et al. in [4] is built upon metrics which are supposed to measure coupling, inheritance, polymorphism and size, and is oriented towards fault-proneness prediction. Instances of Detection Strategies in [17] also consist of diverse sets of metrics.

    The model presented in this section promotes a different approach. Typically, behind every principle of software design an internal quality characteristic is present. Based on this observation, the selection of metrics should be strictly oriented on such a characteristic. On the other hand, the selected metrics should be simple, suitable and adequate in the context of the measured characteristic. As a consequence, some types of metrics should be avoided:

    1. strongly aggregating measures, like COF (Coupling Factor defined by Abreu et al. in [9]), which are biased by the compensation problem: some parts of a highly-coupled design can be masked by parts which are loosely-coupled,

    2. metrics which are ambiguously defined, or those capturing ambiguous concepts; Khaled El-Emam in [10] argues that the notion of cohesion is too general to provide significant results,

    3. metrics which try to capture multiple characteristics at a time or appear not related to the expected characteristic, e.g. Basili et al. in [5] argue that the WMC metric actually measures software size instead of complexity.

    Following the postulate of diverse data sources, the model creation process should incorporate as many sources as are needed to increase interpretability of the results. New patterns and existing strategies may be built with an extended spectrum of data coming from new sources.

    [Figure: data sources (AST, metrics, historical data, dynamic behaviour); data gathering; pattern detection and rules analysis; combination; high-level quality criteria]

    Figure 1. Hierarchical quality model

    4.2.2. Metrics Quantization

    As pointed out by Marinescu in [17], a simple vector of metric values is not very useful, because there is no clear connection between measures and quality factors. In other words, such values require proper interpretation. The method presented below provides a new interpretation mechanism for metrics, so that violations of rules can be detected and presented to the designer in an intuitive way. In the context of the violated rules, we require an answer to the question: is the value of a metric unacceptable and, in consequence, does the measured characteristic have a negative impact on quality? The simplest solution introduces a threshold: if a value of a metric exceeds the threshold, then the measured attribute is considered to negatively impact the quality. The domain of the metric is divided into two intervals, which can be labelled as “negative impact” and “no impact”. Thus, the labels provide an interpretation for metric values.

    However, strict threshold values are inflexible, because values close to the threshold can be interpreted incorrectly in certain contexts. To provide a simple fix for that, the strict threshold value can be replaced with an additional interval representing the uncertainty. Values which fall into this interval should be analysed separately or supported by other data sources for correct classification.

    Having considered these arguments, we can define three classes (intervals) of the attribute domain:

    1. L – a value of a metric is unambiguously acceptable, and the measured attribute has no or negligible negative impact on quality,

    2. M – a value of a metric is near the threshold; additional analysis is required or other data sources should be explored,

    3. H – a value of a metric is unambiguously unacceptable, and the measured attribute has a negative impact on quality.

    We can formally define the labelling phase in the following way:

    1. E – a set of analysed entities, for example classes or packages,

    2. M – a set of all metrics suitable for the constructed model,

    3. L – a set of all labels which identify classes of impact,

    4. P – a set of all principles considered in the model,

    5. m – a metric (e.g. CBO),

    6. m(e), e ∈ E – a value of metric m for entity e.

    $$mli_{e,m} = \alpha_m(m(e)), \quad e \in E,\ m \in M,\ mli_{e,m} \in L. \qquad (1)$$

    The function described by formula (1) maps a value of a metric m, measured for entity e, to a label mli¹. As an effect, a numerical value delivered by a metric is replaced by a higher-level label, which is already interpreted from the quality point of view.

    The entire effort in the construction of this part of the model must be devoted to defining the α function. For the basic version of the model (with three classes) at least one threshold value with a surrounding interval must be defined. The crucial step deals with the identification of a threshold and a width of the interval.

    The quantised metric – the labelled value – is only the very first and preliminary interpretation step. This information is valuable in a larger context, thus labelled metrics should be utilised in compound patterns and strategies.
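    A possible shape of the α function for the basic three-class version of the model is sketched below. It is only an illustration under explicit assumptions: higher metric values are taken to be worse, the uncertainty interval is assumed to be symmetric around the threshold, and the name `make_alpha`, the threshold of 10 and the half-width of 2 are hypothetical, not values proposed for CBO.

```python
def make_alpha(threshold, half_width):
    """Return a labelling function for one metric (higher values are worse)."""
    low, high = threshold - half_width, threshold + half_width

    def alpha(value):
        if value < low:
            return "L"   # unambiguously acceptable
        if value > high:
            return "H"   # unambiguously unacceptable
        return "M"       # near the threshold: needs further analysis

    return alpha


# Hypothetical configuration and metric values for three entities.
alpha_cbo = make_alpha(threshold=10, half_width=2)
cbo_values = {"Order": 4, "Invoice": 11, "Report": 17}
labels = {entity: alpha_cbo(value) for entity, value in cbo_values.items()}
```

    Widening the interval moves more entities into class M, which defers their classification to additional analysis or other data sources; a half-width of zero reduces the scheme to a plain threshold, with M assigned only to a value exactly equal to it.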

    4.2.3. Entity-level Aggregation

    Some of the characteristics and mechanisms which constitute the basis for the rules of good design are so complicated that many supporting data sources are needed to capture all aspects and variations of those characteristics (e.g. coupling can be divided into import and export). Therefore, an aggregation function over a set of quantised metrics and other data sources has to be engaged to answer the question: does a compound attribute, expressed by a set of input data, have a negative impact on quality? Let us define:

    1. M_p – a set of metrics to express principle p, in other words, a set of metrics suitable for detection of violations of the principle,

    2. A_{e,p} – a set of all additional pieces of information, extracted from the other data sources (not metrics), for entity e and principle p,

    3. $M_{e,p} = \{(m, mli_{e,m}) : e \in E,\ p \in P,\ mli_{e,m} \in L,\ \forall (m \in M_p)\ mli_{e,m} = \alpha_m(m(e))\}$ – a set of pairs: a metric with its assigned label; the label is assigned according to formula (1); the set is evaluated for all metrics referring to principle p and calculated for entity e.

    $$pli_{e,p} = \beta_p(M_{e,p}, A_{e,p}), \quad e \in E,\ p \in P,\ pli_{e,p} \in L. \qquad (2)$$

    The function defined by formula (2) aggregates a set of labelled metrics and additional information into a label pli², which denotes the impact of the underlying characteristic on quality. The aggregation defined by formula (2) may also be realized as a classifier³. Assuming that labels l ∈ L denote classes, the classifier built for a specific principle p will assign a class l to an entity e. The meaning of the aggregated label or class can be generalised as follows: label l ∈ L denotes the strength of the negative impact of an attribute upon quality.
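    One way to realise the aggregation of formula (2) is a small set of decision rules. The sketch below is purely illustrative: the rule itself (take the worst metric label, and let additional information settle an uncertain M), the key `frequently_changed` and the metric names are assumptions made for this example, not part of the model definition.

```python
SEVERITY = {"L": 0, "M": 1, "H": 2}


def beta(labelled_metrics, extra):
    """Aggregate metric labels and additional information into one label."""
    worst = max(labelled_metrics.values(), key=SEVERITY.get)
    # Hypothetical rule: an uncertain result is settled by additional,
    # non-metric information (here: whether the entity changes often).
    if worst == "M" and extra.get("frequently_changed", False):
        return "H"
    return worst


# Hypothetical labelled metrics for one entity and one principle.
pli_order = beta({"import_coupling": "M", "export_coupling": "L"},
                 {"frequently_changed": True})
pli_invoice = beta({"import_coupling": "M", "export_coupling": "L"}, {})
```

    In this reading the metric labels play the role of M_{e,p} and the dictionary of extra facts plays the role of A_{e,p}; a trained classifier could replace the hand-written rule without changing the interface.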

    The aggregation step requires careful interpretation of the collected results, especially in the case of compound characteristics. To sum up the above considerations:

    1. well-known principles of software design are always based upon an internal quality characteristic,

    2. such characteristics can be decomposed into elements which can later be evaluated with data coming from diverse data sources. The collected results are useful for detecting violations of principles,

    ¹ Metric-level impact.
    ² Principle-level impact.
    ³ For example, using decision rules or trees.

  • Hierarchical Model for Evaluating Software Design Quality 27

    3. aggregated results say nothing about the quality characteristic they are based on, but provide information about the negative impact of a measured attribute on quality.

    The label evaluated by formula (2) denotes impact, but does not identify a violation of a principle. To define a violation, let us assume:

    1. $VL_p$ – a set of labels which are treated as a violation of principle $p$,
    2. $V_p$ – a symbol of a violation of rule $p$.

    $$pli_{e,p} \in VL_p \Rightarrow V_p, \quad e \in E,\ p \in P. \tag{3}$$

    Definition. If the aggregated label pli for a characteristic supporting principle p, for the analysed entity e, belongs to the set VL_p, then the entity is flawed by a violation of rule p. This definition is captured by formula (3).

    The detected violations can be scored and ranked just like Detection Strategies. As a consequence, the presented method can be consistently combined with the methods presented in the Factor-Strategy model.
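The chain of formulas (1) to (3) can be read as a small pipeline: quantise every metric, aggregate the labels together with the additional information, then test membership in the violation set. The following Python sketch illustrates that reading only; the names `Label`, `evaluate_principle` and `two_level`, the metric names `m1` and `m2`, and the threshold 10 are hypothetical and do not come from the paper.

```python
from enum import IntEnum
from typing import Callable, Dict, Set, Tuple


class Label(IntEnum):
    """Ordered labels; a higher value means a stronger negative impact (assumed ordering)."""
    L = 0
    M = 1
    H = 2


Alpha = Callable[[float], Label]                              # quantisation, formula (1)
Beta = Callable[[Dict[str, Label], Dict[str, object]], Label]  # aggregation, formula (2)


def evaluate_principle(
    metric_values: Dict[str, float],
    extra_info: Dict[str, object],
    alphas: Dict[str, Alpha],
    beta: Beta,
    violation_labels: Set[Label],
) -> Tuple[Label, bool]:
    """Return the principle-level label and whether it counts as a violation."""
    # Formula (1): label every metric value with its own quantisation function.
    labelled = {name: alphas[name](value) for name, value in metric_values.items()}
    # Formula (2): aggregate the labelled metrics and the additional information.
    pli = beta(labelled, extra_info)
    # Formula (3): membership in VL_p signals a violation.
    return pli, pli in violation_labels


def two_level(threshold: float) -> Alpha:
    """Hypothetical quantisation: H above the threshold, L otherwise."""
    return lambda value: Label.H if value > threshold else Label.L


# Hypothetical principle with two made-up metrics, aggregated with max.
pli, violated = evaluate_principle(
    metric_values={"m1": 3, "m2": 12},
    extra_info={},
    alphas={"m1": two_level(10), "m2": two_level(10)},
    beta=lambda labelled, extra: max(labelled.values()),
    violation_labels={Label.H},
)
print(pli.name, violated)
```

Passing the quantisation and aggregation functions as parameters mirrors the per-metric index of α and the per-principle index of β; a classifier-based β, as mentioned in the text, would fit the same signature.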

    5. Example of Application

    This section walks through the instantiation of a fragment of the hierarchical model. The scope of the example is narrowed to the elements which constitute the novelty of the model: the rule analysis method with metrics quantization and aggregation. The instantiated model will be applied to exemplary entities.

    5.1. Model Creation

    5.1.1. Goals

    The first step of model creation is the selection of the quality characteristic to be evaluated. The following activities, like the selection of principles and metrics, are performed in the context of the high-level quality goal. For the purpose of this example, readability of code and design (to which analysability and understandability are closely related) is selected as the goal and high-level quality factor.

    5.1.2. Principles

    The concept of coupling is considered to be a good predictor of quality. El-Emam in [10] provides evidence that high coupling makes programs hard to understand. The rule of low coupling, identified by Coad and Yourdon in [7], is selected as the design principle used as the quality criterion in this example. Hence, let us define a set of principles P = {LowCoupling}.

    5.1.3. Data Sources

    For the purpose of coupling measurement, the metrics Ca and Ce, defined by Robert Martin in [18], are used. The metrics count incoming (Ca, afferent) and outgoing (Ce, efferent) couplings separately, and will be applied at class level. Additional information, based on the abstract syntax tree, is defined as a flag indicating whether an entity (a class in this case) is abstract. Let us assume:

    1. $M = M_{LowCoupling} = \{Ca, Ce\}$ – the set of all metrics is actually the set of metrics for the design principle LowCoupling, because only one design principle is considered,

    2. $A = \{IsAbstract\}$ – additional information from a non-metrics source.

    5.1.4. Definition of Quantization and Aggregation

    As noted in [10], citing [19], a human can cope with 7±2 pieces of information at a time. We use this observation as a threshold for the above-selected coupling measures. For quantization purposes, let us define:

    1. $L = \{L, M, H\}$ – the basic set of labels,

    2. $\alpha_{Ce}(Ce(e))$:

    $$mli_{e,Ce} = \begin{cases} L, & Ce(e) < 5 \\ M, & Ce(e) \in [5, 9] \\ H, & Ce(e) > 9 \end{cases} \tag{4}$$

    3. $\alpha_{Ca}(Ca(e))$:

    $$mli_{e,Ca} = \begin{cases} L, & Ca(e) < 5 \\ M, & Ca(e) \in [5, 9] \\ H, & Ca(e) > 9 \end{cases} \tag{5}$$
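Formulas (4) and (5) use the same thresholds, so a single quantisation function can serve both metrics. The sketch below is only an illustration: the function name `quantise` and its `low` and `high` parameters are hypothetical, and the boundaries 5 and 9 are taken from the 7±2 observation above.

```python
def quantise(value: int, low: int = 5, high: int = 9) -> str:
    """Map a coupling count to a label following formulas (4) and (5)."""
    if value < low:
        return "L"   # fewer than 5 couplings
    if value <= high:
        return "M"   # within the closed interval [5, 9]
    return "H"       # more than 9 couplings


# Boundary values around the 7±2 interval.
for count in (4, 5, 9, 10):
    print(count, quantise(count))
```

Keeping the thresholds as parameters makes it easy to experiment with other intervals without changing the labelling logic.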

  • 28 Paweł Martenka, Bartosz Walter

    The model is oriented toward detection of violations, so the simple max function will be used for aggregation, assuming that labels are ordered from the lowest value L to the highest H. Martin in [18] argues that classes should depend upon the most stable of them (e.g. on abstract classes), so if a class is abstract then export coupling (Ca) is not taken into consideration. The aggregation function $\beta_{LowCoupling}(M_{e,LowCoupling}, A_{e,LowCoupling})$ is defined as follows:

    $$pli_{e,LowCoupling} = \begin{cases} mli_{e,Ce}, & IsAbstract(e) \\ \max\{mli_{e,Ce}, mli_{e,Ca}\}, & \text{otherwise} \end{cases} \tag{6}$$

    Finally, let us define the violation:

    1. $VL_{LowCoupling} = \{M, H\}$ – a set of labels indicating violations of the LowCoupling rule; label M is also included to capture entities which probably violate the rule,

    2. $V_{LowCoupling}$ – a symbol which denotes a violation of the LowCoupling rule,

    3. $(pli_{e,LowCoupling} \in VL_{LowCoupling}) \Rightarrow V_{LowCoupling}$ – the definition of a LowCoupling rule violation.

    5.2. Application

    The model will be applied to sample data taken from a student project, listed in Table 1. All classes are large (from 384 to 477 lines in a file) and probably flawed in many aspects. Results generated by the model are compared to results gathered in a survey conducted among graduate software engineering students (students were asked to identify classes that are too large).

    The quantized metrics and additional data for all entities:

    1. $M_{DisplayManager,LowCoupling} = \{(Ce, H), (Ca, M)\}$
    2. $M_{AmeChat,LowCoupling} = \{(Ce, H), (Ca, H)\}$
    3. $M_{DrawableGroup,LowCoupling} = \{(Ce, L), (Ca, H)\}$
    4. $A_{DisplayManager,LowCoupling} = \{IsAbstract = False\}$
    5. $A_{AmeChat,LowCoupling} = \{IsAbstract = True\}$
    6. $A_{DrawableGroup,LowCoupling} = \{IsAbstract = False\}$

    Results of aggregation of quantized metrics:

    1. $pli_{DisplayManager,LowCoupling} = \max\{H, M\} = H$
    2. $pli_{AmeChat,LowCoupling} = mli_{AmeChat,Ce} = H$
    3. $pli_{DrawableGroup,LowCoupling} = \max\{L, H\} = H$

    According to the previous definitions of violations, all entities violate the principle of low coupling and negatively affect the high-level quality criterion.
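The aggregation of formula (6) and the violation test can be replayed on the labelled metrics listed above. This is a minimal sketch, not the authors' tooling: the names `ORDER`, `aggregate_low_coupling` and `violates_low_coupling` are hypothetical, and the label ordering L < M < H is the one assumed in the text.

```python
# Label ordering assumed by the max-based aggregation: L < M < H.
ORDER = {"L": 0, "M": 1, "H": 2}
# VL_LowCoupling = {M, H}
VIOLATION_LABELS = {"M", "H"}


def aggregate_low_coupling(labelled: dict, is_abstract: bool) -> str:
    """Formula (6): ignore export coupling (Ca) for abstract classes."""
    if is_abstract:
        return labelled["Ce"]
    return max(labelled["Ce"], labelled["Ca"], key=ORDER.__getitem__)


def violates_low_coupling(pli: str) -> bool:
    """Violation definition: the aggregated label belongs to VL_LowCoupling."""
    return pli in VIOLATION_LABELS


# Labelled metrics and IsAbstract flags as listed for the three sample classes.
SAMPLE = {
    "DisplayManager": ({"Ce": "H", "Ca": "M"}, False),
    "AmeChat": ({"Ce": "H", "Ca": "H"}, True),
    "DrawableGroup": ({"Ce": "L", "Ca": "H"}, False),
}

results = {name: aggregate_low_coupling(m, a) for name, (m, a) in SAMPLE.items()}
for name, pli in results.items():
    print(name, pli, violates_low_coupling(pli))
```

For contrast, a hypothetical abstract class with Ce labelled L and Ca labelled H would be aggregated to L and would not be reported, which is the effect the IsAbstract flag is meant to have.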

    5.2.1. Interpretation

    The high-level quality goal, readability, is not evaluated because there are too few entities to get a relevant output. Let us assume that the high-level factor indicates a problem in the software. The first step is to look for strategies and principles which support the factor, and choose only those with current negative consequences. The second step is to look for entities (suspects) which negatively impact the factor in the context of the chosen principle (or strategy). In this particular example there are only three classes and all of them are suspects due to violations of the principle.

    The violation in DisplayManager results from the metric Ce, labelled with H, and Ca, labelled with M. Considering the definition of Ce, DisplayManager suffers mainly from import coupling, and moderately from export coupling. Respondents classified DisplayManager as Middle Man and Large Class, and the model results can indicate causes of these smells.

    Table 1. Sample data

    Class            Ce   Ca   mli_{e,Ce}   mli_{e,Ca}   IsAbstract
    DisplayManager   13    8   H            M            False
    AmeChat          14   35   H            H            True
    DrawableGroup     4   14   L            H            False

    AmeChat is an abstract class, so it is to be expected that it is used by many other classes. In consequence, only import coupling is considered, so the impact results from Ce, despite the high value of Ca. The vast majority of the respondents identified the Large Class smell, which can be connected with high import coupling.

    DrawableGroup uses a desirable number of classes (Ce is labelled L), but is used in many other places. The majority of the respondents identified Refused Bequest in the class. This smell deals with inheritance, which is not considered in this model. The obtained results indicate other, coupling-related problems which probably cannot be named as a defined smell.

    6. Summary

    The proposed hierarchical model extends the Factor-Strategy model in three ways. It delivers more comprehensive and traceable information concerning detected potential anomalies to the designer, including the interpretation of metrics values, and also broadens the spectrum of analysed data sources to the non-metric ones. As the simple example suggests, these elements help in discovering new types of anomalies and also support the designer in evaluating the impact, scope and importance of the violation. It also delivers hierarchically structured data justifying the suspected flaws, and includes an uncertainty interval. Therefore, the model more closely resembles the human way of cognition.

    Further directions of research include an experimental validation of the model, defining detection strategies utilizing data from heterogeneous data sources, and also embedding internal design characteristics into the model.

    References

    [1] I. D. Baxter, A. Yahin, L. Moura, M. Sant’Anna, and L. Bier. Clone detection using abstract syntax trees. In ICSM ’98: Proceedings of the International Conference on Software Maintenance, page 368, Washington, DC, USA, 1998. IEEE Computer Society.

    [2] L. C. Briand, J. W. Daly, and J. K. Wüst. A unified framework for cohesion measurement in object-oriented systems. Empirical Software Engineering, 3(1):65–117, 1998.

    [3] L. C. Briand, J. W. Daly, and J. K. Wüst. A unified framework for coupling measurement in object-oriented systems. IEEE Transactions on Software Engineering, 25:1, 1999.

    [4] L. C. Briand, W. L. Melo, and J. Wüst. Assessing the applicability of fault-proneness models across object-oriented software projects. Technical report, ISERN, 2000.

    [5] L. C. Briand, S. Morasca, and V. R. Basili. Property-based software engineering measurement. IEEE Transactions on Software Engineering, 22:68–86, 1994.

    [6] S. R. Chidamber and C. F. Kemerer. A metrics suite for object oriented design. IEEE Transactions on Software Engineering, 20(6):476–493, 1994.

    [7] P. Coad and E. Yourdon. Object Oriented Design. Prentice Hall, 1991.

    [8] K. Dhambri, H. A. Sahraoui, and P. Poulin. Visual detection of design anomalies. In 12th European Conference on Software Maintenance and Reengineering 2008, pages 279–283, April 2008.

    [9] F. B. e Abreu and R. Carapuça. Object-oriented software engineering: Measuring and controlling the development process. In Proceedings of the 4th International Conference on Software Quality, 1994.

    [10] K. E. Emam. Advances in Software Engineering, chapter Object-Oriented Metrics: A Review of Theory and Practice, pages 23–50. 2002.

    [11] M. Fowler. Refactoring. Improving the Design of Existing Code. Addison-Wesley, 1999.

    [12] M. Fowler, K. Beck, J. Brant, W. Opdyke, and D. Roberts. Refactoring: Improving the Design of Existing Code. Addison-Wesley, 1999.

    [13] R. Glass. On design. Journal of Systems and Software, 52(1):1–2, May 2000.

    [14] T. L. Graves, A. F. Karr, J. Marron, and H. Siy. Predicting fault incidence using software change history. IEEE Transactions on Software Engineering, 26:653–661, 2000.

    [15] T. M. Khoshgoftaar, E. B. Allen, R. Halstead, G. P. Trio, and R. M. Flass. Using process history to predict software quality. Computer, 31:66–72, 1998.

    [16] S. C. Kothari, L. Bishop, J. Sauceda, and G. Daugherty. A pattern-based framework for software anomaly detection. Software Quality Control, 12(2):99–120, 2004.

    [17] R. Marinescu. Measurement and Quality in Object-Oriented Design. PhD thesis, “Politehnica” University of Timişoara, 2002.

    [18] R. Martin. OO design quality metrics. An analysis of dependencies. Report on Object Analysis and Design, 2(3), 1995.

    [19] G. Miller. The magical number seven, plus or minus two: Some limits on our capacity for processing information. The Psychological Review, (63):81–97, 1956.

    [20] D. Ratiu, S. Ducasse, T. Gîrba, and R. Marinescu. Using history information to improve design flaws detection, 2004.

    [21] F. Simon, F. Steinbrückner, and C. Lewerentz. Metrics based refactoring. In Proceedings of the 5th European Conference on Software Maintenance and Reengineering, pages 30–38, 2001.

    [22] E. van Emden and L. Moonen. Java quality assurance by detecting code smells. In Proceedings of the 9th Working Conference on Reverse Engineering, 2002.

    [23] B. Walter and B. Pietrzak. Multi-criteria detection of bad smells in code with UTA method. In Proceedings of XP 2005 conference, pages 154–161, 2005.

  • e-Informatica Software Engineering Journal, Volume 4, Issue 1, 2010

    Pattern-Based Software Architecture for Service-Oriented Software Systems

    Claus Pahl∗, Ronan Barrett∗
    ∗School of Computing, Dublin City University

    [email protected], [email protected]

    Abstract

    Service-oriented architecture is a recent conceptual framework for service-oriented software platforms. Architectures are of great importance for the evolution of software systems. We present a modelling and transformation technique for service-centric distributed software systems. Architectural configurations, expressed through hierarchical architectural patterns, form the core of a specification and transformation technique. Patterns on different levels of abstraction form transformation invariants that structure and constrain the transformation process. We explore the role that patterns can play in architecture transformations in terms of functional properties, but also non-functional quality aspects.

    1. Introduction

    The development of distributed software systems based on service architectures is rapidly gaining momentum. Service-oriented architecture (SOA) is emerging as a new design paradigm and conceptual framework for distributed service-centric software systems, supported by platforms such as the Web Services Framework (WSF) [2]. Services are reusable software components that are explicitly described, published and provided at fixed locations. Due to the ubiquity of the Web, the WSF platform and SOA paradigm play a major role for software systems.

    In service-centric distributed environments such as the Web services platform, which allows services to be invoked using Internet protocols, a notion of workflow processes is central to capturing service composition and interaction between services. We present techniques to support, firstly, modelling of services and service-oriented processes and, secondly, property-preserving transformations of service-oriented architectures. In contrast to a variety of architecture approaches that focus primarily on static, structural properties, we concentrate on dynamic dependencies in the form of interaction processes between services. Our solution is an approach to the architectural transformation of services, supporting the evolution of service-oriented architectures. Three aspects characterise our approach:

    – Architecture modelling using hierarchical patterns. A three-layered architecture model addresses different levels of abstraction. Each layer is supported by a pattern-based modelling approach for service processes. A service-oriented architectural configuration notation that combines patterns and process behaviour in architectures forms the backbone. Patterns enhance reuse in SOA.

    – Property-preserving architectural transformation. Based on the configuration notation as the abstract description language for source and target architectures, a transformation technique is developed. Patterns are considered as characteristics of a service architecture that are, due to the implied

  • 32 Claus Pahl, Ronan Barrett

    reliability and maintainability, worth being preserved in transformations.

    – Distribution and quality-of-service. We investigate the role of distribution for modelling and look at functional and non-functional service properties. The integration of quality aspects into modelling is important for the services platform, where providers and users are usually from different organisations.

    We address the lack of behaviour and quality aspects in service-oriented architectural transformations. Our patterns capture essential behavioural service dependencies in the form of interaction process patterns and link these to quality properties. We utilise patterns to capture these properties and allow these properties to be preserved in transformations by identifying patterns as invariants. Formality is required to obtain unambiguous models of process-based service architectures and to complement modelling by analysis and reasoning facilities. Architectural change and integration require a technique for process-oriented property-preserving transformations.

    We introduce our architecture model and transformation technique in Section 2. Pattern-based architecture modelling and specification, supported by the architecture configuration notation, is addressed in Section 3. Architectural transformations are defined in Section 4. Finally, we discuss related work and end with conclusions.

    2. Architecture Model and Specification

    Based on background definitions of service and software architecture, we now define the principles of our architecture model and the core notation.

    2.1. Service-oriented Architecture

    The objective of software architecture is the separation of computation and communication. Architectures are about components (i.e. loci of computation) and connectors (i.e. loci of communication). Various architecture description languages (ADL) and modelling techniques have been proposed [17]. An architectural model captures common concepts in architectural description: components provide computation, interfaces provide access and connectors provide connections between components. In service architecture, the main emphasis is on the composition of services to workflow processes and on the overall configuration of services and service processes. For instance, [10] uses scenarios (descriptions of interactions of a user with a system) to operationalise requirements and map these to a system architecture. We extend the notion of interaction and also consider system-internal interactions and allow interaction processes to be composite.

    We focus on service architectures, i.e. service-oriented software architectures, here. A service is usually defined as a coherent set of operations provided at a certain location [2]. A service provider makes an abstract interface description available, which can be used by potential service users to locate and invoke this service. The Web Service platform provides description languages (WSDL) and invocation protocols (SOAP) for this purpose. Services are often used ‘as is’ in single request-response interactions. More recently, research has focused on the composition of services to processes [2]. Orchestration is the prevalent form of service composition. Existing services can be reused to form business or workflow processes. The principle of architectural composition that we look at here is process assembly.

    2.2. An Architectural Configuration Notation

    At the core of our architecture modelling and transformation technique is a conceptual architecture model. The objective of this conceptual architecture model is to capture the core layering and structuring principles of service-oriented architectures. The conceptual service architecture model (SAM), tailored towards the needs of service- and process-oriented platforms,

  • Pattern-Based Software Architecture for Service-Oriented Software Systems 33

    shall address the different abstraction levels and perspectives in service-oriented architectures:

    – Reference architectures are high-level specifications representing common structures of architectures specific to a particular domain or platform.

    – Architectural design patterns are medium-scale patterns, usually referred to as design patterns or architectural frameworks.

    – Workflow patterns are process-oriented patterns that represent common data exchange-oriented workflow processes in an application domain.

    Based on the architecture model, we define a notation for architectural specification, the service-oriented architectural configuration notation (SAC), that has features of an abstract architectural description language (ADL). Two elements define our transformation technique: a description notation to capture architectural properties, and rules and techniques for transformation.

    Various formal approaches to the representation of processes have been suggested in the past, e.g. [6] using Petri nets. Process calculi such as the π-calculus [15, 13] are suitable frameworks for architectural configurations of service- and process-centric systems, i.e. support of modelling and transformation, due to their abstraction from service implementation and their focus on interaction processes. The π-calculus, a calculus for mobile processes, is particularly useful due to a similarity between mobility and evolution (both are about changes of a service in relation to its neighbourhood), which helps us to support architectural transformations. Our notation is defined in terms of the π-calculus [15], but we want to firstly provide a less mathematical syntax and, secondly, allow the addition of further combinators to express workflow and design patterns. A simulation notion captures property-preservation and permitted structure and behaviour variations during transformation.

    Our notation consists of process activities, combinators and abstractions, which are summarised in Fig. 1. The basic element describing process activity is an action. Actions π are combined to service process expressions. Actions of a service are primitive processes divided into invocations and activations. An invocation inv x(y) by a client of a service connects to the remote service via channel x, passing y as a parameter. Activations comprise the receive rcv x(a), by which a provider accepts a request from other services, and the dual reply rep x(b), with channel x and parameters a and b. Based on actions, process combinators are basic forms of workflow patterns. Sequences are represented as P1; P2: process P1 is executed and the system transfers to P2, where the next action is executed. Exclusive choice means that one Pi (i = 1, . . . , n) from choice (P1, . . ., Pn) is chosen. Multi-choice mchoice (P1, . . ., Pn) allows any number of the processes Pi (i = 1, . . . , n) to be chosen and executed in parallel. Iteration repeat (P) executes process P an arbitrary number of times. Parallel composition par (P1, . . ., Pn) executes the processes Pi concurrently. A(a1, . . . , an) = PA is a process abstraction, where PA is a process expression and the ai are free variables in PA. A variable is introduced using let x = π in P. Inaction is denoted by 0.

    Actions:
      π ::= inv x(y)            Invocation
            rcv x(a)            Activation – Receive
            rep x(b)            Activation – Reply

    Processes – workflow combinators:
      P ::= π                   Action
            P1; P2              Sequential Composition
            par (P1, P2)        Parallel Composition
            repeat (P)          Iteration
            choice (P1, P2)     Exclusive Choice
            mchoice (P1, P2)    Multi-Choice

    Processes – other constructs:
      P ::= let x = π in P      Variable
            0                   Inaction

    Abstraction:
      A(a1, . . . , an) = PA    with a1, . . . , an free in PA

    Figure 1. Syntactical Definition of the SAC Notation
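To make the structure of the notation more tangible, the syntax can be mirrored by a small abstract syntax tree. The Python sketch below covers actions, sequential composition, the named combinators and inaction; the let construct and abstractions are omitted for brevity. All class names, the `render` function and the example process are hypothetical illustrations and are not part of the SAC definition.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Action:
    """Primitive action: kind is "inv", "rcv" or "rep"."""
    kind: str
    channel: str
    param: str


@dataclass(frozen=True)
class Seq:
    """Sequential composition P1; P2."""
    first: "Process"
    second: "Process"


@dataclass(frozen=True)
class Combinator:
    """Named combinator: op is "par", "repeat", "choice" or "mchoice"."""
    op: str
    args: Tuple["Process", ...]


@dataclass(frozen=True)
class Inaction:
    """The inactive process 0."""


Process = Union[Action, Seq, Combinator, Inaction]


def render(p: Process) -> str:
    """Print a process term in the concrete syntax used in the text."""
    if isinstance(p, Action):
        return f"{p.kind} {p.channel}({p.param})"
    if isinstance(p, Seq):
        return f"{render(p.first)}; {render(p.second)}"
    if isinstance(p, Combinator):
        return f"{p.op} ({', '.join(render(a) for a in p.args)})"
    return "0"


# Hypothetical process: invoke on x, then receive and reply on a concurrently.
example = Seq(
    Action("inv", "x", "y"),
    Combinator("par", (Action("rcv", "a", "b"), Action("rep", "a", "c"))),
)
print(render(example))
```

Immutable dataclasses are used so that process terms can be compared and shared safely, which is convenient when patterns are later matched against parts of a larger term.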

    The semantics is defined in terms of the π-calculus [15], by mapping constructs directly to π-calculus constructs. The actions are defined in terms of send x̄〈y〉 (for invocation inv and reply rep) and receive x(y) (for receive rcv) of the π-calculus. Combinators are defined through their π-calculus counterparts, except multi-choice mchoice (P1, P2), which is defined as choice (P1, P2, par (P1, P2)), essentially an exclusive choice among the parallel compositions of the non-empty subsets of the mchoice argument list. The abstraction is the π-calculus abstraction.
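The definition of multi-choice in terms of choice and par can be sketched as a small expansion function. The two-argument case follows the definition given in the text; generalising it to n arguments and leaving out the empty subset are assumptions made for this illustration, as are the function name `expand_mchoice` and the tuple encoding of process terms.

```python
from itertools import combinations


def expand_mchoice(*processes):
    """Rewrite mchoice (P1, ..., Pn) as an exclusive choice over every
    non-empty subset of the arguments, run in parallel.

    Terms are encoded as nested tuples ("choice", alternatives) and
    ("par", members); a single process stands for itself.
    """
    alternatives = []
    for size in range(1, len(processes) + 1):
        for subset in combinations(processes, size):
            # A singleton subset needs no parallel composition.
            alternatives.append(subset[0] if size == 1 else ("par", subset))
    return ("choice", tuple(alternatives))


# Two-argument case from the text: choice (A, B, par (A, B)).
print(expand_mchoice("A", "B"))
```

With n arguments the expansion yields 2^n - 1 alternatives, so the desugared term grows exponentially; this may be one reason to keep mchoice as a separate combinator in the surface notation.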

    3. Pattern-Based Service Architecture Modelling

    The architectural configuration notation SAC enables the modelling of pattern-based service architecture configurations.

    3.1. Patterns and Abstraction Levels

    Architectural and design patterns are recurring solutions to software design problems [7]. Although originally proposed for obj