
Project Deliverable D6.2-4

Domain specific guidelines and cookbook for the telecom, industry and enterprise domain

Project name: Q-ImPrESS
Contract number: FP7-215013
Project deliverable: D6.2-4: Domain specific guidelines and cookbook for the telecom, industry and enterprise domain
Author(s): Heiko Koziolek, Hergen Oltmann, Ivan Skuliber, Juan Carlos Flores Beltran, Klaus Krogmann, Marco Masetti, Marijan Zemljic, Michael Hauck
Work package: WP6
Work package leader: SFT
Planned delivery date: M36
Delivery date: 20.01.2011
Last change: 20.01.2011
Version number: 1.0
Dissemination level: public

Abstract

Domain specific guidelines and cookbook documents are meant to deal with the particularities of applying the Q-ImPrESS method in a given domain.

Keywords: Cookbook, guideline, report, end-user domain




Revision history

Version  Change date  Author(s)                 Description
0.1      29.09.2010   M. Masetti                Template
0.11     03.12.2010   H. Koziolek               Drafted ABB section
0.2      22.12.2010   M. Zemljic, I. Skuliber   Drafted ENT section
0.3      22.12.2010   J. C. Flores Beltran      Drafted ITE section
0.4      17.01.2011   M. Masetti                Added chapter 5
0.5      18.01.2011   M. Masetti                Included review comments by FZI
0.6      20.01.2011   H. Koziolek               Bibliography added
1.0      20.01.2011   M. Hauck                  Final version


Table of contents

1 Introduction
  1.1 Purpose
  1.2 Scope
  1.3 References
2 The telecom domain
  2.1 Introduction
  2.2 The target system
  2.3 Guidelines and best practices
    2.3.1 Choosing a system for modelling
    2.3.2 Choosing the model grain
    2.3.3 Modeling the system
    2.3.4 Performing black-box measurements for performance predictions
    2.3.5 Annotating a black-box model for performance predictions
    2.3.6 Performing availability measurements for availability predictions
    2.3.7 Annotating a black-box for availability predictions
  2.4 Efforts and potential risks
    2.4.1 Efforts for applying Q-ImPrESS
    2.4.2 Potential risks
  2.5 Conclusions
3 The industrial automation domain
  3.1 Introduction
  3.2 Characteristics of software in industrial automation
  3.3 The target system
  3.4 Guidelines and best practices
    3.4.1 Typical evolution scenarios
    3.4.2 Best practices for data collection
  3.5 Efforts, risks and pitfalls
    3.5.1 Efforts for applying Q-ImPrESS
    3.5.2 Risks and pitfalls
  3.6 Checklist
  3.7 Conclusions
4 The enterprise domain
  4.1 The target system
  4.2 Guidelines and best practices
    4.2.1 Typical evolution scenarios
    4.2.2 Best practices for data collection
  4.3 Efforts, risks and pitfalls
    4.3.1 Efforts for applying Q-ImPrESS
    4.3.2 Potential risks
5 Conclusions


1 Introduction

1.1 Purpose

The purpose of this document is to collect guidelines and best practices identified and adopted by the Q-ImPrESS industrial project partners in their respective domains (namely: enterprise, telecom and industrial automation). The aim is to ease the adoption of the Q-ImPrESS tools and methods in order to perform a quality assessment of targeted service oriented platforms in the respective domains.

1.2 Scope

This document describes a practical application of the Q-ImPrESS method outlined in the D6.1 deliverable [1], using the set of tools described in the D6.1 Annex "Tool Manuals" [2].

1.3 References

[1] "D6.1: Method and abstract workflows documentation"
[2] "D6.1 Annex: Tool Manuals"
[3] Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides, "Design Patterns: Elements of Reusable Object-Oriented Software", Addison-Wesley, 1995
[4] Thomas Erl, "Service-Oriented Architecture", Prentice Hall, 2004
[5] Martin Fowler, "Patterns of Enterprise Application Architecture", Addison-Wesley, 2002
[6] Gregor Hohpe and Bobby Woolf, "Enterprise Integration Patterns", Addison-Wesley, 2004
[7] "D8.6: Enterprise SOA Showcase"
[8] Raj Jain, "The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling", Wiley-Interscience, 1991
[9] Anne Martens, Heiko Koziolek, Lutz Prechelt, and Ralf Reussner, "From Monolithic to Component-based Performance Evaluation of Software Architectures: A Series of Experiments Analysing Accuracy and Effort", Empirical Software Engineering, p. TBD, 2011


2 The telecom domain

This section introduces the telecommunications domain and provides a set of guidelines and best practices for any future user of Q-ImPrESS tools and methods in this domain. The section is organized as follows. Section 2.1 provides a short overview of the domain and describes typical research and development challenges found in current telecommunication networks. Section 2.2 summarizes information on a typical telecom target system and briefly recalls the ENT demonstrator, which was already described in detail in deliverable D7.1. Section 2.3 presents several guidelines formed upon our experience of applying the Q-ImPrESS tools and methods to the demonstrator. Efforts and potential risks are discussed in Section 2.4. Finally, Section 2.5 concludes the telecommunications domain section.

2.1 Introduction

The software systems of the telecom domain are primarily characterised by high availability, as a system is expected to provide continuous, uninterrupted service regardless of possible hardware or software failures. Telecom systems are commonly distributed and geographically highly dispersed, making them very complex (see Figure 1). All this also demands high performance, so that the nodes are able to perform real-time processing of the diverse services realised upon the telecommunication infrastructure.

Figure 1: Telecom equipment

In order to assure relevant performance and availability characteristics, the design of almost every telecommunication system is preceded by prototyping. Following well-established best practices, the prototyping activities at Ericsson Nikola Tesla are commonly performed in a few separate stages. An idea of a solution to be prototyped is brought forth and clearly formulated in the inception stage. In the feasibility stage, available technologies for project realisation are surveyed and their applicability assessed in the most meticulous manner possible. Finally, the prototype is implemented during the execution stage. Such a prototyping procedure is in line with common practices in the telecommunication domain in general.


The prototyping activities of Ericsson Nikola Tesla's Research department are commonly executed by a team of up to 10 software engineers supervised by 1 or 2 senior engineers. It is generally good practice to comply with a widely adopted methodology for project management, e.g. Scrum.

As already mentioned, prototyping tries to determine the quality attributes of the forthcoming system. In the case of unsatisfactory results, an alternative approach to system design may be taken. There are a number of different metrics being assessed, but a few provide a simple prediction of the forthcoming system's quality of service (QoS).

System performance is usually assessed by measuring response time and throughput. The response time refers to the time needed for the system to generate the output data. In this domain, systems have to be powerful enough to respond within a few milliseconds. Throughput is characterised by the amount of data that a system can handle within a certain time slot. A telecommunications system commonly needs to be able to process hundreds of requests or calls per second. Moreover, we take into consideration the property of capacity, which in this particular case can be thought of as guaranteed throughput.

System availability is expressed as the percentage of uptime in a given year, where uptime means that the system is providing full service to its users. That percentage is usually calculated over a number of years, but can be estimated through the much simpler metric of failover time. The failover time is the time during which a system (or an application or a service) switches over automatically to a redundant or standby node upon a failure or abnormal termination on the previously active node. Combined with the number of failures, failover time can be used for a rough estimate of system availability.

In existing systems, the mentioned metrics are measured by a range of proprietary tools, which can be either intrusive or non-intrusive. Intrusive QoS assessment tools are embedded into the source code at certain points, providing the requested measures for certain system modules. Non-intrusive QoS assessment tools make measurements only upon the system's output data. An example of the latter is Wireshark (http://www.wireshark.org/), which can be used to capture live data for later offline analysis. An example of using Wireshark is given in Figure 2. Regardless of whether the measurements are intrusive or non-intrusive, the collected data provides invaluable input for determining the performance and availability characteristics of a system.



Figure 2: Capturing live data with Wireshark network analyser

Until recently, both product design and prototyping in the telecom industry were done according to the prevalent vertical industry model. This meant that network equipment and service providers would build an entire system (or a prototype) by relying solely on their own expertise and investing only their own resources. This makes the prototyping and development process quite time-consuming and expensive. On the other hand, such a process guarantees the quality of the final product, which is of utmost importance for the telecom domain.

However, rapid technology growth and continuously increasing market demands have compelled telecom companies to drastically shorten their development times and cut costs. Such new circumstances could obviously result in error-prone and unreliable products. As delivering error-prone products is not acceptable in the telecom domain and would entail a drop in market position, telecom companies seek processes and methods to retain the established quality level of the services they provide.

One of the proposed solutions for adapting to these new circumstances is the shift from the vertical to the horizontal model, in which system developers can utilize 3rd party commercial off-the-shelf (COTS) building blocks while building highly available deployment-ready systems. However, a couple of new problems arise. On the one hand, there is the question of interoperability between legacy and non-legacy systems that need to be integrated within a final solution. On the other hand, there are potential performance and availability issues arising from such integration. It can be argued that the integration of legacy and non-legacy systems is itself the cause of the mentioned issues.


It can also be argued that perhaps only non-legacy systems should be developed, abandoning legacy systems altogether. However, this is strongly discouraged, as the core functionality of legacy systems is well proven and more than satisfactory. Also, redesigning such systems anew ("transferring legacy systems into completely new non-legacy systems") would require even more resources than integrating non-legacy and legacy systems. As an illustrative example of legacy software systems in other domains, consider the COBOL-based core systems in the finance industry.

Based upon current challenges in the telecom industry, we defined a demonstrator for the Q-ImPrESS project. The focus of the demonstrator was on the integration of legacy and non-legacy systems that perform core telecom functionalities. All relevant experiences and lessons learned are presented in the following sections of this chapter.

2.2 The target system

The integration of legacy and non-legacy systems, accompanied by the issues described in the introduction of this chapter, is the main focus of the telecommunications demonstrator which was developed for validating the Q-ImPrESS method and tools. The demonstrator utilizes 3rd party COTS components for establishing the software logic of its non-legacy part, and integrates that non-legacy part with existing legacy systems. The non-legacy subsystem provides a service to the legacy system when the legacy system asks for it, as depicted in Figure 3.

Figure 3: Extending the legacy system with a new functional service

The two main constituent parts of the demonstrator are the call control nodes (the legacy systems) and the DIAMETER extension (the non-legacy subsystem), which are elaborated in the following text.

The call control nodes are common core network legacy systems running software that was written in the recent past as well as software written a few decades ago. Most of the call control software is written in proprietary programming languages that facilitate the creation of reliable and highly available telecom networks. Call control is executed on proprietary hardware consisting of many multiprocessor embedded subsystems. The first call control node represents an access network where a call is generated. The second call control node represents the core network, which provides the basic service of call control (routing the call toward another end-point). This node is extended with authentication, authorization and auditing (AAA) functionality. The AAA functionality is realised through the DIAMETER protocol, a widely accepted and standardized protocol for AAA.

The DIAMETER extension is the non-legacy part of the demonstrator. It is built according to Service Oriented Architecture principles, utilizing existing COTS components, and is used by the call control node as a service. The DIAMETER extension is written entirely in C (with some simple C++ constructs) and runs on common PC hardware with Linux installed as the operating system. Due to the specifics of standard Internet protocols (packet based protocols, verbosity, text-based hierarchical data organization, etc.) the extension would be very hard to implement natively in the call control nodes.


Figure 4 depicts the conceptual architecture of the telecommunication demonstrator. At the bottom of the figure lies the Ericsson call control legacy system. It is endowed with the AAA functionality by means of the DIAMETER protocol, implemented as a non-legacy extension consisting of two separate PC clusters, one running the DIAMETER client and the other the DIAMETER server.

Figure 4: Conceptual software architecture of Ericsson Nikola Tesla's demonstrator

It is important to note that existing proprietary solutions for the implementation of AAA functionality work quite well if all of the used network equipment is manufactured by the same vendor. However, the interoperation of proprietary solutions is undermined if the deployed network equipment is supplied by different vendors. On the other hand, an AAA solution based on standards, like DIAMETER, offers an advantage in terms of network equipment interoperability.

2.3 Guidelines and best practices

This section provides a set of guidelines recommended for any future user of Q-ImPrESS tools and methods. These guidelines are formed based upon our experience of modelling the demonstrator in the Q-ImPrESS tools and applying the Q-ImPrESS method in order to obtain information on the potential behaviour of the demonstrator.


2.3.1 Choosing a system for modelling

It is generally equally acceptable to model both legacy and non-legacy systems. In the demonstrator, we focused on the non-legacy subsystem and its connection to the legacy system, because this part of the overall demonstrator system had many unanswered questions when it comes to performance and availability characteristics. We did not pay much attention to the legacy part of the demonstrator, because we have broad and deep knowledge of our legacy systems and their characteristics. Additionally, it must be considered that telecommunication core systems are programmed in low-level languages, such as assembly, C, or proprietary languages similar to C. Thus, they cannot be reverse-engineered to a meaningful and consistent model. The inability to reverse-engineer white-boxes entails that the system must be modelled as a set of black-boxes. Nevertheless, the availability of source code enables the employment of intrusive measurement tools for assessing the black-boxes' performance, along with the non-intrusive ones. With the application of measurements, black-boxes actually become grey-boxes (their relevant quality attributes become known).

It is important to note that a component can be modelled as a black-box only if it performs a single functionality. This guideline is applicable to both legacy and non-legacy parts of the system. If a component does not satisfy this requirement, it must be logically split into parts, each performing only one functionality. Thus, a component performing multiple functionalities becomes a collection of black-boxes where each black-box represents exactly one functionality of the component.

A further suggestion is that the system to be modelled is deployed on dedicated hardware and network resources. In such a deployment scenario, there is only interaction among the system's components, which means that the system can be modelled in a simpler manner. However, if this is not the case, i.e. if resources are shared (functional collocation) between the modelled system and some other system, this other system should either also be modelled or its properties assessed and taken into consideration while performing measurements of the target system.

2.3.2 Choosing the model grain

The possible modelling grain size lies somewhere in the spectrum between coarse and fine grain. Coarse grain models are maximally abstracted and consist of a minimal set of black-boxes. This means that many lower level functionalities and interactions are ignored and the focus is put only on the most important functionalities and interactions (e.g. a client-side of some service, a server-side of the same service, a load balancer, etc.). This approach enables fast creation of system models, but it requires meticulous and diligent planning in order to capture all relevant properties of the system to be modelled. It is important to fully understand what is happening "under the hood" and whether there are interactions among lower level functionalities that cannot be ignored when abstracting to the high-level functionality grain (if there are, then there is a significant possibility that the abstracted model does not encapsulate all the relevant system properties and will thus result in improper predictions for some or all quality attributes).

Fine grain models, on the other hand, are minimally abstracted or not abstracted at all. They contain many black-boxes, where each black-box performs a specific lower-level functionality. The modelling itself thus requires a considerable amount of time, as such models usually need to reflect the structure of the source code. Because of this, annotating such black-boxes is a rather simple process: certain portions of code are measured for specific quality attributes, and the results of those measurements characterise the corresponding black-boxes in the model.


However, there is an inconvenience in such modelling, as there may be hundreds or even thousands of constituent black-boxes, all of which need to be annotated and their annotations properly aggregated in order to incorporate them in the Q-ImPrESS tools.

2.3.3 Modeling the system

Q-ImPrESS tools can be used for modelling and predicting a system's performance, reliability and maintainability attributes. It can be argued that each of these attributes requires a different model of the system. However, our tests on the demonstrator show that this does not have to be the case. Quite the contrary: a model can be created for one quality attribute and afterwards annotated with information about other quality attributes. Q-ImPrESS tools support this reasoning by sharing the models pertaining to the system's Service Architecture among the performance, reliability and maintainability prediction tools. Also, from our perspective, it makes sense to annotate a single logical structure of a system with different quality attributes. In the following text we provide guidelines for system modelling on a high abstraction level, gathered from our experiences on the ENT demonstrator.

SAM Repository

Each black-box component model should have a clear interface definition which closely resembles the interface of the corresponding component in the actual system (the abstraction level of modelling must be taken into consideration). If this condition is satisfied, then each physical component from the actual system should be mapped to one primitive component in the SAM Repository.

SEFF Repository

The behavioural description of a black-box component should be minimalistic, expressed only through activities visible from the system level perspective (i.e. interface operations).

For distributed systems containing multiple components, interface calls between components should be modelled with external call entities. If network influence is neglected in predictions, then the transitions (external calls) pertaining to response messages can be neglected as well.

Modelling of load balancing behaviours is possible only with probabilistic branch transitions, which enable mutually exclusive behaviours to co-exist inside one SEFF diagram.
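To give an intuition for what such a probabilistic branch expresses, the following minimal sketch simulates a load balancer dispatching requests to two mutually exclusive behaviours with fixed probabilities. The branch names and probabilities are illustrative; in practice the branch is of course specified in the SEFF model rather than in code.

```python
import random
from collections import Counter

# Illustrative branch probabilities for two mutually exclusive behaviours,
# e.g. a load balancer forwarding each request to one of two server instances
BRANCHES = [("server_a", 0.5), ("server_b", 0.5)]

def dispatch():
    """Pick exactly one branch, as a probabilistic branch transition does."""
    behaviours, weights = zip(*BRANCHES)
    return random.choices(behaviours, weights=weights, k=1)[0]

# Over many calls the observed split approaches the modelled probabilities
print(Counter(dispatch() for _ in range(10_000)))
```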

Hardware Repository

In general, a hardware repository is a collection of common hardware descriptors (processor, memory and network resources) which are used to specify the actual hardware elements in the target environment. The definition of at least one processor and one memory resource descriptor is mandatory. This is true even if memory consumption is not taken into account during actual predictions.

For systems running on dedicated network resources, it is possible to omit the network throughput influence in simulations. This simplifies the modelling procedure at the expense of a possible difference between experimental and simulation results.


Whether the network resources should be modelled or not depends on the network traffic caused by the modelled system (in our case that network traffic was significantly smaller than the throughput of the used dedicated network).

Target Environment

For each node, the minimal resource configuration includes processor, memory, and container definitions. Containers, which are in fact abstractions for the service execution environment, must have their resources allocated from the pool of available node resources. If the actual system is deployed on dedicated hardware resources, only one container per node is defined, with all available node resources allocated.

Just as in the hardware repository, the definition of at least one processor and one memory resource is mandatory. If memory consumption is to be omitted from predictions, a possible workaround is to leave all memory related parameters at their default values.

Service architecture model

For each primitive component defined in the repository model there is an equivalent subcomponent instance definition which represents the actual implementation of the component. These subcomponent instances are used to model the actual system architecture using connectors. For a system modelled as a set of black-boxes, connections between subcomponent instances should follow the same logic as the associations between software components in the repository model.

For each subcomponent instance there is a corresponding service definition which represents its workload. In order to model dedicated resources one execution container is allocated to each service defined in the model.

At least one system level interface must be defined in the model. This interface represents the access point for the system and is connected to one of the available subcomponent instances with connectors. For a coarse-grained level of modelling, the modelled system interfaces should resemble the actual system interfaces. E.g. if there is a component representing a single point of contact in the actual system, its interface should be exposed as a system interface in the model.

QoS Annotations

The QoS annotation repository holds the annotations of black-box components used in performance and reliability predictions. The values of the annotation parameters are obtained with the measurement and transformation procedures described in subsequent sections.

Usage model

If only a part of the actual system is of interest for inspection, the usage model provides the opportunity to model the other parts of the system as a source of traffic (i.e., a usage profile) for the part under inspection. This was actually utilized while modelling the ENT demonstrator: since we were not interested in analysing the properties of the legacy system, it was modelled only in the usage model, as a traffic source for the non-legacy extension.


2.3.4 Performing black-box measurements for performance predictions

Once the system has been appropriately partitioned into a number of black-box components, it is necessary to provide the performance annotations for each of them. This is achieved by measuring each black-box's performance properties, either in an intrusive or a non-intrusive manner. Intrusive measurement is simpler, as it only requires injecting measurement code snippets into specific points of the source code. However, this type of measurement can negatively affect system performance, and consequently the relevance of the measurement results, especially if the system was modelled at the fine grain level. Non-intrusive measurement, on the other hand, does not impact the overall system's performance at all, but its setup is significantly more time-consuming than the intrusive one. There are several ways of performing non-intrusive measurements, namely by utilising a network monitoring tool (e.g. Wireshark), the performance counters provided by virtually all operating systems, or the system's trace logs. The choice between intrusive and non-intrusive measurements depends on many factors, such as the abstraction level, the availability of source code, and the availability of developers for implementing intrusive measurements.

A common way of assessing a black-box component's performance is by measuring the response time consumed by processing certain types of messages. An example is the processing of authentication request and authentication answer messages, which are characteristic of a specific black-box component in an arbitrary AAA system. For high performing real-time systems in the telecom domain, the processing time of a single message is usually too short to obtain a precise measurement result. In such a case, it is advisable to send a number of messages of the same type for processing, so that the average processing time can be calculated by dividing the measured total time by the number of sent messages.

While doing performance measurements of a black-box component, the response time of each activity should be measured separately, i.e. independently of other activities that may be part of a protocol on a wider level. Thus, it is important to reduce or, if possible, eliminate the impact on the measurement coming from any activity other than the one in focus. E.g., if a protocol is defined so that request messages are always succeeded by response messages, the processing time for each of them should be regarded separately. If a component is implemented in such a way that these two messages can be processed concurrently, this is a rather aggravating circumstance. One way of resolving this issue is to employ a proxy component that blocks all incoming messages of a type other than the one in focus until the black-box component has finished processing all messages of the type with respect to which its performance is being measured. Figure 5 illustrates a possible deployment of proxy components for measurement purposes.

Figure 5: Possible deployment of proxy nodes for measurements


In this specific case, the proxy components represent a synchronization point for all request/answer messages arriving at the measured component. Once all requests are collected on the proxy, the measurement is started and the requests are simultaneously passed to the measured component for processing. The respective answer messages are blocked on the other proxy, thereby preventing their influence on the measurement. When all request messages are processed, the measurement of the request processing activity is finished. With this approach, the proxy component not only decouples the request and answer processing activities on the measured component, but also removes the influence of other components in the system on the current measurement. The obtained measurement data is then transformed and configured in the model as described in the following section.
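In practice, the averaging described above reduces to timing a batch of messages and dividing by the batch size. The following minimal sketch illustrates the idea; `send_batch` is a hypothetical callable standing in for whatever mechanism (e.g. the proxy setup of Figure 5) submits the messages and waits for their completion:

```python
import time

def measure_processing_time(send_batch, batch_size, repetitions):
    """Estimate the per-message processing time of a black-box component.

    send_batch  -- hypothetical callable that submits `batch_size` messages
                   of one type to the measured component and blocks until
                   all of them have been processed
    Returns the average processing time per message in milliseconds.
    """
    total_ms = 0.0
    for _ in range(repetitions):
        start = time.perf_counter()
        send_batch(batch_size)  # one measurement run
        total_ms += (time.perf_counter() - start) * 1000.0
    # average over all runs and all messages in a batch
    return total_ms / (repetitions * batch_size)
```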

2.3.5 Annotating a black-box model for performance predictions

After the performance measurement data have been collected, they need to be mathematically transformed into input parameters suitable for the Q-ImPrESS model. In the first stage, the measurement data (i.e., the response time consumed by processing certain types of messages) is transformed into a message processing speed for each message type processed on the respective component. For measured response times expressed in milliseconds, the calculation can be carried out using Formula (1):

    V = \frac{N \cdot S}{\sum_{i=1}^{N} T_i} \cdot 1000 \quad [\mathrm{req/sec}] \qquad (1)

where S is the number of messages sent for processing, Ti the execution time of measurement i, and N the total number of measurements. Intuitively, the shorter the time required to process a single message, the larger S should be.

In the second stage, the calculated values are configured in the Q-ImPrESS service architecture models, namely the target environment and the QoS annotation models. For each component, the processing speed value of one message type is chosen as the input value for the CPU clock frequency parameter in the target environment model. In the QoS annotations, the CPU resource consumption is then set as 1 clock cycle for the chosen message type and as a relative value for the other message types. An example of the procedure is given for the authorization request and authorization answer messages characteristic of AAA systems. If the processing speed of the authorization request message is chosen as the input value for the clock frequency parameter and set as 1 clock cycle of CPU resource consumption, then the input value for the CPU resource consumption of the authorization answer message is calculated as VAR/VAA, where VAR is the calculated processing speed of the authorization request messages and VAA is the calculated processing speed of the authorization answer messages. It is important to emphasize that the described modelling of performance parameters applies only to components which are defined and modelled as black-boxes and for which dedicated resources are allocated in the target environment model.
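As a concrete illustration of this two-stage transformation, the sketch below first applies Formula (1) and then derives the CPU clock frequency and relative resource demands for the authorization request/answer example. All measurement values and variable names are illustrative, not taken from the demonstrator:

```python
def processing_speed(batch_size, run_times_ms):
    """Formula (1): messages processed per second.

    batch_size   -- S, number of messages sent per measurement run
    run_times_ms -- [T_1 .. T_N], measured run times in milliseconds
    """
    n = len(run_times_ms)
    return (n * batch_size) / sum(run_times_ms) * 1000.0

# Stage 1: processing speeds per message type (illustrative measurements)
v_ar = processing_speed(1000, [210.0, 198.0, 205.0])  # authorization request
v_aa = processing_speed(1000, [390.0, 410.0, 402.0])  # authorization answer

# Stage 2: configure the model.  The request processing speed becomes the
# CPU clock frequency in the target environment; the request demand is set
# to 1 cycle and other message types get relative demands.
cpu_clock = v_ar          # clock frequency parameter
demand_ar = 1.0           # 1 clock cycle for the chosen message type
demand_aa = v_ar / v_aa   # V_AR / V_AA, as described above
```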

2.3.6 Performing availability measurements for availability predictions

Obtaining relevant information for the reliability annotations of an arbitrary software component depends mainly on the ability to track its failures within a given period of time. For newly developed components such data is usually unavailable and must be estimated using alternative approaches.


In the telecom domain, which is primarily characterized by highly available (HA) systems, one of the main quality attributes negotiated in service level agreements is service availability. High service availability depends on the ability to shorten the period of service unavailability in failure situations (i.e., on decreasing repair time). Relevant information on service availability is usually acquired over long time periods, but a simple HA indicator for estimating the HA potential of the system can be obtained by measuring service outage times in failure situations. Since the majority of HA systems are based on redundancy, the service outage time is in fact obtained by measuring the service transfer time from the active to the standby service node. A common way of obtaining the failover time is to crash the relevant process on the active component and measure the time needed for the service to become available on the standby component. If the service availability primarily depends on the availability of the network connection, information on the service transfer time can be obtained by utilizing network analysing programs such as Wireshark. The procedure is illustrated in Figure 6.

Figure 6: Procedure for obtaining the service transfer time

First, a service process is terminated abruptly on the active component. This event is recognised by the reference component running the network analyser as a termination of the transport connection. The underlying HA mechanisms on the service components ensure that the service is migrated to the standby component, which at this point assumes the active role. The standby component re-establishes the transport connection with the reference component, and the service transfer procedure is over. In this simplified scenario, the service transfer time is calculated as the time between connection termination and connection re-establishment. The obtained service failover time measurement can be used to estimate the component's availability, as described in the following section.
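The calculation itself is trivial once the two timestamps have been read off the trace. The sketch below assumes a chronologically ordered list of (timestamp, event) pairs extracted from a capture; the event labels are hypothetical placeholders for whatever the analyser shows as connection termination and re-establishment:

```python
def failover_time(events):
    """Estimate the service transfer (failover) time from trace timestamps.

    events -- chronologically ordered (timestamp_seconds, label) pairs,
    where "teardown" marks the loss of the transport connection to the
    active node and "reestablish" marks the new connection from the
    standby node.  The labels are illustrative.
    """
    t_down = next(t for t, label in events if label == "teardown")
    t_up = next(t for t, label in events if label == "reestablish" and t > t_down)
    return t_up - t_down

# Example with values as they might be read from a Wireshark capture
trace = [(12.304, "teardown"), (14.851, "reestablish")]
print(failover_time(trace))  # ~2.55 s service outage
```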

2.3.7 Annotating a black-box for availability predictions

As stated earlier, the annotation data pertaining to a given black-box is in fact historical data. These data include bug/issue-tracker database data for each system release, which must be transformed into a statistical probability of failure per service call. Alternatively, the required reliability data can be derived from the system's availability measurements. Information on availability can either be obtained from historical data as well, or estimated via measured failover times, as described in the previous section. There is usually no opportunity to trace system behaviour over a long-term period in order to obtain the attribute values relevant for system availability assessment.


However, some properties are usually given by the manufacturer or are available as a result of previous research involving the same or a similar system. Namely, the node reboot period and the node downtime per year are usually obtainable as ready figures. Furthermore, in modern telecommunication environments regular maintenance procedures do not obstruct the service, as the functionality of the node being maintained is temporarily delegated to another node. Thus, the yearly maintenance time can be omitted from the calculation of node downtime per year, so the downtime includes only unplanned occurrences. The number of node failures per year and the unplanned node downtime per year are in a relation expressed by Formula (2):

    T_{downtime} = N_{failures} \cdot T_{reboot} \qquad (2)

Typically, for telecommunications systems, which are characterised by availability as high as 99.999% and a reboot time of up to 10 minutes, the unplanned node downtime per year is approximately 0.5 h. From the other perspective, if the unplanned node downtime per year is known, then a system component's availability can be assessed via Formula (3):

    A_C = \frac{8760\,\mathrm{h} - T_{downtime}}{8760\,\mathrm{h}} \qquad (3)

Assuming that system components are modelled as black-boxes and that some component performs activities with corresponding availabilities A1…An, it typically cannot be determined to what extent each of these availabilities influences the overall component availability AC. Therefore, it makes sense to proclaim all activity availabilities equal, i.e. A1 = A2 = … = An = A. The overall component availability can thus be calculated according to Formula (4):

    A_C = \prod_{i=1}^{n} A_i = A^n \qquad (4)

If activity failure and activity unavailability are regarded as identical, the failure of a black-box's activity is the complement of its availability, as shown in Formula (5):

    f = 1 - A \qquad (5)

Formulas (4) and (5) imply Formula (6), which defines the black-box's activity failure as a function of the availability of the whole component comprising n activities:

    f = 1 - \sqrt[n]{A_C} \qquad (6)

Activity failure probabilities serve as QoS annotations for the reliability model.

It is common for HA telecommunications systems to reinforce their availability by employing availability management mechanisms. These are usually realised by introducing a redundant node for each system component. The redundant standby node takes over the primary node's functionality in case of its failure. It thus makes sense to incorporate the failover time in the system reliability analysis. The failover time usually includes migrating the IP address to the standby node as well as the additional procedures of re-establishing communication functionality by informing other network nodes of the IP address change.


In such a configuration, the component availability AC can be expressed in the manner represented by Formula (7):

    A_C = \frac{MTTF}{MTTF + MTTR} \qquad (7)

where MTTF is the Mean Time To Failure and MTTR the Mean Time To Repair. MTTF can be calculated from the yearly component failure rate (cf. Formula (2)) using Formula (8):

    MTTF = \frac{8760\,\mathrm{h}}{N_{failures}} \qquad (8)

A typical MTTF value for HA telecommunication systems is in the range of thousands of hours.
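The following sketch chains the calculations of Formulas (3) to (8) for a hypothetical component; the numeric figures (0.5 h unplanned downtime per year, 3 failures per year, 4 activities, 10-minute repair time) are illustrative only:

```python
HOURS_PER_YEAR = 8760.0

def availability_from_downtime(downtime_h):
    # Formula (3): availability from unplanned downtime per year
    return (HOURS_PER_YEAR - downtime_h) / HOURS_PER_YEAR

def activity_failure_probability(a_component, n_activities):
    # Formulas (4)-(6): equal per-activity availabilities A = A_C^(1/n),
    # failure probability f = 1 - A
    return 1.0 - a_component ** (1.0 / n_activities)

def mttf_from_failures(failures_per_year):
    # Formula (8): mean time to failure from the yearly failure rate
    return HOURS_PER_YEAR / failures_per_year

# Illustrative figures: 0.5 h unplanned downtime/year, 3 failures/year,
# a component modelled with 4 black-box activities
a_c = availability_from_downtime(0.5)        # ~0.999943
f = activity_failure_probability(a_c, 4)     # ~1.4e-5 failure per service call
mttf = mttf_from_failures(3)                 # 2920 h, "thousands of hours"
a_redundant = mttf / (mttf + 10.0 / 60.0)    # Formula (7) with a 10 min MTTR
```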

2.4 Efforts and potential risks

This section brings forward the efforts for applying Q-ImPrESS in the telecommunications domain (Section 2.4.1) and discusses the potential risks during the process (Section 2.4.2).

2.4.1 Efforts for applying Q-ImPrESS

In order to provide relevant input parameters for a cost/benefit analysis of a system, it is necessary to quantify the efforts that need to be invested in order to gain a satisfactory result from the utilisation of the system. The same logic applies not only to systems but also to new development (business) processes and methods such as those encompassed by Q-ImPrESS. Unfortunately, precise quantification of the benefits obtained by utilizing model-driven prediction methods is a rather difficult and time-consuming task which usually extends over long time periods. As such, a detailed benefit analysis is not feasible and is out of scope for this project. Thus, the following analysis concentrates only on the efforts required for correct utilisation of the Q-ImPrESS prediction tools. The obtained values can be compared with the efforts of traditional methods, which in the telecommunications domain usually involve prototyping activities accompanied by quality analysis based on measurements and estimations.

Table 1 shows effort estimations for the application of the Q-ImPrESS tools in the telecommunications domain. The "ENT" column indicates the actual duration assessment of the ENT developers' efforts on the ENT demonstrator. It should be pointed out that the figures in Table 1 correspond to the high-level modelling abstraction applied on the ENT demonstrator and should therefore not be considered universal for the complete range of possible telecommunication based projects. Also, due to the potentially high variability of effort durations it is very hard, if not impossible, to provide unified values which would be applicable to all telecommunication projects. Thus, Table 1 also provides estimations of the best, likely and worst possible durations. But even these values should be used with reservation, as more experience with the method is needed to provide more accurate values.


Table 1: Effort estimations for applying Q-ImPrESS in the telecommunications domain (in person hours). The Best, Likely and Worst columns give the estimated potential duration; the ENT column gives the actual duration on the ENT demonstrator.

#   Activity                                      Best  Likely  Worst   ENT
1   Preparation activities
      Study Q-ImPrESS documentation                 20      40     70    60
      Learn Q-ImPrESS tools                          8      16     24     8
      Gather information on data collection         12      16     36    32
      Interview developers/architects                8      12     16    16
      Subtotal                                      48      74    124   116
2   Measurements for performance modelling
      Design performance measurements               16      24     50    40
      Select tools                                   2       6     12     4
      Install and setup testbed system               3       6      8     8
      Configure load generators                      1       2      3     2
      Conduct measurements                          16      24     48    44
      Analyse data                                   4      14     32    20
      Derive resource demands                        1       4      8     6
      Validate data                                  4       8     12     7
      Subtotal                                      47      88    173   131
3   Estimations for availability modelling
      Analyse literature                             8      16     32    24
      Gather usable availability information         6      18     30    22
      Design availability measurements               4       6      8     8
      Select tools                                   2       6      8     4
      Install and setup testbed system               4      10     16     7
      Conduct measurements                           2       4      8     6
      Analyse data                                   4      16     24    20
      Derive failure probabilities                   1       4      8     6
      Validate data                                  4       8     12     7
      Subtotal                                      35      88    146   104
4   Estimations for maintainability modelling
      Analyse literature                            10      24     50    26
      Gather usable maintainability information      8      16     40    14
      Derive maintainability effort estimations      2       4      8     4
      Validate data                                  1       3      8     4
      Subtotal                                      21      47    106    48
5   Manual modelling of main alternative
      Decide on model granularity                   24      40    100    64
      Model system components                        2       4      8     2
      Model system behaviour                         2       6      8     3
      Model system hardware                          1       1      1     1
      Model system assembly                          2       3      4     3
      Model system deployment                        1       2      3     2
      Model system usage                             2       4      6     3
      Model quality annotations                      1       3      6     2
      Validate models                                2       4      6     3
      Subtotal                                      37      67    142    83
6   Modelling of evolution scenarios
      Decide on model granularity                    1       2      4     1
      Model system components                        1       2      4     1
      Model system behaviour                         1       2      4     1
      Model system hardware                          1       2      3     1
      Model system assembly                          1       3      6     1
      Model system deployment                        1       2      3     1
      Model system usage                             2       4      6     2
      Model quality annotations                      4       6      8     3
      Validate models                                1       2      3     2
      Subtotal                                      13      25     41    13
7   Performance prediction analysis
      Configure QoS annotations                      2       2      2     2
      Run simulation                                 2       4      6     2
      Analyse results                                2       4      8     2
      Subtotal                                       6      10     16     6
8   Availability prediction analysis
      Configure QoS annotations                      2       2      2     2
      Run simulation                                 2       4      6     2
      Analyse results                                2       4      8     2
      Subtotal                                       6      10     16     6
9   Maintainability prediction analysis
      Specify change requests                        2       2      2     2
      Specify change workplans                       2       4      6     2
      Analyse results                                2       4      8     2
      Subtotal                                       6      10     16     6
10  Trade-off analysis
      Obtain prediction results                      6      12     18     6
      Specify preferences                            2       5     10     2
      Analyse results                                3       5     10     3
      Subtotal                                      11      22     38    11

The table shows that relatively high initial efforts are required for gathering knowledge about the Q-ImPrESS method and tools. Also, according to our experience, most of the time is spent on data collection for a specific quality prediction, while the actual tool configuration and the simulations require less effort. It is also worth noticing that modelling the main alternative requires significantly more effort than modelling the evolution scenarios (i.e., the alternatives). That is because the tool automates some tasks which would otherwise be done manually. For example, there is no need for a user to re-define all components while modelling an evolution scenario, but only those which are new for the scenario. That is why the initial modelling of the main alternative should be done with great care, in order to prevent the propagation of modelling mistakes to the other alternatives as well.

The estimated cumulative effort for all Q-ImPrESS activities on the ENT demonstrator was 504 hours (approximately 3 person months). That is in line with the duration of prototyping activities at ENT in general, which usually take up to 12 person months. However, it should again be emphasised that the presented values correspond to the high-level modelling abstraction and cannot be considered universal for all types of projects within the telecommunication domain.

2.4.2 Potential risks

It is advisable to recognise and keep in mind the following risks when performing model-driven prediction methods; this way the required effort can be minimised and possible pitfalls avoided. Future users of Q-ImPrESS in the telecom domain should take these potential risks into consideration before applying the method and tools:

Considerable learning effort required: The Q-ImPrESS method initially requires considerable learning effort, so utilisation of the tools pays off only for large, more complex systems.

Limited support for availability predictions: Telecommunication systems are primarily characterized by their high-availability (HA) attributes. Traditionally, these attributes are assured through well-established development processes and the utilisation of proven design rules. While Q-ImPrESS offers some support for evaluating the availability characteristics of evolving software systems, this feature is still too immature to be applied at a production level.

Inability to model more complex usage scenarios: Q-ImPrESS tools are well suited for predicting performance attributes in near steady states. However, they lack support for modelling more complex usage scenarios in which the usage profile changes over time, which is characteristic of telecommunication systems.

2.5 Conclusions

The recommended domain-specific guidelines, formed from our experiences with the telecom demonstrator, can be a valuable asset for any future user of Q-ImPrESS in the telecom domain and beyond. It should be emphasized that most of the guidelines correspond to the high-level modelling abstraction applied to the ENT demonstrator and should therefore not be considered universal for the complete range of possible telecommunication projects.

Applying the Q-ImPrESS model-based prediction method gave us valuable insight into the effort required for each Q-ImPrESS activity. The relatively high initial effort required for learning Q-ImPrESS is offset by the fact that this one-time activity is not repeated in subsequent analyses. It is difficult to conclude whether the exhibited efforts are acceptable for potential users, as this depends on many case-specific factors and can differ greatly from case to case; further studies should therefore be made to quantify the actual savings achieved by applying Q-ImPrESS. Nevertheless, the presented data can serve as a rough guideline for any future user of Q-ImPrESS.

Regarding the telecom domain, future improvements of the Q-ImPrESS tools should cover the availability prediction mechanisms, which are at this point still too simplistic for application at a production level (currently there is no support for service redundancy, failover procedures and times, replication patterns, etc.).


3 The industrial automation domain

This section introduces the industrial automation domain and provides guidelines and best practices for applying model-based prediction methods, such as Q-ImPrESS, in this domain. Section 3.1 starts with a short introduction to the domain and typical applications in industrial automation. Section 3.2 characterizes software used in industrial automation. Section 3.3 briefly recalls a typical target system, which was already detailed in Q-ImPrESS deliverable D7.1. Section 3.4 lists several guidelines and best practices gained from experiences with the Q-ImPrESS method. Efforts, risks, and typical pitfalls are discussed in Section 3.5. Section 3.6 provides a checklist for potential users of Q-ImPrESS in industrial automation, before Section 3.7 concludes the section.

3.1 Introduction

Industrial automation deals with the use of control systems and information technology to reduce manual work in the production of goods and services. While industrial automation systems originate from manufacturing, there are now many other application scenarios. Industrial control systems are for example used for power generation, traffic management, water management, pulp and paper handling, printing, metal handling, oil refineries, chemical processes, pharmaceutical manufacturing, or carrier ships. Different types of software applications are used in industrial automation:

SCADA: supervisory control and data acquisition systems refer to industrial control systems that monitor and coordinate industrial processes in real time. They consist of human-machine interfaces, controllers and field devices (i.e. sensors and actuators).

DCS: distributed control systems are similar to SCADA systems but put more emphasis on actual process control (instead of coordination) and on the distribution of the system to multiple locations connected via networks (see Figure 7).

PLC: programmable logic controllers manage interactions with field devices, such as flow sensors or valves. They are specifically designed for multiple input and output arrangements and have strict real-time constraints. Modern PLCs have processing power comparable to typical desktop PCs and are designed to handle large temperature ranges, electrical noise, and vibration.

Figure 8 shows how these systems are embedded into industrial manufacturing systems. SCADA and DCS reside on level 2, while PLCs reside on level 1. The latter are connected to field devices, which ultimately control and manage the actual industrial process. Above level 2, SCADA and DCS systems can be connected, for example, to Manufacturing Execution Systems (MES) on level 3, which in turn provide input data for Enterprise Resource Planning (ERP) systems. The software on levels 3 and 4 is out of scope for this document.

Figure 7: Industrial process control system


Figure 8: Automation pyramid for structuring industrial automation systems

SCADA and DCS have strict reliability and real-time performance requirements so as not to interrupt the underlying industrial process. Therefore, applying Q-ImPrESS methods and tools for analysing the trade-offs of design decisions on performance and reliability properties is interesting for industrial automation companies. A subdomain of industrial automation is robotics (see Figure 9). Nowadays, industrial robots automate many industrial processes, such as welding, painting, assembly, pick&place, palletizing, product inspection and testing. Industrial robots can have multiple sensors (e.g., cameras, scales) and actuators (e.g., drives, end effectors). Robot controllers are usually PLCs, but some robots are controlled by additional, more powerful workstations. Like SCADA and DCS, industrial robots operate under strict real-time performance and reliability constraints and are therefore potential candidates for applying Q-ImPrESS. However, these systems are usually not service-oriented and thus slightly out of scope for the Q-ImPrESS method.

Figure 9: Industrial Robot


3.2 Characteristics of software in industrial automation

As Section 3.1 pointed out, there are several heterogeneous systems in the industrial automation domain, which require specific kinds of software. We broadly classify the software in the industrial automation domain into domain-specific applications, embedded software systems, and large-scale software systems. These classes are characterized in the following:

Domain-specific applications: These applications are, for example, control loops for process automation or power systems. The IEC 61131-3 standard defines several graphical and textual PLC programming languages used for control loops: ladder diagrams, function block diagrams, structured text, instruction lists, and sequential function charts. However, there are additional domain-specific application languages, such as ABB's programming language RAPID for robot control and advanced motion. The code sizes for these applications are typically below 50 KLOC. Many of these applications are programmed by individual developers or small teams.

Embedded software applications: This software runs on standard microcontrollers (e.g., ARM, PowerPC for Automation and Power Products) and interacts with custom hardware. Such software is often programmed in traditional programming languages, such as C/C++ or Assembler. Typical operating systems are VxWorks, Linux or QNX. These applications are medium-sized and often consist of less than 500 KLOC. This software is usually developed in medium-sized teams.

Large-scale software systems: This software includes server-side software for collecting, analysing, and storing process data. Furthermore, human machine interfaces for plant operators require large-scale software systems. SCADA and DCS fall into this category. Programming languages used are typically 3GL languages, such as C++, C#, and Java. Typical operating systems are Windows or Linux. The code sizes in this class may be large, in the range of millions of lines of code. Thus, such software may be developed by up to several hundreds of developers usually organized in distributed teams at various locations. Design activities may be carried out by special architecture teams.

As industrial software systems may involve physical or chemical processes with real time constraints, they typically have very strict non-functional requirements. The main quality metrics important for these systems are:

Reliability/Availability: Industrial software must run without interruptions or exceptions, so that the underlying industrial process is not influenced. Unreliable software implies severe safety problems. Therefore, many industrial systems run on redundant hardware, for example.

Safety: Depending on the application area, customers may have high safety requirements in some contexts. For example, some controller applications must be certified according to IEC 61508 for certain safety integrity levels (SIL).

Performance: Industrial software systems have real-time requirements and must react within strict timing constraints so as not to impede the industrial process.

Security: Authorization and authentication are becoming more and more important since distributed systems may be connected to potentially insecure networks. As industrial systems have a significant value, they are attractive targets for intruders.

Usability: As industrial software systems must ensure proper human intervention in an industrial process under any condition, the usability of operator workplaces is a continuous concern for industrial companies.

Maintainability: Industrial systems are typically long-lived systems with a life cycle much longer than, e.g., enterprise software. Industrial plants may run for more than 30 years, while software technology changes in much shorter intervals. New customer requirements and regulatory standards require continuous maintenance and evolution of industrial software systems.

In current practice, these non-functional properties are ensured via various measures. Reliability issues are usually tackled by additional hardware, e.g., redundant CPUs, disk drives and networks. Performance is usually measured during software testing (e.g., using profiling tools) to ensure compliance with the requirements. For assessing the performance of new technologies or commercial off-the-shelf (COTS) components, prototypes are sometimes built and measured in testbeds.

Many software architecture design decisions are based on experience with former systems, since many generations of systems have been built and constantly evolved.

3.3 The target system

The ABB demonstrator system used to evaluate Q-ImPrESS is a representative process control system (PCS). Its features and architecture have been described in detail in Q-ImPrESS deliverable D7.1. It was selected because of its service-oriented structure (i.e., there are servers providing services to various client applications) and available architectural documentation. The following briefly summarizes the most important characteristics of the system. The ABB PCS consists of several workplace clients, process control servers, controllers and field devices (Figure 10), which are interconnected via redundant networks. The main functions of the server-side part of the system are:

Data Handling: collecting sensor values from various field devices, displaying them to human operators, and sending new values (e.g., actuator interactions) back to the field devices.

History Management: storing logs of process values and performing statistical functions on the values.

Alarm&Event Management: collecting error messages and warnings from the process and displaying them to the operator for human intervention.

The ABB demonstrator focuses on the server-side part of the system, which has strict performance and reliability requirements. This software is implemented with Microsoft technologies in C/C++, runs on the Windows operating system and comprises several million lines of code. Service providers are COM components or .NET assemblies. Clients communicate via open standards (e.g., OPC) with the service providers.


Figure 10: Topology of a typical process control system

3.4 Guidelines and best practices

This section provides guidelines for the application of Q-ImPrESS in the industrial automation domain. Section 3.4.1 discusses typical evolution scenarios which can be analysed with the Q-ImPrESS method. Section 3.4.2 describes some best practices for data collection in the domain.

3.4.1 Typical evolution scenarios

Q-ImPrESS supports analysing evolution scenarios on the architectural level. Changes have to be expressed in components, connectors, or in changes to the usage profile and hardware environment. Implementation-level changes can be incorporated into the models rather indirectly, for example by changing the resource demands of a service effect specification or replacing an abstract connector with a concrete communication component. Most of the changes to a process control system are, however, typically not on the architectural level but on the implementation level (e.g., bug fixes and small feature additions). Many updates usually concern the communication with field devices or third-party applications, which are difficult to reflect in a Q-ImPrESS model. Furthermore, some architecture-level changes (e.g., replacing a component) do not impact the critical performance and reliability scenarios but instead add functionality or improve other quality attributes such as security or usability.

As an additional aspect, the analysable evolution scenarios also depend on the chosen abstraction level of the Q-ImPrESS model, which can be chosen arbitrarily. If a fine-granular model is constructed, which reflects for example class-level properties, then very detailed evolution scenarios are analysable. If a coarse-granular model is constructed, then the changes to be modelled are restricted to high-level features, as a reflection of each change in the model is necessary. Typical evolution scenarios supported by Q-ImPrESS models are:


Adding components/services: this implies adding a component with corresponding SEFFs for all services to the Q-ImPrESS repository and the service architecture model. If the component is implemented, quality annotations (e.g., performance, reliability) can potentially be measured. If the component is only a design proposal, then the quality annotations have to be estimated.

Changing components/services: this scenario implies adding a service to a component or changing its SEFFs. For example, a component could be altered to support multi-core processors, or more testing could be performed to achieve lower component failure probabilities. These changes can easily be incorporated by altering the QoS annotation model. This scenario might also be used to reflect technology changes (e.g., updates to the used implementation frameworks).

Changing the usage model: this scenario involves altering the closed and open workloads of the Q-ImPrESS usage model to simulate load peaks and perform capacity planning, i.e., determining the maximum sustainable throughput (a minimal sketch follows this list).

Changing the allocation model: this scenario involves assigning services to different nodes so that different load balancing or fault tolerance schemes can be analysed.

Changing the hardware environment: in the hardware model, several parameters, such as the processing rates of CPUs or disk drives, can be changed to analyse the impact of faster hardware on the system. For reliability, corresponding parameters are missing, so evolution scenarios with highly reliable hardware are not directly analysable. This scenario also includes changing the network latency and throughput.

Changing infrastructure software: altering infrastructure software, such as the middleware, virtualisation, or OS layers is currently not directly supported by the Q-ImPrESS models. If such changes need to be analysed they have to be included into the component and connector abstraction made by the SAMM. For example, to represent an operating system, a component would have to be introduced and any other component running under this operating system would have to have a required interface to reflect OS calls.
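For the usage-model scenario above, a quick analytic plausibility check can precede a full simulation. The following is a minimal sketch under simplifying assumptions (a single service, exponential service times, an open workload); the demand and arrival rates are hypothetical values, not taken from the demonstrator:

```python
# Back-of-the-envelope capacity check for an open workload
# (M/M/1 approximation). All numbers are hypothetical; real values
# would come from measurements or the QoS annotation model.

service_demand_s = 0.020  # CPU demand per request in seconds (assumed)

# The resource saturates when utilisation = arrival_rate * demand
# approaches 1, so the maximum sustainable throughput is 1 / demand.
print(f"saturation throughput: {1.0 / service_demand_s:.0f} requests/s")

for arrival_rate in (10, 25, 45):  # requests per second (assumed load peaks)
    utilisation = arrival_rate * service_demand_s
    # Mean response time of an M/M/1 queue: R = D / (1 - U).
    response_time_ms = 1000.0 * service_demand_s / (1.0 - utilisation)
    print(f"lambda = {arrival_rate:2d}/s  U = {utilisation:4.0%}  R = {response_time_ms:6.1f} ms")
```

Such a check helps to judge whether a load-peak evolution scenario operates close enough to saturation to justify detailed modelling and simulation.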

These evolution scenarios are still on an abstract level, but can potentially reflect a large number of possible evolutions in the industrial automation domain. These evolution scenarios are however not specific for the industrial automation domain. Domain specific evolution scenarios are for example the introduction of a new standard in industrial automation (e.g., IEC 61850) or the support for new controller devices. These scenarios are potentially relevant for industrial control systems from different vendors, but are nevertheless difficult to capture in Q-ImPrESS models.

3.4.2 Best practices for data collection

Data collection for Q-ImPrESS includes gathering performance, reliability, and maintainability measures to be used as input for the Q-ImPrESS models. For completely new components, these measures have to be estimated, for example based on experience with similar systems. For existing systems, the values can be determined by measuring or analysing a running system.

There are only limited industrial automation domain-specific guidelines for data collection. If a PLC device is to be analysed, special measurement facilities might be needed.


For a complete process control system, the measurement and analysis methods known for other application domains can be reused. The following briefly summarizes some methods for data collection for the three different quality attributes supported by Q-ImPrESS:

Performance (i.e., CPU, HDD, and network demands in Q-ImPrESS models)

o Fine-granular: if detailed performance measurements are required and the source code of the system is available, code instrumentation (e.g., using aspect-oriented programming, AspectC++) or profiling (e.g., Intel VTune) can be applied. The concrete measurement tools depend on the implementation technologies (e.g., whether the system is implemented in Java, C++ or C#) and the operating environment (e.g., application servers, OS).

o Coarse-grained: to get an initial picture of performance properties, system monitoring tools, such as perfmon under Windows or top and vmstat under Linux, can be used. They can record the CPU and HDD utilization per process and thus produce input data for the Q-ImPrESS QoS annotation model (see the sketch after this group of items).

o Other: performance measurements require a suitable testbed, representative workloads encoded into load drivers and statistical tools (e.g., R, Excel) to perform statistical analysis.
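As a concrete illustration of the coarse-grained option above, the following sketch samples system-wide and per-process CPU utilization with the third-party psutil library; the process name and sampling parameters are hypothetical examples, not part of the Q-ImPrESS tooling:

```python
# Coarse-grained monitoring sketch using the third-party psutil
# library (a scripted alternative to reading perfmon/top output).
import psutil

SAMPLES, INTERVAL_S = 10, 1.0

target = None
for proc in psutil.process_iter(["name"]):
    if proc.info["name"] == "HistorySrv.exe":  # hypothetical server process
        target = proc
        break

for _ in range(SAMPLES):
    system_cpu = psutil.cpu_percent(interval=INTERVAL_S)  # blocks one interval
    disk = psutil.disk_io_counters()
    line = f"system CPU {system_cpu:5.1f}%  disk reads {disk.read_count}"
    if target is not None:
        line += f"  process CPU {target.cpu_percent():5.1f}%"  # since last call
    print(line)
```

Per-process utilization averaged over a steady-state run, divided by the measured request throughput, yields an approximate resource demand per request for the QoS annotation model.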

Reliability (i.e., activity failure probabilities in Q-ImPrESS models)

o Defect prediction based on code metrics: compute code metrics, such as lines of code, inheritance depth, or cyclomatic complexity, to estimate the number of component defects (e.g., four defects per KLOC). Given source code, this method is easy to execute, but its validity is unproven and debated in the literature.

o Reliability growth modelling: assume that software reliability grows over time due to bug fixes, and extrapolate curves of field failure report dates to predict future failures using statistical regression. Many software reliability growth models (SRGMs) are available in the literature. However, this method is reasonably applicable only to completed or almost completed software. Applying the method to individual components is often not possible because of limited failure reports. This method was actually used for the reliability prediction of the ABB Q-ImPrESS demonstrator; details can be found in D7.3. A minimal fitting sketch follows this group of items.

o Random/statistical testing: generate random test data for individual components and incorporate the number of successfully executed tests into a statistical failure rate estimation model. This method is applicable to any type of software and does not require source code or historical data. However, the effort for generating and executing a sufficient number of test cases is high and the method might not scale. Because of inter-component relationships, it is difficult to test components in isolation.

o Fault injection: manually insert faults into the source code or use test cases from fixed bugs on former versions of the software. The failure rates can then be estimated as the number of failed vs. the number of successful test case executions. This method is accurate for former versions of the software, and the effort can be low if suitable test cases are available. However, this method does not determine the current component failure rate. Additionally, it is often difficult to attribute test case failures to component faults.

o Explicit failure modelling: construct a state-based behaviour model per component explicitly including manually specified transition probabilities for failure states. To create such a model, user requirements, domain knowledge, and/or experience with similar software can be used. While this method is useful for newly developed components, it requires manual estimation of failure rates and its accuracy is not proven.
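To make the reliability growth option above concrete, the following sketch fits a Goel-Okumoto model, m(t) = a(1 - e^(-bt)), one of the classic SRGMs, to cumulative failure counts; the failure data and observation period are invented for illustration and do not stem from the demonstrator:

```python
# Reliability growth sketch: fit a Goel-Okumoto NHPP model
# m(t) = a * (1 - exp(-b * t)) to cumulative failure counts and
# extrapolate future failures. The data points below are invented.
import numpy as np
from scipy.optimize import curve_fit

def goel_okumoto(t, a, b):
    """Expected cumulative number of failures after t weeks."""
    return a * (1.0 - np.exp(-b * t))

weeks = np.arange(1, 13)  # 12 weeks of field failure reports
cum_failures = np.array([5, 9, 13, 16, 18, 20, 21, 23, 24, 24, 25, 26])

(a, b), _ = curve_fit(goel_okumoto, weeks, cum_failures, p0=(30.0, 0.1))
print(f"estimated total defects a = {a:.1f}, detection rate b = {b:.3f}/week")
print(f"expected residual defects: {a - cum_failures[-1]:.1f}")
# Expected failures in the next four weeks, usable as a basis for a
# failure probability annotation:
print(f"weeks 13-16: {goel_okumoto(16, a, b) - goel_okumoto(12, a, b):.1f}")
```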

Maintainability (i.e., cost estimates for the Q-ImPrESS KAMP tool)

o COCOMO II: this method estimates the effort for changes based on the expected number of lines of code and a user-specific weighting of several cost-driver attributes (e.g., size of involved databases, experience of the developers, amount of performance constraints, etc.). A minimal sketch of the effort formula follows this group of items.

o Function point analysis: this method relies on costs calculated from past projects. A function point is a unit of measurement to express the amount of functionality a system provides to a user. Based on the expected complexity of a change, an effort estimation can be performed.
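For the COCOMO II option above, a minimal sketch of the nominal effort formula follows. The constants A and B are the published nominal COCOMO II calibration values; the size, scale factor ratings, and effort multipliers are hypothetical and would need project-specific calibration:

```python
# COCOMO II-style effort estimate for a change request (sketch).
# A and B are the nominal COCOMO II calibration constants; all other
# values are hypothetical placeholders.

A, B = 2.94, 0.91
size_ksloc = 6.0                                  # expected size of the change
scale_factor_ratings = [3.1, 4.0, 3.0, 2.5, 4.2]  # five scale factor ratings
effort_multipliers = [1.10, 0.95, 1.15]           # e.g., complexity, tools, schedule

# Effort (person months) = A * Size^E * product of effort multipliers,
# with E = B + 0.01 * sum of the scale factor ratings.
exponent = B + 0.01 * sum(scale_factor_ratings)
effort_pm = A * size_ksloc ** exponent
for multiplier in effort_multipliers:
    effort_pm *= multiplier

print(f"scale exponent E = {exponent:.2f}")
print(f"estimated effort: {effort_pm:.1f} person months")
```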

3.5 Efforts, risks and pitfalls

This section discusses the efforts (in person hours) for applying Q-ImPrESS in the industrial automation domain (Section 3.5.1) and the risks and pitfalls encountered during our case study with the ABB PCS system (Section 3.5.2).

3.5.1 Efforts for applying Q-ImPrESS

Cost/benefit analyses of new methods are highly interesting for practitioners to judge whether the invested efforts are worthwhile. If the efforts and benefits (i.e., additional income or avoided fixing costs over the course of the product life cycle) of a model-based prediction project can be quantified, it is possible to compute its net present value (NPV), which can be used for business decisions. Former studies have pointed out that model-driven prediction methods bear the potential for a significant return on investment (e.g., >400% for a 15-person, 18-month project). Improving architectural design decisions can have a profound impact on a system. However, quantifying the benefits of model-driven predictions requires long-term studies and is therefore out of scope for this project. Instead, the following comparison of cost estimations for Q-ImPrESS against conventional methods (e.g., prototyping or relying on experience) helps the reader to further evaluate the applicability of Q-ImPrESS.

To quantify the costs of Q-ImPrESS, we estimated the person hours needed to complete the various activities of the method. As we did not track the actual times consumed by ABB employees while executing the demonstrator evaluation, we first estimated these times post-mortem. They are reported in Table 2 in the column labelled "ABB". However, these values are influenced by several ABB-internal factors which lie outside of the Q-ImPrESS method and tools (e.g., obtaining data from other ABB personnel, acquiring domain knowledge for the system under analysis, interfering projects, incomplete or erroneous Q-ImPrESS tools, etc.). Therefore, a cost estimation based on these values might be misleading.
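The NPV computation mentioned above is straightforward once yearly cash flows have been estimated. A minimal sketch follows; all cash flow figures and the discount rate are hypothetical:

```python
# Net present value of a model-based prediction project (sketch).
# Year 0 carries the modelling effort as a cost; later years carry
# estimated benefits (avoided fixing costs, additional income).

def npv(rate, cash_flows):
    """Discount yearly cash flows (year 0 first) to their present value."""
    return sum(cf / (1.0 + rate) ** year for year, cf in enumerate(cash_flows))

cash_flows = [-60_000, 25_000, 40_000, 40_000]  # EUR per year, hypothetical
rate = 0.10                                     # assumed discount rate

print(f"NPV at {rate:.0%}: {npv(rate, cash_flows):,.0f} EUR")
# A positive NPV would indicate that the prediction effort pays off
# over the product life cycle under these assumptions.
```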


In addition, we estimated the potential effort for applying Q-ImPrESS by third parties under the following assumptions:

The user of the method is familiar with the system under study.

Source code and documentation of the system under study are readily available.

The Q-ImPrESS tools are bug-free.

Evolution scenarios have been designed and discussed, and non-functional requirements are documented.

Even if these assumptions hold, there is still a high potential for variability in the efforts. This depends on whether facilities for data collection (e.g., performance testbeds or bug tracking systems) are available, on how complex the desired evolution scenarios are, and on how accurate the prediction results are expected to be. Thus, we do not provide a single potential effort value in person hours, but a distribution expressed as best case, most likely case, and worst case (Table 2). The table shows, for example, that high efforts are required for learning the method, executing SoMoX for reverse engineering, deciding on a modelling abstraction level, and processing log data. These are areas for potential improvement or even automation in future research.

For some activities we estimated a large variance in the efforts, e.g., defining a target decomposition for SoMoX, setting up a test system, filtering bug tracking data, and running PerOpteryx 2 (a set of Eclipse plugins to automatically improve Palladio Component Model 3 instances) for automated design space exploration. These activities depend on certain factors which can differ heavily case by case. It can be observed that the efforts for modelling the basic system are much higher than the efforts for modelling evolution scenarios and executing the different prediction tools. Thus, the high upfront efforts in Q-ImPrESS may best pay off if the model is reused for many evolution scenarios.

The values reported here are not easily transferable to other projects, because they originate from our limited experience with just one case study. Instead, they are meant as a coarse indication of the potential efforts, to be analysed by other practitioners and validated by researchers in repeated controlled experiments. There are already empirical studies with effort estimations for restricted parts of the method (e.g., manual modelling in [9]).

A condensed version of the effort predictions can be found in Table 3. There are significant differences between the best case (1.5 person months) and worst case (8.6 person months) predictions. Typical prototyping studies at ABB last for 3-12 person months. The efforts for modelling and prototyping are, however, not mutually exclusive: to assess new technologies or components, some prototype measurements would also be required in order to parameterise the Q-ImPrESS models.

2 http://sdqweb.ipd.kit.edu/wiki/PerOpteryx 3 http://sdqweb.ipd.kit.edu/wiki/Palladio_Component_Model


Table 3: Condensed effort predictions for applying Q-ImPrESS

                                      Potential duration
#   Activity                        Best   Likely   Worst
1   Document Analysis                 40       76     132
2   Reverse Engineering               21       58     198
3   Manual Modeling                   54      147     299
4   Performance Measurement           21       68     188
5   Reliability Estimation            16       50     132
6   Autom. Design Exploration          8       15      39
7   Modeling Evol. Scenarios          10       16      23
8   Performance Analysis               3        5       8
9   Reliability Analysis               3        7      12
10  Tradeoff Analysis                  1        1       1
    Sum                              177      443    1032
    Sum in person months (120 h)     1.5      3.7     8.6

To judge whether the estimated efforts pay off, we would have to quantify the actual savings achieved by applying the Q-ImPrESS tools. This is, however, not possible, because some of the savings cannot be quantified and the time frame of the Q-ImPrESS project is too short to get feedback from business units, due to the round-trip time of development projects.

3.5.2 Risks and pitfalls

The efforts for Q-ImPrESS analysed in the previous section might be lowered if third-party users can effectively mitigate the following risks associated with the method and tools. In the following, we distinguish between risks associated with model-based prediction approaches in general and risks associated directly with Q-ImPrESS.

The following risks are related to model-based prediction approaches in general:

Missing goal-orientation: The goals for modelling activities should be explicitly stated and discussed with stakeholders. This includes documenting current and future requirements for extra-functional properties as well as detailed descriptions of evolution scenarios and design alternatives. Without explicitly stated goals, inaccurate or even misleading models might be built that do not help to improve the design.

Choosing the wrong abstraction level: the abstraction level for modelling activities is not fixed or pre-specified. It is possible to model on a very low abstraction level (e.g., with details from the code) or a very high abstraction level (e.g., only including coarse-grained subsystems). A too low abstraction level may prolong the modelling unnecessarily and may lead to wasted effort on modelling unimportant details. A too high abstraction level might miss important parameters or impact factors for the quality attributes, as bottlenecks or sources of defects are hidden by the abstraction.

Complicated input data collection: gathering the required resource demands, failure probabilities, and effort estimations is time-consuming and system-specific. The method therefore provides only limited step-by-step guidelines on how to collect the data. This might require experimentation and an iterative approach, which is usually much more time-consuming than entering the models into the tools and running the predictions.

Inaccuracy for complex evolution scenarios: If an evolution scenario includes many yet unknown or uncertain parameters (e.g., due to a missing implementation of a component or missing hardware), modelling must rely on user-provided estimations for the resource demands, failure probabilities, and change activity efforts. These estimations or experience-based guesses might be inaccurate, thus also leading to inaccurate predictions. The quality of the input data is a major factor for the prediction accuracy. Sensitivity analysis can reveal how fragile a model is against inaccurate input data (a minimal sketch follows this list).

Missing trust in the results: architects or managers might not trust the model-based prediction results. The results rely on the mathematical assumptions of the underlying model, which may not hold in practice. In addition, they rely on the input data, which is often obtained from artificial testbeds that might not reflect realistic customer setups. Thus, it is important to validate predictions, if possible under realistic conditions.
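Regarding the inaccuracy risk above, a one-at-a-time sensitivity check is easy to script. In the following sketch, a simple M/M/1 response time formula stands in for the prediction model (in practice, the Q-ImPrESS simulation would be re-run per perturbed input); the base values are hypothetical:

```python
# One-at-a-time sensitivity sketch: perturb one input and observe the
# relative change of the predicted metric. An M/M/1 response time
# formula stands in for the actual prediction model here.

def predicted_response_time(demand_s, arrival_rate):
    utilisation = arrival_rate * demand_s
    return demand_s / (1.0 - utilisation)  # M/M/1 mean response time

base_demand, arrival_rate = 0.020, 30.0    # hypothetical base inputs
base = predicted_response_time(base_demand, arrival_rate)

for delta in (-0.2, -0.1, 0.1, 0.2):       # +/-10% and +/-20% input error
    perturbed = predicted_response_time(base_demand * (1 + delta), arrival_rate)
    print(f"demand {delta:+.0%} -> predicted response time {perturbed / base - 1:+.1%}")
```

A strongly non-linear reaction, as near saturation in this example, signals that the corresponding input estimate deserves the most measurement effort.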

Some risks are specifically associated with the Q-ImPrESS method:

Limited integration with other tools and methods: While the Q-ImPrESS tools are well integrated with each other and can be installed into Eclipse-based development environments, they do not integrate seamlessly into other development environments. Existing UML models need to be recreated for Q-ImPrESS, as there are currently no model transformations from UML to SAMM.

Cost-effective only for large systems: The complexity of the SAMM (>100 classes) requires a substantial learning effort (approx. 1 week). Q-ImPrESS models are more difficult to understand than simple queuing networks because they are developed for evaluating evolution scenarios. Therefore, it might not be justified to apply Q-ImPrESS for single evolution scenarios or smaller systems. However, the effort might pay off when the models can be reused. An advantage of Q-ImPrESS models over queuing networks is that they use terms familiar to developers and architects and do not require an understanding of queuing theory.

Limited expressiveness of SAMM: Although being complex, SAMM still lacks constructs to model many interesting evolution scenarios. There is no direct support for virtualisation, OS changes, application server configurations, transmission protocols, event-based communication, real-time scheduling, or dynamic architectures. There is also only limited support for modelling the middleware. Finally, there is no support for modelling industry automation specific scenarios, e.g., the introduction of a new standard or the support for new controller devices.

Missing data collection support: The hardest, most time-consuming, and most error-prone activity in modelling is data collection (i.e., determining resource demands, failure probabilities, etc.). For these activities, Q-ImPrESS offers no method or tool support and requires manual work or the use of third-party products. It would be conceivable to create technology-specific versions of Q-ImPrESS, for example to support data collection in .NET or Java environments.


3.6 Checklist

The following checklist for third-party users of Q-ImPrESS is based on a former checklist [8] for model-based QoS prediction. Suitable answers to these questions can mitigate some of the risks stated in the former section:

Is the system correctly defined and the goals clearly stated?

Are the goals stated in an unbiased manner?

Have all the steps of the analysis been followed systematically?

Is the problem clearly understood before analysing it?

Are the QoS metrics relevant for this problem?

Is the workload correct for this problem?

Is the evaluation technique appropriate?

Is the list of parameters that affect QoS attributes complete?

Have all parameters that affect the QoS attributes been chosen as factors to be varied?

Is the experimental design efficient in terms of time and results?

Is the level of detail proper?

Is the measured or estimated data presented with analysis and interpretation?

Is the analysis statistically correct?

Has a sensitivity analysis been done?

Would errors in the input cause an insignificant change in the results?

Have the outliers in the input or output been treated properly?

Have the future changes in the system and workload been modelled?

Has the variance of input been taken into account?

Has the variance of the results been analysed?

Is the analysis easy to explain?

Is the presentation style suitable for its audience?

Have the results been presented graphically as much as possible?

Are the assumptions and limitations of the analysis clearly documented?

Specifically for Q-ImPrESS we add the following questions:

Can trustworthy input data be obtained for the intended evolution scenarios?

Is the model intended to be continuously updated and reused to compensate for the high upfront costs?

Are proper data collection facilities (e.g., testbeds, bug tracking systems) in place to keep the modelling effort reasonable?

Does the system under study follow a component-based architecture, so that the reverse engineering tools are useful?

Do the stakeholders support the modelling activities so that all required information can be obtained?

3.7 Conclusions

In summary, only limited domain-specific guidelines can be given for the industrial automation domain. There are many different kinds of software in this domain (e.g., PLC control loops vs. large-scale DCS), which may require different modelling methods and data collection techniques. Because of the breadth of the domain, it is also difficult to specify generic industrial automation evolution scenarios analysable by the Q-ImPrESS tools which are not already covered by other domains. However, in our case Q-ImPrESS needed only limited tailoring to be applied to the ABB process control system, which is based on Microsoft technologies. With our demonstrator, we gained a more refined picture of the effort required to apply model-based predictions, but the data is not sufficient to allow a complete cost/benefit analysis of the method.

Future methods could work with restricted domain-specific languages targeting specific evolution scenarios. Q-ImPrESS itself could be extended to support more quality attributes (e.g., security, safety), so that even more trade-off analyses could be conducted. To reduce the effort of applying Q-ImPrESS, there could be automated support for data collection. To better integrate into existing development environments, transformations from UML models to Q-ImPrESS models could be implemented.


4 The enterprise domain

The enterprise domain is mainly shaped by business processes. Considering our enterprise showcase, common examples are processes related to order and supply chain management, such as order entry, order fulfilment, order closure, and order tracking. Single tasks of such processes are carried out by particular systems or services, e.g., order management or shipment management, which themselves consist of or make use of further systems, namely customer relationship management (CRM), product data management (PDM), pricing or inventory systems, and so on. From a technological point of view, the provided systems offer local databases, (simple) web front-ends, and – to a large extent – web service interfaces. Hence, the enterprise domain is especially service-oriented, and we also refer to it as an enterprise service-oriented architecture (E-SOA).

The E-SOA domain is mainly characterized by locally organized, distributed services. These services are loosely coupled and therefore often message-oriented systems. Additionally, company-internal architectures within the E-SOA domain often comprise a central integration layer, such as an Enterprise Service Bus (ESB), avoiding point-to-point connections between services. Services themselves are mostly implemented as so-called 3-tier architectures, separating the messaging layer or presentation layer from the underlying business layer and data access layer. It is common practice to deploy such services on application servers (e.g., JBoss AS or MS IIS) that provide a range of technology stacks supporting, e.g., web services or message queues.

Over the last decades, the design and development of enterprise systems has been driven by the Object-Oriented Analysis and Design (OOAD) methodology. In particular, the emergence of today's preferred languages for application development, such as C++, Java or C#, has established OOAD and Object-Oriented Programming (OOP) as the main design and development practices. Alongside these practices, traditional design patterns (see [3]) have a heavy impact on the design, solution, and implementation of recurring problems and requirements. Moreover, Patterns of Enterprise Application Architecture (see [5]) and Enterprise Integration Patterns (see [6]) have recently gained a lot of attention. Above all, however, SOA Design Patterns (see [4]) play an outstanding role in the E-SOA world. Besides other reasons, the extensive use of patterns within the enterprise domain is driven by quality aspects such as sustainability, maintainability, and reliability. Moreover, the ongoing trend of service-orientation fosters such quality aspects by decoupling components and supporting cohesion by encapsulating business functionality in individual services. Alongside these, security and efficiency are the main quality aspects that software engineers care most about.

When it comes to quality aspects, the build/deploy/test/release automation cycle plays an important role within the enterprise domain. Continuous Integration (CI) has been widely adopted: development team members integrate their work on a regular basis, and each integration step is automatically verified by self-testing builds on an integration machine and tested on a clone of the production system.

The average number of team members involved in design, development, testing and deployment activities varies with project size. The number of project members might range from just a few persons in smaller projects, through 10 to 50 persons in medium-sized projects, up to a much higher number in large-scale projects. Basically, design and implementation are carried out by separate roles. Most software development processes are shaped by the concepts of roles and disciplines. For example, the Rational Unified Process (RUP) defines roles such as Business Process Analyst, Software Architect, and Implementer for disciplines such as Business Modelling, Analysis and Design, or Implementation, respectively. Depending on their role in the software development process, team members are often experts with respect to domain-specific limitations, rules, and guidelines.

The enterprise domain is especially subject to security and privacy constraints, e.g., those induced by the German Data Protection Act. Therefore, role-based authentication and authorization have become a common standard within the enterprise domain. In addition, large enterprises often have to manage multiple clients, so multi-tenancy capability must often be accounted for. Moreover, the design and implementation of enterprise systems are affected by the number of users, system parameters and performance requirements.

4.1 The target system

The E-SOA-Showcase serves as a good example of a typical enterprise application landscape.

Figure: Component Layers

As shown in the figure above, each component is composed of three layers: Presentation Layer, Business Layer and Data Access Layer. This inner software architecture of components is typical for enterprise software systems. The presentation layer's task is to provide a visual representation of the component's interface, the business layer provides the business functions, and the data access layer enables access to business data that is typically stored in a database. Further information about the E-SOA-Showcase is provided in the document "D8.6: Enterprise SOA Showcase" (see [7]). Besides the components of an enterprise application, middleware software has to be considered.

4.2 Guidelines and best practices

This section provides guidelines for the application of Q-ImPrESS in the enterprise application domain. Section 4.2.1 discusses typical evolution scenarios which can be analysed with the Q-ImPrESS method. Section 4.2.2 describes some best practices for data collection in the domain.

4.2.1 Typical evolution scenarios

Typical evolution scenarios supported by Q-ImPrESS models are:

Modification of an existing component: This means modifying a component's interface or the internal behaviour of its provided services. In Q-ImPrESS, the corresponding model component interface can easily be altered to reflect the modification of the real system. Changes to the behaviour are edited in the SEFF model of the component service to be modified.

Insertion of a new component: This scenario reflects the situation where new services – represented by a new component – have to be incorporated into the overall software system. Each component in the real software system is represented by a component in the architecture model of Q-ImPrESS. The corresponding services are modelled in the SEFF model.

Substitution of a component: Enterprise applications often substitute legacy software systems. Usually they do not substitute the legacy software as a whole, but parts of it; over time, the remaining old software is substituted step by step. In Q-ImPrESS, this scenario is modelled by substituting an existing modelled component with a new one. Typically the interface remains, but the behaviour changes. This scenario is thus similar to the scenario "Modification of an existing component" described above.

Changing the hardware environment: In the hardware model, several parameters, such as the processing rates of CPUs or disk drives, can be changed to analyse the impact of faster hardware on the system. For reliability, corresponding parameters are missing, so evolution scenarios with highly reliable hardware are not directly analysable. This scenario also includes changing the network latency and throughput.

Changing infrastructure software: Altering infrastructure software, such as the middleware, virtualisation, or OS layers, is currently not directly supported by the Q-ImPrESS models. If such changes need to be analysed, they have to be included in the component and connector abstraction made by the SAMM. For example, to represent an operating system, a component would have to be introduced, and any other component running under this operating system would have to have a required interface to reflect OS calls.

4.2.2 Best practices for data collection

Data collection for Q-ImPrESS includes gathering performance, reliability, and maintainability measures to be used as input for the Q-ImPrESS models. For completely new components, these measures have to be estimated, for example based on experience with similar systems. For existing systems, the values can be determined by measuring or analysing a running system.

As mentioned above, enterprise applications typically rely on middleware software. This means that the performance measures gathered for a given enterprise application include the amount of time the middleware software consumes for performing its tasks. Furthermore, some tasks performed by middleware software can easily be modelled in Q-ImPrESS as additional components or services. Other tasks – e.g., transaction management – would lead to a huge modelling overhead.

It is better to create a model alternative in Q-ImPrESS that includes the middleware tasks implicitly as part of the components and services. If the Q-ImPrESS model has to reflect a varying middleware configuration, an additional model alternative has to be created.

The following briefly summarizes some methods for data collection for the three different quality attributes supported by Q-ImPrESS:

Performance: Performance values are modelled in the QoS annotations model. It is not recommended to modify the source code of an enterprise application in order to determine the consumed amount of time. First, this approach requires the modification of source code, which is error-prone and associated with a large programming overhead. Second, the additionally created source code itself consumes time, which distorts the measurements. The best approach is to use professional monitoring tools. For new components, the values have to be estimated.

Reliability: Regarding reliability, the same methods apply to the enterprise domain as to the industrial domain (see Section 3.4.2).

Maintainability: A proven method is Function Point Analysis. This method relies on costs calculated from past projects. A function point is a unit of measurement that expresses the amount of functionality a system provides to a user. Based on the expected complexity of a change, an effort estimation can be performed (a minimal counting sketch follows).
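To illustrate the Function Point Analysis option above, the following sketch computes an unadjusted function point count using the common IFPUG average weights; the counts for the change and the hours-per-function-point productivity figure are hypothetical and would normally be calibrated from past projects:

```python
# Unadjusted function point count for a change request (sketch),
# using the common IFPUG average weights. The counts and the
# hours-per-FP productivity figure are hypothetical.

AVERAGE_WEIGHTS = {
    "external_inputs": 4,
    "external_outputs": 5,
    "external_inquiries": 4,
    "internal_logical_files": 10,
    "external_interface_files": 7,
}

change_counts = {  # function types touched by the planned change
    "external_inputs": 3,
    "external_outputs": 2,
    "external_inquiries": 1,
    "internal_logical_files": 1,
    "external_interface_files": 0,
}

function_points = sum(AVERAGE_WEIGHTS[k] * n for k, n in change_counts.items())
hours_per_fp = 8.0  # calibrated from past projects (assumed)
print(f"{function_points} FP -> estimated effort {function_points * hours_per_fp:.0f} h")
```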

4.3 Efforts, risks and pitfalls

This section discusses the efforts for applying Q-ImPrESS in the enterprise domain (Section 4.3.1); the gathered experiences are based on the eSOA-showcase and not on real systems. Section 4.3.2 discusses the potential risks during the process.

4.3.1 Efforts for applying Q-ImPrESS

The analysis of the efforts concentrates only on the efforts required for correct utilisation of the Q-ImPrESS prediction tools. Table 4 shows estimations of the efforts for applying the Q-ImPrESS tools in the enterprise domain. The "ITE" column indicates the actual duration assessment of the ITE developers' efforts on the eSOA-showcase. It should be pointed out that the figures in Table 4 correspond to the high-level modelling abstraction applied to the eSOA-showcase developed by ITE and should therefore not be considered universal for the complete range of possible projects in the enterprise domain.


Table 4: Efforts for applying Q-ImPrESS on the eSOA-showcase

                                          Potential duration
Activity                                Best   Likely   Worst
Preparation activities                    40       80     140
Measurements for performance              70       90     190
Estimations for availability modelling    45       96     170
Estimations for maintainability           24       30      90
Manual modelling                          40       80     120
Modelling of evolution scenarios          18       30      50
Performance prediction analysis            6       12      18
Availability prediction analysis           8       12      18
Maintainability prediction analysis        8       12      18
Trade-off analysis                         3        5       8
Sum                                      262      447     822

The table shows that the biggest efforts are required for gathering knowledge about the Q-ImPrESS method and tools. In addition, data collection for a specific quality prediction consumes a further large part of the time, while tool configuration and simulations require less effort. Our experience has shown that modelling the main alternative and collecting data require significantly more effort than modelling the evolution scenarios.

4.3.2 Potential risks

The following potential risks should be taken into consideration by future users of Q-ImPrESS in the enterprise domain before applying the method and tools:

High learning effort required: In order to benefit from Q-ImPrESS, it has to be considered that gathering knowledge about the Q-ImPrESS method and tools requires considerable time.

Cost-effective only for large systems: Given the high learning effort required by Q-ImPrESS, it will not be cost-effective for small systems. Determining whether a given system is large enough to justify the application of the Q-ImPrESS method and tools may be a difficult task.

Complicated input data collection: Gathering the required resource demands, failure probabilities, and effort estimations is time-consuming and system-specific. The method therefore provides only limited step-by-step guidelines on how to collect this data. Collection may require experimentation and an iterative approach, which is usually much more time-consuming than entering the models into the tools and running the predictions.
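As one example of how such input data can be derived (the relation used below is the standard service demand law from queueing theory, not a Q-ImPrESS-specific facility, and the figures are invented for illustration), a per-request CPU demand can be computed from coarse utilisation and throughput measurements:

// Minimal sketch: deriving a per-request resource demand from measured
// utilisation and throughput via the service demand law D = U / X.
public class ServiceDemandLaw {

    // utilisation: measured resource utilisation in [0, 1]
    // throughput:  completed requests per second
    // returns the average resource demand per request, in seconds
    static double serviceDemand(double utilisation, double throughput) {
        return utilisation / throughput;
    }

    public static void main(String[] args) {
        // E.g. 60% CPU utilisation at 120 requests/s implies
        // 0.6 / 120 = 0.005 s = 5 ms of CPU demand per request.
        double demandSeconds = serviceDemand(0.60, 120.0);
        System.out.printf("CPU demand per request: %.1f ms%n",
                demandSeconds * 1000);
    }
}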

Inability to model more complex usage scenarios: The Q-ImPrESS tools are well suited for predicting performance attributes in near steady states. However, they lack support for modelling more complex usage scenarios in which the usage profile changes over time.


4.4 Conclusions

The given domain-specific guidelines for the enterprise domain are limited: firstly, because the gathered experiences are based on a showcase, and secondly, because this domain comprises many different kinds of software systems. Nevertheless, the showcase gives a first impression of how the Q-ImPrESS method and tools can be applied to the enterprise domain. The most salient points are that the initial effort required to set up Q-ImPrESS is high, whereas the time consumed for the analyses themselves is relatively low. Since Q-ImPrESS is well suited for large systems, the large amount of time spent on learning how the Q-ImPrESS method and tools work and on collecting data is acceptable, because these one-time activities are not required in subsequent analyses. Regarding the enterprise domain, future improvements of the Q-ImPrESS tools should cover the availability prediction mechanisms, which are at this point still too simplistic for application at production level.


5 Conclusions

The Q-ImPrESS method could be successfully applied to all three specific domains. The application of the Q-ImPrESS model-based prediction methods gave valuable insights into the efforts required to perform re-engineering tasks, and useful feedback was gained from the three different domains. Some considerations are fairly generic and transversal to all domains; the main ones are listed here:

High initial effort is required for the Q-ImPrESS learning process: This effort is mitigated over time, as it is largely not required for subsequent analyses. This issue has been one of the major drawbacks reported by the enterprise domain which, among the three, represented the use of the method in the smallest environment. It could nevertheless be a potential barrier for the adoption of the Q-ImPrESS method (especially for small-sized and mid-sized projects), and some countermeasures (tutorial material, tutorial screencasts and dissemination activities) are envisaged.

The platform already shows great potential: The industrial domain revealed great potential for the Q-ImPrESS method, which could be extended to support more quality attributes (e.g., security, safety) so that even more trade-off analyses could be conducted. These quality attributes could be valuable for the other domains as well.

Quality and effectiveness of the analysis components: While the performance prediction analysis is at a fairly mature stage (production-ready), other analysis components, like the availability prediction, provide a solid base for industry adoption. Further tooling and research activities can refine the availability prediction and lift it to production level.

Platform openness and adaptability: Although the platform builds on top of the Eclipse framework, the ABB engineers reported that only limited tailoring activities were needed to cope with their Microsoft-based code base.