
Comparing Process- and Event-oriented

Software Performance Simulation

Master Thesis of

Philipp Merkle

At the Department of Informatics
Institute for Program Structures

and Data Organization (IPD)

Reviewer: Prof. Dr. Ralf H. Reussner
Second reviewer: Prof. Dr. Walter F. Tichy
Advisor: Dipl.-Inform. Jörg Henß
Second advisor: Dr.-Ing. Jens Happe

Duration: 20. January 2011 – 19. July 2011

KIT – University of the State of Baden-Wuerttemberg and National Laboratory of the Helmholtz Association www.kit.edu


I declare that I have developed and written the enclosed master thesis completely by myself, and have not used sources or means without declaration in the text.

Karlsruhe, 2011-07-19


Zusammenfassung

Predicting the performance of complex software systems frequently relies on simulation. To this end, executable computer models (simulators) are developed which reproduce the performance of real systems. The way in which simulators capture the system behaviour is strongly influenced by the so-called world-view adopted by the simulation developer. The two most common world-views are process orientation and event orientation.

The choice of a world-view also affects the performance and scalability of the resulting simulator. This has been addressed by a small number of studies comparing the world-views with respect to performance; however, these studies show little representativeness. In particular, none of them is based on a simulator that is actually used for performance predictions.

We therefore conduct such a comparison in the context of the Palladio Component Model (PCM), which has been applied successfully in a number of case studies, underlining its representativeness. We compare the process-oriented PCM simulator SimuCom with its event-oriented counterpart EventSim, the latter of which has been developed in this thesis. We i) identify factors that potentially influence simulation performance, ii) rank these factors, and iii) assess the performance and scalability of the simulators on the basis of the most influential factors.

A validation suggests that EventSim is semantically equivalent to SimuCom. For most PCM models, SimuCom is outperformed by EventSim, which can be attributed to the additional overhead introduced by process orientation. Both simulators scale linearly in terms of simulation duration as model complexity increases; the simulation duration in SimuCom, however, grows at a rate up to 50 times that of EventSim. Scalability limits were observed only for SimuCom, but most of them cannot be attributed to the world-view used.


Abstract

Predicting the performance of complex software systems most commonly relies on simulation. For this, executable computer models (simulators) are developed that resemble the performance of real systems. The way in which the system behaviour is captured by simulators is strongly influenced by the so-called world-view employed by the simulation developer. The two most common world-views are process- and event-oriented simulation.

The choice of a world-view also affects the performance and scalability of the resulting simulator. This has been addressed by a few studies comparing the world-views in terms of performance, which, however, lack representativeness. Specifically, none of them is based on a simulator actually being used for performance predictions.

Therefore, we perform such a comparison in the context of the Palladio Component Model (PCM), which has been used successfully in a number of case studies, thus emphasizing its representativeness. We compare the process-oriented PCM simulator SimuCom to its event-oriented counterpart EventSim, the latter of which has been developed in this thesis. We i) identify factors potentially influencing simulation performance, ii) rank these factors, and iii) assess the simulators' performance and scalability based on the most influential factors.

A validation suggests that EventSim is semantically equivalent to SimuCom. For most PCM models, EventSim outperforms SimuCom due to the overhead induced by the process orientation. Both simulators scale linearly in terms of simulation duration when increasing the model complexity; the simulation duration in SimuCom, however, increases at a rate up to 50 times that of EventSim. Limits in scalability have been observed only with SimuCom, but most of them cannot be attributed to the world-view used.


Contents

1 Introduction
  1.1 Contributions
  1.2 Structure

2 Related Work
  2.1 Simulation of Palladio Component Models
    2.1.1 SimuCom
    2.1.2 SLAstic.SIM
    2.1.3 SimQPN
  2.2 Software Performance Simulation with Java

3 Foundations
  3.1 Software Performance Evaluation
  3.2 Discrete-Event Simulation
    3.2.1 Terminology
    3.2.2 Simulated Time Advance
    3.2.3 Simulation World-Views
      3.2.3.1 Process-Oriented Simulation
      3.2.3.2 Event-Oriented Simulation
  3.3 Component-Based Software Engineering
  3.4 Meta-Modelling
  3.5 Palladio Approach
    3.5.1 Palladio Component Model
      3.5.1.1 Separation of Roles
      3.5.1.2 Parametric Dependencies
      3.5.1.3 Repository
      3.5.1.4 System Model
      3.5.1.5 Resource Environment Model
      3.5.1.6 Allocation Model
      3.5.1.7 Usage Model
    3.5.2 SimuCom

4 Simulator Development
  4.1 Simulation Development Processes
    4.1.1 Model-based Simulation Development
    4.1.2 Model-driven Simulation Development
      4.1.2.1 Generative Approach
      4.1.2.2 Interpretative Approach
    4.1.3 Selecting a Development Process
  4.2 Simulation Overview
    4.2.1 Users
    4.2.2 Requests
    4.2.3 Resources
  4.3 Preprocessing the Static Structure
  4.4 Interpreting the Dynamic Behaviour
    4.4.1 Events
    4.4.2 Behaviour Interpreter
      4.4.2.1 Static Structure
      4.4.2.2 Traversal Procedure
      4.4.2.3 Traversal Strategies
      4.4.2.4 Traversal State
      4.4.2.5 Traversal Instructions
      4.4.2.6 Traversal Listener
    4.4.3 Simulation Run Example
  4.5 Simulation Platform
  4.6 Handling Parametric Dependencies
  4.7 Encapsulating the Model Access
  4.8 Collecting Performance Measurements
    4.8.1 Mounting Resource Probe Sets
    4.8.2 Mounting Control Flow Probe Sets
  4.9 Attaining Independence of Simulation Libraries
  4.10 Supported Features

5 Simulator Validation
  5.1 Approach
    5.1.1 Testing for Equivalence of Samples
    5.1.2 Testing for Equivalence of Probability Distributions
  5.2 MediaStore
  5.3 Performance Metrics
  5.4 Experiments
  5.5 Results and Discussion
    5.5.1 Without Resource Contention
    5.5.2 With Resource Contention
    5.5.3 Avoiding Processor Sharing Resources
    5.5.4 Reviewing the Processor Sharing Implementation
      5.5.4.1 Processor Sharing Foundations
      5.5.4.2 Processor Sharing Implementation
      5.5.4.3 Consequences of Indeterministic Scheduling
  5.6 Conclusion and Limitations

6 Simulator Comparison
  6.1 Approach
  6.2 Automated Model Variation and Simulation
    6.2.1 Configuration Meta-Model
      6.2.1.1 EMF Ecore
      6.2.1.2 Experiment Repository and Experiments
      6.2.1.3 Referencing PCM Models
      6.2.1.4 Describing Model Variation
      6.2.1.5 Configuring Simulation Runs
      6.2.1.6 Measuring the Simulation Performance
    6.2.2 Experiment Automation Tool
      6.2.2.1 Experiment Controller
      6.2.2.2 Model Variation
      6.2.2.3 Analysis Tool Adapter
      6.2.2.4 Performance Measurement
  6.3 Identifying Potential Performance Factors
    6.3.1 Previous Work
    6.3.2 Reducing the Set of Potential Performance Factors
  6.4 Ranking Performance Factors
    6.4.1 Analysis of Variance (ANOVA)
      6.4.1.1 Experimental Designs
      6.4.1.2 2^k r ANOVA
    6.4.2 Assumptions
    6.4.3 Experimental Setting
    6.4.4 PCM Base Model
    6.4.5 Determining the Simulation Performance
    6.4.6 Results and Discussion
      6.4.6.1 Initial Overview Using Box Plots
      6.4.6.2 ANOVA Ranking
  6.5 Comparing Performance and Scalability
    6.5.1 Experimental Setting
    6.5.2 Results and Discussion
      6.5.2.1 Simulation Duration
      6.5.2.2 Memory Consumption
  6.6 Identifying Scalability Limits
    6.6.1 Indications of Scalability Limits
      6.6.1.1 Stack Overflow
      6.6.1.2 Running out of Memory
      6.6.1.3 Exceeding Java Class File Limits
    6.6.2 Experimental Setting
    6.6.3 Results and Discussion
  6.7 Conclusion

7 Summary and Conclusion
  7.1 Benefits
  7.2 Future Work

Bibliography


1. Introduction

Besides the fulfilment of functional requirements, performance is a crucial characteristic of software systems. As Smith and Williams [SW02, p. 9] point out, performance failures can result in damaged customer relations, inefficient staff and lost income, to name just a few. Even worse, performance failures can cause whole projects to fail since a redesign in late development stages is often too expensive [SW02]. Performance is also an important factor in the emerging areas of cloud computing and, more specifically, virtualised computing; the better the software performance, the more software systems can operate on the same physical machine – at the same costs for the infrastructure provider. These benefits are commonly passed on to the customers by providing services on a pay-per-use basis. In consequence, an increase in software performance directly leads to a reduction in operating costs. These examples demonstrate the importance and opportunities of a systematic performance evaluation from early development stages on.

In early development stages, when there is neither an implementation nor a prototype of the system under investigation (SUI), software performance is commonly predicted by means of performance models, which "capture the main factors that determine the system performance" [Kou09]. Even in late development stages performance models can prove advantageous, for example when the performance evaluation of a running system would adversely affect the ongoing operation. Two groups of performance models can be distinguished: analytical models and simulation models, though there is not always a clear separation. Analytical models are based on formal descriptions of the SUI and allow for a fast calculation of performance metrics. Complex systems, however, are hard to analyse using analytical models due to the so-called state space explosion [KFBR06]. When the complexity of the SUI rises, it is therefore common to stick to a simulative approach [Kou09].

Simulation models imitate the behaviour of systems, yielding approximations of certain performance metrics. In the area of software performance prediction it is common to employ a technique called discrete-event simulation. In brief, this simulation approach models the SUI's behaviour over time by a sequence of events, each of which may change the state of the simulated system. Typically, an event adjusts not only the system state, but also causes further events to occur in the simulated future. Imagine a number of customers waiting in a queue to be served by a teller. The length of the waiting line and the present state of the teller – idle or busy – constitute the system state. Whenever a customer arrives, this is regarded as an event. Then, the system state changes by increasing the length of the waiting line by one. Additionally, the occurrence of the customer arrival


event entails a future event since the customer expects to be served eventually, which corresponds to an event again.
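To make this concrete, the following minimal Java fragment (all names are our own and not taken from any simulation library) captures the queue example: the system state consists of the length of the waiting line and the teller's status, and the event routine for an arrival both updates this state and determines whether a follow-up service-completion event needs to be scheduled.

    // Illustrative sketch of the teller example; not tied to any library.
    class TellerSystemState {
        int waitingCustomers = 0;   // length of the waiting line
        boolean tellerBusy = false; // present state of the teller: idle or busy

        // Event routine for a customer arrival at simulated time 'now'.
        // Returns the time at which a service-completion event should be
        // scheduled, or -1 if the customer has to wait in the queue.
        double onCustomerArrival(double now, double serviceTime) {
            if (tellerBusy) {
                waitingCustomers++;     // state change: the queue grows by one
                return -1;              // completion is scheduled later
            }
            tellerBusy = true;          // state change: the teller starts serving
            return now + serviceTime;   // occurrence time of the follow-up event
        }
    }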

Even though it is conceivable to analyse such a small simulation model manually using paper and pencil only, most of the interesting simulation models are far too complex and need to be translated to an executable computer program first. This is a challenging task since the simulation modeller needs to find an appropriate mapping of the abstract concepts, i.e. events and how they influence the system state, to concrete statements formulated either in a simulation-specific language (e.g. Arena or GPSS) or a general-purpose programming language (e.g. C++ or Java).

Over the past decades, a number of different strategies have been proposed to guide the development process of a simulation model. These strategies are termed simulation world-views or conceptual frameworks. Derrick et al. [DBN89] define a world-view as follows:

“A Conceptual Framework (CF) is an underlying structure and organization of ideas which constitute the outline and basic frame that guide a modeler in representing a system in the form of a model.”

A more concrete definition has been given by Pollacia [Pol89]:

“A world view is the perspective, or point of view, from which the modeler sees the world or the system to be modeled. The world view chosen by a modeler greatly influences the structure of and method for constructing a model.”

To sum up, a world-view guides the simulation modeller in that it provides a framework in which the model development takes place. In consequence, the decision for a specific framework or world-view, respectively, influences the design of the resulting computer program constituting the simulation model – and, as will be stated later on, may also affect the simulation performance.

Two well-known world-views are the event-scheduling and the process-interaction approaches, which are to be compared in this thesis. They differ mainly in the granularity in which the simulation modeller describes the behaviour of the SUI. While the former demands a fine-grained modelling of individual events, the latter abstracts from events in that it allows for a modelling of processes. Such thinking in terms of processes is commonly perceived as more natural compared to focusing on events. In the example given above, a process is, for instance, the lifecycle of a customer, which begins with entering the waiting line and ends as soon as the customer has been served completely.

Some authors suggest performance implications associated with the world-views, which is due to the way in which processes in the process-interaction world-view are commonly implemented, namely by mapping each simulated process to a preemptive thread managed by the operating system. Therefore, Banks [Ban10, p. 567] regards the choice between an event-oriented and a process-oriented approach as a trade-off between “modeling ease versus execution speed”, which is explained by higher computational costs of context switches in process-based approaches compared to delivering events in event-based approaches. L'Ecuyer [LB05] even suggests that “threads are designed for real parallelism, not for the kind of simulated parallelism required in process-oriented simulation”. Furthermore, L'Ecuyer states that using native threads “adds significant overhead and prevents the use of a very large number of processes in the simulation” [LB05]. We assume it is due to the lack of viable alternatives that simulated processes are still often mapped to threads; and, in particular, this applies to simulation models developed in Java.

Despite the presumed inferiority of a process-interaction simulation which is driven by threads, we believe that an extensive performance and scalability comparison of the


two world-views can help in trading off “modelling ease” for performance as suggested by Banks. However, existing comparisons are based on artificial simulation models whose practical relevance is rather small. To the best of our knowledge, an extensive performance comparison between the two world-views has not been conducted yet.

Therefore, the objective of this thesis is to perform a comparison between process- and event-oriented simulation in order to contribute to a better understanding of the trade-off between “modelling ease” and performance.

1.1 Contributions

The contributions of this thesis are as follows:

• We develop EventSim, a fully event-oriented software performance simulator for instances of the Palladio Component Model (PCM), a domain-specific language for modelling the performance of component-based software systems on an architectural level. EventSim is to be semantically equivalent to the process-oriented simulator SimuCom, which is shipped with the PCM.

• We identify those PCM modelling elements that account for large proportions of the overall simulation duration in SimuCom as well as EventSim.

• We compare the performance and scalability of the process-oriented SimuCom and its event-oriented counterpart EventSim. The comparison is based on the performance-related modelling elements identified before.

1.2 Structure

This work is organised as follows:

• Chapter 2 introduces related work on software performance simulation using PCM models, and more generally, approaches related to the development of software performance simulations in Java. Beyond that, we discuss publications dealing with the performance implications of simulation world-views in Java.

• Chapter 3 lays the foundations for the following chapters. In particular, we give a short overview of discrete-event simulation modelling and cover the most important concepts of the Palladio Component Model.

• Chapter 4 presents our simulator EventSim. We cover the underlying development process and focus on selected aspects of the simulator design and implementation.

• Chapter 5 deals with the validation of EventSim. We perform a series of experiments using both SimuCom and EventSim and require that both simulators yield consistent results.

• Chapter 6 describes the comparison of SimuCom and EventSim in terms of performance and scalability. We begin with identifying factors that are presumed to affect the simulation performance. In a second step we rank these factors based on their influence on the duration of simulation runs. Finally, a thorough analysis compares the performance and scalability of SimuCom and EventSim based on the most influential factors.

• Chapter 7 concludes this thesis and discusses directions for future work.


2. Related Work

This chapter covers related work on software performance simulation in general, and on the simulation of component-based software systems using the Palladio Component Model (PCM) [RBK+07] in particular. Section 2.1 begins with an overview of simulative approaches for predicting software performance on the basis of PCM models. Thereafter, in Section 2.2, we present simulation libraries which ease the modelling of software performance simulations in Java. In addition, this section presents recent approaches targeted at improving the way in which simulations are developed in Java. The section concludes with presenting an existing comparison between the process- and event-oriented simulation techniques.

2.1 Simulation of Palladio Component Models

Currently, we know of two simulators that have been specifically developed to provide performance analyses of PCM models, namely SimuCom and SLAstic.SIM. Another simulator called SimQPN is capable of simulating PCM models using a preliminary performance model transformation to the queuing Petri net (QPN) formalism.

2.1.1 SimuCom

SimuCom [Bec08] is Palladio's default simulator in that it is delivered together with the Palladio Component Model (PCM). The simulation is loosely based on queuing networks, which means that the simulated system is internally represented by a set of resources that are requested by simulated users. Resources are limited in capacity, which is why simulated users might have to wait for their turn when a resource has been requested that is currently busy with another user. Each resource is associated with a scheduling strategy, which determines the order in which the queue of simulated users is worked off. Simulation results or performance metrics, respectively, arise from resource utilisation over simulation time and the time that it took a simulated user to pass through the simulated system.

SimuCom is a collection of Eclipse plug-ins implemented in Java on top of the SSJ simulation facilities (cf. Section 2.2). The behaviour of simulated users is modelled by adhering to the process-interaction world-view. As will be mentioned below, this implies a native thread per simulated user. In contrast, the simulation of resource behaviour relies on the event-scheduling approach. Nevertheless, major parts of SimuCom are process-oriented, and both the advantages and disadvantages due to this world-view apply to SimuCom.


Consequently, some performance and scalability issues have been reported with SimuCom. Meier [Mei10] reported that SimuCom could run out of memory for high workloads, when simulated users arrive in the system faster than they leave it, which has been attributed to the requirement of a dedicated thread per simulated user. Furthermore, performance gains of up to 20 times have been observed using an event-oriented simulator based on queuing Petri nets in place of SimuCom.

2.1.2 SLAstic.SIM

SLAstic.SIM [vM10], in contrast, is a purely event-oriented simulator focused on runtime reconfiguration of component-based software architectures described by PCM models. As with SimuCom, the simulation is centred around simulated resources having a scheduling strategy each. However, compared to SimuCom, there is a smaller number of scheduling strategies available.

SLAstic.SIM is part of the SLAstic approach [vHRGH09], which aims at optimising the resource efficiency of component-based systems by means of architecture-based runtime adaptation. For this purpose, the PCM meta-model has been extended in order to be able to describe a system's reconfiguration capabilities.

Although SLAstic.SIM is an event-oriented simulator for PCM models and as such could serve as the counterpart of SimuCom in the planned comparison, there are some differences between the two simulators that, taken together, would prevent reasonable findings from such a comparison. First, SLAstic.SIM does not depend on the Eclipse platform, which probably results in a lower overhead in terms of resource usage. Second, it makes use of the Desmo-J simulation library (cf. Section 2.2), leading to a different performance behaviour of platform services such as the random-number generator. Third, no parametric dependencies can be used with SLAstic.SIM, which considerably restricts the range of accepted PCM models.

2.1.3 SimQPN

SimQPN [KB06] is an event-oriented simulator for software performance models described by the queuing Petri net (QPN) formalism. By adhering to a simulative approach, SimQPN copes with the problem of state space explosion, which is common to analytical approaches, while at the same time providing the expressive power of QPNs.

In order to allow for the simulation of PCM models in SimQPN, Meier et al. [MKK11] propose an automated transformation from PCM models to queuing Petri nets. First, they describe a formal mapping of PCM concepts to the QPN formalism. Second, a QVTO model transformation has been implemented, enabling the automated transition from PCM models to QPN performance models. As mentioned earlier, the simulation of transformed PCM models using SimQPN reduces the simulation duration by a factor of up to 20 compared to SimuCom.

These findings are especially interesting in view of the event-scheduling approach underlying SimQPN. Nevertheless, a more thorough comparison is required to possibly be able to attribute the observed performance difference to the respective world-views.

2.2 Software Performance Simulation with Java

A great number of simulation libraries facilitate the development of software performance simulations in Java. Generally, the purpose of a simulation library is to prevent the simulation developer from "re-inventing the wheel" in that common simulation components and functionalities are factored out and provided by the library. Essentially,


this includes mechanisms and data structures associated with the scheduling of events, the generation of pseudo-random numbers, and the collection of statistics over the course of a simulation run. A comprehensive, though non-exhaustive, list of Java simulation libraries can be found in [WP04], for example. In this section, we focus on the two libraries used with SimuCom and SLAstic.SIM, namely SSJ [LB05, ssj] and Desmo-J [Pag05, des].

Both SSJ and Desmo-J allow for modelling a system in terms of events (event-scheduling world-view), processes (process-interaction world-view), or a combination of both. In Desmo-J, each simulated process is mapped to a native thread. This means a simulated process is expressed by a sequence of Java statements, which are executed by a dedicated thread. SSJ follows the same approach, but additionally supports the single-threaded implementation proposed by Jacobs and Verbraeck [JV04]. Instead of using threads to execute sequences of Java statements, Jacobs and Verbraeck propose a Java interpreter which is itself implemented in Java, thus allowing for interpreting the statements of simulated processes in a single thread. However, this approach could not prove satisfactory because of its high interpretation overhead [LB05].
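The thread-per-process mapping can be pictured with the following simplified Java sketch; it is our own illustration of the general mechanism and not the actual SSJ or Desmo-J implementation. Each simulated process runs on a dedicated native thread, and the simulation kernel suspends and resumes it through a condition variable – the blocking and waking of native threads is where much of the overhead discussed above originates.

    // Simplified illustration of a thread-backed simulated process.
    abstract class SimulatedProcess implements Runnable {
        private final Object monitor = new Object();
        private boolean mayRun = false;

        // Called by the simulation kernel to hand control to this process.
        void resumeProcess() {
            synchronized (monitor) {
                mayRun = true;
                monitor.notify();
            }
        }

        // Called from within the process body to return control to the kernel,
        // typically whenever simulated time has to pass.
        protected void passivate() throws InterruptedException {
            synchronized (monitor) {
                mayRun = false;
                while (!mayRun) {
                    monitor.wait(); // blocks the native thread until resumed
                }
            }
        }

        // Subclasses implement run() as a sequence of Java statements
        // interleaved with passivate() calls.
    }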

The simulated processes in a (non-parallel) discrete-event simulation do not require real parallelism, which is why some authors suggest the use of coroutines. A coroutine is similar to a thread, but is, in contrast, non-preemptive. This means that while multiple threads are allowed to be active in parallel, coroutines are active on an alternating basis, but never at the same time. In terms of performance, coroutines outperform threads in that they eliminate a fraction of the overhead induced by threads. For example, no synchronisation between coroutines is required since they pass control to each other explicitly [SWW10]. Specifically in view of the performance benefits, coroutines are favoured over threads. The Java virtual machine, however, does not allow for coroutines without modifications. This is why Stadler et al. [SWW10] propose an extension of the JVM, which could be considered an early step towards coroutines in Java.

For the purpose of assessing the performance influence of a process-oriented simulation, L'Ecuyer [LB05] modelled two semantically equivalent simulations of an M/M/1 queue in Java, one using processes and the other one using events. The event-oriented simulator consists of two types of events corresponding to arrivals and departures. The process-oriented simulation is driven by a single process. The simulation using processes turned out to be approximately 12 times slower than its counterpart based on events. This example serves its purpose in demonstrating the inferior performance of thread-based simulation processes compared to events. It, however, lacks representativeness since the modelled processes do not account for the usually complex behaviour described by processes. Moreover, due to the short lifecycle of the processes, the overhead of creating threads dominates.


3. Foundations

In this chapter, we give an overview of the research areas that constitute the context of this thesis. In doing so, we focus on those aspects that we regard as beneficial to understanding the following chapters. Less important foundations are deferred to later chapters and are introduced where necessary. We begin with a brief overview of software performance evaluation in Section 3.1, which constitutes the overall context of this thesis. Then, in Section 3.2, we introduce discrete-event simulation, which is commonly used in software performance simulation. Section 3.3 deals with component-based software engineering, followed by an overview of meta-modelling in Section 3.4. Both sections lay the foundations for the Palladio Component Model, which is finally covered in Section 3.5.

3.1 Software Performance Evaluation

Based on [Kou09], this section gives a brief overview of software performance evaluation. Software performance evaluation is a discipline targeted at assessing the performance of software systems. Performance is defined as the "degree to which the system meets its objectives for timeliness and the efficiency with which it achieves this" [Kou09], where timeliness refers to meeting a specified response time or throughput. Various methods are used for software performance evaluation depending on the development stage of the system under study. For systems in a late development stage that are already executable, the performance can be measured by using load testing tools or benchmarks. In the early stages of the design process, however, there is no system whose performance could be measured. Thus, the system is commonly abstracted by a performance model that "captures the main factors that determine the system performance" [Kou09]. Performance models can be classified into analytical models and simulation models. Analytical models are described by mathematical means. Performance metrics are gained by solving these models, i.e. by calculating the results. Simulation models, by contrast, abstract from a system in that they imitate the system behaviour. Here, performance metrics are gathered by observing the system while it is being simulated. Simulative approaches are generally more accurate but less efficient compared to analytical approaches.

Common analytical models are Markov chains, queuing networks and Petri nets. However, it must be noted that there is not always a clear separation between analytical and simulative models. For layered queuing networks, for instance, there is both a solver and a simulator. Queuing networks are well suited to modelling contention of hardware resources. They lack, however, the expressiveness required to model software contention. Petri nets


conversely allow for modelling of software contention, but are less suited to model hardware contention since no scheduling policies are associated with resources. Queuing Petri nets incorporate the benefits of queuing networks and Petri nets, allowing for an accurate modelling of both software and hardware contention.

3.2 Discrete-Event Simulation

Two alternative approaches can be distinguished to simulate the behaviour of dynamic systems, where dynamic refers to systems that change their state over time. If the system behaviour over time can be described as a finite sequence of system states, discrete-event simulation lends itself well to the development of a corresponding model. Using discrete-event simulation, changes of the system state are assumed to be discrete. In other words, the system state changes in a step-like manner. By contrast, when the system under study cannot be described in terms of discrete states, a continuous simulation is better suited. For instance, this is the case for the simulation of physical phenomena where there is an indefinite number of states between two time instants, no matter how small the time distance in between is.

The discrete-event simulation approach is well suited for the simulation of software systems: the number of system users changes step-wise when a user enters the system or leaves it; likewise, the queue of a resource behaves in this manner when the resource is being requested or a resource request has been served completely. This is why we rely on discrete-event simulation in this thesis. In the following, we provide an overview of the most important terms and concepts of this simulation approach.

3.2.1 Terminology

A system is a collection of objects that interact together to accomplish one or more goals [Ban10]. In order to study a system by means of simulation, an abstraction needs to be created, which is called the simulation model. The purpose of a simulation model is to imitate the behaviour of the real system: when supplied with the same inputs, the observable characteristics of the simulation model should be consistent with those of the real system. Most commonly, a simulation model is therefore an executable piece of software implemented in a general-purpose programming language, for instance. Such a simulation model is also referred to as a simulator; in the remainder of this thesis we use both terms interchangeably.

Usually, only a fraction of the system is of interest in a simulation study, which is why the simulation model needs to be in line with the goals of the simulation study. That is, the simulation model is not required to capture every object present in the system, but only those related to the objectives of the simulation. In particular, the same system may be represented by a variety of simulation models, each reflecting another question to be answered by the simulation.

The objects of interest captured by a simulation model are called entities. An entity is the abstraction of "an object in a system that requires explicit representation within a model of the system" [Fis01, p. 38]. Entities comprise one or more properties, which are called attributes. The values assigned to the attributes constitute the state of an entity.
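As a minimal illustration (the class name is our own), an entity can be represented as a plain Java object whose fields are its attributes; the values of these fields at any instant form the entity's state, and the states of all such objects together contribute to the system state.

    // Illustrative entity from the teller example of Chapter 1.
    class Customer {
        // Attributes of the entity; their current values are the entity's state.
        double arrivalTime;
        boolean beingServed;

        Customer(double arrivalTime) {
            this.arrivalTime = arrivalTime;
            this.beingServed = false;
        }
    }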

The overall system state is composed of the state of all entities present in the simulation model. More generally, the system state is "that collection of variables necessary to describe the system at any time, relative to the objectives of the study" [Ban10, p. 30]. In other words, the system state over time is a sequence of snapshots of the real system.

The behaviour of a system over time is simulated by events. Events are commonly defined as instantaneous occurrences that change the system state, where instantaneous indicates


the discrete, step-wise change of the system state as introduced earlier. This definition implies that an event may not span a period of time. Each event simulates a small fraction of the system behaviour, and the overall behaviour of the system emerges by simulating several events in succession. Each event occurs at a specified point in time and may schedule further events to occur in the simulated future.

3.2.2 Simulated Time Advance

Time in discrete-event simulations advances independently of real time, which is why a distinction has to be made between simulation time (or simulated time) and wall-clock time (or real time). Simulation time is advanced by the simulator whenever the events scheduled at a simulated time instant have been simulated completely. Then, the simulation time is moved forward to the point in time associated with the next imminent event. The time in between is skipped since no change to the system state may occur between successive events.

The simulation component which keeps track of imminent events along with their occurrence times is called the event scheduler. The scheduler's main task is to invoke the simulation logic associated with a scheduled event as soon as the event's occurrence time is reached.
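A minimal event scheduler along these lines can be sketched in Java as follows; the sketch is illustrative only, and library schedulers such as those of SSJ or Desmo-J are considerably more elaborate. Future events are kept in a priority queue ordered by their occurrence times, and the clock jumps directly from one event time to the next.

    import java.util.Comparator;
    import java.util.PriorityQueue;

    // Illustrative event-scheduler sketch.
    interface TimedEvent {
        double time();   // occurrence time in simulated time units
        void routine();  // simulation logic executed at that time
    }

    class EventScheduler {
        private final PriorityQueue<TimedEvent> futureEvents =
                new PriorityQueue<TimedEvent>(Comparator.comparingDouble(TimedEvent::time));
        private double clock = 0.0;

        void schedule(TimedEvent event) {
            futureEvents.add(event);
        }

        double currentTime() {
            return clock;
        }

        // Processes all events in time order; the clock skips the idle periods
        // between successive events, since no state change can occur there.
        void run() {
            while (!futureEvents.isEmpty()) {
                TimedEvent next = futureEvents.poll();
                clock = next.time();   // advance simulation time
                next.routine();        // invoke the associated event routine
            }
        }
    }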

3.2.3 Simulation World-Views

The development of a discrete-event simulation is influenced to a great extent by the world-view the simulation modeller adheres to. Below, we give an overview of the process-interaction and the event-scheduling world-views. Though there are some further world-views, we neglect them here since they are outside the scope of this thesis.

3.2.3.1 Process-Oriented Simulation

The process-interaction world-view focuses on the lifecycle of entities as they move through the simulated system. That is, the simulation developer models processes which describe the behaviour of simulated entities. At simulation runtime, these processes are executed until an advance in simulation time is required. Then, the process suspends execution until a certain simulation time has passed. Simulations adhering to the process-interaction world-view are also referred to as process-oriented or process-based simulations.

An example of a simple simulation adhering to the process-interaction world-view can be seen in Figure 3.1. The notation is derived from UML sequence diagrams; the lifeline, however, represents the advance of simulation time and has a scale. Solid sections of the lifeline indicate an advance in simulation time. Hatched sections pause the time advance and thus represent an instant of simulation time. In the example, two simulated users compete for a shared resource depicted in the middle between the users. The resource serves one user at a time and uses a first-come, first-served (FCFS) scheduling policy. Each user and their behaviour is represented by their own simulated process. Hereafter, the left user is called U1 and the right user is referred to as U2. The simulation starts at simulated time t = 0 and both users enter the simulated system. U1 begins and requests the resource at t = 0 having a demand of 30 processing units (PU). For simplicity, we assume that the resource works off one PU per simulated time unit. The resource is currently idle and thus begins serving the request immediately. In order to simulate the time advance induced by the service time, U1 is suspended until the demand has been served. Meanwhile, at t = 5, the second user U2 requests the resource. However, the resource is still busy with the demand of U1. Therefore, U2 is added to the resource's queue. Likewise, U2 is suspended to simulate the time advance caused by waiting for the resource to become available plus the time needed to serve the request. As soon as the resource finishes a request, the corresponding user resumes. Then, the user proceeds and may issue further resource demands.



Figure 3.1: Process-oriented simulation example
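The scenario of Figure 3.1 can be mimicked in plain Java roughly as follows. This is a deliberately simplified sketch, not SimuCom's implementation: each user is mapped to a thread, the FCFS resource is approximated by a fair semaphore, and wall-clock sleeping merely stands in for the advance of simulated time that a real process-oriented kernel would perform.

    import java.util.concurrent.Semaphore;

    // Illustrative process-interaction sketch; all names are our own.
    class FcfsResource {
        private final Semaphore turn = new Semaphore(1, true); // fair = FCFS order

        // Called by a user process; blocks the calling thread until served.
        void demand(int processingUnits) throws InterruptedException {
            turn.acquire();                    // wait until the resource is idle
            try {
                Thread.sleep(processingUnits); // stand-in for simulated time advance
            } finally {
                turn.release();                // resource idle again, next user resumes
            }
        }
    }

    public class ProcessOrientedSketch {
        public static void main(String[] args) throws InterruptedException {
            FcfsResource resource = new FcfsResource();
            Thread u1 = new Thread(() -> {
                try { resource.demand(30); } catch (InterruptedException ignored) { }
            }, "U1");
            Thread u2 = new Thread(() -> {
                try { resource.demand(50); } catch (InterruptedException ignored) { }
            }, "U2");
            u1.start();
            u2.start();
            u1.join();
            u2.join();
        }
    }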

3.2.3.2 Event-Oriented Simulation

By contrast, the event-scheduling world-view treats processes as sequences of events. That is, the developer breaks down a process into a number of events. Technically, each event is associated with an event routine, which handles the occurrence of the event. Within an event routine, further events can be scheduled to occur in the simulated future, resulting in a sequence of events ordered by the time of their occurrence. Thus, one could think of a process in event-oriented simulation as a sequence of event routines. Simulations adhering to the event-scheduling world-view are also referred to as event-oriented or event-based simulations.

A simulation example using an event-oriented world-view is depicted in Figure 3.2. Compared to the process-oriented example above, it is simplified in that only a single simulated user is present. The user and its behaviour are not modelled as a process as was the case before. Instead, the simulation logic associated with a process is scattered over several events. The event scheduler can be seen as an appointment calendar for events. Events can add further events to the event scheduler which are supposed to occur at a specified time in the future. The simulation is initialized by scheduling the simulated user to arrive at simulated time t = 20. Then, the simulation starts at t = 0. As no further events have been scheduled, the first event to occur is the Arrival event at t = 20. The corresponding event routine simulates the first action of the user, which issues a resource demand of 30 processing units to the resource. As the resource is currently idle, the request can be served immediately. Therefore, a BeginService event is scheduled at the current simulation time, indicating that the service has been started, along with a CompleteService event at t = 50 indicating the completion of service. Both events would usually schedule further events; however, for the sake of simplicity, we omit these here.
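In Java, the event chain of this example might be expressed roughly as follows; the sketch is purely illustrative and does not reproduce EventSim's design. Each event class carries its occurrence time and an execute() routine that may schedule follow-up events, and a small scheduler drives the simulation.

    import java.util.Comparator;
    import java.util.PriorityQueue;

    // Illustrative event-oriented sketch of the Figure 3.2 scenario.
    abstract class Event {
        final double time;                       // occurrence time
        Event(double time) { this.time = time; }
        abstract void execute(Simulation sim);
    }

    class Simulation {
        final PriorityQueue<Event> futureEvents =
                new PriorityQueue<Event>(Comparator.comparingDouble((Event e) -> e.time));
        double now = 0.0;

        void schedule(Event e) { futureEvents.add(e); }

        void run() {
            while (!futureEvents.isEmpty()) {
                Event e = futureEvents.poll();
                now = e.time;                    // advance simulated time
                e.execute(this);
            }
        }
    }

    class Arrival extends Event {
        Arrival(double time) { super(time); }
        void execute(Simulation sim) {
            int demand = 30;                     // resource demand in processing units
            // The resource is idle in this simplified scenario, so service starts now.
            sim.schedule(new BeginService(sim.now));
            sim.schedule(new CompleteService(sim.now + demand));
        }
    }

    class BeginService extends Event {
        BeginService(double time) { super(time); }
        void execute(Simulation sim) { System.out.println("service begins at t = " + sim.now); }
    }

    class CompleteService extends Event {
        CompleteService(double time) { super(time); }
        void execute(Simulation sim) { System.out.println("service completes at t = " + sim.now); }
    }

    public class EventOrientedSketch {
        public static void main(String[] args) {
            Simulation sim = new Simulation();
            sim.schedule(new Arrival(20));       // user arrives at t = 20
            sim.run();                           // begin at t = 20, completion at t = 50
        }
    }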

3.3 Component-Based Software Engineering

Component-based software engineering (CBSE) is the approach of engineering software products out of a number of building blocks that are plugged together to yield the overall system. The building blocks in CBSE are called software components, while the system built from components is termed component software. In the following, we outline CBSE as motivated by Szyperski [Szy99].



Figure 3.2: Event-oriented simulation example

Szyperski [Szy99, p. 34] defines a software component as follows:

“A software component is a unit of composition with contractually specified interfaces and explicit context dependencies only. A software component can be deployed independently and is subject to composition by third parties.”

Components are units of composition inasmuch as a component is self-contained, i.e. it "encapsulates its constituent features" [Szy99, p. 30], in the form of binary code, for instance, and makes its dependencies on other components and the environment explicit. To express dependencies, interfaces are used. These can be either required or provided by a component. In order to allow for connecting components, interfaces need to be specified in a contractual way. That is, the interface states "what a client needs to do to use the interface" as well as "what the provider has to implement to meet the services promised by the interface" [Szy99, p. 43]. In this way, components with matching interfaces can be connected with each other, which is the case when the same interface is provided by one component and required by the other one.
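In Java terms, the contractual relationship between a providing and a requiring component can be made explicit as in the following sketch; the interface and class names are invented for illustration. The Mp3Encoder component provides the AudioEncoding interface, while the MediaUpload component requires it and receives a matching provider through its constructor.

    // Illustrative only; names do not stem from any real component model.
    interface AudioEncoding {
        byte[] encode(byte[] rawAudio);
    }

    // A component providing the AudioEncoding interface.
    class Mp3Encoder implements AudioEncoding {
        @Override
        public byte[] encode(byte[] rawAudio) {
            return rawAudio; // placeholder for the actual encoding step
        }
    }

    // A component requiring the AudioEncoding interface; the dependency is made
    // explicit through the constructor and can be satisfied by any component
    // that provides the interface, e.g. Mp3Encoder.
    class MediaUpload {
        private final AudioEncoding encoder;

        MediaUpload(AudioEncoding encoder) {
            this.encoder = encoder;
        }

        byte[] upload(byte[] rawAudio) {
            return encoder.encode(rawAudio);
        }
    }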

Similar to the distinction between classes and objects in object-oriented software development, there are types and instances of components as well. Cheeseman and Daniels [CD01, pp. 5] propose a distinction between four forms of components, which is shown in Figure 3.3. The component specification captures the interfaces that a component provides and requires along with an abstract description of what the component does. The component specification, however, does not include the implementation, which is why this form of a component is on the type level. Such a component type can be instantiated by an arbitrary number of component implementations, whose interfaces and behaviour conform to the component specification. An implemented component specification can be deployed in an arbitrary number of deployment environments, yielding deployment instances. In order to yield the overall component software, deployed component instances are instantiated at runtime, leading to what is called a runtime instance.



Figure 3.3: Component forms (based on [CD01, p. 7])

3.4 Meta-Modelling

A model is an abstract representation of the structure, function and behavior of a system [SV05]. For models intended to be processed by computers, it is of special importance to have a well-defined structure describing what concrete models may look like. Such a description is provided by a meta-model.

A meta-model defines the set of modeling elements that may be used for creating valid models and how these elements are related to each other. A well-known example of a meta-model is the Unified Modeling Language (UML) [uml].

The requirement for a well-defined structure also applies to the meta-model itself, leading to the concept of a meta-meta-model. This could be continued indefinitely. However, common meta-meta-models are self-describing and can serve as meta-meta-model and meta-meta-meta-model at the same time. The meta-meta-model of the UML is, for instance, the Meta Object Facility (MOF) defined by the Object Management Group (OMG) [omg].

Another popular meta-meta-model is Ecore. Ecore is part of the Eclipse Modeling Framework (EMF) [emf], which is a technical foundation of the Palladio Component Model described below.
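As a brief illustration of these layers, the following Java snippet defines a tiny meta-model programmatically with EMF's Ecore API (assuming the EMF libraries are on the class path; the package and class names are our own):

    import org.eclipse.emf.ecore.EAttribute;
    import org.eclipse.emf.ecore.EClass;
    import org.eclipse.emf.ecore.EPackage;
    import org.eclipse.emf.ecore.EcoreFactory;
    import org.eclipse.emf.ecore.EcorePackage;

    public class TinyMetaModel {
        public static void main(String[] args) {
            EcoreFactory factory = EcoreFactory.eINSTANCE;

            // A meta-model class "Component" with a string-typed "name" attribute.
            EClass component = factory.createEClass();
            component.setName("Component");

            EAttribute name = factory.createEAttribute();
            name.setName("name");
            name.setEType(EcorePackage.Literals.ESTRING);
            component.getEStructuralFeatures().add(name);

            // The containing meta-model package.
            EPackage pkg = factory.createEPackage();
            pkg.setName("tinycbse");
            pkg.setNsURI("http://example.org/tinycbse");
            pkg.setNsPrefix("tinycbse");
            pkg.getEClassifiers().add(component);

            System.out.println("Meta-model " + pkg.getName()
                    + " defines class " + component.getName());
        }
    }

Models conforming to this meta-model would then be instances of the Component class, just as PCM models are instances of the Ecore-based PCM meta-model.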

3.5 Palladio Approach

This thesis relies on the Palladio Approach [RBK+07], which aims at engineering component-based software systems with defined quality levels. For this purpose, Palladio provides performance and reliability predictions from early design stages on, for both existing as well as planned systems. The prediction is based on architectural models that capture performance and reliability characteristics of the system under evaluation. These architectural models are described by means of the Palladio Component Model (PCM) – a domain-specific language (DSL) for component-based software architectures.

PCM models are commonly created using an Eclipse plug-in called PCMBench. It offers two types of editors: a graphical editor that supports the creation of PCM models in a UML-like presentation, as well as a tree editor which allows for a fine-grained, more direct manipulation of the elements constituting a PCM model.

Once an architectural model has been created in the PCMBench, the quality engineer chooses an analysis tool, which is supposed to yield the quality prediction. Palladio offers analytical, simulative and measurement-based approaches to quality prediction. The PCMSolver subsumes various analytical approaches to performance prediction, including stochastic regular expressions, layered queuing networks and stochastic process algebras. They involve an initial transformation from a PCM model to the respective formalism. But, as motivated earlier, analytical approaches are limited in applicability. Those limitations are relieved by SimuCom, a performance and reliability simulator for PCM models. Compared to analytical performance models, however, the analysis using SimuCom takes more time to yield accurate results. Another analysis tool, called ProtoCom, transforms PCM models to software performance prototypes. It maps components to Enterprise JavaBeans and deploys them on an application server, where their performance can be measured.


The Palladio Component Model is the subject of the next section. Thereafter, we give an overview of SimuCom. The PCMSolver and ProtoCom have been mentioned for the sake of completeness and will not be discussed in more detail.

3.5.1 Palladio Component Model

The Palladio Component Model (PCM) [RBK+07] is a meta-model targeted at modelling component-based software architectures in terms of quality attributes, such as performance and reliability. PCM models, which we also refer to as PCM instances, represent software systems by capturing the quality-related characteristics of a system's architecture. Based on these system abstractions, extra-functional attributes of planned or existing systems can be predicted.

3.5.1.1 Separation of Roles

The PCM meta-model is designed to support the component-based software engineering process described in [KH06]. There, four roles are identified as being involved in the development of component software:

Component developers specify and implement components.

Software architects assemble the system from existing components or request additional ones from the component developers.

System deployers specify the execution environment in terms of resource containers and the connections in between. Then, they deploy components on the resource containers.

Domain experts describe the typical system usage by providing the usage behaviour.

Each role adopts its own point of view on the component-based system being developed. This is reflected in the PCM by partial models, each focusing on the elements that are of interest to a specific role. Together, these partial models yield the overall PCM model [BKR09].

3.5.1.2 Parametric Dependencies

The performance of a component is influenced by four factors: its implementation, the external services called, the execution environment and the usage profile [RBK+07]. Considering the role concept introduced before, it becomes apparent that none of the roles can specify the performance characteristics of a component-based system in isolation. Especially the component developer faces the challenge of modelling the performance behaviour of components in such a way that the specification is valid not only in a specific component context, but also when the context changes. This is the case, for instance, when an external service is replaced with a service provided by another component.

This is why the PCM introduces so-called parametric dependencies: when specifying a component's performance-relevant behaviour, the component developer may refer to parameters whose values are determined by the component's context. In other words, the component behaviour may be specified in dependence upon input parameters which are characterised by other roles.

Suppose, for example, that the CPU demand issued by a component service is twice the number of entries contained in an input parameter provided to the service. Further, let the parameter depend upon the usage profile, which means that it is not known to the component developer. Using a parametric dependency, the component developer can specify the relationship between the input parameter and the CPU demand.


In this way, the component developer leaves it to the domain expert to provide an appropriate characterisation of the input parameter, i.e. to specify the parameter's expected size in terms of entries.
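To make the example concrete, the following minimal Java sketch illustrates how such a dependency might be resolved at simulation time; the class, the variable names and the numbers are purely illustrative and not taken from the PCM or EventSim.

// Illustrative sketch (not part of the PCM or EventSim): the domain expert characterises
// the input parameter, while the component developer only specifies the relationship
// between that characterisation and the resulting CPU demand.
public class ParametricDependencyExample {
    public static void main(String[] args) {
        int numberOfEntries = 50;                  // parameter characterisation (domain expert)
        double cpuDemand = 2.0 * numberOfEntries;  // parametric dependency (component developer)
        System.out.println("Resulting CPU demand: " + cpuDemand);
    }
}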

3.5.1.3 Repository (Component Developer)

The repository model is created by the component developer. It essentially includes components, interfaces and datatypes. Components provide and require interfaces. Interfaces are lists of signatures, each consisting of an optional return type, a name and a potentially empty parameter list. Parameters, in turn, are expressed by a pair of a datatype and a name.

Roles and Interfaces

Providing an interface means that the component implements a service for each signature listed in the interface. Requiring an interface means that the component may use the services listed in the interface to provide its own services. Whether an interface is provided or required by a component is determined by the type of the role which connects the component and an interface. Accordingly, two types of roles are distinguished: provided roles connect components to interfaces that are to be provided; required roles indicate that an interface is required by a component.

Service Effect Specifications

An important task of the component developer is to describe the behaviour of component services relative to the quality attribute that is to be analysed by the model. When the PCM model is created to enable performance predictions, for instance, the component developer has to model the component's performance-related behaviour.

For this purpose, the developer provides a so-called resource-demanding service effect specification (RD-SEFF) for each service provided by a component. An RD-SEFF abstracts from the actual service implementation in that it captures the resource demands induced by the service itself (InternalActions) along with calls to external services (ExternalCallActions), which, in turn, demand resources as well. Although the PCM allows for different types of service effect specifications, the RD-SEFF is the only type available so far. For this reason, we use the terms service effect specification (SEFF) and RD-SEFF interchangeably.

A SEFF is expressed in a way similar to UML activity diagrams [uml] in that it defines a set of actions that are interconnected in a predecessor-successor relationship. In this way, both impose an order on the actions, representing, for instance, a control flow through a component service. In the UML, control flow constructs, such as branches, split and merge the control flow, resulting in a partial order of the actions. By contrast, the actions in a SEFF form a total order – a chain of actions – while at the same time providing branches, loops and forks as well. This is possible due to hierarchically nested behaviours. While the UML models a branch transition by an edge directed from the branch to another action, each branch transition in a SEFF is represented by a nested behaviour encapsulated by the respective branch action. These nested behaviours may contain the same actions as a SEFF.

The relation between SEFFs, behaviours and actions can be seen in Figure 3.4. Note that some details have been omitted; a more comprehensive presentation can be found in [RBK+07]. A ResourceDemandingSEFF is a ResourceDemandingBehaviour, whose control flow is described in terms of AbstractActions.


Figure 3.4: Partial view of the PCM repository meta-model, focused on SEFFs

Each ResourceDemandingBehaviour contains exactly one StartAction and one StopAction indicating the start of the modelled control flow and its end, respectively. InternalActions model resource demands issued by component services to active resources, such as processors or storage devices. Passive resources, such as database connection pools, are requested by AcquireActions and released again using ReleaseActions. AbstractActions can be placed into the body behaviour of an AbstractLoopAction in order to repeat the encapsulated actions several times in succession. The number of iterations of a LoopAction is determined by a random variable, whereas the CollectionIteratorAction iterates over the entries contained in a collection-typed input parameter. Branches of the control flow are enabled by the BranchAction. Each control flow alternative of a branch is modelled by a (not depicted) subclass of the ResourceDemandingBehaviour meta-class along with a transition probability, where the transition probabilities of a branch sum up to one. ForkActions split the control flow in such a way that all of the ForkedBehaviours are executed concurrently. A ForkedBehaviour is a ResourceDemandingBehaviour and, as such, encapsulates an action chain describing the behaviour. Calls to external component services are modelled by ExternalCallActions.

3.5.1.4 System Model (Software Architect)

The software architect builds a model of the component software by assembling one or more components. This model is termed the system model. Similar to components, systems provide and require interfaces as well. By contrast, however, the system behaviour is not modelled explicitly, but emerges from the collaborative behaviour of the assembled components. As with components, systems use roles to express their relation to interfaces, i.e. whether an interface is provided or required.

Referring to the distinction between various component forms in Section 3.3, components in the repository are actually component specifications, i.e. on the type level. Thus, a component needs to be instantiated before it can be used in an assembly such as a system. This is made possible by AssemblyContexts, which allow for instantiating components multiple times. At a later stage, component instances or AssemblyContexts, respectively, are deployed on resource containers; deployed AssemblyContexts meet the definition of a deployment instance from Section 3.3.


In order to form the collaborative system behaviour, the software architect connects AssemblyContexts by AssemblyConnectors. Component services provided by the assembly of components can be exposed to system users by means of ProvidedDelegationConnectors. A ProvidedDelegationConnector delegates a role provided by the system to a matching role provided by a specified component. This means, whenever a user invokes a system service, the call is delegated to the corresponding component service. Likewise, a RequiredDelegationConnector allows for delegating required service calls to services provided by the system's environment.

3.5.1.5 Resource Environment Model (System Deployer)

Before AssemblyContexts can be deployed in an execution environment, the system deployer needs to create a model of the execution environment. Execution environments are modelled in terms of resources, which is why this model is called the resource environment model. A ResourceEnvironment comprises ResourceContainers and LinkingResources. ResourceContainers provide a number of resources such as processors, storage devices and network interfaces. Each resource is specified by a ProcessingResourceSpecification, which captures the processing rate, the scheduling policy (e.g. first-come, first-served) and the number of instances (e.g. processor cores). ResourceContainers can communicate over communication links, which are modelled by LinkingResources.

3.5.1.6 Allocation Model (System Deployer)

After creating the ResourceEnvironment, the system deployer maps component instances represented by AssemblyContexts to ResourceContainers. The deployment is reflected by AllocationContexts, each referencing an AssemblyContext and a ResourceContainer. In this way, an AllocationContext constitutes the deployment relation.

3.5.1.7 Usage Model (Domain Expert)

As the performance of a component-based system depends not only on the system itself, but also on the way it is used [RBK+07], the usage behaviour needs to be described before conducting performance predictions. In the PCM, the domain expert describes the usage of a system by means of a usage model. It comprises one or more UsageScenarios, each representing a specific use case, which describes the interactions of a class of users with the system [RBK+07, p. 69].

Usage Behaviour

As shown in Figure 3.5, each UsageScenario contains exactly one UsageBehaviour and one Workload. Similar to SEFFs, a UsageBehaviour captures behaviour by means of AbstractUserActions, which are interconnected in a predecessor-successor relationship, thus forming a chain of actions. Each chain begins with a Start action and ends with a Stop action. Calls to system-provided services are represented by EntryLevelSystemCalls. Delays are used to model the passage of time between successive system calls. A Loop encapsulates a nested action chain, which is executed several times in succession as determined by the iteration count. A Branch provides a number of control flow alternatives, each described by a nested action chain. Each alternative is associated with a transition probability, and the probabilities of all alternatives sum up to one.


Figure 3.5: Partial view of the PCM usage meta-model

Workloads

Each UsageScenario has exactly one Workload, which specifies the scenario's usage intensity, i.e. the number of concurrent system users. Workloads in the PCM are either open or closed.

OpenWorkloads specify the usage intensity in terms of the inter-arrival time of successive users. The inter-arrival time is the time span between the arrival of a certain user and the arrival of its successor. When a user arrives, it begins at the Start action and passes through the corresponding UsageScenario until the Stop action is reached. If users arrive faster than they leave the system, the number of concurrent users increases over time.

In contrast, a ClosedWorkload sustains a fixed number of system users, which is called the workload population. The workload starts by generating the whole user population at once; whenever a user finishes its UsageScenario, a new user arrives after waiting a specified amount of time, which is called the think time.

3.5.2 SimuCom

SimuCom [Bec08] is a software performance and reliability simulator for component-based software systems described by instances of the PCM. In addition to what has been stated in Section 2.1.1, we describe some implementation details which are relevant for this thesis.

Two parts of SimuCom can be distinguished. The simulation code is an executable representation of the PCM model under study expressed in Java. Accordingly, the code depends upon the respective PCM instance which is to be simulated and needs to be generated specifically for each instance of the PCM. The SimuCom platform provides simulation infrastructure services to the simulation code and is independent of the respective PCM instance.

When starting a simulation run in SimuCom, three phases can be distinguished. Based on the PCM model provided as input, the first phase conducts a series of model-to-model transformations. This allows, for instance, weaving in the influences of a specific middleware platform on the system model. Then, in the second phase, a code generator uses the augmented PCM instance and a set of predefined model-to-text templates to generate executable Java code representing the system to be simulated. Each component, for instance, is mapped to a Java class whose methods represent the component's services. In this way, a service call can be simulated by calling the corresponding method on an instance of the generated class. For a more detailed description of the mappings refer to [Bec08].


Finally, the third phase performs the simulation run by executing the simulation code generated in the preceding phase.
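To give a rough impression of this mapping, a generated class could take the following hypothetical shape; the names and method bodies are invented for illustration and do not reproduce SimuCom's actual code templates.

// Hypothetical shape of generated simulation code (illustration only, not SimuCom's output):
// a component becomes a Java class, and each provided service becomes a method whose body
// reflects the resource demands and external calls of the corresponding RD-SEFF.
class GeneratedComponentA {
    private final GeneratedComponentB requiredB;   // wired according to the assembly connectors
    private final SimulatedCpu cpu;                 // resolved from the component's allocation

    GeneratedComponentA(GeneratedComponentB requiredB, SimulatedCpu cpu) {
        this.requiredB = requiredB;
        this.cpu = cpu;
    }

    void operationA() {
        cpu.consume(1000.0);      // InternalAction: issue a resource demand
        requiredB.operationB();   // ExternalCallAction: invoke the required service
    }
}

class GeneratedComponentB {
    void operationB() { /* generated from OperationB's RD-SEFF */ }
}

class SimulatedCpu {
    void consume(double demand) { /* hand the demand over to the simulated resource */ }
}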

As described in Section 2.1.1, large parts of SimuCom adhere to the process-interaction world-view. The behaviour of simulated users as they move through the simulated system is implemented in a process-oriented way, whereas the simulation of resources is event-oriented. As usual in Java, simulated processes are mapped to native threads. That is, whenever a simulated user enters the simulated system, SimuCom spawns a new thread. The same applies to multiple concurrent behaviours of the same user indicated by a ForkAction. That is, each forked behaviour demands a dedicated thread.

Even though SimuCom makes use of the SSJ simulation library (cf. Section 2.2), it does not employ the process-interaction package provided by SSJ. Instead, the SimuCom platform relies on its own implementation.


4. Simulator Development

This chapter describes EventSim, the event-oriented software performance simulator developed in the course of this thesis. In Section 4.1, we propose a distinction between two alternative simulator development approaches and decide on the development process to be used for EventSim. Thereafter, in Section 4.2, we give an abstract overview of the simulation principles underlying our simulator. Sections 4.3 to 4.9 are concerned with the object-oriented design and focus on selected implementation details. Section 4.10 concludes the chapter with a concise presentation of the features supported by EventSim.

4.1 Simulation Development Processes

As introduced in Section 3.2.1, a simulation model is an executable representation of the system under investigation (SUI), and the model's purpose is to imitate the SUI's behaviour. That is, when supplied with the same input, the output of the simulation model is to be consistent with the output of the SUI – relative to the goals of the simulation study.

When developing such a simulation model, there is clearly a gap between the actual system that is to be modelled and the executable system abstraction in the form of a computer program. Bridging this gap involves at least two modelling and development steps. First, a suitable abstraction needs to be found, be it in the mind of the modeller, sketched on paper or modelled in a formal way. Thereby, the conceptual gap is bridged. Second, the abstraction needs to be mapped to simulation code, which could be expressed in a particular simulation language or in a general-purpose programming language. This step bridges the technical gap.

For the purpose of bridging the conceptual gap, the literature most commonly suggests the use of a so-called conceptual model. A conceptual model can be defined as an abstraction of the SUI, which embodies its structure and behaviour while at the same time being unaware of technical details such as a specific programming language, for instance. It has to be noted, however, that there is no established definition or common understanding of what constitutes a conceptual model [Rob06]. In particular, it is unclear whether a conceptual model is expressed by formal means or whether it is merely an informal description, expressed in natural language for instance.

This lack of clarity gives rise to different ways of development. More specifically, the shaping of the development process depends upon the degree of formality provided by the conceptual model.


Consider an informal conceptual model expressed in natural language, for instance. Such a model might be well-suited for documentation purposes or as a means to structure the thoughts of the modeller, but when it comes to the second step – the creation of an executable system representation – the simulation modeller has to perform a manual translation from the abstract conceptual view to concrete program statements resembling the conceptual model. By contrast, a formal conceptual model facilitates an automated transition to an executable simulation model.

Inspired by model-driven software development (see [SVEH07], for example), we refer to the former approach as model-based simulation development. The latter approach is termed model-driven simulation development. Both alternatives of developing a simulation model are sketched below.

The questions that we are especially concerned with in this section are:

• Which process steps are conducted on the way from the actual system towards its executable representation?

• What artefacts emerge in the course of the respective simulation development process?

• Who creates these artefacts? Or more precisely, which role is responsible for creating a specific artefact?

We consider roles as a means to categorise the responsibilities resulting from a development process. The mapping between roles and actual persons can be somewhat arbitrary in that a single person may take multiple or all of the roles; or each role may be assigned to a different person.

As will be stated later on, we adhere to a specific role in the scope of this thesis. Thus, answering the aforementioned questions enables us to name the artefacts that are expected as the outcome of this thesis.

4.1.1 Model-based Simulation Development

The model-based simulation development process is illustrated in Figure 4.1. The whole development process is characterised by an informal conceptual model. This model is created by the system modeller to bridge the conceptual gap. Due to its informal nature, manual effort is required to translate the conceptual model to an executable simulation model. Therefore, the task of the simulation modeller is to create an executable representation of the system which is in line with the conceptual model. In doing this, the simulation modeller commonly makes use of a simulation language or a general-purpose programming language.

Figure 4.1: Model-based simulation development

This development approach imposes a strong coupling between the two roles, as a change in the conceptual model entails a manual change of the simulation model; whenever the abstraction of the real-world system changes, the simulation modeller has to adapt or extend the program code that makes up the simulation model.


Furthermore, the value of the conceptual model is limited mainly to documentation purposes – if it exists on paper at all.

4.1.2 Model-driven Simulation Development

In the course of a model-driven simulation development process, which can be seen in Figure 4.2, the system modeller describes the conceptual model by formal means. As with the previous approach, the conceptual model serves the purpose of bridging the conceptual gap. Opposed to model-based simulation development, however, the formal description allows for an automatic transition from the conceptual model to an executable simulation model.

The conceptual model is not restricted to a specific formalism; one could use Petri nets, queuing networks or some other formalism to describe the system behaviour. However, such formalisms do not allow for a natural modelling of the system, since the modelling takes place at a low abstraction level and no separation between the system's static structure and dynamic behaviour is given. For this reason, we favour the use of a domain-specific language (DSL) to model the system. A well-designed DSL which is specifically developed for performance predictions, for instance, can enable a modelling process that feels more natural to the system modeller, thus reducing the conceptual gap. Therefore, without loss of generality, we assume in the following that the conceptual model is an instance of a DSL.

In contrast to the model-based approach, the need for a DSL in the model-driven development process gives rise to a third role – the domain expert. The task of the domain expert is to create an abstraction of the domain to which the real-world systems under investigation belong, which results in a DSL. Ideally, the DSL covers the whole domain relative to the aims of the simulation study. Notice that it might be necessary to split the role of the domain expert across several more specialised roles such as the domain analyst and the domain architect, as proposed by [SVEH07, pp. 343]. Nevertheless, we adhere to the compound role of the domain expert for the sake of simplicity.

Figure 4.2: Model-driven simulation development using a DSL

The task of the system modeller is similar to that in the model-based approach, except that the conceptual model has to adhere to the syntax and semantics of the DSL.

The role of the simulation modeller, however, differs fundamentally. Instead of creating an executable simulation model from a given conceptual model, the simulation modeller in the model-driven approach is concerned with creating an automatism facilitating the derivation of simulation models from arbitrary conceptual models adhering to the DSL.


Once developed, the automatism enables the system modeller to derive an executable simulation model whenever the conceptual model changes – without the engagement of the simulation modeller.

Two ways of implementing this automatism, i.e. to create an executable simulation model from a formal conceptual model, are generators and interpreters [SVEH07, p. 12]. Both approaches are presented below with a focus on their application to model-driven development of simulations.

4.1.2.1 Generative Approach

Using the generative approach, the simulation process can be divided into two phases. The initial transformation phase generates the simulation model by transforming the conceptual model into an executable representation, such as Java code. This phase is comparable to the compilation of a program written in a high-level programming language, which is to be translated into a language on a lower level. The subsequent simulation phase conducts the simulation by executing the code generated previously. Most commonly, not the whole simulation code is generated; instead, the generated simulation model relies on some simulation infrastructure services provided by a simulation platform. An example of the generative approach is SimuCom (cf. Sections 2.1.1 and 3.5.2).

By adhering to the generative approach, the task of the simulation modeller is to develop the generator – along with the simulation platform, if infrastructure services are supposed to be factored out from the simulation model. Various techniques to implement generators are presented in [SVEH07, pp. 145].

4.1.2.2 Interpretative Approach

By contrast, the interpretative approach does not require a preprocessing of the conceptual model. Instead, the simulator loads the conceptual model at the beginning of a simulation run and passes it to a built-in interpreter. The interpreter traverses the conceptual model in a stepwise manner and executes the code associated with each model element encountered. In this way, the simulation model emerges on the fly from the sequence of code blocks executed by the interpreter. As with the generative approach, the interpreter may rely on a simulation platform.

The task of the simulation modeller is therefore to develop the interpreter and the simulation platform, if intended. Most commonly, a variant of the visitor pattern [Gam95, pp. 331] is used to implement the interpreter.
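The following generic sketch indicates how a visitor-based interpreter of this kind could be organised; it is an illustration of the pattern under invented type names, not EventSim's actual design, which is presented in Section 4.4.

// Generic sketch of an interpretative simulator core using visitor-like double dispatch
// (illustrative type names; EventSim's actual interpreter is described in Section 4.4).
interface ModelElement {
    void accept(ModelVisitor visitor);
}

interface ModelVisitor {
    void visitInternalAction(InternalActionElement action);
    void visitExternalCall(ExternalCallElement call);
}

class InternalActionElement implements ModelElement {
    final double demand;
    InternalActionElement(double demand) { this.demand = demand; }
    public void accept(ModelVisitor visitor) { visitor.visitInternalAction(this); }
}

class ExternalCallElement implements ModelElement {
    public void accept(ModelVisitor visitor) { visitor.visitExternalCall(this); }
}

class InterpretingSimulator implements ModelVisitor {
    public void visitInternalAction(InternalActionElement action) {
        // execute the simulation logic associated with a resource demand
    }
    public void visitExternalCall(ExternalCallElement call) {
        // descend into the behaviour of the called service
    }
    void interpret(Iterable<ModelElement> behaviour) {
        for (ModelElement element : behaviour) {
            element.accept(this);   // the simulation model emerges on the fly
        }
    }
}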

4.1.3 Selecting a Development Process

Now that the alternative simulation development approaches have been introduced and discussed, we motivate the selection of the development process that underlies the simulator implementation presented in the remainder of this chapter.

Our simulator EventSim aims at simulating instances of the Palladio Component Model, which serve as conceptual models. As introduced in Section 3.5, the PCM is a domain-specific language (DSL) enabling the modelling of component-based software systems. In consequence, the conceptual model is formally well-defined, thus allowing for the model-driven simulation development approach.

The tasks associated with the roles of the domain expert and the system modeller need not be addressed in this thesis, since the DSL is already available and no simulation study is to be conducted. Instead, we adhere to the role of the simulation developer in that we provide the system modeller with the means to conduct a simulation study based on the PCM.



The choice between a generative and an interpretative approach is a trade-off between performance and flexibility. While generators are commonly considered superior to interpreters in terms of performance, they lack flexibility; once generated, the simulation model cannot be modified anymore. For example, runtime reconfiguration scenarios as addressed by SLAstic.SIM (cf. Section 2.1.2) are not possible when the system's static structure is generated and fixed thereafter. In order to preserve a high degree of flexibility, we decided in favour of the interpretative approach.

4.2 Simulation Overview

This section provides an overview of how we realised the simulation of component-based software systems abstracted by PCM instances.

The simulation in EventSim is essentially driven by two entities, namely Users and Requests, which demand shared resources while passing through the simulated system. The relation between entities and resources is illustrated in Figure 4.3. Whenever a User issues a call to a system service, the simulation spawns a Request. The Request simulates the service behaviour and terminates thereafter. Simulating the service behaviour means to issue a sequence of resource demands to active or passive resources. Resources are limited in capacity, and multiple Requests may be active at the same time. As a result, Requests compete for scarce resources, causing resource contention. In consequence, if a requested resource is busy with a competitor, a Request might have to wait for its turn, leading to waiting times.

Figure 4.3: Relation between Users, Requests and resources

The presence of multiple Requests in the simulated system, each issuing demands to shared resources, leads to the overall system behaviour that the simulation is to imitate. Observing the entities and resources over the course of a simulation run yields the simulation results. Results are, for instance, a resource's utilisation over time as well as response times. The response time of a system call equals the residence time of the corresponding Request in the simulated system, and the response time of a whole usage scenario can be obtained by calculating the simulated time that it took a simulated User to execute its usage scenario, i.e. to simulate the sequence of system calls constituting the use case.
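Expressed as a simple formula (the notation is ours and not used elsewhere in this thesis), the response time of a usage scenario executed by a simulated User is

\[ RT_{\text{scenario}} = t_{\text{stop}} - t_{\text{start}}, \]

where t_start denotes the simulated point in time at which the User begins the scenario's action chain and t_stop the point in time at which it reaches the Stop action; the response time of a single system call is obtained analogously from the residence time of the corresponding Request.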

The remainder of this section provides a more detailed view on entities and resources.

4.2.1 Users

Users are entities that simulate the usage behaviour captured by UsageScenarios. A UsageScenario in the PCM represents a use case, i.e. the interaction of a class of users with the simulated system [RBK+07, p. 69]. User classes are, for instance, administrators and regular system users.


Use cases, or UsageScenarios, respectively, are modelled by chains of actions of the type AbstractUserAction as introduced in Section 3.5.1.7.

Simulating a use case means to follow a path from the Start action through to the Stop action while executing the simulation logic associated with each action encountered on the path. This procedure of passing through a chain of actions is in what follows referred to as traversal.

Usually, there is more than a single path, as the PCM provides control flow constructs, such as Loops and Branches. As a result, the simulated behaviour of two Users may differ, even though they simulate exactly the same UsageScenario.

Whenever a User encounters an EntryLevelSystemCall, it spawns a Request, as can be seen in Figure 4.4. Then, the User waits for the simulated system call to return control; it must not move on to the next action before the Request has finished its execution. Thus, the Requests of a specific User are executed one after the other, except for ForkedRequests presented below.

Figure 4.4: Example of a User issuing a system call

4.2.2 Requests

Requests are entities that simulate the behaviour of system calls issued by simulated Users. For this, a Request starts at the system's provided role corresponding to the called service, follows the delegation connector towards the providing component, and simulates the component behaviour relating to the requested service. The component behaviour is modelled by a chain of actions of the type AbstractAction contained in a ResourceDemandingSEFF (cf. Section 3.5.1.3).

Simulating a Request means to traverse the chain of actions as introduced above. Whenever a Request encounters an ExternalCallAction, it simulates the behaviour of the called component service before continuing the traversal of the calling service. In this way, the chain of actions traversed by a Request is not restricted to a single RD-SEFF. Instead, the simulated control flow may span multiple components, as can be seen in Figure 4.5.

When a Request discovers an InternalAction, it issues the according resource demand and blocks until the requested active resource has served the demand. A Request behaves in the same manner when running across an AcquireAction, i.e. it does not continue the traversal until the instances of the according passive resource have been granted.

Whenever a Request comes across a ForkAction, it spawns a ForkedRequest for each ForkedBehaviour. A ForkedRequest is a subclass of Request, which is why all statements related to Requests also apply to ForkedRequests. The ForkedRequests relating to a certain ForkAction are simulated concurrently in order to account for the semantics of ForkActions.


Figure 4.5: Example of a Request spanning two components (assuming that OperationA and OperationB are provided by different components)

4.2.3 Resources

Two types of resources are distinguished in the PCM: active resources, such as processors and storage devices, on the one hand, and passive resources, such as database connection pools or semaphores, on the other hand.

Active resources are demanded by Requests, where the resource demand is expressed by a certain amount of processing time. Dividing the resource demand by the resource's processing rate yields the duration of the resource demand in terms of simulated time. However, due to resource contention between concurrent Requests, the simulated time actually required for serving the Request is commonly larger than the resource demand calculated above. Requests wait until their resource demand is served before they move on. Waiting for resources is reflected by an advance in simulation time.
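For illustration (the numbers are chosen freely), a demand of d = 3000 abstract work units issued to a resource with a processing rate of r = 1000 units per second of simulated time corresponds to a pure processing duration of

\[ t_{\text{processing}} = \frac{d}{r} = \frac{3000}{1000} = 3 \text{ simulated seconds.} \]

Under contention, the time a Request resides at the resource additionally includes the time spent waiting behind competing Requests.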

Passive resources are acquired by Requests, held for a certain amount of time and finally released. Passive resources have a limited capacity, which decreases whenever a Request is granted a resource instance. Conversely, the available capacity increases when a Request releases a passive resource instance. If a Request tries to acquire a resource instance although no instances are available, it has to wait until the passive resource can serve the acquisition. The waiting time is reflected by an advance in simulated time.
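The following minimal Java sketch captures these acquire-and-release semantics; it illustrates the behaviour described above and is not the actual EventSim resource implementation.

import java.util.ArrayDeque;
import java.util.Queue;

// Minimal sketch of passive-resource semantics: the capacity is decremented on acquire and
// incremented on release; acquisitions that cannot be served immediately are queued and
// granted once capacity becomes available again (illustrative only).
class PassiveResourceSketch {
    private int available;                                       // remaining capacity
    private final Queue<Runnable> waiting = new ArrayDeque<>();  // blocked acquisitions

    PassiveResourceSketch(int capacity) {
        this.available = capacity;
    }

    /** Tries to acquire one instance; if none is available, the continuation is queued. */
    void acquire(Runnable onGranted) {
        if (available > 0) {
            available--;
            onGranted.run();          // the Request continues its traversal immediately
        } else {
            waiting.add(onGranted);   // the waiting time manifests as simulated-time advance
        }
    }

    /** Releases one instance and, if possible, hands it over to the next waiting Request. */
    void release() {
        Runnable next = waiting.poll();
        if (next != null) {
            next.run();               // pass the freed instance on directly
        } else {
            available++;
        }
    }
}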

4.3 Preprocessing the Static Structure

The meta-model of the PCM has been designed to integrate well into a component-based modelling process, which is commonly distributed across several developer roles (cf. Section 3.5.1.1). An instance of the PCM therefore comprises multiple partial models, each of which captures a part of the modelled system from the associated role's perspective. As a result, however, many related aspects of the simulated system are scattered over multiple partial models.

Consider, for instance, a resource demand (InternalAction) contained in the SEFF describing a component service. Due to the distinction between the role of a component developer and that of a system deployer, the former role is not aware of the resource container on which the latter role will deploy the component. Instead, the component developer refers to an abstract resource type, such as a CPU or an HDD resource, when specifying the component's resource demand.


Therefore, in order to obtain the concrete resource, it is necessary to take the component's allocation to the according resource container into account.

This example can only provide a rough impression of the great number of repeated lookups required when interpreting a PCM model. Such verbose simulation code would not only affect the simulation performance, but also the maintainability of the underlying simulator. We therefore conclude that the way in which the PCM meta-model captures a system's static structure is not well suited for use in conjunction with an interpretative simulation approach.

For this reason, we developed an object-oriented representation specifically tailored to enable an efficient interpretation-based simulation. This representation can be regarded as a façade of the underlying PCM model, as depicted in Figure 4.6. Just as the PCM model itself, the façade provides access to the simulated system's static structure. Opposed to direct model access, however, a single call to the façade usually suffices to provide the simulation with a certain piece of information relating to the system's structure.

In a preprocessing step, the static structure façade is created once before the simulation starts. Despite the presence of the façade, the PCM model can still be accessed directly if required – for instance, if the façade does not capture a specific aspect of the static structure.
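The intended effect can be sketched as follows; the chained call corresponds to the façade access described in Section 4.4.2.3, while the surrounding class and the placeholder interfaces are merely illustrative.

// Sketch of a façade lookup: with the static structure façade in place, resolving the
// concrete resource for an abstract resource type is a single chained call instead of a
// navigation across repository, system, allocation and resource environment models.
class FacadeLookupExample {
    SimActiveResource resolveResource(ComponentInstance component, ResourceType type) {
        return component.getResourceContainer().getResourceByType(type);
    }
}

// Placeholder declarations standing in for the façade types shown in Figure 4.7.
interface ComponentInstance { SimulatedResourceContainer getResourceContainer(); }
interface SimulatedResourceContainer { SimActiveResource getResourceByType(ResourceType type); }
interface SimActiveResource { }
interface ResourceType { }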

Figure 4.6: The static structure façade provides optimised access for interpretative simulations to the simulated system's static structure

The object-oriented design of the static structure façade can be seen in Figure 4.7. A simulated System comprises one or more ComponentInstances, which are in fact simulated deployment instances (cf. Section 3.3). Each component instance provides at least one interface, which is reflected by ProvidedRoleInstances. More precisely, a role instance connects a component instance with an interface; whether this interface is provided or required is determined by the type of the role instance. Moreover, the role concept allows for providing or requiring a specific interface more than once by the same component. As already indicated, a component instance may also require interfaces by means of RequiredRoleInstances. Two role instances can be connected with each other if both refer to the same interface and one of the roles is a ProvidedRoleInstance and the other one a RequiredRoleInstance. Connecting two role instances actually means connecting their associated component instances. That is, the component instance associated with the required role instance uses services of the component instance associated with the provided role instance.

As the name suggests, a deployment component instance is deployed on a SimulatedResourceContainer. A resource container provides various active resources (SimActiveResources) to component instances.


Figure 4.7: Object-oriented design of the static structure façade

Opposed to active resources, passive resources (SimPassiveResources) are not provided by the resource environment but are directly associated with the component.

Finally, a component instance provides a ResourceDemandingSEFF for each provided service. The RD-SEFF for a service can be obtained by invoking the getSEFF method with the corresponding OperationSignature as a parameter. Both interfaces, the ResourceDemandingSEFF as well as the OperationSignature, are imported from the PCM meta-model.

4.4 Interpreting the Dynamic Behaviour

The purpose of our simulator is to imitate the behaviour of component-based systems expressed by PCM models. In doing so, the simulator should adhere to the interpretative simulation approach, as motivated in Section 4.1.3.

An instance of the PCM contains two types of behavioural descriptions. The system behaviour is captured by a number of ResourceDemandingSEFFs, and the usage of the system is described by one or more UsageScenarios. Both types of behavioural descriptions are modelled in a similar way, inasmuch as they capture simulated users and their requests as they flow through the system by a chain of actions.

Opposed to the way the PCM describes the static structure of a system, the behavioural description is well-suited for an interpretative simulation approach and gives rise to an intuitive way of simulation. Beginning with the start action, an interpreter moves along the chain of actions and simulates each action encountered on its way. Once the stop action is reached, the behaviour has been simulated completely. This interpretative approach forms the basis of our simulator implementation.


In the remainder of this section, we first describe the various events that initiate the traversal of action chains. Then, we cover the implementation of the interpreter in detail. We conclude the section with an example that illustrates how the various parts described before fit together.

4.4.1 Events

Events are essential building blocks of each event-oriented simulation, as introduced in Section 3.2.1. Each event encapsulates a piece of the simulation logic, and the overall simulation logic is yielded by executing several events in succession. When building a simulation based on events, the simulation modeller is confronted with many degrees of freedom arising from the identification of suitable events. The simulation could be built from many fine-grained events, it could be realised by few large-grained events that cover wide parts of the simulation logic, or by using an event granularity in between. The only restriction that the event-oriented simulation paradigm imposes on the simulation modeller is that the simulation time may advance solely between events, but never within an event. The size of an event is therefore constrained by the need to advance the simulation time, as no time advance may occur while processing an event. Thus, when executing an event and the simulation time needs to be advanced, another event has to be scheduled. As a result, the extent of an event is at its maximum if the event does not return control before a time advance is required.

For the purpose of driving our simulation, we identified four events with maximum extent. In this way, the number of scheduled events is minimised, resulting in better simulation performance, since it is more expensive to instruct the event scheduler to execute a piece of code encapsulated by an event than to invoke the corresponding code directly. The BeginUsageTraversalEvent starts the simulation of a usage behaviour by traversing the action chain of a UsageScenario until a time advance is required. This is the case whenever the traversal comes across a Delay action. Then, the current event schedules a ResumeUsageTraversalEvent to occur in the future according to the time delay specified by the Delay action. If an EntryLevelSystemCall is encountered, the present event (a BeginUsageTraversalEvent or a ResumeUsageTraversalEvent) determines the SEFF that describes the system call's behaviour and schedules a new BeginSeffTraversalEvent whose task is to traverse the SEFF. Whenever the simulation time needs to be advanced, the present event schedules a ResumeSeffTraversalEvent and comes to an end. A time advance while processing a SEFF is required when active or passive resources are accessed.
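The following condensed sketch illustrates how such an event hands the control flow over to a successor event once a time advance is required; the scheduler interface and the event bodies are simplified placeholders, not the actual EventSim classes.

// Condensed sketch of the event structure described above (placeholder types, not EventSim code).
interface EventScheduler {
    void schedule(SimulationEvent event, double delay);   // execute 'event' after 'delay' simulated time units
}

abstract class SimulationEvent {
    /** Simulation logic executed when the event occurs; no simulated time passes inside. */
    abstract void execute(EventScheduler scheduler);
}

class BeginUsageTraversalEventSketch extends SimulationEvent {
    @Override
    void execute(EventScheduler scheduler) {
        // traverse the usage scenario's action chain until a time advance is required;
        // on a Delay action, schedule a resume event after the specified delay and stop
        double delay = 5.0;   // e.g. drawn from the Delay action's specification
        scheduler.schedule(new ResumeUsageTraversalEventSketch(), delay);
    }
}

class ResumeUsageTraversalEventSketch extends SimulationEvent {
    @Override
    void execute(EventScheduler scheduler) {
        // continue the traversal at the action following the Delay
    }
}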

Further events are concerned with simulating the behaviour of resources, which is, however, not within the scope of this thesis.

4.4.2 Behaviour Interpreter

For the traversal of action chains, events rely on the behaviour interpreter, whose task is to traverse action chains and to execute the simulation logic associated with each action in the chain. The static structure of the interpreter is covered below. Thereafter, we focus on the central traversal procedure that coordinates the interpretation process, before we take a detailed look at the building blocks utilised in the traversal procedure.

4.4.2.1 Static Structure

Figure 4.8 shows a view on the static structure of the interpreter that depicts the most important aspects while at the same time hiding some less relevant details. The abstract BehaviourInterpreter class provides an abstract method named traverse that, beginning with an initial action, passes through a chain of actions and executes the simulation logic associated with each action until the simulation time needs to be advanced.


Figure 4.8: Static structure of the behaviour interpreter

Below, we also refer to the traverse method as the traversal procedure.

The traversal procedure is supposed to be applicable to both the traversal of UsageScenarios and the traversal of ResourceDemandingSEFFs, which is why the traverse method has to accept both usage actions (implementing the AbstractUserAction interface) and SEFF actions (implementing the AbstractAction interface). Usually, this would be realised by having a common supertype of both interfaces, which would then be accepted by the traverse method. However, the sole common supertype of the two interfaces is the Entity interface, which does not provide method signatures enabling the navigation from one action to another.

In order to still be able to use a common traversal procedure for both types of action chains, we utilise Java generics. The BehaviourInterpreter class is parametrised by the formal type parameter A that indicates the type of the actions which are to be traversed. In this way, the parametrised type BehaviourInterpreter<AbstractUserAction> provides the method traverse(AbstractUserAction firstAction); likewise, the parametrised type BehaviourInterpreter<AbstractAction> offers the method traverse(AbstractAction firstAction). The class UsageBehaviourInterpreter extends the former type; the class SeffInterpreter extends the latter one, thus providing an interpreter for both types of action chains whose traversal procedure is described by a common superclass.

The simulation logic associated with the individual actions is encapsulated by strategies (cf. [Gam95, pp. 315]) that implement the ITraversalStrategy interface. Each concrete strategy provides the simulation logic associated with a specific type of action by implementing the traverse method. In doing so, the BehaviourInterpreter can delegate the simulation of an action to the corresponding strategy and needs no knowledge of which steps are necessary to simulate a specific action; it simply calls the traverse method on the suitable strategy.

As the BehaviourInterpreter is abstract and, as stated before, is supposed to be applicable to both types of action chains, it cannot be aware of which strategies are available for certain types of actions. Instead, the subclasses UsageBehaviourInterpreter and SeffInterpreter maintain the mapping between action types and strategies, which is denoted by a composition association. The abstract method obtainTraversalStrategy has to be implemented by subclasses of BehaviourInterpreter and enables the traversal procedure to query the subclasses for appropriate traversal strategies. In this respect, the traverse method offered by the BehaviourInterpreter class is a template method (cf. [Gam95, pp. 325]).
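The following condensed sketch summarises this design; the type parameters and method bodies are simplified, the traversal-instruction indirection of Listing 4.1 is collapsed into a direct return value, and the placeholder interfaces merely stand in for the PCM and EventSim types named in Figure 4.8.

import java.util.HashMap;
import java.util.Map;

// Condensed structural sketch of the interpreter design (simplified placeholders, not the actual code).
interface Entity { }
interface AbstractUserAction extends Entity { }   // usage model actions
interface AbstractAction extends Entity { }       // SEFF actions
class TraversalState { }

interface ITraversalStrategy<A extends Entity> {
    /** Simulates one action and returns the action to be traversed next, or null to stop. */
    A traverse(A action, TraversalState state);
}

abstract class BehaviourInterpreter<A extends Entity> {
    // template method pattern: subclasses supply the mapping from action types to strategies
    protected abstract ITraversalStrategy<A> obtainTraversalStrategy(Class<?> actionType);

    protected void traverse(A firstAction, TraversalState state) {
        A current = firstAction;
        while (current != null) {
            current = obtainTraversalStrategy(current.getClass()).traverse(current, state);
        }
    }
}

class UsageBehaviourInterpreter extends BehaviourInterpreter<AbstractUserAction> {
    private final Map<Class<?>, ITraversalStrategy<AbstractUserAction>> strategies = new HashMap<>();

    protected ITraversalStrategy<AbstractUserAction> obtainTraversalStrategy(Class<?> actionType) {
        return strategies.get(actionType);
    }
}

class SeffInterpreter extends BehaviourInterpreter<AbstractAction> {
    private final Map<Class<?>, ITraversalStrategy<AbstractAction>> strategies = new HashMap<>();

    protected ITraversalStrategy<AbstractAction> obtainTraversalStrategy(Class<?> actionType) {
        return strategies.get(actionType);
    }
}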


4.4.2.2 Traversal Procedure

From here on, we focus on the implementation of the traverse method provided by the BehaviourInterpreter class, which we also refer to as the traversal procedure. As shown in Listing 4.1, the traversal of action chains begins with an initial action that is passed to the traversal procedure. Additionally, the current traversal state (cf. Section 4.4.2.4) is passed to the procedure, which is empty if the initial action is a start action. The traversal procedure primarily consists of three steps that are repeated until either the stop action is reached or the simulation time needs to be advanced. As the traversal procedure is usually employed by events and no simulated time may pass in an event, the traversal procedure has to be interrupted whenever a time advance is required.

The traversal starts with obtaining a traversal strategy (cf. Section 4.4.2.3) that is suitable for simulating the current action. In the second step, this strategy is used to simulate the behaviour related to the current action, which yields a traversal instruction. The traversal instruction (cf. Section 4.4.2.5) encapsulates the knowledge of how to proceed with the traversal, i.e. which action is next and how the traversal state needs to be modified before moving on to the next action. Then, in the third step, the traversal state is modified in a transparent way by invoking the process method on the traversal instruction. This does not only prepare the state for the next action, but also returns the next action. Finally, the next action is set to be the current action and the next iteration begins. If the current action is null, the traversal procedure terminates. For the sake of simplicity, Listing 4.1 omits the statements associated with notifying traversal listeners (cf. Section 4.4.2.6).

Listing 4.1: Simplified Java code of the traversal procedure

protected void traverse(A firstAction, TraversalState state) {
    A currentAction = firstAction;
    while (currentAction != null) {
        // obtain the traversal strategy for the current action
        ITraversalStrategy strategy = obtainTraversalStrategy(getClass(currentAction));

        ...

        // simulate the current action and receive an instruction
        // of how to proceed with the traversal
        ITraversalInstruction instruction = strategy.traverse(currentAction, state);

        // set the action to be traversed next according to the
        // instruction provided by the current traversal strategy
        A nextAction = instruction.process(state);
        currentAction = nextAction;

        ...
    }
}

4.4.2.3 Traversal Strategies

A traversal strategy encapsulates the simulation logic for a specific action type, i.e. for an implementation of the AbstractUserAction interface in the case of a usage action, or of the AbstractAction interface in the case of a SEFF action. Whenever the traversal procedure encounters an action of a certain type, it delegates the execution to the corresponding traversal strategy. This strategy then handles the action by simulating its effect on the system under simulation, i.e. it changes the system state according to the action under traversal.


When finished, the traversal strategy returns control to the traversal procedure, along with an instruction on how to proceed with the traversal. This instruction includes the information on which action is to be traversed next.

Below, we describe the two traversal strategies that are responsible for simulating InternalActions and BranchActions.

Simulating InternalActions

The InternalActionTraversalStrategy simulates SEFF actions of the type InternalAction. InternalActions are used to specify resource demands of a service between successive calls to required services [RBK+07, p. 52]. For this, each InternalAction references one or more active resource types, whose respective resource demand is described by means of a stochastic expression (cf. Section 4.6).

The first step in the simulation of an InternalAction is to evaluate the stochastic expression, i.e. to draw a sample from the probability distribution described by the stochastic expression. Thereafter, the strategy looks up the concrete resource of the specified type; the concrete resource is provided by the resource container on which the current component is deployed. Using the static structure façade, the resource can be obtained in a straightforward way by calling component.getResourceContainer().getResourceByType(resourceType). The current component is the component whose service is described by the SEFF under traversal; it can be retrieved from the traversal state. Having both the resource demand and the corresponding resource at hand, the current request can be enqueued at the resource.

At this point, the simulation time needs to be advanced in dependence upon the waiting and processing time induced by the demand. This is, however, not the task of the traversal strategy. Instead, it instructs the traversal procedure to interrupt the traversal. As soon as the resource has served the request, a new event is created to resume the traversal at the advanced simulation time.

If the InternalAction specifies more than one resource demand, the traversal procedure invokes the InternalActionTraversalStrategy again in order to process the next demand. Otherwise, the traversal moves on to the next action.
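To make these steps concrete, the following sketch outlines how such a strategy might be implemented. It is an illustration only, not the actual EventSim code; the SimulatedResource type, the TraversalState accessors and the helpers demandSpecificationOf, resourceTypeOf and evaluateStochasticExpression are assumptions standing in for the corresponding PCM and EventSim classes and calls.

// Sketch of the InternalActionTraversalStrategy (simplified; helper names are assumptions)
public class InternalActionTraversalStrategy implements ITraversalStrategy {

    public ITraversalInstruction traverse(AbstractAction action, TraversalState state) {
        InternalAction internalAction = (InternalAction) action;

        // 1) evaluate the stochastic expression, i.e. draw a concrete demand value
        //    from the probability distribution it describes
        double demandValue = evaluateStochasticExpression(
                demandSpecificationOf(internalAction), state);

        // 2) look up the concrete resource via the static structure facade; the
        //    current component is retrieved from the traversal state
        SimulatedResource resource = state.getComponent()
                .getResourceContainer()
                .getResourceByType(resourceTypeOf(internalAction));

        // 3) enqueue the current request at the resource; once the resource has
        //    served the demand, an event resumes the traversal at the advanced time
        resource.enqueue(state.getRequest(), demandValue);

        // 4) no simulated time may pass within an event, so interrupt the traversal
        return new InterruptTraversal();
    }

    // demandSpecificationOf, resourceTypeOf and evaluateStochasticExpression are
    // hypothetical helpers; their bodies are omitted in this sketch
}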

Simulating BranchActions

The BranchActionTraversalStrategy is responsible for simulating SEFF actions of the type BranchAction. As branches may also be used in UsageScenarios, there is another traversal strategy named BranchTraversalStrategy (notice the absence of the term Action here). Although we focus on the former strategy here, most of what is stated below also applies to the latter. A BranchAction splits the control flow, where exactly one of the alternatives, which are also called transitions, is taken [RBK+07, p. 52]. How the strategy selects the alternative to execute depends on the type of the alternative. An alternative is either a ProbabilisticBranchTransition or a GuardedBranchTransition, where all the alternatives contained in a specific branch must be of the same type.

In the case of ProbabilisticBranchTransitions, each alternative is associated with a probability p_i, and the probabilities of all alternatives p_1, ..., p_n sum up to one. The traversal strategy draws a random sample r greater than or equal to 0 and less than 1 using a pseudo-random number generator and takes the first alternative if r ∈ [0, p_1). More generally, the i-th alternative is taken if r ∈ [p_1 + p_2 + ... + p_{i-1}, p_1 + p_2 + ... + p_{i-1} + p_i). After having determined the alternative in this way, the traversal strategy instructs the traversal procedure to traverse the ResourceDemandingBehaviour associated with the respective alternative.
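The selection scheme for ProbabilisticBranchTransitions can be illustrated by the following self-contained sketch (not the actual EventSim code):

// Given the transition probabilities p_1, ..., p_n (summing up to one) and a
// random sample r drawn from [0, 1), returns the index of the selected alternative.
static int selectAlternative(double[] probabilities, double r) {
    double cumulative = 0.0;
    for (int i = 0; i < probabilities.length; i++) {
        // the i-th alternative covers the interval [p_1 + ... + p_(i-1), p_1 + ... + p_i)
        cumulative += probabilities[i];
        if (r < cumulative) {
            return i;
        }
    }
    // guard against rounding errors in the probability sum
    return probabilities.length - 1;
}

For a branch with two alternatives of probabilities 0.2 and 0.8, for instance, selectAlternative(new double[] {0.2, 0.8}, r) returns 0 whenever r < 0.2 and 1 otherwise.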


If the branch contains GuardedBranchTransitions, each alternative is associated with a condition specified by a boolean stochastic expression. The traversal strategy then takes the alternative whose condition evaluates to true. For this purpose, the strategy loops through the conditions, evaluates each stochastic expression and takes the current alternative if the corresponding condition is true. As before, the strategy instructs the traversal to continue with the ResourceDemandingBehaviour of the selected alternative.

4.4.2.4 Traversal State

While traversing action chains and simulating the effects of actions on the simulated system, the interpreter needs to be aware of the traversal progress, or, in more general terms, of the state of the traversal. The traversal state essentially comprises the action that has just been simulated (the previous action), the action that is being processed at the moment (the current action), and a reference to the component instance whose behaviour is simulated by the present action chain. If the interpreted action chain captures the behaviour of a user, the reference to the component is not defined.

Furthermore, some traversal strategies need to maintain a state as well. Consider, for instance, the traversal strategy that simulates either a Loop or a LoopAction. In addition to the state variables just described, the respective traversal strategy also needs to maintain a counter which indicates the current iteration count. Otherwise, the iteration count would get lost between two successive loop iterations. We denote such a state maintained by a traversal strategy itself as an internal state.

Traversal strategies can add their internal state to the overall traversal state, from where it can be retrieved later on. For this, the state has to be an instance of a class implementing the ITraversalStrategyState interface. However, since states are specific to the respective traversal strategy, this interface does not impose any restrictions on how the state has to be implemented; instead, the interface is empty and, as such, merely serves as a marker interface.
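The following sketch shows the empty marker interface together with a hypothetical internal state of a loop traversal strategy (the LoopState class and its members are illustrative and not the actual EventSim code):

// Marker interface for internal states of traversal strategies; intentionally empty.
public interface ITraversalStrategyState {
}

// Hypothetical internal state of a loop traversal strategy, keeping track of the
// current iteration so that it survives between two successive loop iterations.
public class LoopState implements ITraversalStrategyState {

    private final int totalIterations;
    private int currentIteration = 0;

    public LoopState(int totalIterations) {
        this.totalIterations = totalIterations;
    }

    public boolean hasMoreIterations() {
        return currentIteration < totalIterations;
    }

    public void incrementIteration() {
        currentIteration++;
    }
}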

As introduced in Section 3.5, actions may be hierarchically nested. Examples are Loops and Branches. In consequence, it does not suffice to maintain a single state only when traversing the hierarchy of actions. Instead, the traversal state needs to maintain a set of state variables for each level of the hierarchy. Consider a nested Loop action, for instance, which is contained in another Loop; in the presence of a single state only, the traversal strategy corresponding to the inner Loop would override the iteration count of the enclosing Loop. This is why we modelled the traversal state as a stack of so-called stack frames.

Each stack frame contains the state variables as described above and an arbitrary number of internal states. Before starting the traversal of a nested action chain, i.e. before entering a lower level of hierarchy, a new stack frame is pushed on the stack. State variables that are covered by other stack frames are not accessible anymore, but are preserved for later recovery. As soon as all actions on a hierarchy level are simulated, the topmost stack frame is removed from the stack, which reveals the underlying stack frame. In this way, the previously covered state variables are restored.
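A minimal sketch of such a stack-based traversal state is given below; the StackFrame class and the method names are illustrative, not the actual EventSim declarations:

import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of a traversal state organised as a stack of stack frames.
public class TraversalState {

    // minimal frame; in EventSim it would additionally hold the previous and current
    // action, the current component and the internal states of traversal strategies
    public static class StackFrame {
    }

    private final Deque<StackFrame> stack = new ArrayDeque<StackFrame>();

    // called before entering a nested action chain, i.e. a lower hierarchy level
    public void pushStackFrame() {
        stack.push(new StackFrame());
    }

    // called as soon as all actions of the current hierarchy level have been simulated;
    // reveals the underlying frame and thus restores the previously covered variables
    public void popStackFrame() {
        stack.pop();
    }

    // state variables of the hierarchy level that is currently being traversed
    public StackFrame currentFrame() {
        return stack.peek();
    }
}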

4.4.2.5 Traversal Instructions

The responsibility of traversal strategies is not only to simulate the behaviour of a specific action as described in Section 4.4.2.3, but also to provide the traversal procedure with the next action to be traversed. Beyond that, traversal strategies are also expected to prepare the traversal state for the next action. Both additional tasks are motivated in the following.


At first glance, the need for passing the next action to the traversal procedure seems odd, since one might assume that the traversal procedure has this knowledge already. This is true in simple cases, where the next action is known to the traversal procedure without relying on a traversal strategy. An example is the simulation of a Delay action. Regardless of the Delay’s simulation logic, the next action to be traversed is the successor of the Delay action. More generally, a “simple case” is an action that does not encapsulate other actions by nested behaviours.

In more complex cases, however, when an action contains a nested behaviour, the decision of which action is to be traversed next depends on the simulation logic associated with the present action. Consider a Branch, for instance, which encapsulates two control flow alternatives, each with a transition probability of 0.5. The choice for one of the control flow alternatives is the responsibility of the corresponding traversal strategy, which bases its decision on a random number generated by a pseudo-random number generator. Thus, the selected control flow alternative cannot be known to the traversal procedure in advance. In consequence, the traversal strategy needs to provide the next action to the traversal procedure, in dependence upon the selected alternative.

Similar to the discussion above, the need for preparing the traversal state for the next action is motivated by nested behaviours as well. As described in Section 4.4.2.4, a stack frame has to be pushed on the traversal state stack before simulating a nested behaviour. Likewise, the frame has to be removed from the top of the stack as soon as the nested behaviour has been simulated completely. This task has to be done by the respective traversal strategy. When simulating a Loop action, for instance, the LoopTraversalStrategy first ensures that the state stack is prepared accordingly before it instructs the interpreter to move on to the first action in the encapsulated behaviour.

Managing the state manually is, from a developer’s perspective, a tedious and error-prone task which complicates not only the implementation of traversal strategies, but also their maintenance thereafter. For this reason, the state handling is encapsulated by traversal instructions, whereby the associated complexity is hidden from the traversal strategies. In addition, traversal instructions carry the next action to be traversed. This bundle, consisting of the state handling functionality along with the next action, can then be provided to the traversal procedure. There, the traversal state stack is prepared in a first step using the instruction’s state handling functionality. In the second step, the traversal procedure obtains the next action from the traversal instruction and moves on accordingly.

In other words, a traversal instruction encapsulates the knowledge of a traversal strategy on how the traversal procedure is supposed to continue. This knowledge is divided into two parts. The first part is the information on which action is to be traversed next; the second part is a mechanism that allows for preparing the traversal state stack for the next action.

A traversal instruction implements the interface ITraversalInstruction, which comprises a single method named process. Provided with the current traversal state, the process method first prepares the passed state for the next action, and then returns the next action to the caller.
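In Java, the instruction interface can be sketched as follows (based on the description above and on Listing 4.1; the exact declaration in EventSim may differ, e.g. with respect to generics):

// Sketch of the traversal instruction interface; A is the action type, i.e.
// AbstractUserAction for usage behaviours or AbstractAction for SEFFs.
public interface ITraversalInstruction<A> {

    // Prepares the passed traversal state for the next action (e.g. by pushing or
    // popping a stack frame) and returns the action to be traversed next; a null
    // return value signals that the traversal is interrupted or has ended.
    A process(TraversalState state);
}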

Developers of traversal strategies can choose between six traversal instructions:

TraverseNextAction Instructs the traversal procedure to continue the traversal with a specified action while using the present stack frame.

TraverseResourceDemandingBehaviour Instructs the traversal procedure to continue the traversal with a specified nested ResourceDemandingBehaviour. For this, the traversal instruction pushes a new frame onto the traversal state stack and sets the next action to the StartAction of the nested behaviour (see the sketch following this list).


TraverseScenarioBehaviour This traversal instruction serves the same purpose as the TraverseResourceDemandingBehaviour, with the difference that the nested behaviour expresses the usage behaviour, i.e. is a ScenarioBehaviour.

TraverseAfterLeavingScope Instructs the traversal procedure to continue the traversal after a nested behaviour has been traversed completely. In doing so, the traversal instruction removes the topmost frame from the stack, which reveals the frame underneath.

InterruptTraversal Instructs the traversal procedure to interrupt the traversal. By interrupting the traversal, the simulation time can be advanced by scheduling the resumption of the traversal at a later time instant. When resuming the traversal, the traversal procedure is provided with the traversal state, which allows for a seamless continuation of the traversal.

EndTraversal Instructs the traversal procedure that the action chain has been traversed completely, which is the case if the present action is a stop action and if there is only a single frame left on the traversal state stack.
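As an illustration, the TraverseResourceDemandingBehaviour instruction could be realised along the following lines (a sketch only; the constructor and the stack operation, which follows the TraversalState sketch from Section 4.4.2.4, are assumptions):

// Sketch of the instruction that descends into a nested ResourceDemandingBehaviour.
public class TraverseResourceDemandingBehaviour implements ITraversalInstruction<AbstractAction> {

    private final AbstractAction startOfNestedBehaviour;

    public TraverseResourceDemandingBehaviour(AbstractAction startOfNestedBehaviour) {
        this.startOfNestedBehaviour = startOfNestedBehaviour;
    }

    public AbstractAction process(TraversalState state) {
        // entering a lower hierarchy level: preserve the current state variables by
        // pushing a new frame onto the traversal state stack
        state.pushStackFrame();

        // continue the traversal with the StartAction of the nested behaviour
        return startOfNestedBehaviour;
    }
}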

4.4.2.6 Traversal Listener

While traversing a simulated control flow, the traversal procedure can notify interested objects about the traversal progress. For this purpose, we used the observer design pattern [Gam95, pp. 293]. Whenever an action is about to be traversed or has been traversed completely, the traversal procedure sends out a notification. Objects whose classes implement the ITraversalListener interface can be registered with the traversal procedure and from this point on receive these notifications.
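Assuming one callback before and one after the traversal of an action, the listener interface can be sketched as follows (the actual method names and parameters in EventSim may differ; as noted below, the real callbacks additionally receive the entity that issued the traversal):

// Sketch of a traversal listener; A is the action type of the observed behaviour.
public interface ITraversalListener<A> {

    // invoked immediately before the given action is traversed
    void beforeTraversal(A action, TraversalState state);

    // invoked as soon as the given action has been traversed completely
    void afterTraversal(A action, TraversalState state);
}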

The application range of traversal listeners is illustrated by the following examples.

Trace recorder For debugging purposes, we implemented a trace recorder that logs, besides the present simulation time, the order in which the actions of the various action chains are visited. This provides a means for the simulation developer to conduct an extensive analysis of the traversal progress if the simulation does not behave as expected.

Slowdown listener For the same purposes, we implemented a slowdown listener, which, whenever an action is about to be traversed, prints the pending action to the console and pauses the simulation for a while. In this way, the simulation process is slowed down, thus enabling the simulation developer to observe the simulation while it is operating.

Lightweight simulator extension Another interesting application of traversal listeners is the implementation of a lightweight simulator extension. Whenever a traversal listener is notified, it is passed not only the action that is about to be traversed or that has just been traversed, but also the entity that issued the traversal along with the current traversal state. Based on this information, a traversal listener can enrich the simulation logic, for example by changing the traversal state. Two examples of this extension mechanism are the handling of parametric dependencies (cf. Section 4.6) and the integration of a measurement framework (cf. Section 4.8).

4.4.3 Simulation Run Example

In this subsection, we illustrate how the building blocks described above work together when simulating the PCM model described in Example 1.


Example 1. The UsageScenario depicted in Figure 4.9 describes a user that starts its lifecycle, causes a delay of 250 simulated time units and stops its lifecycle without interacting with a system. As UsageScenarios in the PCM are designed to describe the interaction between users and the system, this example may not seem reasonable from the perspective of a performance analyst. It is, however, well-suited to illustrate the interaction between the various classes when traversing a simulated control flow.

Figure 4.9: UsageScenario of Example 1 (a ClosedWorkload with Population = 1 and ThinkTime = 0 drives a ScenarioBehaviour containing a single Delay action whose TimeSpecification is 250)

As can be seen in Figure 4.10, the simulation starts with generating an initial user population in accordance with the workload specification. In our example, the ClosedWorkloadGenerator spawns a single user. The simulation of the usage behaviour starts with the creation of a BeginUsageTraversalEvent, which is then scheduled to occur at a simulation time of 0. Now that there is a pending event in the event list, the EventScheduler starts its work and calls the event routine on the first event, which is the BeginUsageTraversalEvent. The event routine then delegates the task of interpreting the usage behaviour to the UsageBehaviourInterpreter. Notice at this point the UML InteractionUse element [omg07, pp. 487] that integrates the sequence diagram depicted in Figure 4.11.

The UsageBehaviourInterpreter begins the interpretation with the start action. In a first step, the traverse template method obtains a suitable traversal strategy by passing the action’s type to the obtainTraversalStrategy method, which returns a traversal strategy of the type StartTraversalStrategy. Having the traversal strategy at hand, the interpreter delegates the simulation of the start action to the corresponding strategy using its traverse method. As no simulation logic is associated with the start action, the StartTraversalStrategy simply determines the next action in the chain, which is then encapsulated in a traversal instruction of the type TraverseNextAction and returned to the interpreter. The traversal procedure obtains the next action by calling the process method on the traversal instruction and continues with the returned delay action.

As before, the traversal procedure first obtains and then invokes the traversal strategy for the delay action. The task of the DelayTraversalStrategy is to advance the simulation time as specified by the delay action. As no simulated time may pass within an event (cf. Section 4.4.1), the traversal has to be interrupted. Before that, the traversal strategy schedules a future event which resumes the traversal as soon as the simulation time has been advanced accordingly: it creates a ResumeUsageTraversalEvent, which is scheduled to occur in 250 simulated time units as specified by the delay. Now, the traversal strategy has to ensure that the traversal procedure terminates, which will also terminate the execution of the current event. To do so, the traversal strategy returns the InterruptTraversal instruction, which, in contrast to the previous traversal instruction, does not carry a next action. Instead, when calling the process method, the instruction adds the next action to the traversal state. From there it can be retrieved later on in order to resume the interpretation.


Figure 4.10: Interactions while simulating the usage scenario of Example 1 (sequence diagram "simulation example": the ClosedWorkloadGenerator spawns a user, a BeginUsageTraversalEvent is scheduled at time 0, the EventScheduler processes it and delegates to the UsageBehaviourInterpreter, and a ResumeUsageTraversalEvent later resumes the traversal; the referenced sub-diagrams "begin traversal" and "resume traversal" are shown in Figures 4.11 and 4.12)


Figure 4.11: Interactions while simulating the start and delay actions of Example 1 (sequence diagram "begin traversal": the UsageBehaviourInterpreter obtains the StartTraversalStrategy and the DelayTraversalStrategy, the strategies return a TraverseNextAction and an InterruptTraversal instruction respectively, and a ResumeUsageTraversalEvent is scheduled at time 250 with the EventScheduler)


Figure 4.12: Interactions while simulating the stop action of Example 1 (sequence diagram "resume traversal": the UsageBehaviourInterpreter obtains the StopTraversalStrategy, whose traverse method returns an EndTraversal instruction)

As soon as the BeginUsageTraversalEvent has finished its execution, the control returns to the EventScheduler. This causes the EventScheduler to advance the simulation time by 250 simulated time units and, finally, to process the pending event, which is a ResumeUsageTraversalEvent here. The remaining interactions are illustrated in Figure 4.12. Analogously to the start and delay actions traversed before, the UsageBehaviourInterpreter obtains a suitable traversal strategy for the stop action and calls the strategy’s traverse method to simulate the action. As with the start action, no simulation logic is associated with the stop action, which is why the StopTraversalStrategy merely returns the EndTraversal instruction, which, finally, causes the traversal procedure to terminate, whereby the enclosing event is terminated as well.

4.5 Simulation Platform

The simulation platform provides some simulation infrastructure services to our simulator, such as an event scheduler, a pseudo-random number generator (PRNG) and a means to collect measurements.

Figure 4.13 shows EventSim along with the components that constitute its simulation platform. Components drawn with a grey background have been developed in the course of this thesis, whereas the remaining components are provided by third parties and have been used without modifications. Notice that we regard a component in this section as a group of logically related classes rather than as a physical decomposition of the system. Nevertheless, most of the components described below are realised as Eclipse plug-ins.

AbstractSimulationEngine The AbstractSimulationEngine decouples EventSim from concrete simulation libraries in that it defines abstract simulation concepts, which are then used in place of a concrete simulation library. The abstract simulation concepts have to be implemented for each concrete simulation library that is to serve as simulation engine. Two simulation libraries are currently supported: SSJ and Desmo-J (cf. Section 2.2). The corresponding components SSJ Engine and DesmoJ Engine can be thought of as adapters that map the simulation functionality of a concrete simulation library to the concepts of the AbstractSimulationEngine.


Figure 4.13: Simulation infrastructure components required by EventSim (shown components: EventSim, EventSim Controller, AbstractSimulationEngine, SSJ Engine, DesmoJ Engine, PRNG, StoEx, Scheduler, ProbeSpecification, Workflow Engine, PCMBench, Eclipse Platform, SSJ and DesmoJ; the figure distinguishes newly developed from existing components)

PRNG The pseudo-random number generator (PRNG) provides a sequence of random numbers to EventSim. These are, for instance, required to evaluate probabilistic branch transitions. The PRNG component originates from the SimuCom platform.

StoEx The StoEx component allows for the evaluation of stochastic expressions as will be introduced in Section 4.6. For this, the StoEx component provides an implementation of variable stacks consisting of simulated stack frames. The StoEx component originates from the SimuCom platform.

Scheduler For the simulation of resources and their associated queues, EventSim relies on the Scheduler component, which enables an event-oriented simulation of active as well as passive resources. Currently, however, the Scheduler component is bound to the SSJ simulation library without utilising the indirection provided by the AbstractSimulationEngine. The Scheduler component originates from the SimuCom platform.

ProbeSpecification The ProbeSpecification is used to collect measurements and to derive performance metrics that characterise the performance of the system under simulation (cf. Section 4.8).

The EventSim Controller integrates EventSim into the PCMBench in that it defines a new Eclipse launch configuration type, which can be used by the quality analyst to configure and start a simulation run. Before passing the PCM model, which has been specified in the launch configuration, on to the EventSim component, the controller executes a series of workflow jobs, which include the validation of the PCM model and some model-to-model transformations, whose purpose can be, for instance, to weave the performance influence of a specific middleware into the modelled software system. The workflow functionality is provided by the Workflow Engine. Both the EventSim Controller and the PCMBench contribute to the functionality of the Eclipse Platform in that they define extensions to certain extension points defined by Eclipse.


4.6 Handling Parametric Dependencies

Parametric dependencies in the PCM allow for specifying the performance-related behaviour of components in dependence upon input parameters. Conversely, a component can define output parameters or a return parameter, which are passed to the component’s environment. In this way, dependency chains can arise between input parameters of a component service, actions in the SEFF, output parameters, input parameters of another component service and so forth. In this section, we describe how these chains of dependencies are realised with EventSim. Prior to that, we provide a more detailed view of parameters in the PCM.

Parameters in the PCM are described in an abstract way in that usually no concrete values are assigned to them. Instead, parameters are characterised with regard to five abstract properties, which have been identified as being performance-related [KHB06]: value, structure, type, byte-size and number of elements. When characterising a parameter, one or more of the properties are defined using a so-called stochastic expression (StoEx) each. A stochastic expression is a means to express a random variable that may also depend upon other random variables. Stochastic expressions in the PCM are used by the various developer roles to specify performance properties that underlie uncertainty [Bec08, p. 77].

For example, using parametric dependencies, an InternalAction could issue a resource demand in dependence upon input parameters defined in the usage model. More generally, an action in a SEFF can depend upon variables defined by actions that occurred earlier in the simulated control flow. Therefore, when traversing a control flow, the traversal procedure needs to maintain the set of variables that have been defined before and are valid at the current position. Comparable to method calls in programming languages, the arguments passed to either an EntryLevelSystemCall or an ExternalCallAction are valid only in the scope of the call. Analogously, a variable defined in the scope of a certain call is no longer valid after the call returned the control, unless the variable has been explicitly passed to the enclosing scope by means of an output or return parameter.

In SimuCom, variable scopes are realised by simulated stack frames [Bec08, pp. 130]. Each frame on the stack represents a variable scope by carrying a set of variable definitions. A stack frame can point to its parent frame, i.e. to the enclosing scope, whereby variables defined in the enclosing scope are valid in the current scope as well.

In order to support variable scopes, we adopted the stack frame implementation provided by SimuCom without modifications. The variable definitions on a stack are specific to a certain simulated user, which is why we realised the variable stack as a part of the traversal state, which represents the traversal progress of a User or a Request issued by a user. As a result, a separate variable stack is maintained for each simulated User. When creating a new traversal state for a SEFF traversal, it is initialised with the stack frame associated with the User that issued the Request. In this way, variables that have been defined in the course of the usage scenario traversal are accessible while traversing the SEFF, as long as there is a pointer to the parent stack frame as described before.

Using simulated stack frames, we have a means to store variable definitions and to evaluate random variables that are expressed in dependence upon variable definitions stored on the variable stack. So far, however, we have not covered the integration into the traversal procedure. That is, whenever the traversal procedure encounters an action that is supposed to define one or more variables by stochastic expressions, each expression needs to be evaluated relative to the current variable definitions on the stack and then be stored on the stack. The associated simulation logic could be implemented by the traversal strategies themselves. For instance, when traversing an EntryLevelSystemCall, the associated traversal strategy would push a new stack frame onto the stack that points to the parent frame in order to implement the method call semantics described before. In this way, however, the handling of parametric dependencies would be scattered over numerous traversal strategies and, furthermore, the strategies would suffer increased complexity.

To overcome these drawbacks of integrating the handling of parametric dependencies into traversal strategies, we employed traversal listeners as introduced in Section 4.4.2.6. We implemented a traversal listener for both types of calls: EntryLevelSystemCalls as well as ExternalCallActions. Whenever the traversal procedure is about to traverse a call or has just finished the traversal of a call, each listener receives a notification. In the former case, the traversal listener pushes a new frame onto the stack and stores the variables representing the call’s input parameters into the newly created frame. In the latter case, the topmost frame is removed after storing those variables representing output parameters into the parent frame. In this way, the listeners realise a separation of concerns between handling parametric dependencies and the simulation logic associated with the various action types.
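A sketch of such a listener for ExternalCallActions is shown below. The callback names follow the listener sketch from Section 4.4.2.6, and the helpers pushCallFrame, storeInputParameters, storeOutputParameters and popCallFrame are hypothetical stand-ins for the operations on the simulated stack frames adopted from SimuCom:

// Sketch of a traversal listener realising the call semantics of parametric
// dependencies (illustration only; the helpers are assumptions).
public class CallParameterListener implements ITraversalListener<ExternalCallAction> {

    public void beforeTraversal(ExternalCallAction call, TraversalState state) {
        // enter the callee's scope: push a new simulated stack frame pointing to the
        // caller's frame, then evaluate the call's input variable usages and store
        // the resulting variable definitions in the new frame
        pushCallFrame(state);
        storeInputParameters(call, state);
    }

    public void afterTraversal(ExternalCallAction call, TraversalState state) {
        // leave the callee's scope: copy output and return parameters into the
        // caller's frame, then remove the topmost frame from the variable stack
        storeOutputParameters(call, state);
        popCallFrame(state);
    }

    // the following hypothetical helpers delegate to the simulated stack frame
    // implementation adopted from SimuCom; their bodies are omitted in this sketch
    private void pushCallFrame(TraversalState state) { }
    private void storeInputParameters(ExternalCallAction call, TraversalState state) { }
    private void storeOutputParameters(ExternalCallAction call, TraversalState state) { }
    private void popCallFrame(TraversalState state) { }
}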

4.7 Encapsulating the Model Access

When simulating a software system modelled by the PCM, many accesses to the underlying PCM instance are needed. As the PCM is an Ecore meta-model, this primarily means invoking a sequence of methods on the object structure representing the PCM model. That way, program statements relating to accessing the PCM model are intermingled with statements associated with the simulation logic.

When taking a closer look at the distinction between model accesses and the simulation logic, we can find parallels with the common separation of the so-called business logic and the data access into distinct layers. Such a multilayer architecture is made up of two or more layers that are stacked on top of each other, where the degree of abstraction rises on the way from lower layers to upper layers [Bus96, pp. 34]. In enterprise applications, lower layers are usually concerned with accessing the data, while upper layers provide the business functionality and an optional graphical presentation thereof to the user or another system, respectively. A multilayer architecture serves well when aspects that lower layers are concerned with change frequently, since the effects of a change are commonly confined to one layer and do not affect the remaining layers [Bus96, pp. 48]. Furthermore, a multilayer architecture facilitates a separation of the concerns associated with the distinct layers. This is commonly perceived as a crucial principle when designing software systems [Bus96, pp. 397].

With these advantages in mind when separating the data access from the business logic, we can argue that it is not advisable to intermingle model access and simulation logic. For this reason, we factor out those blocks of statements whose main purpose is to access the underlying PCM model. In doing so, blocks of statements are encapsulated in separate classes implementing the ICommand interface. This interface provides a single method named execute, which accepts a PCM instance as an argument. Provided with a PCM model, an instance of such a command class performs arbitrary calculations on the model and may return a result. This approach is known as the command pattern [Gam95, pp. 233]. The set of all commands implementing the ICommand interface constitutes a kind of model-access layer.

By means of this approach, the simulation logic usually does not need to access the model by itself, but, instead, executes the suitable command that performs the model access. For example, we implemented a command that returns the list of all EntryLevelSystemCalls contained in a specified SEFF. The associated command class spans around 100 lines of code, whereas the invocation of the command takes a single line only while delivering the same results. In addition to the advantages mentioned above, the commands can therefore contribute to a lean simulation logic, which we expect to increase the maintainability.
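The command interface and a typical command can be sketched as follows; the generic result type, the PCMModel parameter type and the example command are illustrative and do not reproduce the exact EventSim code:

import java.util.ArrayList;
import java.util.List;

// Sketch of the command interface constituting the model-access layer.
public interface ICommand<R> {

    // performs an arbitrary calculation on the passed PCM instance and returns a result
    R execute(PCMModel pcm);
}

// Hypothetical example command: collects all EntryLevelSystemCalls related to a given SEFF.
class FindEntryLevelSystemCalls implements ICommand<List<EntryLevelSystemCall>> {

    private final ResourceDemandingSEFF seff;

    FindEntryLevelSystemCalls(ResourceDemandingSEFF seff) {
        this.seff = seff;
    }

    public List<EntryLevelSystemCall> execute(PCMModel pcm) {
        List<EntryLevelSystemCall> calls = new ArrayList<EntryLevelSystemCall>();
        // ... traverse the model, including nested behaviours, and collect the calls ...
        return calls;
    }
}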

4.8 Collecting Performance Measurements

A central objective of a software performance simulation is to gain a better understanding of the performance and scalability characteristics of the system under investigation. It is therefore vital to capture the simulated system’s performance throughout the simulation runs. For this purpose, we used the so-called ProbeSpecification framework, which is also employed in SimuCom. In what follows, we first give an overview of the ProbeSpecification before we present how it has been integrated into our simulator.

The ProbeSpecification framework provides the measurement infrastructure to our simulator in that it offers mechanisms to collect measurements and to derive performance metrics out of these measurements. The performance metrics can then be stored either in the main memory or in a file on a storage device.

Three concepts are central to the ProbeSpecification: probes, probe sets and calculators. A probe is a device that measures a single value along with its unit, thus defining what is to be measured. In the performance analysis domain, there are usually probes to measure the current time, the queue length of an active resource or the demand that has been issued to an active resource. A probe set subsumes one or more probes. In this way, it forms a unit that can be mounted at a measurement location, thus defining where the measurement is to be taken. For instance, a trivial probe set that encloses a single queue length probe could be mounted on an active resource like a CPU. The third concept, the calculator, uses the measurements gathered by probes and calculates performance metrics out of the fine-grained measurements.

When using the ProbeSpecification, measurements are published on a blackboard, where calculators can access them in order to calculate performance metrics. Hence, the blackboard decouples the probes from the calculators in such a way that a probe does not need to be aware of the calculators that are interested in the measurements resulting from that probe. In consequence, however, each calculator has to be registered with the blackboard in order to receive relevant measurements.

The task of integrating the ProbeSpecification into our simulator is therefore twofold: firstly, probe sets need to be mounted in such a way that they collect measurements and publish them on the blackboard. And secondly, calculators have to be registered with the blackboard. The registration of calculators is straightforward, which is why we focus on the mounting of probe sets.

Two types of probe sets can be distinguished: probe sets that are mounted on active or passive resources (resource probe sets) and probe sets that are mounted on actions residing in the simulated control flow (control flow probe sets).

4.8.1 Mounting Resource Probe Sets

Resource probe sets are mounted using the observer pattern. For this purpose, active resources accept two types of listeners: a state listener observes the queue associated with the resource and is notified whenever the queue length changes, while a demand listener observes the resource demands issued to the resource. Similarly, passive resources maintain a list of listeners as well and notify them whenever a resource instance is requested, when a resource instance has been successfully acquired or when it is finally released again. A resource probe set, therefore, implements the interface associated with the respective listener and is registered with the resource before the simulation starts. Whenever the resource notifies its listeners, the resource probe set receives a notification and, as a result, performs its measurements, which are published on the blackboard thereafter.


4.8.2 Mounting Control Flow Probe Sets

Control flow probe sets are mounted in a similar manner inasmuch as they also rely on the observer pattern. They implement the ITraversalListener interface and are registered either with the UsageBehaviourInterpreter or the SeffInterpreter, where a specific action is specified that is to be observed. Whenever the traversal procedure encounters the observed action, it sends out two notifications to registered control flow probe sets: one notification before processing the action and one thereafter. Each notification triggers a measurement that is published on the blackboard. When observing an EntryLevelSystemCall, for instance, the associated probe set produces one measurement before the call and one measurement after the call. Based on these two measurements, a calculator can calculate the response time of the system call.

4.9 Attaining Independence of Simulation Libraries

When implementing a simulation using a general-purpose programming language, it is common to use a simulation library that provides the basic simulation functionalities to the simulation developer. Two well-known simulation libraries for the Java programming language are Desmo-J and SSJ (cf. Section 2.2). Both provide a similar functionality and set of classes that are used by the simulation developer to implement a simulation model: a central class (which is called Simulator in SSJ and Experiment in Desmo-J) drives the simulation by simulating the effect of events (class Event) on entities (class Entity in Desmo-J; in SSJ, entities are not specifically modelled). The similar design of both simulation libraries gives rise to an abstraction layer that incorporates both libraries using a unified object-oriented design that is at the same time in line with the concepts of Desmo-J and those of SSJ.

Such an abstraction layer already exists for SimuCom. Instead of implementing our own abstraction layer, we decided to reuse the implementation provided by SimuCom. However, the abstraction layer was interlaced with SimuCom in that it resided in the Eclipse plug-in of the SimuCom platform and, beyond that, contained code specific to SimuCom. This is why we factored out the abstraction layer into a new Eclipse plug-in which is independent of both SimuCom and EventSim.

The newly created Eclipse plug-in provides a set of abstract classes (which formerly resided in the SimuCom plug-in) that have to be implemented for each simulation library that is to be available in SimuCom, in EventSim, or some other simulator relying on the abstraction layer. For this purpose, the abstraction layer’s Eclipse plug-in defines an extension point, which has to be extended by plug-ins providing a library-specific implementation of the abstract classes.

In this way, the simulator remains independent of specific simulation libraries. Moreover, one could think of a simulator that lets the user select the simulation library which is to drive the simulation process.

4.10 Supported Features

At the time of writing, most of the modelling elements provided by the PCM are supported by our simulator. An overview of the modelling elements concerned with the dynamic behaviour of a system is given in Table 4.1. The PCM modelling elements that capture the static structure are depicted in Table 4.2.


Dynamic Behaviour

Usage Modelling
  Workload Types: OpenWorkload (supported), ClosedWorkload (supported)
  Actions: Start, Stop, Delay, EntryLevelSystemCall, Loop, Branch (all supported)

SEFF Modelling
  Actions: StartAction, StopAction, InternalAction, ExternalCallAction, SetVariableAction, AcquireAction, ReleaseAction, LoopAction (all supported); BranchAction (supported¹); ForkAction (partially supported²); CollectionIteratorAction (not supported)

¹ both probabilistic and guarded branch transitions are available
² no support of synchronous forks, only asynchronous forks are available

Table 4.1: Support of behavioural modelling elements

Static Structure

Repository
  Composition: BasicComponent (supported); CompositeComponent, SubSystem (not supported)
  Miscellaneous: Passive Resources, Component Parameters (supported)

Resource Environment
  Scheduling Policies: DELAY, FCFS, PS (supported); EXACT (not supported¹)
  Network: LinkingResource, Connection (not supported²)

System
  Miscellaneous: Override Component Parameters (supported)

¹ the EXACT scheduling policy emulates real schedulers of some operating systems
² an ideal network connection is assumed to exist between resource containers, with infinite throughput and zero latency

Table 4.2: Support of structural modelling elements


5. Simulator Validation

The objective of this chapter is to show that the simulator developed in the course of this thesis yields accurate performance predictions. We begin with presenting our validation approach in Section 5.1. Thereafter, in Section 5.2, the MediaStore system is introduced, whose predicted performance metrics form the basis of the validation. The various performance metrics are covered in Section 5.3, followed by a description of the conducted experiments in Section 5.4. Finally, in Section 5.5, we present and discuss the experiment results. Section 5.6 concludes the chapter and points out the limitations of our validation approach.

5.1 Approach

Validation in simulation studies commonly refers to the “process of determining whether a simulation model is an accurate representation of the system” in relation to the objectives of the study [LK00, p. 265]. In doing so, usually the outputs of the simulation model and the real system are compared to each other, and the simulation model is said to be valid if the deviations do not exceed a specified limit. This approach was also used for the validation of SimuCom – the default simulator of the Palladio approach; for the validation results see [BKR09].

Repeating the same validation with EventSim in place of SimuCom would be tedious while at the same time delivering few additional insights with regard to the Palladio approach. Our approach is therefore to assume that the simulation results provided by SimuCom are valid. In other words, we regard SimuCom as the reference simulator for PCM models. In this way, the validation task reduces to comparing the outputs of EventSim to those of SimuCom. When provided with the same input, both simulators are expected to yield consistent results – but not necessarily entirely equal results, as will be motivated later on.

As a prerequisite to judge the equivalence of simulation runs yielded by SimuCom and EventSim, both simulators are required to work in a deterministic way. Otherwise, it would not be clear whether observed differences are due to differences in the simulator implementations or whether they arise due to the indeterministic behaviour of one of the simulators. We therefore require the results of subsequent simulation runs to be equivalent, provided that the same simulator has been used and the input to the simulator does not vary between the runs. At first sight, this requirement seems hard to meet, as both simulators heavily rely on a pseudo-random number generator (PRNG), which causes – to a certain extent – randomised simulation results. But, as long as the number generator is not truly random, it can be configured to provide a deterministic sequence of numbers. The PRNG used in SimuCom and EventSim is initialised with a so-called seed consisting of five values determining the sequence of pseudo-random numbers that will be generated. That is, when provided with the same seed, subsequent simulation runs draw exactly the same sequence of pseudo-random numbers.

In the remainder of this section, we present techniques for comparing the simulation results of the two simulators. If the results do not vary at all, a simple test as presented in the next subsection suffices to state that both simulators are equal in terms of their results. Otherwise, we utilise a statistical test to judge whether the observed differences are negligible.

5.1.1 Testing for Equivalence of Samples

Let x_1, x_2, ..., x_n be the sequence of values in the first sample X and y_1, y_2, ..., y_m be the values that constitute the second sample Y. Both samples X and Y are said to be equal if and only if their values are pairwise equal at positions i = 1, 2, ..., n:

\Delta_{X,Y} := \sum_{i=1}^{n=m} |x_i - y_i| = 0 \qquad (5.1)

If both samples turn out to be different, i.e. the sum of differences Δ_{X,Y} > 0, we use the Kolmogorov-Smirnov statistic in order to test whether the difference is significant.
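A direct implementation of this check is straightforward, as the following small illustration shows:

// Equality test from Formula 5.1: two equally sized samples are considered equal
// if and only if the sum of their pairwise absolute differences is zero.
static boolean samplesEqual(double[] x, double[] y) {
    if (x.length != y.length) {
        throw new IllegalArgumentException("samples must have the same size");
    }
    double delta = 0.0;
    for (int i = 0; i < x.length; i++) {
        delta += Math.abs(x[i] - y[i]);
    }
    return delta == 0.0;
}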

5.1.2 Testing for Equivalence of Probability Distributions

If the test for equivalence fails, we use the two-sample Kolmogorov-Smirnov test to determine whether both samples underlie the same probability distribution. As described in [SH06], the Kolmogorov-Smirnov statistic (k-s statistic) is given by

D = \max_x \left| F_{1,n}(x) - F_{2,m}(x) \right| \qquad (5.2)

where F_{1,n} and F_{2,m} denote the empirical cumulative distribution functions of the first and the second sample, respectively. The number of values in the samples is n in the first and m in the second sample.

In a second step, the k-s statistic is compared to a critical value D_α, where α denotes the significance level. On an α = 0.05 significance level, D_α can be approximated by:

D_{0.05} = 1.36 \sqrt{\frac{n+m}{n \cdot m}} \qquad (5.3)

If D ≥ D_{0.05}, then there is sufficient evidence to reject the null hypothesis H_0, which states that there is no difference between the two probability distributions, and to assume that they actually differ. Otherwise, if we fail to reject H_0, we argue that there is no significant difference between the distributions underlying the compared samples (on a 0.05 significance level). This is not to mean, however, that there is no difference at all as stated by H_0 – this would not be an admissible conclusion.
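The following self-contained sketch implements the two-sample test as described above; it serves as an illustration and is not the code that was used for the validation:

import java.util.Arrays;

// Sketch of the two-sample Kolmogorov-Smirnov test on a 0.05 significance level.
public final class KolmogorovSmirnovTest {

    // returns true if H0 (both samples follow the same distribution) is rejected
    public static boolean rejectsH0(double[] sample1, double[] sample2) {
        double[] xs = sample1.clone();
        double[] ys = sample2.clone();
        Arrays.sort(xs);
        Arrays.sort(ys);

        // evaluate |F_{1,n}(t) - F_{2,m}(t)| at every observed value and keep the maximum
        double d = 0.0;
        for (double[] sample : new double[][] { xs, ys }) {
            for (double t : sample) {
                d = Math.max(d, Math.abs(ecdf(xs, t) - ecdf(ys, t)));
            }
        }

        int n = xs.length;
        int m = ys.length;
        double dCritical = 1.36 * Math.sqrt((double) (n + m) / ((double) n * m));
        return d >= dCritical;
    }

    // empirical cumulative distribution function of a sorted sample, evaluated at t
    private static double ecdf(double[] sorted, double t) {
        int count = 0;
        while (count < sorted.length && sorted[count] <= t) {
            count++;
        }
        return (double) count / sorted.length;
    }

    private KolmogorovSmirnovTest() {
    }
}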

5.2 MediaStore

The MediaStore is a web-based music store that allows users to download and upload MP3 files over the Internet. It is an artificial, yet representative, example of a component-based business information system. It is artificial in that it has been developed specifically to support validations in the context of Palladio. For this purpose, the MediaStore example comprises a PCM architectural model as well as an EJB3 implementation of the modelled system [KBH07]. One approach to validate our simulator could be to compare the actual measurements yielded by the running EJB implementation to those predicted by our simulator. However, as motivated earlier, it suffices to compare the predictions to SimuCom’s predictions (see Section 5.1). We therefore make use of the MediaStore PCM model only. The MediaStore example was used as a case study in [KBH07] and has been modified by [Mar07]. In what follows, we describe the latter version based on [Mar07].

Figure 5.1: Static structure of the MediaStore system (based on [BKR07]) (deployment view: the Web-Browser component on the Client container, the WebGUI, MediaStore, DigitalWatermarking and DBCache components on the Application Server, the PoolingAudioDB component on the Database Server, the interfaces IHTTP, IMediaStore, ISound and IAudioDB, and the CPU (PROCESSOR_SHARING) and HDD (FCFS) processing resources)

The architecture of the MediaStore system can be seen in figures 5.1 and 5.2. The system is composed of six components, which are deployed on three interconnected resource containers, thus forming a three-tier application. The client (presentation tier) serves as the interface between the MediaStore system and its users in that it enables a user to access system functionalities by using a web browser. For this purpose, the client requests services provided by the application server (application tier), which, in turn, makes use of the database server (data tier).

The Web-Browser component residing on the client invokes system services using the HTTP protocol. Notice that the MediaStore architectural model does not explicitly capture this component and the associated resource container. Instead, the client behaviour is modelled by means of one or more usage scenarios (cf. Section 3.5.1.7). The application server accepts HTTP requests and generates the corresponding HTTP responses by means of the WebGUI component. For this purpose, the WebGUI component delegates requests to the central MediaStore component that offers two services: one to upload a new media file to the server and the other one to download existing files. When downloading files, the user request passes through the DBCache component, which caches recently accessed files. Requested files that cannot be served from the cache are retrieved from the database using the PoolingAudioDB component. Thereafter, each file is enriched with a digital watermark (DigitalWatermarking component), whose purpose is to track potential copyright violations. Finally, the watermarked files are passed to the WebGUI component and from there delivered to the user. Uploading a file takes place in a similar manner, with the differences that the database is accessed directly and that uploaded files are not watermarked.

The usage model (not depicted) consists of a single usage scenario. Each user issues either an upload request with a probability of 0.2 or a download request with a probability of 0.8. The workload is closed such that a constant number of users circulates through the system.


Figure 5.2: Dynamic behaviour of the MediaStore system (based on [Mar07]) (two sequence diagrams: (a) use case download with the calls HTTPDownload, download, queryDB and watermark, including a cache-miss option and per-file loops; (b) use case upload with the calls HTTPUpload, upload and addFile)


5.3 Performance Metrics

When conducting a simulation of the MediaStore example, SimuCom as well as EventSim yield the performance metrics listed below. The simulation results relating to these metrics form the basis of the validation.

Response times are provided for the overall usage scenario (UsageScenario), for each invoked system service (EntryLevelSystemCall) as well as for each call to an external component (ExternalCallAction). Therefore, the simulation results of the MediaStore example contain response times for the single usage scenario, for the two system calls HTTPDownload and HTTPUpload, and for each component call depicted in Figure 5.2:

• MediaStore.download

• MediaStore.upload

• DigitalWatermarking.watermark

• DBCache.addFile

• DBCache.queryDB

• PoolingAudioDB.addFile

• PoolingAudioDB.queryDB

In this simple example, component calls are uniquely identified by the combination of the called service and the providing component, as each service is invoked only by one other service. This is why we use the shorthand notation <componentName>.<serviceName> as seen above.

Demanded times and utilisation are captured for each active resource, where utilisation refers to the length of the resource queue over time and demanded time indicates the size of resource requests over time, measured in simulated time units. In the MediaStore example, there are four active resources, as can be seen in Figure 5.1. The hard disk of the application server is, however, not accessed by the system, which is why no measurements exist in this case. In summary, the simulation results contain utilisation and demanded times for the following resources:

• Application server CPU

• Database server CPU

• Database server HDD

Hold times, utilisation and wait times are measured for each passive resource. Utilisation of passive resources is similar to the definition given above, but refers to the number of resource instances being held over time. Wait time captures the duration between acquiring a passive resource instance and actually being granted the instance. The hold time of a passive resource is the duration between being granted a resource and releasing the resource again. There is only a single passive resource in the MediaStore example, which represents the size of the database connection pool.

Additionally, SimuCom provides measurements for communication links connecting resource containers. At the time of writing, however, EventSim does not support the simulation of network resources, which is why we neglect the corresponding measurements here.


Maximum simulation time: 10000 [simulated time units]
Maximum measurement count: ∞
Random number generator seed: 0, 1, 2, 3, 4, 5
Simulate linking resources: no
Simulate failures: no

Table 5.1: Simulation parameters

5.4 Experiments

In the course of the validation, three variants of the MediaStore system were simulated with both simulators, resulting in a total of six result sets. For each simulation run, the parameters shown in Table 5.1 were used, regardless of the respective variant or the used simulator. Each variant is based on the MediaStore architectural model described before. The modifications that were made to the underlying PCM model are described in the following.

MediaStore_{1,FCFS+PS} In this variant, the workload population is set to 1 as indicated by the first subscript. As a result, the number of users passing through the simulated system concurrently is limited to one. Therefore, no resource contention between different users can occur. Furthermore, as none of the SEFFs in the MediaStore model contains a control flow fork, resource contention due to multiple requests of the same user cannot occur either. In consequence, resource demands of the single user are served instantly without waiting times. The second subscript denotes the types of scheduling policies used by the active resources, which are first-come, first-served (FCFS) and processor sharing (PS) in this variant.

MediaStore_{10,FCFS+PS} In this variant, the workload population is increased to 10, which results in ten users passing through the simulated system concurrently. Opposed to the MediaStore_{1,FCFS+PS} variant, these users compete for the limited resources. In consequence, waiting times can arise when a user demands a resource that is busy with another user. For the scheduling of resource demands, some resources use FCFS and some others use PS.

MediaStore10,FCFS This variant uses a workload population of 10 and is therefore equal to the MediaStore10,FCFS+PS variant in terms of resource contention. In this variant, however, each active resource uses the FCFS scheduling policy.

5.5 Results and Discussion

Each of the MediaStore variants described above was simulated using both simulators under comparison. The corresponding simulation parameters can be seen in Table 5.1. In this section, we compare the results yielded by the resulting six simulation runs.

5.5.1 Without Resource Contention

In experiment MediaStore1,FCFS+PS, the sum of differences (see Formula 5.1) yields 0 for each performance metric presented in Section 5.3. We can therefore argue that not only do the various performance metrics produced by SimuCom and EventSim follow the same probability distribution, but both simulators even produce exactly the same sequence of performance metrics when provided with the PCM model associated with the present experiment. We suppose that this result can be generalised to all PCM models without resource contention, i.e. to PCM models with a single-user closed workload and an absence of control flow forks, which has also been indicated by various other experiments not covered in this chapter.


5.5.2 With Resource Contention

In most cases, the absence of resource contention is not a realistic scenario when simulating a software system, which is why we deem it more important to validate the equality of simulation results in the presence of resource contention. This is the case in experiment MediaStore10,FCFS+PS. In contrast to the previous experiment, however, the sum of differences is not applicable since the requirement of an equal sample size is not met; SimuCom and EventSim do not provide the same number of measurements with regard to the various performance metrics. But even when forcing an equal sample size by truncating the larger sample, the sum of differences is greater than 0 for each of the performance metrics.

As the cumulative distribution function (CDF) of the usage scenario's response time suggests (see Figure 5.3(a)), both samples follow virtually the same probability distribution. We therefore applied the Kolmogorov-Smirnov test as described in Section 5.1.2. The results are shown in Tables 5.2, 5.3 and 5.4, where nsc and nes denote the number of predictions yielded by SimuCom and EventSim, respectively. D is the K-S test statistic, which is compared to the critical value D0.05. The null hypothesis H0, which states that the predictions of both simulators follow the same probability distribution, could not be rejected except for the utilisation of the application server CPU.
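
To make the test procedure concrete, the following sketch shows how the two-sample K-S statistic D and the critical value D0.05 could be computed from two samples. It is a minimal illustration in Java, not the implementation used for the validation, and it assumes the common large-sample approximation D0.05 ≈ 1.36 · sqrt((nsc + nes) / (nsc · nes)).

import java.util.Arrays;

public class KolmogorovSmirnovSketch {

    // Two-sample K-S statistic: the largest distance between the two empirical CDFs.
    public static double statistic(double[] a, double[] b) {
        double[] x = a.clone();
        double[] y = b.clone();
        Arrays.sort(x);
        Arrays.sort(y);
        int i = 0, j = 0;
        double d = 0.0;
        while (i < x.length && j < y.length) {
            double value = Math.min(x[i], y[j]);
            while (i < x.length && x[i] <= value) i++;   // advance the first empirical CDF
            while (j < y.length && y[j] <= value) j++;   // advance the second empirical CDF
            d = Math.max(d, Math.abs((double) i / x.length - (double) j / y.length));
        }
        return d;
    }

    // Critical value at the 0.05 level (large-sample approximation).
    public static double criticalValue005(int n1, int n2) {
        return 1.36 * Math.sqrt((n1 + n2) / ((double) n1 * n2));
    }
}

For the usage scenario in Table 5.2, for instance, criticalValue005(12599, 12671) evaluates to roughly 0.0171, which matches the tabulated critical value.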

The CDF of the utilisation predicted for the application server CPU can be seen in Figure 5.3(b). A visual inspection suggests that both simulators predict nearly the same utilisation. Based on the K-S test results and the visual inspection, we conclude that the probability distributions that underlie the predictions of SimuCom and EventSim do not differ significantly, provided that the PCM model associated with experiment MediaStore10,FCFS+PS is used.

(a) Cumulative probability of the usage scenario's response time
(b) Cumulative probability of the application server's CPU utilisation

Figure 5.3: Cumulative distribution functions of selected performance metrics in the MediaStore10,FCFS+PS experiment

Nevertheless, provided that a fixed seed is used for the pseudo-random number generator, the simulation of a specific PCM model should yield exactly the same results – not only results that follow the same probability distribution. Further investigations showed that the results differ not only between the two simulators, but also between successive simulation runs with either SimuCom or EventSim. This suggests that the differences are caused by an infrastructure component provided by the simulation platform, for instance by the simulation library, the pseudo-random number generator, or the resource scheduler.


Performance Metric (Response Time of...)     nsc      nes      D        D0.05    D ≥ D0.05 (Reject H0?)
Ext. Call PoolingAudioDB.addFile             2586     2536     0.0165   0.0380   no
Ext. Call DBCache.addFile                    2586     2536     0.0165   0.0380   no
Ext. Call PoolingAudioDB.queryDB             57582    57558    0.0071   0.0080   no
Ext. Call DBCache.queryDB                    10014    10135    0.0164   0.0192   no
Ext. Call MediaStore.download                10013    10135    0.0163   0.0192   no
Ext. Call MediaStore.upload                  2586     2536     0.0165   0.0380   no
Ext. Call DigitalWatermarking.watermark      71974    71888    0.0061   0.0072   no
System Call HTTPDownload                     10013    10135    0.0164   0.0192   no
System Call HTTPUpload                       2586     2536     0.0192   0.0380   no
Usage Scenario                               12599    12671    0.0104   0.0171   no

Table 5.2: K-S test results for calls and the usage scenario of the MediaStore10,FCFS+PS experiment at a 0.95 confidence level (nsc, nes: sample sizes; D: test statistic; D0.05: critical value)

Performance Metric           nsc       nes       D        D0.05    D ≥ D0.05 (Reject H0?)
Application Server CPU
  Demanded Time              169193    169158    0.0040   0.0046   no
  Utilisation                338385    338338    0.0039   0.0033   yes
Database Server CPU
  Demanded Time              60173     60057     0        0.0078   no
  Utilisation                120346    120198    0        0.0055   no
Database Server HDD
  Demanded Time              60173     60057     0        0.0078   no
  Utilisation                120341    120193    0.0031   0.0055   no

Table 5.3: K-S test results for active resources of the MediaStore10,FCFS+PS experiment at a 0.95 confidence level

Performance Metric           nsc       nes       D        D0.05    D ≥ D0.05 (Reject H0?)
Database Connection Pool
  Hold Time                  60168     60094     0.0058   0.0078   no
  Utilisation                120341    120193    0.0031   0.0055   no
  Wait Time                  60173     60099     0.0063   0.0078   no

Table 5.4: K-S test results for passive resources of the MediaStore10,FCFS+PS experiment at a 0.95 confidence level


By means of the trace recorder described in Section 4.4.2.6, we found the simulation logic associated with InternalActions to be a source of the problem. When comparing successive simulation runs, it becomes apparent that the simulation runs behave exactly the same before reaching an InternalAction. However, as soon as an InternalAction demands a resource that is already busy, the way in which the pending jobs are worked off appears to be indeterministic. More precisely, when the service of two different requests is completed at the same time, one cannot say in advance whether the first or the second request may proceed first. It is therefore not determined in advance with which request the simulation proceeds and, in consequence, which one of the requests draws the next random number. This behaviour leads to indeterministic simulations as will be discussed in Section 5.5.4.

Since the indeterministic behaviour can be observed with SimuCom as well as with EventSim, the actual source must be one of the two scheduling policies used with InternalActions. A further investigation showed that the indeterminism disappears as soon as the PS scheduling policy is replaced with FCFS.

5.5.3 Avoiding Processor Sharing Resources

Now that the cause of the result differences seems to be known, the MediaStore10,FCFS+PS experiment can be modified in such a way that no resource uses the processor sharing scheduler. Therefore, in experiment MediaStore10,FCFS each of the four active resources uses the FCFS scheduler. Despite the resource contention present in experiment MediaStore10,FCFS, the sum of differences (Formula 5.1) applied to the performance metrics predicted by SimuCom and EventSim yields 0. This result shows that both simulators produce exactly the same predictions with regard to the present experiment.

5.5.4 Reviewing the Processor Sharing Implementation

In the remainder of this section, we take a detailed look at the implementation of the processor sharing (PS) scheduling policy with regard to the issues discussed above. We begin with the foundations of PS, which are required to understand the examples given below. Then, we identify the part of the source code that introduces the indeterministic behaviour described above. Finally, two examples illustrate the effects of the indeterminism on the simulation results.

5.5.4.1 Processor Sharing Foundations

A resource that uses PS distributes its processing capacity uniformly over all pending jobs. For example, if the resource is busy with two jobs that demand 100 processing units each (e.g. cycles in the case of a CPU), both jobs are finished after 200 processing units – assuming that both jobs arrived at the same time and no further jobs arrive thereafter. In particular, both jobs are completed simultaneously assuming an ideal PS implementation. When multiple jobs arrive at different times, the proportion of processing capacity that each job receives changes over time depending on the number of concurrent jobs. We continue the example above and assume that the second job arrives when the resource has processed half of the demand of the first job. In this case, the first job is served for a duration of 50 processing units with the entire processing capacity. Then, the second job arrives, whereby the processing capacity is distributed across both jobs. Therefore, the remaining demand of 50 requires 100 processing units to be completed, thus resulting in a total of 150 processing units for the first job.
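
The arithmetic of such examples can be reproduced with a small event-by-event calculation. The following sketch is a simplified illustration in Java, not the scheduler used by the simulation platform; it assumes one processing unit per simulated time unit, shared evenly among all jobs currently present.

// Completion times of jobs served by an ideal processor-sharing resource.
// arrival[i] and demand[i] describe job i; arrivals must be sorted ascending.
public static double[] psCompletionTimes(double[] arrival, double[] demand) {
    int n = arrival.length;
    double[] remaining = demand.clone();
    double[] completion = new double[n];
    boolean[] done = new boolean[n];
    double now = 0.0;
    int finished = 0;
    while (finished < n) {
        // count the jobs that have arrived and are not yet finished
        int active = 0;
        for (int i = 0; i < n; i++) {
            if (!done[i] && arrival[i] <= now) active++;
        }
        // the next event is either the next arrival or the earliest completion
        double nextEvent = Double.POSITIVE_INFINITY;
        for (int i = 0; i < n; i++) {
            if (done[i]) continue;
            if (arrival[i] > now) nextEvent = Math.min(nextEvent, arrival[i]);
            else nextEvent = Math.min(nextEvent, now + remaining[i] * active);
        }
        // advance to the next event; every active job receives an equal share
        for (int i = 0; i < n; i++) {
            if (!done[i] && arrival[i] <= now) {
                remaining[i] -= (nextEvent - now) / active;
                if (remaining[i] <= 1e-9) {
                    done[i] = true;
                    finished++;
                    completion[i] = nextEvent;
                }
            }
        }
        now = nextEvent;
    }
    return completion;
}

For the example above, psCompletionTimes(new double[] {0, 50}, new double[] {100, 100}) yields the completion times 150 and 200.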


5.5.4.2 Processor Sharing Implementation

The simulation platform's implementation of the PS policy maintains the jobs and their remaining demands in a java.util.Hashtable. A Hashtable is a data structure that maps keys (here: jobs) to values (here: remaining demands). When the first job arrives at the resource, it is added to the map along with its demand. Additionally, the PS policy schedules a future event to occur at the simulated time at which the resource has processed the job completely – provided that no further jobs arrive in the meantime. When the event is executed, the demand issued by the job has been processed completely, and, as a result, the simulated user may continue passing through the simulated system. Whenever a job arrives while the resource is busy with other jobs, the PS policy performs two steps: At first, the demands of the current jobs are updated according to the time elapsed. For example, if the resource is busy with two jobs when the third job arrives, the demands of the two existing jobs are decreased by the amount of processing units that were available during the elapsed time period. In the second step, the newly arrived job is added to the updated job → demand map. Finally, the future event is adjusted in such a way that it represents the completion of the job that will be finished next. As soon as the remaining demand associated with a job reaches 0, the job is removed from the map and the user that issued the demand may proceed with its simulated behaviour.

The part of the source code that is concerned with finding the most imminent job completion can be seen in Listing 5.1. The algorithm iterates over all jobs maintained by the map and updates in each iteration the reference to the shortest job found so far, which is the job with the lowest remaining demand. If there are multiple candidates for the shortest job, the reference is set to the candidate that is encountered first; it remains unchanged when jobs in subsequent iterations have a remaining demand equal to that of the present shortest job. In this way, the algorithm would adhere to a deterministic strategy for selecting the shortest job if the entries in the map were returned in a defined order. However, the contract of the keySet method of the Hashtable class does not assure that the keys are returned in any specified order. As a result, the shortest job determined by the algorithm may vary between repeated simulation runs. This is by no means a behaviour that we consider incorrect; nevertheless, this indeterminism may influence the simulation results as presented below.

Listing 5.1: Finding the most imminent job completion

private Hashtable<ISchedulableProcess, Double> running_processes = new Hashtable<ISchedulableProcess, Double>();
...
ISchedulableProcess shortest = null;
for (ISchedulableProcess process : running_processes.keySet()) {
    if (shortest == null
            || running_processes.get(shortest) > running_processes.get(process)) {
        shortest = process;
    }
}
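
To illustrate what an adjustment towards deterministic behaviour could look like, the selection could break ties between equally short jobs using a stable criterion, so that the result no longer depends on the iteration order of the Hashtable. The following sketch is our own illustration, not part of the scheduler shipped with the simulation platform, and it assumes that ISchedulableProcess exposes a stable, unique identifier via a method getId() returning a String.

// Sketch of a deterministic selection: strictly shorter jobs win, ties are
// broken by the lexicographically smallest process identifier.
ISchedulableProcess shortest = null;
for (ISchedulableProcess process : running_processes.keySet()) {
    if (shortest == null) {
        shortest = process;
        continue;
    }
    double shortestDemand = running_processes.get(shortest);
    double processDemand = running_processes.get(process);
    if (processDemand < shortestDemand
            || (processDemand == shortestDemand
                    && process.getId().compareTo(shortest.getId()) < 0)) {
        shortest = process;
    }
}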

5.5.4.3 Consequences of Indeterministic Scheduling

How an indeterministic scheduling policy influences the results of two successive simulation runs can be seen in the example depicted in Figure 5.4. Both runs rely on a pseudo-random number generator (PRNG) with a fixed seed that generates the pseudo-random number sequence (r1, r2, ...) = (5, 10, ...). Two simulated processes, denoted by P1 and P2, are the objects of interest in this simulation. Each of the two processes issues two resource demands and terminates thereafter. P1 demands 10 processing units (PU) followed by 2 · r PU, where r is a number drawn from the PRNG. P2 demands 10 PU followed by 3 · r PU. The simulated processes move along two time dimensions: the simulation time advances with increasing y-coordinates, while an advance in real time is indicated by increasing x-coordinates.


The simulated time is advanced in order to simulate time consumption due to the processing of resource demands issued by P1 and P2, respectively, whereas progress in real time is caused by executing the program statements corresponding to the PS implementation. Resource demands are issued to an active resource that makes use of the PS implementation presented above. When demanding a resource, the requesting process gets passivated by the resource until the demand has been processed completely. Then, the resource activates the associated process, which means that the process may continue. Below, we assume that the processing of 1 PU requires 1 simulated time unit.

In the first simulation run, illustrated in Figure 5.4(a), the resource demands of P1 and P2 arrive simultaneously with regard to the simulated time. Consequently, as the demands are equal, both processes may continue at simulated time st = 20. Hence, each of the two processes is a candidate for being selected as the shortest process. In the present case, P1 is selected and continues its execution by drawing the pseudo-random number r1 = 5. Then, P2 continues in the same manner and receives the next pseudo-random number r2 = 10. The second resource demand of P1 and P2, respectively, depends upon the drawn pseudo-random numbers as introduced above. Therefore, P1 issues a 10 PU demand and P2 demands 30 PU. The service of P1 is finished at st = 40 (20 current simulation time + 10 PU · 2 jobs), and P2 is served completely at st = 60 (20 current simulation time + 10 PU · 2 jobs + 20 PU · 1 job).

The second simulation run, which is shown in Figure 5.4(b), begins with the same resource demands as before. In contrast to the first simulation run, however, the indeterministic resource scheduler now selects P2 as the shortest process. As a result, P2 receives the first pseudo-random number r1 = 5, calculates 3 · r1 = 15 and issues a 15 PU demand. P1 draws r2 = 10, calculates 2 · r2 = 20 and issues a 20 PU demand. Consequently, P1 finishes at st = 55 and P2 finishes at st = 50.

(a) First simulation run
(b) Second simulation run

Figure 5.4: Example of a simulation whose results are influenced by an indeterministic resource scheduler

This example illustrates that the execution times of the two processes in terms of simulation time may vary depending on the order in which the processes are simulated. With the present implementation of the processor sharing resource, it is therefore not possible to conduct a deterministic, fully reproducible simulation run.


5.6 Conclusion and Limitations

As the validation suggests, our simulator can be regarded as semantically equivalent to SimuCom with regard to the PCM models used in the course of the validation. That is, the predictions yielded by SimuCom and EventSim are indistinguishable as long as none of the simulated resources uses the processor sharing (PS) scheduling policy. But even when the PCM model that is to be simulated contains a PS resource, the simulation results do not differ significantly between the two simulators. We could further show that the differences are caused by the way in which the PS scheduler is implemented. Although the scheduler is implemented in a correct way, adjustments towards a deterministic behaviour would be desirable. This task was out of the scope of this thesis, as the various schedulers are part of the simulation platform, which was reused from SimuCom.

Although the MediaStore PCM model is a scenario that is representative of software quality analyses in the context of Palladio, it does not capture the whole expressiveness provided by the PCM meta-model. The drawn conclusions may therefore only be applied to the subset of modelling constructs used in the MediaStore PCM model. In particular, this does not include scheduling policies other than FCFS and PS, network resources, and control flow forks in SEFFs. Nevertheless, some tests with forks suggest that they work as intended.


6. Simulator Comparison

This chapter aims at comparing the process-oriented simulator SimuCom to its event-oriented counterpart EventSim in terms of performance and scalability. The first two sections set the stage for the simulator comparison, which then spans the remainder of the chapter.

The approach underlying the comparison is presented in Section 6.1. Thereafter, in Section 6.2, we motivate the need for tool-supported model variation and simulation, and provide an overview of the solution developed in the run-up to the simulator comparison.

The comparison is subdivided into three parts. First, in Section 6.3, we identify factors that are presumed to affect the performance of simulation runs in SimuCom or EventSim, respectively. Thereafter, in Section 6.4, these factors are ranked according to their importance in terms of performance. Based on the most influential factors, we conduct a thorough performance and scalability comparison in Section 6.5. In addition, in Section 6.6, we identify and quantify factors limiting the scalability of both SimuCom and EventSim. Finally, we conclude the chapter in Section 6.7.

6.1 Approach

As has been stated in the introduction, the main objective of this thesis is to contribute to the understanding of the trade-off between “modelling ease” and performance associated with the choice between a process- and event-oriented simulation. More specifically, we focus on the performance implications arising from this choice. Assessing, in addition, the differences in “modelling ease” between the two world-views would demand an empirical case study involving the participation of various simulation developers, which is clearly beyond the scope of this thesis.

We focus our work on the area of software performance simulation in general, and on the simulation of Palladio Component Models in particular. In doing so, we compare Palladio's default simulator SimuCom to the simulator developed in the course of this thesis, called EventSim. Both have been shown to be semantically equivalent (cf. Chapter 5). Since SimuCom essentially adheres to the process-oriented world-view, while EventSim is an event-oriented simulator, our approach is to base the assessment of performance implications associated with the two world-views on a comparison between SimuCom and EventSim. Of course, the results from such a comparison are to a certain extent influenced by the respective simulator implementation.


But, as will be apparent later, the simulators show performance and scalability characteristics that can be clearly attributed to the respective world-view.

Performance as defined in Section 3.1 is the “degree to which the system meets its objectives for timeliness and the efficiency with which it achieves this” [Kou09]. Scalability in the context of this thesis refers to the definition of load scalability provided by [Bon00]: “Load scalability is the ability of a system to perform gracefully as the offered traffic increases”. Applied to the simulation of PCM models, we regard a simulator as scalable if it is capable of performing gracefully as the complexity of the simulated PCM model rises. The complexity rises, for instance, when further model elements are added to the simulated control flow. In particular, we exclude the type of scalability that regards a software system as scalable even if it is necessary to provide an increased amount of resources to the system when the input complexity increases.

Based on these definitions, we assess the performance and scalability of SimuCom and EventSim in three steps. We start with the whole set of degrees of freedom present when creating PCM models. In the first step (Section 6.3), we exclude those factors that are presumed to influence neither the performance nor the scalability of a simulation in a significant way. This exclusion is not based on measurements since the large amount of potential variability in PCM models prevents a systematic evaluation in a reasonable time. Instead, we propose a criterion that still allows for the exclusion of a high proportion of factors.

The second step (Section 6.4) is concerned with the ranking of the remaining factors based on their influence on the performance of SimuCom and EventSim, respectively. For this purpose, we utilise the ANOVA method in order to quantify the influence of the various factors on the duration of a simulation run. The duration of a simulation run is considered as an indicator for the simulation performance.

In the third step (Section 6.5), the most influential factors are then subject to a more thorough analysis, which compares the performance-related behaviour of the two simulators when provided with increasingly complex PCM models. Furthermore, we identify limits in scalability associated with the simulators.

The second and third steps require a large number of simulation runs, which are performed by the experiment automation tool presented in Section 6.2. All measurements presented in the remainder of this chapter are gathered by this tool running on the test system described in Table 6.1.

6.2 Automated Model Variation and Simulation

The experiments conducted in the course of this chapter require a vast number of different PCM models along with an even larger number of corresponding simulation runs. Creating these models manually is virtually impossible. Likewise, setting up and starting multiple simulation runs for each of these models without being supported by a tool is not conceivable.

For these reasons, we implemented an Eclipse plug-in that serves two purposes. First, it generates variations of a given PCM model. Second, the tool is capable of simulating the modified PCM models with both simulators, SimuCom and EventSim, while at the same time collecting performance metrics such as the duration of a simulation run and the memory consumption over the simulation runtime.

The experiment automation tool is supplied not only with a certain PCM model, but also requires additional information on the variation which is to be performed and on the intended simulation configuration.


Hardware
  Processor             Intel® Core™2 Quad Q8300 @ 2.50 GHz (4 cores)
  Main Memory           4.00 GB @ 800 MHz, single channel mode
  Solid-State Drive     OCZ Vertex 2

Operating System
  Version               Windows 7 (Version 6.1)

Java Virtual Machine (JVM)
  Name                  Java HotSpot™ 64-Bit Server VM
  Vendor                Sun Microsystems Inc.
  Version               20.1-b02
  VM Arguments          -Xms512m, -Xmx1024m, -XX:PermSize=256M, -XX:MaxPermSize=512M

Simulators
  Eclipse               Galileo (Version 3.5)
  Simulation Library    SSJ (Version 2.1.3)

Table 6.1: Soft- and hardware configuration of the test system

For this purpose, we designed a meta-model capturing the configuration space of PCM model variations along with simulation parameters such as simulation stop conditions. In order to conduct an experiment series, the experiment automation is supplied with an instance of this meta-model – the configuration model – and performs one experiment after another as defined by the configuration model.

An experiment in the scope of this section comprises three parts: a PCM model, a variation describing a series of modifications of the PCM model, and a simulation configuration, which defines the simulation settings for each variation.

In what follows, we give an overview of the meta-model and the experiment automation tool.

6.2.1 Configuration Meta-Model

The meta-model enabling the model-based description of experiment series is shown in Figure 6.1. It defines the syntax of configuration models and in this way describes the configuration space of the experiment automation tool presented later on. Instances of the meta-model, which are also called configuration models, do not embody the tool's configuration itself but are an abstract representation of the configuration. As will be described later on, the experiment automation tool accepts configuration models and creates the actual configuration on this basis.

6.2.1.1 EMF Ecore

The meta-model has been implemented using the Ecore meta-meta-model provided by the Eclipse Modeling Framework (EMF) [emf]. That means each element depicted in Figure 6.1 is an instance of a certain Ecore modelling element. A speciality of EMF is that each meta-model has a single root class, whose instances, namely Java objects, contain either directly or indirectly a set of instances of meta-model classes. The term containment in this context is in line with the definition of a so-called composite aggregation in the UML, which we refer to as composition: “Composite aggregation is a strong form of aggregation that requires a part instance be included in at most one composite at a time.” [uml, p. 41]. In this way, the composition forms a whole/part relationship [uml, p. 41].


Figure 6.1: Experiment automation configuration meta-model

In EMF, a single instance of the root class builds the whole, while referenced objects represent the parts. This structure is utilised, for instance, when saving a newly created meta-model instance, say to an XML file: starting with the root object, any object reachable via a composition is traversed in a recursive fashion and finally written to the file. Objects that are not reachable will not be saved.

This is why all modelling elements shown in Figure 6.1 – or, more precisely, their corresponding objects – are contained in the central class ExperimentRepository. One exception is the class VariationType, which is actually imported from another meta-model.

6.2.1.2 Experiment Repository and Experiments

As stated above, the class ExperimentRepository is the central class. It holds a set of Experiments and ToolConfigurations. According to our definition of an experiment, an Experiment instance contains a PCM model (class PCMModelFiles), a variation description (class Variation) and a simulation configuration (class ToolConfiguration).

6.2.1.3 Referencing PCM Models

A PCM model is referenced indirectly by specifying the location of the PCM partial models in the form of a character string. This decision is mainly due to the way in which EMF models are created and edited: EMF allows for the generation of a tree-editor out of an Ecore meta-model, which also accounts for references to other meta-models like the PCM, for instance. However, when loading multiple models of the same meta-model, the generated tree-editor does not provide a convenient solution to distinguish between model elements from different models. As the referenced PCM model may differ between multiple experiments defined by the same configuration model, leading to multiple instances of the same meta-model, we avoided direct references to PCM modelling elements.

6.2.1.4 Describing Model Variation

Instances of the Variation class describe the variation of a specific modelling element contained in the referenced PCM model. Usually, a modelling element gives rise to more than a single variation. Given a LoopAction, for instance, we could vary the loop iteration count, increase the number of actions contained in the nested loop behaviour, or we could replicate the whole loop action n times.


Therefore, the class VariationType allows for describing the type of the variation. Examples of variation types are the LoopIterationVariation as well as the ReplicateAbstractActionReplication. Neither class is shown in the meta-model as they are not defined on the meta-model level, but are instead located in the experiment automation tool. On the meta-model level, variation types are referenced indirectly by the VariationType. Each variation type references a corresponding implementation class residing in the experiment automation tool by specifying the fully qualified name of the so-called strategy class. The knowledge of how to perform a specific variation is encapsulated by these strategies.

Building Variation Sequences

Variations are numeric inasmuch as the strategy classes mentioned above vary the underlying model based on a single numeric value. For example, when providing the strategy LoopIterationVariation with the number 10, the strategy will set the loop iteration property of the referenced loop to 10. Thus, sequences of consecutive variations may be defined by a sequence of numeric values. This is what the class ValueProvider does: it generates a sequence of values which are then used to vary a referenced modelling element several times in a row. The PolynomialValueProvider yields the i-th value of the sequence by calculating i^exponent · factor. Alternatively, the sequence may be specified directly using the class SetValueProvider. The values obtained by the value providers are forwarded to the variation strategies only if the value is within the range defined by the attributes minValue and maxValue of the Variation class.
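
To make the idea of value providers concrete, the following minimal sketch in Java mirrors the semantics of the PolynomialValueProvider; it is an illustration only, and the class and method names (other than the attribute names exponent and factor) are chosen freely rather than taken from the tool.

// Sketch of a polynomial value sequence: the i-th value is i^exponent * factor.
public class PolynomialValueSequence {

    private final double exponent;
    private final double factor;

    public PolynomialValueSequence(double exponent, double factor) {
        this.exponent = exponent;
        this.factor = factor;
    }

    // i-th value of the variation sequence, with i starting at 1
    public long valueAt(int i) {
        return Math.round(Math.pow(i, exponent) * factor);
    }
}

With exponent = 1 and factor = 10, for instance, such a provider yields the sequence 10, 20, 30, ...; values outside [minValue, maxValue] would then be discarded by the enclosing Variation.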

Combining Multiple Variation Sequences

Experiments may contain more than a single variation sequence. That is, multiple PCM modelling elements may be varied within the same experiment. This raises the question of whether variation sequences can be combined with each other, and if so, how the combination is conducted. Suppose, for example, that there are two variation sequences v_iterations = (i = 10, i = 20, i = 30) and v_replications = (r = 1, r = 2) defined for a specific LoopAction, whose loop iteration count is 1 before the variation. While the former variation changes the loop's iteration count, the latter replicates the whole loop. Without combining the variation sequences, we would first perform and simulate each variation defined by v_iterations, followed by the variations specified by v_replications, resulting in a sequence of 5 model variations: (i = 10, i = 20, i = 30, r = 1, r = 2). No replications are present when the iteration count is modified, and vice versa.

Conversely, the variation sequences could be combined, say in a crosswise fashion, yielding the sequence of variation pairs ((i=10, r=1), (i=10, r=2), (i=20, r=1), ..., (i=30, r=2)). When combining the variation sequences in this way, each varied PCM model is the result of two variations applied at the same time.
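
The two combination schemes amount to concatenating the sequences versus building their cross product. The following sketch in Java illustrates the difference for the example above; it is our own illustration, not code taken from the tool, and it assumes that a run is simply described by a pair (iteration count, replication count) with 1 as the default for the unvaried factor.

import java.util.ArrayList;
import java.util.List;

public class VariationCombinationSketch {

    // One factor at a time: each value is applied while the other factor keeps its default (1).
    public static List<long[]> oneFactorAtATime(long[] iterations, long[] replications) {
        List<long[]> runs = new ArrayList<long[]>();
        for (long i : iterations) runs.add(new long[] { i, 1 });
        for (long r : replications) runs.add(new long[] { 1, r });
        return runs;  // 3 + 2 = 5 model variations for the example above
    }

    // Full factorial: every combination of the two sequences.
    public static List<long[]> fullFactorial(long[] iterations, long[] replications) {
        List<long[]> runs = new ArrayList<long[]>();
        for (long i : iterations) {
            for (long r : replications) {
                runs.add(new long[] { i, r });
            }
        }
        return runs;  // 3 * 2 = 6 model variations for the example above
    }
}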

As a result, we conclude that there are several ways to treat the presence of multiple variation sequences within the same experiment. Therefore, each Experiment references a subclass of ExperimentalDesign, which indicates whether multiple variation sequences are to be combined, and if so, in which manner. The class OneFactorAtATime corresponds to the first example given above, while the class FullFactorial corresponds to the second one. Notice that the discussion in this paragraph is closely related to the experimental designs presented in Section 6.4.1.1.

6.2.1.5 Configuring Simulation Runs

Each model variation generated within an Experiment is simulated by at least one simulator. When multiple simulators are specified for the same experiment, the varied PCM models are simulated with each of the simulators, one after another.


Figure 6.2: Simulation configuration part of the meta-model

The type of each simulator, SimuCom or EventSim, for instance, is determined by a subclass of AbstractSimulationConfiguration. Additionally, this class also covers the configuration of the respective simulation run. As both simulators, SimuCom and EventSim, essentially provide the same configuration possibilities, these are all captured by the AbstractSimulationConfiguration class.

The configuration options are shown in Figure 6.2. The way in which the ProbeSpecification (cf. Section 4.8) treats the measurements published on the blackboard is defined by the ProbeSpecConfiguration. One of three BlackboardTypes can be selected: either a SIMPLE blackboard executed in the same thread as the simulation, a CONCURRENT blackboard operating in a dedicated thread, or a blackboard that drops each incoming measurement, denoted by NONE.

When using a blackboard other than the NONE blackboard, the calculated performance metrics are persisted using a PersistenceFramework. At the time of writing, only the so-called SensorFramework was available. The sensor framework is configured with a SensorFrameworkDatasource that stores measurements either in the main memory (class MemoryDatasource) or in files on a storage device (class FileDatasource).

Moreover, a simulation configuration also includes a set of one or more StopConditions and optional RandomNumberGeneratorSeeds. Notice that stop conditions defined for an AbstractSimulationConfiguration are overridden by stop conditions defined for an experiment, if both conditions are of the same type.

6.2.1.6 Measuring the Simulation Performance

Usually, the rationale for conducting a software performance simulation is to gain deeper insights into the behaviour of the simulated system. In this thesis, however, we are usually concerned with examining the performance of the simulator itself, which is why the simulation results themselves are of minor interest. Instead, we need to collect performance metrics of the simulator while performing a sequence of simulation runs. For this purpose, each Experiment is associated with a PerformanceMeasurement. The duration of each simulation run can be determined using a SimulationDurationMeasurement. Performance metrics provided by the JVM can be measured by a JMXMeasurement, where JMX stands for Java Management Extensions. Further details on JMX are presented in Section 6.2.2.4.


Figure 6.3: Conceptual components of the experiment automation

6.2.2 Experiment Automation Tool

The experiment automation tool covered in this subsection enables the automated variation and simulation of PCM models according to instructions contained in a configuration model. In this way, it supports automated analyses where simulation input parameters are to be varied in a systematic way in order to observe their impact on output parameters. Input parameters are, for instance, the PCM model and simulation settings, such as the configuration of stop conditions. Output parameters exist on two levels: on the one hand, the simulation results characterising the behaviour of the simulated system, and on the other hand, the observable behaviour of the simulator itself. Examples of the latter type are the simulation duration or the memory consumption over the simulation runtime. Observing the impact of input parameters on the simulation results yields a sensitivity analysis. Conversely, when the impact on the performance behaviour of the simulator itself is of interest, this leads to a scalability analysis. Both kinds of analyses are supported by the experiment automation tool. In what follows, however, we focus on scalability analyses since a sensitivity analysis was not conducted in the course of this thesis.

The experiment automation has been implemented as a single Eclipse plug-in. On a conceptual level, the functionality can be separated into four components as illustrated in Figure 6.3, where each component corresponds to a Java package. The Experiment Controller coordinates the process of repeated model variations and analyses. An analysis can be, for instance, a simulation run. For this purpose, the Experiment Controller relies on the Model Variation component to perform variations of a PCM model. Analysis tools are triggered by means of Analysis Tool Adapters. Finally, the Performance Measurement component allows for capturing the performance of analysis tools.

6.2.2.1 Experiment Controller

The Experiment Controller is the central component of the experiment automation. It accepts a configuration model and ensures that the referenced PCM model is varied accordingly and that each resulting model variation is analysed by one or more analysis tools as specified in the configuration model.

6.2.2.2 Model Variation

When supplied with a PCM model along with a variation description in the form of an instance of the Variation meta-class, the Model Variation component provides a sequence of varied PCM models.

For this purpose, the experiment automation has to provide a corresponding implementation for each instance of the VariationType meta-class.


Each instance of a VariationType references its implementation class in an indirect way by means of the attribute strategyClass, which takes the fully qualified name of a Java class. With the knowledge of the class name, the Model Variation component can create an instance of the variation type implementation on demand using the reflection mechanism provided by the Java runtime environment.
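
In essence, this lookup boils down to standard Java reflection. The following sketch illustrates the mechanism; VariationStrategy is a placeholder for the tool's actual strategy interface, and the fully qualified class name would be obtained from the strategyClass attribute of the VariationType instance.

// Sketch: instantiate the variation strategy named in the configuration model.
public static VariationStrategy createStrategy(String strategyClassName) throws Exception {
    Class<?> clazz = Class.forName(strategyClassName);  // resolve the fully qualified name
    return (VariationStrategy) clazz.getDeclaredConstructor().newInstance();
}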

6.2.2.3 Analysis Tool Adapter

An Analysis Tool Adapter wraps a tool capable of analysing PCM models, for instance by simulation, and provides its analysis functionality to the Experiment Controller. Currently, there are two adapters that offer simulative analyses with SimuCom and EventSim, respectively.

On the meta-model level, an analysis tool is represented by a subclass of the ToolConfiguration meta-class. Consequently, there needs to be a corresponding adapter implementation for each subclass of ToolConfiguration. The mapping between meta-model classes and adapters is established by a factory that accepts an instance of the ToolConfiguration meta-class and returns the corresponding tool adapter instance.
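
Such a factory can be sketched in a few lines. The configuration classes below are the meta-classes from the configuration meta-model (Figure 6.1), whereas ToolAdapter, SimuComAdapter and EventSimAdapter are illustrative names chosen here, not necessarily those used in the plug-in.

// Sketch of the adapter factory; the adapter class names are illustrative.
public static ToolAdapter createAdapter(ToolConfiguration configuration) {
    if (configuration instanceof SimuComConfiguration) {
        return new SimuComAdapter((SimuComConfiguration) configuration);
    } else if (configuration instanceof EventSimConfiguration) {
        return new EventSimAdapter((EventSimConfiguration) configuration);
    }
    throw new IllegalArgumentException("No adapter for " + configuration.getClass().getName());
}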

6.2.2.4 Performance Measurement

The Performance Measurement component aims at enabling scalability analyses by means of the experiment automation tool. For this purpose, various performance characteristics of an analysis tool can be observed.

As with the Analysis Tool Adapter, each subclass of the PerformanceMeasurement meta-class requires a counterpart in the experiment automation tool, and a factory establishes the mapping between the two. Below, we give an overview of how the measurements are conducted.

SimulationDurationMeasurement

The implementation corresponding to the SimulationDurationMeasurement meta-class measures the duration of a simulation run. In order to do so, it takes a time measurement before and after each simulation run by calling the static Java method System.nanoTime(), where System is a class provided by the JRE. The time difference between the two calls yields the simulation duration.

Besides System.nanoTime(), Java provides some other methods for measuring time differences, where the result quality differs strongly between the various approaches. Kuperberg et al. [KKR11] provide a unified quality metric for timer methods which accounts for accuracy and invocation costs at the same time. An evaluation based on the newly proposed quality metric showed a comparatively high quality of System.nanoTime(), which justified our decision for this timer method.

When performing a series of simulation runs in a row without restarting the JVM, it is very likely that some unreferenced objects instantiated in the i-th simulation run remain on the heap when starting the (i+1)-th simulation run. Hence, a proportion of the garbage collector activity observable in a simulation run is actually caused by the previous simulation run, leading to skewed results. Therefore, we force a garbage collector run between subsequent simulation runs and do not begin the time measurement until the garbage collector has finished its work.
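
Combining the two aspects, the measurement essentially looks like the following sketch. It is an illustration rather than the plug-in's actual code; note that System.gc() only requests a collection, so the sketch relies on the JVM honouring that request before the measurement starts.

// Sketch: measure the duration of one simulation run in nanoseconds.
public static long measureDurationNanos(Runnable simulationRun) {
    System.gc();                       // request a GC run so leftovers of the previous run do not skew the result
    long start = System.nanoTime();    // first time measurement
    simulationRun.run();               // execute the simulation
    long end = System.nanoTime();      // second time measurement
    return end - start;                // simulation duration
}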


JMXMeasurement

The counterpart of the JMXMeasurement meta-class is the JMXMeasurementStrategy class, which is responsible for collecting performance characteristics of the analysis tool while an analysis is in progress – provided that the configuration model contains a JMXMeasurement for the respective analysis run. As indicated by its name, the JMXMeasurementStrategy uses the Java Management Extensions (JMX) for measurement purposes. As of the Java release J2SE 5.0, JMX is delivered as an integral part of the JRE. It is a powerful utility to manage Java programs running in a JVM, where the management capabilities go far beyond monitoring the performance.

For our purposes, however, a small subset of the monitoring capabilities suffices, namely a set of so-called MXBeans. MXBeans are managed by the JVM and can be retrieved by a call to a static method offered by the class java.lang.management.ManagementFactory. Two MXBeans provide the measurements for the JMXMeasurementStrategy.

In order to monitor the memory consumption in terms of the used heap space, we utilise the java.lang.management.MemoryMXBean. The occupied heap space is retrieved periodically and is used to update the two aggregates mean memory and maximum memory characterising the heap utilisation over time.

In a similar manner, we monitor the number of live threads by using the java.lang.management.ThreadMXBean, yielding the aggregates mean thread count and maximum thread count for each run of the analysis tool.
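
A single polling step of such a JMX-based probe can be sketched as follows. The class is our own illustration, while the MXBean calls are the standard JMX API mentioned above; updating the mean values from running sums is omitted for brevity.

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;

// Sketch of a periodically invoked probe collecting heap and thread statistics.
public class JmxProbeSketch {

    private final MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
    private final ThreadMXBean threads = ManagementFactory.getThreadMXBean();

    private long maxHeapUsed = 0;
    private int maxThreadCount = 0;

    // called periodically while the analysis tool is running
    public void poll() {
        long heapUsed = memory.getHeapMemoryUsage().getUsed();  // occupied heap space in bytes
        int liveThreads = threads.getThreadCount();             // number of live threads
        maxHeapUsed = Math.max(maxHeapUsed, heapUsed);
        maxThreadCount = Math.max(maxThreadCount, liveThreads);
    }
}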

6.3 Identifying Potential Performance Factors

The PCM meta-model offers a great number of degrees of freedom enabling system modellers to build abstractions of component-based software systems. Some of them influence the simulation performance and scalability, but a majority does not, as will be motivated in the course of this section. More precisely, this section aims at separating those degrees of freedom that have an effect on the performance or scalability of a simulation run from those that do not significantly affect the non-functional simulation behaviour.

As the non-functional behaviour of a simulation run is influenced not only by the usage of the simulator (where usage refers to simulating a specific PCM instance), but also by its implementation, we can hardly say whether a specific degree of freedom actually affects performance or scalability in a significant way – at least not without taking a specific implementation into account. A highly optimised simulator, for example, might process a loop in a fraction of the time needed by a less efficient simulator. In this example, the number of loops in the input model would be a performance-related factor for the latter simulator, but not for the former one. This is why we argue the other way round by beginning with the set of all degrees of freedom and excluding those factors that neither influence the performance nor the scalability in a significant way – regardless of a certain simulator implementation. The resulting factors, which potentially influence the non-functional simulation behaviour, serve as input to a more thorough analysis conducted in the next section.

6.3.1 Previous Work

In a previous work, a set of 40 out of 140 factors has been identified as being potentially performance-relevant [Bea08, pp. 68]. The initial set of 140 factors captures the variability of a PCM model due to attributes and associations. Each attribute of a meta-model class is a factor since the corresponding value in the model can be varied along the attribute's domain, which could be, for instance, defined by a data type like an integer.


An example in the context of the PCM is the population attribute associated with the ClosedWorkload meta-model class, which may take an integer value ranging from 0 to 2^31 − 1. Likewise, each association originating from a meta-model class is a factor since the number of associated classes in the model can be varied in the range specified by the association's multiplicity. The identification of performance-related attributes and associations in [Bea08] took place in two steps. At first, the associations with a constant multiplicity (1, 2, ...) were removed from the factor set since such a multiplicity prevents a variation. With the same argument, those associations which are restricted to a constant value by an OCL expression were excluded from the initial set. In the second step, factors were excluded by taking the semantics of the attribute or association into account. However, the selection of negligible factors seems to be based on intuition only and does not follow a systematic approach. As a result, the set of potential factors contains elements that arguably do not affect the performance of a simulator. Consider, for instance, the Interface class of the PCM meta-model, which has an association to one or more Signatures that are defined by the interface: no significant performance influence can arise from the sole existence of the interface and its signatures as long as there is no component that actually implements the interface. Still, the association from interfaces to signatures has not been excluded from the set of potentially performance-relevant factors, which is why we propose a more systematic approach below.

6.3.2 Reducing the Set of Potential Performance Factors

For the purpose of narrowing down the set of factors that are assumed to have an influence on the performance or scalability of a simulation run, we take into account the way of operation of software performance simulations in conjunction with the PCM and derive a decision criterion that divides the overall set of factors into two classes: The first class contains factors that do not affect the simulation performance at all or whose performance influence is negligible. On the contrary, the performance influence of the factors in the second class depends on the simulator implementation, which is why we regard all of them as candidates for performance-related factors.

Using the PCM, the architecture of a component-based software system is captured in terms of its static structure and its dynamic behaviour. In a nutshell, the static structure includes, amongst others, the assembly of components and the deployment of components on resource containers. The dynamic behaviour essentially models the behaviour of these components in terms of resource demands issued by component services and with regard to the interactions between different services arising due to calls to external components. Regardless of a specific simulator implementation, the simulation of a PCM model needs to imitate the simulated control flow defined by the dynamic behaviour. This means that the simulation starts with a UsageScenario and passes through one or more component services described by ResourceDemandingSEFFs. Both behavioural descriptions are modelled in terms of action chains, which results in a simulation procedure that passes through several sequences of actions and simulates the semantics associated with each action encountered. In consequence, the control flow of the simulation is aligned with the control flow described by the PCM model. For example, if there is a loop in a UsageScenario, a corresponding control flow construct can be found in the control flow of the simulator as well.

Assuming a fixed hardware and software environment, where a certain simulation runs in isolation, the performance of the simulation is determined by the control flow of the simulation run. Due to the correspondence between the actual control flow of the simulator and the simulated control flow defined by the dynamic behaviour, we are able to restrict the set of potentially performance-related factors to those degrees of freedom associated with the dynamic behaviour of the simulated system.


In particular, this includes AbstractUserActions contained in UsageScenarios and AbstractActions, which are the building blocks of ResourceDemandingSEFFs.

It can be argued that the simulated control flow is influenced not only by the behavioural descriptions of the simulated model but also by the static structure of the model. Take, for example, an ExternalCallAction, which represents a call to a service provided by an external component. Due to the principles underlying component-based software engineering, the target service of the call is not specified directly; instead, the call refers to the signature of an interface, which has to be declared as a required interface by the calling component. In order to provide an implementation of this interface to the calling component, the system model (which is a part of the static structure) connects the requiring component to a component implementing the interface. In consequence, when the simulation encounters an ExternalCallAction, the component service actually being called depends on the static structure. Therefore, the control flow complexity of the service call depends on the providing component connected with the requiring component.

At second glance, however, the influence of the static structure on the simulated control flow is an indirect one since the static structure essentially connects parts of the simulated control flows defined by UsageScenarios or ResourceDemandingSEFFs. Therefore, a potential performance influence arises from simulating the actions contained in the connected control flows, not from the connection itself. This is why we exclude the degrees of freedom arising from the modelling of a system's static structure from the set of factors presumed to influence the performance of a simulation run.

6.4 Ranking Performance Factors

In this section, we refine the list of potential performance factors by allocating a rank to each factor. The rank indicates the factor's importance in terms of performance, where a high rank represents a great influence on the performance. In this way, we can exclude unimportant factors from the detailed performance analysis, which is covered in the next section.

Below, we begin with an introduction to the ranking method before we describe the experimental setting and discuss the results of the ranking.

6.4.1 Analysis of Variance (ANOVA)

For the purpose of calculating the rank of each factor, we utilise the analysis of variance (ANOVA), which is a set of statistical methods targeted at building a statistical model that captures the relationship between a single response variable (often also denoted as the dependent variable) and one or more factors (the so-called independent variables) influencing the response variable. In particular, ANOVA also takes into account influences on the response variable due to experimental errors caused, for instance, by measurement inaccuracies. To this end, ANOVA separates the overall variance of a data set into two parts: the variance that can be explained by the influence of a factor, and the residual variance, which is attributed to experimental errors. Note, however, that we do not perform a full-fledged ANOVA since a ranking is already available before the variance is calculated.

In what follows, we give a short overview of experimental designs before we explain ANOVA in conjunction with the design that has been used for obtaining the ranking of potential performance factors. In doing so, we adhere to the description given in [Jai91].


6.4.1.1 Experimental Designs

ANOVA can be used in conjunction with various experimental designs. Given a number of factors, each of which can take one or more levels, an experimental design describes the factor level combinations that are of interest with regard to a study. In this way, the experimental design determines the number of required experiments.

The simplest approach is to vary one factor at a time, while keeping the other factors fixed at a certain level. This requires only 1 + n · k experiments for k factors with n levels each, but interactions between the factors are neglected. Such a design assumes that the response of an experiment does not change when one factor is fixed at a certain level and the remaining factors are changed in an arbitrary way – which is not a valid assumption in the majority of cases.

All kinds of interactions between factors are captured using a full factorial design. In a full factorial design, a measurement is conducted for each combination of factor levels, resulting in a total of n^k experiments when there are k factors with n levels each. Obviously, this design does not scale well when increasing the number of factors or their levels.

In order to cope with the vast amount of measurements, there are a number of so-called fractional factorial designs, whose purpose is to reduce the number of measurements while at the same time losing some information on the relationship between the response variable and the influencing factors.

A special case of the full factorial design is the 2^k design, where each of the k factors is restricted to exactly two levels. Thereby, the number of experiments can be reduced without losing information about potential factor interactions. This design is usually used to determine the important factors from a greater set of factors. In doing so, the two levels are associated with a low and a high value, respectively. Suppose, for example, the factor is the number of processors in a computer, which ranges from one to four. Then, a reasonable assignment of the two factor levels would be to set the low level to one processor and the high level to four processors.

Most commonly, experiments are repeated in order to minimise noise in the measurements due to uncontrolled factors and to be able to assess the influence of experimental errors. Using a plain 2^k design without repetitions, for instance, it is not possible to make any statements about measurement errors. Therefore, when influences on the response variable due to errors are to be separated from influences due to the variation of factor levels, the experiments have to be repeated r times, yielding a 2^k r design when repeating a 2^k design.
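For instance, for the setting used later in this chapter (Section 6.4.3) with k = 6 factors and r = 100 repetitions, this amounts to

2^k · r = 2^6 · 100 = 6,400

simulation runs per simulator.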

6.4.1.2 2kr ANOVA

In this section, we explain how ANOVA is used in conjunction with a 2^k r experimental design. As before, the description is based on [Jai91].

The first step is to conduct the 2^k experiments as defined by the experimental design, where each experiment is repeated r times. The resulting 2^k · r measurements can be organised in a table as follows:


Experiment   F1    F2    ...   Fk    Responses y_{i,j}                        Response Mean ȳ_i
1            L     L     ...   L     y_{1,1}, y_{1,2}, ..., y_{1,r}           ȳ_1
2            H     L     ...   L     y_{2,1}, y_{2,2}, ..., y_{2,r}           ȳ_2
...          ...   ...   ...   ...   ...                                      ...
2^k          H     H     ...   H     y_{2^k,1}, y_{2^k,2}, ..., y_{2^k,r}     ȳ_{2^k}

The F_i, i ∈ {1, ..., k}, denote the factors. The factor levels are represented by L and H, indicating the low and the high value, respectively. The outcome of replication j of experiment i is denoted by the response y_{i,j}. Calculating the mean of the r replications for an experiment i yields the response mean ȳ_i = (1/r) · Σ_{j=1}^{r} y_{i,j}.

Next, a nonlinear regression model is created that captures the influence of the various factors, interactions between two or more factors, and experimental errors on the response variable. In doing so, a measurement y_{i,j} is modelled as a combination of

• the effects of the factors F_i (main effects), denoted by q_1, ..., q_k,

• the effects of two-factor interactions, denoted by q_{k+1}, ..., q_{k+\binom{k}{2}},

• analogously, the effects of 3-, ..., (k-1)-factor interactions,

• the effect of all factors interacting with each other, denoted by q_{2^k},

• and finally, the effect attributed to experimental errors, denoted by e.

This yields the model

yi,j =q0 +main effects︷ ︸︸ ︷

q1x1i + q2x2i + ...+ qkxki

+ qk+1x1ix2i + ...︸ ︷︷ ︸2, 3, ... (k-1) factor interactions

+ q2kx1i...xki︸ ︷︷ ︸k-factor interaction

+e (6.1)

where

x_{li} = \begin{cases} -1 & \text{if the level of factor } F_l \text{ in experiment } i \text{ is Low} \\ +1 & \text{if the level of factor } F_l \text{ in experiment } i \text{ is High} \end{cases} \quad (6.2)

Notice that the x_l range from x_1 to x_k, whereas the coefficients q_i range from q_1 to q_{2^k}. Solving this model on the basis of the 2^k · r measurements yields the main effects q_1, ..., q_k of the factors, the effects of their interactions q_{k+1}, ..., q_{2^k}, and the residual effects e due to experimental errors. A quite efficient technique for solving such models is the sign table method described in [Jai91, p. 286]. It is, however, beyond the scope of this section to cover this technique.

In the next step, the effects q_m are used to assign the overall variation observable in the data set to factors, factor interactions and experimental errors. Variation can be observed between the r replications of the same experiment as well as between the response means of the 2^k experiments. The former type of variation – variation within experiments – is attributed to experimental errors, whereas variation of the latter type – variation between experiments – is assumed to be caused by the factors and their interactions.

The errors in a data set can be obtained by calculating the distance of a measurement y_{i,j} from the corresponding response mean ȳ_i, summed up over all experiments i and replications j. However, the overall error calculated this way sums up to 0 and is therefore not suited to describe the variation due to errors. Therefore, the respective distances are squared in a first step and added up thereafter, yielding the sum of squared errors (SSE):

SSE = \sum_{i=1}^{2^k} \sum_{j=1}^{r} (y_{i,j} - \bar{y}_i)^2 \quad (6.3)

In a similar manner, the variations due to factors and their interactions are described by the sum of squared effects, denoted by SS_m, where SS_1, ..., SS_k is the variation explained by the factors F_1, ..., F_k and SS_{k+1}, ..., SS_{2^k} is the variation explained by factor interactions:

SS_m = \sum_{i=1}^{2^k} \sum_{j=1}^{r} q_m^2 = 2^k r \, q_m^2 \quad (6.4)

The overall variation of a data set can then be expressed as a combination of the squared effects and errors. This value is called the total sum of squares (SST) and is defined as follows:

SST = SS_1 + SS_2 + \ldots + SS_{2^k} + SSE \quad (6.5)

Now, the relative importance of a factor F_m can be determined by calculating the fraction of variation that the factor contributes to the total variation:

\frac{SS_m}{SST} \cdot 100 \quad (6.6)

Likewise, the percentage of variation due to experimental errors can be obtained by:

\frac{SSE}{SST} \cdot 100 \quad (6.7)

The variations calculated above serve their purpose in determining the importance of the various factors. It is, however, important to be aware that they are not obtained in a statistical manner. In particular, variation should not be confused with the statistical term variance. As the name suggests, ANOVA does certainly also provide means to obtain the variance of the effects and errors. But we skip that part, as the calculated variations suffice to rank the various factors.
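To make the calculation scheme of equations 6.1 to 6.7 more tangible, the following sketch computes the effects, sums of squares and relative importances for a 2^k r design using the sign table method mentioned above. The class and method names are ours; the sketch is not part of the tooling used in this thesis and only illustrates the computation.

    import java.util.Arrays;

    // Minimal sketch of a 2^k r ANOVA ranking via the sign table method [Jai91].
    // Illustrative only; not the tooling used for the experiments in this thesis.
    public class TwoKrAnova {

        // y[i][j] is replication j of experiment i; the experiments are ordered such
        // that bit l of i encodes the level of factor l (0 = low, 1 = high).
        public static void rank(String[] factors, double[][] y) {
            int k = factors.length;
            int runs = 1 << k;                 // 2^k experiments
            int r = y[0].length;               // r replications per experiment

            double[] ybar = new double[runs];  // response means
            for (int i = 0; i < runs; i++) {
                ybar[i] = Arrays.stream(y[i]).average().orElse(0);
            }

            // Effects q_m: bit l of m selects factor l; the sign for experiment i is
            // the product of the +-1 entries of all factors contained in m (sign table).
            double[] q = new double[runs];
            for (int m = 0; m < runs; m++) {
                for (int i = 0; i < runs; i++) {
                    int lowFactors = Integer.bitCount(m & ~i);
                    q[m] += (lowFactors % 2 == 0 ? 1 : -1) * ybar[i];
                }
                q[m] /= runs;
            }

            double sse = 0;                    // variation within experiments (errors)
            for (int i = 0; i < runs; i++) {
                for (int j = 0; j < r; j++) {
                    sse += (y[i][j] - ybar[i]) * (y[i][j] - ybar[i]);
                }
            }

            double sst = sse;                  // SST = sum of all SS_m plus SSE
            double[] ss = new double[runs];
            for (int m = 1; m < runs; m++) {   // m = 0 is the grand mean q_0
                ss[m] = runs * r * q[m] * q[m];
                sst += ss[m];
            }

            for (int l = 0; l < k; l++) {      // relative importance of each factor
                System.out.printf("%-25s %6.2f%%%n", factors[l], ss[1 << l] / sst * 100);
            }
            System.out.printf("%-25s %6.2f%%%n", "Residuals (errors)", sse / sst * 100);
        }
    }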

6.4.2 Assumptions

On the basis of the initial set of potential performance factors, we conducted a ranking of the factors using ANOVA in conjunction with a 2^k r experimental design. As the whole set of factors would result in a vast amount of experiments, we make the following assumptions.

First, we exclude the Start and Stop actions as well as the StartAction and the StopAction, as they merely indicate the start and the stop of a ScenarioBehaviour or a ResourceDemandingBehaviour, respectively, but no simulation semantics is associated with these modelling elements. Moreover, the number of these actions is fixed to one for each behaviour. Second, we assume that a Loop (contained in a ScenarioBehaviour) is virtually equivalent to a LoopAction (contained in a ResourceDemandingBehaviour) in terms of the influence on the simulation performance, provided both loops are configured in the same way, i.e. the loop iteration count is equal and both loops encapsulate the same inner behaviour. In the same manner, we expect a Branch to behave very similarly to a BranchAction with regard to the simulation performance. Third, we exclude the SetVariableAction since SimuCom and EventSim rely on virtually the same implementation when executing an action of this type, which is why we do not expect any insights when assessing its performance influence. Fourth, in the knowledge that AcquireActions along with ReleaseActions behave in a way very similar to InternalActions, they are excluded as well. Furthermore, CollectionIteratorActions cannot be analysed since EventSim does not yet support them.

The remaining actions, grouped by their type, are:

AbstractUserAction: Branch, Delay, EntryLevelSystemCall, Loop
AbstractAction: ForkAction, InternalAction

The performance of a simulation run is also influenced by the Workload corresponding to the simulated UsageScenario since the Workload determines the usage intensity of the system, i.e. the number of users flowing through the simulated system concurrently. In contrast to the factors listed above, however, the performance influence does not arise due to the processing of the Workload itself, but is instead caused by the actions contained in the corresponding UsageScenario. Consider, for instance, a ClosedWorkload with a population of 1 and let the corresponding UsageScenario comprise a chain of three actions: Start, Delay, Stop. In this simple example, each action is simulated once. When increasing the workload population to 2, each action is simulated twice; obviously, this involves an increased simulation effort. The additional effort is, however, caused by the three additional actions that have to be simulated, not by the Workload itself. In this sense, Workloads can be regarded as being orthogonal to actions. We therefore restrict the Workload to a single user when assessing the performance influence of a certain factor.

6.4.3 Experimental Setting

As a result from the considerations above, we conducted a 2^6 r ANOVA, where each experiment has been repeated 100 times, resulting in a total of 2^6 · 100 = 6,400 experiments per simulator. Each experiment assigns either a low or a high level to each factor. The high and low value, respectively, can be chosen somewhat arbitrarily on the obvious condition that low < high. We set low to 1 and high to 100.

The overall idea of the ranking is to start with an initial PCM model where each factor is set to its low level and to increase the number of modelling elements of a certain type depending on the factor level combination defined by a specific experiment. Suppose, for example, a certain experiment sets the Loop factor to its high value, while at the same time the remaining factors are set to their low levels. Assuming that the low and high factor levels correspond to 1 and 100, respectively, we obtain a PCM model containing 100 model elements of the type Loop and 1 model element for each remaining factor.

After modifying the base model, the resulting model is simulated with both simulators, yielding a response that characterises the performance influence of the current combination of factor levels on the two simulators.

The simulation configuration which was used to simulate the generated PCM models can be seen in Table 6.2. The measurement count stop condition is set to 1000, while the stop condition depending on the progress of the simulated time is disabled.


Stop Conditions
    Measurement Count:  1000
    Simulated Time:     -1 (turns off the stop condition)
ProbeSpecification
    Blackboard Type:    NullSampleBlackboard (drops each incoming measurement)

Table 6.2: Simulation settings

Notice that the naming of the former stop condition is slightly misleading in that it does not stop as soon as a predefined number of measurements have been taken. Instead, a measurement in this context is defined more coarse-grained as the set of all fine-grained measurements caused by a simulated user. Thus, a measurement in this sense actually indicates that a simulated user has completed passing through the simulated system. In consequence, setting the stop condition to 1000 causes the simulation to stop as soon as the simulated control flow has been executed 1000 times.

Another important setting is the type of the ProbeSpecification blackboard. To recap what was stated in Section 4.8, the blackboard serves as a container for all measurements collected in the course of a simulation run. Whenever a measurement is published on the blackboard, interested calculators are notified, which then calculate performance metrics out of the fine-grained measurements. Depending on the simulation settings, these performance metrics are then stored on a storage device or in the main memory. Calculating response metrics and storing the corresponding results takes up a great amount of the overall simulation runtime, which is why we disabled both activities. For this purpose, the ProbeSpecification was initialised with a blackboard of the type NullSampleBlackboard, which drops any measurement that is published to it. In this way, we exclude the performance influence introduced by the ProbeSpecification from our performance measurements, while at the same time the measurements account for the simulator-specific overhead due to collecting and publishing simulation-internal measurements. As shown by the validation, both simulators collect virtually the same data, which is why this setting does not distort the experimental results.

6.4.4 PCM Base Model

Before we explain how the response is obtained for each of the 6,400 · 2 simulation runs, we take a look at the base model, which is shown in Figure 6.4. It comprises a modelling element for each of the aforementioned six factors, where the EntryLevelSystemCall is required twice, as will be motivated later on.

A simulation of the base model starts with the EntryLevelSystemCall callDelay, whose sole purpose is to call the SEFF named delay. The invoked SEFF begins with an InternalAction, which issues a resource demand of 1 to a resource of the type DELAY. This causes an instantaneous advance of the simulation time by a single simulated time unit before the simulation continues with the ForkAction. This action forks the control flow in such a way that the forks are executed in parallel, where each fork is represented by a ForkedBehaviour.

The ForkedBehaviour is empty in order to be able to separate a potential performance influence due to a ForkedBehaviour from the influence caused by nested actions. Suppose, for example, the forked behaviour did contain a Loop action. Then, when increasing the number of forked behaviours for the purpose of determining the performance influence, we would actually obtain an estimation of the influence of the forked behaviour in conjunction with the loop. The same argument also applies to Loops and Branches, which is why their corresponding behaviours are empty as well. It could be argued that an empty Loop or Branch might be removed by compiler optimisation. As the results presented later on suggest, this is, however, not the case since we can observe an increase in simulation duration when increasing the number of Loops or Branches, respectively.

After processing the fork, the simulated control flow returns to the usage scenario, where an EntryLevelSystemCall is simulated. As opposed to the system call issued earlier, the purpose of the latter call is to capture the performance influence due to a call itself; the former call, in contrast, serves merely as a means to direct the simulated control flow to the InternalAction and the ForkAction. As motivated above, the system call invokes an empty SEFF in order to exclude an influence of the modelling elements contained in the SEFF on the simulation performance. Next, a Delay of a single simulated time unit is issued before the simulation encounters a Loop, followed by a Branch. The loop's iteration count is set to 1. The branch comprises two alternatives, where each alternative is associated with a probability of 0.5.

6.4.5 Determining the Simulation Performance

There are various indicators for the performance of a simulation run. As stated at the beginning of this chapter, performance refers to timeliness and resource efficiency. Timeliness in the case of a simulation could be the duration between the start and the stop of a simulation run. Resource efficiency captures performance in terms of the resource usage over simulation time, such as processor utilisation, main memory usage and the number of threads created.

Timeliness subsumes resource efficiency to a certain extent in that an increased resource usage implies an increased simulation runtime. In this way, timeliness is a powerful and intuitive measure when the reasons for a specific performance behaviour are not important. This is exactly the case for the ranking of the various potential performance factors, which is why we assess the performance of a simulation run solely by its response time. Later on, starting with the next section, we also consider resource efficiency.

The response time of a simulation run can be defined in different ways. Each definition of a response time comprises a time instant where the measurement begins along with a time instant where the response time measurement stops. In the case of SimuCom, we could begin the response time measurement either immediately after the simulation run has been started, which includes the code generation phase, or we could await the processing of the first event after the simulation code has been generated completely. Likewise, with EventSim we can await the initialisations including, for instance, the mounting of probes and the preprocessing of the static structure. In general, the initial effort before the simulation actually begins with the first event is considerably greater with SimuCom. In order to exclude the dissimilar initial effort, the response time throughout this section is defined as the duration between the occurrence of the first event and the simulation stop triggered by a stop condition.
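A minimal sketch of this response time definition is given below. The hook methods onFirstEvent() and onSimulationStop() are hypothetical and do not correspond to actual SimuCom or EventSim interfaces; they merely mark the two time instants named above.

    // Minimal sketch of the response time definition used in this section.
    public class SimulationDurationProbe {

        private long firstEventAt = -1;
        private long durationNanos = -1;

        // called when the first simulation event is processed; deliberately excludes
        // code generation (SimuCom) and initialisation (EventSim)
        public void onFirstEvent() {
            firstEventAt = System.nanoTime();
        }

        // called when a stop condition terminates the simulation run
        public void onSimulationStop() {
            durationNanos = System.nanoTime() - firstEventAt;
        }

        public long durationMillis() {
            return durationNanos / 1_000_000;
        }
    }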

6.4.6 Results and Discussion

An initial overview of the performance influence of the various factors is given by the box plots depicted in Figure 6.5. A box plot shows at the same time the median of a sample, the sample's interquartile range and potential outliers. The median is indicated by the bold bar separating each box into two parts: the upper part contains the quarter of measurements just greater than the median, while the lower part contains the quarter directly below the median, where it is assumed that the measurements contained in the sample are in increasing order. Thus, the box represents the range of half of the values, reaching from the lower quartile to the upper quartile.


[Figure 6.4: PCM base model used for ranking potential performance factors. The UsageScenario contains the EntryLevelSystemCalls callDelay and callDoNothing, a Delay (TimeSpecification = 1), a Loop (iteration count 1) and a Branch with two transitions of probability 0.5 each. The SEFF delay consists of an InternalAction (parametric resource demand of 1 on a DELAY resource) followed by a ForkAction with an empty ForkedBehaviour; the SEFF doNothing is empty.]

The extensions below and above the box are called whiskers and capture the two remaining quarters containing the smallest values and the greatest values, respectively. A whisker extends to the most extreme values of a quarter, i.e. to the values farthest from the median, as long as their distance from the box does not exceed 1.5 times the interquartile range. Each value exceeding this distance is deemed an outlier and depicted by a circle below or above the whisker.

6.4.6.1 Initial Overview Using Box Plots

We begin the discussion with the box plot depicted in Figure 6.5(a). It shows the factors' influences on the performance of a simulation run in SimuCom. Each box subsumes the response times of 100 simulation runs as described in Section 6.4.3. Notice that the vertical scale is logarithmic, whereas the same scale is linear in Figure 6.5(b). Not surprisingly, the base model has the smallest influence on the runtime of the simulation. However, there is a great number of outliers. These extreme outliers can be observed with approximately the first 10 to 20 successive simulation runs of an experiment series and disappear thereafter. The experiment series includes experiments for all six factors and is started with the base model, which is why the outliers are mostly limited to the unmodified base model. We assume that these outliers are due to just-in-time compilation done by the JVM.

When increasing the number of Loops from 1 to 100, the simulation runtime increases slightly. The same applies when increasing the number of Branches and EntryLevelSystemCalls. Nevertheless, none of the three factors seems to have a great effect on the performance. In contrast, Delays and InternalActions both cause an increase in the simulation duration by approximately an order of magnitude. This is an interesting observation, since both types of modelling elements cause an advance in the simulation time.


A detailed discussion is presented in the next section. Finally, the most influential factor is the number of ForkedBehaviours. With the knowledge of how SimuCom simulates a forked behaviour, this result is less surprising: when the simulation encounters a fork, it spawns a dedicated thread for each forked behaviour encapsulated by the ForkAction. But, as before, we defer the detailed discussion to the next section.

The box plot depicted in Figure 6.5(b) shows the performance influence of the same factors when using EventSim instead of SimuCom. No changes were made to the experimental setting described in Section 6.4.3. The first observation is that we do not need a logarithmic scale for plotting the results. This means that the differences in the performance influence between the factors are not as wide as with SimuCom. Moreover, it is apparent that the number of Delays has a minor effect on the simulation runtime, whereas control flow constructs like Loops and, in particular, Branches are relatively expensive. A more thorough comparison will be given in the next section.

[Figure 6.5: Simulation runtime when replicating a specific model element 100 times compared with an initial model without replication. (a) SimuCom (runtime in log scale); (b) EventSim. Both panels show box plots of the simulation runtime in ms for the initial model and for 100 Loops, Branches, EntryLevelSystemCalls, Delays, InternalActions and ForkedBehaviours, respectively.]

6.4.6.2 ANOVA Ranking

A more compact performance assessment can be obtained using ANOVA, which was employed to determine the relative importance of each factor as explained in Section 6.4.1. While the box plots discussed before cover only a relatively small fraction of 700 experiments, the ANOVA ranking utilises the whole set of 6,400 experiments for each simulator.

The ranking results for SimuCom can be seen in Table 6.3. With a relative importance of around 95%, the number of ForkedBehaviours clearly dominates the performance of a simulation run. This is in line with the findings from the box plots seen before. On ranks two and three, the numbers of InternalActions and Delays follow far behind with an importance of 1.46% and 1.16%, respectively. Nevertheless, when compared with EntryLevelSystemCalls and Loops on ranks five and six, the influence of both InternalActions and Delays is not negligible. A Delay, for instance, is with regard to performance around 100 times as influential as a Loop. The variation due to experimental errors contributes only 1.09% to the overall variation of the data set. Similarly, interactions between two factors contribute very little. Based on this ranking, we deem the number of ForkedBehaviours, InternalActions and Delays the most important performance factors in SimuCom.

As already indicated by the box plot shown in Figure 6.5(b), we obtain a different ranking of the factors when using EventSim. The results are depicted in Table 6.4. The two most important factors are InternalActions on rank one and ForkedBehaviours on rank two, with a relative importance of 43.51% and 23.78%, respectively. Next, Branches and EntryLevelSystemCalls have a relative importance of around ten percent each. Loops and Delays are negligible.

In contrast to SimuCom, the overall performance influence with EventSim due to errors and two-factor interactions is around 10 percent. This might be due to the absence of a dominating factor – as it is the case with SimuCom – in conjunction with the way in which the relative importance of a factor is calculated: the effect of each factor on the simulation runtime is squared and multiplied by the total number of experiments and, in this way, yields the factor's variation. If there is a factor with a comparatively high effect, the distance between the factors' effects further increases when they are squared. As a result, the squared effect of the influential factor decreases the relative importance of the remaining factors. This is also the reason why the percentages of the relative factor influences are not comparable between SimuCom and EventSim. For example, although a Loop in EventSim has an importance of 2.49% compared to merely 0.01% in SimuCom, the box plots shown in Figure 6.5 suggest that a Loop in SimuCom is approximately twice as expensive as a Loop in EventSim.

Although the percentages are not comparable, we need to determine a unified set of factors that contains both the factors influencing SimuCom's performance to a great extent and the important performance factors with regard to EventSim. For this purpose, we select the top three ranked factors for each simulator, resulting in the following set of factors:

• ForkedBehaviour

• InternalAction

• Delay

• Branch

6.5 Comparing Performance and Scalability

In this section, we perform a thorough performance and scalability comparison of SimuCom and EventSim. The comparison is based on the set of the most influential performance factors identified in the previous section.

6.5.1 Experimental Setting

The experimental setting in this section is mostly identical with the setting described in Section 6.4.3. In the same manner, we begin with the PCM base model illustrated in Figure 6.4 and vary the number of specific modelling elements in order to assess their influence on the simulation performance. However, instead of conducting a multi-factor ANOVA, we analyse the factors independently of each other. Furthermore, the experiments have a higher resolution in that not only the two factor levels low and high (1 and 100) are considered; instead, we vary each factor from 1 to 1000 with a step width of 100, yielding 10 experiments for each factor. In order to reduce the influence due to measurement errors, we repeat each experiment 30 times, resulting in a total of 300 experiments.

Each of the experiments was simulated using the simulation configuration depicted in Table 6.2. That is, the modified PCM base model is traversed 1000 times in succession, which corresponds to 1000 simulated users entering the simulated system in a row, one after the other. Furthermore, no simulation-internal performance metrics are calculated and stored.


Rank  Factor                    Sum of Squares     Relative Importance
1     ForkedBehaviours          809,710,743,834    95.85%
2     InternalActions            12,339,217,542     1.46%
3     Delays                      9,784,946,039     1.16%
4     Branches                      296,838,829     0.34%
5     EntryLevelSystemCalls         154,097,330     0.02%
6     Loops                          43,263,551     0.01%
      Two-Factor Interactions     3,281,937,137     0.39%
      Residuals                   9,168,409,098     1.09%
      Total                     844,779,453,362     100%

Table 6.3: Ranking and relative importances of potential performance factors in SimuCom

Rank  Factor                    Sum of Squares     Relative Importance
1     InternalActions             3,661,819,622    43.51%
2     ForkedBehaviours            2,001,128,011    23.78%
3     Branches                      906,553,454    10.78%
4     EntryLevelSystemCalls         747,901,525     8.89%
5     Loops                         209,470,720     2.49%
6     Delays                         51,723,325     0.61%
      Two-Factor Interactions       652,489,336     7.75%
      Residuals                     185,151,411     2.20%
      Total                       8,416,237,404     100%

Table 6.4: Ranking and relative importances of potential performance factors in EventSim



Instead, we conducted measurements characterising the performance of the used simulator as follows:

Simulation Duration The duration of each simulation run is measured by means of the measurement strategy corresponding to the SimulationDurationMeasurement (cf. sections 6.2.1.6 and 6.2.2.4). Measurements of the simulation duration are never conducted together with the further measurements described below in order to prevent a potential influence of the measurement overhead on the duration of the simulation run. Furthermore, we define the duration of a simulation run according to Section 6.4.5.

Memory Usage The memory consumption in terms of heap usage over the course of a simulation run is measured using the JMXMeasurement strategy (cf. Section 6.2.2.4). The polling period, i.e. the time between successive measurements, is set to 50 ms.

Thread Count Similar to the memory usage monitoring, the number of live threads over the course of a simulation run is measured by the JMXMeasurement strategy with a polling period of 50 ms (a sketch of such a JMX-based poller is shown below).
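The following sketch shows how such a JMX-based poller can be realised with the standard java.lang.management beans. The class name and the way samples are reported are our own simplification, not the actual JMXMeasurement implementation.

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryMXBean;
    import java.lang.management.ThreadMXBean;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    // Minimal sketch of a JMX-based monitor polling heap usage and live thread
    // count at a fixed period (50 ms in the experiments).
    public class JmxPollingMonitor {

        private final MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        private final ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        private final ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();

        // starts polling with the given period
        public void start(long periodMillis) {
            scheduler.scheduleAtFixedRate(() -> {
                long usedHeap = memory.getHeapMemoryUsage().getUsed(); // bytes
                int liveThreads = threads.getThreadCount();
                System.out.printf("heap=%d bytes, threads=%d%n", usedHeap, liveThreads);
            }, 0, periodMillis, TimeUnit.MILLISECONDS);
        }

        public void stop() {
            scheduler.shutdownNow();
        }
    }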

6.5.2 Results and Discussion

The measurement results, grouped by the varied factor, can be seen in figures 6.6, 6.7, 6.8 and 6.9. The duration of the simulation runs is illustrated by line plots. The shaded area around a line represents the respective 95% confidence interval, which offers a glimpse of the experimental error due to uncontrolled factors. The memory consumption as well as the number of threads are depicted by combined bar and line plots, where the bars represent mean values and the lines show the maximum values. The lines are surrounded by confidence bounds as explained above. The error bar at the top of each bar serves the same purpose.

6.5.2.1 Simulation Duration

A first view on the duration of the simulation runs reveals two characteristics that hold for both compared simulators. First, the duration of a simulation run increases linearly when the complexity of the simulated system rises, i.e. when additional elements are added to the simulated base model. Hence, both simulators can be considered as being scalable. Second, and more interestingly, the degree at which the two simulators scale differs to a great extent. In the following, we discuss the reasons for this observation.

ForkedBehaviours

The greatest difference between the two simulators can be observed when the number of ForkedBehaviours rises, which is shown in Figure 6.6. Applying linear regression, the differences can be expressed in numerical terms. Let d_s(m, c), s ∈ {SimuCom, EventSim}, be the duration of a simulation run on our test system described in Table 6.1 when supplied with a PCM model m and the simulation configuration c shown in Table 6.2. Furthermore, let the replication function r_t(m, i) denote the modification of a PCM model m in such a way that there are i model elements of the type t after applying the function to m. The base model b is defined according to Section 6.4.4. Then, the duration of a simulation can be approximated by:

d_SimuCom(r_ForkedBehaviour(b, i), c) = 878 ms + 240 · i ms


d_EventSim(r_ForkedBehaviour(b, i), c) = −27 ms + 5.28 · i ms

Thus, adding a further ForkedBehaviour to the model slows down SimuCom by around 240 ms, compared to a slow-down of merely 5 ms observed with EventSim.
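To put these slopes into perspective, extrapolating the two fitted lines to the largest model of this experiment series (i = 1000) gives roughly

d_SimuCom ≈ 878 ms + 240 · 1000 ms ≈ 241 s,    d_EventSim ≈ −27 ms + 5.28 · 1000 ms ≈ 5.3 s,

i.e. an approximate factor of 45 between the two simulators at this model size.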

In order to explain these findings, we take a look at the way in which the two simulators realise forks of the simulated control flow. As introduced in Section 3.5.2, SimuCom spawns a native thread for each simulated user. More generally, each simulated control flow is executed in a dedicated thread, which is why each ForkedBehaviour occupies a thread – even though the forked behaviours correspond to the same simulated user. Usually, the newly spawned thread would simulate the action chain encapsulated by the forked behaviour; in the context of this experiment, however, the forked behaviour has been left empty intentionally. Thus, when the simulation encounters a ForkAction, it spawns a number of threads, one for each ForkedBehaviour, which are immediately released again before the simulation moves on to the fork's successor. EventSim, by contrast, makes use of a single thread, regardless of the number of users or requests to be simulated. The different usage of threads can be seen in Figure 6.6(b).

As a result from the discussion above, we attribute the observed performance difference to the effort required to create and release threads. Another performance influence in the presence of multiple threads arises due to context switches, which occur whenever the simulation hands over control from one thread to another. Because of the limited lifetime of a thread in this experiment, however, the results most likely do not reflect the influence of context switches. Instead, the experiments associated with the simulation of InternalActions and Delays are expected to take this influence into account.
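The following sketch contrasts the two strategies for a ForkAction with empty forked behaviours. Neither method reproduces SimuCom or EventSim code; the method names and the stand-in request objects are illustrative only.

    import java.util.ArrayList;
    import java.util.List;

    // Minimal sketch of the two world-views handling a ForkAction whose
    // ForkedBehaviours are empty, as in this experiment.
    public class ForkSketch {

        // Process-oriented style: one native thread per forked behaviour.
        static void simulateForkWithThreads(int forkedBehaviours) throws InterruptedException {
            List<Thread> forks = new ArrayList<>();
            for (int i = 0; i < forkedBehaviours; i++) {
                Thread t = new Thread(() -> {
                    // forked behaviour is empty: nothing to simulate
                });
                forks.add(t);
                t.start();               // thread creation dominates the cost
            }
            for (Thread t : forks) {
                t.join();                // wait until all forks have finished
            }
        }

        // Event-oriented style: the forks are plain objects handled on one thread.
        static void simulateForkWithEvents(int forkedBehaviours) {
            List<Object> forkedRequests = new ArrayList<>();
            for (int i = 0; i < forkedBehaviours; i++) {
                forkedRequests.add(new Object()); // stands in for a ForkedRequest entity
            }
            // all requests are processed on the single simulation thread; since the
            // behaviours are empty, the control flow moves on to the fork's successor
        }
    }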

[Figure 6.6: Performance comparison when increasing the number of ForkedBehaviours from 0 to 1000. (a) Simulation duration in ms for SimuCom and EventSim; (b) maximum thread count for SimuCom and EventSim.]

InternalActions and Delays

As indicated by figures 6.7(a) and 6.7(b), InternalActions and Delays are very similar in their influence on the duration of a simulation run, which is why we discuss them in conjunction. Performing a linear regression on the results yields the following approximations of the influences, where the functions and variables are defined as above:

d_SimuCom(r_InternalAction(b, i), c) = −687 ms + 44 · i ms

d_EventSim(r_InternalAction(b, i), c) = 120 ms + 9 · i ms

d_SimuCom(r_Delay(b, i), c) = 281 ms + 31 · i ms


d_EventSim(r_Delay(b, i), c) = 46 ms + 1.40 · i ms

As expected, the influence of InternalActions and Delays on the simulation duration is quite similar when using SimuCom – at least compared to the large influence of ForkedBehaviours. This is not a surprising result since both actions do essentially the same work inasmuch as both are concerned with advancing the simulation time. While Delays advance the simulation time by a fixed amount, the time advance caused by InternalActions is usually influenced by the utilisation of the requested resource. But this is not the case for the present experiment since a DELAY scheduling policy is used with each resource requested by InternalActions. This raises the question of why there is still a difference in performance between the two types of actions. We assume that the difference is due to the overhead induced by accessing the simulated resource and its scheduling policy, respectively.

Comparing the influence of InternalActions and Delays with EventSim, in contrast, reveals a relatively high difference between the two actions; one InternalAction accounts for 9 ms whereas the simulation of a Delay requires only 1.4 ms. The absolute difference between InternalActions and Delays, however, is quite similar between SimuCom and EventSim. This observation underpins the assumption above that the difference is caused by the resource overhead, which is independent of the used simulator since both rely on virtually the same simulation platform.

[Figure 6.7: Simulation duration comparison when increasing the number of InternalActions or Delays from 0 to 1000. (a) InternalActions; (b) Delays. Both panels show the simulation runtime in ms for SimuCom and EventSim.]

More interesting, however, is the difference in duration between the two simulators; simulating a Delay in EventSim is around 22 times faster than using SimuCom. This can be explained in a similar way as the influence of ForkedBehaviours discussed above. In SimuCom, a Delay is realised by suspending the thread corresponding to the user that is to be delayed. The synchronisation between simulated users or their threads, respectively, is done by means of two binary semaphores, i.e. by passive resources with a capacity of one each. Together, the semaphores ensure that exactly one simulated user is active at a time while the others have to wait for one of the semaphores. In consequence, each Delay in SimuCom entails releasing one semaphore and acquiring the other one thereafter. Then, the control is passed to another thread waiting for the semaphore just released. This results in a context switch. With EventSim, neither do semaphores need to be acquired or released nor do context switches occur, which is a reasonable explanation for the observed performance difference.
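A minimal sketch of the two ways a Delay can be realised is given below. The semaphore hand-over mirrors the mechanism described above; the event variant uses a toy future-event list. None of the types correspond to actual SimuCom or EventSim classes.

    import java.util.PriorityQueue;
    import java.util.concurrent.Semaphore;

    // Minimal sketch of a Delay in the process-oriented and the event-oriented style.
    public class DelaySketch {

        // Process-oriented: suspend the user's thread and hand over control.
        static void delayProcessOriented(Semaphore mySemaphore, Semaphore otherSemaphore)
                throws InterruptedException {
            otherSemaphore.release();   // let another simulated user run ...
            mySemaphore.acquire();      // ... and block until control is handed back;
                                        // this hand-over implies a thread context switch
        }

        // Event-oriented: schedule a resume event, no thread is blocked.
        record Event(double time, Runnable action) {}

        static final PriorityQueue<Event> futureEvents =
                new PriorityQueue<>((a, b) -> Double.compare(a.time(), b.time()));

        static void delayEventOriented(double now, double delay, Runnable continuation) {
            // the continuation resumes the simulated control flow after the delay
            futureEvents.add(new Event(now + delay, continuation));
        }
    }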


Branches

The performance influence of Branches can be seen in Figure 6.8. As before, we fit a linear regression model, yielding the following approximations for the simulation duration dependent upon the number of Branches i:

d_SimuCom(r_Branch(b, i), c) = 458 ms + 3.22 · i ms

d_EventSim(r_Branch(b, i), c) = −295 ms + 7.56 · i ms

The regression model indicates that a Branch in EventSim requires around twice the effort compared with SimuCom. On the one hand, this result is not surprising since Branches rank third among the performance factors with regard to EventSim (cf. Section 6.4.6). On the other hand, however, this finding is exceptional in view of the results presented earlier, where EventSim outperformed SimuCom by a substantial amount.

[Figure 6.8: Simulation duration comparison when increasing the number of Branches from 0 to 1000, showing the simulation runtime in ms for SimuCom and EventSim.]

In SimuCom, branches of the simulated control flow are mapped to branches of the "real" control flow of the simulator. Each branch transition is implemented by an if-statement, where the probability of entering an if-block equals the probability of the respective branch transition. EventSim, by contrast, executes a suitable traversal strategy (cf. Section 4.4.2.3) whenever it encounters a Branch. The implementation within this strategy, in turn, is virtually equivalent to the implementation in SimuCom. Still, the construction and invocation of the traversal strategy, as well as constructing and returning the corresponding traversal instruction (cf. Section 4.4.2.5), clearly implies a higher computational effort. This applies, in particular, in view of just-in-time (JIT) compilation, which potentially produces highly efficient machine code out of the if-statements in SimuCom.
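The sketch below contrasts the two approaches for a Branch with two equally likely transitions. The generated-code variant corresponds to the kind of if-statements emitted by SimuCom's code generation; the strategy variant only illustrates the indirection of an interpreter and is not EventSim's actual TraversalStrategy API.

    import java.util.Random;

    // Minimal sketch: generated if-statement vs. interpreted traversal strategy.
    public class BranchSketch {

        static final Random random = new Random();

        // Generative: the branch probabilities are baked into generated code.
        static void simulateBranchGenerated() {
            double u = random.nextDouble();
            if (u < 0.5) {
                // simulate the first branch transition's behaviour
            } else {
                // simulate the second branch transition's behaviour
            }
        }

        // Interpretative: a strategy object is looked up and invoked per action.
        interface TraversalStrategy<A> {
            Object traverse(A action);   // returns a traversal instruction (next action)
        }

        static Object simulateBranchInterpreted(Object branchAction,
                                                TraversalStrategy<Object> strategy) {
            // constructing, invoking and evaluating the strategy and its instruction
            // adds overhead compared to the inlined if-statement above
            return strategy.traverse(branchAction);
        }
    }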

As a result from the discussion, it can be said that the performance difference between the two simulators with regard to Branches is mainly caused by the overhead induced by an interpretative simulation approach (cf. Section 4.1.2.2) compared to its generative counterpart (cf. Section 4.1.2.1). In particular, no conclusions can be drawn relating to the respective simulation world-view.

6.5.2.2 Memory Consumption

The memory consumption in dependence upon the model complexity is shown in Figure 6.9. Regardless of the varied factor, the maximum amount of occupied heap space rises up to a model complexity of around 500 model elements and stays constant thereafter, or at least increases at a low rate. That means, beyond this point the memory consumption over simulation time is largely independent of the model complexity in terms of the number of existing model elements.


[Figure 6.9: Memory consumption when increasing the number of ForkedBehaviours, InternalActions, Delays or Branches from 0 to 1000. Panels (a) ForkedBehaviours, (b) InternalActions, (c) Delays and (d) Branches each show the mean and maximum used heap memory in MByte for SimuCom and EventSim.]

Thus, both simulators are considered to be scalable with regard to the consumption of heap space.

The mean memory consumption is somewhat balanced between SimuCom and EventSim. On the one hand, ForkedBehaviours cause a higher memory usage with EventSim since each forked behaviour is simulated by a dedicated ForkedRequest entity, which occupies space on the heap. Indeed, SimuCom even spawns a native thread for each forked behaviour, which requires memory for the associated stack. However, the stack of a native thread is separate from the Java heap.

On the other hand, Delays and Branches are more expensive with SimuCom. The reasons for this are not obvious and could be found using a profiler. Due to the negligible differences in memory consumption, however, no such analysis was performed.

Notice that the memory usage depicted in Figure 6.9 also includes an overhead induced by the experiment automation tool. Therefore, the results should rather be considered in relative terms.

6.6 Identifying Scalability Limits

Some previous works suggest that SimuCom lacks scalability. Meier [Mei10, p. 80] encountered an OutOfMemoryError when simulating a workload with a large number of concurrent users. The same issue was also reported by von Massow [vM10, p. 53]. In the course of this thesis, we could reproduce these problems and, in addition, reveal some further limiting factors.

This section deals with the systematic identification and quantification of these factors, not only for SimuCom, but also for EventSim. In doing this, we assume that scalability restrictions are caused mainly by those factors that account for a great proportion of the duration of a simulation run. In consequence, we stick to the performance-related factors identified before and assess whether and to what extent they limit the scalability of SimuCom and EventSim, respectively. Notice that we assess the limitation in concurrent system users or requests, respectively, by increasing the workload population in addition to the number of ForkedBehaviours since ForkedBehaviours suffer a quite restrictive limitation in scalability.

6.6.1 Indications of Scalability Limits

Before we present the experimental setting and the corresponding results, we discuss three potential signs of scalability issues. All of these indications have been observed with SimuCom.

6.6.1.1 Stack Overflow

It is common to keep track of the state of a thread by means of an associated stack which is composed of stack frames. A stack frame is pushed on the stack whenever the thread invokes a method, and the topmost stack frame is removed again as soon as the method returns. Each stack frame occupies a certain amount of memory in dependence upon the called method [LY].

When creating a thread, a specific fraction of the Java virtual machine's memory is reserved for the thread's stack and its stack frames, respectively. Exceeding this reserved memory raises a StackOverflowError, causing the program to terminate. Most commonly, such an error is caused by recursive method calls, each of which requires a frame on the stack; from a certain recursion depth on, the created frames occupy the whole stack and the next method call raises the error. For the sake of completeness, it has to be noted that a Java virtual machine might dynamically increase the size of a stack if required [LY]; however, the used JVM (cf. Table 6.1) seems to fix the stack size at the creation time of a thread.

Most commonly, StackOverflowErrors are caused by programming errors, such as infinite recursion, and as such can barely be considered a scalability issue. But when the presence of a stack overflow depends upon the complexity of the provided input, we can say that the overflow indicates a scalability limit.

The amount of memory allocated to each thread depends upon the implementation of the Java virtual machine and the operating system. For the system configuration described in Table 6.1, the default stack size is 1024 KB.
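The following toy program illustrates the input-dependent nature of such an overflow; it is unrelated to the code-generation templates discussed later and only shows that the recursion depth at which the StackOverflowError occurs depends on the stack size, which can be adjusted per thread, e.g. with the -Xss option of the java command.

    // Toy example: a StackOverflowError whose occurrence depends on the input size.
    public class StackDepthDemo {

        static int depth(int remaining) {
            if (remaining == 0) {
                return 0;
            }
            return 1 + depth(remaining - 1);   // one stack frame per nested call
        }

        public static void main(String[] args) {
            int n = Integer.parseInt(args[0]);
            try {
                System.out.println(depth(n));
            } catch (StackOverflowError e) {
                // with the default 1024 KB stack this is reached for large n;
                // a larger stack, e.g. via java -Xss8m, shifts the limit upwards
                System.out.println("stack overflow at input size " + n);
            }
        }
    }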

6.6.1.2 Running out of Memory

Numerous factors influence the memory requirements of a JVM, including the heap and the stacks associated with the threads created by the JVM. They all have in common that they use the shared address space of the process which executes the JVM. In dependence upon several factors, such as the processor architecture, the operating system and its configuration, restrictions are imposed on the size of the address space. For example, the 32-bit versions of Windows have a memory limit of 2 GB for each process; this limit can be relaxed to 3 GB under certain conditions. In contrast, using a 64-bit Windows version in conjunction with 64-bit processes, the upper limit is 7 to 8 TB. As soon as the process runs out of memory, the JVM throws an OutOfMemoryError.

Thus, especially when using a 32-bit JVM (which implies 32-bit processes, regardless of the underlying operating system), there are quite restrictive memory constraints. These limit the size of the heap as well as the number of threads in dependence upon the stack size. Beyond that, there is a trade-off between the heap size and the number of threads since both share the same address space.

As a result from this discussion, a program whose memory requirements rise in dependence upon the complexity of the provided input lacks appropriate scalability, which is eventually indicated by an OutOfMemoryError.
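As a rough, illustrative estimate based on the numbers above (and ignoring the JVM's own memory demand and other overheads): in a 32-bit process with a 2 GB address space and a 1 GB heap, the default stack size of 1024 KB leaves room for at most about

(2 GB − 1 GB) / 1024 KB ≈ 1,000 threads,

so the heap size and the achievable number of threads directly trade off against each other.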

6.6.1.3 Exceeding Java Class File Limits

Java classes are specified using binary files which adhere to the class file format of the Java virtual machine (JVM). Regardless of a specific JVM implementation, this format imposes a number of restrictions on the structure of Java classes. For example, the code of a method may not be larger than 65,536 bytes [LY]. When trying to compile a Java class that exceeds one of these limits, an error is raised. Specifically, this might occur when a program generates large Java classes and compiles them at runtime, which is the case with SimuCom.

If the size of the class files that are compiled at runtime depends upon the complexity of the provided input, this poses a limitation on the scalability.

6.6.2 Experimental Setting

For the purpose of finding scalability limits in both SimuCom and EventSim, we simulated a number of variations of the PCM base model presented in Section 6.4.4. As with the comparison in the previous section, we vary the performance-related factors within a certain range. By contrast, however, the upper bound is two orders of magnitude larger since each factor is varied in a range between 1 and 100,000. Only one factor is varied at a time.

The simulation runs were conducted on the test system described in Table 6.1 using the simulation configuration shown in Table 6.2.

6.6.3 Results and Discussion

The results of the scalability assessment can be seen in Table 6.5. On the test system, we could simulate base model variations containing up to approximately 820 ForkedBehaviours before SimuCom aborts with a StackOverflowError. Likewise, the number of InternalActions in SimuCom is limited to around 940 elements, and Delays cause a stack overflow when reaching the upper limit of around 1,560 elements in the base model. In each of these cases, the overflow is caused by the Xpand template engine whose task is to generate the simulation code – Java classes in this case – from the PCM model provided as input to SimuCom. Whether the high stack consumption is caused by inefficient code-generation templates or by the Xpand engine itself has not been assessed since this scalability limit cannot be attributed to the presence of the process-oriented world-view in SimuCom.

The number of Branches in SimuCom is limited to around 1,250 elements. When this limit is reached and SimuCom tries to compile the generated Java classes, the Java virtual machine throws an exception indicating that the maximum size of 64 KB per method has been exceeded. Again, this issue is associated with the code-generation facility in SimuCom. In particular, no conclusions can be drawn relating to the simulation world-views.


Factor                        SimuCom        EventSim
Number of ForkedBehaviours    < 820 (a)      > 100,000
Number of InternalActions     < 940 (a)      > 100,000
Number of Delays              < 1,560 (a)    > 100,000
Number of Branches            < 1,250 (c)    > 100,000
Workload Population           < 90,000 (b)   > 100,000

(a) raised a StackOverflowError
(b) raised an OutOfMemoryError
(c) exceeded the 64 KB method size limit

Table 6.5: Scalability limits of SimuCom and EventSim


The workload population in SimuCom is bound by an upper limit of 90,000 simulated users circulating through the system concurrently. Reaching this limit causes an OutOfMemoryError. At first glance, this number of simulated users seems to be more than enough for most application scenarios of the PCM. But when the usage or system complexity rises, each simulated user demands an increased amount of stack space, which decreases the observed limit. Even worse, using a 32-bit JVM usually imposes a 2 GB memory limit, which decreases the user limit further. The observed limit in concurrent system users can be directly attributed to the way in which the process-interaction world-view is implemented in SimuCom and, more generally, in the majority of process-oriented simulations in Java.

With EventSim, no scalability limitations were observed. Each factor was increased to 100,000 model elements (or simulated users in the case of the workload population) and simulated successfully. Hence, there is either no limitation in scalability or the limit is beyond the 100,000 bound.

Surprisingly, the limits observed with SimuCom are in contradiction to the experiments conducted previously, where each factor had been varied in a range between 1 and 1000 without running into scalability issues. This can be explained by the influence of the experiment automation tool and, specifically, by its evolution over the course of this thesis. As the experiment automation tool did not change between the various experiments associated with the scalability assessment, the imposed overhead is constant. Furthermore, one can argue that the experiment automation tool is a part of the test system and, as such, may also influence the results to a certain extent. Thus, the observed experiment automation overhead is not perceived to compromise the findings from the experiments. Rather, this discussion suggests that the results have to be considered in relative terms.

6.7 Conclusion

Both simulators meet our definition of a scalable system with regard to the duration of simulation runs and their memory consumption: when increasing the model complexity by adding further actions to the simulated control flow, the simulation duration increases linearly; the memory consumption rises up to a certain amount and even stays constant thereafter.

When increasing the number of ForkedBehaviours, however, the thread count rises proportionally to the number of ForkedBehaviours. This poses a scalability limit since threads are limited in availability. We could simulate a variation of the PCM base model on our test system with around 90,000 ForkedBehaviours before running out of memory. This value seems large, but when adding further actions to the ForkedBehaviours, the memory demand of the corresponding threads increases, which is why we expect the number to be much smaller in practice. Moreover, using a 32-bit JVM would further decrease the upper limit. In addition to the limited number of ForkedBehaviours, SimuCom suffers further scalability issues. These are, however, not caused in the simulation phase, but in the preceding code generation phase.

Although SimuCom and EventSim are both regarded as scalable (neglecting the aforementioned scalability issues for the moment), their performance differs to a great extent. The largest difference in simulation duration between SimuCom and EventSim can be observed when increasing the number of ForkedBehaviours. Each additional ForkedBehaviour accounts for 240 ms in SimuCom compared to merely 5 ms in EventSim. The second largest difference arises when increasing the number of Delays or InternalActions. Additional Delays extend the simulation duration by 31 ms each, compared to around 2 ms with EventSim. Each further InternalAction extends the simulation duration by 44 ms, compared to 9 ms with EventSim. By contrast, SimuCom is around twice as fast as EventSim when simulating a Branch.
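These per-element costs can be read as the slopes of a simple linear model of the simulation duration; as a hedged sketch based only on the slopes quoted above (the intercepts $d_0$ and $d_0'$ and any interaction effects are not modelled here):

$d_{\text{SimuCom}} \approx d_0 + 240\,\text{ms} \cdot n_{\text{Fork}} + 44\,\text{ms} \cdot n_{\text{Internal}} + 31\,\text{ms} \cdot n_{\text{Delay}}$
$d_{\text{EventSim}} \approx d_0' + 5\,\text{ms} \cdot n_{\text{Fork}} + 9\,\text{ms} \cdot n_{\text{Internal}} + 2\,\text{ms} \cdot n_{\text{Delay}}$

where $n_{X}$ denotes the number of model elements of the respective type.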


7. Summary and Conclusion

Two steps have been conducted towards our goal of comparing a process-oriented simulator to an event-oriented one in terms of performance and scalability. First, we developed the event-oriented simulator EventSim, which is capable of simulating PCM models. Second, we compared EventSim to the existing process-based PCM simulator SimuCom. Furthermore, we conducted a validation which suggests that the two simulators can be regarded as semantically equivalent.

The simulation in EventSim is driven by events. Each event simulates a portion of the behaviour of either the system or of a system user until an advance in simulation time is needed. Then, the event first schedules the next event to occur at the advanced simulation time and terminates thereafter. The scheduled event occurs at the advanced simulation time and continues the simulation of the system or usage behaviour. For the simulation of behaviours, the events make use of an interpreter. Beginning at the start action of a behaviour, the interpreter passes along the chain of actions until the behaviour's stop action is reached. On its path through the action chain, the interpreter executes the simulation logic associated with each action encountered. The observable system behaviour arises from the sequence of events, each contributing to the overall system behaviour.
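To make this mechanism concrete, the following minimal Java sketch reproduces the principle under simplifying assumptions: all class and method names are invented for this example and do not reflect EventSim's actual code, and a behaviour is reduced to a plain sequence of time advances.

import java.util.PriorityQueue;

// Minimal sketch of an event-driven behaviour simulation (illustrative only).
public class EventDrivenSketch {

    // An event occurs at a fixed simulation time and carries simulation logic.
    abstract static class Event implements Comparable<Event> {
        final double time;
        Event(double time) { this.time = time; }
        abstract void occur(Simulation sim);
        public int compareTo(Event other) { return Double.compare(time, other.time); }
    }

    // A future event list: events are processed in order of their occurrence time.
    static class Simulation {
        private final PriorityQueue<Event> futureEvents = new PriorityQueue<>();
        double now;

        void schedule(Event e) { futureEvents.add(e); }

        void run() {
            while (!futureEvents.isEmpty()) {
                Event e = futureEvents.poll();
                now = e.time;   // advance the simulation clock
                e.occur(this);  // execute the event's simulation logic
            }
        }
    }

    // Simulates a behaviour (here: a chain of delays) action by action. Whenever a
    // time advance is needed, the continuation is scheduled as a new event and the
    // current event terminates.
    static class BehaviourEvent extends Event {
        private final double[] delays;
        private final int nextAction;

        BehaviourEvent(double time, double[] delays, int nextAction) {
            super(time);
            this.delays = delays;
            this.nextAction = nextAction;
        }

        void occur(Simulation sim) {
            if (nextAction == delays.length) {
                System.out.printf("behaviour finished at t=%.1f%n", sim.now);
                return;
            }
            sim.schedule(new BehaviourEvent(sim.now + delays[nextAction], delays, nextAction + 1));
        }
    }

    public static void main(String[] args) {
        Simulation sim = new Simulation();
        sim.schedule(new BehaviourEvent(0.0, new double[] {1.5, 2.0, 0.5}, 0));
        sim.run();  // prints: behaviour finished at t=4.0
    }
}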

After finishing the development of EventSim, we compared the simulator to SimuCom. For this purpose, we assessed the influence of certain factors on the performance and scalability of the respective simulator. By factors, we mean the degrees of freedom provided by the Palladio Component Model (PCM) to the modeller of a component-based software system. In order to cope with the huge number of degrees of freedom, we excluded factors with little or no significant performance influence from the comparison.

As a result, we assessed the influence of the PCM modelling elements ForkedBehaviour, InternalAction, Delay and Branch on the duration of simulation runs, on their memory consumption and on the number of threads required for the simulation. For this, we started with a PCM model containing one action of each type: one ForkedBehaviour, one InternalAction and so on. Then, we increased the number of ForkedBehaviours in several steps while leaving the rest of the model unchanged. The PCM model resulting from each step was simulated using both simulators under comparison while observing the simulation duration as well as the demand for memory and threads over time. In the same manner, we assessed the influence of InternalActions, Delays and Branches.
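A hedged sketch of such a scaling measurement is given below; the Simulator interface, the dummy cost model and the coarse JVM-level metrics are assumptions made for the example only and do not correspond to the actual experiment automation tool.

import java.util.List;

// Minimal sketch of the scaling measurements (illustrative only).
public class ScalingMeasurementSketch {

    // Stand-in for a simulator under comparison; the real simulators take PCM models.
    interface Simulator {
        void simulate(int elementCount);
    }

    static void measure(String name, Simulator simulator, List<Integer> steps) {
        Runtime rt = Runtime.getRuntime();
        for (int n : steps) {
            long before = System.nanoTime();
            simulator.simulate(n);                                   // run the simulation for this step
            long durationMs = (System.nanoTime() - before) / 1_000_000;
            long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
            int threads = Thread.activeCount();                      // coarse thread count of the JVM
            System.out.printf("%s n=%d: %d ms, %d MB, %d threads%n",
                    name, n, durationMs, usedMb, threads);
        }
    }

    public static void main(String[] args) {
        // Dummy simulator that merely burns time proportionally to the element count.
        Simulator dummy = n -> {
            long until = System.nanoTime() + n * 1_000_000L;
            while (System.nanoTime() < until) { /* placeholder for real simulation work */ }
        };
        measure("dummy", dummy, List.of(1, 10, 100, 1000));
    }
}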

The results showed a considerable influence of the respective simulation world-view on the performance of the simulator. The greatest difference in simulation duration can be observed with ForkedBehaviours. Simulating a ForkedBehaviour in SimuCom takes approximately 50 times longer than with EventSim. We attribute this difference to the creation overhead of threads, since each ForkedBehaviour in SimuCom is simulated by a dedicated thread. A similar observation can be made for the simulation of InternalActions and Delays. Both actions lead to an advance in simulation time when being simulated. For this, the active thread in SimuCom suspends execution until the simulation time has been advanced accordingly. The suspension leads to another thread being resumed, which causes an overhead due to context switches. Moreover, multiple threads need to be synchronised since only one thread may be active at a time, which causes further overhead. As a result, SimuCom is around 4 times slower when simulating an InternalAction and even 22 times slower when a Delay is being simulated. By contrast, when none of these overheads are present, SimuCom performs faster than EventSim. The simulation of a Branch, for instance, is twice as fast in SimuCom as in EventSim.
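The following minimal Java sketch illustrates the thread-per-process mapping and the handover that every suspension entails. The class and method names are illustrative assumptions and do not mirror SimuCom's actual implementation; the point is merely that each hold() blocks one native thread and resumes another, so every simulated time advance pays for semaphore operations and a context switch.

import java.util.concurrent.Semaphore;

// Minimal sketch of mapping simulated processes to native threads (illustrative only).
public class ProcessHandoverSketch {

    static class SimProcess {
        private final Semaphore resume = new Semaphore(0);
        private final Semaphore yielded = new Semaphore(0);

        SimProcess(Runnable lifecycle) {
            Thread thread = new Thread(() -> {
                resume.acquireUninterruptibly();   // wait for the first activation
                lifecycle.run();                   // run the simulated process logic
                yielded.release();                 // hand control back one last time
            });
            thread.setDaemon(true);
            thread.start();
        }

        // Called from within the simulated process: suspend until reactivated.
        void hold() {
            yielded.release();                     // give control back to the scheduler
            resume.acquireUninterruptibly();       // block until the scheduler resumes us
        }

        // Called by the scheduler: resume the process and wait until it yields again.
        void activate() {
            resume.release();
            yielded.acquireUninterruptibly();
        }
    }

    public static void main(String[] args) {
        SimProcess[] processes = new SimProcess[2];
        for (int i = 0; i < processes.length; i++) {
            final int id = i;
            processes[i] = new SimProcess(() -> {
                System.out.println("process " + id + ": before time advance");
                processes[id].hold();              // a simulated time advance suspends the thread
                System.out.println("process " + id + ": after time advance");
            });
        }
        // Toy scheduler: only one process thread is ever active at a time; each
        // activation costs two semaphore operations and a thread context switch.
        for (int round = 0; round < 2; round++) {
            for (SimProcess p : processes) {
                p.activate();
            }
        }
    }
}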

Although the results show a correlation between the presence of process-orientation in SimuCom and its inferior performance, it has to be noted that no definitive statements can be made without breaking down the observed simulation duration into more fine-grained influencing factors, such as the processing time associated with the call to a specific method. Such an analysis could be conducted with a profiling tool. However, as we discovered, this is a highly complex task. In particular, it is very hard to find an appropriate trade-off between high measurement accuracy and the bias imposed by the measurement overhead.

Although there is no doubt that our measurements are influenced not only by the respective world-view but also by uncontrolled factors, the results can still provide a clue to the approximate loss in performance or scalability when deciding in favour of a process-oriented simulation. In view of the fact that the performance difference between the world-views arises essentially from the common practice of mapping simulated processes to native threads, it is desirable to find an alternative to threads in process-oriented simulation modelling. With respect to Java, native support for coroutines in the JVM could resolve the observed performance issues.

7.1 Benefits

The benefits of this thesis are as follows:

• Our simulator accelerates the simulation of PCM models compared to SimuCom. In particular, when simulating a PCM model which contains a large number of Forks, EventSim provides a considerable speedup.

• Our simulator resolves scalability issues present in SimuCom. Especially when a PCM model gives rise to a large number of concurrent system users, EventSim can be used where SimuCom runs out of memory.

• Our simulator could easily be extended to support runtime reconfiguration scenarios such as those addressed by SLAstic.SIM (cf. Section 2.1.2).

• The ranking of PCM modelling elements based on their influence on the simulation performance might lead to a more conscious way of modelling systems with the PCM. For instance, Forks can be avoided when they are not absolutely necessary.

• The results gained from the simulator comparison can be used to trade off the gain in “modelling ease” against performance, as motivated at the beginning of this thesis.


7.2 Future Work

To conclude this thesis, we provide a brief overview of open issues giving rise to future work.

• In Section 4.4.2.1, we mentioned the lack of a common supertype for the two types of actions, AbstractActions and AbstractUserActions. This complicates the behaviour interpreter, as we had to resort to Java generics instead (see the sketch after this list). The issue could be resolved by augmenting the PCM meta-model with a generic Action interface.

• As illustrated in Table 4.1 and Table 4.2, EventSim does not support the whole expressiveness of PCM models. In view of the benefits presented above, it seems desirable to implement the missing features.

• The simulator comparison could be extended to further factors in order to refine thefindings from the comparison.

• Finally, the assumptions relating to the ranking of performance factors could bereviewed in order to increase the credibility of the comparison results.
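As an illustration of the first point above, the following hedged Java sketch shows the kind of generics-based interpreter the missing supertype forces; the nested class names merely stand in for the two unrelated PCM action hierarchies, and the actual EventSim interpreter is more elaborate.

import java.util.function.Function;

// Illustrative sketch only: why the interpreter must be generic over the action type.
public class GenericInterpreterSketch {

    // Stand-ins for the two PCM action hierarchies that lack a common supertype.
    static class AbstractAction { AbstractAction successor; }
    static class AbstractUserAction { AbstractUserAction successor; }

    // Because there is no shared Action supertype with a successor operation, the
    // interpreter has to be parametrised over the concrete action type and needs
    // the successor relation passed in explicitly.
    static class BehaviourInterpreter<A> {
        private final Function<A, A> successor;

        BehaviourInterpreter(Function<A, A> successor) {
            this.successor = successor;
        }

        void interpret(A startAction) {
            for (A current = startAction; current != null; current = successor.apply(current)) {
                // Execute the simulation logic associated with the current action here.
                System.out.println("interpreting " + current.getClass().getSimpleName());
            }
        }
    }

    public static void main(String[] args) {
        // Build a tiny chain of user actions and walk it with the generic interpreter.
        AbstractUserAction start = new AbstractUserAction();
        start.successor = new AbstractUserAction();
        new BehaviourInterpreter<AbstractUserAction>(a -> a.successor).interpret(start);
        // With a common Action supertype in the meta-model, a single non-generic
        // interpreter could handle both AbstractActions and AbstractUserActions.
    }
}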

