University of Groningen
Understanding and Supporting Software Architectural Decisions
Tofan, Dan
IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below.
Document Version: Publisher's PDF, also known as Version of Record
Publication date: 2015
Link to publication in University of Groningen/UMCG research database
Citation for published version (APA): Tofan, D. (2015). Understanding and Supporting Software Architectural Decisions: For Reducing Architectural Knowledge Vaporization. University of Groningen.
Copyright
Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons).
Take-down policy
If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.
Downloaded from the University of Groningen/UMCG research database (Pure): http://www.rug.nl/research/portal. For technical reasons the number of authors shown on this cover page is limited to 10 maximum.
Download date: 27-05-2021
Understanding and Supporting Software Architectural Decisions
For Reducing Architectural Knowledge Vaporization
PhD thesis
to obtain the degree of PhD at the University of Groningen on the authority of the
Rector Magnificus Prof. E. Sterken and in accordance with
the decision by the College of Deans.
This thesis will be defended in public on
Friday 20 November 2015 at 12.45
by
Dan Constantin Tofan
born on 17 May 1981 in Iași, Romania
Supervisor: Prof. P. Avgeriou
Co-supervisor: Dr. M. Galster
Assessment committee:
Prof. P. Lago
Prof. I. Crnkovic
Prof. M. Aiello
Summary
The architecture of a software system is determined by its architectural decisions. These decisions address topics such as frameworks, patterns, and programming languages, or ways to decompose the system. The decisions and their rationales are a significant part of the architectural knowledge about a software system. Such knowledge can be lost: an architect may forget the reasons behind a decision, take a different job, or postpone documenting decisions. The vaporization of architectural knowledge has serious consequences. The initially intended architectural ideas may no longer be followed, so that extensions become expensive and it becomes difficult to keep decisions consistent with each other. The main goal of this research is to reduce this loss of architectural knowledge, by better documenting architectural decisions and their rationales. The contribution of this research consists of three phases: understanding the current state of research and practice, exploring new ideas, and proposing a concrete approach to counter the loss of architectural knowledge. As the third contribution, we built a user-friendly, open-source tool for making and documenting decisions as a group or as an individual. The contributions of this thesis help practitioners lose less architectural knowledge and help researchers better understand the nature of architectural knowledge.
ISBN: 978-90-367-8270-8 (printed version)
ISBN: 978-90-367-8269-2 (electronic version)
Keywords: software architecture, architectural decisions, knowledge vaporization
The research presented in this thesis was performed in the Software Engineering and Architecture group, at the Johann Bernoulli Institute for Mathematics and Computer Science at the University of Groningen.
The cover image is licensed from Getty Images.
Abstract
The architecture of a software system is the result of architectural decisions on
various topics, such as frameworks, patterns, programming languages, or ways
to decompose the software system. Such decisions and their rationales are a
significant part of the architectural knowledge about a software system.
Architectural knowledge about a software system tends to vaporize. For
example, architects might forget the rationales of decisions, change jobs, or
indefinitely postpone documenting decisions to avoid disrupting their design
flow. Architectural knowledge vaporization has major practical consequences,
such as drifting away from the initially intended architecture, and expensive
evolution, due to the substantial effort needed to understand previous decisions
and to avoid conflicts with them.
The overall research problem addressed in this thesis is how to reduce
architectural knowledge vaporization. The overall solution is to document
architectural decisions and their rationales.
The contributions of this thesis toward solving this problem can be grouped into three
phases: understanding the state of practice and research, exploring new ideas,
and proposing concrete approaches to reduce architectural knowledge
vaporization.
In the first phase (understanding), we investigated the state of practice in
which architectural knowledge vaporization occurs, and the state of research
that can help reduce architectural knowledge vaporization. To understand the
state of practice, we conducted two surveys with practitioners. The first survey
helps researchers understand the challenges of managing architectural
knowledge in practice, and potential solutions to these challenges. The results
of the first survey indicate that architectural knowledge vaporization is a major
challenge in industry, and that tool support is a potential solution. The
second survey describes real-world architectural decisions: their
characteristics, their difficulties, and the differences between good and bad
decisions. For example, we found that most architectural decisions are
group decisions.
To understand the state of research, we conducted a systematic mapping study
on the last decade of research on architectural decisions. This study helped us
understand existing work on reducing architectural knowledge vaporization
and future promising research directions. For example, we identified a lack of
research on group architectural decisions, despite the fact that most
architectural decisions are group decisions. Furthermore, we identified very
few open-source tools for architectural decisions.
In the second phase (exploring), we investigated using established approaches
from the knowledge engineering field for reducing architectural knowledge
vaporization. In particular, we conducted two surveys with students on using
the Repertory Grid technique for documenting architectural decisions, to
identify advantages and disadvantages of the technique. We found that the
main advantages are reduced architectural knowledge vaporization and
support for reasoning. The main disadvantages are the effort required and the
lack of user-friendly tool support.
In the third phase (proposing), we made three contributions.
First, we contributed an approach based on the Repertory Grid technique for
making and documenting individual architectural decisions. We conducted a survey
with practitioners to identify advantages, disadvantages, and improvement
opportunities of the approach. Advantages include reduction of architectural
knowledge vaporization, and decision making support. Disadvantages include
effort and insufficient tool support. Improvement opportunities include support
for prioritizing concerns and for group decision making. To improve the
approach, we conducted a controlled experiment with students to compare two
concern prioritization methods, and then added the more suitable method
to the approach.
Second, we contributed an extension of the approach for making and
documenting group architectural decisions. We conducted a case study to identify
benefits and potential improvements of the approach. Benefits include
reduction of architectural knowledge vaporization, and increased consensus of
the group. Furthermore, we conducted a controlled experiment with students to
compare the approach against ad hoc group decision making. Experiment
results indicate that the proposed approach reduces architectural knowledge
vaporization and increases consensus.
Third, we contributed user-friendly, open-source tool support for the two
approaches for making and documenting individual and group architectural
decisions.
Overall, the contributions of this thesis help practitioners reduce architectural
knowledge vaporization. Furthermore, the contributions of this thesis help
researchers understand various aspects of architectural decisions and
architectural knowledge, so that researchers can propose approaches that
satisfy the needs of practitioners.
Contents
Chapter 1
Introduction 1
1.1 Context ..........................................................................................1
1.2 Problem statement ..........................................................................2
1.2.1 Research framework.............................................................4
1.3 Research questions .........................................................................5
1.3.1 Iteration A - Understand .......................................................5
1.3.2 Iteration B - Explore.............................................................9
1.3.3 Iteration C - Propose .......................................................... 10
1.4 Research methods ......................................................................... 11
1.5 Publications overview ................................................................... 13
Chapter 2
Architectural Knowledge Management in Practice 15
2.1 Introduction ................................................................................. 16
2.2 Related Work ............................................................................... 17
2.3 Research Method .......................................................................... 18
2.3.1 Data Collection and Analysis Procedures ............................. 19
2.3.2 Organizations..................................................................... 20
2.3.3 Validity Threats ................................................................. 20
2.4 Challenges ................................................................................... 21
2.4.1 Challenges in the public sector ............................................ 22
2.4.2 Challenges in the private sector ........................................... 24
2.5 Solutions ...................................................................................... 26
2.6 Discussion.................................................................................... 29
2.7 Conclusions ................................................................................. 31
Chapter 3
Architectural Decisions in Practice 33
3.1 Introduction ................................................................................. 34
3.2 Related Work ............................................................................... 35
3.3 Survey Design .............................................................................. 37
3.3.1 Questionnaire Development and Evaluation ......................... 37
3.3.2 Data Collection .................................................................. 39
3.4 Results Analysis ........................................................................... 40
3.4.1 Participants Background ..................................................... 40
3.4.2 RQ1 - Characteristics of Architectural Decisions.................. 41
3.4.3 RQ2 - Difficulty of Decisions ............................................. 43
3.4.4 RQ3 - Differences between Junior and Senior Architects ...... 46
3.4.5 RQ4 - Differences between Good and Bad Decisions ........... 49
3.5 Discussion.................................................................................... 51
3.5.1 Validity Threats ................................................................. 53
3.6 Conclusions ................................................................................. 53
Chapter 4
State of Research on Architectural Decisions 55
4.1 Introduction ................................................................................. 57
4.2 Research Methodology ................................................................. 58
4.2.1 Research Directives............................................................ 61
4.2.1.1 Protocol Definition 61
4.2.1.2 Generic Decision Literature Survey 62
4.2.1.3 Research Questions Definition 64
Research Questions Derived from Software Architecture Literature 64
Research Questions Derived from Generic Decision Literature 65
4.3 Data Collection............................................................................. 67
4.3.1 Source Selection and Search String ..................................... 67
4.3.2 Inclusion and Exclusion Criteria.......................................... 69
4.3.3 Search Process ................................................................... 70
4.4 Results ......................................................................................... 73
4.4.1 Overview of Selected Papers............................................... 73
4.4.1.1. Empirical Evaluation Approaches 73
4.4.1.2. Publication Venues and Years 75
4.4.2 RQ1 – Documenting Architectural Decisions ....................... 77
4.4.3 RQ2 – Functional Requirements and Quality Attributes ........ 79
4.4.4 RQ3 – Domain-specific Architectural Decisions .................. 80
4.4.5 RQ4 – Descriptive and Normative Papers ............................ 82
4.4.6 RQ5 - Addressing Uncertainty in Architectural Decisions ..... 84
4.4.7 RQ6 - Group Architectural Decisions .................................. 85
4.5 Discussion.................................................................................... 86
4.5.1 Analysis and Synthesis of Results ....................................... 86
4.5.1.1. Empirical Evaluation Approaches, Publication Venues and Years 86
4.5.1.2. Documenting Architectural Decisions 89
4.5.1.3. Functional Requirements and Quality Attributes 90
4.5.1.4. Domain-specific Architectural Decisions 91
4.5.1.5. Descriptive and Normative Papers 91
4.5.1.6. Uncertainty in Architectural Decisions 96
4.5.1.7. Group Architectural Decisions 97
4.5.2 Implications for Researchers and Practitioners ................... 100
4.6 Validity threats ........................................................................... 102
4.6.1 Conclusion Validity ......................................................... 102
4.6.2 Construct Validity ............................................................ 103
4.6.3 Internal Validity ............................................................... 103
4.6.4 External Validity .............................................................. 104
4.7 Conclusions ............................................................................... 104
Chapter 5
Reducing Vaporization with the Repertory Grid Technique 107
5.1 Introduction ............................................................................... 109
5.2 The Repertory Grid Technique .................................................... 110
5.3 Exploratory Study....................................................................... 111
5.3.1 Study Design ................................................................... 111
5.3.2 Study Results ................................................................... 113
5.3.3 Advantages ...................................................................... 116
5.3.4 Disadvantages.................................................................. 117
5.3.5 Application Context ......................................................... 117
5.3.6 Validity Threats ............................................................... 117
5.4 Survey study .............................................................................. 118
5.4.1 Conceptual Model to Capture Architectural Knowledge Using
the Repertory Grid Technique ........................................... 119
5.4.2 Repertory Grid Technique for Capturing Architectural
Knowledge ...................................................................... 121
5.4.3 Study Definition and Design ............................................. 122
5.4.4 Survey Implementation..................................................... 125
5.4.5 Survey Execution ............................................................. 127
5.4.6 Analysis of Survey Results ............................................... 128
5.4.6.1. Collecting Metrics for a Decision 128
5.4.6.2. Analyzing Metrics for All Decisions 130
5.4.6.3. Post Questionnaires 132
5.4.7 Discussion ....................................................................... 134
5.4.8 Validity Threats ............................................................... 134
5.5 Conclusions ............................................................................... 136
Chapter 6
Improve Individual Architectural Decisions 137
6.1 Introduction ............................................................................... 138
6.2 Phase 1 – Initial REGAIN Approach............................................ 139
6.2.1 Theoretical Foundations for REGAIN ............................... 139
6.2.2 The REGAIN Approach ................................................... 141
6.2.3 REGAIN Output .............................................................. 143
6.2.4 Initial REGAIN Evaluations ............................................. 144
6.3 Phase 2 – Investigate Industrial Applicability of REGAIN............. 145
6.3.1 Research Method, Data Collection and Analysis ................ 145
6.3.2 RQ1 – REGAIN Advantages and Disadvantages ................ 147
6.3.2.1. Post-questionnaire Analysis 147
6.3.2.2. Transcripts Content Analysis 149
6.3.3 RQ2 – REGAIN Improvement Opportunities ..................... 151
6.3.4 Discussion ....................................................................... 153
6.4 Phase 3 - Investigate Prioritization Approaches ............................ 155
6.4.1 Participants ...................................................................... 157
6.4.2 Experimental Materials and Tasks ..................................... 158
6.4.3 Hypotheses and Variables ................................................. 160
6.4.3.1. Performance 160
6.4.3.2. Users’ Perceptions 161
6.4.3.3. REGAIN Output 161
6.4.3.4. Summary 162
6.4.4 Experiment Design and Results ......................................... 163
6.4.4.1. Results on Performance 164
6.4.4.2. Results on Users’ Perceptions 166
6.4.4.3. Results on REGAIN Output 167
6.4.5 Discussion ....................................................................... 168
6.5 Validity Threats.......................................................................... 171
6.5.1 Interview Study Validity Threats....................................... 171
6.5.2 Experiment Validity Threats ............................................. 172
6.6 Related Work ............................................................................. 174
6.7 Conclusions ............................................................................... 177
6.8 Acknowledgments ...................................................................... 177
Chapter 7
Improve Group Architectural Decisions 179
7.1 Introduction ............................................................................... 181
7.2 The GADGET Process ................................................................ 182
7.3 GADGET Case Study ................................................................. 185
7.3.1 Case Study Design ........................................................... 185
7.3.2 Results ............................................................................ 188
7.3.2.1. Case Study Participants and Execution 188
7.3.2.2. Analysis Results 189
RQ1 - Need for consensus in group architectural decision making 189
RQ2 - Effort and benefits 190
RQ3 – Improvements 191
7.3.3 Discussion ....................................................................... 192
7.3.3.1. Recommendations for Practitioners 192
7.3.4 Implications for Research ................................................. 193
7.4 GADGET Experiment................................................................. 193
7.4.1 Research Goal and Questions ............................................ 194
7.4.2 Participants ...................................................................... 195
7.4.3 Experimental Materials and Tasks ..................................... 197
7.4.4 Hypotheses for RQ1 – Consensus...................................... 200
7.4.4.1. Hypothesis on General Agreement 200
7.4.4.2. Hypothesis on Mutual Understanding on the Priorities of Concerns 200
7.4.4.3. Hypothesis on Mutual Understanding on Ratings 201
7.4.5 Hypotheses for RQ2 - Perceptions ..................................... 202
7.4.6 Results ............................................................................ 204
7.4.6.1. Analysis Procedure 204
7.4.6.2. Participants’ Background 205
7.4.6.3. Answer to RQ1 - Consensus 206
7.4.6.4. Answer to RQ2 - Perceptions 207
7.4.7 Discussion ....................................................................... 208
7.4.7.1. Interpretation of Results 209
7.4.7.2. Cross-study Discussion 210
7.4.7.3. Limitations of GADGET 211
7.5 Validity Threats.......................................................................... 211
7.5.1 Case Study Validity Threats.............................................. 211
7.5.2 Experiment Validity Threats ............................................. 212
7.6 Related Work ............................................................................. 213
7.7 Conclusions ............................................................................... 214
7.8 Acknowledgments ...................................................................... 215
Chapter 8
Tool Support for REGAIN and GADGET 217
8.1 Introduction ............................................................................... 218
8.2 Motivation for a new tool ............................................................ 218
8.3 Features ..................................................................................... 219
8.3.1 REGAIN Support............................................................. 220
8.3.1.1. Concerns Prioritization 223
8.3.2 GADGET Support............................................................ 224
8.4 Tool Development and Deployment ............................................. 226
8.5 Conclusions ............................................................................... 227
8.6 Acknowledgments ...................................................................... 227
Chapter 9
Conclusions 229
9.1 Answers to Research Questions ................................................... 229
9.1.1 RQ1. How is architectural knowledge managed in practice? 229
9.1.2 RQ2. How are architectural decisions made in practice? ..... 230
9.1.3 RQ3. What is the state of research on architectural decisions? 231
9.1.4 RQ4. Can the Repertory Grid technique reduce architectural
knowledge vaporization? .................................................. 232
9.1.5 RQ5. How to support making and documenting individual
architectural decisions?..................................................... 233
9.1.6 RQ6. How to support making and documenting group
architectural decisions?..................................................... 233
9.1.7 RQ7. What tool can support REGAIN and GADGET? ....... 235
9.2 Discussion.................................................................................. 235
9.3 Contributions ............................................................................. 236
9.4 Future Work ............................................................................... 237
Appendices 241
10.1 Appendix for Chapter 3 ........................................................ 241
10.1.1 Questionnaire for Survey .................................................. 241
10.2 Appendix for Chapter 4 ........................................................ 251
10.2.1 Selected Papers ............................................................... 251
10.2.2 Publication Venues .......................................................... 259
10.3 Appendix for Chapter 6 ........................................................ 261
10.3.1 Phase 2 – Additional Details ............................................. 261
10.3.2 Phase 3 – Additional Details ............................................. 262
References 263
Acknowledgments 277
About the Author 279
Chapter 1
Introduction
1.1 Context
The ISO 42010 standard defines software architecture as the ‘fundamental
concepts or properties of a system in its environment embodied in its elements,
their relationships, and in the principles of its design and evolution’
(ISO/IEC/IEEE, 2011). The software architecture of a system is an abstraction,
so the description or documentation of software architecture plays an
important practical role. Elements of architecture documentation include
descriptions of stakeholders (e.g. managers, business analysts, developers),
and stakeholders’ concerns (e.g. performance, security), architectural decisions
to satisfy the concerns, the rationales of the architectural decisions, and views
that contain models (ISO/IEC/IEEE, 2011).
The software architecture discipline has evolved since its inception in the late
1960s. This evolution is marked by the following milestone ideas that have
influenced decades of research and practice.
1. In the late 1960s and early 1970s, Dijkstra and Parnas proposed the concept
of software architecture, emphasizing the importance of the structure
of software systems.
2. In the mid-1990s, Shaw and Garlan emphasized components and
connectors as fundamental software architecture concepts (Shaw and
Garlan, 1996).
3. In the mid-2000s, an influential paper (Bosch, 2004) pushed for the
perspective of software architecture as a result of architectural
decisions, building on earlier ideas (Perry and Wolf, 1992): that
rationales of decisions are part of software architecture and that
architectural styles encapsulate decisions about architectural elements.
This thesis builds on the perspective of software architecture as a result of
architectural decisions.
Architectural decisions are a subset of design decisions that have a system-
wide impact, involve trade-offs, are often constraining, are costly to change
and hard to make (Zimmermann, 2011). Architectural decisions involve
important choices on core components or connectors, and the overall software-
intensive system, to satisfy and balance stakeholders’ concerns (Zimmermann,
2011). Architectural decisions have a key influence on the structure, behavior
and quality attributes of software systems (Zimmermann, 2011). When making
decisions, architects need to balance stakeholders’ concerns, such as quality
attributes, business goals and functional requirements.
Examples of architectural decisions include selecting a development
framework (e.g. J2EE, .NET), architectural patterns (e.g. client-server, layers),
middleware for a distributed software system (e.g. an enterprise service bus),
programming languages (e.g. PHP, Java), operating systems (e.g. Windows,
Android, Linux), database systems (e.g. MySQL, Oracle), or how to
decompose a system into modules.
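To make the notion concrete, such a decision can be framed as rating each alternative against the stakeholders' concerns it must satisfy. The sketch below is an illustration only, not taken from the thesis; the concerns, alternatives, and ratings are hypothetical:

```python
# Hypothetical illustration: comparing database alternatives against
# stakeholders' concerns. Ratings use a 1-5 scale (5 = satisfies best).
concerns = ["performance", "licensing cost", "team familiarity"]
alternatives = ["MySQL", "Oracle"]
ratings = {
    ("performance", "MySQL"): 4, ("performance", "Oracle"): 5,
    ("licensing cost", "MySQL"): 5, ("licensing cost", "Oracle"): 2,
    ("team familiarity", "MySQL"): 4, ("team familiarity", "Oracle"): 3,
}

def total_score(alternative):
    """Sum one alternative's ratings across all concerns."""
    return sum(ratings[(concern, alternative)] for concern in concerns)

scores = {a: total_score(a) for a in alternatives}
best = max(scores, key=scores.get)
print(scores)  # {'MySQL': 13, 'Oracle': 10}
print(best)    # MySQL
```

Summing ratings per alternative gives only a rough ranking; the more valuable part is that the recorded ratings themselves document why an alternative was chosen.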
1.2 Problem statement
Architectural decisions and their rationales are a critical part of the
architectural knowledge of a system (Babar et al., 2009; Kruchten et al., 2006).
In practice, such knowledge is often lost, or vaporized (Bosch, 2004; Jansen
and Bosch, 2005).
The overall problem addressed in this thesis is the following:
How can architectural knowledge vaporization be reduced?
I focus on this overall problem because of its strong practical consequences:
- Expensive maintenance and evolution, since implementing new
features in a software system requires understanding previously
made architectural decisions. Architects and developers need to spend
substantial effort to understand previous decisions and to avoid
introducing conflicts with them (Bosch, 2004; Jansen et al., 2007).
- Architectural drift refers to the implementation drifting away from
the initially intended architecture (Perry and Wolf, 1992), due to the
lack of relevant and up-to-date architectural knowledge.
- Insufficient stakeholder communication by not sharing architectural
knowledge (especially decisions), rationales and how the decisions
satisfy stakeholders’ concerns (Jansen et al., 2007).
- Low reusability of previous decisions, through unawareness of reuse
opportunities (Jansen et al., 2007).
- Poor traceability between architecture, implementation, and
requirements (Harrison et al., 2007).
The following reasons contribute to architectural knowledge vaporization:
- Unawareness, which refers to architects not realizing or not reflecting
on their decision making (Harrison et al., 2007).
- Lack of training, which refers to architects not knowing why and how
to document decisions (Jansen et al., 2007).
- Difficulty, which refers to the substantial effort that documenting
architectural decisions can require (Harrison et al., 2007).
- Disruption, which refers to architects postponing decision
documentation to avoid disrupting their design flow (Harrison et al.,
2007).
- Natural causes, such as architects changing positions or retiring
(Jansen et al., 2007; Tang et al., 2006), or forgetting the rationales
of decisions over time (Tang et al., 2006).
In this thesis, I focus on reducing the vaporization of architectural decisions,
which are the most significant part of architectural knowledge (Babar et al.,
2009; Kruchten et al., 2006). To reduce vaporization of architectural decisions,
I propose the following high-level solution: improve architectural decision
making processes, so that the processes contain explicit steps for documenting
decisions.
Any proposal to improve such processes must consider the reasons for
architectural knowledge vaporization, as follows:
- Unawareness – Improved processes should encourage architects to
think more about their decisions, thus raising architects’ awareness
of their decisions.
- Lack of training – Improved processes should have a low learning
curve and provide sufficient advantages, so that architects are
motivated to learn and use them.
- Difficulty – Improved processes should minimize documentation
effort and decompose the documentation task into small, easy-to-perform
steps.
- Disruption – Improved processes should encourage architects to focus
on their decision making.
- Natural causes – Improved processes should encourage immediate
capture of decision rationales, to avoid the risk of architects
forgetting them over time.
We revisit these five reasons and describe how the solution proposed in this
thesis addresses them in Chapter 9 – Conclusions.
1.2.1 Research framework
Towards addressing the overall problem of this thesis, I used the iterative
approach recommended by the design science framework (Wieringa, 2009),
which is summarized in Figure 1.1. To solve practical problems from the
environment, design science proposes iterations between practical problems
and knowledge questions. The iterations start from a practical problem, which
is decomposed into practical sub-problems. In addition, knowledge questions
help understand how to solve practical problems, as well as how to decompose
the practical problems. In turn, answering knowledge questions may uncover
further practical problems to be solved. Overall, Wieringa regards these
iterations as a rational problem-solving process: analyze the environment and
the goal, propose and evaluate changes towards the goal, apply changes and
repeat iterations (Wieringa, 2009).
Wieringa defines a practical problem ‘as a difference between the way the
world is experienced by stakeholders and the way they would like it to be’ and
a knowledge question ‘as a difference between current knowledge of
stakeholders about the world and what they would like to know’ (Wieringa,
2009). Practical problems aim at changing the environment (e.g. how to
improve something? how to implement something?). Knowledge questions
aim at changing knowledge about the environment (e.g. what are the facts?
what are the effects?) (Wieringa, 2009).
Figure 1.1. Design science framework, adapted from (Wieringa, 2009).
The iterations between practical problems and knowledge questions match the
research efforts of this thesis, which starts from an overall practical problem
(i.e. reducing architectural knowledge vaporization) of the software
engineering industry. As part of my research efforts, I decomposed this
practical problem into smaller, more targeted practical problems and
knowledge questions. Typically, while working on a practical problem, new
knowledge questions appeared, and the other way around. Given the limited
resources, I had to focus my research efforts on the most promising practical
problems and knowledge questions. Next, I present the practical problems and
knowledge questions covered by this thesis.
1.3 Research questions
Figure 1.2 summarizes the knowledge questions and practical problems
addressed in this thesis. These can be grouped into three major iterations:
A. understand - covering items 1, 2, and 3 in Figure 1.2
B. explore - covering item 4 in Figure 1.2
C. propose - covering items 5, 6, and 7 in Figure 1.2
1.3.1 Iteration A - Understand
In the first major iteration, my main goal was to understand the environment
(i.e. the state of practice) in which architectural knowledge vaporization
occurs, and the existing knowledge base (i.e. the state of research) that can
help reduce architectural knowledge vaporization. Towards this, I investigated
how architectural knowledge is managed in practice (item 1 in Figure 1.2), so
that I could get a high-level overview of the situation in the industry. Since this
topic is very large, I focused mainly on identifying challenges and solutions
for managing architectural knowledge.
The overall practical problem – how can architectural knowledge vaporization
be reduced? – is decomposed in Figure 1.2 into the following knowledge
questions and practical problems (the legend in the figure distinguishes
practical problems from knowledge questions, and decomposition from
sequence relationships):

A. Understand
1. How is architectural knowledge managed in practice?
   1.1. What are potential solutions to the challenges for managing
        architectural knowledge?
2. How are architectural decisions made in practice?
   2.1. What are the characteristics of architectural decisions?
   2.2. What factors make architectural decisions difficult?
   2.3. What are the differences between junior and senior software architects?
   2.4. What are the differences between good and bad architectural decisions?
3. What is the state of research on architectural decisions?
   3.1. What are the papers on documenting architectural decisions?
   3.2. Does current research on architectural decisions consider functional
        requirements and quality attributes?
   3.3. What specific domains for architectural decisions are investigated?
   3.4. What are the normative and descriptive papers?
   3.5. What are the papers on addressing uncertainty in architectural decisions?
   3.6. What are the papers on group architectural decisions?

B. Explore
4. Can the Repertory Grid technique reduce architectural knowledge
   vaporization?
   4.1. What are advantages and disadvantages of the Repertory Grid technique?
   4.2. Does the Repertory Grid technique reduce architectural knowledge
        vaporization more than a template-based approach?

C. Propose
5. How to support making and documenting individual architectural decisions?
   5.1. What are advantages and disadvantages of REGAIN?
   5.2. What are the improvement opportunities for REGAIN?
   5.3. Which prioritization approach to use for REGAIN?
6. How to support making and documenting group architectural decisions?
   6.1. Is there a practical need for increasing consensus in group
        architectural decision making?
   6.2. What are the effort and benefits offered by GADGET?
   6.3. What are potential improvements to GADGET?
   6.4. Compared to ADHOC, what is the impact of GADGET on increasing
        consensus among group architectural decision makers?
   6.5. How do perceptions on GADGET vs. ADHOC differ among decision makers?
7. Offer tool support.

Figure 1.2. Summary of knowledge questions and practical problems addressed
in this thesis for the three phases: understand (A), explore (B), and propose (C).
Since architectural decisions are a significant part of architectural knowledge,
in the next step, I focused on how architectural decisions are made in practice
(item 2 in Figure 1.2). In this investigation, I answered four knowledge
questions (items 2.1 – 2.4 in Figure 1.2) regarding real-world architectural
decisions:
2.1. What are the characteristics of architectural decisions?
Answering this question increases knowledge on real-world
architectural decisions.
2.2. What factors make architectural decisions difficult? Since
architectural decisions are difficult (i.e. hard to make) (Zimmermann,
2011), it is important to understand the nature of the difficulties
faced by practitioners.
2.3. What are the differences between junior and senior software
architects? Junior and senior software architects might make
architectural decisions with different characteristics and face different
difficulties when making them.
2.4. What are the differences between good and bad architectural
decisions? Understanding such differences helps improve
architectural decisions.
Next, I investigated the state of research on architectural decisions (i.e. item 3
in Figure 1.2), by answering the following six knowledge questions.
3.1. What are the papers on documenting architectural decisions?
Understanding existing work on documenting architectural decisions
is a prerequisite for proposing improved approaches for reducing
architectural knowledge vaporization.
3.2. Does current research on architectural decisions consider
functional requirements and quality attributes? Satisfying
functional requirements and quality attributes are critical activities of
architects.
3.3. What specific domains for architectural decisions are
investigated? Architects work and make decisions in many domains
in the industry. Different domains may offer different challenges for
making architectural decisions.
3.4. What are the normative and descriptive papers? This question
refers to understanding which papers recommend approaches on
architectural decisions (i.e. normative), and which papers present how
architectural decisions are made in practice (i.e. descriptive).
3.5. What are the papers on addressing uncertainty in architectural
decisions? Uncertainty plays an important role in making architectural
decisions.
3.6. What are the papers on group architectural decisions? If an
important part of architectural decisions is made in groups (rather than
individually), then it makes sense to propose approaches that support
group architectural decision making.
At the end of the first major iteration, I understood five important points:
1. Architectural knowledge vaporization is a major challenge in the
industry (see Chapter 2). This convinced me of the importance of the
overall problem addressed in this thesis (i.e. reducing architectural
knowledge vaporization).
2. Much work exists on documenting architectural decisions, as a means
to reduce architectural knowledge vaporization (see Chapter 4). This
suggested that although much work had been done towards reducing
architectural knowledge vaporization, the existing approaches were not
enough to solve the problem.
3. Existing work on documenting architectural decisions reuses few
ideas from other fields (based on Chapter 4). This encouraged me to
explore proven approaches from other fields for reducing knowledge
vaporization, in the second major iteration (B), detailed in Section
1.3.2.
4. Many architectural decisions are made in groups (see Chapter 3). This
encouraged me to work towards improving group architectural
decision making in the third major iteration (C), detailed in Section
1.3.3.
5. There are very few open-source tools to help architects make and
document architectural decisions (based on Chapter 4). This
encouraged me to offer practitioners open-source tool support for
making and documenting architectural decisions in the third major
iteration (C), detailed in Section 1.3.3.
1.3.2 Iteration B - Explore
The points that I understood at the end of the first major iteration (A)
motivated me to explore ideas from other fields for reducing architectural
knowledge vaporization, improving group decisions, and offering open-source
tool support. The knowledge engineering field proved a particularly
compelling source of potentially fruitful ideas for reducing architectural
knowledge vaporization.
The rationale for looking at the knowledge engineering field was the
following: the field has decades of experience and a strong focus on
capturing knowledge from experts in various domains. In contrast, in the
software architecture field, the focus on capturing architectural knowledge
only started in 2004 (as found in Chapter 4). Because of this relatively
recent focus, some shortcomings of the proposed approaches are to be
expected. For example, proposed approaches might lack clear steps,
validation, solid conceptual foundations, tool support, or significant
functionality (e.g. no support for group decision making).
Still, reusing ideas from the knowledge engineering field carries several
risks. Some ideas might not be suitable for capturing architectural
knowledge. Also, some ideas might not be specific enough for the software
architecture field.
While aware of these potential risks and benefits, in the second major
iteration (item 4 in Figure 1.2), I explored the knowledge engineering
field to identify ideas that could help reduce architectural knowledge
vaporization. In particular, I focused on using the Repertory Grid
technique for making and capturing architectural decisions (item 4 in
Figure 1.2). I started exploring this idea after personal discussions with
Tim Menzies, a researcher with much experience in knowledge engineering.
Starting from these initial discussions, I investigated the advantages and
disadvantages of the Repertory Grid technique (item 4.1), as well as
whether the Repertory Grid technique reduces architectural knowledge
vaporization better than a template-based approach (item 4.2).
1.3.3 Iteration C - Propose
In the third iteration (C), I used the results of the previous two iterations
to propose practical solutions for reducing architectural knowledge
vaporization.
First, the exploration in item 4 in Figure 1.2 indicated that the Repertory Grid
technique had much potential for reducing architectural knowledge
vaporization. Therefore, I proposed, evaluated, and refined a process (i.e.
REGAIN, item 5 in Figure 1.2) based on the Repertory Grid technique to help
make and document individual architectural decisions (i.e. decisions made by
one architect). Next, I identified advantages and disadvantages of REGAIN
(item 5.1), as well as improvement opportunities for REGAIN (item 5.2). An
important improvement opportunity was to add and investigate a prioritization
approach for REGAIN (item 5.3).
Second, the answers to items 2 and 3 in Figure 1.2 (i.e. on state of practice and
research on architectural decisions) indicated a gap on group architectural
decisions. This is a large topic, so I focused on a particular aspect: increasing
consensus in group architectural decisions. First, I investigated the need for
increasing consensus in group architectural decisions (item 6.1). Next, I proposed a
process (i.e. GADGET) to increase consensus in group architectural decisions,
and capture the group decision. Furthermore, I investigated costs, benefits, and
potential improvements for GADGET (items 6.2 and 6.3). Finally, I compared
GADGET with ADHOC (i.e. group decision making without using any
prescribed approach). From the comparison, I identified the impact of
GADGET on consensus and perceptions of the decision makers.
Third, I offered tool support for the REGAIN and GADGET processes (item 7
in Figure 1.2), since practitioners indicated the need for user-friendly,
open-source tool support.
Figure 1.2 shows that the practical problems (with the exception of offering
tool support) and knowledge questions are further decomposed into more
focused knowledge questions or practical problems. Each of the knowledge
questions from 1 to 4, and practical problems 5, 6, and 7 are also high-level
research questions. Furthermore, knowledge questions and practical problems
1 to 6 are further decomposed into more concrete research questions (e.g. 1.1,
5.3).
Related work and rationales for each research question are presented in their
corresponding chapters of this thesis. The corresponding chapters are
presented in the next section.
1.4 Research methods
To answer the seven high-level research questions and their corresponding 21
research questions, this thesis used empirical research methods that provide
evidence to substantiate the answers. Next, I briefly describe each
empirical research method and its typical usage.
1. Experiment – this (mostly) quantitative empirical research method is
used to confirm or reject predefined hypotheses about the relation
between factors, by applying different treatments in a controlled
environment and measuring the effect of the treatments (Wohlin et al.,
2012). Experiments are typically used to establish cause-effect
relationships among predefined factors (Wohlin et al., 2003).
2. Survey – there are two types of surveys. First, quantitative surveys
are used to ‘describe, compare or explain knowledge, attitudes and
behavior’ (Pfleeger and Kitchenham, 2001), using questionnaires
filled out by participants from carefully selected samples. Second,
qualitative surveys (or interview studies) are used to collect data and
impressions from participants about something (e.g. a process) (Hove
and Anda, 2005; Myers and Newman, 2007; Seaman, 2008).
Depending on the goal of the study, qualitative surveys can be
structured (i.e. with a narrow, confirmatory focus and closed
questions), unstructured (i.e. with a broad, exploratory focus and
open-ended questions), or semi-structured (i.e. mix of the two)
(Seaman, 2008).
3. Case study – this qualitative empirical research method is used to
study contemporary phenomena in their natural contexts (Runeson and
Höst, 2009; Yin, 2003). Case studies are particularly useful for
descriptive and exploratory studies, and less so for establishing causal
relationships (Runeson and Höst, 2009).
4. Systematic mapping study – this empirical research method is used
to identify, evaluate and interpret the research on a particular topic
(Kitchenham and Charters, 2007). Systematic mapping studies are
similar to systematic literature reviews. However, systematic mapping
studies cover more papers and focus on a broader literature analysis,
compared to systematic literature reviews (Kitchenham and Charters,
2007; Petersen et al., 2008).
Table 1.1 shows the mapping of the above four research methods to the
research questions in Figure 1.2, and where in this thesis the research
questions are presented. Table 1.1 shows that this thesis presents nine
empirical studies: two experiments, five surveys (one survey each in
Chapters 2, 3, and 6, and two surveys in Chapter 5), a case study, and a
systematic mapping study. Overall, 83 practitioners and 177 students
participated in the nine empirical studies and provided evidence to answer
the research questions in Figure 1.2.
Table 1.1. Mapping of high-level research questions to empirical methods,
participants, and chapter in this thesis.

ID  High-level research question                 Empirical method    Participants        Chapter
1   How is architectural knowledge managed       Survey              11 practitioners    2
    in practice?
2   How are architectural decisions made         Survey              43 practitioners    3
    in practice?
3   What is the state of research on             Systematic          -                   4
    architectural decisions?                     mapping study
4   Can the Repertory Grid technique reduce      Survey              27 students         5
    architectural knowledge vaporization?
5   How to support making and documenting        Survey,             16 practitioners,   6
    individual architectural decisions?          Experiment          30 students
6   How to support making and documenting        Case study,         13 practitioners,   7
    group architectural decisions?               Experiment          120 students
7   What tool can support REGAIN and GADGET?     -                   -                   8
1.5 Publications overview
This thesis is based on a collection of published papers. My contributions
included planning the research, collecting data, interpreting data,
reporting research results, and revising and rewriting manuscripts to
ensure the quality level required for publication. Table 1.2 indicates the
publications corresponding to the chapters of this thesis.
Table 1.2. Mapping of thesis chapters to publications.

Chapter  Publication
2        Tofan, D., Galster, M., and Avgeriou, P., Improving Architectural
         Knowledge Management in Public Sector Organizations – an
         Interview Study. In Proceedings of the 25th International
         Conference on Software Engineering and Knowledge Engineering
         (Boston, USA), 2013.
3        Tofan, D., Galster, M., and Avgeriou, P., Difficulty of
         Architectural Decisions – a Survey with Professional Architects. In
         Proceedings of the 7th European Conference on Software
         Architecture, 2013.
4        Tofan, D., Galster, M., Avgeriou, P., and Schuitema, W., Past and
         future of software architectural decisions – A systematic mapping
         study. Information and Software Technology 56, 8 (2014), 850-872.
5        Tofan, D., Galster, M., and Avgeriou, P., Capturing Tacit
         Architectural Knowledge Using the Repertory Grid Technique. In
         Proceedings of the 33rd International Conference on Software
         Engineering (NIER Track), 2011.
         Tofan, D., Galster, M., and Avgeriou, P., Reducing Architectural
         Knowledge Vaporization by Applying the Repertory Grid
         Technique. In Proceedings of the 5th European Conference on
         Software Architecture, 2011.
6        Tofan, D., Avgeriou, P., and Galster, M., Validating and
         Improving a Knowledge Acquisition Approach for Architectural
         Decisions. International Journal of Software Engineering and
         Knowledge Engineering 24, 04 (2014), 553-589.
7        Under review at the Information and Software Technology journal.
8        Tofan, D., and Galster, M., Capturing and Making Architectural
         Decisions: an Open Source Online Tool. In Proceedings of the
         2014 European Conference on Software Architecture Workshops
         (Vienna, Austria), ACM, 2014, pp. 1-4.
Chapter 2
Architectural Knowledge Management in
Practice
Published as: Tofan, D., Galster, M., and Avgeriou, P., Improving
Architectural Knowledge Management in Public Sector Organizations – an
Interview Study. In Proceedings of the 25th International Conference on
Software Engineering and Knowledge Engineering, 2013.
To understand how architectural knowledge is managed in practice, we
started by searching for and reading existing literature on architectural
knowledge management. We noticed that the literature focuses on
architectural knowledge management in organizations in the private sector
(e.g. commercial software vendors), but there is a research gap on
architectural knowledge management practices in public sector organizations
(e.g. municipalities). Therefore, we conducted a study of architectural
knowledge management practices in the public and private sectors, to apply
lessons from the private sector to the public sector. Specifically, we conducted
an interview study with four public and four private sector organizations. We
identified challenges for architectural knowledge management in the public
sector. Then, we derived solutions from the private sector to the challenges in
the public sector. The main challenges in the public sector are vaporization of
architectural knowledge, insufficient knowledge sharing, and organizational
cultures that do not encourage architectural knowledge management.
Solutions to these challenges include community building, improving tool
support, quality control, and management support. The results confirm the
importance of the overall problem addressed in this thesis: reducing
architectural knowledge vaporization in practice.
2.1 Introduction
By searching for and reading existing literature on architectural knowledge
management, we noticed that most work on managing architectural knowledge
has been conducted in the context of private sector organizations (e.g.
commercial software vendors or companies that develop products that rely
heavily on software) (Babar et al., 2009). However, there is a research gap:
architectural knowledge management in public sector organizations has not
been studied. Addressing this gap is important because public sector
organizations, given their sizes, budgets, and impact on everyday life,
represent a significant part of the state of practice. Ignoring public sector
organizations would thus result in an incomplete view of the state of practice.
Organizations in the private sector are not owned or operated by a government.
Typical private sector organizations are corporations, regardless of their size.
In contrast, public sector organizations are owned and operated by a
government. Typical public sector organizations are municipalities or
government agencies.
Recent work on service-oriented architectures in e-government (Galster et al.,
2013) suggests that architectural knowledge management in the public sector
needs improvement. For example, immature architectural knowledge
management leads to constraints on designing specialized reference
architectures for municipalities (Galster et al., 2013). Additionally, similar to
the private sector, e-government projects in public sector organizations are
under pressure to reduce costs. As shown for the private sector, architectural
knowledge management helps reduce costs (Babar et al., 2009). However, we
could not find literature on architectural knowledge management in the public
sector. Therefore, the goal of this chapter is to understand architectural
knowledge management in practice, and to provide solutions for improving
architectural knowledge management, especially in the public sector. Towards
this goal, we formulate the following research question: What are potential
solutions to the challenges for managing architectural knowledge?
To answer this research question, we conducted an interview study in public
and private sector organizations. We were interested in identifying practical
architectural knowledge management challenges and solutions. Then, we can
use architectural knowledge management solutions from the private sector to
address similar challenges in the public sector. Proposing solutions for
improving knowledge management practices in the public sector by using
practices from the private sector has already been applied successfully (Bate
and Robert, 2002; McAdam and Reid, 2000).
The main contribution of this chapter is an increased understanding of the
state of practice, in particular of architectural knowledge management,
using insights from private and public sector organizations. Researchers
and practitioners can use the results of this chapter to propose improvements to
architectural knowledge management practices. Understanding the state of
practice also encouraged us to focus in this thesis on a significant practical
challenge: reducing architectural knowledge vaporization.
2.2 Related Work
This chapter is related to three research areas: knowledge management in
software engineering, architectural knowledge management, and knowledge
management in the public sector. We discuss related work from each area.
Dingsøyr and Conradi (Dingsøyr and Conradi, 2002) analyzed eight case
studies of knowledge management in software engineering. All cases reported
benefits due to knowledge management, such as time savings. However,
results from a systematic literature review on knowledge management in
software engineering indicate that most existing work consists of informal
lessons learnt from applying knowledge management, instead of scientific
studies (Bjørnson and Dingsøyr, 2008). In contrast, we conducted an interview
study to answer our research question in a scientific manner.
Various architectural knowledge management challenges and solutions have
been investigated in private sector organizations. For example, the challenge
of architectural knowledge vaporization can be addressed by documenting
design decisions (Jansen and Bosch, 2005). Furthermore, the challenge of
sharing architectural knowledge can be addressed by considering
communication, planning issues, and quality of captured knowledge (Avgeriou
et al., 2007; Babar et al., 2009) when implementing architectural knowledge
management strategies. Finally, a delicate balance must exist between sharing
architectural knowledge through documentation and social interactions
(Avgeriou et al., 2007) to ensure that knowledge is made explicit, without
causing much burden on architects.
The idea of getting inspiration from the private sector for improvements in the
public sector has been used before. Bate and Robert (Bate and Robert, 2002)
describe how knowledge management concepts and practices from the private
sector can improve health care organizations in the UK public sector. Another
study compares public and private sector perceptions and the use of knowledge
management (McAdam and Reid, 2000). In both types of organizations,
improved quality and efficiency were the main benefits of knowledge
management.
Overall, many reports exist on architectural knowledge management in the
private sector (e.g. (Avgeriou et al., 2007; Babar et al., 2009)), as well as on
general knowledge management in the public sector (e.g. (Bate and Robert,
2002; McAdam and Reid, 2000)). However, we could not find any work on
architectural knowledge management in the public sector.
2.3 Research Method
To answer the research question in Section 2.1, we conducted an interview
study in public and private sector organizations, using semi-structured
interviews. Such interviews belong to qualitative research, which aims at
investigating and understanding phenomena within their real life context
(Seaman, 2008). Challenges and solutions for architectural knowledge
management are linked tightly to their context. Also, we needed flexibility
during the interviews, so that we could ask new questions, to further probe for
architectural knowledge management challenges and solutions.
Similar to (Svensson et al., 2012), we decided to conduct extended, semi-
structured interviews. Using quantitative surveys was less suitable, because of
the lack of reports on architectural knowledge management practices in public
sector organizations, which inhibits the development of relevant
questionnaires. Additionally, in a survey, participants might have different
interpretations of the questions. Therefore, we decided to conduct semi-
structured interviews, which enabled us to present our topics of interest, and
discuss them directly with the participants. Furthermore, semi-structured
interviews are useful as preliminary work for an in-depth case study (Seaman,
2008). However, semi-structured interviews require significant effort to
prepare a discussion plan, recruit participants, and conduct the interview
sessions. Overall, semi-structured interviews best suited our research goal,
given the lack of previous work on architectural knowledge management in the
public sector.
2.3.1 Data Collection and Analysis Procedures
To conduct the interviews, we selected organizations from the private and
public sectors which had enterprise or software architects. We contacted
diverse organizations from our collaboration network. For the interviews, we
used recommendations from (Hove and Anda, 2005) to ensure that the
interviewer had the needed skills, and to facilitate good interaction between
interviewer and interviewees. Such recommendations include, for example,
encouraging interviewees to participate in open discussions. In each
organization, we interviewed one or two persons, depending on their
availability. In total, we interviewed eleven persons.
The face-to-face interviews typically lasted one hour. The interviews took
place between January 2010 and July 2012. We made audio recordings of the
interviews, with the interviewees’ permission. We used a
discussion plan with open-ended questions structured around three areas:
strategy (e.g. ‘what are the objectives of the architectural knowledge
management strategy?’), processes (e.g. ‘what are the processes for sharing
architectural knowledge?’), and tools (e.g. ‘what tools are used for
architectural knowledge management?’). We derived these areas from
architectural knowledge management literature (Avgeriou et al., 2007; Babar
et al., 2009).
To analyze the interviews, we transcribed the audio recordings. Next, two
researchers performed content analysis by individually assigning codes to
sentences, phrases, or paragraphs (Seaman, 2008). Each code corresponded to
either a challenge or a solution for managing architectural knowledge.
Different codes could be assigned to the same piece of content. Afterwards,
the researchers discussed their differences and agreed on a common
interpretation. In case of disagreement, we consulted a third researcher. Data
analysis also included a mapping of challenges to solutions, by identifying
which challenges were addressed by which solutions.
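The coding-agreement step above can be sketched as follows. This is an illustrative sketch, not the study’s actual analysis scripts: the fragment names and code labels (C-* for challenges, S-* for solutions) are invented.

```python
# Two coders independently assign codes to the same content fragments.
coder_a = {
    "fragment-1": {"C-vaporization"},
    "fragment-2": {"C-sharing", "S-community"},
    "fragment-3": {"S-tools"},
}
coder_b = {
    "fragment-1": {"C-vaporization"},
    "fragment-2": {"C-sharing"},
    "fragment-3": {"S-tools"},
}

def positive_agreement(a, b):
    """Fraction of all assigned codes that both coders agreed on."""
    agreed = total = 0
    for fragment in a:
        agreed += len(a[fragment] & b[fragment])  # codes both assigned
        total += len(a[fragment] | b[fragment])   # codes either assigned
    return agreed / total if total else 1.0

print(round(positive_agreement(coder_a, coder_b), 2))  # → 0.75
```

Disagreements surfaced by a low score (here, fragment-2) would then be discussed, with a third researcher consulted if needed.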
2.3.2 Organizations
The organizations that took part in this study are listed in Table 2.1. Only the
software architect in PS1 had about five years of practical experience. All the
other participants had at least ten years of practical experience. The private
sector organizations are international corporations. The public sector
organizations are part of Dutch government. For confidentiality reasons, we
provide limited details on the organizations, and we assign aliases to them.
Table 2.1. Summary of participating organizations.
ID    Sector   Domain             Employees   Interviewed
Gov1  Public   Municipality       ~1,000      Enterprise architect; knowledge management consultant
Gov2  Public   Municipality       ~100        Enterprise architect
Gov3  Public   Agency             ~1,300      Two software architects
Gov4  Public   Ministry           ~30,000     Enterprise architect
PS1   Private  Software provider  ~600        Knowledge management director; software architect
PS2   Private  IT consultancy     ~40,000     Enterprise architect
PS3   Private  Engineering        >100,000    Enterprise architect
PS4   Private  IT consultancy     >100,000    Software architect
2.3.3 Validity Threats
We discuss validity threats using the recommendations from (Wohlin et al.,
2000), in line with (Svensson et al., 2012), which uses the same
methodology.
Construct validity refers to the relation between the observations and the
theory behind the research (Easterbrook et al., 2008). We interviewed many
practitioners to avoid mono-operation bias (Wohlin et al., 2000). We avoided
evaluation apprehension (Wohlin et al., 2000) by using the recommendations
from (Hove and Anda, 2005) to create a comfortable and nonjudgmental
atmosphere for the interviews, and ensuring their confidentiality.
Conclusion validity refers to obtaining the same study results if other
researchers replicate the study (Easterbrook et al., 2008). To increase
conclusion validity, we involved multiple researchers in the data analysis,
who reached high positive agreement when interpreting the data.
External validity refers to the strength of generalizability claims of the study
results (Easterbrook et al., 2008). To increase external validity and to reduce
validity threats, we conducted interviews at a variety of organizations in the
public and private sectors. Besides architects, we also interviewed knowledge
management consultants, who could offer insights on how architectural
knowledge is managed.
Internal validity refers to the existence of confounding variables and other bias
sources (Kitchenham and Pfleeger, 2008). Internal validity threats are not
applicable to this study, because we do not try to establish any causal
relationships.
2.4 Challenges
We identified three challenges common to the public and the private sector,
as well as one challenge specific to the private sector. Additionally, we link these
challenges to results from knowledge management literature. We summarize
these challenges in Table 2.2. Afterwards, we present details on all challenges,
their consequences, and concrete examples from the public and private sectors.
Table 2.2. Challenges in public and private sector organizations.
Challenge                              Public sector            Private sector
Architectural knowledge vaporization   Gov1, Gov2, Gov3, Gov4   PS1, PS2, PS3, PS4
Low architectural knowledge sharing    Gov1, Gov2               PS1, PS2, PS3, PS4
Organizational culture                 Gov1, Gov2, Gov3         PS1, PS2, PS4
Low integration                        -                        PS1
2.4.1 Challenges in the Public Sector
Architectural Knowledge Vaporization: This challenge refers to the loss of
architectural knowledge in an organization (Jansen and Bosch, 2005). We
learnt that architectural knowledge vaporization contributes to increased
vendor lock-in because the less in-house architectural knowledge remains in
public sector organizations, the more they depend on software vendors for
technology decisions (e.g. extending existing software depends on one
vendor).
Also, architectural knowledge vaporization makes it more difficult to modify
the architecture without involving vendors. For example, migrating existing
systems to a service-oriented architecture depends on the willingness of the
vendors. Having more in-house architectural knowledge enables organizations
to make better decisions about software solutions that meet their core needs,
and to decrease vendor lock-in. Overall, architectural knowledge vaporization
reduces flexibility for public sector organizations and increases maintenance
costs.
Architectural knowledge vaporization is a challenge across all public sector
organizations that we studied. In Gov3, little architectural knowledge was
captured on a regular basis. Architects had no formalized way to capture their
knowledge. A wiki was used in the past, but only for a brief period, so the
content became quickly outdated. Consequences of architectural knowledge
vaporization were that similar problems were solved in different ways. Thus,
new people who joined a team needed to re-discover solutions, instead of
reusing a proven solution. Instead of reusing captured knowledge, much
knowledge had to be communicated informally: architects often needed to
explain the same solution repeatedly to different developers, instead of
documenting it once and sharing the documentation.
Similar to the other organizations, little architectural knowledge was captured
in Gov4. The architects working for Gov4 were employed through external
companies, and were not asked to document their knowledge, although they
were willing to do so. Moreover, few knowledgeable people existed inside
Gov4 with whom knowledge could be shared through direct interaction.
Therefore, when the external architects stopped working for Gov4, their
knowledge vaporized from Gov4, because there was no mechanism for
preserving it.
Low Architectural Knowledge Sharing: This challenge refers to insufficient
sharing of architectural knowledge, inside and across organizations (Babar et
al., 2009). We learnt that low architectural knowledge sharing existed in
Gov1 and Gov2. An architect from Gov1 compared his current position with
his previous job in the private sector, where co-workers were much more open
to knowledge sharing and helped each other, resulting in higher efficiency.
At Gov3, architects worked in small, isolated groups, without sharing much
knowledge across groups. Also, architects could allocate parts of their time to
increase their knowledge, but not for sharing it with others. In Gov4, the same
tendency for isolation between groups existed, with little knowledge sharing
between them. Moreover, in Gov4 most architects were from external
companies, and very few knowledgeable people existed in Gov4, so architects
could not share their knowledge with them. Overall, low architectural
knowledge sharing caused inefficiencies.
Lack of Supportive Organizational Culture: Culture contains norms about
who controls what knowledge, and who can share or hoard it (Long and Fahey,
2000). For example, a cultural norm is accepting knowledge hoarding as a
source of job security or power (Long and Fahey, 2000). An architect from
Gov1 stated: ‘Nearby municipalities are very small compared to us, maybe
they fear we are going to take over things from them. That’s the kind of
feeling, which is very old.’ Such fears encouraged knowledge hoarding and
reduced knowledge sharing.
An architect at Gov3 considered that organizational culture played a role in a
previous failed attempt to use a wiki for knowledge sharing between architects
and developers. There were no accepted norms in Gov3 for capturing and
sharing knowledge, so the wiki content gradually became outdated, and the
wiki was abandoned. Overall, we noticed that the lack of a supportive
organizational culture increases knowledge vaporization and reduces
knowledge sharing, within and across organizations.
2.4.2 Challenges in the Private Sector
The challenges in the private sector match the ones from the public sector and
include one extra challenge, namely low integration of architectural
knowledge management with organizational goals.
Architectural Knowledge Vaporization: We found this challenge in all the
private sector organizations. Architects mentioned several factors that
contribute to this challenge.
First, due to lack of time, less knowledge can be documented (PS1, PS2, and
PS3).
Second, documentation becomes irrelevant a few years after it is written, so
the return on the time spent documenting is unclear (PS1, PS2, and PS4).
The architect at PS2 summarized his view on documenting architectural
knowledge: ‘We typically document when either the client asks for it or we
discover that we need it. I’m not really interested in this documentation, unless
I discover that the speed by which I can address a problem depends on the
documentation.’
Third, differences in educational background between software architects
and maintainers increased documentation costs. The architect at PS2
described this as follows: ‘I have a designer, who has knowledge, puts it into a
document, and pass it to someone who does maintenance, and who reads that
information, generates knowledge from it, and these two do not match. Why
not? Well, this one has architectural schooling for eight years and this one is
good at programming routers. The points of view are so different, that these
simply do not match, even if the documentation is the same.’
Fourth, existing research results on capturing architectural design decisions
are not fully adopted in industry (PS1, PS2, and PS3). Overall, similar to
public sector organizations, architectural knowledge vaporization led to
increased maintenance costs.
Low Architectural Knowledge Sharing: This challenge exists in all the
private sector organizations. From the interviews at PS1, we learnt that one
contributing factor was sharing knowledge by e-mail, because the sender
determined who received the content. This created an obstacle for other
people who might be interested in the knowledge captured in an e-mail.
For example, let us assume the rationale for an architectural decision is in an e-
mail thread among a few architects. If a developer working on the code is
interested in the rationale for that decision, then he would need to find out that
the e-mail thread exists, and then ask one of the architects to forward it to him.
Reducing overhead from these steps may facilitate architectural knowledge
sharing.
Lack of Supportive Organizational Culture: We identified this challenge in
the interviews at PS1, PS2, and PS4. Several factors contributed to this
challenge.
First, architects and developers needed to be convinced to deliver not only
source code, but also their knowledge. For example, at PS2, architects were
not interested in transferring knowledge, because they did not consider it an
interesting activity.
Second, trust was an important factor in organizational culture, as put by the
interviewee at PS1: ‘It’s not about software. It’s not about wiki content, it’s
about people getting trust and solving problems.’
Low Integration with Organizational Goals: This challenge refers to the
integration of knowledge management efforts with the goals of the
organization (Rubenstein-Montano et al., 2001). From the interviews at PS1,
we learnt that if such integration is low, then architectural knowledge
management efforts carry the risk of adding too little value to the organization.
Specifically, the challenge is to provide value from architectural knowledge
management efforts throughout the lifecycle of projects for customers, i.e.
from sales, to architecting, development, and during maintenance.
Architectural knowledge management efforts need to show benefits, such as
time savings for architects and other stakeholders.
Although the integration challenge did not emerge from the interviews in the
public sector organizations, we consider this challenge also relevant to
public sector organizations, because such integration is a critical element of
knowledge management, regardless of the type of organization (Rubenstein-
Montano et al., 2001). Due to their different nature, the organizational goals in
the public sector differ from the goals in the private sector. However, in both
types of organizations, architectural knowledge management efforts must
serve organizational goals.
2.5 Solutions
We describe six solutions to the challenges in Section 2.4, elicited from the
interviews in the private sector organizations: community building, tool
support, training, resource allocation, quality control, and management
support. Next, we present details about each solution.
Community Building: This solution was described in all private sector
organizations. PS1 built its community based on three elements: people, tools,
and processes. People include architects, developers, testers, partners, and
customers, who joined the community voluntarily and gradually. The main
tool is a commercial wiki. Processes are managed through PS1’s own business
process management tool. For example, architects follow predefined processes
for capturing knowledge regularly in the company wiki. If an architect leaves,
the impact is reduced, because the other people in the organization can still use
the architect’s previous regular contributions to the wiki.
PS2 supports the creation of various communities of practice, in which
architects can share knowledge with people in other positions or fellow
architects. Moreover, collocating architects with other project groups improves
architectural knowledge sharing across projects. Architects who work in other
groups ‘get the feeling on what that really means and how that works.’
Overall, getting perspectives from other groups helps architects deliver better
documentation, as they become aware of the documentation needs of other
groups.
Architects in PS3 share their knowledge through communities of practice, on
architectural or other technical topics (such as Java or .Net), or business
related topics. For these communities, the company organizes regular events to
help networking, and promote knowledge sharing. Recognized experts are
invited to share their insights at such events. The architect at PS3 stressed the
idea that although tools help, they are less important than networks of people.
Tool Support: This solution receives much attention in all private sector
organizations. At PS1, tool support shifted from a sender-dominant paradigm
(e-mail) to a receiver-dominant paradigm (subscription). This means that
notification about content and the actual content are separated. For example,
instead of architects emailing content, they put architectural content in the
wiki, and then send an e-mail notification with the wiki link. If a person
considers that the content is interesting for her work, then the person can
subscribe to the topic, and receive future notifications about it, without the
constraint of receiving content through e-mail.
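The receiver-dominant paradigm described above can be sketched as a minimal publish/subscribe structure. This is an illustrative sketch with names of our own invention, not PS1’s actual tooling: the notification carries only a link, the content stays in one place (the wiki), and interested people opt in to future updates.

```python
class WikiTopic:
    """A wiki topic that people subscribe to; content is never e-mailed."""

    def __init__(self, title):
        self.title = title
        self.subscribers = []   # people who opted in to this topic
        self.pages = []         # the actual content lives here
        self.sent = []          # (recipient, notification) pairs, for illustration

    def subscribe(self, person):
        self.subscribers.append(person)

    def publish(self, page, announce_to=None):
        """Store the content, then send only a link-style notification."""
        self.pages.append(page)
        note = f"New page on '{self.title}': {page}"
        for person in list(self.subscribers) + list(announce_to or []):
            self.sent.append((person, note))

topic = WikiTopic("decision rationale")
topic.subscribe("developer-1")
# A one-off announcement with the wiki link; subscribers get all future ones.
topic.publish("wiki/decision-42", announce_to=["architect-2"])
```

In contrast to e-mailing the content itself, anyone who later finds the topic can subscribe, without depending on a sender remembering to include them.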
Moreover, at PS1 knowledge capturing is based on a wiki, to avoid using
different tools (e.g. forums, wikis, or document management systems). Having
content in multiple locations creates obstacles for end users in accessing and
sharing it. Therefore, all content must be delivered in the wiki. For example, if
architects produce artifacts with other tools (e.g. PowerPoint slides), then the
artifacts need to be attached to a wiki page.
At PS2 and PS3, various tools (e.g. SharePoint, wikis, internal blogs, and a
third party collaborative software system) are used for capturing and sharing
architectural knowledge. Additionally, social networking tools (e.g. Skype,
Twitter, and Yammer) are widely used in PS2, PS3, and PS4, enabling
knowledge exchanges across offices around the world.
Training: PS2 develops training materials for maintenance personnel, to
facilitate the transfer of architectural knowledge. In PS3, to increase people’s
architectural knowledge, architectural training takes place as part-time
assignments, which may take six to nine months. Although demanding, such
trainings are necessary to ensure similar levels of architectural knowledge
throughout PS3. In addition, PS4 has central training facilities in which
architects from various offices can meet in person during trainings, which
leads to stronger connections through the social networking tools.
At PS3, in addition to trainings, there are company-wide events with software
architecture experts. Architects can attend such events to expand their
knowledge, or share their knowledge with each other.
Resource Allocation: This solution refers to planning and allocating
resources for architectural knowledge management activities. At PS1 and PS3,
10% of architects’ time is allocated to knowledge management activities. At
PS2, transferring architectural knowledge to maintenance people is considered
a project in itself. As part of the project, architects need to consider what
knowledge is needed for maintenance, and plan for its transfer. Architects may
temporarily join the maintenance team to facilitate the transfer.
Quality Control: This solution refers to measures for increasing the quality of
captured knowledge. At PS1, various metrics are collected for the wiki pages,
such as the number of visitors, visitor profiles, time spent on a page, and the
next pages visited. Such metrics indicate issues with content. If the content in the
wiki is useful and up to date, then visitors perceive value in accessing the wiki.
At PS3, peer-review is used to evaluate the quality of captured architectural
knowledge. For example, a group of architects involved in a healthcare project
sent some design documents to another group of experienced architects for
review. The experienced architects provided constructive feedback to increase
documentation quality. In turn, the reviewers (the experienced architects)
improved their knowledge of the healthcare domain.
At PS4, a solution to increase quality is to separate domain-specific knowledge
from department-specific knowledge in the wiki system used for capturing
knowledge. The rationale was that domains and departments evolve at
different speeds. For example, a department might disappear during a re-
organization, but knowledge from that department about the architecture of a
specific system might be needed across other departments. If no separation
exists, then the captured knowledge about that specific system becomes
difficult to update, because it is mixed with irrelevant knowledge about the
disappeared department.
Management Support: Support from top management was essential for the
knowledge management efforts at PS1, because architectural knowledge
management is a long-term effort. A person from PS1 summarized this with a
metaphor: ‘Grass doesn’t grow by pulling it.’ PS1 needed two to three years to
implement its new knowledge management practices. To sustain momentum
for long-term knowledge management efforts, knowledge workers (including
architects) needed to experience benefits from the new practices. This was
mainly achieved by saving time through architectural knowledge reuse.
Top management influences organizational culture by encouraging initiatives,
and having tolerance for mistakes. This was described as a success factor at
PS1: ‘You’ll only get fired if you didn’t take initiative, not because you made a
mistake. Otherwise I wouldn’t be doing this. I wouldn’t even be close to this
kind of ideas [for knowledge management].’
At PS4, management supported knowledge management efforts by providing
positive reinforcements to the top wiki contributors who shared their
knowledge. The positive reinforcements took the form of e-mails from top
management thanking contributors, and internal news articles praising their
efforts. As contributors received recognition for their efforts, the
organizational culture became more supportive of knowledge management
activities. In turn, people became comfortable sharing their knowledge and
helping colleagues.
2.6 Discussion
A similar study in the UK public sector (i.e. national healthcare) (Bate and
Robert, 2002) describes knowledge management as a core activity for
organizational improvements. Unfortunately, knowledge management in the
UK public sector is much less mature than in private sector organizations
(Bate and Robert, 2002). Therefore, the public sector can benefit from the
lessons and experiences in the private sector (Bate and Robert, 2002).
In our study, we noticed a similar situation for the Dutch public sector.
Although architectural knowledge management provides significant benefits,
architectural knowledge management in the public sector is much less mature
than architectural knowledge management in the private sector. For example,
interviewees from the public sector mentioned previous failed attempts to use
wikis for capturing and sharing knowledge. Therefore, we think that the
experiences derived from the private sector will help improve architectural
knowledge management practices in the Dutch public sector and elsewhere.
Similar to (Bate and Robert, 2002; McAdam and Reid, 2000), we consider that
solutions from the private sector help improve the situation in the public
sector. Also, the improved quality and efficiency that the private sector derives
from its architectural knowledge management efforts can motivate public
sector organizations to pay more attention to architectural knowledge
management.
We summarize the solutions from the private sector (detailed in Section 2.5)
and map them to the challenges in the public sector (detailed in Section 2.4.1)
in Table 2.3. Each solution exists in two or more private sector organizations,
and addresses one or more challenges. For example, community building
addresses the architectural knowledge vaporization and sharing challenges.
Also, tool support addresses architectural knowledge vaporization, sharing and
organizational culture challenges.
Table 2.3. Summary of solutions and challenges.
Organizations        Solution             Challenges addressed
PS1, PS2, PS3, PS4   Community building   vaporization, sharing
PS1, PS2, PS3, PS4   Tool support         vaporization, sharing, culture
PS2, PS3, PS4        Training             vaporization, sharing, integration
PS1, PS2, PS3        Resource allocation  vaporization, integration
PS1, PS3, PS4        Quality control      vaporization, sharing, integration
PS1, PS4             Management support   culture, integration, sharing
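The solution-to-challenge mapping of Table 2.3 can also be expressed as a small data structure, so that questions such as ‘which solutions address a given challenge?’ can be answered mechanically. This restructuring is ours, for illustration only; the mapping itself is taken verbatim from the table.

```python
# Table 2.3 as data: solution -> set of challenges it addresses.
solutions = {
    "community building": {"vaporization", "sharing"},
    "tool support": {"vaporization", "sharing", "culture"},
    "training": {"vaporization", "sharing", "integration"},
    "resource allocation": {"vaporization", "integration"},
    "quality control": {"vaporization", "sharing", "integration"},
    "management support": {"culture", "integration", "sharing"},
}

def addressing(challenge):
    """Solutions from Table 2.3 that address the given challenge."""
    return sorted(s for s, cs in solutions.items() if challenge in cs)

print(addressing("culture"))  # → ['management support', 'tool support']
```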
Dependencies among challenges have received little attention in architectural
knowledge management literature on the private sector. We noticed
dependencies between architectural knowledge sharing and architectural
knowledge vaporization: sharing reduces the risk of vaporization. On the other
hand, addressing vaporization by creating architecture documentation makes it
possible to share architectural knowledge. Also, low architectural knowledge
sharing and vaporization can be addressed with a common set of solutions:
training, processes, tools, and community building.
dependency is that organizational culture influences the willingness of
architects to share and capture their knowledge. For example, architects might
not share their knowledge because there is no positive reinforcement in their
organization for sharing. On the other hand, management support influences
organizational culture by providing positive reinforcement and a long-term
focus. Both are needed to foster an organizational culture that encourages
knowledge-related activities.
This study also contributes to existing literature on architectural knowledge
management in practice. For example, various solutions have been proposed to
address architectural knowledge vaporization and sharing (Babar et al., 2009;
Jansen and Bosch, 2005). However, little work exists on the role of
organizational culture and the integration of architectural knowledge
management efforts with organizational goals. Results from knowledge
management literature (Long and Fahey, 2000; Rubenstein-Montano et al.,
2001) and from this study encourage more research on these challenges that
focuses on architectural knowledge.
2.7 Conclusions
In this chapter, we present our research results on the state of practice of
architectural knowledge management. The research results are based on an
interview study consisting of eleven interviews conducted over two years. We
conducted the interviews in four public and four private sector organizations.
This chapter contributes to the existing body of work on architectural
knowledge management (e.g. (Avgeriou et al., 2007; Babar et al., 2009;
Dingsøyr and van Vliet, 2009)) with lessons learnt from implementing
architectural knowledge management in the private sector, and proposes these
as solutions to the challenges in the public sector. Also, this study confirms
that architectural knowledge vaporization is a major challenge.
To address architectural knowledge vaporization, we need to better understand
how architectural decisions are made in practice. As discussed in Chapter 1,
architectural decisions are an important part of architectural knowledge.
Therefore, by understanding real-world architectural decisions, we can
propose approaches that help avoid architectural knowledge vaporization. The
next chapter presents a study on architectural decisions in practice.
Acknowledgment
We thank the study participants for their help.
Chapter 3
Architectural Decisions in Practice
Based on: Tofan, D., Galster, M., and Avgeriou, P., Difficulty of Architectural
Decisions – a Survey with Professional Architects. In Proceedings of the 7th
European Conference on Software Architecture, 2013.
In this chapter, we investigate characteristics of architectural decisions, as
well as factors that contribute to the difficulty of decisions. We compared
characteristics of decisions and difficulty factors for junior and senior
architects. Furthermore, we studied whether decisions with good outcomes (as
perceived by the architects) have different characteristics and different
difficulty factors than decisions with bad outcomes (also as perceived by the
architects). We performed a survey with 43 architects who described 43 good
and 43 bad decisions from their industrial practice.
Characteristics of decisions include the time taken to make a decision and the
number of decision makers. Also, we learnt that dependencies between
decisions and the effort required to analyze decisions are major factors that
contribute to the difficulty of architectural decisions. We found that good
decisions tend to consider more alternatives than bad decisions.
3.1 Introduction
Increased understanding of architectural decisions in practice enables
researchers to propose approaches that help practitioners improve architectural
decision making and therefore reduce overall architectural knowledge
vaporization. Although there is much community interest in architectural
decisions, little work has been done towards a deeper
understanding of architectural decisions in practice. This chapter offers a step
in this direction, by focusing on understanding two aspects: characteristics of
architectural decisions in practice and factors that contribute to the difficulty of
making decisions. As presented in Chapter 1, architectural decisions are
particularly difficult decisions.
Using the Goal-Question-Metric (GQM) approach (Basili and Caldiera, 1994), the goal of this study is to analyze
architectural decisions, for the purpose of understanding architectural
decisions in practice, from the perspective of software architects, in the context
of real-world projects.
We refine this goal into four research questions.
RQ1. What are the characteristics of architectural decisions?
By defining measurable characteristics of architectural decisions, we can
perform comparisons between decisions. Based on literature (Kruchten, 2008),
our experience and discussions with practitioners, we define the following
basic characteristics of decisions as metrics for RQ1.
1. Actual (i.e. while making the decision) and elapsed (i.e. actual plus
various interruptions) time spent making a decision. For example, an
architect spent five hours (actual time) over three days (elapsed time)
to make a decision.
2. Number of people directly (i.e. decision makers) and indirectly (i.e.
influenced the decision, but did not make the decision) involved in
decision making.
3. Number of alternatives considered in the beginning and later over an
extended period of time during the decision making process.
4. Number of quality attributes considered for a decision.
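The four characteristics above could be bundled into a simple record per surveyed decision; the field names below are hypothetical and are not taken from the survey instrument.

```python
from dataclasses import dataclass

@dataclass
class DecisionCharacteristics:
    """One surveyed architectural decision (field names are illustrative)."""
    actual_time_hours: float     # time actively spent making the decision
    elapsed_time_days: float     # calendar time, including interruptions
    direct_people: int           # decision makers
    indirect_people: int         # influenced the decision, did not make it
    initial_alternatives: int    # alternatives considered at the start
    later_alternatives: int      # alternatives considered later on
    quality_attributes: int      # quality attributes considered

# The example from characteristic 1: five hours of actual time over three days.
d = DecisionCharacteristics(
    actual_time_hours=5, elapsed_time_days=3,
    direct_people=1, indirect_people=2,
    initial_alternatives=2, later_alternatives=3,
    quality_attributes=2,
)
print(d.actual_time_hours, d.elapsed_time_days)  # → 5 3
```

Recording decisions in such a uniform shape is what makes the per-characteristic comparisons of RQ1, RQ3, and RQ4 possible.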
Since the difficulty of a decision is a major aspect of decision quality (Yates et
al., 2003), we propose RQ2. Answering RQ2 will help researchers and
architects understand the factors that increase the difficulty of decisions.
RQ2. What factors make architectural decisions difficult?
To answer RQ2, we defined a list of factors using literature and discussions
with experts (Section 3.3.1). The resulting metrics were 22 factors (Table 3.4).
These were used as input for our survey and survey participants rated them on
a Likert scale.
The level of experience of architects influences their decision making (van
Heesch and Avgeriou, 2010; van Heesch and Avgeriou, 2011). We propose
RQ3 to investigate how difficulty and characteristics of decisions vary with the
level of experience. This helps researchers propose targeted solutions to
address the difficulties perceived by either junior architects or experienced
architects.
RQ3. What are the differences (on difficulty and characteristics of decisions)
between junior and senior software architects?
Besides difficulty, decision outcome is the other major aspect of decision
quality (Yates et al., 2003). We investigate the differences between decisions
with a more preferable outcome (i.e. good decisions) and decisions with a less
preferable outcome (i.e. bad decisions). Answering RQ4 highlights
characteristics and difficulty factors linked to good and bad outcome of
architectural decisions.
RQ4. What are the differences between good and bad architectural decisions?
To answer the research questions, we conducted a survey with software
architects in industry. Section 3.2 presents related work. Section 3.3 presents
survey design. Section 3.4 contains the analysis of results which are further
discussed in Section 3.5. Section 3.6 presents conclusions.
3.2 Related Work
Work related to this chapter is discussed for each research question. We start
with the research question on characteristics of architectural decisions (RQ1).
We could not find any study that focuses on identifying or describing
characteristics of architectural decisions. Based on observations of various
architecture teams, Kruchten recommends roles, responsibilities and time
allocation for architects (Kruchten, 2008). Based on (Kruchten, 2008), we
consider that involved roles, involved responsibilities (e.g. involved directly or
indirectly in the decision making) and the time allocated (e.g. actual, elapsed)
for making decisions are characteristics of architectural decisions.
Some surveys with practitioners provide insights on architectural decision
making (Clerc et al., 2007), and on knowledge sharing for architectural decisions
(Farenhorst et al., 2009). However, we could not find any study that
investigates the characteristics of architectural decisions addressed by RQ1.
Our research question on difficulty of architectural decisions (RQ2) draws
inspiration from work on the quality of decision making (Yates et al., 2003).
Yates et al. (Yates et al., 2003) consider that the difficulty of decisions is
similar to a cost of making decisions, so reducing this cost increases the quality of
decision making. We consider this also applies to architectural decisions
because they are a subcategory of decisions. Unfortunately, we could not find
related work that focuses on the difficulty of architectural decisions.
Related work on junior architects versus senior architects (RQ3) includes two
surveys and a case study on the reasoning processes of naïve and professional
architects (van Heesch and Avgeriou, 2010; van Heesch and Avgeriou, 2011;
van Heesch et al., 2013).
- Results of the first survey indicate that naïve architects (i.e.
undergraduate students) do not make trade-offs between requirements,
fail to validate dependencies between decisions, and do not critically
evaluate their decisions (e.g. insufficient risk assessment) (van
Heesch and Avgeriou, 2010).
- Results of the case study indicate that naïve architects who use decision
viewpoints are more systematic in the exploration and evaluation of
decision alternatives (van Heesch et al., 2013).
- Results of the second survey (van Heesch and Avgeriou, 2011) indicate
that professional architects very often search for many design
alternatives in their decision making, unless they already have a
solution in mind. Also, professional architects prefer familiar
alternatives, because unfamiliar alternatives need more analysis effort
(van Heesch and Avgeriou, 2011). Additionally, professional
architects do not consider risk assessment as very important (van
Heesch and Avgeriou, 2011).
In our survey, we ask participants about risks and trade-offs as difficulty factors
for their architectural decisions. No related work compares the characteristics and
difficulty of architectural decisions for professionals with different levels of
experience.
For our research question on good versus bad decisions (RQ4), we found much
research aimed at avoiding bad decisions. For example, ATAM and CBAM help
discover potentially bad architectural decisions (Kazman and Klein, 2001).
They do this by documenting costs, benefits, and uncertainty. Other authors
describe architectural smells as bad architectural decisions, which harm quality
attributes (Garcia et al., 2009). However, we could not find any work which
compares characteristics and difficulty of good and bad architectural decisions.
3.3 Survey Design
To design this survey, we used the recommendations and guidelines from
(Ciolkowski et al., 2003; Kitchenham and Pfleeger, 2008). Data was collected
using an online questionnaire. In this section, we present questionnaire
development, evaluation, and data collection.
3.3.1 Questionnaire Development and Evaluation
To develop the questionnaire, we took the following steps, based on
(Kitchenham and Pfleeger, 2008):
1. Review literature
2. Discuss with experts
3. Pilot questionnaire
4. Publish questionnaire
In the first step, we reviewed existing literature on architectural decisions (e.g.
(van Heesch and Avgeriou, 2010; van Heesch and Avgeriou, 2011;
Zimmermann, 2011)) and decision research (e.g. (Yates et al., 2003)). From
the literature, we identified a list of factors that contribute to the difficulty of
architectural decisions. Based on these factors, we prepared a first version of
the questionnaire that we used as a starting point for interviewing practitioners.
For the second step, we interviewed four senior architects, each with at least
ten years of experience as an architect. We asked each architect to identify two
architectural decisions they had been involved in, and discussed the
questionnaire items for both decisions. Afterwards, we asked the architects to
propose other items that contribute to the difficulty of a decision to be included
in the questionnaire. The architects also provided thoughtful feedback on the
structure of the questionnaire.
For the third step, we piloted the questionnaire with six other persons that
included researchers and practitioners. We used their feedback to further
improve the questionnaire. For example, we eliminated some items so that the
survey would take less than 15 minutes to complete (practitioners are unlikely
to finish long surveys (Kitchenham and Pfleeger, 2008)). Additionally, we
rephrased some questions to increase their clarity.
In the final step, we published the questionnaire online. The final version of
the questionnaire showed the following to participants. The full questionnaire
is available in the Appendix.
1. Welcome message
2. Control question
3. Background questions
4. Describe good decision
5. Rate 22 factors for the good decision
6. Describe bad decision
7. Rate 22 factors for the bad decision
8. Add other factors
9. Thank you message
The first page had a welcome message with an overview of the survey,
including targeted audience, required effort and incentive. The next page
checked that participants had been involved directly in making architectural
decisions during the last two years. The survey continued only after a positive
confirmation from the participant.
The survey continued with a few questions about the background of the
participant. Next, participants were asked to indicate a good architectural
decision (‘good’ according to their judgment), and describe its characteristics,
in terms of the metrics for RQ1 (e.g. number of alternatives). Next,
participants were asked to rate the 22 statements in Table 3.4 about the
difficulty of their good architectural decision on a Likert scale. Afterwards,
participants were asked to indicate a bad architectural decision, and describe
its characteristics. We asked participants to judge themselves what bad
architectural decisions were. Participants were asked to rate the same 22
statements about the difficulty of their bad decision.
Finally, participants could optionally add other items that contribute to the
difficulty of architectural decisions. Also, participants could send us a short e-
mail, to receive a copy of the survey results or to offer other feedback.
3.3.2 Data Collection
Our target population was software architects who were directly involved in
making software architectural decisions during the last two years. This ensured
that survey participants could reliably answer the questions about their
architectural decisions. We dropped the requirement of having the job title
‘architect’ because it might be over-used in the industry, without indicating
architecting activities. Also, people often make architectural decisions without
having a formal role as architects (e.g. senior software developers).
Members of our target population have busy schedules, so they are unlikely to
participate in surveys. Thus, we tried to limit the time to complete the survey
to around 15 minutes. To reach our target population, we used several
approaches. First, we sent invitations to architects in our personal networks,
asking them to participate in the survey and to further distribute it. Second, we
posted invitations to the survey on the website of a social networking site for
professionals. Third, we ran paid ad campaigns on LinkedIn (as detailed in
(Galster and Tofan, 2014)). In the advertisement we targeted professionals
working for companies in the ‘Computer Software’, Internet, or ‘Information
Technology and Services’ industries, from any part of the world, with job
functions in ‘Information Technology’ or Engineering. Also, we selected the
following job seniority: CXO, Director, Manager, Owner, Partner, Senior, or
VP, with skills in software architecture. These filters resulted in around 91,000
professionals who could potentially see our ad. Finally, we ran an ad
campaign with Google AdWords. Here, we did not find options for defining
our target audience. Instead, we defined a set of keywords (e.g. ‘software
architecture’). People searching for such keywords could see our ad for the
survey.
We ran the survey from 1 October 2012 until 7 January 2013,
using a third-party, web-based tool. In total, 219 persons started the survey.
163 persons provided partial answers (i.e. respondents answered a few
questions, and then abandoned the survey). Twelve participants did not pass
the control question about involvement in architectural decisions. Finally, 43
persons provided answers to all questions in the survey. We analyze the 43
answers in the next section.
3.4 Results Analysis
We use descriptive statistics to present the background of the survey
participants and the answers to RQ1 and RQ2. Also, we use statistical tests to answer
RQ3 and RQ4.
3.4.1 Participants Background
Answers were provided from 23 countries on five continents. Regarding their
job title, most respondents described themselves as software architects
(twenty), senior software engineers (seven), or enterprise architects (five). The
other respondents described themselves as managers, business analysts, system
architects, and other roles. Table 3.1 summarizes the years of experience that
participants had in their current role and as architects. We notice that most
respondents had three to five years of experience in their current roles. Most
participants had six to ten years of experience as software developers. The
ranges for years of experience for architects and developers were normally
distributed.
Table 3.1. Number of participants with corresponding years of experience and
roles.
Years of experience In current role As architect As developer
0-2 years 5 12 3
3-5 years 19 10 11
6-10 years 10 13 12
11-15 years 3 7 8
>15 years 6 1 9
The 43 survey participants answered questions about 86 architectural decisions
(43 good and 43 bad decisions). Architectural decisions were from domains
shown in Figure 3.1. Most decisions (31) belonged to other domains, such as
financial trading, insurance, or advertising.
Other, 31; E-commerce, 12; Telecommunication, 10; E-government, 10;
Embedded systems, 8; Banking, 6; Healthcare, 6; Transportation, 3
Figure 3.1. Domains of decisions.
3.4.2 RQ1 - Characteristics of Architectural Decisions
Participants indicated the actual and the elapsed time they spent for making the
86 architectural decisions. For example, an architect can spend three working
days over ten working days for a decision, resulting in a ratio of actual versus
elapsed time of 30%. In the survey, participants could indicate time in minutes,
hours, days, weeks, or months, based on their preference. We converted their
responses into working days, by considering that one working day has eight
hours, one working week has five days, and one working month has 22
working days.
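These conversion conventions can be sketched as follows (a minimal illustration; the function name is ours):

```python
# Working-time conventions used in the analysis:
# one working day = 8 hours, one week = 5 days, one month = 22 days.
DAYS_PER_UNIT = {
    "minutes": 1 / (8 * 60),
    "hours": 1 / 8,
    "days": 1.0,
    "weeks": 5.0,
    "months": 22.0,
}

def to_working_days(amount: float, unit: str) -> float:
    """Convert a reported duration to working days."""
    return amount * DAYS_PER_UNIT[unit]

# The example from the text: 3 working days (actual) over two working
# weeks (10 working days, elapsed) gives a ratio of 0.3.
ratio = to_working_days(3, "days") / to_working_days(2, "weeks")
```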
Outlier values distort the interpretation of averages and
standard deviations. To avoid that, we eliminate outliers. We obtain the results
in Table 3.2. On average, architectural decisions took about eight working
days, elapsed over around 35 working days. The average ratio indicates that,
overall, a third of the elapsed time is spent on the actual decision making.
Table 3.2. Metrics for actual time, elapsed time (in working days), ratio of actual
versus elapsed time, and number of directly and indirectly involved persons in
the architectural decisions.
Metric Actual Elapsed Ratio Direct Indirect
Average 7.85 34.74 0.34 3.12 7.05
Standard deviation 9.22 70.59 0.22 1.54 8.91
Minimum 0.50 0.63 0.02 1 0
Maximum 44 600 1 8 50
Mode 1 5 0.5 3 3
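The chapter does not state which outlier criterion was used; the sketch below (our own code, not the thesis's analysis script) assumes the common 1.5×IQR rule and then computes the kind of summary statistics reported in Table 3.2:

```python
import statistics

def iqr_filter(values):
    """Drop values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] (assumed rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if lo <= v <= hi]

def summarize(values):
    """Summary statistics after outlier elimination, as in Table 3.2."""
    vals = iqr_filter(values)
    return {
        "average": statistics.mean(vals),
        "stdev": statistics.stdev(vals),
        "minimum": min(vals),
        "maximum": max(vals),
        "mode": statistics.mode(vals),
    }

# Example: one extreme elapsed-time value is dropped before averaging.
clean = iqr_filter([1, 1, 2, 2, 3, 3, 4, 600])
```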
Participants indicated how many people were involved directly and indirectly
in making the architectural decisions. The number of indirectly involved
persons does not include the directly involved persons. We collected this
metric because we could not find any related work that describes how many
decision makers are usually involved directly in architectural decisions in
practice. For example, if most architectural decisions are made in groups, then
more work is needed for group decision making, rather than individual
decision making. We eliminated outliers for the number of people, both for
direct and indirect involvement, and we obtain the results in Table 3.2.
Table 3.3. Metrics for the number of alternatives considered in the beginning,
and for an extended period of time. Last column has the number of quality
attributes.
Metric Beginning Extended time # Quality attributes
Average 2.91 1.96 4.74
Standard deviation 1.43 0.84 4.19
Minimum 1 0 0
Maximum 8 4 30
Mode 3 2 3
When making an architectural decision, several alternatives may be considered.
Since little is known about how many alternatives architects usually consider
in their decision making, we asked participants to indicate the number of
alternatives they considered at the beginning of their decision making process,
and the number of alternatives they studied for an extended period of time.
Architects consider quality attributes in their decisions. However, it is not clear
how many quality attributes they consider in practice, so we asked them to
indicate this number for their decisions. After eliminating outliers, we obtain
the metrics for the number of alternatives and quality attributes in Table 3.3.
3.4.3 RQ2 - Difficulty of Decisions
Participants rated 22 statements (Table 3.4) with factors on the difficulty of
their decisions, indicating their level of agreement with each statement on a
five-point Likert scale (strongly disagree, disagree, neutral, agree, strongly
agree), with an additional 'not applicable' option.
Table 3.4. Factors to describe the difficulty of a decision.
ID The decision was difficult because…
F1 you received conflicting recommendations from various sources about
which decision alternative to choose
F2 there were no previous similar decisions to compare this decision
against
F3 it was hard to identify a superior decision alternative from the
alternatives under consideration
F4 the decision required a lot of thinking from you
F5 it was hard to convince stakeholders to accept a certain decision
alternative
F6 stakeholders had strongly diverging perspectives about the decision
F7 you needed to influence some stakeholders without having formal
authority over them
F8 the decision had too many alternatives
F9 the decision had too few alternatives
F10 analyzing alternatives for this decision took a lot of effort
F11 some quality attributes were considered too late in the decision making
process
F12 too many people were involved in making the decision
F13 dependencies with other decisions had to be taken into account
F14 the decision had a major business impact
F15 you had to respect existing architectural principles
F16 serious negative consequences could result from the decision
F17 too little time was available to make the decision
F18 you had a lot of peer pressure
F19 of the trade-offs between quality attributes
F20 you lacked experience as an architect
F21 you lacked domain-specific knowledge (e.g. new customer)
F22 more information was needed to reduce uncertainty when making the
decision
Results for each factor are summarized in Figure 3.2 (left). From the bar
charts, we notice the following. First, participants indicated most agreements
(including strong agreements) with statements on dependencies with other
decisions (F13 for 69 decisions), major business impact (F14 for 60 decisions)
and serious negative consequences (F16 for 59 decisions).
Second, participants indicated most disagreements (including strong
disagreements) with statements on having too many alternatives (F8 for 57
decisions), too many people involved in decision making (F12 for 49
decisions), lack of domain-specific knowledge (F21 for 46 decisions), and
having too few alternatives (F9 for 45 decisions).
Third, participants indicated the most neutral standpoints with statements on
respecting existing architectural principles (F15 for 24 decisions), needing a lot
of effort for analyzing decision alternatives (F10 for 21 decisions), and having
much peer pressure (F18 for 21 decisions).
Fourth, participants indicated very few statements were ‘not applicable’ to
their decisions. Most ‘not applicable’ answers were obtained for influencing
some stakeholders without formal authority (F7 for six decisions).
Figure 3.2 (right) shows average values for all factors, calculated as follows.
We assign numerical values to the Likert scale: strongly disagree (1), disagree
(2), neutral (3), agree (4), and strongly agree (5). Not applicable values are
ignored. We acknowledge the ongoing discussions on treating a Likert scale as
either an interval or categorical type of data (e.g. (Jamieson, 2004)). Still, we
use averages because they are intuitive and easy to understand for a large
audience.
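This coding and averaging step can be sketched as follows (our own illustration of the procedure just described):

```python
# Numeric coding of the five-point Likert scale used for Figure 3.2 (right).
LIKERT = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}

def factor_average(ratings):
    """Average the coded ratings for one factor, ignoring 'not applicable'."""
    coded = [LIKERT[r] for r in ratings if r != "not applicable"]
    return sum(coded) / len(coded)

# e.g. two agreements, one strong agreement, and one ignored N/A answer.
avg = factor_average(["agree", "agree", "strongly agree", "not applicable"])
```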
From Figure 3.2 (right), we notice that dependencies with other decisions
(F13) and major business impact (F14) have the highest average agreement
across participants. Negative consequences (F16) received the second highest
average. Effort for analyzing alternatives (F10), lack of similar decisions (F2)
and requiring a lot of thinking (F4) also received high agreement from
participants.
We notice that some factors have averages smaller than three (neutral),
suggesting disagreement that they contribute to the difficulty of architectural
decisions. For example, we notice that either too many (F8) or too
few (F9) alternatives contribute little to difficulty. The same goes for lack of
experience (F20) and domain-specific knowledge (F21). However, these two
factors need to be considered in the context that many participants were senior
architects, who might already have enough experience and knowledge.
[Figure 3.2, left: stacked bar chart of the Likert-scale answer counts
(strongly disagree, disagree, neutral, agree, strongly agree, not applicable)
for each factor F1-F22; the individual counts are not recoverable from the
extracted text.]

Sorted average values for the factors (Figure 3.2, right):

ID   Avg.
F13  3.88
F14  3.88
F16  3.74
F10  3.63
F2   3.61
F4   3.6
F7   3.55
F22  3.49
F3   3.48
F6   3.35
F19  3.33
F1   3.3
F5   3.27
F15  3.27
F18  3.05
F11  3.01
F17  2.93
F21  2.82
F20  2.73
F9   2.71
F12  2.69
F8   2.52

Figure 3.2. Survey results for each factor (left), and sorted average values for
factors (right) - a higher average indicates stronger agreement with the difficulty
of a factor.
Participants could indicate optionally other factors that contribute to the
difficulty of architectural decisions. We identified three other difficulty
factors: technology evolution (mentioned by six participants), insufficient
knowledge (four participants), and organizational politics (two participants).
First, technology places unpredictable constraints on decisions, such as
uncertain backwards incompatibility of future versions of a platform. Second,
insufficient knowledge refers to lack of professional knowledge, lack of peers
to discuss non-standard issues, and lack of documentation. Third, participants
indicated that organizational politics contribute to the difficulty of decisions,
and require awareness from architects.
3.4.4 RQ3 - Differences between Junior and Senior Architects
We divide survey participants into junior and senior architects, using an
arbitrary cut-off. We consider that junior architects have up to five years of experience
as architects. Senior architects have six or more years of experience as
architects. Based on this separation, 22 junior and 21 senior architects
answered the survey.
To compare the answers from junior and senior architects, we use a statistical
test. We compare two independent groups and cannot assume that the data is
normally distributed. Thus, we use the Mann-Whitney U test (Field, 2009), a
non-parametric test. We investigate the differences between junior and senior
architects on the answers offered to the 22 factors in Table 3.4, and the eight
metrics in Table 3.2 and Table 3.3.
We use SPSS 20 to apply the Mann-Whitney U test on the survey data. ‘Not
applicable’ answers are treated as missing values. We obtain statistically
significant differences (p-value less than 0.05) between junior and senior
architects for five difficulty factors and two characteristics. Table 3.5 presents
the Mann-Whitney U test results (U, Z, and p), median and average values for
junior and senior architects. Results for difficulty factors without statistical
significance are omitted.
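The chapter used SPSS 20 for this analysis. For readers without SPSS, the U statistic itself is straightforward to compute; the pure-Python sketch below (our own code, not the thesis's analysis script) derives U from the rank sums, giving tied values their average rank. A statistics package would additionally provide the Z approximation and p-value reported in Table 3.5.

```python
def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U statistic for two independent groups.

    Ties receive average ranks; returns min(U_a, U_b).
    """
    combined = [(v, "a") for v in group_a] + [(v, "b") for v in group_b]
    combined.sort(key=lambda t: t[0])
    n = len(combined)
    rank_sum_a = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # average of 1-based ranks i+1 .. j
        rank_sum_a += avg_rank * sum(
            1 for k in range(i, j) if combined[k][1] == "a"
        )
        i = j
    n_a, n_b = len(group_a), len(group_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)
```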
We notice that junior architects found conflicting recommendations on what to
consider for a decision (F1) to be more significant in making a decision difficult.
Also, in contrast to senior architects, junior architects found that decisions
requiring a lot of thinking (F4) become more difficult. In turn, senior
architects found that decisions become more difficult if they have a major
business impact (F14). Finally, there are differences between junior and senior
architects on experience (F20) and domain-specific knowledge (F21); these
differences can be expected to some extent, because senior architects have
more experience and domain-specific knowledge.
Table 3.5. Results of the Mann-Whitney U test are displayed in the first four
columns. Median and average values for junior (J) and senior (S) architects are
displayed in the last four columns. Lines with p-value > 0.05 have gray
background.

ID                      U    Z      p     Mdn. J  Mdn. S  Avg. J  Avg. S
F1                      565  -2.49  .006  4       3       3.61    2.97
F4                      722  -1.80  .036  4       3       3.82    3.38
F14                     716  -1.71  .044  4       4       3.73    4.05
F20                     429  -4.42  .000  3       2       3.25    2.19
F21                     699  -1.93  .027  3       2       3.74    2.59
Actual time             553  -3.21  .001  2.75    9       4.87    19.67
Elapsed time            742  -1.57  .058  15      21.5    22.87   62.32
Ratio time              621  -2.62  .004  0.25    0.35    0.29    0.42
Direct people           883  -.36   .361  3       3       3.28    3.78
Indirect people         901  -.20   .422  4       4       8.79    24.12
Alternatives beginning  780  -1.28  .100  2.5     3       2.77    3.62
Alternatives extended   916  -.07   .474  2       2       2       2.08
No. QAs                 764  -1.40  .081  3       4.5     4.23    5.07
We notice that senior architects spend significantly more actual time on their
decisions than junior architects (four times more, on average). The ratio of actual
to elapsed time is higher for senior architects, with clear statistical significance.
We also notice a tendency of senior architects to consider more alternatives at
the beginning of the decision making. Also, senior architects tend to consider
more quality attributes than junior architects; however, this difference is not
statistically significant.
3.4.5 RQ4 - Differences between Good and Bad Decisions
Each survey participant answered questions about a good and a bad
architectural decision. Comparing participants’ answers on the two types of
decisions increases our understanding on the quality of architectural decisions,
by analyzing the link between the two aspects of quality: difficulty (the 22
factors in Table 3.4) and outcome (good and bad decisions). Furthermore, we
analyze the link between the characteristics of architectural decisions and their
outcome (e.g. are there differences between the time spent on good or bad
decisions).
We compare differences between the 43 good and 43 bad decisions using the
Wilcoxon signed ranks test (Field, 2009), a non-parametric statistical test for
comparing groups of two related samples. We treat ‘not applicable’ answers as
missing values. Similar to the analysis in Section 3.4.4, we investigate the
differences between good and bad decisions related to the data in Table 3.2
and Table 3.3. The results of the Wilcoxon signed rank test are presented in
Table 3.6. Results on difficulty factors with statistically insignificant values
are not included.
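As with the Mann-Whitney test, the Wilcoxon statistic itself is easy to compute without SPSS. This pure-Python sketch (our own illustration) computes the signed-rank statistic W for paired samples, dropping zero differences and giving tied absolute differences their average rank; a statistics package would additionally provide the Z and p values in Table 3.6.

```python
def wilcoxon_w(paired_a, paired_b):
    """Wilcoxon signed-rank statistic W for two related samples.

    Zero differences are dropped; ties get average ranks;
    returns min(sum of positive ranks, sum of negative ranks).
    """
    diffs = [a - b for a, b in zip(paired_a, paired_b) if a != b]
    diffs.sort(key=abs)  # rank by absolute difference
    n = len(diffs)
    w_pos = w_neg = 0.0
    i = 0
    while i < n:
        # Find the run of tied absolute differences starting at i.
        j = i
        while j < n and abs(diffs[j]) == abs(diffs[i]):
            j += 1
        avg_rank = (i + 1 + j) / 2  # average of 1-based ranks i+1 .. j
        for k in range(i, j):
            if diffs[k] > 0:
                w_pos += avg_rank
            else:
                w_neg += avg_rank
        i = j
    return min(w_pos, w_neg)
```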
Table 3.6. Results of the Wilcoxon signed rank test, medians and averages for
variables and metrics that correspond to good (G) and bad (B) decisions. Lines
with p-values > 0.05 have gray background.

ID                      Z      p-value  Mdn. G  Mdn. B  Avg. G  Avg. B
F9                      -2.48  0.006    2       3       2.47    2.95
F11                     -3.07  0.001    2       4       2.61    3.4
F13                     -2.12  0.018    4       4       4.07    3.7
F17                     -2.63  0.004    2       3       2.67    3.19
F18                     -2.03  0.022    2.5     3       2.81    3.29
F20                     -1.85  0.04     3       3       2.6     2.86
Actual time             -0.51  .308     4       5       14.44   9.76
Elapsed time            -0.73  .235     20      20      38.02   46.25
Ratio time              -0.09  .463     0.33    0.33    0.37    0.34
Direct people           -0.55  .294     3       3       3.37    3.63
Indirect people         -1.40  .082     4       4       17.14   15.42
Alternatives beginning  -2.41  .007     3       2       3.65    2.72
Alternatives extended   -1.92  .029     2       2       2.21    1.84
No. QAs                 -2.68  .003     4       3       5.02    4.26
We notice differences on having too few alternatives (F9), with a tendency
towards disagreement with F9 for good decisions, and towards neutral for bad
decisions. For bad decisions, participants indicated that some quality attributes
were considered too late (F11), in contrast with good decisions. Also,
dependencies with other decisions (F13) were more difficult for good than for bad
decisions. Participants disagreed on too little available time (F17), much peer
pressure (F18), and lack of experience (F20) for the good decisions, in contrast
with the bad decisions.
Regarding the metrics for decision characteristics, we notice statistically
significant differences in the number of alternatives considered at the
beginning of the decision making process, the number of alternatives studied for
an extended period of time, and the number of quality attributes. For all three
parameters, the good decisions had higher values. The results for the
parameters about time and number of involved people are not statistically
significant.
3.5 Discussion
Little is known in the literature about the duration of real-world architectural
decisions. We found that the time needed for architectural decisions varies
considerably. An architectural decision takes an average actual time of around eight
working days, over an average elapsed time of 35 working days. Also,
architects spend around one third of the elapsed time on the actual decision
making (Table 3.5). The survey results indicate no significant differences
between good and bad decisions regarding actual and elapsed time (Table 3.6).
However, participants considered they had enough time for the good decisions,
and not enough time for the bad decisions.
Another insight from this study is that the actual time junior architects spend
on making a decision is one quarter of that spent by senior architects (Table
3.5). We expected senior architects to spend less or similar amounts of time to
junior architects, because of their extra experience. A possible explanation is
that senior architects might deal with higher impact decisions than juniors, so
they need the extra time. A future, more precise comparison should use a ratio
of time per decision impact, which can be quantified as the estimated cost of
reversing the decision.
Another insight from this survey concerns the number of people involved in
architectural decisions. The importance of stakeholders in architectural
decisions is widely recognized in the literature. Stakeholders are always
involved indirectly in decision making. However, we could not find any
literature on the direct involvement of stakeholders in architectural decisions,
as actual decision makers, rather than decision influencers. To improve
decision support, researchers need to know if architectural decisions are
typically made by one person (i.e. the architect) or by groups of persons (i.e.
one or more architects, and other stakeholders). For example, researchers can
propose group decision making approaches, if a relevant proportion of
architectural decisions are made in groups. A surprising result is that only 14%
of the decisions in the survey were made by individuals. The typical
architectural decision has three decision makers (Table 3.2). Consequently,
group architectural decision making is a much needed research direction.
Regarding the difficulty of decisions, we notice that dependencies on other
decisions are a very important factor in the perceived difficulty of decisions.
Results from a related survey (van Heesch and Avgeriou, 2011) indicate that
architects often come across such dependencies. Moreover, researchers
proposed various approaches for handling decision dependencies (e.g. (Jansen
et al., 2009)). Our survey confirms the relevance of the topic, and the need for
disseminating research results to practitioners. We also found that analysis
effort and a lack of similar (or previously made) decisions increase the difficulty of
decision making. This suggests that practitioners welcome approaches that
help them analyze decisions, and appreciate examples of similar decisions as
opportunities to reuse architectural knowledge.
Regarding differences between junior and senior architects, we found that
junior architects need help to address the difficulties of analyzing decisions,
such as handling conflicting recommendations. This is not relevant for senior
architects. We consider that existing documentation approaches help junior
architects. However, documentation could be improved by adding capabilities
for analyzing decisions (e.g. what-if analysis).
Regarding differences between good and bad decisions, the survey results
indicate that good decisions have more alternatives than bad decisions.
Therefore, as a rule of thumb, we recommend that practitioners identify three or
more alternatives. Also, this study confirms that practitioners should pay
attention to quality attributes and decision dependencies while making
architectural decisions.
3.5.1 Validity Threats
We present potential study limitations, in terms of internal, construct, and
external validity (Ciolkowski et al., 2003; Kitchenham and Pfleeger, 2008). To
increase internal validity, we piloted the questionnaire and refined it to ensure
that participants could understand it. Also, we added explanatory text with
small examples to the questions, so that participants could easily interpret the
questions. Another threat to internal validity is that our findings were derived
from polarized instances of architectural decisions (i.e. good and bad), rather
than typical decisions. Not adding typical decisions to the questionnaire was a
necessary trade-off for keeping the survey duration within 15 minutes.
To ensure that we measured the difficulty of decisions, we used existing work
(Yates et al., 2003) as a starting point for characterizing decisions and for
conceptualizing difficulty, as suggested by (Ciolkowski et al., 2003). Moreover,
we discussed our conceptualization with experienced architects, who helped us
refine it. Furthermore, participants could answer with ‘not applicable’ to the
survey items on the difficulty of their decisions. The very low number of ‘not
applicable’ answers indicates that the survey items indeed measure the difficulty
of decisions.
We increased the external validity of this survey by posting survey invitations
in venues for professionals, and by using paid ad campaigns. These efforts
resulted in a higher diversity of respondents than recruiting only from
personal networks, or gathering answers only from a specific region or
company.
3.6 Conclusions
In this chapter, 43 participants provided information about 86 architectural
decisions. We found that dependencies among decisions, analysis effort, and a
lack of similar or previously made decisions make architectural decisions
difficult. Moreover, junior architects need decision analysis support more than
senior architects. Also, we found that considering more alternatives may lead
to better decisions. Additionally, this survey confirmed the impact of quality
attributes and decision dependencies on the quality of architectural decisions.
Finally, this survey confirms the importance of reducing architectural
knowledge vaporization (a challenge also identified in the previous chapter),
so that architects can access similar and previously made decisions.
This survey provided insights on the duration of architectural decision making
and on the number of people, alternatives, and quality attributes involved,
which suggest the need for further research on group architectural decision making.
To offer decision analysis support, Chapters 6 and 7 present approaches for
making architectural decisions (including group decisions). Chapter 8 presents
an open-source tool that includes support for group decision making, analyzing
alternatives and dependencies with other decisions.
After studying practical architectural knowledge management and architectural
decisions in Chapter 2 and Chapter 3, the next chapter presents a systematic
mapping study of literature on architectural decisions, in which we identify
and synthesize existing research on architectural decisions.
Acknowledgment
We thank the study participants for their help.
Chapter 4
State of Research on Architectural Decisions
Published as: Tofan, D., Galster, M., Avgeriou, P., and Schuitema, W., Past
and future of software architectural decisions – A systematic mapping study.
Information and Software Technology 56, 8 (2014), 850-872.
After analyzing the state of practice on managing architectural knowledge (in
Chapter 2) and making architectural decisions (in Chapter 3), this chapter
provides a systematic overview of the state of research on architectural
decisions. Such an overview helps reflect on previous research and plan future
research. Furthermore, such an overview helps practitioners understand the
state of research, and how research results can help practitioners in their
architectural decision-making.
We conducted a systematic mapping study, covering studies published between
January 2002 and January 2012. We defined six research questions. We
queried six reference databases and obtained an initial result set of 28,895
papers. We followed a search and filtering process that resulted in 144
relevant papers.
After classifying the 144 relevant papers for each research question, we found
that current research focuses on documenting architectural decisions. We
found that few studies describe real world architectural decisions. We
identified potential future research topics, such as group architectural
decision making. Regarding empirical evaluations of the papers, around half
of the papers use systematic empirical evaluation approaches (such as
surveys or case studies). Still, few papers on architectural decisions use
experiments.
This study confirms the increasing interest in the topic of architectural
decisions. This study helps the community reflect on the past ten years of
research on architectural decisions. Researchers are offered a number of
promising future research directions, while practitioners learn what existing
papers offer. This study encouraged us to work on group architectural
decision making (Chapter 7), and to use experiments for empirical validations
(Chapters 6 and 7).
4.1 Introduction
Chapters 2 and 3 present the state of practice on architectural decisions:
confirmation that architectural knowledge vaporization is a major challenge,
and insights on real world architectural decisions. As discussed in Chapter 1,
architectural decisions are a major component of architectural knowledge.
Understanding architectural decisions enables researchers to offer approaches
that reduce architectural knowledge vaporization. This chapter has the goal of
providing a systematic overview of the current state of research on
architectural decisions. To achieve this goal, we present a systematic mapping
study. The overview in this systematic mapping study provides value by
offering the list of papers on architectural decisions, clusters of papers based
on various research topics relevant to architectural decisions (as detailed in
Section 4.2.1.3), as well as findings and future research directions derived
from the clusters of papers.
Such an overview benefits two types of audiences. First, the overview enables
researchers to reflect critically on the current state of research on architectural
decisions. Moreover, the overview offers researchers promising future
research directions, gaps in current research and topics that need further
attention. Second, the overview enables practitioners to learn about the state of
existing research on architectural decisions, and potentially adopt state of the
art approaches (e.g. methodologies or tools) and use other insights on
architectural decisions (e.g. empirical evidence about the importance of
architectural decisions) in their own architectural decision-making practices.
We provide a more detailed discussion on the benefits of our study when
introducing the rationales of our research questions in Section 4.2.1.3.
Previously, only partial overviews of the state of research on architectural
decisions have been presented: Falessi et al. (Falessi et al., 2011) survey
fifteen architectural decision-making techniques proposed in the literature,
with the goal of helping architects choose among decision-making techniques
for their practice. Shahin et al. (Shahin et al., 2009) compare features of nine
tools for documenting architectural decisions from the literature. Bu et al. (Bu
et al., 2009) analyze various aspects of design reasoning (such as rationale
reuse) in nine decision-centric architectural design approaches from the
literature. Overall, these three studies only offer partial overviews of research
on architectural decisions. In contrast, we present a systematic overview
following a rigorous approach to identify all relevant papers. Furthermore,
rather than comparing existing tools or approaches, we aim to map the whole
domain of architectural decisions.
As mentioned earlier, architectural decisions are part of architectural
knowledge (in addition to other aspects, such as context and assumptions
(Kruchten et al., 2005)). Thus, work related to our mapping study also includes
overviews of the state of research on the topic of architectural knowledge. De
Boer and Farenhorst (de Boer and Farenhorst, 2008) present a systematic
literature review on definitions of architectural knowledge, and notice that
many definitions of architectural knowledge in fact refer to architectural
decisions. Tang et al. (Tang et al., 2010) compare several tools for
architectural knowledge management. Li et al. (Li et al., 2013) present a
systematic mapping study on applying knowledge-based approaches in
software architecture. Overall, studies on architectural knowledge are on a
different abstraction level, compared to studies on architectural decisions.
Furthermore, our study has a different goal compared to other literature
reviews (de Boer and Farenhorst, 2008; Li et al., 2013; Tang et al., 2010), and
a broader scope than existing partial overviews (Bu et al., 2009; Falessi et al.,
2011; Shahin et al., 2009).
The rest of this chapter is structured as follows. Section 4.2 presents the
mapping study methodology, research questions, study search strategy, data
extraction and analysis. Section 4.3 presents the data collection efforts. Section
4.4 presents the answers to the research questions. Section 4.5 discusses the
study results and limitations. Section 4.7 concludes the chapter.
4.2 Research Methodology
A systematic mapping study is an evidence-based form of secondary study that
provides a comprehensive overview of a research area, identifying common
publication venue types (e.g. conference or journal), quantitative analyses (e.g.
number of published studies per year), and research findings in the
investigated research area (Petersen et al., 2008). Systematic mapping studies
offer multiple benefits. First, mapping studies identify gaps and clusters of
papers based on frequently occurring themes in current research, including the
nature and extent of empirical data on a topic, using a systematic and objective
procedure (Budgen et al., 2008). Second, mapping studies help plan new
research, avoiding effort duplication (da Mota Silveira Neto et al., 2011).
Third, they identify topics and areas for future systematic literature reviews, a
more in-depth form of secondary study with a focus on narrower research areas
than mapping studies.
We also considered performing a systematic literature review, rather than a
systematic mapping study. These two types of literature reviews differ as
follows. First, mapping studies are suitable for analyzing broad research areas,
compared to systematic literature reviews which are suitable for answering in-
depth research questions (Kitchenham and Charters, 2007). Typically,
systematic literature reviews cover fewer papers than mapping studies
(Kitchenham and Charters, 2007). Second, mapping studies focus on broad
analysis of the literature (e.g. by classifying and summarizing literature), rather
than in-depth analysis of the literature (e.g. in terms of outcomes and quality
assessments of the papers) (Kitchenham and Charters, 2007; Petersen et al.,
2008).
As discussed in Chapter 1, there is much interest in the community on the
topic of architectural decisions. However, no systematic overview of this area
exists. This prevents an in-depth analysis of the literature in this research area
(e.g. analysis of a specific type of architectural decisions). Therefore, given the
large amount of work on architectural decisions and the lack of existing
systematic overviews, we concluded that a systematic mapping study benefits
the community more than a systematic literature review, by offering a broad
perspective on existing work on architectural decisions. Future systematic
literature reviews could start from the results of this mapping study and define
research questions for sub-topics from literature on architectural decisions.
To conduct this study, we extended the mapping study process proposed by
Petersen et al. (Petersen et al., 2008) and used a process similar to (da Mota
Silveira Neto et al., 2011; Elberzhager et al., 2012), which included the definition
of a data collection form and a study protocol. As shown in Figure 4.1, the
authors of (da Mota Silveira Neto et al., 2011) split the process proposed by
(Petersen et al., 2008) into three phases: 1) Research directives, 2) Data
collection, and 3) Results. In addition to the process in (da Mota Silveira Neto
et al., 2011; Elberzhager et al., 2012), we added an extra step to the first phase:
we surveyed generic decision literature (detailed in Section 4.2.1.2), after
defining the protocol. The survey of generic decision literature resulted in
extra dimensions for classifying papers, from which we derived research
questions in addition to the preliminary questions defined in the protocol. In the
second phase (detailed in Section 4.3), we identified relevant papers according
to inclusion and exclusion criteria defined in the protocol. In the third phase
(detailed in Section 4.4), we created the classification scheme to classify
existing papers on architectural decisions using the instructions from (Bailey et
al., 2007; Petersen et al., 2008) and performed the actual mapping of current
literature.
[Figure 4.1 depicts the three phases and the outcome of each step. Phase 1, Research directives: Protocol Definition (outcome: Protocol), Decision Literature Survey (outcome: Extra Dimensions), and Research Questions Definition (outcome: Review Scope). Phase 2, Data collection: Conduct Research (outcome: All Papers) and Papers Screening (outcome: Relevant Papers). Phase 3, Results: Keywording of Relevant Topics (outcome: Classification Scheme) and Data Extraction and Mapping (outcome: Systematic Map).]
Figure 4.1. Systematic mapping study process used in this study.
4.2.1 Research Directives
In this section, we present the first phase of the mapping study process. In this
phase, we define the research protocol, survey generic decision-making
literature, and define the research questions.
4.2.1.1 Protocol Definition
We developed the research protocol based on an established template. In the
protocol, we specified the study topic, its justification, and preliminary
research questions. Also, we specified the search strategy (detailed in Section
4.3.1), selection criteria (detailed in Section 4.3.2), and a data extraction form.
We reviewed and updated the protocol in several iterations.
In the protocol, we also indicated offering an overview of the selected papers
in terms of their empirical evaluation approaches. The overview of empirical
evaluation approaches indicated whether existing research used empirical evidence,
and of what kind. The rationale of such an overview was to
understand what empirical evidence supports existing work. For example, a
paper might validate an approach using a case study with practitioners.
Synthesizing such information can indicate what approaches are most used and
least used in research on architectural decisions. Results on most used
approaches offer researchers examples for conducting future empirical studies.
Results on least used approaches encourage researchers to use such
approaches, since evidence from multiple empirical evaluation approaches is
stronger than evidence from one approach. Also, these results help
practitioners judge the validity and applicability of research results on
architectural decisions. For example, practitioners might consider that research
results from evaluations with other practitioners are more applicable than
results from evaluations with students.
Other mapping studies also present aspects related to empirical evaluation
approaches, such as research type in (da Mota Silveira Neto et al., 2011;
Engström and Runeson, 2011), which use a classification that includes
opinions, evaluations, and solution proposals. We used the classification from
(Bailey et al., 2007; Elberzhager et al., 2012; Li et al., 2013) that includes
experiments, case studies, and surveys.
In the protocol, we also indicated offering an overview of the selected papers
in terms of their publication venues and years. This overview helps researchers
and practitioners identify the leading venues in which to publish or read about
research results on architectural decisions. Although there is clear interest in
the community in architectural decisions, there is no empirical evidence about
the trend in the number of publications on architectural decisions over the
years. Such a trend helps researchers and practitioners understand whether
interest in the topic of architectural decisions is growing or decreasing,
and whether there are gaps in that interest.
Moreover, other mapping studies also present publication venues and years for
the selected papers (da Mota Silveira Neto et al., 2011; Elberzhager et al.,
2012; Engström and Runeson, 2011; Li et al., 2013).
We defined research questions (detailed in Section 4.2.1.3) derived from two
sources:
1. Software architecture decisions literature helps classify relevant papers
using research questions emerging from the relevant papers themselves
(such as the role of non-functional requirements in architectural decision-
making).
2. Generic decision literature helps classify relevant papers using research
questions derived from a larger body of work, beyond software
architecture. We further motivate and explain the use of generic decision
literature in Section 4.2.1.2.
4.2.1.2 Generic Decision Literature Survey
Leveraging generic decision literature enables us to position research on
architectural decisions in a larger body of work. Generic decision literature
refers to literature on decisions that is independent of any particular domain,
and thus independent of software architecture (e.g. literature on strategic
decisions in organizations).
As argued earlier, architectural decisions are a particular type of decisions, i.e.
decisions made by software architects, while architecting software systems.
Since research on generic decisions is more mature than research on
architectural decisions, we consider that this perspective allows us to reuse
results, and to better transfer research results from the generic decisions
literature to the software architecture community. For example, if the generic
decision body of work offers results for compensating bias in decision-making,
then software architecture researchers can investigate such results to address
bias in architectural decision-making. Therefore, we conducted a lightweight
literature survey to identify possible items for the research questions and the
classification scheme. We used the following criteria for identifying relevant
generic decision literature:
1. We chose to use books instead of articles, because books offer more
comprehensive and broad content on an existing body of knowledge in
an area. In contrast, research articles tend to present novel, in-depth
contributions.
2. The books should target an academic audience, because an academic
audience has high expectations on content quality (e.g. providing
detailed references and avoiding speculations).
3. The authors of the books should have a publication track record on
decision-related topics in peer-reviewed venues to indicate their
expertise on decisions. We ensured this by checking the background of
authors and their publication record.
4. The books should focus on the generic decision-making field.
We searched for and identified six books (i.e. (Eisenführ et al., 2010; Janis,
1989; Newell et al., 2007; Nutt and Wilson, 2010; Peterson, 2009; Zeckhauser,
1996)) that comply with the above criteria. We are aware that other books on
decision-making exist that comply with the criteria. However, we consider that
the six books offer sufficient content for identifying representative ideas from
generic decision-making literature. We read each book and extracted major
ideas from them. We considered a major idea relevant for our study if it
referred to major, high-level, established concepts (e.g. estimating probabilities
is a relevant major idea, but specific approaches for how to do this are out of
scope). Next, we consolidated the major ideas and discussed the potential
applicability of each idea as a mapping dimension for literature on decision-
making in software architecture. Finally, following discussions among
researchers, we kept the mapping dimensions that we found most relevant for
the software architecture field, and formulated four additional research
questions based on them. In the next section, we present all research questions
and their rationales, including the four research questions derived from this
literature survey.
4.2.1.3 Research Questions Definition
We present the six research questions for this study and their rationales. We
categorize the research questions in two groups, based on their source, as
below.
Research Questions Derived from Software Architecture
Literature
RQ1. What are the papers on documenting architectural decisions?
Rationale: Documenting decisions reduces architectural knowledge
vaporization, which, in turn, reduces maintenance costs (Bosch, 2004).
Various approaches have been proposed for documenting architectural
decisions. However, an overview of papers on documenting architectural
decisions is currently missing. Such an overview helps researchers analyze
existing approaches on documenting architectural decisions and identify gaps
in existing work. In addition, practitioners can use the proposed approaches to
improve their decision documentation practices.
RQ2. Does current research on architectural decisions consider functional
requirements and quality attributes?
Rationale: In their activities, architects need to consider the functional
requirements and quality attributes (or ‘non-functional’ requirements) of
software systems. In addition, quality attributes play an important role in the
decision-making process, since architects must make tradeoffs between quality
attributes (e.g. security versus usability). However, in some decision-making
situations, specific quality attributes become architectural key drivers and
therefore receive more attention. For example, the quality attribute security
would be an architectural key driver when architecting a security-intensive
system. Answering RQ2 helps researchers identify quality attributes that are
rarely addressed in current decision-making approaches and therefore might
need more attention in future research. Also, practitioners can use the answer
to RQ2 to select approaches from the literature to make decisions related to
specific quality attributes.
RQ3. What specific domains for architectural decisions are investigated?
Rationale: Architectural decisions are made in various domains. Domains
include application domain (e.g. healthcare) and technology domain (e.g.
service-oriented architectures). A paper may belong to more than one domain.
Different domains bring different challenges for architects, so architects in
industry can benefit from choosing approaches that are geared towards the
challenges of a particular domain. Furthermore, answering RQ3 helps
researchers identify domains that may need more attention. Practitioners can
use the answer to RQ3 to select approaches from the literature to help them in
their domain-specific architectural decision-making. This offers them better
targeted approaches, compared to generic approaches for architectural
decision-making. Also, practitioners learn whether different domains require
different approaches or if the domain has only limited influence on how
decisions are made.
Research Questions Derived from Generic Decision Literature
RQ4. What are the normative and descriptive papers?
Rationale: Generic decision literature distinguishes between normative and
descriptive theories about decisions (Eisenführ et al., 2010; Peterson, 2009).
Descriptive theories aim at explaining and predicting how decisions are
actually made in the real world. In contrast, normative theories aim at
prescribing how decisions should be made in a rational manner. However,
normative and descriptive theories complement each other. Developing a
normative theory (i.e. on how architectural decisions should be made) benefits
from understanding how architectural decisions are actually made. Thus, we
use the normative/descriptive classification for papers on architectural
decisions. Answering RQ4 helps researchers understand the existing
descriptive papers, and plan future descriptive studies. For example, current
descriptive studies might present only parts of the lifecycle of real-world
architectural decisions. Thus, future descriptive studies can present the full
lifecycle of such decisions, from initial need to make the decision to its actual
implementation and results. The answer to RQ4 includes a list of descriptive
papers that helps researchers uncover such missing aspects. Also, researchers
can use the existing descriptive papers to propose better normative approaches.
Practitioners can use descriptive work on real-world architectural decisions to
understand how other practitioners deal with architectural decisions. In
addition, practitioners can use approaches from normative work to improve
their decision-making activities.
RQ5. What are the papers on addressing uncertainty in architectural
decisions?
Rationale: Addressing uncertainty is a major issue in generic decision
literature (Newell et al., 2007; Peterson, 2009; Zeckhauser, 1996), because
most decisions involve uncertainty about the future consequences of choosing
a certain alternative. If a decision does not involve uncertainty, then such
decision is trivial to make, because the decision maker can simply choose the
alternative with the highest benefits. Generic decision literature proposes
various approaches to address uncertainty, such as Bayesian theory.
Uncertainty increases the difficulty of architectural decisions, as we found out
in Chapter 3. For example, an architect might design a new software system,
and when deciding on a piece of technology, the architect might be confronted
with uncertainty regarding the future of that technology, or on how well the
technology satisfies scalability requirements. Therefore, addressing
uncertainty is important for architectural decisions. Answering RQ5 enables
researchers to understand the existing approaches for addressing uncertainty,
and propose improved approaches. Also, practitioners learn which papers help
them address uncertainty.
RQ6. What are the papers on group architectural decisions?
Rationale: Group decisions are an important topic in generic decision literature
(Eisenführ et al., 2010; Janis, 1989; Newell et al., 2007; Nutt and Wilson,
2010; Peterson, 2009), because many important decisions are made by groups,
rather than individuals. Group decisions entail different challenges compared
to individual decisions (e.g. ‘groupthink’ is a bias that occurs in groups, not in
individuals). The majority of architectural decisions are group decisions, rather
than individual ones, as we found out in Chapter 3. Therefore, answering RQ6
helps researchers understand and improve approaches for group architectural
decisions. In addition, practitioners can improve their group decision-making
skills by learning from existing approaches.
Next, we present the steps for answering these six research questions.
4.3 Data Collection
The study's search strategy must lead to the inclusion of relevant papers and
the exclusion of irrelevant ones. The search strategy involves querying
reference databases with customized search strings, followed by manual
filtering of the query results using predefined inclusion and exclusion
criteria. Three researchers were involved in executing the search strategy.
4.3.1 Source Selection and Search String
Since using only one reference database might miss some of the relevant
papers on architectural decisions, we queried six reference databases, all of
which index software engineering papers:
1. ACM Digital Library
2. IEEE Xplore
3. ScienceDirect
4. Scopus
5. SpringerLink
6. Web of Science
We searched for papers published between the 1st of January 2002 and the 1st
of January 2012. We chose 2002 as a starting date, because in our previous
work on architectural decisions, we observed that many papers on architectural
decisions refer to an influential position paper (Bosch, 2004) from 2004, and
very little work on architectural decisions existed before that. In addition,
the first conference dedicated to software architecture (the Working IEEE/IFIP
Conference on Software Architecture) took place in 1999. Therefore, papers
from 2002 onwards provide a comprehensive overview of existing work on
architectural decisions. We chose the 1st of January 2012 as the end date,
because this mapping study started in April 2012.
To evaluate the results of the queries on the reference databases, we developed
a quasi-gold standard, as recommended by (Zhang et al., 2011). The quasi-
gold standard is a manually collected set of relevant papers from a small
number of venues, which are well-known for publishing work on the relevant
topic. The results of the queries on the reference databases must include the
quasi-gold standard. Otherwise, the search strategy needs to be revised. The
quasi-gold standard for this study consisted of 40 papers (listed in the
Appendix) that we collected manually from three venues: the European
Conference on Software Architecture, the Working IEEE/IFIP Conference on
Software Architecture, and the Journal of Systems and Software. Working on
the quasi-gold standard helped us validate and refine the search string, as well
as the inclusion and exclusion criteria. Finally, all items in the quasi-gold
standard appeared in the results of the queries on the reference databases.
Starting from our research questions, we identified the keywords for the search
string. Furthermore, we refined the search string using the papers from the
quasi-gold standard. For example, in addition to ‘architectural decision’, we
added ‘architectural knowledge’, because many authors consider architectural
decisions as part of architectural knowledge. Moreover, we identified relevant
synonyms for the initial keywords, such as ‘design decision’. Also, we
considered variations of keywords, such as ‘architecture’ and ‘architectural’,
as well as singular and plural forms, such as ‘decision’ and ‘decisions’.
Finally, all the items in our search string (Table 4.1) are connected with OR
operators, to ensure that all relevant papers are retrieved.
Table 4.1. Search string. Similar items are on the same line.
Search string items
(‘architecture decision’ OR ‘architectural decision’
OR ‘architecture choice’ OR ‘architectural choice’
OR ‘architecture decisions’ OR ‘architectural decisions’
OR ‘architecture choices’ OR ‘architectural choices’
OR ‘architecture rationale’ OR ‘architectural rationale’
OR ‘architecture knowledge’ OR ‘architectural knowledge’
OR ‘design decision’ OR ‘design decisions’
OR ‘design choice’ OR ‘design choices’
OR ‘design rationale’
OR ‘design knowledge’)
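The variations in Table 4.1 can be generated systematically from the base keywords; a small sketch of that expansion (our own illustration of the construction, not a tool used in the study):

```python
from itertools import product

# Base keyword variations described above; plural forms are added only for
# 'decision' and 'choice', matching the items in Table 4.1.
prefixes = ["architecture", "architectural", "design"]
nouns = ["decision", "choice", "rationale", "knowledge"]

terms = []
for prefix, noun in product(prefixes, nouns):
    forms = [noun, noun + "s"] if noun in ("decision", "choice") else [noun]
    for form in forms:
        terms.append(f"'{prefix} {form}'")

# All items are connected with OR, as in Table 4.1.
search_string = "(" + " OR ".join(terms) + ")"
print(len(terms))  # 18 items, the same count as Table 4.1
```

The expansion makes the construction rules explicit: two levels of abstraction ('architecture'/'architectural' and 'design'), four nouns, and plural forms only where Table 4.1 includes them.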
In each reference database, we searched for the above string in the title,
abstract, and keywords fields. Depending on the options offered by each
reference database, we refined the results by the selected time interval and
by the relevant topic (e.g. software engineering).
4.3.2 Inclusion and Exclusion Criteria
We formulated inclusion and exclusion criteria for filtering papers. If a
paper met an exclusion criterion, it was removed from the study. A paper was
kept only if it met all inclusion criteria. When a researcher was not sure
about including or excluding a paper, the other researchers were asked to
discuss and decide.
The inclusion criteria are:
I1. The study refers to the software architecture of software-intensive
systems (e.g. not hardware or building architecture)
I2. The paper focuses on the topic of software architectural decisions
Regarding I2, a focus on architectural decisions means that the paper uses an
'architectural' perspective (or level of abstraction), rather than a more
generic one such as 'design'. Architectural decisions are a sub-category of
design decisions (Zimmermann, 2011); papers on generic design decisions are
excluded, because they are out of scope for this study. A paper is included
only if it focuses on 'early design decisions' (i.e. architectural design
decisions) rather than design decisions in general, as established through
discussions among the researchers.
Some papers mention architectural decisions only incidentally (e.g. a paper
presents a tool implementation and some decisions made in implementing the
tool, with a clear focus on the tool rather than on the decisions). Such
papers do not meet I2 (i.e. no focus on architectural decisions). Typical
examples of papers with a
focus on architectural decisions are papers on 1) documenting architectural
decisions (e.g. templates, viewpoints, or tool support), 2) making architectural
decisions (e.g. approaches or tool support for decision-making), or 3)
describing real-world architectural decisions (e.g. exploratory studies to
investigate how architectural decisions are made in industrial projects).
The exclusion criteria are:
E1. Papers in a language other than English
E2. Papers published before the 1st of January 2002 or after the 1st of
January 2012. Even though the search interval was defined in the search
scope, we defined this as an exclusion criterion because some databases
did not allow filtering the search results by time. Thus, this criterion
was applied to the results of database searches when no time filtering
option was available
E3. Gray literature, because of its unclear peer review process, as
recommended by (Kitchenham et al., 2010): editorials, extended
abstracts, tutorials, tool demos, doctoral symposium papers, research
abstracts, book chapters (other than proceedings), keynote talks,
workshop reports, and technical reports
E4. Secondary studies on architectural decisions, because such papers are
related work to this study
E5. Papers that focus on hardware topics, computer-aided non-software
design (e.g. industrial design), or other non-software design (e.g.
products or systems where the main product is not software)
E6. Papers from other venues than software engineering (e.g. physics, law)
E7. Papers that focus on specialized related topics with a distinct body
of work, such as component selection or software release decisions.
Although these topics can be considered architectural decisions, they
have a distinct and mature body of work (e.g. much literature exists
on component selection)
Two researchers reviewed the criteria for each study, using the title,
keywords, and abstract. If needed, the researchers also reviewed the paper
content. Differences between the researchers were discussed and reconciled,
sometimes involving a third researcher.
4.3.3 Search Process
We followed the search process in Figure 4.2. We ran the search string on
each of the six reference databases and obtained a different number of papers
from each database. We used EndNote to manage the references. The tool
allowed us to remove some duplicates from the initial set of 28,895
references automatically. We split the remaining 28,201 references into
batches of up to 3,000 references, for easier handling.
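Duplicate removal of this kind typically keys on a normalized title; a minimal sketch of the idea (our own illustration, not EndNote's actual matching algorithm):

```python
import re

def normalize(title: str) -> str:
    # Lowercase and collapse punctuation/whitespace, so near-identical
    # titles retrieved from different databases compare equal.
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def deduplicate(references: list) -> list:
    # Keep only the first reference seen for each normalized title.
    seen, unique = set(), []
    for ref in references:
        key = normalize(ref["title"])
        if key not in seen:
            seen.add(key)
            unique.append(ref)
    return unique

refs = [
    {"title": "On Architectural Decisions", "source": "Scopus"},
    {"title": "On architectural decisions.", "source": "IEEE Xplore"},
    {"title": "Another Paper", "source": "ACM Digital Library"},
]
print(len(deduplicate(refs)))  # 2
```

Automatic matching of this sort catches only near-identical entries, which is why a large number of duplicates (694 of 28,895 here) can be removed mechanically while the remaining filtering is manual.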
Next, two researchers filtered the articles in each batch by title, removing
articles whose titles were clearly out of scope. When not sure about keeping
a paper, we chose to keep it, to avoid the risk of filtering out relevant
papers. The researchers shared their experiences and discussed examples of
papers, to make sure that they had a common understanding of the filtering
process. Following the filter-by-title step, we obtained 2,283 references.
Two researchers then filtered the papers by reading the abstract and the
introduction section of each paper. If a researcher was not sure about
keeping a specific paper, the researcher also read the conclusion section. If
still not sure, the researcher kept the paper, deferring the decision to the
final step. This step resulted in 262 references.
Figure 4.2. Search process. The automatic searches returned 5,078 references
from ScienceDirect, 1,872 from the ACM Digital Library, 10,578 from IEEE
Xplore, 5,079 from Web of Science, 787 from SpringerLink, and 5,501 from
Scopus. Merging these results yielded 28,895 references; removing duplicates
left 28,201; filtering by title left 2,283; filtering by abstract and
introduction left 262; and filtering by content produced the final set of 144
references. In parallel, a manual search of three venues produced the
quasi-gold standard of 40 references, which we checked to be a subset of the
search results.
Finally, we read the full content of each paper to make the final decision
about it. When a researcher was not sure about keeping or removing a paper,
the other two researchers analyzed the paper and made a final decision.
During these discussions, the two researchers used a dialectical
decision-making approach, inspired by (Schweiger et al., 1986): initially,
each researcher argued either for keeping or for removing the paper, and a
consensus was then reached after debate. This approach helped us evaluate the
arguments for keeping or removing a paper systematically, and avoid the
decision-making bias of relying heavily on the initial impression of a paper.
Overall, this step produced the final set of 144 references.
4.4 Results
Two researchers extracted data from the 144 papers to answer each research
question. If the two researchers needed help with data extraction, a third
researcher was involved. Next, we used descriptive statistics and frequency
analysis to answer the research questions. We present an overview of the
papers and the answers to the six research questions from Section 4.2.1.3.
4.4.1 Overview of Selected Papers
We present an overview of the selected papers in terms of their empirical
evaluation approaches and publication venues/years.
4.4.1.1. Empirical Evaluation Approaches
We classified each paper into one of the following empirical evaluation
approaches: experiment, survey, case study, or example. Easterbrook et al.
(2008) consider experiments, surveys, and case studies to be the established
research methods most relevant to software engineering. By examples, we refer
to early validation attempts (such as using a hypothetical situation or a toy
example in the paper), which can be a preliminary step towards later applying
an established research method.
To classify papers, we checked if papers referenced established guidelines for
conducting experiments (i.e. (Jedlitschka et al., 2008; Wohlin et al., 2000)),
surveys (i.e. (Ciolkowski et al., 2003; Kitchenham and Pfleeger, 2003;
Kitchenham and Pfleeger, 2008; Pfleeger and Kitchenham, 2001)), or case
studies (i.e. (Brereton et al., 2008; Host and Runeson, 2007; Yin, 2003)). If a
paper did not reference any guideline, we could still classify it as an
experiment, survey, or case study, provided that the paper included important
elements from the guidelines. For example, a paper can include one or more of
the following elements: a protocol, research questions, hypotheses, validity
threats, and interpretation of results. If a paper lacked such elements, it was
classified as using an example for its empirical evaluation. Table 4.2
summarizes the number of papers in each category. The table also shows the
type of publication venue at which each paper was published.
Table 4.2. Overview of empirical evaluation approaches.
Examples Case studies Surveys Experiments Total
Journal 25 5 3 2 35
Conference 41 23 11 2 77
Workshop 25 3 2 2 32
Total 91 31 16 6 144
To zoom into the results shown in Table 4.2, we created the bubble chart in
Figure 4.3, which classifies all papers on the venue type and empirical
evaluation approach. Based on Table 4.2 and Figure 4.3, we notice the
following points:
[Figure 4.3 is a bubble chart: each paper ID is placed in a cell of
publication venue type (workshop, conference, journal) versus empirical
evaluation approach (example, case study, survey, experiment); the cell sizes
correspond to the counts in Table 4.2.]
Figure 4.3. Bubble chart with publication venue types and empirical evaluation
approaches, for each paper.
First, examples are the dominant evaluation approach: 63% of the
papers on architectural decisions use examples as their evaluation
approach. Examples are used by 78% of the workshop papers, which
is not surprising, since workshops tend to present work in progress;
the evaluation is then rather an illustration of a new approach. As
much as 71% of the journal papers rely on examples. One explanation
is that some papers are published in journals that target practitioners,
who are interested in the core findings rather than the details of a
thorough evaluation. This becomes clearer when we filter out the seven
papers from IEEE Software and the one paper from the IBM Systems
Journal (both journals are geared towards practitioners): the remaining
papers account for 50%, which is slightly less than the percentage for
conferences (i.e. 53%).
Second, experiments are by far the least used evaluation approach:
only 4% of all papers use experiments. Of the six papers with
experiments, only P106 reports an experiment conducted entirely with
architects (fourteen of them). Three of the papers (P45, P46, and P47)
report experiments with 25-50 students each. The remaining two papers
(P62 and P102) use a mix of students and architects, with 10
participants in total for P62 and 16 for P102.
Third, surveys account for 11% of all papers. Most surveys are
published at conferences, and only two surveys appear in workshops.
Overall, surveys have a distribution similar to case studies, but the
number of surveys is about half the number of case studies, which
account for 21% of all papers.
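The percentages quoted above follow directly from the counts in Table 4.2; a quick arithmetic check (the percentages are truncated to whole numbers, which matches the figures in the text):

```python
# Counts from Table 4.2 (venue type x empirical evaluation approach).
counts = {
    "journal":    {"example": 25, "case study": 5,  "survey": 3,  "experiment": 2},
    "conference": {"example": 41, "case study": 23, "survey": 11, "experiment": 2},
    "workshop":   {"example": 25, "case study": 3,  "survey": 2,  "experiment": 2},
}
total = sum(sum(row.values()) for row in counts.values())

def col(approach: str) -> int:
    # Total number of papers using a given evaluation approach.
    return sum(row[approach] for row in counts.values())

def pct(n: int, d: int) -> int:
    # Truncate to a whole percent.
    return int(100 * n / d)

print(total)                                    # 144
print(pct(col("example"), total))               # 63 (examples overall)
print(pct(counts["workshop"]["example"], 32))   # 78 (workshop examples)
print(pct(counts["journal"]["example"], 35))    # 71 (journal examples)
print(pct(col("experiment"), total))            # 4  (experiments)
print(pct(col("survey"), total))                # 11 (surveys)
print(pct(col("case study"), total))            # 21 (case studies)
```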
4.4.1.2. Publication Venues and Years
Table 4.3 shows the top ten most popular publication venues for papers on
architectural decisions. We notice that the top ten venues include around 57%
of the 144 selected papers. The SHARK workshop series, ECSA/WICSA
conferences, JSS and IEEE Software journals are the most popular venues for
publications on architectural decisions. In the Appendix, we present all venues
for the 144 selected papers. In total, there are 59 different venues: 13
workshops, 29 conferences, and 17 journals.
Table 4.3. Number of papers published in the top ten most popular publication
venues.
Venue Type No %
Workshop on SHAring and Reusing Architectural
Knowledge
Workshop 17 11.81
European Conference on Software Architecture Conference 16 11.11
Working IEEE/IFIP Conference on Software
Architecture
Conference 10 6.94
Journal of Systems and Software Journal 7 4.86
Working IEEE/IFIP Conference on Software
Architecture/European Conference on Software
Architecture
Conference 7 4.86
IEEE Software Journal 7 4.86
International Conference on Software Engineering Conference 6 4.17
Quality of Software Architectures Conference 5 3.47
Information and Software Technology Journal 4 2.78
Workshop on Traceability, Dependencies and
Software Architecture
Workshop 3 2.08
Total: 82 56.94
We found evidence of the community's increasing interest in the topic of
architectural decisions. Figure 4.4 shows the distribution of the 144 papers
over the ten-year interval 2002-2011. We plotted the number of papers for
each year, and added trend lines for the intervals 2002-2004 and 2005-2011.
We notice that between 2002 and 2004 there were very few papers on
architectural decisions; however, the number of papers has grown steadily
since 2005. Overall, we see that since 2005 research on architectural
decisions has gained considerable traction in the community.
Figure 4.4. Number of selected papers (vertical axis) over the publication years
(horizontal axis).
Next, we answer the six research questions.
4.4.2 RQ1 – Documenting Architectural Decisions
We checked whether each of the 144 papers refers to documentation of
architectural decisions, includes a process that produces such documentation,
or offers tool support for it. To answer RQ1, we provide the overview in Figure 4.5
and offer examples of papers for each sub-category.
Figure 4.5 shows that out of the 144 selected papers, 83% (or 120 papers)
propose some form of documentation approach for architectural decisions. The
remaining 24 papers do not refer to documentation, processes or tool support
for documentation of architectural decisions. Most of these 24 papers (i.e. 18)
are case studies or surveys conducted in industry, describing
various aspects of architectural decision-making. For example, P129 presents a
case study on real-world architectural decisions, and P9 presents a survey on
organizational factors that influence architectural decisions in data warehouse
projects.
Out of the 120 papers on documentation, 24 papers present no process and no
tool support for documenting architectural decisions. Instead, they focus on
other topics related to architectural decisions. For example, P19 explores ideas
for embedding rationales of decisions in architecting activities, and P60
focuses on business aspects of architectural decisions.
Seventy-six papers on documentation also propose a process that results in
documentation for architectural decisions. These 76 papers consist of 44
papers with no tool support, 26 papers with custom-made tool support, and
six papers with off-the-shelf tool support. For example, P2 proposes an
approach for facilitating the architectural decision-making process, but
without tool support. Also, P144 presents an approach that supports the
architectural decision-making process with a custom-made, wiki-based tool.
Finally, P111 proposes a process for assisting change impact analysis in
architectural analysis, using off-the-shelf tools such as a UML editor and a
Bayesian Belief Network tool.
Figure 4.5. High-level categorization of papers:
144 papers in total, of which 24 have no documentation, no process, and no
tool support (P9, P17, P33, P35, P38, P42, P48, P59, P61, P67, P85, P91, P93,
P97, P98, P109, P119, P120, P129, P130, P131, P132, P133, P141), and 120
concern documentation.
Of the 120 documentation papers, 24 propose no process and no tool support
(P12, P19, P27, P32, P39, P40, P41, P43, P49, P53, P54, P55, P60, P68, P70,
P71, P89, P103, P115, P118, P125, P126, P139, P142).
76 documentation papers also propose a process: 44 without tool support (P2,
P5, P13, P14, P15, P16, P28, P29, P30, P34, P45, P46, P47, P50, P51, P52,
P57, P63, P64, P66, P73, P78, P82, P84, P90, P94, P96, P99, P100, P105, P106,
P107, P108, P110, P112, P113, P117, P121, P124, P128, P134, P135, P138,
P140), 26 with custom-made tools (P4, P6, P18, P22, P26, P62, P74, P76, P77,
P79, P80, P81, P83, P86, P87, P88, P92, P95, P104, P114, P116, P122, P123,
P127, P137, P144), and 6 with off-the-shelf tools (P3, P8, P36, P58, P101,
P111).
52 documentation papers provide tool support; besides the 32 above, 16
provide custom tools without a process (P7, P10, P11, P20, P21, P23, P24,
P25, P31, P37, P56, P65, P69, P75, P136, P143) and 4 provide off-the-shelf
tools without a process (P1, P44, P72, P102).
As visible in Figure 4.5, 52 out of the 120 papers include tool support for
documenting architectural decisions. From these 52 papers, 32 papers (26
custom tools and 6 off-the-shelf tools) overlap with the category of papers that
also include a process. The remaining 20 papers use custom (16 papers) and
off-the-shelf (4 papers) tools for documenting architectural decisions. For
example, P7 proposes an ontology for documenting architectural decisions,
and custom tool support for it. Also, P102 presents an experiment for
visualization of architectural decisions using the off-the-shelf, open source tool
Compendium. Regarding open-source tool support, four papers (i.e. P1, P44,
P101, P102) use off-the-shelf open-source tools (i.e. Protégé in P1, OSATE in
P44, Compendium in P101 and P102), and one paper (i.e. P136) presents a
custom open-source tool (i.e. Frag). Overall, many papers include tool
support; most of the tools are custom made rather than off-the-shelf, and
very few papers include open-source tool support.
4.4.3 RQ2 – Functional Requirements and Quality Attributes
Architectural decisions must help satisfy functional requirements and achieve
quality attributes for a software system. We investigated whether the
collected papers mention addressing functional requirements or quality
attributes. Moreover, we investigated whether the collected papers focus on
specific functional requirements or specific quality attributes.
We mapped the papers by analyzing them for mentions of functional
requirements and/or quality attributes. If we could not find such explicit
mention, then we marked the paper as having an unclear treatment of
functional requirements and/or quality attributes. Thus, each paper was
classified in one of the four possible states, as follows:
S1. Unclear functional requirements, and explicit quality attributes
S2. Explicit functional requirements, and unclear quality attributes
S3. Unclear functional requirements, and unclear quality attributes
S4. Explicit functional requirements, and explicit quality attributes
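The four states amount to a simple mapping from the two per-paper judgments; expressed as a sketch (the function name and boolean encoding are our own, for illustration):

```python
def classify(explicit_fr: bool, explicit_qa: bool) -> str:
    # Map explicit vs. unclear treatment of functional requirements (FR)
    # and quality attributes (QA) to one of the four states S1-S4.
    if not explicit_fr and explicit_qa:
        return "S1"
    if explicit_fr and not explicit_qa:
        return "S2"
    if not explicit_fr and not explicit_qa:
        return "S3"
    return "S4"

print(classify(False, True))   # S1
print(classify(True, True))    # S4
```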
Figure 4.6 summarizes the results. Out of the 144 selected papers, thirteen
papers have unclear treatment of functional requirements but discuss quality
attributes (i.e. S1), ten have unclear treatment of quality attributes but discuss
functional requirements (i.e. S2), and seven have unclear treatment of both
(i.e. S3). For example, P42 is a conceptual paper that compares architectural
decisions with design decisions, and addressing functional requirements and
quality attributes is implicit in the paper. The remaining 114 papers (i.e. the
144 collected papers minus the 30 papers in Figure 4.6) clearly address both
functional requirements and quality attributes (i.e. S4) and are not shown in
Figure 4.6.
Figure 4.6. Papers with unclear treatment of functional requirements and
quality attributes. S1 (unclear functional requirements, explicit quality
attributes): P4, P6, P13, P60, P61, P81, P82, P107, P108, P112, P113, P126,
P137. S2 (explicit functional requirements, unclear quality attributes): P9,
P30, P31, P34, P47, P56, P97, P127, P128, P129. S3 (unclear treatment of
both): P42, P68, P69, P130, P131, P132, P133.
Out of the 124 papers that explicitly treat functional requirements, we found
no specific functional requirements recurring across papers, since such
requirements vary greatly with the actual software architecture project.
Out of the 127 papers that explicitly treat quality attributes, seven papers focus
on specific quality attributes. P8 refers to achieving both security and
reliability, through a decision-making framework. The other six papers focus
on only one quality attribute, as follows:
Achieving usability is the focus of P103 and P104. Both publications
refer to decisions on usability patterns (e.g. fixed or requested
menus).
Achieving reliability is the focus of P82, through a decision-making
framework for achieving reliability.
Achieving scalability is the focus of P58, through goal-oriented
analysis and simulation of different decision scenarios.
Achieving evolvability is the focus of P16, through a quality model
for evaluating decisions that may affect evolvability, such as the
choice of architectural patterns.
Achieving safety is proposed by P124 through a framework for
eliciting and formulating negative requirements (e.g. unplanned or
unwanted events).
Overall, most papers explicitly address both functional requirements and
quality attributes.
4.4.4 RQ3 – Domain-specific Architectural Decisions
Out of the 144 collected papers, 22% of the papers (or 32 papers) refer to
domain-specific architectural decisions. The other papers refer to architectural
decisions in general. Figure 4.7 presents the domains and the IDs of the
papers. Some domains refer to specific application domains (e.g.
telecommunication, web applications, healthcare, defence), while other
domains refer to generic domains (e.g. SOA, enterprise architecture, software
product lines). We notice that the domains are not strictly orthogonal; for
example, a paper from the healthcare domain can also relate to SOA. However,
due to the low number of papers, we did not separate the domains further.
Only three domains have three or more papers: software product lines,
enterprise architecture, and service-oriented architecture (SOA). We briefly
discuss the papers in each of these three domains.
Figure 4.7. The bar chart shows the decision domains and the papers for each
domain.
Service-oriented architecture domain:
P49 presents an approach to support service design decisions.
P52 proposes modeling SOA process decisions.
P53 presents a template for documenting SOA decisions.
P54 proposes a set of factors that affect SOA decision-making, such as
types of service consumers, and perspectives of service providers and
consumers.
P93 analyzes in-depth the selection between two types of services (i.e.
RESTful and WSDL/SOAP web services).
P116 proposes a decision model for SOA projects.
P140 presents a decision-modeling framework for SOA systems.
P141 describes the rationales for various SOA architectural decisions
from a real-world project.
P142 proposes a design approach for decisions on SOA transactional
workflows.
Enterprise architecture domain:
P2 and P4 consider the architectural design as a search problem, and
propose approaches for searching the design space.
P9 investigates organizational factors (e.g. resource constraints,
perceived skills of IT staff) for decisions on enterprise data
warehouses.
P85 presents an approach for collaborative decision-making for the
design of enterprise architecture.
P94 presents a process model for architectural decisions management.
P97 reports the role of enterprise architecture in group decisions in e-
Government.
P143 proposes a conceptual framework for collaborative decision-
making, identifying and enforcing decisions in enterprise
architectures.
Software product lines domain:
P20 presents tool support for capturing architectural decisions for
software product lines.
P103 discusses decisions on usability patterns for product lines.
P114 uses ideas from software product lines to increase reusability of
documented architectural decisions.
4.4.5 RQ4 – Descriptive and Normative Papers
We classified each paper as either normative or descriptive. We identified 20
descriptive papers, representing 13.8% of all collected papers: P9, P32, P33,
P35, P43, P48, P59, P61, P91, P97, P98, P117, P119, P120, P129, P130, P131,
P132, P133, and P141. The remaining 124 papers are normative.
The number of descriptive papers is much lower than the number of normative
papers. However, descriptive papers are very important for understanding
real-world architectural decisions. Furthermore, descriptive papers can
inform more targeted approaches in normative papers. For example, a
descriptive paper can present real-world challenges (such as how to increase
consensus in group architectural decisions), and such challenges can then be
addressed by normative papers that propose approaches for group architectural
decisions. If few descriptive papers exist, then researchers have little
insight into real-world architectural decision-making, and risk focusing on
only a subset of challenges. Given the importance of descriptive papers, we
need to understand to what extent these papers cover real-world architectural
decisions and decision-making.
To understand the descriptive papers, we characterize them using four
factors: the number of decisions, the time spent making decisions, the number
of participants, and the classes of decisions. These factors help researchers
understand the real-world decisions reported in the literature, and plan
future research to cover existing gaps. The factors are not used to assess
the quality of descriptive papers. Next, we present each factor in detail,
together with its rationale, and characterize the 20 descriptive papers
accordingly.
Number of described architectural decisions: This factor indicates the
breadth of a paper, with regard to the paper’s contribution on describing
architectural decisions: a paper that describes many decisions has a broad
contribution. By describing many architectural decisions, researchers can
synthesize findings based on multiple decisions. Overall, researchers benefit
from papers that state explicitly the number of described decisions.
Out of the 20 descriptive publications that we identified, sixteen refer to an
unclear number of architectural decisions. The remaining four publications
refer to one (P9), fifteen (P33), twenty-eight (P132), and eighty (P43)
architectural decisions from real-world projects.
Time spent for making architectural decisions: We chose this factor
because saving time for architects is critical, due to their busy schedules. To
propose timesaving approaches for architects, researchers need to understand
how architects spend time making architectural decisions. Papers that
describe decisions over a longer period might reveal additional challenges of
real-world decisions, since more insights can be collected over time.
Out of the 20 descriptive papers, only P33 refers to the time spent for making
architectural decisions (i.e. the number of minutes for each architectural
decision), in the context of a study on observing and analyzing the design
process of practitioners.
Number of participants: This factor applies to both types of papers. For
descriptive papers, 'participants' refers to the number of persons involved in
the decisions described in the paper; for normative papers, it refers to the
number of persons who participated in an empirical evaluation. This factor is
important because more participants indicate stronger empirical evidence, for
both descriptive and normative papers.
Table 4.5 in Section 4.5.1.5 (where the results are discussed further)
summarizes the results: four papers do not state clearly the number of
participants involved in the described decisions. In the other papers, the
number of participants varies from three to 436 (see Table 4.5). All
participants are from industry, except for P119.
Classes of architectural decisions: We chose this factor because
understanding the kind of decisions described in the literature helps
researchers understand which classes of decisions need further attention,
including better descriptions of the actual decisions. According to Kruchten
(2004), architectural decisions can be classified into three classes:
- Existence decisions, which indicate the existence of some artifact (e.g. the
system will have three layers).
- Property decisions, which state an enduring trait of the system (e.g. the
system will use open-source libraries).
- Executive decisions, which concern processes, tools, or technologies (e.g.
the system will use Java).
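The three classes can be sketched as a simple tagging scheme. Below is a minimal, hypothetical Python sketch: the class names follow Kruchten (2004), but the helper function and the example decisions are ours, for illustration only.

```python
from enum import Enum

class DecisionClass(Enum):
    """Kruchten's (2004) three classes of architectural decisions."""
    EXISTENCE = "existence"   # some artifact will exist, e.g. three layers
    PROPERTY = "property"     # an enduring trait, e.g. open-source libraries only
    EXECUTIVE = "executive"   # a process/tool/technology choice, e.g. Java

# Hypothetical decisions from a project, tagged with their class.
decisions = [
    ("the system will have three layers", DecisionClass.EXISTENCE),
    ("the system will use open-source libraries", DecisionClass.PROPERTY),
    ("the system will use Java", DecisionClass.EXECUTIVE),
]

def count_by_class(tagged):
    """Count how many tagged decisions fall into each class."""
    counts = {cls: 0 for cls in DecisionClass}
    for _, cls in tagged:
        counts[cls] += 1
    return counts
```

Such a tagging scheme is what the fourth factor amounts to in practice: each described decision is assigned to exactly one of the three classes, or marked as unclear.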
Out of the 20 descriptive papers, two papers (P32 and P98) describe existence
decisions. In addition, only P9 has an executive decision (i.e. data warehouse
selection). No paper describes property decisions. However, four papers (P35,
P117, P129, and P141) describe multiple decisions, from all classes of
decisions. The remaining thirteen papers describe unclear classes of decisions.
The results for descriptive and normative papers are presented in Section
4.5.1.5.
4.4.6 RQ5 - Addressing Uncertainty in Architectural Decisions
Out of the 144 papers, only nine papers (or 6%) address uncertainty in
architectural decision-making. Given the low number, we summarize below
the approaches in the papers.
- P5 proposes a process for identifying risks in architectural decisions, and
the use of Bayesian networks to quantify and manage the risks.
- P111 proposes quantifying the relationship between architectural decisions
and design artefacts using Bayesian belief networks.
- P36 analyzes 'build vs. buy' architectural decisions, and uses probabilities
to check reliability constraints for the decisions.
- P46 proposes an approach for documenting the rationales of architectural
decisions, which includes documenting the occurrence probabilities of the
various scenarios for decisions.
- P61 mentions the need for probabilities of market changes and technology
risks.
- P77 treats architectural decisions as a multi-attribute decision problem,
using the occurrence probabilities of possible combinations of alternatives.
- P82 uses probabilities for making decisions that result in more reliable
software systems.
- P96 proposes an approach for evaluating design alternatives that uses
probabilities to evaluate the consequences of the alternatives.
- P109 discusses the influence of cognitive biases on evaluating probabilities
that influence architectural decisions.
4.4.7 RQ6 - Group Architectural Decisions
We found that 22 papers (15%) of the selected papers refer to group
architectural decision-making. We summarize below the approaches in these
papers.
- P3 uses the Analytic Hierarchy Process with three stakeholders who evaluate
the importance of various quality attributes for an architectural decision.
- P32 describes the importance of drawings for focusing group discussions on
various decisions.
- P33 analyzes the decision-making dynamics of three teams of two
practitioners each.
- P43 describes consensus for decisions made by the architecting team at
Volvo Cars.
- P45 presents an experiment on team architectural decision-making.
- P48 and P59 present a survey of architects portraying them as lonesome,
rather than team, decision makers.
- P53 presents a template for documenting SOA architectural decisions. The
template was used by teams of three students to document their SOA decisions.
- P66 proposes an extension to the CBAM framework (Kazman and Klein, 2001)
that considers explicitly stakeholders' preferences in group decision-making.
- P83 proposes a traceability framework for group decisions, based on
integrating knowledge from various sources.
- P84 describes a framework for quantifying economically the value of
architectural design decisions.
- P85 proposes an approach for supporting collaborative decision-making for
enterprise architectures.
- P97 describes an instance of group architectural decision-making in a
Finnish e-Government project.
- P101 analyzes the group collaboration features of various tools for
capturing architectural decisions.
- P103 mentions involving team members in architectural decisions.
- P104 describes an approach for group decision-making that includes a
facilitator.
- P106 describes an industry study on consensus in group decision-making.
- P129, P130, P132, and P133 describe the impact of group interactions on
decision-making.
- P143 presents a framework that facilitates group decision-making.
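To give a concrete flavor of the kind of technique P3 applies, the Analytic Hierarchy Process derives weights for quality attributes from stakeholders' pairwise comparisons. The sketch below uses the common column-normalization approximation of AHP priorities; the attributes and comparison values are hypothetical, not taken from P3.

```python
# Pairwise comparison matrix on Saaty's 1-9 scale for three quality
# attributes: performance, modifiability, security (illustrative values).
# matrix[i][j] states how much more important attribute i is than attribute j.
matrix = [
    [1.0, 3.0, 5.0],   # performance
    [1/3, 1.0, 2.0],   # modifiability
    [1/5, 1/2, 1.0],   # security
]

def ahp_weights(m):
    """Approximate AHP priority vector: normalize each column so it sums
    to 1, then average across each row."""
    n = len(m)
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [sum(m[i][j] / col_sums[j] for j in range(n)) / n
            for i in range(n)]

weights = ahp_weights(matrix)  # weights sum to 1; performance ranks highest
```

The resulting weight vector can then be used to score decision alternatives against the weighted quality attributes, which is the role AHP plays in P3.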
4.5 Discussion
In this section, we present analyses and syntheses of results for all research
questions. Afterwards, we present implications for researchers and
practitioners, and discuss validity threats.
4.5.1 Analysis and Synthesis of Results
4.5.1.1. Empirical Evaluation Approaches, Publication Venues and Years
To understand further the empirical evaluation approaches in papers on
architectural decisions, we present in Figure 4.8 two charts with the evolution
over time of systematic evaluations and examples, for conferences and journals.
Systematic evaluations are especially relevant for conferences and journals,
but matter less for workshops, since workshops include early ideas for which
systematic evaluations might not yet be feasible.
The left chart shows the number of studies that use examples, and the number
of systematic studies (i.e. the total number of experiments, surveys, and case
studies) for conferences and journals. The right chart shows the ratio of the
number of systematic studies to the number of all studies (e.g. in 2003 the
ratio is one, as the only paper for 2003 had a systematic evaluation). We notice
that the ratio was particularly low in 2009, when four systematic studies and
19 papers with examples were published.
Figure 4.8. Number of conference and journal papers with systematic evaluations
(case studies, experiments, surveys) and examples (left chart), and their ratio
(right chart).
We make several observations on the charts in Figure 4.8. First, similar to the
trend in Figure 4.4, the numbers of conference and journal papers are very low
(i.e. less than three) for 2002, 2003, and 2004, which explains the extreme
values for their corresponding ratios. Second, for 2005 and 2006, we notice
that about a third of all conference and journal papers had systematic
evaluations. Third, since 2007 about half of all conference and journal papers
had systematic evaluations. The exception is 2009, for which the ratio is much
lower, due mostly to only four papers with systematic evaluations, against six
papers with examples at the joint WICSA/ECSA conference, three papers with
examples at the Journal of Systems and Software, and two papers at IEEE
Software.
We report basic statistics on the citation counts of the 144 selected papers:
the median citation count was six, and the mode was zero (i.e. 17 papers had
no citations).
Since highly cited papers have more influence, we present in Table 4.4 the top
10% (i.e. 14) most cited of the 144 selected papers, with their empirical
evaluation, venue type, publication year, citation count, and average citation
count (per year since publication), sorted by average citation count.
The citation count indicates the impact of a paper in the community. We
gathered the citation count of each of the 144 selected papers from Google
Scholar (including self-citations) at the start of April 2013. Since papers may
accumulate citations over the years, we calculate the average citation count by
dividing the number of citations by the number of years elapsed since
publication.
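For concreteness, the metric can be computed as follows; the helper name is ours, but the arithmetic is exactly the definition above, checked against the P93 row of Table 4.4 (440 citations, published 2008, counted up to 2013).

```python
def average_citation_count(citations, publication_year, collection_year=2013):
    """Citations per elapsed year, counted up to the year the citation
    data was gathered (April 2013 for this study)."""
    return citations / (collection_year - publication_year)

# P93 (published 2008) had 440 citations in April 2013: 440 / 5 = 88,
# which matches the first row of Table 4.4.
p93_average = average_citation_count(440, 2008)
```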
Table 4.4 shows some interesting results. Regarding the citation count, P93
has the highest number of citations, with most citations from papers on
service-oriented systems. Regarding publication year, we notice three papers
(i.e. P17, P70, and P118) published during 2002 – 2004, which influenced
much subsequent work on architectural decisions.
Table 4.4. Overview of the top 10% most cited papers on architectural decisions.

Paper ID | Empirical evaluation | Venue type | Year | Citation count | Average citation count
P93  | Example    | Conference | 2008 | 440 | 88
P115 | Example    | Journal    | 2005 | 281 | 35.13
P63  | Example    | Conference | 2005 | 264 | 33
P72  | Example    | Conference | 2006 | 196 | 28
P17  | Example    | Workshop   | 2004 | 214 | 23.78
P70  | Example    | Workshop   | 2004 | 158 | 17.56
P32  | Survey     | Conference | 2007 | 102 | 17
P65  | Example    | Conference | 2007 | 95  | 15.83
P71  | Example    | Journal    | 2009 | 55  | 13.75
P118 | Example    | Journal    | 2002 | 148 | 13.45
P55  | Example    | Journal    | 2007 | 80  | 13.33
P10  | Example    | Workshop   | 2007 | 79  | 13.17
P141 | Case study | Conference | 2005 | 94  | 11.75
P116 | Example    | Journal    | 2009 | 38  | 9.5
4.5.1.2. Documenting Architectural Decisions
The key facts are that 83% of all papers on architectural decisions propose
some form of documentation approach, and that 53% of all papers propose
processes that result in documentation of architectural decisions. These facts
indicate that documenting architectural decisions is a well-covered research
topic.
Given the large number of studies on documenting decisions, we conclude that
there is a need to consolidate research on documenting architectural decisions.
This includes obtaining evidence on the real-world benefits of documenting
architectural decisions. Such evidence can be obtained from success stories
from early adopters of documentation approaches. The core benefit of
documenting architectural decisions is reduced knowledge vaporization and,
therefore, reduced maintenance costs of software systems (as in (Bosch, 2004)).
The cost reduction takes place by capturing, sharing, and reusing architectural
knowledge, thus reducing the time new developers need to familiarize themselves
with an existing software system, and potentially increasing the quality of the
software system. However, documenting decisions carries a cost of its own.
Therefore, the main goal of consolidating research on documenting architectural
decisions is to gain insights into the right amount of documentation: the
amount that offers the most benefits at an acceptable cost.
Regarding tool support for documenting architectural decisions, we notice that
very few tools are open-source and easily accessible to practitioners (i.e.
four off-the-shelf open-source tools, and one custom open-source tool). We
consider that more open-source tools would benefit the documentation of
architectural decisions, since they would facilitate the adoption of
documentation approaches by practitioners. These results encourage efforts to
develop open-source tools for documenting architectural decisions.
4.5.1.3. Functional Requirements and Quality Attributes
The results in Section 4.4.3 suggest that papers on architectural decisions have
considered functional requirements and quality attributes. We think these
trends are due to the wide acceptance of the concepts of functional
requirements and quality attributes. For the future, we expect this acceptance
to persist.
Despite the importance of quality attributes, we observe a low number of papers
addressing specific quality attributes, such as scalability and reliability.
Satisfying a specific quality attribute requires dedicated approaches (e.g.
patterns, tactics, or processes), which are better targeted than generic
approaches for making architectural decisions. For example, P58 indicates the
use of simulations to compare scalability under various workloads and
processing-resource configurations (such as adding more, or more powerful,
processing resources). As a future direction, we encourage more work on
approaches for satisfying specific quality attributes.
Regarding papers on satisfying functional requirements: although such
requirements vary greatly across projects, we think there is much potential
for distilling architectural decisions that can be reused to satisfy common
functional requirements across software projects and products. For example,
architectural decisions for some functional requirements (e.g. from open-source
projects) can be reused by other architects. An encouraging piece of work in
this direction is (Brown and Wilson, 2012), which presents the architectures
of various popular open-source projects. Researchers can collect reusable
decisions from these projects in decision repositories, and practitioners can
reuse such decisions in their own architectural decision-making.
4.5.1.4. Domain-specific Architectural Decisions
The results in Section 4.4.4 show an interesting trend: domain-specific
architectural decisions refer mostly to the service-oriented and enterprise
domains. These domains are well-established, in the sense that they have
attracted considerable research and practice. In contrast, other domains need
more attention.
Mobile computing is a domain with huge recent adoption, as part of the post-PC
era (i.e. the continuous decline of personal computer sales in favor of mobile
devices, such as tablets and smartphones). Much development effort goes into
creating mobile applications (i.e. apps) for mobile operating systems (e.g. iOS,
Android). Architecting apps involves making mobile-specific architectural
decisions (e.g. to balance local and remote processing). Research on mobile-
specific architectural decisions is missing from our results. Still, such research
has much potential to help practitioners architect apps. Similar to mobile
computing, we consider that other neglected domains are cloud computing and
the Internet of Things.
4.5.1.5. Descriptive and Normative Papers
Descriptive papers help researchers understand real-world architectural
decisions and decision-making. Furthermore, descriptive papers help propose
better approaches in normative papers. In Table 4.5, we present the 20
descriptive papers, which consist of five workshop papers, eleven conference
papers, and four journal papers. These numbers suggest that conferences are
the most popular venue type for publishing descriptive papers.
Table 4.5. Summary of descriptive papers, sorted by average citation count.

Paper ID | Empirical evaluation | Number of participants | Venue type | Year | Average citation count
P32  | Survey     | 436     | Conference | 2007 | 17
P141 | Case study | Unclear | Conference | 2005 | 11.75
P9   | Survey     | 420     | Journal    | 2010 | 6.33
P129 | Case study | 25      | Journal    | 2007 | 5.83
P35  | Survey     | 107     | Conference | 2007 | 4.5
P120 | Survey     | 53      | Conference | 2011 | 3.5
P48  | Survey     | 142     | Conference | 2009 | 3
P59  | Survey     | 142     | Journal    | 2011 | 3
P33  | Case study | 6       | Journal    | 2010 | 2
P61  | Survey     | 21      | Workshop   | 2010 | 2
P130 | Case study | 27      | Workshop   | 2005 | 1.75
P119 | Survey     | 22      | Conference | 2010 | 1.33
P43  | Case study | Unclear | Conference | 2010 | 1
P91  | Survey     | 9       | Workshop   | 2010 | 1
P132 | Case study | 3       | Conference | 2007 | 1
P117 | Case study | Unclear | Conference | 2009 | 0.75
P131 | Case study | 12      | Conference | 2006 | 0.71
P133 | Case study | 3       | Workshop   | 2007 | 0.17
P97  | Case study | 16      | Workshop   | 2011 | 0
P98  | Case study | Unclear | Conference | 2010 | 0
Regarding empirical evaluation, we notice that the descriptive papers consist
of eleven case studies and nine surveys. The number of participants in the case
studies and surveys varies from three to 436 (see Table 4.5). We observed a
high correlation coefficient (i.e. 82.3%) between the number of participants in
a descriptive paper (one of the factors in Section 4.4.5) and the paper's
average citation count. This suggests that descriptive studies with more
participants receive more citations.
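The reported coefficient can be reproduced from the numeric rows of Table 4.5 (excluding the four papers with an unclear number of participants); a minimal Pearson-correlation sketch:

```python
from math import sqrt

# Participants and average citation counts from Table 4.5, in row order,
# skipping the four papers with an unclear number of participants.
participants = [436, 420, 25, 107, 53, 142, 142, 6, 21, 27, 22, 9, 3, 12, 3, 16]
citations = [17, 6.33, 5.83, 4.5, 3.5, 3, 3, 2, 2, 1.75, 1.33, 1, 1, 0.71, 0.17, 0]

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson(participants, citations)  # approximately 0.82
```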
Based on the results for the other three factors in Section 4.4.5, we propose
several recommendations for researchers who plan to conduct descriptive work
on architectural decisions:
- Since most papers refer to an unclear number of decisions, we recommend more
clarity on the number of decisions in descriptive papers. This would allow
researchers to synthesize and compare findings based on more decisions, and
therefore increase the validity of studies.
- Since only one paper discusses the time spent on making architectural
decisions, we recommend paying more attention to understanding the real-world
time effort of making architectural decisions. We have already collected some
data on this in Chapter 3, but more work is needed: understanding the required
time effort is a necessary step for researchers to propose approaches that
help architects reduce that effort, and to perform a cost-benefit analysis of
systematic architectural decision-making.
- Most papers refer to unclear classes of decisions. We recommend more clarity
on the classes of decisions in descriptive work, because different classes of
decisions may be treated differently, require different effort, and so on. In
addition, we recommend more descriptions of property decisions, since no paper
describes them. We consider insights about property decisions particularly
useful for practitioners, as property decisions include design rules, design
guidelines, and design constraints, which influence many elements of a
software system (Kruchten, 2004).
Furthermore, as an example of descriptive work from a related field, during
the decision literature survey (Section 4.2.1.2) we found a thorough piece of
work in the field of strategic organizational decisions: researchers analyzed
hundreds of interviews on 150 real-world decisions in 30 organizations (Nutt
and Wilson, 2010). This descriptive study offered in-depth insights into
real-world decision-making in various organizations (e.g. identifying and
explaining various types of real-world decision-making processes (Nutt and
Wilson, 2010)). These insights led to over 20 publications on organizational
decisions, including improved approaches for organizational decisions (Nutt
and Wilson, 2010), such as a list of key factors that influence the success of
implementing decisions (Miller, 1997). Similarly, in-depth research on
real-world architectural decisions would be very valuable.
Normative papers propose various approaches on architectural decisions. We
notice that the 22 normative papers in Table 4.6 validate their proposed
approaches using practitioners and students, who were asked to use the
approaches on realistic architectural decisions. These normative papers offer
empirical evidence for their proposed approaches, so that practitioners get a
better idea of their value. We found a weak correlation between the number of
participants in normative papers and their average citation count.
Table 4.6. Summary of normative papers with validations, sorted by average
citation count.

Paper ID | Empirical evaluation | Number of participants involved in empirical evaluation | Venue type | Year | Average citation count
P3   | Case study | 3   | Conference | 2005 | 8.88
P62  | Experiment | 16  | Journal    | 2009 | 7.75
P94  | Case study | 3   | Conference | 2006 | 7.29
P45  | Experiment | 50  | Conference | 2006 | 3.29
P74  | Survey     | 3   | Conference | 2008 | 3.2
P107 | Survey     | 8   | Journal    | 2005 | 3.13
P46  | Experiment | 50  | Conference | 2008 | 3
P47  | Experiment | 25  | Workshop   | 2008 | 2.4
P66  | Example    | 4   | Journal    | 2005 | 2
P123 | Survey     | 3   | Conference | 2010 | 2
P85  | Survey     | 70  | Conference | 2010 | 1.67
P106 | Experiment | 14  | Journal    | 2004 | 1.67
P23  | Case study | 45  | Conference | 2008 | 1.4
P112 | Survey     | 7   | Conference | 2011 | 1
P21  | Case study | 45  | Journal    | 2010 | 0.67
P31  | Case study | 8   | Conference | 2010 | 0.67
P2   | Case study | 1   | Conference | 2005 | 0.63
P6   | Example    | 4   | Workshop   | 2009 | 0.5
P5   | Case study | 17  | Journal    | 2009 | 0
P53  | Case study | 105 | Conference | 2010 | 0
P102 | Experiment | 10  | Workshop   | 2011 | 0
P113 | Survey     | 20  | Conference | 2011 | 0
The remaining 102 normative papers propose approaches on architectural
decisions, mostly for documenting architectural decisions. The answer to RQ1
and the corresponding discussion present in detail the documentation
approaches in the normative papers.
4.5.1.6. Uncertainty in Architectural Decisions
We synthesize the nine papers (see Section 4.4.6) by identifying their core
contributions towards addressing uncertainty in architectural decisions, and
grouping the papers by how they address uncertainty. Uncertainty is made
explicit through the use of probabilities. We identified two main categories of
papers:
Basic addressing of uncertainty. Four papers (P46, P61, P82, and P109)
recognize the importance of probabilities (i.e. of specific scenarios to occur),
but do not propose concrete approaches for addressing uncertainty in
architectural decisions.
Advanced addressing of uncertainty. Two papers (P77 and P96) use
decision theory for making architectural decisions. In both papers,
probabilities are considered first-class entities in decision-making that help
evaluate systematically the consequences of each decision alternative. Three
other papers (P5, P36, and P111) also regard probabilities as first-class entities
in decision-making. In addition, these papers use Bayesian theory for impact
analysis of each decision alternative.
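To illustrate how probabilities act as first-class entities in such decision-theoretic approaches, here is a generic expected-utility sketch. It is not the specific method of P77 or P96; the alternatives, scenario probabilities, and utility values are invented for illustration.

```python
def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one alternative."""
    assert abs(sum(p for p, _ in outcomes) - 1.0) < 1e-9  # probabilities sum to 1
    return sum(p * u for p, u in outcomes)

# Two hypothetical alternatives for a 'build vs. buy' decision, each with
# possible scenarios, their occurrence probabilities, and utilities.
alternatives = {
    "build in-house":    [(0.6, 80), (0.4, 20)],  # succeeds vs. cost overrun
    "buy off-the-shelf": [(0.9, 60), (0.1, 30)],  # fits vs. needs workarounds
}

# Choose the alternative with the highest expected utility.
best = max(alternatives, key=lambda name: expected_utility(alternatives[name]))
```

Here the lower-payoff but lower-risk alternative wins, which is exactly the kind of trade-off that stays invisible when probabilities are not made explicit.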
Overall, although there is some work on addressing uncertainty in architectural
decision-making, we consider the topic under-represented, given its practical
relevance. For example, we did not find work on how architects can improve
their estimation of probabilities, so that they can better address uncertainty.
As a future trend, we consider that approaches for decision documentation
should help architects quantify uncertainties, so that they can show
stakeholders how uncertainty levels influence the architectural decisions.
4.5.1.7. Group Architectural Decisions
Table 4.7 summarizes the 22 papers (i.e. eleven conference, seven journal, and
only four workshop papers) on group decisions. Regarding empirical evaluation,
four papers use surveys, eleven use case studies, two use experiments, and five
use examples. Similar to the papers in Table 4.5, we observe a high correlation
(75%) between the number of participants in the empirical evaluations and the
average citation count of the papers. This suggests a link between using more
participants in a paper's empirical evaluation and the paper's subsequent
impact.
Table 4.7. Summary of papers on group decisions, sorted by average citation
count.

ID | Empirical evaluation | Descriptive | Number of participants involved in empirical evaluation | Venue type | Year | Average citation count
P32  | Survey     | Yes | 436     | Conference | 2007 | 17
P3   | Case study | No  | 3       | Conference | 2005 | 8.88
P143 | Example    | No  | NA      | Conference | 2007 | 8.5
P129 | Case study | Yes | 25      | Journal    | 2007 | 5.83
P84  | Case study | No  | Unclear | Conference | 2003 | 3.8
P45  | Experiment | No  | 50      | Conference | 2006 | 3.29
P48  | Survey     | Yes | 142     | Conference | 2009 | 3
P59  | Survey     | Yes | 142     | Journal    | 2011 | 3
P83  | Case study | No  | Unclear | Journal    | 2007 | 2.83
P101 | Example    | No  | NA      | Workshop   | 2010 | 2.67
P33  | Case study | Yes | 6       | Journal    | 2010 | 2
P66  | Example    | No  | 4       | Journal    | 2005 | 2
P130 | Case study | Yes | 27      | Workshop   | 2005 | 1.75
P85  | Survey     | No  | 70      | Conference | 2010 | 1.67
P106 | Experiment | No  | 14      | Journal    | 2004 | 1.67
P43  | Case study | Yes | Unclear | Conference | 2010 | 1
P132 | Case study | Yes | 3       | Conference | 2007 | 1
P104 | Example    | No  | Unclear | Conference | 2006 | 0.71
P103 | Example    | No  | NA      | Journal    | 2011 | 0.5
P133 | Case study | Yes | 3       | Workshop   | 2007 | 0.17
P53  | Case study | No  | 105     | Conference | 2010 | 0
P97  | Case study | Yes | 16      | Workshop   | 2011 | 0
There are relatively few normative papers with approaches for practitioners on
group architectural decisions. Table 4.7 indicates that 10 of the 22 papers
are descriptive work (RQ4), meaning that they offer evidence on group
architectural decisions in industry. However, the ratio of descriptive to
normative papers is much higher for the papers in Table 4.7 than for papers on
architectural decisions in general. This suggests a low number of normative
papers on group architectural decisions. The results from Chapter 3 indicate
that 86% of architectural decisions in industry are made by groups, rather
than by individual decision makers. Therefore, we suggest that future work is
needed on approaches for group architectural decisions. To propose such
approaches, researchers need to understand the key findings of existing papers
on group architectural decisions.
The ten descriptive papers contribute to increasing our understanding of group
dynamics in architectural decision-making (e.g. P33, P97). However, more work
needs to be done on understanding how decision makers reach consensus on
architectural decisions, as indicated by P32. Additionally, Zannier's papers
(e.g. P129, P130) call for more descriptive work, to increase the community's
understanding of group dynamics in architectural decision-making.
The twelve normative papers provide contributions for improving group
architectural decision-making. Based on the nature of the papers, we
categorize these contributions into three types: CBAM (Kazman and Klein, 2001)
extensions, documentation approaches, and processes. P66 and P84 propose CBAM
extensions that help decision makers elicit and evaluate alternatives for
architectural decisions. The documentation approaches (in P45, P53, P83, and
P143) help capture the perspectives of multiple decision makers. Finally, the
processes (in P3, P85, P103, and P104) help decision makers structure their
interactions for faster decision-making.
Furthermore, the normative papers indicate several areas that need further
attention. First, P3 and P66 call for more work on treating the judgment
uncertainty of decision makers. Second, other papers (i.e. P53, P66, P83, and
P101) ask for better tool support for group decision-making. Third, P66 and
P101 call for further empirical validations. Fourth, P66 and P143 ask for
further work on dependency analysis of decision makers' perspectives.
4.5.2 Implications for Researchers and Practitioners
This mapping study confirms that architectural decisions are an increasingly
popular topic in software architecture research: the number of papers on
architectural decisions has increased steadily since 2005,
as shown in Figure 4.4. Three seminal papers (P17, P70, and P118 – see
Section 4.5.1.1) that were published before 2005 influenced much subsequent
research, as indicated by their high citation count. The key message of the
three seminal papers is to reduce maintenance costs by fighting the
vaporization of architectural knowledge, through documentation of
architectural decisions.
This study shows that much effort has been invested in improving the
documentation of architectural decisions. We agree with the importance of
documenting architectural decisions. However, following this study, we
speculate that critics who look at the efforts on documenting architectural
decisions might demand evidence on how documenting architectural decisions
actually reduces maintenance costs in industry, as envisioned by the three
seminal papers (i.e. P17, P70, and P118). As future work, researchers can use
the 144 papers selected in this study to search for such evidence.
Following this study, we see value in diversifying research efforts to cover
three additional topics on architectural decisions: descriptive work,
uncertainty, and group architectural decisions. These topics receive much
attention in generic decision-making literature (e.g. (Janis, 1989)). We do not
see any reason for neglecting these topics in research on architectural
decisions.
Addressing specific quality attributes is often needed in practice. For
example, Poort et al. (2012) bring evidence on the importance of modifiability
in practice, but we could not identify any paper on architectural decisions
that focuses on achieving modifiability. Also, Ameller et al. (2013)
investigate the role of specific quality attributes in architecting
service-based systems. In this mapping study, we found only seven papers that
address specific quality attributes. Therefore, more work is needed on
architectural decisions that address specific quality attributes.
Regarding domain-specific architectural decisions, we notice that most papers
address the SOA and enterprise architecture domains. Very few papers address
other domains, such as mobile, so more work is needed to cover them. This will
lead to better architectural decisions, by using approaches targeted at the
specific domains.
Similar to other mapping studies (Engström and Runeson, 2011; Li et al.,
2013), we found improvement opportunities for the empirical evaluation of
future studies, by moving away from examples to surveys, case studies or
experiments. However, we found no clear trend (see Figure 4.8) suggesting
that, over time, papers on architectural decisions improved their empirical
evaluation.
When analyzing the results on empirical evaluation, we were surprised to find
only six experiments on architectural decisions. Experiments use 49
participants on average (Sjøberg et al., 2005). Recruiting professional
architects is very challenging, due to their busy schedules. Using students as
participants raises validity threats, although some authors indicate concrete
steps towards reducing the validity threats of using students (Tichy, 2000).
However, experiments allow researchers to test hypotheses on causes and
effects in a systematic manner. Therefore, despite the difficulties, conducting
more experiments on architectural decisions is critical for advancing research
on architectural decisions.
Regarding implications for practitioners, we notice the following points. First,
this study shows that current work on architectural decisions offers a rich set
of approaches for documenting architectural decisions (see Section 4.4.2), so
practitioners can incorporate documentation approaches in their activities.
Second, this study indicates papers that help practitioners achieve specific
non-functional requirements (see Section 4.4.3) and address certain specific
domains (see Section 4.4.4) in their architectural decisions. Third, this study helps
practitioners understand the existing approaches for addressing uncertainty in
architectural decisions (see Section 4.4.6), and for improving group decision-
making (see Section 4.4.7). However, practitioners must be aware that the
maturity of these topics varies: most mature research results are on
documenting architectural decisions, and more work is needed for the other
topics. Therefore, we encourage practitioners to document their architectural
decisions using approaches from the literature, and offer researchers feedback
about the approaches.
4.6 Validity threats
In this mapping study we addressed conclusion, construct, internal and
external validity threats.
4.6.1 Conclusion Validity
By making explicit the criteria for including and excluding papers, we believe
that our conclusions are valid and can be replicated using the same research
questions for three reasons. First, we followed a systematic mapping study
process (detailed in Section 4.2), which helps other researchers replicate the
study. Second, we made explicit our data collection efforts (detailed in Section
4.3), including details on the number of papers in each step of the search
process. Third, three researchers were involved in data collection and
classification, thus reducing the threat of overlooking relevant papers or
misinterpreting the data.
4.6.2 Construct Validity
In our mapping study, ‘architectural decision’ is the key theoretical concept.
However, we made extra efforts to distinguish architectural decisions from
other theoretical concepts that they might be confounded with, such as
‘architectural knowledge’ and ‘design decisions’. For example, we noticed
that in some studies the line between these theoretical concepts is blurred;
therefore, we discussed such studies in detail among all researchers to
make sure that we referred correctly to architectural decisions. Also, architectural
decisions might be made by professionals who do not have the official role of
software architect, such as software developers. Thus, we had to discuss
whether the decisions in the papers were architectural or design decisions,
because the distinction between them is sometimes fuzzy.
To achieve construct validity, the final set of papers had to be complete. To
achieve this, we created the search string systematically (see Section 4.3.1),
making sure to include not only the most relevant keywords (i.e. ‘architectural
decision’) in the automated search, but also variations of these keywords (e.g.
‘architecture choice’). Moreover, we searched for similar theoretical
concepts such as ‘design rationale’, ‘design knowledge’, or ‘design decisions’.
On the one hand, by extending the search to similar theoretical concepts, we
increased confidence in retrieving all relevant papers. On the other hand, the
extended search required considerable effort from us to manually filter the
initial set of more than 28,000 references (see Figure 4.2). Also, to further ensure the
completeness of included papers, we checked several randomly selected
papers for references to other relevant papers, and we found that all relevant
papers were included. Furthermore, we verified the results of the automated
search against a quasi-gold standard.
4.6.3 Internal Validity
In this study, we used basic statistics for analyzing the data, so internal validity
threats are minimal.
4.6.4 External Validity
The results of this mapping study refer to the state of research on architectural
decisions in the software architecture field, from the perspective of researchers.
Since we identified a comprehensive list of 144 papers on architectural
decisions using a broad search, we consider that our study results have strong
generalizability claims, within the selected timeframe (i.e. 2002-2011). Given
the detailed presentation of the protocol, other researchers can extend this
study beyond our selected timeframe.
4.7 Conclusions
In this chapter, we report a mapping study that provides a systematic overview
of literature on architectural decisions. As part of the overview, we identified
gaps in existing research, and promising future research directions. We
obtained the overview by querying six search engines that returned 28,895
papers, covering a decade of research (i.e. from the 1st of January 2002 until
the 1st of January 2012). After multiple filtering steps, we obtained a set of 144
relevant papers.
For each of these papers, we extracted data to answer six research questions.
The six research questions belonged to two groups. The first group covered
software architecture-specific topics: documenting decisions, (non-)functional
requirements, and domain-specific architectural decisions. The second group
covered topics inspired from generic decision-making literature: normative
and descriptive papers, uncertainty in architectural decisions, and group
architectural decisions. In addition, we presented an overview of empirical
evaluation approaches, publication venues, and average citation count of
papers.
Our analysis of existing research on architectural decisions found the
following. Much work exists on documenting architectural decisions, but very
few open-source tools. However, other topics are not studied in detail: domain-
specific architectural decisions, and decisions for achieving specific quality
attributes. Also, we found that little descriptive work exists. Therefore, the
descriptive work reported in Chapter 3 is a much needed step towards better
understanding of real-world architectural decisions.
Furthermore, this mapping study confirms that much interest exists to solve
the problem of knowledge vaporization. Towards this, we explore a new
approach in Chapter 5. In addition, this mapping study shows that little work
exists on group decision-making, which encouraged us to work on a new
group decision-making approach that reduces architectural knowledge
vaporization (reported in Chapter 7). In addition, we found that the number of
papers validated with surveys and case studies increased over time, but few
papers used experiments. Chapters 6 and 7 use experiments to provide
evidence on approaches for making architectural decisions. Finally, Chapter 8
presents open source tool support for making and documenting architectural
decisions.
Chapter 5
Reducing Vaporization with the Repertory
Grid Technique
The first part of this chapter has been published as: Tofan, D. Galster, M. and
Avgeriou, P. Capturing Tacit Architectural Knowledge Using the Repertory
Grid Technique (NIER Track). In Proceedings of the 33rd International
Conference on Software Engineering, 2011.
The second part of this chapter is based on: Tofan, D. Galster, M. and
Avgeriou, P. Reducing Architectural Knowledge Vaporization by Applying
the Repertory Grid Technique. In Proceedings of the 5th European Conference
on Software Architecture, 2011.
As discussed in Chapter 1, knowledge about the architecture of a software-intensive
system tends to vaporize easily. This leads to increased maintenance
costs. The mapping study in Chapter 4 reviewed existing work on documenting
architectural decisions and indicated an increasing interest in the topic of
architectural decisions. Motivated by findings from previous chapters, in this
chapter, we explore the use of the Repertory Grid technique, a knowledge
acquisition technique from the knowledge engineering field, to reduce
architectural knowledge vaporization. An architect can use the Repertory Grid
technique to elicit decision alternatives and concerns, and to evaluate each
alternative against concerns.
To study the applicability of this idea, we conducted two studies. First, we
conducted an exploratory study with seven subjects who applied the Repertory
Grid technique to document a design decision they had to take in previous
projects. Next, we interviewed each subject to understand their perceptions
about the technique. Second, we conducted a survey with graduate students. In
the survey, participants documented decisions using the Repertory Grid
technique. To understand the effect of the Repertory Grid technique on
architectural knowledge vaporization, we compared the decisions documented
with the Repertory Grid technique with decisions documented using a basic
decision template from the literature.
From the first study, we found that the main advantage of the technique is the
reasoning support it provides, and its main disadvantage is the additional effort it
requires. The results of the second study suggest that the Repertory Grid
technique leads to less architectural knowledge vaporization, compared to
template-based decision documentation.
5.1 Introduction
As presented in Chapter 4, various approaches have been proposed to
document architectural knowledge to prevent its vaporization. For example,
Clements et al. (Clements et al., 2002) describe how to create architecture
documentation using views. Kruchten et al. (Kruchten et al., 2006) (Kruchten
et al., 2009) discussed an architectural knowledge repository, and its
underlying ontology. Kruchten et al. (Kruchten et al., 2009) argue for a
decision view to document decisions.
Current approaches to manage architectural knowledge are based on a
particular notion about knowledge within the architecting community. De Boer
and Farenhorst (de Boer and Farenhorst, 2008) provide an overview of
different notions of architectural knowledge, which all agree on the idea that
architectural knowledge is ‘Design Decisions + Design’. For example, one
notion is that architectural knowledge is ‘the integrated representation of the
software architecture […] along with architectural decisions and their
rationale, external influence, and the development environment’ (de Boer and
Farenhorst, 2008).
Architectural knowledge has two parts: explicit and tacit architectural
knowledge (Babar et al., 2009). Examples of explicit architectural knowledge
are documented patterns, standards, views and reference architectures
(Farenhorst and de Boer, 2009). Explicit architectural knowledge is easily
communicated through documents. In contrast, tacit architectural knowledge
is more difficult to record, thus also more likely to become lost. Examples of
tacit architectural knowledge include unrecorded design decisions made by an
architect based on his or her experience, the design rationale, or assumptions
(Babar et al., 2009). Furthermore, architectural knowledge vaporizes when
tacit architectural knowledge never becomes explicit (i.e. if it is never
recorded).
Current work does not consider the complexity of tacit architectural
knowledge, nor the difficulty of making it explicit. A famous quote
by Polanyi summarizes the complexity related to tacit knowledge: ‘we can
know more than we can tell’ (Polanyi, 1967). This quote reflects the
difficulties that experts have when making explicit their tacit knowledge.
Similarly, architects have difficulties making tacit architectural knowledge
explicit. For example, an architect might have difficulties explaining why she
decided on a particular technology, or on a particular pattern.
We argue that the architecture community should look beyond the traditional
way of understanding knowledge in software architecture. Thus, we propose to
employ theories from the knowledge engineering domain to provide
approaches for capturing tacit architectural knowledge.
Specifically, we explore a new perspective on architectural knowledge
originating from the personal construct psychological theory (Jankowicz,
2001). Based on this theory, the Repertory Grid technique has already been
used in knowledge engineering to capture experts’ knowledge (Gaines and
Shaw, 1993). Therefore, we suggest using the Repertory Grid technique to
capture architectural decisions.
5.2 The Repertory Grid Technique
The Repertory Grid technique has its roots in personal construct
psychology. According to this theory, humans create mental
representations of their experiences. These
representations use contrasting poles or ‘constructs’ (Jankowicz, 2001), which
characterize ‘elements’. For example, let us consider someone who needs to
decide at which restaurant to have dinner. Each considered restaurant (i.e.
alternative) is an element. The person may use constructs like ‘quiet vs. noisy’
or ‘good vs. bad food’ to characterize restaurants. A different person may have
a completely different set of elements (i.e. alternatives) or constructs (i.e.
criteria to characterize alternatives).
The Repertory Grid technique provides a systematic approach to capture such
constructs from an individual. Here, the first step is to set up a topic (i.e.
decide on a restaurant). Next, a set of elements (i.e. alternatives) is elicited
from that person (i.e. restaurants names). The subject must be knowledgeable
about all elements. Afterwards, the constructs (i.e. decision criteria) are
elicited, usually by applying the triadic approach: first selecting three
elements and then asking in what way two of them are alike (e.g. restaurants A
and B are ‘quiet’), but different from the third one (e.g. C is ‘noisy’). Next, the
subject rates each element against each construct, on a predefined scale. Based
on these ratings, tools like WebGrid (Gaines and Shaw, 2007) provide means
to analyze the resulting grids (i.e. the matrices or tables of elements,
constructs, and ratings). For a more detailed description of the Repertory Grid
technique, and its applications in software engineering, please refer to
(Edwards et al., 2009; Fransella et al., 2004).
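To make the technique concrete, the restaurant example above can be sketched in code: a set of elements, the triads used to prompt for constructs, and a ratings matrix. This is an illustrative sketch only; the restaurant names, constructs, and ratings are invented, and a five-point scale is assumed.

```python
from itertools import combinations

# Elements (decision alternatives) and bipolar constructs (decision
# criteria), invented for illustration.
elements = ["Restaurant A", "Restaurant B", "Restaurant C", "Restaurant D"]
constructs = [("quiet", "noisy"), ("good food", "bad food")]

def triads(elements):
    """All groups of three elements; each triad prompts the question:
    'In what way are two of these alike, but different from the third?'"""
    return list(combinations(elements, 3))

# Ratings on a five-point scale: 1 = left pole, 5 = right pole.
ratings = {
    "Restaurant A": [1, 2],  # quiet, fairly good food
    "Restaurant B": [2, 1],  # quiet, good food
    "Restaurant C": [5, 4],  # noisy, rather bad food
    "Restaurant D": [4, 2],  # somewhat noisy, good food
}

print(len(triads(elements)))  # 4 triads for four elements
```

The resulting grid is simply this ratings matrix together with the element and construct labels; tools such as WebGrid operate on exactly this kind of structure.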
In the context of software engineering, the Repertory Grid technique has been
used before. For example, in the 80s, Gaines and Shaw (Gaines and Shaw,
1980) pioneered computer-based elicitation and analysis of the grids resulting
from applying the Repertory Grid technique. In the 90s, Maiden and Rugg
(Maiden and Rugg, 1996) studied the Repertory Grid technique for requirements
acquisition. In 2009, Edwards et al. (Edwards et al., 2009) reviewed existing
Repertory Grid technique studies in software engineering, also offering
guidelines for planning and judging such studies.
5.3 Exploratory Study
5.3.1 Study Design
To understand the applicability of using the Repertory Grid technique for
capturing architectural knowledge, we performed an exploratory empirical
study. We wanted to understand the feasibility of using the Repertory Grid
technique to capture tacit architectural knowledge, and to generate hypotheses for
future experiments. Therefore, our research question was:
RQ1. What are the advantages and disadvantages of the Repertory Grid
Technique for capturing architectural knowledge?
We used seven subjects with diverse backgrounds and levels of technical
experience. All were undergraduate or graduate students at the University of
Groningen in the Netherlands. Only one subject had significant industry
experience. Each subject was interviewed individually, using the following
study structure for each session.
Figure 5.1. The study steps performed with each subject.
In the introduction of the study (step one), we presented each subject with the
plan for the session and ethical considerations. We asked about recent projects
in which the subject had to take architectural decisions (step two). Afterwards,
we selected the decision topic (step three) for which the person was both
knowledgeable and able to provide alternatives. As part of the Repertory Grid
technique session (step four), we elicited elements, constructs and ratings for
the decision topic. We asked subjects to write each decision alternative on a
paper card, which could be easily shuffled on the table during the elicitation of
the characteristics. Meanwhile, we used WebGrid ourselves to enter the data,
while the subject could see the tool’s user interface on an additional monitor.
For analyzing the output of the exercise, we used the WebGrid (Gaines and
Shaw, 2007) web application.
Next, we discussed the resulting grid with each subject (step five). Figure 5.2
shows an example of such output. Then we presented each subject with a grid
created by a different person and a different decision topic (step six). We
asked the subject to indicate the most and least likely alternatives to be
selected by the author of the grid. We did that to verify if a person can
understand the rationale of a decision based on a grid created by another
person. Next, each subject filled in a questionnaire (step seven), providing
background information and feedback on the Repertory Grid technique. In the
final step, we used a set of open questions to discuss the benefits and
drawbacks of the grid session. Each session was audio and screen recorded,
with the consent of the participants.
For analyzing the final step of our study, we transcribed all discussions and
performed content analysis following recommendations from (Krippendorff, 2004).
Next, two of the authors applied open coding to the transcripts. Several topics
emerged during the analysis, which were used as codes. After coding, we used
frequency analysis to identify how often codes appeared. Afterwards, we
grouped codes into categories. Then we discussed the outcomes of the coding
from both researchers to resolve disagreements. We mapped the agreed
categories to advantages, disadvantages and the application context of the
technique.
5.3.2 Study Results
Table 5.1 contains an overview of the outcome of the first three steps of our
study. The first column has the ID of each participant. The second column
shows their current degree program. In the fourth column, ‘academic’ means
that the project they referred to during the study was part of a larger research
effort in which the participants were involved. The ‘course’ type refers to
projects that were assignments within a Software Architecture course, which
participants 5 and 6 took shortly before our study. ‘Industrial’ refers to a
commercial project.
Table 5.1. Decision topics examined with the Repertory Grid technique.
ID | Study | Project description                             | Proj. type | Decision topic
1  | Ph.D. | Network analysis metrics for power grids        | Academic   | Graph theory library
2  | B.Sc. | Visualization of architectural design decisions | Academic   | Technology for data visualization
3  | Ph.D. | Support medical disease treating                | Academic   | Features for the first release
4  | B.Sc. | Same as ID 2                                    | Academic   | Database selection
5  | M.Sc. | Smart Grid application                          | Course     | Main hardware
6  | M.Sc. | Same as ID 5                                    | Course     | Main hardware
7  | Ph.D. | E-banking reporting system                      | Industrial | Reporting engine
The Repertory Grid technique sessions (step four) took on average 57 minutes,
with a standard deviation of 16 minutes. As an example, Figure 5.2 shows the
output of such a session for ID1, as analyzed by WebGrid, using hierarchical
clustering analysis. It groups similar elements (e.g. Prefuse and Infovis), based
on the similarity of their ratings given to the constructs. The same can be seen
for other constructs, e.g. ‘more complex vs. simple to use’ is grouped with
‘used only for visualization vs. used for graph analysis’. The ratings of the two
constructs differ only for the Leda element. If a person regards two constructs
as very similar, then these constructs cannot be used to differentiate elements.
Repeating such grouping produces the dendrograms for the elements and for
the constructs. The scale above each tree shows the similarity score. For
example, ID1 perceives Prefuse and Infovis as around 95% similar, with
regard to the ratings of the criteria (i.e. constructs) used to evaluate each
option. They form a cluster that is around 85% similar to Leda. If a person
regards two elements as very similar, then s/he cannot distinguish between
them, and could select either one when considering a decision topic.
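The grouping that produces such dendrograms can be sketched in miniature. This is not WebGrid's own clustering algorithm; it is a minimal greedy single-linkage sketch over invented element names and ratings, which merges the most similar items first, in the order a dendrogram would group them.

```python
def pair_similarity(a, b, scale=(1, 5)):
    """Percentage similarity of two rating vectors: 100 means identical
    ratings, 0 means maximally different on every construct."""
    span = (scale[1] - scale[0]) * len(a)
    return 100 * (1 - sum(abs(x - y) for x, y in zip(a, b)) / span)

def cluster(ratings):
    """Greedy single-linkage agglomeration: repeatedly merge the two
    clusters containing the most similar pair of items, recording the
    merge order (the order a dendrogram would group items)."""
    clusters = {name: [name] for name in ratings}
    merges = []
    while len(clusters) > 1:
        a, b = max(
            ((x, y) for x in clusters for y in clusters if x < y),
            key=lambda p: max(pair_similarity(ratings[i], ratings[j])
                              for i in clusters[p[0]]
                              for j in clusters[p[1]]))
        merges.append((a, b))
        clusters[a] = clusters[a] + clusters.pop(b)
    return merges

# Invented ratings: A and B are nearly identical, C is far from both,
# so A and B merge first, as Prefuse and Infovis do in Figure 5.2.
merges = cluster({"A": [1, 1], "B": [1, 2], "C": [5, 5]})
print(merges)  # [('A', 'B'), ('A', 'C')]
```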
To help the architect decide, an ideal element is added, which receives the
most preferable rating on every construct. The ideal element serves as a
reference point: the alternative most similar to the ideal element is the most
preferable one.
Figure 5.2. The grid from ID1 shows the elements, constructs, ratings and
dendrograms for his decision topic.
The constructs and the ratings represent the subjective perspectives of the
participants on the elements. It is interesting to check whether there is a link
between the alternative closest to the ideal element, and the actual decision
made in the project by the subject. For example, in Figure 5.2, JGraphT is the
closest to the ideal graph library, so JGraphT is the alternative to be chosen, as
indicated by the grid. The subject confirmed that JGraphT was also the actual
decision he made.
Overall, in five out of seven decision topics, the real decision matched the one
indicated by the grid. In the two mismatching cases, the actual decisions were
ranked third and fourth by the Repertory Grid technique. While discussing the
output of the Repertory Grid technique session (step five), the subjects were
pleasantly surprised by such matches, as they got a confirmation of making the
right choice. For one of the mismatches, the subject noted that the equal
weighting of the constructs may be the reason. For us, the high number
of matches indicates that the elicited content is indeed architectural
knowledge, as it includes the rationale for the architectural decision.
For the external grid assessment step (step six), we showed each subject an
output like the one in Figure 5.2, created previously by a different person, for
the decision topic of choosing a programming language. We asked them to
indicate the most and least likely alternative to be selected by that person.
Each subject identified the expected answer in a few seconds, by looking at the
similarities between the ideal element and all the other elements. This suggests
that the output (such as Figure 5.2) is useful for communicating alternatives of
an architectural decision in a concise manner. Additionally, numbers can be
associated with the alternatives. For the example in Figure 5.2, WebGrid
indicates the similarities of the alternatives with the ideal graph library as:
81% with JGraphT, 61% with JGraph, 44% with Leda, 39% with Prefuse and
Infovis and 31% with Tulip. This can be used to rank the alternatives.
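Such a ranking against the ideal element can be sketched as follows. WebGrid's exact similarity formula is not reproduced here; the sketch uses a common normalization of city-block distance between rating vectors, and the library names and ratings are invented (they are not the values from Figure 5.2).

```python
def similarity(ratings_a, ratings_b, scale=(1, 5)):
    """Percentage similarity of two rating vectors: city-block distance
    normalized by the largest possible distance, inverted to a percent."""
    span = (scale[1] - scale[0]) * len(ratings_a)
    dist = sum(abs(x - y) for x, y in zip(ratings_a, ratings_b))
    return 100 * (1 - dist / span)

# Invented ratings on three constructs (1..5); the ideal element gets
# the most preferable rating on each construct.
ideal = [1, 1, 1]
alternatives = {"LibX": [1, 2, 2], "LibY": [3, 3, 4], "LibZ": [5, 4, 5]}

# Rank alternatives by their similarity to the ideal element.
ranked = sorted(alternatives,
                key=lambda name: similarity(ideal, alternatives[name]),
                reverse=True)
print(ranked)  # ['LibX', 'LibY', 'LibZ']
```

Here LibX, at roughly 83% similarity to the ideal element, would be the alternative indicated by the grid.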
For step seven, each subject rated a set of statements on a scale from one
(strongly disagree), through three (neutral), to five (strongly agree). Table 5.2 summarizes the
results, which suggest the applicability of the Repertory Grid technique for
explaining, documenting and making architectural decisions. Additionally, we
asked each subject to indicate the most difficult part of the Repertory Grid
exercise. Out of the seven persons, five indicated construct elicitation
through the triadic approach. The other two pointed to the rating step.
Table 5.2. The values indicate the average and standard deviation of the subjects’
ratings for each statement.
Id | Statement                                                                                            | Mean | SD
1  | The output of the exercise helps explain to a colleague why I took a certain decision.               | 4.14 | 1.07
2  | The exercise provides a structured way of explaining architectural decisions.                        | 4.14 | 0.38
3  | The exercise provides a structured way of documenting architectural decisions.                       | 3.57 | 0.53
4  | The exercise provides a structured way of making architectural decisions.                            | 4.00 | 1.29
5  | When trying to understand why another architect took a certain decision, I could use such an output. | 4.29 | 0.49
6  | I found the exercise too tiring for the output.                                                      | 2.43 | 1.13
7  | Overall, I enjoyed the exercise.                                                                     | 4.42 | 0.53
In the following sections, we discuss the results from analyzing the interview
transcripts using open coding. Also, we check the extracted themes for
consistency against other steps of the study, to increase validity.
5.3.3 Advantages
By applying the open coding procedure and its supporting steps described in
the study design, we identified reasoning support as the main category of
advantages. It comprises the following codes: systematic (indicated by two out
of seven persons), new insights (three), reflective (four) and decision support
(five). As examples, for the systematic code, one subject characterized the
approach as ‘very professional’, while the other one as ‘a formal way to
document what you have in mind.’ For new insights, a person said: ‘I also like
that you could say these concerns or these elements are very alike. Hey! This
is true. I didn’t think at it that way.’ For reflective, we quote: ‘it forces you to
think about why you chose something.’ These perceptions are in line with the
averages of the ratings for statements two, three, and four, from Table 5.2.
The other category for advantages is readability, containing the picture output
(five mentions) and conciseness (two) codes. For the picture output, one said:
‘I like that my choices […] have been documented […] in a graph that justifies
my choice.’ For the other code, a person indicated that ‘it provides much
information with less text.’ These perceptions resonate with the statements one
and five.
5.3.4 Disadvantages
The main category concerns effort. It comprises four codes: learning curve
(one out of seven), straining (two), tool support (three) and time consuming
(four). For the learning curve, the person commented that ‘you need to tell the
people how to read the results.’ For straining, a subject noted that the
Repertory Grid session proved ‘difficult, because you have to think so much.’
The comments on tool support asked for a friendlier interface and adding
weight to the constructs. However, for time consuming, the subjects observed
that ‘I disliked it took too long’ and that ‘it takes some time that I wouldn’t
spend.’
5.3.5 Application Context
Additionally, we identified when and for whom the Repertory Grid technique
might be useful, based on three codes: size (three), time (two) and role (six).
For project size, a subject noted that ‘for my small project I won’t go for
[Repertory Grid], for a big project I would.’ For time, one considered
Repertory Grid technique useful ‘at the very beginning of the project.’ For
role, the subjects had mixed opinions whether developers, testers or
maintainers can use the output of the Repertory Grid technique. A subject
commented: ‘not sure about the typical developer using such output […]
testers and maintainers not at all […] yes, for a developer promising to
become an architect.’ During the sessions, we noticed further limitations of the
Repertory Grid technique. First, a minimum of five alternatives is needed for a
decision topic. Second, the technique assumes that the decision topic under
focus is only weakly related to other topics, which is often not the case in practice.
5.3.6 Validity Threats
External validity: Our results are not generalizable due to the small sample
size and the background of participants. Additionally, experienced
practitioners may perceive differently the technique and its output.
Internal validity: Due to its exploratory nature, the risk of internal validity
was low. However, to ensure that our results emerged from the collected data,
we checked the consistency and plausibility of the outcomes from the different
steps of the study.
Reliability: To ensure reliability of our study results, we piloted the data
collection instruments and reviewed the study protocol. Also, parts of the data
analysis were performed by multiple researchers, followed by cross-checking
of the results.
Construct validity: Our subjects had never worked as professional architects.
This might have influenced the criteria used for deciding between decision
alternatives. Additionally, real architects may have different perceptions on the
technique.
To further validate our study, we checked two criteria recommended by
Edwards et al. (Edwards et al., 2009): 1) the quality of design decisions for
data collection, and 2) the quality of the data gathering process.
For the first criterion (i.e. quality of design decisions), we chose to elicit
elements, constructs, and ratings from each subject, instead of providing them.
We consider that our approach is better suited for exploratory research
(Edwards et al., 2009), as the subjects provided all the content of the grids.
Unlike other Repertory Grid technique studies that use a binary scale, we used
a five-point rating scale, which provided enough discrimination
among the elements and constructs.
For the second criterion (i.e. quality of the process), we chose to conduct long,
individual sessions with each participant, instead of short ones. This allowed
each subject to reflect thoroughly on perceptions and decisions, necessary for
eliciting tacit architectural knowledge. A single researcher performed data
collection during the sessions with each participant, which introduced bias
risks but also facilitated the elicitation. Moreover, the usual limitations of
exploratory studies also apply to ours, e.g. the absence of hypotheses and the
lack of triangulation.
5.4 Survey study
From the study in Section 5.3, we learned that the Repertory Grid technique
has potential for fighting architectural knowledge vaporization. However, we
did not integrate the Repertory Grid technique with existing models on
architectural knowledge from the literature. Such integration may facilitate the
adoption of the Repertory Grid technique by the architecture community. This
is mainly because we would avoid the introduction of new terminology (e.g.
Repertory Grid specific terms like constructs), and therefore shorten the
learning curve of the approach. In this follow-up study, we propose such
integration and investigate how well the Repertory Grid technique reduces
architectural knowledge vaporization.
5.4.1 Conceptual Model to Capture Architectural Knowledge Using the
Repertory Grid Technique
De Boer et al. (de Boer and Farenhorst, 2008) propose a core model of
architectural knowledge, which covers all concepts from three major
terminological frameworks: IEEE-1471 (Hilliard, 2000), Kruchten’s ontology
(Kruchten, 2004), and Tyree & Akerman’s template (Tyree and Akerman,
2005). Furthermore, their core model has industrial validation. Given its
completeness, we build our conceptual model as an extension of the
architectural knowledge core model from de Boer et al. (de Boer and
Farenhorst, 2008).
The architectural knowledge core model (de Boer and Farenhorst, 2008)
describes concepts and their relationships as follows. An Architectural Design
Decision is a subclass of a Decision. For a Decision, multiple Alternatives are
ranked, based on various Concerns. Additionally, the Decision Topic is
regarded as a special type of Concern. We map the Repertory Grid specific
terminology to the one from the core model (see Table 5.3).
Table 5.3. Mapping between architectural knowledge core model and Repertory
Grid technique concepts.

Concept from core model of architectural knowledge | Concept from Repertory Grid technique
Alternative                                        | Element
Concern                                            | Construct
Decision Topic                                     | Topic
Ranking                                            | Rating
Based on the above mapping, we propose the model in Figure 5.3a, which
considers the main concepts involved in capturing architectural knowledge.
The Expert refers to persons possessing architectural knowledge, such as
software architects, who may be regarded as an instantiation of a Role from the
architectural knowledge core model. To avoid architectural knowledge
vaporization, the Expert may use the Repertory Grid Technique (instantiation
of an Activity from the architectural knowledge core model), as a means of
converting his/her Tacit architectural knowledge into Explicit architectural
knowledge. The Explicit architectural knowledge is an instantiation of an
Artifact, in the form of grids resulting from applying the Repertory Grid
Technique. Decisions are an important part of architectural knowledge.
Furthermore, they comprise concepts of Decision Topic, Alternative, Concern,
and Ranking, from the architectural knowledge core model. For example, an
architect might need to address the Decision Topic of selecting a programming
language for a new project. The architect may consider Java, Python, and C#
as Alternatives. Performance, reusability, understandability, and cost may be
the Concerns in this Decision. The architect may assign Rankings indicating
how well each Alternative addresses each Concern.
[Figure 5.3 diagram: a) conceptual model relating Expert, Tacit AK, RG
Technique, Explicit AK, AK, Decision, Decision Topic, Alternative, Concern,
and Ranking, with cardinalities 1, 1..*, n, m, and n x m; b) the Repertory
Grid technique steps.]
Figure 5.3. The upper part of the figure presents the conceptual model, which
uses the Repertory Grid technique steps detailed in the lower part, in UML
notation.
Regarding cardinality, architectural knowledge comprises multiple Decisions.
A Decision has one Decision Topic, for which a number of n Alternatives and
m Concerns may exist, along with n x m Rankings. We argue that obtaining
the maximum number of possible Rankings (i.e. n x m) reduces architectural
knowledge vaporization, as otherwise some of the ratings would just remain
tacit, with the risk of being lost. In the previous example of selecting a
programming language, let us consider that the architect does not use a
systematic approach and explicitly ranks each of the three Alternatives (Java,
Python, C#) for only two of the four Concerns. In that case, half of the twelve possible
Rankings remain tacit. By using the Repertory Grid technique, the architect
would capture all n x m possible Rankings.
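The arithmetic behind this argument is easy to check. The sketch below is a hypothetical helper, not part of the study's tooling; it computes the fraction of rankings that remains tacit:

```python
def tacit_ranking_fraction(n_alternatives, n_concerns, explicit_rankings):
    """Fraction of the n x m possible Rankings that remains tacit."""
    possible = n_alternatives * n_concerns
    return (possible - explicit_rankings) / possible

# The example from the text: 3 Alternatives (Java, Python, C#) and
# 4 Concerns, with each Alternative ranked for only 2 Concerns.
print(tacit_ranking_fraction(3, 4, 3 * 2))  # -> 0.5
```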
5.4.2 Repertory Grid Technique for Capturing Architectural
Knowledge
The architectural Expert uses the Repertory Grid technique to deliver explicit
decisions, in the form of grids. A grid is an m by n matrix of n Alternatives
(columns), m Concerns (rows), and their Rankings (cells).
As presented in Section 5.2, step one of the Repertory Grid technique (see
Figure 5.3b) is for the expert to choose an architectural decision topic (e.g.
choice of a programming language for developing a new product). Step two
generates a list of alternatives, relevant for the decision topic.
Step three of the Repertory Grid technique aims to elicit the concerns about the
alternatives. Concerns may be obtained using the triadic elicitation approach
(Fransella et al., 2004), which requires repeatedly asking how two alternatives
are alike, but different from a third one. Another possibility is to use a
sentence completion task (Grice et al., 2004), in which the architect completes
sentences characterizing various alternatives.
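To illustrate the triadic approach, the sketch below enumerates the triads that drive the elicitation questions; the alternative names are illustrative, with Scala added as a made-up fourth option:

```python
from itertools import combinations

# Illustrative alternatives; Scala is a made-up fourth option.
alternatives = ["Java", "Python", "C#", "Scala"]

triads = list(combinations(alternatives, 3))
for triad in triads:
    # Each triad drives one elicitation question in the triadic approach.
    print("In what way are two of", triad, "alike, but different from the third?")
print(len(triads))  # 4 alternatives yield 4 triads
```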
Step four aims to obtain the rankings of each alternative against each concern,
on a predefined scale. Step five requires the architect to analyze the grid, e.g.
by applying hierarchical clusters or principal components analysis. This may
result in additional refinements of the grid. To assist with grid elicitation and
analysis, various tools can be used, such as (Gaines and Shaw, 2007; Grice,
2002).
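The core of the clustering analysis in step five can be approximated without a dedicated tool. The sketch below, with hypothetical ratings, computes pairwise Euclidean distances between the concern rows of a grid, the building block of hierarchical clustering; a near-zero distance suggests the expert construes two concerns almost identically:

```python
import math

# Hypothetical grid fragment: concern -> ratings of three alternatives (-2..2).
grid = {
    "performance": [2, -1, 1],
    "cost":        [2, -1, 1],
    "usability":   [-2, 2, 0],
}

def distance(a, b):
    """Euclidean distance between two rows of ratings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Pairwise distances between concern rows.
concerns = list(grid)
for i, c1 in enumerate(concerns):
    for c2 in concerns[i + 1:]:
        print(c1, "-", c2, round(distance(grid[c1], grid[c2]), 2))
```

Here performance and cost sit at distance zero, hinting that the expert rates them interchangeably, which is exactly the kind of observation that may trigger a refinement of the grid.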
5.4.3 Study Definition and Design
According to Wohlin et al. (Wohlin et al., 2003), a descriptive survey enables
assertions about some population (e.g. software architects). Ciolkowski et al.
(Ciolkowski et al., 2003) describe several steps of the survey process: ‘(1)
Study definition – determining the goal of the survey; (2) Design –
operationalizing the survey goals into a set of questions; (3) Implementation –
operationalizing the design to make the survey executable; (4) Execution – the
actual data collection and data processing; (5) Analysis – interpretation of the
data.’ Next, we report these steps.
To conduct the survey, we needed a sample population. Ideally, we should
have used a sample of real-world software architects. However, such persons
are usually very busy, and it is difficult to obtain their commitment. Therefore,
we decided to use convenience sampling and asked graduate students to
participate in our study. Carver et al. (Carver et al., 2009) mention that there
are small differences between graduate students and professionals, from a
research point of view. However, we discuss validity threats related to using
students in a separate section. Our sample consisted of graduate students
enrolled in the Software Architecture course at the University of Groningen in
fall 2010. Study participation was voluntary and had no influence on grades.
The study was scheduled to take place as part of an optional two-hour seminar.
In the two-hour session we needed to train the students to use the new process
and also to enable them to apply it to some architectural decisions. We used
course projects for our architectural knowledge acquisition attempt. The
course project required students to act as architects, and design a complex
home automation system that interacts with the Smart Grid in order to sell and
buy electric power, control the energy consumption of home devices, and
interoperate with a home automation system. Throughout the semester,
students worked in groups of five persons to architect the system. As our study
took place halfway through the project, students had already made the
important architectural decisions and thus possessed architectural knowledge
about the home automation system. Therefore, we captured the architectural
knowledge of students about some of the decision topics that occurred in their
course project. Even though students worked in teams, we decided to apply the
Repertory Grid technique at an individual level, so that we could collect more data.
Therefore, architectural knowledge acquisition from groups of architects was
out of scope for the study presented in this chapter, but it was studied in
Chapter 7. To structure our study and identify research questions and
measurements, we used Basili’s Goal-Question-Metric (GQM) method (Basili
and Caldiera, 1994).
Goal: Reduce architectural knowledge vaporization from architects’
viewpoint.
As recommended by Basili et al. (Basili and Caldiera, 1994), our stated goal
contains a purpose (i.e. reduce), an issue (i.e. vaporization), an object (i.e.
architectural knowledge) and a viewpoint (i.e. architects). The research
question of this study is the following.
RQ2. Does the Repertory Grid technique reduce architectural knowledge
vaporization more than a template-based approach to document
architectural decisions?
The decisions documented in architectural documents in the course project
were based on a predefined template. However, the template was not as
comprehensive as the ones from Tyree and Akerman (Tyree and Akerman,
2005), or from Kruchten (Kruchten, 2004). Given their lack of experience, the
students were asked to describe only the decision topic, alternatives, their pros
and cons, outcome, and rationale. Therefore, the documentation of
architectural decisions from students was basic. Moreover, they did not use
any systematic process for documenting decisions. For this study, we assume
that architectural documentation created by students resembles documentation
produced by practitioners who do not use systematic approaches for thorough
decision documentation.
We defined the following criteria to identify how Repertory Grid based
decisions differ from decisions based on a basic decision template. The
following metrics were determined for each grid on each decision topic (a, b,
and c denote the metric type; subscripts 1 and 2 denote grids and basic
architectural documentation, respectively).
Ma1: Number of explicit decision alternatives in the output of the Repertory
Grid technique.
Ma2: Number of explicit decision alternatives in the basic architectural
documentation.
A higher number of explicit alternatives suggests a reduction in architectural
knowledge vaporization. To apply the metric, we simply counted the number
of alternatives in either a grid or the decision description from the report.
Mb1: Number of explicit concerns in the output of the Repertory Grid
technique.
Mb2: Number of explicit concerns in the basic architectural documentation.
A higher number of explicit concerns may suggest a reduction in architectural
knowledge vaporization. To identify the concerns required to apply the metric,
we applied content analysis (Krippendorff, 2004) on grids and decisions’
descriptions, to assign concerns to fragments of text. Two researchers
participated in this analysis, to reduce validity threats.
Mc1: Ratio of explicit rankings, against the maximum number of possible
rankings in the output of the Repertory Grid technique.
Mc2: Ratio of explicit rankings, against the maximum number of possible
rankings in the basic architectural documentation.
A higher ratio may suggest a reduction in architectural knowledge
vaporization. The total number of possible rankings is obtained by multiplying
Ma with Mb (i.e. Ma1 × Mb1 and Ma2 × Mb2, respectively). For example, if a decision has three
explicit rankings, for two alternatives and three concerns, then the ratio is 0.5.
To apply the metric, we counted the number of explicit rankings assigned in
either a grid or the decision description from reports.
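The Mc metrics can be sketched as a small function over counted (alternative, concern) ranking pairs; the sketch below is hypothetical, reusing the user-interface example names for illustration only:

```python
def ranking_ratio(rankings, alternatives, concerns):
    """Mc: explicit (alternative, concern) rankings over the maximum Ma * Mb."""
    explicit = {(a, c) for a, c in rankings}
    return len(explicit) / (len(alternatives) * len(concerns))

# The worked example from the text: three explicit rankings,
# for two alternatives and three concerns.
alternatives = {"web interface", "screen"}
concerns = {"cost", "usability", "extra hardware"}
rated = [("web interface", "cost"),
         ("web interface", "usability"),
         ("screen", "cost")]
print(ranking_ratio(rated, alternatives, concerns))  # -> 0.5
```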
To answer our research question, we define the following null and alternative
hypotheses.
H0a: The Repertory Grid technique does not influence the number of explicit
alternatives.
H0b: The Repertory Grid technique does not influence the number of explicit
concerns.
H0c: The Repertory Grid technique does not influence the ratio of explicit
rankings.
The alternative hypotheses are the following.
H1a: The Repertory Grid technique influences the number of explicit
alternatives.
H1b: The Repertory Grid technique influences the number of explicit concerns.
H1c: The Repertory Grid technique influences the ratio of explicit rankings.
5.4.4 Survey Implementation
Because of the time constraints for the study, we needed a Repertory Grid tool
that the participants could use for the architectural knowledge acquisition task.
We considered using either WebGrid (Gaines and Shaw, 2007), or Idiogrid
(Grice, 2002). After piloting each of them, we selected Idiogrid, mainly
because it has better support for self-administering the Repertory Grid
technique. For each decision topic, we prepared a configuration file that a
subject could load in Idiogrid, and then follow the first four steps of Repertory
Grid technique in Figure 5.3b.
Choose Decision Topic. We prepared some decision topics for students to use
in the survey. From the conceptual model in Figure 5.3a, a participant needs to
be an expert, or at least knowledgeable about the topic under focus. To satisfy
this condition, we analyzed the project reports, delivered by students. For each
of the six groups, we compiled a list of reported decision topics and considered
alternatives. Next, we analyzed which decision topics appeared across all
groups, to see which topics are more common. We identified four such topics:
choice of user interface, programming language, communication technology,
and operating system. During the survey session, we asked participants to use
the Repertory Grid technique for two decision topics. To satisfy the expertise
prerequisite, we asked each student to choose two out of the four decision
topics, based on his/her familiarity.
Get Alternatives. According to Edwards et al. (Edwards et al., 2009), the
alternatives (or elements) can either be supplied to the participant or elicited
from him/her. The former is suitable for investigations on a specific set of
elements (Edwards et al., 2009). As we aimed to elicit architectural knowledge
from participants, we decided to ask participants to specify the alternatives for
each selected topic.
For example, for the programming language decision topic, we configured
Idiogrid to prompt the participant with the question: ‘Think of a programming
language you consider for developing the HPS system’ (HPS is an acronym
for Home Power Save, the system that students had to architect). Overall, four
or five elicitation questions were used per decision topic.
Get Concerns. Similar to step two (get alternatives), we decided to elicit
characteristics (or constructs) from participants, rather than to supply them. In
the study in Section 5.3, we used the triadic elicitation approach in the
individual interviews: from three elements (alternatives), asking the expert in
what way two of them are alike, but different from the third. For this study, we
doubted most students could successfully use the triadic approach in a self-
administered session, through a dialog with a tool, instead of an interviewer.
According to Grice et al. (Grice et al., 2004), grids based on sentence
completion are suitable for any domain of experience, and are easy to
complete. Following our pilot sessions, we concluded that it may be more
intuitive for students to generate constructs through sentence completions,
compared to the triadic elicitations. Therefore, we decided to use the sentence
completion approach.
Figure 5.4 displays an example of a sentence completion task. The template
for the sentences is: ‘For stakeholder, I consider that an important
characteristic of decision topic for HPS is …, as opposed to …’ The possible
values for stakeholder were ‘me as an architect’, ‘developers’, ‘testers’, and
‘end-users’. The values for decision topic were the four topics described
above. By repeating the sentences, seven or eight characteristics were elicited
per decision topic.
Figure 5.4. Partial screenshot of a typical sentence completion task in Idiogrid for
eliciting concerns for the decision problem of selecting an operating system.
To prevent superficial constructs, like ‘good as opposed to bad’, we asked the
respondents to provide more details, i.e. ‘affordable AS IN free compilers, as
opposed to expensive AS IN a development license is 100 euro.’ Furthermore,
we encouraged the participants to provide their personal perspectives.
Rank Alternatives. For this step, we configured Idiogrid to use a five-point
rating scale, ranging from -2 to 2. When rating an alternative against a
concern, lower values indicated agreement with the first part of the concern
(i.e. affordable), while higher values indicated agreement with the second part
(i.e. expensive). The middle value indicated neutrality, uncertainty or lack of
applicability.
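The rating semantics can be made concrete with a tiny helper; this is a sketch, not part of Idiogrid:

```python
def interpret(rating, left_pole, right_pole):
    """Map a -2..2 rating onto the bipolar concern it scores."""
    if rating < 0:
        return "leans '%s'" % left_pole   # agreement with the first pole
    if rating > 0:
        return "leans '%s'" % right_pole  # agreement with the second pole
    return "neutral / not applicable"

print(interpret(-2, "affordable", "expensive"))  # -> leans 'affordable'
print(interpret(0, "affordable", "expensive"))   # -> neutral / not applicable
```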
Due to the limited duration of the session, we skipped grid analysis and
refinement. Instead, we decided to send each student an email with an analysis
of his/her grids. Therefore, no refinements of the grids took place during our
study.
5.4.5 Survey Execution
For the survey execution we developed the schedule shown in Table 5.4. In the
first step, we introduced the participants to the Repertory Grid technique.
Next, we asked the participants to do an example grid session, for training
purposes, with the topic of choosing between bars in town. In steps three and
five, participants applied the process in Figure 5.3b, on decision topics of their
choice (programming language, user interface, communication technology,
operating system). We asked the participants not to use the internet, or talk
with each other during the study. We did not impose a time limit for the
sessions, as some persons might need more time. Furthermore, two researchers
were available to answer questions from participants during the study.
Table 5.4. Schedule for the study.

Step                                          Planned Duration
1. Presentation on the RG technique           10 minutes
2. Example grid session                        5 minutes
3. Grid session on the first decision topic   30 minutes
4. Coffee break                               10 minutes
5. Second grid session                        30 minutes
6. Submission of grids by email                5 minutes
7. Post questionnaire                         10 minutes
At the end, each participant filled in a questionnaire. We were interested in the
profile of the participants, including their industrial experience. Moreover, we
added questions on the study itself, i.e. to check whether the participants
understood the instructions and questions, and the perceived difficulty of using
the Repertory Grid technique for the decision topics.
5.4.6 Analysis of Survey Results
Out of 30 students enrolled in the software architecture course, 20 attended our
study. Each participant delivered two grids, except for two persons who
produced only one grid. One participant delivered no grid at all. Overall, we
obtained 36 grids, as well as paper-based post questionnaires from the attendees.
At the end of the course the 30 students delivered six architectural documents
(i.e. one document per team), as part of their course group assignment.
5.4.6.1. Collecting Metrics for a Decision
As described in the survey design, we collected measurements for both grids
and architectural documents. We present an example of collecting the metrics
for a decision, in an architectural document and a grid. The metrics are the
numbers of alternatives, concerns, and ratio of explicit rankings (see Section
5.4.3).
Decision Metrics in an Architectural Document. We measured the number
of explicit alternatives by counting them. Table 5.5 shows a fragment of a
decision description with two alternatives, from an architectural document.
To measure the number of explicit concerns, two researchers conducted a
content analysis (Krippendorff, 2004) on each decision description. Each
researcher individually assigned a concern to every sentence of a decision’s
description. Next, we reviewed every sentence and compared the assigned
concerns. We counted an agreement when both researchers meant the same
thing but used different words. For example, if one researcher assigned the
concern cost to a fragment and the other assigned affordability, that counted as an agreement. Upon
disagreements, we either agreed to use an existing concern from one of us, or
we negotiated the assignment of a new concern. For a few sentences, we asked
a third researcher to mediate. Initial inter-rater agreement was 51.8%. After
negotiations, we achieved full agreement, as both researchers agreed with the
assigned concerns. Table 5.5 shows an example of concerns, as agreed by both
researchers. The decision fragment uses the same concern multiple times (e.g.
cost appears twice). However, the number of concerns counts only distinct
concerns: cost, usability, and extra hardware.
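The 51.8% figure corresponds to simple percent agreement. The sketch below uses hypothetical codings, with synonyms such as ‘affordability’ assumed to be already mapped to ‘cost’:

```python
def percent_agreement(coder_a, coder_b):
    """Share of sentences for which both researchers assigned the same concern."""
    matches = sum(1 for x, y in zip(coder_a, coder_b) if x == y)
    return 100.0 * matches / len(coder_a)

# Hypothetical codings of six sentences by two researchers.
a = ["cost", "usability", "cost", "extra hardware", "usability", "cost"]
b = ["cost", "usability", "security", "extra hardware", "cost", "cost"]
print(round(percent_agreement(a, b), 1))  # -> 66.7
```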
Table 5.5. Fragment of a decision description from an architectural document
(agreed concerns are in parentheses). The decision topic was the user interface
for the architected system (HPS).

Alternative: Web interface
  Advantages: No additional cost to the HPS (cost)
  Disadvantages: Not everyone can work with a web interface (usability);
  users need a local area network (extra hardware); HPS needs a LAN adapter
  (extra hardware)
Alternative: Screen
  Advantages: Could be more intuitive (usability)
  Disadvantages: Requires a screen, additional hardware (extra hardware);
  increases the price of the HPS, depending on the screen size and quality (cost)
The third metric consisted of the ratio of explicit rankings versus the
maximum number of possible rankings. We multiplied the number of explicit
alternatives and the number of distinct concerns to obtain the maximum
number of possible rankings. For example, the fragment of decision in Table
5.5 has two alternatives and three distinct concerns. Therefore, the decision
fragment has maximum six possible rankings. The two alternatives are ranked
against all concerns, so the ratio metric for this example is one.
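The three metrics for this fragment can be recomputed mechanically. The sketch below encodes the Table 5.5 fragment as a mapping from alternatives to the concerns they are ranked against:

```python
# Decision fragment from Table 5.5: alternative -> concerns addressed.
decision = {
    "Web interface": ["cost", "usability", "extra hardware", "extra hardware"],
    "Screen":        ["usability", "extra hardware", "cost"],
}

ma = len(decision)                                     # explicit alternatives
concerns = {c for cs in decision.values() for c in cs}
mb = len(concerns)                                     # distinct concerns
explicit = sum(len(set(cs)) for cs in decision.values())
mc = explicit / (ma * mb)                              # ratio of explicit rankings

print(ma, mb, mc)  # -> 2 3 1.0
```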
Decision Metrics in a Grid. Table 5.6 shows a fragment of a grid created by a
student. The fragment shows three alternatives and three distinct concerns. The
concerns in the grid are expressed as pairs of contrasting poles. Rankings
range from ‘-2’ to ‘2’ (as described in the survey implementation section).
Similarly to the architectural documents, we performed content analysis on
each concern elicited with the Repertory Grid technique. The resulting agreed
concerns are displayed in the rightmost column of Table 5.6. For the above
example, the ratio of explicit rankings is one, as each alternative is ranked
against each concern.
Table 5.6. Fragment of a grid, describing a decision (the three rankings in
each row rate the grid’s three alternatives).

Pole of Grid Concern | Rankings | Opposite Pole of Grid Concern | Agreed Concern
simple code, as in written in a single language | 0, 1, 1 | complex code, as in different languages for different parts | Implementability
difficult to interact with, as in tedious to input information | 2, 2, -1 | easy to interact with, as in easy to enter information | Usability
obtrusive, as in unclear what options are available, what the presented information means | 2, -1, -1 | intuitive, as in information/options are clearly presented, structured | Usability
difficult to maintain, as in difficult to change a module without influencing another component | 2, -1, -1 | easy to maintain, as in easy to change some part without breaking another part | Maintainability
5.4.6.2. Analyzing Metrics for All Decisions
As participants freely chose the two decision topics, some students from the
same course project group used the Repertory Grid technique on identical
decision topics. We obtained twelve grids for which only one student from a
course project group addressed that decision topic (single grids). Additionally,
we obtained seven double grids: two participants from the same group
produced individually two grids on the same decision. Also, we got two triple
grids, from three members of the same group capturing the same decision.
Similarly, we obtained one quadruple grid, from four members of the same
group.
For data analysis, we consider a data point to be the three metric pairs (Ma,
Mb, Mc) for the same decision, as captured in an architectural document and in
a related grid. To ensure a suitable comparison of the metrics, the grid must have been produced by a
student who also co-authored the architectural document. Our raw data
consisted of twelve data points of single grids, seven data points of double
grids, two data points of triple grids, and one data point of quadruple grids.
To analyze the data, we needed to filter outlier data points. We noticed that
one data point with double grids and the one with quadruple grids had poor
decision descriptions in the architectural document, i.e. no alternatives were
considered. Therefore, we eliminated these two data points. In one of the triple
grids, we eliminated a grid due to poorly phrased concerns, obtaining a new
data point with double grids. Similarly, we converted the other data point with
triple grids into a new one with a single grid, by removing two poor quality
grids. After filtering, we obtained 13 single-grid data points, and seven double-
grid data points. The numbers of alternatives and concerns in a double-grid
data point are calculated by counting the distinct alternatives and concerns
from the two grids.
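Counting distinct alternatives and concerns across a double-grid data point amounts to a set union. The sketch below uses hypothetical grid contents, with Go as a made-up alternative:

```python
def combined_counts(grid_a, grid_b):
    """Distinct alternatives and concerns across two grids on the same decision."""
    alternatives = set(grid_a["alternatives"]) | set(grid_b["alternatives"])
    concerns = set(grid_a["concerns"]) | set(grid_b["concerns"])
    return len(alternatives), len(concerns)

g1 = {"alternatives": {"Java", "Python", "C#"},
      "concerns": {"cost", "usability"}}
g2 = {"alternatives": {"Python", "C#", "Go"},
      "concerns": {"cost", "maintainability"}}
print(combined_counts(g1, g2))  # -> (4, 3)
```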
We use boxplots in Figure 5.5 to summarize the collected metrics. The median
of the numbers of alternatives obtained with Repertory Grid technique is four
for single-grid data points, and six for double-grid data points. Half of the
numbers of concerns in students’ reports (architectural documents), for single-
grid data points, were equal to five, six, or seven. All ratios for grids are equal
to one.
[Boxplots: numbers of Alternatives, numbers of Concerns, and Ratio of
Rankings (×10), for Grids versus Reports, shown separately for single-grid
and double-grid data points.]
Figure 5.5. Boxplots for each metric, for the two types of data points.
Next, we compare the means of each metric, for the decisions in grids and
architectural documents, by using the Wilcoxon signed ranks test. We test each
hypothesis on the single-grid data points and on the double-grid ones. Table 5.7
summarizes the results, for the numbers (#) of alternatives, concerns, and
ratios, of both grids (G) and reports (R).
Table 5.7. Hypotheses, metrics, means, standard deviations, and p-values for
the two samples (G = grids, R = reports).

                      13 Single-grid Data Points | 7 Double-grid Data Points
H   Metric            Mean  Std. Dev.  p-value   | Mean  Std. Dev.  p-value
Ha  # Alternatives G  4.00  0.41       0.002     | 6.14  0.90       0.016
    # Alternatives R  2.62  0.77                 | 2.71  0.49
Hb  # Concerns G      6.00  1.41       0.720     | 9.57  0.79       0.017
    # Concerns R      6.23  1.88                 | 6.14  1.86
Hc  Ratio G           1.00  0.00       0.003     | 1.00  0.00       0.018
    Ratio R           0.66  0.19                 | 0.70  0.13
Given the p-values below 0.05 and the mean values, we can reject H0a, and
accept that applying the Repertory Grid technique increased the number of
explicit alternatives, for both samples. Regarding H0b, we cannot find any
influence of the Repertory Grid technique, for single grids, due to the high p-
value. However, we can reject H0b for the sample with double grids (p=0.017).
Additionally, we reject the third null hypothesis, as the p-values are low.
The results strongly suggest that one grid contains more explicit alternatives
and a higher ratio of explicit rankings than the equivalent description from a
basic architectural document. Two grids seem to contain not only more
explicit alternatives and rankings, but also more concerns.
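For readers who want to reproduce this kind of comparison, the core of the Wilcoxon signed-rank test is the W+ statistic over paired differences. Below is a plain-Python sketch applied to hypothetical paired ratios, not the study's actual data:

```python
def signed_rank_wplus(x, y):
    """W+ of the Wilcoxon signed-rank test: the sum of the ranks of positive
    paired differences (zero differences dropped, tied magnitudes get the
    average rank of their run)."""
    d = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average rank over a tie run
        i = j + 1
    return sum(r for r, di in zip(ranks, d) if di > 0)

# Hypothetical ranking ratios for five decisions: grid vs. report.
grid_ratios = [1.0, 1.0, 1.0, 1.0, 1.0]
report_ratios = [0.5, 0.8, 1.0, 0.6, 0.7]
print(signed_rank_wplus(grid_ratios, report_ratios))  # -> 10.0
```

In practice a statistics package derives the p-value from W+ under the null distribution; a W+ far from its expected value yields small p-values such as those reported in Table 5.7.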
5.4.6.3. Post Questionnaires
From the post questionnaires, we learned that participants had a bachelor’s
degree in Computer Science or a related field (e.g. Information Technology).
Half of the participants had work experience in the software industry, of
around two years on average. Students needed an average of 24 minutes for each grid
session, with a standard deviation of eleven minutes.
Table 5.8. Ratings of statements in the post questionnaire, from 0 (strongly
disagree) to 4 (strongly agree).

Id  Statement                                                          Average  Std. Dev.
1   I understood the theoretical part on the Repertory Grid technique  3.35     0.59
2   The directions for the assignment were clear                       3.10     0.79
3   The software tool was difficult to use                             0.95     0.78
4   The example grid helped me understand how to do the assignment     3.15     0.59
5   The grid on the first decision topic was easy to do                2.75     0.85
6   The grid on the second decision topic was easy to do               2.17     1.04
7   I had a clear idea on what I had to do in the assignment           2.95     0.51
8   I enjoyed doing the assignment                                     2.50     0.61
We asked students to rate some statements on a Likert scale from zero to four
(strongly disagree, disagree, neutral, agree, strongly agree), to understand their
perceptions about the study. Table 5.8 presents the statements and their
ratings. We learned that participants understood the presentation on the
Repertory Grid technique. Statement two indicates students also understood
the directions for creating the grids. Also, they perceived the Idiogrid tool
(Grice, 2002) as easy to use. Statement four suggests that the example grid
helped students do the two grids. Students perceived the first grid as easier to
do than the second one (the average of statement five is closest to agree, while
that of statement six is closest to neutral). A possible explanation for the difference
may be that students applied Repertory Grid technique on the most familiar
topic, followed by the less familiar one. Participants agreed on having a clear
idea about their tasks. Also, they partly enjoyed the assignment.
Overall, the post questionnaire results suggest that participants did not face significant issues in using the Repertory Grid technique in a self-administered manner, after the short introduction to it. We consider that the smooth learning curve and low time cost may facilitate adoption of the Repertory Grid technique by practitioners.
5.4.7 Discussion
The Repertory Grid technique tends to elicit a higher number of alternatives than basic architectural documents, which usually mention two or three alternatives. Additionally, the Repertory Grid technique delivers 100% of the possible rankings for a decision, due to its systematic steps for capturing decisions. In contrast, architectural documents seem to make explicit only around 70% of rankings, as participants used no systematic technique for capturing decisions.
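The two percentages above can be read as values of a simple completeness metric: the fraction of the n x m possible rankings that are made explicit. A minimal sketch, where the function name and the example counts are ours, for illustration only:

```python
def ranking_completeness(explicit_rankings, n_alternatives, n_concerns):
    """Fraction of possible rankings made explicit for one decision.

    A decision with n alternatives and m concerns admits n * m
    rankings; a full repertory grid always provides all of them.
    """
    possible = n_alternatives * n_concerns
    return explicit_rankings / possible

# A grid session rates every alternative against every concern:
print(ranking_completeness(30, 6, 5))   # -> 1.0 (100%)
# An architectural document may leave some pairs implicit:
print(ranking_completeness(21, 6, 5))   # -> 0.7 (around 70%)
```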
On average, one grid contains around six concerns, similar to decisions from the architectural documents. However, double grids contain around nine concerns. Combining the concerns elicited by two out of the five members of a team significantly increases the number of explicit concerns. Therefore, we believe the Repertory Grid technique may be useful for architectural reviews, to help uncover more concerns from stakeholders.
We found that participants spent an average of 24 minutes to capture a decision with the Repertory Grid technique. In our previous study in Section 5.3, participants captured a decision in 57 minutes, on average. We believe the difference is mainly due to the approach for eliciting concerns. As reported by Grice et al. (Grice et al., 2004), we also noticed that sentence completion seems to be more user-friendly than triadic elicitation (Fransella et al., 2004). However, we speculate that the triadic elicitation approach asks the expert to reflect more, with the potential to unearth more in-depth tacit architectural knowledge than the sentence completion approach. We consider both approaches valid and useful, as they have complementary qualities.
5.4.8 Validity Threats
Edwards et al. (Edwards et al., 2009) offer criteria (e.g. element selection, or elicitation of constructs) for evaluating a study that uses the Repertory Grid technique. According to these criteria, our study uses full individual repertory grids, as both elements (alternatives) and constructs (concerns) are elicited from each participant. This is especially well-suited for exploratory situations, like capturing tacit knowledge. However, for construct elicitation we decided to use a lesser-known, though more user-friendly, approach: sentence completion (Grice et al., 2004). To address the risks of using a less-established approach, we piloted it before the study, to make sure that sentence completion provides useful outputs.
Regarding the external validity of our study, we consider the study participants representative of inexperienced software architects, similar to (Carver et al., 2009). Half of the graduate students had around two years of working experience. However, we cannot generalize our results to experienced architects. Moreover, we do not know if the decision descriptions in the architectural documents from students are representative of the industry. From our experience, we speculate that most companies use less systematic approaches for decisions than the basic one from the students' documentation.
Concerning internal validity, the main issue is the history of the decision descriptions from the architectural documents. Students created the descriptions as team work, and the grids as individual work. We partly addressed this risk by dividing the grids based on the number of students in the same group who worked on the same decision topic. Additionally, the structure of the basic decision template influences the resulting decision descriptions. Different templates may result in different metrics. Therefore, we cannot make a strong claim that the Repertory Grid technique provides better results than any type of decision documentation, due to this template dependency. However, students repeatedly refined their decisions' descriptions in the architectural documents as part of the course. In contrast, the students did not have time to refine their grids.
The main construct validity issue concerns the metrics we defined for operationalizing architectural knowledge vaporization. There is a risk that our metrics insufficiently capture architectural knowledge vaporization. For example, given that decisions are intertwined (Bosch, 2004), relationships among decisions may be more relevant than our chosen metrics. To partially address this issue, we used an established model of architectural knowledge (de Boer et al., 2007) for selecting our metrics.
5.5 Conclusions
In this chapter, we explored using an approach from the knowledge engineering field (i.e. the Repertory Grid technique) for capturing architectural knowledge. We used the approach in two studies. In the first study, we identified advantages and disadvantages of the technique. Its reasoning support and readability make it a promising approach for assisting in making and capturing architectural decisions. Its main drawback, the required effort, is shared by other approaches as well.
In the second study, we investigated how the Repertory Grid technique compares to a basic approach for documenting architectural decisions. Specifically, we analysed metrics on important parts of a decision: alternatives, concerns, and rankings. Also, content analysis of the architectural documentation provided measurements for the descriptions of decisions. We learned that the Repertory Grid technique captures more alternatives and more rankings than architectural documents. Overall, based on the two studies, the Repertory Grid technique shows high potential for capturing tacit architectural knowledge and reducing architectural knowledge vaporization.
In Chapter 6, we use the Repertory Grid technique with practitioners on real-
world architectural decisions, and refine the Repertory Grid technique using
feedback from practitioners. In Chapter 7 we extend the Repertory Grid
technique to help group architectural decision making. In Chapter 8, we
present tool support for the Repertory Grid technique.
Acknowledgments
We thank study participants, Tim Menzies, David Ameller, Uwe van Heesch
and Pavel Bulanov for their help.
Chapter 6
Improve Individual Architectural Decisions
Published as: Tofan, D., Avgeriou, P., and Galster, M., Validating and
Improving a Knowledge Acquisition Approach for Architectural Decisions.
International Journal of Software Engineering and Knowledge Engineering
24, 04 (2014), 553-589.
In this chapter, we expand the work in Chapter 5 by proposing an approach
(REGAIN) based on the Repertory Grid technique, for capturing architectural
decisions made by individual architects. We also present a study to ensure that
REGAIN meets the needs of industrial architects. We interviewed sixteen
architects, who indicated advantages of REGAIN, such as systematic decision-making support. The architects also indicated improvement opportunities, in particular tool support and the possibility to prioritize the concerns used to evaluate decision alternatives. To address the need for prioritization, we conducted an additional study to evaluate two approaches for prioritizing concerns: pairwise comparisons and the hundred-dollar approach. We
conducted an experiment with thirty graduate students to compare the two
prioritization approaches. Based on the results of the experiment, we added
the hundred-dollar approach to REGAIN.
6.1 Introduction
In Chapter 5, we argued that existing approaches in software architecture overlook the challenge of capturing tacit architectural knowledge. Overlooking this challenge limits the practical usefulness of approaches for capturing architectural knowledge. To address it, we draw inspiration from the knowledge engineering discipline: this discipline focuses on knowledge-related approaches, so it may offer ideas with high potential for tackling the problem of capturing tacit architectural knowledge.
One such idea, which we presented in Chapter 5, is to use the Repertory Grid
technique - a powerful interviewing technique for tacit knowledge acquisition
(Boose, 1984; Gaines and Shaw, 1993). In Chapter 5, we proposed an early
version of an approach which uses the Repertory Grid technique for capturing
architectural decisions. In this chapter, we call this approach REGAIN
(REpertory Grid for capturing ArchItectural decisioNs), and we present its
details in Section 6.2.
The version of REGAIN in Chapter 5 lacked industrial validation. Therefore,
this chapter investigates and improves REGAIN in industrial practice, to
ensure that REGAIN meets the needs of industrial architects for capturing tacit
architectural knowledge. In detail, this chapter makes two main contributions:
1. It reports feedback from practitioners on using REGAIN for capturing
real-world architectural decisions
2. It presents an improvement of REGAIN by adding a prioritization
approach that has been selected based on empirical data from an
experiment.
These contributions help address the challenge of capturing tacit architectural
knowledge in industrial practice, and encourage using ideas from the
knowledge engineering discipline to tackle the problem of architectural
knowledge vaporization.
Figure 6.1 shows the three research phases that we followed, and the
corresponding sections of this chapter where the phases are presented. Phase 1
(in Section 6.2) presents the initial REGAIN approach, background
information on the Repertory Grid technique, and a summary of two previous
preliminary evaluations from Chapter 5 of the initial REGAIN approach.
[Figure omitted: process diagram. Phase 1 (Section 6.2, Chapter 5) comprises the first and second REGAIN evaluations, yielding the initial REGAIN approach. Phase 2 (Section 6.3) establishes industrial applicability and the need for prioritization. Phase 3 (Section 6.4) investigates prioritization approaches, yielding the improved REGAIN approach.]
Figure 6.1. Chapter structure: Phase 1 was reported in Chapter 5, Phases 2 and 3
are reported in this chapter.
Phase 2 (summarized in Section 6.3) covers REGAIN's industrial applicability, investigated through an interview study with practitioners, in which they used the initial REGAIN approach and offered feedback about it. In Phase 3 (Section 6.4), we improve the REGAIN approach by addressing two issues raised in Phase 2: the need for prioritizing concerns and tool support for REGAIN. We discuss validity threats in Section 6.5 and related work in Section 6.6. Conclusions and future work are presented in Section 6.7.
6.2 Phase 1 – Initial REGAIN Approach
In this section, we present REGAIN and a short summary of our evaluations of
the Repertory Grid technique from Chapter 5.
6.2.1 Theoretical Foundations for REGAIN
REGAIN is based on two major theoretical foundations: the core model of
architectural knowledge in (de Boer et al., 2007), and the Repertory Grid
Technique (presented in Chapter 5).
In Chapter 5, we mapped concepts from the core model of architectural knowledge (de Boer et al., 2007) to concepts from the Repertory Grid technique. In Figure 6.2, we update the mapping for REGAIN. Compared to the original core model, we renamed the rank action to the evaluate action, and the ranking element to the rating element. When making an architectural decision, there is a topic, n alternatives, m concerns, and n x m ratings of alternatives against concerns. For example, a decision on the topic of choosing a database can include three databases (such as MySQL, MS SQL, and PostgreSQL), five concerns (such as scalability, availability of documentation, performance, familiarity, and costs), and fifteen ratings of each database against each concern (e.g. using a five-point scale). The actual decision is the alternative that best satisfies the concerns. The rationale for the decision is captured in the evaluations of alternatives against concerns (de Boer et al., 2007) (i.e. the ratings indicate which alternative best satisfies the concerns).
[Figure omitted: concept diagram. A decision Topic has n proposed Alternatives; the architect evaluates the n alternatives against m Concerns, producing n x m Ratings; the Decision, a subclass of Alternative, is chosen based on the ratings.]
Figure 6.2. REGAIN uses these concepts from the core model in (de Boer et al.,
2007).
The concepts in the core model in (de Boer et al., 2007) have equivalent
concepts in other established models for architectural knowledge: the IEEE
1471-2000 standard (1471-2000, 2000), Tyree’s template (Tyree and
Akerman, 2005), and Kruchten’s ontology (Kruchten, 2004). The equivalency
among concepts is detailed in (de Boer et al., 2007); we offer three examples below.
1. The concept of concern in the core model (de Boer et al., 2007) is
equivalent to the concepts of requirement and risk in Kruchten’s
ontology (Kruchten, 2004), and to the concepts of assumption and
constraint in Tyree’s template (Tyree and Akerman, 2005).
2. Since architecting is an iterative activity, a decision loop appears:
starting from an initial decision topic, the architect makes an
architectural decision, which, in turn, may uncover new concerns and
decision topics, for which subsequent related decisions need to be
made. The decision loop is equivalent to the relationships among
decisions in Kruchten’s ontology (Kruchten, 2004). For example, the
‘must use J2EE’ decision introduces a constraint relationship with the
‘use the GlassFish application server’ decision in Kruchten’s
ontology (Kruchten, 2004), while in the core model (de Boer et al.,
2007) the ‘must use J2EE’ decision introduces the ‘which application
server to use?’ decision topic, for which GlassFish is one of the
considered alternatives. Overall, the decision loop enables traceability
among decisions.
3. The decision rationale concept in the IEEE 1471-2000 standard
(1471-2000, 2000) is equivalent to the evaluations of alternatives
against concerns in the core model in (de Boer et al., 2007). The
evaluations explain why the architect selected a certain alternative.
6.2.2 The REGAIN Approach
The REGAIN approach consists of the following five steps.
In step one of REGAIN, the architect identifies the architectural decision topic
(e.g. choice of a framework for developing a new web application).
In step two, the architect generates a list of alternatives relevant to the decision topic (e.g. a list of web frameworks). Alternatives may be reused from previous similar decisions. Optionally, a hypothetical ideal alternative may be included (e.g. an ideal web framework). The ideal alternative represents the most preferred rating for each concern in an 'ideal' world, without any tradeoffs. Comparing alternatives against the ideal alternative provides a basic utility function: alternatives with ratings more similar to the ideal alternative offer more utility than alternatives with less similar ratings.
In step three, the architect provides concerns that are used to evaluate the alternatives identified in step two. The architect can reuse concerns from previous decisions (e.g. through keyword-based suggestions offered by tool support). Otherwise, the architect provides concerns using triadic elicitation. As mentioned earlier, triadic elicitation means that the architect selects three random alternatives and then considers how two of them are similar, but different from the third. For example, for the decision topic on a Python web framework, an architect might consider Web2py and Zope similar in terms of popularity (i.e. both are used by fewer popular existing websites) and Django different with regard to popularity (i.e. there are more popular existing websites implemented with Django). Thus, the architect captures popularity as a concern in terms of a contrast (fewer vs. more popular existing websites). Repeated comparisons among other triads of alternatives produce more contrasts to express the concerns.
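The triad-selection loop behind triadic elicitation can be sketched as follows. The function name, the seed parameter, and the two extra framework names (Flask, Pyramid) are our illustrative assumptions, not part of the study:

```python
import random

def next_triad(alternatives, seed=None):
    """Pick three distinct random alternatives for one elicitation step.

    The architect is then asked: in what way are two of these alike,
    and different from the third? The answer becomes a bipolar concern
    (e.g. fewer vs. more popular existing websites).
    """
    rng = random.Random(seed)
    return rng.sample(alternatives, 3)

frameworks = ["Django", "Web2py", "Zope", "Flask", "Pyramid"]
triad = next_triad(frameworks, seed=1)
print("How are two of", triad, "similar, yet different from the third?")
```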
For eliciting concerns, in Chapter 5 we explored two approaches: sentence completion and triadic elicitation. For REGAIN, we chose the triadic elicitation approach, since it is better suited than sentence completion for making tacit knowledge explicit in a systematic manner. Jankowicz (Jankowicz, 2001) indicates three reasons for this. First, triadic elicitation encourages expressing concerns as concrete contrasts (or poles) between the alternatives, instead of mere labels (e.g. popularity). Second, triadic elicitation encourages describing contrasts in a precise and operational manner. For example, an architect may interpret popularity in terms of how many search results are returned for the names of the frameworks, so the architect should express this definition of popularity as a contrast and indicate as precisely as possible the values for the left and right poles of the contrast. Third, triadic elicitation encourages making explicit subjective concerns, such as architects' intuitions or gut feelings about the alternatives.
In step four, the architect judges each alternative against every concern and captures the judgment in a rating. A typical rating scale uses integers from one to five, to help the architect indicate how well an alternative satisfies a concern (e.g. assigning one to an alternative that is less popular, or five to one that is very popular).
In step five, the architect analyzes the resulting grid, using content and structure analysis (Jankowicz, 2003). Content analysis refers to evaluating the decision topic (e.g. is this topic relevant?), alternatives (e.g. should other alternatives be added to the grid?), concerns (e.g. can some concerns be added or removed?), and ratings (e.g. are there missing ratings, or do ratings need to be changed?). The content analysis helps the architect understand if all components (e.g. alternatives, concerns) of the decision were included in the grid. Structure analysis of a grid uses cluster analysis (Shaw, 1980) and principal components analysis (Jankowicz, 2003). Both involve statistical operations for analyzing relationships between decision alternatives and concerns, which helps the architect make the decision. The result of the cluster analysis is part of the output of REGAIN.
6.2.3 REGAIN Output
The output of REGAIN consists of two main artifacts. First, the grid (or matrix) contains the alternatives, concerns, and ratings. Second, two dendrograms (i.e. trees showing similarity relationships between items) are produced by the cluster analysis of alternatives and concerns. The cluster analysis uses a nearest-neighbor hierarchical clustering algorithm with the city-block distance (Shaw, 1980).
Figure 6.3 presents the grid and dendrograms (i.e. clusters) for an architectural decision collected in Chapter 5. The decision topic was the choice of a technology for data visualization. The REGAIN session produced seven alternatives (lower part of Figure 6.3) and ten concerns (upper part of Figure 6.3). As part of the analysis, alternatives and concerns are reordered and grouped based on their similarity. The level of similarity is calculated using the distances between ratings. Alternatives and concerns are grouped into clusters based on their similarity levels. A similarity level of 100% between two alternatives means that the two alternatives have the same ratings for all concerns. Based on the city-block distance, in Figure 6.3, SVG (i.e. Scalable Vector Graphics) is the most similar alternative to the ideal technology (around 75%, the highest similarity level between any pair of alternatives). For the architectural decision in this grid, the practitioner chose the SVG alternative, the closest alternative to the ideal technology.
[Figure omitted: grid with alternatives, concerns, ratings, similarity levels, and the clusters (dendrograms). Ratings legend: 1 – strongly agree with left pole; 2 – agree with left pole; 3 – neutral; 4 – agree with right pole; 5 – strongly agree with right pole.]
Figure 6.3. Grid of an architectural decision and the clusters (or dendrograms)
for alternatives and concerns.
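The similarity levels in the dendrograms derive from the city-block distance between rating vectors. A minimal sketch, assuming a five-point scale and one common normalisation (distance divided by the maximum possible distance); the rating values below are illustrative, not the actual grid values:

```python
def cityblock(a, b):
    """City-block (Manhattan) distance between two rating vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def similarity(a, b, scale_range=4):
    """Percentage similarity between two alternatives rated on the
    same concerns. scale_range is max - min of the rating scale
    (4 for a 1-to-5 scale); 100% means identical ratings."""
    max_distance = scale_range * len(a)
    return 100.0 * (1 - cityblock(a, b) / max_distance)

ideal = [5, 5, 5, 5, 5]        # hypothetical ideal alternative
svg   = [4, 5, 4, 4, 5]        # illustrative ratings, not from the study
print(round(similarity(ideal, svg), 1))   # -> 85.0
```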
6.2.4 Initial REGAIN Evaluations
We reported two studies with evaluations of the initial REGAIN approach (i.e.
the Repertory Grid technique) in Chapter 5, as shown in Figure 6.1. The first
study in Chapter 5 aimed at exploring the feasibility of the initial REGAIN approach. To this end, we conducted an exploratory study with seven graduate and undergraduate students, who used REGAIN to capture architectural decisions from their previous academic and industrial projects. After the students had captured decisions, we interviewed them to obtain feedback on the advantages and disadvantages of REGAIN. We found that the approach helps students make architectural decisions in a systematic manner, and that it encourages them to reflect on their decisions. Also, the approach produces concise documentation of architectural decisions, which reduces architectural knowledge vaporization.
In the second study from Chapter 5, we investigated if the REGAIN approach
reduces architectural knowledge vaporization. In the study, we asked twenty
graduate students to capture architectural decisions with the initial REGAIN
approach (as in the first study). Next, we compared the output of
capturing decisions with REGAIN with the architectural documentation on the
same decisions that students created as part of a software architecture course
project. For the course project, students received a template for documenting
the architecture of a software system. The template included sections for
documenting architectural decisions, to indicate decision topic and rationale.
Although students spent much time on the course project's architectural documentation, we found that REGAIN was more efficient (i.e. it took less time) and more effective (i.e. it captured more architectural knowledge, in terms of alternatives, concerns, and evaluations of alternatives against concerns) than documentation based on a template for architectural decisions. One explanation is that REGAIN builds on the minimalist core model of architectural knowledge in (de Boer et al., 2007) and offers a structured approach with clear steps, which encourages architects to capture the most relevant architectural knowledge (e.g. alternatives, concerns). Overall, the study results confirmed that REGAIN reduces architectural knowledge vaporization in an efficient manner.
The results from the two evaluations with students in Chapter 5 laid the
groundwork to proceed to the next research phase: investigating the industrial
applicability of REGAIN.
6.3 Phase 2 – Investigate Industrial Applicability of
REGAIN
6.3.1 Research Method, Data Collection and Analysis
The research goal of Phase 2 was to obtain feedback on REGAIN from industry, for the purpose of understanding its advantages, disadvantages, and potential improvements, from the viewpoint of practitioners (i.e. architects in industry), in the context of architects' efforts to capture architectural decisions. To achieve this goal, we formulated the following research questions.
RQ1. What are the advantages and disadvantages of REGAIN?
RQ2. What are the improvement opportunities for REGAIN?
To answer these research questions, we conducted an interview study with architects from industry. We selected architects with a wide range of experience levels, which is representative of the situation in industry, where architects have various experience levels. We contacted architects in our network and architects from open source projects (e.g. the SOFA Statistics project at www.sofastatistics.com). Upon a positive response, we conducted individual interview sessions with them, to capture their architectural decisions with REGAIN.
Each session typically took two hours, and followed three steps:
1. We presented the REGAIN approach, including examples of grids,
and we answered questions from the practitioners.
2. We asked each practitioner to use the REGAIN approach to capture
one to three architectural decisions that they had made in their
recent work. We did not ask architects to make new decisions. As tool
support for REGAIN, we used a web tool called WebGrid (Gaines and
Shaw, 2007).
3. Participants filled out a post-questionnaire, and offered additional
feedback in semi-structured interviews.
Regarding ethical concerns, we asked practitioners to omit confidential
information from the sessions. We indicated that results were to be used only
for research purposes and that participants could withdraw at any time from
the sessions or refuse to answer questions. We made audio recordings of the
sessions, with the permission of each practitioner. Most sessions were
conducted face to face, except for two sessions that took place via Skype.
Overall, we interviewed 16 practitioners from 14 different organizations. The
Appendix for Chapter 6 has details on the practitioners and the sessions.
Overall, practitioners had an average of nine years of experience in the
software engineering industry. In total, practitioners used the REGAIN
approach to capture 24 architectural decisions that the practitioners made in
their industry projects from various application domains. As part of the
sessions, we showed practitioners the REGAIN output for their decisions. This way, practitioners could get a clear idea of what to expect from using the REGAIN approach on their real-world decisions, and were able to offer feedback on the approach.
To answer RQ1 and RQ2, we analyzed the answers from the post-
questionnaire and the transcripts of the audio recordings from the interviews.
We used descriptive statistics on the answers in the post-questionnaire. Two of
the authors of this study performed content analysis on the transcripts and
assigned one of the following three codes to pieces of content: advantage
(RQ1), disadvantage (RQ1), or improvement opportunity (RQ2) for the
REGAIN approach.
6.3.2 RQ1 – REGAIN Advantages and Disadvantages
6.3.2.1. Post-questionnaire Analysis
Table 6.1 shows the statements in the post-questionnaire. Statements ID1 to ID5 relate to advantages of REGAIN, in terms of helping communicate the rationale of decisions (ID1 and ID5), offering a structured explanation and documentation of the rationale (ID2, ID3), and supporting structured architectural decision-making (ID4). Statement ID6 allowed architects to indicate the effort required for using REGAIN. Architects indicated their level of agreement with each statement using 1 (strongly disagree), 2 (disagree), 3 (neutral), 4 (agree), or 5 (strongly agree). We notice that architects agreed with the statements about communicating rationale (ID1 and ID5), and about the structured explanation (ID2), documentation (ID3), and making (ID4) of architectural decisions. Finally, practitioners perceived the spent effort as reasonable (ID6).
In Table 6.1, we compare the answers of the industry architects with the
answers from the students in a previous study in Chapter 5, in which students
filled out the same post-questionnaire, after using the REGAIN approach.
Table 6.1. Average and standard deviation of the ratings for each statement, for architects (A) and students (S). Diff shows the architect-student difference for the average and (in parentheses) the standard deviation; p is the p-value of the Mann-Whitney U test.

ID  Statement                                        A: Avg (Std)  S: Avg (Std)  Diff          p
 1  The output of the session helps explain to a     4.29 (0.59)   4.14 (1.07)   0.15 (-0.48)  97%
    colleague why I took a certain decision.
 2  The session provides a structured way of         3.86 (0.74)   4.14 (0.38)   -0.28 (0.36)  33.6%
    explaining architectural decisions.
 3  The session provides a structured way of         4.07 (0.59)   3.57 (0.53)   0.50 (0.06)   8.4%
    documenting architectural decisions.
 4  The session provides a structured way of         3.64 (0.61)   4.00 (1.29)   -0.36 (-0.68) 38.6%
    making architectural decisions.
 5  When trying to understand why another            4.00 (0.76)   4.29 (0.49)   -0.29 (0.27)  44.6%
    architect took a certain decision, I could
    use such an output.
 6  I found the session too tiring for the output.   2.21 (1.08)   2.43 (1.13)   -0.22 (-0.05) 64.2%
We notice that there are only small differences in averages and standard deviations between the answers from students and the answers from industry architects (i.e. the Diff column in Table 6.1). We investigate the differences further, by checking whether there are any statistically significant differences between students and industry architects. The two samples (i.e. students and practitioners) are independent of each other, as they consist of different persons. We cannot assume that the samples are drawn from a normal distribution, so we use a non-parametric test: the Mann-Whitney U test. The null hypotheses are that there are no differences between the two samples on statements ID1 to ID6. We used SPSS to apply the Mann-Whitney U test, and obtained the p-values in Table 6.1. Since all values are higher than 5%, we cannot reject any of the null hypotheses. Therefore, we consider it safe to conclude that there are no statistically significant differences between the answers from students and the answers from practitioners on the six statements in Table 6.1.
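The test can be reproduced with standard tools; the sketch below is a plain-Python approximation using mid-ranks and the normal approximation, without the tie correction that SPSS applies, so p-values are approximate for small samples. The function name and the example ratings are ours:

```python
from statistics import NormalDist

def mann_whitney_u(sample1, sample2):
    """Mann-Whitney U with a two-sided normal-approximation p-value.

    Ties receive mid-ranks; the tie correction to the variance is
    omitted, so results are approximate (unlike the exact SPSS output).
    """
    combined = sorted(sample1 + sample2)
    ranks = {}                      # value -> mid-rank
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1                  # j is one past the tie group
        ranks[combined[i]] = (i + 1 + j) / 2
        i = j
    r1 = sum(ranks[x] for x in sample1)
    n1, n2 = len(sample1), len(sample2)
    u = r1 - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (u - mean_u) / sd_u
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, p

# Hypothetical Likert ratings from architects and students:
u, p = mann_whitney_u([4, 5, 4, 4, 5, 4, 4], [4, 4, 5, 3, 4, 4, 4])
print(u, round(p, 3))
```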
Overall, the results of the post-questionnaire indicate two points. First, the
REGAIN approach provides value to practitioners, in terms of costs (i.e. not
too tiring) and benefits (i.e. systematic approach). Second, the REGAIN
approach provides value to both inexperienced architects (i.e. students) and
industry architects.
6.3.2.2. Transcripts Content Analysis
Following the content analysis of the interviews transcriptions, we identified
the following codes for advantages.
1. Decision-making support – This code refers to helping architects in
their decision-making. This code was indicated by six out of sixteen participants. One participant remarked: ‘now I have a more clear view
about the choices that have been made and the analysis provided a
more clear direction on where to move in the future.’ Another
participant said: ‘Without such a tool, I don't know how to distinguish
between the different possibilities, in a more evident way. This tool
provides evident results of what is really the ideal for this situation.’
2. Decision rationale – This code refers to offering rationale for the
architectural decisions, and it was indicated by five participants. One
participant said: ‘Also to explain your decision to others, I think it
would give a good impression, and easy to understand. If someone
sees it, maybe he doesn't know the project.’
3. Reasoning support – This code refers to the architects obtaining a
clearer perspective on their decision, and it was indicated by four
participants. One of the participants said: ‘from my personal view, now
I have a clearer view about choices I made.’ Another participant said:
‘The nice thing about this tool is it's very simple. If people need a lot
of time for criteria, that's because they don't have the criteria clear in
their own mind, that's not a drawback of the tool, it's offering you a
mirror about your own conceptions and criteria in your design. It
could be a painful experience for somebody who is being interviewed.
But that's not a drawback of the tool, it's exposing what you have in
your mind.’
150 6. Improve Individual Architectural Decisions
4. Systematic – This code refers to how well structured the approach is,
and it was indicated by two participants. One of the participants said:
‘I like the structured way of thinking about the decision, the questions
you asked: are there any additional issues? Are these similar? Are
there communalities? I like the approach very much, I think it's very
helpful, it also gives the person you ask a clear focus. It really helps
me think about how the decision was taken, what we did, what were
the most important constraints, it's something like a feedback so I can
reflect myself.’
Following the content analysis, we identified the following codes for
disadvantages.
1. Insufficient tool support – This code was indicated by five
participants, who complained about various issues with the tool that we
used during the sessions. For example, one participant said: ‘I think
the tool should have more workflow support, you know the next step,
but someone using it for the first time might not. The tool should be
more user-friendly.’ Another participant remarked: ‘Having some
open source decision making tool would be very helpful.’
2. Subjectivity – This code refers to the fact that REGAIN encourages
capturing subjective concerns about the alternatives. While useful for
making knowledge explicit, subjective concerns might have limited
reusability, since they express personal perspectives that might not be
applicable in different situations. This code was indicated by three
participants. One of the participants said: ‘my criteria might be
subjective, to my bias, my experience might not be complete.’ Another
participant commented: ‘the subjectiveness of the constructs is the big
problem for reusability.’
3. Effort – This code refers to the effort involved for using REGAIN,
and this code was indicated by three participants. For example, one
participant said: ‘people need some training of how this is working. I
suggest an e-learning thing to use it.’
An additional result to the answer for RQ1 emerged from the content analysis:
the application context of REGAIN. We obtained the following.
1. Project size – One participant found the approach useful for large
projects, rather than for small or medium-sized ones. Another participant
considered the approach useful especially for expensive decisions.
2. Domain – One participant considered the approach useful for
enterprise architectural decisions, in addition to software architectural
decisions.
3. Time – Regarding when to use the approach, one participant
suggested using the approach for ongoing projects, so that decision
makers can benefit from using the approach. Four participants
suggested using the approach multiple times during the decision
lifetime. One of the participants stated: ‘It's very nice to do this
repetitively, not only for the final choice. For example, after one or
two months to apply again these criteria to see where you are moving,
alter your changes or criteria, and do it again in two weeks or more,
to have an indication on where you are moving compared to your
initial goals.’ Two participants suggested using the approach for the
early architectural decisions. For example, one participant said: ‘It
can help a lot in decision making processes. This is very important at
the very initial stages of making choices. This is a very good thing to
do at the starting point of a project.’
There are similarities between the feedback from practitioners and the
feedback from students in our previous study in Chapter 5. Both practitioners
and students indicated the same advantages: systematic, insight, reflective, and
decision-making support. In addition, practitioners and students indicated the
same disadvantages: learning curve, tool support, and time consuming.
Moreover, practitioners and students indicated similar application context for
REGAIN: large projects and early architectural decisions. Overall, the findings
from the interviews with practitioners confirm our previous findings from
Chapter 5.
6.3.3 RQ2 – REGAIN Improvement Opportunities
Following the content analysis, we identified the following codes for REGAIN
improvement opportunities.
1. Concerns prioritization – This code refers to specifying priorities for
concerns. Ten participants indicated this code. One of them said:
‘Adding in the weights would make it even more precise. That would
make it very useful, this is extremely important to me, because in a
specific project weights can vary a lot.’
2. Group decisions – This code refers to using REGAIN for group
architectural decisions. Ten participants indicated this code. One of
them said: ‘I should find a way to do this together [with others], but
that will become a mess: sitting around one screen. Maybe you should
have an online version where each fills it in, then results are pulled,
you get an outcome like this, but averaged for the whole team. ’
3. Decision reuse – This code refers to reusing previous items from
REGAIN outputs (e.g. alternatives, concerns) for documenting new
architectural decisions. Five participants indicated this code. One of
them stated: ‘I think it would be easier to reuse it, because you know
what kind of criteria you must take into account when making such
decisions, and different projects might have few other criteria. That
way you can definitely reuse it: this kind of decisions are made in
every project, every java project: what kind of UI are we using? So,
some of these alternatives would be reusable.’
4. Interpretation assistance – This code refers to offering guidelines for
interpreting the output of cluster analysis, so that architects can
interpret the output with the help of the REGAIN tool, without help
from another person. Five participants indicated this code. One of
them stated: ‘perhaps there should be some idea about what the
differences, now it's between 64 and 70%: what does that mean? Is
that big or small? This can help the reader: if it's less than 10% it's
quite similar. Now, it doesn't say anything to me now.’
5. Confidence factors for ratings – This code refers to attaching
confidence factors to ratings, which would express the architect’s level
of confidence in each assigned rating (e.g. if the architect has a vague
idea that a certain alternative has high popularity, then the architect
assigns a rating and a low confidence score for the rating). Capturing
confidence factors helps the architect document his level of knowledge
about the various alternatives. In the study, four participants indicated
this code. One of them stated: ‘many criteria cannot be assessed from
the very beginning of a project's lifecycle; they might reveal
themselves at the testing phase. One could have a criterion and give a
rating according to estimations, but the real rating could be assigned
only when there are some test results in hand.’
6. Optional extra explanations – Two participants indicated this code,
which refers to having the possibility of adding optionally more details
to explain the meaning of some concerns or ratings. One participant
said: ‘What I meant was adding a bit more explanations about what
the two poles mean. Clicking on text gives more explanations. Maybe
give the user an option so that when scoring, to type why he chose that
number. I would like that purely optional, so that people don't need to
type a lot. Sometimes, some decisions need to be elaborated more. ’
7. Sensitivity analysis – This code refers to the possibility of analyzing
the impact on REGAIN output when changing ratings for a certain
concern. Two participants indicated this code. One of them said: ‘You
should do something on price. What is the influence on the concern?
Some price sensitivity thing would be useful.’
8. More types of rating scales – One participant indicated the need to
have more types of rating scales, depending on the nature of the
concerns.
9. Decision dependency – One participant indicated the need to consider
dependencies among architectural decisions, which refers to offering
support for the analysis of not only one architectural decision, but for
analyzing sets of decisions that depend on each other (e.g. choice of
web framework and operating system).
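The sensitivity analysis suggested above can be sketched as follows; the alternatives, concerns, ratings, and weights are all hypothetical, and serve only to show how the preferred alternative can change when one concern's priority is varied.

```python
# Hypothetical ratings of two alternatives per concern, on a 1-5 scale.
ratings = {
    "alternative_A": {"price": 2, "performance": 5, "maturity": 4},
    "alternative_B": {"price": 5, "performance": 3, "maturity": 3},
}

def weighted_score(alt, weights):
    # Simple weighted sum of the ratings for one alternative.
    return sum(weights[c] * r for c, r in ratings[alt].items())

base = {"price": 1.0, "performance": 1.0, "maturity": 1.0}
# Vary only the weight of the price concern and watch the outcome.
for price_weight in (0.5, 1.0, 2.0, 4.0):
    w = dict(base, price=price_weight)
    a = weighted_score("alternative_A", w)
    b = weighted_score("alternative_B", w)
    print(price_weight, "A" if a > b else "B")
```

As the price weight grows, the cheaper alternative overtakes the better-performing one, which is exactly the kind of insight the participant asked for.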
6.3.4 Discussion
The two previous studies with students in Chapter 5 indicated that REGAIN
helps reduce architectural knowledge vaporization. The next step was to
investigate the industrial applicability of REGAIN, with the goal of obtaining
feedback from practitioners. We formulated research questions on
advantages/disadvantages and improvement opportunities for REGAIN, and
conducted an interview study to answer the research questions.
The interview study brings empirical evidence on the advantages and
disadvantages that practitioners can expect when using REGAIN to make and
capture architectural decisions. The results indicate that REGAIN is a
systematic approach that offers practitioners important advantages: decision-
making support, documentation of decision rationale, and reasoning support.
However, REGAIN has some disadvantages: insufficient tool support,
subjectivity, and required effort. Insufficient tool support seems to be the
most important disadvantage, since five practitioners indicated it in the
study. Subjectivity (indicated by three practitioners) is a minor
disadvantage, as it only partially affects the future reuse of captured
decisions: subjective concerns might be less applicable across multiple
architects. The effort disadvantage is the expected cost of using any
systematic approach. Furthermore, this study shows improvement opportunities for
REGAIN, which reduce the disadvantages, and further help practitioners in
their activities.
An interesting result is that this study indicates similarity between feedback
from students in Chapter 5 and feedback from practitioners. The similarity
suggests that students might be an acceptable proxy for practitioners, in the
sense that results from empirical evaluations with students might be applicable
to practitioners. Following this study, we agree with Tichy’s perspective that
students differ marginally from practitioners, except for knowledge of domain-
specific areas, large scale systems, and organizations (Tichy, 2000). Tichy’s
perspective encourages broader usage of students in empirical evaluations,
which makes much sense for the software architecture community, because
studies with industry architects are very difficult to conduct, due to architects’
busy schedules. If more researchers confirm the applicability of results from
studies with students to practitioners, within the limits indicated by Tichy, then
the software architecture community will be able to produce results with
higher statistical power, by getting access to more study participants (i.e.
students, instead of practitioners).
The following research directions emerge from this study. First, we identified
improvements to REGAIN, such as concerns prioritization, group decisions,
support for decision reuse, and addressing decision dependencies. Since
concerns prioritization was the most demanded improvement (see Section
6.3.3) by practitioners, we present our efforts to add this improvement in
Section 6.4. We will address the other improvements in future work. Finally,
insufficient tool support was the most mentioned disadvantage of REGAIN
(see Section 6.3.2.2); therefore, we implemented user-friendly, open source
tool support for REGAIN, which we present in Chapter 8.
6.4 Phase 3 - Investigate Prioritization Approaches
Architects indicated the need to prioritize concerns for REGAIN, as presented
in Section 6.3. Furthermore, priorities of concerns are part of architectural
knowledge. For example, if the priority of security is higher than the priority
of performance, then this difference in priorities explains why a more secure,
but slower alternative for a database system was selected. If these priorities are
not captured, then such architectural knowledge is lost, resulting in higher
evolution costs. Therefore, it is very important to answer the following
research question:
RQ3. Which concerns prioritization approach to use for REGAIN?
To identify previous work on prioritization of concerns for REGAIN, we
surveyed existing literature on the Repertory Grid technique. We found that
some Repertory Grid tools include features to capture priorities of concerns
(Boose et al., 1990a; Gaines and Shaw, 2007). Shaw and McKnight describe a
basic prioritization approach: ranking concerns from one to ten based on their
importance (Shaw and McKnight, 1981). We could not identify any other
study on prioritization approaches for the Repertory Grid technique, in
particular for prioritizing concerns for architectural decisions.
We chose to investigate prioritization approaches that are already established
and accepted in the software architecture community. In Section 6.6, we
mention other available prioritization approaches (i.e. ordinal scale
approaches) that we considered, but decided not to use with REGAIN. A
recent survey on architectural decision-making techniques (Falessi et al., 2011)
indicates that two techniques (Gilb, 2005; Kazman and Klein, 2001), one of
which is CBAM (Kazman and Klein, 2001), use the hundred-dollar approach
for prioritization, while three other decision-making techniques (Al-Naeem et
al., 2005; Andrews et al., 2005; Svahnberg et al., 2003) use the pairwise-
comparisons approach. In addition, other researchers also use pairwise-
comparisons for prioritization in architectural decision-making (Jabali et al.,
2011). Since these two approaches are already established and accepted in the
software architecture community, we started to investigate which of them to
use with REGAIN.
The hundred-dollar approach requires decision makers to assign a value
between 0 and 100 to items that need to be prioritized, so that the sum of all
values is 100. The pairwise-comparisons approach (or analytic hierarchy
process) requires decision makers to compare all possible pairs of concerns.
This helps identify prioritization inconsistencies (e.g. A is more important than
B, B is more important than C, but A is as important as C) (Saaty, 1990). The
fact that the hundred-dollar and pairwise-comparisons approaches are already
used by other architectural decision-making techniques indicates that they
are, at least to some extent, familiar and effective for architects. So, we
considered using one of them to improve the REGAIN approach, and we set out to
determine which one to select.
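A minimal sketch of the hundred-dollar approach, with hypothetical concern names and amounts:

```python
# Hypothetical hundred-dollar allocation over four concerns (names invented).
allocation = {"security": 45, "performance": 30, "cost": 15, "usability": 10}

# The approach requires the assigned amounts to sum to exactly 100.
assert sum(allocation.values()) == 100

# Normalized priorities (weights summing to 1) follow directly.
priorities = {concern: amount / 100 for concern, amount in allocation.items()}
print(priorities["security"])  # 0.45
```

The simplicity of this allocation, compared with comparing every pair of concerns, is one reason the two approaches may differ in required time.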
No study compares the hundred-dollar and pairwise-comparisons approaches
for making architectural decisions, so we did not know which one to use with
REGAIN. To choose between the two prioritization approaches, we
considered several factors. First, the performance (in terms of time) of a
prioritization approach is important, because architects have busy schedules
and they prefer prioritization approaches that take less time. Second, the
perception of architects (e.g. learnability) about a prioritization approach
influences their decision to use or not to use the approach. Third, architects
want to capture accurate priorities, which reflect their perspectives. To
summarize: The objective of the study in Phase 3 is to analyze the hundred-
dollar and pairwise comparisons approaches for the purpose of comparison
with respect to their performance, users’ perception, and impact on REGAIN
analysis output, from the viewpoint of architects, in the context of deciding
which prioritization approach to add to the REGAIN approach. To achieve this
research objective, we performed an experiment with the two prioritization
approaches.
Next, we describe participants in the experiment, experimental materials, tasks
performed by participants, hypotheses, experimental parameters, design, as
well as the results of the experiment.
6.4.1 Participants
Participants in the experiment were graduate students who took the Software
Architecture course at the University of Groningen in the Netherlands. As part
of the course, students worked in groups of six over a period of ten weeks to
architect a home automation system that would interact with the Smart Grid to
trade electrical power, and manage energy consumption of home appliances.
For the experiment, we prepared tasks for which students had enough training,
to eliminate the need for professionals-only knowledge. Tichy argues that
computer science graduate students differ marginally from professionals, in
the sense that professionals might be more knowledgeable of domain-specific
areas, large scale systems, and organizations (Tichy, 2000). Therefore, if a
study does not require such extra knowledge from graduate students, then
study results are applicable to professionals. Thus, we minimized knowledge
requirements about domain-specific areas, large systems and organizations, to
make our study results applicable to professionals.
We addressed the ethical aspects of conducting the experiment as part of a
course (Berander, 2004) by using a checklist for integrating research and
teaching goals (Carver et al., 2009; Galster et al., 2012):
1. Ensure adequate integration of the study into the course topics -
The software architecture course stressed the importance of
architectural decisions. During the experiment, students learnt about
using the REGAIN approach to capture architectural decisions.
2. Integrate the study timeline with the course schedule - The course
schedule included an optional seminar session in week four, in which
we conducted our experiment. In week four, students had already made
a few architectural decisions for their course project. However,
they had to make and capture more architectural decisions. Thus, they
could benefit from REGAIN in their course project.
3. Obtain subjects’ permission for their participation in the study -
We announced the optional seminar session at the beginning of the
course. We informed students that the seminar would address
advanced software architecture topics. By showing up for the study,
students indicated implicitly their consent to participate in the study.
We specified clearly that participation or performance in the seminar
had no impact on grades.
4. Plan follow-up activities - To increase the educational value for
students, we scheduled debriefing sessions with groups of participants,
within three weeks after the study. We prepared individual packages,
so that students could see their own results. In each half-hour
debriefing session, we presented the study details, preliminary results,
and we answered questions from students.
Of the 36 graduate students enrolled in the software architecture course, 30
participated voluntarily in our experiment. Twenty-three participants were
Dutch, two were Indonesian, and one each was Icelandic, Indian, German, and
Chinese. All participants had bachelor’s degrees, in computer science or
informatics (20), industrial engineering and management (8), industrial
automation (1), or mathematics (1). Participants had an average of 3.6 years (standard
deviation of 2.2) of practical experience in software development, and an
average of 1.2 years (standard deviation of 1.3) in software architecture.
6.4.2 Experimental Materials and Tasks
We prepared the following experimental materials, which are available online
at (Tofan, 2012).
1. Instructions on how to perform experimental tasks, including
examples.
2. Two grids that described two architectural decisions that students had
to make for their course project: choice of user-interface and choice of
storage. The grids included decision topics, decision alternatives and
concerns for each of the two topics, so that participants would not
have to use the whole REGAIN approach during the experiment, due
to time constraints. We derived decision topics, decision alternatives
and concerns for the two grids from project reports from previous
years. In the Appendix, we show the grid for the user-interface
architectural decisions.
3. Four prioritization forms: two types of prioritization forms for both
grids. For the hundred-dollar approach, participants had to fill out
amounts for each concern in a grid, depending on their perceived
importance, with the constraint that the sum is 100. For pairwise-comparisons,
participants had to fill out comparisons among all pairs
of concerns (e.g. moderately more, extremely less important). Both
types of forms also asked participants to fill in the time when they
started the prioritization task and when they completed it.
4. A post-questionnaire on the background of participants, questions
about how participants perceived each prioritization approach, and a
section for general feedback on the experiment.
Figure 6.4 shows the steps for executing the experiment, which took place in
one session. All students gathered in the same room. In the first step, we
presented the plan for the session, we gave a brief introduction on the key
concepts for the session, and an overview of tasks to be performed by the
students. In step two, half of the participants went to a different room,
accompanied by one research assistant. In step three, all participants rated
alternatives against concerns for the two architectural decisions, from their
personal perspectives, using their knowledge on their project. In steps four and
five, participants prioritized concerns for both decisions with the hundred-
dollar and pairwise comparisons approaches. We collected the paper forms for
step four, before handing over the forms needed for step five. In the final step,
participants filled in the post-questionnaire.
Figure 6.4. The experimental process had some different steps for Group 1 (left)
and Group 2 (right).
Throughout the session, we encouraged participants to ask for clarifications
about their tasks. Furthermore, we placed no time restriction on tasks. This
ensured that participants understood their tasks and were comfortable
performing them.
6.4.3 Hypotheses and Variables
We present the hypotheses on the prioritization approaches, regarding their
impact on performance, users’ perceptions, and REGAIN output.
6.4.3.1. Performance
We consider two dimensions for evaluating performance of prioritization
techniques: required time to perform prioritization using an approach, and
scalability of the approach. Scalability concerns the variation of required time
for a prioritization approach with the number of concerns to be prioritized. We
define the following scalability ratio metric to measure scalability.
rs = (ta / tb) × (b / a)    (1)
For this metric, a and b are numbers of prioritized concerns, with a smaller
than b. A participant needs a time period ta to prioritize a concerns and a
time period tb to prioritize b concerns. rs indicates how the time required to
prioritize concerns varies when the number of concerns increases. For example,
if rs equals 1, then doubling the number of concerns requires exactly twice as
much time (linear scalability). If rs is smaller than 1, then doubling the
number of concerns requires more than twice as much time (negative
scalability). If rs is larger than 1, then doubling the number of concerns
requires less than twice as much time (positive scalability).
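As a sketch of the scalability ratio, with hypothetical timings in seconds, reading Eq. (1) as rs = (ta / tb) · (b / a):

```python
def scalability_ratio(t_a, t_b, a, b):
    """rs = (t_a / t_b) * (b / a) for a < b prioritized concerns (Eq. 1)."""
    return (t_a / t_b) * (b / a)

# Linear scalability: doubling the concerns doubles the time, so rs == 1.
assert scalability_ratio(60, 120, 4, 8) == 1.0

# Positive scalability: doubling the concerns less than doubles the time.
assert scalability_ratio(60, 90, 4, 8) > 1.0

# Negative scalability: doubling the concerns more than doubles the time,
# e.g. pairwise comparisons, whose count grows quadratically with concerns.
assert scalability_ratio(60, 240, 4, 8) < 1.0
```

The 4 and 8 concerns in the sketch match the user-interface and storage grids used in the experiment.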
To evaluate the performance of the hundred-dollar and pairwise comparisons
approaches, we test the following null hypotheses.
Ha0: Participants need the same amount of time for prioritization.
Hb0: The hundred-dollar and pairwise comparisons approaches have the
same scalability ratios.
6.4.3.2. Users’ Perceptions
We operationalize users’ perceptions in terms of ease of use, ease of
learning, and attractiveness of a prioritization approach. The post-questionnaire and the
results are presented in the analysis section, in Table 6.3. We test the following
null hypotheses for users’ perceptions.
Hc0: Participants perceive the hundred-dollar ($100) approach as equally
easy to use as the pairwise comparisons (PWC) approach.
Hd0: Participants perceive the $100 approach as equally easy to learn as
the PWC approach.
He0: Participants perceive the $100 approach as equally attractive as the
PWC approach.
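A sketch of the binomial test used for these hypotheses (see Table 6.2), with hypothetical counts that are not the study data:

```python
from scipy.stats import binomtest

# Hypothetical counts: of 30 participants, 22 found the hundred-dollar
# approach easier to use than pairwise comparisons.
result = binomtest(22, n=30, p=0.5, alternative="two-sided")

# Under Hc0, preferences for $100 vs PWC should split 50/50.
if result.pvalue <= 0.05:
    print("reject Hc0: participants perceive the two approaches differently")
```

The binomial test fits here because each participant's perception is a binary choice between the two approaches.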
6.4.3.3. REGAIN Output
Priorities of concerns influence REGAIN output, which is the result of cluster
analysis on the grid with the architectural decision. Figure 6.5 shows an
example of the impact of prioritization approaches on the REGAIN output.
The example uses two grids with identical decision alternatives (A1 to A5),
concerns ((C1, -C1) to (C4,-C4)) and ratings. However, the concerns in the
two grids have different priorities. All concerns in the left grid have a priority
of 10. In contrast, the concerns in the right grid have various priorities (i.e. 10,
15, 30, and 45). In Figure 6.5, the lower triangular matrix shows percentages
of similarities among decision alternatives A1 to A5 (left grid), as calculated
by the WebGrid tool. The upper triangular matrix shows percentages of
similarities among A1 to A5 for the right grid. For example, the similarity
between A2 and A3 is 68.8 for the left grid, and 76.2 for the right grid.
      A1    A2    A3    A4    A5
A1     –  46.2    55    45  56.2
A2  56.2     –  76.2  46.2  42.5
A3  62.5  68.8     –    45  56.2
A4  37.5  43.8    25     –  21.2
A5  43.8    50  56.2  31.2     –
(lower triangle: same priorities; upper triangle: different priorities)
Figure 6.5. Percentages of similarities between pairs of alternatives vary by
modifying priorities of concerns.
We compare clusters based on the distance between the similarity matrices
from which the clusters are derived. The distance between the similarity
matrices is calculated as follows:
Distance(SP, PWC) = Σ_{i=1..N} Σ_{j=1..i-1} | SP_{i,j} − PWC_{i,j} |    (2)
In Eq. (2), N is the number of alternatives. SP is the similarity matrix of
alternatives, calculated assuming that all concerns have the same priority.
PWC is the similarity matrix of alternatives, calculated with the priorities
obtained from using the pairwise comparisons approach. Using the above
metric, we define the following hypotheses.
Hf0: Distance(SP, PWC) = Distance(SP, $100) = Distance($100, PWC)
Hf1: at least two of Distance(SP, PWC), Distance(SP, $100), and
Distance($100, PWC) differ.
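Assuming the distance metric in Eq. (2) sums the absolute differences between corresponding similarity percentages, the example in Figure 6.5 can be computed as follows (the pair ordering below is ours):

```python
# Pairwise similarities from Figure 6.5, listed pair by pair: lower triangle
# (equal priorities, "sp") and upper triangle (varied priorities, "pwc").
sp  = [56.2, 62.5, 68.8, 37.5, 43.8, 25.0, 43.8, 50.0, 56.2, 31.2]
pwc = [46.2, 55.0, 76.2, 45.0, 46.2, 45.0, 56.2, 42.5, 56.2, 21.2]

# Distance as the sum of absolute differences over all pairs of alternatives.
distance = sum(abs(s - p) for s, p in zip(sp, pwc))
print(round(distance, 1))  # 84.7
```

A larger distance means that changing the priorities moved the similarity structure of the alternatives further away from the equal-priority baseline.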
6.4.3.4. Summary
Overall, in this experiment we investigate how prioritization approaches
influence performance, users’ perceptions, and REGAIN output. We
summarize the independent and dependent variables in Table 6.2. The first
column shows variables and corresponding hypotheses. The last column shows
how the hypotheses will be tested.
Table 6.2. Hypotheses (H), independent (I) and dependent (D) variables.

Variable                  H   Type  Scale     Unit  Range      Testing procedure
Prioritization approach   NA  I     Nominal   NA    PWC, $100  NA
Needed time               Ha  D     Ratio     s     > 0        Mann-Whitney test
Scalability               Hb  D     Interval  NA    > 0        Mann-Whitney test
Ease to use               Hc  D     Nominal   NA    PWC, $100  Binomial test
Ease to learn             Hd  D     Nominal   NA    PWC, $100  Binomial test
Attractiveness            He  D     Nominal   NA    PWC, $100  Binomial test
Distance between          Hf  D     Ratio     %     >= 0       Kruskal-Wallis and
similarity matrices                                            Mann-Whitney tests
6.4.4 Experiment Design and Results
This experiment used a randomized, paired comparison design, with one factor
and two treatments (Juristo and Moreno, 2001). The factor was the
prioritization approach. Treatments were the hundred-dollar and pairwise-
comparisons approaches. Since each subject used both prioritization
approaches, the order effect was a validity threat, as results might be
confounded by using one prioritization approach before the other. Also,
participants might experience fatigue or boredom after using an approach.
Therefore, we used counterbalancing in our design, by randomly assigning
participants to two similar groups. The two groups used the prioritization
approaches in different orders, as shown in Figure 6.4: participants in Group 1
used pairwise-comparisons, and afterwards they used the hundred-dollar
approach. However, participants in Group 2 used the approaches in the reverse
order.
To increase validity, we exclude data points from subjects who have a
consistency ratio larger than 20% in the pairwise comparisons approach. The
rationale is that such data points indicate inconsistencies in the pairwise
comparisons (e.g. an inconsistency exists if x is more important than y, y more
important than z, but z is more important than x) (Karlsson and Ryan, 1997).
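A sketch of how such a consistency ratio can be computed with the standard eigenvector method of the analytic hierarchy process; the 3×3 comparison matrix below is hypothetical, not from the experiment:

```python
import numpy as np

# Hypothetical 3x3 pairwise-comparison matrix on Saaty's 1-9 scale; A[i][j]
# says how much more important concern i is than concern j (reciprocal matrix).
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 3.0],
              [1/5, 1/3, 1.0]])

n = A.shape[0]
eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)  # index of the principal eigenvalue

# Priorities: principal eigenvector, normalized to sum to 1.
priorities = np.abs(eigvecs[:, k].real)
priorities = priorities / priorities.sum()

# Consistency index and ratio (Saaty); RI = 0.58 is the random index for n = 3.
ci = (eigvals[k].real - n) / (n - 1)
cr = ci / 0.58
print(cr < 0.20)  # a ratio below the 20% threshold keeps the data point
```

A perfectly consistent matrix would give a principal eigenvalue of exactly n and thus a ratio of zero; the 20% threshold mirrors the exclusion rule above.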
6.4.4.1. Results on Performance
Regarding required time, the histograms in Figure 6.6 present the frequency of
time intervals needed by participants to apply each prioritization approach, for
both decisions. We note that the user-interface decision (left side) had a
maximum time of around five minutes, while the storage decision (right side)
had a maximum time of around eleven minutes. Furthermore, the histograms
indicate the distribution of the time intervals. For example, the leftmost
histogram indicates that most participants (i.e. eleven) needed between 33 and
66 seconds to prioritize concerns for the user-interface decision using the
hundred-dollar approach.
Figure 6.6. Histograms with required time (in seconds) to complete prioritization
for the user-interface (left) and storage (right) decisions using the two
prioritization approaches ($100 and PWC).
We use the Mann-Whitney U test to test the null hypothesis that both
approaches need the same amount of time (Ha0) for both decisions. The p-value
for both tests is below 0.001, so we reject the null hypothesis and accept the
alternative hypothesis that the required time for $100 differs from the
required time for PWC. Based on the histograms in Figure 6.6, we conclude that
PWC needs more time than $100.
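For readers unfamiliar with the test, the U statistic can be sketched in a few lines of Python. This is a didactic reconstruction only (the thesis presumably used a statistics package that also derives the p-values), and it omits tie correction and p-value computation:

```python
def mann_whitney_u(xs, ys):
    """Mann-Whitney U statistic for two independent samples.

    Counts, over every (x, y) pair across the two samples, how often
    x ranks below y (ties count half), and returns the smaller of the
    two directional counts.
    """
    u1 = sum(1.0 for x in xs for y in ys if x < y) \
       + sum(0.5 for x in xs for y in ys if x == y)
    u2 = len(xs) * len(ys) - u1
    return min(u1, u2)
```

For two fully separated samples such as [1, 2, 3] and [4, 5, 6], the statistic is 0, the strongest possible evidence of a difference at that sample size.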
Regarding scalability of the prioritization approaches, we calculate rs for the
two prioritization approaches as outlined in Eq. (1), using the amounts of time
for the UI and Storage decisions (ta and tb), which have four and eight concerns
(a = 4, b = 8). We remove two outliers with ratios of 6 and 18, and we display
boxplots of rs in Figure 6.7. Most ratios for the hundred-dollar test indicate
positive scalability, while ratios for pairwise-comparisons indicate negative
scalability.
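Since Eq. (1) is defined earlier in the chapter and not repeated here, the sketch below is only a plausible reconstruction of the scalability ratio rs: the growth in prioritization time relative to the growth in the number of concerns (a = 4 and b = 8 in this experiment). Under this reading, rs = 1 would mean neutral scalability, rs < 1 positive, and rs > 1 negative:

```python
def scalability_ratio(t_a, t_b, a=4, b=8):
    """Hypothetical reconstruction of Eq. (1), not the thesis formula.

    t_a, t_b: seconds a participant needed to prioritize the a-concern
    (UI) and b-concern (storage) decisions. The ratio compares the
    observed time growth with the growth in the number of concerns.
    """
    return (t_b / t_a) / (b / a)
```

For example, a participant needing 60 s for four concerns and 90 s for eight concerns would score rs = 0.75, i.e. sub-linear (positive-scalability) growth.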
Figure 6.7. Boxplots with ratios of scalability (rs) for $100 and PWC
prioritization approaches.
To gain more detailed insight into these findings and to check for neutral
scalability, we used the Mann-Whitney U test. The tests indicate that neither
the hundred-dollar test nor the pairwise-comparisons approach scales neutrally
(p = 0.043 and p < 0.01, respectively). Therefore, we accept the alternative
hypotheses: the hundred-dollar test scales positively, and the pairwise-
comparisons approach scales negatively.
6.4.4.2. Results on Users’ Perceptions
Table 6.3 lists questions from the post-questionnaire related to the perception
of users, including the frequency of answers to each of these questions.
Answers were either $100 or PWC. To test the hypotheses in Table 6.3, we ran
binomial tests on the numbers for $100 and PWC. We obtained the p-values in
the last column in Table 6.3.
Table 6.3. Number of answers in the post-questionnaire.
Hypothesis | Post-questionnaire item | $100 | PWC | p-value
Ease to use | Which of the two approaches was easier to use? | 23 | 7 | 0.003
Ease to learn | Which of the two approaches was easier to learn? | 22 | 7 | 0.008
Attractiveness | Which of the two approaches was more fun to use? | 28 | 2 | <0.01
Attractiveness | If choosing between the two approaches, which would you prefer to use in your decision making? | 24 | 6 | <0.01
Since all p-values are smaller than 0.05, we accept the alternative hypotheses:
participants perceive the hundred-dollar test as easier to use, easier to learn and
more attractive than the pairwise comparisons approach.
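The binomial tests can be reproduced with a few lines of standard-library Python. This is a sketch; the exact p-values reported in Table 6.3 may follow a slightly different tail convention:

```python
from math import comb

def binomial_two_sided_p(k, n):
    """Two-sided exact binomial p-value under a null of no preference
    (success probability 0.5), computed by doubling the smaller tail."""
    upper = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n      # P(X >= k)
    lower = sum(comb(n, i) for i in range(0, n - k + 1)) / 2 ** n  # symmetric tail
    return min(1.0, 2 * min(upper, lower))
```

For ease of use (23 of 30 answers favoring $100), this yields roughly 0.005, comfortably below the 0.05 threshold.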
6.4.4.3. Results on REGAIN Output
The boxplots in Figure 6.8 summarize the data for the 18 valid data points, for
the UI and storage decisions. The vertical axis represents the city block
distances among each pair of prioritization approaches.
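The city-block (Manhattan) distance between two matrices is simply the sum of absolute element-wise differences. A minimal sketch, assuming the similarity matrices are given as equal-shaped nested lists:

```python
def city_block_distance(m1, m2):
    """Sum of absolute element-wise differences between two
    equal-shaped matrices given as nested lists."""
    return sum(abs(x - y)
               for row1, row2 in zip(m1, m2)
               for x, y in zip(row1, row2))
```

Identical matrices score 0; larger values indicate that two prioritization approaches lead to more dissimilar REGAIN output.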
The Kruskal-Wallis test for the UI decision indicates a statistically significant
difference between distances (H(2) = 13.635, p = 0.001), with a mean rank of
38.64 for SP-PWC, 22.75 for $100-PWC, and 21.11 for SP-$100. Therefore,
we reject the null hypothesis that distances between similarity matrices are
equal (Hf0). Repeated Mann-Whitney tests indicate differences between SP-
PWC vs. $100-PWC (p = 0.006), and SP-PWC vs. SP-$100 (p < 0.001), but no
difference between $100-PWC vs. SP-$100 (p = 0.962).
For the storage decision, the Kruskal-Wallis test indicates a statistically
significant difference between the distances (H(2) = 16.395, p < 0.001), with a
mean rank of 23.82 for SP-$100, 41.53 for SP-PWC, and 21.66 for $100-
PWC. Repeated Mann-Whitney tests indicate differences between SP-$100 vs.
SP-PWC (p = 0.002), and SP-PWC vs. $100-PWC (p < 0.001). No difference
exists between SP-$100 vs. $100-PWC (p = 0.815).
Figure 6.8. Distances between similarity matrices for the $100, PWC, and same
priorities (SP) approaches for the two decisions.
Different prioritization approaches produce priority values which impact the
hierarchical clusters. To evaluate this impact, we compared the similarity
matrices for grids that use the same-priorities, pairwise-comparisons, and
hundred-dollar prioritization approaches. SP and PWC result in the most
different matrices. The distance between matrices from using $100 and SP does
not differ from the distance between matrices from using $100 and PWC. This
suggests that $100 offers a compromise between using SP and PWC.
6.4.5 Discussion
Results from the interview study with practitioners (in Section 6.3) indicated
the need to prioritize concerns for the initial REGAIN approach. Also,
concerns’ priorities are part of architectural knowledge, and implicit priorities
increase the risk of architectural knowledge vaporization. Hundred-dollar and
pairwise-comparisons are already used in other decision-making approaches
(Falessi et al., 2011). Therefore, we conducted an experiment to compare the
hundred-dollar and pairwise-comparisons approaches, to understand which
prioritization approach to add to REGAIN.
In the experiment, we investigated performance, users’ perceptions, and
impact on REGAIN output of the two approaches. On performance, we found
out that the hundred-dollar test needs less time than the pairwise-comparisons,
and that the hundred-dollar scales better than the pairwise-comparisons
approach. Users perceived the hundred-dollar as more attractive, easier to use
and to learn than pairwise-comparisons. Finally, we found that prioritization
approaches influence REGAIN output, and that the hundred-dollar approach
offers a middle ground between the pairwise-comparisons approach and using
same priorities for all concerns.
The results on performance and users’ perceptions can be explained by the
steps of the two approaches. The pairwise-comparisons approach requires a
judgment for each possible pair of concerns, i.e. n(n-1)/2 comparisons for n
concerns, so it has an O(n²) order of complexity. In contrast, the hundred-
dollar approach requires a direct indication of each priority, i.e. only n
judgments. Therefore, if a person has a clear idea about the priorities of the
concerns involved in an architectural decision, then the hundred-dollar
approach captures priorities more efficiently than pairwise-comparisons,
which also influences users’ perceptions. However, if a person does not have a
clear idea about the priorities, then pairwise-comparisons might be more
helpful, because it breaks the prioritization into smaller steps. In such a
situation, the small steps of the pairwise-comparisons approach might provide
more value than the hundred-dollar approach.
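The difference in effort can be made concrete with a small sketch (the concern names are illustrative, not from the experiment):

```python
from itertools import combinations

concerns = ["cost", "performance", "security", "usability"]

# Pairwise comparisons: one judgment per unordered pair of concerns,
# i.e. n * (n - 1) / 2 judgments -- quadratic growth in n.
pairs = list(combinations(concerns, 2))
assert len(pairs) == len(concerns) * (len(concerns) - 1) // 2  # 6 judgments

# Hundred-dollar test: one direct allocation per concern (n judgments),
# constrained to sum to 100 dollars.
allocation = {"cost": 40, "performance": 30, "security": 20, "usability": 10}
assert sum(allocation.values()) == 100
```

Doubling the concerns from four to eight raises the pairwise judgments from 6 to 28, while the hundred-dollar allocations only grow from four to eight.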
Following this experiment, we recommend practitioners with little experience
in capturing concerns’ priorities for their decisions to use the hundred-dollar
approach, because first-time users’ perceptions favor it over the pairwise-
comparisons approach. Also, we recommend practitioners who have a clear
idea of the concerns’ priorities to use the hundred-dollar approach, because of
its better performance. However, we recommend that practitioners who need
help clarifying priorities to use the pairwise-comparisons approach, because it
offers smaller steps and a consistency ratio.
In practice, the number of items (i.e. concerns) to be prioritized can be larger
than the number of items we used in this experiment. This raises questions on
the scalability of the hundred-dollar approach for many items to be prioritized.
For example, Regnell et al. (Regnell et al., 2001) present a study with
practitioners who prioritized 58 items. In addition, Berander and Jönsson
(Berander and Jönsson, 2006) mention that prioritizing 200 items happens in
practice. To cope with the large number of items, researchers propose two
approaches. The first approach is to increase the number of points (or dollars),
such as using $100,000 for prioritizing the 58 items in (Regnell et al., 2001).
The second approach is to define various levels of abstraction that enable high-
level items to be refined into more detailed items, thus creating hierarchies of
items in which the hundred-dollar prioritization takes place at each level of the
hierarchy. This extension of the hundred-dollar approach is called hierarchical
cumulative voting. Berander and Jönsson (Berander and Jönsson, 2006)
present hierarchical cumulative voting in detail. Thus, according to (Berander
and Jönsson, 2006; Regnell et al., 2001), the hundred-dollar approach scales
better than the pairwise-comparisons approach even for large numbers of items
to be prioritized.
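A simplified sketch of the hierarchical idea follows (the data structure is hypothetical; Berander and Jönsson's hierarchical cumulative voting additionally defines compensation factors that this sketch omits):

```python
def flatten_priorities(tree, weight=1.0):
    """Flatten a hierarchy of $100 allocations into global priorities.

    Each entry maps an item either to its dollars (a leaf) or to a
    (dollars, children) pair; the dollars at each level sum to 100.
    A child's global priority is its local share times its parent's.
    """
    priorities = {}
    for name, value in tree.items():
        if isinstance(value, tuple):
            dollars, children = value
            priorities.update(flatten_priorities(children, weight * dollars / 100))
        else:
            priorities[name] = weight * value / 100
    return priorities

# $60 on "quality" (split 70/30 among its sub-concerns) and $40 on "cost".
tree = {"quality": (60, {"performance": 70, "security": 30}), "cost": 40}
```

Here flatten_priorities(tree) yields global priorities of 0.42 for performance, 0.18 for security, and 0.40 for cost, which sum to 1.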
There are tradeoffs for choosing which prioritization approach to add to the
REGAIN approach. Although the hundred-dollar approach has better
performance and users’ perceptions, the hundred-dollar approach does not
address potential prioritization judgment errors, as it does not help identify
prioritization inconsistencies. This means that the hundred-dollar approach
might be less reliable than the pairwise-comparisons approach. However, the
experiment results on REGAIN output suggest that the hundred-dollar
approach offers results similar to the results from the pairwise-comparisons
approach, which indicates the reliability issue is not critical. Therefore, we
consider that the hundred-dollar approach offers a good tradeoff for REGAIN
concerns’ prioritization.
These findings are in line with results from the requirements engineering
domain. Karlsson et al. found out that the pairwise-comparisons approach is
the most trustworthy out of six prioritization approaches, because of its
inclusion of a consistency check, which is critical in addressing potential
human judgmental errors (Karlsson et al., 1998). Berander and Svahnberg
noticed that requirements engineers might prefer the hundred-dollar approach
over the pairwise-comparisons approach, because the former is more
straightforward (Berander and Svahnberg, 2009).
The findings from this experiment are important for several reasons. First, we
obtained empirical evidence for the comparison between the hundred-dollar
and pairwise-comparisons approaches. The evidence increases our
understanding of prioritization approaches for architectural decisions, and
removes speculations on the characteristics of the two approaches. Second,
since REGAIN is based on the Repertory Grid technique, this experiment
encourages other researchers to add a concerns prioritization step when using
the Repertory Grid technique for empirical software engineering research (e.g.
such as future studies, similar to the Repertory Grid studies in (Edwards et al.,
2009)). Third, some results of this study (findings on performance and users’
perceptions) can be reused for other architectural decision-making techniques
that use the hundred-dollar and pairwise-comparisons approaches (Falessi et
al., 2011). Fourth, based on the experiment results, we add a step with the
hundred-dollar approach to the initial REGAIN approach and to its tool
support (as reported in Chapter 8), so that architects can prioritize concerns for
their decisions.
6.5 Validity Threats
We discuss internal, construct, conclusion, and external validity threats for
the interview study in Section 6.3 and the experiment in Section 6.4, using the
recommendations from Jedlitschka et al. (Jedlitschka et al., 2008).
6.5.1 Interview Study Validity Threats
Internal validity threats are not applicable for this study, since we do not try
to establish a causal relationship.
Construct validity threats may originate from how the study design reflects
the studied constructs (e.g. advantages and disadvantages of the REGAIN
approach) (Wohlin et al., 2012). To avoid such threats, we allocated enough
time to explain the REGAIN approach to the practitioners. We used it on a toy
example to make sure that practitioners understood the approach. Also, we
selected practitioners who took architectural decisions in the industry, to
ensure that they could provide relevant feedback. Moreover, we avoided using
a single type of measures (the post-questionnaire), by also using semi-
structured interviews, so that practitioners could elaborate on their
perspectives.
Another source of construct validity threats is the behavior of the participants
and researchers (Wohlin et al., 2012). For example, some participants might
change their behavior during a study. We used recommendations from (Hove
and Anda, 2005) on creating a non-judgmental and comfortable atmosphere
during the interviews, so that participants would not change their behavior.
Still, researchers might subconsciously influence study results because of their
own expectations, which might change the researchers’ behavior (e.g. extra
enthusiasm) (Wohlin et al., 2012). To mitigate this problem, we asked another
researcher to conduct the sessions with the architects. This additional
researcher was not involved in the previous studies on the REGAIN approach,
and had no interest in obtaining either positive or negative feedback on the
approach.
Conclusion validity - in this study, we used descriptive statistics, so we did
not run into issues of low statistical power. Regarding measurement, we used
a post-questionnaire validated in a previous study, to ensure the reliability of
measures, and we used a non-parametric statistical test, which makes fewer
assumptions about the distribution of the data than parametric statistical tests.
Furthermore, two researchers were involved in the content analysis of the
interview transcriptions, and they agreed on the data interpretation.
External validity - to address this validity threat, we recruited architects who
worked in a wide variety of domains (see Table 10.2 in the Appendix). Also,
the architects used the REGAIN approach on real-world architectural decisions
from their industrial practice. To further generalize the results, we needed to
make sure that enough architects participated in the study. Unfortunately,
architects have busy schedules and limited time to participate in studies, which
lead to difficulties in recruiting architects for this study. However, we reached
data saturation from the sessions with the sixteen architects, in the sense that
the last two sessions brought no extra insights compared to the previous
sessions.
6.5.2 Experiment Validity Threats
We addressed internal validity threats as follows. To avoid the
instrumentation validity threat, we piloted the experiment on three persons
with software architecting background. We used their feedback to improve the
quality of the experimental package. For example, we increased the readability
of the paper forms. Another instrumentation validity threat is the use of self-
reporting for time needed by participants for the prioritization tasks. We
addressed it by reminding participants to record the start and end time for each
task during the experiment. Social threats like demoralization of participants
influence negatively study results. To avoid that, we made sure that
participants received enough directions for the tasks. Furthermore, we asked
participants to rate some statements about the session. Results are summarized
in Table 6.4. Overall, subjects considered that directions were clear and they
had clear ideas about their tasks. Most subjects were neutral or positive on
enjoying the session. We consider these results suggest a good level of
subjects’ participation and engagement in the exercise, with a positive impact
on the internal validity of the experiment.
Table 6.4. Number of session-related answers in the post-questionnaire.
Statement | Strongly disagree | Disagree | Neutral | Agree | Strongly agree
The directions for the exercise were clear | 1 | - | 1 | 20 | 8
I had a clear idea on what I had to do in the exercise | - | 1 | 2 | 20 | 7
I enjoyed doing the exercise | 1 | 2 | 12 | 11 | 4
We addressed the mortality validity threat by integrating the study with the
software architecture course, so that subjects joined it voluntarily for the
educational value. No participant dropped out from the experiment, although
we had a few cases of incomplete data. To avoid selection threats, we
randomized the distribution of participants to the two groups. Furthermore, we
checked that subjects had a similar background in decision-making, e.g. we
excluded persons with prior knowledge of prioritization approaches or the
Repertory Grid technique.
We addressed construct validity by defining concrete metrics for
operationalizing constructs in our experiment, such as the scalability ratio.
Another risk was treatments interaction, because each participant used two
prioritization approaches. We addressed it partially by randomly assigning
participants to two groups that received the treatments in different orders.
Social threats for construct validity refer to the altered behavior of subjects
when participating in an experiment. We reduced social threats by integrating
the study with the course, and eliminating any impact on participants’ grades.
Additionally, participants did not know our hypotheses, which could have
altered their behavior.
We addressed conclusion validity by using non-parametric tests, which make
fewer assumptions about the distribution of data, such as Mann-Whitney.
However, this experiment included around 30 data points, which may
negatively impact the statistical power of the tests. Furthermore, subjective
measures such as perceptions (e.g. ease to use) tend to be less reliable than
objective ones (e.g. time). Therefore, we piloted the post-questionnaire to
ensure that participants understood its items.
We addressed external validity threats by ensuring that the experimental tasks
did not require a specific level of software architecture experience, and no
professional-only knowledge was needed for the experimental tasks.
Furthermore, the participants had some practical experience in software
development (average of 3.6 years) and software architecture (average of 1.2
years). Kitchenham et al. regard students as relatively close to the population
of interest, because they are the next generation of software professionals
(Kitchenham et al., 2002). Finally, increasing the commitment level of
participants in a study benefits the applicability of study results to
professionals (Berander, 2004). Since we integrated the experiment timeline
with the course schedule, participants in this experiment had a high
commitment level. Therefore, we consider the results of the experiment
applicable to software architects with various levels of practical experience.
6.6 Related Work
The main benefit of REGAIN is reducing the vaporization of architectural
knowledge, through capturing (or acquisition) of architectural knowledge.
REGAIN is based on the Repertory Grid technique, which has been used
successfully for the acquisition of other types of design knowledge outside the
software domain. Boose et al. (Boose et al., 1990b) used the Repertory Grid
technique to capture design knowledge for different types of products from the
Boeing company. Hassenzahl and Wessler (Hassenzahl and Wessler, 2000)
used the technique to capture design knowledge of industrial products at the
Siemens company. Overall, these studies indicate the value of the Repertory
Grid technique for capturing design knowledge, because the technique
encourages capturing the personal perspectives of the designers, and such
perspectives are a very important part of design knowledge.
Software architecture related applications of the Repertory Grid technique
include the following studies. De Boer and van Vliet (de Boer and van Vliet,
2008) used the Repertory Grid technique to compare the mental models of
auditors about existing documentation against the semantic structure of the
documentation, in order to discover architectural knowledge in
documentation. Shaw and Gaines (Shaw and Gaines, 1996) argue for closer
collaboration between the knowledge engineering and software engineering
communities. Therefore, they present the Repertory Grid technique, which
helps capture knowledge about requirements (Shaw and Gaines, 1996). Niu and
Easterbrook (Niu and Easterbrook, 2007) use the Repertory Grid technique to
capture the knowledge of stakeholders about requirements, and to clarify
inconsistencies in how stakeholders use terminology for stating requirements.
Decision-making applications of the Repertory Grid technique include the
following studies. Boose (Boose, 1989) presents decision support tools that are
based on the Repertory Grid technique, with applicability in various domains.
Scheubrein and Zionts (Scheubrein and Zionts, 2006) propose a decision-
making approach based on the Repertory Grid technique, with tool support in
the form of an Excel spreadsheet. Castro-Schez et al. (Castro-Schez et al.,
2005) propose an extension of the Repertory Grid technique for decision-
making support for managers.
There is much interest from the software architecture community in
architectural decision-making and capturing techniques. Falessi et al. (Falessi
et al., 2011) indicate three main types of techniques for decision-making:
1. Naturalistic decision making (or keeping the first available alternative)
2. Selecting among a finite number of alternatives (or multi-attribute
techniques for decision-making)
3. Selecting among an infinite number of alternatives (covered by optimization
research, such as multi-objective techniques for decision-making)
In practice, the most common type of techniques is selecting among a finite
number of alternatives (Falessi et al., 2011). Falessi et al. (Falessi et al., 2011)
compare fifteen such architectural decision-making techniques (e.g. CBAM
(Kazman and Klein, 2001)), and notice that each technique involves tradeoffs,
as there is no perfect technique. Techniques for capturing architectural
decisions include decision views (Kruchten et al., 2009; van Heesch et al.,
2012), templates (Tyree and Akerman, 2005), and tools (e.g. (Capilla et al.,
2010)). Tang et al. compare five tools for capturing architectural decisions
(e.g. Archium, PAKME) (Tang et al., 2010).
Prioritization approaches received much attention in the software engineering
community, since practitioners need to prioritize concerns, stakeholders,
features, and requirements. Prioritization approaches can be grouped into
ordinal scale and ratio scale approaches (Berander and Jönsson, 2006).
For the ordinal scale prioritization approaches, there is an order among
prioritized items (e.g. by assigning importance levels), but no arithmetic
operations are possible with the values for priorities. Such a prioritization
approach is included in RFC 2119 (Bradner, 1997), which recommends
specific keywords (i.e. must/should/may) for prioritization. In addition, IEEE
830-1998 (830-1998, 1998) recommends prioritizing by assigning one of
following keywords: essential/conditional/desirable. The updated version of
IEEE 830-1998 is IEEE 29148-2011 (29148, 2011), which recommends three
types of prioritization: either using shall/will/should/may keywords, one-to-
five scale, or high/medium/low keywords.
For the ratio scale prioritization approaches, all arithmetic operations are
possible, and both ratios among values and interval sizes are meaningful. This means that
values obtained from using a ratio scale prioritization approach are more
precise than values obtained from ordinal scale prioritization approaches
(Berander and Jönsson, 2006). Pairwise-comparisons and hundred-dollar are
the most used ratio scale prioritization approaches (Berander and Jönsson,
2006).
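The difference between the two scale types can be illustrated with hypothetical priority data (the concern names and values below are illustrative only):

```python
# Ordinal scale (e.g. IEEE 830-1998 keywords): only the order of the
# items survives; how much more important one item is than another is lost.
ordinal = {"security": "essential", "performance": "conditional", "cost": "desirable"}

# Ratio scale (hundred-dollar test): ratios between values are meaningful,
# and the values support arithmetic operations.
ratio = {"security": 60, "performance": 30, "cost": 10}

# Only the ratio-scale data supports statements such as "security is
# judged twice as important as performance".
assert ratio["security"] / ratio["performance"] == 2
```

This is exactly the fine-grained knowledge that vaporizes when priorities are recorded only as ordinal keywords.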
There are important differences between ordinal and ratio scale prioritization
approaches, which influenced our choice on which type of approach to use for
REGAIN. Although ordinal scale approaches are standardized (830-1998,
1998; 29148, 2011; Bradner, 1997) and very common in practice (Berander
and Jönsson, 2006), such approaches carry the risk of knowledge vaporization.
Results from previous studies (Berander and Jönsson, 2006; Karlsson, 1996;
Karlsson et al., 1998) indicate that using a ratio scale approach captures more
knowledge than using an ordinal scale prioritization approach, such as fine-
grained differences in the priorities. Since REGAIN aims at capturing
knowledge on architectural decisions, ratio scale approaches are better suited
for REGAIN, instead of ordinal scale prioritization approaches.
6.7 Conclusions
In this chapter, we address two problems. First, REGAIN or the Repertory
Grid Technique had not been used with practitioners to make and capture real-
world architectural decisions. Second, potential improvements of REGAIN
were not known.
To solve these problems, we conducted an interview study with 16
practitioners. From the interview study, we found advantages, disadvantages,
and improvement opportunities for REGAIN. A critical improvement was to
prioritize concerns for REGAIN. To select a prioritization approach, we
conducted an experiment with 30 graduate students in which we compared the
hundred-dollar and pairwise-comparisons approaches.
The results of the experiment indicate that the hundred-dollar approach has
better scalability and needs less time than the pairwise-comparisons approach.
Also, users’ perceptions favored the hundred-dollar approach: participants
considered that the hundred-dollar approach was easier to use, learn and more
attractive than the pairwise-comparisons approach. Moreover, results suggest
that prioritization approaches impact REGAIN output, and that the hundred-
dollar approach provides a ‘middle ground’ between using the same priorities
and the pairwise-comparisons approach.
Following these studies with practitioners and students, we implemented tool
support for REGAIN, which we report in Chapter 8. Such dedicated tool
support reduces the effort from practitioners to use REGAIN. Furthermore, in
the next chapter, we propose and validate a group architectural decision-
making process that extends REGAIN.
6.8 Acknowledgments
We thank Konstantinos Tselios for his help in conducting the studies, and the
participants for their efforts to contribute to the studies.
Chapter 7
Improve Group Architectural Decisions
Under review at the Information and Software Technology journal as: Tofan,
D., Galster, M., Lytra, I., Avgeriou, P., Zdun, U., Fouche, M.A., de Boer, R.,
Solms, F., Empirical Evaluation of a Process to Increase Consensus in Group
Architectural Decision Making.
As found in Chapter 3, many software architectural decisions are group
decisions rather than decisions made by individuals. Consensus in a group of
decision makers increases the acceptance of a decision among decision
makers and their confidence in that decision. Furthermore, going through the
process of reaching consensus means that decision makers better understand
the decision (including the decision topic, decision options, rationales, and
potential outcomes). However, as found in Chapter 4, little guidance exists on
group architectural decision making, and on how to increase consensus.
In this chapter, we propose and evaluate how a process (named GADGET)
helps architects increase consensus when making group architectural
decisions. Specifically, we investigate how well GADGET increases consensus
in group architectural decision making, by understanding its practical
applicability, and by comparing GADGET against group architectural
decision making without using any prescribed approach.
We conducted two empirical studies. First, we conducted an exploratory case
study to understand the practical applicability of GADGET in industry. We
investigated whether there is a need to increase consensus, the effort and
benefits of GADGET, and potential improvements for GADGET. Second, we
conducted an experiment with 113 students from three universities to compare
GADGET against group architectural decision making without using any
prescribed approach.
Study results indicate that GADGET helps decision makers increase their
consensus, captures knowledge on architectural decisions, clarifies the
different points of view of different decision makers on the decision, and
increases the focus of the group discussions about a decision. In addition, we
used the feedback from industrial practitioners to refine GADGET. From the
experiment, we obtained causal evidence that GADGET increases consensus
among participants better than group architectural decision making without
using any prescribed approach.
7.1 Introduction
Current approaches for architectural decision-making are aimed at individual
architects. This includes the REGAIN approach presented in Chapter 6.
However, the survey in Chapter 3 indicates that software architects make most
of their decisions in groups, rather than individually. Other researchers also
report that most architectural decisions are made in groups (Miesbauer and
Weinreich, 2013).
Unfortunately, little is known about group architectural decisions, and how to
improve group architectural decision making. The mapping study on
architectural decisions in Chapter 4 indicates a small number of papers on how
to make group architectural decisions.
Group architectural decisions bring additional challenges compared to
individual architectural decisions, such as the need for communication among
decision makers and increased consensus between decision makers and other
stakeholders (Svahnberg, 2004).
Increasing consensus among decision makers is a critical factor of group
decision making. On the one hand, low consensus in early architectural
decisions may lead to misunderstandings within the group of decision makers
(Svahnberg, 2004). Such misunderstandings may cause problems. For
example, if a stakeholder feels that her point of view about a decision was not
taken seriously, that stakeholder might not accept the final software system.
On the other hand, benefits of consensus include higher acceptance and better
understanding of the architectural decision by all involved stakeholders.
Furthermore, consensus increases confidence in the correctness of the
architectural decision (Svahnberg, 2004). Therefore, consensus needs to be
addressed explicitly as part of group architectural decision making. However, as mentioned before, no approach in the software architecture literature explicitly targets increasing consensus in group architectural decision making.
Regarding the scope of this chapter, we focus on consensus (i.e. ‘we have
some general agreement and we understand each other’s perspectives’) instead
of unanimity (i.e. ‘all of us have the same perspectives’). Consensus has two
main components: general agreement and mutual understanding among
stakeholders involved in making a decision (Tastle and Wierman, 2007).
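Tastle and Wierman (2007) also propose a quantitative consensus measure over Likert-scale responses. The sketch below implements that measure as we understand it; the function and variable names are our own illustration, not part of GADGET. Unanimity yields a consensus of 1, and full polarization yields 0.

```python
from math import log2

def consensus(counts, x_min=1, x_max=5):
    """Tastle-Wierman consensus for Likert responses.

    counts[i] = number of respondents choosing the value x_min + i.
    Returns 1.0 for unanimity, approaching 0.0 for maximal polarization.
    """
    n = sum(counts)
    width = x_max - x_min
    values = range(x_min, x_max + 1)
    mean = sum(v * c for v, c in zip(values, counts)) / n
    return 1 + sum((c / n) * log2(1 - abs(v - mean) / width)
                   for v, c in zip(values, counts) if c > 0)

# Unanimous ratings (all eight respondents chose 3) give full consensus:
print(consensus([0, 0, 8, 0, 0]))  # 1.0
# Polarized ratings (four 1s and four 5s) give zero consensus:
print(consensus([4, 0, 0, 0, 4]))  # 0.0
```

Intermediate distributions fall between these extremes, which makes the measure usable for tracking how consensus evolves across iterations.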
Therefore, in this chapter, we investigate how to increase general agreement
and mutual understanding amongst decision makers. Finally, we focus on
inexperienced architects, rather than senior architects, because senior
architects usually have enough experience to handle group decision making.
In this chapter, we evaluate GADGET (Group Architectural Decisions with
repertory Grid Technique), which is a group decision making process for
helping architectural decision makers (e.g. architects and other stakeholders
who have a decision-making role) increase consensus about their decisions.
The key target group of GADGET is inexperienced architects, as
aforementioned, because they need support to reach consensus in group
decision making. In addition, GADGET aims at groups that are recently
formed and which do not have common procedures and processes in place, and
therefore may benefit from a standardized way of interaction. The process
offers guidance for increasing consensus incrementally, making explicit the
knowledge of the decision makers, and helping them structure their group
interactions.
This chapter contributes with empirical evidence of how GADGET increases
consensus in group architectural decision making. The validation has two
parts:
- a case study with seven students and thirteen practitioners
- an experiment with 113 students to answer research questions that
emerged from the case study
This chapter is organized as follows. Section 7.2 presents GADGET. Next,
GADGET is evaluated in Section 7.3 in a case study, and in Section 7.4 in an
experiment. We discuss validity threats for the evaluation in Section 7.5, and
related work in Section 7.6. Finally, Section 7.7 presents conclusions.
7.2 The GADGET Process
GADGET extends our previous work on making and capturing architectural
decisions with the Repertory Grid technique (that we also used previously
for architectural knowledge acquisition in Chapters 5 and 6) with the idea of
group evaluations and feedback from the Delphi technique (Linstone and
Turoff, 2002).
The Delphi technique is a ‘method for structuring a group communication
process so that the process is effective in allowing a group of individuals, as a
whole, to deal with a complex problem’ (Linstone and Turoff, 2002). In
Delphi, participants answer questions on a complex problem in several
iterations, receive a summary of answers from all other participants, and are
given the opportunity to revise their answers for the next iteration. After
several iterations, the answers converge and determine the solution to the
complex problem.
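The iterate-summarize-revise cycle of Delphi can be sketched in code. The simulation below is our illustration under assumed rules (participants move halfway toward the group mean, and convergence means all estimates lie within 0.5 of each other); it is not a prescribed implementation of the technique.

```python
def delphi_rounds(estimates, revise, converged, max_rounds=5):
    """Delphi-style iteration: each round, every participant sees a
    summary of the group's answers (here: the mean) and may revise."""
    rounds = 0
    while rounds < max_rounds and not converged(estimates):
        summary = sum(estimates) / len(estimates)  # feedback to all participants
        estimates = [revise(e, summary) for e in estimates]
        rounds += 1
    return estimates, rounds

# Illustrative run: three participants estimate, say, migration effort in weeks.
final, rounds = delphi_rounds(
    [2.0, 4.0, 9.0],
    revise=lambda e, mean: (e + mean) / 2,          # move halfway toward the mean
    converged=lambda xs: max(xs) - min(xs) < 0.5,   # answers have converged
)
print(rounds, final)
```

Under these assumptions the three estimates converge within a few rounds, mirroring how Delphi answers converge toward a solution over iterations.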
In addition to Delphi, we also considered other techniques to be included in
GADGET, namely brainstorming (Osborn, 1963) and nominal group (Delbecq
and Van de Ven, 1971). However, we preferred Delphi for the following reasons. Brainstorming is strong at generating new, creative ideas, but weak at performing evaluations. Since our goal was to increase consensus, these characteristics were not a high priority for GADGET. The nominal group technique has steps similar to those of Delphi, but its evaluation step is anonymous. We preferred that GADGET have an open evaluation step, so that participants can communicate and understand each other's perspectives faster.
GADGET consists of five steps, which take the topic of an architectural decision (e.g. choice of database, architectural patterns, JavaScript framework, or platform technologies) as input. Since the steps are not domain-specific, GADGET can also be used for making group decisions in other domains. In this chapter, we focus on evaluating GADGET for making group decisions in the software architecture domain.
[Figure 7.1 shows the GADGET process: the decision topic is the input; Step 1 (indicate alternatives and concerns) yields individual alternatives and concerns; Step 2 (discuss alternatives and concerns) yields a common set of alternatives and concerns; Step 3 (prioritize and rate concerns) yields individual priorities and ratings; Step 4 (discuss differences) yields consensus and persisting divergences; Step 5 iterates back to Step 1.]
Figure 7.1. GADGET process steps and outcomes.
The steps consist of the following.
1. Indicate alternatives and concerns: Decision-makers individually
indicate their alternatives and concerns for the decision topic. Also,
decision-makers can indicate what alternatives or concerns to remove
from previous iterations (see Step 5). The rationale for this step is to
ensure that any potentially relevant alternative and concern is
considered in the decision making process. The output of this step is a
set of alternatives and concerns from each decision-maker. For
example, for making a decision about the JavaScript framework, one
of the decision-makers indicates three alternatives (e.g. Angular,
Ember, and Backbone), and four concerns (e.g. testability,
performance, learning curve, and existing skillsets).
2. Discuss alternatives and concerns: Decision-makers have a group
discussion on the alternatives and concerns, with the purpose of
consolidating them in a common set of agreed alternatives and
concerns. The rationale for this step is to clarify and potentially add or
remove alternatives and concerns that are included in the decision
making process. For example, more alternatives can be added and
some concerns can be clarified (e.g. what is minimum acceptable
performance of a JavaScript framework).
3. Prioritize concerns and rate concerns against alternatives: Decision-
makers individually prioritize concerns using the hundred-dollar
approach (i.e. assign a priority to each concern from 0 to 100, so that
the sum of priorities is 100), which was evaluated in Chapter 6. In
addition, decision-makers individually rate each alternative against
every concern, using a five-level Likert scale, with values ranging
from ‘1-strongly disagree’ to ‘5-strongly agree’. Decision-makers may
use supplementary values such as ‘not applicable’ and ‘don’t know’.
The rationale for this step is to ensure that alternatives and the
importance of concerns are considered when making the decision
(some stakeholders may consider alternatives and concerns more or
less important than others). The output of this step is the set of ratings
and priorities from each decision-maker.
4. Discuss differences: Based on the ratings and priorities of concerns
from Step 3, metrics are calculated for priorities and ratings. For ease
of interpretation and usability of GADGET, only four metrics are used
for the ratings and priorities indicated by participants in Step 3: a)
average of ratings of alternatives based on concerns, b) average
priorities of concerns, c) the range of ratings of alternatives based on
concerns, and d) the range of priorities (i.e. the difference between the highest and lowest values of the ratings and priorities, respectively). These metrics help
decision-makers understand how their own perspectives compare to
the perspectives of the other decision-makers. This generates a ‘soft’
pressure towards convergence. If the ranges are small enough, then there is an acceptable degree of consensus among decision makers. Otherwise, the decision-makers with the highest
differences present their rationales to stimulate focused discussions
about the differences in perceptions. During these discussions,
participants are either willing to modify their priorities and ratings, or
they ‘agree to disagree’. The expected output of this step is increased consensus and/or an explicit list of persisting divergences, which, if too large, might require an additional iteration.
5. Iterate from Step 1: This step is needed if decision-makers choose to
update the ratings and priorities provided in Steps 1 and 3. The
discussions in Step 4 may modify the perspectives of the decision-
makers, which could lead to new alternatives and concerns, or
different priorities and ratings of concerns. The rationale of this step is
that it enables decision-makers to capture their updated perspectives.
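The computations in steps three and four can be sketched in a few lines. The data below are illustrative (the names and numbers are ours, echoing the JavaScript-framework example from step one): a sanity check for the hundred-dollar prioritization, and the average and range metrics computed over one alternative-concern cell.

```python
def hundred_dollar_ok(priorities):
    """Step 3 sanity check: each decision-maker's concern
    priorities must sum to 100 (the hundred-dollar approach)."""
    return all(sum(p.values()) == 100 for p in priorities.values())

def cell_metrics(ratings):
    """Step 4 metrics over one alternative/concern cell: average
    and range of the decision-makers' Likert ratings."""
    return sum(ratings) / len(ratings), max(ratings) - min(ratings)

# Hypothetical data for a JavaScript-framework decision:
priorities = {
    "alice": {"testability": 40, "performance": 35, "learning curve": 25},
    "bob":   {"testability": 20, "performance": 50, "learning curve": 30},
}
# Ratings of 'Angular' against 'performance', one per decision-maker:
angular_performance = [4, 2, 3]

assert hundred_dollar_ok(priorities)
avg, rng = cell_metrics(angular_performance)
print(avg, rng)  # 3.0 2
```

The same two helpers applied per concern (for priorities) and per cell (for ratings) yield the four metrics that drive the discussion in step four.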
7.3 GADGET Case Study
We conducted an exploratory case study to investigate the practical applicability of GADGET, for the purpose of evaluating GADGET with respect to its impact on consensus among decision makers, from the viewpoint of a group of decision makers, in the context of architectural decisions. Case studies are well suited for exploratory research questions (Runeson et al., 2012), since they offer the flexibility to study a phenomenon (e.g. group decision making) in its real-world context.
tentative hypotheses and confirmatory research questions, which can be further
investigated in subsequent studies. Next, we report the case study using the
guidelines of Runeson and Höst (2009).
7.3.1 Case Study Design
We defined the following three case study research questions:
RQ1. Is there a practical need for increasing consensus in group
architectural decision making?
As discussed in Section 7.1, there is very little work on consensus in group
architectural decision making. Therefore, before investing efforts into
developing approaches for increasing consensus, we investigated whether such
approaches are needed. If there is a practical need to increase consensus in
group architectural decision making, then an approach such as GADGET may
satisfy this need.
RQ2. What are the effort and benefits offered by GADGET?
The rationale for RQ2 is that practitioners are usually interested in the actual
benefits of a new approach (or GADGET in our case) and effort (i.e. time)
involved in using it. If an approach has low benefits and requires high effort,
then practitioners are unlikely to use such approach. Researchers need to pay
attention to effort and benefits of a new approach, to avoid proposing
approaches that practitioners are unlikely to use.
RQ3. What are potential improvements to GADGET?
The rationale for RQ3 is that we were open to improving GADGET, to ensure it satisfies the needs of its potential users. To improve GADGET, we needed
feedback from participants in the case study.
To recruit participants, we invited practitioners from the local community of
architects in Groningen. In addition, to obtain more data, we invited graduate
students with practical experience, who took the software architecture course
offered at the University of Groningen.
The case study used groups of three to four participants. Each case study
session for each group consisted of three steps:
1. Participants received an overview of the case study session in which they
participated, the GADGET process, and an example to illustrate the
GADGET process.
2. Participants used GADGET on an architectural decision topic they had
been involved with in their recent activity. Participants entered
alternatives, concerns, and ratings into a shared online spreadsheet that we
had prepared in advance.
3. Participants provided feedback on GADGET in a group discussion. To
focus the group discussions, we prepared the set of discussion items in
Table 7.1. We used the discussion items for RQ1 only during the sessions
with practitioners, and skipped these questions in the sessions with
students, since we were interested in identifying the real-world need for
GADGET, as indicated by practitioners.
Table 7.1. In the third step, we used these discussion items.

ID | Discussion Item | Research Question
1 | Do conflicting perspectives occur in group architectural decision making? | RQ1
2 | What is the impact of conflicting perspectives in group architectural decision making? | RQ1
3 | What approaches have you used so far in consensus building? (If any) | RQ1
4 | What did you like/dislike about the proposed process? | RQ2, RQ3
5 | Would you use this process in your practice? | RQ2, RQ3
6 | Did you change your opinion about alternatives? Why (not)? | RQ2
7 | How did the process help? | RQ2
8 | How can the process be improved? | RQ3
9 | In which situations would you apply the process? | RQ2, RQ3
We made audio recordings of the sessions, with the prior permission of the
participants. For analyzing the feedback from participants, two researchers
independently performed content analysis on the transcriptions of the
recordings and observer’s notes, to identify codes corresponding to sentences,
phrases or paragraphs, as recommended by Krippendorff (2004). Then, in
case of differences in interpretation, researchers discussed and resolved the
differences. We grouped the codes from the content analysis to answer the
three research questions: on need for consensus in group architectural decision
making (RQ1), effort/benefits of GADGET (RQ2), and possible improvement
for GADGET (RQ3).
7.3.2 Results
7.3.2.1. Case Study Participants and Execution
Table 7.2 summarizes the groups of students and practitioners that participated
in the case study, and the decision topics that were addressed during the
sessions. Years of experience refer to practical experience in software
engineering. Groups S1, S2 and P2 opted to use topics that we prepared in
advance, and all other groups used decision topics from their activities.
Regarding tool support for the sessions, groups P3 and P4 chose to use an
early version of the dedicated tool support for GADGET (detailed in Chapter
8). The other four groups chose to use the shared online spreadsheet.
Table 7.2. Groups of decision makers that participated in the case study.

Group id | Group size | Group type | Average years of experience | Decision topic | Number of GADGET iterations
S1 | 4 | Students | 4.62 | Enterprise Resource Planning system | 1
S2 | 4 | Students | 4.50 | JavaScript framework | 1
P1 | 3 | Practitioners | 9 | Buy or build critical component | 2
P2 | 3 | Practitioners | 9 | Communication system | 1
P3 | 4 | Practitioners | 3.66 | Operating system | 2
P4 | 3 | Practitioners | 6 | Programming language | 2
As an example of the execution of the sessions, participants in S1 indicated
concerns such as ‘low price’, ‘high security’, ‘high level of customer service’,
and ‘low learning curve’. For S1, step two of GADGET resulted in seven
alternatives (e.g. SAP Business One, Microsoft Dynamics, NetSuite) and
eleven concerns for the first session. In Step three of GADGET, members of
S1 prioritized concerns using the hundred-dollar approach. In addition,
participants rated each alternative against each concern on a one-to-five scale,
indicating how well an alternative satisfies a concern. Participants were
familiar with some of the consolidated alternatives, but needed more time to
learn about the others. During the session, they searched for information on the
alternatives on the internet, and used the results for the ratings. In Step four of
GADGET, members of S1 discussed the differences between the values they
assigned, starting with the ratings that had the highest ranges. Participants
discussed 14 ratings during the only iteration of the process. Participants
reached consensus for eleven ratings.
Finally, we spent 20 minutes to obtain feedback on GADGET through a group
discussion. We encouraged participants to provide feedback on their
experiences, using the questions in Table 7.1.
7.3.2.2. Analysis Results
Next, we present the results of the content analysis, for the three categories
corresponding to RQ1, RQ2, and RQ3.
RQ1 - Need for consensus in group architectural decision
making
Regarding occurrences of conflicting perspectives (item 1 in Table 7.1), two
architects indicated that conflicting perspectives related to a decision do not
occur very often, and four architects indicated that they occur very often.
Increasing the number of decision makers increases the number of conflicting
perspectives, since decision makers have different priorities for concerns, and
tradeoffs need to be found.
From the content analysis, we identified a positive and a negative impact of
conflicting perspectives (item 2 in Table 7.1). On the one hand, participants
indicated that resolving conflicting perspectives is often time consuming (as one
architect phrased it: ’long and often almost endless discussions’). On the other
hand, participants indicated that the outcome of the decision is better if there
are conflicting perspectives, because it encourages decision makers to address
concerns of more stakeholders.
Regarding approaches for increasing consensus (item 3 in Table 7.1), from the
content analysis we learned that architects lack structured approaches. Instead,
architects use unstructured group discussions to increase consensus.
Overall, there is a need for increasing consensus in group architectural
decision making in a systematic way, since 1) conflicting perspectives occur in
practice, 2) conflicting perspectives help make better decisions, and 3)
architects lack structured approaches for increasing consensus.
RQ2 - Effort and benefits
Regarding effort, we observed that GADGET requires one to three hours per
group. Regarding benefits, the main benefit that emerged from the content
analysis was increasing consensus among decision makers on the architectural
decision. This benefit was indicated by five participants. A participant in the
first session expressed this: ‘that’s what I really liked about the process: not
focusing on the decision making in the first place, but on agreeing on a
viewpoint.’ Additionally, a participant stated: ‘we learnt from it, you see other
points of view, you also see your own gaps and misconceptions.’ The overall
message from participants was that GADGET helped them increase consensus,
by developing an increased shared understanding of each other’s perspectives,
as a result of discussing the differences between them in a structured manner.
Several other additional benefits emerged from the content analysis:
1. Increased focus of the group discussions (appearing three times in the
content analysis). According to a participant, decision makers are ‘less
likely to run off-topic’. Moreover, participants considered that the
process offered a structured way of increasing consensus, with
prioritization of items for discussion, allowing them to ‘focus on stuff
that is important.’
2. Rationale – participants appreciated that GADGET helps them
capture the rationale for the decision, in addition to making the
decision. Specifically, GADGET provides the rationale through its
metrics, and maps concerns to participants. Therefore, architects can
see not only the outcome of the group decision, but also the
intermediary steps that lead to the outcome.
3. Reusability – participants indicated that GADGET output (i.e.
alternatives, concerns, and ratings) has high potential for reusability.
For example, after making a group decision with GADGET, if a
decision on the same topic needs to be made in the future, then
alternatives, concerns, and ratings may be reused. In addition, some
concerns may be reused across different decisions, especially across
decisions that have strong dependencies (e.g. security-related concerns
are reusable across most decisions for architecting a security-intensive
system).
4. Clarity of problem – architects indicated that GADGET helped them
clarify their point of view on the decision, by forcing architects to
make explicit what matters to them in the decision.
RQ3 – Improvements
During the case study with the first group of participants, they indicated the
need for increasing consensus on the priorities of concerns. Therefore, we
updated GADGET to include prioritization of concerns (i.e. step three of
GADGET), and we used the updated version of GADGET with the rest of the
groups.
Here are the additional improvements suggested by participants throughout the
sessions, and what we did about them:
- Participants suggested optimizing the time needed to use GADGET,
by avoiding idle time in a face-to-face meeting, which happens when
participants need different amounts of time to finish a step. For
example, step three of GADGET (i.e. prioritize and rate concerns, see
Section 7.2) can take place outside of a face-to-face meeting. Based on
this suggestion, we removed time constraints on using GADGET in
face-to-face meetings.
- Allow decision makers to eliminate less promising alternatives in later
iterations. Based on this suggestion, we made explicit in the GADGET
description (see step one in Section 7.2) that decision makers can also
indicate what alternatives and concerns to remove when iterating.
- Participants considered that spreadsheets lacked dedicated features,
such as the ability to trace divergent perspectives among decision
makers. One of the architects indicated that he ’wants to spend most of
the time on discussions, instead of working with the tool.’ We used
this feedback for developing dedicated, user-friendly tool support for
GADGET in Chapter 8.
7.3.3 Discussion
The exploratory case study offered us insights on GADGET. The increase in
consensus from using GADGET was visible not only in the input from
participants (e.g. ratings), but also in the feedback from participants. For
example, a participant mentioned: ‘I trust the knowledge my teammates have
from their respective fields. After noting they are more informed than I am, I
would gladly accept their vision of the alternative, and I would concede to
their rating.’ Additionally, other participants mentioned that strong arguments
from peers in their groups convinced them to adjust their ratings.
Overall, the benefits of GADGET include: increased focus of the discussions,
captured rationales of the decisions, potential for reusability of captured
knowledge on decisions, and time savings. Still, there is further room for
improving GADGET: offering additional prioritization approaches for
concerns, and adding confidence levels to ratings. Also, tool support for
GADGET needs to be user-friendly (i.e. a low learning curve, reducing the time required to learn and use GADGET).
7.3.3.1. Recommendations for Practitioners
From our experience with using GADGET, we recommend the following:
- Regarding threshold values for step four of GADGET (i.e. discuss differences), the recommended guideline values for differences are one for ratings and ten for priorities
- Regarding the number of iterations, two iterations for GADGET
provide sufficient opportunities for decision-makers to reach
consensus (i.e. general agreement on the decision, and mutual
understanding of each other’s perspectives)
- GADGET is particularly useful when the following conditions are
met:
o The topic of the architectural decision is important enough for
a group decision.
o The architectural decision has several promising alternatives,
so that spending time to evaluate them systematically is worthwhile.
o The decision makers have the maturity and openness to adopt
and apply a systematic approach for their decision.
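Applying the guideline thresholds above, the selection of items to discuss in step four can be sketched as follows. The data and names are illustrative, not from the case study:

```python
def items_to_discuss(ranges, threshold):
    """Return the items whose range across decision-makers exceeds the
    guideline threshold, sorted so that the largest disagreements are
    discussed first (as in step four of GADGET)."""
    flagged = {item: r for item, r in ranges.items() if r > threshold}
    return sorted(flagged, key=flagged.get, reverse=True)

# Illustrative ranges of priorities per concern (0-100 scale):
priority_ranges = {"testability": 25, "performance": 8, "learning curve": 15}
# Illustrative ranges of ratings per (alternative, concern) cell (1-5 scale):
rating_ranges = {("Angular", "performance"): 2, ("Ember", "testability"): 1}

print(items_to_discuss(priority_ranges, threshold=10))  # ['testability', 'learning curve']
print(items_to_discuss(rating_ranges, threshold=1))     # [('Angular', 'performance')]
```

Items below the thresholds are treated as having an acceptable degree of consensus and need no dedicated discussion.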
7.3.4 Implications for Research
Although there is a need for consensus in group architectural decision making, decision makers typically do not use any structured approach for increasing consensus when making group architectural decisions. Instead, they use an ‘as-is’ or ‘natural’ approach, in which consensus is increased without any predefined procedure. We call this approach ADHOC: increasing consensus in group architectural decisions without using any structured approach. Overall, the ADHOC approach seems to be popular in practice.
Exploratory case studies, such as the one we reported in this section, are useful
for obtaining insights and generating hypotheses for further research (Yin,
2003). This case study brought initial evidence that GADGET increases
consensus. Moreover, this case study helped us generate research questions
and hypotheses for comparing GADGET with ADHOC, which we report in
Section 7.4. Validity threats are reported in Section 7.5.
7.4 GADGET Experiment
The exploratory case study offered insights and initial evidence into the need
for increasing consensus in group architectural decisions, as well as the effort
and benefits offered by GADGET. One of the insights was that, in practice,
consensus is often increased without using any structured approach (i.e.
ADHOC). Therefore, we conducted an experiment to compare GADGET (i.e.
a new approach) with ADHOC (i.e. the existing frequently used approach).
This comparison allows drawing conclusions whether GADGET improves the
current state of practice. Next, we report the experiment using the guidelines
of Jedlitschka et al. (2008).
In this experiment, we used ADHOC (as motivated in the previous section) for
the control groups, and GADGET for the treatment groups. By comparing
GADGET with ADHOC, we could better understand if GADGET increases
consensus, compared to ADHOC. This was a further research step compared
to the exploratory case study in Section 7.3, in which we brought initial
evidence that GADGET increases consensus, but we did not compare
GADGET with another approach.
We chose to compare GADGET with ADHOC, instead of another process, for
two reasons:
1. Practical relevance. Since ADHOC is popular in practice (as found in
the case study in Section 7.3), the comparison with ADHOC helps
practitioners understand what they can expect from adopting
GADGET.
2. Lack of a reference process. As we found out in Chapter 4, there is
no reference process in the literature for group architectural decision
making to use as a baseline for comparison.
7.4.1 Research Goal and Questions
The goal of the experiment was to compare GADGET with ADHOC for the
purpose of understanding them with respect to their impact on consensus
among decision makers from the viewpoint of decision makers, in the context
of group decision making for software architecture.
From our research goal, we derive the following two research questions.
RQ1. Compared to ADHOC, what is the impact of GADGET on
increasing consensus among group architectural decision makers?
Rationale: This research question aims at offering evidence on how GADGET
compares against ADHOC at increasing consensus among decision makers. In
the case study in Section 7.3, we found that GADGET has the potential to
increase consensus. However, an ad-hoc and unsystematic approach (i.e.
ADHOC) can also help achieve consensus. If ADHOC has the same effect as
GADGET, then it makes little sense for decision makers to use GADGET,
since ADHOC has less overhead than GADGET.
RQ2. How do perceptions on GADGET vs. ADHOC differ among
decision makers?
Rationale: The perception of an approach strongly influences the actual intention to use that approach (Venkatesh and Davis, 2000). A positive
perception of an approach likely leads to a higher intention to use the
approach, which, in turn, results in actual usage of the approach. For example,
if some architects perceive that GADGET brings benefits such as capturing
rationale and correctness, without significant extra effort, then these architects
are likely to use GADGET in their future activity. Therefore, understanding
the perceptions on GADGET helps us understand the actual potential future
usage of GADGET.
We present the metrics for answering RQ1 and RQ2 in Sections 7.4.4 and
7.4.5.
7.4.2 Participants
There are certain constraints when selecting participants for experiments. If the
experiment has insufficient participants, then it is difficult to obtain relevant
results. Also, if the sample is not representative enough, then the results of the
experiment can be debated. However, a trade-off needs to be made between
the number of participants and their representativeness. Kubickova and Ro (2011) indicate that students are used as research subjects in an increasingly large number of scientific studies in various disciplines (e.g. in 80% of consumer research studies), despite decades-long debates on the scientific value of using students as research subjects.
Such debates also exist in software engineering research. A study on freshmen,
graduate students, and industry people found no conclusive results on
differences between these types of participants (Runeson, 2003). Another
study suggests that students ‘may work well’ as subjects for software
engineering studies (Svahnberg et al., 2008).
We chose to use a high number of participants with good-enough representativeness for inexperienced software architects, who can benefit greatly from a structured approach for increasing consensus in their group architectural decisions. Furthermore, since we aim at establishing causal relationships, using students is preferable to using practitioners: students help reduce variation and thus confounding factors, which increases the internal validity of the study. Participants in our experiment were graduate
and undergraduate software engineering students, who took a Software
Architecture course, in which they were presented the concept of architectural
decisions. We conducted the experiment with students from three universities:
the University of Groningen in the Netherlands, the University of Vienna in Austria, and the University of Pretoria in South Africa. We also describe the practical
experience of students in software engineering (see Section 7.4.6) in order to
interpret the results of the study according to the background of the students.
For validity and ethical purposes, we ensured that students were committed to
the study, and that the study contributed to the participants' education, as
recommended by Berander (2004). To this end, we followed a checklist for
integrating empirical studies with students into our research and teaching
goals (Carver et al., 2009; Galster et al., 2012). Below, we present several
items from Carver's checklist, as applied to our study.
1 Ensure adequate integration of the study into the course topics.
The course lectures stressed the importance of making architectural
decisions. In the introduction of the experiment session, we explained
to students how the session helps them make architectural decisions.
2 Write up a protocol and have it reviewed. We prepared the set of
steps to follow and discussed them with two other researchers not
involved in the study. The ethics committee from the University of
Pretoria reviewed the protocol and approved it, with minor
modifications (e.g. use a random id, instead of initials on the forms).
Reviews of ethics committees from the other universities were not
required.
3 Obtain subjects’ permission for their participation in the study.
We told students about the session at least one week in advance. We
also told students that the session addresses advanced topics in
architecture, and that participation is voluntary, with no influence on
their grades. By showing up for the session, students consented to
participate. In addition, students from the University of Pretoria signed
a consent form to indicate explicitly their consent, as required by the
ethics committee.
4 Build or update a lab package. We built a lab package for the
experiment, so that it could be replicated by other researchers. We
developed the lab package at the University of Groningen. Later on,
researchers from University of Vienna and University of Pretoria used
the same lab package to replicate the experiment.
7.4.3 Experimental Materials and Tasks
The lab package included the following experimental materials:
1 Experimental case. We used a predefined case with the purpose of
putting students in roles in which they had to make a group
architectural decision during the session. The case was based on a
real-world architectural decision, which we learnt about from
interviewing one of the original decision makers (Tselios et al., 2012).
The case had a five-page description with all the details that students
needed to make the group decision: description of the organization,
roles, problem, concerns, and alternatives. The case included three
decision maker roles: Department Manager, IT Architect, and
Business Analyst. Each student took one of the roles during the
experiment.
2 Task descriptions. Each student received descriptions of the tasks,
with detailed instructions for each step.
3 Shared spreadsheet. Students who used GADGET received access to
a shared Google spreadsheet. Each group received a separate
spreadsheet. Each spreadsheet included GADGET-specific fields (e.g.
ratings, priorities) for each decision maker, and instructions on how to
use the spreadsheet.
4 Post-questionnaire on perceptions. At the end of the session, students
filled out a post-questionnaire about their educational background and
experience, as well as their perceptions on various aspects of the group
decision making process (detailed in Section 7.4.5).
5 Post-questionnaire on consensus. This questionnaire included
questions about prioritizing concerns and rating how well the
alternatives satisfy those concerns. Students filled it out from their
own role's point of view, and also estimated how the other two group
members would fill it out. For example, a student could indicate a set
of concern priorities for her role, a different set of priorities for
one of her colleagues, and a totally different
set of concerns' priorities for the other colleague. We explain the
rationale for these measurements further in Section 7.4.4.
Table 7.3 shows an example of an item from the post-questionnaire on
consensus for capturing an IT Architect’s point of view. The topic of the
architectural decision described in Table 7.3 is choosing the newsletter system
that an organization is using for communicating with its customers. Alternative
A is to replace the current legacy system with a third-party software-as-a-
service solution. Alternative B is to pay a partner to develop a new, modern
system. Alternative C is to use an open source platform and various plugins.
Alternative D is to enhance the current legacy system. Students who had the
role of IT Architects filled out this item with their own values for priorities of
concerns (whose sum had to be 100). In addition, students filled out ratings
from one to five, indicating strong disagreement, disagreement, neutral,
agreement, or strong agreement on how well each of the alternatives described
in the case (i.e. A, B, C, and D) satisfied each of the concerns.
To help students maintain their focus throughout the experiment, we simplified
the post-questionnaire on consensus: we asked students to rate alternatives
from the other roles' points of view for two concerns instead of six. Thus,
post-questionnaire items for the IT Architect's point of view only had the
last two rows (i.e. cost-efficient, training time), the items for the Business
Analyst role included only the first two rows, and the items for the
Department Manager included only the middle two rows. This simplification
reduced the risk of obtaining random data as a reaction to being asked to
perform a tedious task.
Table 7.3. Example of post-questionnaire item for capturing an IT Architect's
priorities of concerns, and ratings of the four alternatives (i.e. A, B, C,
and D) against two concerns.

Concerns               Priorities    A     B     C     D
Better analytics          ___        -     -     -     -
Higher security           ___        -     -     -     -
Better delivery time      ___        -     -     -     -
Easily scalable           ___        -     -     -     -
More cost-efficient       ___       ___   ___   ___   ___
Better training time      ___       ___   ___   ___   ___
Total:                    100
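As an aside, the priorities (summing to 100) and the 1-to-5 ratings collected
through items such as Table 7.3 can be combined into a single weighted score
per alternative. The sketch below only illustrates this arithmetic with
hypothetical values; it is not GADGET's prescribed aggregation.

```python
def weighted_score(priorities, ratings):
    """Combine concern priorities (summing to 100) with 1-5 ratings
    of one alternative into a single weighted score."""
    assert sum(priorities) == 100, "priorities must sum to 100"
    return sum(p * r for p, r in zip(priorities, ratings)) / 100

# Hypothetical IT Architect values for the six concerns in Table 7.3.
priorities = [10, 30, 10, 20, 20, 10]   # must sum to 100
ratings_a = [4, 2, 5, 5, 4, 3]          # alternative A rated against each concern
score_a = weighted_score(priorities, ratings_a)
```

A decision maker could compute such a score per alternative and compare them,
although the discussion among decision makers remains the core of the process.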
Figure 7.2 shows the steps of the experimental process. First, we gave a short
presentation with the plan for the session and an overview of the tasks.
Second, we randomly assigned students to groups of three, since architectural
decisions typically involve three persons, as presented in Chapter 4. When the
number of students was not divisible by three, we added each extra student to
an existing group. Third, we split the groups between two conditions: half of
the participants remained in the same room (control group), and the other half
moved to a different room (treatment group). Fourth, students read the
experimental case and task descriptions. Fifth, students made the group
decisions. Finally, students filled out the post-questionnaires on perceptions
and consensus. During the session, we were available to answer questions from
students, if necessary.
Figure 7.2. Students followed the above steps for the experimental process.
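The group formation in the second step, including the handling of leftover
students, can be sketched as follows; the student labels and the seed are
hypothetical.

```python
import random

def form_groups(students, size=3, seed=None):
    """Randomly shuffle students, cut them into groups of `size`,
    and fold any leftover students into existing groups one by one."""
    rng = random.Random(seed)
    pool = list(students)
    rng.shuffle(pool)
    n_groups = max(1, len(pool) // size)
    groups = [pool[i * size:(i + 1) * size] for i in range(n_groups)]
    # Extra students (when len(pool) is not divisible by size) join
    # existing groups, so some groups have size + 1 members.
    for i, extra in enumerate(pool[n_groups * size:]):
        groups[i % n_groups].append(extra)
    return groups

groups = form_groups([f"student{i}" for i in range(20)], seed=42)
```

With 20 students this yields six groups: four of three members and two of four.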
In general, for an experiment, a null hypothesis (H0) states that the
treatment causes no difference (e.g. using GADGET makes no difference when
compared to an ad-hoc decision making approach). The alternative hypothesis
(H1) states that the treatment makes a difference (e.g. GADGET may help or
hinder reaching consensus, compared to an ad-hoc approach) (Wohlin et al.,
2012). Based on the analysis of the data from the experiment, the null
hypothesis can be rejected and the alternative hypothesis accepted. The
analysis uses statistical tests to determine statistically significant
differences between the data from the control group (i.e. ADHOC) and the data
from the treatment group (i.e. GADGET). Next, we present the hypotheses, including
their null and alternative hypotheses, on the differences caused by the
treatment in our experiment (i.e. GADGET).
7.4.4 Hypotheses for RQ1 – Consensus
To answer RQ1, we define metrics for operationalizing consensus among
decision makers. As mentioned in Section 7.1, we consider two components of
consensus: general agreement and mutual understanding. We define
hypotheses and metrics on both components of consensus.
7.4.4.1. Hypothesis on General Agreement
Regarding general agreement, we defined a metric that counts how many
groups reached agreement on their group architectural decision. For example,
if no group reached agreement on their group architectural decisions, then this
metric is zero. Using this metric, we propose the following hypothesis.
Ha0: ADHOC and GADGET result in the same general agreement among
group decision makers.
Ha1: GADGET results in higher general agreement than ADHOC.
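This agreement hypothesis is later evaluated with a binomial test (see Table
7.6). As a minimal illustration of the computation such a test rests on, the
exact probability of observing at least k agreeing groups out of n, under a
chance probability p of any single group agreeing, can be computed as follows;
the counts and the chance probability below are hypothetical.

```python
from math import comb

def binom_tail(k, n, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p): the one-sided tail
    probability that a binomial test is built on."""
    return sum(comb(n, i) * (p ** i) * ((1 - p) ** (n - i))
               for i in range(k, n + 1))

# Hypothetical: all 10 groups in a condition reached agreement; how
# surprising is that if each group had only a 50% chance of agreeing?
p_value = binom_tail(10, 10, p=0.5)
```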
7.4.4.2. Hypothesis on Mutual Understanding on the Priorities of Concerns
Regarding mutual understanding among group decision makers, a group has
high mutual understanding on a decision, if group members are also able to
indicate accurately the perspectives of the other group members on that
decision. For example, let us consider three architects (Anne, Bob, and
Charlie) who need to make a group architectural decision on which framework
(e.g. A, B, C, or D) to use for a new software system. High mutual
understanding among the three architects means that, after discussions, each of
the three architects is able to indicate accurately what the other two architects
think about the performance of each framework. In contrast, low mutual
understanding means the input from the other group members was not taken
seriously, which resulted in misunderstandings among architects on each
other’s perspectives (e.g. at the end of the discussion, Charlie has no idea what
Anne thinks about the performance of the C framework, although Anne
mentioned this during the discussion).
Priorities of concerns are a ratio type of data, which means that calculating
differences between priorities is allowed. For the metric related to the
mutual understanding on the priorities of concerns, we calculate the sum of
absolute differences between the priorities assigned by a student and the
priorities that the student's group colleagues estimated. Based on these
assumptions, equation (1) summarizes the metric for calculating mutual
understanding on priorities (MUP) of concerns, for a decision with six
concerns (see Table 7.3) in a group of three decision makers:

    MUP_A = \sum_{j=1}^{2} \sum_{i=1}^{6} | p_{A,i} - p_{i,j} |        (1)

where p_{A,i} is the priority indicated by architect A for concern i, from A's
point of view, and p_{i,j} is the priority for concern i that colleague j
estimates A indicated. MUP is zero or more (see Table 7.5). Lower values of
the metric mean higher mutual understanding among group decision makers, due
to smaller differences between estimated and actual priorities.
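As an illustration, the MUP metric of equation (1) can be computed for one
decision maker as follows; all priority values below are hypothetical.

```python
def mup(own, estimates):
    """Mutual understanding on priorities: sum of absolute differences
    between a decision maker's own concern priorities and the priorities
    each of the two colleagues estimated for that decision maker."""
    return sum(abs(p - e) for est in estimates for p, e in zip(own, est))

own   = [30, 20, 15, 15, 10, 10]   # IT Architect's own priorities (sum to 100)
est_b = [25, 25, 15, 15, 10, 10]   # colleague B's estimate of those priorities
est_c = [40, 10, 15, 15, 10, 10]   # colleague C's estimate of those priorities
score = mup(own, [est_b, est_c])   # lower = higher mutual understanding
```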
Using the above metric, we propose the following hypothesis.
Hb0: ADHOC and GADGET result in the same level of mutual
understanding on priorities of concerns among group decision makers.
Hb1: GADGET results in higher mutual understanding on priorities of
concerns than ADHOC.
7.4.4.3. Hypothesis on Mutual Understanding on Ratings
Ratings of alternatives are provided on a 5-point Likert scale, which may be
considered an ordinal type of data. This means that summing differences
among ratings (similar to eq. (1) in Section 7.4.4) is problematic. Instead of
summing differences among ratings, we use the standard deviation to measure
the variation among ratings. Similar to the metric for priorities, we
calculate the standard deviation over one's own ratings and the ratings that
the other decision makers in the group estimated for one's ratings. Lower
values of the standard deviation indicate higher mutual understanding on
ratings among group decision makers, due to smaller variation between
estimated and actual ratings.
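As an illustration, this metric can be computed as follows. We use the
population standard deviation here as an assumption, since the exact variant
is not specified; the ratings below are hypothetical.

```python
from statistics import pstdev

def rating_spread(own_rating, estimated_ratings):
    """Population standard deviation across a decision maker's own 1-5
    rating and the ratings colleagues estimated for that decision maker;
    lower values indicate higher mutual understanding on ratings."""
    return pstdev([own_rating] + list(estimated_ratings))

# Hypothetical: the IT Architect rated alternative A against the
# 'cost-efficient' concern as 4; colleagues estimated 4 and 2.
spread = rating_spread(4, [4, 2])
```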
Using the standard deviation metric, we propose the following hypothesis.
Hc0: ADHOC and GADGET result in the same level of mutual
understanding on ratings of alternatives against concerns among group
decision makers.
Hc1: GADGET results in higher mutual understanding on ratings of
alternatives against concerns than ADHOC.
7.4.5 Hypotheses for RQ2 - Perceptions
To answer RQ2, we defined metrics to measure the perceptions of the group
decision makers about the process they use (i.e. GADGET or ADHOC). Based
on existing literature, we propose three categories of perceptions: benefits
of using GADGET, challenges related to its use, and satisfaction with the
group decision making process. For each category, we propose
several perception items. Each perception item is operationalized by indicating
the level of agreement with items in the post-questionnaire, using a five-point
Likert scale (i.e. from strong disagreement to strong agreement). The items in
the post-questionnaire originate from the initial GADGET evaluation in
Section 7.3, and literature on decisions. Table 7.4 shows the perception
categories, perception and post-questionnaire items, as well as the literature
source for the items.
Table 7.4. Mapping of perception categories, metrics, and post-questionnaire
items.

Benefits
  M1.  Reevaluation of initial perspective: "After discussing the case with my
       team I changed my mind regarding the importance of one or more
       concerns" (Hartwig, 2010; Schweiger et al., 1986)
  M2.  Reveals extra points: "The discussion with my team revealed valid
       points that I would not be able to consider on my own" (Hartwig, 2010;
       Schweiger et al., 1986)
  M3.  Reusability: "The artefacts (documents, notes, tables, spreadsheets,
       etc.) that my team created during the decision-making session could be
       reused to examine similar situations in the future" (Jansen et al.,
       2007; Tang and van Vliet, 2009)
  M4.  Rationale: "The artefacts that my team created during the
       decision-making session could be used to justify to other people the
       reasons we made this decision" (Jansen et al., 2007; Tang and van
       Vliet, 2009)
  M5.  Clarifies problem: "After the decision-making session, my team had a
       clearer view on ASO's problem" (Lai et al., 2002)
  M6.  Improves decision making skills: "The decision-making session improved
       my decision-making skills" (Hardgrave et al., 2003)

Challenges
  M7.  Understandability of process: "It was too difficult for me to
       understand what I was required to do" (Hardgrave et al., 2003)
  M8.  Clarity of instructions: "The instructions were clear enough"
       (Hardgrave et al., 2003)
  M9.  Time for decision: "I believe that the decision-making session required
       too much time" (Chapter 5)
  M10. Effort: "I believe that the decision-making session required too much
       effort" (Chapter 5)
  M11. Preparation time: "It took me too long to understand what I was
       required to do in the decision-making session" (Chapter 5)

Satisfaction
  M12. Willingness for future collaboration: "I would be willing to work with
       the same team on other projects in the future" (Schweiger et al., 1986)
  M13. Satisfaction on cooperation: "Working together with my teammates was an
       enjoyable experience" (Schweiger et al., 1986)
  M14. Enjoyment: "I enjoyed the decision-making session" (Schweiger et al.,
       1986)
  M15. Commitment: "I strongly support my group's final decision" (Schweiger
       et al., 1986)
  M16. Overall satisfaction: "I am satisfied with my group's decision"
       (Schweiger et al., 1986)
Based on the 16 metrics in Table 7.4, we define 16 hypotheses, as follows.
Since the hypotheses are similar and only the metrics vary, we formulate a
generic hypothesis, which is adaptable to each of the 16 hypotheses.
HMi0: ADHOC and GADGET result in similar perceptions on the Mi metric
(where Mi varies from M1 to M16) among group decision makers.

HMi1: ADHOC and GADGET result in different perceptions on the Mi metric
among group decision makers.
In summary, the independent variable for this experiment is the group decision
making process (i.e. GADGET or ADHOC). The dependent variables for RQ1
and RQ2 are summarized in Table 7.5.
Table 7.5. Summary of dependent variables for each research question.

RQ    Hypothesis   Metric description                                  Scale type   Range
RQ1   Ha0          General agreement                                   Nominal      Yes/no
      Hb0          Sum of differences between priorities of concerns   Ratio        Zero or more
      Hc0          Standard deviation of ratings                       Ratio        Zero or more
RQ2   HMi0         Perception metrics                                  Interval     1 to 5
7.4.6 Results
The experiment took place in three sessions. The first session took place with
18 students at the University of Groningen. The second session took place with
72 students at the University of Vienna. The third session took place with 23
students at the University of Pretoria. All sessions took place in a similar
manner, and no deviations from the protocol occurred. After performing the
experimental sessions, we discarded data from 11 students, due to missing or
incomplete values. The valid data from the remaining 102 students was
analyzed as follows.
7.4.6.1. Analysis Procedure
To analyze the collected data, we defined analysis procedures for investigating
the hypotheses in Sections 7.4.4 and 7.4.5. Table 7.6 summarizes the analysis
procedures for all hypotheses. We used the Mann-Whitney U test because it is
well suited for comparing two independent samples (i.e. the treatment/GADGET
and control/ADHOC groups). Furthermore, this statistical test is
non-parametric (i.e. it makes no assumption about the normal distribution of
the data), which suits this experiment, since we cannot
assume that the data is normally distributed. Still, we checked the normality of
the data using the Shapiro-Wilk test, to confirm the validity of using a non-
parametric test. We used IBM SPSS for applying statistical tests.
Table 7.6. Summary of hypotheses and their analysis procedure.

Research question   Hypothesis keywords                       Hypotheses    Analysis procedure
RQ1 Consensus       Agreement                                 Ha0 - Ha1     Binomial test
                    Mutual understanding (priorities          Hb0 - Hb1     Mann-Whitney U tests
                    of concerns)
                    Mutual understanding (ratings of          Hc0 - Hc1     Mann-Whitney U tests
                    alternatives against concerns)
RQ2 Perceptions     Benefits, challenges and satisfaction     HMi0 - HMi1   Mann-Whitney U tests
                                                              (Mi covers
                                                              M1 to M16)
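We ran these tests in IBM SPSS. As a minimal stdlib sketch of the statistic
underlying the Mann-Whitney U test, the U value can be computed by pairwise
comparison of the two samples; the sample values below are hypothetical, not
the experiment's data.

```python
def mann_whitney_u(x, y):
    """U statistic for sample x versus y: the number of (xi, yj) pairs
    with xi > yj, counting ties as half. A small U for x means x tends
    to take lower values than y."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u

# Hypothetical MUP scores (lower is better): a small U for the GADGET
# sample suggests GADGET yields lower, i.e. better, scores than ADHOC.
gadget = [95, 100, 88, 120, 90]
adhoc = [130, 145, 120, 160, 110]
u_gadget = mann_whitney_u(gadget, adhoc)
```

In practice, U is then compared against its null distribution to obtain a
p-value, which statistical packages such as SPSS compute automatically.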
7.4.6.2. Participants’ Background
Regarding background, we asked participants to indicate their number of years
of practical experience in software engineering. Figure 7.3 summarizes the
results. Five students declined to respond. Overall, a third of the students had
more than one year of practical experience. Participants’ levels of experience
are balanced across the treatment (i.e. GADGET) and control (i.e. ADHOC)
groups.
Figure 7.3. Summary of the years of practical experience in software engineering
of the students.
7.4.6.3. Answer to RQ1 - Consensus
To answer RQ1, we tested the three hypotheses on the two components of
consensus (i.e. agreement and mutual understanding) summarized in Table
7.6. Regarding the hypothesis on agreement, we found that all groups from
both treatments reached agreement. Therefore, we cannot reject the null
hypothesis Ha0 (detailed in Section 7.4.4), and conclude that both GADGET
and ADHOC result in agreement among group decision makers.
Table 7.7 summarizes the values for the hypotheses on mutual understanding
on priorities of concerns and ratings. For example, the average values for the
metrics on priorities (as defined in Section 7.4.4) were 133.91 for students in
the control group (ADHOC), and 102.69 for students in the treatment group
(GADGET). We checked the normality of the data using the Shapiro-Wilk
test, and we found out that the data was not normally distributed (p-value =
0.011). The non-parametric Mann-Whitney U test on Hb returned a statistically
significant difference (p-value = 0.0003). Therefore, we reject the null
hypothesis (i.e. Hb0 in Section 7.4.4), and conclude that GADGET results in
higher consensus for priorities of concerns among group decision makers.
Regarding the hypothesis on mutual understanding on ratings of alternatives
against concerns, we found lower standard deviations of ratings in the
GADGET group. The average values for metrics on ratings (as defined in
Section 7.4.4) was 1.29 for students in the control group (ADHOC), and 1.12
for students in the treatment group (GADGET). The Shapiro-Wilk test for
normality indicated the data was not normally distributed (p-value = 0.015).
The Mann-Whitney U test returned a statistically significant difference (p-
value = 0.00001). Therefore, we reject Hc0, and we conclude that GADGET
results in higher consensus for ratings among group decision makers.
Table 7.7. Medians and means for ADHOC and GADGET for the metrics on mutual
understanding.

Hypothesis  Hypothesis keywords       Metric description    Median (mean)  Median (mean)  p-value
                                                            ADHOC          GADGET
Hb          Mutual understanding on   Sum of differences    130 (133.91)   95 (102.69)    0.0003
            priorities of concerns    between priorities
                                      of concerns
Hc          Mutual understanding on   Standard deviation    1.31 (1.29)    1.17 (1.12)    0.00001
            ratings                   of ratings
7.4.6.4. Answer to RQ2 - Perceptions
We tested the 16 hypotheses (in Section 7.4.5) on students’ perceptions on the
GADGET and ADHOC approaches. Table 7.8 summarizes the results for the
16 hypotheses corresponding to M1 to M16. The Shapiro-Wilk test for
normality indicated that the data for all metrics was not normally
distributed (p-value = 0.000). After applying Mann-Whitney U tests, we found
statistically significant (p < 0.05) differences on eight metrics: we
rejected the null hypotheses for metrics M3, M4, M7, M9, M10, M12, M13, and
M16, and accepted their corresponding alternative hypotheses.
Table 7.8. Perceptions on ADHOC and GADGET (1 = strong disagreement, 5 =
strong agreement). Statistically significant results (p < 0.05) are marked
with *.

ID    Perception metric item                    Median (mean)  Median (mean)  p-value
                                                ADHOC          GADGET
Benefits
M1.   Reevaluation of initial perspective       3 (2.90)       3 (3.17)       .192
M2.   Reveal extra points                       3 (3.33)       3 (3.17)       .375
M3.   Reusability                               3 (2.76)       4 (3.55)       .019 *
M4.   Rationale                                 4 (2.86)       4 (3.77)       .005 *
M5.   Clarify problem                           4 (3.88)       4 (3.83)       .821
M6.   Improve skills                            3 (3.27)       3 (3.25)       .641
Challenges
M7.   Understand process (negative statement)   1 (1.31)       2 (1.64)       .007 *
M8.   Clear instructions                        4 (4.27)       4 (4.15)       .352
M9.   Time for decision (negative statement)    2 (2.10)       2 (2.58)       .007 *
M10.  Effort (negative statement)               2 (1.96)       3 (2.57)       .0003 *
M11.  Preparation time (negative statement)     2 (1.65)       2 (1.83)       .222
Satisfaction
M12.  Willingness for future collaboration      4 (4.29)       4 (3.7)        .00006 *
M13.  Satisfaction on cooperation               4 (4.29)       4 (3.94)       .005 *
M14.  Enjoyment                                 4 (4.18)       4 (3.81)       .157
M15.  Commitment                                4 (4.14)       4 (3.87)       .155
M16.  Overall satisfaction                      4 (4.15)       4 (3.87)       .037 *
7.4.7 Discussion
Controlled experiments are particularly useful for establishing causal
relationships (Wohlin et al., 2012). In this experiment, we compared the
impact of the group decision making approach (i.e. GADGET or ADHOC) on
two components of consensus: mutual understanding and general agreement.
We found that GADGET performs better than ADHOC at increasing mutual
understanding among decision makers, for both priorities of concerns and
ratings of alternatives against concerns. We found no difference between
GADGET and ADHOC in terms of general agreement.
Additionally, we found statistically significant differences between
perceptions (RQ2) on GADGET vs. ADHOC as follows:

- Regarding perceptions on the benefits of the GADGET vs. ADHOC approaches,
  reusability of the artefacts created while using the approaches (e.g.
  alternatives, rationale) was significantly higher for GADGET. In addition,
  the GADGET approach allowed better capturing of the rationales for the
  architectural decisions than ADHOC. However, we found no significant
  differences on reevaluating initial perspectives, revealing extra points,
  problem clarification, and improving decision making skills.

- Regarding perceptions on the challenges of using GADGET vs. ADHOC, we found
  the following significant differences. GADGET users had more difficulty
  understanding the process than ADHOC users, which reflects the learning
  curve of GADGET. In addition, GADGET users perceived a higher time and
  effort to make decisions compared to ADHOC, which reflects the effort of
  using a structured approach for group decision making. However, we found no
  differences on the clarity of the instructions and the preparation time.

- Regarding perceptions on the satisfaction of using GADGET vs. ADHOC, we
  found significantly higher willingness for future collaboration with the
  same team members for ADHOC. Also, ADHOC users reported higher satisfaction
  on cooperation and overall higher satisfaction with their decisions than
  GADGET users. However, we found no significant differences on enjoying the
  session, and on one's commitment to one's group final decision.
7.4.7.1. Interpretation of Results
These findings mean the following:

- Regarding consensus among decision makers, this experiment indicates
  GADGET's positive effect on increasing consensus. The combined evidence from
  the case study in Section 7.3 and the experiment in this section indicates
  that practitioners can use GADGET to increase consensus in their
  architectural decisions.

- Regarding the benefits of GADGET vs. ADHOC, the experimental results on
  reusability and capturing rationales confirmed the results from the case
  study. These benefits help practitioners avoid architectural knowledge
  vaporization and reduce maintenance costs. For the remaining four items on
  benefits (i.e. reevaluation of initial perspective, revealing extra points,
  clarifying the problem, and improving decision making skills), the results
  in Table 7.8 indicate no differences between GADGET and ADHOC, which means
  these four items are not key benefits of GADGET.

- Regarding the challenges of GADGET vs. ADHOC, the results indicate a higher
  cost for decision makers, in terms of time and effort, for using GADGET.
  These results were obtained with first-time users of GADGET, whereas
  participants had very likely made ad-hoc group decisions before the
  experiment, given their years of experience (see Figure 7.3). We can expect
  the effort of using GADGET to decrease with subsequent uses, after passing
  its learning curve. Still, the lack of differences on instruction clarity
  and preparation time (in Table 7.8) suggests that participants could learn
  GADGET from the written instructions they received. Overall, although GADGET
  has a learning curve, we expect practitioners to progress fast along it.

- Regarding satisfaction with using GADGET vs. ADHOC, we note that ADHOC
  scored more favorably than GADGET. However, the results still show positive
  satisfaction from using GADGET. Overall, practitioners who use GADGET for
  the first time can expect positive satisfaction, although lower than with
  ADHOC, which is more familiar to practitioners.
7.4.7.2. Cross-study Discussion
From the case study and the experiment, we learnt that GADGET increases
consensus among participants. Furthermore, GADGET helps make better
decisions, by encouraging decision makers to evaluate alternatives
systematically. Finally, GADGET reduces architectural knowledge vaporization
by capturing the rationale of the group decision.
In general, the behavior of the decision makers can be explained by Bryson's
(1996) rationale for reaching consensus in a group. Bryson indicates that,
during group decision making, some participants are in a learning mode and
others are in a strategic mode. Participants in a learning mode are less
certain about their preferences, and are willing to change them. In contrast,
participants in a strategic mode are more certain about their preferences and
less likely to change them.
In the case study and the experiment, we noticed participants' learning and
strategic modes while they used the GADGET approach. However, these modes
varied per alternative. For example, a case study participant was in a
learning mode for one alternative, but in a strategic mode for another.
Overall, GADGET ensured that participants in the strategic mode could present
their arguments to participants in the learning mode, which helped achieve
consensus among the participants.
7.4.7.3. Limitations of GADGET
There are a few limitations to applying GADGET in practice. GADGET assumes
that the participants in the group decision making are at a similar hierarchy
level, and that no politics are involved in the decision making.
include social relationships among participants. For example, if the group has
high cohesion, then the group decision making process might be easier to
adopt and follow. Still, more work is needed to understand these limitations
and their influence on the adoption and results of group decision making
processes, such as GADGET.
7.5 Validity Threats
Using guidelines from (Jedlitschka et al., 2008) and (Wohlin et al., 2012), we
present construct, internal, external, and conclusion validity threats for the case
study (detailed in Section 7.3) and experiment (detailed in Section 7.4).
7.5.1 Case Study Validity Threats
Construct validity - to avoid this threat, we conducted the case study not only
with students (two groups), but also with practitioners (four groups).
Furthermore, we prevented interviewer bias (i.e. attempting to please the
researchers) and response bias (i.e. giving responses that make participants
look good) by
collect areas for improvement, as reported in Section 7.3.2.2. Finally,
participants were anonymized and had no incentive (e.g. grades, money) to
please researchers.
Internal validity threats were not applicable for the case study, since we did
not attempt to show any causality relationship.
External validity - to address this threat, we involved practitioners in the case
study. Furthermore, the students who participated in the case study also had
practical experience.
Conclusion validity - the study conclusions were drawn based on the results
from the content analysis of interviews with participants, using guidelines
from the literature (Krippendorff, 2004). To ensure accurate conclusions, two
researchers were involved in the content analysis of the interviews with
participants. The researchers made sure that there was high agreement in their
interpretation of the data.
7.5.2 Experiment Validity Threats
We addressed construct validity by operationalizing the constructs in our
experiment: we defined metrics for each hypothesis (see Sections 7.4.4 and
7.4.5). Furthermore, to avoid impact on participants’ behavior, we made clear
to the participants that the experiment would not have any impact on their
grades. Additionally, to avoid hypotheses guessing and evaluation
apprehension, we did not tell participants our hypotheses.
To address internal validity threats, such as the instrumentation validity
threat, we made a pilot for the experiment (Tselios et al., 2012), to increase the
clarity of the experimental package. For example, we increased the readability
of the paper forms, so that subjects could easily understand their tasks. We
addressed the mortality validity threat by integrating the study with the
software architecture course (see Section 7.4.2), so that subjects joined it
voluntarily for the educational value. We distributed participants randomly to
the groups to avoid selection threats. Furthermore, by using students we
increased internal validity, since using practitioners means larger variation in
confounding variables such as domains, types of previous projects, or previous
experiences.
Another instrumentation validity threat is that students took roles (i.e.
department manager, IT architect, or business analyst) for which they had little
or no experience. To address this threat and to avoid relying on the experience
of participants, we gave each student printouts with the description of their
corresponding role. This description contained all the information they needed
to make the decision and to participate in the group decision process. Thus, students did not need external sources of information or previous experience during the experiment.
Regarding external validity, Kitchenham et al. regard students as relatively
close to the population of interest, because they are the next generation of
software professionals and close to novice software developers (Kitchenham et
al., 2002). We consider our results as applicable to inexperienced architects,
rather than senior architects. Since inexperienced architects need more support
than senior architects, it is reasonable to use students in the experiment,
instead of senior architects. Moreover, the nature of tasks students had to
perform did not require experience levels of senior architects, as students had
sufficient knowledge to perform their tasks. To ensure the commitment of the
participants, we made sure that the experiment contributes to participants’
education (see Section 7.4.2). To check whether or not GADGET is also
applicable to more experienced or senior architects, we need to conduct a
future similar experiment with practitioners.
Regarding conclusion validity, statistical tests have various assumptions, and
violating them may lead to poor conclusions. We used non-parametric tests
that make fewer assumptions, such as Mann-Whitney. By conducting the
experiment with a large sample of students from multiple universities, we
aimed to increase the statistical power of the tests. Another potential threat is that
some metrics (e.g. perceptions) tend to be less reliable than others (e.g.
ratings). To address this threat, we piloted our study (Tselios et al., 2012) to
clarify wording, and avoid misunderstandings.
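For readers unfamiliar with the test mentioned above, the Mann-Whitney U statistic can be computed from ranks alone. The sketch below (a minimal Python illustration with made-up samples, using average ranks for ties; it is not the analysis pipeline used in the experiment) reports the smaller of the two U values:

```python
def mann_whitney_u(a, b):
    """Mann-Whitney U statistic for two independent samples, using average
    ranks for ties; returns the smaller of the two U values."""
    combined = sorted(a + b)
    ranks = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        # tied values combined[i:j] share the average of ranks i+1 .. j
        ranks[combined[i]] = (i + 1 + j) / 2
        i = j
    r1 = sum(ranks[x] for x in a)           # rank sum of the first sample
    u1 = r1 - len(a) * (len(a) + 1) / 2
    u2 = len(a) * len(b) - u1
    return min(u1, u2)
```

A small U relative to the sample sizes indicates that the two groups differ, which is why fewer distributional assumptions are needed than for a t-test.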
7.6 Related Work
There have been a few approaches and studies on group architectural
decisions. Zannier et al. describe real-world architectural decisions, and ask for
more work on understanding real-world group architectural decisions (Zannier
et al., 2007). Kazman et al. propose an extension of CBAM (Kazman and
Klein, 2001) that considers explicitly the preferences of group architectural
decision makers (Kazman et al., 2005). Recently, Rekha and Muccini analyze
real-world group architectural decision making (Rekha and Muccini, 2014).
Nowak and Pautasso analyze situational awareness in group architectural
decision making (Nowak and Pautasso, 2013b). Gaubatz et al. propose
automatic enforcements of constraints in group architectural decisions
(Gaubatz et al., 2015). In this chapter, we focus on a particular aspect of group
architectural decision making (i.e. increasing consensus), which has not been
addressed in previous work.
Related work on processes for group architectural decision making includes the
following. Babar et al. studied the feasibility of groupware support for
architecture evaluation, with applicability on architectural decisions (Babar et
al., 2006). Al-Naeem et al. propose using the Analytical Hierarchy Process in
group architectural decision making (Al-Naeem et al., 2005). Nakakawa et al.
propose a theoretical model on group architectural decision making for
enterprise software systems (Nakakawa et al., 2010). Sousa et al. present a
process for group architectural decision making, in which a facilitator helps
the group interactions (Sousa et al., 2006). In contrast, the GADGET process proposed in this chapter does not require a facilitator, and our focus is on presenting empirical evidence for it.
Related work on approaches that capture architectural knowledge and help
group architectural decisions includes the following. Falessi et al. reported an
experiment with students on documenting the rationales of group architectural
decisions (Falessi et al., 2006). Mohan and Ramesh propose a traceability
framework for group architectural decisions (Mohan and Ramesh, 2007).
Zimmermann et al. propose a framework for capturing architectural decisions
which can help group architectural decisions (Zimmermann et al., 2009). In
this chapter, we provide evidence that the GADGET process reduces
architectural knowledge vaporization.
Tang (Tang, 2011) mentions communication issues that may appear in group
architectural decision making, but no process improvement is offered. Also,
Kazman et al. (Kazman et al., 1999) describe the importance of consensus for
the ATAM approach, but without describing how to increase consensus for
architectural decisions. Furthermore, the Attribute-Driven Design method
(Wojcik et al., 2006) does not indicate how to increase consensus in group
architectural decisions. In contrast, in this chapter we provide evidence on how
GADGET increases consensus in group architectural decisions.
7.7 Conclusions
In this chapter, we propose and evaluate GADGET, a process for increasing
consensus in group architectural decisions. Consensus is conceptualized in
terms of its two main components: general agreement and mutual
understanding. GADGET is based on our previous work on using the
Repertory Grid technique to capture architectural knowledge in Chapters 5 and
6, as well as the Delphi technique (Linstone and Turoff, 2002). To evaluate
GADGET thoroughly, we conducted a case study and an experiment, which
increases the validity of our findings. Thirteen practitioners and eight students
participated in the case study, and 113 students participated in the experiment.
From the case study, we identified the need for increasing consensus in group
architectural decisions. In addition, we found that GADGET helps
practitioners increase consensus in group architectural decisions. From the
experiment, we found that GADGET and ADHOC resulted in agreement
among group decision makers, while GADGET resulted in higher mutual
understanding than ADHOC. GADGET provides significantly higher
reusability of architectural decisions and more captured rationales than
ADHOC. However, GADGET requires more effort than ADHOC.
The results of the two studies in this chapter indicate that GADGET helps
practitioners, and particularly inexperienced architects, increase consensus in group architectural decisions and capture the rationales of architectural
decisions. Still, group architectural decision making is a multifaceted topic,
since in practice group decisions can be influenced by factors such as
hierarchy levels, hidden agendas, or politics. Such factors were out of scope
for this chapter. Overall, for architectural decisions in which such factors do
not play a role, GADGET is particularly useful for increasing consensus in
group architectural decisions and capturing the rationales of the decisions.
The studies in Chapters 6 and 7 encouraged us to offer tool support for
REGAIN and GADGET, so that practitioners can easily use REGAIN and
GADGET. In the next chapter, we report our efforts towards implementing
such tool support.
7.8 Acknowledgments
We thank Konstantinos Tselios for his help in conducting the studies, and the participants for their efforts.
Chapter 8
Tool Support for REGAIN and GADGET
Published as: Tofan, D. and Galster, M., Capturing and Making Architectural
Decisions: an Open Source Online Tool. In Proceedings of the 2014 European
Conference on Software Architecture Workshops: ACM, 2014, pp. 1-4.
In this chapter, we present tool support for the REGAIN (see Chapter 6) and
GADGET (see Chapter 7) approaches. The tool helps architects and other
stakeholders capture and analyze decisions. In addition, the tool helps
architects make group decisions. The tool is based on the theoretical and
conceptual foundations created and evaluated in Chapters 6 and 7. We
developed the tool as a research tool in an academic environment, and we
used the tool with industrial practitioners, who offered us feedback and
influenced the development of the tool. The tool is web-based and available as
an open source project. In this chapter, we highlight the motivation, features,
and development aspects of the tool.
8.1 Introduction
In Chapter 6, we presented and validated the REGAIN approach for making
and capturing architectural decisions, based on the Repertory Grid technique.
In Chapter 7, we presented and validated the GADGET approach for making
and capturing group architectural decisions. To make the results of Chapters 6
and 7 more applicable, to support architects in practice, and to facilitate the
transfer of our research results to industry, we developed a web-based tool to
support capturing and making architectural decisions.
This tool provides several benefits. Mainly, the tool helps architects make
better decisions by providing means for using the REGAIN and GADGET
approaches, which improve representing, analyzing, and communicating
decisions throughout the software development life cycle. In turn, by using
REGAIN and GADGET with the tool, vaporization of architectural knowledge
is reduced.
Other tools for architectural decisions have been proposed, as reported in the
systematic mapping study in Chapter 4. For example, Zhu et al. support trade-
off analyses between different decisions (Zhu et al., 2005). PAKME is a web-
based tool for capturing architectural knowledge (Babar et al., 2008). Also, Ali
Babar et al. investigated web-based evaluation of architectural decisions
(Babar et al., 2006). Tang et al. compared tools for architectural knowledge
management (Tang et al., 2010), and reported the need for better support to
facilitate knowledge sharing and collaborative work. More recent tools include
an Enterprise Architect plugin for capturing architectural decisions
(Manteuffel et al., 2014) and a web-based tool for group architectural
decisions (Nowak and Pautasso, 2013b). We further motivate the development
of a new tool in the next section.
8.2 Motivation for a New Tool
We implemented a new tool, rather than reusing or extending existing tools,
for the following seven reasons:
1. In Chapters 6 and 7 practitioners indicated the need for tool support
for REGAIN and GADGET.
2. Lack of user-friendly, dedicated tool support for REGAIN and
GADGET. We used other tools in previous studies (e.g. WebGrid
(Gaines and Shaw, 2007), Idiogrid (Grice, 2002)). However, these
tools were not user-friendly for users not familiar with the concepts of
the Repertory Grid Technique, and needed modifications to be used
for REGAIN and GADGET.
3. By developing a customized tool, we had the flexibility to implement
concepts and validate ideas from our research (such as prioritization
approaches identified in Chapter 6, and specific steps for GADGET in
Chapter 7).
4. We needed tool support during some of our studies. An earlier version
of the tool has been used in a previous study with industrial
practitioners in Chapter 7.
5. We aimed at long term benefits of our tool. This required evolution
and adaptation of the tool over time. Thus, our initial design decision
was to offer this tool as an open source project, with source code
freely available to the software architecture community, including
industrial organizations.
6. Most existing architecture tools are desktop applications. Based on
early feedback from practitioners, we developed a web-based tool to
allow a wider distribution and easier use (i.e. practitioners do not need to install anything, which respects corporate rules on what can be installed on their workstations).
7. We wanted to facilitate the transfer of research results (i.e. REGAIN,
GADGET approaches) to the industry.
8.3 Features
The features of the tool focus on capturing decisions, decision analysis, and
group decision making.
8.3.1 REGAIN Support
The screenshots below illustrate how the tool supports REGAIN, using an
architectural decision that we had to make for the development of the tool
itself.
In the upper part of Figure 8.1, we notice the main menu of the application.
Clicking the Profile link takes the user to a page that allows the user to add
some basic optional personal details like first name, last name, and phone
number. The Grids link allows the user to manage the usage of the REGAIN
approach. The Sessions link allows the user to manage the usage of the
GADGET approach. The main menu is visible from all pages of the tool.
After clicking on the Grids link, the user can start a new REGAIN session, like
in Figure 8.1, in which we can see the topic of the decision: choose the
programming language for the tool support. The optional description contains
more details on the topic.
Figure 8.1. Indicate topic for the decision and an optional description.
In Figure 8.2, step 2 of REGAIN is illustrated: alternatives for the decision can
be added or removed.
Figure 8.2. Indicate alternatives for the decision.
Figure 8.3 illustrates the third step of REGAIN, in which concerns are
indicated. Concerns can be added directly. In addition, the triadic elicitation of
concerns can be used (for details, see Chapter 6 and (Jankowicz, 2001)): the
user drags and drops two similar alternatives on the left panel, and a
contrasting alternative on the right panel. Next, the user indicates how the two
alternatives are similar and different from the alternative on the right.
Figure 8.3. Indicate concerns.
To prioritize concerns, the hundred-dollar approach (i.e. assign a number between 1 and 100 to each concern to reflect its importance, while ensuring that the sum of the priorities is 100) is implemented, following the
study in Chapter 6. The priorities can be indicated directly, as visible in Figure
8.4.
Figure 8.4. Prioritize concerns.
Figure 8.5 shows how the user rates each alternative against each concern (i.e.
step 4 in REGAIN).
Figure 8.5. Each alternative is rated against each concern.
Figure 8.6 shows a summary of the architectural decisions: topic, alternatives,
concerns, ratings, and priorities. In addition, the output of the hierarchical
cluster analysis shows that Python is closest to the ideal alternative. Python
was the outcome of this decision. In addition, we notice two other types of
analysis: similarity analysis (i.e. similarity levels among each pair of
alternatives and each pair of concerns), and principal component analysis (a
feature which is not fully implemented yet).
Figure 8.6. The alternatives, concerns and ratings are summarized in the upper
part of the figure. In the lower part of the figure, the hierarchical cluster analysis
shows similarities among alternatives and concerns.
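The closeness of alternatives to the ideal can be approximated with a simple priority-weighted distance. The Python sketch below is only an illustration of the idea, with hypothetical alternatives, concerns, ratings, and priorities; the tool's actual hierarchical cluster analysis is more elaborate:

```python
def distance_to_ideal(ratings, priorities, scale_max=5):
    """Priority-weighted distance of each alternative to an ideal alternative
    that would score scale_max on every concern (lower = closer to ideal)."""
    return {
        alternative: sum(priorities[concern] * (scale_max - rating)
                         for concern, rating in concern_ratings.items())
        for alternative, concern_ratings in ratings.items()
    }

# Hypothetical ratings (1..5) and hundred-dollar priorities (sum to 100)
ratings = {
    "Python": {"low cost": 5, "rapid development": 5, "maturity": 4},
    "Java":   {"low cost": 4, "rapid development": 3, "maturity": 5},
}
priorities = {"low cost": 50, "rapid development": 30, "maturity": 20}
distances = distance_to_ideal(ratings, priorities)
```

In this made-up example, "Python" ends up with the smaller distance, i.e. closest to the ideal alternative.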
8.3.1.1. Concerns Prioritization
Figure 8.4 shows that priorities of concerns can be indicated directly by the
user, using the hundred-dollar approach. However, feedback from practitioners
indicated that entering priorities directly is cumbersome. Therefore, we
implemented a feature that makes it easy for practitioners to prioritize
concerns with the hundred-dollar approach.
Users can adjust the priorities by dragging the sliders for concerns. If the slider
for a concern is dragged to the right, the tool automatically increases the
priority of that concern, and decreases proportionally the priorities of all other
concerns, so that their total sum remains 100. The screenshots in Figure 8.7
illustrate how increasing the priority of the low cost concern decreased the
other priorities proportionally with their initial priorities.
Figure 8.7. User-friendly prioritization feature: increasing priority of low cost
concern (left) decreases priorities of other concerns (right).
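The proportional rescaling behind the sliders can be sketched as follows (a minimal Python illustration of the rule described above; the concern names are hypothetical and the tool may round the resulting priorities differently):

```python
def set_priority(priorities, concern, new_value):
    """Set one concern's priority and rescale the other concerns
    proportionally, so that the priorities still sum to 100."""
    others = [c for c in priorities if c != concern]
    remaining = 100 - new_value
    old_sum = sum(priorities[c] for c in others)
    updated = {concern: new_value}
    for c in others:
        # each other concern keeps its share of the remaining points
        updated[c] = (remaining * priorities[c] / old_sum if old_sum
                      else remaining / len(others))
    return updated

priorities = {"low cost": 40, "performance": 30, "maturity": 30}
updated = set_priority(priorities, "low cost", 60)
```

Raising "low cost" from 40 to 60 leaves 40 points for the other two concerns, which keep their equal shares (20 each), so the total remains 100.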
8.3.2 GADGET Support
As found in Chapter 3, and in the work of other researchers (e.g. (Nowak and
Pautasso, 2013a; Rekha and Muccini, 2014)), many architectural decisions are
made by groups of architects, rather than individuals. To support this, the tool
implements support for GADGET. With the help of the tool, users can create
decisions, share decisions, or collaboratively identify decision alternatives.
Similar to decisions made by individuals, groups of decision makers can
prioritize concerns and rate alternative-concern pairs as well as indicate how
they would prioritize and rate from the perspective of other stakeholders.
Decision makers can see the priorities and ratings of other decision makers and
thus engage in discussions and work towards consensus in several iterations,
as part of using the GADGET approach.
Figure 8.8 shows a summary of using the tool for a group decision with four
participants, who indicated various alternatives, concerns, ratings, and
priorities for the decision in several iterations. We notice that the results from
previous iterations are captured, thus offering traceability into how the group
decision was made.
Figure 8.8. Summary of a group decision session on which Python web
framework to use.
Figure 8.9 shows a summary of metrics for the group decision in Figure 8.8,
during an iteration. The Range metric shows the difference between the
highest and lowest value among decision makers. Mean and standard deviation
are also calculated using the values indicated by group decision makers.
Differences among decision makers are highlighted with shades of red.
Stronger shades are used for larger differences. This allows decision makers to
identify immediately the differences among them and to increase consensus by discussing those differences.
Clicking on any of the red cells offers details about the concrete values that led
to differences. The lower part of Figure 8.9 is displayed after clicking on the web2py/’semi-rapid application development’ cell, which has a range of three. There, decision makers can see that the participant marked in green has a different perspective.
Figure 8.9. Summary of metrics for a group decision.
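The per-cell metrics described above can be computed directly from the values indicated by the decision makers. A minimal Python sketch, with hypothetical ratings from four decision makers for one alternative-concern pair:

```python
from statistics import mean, pstdev

def cell_metrics(ratings):
    """Summarize one alternative-concern cell across decision makers:
    range (max - min), mean, and population standard deviation."""
    return {
        "range": max(ratings) - min(ratings),
        "mean": mean(ratings),
        "stdev": pstdev(ratings),
    }

# Four decision makers rated the same alternative-concern pair
metrics = cell_metrics([2, 3, 5, 2])
```

A cell with range 3, as in this example, would be highlighted with a strong shade of red, drawing the group's attention to the disagreement.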
8.4 Tool Development and Deployment
We implemented an open-source, web-based tool for REGAIN (presented in
Chapter 6) and GADGET (presented in Chapter 7). The tool is deployed at
www.repertorygridtool.com on a dedicated virtual server. It requires users to
create an account, using an email address and a securely stored password. The
tool allows users to access their own content and decisions related to them. A
wizard guides users through capturing decision topics, alternatives, concerns,
prioritization, ratings, and analysis of decisions. As of March 2015, the tool
has more than 800 registered users.
The tool is open-source and uses a permissive, industry-friendly, open source
software license (i.e. MIT license). The development of the tool is ongoing and
we add more features to the tool, based on the feedback we received during the
interview study and from the tool users.
The tool is developed mostly in Python and JavaScript, and it has around
20,000 lines of code. Most development took place between early 2012 and
mid-2014. The tool uses the Django web framework, an open source, Python-
based framework which also provides security features. The tool uses SQLite,
a lightweight SQL database, but the tool can use almost any SQL database.
Furthermore, the tool uses jQuery and jQuery UI libraries for the user-
interface design and cross-browser compatibility. The source code for this tool
is available on GitHub and Ohloh (http://www.ohloh.net/p/rgt-tool). Thus,
other developers have access to the source code and can develop it further or
create their own branch of the software.
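The database portability mentioned above follows from Django's configuration-driven database backends: the engine is a setting, not code. A hypothetical `settings.py` fragment (file name and values are illustrative, not taken from the tool's repository):

```python
# settings.py (sketch): the database backend is a configuration choice,
# so SQLite can be swapped for another SQL database without code changes.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.sqlite3",  # lightweight, file-based
        "NAME": "rgt.db",                        # path to the SQLite file
    }
    # e.g. switch ENGINE to a PostgreSQL or MySQL backend to change databases
}
```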
Compatibility, usability, and security were key drivers for the architecture of
the REGAIN tool, which is compatible with the recent versions of all major
browsers: Internet Explorer, Chrome, and Firefox. The Django framework offers multiple security features, such as
protection against SQL injections.
8.5 Conclusions
In this chapter, we presented an open source, web-based tool to support
architects in individual and group architectural decision making. A video clip
demoing the tool can be found at http://youtu.be/GAHxrlfOt70. An earlier
version of the tool was featured on InfoQ
(http://www.infoq.com/news/2012/07/rug-rgt-tool). We expect the tool to
evolve in the future based on our ongoing research efforts. Also, ongoing
development focuses on increasing the usability of the tool. The issue tracker
on GitHub (https://github.com/danrg/RGT-tool/issues?state=open) shows the
open tickets that will be addressed in the future.
8.6 Acknowledgments
We thank the students who contributed to the tool development.
Chapter 9
Conclusions
This chapter offers answers to the research questions in Chapter 1, discussion,
contributions of this thesis, and future work.
9.1 Answers to Research Questions
As stated in Chapter 1, the overall problem addressed in this thesis is:
How can architectural knowledge vaporization be reduced?
This problem was refined into seven high-level research questions and 21
concrete research questions.
9.1.1 RQ1. How is architectural knowledge managed in practice?
Answering this high-level research question helped us understand the need for
reducing architectural knowledge vaporization. Given the wide scope of RQ1,
we focused our efforts in Chapter 2 on understanding challenges and potential
solutions for managing architectural knowledge in the public sector. We had
two reasons for this focus: research novelty and opportunity. Regarding
novelty, no previous work existed on architectural knowledge management in
the public sector, while much work existed in the private sector. Regarding
opportunity, we took part in a research project, which offered us access to
relevant practitioners who could offer insights on architectural knowledge
management in practice.
For the study in Chapter 2, we interviewed eleven practitioners from four
public and four private sector organizations, so that we could apply lessons
from the private sector to the public sector. We identified challenges (e.g.
vaporization, sharing, integration) for managing architectural knowledge and
potential solutions (e.g. tool support, training, community building) to them.
Furthermore, the study in Chapter 2 confirmed that architectural knowledge
vaporization is a major challenge in organizations.
9.1.2 RQ2. How are architectural decisions made in practice?
Architectural decisions are a significant part of architectural knowledge. To
propose approaches for reducing architectural knowledge vaporization, we had
to understand real-world architectural decisions, so that the approaches would
be applicable in practice.
We answered RQ2 in Chapter 3, in which we present a survey with 43
architects who described 86 decisions from their activity. We refined RQ2 in
the following four research questions.
RQ2.1. What are the characteristics of architectural decisions?
To answer RQ2.1, we defined seven metrics to characterize architectural
decisions (e.g. actual time spent making the decision, number of people
directly involved). Answering RQ2.1 provided novel insights into real-world
architectural decisions. For example, previously, little was known about how
much time or how many people are involved in architectural decisions. In
particular, in Chapter 3 we found out that most architectural decisions are
made in groups. This finding encouraged us to propose the GADGET
approach in Chapter 7.
RQ2.2. What factors make architectural decisions difficult?
From answering RQ2.2, we found out several factors that make decisions
difficult, such as analysis effort and lack of similar previously made decisions.
Identifying these factors encouraged us to propose approaches that help
architects analyze their decisions and reduce architectural knowledge
vaporization, so that previous decisions are available.
RQ2.3. What are the differences between junior and senior software
architects?
Junior architects need help analyzing their decisions more than senior
architects. In addition, senior architects consider more alternatives and quality
attributes at the start of the decision making. These insights encouraged us to
propose the approaches in Chapters 6 and 7.
RQ2.4. What are the differences between good and bad architectural
decisions?
We found statistically significant differences on the number of alternatives and
quality attributes: good decisions involve more alternatives and quality attributes than bad decisions.
9.1.3 RQ3. What is the state of research on architectural decisions?
To answer RQ3, we conducted a systematic mapping study, as reported in
Chapter 4. We covered studies published between 2002 and 2012. We
identified 144 relevant papers on architectural decisions. We refined RQ3 into
the following six research questions, and we answered each of them using the
set of relevant papers.
RQ3.1. What are the papers on documenting architectural decisions?
We identified much work on documenting architectural decisions, and we
classified existing papers on their tool support and process for documenting.
Overall, we found 120 papers that help document architectural decisions, 76 of
the 120 papers present a process, and 52 of the 120 papers include tool
support. Still, only five papers include open-source tool support.
RQ3.2. Does current research on architectural decisions consider
functional requirements and quality attributes?
We found out that most (i.e. 114) of the relevant papers on architectural
decisions address explicitly functional requirements and quality attributes.
RQ3.3. What specific domains for architectural decisions are
investigated?
We found out that service-oriented and enterprise domains received much
attention. However, other domains (e.g. mobile) have received little attention.
RQ3.4. What are the normative and descriptive papers?
We identified 20 descriptive and 124 normative papers. Furthermore, for the
descriptive papers, we analyzed the number of decisions, time spent for
making decisions, number of participants, and classes of decisions.
RQ3.5. What are the papers on addressing uncertainty in architectural
decisions?
We identified only nine papers on addressing uncertainty in architectural
decisions. Given its importance, we consider that addressing uncertainty is an
important future research topic for architectural decisions.
RQ3.6. What are the papers on group architectural decisions?
We identified 22 papers that refer to group architectural decisions. About half
of these papers are descriptive work, offering evidence on group architectural
decisions in the industry. The relatively low number of normative approaches
for group architectural decisions encouraged us to propose the GADGET
approach in Chapter 7.
9.1.4 RQ4. Can the Repertory Grid technique reduce architectural
knowledge vaporization?
The answers to RQ1, RQ2, and RQ3 encouraged us to explore ideas for
reducing architectural knowledge vaporization from other fields. The
knowledge engineering field had the potential to offer useful ideas, since the
field has much experience on capturing knowledge from experts. In the
knowledge engineering field, the Repertory Grid technique is an established
approach to capture knowledge. However, this technique has not been
investigated for capturing architectural knowledge. We refined RQ4 into two
research questions, which we investigated in Chapter 5 in two studies.
RQ4.1. What are the advantages and disadvantages of the Repertory Grid
technique for capturing architectural knowledge?
Following a study with students, we identified the following advantages. The
Repertory Grid is a systematic approach, which encourages reflection on the
decisions, and helps architects with their decision making. Disadvantages
include the learning curve, limited tool support, and the required effort.
RQ4.2. Does the Repertory Grid technique reduce AK vaporization more
than a template-based approach to document architectural decisions?
In a different study with students, we compared documentation created with the Repertory Grid technique against documentation created with a template for documenting decisions. We found that the Repertory Grid documentation contains more alternatives, concerns, and ratings than the template-based documentation.
9.1.5 RQ5. How to support making and documenting individual architectural decisions?
Chapter 5 presents two studies with students from which we learnt that the
Repertory Grid technique had much potential for reducing architectural
knowledge vaporization. Based on this finding, in Chapter 6, we proposed
REGAIN – an approach based on the Repertory Grid technique to make and
document individual architectural decisions. To answer RQ5, we proposed the
following three research questions on REGAIN.
RQ5.1. What are the advantages and disadvantages of REGAIN?
We asked practitioners to use REGAIN and then offer feedback on its
advantages and disadvantages. We learnt that REGAIN is a systematic
approach that offers decision-making support, documentation of decision
rationale, and reasoning support to practitioners. Disadvantages include
limited tool support, subjectivity, and effort.
RQ5.2. What are the improvement opportunities for REGAIN?
Practitioners indicated several improvement opportunities for REGAIN, such as concern prioritization (addressed in RQ5.3), group decisions (addressed in RQ6), decision reuse, and sensitivity analysis.
RQ5.3. Which concerns prioritization approach to use for REGAIN?
We conducted an experiment with students to compare two prioritization approaches: the hundred-dollar approach and pairwise comparisons. We investigated hypotheses on performance, users' perceptions, and impact on REGAIN output for the two approaches. Following the experiment, we recommended using the hundred-dollar approach with REGAIN in most situations.
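The hundred-dollar approach (also known as cumulative voting) can be sketched as follows: each participant distributes exactly 100 points among the concerns, and a concern's priority is its share of all points given. The participant and concern names below are hypothetical:

```python
# Hundred-dollar (cumulative voting) prioritization: each stakeholder
# distributes 100 points among the concerns; a concern's priority is
# its fraction of all points given. Names and allocations are hypothetical.
allocations = {
    "alice": {"performance": 50, "maintainability": 30, "cost": 20},
    "bob":   {"performance": 20, "maintainability": 40, "cost": 40},
}

def prioritize(allocations):
    """Return each concern's priority as a fraction of all allocated points."""
    for person, points in allocations.items():
        assert sum(points.values()) == 100, f"{person} must allocate exactly 100"
    totals = {}
    for points in allocations.values():
        for concern, amount in points.items():
            totals[concern] = totals.get(concern, 0) + amount
    grand_total = sum(totals.values())
    return {concern: amount / grand_total for concern, amount in totals.items()}

priorities = prioritize(allocations)
# e.g. performance: (50 + 20) / 200 = 0.35
```

Compared to pairwise comparisons, which require rating every pair of concerns, the hundred-dollar approach scales linearly with the number of concerns, one plausible reason it suits most situations.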
9.1.6 RQ6. How to support making and documenting group
architectural decisions?
In Chapter 3 we found that most architectural decisions are made in groups rather than individually. In Chapter 4 we found little work on group architectural decision making. Therefore, we were interested in supporting group architectural decisions. In Chapter 7, we propose and validate GADGET – an approach for increasing consensus in group architectural decision making. We refine RQ6 into the following five research questions. The first three were answered through a case study with practitioners and students, in which participants used GADGET. The last two were answered through an experiment with students, in which half of the participants used GADGET and the other half used ADHOC (i.e. group decision making without any prescribed approach).
RQ6.1. Is there a practical need for increasing consensus in group
architectural decision making?
By analyzing feedback from the case study participants, we learnt that there is a need to increase consensus in group architectural decision making, for two reasons. First, conflicting perspectives occur in practice. Second, resolving conflicting perspectives is time-consuming.
RQ6.2. What are the effort and benefits offered by GADGET?
GADGET takes one to three hours to use for one decision. The main benefit is that participants developed a better shared understanding of each other's perspectives. Other benefits include capturing decision rationale, time savings from keeping group discussions focused, equal engagement of participants, and traceability of the decision.
RQ6.3. What are potential improvements to GADGET?
Potential improvements include eliminating less promising alternatives, offering more concern prioritization approaches, better tool support, and attaching confidence levels to ratings.
RQ6.4. Compared to ADHOC, what is the impact of GADGET on
increasing consensus among group architectural decision makers?
We considered two components of consensus: general agreement and mutual understanding. From the experiment with students, we learnt that GADGET increases mutual understanding among participants better than ADHOC, but we found no difference between GADGET and ADHOC in increasing general agreement among participants.
RQ6.5. How do perceptions on GADGET vs. ADHOC differ among
decision makers?
Participants perceived that, compared to ADHOC, GADGET captures more rationale for architectural decisions, thus reducing architectural knowledge vaporization more than ADHOC does.
9.1.7 RQ7. What tool can support REGAIN and GADGET?
Chapter 4 reports that very little open source tool support exists for making and capturing architectural decisions. In Chapters 6 and 7, practitioners indicated the need for user-friendly, open source tool support for REGAIN and GADGET. To address this need, we developed an open source tool. Chapter 8 reports the motivation, features, and development aspects of the tool.
9.2 Discussion
Chapter 1 presents five reasons that contribute to architectural knowledge vaporization. Based on the answers to the research questions (Section 9.1), we discuss how the two proposed architectural decision making processes (i.e. REGAIN and GADGET) address these five reasons.
- Unawareness – improved processes should encourage architects to think more about their decisions. In Chapter 6, we found that REGAIN offers systematic reasoning support for making architectural decisions, which encourages architects to think about their decisions. In Chapter 7, we found that GADGET helps architects think about their decisions and clarify their points of view on them.
- Lack of training – improved processes should have a low learning curve and provide sufficient advantages, so that architects are motivated to learn and use them. In Chapter 6, we found that REGAIN has the disadvantage of a learning curve, but also important advantages: systematic decision-making support, capturing decision rationale, and reasoning support. In Chapter 7, we found that GADGET also has the disadvantage of a learning curve, but offers important advantages, such as increased focus of group discussions and capturing decision rationale. Both REGAIN and GADGET have the advantage of user-friendly, open source tool support, which can motivate architects to learn and use them.
- Difficulty – improved processes should minimize documentation effort and decompose the documentation task into small, easy-to-perform steps. In Chapters 6 and 7, we found that REGAIN and GADGET have the disadvantage of requiring effort. However, this disadvantage is common to any systematic approach. The actual documentation effort is minimized, since REGAIN and GADGET are based on a minimalistic, core model of architectural knowledge, which encourages capturing only the essential documentation of architectural decisions. Furthermore, both REGAIN and GADGET consist of small steps that are easy to perform once the learning curve is passed.
- Disruption – improved processes should encourage architects to focus on their decision making. Both REGAIN and GADGET offer decision-making support for architects and capture decision rationale, thus reducing architectural knowledge vaporization.
- Natural causes – improved processes should encourage capturing decision rationale immediately, to avoid the risk of architects forgetting it over time. Both REGAIN and GADGET help capture decision rationale while decisions are made, thus reducing the risk of forgetting it.
9.3 Contributions
This thesis brings the following contributions to the state of the art of the
software architecture field.
Chapter 2 contributes insights from real-world implementations of architectural knowledge management in the private sector, and insights into architectural knowledge management challenges and solutions for the public and private sectors.
Chapter 3 contributes insights on the characteristics of real-world architectural decisions, on the factors that make decisions difficult, on the differences between junior and senior architects, and on the differences between good and bad decisions.
Chapter 4 summarizes a decade of research on architectural decisions and
proposes promising future research directions, such as addressing uncertainty
in decision making and group architectural decision making.
Chapter 5 contributes initial evidence on the potential of the Repertory Grid technique to reduce architectural knowledge vaporization.
Chapter 6 proposes and validates REGAIN – an approach based on the Repertory Grid technique for making and documenting individual architectural decisions.
Chapter 7 proposes and validates GADGET – an approach that extends
REGAIN for making and documenting group architectural decisions.
Chapter 8 contributes with open-source tool support for REGAIN and
GADGET.
9.4 Future Work
This thesis answers several research questions and makes contributions to the software architecture field. In an ideal world, such answers would be definitive and would require no further inquiry, but this is not the case. Therefore, we outline below future work for the academic and industrial communities.
In Chapter 2, we reported challenges and solutions for managing architectural knowledge. In particular, we identified two challenges: the role of organizational culture, and the integration of architectural knowledge management with organizational goals. These challenges have received little attention in the literature so far, and further research is required to understand their role in architectural knowledge management efforts. In addition, researchers can use the results of the study in Chapter 2 to propose taxonomies of challenges and solutions for architectural knowledge management.
Chapter 3 provides insights into real-world architectural decisions. Still, more work is needed to further understand real-world architectural decisions, in particular the characteristics of decisions and the difficulty factors across domains. This would help researchers offer practitioners domain-specific decision support that addresses the specific difficulty factors of each domain.
Chapter 4 shows that uncertainty in architectural decisions has received little attention so far. In addition, since there are few studies on real-world group architectural decision making (as identified in Chapter 4, and partly addressed in Chapter 7), there is a need to further study group architectural decision making in practice. Furthermore, as discussed in Chapter 4, descriptive papers need to state explicitly the number and classes of decisions they report, so that findings can be compared across studies.
In Chapter 5, we explored a technique from the knowledge engineering field
(i.e. the Repertory Grid technique) to capture architectural knowledge.
However, the knowledge engineering field may offer additional techniques
with high applicability for capturing architectural knowledge. A systematic
mapping study of knowledge engineering literature would be the first step to
identify additional techniques and their potential applicability to the software
architecture field.
As follow-up work to the REGAIN and GADGET approaches proposed in Chapters 6 and 7, we plan to propose an ISO 42010-compliant documentation framework with viewpoints that help reuse architectural knowledge and capture dependencies among architectural decisions. Furthermore, we used only ratings from one to five in REGAIN and GADGET; in future work we will consider other types of ratings, such as specific categories.
Additionally, there is a need to investigate how to increase practitioners' acceptance of approaches for architectural decisions, so that practitioners can benefit from approaches (such as REGAIN and GADGET) proposed by researchers. For example, the Technology Acceptance Model (Venkatesh and Davis, 2000) offers a powerful model of how users accept and use a new technology.
As future work, there is a need to support practitioners in tackling other challenges of group architectural decision making besides consensus, which we studied in Chapter 7. In addition, since there is a need for treating uncertainty in architectural decision making (identified in Chapter 4), we will update REGAIN and GADGET to include support for uncertainty in individual and group architectural decision making. Finally, there is a need to define criteria for evaluating group architectural decision making processes across different studies.
In Chapter 8, we presented open source tool support for REGAIN and GADGET. In future work, we plan to add features to the tool, such as facilitating reuse of concerns based on concern-specific keywords. We also plan to evaluate the tool to understand how much time practitioners save by using it, and to improve it further. Finally, given the importance of handling dependencies among decisions, we will add features that help architects analyze dependencies among decisions.
Appendices
10.1 Appendix for Chapter 3
10.1.1 Questionnaire for Survey
Welcome!
Thank you for taking the time to complete this survey by the Software
Engineering and Architecture group at the University of Groningen,
Netherlands. Your answers will contribute to academic research and to future
advances in the Software Architecture field.
The purpose of this survey is to understand what makes architectural decisions
difficult. This will contribute to improved future support for architectural
decision making. To participate in this survey, you should have been directly
involved in making architectural decisions during the last two years in the
industry (for example, as an architect, or experienced developer).
This survey should only take about 15 minutes of your time. Your answers
will be anonymous, and will not be shared with other parties. No sensitive or
confidential issues need to be disclosed while taking this survey. Your answers
will only be used for academic purposes. Survey results will be published in an
academic venue by spring 2013.
As an incentive for this survey, we offer you a copy of the article reporting the
results of this survey (details at the end of this survey). If you have any
questions, please contact us by email.
Thank you,
Dan Tofan
PhD Candidate
Matthias Galster
Researcher
Paris Avgeriou
Professor and Head of the Group
Software Engineering and Architecture Group
University of Groningen, Netherlands
Overview
The survey has the following parts:
- background questions
- questions about the difficulty of a good architectural decision
- questions about the difficulty of a problematic (or bad) architectural
decision
Let's start!
1) Have you been involved directly in making architectural decisions during
the last two years?
Yes / No
Page exit logic: IF: Answer to question #1 is equal to (‘No’) THEN:
Disqualify and display: ‘This survey targets persons who have been involved
in architectural decisions, so the survey is not applicable to you. If you know
such persons, please send them the link to this survey. Thank you for your
help!’
Background
ID: 69
2) What describes you best?
Software architect
Enterprise architect
Senior software engineer
Junior software engineer
Software testing engineer
Business, technical or project manager
Chief technical officer
Business analyst or requirements engineer
Other: ___
ID: 66
3) Please describe your professional experience.
Rate each item: 0-2 years / 3-5 years / 6-10 years / 11-15 years / >15 years
- Years of experience in the role you described above: __
- Years of experience as architect: __
- Years of experience as software developer (other than architect): __
The Good Decision
4) Think about the architectural decisions that you had to make in the last two
years (examples of architectural decisions are choices of middleware, or
architectural patterns).
Next, out of your decisions, please select a decision that you consider an
especially good architectural decision (you judge what good means!). Please
write down a very brief description of this decision, without including any
sensitive or confidential details.
_____
5) Which of the following describes best the domain of the project in which
this decision was made?*
Healthcare / Telecommunication / Embedded systems / Transportation /
Banking / E-commerce / E-government / Other: ____
6) Please estimate how much time you spent to make this decision.
Estimate the Actual and Elapsed time that you spent working on making this
decision. For example, if you spent one week over one month (because you
were busy with other things), then enter '7 days' in the Actual and '30 days' in
the Elapsed fields. Use decimals if necessary (for example, 1.5 days).
If actual time was 30 minutes over 3 days, then please enter '30 minutes' for
Actual and '3 days' for Elapsed.
Actual: __
Elapsed: __
7) How many people were involved directly in the decision making?
Put '1' if only you made the decision. If there were more decision makers, then
add their number.
__
8) How many people were involved indirectly in the decision making?*
Add your best estimate for the number of people involved to some extent in
the decision making. Do not include the people you considered in the previous
question.
__
9) Around how many alternatives for this decision were considered at the
beginning of the decision making process?*
For example, if you had to decide on a programming language and you
considered PHP, C#, Python, Perl and Java at the beginning, then fill in '5'.
Also include alternatives that were considered, but discarded later on.
__
10) How many alternatives for this decision were studied for an extended
period of time?
The extended period of time is relative to the actual time spent for making the
decision. For example, if you had to decide on a programming language and
you looked at PHP, C# and Java in depth, then fill in '3'.
__
11) How many non-functional requirements (or quality attributes) were
considered for this decision?
If you considered only security and performance, then specify '2'. You can also
include business issues as non-functional requirements.
__
Why was this decision difficult?
Please rate the following statements.
This decision was difficult because...
12) ...you received conflicting recommendations from various sources about
which decision alternative to choose.*
For example, some colleagues suggested this option, some blogs suggested
something different.
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
13) ...there were no previous similar decisions to compare this decision
against.*
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
14) ...it was hard to identify a superior decision alternative from the
alternatives under consideration.*
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
15) ...the decision required a lot of thinking from you.*
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
16) ...it was hard to convince stakeholders to accept a certain decision
alternative.*
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
17) ...stakeholders had strongly diverging perspectives about the decision.*
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
18) ...you needed to influence some stakeholders without having formal
authority over them.*
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
19) ...the decision had too many alternatives.*
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
20) ...the decision had too few alternatives.*
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
21) ...analyzing alternatives for this decision took a lot of effort.*
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
22) ...some quality attributes were considered too late in the decision making
process.*
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
23) ...too many people were involved in making the decision.*
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
24) ...dependencies with other decisions had to be taken into account.*
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
25) ...the decision had a major business impact.*
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
26) ...you had to respect existing architectural principles.*
‘Principles are general rules and guidelines, intended to be enduring and
seldom amended, that inform and support the way in which an organization
sets about fulfilling its mission’ (TOGAF).
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
27) ...serious negative consequences could result from the decision.*
For example, personal loss for you (you could get fired) or for the organization
(it could impact its financial results).
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
28) ...too little time was available to make the decision.*
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
29) ...you had a lot of peer pressure.*
For example, you had much pressure from your colleagues.
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
30) ...of the trade-offs between quality attributes.*
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
31) ...you lacked experience as an architect.*
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
32) ...you lacked domain-specific knowledge (e.g. new customer).*
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
33) ...more information was needed to reduce uncertainty when making the
decision.*
Strongly disagree / Disagree / Neutral / Agree / Strongly agree / NA
The Bad Decision
Thank you for your answers so far! They are very important for understanding
the difficulties faced by architects in their decision making. Please finish the
rest of this survey.
34) Think about architectural decisions that you had to make in the last two
years.
Next, out of these decisions, please select a decision that you consider an
especially problematic (or not-so-good, or bad) architectural decision, in
contrast with the good decision you specified earlier. It is up to you to judge
what a 'problematic' decision is. Please write down a very brief description of
this decision, without including any sensitive or confidential details.*
Examples of architectural decisions are choices of middleware, or architectural
patterns.
_______
Rate the same statements on the difficulty for the bad decision.
Extra difficulties?
Based on your personal experience, do you see other items that contribute to the difficulty of architectural decisions (other than the ones you rated)? These items can refer to either the good or the problematic decision.
64) Please add other items (optional).
___
Thank You!
Thank you for taking our survey. Your response is very important to Software
Architecture research.
If you are interested in the results of this survey, please drop an email to
[email protected] with the subject ‘survey results’. Your email address will
only be used for sending you the survey results, when available.
If you want to offer feedback, please use the above email address.
10.2 Appendix for Chapter 4
10.2.1 Selected Papers
Here is the list of the 144 papers selected for the systematic mapping study reported in Chapter 4. The papers that were included in the quasi-gold standard (see Section 4.3.1) are marked with a star (e.g. P12*).
P1 Akerman, A. and J. Tyree, Using ontology to support development of software architectures. IBM Systems Journal, 2006. 45(4): p. 813-825.
P2 Al-Naeem, T., et al. Formulating the architectural design of enterprise applications as
search problem. in Proceedings of the 2005 Australian conference on Software Engineering. 2005.
P3 Al-Naeem, T., et al. A quality-driven systematic approach for architecting distributed
software applications. in Proceedings of the 27th International Conference on Software Engineering. 2005: ACM.
P4 Al-Naeem, T., et al. Tool support for optimization-based architectural evaluation. in
Proceedings of the second international workshop on Models and processes for the
evaluation of off-the-shelf components. 2005: ACM.
P5 Al-Rousan, T., S. Sulaiman, and R.A. Salam, Supporting architectural design
decisions through risk identification architecture pattern (RIAP) model. WSEAS
Transactions on Information Science and Applications, 2009. 6(4).
P6 Aman ul, h. and M.A. Babar. Tool support for automating architectural knowledge
extraction. in Proceedings of the 2009 ICSE Workshop on Sharing and Reusing
Architectural Knowledge. 2009: IEEE Computer Society.
P7 Ameller, D. and X. Franch. Ontology-Based Architectural Knowledge Representation:
Structural Elements Module. in Advanced Information Systems Engineering
Workshops. 2011: Springer Berlin Heidelberg.
P8 Andrews, A., et al., A framework for design tradeoffs. Software Quality Journal, 2005.
13(4): p. 377-405.
P9 Ariyachandra, T. and H. Watson, Key organizational factors in data warehouse
architecture selection. Decision Support Systems, 2010. 49(2): p. 200-212.
P10 Babar, M.A. and I. Gorton. A Tool for Managing Software Architecture Knowledge. in Proceedings of the Second Workshop on SHAring and Reusing architectural
Knowledge Architecture, Rationale, and Design Intent. 2007: IEEE Computer Society.
P11 Babar, M.A., et al. Introducing Tool Support for Managing Architectural Knowledge:
An Experience Report. in Engineering of Computer Based Systems, 2008. ECBS 2008. 15th Annual IEEE International Conference and Workshop on the. 2008.
P12* Bernini, D. and F. Tisato. Explaining architectural choices to non-architects. in
Proceedings of the 4th European Conference on Software Architecture. 2010.
P13 Biehl, M. and M. Törngren. An executable design decision representation using model
transformations. in 36th EUROMICRO Conference on Software Engineering and
Advanced Applications (SEAA). 2010.
P14 Bingfeng, X., H. Zhiqiu, and W. Ou, Making Architectural Decisions Based on
Requirements: Analysis and Combination of Risk-based and Quality Attribute-based
Methods. 36th EUROMICRO Conference on Ubiquitous Intelligence & Computing
and 7th International Conference on Autonomic & Trusted Computing (UIC/ATC
2010), 2010.
P15 Blair, S., T. Cull, and R. Watt, Responsibility Driven Architecture. IEEE Software, 2010. PP(99): p. 1-1.
P16* Bode, S. and M. Riebisch. Impact evaluation for quality-oriented architectural
decisions regarding evolvability. in Proceedings of the 4th European Conference on Software Architecture. 2010.
P17 Bosch, J. Software architecture: The next step. in First European Workshop on
Software Architecture. 2004.
P18* Buchgeher, G. and R. Weinreich. Automatic Tracing of Decisions to Architecture and
Implementation. in 9th Working IEEE/IFIP Conference on Software Architecture
(WICSA). 2011.
P19* Capilla, R. Embedded design rationale in software architecture. in Joint Working
IEEE/IFIP Conference on Software Architecture (WICSA) & 3rd European
Conference on Software Architecture (ECSA). 2009.
P20* Capilla, R. and M. Ali Babar. On the role of architectural design decisions in software
product line engineering. in Proceedings of the 2nd European Conference on
Software Architecture. 2008.
P21 Capilla, R., J.C. Dueñas, and F. Nava, Viability for codifying and documenting
architectural design decisions with tool support. Journal of Software Maintenance and
Evolution, 2010. 22(2): p. 81-119.
P22 Capilla, R. and F. Nava. Extending software architecting processes with decision-
making activities. in Second IFIP TC 2 Central and East European Conference on
Software Engineering Techniques, CEE-SET 2007. 2008.
P23 Capilla, R., F. Nava, and C. Carrillo. Effort estimation in capturing architectural knowledge. in 23rd IEEE/ACM International Conference on Automated Software
Engineering. 2008.
P24 Capilla, R., F. Nava, and J.C. Duenas. Modeling and Documenting the Evolution of Architectural Design Decisions. in Proceedings of the Second Workshop on SHAring
and Reusing architectural Knowledge Architecture, Rationale, and Design Intent.
2007: IEEE Computer Society.
P25 Capilla, R., F. Nava, and A. Tang. Attributes for characterizing the evolution of
architectural design decisions. in 2007 Third IEEE Workshop on Software
Evolvability. 2007.
P26* Capilla, R., et al., An enhanced architectural knowledge metamodel linking
architectural design decisions to other artifacts in the software engineering lifecycle,
in Proceedings of the 5th European conference on Software architecture. 2011,
Springer-Verlag: Essen, Germany. p. 303-318.
P27* Carignano, M.C., S. Gonnet, and H. Leone. A model to represent architectural design
rationale. in 2009 Joint Working IEEE/IFIP Conference on Software Architecture
(WICSA) & 3rd European Conference on Software Architecture (ECSA). 2009.
P28 Carriere, J., R. Kazman, and I. Ozkaya. A cost-benefit framework for making
architectural decisions in a business context. in 32nd International Conference on
Software Engineering (ICSE). 2010.
P29 Che, M. and D.E. Perry. Scenario-Based Architectural Design Decisions
Documentation and Evolution. in Proceedings of the 2011 18th IEEE International
Conference and Workshops on Engineering of Computer Based Systems (ECBS 2011).
2011.
P30 Chen, C.L., D. Shao, and D.E. Perry. An Exploratory Case Study Using CBSP and
Archium. in Proceedings of the Second Workshop on SHAring and Reusing
architectural Knowledge Architecture, Rationale, and Design Intent. 2007: IEEE
Computer Society.
P31 Chen, L., M.A. Babar, and H. Liang. Model-centered customizable architectural design decisions management. in 21st Australian Software Engineering Conference
(ASWEC). 2010.
P32 Cherubini, M., et al. Let's go to the whiteboard: how and why software developers use drawings. in Proceedings of the SIGCHI conference on Human factors in computing
systems. 2007: ACM.
P33 Christiaans, H. and R.A. Almendra, Accessing decision-making in software design.
Design Studies, 2010. 31(6): p. 641-662.
P34 Clements, P.C. An Economic Model for Software Architecture Decisions. in
Proceedings of the First International Workshop on The Economics of Software and
Computation. 2007: IEEE Computer Society.
P35 Clerc, V., P. Lago, and H. van Vliet. The Architect’s Mindset. in 3rd international
conference on Quality of Software architectures, components, and applications. 2007:
Springer Berlin Heidelberg.
P36 Cortellessa, V., F. Marinelli, and P. Potena, An optimization framework for “build-or-
buy” decisions in software architecture. Computers and Operations Research, 2008.
35(10): p. 3090-3106.
P37* de Boer, R.C., et al. Ontology-driven visualization of architectural design decisions.
in 2009 Joint Working IEEE/IFIP Conference on Software Architecture (WICSA) &
3rd European Conference on Software Architecture (ECSA). 2009.
P38* de Boer, R.C. and H. van Vliet, On the similarity between requirements and
architecture. Journal of Systems and Software, 2009. 82(3): p. 544-550.
P39* de Bruin, H., H. van Vliet, and Z. Baida. Documenting and analyzing a context-sensitive design space. in Software Architecture. Systems Design, Development and
Maintenance. IFIP 17th World Computer Congress - TC2 Stream/ 3rd Working
IEEE/IFIP Conference on Software Architecture. 2002.
P40* de Roo, A., H. Sozer, and M. Aksit. An architectural style for optimizing system
qualities in adaptive embedded systems using multi-objective optimization. in 2009
Joint Working IEEE/IFIP Conference on Software Architecture (WICSA) & 3rd
European Conference on Software Architecture (ECSA). 2009.
P41 Dueñas, J.C. and R. Capilla. The decision view of software architecture. in 2nd
European Workshop on Software Architecture. 2005.
P42 Eden, A.H. Strategic Versus Tactical Design. in Proceedings of the 38th Annual
Hawaii International Conference on System Sciences. 2005.
P43* Eklund, U. and T. Arts. A classification of value for software architecture decisions.
in Proceedings of the 4th European conference on Software architecture. 2010.
P44 Evensen, K.D., Reducing Uncertainty in Architectural Decisions with AADL. 44th
Hawaii International Conference on System Sciences (HICSS 2011), 2011: p. 1-9.
P45 Falessi, D., G. Cantone, and M. Becker. Documenting design decision rationale to
improve individual and team design decision making: an experimental evaluation. in
Proceedings of the 2006 ACM/IEEE international symposium on Empirical software
engineering. 2006: ACM.
P46* Falessi, D., G. Cantone, and P. Kruchten, Value-based design decision rationale
documentation: principles and empirical feasibility study. 2008 7th Working
IEEE/IFIP Conference on Software Architecture (WICSA '08), 2008: p. 189-198.
P47 Falessi, D., R. Capilla, and G. Cantone. A value-based approach for documenting
design decisions rationale: a replicated experiment. in Proceedings of the 3rd
international workshop on Sharing and reusing architectural knowledge. 2008: ACM.
P48* Farenhorst, R., et al. The lonesome architect. in 2009 Joint Working IEEE/IFIP
Conference on Software Architecture (WICSA) & 3rd European Conference on
Software Architecture (ECSA). 2009.
P49 Gebhart, M., M. Baumgartner, and S. Abeck. Supporting Service Design Decisions. in
Fifth International Conference on Software Engineering Advances (ICSEA 2010).
2010.
P50 Gilson, F. and V. Englebert. Rationale, decisions and alternatives traceability for
architecture design. in 5th European Conference on Software Architecture:
Companion Volume. 2011.
P51 Grunske, L. Identifying "good" architectural design alternatives with multi-objective
optimization strategies. in 28th International Conference on Software Engineering.
2006: ACM.
P52 Gu, Q. and P. Lago. SOA process decisions: new challenges in architectural
knowledge modeling. in Proceedings of the 3rd international workshop on Sharing
and reusing architectural knowledge. 2008: ACM.
P53 Gu, Q., P. Lago, and H. Van Vliet. A template for SOA design decision making in an
educational setting. in 36th EUROMICRO Conference on Software Engineering and
Advanced Applications (SEAA). 2010.
P54 Gu, Q. and H. van Vliet. SOA decision making - what do we need to know. in Proceedings of the 2009 ICSE Workshop on Sharing and Reusing Architectural
Knowledge. 2009: IEEE Computer Society.
P55 Harrison, N.B., P. Avgeriou, and U. Zdun, Using Patterns to Capture Architectural Decisions. IEEE Software, 2007. 24(4): p. 38-45.
P56* Harrison, T.C. and A.P. Campbell. Attempting to Understand the Progress of Software
Architecture Decision-Making on Large Australian Defence Projects. in 9th Working IEEE/IFIP Conference on Software Architecture (WICSA). 2011.
P57 Heeseok, C., C. Youhee, and Y. Keunhyuk, An integrated approach to quality
achievement with architectural design decisions. Journal of Software, 2006. 1(3): p.
40-49.
P58 Hill, T., S. Supakkul, and L. Chung. Confirming and Reconfirming Architectural
Decisions on Scalability: A Goal-Driven Simulation Approach. in On the Move to
Meaningful Internet Systems: OTM 2009 Workshops. Proceedings Confederated International Workshops and Posters ADI, CAMS, EI2N, ISDE, IWSSA, MONET,
OnToContent, ODIS, ORM, OTM Academy, SWWS, SEMELS, Beyond SAWSDL,
COMBEK 2009. 2009.
P59* Hoorn, J.F., et al., The lonesome architect. Journal of Systems and Software, 2011.
84(9): p. 1424-1435.
P60* Ivanović, A. and P. America. Customer value in architecture decision making. in 4th European Conference on Software Architecture. 2010.
P61 Ivanović, A. and P. America. Information needed for architecture decision making. in
2010 ICSE Workshop on Product Line Approaches in Software Engineering. 2010.
P62* Jansen, A., P. Avgeriou, and J.S. van der Ven, Enriching software architecture
documentation. Journal of Systems and Software, 2009. 82(8): p. 1232-1248.
P63* Jansen, A. and J. Bosch. Software Architecture as a Set of Architectural Design
Decisions. in 5th Working IEEE/IFIP Conference on Software Architecture. 2005.
P64* Jansen, A., J. Bosch, and P. Avgeriou, Documenting after the fact: Recovering
architectural design decisions. Journal of Systems and Software, 2008. 81(4): p. 536-557.
P65* Jansen, A., et al. Tool support for architectural decisions. in Working IEEE/IFIP
Conference on Software Architecture. 2007.
P66 Kazman, R., H.P. In, and H.-M. Chen, From requirements negotiation to software architecture decisions. Information and Software Technology, 2005. 47(8): p. 511-
520.
P67 Khaled, L., Achieving Goals through Architectural Design Decisions. Journal of Computer Science, 2010. 6(12): p. 1424-1429.
P68* Könemann, P. Integrating decision management with UML modeling concepts and
tools. in Joint Working IEEE/IFIP Conference on Software Architecture & European
Conference on Software Architecture. 2009.
P69* Könemann, P. and O. Zimmermann. Linking design decisions to design models in
model-based software development. in 4th European conference on Software
architecture. 2010.
P70 Kruchten, P. An Ontology of Architectural Design Decisions in Software Intensive
Systems. in 2nd Groningen Workshop Software Variability. 2004.
P71 Kruchten, P., R. Capilla, and J.C. Dueñas, The decision view's role in software
architecture practice. IEEE Software, 2009. 26(2): p. 36-42.
P72 Kruchten, P., P. Lago, and H. van Vliet. Building up and reasoning about architectural knowledge. in Second International Conference on Quality of Software
Architectures. 2006.
P73 Lee, L. and P. Kruchten. Capturing software architectural design decisions. in Canadian Conference on Electrical and Computer Engineering. 2007.
P74 Lee, L. and P. Kruchten. Customizing the capture of software architectural design
decisions. in Canadian Conference on Electrical and Computer Engineering -
CCECE. 2008.
P75 Lee, L. and P. Kruchten. A tool to visualize architectural design decisions. in
Proceedings of the 4th International Conference on Quality of Software-
Architectures: Models and Architectures. 2008.
P76 MacDonald, S., et al., Deferring Design Pattern Decisions and Automating Structural
Pattern Changes Using a Design-Pattern-Based Programming System. ACM
Transactions on Programming Languages and Systems, 2009. 31(3).
P77* Makki, M., E. Bagheri, and A.A. Ghorbani. Automating architecture trade-off
decision making through a complex multi-attribute decision process. in Second
European Conference on Software Architecture. 2008.
P78 Mayr, C., U. Zdun, and S. Dustdar. Reusable architectural decision model for model
and metadata repositories. in 7th International Symposium on Formal Methods for
Components and Objects. 2009.
P79* Miksovic, C. and O. Zimmermann. Architecturally Significant Requirements,
Reference Architecture, and Metamodel for Knowledge Management in Information
Technology Services. in 9th Working IEEE/IFIP Conference on Software Architecture.
2011.
P80 Mirakhorli, M. and J. Cleland-Huang. Tracing architectural concerns in high
assurance systems (NIER track). in 33rd International Conference on Software
Engineering. 2011.
P81 Moaven, S., et al. Decision Support System Environment for Software Architecture
Style Selection (DESAS v1.0). in Proceedings of the 21st Conference on Software Engineering &
Knowledge Engineering (SEKE'09). 2009.
P82 Mohamed, A. and M. Zulkernine. Architectural Design Decisions for Achieving
Reliable Software Systems. in First international conference on Architecting Critical
Systems. 2010.
P83 Mohan, K. and B. Ramesh, Traceability-based knowledge integration in group
decision and negotiation activities. Decision Support Systems, 2007. 43(3): p. 968-
989.
P84 Moore, M., et al. Quantifying the value of architecture design decisions: Lessons from
the field. in 25th International Conference on Software Engineering. 2003.
P85 Nakakawa, A., P. Van Bommel, and E. Proper. Towards a theory on collaborative
decision making in enterprise architecture. in 5th international conference on Global
Perspectives on Design Science Research. 2010.
P86* Navarro, E. and C.E. Cuesta. Automating the trace of architectural design decisions and rationales using a MDD approach. in 2nd European Conference on Software
Architecture. 2008.
P87* Navarro, E., C.E. Cuesta, and D.E. Perry. Weaving a network of architectural knowledge. in Joint Working IEEE/IFIP Conference on Software Architecture & 3rd
European Conference on Software Architecture. 2009.
P88 Nowak, M. and C. Pautasso. Goals, questions and metrics for architectural decision models. in Proceedings of the 6th International Workshop on SHAring and Reusing
Architectural Knowledge. 2011.
P89 Nowak, M., C. Pautasso, and O. Zimmermann. Architectural decision modeling with
reuse: Challenges and opportunities. in Proceedings of the 2010 ICSE Workshop on Sharing and Reusing Architectural Knowledge. 2010.
P90 Orlic, B., et al. Concepts and diagram elements for architectural knowledge
management. in 5th European Conference on Software Architecture: Companion Volume Article No. 3. 2011.
P91 Ozkaya, I., P. Wallin, and J. Axelsson. Architecture knowledge management during
system evolution - Observations from practitioners. in 2010 ICSE Workshop on Sharing and Reusing Architectural Knowledge. 2010.
P92 Parmar, M., W.U. Khan, and B. Kumar, An Architectural Decision Tool Based on
Scenarios and Nonfunctional Requirements. International Journal of Advanced
Computer Science and Applications, 2011. 2(2).
P93 Pautasso, C., O. Zimmermann, and F. Leymann. Restful web services vs. "big"' web
services: making the right architectural decision. in Proceedings of the 17th
international conference on World Wide Web. 2008: ACM.
P94 Pulkkinen, M. Systemic Management of Architectural Decisions in Enterprise
Architecture Planning. Four Dimensions and Three Abstraction Levels. in 39th
Annual Hawaii International Conference on System Sciences. 2006.
P95 Ribeiro, R.A., et al., Hybrid assessment method for software engineering decisions.
Decision Support Systems, 2011. 51(1): p. 208-219.
P96 Riebisch, M. and S. Wohlfarth. Introducing impact analysis for architectural
decisions. in 2007 IEEE Symposium and Workshop on Engineering of Computer
Based Systems. 2007.
P97 Saarelainen, M.M. and V. Hotti. Does enterprise architecture form the ground for
group decisions in eGovernment programme? Qualitative study of the Finnish national
project for IT in social services. in 15th IEEE International Enterprise Distributed
Object Computing Conference Workshops. 2011.
P98* Savolainen, J., et al. Experiences in making architectural decisions during the
development of a New Base Station platform. in 4th European Conference on
Software Architecture. 2010.
P99 Savolainen, J. and T. Männistö, Conflict-centric software architectural views:
Exposing trade-offs in quality requirements. IEEE Software, 2010. 27(6): p. 33-37.
P100 Sawada, A., et al. A Design Map for Recording Precise Architecture Decisions. in
18th Asia Pacific Software Engineering Conference. 2011.
P101 Shahin, M., P. Liang, and M.R. Khayyambashi. Improving understandability of
architecture design through visualization of architectural design decision. in 2010
ICSE Workshop on Sharing and Reusing Architectural Knowledge. 2010.
P102 Shahin, M., P. Liang, and Z. Li. Architectural design decision visualization for
architecture design: Preliminary results of a controlled experiment. in 5th European
Conference on Software Architecture: Companion Volume. 2011.
P103 Siva Balan, R.V. and M. Punithavalli, Decision based development of productline: A quintessence usability approach. Journal of Computer Science, 2011. 7(5): p. 619-
628.
P104 Sousa, K., H. Mendonca, and E. Furtado. Applying a multi-criteria approach for the selection of usability patterns in the development of DTV applications. in Proceedings
of VII Brazilian symposium on Human factors in computing systems. 2006: ACM.
P105* Stoll, P., A. Wall, and C. Norstrom. Guiding architectural decisions with the influencing factors method. in 7th Working IEEE/IFIP Conference on Software
Architecture. 2008.
P106 Svahnberg, M., An industrial study on building consensus around software
architectures and quality attributes. Information and Software Technology, 2004. 46(12): p. 805-818.
P107 Svahnberg, M. and C. Wohlin, An investigation of a method for identifying a software
architecture candidate with respect to quality attributes. Empirical Software Engineering, 2005. 10(2): p. 149-181.
P108 Svahnberg, M., et al. A method for understanding quality attributes in software
architecture structures. in Proceedings of the 14th international conference on Software engineering and knowledge engineering. 2002: ACM.
P109 Tang, A., Software designers, are you biased?, in Proceedings of the 6th International
Workshop on SHAring and Reusing Architectural Knowledge. 2011, ACM: Waikiki,
Honolulu, HI, USA. p. 1-8.
P110 Tang, A., H. Vliet, and R. Vasa, Software Architecture Design Reasoning: A Case for
Improved Methodology Support. IEEE Software, 2009. 26(2): p. 43-49.
P111* Tang, A., et al. Predicting Change Impact in Architecture Design with Bayesian Belief
Networks. in 5th Working IEEE/IFIP Conference on Software Architecture. 2005.
P112 Tofan, D., M. Galster, and P. Avgeriou. Capturing tacit architectural knowledge using
the repertory grid technique (NIER track). in 33rd International Conference on
Software Engineering. 2011.
P113* Tofan, D., M. Galster, and P. Avgeriou. Reducing architectural knowledge vaporization by applying the repertory grid technique. in 5th European Conference on
Software Architecture. 2011.
P114 Trujillo, S., et al. Exploring Extensibility of Architectural Design Decisions. in Proceedings of the Second Workshop on SHAring and Reusing architectural
Knowledge Architecture, Rationale, and Design Intent. 2007: IEEE Computer Society.
P115 Tyree, J. and A. Akerman, Architecture decisions: Demystifying architecture. IEEE Software, 2005. 22(2): p. 19-27.
P116* Umar, A. and A. Zordan, Reengineering for service oriented architectures: A strategic
decision model for integration versus migration. Journal of Systems and Software,
2009. 82(3): p. 448-462.
P117 van den Berg, M., A. Tang, and R. Farenhorst. A constraint-oriented approach to
software architecture design. in 9th International Conference on Quality Software.
2009.
P118* van Gurp, J. and J. Bosch, Design erosion: Problems and causes. Journal of Systems
and Software, 2002. 61(2): p. 105-119.
P119* van Heesch, U. and P. Avgeriou. Naive architecting - Understanding the reasoning
process of students: A descriptive survey. in 4th European Conference on Software
Architecture. 2010.
P120* van Heesch, U. and P. Avgeriou. Mature Architecting - A Survey about the Reasoning
Process of Professional Architects. in 9th Working IEEE/IFIP Conference on
Software Architecture. 2011.
P121 Wallin, P., J. Froberg, and J. Axelsson. Making Decisions in Integration of Automotive Software and Electronics: A Method Based on ATAM and AHP. in 4th
International Workshop on Software Engineering for Automotive Systems. 2007: IEEE
Computer Society.
P122 Wang, W. and J.E. Burge. Using rationale to support pattern-based architectural
design. in 2010 ICSE Workshop on Sharing and Reusing Architectural Knowledge.
2010.
P123* Weinreich, R. and G. Buchgeher. Integrating requirements and design decisions in
architecture representation. in 4th European Conference on Software Architecture.
2010.
P124 Wu, W. and T. Kelly. Managing architectural design decisions for safety-critical software systems. in Second International Conference on Quality of Software
Architectures. 2006.
P125 Youhee, C., C. Heeseok, and O. MoonKyun. An architectural design decision-centric approach to architectural evolution. in 11th international conference on Advanced
Communication Technology - Volume 1. 2009.
P126* Zalewski, A. and S. Kijas. Architecture decision-making in support of complexity control. in 4th European Conference on Software Architecture. 2010.
P127* Zalewski, A., S. Kijas, and D. Sokolowska, Capturing architecture evolution with
maps of architectural decisions 2.0, in 5th European Conference on Software
Architecture. 2011, Springer-Verlag: Essen, Germany. p. 83-96.
P128* Zalewski, A. and M. Ludzia, Diagrammatic Modeling of Architectural Decisions, in
2nd European Conference on Software Architecture. 2008, Springer-Verlag: Paphos,
Cyprus. p. 350-353.
P129 Zannier, C., M. Chiasson, and F. Maurer, A model of design decision making based on
empirical results of interviews with software designers. Information and Software
Technology, 2007. 49(6): p. 637-653.
P130 Zannier, C. and F. Maurer. A qualitative empirical evaluation of design decisions. in
2005 workshop on Human and social factors of software engineering. 2005: ACM.
P131 Zannier, C. and F. Maurer. Foundations of agile decision making from agile mentors
and developers. in 7th international conference on Extreme Programming and Agile
Processes in Software Engineering. 2006.
P132 Zannier, C. and F. Maurer. Comparing decision making in agile and non-agile
software organizations. in 8th international conference on Agile processes in software
engineering and extreme programming. 2007: Springer-Verlag.
P133 Zannier, C. and F. Maurer. Social Factors Relevant to Capturing Design Decisions. in
Proceedings of the Second Workshop on SHAring and Reusing architectural
Knowledge Architecture, Rationale, and Design Intent. 2007: IEEE Computer Society.
P134 Zayaraz, G., S. Vijayalakshmi, and V. Vijayalakshmi. Evaluation of software architectures using multicriteria fuzzy decision making technique. in International
Conference on Intelligent Agent & Multi-Agent Systems. 2009.
P135 Zdun, U., Systematic pattern selection using pattern language grammars and design
space analysis. Software-Practice & Experience, 2007. 37(9): p. 983-1016.
P136 Zdun, U., A DSL toolkit for deferring architectural decisions in DSL-based software
design. Information and Software Technology, 2010. 52(7): p. 733-748.
P137 Zdun, U., et al. Architecting as decision making with patterns and primitives. in 3rd international workshop on Sharing and reusing architectural knowledge. 2008: ACM.
P138 Zhu, L., et al., Tradeoff and sensitivity analysis in software architecture evaluation
using analytic hierarchy process. Software Quality Journal, 2005. 13(4): p. 357-375.
P139 Zhu, L. and I. Gorton. UML Profiles for Design Decisions and Non-Functional
Requirements. in Second Workshop on SHAring and Reusing architectural Knowledge
Architecture, Rationale, and Design Intent. 2007: IEEE Computer Society.
P140 Zimmermann, O., Architectural Decisions as Reusable Design Assets. IEEE Software,
2011. 28(1): p. 64-69.
P141 Zimmermann, O., et al. Service-oriented architecture and business process choreography in an order management scenario: rationale, concepts, lessons learned.
in Companion to the 20th annual ACM SIGPLAN conference on Object-oriented
programming, systems, languages, and applications. 2005: ACM.
P142 Zimmermann, O., et al. Architectural decisions and patterns for transactional
workflows in SOA. in Fifth International Conference on Service-Oriented Computing.
2007.
P143 Zimmermann, O., et al. Reusable architectural decision models for enterprise
application development. in Third International Conference on Quality of Software
Architectures. 2007.
P144* Zimmermann, O., et al., Managing architectural decision models with dependency relations, integrity constraints, and production rules. Journal of Systems and
Software, 2009. 82(8): p. 1249-1267.
10.2.2 Publication Venues
Table 10.1. Distribution of the 144 papers over all publication venues.
Venue | Type | No | %
Workshop on SHAring and Reusing Architectural Knowledge | Workshop | 17 | 11.81
European Conference on Software Architecture | Conference | 16 | 11.11
Working IEEE/IFIP Conference on Software Architecture | Conference | 10 | 6.94
Journal of Systems and Software | Journal | 7 | 4.86
Working IEEE/IFIP Conference on Software Architecture/European Conference on Software Architecture | Conference | 7 | 4.86
IEEE Software | Journal | 7 | 4.86
International Conference on Software Engineering | Conference | 6 | 4.17
Quality of Software Architectures | Conference | 5 | 3.47
Information and Software Technology | Journal | 4 | 2.78
Workshop on Traceability, Dependencies and Software Architecture | Workshop | 3 | 2.08
Engineering of Computer Based Systems | Conference | 3 | 2.08
Hawaii International Conference on System Sciences | Conference | 3 | 2.08
Decision Support Systems | Journal | 3 | 2.08
European Workshop on Software Architecture | Workshop | 2 | 1.39
Euromicro Conference on Software Engineering and Advanced Applications | Conference | 2 | 1.39
Journal of Computer Science | Journal | 2 | 1.39
Extreme Programming and Agile Processes in Software Engineering | Conference | 2 | 1.39
Software Engineering and Knowledge Engineering | Conference | 2 | 1.39
Canadian Conference on Electrical and Computer Engineering | Conference | 2 | 1.39
Australian Software Engineering Conference | Conference | 2 | 1.39
Symposium on Human Factors in Computing Systems | Conference | 1 | 0.69
Product Line Approaches in Software Engineering (in conjunction with ICSE) | Workshop | 1 | 0.69
Empirical Software Engineering Journal | Journal | 1 | 0.69
IEEE/ACM International Conference on Automated Software Engineering | Conference | 1 | 0.69
Software Engineering for Automotive Systems | Workshop | 1 | 0.69
IFIP TC 2 Central and East European Conference on Software Engineering Techniques | Conference | 1 | 0.69
International Workshop on The Economics of Software and Computation | Workshop | 1 | 0.69
Design Studies Journal | Journal | 1 | 0.69
Journal of Software Maintenance and Evolution: Research and Practice | Journal | 1 | 0.69
International Conference of Advanced Communication Technology | Conference | 1 | 0.69
SIGCHI Conference on Human Factors in Computing Systems | Conference | 1 | 0.69
International Conference on Design Science Research in Information Systems and Technology | Conference | 1 | 0.69
Software Quality Journal | Journal | 1 | 0.69
International Conference on Intelligent Agent & Multi-Agent Systems | Conference | 1 | 0.69
International Workshop on Models and Processes for the Evaluation of off-the-shelf Components | Workshop | 1 | 0.69
International Conference on Quality Software | Conference | 1 | 0.69
International Workshops and Posters on the Move to Meaningful Internet Systems | Workshop | 1 | 0.69
International Conference on Service-Oriented Computing | Conference | 1 | 0.69
Journal of Software | Journal | 1 | 0.69
ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications | Conference | 1 | 0.69
Asia Pacific Software Engineering Conference | Conference | 1 | 0.69
International Conference on Software Engineering Advances | Conference | 1 | 0.69
IBM Systems Journal | Journal | 1 | 0.69
International Conference on World Wide Web | Conference | 1 | 0.69
Advanced Information Systems Engineering Workshops | Workshop | 1 | 0.69
Ubiquitous, Autonomic and Trusted Computing Symposia and Workshops | Conference | 1 | 0.69
Software Quality Control Journal | Journal | 1 | 0.69
Formal Methods for Components and Objects | Conference | 1 | 0.69
Software: Practice and Experience | Journal | 1 | 0.69
Computers & Operations Research | Journal | 1 | 0.69
IEEE Workshop on Software Evolvability | Workshop | 1 | 0.69
Groningen Workshop on Software Variability | Workshop | 1 | 0.69
WSEAS Transactions on Information Science and Applications | Journal | 1 | 0.69
Workshop on Human and Social Factors of Software Engineering | Workshop | 1 | 0.69
International Journal of Advanced Computer Science and Applications | Journal | 1 | 0.69
ACM Transactions on Programming Languages and Systems | Journal | 1 | 0.69
International Symposium on Architecting Critical Systems | Conference | 1 | 0.69
International Symposium on Empirical Software Engineering | Conference | 1 | 0.69
International Enterprise Distributed Object Computing Conference Workshops | Workshop | 1 | 0.69
10.3 Appendix for Chapter 6
10.3.1 Phase 2 – Additional Details
The table below shows details on the architects who participated in the study
in Chapter 6, Section 6.6.3.1.
Table 10.2. Details on the architects who participated in the study about the
industrial applicability of REGAIN, and the captured decisions.
ID | Age | Years of experience | No. of captured decisions in session | Application domain of decision
1 | 30-35 | 9 | 2 | Bioinformatics
2 | 25-30 | 1 | 1 | Home automation
3 | 30-35 | 3 | 1 | e-Government
4 | 25-30 | 5 | 2 | Bioinformatics
5 | 35-40 | 8 | 2 | Bioinformatics
6 | 30-35 | 5 | 2 | Healthcare
7 | 20-25 | 3 | 1 | Mobile
8 | 30-35 | 7 | 1 | Service oriented systems
9 | 45-50 | 23 | 1 | Logistics
10 | 40-45 | 15 | 2 | Database design
11 | 45-50 | 22 | 3 | Telecommunication
12 | 25-30 | 2 | 1 | Document-oriented database
13 | 45-50 | 14 | 1 | e-Government
14 | 30-35 | 7 | 1 | Geographic information system
15 | 40-45 | 15 | 2 | Statistics
16 | 45-50 | 8 | 1 | Mobile
In Section 6.3.1, we present the sessions that we conducted with the architects.
In the third step of the sessions, we asked architects to offer additional
feedback on REGAIN through semi-structured interviews. We used the
questions below for the semi-structured interviews.
- When designing a software system, would you use such an approach
to help you with your decision making? If so, why?
- In the context of software architecting, what benefits do you see for
such an approach, and in which activities?
- In the context of software architecting, what drawbacks do you see
for such an approach, and in which activities?
- How difficult do you find the approach?
a. Which parts were difficult?
b. Were the steps and their rationale clear?
c. What did you like or dislike about it?
d. Do you have any suggestions for improvement?
e. What do you think about the results of the analysis?
10.3.2 Phase 3 – Additional Details
Table 10.3 below shows one of the decisions (on the user interface), which is part
of the experimental materials (see Section 6.4.2).
Table 10.3. Fixed grid with the decisions on user-interface. A rating of 1 indicates
strong agreement with the left pole, and a rating of 5 indicates strong agreement
with the right pole, similar to the example in Figure 6.3.
UI Decision alternatives: Dedicated touchscreen, Web interface, Mobile apps, Windows application, Ideal alternative.

Concern | Left pole (1) | Right pole (5)
Learnability | The end user needs a few days to get familiar with the interface | The end user needs half an hour to get familiar with the interface
Implementability | ESH needs a team of 10 people, working for 6 months to implement it | ESH needs a team of 2 people, working for 6 months to implement it
Responsiveness | The interface provides feedback to user’s actions anytime between 0.5-5 seconds | The interface provides feedback to user’s actions in less than 0.5 second
Interactivity | The interface enables entering text to HPS with the speed of writing an SMS on a phone keyboard | The interface enables entering text to HPS with the speed of keyboard typing
References
IEEE Std 830-1998, 1998. IEEE Recommended Practice for Software Requirements Specifications. IEEE Computer Society.
IEEE Std 1471-2000, 2000. IEEE Recommended Practice for Architectural Description of Software-Intensive Systems. IEEE.
ISO/IEC/IEEE 29148, 2011. Systems and software engineering — Life cycle processes — Requirements engineering.
Al-Naeem, T., Gorton, I., Babar, M.A., Rabhi, F., Benatallah, B., 2005. A quality-driven systematic approach for architecting distributed software applications, Proceedings of the International Conference on Software Engineering. ACM, New York, USA, pp. 244-253.
Ameller, D., Galster, M., Avgeriou, P., Franch, X., 2013. The Role of Quality Attributes in Service-Based Systems Architecting: A Survey, in: Drira, K. (Ed.), Software Architecture. Springer Berlin Heidelberg, pp. 200-207.
Andrews, A., Mancebo, E., Runeson, P., France, R., 2005. A Framework for Design Tradeoffs. Software Quality Journal 13, 377-405.
Avgeriou, P., Kruchten, P., Lago, P., Grisham, P., Perry, D., 2007. Architectural knowledge and rationale: issues, trends, challenges. ACM SIGSOFT Software Engineering Notes 32, 41-46.
Babar, M.A., Dingsøyr, T., Lago, P., van Vliet, H., 2009. Software Architecture Knowledge Management: Theory and Practice. Springer Berlin.
Babar, M.A., Kitchenham, B., Zhu, L., Gorton, I., Jeffery, R., 2006. An empirical study of groupware support for distributed software architecture evaluation process. Journal of Systems and Software 79, 912-925.
Babar, M.A., Northway, A., Gorton, I., Heuer, P., Thong, N., 2008. Introducing Tool Support for Managing Architectural Knowledge: An Experience Report, 15th Annual IEEE International Conference and Workshop on the Engineering of Computer Based Systems (ECBS 2008), pp. 105-113.
Bailey, J., Budgen, D., Turner, M., Kitchenham, B., Brereton, P., Linkman, S., 2007. Evidence relating to Object-Oriented software design: A survey, First International Symposium on Empirical Software Engineering and Measurement, pp. 482-484.
Basili, V., Caldiera, G., 1994. The goal question metric approach. Encyclopedia of Software Engineering 2, 1-10.
Bate, S.P., Robert, G., 2002. Knowledge management and communities of practice in the private sector: lessons for modernizing the National Health Service in England and Wales. Public Administration 80, 643-663.
Berander, P., 2004. Using students as subjects in requirements prioritization, Proceedings of the International Symposium on Empirical Software Engineering, pp. 167-176.
Berander, P., Jönsson, P., 2006. Hierarchical cumulative voting (HCV) - prioritization of requirements in hierarchies. International Journal of Software Engineering and Knowledge Engineering 16, 819-849.
Berander, P., Svahnberg, M., 2009. Evaluating two ways of calculating priorities in requirements hierarchies – An experiment on hierarchical cumulative voting. Journal of Systems and Software 82, 836-850.
Bjørnson, F.O., Dingsøyr, T., 2008. Knowledge management in software engineering: A systematic review of studied concepts, findings and research methods used. Information and Software Technology 50, 1055-1068.
Boose, J., Bradshaw, J., Kitto, C., Shema, D.B., 1990a. From ETS to Aquinas: Six years of knowledge acquisition tool development, Proceedings of the 5th Knowledge Acquisition Workshop.
Boose, J., Shema, D., Bradshaw, J., 1990b. Design knowledge capture for a corporate memory facility, Fifth Conference on Artificial Intelligence for Space Applications, NASA Marshall Space Flight Center, pp. 271-280.
Boose, J.H., 1984. Personal construct theory and the transfer of human expertise, Proceedings of the National Conference on Artificial Intelligence, Texas, USA, pp. 27-33.
Boose, J.H., 1989. Using repertory grid-centered knowledge acquisition tools for decision support, Proceedings of the Hawaii International Conference on System Sciences, Vol. III: Decision Support and Knowledge Based Systems, pp. 211-220.
Bosch, J., 2004. Software Architecture: The Next Step, in: Oquendo, F., Warboys, B., Morrison, R. (Eds.), First European Workshop on Software Architecture. Springer, pp. 194-199.
Bradner, S., 1997. Key words for use in RFCs to Indicate Requirement Levels. RFC 2119, IETF.
Brereton, P., Kitchenham, B., Budgen, D., Li, Z., 2008. Using a protocol template for case study planning, Evaluation and Assessment in Software Engineering, pp. 1-8.
Brown, A., Wilson, G., 2012. The Architecture of Open Source Applications.
Bryson, N., 1996. Group decision-making and the analytic hierarchy process: Exploring the consensus-relevant information content. Computers & Operations Research 23, 27-35.
Bu, W., Tang, A., Han, J., 2009. An analysis of decision-centric architectural design approaches, ICSE Workshop on Sharing and Reusing Architectural Knowledge, pp. 33-40.
Budgen, D., Turner, M., Brereton, P., Kitchenham, B., 2008. Using mapping studies in software engineering, 20th Annual Meeting of the Psychology of Programming Interest Group, pp. 195-204.
Capilla, R., Dueñas, J., Nava, F., 2010. Viability for codifying and documenting architectural design decisions with tool support. Journal of Software Maintenance and Evolution: Research and Practice 22, 81-119.
Carver, J.C., Jaccheri, L., Morasca, S., Shull, F., 2009. A checklist for integrating student empirical studies with research and teaching goals. Empirical Software Engineering 15, 35-59.
Castro-Schez, J.J., Jimenez, L., Moreno, J., Rodriguez, L., 2005. Using fuzzy repertory table-based technique for decision support. Decision Support Systems 39, 293-307.
Ciolkowski, M., Laitenberger, O., Vegas, S., Biffl, S., 2003. Practical experiences in the design and conduct of surveys in empirical software engineering, in: Conradi, R., Wang, A.I. (Eds.), Empirical Methods and Studies in Software Engineering. Springer, Berlin, Heidelberg, pp. 104-128.
Clements, P., Garlan, D., Bass, L., Stafford, J., Nord, R., Ivers, J., Little, R., 2002. Documenting software architectures: views and beyond. Pearson Education.
Clerc, V., Lago, P., van Vliet, H., 2007. The Architect's Mindset, 3rd Conference on Quality of Software Architectures. Springer-Verlag, Medford, MA, pp. 231-249.
da Mota Silveira Neto, P.A., Carmo Machado, I.d., McGregor, J.D., de Almeida, E.S., de Lemos Meira, S.R., 2011. A systematic mapping study of software product lines testing. Information and Software Technology 53, 407-423.
de Boer, R.C., Farenhorst, R., 2008. In Search of ‘Architectural Knowledge’, Third Workshop on SHAring and Reusing architectural Knowledge Architecture, Rationale, and Design Intent, pp. 71-78.
de Boer, R.C., Farenhorst, R., Lago, P., Van Vliet, H., Clerc, V., Jansen, A., 2007. Architectural knowledge: Getting to the core. Lecture Notes in Computer Science 4880, 197.
de Boer, R.C., van Vliet, H., 2008. Architectural knowledge discovery with latent semantic analysis: Constructing a reading guide for software product audits. Journal of Systems and Software 81, 1456-1469.
Delbecq, A.L., Van de Ven, A.H., 1971. A group process model for problem identification and program planning. The Journal of Applied Behavioral Science 7, 466-492.
Dingsøyr, T., Conradi, R., 2002. A survey of case studies of the use of knowledge management in software engineering. Journal of Software Engineering and Knowledge Engineering 12, 391-414.
Dingsøyr, T., van Vliet, H., 2009. Introduction to software architecture and knowledge management, in: Babar, M.A., Dingsøyr, T., Lago, P., van Vliet, H. (Eds.), Software Architecture Knowledge Management. Springer Berlin, pp. 1-17.
Easterbrook, S., Singer, J., Storey, M.-A., Damian, D., 2008. Selecting empirical methods for software engineering research, Guide to advanced empirical software engineering. Springer, pp. 285-311.
EBSE-RG, Template for a Mapping Study Protocol, http://www.dur.ac.uk/ebse/resources/templates/MappingStudyTemplate.pdf, (last accessed June 2013).
Edwards, H.M., McDonald, S., Young, S.M., 2009. The repertory grid technique: Its place in empirical software engineering research. Information and Software Technology 51, 785-798.
Eisenführ, F., Weber, M., Langer, T., 2010. Rational decision making. Springer.
Elberzhager, F., Rosbach, A., Münch, J., Eschbach, R., 2012. Reducing test effort: A systematic mapping study on existing approaches. Information and Software Technology 54, 1092–1106.
Engström, E., Runeson, P., 2011. Software product line testing – A systematic mapping study. Information and Software Technology 53, 2-13.
Falessi, D., Cantone, G., Becker, M., 2006. Documenting design decision rationale to improve individual and team design decision making: an experimental evaluation, Proceedings of the 2006 ACM/IEEE international symposium on Empirical software engineering. ACM, pp. 134-143.
Falessi, D., Cantone, G., Kazman, R., Kruchten, P., 2011. Decision-making techniques for software architecture design: A comparative survey. ACM Computing Surveys 43, 1-28.
Farenhorst, R., de Boer, R.C., 2009. Architectural Knowledge Management: Supporting Architects and Auditors. PhD thesis, VU University Amsterdam.
Farenhorst, R., Hoorn, J., Lago, P., van Vliet, H., 2009. The Lonesome Architect, Joint Working IEEE/IFIP Conference on Software Architecture & European Conference on Software Architecture. IEEE, pp. 61-70.
Field, A., 2009. Discovering Statistics Using SPSS (Introducing Statistical Methods). Sage Publications Ltd.
Fransella, F., Bell, R., Bannister, D., 2004. A Manual for Repertory Grid Technique. Wiley.
Gaines, B.R., Shaw, M.L.G., 1980. New directions in the analysis and interactive elicitation of personal construct systems. International Journal of Man-Machine Studies 13, 81-116.
Gaines, B.R., Shaw, M.L.G., 1993. Knowledge acquisition tools based on personal construct psychology. Knowledge Engineering Review 8, 49-85.
Gaines, B.R., Shaw, M.L.G., 2007. WebGrid Evolution through Four Generations 1994-2007.
Galster, M., Avgeriou, P., Tofan, D., 2013. Constraints for the design of variability-intensive service-oriented reference architectures - An industrial case study. Information and Software Technology 55, 428-441.
Galster, M., Tofan, D., 2014. Exploring web advertising to attract industry professionals for software engineering surveys, Proceedings of the 2nd International Workshop on Conducting Empirical Studies in Industry. ACM, Hyderabad, India, pp. 5-8.
Galster, M., Tofan, D., Avgeriou, P., 2012. On Integrating Student Empirical Software Engineering Studies with Research and Teaching Goals, Proceeding of the Evaluation and Assessment in Software Engineering.
Garcia, J., Popescu, D., Edwards, G., Medvidovic, N., 2009. Identifying Architectural Bad Smells, 13th European Conference on Software Maintenance and Reengineering (CSMR '09), pp. 255-258.
Gaubatz, P., Lytra, I., Zdun, U., 2015. Automatic enforcement of constraints in real-time collaborative architectural decision making. Journal of Systems and Software 103, 128-149.
Gilb, T., 2005. Competitive Engineering: A Handbook For Systems Engineering, Requirements Engineering, and Software Engineering Using Planguage. Butterworth-Heinemann.
Grice, J., Burkley, E., Burkley, M., Wright, S., Slaby, J., 2004. A sentence completion task for eliciting personal constructs in specific domains. Personal Construct Theory and Practice, 60-75.
Grice, J.W., 2002. Idiogrid: software for the management and analysis of repertory grids. Behavior research methods, instruments, & computers : a journal of the Psychonomic Society, Inc 34, 338-341.
Hardgrave, B.C., Davis, F.D., Riemenschneider, C.K., 2003. Investigating Determinants of Software Developers' Intentions to Follow Methodologies. J. Manage. Inf. Syst. 20, 123-151.
Harrison, N., Avgeriou, P., Zdun, U., 2007. Using Patterns to Capture Architectural Decisions. Software, IEEE 24, 38-45.
Hartwig, R.T., 2010. Facilitating Problem Solving: A Case Study Using the Devil's Advocacy Technique. Group Facilitation: A Research & Applications Journal, 17.
Hassenzahl, M., Wessler, R., 2000. Capturing Design Space From a User Perspective: The Repertory Grid Technique Revisited. International Journal of Human-Computer Interaction 12, 441-459.
Hilliard, R., 2000. IEEE-Std-1471-2000 Recommended Practice for Architectural Description of Software-Intensive Systems. IEEE, http://standards.ieee.org.
Höst, M., Runeson, P., 2007. Checklists for Software Engineering Case Study Research, First International Symposium on Empirical Software Engineering and Measurement, pp. 479-481.
Hove, S.E., Anda, B., 2005. Experiences from conducting semi-structured interviews in empirical software engineering research, 11th IEEE International Software Metrics Symposium (METRICS'05). IEEE, pp. 23-23.
ISO/IEC/IEEE, 2011. Systems and software engineering -- Architecture description. ISO/IEC/IEEE 42010:2011(E) (Revision of ISO/IEC 42010:2007 and IEEE Std 1471-2000), 1-46.
Jabali, F.H., Sharafi, S.M., Zamanifar, K., 2011. A Quality Based Method to Analyze Software Architectures. International Journal of Computer Science 8.
Jamieson, S., 2004. Likert scales: how to (ab)use them. Medical Education 38, 1217-1218.
Janis, I.L., 1989. Crucial decisions. Free Press.
Jankowicz, D., 2001. Why does subjectivity make us nervous? Making the tacit explicit. Journal of Intellectual Capital 2, 61-73.
Jankowicz, D., 2003. The easy guide to repertory grids. Wiley.
Jansen, A., Avgeriou, P., van der Ven, J.S., 2009. Enriching Software Architecture Documentation. Journal of Systems and Software 82, 1232-1248.
Jansen, A., Bosch, J., 2005. Software architecture as a set of architectural design decisions, 5th Working IEEE/IFIP Conference on Software Architecture, pp. 109-120.
Jansen, A., van der Ven, J., Avgeriou, P., Hammer, D.K., 2007. Tool Support for Architectural Decisions, Proceedings of the Working IEEE/IFIP Conference on Software Architecture (WICSA '07), pp. 4-4.
Jedlitschka, A., Ciolkowski, M., Pfahl, D., 2008. Reporting experiments in software engineering, in: Shull, F., Singer, J., Sjøberg, D. (Eds.), Guide to advanced empirical software engineering. Springer, pp. 201-228.
Juristo, N., Moreno, A.M., 2001. Basics of software engineering experimentation. Springer.
Karlsson, J., 1996. Software requirements prioritizing, Proceedings of the Second International Conference on Requirements Engineering, pp. 110-116.
Karlsson, J., Ryan, K., 1997. A Cost-Value Approach for Prioritizing Requirements. IEEE Software 14, 67-74.
Karlsson, J., Wohlin, C., Regnell, B., 1998. An evaluation of methods for prioritizing software requirements. Information and Software Technology 39, 939-947.
Kazman, R., Barbacci, M., Klein, M., Jeromy Carriere, S., Woods, S.G., 1999. Experience with performing architecture tradeoff analysis, 21st International Conference on Software Engineering. IEEE, pp. 54-63.
Kazman, R., In, H.P., Chen, H.-M., 2005. From requirements negotiation to software architecture decisions. Information and Software Technology 47, 511-520.
Kazman, R., Klein, M., 2001. Quantifying the costs and benefits of architectural decisions, Proceedings of the International Conference on Software Engineering. IEEE, pp. 297-306.
Kitchenham, B., Pfleeger, S.L., 2003. Principles of survey research, Parts 1 to 6. ACM SIGSOFT Software Engineering Notes 28, 24-27.
Kitchenham, B., Pfleeger, S.L., Pickard, L.M., Jones, P.W., Hoaglin, D.C., El Emam, K., Rosenberg, J., 2002. Preliminary guidelines for empirical research in software engineering. IEEE Transactions on Software Engineering 28, 721-734.
Kitchenham, B.A., Brereton, P., Turner, M., Niazi, M.K., Linkman, S., Pretorius, R., Budgen, D., 2010. Refining the systematic literature review process-two participant-observer case studies. Empirical Software Engineering 15, 618-653.
Kitchenham, B.A., Charters, S., 2007. Guidelines for performing systematic literature reviews in software engineering, EBSE-RG. Keele University and Durham University.
Kitchenham, B.A., Pfleeger, S.L., 2008. Personal Opinion Surveys, in: Shull, F., Singer, J., Sjøberg, D.I.K. (Eds.), Guide to Advanced Empirical Software Engineering. Springer London, pp. 63-92.
Krippendorff, K., 2004. Content analysis: An introduction to its methodology. Sage Publications, Inc.
Kruchten, P., 2004. An ontology of architectural design decisions in software intensive systems, 2nd Groningen Workshop on Software Variability. Citeseer, pp. 54-61.
Kruchten, P., 2008. What do software architects really do? Journal of Systems and Software 81, 2413-2416.
Kruchten, P., Capilla, R., Dueñas, J.C., 2009. The decision view’s role in software architecture practice. IEEE Software 26, 36-42.
Kruchten, P., Lago, P., van Vliet, H., 2006. Building up and reasoning about architectural knowledge, Proceedings of the International Conference on Quality of Software Architectures. Springer, pp. 43-58.
Kruchten, P., Lago, P., van Vliet, H., Wolf, T., 2005. Building up and Exploiting Architectural Knowledge, 5th Working IEEE/IFIP Conference on Software Architecture. IEEE, pp. 291-292.
Kubickova, M., Ro, H., 2011. Are students "Real People"? The Use of Student Subjects in Hospitality Research.
Lai, V.S., Wong, B.K., Cheung, W., 2002. Group decision making in a multiple criteria environment: A case using the AHP in software selection. European Journal of Operational Research 137, 134-144.
Li, Z., Liang, P., Avgeriou, P., 2013. Application of knowledge-based approaches in software architecture: A systematic mapping study. Information and Software Technology 55, 777-794.
Linstone, H., Turoff, M., 2002. The Delphi method: Techniques and applications.
De Long, D.W., Fahey, L., 2000. Diagnosing cultural barriers to knowledge management. The Academy of Management Executive (1993-2005) 14, 113-127.
Maiden, N.A.M., Rugg, G., 1996. ACRE: selecting methods for requirements acquisition. Software Engineering Journal 11, 183-192.
Manteuffel, C., Tofan, D., Koziolek, H., Goldschmidt, T., Avgeriou, P., 2014. Industrial Implementation of a Documentation Framework for Architectural Decisions, IEEE/IFIP Conference on Software Architecture.
McAdam, R., Reid, R., 2000. A comparison of public and private sector perceptions and use of knowledge management. Journal of European Industrial Training 24, 317-329.
Miesbauer, C., Weinreich, R., 2013. Classification of Design Decisions – An Expert Survey in Practice, in: Drira, K. (Ed.), Software Architecture. Springer Berlin Heidelberg, pp. 130-145.
Miller, S., 1997. Implementing Strategic Decisions: Four Key Success Factors. Organization Studies 18, 577-602.
Mohan, K., Ramesh, B., 2007. Traceability-based knowledge integration in group decision and negotiation activities. Decis. Support Syst. 43, 968-989.
Myers, M.D., Newman, M., 2007. The qualitative interview in IS research: Examining the craft. Inf. Organ. 17, 2-26.
Nakakawa, A., Bommel, P., Proper, E., 2010. Towards a Theory on Collaborative Decision Making in Enterprise Architecture, in: Winter, R., Zhao, J.L., Aier, S. (Eds.), Global Perspectives on Design Science Research. Springer Berlin Heidelberg, pp. 538-541.
Newell, B.R., Lagnado, D.A., Shanks, D.R., 2007. Straight choices: The psychology of decision making. Routledge.
Niu, N., Easterbrook, S., 2007. So, You Think You Know Others' Goals? A Repertory Grid Study. IEEE Software 24, 53-61.
Nowak, M., Pautasso, C., 2013a. Team Situational Awareness and Architectural Decision Making with Software Architecture Warehouse, 7th European Conference on Software Architecture (ECSA). Springer Verlag, Montpellier, France, pp. 146-161.
Nowak, M., Pautasso, C., 2013b. Team Situational Awareness and Architectural Decision Making with the Software Architecture Warehouse, in:
Drira, K. (Ed.), Software Architecture. Springer Berlin Heidelberg, pp. 146-161.
Nutt, P.C., Wilson, D.C., 2010. Handbook of decision making. Wiley-Blackwell.
Osborn, A.F., 1963. Applied Imagination: Principles and Procedures of Creative Problem-Solving. Scribner.
Perry, D.E., Wolf, A.L., 1992. Foundations for the study of software architecture. ACM SIGSOFT Software Engineering Notes 17, 40-52.
Petersen, K., Feldt, R., Mujtaba, S., Mattsson, M., 2008. Systematic mapping studies in software engineering, 12th international conference on Evaluation and Assessment in Software Engineering, pp. 68-77.
Peterson, M., 2009. An Introduction to Decision Theory. Cambridge University Press.
Pfleeger, S.L., Kitchenham, B.A., 2001. Principles of survey research: part 1: turning lemons into lemonade. SIGSOFT Softw. Eng. Notes 26, 16-18.
Polanyi, M., 1967. The tacit dimension. Anchor Books, Garden City, N.Y.
Poort, E., Martens, N., van de Weerd, I., van Vliet, H., 2012. How Architects See Non-Functional Requirements: Beware of Modifiability, in: Regnell, B., Damian, D. (Eds.), 18th International Working Conference on Requirements Engineering: Foundation for Software Quality. Springer Berlin Heidelberg, pp. 37-51.
Regnell, B., Höst, M., Och Dag, J.N., Beremark, P., Hjelm, T., 2001. An Industrial Case Study on Distributed Prioritisation in Market-Driven Requirements Engineering for Packaged Software. Requirements Engineering 6, 51-62.
Rekha, S., Muccini, H., 2014. A Study on Group Decision-Making in Software Architecture, 11th Working IEEE/IFIP Conference on Software Architecture (WICSA 2014).
Rubenstein-Montano, B., Liebowitz, J., Buchwalter, J., McCaw, D., Newman, B., Rebeck, K., 2001. A systems thinking framework for knowledge management. Decision Support Systems 31, 5-16.
Runeson, P., 2003. Using students as experiment subjects - an analysis on graduate and freshmen student data, Proceedings of the 7th International Conference on Empirical Assessment in Software Engineering. Keele University, UK, pp. 95-102.
Runeson, P., Höst, M., 2009. Guidelines for conducting and reporting case study research in software engineering. Empirical Softw. Engg. 14, 131-164.
Runeson, P., Höst, M., Rainer, A., Regnell, B., 2012. Case study research in software engineering: Guidelines and examples. John Wiley & Sons.
Saaty, T., 1990. How to make a decision: The analytic hierarchy process. European Journal of Operational Research 48, 9-26.
Scheubrein, R., Zionts, S., 2006. A problem structuring front end for a multiple criteria decision support system. Comput. Oper. Res. 33, 18-31.
Schweiger, D.M., Sandberg, W.R., Ragan, J.W., 1986. Group Approaches for Improving Strategic Decision Making: a Comparative Analysis of Dialectical Inquiry, Devil'S Advocacy, and Consensus. Academy of Management Journal 29, 51-71.
Seaman, C.B., 2008. Qualitative methods, in: Shull, F., Singer, J., Sjøberg, D.I.K. (Eds.), Guide to Advanced Empirical Software Engineering. Springer London, pp. 35-62.
Shahin, M., Liang, P., Khayyambashi, M.R., 2009. Architectural Design Decision: Existing Models and Tools, Joint Working IEEE/IFIP Conference & European Conference on Software Architecture, pp. 293-296.
Shaw, M., Gaines, B.R., 1996. Requirements acquisition. Software Engineering Journal.
Shaw, M., Garlan, D., 1996. Software architecture: perspectives on an emerging discipline. Prentice-Hall, Inc.
Shaw, M.L.G., 1980. On becoming a personal scientist: Interactive computer elicitation of personal models of the world. Academic Press.
Shaw, M.L.G., McKnight, C., 1981. Think Again: Personal Problem-Solving and Decision-Making. Prentice Hall.
Sjøberg, D., Hannay, J.E., Hansen, O., Kampenes, V.B., Karahasanovic, A., Liborg, N.-K., Rekdal, A.C., 2005. A survey of controlled experiments in software engineering. IEEE Transactions on Software Engineering 31, 733-753.
Smrithi Rekha, V., Muccini, H., 2014. A Study on Group Decision-Making in Software Architecture, Working IEEE/IFIP Conference on Software Architecture (WICSA).
Sousa, K., Mendonça, H., Furtado, E., 2006. Applying a multi-criteria approach for the selection of usability patterns in the development of DTV applications, Proceedings of the VII Brazilian Symposium on Human Factors in Computing Systems. ACM, Natal, RN, Brazil, pp. 91-100.
Svahnberg, M., 2004. An industrial study on building consensus around software architectures and quality attributes. Information and Software Technology 46, 805-818.
Svahnberg, M., Aurum, A., Wohlin, C., 2008. Using students as subjects - an empirical evaluation, Proceedings of the Second ACM-IEEE international symposium on Empirical software engineering and measurement. ACM, Kaiserslautern, Germany, pp. 288-290.
Svahnberg, M., Wohlin, C., Lundberg, L., Mattsson, M., 2003. A quality-driven decision-support method for identifying software architecture candidates. International Journal of Software Engineering and Knowledge Engineering 13, 547-573.
Svensson, R.B., Gorschek, T., Regnell, B., Torkar, R., Shahrokni, A., Feldt, R., 2012. Quality Requirements in Industrial Practice - An Extended Interview Study at Eleven Companies. IEEE Transactions on Software Engineering 38, 923-935.
Tang, A., 2011. Software designers, are you biased? Proceedings of the 6th International Workshop on Sharing and Reusing Architectural Knowledge, pp. 1-8.
Tang, A., Avgeriou, P., Jansen, A., Capilla, R., Ali Babar, M., 2010. A comparative study of architecture knowledge management tools. Journal of Systems and Software 83, 352-370.
Tang, A., Babar, M., Gorton, I., Han, J., 2006. A survey of architecture design rationale. Journal of Systems and Software 79, 1792-1804.
Tang, A., van Vliet, H., 2009. Software Architecture Design Reasoning, Software Architecture Knowledge Management. Springer, pp. 155-174.
Tastle, W.J., Wierman, M.J., 2007. Consensus and dissention: A measure of ordinal dispersion. International Journal of Approximate Reasoning 45, 531-545.
Tichy, W.F., 2000. Hints for Reviewing Empirical Work in Software Engineering. Empirical Software Engineering Journal 5, 309-312.
Tofan, D., Appendix, http://www.cs.rug.nl/~dan/PrioritizationExperiment/, (last accessed May 2015).
Tofan, D., Tool for Repertory Grid Technique, https://github.com/danrg/RGT-tool, (last accessed May 2015).
Tselios, K., Avgeriou, P., Tofan, D., 2012. Two empirical studies on decision-making processes in software architecture (Master's Thesis). University of Groningen.
Tyree, J., Akerman, A., 2005. Architecture decisions: Demystifying architecture. IEEE Software 22, 19-27.
van Heesch, U., Avgeriou, P., 2010. Naive Architecting - Understanding the Reasoning Process of Students, 4th European Conference on Software Architecture. Springer, pp. 24-37.
van Heesch, U., Avgeriou, P., 2011. Mature Architecting - A Survey about the Reasoning Process of Professional Architects, 9th Working IEEE/IFIP Conference on Software Architecture, pp. 260-269.
van Heesch, U., Avgeriou, P., Hilliard, R., 2012. A Documentation Framework for Architecture Decisions. Journal of Systems and Software 85, 795-820.
van Heesch, U., Avgeriou, P., Tang, A., 2013. Does decision documentation help junior designers rationalize their decisions? - A comparative multiple-case study. Journal of Systems and Software.
Venkatesh, V., Davis, F.D., 2000. A Theoretical Extension of the Technology Acceptance Model: Four Longitudinal Field Studies. Management Science 46, 186-204.
Wieringa, R., 2009. Design science as nested problem solving, Proceedings of the 4th International Conference on Design Science Research in Information Systems and Technology. ACM, Philadelphia, Pennsylvania, pp. 1-12.
Wohlin, C., Höst, M., Henningsson, K., 2003. Empirical research methods in software engineering. Empirical Methods and Studies in Software Engineering 9, 7-23.
Wohlin, C., Runeson, P., Höst, M., 2000. Experimentation in Software Engineering: an Introduction. Springer.
Wohlin, C., Runeson, P., Höst, M., Ohlsson, M.C., Regnell, B., Wesslén, A., 2012. Experimentation in Software Engineering. Springer Publishing Company, Incorporated.
Wojcik, R., Bachmann, F., Bass, L., Clements, P., Merson, P., Nord, R., Wood, B., 2006. Attribute-Driven Design (ADD), Version 2.0.
Yates, J.F., Veinott, E.S., Patalano, A.L., 2003. Hard Decisions, Bad Decisions: On Decision Quality and Decision Aiding, in: Schneider, S.L., Shanteau, J. (Eds.), Emerging Perspectives on Judgment and Decision Research. Cambridge University Press, New York, pp. 13-63.
Yin, R.K., 2003. Case study research: Design and methods. Sage Publications.
Zannier, C., Chiasson, M., Maurer, F., 2007. A model of design decision making based on empirical results of interviews with software designers. Information and Software Technology 49, 637-653.
Zeckhauser, R.J., 1996. Wise choices: decisions, games, and negotiations. Harvard Business School Press.
Zhang, H., Babar, M.A., Tell, P., 2011. Identifying relevant studies in software engineering. Information and Software Technology 53, 625-637.
Zhu, L., Aurum, A., Gorton, I., Jeffery, R., 2005. Tradeoff and Sensitivity Analysis in Software Architecture Evaluation Using Analytic Hierarchy Process. Software Quality Control 13, 357-375.
Zimmermann, O., 2011. Architectural Decisions as Reusable Design Assets. IEEE Software 28, 64-69.
Zimmermann, O., Koehler, J., Leymann, F., Polley, R., Schuster, N., 2009. Managing architectural decision models with dependency relations, integrity constraints, and production rules. Journal of Systems and Software 82, 1249-1267.
Acknowledgments
A PhD thesis is the result of a unique journey. It is a pleasure to express my
gratitude to all those who made this journey exciting.
I would like to thank my supervisor, Paris Avgeriou, for inspiring me to
always keep high research standards, write well, and provide evidence.
Moreover, I thank Paris for his availability, continuous feedback, patience, and
support. Furthermore, I appreciate the research freedom and space for
exploration that I was privileged to have throughout my PhD. Also, Paris
inspired me to appreciate white beer. These pieces of evidence suggest a great
supervisor.
I would like to thank my co-supervisor, Matthias Galster, for his efforts and
inspiring dedication to research. Our whiteboard and Skype discussions helped
improve my ideas and made the research process fun and exciting, with
minimal doodling. I appreciate his efforts in making our papers and my
thesis cleaner, and my arguments sharper and ramble-free. We both appreciate
good food, and I am sure that his future PhD students will benefit from his
expertise on both research and food.
I appreciate the efforts of the assessment committee members: Prof. Patricia
Lago, Prof. Ivica Crnkovic, and Prof. Marco Aiello, for reviewing my thesis,
offering valuable comments, and attending my defense.
Anonymous reviewers deserve special thanks for the time they spent to offer
feedback. Their feedback challenged me to improve my papers, and accept that
rejected papers make improvements possible and necessary.
I enjoyed the company of my colleagues from the University of Groningen:
Pavel Bulanov, Uwe Van Heesch, Ahmad Kamal, Andrea Pago, Eirini Kaldeli,
Viktoriya Degeler, Trosky Callo, Peng Liang, Zengyang Li, Daniel Feitosa,
Christian Manteuffel, Alexander Lazovik, Nick van Beest, Heerko Groefsema
and the other members of the SEARCH group. Special thanks to Jan Salvador
van der Ven for his continuous optimism and his help with the Dutch
translations for this thesis.
I appreciate the kindness and prompt help of Ineke Schelhaas, Esmee Elshof
and Desiree Hansen with all administrative matters.
I made many new friends in Groningen. I was delighted to meet extraordinary
people: Father Costel, Preoteasa Cristina, Lidia, Maria, Laura, Geo,
Cristina, Wim, Jan, Roxana, Gert. Cati and Cristi have always been amazing.
Adnana and Broni, Corina and Lucian, Roxana and Irinel, Fabiola and Will,
Georgiana and Nicoleta, Jan Coerts are always a joy to meet and spend time
with.
I made new friends in Brussels as well. Elena, Sarah, and Joel contributed with
much fun at delicious meals. My colleagues (Jerome, Jorg, Boris, Dominique,
Cedric, Nicolas, Vincent, Baja, and the others) at Eurofins are a fun team with
high skills and respect for software design.
My friends in Romania, Cristi, Lucian, and Dragos, have been supportive at all
times. I always enjoy the time spent with you.
My family has supported me unconditionally. Every day, my beloved wife,
Ioana, has brought light in my life. My two sons have taught me a lot about
life before, during, and after the PhD project. My brother has inspired me to
try more and try better.
I am grateful to all of you for this journey!
About the Author
Dan Tofan did his B.Sc. and M.Sc. studies in computer science at the
Gheorghe Asachi Technical University of Iasi in Romania. During his studies,
he worked in various software engineering roles at a leading provider of
telecom billing solutions. Currently, he is a hands-on software engineer with
software architecture and project management responsibilities for a portfolio
of internal core software applications at a global leader in bio-analytical
testing.