Page 1: [IEEE 2013 35th International Conference on Software Engineering (ICSE) - San Francisco, CA, USA (2013.05.18-2013.05.26)] 2013 35th International Conference on Software Engineering

Estimating Software-Intensive Projects in the Absence of Historical Data

Aldo Dagnino

Industrial Software Systems ABB Corporate Research

Raleigh, NC, USA

Abstract—This paper describes a software estimation technique for situations in which no reliable historical data is available to develop the initial effort estimate of a software development project. The technique incorporates a set of key estimation principles and three estimation methods that are used in tandem to deliver a robust initial estimate. An important contribution of this paper is bringing together into ONe Software Estimation Tool-kit (ONSET) multiple concepts, principles, and methods that are typically discussed separately in the estimation literature and that can be employed when an organization does not have reliable historical data. The paper shows how these principles and methods are applied to derive estimates without complex or expensive tools. A case study using ONSET is presented; it was carried out as an estimation pilot study in one of the software development Business Units of ABB. The results of this pilot project provided insights on how to implement ONSET across ABB software development business units. Practical guidance is offered on how an organization that does not have reliable historical data can begin to collect data for use in future projects using ONSET. In contrast to many papers that describe estimation approaches, this paper explains how to use a combination of judgment-based and model-based methods such as Planning Poker, Modified Wideband Delphi, and Monte Carlo simulation to derive the initial estimates. Once an organization begins collecting reliable historical data, ONSET will provide even more accurate estimation results, and a smoother transition to model-based estimation methods and tools can be achieved.


I. INTRODUCTION

In spite of the high volume of research conducted in software estimation, a large percentage of software development organizations still have enormous difficulty developing reliable effort estimates that result in on-time and on-budget delivery of their software products. There are several reasons why these difficulties persist. First, many software development organizations do not have a robust software estimation process. Second, a myriad of estimation algorithms have been developed, and each provides different outputs (a single-point estimate, a two-point interval estimate, or a probability with an estimate), which can confuse organizations. Third, software estimation tools assume that organizations have reliable historical data, and this is seldom true. Fourth, it is not clear to organizations how to use estimation methods when they do not have reliable historical data. Fifth, many software development organizations are not aware of the basic principles needed to develop estimates or of how to present these estimates to the rest of the organization. Sixth, many organizations confuse the concepts of target, estimate, and commitment.

This paper has four primary objectives:

(1) Defining the key software estimation principles that need to be incorporated into a software estimation process and toolkit.

(2) Defining an approach to utilize different software estimation methods to develop estimates when there is no reliable historical data, and that can also be utilized once historical data is being collected.

(3) Defining an approach to start collection of reliable historical data by utilizing the estimation methods described.

(4) Defining an overall software estimation process that can lead organizations towards CMMI Levels 2-3 in the software estimation sub-practice of the Project Planning Process Area.

In this paper, Section II provides a short literature review of initial estimation research. Section III focuses on the importance of the initial estimate. Section IV brings together a set of estimation principles that are often scattered and discussed piecemeal in the estimation literature and that should become an integral part of an organization’s software estimation process. Section V discusses the importance of the Cone of Uncertainty, which links the stages of the software development lifecycle and the estimation stages. Section VI presents a combination of estimation methods that can be used to derive initial estimates when no reliable historical data is available to the organization. These methods provide estimation results that complement each other and provide the basis for a CMMI Levels 2-3 software estimation sub-practice. Section VII discusses how to structure a work breakdown structure so that it can be used as the raw material to develop the estimates. Section VIII discusses the concept of an institutionalized estimation process and how this process integrates key estimation principles and a set of estimation methods into an estimation toolkit (ONSET) that can be utilized even if there is no reliable historical data. One advantage of ONSET is that once an organization begins collecting historical data, the same principles and methods can be utilized to develop new estimates. Section IX discusses a case study that utilizes ONSET and presents the details of how the estimation principles and methods are used to derive the initial project estimates. Finally, Section X provides concluding remarks and lessons learned from using and institutionalizing ONSET at ABB.

978-1-4673-3076-3/13/$31.00 © 2013 IEEE. ICSE 2013, San Francisco, CA, USA. Software Engineering in Practice.

II. BACKGROUND

Cohn [1] provides a good definition of estimation and points out that “[a] good estimate is an estimate that provides a clear enough view of the project reality to allow the project leadership to make good decisions about how to control the project to hit its targets”. After decades of research in the field of software estimation, and despite the large number of cost factors gathered and the rigorous data collected in the software industry, there is still a lot of uncertainty about how to estimate software projects effectively, especially at early stages in the development lifecycle. In practical terms, the ability to estimate software projects well depends on how much knowledge and uncertainty exists about the project being estimated at a particular point in time [5], [9]. Estimation focuses on three aspects: effort, schedule, and cost. Most methods focus on estimating size and then deriving effort and cost from it. Estimating effort is a primary challenge, as very good tools exist to translate effort into schedule, staff, cost, and risks [2], [5].

Project estimation has different stages, and they map to the stages of the development lifecycle. Initial estimation occurs at the early stages of the lifecycle, when uncertainty is still quite high. Planning estimation occurs when the product is better defined and more detailed requirements are known. During the development lifecycle stage, much of the uncertainty has been removed and the preliminary commitment estimates can be made. This paper focuses on helping a project manager develop initial estimates, which many authors consider the most important because the initial development budget is committed at this stage and because this is the stage where uncertainty is highest [7].

III. THE INITIAL ESTIMATE

The most critical estimate in the life of a software-intensive project is the initial estimate. This estimate typically occurs at the beginning of the project planning or initiation phase, as mentioned above, and is the main focus of the methods described in this paper and of the case study. A “savvy Project Manager” who is a good estimator is aware of the importance of the initial estimate. A savvy Project Manager must also be aware of the inherent conflict around estimates and the legitimacy of two positions: the business target position and the technical people’s estimate. The initial estimate represents a great opportunity to develop constructive collaboration among stakeholders towards solving problems that arise from the estimation, rather than promoting entrenched positions. Moreover, it is important that the Project Manager be able to develop sound and defendable estimates. A “savvy Project Manager” needs to facilitate the development of estimates, present estimates, and discuss their implications taking into account the above-mentioned four principles.

The initial estimate plays a key role in the project planning process. The initial estimate is the first indication to the organization about the feasibility of realizing the anticipated project benefits. If the project planning phase indicates that benefits are realizable, then the initial estimate also plays a significant role in determining the amount of resources that will be committed to the endeavor by the organization. The initial estimate is a communication vehicle that allows the whole organization to have a meaningful dialogue about the project and its significance to the organization [8].

The process of approving an estimate involves two very distinct sides of the organization: the business side and the technical side. One of the roles of the business function is to secure business (projects) for the organization, and its primary motivator is to ensure that the business plan is met. The development function has the role of delivering the projects that secure the business objectives, but its primary motivators are to perform sound technical work and to focus on what is feasible. The differing motivations of the business and technical functions can cause internal friction and political posturing. The goal is to balance the business and technical views. The role of corporate strategy also needs to be considered so as to balance the risks and rewards of the entire portfolio of projects.

Estimation of software products can often be a source of personal risk for the Project Manager preparing the estimate. The friction point arises over the gap between the business’ target for the project and the technical staff’s estimate of completion. The gap between the two views represents the organizational risk. Frequently, the organization resolves the organizational risk by adopting the target as the plan. Thus, the risk is transferred to the project team and, in the process, becomes an additional personal risk for the Project Manager. For many organizations, the debate over the gap between the target and the estimate can degenerate into strife, or a “negotiation”, instead of an open discussion. This can “poison” the project and make the organization lose sight of the “estimation process” and focus only on the end result of the estimation process. This means that the organization focuses on the certainty of the estimation outcome, downplaying and de-emphasizing the risks and uncertainty that could prevent success. This situation shortchanges the organization by taking away the opportunity to develop an in-depth understanding of the project in terms of the risks, rewards, and benefits.

As seen above, conflict arises because of the difference between the “target” and the “estimate”. This situation pits the project planning team against the business management team. Frequently, Senior Management seeks to resolve the situation by imposing the target on the project planners.

A “savvy” Project Manager will use a thorough initial estimate to promote a business discussion of the estimate itself. This will focus discussion on the gap between the estimate and the organization’s target rather than on a technical discussion of the estimate’s merit. In the end, this discussion will serve to make the organization aware of the level of risk in the project and to frame a discussion around how to creatively mitigate the project risk and thus the gap between the target and the estimate.

A good presentation of an estimate includes a description of the estimate’s scope, the estimation methods utilized, and the accuracy and uncertainty of the estimate. Initial estimation is most successful when multiple methods and different people are employed to develop the estimate. Convergence among the resulting estimates is an indication of accuracy and also provides a higher level of confidence in the estimate.

IV. ESTIMATION PRINCIPLES

To improve the odds of developing sound initial estimates for planning software development projects, several key principles should be observed when developing these estimates:

1) Continuously aim at reducing uncertainty.
2) Improve chaotic software development processes.
3) Make sure estimates are accurate even though they may not be as precise.
4) Avoid rushed and off-the-cuff estimates.
5) Present estimates in a way that allows tightening them as the team moves further into the project.
6) Recognize that the most critical estimate in the life of the project is the initial estimate.
7) Recognize that the discussion of an initial estimate involves the business and technical perspectives.
8) Use multiple estimating methods.
9) Use a multi-disciplinary team to develop the estimates.
10) Provide a range of values (optimistic and pessimistic) in your estimate.
11) Select a measure that is easy to count, well correlated with effort, and meaningful across projects, and base future size estimates on this measure.
12) Identify and document implicit and explicit assumptions when estimating elements in the work breakdown structure of a project.
13) Do not confuse an estimate with a target.
14) Quantify the gap between the estimate and the target, and then focus discussion on this gap and the potential risks it poses.
15) Emphasize counting and computing to derive effort.
16) Plan re-estimation at predefined points in the project.
17) Define when an estimate can be used as a basis for internal and external commitments.
18) Allow the people who will perform the project work to develop the estimates, when possible.
19) Provide software estimators with adequate education and training on estimation techniques.
20) Estimate size and then, based on team velocity, compute the effort estimate.
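Principles 14 and 20 both reduce to simple arithmetic, which the following minimal Python sketch illustrates. It is not taken from the paper: the function names, the units (story points, points per iteration, team hours per iteration), and all numeric inputs are hypothetical values chosen only to show the calculation.

```python
def effort_from_size(size_points, velocity_points_per_iteration, team_hours_per_iteration):
    """Principle 20 (sketch): derive effort from a size estimate and team velocity.

    All units are assumptions: size in story points, velocity in points
    completed per iteration, capacity in team hours per iteration.
    """
    if velocity_points_per_iteration <= 0:
        raise ValueError("velocity must be positive")
    iterations = size_points / velocity_points_per_iteration
    effort_hours = iterations * team_hours_per_iteration
    return iterations, effort_hours


def estimate_target_gap(estimate_hours, target_hours):
    """Principle 14 (sketch): quantify the gap between the estimate and the target."""
    gap_hours = estimate_hours - target_hours
    gap_ratio = gap_hours / target_hours
    return gap_hours, gap_ratio


# Hypothetical inputs, for illustration only.
iterations, effort_hours = effort_from_size(120, 30, 400)
gap_hours, gap_ratio = estimate_target_gap(effort_hours, 1200.0)
print(f"{iterations:.1f} iterations, {effort_hours:.0f} hours")
print(f"gap to target: {gap_hours:.0f} hours ({gap_ratio:.0%} over target)")
```

Expressing the gap as both an absolute number and a ratio keeps the discussion on the size of the risk rather than on the merit of the estimate.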

V. THE CONE OF UNCERTAINTY

A fundamental concept in software estimation is the Cone of Uncertainty [6]. The Cone of Uncertainty represents the uncertainty inherent in any project and shows how estimation becomes more accurate as the development of a product moves into later lifecycle stages and the uncertainty in the project decreases, as shown in Fig. 1. The “Y” axis shows the degree of error, which is closely correlated with the uncertainty in every project. Estimates created early in the development lifecycle have a higher degree of uncertainty, and estimates improve rapidly after the first third of the project. It is important to notice that the most important business decisions related to the software project are made at the time when knowledge about the project is at its minimum and uncertainty is at its maximum. An important concept to understand is that the Cone of Uncertainty represents the best-case accuracy that is possible in software estimates at different points in a project. The Cone represents the error in estimates created by skilled estimators. It is possible to do worse than the Cone of Uncertainty. If the project is not well controlled or the estimators are not very skilled, estimates can fail to improve, and the uncertainty, instead of being a well-defined Cone, is a Cloud that persists until the end of the project, as shown in Fig. 1. Hence, the Cone of Uncertainty is narrowed by making decisions that remove uncertainty from the project.

Fig. 1. Depiction of the Cone of Uncertainty.

Studies of software estimates have shown that estimators who start with single-point estimates often do not adjust their minimum and maximum values sufficiently to account for the uncertainty. A suggested approach is to estimate a most likely value and then use Table I below as a guide to compute the range. It is important to think carefully about the stage the project is in so that the error factors are selected appropriately. Notice that, whether or not the organization has historical data, some points in the estimation process still require subjectivity, and this is one of them.

[Fig. 1 axes: vertical, Variability in the Estimate; horizontal, Lifecycle Stage (Time), with stages Initial Concept Complete, Initial Product Definition Complete, Approved Product Definition, Requirements Specification Complete, Detailed Design Complete, and Software Complete. Range bounds marked on the cone: 0.25X to 4X, 0.5X to 2X, 0.8X to 1.25X, 0.85X to 1.15X, and 0.9X to 1.1X, converging on X.]

TABLE I. FACTORS ASSOCIATED WITH CONE OF UNCERTAINTY
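The range computation described above (a most likely value X scaled by the low-side and high-side factors for the current development phase) can be sketched in a few lines of Python. The factors are those listed in Table I; the dictionary layout, the function name, and the sample most likely value are assumptions introduced only for this illustration.

```python
# Low-side and high-side multipliers per development phase, as listed in Table I.
CONE_FACTORS = {
    "initial concept complete": (0.25, 4.0),
    "initial product definition complete": (0.5, 2.0),
    "approved product definition": (0.8, 1.25),
    "requirements specification complete": (0.85, 1.15),
    "detailed design complete": (0.9, 1.1),
}


def estimate_range(most_likely, phase):
    """Return (low, high) bounds for a most likely estimate at the given phase."""
    low_factor, high_factor = CONE_FACTORS[phase]
    return most_likely * low_factor, most_likely * high_factor


# Hypothetical most likely effort of 1000 person-hours, for illustration only.
for phase in CONE_FACTORS:
    low, high = estimate_range(1000, phase)
    print(f"{phase}: {low:.0f} to {high:.0f} person-hours")
```

Choosing the phase remains a judgment call, which is the subjective step the text refers to; the arithmetic that follows is mechanical.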

VI. ESTIMATION METHODS UTILIZED IN ONSET

A large amount of literature exists on methods that can assist estimation of software-intensive development projects. These methods are typically subdivided into top-down vs. bottom-up and judgment-based vs. model-based [6].

A. Top Down and Bottom Up Estimation Methods

Top-down estimation methods focus on the overall characteristics of the software to be developed. These methods do not require detailed requirements and hence see the “whole picture” of the project. Nevertheless, top-down methods can underestimate the effort of developing difficult low-level technical components. When using a top-down estimation method, the total effort of a software project is estimated without a detailed decomposition of the project into low-level activities. Some situations clearly favor top-down estimation, such as the creation of early project estimates.

Bottom-up estimation methods focus on estimating each of the software components in detail and then aggregating the component estimates into one overall estimate for the project. Bottom-up methods rely on a detailed work breakdown structure and examine its detailed elements for estimation. Nevertheless, because of their detailed perspective, these methods can miss system-level activities such as architecture, design, system integration, integration testing, and documentation. Bottom-up methods can also underestimate the overall size and effort of system-level tasks. Bottom-up estimation helps to better understand the project requirements. It typically entails significant effort on the part of the development team, as details need to be defined.

B. Expert Judgment and Model Based Estimation Methods

Expert judgment estimates can be “intuitive” (ad hoc and inaccurate) or “structured” (potentially as accurate as model-based methods). Intuitive estimation is not recommended, as it is quite inaccurate. Structured expert judgment estimates are determined from the experience of the estimators and do not necessarily require historical project data. Nevertheless, if reliable historical data exists, expert estimators can use it effectively within a structured expert judgment-based estimation method. If applied correctly, structured expert-based estimation can be as accurate as estimates generated by estimation models not calibrated to the organization.

Model-based estimation methods make use of reliable historical project data. Historical data can come from the same project (typically stored from previous iterations and quite accurate), from the same organization (data derived from its past projects), or from industry (least preferable). Model-based methods follow a mechanical set of quantification steps using different types of equations. Third-party vendors have developed user-friendly tools that facilitate the use and implementation of model-based estimation techniques. Model-based methods, for the most part, use computation to derive estimates.
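As one illustration of a computation-driven method, the sketch below runs a Monte Carlo simulation with a triangular distribution, a method this paper names among those used in ONSET. It is a minimal sketch rather than the ONSET procedure itself: the three-point feature estimates (optimistic, most likely, pessimistic, in person-days), the function names, the number of trials, and the fixed seed are all assumptions made for illustration, and the features are treated as independent.

```python
import random

# Hypothetical three-point estimates per feature, in person-days:
# (optimistic, most likely, pessimistic). Not data from the case study.
FEATURES = [
    (8, 13, 30),
    (5, 8, 20),
    (13, 20, 50),
]


def simulate_total_effort(features, trials=10000, seed=0):
    """Sample total effort by drawing each feature from a triangular distribution."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    totals = []
    for _ in range(trials):
        # random.triangular takes (low, high, mode) in that order.
        total = sum(rng.triangular(low, high, mode) for (low, mode, high) in features)
        totals.append(total)
    totals.sort()
    return totals


def percentile(sorted_values, fraction):
    """Return the value at the given fraction (0..1) of a sorted sample."""
    index = min(len(sorted_values) - 1, int(fraction * len(sorted_values)))
    return sorted_values[index]


totals = simulate_total_effort(FEATURES)
for fraction in (0.1, 0.5, 0.9):
    print(f"P{int(fraction * 100)}: {percentile(totals, fraction):.1f} person-days")
```

Reporting several percentiles rather than one number yields the optimistic-to-pessimistic range that principle 10 asks for.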

VII. PROJECT REQUIREMENTS CATEGORIZATION

The basic raw materials needed to estimate a project are the elements in the work breakdown structure (WBS). The WBS is less detailed at early stages in the development lifecycle and more detailed at more advanced stages of the project [10].

The concept “levels of requirements” refers to defining a hierarchy of requirements based on their level of detail or level of abstraction. Hence, “higher-level” requirements are more abstract and less granular (less detailed), while “lower-level” requirements are finer-grained (more detailed). Typically, one “higher-level” requirement can be traced from more than one “lower-level” requirement and vice versa. Fig. 2 shows the hierarchy of requirements.

A. Market or Customer Needs

The need is a statement of what the customer or the market requires, and it is shown as a cloud in Fig. 2. Need statements can be derived by asking a series of “Why” questions. The format and an example of a market need are presented below.

Format: [A {Customer} needs to {qualifier action} {Subject}]
Example: A Utility needs to efficiently manage the feeders’ settings in intelligent electronic devices (IEDs) in power substations.

B. Feature, Function, or User Story

The first box inside the first rounded rectangle in Fig. 2 shows that features, functions, user stories, etc., can be derived from market/customer needs. It is important that an organization define a structure for moving from market/customer needs to features or functions. An organization must ensure that there is a consistent way to define a feature, a function, a user story, etc., because this is the basis for initial estimation. A feature, function, or user story must be clear, specific, timeless, specified in customer/market language, concise, within the scope of the product, and positive, as shown below.

Format: [As a {User} I need to {qualifier} so that I can {reason}]
Example: As a Utilities Engineer I need to efficiently manipulate the settings of IEDs so that I can reduce the effort of configuring IEDs in feeders in a substation.

TABLE I. FACTORS ASSOCIATED WITH CONE OF UNCERTAINTY

Development Phase                      Possible Error, Low Side   Possible Error, High Side
Initial concept complete               0.25 * X                   4.0 * X
Initial product definition complete    0.5 * X                    2.0 * X
Approved product definition            0.8 * X                    1.25 * X
Requirements specification complete    0.85 * X                   1.15 * X
Detailed design complete               0.9 * X                    1.1 * X

Fig. 2. Different Levels of Requirements.

C. Market or Customer Requirements

The potential market reflects the needs of the total population interested in acquiring a product. The “Market Requirements” or “Customer Requirements” are those requirements that the organization has decided to embrace to serve the “Target Market” (see Fig. 2). Market/customer requirements can enter the organization as “verbatim” statements from many stakeholders (end users, domain experts, senior management, sales people, etc.). Market/customer requirements can be expressed in a similar fashion to market needs but at a lower level of granularity, as shown below.

Format: [As a {User} I need to {qualifier} so that I can {reason}]
Example: As a Utilities Engineer I need to efficiently view the settings of IEDs so that I can reduce the effort of making changes in the feeders’ settings in a substation.

D. Technical Requirements

Technical requirements are written in the developer’s language and describe a technical solution to the market requirements. Technical requirements must be traced back to higher-level requirements (see Fig. 2). The elements of the Volere Shell [10] can be used to define the technical requirements. The primary elements identified in the Volere Shell for technical requirements include: (a) Requirement #; (b) Requirement Type #; (c) Market requirement #; (d) Requirement Description; (e) Rationale; (f) Requirement Source; (g) Fit Criterion; (h) Dependencies; (i) Conflicts; (j) Supporting Materials; (k) History.

Format: [The system shall allow {User} to {action} {measure}]
Example: The IED Configuration System shall allow the Utilities Engineer to define IED settings and enter a description of the setting in less than 10 seconds.
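The eleven Volere Shell elements listed above map naturally onto a record type. The following Python sketch shows one possible representation; the class name, field names, identifier scheme, and the rationale and source values are hypothetical, while the description and fit criterion are taken from the example above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TechnicalRequirement:
    """One record per technical requirement, mirroring elements (a) to (k)."""
    requirement_id: str            # (a) Requirement #
    requirement_type: str          # (b) Requirement Type #
    market_requirement_id: str     # (c) Market requirement # (traceability link)
    description: str               # (d) Requirement Description
    rationale: str                 # (e) Rationale
    source: str                    # (f) Requirement Source
    fit_criterion: str             # (g) Fit Criterion
    dependencies: List[str] = field(default_factory=list)          # (h)
    conflicts: List[str] = field(default_factory=list)             # (i)
    supporting_materials: List[str] = field(default_factory=list)  # (j)
    history: List[str] = field(default_factory=list)               # (k)


# Identifiers, rationale, and source below are made up for illustration.
req = TechnicalRequirement(
    requirement_id="TR-1",
    requirement_type="functional",
    market_requirement_id="MR-1",
    description=(
        "The IED Configuration System shall allow the Utilities Engineer "
        "to define IED settings and enter a description of the setting."
    ),
    rationale="Reduce the effort of configuring IEDs in a substation.",
    source="Utilities Engineer",
    fit_criterion="Completed in less than 10 seconds.",
)
print(req.requirement_id, "traces to", req.market_requirement_id)
```

Keeping the market requirement identifier as a mandatory field enforces the rule that every technical requirement traces back to a higher-level requirement.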

VIII. ONE SOFTWARE ESTIMATION TOOLKIT (ONSET)

ONSET can be implemented as an institutionalized estimation process, which must be defined so that it is not adjustable on an estimate-by-estimate basis but applies equally to all software development projects. The process is institutionalized because it is defined before any estimate is created and it is used by the whole organization. ONSET contains the set of estimation principles and estimation methods that an organization has agreed to employ for estimation purposes. Fig. 3 below shows the primary elements of an institutionalized estimation process. The elements of ONSET can be subdivided into inputs (technical scope, priorities, and constraints), methods and principles, historical data (if available), and outputs. The technical scope of the project is typically represented with a work breakdown structure (WBS). The only way to vary the outputs is to adjust the inputs. Once established, the methods, principles, and historical data must remain stable for all estimation activities.

Fig. 3. Institutionalized Estimation Process.

[Figure labels (requirements hierarchy): Market and Customer Needs Level; Market / Customer Requirements (Features, Functions, User Stories, etc.); System Requirements; Product Requirements; Software Product Component Requirements; Hardware Product Component Requirements; Technical Requirements]


IX. A CASE STUDY

This section discusses a real-world case study conducted in a Business Unit (BU) of a multi-national power and automation technologies research and development organization. The case study outlines the portion of the institutionalized estimation process used to produce the initial estimate of a software development project. The objective of the project was to enhance the functionality of the User Interface of an existing system that allows a Utilities Engineer to define the settings of intelligent electronic devices (IEDs) of a substation in a power distribution grid.

A. Needs and Features of UI for Substation System

Table II below shows the list of needed enhancements for the Substation System as defined at the initial stages of the project. Table II shows seven (7) user needs defined using the ontology presented in Section VI, Sub-section A.

TABLE II. CUSTOMER NEEDS FOR ENHANCED SUBSTATION SYSTEM

Table III shows the list of features derived from the customer needs summarized in Table II at the initial product definition stage of the project. The seven (7) features were defined using the ontology presented in Section VI, Sub-section B.

B. An Institutionalized Process for Initial Estimates

The software development BU in this case study has developed an Institutionalized Estimation Process that includes estimation at the following stages of the development lifecycle: (a) Initial Product Definition; (b) Requirements Specification Complete; and (c) Detailed Design Complete. This case study primarily focuses on (a) and on the methods and principles employed to derive initial estimates, using the Features shown in Table III as raw material. As this BU did not have any reliable historical data to apply to this specific project, the following estimation methods were employed to derive the initial estimates for this enhancement project: (i) Planning Poker; (ii) Modified Wideband Delphi method; and (iii) Monte Carlo Simulation using a Triangular distribution. Additionally, all principles presented in Section III were applied in this case study. The following sections of the paper discuss each of the three estimation methods in detail. The estimation team in the BU that participated in the estimation activities using all three methods included the Product Manager, Project Manager, developers, testers, software architect, and Marketing Manager.

TABLE III. FEATURES FOR ENHANCED SUBSTATION SYSTEM

C. Planning Poker Method for Initial Estimation

Planning Poker is considered a top-down and structured expert judgment estimation method, useful at early stages of the development lifecycle, in which size is estimated and effort is computed based on team velocity. The method classifies the sizes of the features in the WBS relative to a selected baseline feature. After the sizes of the elements in the WBS have been estimated, the effort for each feature is calculated from the team velocity. Team velocity refers to the amount of time a development team employs to implement the points associated with the baseline feature. The following steps were followed to develop the estimates shown in Table IV.

1) After the estimation team has identified the main features of the WBS, a “baseline feature” is selected and it is used to compare the sizes and complexity of the remaining features in the WBS. If the organization has reliable historical data, the baseline feature can be selected from the historical database. If not, as in our

Defined Customer Needs

A user in a Utility needs to efficiently define the settings in intelligent electronic devices (IEDs) in substations
A user in a Utility needs to save the settings of intelligent electronic devices (IEDs) in feeders in substations
A user in a Utility needs to efficiently load the values of intelligent electronic devices (IEDs) of feeders in a substation
A user in a Utility needs to edit the settings of intelligent electronic devices (IEDs)
A user in a Utility needs to efficiently manage the feeders' settings in intelligent electronic devices (IEDs) in substations
A user in a Utility needs to print a simplified setting report for a selected intelligent electronic device (IED)
A user in a Utility needs to export the settings of intelligent electronic devices (IEDs) in feeders

No. | Feature Title | Feature Description
1 | Specify IED settings | As a Utilities Engineer I need to quickly specify the settings of IEDs so that I can reduce the customization effort of feeders in a substation
2 | Save IED settings | As a Utilities Engineer I need to efficiently save the settings of IEDs so that I can reduce the customization effort of feeders in a substation
3 | Load IED settings | As a Utilities Engineer I need to rapidly load the values of the settings of IEDs so that I can reduce the customization effort of feeders in a substation
4 | Edit IED settings | As a Utilities Engineer I need to efficiently edit the settings of IEDs so that I can reduce the customization effort of feeders in a substation
5 | Manipulate IED settings | As a Utilities Engineer I need to efficiently manipulate (view, hide, set editable) the settings of IEDs so that I can reduce the effort of configuring IEDs in feeders in a substation
6 | Print IED settings | As a Utilities Engineer I need to efficiently print a summary of IED settings so that I can reduce the effort of preparing reports
7 | Export IED settings | As a Utilities Engineer I need to quickly export the settings of IEDs to spreadsheets and other MS Word documents so that I can reduce the effort of preparing reports


present case, the estimation team selects the baseline feature by identifying a feature in the WBS that seems of medium size and with medium complexity, and assigns it a number (feature points) in the middle of the range that the team expects to use. The series of feature points can be selected in many ways, but the following set of points has been successfully used in the past: (1, 2, 3, 5, 8, 13, 20, 30, 40, 50, 70, 90, and 100) [1]. After the baseline feature has been selected and assigned a number of points, the team discusses and documents all associated assumptions (see Table IV, first row, columns 1, 2, and 6).

2) The remaining features in the WBS are now compared in terms of size and complexity with the baseline feature and assumptions are documented for each feature (see Table IV, all remaining rows, and columns 1, 2, and 6).

3) Once the estimation team has completed discussions and assigned points to each element of the WBS, the team has a discussion to ensure that major tasks (such as develop system architecture, integration testing, system documentation, etc.) have been considered. If there is any task remaining, add this task to the elements of the WBS and estimate its size in feature points (no additional tasks added in Table IV).

4) As previously discussed, the team velocity in the project is the time that the development team requires to develop a certain number of feature points. It is recommended to use historical data if it is available in the organization and is considered reliable. Otherwise, as in the current case, the team can estimate the team velocity based on past experience. Estimates of velocity should be given as a range that reflects the uncertainty inherent in the estimate, as shown by the Cone of Uncertainty. Team velocity is the critical quantity to adjust as the project evolves: for the most part, the points assigned to the features of the WBS should not be changed throughout the project, whereas the team velocity acts as the equalizer and can be revised (see Table IV, column 4).

5) With these data points, the estimation team can then add all the effort numbers and obtain the overall project effort (see Table IV, column 4, and row 9).

6) Finally, depending on the stage of the development lifecycle that the project has reached, use the Cone of Uncertainty shown in Fig. 1 and the values of Table I to determine the lower and upper range values for the effort estimates. In our case, the lifecycle stage was the “Initial Product Definition” and hence the lower bound multiplier for velocity is 0.5 and the upper bound multiplier is 2.0 (see Table IV, columns 3 and 5).
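The arithmetic behind steps 4 to 6 can be sketched in a few lines of Python. This is an illustrative sketch rather than the tooling used in the case study: the function and constant names are hypothetical, and the most-likely velocity of 2 person-days per feature point is an assumption inferred from the ratios in Table IV (for example, 5 feature points mapping to 10 person-days).

```python
# Feature sizes in feature points, keyed by feature number (from Table IV).
FEATURE_POINTS = {1: 5, 2: 3, 3: 8, 4: 8, 5: 13, 6: 8, 7: 5}

# Assumption: most-likely velocity of 2 person-days per feature point,
# inferred from Table IV rather than stated explicitly in the text.
MOST_LIKELY_VELOCITY = 2.0

# Cone of Uncertainty multipliers for the "Initial Product Definition" stage.
LOWER_MULTIPLIER = 0.5
UPPER_MULTIPLIER = 2.0


def planning_poker_effort(points, velocity, lower, upper):
    """Return {feature: (best, most_likely, worst)} effort in person-days."""
    rows = {}
    for feature, size in points.items():
        most_likely = size * velocity
        rows[feature] = (most_likely * lower, most_likely, most_likely * upper)
    return rows


def column_totals(rows):
    """Sum the best, most-likely, and worst columns over all features."""
    return tuple(sum(column) for column in zip(*rows.values()))


rows = planning_poker_effort(
    FEATURE_POINTS, MOST_LIKELY_VELOCITY, LOWER_MULTIPLIER, UPPER_MULTIPLIER
)
for feature, (best, likely, worst) in rows.items():
    print(f"Feature {feature}: {best:g} / {likely:g} / {worst:g} person-days")
print("Totals (best, most likely, worst):", column_totals(rows))
```

Keeping the feature points fixed and passing the velocity as a parameter mirrors the recommendation in step 4 that velocity, not size, is the value to revise as the project evolves.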

D. Modified Wideband Delphi Method for Initial Estimation

Wideband Delphi is a structured group estimation technique. Used appropriately, it is a technique employed by higher-maturity software development organizations. It is important that historical data is stored, that size estimation is kept at the forefront, and that effort and cost estimates are derived from the size estimates. Wideband Delphi is considered a bottom-up and structured expert judgment estimation method, useful at both early and later stages of the development lifecycle, in which size is estimated and effort is computed based on team velocity. The technique allows the group to discuss its estimates and improve them in a structured meeting led by a facilitator. The following steps were followed to develop the estimates shown in Table VII.

TABLE IV. RESULTS OF PLANNING POKER ESTIMATION

1) A Delphi facilitator works with the estimation team to define the baseline feature against which each of the technical requirements in the project will be compared. If there is a historical list of accepted baseline features classified by type (e.g. functional algorithmic, functional user interface, functional database related, non-functional, hardware, etc.), then this list is used together with the associated feature point sizes and assumptions. Moreover, historical data can provide the typical effort that a development team takes to implement one feature point. If no historical data exists, the estimating team identifies the baseline feature that will be used to compare the sizes and complexities of the features to be estimated. The team also needs to estimate the level of effort (in person-time) needed to implement one feature point. Table VII shows Feature 1 as the baseline feature (shaded).

2) The Delphi Facilitator presents the group of experts with the description of the selected baseline feature. The assumptions made for the baseline feature are also discussed. Note that the Assumptions are not included in Table VII, as they are the same as in Table IV.

3) The Delphi Facilitator presents the group of experts with the description of the feature in the WBS to be estimated, and guides the team in comparing the technical requirement being estimated with the size and complexity of the selected baseline feature.

Feature | Feature Points (size) | Velocity p-days Best Case (effort) | Velocity p-days Most Likely (effort) | Velocity p-days Worst Case (effort) | Assumptions
1 (Baseline Feature) | 5 | 5 | 10 | 20 | A list of possible settings is available
2 | 3 | 3 | 6 | 12 | A settings template exists
3 | 8 | 8 | 16 | 32 | A settings template exists
4 | 8 | 8 | 16 | 32 | Values are available in XRIO file
5 | 13 | 13 | 26 | 52 | A hierarchy schema has been defined
6 | 8 | 8 | 16 | 32 | Values are available in XRIO file
7 | 5 | 5 | 10 | 20 | The exported file is a .csv file
Totals | 50 | 50 | 100 | 200 |


Each team member, in an anonymous way, provides a single most likely estimate of the size of the feature and arguments or assumptions behind the estimate. This step is followed for each of the features identified in the WBS.

4) The Facilitator prepares a summary of the size estimates showing the different estimates and presents it to the group for discussion. The participants see how their estimates compare with other estimators’ estimates.

5) Estimators vote anonymously on whether they want to accept the average size estimate as the Most-likely estimate for each feature. If the estimate is accepted, the estimators document the assumptions behind it. If any of the estimators votes no, the team goes back to step 3.

6) Estimators discuss the Most-likely estimate and vote to provide Best-case (BC) and Worst-case (WC) size estimates for the feature (see Table VII, columns 1-3, and rows 1-7).

7) For each feature, compute the Expected Case Estimate (ECE) with equation (1), where MLC is the most likely case estimate:

ECE = [BC + (3 * MLC) + (2 * WC)] / 6 (1)

Studies have shown that estimators using the Wideband Delphi method tend to produce optimistic “Most-likely” estimates, which can lead to optimistic overall estimates. Equation (1) is a slightly altered version of the usual weighted average that gives the worst case additional weight to account for this “optimism” (see Table VII, column 7).

8) For each feature in the WBS compute the Standard Deviation (STD) using equation (2) (see Table VII, column 5).

STD = [WorstCase – BestCase] / 1.4 (2)

9) Using the divisor of 1.4 statistically implies that the estimators’ ranges between Best-case and Worst-case will include the actual outcome 50% of the time. To increase this to 70% of the time, the divisor in the equation must be changed from 1.4 to 2.1. Table V shows the divisors to be used when calculating standard deviations. Compute the Variance (VAR) of each feature using equation (3) and the Total Variance (TVAR) over all n features using equation (4) (see Table VII, column 6).

VAR = [STD] ** 2 (3)

$TVAR = \sum_{i=1}^{n} VAR_i$ (4)

10) Compute the Aggregate Standard Deviation (ASTD) using equation (5) (see Table VII, row 10).

ASTD = [TVAR] ** 0.5 (5)

11) Compute the 90% Percentage Confident Estimate (PCEST) using equation (6) (see Table VII, row 11). Table VI shows the percentage confidence values obtained from the expected case and the aggregate standard deviation. Using the factor 1.28 means that the PCEST is an estimate with 90% confidence.

PCEST = [ECE + (1.28 * ASTD)] (6)

TABLE V. STANDARD DEVIATION FACTORS

TABLE VI. PERCENTAGE CONFIDENCE VALUES

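12) Compute the overall effort estimate by multiplying the PCEST (see Table VII, row 11) by the velocity of the development team (see Table VII, row 12); the result is shown in Table VII, row 13. Here the velocity of 0.5 feature points per person-day corresponds to 2 person-days per feature point, so 78.7 feature points yield about 157 person-days.

13) Table VII shows the results of the estimates carried out by the Business Unit estimation team using the modified Wideband Delphi method.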
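Steps 7 to 12 can be followed end to end with the short Python sketch below, which applies equations (1) to (6) to the best-case, most-likely, and worst-case sizes reported in Table VII. The function and constant names are hypothetical, and the conversion of 2 person-days per feature point is an assumption taken from the team velocity row of Table VII (0.5 feature points, read here as per person-day).

```python
import math

# (best case, most likely case, worst case) sizes in feature points, Table VII.
SIZES = {
    1: (3, 5, 8),
    2: (1, 3, 8),
    3: (3, 8, 13),
    4: (5, 8, 13),
    5: (5, 13, 20),
    6: (3, 8, 13),
    7: (5, 5, 8),
}

STD_DIVISOR = 1.4         # Table V divisor used in equation (2)
CONFIDENCE_FACTOR = 1.28  # Table VI factor for 90% confidence, equation (6)
# Assumption: 0.5 feature points per person-day, i.e. 2 person-days per point.
PERSON_DAYS_PER_POINT = 2.0


def expected_case(bc, mlc, wc):
    """Equation (1): weighted average that leans toward the worst case."""
    return (bc + 3 * mlc + 2 * wc) / 6


def standard_deviation(bc, wc, divisor=STD_DIVISOR):
    """Equation (2): range between worst and best case over the divisor."""
    return (wc - bc) / divisor


total_ece = sum(expected_case(*sizes) for sizes in SIZES.values())
# Equations (3) and (4): per-feature variance, summed over all features.
tvar = sum(standard_deviation(bc, wc) ** 2 for bc, _, wc in SIZES.values())
astd = math.sqrt(tvar)                        # equation (5)
pcest = total_ece + CONFIDENCE_FACTOR * astd  # equation (6)
effort = pcest * PERSON_DAYS_PER_POINT        # step 12

print(f"Total ECE: {total_ece:.2f} feature points")
print(f"TVAR: {tvar:.1f}, ASTD: {astd:.2f}")
print(f"PCEST (90%): {pcest:.1f} feature points")
print(f"Overall effort: {effort:.0f} person-days")
```

Variances are summed before taking the square root because standard deviations of the individual features cannot be added directly; only their variances can.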

If this % of actual outcomes fall within estimation range . . . | then use this factor as a divisor in the STD calculation
10% | 0.25
20% | 0.51
30% | 0.77
40% | 1.0
50% | 1.4
60% | 1.7
70% | 2.1
80% | 2.6
90% | 3.3
99.7% | 6.0

Percentage Confidence | Calculation
2% | EC – (2 * STD)
10% | EC – (1.28 * STD)
16% | EC – (1 * STD)
20% | EC – (0.84 * STD)
25% | EC – (0.67 * STD)
30% | EC – (0.52 * STD)
40% | EC – (0.25 * STD)
50% | EC
60% | EC + (0.25 * STD)
70% | EC + (0.52 * STD)
75% | EC + (0.67 * STD)
80% | EC + (0.84 * STD)
84% | EC + (1 * STD)
90% | EC + (1.28 * STD)
98% | EC + (2 * STD)
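The factors in Tables V and VI appear to track the quantiles of a standard normal distribution closely. This is an observation about the numbers, not a derivation given in the paper. Under that assumption, the sketch below recomputes approximate values with Python's standard `statistics.NormalDist`; the function names are hypothetical, and the results agree with the tables only up to the rounding used there.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def range_divisor(coverage):
    """Width, in standard deviations, of the central interval that contains
    the given fraction (0 < coverage < 1) of a normal distribution.
    Comparable to the divisors listed in Table V."""
    return 2 * STANDARD_NORMAL.inv_cdf((1 + coverage) / 2)


def confidence_factor(confidence):
    """Number of standard deviations to add to the expected case so that the
    given fraction of a normal distribution lies at or below the result.
    Comparable to the factors listed in Table VI."""
    return STANDARD_NORMAL.inv_cdf(confidence)


for coverage in (0.5, 0.7, 0.9):
    print(f"{coverage:.0%} range: divisor of about {range_divisor(coverage):.2f}")
for confidence in (0.1, 0.75, 0.9):
    print(f"{confidence:.0%} confidence: factor of about "
          f"{confidence_factor(confidence):+.2f}")
```

Read this way, a wider claimed coverage for the Best-case to Worst-case range calls for a larger divisor, which is why moving from 50% to 70% in step 9 changes the divisor from 1.4 to 2.1.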


TABLE VII. RESULTS OF MODIFIED WIDEBAND DELPHI ESTIMATION

E. Monte Carlo Method for Initial Estimation

Monte Carlo is a stochastic technique based on the use of random numbers and probabilistic approaches that can also be used to derive initial estimates. Monte Carlo methods have been used to model business phenomena that have a high degree of uncertainty. The following steps were used to derive the estimates presented in Table VIII with the Monte Carlo method.

TABLE VIII. EFFORT CALCULATIONS FROM TABLE IV

1) Using the values in Table IV, columns 2-4, and lines 1-7, construct Table VIII.

2) Inputs were generated using random numbers and the Triangular Probability Distribution for each Feature, mapping the three inputs (Best, Most Likely, and Worst cases) to the Triangular Distribution.

For each feature, 12,500 random entries were generated from the Triangular Distribution. For example, the first set of entries was: (a) feature 1 = 12.69; (b) feature 2 = 10.97; (c) feature 3 = 39.12; (d) feature 4 = 21.53; (e) feature 5 = 25.79; (f) feature 6 = 10.44; and (g) feature 7 = 11.52. The algorithm then combined these entries into a project total of 132.10 person/days of effort. This calculation was performed 12,500 times and the results were aggregated as shown in Fig. 4.

Fig. 4. Monte Carlo Estimation Output
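A simulation of this kind can be sketched with Python's standard library. The tool actually used in the case study is not identified in the text, so this is only an illustration under stated assumptions: the function name and the fixed seed are hypothetical, each feature's effort is drawn independently from a triangular distribution defined by the Best, Most Likely, and Worst values of Table VIII, and each simulation run sums the seven draws into one project total.

```python
import random
import statistics

# (best, most likely, worst) effort in person-days per feature, Table VIII.
EFFORT_RANGES = [
    (5, 10, 20),
    (3, 6, 12),
    (8, 16, 32),
    (8, 16, 32),
    (13, 26, 52),
    (8, 16, 32),
    (5, 10, 20),
]
SIMULATIONS = 12_500  # number of runs reported in the case study


def simulate_totals(ranges, runs, seed=None):
    """Return one simulated project total (person-days) per run."""
    rng = random.Random(seed)  # the seed is an assumption, for repeatability
    totals = []
    for _ in range(runs):
        # Random.triangular takes its arguments as (low, high, mode).
        totals.append(
            sum(rng.triangular(best, worst, likely)
                for best, likely, worst in ranges)
        )
    return totals


totals = simulate_totals(EFFORT_RANGES, SIMULATIONS, seed=2011)
deciles = statistics.quantiles(totals, n=10)  # nine cut points: 10% ... 90%

print(f"Median: {statistics.median(totals):.1f} person-days")
print(f"10th percentile: {deciles[0]:.1f} person-days")
print(f"90th percentile: {deciles[-1]:.1f} person-days")
```

Exact figures depend on the random draws, so a run of this sketch should be expected to land near, not exactly on, the values reported for Fig. 4.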

F. Analysis of Results

Table IX presents the summary of the estimation results from the three methods (Planning Poker, modified Wideband Delphi, and Monte Carlo). The Planning Poker (PP) method identified that the total expected size of the project was 50 feature points.

TABLE IX. SUMMARY OF RESULTS FOR ESTIMATION METHODS

The modified Wideband Delphi method calculated a total effort for this project of 157 person/days with 90% confidence. The Monte Carlo method estimated the project at 134 person/days. The team therefore took the average of these two results, giving an estimated effort for this project of (157 + 134) / 2 = 145.5 ≈ 146 person-days, as shown in Table X.

TABLE X. OVERALL ESTIMATION RESULTS

Table XI shows the size estimate in feature-points for each feature from the results obtained in Table VII (column ECE).

Feature | BC Size | MLC Size | WC Size | STD | VAR | ECE
1 (baseline) | 3 | 5 | 8 | 3.57 | 12.76 | 5.67
2 | 1 | 3 | 8 | 5 | 25 | 4.33
3 | 3 | 8 | 13 | 7.14 | 51.02 | 8.83
4 | 5 | 8 | 13 | 5.71 | 32.65 | 9.17
5 | 5 | 13 | 20 | 10.71 | 114.8 | 14
6 | 3 | 8 | 13 | 7.14 | 51.02 | 8.83
7 | 5 | 5 | 8 | 2.14 | 4.59 | 6
Totals | 25 | 50 | 83 | | 291.8 | 56.83
ASTD | 17.08
PCEST | 78.7
Team Velocity | 0.5 feature points
Overall Effort Estimate 90% Confidence | 157 person/days

Features | Velocity p/days Best Case (effort) | Velocity p/days Most Likely (effort) | Velocity p/days Worst Case (effort)
1 | 5 | 10 | 20
2 | 3 | 6 | 12
3 | 8 | 16 | 32
4 | 8 | 16 | 32
5 | 13 | 26 | 52
6 | 8 | 16 | 32
7 | 5 | 10 | 20

[Fig. 4 histogram: horizontal axis in person/days, with tick labels running from 84 to 148.]

Monte Carlo Histogram of 'Enhanced User Interface Substation System' with 7 lines occurring. Median is 116 (person/days), 10% lowest value is 100 (person/days), 90% highest value is 134 (person/days). Created with 12500 simulations on 10/17/2011.

Total Size feature / points | BC p/d | WC p/d | Effort 90% Conf. p/days
50 | 50 | 200 | 146

Method | Total Size feat. / points | BC p-d | MLC p-d | WC p-d | ECE f-p | Effort 90% Conf. p-days
PP | 50 | 50 | 100 | 200 | |
WD | 50 | | | | 56.83 | 157
MC | | | | | | 134


TABLE XI. SIZE ESTIMATION RESULTS FOR EACH FEATURE

Summing the ECE values of the seven features in Table XI gives Total ECE = $\sum_{k=1}^{7} F_k$ = 56.83 feature points, which, multiplied by the team velocity, results in 113.56 person/days. A multiplier for the individual features is then computed as 146/113 = 1.292035. Table XII shows the final estimates for each feature for a total of 146 person-days effort.

TABLE XII. FINAL EFFORT ESTIMATES PER FEATURE AND TOTAL
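The scaling step can be reproduced with the sketch below, which spreads the agreed 146 person-days over the features in proportion to their ECE sizes. It is an illustration under stated assumptions: the names are hypothetical, the conversion of 2 person-days per feature point is taken from the team velocity used earlier, and rounding each scaled size to a whole feature point is an inference from the values listed in Table XIII. Direct multiplication of 56.83 by 2 gives about 113.66 rather than the 113.56 quoted in the text; the multiplier uses the whole number 113 in either case.

```python
# ECE size per feature in feature points (Table XI).
ECE_POINTS = [5.67, 4.33, 8.83, 9.17, 14, 8.83, 6]

# Assumption: 2 person-days per feature point, as in the earlier estimates.
PERSON_DAYS_PER_POINT = 2.0

# Combined estimate agreed by the team (average of Wideband Delphi and
# Monte Carlo results), in person-days.
TARGET_EFFORT = 146

raw_effort = sum(ECE_POINTS) * PERSON_DAYS_PER_POINT
# The text divides by the whole number 113, so the raw effort is truncated.
multiplier = TARGET_EFFORT / int(raw_effort)

# Assumption: scaled sizes are rounded to whole feature points.
adjusted_points = [round(size * multiplier) for size in ECE_POINTS]
adjusted_effort = [size * PERSON_DAYS_PER_POINT for size in adjusted_points]

print(f"Multiplier: {multiplier:.6f}")
print("Adjusted sizes (feature points):", adjusted_points)
print("Adjusted effort (person-days):", adjusted_effort)
print("Total effort:", sum(adjusted_effort))
```

Scaling every feature by the same multiplier preserves the relative sizes agreed during estimation while making the per-feature figures add up to the overall estimate.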

G. Historical Data Collection

As described earlier in this paper, the development business unit did not have reliable historical data associated with the development of the features in this project. For this reason, a greater reliance on judgment was required. At the end of the project, the size estimates had remained consistent with the actual sizes, whereas the team velocity showed some discrepancies from the actual values. Table XIII shows the comparative values of the estimates and the actuals. The data presented in Table XIII was collected to be used in future projects as historical data. The size of the baseline feature is shown in the shaded row. The average velocity for the team was recorded as 2.6 days per feature point (using the actual effort column) instead of 2.0 days per feature point. The historical data also included a description of the team profile in terms of the team's level of expertise in .NET development for this type of application.
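The comparison in Table XIII can be recomputed as follows. The sketch assumes the usual definition of the magnitude of relative error, MRE = |actual - estimate| / actual, which reproduces the tabulated values but is not spelled out in this section. It also assumes that the recorded velocity of 2.6 is the mean of the per-feature ratios of actual effort to size, which is an inference from the numbers. All names are hypothetical.

```python
# Per-feature values from Table XIII (features 1 to 7).
SIZE_POINTS = [7, 6, 11, 12, 18, 11, 8]             # size, estimate = actual
ESTIMATED_EFFORT = [14, 12, 22, 24, 36, 22, 16]     # person-days
ACTUAL_EFFORT = [17.5, 15, 27.5, 36, 54, 27.5, 20]  # person-days


def mre(estimate, actual):
    """Magnitude of relative error (assumed definition, see lead-in)."""
    return abs(actual - estimate) / actual


effort_mre = [
    round(mre(estimate, actual), 2)
    for estimate, actual in zip(ESTIMATED_EFFORT, ACTUAL_EFFORT)
]
total_mre = mre(sum(ESTIMATED_EFFORT), sum(ACTUAL_EFFORT))

# Actual velocity per feature, in person-days per feature point.
velocities = [
    actual / size for actual, size in zip(ACTUAL_EFFORT, SIZE_POINTS)
]
mean_velocity = sum(velocities) / len(velocities)

print("Effort MRE per feature:", effort_mre)
print(f"Effort MRE on totals: {total_mre:.2f}")
print(f"Mean actual velocity: {mean_velocity:.1f} person-days per point")
```

Recording size, estimated effort, and actual effort per feature is what allows a later project to replace the judgment-based velocity with a measured one.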

X. CONCLUSIONS

The initial estimate is a very important stage in the development lifecycle of software-intensive products, as it presents an opportunity to discuss the business and technical perspectives of the project and to manage the organizational risk in the project when the uncertainty is at its highest level. This case study has demonstrated the value of using several estimation methods and sound estimation principles to derive the initial estimate for a development project. An important aspect of the estimation process is the point of departure, which is determined by the development of a sound work breakdown structure. The WBS represents the main raw material used to develop estimates, and this paper outlines a practical way to state the elements in a WBS. Even if an organization does not have reliable historical data, a combination of judgment-based and model-based methods can be used to derive the estimates. It is essential to develop size estimates first and then compute the effort estimates using the development team velocity. This paper suggested a list of historical data items that can be collected for use in the estimates of future projects.

TABLE XIII. COMPARATIVE VALUES OF ESTIMATES VS ACTUAL

REFERENCES

[1] M. Cohn, Agile Estimating and Planning, Prentice Hall, Robert C. Martin Series, 2006.

[2] L. A. Galorath, “Software estimation- an introduction”, in Proceedings of the Second IEE Conference on Automotive Electronics, 2006, pp. 101–118.

[3] M. V. Genuchten, “Why is Software Late? An Empirical Study of the Reasons for Delay in Software Development”, IEEE Transactions on Software Engineering, vol. 17, no. 6, pp. 582-590, June, 1991.

[4] M. Jørgensen, B. Boehm, and S. Rifkin, “Software development effort estimation: Formal models or expert judgment?”, IEEE Software, vol. 26, no. 2, 2009, pp. 14–19.

[5] L. Laird, “The limitations of estimation”, IT Professional, vol. 8, no. 6, 2006, pp. 40–45.

[6] S. McConnell, Software Estimation: Demystifying the Black Art (Best Practices (Microsoft)), Microsoft Press, 2006.

[7] A. F. Minkiewicz, “The evolution of software size: A search for value”, CrossTalk: The Journal of Defense Software Engineering, March-April, 2009, pp. 23-26.

[8] D. Muir, Estimation for the savvy project manager. http://www.spc.ca/-index.htm, 2009.

[9] M. Nasir, “A Survey of Software Estimation Techniques and Project Planning Practices”, Proceedings of the Seventh IEEE ACIS Intl. Conf. on Software Engineering, AI, Networking, and Parallel Distributed Computing (SNDP’06), 2006.

[10] S. Robertson and J. Robertson, Mastering the Requirements Process, Addison-Wesley, 1999.

Feat | Size Estimate | Size Actual | Effort Estimate | Effort Actual | MRE Size | MRE Effort
1 | 7 | 7 | 14 | 17.5 | 0 | 0.2
2 | 6 | 6 | 12 | 15 | 0 | 0.2
3 | 11 | 11 | 22 | 27.5 | 0 | 0.2
4 | 12 | 12 | 24 | 36 | 0 | 0.33
5 | 18 | 18 | 36 | 54 | 0 | 0.33
6 | 11 | 11 | 22 | 27.5 | 0 | 0.2
7 | 8 | 8 | 16 | 20 | 0 | 0.2
Total | 73 | 73 | 146 | 197.5 | 0 | 0.26

Feat. 1 | Feat. 2 | Feat. 3 | Feat. 4 | Feat. 5 | Feat. 6 | Feat. 7
5.67 | 4.33 | 8.83 | 9.17 | 14 | 8.83 | 6

Feat. 1 | Feat. 2 | Feat. 3 | Feat. 4 | Feat. 5 | Feat. 6 | Feat. 7 | Total
7 | 6 | 11 | 12 | 18 | 11 | 8 | 146
