se 381 - lec 25 - 32 - 12 may29 - program size and cost estimation models

SE-381 Software Engineering

BEIT - V, Lecture # 25

(Program Size and Cost Estimation Models)

Metrics - Program Size Estimation

– Program size is a measure of the effort and time required to develop the product

– Two prevalent metrics in use are
  • Lines of Code
  • Function Points

– Lines of Code (LOC)
  • Historically the oldest metric; its use evolved and spread because organisations hold historical LOC data
  • Simplest, and so popular
  • Source lines of code are counted; comments and headers are left out

Lines of Code
• Cons
  – LOC is available only when the product is ready, and is difficult to estimate before the start of software development (SD)
  – Favors novice programmers, whose verbose code yields a higher count than the compact code written by experienced programmers
  – Biased towards the programming language used
  – Reflects only the coding phase, which is a fraction of the SD process
  – Complexity is not addressed
  – Penalizes re-usability

Function Point
– The Function Point (FP) metric has been in use since the mid-1970s and was published by Albrecht in 1983; it addresses the shortcomings of LOC
– The FP metric can estimate program size directly from the SRS
– FP is based on the concept that the size of software depends directly on the number of different functions and features it supports
– Each feature, when invoked, reads input data and transforms it to output data; Albrecht proposed to include the number of files and the number of interfaces as well
– FP is computed in two steps: in the first step the Unadjusted Function Points (UFP) are computed, then these are corrected

UFP = (no of inputs)*w1 + (no of outputs)*w2 + (no of inquiries)*w3 + (no of files)*w4 + (no of interfaces)*w5

where the weights w1 to w5 depend on the complexity level of the program. According to Rajib Mall (2005), these weights are 4, 5, 4, 10 and 10 respectively. In general they are given in the table below.

Computation of Function Points

Description             Level of Information Processing Function            Total
                        Simple           Average           Complex
External Input          ___ x 3 = ___    ___ x 4 = ___     ___ x 6 = ___     _____
External Output         ___ x 4 = ___    ___ x 5 = ___     ___ x 7 = ___     _____
Logical Internal File   ___ x 7 = ___    ___ x 10 = ___    ___ x 15 = ___    _____
Ext. Interface File     ___ x 5 = ___    ___ x 7 = ___     ___ x 10 = ___    _____
External Inquiry        ___ x 3 = ___    ___ x 4 = ___     ___ x 6 = ___     _____
Total Unadjusted Function Points (UFP)                                       _____
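As a minimal sketch of how the worksheet above is filled in, the following Python multiplies each count by the weight for its complexity level and adds the results. The weights are the ones in the table; the function name, dictionary keys and the example counts are hypothetical and chosen only for illustration.

```python
# Weights from the worksheet: (simple, average, complex) for each function type.
WEIGHTS = {
    "external_input": (3, 4, 6),
    "external_output": (4, 5, 7),
    "logical_internal_file": (7, 10, 15),
    "external_interface_file": (5, 7, 10),
    "external_inquiry": (3, 4, 6),
}


def unadjusted_function_points(counts):
    """Sum count * weight over every function type and complexity level.

    counts maps a function type to (n_simple, n_average, n_complex).
    """
    total = 0
    for kind, per_level in counts.items():
        total += sum(n * w for n, w in zip(per_level, WEIGHTS[kind]))
    return total


# Hypothetical counts for an imaginary project (not taken from the lecture).
example_counts = {
    "external_input": (5, 10, 2),          # 15 + 40 + 12 = 67
    "external_output": (4, 6, 1),          # 16 + 30 + 7  = 53
    "logical_internal_file": (2, 3, 0),    # 14 + 30 + 0  = 44
    "external_interface_file": (1, 1, 0),  # 5 + 7 + 0    = 12
    "external_inquiry": (3, 2, 1),         # 9 + 8 + 6    = 23
}
ufp = unadjusted_function_points(example_counts)
print("UFP =", ufp)  # 67 + 53 + 44 + 12 + 23 = 199
```

Using only the "Average" column of the table for every item corresponds to the single set of weights 4, 5, 4, 10, 10 quoted from Rajib Mall (2005), once the weights are matched to the formula's order of inputs, outputs, inquiries, files and interfaces.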

Computation of FP

– In the second step, first Degree of Influence - DI, is computed considering fourteen possible factors, each having influence value varying from 0 - no influence to 5 - maximum influence, so DI can vary from 0 - 70

– The parameters considered for DI computation are shown in the table (on next slide)

– Technical Complexity Factor - TCF is computed using

TCF = 0.65 + 0.01 * DI

– TCF varies from 0.65 to 1.35, and to complete the second step FP is computed by

FP = UFP * TCF

ID   Characteristic               DI    ID    Characteristic       DI
C1   Data Communications          ---   C8    On-Line Update       ---
C2   Distributed Functions        ---   C9    Complex Processing   ---
C3   Performance                  ---   C10   Re-Usability         ---
C4   Heavily Used Configuration   ---   C11   Installation Ease    ---
C5   Transaction Rate             ---   C12   Operational Ease     ---
C6   On-Line Data Entry           ---   C13   Multiple Sites       ---
C7   End User Efficiency          ---   C14   Facilitate Change    ---
Total Degree of Influence                                          ---

TCF = 0.65 + 0.01 x (Total ‘Degree of Influence’)
FP = UFP * TCF

DI Values
Not Present, or No Influence = 0
Insignificant Influence = 1
Moderate Influence = 2
Average Influence = 3
Significant Influence = 4
Strong Influence, throughout = 5
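The second step can be sketched in a few lines of Python: sum the fourteen ratings to get DI, apply TCF = 0.65 + 0.01 * DI, and multiply by UFP. The function names, the UFP value of 199 and the ratings are hypothetical; only the two formulas and the 0 to 5 rating scale come from the slides.

```python
def technical_complexity_factor(ratings):
    """Return TCF = 0.65 + 0.01 * DI for the fourteen characteristic ratings."""
    if len(ratings) != 14:
        raise ValueError("expected one rating for each of C1..C14")
    if any(r not in range(6) for r in ratings):
        raise ValueError("each rating must be an integer from 0 to 5")
    degree_of_influence = sum(ratings)  # DI, between 0 and 70
    return 0.65 + 0.01 * degree_of_influence


def function_points(ufp, ratings):
    """Return FP = UFP * TCF."""
    return ufp * technical_complexity_factor(ratings)


# Hypothetical project: 199 UFP, every characteristic rated 'Average' (3).
ratings = [3] * 14
tcf = technical_complexity_factor(ratings)  # DI = 42, so TCF is about 1.07
fp = function_points(199, ratings)          # about 212.9
print(f"TCF = {tcf:.2f}, FP = {fp:.1f}")
```

With all ratings at 0 the factor is 0.65 and with all at 5 it is 1.35, so the adjustment can move the unadjusted count by at most 35 percent in either direction.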

FP Cons and Improvements
• Shortcomings
  – Allocation of parameters is subjective
  – Symons (1988) pointed out that the proposed FP analysis is based on two ‘intrinsic’ factors but does not include environmental factors such as project management risks, people skills, and the methods and tools used for development
    • His proposed method includes the influence of environmental factors
  – Algorithmic complexity is not taken into account
    • The Feature Point metric has been proposed to address this shortcoming

3 Components of System Size

Project Estimation Techniques

– After determining the project size, the effort to develop the software, the project duration and the cost are to be estimated

– These parameters help in winning the contract, as well as in resource planning, scheduling, monitoring and controlling the project

– Estimation techniques can be categorised as:
  • Empirical Estimation Techniques
  • Heuristic Techniques
  • Analytical Estimation Techniques

Empirical Estimation Techniques
– Based on an educated guess of the project parameters
– Prior experience of similar products is helpful
– Although based on common sense, the different activities involved in estimation have been formalised over the years
– First the estimates are guessed, and later, on completion of the project, these are calibrated, i.e. estimates are corrected to reflect the desired
– Two such formalisations are
  • Expert Judgement
  • Delphi Technique

Delphi Technique
• Non-consultative, group consensus technique
• Needs access to several experts
• Experts may be at one or more locations
• Operates under the control of a coordinator
• Steps in a typical Delphi process
  – Coordinator explains the task to the experts
  – Specifications are supplied to each expert
  – Each expert makes estimates anonymously
  – Coordinator consolidates the responses and circulates the summary
  – Each expert reacts to disagreements, giving reasons
  – This process iterates till agreement is reached

• The Wideband Delphi approach adds group discussion among the experts between estimation rounds to speed up the consensus process
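The iterative structure of the Delphi steps can be illustrated with a toy Python loop: the coordinator summarises the anonymous estimates, checks whether they agree closely enough, and otherwise asks for revised estimates. Everything numeric here is an assumption made for illustration: real experts revise using judgement, whereas this sketch simply moves each estimate part of the way toward the median, and the `pull` and `tolerance` parameters are hypothetical knobs that are not part of the Delphi technique.

```python
from statistics import median


def delphi_rounds(initial_estimates, pull=0.5, tolerance=0.10, max_rounds=20):
    """Repeat estimation rounds until the relative spread is within tolerance.

    Returns (consensus_estimate, rounds_used).
    """
    estimates = list(initial_estimates)
    for round_no in range(1, max_rounds + 1):
        # Coordinator consolidates the anonymous responses into a summary.
        mid = median(estimates)
        spread = (max(estimates) - min(estimates)) / mid
        if spread <= tolerance:
            return mid, round_no
        # Assumed revision rule: each expert moves part-way toward the median.
        estimates = [e + pull * (mid - e) for e in estimates]
    return median(estimates), max_rounds


# Hypothetical first-round effort estimates from four experts, in person-months.
consensus, rounds = delphi_rounds([30, 40, 50, 80])
print(f"consensus about {consensus:.0f} PM after {rounds} rounds")
```

The point of the sketch is the loop shape (collect, summarise, compare, repeat), not the revision rule, which stands in for the experts' reasoning.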

Heuristic Techniques

– Assumes that relationships among the different project parameters can be modelled using suitable mathematical expressions

– Once the basic (independent) parameters are known, the other (dependent) parameters can be determined by using the basic parameters in mathematical expressions

– Heuristic models can be divided into two classes:
  • Single variable estimation models
  • Multivariable estimation models

Single Variable Estimation Models

Multivariable estimation models

Analytical Estimation Techniques

Software Cost Estimation

• COCOMO (COnstructive COst-estimation MOdel)
  – A software cost and schedule estimating method that was developed by Barry W Boehm and documented in Software Engineering Economics [Boehm 1981].
  – The model is an empirically derived, nonproprietary cost-estimation model, based on a study by Boehm of 63 software development projects.

COCOMO accommodates three categories of software:

Organic
• Application programs: small and well understood; smaller development teams are needed and team members are experienced in developing similar programs

Semidetached
• Utility programs such as compilers and linkers; development teams are a mix of experienced staff and novices; team members may have limited experience of related systems and may be unfamiliar with some aspects of the system to be developed

Embedded
• System programs, including operating systems, that interact directly with the hardware and typically involve meeting timing constraints and concurrent processing. The developed software is strongly coupled to complex hardware, or stringent regulations on the operational procedures exist.

Effort and Development Time

• Effort is measured in PM – Person Months
  – One PM is the effort one person can put in over one month, taking into account the productivity loss due to holidays, weekly offs, coffee and prayer breaks etc.
  – One PM is 19 working days or 152 working hours
  – Conforms to engineers' assignments and deadlines, which are set in calendar months

• Development time is measured in months, i.e. calendar months

Three Levels of Cost Estimation

• According to Boehm, cost estimation should be done through three stages:
  – Basic COCOMO
  – Intermediate COCOMO
  – Complete COCOMO

• Basic COCOMO (Single variable model)

Effort = a1 * (KLOC)**a2 PM
Tdev = b1 * (Effort)**b2 months

Basic COCOMO (cont.)
– Effort in PM is the area under the person-month plot; 100 PM does NOT mean the effort put in by 100 people in one month or by one person over 100 months, which is a commonly held myth

– According to Boehm, every line of source code should be counted as one LOC, irrespective of the actual number of instructions on that line; some authors refer to this as DSI (Delivered Source Instructions)

             a1     a2     b1     b2
Organic      2.4    1.05   2.5    0.38
S-Detached   3.0    1.12   2.5    0.35
Embedded     3.6    1.20   2.5    0.32
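A minimal Python sketch of the Basic COCOMO equations, using the coefficients from the table. The function name, the dictionary keys and the 32 KLOC test size are hypothetical; the formulas Effort = a1 * (KLOC)**a2 and Tdev = b1 * (Effort)**b2 are the ones stated on the slides.

```python
# (a1, a2, b1, b2) for each software category, from the coefficient table.
COEFFICIENTS = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semidetached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(kloc, category):
    """Return (effort in person-months, development time in months)."""
    a1, a2, b1, b2 = COEFFICIENTS[category]
    effort = a1 * kloc ** a2
    tdev = b1 * effort ** b2
    return effort, tdev


# Compare the three categories for the same hypothetical size of 32 KLOC.
for category in COEFFICIENTS:
    effort, tdev = basic_cocomo(32, category)
    print(f"{category:13s} effort = {effort:6.1f} PM, Tdev = {tdev:4.1f} months")
```

Because the exponent a2 is greater than 1, doubling the size of an organic project multiplies the effort by 2**1.05, a little more than 2, while the exponent b2 below 1 makes the development time grow much more slowly than the effort.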

Correlations between variables

• Effort vs. Product Size
  – If Effort is plotted against program size for all three categories, Effort shows super-linear behavior, and the higher effort for greater complexity is reflected. That is, Embedded software needs higher effort than Organic software for the same product size

• Development Time vs. Size
  – Development Time is sub-linear in Size, because of parallel activities in the software development process

Effort vs. Product Size

Development Time vs. Size

Example – Basic COCOMO Calculations

Find Effort, Productivity (LOC per Person-Month), Development Time (in months) and Average Staffing (full-time staff personnel per month) for a project of Organic type with an estimated size of 128,000 Lines of Code.

• Effort = 2.4 * (KLOC)**1.05
         = 2.4 * (128)**1.05
         = 392 PM (person-months)
• Productivity = Size / Effort
               = 128,000 LOC / 392 PM
               = 327 LOC/PM
• Dev Time = 2.5 * (Effort)**0.38
           = 2.5 * (392)**0.38
           = 24 months
• Av. Staffing = Effort / Tdev
               = 392 PM / 24 months
               = 16 FSP
• FSP = Full-time-equivalent Staff Personnel
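The arithmetic of this example can be checked with a short Python script. It follows the slide's practice of rounding each result to a whole number before using it in the next step; the variable names are hypothetical.

```python
size_loc = 128_000          # estimated size in lines of code
kloc = size_loc / 1000      # 128 KLOC

# Organic-mode Basic COCOMO coefficients: a1 = 2.4, a2 = 1.05, b1 = 2.5, b2 = 0.38
effort = round(2.4 * kloc ** 1.05)       # person-months
productivity = round(size_loc / effort)  # LOC per person-month
tdev = round(2.5 * effort ** 0.38)       # calendar months
staffing = round(effort / tdev)          # full-time-equivalent staff

print(f"Effort       = {effort} PM")
print(f"Productivity = {productivity} LOC/PM")
print(f"Dev Time     = {tdev} months")
print(f"Av. Staffing = {staffing} FSP")
```

Run as written, the script gives 392 PM, 327 LOC/PM, 24 months and 16 FSP, matching the figures in the example.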

Intermediate COCOMO
– Intermediate COCOMO is an extension of Basic COCOMO and provides greater accuracy and level of detail, which makes it more suitable for cost estimation in the more detailed stages of software product definition.

– For all three categories it uses the same exponents, but the coefficients for Effort computation are 3.2, 3.0 and 2.8 respectively for Organic, Semi-detached and Embedded.

– The schedule for Intermediate COCOMO is determined by the same equations as for the Basic model

Cost Drivers

– It incorporates 15 predictor variables, called Cost Drivers, to account for software project cost variations that are not directly correlated with project size.

– These Cost Drivers are grouped into four categories
  • Software Product Attributes
  • Computer Attributes
  • Personnel Attributes
  • Project Attributes

• Each of these attributes has a set of ratings, and a numerical value is assigned to each rating, e.g. RELY - Required Software Reliability has the ratings Very Low, Low, Nominal, High and Very High.

• Software Product Attributes:
  – RELY – Required Software Reliability
  – DATA – Database Size
  – CPLX – Software Complexity

• Computer Attributes:
  – TIME – Execution Time Constraint
  – STOR – Main Storage Constraint
  – VIRT – Virtual Machine Volatility
  – TURN – Computer Turnaround Time

• Personnel Attributes:
  – ACAP – Analyst Capability
  – AEXP – Applications Experience (Team)
  – PCAP – Programmer Capability
  – VEXP – Virtual Machine Experience
  – LEXP – Programming Language Experience

• Project Attributes:
  – MODP – Use of Modern Programming Practices
  – TOOL – Use of Software Tools
  – SCED – Schedule Constraint
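A hedged sketch of how the cost drivers enter the estimate. The slides give the Intermediate coefficients (3.2, 3.0, 2.8) and say each rating carries a numerical value, but they do not spell out how the values are combined; the code below assumes the usual Intermediate COCOMO form from Boehm (1981), in which the nominal effort is multiplied by the product of the selected cost-driver values (the effort adjustment factor, EAF). The RELY multipliers are quoted from Boehm's published table as recalled here and should be checked against it; the function and variable names are hypothetical.

```python
from math import prod

# (coefficient, exponent) for nominal effort in Intermediate COCOMO.
NOMINAL = {
    "organic": (3.2, 1.05),
    "semidetached": (3.0, 1.12),
    "embedded": (2.8, 1.20),
}

# Assumed multipliers for RELY (Required Software Reliability); verify
# against Boehm's cost-driver table before relying on them.
RELY = {"very_low": 0.75, "low": 0.88, "nominal": 1.00, "high": 1.15, "very_high": 1.40}


def intermediate_effort(kloc, category, multipliers):
    """Nominal effort scaled by the product of cost-driver multipliers (EAF)."""
    a, b = NOMINAL[category]
    eaf = prod(multipliers)  # drivers rated Nominal contribute 1.0
    return a * kloc ** b * eaf


# Hypothetical case: 128 KLOC organic project, high reliability required,
# all other fourteen cost drivers left at Nominal.
nominal = intermediate_effort(128, "organic", [])
adjusted = intermediate_effort(128, "organic", [RELY["high"]])
print(f"nominal = {nominal:.0f} PM, adjusted = {adjusted:.0f} PM")
```

Keeping every driver at Nominal leaves the effort unchanged, so the fifteen drivers only matter where the project departs from the typical case.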

Reuse – Adaptation Adjustment
– Previously developed software (code) that is reused, or adapted for reuse, in the new project can be incorporated as EDSI – the Equivalent number of Delivered Source Instructions. It is calculated from the Adaptation Adjustment Factor (AAF):

AAF = 0.40(DM) + 0.30(CDM) + 0.30(IM)

where DM = % Design Modified, CDM = % Code Modified and IM = % of Integration required for the modified software

So

EDSI = (Adapted DSI) * (AAF / 100)
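The reuse adjustment is a two-line calculation, sketched here in Python. The formulas are the ones above; the function names and the example percentages and size are hypothetical.

```python
def adaptation_adjustment_factor(dm, cdm, im):
    """AAF from % design modified, % code modified and % integration required."""
    return 0.40 * dm + 0.30 * cdm + 0.30 * im


def equivalent_dsi(adapted_dsi, dm, cdm, im):
    """EDSI = adapted DSI scaled by AAF / 100."""
    return adapted_dsi * adaptation_adjustment_factor(dm, cdm, im) / 100


# Hypothetical reuse case: 20,000 adapted instructions with 10% of the design
# modified, 20% of the code modified and 30% integration effort required.
aaf = adaptation_adjustment_factor(10, 20, 30)  # 4 + 6 + 9 = 19
edsi = equivalent_dsi(20_000, 10, 20, 30)       # 20,000 * 0.19 = 3,800
print(f"AAF = {aaf:.0f}, EDSI = {edsi:.0f}")
```

In this illustration the 20,000 reused instructions count as 3,800 new ones, and it is that equivalent size that would be fed into the effort equations.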

References
1. Deanna B Legg, Synopsis of COCOMO, in Richard H Thayer (Ed), Software Engineering Project Management, 2nd Ed, IEEE Computer Society (2000)
2. Barry Boehm et al, Cost Models for Future Software Life Cycle Processes: COCOMO 2.0, in Richard H Thayer (Ed), Software Engineering Project Management, 2nd Ed, IEEE Computer Society (2000)
3. Rajib Mall (2005); Fundamentals of Software Engineering, 2nd Ed, Prentice-Hall of India, New Delhi, Ch - 3 Software Project Management, pp: 38-84
4. Capers Jones (2007); Estimating Software Costs: Bringing Realism to Estimating; 2nd Ed, Tata McGraw-Hill Publishing Company, New Delhi
5. Pankaj Jalote (2005); An Integrated Approach to Software Engineering, Ch - 5
6. A J Albrecht and J E Gaffney; “Software Function, Source Lines of Code, and Development Effort Prediction: A Software Science Validation” in IEEE Transactions on Software Engineering, Vol SE-9, no 6, pp: 639-47, Nov 1983
7. Charles R Symons; “Function Point Analysis: Difficulties and Improvements” in IEEE Transactions on Software Engineering, Vol 14, no 1, pp: 2-11, Jan 1988
8. S A Kelkar (2007); Software Engineering – A Concise Study; Prentice-Hall of India, New Delhi, Appendix A – Estimation Techniques, pp: 641-682
