programmer and analyst time/cost estimation€¦ · equation, plus a set of factor tables which...

13
P/A Time/CostEstimation PROGRAMMER AND ANALYST TIME/COST ESTIMATION By:Izak Benbasat Iris Vessey A bstract The ability to estimate the personnel time and costs required for the completion of programming and systems projects is animportant managerial tool for the information systems department. Thisarticle presents a survey of the estimation techniques found in the literature by describing each technique and discussing its strengths and weaknesses.Some empirical evidence on how the various program andprogrammer/ analyst characteristics affect projecttimeand cost are also reported. Keywords:Systems development, time estimation ACM Categories:2,4, 3,5 and cost Programmerand Analyst Performance Measurement The issue of programmer and analyst productivity standards is of prime importance to a systems development manager. Without such standards, it is not possible with any certainty to develop realistic budgets for a project, schedule workloads, and evaluate the performance of per- sonnel. The factors which play a role in determination of the systems building process are many and varied. For example, Farquhar [8] has identified ninety-six. In addition, the environment is continually changing. With the introduction of new database management capabilities, more com- plex input-output equipment, new languages, new system development tools and methodologies, and a newer generation of computers, the standards must be revised to incorporate more diversified factors. Nonetheless, one would expect that organiza- tions would make an attempt at estimating pro- grammer and analyst productivity as a means for obtaining greater control over performance of the systemsdivision. However, this is not true in practice. As Chrysler [5] states with referenceto programming standards, Them is no generally accepted, reliable and accurate set of programming perfor- mance standards. Further, one will not even find a generally accepted technique for developing such standards. Chrysler has proposed a framework within which to study the variables affecting programmer per- formance. This framework shows the range of variables affecting the programming process. The variables are classified into six groups:organiza- tional, hardware, source language, programming mode, programming problem, and programmer characteristics. This model, modified principally by the introduction of a seventh class of variables, i.e., software engineering technology, is presented in Figure 1. As yet, little has been done towardmeasuring the factors affecting a syst~ems analyst’s performance, but such a model would be equally as complex if not more so. MIS Quarterly/June 1980 31

Upload: others

Post on 14-May-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Programmer and Analyst Time/Cost Estimation€¦ · equation, plus a set of factor tables which provide the variable values to be used as inputs to the equation [3, 11, 21, 24]

P/A Time/Cost Estimation

PROGRAMMERAND ANALYSTTIME/COST ESTIMATION

By: Izak BenbasatIris Vessey

A bstractThe ability to estimate the personnel time and costsrequired for the completion of programming andsystems projects is an important managerial tool for theinformation systems department. This article presents asurvey of the estimation techniques found in theliterature by describing each technique and discussingits strengths and weaknesses. Some empiricalevidence on how the various program and programmer/analyst characteristics affect project time and cost arealso reported.

Keywords:Systems development, timeestimation

ACM Categories: 2,4, 3,5

and cost

Programmer andAnalyst PerformanceMeasurementThe issue of programmer and analyst productivitystandards is of prime importance to a systemsdevelopment manager. Without such standards, itis not possible with any certainty to developrealistic budgets for a project, scheduleworkloads, and evaluate the performance of per-sonnel.

The factors which play a role in determination ofthe systems building process are many andvaried. For example, Farquhar [8] has identifiedninety-six. In addition, the environment iscontinually changing. With the introduction of newdatabase management capabilities, more com-plex input-output equipment, new languages, newsystem development tools and methodologies,and a newer generation of computers, thestandards must be revised to incorporate morediversified factors.

Nonetheless, one would expect that organiza-tions would make an attempt at estimating pro-grammer and analyst productivity as a means forobtaining greater control over performance of thesystems division. However, this is not true inpractice. As Chrysler [5] states with reference toprogramming standards,

Them is no generally accepted, reliableand accurate set of programming perfor-mance standards. Further, one will noteven find a generally accepted techniquefor developing such standards.

Chrysler has proposed a framework within whichto study the variables affecting programmer per-formance. This framework shows the range ofvariables affecting the programming process. Thevariables are classified into six groups: organiza-tional, hardware, source language, programmingmode, programming problem, and programmercharacteristics. This model, modified principallyby the introduction of a seventh class of variables,i.e., software engineering technology, ispresented in Figure 1. As yet, little has been donetoward measuring the factors affecting a syst~emsanalyst’s performance, but such a model wouldbe equally as complex if not more so.

MIS Quarterly/June 1980 31

Page 2: Programmer and Analyst Time/Cost Estimation€¦ · equation, plus a set of factor tables which provide the variable values to be used as inputs to the equation [3, 11, 21, 24]

P/A Time~Cost Estimation

Organizational OperationsCharacteristics

Programming methods standardsSystem documentation standardsWork environmentGroup interaction

Computer HardwareCharacteristics

Memory size

Programming ModeCharacteristics

Batch entryOnline entry

Software EngineeringCharacteristics

Structured designStruc~red programming

Storage devices

/Source LanguageCharacteristics

Instruction setSelf-documentation

PROGRAMMER PERFORMANCE

/\

Testing aidsDatabase management

~systems

Programming ProblemCharacteristics

Function of programDegree of integration

Programmer CharacteristicsInnate abilityProgramming experienceSource language experience

Figure 1. Variables Affecting Programmer Performance

With this complexity in mind, it is apparent that ameaningful assessment of any part of thesystems development process can be made onlywhen the variables that are changing at any onetime are few in number. For this reason, anorganization should not attempt to use the costfactors of others to estimate costs for its owndevelopment process. It should instead seek tomeasure productivity based on its own organiza-tional context, its own hardware, the chosensource language, and programming mode, etc.

The next section presents the techniques thatmay be used for estimating programmingvariables, while those used for systems estima-tion follow.

Estimation ofProgramming Time/CostThe methods that have been proposed in theliterature for the estimation of programming timeare presented in order of increasing sophistica-tion. Hence, the early methods are easier toapply, but since they do not rely on an empiricalbase the estimates produced by them are subjectto greater error. The later methods require a com-prehensive databank on past projects, andsubstantive analysis in order to determine thevariables that are relevant to the particularorganization.

The methods which are presented are classifiedas follows:

32 MIS Quarterly~June 1980

Page 3: Programmer and Analyst Time/Cost Estimation€¦ · equation, plus a set of factor tables which provide the variable values to be used as inputs to the equation [3, 11, 21, 24]

P/A Time~Cost Estimation

1. Personal Experience

2. Analogy

3. Work Factors

4. Standards

5. Parametric Equations

Personal experience

This estimation method is the most subjectiveone. It is based on the judgment of the projectsupervisor, senior programming staff, or the pro-grammers who will write the programs. Typically,supervisors take a weighted average of thevarious estimates, giving more weight to the judg-ment of experienced performers, or they use thedifferent estimates to determine an upper andlower bound.

Greater reliability can be obtained by breaking thetotal effort into relatively small, manageable jobunits which are clearly delineated. The individualtasks are estimated separately, then aggregatedto obtain the total project resources. An advan-tage of this method is the participation of thepeople who are going to perform the task, thusincreasing the incentive to complete the taskwithin the agreed time constraints.

A well-known technological forecasting approach,called the Delphi technique [19], could be used toimprove the forecasting ability of individualestimators. Using this approach, a panel ofestimators who remain anonymous, presentindependent estimates to a coordinator. Thecoordinator then provides each panel memberwith the distribution of group responses, such asthe median, upper, and lower quartiles. Eachpanel member is asked to revise or support hisestimates if his response differs from the groupconsensus. By iteration of the estimation/revisionprocesses, the technique aims to reach a groupconvergence on the resource estimates.

A nalogy ~This method of estimation requires that fairly highlevel data on past development projects be retained

’Aron’s [1 ] term for this procedure is the "Experience Method."

by the organization. To employ this method atleast one project with similar features must havebeen completed previously. The new projectmust be clearly specified at least at the functionallevel, permitting comparison of similar elements.Using the assumption that similar tasks requiresimilar resources, an estimate is made of thematching areas, perhaps scaled upward to allowfor unforeseen problems. Functions that cannotbe compared are estimated separately by someother method.

The major problem encountered in the use of thismethod is that it cannot be applied to systemslarger than that used for comparison. Aron [1]states that "system complexity grows as thesquare of the number of sy~,tem elements;therefore, experience with a small system cannotaccount for all the things that will have to be donein a large system." In addition, this procedure can-not be applied to systems which are totally new incontent.

Work factors~

This method is a step toward the analyticalmethod of estimation. It is based on an estimationequation, plus a set of factor tables which providethe variable values to be used as inputs to theequation [3, 11, 21, 24].

Weiss [24] has proposed the following estimationequation to find the total man-days required for aprogramming project (define program logic, code,compile, test/debug, and document):

Program -1Total F Input Output function /

= + complexi~J XMan-days L~omplexity +complexity

[Programmer experience + Job knowledge]

In order to use the equation, Weiss provides anumber of tables to determine the complexity fac-tors associated with the particular programmingassignment. For example, for input and outputcomplexity, sequential tape files are given a factor

=Note that certain project management systems such asSPECTRUM-1 also provide estimation of programming time aspart of their systems estimation process. SPECTRUM will bediscussed more fully in the section on the total systemdevelopment time and costs.

MIS Quarterly~June 1980 33

Page 4: Programmer and Analyst Time/Cost Estimation€¦ · equation, plus a set of factor tables which provide the variable values to be used as inputs to the equation [3, 11, 21, 24]

P/A Time~Cost Estimation

of 1, whereas a database file is assigned a factorof 4. For programmer experience the factorsrange from 0.5 for an experienced programmer to4 for a programming trainee. Program complexityratings are differentiated by program function atthree levels of complexity. For example, anaverage edit function is assigned a rating of 2, adisplay function a rating of 1, and an averageupdate function, a rating of 3.

Henney [10] evaluates the approach suggestedby Weiss by stating:

the advantages of this method, in princi-ple, are that the estimate is built up onidentifiable definitions of complexity forthe program as a whole (although stillrelying on some subjective judgment),and there is no need to assess a globalcomplexity factor, or to try and estimatethe program length before it is wdtten.The disadvantages are that it is complex,there is no evidence of its validation, andit purports to be a general purpose for-mula, which gives it little chance ofparticular success.

More basic equations to calculate total program-ming time have also been developed. Watson[23] has proposed the following equation:

Total Man-days = 1/4SC + 2S + 4C + 2N

where C is a complexity rating ranging from 1 ¯(easy) to 5 (highly complex), N is the number record types, and S is the program size in units of250 statements. The above result can then beadjusted for programming experience.

This approach has a number of problems. Itrequires an estimate of the number of lines ofcode, which in itself is a difficult estimating task. It

does not give any guidelines as to the range ofprogramming tasks, from very small to very com-plex, for which the formula is applicable. There isalso no evidence as to the empirical basis for itsdevelopment.

Standards

Two ways in which standards may be used toestimate programming costs will be presentedhere. A third method, described by Wolverton[25], employs standard costs for the totaldevelopment life cycle. However, the procedurecould be applied to the programming functionalone.

Lines of Code Per Hour

Johnson [13] has suggested that the averagenumber of lines of COBOL code (LOC) producedper hour may be used as a means of determiningprogram development time. The average numberof LOC/hour for small projects requiring less than250 man-days was found to be 8.7, while pro-jects requiring more than 500. man-days fordevelopment were found to have been producedat an average rate of 2.0 LOC/hour.

According to the following statement by Brooks[2], adjustment of these figures would suggest a1 LOC/hour production rate for compilers and .3LOC/hour for operating systems. Brooks states:

My guideline in the morass of estimatingcomplexity is that compilers are threetimes as bad as normal batch applicationprograms, and operating systems arethree times as bad as compilers.

These figures, in fact, correspond well with thosegiven by Brooks which are shown in Table 1.

Table 1. Productivity Rates (LOC)

Johnson BrooksType of Development Hourly Yeady Yearly

Batch

Compiler

Operating system

3 6336

1 2112

0.3 704

6,000 - 9,000

2,000 - 3,000

600 - 800

34 MIS Quarterly~June 1980

Page 5: Programmer and Analyst Time/Cost Estimation€¦ · equation, plus a set of factor tables which provide the variable values to be used as inputs to the equation [3, 11, 21, 24]

P/A Time~Cost Estimation

Again, the difficulty in implementing this methodlies in the need to calculate the number ofexpected LOC before programming takes place.However, if the system under consideration is toreplace an existing computerized system, theestimate may be reasonably accurate.

Module Standards

This standard method of estimating programmingcosts is more analytical than the LOC methodpresented above. It requires an extensive reviewof past projects in order to obtain the variability inthe number of statements per module and theprogramming rate for modules of a particularclass. The statistics found by Donelson [7] forCOBOL programs run on an IBM 360/40 batchprocessing computer are shown in Table 2.

By determining the number of modules of eachclass (ME), the cost of programming can estimated using the equation:

12Programming = T. M~" (S<’ + o:<’o’S<’). Rp

Cost <" = 1 (P~" + ~ <" (~P<’)

where, M~" = number of modules of class ~’,

S~’ = mean number of statements permodule per class ~’,

o’S~" = standard deviation of S~’,

=~’ = chosen multiple of o’S<"(planner’s estimate),

Table 2. Programming and Testing Estimatesfor COBOL Batch Applications

Module Class

Statements Programming Rateper module’~ (statements per hour)

Standard StandardMean Deviation Mean Deviation

Computertest hoursper module

1. Data definition 62 52 16 N/A*

2. One-time utility 177 92 30 N/A

3. Conversion utility 449 179 24 N/A

4. General utility 260 120 5 4.5

5. Database interface"bridge" 450 150 10 1.8

6. Edits 1715 415 16 4.2

7. Updates 1278 528 20 7.8

8. Processing 1186 108 8 1.4

9. Major extracts 530 29 15 3.9

10. Minor extracts 186 76 7 3.0

11. Major reports 907 436 8 2.5

12. Minor reports 260 95 9 5.1

.7

.7

3.2

8.9

8.1

19.0

11.3

27.0

6.1

4.6

19.7

5.0

*N/A = Not Available1"Does not include COPIED data definition statements

MIS Quarterly~June 1980 35

Page 6: Programmer and Analyst Time/Cost Estimation€¦ · equation, plus a set of factor tables which provide the variable values to be used as inputs to the equation [3, 11, 21, 24]

P/A Time~Cost Estimation

P~" = programming statements perhour for module class ~’,

(~P~" -- standard deviation of P~’,

1~(= chosen multiple of o’P(, and

Rp = hourly charge for programming.

The method is flexible, allowing for assessment ofrisk in the form of the multipliers, ¢(and I~’, Le.,the estimator’s uncertainty about the currentdevelopment will be reflected in the number ofstandard deviations from the mean specified inthe equation.

Parametric equations

This method involves the use of equations der-ived by the application of multiple regressiontechniques to estimate project developmenttimes. An example of the estimating equation maybe:

Man-months = 1~ 0 + I~ 1 (%

Mathematical Instructions) + ~2 (Pro-

gramming Language) + I~3 (Number of

Subprograms) + I~4 (Experience Level

of Programmer)

In statistical notation, the general form of thisequation is:

Y = ~oXo + ~1 Xl + 1~2X2 .... + ~=nxn + E:

where Y is the value to be estimated, X0 is a con-stant, X’s are the variables in absolute or codedform, I~’s are the coefficients to be estimated,and E is the error term. Examples of the use ofthe regression method to estimate programmingproject development times can be found in [5, 6,9, 12, 15, 22].

The parametric equation as an estimation methodis the one which looks at the problem most objec-tively. It is also the most difficult to develop. Itrequires a history file to be used in estimating thecoefficients, and detailed statistical work to deter-mine the parameters to be included in the equa-tion. Once the coefficients are determined, theestimate for a new programming task can be

calculated from the equation by substituting thevariable values relating to the task.

There are a number of caveats which should bekept in mind when using multiple regressiontechniques. Nelson [15] describes these asfollows:

Such work not only produces equationsthat can be used for estimating purposes,it also serves to identify factors that havestatistically significant impact on expen-ditures, and hence directs managementattention to those critical factors, many ofwhich are subject to management controlA disadvantage also arises, however,when the results of cost research arepublished in the form of equations. Suchequations are frequently interpreted bythe user as demonstrating a specificcausative relationship. This is not thecase. Equations developed by multipleregression techniques do reveal impor-tant parameters and may represent therelationships that provide the moststatistically significant manner of describ-ing the character of the analyst’s sample.However, they do not necessarilyrepresent natural law, as do many of theequations used by the engineer or thephysicist. Also, they must be used in theirentirety or not at all. ff a value for any oneof the independent variables were notavailable, the equation could not be usedas it stands, since repeating the multipleregression analysis without the missingparameter would result in the reassign-ment of weights to all of the remainingvariables and the Y-intercept of theequation.

Therefore, the transfer of the regression estima-tion equations published in the literature from thecompany whose data was used to another, is dif-ficult. Variables such as company size, systemdesign methodology, hardware, operatingsystems, etc., which might not have been takeninto consideration in estimating the regressionparameters would be likely to cause differences.Only in those cases where the company andsystem characteristics are similar can such equa-tions be transferred.

36 MIS Quarterly~June 1980

Page 7: Programmer and Analyst Time/Cost Estimation€¦ · equation, plus a set of factor tables which provide the variable values to be used as inputs to the equation [3, 11, 21, 24]

P/A Time~Cost Estimation

Another consideration in using the regressionequation for estimating purposes is the nature ofthe variables employed. Some equations containvariables such as number of classes of items inthe database per lines of code, number of pagesof delivered documentation per lines of deliveredcode, or total number of personnel divided bymaximum number working at one time, which canonly be determined ex-post, that is, after the com-pletion of the programming task. However, equa-tions which contain variables which are measuredex-post would still be useful in evaluating the com-parative productivity of the programming taskafter project completion.

Estimating equations which contain quantitativerather than qualitative variables are moredesirable. Quantitative variables such as thenumber of input/output files or number of reportsgenerated can be measured objectively.Qualitative variables, such as complexity of theprogram or programmer experience with similarapplications, must be coded using subjectivejudgment. Therefore, the use of quantitativevariables only, would provide more theoreticallysound equations.

There are not many published reports ofparametric equations developed for estimation.

Some of the variables which were included inthose equations are shown in Table 3 to give thereader an indication of those factors which mighthave a possible impact on performance. Anorganization trying to develop an estimating equa-tion could Collect historical data on thesevariables, and consider them in its estimationprocess.

Although lines of code in a program are highly cor-related with programming time (for example) see[12]), this variable was not included in Table since it is difficult to estimate in advance. On amore macro level the number of application sub-programs [18], and the number of COBOLparagraphs [12] could be used instead of lines ofcode.

Programmer experience seems a priori to be animportant variable affecting programming time,and has been found to be so in various studies [5,6, 9, 22]. Some authors who have not foundexperience to be significant [12, 25], suggestthat less experienced programmers may be giventhe easier tasks, or that more experienced pro-grammers may be promoted to systems analystsas possible reasons for such findings [12].

Table 3. Summary of Regression Analysis References

Variable Name Referenced By

Number of Input Record Types

Number of Output Record Types

Number of Input Fields

Number of Input Edits

Number of Input Files

Number of Report Types

Number of Files

Number of Application Subprograms

Number of COBOL Paragraphs

Programmer Experience

Gayle [9], Jeffery and Lawrence [12]

Gayle [9], Jeffery and Lawrence [12]

Chrysler [5, 6]

Chrysler [5, 6]

Chrysler [5, 6]

Putnam [18]

Chen [4], Jeffery and Lawrence [12], Putnam [18]

Putnam [18]

Jeffery and Lawrence [12]

Chrysler [5, 6], Gayle [9], Walston and Felix [22]

MIS Quarterly~June 1980 37

Page 8: Programmer and Analyst Time/Cost Estimation€¦ · equation, plus a set of factor tables which provide the variable values to be used as inputs to the equation [3, 11, 21, 24]

P/A Time/Cost Estimation

Estimation of Total SystemsDevelopment Time/CostThere are four principal ways in which costs ortimes for the whole of the systems developmentproject may be estimated. These are:

1. Proportion of Life Cycle Method2. Work Factors Method3. Standards Method4. Manpower Life Cycle Method

The first three of these methods may be regardedas micro-estimating approaches since they useprogram elements as basic building blocks, andhence bear a direct relationship to the methodspresented for estimation of programming time.The fourth, the manpower life cycle method, is atotally different concept in that it takes a macroview of the total project. To date, reports on thisapproach in the literature have been confined tovery large3 software development projects.

Before proceeding with a detailed discussion ofthe micro-estimating techniques note should bemade that no technique makes specific mentionof any overlap of the phases or iterations whichmay be required as the development proceeds.For example, a re-evaluation of systems designmay be required after the commencement of theprogramming phase. Hence, the implementors,

=Myers [14] defines "very large" as a project exceeding 20 to30 man-years with a time to reach operational capacity of twoor more years,

themselves should include such estimates in theircalculations.

Proportion of fife cycle method

Once the time to develop the programming phaseof a project has been estimated, extrapolationprovides an estimate for the total project. Thereare many reports in the literature showing fairlyconstant time periods associated with the dif-ferent development phases. However, theindividual organization would be well advised toestablish its own phase times, as life cyclephases do not necessarily correspond from oneorganization to another. In addition, implementorsmay wish to break these phases into a greaternumber of subphases. The same basic methodcan be used with specified proportions of the totaleffort being assigned to the designated activities.Selected figures are given in Table 4.

An example of the use of this method is given byDonelson [7], who also calculates the cost of pro-gram keypunch time and computer test time.(Refer to page 35 and Table 2 for Donelson’s basic3arameters.) Total project cost is calculated by:

~-((S<" +=<" OS<’)M<’ o(I’IRs + Rp +L

~S "+’125" (~S’"RK~) +T" RC~

Table 4. Percentage of Time Spent in Life Cycle Phases

Business Systems

Donelson [7]

Johnson [13]

Software Systems

Aron [1 ]

Wolverton [25]

Systems Analysis & Design52%

Definition & Design5,0%

Programming48%

Programming & Testing5O%

Design3O%

Analysis & Design40%

Implementation4O%

Coding & Development2O%

Test30%

Checkout & Test40%

38 MIS Quarterly/June 1980

Page 9: Programmer and Analyst Time/Cost Estimation€¦ · equation, plus a set of factor tables which provide the variable values to be used as inputs to the equation [3, 11, 21, 24]

P/A Time~Cost Estimation

where, RS =

RK =

RC =

T~" =

hourly charge for systemsanalysis and design;

hourly charge for keypunching;

hourly charge for computer testtime;

mean number of computer testhours for module of class ~"(refer to the rightmost columnin Table 2).

The expression on the left hand side of the plussign is similar to the programming cost formulashown on page 35, except for the addition of thesystems analysis and design time multiplier (1.1RS). The number 1.1 is the proportion of systemsanalysis and design time with respect to program-ming time (.52/.48), using the figures derived Donelson as shown in Table 4. The figure 125used on the right hand side of the expressionstands for the average number of COBOLstatements keypunched per hour. Note that themethod is somewhat inconsistent in that it doesnot incorporate estimates of standard deviationfor computer test time. This type of proceduremay be used in association with any of thosedescribed for estimation of programming times.

Work factors

Work factors methods similar to those describedearlier for the estimation of programming time,may be used also in the assessment of systemstime. For example, Weiss [24] has suggested thefollowing approach. The system developmentprocess is represented in seventeen steps, suchas the design of system output, design of forms,approval of conversion program tests, and follow-up review. Each of these steps is rated as simple,simple/complex, complex, or very complex. Atable provides the suggested man-days requiredfor each step at each complexity level. For exam-ple, five man-days are allocated to developingvery complex system test procedures while oneman-day is allocated to designing master files fora simple system. The man-days assigned to astep are then multiplied by a weighting factor foranalyst experience, ranging from 0.5 for anexpert to 4.0 for one with limited systemsexperience, to obtain a final estimate. The sum ofthe estimates of all the steps represents the totalsystem time.

The advantages and disadvantages of such anapproach are similar to those discussed for pro-gramming estimation (see p. 34).

The project management system, SPECTRUM-1[20], uses essentially the same approach as thatdiscussed above. It provides time estimates fortasks in the form of average, minimum, and max-imum figures. The task times are aggregated intofigures for each of twelve phases. A number ofadjustment factors may then be applied, includingproject size, average team experience, percen-tage of time on other work, user understanding,newness of techniques, etc. For example, theadjustment for a team composed of all seniorpeople would be -10% of the calculated time,while a team of juniors would be permitted anincrease of 40%; similar figures apply to well-defined user understanding as opposed to nouser understanding.

StandardsWolverton [25] has reported a software costestimation technique based on the assumptionthat total development costs vary proportionallywith the number of instructions. Using a"reverse" proportion of life cycle method, he thencalculates costs for the individual phases ofdevelopment.

The cost for the complete software developmentcycle is estimated at the program or subprogramlevel by consideration of the number of objectinstructions, the functions of the routine, and itsrelative difficulty. The six functions identified byWolverton are control, input/output, pre/post pro-cessor, algorithm, data management, and timecritical processes. Difficulty is defined as threelevels, (easy, medium, and hard) for each of twocategories, (old and new routines) producing total of six levels. This approach envisages theuse of standard costs per instruction for each ofthe above function-difficulty combinations (36 inall). Using these standard costs, plus theknowledge of program functions, difficulty leveland estimated number of instructions per pro-gram, the total cost of software development iscalculated. The next step is to break this total intoits subphases in the following way: requirements8%; preliminary design 18%; interface definition4%; detailed design 16%; code and debug 20%;development testing 21%; validation, testing, andoperational demonstrations 13%.

MIS Quarterly~June 1980 39

Page 10: Programmer and Analyst Time/Cost Estimation€¦ · equation, plus a set of factor tables which provide the variable values to be used as inputs to the equation [3, 11, 21, 24]

P/A Time~Cost Estimation

There are three problems with this method. First,the building of a reliable databank is necessarysince only data based on the organization’s ownproductivity can be employed. Second, any pro-cedure which uses a set of standards makes theassumption that the building blocks are directlycomparable, which is not necessarily true whenthe elements under consideration are computerprograms. Third, estimation of the number ofobject instruction may be subject to considerableerror, especially if the program is classified as"new," or if the estimation is made before thedetailed design phase.

Advantages claimed for this procedure includethe minimal amount of data required to obtain acost estimate, once the databank is established,and requirements are directly traceable to costs.In addition, the effect of alternative designs maybe readily assessed.

Manpower fife cycle

This method is based on a formal representationof manpower utilization rates over the project lifecycle. The basic premise upon which the methodrests is that certain facts about a softwaredevelopment project are known at the beginningof development. From these facts a graph of

expected manpower utilization over time may begenerated. It is a macro approach applicable onlyto large software projects from the time ofdetailed development.

Norden [16, 17] found that software develop-ment projects are composed of cycles corres-ponding to each phase of the life cycle, e.g.,planning design, model (prototype), release(extension), and modification and maintenance.Each of these cycles is well described by theRayleigh equation, given by:

y = 2Kat ¯ exp(-at2)

where y = manpower utilized in each timeperiod t,

K = total cumulative manpower usedon the project in units of manyears,

a = shape parameter, governing therate of change of manpower utiliza-tion, and

t = time in years.

In addition, the curves representing the cycles,when combined, form an overall developmentcurve which also may be described by a Rayleighequation. A graph of the equation is shown inFigure 2.

15-

Percent of 10-totalmanpower

Manpower peak

,

.,.~------------------~ full operational capability

1 2 ~3 4 5 6 7

td

8 Years

Figure 2. Manpower Utilization Curve

40 MIS Quarterly~June 1980

Page 11: Programmer and Analyst Time/Cost Estimation€¦ · equation, plus a set of factor tables which provide the variable values to be used as inputs to the equation [3, 11, 21, 24]

P/A Time~Cost Estimation

Ihe Rayleigh equation was then applied to soft-ware development by Putnam [18]. He found thatthe time of maximum manpower utilization almostcorresponds to the time when the system is firstoperational. Therefore, Putnam defined the timeto peak as the development time of the project(td).

Extensive testing of the Rayleigh equation againsta substantial number of past projects from severalorganizations showed that many had correlationcoefficients in excess of 0.95, indicating that theequation can reproduce the practical manpowerutilization situation extremely well [18].

In order to use the Rayleigh equation to predicttotal cost and manpower life cycle requirements,parameters K and t d must be estimated. An exam-ple of the estimating approach as described byPutnam is discussed below. Putnam collectedhistorical data on K (manpower utilized), d(development time), 2 (number of r eports), x 3(number of application subprograms) from nine-teen projects, and using multiple regressiontechniques, developed the following five estima-tion equations (constant terms in the equationwere dropped):

K/td2 = .1991 x2 -I- .0859 x3K/td =.2700x 2 +.5901 x3K = -.3447 x2 + 2.7931 x3Ktd = 4.1385 x2 + 11.909 x3Ktd4 =-483.3196x 2 + 791.08x3

In order to test the estimating power of theseequations he chose one project from the nineteenused to generate the equation. This project had179 reports (x2) and 256 application sub-programs (x3). Substituting these values into theequations shown above, and simultaneously solvoing them for K and t d, provided an estimate of K= 682.35 man years, and t d = 3.4558 years(the actual values for the project were K = 700and t d = 3.65).

Once t d is estimated, the shape parameter, a, canbe obtained from the relationship:

1a -(2td2).

Substituting for a, the Rayleigh curve is given by:

y(t) = 2K t exp (-t2/2td2),

2td2

and, for K = 682.35 and t d = 3.4558, weobtain

y(t) = 57.1 t exp (-t2/23.89).

The above equation permits calculation of thenumb_er of man years needed for year t as shownin Table 5.

In summary, the method described abovegenerates a number of estimating equationsbased on historical data and regression methods.To determine the manpower required for a new pro-ject in the same organization, first the number ofreports and application subprograms areestimated. These values are then applied to theestimating equations, and values for K and t d aregenerated. Using K and t d in the Rayleigh curve,an estimate of manpower needed for each year ofthe project is obtained. Other more sophisticatedapproaches to estimation using the Rayleighcurves can be found in Putnam [18].

Concluding CommentsThe information systems department has the goalof providing management with the information itneeds to conduct the organization’s day-to-dayoperations, and the information it needs to directthe organizatioh toward its short range and longrange objectives. Although few information

Table 5. Estimated Manpower Derived From the Rayleigh Equation

t (year) 2 3 4 5 6 7 ...

man years 52.1 96.6 117.5 116.9 100.3 75.9 51.4 ...needed

MIS Quarterly~June 1980 41

Page 12: Programmer and Analyst Time/Cost Estimation€¦ · equation, plus a set of factor tables which provide the variable values to be used as inputs to the equation [3, 11, 21, 24]

P/A Time~Cost Estimation

system departments have reached that stage yet,they should at least by trying to work towardthose objectives. Therefore, it follows that adepartment whose main objective is to provideinformation, must build its own information systemto gather information on performances of systemsanalysts, programmers, computer operators, aswell as on equipment and other needs. It seemsthat many information systems departments tendto ignore their own management control systemswhile trying to help the organization.

A review of the literature for the last ten yearsshows that very little in terms of new methods hasbeen proposed in this area. In our opinion, themethods available today are more than adequatefor a company to establish an estimationapproach. All that is needed is management’s will-ingness to employ the planning and controlphilosophy used in other functional areas in theinformation systems department.

In order to achieve a good internal managementsystem, subjective standards should be usedless, in favor of more objective procedures. Thismay be achieved by starting to build a history file,retrieving the information needed, and analyzing itin order to develop standards or parametric equa-tions for improved estimation methods.

The building of the history file is a prerequisite fordeveloping any one of the estimation techniquesdescribed in this paper, except for the "personalexperience" approach. A framework whichoutlines the data elements which, a priori, arethought to be significant, is useful in order todetermine what types of project data to collect.The framework suggested by Chrysler [5, 6] (seeFigure 1), and a measurement database des-cribed by Walston and Felix [22] could be usedas guidelines.

In analyzing the various methods described in thispaper it is important to understand their differingcharacteristics. Methods such as the work fac-tors and the manpower utilization curves areintended to be general purpose ones which couldbe used in any organizational context. However,one should be careful to realize that the man-power utilization curves were developed for verylarge software projects, thus, their validity can berelevant only in such environments. The workfactors methods, since they try to be general intheir orientation, would have less success in aparticular organization than a parametric equationdeveloped in the given organizational environ-ment. Standards and parametric equationmethods are presented in order to describeapproaches an organization could use fordeveloping estimation techniques rather than toprovide the readers with specific numbers which"they could use in their organizations.

References

[1] Aron, J. D. "Estimating Resources forLarge Programming Systems," Tutorial onQuantitative Management: Software CostEstimating, Putnam, L. H. and Wolverton,R.W. (eds.), IEEE Computer Society, LongBeach, California, 1977, pp. 257-268.

[2] Brooks, F. Mythical Man-Month, Addison-Wesley, Reading, Massachusetts, 1975,p. 93.

[3] Burton, B. J. "Manpower Estimating forSystems Projects," Journal of SystemsManagement, Volume 26, Number 1,January 1975, pp. 29-33.

[4] Chen, E. T. "Program Complexity and Pro-grammer Productivity," IEEE Transactionson Software Engineering, Volume SE-4,Number 3, May 1978, pp. 187-194.

[5] Chrysler, E. "Programmer PerformanceStandards," Journal of Systems Manage-ment, Volume 29, Number 2, February1978, pp. 18-25.

[6] Chrysler,. E. "Some Basic Determinants ofComputer Programming Productivity,"Communications of the ACM, Volume 21,Number 6, June 1978, pp. 472-483.

[7] Donelson, W. S. "Project Planning andControl," Datamation, Volume 22, Number6, June 1976, pp. 73-80.

[8] Farquhar, J, A. "A Preliminary Inquiry intothe Software Estimation Process," RM-6271-PR, The Rand Corp., Santa Monica,California, 1970.

[9] Gayle, J. B. "Multiple Regression Tech-niques for Estimating Computer ProgrammingCosts," Journal of Systems Management,Volume 22, Number 2, February 1971,pp. 13-16.

[10] Henney, A. "Techniques for EstimatingProgramming Effort and Assessing Pro-

42 MIS Quarterly~June 1980

Page 13: Programmer and Analyst Time/Cost Estimation€¦ · equation, plus a set of factor tables which provide the variable values to be used as inputs to the equation [3, 11, 21, 24]

P/A Time~Cost Estimation

grammer Performance," Data Systems,September 1969, pp. 34-36.

[11] Hurtado, C. D. "A Quantitative Tool forDecision-making," Data Management,Volume 9, Number 3, March 1971, pp.19-22.

[12] Jeffery, D. R. and Lawrence, M. J. "AnInterorganizational Comparison of Program-ming Productivity," Working Paper 1/79,Department of Information Systems,University of New South Wales, Australia,1979.

[13] Johnson, J. R. "A Working Measure of Pro-ductivity," Datamation, Volume 23, Number2, February 1977, pp. 106-110.

[14] Myers, W. "A Statistical Approach toScheduling Software Development," Com-puter, Volume 11, Number 12, December1978, pp. 23-35.

[15] Nelson, E. A. "Some Recent Contributionsto Computer Programming Management,"On the Management of Computer Program-ming, G. F. Weinwurm (ed.), AuerbachPublishers, 1970, pp. 1 59-184.

[16] Norden, P. V. "On the Anatomy of Develop-ment Projects," IRE Trans. PGEM, VolumeEM-7, Number 1, 1960, p, 41.

[17] Norden, P. V. "Useful Tools for ProjectManagement," Management of Production,M. K. Starr (ed.), Penguin Books, Inc.,Baltimore, Maryland, 1970, pp. 71-101,

[18] Putnam, L. H. "A General Empirical Solutionto the Macro Software Sizing andEstimating Problem," IEEE Trans. SoftwareEngineering, Volume SE-4, Number 4, July1978, pp. 345-361.

[19] Scott, R. F. and Simmons, R. B. "Program-mer Productivity and the DelphiTechnique," Datamation, Volume 22,Number 5, May 1976, pp. 71-73.

[20] Spectrum, Spectrum International Incor-porated, Los Angeles, California.

[21 ] Toellner, J. "Project Estimating," Journal ofSystems Management, Volume 28,Number 5, May 1977, pp. 6-9.

i22] Walston, C. E. and Felix, C. P. "A Methodof Programming Measurement and Estima-tion," IBM Systems Journal, Volume 16,Number 1, 1977, pp. 54-73.

[23] Watson, D. A. "Some Factors Involved inthe Establishment of Programmer Perfor-mance Standards," Computer Bulletin,Volume 13, Number 6, June 1969.

[24] Weiss, H. M. "Project Control," Journal ofSystems Management, Volume 24,Number 5, May 1973, pp. 14-17.

[25] Wolverton, R. W. "The Cost of DevelopingLarge-Scale Software," IEEE Transactionson Computers, Volume C-23, Number 6,June 1974.

About the Authors

Izak Benbasat is Assistant Professor of Account-ing and MIS at the University of British Columbia.He received his M.S. and Ph.D. in MIS at theUniversity of Minnesota. He is presently spendinghis sabbatical year at the MIS Research Center atthe University of Minnesota.

Iris ~lessey is currently Lecturer in InformationSystems at the Department of Commerce,University of Queensland, Australia. She holds anM.Sc., Dip. of Info. Proc., and an MBA degreefrom the University of Queensland. Her currentresearch interests include software design, soft-ware maintenance and the cognitive processesinvolved in computer programming and debugging.

MIS Quarterly~June 1980 43