Quality control and due diligence in project management: Getting decisions right by taking the outside view

Bent Flyvbjerg

BT Centre for Major Programme Management, Saïd Business School, University of Oxford, United Kingdom

Received 15 May 2012; received in revised form 2 October 2012; accepted 16 October 2012

Available online at www.sciencedirect.com
International Journal of Project Management 31 (2013) 760–774
www.elsevier.com/locate/ijproman

Abstract

This paper explores how theories of the planning fallacy and the outside view may be used to conduct quality control and due diligence in project management. First, a much-neglected issue in project management is identified, namely that the front-end estimates of costs and benefits used in the business cases, cost–benefit analyses, and social and environmental impact assessments that typically support decisions on projects are commonly significantly different from actual ex post costs and benefits, and are therefore poor predictors of the actual value and viability of projects. Second, it is discussed how Kahneman and Tversky's theories of the planning fallacy and the outside view may help explain and remedy this situation through quality control of decisions. Third, it is described what quality control and due diligence are in the context of project management, and an eight-step procedure is outlined for due diligence based on the outside view. Fourth, the procedure is tested on a real-life, multibillion-dollar project, organized as a public–private partnership. Finally, Akerlof and Shiller's recent discussion in economics of "firing the forecaster" is discussed together with its relevance to project management. In sum, the paper demonstrates the need, the theoretical basis, a practical methodology, and a real-life example for how to de-bias project management using quality control and due diligence based on the outside view. © 2012 Elsevier Ltd. APM and IPMA. All rights reserved.

Keywords: Quality control; Due diligence; Front-end management; Planning fallacy; Outside view; Forecasting; Cost–benefit analysis; Ex-post analysis; Daniel Kahneman

The author wishes to thank Daniel Kahneman for generously taking the time to discuss, in several meetings and correspondence, the idea of quality control and due diligence of decision making based on theories of the planning fallacy and the outside view. The author also wishes to thank Alexander Budzier, Ole Jonny Klakegg, Jens Krause, Jürgen Laartz, Petter Næss, and Martin Wachs for their valuable comments on earlier versions of this paper. Research for the paper was supported in part by The Danish Council for Strategic Research.

E-mail address: [email protected].

0263-7863/$36.00 © 2012 Elsevier Ltd. APM and IPMA. All rights reserved. http://dx.doi.org/10.1016/j.ijproman.2012.10.007

1. Introduction

1.1. Third Wave research

The 2011 Oxford Handbook of Project Management identifies a "third wave" in project management research, characterized by a positioning of the management of projects as a vital part of general management (Morris et al., 2011:3). With projects taking center stage as delivery model for products and services in most organizations, project management can no longer be seen as a specialist subcategory of management brought to bear on special cases, but must instead be seen as a core business activity, vital to organizations as a whole. As part of this development from specific to general management, theory has developed from project-based theory to more general theory, substantially strengthening the academic base of project management, according to Söderlund (2011) and Turner et al. (2011). The present paper is intended as a contribution to, first, the third wave's focus on general management and, second, fortifying the theoretical basis of project management by focusing on general theory and its importance to project management.

Winter et al. (2006), Williams et al. (2009), Edkins et al. (2012), and many others have identified high-quality front-end management as crucial to good project delivery. The front-end has been singled out as the perhaps most important stage in the overall project cycle in securing the success of projects, or avoiding failure. Within the front-end, estimating expected outcomes of projects and evaluating whether they justify investing in a given project is a key activity, and maybe the most consequential one, because ideally estimates inform the decision of whether projects will go ahead or not.1 Furthermore, the quality of estimates will highly influence whether a project is deemed successful or not in meeting its objectives. Yet, as pointed out by Edkins et al. (2012:6), there is little about this in the research literature, nor about how the quality of estimation may be improved. It is the purpose of the present paper to help fill this gap in the literature.

Specifically, the paper brings Nobel prize-winning economic theory about the planning fallacy and the outside view, developed by Daniel Kahneman and Amos Tversky, to bear upon quality control and due diligence in the management of projects. Previous studies have documented a much-neglected topic in research and practice, namely that ex ante estimates of costs and benefits used in the business cases, cost–benefit analyses, and social and environmental impact assessments that typically support decisions on projects are often significantly different from actual ex post costs and benefits (Flyvbjerg, 2009). As a result, such business cases, cost–benefit analyses, and social and environmental impact assessments are typically poor predictors of the actual value and viability of projects and cannot be trusted as a basis for informed decision making. The idea proposed in the present article is:

1. to develop quality control and due diligence to establish just how reliable, or unreliable, the business case, cost–benefit analysis, and social and environmental impact assessment are for a given project, and

    2. to decide the implications of this for the project go-ahead.

Kahneman and Tversky did not develop their theories with these purposes in mind. The thesis is, however, that their theories will be highly effective in quality control and due diligence of planned projects, and thus in improving decision making on projects. The existing literature on quality control, due diligence, and risk assessment in project management does not cover this particular approach to improved decision making (Breakwell, 2007; De Meyer et al., 2002; Fischhoff, 2002; Perminova et al., 2008; Winch, 2010).

Due diligence is generally an evaluation of the assets of a company, an investment, or a person. Here, due diligence is specifically understood as an evaluation of the costs and benefits deriving from investing in a given project, and especially whether the estimated costs and benefits for that project are likely to materialize. Due diligence is thus developed to be used as quality control of business cases, cost–benefit analyses, and the go-decision in projects. Data to test the thesis are drawn from a large database on estimated and actual costs and benefits in major projects and programs. Theory and methodology are developed at a level of generalization sufficiently high to allow their use not only in business cases and cost–benefit analyses, but also in social and environmental impact assessments, which are typically also key in decision making on major projects, and which also suffer from the problem of predicted impacts often being very different from actual ones (Flyvbjerg et al., 2002:49–64).

1 This is assuming that the project has not fallen victim to lock-in at an earlier stage and that promoters are genuinely interested in understanding outcomes, which is not always the case, for instance where promoters have locked in on a certain design or where construction interests dominate operations interests (Cantarelli et al., 2010).

    2. The planning fallacy and the outside view

Flyvbjerg et al. (2002, 2005) and Flyvbjerg et al. (2002) showed that forecasts of costs and benefits of major projects are generally highly inaccurate and biased. These findings have been verified by other similar research (Arena et al., 2006; Dantata et al., 2006; Flyvbjerg and Stewart, 2012; Gardener and Moffat, 2008; Merrow, 2011; Moløkken and Jørgensen, 2003; Scott et al., 2011; Williams et al., 2009). Kahneman and Tversky (1979a, 1979b), in their Nobel-prize-winning work on decision making under uncertainty, argued that such inaccuracy is caused by a systematic fallacy in decision making causing people to underestimate the costs, completion times, and risks of planned actions, whereas people overestimate the benefits of the same actions. Following and expanding upon Buehler et al. (1994), Lovallo and Kahneman (2003:58) would later call such common behavior the "planning fallacy."2 Kahneman (1994) argued that this fallacy stems from actors taking an "inside view," focusing on the constituents of the specific planned action rather than on the outcomes of similar actions already completed. Kahneman also identified a cure to the fallacy, namely taking an "outside view" on planned actions, which consists in using experience from previous similar ventures already completed, including (a) the average outcome in sets of such ventures and (b) distributional information about outcomes. Distributional information is here understood as data on variation from the average outcome, for instance as expressed in common statistical measures such as standard deviation and variance. For instance, students asked how long they will take to write their thesis typically significantly underestimate the time, trying to understand and estimate the task inside-out. If they would simply collect data and calculate the average of how long it took 10–15 fellow students in last year's class in the same program, they would arrive at a much more accurate estimate, according to Kahneman. Below we will see how doing so in a systematic fashion may be used for performing due diligence of cost and benefit estimates, and for uncovering just how biased and error-prone such estimates typically are.
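To make the mechanics of the outside view concrete, here is a minimal sketch in Python of Kahneman's thesis example. All numbers, and the inside-view guess of 6 months, are invented for illustration; nothing here comes from the paper's dataset.

```python
# Minimal sketch of "taking the outside view": base the estimate for a new
# venture on (a) the average outcome and (b) the distributional information
# in a reference class of similar, completed ventures.
# All numbers below are invented for illustration.
from statistics import mean, stdev

# Actual months taken by 12 comparable past theses (the reference class).
reference_class = [11, 14, 9, 16, 12, 18, 13, 15, 10, 17, 14, 12]

inside_view_guess = 6  # a typical inside-out estimate, made task by task

outside_view_avg = mean(reference_class)  # (a) average outcome
outside_view_sd = stdev(reference_class)  # (b) variation around the average

print(f"Inside view:  {inside_view_guess} months")
print(f"Outside view: {outside_view_avg:.1f} months (sd {outside_view_sd:.1f})")
```

The point of the sketch is that the outside view requires no modeling of the task itself, only the two pieces of distributional information from completed cases.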

The theoretical and methodological foundations of the outside view were first described by Kahneman and Tversky (1979b). As explained in Flyvbjerg (2006), thinking about the outside view was originally developed to compensate for the type of cognitive bias that Kahneman and Tversky had found in their work. This work shows that errors of judgment are often systematic and predictable rather than random, manifesting bias rather than confusion, and that any corrective prescription should reflect this. It also shows that many errors of judgment are shared by experts and laypeople alike. Finally, the work demonstrates that errors remain compelling even when one is fully aware of their nature. Thus awareness of a perceptual or cognitive illusion does not by itself produce a more accurate perception of reality, according to Kahneman and Tversky (1979b:314). Awareness may, however, enable one to identify situations in which the normal faith in one's impressions must be suspended and in which judgment should be controlled by a more critical evaluation of the evidence. Here it is argued that due diligence is a method for such critical evaluation. Human judgment, including in developing business cases and forecasts for projects, is biased. Due diligence is a method for systematically and reliably assessing how large biases are and for de-biasing business cases and other forecasts.

2 Buehler et al. (1994) originally used the term "planning fallacy" to explain error and bias in schedule estimates. Later, the term has been used to also explain error and bias in cost and benefit estimates, in addition to schedule estimates.

With Kahneman and Tversky (1979b) it is argued that the prevalent tendency for humans to underweigh or ignore distributional information is perhaps the major source of error in forecasting and decision making. "The analysts should therefore make every effort to frame the forecasting problem so as to facilitate utilizing all the distributional information that is available," say Kahneman and Tversky (1979b:316). This may be considered the single most important piece of advice regarding how to increase accuracy in business cases and forecasting through improved methods (as opposed to through improved incentives, see Flyvbjerg, 2007a). Using such distributional information from other ventures similar to that being forecasted is taking an "outside view," and it is the cure to the planning fallacy, according to Kahneman and Tversky. Due diligence as described below is a method for consistently taking an outside view on planned actions.

The contrast between inside and outside views has been confirmed by systematic research (Buehler et al., 1994; Gilovich et al., 2002). The research shows that when people are asked simple questions requiring them to take an outside view, their forecasts become significantly more accurate. For example, a group of students enrolling at a college were asked to rate their future academic performance relative to their peers in their major. On average, these students expected to perform better than 84% of their peers, which is logically impossible. The forecasts were biased by overconfidence. Another group of incoming students from the same major were asked about their entrance scores and their peers' scores before being asked about their expected performance. This simple diversion into relevant outside-view information, which both groups of subjects were aware of, reduced the second group's average expected performance ratings by 20%. That is still overconfident, but it is significantly more realistic than the forecast made by the first group (Lovallo and Kahneman, 2003:61).

However, most individuals and organizations, including project managers, cost engineers, and risk departments, are inclined to adopt the inside view in planning new projects. This is the conventional and intuitive approach. The traditional way to think about a complex project is to focus on the project itself and its details, to bring to bear what one knows about it, paying special attention to its unique or unusual features, trying to predict the events that will influence its future. The thought of going out and gathering simple statistics about related projects seldom enters a planner's mind, according to Kahneman and Tversky, who use the term "planner" to denote any individual or organization contemplating future actions. For this reason, biases in decision making are persistent, and de-biasing will not happen of its own accord. For de-biasing to take place, incentives and methods for quality control of projects and decisions must be in place, like those described below.

While understandable, planners' preference for the inside view over the outside view is unfortunate. When both forecasting methods are applied with equal skill, the outside view is much more likely to produce a realistic estimate, as shown by the simple experiments conducted by Kahneman and others. That is because the outside view bypasses cognitive and political biases such as optimism bias and strategic misrepresentation and cuts directly to outcomes (Flyvbjerg, 2007a). In the outside view, project planners and forecasters are not required to make scenarios, imagine events, or gauge their own and others' levels of ability and control, so they cannot get all these things wrong. Human bias is bypassed. Undoubtedly, the outside view, being based on historical precedent, may fail to predict extreme outcomes, that is, those that lie outside all historical precedents. But for most projects, the outside view will produce more accurate results. In contrast, a focus on inside details will lead to inaccuracy and poor decisions.

The comparative advantage of the outside view will be most pronounced for non-routine projects, understood as projects that managers and decision makers in a certain organization or locale have never attempted before, like building greenfield infrastructure and plant or catering to new types of demand. It is in the planning of such new efforts that the biases toward optimism and strategic misrepresentation are likely to be largest. To be sure, choosing the right class of comparative past projects would become more difficult when planners are forecasting initiatives for which precedents are not easily found, for instance the introduction of new and unfamiliar technologies. However, most projects, and especially major ones, are both non-routine locally and use well-known technologies that have been tried out elsewhere. Such projects would, therefore, be particularly likely to benefit from the outside view.

Below, it is demonstrated how project managers and decision makers may use the outside view to perform due diligence of business cases, cost–benefit analyses, and social and environmental impact assessments for planned ventures.

    3. Potentials and barriers for the outside view

As argued in Flyvbjerg (2007a), two types of explanation best account for forecasting inaccuracy: optimism bias and strategic misrepresentation. The outside view was originally developed to mitigate optimism bias, but it may help mitigate any type of bias, including strategic bias, because the outside view bypasses such bias by cutting directly to empirical outcomes and building conclusions about future events on these. Even so, the potentials for and barriers to using the outside view will be different in situations where (1) optimism bias is the main cause of inaccuracy as compared to situations where (2) strategic misrepresentation is the chief reason for inaccuracy. We therefore need to distinguish between these two types of situation when endeavoring to apply the outside view in practice.

In the first type of situation, where optimism bias is the main cause of inaccuracy, we may assume that managers and forecasters are making genuine mistakes and have an interest in improving accuracy. Consider, for example, the students mentioned above, who were asked to estimate their future academic performance relative to their peers. We have no reason to believe that the students deliberately misrepresented their estimates, because they had no interest in doing so and were not exposed to pressures that would push them in that direction. The students made honest mistakes, which produced honest, if biased, numbers regarding performance. And, indeed, when students were asked to take into account outside-view information, we saw that the accuracy of their estimates improved significantly. In this type of situation, when forecasters are honestly trying to gauge the future, the potential for using the outside view to improve outcomes is good. Managers and forecasters will welcome the outside view, and barriers to using it will be low, because no one has reason to be against a methodology that will improve their forecasts.

In the second type of situation, where strategic misrepresentation is the main cause of inaccuracy, differences between estimated and actual costs and benefits are best explained by political and organizational pressures. Here managers and forecasters would still need the outside view if accuracy were to be improved, but managers and forecasters may not be interested in this because inaccuracy is deliberate. Biased forecasts serve strategic purposes that dominate the commitment to accuracy and truth. Consider, for example, the case of urban rail. Here, the assumption of innocence regarding estimates typically cannot be upheld. Cities compete fiercely for approval and for scarce national funds for such projects, and pressures are strong to present business cases as favorably as possible, that is, with low costs and high benefits, in order to beat the competition. There is no incentive for the individual city to de-bias its forecasts, but quite the opposite. Unless all other cities also de-bias, the individual city would lose out in the competition for funds. Managers are on record confirming that this is a common situation (Flyvbjerg and Cowi, 2004:36–58). The result is the same as in the case of optimism: managers promote ventures that are unlikely to perform as promised. But the causes are different, as are possible cures.

Under these circumstances the potential for using the outside view is low, because the demand for accuracy is simply not present, and barriers are high. In order to lower barriers, and thus create room for the outside view, incentives must be aligned to reward accurate forecasts and punish inaccurate ones, for instance by allocating forecasting risk directly to investors so they will carefully scrutinize the accuracy of forecasts before deciding to invest in a project. Below, we will see an example of how this works, in the case study of the A-Train. Furthermore, to curb strategic forecasts, governments and other funders may have to make the outside view mandatory and make funding contingent on the application of this particular methodology, as has been done in the UK and Denmark (Danish Ministry for Transport, 2008; UK Department for Transport, 2006). However, better methods alone will not solve the problem, because any method can be gamed (Winch and Maytorena, 2011:354). Better methods need to go hand in hand with better incentives.3

Finally, promoters' forecasts should be made subject to quality control and due diligence by other parties, including banks and independent bodies such as national auditors or independent analysts. Such bodies would need the outside view or similar methodology to do their work. Projects that were shown to have inflated benefit–cost ratios would not receive funding or would be placed on hold. The higher the stakes, and the higher the level of political and organizational pressures, the more pronounced will be the need for measures like these. In the next section we will see what quality control and due diligence based on the outside view might look like in practice.4

3 The wider issue of the politics of business case development is detailed in Flyvbjerg, Bruzelius, and Rothengatter (2003) and Flyvbjerg et al. (2005), including the design and implementation of incentive structures that would make decision making well-informed and accountable.

4 The existence of strategic misrepresentation does not exclude the simultaneous existence of optimism bias, and vice versa. In fact, it is realistic to expect such co-existence in forecasting in large and complex ventures and organizations. This again underscores the point that improved methods and measures of accountability must go hand in hand if the attempt to arrive at more accurate forecasts is to be effective.

    4. What is due diligence?

Due diligence is a term used for the performance of an investigation of a business, an investment, or a person, with a certain standard of care. Due diligence may be a legal obligation, but the term more commonly applies to voluntary investigations. Common examples of due diligence are the process through which a potential buyer evaluates a target company or its assets for acquisition, or the process through which a potential investor evaluates a major investment for its costs and benefits. The theory behind due diligence holds that performing this type of investigation contributes significantly to informed decision making by enhancing the amount and quality of information available to decision makers and by ensuring that this information is systematically used to deliberate in a reflexive manner on the decision at hand and all its costs, benefits, and risks (Bing, 2007; Chapman, 2006).

Here, focus will be on the process through which a potential investor, public or private, may evaluate a major planned investment for its costs, benefits, and risks. To perform due diligence of the estimated costs and benefits for a planned project involves an evaluation of whether the cost and benefit forecasts in the project's business case are likely to be over- or underestimated, and by how much. Like other forecasters, cost and benefit forecasters are faced with the following two basic questions:

1. What is the expected value of the variable being forecasted (here costs and benefits of the project in question)?

2. What is the variance of the expected value, i.e., the likelihood that the value will be, say, 5, 10, or 50% lower or higher than the expected value?


To perform due diligence of a given cost or benefit forecast, i.e., to evaluate whether the forecast is likely to be over- or underestimated, is equivalent to judging how well these two questions have been addressed by the forecasters. Below, it is demonstrated how this may be done in practice.

For a given cost or benefit forecast, the specific steps involved in performing due diligence based on the outside view are:

1. Identification and description of the business case or forecast to be evaluated.

2. Establishing a benchmark that represents the outside view, against which performance may be measured.

3. Using the benchmark to evaluate performance in the forecast in question.

4. Checking the forecaster's track record from other, similar, forecasts.

5. Identifying further cost and benefit risks.

6. Establishing the expected outcome.

7. Soliciting comments from the forecaster.

8. Concluding as to whether the forecast is over- or underestimated, and by how much.

The outside view, as described above, is brought into due diligence by establishing the benchmark used in due diligence on the basis of performance in a group of projects similar to the one being investigated in the due diligence. Below it is shown how this is done in practice, and we will see how each of the eight steps is carried out. Finally, we will discuss accuracy in cost and benefit forecasting and how it may be improved.
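As a rough illustration of how steps 2, 3, and 6 might be mechanized, consider the following Python sketch. The structure, function names, and reference-class ratios are assumptions made for illustration; the ratios are invented, though chosen to average 0.59, echoing the benchmark average reported later in the paper.

```python
# Illustrative sketch of steps 2, 3, and 6 of the due diligence procedure:
# build an outside-view benchmark from actual/forecast ratios of completed
# projects, then use it to de-bias a new forecast. Names and data are
# illustrative assumptions, not the paper's dataset.
from statistics import mean, stdev, quantiles

def build_benchmark(ratios):
    """Step 2: benchmark = distributional information from a reference class."""
    return {
        "mean": mean(ratios),
        "sd": stdev(ratios),
        "quartiles": quantiles(ratios, n=4),  # 25th, 50th, 75th percentiles
    }

def expected_outcome(forecast, benchmark):
    """Step 6: expected outcome = forecast scaled by average benchmark accuracy."""
    return forecast * benchmark["mean"]

# Invented actual/forecast demand ratios for ten completed rail projects.
ratios = [0.40, 0.55, 0.60, 0.50, 0.70, 0.45, 0.80, 0.65, 0.35, 0.90]

bench = build_benchmark(ratios)
print(bench)                                     # step 3: compare a forecast's
                                                 # assumptions against this
print(expected_outcome(14.1, bench), "million")  # step 6: de-biased estimate
```

Steps 1, 4, 5, 7, and 8 are investigative rather than computational and are carried out as described in the case below.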

It should be emphasized that due diligence and quality control are not about eliminating uncertainty from business cases and forecasts. That is impossible. Uncertain events will always influence actual outcomes and thus the accuracy of forecasts. For instance, air fares change and influence rail patronage, or fuel prices change and affect demand for highways. This may be called inherent uncertainty. Nevertheless, much uncertainty, and especially bias, is not inherent but is caused by avoidable error and strategic behavior. Due diligence is aimed at reducing uncertainty by getting a clear picture of the size and types of uncertainty that apply to specific decisions and forecasts.

    5. Due diligence in practice

This section considers a case of typical state-of-the-art forecasting for a major capital project, namely a planned multibillion-dollar rail investment project. For reasons of data and presentation, the focus is on benefits and demand. However, the issues and methodology covered are general and apply to all types of costs and benefits and to social and environmental impacts. The project has been anonymized for reasons of confidentiality and is here called the A-Train. The delivery model for the A-Train is a government-sponsored public–private partnership with a private concessionaire responsible for building and operating the service. This type of delivery model is today common for major infrastructure development in many parts of the world.

Demand forecasts are important to project managers and decision makers because they influence both cost and revenue risks of projects. Forecasts affect basic design decisions because they determine capacity and size of the facilities and systems being planned; for rail, for instance, number of tracks, number and length of trains, length of station platforms, and size of stations. An inflated passenger forecast for Bangkok's US$2 billion Skytrain resulted in a greatly oversized system, with less than half of the projected passengers, station platforms too long for the shortened trains, many trains and cars sitting in the garage because there was no need for them, terminals too large, etc. (Flyvbjerg et al., 2005:132). Similarly, in areas of low population density, rail stations may require extensive car parking. If ridership forecasts are inflated, tens of thousands of parking spaces may be built that may not be needed, at a cost of tens of thousands of dollars per space. That would mean piling cost upon cost to meet fictional demand. The impact of the inflated forecast is that revenues will be lower than expected and costs higher than necessary, increasing viability risks. Given the high cost of major infrastructure projects, the irreversibility of decisions, and the limited availability of resources, it is crucial to focus on the quality of demand forecasts.

In public–private partnerships there is special focus on demand risk, because such risk typically needs to be allocated, first, between the public and private sectors, and second, between individual parties within those sectors, e.g., between local and national government and between several private investors. So, too, for the A-Train, where size and allocation of demand risk was a key issue. The question of whether the project's demand forecast may be over- or underestimated therefore became central to the project management process, and especially to parties who were asked to take on demand risk as part of their investment.

In what follows, due diligence is used to answer the question of whether the A-Train's demand forecast may be over- or underestimated. The due diligence was initiated by an investor who was invited to fund the A-Train and to take on a substantial part of the project's demand risk.

Again, the methodology used is general, although the data presented are particular to rail. For the method to work for other types of projects than rail, data for those types of projects would simply have to be substituted for the rail data, in both the forecast to be evaluated and the benchmark used to carry out the evaluation.

Each of the eight steps in due diligence listed above is covered in turn below.

6. The case of the A-Train

As part of step one, identification and description of the forecast to be evaluated, the sponsors' demand forecast for the A-Train is summarized in Table 1 and explained in the main text.

The sponsors' demand forecast, made for the A-Train final business case, predicted 14.1 million passengers for the opening year, growing to 17.7 million passengers per year ten years later. The sponsors and their forecaster considered this forecast to be the expected outcome, taking into account the balance of probabilities of all the different risks and uncertainties in the modeling and forecasting process. Thus the sponsors considered the chance to be 50:50 for each of the events that the forecast would be exceeded and not be reached, respectively.

As an estimate of variance of the expected outcome, the sponsors presented what they called a "downside case" for demand and revenues, with 15% lower demand and revenues than the main forecast. The sponsors stated they had 95% confidence that these figures would be exceeded, which is equal to a 5% risk of a demand or revenue shortfall of 15% or more. This is equivalent to a standard deviation of forecasting outcomes of 9.1%. The downside case was produced by the sponsor's forecasters on the basis of their experience from Monte Carlo simulations of risk in other projects, according to the forecasters.
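The 9.1% figure can be reconstructed if one assumes, as the sponsors' numbers imply, that forecasting outcomes are normally distributed: a 5% risk of a shortfall of 15% or more places the 15% figure at the 95th percentile, i.e., at about 1.645 standard deviations from the mean. The normality assumption is mine, but it reproduces the sponsors' figure:

```python
# Reconstruction of the sponsors' implied standard deviation, assuming a
# normal distribution of outcomes: a 5% risk of a shortfall of 15% or more
# puts 15% at the 95th percentile, i.e., at z = 1.645 standard deviations.
from statistics import NormalDist

z_95 = NormalDist().inv_cdf(0.95)  # about 1.645
sigma = 0.15 / z_95
print(f"Implied standard deviation: {sigma:.1%}")  # prints 9.1%
```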

Finally, the sponsors of the A-Train estimated a five-year ramp up profile for demand, beginning with 60% of the forecast in Year 1 and increasing to 100% in Year 5, as shown in Table 1. Ramp up denotes a delay in demand take-up during the first months or years of a new service. The delay may arise due to the slow spread of knowledge about the service, inertia in changing travelers' behavior, initial operational hiccups, and the like. According to the forecasters, the ramp up profile was established on the basis of a limited amount of available data from other systems and a degree of professional judgment with the forecasters. The data claimed to support the ramp up profile was not provided with the forecast.

Table 1. Sponsors' demand forecast for the A-Train, final business case. Predicted demand, downside case, and ramp up profile are shown as estimated by the sponsors' forecaster. See main text for discussion.

Item                                  Forecast
Predicted demand                      14.1 million passengers, first year
(expected outcome)                    17.7 million passengers, after 10 years
Downside case (variance, risk)        Standard deviation: 9.1%
                                      95% confidence level: 15% passenger shortfall
                                      Likelihood of passenger shortfall of 15% or more: 5%
Ramp up profile (first five           Year 1: 60% of forecast
years after going live)               Year 2: 75% of forecast
                                      Year 3: 85% of forecast
                                      Year 4: 95% of forecast
                                      Year 5: 100% of forecast

    7. Best or average practice as benchmark?

Step two of due diligence is establishing a benchmark against which individual forecasts may be measured, including the A-Train forecast presented above. This is perhaps the most consequential step of the due diligence exercise. Benchmarks typically establish best, average, or median practice. In cost and benefit forecasting, best practice is equivalent to being on target, i.e., to produce a forecast with zero or little error, meaning a forecast that is neither too high nor too low. Although best practice is clearly an ideal to be strived for, in forecasting it is not a useful benchmark for due diligence. This is because benchmarking against best practice would give us only one piece of information, namely how far the forecast in question is from the ideal, i.e., from zero error. In this case we would obtain none of the distributional information that Kahneman and Tversky argue is crucial for understanding forecasting problems and getting forecasts right. For instance, using best practice we would not know how often best practice is actually achieved in forecasting and how far from best practice forecasts usually are. In what follows, the benchmark is therefore established on the basis of average and median practice, not best practice. As we will see, average and median practice in rail demand forecasting is far from best practice.

Measured by sample size, Flyvbjerg et al. (2005) contains the most comprehensive study that has been published so far of accuracy in travel demand forecasting. The study is based on a sample of 210 transportation infrastructure projects, of which 27 are rail projects. For 23 of the 27 rail projects, demand forecasts were overestimated. Table 2 shows the main results from the study as regards rail, with and without 2 statistical outliers. Without statistical outliers, actual demand is on average 51.4% lower than forecasted demand. This is equivalent to rail demand forecasts being overestimated by 106% on average. Just as important, standard deviations are high, indicating a high risk of large forecasting errors.

Table 2. Accuracy in 27 rail demand forecasts. Accuracy is measured as actual divided by forecast demand for the first year of operations. The standard deviation is measured for actual divided by forecast demand. (Source: Flyvbjerg et al., 2005.)

                                                   Average accuracy   Standard deviation
27 rail projects                                   0.61               0.52
25 rail projects, excluding statistical outliers   0.49               0.28
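Note the conversion used throughout between the two ways of expressing the same ratio: accuracy is actual divided by forecast, while the overestimate is forecast divided by actual, minus one. A quick check against the Table 2 figures:

```python
# Conversion between "accuracy" (actual/forecast) and "overestimate"
# (forecast/actual - 1), checked against the Table 2 figures.
def overestimate(accuracy):
    return 1.0 / accuracy - 1.0

accuracy = 1.0 - 0.514  # actual demand on average 51.4% below forecast
print(f"{overestimate(accuracy):.0%}")  # prints 106%, as stated in the text
```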

For the present study, the 2005 sample was expanded to cover 475 transportation infrastructure projects, of which 62 are rail projects that are comparable to the A-Train, in the sense that they were built using technologies and under regulatory regimes comparable to those that apply to the A-Train. The dataset is the largest of its kind. All large transportation infrastructure projects for which valid and reliable data on forecasting accuracy were available have been included in the set. For a detailed description of the data, see Flyvbjerg (2005) and Flyvbjerg et al. (2005).

Table 3 shows results from the new, expanded study. For 53 of the 62 rail projects in the dataset, demand forecasts were overestimated. Provided that an estimate was high, the average accuracy of the demand estimate was 0.50 for these projects; that is, the overestimate was 100% on average, and half the forecasted passengers never showed up on the trains. Accuracy in the new, larger sample of forecasts is approximately the same as in the earlier, smaller sample. Without statistical outliers, accuracy is 10 percentage points higher in the larger sample than in the smaller one, although accuracy is still low, with actual demand being 41% lower than forecast demand on average. This is equivalent to rail demand forecasts being overestimated by 69% on average. Again standard deviations are high, indicating high risks of large errors in rail demand forecasting.

Table 3. Accuracy in 62 rail demand forecasts. Accuracy is measured as actual divided by forecast demand for the first year of operations. The standard deviation is measured for actual divided by forecast demand. (Source: Flyvbjerg database.)

                                                  Average accuracy   Standard deviation
62 rail projects                                  0.63               0.41
61 rail projects, excluding statistical outlier   0.59               0.33

Table 4 shows the quartiles for the expanded study with and without the statistical outlier. The difference between including and excluding the outlier is small. In what follows the figures without the outlier are used, taking into account the possibility that the outlier may be caused by measurement error.

Table 4. Quartiles for accuracy in 62 rail demand forecasts. Accuracy is measured as actual divided by forecast demand for the first year of operations. (Source: Flyvbjerg database.)

                                                  25%    50%    75%
62 rail projects                                  0.36   0.54   0.79
61 rail projects, excluding statistical outlier   0.35   0.51   0.78

The table shows that for 25% of rail projects actual demand is 65% lower or more than forecast demand, equal to an overestimate in forecasts of 186% or more in a quarter of cases. For half the projects in the sample actual demand is 49% lower or more than that forecast, equal to an overestimate of 96% or more. Finally, the table shows that for 75% of rail projects actual demand is 22% lower or more than forecast, equal to an overestimate in demand forecasts of 28% or more for these projects.

The numbers show that the risk of overestimates in rail demand forecasts is high. For instance, the risk is 50:50 of an overestimate of 96% or more. It is the data for these 62 rail demand forecasts that are used as a benchmark against which other rail demand forecasts may be measured, including the A-Train forecast.

    8. Applying the outside view

Step three in due diligence based on the outside view is to employ the benchmark to evaluate performance for the forecast in question. As mentioned above, research by Daniel Kahneman, Amos Tversky, and others has established that a major source of error in forecasts is a prevalent tendency among forecasters to underweigh or ignore variance of outcomes, and thus risk. This is to say that not only do forecasters err in estimating the expected outcome of the variable they are forecasting. More often than not they make even larger errors in estimating the variance (typically measured in standard deviations) of the expected outcome. Here we therefore emphasize how to benchmark variance in a business case forecast, again with rail and the A-Train demand forecast as examples of the general problematic.

Table 5 compares the A-Train downside case with the benchmark established above. The table shows that:

1. The standard deviation of accuracy in the benchmark is 33%, compared with 9.1% in the A-Train downside case.

2. The 95% confidence level is equivalent to a demand shortfall of 85% in the benchmark, compared with 15% in the downside case.

3. The likelihood of a demand shortfall of 15% or more is 80% in the benchmark, compared with 5% in the downside case.

Table 5. Main results from benchmarking of A-Train downside case against outcomes of 61 actual rail demand forecasts.

                                                A-Train downside case   Benchmark (61 forecasts)
Standard deviation                              9.1%                    33%
95% confidence level                            15% shortfall           85% shortfall
Likelihood of demand shortfall of 15% or more   5%                      80%

Compared with the benchmark, the standard deviation of the A-Train demand forecast appears to be highly underestimated. As mentioned, this is a common error in forecasting, and it results in forecasts that are optimistic in the sense that the actual range of possible outcomes, i.e., actual uncertainty, is larger than the estimated range of possible outcomes, i.e., estimated uncertainty. The A-Train demand forecast is highly optimistic in this sense. To illustrate, the risk of a demand shortfall of 15% or more is a full 16 times higher in the benchmark than in the A-Train demand forecast.
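The benchmark's 80% likelihood of a 15%-or-more shortfall presumably comes from the empirical distribution of the 61 forecasts; a normal approximation using the Table 3 mean and standard deviation, sketched below as an assumption of mine, lands close to it and reproduces the 16-fold risk ratio:

```python
# Approximate check of the benchmark's likelihood of a demand shortfall of
# 15% or more, i.e., accuracy <= 0.85, under a normal approximation with the
# Table 3 figures (mean 0.59, sd 0.33). The paper's 80% is empirical; the
# normal approximation is an assumption made here for checking purposes.
from statistics import NormalDist

p = NormalDist(mu=0.59, sigma=0.33).cdf(0.85)
print(f"P(shortfall >= 15%): {p:.0%} in benchmark vs 5% in downside case")
print(f"Risk ratio: {p / 0.05:.0f}x")  # about 16x, as stated in the text
```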

    9. Dealing with demand ramp up

For a demand forecast, step 3 also includes evaluating the ramp up of patronage. Ramp up denotes the delay in demand take-up during the first months or years of a new service, as mentioned above. The accuracy of ramp up is crucial to realizing forecasted revenues. Getting ramp up wrong has been known to seriously jeopardize the viability of projects. The practice in business cases and cost–benefit analyses of discounting future benefits and costs to net present values makes the accuracy of ramp up even more important, because first-year revenues and benefits are attached the greatest weight in such discounting. Furthermore, public and media scrutiny will also be at its highest during the first period of operation, i.e., the ramp up period, making accurate estimates of ramp up important to good public relations and corporate reputation.

Table 6 compares estimated ramp up for the A-Train with actual ramp up in the 11 rail projects for which comparable ramp up data was available from the dataset. The 60% starting point assumed for the A-Train ramp up profile in year one compares to 41% actually realized for the 11 projects for which data are available. This is an overestimate for the A-Train of 46% for year one, compared with the benchmark. For the ramp up profile itself, the A-Train forecast assumes an increase of 40 percentage points over five years, from 60 to 100%, compared to 14 percentage points actually realized in the benchmark, from 41 to 55, which amounts to an overestimate by a factor 2.86, compared with the benchmark.

Table 6. Estimated ramp up profile for the A-Train and actual average ramp up for 11 other rail projects. Actual demand in percent of forecast demand. See main text for detailed explanation.

Year   A-Train (planned)   11 rail projects (actual)
1      60                  41
2      75                  49
3      85                  68
4      95                  51
5      100                 55
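The ramp-up comparison reduces to simple arithmetic on the Table 6 columns; the following lines reproduce the 46% year-one overestimate and the 2.86 factor:

```python
# Arithmetic behind the ramp-up comparison in Table 6 (percent of forecast).
planned = {1: 60, 2: 75, 3: 85, 4: 95, 5: 100}  # A-Train profile
actual  = {1: 41, 2: 49, 3: 68, 4: 51, 5: 55}   # 11 benchmark projects

print(f"Year-one overestimate: {planned[1] / actual[1] - 1:.0%}")  # 46%

rise_planned = planned[5] - planned[1]  # 40 percentage points
rise_actual = actual[5] - actual[1]     # 14 percentage points
print(f"Slope overestimated by factor {rise_planned / rise_actual:.2f}")  # 2.86
```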

Table 6 shows that five years after going live, actual demand in the 11 projects was approximately half of the forecasts. The data thus does not support the estimated ramp up for the A-Train. The table shows that with 45% of demand having not materialized after five years of operations, it hardly makes sense to even speak of ramp up for the 11 projects. Rather, we have a case of consistent underperformance as regards later-year demand, just as first-year demand turned out substantially lower than forecasted. This conclusion holds both for ramp up for each of the 11 projects and for average ramp up for all 11 projects.

Four published studies on ramp up in rail and road schemes support this conclusion and thus do not support the A-Train ramp up profile (Flyvbjerg, 2005; Pickrell, 1990; Standard and Poor's, 2005; UK National Audit Office, 2004). All four studies found that error and bias in demand forecasting is not caused by forecasters not considering ramp up, but by actual ramping up, if it exists at all, being far less pronounced and taking many more years than assumed by forecasters. Beyond ramp up, many projects still fail to meet demand forecasts. The studies even documented negative ramp up for a number of projects, i.e., actual demand dropped over time.

Based on this evidence, most likely the A-Train demand forecast is overestimated as regards ramp up. The available evidence indicates that this is common for rail demand forecasts and is a major source of bias in such forecasts.

As mentioned above, the data claimed to support the A-Train ramp up profile was not provided with the forecast. Nor were the data available upon request, even to a potential investor, who was invited to invest substantially in the project and to take on a significant part of the demand and ramp-up risks. This was seen as an indication that the data may not exist but might simply be postulated by the forecaster, which undermined trust in the forecast and the forecaster.

    10. Checking the forecaster's track record

Step four in the method for due diligence followed here consists in checking the forecaster's track record from other, similar, forecasts. Forecasters typically attempt to lend credibility to their bid to do a forecast for a specific project by listing forecasting experience from other similar projects. But rarely do forecasters provide ex post data on the actual accuracy of forecasts made in the past. If asked to provide such evidence of actual track record, forecasters typically reply that the evidence is confidential and cannot be released (a common explanation for private projects) or that the evidence is not available, because they have done no systematic follow-up studies of accuracy, and neither have others (a common explanation for public projects). Either way, evidence of a documented track record of accuracy of past predictions is typically not available from forecasters.

So, too, for the A-Train forecaster, which is a large consultancy with global presence and extensive experience in demand forecasting. For the A-Train forecast, they listed credentials from more than 20 previous, similar forecasts they had done. But when asked for evidence of track record for the accuracy of these forecasts, no evidence was provided. The reasons given were the same as above, i.e., no ex post data had been collected, or such data were confidential. The party asking for evidence was a potential major investor, who would take on substantial demand risk, should they decide to invest in the A-Train. The forecaster's inability or unwillingness to provide evidence of the actual accuracy of their forecasts therefore meant, in effect, that the investor was asked to commit hundreds of millions of dollars to the A-Train venture on sheer belief and trust that the demand forecast would be accurate. But "Trust me!" has historically not been the best basis for handing over large sums of money to strangers to invest, which is why due diligence is important, of course.

For the A-Train, the evaluation above shows that such trust would clearly be unwarranted. Moreover, and unfortunately for the forecaster, the dataset used here to establish the benchmark for the due diligence contains information about track record for two of the rail projects mentioned by the A-Train forecaster on their credentials list. These two projects, each a multibillion-dollar venture, performed as shown in Table 7.

Table 7. Demand overestimate in percent for two rail forecasts done by the A-Train forecaster. Each figure shows by how many percent that variable was overestimated. Mode share is the percentage of total travellers expected to use rail.

Project     First-year demand   Later-year demand   Mode share for rail   Comment by National
            overestimate, %     overestimate, %     overestimate, %       Audit Office
Project 1   500                 250 (4th year)      350                   n.a.
Project 2   175                 150 (5th year)      n.a.                  "completely failed," "grossly overestimated"

Table 7 shows that the forecaster's track record for the two projects is significantly poorer than the benchmark, which is already quite poor, with a median overestimate of 96% (see above).

The national audit office in the country where Project 2 is located decided to commission an in-depth ex-post study of benefits and costs of the project, when demand shortfalls proved the project financially nonviable and a national scandal ensued, leaving high-ranking politicians and officials with egg on their faces. The study found that the demand forecasts "completely failed to capture development in the [rail passenger] market at [the location of the rail project]." The study also found that "the private consortium [bidding for the link] was presented with grossly overestimated [demand] forecasts," which was [later] criticized [by the National Audit Office] but without response [from the forecaster]. Thus the private consortium that funded Project 2 was misled by the demand forecast, and the consortium, to their later grief, did not carry out due diligence for the project, like the one described here for the A-Train. As a result the project went bankrupt and the consortium lost a fortune. When the National Audit Office criticized the forecaster for this, the forecaster simply ignored the critique and did not bother to answer the auditor, according to the audit.


In the due diligence of the A-Train demand forecast, trust in the A-Train forecaster and forecast suffered from this combination of:

1. The forecaster's unwillingness or inability to document accuracy for their previous forecasts.

2. The forecaster's extremely poor track record for the few projects for which forecasting accuracy could be established.

3. The forecaster being discredited in public by a national audit office for having "completely failed" and not even responding to the critique when invited.

In sum, for the A-Train, the available information regarding the forecaster's track record does not support the forecaster's claim that the A-Train demand forecast would be accurate. The benchmark documents an expected demand overestimate of 69% (median 96%), and there are no a priori reasons to assume that the expected overestimate in the A-Train forecast would be lower than in the benchmark. Quite the opposite, given the fact that for the two projects on the forecaster's credentials list for which we have data, the average overestimate was 338% for the first-year estimate, i.e., 4.9 times the already dismal benchmark, and 200% for the four-to-five-year estimate.
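The figures just quoted follow directly from Table 7 and the 69% benchmark overestimate, as a quick check confirms:

```python
# How the track-record figures follow from Table 7 and the 69% benchmark
# overestimate: simple averages over the two projects.
first_year = [500, 175]   # first-year demand overestimates, %
later_year = [250, 150]   # 4th/5th-year demand overestimates, %
benchmark = 69            # average benchmark overestimate, %

avg_first = sum(first_year) / len(first_year)   # 337.5, i.e., ~338%
avg_later = sum(later_year) / len(later_year)   # 200.0
print(f"First-year average: {avg_first:.0f}% ({avg_first / benchmark:.1f}x benchmark)")
print(f"Four-to-five-year average: {avg_later:.0f}%")
```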

Given the available information, it is prudent to assume a very high risk of overestimate in the A-Train demand forecast and thus a very high risk of revenue shortfalls. Prospective investors would be well advised to require more information about the track record of the A-Train forecaster before committing funds to this project, especially if investors were asked to carry all or a substantial part of the demand risk.

In a wider perspective, forecasters' unwillingness or inability to document the accuracy of their forecasts runs directly against much-touted policies for evidence-based decision making and management, and it amounts to forecasters effectively asking their clients to buy (both in the sense of believing in and paying for) their product without evidence to substantiate that the product actually works, or has worked in the past. The type of forecasting we are studying here is globally a billion-dollar business. Very few other businesses of this size would get away with not demonstrating the track record and quality of their product. The point is that the forecasting business should also not be allowed to do this, and there are no defensible reasons why it is. It is the task of quality control and due diligence to point to this problem and to make sure the track record is documented to the extent possible, as done above.

    11. Further demand risks

Step five investigates whether further demand risks exist, on top of those already described and analyzed above. A number of circumstances specific to the A-Train pose such further risks. Other things being equal, this indicates that demand shortfalls for the A-Train are likely to be higher than average shortfalls in the benchmark. The additional demand risks for the A-Train are:

1. Urban development in the catchment area for the A-Train is dispersed and low-density. Predicted demand for the rail line is partly dependent on increased density around stations. Changes in density and land use that were expected but did not materialize are a common cause of failed demand forecasts (Transportation Research Board, 2007).

2. Part of the A-Train system runs a service to an airport located at the edge of a metropolitan area, not far from the city's business center. Research shows that this type of location indicates a higher-than-average risk for lower-than-forecasted rail use, whereas higher distances from airports to the city edge and city center indicate higher rail use (Bradley, 2005).

3. The planned fare level of the A-Train airport link seems high when compared to fare levels at other airport rail links. Thus a risk may exist that high fares could result in lower-than-forecasted demand. This risk is mitigated, however, by the fact that demand on airport rail links tends to be relatively inelastic to changes in fares, probably because air passengers generally have a high value of time (Bradley, 2005:9). To assess risk on this point would be to assess whether or not the high fares for the A-Train airport link are likely to fall within the area of inelasticity. Such assessment is pertinent but was not carried out by the forecaster.

4. Total A-Train revenues are particularly sensitive to demand risk on the airport link, because although the airport link is forecasted to carry only 10–14% of total A-Train passengers, it is forecasted to generate 35–42% of total revenues. Thus if the airport link demand forecast is overestimated, then this might significantly and negatively impact revenue for the total A-Train.

5. The database used for the present study includes data on forecasting accuracy for four airport rail links, two of which were forecast by the A-Train forecaster. The average overestimate for the four forecasts was 147%, which is more than double the benchmark. Thus the risk of demand overestimate and shortfall seems higher for airport links than for other rail links, given the available evidence.

6. Further demand risks exist that cannot be revealed, because this might disclose the identity of the A-Train. These risks are high and have been proved significant through statistical testing.

In sum, further demand risks for the A-Train indicate that demand shortfalls for the A-Train are likely to be higher than average shortfalls in the benchmark, which are already high.

12. Establishing expected outcome

Step six establishes the expected outcome of the forecast. Given the optimism documented above in the A-Train demand forecast regarding variance, ramp up, etc., it appears prudent to point out that if the estimate of expected demand is as optimistic as the estimates of variance and ramp up, then expected demand is overestimated. The large overestimates in the track record of the forecaster and the specific demand risks for the A-Train mentioned above also point in the direction of substantial risk of overestimation.

If the A-Train demand forecast is argued to be neither more nor less accurate than the average rail demand forecast, then one would expect an overestimate of 69%, as shown above for the benchmark. Expected actual demand would then be 59% of forecast demand, or 8.3 million first-year passengers instead of the 14.1 million predicted. Thus 5.8 million forecasted passengers would not show up on the trains, resulting in large overcapacity and revenue shortfalls. With 90% confidence, actual demand would lie in the interval from 15% to 110% of forecast; with 80% confidence, in the interval from 23% to 101% of forecast; and with 50% confidence, in the interval from 35% to 78% of forecast (see Table 8).

Table 8. Expected outcome and confidence intervals for A-Train demand, provided the A-Train forecast is neither more nor less accurate than demand forecasts in the benchmark.

                      Actual/forecast demand   First-year passengers, millions
Expected outcome      0.59                     8.3
With 90% confidence   0.15–1.10                2.1–15.5
With 80% confidence   0.23–1.01                3.2–14.2
With 50% confidence   0.35–0.78                4.9–11.0
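The passenger figures in Table 8 follow from scaling the 14.1 million first-year forecast by the benchmark's accuracy ratios. The sketch below takes the interval endpoints as given from the table (they come from the benchmark's empirical distribution) and performs only the scaling:

```python
# Scaling the sponsors' 14.1 million first-year forecast by the benchmark
# accuracy ratios reproduces the passenger figures in Table 8. The interval
# endpoints are taken as given from the table (empirical benchmark values).
forecast = 14.1  # million first-year passengers

rows = {
    "Expected outcome":    (0.59,),
    "With 90% confidence": (0.15, 1.10),
    "With 80% confidence": (0.23, 1.01),
    "With 50% confidence": (0.35, 0.78),
}

for label, ratios in rows.items():
    passengers = "-".join(f"{r * forecast:.1f}" for r in ratios)
    print(f"{label}: {passengers} million")
```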

Given the evidence presented above, the A-Train demand forecast is most likely less accurate than the average rail demand forecast. Demand shortfalls are thus likely to be larger than those shown in Table 8.

Table 8
Expected outcome and confidence intervals for A-Train demand, provided the A-Train forecast is neither more nor less accurate than demand forecasts in the benchmark.

                       Actual/forecast demand    First-year passengers, millions
Expected outcome       0.59                      8.3
With 90% confidence    0.15–1.10                 2.1–15.5
With 80% confidence    0.23–1.01                 3.2–14.2
With 50% confidence    0.35–0.78                 4.9–11.0
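The mechanics of step six can be sketched in a few lines: take the distribution of actual/forecast ratios in a reference class and apply it to the inside-view forecast. The ratios below are illustrative placeholders, not the database used in the paper; with the paper's database, this kind of procedure would produce the values reported in Table 8:

```python
import numpy as np

# Hypothetical reference class of actual/forecast demand ratios for
# comparable rail projects (illustrative values only; the underlying
# database used in the paper is not reproduced here).
reference_ratios = np.array(
    [0.15, 0.25, 0.35, 0.44, 0.58, 0.62, 0.70, 0.78, 0.95, 1.10])

forecast = 14.1  # inside-view forecast: million first-year passengers

# Outside view: the expected outcome is the forecast scaled by the mean
# ratio of the reference class, and the uncertainty comes straight from
# the empirical distribution of ratios rather than from the forecaster.
expected_demand = reference_ratios.mean() * forecast
lo, hi = np.percentile(reference_ratios, [5, 95]) * forecast  # ~90% interval

print(f"Expected demand: {expected_demand:.1f}m; "
      f"90% interval: {lo:.1f}m to {hi:.1f}m")
```

The design point is that the confidence intervals are read off the reference class itself, so the width of the interval reflects observed variation across real projects, not the forecaster's self-assessed uncertainty.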

    13. Do private projects perform better than public ones?

Step seven solicits comments from the forecaster regarding the previous six steps of the due diligence. When the A-Train forecasters were given the opportunity to comment on the methodology and results presented above, they made the following two claims:

1. The accuracy of demand forecasting for bidding consortia and financial institutions has improved since the mid-1990s as more privately funded schemes, such as the A-Train, have come to the market. Private funding is likely to curb bias in forecasting and result in more accurate forecasts. The A-Train demand forecast is therefore believed to be more accurate than the forecasts in the benchmark, according to the A-Train forecaster.

2. The majority of transportation infrastructure schemes, including the A-Train, have unique characteristics. It is therefore wrong to compare them to other projects, for instance comparing the A-Train project to the rail projects in the benchmark, as done above.

The first claim is covered in this section, the second in the next. However, for both claims it may be observed that the A-Train forecaster presented no evidence to support the claims, just as the forecaster presented no evidence to support the validity of the A-Train forecast itself. Again we therefore have to look elsewhere for evidence.

Table 9 shows the ratio of actual to forecast demand for privately funded rail projects in the database, built from the mid-1990s. The ratio of actual to forecast demand is 0.30 for privately funded schemes and 0.62 for public schemes, i.e., on average 30% of forecast demand materialized for private projects versus 62% for public ones. Thus inaccuracy (bias) is more than twice as high for private schemes as for public ones. Although the number of private projects for which data are available is small, clearly the data in the table do not support the A-Train forecaster's claims regarding higher forecasting accuracy for privately funded schemes. In fact, the data support the opposite conclusion, namely that demand forecasts for privately funded projects are more overestimated than those for publicly funded ones.

Table 9
Accuracy of demand forecasts for privately funded rail projects, first-year demand.

Project type          Actual/forecast demand
5 private projects    0.30
56 public projects    0.62
All projects (61)     0.59

This conclusion is supported by a large-sample study of toll roads carried out by Standard and Poor's (2004), which found that demand forecasts for privately funded roads were significantly more inaccurate (overestimated) than demand forecasts for public roads. Private ventures were found to be incentivized to inflate demand forecasts to make projects look good on paper in order to secure investor interest.

Two of the five privately funded projects in Table 9 are the two projects mentioned above under track record, for which the A-Train forecaster had themselves done the forecast, and for which first-year demand was overestimated by 175% and 500%, respectively. Not only were these forecasts more inaccurate than public-sector forecasts, they were also more inaccurate on average than the other private-sector forecasts presented in Table 9. Thus the A-Train forecaster's forecasts were the most inaccurate of all in the sample, and especially so among private-sector projects. Against this background, the forecaster's claim is undermined that the A-Train demand forecast is likely to be more accurate than the forecasts in the benchmark because it is a forecast for a privately funded project and private funding is likely to result in more accurate forecasts. It was concluded that either (a) the A-Train forecaster did not know the accuracy of their own forecasts, or (b) they knew and misinformed about it. Either way, these unfounded claims, contradicted by the data, eroded trust in the forecaster and their forecast.

For the above reasons, the forecaster's claim was rejected that, because the A-Train involves private funding, the A-Train demand forecast is likely to be more accurate than the forecasts in the benchmark.

    14. Are projects unique?

The second claim made by the A-Train forecasters under step seven of the due diligence was that projects are unique and therefore cannot be compared, including to benchmarks. This is a claim often made by forecasters and project managers. Nevertheless, the claim is wrong.5

It is correct that infrastructure projects typically have physical, locational, or other characteristics that are unique. The claim is incorrect, however, in the sense that research shows, with a very high level of statistical significance, that despite physical or other uniqueness, rail and other infrastructure schemes are statistically similar in the sense that they are highly likely to have large demand shortfalls and large cost overruns (Flyvbjerg, 2007a, 2007b). Statistical similarity is all that is needed for statistical prediction, and statistical prediction based on the outside view is an effective means to improve accuracy in forecasts, as demonstrated by Flyvbjerg (2006).

This is the approach taken by the UK Treasury (2003:1), when they observe about project appraisal:

There is a demonstrated, systematic, tendency for project appraisers to be overly optimistic. To redress this tendency appraisers should make explicit, empirically based adjustments to the estimates of a project's costs, benefits, and duration … it is recommended that these adjustments be based on data from past projects or similar projects elsewhere.

"Similar" here means statistically similar, and although the type of adjustments recommended by the UK Treasury were neglected by the A-Train forecaster, such adjustments appear to be exactly what would be needed to produce a more reliable demand forecast for the A-Train. The data from past projects and similar projects elsewhere, mentioned by the UK Treasury, are data like those presented above in the benchmark. The use of such data has been made mandatory in the UK by the UK Treasury and the UK Department for Transport (2006) to adjust for optimism bias in UK transportation cost forecasts, following the methodology described in Flyvbjerg (2006). The Danish and Swiss governments have implemented similar measures to curb unrealistic forecasts (Danish Ministry for Transport, 2008; Swiss Association of Road and Transportation Experts, 2006).

5 Recent research on major ICT projects shows that one of the most reliable predictors of larger-than-normal cost overrun is project management seeing their project as unique (Budzier and Flyvbjerg, 2011; Flyvbjerg and Budzier, 2011). This is easy to understand. Managers who see their projects as unique believe they have little to learn from other projects. They therefore accumulate less knowledge and experience to inform their decisions. This, in turn, may prove ruinous, especially in analyzing and managing risk, which, per definition, requires knowledge of variation across projects and thus a view of projects as non-unique, at least in a statistical sense.
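The "explicit, empirically based adjustments" recommended by the UK Treasury can be sketched as follows; the overrun ratios, the budget figure, and the 20% acceptable chance of overrun are all assumptions for illustration and are not taken from the Treasury guidance:

```python
import numpy as np

# Hypothetical reference class of cost overrun ratios (actual/estimated
# cost) for statistically similar past projects; illustrative values only.
overrun_ratios = np.array([0.95, 1.05, 1.12, 1.20, 1.26, 1.33, 1.45, 1.60, 1.96])

estimate = 1000.0  # assumed front-end cost estimate, in million dollars

# Reference-class-style uplift: choose the adjustment so the chance of
# exceeding the adjusted budget matches an acceptable risk, here 20%,
# i.e., take the 80th percentile of the empirical overrun distribution.
uplift = np.percentile(overrun_ratios, 80)
print(f"Uplift: {uplift:.2f}; budget with 80% certainty: {estimate * uplift:.0f}m")
```

The decision maker, not the forecaster, sets the acceptable risk of overrun; the reference class then determines how large the uplift must be to achieve it.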

15. Final step of due diligence: are forecasted benefits likely to happen?

The eighth, and final, step in a due diligence of a cost or benefit forecast is to answer the question of whether the specific forecast is likely to be over- or underestimated, and by how much. In demand forecasting, such an assessment is crucial, because demand impacts revenues, which directly impact viability for a given project. Demand also impacts costs, because demand determines the size of facilities and systems, as explained above. Finally, demand impacts the environment and thus the results of environmental impact assessments.

For the A-Train demand forecast, the answer to this question is based on the evidence presented above that the forecast is likely to be overestimated. The central arguments for a likely overestimate are:

1. Average practice in a benchmark of 61 comparable passenger demand forecasts, taken from the world's largest database of such forecasts, is an overestimate of demand of 69% (median 96%); see the conversion sketch after this list for how an overestimate maps to actual demand as a share of forecast. This is the expected overestimate for a demand forecast of this kind, unless there is evidence that the forecaster is likely to have been more accurate than the forecasters who did the forecasts in the benchmark. For the A-Train forecast such evidence does not exist; quite the opposite, as shown above.

2. Variance of the expected outcome is significantly underestimated in the A-Train demand forecast. For instance, the risk of a demand shortfall of 15% or more is 16 times higher in the empirical benchmark than in the A-Train forecast.

3. Assumptions about ramp up in the A-Train forecast are unsupported by evidence and appear to be overestimated by approximately a factor of 3.

4. The forecaster provides no track record to document accuracy in their previous forecasts. Evidence from previous forecasts done by the forecaster, for which data on accuracy could be found, shows a high degree of overestimation, significantly higher than in the benchmark, which is already high.

5. The mode share for the A-Train airport link appears to be overestimated, that is, the percentage of total travellers expected to use rail, as against the percentage expected to use car, bus, etc.

6. Issues of density, urban form, location, fare levels, and policy pose risks to demand that are larger for the A-Train than for the average project in the benchmark.

7. The forecaster's argument that demand forecasts in privately funded schemes, such as the A-Train, will be more accurate than in publicly funded ones is not supported by evidence. In fact, for privately funded schemes for which data are available, demand overestimates are more than twice as high as in other schemes.
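For readers who want the arithmetic behind point 1 spelled out, the conversion between an overestimate and actual demand as a share of forecast is, with the overestimate measured against actual demand:

```latex
% Forecast F, actual A; the overestimate O is measured against actual demand:
O = \frac{F - A}{A}
\quad\Longrightarrow\quad
\frac{A}{F} = \frac{1}{1 + O} = \frac{1}{1.69} \approx 0.59
% i.e., the benchmark's 69% average overestimate corresponds to actual
% demand of about 59% of forecast, the figure used throughout the case study.
```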

On balance, one cannot argue that the A-Train demand forecast is likely to be less overestimated than the average rail demand forecast in the benchmark. The opposite is more likely, given the available evidence, i.e., that the A-Train demand forecast is more overestimated than the benchmark, which has an average overestimate of 69% (median 96%).

However, if one gives the benefit of the doubt to the A-Train demand forecast and assumes that it is neither more nor less accurate than the average rail demand forecast, then one would predict the following:

1. An overestimate of 69%, equivalent to actual demand being 59% of that forecasted, that is, 8.3 million passengers during the first year of services instead of the forecaster's estimate of 14.1 million passengers. Thus 5.8 million of the forecasted passengers would most likely never show up on the train service.

2. A 50% chance that actual demand will fall in the interval of 35% to 78% of the forecast, i.e., 4.9–11.0 million passengers.

For the A-Train, most likely both the overestimate and the shortfall will be larger than these figures, given the evidence presented above.

In addition to the forecaster's numbers being a cause of concern, the forecaster's behavior also undermined trust when the forecaster:

1. Was unwilling or unable, upon request, to document the accuracy of their previous forecasts.

2. Was unwilling or unable, upon request, to provide documentation for their assumptions regarding ramp up, and instead justified these by professional judgment, which was later proven optimistic by actual data.

3. Argued without evidence that because the A-Train is privately funded its demand forecast is more likely to be accurate, when evidence indicates that the exact opposite is the case, including for the forecaster's own previous forecasts.

4. Argued without evidence that the A-Train is unique and may therefore not be compared to other projects, when evidence shows that such comparisons are justified and are mandatory elsewhere, including for UK government-funded projects.

5. Had their forecasts for a multibillion-dollar high-speed rail project, which was similar to the A-Train, and which went bankrupt, thoroughly debunked by a national audit office, and did not respond to the critique when invited to do so.

On this basis it was concluded that potential investors in the A-Train, public or private, would be well advised to trust neither the forecaster nor the forecast, and not to take on demand risks until demand and revenues had been reassessed in the manner described above. This advice was followed by the investor who commissioned the due diligence presented here. The investor decided not to invest in the A-Train.

    16. Fire the forecaster?

Two observations may be made about the A-Train demand forecast, as described and assessed above:

1. This is a normal, state-of-the-art demand forecast, like thousands of such forecasts made every year at substantial cost to clients.6

2. The forecast is garbage, of the type garbage-in-garbage-out. In fact, the forecast can be argued to be worse than garbage, because it gives the client, investors, and others who use the forecast the impression that they are being informed about future demand when, in fact, they are being misinformed. Instead of reducing risk in decision making, forecasts like this increase risk by systematically misleading decision makers and investors about the real risks involved.

6 Most likely, the forecast will prove to be less accurate than the benchmark, as explained above, but the forecast is unlikely to be a statistical outlier.

The latter was the point made by the national audit study mentioned above about another forecast done by the A-Train forecaster for another multibillion-dollar high-speed rail project. The national auditor called the forecast "completely failed" and "grossly overestimated" and documented how the forecast had misled investors to commit billions of dollars to a project that proved non-viable, became a national scandal, and had to go through painful financial restructuring. The risk is very real that the A-Train faces a similar fate, but the A-Train demand forecast hides this for the time being by its optimism, as documented above.

First, if the promoter who commissioned and paid for the original A-Train demand forecast requested a valid and reliable forecast, then the promoter should ask to have their money back on the grounds that the forecaster has delivered a faulty product. In this case, the promoter may also consider later seeking compensation from the forecaster for possible damages caused by the faulty forecast. Second, if the promoter and forecaster conspired to knowingly produce an inflated forecast in order to lure investors to fund the A-Train project by making it look good on paper, then investors may consider seeking compensation for damages caused by the conspiracy.

In general, risks for major project investments like the A-Train are at least as large, as misunderstood, and as mismanaged as risks in the financial markets, and probably more so, because data about project risks are less available than are data about market risks. In finance, biased forecasts and bad risk assessments led to the 2007–2009 market collapse, which exposed financial forecasters' dubious practices. As a consequence, economists began to discuss the necessity of firing the forecaster (Akerlof and Shiller, 2009:146). This is prudent advice, also for major project investment decisions. But here, firing the forecaster may be letting forecasters off too easily. Some forecasts are so grossly misrepresented and have such dire consequences that we need to consider not only firing the forecasters but suing them, too, for the losses they cause. In a few cases, where forecasters foreseeably produce deceptive forecasts, criminal penalties may be in order, if the damage caused by forecasters is severe enough. So far, no forecaster has been sued for an inaccurate forecast, to the best knowledge of the present author. This may well be changing with an AUD$150 million class action suit under way in Australia against the traffic and revenue forecaster for the Clem Jones toll road in Brisbane for producing woefully inaccurate forecasts. The suit is organized on behalf of 700 investors who lost their money when the toll road went bankrupt (Samuel, 2011; Wilson, 2012). The suit is historic as the first of its kind and should therefore be studied carefully by forecasters and forecasting scholars alike.

    Taleb (2010:163) says about deceptive forecasters:

People who forecast simply because "that's my job," knowing pretty well that their forecast is ineffectual, are not what I would call ethical. What they do is no different from repeating lies simply because "it's my job." Anyone who causes harm by forecasting should be treated as either a fool or a liar. Some forecasters cause more damage to society than criminals.

Management research has recently developed strong theory to help identify the fools and liars Taleb talks about here. Being academic, the theories use more polite and opaque language, needless to say, describing Taleb's fools as people who are subject to optimism bias and the planning fallacy, and liars as those practicing strategic misrepresentation, agency, and the conspiracy of optimism (Flyvbjerg et al., 2009; Gardener and Moffat, 2008; Hirt, 1994; Weston et al., 2007).7 This research has demonstrated that fools and liars constitute the majority of forecasters, and that forecasts are used as sinister tools of fraud more often than we would like to think. Wachs (1990:146, 1986:28) found nearly universal abuse of forecasting in this manner, and it is common in all sectors of the economy where forecasting routinely plays an important role in policy debates, according to Wachs. Forecasts are used to "cloak advocacy in the guise of scientific or technical rationality," in Wachs' (1989:477) classic formulation, and there is no evidence that things have improved since Wachs did his pioneering studies (Flyvbjerg, 2007a).

However, recent research has also developed the concepts, tools, and incentives that may help curb delusional and deceptive forecasts. The outside view and due diligence, presented above, were developed with this purpose in mind. So was reference class forecasting (Flyvbjerg, 2006; Flyvbjerg et al., 2005), which has been officially endorsed by the American Planning Association (2005) and made mandatory by government in the UK, Denmark, and Switzerland (Danish Ministry for Transport, 2008; Swiss Association of Road and Transportation Experts, 2006; UK Department for Transport, 2006). Finally, the incentive structures designed by Flyvbjerg et al. (2002) and Flyvbjerg (2007a) are means to curb both optimism bias and strategic misrepresentation in forecasting. Given the knowledge and tools we have today for making better forecasts, there is little excuse for accepting forecasts like that for the A-Train, which is incompetent at best (made by fools) and fraudulent at worst (made by liars). We have the knowledge and tools to disclose the incompetence and fraud and should clearly do so, given the large financial, economic, social, and environmental risks that are at stake in this type of multibillion-dollar investment. Luckily, government and business are beginning to see this.

7 A gray area exists between deliberate misrepresentation and innocent optimism. Forecasters may know, or have a hunch, that their forecasts are likely to be optimistic. However, they may decide not to investigate further and not to correct for possible optimism, because they assume, often for good reason, that optimistic forecasts are more likely to get projects approved than realistic ones. Such behavior has been called the "conspiracy of optimism" (Gardener and Moffat, 2008; Hirt, 1994; Weston et al., 2007). But this term is an oxymoron, of course. Per definition, optimism is non-deliberate, unintentional, and thus innocent. People who deliberately decide or conspire to be optimistic are not optimistic; they are practicing strategic misrepresentation and are thus lying.

Forecasting fools should be sent back to school. They make errors unknowingly and are therefore likely to be motivated to do things differently once the errors are pointed out to them and better ways of forecasting are presented. When in school, they would need to unlearn conventional forecasting methods and instead learn about optimism bias, the planning fallacy, the inside view, black swans, strategic misrepresentation, political power, principal–agent problems, the outside view, quality control, due diligence, reference class forecasting, incentives, and accountability. Their key teachers should be people like Daniel Kahneman, Nassim Taleb, and Martin Wachs. The main takeaway for students would be a realization that a fool with a tool is still a fool, and that, therefore, they must stop being fools.

Forecasting frauds should be fired and possibly prosecuted, depending on how serious their fraud is. Forecasting frauds knowingly bias their forecasts for personal or organizational gain, for instance to obtain approval and funding for their projects. Forecasting frauds are therefore not immediately motivated to change their ways. Fraud is their business model. This is why carrots will be ineffective with frauds and sticks must unfortunately be brought out. A light stick may be found in professional ethics. Professional organizations like PMI, APM, APA, RTPI, and other professional societies for managers, planners, engineers, and economists could use their codes of ethics to penalize, and possibly exclude, members who produce unethical forecasts. Given how widespread unethical forecasts are, it is interesting to note that such penalties seem to be non-existent or very rare, despite codes of conduct that explicitly state that misinforming clients, government, citizens, and others is unacceptable behavior for members of these organizations. By not sanctioning fraudulent forecasts, and thus quietly accepting them, professional organizations become co-responsible for their existence. This needs to be debated openly within the relevant professional organizations. Malpractice in project management should be taken as seriously as malpractice in other professions, like medicine and law. Not to take malpractice seriously amounts to not taking the profession of project management seriously.

A heavier stick would be to make forecasters financially liable for inaccurate forecasts and to reward them for accurate ones, something that is rarely done today, but which would undoubtedly improve accuracy. If there are no incentives to get forecasts right, perhaps we should not be surprised when they are wrong. Finally, an even heavier stick would be to prosecute forecasters who, through blatant negligence or deliberate deception, cause serious damage to others with their forecasts, be they government, corporations, or citizens. Here the Sarbanes-Oxley Act-like legislation, which has recently been implemented in many countries around the world, signifies progress (United States Congress, 2002). Such legislation is aimed at corporate fraud, and in many contexts fraudulent forecasts may be considered part of corporate fraud. Forecasters are therefore no longer as safe from prosecution as they were earlier (Bainbridge, 2007). This in itself should have a sobering effect on forecasters, if not before then when someone has the good sense to sue a fraudulent forecaster.

    17. Areas for further research

The present paper has taken a first step in researching a much-neglected problem in scholarly work on project management, namely what to do about the fact that front-end estimates of costs and benefits used in the business cases, cost–benefit analyses, and social and environmental impact assessments that typically support decisions on projects are often significantly different from actual ex post costs and benefits. Such estimates are therefore commonly poor predictors of the actual value and viability of projects, and estimates frequently misinform decision makers on projects instead of informing them. The paper offers (1) a new methodology for quality control of front-end estimates, based on theories of the planning fallacy and the outside view, (2) a case study with empirical test and demonstration of the proposed methodology, and (3) incentives for getting front-end estimates right, including sanctioning forecasters who cause damage to others with their forecasts. Further research is needed in the following areas:

1. Case studies with additional examples of how the proposed (or similar) methodology for quality control works in practice, akin to the case study presented above, with a view to obtaining more in-depth knowledge and a wider variety of experience regarding the strengths and weaknesses of the methodology and the theories behind it. Studies should cover the full range of outcome variables and not just demand as in the case study above.

2. Statistical analyses of a sample of projects that have been subjected to the methodology for quality control (or a like methodology), with the specific purpose of measuring whether estimates become more accurate and actual outcomes improve with the application of the methodology. The sample should be large enough to allow tests for statistical significance. For valid results, estimates and actual performance should be measured consistently in the same manner across projects using standard methodology, something that is rarely done today in research on project performance. Such statistical analyses would inform further development of both the theory and methodology of quality control and due diligence in project management.

3. Studies of how different types of incentives and disincentives impact accuracy of front-end estimates, including the threat of legal action against forecasters who cause serious damage with their forecasts. Australia's Clem Jones class action suit, mentioned above, is likely to be an interesting case study in this context. Impacts of changed or new incentives should be studied not just for individual forecasters but across the wider forecasting community.

References

Akerlof, G.A., Shiller, R.J., 2009. Animal Spirits: How Human Psychology Drives the Economy, and Why It Matters for Global Capitalism. Princeton University Press, Princeton.

American Planning Association, 2005. JAPA article calls on planners to help end inaccuracies in public project revenue forecasting. Retrieved April 7, 2005, from http://www.planning.org/newsreleases/2005/ftp040705.htm.

Arena, M.V., Leonard, R.S., Murray, S.E., Younossi, O., 2006. Historical Cost Growth of Completed Weapon System Programs. RAND Corporation, Santa Monica, CA. 67 pp.

Bainbridge, S.M., 2007. The Complete Guide to Sarbanes-Oxley. Adams Media, Avon, MA.

Bing, G., 2007. Due Diligence: Planning, Questions, Issues. Praeger Publishers, Santa Barbara, CA.

Bradley, M., 2005. A Comparison of Australian Airport Rail Links With Systems Elsewhere in the World. Paper presented at the Australian Transport Research Forum, Sydney, September 28–30.

Breakwell, G.M., 2007. The Psychology of Risk. Cambridge University Press, Cambridge.

Budzier, A., Flyvbjerg, B., 2011. Double Whammy: How ICT Projects are Fooled by Randomness and Screwed by Political Intent. Working Paper, Saïd Business School, University of Oxford.

Buehler, R., Griffin, D., Ross, M., 1994. Exploring the planning fallacy: why people underestimate their task completion times. Journal of Personality and Social Psychology 67, 366–381.

Cantarelli, C.C., Flyvbjerg, B., van Wee, B., Molin, E.J.E., 2010. Lock-in and its influence on the project performance of large-scale transportation infrastructure projects: investigating the way in which lock-in can emerge and affect cost overruns. Environment and Planning B: Planning and Design 37, 792–807.

Chapman, C.E., 2006. Conducting Due Diligence. Practicing Law Institute, New York, NY.

Danish Ministry for Transport, 2008. Ny anlægsbudgettering på Transportministeriets område, herunder om økonomistyringsmodel og risikohåndtering for anlægsprojekter [New Construction Budgeting for the Ministry of Transport, Including Financial Management Model and Risk Management for Construction Projects] (November 18, 33 pp.).

Dantata, N.A., Touran, A., Schneck, D.C., 2006. Trends in US Rail Transit Project Cost Overrun. Transportation Research Board Annual Meeting. National Academies, Washington, DC.

De Meyer, A., Loch, C.H., Pich, M.T., 2002. Managing Project Uncertainty: from Variation to Chaos. Sloan Management Review, 59–67.

Edkins, A., Geraldi, J., Morris, P., Smith, A., 2012. Exploring the Front-end of Project Management. In: Javernick-Will, A., Mahalingam, A. (Eds.), Proceedings of the Engineering Project Organizations Conference, Rheden, The Netherlands (July, 25 pp.).

Fischhoff, B., 2002. Heuristics and Biases in Application. In: Gilovich, T., Griffin, D., Kahneman, D. (Eds.), Heuristics and Biases: The Psychology of Intuitive Judgment. Cambridge University Press, Cambridge.

Flyvbjerg, B., 2005. Measuring inaccuracy in travel demand forecasting: methodological considerations regarding ramp up and sampling. Transportation Research A 39 (6), 522–530 (July Issue).

Flyvbjerg, B., 2006. From Nobel Prize to project management: getting risks right. Project Management Journal 37 (3), 5–15 (August).

Flyvbjerg, B., 2007a. Policy and planning for large-infrastructure projects: problems, causes, cures. Environment and Planning B: Planning and Design 34 (4), 578–597.

Flyvbjerg, B., 2007b. Cost overruns and demand shortfalls in urban rail and other infrastructure. Transportation Planning and Technology 30 (1), 9–30 (February).

Flyvbjerg, B., 2009. Survival of the unfittest: why the worst infrastructure gets built, and what we can do about it. Oxford Review of Economic Policy 25 (3), 344–367.

Flyvbjerg, B., Budzier, A., 2011. Why your IT project may be riskier than you think. Harvard Business Review 601–603 (September 2011).

Flyvbjerg, B., Cowi, 2004. Procedures for Dealing with Optimism Bias in Transport Planning: Guidance Document. UK Department for Transport, London.

Flyvbjerg, B., Stewart, A., 2012. Olympic Proportions: Cost and Cost Overrun at the Olympics 1960–2012. Working Paper, Saïd Business School, University of Oxford (June, 23 pp.).

Flyvbjerg, B., Holm, M.K.S., Buhl, S.L., 2002. Underestimating costs in public works projects: error or lie? Journal of the American Planning Association 68 (3), 279–295 (Summer).

Flyvbjerg, B., Holm, M.K.S., Buhl, S.L., 2005. How (in)accurate are demand forecasts in public works projects? The case of transportation. Journal of the American Planning Association 71 (2), 131–146 (Spring Issue).

Flyvbjerg, B., Garbuio, M., Lovallo, D., 2009. Delusion and deception in large infrastructure projects: two models for explaining and preventing executive disaster. California Management Review 51 (2), 170–193.

Standard and Poor's, 2004. Traffic Forecasting Risk: Study Update 2004. Standard and Poor's, London (October).

Standard and Poor's, 2005. Traffic Forecasting Risk Study Update 2005: Through Ramp-Up and Beyond. Standard and Poor's, London (August).

Swiss Association of Road and Transportation Experts, 2006. Kosten-Nutzen-Analysen im Strassenverkehr [Cost–Benefit Analyses in Road Traffic], Grundnorm 641820, valid from August 1. Author, Zürich.

Taleb, N.N., 2010. The Black Swan: The Impact of the Highly Improbable, Second Edition. Random House, New York.

Transportation Research Board, 2007. Determination of the State of the Practice in Metropolitan Area Travel Forecasting: Findings of the Surveys of Metropolitan Planning Organizations. Report prepared for TRB Committee B0090. National Research Council, Washington, DC.

UK Treasury, 2003. Supplementary Green Book Guidance: Optimism Bias. HM Treasury, London.

Flyvbjerg, B., Bruzelius, N., Rothengatter, W., 2003. Megaprojects and Risk: An Anatomy of Ambition. Cambridge University Press, Cambridge.

Gardener, T., Moffat, J., 2008. Changing behaviours in defence acquisition: a game theory approach. The Journal of the Operational Research Society 59, 225–230.

Gilovich, T., Griffin, D., Kahneman, D. (Eds.), 2002. Heuristics and Biases: The Psychology of Intuitive Judgment. Cambridge University Press, Cambridge.

Hirt, P.W., 1994. A conspiracy of optimism: management of the national forests since World War Two. University of Nebraska Press, Lincoln.