
A Skeptic's Guide to Computer Models

by John D. Sterman

This article was written by Dr. John D. Sterman, Director of the MIT System Dynamics Group and Professor of Management Science at the Sloan School of Management, Massachusetts Institute of Technology, 50 Memorial Drive, Cambridge, MA 02139, USA; email: [email protected]. Copyright © John D. Sterman, 1988, 1991. All rights reserved. This paper is reprinted from Sterman, J. D. (1991). A Skeptic's Guide to Computer Models. In Barney, G. O. et al. (eds.), Managing a Nation: The Microcomputer Software Catalog. Boulder, CO: Westview Press, 209-229. An earlier version of this paper also appeared in Foresight and National Decisions: The Horseman and the Bureaucrat (Grant 1988).

Contents

The Inevitability of Using Models
Mental and Computer Models
The Importance of Purpose
Two Kinds of Models: Optimization Versus Simulation and Econometrics
    Optimization
        Limitations of Optimization
        When To Use Optimization
    Simulation
        Limitations of Simulation
    Econometrics
        Limitations of Econometric Modeling
Checklist for the Model Consumer
Conclusions
References


But Mousie, thou art no thy lane,
In proving foresight may be vain;
The best-laid schemes o' mice an' men
Gang aft a-gley,
An' lea'e us nought but grief an' pain,
For promis'd joy!

        Robert Burns, "To a Mouse"

The Inevitability of Using Models

Computer modeling of social and economic systems is only about three decades old. Yet in that time, computer models have been used to analyze everything from inventory management in corporations to the performance of national economies, from the optimal distribution of fire stations in New York City to the interplay of global population, resources, food, and pollution. Certain computer models, such as The Limits to Growth (Meadows et al. 1972), have been front page news. In the US, some have been the subject of numerous congressional hearings and have influenced the fate of legislation. Computer modeling has become an important industry, generating hundreds of millions of dollars of revenues annually.

As computers have become faster, cheaper, and more widely available, computer models have become commonplace in forecasting and public policy analysis, especially in economics, energy and resources, demographics, and other crucial areas. As computers continue to proliferate, more and more policy debates—both in government and the private sector—will involve the results of models. Though not all of us are going to be model builders, we all are becoming model consumers, regardless of whether we know it (or like it). The ability to understand and evaluate computer models is fast becoming a prerequisite for the policymaker, legislator, lobbyist, and citizen alike.

During our lives, each of us will be faced with the results of models and will have to make judgments about their relevance and validity. Most people, unfortunately, cannot make these decisions in an intelligent and informed manner, since for them computer models are black boxes: devices that operate in completely mysterious ways. Because computer models are so poorly understood by most people, it is easy for them to be misused, accidentally or intentionally. Thus there have been many cases in which computer models have been used to justify decisions already made and actions already taken, to provide a scapegoat when a forecast turned out wrong, or to lend specious authority to an argument.

If these misuses are to stop and if modeling is to become a rational tool of the general public, rather than remaining the special magic of a technical priesthood, a basic understanding of models must become more widespread. This paper takes a step toward this goal by offering model consumers a peek inside the black boxes. The computer models it describes are the kinds used in foresight and policy analysis (rather than physical system models such as NASA uses to test the space shuttle). The characteristics and capabilities of the models, their advantages and disadvantages, uses and misuses are all addressed. The fundamental assumptions of the major modeling techniques are discussed, as is the appropriateness of these techniques for foresight and policy analysis. Consideration is also given to the crucial questions a model user should ask when evaluating the appropriateness and validity of a model.

Mental and Computer Models

Fortunately, everyone is already familiar with models. People use models—mental models—every day. Our decisions and actions are based not on the real world, but on our mental images of that world, of the relationships among its parts, and of the influence our actions have on it.

Mental models have some powerful advantages. A mental model is flexible; it can take into account a wider range of information than just numerical data; it can be adapted to new situations and be modified as new information becomes available. Mental models are the filters through which we interpret our experiences, evaluate plans, and choose among possible courses of action. The great systems of philosophy, politics, and literature are, in a sense, mental models.

But mental models have their drawbacks also. They are not easily understood by others; interpretations of them differ. The assumptions on which they are based are usually difficult to examine, so ambiguities and contradictions within them can go undetected, unchallenged, and unresolved.

That we have trouble grasping other people's mental models may seem natural. More surprising, we are not very good at constructing and understanding our own mental models or using them for decision making. Psychologists have shown that we can take only a few factors into account in making decisions (Hogarth 1980; Kahneman, Slovic, and Tversky 1982). In other words, the mental models we use to make decisions are usually extremely simple. Often these models are also flawed, since we frequently make errors in deducing the consequences of the assumptions on which they are based.

Our failure to use rational mental models in our decision making has been well demonstrated by research on the behavior of people in organizations (e.g., families, businesses, the government). This research shows that decisions are not made by rational consideration of objectives, options, and consequences. Instead, they often are made by rote, using standard operating procedures that evolve out of tradition and adjust only slowly to changing conditions (Simon 1947, 1979). These procedures are determined by the role of the decision makers within the organization, the amount of time they have to make decisions, and the information available to them.

But the individual perspectives of the decision makers may be parochial, the time they have to weigh alternatives insufficient, and the information available to them dated, biased, or incomplete. Furthermore, their decisions can be strongly influenced by authority relations, organizational context, peer pressure, cultural perspective, and selfish motives. Psychologists and organizational observers have identified dozens of different biases that creep into human decision making because of cognitive limitations and organizational pressures (Hogarth 1980; Kahneman, Slovic, and Tversky 1982). As a result, many decisions turn out to be incorrect; choosing the best course of action is just too complicated and difficult a puzzle.

Hamlet exclaims (perhaps ironically) "What a piece of work is a man, how noble in reason, how infinite in faculties...!" But it seems that we, like Hamlet himself, are simply not capable of making error-free decisions that are based on rational models and are uninfluenced by societal and emotional pressures.


Enter the computer model. In theory, computer models offer improvements over mental models in several respects:

• They are explicit; their assumptions are stated in the written documentation and open to all for review.

• They infallibly compute the logical consequences of the modeler's assumptions.

• They are comprehensive and able to interrelate many factors simultaneously.

A computer model that actually has these characteristics has powerful advantages over a mental model. In practice, however, computer models are often less than ideal:

• They are so poorly documented and complex that no one can examine their assumptions. They are black boxes.

• They are so complicated that the user has no confidence in the consistency or correctness of the assumptions.

• They are unable to deal with relationships and factors that are difficult to quantify, for which numerical data do not exist, or that lie outside the expertise of the specialists who built the model.

Because of these possible flaws, computer models need to be examined carefully by potential users. But on what basis should models be judged? How does one know whether a model is well or badly designed, whether its results will be valid or not? How can a prospective user decide whether a type of modeling or a specific model is suitable for the problem at hand? How can misuses of models be recognized and prevented? There is no single comprehensive answer, but some useful guidelines are given on the following pages.

The Importance of Purpose

A model must have a clear purpose, and that purpose should be to solve a particular problem. A clear purpose is the single most important ingredient for a successful modeling study. Of course, a model with a clear purpose can still be incorrect, overly large, or difficult to understand. But a clear purpose allows model users to ask questions that reveal whether a model is useful for solving the problem under consideration.

Beware the analyst who proposes to model an entire social or economic system rather than a problem. Every model is a representation of a system—a group of functionally interrelated elements forming a complex whole. But for the model to be useful, it must address a specific problem and must simplify rather than attempting to mirror in detail an entire system.

What is the difference? A model designed to understand how the business cycle can be stabilized is a model of a problem. It deals with a part of the overall economic system. A model designed to understand how the economy can make a smooth transition from oil to alternative energy sources is also a model of a problem; it too addresses only a limited system within the larger economy. A model that claims to be a representation of the entire economy is a model of a whole system. Why does it matter? The usefulness of models lies in the fact that they simplify reality, putting it into a form that we can comprehend. But a truly comprehensive model of a complete system would be just as complex as that system and just as inscrutable. The map is not the territory—and a map as detailed as the territory would be of no use (as well as being hard to fold).

The art of model building is knowing what to cut out, and the purpose of the model acts as the logical knife. It provides the criterion about what will be cut, so that only the essential features necessary to fulfill the purpose are left. In the example above, since the purpose of the comprehensive model would be to represent the entire economic system, few factors could be excluded. In order to answer all questions about the economy, the model would have to include an immense range of long-term and short-term variables. Because of its size, its underlying assumptions would be difficult to examine. The model builders—not to mention the intended consumers—would probably not understand its behavior, and its validity would be largely a matter of faith.

A model designed to examine just the business cycle or the energy transition would be much smaller, since it would be limited to those factors believed to be relevant to the question at hand. For example, the business cycle model need not include long-term trends in population growth and resource depletion. The energy transition model could exclude short-term changes related to interest, employment, and inventories. The resulting models would be simple enough so that their assumptions could be examined. The relation of these assumptions to the most important theories regarding the business cycle and resource economics could then be assessed to determine how useful the models were for their intended purposes.

Two Kinds of Models: Optimization Versus Simulation and Econometrics

There are many types of models, and they can be classified in many ways. Models can be static or dynamic, mathematical or physical, stochastic or deterministic. One of the most useful classifications, however, divides models into those that optimize versus those that simulate. The distinction between optimization and simulation models is particularly important since these types of models are suited for fundamentally different purposes.

Optimization

The Oxford English Dictionary defines optimize as "to make the best or most of; to develop to the utmost." The output of an optimization model is a statement of the best way to accomplish some goal. Optimization models do not tell you what will happen in a certain situation. Instead they tell you what to do in order to make the best of the situation; they are normative or prescriptive models.

Let us take two examples. A nutritionist would like to know how to design meals that fulfill certain dietary requirements but cost as little as possible. A salesperson must visit certain cities and would like to know how to make the trip as quickly as possible, taking into account the available flights between the cities. Rather than relying on trial and error, the nutritionist and the salesperson could use optimization models to determine the best solutions to these problems.

An optimization model typically includes three parts. The objective function specifies the goal or objective. For the nutritionist, the objective is to minimize the cost of the meals. For the salesperson, it is to minimize the time spent on the trip. The decision variables are the choices to be made. In our examples, these would be the food to serve at each meal and the order in which to visit the cities. The constraints restrict the choices of the decision variables to those that are acceptable and possible. In the diet problem, one constraint would specify that daily consumption of each nutrient must equal or exceed the minimum requirement. Another might restrict the number of times a particular food is served during each week. The constraints in the travel problem would specify that each city must be visited at least once and would restrict the selection of routes to actually available connections.

An optimization model takes as inputs these three pieces of information—the goals to be met, the choices to be made, and the constraints to be satisfied. It yields as its output the best solution, i.e., the optimal decisions given the assumptions of the model. In the case of our examples, the models would provide the best set of menus and the most efficient itinerary.
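
To make the three parts concrete, here is a minimal sketch of the nutritionist's diet problem expressed as a linear program in Python with SciPy. The foods, prices, nutrient contents, and requirements are invented for illustration; a real study would substitute measured data.

```python
# A linear program for the diet problem: minimize cost subject to nutrient
# constraints. All numbers below are illustrative assumptions, not data from
# the article.
from scipy.optimize import linprog

foods = ["bread", "milk", "eggs"]      # decision variables: servings per day
cost = [0.30, 0.60, 0.90]              # objective function: total cost ($)

# Nutrient content per serving (rows: protein in g, calories in kcal).
nutrients = [[4.0, 8.0, 13.0],
             [80.0, 150.0, 155.0]]
required = [50.0, 2000.0]              # minimum daily requirements

# linprog minimizes c @ x subject to A_ub @ x <= b_ub, so the
# "nutrients >= requirements" constraints are negated.
A_ub = [[-n for n in row] for row in nutrients]
b_ub = [-r for r in required]

# A further constraint: at most 10 servings of any one food per day.
result = linprog(cost, A_ub=A_ub, b_ub=b_ub,
                 bounds=[(0, 10)] * len(foods), method="highs")

for name, servings in zip(foods, result.x):
    print(f"{name}: {servings:.1f} servings/day")
print(f"minimum cost: ${result.fun:.2f}/day")
```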


Limitations of Optimization

Many optimization models have a variety of limitations and problems that a potential user should bear in mind. These problems are: difficulties with the specification of the objective function, unrealistic linearity, lack of feedback, and lack of dynamics.

Specification of the Objective Function: Whose Values? The first difficulty with optimization models is the problem of specifying the objective function, the goal that the model user is trying to reach. In our earlier examples, it was fairly easy to identify the objective functions of the nutritionist and the salesperson, but what would be the objective function for the mayor of New York? To provide adequate city services for minimal taxes? To encourage the arts? To improve traffic conditions? The answer depends, of course, on the perspective of the person you ask.

The objective function embodies values and preferences, but which values, whose preferences? How can intangibles be incorporated into the objective function? How can the conflicting goals of various groups be identified and balanced? These are hard questions, but they are not insurmountable. Intangibles often can be quantified, at least roughly, by breaking them into measurable components. For example, the quality of life in a city might be represented as depending on the rate of unemployment, air pollution levels, crime rate, and so forth. There are also techniques available for extracting information about preferences from interviews and other impressionistic data. Just the attempt to make values explicit is a worthwhile exercise in any study and may have enormous value for the clients of a modeling project.

It is important that potential users keep in mind the question of values when they examine optimization models. The objective function and the constraints should always be scrutinized to determine what values they embody, both explicitly and by omission. Imagine that a government employee, given responsibility for the placement of sewage treatment plants along a river, decides to use an optimization model in making the decision. The model has as its objective function the cheapest arrangement of plants; a constraint specifies that the arrangement must result in water quality standards being met. It would be important for the user to ask how the model takes into account the impacts the plants will have on fishing, recreation, wild species, and development potential in the areas where they are placed. Unless these considerations are explicitly incorporated into the model, they are implicitly held to be of no value.

Linearity. Another problem, and one that can seriously undermine the verisimilitude of optimization models, is their linearity. Because a typical optimization problem is very complex, involving hundreds or thousands of variables and constraints, the mathematical problem of finding the optimum is extremely difficult. To render such problems tractable, modelers commonly introduce a number of simplifications. Among these is the assumption that the relationships in the system are linear. In fact, the most popular optimization technique, linear programming, requires that the objective function and all constraints be linear.

Linearity is mathematically convenient, but in reality it is almost always invalid. Consider, for example, a model of a firm's inventory distribution policies. The model contains a specific relationship between inventory and shipments—if the inventory of goods in the warehouse is 10 percent below normal, shipments may be reduced by, say, 2 percent since certain items will be out of stock. If the model requires this relationship to be linear, then a 20 percent shortfall will reduce shipments by 4 percent, a 30 percent shortfall by 6 percent, and so on. And when the shortfall is 100 percent? According to the model, shipments will still be 80 percent of normal. But obviously, when the warehouse is empty, no shipments are possible. The linear relationship within the model leads to an absurdity.
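
The absurdity is easy to reproduce. The sketch below compares the article's linear shipment rule with one plausible nonlinear alternative, a saturating curve that collapses to zero as the warehouse empties; the nonlinear form is assumed here purely for contrast.

```python
# Compare a linear shipment rule with a nonlinear one as inventory shortfall
# grows. The 0.2 slope matches the article's figures (10% shortfall -> 2%
# reduction); the nonlinear curve is an invented, illustrative alternative.
def shipments_linear(shortfall):
    """Linear rule: every 10% inventory shortfall cuts shipments by 2%."""
    return 1.0 - 0.2 * shortfall

def shipments_nonlinear(shortfall):
    """Nonlinear rule: little effect at first, collapsing to zero
    as the warehouse empties."""
    availability = 1.0 - shortfall
    return availability ** 0.25 if availability > 0 else 0.0

for shortfall in (0.1, 0.3, 0.6, 1.0):
    print(f"shortfall {shortfall:4.0%}: "
          f"linear {shipments_linear(shortfall):6.1%}, "
          f"nonlinear {shipments_nonlinear(shortfall):6.1%}")
```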

The warehouse model may seem trivial, but the importance of non-linearity is well demonstrated by the sorry fate of the passenger pigeon, Ectopistes migratorius. When Europeans first colonized North America, passenger pigeons were extremely abundant. Huge flocks of the migrating birds would darken the skies for days. They often caused damage to crops and were hunted both as a pest and as food. For years, hunting had little apparent impact on the population; the prolific birds seemed to reproduce fast enough to offset most losses. Then the number of pigeons began to decline—slowly at first, then rapidly. By 1914, the passenger pigeon was extinct.

The disappearance of the passenger pigeons resulted from the non-linear relationship between their population density and their fertility. In large flocks they could reproduce at high rates, but in smaller flocks their fertility dropped precipitously. Thus, when hunting pressure was great enough to reduce the size of a flock somewhat, the fertility in that flock also fell. The lower fertility led to a further decrease in the population size, and the lower population density resulted in yet lower birth rates, and so forth, in a vicious cycle.
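
That vicious cycle can be captured in a few lines. The toy simulation below assumes a fertility rate that falls with the square of flock density and a constant annual harvest; every parameter is invented, but the qualitative pattern of slow decline followed by collapse is the point.

```python
# A toy model of the passenger pigeon collapse: fertility depends nonlinearly
# on population density, so a constant hunting harvest that the population at
# first nearly absorbs eventually triggers a runaway decline. All numbers are
# invented for illustration.
def fertility(population, full_flock=1_000_000):
    """Births per bird per year: high in large flocks, low in small ones."""
    density = min(population / full_flock, 1.0)
    return 0.5 * density ** 2

population = 1_000_000.0
hunting = 520_000.0                     # birds taken per year, held constant

for year in range(1, 31):
    births = fertility(population) * population
    population = max(population + births - hunting, 0.0)
    print(f"year {year:2d}: population {population:12,.0f}")
    if population == 0.0:
        break
```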

Unfortunately, the vast majority of optimization models assume that the world is linear. There are, however, techniques available for solving certain non-linear optimization problems, and research is continuing.

Lack of Feedback. Complex systems in the real world are highly interconnected, having a high degree of feedback among sectors. The results of decisions feed back through physical, economic, and social channels to alter the conditions on which the decisions were originally made. Some models do not reflect this reality, however. Consider an optimization model that computes the best size of sewage treatment plants to build in an area. The model will probably assume that the amount of sewage needing treatment will remain the same, or that it will grow at a certain rate. But if water quality improves because of sewage treatment, the area will become more attractive and development will increase, ultimately leading to a sewage load greater than expected.

Models that ignore feedback effects must rely on exogenous variables and are said to have a narrow boundary. Exogenous variables are ones that influence other variables in the model but are not calculated by the model. They are simply given by a set of numerical values over time, and they do not change in response to feedback. The values of exogenous variables may come from other models but are most likely the product of an unexaminable mental model. The endogenous variables, on the other hand, are calculated by the model itself. They are the variables explained by the structure of the model, the ones for which the modeler has an explicit theory, the ones that respond to feedback.

Ignoring feedback can result in policies that generate unanticipated side effects or are diluted, delayed, or defeated by the system (Meadows 1982). An example is the construction of freeways in the 1950s and 1960s to alleviate traffic congestion in major US cities. In Boston it used to take half an hour to drive from the city neighborhood of Dorchester to the downtown area, a journey of only a few miles. Then a limited access highway network was built around the city, and travel time between Dorchester and downtown dropped substantially.

But there's more to the story. Highway construction led to changes that fed back into the system, causing unexpected side effects. Due to the reduction in traffic congestion and commuting time, living in outlying communities became a more attractive option. Farmland was turned into housing developments or paved over to provide yet more roads. The population of the suburbs soared, as people moved out of the center city. Many city stores followed their customers or were squeezed out by competition from the new suburban shopping malls. The inner city began to decay, but many people still worked in the downtown area—and they got there via the new highways. The result? Boston has more congestion and air pollution than before the highways were constructed, and the rush-hour journey from Dorchester to downtown takes half an hour, again.
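
The difference between the two boundary choices is easy to see in miniature. The sketch below revisits the sewage-plant example above: one projection treats the sewage load as exogenous, growing at a fixed rate, while the other closes the loop by letting cleaner water attract development. The growth rate, feedback strength, and capacity are all invented for illustration.

```python
# Exogenous versus endogenous sewage load. In the exogenous run, load simply
# grows 2% per year. In the endogenous run, development responds to how much
# of the load the plant can treat, feeding back into the load itself.
# All parameter values are invented for illustration.
def project_load(years, feedback=False, load=100.0, capacity=150.0):
    """Return the sewage load (arbitrary units) after `years`."""
    for _ in range(years):
        load *= 1.02                              # exogenous background growth
        if feedback:
            treated_fraction = min(load, capacity) / load
            load *= 1.0 + 0.03 * treated_fraction # development follows clean water
    return load

print(f"load after 20 years, exogenous only: {project_load(20):6.1f}")
print(f"load after 20 years, with feedback:  {project_load(20, feedback=True):6.1f}")
print("plant capacity assumed by the static design: 150.0")
```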

In theory, feedback can be incorporated into optimization models, but the resulting complexity and non-linearity usually render the problem insoluble. Many optimization models therefore ignore most feedback effects. Potential users should be aware of this when they look at a model. They should ask to what degree important feedbacks have been excluded and how those exclusions might alter the assumptions and invalidate the results of the model.

Lack of Dynamics. Many optimization models are static. They determine the optimal solution for a particular moment in time without regard for how the optimal state is reached or how the system will evolve in the future. An example is the linear programming model constructed in the late 1970s by the US Forest Service, with the objective of optimizing the use of government lands. The model was enormous, with thousands of decision variables and tens of thousands of constraints, and it took months just to correct the typographical errors in the model's huge database. When the completed model was finally run, finding the solution required full use of a mainframe computer for days.

Despite the gigantic effort, the model prescribed the optimal use of forest resources for only a single moment in time. It did not take into account how harvesting a given area would affect its future ecological development. It did not consider how land-use needs or lumber prices might change in the future. It did not examine how long it would take for new trees to grow to maturity in the harvested areas, or what the economic and recreational value of the areas would be during the regrowth period. The model just provided the optimal decisions for a single year, ignoring the fact that those decisions would continue to influence the development of forest resources for decades.

Not all optimization models are static. The MARKAL model, for example, is a large linear programming model designed to determine the optimal choice of energy technologies. Developed at the Brookhaven National Laboratory in the US, the model produces as its output the best (least-cost) mix of coal, oil, gas, and other energy sources well into the next century. It requires various exogenous inputs, such as energy demands, future fuel prices, and construction and operating costs of different energy technologies. (Note that the model ignores feedbacks from energy supply to prices and demand.) The model is dynamic in the sense that it produces a "snapshot" of the optimal state of the system at five-year intervals.

The Brookhaven model is not completely dynamic, however, because it ignores delays. It assumes that people, seeing what the optimal mix is for some future year, begin planning far enough in advance so that this mix can actually be used. Thus the model does not, for example, incorporate construction delays for energy production facilities. In reality, of course, it takes time—often much longer than five years—to build power plants, invent new technologies, build equipment, develop waste management techniques, and find and transport necessary raw materials.

Indeed, delays are pervasive in the real world. The delays found in complex systems are especially important because they are a major source of system instability. The lag time required to carry out a decision or to perceive its effects may cause overreaction or may prevent timely intervention. Acid rain provides a good example. Although there is already evidence that damage to the forests of New England, the Appalachians, and Bavaria is caused by acid rain, many scientists suspect it will take years to determine exactly how acid rain is formed and how it affects the forests. Until scientific and then political consensus emerges, legislative action to curb pollution is not likely to be strong. Pollution control programs, once passed, will take years to implement. Existing power plants and other pollution sources will continue to operate for their functional lifetimes, which are measured in decades. It will require even longer to change settlement patterns and lifestyles dependent on the automobile. By the time sulfur and nitrogen oxide emissions are sufficiently reduced, it may be too late for the forests.

Delays are a crucial component of the dynamic behavior of systems, but—like nonlinearity—they are difficult to incorporate into optimization models. A common simplification is to assume that all delays in the model are of the same fixed length. The results of such models are of questionable value. Policymakers who use them in an effort to find an optimal course of action may discover, like the proverbial American tourist on the back roads of Maine, that "you can't get there from here."
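
The sketch below illustrates why the fixed-length simplification matters, contrasting a delivery delay of constant length with one whose effective length stretches when demand exceeds the supplier's capacity. The order stream, lag, and capacity are invented for illustration.

```python
# Fixed-length lag versus capacity-limited delay. Under the fixed lag, every
# order arrives exactly three periods later no matter how large demand is.
# Under the capacity limit, a backlog builds and the effective delay grows.
# All numbers are invented for illustration.
from collections import deque

def fixed_lag_deliveries(orders, lag=3):
    """Each order is delivered exactly `lag` periods after it is placed."""
    pipeline = deque([0.0] * lag)
    deliveries = []
    for placed in orders:
        pipeline.append(placed)
        deliveries.append(pipeline.popleft())
    return deliveries

def capacity_limited_deliveries(orders, capacity=12.0):
    """Deliveries are capped by capacity; unfilled orders wait in a backlog."""
    backlog, deliveries = 0.0, []
    for placed in orders:
        backlog += placed
        shipped = min(backlog, capacity)
        backlog -= shipped
        deliveries.append(shipped)
    return deliveries

orders = [10.0] * 5 + [20.0] * 5          # demand surges in period 6
print("fixed lag:       ", fixed_lag_deliveries(orders))
print("capacity-limited:", capacity_limited_deliveries(orders))
```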

When To Use Optimization

Despite the limitations discussed above, optimization techniques can be extremely useful. But they must be used for the proper problems. Optimization has substantially improved the quality of decisions in many areas, including computer design, airline scheduling, factory siting, and oil refinery operation. Whenever the problem to be solved is one of choosing the best from among a well-defined set of alternatives, optimization should be considered. If the meaning of best is also well defined, and if the system to be optimized is relatively static and free of feedback, optimization may well be the best technique to use. Unfortunately, these latter conditions are rarely true for the social, economic, and ecological systems that are frequently of concern to decision makers.

Look out for optimization models that purport to forecast actual behavior. The output of an optimization model is a statement of the best way to accomplish a goal. To interpret the results as a prediction of actual behavior is to assume that people in the real system will in fact make the optimal choices. It is one thing to say, "in order to maximize profits, people should make the following decisions," and quite another to say "people will succeed in maximizing profits, because they will make the following decisions." The former is a prescriptive statement of what to do, the latter a descriptive statement of what will actually happen.

Optimization models are valid for making prescriptive statements. They are valid for forecasting only if people do in fact optimize, do make the best possible decisions. It may seem reasonable to expect people to behave optimally—after all, wouldn't it be irrational to take second best when you could have the best? But the evidence on this score is conclusive: real people do not behave like optimization models. As discussed above, we humans make decisions with simple and incomplete mental models, models that are often based on faulty assumptions or that lead erroneously from sound assumptions to flawed solutions. As Herbert Simon puts it,

    The capacity of the human mind for formulating and solving complex problems is very small compared with the size of the problem whose solution is required for objectively rational behavior in the real world or even for a reasonable approximation to such objective rationality. (Simon 1957, p. 198)

Optimization models augment the limited capacity of the human mind to determine the objectively rational course of action. It should be remembered, however, that even optimization models must make simplifying assumptions in order to be tractable, so the most we can hope from them is an approximation of how people ought to behave. To model how people actually behave requires a very different set of modeling techniques, which will be discussed now.

Simulation

The Latin verb simulare means to imitate or mimic. The purpose of a simulation model is to mimic the real system so that its behavior can be studied. The model is a laboratory replica of the real system, a microworld (Morecroft 1988). By creating a representation of the system in the laboratory, a modeler can perform experiments that are impossible, unethical, or prohibitively expensive in the real world.

Simulations of physical systems are commonplace and range from wind tunnel tests of aircraft design to simulation of weather patterns and the depletion of oil reserves. Economists and social scientists also have used simulation to understand how energy prices affect the economy, how corporations mature, how cities evolve and respond to urban renewal policies, and how population growth interacts with food supply, resources, and the environment. There are many different simulation techniques, including stochastic modeling, system dynamics, discrete simulation, and role-playing games. Despite the differences among them, all simulation techniques share a common approach to modeling.

Optimization models are prescriptive, but simulation models are descriptive. A simulation model does not calculate what should be done to reach a particular goal, but clarifies what would happen in a given situation. The purpose of simulations may be foresight (predicting how systems might behave in the future under assumed conditions) or policy design (designing new decision-making strategies or organizational structures and evaluating their effects on the behavior of the system).

In other words, simulation models are "what if" tools. Often such "what if" information is more important than knowledge of the optimal decision. For example, during the 1978 debate in the US over natural gas deregulation, President Carter's original proposal was modified dozens of times by Congress before a final compromise was passed. During the congressional debate, the Department of Energy evaluated each version of the bill using a system dynamics model (Department of Energy 1979). The model did not indicate what ought to be done to maximize the economic benefits of natural gas to the nation. Congress already had its own ideas on that score. But by providing an assessment of how each proposal would affect gas prices, supplies, and demands, the model generated ammunition that the Carter administration could use in lobbying for its proposals.

Every simulation model has two main components. First it must include a representation of the physical world relevant to the problem under study. Consider for example a model that was built for the purpose of understanding why America's large cities have continued to decay despite massive amounts of aid and numerous renewal programs (Forrester 1969). The model had to include a representation of the physical components of the city—the size and quality of the infrastructure, including the stock of housing and commercial structures; the attributes of the population, such as its size and composition and the mix of skills and incomes among the people; flows (of people, materials, money, etc.) into and out of the city; and other factors that characterize the physical and institutional setting.

How much detail a model requires about the physical structure of the system will, of course, depend on the specific problem being addressed. The urban model mentioned above required only an aggregate representation of the features common to large American cities. On the other hand, a model designed to improve the location and deployment of fire fighting resources in New York City had to include a detailed representation of the streets and traffic patterns (Greenberger, Crenson, and Crissey 1976).

In addition to reflecting the physical structure of the system, a simulation model must portray the behavior of the actors in the system. In this context, behavior means the way in which people respond to different situations, how they make decisions. The behavioral component is put into the model in the form of decisionmaking rules, which are determined by direct observation of the actual decision-making procedures in the system.

Given the physical structure of the system and the decision-making rules, the simulation model then plays the role of the decision makers, mimicking their decisions. In the model, as in the real world, the nature and quality of the information available to decision makers will depend on the state of the system. The output of the model will be a description of expected decisions. The validity of the model's assumptions can be checked by comparing the output with the decisions made in the real system.

An example is provided by the pioneering simulation study of corporate behavior carried out by Cyert and March (1963). Their field research showed that department stores used a very simple decision rule to determine the floor price of goods. That rule was basically to mark up the wholesale cost of the items by a fixed percentage, with the value of the markup determined by tradition. They also noted, however, that through time the traditional markup adjusted very slowly, bringing it closer to the actual markup realized on goods when they were sold. The actual markup could vary from the normal markup as the result of several other decision rules: When excess inventory piled up on the shelves, a sale was held and the price was gradually reduced until the goods were sold; if sales goals were exceeded, prices were boosted. Prices were also adjusted toward those of competitors.

Cyert and March built a simulation model of the pricing system, basing it on these decisionmaking rules. The output of the model was a description of expected prices for goods. When this output was compared with real store data, it was found that the model reproduced quite well the actual pricing decisions of the floor managers.
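
A few lines of code suffice to express decision rules of this kind. The sketch below implements the markup and markdown rules just described: a fixed traditional markup, a gradual markdown when goods sit unsold, and a slow drift of the traditional markup toward the markup actually realized. The percentages and items are invented, not Cyert and March's estimates.

```python
# Markup pricing in the spirit of the Cyert and March study: price at cost
# times a traditional markup, mark down 5% per period while the item sits
# unsold, and let the traditional markup drift slowly toward the realized
# markup. All parameter values are invented for illustration.
def simulate_pricing(items, markup=0.40, adjustment=0.10):
    """items: sequence of (wholesale_cost, periods_unsold) pairs."""
    for cost, periods_unsold in items:
        price = cost * (1.0 + markup)          # traditional markup rule
        for _ in range(periods_unsold):        # markdown rule: hold a sale
            price *= 0.95                      # until the goods finally sell
        realized = price / cost - 1.0
        markup += adjustment * (realized - markup)   # tradition adjusts slowly
        print(f"cost ${cost:6.2f} -> sold at ${price:6.2f} "
              f"(realized {realized:5.1%}, traditional markup now {markup:5.1%})")

simulate_pricing([(10.00, 0), (25.00, 3), (10.00, 0), (40.00, 5)])
```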

Limitations of Simulation

Any model is only as good as its assumptions. In the case of simulation models, the assumptions consist of the descriptions of the physical system and the decision rules. Adequately representing the physical system is usually not a problem; the physical environment can be portrayed with whatever detail and accuracy is needed for the model purpose. Also, simulation models can easily incorporate feedback effects, non-linearities, and dynamics; they are not rigidly determined in their structure by mathematical limitations as optimization models often are. Indeed, one of the main uses of simulation is to identify how feedback, nonlinearity, and delays interact to produce troubling dynamics that persistently resist solution. (For examples see Sterman 1985, Morecroft 1983, and Forrester 1969.)

Simulation models do have their weak points, however. Most problems occur in the description of the decision rules, the quantification of soft variables, and the choice of the model boundary.

Accuracy of the Decision Rules. The description of the decision rules is one potential trouble spot in a simulation model. The model must accurately represent how the actors in the system make their decisions, even if their decision rules are less than optimal. The model should respond to change in the same way the real actors would. But it will do this only if the model's assumptions faithfully describe the decision rules that are used under different circumstances. The model therefore must reflect the actual decision-making strategies used by the people in the system being modeled, including the limitations and errors of those strategies.

Unfortunately, discovering decision rules is often difficult. They cannot be determined from aggregate statistical data, but must be investigated first hand. Primary data on the behavior of the actors can be acquired through observation of actual decision making in the field, that is, in the boardroom, on the factory floor, along the sales route, in the household. The modeler must discover what information is available to each actor, examine the timeliness and accuracy of that information, and infer how it is processed to yield a decision. Modelers often require the skills of the anthropologist and the ethnographer. One can also learn about decision making through laboratory experiments in which managers operate simulated corporations (Sterman 1989). The best simulation modeling draws on extensive knowledge of decision making that has been developed in many disciplines, including psychology, sociology, and behavioral science.

Soft Variables. The majority of data are soft variables. That is, most of what we know about the world is descriptive, qualitative, difficult to quantify, and has never been recorded. Such information is crucial for understanding and modeling complex systems. Yet in describing decision making, some modelers limit themselves to hard variables, ones that can be measured directly and can be expressed as numerical data. They may defend the rejection of soft variables as being more scientific than "making up" the values of parameters and relationships for which no numerical data are available. How, they ask, can the accuracy of estimates about soft variables be tested? How can statistical tests be performed without numerical data?

Actually, there are no limitations on the inclusion of soft variables in models, and many simulation models do include them. After all, the point of simulation models is to portray decision making as it really is, and soft variables—including intangibles such as desires, product quality, reputation, expectations, and optimism—are often of critical importance in decision making. Imagine, for example, trying to run a school, factory, or city solely on the basis of the available numerical data. Without qualitative knowledge about factors such as operating procedures, organizational structure, political subtleties, and individual motivations, the result would be chaos. Leaving such variables out of models just because of a lack of hard numerical data is certainly less "scientific" than including them and making reasonable estimates of their values. Ignoring a relationship implies that it has a value of zero—probably the only value known to be wrong! (Forrester 1980)

Of course, all relationships and parameters in models, whether based on soft or hard variables, are imprecise and uncertain to some degree. Reasonable people may disagree as to the importance of different factors. Modelers must therefore perform sensitivity analysis to consider how their conclusions might change if other plausible assumptions were made. Sensitivity analysis should not be restricted to uncertainty in parameter values, but should also consider the sensitivity of conclusions to alternative structural assumptions and choices of model boundary.

Sensitivity analysis is no less a responsibility for those modelers who ignore soft variables. Apparently hard data such as economic and demographic statistics are often subject to large measurement errors, biases, distortions, and revisions. Unfortunately, sensitivity analysis is not performed or reported often enough. Many modelers have been embarrassed when third parties, attempting to replicate the results of a model, have found that reasonable alternative assumptions produce radically different conclusions. (See the discussion below of the experiment conducted by the Joint Economic Committee with three leading econometric models.)
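
In practice, even a rudimentary sensitivity analysis is just a loop over plausible parameter values. The sketch below runs a stand-in model across ranges of two uncertain parameters; the model and the ranges are invented for illustration.

```python
# Rudimentary parameter sensitivity analysis: rerun the model over plausible
# ranges of its uncertain parameters and compare the conclusions. The toy
# demand model and both parameter ranges are invented for illustration.
import itertools

def demand_model(growth_rate, price_elasticity, years=10, demand=100.0):
    """A stand-in simulation: demand grows but is damped by rising prices."""
    annual_price_rise = 0.03               # assumed, like everything here
    for _ in range(years):
        demand *= (1.0 + growth_rate) * (1.0 - price_elasticity * annual_price_rise)
    return demand

growth_rates = (0.01, 0.02, 0.04)          # plausible range, not measured fact
elasticities = (0.2, 0.5, 1.0)

for g, e in itertools.product(growth_rates, elasticities):
    print(f"growth {g:4.1%}, elasticity {e:3.1f} -> "
          f"demand after 10 years: {demand_model(g, e):6.1f}")
```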

Model Boundary. The definition of a reasonable model boundary is another challenge for the builders of simulation models. Which factors will be exogenous, which will be endogenous? What feedbacks will be incorporated into the model? In theory, one of the great strengths of simulation models is the capacity to reflect the important feedback relationships that shape the behavior of the system and its response to policies. In practice, however, many simulation models have very narrow boundaries. They ignore factors outside the expertise of the model builder or the interests of the sponsor, and in doing so they exclude important feedbacks.

The consequences of omitting feedback can be serious. An excellent example is provided by the Project Independence Evaluation System (PIES) model, used in the 1970s by the US Federal Energy Administration and later by the US Department of Energy. As described by the FEA, the purpose of the model was to evaluate different energy strategies according to these criteria: their impact on the development of alternative energy sources; their impact on economic growth, inflation, and unemployment; their regional and social impacts; their vulnerability to import disruptions; and their environmental effects (Federal Energy Administration 1974, p. 1).

Surprisingly, considering the stated purpose, the PIES model treated the economy as exogenous. The economy—including economic growth, interest rates, inflation, world oil prices, and the costs of unconventional fuels—was completely unaffected by the US domestic energy situation—including prices, policies, and production. The way the model was constructed, even a full embargo of imported oil or a doubling of oil prices would have no impact on the economy.

Its exogenous treatment of the economy made the PIES model inherently contradictory. The model showed that the investment needs of the energy sector would increase markedly as depletion raised the cost of getting oil out of the ground and synthetic fuels were developed. But at the same time, the model assumed that higher investment needs in the energy sector could be satisfied without reducing investment or consumption in the rest of the economy and without raising interest rates or inflation. In effect, the model let the economy have its pie and eat it too.

In part because it ignored the feedbacks between the energy sector and the rest of the economy, the PIES model consistently proved to be overoptimistic. In 1974 the model projected that by 1985 the US would be well on the way to energy independence: energy imports would be only 3.3 million barrels per day, and production of shale oil would be 250,000 barrels per day. Furthermore, these developments would be accompanied by oil prices of about $22 per barrel (1984 dollars) and by vigorous economic growth. It didn't happen. In fact, at the time this paper is being written (1988), oil imports are about 5.5 million barrels per day, and the shale oil industry remains a dream. This situation prevails despite the huge reductions in oil demand that have resulted from oil prices of over $30 per barrel and from the most serious economic recession since the Great Depression.

A broad model boundary that includes important feedback effects is more important than a great amount of detail in the specification of individual components. It is worth noting that the PIES model provided a breakdown of supply, demand, and price for dozens of fuels in each region of the country. Yet its aggregate projections for 1985 weren't even close. One can legitimately ask what purpose was served by the effort devoted to forecasting the demand for jet fuel or naphtha in the Pacific Northwest when the basic assumptions were so palpably inadequate and the main results so woefully erroneous.

In fairness it must be said that the PIES model is not unique in the magnitude of its errors. Nearly all energy models of all types have consistently been wrong about energy production, consumption, and prices. The evidence shows clearly that energy forecasts actually lag behind the available information, reflecting the past rather than anticipating the future (Department of Energy 1983). A good discussion of the limitations of PIES and other energy models is available in the appendix of Stobaugh and Yergin (1979).

Overly narrow model boundaries are not just a problem in energy analysis. The Global 2000 Report to the President (Barney 1980) showed that most of the models used by US government agencies relied significantly on exogenous variables. Population models assumed food production was exogenous. Agriculture models assumed that energy prices and other input prices were exogenous. Energy models assumed that economic growth and environmental conditions were exogenous. Economic models assumed that population and energy prices were exogenous. And so on. Because they ignored important intersectoral feedbacks, the models produced inconsistent results.

Econometrics

Strictly speaking, econometrics is a simulation technique, but it deserves separate discussion for several reasons. First, it evolved out of economics and statistics, while most other simulation methods emerged from operations research or engineering. The difference in pedigree leads to large differences in purpose and practice. Second, econometrics is one of the most widely used formal modeling techniques. Pioneered by Nobel Prize-winning economists Jan Tinbergen and Lawrence Klein, econometrics is now taught in nearly all business and economics programs. Econometric forecasts are regularly reported in the media, and ready-to-use statistical routines for econometric modeling are now available for many personal computers. And third, the well-publicized failure of econometric models to predict the future has eroded the credibility of all types of computer models, including those built for very different purposes and using completely different modeling techniques.

Econometrics is literally the measurement of economic relations, and it originally involved statistical analysis of economic data. As commonly practiced today, econometric modeling includes three stages: specification, estimation, and forecasting. First the structure of the system is specified by a set of equations. Then the values of the parameters (coefficients relating changes in one variable to changes in another) are estimated on the basis of historical data. Finally, the resulting output is used to make forecasts about the future performance of the system.

Specification. The model specification is the description of the model's structure. This structure consists of the relationships among variables, both those that describe the physical setting and those that describe behavior. The relationships are expressed as equations, and a large econometric model may have hundreds or even thousands of equations reflecting the many interrelationships among the variables.

For example, an econometric model of the macroeconomy typically will contain equations specifying the relationship between GNP and consumption, investment, government activity, and international trade. It also will include behavioral equations that describe how these individual quantities are determined. The modeler may expect, for instance, that high unemployment reduces inflation and vice versa, a relationship known as the Phillips curve. One of the equations in the model will therefore express the Phillips curve, specifying that the rate of inflation depends on the amount of unemployment. Another equation may relate unemployment to the demand for goods, the wage level, and worker productivity. Still other equations may explain wage level in terms of yet other factors.

Not surprisingly, econometrics draws on economic theory to guide the specification of its models. The validity of the models thus depends on the validity of the underlying economic theories. Though there are many flavors of economics, a small set of basic assumptions about human behavior are common to most theories, including modern neoclassical theory and the "rational expectations" school. These assumptions are: optimization, perfect information, and equilibrium.

In econometrics, people (economic agents, in the jargon) are assumed to be concerned with just one thing—maximizing their profits. Consumers are assumed to optimize the "utility" they derive from their resources. Decisions about how much to produce, what goods to purchase, whether to save or borrow, are assumed to be the result of optimization by individual decision makers. Non-economic considerations (defined as any behavior that diverges from profit or utility maximization) are ignored or treated as local aberrations and special cases.

Of course, to optimize, economic agents would need accurate information about the world. The required information would go beyond the current state of affairs; it also would include complete knowledge about available options and their consequences. In most econometric models, such knowledge is assumed to be freely available and accurately known.

Take, for example, an econometric model simulating the operation of a firm that is using an optimal mix of energy, labor, machines, and other inputs in its production process. The model will assume that the firm knows not only the wages of workers and the prices of machines and other inputs, but also the production attainable with different combinations of people and machines, even if those combinations have never been tried. Rational expectations models go so far as to assume that the firm knows future prices, technologies, and possibilities, and that it can perfectly anticipate the consequences of its own actions and those of competitors.

The third assumption is that the economy is in or near equilibrium nearly all of the time. If disturbed, it is usually assumed to return to equilibrium rapidly and in a smooth and stable manner. The prevalence of static thinking is the intellectual legacy of the pioneers of mathematics and economics. During the late nineteenth century, before computers or modern cybernetic theory, the crucial questions of economic theory involved the nature of the equilibrium state for different situations. Given human preferences and the technological possibilities for producing goods, at what prices will commodities be traded, and in what quantities? What will wages be? What will profits be? How will a tax or monopoly influence the equilibrium?

These questions proved difficult enough without tackling the more difficult problem of dynamics, of the behavior of a system in flux. As a result, dynamic economic theory—including the recurrent fluctuations of inflation, of the business cycle, of the growth and decline of industries and nations—remained primarily descriptive and qualitative long after equilibrium theory was expressed mathematically. Even now, dynamic behavior in economics tends to be seen as a transition from one equilibrium to another, and the transition is usually assumed to be stable.

The rich heritage of static theory in economics left a legacy of equilibrium for econometrics. Many econometric models assume that markets are in equilibrium at all times. When adjustment dynamics are modeled, variables are usually assumed to adjust in a smooth and stable manner toward the optimal, equilibrium value, and the lags are nearly always fixed in length. For example, most macroeconometric models assume that capital stocks of firms in the economy adjust to the optimal, profit-maximizing level, with a fixed lag of several years. The lag is the same whether the industries that supply investment goods have the capacity to meet the demand or not. (See, for example, Eckstein 1983 and Jorgenson 1963.)

All modeling methods must specify thestructure of the system and estimateparameters. The use of statistical proceduresto derive the parameters of the model is thehallmark of econometrics and distinguishes itfrom other forms of simulation. It giveseconometricians an insatiable appetite fornumerical data, for without numerical datathey cannot carry out the statisticalprocedures used to estimate the models. It isno accident that the rise of econometrics wenthand in hand with the quantification ofeconomic life. The development of thenational income and produce accounts bySimon Kuznets in the 1930s was a majoradvance in the codification of economic data,permitting consistent measures of economicactivity at the national level for the first time.To this day all major macroeconometricmodels rely heavily on the national accountsdata, and indeed macroeconomic theory itselfhas adapted to the national accountsframework.

Yet clearly, when the supplying industrieshave excess capacity, orders can be filledrapidly; when capacity is strained, customersmust wait in line for delivery. Whether thedynamic nature of the lag is expressed in amodel does make a difference. Models thatexplicitly include the determinants of theinvestment delay will yield predictionssignificantly different from models thatassume a fixed investment lag regardless ofthe physical capability of the economy to fillthe demand (Senge 1980). In general, modelsthat explicitly portray delays and theirdeterminants will yield different results frommodels that simply assume smoothadjustments from one optional state toanother.

All modeling methods must specify the structure of the system and estimate parameters. The use of statistical procedures to derive the parameters of the model is the hallmark of econometrics and distinguishes it from other forms of simulation. It gives econometricians an insatiable appetite for numerical data, for without numerical data they cannot carry out the statistical procedures used to estimate the models. It is no accident that the rise of econometrics went hand in hand with the quantification of economic life. The development of the national income and product accounts by Simon Kuznets in the 1930s was a major advance in the codification of economic data, permitting consistent measures of economic activity at the national level for the first time. To this day all major macroeconometric models rely heavily on the national accounts data, and indeed macroeconomic theory itself has adapted to the national accounts framework.

Forecasting. The third step in econometric modeling is forecasting, making predictions about how the real system will behave in the future. In this step, the modeler provides estimates of the future values of the exogenous variables, that is, those variables that influence the other variables in the model but aren't themselves influenced by the model. An econometric model may have dozens of exogenous variables, and each must be forecast before the model can be used to predict.
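Continuing the sketch above (the numbers remain hypothetical), the forecasting step applies the estimated equation to guessed future values of the exogenous variable; the forecast can be no better than those guesses:

    # Exogenous input: the modeler's guesses about future unemployment (percent).
    assumed_unemployment = [5.0, 5.2, 5.4]   # pure judgment, not model output

    for year, u in enumerate(assumed_unemployment, start=1):
        forecast_inflation = intercept + slope * u   # estimated equation from above
        print(f"year {year}: assumed u = {u:.1f} -> forecast inflation = {forecast_inflation:.2f}")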

Limitations of Econometric Modeling

The chief weak spots in econometric models stem from the assumptions of the underlying economic theory on which they rest: assumptions about the rationality of human behavior, about the availability of information that real decision makers do not have, and about equilibrium. Many economists acknowledge the idealization and abstraction of these assumptions, but at the same time point to the powerful results that have been derived from them. However, a growing number of prominent economists now argue that these assumptions are not just abstract—they are false. In his presidential address to the Royal Economic Society, E. H. Phelps-Brown said:

The trouble here is not that the behavior of these economic chessmen has been simplified, for simplification seems to be part of all understanding. The trouble is that the behavior posited is not known to be what obtains in the actual economy. (Phelps-Brown 1972, p. 4)

Nicholas Kaldor of Cambridge University is even more blunt:

...in my view, the prevailing theory of value—what I called, in a shorthand way, "equilibrium economics"—is barren and irrelevant as an apparatus of thought... (Kaldor 1972, p. 1237)

As mentioned earlier, a vast body of empirical research in psychology and organizational studies has shown that people do not optimize or act as if they optimize, that they don't have the mental capabilities to optimize their decisions, and that even if they had the computational power necessary, they lack the information needed to optimize. Instead, they try to satisfy a variety of personal and organizational goals, use standard operating procedures to routinize decision making, and ignore much of the available information to reduce the complexity of the problems they face. Herbert Simon, in his acceptance speech for the 1978 Nobel Prize in economics, concludes:

There can no longer be any doubt that the micro assumptions of the theory—the assumptions of perfect rationality—are contrary to fact. It is not a question of approximation; they do not even remotely describe the processes that human beings use for making decisions in complex situations. (Simon 1979, p. 510)

Econometrics also contains inherent statistical limitations. The regression procedures used to estimate parameters yield unbiased estimates only under certain conditions. These conditions are known as maintained hypotheses because they are assumptions that must be made in order to use the statistical technique. The maintained hypotheses can never be verified, even in principle, but must be taken as a matter of faith. In the most common regression technique, ordinary least squares, the maintained hypotheses include the unlikely assumptions that the variables are all measured perfectly, that the model being estimated corresponds perfectly to the real world, and that the random errors in the variables from one time period to another are completely independent. More sophisticated techniques do not impose such restrictive assumptions, but they always involve other a priori hypotheses that cannot be validated.
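The consequence of a failed maintained hypothesis is easy to demonstrate. In the hypothetical Monte Carlo sketch below, the true slope is 1.0, but measuring the explanatory variable with error, which violates the "measured perfectly" assumption, biases the ordinary least squares estimate toward zero:

    import numpy as np

    rng = np.random.default_rng(0)
    true_slope, n = 1.0, 10_000

    x = rng.normal(size=n)                       # the true explanatory variable
    y = true_slope * x + rng.normal(scale=0.1, size=n)
    x_observed = x + rng.normal(size=n)          # measured with substantial error

    clean_fit = np.polyfit(x, y, 1)[0]           # maintained hypothesis holds
    noisy_fit = np.polyfit(x_observed, y, 1)[0]  # maintained hypothesis violated

    print(f"slope with perfect measurement: {clean_fit:.2f}")   # ~1.00
    print(f"slope with measurement error:   {noisy_fit:.2f}")   # ~0.50, biased toward zero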

Another problem is that econometrics fails to distinguish between correlations and causal relationships. Simulation models must portray the causal relationships in a system if they are to mimic its behavior, especially its behavior in new situations. But the statistical techniques used to estimate parameters in econometric models don't prove whether a relationship is causal. They only reveal the degree of past correlation between the variables, and these correlations may change or shift as the system evolves. The prominent economist Robert Lucas (1976) makes the same point in a different context.

Consider the Phillips curve as an example. Though economists often interpreted the Phillips curve as a causal relationship—a policy trade-off between inflation and unemployment—it never did represent the causal forces that determine inflation or wage increases. Rather, the Phillips curve was simply a way of restating the past behavior of the system. In the past, Phillips said, low unemployment had tended to occur at the same time inflation was high, and vice versa.

Then, sometime in the early 1970s, the Phillips curve stopped working; inflation rose while unemployment worsened. Among the explanations given by economists was that the structure of the system had changed. But a modeler's appeal to "structural change" usually means that the inadequate structure of the model has to be altered because it failed to anticipate the behavior of the real system!

What actually occurred in the 1970s was that, when inflation swept prices to levels unprecedented in the industrial era, people learned to expect continuing increases. Through the adaptive feedback process of learning, they came to deal with high inflation through indexing, COLAs, inflation-adjusted accounting, and other adjustments. The structure, the causal relationships of the system, did not change. Instead, causal relationships that had been present all along (but were dormant in an era of low inflation) gradually became active determinants of behavior as inflation worsened. In particular, the ability of people to adapt to continuing inflation existed all along but was not tested until inflation became high enough and persistent enough. Then the behavior of the system changed, and the historical correlation between inflation and unemployment broke down.
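A toy experiment (synthetic data and made-up parameters, offered only to illustrate the argument) shows how a fitted correlation can break down when a dormant causal mechanism, here inflation expectations, becomes active:

    import numpy as np

    rng = np.random.default_rng(1)

    def inflation(u, expected):
        # Assumed true structure: expectations always matter, but are near zero
        # in the low-inflation era, so the fitted curve never sees their effect.
        return 8.0 - 1.0 * u + 1.0 * expected + rng.normal(scale=0.2)

    u_series = rng.uniform(3.5, 6.5, size=30)
    era1 = np.array([inflation(u, expected=0.3) for u in u_series])  # calm era
    era2 = np.array([inflation(u, expected=6.0) for u in u_series])  # expectations awaken

    slope, intercept = np.polyfit(u_series, era1, 1)                 # "Phillips curve" fit
    pred = intercept + slope * u_series
    print(f"mean error, calm era:      {np.mean(np.abs(era1 - pred)):.2f}")  # small
    print(f"mean error, inflation era: {np.mean(np.abs(era2 - pred)):.2f}")  # ~5.7: the fit fails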

The reliance of econometric estimation on numerical data is another of its weaknesses. The narrow focus on hard data blinds modelers to less tangible but no less important factors. They ignore both potentially observable quantities that haven't been measured yet and ones for which no numerical data exist. (Alternatively, they may express an unmeasured factor with a proxy variable for which data already exist, even though the relationship between the two is tenuous—as when educational expenditure per capita is used as a proxy for the literacy of a population.)

Among the factors excluded from econometric models because of the hard data focus are many important determinants of decision making, including desires, goals, and perceptions. Numerical data may measure the results of human decision making, but numbers don't explain how or why people made particular decisions. As a result, econometric models cannot be used to anticipate how people would react to a change in decision-making circumstances.

Similarly, econometric models are unable to provide a guide to performance under conditions that have not been experienced previously. Econometricians assume that the correlations indicated by the historical data will remain valid in the future. In reality, those data usually span a limited range and provide no guidance outside historical experience. As a result, econometric models are often less than robust: faced with new policies or conditions, the models break down and lead to inconsistent results.

An example is the model used by Data Resources, Inc. in 1979 to test policies aimed at eliminating oil imports. On the basis of historical numerical data, the model assumed that the response of oil demand to the price of oil was rather weak—a 10 percent increase in oil price caused a reduction of oil demand of only 2 percent, even in the long run. According to the model, for consumption to be reduced by 50 percent (enough to cut imports to zero at the time), oil would have to rise to $800 per barrel. Yet at that price, the annual oil bill for the remaining 50 percent would have exceeded the total GNP for that year, an impossibility (Sterman 1981). The model's reliance on historical data led to inconsistencies. (Today, with the benefit of hindsight, economists agree that oil demand is much more responsive to price than was earlier believed. Yet considering the robustness of the model under extreme conditions could have revealed the problem much earlier.)
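The arithmetic behind that figure is a one-line calculation. Assuming a constant-elasticity demand curve with the elasticity of about -0.2 implied by the model, and a base price of roughly $25 per barrel (an illustrative assumption; the article does not state the base price):

    base_price = 25.0     # assumed 1979 oil price, $/barrel (illustrative)
    elasticity = -0.2     # 10% price rise -> 2% demand reduction, per the DRI model

    # Constant elasticity: demand_ratio = price_ratio ** elasticity.
    # Solve for the price ratio that halves demand:
    price_ratio = 0.5 ** (1.0 / elasticity)   # 0.5 ** -5 = 32
    print(f"required price: ${base_price * price_ratio:.0f} per barrel")   # ~$800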

Validation is another problem area in econometric modeling. The dominant criterion used by econometric modelers to determine the validity of an equation or a model is the degree to which it fits the data. Many econometrics texts (e.g., Pindyck and Rubinfeld 1976) teach that the statistical significance of the estimated parameters in an equation is an indicator of the correctness of the relationship. Such views are mistaken. Statistical significance indicates how well an equation fits the observed data; it does not indicate whether a relationship is a correct or true characterization of the way the world works. A statistically significant relationship between variables in an equation shows that they are highly correlated and that the apparent correlation is not likely to have been the result of mere chance. But it does not indicate that the relationship is causal at all.

Using statistical significance as the test of model validity can lead modelers to mistake historical correlations for causal relationships. It also can cause them to reject valid equations describing important relationships. They may, for example, exclude an equation as statistically insignificant simply because there are few data about the variables, or because the data don't contain enough information to allow the application of statistical procedures.
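The first failure mode is easy to reproduce. The sketch below, a hypothetical demonstration in the spirit of the spurious-regression literature, correlates pairs of random walks that are causally unrelated by construction; "significant" correlations appear far more often than the nominal 5 percent:

    import numpy as np

    rng = np.random.default_rng(2)
    n, trials = 200, 1000

    t_stats = []
    for _ in range(trials):
        # Two independent random walks: no causal link whatsoever.
        x = np.cumsum(rng.normal(size=n))
        y = np.cumsum(rng.normal(size=n))
        r = np.corrcoef(x, y)[0, 1]
        t_stats.append(abs(r) * np.sqrt((n - 2) / (1 - r**2)))

    # With independent *stationary* series, |t| > 1.96 should occur about 5% of
    # the time; trending series blow well past that.
    share_significant = np.mean(np.array(t_stats) > 1.96)
    print(f"share of spurious 'significant' correlations: {share_significant:.0%}")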

Ironically, a lack of statistical significance does not necessarily lead econometric modelers to the conclusion that the model or the equation is invalid. When an assumed relationship fails to be statistically significant, the modeler may try another specification for the equation, hoping to get a better statistical fit. Without recourse to descriptive, micro-level data, the resulting equations may be ad hoc and bear only slight resemblance to either economic theory or actual behavior. Alternatively, the modelers may attempt to explain the discrepancy between the model and the behavior of the real system by blaming it on faulty data collection, exogenous influences, or other factors.

The Phillips curve again provides an example. When it broke down, numerous revisions of the equations were made. These attempts to find a better statistical fit met with limited success. Some analysts took another tack, pointing to the oil price shock, the Russian wheat deal, or other one-of-a-kind events as the explanation for the change. Still others argued that there had been structural changes that caused the Phillips curve to shift out to higher levels of unemployment for any given inflation rate.

These flaws in econometrics have generated serious criticism from within the economics profession. Phelps-Brown notes that because controlled experiments are generally impossible in economics, "running regressions between time series is only likely to deceive" (Phelps-Brown 1972, p. 6). Lester Thurow notes that econometrics has failed as a method for testing theories and is now used primarily as a "showcase for exhibiting theories." Yet as a device for advocacy, econometrics imposes few constraints on the prejudices of the modeler. Thurow concludes:

By simple random search, the analyst looks for the set of variables and functional forms that give the best equations. In this context the best equation is going to depend heavily upon the prior beliefs of the analyst. If the analyst believes that interest rates do not affect the velocity of money, he finds a 'best' equation that validates his particular prior belief. If the analyst believes that interest rates do affect the velocity of money, he finds a 'best' equation that validates this prior belief. (Thurow 1983, pp. 107-8)

But the harshest assessment of all comes from Nobel laureate Wassily Leontief:

Year after year economic theorists continue to produce scores of mathematical models and to explore in great detail their formal properties; and the econometricians fit algebraic functions of all possible shapes to essentially the same sets of data without being able to advance, in any perceptible way, a systematic understanding of the structure and the operations of a real economic system. (Leontief 1982, p. 107; see also Leontief 1971.)

("Forecasters Overhaul Models of Economyin Wake of 1982 Errors," Wall StreetJournal, 17 February 1983). Business Week("Where Big Econometric Models GoWrong," 30 March 1981) quotes aneconomist who points out that there is noway of knowing where the Wharton modelends and the model's developer, Larry Klein,takes over. Of course, the adjustments madeby add factoring are strongly colored by thepersonalities and political philosophies of themodelers. In the article cited above, the WallStreet Journal quotes Otto Eckstein asconceding that his forecasts sometimesreflect an optimistic view: "Data Resources isthe most influential forecasting firm in thecountry...If it were in the hands of a doom-and-gloomer, it would be bad for thecountry."

Unfortunately, econometrics fails on this score also; in practice, econometric models do not predict very well. The predictive power of econometric models, even over the short term (one to four years), is poor and virtually indistinguishable from that of other forecasting methods. There are several reasons for this failure to predict accurately.

As noted earlier, in order to forecast, the modeler must provide estimates of the future values of the exogenous variables, and an econometric model may have dozens of these variables. The source of the forecasts for these variables may be other models but usually is the intuition and judgment of the modeler. Forecasting the exogenous variables consistently, much less correctly, is difficult.

Not surprisingly, the forecasts produced by econometric models often don't square with the modeler's intuition. When they feel the model output is wrong, many modelers, including those at the "big three" econometric forecasting firms—Chase Econometrics, Wharton Econometric Forecasting Associates, and Data Resources—simply adjust their forecasts. This fudging, or add factoring as they call it, is routine and extensive. The late Otto Eckstein of Data Resources admitted that their forecasts were 60 percent model and 40 percent judgment ("Forecasters Overhaul Models of Economy in Wake of 1982 Errors," Wall Street Journal, 17 February 1983). Business Week ("Where Big Econometric Models Go Wrong," 30 March 1981) quotes an economist who points out that there is no way of knowing where the Wharton model ends and the model's developer, Larry Klein, takes over. Of course, the adjustments made by add factoring are strongly colored by the personalities and political philosophies of the modelers. In the article cited above, the Wall Street Journal quotes Otto Eckstein as conceding that his forecasts sometimes reflect an optimistic view: "Data Resources is the most influential forecasting firm in the country...If it were in the hands of a doom-and-gloomer, it would be bad for the country."

In a revealing experiment, the Joint Economic Committee of Congress (through the politically neutral General Accounting Office) asked these three econometric forecasting firms (DRI, Chase, and Wharton) to make a series of simulations with their models, running the models under different assumptions about monetary policy. One set of forecasts was "managed," or add factored, by the forecasters at each firm. The other set consisted of pure forecasts, made by the GAO using the untainted results of the models. As an illustration of the inconsistencies revealed by the experiment, consider the following: When the money supply was assumed to be fixed, the DRI model forecast that after ten years the interest rate would be 34 percent, a result totally contrary to both economic theory and historical experience. The forecast was then add factored down to a more reasonable 7 percent. The other models fared little better, revealing both the inability of the pure models to yield meaningful results and the extensive ad hoc adjustments made by the forecasters to render the results palatable (Joint Economic Committee 1982).
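Mechanically, an add factor is simply a judgmental override of the model's output. A deliberately bare sketch, using the numbers from the DRI anecdote above:

    # Sketch of add factoring (numbers from the anecdote above; the practice itself
    # is rarely documented, which is exactly the critics' complaint).
    pure_model_forecast = 34.0   # interest rate, percent: the untainted model output
    add_factor = -27.0           # judgmental adjustment; its rationale is unexaminable

    published_forecast = pure_model_forecast + add_factor
    print(f"published forecast: {published_forecast:.1f} percent")   # 7.0, the "reasonable" number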

Add factoring has been criticized by other economists on the grounds that it is unscientific. They point out that, although the mental models used to add factor are the mental models of seasoned experts, these experts are subject to the same cognitive limitations other people face. And whether good or bad, the assumptions behind add factoring are always unexaminable.

The failures of econometric models have not gone unnoticed. A representative sampling of articles in the business press on the topic of econometric forecasting includes the following headlines:

"1980: The Year The Forecasters Really Blew It." (Business Week, 14 July 1980)

"Where The Big Econometric Models Go Wrong." (Business Week, 30 March 1981)

"Forecasters Overhaul Models of Economy in Wake of 1982 Errors." (Wall Street Journal, 17 February 1983)

"Business Forecasters Find Demand Is Weak in Their Own Business: Bad Predictions Are Factor." (Wall Street Journal, 7 September 1984)

"Economists Missing The Mark: More Tools, Bigger Errors." (New York Times, 12 December 1984)

The result of these failures has been an erosion of credibility regarding computer models: all models, no matter what their purpose, not just econometric models designed for prediction. This is unfortunate. Econometric models are poor forecasting tools, but well-designed simulation models can be valuable tools for foresight and policy design. Foresight is the ability to anticipate how the system will behave if and when certain changes occur. It is not forecasting, and it does not depend on the ability to predict. In fact, there is substantial agreement among modelers of global problems that exact, point prediction of the future is neither possible nor necessary:

...at present we are far from being able to predict social-system behavior except perhaps for carefully selected systems in the very short term. Effort spent on attempts at precise prediction is almost surely wasted, and results that purport to be such predictions are certainly misleading. On the other hand, much can be learned from models in the form of broad, qualitative, conditional understanding—and this kind of understanding is useful (and typically the only basis) for policy formulation. If your doctor tells you that you will have a heart attack if you do not stop smoking, this advice is helpful, even if it does not tell you exactly when a heart attack will occur or how bad it will be. (Meadows, Richardson, and Bruckmann 1982, p. 279)

Of course, policy evaluation and foresight depend on an accurate knowledge of the history and current state of the world, and econometrics has been a valuable stimulus to the development of much-needed data gathering and measurement by governments and private companies. But econometric models do not seem well suited to the types of problems of concern in policy analysis and foresight. Though these models purport to simulate human behavior, they in fact rely on unrealistic assumptions about the motivations of real people and the information available to them. Though the models must represent the physical world, they commonly ignore dynamic processes, disequilibrium, and the physical basis for delays between actions and results. Though they may incorporate hundreds of variables, they often ignore soft variables and unmeasured quantities. In real systems the feedback relationships between environmental, demographic, and social factors are usually as important as economic influences, but econometric models often omit these because numerical data are not available. Furthermore, econometrics usually deals with the short term, while foresight takes a longer view. Over the time span that is of concern in foresight, real systems are likely to deviate from their past recorded behavior, making unreliable the historical correlations on which econometric models are based.

Checklist for the Model Consumer

The preceding discussion has focused on the limitations of various modeling approaches in order to provide potential model consumers with a sense of what to look out for when choosing a model. Despite the limitations of modeling, there is no doubt that computer models can be and have been extremely useful foresight tools. Well-built models offer significant advantages over the often faulty mental models currently in use.

The following checklist provides further assistance to decision makers who are potential model users. It outlines some of the key questions that should be asked to evaluate the validity of a model and its appropriateness as a tool for solving a specific problem.

What is the problem at hand? What is the problem addressed by the model?

What is the boundary of the model? What factors are endogenous? Exogenous? Excluded? Are soft variables included? Are feedback effects properly taken into account? Does the model capture possible side effects, both harmful and beneficial?

What is the time horizon relevant to the problem? Does the model include as endogenous components those factors that may change significantly over the time horizon?

Are people assumed to act rationally and to optimize their performance? Does the model take non-economic behavior (organizational realities, non-economic motives, political factors, cognitive limitations) into account?

Does the model assume people have perfect information about the future and about the way the system works, or does it take into account the limitations, delays, and errors in acquiring information that plague decision makers in the real world?

Are appropriate time delays, constraints, and possible bottlenecks taken into account?

Is the model robust in the face of extreme variations in input assumptions? (See the sketch following this checklist.)

Are the policy recommendations derived from the model sensitive to plausible variations in its assumptions?

Are the results of the model reproducible? Or are they adjusted (add factored) by the model builder?

Is the model currently operated by the team that built it? How long does it take for the model team to evaluate a new situation, modify the model, and incorporate new data?

Is the model documented? Is the documentation publicly available? Can third parties use the model and run their own analyses with it?
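As one illustration of the robustness question above, a model's inputs can be driven to extremes to check that the outputs remain physically meaningful. A minimal sketch, assuming a hypothetical one-equation demand model in the spirit of the DRI example discussed earlier (all parameter values are assumptions for illustration):

    def oil_demand(price, base_demand=100.0, base_price=25.0, elasticity=-0.2):
        """Hypothetical constant-elasticity demand equation (illustrative only)."""
        return base_demand * (price / base_price) ** elasticity

    # Extreme-conditions test: sweep price far beyond historical experience and
    # check the implied spending against a (hypothetical) GNP ceiling.
    gnp_ceiling = 2.5e12      # assumed total GNP, $/year, for the consistency check
    barrels_per_year = 6.7e9  # assumed annual consumption at the base point

    for price in [25, 100, 400, 800]:
        demand_share = oil_demand(price) / 100.0
        spending = price * demand_share * barrels_per_year
        flag = "IMPOSSIBLE" if spending > gnp_ceiling else "ok"
        print(f"${price:>4}/bbl: demand {demand_share:.0%} of base, "
              f"spending ${spending / 1e12:.1f} trillion  {flag}")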

Conclusions

The inherent strengths and weaknesses of computer models have crucial implications for their application in foresight and policy analysis. Intelligent decision making requires the appropriate use of many different models designed for specific purposes—not reliance on a single, comprehensive model of the world. To repeat a dictum offered above, "Beware the analyst who proposes to model an entire social or economic system rather than a problem." It is simply not possible to build a single, integrated model of the world, into which mathematical inputs can be inserted and out of which will flow a coherent and useful understanding of world trends.

To be used responsibly, models must be subjected to debate. A cross-disciplinary approach is needed; models designed by experts in different fields and for different purposes must be compared, contrasted, and criticized. The foresight process should foster such review.

The history of global modeling provides a good example. The initial global modeling efforts, published in World Dynamics (Forrester 1971) and The Limits to Growth (Meadows et al. 1972), provoked a storm of controversy. A number of critiques appeared, and other global models were soon developed. Over a period of ten years, the International Institute for Applied Systems Analysis (IIASA) conducted a program of analysis and critical review in which the designers of global models were brought together. Six major symposia were held, and eight important global models were examined and discussed. These models had different purposes, used a range of modeling techniques, and were built by persons with widely varying backgrounds. Even after the IIASA conferences, there remain large areas of methodological and substantive disagreement among the modelers. Yet despite these differences, consensus did emerge on a number of crucial issues (Meadows, Richardson, and Bruckmann 1982), including the following:

Physical and technical resources exist to satisfy the basic needs of all the world's people into the foreseeable future.

Population and material growth cannot continue forever on a finite planet.

Continuing "business as usual" policies in the next decades will not result in a desirable future nor even in the satisfaction of basic human needs.

Technical solutions alone are not sufficient to satisfy basic needs or create a desirable future.

The IIASA program on global modeling represents the most comprehensive effort to date to use computer models as a way to improve human understanding of social issues. The debate about the models created agreement on crucial issues where none had existed. The program helped to guide further research and provided a standard for the effective conduct of foresight in both the public and private sectors.

At the moment, model-based analyses usually take the form of studies commissioned by policymakers. The clients sit and wait for the final reports, largely ignorant of the methods, assumptions, and biases that the modelers put into the models. The policymakers are thus placed in the role of supplicants awaiting the prophecies of an oracle. When the report finally arrives, they may, like King Croesus before the Oracle at Delphi, interpret the results in accordance with their own preconceptions. If the results are unfavorable, they may simply ignore them. Policymakers who use models as black boxes, who accept them without scrutinizing their assumptions, who do not examine the sensitivity of the conclusions to variations in premises, who do not engage the model builders in dialogue, are little different from the Delphic supplicants or the patrons of astrologers. And these policymakers justly alarm critics, who worry that black box modeling abdicates to the modelers and the computer a fundamental human responsibility (Weizenbaum 1976).

No one can (or should) make decisions on the basis of computer model results that are simply presented, "take 'em or leave 'em." In fact, the primary function of model building should be educational rather than predictive. Models should not be used as a substitute for critical thought, but as a tool for improving judgment and intuition. Promising efforts in corporations, universities, and public education are described in Senge 1989; Graham, Senge, Sterman, and Morecroft 1989; Kim 1989; and Richmond 1987.

Towards that end, the role of computer models in policymaking needs to be redefined. What is the point of computer modeling? It should be remembered that we all use models of some sort to make decisions and to solve problems. Most of the pressing issues with which public policy is concerned are currently being handled solely with mental models, and those mental models are failing to resolve the problems. The alternative to continued reliance on mental models is computer modeling. But why turn to computer models if they too are far from perfect?

The value in computer models derives from the differences between them and mental models. When the conflicting results of a mental and a computer model are analyzed, when the underlying causes of the differences are identified, both of the models can be improved.

Computer modeling is thus an essential part of the educational process rather than a technology for producing answers. The success of this dialectic depends on our ability to create and learn from shared understandings of our models, both mental and computer. Properly used, computer models can improve the mental models upon which decisions are actually based and contribute to the solution of the pressing problems we face.

References

Barney, Gerald O., ed. 1980. The Global 2000 Report to the President. 3 vols. Washington, DC: US Government Printing Office.

Business Forecasters Find Demand Is Weak in Their Own Business: Bad Predictions Are Factor. Wall Street Journal, 7 September 1984.

Cyert, R., and March, J. 1963. A Behavioral Theory of the Firm. Englewood Cliffs, NJ: Prentice Hall.

Department of Energy. 1979. National Energy Plan II. DOE/TIC-10203. Washington, DC: Department of Energy.

Department of Energy. 1983. Energy Projections to the Year 2000. Washington, DC: Department of Energy, Office of Policy, Planning, and Analysis.

Eckstein, O. 1983. The DRI Model of the US Economy. New York: McGraw-Hill.

Economists Missing The Mark: More Tools, Bigger Errors. New York Times, 12 December 1984.

Federal Energy Administration. 1974. Project Independence Report. Washington, DC: Federal Energy Administration.

Forecasters Overhaul Models of Economy in Wake of 1982 Errors. Wall Street Journal, 17 February 1983.

Forrester, Jay W. 1969. Urban Dynamics. Cambridge, Mass.: MIT Press.

Forrester, Jay W. 1971. World Dynamics. Cambridge, Mass.: MIT Press.

Forrester, Jay W. 1980. Information Sources for Modeling the National Economy. Journal of the American Statistical Association 75(371):555-574.

Graham, Alan K.; Senge, Peter M.; Sterman, John D.; and Morecroft, John D. W. 1989. Computer-Based Case Studies in Management Education and Research. In Computer-Based Management of Complex Systems, eds. P. Milling and E. Zahn, pp. 317-326. Berlin: Springer Verlag.

Grant, Lindsey, ed. 1988. Foresight and National Decisions: The Horseman and the Bureaucrat. Lanham, Md.: University Press of America.

Greenberger, M.; Crenson, M. A.; and Crissey, B. L. 1976. Models in the Policy Process. New York: Russell Sage Foundation.

Hogarth, R. M. 1980. Judgment and Choice. New York: Wiley.

Joint Economic Committee. 1982. Three Large Scale Model Simulations of Four Money Growth Scenarios. Prepared for the Subcommittee on Monetary and Fiscal Policy, 97th Congress, 2nd Session, Washington, DC.

Jorgenson, D. W. 1963. Capital Theory and Investment Behavior. American Economic Review 53:247-259.

Kahneman, D.; Slovic, P.; and Tversky, A. 1982. Judgment Under Uncertainty: Heuristics and Biases. Cambridge: Cambridge University Press.

Kaldor, Nicholas. 1972. The Irrelevance of Equilibrium Economics. The Economic Journal 82:1237-1255.

Kim, D. 1989. Learning Laboratories: Designing a Reflective Learning Environment. In Computer-Based Management of Complex Systems, eds. P. Milling and E. Zahn, pp. 327-334. Berlin: Springer Verlag.

Leontief, Wassily. 1971. Theoretical Assumptions and Nonobserved Facts. American Economic Review 61(1):1-7.

Leontief, Wassily. 1982. Academic Economics. Science 217:104-107.

Lucas, R. 1976. Econometric Policy Evaluation: A Critique. In The Phillips Curve and Labor Markets, K. Brunner and A. Meltzer, eds. Amsterdam: North-Holland.

Meadows, Donella H.; Meadows, Dennis L.; Randers, Jorgen; and Behrens, William W. 1972. The Limits to Growth. New York: Universe Books.

Meadows, Donella H. 1982. Whole Earth Models and Systems. CoEvolution Quarterly, Summer 1982, pp. 98-108.

Meadows, Donella H.; Richardson, John; and Bruckmann, Gerhart. 1982. Groping in the Dark. Somerset, NJ: Wiley.

Morecroft, John D. W. 1983. System Dynamics: Portraying Bounded Rationality. Omega 11:131-142.

Morecroft, John D. W. 1988. System Dynamics and Microworlds for Policy Makers. European Journal of Operational Research 35(5):301-320.

1980: The Year The Forecasters Really Blew It. Business Week, 14 July 1980.

Phelps-Brown, E. H. 1972. The Underdevelopment of Economics. The Economic Journal 82:1-10.

Pindyck, R., and Rubinfeld, D. 1976. Econometric Models and Economic Forecasts. New York: McGraw-Hill.

Richmond, B. 1987. The Strategic Forum. High Performance Systems, Inc., 45 Lyme Road, Hanover, NH 03755, USA.

Senge, Peter M. 1980. A System Dynamics Approach to Investment Function Formulation and Testing. Socioeconomic Planning Sciences 14:269-280.

Senge, Peter M. 1989. Catalyzing Systems Thinking Within Organizations. In Advances in Organization Development, F. Masaryk, ed., forthcoming.

Simon, Herbert. 1947. Administrative Behavior. New York: MacMillan.

Simon, Herbert. 1957. Models of Man. New York: Wiley.

Simon, Herbert. 1979. Rational Decisionmaking in Business Organizations. American Economic Review 69:493-513.

Sterman, John D. 1981. The Energy Transition and the Economy: A System Dynamics Approach. Ph.D. dissertation, Massachusetts Institute of Technology, Cambridge.

Sterman, John D. 1985. A Behavioral Model of the Economic Long Wave. Journal of Economic Behavior and Organization 6(1):17-53.

Sterman, John D. 1989. Modeling Managerial Behavior: Misperceptions of Feedback in a Dynamic Decision Making Experiment. Management Science 35(3):321-339.

Stobaugh, Robert, and Yergin, Daniel. 1979. Energy Future. New York: Random House.

Thurow, Lester. 1983. Dangerous Currents. New York: Random House.

Weizenbaum, J. 1976. Computer Power and Human Reason: From Judgment to Calculation. San Francisco: W. H. Freeman.

Where The Big Econometric Models Go Wrong. Business Week, 30 March 1981.

The author wishes to acknowledge that many of the ideas expressed here emerged from discussions with, or were first formulated by, among others, Jay Forrester, George Richardson, Peter Senge, and especially Donella Meadows, whose book Groping in the Dark (Meadows, Richardson, and Bruckmann 1982) was particularly helpful.