

Bayesian Comparison of Dynamic Macroeconomic Models

John S. Landon-Lane ∗

University of New South Wales

Sydney, NSW. 2052

E-Mail: [email protected]

Preliminary draft. Please do not quote.

October 27, 1998

Abstract

This paper develops a method to formally compare and evaluate non-linear dynamic macroeconomic models. In particular, the method developed aims to compare models that are found in the real business cycle (RBC) literature. The method is based on the Bayesian model comparison literature and is flexible enough to incorporate model uncertainty and prior beliefs over the parameters of the model into the final decision. Calibration is therefore treated as a special case of a more general method. The method is also able to compare two models over subsets of the data as well as the whole data set, thus allowing for the identification of where one model is "better" than another.

In order to use the Bayesian model comparison literature, a likelihood function for each model needs to be calculated. Using Bayesian techniques, a likelihood function for a typical model in the RBC literature is calculated directly from the equations that make up the solutions to the model. Because the method is likelihood based, models in the RBC literature can now be compared across the full dimension of the data.

To illustrate the method, two models found in the RBC literature are compared for the case where there is no prior uncertainty over the structural parameters of the model and for the case where there is some prior uncertainty over the parameters. It is found that the comparison is sensitive to the prior specification. The paper also shows how to draw from the posterior distributions of models that are typical in the RBC literature using standard Markov chain Monte Carlo techniques.

1 Introduction

Dynamic macroeconomic models have become an important tool in modern macroeconomics. In macroeconomics, dynamic general equilibrium models have been used to develop models of the economy; to ask and answer questions regarding observed economic phenomena in these economies; and to conduct

∗I would like to thank Professors John Geweke, Michael Keane, Lee Ohanian, and Craig Swan for their useful comments. I also acknowledge the comments of Arnie Quinn, Claustre Bajona-Xandri and seminar participants at Louisiana State University, Rice University and the University of Minnesota. All remaining mistakes are my own.


experiments using the models. Frequently, new models are distinguished from older models by how they emulate features in the data. There may be a puzzling feature of the data that is observed, and a new model may be proposed that attempts to mimic this feature. If one is to seriously ask whether a model answers a particular question, one first must believe that the model used to answer the question is a "good" model. However, what constitutes a "good" model is a difficult question. It may be too much to expect a model to perfectly predict the complex system that is being observed. In that case, one has to decide what aspects of the system are important for a model to explain well. Once a determination has been made on what the model should be attempting to explain, it is necessary to determine how to decide whether a model has replicated the important aspects of the observed system.

In economics, the models that have been used to model the observed economic world have become more and more complex. In particular, after the seminal paper of Kydland and Prescott (1982), the Real Business Cycle (RBC) literature grew, using more and more complicated dynamic non-linear models in the search for answers to questions posed by investigators. The approach of the RBC literature that is described by Kydland and Prescott (1996) is a familiar one: models are advanced to approximate an economy that is observed. The models are tested to see if they are "good" representations of the economy that they are attempting to model. If a model is determined to be a "good" representation of the observed economy, it is then used to answer questions regarding the observed economy or used as an experimental tool to analyze the effects of competing policies that are proposed.

The methods that have been used to validate the use of the models in the RBC literature have come under increasing criticism. The most common method of determining whether a model does a "good" job of mimicking an observed economy has been criticized for being too informal (Hansen and Heckman 1996). Various alternative methods have been proposed to evaluate the ability of models with respect to observations on the economy. Almost all of the alternative methods evaluate the performance of a model with respect to the observed data rather than evaluating a model with respect to other competing models in the literature. According to Kydland and Prescott (1996), the models used in the RBC literature should not be expected to predict the observed economy perfectly. Placing too high a standard on these models could lead to the rejection of models that could nevertheless lead to increased understanding of the observations at hand. Their view is that while the models are not good at prediction, they could be useful in helping to understand observations from an economy. If this is the case, then it is important to be able to distinguish which model of the type used in the RBC literature is the "best". Therefore, a method that is able to distinguish among the class of models used in the RBC literature is needed. One of the criticisms of the literature made by Stadler (1994) was that there were no methods that directly compared models in the RBC literature. The current methods of model evaluation involve the comparison of models to the observed data, followed by an informal comparison across models.

Once a model has been proposed and shown to "fit" the observations sufficiently, it is common to use the model to analyze the effects of changing structural parts of the model. Again, any answer to this kind of experiment can only be taken seriously if the model that is used to conduct the experiment can be shown to improve upon competing models. Note that a competing model does not necessarily have to be of a similar type. In fact, it is useful to know how a model performs against different models that are not of the same type. Inferences from a set of models can only be put into context if there is a comparison of those models with other models that are found in the wider literature. Therefore it is important for any method of comparison to be able to formally compare models inside the RBC literature directly with models outside the RBC literature.

In this paper, a method will be developed that will allow for the direct comparison of models bothinside and outside the RBC literature. In relation to the method of comparison, the concept of “better”


will be defined, and it will be shown that the method can be extended to allow for the comparison of models across sub-samples of the data as well as across the whole sample. The ability to compare models across sub-samples could allow for a better understanding of the differences between two models that are being compared.

Another problem inherent in this approach to model selection is the use of prior information. Prior information is used to determine what characteristics are used to evaluate a model. In the worst case, prior beliefs about a model could lead to biased model selection. A method should therefore be able to formally include prior beliefs about a model's validity in the selection process. With this in mind, the method of comparison will be Bayesian in nature. One benefit of using a Bayesian approach is that model uncertainty is incorporated into the comparison in the same way as any other uncertainty in the model. The Bayesian approach also easily allows for prior uncertainty over the structural parameters of the model. The common practice in the RBC literature is to fix the values of the structural parameters of a model to predetermined values. In the literature this practice is called calibration. The method developed in this paper incorporates the practice of calibration as a special case of a more general method. The more general method has the values of the structural parameters defined with some uncertainty. One criticism (Hansen and Heckman 1996) of the calibration approach is that the studies that are used to help calibrate the values of the structural parameters are ill equipped for the job. This suggests that there is a need to allow for prior uncertainty in the calibration of the models.

Another criticism made by Hansen and Heckman (1996) of how models are compared is that the methods are not likelihood based. The Bayesian method used in this paper is a likelihood-based approach. One feature of this approach is that it satisfies the Likelihood Principle. The Likelihood Principle states that all relevant experimental information is contained in the likelihood function. Therefore, all inferences are based only on the sample that is observed. Berger and Wolpert (1988) contains an excellent discussion of the Likelihood Principle.

However, in order to use a likelihood-based approach, a likelihood function for models that are typically found in the RBC literature needs to be calculated. The first part of this paper deals with this problem. For models in the RBC literature especially, and for dynamic non-linear macroeconomic models more generally, the calculation of a likelihood function is a difficult problem. This is because the models that are used do not have tractable likelihood functions. The models generally do not have as many stochastic components as they have variables, which implies that there is a dimensionality problem when defining the likelihood function. A method of directly constructing and calculating a likelihood function for models found in the RBC literature is defined. This method involves conditioning on the initial conditions of a model. By doing that, a likelihood function can be directly computed. However, given the dimensionality problem, the method that is defined uses only a subset of the variables of a model to calculate the likelihood function.

The layout of this paper is as follows: Section 2 will review the methods currently used to compare models found in the RBC literature, while Section 3 will outline a formal method to compare models in the RBC literature. Section 3 will also outline a method to calculate the likelihood function for a typical model found in the RBC literature. In Section 4 there will be an application of the method. The application will entail the formal comparison of the models found in Farmer and Guo (1994), both for the case of no prior uncertainty and for the case of prior uncertainty over the structural parameters of the models. Finally, Section 5 will conclude.


2 Real Business Cycle Models

Surveys of the RBC literature can be found in Stadler (1994), Danthine and Donaldson (1993), and in Kydland and Prescott (1996). In their paper, Kydland and Prescott outline the steps used to pose an interesting question, construct a model and conduct an experiment. They explain the procedure in general and for the case of models in the RBC literature. The first part of their procedure is to pose a question. One example of a question they give is to ask: what is the quantitative nature of fluctuations induced by technology shocks? This was the question that was posed in Kydland and Prescott (1982). Other questions posed in the literature have attempted to understand various observed features of the business cycle. For example, Lucas (1977) notes that amongst observed cyclical fluctuations, investment is more volatile than output, consumption is less volatile than output, and the capital stock is very much less volatile than output (Stadler 1994). Danthine and Donaldson (1993) note that the velocity of money is counter-cyclical and that the correlation between money aggregates and output varies substantially.

Once an interesting question has been posed, Kydland and Prescott (1996) suggest building a model economy based on well-tested theory. Once a model economy is constructed, they suggest that the structural parameters of the model should be fixed to specific values. To do that, they suggest assigning values to the structural parameters of the model so that artificial data generated from the model is similar to that observed. They call this procedure calibration. In essence, prior information is being incorporated into the model with no uncertainty. Once the structural parameters of the model are fixed, it is possible to simulate the model to get artificial observations on the variables that make up the model. Once these artificial observations have been obtained, it is possible to answer the question that was first posed. In order to evaluate how well the question has been answered, however, it is necessary to ask and answer the question of how "good" the model that is being used is. The term "good" could refer to a model's performance relative to a class of models that could have been used to answer the question posed, or it could refer to how well the model predicts the observed data. In the RBC literature, models are usually judged by how they replicate all, or more likely, some of the characteristics of the observed data.

There have been a number of criticisms leveled at the RBC literature (Stadler (1994), Kim and Pagan (1995), Danthine and Donaldson (1993), Hansen and Heckman (1996), and Sims (1996)). This paper, however, will concentrate on the criticisms regarding how models in the RBC literature are evaluated. The most common method is to evaluate RBC models using informal moment matching. Certain stylised facts are taken from the observed data and these facts are used to evaluate the performance of the model. In practice, second moments of the observed data and correlations between variables have been used as the stylised facts. An example of this approach can be found in Kydland and Prescott (1982) and in Hansen (1985).

The first criticism leveled against this approach, by Kim and Pagan (1995), is that the range of stylised facts used in the evaluation may be too restrictive. An example that Kim and Pagan (1995) give is the case of models that aim to study asset returns. They argue that the stylised facts that are used to determine the validity of the model should include facts about whether the asset return is integrated, whether the asset returns exhibit high leptokurtosis, and whether the asset returns exhibit ARCH behavior. Kim and Pagan (1995) claim that very few models in the RBC literature that aim to study asset returns ever use these facts as determinants of a model's performance. Another main criticism of the "stylised fact matching" method of model evaluation is that the distance between the model and the stylised fact is measured informally. This criticism is noted in Kim and Pagan (1995) and in Hansen and Heckman (1996).


There have been a number of attempts to formally measure the distance between a model and the data. Reviews can be found in Canova and Ortega (1995) and Kim and Pagan (1995). The basic approach in the literature is to choose a set of stylised facts and then to choose a metric. Examples of formal measures can be found in Watson (1993), Farmer and Guo (1994), Christiano and Eichenbaum (1992), and Diebold, Ohanian, and Berkowitz (1994). These papers all define formal metrics to measure the distance between the model and the data. The stylised facts used in these papers have varied as well. Impulse response functions have been used, as has the spectral density matrix.

There have been other criticisms aimed at the RBC literature that are important to this paper. These were noted in Hansen and Heckman (1996) and Sims (1996). In their paper, Hansen and Heckman argue that the studies that are used to calibrate models in the RBC literature may not be as appropriate as thought. Their argument is that the studies used in the calibration of a model are typically microeconomic studies, whereas the typical model in the RBC literature is a heavily aggregated dynamic general equilibrium model. Hansen and Heckman (1996) also argue that the parameters of a typical model may not be able to be calibrated with the high accuracy that is common in the RBC literature. This suggests that any method that aims to evaluate model performance in the RBC literature should be able to allow for uncertainty over the structural parameters of the model. The method proposed in this paper allows for prior uncertainty over the structural parameters of a model.

Finally, Sims (1996) argues that model uncertainty should be formally incorporated into model evaluation. Sims (1996) notes that this problem, especially when the models are to be compared and evaluated using only one realization of a time series, only makes sense in a Bayesian context. Using a Bayesian approach to model comparison, all uncertainty over model specification is treated the same way as all other uncertainty that is inherent in the problem and the models being compared.

3 A method to formally compare models in the RBC literature

With the comments of Sims (1996) in mind, the approach to the problem of comparing two models will utilize the Bayesian model comparison literature outlined in Geweke (1998). In order to be able to implement this literature, a likelihood function for models in the RBC literature needs to be constructed.

3.1 Constructing a likelihood function for models in the RBC literature

In order to be able to apply the Bayesian model comparison literature to the problem of directly comparing two or more models from the RBC literature, likelihood functions for the relevant models are needed. The likelihood function for all but the simplest of models found in the RBC literature is intractable. The problem is that for most models in the RBC literature, there is no analytical solution. Solutions to these models are usually approximated. Even with the approximated solutions there is still a problem with constructing a likelihood function for these models. This is because there are usually fewer stochastic elements in the model than there are variables. For example, the indivisible labor model of Hansen (1985) contains only one stochastic element. In that model, the only shock is a productivity shock that impacts upon the production function.

There have been a number of attempts in the literature to construct an approximate likelihood function for models in the RBC literature. One such attempt is the method of Smith (1993), where the data that is generated from a model is represented as a VAR. The likelihood function of the VAR is constructed, and this likelihood function is used to approximate the likelihood function of the model.


Because there are fewer stochastic variables than there are variables, and because it is usual for variables to be expressed as functions of the state vector of the model, it is not possible to construct a VAR that includes all of the variables of the model. Another problem with this method is that it is common to use data that has been transformed to deviations-from-trend form. The common practice is to use the Hodrick and Prescott (1997) filter to calculate the trend for observations on a variable, where the mean of the deviations from trend is equal to zero. In the case of a model with only one shock, the main determinant of the level of the likelihood will be the variance of the simulated data from the model. In comparing two models, the model whose simulated data has the closest variance to the observed data will have the higher likelihood. In essence, the models are being compared with respect to their second moments.

Anderson, Hansen, McGratten, and Sargent (1996) describe another way to construct a likelihood function for a specific class of models of which RBC models are a subset. The approach of Anderson, Hansen, McGratten, and Sargent (1996) is to construct a likelihood function for the model by calculating the "innovations" representation of the model using a Kalman filter.

Once a likelihood function is calculated, Anderson, Hansen, McGratten, and Sargent (1996) use the likelihood function to calculate maximum likelihood estimates of the structural parameters that make up the model. The innovations representation approach to approximating the likelihood function of the model is also used by DeJong, Ingram, and Whiteman (1997). However, in their paper, DeJong, Ingram, and Whiteman (1997) do not assume any measurement error in observing the data on the variables in the model. In this case, there are fewer stochastic components than variables, so DeJong, Ingram, and Whiteman (1997) only use a subset of the data to construct the likelihood function. The number of variables that can be used, in this case, is constrained to be equal to the number of stochastic components in the model. To implement this method, the initial conditions for the Kalman filter along with the initial conditions of the model need to be assumed. In contrast, the method that will be described in this paper treats the initial conditions of the model as unknown parameters of the likelihood function, thus allowing for a direct evaluation of the likelihood function.

A state-space representation is used to construct the likelihood function. Similar to the approach of Smith (1993) and DeJong, Ingram, and Whiteman (1997), only a subset of the variables that make up a model will be used to construct the likelihood function. Unlike Smith (1993) and DeJong, Ingram, and Whiteman (1997), however, the likelihood function will be calculated directly. The approach will be to construct the values of the stochastic components of the model that are implied by the observed data. Once the implied shocks have been calculated, the assumed distribution of the shocks, which is part of the model description, is used to construct the likelihood function. This method relies on the ability to approximate the model in such a way that it is possible to invert the mapping between the model and the data.

Let the model be denoted by M. Let st be an ns × 1 vector of state variables of the model at time period t, where st = (at, bt)′. The vector bt represents the subset of the state vector that is stochastic and at represents the non-stochastic component of the state vector. Let M̂ represent the approximation to the model and let xt represent the vector of all variables that make up the model. Let nx be the order of the vector xt and let na and nb be the orders of the vectors at and bt respectively. The approximation to the model can be obtained in a number of ways. The essential feature, though, is to be able to write all of the variables that are present in the model as functions of the state variables of the model. The model can then be represented as follows:

st+1 = Ms(st) + ut+1 for t = 0, . . . , T − 1, given s0

xt = Mx(st). (3.1)


In the above representation, ut represents the ns × 1 vector of innovations to the state vector, Ms is the function implied by the approximation to the model that relates the current value of the state vector to last period's value, and Mx is the function implied by the approximation to the model that relates all of the variables in the model to the current value of the state vector.
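The recursion in (3.1) is straightforward to simulate. The following sketch is purely illustrative: the functions Ms and Mx, the initial state, and the innovation distribution are hypothetical placeholders, not taken from any particular model in the paper.

```python
import numpy as np

def simulate_model(Ms, Mx, s0, draw_u, T):
    """Simulate the representation (3.1): s_{t+1} = Ms(s_t) + u_{t+1}, x_t = Mx(s_t)."""
    s = np.asarray(s0, dtype=float)
    states, observables = [s], [Mx(s)]
    for _ in range(T):
        s = Ms(s) + draw_u()           # state transition plus innovation draw
        states.append(s)
        observables.append(Mx(s))
    return np.array(states), np.array(observables)

# Illustrative use: a scalar AR(1) state observed directly (hypothetical model)
rng = np.random.default_rng(0)
S, X = simulate_model(Ms=lambda s: 0.9 * s,
                      Mx=lambda s: s,
                      s0=[1.0],
                      draw_u=lambda: rng.normal(0.0, 0.1, size=1),
                      T=50)
```

The returned arrays hold the T + 1 states and the corresponding observables generated by repeatedly applying the transition.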

As an example, consider the model of Hansen (1985). The basic model used in Hansen (1985) is a variant of the one-sector stochastic growth model. This model is a representative agent model with households and firms. Let consumption by the household be represented by ct and let the amount of labor supplied by the household in period t be denoted by ht. Let output be denoted by yt and let the stock of capital in period t be represented by kt. Let investment in capital be denoted by it and assume that the production function is hit by productivity shocks that are denoted by zt. The household values both consumption and leisure. The household has a time endowment of one, so that the amount of leisure in period t is denoted by 1 − ht. Therefore, the representative household solves the intertemporal maximization problem given by

max E ∑_{t=0}^∞ β^t u(ct, 1 − ht) (3.2)

subject to the following constraints. The first constraint is that all of the output of the economy is either consumed or invested. That is,

ct + it ≤ yt. (3.3)

The other constraints are the non-negativity constraints that impose that consumption is non-negative, ct ≥ 0, that the capital stock is non-negative, kt ≥ 0, and that the amount of the labor endowment that is supplied is non-negative and less than or equal to one, 0 ≤ ht ≤ 1. Here the representative household's labor endowment is normalized to be one. Capital is assumed to depreciate at a rate of δ, so that the law of motion for the capital stock is

kt+1 = (1− δ)kt + it, 0 ≤ δ ≤ 1. (3.4)

The representative firm is assumed to have a constant returns to scale production technology that is hit by productivity shocks. The production function is

yt = zt kt^α ht^(1−α), 0 ≤ α ≤ 1, (3.5)

where zt is the productivity shock, which is constrained to be positive. The productivity shock is assumed to follow a first-order Markov process

log zt = γ log zt−1 + εt, (3.6)

where εt is assumed to be identically and independently distributed with mean zero and variance σε^2.

There are a number of ways this model could be approximated. As there are no externalities or distortionary taxes in the model, the solution to the problem can be obtained from the following social planner's problem:

max E ∑_{t=0}^∞ β^t u(ct, 1 − ht),

subject to

ct + it ≤ zt kt^α ht^(1−α), t = 0, 1, 2, . . .

ct ≥ 0, kt ≥ 0, 0 ≤ ht ≤ 1, t = 0, 1, 2, . . . (3.7)

kt+1 = (1 − δ)kt + it, t = 0, 1, 2, . . .

log zt = γ log zt−1 + εt, where εt ∼ N(0, σε^2), t = 1, 2, 3, . . .

k0 > 0 given; z0 given.

The solutions to the above model cannot be calculated analytically. There are many ways to approximate the model in order to obtain solutions. In Hansen (1985), the method described in Hansen and Prescott (1995) is used. Once the solution to the approximated model has been calculated, it is possible to calculate decision rules for hours supplied, ht, and investment, it, as functions of the state variables of the model. The state variables of Hansen's model are capital, kt, the log of the productivity shock, log(zt), and the constant 1. The decision rules for ht and it are of the form:

ht = h1 + hk kt + hz log(zt) (3.8)

and

it = i1 + ik kt + iz log(zt), (3.9)

where the coefficients in (3.8) and (3.9) are all non-linear functions of the structural parameters of the model.

Given (3.8) and (3.9), it is possible to write all other variables of the model as functions of the state vector, st = (1, kt, log(zt))′. For example, for this model, output can be written as

yt = zt kt^α (h1 + hk kt + hz log(zt))^(1−α). (3.10)

Once all the variables of the model can be represented as functions of the state variables, as in (3.8), (3.9) and (3.10), it is possible to partially define the function Mx that is defined in (3.1). For example, if xt = (ht, it, yt)′ then the function Mx(st) would be given by

Mx(st) =

[ h1 + hk kt + hz log(zt)
  i1 + ik kt + iz log(zt)
  zt kt^α (h1 + hk kt + hz log(zt))^(1−α) ]. (3.11)
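As a sketch, the mapping (3.11) can be written as a small function. The coefficient values used below are hypothetical placeholders chosen purely for illustration; in the model they are non-linear functions of the structural parameters recovered from the approximate solution.

```python
import numpy as np

def Mx(s, h1, hk, hz, i1, ik, iz, alpha):
    """Map the state s_t = (1, k_t, log z_t)' into x_t = (h_t, i_t, y_t)' as in (3.11)."""
    _, k, logz = s
    h = h1 + hk * k + hz * logz                       # hours rule (3.8)
    i = i1 + ik * k + iz * logz                       # investment rule (3.9)
    y = np.exp(logz) * k ** alpha * h ** (1 - alpha)  # output (3.10)
    return np.array([h, i, y])

# Hypothetical coefficient values, for illustration only
x = Mx(np.array([1.0, 10.0, 0.0]),
       h1=0.3, hk=-0.005, hz=0.2,
       i1=0.1, ik=0.01, iz=0.5, alpha=0.36)
```

Note that only the first two components are linear in the state; output inherits the non-linearity of the production function.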

For this model, the non-stochastic state variables are at = (1, kt)′ while the stochastic state variable is bt = log(zt). It then follows from (3.4), (3.6), and (3.9) that the function Ms that is defined in (3.1) is given by

Ms(st) =

[ 1        0                0
  i1   (1 − δ) + ik         iz
  0        0                γ ] st (3.12)

and the innovation to the state vector, ut is

ut = (0, 0, εt)′. (3.13)
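Because (3.12) is linear, the state transition reduces to a matrix product. A minimal sketch, again with hypothetical parameter values used only for illustration:

```python
import numpy as np

def Ms_matrix(i1, ik, iz, delta, gamma):
    """Coefficient matrix of the linear transition (3.12): s_{t+1} = A s_t + u_{t+1}."""
    return np.array([[1.0, 0.0,                0.0],
                     [i1,  (1.0 - delta) + ik, iz],
                     [0.0, 0.0,                gamma]])

# Hypothetical parameter values, for illustration only
A = Ms_matrix(i1=0.1, ik=0.01, iz=0.5, delta=0.025, gamma=0.95)
s = np.array([1.0, 10.0, 0.0])      # state (1, k_t, log z_t)'
u = np.array([0.0, 0.0, 0.01])      # innovation (3.13) with eps_t = 0.01
s_next = A @ s + u
```

The first row keeps the constant equal to one, the second row combines the capital law of motion with the investment rule (3.9), and the third row is the AR(1) process (3.6) for the shock.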

Let Y^T = {yt}_{t=0}^T be the nx × T vector of observations on the variables xt for the model. To construct a likelihood for the model represented by (3.1), the values of the stochastic variables, bt, implied by Y^T need to be calculated. To calculate the values of bt, only nb variables of the model can be used, as there are only nb stochastic variables in the model. Let Y_b^T = {ybt}_{t=0}^T be the set of observations that will be used to calculate the values of the stochastic elements of the model. Here, ybt is an nb × 1 vector containing a subset of variables from yt. The calculation of the stochastic elements is recursive. Consider first period zero. Suppose that the value of a0 is known. To calculate the value of b0 implied by yb0, the appropriate relations that make up the function Mx(st) in (3.1) are used. Once b0 is known, the first part of (3.1) is used to calculate the value of a1. Then, given the value of a1 and yb1, Mx(st) is used to calculate the value of b1. This process is continued until the end of the sample.


For example, suppose that one wished to calculate the values of the stochastic components of Hansen's model using observations on total hours supplied, {h_t}_{t=0}^T. In this example, n_b is equal to one and n_a is equal to two. Given a value for the initial capital stock, k_0, and the observed value of h_0, use (3.8) to calculate b_0 = log(z_0). That is,

log(z̃_0) = (h_0 − h_1 − h_k k_0) / h_z. (3.14)

Then, for observations t = 1, . . . , T,

k_t = (1 − δ) k_{t−1} + i_{t−1}

log(z̃_t) = (h_t − h_1 − h_k k_t) / h_z (3.15)

where i_{t−1} is calculated using (3.9). Note that z̃_t represents the calculated value of z_t given the observations {h_t}_{t=0}^T.
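The recursion in (3.14) and (3.15) is simple to implement. The sketch below backs the shocks out of a simulated hours series and recovers them exactly; the coefficient values (and the function name `back_out_shocks`) are illustrative assumptions, not the model's calibrated solution.

```python
import numpy as np

def back_out_shocks(h, k0, h_coef, i_coef, delta):
    """Recover log(z_t) from observed hours via (3.14)-(3.15):
    h_t = h1 + hk*k_t + hz*log(z_t), with capital accumulating as
    k_t = (1-delta)*k_{t-1} + i_{t-1} and investment following the
    decision rule i_t = i1 + ik*k_t + iz*log(z_t)."""
    h1, hk, hz = h_coef
    i1, ik, iz = i_coef
    T = len(h)
    k = np.empty(T)
    logz = np.empty(T)
    k[0] = k0
    logz[0] = (h[0] - h1 - hk * k[0]) / hz          # eq. (3.14)
    for t in range(1, T):
        i_prev = i1 + ik * k[t-1] + iz * logz[t-1]  # investment rule (3.9)
        k[t] = (1 - delta) * k[t-1] + i_prev        # law of motion for capital
        logz[t] = (h[t] - h1 - hk * k[t]) / hz      # eq. (3.15)
    return k, logz

# Check on simulated data: generate hours from known shocks, then recover them.
rng = np.random.default_rng(0)
h_coef, i_coef, delta, k0 = (0.3, -0.01, 0.5), (0.05, 0.02, 0.1), 0.025, 10.0
T = 25
logz_true = np.empty(T); k_true = np.empty(T)
k_true[0], logz_true[0] = k0, 0.1
for t in range(1, T):
    inv = i_coef[0] + i_coef[1] * k_true[t-1] + i_coef[2] * logz_true[t-1]
    k_true[t] = (1 - delta) * k_true[t-1] + inv
    logz_true[t] = 0.95 * logz_true[t-1] + rng.normal(0.0, 0.01)
h_obs = h_coef[0] + h_coef[1] * k_true + h_coef[2] * logz_true
k_rec, logz_rec = back_out_shocks(h_obs, k0, h_coef, i_coef, delta)
```

Because the recursion inverts the same relations used to generate the data, the recovered capital stock and shock series match the simulated ones to floating-point precision.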

Once the stochastic elements of the model, B^T = {b_t}_{t=0}^T, have been calculated, it is possible to construct a likelihood function for the model. The object is to use the given distribution of the innovations to b_t to construct the likelihood function. The innovation to the state vector, u_t, can be decomposed into two components: the innovation to a_t, u_at, and the innovation to b_t, u_bt. In this example, u_at = 0 for all values of t. Let g_u(·) be the density function of the innovation u_bt and let g_z0(·) be the density function of the unconditional distribution of z_0. Using the first equation of (3.1), it is possible to calculate the values of the innovations to the stochastic elements that are implied by the data Y_b^T. Let P_b be a matrix that picks the last n_b elements of a vector. Then, for t = 1, 2, . . . , T,

u_bt = P_b [ s_t − M_s(s_{t−1}) ]. (3.16)

Let P_y be a matrix that picks the n_b elements from the vector y_t that will be used to calculate the likelihood function. In general there are many combinations of y_t that can be chosen. It follows from the first equation of the representation given in (3.1) that the current value of the state vector, s_t, is a function of a_0, b_0 and U^t = {u_v}_{v=1}^t. Then,

y_bt = P_y M_x(s_t(U^t, b_0, a_0)) (3.17)

so that the likelihood function for the model is

f(Y_b^T | a_0) = f(y_b0 | a_0) ∏_{t=1}^T f(y_bt | Y_b^{t−1}, a_0)

= g_{b0}(b_0) [ ∏_{t=1}^T g_u(u_t) ] | ( ∂(y_b0, . . . , y_bT) / ∂(b_0, u_b1, . . . , u_bT)′ )^{−1} | (3.18)

where

∂(y_b0, . . . , y_bT) / ∂(b_0, u_b1, . . . , u_bT)′ =

[ ∂P_y M(s_0)/∂b_0        0                     0                    . . .   0
  ∂P_y M(s_1)/∂b_0   ∂P_y M(s_1)/∂u_b1         0                    . . .   0
  ∂P_y M(s_2)/∂b_0   ∂P_y M(s_2)/∂u_b1   ∂P_y M(s_2)/∂u_b2          . . .   0
        ⋮                   ⋮                    ⋮                    ⋱      ⋮
  ∂P_y M(s_T)/∂b_0   ∂P_y M(s_T)/∂u_b1   ∂P_y M(s_T)/∂u_b2          . . .   ∂P_y M(s_T)/∂u_bT ]. (3.19)

Note that the calculated values z̃_t are functions of the observations {h_t}_{t=0}^T and the structural parameters of the model, θ.

As an example, consider the model defined above in (3.3) to (3.6). For this model, a_t = (1, k_t)′, b_t = log(z_t), and n_b is equal to one. The distribution of u_bt = ε_t is assumed to be Normal with zero


mean and variance σ_ε². Hence, for Hansen's model,

g_u(ε_t) = (2πσ_ε²)^{−1/2} exp{ −ε_t² / (2σ_ε²) } (3.20)

and

g_{z0}(z_0) = ( (1 − γ²) / (2πσ_ε²) )^{1/2} exp{ −(1 − γ²) z_0² / (2σ_ε²) }. (3.21)

In order to construct the likelihood function for Hansen's model, the determinant of the Jacobian of the transformation, which is given in (3.19), needs to be determined. It follows from (3.8) and (3.7) that

∂h_t / ∂ε_t = h_z

so that

| ( ∂(h_0, . . . , h_T) / ∂(z_0, ε_1, . . . , ε_T)′ )^{−1} | = |h_z|^{−(T+1)}. (3.22)

Therefore, the likelihood function of the model described in (3.7) is

f(Y^T | k_0) = ( (1 − γ²) / (2πσ_ε²) )^{1/2} (2πσ_ε²)^{−T/2} exp{ −(1 − γ²) z_0² / (2σ_ε²) − (1/(2σ_ε²)) ∑_{t=1}^T ε_t² } |h_z|^{−(T+1)} (3.23)

where z_0 and {ε_t}_{t=1}^T are found using (3.14) and (3.15).

The method described above depends on being able to calculate the values of {b_t}_{t=0}^T that are implied by the observed data and the solution to the model. If this cannot be done then the likelihood function cannot be constructed. Situations where this may arise include models with multiplicative shocks: the product of the shocks could be calculated, but the individual values of the shocks would be intractable.

Another possible case is one where, at the beginning of each period, simulations from the model are needed in order to obtain values of variables in the model. One example would be a model whose stochastic components are governed by a Markov chain with transition probabilities that depend on a function of some of the variables of the model. In some cases the observed data could be used to calculate the probabilities, but there might be a case where a variable is unobservable. There are potential solutions to these problems, but they may make the method infeasible by increasing the computing time needed to calculate the likelihood function. For example, it may be possible to simulate data to calculate transition probabilities. This, however, could increase the time it takes to calculate the likelihood function to such an extent that the Markov chain Monte Carlo methods become infeasible.

If it is possible to construct a likelihood function for a model in the RBC literature, then it is possible to use likelihood-based techniques to formally compare two or more models. The next section introduces the Bayesian model comparison techniques that will be used to compare models.

3.2 Bayesian Model Comparison

Once a likelihood function has been constructed for a set of models, likelihood-based methods of model comparison and evaluation become available; in particular, Bayesian model comparison. In using Bayesian methods to compare and evaluate the performance of a set of models, the uncertainty over which model is better is treated the same as all other uncertainty in the model, as noted by Sims (1996). Another benefit of using Bayesian methods to compare models is that the results are


conditional on the observed data only, and no assumptions with regard to sample size are needed. One of the criticisms noted in Hansen and Heckman (1996) was that models in the RBC literature might not be able to be calibrated with as much accuracy as previously thought. The Bayesian approach also handles this criticism by allowing for uncertainty over the structural parameters of the models.

The problem is to compare a finite set of models indexed by I given the observations {y_t}_{t=0}^T. Let θ_k be the structural parameters of model k ∈ I. For each model k ∈ I let p(y_t | Y^{t−1}, θ_k) be the density of y_t conditional on Y^{t−1} under model k. Note that Y^s = {y_t}_{t=0}^s for any s ≥ 0. Then the likelihood function for model k is

p(Y^T | θ_k) = ∏_{t=0}^T p(y_t | Y^{t−1}, θ_k). (3.24)

Let the prior for θ_k, defined on model k ∈ I, be p(θ_k). Then the posterior distribution for θ_k given Y^T and model k ∈ I is

p(θ_k | Y^T) ∝ p(θ_k) p(Y^T | θ_k) (3.25)

and the marginal likelihood of model k is defined to be

M_k = ∫_{Θ_k} p(θ_k) p(Y^T | θ_k) dθ_k. (3.26)

The Bayes factor between two models is then defined to be

B_ij = M_i / M_j (3.27)

and it is this object that is used to compare models. Model uncertainty can be incorporated into the comparison through the posterior odds ratio, which is defined to be

POR_ij = (p_i / p_j) B_ij (3.28)

where p_i is the prior probability assigned to model i ∈ I. In order to rule out scaling effects it is important that the prior and data densities used to calculate the posterior distribution be normalized so that they integrate to one.

Beliefs as to which model best fits the data can be incorporated into the comparison using (3.28). If one does not have any strong feelings towards any single model or group of models, then the prior weights given to all models will be the same; in this case the posterior odds ratio is just the Bayes factor. One way to evaluate a model is to ask how much prior probability must be given to it in order for it to have the higher posterior odds ratio: the more unrealistic the prior probability that must be assigned to a model, the more unrealistic the model. As mentioned above, the Bayesian approach to model comparison and evaluation treats model uncertainty in the same way as uncertainty in any other part of a model. The Bayes factor and posterior odds ratio are used to compare and evaluate the models at hand.
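In logs, (3.27) and (3.28) are one-liners. A minimal sketch (the function names are my own, and the log marginal likelihood values in the usage are illustrative):

```python
import math

def log_bayes_factor(log_Mi, log_Mj):
    """log B_ij = log M_i - log M_j (eq. 3.27); logs avoid underflow,
    since marginal likelihoods of long samples are astronomically small."""
    return log_Mi - log_Mj

def posterior_odds(log_Mi, log_Mj, p_i, p_j):
    """POR_ij = (p_i / p_j) B_ij (eq. 3.28)."""
    return (p_i / p_j) * math.exp(log_bayes_factor(log_Mi, log_Mj))
```

With equal prior weights p_i = p_j, the posterior odds ratio reduces to the Bayes factor, as the text notes.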

To gain insight into how these concepts can be used to compare and evaluate models, first consider the closely related concept of a predictive density. Suppose that we have data {y_t}_{t=0}^T and we wish to predict the values of y_{T+1}, . . . , y_{T+m}. The predictive density of y_{T+1}, . . . , y_{T+m} conditional on model k and data Y^T is

p_k(y_{T+1}, . . . , y_{T+m}) = ∫_{Θ_k} p_k(θ_k | Y^T) ∏_{t=T+1}^{T+m} p(y_t | Y^{t−1}, θ_k) dθ_k. (3.29)

The predictive density applies prior to observing the data; evaluated at the observed data, the same expression defines the analogous predictive likelihood function,


p^{T+m}_{k,T} = ∫_{Θ_k} p_k(θ_k | Y^T) ∏_{t=T+1}^{T+m} p(y_t | Y^{t−1}, θ_k) dθ_k. (3.30)

Note that p^T_{k,0} = M_{k,T}. It can be shown that

p^v_{k,u} = ∫_{Θ_k} p_k(θ_k | Y^u) ∏_{t=u+1}^v p(y_t | Y^{t−1}, θ_k) dθ_k

= ∫_{Θ_k} [ p_k(θ_k) ∏_{t=0}^u p(y_t | Y^{t−1}, θ_k) / ∫_{Θ_k} p_k(θ_k) ∏_{t=0}^u p(y_t | Y^{t−1}, θ_k) dθ_k ] ∏_{t=u+1}^v p(y_t | Y^{t−1}, θ_k) dθ_k

= ∫_{Θ_k} p_k(θ_k) ∏_{t=0}^v p(y_t | Y^{t−1}, θ_k) dθ_k / ∫_{Θ_k} p_k(θ_k) ∏_{t=0}^u p(y_t | Y^{t−1}, θ_k) dθ_k (3.31)

= M_{k,v} / M_{k,u}.

Thus the predictive likelihood function for observations u+1 through v is just the ratio of the marginal likelihoods for the samples of observations 0 through v and 0 through u respectively.

The predictive likelihood can then be decomposed using (3.31). Consider any sequence of numbers such that 0 ≤ u = s_0 < s_1 < . . . < s_q = v. Then it follows from (3.31) that

p^v_{k,u} = (M_{k,s_1} / M_{k,s_0}) · · · (M_{k,s_q} / M_{k,s_{q−1}}) = ∏_{τ=1}^q p^{s_τ}_{k,s_{τ−1}}. (3.32)

Note that for u = 0 and v = T, p^v_{k,u} = p^T_{k,0} = M_{k,T}. As the marginal likelihood can be decomposed into a product of ratios of predictive likelihoods, the marginal likelihood represents the out-of-sample prediction performance of the model. Thus, by using the Bayes factor, which is just the ratio of the respective marginal likelihoods of the two models, we are comparing models on their ability to predict out of sample.
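The identity (3.31) and the telescoping in (3.32) can be verified mechanically. In the sketch below, `logM[t]` stands for the log marginal likelihood of observations 0 through t; the numerical values are illustrative, not computed from any model:

```python
import numpy as np

def log_predictive(logM, u, v):
    """log p^v_{k,u} = log M_{k,v} - log M_{k,u}, eq. (3.31)."""
    return logM[v] - logM[u]

# Illustrative log marginal likelihoods of samples 0..t, t = 0, ..., 4.
logM = np.array([-1.2, -2.1, -3.5, -4.3, -5.4])

# One-step predictive log likelihoods: eq. (3.32) with s_i - s_{i-1} = 1.
one_step = [log_predictive(logM, t - 1, t) for t in range(1, len(logM))]
```

The sum of the one-step terms reproduces the full predictive log likelihood from 0 to T, which is exactly the telescoping in (3.32).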

Another application of the decomposition given in (3.32) is direct model diagnostics. Consider the complete decomposition in which u = 0, v = T and s_i − s_{i−1} = 1. A relatively low value of p^{s_i}_{k,s_{i−1}} would indicate that the observation indexed by s_i was surprising given observation s_{i−1} and model k. Thus, one could use this decomposition to evaluate the performance of models in predicting large movements in the data. For example, a large movement may be surprising to all models, but some may do better than others. The decomposition is also useful in gaining insight into the Bayes factor: using the decomposition of the marginal likelihood in (3.32), one can do the same for the Bayes factor. That is,

B^v_{ij,u} = p^v_{i,u} / p^v_{j,u} = ∏_{τ=1}^q ( p^{s_τ}_{i,s_{τ−1}} / p^{s_τ}_{j,s_{τ−1}} ) = ∏_{τ=1}^q B^{s_τ}_{ij,s_{τ−1}}. (3.33)

Observations or periods of observations that make large contributions to the overall Bayes factor in favor of model i over model j can be identified using (3.33). Unusually low values of the predictive density for an observation or for a group of observations would indicate that the model did not do a good job of predicting that particular observation or group of observations. By breaking the Bayes factor up into a product of predictive Bayes factors, as in (3.33), we are able to see which observations or groups of observations make the biggest contribution to the overall Bayes factor. There may be observations that are surprising to both models, but the decomposition allows us to see which model handles the surprising event best. How a model handles a rare but potentially important event could be extremely useful in its evaluation. An example of a surprising observation might


be an observation that is more than three sample standard deviations away from the preceding observation. A plot of the cumulative log Bayes factor in favor of one model versus another is a way of seeing which observations had the most influence. Jumps in the cumulative log Bayes factor plot would indicate that a large addition to the overall log Bayes factor occurred at that particular observation, meaning that one of the models significantly out-performed the other there. On the other hand, there may be no large jumps in the cumulative log Bayes factor, which would mean that either one model out-performs the other consistently or both models perform the same in the given sample.
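A cumulative log Bayes factor plot of this kind is straightforward to build from per-observation log predictive densities. The sketch below also flags jumps that exceed a chosen threshold; the function names and all numerical values are illustrative:

```python
import numpy as np

def cumulative_log_bf(logp_i, logp_j):
    """Cumulative log Bayes factor in favor of model i over model j,
    built from per-observation log predictive densities (cf. eq. 3.33)."""
    return np.cumsum(np.asarray(logp_i) - np.asarray(logp_j))

def find_jumps(cum_lbf, threshold):
    """Indices where one model out-performs the other by more than
    `threshold` at a single observation (a 'jump' in the plot)."""
    steps = np.diff(np.concatenate([[0.0], cum_lbf]))
    return np.flatnonzero(np.abs(steps) > threshold)

# Illustrative per-observation log predictive densities for two models:
lp_i = [-1.0, -1.1, -0.9, -5.0, -1.0]   # model i badly surprised at t = 3
lp_j = [-1.1, -1.0, -1.0, -1.2, -1.1]
cum = cumulative_log_bf(lp_i, lp_j)
```

Plotting `cum` against t gives exactly the diagnostic described in the text; here the single large step at t = 3 is the observation that model j "handles best".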

In order to use the Bayes factor to compare models, the marginal likelihood defined in (3.26) needs to be calculated for each model. Geweke (1998) discusses various ways to calculate the marginal likelihood. In most cases, the marginal likelihood cannot be calculated analytically. In order to calculate the marginal likelihood, it is essential to be able to draw from the posterior distribution of the model. There are many ways to do this, and the appropriate method depends on the problem. If the posterior distribution is known and is easily drawn from, then it is possible to make independent draws from the posterior. If independent draws are not possible then various Markov chain Monte Carlo (MCMC) methods are available. These are described in detail in Geweke (1998). One such MCMC method is the Metropolis-Hastings algorithm, described in detail in Chib and Greenberg (1995) and Tierney (1994).

The Metropolis-Hastings algorithm is defined by a transition probability density function q(x, y). Given a value of x, q(x, y) generates a value y from the target set of possible values. At the mth step, given the value θ^(m), the algorithm generates a candidate value θ from q(x, y), and accepts this value for θ^(m+1) with probability

α(θ^(m), θ) = min{ [ p(θ | Y^T) q(θ, θ^(m)) ] / [ p(θ^(m) | Y^T) q(θ^(m), θ) ], 1 }. (3.34)

If the candidate value is not accepted, the algorithm sets θ^(m+1) = θ^(m). Chib and Greenberg (1995) show that the above Markov chain has the posterior distribution p(θ | Y^T) as its invariant distribution.

In order to use this algorithm, a distribution for q(x, ·) must be defined. One such choice for q(x, ·) is to choose a density f that is defined on the support of p(θ | Y^T) and set q(x, y) = f(y − x). Practically, defining q this way means that the candidate y is determined by drawing z from f and adding it to x; that is, y = x + z. The choice of f is dependent on the problem. Tierney (1994) suggests a multivariate normal, a multivariate t, or a uniform distribution defined on a disc as potential choices for f. If the choice of f is symmetric around the origin, then the probability of acceptance collapses to

α(θ^(m), θ) = min{ p(θ | Y^T) / p(θ^(m) | Y^T), 1 }. (3.35)

Hence the random walk Metropolis-Hastings algorithm is, given an initial value θ^(0):

• for m = 1, . . . , M, generate z from f

• form θ = θ^(m) + z

• let θ^(m+1) = θ with probability α(θ^(m), θ), and θ^(m+1) = θ^(m) otherwise

• return {θ^(0), . . . , θ^(M)}.
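A minimal sketch of the loop above, with a symmetric Normal choice of f so that the acceptance probability reduces to (3.35). For illustration the target is a standard Normal log density rather than an actual model posterior p(θ | Y^T):

```python
import numpy as np

def rw_metropolis(log_post, theta0, scale, M, seed=0):
    """Random walk Metropolis-Hastings with a symmetric Normal proposal,
    so acceptance reduces to min{p(cand)/p(current), 1}, eq. (3.35)."""
    rng = np.random.default_rng(seed)
    theta = float(theta0)
    lp = log_post(theta)
    draws = [theta]
    for _ in range(M):
        cand = theta + rng.normal(0.0, scale)      # candidate = theta^(m) + z
        lp_cand = log_post(cand)
        if np.log(rng.uniform()) < lp_cand - lp:   # accept w.p. min{ratio, 1}
            theta, lp = cand, lp_cand
        draws.append(theta)                        # else keep theta^(m)
    return np.array(draws)

# Target: standard Normal, log p(theta) = -theta^2/2 up to a constant.
chain = rw_metropolis(lambda t: -0.5 * t * t, 0.0, 1.0, 20000, seed=1)
kept = chain[5000:]                                # discard burn-in draws
```

In practice the ratio in (3.35) is computed as a difference of log posterior densities, as here, to avoid underflow.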


Tierney (1994) shows that if f is chosen correctly the Metropolis-Hastings algorithm converges to its invariant distribution. If suitably defined, the invariant distribution of the Metropolis-Hastings algorithm is p(θ_k | Y^T), and so after a number of burn-in iterations the algorithm draws from p(θ_k | Y^T). Once draws from the posterior distribution of a model are obtained, it is possible to calculate the marginal likelihood of the model using a variant of the method of Gelfand and Dey (1994) suggested by Geweke (1998).

The method is as follows. Suppose we wish to approximate

M_T = ∫_Θ p(θ) p(Y^T | θ) dθ,

where p(·) is the prior for the model in question. Let f(·) be any p.d.f. that has its support contained in Θ. Define the function g(θ) as

g(θ) = f(θ) / ( p(θ) p(Y^T | θ) ).

Then the conditional expectation of g(.) under the posterior distribution is

E[g(θ) | Y^T] = ∫_Θ g(θ) p(θ | Y^T) dθ = ∫_Θ [ f(θ) / (p(θ) p(Y^T | θ)) ] p(θ | Y^T) dθ

= ∫_Θ [ f(θ) / (p(θ) p(Y^T | θ)) ] · [ p(θ) p(Y^T | θ) / ∫_Θ p(θ) p(Y^T | θ) dθ ] dθ (3.36)

= ∫_Θ f(θ) dθ / ∫_Θ p(θ) p(Y^T | θ) dθ = M_T^{−1}.

This conditional moment can be approximated by

Ê[g(θ) | Y^T] = M^{−1} ∑_{m=1}^M g(θ^(m)), (3.37)

where {θ^(m)}_{m=1}^M is the set of draws from the posterior distribution produced by the Markov chain. A practical method for implementing this estimator can be found in Geweke (1998).
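A sketch of the estimator (3.36)-(3.37), checked on a conjugate Normal example where the marginal likelihood is known in closed form. Taking f equal to the exact posterior makes g(θ) constant across draws, so the estimate is exact here; in practice f is chosen as in Geweke (1998), and all names below are my own:

```python
import numpy as np

def gelfand_dey(draws, log_f, log_prior, log_lik):
    """Estimate 1/M_T = E[ f(theta) / (p(theta) p(Y|theta)) | Y ] by a
    posterior average, eqs. (3.36)-(3.37); computed in logs for stability."""
    g = np.array([log_f(t) - log_prior(t) - log_lik(t) for t in draws])
    a = g.max()
    return np.exp(a) * np.mean(np.exp(g - a))   # stable mean of exp(g)

def log_norm(x, m, v):
    """Log of the N(m, v) density at x."""
    return -0.5 * np.log(2 * np.pi * v) - (x - m) ** 2 / (2 * v)

# Conjugate check: y ~ N(theta, 1), theta ~ N(0, 1), so marginally y ~ N(0, 2)
# and the posterior is theta | y ~ N(y/2, 1/2).
y = 1.3
rng = np.random.default_rng(2)
draws = rng.normal(y / 2, np.sqrt(0.5), size=500)       # exact posterior draws
inv_M = gelfand_dey(draws,
                    log_f=lambda t: log_norm(t, y / 2, 0.5),
                    log_prior=lambda t: log_norm(t, 0.0, 1.0),
                    log_lik=lambda t: log_norm(y, t, 1.0))
M_true = np.exp(log_norm(y, 0.0, 2.0))                  # analytic marginal likelihood
```

The max-subtraction trick matters in real applications: the individual g(θ^(m)) values involve the full-sample likelihood and would overflow exp() if averaged naively.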

Consider now the case of comparing a set of models from the RBC literature. In constructing the likelihood function for a typical model from this literature, a distinction was made between the structural parameters of the model and the parameters needed to initialize the model. Let θ_{s,k} be the vector of structural parameters of the model and let θ_{i,k} be the vector of parameters needed to initialize the model. Let p(Y^T | θ_{s,k}, θ_{i,k}) be the properly normalized data density for model k ∈ I and let p_k(θ_{i,k}, θ_{s,k}) be the properly normalized prior density placed jointly over the structural and initial parameters. Then the posterior density for θ_k = (θ_{i,k}, θ_{s,k})′ is

p(θ_k | Y^T) ∝ p_k(θ_{i,k}, θ_{s,k}) p(Y^T | θ_{s,k}, θ_{i,k}), (3.38)

and given the posterior in (3.38), the marginal likelihood for model k ∈ I is

M_k = ∫_{Θ_k} p_k(θ_{i,k}, θ_{s,k}) p(Y^T | θ_{s,k}, θ_{i,k}) dθ_k. (3.39)

In the RBC literature, it is common to calibrate the structural parameters θ_{s,k} to specific values. By calibrating the values of θ_{s,k}, the prior p_{s,k}(θ_{s,k}) is made degenerate. Therefore the prior for θ_k is

p(θ_k) = p_{i,k}(θ_{i,k}, θ^c_{s,k}) (3.40)


where θ^c_{s,k} is the calibrated value of θ_{s,k}.

The difference between the calibrated case and the non-calibrated case is that the prior for the structural parameters of the model is degenerate. All of the methods described above still apply to the case of calibration. It is easy to see that this method of model comparison is very flexible and allows for the use of prior knowledge over the structural parameters of the model in a consistent way. In particular, prior uncertainty as to the correct calibrated values is allowed for in a consistent way. Also, the method is able to compare and evaluate models over sub-samples as well as across the whole sample.

The next section contains an application of the technique described above for two separate cases. The first case is where the structural parameters of the model are calibrated to specific values, as is the common practice in the RBC literature. The second case is where prior uncertainty over the values of the structural parameters is allowed for. From the first case it will be clear that there is a need to allow for uncertainty over the structural parameters of the model. It will also be shown that allowing for prior uncertainty can lead to different conclusions as to which model is preferred.

4 An Application

Recently, renewed attention has been placed on models with stochastic components that are unrelated to the "fundamental" components of the model. In particular, Benhabib and Farmer (1994) show that by perturbing a standard RBC model it is possible to obtain a model that can generate cycles with shocks that are unrelated to any "fundamental" components of the model. Farmer and Guo (1994) call this a "sunspot" model and compare such a model with a standard model with real shocks. Farmer and Guo (1994) motivate their work by noting that the source of fluctuations to an economy has important policy implications. They argue that if a model with shocks to its "fundamental" components best describes the data, then there is no role for policy, as these allocations would be Pareto optimal. However, if a model with shocks that are not related to the fundamentals of the model is preferred, then there is a role for policy to reduce the fluctuations and increase welfare.

The two models that are compared are variants of the one-sector stochastic growth model. The first model contains real shocks that are related to the fundamental components of the model, while the second was shown by Benhabib and Farmer (1994) to have the potential for shocks that are not related to any fundamental component of the model. Farmer and Guo (1994) first compare the simulated data of the two models with the observed data. From this comparison, Farmer and Guo conclude that the "sunspot" model cannot be rejected. Farmer and Guo then compare the two models with respect to the dynamics of the data by comparing the impulse response function implied by each model with the impulse response function implied by the data. Their conclusion is that the "sunspot" model does a better job of replicating the dynamics of observed data. Therefore, Farmer and Guo conclude that a "sunspot" model cannot be rejected as a potential tool to analyze policy or to answer questions that are posed in the literature.

The approach of this application is to compare these two models using the Bayesian model comparison methods described in Section 3.2. The likelihood functions of the models are constructed using the method of Section 3.1 and are used to compare the two models. The models are compared at the values calibrated by Farmer and Guo (1994) and also for the case where prior uncertainty is placed over the structural parameters of each model. Section 4.1 describes the models that are used, while Sections 4.2 and 4.3 describe how a likelihood function is obtained for the models used in this paper. The results are reported in Section 4.4.


4.1 The Models

The "fundamental" model is the one-sector stochastic growth model with a constant returns to scale aggregate production technology. The "sunspot" model is a stochastic one-sector growth model with an increasing returns to scale aggregate production technology; this second model was shown by Benhabib and Farmer (1994) to have the potential for a "sunspot" equilibrium. The two models are described below.

Consider first the increasing returns to scale economy that is the basis of the "sunspot" model. A more detailed discussion of this model can be found in Benhabib and Farmer (1994) and in Farmer (1993). The economy is as follows. Let C_t denote consumption and L_t denote labor supply. There is a very large number of agents indexed by i ∈ [0, 1]. Each agent acts as both a household and a producer. When acting as a household, the agent maximizes the discounted sum of utility with a time discount factor of ρ, where 0 < ρ < 1. The problem faced by the consumer is to maximize

U_i = E_0 ∑_{t=0}^∞ ρ^t [ log C_t − A L_t^{1−γ} / (1 − γ) ], 0 < ρ < 1, (4.1)

subject to

K_{t+1} + C_t ≤ w_t L_t + (1 − δ + r_t) K_t + Π_t, K_0 given, t = 0, 1, 2, . . . (4.2)

where Π_t are the possibly non-zero profits the consumer receives from the ownership of firms. Note that as γ → 0 the utility function converges to the utility function used in Hansen (1985). Here, the parameter A reflects the disutility from supplying labor.

The production side of the economy is a monopolistically competitive environment in which households each produce a unique intermediate good using an increasing returns to scale production technology

Y_it = Z_t K_it^α L_it^β (4.3)

where α + β > 1 and Y_it represents the output of the ith household. The intermediate goods produced by the households are combined to make one final good using the technology

Y_t = ( ∫_0^1 Y_it^λ di )^{1/λ}, 0 < λ < 1, (4.4)

where λ represents the degree of monopoly power each intermediate producer has. The aggregate technology for this economy is

Y_t = Z_t K_t^α L_t^β. (4.5)

Aggregate output is either consumed or invested so that the resource constraint for the economy is

Ct + It ≤ Yt (4.6)

where I_t denotes investment. Capital is assumed to depreciate at a rate δ, where 0 < δ < 1. Therefore, the law of motion for the capital stock is

K_{t+1} = (1 − δ) K_t + I_t. (4.7)

Benhabib and Farmer (1994) show that labor's and capital's shares of national income for this economy are equal to the constants λβ and λα respectively. Therefore the factor shares are

a = λα and b = λβ. (4.8)

In this framework, it is possible to have a + b < 1 and α + β > 1, which implies the possibility of positive profits.


The economy described above has an increasing returns to scale technology (α + β > 1). The degree of monopoly power for the intermediate producers is given by the parameter λ. If λ = 1, then there is no monopoly power and the factor shares defined in (4.8) are exactly equal to their production elasticities. In this case the model collapses to the standard one-sector stochastic growth model.

The two economies that are used in this comparison are variants of the economy defined above. The first economy, which will be known as the "fundamental" model, is the model described above with λ = 1 and α + β = 1. This is essentially the indivisible labor model of Hansen (1985). The second, which will be called the "sunspot" model, is the model described above with λ restricted to the interval (0, 1) and α + β > 1. Section 4.2 discusses how the second model can exhibit a sunspot equilibrium.

The stochastic component of the above economy is Z_t, the technology shock parameter that evolves according to the equation

Z_t = Z_{t−1}^θ η_t (4.9)

where η_t is assumed to be an identically and independently distributed random variable drawn from the distribution N(1, σ_η²). Note that in the RBC literature it is common to assume that the error η_t has a Normal distribution even though this assumption implies that there is a non-zero probability that Z_t is negative. However, for the calibrated values used by Farmer and Guo (1994), 0 is 142.8 standard deviations away from 1, so the probability of a negative value for Z_t is negligible. It then follows from the first order conditions to the problem set out above and from the laws of motion for the state variables that an equilibrium of the economy described above is characterized by the following set of equations:

Y_t = Z_t K_t^α L_t^β, (4.10)

A C_t L_t^{−γ} = b Y_t / L_t, (4.11)

1/C_t = E_t[ (ρ/C_{t+1}) ( a Y_{t+1}/K_{t+1} + 1 − δ ) ], (4.12)

K_{t+1} = (1 − δ) K_t + Y_t − C_t, (4.13)

Z_t = Z_{t−1}^θ η_t, Z_0 given, (4.14)

lim_{t→∞} ρ^t K_t / C_t = 0. (4.15)

The aggregate technology for the economy is described in (4.10), while (4.13) and (4.14) describe the laws of motion for capital, K_t, and for the technology shock, Z_t, respectively.

The next section introduces the methods by which solutions to (4.10) through (4.15) can be calculated and shows how those solutions are used to construct a likelihood function for this model. It also discusses how the model with increasing returns to scale can exhibit an equilibrium in which shocks that do not affect the fundamental components of the model cause fluctuations.

4.2 Solving the models and constructing a likelihood function

In order to use the Bayesian model comparison literature to compare the two models introduced in Section 4.1, a likelihood function for each model is needed. As is the case for most models found in the real business cycle literature, the likelihood function for the models described in the previous section is intractable. The approach is to approximate a solution to the model and use the solution to construct a likelihood function as described in Section 3.1. The solution to the model takes the form of equations that relate all of the variables to the vector of state variables of the model and equations that describe


how the state variables evolve over time. The structure of this section is as follows. The first part deals with how a solution to the model is approximated. Next, there is a discussion of how the model with increasing returns can exhibit a "sunspot" equilibrium, and then the construction of a likelihood function for each model is introduced.

The method of approximation used in this section is the one used in Farmer and Guo (1994), who show that the general model can be represented by the following set of difference equations:

K_{t+1} = B Z_t^m K_t^g C_t^d + (1 − δ) K_t − C_t

1/C_t = E_t[ D Z_{t+1}^m K_{t+1}^{g−1} C_{t+1}^{d−1} + τ/C_{t+1} ] (4.1)

Z_t = Z_{t−1}^θ η_t

where B = (A/b)^d, m = 1 − d, d = βφ, g = αm, τ = ρ(1 − δ), and D = Bαρ. The state variables of the model are {K_t, C_t, Z_t}. Let {K*, C*, 1} be the unique deterministic steady state values of {K_t, C_t, Z_t}, where K* and C* solve the following equations:

K* = B(K*)^g (C*)^d + (1 − δ)K* − C*

1 = D(K*)^{g−1} (C*)^d + τ. (4.2)

For any variable X_t, define

X̃_t ≡ (X_t − X*_t) / X*_t ≅ log(X_t / X*_t).

Also, define

e_{t+1} = [ E_t[K̃_{t+1}] − K̃_{t+1} ; E_t[C̃_{t+1}] − C̃_{t+1} ; E_t[Z̃_{t+1}] − Z̃_{t+1} ].

Using the above definitions, the first order Taylor series approximation to (4.1) can be represented as the following matrix system:

[ K̃_t ; C̃_t ; Z̃_t ] = J [ K̃_{t+1} ; C̃_{t+1} ; Z̃_{t+1} ] + R [ η_{t+1} ; e_{t+1} ], (4.3)

where the (3 × 3) matrix J contains partial derivatives of (4.1) and R is a (3 × 4) matrix of coefficients. See Appendix A.1 for a derivation of (4.3). The system of equations in (4.3) contains the equations that determine how the state variables of the model evolve over time. The elements that make up the matrices J and R are all functions of the structural parameters of the model.

The non-state variables of the model are output, Y_t, investment, I_t = Y_t − C_t, productivity, P_t = Y_t/L_t, and the supply of labor, L_t. These variables can also be written as functions of the state variables {K̃_t, C̃_t, Z̃_t}. For example, labor supply can be written as a function of the state variables. That is,

L̃_t = l_k K̃_t + l_c C̃_t + l_z Z̃_t, (4.4)

where l_k = −αφ, l_c = φ, and l_z = −φ. For a derivation of (4.4), see Appendix A.2. Similar equations can be obtained for output, Y_t, investment, I_t = Y_t − C_t, and productivity, P_t = Y_t/L_t. Therefore it is possible to write

[ L̃_t ; Ĩ_t ; P̃_t ; Ỹ_t ] = M [ K̃_t ; C̃_t ; Z̃_t ]. (4.5)


The set of equations that relate the variables of the model to the state variables is given in (4.5). In order to use the method of constructing the likelihood function described in Section 3.1, there needs to be a set of equations that describe the evolution of the state variables over time. Farmer (1993) discusses a method for solving the system given in (4.3). The general setup is as follows. The model described above, for both the "fundamental" model and the "sunspot" model, has three state variables. However, K_t and Z_t both have known initial conditions, so, according to Farmer (1993), the model has only one free state variable. Farmer (1993) defines problems for which the matrix J of (4.3) has exactly the same number of eigenvalues of modulus less than one as it has free state variables as "regular" problems. For a "regular" model, the equilibrium is a saddle point equilibrium; this is the most common case for economic models. For either of the models described above, if J has exactly one eigenvalue with modulus less than one, the model is classed as "regular". If the number of eigenvalues of J with modulus less than one is smaller than the number of free state variables, the model is called an "irregular" model.
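The regular/irregular classification is an eigenvalue count on J. A sketch, using illustrative diagonal matrices in place of the true (3 × 3) Jacobians, which are dense functions of the structural parameters:

```python
import numpy as np

def classify(J, n_free):
    """Farmer (1993): 'regular' if J has exactly as many eigenvalues of
    modulus less than one as free state variables, 'irregular' if fewer."""
    n_inside = int(np.sum(np.abs(np.linalg.eigvals(J)) < 1.0))
    if n_inside == n_free:
        return "regular"
    # More roots inside than free states is a third case, not discussed here.
    return "irregular" if n_inside < n_free else "excess stable roots"

# Both models considered here have one free state variable.
J_reg = np.diag([0.9, 1.2, 1.5])    # exactly one root inside the unit circle
J_irr = np.diag([1.1, 1.2, 1.5])    # all roots outside the unit circle
```

With one free state variable, `J_reg` is classified as "regular" and `J_irr` as "irregular", mirroring the "fundamental" and "sunspot" cases described in the text.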

The “fundamental” model is the model described above with λ equal to one, so that aggregate technology has constant returns to scale. Farmer (1993) shows that this model is an example of a regular problem and so has a unique “regular” equilibrium. Therefore, the matrix J of (4.3) for the “fundamental” model has exactly one eigenvalue with modulus less than one. Farmer and Guo (1994) show that in this case it is possible to express one of the state variables as a linear function of the others. In particular, Ct can be expressed as a linear combination of Kt and Zt. That is,

Ct = ckKt + czZt (4.6)

where the coefficients ck and cz are, again, non-linear functions of the structural parameters of the model. By substituting (4.6) into (4.3), the evolution of the state vector for the “fundamental” model is given by the following system:

Kt = a11Kt−1 + a12Zt−1

Zt = θZt−1 + ηt. (4.7)

Substituting (4.6) into (4.5), all of the variables of the “fundamental” model can be represented as functions of the two state variables Kt and Zt. This, together with (4.7), gives the equations that are used to construct the likelihood function.
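To make the state dynamics concrete, the system (4.7) can be simulated directly. The sketch below is illustrative only: the coefficient values a11, a12, θ, and ση passed in are placeholders, not the calibrated values used later in the paper.

```python
import numpy as np

def simulate_fundamental_states(K0, Z0, a11, a12, theta, sig_eta, T, seed=0):
    """Simulate the state system (4.7):
        K_t = a11 K_{t-1} + a12 Z_{t-1},
        Z_t = theta Z_{t-1} + eta_t,  with eta_t ~ N(0, sig_eta^2)."""
    rng = np.random.default_rng(seed)
    K = np.empty(T + 1)
    Z = np.empty(T + 1)
    K[0], Z[0] = K0, Z0
    for t in range(1, T + 1):
        K[t] = a11 * K[t - 1] + a12 * Z[t - 1]
        Z[t] = theta * Z[t - 1] + rng.normal(0.0, sig_eta)
    return K, Z
```

Setting sig_eta to zero makes the recursion deterministic, which is convenient for checking the coefficients.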

The “sunspot” model that is used in Farmer and Guo (1994) is the variant of the basic model described in (4.1) to (4.7) with 0 < λ < 1 and α + β > 1. It is the model with an increasing-returns-to-scale aggregate technology that arises through agents producing intermediate goods with some market power indexed by λ. This model also has a representation of the form given in (4.3). Benhabib and Farmer (1994) show that for certain values of the parameters, all of the eigenvalues of J lie outside of the unit circle. In this case the model, using the notation of Farmer (1993), is said to be an “irregular” model. Farmer and Guo (1994) further restrict the “sunspot” economy by letting Zt equal its unconditional mean of one for all periods. That is, Zt = 1 for all t. Therefore, the “sunspot” model has no real shocks, only “sunspot” shocks. As the deviation of Zt from its mean is zero for all periods, the system of equations that represents the evolution of the state variables through time becomes

(Kt, Ct)′ = J (Kt+1, Ct+1)′ + R et+1 (4.8)

where J and R here denote the appropriate partitions of the J and R of (4.3). If the eigenvalues of this J all have modulus greater than one then the system

(Kt+1, Ct+1)′ = J−1 (Kt, Ct)′ + (0, Vt+1)′ (4.9)


characterizes a Markov process that is stable and satisfies the equilibrium conditions given in (4.8). Note that the error term in (4.9) is given by

J−1 R et+1

where

et+1 = (Et[Kt+1] − Kt+1, Et[Ct+1] − Ct+1)′.

In this model, the value of Kt+1 is known in period t, so that Et[Kt+1] − Kt+1 is always equal to zero. Hence,

J−1 R et+1 = (0, c2Vt+1)′

where Vt is the random variable with mean zero and variance σ2V that represents Et[Ct+1] − Ct+1.

So, for the one-sector stochastic growth model with increasing returns, it is possible that the model can exhibit an equilibrium that has the form of (4.9), where the random variable Vt is not related to any fundamental parameters of the model. One can think of this as a model that is driven by “sunspots”. For the purposes of the comparison, only parameter values that lead to a matrix J whose eigenvalues all have modulus greater than one were used. In that sense, the “fundamental” model was compared to the “sunspot” model. The evolution of the state variables in the “sunspot” model is described by the following system:

Kt = b11Kt−1 + b12Ct−1,

Ct = b21Kt−1 + b22Ct−1 + c2Vt. (4.10)

Once solutions to the two models have been obtained, it is now possible to construct a likelihood function for each model. Consider, first, the “fundamental” model. The “fundamental” model contains only one stochastic state variable, Zt. Before the likelihood function can be constructed, assumptions on the innovation to the stochastic variable need to be made. For the “fundamental” model, it is assumed that the innovation is Gaussian, so that

ηt ∼ N(0, σ2η).

As there is only one stochastic state variable, the likelihood function for the “fundamental” model can only be constructed using observations on one variable at a time. Let Xt be the variable that is to be used, where X could represent any of the variables that make up the model. For example, X could be output or hours supplied. Let XT = {Xt}Tt=0. As described in Section 3.1, the likelihood function is constructed iteratively. All variables in the model can be written as a function of the state variables of the model, so that in order to calculate the implied values of the shocks to the model, it is necessary to calculate the values of the state variables for each period. This is a straightforward problem in periods two and higher once the initial values of the state variables are known. For the case of the “fundamental” model, the initial value K0 is needed. Once this value is known, it is possible to construct the values of all other state variables from the equations defined in (4.7) and the equation that relates the current value of the observable, Xt, to the state variables. Let the equation that relates Xt to the state variables be

Xt = xkKt + xcCt + xzZt (4.11)

Then combining (4.6) with (4.11) gives,

Xt = (xk + xcck)Kt + (xccz + xz)Zt. (4.12)


where xc, xk, ck and cz are all functions of the structural parameters of the model.

Given the value of Kt and the value of Xt, (4.12) can be used to calculate the value of Zt. Once the value of Zt is known, all of the state variables are known for period t. Therefore it is possible to calculate, from (4.4) or similar, all the variables in the model. In particular, it is possible to calculate the value of Kt+1. Once Kt+1 is known, it is possible to calculate the value of ηt+1 using the observation Xt+1. This process continues until period T, the end of the sample. Once {Zt}Tt=0 is known, it is possible to calculate the values of ηt for periods one through T using (4.7). The above description can be summarized in the following algorithm:

• given K0, for t = 0, . . . , T:

• Zt = (Xt − (xk + xcck)Kt) / (xccz + xz)

• Kt+1 = kkKt + kzZt.

Once {Zt}Tt=0 is known it is possible to calculate {ηt}Tt=1 using the equation

ηt = Zt − θZt−1.
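The recursion above can be sketched in a few lines of code. All coefficient names below (xk, xc, xz, ck, cz, kk, kz) stand for the non-linear functions of the structural parameters described in the text; the values passed in are hypothetical placeholders.

```python
import numpy as np

def recover_shocks(X, K0, xk, xc, xz, ck, cz, kk, kz, theta):
    """Given observations X_0,...,X_T and initial capital K_0:
    invert (4.12) for Z_t each period, update K_{t+1}, and then
    back out eta_t = Z_t - theta * Z_{t-1}."""
    T = len(X) - 1
    K = np.empty(T + 2)
    Z = np.empty(T + 1)
    K[0] = K0
    for t in range(T + 1):
        # invert X_t = (xk + xc*ck) K_t + (xc*cz + xz) Z_t
        Z[t] = (X[t] - (xk + xc * ck) * K[t]) / (xc * cz + xz)
        K[t + 1] = kk * K[t] + kz * Z[t]
    eta = Z[1:] - theta * Z[:-1]
    return Z, eta
```

Applied to data generated by the model itself, this recursion recovers the shock sequence exactly, which is a useful internal check.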

Then {ηt}Tt=1, together with Z0, are used to calculate the likelihood function for the “fundamental” model. The likelihood function is

p(XT |θf , K0) = p(X0|θf , K0) ∏_{t=1}^{T} p(Xt|θf , Xt−1, K0)

             = g0(Z0) ∏_{t=1}^{T} g(ηt) · |( ∂(X0, . . . , XT )/∂(Z0, η1, . . . , ηT )′ )−1| (4.13)

where

g0(Z0) = ( (1 − θ2)/(2πσ2η) )1/2 exp{ −((1 − θ2)/(2σ2η)) (Z0 − 1)2 }

and

g(ηt) = (2πσ2η)−1/2 exp{ −η2t/(2σ2η) }.

It follows from (3.19) that the Jacobian matrix for the transformation between the observed data and the stochastic component is lower triangular, so that the determinant of this Jacobian will be the inverse of the product

(∂X0/∂Z0) ∏_{t=1}^{T} (∂Xt/∂ηt)

where

∂Xt/∂ηt = (xccz + xz) for t = 1, . . . , T

and

∂X0/∂Z0 = (xccz + xz).

Therefore the determinant of the Jacobian of the transformation is

|xccz + xz|−(T+1).

Hence the likelihood function for the “fundamental” model is

p(XT |θf , K0) = ( (1 − θ2)/(2πσ2η) )1/2 (2πσ2η)−T/2 exp{ −((1 − θ2)/(2σ2η))(Z0 − 1)2 − (1/(2σ2η)) ∑_{t=1}^{T} η2t } |xccz + xz|−(T+1) (4.14)
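As a sketch, (4.14) can be evaluated in logs, which is how such likelihoods are usually computed in practice to avoid numerical underflow. Here jac_coef stands for the term xccz + xz, and the inputs Z0 and {ηt} are assumed to come from the recursion described above; the names are illustrative.

```python
import numpy as np

def log_lik_fundamental(Z0, eta, theta, sig2_eta, jac_coef):
    """Log of (4.14): stationary Gaussian term for Z_0 (mean one),
    i.i.d. Gaussian terms for eta_1,...,eta_T, and the Jacobian
    factor |x_c c_z + x_z|^{-(T+1)}."""
    T = len(eta)
    ll = 0.5 * np.log((1.0 - theta ** 2) / (2.0 * np.pi * sig2_eta))
    ll -= 0.5 * T * np.log(2.0 * np.pi * sig2_eta)
    ll -= (1.0 - theta ** 2) / (2.0 * sig2_eta) * (Z0 - 1.0) ** 2
    ll -= np.sum(np.asarray(eta) ** 2) / (2.0 * sig2_eta)
    ll -= (T + 1) * np.log(abs(jac_coef))
    return ll
```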


The process to construct the likelihood function for the “sunspot” model is the same as that described above for the “fundamental” model. For the “sunspot” model, the vector of state variables is st = (Kt, Ct)′. The innovation to the state vector for the “sunspot” model is

ut = (0, c2Vt)′

where Vt is identically and independently distributed with mean zero and variance σ2V . For this example, it is assumed that

Vt ∼ N(0, σ2V )

so that

gv(Vt) = (2πσ2V )−1/2 exp{ −V2t/(2σ2V ) }.

Again, suppose that there are observations on XT = {Xt}Tt=0. Then (4.11) and (4.10) can be used to calculate the values of Vt that are implied by the observations XT . The process is summarized in the following algorithm:

• given K−1 and C−1, for periods t = 0, . . . , T:

• Kt = b11Kt−1 + b12Ct−1

• Ct = (Xt − xkKt)/xc

• Vt = (1/c2){Ct − b21Kt−1 − b22Ct−1}.
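The algorithm above admits a sketch analogous to the one for the “fundamental” model; the coefficients b11, b12, b21, b22, xk, xc, and c2 are again placeholders for the reduced-form values implied by the calibration.

```python
import numpy as np

def recover_sunspot_shocks(X, K_init, C_init, b11, b12, b21, b22, xk, xc, c2):
    """Given K_{-1}, C_{-1} and observations X_0,...,X_T: propagate K_t,
    invert X_t = xk K_t + xc C_t for C_t, and back out
    V_t = (C_t - b21 K_{t-1} - b22 C_{t-1}) / c2."""
    T = len(X) - 1
    V = np.empty(T + 1)
    K_prev, C_prev = K_init, C_init
    for t in range(T + 1):
        K_t = b11 * K_prev + b12 * C_prev
        C_t = (X[t] - xk * K_t) / xc
        V[t] = (C_t - b21 * K_prev - b22 * C_prev) / c2
        K_prev, C_prev = K_t, C_t
    return V
```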

Using the above algorithm, it is possible to construct the values {Vt}Tt=0 implied by the observed data, {Xt}Tt=0. Thus, it follows that the likelihood function for the “sunspot” model is

p(XT |θs, K−1, C−1) = ∏_{t=0}^{T} p(Xt|θs, Xt−1, K−1, C−1)

                   = ∏_{t=0}^{T} gv(Vt) · |( ∂(X0, . . . , XT )/∂(V0, V1, . . . , VT )′ )−1|. (4.15)

It follows from (3.19) that the Jacobian in (4.15) is a lower triangular matrix, which implies that the determinant of the Jacobian is the inverse of the product

∏_{t=0}^{T} ∂Xt/∂Vt.

It follows from (4.11) that

∂Xt/∂Vt = xcc2

so that

|( ∂(X0, . . . , XT )/∂(V0, V1, . . . , VT )′ )−1| = |xcc2|−(T+1).

Therefore, the likelihood function for the “sunspot” model is

p(XT |θs, K−1, C−1) = (2πσ2V )−(T+1)/2 exp{ −(1/(2σ2V )) ∑_{t=0}^{T} V2t } |xcc2|−(T+1). (4.16)
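A log version of (4.16), under the same labeling assumptions as before (jac_coef stands for the product xcc2):

```python
import numpy as np

def log_lik_sunspot(V, sig2_V, jac_coef):
    """Log of (4.16): i.i.d. Gaussian terms for V_0,...,V_T plus the
    Jacobian factor |x_c c_2|^{-(T+1)}; jac_coef = x_c * c_2."""
    n = len(V)  # n = T + 1 observations
    return (-0.5 * n * np.log(2.0 * np.pi * sig2_V)
            - np.sum(np.asarray(V) ** 2) / (2.0 * sig2_V)
            - n * np.log(abs(jac_coef)))
```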


Now that the likelihood function for each model has been constructed, likelihood-based methods can be applied. In particular, the Bayesian model comparison method introduced in Section 3.2 can now be used. The results from this comparison for the two models described above can be found in Section 4.4 below. As each model has only one stochastic component, the models are compared across a number of data sets. A likelihood function is calculated for each data set. Section 4.3 describes the raw data that was used and also discusses how the data was transformed before it was used.

4.3 Data

The two models that were described in Section 4.1 were compared using five separate data sets. The data sets that were used were data on total hours supplied (hours), total consumption (consumption), total investment (investment), productivity, and output. All of the data sets that were used consisted of deseasonalized quarterly data. Consumption, investment and output were obtained from the National Income and Product Accounts, and hours supplied was constructed using data obtained from the Bureau of Labor Statistics LABSTAT1 database.

Two series were used in the construction of the total-hours series: average hours supplied2 and number employed3 according to the Household Labor Survey. Total hours supplied was thus defined as the number of people employed multiplied by the average hours supplied. As defined in the Household Labor Survey, persons are defined as employed if, during the reference week, they (1) did any work at all as paid employees or worked in their own business, or (2) were temporarily absent from jobs because of vacation, illness, bad weather, maternity leave, labor dispute, job training or personal reasons. People who work more than one job are counted only once. This definition of the number of people employed is used because it closely mirrors the construction of the average-hours series. Once obtained, the total-hours series was deseasonalized using seasonal dummies in the obvious way. One aspect of the Current Population Survey is that for the years 1959, 1964, 1970, 1981, 1987, and 1992, Labor Day fell in the survey week. For those years the average hours supplied for September was artificially low, as the reference week only contained four days. To control for this effect, extra dummies were included when the data was deseasonalized. Those extra dummies took the value of one for an observation that fell during one of the Labor Day weeks, and zero otherwise.

The data that makes up the consumption series includes all consumption expenditure on non-durable goods and services, while the investment series is made up of total gross private investment as defined by the National Income and Product Accounts. Output is defined to be real Gross National Product and productivity is defined to be output divided by total labor hours. The sample used in this comparison ranged from the first quarter of 1955 (1955:1) until the last quarter of 1996 (1996:4) for total hours supplied. For all other data sets the data used was from the third quarter of 1958 (1958:3) until the last quarter of 1996 (1996:4).

Before the data could be used in the comparison of two models from the RBC literature, it needed to be transformed to a form that better matches the equations used to calculate the likelihood function. The equations used to calculate the likelihood function use data in the form of deviations from trend of the log of the variable. If Xt is a variable of the model, then

Xt = log(Xt/X∗) = log(Xt) − log(X∗)

is used in the equations that make up the solutions to the model.

1 http://stats.bls.gov:80/datahome.htm
2 The series ID for average hours supplied is lfu1231040000000
3 The series ID for number employed is lfs11104010000


The data that are observed grow over time. The models described above, however, abstract from growth and assume that the variables fluctuate around a steady state. It is the practice in the RBC literature to handle this problem by transforming the data using the Hodrick-Prescott (1997) filter. This filter acts to remove a trend from the data by solving the following problem. Let {xt}Tt=1 be the log of the raw data and let {τt}Tt=1 be the trend for that logged series. The Hodrick-Prescott (1997) filter chooses {τt}Tt=1 so as to minimize

(1/T) ∑_{t=1}^{T} (xt − τt)2 + (φ/T) ∑_{t=2}^{T−1} [(τt+1 − τt) − (τt − τt−1)]2

where φ is a smoothing parameter that is set equal to 1600 for quarterly data. Then dt = xt − τt is used as a proxy for Xt. In essence, log(X∗) is replaced by τt. Figures 1 to 5 contain plots of the resulting deviations from trend for the data that was used in the comparison.
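The minimization above has a closed-form solution, since the objective is quadratic in {τt}. A minimal sketch using dense linear algebra (adequate for quarterly samples of this length; the 1/T factors cancel in the first-order conditions):

```python
import numpy as np

def hp_filter(x, phi=1600.0):
    """Hodrick-Prescott trend tau minimizing
    sum (x_t - tau_t)^2 + phi * sum ((tau_{t+1}-tau_t) - (tau_t-tau_{t-1}))^2.
    The first-order conditions give (I + phi * D'D) tau = x, where D is the
    (T-2) x T second-difference matrix."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    D = np.zeros((T - 2, T))
    for i in range(T - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    tau = np.linalg.solve(np.eye(T) + phi * (D.T @ D), x)
    return tau, x - tau  # trend tau_t and deviations d_t = x_t - tau_t
```

A quick sanity check: for an exactly linear series the second differences of the trend can be made zero, so the filter returns the series itself as the trend and zero deviations.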

4.4 Results

The two models were then compared under two separate cases. The first case is the case where the structural parameters of the models are calibrated. This was the comparison that was undertaken by Farmer and Guo (1994). The second case, which was not undertaken by Farmer and Guo (1994), is where there is prior uncertainty over the structural parameters of the model. Please note that all results were calculated using software from the Bayesian Analysis Computation and Communication (BACC) project (Geweke and Chib 1998).

4.5 Calibration

In this case, the two models described in Section 4.1 are compared using the Bayesian model comparison techniques described in Section 3.2. In this section, the structural parameters of the model will be fixed at the calibrated values given by Farmer and Guo (1994). This is the exact comparison that was undertaken in Farmer and Guo (1994). In their paper, Farmer and Guo (1994) use informal methods such as comparing second moments and comparing the impulse response functions implied by the data generated by the models with the impulse response function implied by the observed data. This section aims to perform a formal comparison of the two models by constructing a likelihood function for the two models and forming a Bayes factor in favor of one model over the other. The values for the structural parameters that were used in this study were the same values used by Farmer and Guo (1994). These can be found in Table 1.

                a     b     α     β     δ      ρ     γ   λ     θ     ση or σV
“fundamental”   0.36  0.64  0.36  0.64  0.025  0.99  0   1     0.95  0.007
“sunspot”       0.23  0.70  0.40  1.21  0.025  0.99  0   0.58  NA    0.00217

Table 1: Calibrated values of structural parameters

Bayes factors were constructed using the methods described in Section 3.2. The “fundamental” model described in Section 4.1 contains only one stochastic element, which is the technology shock, Zt. As a result the likelihood function for the “fundamental” model will be constructed using one data set at a time. Thus the results that follow will be reported for each data set separately. Let p(XT |θs,f , θi,f ) be the properly normalized data density for the “fundamental” model. Here XT = {Xt}Tt=1 is the data set that is being used, where Xt is the tth observation used in the construction of the likelihood function.


Given this data density, the posterior distribution for θf = (θs,f , θi,f )′ is

p(θf |XT ) ∝ p(θf )p(XT |θf ) (4.1)

where p(θf ) = p(θs,f |θi,f )p(θi,f ) is the properly normalized prior distribution for θf . In this case, the structural parameters are fixed at their calibrated values, θcs,f . This implies that the prior for the structural parameters is

p(θs,f |θi,f ) = I{θcs,f}(θs,f )

where I{·} is an indicator function which takes the value of one if θs,f = θcs,f and zero otherwise. In the case of the “fundamental” model, the set of initial parameters is made up of only the initial level of capital, K0. In the notation of Section 3.1, θs,f = (α, β, a, b, δ, ρ, γ, λ, θ, σ2η)′ and θi,f = K0.

The prior for the initial condition was constructed as follows: The initial condition for the “fundamental” model is the deviation from trend of the initial capital stock. The prior distribution was assumed to be Gaussian with mean zero and variance σ2. To calculate the prior variance, consider the Taylor’s series approximation to the capital accumulation equation given by

Kt+1 = (1 − δ)Kt + δIt. (4.2)

Using (4.2), the unconditional variance of capital is approximately

var(Kt) ≈ (δ/(2 − δ)) var(It). (4.3)

Using this equation, the approximate variance for capital is 0.000032. The prior variance for K0 was therefore set to be 0.000064, twice the calculated variance of the capital series. This makes the prior distribution for the initial capital stock relatively diffuse compared to the approximate distribution of Kt.
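The approximation (4.3) can be checked by simulation. The sketch below treats It as i.i.d. noise; that assumption is made here purely to verify the algebra of (4.2)-(4.3), not to reproduce the paper's numbers (in the model, investment is serially correlated).

```python
import numpy as np

def capital_variance_ratio(delta, T=400_000, seed=0):
    """Simulate K_{t+1} = (1-delta) K_t + delta I_t with i.i.d. I_t
    (an illustrative assumption) and return var(K)/var(I), which (4.3)
    predicts to be approximately delta / (2 - delta)."""
    rng = np.random.default_rng(seed)
    I = rng.normal(0.0, 1.0, T)
    K = np.empty(T + 1)
    K[0] = 0.0
    for t in range(T):
        K[t + 1] = (1.0 - delta) * K[t] + delta * I[t]
    burn = T // 10  # discard the transient from K_0 = 0
    return np.var(K[burn:]) / np.var(I)
```

Under i.i.d. It the stationary variance is delta^2 var(I) / (1 - (1-delta)^2), which simplifies to exactly (delta/(2-delta)) var(I).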

The posterior distribution for θf given XT is therefore

p(θf |XT ) ∝ p(θi,f )p(XT |θcs,f , θi,f ).

To make draws from the posterior described above, a random-walk Metropolis-Hastings algorithm was used. The algorithm is as follows:

• given θ(0)i,f

• for m = 1, . . . , M

• draw x ∼ f(·)

• let y = θ(m−1)i,f + x

• let θ(m)i,f = y with probability min( p(y, θcs,f |XT ) / p(θ(m−1)i,f , θcs,f |XT ), 1 ), and θ(m)i,f = θ(m−1)i,f otherwise

• return {θ(0)i,f , . . . , θ(M)i,f }
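The algorithm above is the standard random-walk Metropolis-Hastings sampler. A generic sketch, with log_post standing in for the log of the posterior kernel p(·, θcs|XT ) (the names are illustrative):

```python
import numpy as np

def random_walk_mh(log_post, theta0, sigma_rw, M, seed=0):
    """Random-walk Metropolis-Hastings: propose y = theta + x with
    x ~ N(0, sigma_rw^2), accept with probability min(p(y)/p(theta), 1),
    computed in logs for numerical stability."""
    rng = np.random.default_rng(seed)
    draws = np.empty(M + 1)
    draws[0] = theta0
    accepted = 0
    for m in range(1, M + 1):
        y = draws[m - 1] + rng.normal(0.0, sigma_rw)
        if np.log(rng.uniform()) < log_post(y) - log_post(draws[m - 1]):
            draws[m] = y
            accepted += 1
        else:
            draws[m] = draws[m - 1]
    return draws, accepted / M
```

Tuning sigma_rw to hit a moderate acceptance rate corresponds to the tuning of σ2rw described in the text.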

The source density, f(·), for the random walk step was chosen to be the Normal density with mean 0 and variance σ2rw. The variance of the source density was chosen to tune the algorithm. Once the draws {θ(0)i,f , . . . , θ(M)i,f } are obtained, they were used to calculate the marginal likelihood for the “fundamental” model as described in Section 3.2.

The structural parameter vector for the “sunspot” model is θs,s = (α, β, a, b, δ, ρ, γ, λ, σ2V )′ and the initial condition parameter is θi,s = (K0, C0)′. The structural parameters were fixed at the values given


in Table 1. The prior for θs = (θs,s, θi,s)′ is defined as for the “fundamental” model except that the prior for the initial parameters is defined to be the product of two independent Normal distributions. The prior for the initial value of capital is the same as for the “fundamental” model, while the prior for the initial value of consumption is defined to have mean zero and variance equal to 0.00135. The variance for consumption was constructed using the same approach that was used to construct the variance for the initial value of capital. The approximation to

Ct + It = Yt

yields an approximate variance for Ct of 0.000675. The prior variance for C0 was therefore set equal to twice the calculated variance for Ct. Table 2 contains information on the priors for the initial conditions for the “sunspot” model. The marginal likelihood was calculated in the same way as described for the “fundamental” model.

Model           parameter  prior mean  prior variance
“fundamental”   k0         0           0.000064
“sunspot”       k0         0           0.000064
                c0         0           0.00135

Table 2: Prior specification for initial conditions

The following tables contain the results for the two models. In all cases, a total of 50,000 draws from the posterior distribution of each model were made using the Metropolis-Hastings algorithm described above. The value of σ2rw was chosen to tune each algorithm separately. There were a number of diagnostics that were used to determine whether the algorithm was tuned appropriately. One of those is the acceptance probability of the algorithm. Tierney (1994) discusses what the appropriate acceptance proportion from the Metropolis-Hastings algorithm should be. The value of σ2rw was chosen so that the acceptance probability of the algorithm was between 0.3 and 0.5. Another diagnostic is to compare the numerical standard error of each moment calculated with the corresponding posterior standard deviation. These are reported by MOMENT4. In all cases, the numerical standard errors of the moments calculated were all less than one tenth of the calculated posterior standard deviations. For the case of the calibrated models, the draws from the posterior distribution,

p(θi,k|XT , θcs,k),

were used to calculate the marginal likelihoods for each model. These marginal likelihoods are reported in Table 3 below. The subsequent log Bayes factors in favor of the “fundamental” model over the “sunspot” model are reported in Table 4.

                Consumption  Hours      Investment  Productivity  Output
“fundamental”   -264.7742    504.8544   244.0432    -827.0816     458.9279
                (0.0035)     (0.0129)   (0.0143)    (0.0103)      (0.0114)
“sunspot”       -1917.2548   505.1721   242.0583    -3095.4583    464.8219
                (0.0144)     (0.0150)   (0.0122)    (0.0160)      (0.0104)

Table 3: Log marginal likelihoods: Calibrated case

4 http://www.econ.umn.edu/~bacc


Consumption  Hours      Investment  Productivity  Output
1652.4806    -0.3177    1.9849      2268.3767     -5.8940
(0.0148)     (0.0198)   (0.0188)    (0.0190)      (0.0154)

Table 4: Log Bayes factor in favor of “fundamental” model over “sunspot” model: Calibrated case

The first thing that should be noted from the reported marginal likelihoods and Bayes factors is the extreme difference in the log Bayes factors across the data sets that are used to construct the likelihood functions. When consumption and productivity are used to calculate the likelihood function, the log Bayes factor is overwhelmingly in favor of the “fundamental” model. This contradicts Farmer and Guo’s finding that the “sunspot” model does no worse than the “fundamental” model.

However, the variation in the log Bayes factors suggests that there is a problem with the calibration of the models for those data sets. The parameters that would have the biggest direct effect on the scale of the marginal likelihoods are the variance parameters of the shock terms for each model. Suppose that these values were smaller than they should be. Then the values of the shock terms that are used to construct the likelihood functions would be mostly in the tail of their assumed distribution. Such tail effects could explain the drastic differences between the two models.

In the literature, it is common to calibrate the models so that the variance of the artificial time series for output is approximately equal to the observed variance of output. It is not common, however, to recalibrate this value for all of the data sets that are used. For those variables that have significantly different variance from that of output, there could be a problem in using the variance of output to calibrate the variance of the shock term. To test this hypothesis, the variance of the shock term for each model was calibrated for each data set that was used.

Let Xt be any variable that is used to construct the likelihood function. Consider first the case of the “fundamental” model. Each variable in that model can be written as a function of the state variables. Therefore,

Xt = xkKt + xzZt

where

Zt ∼ N(θZt−1, σ2η).

For each data set, XT , the new value of σ2η was chosen so that the variance of the artificial data set was equal to the variance of the observed data set. Therefore, the value of σ2η was calibrated using the following relation,

σ2η = var(X)(1 − θ2)/x2z. (4.4)

For the case of the “sunspot” model, the variable X can be written as

Xt = xkKt + xcCt.

This, along with (4.10), implies that the value of σ2V should be calibrated to

σ2V = var(X)/(c2xc)2. (4.5)

The new calibrated values that were used are reported in Table 5.

The log marginal likelihoods for the “fundamental” model and the “sunspot” model were calculated in exactly the same way as in the previous case. The models were solved at their calibrated values, and the posterior distribution of the remaining free parameters was formed using the constructed likelihood


Variable      σ2η          σ2V
consumption   1.0727×10−4  2.4334×10−4
hours         1.0902×10−5  1.0677×10−5
investment    1.4921×10−5  6.5254×10−6
productivity  1.3723×10−4  3.1128×10−4
real GDP      8.1364×10−6  9.4757×10−6

Table 5: Calibrated values for the variances of the shock processes

function. The marginal likelihood was then calculated using the method of Gelfand and Dey (1994), as before. The results can be found in Tables 6 and 7.

                Consumption  Hours      Investment  Productivity  Output
“fundamental”   218.8067     252.6044   115.8952    94.3669       294.2506
                (0.0035)     (0.0141)   (0.0103)    (0.0119)      (0.0121)
“sunspot”       453.0565     497.9558   222.6462    428.4416      444.0045
                (0.0121)     (0.0126)   (0.0167)    (0.0121)      (0.0118)

Table 6: Log marginal likelihoods: New calibrated case

Consumption  Hours       Investment  Productivity  Output
-234.2498    -245.3514   -106.7510   -334.0742     -149.7539
(0.0126)     (0.0186)    (0.0196)    (0.0170)      (0.0169)

Table 7: Log Bayes factor in favor of “fundamental” model over “sunspot” model: New calibrated case

The differences between the results presented in Tables 6 and 7 and those presented in Tables 3 and 4 are quite stark. Under the new calibration, the “sunspot” model is heavily favored using all data sets. The difference between the new calibration and the old calibration is greatest for the variables consumption and productivity. While the calibration for these variables appears to be better than the old calibration, the new calibration is not favored for all variables. The log Bayes factors in favor of the new calibration can be found in Table 8.

The above results do suggest, however, that the reason for the differences across the log Bayes factors reported in Table 3 is scaling. The above results also suggest that it is important to calibrate the variance of the shock process correctly when one is trying to compare the performance of two models. It is the practice to calibrate the variance of the shock process using information on output only. This practice, as was the case above, could lead to incorrect inferences as to model validity for variables other than output.

The results in Table 8 imply that calibrating the variance term of the shock process for the two models using the equations of the approximated solution does not do as good a job as the original calibration when observations on hours, investment, and output are used to construct the likelihood function. It is not clear, therefore, what the “best” method to calibrate the values of σ2η and σ2V is. Allowing for uncertainty over their values would seem to be the best solution to this problem. Allowing for prior uncertainty over structural parameters is discussed in Section 4.6.

Table 9 contains the log Bayes factors in favor of the “fundamental” model over the “sunspot” model for the most favored calibration. The results in Table 9 are not consistent across the data sets.


Model        Consumption  Hours      Investment  Productivity  Output
fundamental  483.5809     -252.2500  -128.1480   921.4485      -164.6773
             (0.0049)     (0.0191)   (0.0176)    (0.0157)      (0.0166)
sunspot      2370.3113    -7.2163    -19.4121    3523.8999     -20.8174
             (0.0188)     (0.0196)   (0.0207)    (0.0200)      (0.0157)

Table 8: Log Bayes factor in favor of the new calibration over the old calibration

Consumption  Hours     Investment  Productivity  Output
-234.2498    -0.3177   1.9849      -334.0742     -5.8940
(0.0126)     (0.0198)  (0.0188)    (0.0170)      (0.0154)

Table 9: Log Bayes factor in favor of “fundamental” model over “sunspot” model: Most favored calibration

The “sunspot” model is favored for four of the five data sets. There is also inconsistency as to the degree to which the “sunspot” model is favored. The evidence would suggest that the claim of Farmer and Guo (1994) that the “sunspot” model is “as good” as the “fundamental” model is supported. Using consumption or productivity, the evidence suggests that the “sunspot” model is superior to the “fundamental” model.

In order to understand why there is a disparity across the different variables that were used to construct the likelihood functions for the two models, the log Bayes factor was decomposed across the entire sample. It was hoped that this would add insight as to the differences between the two models. The approach was as follows: For each observation t = 2, . . . , T − 1, the Metropolis-Hastings algorithm was used to draw from

pk(θi,k|Xt, θs,k)

and the value of

pk(xt+1|Xt, θk)

was returned as a function of interest. The value of pk(xt+1|Xt, θk) is the predictive likelihood of the observation xt+1 given the information Xt. From the output of the Metropolis-Hastings algorithm, posterior moments of pk(xt+1|Xt, θk) were formed for each period t = 2, . . . , T − 1 using the routine MOMENT from the BACC software.5 Then the cumulative sums of the logs of the predictive likelihoods were calculated. The results of this can be found in Figures 1 to 5.

Figures 1 to 5 contain three graphs each. The first graph is the graph of the deviations from trend as reported by the Hodrick-Prescott filter. The second graph displays the proportional deviation of period t’s observation from period (t − 1)’s observation in relation to the range of the observations. That is, the value of the proportional deviation for period t is equal to

(xt − xt−1) / ( max_{t=1,...,T}(xt) − min_{t=1,...,T}(xt) ).

The third graph is the cumulative log Bayes factor in favor of the “fundamental” model over the “sunspot” model. That is, the third graph is constructed from the cumulative sums of the posterior means of the one-period-ahead predictive likelihoods.

It is clear from Figures 1 to 5 that neither of the two models is in ascendancy over the whole sample. In fact, it appears that the “sunspot” model does better during periods where the data is more

5 http://www.econ.umn.edu/~bacc


Figure 1: Cumulative log Bayes factor: Consumption

Figure 2: Cumulative log Bayes factor: Hours

volatile. Thus, one reason for the disparity of results noted above could be the nature of the data rather than the superiority of any model. This example gives a good illustration of the method’s ability to compare models across sub-sections of the data as well as across the whole data set.

During the comparison of the "sunspot" model with the "fundamental" model it was necessary to re-calibrate the variance of the shock process of the two models for each data set used in the comparison. The re-calibration set the variance of the respective shock processes so that the models would generate artificial data with the same variance as the observed data. While it was clear that there was a problem with the old calibration, it was not clear that the new calibration was the best method either. One way to address this would be to calibrate the models in a number of ways to see how sensitive the results are to the calibration used. A better and easier solution is to calibrate the models while allowing for uncertainty about the exact values of the structural parameters. The next section deals with the problem of comparing models when there is uncertainty over the correct values of the structural parameters.

4.6 Prior uncertainty over the structural parameters

The problem of comparing two or more models with prior uncertainty over the parameters of a model is straightforward. Once a properly normalized data density and a properly normalized prior are specified, all of the Bayesian model comparison techniques used in Section 4.5 carry through. In the RBC literature, little attention has been paid to calibrating the structural parameters with some degree of uncertainty. DeJong, Ingram, and Whiteman (1996, 1997) do allow for the structural parameters to be calibrated with uncertainty; in their papers, they show that allowing for uncertainty lets the model fit the data better.

Hansen and Heckman (1996) argue that the calibration of dynamic macroeconomic models using cross-sectional data may not lead to "good" calibrations. Calibration, as it is known in the RBC literature, specifies the values of the structural parameters of a model using historical studies and known theory. In calibrating the parameters to specific values, one places a degenerate prior over the parameters. Hansen and Heckman (1996) argue that models in the RBC literature are highly aggregated and dynamic, while some of the studies used to calibrate them are micro-studies that are cross-sectional in nature, so it may not be possible to calibrate with the accuracy suggested in the literature. Also, as was found in the previous section, the calibration of the variance of the shock terms for the two models being compared left some uncertainty as to the correct method for calibrating the model.

It is clear that it would be prudent to allow for some prior uncertainty in calibrating the variance of the shock process. Before making inferences from a model, it is usual to use the data to calculate the values of the structural parameters that are most likely. This has not been the case in the RBC literature; instead, the structural parameters have been fixed using prior knowledge without any room for error. If the structural parameters of a model are calibrated with some uncertainty, then it is possible to use prior information as well as information from the data to ascertain the values of the structural parameters that are most likely.

Figure 3: Cumulative log Bayes factor: Investment

Figure 4: Cumulative log Bayes factor: Productivity

In this section, prior uncertainty is allowed over the structural parameters of the two

models described in Section 4.1. For the "fundamental" model, the structural parameters, θs,f, are the following: b, labor's share of income; δ, the depreciation rate; ρ, the time discount factor; θ, the AR(1) parameter on the productivity shock process; A, the preference parameter; and σ²η, the variance of the innovation process to the productivity shock. All other parameters of the model are either fixed or are functions of the above parameters. For example, a, capital's share of income, is defined to be equal to 1 − b, while α and β are equal to a/λ and b/λ respectively. The labor supply elasticity, γ, is set equal to zero for all cases. This is not the only way of defining the free parameters that make up the model and hence the likelihood function; there is some flexibility as to which of the parameters a, b, α, β and λ to make free. However, for this comparison, the vector of structural parameters for the "fundamental" model is

θs,f = (b, δ, ρ, θ, σ²η, A)′.

The structural parameters for the “sunspot” model are

θs,s = (b, δ, ρ, σ²V, A, λ)′.

There are a number of restrictions placed on the structural parameters in the description of the models. In the "fundamental" model, the parameters b, δ, ρ, and θ are all constrained to lie in the unit interval, (0, 1). The parameters A and σ²η are both constrained to be greater than zero. For the "sunspot" model, the same restrictions apply to the overlapping parameters except for b. In the "sunspot" model, it was necessary to incorporate monopolistic competition into the production side of the model, which means that firms can make above-normal profits. Farmer and Guo (1994) calibrate the proportion of national income due to profits as 7%, and the same is done in this paper. This implies that the value of b, labor's share of income, lies in the interval (0, 0.93). The parameter λ is also constrained to the interval (0, 1). Therefore, the domain of θs,f is

Θs,f = (0, 1)⁴ × (0, ∞)²

and the domain of the structural parameters for the "sunspot" model is

Θs,s = (0, 0.93) × (0, 1)³ × (0, ∞)².

It is convenient to transform the domains of the structural parameters in each model to R^k, where k is the dimension of the domain. This transformation is convenient in implementing the random-walk component of the Metropolis-Hastings algorithm that was used to make draws from the posterior distribution for each model. By transforming the domain of the parameter vectors, the random-walk Metropolis-Hastings algorithm became more efficient in that it was guaranteed to always remain in the domain of the posterior distribution. The priors for the structural parameters were also defined over the transformed space.


Figure 5: Cumulative log Bayes factor: GDP

The parameters were transformed in the following way. Each parameter x that is constrained to an interval of the form (c, d) was transformed using

x′ = log((x − c)/(d − x)), (4.6)

while the parameters constrained to the positive real numbers were transformed via the log transformation.
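The interval transformation (4.6) and its inverse can be sketched as follows; the inverse is obtained by solving (4.6) for x:

```python
import math

def to_real_line(x, c, d):
    """Map a parameter constrained to (c, d) to the real line, eq. (4.6)."""
    return math.log((x - c) / (d - x))

def from_real_line(z, c, d):
    """Inverse of the transformation above: x = (c + d*e^z) / (1 + e^z)."""
    e = math.exp(z)
    return (c + d * e) / (1.0 + e)

# Positive parameters instead use a plain log transform: z = log(x).
z = to_real_line(0.64, 0.0, 1.0)  # labor share b at its calibrated value
```

For b = 0.64 on (0, 1) this gives z = log(0.64/0.36) ≈ 0.5754, the transformed value used in constructing the prior.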

Tables 10 and 11 report 95% credible intervals for the priors used in the study. In practice, the priors were defined independently over the transformed parameter space, with means at the transformed values of the calibrated parameters. For example, in the "fundamental" model, b is calibrated to be 0.64; the transformed value, using formula (4.6), is log(0.64/0.36) = 0.5754. The prior for the transformed value of b was then defined to be Normal with mean 0.5754 and with a prior variance chosen to produce the 95% prior coverage intervals reported in Tables 10 and 11. That is, a 95% credible set was calculated in the transformed space and then mapped into the credible sets reported in Tables 10 and 11. The same method was used for the other parameters.
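As a check on this construction, a Normal 95% interval in the transformed space can be mapped back through the inverse of (4.6). The standard deviation 0.3228 below is an implied value backed out from Table 10's interval for b, not a number stated in the paper:

```python
import math

def sigmoid(z):
    """Inverse of (4.6) for the unit interval (c, d) = (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Normal prior on transformed b: mean log(0.64/0.36), sd ~0.3228 (implied,
# chosen so the mapped 95% interval matches Table 10).
mean, sd = math.log(0.64 / 0.36), 0.3228
lo, hi = sigmoid(mean - 1.96 * sd), sigmoid(mean + 1.96 * sd)
```

This reproduces the coverage interval [0.4857, 0.7699] reported for b in Table 10.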

Parameter   Prior mean    95% prior coverage interval
b           0.64          [0.4857, 0.7699]
δ           0.025         [0.0128, 0.0482]
ρ           0.99          [0.9616, 0.9975]
θ           0.95          [0.8603, 0.9832]
A           2.86          [1.9171, 4.2667]
σ²η         4.9 × 10⁻⁵    [3.13 × 10⁻⁵, 1.26 × 10⁻⁴]

Table 10: Prior for "fundamental" model

The prior variances were chosen to reflect a reasonably large degree of uncertainty over the values of the parameters. However, they were chosen so that the 95% prior coverage intervals lay in a region of the parameter space that is not unreasonable with respect to the literature. For example, the 95% prior coverage interval for b, labor's share of income, was set equal to [0.4857, 0.7699]. Authors have suggested a variety of values for b. DeJong, Ingram, and Whiteman (1996) calibrate b to be 0.54 and use a prior for b of [0.48, 0.68]. Hansen (1985) and Kydland and Prescott (1982) calibrate b to be 0.64. However, depending on how capital is defined and measured, other authors have suggested higher values for b: for example, Prescott (1986) suggests a value of 0.75, while Christiano (1988) suggests 0.66. All of these values lie in the 95% prior coverage interval for b.

The prior for δ, the depreciation rate for capital, covers a range of about 5% per annum to about 19% per annum, with a prior mean of 10% per annum; most studies calibrate δ to be 0.025. The prior for ρ, the time discount factor, has a 95% prior coverage interval of [0.9616, 0.9975], implying an economy with an annual real interest rate ranging from about 1% to about 17%. The 95% prior coverage interval for the AR(1) parameter θ is [0.8603, 0.9832], which allows a wide range of persistence in the productivity shock process. Hansen (1985) calibrates θ to be 0.95, while DeJong, Ingram, and Whiteman (1996) use a value of 0.90. A recent paper by Hansen (1997) finds evidence to suggest that the value of θ is lower than 0.95 and closer to 0.90. Finally, the 95% prior coverage interval for σ²η is [3.13 × 10⁻⁵, 1.26 × 10⁻⁴]. This implies that the variation in real GDP in this model economy ranges from 2.2% per annum to 4.5% per annum.

The priors for the "sunspot" model are given in Table 11. In the "sunspot" model there is one extra free parameter: λ, the monopoly power parameter, whose value was calibrated to be 0.58. Farmer and Guo (1994) used a study by Domowitz, Hubbard, and Peterson (1988) to calibrate the value of λ; they report a range of values that λ can take, and the 95% prior coverage interval shown in Table 11 reflects this. One thing to note in the "sunspot" model is that monopoly profits are present. Farmer and Guo (1994) calibrate the level of monopoly profits to be equal to 7% of national income, which implies that labor's and capital's income shares in the "sunspot" model sum to 0.93. Finally, for the purposes of this study, the elasticity γ is set equal to zero for both models and monopoly profits are always set equal to 7% of national income in the "sunspot" model. The same priors were used for all data sets.

Parameter   Prior mean    95% prior coverage interval
b           0.70          [0.5535, 0.8145]
δ           0.025         [0.0128, 0.0482]
ρ           0.99          [0.9616, 0.9975]
A           2.86          [1.9171, 4.2667]
σ²V         4.0 × 10⁻⁶    [1.55 × 10⁻⁶, 1.03 × 10⁻⁵]
λ           0.58          [0.4934, 0.6620]

Table 11: Prior for "sunspot" model

Figures 6 through 17 plot the prior distributions for each of the structural parameters of each model. These were obtained by calculating the properly normalized prior density in the transformed space and using the appropriate Jacobian to calculate the properly normalized density for each parameter in the original space.

The priors for the "sunspot" model were constructed in the same way as described for the "fundamental" model. However, not all combinations of parameters drawn from the prior lead to a sunspot equilibrium: for a sunspot equilibrium, all of the roots of the matrix defined in (4.3) must lie outside the unit circle. Hence the prior for the "sunspot" model is re-normalized by dividing through by the probability, psun, that a random draw from Θs leads to a model that exhibits a sunspot equilibrium. That is,

p(θs) = pb,s(b) pδ,s(δ) · · · pλ,s(λ) pθi,s(θi,s) psun⁻¹. (4.7)

The probability psun was calculated by making M draws from the prior for θs and counting the number of times, n, that the resulting parameter value led to a model with a "sunspot" equilibrium; this entailed checking whether all of the eigenvalues of the matrix J of (4.3) had modulus greater than one. Then psun was set equal to n/M.
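The Monte Carlo estimate of psun can be sketched as follows. Here `draw_J`, the map from a prior draw of the structural parameters to the matrix J of (4.3), is a hypothetical stand-in (a toy diagonal matrix), since the actual mapping depends on the model's solution:

```python
import numpy as np

rng = np.random.default_rng(0)

def is_sunspot(J):
    """A draw supports a sunspot equilibrium when every eigenvalue of
    the matrix J lies outside the unit circle."""
    return bool(np.all(np.abs(np.linalg.eigvals(J)) > 1.0))

def estimate_psun(draw_J, M=10_000):
    """Fraction of M prior draws whose implied matrix J has all
    eigenvalues of modulus greater than one (i.e., psun = n/M)."""
    return sum(is_sunspot(draw_J()) for _ in range(M)) / M

# Toy stand-in: a diagonal J with entries uniform on (0.5, 1.5), so the
# true psun is 0.25 (both entries must exceed 1).
psun = estimate_psun(lambda: np.diag(rng.random(2) + 0.5), M=2_000)
```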

The initial parameters for each model were the same as in the calibrated case; that is, θi,f = K0 and θi,s = (K−1, C−1)′. The same prior distributions as those reported in Table 2 were used for this example as well.

The procedure for comparing the two models with prior uncertainty over the structural parameters was the same as in the previous, calibrated case. For each data set, the likelihood function for each model was constructed by calculating the values of the stochastic components of the models; these values were used to construct the likelihood as described in Section 4.2.


Then, given p(XT |θs,f , θi,f ) and p(θf ), the posterior distribution for the “fundamental” model is

p(θf | XT) ∝ p(θf) p(XT | θf), (4.8)

where p(XT | θf) is defined in (4.14). Likewise, the posterior distribution for the "sunspot" model is

p(θs | XT) ∝ p(θs) p(XT | θs), (4.9)

where p(XT | θs) is defined in (4.16).

In order to calculate the marginal likelihood for the two models, and hence use the Bayesian model comparison techniques described in Section 3.2 to compare them, draws from their respective posterior distributions need to be made. As in the calibrated case, a random-walk Metropolis-Hastings algorithm was used to do this. Let p(θk | XT) be either of the posterior distributions that draws are to be made from, and let R be a matrix of the same order as θk; R is the variance matrix of the source density from which the random step in the Metropolis-Hastings algorithm is drawn. The source density used to obtain the results presented below was a multivariate Normal distribution with mean 0 and variance-covariance matrix R. The Metropolis-Hastings algorithm was as follows:

• given θ(0)k drawn from the prior
• for m = 1, . . . , M:
  – draw x ∼ N(0, R)
  – let y = θ(m−1)k + x
  – set θ(m)k = y with probability min( p(y | XT) / p(θ(m−1)k | XT), 1 ); otherwise set θ(m)k = θ(m−1)k
• return {θ(0)k, . . . , θ(M)k}
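The steps above can be sketched as follows, assuming a generic `log_post` function for the (unnormalized) log posterior; the accept/reject step is done on the log scale to avoid underflow:

```python
import numpy as np

def rw_metropolis_hastings(log_post, theta0, R_chol, M, rng):
    """Random-walk Metropolis-Hastings as in the listing above.

    log_post : log of the (unnormalized) posterior density
    theta0   : initial draw (e.g. from the prior)
    R_chol   : Cholesky factor of the step variance matrix R
    Returns the chain and the acceptance rate (target ~0.3-0.5).
    """
    k = len(theta0)
    chain = np.empty((M + 1, k))
    chain[0] = theta0
    lp = log_post(theta0)
    accepts = 0
    for m in range(1, M + 1):
        y = chain[m - 1] + R_chol @ rng.standard_normal(k)
        lp_y = log_post(y)
        # accept with prob min(p(y|X)/p(theta|X), 1), on the log scale
        if np.log(rng.random()) < lp_y - lp:
            chain[m], lp = y, lp_y
            accepts += 1
        else:
            chain[m] = chain[m - 1]
    return chain, accepts / M

# Example: sample a standard normal "posterior" (illustrative target only)
rng = np.random.default_rng(1)
chain, acc = rw_metropolis_hastings(lambda th: -0.5 * th @ th,
                                    np.zeros(1), 2.4 * np.eye(1), 5000, rng)
```

The step scale 2.4 here is the usual rule of thumb for a one-dimensional Gaussian target and lands the acceptance rate in the 0.3-0.5 band mentioned below.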

The value of R was chosen so as to tune the algorithm, and various rules were used to do so. R was first chosen so that the proportion of acceptances over the whole run was between 0.3 and 0.5; the overall aim is to draw efficiently from the posterior. The rule used to determine that the algorithm was tuned was to check whether the computed numerical standard errors of the posterior means of the drawings {θ(0)k, . . . , θ(M)k} were less than 10% of the reported posterior standard deviations. All of these quantities can be calculated using the routine MOMENT from the BACC software package. Other tests were made as well: one was to take two different sets of draws from the posterior, starting from different random draws from the prior distribution, and check whether the resulting posterior moments were statistically similar. The routine APM from the BACC software package was used for this test.

Tables 19 and 20 at the end of this section report the posterior means and standard deviations for the two models. In all cases, the moments were calculated using MOMENT and M was set equal to 50,000. Using the program CONVERGE of the BACC package, and by inspecting plots of the parameter values drawn from the distributions, it was apparent that all of the algorithms had converged to their invariant distribution, the posterior distribution of each model, after at most 5,000 draws. Therefore, the results reported are for the last 45,000 draws of each Metropolis-Hastings run. In all cases, the numerical standard error for each of the moments was less than 10% of the reported posterior standard deviation and so is not reported.
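The 10% numerical-standard-error rule can be checked with a batch-means estimator. This is one common NSE estimator, not necessarily the one implemented in BACC's MOMENT routine:

```python
import numpy as np

def nse_batch_means(draws, n_batches=50):
    """Numerical standard error of a posterior mean via batch means:
    split the chain into batches and use the spread of batch means."""
    draws = np.asarray(draws, dtype=float)
    usable = (len(draws) // n_batches) * n_batches
    batches = draws[:usable].reshape(n_batches, -1).mean(axis=1)
    return batches.std(ddof=1) / np.sqrt(n_batches)

rng = np.random.default_rng(2)
x = rng.standard_normal(45_000)           # stand-in for M-H output
ok = nse_batch_means(x) < 0.10 * x.std()  # the 10% tuning rule
```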



The posterior and prior distributions for each of the structural parameters of the two models can be found in the figures at the end of this section. Once the draws from the posterior distributions were made, the log marginal likelihoods for each model, and each set of observations, were calculated using the program MLIKE from BACC. The log marginal likelihoods and the resulting log Bayes factors in favor of the "fundamental" model over the "sunspot" model can be found in Tables 12 and 13.

                 Consumption   Hours       Investment   Productivity   Output
"fundamental"    442.6225      493.5325    226.5215     410.6508       441.7111
                 (0.0711)      (0.0853)    (0.0776)     (0.0721)       (0.0946)
"sunspot"        436.4448      514.6317    253.2549     418.7906       466.3343
                 (0.0080)      (0.0580)    (0.1131)     (0.1124)       (0.1124)

Table 12: Log marginal likelihoods: Non-dogmatic prior

Consumption   Hours      Investment   Productivity   Output
6.1777        -21.0992   -26.7334     -8.1398        -24.6232
(0.0715)      (0.1037)   (0.1335)     (0.1360)       (0.1469)

Table 13: Log Bayes factor in favor of "fundamental" model over "sunspot" model: Non-dogmatic prior

Note that, in Table 13, the log Bayes factors favor the "sunspot" model in all cases except when consumption is used to construct the likelihood function. This result contrasts with the results from the case in which the structural parameters were calibrated.

In the course of calculating the marginal likelihoods reported in Table 12, draws from the posterior distributions were made, so posterior moments can be obtained for each model; these are reported in Tables 19 and 20. The draws from the respective posterior distributions also make it possible to plot the posterior distribution of each parameter. These plots are reported in Figures 6 through 17 at the end of this section and were produced using the routine GRAPH from the BACC software package. The priors are plotted with each posterior for comparison, and in each plot there is a separate graph for each of the data sets used.

The first point to note is that for some of the parameters the data add information to the posterior, while for others the data add little relative to the prior. The best example of the latter is the parameter A for both models (Figures 10 and 15): the posterior and the prior are very similar for both models and for all variables. For other parameters, however, there is evidence that the data influence the posterior. Consider labor's share of income, b. For the "fundamental" model, the posterior distribution is shifted to the right of the prior; in the case of consumption and productivity, the posterior means are considerably bigger than the prior mean. This suggests that the data imply a labor share of income higher than the calibrated value, although the posterior means of b vary across the data sets. For the "sunspot" model there is again evidence that the data influence the posterior. The posterior means are again greater than the prior means, although not as markedly as for the "fundamental" model. This too suggests that labor's share of income for the "sunspot" model is higher than the calibrated value.

For the depreciation rate of capital, δ, there are some major differences across models. For the "fundamental" model the posterior and the prior are similar, as evidenced in Figure 7, though there is more posterior weight placed on higher values of δ than in the prior. For the "sunspot" model there is a significant difference between the prior and the posterior. In all but the case when consumption is used to construct the likelihood function, the posterior distribution for δ has a mean that is significantly higher than the mean of the prior distribution, suggesting that the depreciation rate of capital is higher in the "sunspot" model. The variance of the posterior distribution is also higher than that of the prior, with more weight on both high and low values.

Figures 8 and 14 contain the posterior and prior distributions for the parameter ρ, the time discount factor. One aspect worth noting is that for the "fundamental" model the posterior distribution puts more weight on higher values of ρ, and less weight on lower values, relative to the prior; the posterior and prior means, however, are not too dissimilar. This holds for all variables used to construct the likelihood function except hours: when hours are used, the posterior distribution for ρ puts more weight on both low and high values of ρ. It should be noted that the domain of ρ is constrained away from 1. For the "sunspot" model, the posterior distributions for ρ tend to be more diffuse than the prior, most markedly when productivity is used to construct the likelihood function. However, not all posteriors for ρ are more diffuse than the prior: when consumption is used, the posterior distribution favors higher values of ρ over lower values. The results from the two models indicate that the intertemporal discount factor for these models is hard to pin down; the information in the data does not do a good job of identifying a value for ρ. In those cases where the data seemed to provide some information, the posterior distribution favored higher values of ρ, but there were enough cases where this did not hold to prevent confidence that the data suggest the value of ρ used in the literature is too low.

Another parameter for which there is a mixture of results is the AR(1) parameter on the shock process for the "fundamental" model. Figure 9 contains the posterior and prior distributions for θ. For three of the data sets, the posterior distributions are much more diffuse than the prior and suggest that the value of θ is lower than the prior mean of 0.95. However, for two of the data sets there is evidence to suggest that the value of θ should be higher than 0.95. The two variables that lead to a posterior distribution with a mean higher than 0.95 are consumption and productivity; these were the series whose posterior for σ²η was significantly different from the prior. The domain of θ was restricted to (0, 1), which rules out a unit root in the shock process; it may be that for these variables that restriction is not valid. The results for θ are mixed, but they do suggest, at least for some variables, that the value of θ should be lower than the 0.95 used by most calibrators. This result is similar to that found in Hansen (1997).

In the previous section, it was found that the value of the variance of the shock process of the models made a difference to the outcome of the comparison. Figures 11 and 16 contain the posterior and prior distributions for the variance terms of each model's stochastic variables. For both the "fundamental" and "sunspot" models, the posterior distributions for the variances σ²η and σ²V respectively are significantly different from the prior when consumption or productivity is used. In the calibration case, the Bayes factor was strongly in favor of larger variances for those sequences. For both of those variables, the location of the posterior is significantly higher than the location of the prior. When consumption is used to construct the likelihood function, the posterior mean is approximately four posterior standard deviations from the prior mean for the "fundamental" model and approximately 9 posterior standard deviations for the "sunspot" model; for productivity the numbers are approximately 4.5 and 9.3 respectively.

For the "fundamental" model, the two data sets that yield high values for σ²η also yield high values for θ. For consumption and productivity the stochastic variable has higher persistence and variability than what is used in the literature; for the other data sets, the stochastic component had lower persistence than that used in the literature but a variance comparable to it.

The results presented above again show the need for care when comparing models with fewer shocks than variables. Because there are fewer shocks than variables, only a subset of the variables that make up a model can be used in each comparison, so care is needed in calibrating the model for each subset of variables used. Given the log marginal likelihoods already computed, it is possible to calculate log Bayes factors in favor of placing prior uncertainty over the structural parameters of the model. Table 14 contains these results. The results are mixed: there is evidence to suggest that prior uncertainty should be placed over the structural parameters, but this result is not consistent across the data sets or even across models.

                 Consumption   Hours      Investment   Productivity   Output
"fundamental"    223.8158      -11.3219   -17.5217     316.2839       -17.2168
                 (0.0711)      (0.0863)   (0.0735)     (0.0785)       (0.0952)
"sunspot"        -16.6117      9.4596     11.1966      -9.6510        1.5124
                 (0.0145)      (0.0599)   (0.1137)     (0.1130)       (0.1129)

Table 14: Log Bayes factor in favor of non-dogmatic prior over dogmatic prior (calibrated case)

The fact that the marginal likelihood for the calibrated case is sometimes larger than for the non-calibrated case is not that surprising. The marginal likelihood is a weighted average of the likelihood function, where the weights are given by the prior. If the calibrated values are picked in an area where the likelihood is high and the prior is reasonably diffuse, then the marginal likelihood can easily come out in favor of the calibrated case; in this example, the prior is centered at the calibrated values. While the marginal likelihood is lower for the non-calibrated prior in some of the cases, it is apparent that the posterior distributions of the structural parameters differ from the prior distributions. For all data sets, the posterior distributions of some of the parameters are significantly different from the prior. This is evidence that information from the data suggests the calibrated values are not correct.

Also, since the results of the comparison differ when prior uncertainty is allowed over the structural parameters, it is clear that whether to allow for such uncertainty is an important aspect of the decision process. In the RBC literature, it is not the values of the parameters themselves that matter most but rather the implications implied by the model.

4.7 Comparison of an RBC model with a time series AR(1) model

The preceding sections have compared two models found in the RBC literature. The purpose of this section is to compare the two RBC models described in Section 4.1 with a very simple model that is not found in the RBC literature. Real business cycle models aim to account for fluctuations in the observed data. If the ultimate use of a model is to run policy experiments, then it is necessary to quantify how good the model is with respect to competing models in the same literature. It is also important to quantify the performance of the model relative to models outside the literature. This section aims to do just that.

The Bayesian comparison methods described in Section 3.2 can compare non-nested models, so it is a simple matter to compare two models that are very different in nature but nonetheless aim to model the same data. A simple model, called the "AR1" model in what follows, will be compared with the two RBC models described in Section 4.1. A description of this model follows. Let yt be a vector of observations for period t. Then

yt = εt (4.10)


where
εt = φεt−1 + ut (4.11)

and ut | εt−1 is assumed to be independently and identically distributed Normal with mean 0 and variance h⁻¹. Also, εt is assumed to be stationary. Note that the observations used in the comparison have been passed through the Hodrick-Prescott filter described in Section 4.3 and so have mean zero.

This model is model UVR3 of the BACC project,7 and the software provided there was used in this analysis. Suppose that the observations Y^T = {yt}, t = 0, . . . , T, are to be used in the comparison. Define y*t = yt − φyt−1 for t = 1, . . . , T. Then y*t is i.i.d. N(0, h⁻¹). The unconditional distribution of y0 is N(0, h⁻¹/(1 − φ²)), so that the data density for the model described in (4.10) is

p(Y^T | φ, h) = (2π)^(−(T+1)/2) h^((T+1)/2) (1 − φ²)^(1/2) exp{ −(h/2) [ (1 − φ²)y0² + Σ_{t=1}^{T} (y*t)² ] }. (4.12)
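The data density (4.12) can be evaluated on the log scale; a minimal sketch:

```python
import math

def ar1_log_density(y, phi, h):
    """Log of the exact data density (4.12): the stationary
    N(0, h^{-1}/(1-phi^2)) density of y_0 times the one-step-ahead
    N(0, h^{-1}) densities of y*_t = y_t - phi*y_{t-1}."""
    T = len(y) - 1
    ss = (1.0 - phi**2) * y[0]**2
    ss += sum((y[t] - phi * y[t - 1])**2 for t in range(1, T + 1))
    return (-(T + 1) / 2 * math.log(2 * math.pi)
            + (T + 1) / 2 * math.log(h)
            + 0.5 * math.log(1.0 - phi**2)
            - 0.5 * h * ss)
```

As a consistency check, for a two-observation sample this matches the product of the stationary density of y0 and the conditional Normal density of y1 given y0.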

The priors for the parameters φ and h were defined as follows:

s²h ∼ χ²(ν)

and

φ ∼ N(φ̲, h_φ⁻¹),

subject to φ < 1.

A reasonably diffuse prior was placed over the parameter φ. The prior mean φ̲ was set to 0.8 and the prior variance h_φ⁻¹ to 0.01, implying a prior standard deviation of 0.1; a value of 0.8 for φ implies a moderately persistent error process for εt. Various values of φ̲ and h_φ⁻¹ were tried, and the results were not sensitive to the prior for φ. The prior for h was constructed using a notional sample. The data used in the study were in log form, so it was convenient to think of the error εt as a percentage deviation. A notional sample of 10 observations for εt was constructed, for which Σ_{t=1}^{10} εt² = 0.0023. Therefore s² was set to 0.0023 and ν to 10. The priors for this study are summarized in Table 15.

φ̲      h_φ⁻¹    s²       ν
0.8     0.01     0.0023   10

Table 15: Prior specification for AR1 model
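As a sanity check on this prior: s²h ∼ χ²(ν) implies a prior mean for h of ν/s², so the implied standard deviation of the error εt is roughly (s²/ν)^(1/2), i.e. about 1.5%, which is consistent with reading εt as a percentage deviation:

```python
import math

# Under the prior s^2 * h ~ chi^2(nu), E[s^2 * h] = nu, so E[h] = nu / s^2.
s2, nu = 0.0023, 10
h_mean = nu / s2                 # prior mean of the precision h
implied_sd = math.sqrt(s2 / nu)  # rough implied sd of eps_t (~1.5%)
```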

Using the software UVR3 and MLIKEUVR3 from the BACC project, posterior moments and marginal likelihoods for the "AR1" model were calculated for each data set used in the comparison of the models described in Section 4.1. Table 16 contains the posterior moments for the "AR1" model, while Table 17 contains its marginal likelihoods.

Given the marginal likelihoods reported in Table 17, it is possible to calculate Bayes factors comparing both the "fundamental" model and the "sunspot" model with the "AR1" model. These are reported in Table 18. It is clear from Table 18 that the "AR1" model is strongly favored over both models. In fact, for the posterior odds ratio to come out just in favor of either of the two models from the RBC literature, at least 1.679 × 10¹¹⁶ (= exp(267.6182)) times more prior weight would have to be placed on the RBC models. In effect, zero prior weight would need to be placed on the "AR1" model. This is a

7 http://www.econ.umn.edu/∼bacc/models.html


       Consumption   Hours         Investment    Productivity  Output
φ      0.7546        0.7212        0.7945        0.7264        0.7724
       (0.0514)      (0.0524)      (0.0451)      (0.0550)      (0.0489)
h⁻¹    1.18×10⁻⁴     1.33×10⁻⁴     2.16×10⁻³     1.64×10⁻⁴     1.38×10⁻⁴
       (1.29×10⁻⁵)   (1.45×10⁻⁵)   (2.39×10⁻⁴)   (1.84×10⁻⁵)   (1.51×10⁻⁵)

Table 16: Posterior moments for AR1 model

Consumption   Hours      Investment   Productivity   Output
757.9801      815.4367   520.8731     731.1842       745.2873
(0.0250)      (0.0372)   (0.0344)     (0.0384)       (0.0301)

Table 17: Marginal likelihoods for AR1 model

strong result. The two models from the RBC literature were developed with the aim of modeling the data that was used in the comparison. The "AR1" model has no "economic content" but is clearly favored. This suggests, at least for the two RBC models that were compared, that more work is needed in the development of the models. Careful consideration of the above results should be taken into account when analyzing inferences obtained from the models described in Section 4.1.
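The arithmetic behind this statement can be checked directly from the reported log marginal likelihoods and log Bayes factors:

```python
import math

# Values from Tables 17 and 18 (Investment series, "sunspot" model, full prior).
log_ml_ar1 = 520.8731      # log marginal likelihood of the AR1 model
log_bf = 267.6182          # log Bayes factor in favor of the AR1 model

# Implied log marginal likelihood of the competing RBC model.
log_ml_rbc = log_ml_ar1 - log_bf

# Posterior odds = prior odds x Bayes factor, so for even posterior odds the
# RBC model would need prior odds of exp(log_bf) in its favor.
prior_odds_needed = math.exp(log_bf)
print(f"{prior_odds_needed:.3e}")   # 1.679e+116
```

The smallest log Bayes factor in Table 18 already corresponds to prior odds beyond any plausible prior, which is why the text describes the result as effectively requiring zero prior weight on the "AR1" model.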

                  Consumption   Hours      Investment   Productivity   Output
"fundamental"     539.1734      310.5823   276.8299     636.8173       286.3594
calibrated case   (0.0252)      (0.0393)   (0.0367)     (0.0402)       (0.0322)
"sunspot"         304.9236      310.2646   278.8148     302.7426       280.4654
calibrated case   (0.0278)      (0.0401)   (0.0365)     (0.0403)       (0.0319)
"fundamental"     315.3576      321.9042   294.3516     320.5334       303.5762
full prior        (0.0754)      (0.0930)   (0.0799)     (0.0866)       (0.0993)
"sunspot"         321.5353      300.8050   267.6182     312.3936       278.9530
full prior        (0.0262)      (0.0689)   (0.1182)     (0.1188)       (0.1164)

Table 18: Log Bayes factor in favor of the AR1 model


Figure 6: Posterior and Prior Distribution for b: “fundamental”

Figure 7: Posterior and Prior Distribution for δ: “fundamental”

Figure 8: Posterior and Prior Distribution for ρ: “fundamental”

Figure 9: Posterior and Prior Distribution for θ: “fundamental”

Figure 10: Posterior and Prior Distribution for A: “fundamental”

Figure 11: Posterior and Prior Distribution for σ2η: “fundamental”

Figure 12: Posterior and Prior Distribution for b: “sunspot”

Figure 13: Posterior and Prior Distribution for δ: “sunspot”

Figure 14: Posterior and Prior Distribution for ρ: “sunspot”

Figure 15: Posterior and Prior Distribution for A: “sunspot”

Figure 16: Posterior and Prior Distribution for σ2V : “sunspot”

Figure 17: Posterior and Prior Distribution for λ: “sunspot”


       Consumption   Hours        Investment   Productivity   Output       Prior mean   95% prior credible set
b      0.7827        0.7188       0.6851       0.8078         0.6646       0.64         [0.4857, 0.7699]
       (0.0018)      (0.0661)     (0.0559)     (0.0454)       (0.0713)     (0.0729)
δ      0.02998       0.0275       0.0278       0.0262         0.0242       0.025        [0.0128, 0.0482]
       (0.0083)      (0.0101)     (0.0085)     (0.0088)       (0.0085)     (0.0083)
ρ      0.9910        0.9829       0.9898       0.9902         0.9912       0.99         [0.9616, 0.9975]
       (0.0066)      (0.0103)     (0.0077)     (0.0081)       (0.0066)     (0.0068)
θ      0.9778        0.8506       0.8949       0.9793         0.8788       0.95         [0.8603, 0.9832]
       (0.0107)      (0.0527)     (0.0447)     (0.0106)       (0.0505)     (0.0268)
A      2.7752        2.7947       2.7867       2.8189         2.7747       2.86         [1.9171, 4.2667]
       (0.5548)      (0.5709)     (0.5513)     (0.5436)       (0.5665)     (0.5720)
σ²η    2.31×10⁻⁴     3.79×10⁻⁵    3.58×10⁻⁵    3.09×10⁻⁴      2.85×10⁻⁵    4×10⁻⁵       [3.1×10⁻⁵, 1.26×10⁻⁴]
       (4.4×10⁻⁵)    (1.4×10⁻⁵)   (1.5×10⁻⁵)   (5.9×10⁻⁵)     (7.7×10⁻⁶)   (2.3×10⁻⁵)

Table 19: Posterior moments for "fundamental" model


       Consumption   Hours        Investment   Productivity   Output       Prior mean   95% credible set
b      0.7688        0.7028       0.7582       0.7275         0.7254       0.70         [0.5535, 0.8145]
       (0.0418)      (0.0261)     (0.0267)     (0.0198)       (0.0268)     (0.0547)
δ      0.0213        0.0311       0.0453       0.0389         0.0310       0.025        [0.0128, 0.0482]
       (0.0074)      (0.0091)     (0.0085)     (0.0152)       (0.0089)     (0.0083)
ρ      0.9931        0.9838       0.9861       0.9709         0.9853       0.99         [0.9616, 0.9975]
       (0.0057)      (0.0095)     (0.0094)     (0.0207)       (0.0088)     (0.0068)
A      2.7877        2.8207       2.8221       2.7749         2.7849       2.86         [1.9171, 4.2667]
       (0.5814)      (0.5734)     (0.5655)     (0.5916)       (0.5641)     (0.5720)
σ²V    1.14×10⁻⁴     2.40×10⁻⁶    3.35×10⁻⁶    1.30×10⁻⁴      2.96×10⁻⁶    4×10⁻⁶       [1.5×10⁻⁶, 1.03×10⁻⁵]
       (1.1×10⁻⁵)    (1.4×10⁻⁶)   (9.6×10⁻⁷)   (1.4×10⁻⁵)     (1.1×10⁻⁶)   (1.8×10⁻⁶)
λ      0.5538        0.6041       0.5949       0.6486         0.6171       0.58         [0.4934, 0.6620]
       (0.0404)      (0.0262)     (0.0288)     (0.0339)       (0.0263)     (0.0425)

Table 20: Posterior moments for "sunspot" model


5 Conclusion

It was asserted earlier that in order to use a model to answer questions posed by a researcher or to evaluate policy changes, there first needs to be a determination of how good the model is. However, in a new literature it may be unwise to reject a model because it does a poor job of predicting the observations one has. The models that are put forward may add insight into the unknown process governing the system that is observed. Gaining this insight may then lead to better models that do a better job of predicting the data. However, even if a model is assumed to do a bad job of predicting the observations, that is no reason not to evaluate the model's performance as fully as possible. It would be hard to take any inferences obtained from a model seriously if there were no way to relate the model to competing models, both inside and outside the literature.

It has been argued that, while models in the RBC literature are poor at predicting the data that is observed, they do a good job of replicating some characteristics of those observations. This has been the reasoning behind using those models to try to understand the observations at hand. However, one of the criticisms of the RBC literature is that the methods used to claim that the models replicate certain facts well are flawed. The important aspect of using models to answer questions is to know whether the model being used is superior to other models that have been used. Therefore a method that can formally and directly compare two models is preferred.

In the RBC literature, there are few methods of evaluation that directly compare two or more models. The object of this paper was to put forward a method that is able to formally and directly compare two models in the RBC literature. The benefit of such a method is twofold: first, when using a model to make inferences, a formal statement of the model's performance with respect to competing models can be made; and second, when extending old models it is possible to formally determine the relative performance with respect to the existing model, both across the whole sample and across sub-samples. If it is possible to formally distinguish between competing models, it is possible to formally distinguish between their competing inferences. Because the method is Bayesian it is likelihood based and satisfies the Likelihood Principle. This implies that all results are based solely on information from the observations on the system, through the likelihood function, and on any prior information held before the sample is obtained.

Because the method is likelihood based, a likelihood function needed to be constructed for models of the type found in the RBC literature. Because of the complexity of these models, it is necessary to approximate the solution of the model. Once that is done it is possible to write the model in a state-space representation in which all of the variables of the model are functions of the states of the model. This representation is used to calculate the likelihood function. In particular, conditional on initial conditions, observations on some or all of the variables of the model can be used to determine values for the stochastic state variables of the model. The number of variables used is limited to the number of stochastic state variables. The process is iterative: once all of the values of the stochastic state variables are known for one period, it is possible to calculate the values of all of the variables of the model in the current period and also the values of the non-stochastic state variables for the next period. Then, the next period's observations are used to calculate the next period's values of the stochastic state variables, and so on. The calculated values of the stochastic components of the models are then used to calculate the likelihood function for the model.
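The recursion just described can be sketched in a deliberately simplified scalar setting (one observed series, one stochastic and one non-stochastic state). The function and its coefficients are hypothetical stand-ins, not the paper's model:

```python
import math

def state_space_loglik(y, a, b, c, d, theta, sigma2, k0, z0):
    """Hypothetical scalar illustration of the likelihood recursion:
        y_t = a*k_t + b*z_t          (observation; invertible for z_t)
        k_{t+1} = c*k_t + d*z_t      (non-stochastic state update)
        z_t = theta*z_{t-1} + eta_t  (stochastic state, Gaussian innovation)
    Conditional on (k0, z0), each observation y_t pins down z_t, hence the
    innovation eta_t, hence the Gaussian log likelihood."""
    loglik, k, z_prev = 0.0, k0, z0
    for y_t in y:
        z = (y_t - a * k) / b              # invert the observation for the stochastic state
        eta = z - theta * z_prev           # implied innovation this period
        loglik += -0.5 * (math.log(2 * math.pi * sigma2) + eta**2 / sigma2)
        k = c * k + d * z                  # next period's non-stochastic state
        z_prev = z
    return loglik
```

In the models of Section 4.1 the states and observables are vectors and the inversion is a matrix solve, but the structure of the iteration is the same.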

Once the likelihood function is calculated, Bayesian model comparison methods are available. It was shown that these methods compare models with respect to their relative out-of-sample prediction performance. It was also shown that it is a simple step to allow for uncertainty over the structural


parameters of the models being compared. In fact, the current practice of calibration is just a special case of this more general approach. It was also shown that it is possible to compare models across sub-samples of the data as well as across the whole sample.

The method was applied to the comparison of two models from Farmer and Guo (1994). The two models were variants of the same one-sector stochastic growth model. The first variant had shocks that affected the "fundamental" components of the model. The second variant had an increasing-returns-to-scale technology that allowed the model to produce cycles using shocks that were unrelated to the "fundamental" components of the model. When the structural parameters were calibrated, the outcome of the comparison was mixed. It was apparent that the values of the variances of the respective shock terms mattered for the outcome of the comparison. When the variances of the shock processes were recalibrated, there was a dramatic change in the result for two of the five data sets used to calibrate the models. However, the new method of calibration was preferred for only those two data sets. It was clear that there was uncertainty as to the correct way to calibrate the variances of the shock processes. This is evidence of the need to allow for uncertainty over the parameters when comparing models. The Bayes factor was also decomposed across the whole data set, and it was clear that there were periods where each model was ascendant. No model was consistently better across the whole data set. This result could not have been found using comparisons across the whole data set alone. It seemed that the "sunspot" model was preferred when the data was relatively volatile, while the "fundamental" model did better in periods of lower volatility.
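The sub-sample decomposition rests on the identity that a log marginal likelihood is a sum of one-step-ahead log predictive densities, so a log Bayes factor can be accumulated period by period. A sketch with simulated (not actual) predictive densities:

```python
import numpy as np

# Illustrative only: the one-step-ahead log predictive densities below are
# simulated, not taken from the paper's results.
rng = np.random.default_rng(1)
T = 120
logpred_fundamental = rng.normal(-1.0, 0.3, T)
logpred_sunspot = rng.normal(-1.0, 0.3, T)

# Cumulative log Bayes factor of "fundamental" over "sunspot" through period t.
cum_log_bf = np.cumsum(logpred_fundamental - logpred_sunspot)

# The final entry is the full-sample log Bayes factor; the path of the partial
# sums shows the sub-samples over which each model is ahead.
print(cum_log_bf[-1])
ahead = np.sign(cum_log_bf)   # +1 where "fundamental" leads, -1 where "sunspot" leads
```

A full-sample comparison reports only the last entry of this path; the decomposition keeps the whole path, which is what reveals the volatility pattern described above.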

The two models were also compared when uncertainty was allowed over the structural parameters of the model. The posterior distributions for some of the parameters were significantly different from their prior distributions. The overall results of the comparison were more in favor of the "sunspot" model than in the calibration case. These results add weight to the claims of Farmer and Guo (1994) that a "sunspot" model does as good a job of replicating business cycles as the commonly used "fundamental" model. This result has an implication for policy, as in a "sunspot" model there is room for a welfare-increasing policy that aims to smooth cycles. This is not the case in the "fundamental" model.

The nature of the RBC literature is that the structural parameters of the models are calibrated. In a Bayesian context, calibrating the parameters means that prior knowledge about the values of the structural parameters is used in the decision-making process. In the RBC literature, the use of prior information is extreme in that the prior distributions for the parameters are degenerate at the calibrated values. What the results above show is, first, that it is easy to allow for uncertainty over the values of the structural parameters and, second, that allowing for prior uncertainty can make a difference to the result of the comparison. If the aim is to increase understanding about a process, then prior uncertainty should be allowed for when comparing models.

A third comparison was made between the two RBC models and a very simple non-RBC model. This highlights the fact that the method developed here is not limited to comparing two similar models. The comparison between the RBC models and the non-RBC model yielded interesting results. The non-RBC model was strongly favored in all cases. This result has implications for the inferences that can be drawn from the two models in particular and from RBC models in general. It would seem that a very simple model with little economic content performs much better than the more complicated models, with a high degree of economic content, that are described in Section 4.1. For example, if an RBC model is to be used to make inferences about the implications of different government policies, then an evaluation of the performance of that model against competing models is important. The result that the simple "AR1" model strongly outperforms the two RBC models places


the inferences that can be made from them into context. There is clearly room for a non-RBC model with more economic content than the "AR1" model to be developed and used for inference. The result also shows that there is room for improvement within the class of RBC models as well.

While the application in this paper is model comparison, the Bayesian approach extends easily to other applications in the literature. Once a model has been chosen from among a set of models, the model is then used for inference. Prior uncertainty over the structural parameters of the model can be applied as easily as in the model comparison case. Using a Metropolis-Hastings algorithm, draws can be made from the posterior distribution of the model, and for each of these draws functions of interest can be calculated. For example, one function of interest may be the impulse response function implied by the model at the specific parameter values that were drawn. It is then possible to construct posterior moments for the function of interest, just as was done for the parameters of the two models studied in Section 4.1. The method described in this paper is, therefore, not restricted to the problem of model comparison. All problems now studied in the RBC literature can be handled using the methods described in this paper, and there is now no reason not to use information from the data to infer the values of the structural parameters used for model inference.
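A random-walk Metropolis-Hastings sampler of the kind described can be sketched as follows; the two-parameter Gaussian log posterior is a toy stand-in for a model's actual posterior, and the function of interest here is simply the parameter mean:

```python
import numpy as np

def rw_metropolis(log_post, theta0, scale, n_draws, rng):
    """Random-walk Metropolis: Gaussian proposals centered at the current draw."""
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    draws = np.empty((n_draws, theta.size))
    for i in range(n_draws):
        prop = theta + scale * rng.standard_normal(theta.size)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # accept with prob min(1, ratio)
            theta, lp = prop, lp_prop
        draws[i] = theta                           # repeat current draw on rejection
    return draws

# Toy stand-in for a model's log posterior: standard normal in two parameters.
rng = np.random.default_rng(2)
draws = rw_metropolis(lambda th: -0.5 * th @ th, [0.0, 0.0], 0.8, 20_000, rng)

# Posterior moments of any function of interest (e.g. an impulse response
# evaluated at each draw); here just the posterior means after burn-in.
print(draws[5_000:].mean(axis=0))   # near [0, 0]
```

For a function of interest such as an impulse response, one would replace the final mean with the function evaluated draw by draw and then average, exactly as described in the text.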

References

Anderson, E., L. Hansen, E. McGrattan, and T. Sargent (1996): "Mechanics of Forming and Estimating Dynamic Linear Economies," in Handbook of Computational Economics, ed. by H. Amman, D. Kendrick, and J. Rust, vol. I. Elsevier Science B.V.

Benhabib, J., and R. Farmer (1994): "Indeterminacy and Increasing Returns," Journal of Economic Theory, 25.

Berger, J., and R. Wolpert (1988): The Likelihood Principle. Hayward, California: Institute of Mathematical Statistics, 2nd edn.

Canova, F., and E. Ortega (1995): "Testing Calibrated General Equilibrium Models," unpublished manuscript.

Chib, S., and E. Greenberg (1995): "Understanding the Metropolis-Hastings Algorithm," The American Statistician, 49.

Christiano, L. (1988): "Why Does Inventory Investment Fluctuate So Much?," Journal of Monetary Economics, 28.

Christiano, L., and M. Eichenbaum (1992): "Current Business Cycle Theories and Aggregate Labor Market Fluctuations," American Economic Review, 82.

Danthine, J., and J. Donaldson (1993): "Methodological and empirical issues in real business cycle theory," European Economic Review, 37.

DeJong, D., B. Ingram, and C. Whiteman (1996): "A Bayesian Approach to Calibration," Journal of Business and Economic Statistics.

(1997): "A Bayesian Approach to Dynamic Macroeconomics," Working paper, University of Iowa.

Diebold, F., L. Ohanian, and J. Berkowitz (1994): "Dynamic Equilibrium Economies: A Framework for Comparing Models and Data," unpublished manuscript, University of Pennsylvania.

Domowitz, I., G. Hubbard, and B. Peterson (1988): "Market structure and cyclical fluctuations in U.S. manufacturing," Review of Economics and Statistics, 70.

Farmer, R. (1993): The Macroeconomics of Self-Fulfilling Prophecies. M.I.T. Press, Cambridge, Massachusetts.

Farmer, R., and J.-T. Guo (1994): "Real Business Cycles and the Animal Spirits Hypothesis," Journal of Economic Theory, 63.

Gelfand, A., and D. Dey (1994): "Bayesian Model Choice: Asymptotics and Exact Calculations," Journal of the Royal Statistical Society Series B, 56.

Geweke, J. (1998): "Using Simulation Methods for Bayesian Econometric Models: Inference, Development and Communication," Econometric Reviews, forthcoming.

Geweke, J., and S. Chib (1998): "Bayesian Analysis Computation and Communication (BACC): A resource for Investigators, Clients and Students," University of Minnesota working paper.

Hall, R. (1990): "Invariance properties of Solow's productivity residual," in Growth-Productivity-Unemployment. M.I.T. Press, Cambridge, Massachusetts.

Hansen, G. (1985): "Indivisible Labor and the Business Cycle," Journal of Monetary Economics, 16.

(1997): "Technical progress and aggregate fluctuations," Journal of Economic Dynamics and Control, 21(6).

Hansen, G., and E. Prescott (1995): "Recursive Methods for Computing Equilibria of Business Cycle Models," in Frontiers of Business Cycle Research, ed. by T. Cooley. Princeton University Press, Princeton.

Hansen, L., and J. Heckman (1996): "The Empirical Foundations of Calibration," The Journal of Economic Perspectives, 10(1).

Hansen, L., and T. Sargent (1996): "Recursive linear models of dynamic economies," unpublished manuscript.

Hodrick, R., and E. Prescott (1997): "Post-War U.S. Business Cycles: An Empirical Investigation," Journal of Money, Credit and Banking, 29(1).

Kim, K., and A. Pagan (1995): "The Econometric Analysis of Calibrated Macroeconomic Models," in Handbook of Applied Econometrics, ed. by H. Pesaran, and M. Wickens. Blackwell Press.

Kydland, F., and E. Prescott (1982): "Time to Build and Aggregate Fluctuations," Econometrica, 50.

(1996): "The Computational Experiment: An Econometric Tool," Journal of Economic Perspectives, 10(1).

Lucas, R. (1977): "Understanding Real Business Cycles," in Stabilization of the Domestic and International Economy, ed. by K. Brunner, and A. Meltzer. North-Holland, Amsterdam.

Prescott, E. (1986): "Theory Ahead of Business Cycle Measurement," Federal Reserve Bank of Minneapolis Quarterly Review, 10.

Sims, C. (1996): "Macroeconomics and Methodology," The Journal of Economic Perspectives, 10(1).

Smith, A. (1993): "Estimating Non-Linear Time Series Models using Simulated VAR's," Journal of Applied Econometrics, 8.

Stadler, G. (1994): "Real Business Cycles," Journal of Economic Literature, XXXII.

Tierney, L. (1994): "Markov Chains for Exploring Posterior Distributions," Annals of Statistics, 22(4).

Watson, M. (1993): "Measures of Fit for Calibrated Models," Journal of Political Economy, 101.

A Derivation of formulae

A.1 Derivation of (4.3)

A first-order Taylor series approximation to (4.1) is used to derive (4.3). That is, the following equations are linearized:

$$K_{t+1} = B Z_t^m K_t^g C_t^d + (1-\delta)K_t - C_t$$

$$\frac{1}{C_t} = E_t\left[ D Z_{t+1}^m K_{t+1}^{g-1} C_{t+1}^{d-1} + \frac{\tau}{C_{t+1}} \right] \qquad (A.13)$$

$$Z_t = Z_{t-1}^{\theta}\, \eta_t.$$

Consider, first, the first equation of the system given above:

$$K_{t+1} \approx \bar{K} + \left[ gB\bar{K}^{g-1}\bar{C}^{d} + (1-\delta) \right](K_t - \bar{K}) + \left[ dB\bar{K}^{g}\bar{C}^{d-1} - 1 \right](C_t - \bar{C}) + \left[ mB\bar{K}^{g}\bar{C}^{d} \right](Z_t - \bar{Z}).$$

Subtracting $\bar{K}$ from both sides and dividing through by $\bar{K}$, the above approximation becomes

$$\frac{K_{t+1} - \bar{K}}{\bar{K}} \approx \left[ gB\bar{K}^{g-1}\bar{C}^{d} + (1-\delta) \right]\frac{K_t - \bar{K}}{\bar{K}} + \left[ dB\bar{K}^{g-1}\bar{C}^{d} - \frac{\bar{C}}{\bar{K}} \right]\frac{C_t - \bar{C}}{\bar{C}} + \left[ mB\bar{K}^{g-1}\bar{C}^{d} \right](Z_t - \bar{Z}) \qquad (A.14)$$

$$= \left[ gB\Lambda + (1-\delta) \right]\hat{K}_t + \left[ dB\Lambda - \frac{\bar{C}}{\bar{K}} \right]\hat{C}_t + \left[ mB\Lambda \right]\hat{Z}_t$$

where $\Lambda = \bar{K}^{g-1}\bar{C}^{d}$ and a hat denotes a percentage deviation from the steady state. Next, consider the second equation of the system given in (A.13). This equation can be written

$$E_t[C_{t+1}] = C_t\, E_t\left[ D Z_{t+1}^m K_{t+1}^{g-1} C_{t+1}^{d} + \tau \right] \qquad (A.15)$$

so that the first-order Taylor series approximation to this equation is

$$E_t[C_{t+1}] \approx \bar{C} + \left[ D\bar{K}^{g-1}\bar{C}^{d} + \tau \right](C_t - \bar{C}) + dD\bar{K}^{g-1}\bar{C}^{d}(C_{t+1} - \bar{C}) + (g-1)D\bar{K}^{g-2}\bar{C}^{d+1}(K_{t+1} - \bar{K}) + mD\bar{K}^{g-1}\bar{C}^{d+1}(Z_{t+1} - \bar{Z}).$$

Subtracting $\bar{C}$ from both sides of the above approximation, dividing through by $\bar{C}$, and using the steady-state condition $D\Lambda + \tau = 1$, the approximation becomes

$$E_t\left[ \frac{C_{t+1} - \bar{C}}{\bar{C}} \right] = \frac{C_t - \bar{C}}{\bar{C}} + dD\Lambda\, E_t\left[ \frac{C_{t+1} - \bar{C}}{\bar{C}} \right] + (g-1)D\Lambda\, E_t\left[ \frac{K_{t+1} - \bar{K}}{\bar{K}} \right] + mD\Lambda\, E_t[Z_{t+1} - \bar{Z}].$$

Therefore, the approximation to the second equation of (A.13) is

$$-\hat{C}_t = \left[ dD\Lambda - 1 \right]E_t[\hat{C}_{t+1}] + (g-1)D\Lambda\, E_t[\hat{K}_{t+1}] + mD\Lambda\, E_t[\hat{Z}_{t+1}]. \qquad (A.16)$$

Finally, the approximation to the third equation of (A.13) is

$$\hat{Z}_{t+1} = \theta \hat{Z}_t + \eta_{t+1}. \qquad (A.17)$$

The three linearized equations, (A.14), (A.16), and (A.17), can be represented as the following matrix system:

$$\mathbf{A}\begin{bmatrix} \hat{K}_t \\ \hat{C}_t \\ \hat{Z}_t \end{bmatrix} = \mathbf{B}\begin{bmatrix} \hat{K}_{t+1} \\ \hat{C}_{t+1} \\ \hat{Z}_{t+1} \end{bmatrix} + \mathbf{C}\begin{bmatrix} \eta_{t+1} \\ E_t[\hat{K}_{t+1}] - \hat{K}_{t+1} \\ E_t[\hat{C}_{t+1}] - \hat{C}_{t+1} \\ E_t[\hat{Z}_{t+1}] - \hat{Z}_{t+1} \end{bmatrix} \qquad (A.18)$$

where

$$\mathbf{A} = \begin{bmatrix} gB\Lambda + (1-\delta) & dB\Lambda - \frac{\bar{C}}{\bar{K}} & mB\Lambda \\ 0 & -1 & 0 \\ 0 & 0 & \theta \end{bmatrix},$$

$$\mathbf{B} = \begin{bmatrix} 1 & 0 & 0 \\ (g-1)D\Lambda & dD\Lambda - 1 & mD\Lambda \\ 0 & 0 & 1 \end{bmatrix},$$

and

$$\mathbf{C} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & (g-1)D\Lambda & dD\Lambda - 1 & mD\Lambda \\ 1 & 0 & 0 & 0 \end{bmatrix}.$$

Therefore, the approximation to the system of equations in (A.13) is

$$\begin{bmatrix} \hat{K}_t \\ \hat{C}_t \\ \hat{Z}_t \end{bmatrix} = \mathbf{J}\begin{bmatrix} \hat{K}_{t+1} \\ \hat{C}_{t+1} \\ \hat{Z}_{t+1} \end{bmatrix} + \mathbf{R}\begin{bmatrix} \eta_{t+1} \\ E_t[\hat{K}_{t+1}] - \hat{K}_{t+1} \\ E_t[\hat{C}_{t+1}] - \hat{C}_{t+1} \\ E_t[\hat{Z}_{t+1}] - \hat{Z}_{t+1} \end{bmatrix} \qquad (A.19)$$

where $\mathbf{J} = \mathbf{A}^{-1}\mathbf{B}$ and $\mathbf{R} = \mathbf{A}^{-1}\mathbf{C}$.
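Given numerical values for the structural parameters and the steady state, the matrices of (A.18) and the solution matrices J = A⁻¹B and R = A⁻¹C can be computed directly. A sketch with purely illustrative parameter values (none taken from the paper):

```python
import numpy as np

# Illustrative parameter values only (not calibrated from the paper).
g, d, m = 0.3, 0.6, 1.0        # exponents in the production relation
Bc, Dc = 1.1, 0.95             # the constants B and D of (A.13)
delta, theta = 0.025, 0.95
Kbar, Cbar = 10.0, 2.0
Lam = Kbar**(g - 1) * Cbar**d  # Lambda = Kbar^(g-1) * Cbar^d

A = np.array([[g*Bc*Lam + (1 - delta), d*Bc*Lam - Cbar/Kbar, m*Bc*Lam],
              [0.0,                    -1.0,                 0.0],
              [0.0,                     0.0,                 theta]])
Bm = np.array([[1.0,             0.0,            0.0],
               [(g - 1)*Dc*Lam,  d*Dc*Lam - 1.0, m*Dc*Lam],
               [0.0,             0.0,            1.0]])
Cm = np.array([[0.0, 0.0,            0.0,            0.0],
               [0.0, (g - 1)*Dc*Lam, d*Dc*Lam - 1.0, m*Dc*Lam],
               [1.0, 0.0,            0.0,            0.0]])

J = np.linalg.solve(A, Bm)     # J = A^{-1} B
R = np.linalg.solve(A, Cm)     # R = A^{-1} C
```

Using `linalg.solve` rather than forming A⁻¹ explicitly is the standard numerically stable way to compute A⁻¹B and A⁻¹C.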

A.2 Derivation of (4.4)

The object is to calculate the first-order Taylor series approximation to (4.4) of Section 4.2. The approximation is

$$L_t \approx \bar{L} + \phi\left[ \frac{A}{b}\frac{\bar{C}}{\bar{K}^{\alpha}} \right]^{\phi-1}\frac{A}{b}\frac{1}{\bar{K}^{\alpha}}(C_t - \bar{C}) + \phi\left[ \frac{A}{b}\frac{\bar{C}}{\bar{K}^{\alpha}} \right]^{\phi-1}\frac{\alpha A}{b}\frac{-\bar{C}}{\bar{K}^{\alpha+1}}(K_t - \bar{K}) + \phi\left[ \frac{A}{b}\frac{\bar{C}}{\bar{K}^{\alpha}} \right]^{\phi-1}\frac{A}{b}\frac{-\bar{C}}{\bar{K}^{\alpha}}(Z_t - 1).$$


Subtracting $\bar{L}$ from both sides and then dividing through by $\bar{L}$, the above approximation becomes

$$\hat{L}_t \approx \phi\left[ \frac{b}{A}\frac{\bar{K}^{\alpha}}{\bar{C}} \right]\frac{A}{b}\frac{1}{\bar{K}^{\alpha}}(C_t - \bar{C}) - \alpha\phi\left[ \frac{b}{A}\frac{\bar{K}^{\alpha}}{\bar{C}} \right]\frac{A}{b}\frac{\bar{C}}{\bar{K}^{\alpha+1}}(K_t - \bar{K}) - \phi\left[ \frac{b}{A}\frac{\bar{K}^{\alpha}}{\bar{C}} \right]\frac{A}{b}\frac{\bar{C}}{\bar{K}^{\alpha}}(Z_t - 1)$$

$$= \phi\hat{C}_t - \alpha\phi\hat{K}_t - \phi\hat{Z}_t. \qquad (A.20)$$
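The elasticities in (A.20) can be checked numerically by differentiating the underlying labor relation. This sketch assumes the relation has the form L = ((A/b)·C/(Z·K^α))^φ, which is consistent with the derivative terms above; the parameter values are illustrative only:

```python
# Numerical check of (A.20): with L(C, K, Z) = ((A/b) * C / (Z * K**alpha))**phi,
# the steady-state elasticities of L with respect to C, K and Z should equal
# phi, -alpha*phi and -phi respectively. Parameter values are illustrative.
Apar, b, alpha, phi = 2.86, 0.70, 0.36, 2.0
Cbar, Kbar, Zbar = 2.0, 10.0, 1.0

def L(C, K, Z):
    return ((Apar / b) * C / (Z * K**alpha))**phi

def elasticity(i):
    """Central-difference elasticity of L in its i-th argument at the steady state."""
    args = [Cbar, Kbar, Zbar]
    h = 1e-6 * args[i]
    up = args.copy(); up[i] += h
    dn = args.copy(); dn[i] -= h
    return (L(*up) - L(*dn)) / (2 * h) * args[i] / L(*args)

print(round(elasticity(0), 6))   #  2.0   =  phi
print(round(elasticity(1), 6))   # -0.72  = -alpha*phi
print(round(elasticity(2), 6))   # -2.0   = -phi
```

Agreement of the numerical elasticities with φ, −αφ, and −φ confirms the algebra of the log-linearization.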
