
    State Space Models, Kalman Filter and Smoothing

The idea that any dynamic system can be expressed in a particular representation, called the state space representation, was proposed by Kalman, who presented an algorithm, a set of rules, to sequentially forecast and update a set of projections of the unknown state vector.

State space representation of a dynamic system: the general case

State space models were originally developed by control engineers to represent a dynamic system, or dynamic linear model. Interest normally centres on an $(m \times 1)$ vector of variables, called the state vector, $\alpha_t$, which may comprise signals from a satellite or the actual position of a missile or a rocket. The state vector represents the dynamics of the process; more precisely, it retains all the memory in the process: all dependence between past and future must funnel through the state vector. The elements of the state vector may not have any specific economic meaning, but the state space approach is popular in economic applications involving the modelling of unobserved or latent variables, such as permanent income, the NAIRU (Non-Accelerating Inflation Rate of Unemployment), expected inflation, or the state of the economy in business cycle analysis. In most cases such signals are not observable directly, but the state vector is related to an $(n \times 1)$ vector $z_t$ of variables that are actually observed, through an equation called the measurement equation or the observation equation, given by

$$z_t = A_t x_t + Y_t \alpha_t + N_t \qquad (1)$$

where $Y_t$ and $A_t$ are parameter matrices of order $(n \times m)$ and $(n \times k)$ respectively, $x_t$ is a $(k \times 1)$ vector of exogenous or pre-determined variables, and $N_t$ is an $(n \times 1)$ vector of disturbances with zero mean and covariance matrix $H_t$.

Although the state vector $\alpha_t$ is not directly observable, its movements are assumed to be governed by a well-defined process, called the transition equation or state equation, given by

$$\alpha_t = T_t \alpha_{t-1} + R_t \eta_t, \qquad t = 1, \ldots, T, \qquad (2)$$

where $T_t$ and $R_t$ are matrices of order $(m \times m)$ and $(m \times g)$ respectively, and $\eta_t$ is a $(g \times 1)$ vector of disturbances with mean zero and covariance matrix $Q_t$.

    Remarks:

1. Note that in the measurement equation we have an added disturbance term $N_t$. We need it only if we assume that what we observe is contaminated by an additional noise; otherwise we simply have

$$z_t = A_t x_t + Y_t \alpha_t. \qquad (3)$$


…large enough so that the dynamics of the system can be captured by the simple first-order Markov structure of the state equation. From a technical point of view, the aim of the state space form is to set up $\alpha_t$ so that it has as small a number of elements as possible. Such a state space set-up is called a minimal realization, and it is a basic criterion for a good state space form.

6. In many cases of interest only one observation is available in each time period; that is, $z_t$ is now a scalar in the measurement equation. Also, the transition matrix is much simpler than given before, in the sense that the parameters, in most cases including the variance, are assumed to be time invariant. Thus the transition equation now becomes

$$\alpha_t = T \alpha_{t-1} + R \eta_t, \qquad t = 1, \ldots, T, \qquad (12)$$

and

$$\eta_t \sim WN(0, \sigma^2 Q). \qquad (13)$$

7. For many applications using the Kalman filter, the vector of exogenous variables is simply not necessary. One may also assume that the variance of the noise term is time invariant, so that the general system boils down to:

$$z_t = y_t' \alpha_t + N_t, \qquad t = 1, \ldots, T \qquad (14)$$

$$\alpha_t = T \alpha_{t-1} + R \eta_t, \qquad t = 1, \ldots, T. \qquad (15)$$

$z_t$ is now a scalar, $N_t \sim (0, \sigma^2 h)$, and $y_t'$ is a $(1 \times m)$ vector. In some state space applications, especially those that use ARMA models, the measurement error in the observation equation, i.e. $N_t$, is assumed to be zero; in such applications $N_t$ will simply be absent.

8. There are many ways to write a given system in state space form. But written in any way, if our primary interest is forecasting, we would get identical forecasts no matter which form we use. Note also that we can write any state space form as an ARMA model, so there is an equivalence between the two forms.


    Examples of state space representation:

Example 1: First let us consider the general ARMA(p, q) model and see how it can be cast in state space form. Defining $m = \max(p, q+1)$, an ARMA(p, q) model can be written in the form

$$z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_m z_{t-m} + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \cdots + \theta_{m-1} e_{t-m+1} + e_t$$

where we interpret $\phi_j = 0$ for $j > p$ and $\theta_j = 0$ for $j > q$. Then we can write the state and observation equations as follows:

State equation:

$$\alpha_t = \begin{bmatrix} \phi_1 & & \\ \phi_2 & & I_{m-1} \\ \vdots & & \\ \phi_m & & 0' \end{bmatrix} \alpha_{t-1} + \begin{bmatrix} 1 \\ \theta_1 \\ \vdots \\ \theta_{m-1} \end{bmatrix} e_t$$

Observation equation:

$$z_t = \begin{bmatrix} 1 & 0 & \ldots & 0 \end{bmatrix} \alpha_t.$$

The original model can be easily recovered by repeated substitution, starting at the bottom row of the state equation. We can easily note that the first element of the state vector is identically equal to the given model for $z_t$.
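For concreteness, here is a minimal Python/NumPy sketch that assembles these matrices; the helper name `arma_state_space` and the return convention are our own choices, not part of the notes.

```python
import numpy as np

def arma_state_space(phi, theta):
    """Build the matrices of Example 1 for an ARMA(p, q) model.

    phi: AR coefficients [phi_1, ..., phi_p]
    theta: MA coefficients [theta_1, ..., theta_q]
    Returns (T, R, y) for the state equation a_t = T a_{t-1} + R e_t
    and observation equation z_t = y' a_t.
    """
    p, q = len(phi), len(theta)
    m = max(p, q + 1)
    phi_full = np.r_[phi, np.zeros(m - p)]          # phi_j = 0 for j > p
    theta_full = np.r_[theta, np.zeros(m - 1 - q)]  # theta_j = 0 for j > q
    T = np.zeros((m, m))
    T[:, 0] = phi_full            # first column holds the AR coefficients
    T[:-1, 1:] = np.eye(m - 1)    # identity block shifts the state down
    R = np.r_[1.0, theta_full].reshape(m, 1)
    y = np.zeros(m)
    y[0] = 1.0                    # observation picks the first state element
    return T, R, y
```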

Example 2: Let us consider next a univariate AR(p) process:

$$z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_p z_{t-p} + e_t$$

where $\phi(B) = (1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p)$ is the AR operator and $e_t$ is white noise. This can be put in state space form by writing the $(m \times 1)$ state vector $\alpha_t$, where $m = p$ for the present case, as follows:

State equation:

$$\alpha_t = \begin{bmatrix} \phi_1 & & \\ \phi_2 & & I_{m-1} \\ \vdots & & \\ \phi_m & & 0' \end{bmatrix} \alpha_{t-1} + \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} e_t$$

Observation equation:

$$z_t = \begin{bmatrix} 1 & 0 & \ldots & 0 \end{bmatrix} \alpha_t.$$

Defining $\alpha_t = (\alpha_{1t}\ \alpha_{2t}\ \ldots\ \alpha_{mt})'$ and substituting from the bottom row, we get the original AR model.

Example 3: Let us consider the following ARMA(1, 1) model, for which $m = 2$:

$$z_t = \phi_1 z_{t-1} + \theta_1 e_{t-1} + e_t.$$

For this model the state and the measurement equations are given below:

State equation: $\alpha_t = \begin{bmatrix} \phi_1 & 1 \\ 0 & 0 \end{bmatrix} \alpha_{t-1} + \begin{bmatrix} 1 \\ \theta_1 \end{bmatrix} e_t$, and observation equation: $z_t = \begin{bmatrix} 1 & 0 \end{bmatrix} \alpha_t$.

If we define $\alpha_t = (\alpha_{1t}\ \alpha_{2t})'$, then

$$\alpha_{2t} = \theta_1 e_t$$

$$\alpha_{1t} = \phi_1 \alpha_{1,t-1} + \alpha_{2,t-1} + e_t = \phi_1 z_{t-1} + \theta_1 e_{t-1} + e_t,$$

and this is precisely the original model.

Example 4: As a final example, we shall consider the first order moving average model, assuming that the model has zero mean:

$$z_t = e_t + \theta_1 e_{t-1}.$$

Here $m = 2$, so the state and the measurement equations are given as follows:

State equation: $\alpha_t = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \alpha_{t-1} + \begin{bmatrix} 1 \\ \theta_1 \end{bmatrix} e_t$, and
observation equation: $z_t = \begin{bmatrix} 1 & 0 \end{bmatrix} \alpha_t$.

If we define $\alpha_t = (\alpha_{1t}\ \alpha_{2t})'$, then $\alpha_{2t} = \theta_1 e_t$ and $\alpha_{1t} = \alpha_{2,t-1} + e_t = e_t + \theta_1 e_{t-1}$, and this is precisely the original model.
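As a quick sanity check (a sketch of our own, not part of the original notes), one can simulate the MA(1) through this state space form and verify that it reproduces a direct simulation driven by the same shocks:

```python
import numpy as np

rng = np.random.default_rng(0)
theta1 = 0.5                       # illustrative value
T = np.array([[0.0, 1.0], [0.0, 0.0]])
R = np.array([[1.0], [theta1]])
y = np.array([1.0, 0.0])

e = rng.standard_normal(100)
alpha = np.zeros(2)                # start the state at zero (e_0 = 0)
z_ss = np.empty(100)
for t in range(100):
    alpha = T @ alpha + R.ravel() * e[t]    # state equation
    z_ss[t] = y @ alpha                     # observation equation

z_direct = e + theta1 * np.r_[0.0, e[:-1]]  # z_t = e_t + theta1 e_{t-1}
assert np.allclose(z_ss, z_direct)
```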


We have seen before that there are many ways of writing a given system in state space form. We shall here give an example of writing the AR(p) process in a different way.

Example 5: As before, let $m = p$. The state equation is given as:

State equation:

$$\underbrace{\begin{bmatrix} z_t \\ z_{t-1} \\ \vdots \\ z_{t-p+1} \end{bmatrix}}_{\alpha_t} = \underbrace{\begin{bmatrix} \phi_1 & \phi_2 & \ldots & \phi_{p-1} & \phi_p \\ 1 & 0 & \ldots & 0 & 0 \\ & & \ddots & & \vdots \\ 0 & \ldots & \ldots & 1 & 0 \end{bmatrix}}_{T} \underbrace{\begin{bmatrix} z_{t-1} \\ z_{t-2} \\ \vdots \\ z_{t-p} \end{bmatrix}}_{\alpha_{t-1}} + \underbrace{\begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}}_{R} e_t$$

Observation equation:

$$(z_t) = \underbrace{\begin{bmatrix} 1 & 0 & \ldots & 0 \end{bmatrix}}_{y_t'} \underbrace{\begin{bmatrix} z_t \\ z_{t-1} \\ \vdots \\ z_{t-p+1} \end{bmatrix}}_{\alpha_t}$$

In this case, by carrying out the matrix multiplication on the RHS of the state equation, we can notice that the first row gives the original AR model and the rest are trivial identities, including the observation equation.

Example 6: Let us take the ARMA(p, q) that we have seen before:

$$z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_m z_{t-m} + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \cdots + \theta_{m-1} e_{t-m+1} + e_t$$

where we interpret $\phi_j = 0$ for $j > p$ and $\theta_j = 0$ for $j > q$. We shall re-write it in a way different from what we saw in Example 1. Let $m = \max(p, q+1)$. Then we can write the state and observation equations as follows:

State equation:

$$\alpha_{t+1} = \begin{bmatrix} \phi_1 & \phi_2 & \ldots & \phi_{m-1} & \phi_m \\ 1 & 0 & \ldots & 0 & 0 \\ & & \ddots & & \vdots \\ 0 & 0 & \ldots & 1 & 0 \end{bmatrix} \alpha_t + \begin{bmatrix} e_{t+1} \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$

Observation equation:

$$z_t = \mu + \begin{bmatrix} 1 & \theta_1 & \ldots & \theta_{m-1} \end{bmatrix} \alpha_t.$$

We shall take the ARMA(1, 1) model and see how to write the state space form given in Example 6 and retrieve the original model. For ARMA(1, 1), $m = 2$, so the state and observation equations are:

State equation:

$$\alpha_{t+1} = \begin{bmatrix} \phi_1 & 0 \\ 1 & 0 \end{bmatrix} \alpha_t + \begin{bmatrix} 1 \\ 0 \end{bmatrix} e_{t+1},$$

Observation equation:

$$z_t = \mu + \begin{bmatrix} 1 & \theta_1 \end{bmatrix} \alpha_t.$$

Starting from the second row of the state equation, we have

$$\alpha_{2,t+1} = \alpha_{1,t}.$$

The first row of the state equation implies that

$$\alpha_{1,t+1} = \phi_1 \alpha_{1,t} + e_{t+1}$$

or

$$(1 - \phi_1 B)\,\alpha_{1,t+1} = e_{t+1}. \qquad (i)$$

The observation equation states that

$$z_t = \mu + \alpha_{1,t} + \theta_1 \alpha_{2,t} = \mu + \alpha_{1,t} + \theta_1 \alpha_{1,t-1} = \mu + (1 + \theta_1 B)\,\alpha_{1,t}. \qquad (ii)$$

Multiplying (ii) by $(1 - \phi_1 B)$ gives:

$$(1 - \phi_1 B)(z_t - \mu) = (1 - \phi_1 B)(1 + \theta_1 B)\,\alpha_{1,t} = (1 + \theta_1 B)\,e_t \quad [\text{from } (i)],$$

which is the given model.

Example 7: Let us take an example of a state space formulation for an economic problem. Fama and Gibbons (Journal of Monetary Economics, 1982, 9, pp. 297-323) use the state space idea to study the behaviour of the ex-ante real interest rate (defined as the nominal interest rate, $i_t$, minus the expected inflation rate, $\pi^e_t$). This is unobservable because we do not have data on the anticipated rate of inflation. Thus, the state variable is

$$\alpha_t = i_t - \pi^e_t - \mu,$$

where $\mu$ is the average ex-ante real interest rate. Fama and Gibbons assume that the ex-ante real interest rate follows an AR(1) process:

$$\alpha_{t+1} = \phi \alpha_t + e_{t+1}.$$

But an econometrician has data on the ex-post real interest rate (that is, the nominal interest rate, $i_t$, minus the actual rate of inflation, $\pi_t$). That is,

$$i_t - \pi_t = (i_t - \pi^e_t) + (\pi^e_t - \pi_t) = \mu + \alpha_t + \eta_t,$$

where $\eta_t = \pi^e_t - \pi_t$ is the error agents made in forecasting inflation. If people forecast optimally, then $\eta_t$ should be free of autocorrelation and should be uncorrelated with the ex-ante real interest rate.

Kalman Filter: An Overview

Consider the system given by the following equations:

$$z_t = y_t' \alpha_t + N_t, \qquad t = 1, \ldots, T$$

$$\alpha_t = T \alpha_{t-1} + R \eta_t, \qquad t = 1, \ldots, T.$$

Given this, our objectives could be either to obtain the values of unknown parameters or, given the parameter vector, to obtain the linear least squares forecasts of the state vector on the basis of observed data. The Kalman filter (KF hereafter) has many uses; we are utilising it as an algorithm to evaluate the components of the likelihood function. Kalman filtering follows a two-step procedure. In the first step, the optimal predictor for the next observation is formed, based on all the information currently available; this is done by the prediction equation. In the second step, the moment a new observation becomes available, it is incorporated into the estimator of the state vector using the updating equation. These two equations collectively form the Kalman filter equations. Applied recursively, the KF provides an optimal solution to the twin problems of prediction and updating. Assuming that the observations are normally distributed and that the current estimator of the state vector is the best available, the prediction and the updating estimators are the best; by best, we mean the estimators have the minimum mean squared error (MMSE). The process of predicting the next observation and updating the estimate as soon as the actual value becomes available has an interesting by-product: the prediction error. We have seen, in the chapter on estimation, how a set of dependent observations can be decomposed in terms of prediction errors; the KF gives us a natural mechanism to carry out this decomposition.

Kalman filter recursions: main equations

We shall use $a_t$ to denote the MMSE estimator of $\alpha_t$ based on all information up to and including the current observation $z_t$. Similarly, $a_{t|t-1}$ is the MMSE estimator of $\alpha_t$ at time $t-1$; that is, $a_{t|t-1} = E(\alpha_t | I_{t-1})$.

Prediction:

At time $t-1$, all available information, including $z_{t-1}$, is incorporated in $a_{t-1}$, which is the MMSE estimator of $\alpha_{t-1}$. The estimation error has covariance matrix $\sigma^2 P_{t-1}$; more precisely,

$$\sigma^2 P_{t-1} = E\left[(\alpha_{t-1} - a_{t-1})(\alpha_{t-1} - a_{t-1})'\right].$$

From

$$\alpha_t = T \alpha_{t-1} + R \eta_t,$$

we get that at time $t-1$ the MMSE estimator of $\alpha_t$ is given by

$$a_{t|t-1} = T a_{t-1},$$

so that the estimation error, or the sampling error, is given by

$$\alpha_t - a_{t|t-1} = T(\alpha_{t-1} - a_{t-1}) + R \eta_t.$$

The right hand side of this estimation error has zero expectation. We note here that an estimator is unconditionally unbiased (u-unbiased) if its estimation error has zero expectation, and when an estimator is u-unbiased its MSE matrix, $E[(\alpha_t - a_{t|t-1})(\alpha_t - a_{t|t-1})']$, is identical to the covariance matrix of the estimation error. Hence we can write the covariance of the estimation error as:

$$E\left[(\alpha_t - a_{t|t-1})(\alpha_t - a_{t|t-1})'\right] = E\left\{[T(\alpha_{t-1} - a_{t-1}) + R\eta_t][T(\alpha_{t-1} - a_{t-1}) + R\eta_t]'\right\}$$

$$= T\,E\left[(\alpha_{t-1} - a_{t-1})(\alpha_{t-1} - a_{t-1})'\right]T' + T\,E\left[(\alpha_{t-1} - a_{t-1})\eta_t'\right]R' + R\,E\left[\eta_t(\alpha_{t-1} - a_{t-1})'\right]T' + R\,E\left[\eta_t\eta_t'\right]R'$$

$$= \sigma^2\, T P_{t-1} T' + \sigma^2\, R Q R'.$$


Thus,

$$\left(\alpha_t - a_{t|t-1}\right) \sim WS\left(0, \sigma^2 P_{t|t-1}\right)$$

where

$$P_{t|t-1} = T P_{t-1} T' + R Q R'$$

and WS stands for wide sense (weak stationarity is sometimes referred to as wide sense stationarity).

Now, given that $a_{t|t-1}$ is the MMSE of $\alpha_t$ at time $t-1$, the MMSE of $z_t$ at time $t-1$ clearly is

$$z_{t|t-1} = y_t' a_{t|t-1}.$$

The associated prediction error is

$$z_t - z_{t|t-1} = \nu_t = y_t'\left(\alpha_t - a_{t|t-1}\right) + N_t,$$

the expectation of which is zero. Hence,

$$\text{var}(\nu_t) = E(\nu_t^2) = E\left[y_t'(\alpha_t - a_{t|t-1})(\alpha_t - a_{t|t-1})' y_t\right] + E(N_t^2)$$

[since the cross product terms have zero expectations]

$$= \sigma^2 y_t' P_{t|t-1} y_t + \sigma^2 h = \sigma^2 f_t.$$

Deriving the state updating equations is involved, and hence the important steps are relegated to the appendix; we state only the main equations below (a short code sketch of the full recursion follows the remarks).

Updating equation:

$$a_t = a_{t|t-1} + P_{t|t-1} y_t \left(z_t - y_t' a_{t|t-1}\right)/f_t,$$

and the estimation error satisfies

$$(\alpha_t - a_t) \sim WS(0, \sigma^2 P_t)$$

where

$$P_t = P_{t|t-1} - P_{t|t-1} y_t y_t' P_{t|t-1}/f_t, \qquad f_t = y_t' P_{t|t-1} y_t + h.$$

    We have to highlight the following points.


1. Note the role played by the prediction error, $\nu_t = z_t - y_t' a_{t|t-1}$, and the variance associated with it, $\sigma^2 f_t$.

2. Note also the $(m \times 1)$ vector $P_{t|t-1} y_t / f_t$, which is called the Kalman gain.

3. In the discussion so far, we have assumed the presence of an additional noise in the measurement equation; that is, $h > 0$. But note that in our examples of state space representation of ARMA models we assumed that the measurement equation has no additional error; that is, $N_t$ is assumed to be zero, implying that $h$, the variance of the measurement error term, will be zero. This should not matter, however, since through these adjustments $h$ has been isolated as an additive scalar, which, when it becomes zero, does not affect our calculations (note the expression for $f_t$).
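Collecting the prediction and updating equations, a minimal NumPy sketch of one pass of the recursions for the time-invariant system (14)-(15) is given below; the function name, argument layout and storage choices are ours, not part of the notes.

```python
import numpy as np

def kalman_filter(z, T, R, y, Q, h, a0, P0):
    """KF recursions for z_t = y' a_t + N_t,  a_t = T a_{t-1} + R eta_t.

    a0, P0 are the initial state mean and (scaled) error covariance.
    Returns the prediction errors nu_t and variance factors f_t,
    which are the ingredients of the likelihood."""
    a = np.asarray(a0, dtype=float)
    P = np.asarray(P0, dtype=float)
    nu = np.empty(len(z))
    f = np.empty(len(z))
    for t, zt in enumerate(z):
        # prediction: a_{t|t-1} = T a_{t-1}, P_{t|t-1} = T P T' + R Q R'
        a_pred = T @ a
        P_pred = T @ P @ T.T + R @ Q @ R.T
        # prediction error and its scaled variance f_t = y' P_{t|t-1} y + h
        f[t] = y @ P_pred @ y + h
        nu[t] = zt - y @ a_pred
        # updating; K = P_{t|t-1} y / f_t is the Kalman gain
        K = P_pred @ y / f[t]
        a = a_pred + K * nu[t]
        P = P_pred - np.outer(K, y @ P_pred)
    return nu, f
```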

ML Estimation of ARMA Models

The literature has many algorithms aimed at simplifying the computation of the components of the likelihood. One approach is to use the Kalman filter recursions. Other useful algorithms are by Newbold (Biometrika, 1974, Vol. 61, 423-26) and the innovations algorithm suggested by Ansley (Biometrika, 1979, Vol. 66, 59-65).

KF recursions are useful for a number of purposes, but our emphasis will be on understanding how these recursions (1) can be used to construct linear least squares forecasts of the state vector on the basis of data observed through time $t$, and (2) use the resulting prediction error and its variance to build the components of the likelihood function. In our derivation so far, we have motivated the discussion of the Kalman filter in terms of linear projections of the state vector $\alpha_t$ and the observed time series $z_t$. These are linear forecasts, and they are optimal among all functions of the data if we assume that the state vector and the disturbances are multivariate Gaussian. Our main aim is to see how the KF recursions calculate these forecasts recursively, generating $a_{1|0}, a_{2|1}, \ldots, a_{T|T-1}$ and $P_{1|0}, P_{2|1}, \ldots, P_{T|T-1}$ in succession.

How do we start the recursions?

To start the recursions, we need $a_{1|0}$. This means we should get the first period forecast of the state based on an information set. Since we don't have information on the zeroth period, we take the unconditional expectation,

$$a_{1|0} = E(\alpha_1),$$

where the associated estimation error has zero mean and covariance matrix $\sigma^2 P_{1|0}$.


Let us explain this with the help of an example.

Example 8: Let us take the simplest MA(1) model,

$$z_t = e_t + \theta_1 e_{t-1}.$$

We have shown before that the state vector is simply

$$\alpha_t = \begin{bmatrix} z_t \\ \theta_1 e_t \end{bmatrix}$$

and hence

$$a_{1|0} = E\begin{bmatrix} z_1 \\ \theta_1 e_1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$$

And the associated variance matrix of the estimation error, $\sigma^2 P_0$ or $\sigma^2 P_{1|0}$, is simply $E(\alpha_1 \alpha_1')$, so that we have

$$P_{1|0} = \sigma^{-2} E(\alpha_1 \alpha_1') = \sigma^{-2} E\left\{\begin{bmatrix} z_1 \\ \theta_1 e_1 \end{bmatrix}\begin{bmatrix} z_1 & \theta_1 e_1 \end{bmatrix}\right\} = \begin{bmatrix} 1 + \theta_1^2 & \theta_1 \\ \theta_1 & \theta_1^2 \end{bmatrix}.$$

While one can work out by hand the covariance matrix for the initial state vector for pure MA models, this turns out to be too tedious for higher order mixed models. So we need a closed form solution to calculate this matrix, which we get by generalising this approach. Generalisation is easy if we can make prior assumptions about the distribution of the state vector.

Two categories of state vector can be distinguished, depending on whether or not the state vector is covariance stationary. If it is, then the distribution of the state vector is readily available, and with that the problem of starting values can be easily resolved. With the assumption that the state vector is covariance stationary, one can easily check from the state equation that the unconditional mean of the state vector is zero; that is,

$$E(\alpha_t) = 0,$$

and the unconditional variance of $\alpha_t$ is easily seen to be

$$E(\alpha_t \alpha_t') = E\left[(T\alpha_{t-1} + R\eta_t)(T\alpha_{t-1} + R\eta_t)'\right].$$

Let us denote the LHS of the above expression by $\Sigma$. Noting that the state vector depends on shocks only up to $t-1$, we get

$$\Sigma = T \Sigma T' + R Q R'.$$

Though this can be solved in many ways, a direct closed form solution is given by the following matrix lemma, using the vec operator.

Proposition: Let $A$, $B$ and $C$ be matrices such that the product $ABC$ exists. Then

$$\text{vec}(ABC) = (C' \otimes A)\,\text{vec}(B).$$

Thus, we vectorize both sides of the expression for $\Sigma$ and rearrange to get a closed form solution,

$$\text{vec}(\Sigma) = \left[I_{m^2} - (T \otimes T)\right]^{-1} \text{vec}(RQR').$$

What this implies is that, provided the process is covariance stationary, the Kalman filter recursions can be started with $a_{1|0} = 0$, and the $(m \times m)$ matrix $P_{1|0}$, whose elements can be expressed as a column vector, is obtained from:

$$\text{vec}(P_{1|0}) = \left[I_{m^2} - (T \otimes T)\right]^{-1} \text{vec}(RQR').$$
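This closed form is a one-liner with the Kronecker product. Below is a small sketch (the helper name is ours), using the column-stacking vec convention:

```python
import numpy as np

def initial_state_cov(T, R, Q):
    """Stationary initialisation: solve vec(P) = [I - (T kron T)]^{-1} vec(RQR').

    vec stacks columns, hence the Fortran ('F') ordering below."""
    m = T.shape[0]
    RQR = R @ Q @ R.T
    vecP = np.linalg.solve(np.eye(m * m) - np.kron(T, T),
                           RQR.ravel(order="F"))
    return vecP.reshape(m, m, order="F")

# For the MA(1) of Example 8 this reproduces P_{1|0}:
th = 0.5
P = initial_state_cov(np.array([[0.0, 1.0], [0.0, 0.0]]),
                      np.array([[1.0], [th]]),
                      np.array([[1.0]]))
# P == [[1 + th**2, th], [th, th**2]]
```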

The best way to get a grasp of the Kalman recursions is to try them out on a simple model. Let us try them on the simple MA(1) model.

Example 9: Assume for convenience that the process has zero mean. So the MA(1) model can be written as

$$z_t = e_t + \theta_1 e_{t-1}.$$

Here $m = 2$. So, from Example 3, we have the state and measurement equations given as follows:

State equation: $\alpha_t = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \alpha_{t-1} + \begin{bmatrix} 1 \\ \theta_1 \end{bmatrix} e_t$, and
observation equation: $z_t = \begin{bmatrix} 1 & 0 \end{bmatrix} \alpha_t$.

Note that the observation equation has no error. How do we start the recursions? Recall from the prediction equation that we have to first get $a_{t|t-1}$; that is, for the first period, we need $a_{1|0}$, the initial state vector. From our discussion of the covariance stationarity properties of the state vector, it is clear that

$$a_{1|0} = T a_0 = 0.$$

Next we have to calculate the covariance matrix of the estimation error, i.e. $\sigma^2 P_{1|0}$ or $\sigma^2 P_0$. Though we have a formula to calculate such matrices, for the present problem one can find it directly:

$$P_{1|0} = P_0 = \sigma^{-2} E(\alpha_1 \alpha_1') = \sigma^{-2} E\left\{\begin{bmatrix} z_1 \\ \theta_1 e_1 \end{bmatrix}\begin{bmatrix} z_1 & \theta_1 e_1 \end{bmatrix}\right\} = \begin{bmatrix} 1 + \theta_1^2 & \theta_1 \\ \theta_1 & \theta_1^2 \end{bmatrix}.$$

Let us calculate the prediction error for $z_1$. One can easily see that $z_{1|0} = 0$, and hence the associated prediction error is $\nu_1 = z_1$ itself, and the prediction error variance is given as:

$$\text{var}(\nu_1) = \sigma^2 \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} 1 + \theta_1^2 & \theta_1 \\ \theta_1 & \theta_1^2 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \sigma^2 (1 + \theta_1^2), \quad \text{with } f_1 = 1 + \theta_1^2.$$

Application of the updating formula:

$$a_1 = a_{1|0} + P_{1|0} y_1 (z_1 - y_1' a_{1|0})/f_1 = \begin{bmatrix} 1 + \theta_1^2 & \theta_1 \\ \theta_1 & \theta_1^2 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix}\frac{z_1}{1 + \theta_1^2} = \begin{bmatrix} (1 + \theta_1^2) z_1 \\ \theta_1 z_1 \end{bmatrix}\frac{1}{1 + \theta_1^2} = \begin{bmatrix} z_1 \\ \dfrac{\theta_1 z_1}{1 + \theta_1^2} \end{bmatrix}.$$


Similarly,

$$P_1 = P_{1|0} - P_{1|0} y_1 y_1' P_{1|0}/f_1 = \begin{bmatrix} 1 + \theta_1^2 & \theta_1 \\ \theta_1 & \theta_1^2 \end{bmatrix} - \begin{bmatrix} 1 + \theta_1^2 & \theta_1 \\ \theta_1 & \theta_1^2 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix}\begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} 1 + \theta_1^2 & \theta_1 \\ \theta_1 & \theta_1^2 \end{bmatrix}\frac{1}{1 + \theta_1^2} = \begin{bmatrix} 0 & 0 \\ 0 & \dfrac{\theta_1^4}{1 + \theta_1^2} \end{bmatrix}.$$

Prediction equation for $\alpha_2$:

$$a_{2|1} = T a_1 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} z_1 \\ \dfrac{\theta_1 z_1}{1 + \theta_1^2} \end{bmatrix} = \begin{bmatrix} \dfrac{\theta_1 z_1}{1 + \theta_1^2} \\ 0 \end{bmatrix}.$$

And,

$$P_{2|1} = T P_1 T' + R Q R' = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} 0 & 0 \\ 0 & \dfrac{\theta_1^4}{1 + \theta_1^2} \end{bmatrix}\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} + \begin{bmatrix} 1 & \theta_1 \\ \theta_1 & \theta_1^2 \end{bmatrix} = \begin{bmatrix} \dfrac{\theta_1^4}{1 + \theta_1^2} & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} 1 & \theta_1 \\ \theta_1 & \theta_1^2 \end{bmatrix} = \begin{bmatrix} \dfrac{1 + \theta_1^2 + \theta_1^4}{1 + \theta_1^2} & \theta_1 \\ \theta_1 & \theta_1^2 \end{bmatrix}.$$

Predicting $z_2$:

$$z_{2|1} = \begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} \dfrac{\theta_1 z_1}{1 + \theta_1^2} \\ 0 \end{bmatrix} = \frac{\theta_1 z_1}{1 + \theta_1^2}.$$

Prediction error $\nu_2$:

$$\nu_2 = z_2 - \frac{\theta_1 z_1}{1 + \theta_1^2},$$


and

$$f_2 = \begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} \dfrac{1 + \theta_1^2 + \theta_1^4}{1 + \theta_1^2} & \theta_1 \\ \theta_1 & \theta_1^2 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \frac{1 + \theta_1^2 + \theta_1^4}{1 + \theta_1^2}.$$

These steps show that, for the MA(1) model, one can calculate the prediction error and its variance using the following recursions:

$$\nu_t = z_t - \frac{\theta_1 \nu_{t-1}}{f_{t-1}}, \qquad t = 1, 2, \ldots, T, \quad \text{where } \nu_0 = 0, \text{ and}$$

$$f_t = \frac{1 + \theta_1^2 + \cdots + \theta_1^{2t}}{1 + \theta_1^2 + \cdots + \theta_1^{2(t-1)}}.$$

Note here that the expressions for the prediction error $\nu_t$ and the prediction error variance $f_t$ are exactly the same as those obtained using triangular factorization for the MA(1) model.
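These recursions translate directly into code; a minimal sketch (names ours):

```python
import numpy as np

def ma1_prediction_errors(z, theta1):
    """Prediction errors nu_t and variance factors f_t for an MA(1),
    following the recursions above (nu_0 = 0, so f_0 is immaterial)."""
    nu = np.zeros(len(z) + 1)   # nu[0] is the starting value nu_0 = 0
    f = np.ones(len(z) + 1)     # f[0] unused since nu_0 = 0
    s_prev = 1.0                # partial sum 1 + th^2 + ... + th^(2(t-1))
    for t in range(1, len(z) + 1):
        s = s_prev + theta1 ** (2 * t)
        f[t] = s / s_prev
        nu[t] = z[t - 1] - theta1 * nu[t - 1] / f[t - 1]
        s_prev = s
    return nu[1:], f[1:]
```

For $h = 0$ and the Example 3 matrices, this should agree with the general `kalman_filter` sketch given earlier.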

    -

As a final step towards finalising the likelihood function, we shall note the following further simplification. Recall that we had decomposed the likelihood for a set of dependent observations into a likelihood for the independent errors, using the concept of prediction error decomposition, as:

$$\log L(z) = -\frac{T}{2}\log 2\pi - \frac{T}{2}\log \sigma^2 - \frac{1}{2}\sum_{t=1}^{T}\log f_t - \frac{1}{2\sigma^2}\sum_{t=1}^{T}\nu_t^2/f_t.$$

From our derivation, we can see that the $\nu_t$ and $f_t$ do not depend on $\sigma^2$, and hence we can concentrate $\sigma^2$ out. This means we differentiate the log-likelihood with respect to $\sigma^2$ and get an estimator for $\sigma^2$, say $\tilde{\sigma}^2$. So we get

$$\tilde{\sigma}^2 = \frac{1}{T}\sum_{t=1}^{T}\frac{\nu_t^2}{f_t}.$$

Evaluating the log-likelihood at $\sigma^2 = \tilde{\sigma}^2$ and simplifying, we get

$$\log L(z)_c = -\frac{T}{2}\left(\log 2\pi + 1\right) - \frac{1}{2}\sum_{t=1}^{T}\log f_t - \frac{T}{2}\log \tilde{\sigma}^2.$$

We either maximize this log likelihood or, equivalently, minimize

$$\sum_{t=1}^{T}\log f_t + T\log \tilde{\sigma}^2.$$
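Putting the pieces together, the criterion can be evaluated as below (a sketch building on the `ma1_prediction_errors` helper above; the grid search is only illustrative):

```python
import numpy as np

def neg_concentrated_loglik(theta1, z):
    """The criterion sum(log f_t) + T log(sigma2_tilde) for the MA(1)."""
    nu, f = ma1_prediction_errors(z, theta1)
    sigma2 = np.mean(nu ** 2 / f)   # concentrated-out scale estimate
    return np.sum(np.log(f)) + len(z) * np.log(sigma2)

# Illustrative grid search over the invertibility region |theta1| < 1:
# grid = np.linspace(-0.95, 0.95, 191)
# theta_hat = grid[np.argmin([neg_concentrated_loglik(t, z) for t in grid])]
```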


One can make an initial guess about the underlying parameters and either apply numerical estimation procedures to calculate the derivatives or calculate them analytically by differentiating the Kalman recursions. In either case one has to keep in mind the restrictions to be imposed on the parameters, especially the MA parameters, to take care of the identification problem. Also, it has been shown in the literature that using the Kalman recursions to estimate pure AR models is really not necessary.

    -

    Kalman Smoothing

We have motivated the discussion of the Kalman filter so far as an algorithm for predicting the state vector, obtaining exact finite sample forecasts as a linear function of past observations. We have also shown how the resulting prediction error and the prediction error variance can be used to evaluate the log-likelihood.

This is sub-optimal if we are interested in estimating the sequence of states. In many cases, the Kalman filter is used to obtain an estimate of the state vector itself. For example, in their model of the business cycle, Stock and Watson show how one may be interested in knowing the "state of the economy", or the phase of the business cycle the economy is in, which is unobservable at any given historical point. Stock and Watson suggest that comovements in many macro aggregates have a common element, which may be called the state of the economy, and this is unobservable. They motivate the use of the Kalman filter to obtain an estimate of this unobserved state of the economy.

Sometimes elements of the state vector are even interpreted as estimates of missing observations, which could be higher frequency data points extracted from an observable lower frequency series, or simply an estimate of a missing data point. For example, if we have data on a macro aggregate from 1955 through 2014, we may be interested in obtaining an estimate for 1970, which may be missing. Or we may be interested in extracting monthly data from quarterly data.

Such estimates of the unobserved state of the economy or of missing observations can be obtained from smoothed estimates of the state vector, $\alpha_t$.

Each step of the Kalman recursions gives an estimate of the state vector, $\alpha_t$, given all current and past observations. But an econometrician should use all available information to estimate the sequence of states. The Kalman smoother provides these estimates. The smoothed estimator which utilises all the sample observations is given by


$$a_{t|T} = E(\alpha_t | I_T)$$

and the MSE of this smoothed estimate is denoted

$$P_{t|T} = E\left[(\alpha_t - a_{t|T})(\alpha_t - a_{t|T})'\right].$$

The smoothing equations start from $a_{T|T}$ and $P_{T|T}$ and work backwards. The expressions for $a_{t|T}$ and $P_{t|T}$, which may be called the smoothing algorithm, are given below without proof:

$$a_{t|T} = a_t + P_t^{*}\left(a_{t+1|T} - T_{t+1} a_t\right)$$

$$P_{t|T} = P_t + P_t^{*}\left(P_{t+1|T} - P_{t+1|t}\right)P_t^{*\prime}$$

where

$$P_t^{*} = P_t T_{t+1}' P_{t+1|t}^{-1}, \qquad t = T-1, \ldots, 1,$$

with $a_{T|T} = a_T$ and $P_{T|T} = P_T$.

A set of direct residuals can also be obtained from the smoothed estimators:

$$e_t = z_t - y_t' a_{t|T}, \qquad t = 1, \ldots, T.$$

These are not to be confused with the prediction residuals, $\nu_t$, defined earlier.
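The backward pass translates into a short routine; a sketch assuming the filtered and predicted moments have been stored during the forward pass (the function name and storage layout are our choices):

```python
import numpy as np

def kalman_smoother(T, a_filt, P_filt, P_pred):
    """Fixed-interval smoothing via the backward recursions above.

    a_filt (n, m), P_filt (n, m, m): updated a_t, P_t, t = 1..n;
    P_pred (n, m, m): predicted P_{t|t-1}. Returns a_{t|T}, P_{t|T}."""
    n = a_filt.shape[0]
    a_sm, P_sm = a_filt.copy(), P_filt.copy()  # a_{T|T} = a_T, P_{T|T} = P_T
    for t in range(n - 2, -1, -1):
        Pstar = P_filt[t] @ T.T @ np.linalg.inv(P_pred[t + 1])
        a_sm[t] = a_filt[t] + Pstar @ (a_sm[t + 1] - T @ a_filt[t])
        P_sm[t] = P_filt[t] + Pstar @ (P_sm[t + 1] - P_pred[t + 1]) @ Pstar.T
    return a_sm, P_sm
```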

    -

We shall explain the smoothing algorithm with an example. Consider the simple model

$$z_t = \mu_t + \epsilon_t, \qquad \epsilon_t \sim WN(0, \sigma^2)$$

$$\mu_t = \mu_{t-1} + \eta_t, \qquad \eta_t \sim WN(0, \sigma^2 q)$$

where the state, $\mu_t$, and the observation, $z_t$, are scalars. The state, which follows a random walk process, cannot be observed directly as it is contaminated by noise. This is the simple signal plus noise model. We assume that $q$ is known. Also note that in this example we have allowed the observation $z_t$ to be measured with error, $\epsilon_t$. For this example, note that $T = 1$, $R = 1$ and $y_t = 1$.


The prediction equations for this example are

$$a_{t|t-1} = a_{t-1}, \qquad P_{t|t-1} = P_{t-1} + q,$$

and the updating equations are

$$a_t = a_{t|t-1} + P_{t|t-1}\left(z_t - a_{t|t-1}\right)/(P_{t|t-1} + 1)$$

and

$$P_t = P_{t|t-1} - P_{t|t-1}^2/(P_{t|t-1} + 1).$$

We shall demonstrate how to predict, update and smooth with 4 observations: $z_1 = 4.4$, $z_2 = 4.0$, $z_3 = 3.5$ and $z_4 = 4.6$. The initial state has the property $\mu_0 \sim N(a_0, \sigma^2 P_0)$, and we are given $a_0 = 4$, $P_0 = 12$ and $q = 4$, so that $RQR' = 4$ and $h = 1$.

From the prediction equations we have $a_{1|0} = 4$ and $P_{1|0} = 16$, so that from the updating equations we have

$$a_1 = 4 + (12 + 4)(4.4 - 4)/(12 + 4 + 1) = 4.376$$

and

$$P_1 = 16 - 16^2/17 = 0.941.$$

Since $y_t = 1$ in the measurement equation for all $t$, the MMSE of $z_t$ is always $a_{t|t-1}$. So $z_{2|1} = a_{2|1} = a_1 = 4.376$.

Repeating the calculations for $t = 2, 3$ and $4$, we get the following results:

Smoothed estimators and residuals

t          1       2       3       4
z_t        4.4     4.0     3.5     4.6
a_t        4.376   4.063   3.597   4.428
P_t        0.941   0.832   0.829   0.828
nu_t       0.400  -0.376  -0.563   1.003
a_{t|T}    4.306   4.007   3.739   4.428
P_{t|T}    0.785   0.710   0.711   0.828
e_t        0.094  -0.007  -0.239   0.172


From the above table we also have: $a_{2|1} = 4.376$, $P_{2|1} = 4.941$, $a_{3|2} = 4.063$, $P_{3|2} = 4.832$, $a_{4|3} = 3.597$ and $P_{4|3} = 4.829$. From the table, the final estimates are seen to be $a_4 = 4.428$ and $P_4 = 0.828$.

These values can now be used in the smoothing algorithm, which for the current example reduces to

$$a_{t|T} = a_t + (P_t/P_{t+1|t})\left(a_{t+1|T} - a_t\right)$$

$$P_{t|T} = P_t + (P_t/P_{t+1|t})^2\left(P_{t+1|T} - P_{t+1|t}\right), \qquad t = T-1, \ldots, 1.$$

Since $a_{4|4} = a_4$ and $P_{4|4} = P_4$, we can apply the smoothing algorithm to obtain the smoothed estimates $a_{3|4}$ and $P_{3|4}$ and work backwards. So we have

$$a_{3|4} = 3.597 + (0.829/4.829)(4.428 - 3.597) = 3.739$$

$$P_{3|4} = 0.829 + (0.829/4.829)^2(0.828 - 4.829) = 0.711.$$

The rest of the smoothed estimates are displayed in the table above; the smoothed estimates of the unobserved state are given by the row $a_{t|T}$. Both the direct and the prediction error residuals have been calculated using the formulae $e_t = z_t - a_{t|T}$ and $\nu_t = z_t - a_{t-1}$ respectively.
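The whole worked example can be replicated in a few lines; a sketch using the constants given above (the variable names are our own):

```python
# Signal-plus-noise example: q = 4, h = 1, a_0 = 4, P_0 = 12,
# scalar state with T = R = y = 1, observations as in the text.
z = [4.4, 4.0, 3.5, 4.6]
a, P = 4.0, 12.0
a_filt, P_filt, P_pred, nu = [], [], [], []
for zt in z:
    ap, Pp = a, P + 4.0                    # prediction equations
    nu.append(zt - ap)                     # prediction error
    a = ap + Pp * (zt - ap) / (Pp + 1.0)   # updating equations
    P = Pp - Pp ** 2 / (Pp + 1.0)
    a_filt.append(a); P_filt.append(P); P_pred.append(Pp)

# backward smoothing pass: a_{t|T} = a_t + (P_t/P_{t+1|t})(a_{t+1|T} - a_t)
a_sm, P_sm = a_filt[:], P_filt[:]
for t in (2, 1, 0):
    Pstar = P_filt[t] / P_pred[t + 1]
    a_sm[t] = a_filt[t] + Pstar * (a_sm[t + 1] - a_filt[t])
    P_sm[t] = P_filt[t] + Pstar ** 2 * (P_sm[t + 1] - P_pred[t + 1])

print([round(x, 3) for x in a_filt])  # [4.376, 4.063, 3.597, 4.428]
print([round(x, 3) for x in a_sm])    # [4.306, 4.008, 3.739, 4.428]
```

The output matches the table up to rounding of the intermediate values.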


    Appendix

    Derivation of updating equations

In this appendix we shall derive the important steps leading to the updating equation and the associated variance matrix of the estimation error. Before discussing the steps involved, we digress briefly into the following important material.

1. Consider the model:

$$\underset{(T\times 1)}{Z} = \underset{(T\times m)}{Y}\;\underset{(m\times 1)}{\beta} + \underset{(T\times 1)}{N}, \qquad N \sim (0, \sigma^2 \Omega).$$

We shall call this model the sample information.

(a) Case 1: If $\beta$ is fixed in the above model, we have the usual GLS estimator, given as $(Y'\Omega^{-1}Y)^{-1}Y'\Omega^{-1}Z$, and this would be BLUE.

(b) Case 2: Suppose the vector $\beta$ is either partially or fully random (stochastic). The question now is: is the GLS estimator still BLUE? The answer is that it still is, according to the extended Gauss-Markov theorem enunciated by Duncan and Horn (JASA, 1972, pp. 815-21). They proved that the GLS estimator now satisfies the condition of being the best linear unconditionally unbiased (u-unbiased) estimator. [An estimator is u-unbiased if its estimation error has expectation zero.]

(c) Case 3: Suppose that $\beta$ is still fully or partially random, and additionally suppose that we have some prior information about it. How can we use it to update the estimator of $\beta$ already obtained? This becomes a special case of the mixed estimation procedure developed by Theil and Goldberger (see Theil, Principles of Econometrics, pp. 347-52), where we incorporate such prior information with the sample information. Suppose in our case the prior information is given in the form below:

$$(\beta_0 - \beta) \sim (0, \sigma^2 P_0),$$

where $\beta_0$ is a known vector and $P_0$ is a known positive definite matrix. Then, to get an updated estimator that combines this prior information with the sample information, we first construct the augmented model:

$$\begin{bmatrix} \beta_0 \\ Z \end{bmatrix} = \begin{bmatrix} I \\ Y \end{bmatrix}\beta + \begin{bmatrix} \beta_0 - \beta \\ N \end{bmatrix}$$


More concisely,

$$Z^{*} = Y^{*}\beta + N^{*}, \qquad E(N^{*}) = 0, \qquad E(N^{*}N^{*\prime}) = \sigma^2 V = \sigma^2\begin{bmatrix} P_0 & 0 \\ 0 & \Omega \end{bmatrix}.$$

Using the extended Gauss-Markov theorem, we have the estimator of $\beta$ given as:

$$\hat{\beta} = \left(Y^{*\prime}V^{-1}Y^{*}\right)^{-1}Y^{*\prime}V^{-1}Z^{*}.$$

Using the original notation, this can be re-written as:

$$\hat{\beta} = P^{*}\left(P_0^{-1}\beta_0 + Y'\Omega^{-1}Z\right), \qquad \text{where } P^{*} = \left(P_0^{-1} + Y'\Omega^{-1}Y\right)^{-1}.$$

$\hat{\beta}$ is now the updated MMSE of $\beta$, with

$$(\hat{\beta} - \beta) \sim (0, \sigma^2 P^{*}).$$

We are going to use this principle of combining sample information and prior information in deriving the updating equation of the KF recursion.

Updating the state vector

The role of the updating equation is to incorporate the new information in $z_t$, the moment we are at time $t$, with the information already available in the estimator $a_{t|t-1}$. This problem is directly analogous to the one we discussed under the extended Gauss-Markov theorem and Theil's mixed estimation procedure, where prior information was combined with the sample information. For our case, the prior information is in

$$\left(a_{t|t-1} - \alpha_t\right) \sim \left(0, \sigma^2 P_{t|t-1}\right),$$

while the sample information is derived from the measurement equation. Thus the augmented model is:

$$a_{t|t-1} = \alpha_t + \left(a_{t|t-1} - \alpha_t\right)$$

$$z_t = y_t'\alpha_t + N_t.$$

In matrix notation,

$$\begin{bmatrix} a_{t|t-1} \\ z_t \end{bmatrix} = \begin{bmatrix} I \\ y_t' \end{bmatrix}\alpha_t + \begin{bmatrix} a_{t|t-1} - \alpha_t \\ N_t \end{bmatrix}.$$

The disturbance term has zero expectation and covariance matrix

$$E\left\{\begin{bmatrix} a_{t|t-1} - \alpha_t \\ N_t \end{bmatrix}\begin{bmatrix} a_{t|t-1} - \alpha_t \\ N_t \end{bmatrix}'\right\} = \sigma^2\begin{bmatrix} P_{t|t-1} & 0 \\ 0 & h \end{bmatrix}.$$


More precisely,

$$Z_t^{*} = Y_t^{*}\alpha_t + e_t^{*},$$

where $E(e_t^{*}) = 0$ and $E(e_t^{*}e_t^{*\prime}) = \sigma^2 V$, with

$$V = \begin{bmatrix} P_{t|t-1} & 0 \\ 0 & h \end{bmatrix}.$$

Now, using the extended Gauss-Markov theorem, we can write

$$a_t = \left(Y_t^{*\prime}V^{-1}Y_t^{*}\right)^{-1}Y_t^{*\prime}V^{-1}Z_t^{*}.$$

Using the original notation, we can re-write the expression for $a_t$ as follows:

$$a_t = P_t\left(P_{t|t-1}^{-1}a_{t|t-1} + y_t z_t/h\right)$$

where

$$P_t = \left(P_{t|t-1}^{-1} + y_t y_t'/h\right)^{-1}.$$

Thus

$$(a_t - \alpha_t) \sim \left(0, \sigma^2 P_t\right).$$

The updating formula can be put in a different way using a matrix inversion lemma. The advantage of such an adjustment is that we don't have to invert any matrix in the updating equations.

Lemma: For any $(n \times n)$ matrix $D$ defined by

$$D = \left(A + BCB'\right)^{-1},$$

where $A$ and $C$ are non-singular matrices of order $n$ and $m$ respectively and $B$ is $(n \times m)$, we have:

$$D = A^{-1} - A^{-1}B\left(C^{-1} + B'A^{-1}B\right)^{-1}B'A^{-1}.$$

We can use this lemma on the expression for $P_t$ by noting that $P_t = D$, $P_{t|t-1}^{-1} = A$, $y_t = B$ and $C = h^{-1}$, and it follows that

$$P_t = P_{t|t-1} - P_{t|t-1}y_t y_t'P_{t|t-1}/f_t, \qquad \text{where } f_t = y_t'P_{t|t-1}y_t + h.$$
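A quick numerical sanity check of the lemma (our own illustration; the random symmetric positive definite matrices and dimensions are arbitrary choices):

```python
import numpy as np
from numpy.linalg import inv

rng = np.random.default_rng(1)
n, m = 4, 2
M = rng.standard_normal((n, n)); A = M @ M.T + n * np.eye(n)  # non-singular
G = rng.standard_normal((m, m)); C = G @ G.T + m * np.eye(m)
B = rng.standard_normal((n, m))

lhs = inv(A + B @ C @ B.T)
rhs = inv(A) - inv(A) @ B @ inv(inv(C) + B.T @ inv(A) @ B) @ B.T @ inv(A)
assert np.allclose(lhs, rhs)
```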


One can make it even more compact by writing

$$a_t = \left(P_{t|t-1} - P_{t|t-1}y_t y_t'P_{t|t-1}/f_t\right)\left(P_{t|t-1}^{-1}a_{t|t-1} + y_t z_t/h\right)$$

$$= a_{t|t-1} + P_{t|t-1}y_t\left(z_t/h - y_t'a_{t|t-1}/f_t - y_t'P_{t|t-1}y_t z_t/(f_t h)\right)$$

$$= a_{t|t-1} + f_t^{-1}P_{t|t-1}y_t\left(z_t f_t/h - y_t'a_{t|t-1} - y_t'P_{t|t-1}y_t z_t/h\right).$$

Substituting $f_t - h = y_t'P_{t|t-1}y_t$ in the above term and re-arranging, we get

$$a_t = a_{t|t-1} + P_{t|t-1}y_t\left(z_t - y_t'a_{t|t-1}\right)/f_t.$$

Note that the expressions for $a_t$ and $P_t$ in this appendix are exactly the ones we used as the updating equation and the variance matrix of the estimation error, respectively, in the main text.

Note also that in the discussion so far we have assumed the presence of an additional noise in the measurement equation; that is, $h > 0$. If we don't, then note that $V$ would become singular. But we also have to note that, in our examples of state space representation of ARMA models, we have assumed that the measurement equation has no additional error. This should not matter, however, since through these adjustments we have isolated the variance component as an additive scalar, which, when it becomes zero, does not affect our calculations.