e4man.pdf
TRANSCRIPT
Last revision: June 2000
Time Series Analysis using
MATLAB, Including a Complete MATLAB Toolbox
By Jaime Terceiro, José Manuel Casals, Miguel Jerez, Gregorio R.
Serrano and Sonia Sotoca
Contents

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-1
How the toolbox works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2
Installing the toolbox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
Contents of the book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-4
Description of the models supported . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1
State-space models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1
The state-space model with fixed coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1
The steady-state innovations model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3
Simple models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-4
The structural econometric model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-4
The VARMAX model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-5
The transfer function model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-5
Composite models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6
Nested models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6
Nesting in inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6
Nesting in noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-7
Component models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-7
Models with GARCH errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-8
Modeling options supported . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-9
Defining models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-1
General ideas about model definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-1
The THD format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-1
General rules about parameter matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2
Defining state-space and structural time series models . . . . . . . . . . . . . . . . . . . . . . . . . 3-3
Defining simple models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-6
Defining nested models in inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-8
Defining component models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-12
Model estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-1
Modification of toolbox options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-1
Evaluation of the likelihood function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-3
Models with homoscedastic errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-3
Models with GARCH errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-4
Initial conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-4
Computation of the gradient and the information matrix . . . . . . . . . . . . . . . . . . . . . . . . 4-5
Evaluation of the analytical gradient . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-5
Evaluation of the exact information matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-6
Quasi-maximum likelihood estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-6
Numerical optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-7
General use of e4min . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-7
Scaling problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-8
Preliminary estimates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-8
Displaying the estimation results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-9
Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-10
Specification, forecasting, simulation and smoothing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-1
Tools for time series analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-1
General purpose functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-1
Data transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-2
Tools for model specification and validation . . . . . . . . . . . . . . . . . . . . . . . . . . 5-3
Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-4
Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-5
Smoothing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-6
User models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-1
Defining user models in the general case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-1
Defining user models in reparametrized formulations . . . . . . . . . . . . . . . . . . . . . . . . . . 6-4
Case studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-1
Univariate ARIMA examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2
VARMA modeling: interaction between minks and muskrats . . . . . . . . . . . . . . . . . . . . 7-6
Transfer function analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11
Unconstrained transfer function modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-12
Period estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-14
Estimation of a constrained transfer function . . . . . . . . . . . . . . . . . . . . . . . . . 7-15
Composite model forecasts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-16
Structural econometric models: supply and demand of food . . . . . . . . . . . . . . . . . . . . 7-18
Maximum-likelihood estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-19
An ARCH model for the U.S. GNP deflator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-22
Estimation under homoscedasticity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-22
Estimation of an ARCH(8) process for the error . . . . . . . . . . . . . . . . . . . . . . 7-24
Estimation of a GARCH(1,1) process for the error . . . . . . . . . . . . . . . . . . . . 7-25
Forecasting and monitoring of objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-28
Disaggregation of value added in industry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-30
Estimation of the high frequency data model . . . . . . . . . . . . . . . . . . . . . . . . . 7-30
Disaggregation from nonstationary models . . . . . . . . . . . . . . . . . . . . . . . . . . 7-31
Disaggregation from a stationary model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-32
Models with observation errors: Wölfer´s sunspots data . . . . . . . . . . . . . . . . . . . . . . . 7-33
Univariate modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-33
Model with observation errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-35
Structural time series models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-38
Computation and modeling of unobservable components . . . . . . . . . . . . . . . . 7-39
Reference guide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-1
aggrmod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-4
arma2thd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-6
augdft . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-8
comp2thd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-11
descser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-12
e4init . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-13
e4min . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-14
e4preest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-17
e4trend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-19
fismiss, fismod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-24
foregarc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-26
foremiss, foremod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-27
garc2thd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-28
ggarch, gmiss, gmod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-30
histsers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-32
igarch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-33
imiss, imod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-35
imodg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-37
lagser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-39
lffast, lfmiss, lfmod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-40
lfgarch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-43
midents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-45
nest2thd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-46
plotqqs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-47
plotsers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-48
prtest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-49
prtmod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-51
residual . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-52
rmedser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-54
sete4opt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-56
simgarch, simmod . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-58
ss_dv, garch_dv . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-60
ss_dvp, garc_dvp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-61
ss2thd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-62
stackthd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-64
str2thd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-65
tf2thd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-67
thd2arma, thd2str, thd2tf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-69
thd2ss . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-71
tomod, touser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-72
transdif . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-73
uidents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-74
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-1
Error messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-1
Warning messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-2
1 Introduction
This book describes a MATLAB Toolbox for econometric modeling of time series. Its name, E4,
refers to the Spanish Estimación de modelos Econométricos en Espacio de los Estados, meaning
“State-Space Estimation of Econometric Models.”
E4 makes up for the lack of PC software for estimating econometric models by exact maximum
likelihood. The main model supported is a general state-space (SS) form with fixed parameters. On
this basis, the Toolbox also supports many standard formulations, such as VARMAX (Vector
AutoRegressive Moving Average with eXogenous variables) models, structural econometric models
and single-output transfer functions. All these models can be estimated: a) by themselves or in
composite formulations, b) unconstrained or subject to linear and/or nonlinear constraints on the
parameters, and c) under standard conditions (i.e., with full information and homoscedastic errors)
or in an extended framework that allows for observation errors, missing data and vector GARCH
(Generalized AutoRegressive Conditional Heteroscedastic) errors.
This flexibility is obtained by treating each model internally in its equivalent state-space
formulation, which allows certain computations and analyses that would not otherwise be possible.
In most cases, however, the user need not understand or handle the SS formulation, as the
library includes many interface functions that manage the necessary conversions from/to the
conventional representation of the model.
From a theoretical point of view, the library’s most relevant characteristics are the following:
- The main estimation criterion is exact maximum likelihood. The likelihood can be evaluated
using either the Kalman or the Chandrasekhar filter.
- There are several algorithms to compute initial conditions for the state of a stationary or
nonstationary system.
- The toolbox includes a subspace-based consistent estimation criterion. This function is very fast;
hence its estimates can be used as final values when analyzing large samples, or as starting
values for maximum likelihood estimation.
- The Toolbox includes functions to compute the analytical gradient of the likelihood functions and
the exact information matrix. These functions enhance both the likelihood optimization process
and the ex-post validation of the model through hypothesis testing.
- Although the emphasis is on model estimation, the Toolbox includes many functions for model
specification, validation, simulation and forecasting.
- The functions related to forecasting and fixed-interval smoothing are based on efficient,
state-of-the-art methods.
- The use of MATLAB as a development platform also guarantees the reliability of computations, as
it builds on the results of the well-known and reputable LINPACK and EISPACK projects.
It also allows the user to easily extend the formulations and methods supported.
The focus of E4 on SS econometrics has some pros and cons. Specifically, the routines have been
optimized for numerical accuracy and robustness, not for speed. This is not a serious inconvenience,
as econometric modeling seldom requires real-time performance. Faster computations can be
obtained easily by tuning some system parameters or, with some effort, by translating critical
functions to a lower-level language such as C, or by generating MEX files from the source code; see
MATLAB (1992) and MATLAB (1996). On the other hand, this emphasis on accuracy and
robustness has many advantages; see McCullough and Vinod (1999).
How the toolbox works
To unify the treatment of a wide range of econometric models, the toolbox uses an internal format,
known as THD (THeta-Din) format, which stores the information about the dynamic and stochastic
structure of the models. This format is managed by user-friendly interface functions.
For example, if one wants to estimate a VARMA model, the first step consists of writing its
conventional formulation and calling the function that returns the corresponding THD representation.
The rest of the toolbox functions assume the model is coded in this format. A detailed description of
the THD format can be found in Appendix C.
Many functions in the Toolbox require, as input arguments, a THD format definition and a data
matrix. These functions transform the model to SS representation and start the required computation
process. Therefore, most users do not need to know the SS formulation of the model.
The steps in the normal operation of E4 are the following:
- Read and transform the data (Box-Cox transformation, differencing or any other). At this stage,
graphics and functions for model specification can also be used. If simulations are required, there
are functions (e.g., simmod) that can generate the data.
- Define the model to be estimated by writing its parameter matrices, and obtain the corresponding
representation in THD format. This is done with a specific function for each type of model.
- Call the toolbox optimization algorithm (e4min) to estimate the model by maximizing the exact
likelihood function. Before that, the user may have to define some parameters that control the
behaviour of the optimization algorithm.
- Once the estimates are obtained, use the appropriate function to compute the exact
information matrix. An alternative estimate of the information matrix can be obtained directly
from e4min, through the inverse of the Hessian. Finally, display the estimation results in a legible
format.
- The results of the previous operations open up a wide range of possibilities, such as validation,
forecasting, interpolation of objectives, re-estimation of the model under linear or nonlinear
constraints, or outlier elimination.
Of course, it is not necessary to go through all the previous steps nor to always follow the same
sequence, as this will depend on the user’s objectives.
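The workflow above can be sketched outside E4. The following Python/NumPy fragment (E4 itself is a MATLAB toolbox; everything here is an illustrative stand-in, not E4 code) runs through the same steps for a toy AR(1) model: generate data, estimate the parameter by conditional maximum likelihood, compute an information-matrix-based standard error, and display the results.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1) Generate data from a known AR(1) process: z_t = phi*z_{t-1} + a_t
phi_true, sigma, T = 0.6, 1.0, 5000
z = np.zeros(T)
for t in range(1, T):
    z[t] = phi_true * z[t - 1] + sigma * rng.standard_normal()

# 2) Conditional ML estimate of phi (equivalent to least squares here)
num = np.dot(z[:-1], z[1:])
den = np.dot(z[:-1], z[:-1])
phi_hat = num / den

# 3) Residual variance, and standard error from the information matrix:
#    I(phi) = sum(z_{t-1}^2) / sigma^2, so s.e. = sqrt(1 / I(phi))
resid = z[1:] - phi_hat * z[:-1]
sigma2_hat = np.mean(resid ** 2)
se_phi = np.sqrt(sigma2_hat / den)

# 4) Display the estimation results in a legible format
print(f"phi_hat = {phi_hat:.3f} (s.e. {se_phi:.3f})")
```

In E4 the analogous steps would instead use the toolbox functions for model definition, e4min, and the information-matrix routines described in Chapter 4.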
Installing the toolbox
The distribution diskette contains the source code of the toolbox in the directory \E4. Assuming that
MATLAB is installed in the directory C:\MATLAB, the contents of A:\E4 should be copied to
C:\MATLAB\TOOLBOX\E4. If the distribution diskette is inserted into the A: drive, the sequence
of MS-DOS commands would be as follows:
C:\>cd c:\matlab\toolbox
C:\MATLAB\TOOLBOX>md e4
C:\MATLAB\TOOLBOX>copy a:\e4\*.* c:\matlab\toolbox\e4
Once the files are copied, the MATLABRC.M file must be modified to include the directory
c:\matlab\toolbox\e4 in the search path of MATLAB functions. To do this with a text editor,
such as Windows NOTEPAD, open C:\MATLAB\MATLABRC.M and find the command
matlabpath, which will be similar to:
matlabpath([...
'C:\MATLAB;',...
'C:\MATLAB\toolbox\matlab\general;',...
...
'C:\MATLAB\toolbox\matlab\plotxy;',...]);
The user should add the path to the directory containing the library, obtaining something similar to:
matlabpath([...
'C:\MATLAB;',...
'C:\MATLAB\toolbox\matlab\general;',...
...
'C:\MATLAB\toolbox\matlab\plotxy;',...
'C:\MATLAB\toolbox\e4;',...]);
Last, before saving this file, add the following line to the end of MATLABRC.M:
e4init;
This command initializes the toolbox options. Although the call to e4init can be done at the
beginning of each toolbox session, it is easier to initialize when starting up MATLAB. Bear in mind
that the toolbox does not work properly if this function is not run.
Once these steps have been completed, MATLAB is able to use the E4 library.
The .m files corresponding to the examples in Chapter 7 are in the directory \EXAMPLES of the
distribution diskette. If the distribution diskette is inserted into the A: drive, they can be copied to the
directory C:\EXAMPLES (or any other) of the hard disk with the following commands:
C:\>md examples
C:\>xcopy a:\examples c:\examples /s
Contents of the book
This book is organized as follows. Chapter 2 presents the econometric models supported by E4.
Chapter 3 describes the functions managing the different representations of a model. It begins by
describing the functions that transform conventional models to THD format. Building on this format,
the models can be translated to the SS or conventional formulations.
Chapter 4 is about model estimation. It describes several functions related to likelihood
evaluation, optimization and calculation of the exact information matrix, as well as the functions
that manage the toolbox options and tolerances.
Chapter 5 deals with the toolbox functions for model specification, validation, forecasting and
smoothing.
Chapter 6 is about extending the formulations supported by means of user models. This option
allows the user to define new formulations or to work with nonstandard parametrizations of the
models supported.
To illustrate how the toolbox works, Chapter 7 includes several case studies that cover both
introductory and advanced time series analysis with E4.
Finally, Chapter 8 is the Reference Guide to the toolbox functions and Chapter 9 contains the
bibliographic references. The Appendices A to C describe the error and warning messages, the
structure of the internal vector E4OPTION and the THD format, respectively.
2 Description of the models supported
E4 addresses a wide set of econometric models in all stages of analysis: specification, estimation,
forecasting, smoothing and simulation. This Chapter presents the mathematical formulation of these
models and is divided into five Sections.
The first Section describes the structure of the fixed-parameters SS models, which are used
internally by E4 for most computational purposes.
The second Section is devoted to the basic econometric models supported in the toolbox: the
structural econometric model, the VARMAX model and the single-output transfer function. The
third Section defines a very flexible option, called “composite models”, which can be used to
combine several SS formulations in a single model.
The fourth Section describes the formulation of models with conditionally heteroscedastic
disturbances, and the last Section presents the combinations of modeling options supported and
introduces the concept of “user models”, which is discussed in Chapter 6.
State-space models
The state-space model with fixed coefficients
The most general SS formulation supported by E4 is:

(2.1)   $x_{t+1} = \Phi x_t + \Gamma u_t + E w_t$

(2.2)   $z_t = H x_t + D u_t + C v_t$

where:

$x_t$ is an $(n \times 1)$ vector of state variables,
$u_t$ is an $(r \times 1)$ vector of exogenous variables,
$z_t$ is an $(m \times 1)$ vector of observable variables,
$w_t$ and $v_t$ are white noise processes such that:

(2.3)   $E[w_t] = 0, \quad E[v_t] = 0$

(2.4)   $E\left[ \begin{bmatrix} w_{t_1} \\ v_{t_1} \end{bmatrix} \begin{bmatrix} w_{t_2}^T & v_{t_2}^T \end{bmatrix} \right] = \begin{bmatrix} Q & S \\ S^T & R \end{bmatrix} \delta_{t_1 t_2}$

where $Q$ and $R$ are positive semi-definite matrices. See Terceiro (1990).
Example 2.1 (SS representation of standard econometric models). Any linear econometric model
with fixed parameters can be represented in the SS form (2.1)-(2.2). Therefore, any numerical or
statistical procedure (e.g., forecasting) developed for the SS model can be applied to all the
particular cases. This is the basic idea implemented in E4.
For example, consider the ARMA(1,1) model $z_{t+1} = \phi z_t + a_{t+1} + \theta a_t$. An equivalent
representation in SS form is:

$x_{t+1} = \phi x_t + (\phi + \theta) a_t$

$z_t = x_t + a_t$

The SS representation of a given model is not unique; e.g., the previous ARMA(1,1) model can also
be written in SS form as:

$\begin{bmatrix} x^1_{t+1} \\ x^2_{t+1} \end{bmatrix} = \begin{bmatrix} \phi & \theta \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x^1_t \\ x^2_t \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \end{bmatrix} a_{t+1}$

$z_t = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} x^1_t \\ x^2_t \end{bmatrix}$
For further details about the SS representation of econometric models, see Terceiro (1990).
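The equivalence between these representations can be checked numerically. The following Python/NumPy sketch (an illustration, not E4 code, with arbitrary parameter values and zero pre-sample conditions) simulates the same ARMA(1,1) process from its difference equation and from both SS forms, and verifies that the three outputs coincide.

```python
import numpy as np

rng = np.random.default_rng(1)
phi, theta, T = 0.5, 0.3, 200
a = rng.standard_normal(T)

# Difference equation: z_{t+1} = phi*z_t + a_{t+1} + theta*a_t (zero pre-sample)
z_arma = np.zeros(T)
z_arma[0] = a[0]
for t in range(T - 1):
    z_arma[t + 1] = phi * z_arma[t] + a[t + 1] + theta * a[t]

# First SS form: x_{t+1} = phi*x_t + (phi+theta)*a_t ;  z_t = x_t + a_t
x = 0.0
z_ss1 = np.zeros(T)
for t in range(T):
    z_ss1[t] = x + a[t]
    x = phi * x + (phi + theta) * a[t]

# Second SS form: x_{t+1} = [[phi,theta],[0,0]] x_t + [1,1]' a_{t+1}; z_t = [1,0] x_t
A = np.array([[phi, theta], [0.0, 0.0]])
E = np.array([1.0, 1.0])
x2 = E * a[0]               # the state at t = 0 already contains a_0
z_ss2 = np.zeros(T)
for t in range(T):
    z_ss2[t] = x2[0]
    if t + 1 < T:
        x2 = A @ x2 + E * a[t + 1]
# all three series are identical (up to rounding)
```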
Example 2.2 (SS and structural time series models). The additive structural decomposition of a
time series, $z_t$, is defined by:

$z_t = t_t + c_t + s_t + \varepsilon_t$

where:

$t_t$ is the trend component, representing the long-term behavior of the series,
$c_t$ is the transitory component, or cycle, describing short-term fluctuations,
$s_t$ is the seasonal component, associated with persistent variability patterns that repeat over the
seasonal period, and
$\varepsilon_t$ is an irregular component.
A structural time series model is set up directly in terms of these components, which are
represented by SS models specified according to the properties of the time series. For example, the
following formulation describes a series in terms of a stochastic trend and quarterly dummy
seasonality:

$\begin{bmatrix} t_{t+1} \\ \beta_{t+1} \\ s_{t+1} \\ s_t \\ s_{t-1} \end{bmatrix} = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & -1 & -1 & -1 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} t_t \\ \beta_t \\ s_t \\ s_{t-1} \\ s_{t-2} \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \eta_t \\ \omega_t \end{bmatrix}$

$z_t = \begin{bmatrix} 1 & 0 & 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} t_t \\ \beta_t \\ s_t \\ s_{t-1} \\ s_{t-2} \end{bmatrix} + \varepsilon_t$

where the error terms $\eta_t$, $\omega_t$ and $\varepsilon_t$ are assumed to be Gaussian white noise processes, with an
instantaneous covariance matrix:

$V \begin{bmatrix} \eta_t \\ \omega_t \\ \varepsilon_t \end{bmatrix} = \begin{bmatrix} \sigma^2_\eta & 0 & 0 \\ 0 & \sigma^2_\omega & 0 \\ 0 & 0 & \sigma^2_\varepsilon \end{bmatrix}$
This class of models is supported by E4 through the use of the formulation (2.1)-(2.2). For more
details about these formulations, see Harvey (1989).
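As a sanity check on this formulation, the Python/NumPy sketch below (not E4 code, with arbitrary initial values, and assuming the conventional minus signs in the dummy-seasonal recursion) propagates the transition matrix without disturbances: the trend grows linearly and any four consecutive seasonal values sum to zero, i.e., the deterministic seasonal pattern repeats with period 4.

```python
import numpy as np

# Transition matrix: stochastic trend + quarterly dummy seasonality
Phi = np.array([
    [1, 1,  0,  0,  0],   # t_{t+1} = t_t + beta_t
    [0, 1,  0,  0,  0],   # beta_{t+1} = beta_t (random walk slope, noise off)
    [0, 0, -1, -1, -1],   # s_{t+1} = -s_t - s_{t-1} - s_{t-2}
    [0, 0,  1,  0,  0],
    [0, 0,  0,  1,  0],
], dtype=float)
H = np.array([1, 0, 1, 0, 0], dtype=float)

# initial state: level 10, slope 0.5, seasonal pattern s_0=2, s_{-1}=-1, s_{-2}=0.5
x = np.array([10.0, 0.5, 2.0, -1.0, 0.5])
states = [x]
for _ in range(12):
    x = Phi @ x
    states.append(x)
states = np.array(states)

trend = states[:, 0]     # linear: increments equal to the slope
seas = states[:, 2]      # periodic with period 4, zero sum over a year
z = states @ H           # noise-free observed series: trend + seasonal
```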
The steady-state innovations model
A particular case of (2.1)-(2.2) is the steady-state innovations SS model, see Anderson and Moore
(1979), defined by:

(2.5)   $x_{t+1} = \Phi x_t + \Gamma u_t + E \varepsilon_t$

(2.6)   $z_t = H x_t + D u_t + \varepsilon_t$
Comparing (2.5)-(2.6) with (2.1)-(2.2), it is immediately apparent that, in this formulation, the errors in
the state and observation equations are the same and $C = I$. The relevance of this special case lies in
two facts: a) many econometric models in SS form have the steady-state innovations structure, see e.g.
the first SS representation in Example 2.1, and b) when applied to an SS model with this structure,
the forecasting, filtering and smoothing algorithms have special convergence properties, which allow
the implementation of very efficient and stable computational procedures; see Casals, Sotoca and
Jerez (1999) and Casals, Jerez and Sotoca (2000).
Whenever suitable, E4 takes advantage of these properties.
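These convergence properties can be observed numerically. The Python/NumPy sketch below (not E4 code) iterates the Kalman filter Riccati recursion, with the noise covariances implied by the innovations structure, for the scalar ARMA(1,1) innovations representation of Example 2.1: the prediction error covariance collapses to zero and the filter gain converges to E, as expected for an invertible innovations model.

```python
import numpy as np

# Scalar innovations model from Example 2.1: Phi = phi, H = 1, E = phi + theta
phi, theta, sig2 = 0.5, 0.3, 1.0       # theta invertible
Phi, H, E = phi, 1.0, phi + theta

# Covariances implied by w_t = E*eps_t and v_t = eps_t:
Q, S, R = (E ** 2) * sig2, E * sig2, sig2

# Riccati recursion for the one-step-ahead error covariance P
P = 10.0                                # arbitrary positive starting value
for _ in range(200):
    K = (Phi * P * H + S) / (H * P * H + R)          # filter gain
    P = Phi * P * Phi + Q - K * (H * P * H + R) * K  # covariance update

K = (Phi * P * H + S) / (H * P * H + R)
# P has collapsed to (essentially) zero and K equals E
```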
Simple models
Besides state-space models, E4 manages three basic formulations, which are known as simple
models. These are structural econometric models, VARMAX models and single-output transfer
functions. The last two formulations can also be combined with a multivariate GARCH model for
the conditional variances of the errors.
The structural econometric model
A structural econometric model can be formulated as:

(2.7)   $F_R(B) F_S(B^S) y_t = G(B) u_t + A_R(B) A_S(B^S) \varepsilon_t$

where $S$ denotes the length of the seasonal period; $B$ is the backshift operator, such that for any
sequence $x_t$, $B^k x_t = x_{t-k}$; $y_t$ is an $(m \times 1)$ vector of endogenous variables; $u_t$ is an $(r \times 1)$
vector of exogenous variables; $\varepsilon_t$ is an $(m \times 1)$ vector of white noise errors; and

$F_R(B) = F_{R0} + F_{R1} B + \dots + F_{Rp} B^p$
$F_S(B^S) = F_{S0} + F_{S1} B^S + \dots + F_{SP} B^{P \cdot S}$
$G(B) = G_0 + G_1 B + \dots + G_g B^g$
$A_R(B) = A_{R0} + A_{R1} B + \dots + A_{Rq} B^q$
$A_S(B^S) = A_{S0} + A_{S1} B^S + \dots + A_{SQ} B^{Q \cdot S}$

The characteristic feature of this formulation is that it allows for a contemporaneous relationship
between the endogenous variables, given by the matrices $F_{R0}$ and $F_{S0}$. To normalize the model,
the elements in their main diagonals should be equal to one. This formulation includes, as particular
cases, the linear regression model and the simultaneous equations model.
The VARMAX model
The VARMAX model is defined by:

(2.8)   $F_R(B) F_S(B^S) y_t = G(B) u_t + A_R(B) A_S(B^S) \varepsilon_t$

where $y_t$, $u_t$ and $\varepsilon_t$ are defined as in (2.7) and:

$F_R(B) = I + F_{R1} B + \dots + F_{Rp} B^p$
$F_S(B^S) = I + F_{S1} B^S + \dots + F_{SP} B^{P \cdot S}$
$G(B) = G_0 + G_1 B + \dots + G_n B^n$
$A_R(B) = I + A_{R1} B + \dots + A_{Rq} B^q$
$A_S(B^S) = I + A_{S1} B^S + \dots + A_{SQ} B^{Q \cdot S}$
Important and frequent particular cases of this formulation are the univariate ARMA and ARMAX
models and the VARMA model.
The transfer function model
The third basic specification is the single-output transfer function model, which can be formulated
as:

(2.9)   $y_t = \frac{\omega_1(B)}{\delta_1(B)} u_{1t} + \dots + \frac{\omega_r(B)}{\delta_r(B)} u_{rt} + \frac{\theta(B)\,\Theta(B^S)}{\phi(B)\,\Phi(B^S)} \varepsilon_t$

where:

$y_t$ is the value of the endogenous variable at time $t$,
$u_t = [u_{1t}, \dots, u_{rt}]^T$ is an $(r \times 1)$ vector of exogenous variables,
$\varepsilon_t$ is a white noise error, and

$\omega_i(B) = \omega_{i0} + \omega_{i1} B + \omega_{i2} B^2 + \dots + \omega_{i n_i} B^{n_i}; \quad i = 1, 2, \dots, r$
$\delta_i(B) = 1 + \delta_{i1} B + \dots + \delta_{i d_i} B^{d_i}; \quad i = 1, 2, \dots, r$
$\phi(B) = 1 + \phi_1 B + \dots + \phi_p B^p$
$\Phi(B^S) = 1 + \Phi_1 B^S + \dots + \Phi_P B^{P \cdot S}$
$\theta(B) = 1 + \theta_1 B + \dots + \theta_q B^q$
$\Theta(B^S) = 1 + \Theta_1 B^S + \dots + \Theta_Q B^{Q \cdot S}$
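To illustrate the notation, each rational lag structure ω_i(B)/δ_i(B) acts on its input as a difference equation. The Python/NumPy sketch below (not E4 code, using hypothetical first-order polynomials ω(B) = ω0 + ω1·B and δ(B) = 1 + δ1·B) verifies that the impulse response weights satisfy h_0 = ω0, h_1 = ω1 − δ1·ω0, and h_j = −δ1·h_{j−1} thereafter.

```python
import numpy as np

w0, w1, d1 = 2.0, 1.0, -0.6    # omega_0, omega_1, delta_1 (hypothetical values)

# impulse input
T = 10
u = np.zeros(T)
u[0] = 1.0

# difference equation form: y_t = -d1*y_{t-1} + w0*u_t + w1*u_{t-1}
y = np.zeros(T)
for t in range(T):
    y[t] = w0 * u[t]
    if t > 0:
        y[t] += -d1 * y[t - 1] + w1 * u[t - 1]

# closed-form impulse response weights of omega(B)/delta(B)
h = np.zeros(T)
h[0] = w0
h[1] = w1 - d1 * w0
for j in range(2, T):
    h[j] = -d1 * h[j - 1]
# the simulated response equals the closed-form weights
```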
Composite models
In the context of E4, a model that combines the formulation of several models is called a composite
model. E4 allows for two types of model composition: nested models and component models.
A model is said to be nested if it is constructed by a combination of multiplicative factors. For
example, the famous airline model can be viewed as the result of nesting a regular IMA(1,1) process
with a seasonal IMA(1,1) process.
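Nesting by multiplicative factors amounts to polynomial multiplication in B, i.e., convolution of coefficient vectors. A Python/NumPy sketch (not E4 code, with hypothetical MA parameters) for the airline-model factors:

```python
import numpy as np

theta, Theta = -0.4, -0.6       # hypothetical regular and seasonal MA parameters

# polynomials in B as coefficient vectors [B^0, B^1, ..., B^k]
diff_reg = np.array([1.0, -1.0])              # (1 - B)
diff_seas = np.r_[1.0, np.zeros(11), -1.0]    # (1 - B^12)
ma_reg = np.array([1.0, theta])               # (1 + theta*B)
ma_seas = np.r_[1.0, np.zeros(11), Theta]     # (1 + Theta*B^12)

# nesting = polynomial multiplication = convolution of coefficients
ar_side = np.convolve(diff_reg, diff_seas)    # (1 - B)(1 - B^12)
ma_side = np.convolve(ma_reg, ma_seas)        # (1 + theta*B)(1 + Theta*B^12)
# ar_side has coefficients 1, -1 at lags 0, 1 and -1, +1 at lags 12, 13;
# ma_side picks up the cross term theta*Theta at lag 13
```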
On the other hand, a component model is obtained by the addition of several dynamic structures. For
example, one could define the model for a trend-cycle decomposition by adding an ARIMA model
for the trend and an ARIMA model for the cycle.
Nested models
E4 allows for two types of model nesting: nesting in inputs and nesting in noise.
Nesting in inputs
A model is said to be nested in inputs if some exogenous variable is substituted by a model
describing its dynamic and stochastic structure. After doing so, the exogenous variables become
endogenous and, therefore, this operation can also be described as “endogeneization”. The following
example illustrates it in a general SS framework.
Example 2.3 (Endogeneization). Assume that the model for a vector of endogenous variables, y_t,
is:

(2.10)  x^a_{t+1} = Φ^a x^a_t + Γ^a u_t + E^a w^a_t
(2.11)  y_t = H^a x^a_t + D^a u_t + C^a v^a_t

whereas the exogenous variables, u_t, follow the model:

(2.12)  x^b_{t+1} = Φ^b x^b_t + E^b w^b_t
(2.13)  u_t = H^b x^b_t + C^b v^b_t

the errors in (2.10)-(2.11) and (2.12)-(2.13) being mutually independent. Substituting (2.13) in
(2.10)-(2.11) yields:

(2.14)  x^a_{t+1} = Φ^a x^a_t + Γ^a (H^b x^b_t + C^b v^b_t) + E^a w^a_t
(2.15)  y_t = H^a x^a_t + D^a (H^b x^b_t + C^b v^b_t) + C^a v^a_t
The Eqs. (2.14) and (2.12) can be easily combined in a single state equation:

(2.16)  [x^a_{t+1}; x^b_{t+1}] = [Φ^a  Γ^a H^b; 0  Φ^b] [x^a_t; x^b_t]
                                 + [E^a  Γ^a C^b  0; 0  0  E^b] [w^a_t; v^b_t; w^b_t]

and Eqs. (2.15) and (2.13) can be combined in a single observation equation:

(2.17)  [y_t; u_t] = [H^a  D^a H^b; 0  H^b] [x^a_t; x^b_t] + [C^a  D^a C^b; 0  C^b] [v^a_t; v^b_t]
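The block structure of (2.16)-(2.17) can be checked numerically: simulate the two models in cascade and the stacked system with the same noises, and compare the outputs. The following Python/NumPy sketch uses small illustrative scalar systems; none of the parameter values come from the manual.

```python
# Numerical check of the stacking in (2.16)-(2.17): simulating models
# (2.10)-(2.13) in cascade gives the same (y_t, u_t) as the single
# stacked system. All parameter values are illustrative scalars.
import numpy as np

# model a (endogenous) and model b (exogenous), scalar states
pa, ga, ea, ha, da, ca = .5, .3, 1.0, 1.0, .2, 1.0
pb, eb, hb, cb = .7, 1.0, 1.0, 1.0

# stacked matrices of (2.16)-(2.17)
Phi = np.array([[pa, ga * hb], [0.0, pb]])
E = np.array([[ea, ga * cb, 0.0], [0.0, 0.0, eb]])
H = np.array([[ha, da * hb], [0.0, hb]])
C = np.array([[ca, da * cb], [0.0, cb]])

rng = np.random.default_rng(0)
T = 20
wa, vb, wb, va = (rng.standard_normal(T) for _ in range(4))

# cascade simulation of (2.10)-(2.13)
xa = xb = 0.0
ys, us = [], []
for t in range(T):
    u = hb * xb + cb * vb[t]            # (2.13)
    y = ha * xa + da * u + ca * va[t]   # (2.11)
    ys.append(y); us.append(u)
    xa = pa * xa + ga * u + ea * wa[t]  # (2.10)
    xb = pb * xb + eb * wb[t]           # (2.12)

# the stacked system (2.16)-(2.17) reproduces the same outputs
x = np.zeros(2)
for t in range(T):
    obs = H @ x + C @ np.array([va[t], vb[t]])
    assert np.allclose(obs, [ys[t], us[t]])
    x = Phi @ x + E @ np.array([wa[t], vb[t], wb[t]])
print("stacked system matches the cascade")
```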
Nesting in noise
Nesting in errors consists of defining the noise structure of an econometric model as a multiplicative
combination of several dynamic factors. The only requisite for these factors is that their SS
equivalent representation should be an innovation model, see (2.5)-(2.6). This is not a severe
restriction, as most simple models supported by E4 satisfy it. Two relevant applications of this type
of nesting consist of: a) separating unit roots from stationary or invertible factors and b)
representing a time series with multiple seasonal cycles.
Component models
A component model is defined as the sum of several components, each of them having a (simple or
composite) model describing its particular dynamic and stochastic features. Two important cases of
component models are Structural Time Series Models (STSM), see Harvey (1989), and models with
observation errors, see Terceiro (1990).
Example 2.4 (Observation errors). Assume that the vector y_t in model (2.10)-(2.11) is such that:

(2.18)  y*_t = y_t + v^y_t

where y*_t is observable and v^y_t is a white noise observation error, independent from the disturbances
in (2.10)-(2.11). Combining (2.18) with (2.11) yields the new observation equation:

(2.19)  y*_t = H^a x^a_t + D^a u_t + C^a v^a_t + v^y_t
which relates the states in (2.10) with the observable variables.
Models with GARCH errors
Most econometric models assume that errors have constant conditional variances. To generalize this
assumption, Engle (1982) introduced a class of stochastic processes with time-varying conditional
variances. For a comprehensive survey, see Bollerslev et al. (1994).
E4 allows one to combine any VARMAX model, transfer function or SS model in the steady-state
innovations form (2.5)-(2.6) with a vector GARCH process for the error ε_t. This process is defined
by the unconditional moments E[ε_t] = 0 and V[ε_t] = Σ_ε, and the conditional moments
E_{t-1}[ε_t] = 0 and E_{t-1}[ε_t ε_t^T] = Σ_t, where E_{t-1}[·] denotes the expectation of the
argument conditional on the information available at t-1. In a GARCH model, the conditional
variances, Σ_t, are such that:

(2.20)  [I - β(B)] vech(Σ_t) = vech(ω) + α(B) vech(ε_t ε_t^T)

where vech(·) stands for the vector-half operator, which stacks the lower triangle of an N×N matrix
as an [N(N+1)/2]×1 vector, and the polynomial matrices are given by:

β(B) = β_1 B + ... + β_s B^s ,   α(B) = α_1 B + ... + α_p B^p
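The vech operator is easy to state in code. A minimal Python sketch (the matrix values are arbitrary; E4 itself is MATLAB code):

```python
# The vech (vector-half) operator of (2.20): stack the lower triangle of
# an N x N symmetric matrix, column by column, into an N(N+1)/2 vector.

def vech(m):
    n = len(m)
    return [m[i][j] for j in range(n) for i in range(j, n)]

sigma = [[4.0, 1.0],
         [1.0, 9.0]]
print(vech(sigma))  # [4.0, 1.0, 9.0]
```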
E4 manages model (2.20) in its alternative VARMAX representation. To derive it, consider the
process v_t = vech(ε_t ε_t^T) - vech(Σ_t), such that E_{t-1}[v_t] = 0. Then
vech(Σ_t) = vech(ε_t ε_t^T) - v_t. Substituting this expression in (2.20) and rearranging some
terms yields:

(2.21)  [I - α(B) - β(B)] vech(ε_t ε_t^T) = vech(ω) + [I - β(B)] v_t

Eq. (2.21) defines a VARMAX model for vech(ε_t ε_t^T), which can be written in compact form as:

(2.22)  vech(ε_t ε_t^T) = vech(Γ_ε) + η_t
(2.23)  [I - Φ*(B)] η_t = [I - Θ*(B)] v_t

where Φ*(B) = [α(B) + β(B)] and Θ*(B) = β(B). This is the formulation supported in E4.
Note that:
1) If the VAR polynomial in (2.23) has roots on the unit circle, then the process has some
IGARCH (Integrated GARCH) components.
2) The formulation (2.22)-(2.23) does not assure that the eigenvalues of Σ_t are non-negative for all t.
Example 2.5 (Formulation of a GARCH(1,1) process as an ARMA(1,1)). Consider the process
ε_t ~ IID N(0, σ²_ε); ε_t | Ω_{t-1} ~ IID N(0, h²_t), such that the conditional variance, h²_t, follows a
GARCH(1,1) equation:

h²_t = ω + α_1 ε²_{t-1} + β_1 h²_{t-1}

Defining v_t ≡ ε²_t - h²_t, it follows that the previous equation can be rewritten as:

(1 - α_1 B - β_1 B) ε²_t = ω + (1 - β_1 B) v_t

which is analogous to (2.21), or:

ε²_t = σ²_ε + η_t
[1 - (α_1 + β_1) B] η_t = (1 - β_1 B) v_t

which is analogous to (2.22)-(2.23). An IGARCH model can be formulated by imposing
α_1 + β_1 = 1.
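The rewriting in Example 2.5 can be verified numerically: simulate ε_t under the GARCH(1,1) recursion and check the ARMA(1,1) identity term by term. The Python sketch below is a language-neutral illustration (E4 itself is MATLAB code); the parameter values ω = .002, α_1 = .1, β_1 = .7 are chosen for illustration, and happen to match those used later in Example 3.11.

```python
# Check of Example 2.5: simulate eps_t with a GARCH(1,1) variance and
# verify the ARMA-type identity
#   eps_t^2 - (a1+b1)*eps_{t-1}^2 = w + v_t - b1*v_{t-1},
# where v_t = eps_t^2 - h_t^2.
import random

random.seed(1)
w, a1, b1 = .002, .1, .7
h2 = [w / (1 - a1 - b1)]   # start at the unconditional variance
eps2, v = [], []
for t in range(200):
    e = random.gauss(0.0, h2[-1] ** 0.5)
    eps2.append(e * e)
    v.append(e * e - h2[-1])
    h2.append(w + a1 * e * e + b1 * h2[-1])  # GARCH(1,1) recursion

# the identity holds exactly (up to rounding) for every t >= 1
for t in range(1, 200):
    lhs = eps2[t] - (a1 + b1) * eps2[t - 1]
    rhs = w + v[t] - b1 * v[t - 1]
    assert abs(lhs - rhs) < 1e-12
print("GARCH(1,1) recursion == ARMA(1,1) for the squared errors")
```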
Modeling options supported
The following table summarizes the different options supported by E4, for all the simple
formulations defined in this section:
Simple models                              Missing data   Stationarity/       GARCH errors
                                                          Nonstationarity
VARMAX                                     YES            YES                 YES
Structural econometric                     YES            YES                 NO
Single-output transfer functions           YES            YES                 YES
State-space (including structural
time series models)                        YES            YES                 YES†

† Only for models in steady-state innovations form
Besides these formulations, E4 supports the definition of “user models”. This feature allows the user
to:
1) define any econometric model not supported directly by E4, provided that it has an equivalent SS
representation,
2) impose general nonlinear equality constraints on the parameters of the models supported, and
3) formulate a nonstandard parametrization of the models supported.
Chapter 6 describes this option in detail.
3 Defining models
E4 uses internally the SS representation (2.1)-(2.2) for most computations. However, its basic
representation is the “THD format”. This Chapter describes the functions that translate any of the
models described in Chapter 2 into THD format and, conversely, those that translate a THD
representation into the standard notation.
The first Section introduces the THD format and the general rules for the definition of parameter
matrices. The second, third and fourth Sections describe the functions that translate the matrices of a
SS model, a simple model or a composite model, respectively, into THD format. The definition of
models with conditional heteroscedastic errors is discussed in the fifth Section. Finally, the sixth
and seventh Sections explain how to translate a model in THD format into a SS or a conventional
representation.
General ideas about model definition
The THD format
The basic format for model representation in E4 is called “THD format”. Any THD
specification is composed of two matrices: theta and din, which contain, respectively, the values
of the model parameters and a description of its dynamic structure. Besides theta and din, a
model can be documented by an optional character matrix lab, which contains names for the
parameters in theta.
The matrix theta has a first column containing the values of all the parameters in a model.
Optionally, it may have a second column whose values are either zero, to indicate that the
corresponding parameter in the first column is free, or one, when the parameter is constrained to
remain at its present value.
While some analyses require the modification of theta and lab, most users will never need to
handle the matrix din. In any case, Appendix C contains a detailed description of the THD format.
In E4, defining a model consists of creating its parameter matrices by means of MATLAB
commands and feeding these matrices to an E4 interface function, which generates theta, din and
lab. The use of these interface functions is reviewed in the rest of this Chapter and discussed in
detail in Chapter 8.
After defining a model in THD format, its structure can be displayed by the function prtmod, which
has the syntax:
prtmod(theta, din, lab);
*HQHUDO UXOHV DERXW SDUDPHWHU PDWULFHV
When creating the parameter matrices, one should take into account the following general rules:
1) The value NaN marks those positions where the corresponding parameter is constrained to zero.
2) When all the parameters in a matrix are null, it should be defined as empty (“[]”).
3) A matrix of covariances can be defined as a vector. In this case, it is assumed to be diagonal and
the elements in the main diagonal are automatically set to the values in the vector. In order not to
impose this constraint, it is necessary to define at least its lower triangle.
4) A covariance matrix cannot contain the value NaN. To impose independence between two
specific errors, the user should specify a zero value in the first column of theta and, afterwards,
add a fixed-parameter constraint in the second column.
Example 3.1 (Defining matrices of parameters).
The following MATLAB commands initialize five matrices, which could be used to define a model
in THD format:
A = [-.8 NaN; .1 0];
Sigma1 = [1 .1; .1 2];
Sigma2 = [1 2; .1 2];
Sigma3 = [1; 2];
Sigma4 = [1 NaN; NaN 2];
The first command generates the matrix:

A = [ -.8   0
       .1   0 ]

where the parameter in position (1,2) is constrained to remain at its present null value because it
has been defined using NaN. The zero element in position (2,2) defines a parameter with a null
starting value, which will be allowed to change during the estimation process.
The second and third commands define different matrices but, when interpreted as covariance
matrices by an E4 function, they are equivalent:

Σ_1 = Σ_2 = [ 1   .1
             .1    2 ]

because the upper triangle is in fact ignored. The fourth command defines the covariance matrix:

Σ_3 = [ 1  0
        0  2 ]

where the off-diagonal elements are constrained to remain at their present null values. Finally, the last
command is a valid MATLAB statement but, when interpreted as a covariance matrix by an E4
function, it will generate an error message due to the presence of NaN.
Defining state-space and structural time series models
The function ss2thd obtains the representation of the SS model (2.1)-(2.2) in THD format. Its
syntax is:
[theta, din, lab] = ss2thd(Phi, Gam, E, H, D, C, Q, S, R);
where the input arguments correspond to the parameter matrices in the standard representation (2.1)-
(2.2). See Chapter 8.
Example 3.2 (Defining an error correction model). Consider the following model in error
correction form:

Δy_t = -.5 w_t + .7 Δy_{t-1} + .4 Δu_t + a_t - .8 a_{t-1} ;  σ²_a = .1
w_t = y_t - .3 u_t

which does not correspond to any of the formulations defined in Chapter 2. Its SS formulation is:

[x_{t+1}; w_{t+1}] = [ .7  -.35 ; 0  0 ] [x_t; w_t]
                     + [ .28  0  0 ; 0  -.3  1 ] [Δu_t; u_{t-1}; y_{t-1}] + [ -.1 ; 0 ] a_t

Δy_t = [ 1  -.5 ] [x_t; w_t] + [ .4  0  0 ] [Δu_t; u_{t-1}; y_{t-1}] + a_t
and the corresponding THD representation can be obtained with the following code:
Phi = [.7, -.35; NaN, NaN];
Gamma = [.28, NaN, NaN; NaN, -.3, 1];
E = [-.1; NaN];
H = [1, -.5];
D = [.4, NaN, NaN];
[theta, din, lab] = ss2thd(Phi, Gamma, E, H, D, [1], [.1], [.1], [.1]);
prtmod(theta, din, lab);
and the output of prtmod is:
*************************** Model ***************************
Native SS model
1 endogenous v., 3 exogenous v.
Seasonality: 1
SS vector dimension: 2
Parameters (* denotes constrained parameter):
PHI(1,1)        0.7000
PHI(1,2)       -0.3500
GAMMA(1,1)      0.2800
GAMMA(2,2)     -0.3000
GAMMA(2,3)      1.0000
E(1,1)         -0.1000
H(1,1)          1.0000
H(1,2)         -0.5000
D(1,1)          0.4000
C(1,1)          1.0000
Q(1,1)          0.1000
S(1,1)          0.1000
R(1,1)          0.1000
*************************************************************
Note that the elements of the matrices in the SS representation are nonlinear functions of the
parameters in the original formulation. Obtaining estimates for the original parameters would
require a user model, see Chapter 6.
Example 3.3 (Defining structural time series models). Consider the decomposition of a time
series, y_t, into a trend, T_t, and an irregular component, ε_t, such that:

y_t = T_t + ε_t ;  T_t = T_{t-1} + β_{t-1} ;  β_t = β_{t-1} + ζ_{t-1}

where β_t is the change of the trend at time t, and the errors ε_t and ζ_t are independent white noise
processes with variances σ²_ε = 100 and σ²_ζ = .001, respectively. To define this model with E4 it is
first necessary to obtain its SS representation:

[T_{t+1}; β_{t+1}] = [ 1  1 ; 0  1 ] [T_t; β_t] + [ 0 ; 1 ] ζ_t

y_t = [ 1  0 ] [T_t; β_t] + ε_t
and then it can be defined and listed with the commands:
[theta, din, lab] = ss2thd([1 1; 0 1], [], [0; 1], ...
                           [1 0], [], [1], [.001], [0], [100]);
prtmod(theta, din, lab);
which generate the following output:
*************************** Model ***************************
Native SS model
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 2
Parameters (* denotes constrained parameter):
PHI(1,1)        1.0000
PHI(2,1)        0.0000
PHI(1,2)        1.0000
PHI(2,2)        1.0000
E(1,1)          0.0000
E(2,1)          1.0000
H(1,1)          1.0000
H(1,2)          0.0000
C(1,1)          1.0000
Q(1,1)          0.0010
S(1,1)          0.0000
R(1,1)        100.0000
*************************************************************
If the model is to be estimated, all the parameters except the variances should keep their present
values. These fixed-value constraints can be imposed with the additional commands:

theta = [theta ones(12,1)];
theta(10,2) = 0;
theta(12,2) = 0;
prtmod(theta, din, lab);
and the resulting output is:
*************************** Model ***************************
Native SS model
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 2
Parameters (* denotes constrained parameter):
PHI(1,1)  *     1.0000
PHI(2,1)  *     0.0000
PHI(1,2)  *     1.0000
PHI(2,2)  *     1.0000
E(1,1)    *     0.0000
E(2,1)    *     1.0000
H(1,1)    *     1.0000
H(1,2)    *     0.0000
C(1,1)    *     1.0000
Q(1,1)          0.0010
S(1,1)    *     0.0000
R(1,1)        100.0000
*************************************************************
where the parameters constrained to their initial values are marked with an asterisk.
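The model just defined is easy to simulate outside E4, which gives a quick sanity check of the SS matrices. The Python sketch below is ours (the function name simulate_llt and the starting values are assumptions, not part of E4); with both variances set to zero the output degenerates to an exact straight line, as the trend-plus-slope state equation implies.

```python
# Simulation sketch of the structural time series model in Example 3.3:
# states (T_t, beta_t), transition [[1,1],[0,1]], E = [0;1], H = [1 0].
import random

def simulate_llt(n, var_zeta, var_eps, t0=0.0, b0=1.0, seed=0):
    random.seed(seed)
    trend, slope, y = t0, b0, []
    for _ in range(n):
        y.append(trend + random.gauss(0.0, var_eps ** 0.5))  # y_t = T_t + eps_t
        # T_{t+1} = T_t + beta_t ;  beta_{t+1} = beta_t + zeta_t
        trend, slope = trend + slope, slope + random.gauss(0.0, var_zeta ** 0.5)
    return y

print(simulate_llt(5, 0.0, 0.0))  # -> [0.0, 1.0, 2.0, 3.0, 4.0]
```

With the manual's variances (σ²_ζ = .001, σ²_ε = 100) the same function produces a very smooth trend buried in a large irregular component, which is the point of the example.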
Defining simple models
The functions str2thd, arma2thd and tf2thd obtain the THD specification for structural
econometric models, VARMAX models and transfer functions, respectively. Their synopses are:
[theta,din,lab] = str2thd([FR0 ... FRp],[FS0 ... FSps], ...
                          [AR0 ... ARq],[AS0 ... ASqs],v,s,[G0 ... Gg],r)

[theta,din,lab] = arma2thd([FR1 ... FRp],[FS1 ... FSps], ...
                           [AR1 ... ARq],[AS1 ... ASqs],v,s,[G0 ... Gn],r)

[theta,din,lab] = tf2thd([fr1 ... frp],[fs1 ... fsps], ...
                         [ar1 ... arq],[as1 ... asqs],v,s,[w1; ...; wr],[d1; ...; dr])
See Chapter 8 for a detailed description of the input arguments.
Example 3.4 (Defining structural econometric models). Consider the model:

( [ 1  -.3 ; -.7  1 ] + [ -.4  0 ; .5  0 ] B ) [y_1t; y_2t] = [ .3  0 ; 0  .5 ] [u_1t; u_2t] + [ε_1t; ε_2t]

V[ε_1t; ε_2t] = [ 1  0 ; 0  .9 ]
The first step to obtain its definition in THD format consists of defining the input arguments to
str2thd. This can be done with the following MATLAB commands:

FR0 = [ 1 -.3; -.7 1 ];
FR1 = [-.4 NaN; .5 NaN];
G0 = [ .3 NaN; NaN .5];
v = [1.0 .9];

and afterwards, the THD form is obtained and displayed with:

[theta, din, lab] = str2thd([FR0 FR1],[],[],[],v,1,[G0],2);
prtmod(theta, din, lab);
These commands generate the following output:
*************************** Model ***************************
Structural model (innovations model)
2 endogenous v., 2 exogenous v.
Seasonality: 1
SS vector dimension: 2
Parameters (* denotes constrained parameter):
FR0(1,1)        1.0000
FR0(2,1)       -0.7000
FR0(1,2)       -0.3000
FR0(2,2)        1.0000
FR1(1,1)       -0.4000
FR1(2,1)        0.5000
G0(1,1)         0.3000
G0(2,2)         0.5000
V(1,1)          1.0000
V(2,2)          0.9000
*************************************************************
Example 3.5 (Defining VARMAX models). The model:

( [1 0; 0 1] + [-.8 0; -.5 -.7] B + [-.45 0; 0 0] B^2 ) ( [1 0; 0 1] + [-.2 0; 0 -.3] B^4 ) y_t =
    ( [1 0; 0 1] + [-.4 0; 0 -.3] B ) a_t ,   V(a_t) = [.9 0; 0 .9]

can be defined in THD format and displayed with the following MATLAB commands:
FR1 = [-.8 NaN; -.5 -.7];
FR2 = [-.45 NaN; NaN NaN];
FS1 = [-.2 NaN; NaN -.3];
AR1 = [-.4 NaN; NaN -.3];
v = [.9 .9];
[theta, din, lab] = arma2thd([FR1 FR2],[FS1],[AR1],[],v,4);
prtmod(theta, din, lab);
Note that an empty matrix, [], should appear where the model does not include the corresponding
structure. In this case there was no seasonal moving average factor. Finally, the output displayed by
prtmod is the following:
*************************** Model ***************************
VARMAX model (innovations model)
2 endogenous v., 0 exogenous v.
Seasonality: 4
SS vector dimension: 12
Parameters (* denotes constrained parameter):
FR1(1,1)       -0.8000
FR1(2,1)       -0.5000
FR1(2,2)       -0.7000
FR2(1,1)       -0.4500
FS1(1,1)       -0.2000
FS1(2,2)       -0.3000
AR1(1,1)       -0.4000
AR1(2,2)       -0.3000
V(1,1)          0.9000
V(2,2)          0.9000
*************************************************************
Example 3.6 (Defining transfer functions). Given the transfer function:

y_t = (.3 + .6B)/(1 - .5B) u_1t + (.3B + .4B^2) u_2t + .3B/(1 - .1B - .2B^2) u_3t
      + (1 - .8B)/(1 - .6B) ε_t ;  σ²_ε = 1
its definition in THD format is obtained as follows:
w1 = [ .3 .6 NaN]; d1 = [-.5 NaN];
w2 = [NaN .3 .4 ]; d2 = [NaN NaN];
w3 = [NaN .3 NaN]; d3 = [-.1 -.2];
fr = [-.6]; ar = [-.8];
v = [1.0];
[theta,din,lab] = tf2thd(fr,[],ar,[],v,1,[w1;w2;w3],[d1;d2;d3]);
prtmod(theta,din,lab);
and the corresponding prtmod output is:
*************************** Model ***************************
Transfer function model (innovations model)
1 endogenous v., 3 exogenous v.
Seasonality: 1
SS vector dimension: 5
Parameters (* denotes constrained parameter):
FR(1,1)        -0.6000
AR(1,1)        -0.8000
W1(1,1)         0.3000
W1(2,1)         0.6000
W2(2,1)         0.3000
W2(3,1)         0.4000
W3(2,1)         0.3000
D1(1,1)        -0.5000
D3(1,1)        -0.1000
D3(2,1)        -0.2000
V(1,1)          1.0000
*************************************************************
Defining composite models
Defining nested models in inputs
To obtain in E4 the THD formulation of a model nested in inputs, one should follow a three-step
procedure:

Step 1) Obtain the THD formulation of the models for the endogenous and exogenous
variables, using functions such as arma2thd and tf2thd.
Step 2) Feed the THD forms obtained to the function stackthd, which arranges the individual
THD formats into a single (“stacked”) THD description. The syntax is:
[theta, din, lab] = stackthd(t1, d1, t2, d2, l1, l2);
where the input arguments t1, d1, l1 and t2, d2, l2 are the THD forms obtained
in Step 1). If necessary, the output arguments can be fed again to stackthd, to continue
the stacking process recursively.
Step 3) Translate the stacked model to the final nested formulation using the function nest2thd,
whose syntax is:

[theta, din, lab] = nest2thd(theta, din, nestwhat, lab);

where the input arguments theta, din, lab are the final results of Step 2), and
nestwhat is a binary flag that chooses between nesting in inputs (if nestwhat is equal to
one) or in errors (if nestwhat is equal to zero).
Example 3.7 (Endogeneization of the exogenous variable in a transfer function). Given the
transfer function:

y_t = (.3 + .6B)/(1 - .5B) u_1t + (1 - .8B)/(1 - .6B) ε_t ;  σ²_ε = 1

where u_1t is such that (1 - .7B) u_1t = a_t ;  σ²_a = .3. In a standard analytic framework, obtaining
forecasts for y_t requires first forecasting the exogenous variable and afterwards feeding these
forecasts to the model. An endogeneized model is an effective way to: a) perform both steps as a
single operation and b) take into account the uncertainty affecting the forecasts for the input, which
is often ignored. The code required to follow Steps 1) to 3) is:
Step 1) Obtain the THD representation for both models:
w1 = [ .3 .6]; d1 = [-.5];
fr = [-.6]; ar = [-.8];
v = [1.0];
[t1,d1,l1] = tf2thd(fr,[],ar,[],v,1,[w1],[d1]);
[t2,d2,l2] = arma2thd(-.7,[],[],[],.3,1);
Step 2) Combine the model for the endogenous variable with the model for the input into a single
“stacked” THD representation:
[theta, din, lab] = stackthd(t1, d1, t2, d2, l1, l2);
Step 3) Translate the stacked model to the final nested formulation and display a description of its
structure:
[theta, din, lab] = nest2thd(theta, din, 1, lab);
prtmod(theta,din,lab);
The output from prtmod is:
*************************** Model ***************************
Nested model in inputs (innovations model)
2 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 3
Submodels:
{
  Transfer function model (innovations model)
  1 endogenous v., 1 exogenous v.
  Seasonality: 1
  SS vector dimension: 2
  Parameters (* denotes constrained parameter):
  FR(1,1)        -0.6000
  AR(1,1)        -0.8000
  W1(1,1)         0.3000
  W1(2,1)         0.6000
  D1(1,1)        -0.5000
  V(1,1)          1.0000
  --------------
  VARMAX model (innovations model)
  1 endogenous v., 0 exogenous v.
  Seasonality: 1
  SS vector dimension: 1
  Parameters (* denotes constrained parameter):
  FR1(1,1)       -0.7000
  V(1,1)          0.3000
  --------------
}
*************************************************************
Note that the resulting model has two endogenous variables and no exogenous variable.
Example 3.8 (Unit roots). Assume the following ARIMA(1,1,0) model:

(1 - .7B) ∇z_t = a_t ;  σ²_a = .2

where ∇ ≡ 1 - B. The standard procedure to deal with nonstationary processes like this consists
of eliminating the unit root by differencing the time series. For some applications (e.g., forecasting
the level of the time series, z_t, or interpolating missing values) it is more convenient to work directly
with the factored AR(2) model:

(1 - .7B)(1 - B) z_t = a_t
The THD format of this model is obtained and displayed with the following code:
[t1, d1, l1] = arma2thd([-1], [], [], [], 1, 1);
[t2, d2, l2] = arma2thd([-.7], [], [], [], .2, 1);
[ts, ds, ls] = stackthd(t1, d1, t2, d2, l1, l2);
[tn, dn, ln] = nest2thd(ts, ds, 0, ls);
prtmod(tn, dn, ln);
and the corresponding prtmod output is:
*************************** Model ***************************
Nested model in errors (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 2
Submodels:
{
  VARMAX model (innovations model)
  1 endogenous v., 0 exogenous v.
  Seasonality: 1
  SS vector dimension: 1
  Parameters (* denotes constrained parameter):
  FR1(1,1)       -1.0000
  --------------
  VARMAX model (innovations model)
  1 endogenous v., 0 exogenous v.
  Seasonality: 1
  SS vector dimension: 1
  Parameters (* denotes constrained parameter):
  FR1(1,1)       -0.7000
  V(1,1)          0.2000
  --------------
}
*************************************************************
where the only error variance relevant for nest2thd is that of the nested model.
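Multiplying the factors (1 − .7B)(1 − B) is a convolution of their coefficient vectors; the small sketch below (plain Python, illustrating the algebra rather than any E4 function) recovers the expanded AR(2) polynomial 1 − 1.7B + .7B².

```python
# The factored model of Example 3.8, (1-.7B)(1-B) z_t = a_t, expands to
# an AR(2): multiplying lag polynomials is a convolution of coefficients.

def polymul(p, q):
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# (1 - .7B)(1 - B) = 1 - 1.7B + .7B^2
print(polymul([1.0, -0.7], [1.0, -1.0]))
```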
Example 3.9 (Multiple seasonal factors). Consider a time series, z_t, observed once every ten days,
with the following structure:

∇_36 ∇_9 ∇_3 ∇ z_t = (1 - .6B^36)(1 - .7B^9)(1 - .8B^3)(1 - .9B) a_t ,  σ²_a = .2

where ∇_36 ≡ (1 - B^36), ∇_9 ≡ (1 - B^9) and ∇_3 ≡ (1 - B^3).
In this example, the THD representation can be obtained with:
[t1, d1, l1] = arma2thd([], [-1], [], [-.6], 1, 36);
[t2, d2, l2] = arma2thd([], [-1], [], [-.7], 1, 9);
[t3, d3, l3] = arma2thd([-1], [-1], [-.9], [-.8], .2, 3);
[ts1, ds1, ls1] = stackthd(t1, d1, t2, d2, l1, l2);
[ts2, ds2, ls2] = stackthd(ts1, ds1, t3, d3, ls1, l3);
[tn, dn, ln] = nest2thd(ts2, ds2, 0, ls2);
prtmod(tn, dn, ln);
and the corresponding output is:
*************************** Model ***************************
Nested model in errors (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 36
SS vector dimension: 49
Submodels:
{
  VARMAX model (innovations model)
  1 endogenous v., 0 exogenous v.
  Seasonality: 36
  SS vector dimension: 36
  Parameters (* denotes constrained parameter):
  FS1(1,1)       -1.0000
  AS1(1,1)       -0.6000
  --------------
  VARMAX model (innovations model)
  1 endogenous v., 0 exogenous v.
  Seasonality: 9
  SS vector dimension: 9
  Parameters (* denotes constrained parameter):
  FS1(1,1)       -1.0000
  AS1(1,1)       -0.7000
  --------------
  VARMAX model (innovations model)
  1 endogenous v., 0 exogenous v.
  Seasonality: 3
  SS vector dimension: 4
  Parameters (* denotes constrained parameter):
  FR1(1,1)       -1.0000
  FS1(1,1)       -1.0000
  AR1(1,1)       -0.9000
  AS1(1,1)       -0.8000
  V(1,1)          0.2000
  --------------
}
*************************************************************
Defining component models
A component model is defined in E4 following a three-step procedure, very similar to that described
for nested models. In fact, Steps 1) and 2) are identical. Step 3) is similar, replacing the call to
nest2thd with a similar call to comp2thd. The syntax of this function is:

[theta, din, lab] = comp2thd(ts, ds, ls);

where the input arguments ts, ds, ls are the stacked THD format of the models to be composed.
Example 3.10 (Definition of an AR(2) model with observation errors). Assume that y_t evolves
according to the following model:

(1 - .5B - .7B^2) y_t = a_t ;  σ²_a = 1.0
y*_t = y_t + v^y_t ;  σ²_v = 1.0

and the errors a_t and v^y_t are mutually independent white noise processes. The corresponding THD
format and the resulting output are:
[t1, d1, l1] = arma2thd([-.5 -.7], [], [], [], 1, 1);
[t2, d2, l2] = arma2thd([], [], [], [], 1, 1);
[ts, ds, ls] = stackthd(t1, d1, t2, d2, l1, l2);
[theta, din, lab] = comp2thd(ts, ds, ls);
prtmod(theta, din, lab);
*************************** Model ***************************
Components model
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 2
Submodels:
{
  VARMAX model (innovations model)
  1 endogenous v., 0 exogenous v.
  Seasonality: 1
  SS vector dimension: 2
  Parameters (* denotes constrained parameter):
  FR1(1,1)       -0.5000
  FR2(1,1)       -0.7000
  V(1,1)          1.0000
  --------------
  White noise model (innovations model)
  1 endogenous v., 0 exogenous v.
  Seasonality: 1
  SS vector dimension: 0
  Parameters (* denotes constrained parameter):
  V(1,1)          1.0000
  --------------
}
*************************************************************
Defining models with conditional heteroscedastic errors
Formulation of models with conditional heteroscedastic errors in THD format is similar to that of
composite models. First, it is necessary to obtain the THD formulation of: a) a VARMAX or
transfer function model for the mean and b) a VARMAX model equivalent to the ARCH, GARCH
or IGARCH structure desired. The full model can then be defined using the garc2thd function,
which has the following syntax:
[theta, din, lab] = garc2thd(t1, d1, t2, d2, lab1, lab2);
where t1-d1 is the THD format associated with the model for the mean, t2-d2 is the THD format
associated with the VARMAX model for the variance, and lab1 and lab2 are optional parameters
with labels for the parameters in t1 and t2, respectively.
Example 3.11 (Defining models with GARCH errors). Consider the following ARMA(2,1) model
with GARCH(1,1) errors, in conventional notation:

y_t = (1 - .8B)/(1 - .7B + .3B^2) ε_t ;  ε_t ~ iid(0, .01) ;  ε_t | Ω_{t-1} ~ iid(0, h²_t) ;
h²_t = .002 + .1 ε²_{t-1} + .7 h²_{t-1}

which, in the ARMA representation supported by E4, becomes:

y_t = (1 - .8B)/(1 - .7B + .3B^2) ε_t , such that:  ε²_t = .01 + η_t ,  (1 - .8B) η_t = (1 - .7B) v_t
see Example 2.5. The following code defines and displays the model structure:
% Model for the mean
[t1, d1, lab1] = arma2thd([-.7 .3], [], [-.8], [], [.01], 1);

% Model for the conditional variance
[t2, d2, lab2] = arma2thd([-.8], [], [-.7], [], [.01], 1);

% Full model
[theta, din, lab] = garc2thd(t1, d1, t2, d2, lab1, lab2);
prtmod(theta, din, lab);
generating the output:
*************************** Model ***************************
GARCH model (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 2
Endogenous variables model:
  VARMAX model (innovations model)
  1 endogenous v., 0 exogenous v.
  Seasonality: 1
  SS vector dimension: 2
  Parameters (* denotes constrained parameter):
  FR1(1,1)       -0.7000
  FR2(1,1)        0.3000
  AR1(1,1)       -0.8000
  V(1,1)          0.0100
  --------------
GARCH model of noise:
  VARMAX model (innovations model)
  1 endogenous v., 0 exogenous v.
  Seasonality: 1
  SS vector dimension: 1
  Parameters (* denotes constrained parameter):
  FR1(1,1)       -0.8000
  AR1(1,1)       -0.7000
  --------------
*************************************************************
Assume now the same model for the mean and an IGARCH(1,1) for the conditional variance:

h²_t = .002 + .3 ε²_{t-1} + .7 h²_{t-1}

which in ARMA form can be written as:

ε²_t = .01 + η_t , with:  (1 - B) η_t = (1 - .7B) v_t
The following commands define the IGARCH structure by constraining the autoregressive
parameter to unity:
% Model for the mean
[t1, d1, lab1] = arma2thd([-.7 .3], [], [-.8], [], [.01], 1);

% Model for the conditional variance
[t2, d2, lab2] = arma2thd([-1], [], [-.7], [], [.01], 1);

% Full model
[theta, din, lab] = garc2thd(t1, d1, t2, d2, lab1, lab2);
theta1 = [theta zeros(size(theta))];
theta1(5,2) = 1;
prtmod(theta1, din, lab);
and the output from prtmod is:
*************************** Model ***************************
GARCH model (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 1
SS vector dimension: 2
Endogenous variables model:
  VARMAX model (innovations model)
  1 endogenous v., 0 exogenous v.
  Seasonality: 1
  SS vector dimension: 2
  Parameters (* denotes constrained parameter):
  FR1(1,1)       -0.7000
  FR2(1,1)        0.3000
  AR1(1,1)       -0.8000
  V(1,1)          0.0100
  --------------
GARCH model of noise:
  VARMAX model (innovations model)
  1 endogenous v., 0 exogenous v.
  Seasonality: 1
  SS vector dimension: 1
  Parameters (* denotes constrained parameter):
  FR1(1,1)  *    -1.0000
  AR1(1,1)       -0.7000
  --------------
*************************************************************
Converting THD models to state-space representation
The conversion of THD models to SS representation is done by function thd2ss, which uses a
THD formulation as argument and returns the SS formulation matrices 00, , E, H, D, C, Q, S and
R of (2.1)-(2.4). Due to its particular nature, models with GARCH errors are supported by an
specific function, garch2ss. The general calls to thd2ss and garch2ss are:
[Phi, Gam, E, H, D, C, Q, S, R] = thd2ss(theta, din)
[Phi,Gam,E,H,D,C,Q,Phig,Gamg,Eg,Hg,Dg] = garch2ss(theta, din)
The function garch2ss returns seven matrices (Phi, Gam, E, H, D, C and Q) corresponding to the
model for the mean, and five matrices (Phig, Gamg, Eg, Hg and Dg) corresponding to the model for
the conditional variance. Note that the matrices R and S are not returned, since they are the same as
Q in this class of models. For further details, on these functions, see Chapter 8.
Converting THD models to the standard representation
Once a model is estimated, it is sometimes convenient to obtain again the matrices characterizing its
standard representation. To do this, E4 includes three functions, thd2str, thd2arma and thd2tf,
that translate a THD definition into the equivalent reduced form of the model. The general syntax of
these functions is:
[F, A, V, G] = thd2str(theta, din)
[F, A, V, G] = thd2arma(theta, din)
[F, A, V, W, D] = thd2tf(theta, din)
The main differences between the output arguments of these functions and the input arguments of
their counterparts (arma2thd, str2thd and tf2thd) are:
1) The elements with fixed values in the formulation are also returned (e.g., identity matrix of a
VARMAX model).
2) If the model includes seasonal factors, the matrices returned are the product of the regular and
seasonal factors.
These functions do not work with SS models or composite models.
4 Model estimation
After defining a model structure in THD format, many analyses require estimates of its
unknown parameters and the standard deviations of these estimates. This chapter describes the E4
functions that deal with these issues.
The first Section describes how to set or modify the default options and tolerances that affect
likelihood evaluation and optimization. The second Section is concerned with the functions required
to evaluate the log-likelihood function for all the models supported. The third Section deals with
computing the analytical gradient and the information matrix. The fourth Section discusses the
numerical optimization algorithm employed and the fifth section describes how the estimation results
can be combined in a summary report. A final Section illustrates the use of all these functions with
several examples.
Modification of toolbox options
The values of the general toolbox options are stored in an internal vector, E4OPTION, created by the
e4init command, see Chapter 1 and Appendix B. These options can be modified using the
function sete4opt, which allows three different calls:
1) sete4opt, without any argument, restores the default options and lists the E4OPTION vector.
2) sete4opt('show') shows current options. If the function is called with this argument, no
other argument should be included.
3) sete4opt(option, value, ...), where the argument option stands for the name of the
option to be modified, and value stands for the new choice. In this case, the following rules
apply:
- option must be a character string, enclosed by quotes. It is enough to indicate the first three
letters.
- value may be a character string, enclosed by quotes, or a numeric value. If it is a character
string, it is enough to indicate its first three letters.
- A single call may contain several option-value pairs, up to a maximum of ten.
The different options and their admissible values are summarized in the following table.
Option        Description                                        Possible values
------------  -------------------------------------------------  ----------------------
Options that control the estimation process
'filter'      Selects the filter used in the evaluation          'kalman'†,
              of the likelihood function                         'chandrasekhar'
'scale'       Scales matrices when computing their               'no'†, 'yes'
              Cholesky decomposition during filtering
'econd'       Selects the criterion to compute the initial       'iu', 'au', 'ml',
              value of the state vector                          'zero', 'auto'†
'vcond'       Selects the criterion to compute the covariance    'lyapunov', 'zero',
              of the initial state vector                        'idejong'†
'var'         Selects between estimation of the covariance       'variance'†, 'factor'
              matrix or estimation of its Cholesky factor
Options that control the behaviour of e4min
'algorithm'   Chooses the optimization algorithm                 'bfgs'†, 'newton'
'step'        Maximum step length during optimization            0.1‡
'tolerance'   Stop criteria tolerance                            1.0e-5‡
'maxiter'     Maximum number of iterations                       75‡
'verbose'     Displays output at each iteration                  'yes'†, 'no'

† Default option.
‡ This is the default value. Other reasonable values are admissible.
Example 4.1. When issued after e4init, the command:
sete4opt('show');
displays the default options:
*********************** Options set by user ***********************
Filter. . . . . . . . . . . . . : KALMAN
Scaled B and M matrices . . . . : NO
Initial state vector. . . . . . : AUTOMATIC SELECTION
Initial covariance of state v.  : IDEJONG
Variance or Cholesky factor? .  : VARIANCE
Optimization algorithm. . . . . : BFGS
Maximum step length . . . . . . : 0.100000
Stop tolerance. . . . . . . . . : 0.000010
Max. number of iterations . . . : 75
Verbose iterations. . . . . . . : YES
****************************************************************
and the code:
sete4opt('filt','chandra','vco','lyapunov','eco','ml','step',0.5);
selects the Chandrasekhar filter, sets the initial conditions for the covariance matrix of the Kalman
filter at the solution of the corresponding Lyapunov equation and adjusts the maximum step length
to .5. The corresponding output is:
************** The following options are modified **************
Filter. . . . . . . . . . . . . : CHANDRA
Initial covariance of state v.  : LYAPUNOV
Initial state vector. . . . . . : ML
Maximum step length . . . . . . : 0.500000
****************************************************************
Evaluation of the likelihood function
Models with homoscedastic errors
Starting from a model in THD format and a sample, the functions lfmod, lffast and lfmiss
compute the value of the gaussian log-likelihood function for any model with homoscedastic errors.
Their syntax is:
[l, innov, ssvect] = lfmod(theta, din, z)
[l, innov, ssvect] = lfmiss(theta, din, z)
[l, innov, ssvect] = lffast(theta, din, z)
The functions lfmod and lfmiss compute the log-likelihood of a standard sample and a sample
with missing values, respectively. Technical details about how they work are given in Terceiro
(1990, Chapter 4). The function lffast is a faster version of lfmod because it takes advantage of
the innovations structure of many econometric models; see Casals, Sotoca and Jerez (1999).
The input arguments of these functions are a THD format specification, theta, din, and the data
matrix z. Internally, each of these functions formulates the model in SS form with thd2ss and then
computes the value of the following output arguments:
1) l, a scalar that contains the value of the log-likelihood function in theta,
2) innov, a matrix of one-step-ahead forecast errors, defined as:

z̃_{t|t-1} = z_t − H x̂_{t|t-1} − D u_t

where x̂_{t|t-1} is an estimate of the state vector at t, conditional on the information available up to
t-1.
3) and ssvect, a matrix of estimates of the state variables. Its t-th row contains the filtered
estimate of the state vector at time t, conditional on the information available up to t-1:

x̂_{t+1|t} = Φ x̂_{t|t-1} + Γ u_t + K_t z̃_{t|t-1}
For a detailed reference on these functions, see Chapter 8.
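The prediction-error decomposition that these functions implement can be illustrated outside E4. The following sketch (plain Python, scalar state and observation, hypothetical function name; not E4 code) accumulates the gaussian log-likelihood from the innovations of a simple Kalman filter:

```python
import math

def kalman_loglik(y, Phi, H, Q, R, x0, P0):
    """Gaussian log-likelihood of a scalar state-space model
    x[t+1] = Phi*x[t] + w (Var w = Q), z[t] = H*x[t] + v (Var v = R),
    evaluated via the prediction-error decomposition."""
    x, P, ll = x0, P0, 0.0
    innov = []
    for z in y:
        B = H * P * H + R                # innovation variance
        e = z - H * x                    # innovation (one-step forecast error)
        ll += -0.5 * (math.log(2 * math.pi * B) + e * e / B)
        K = Phi * P * H / B              # Kalman gain (predictor form)
        x = Phi * x + K * e              # predicted state x[t+1|t]
        P = Phi * P * Phi + Q - K * B * K
        innov.append(e)
    return ll, innov
```

With Phi = 0, H = 1, Q = 0 and R = 1 the model reduces to white noise and the routine returns the usual N(0,1) log-density sum, which provides a quick sanity check.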
Models with GARCH errors
The gaussian log-likelihood of models with GARCH errors is computed by the function lfgarch,
whose synopsis is:
[l, innov, hominnov, ssvect] = lfgarch(theta, din, z)
where all the arguments are the same as those of lfmod except hominnov, which is a matrix of
residuals standardized with the square root of the conditional variances.
Initial conditions
To compute the exact log-likelihood function of a SS model it is necessary to define adequate initial
values for the state vector (x1) and its covariance matrix (P1). The selection of these values, see
Casals and Sotoca (1997), depends on two characteristics of the model:
1) Whether or not the model is stationary. A model is said to be: a) totally stationary, when all the
roots of Φ have modulus less than one; b) totally nonstationary, if all the roots of Φ have
modulus greater than or equal to one; and c) partially nonstationary, if some roots of Φ have
modulus greater than or equal to one and others less.
2) Whether or not there are exogenous variables (u_t) in the model, and their stochastic or
deterministic nature.
Adequate initial conditions for each model are chosen automatically. The default options can be
manually overridden using the function sete4opt. The options related to this issue are
'econd', which sets the initial condition for the state vector, and 'vcond', which does the same
for its covariance matrix. Admissible values for 'econd' are: 'auto', 'zero', 'iu', 'au' and
'ml'. Admissible values for 'vcond' are 'zero', 'lyap' and 'idej'. The next table
summarizes the adequate options for each case.
Type of model                              econd           vcond
Totally stationary
  Without exogenous variables              zero            lyap
  With exogenous variables
    Deterministic                          iu, au          lyap
    Stochastic                             ml              lyap
Totally nonstationary
  Without exogenous variables              zero            idej
  With exogenous variables
    Deterministic                          zero, iu, au    idej
    Stochastic                                             idej
Partially nonstationary
  Without exogenous variables              zero            idej
  With exogenous variables
    Deterministic                          iu, au          idej
    Stochastic                             ml              idej
Models with GARCH errors
  Without exogenous variables†             zero            idej‡
  With exogenous variables†
    Deterministic                          iu, au          idej‡
    Stochastic                             ml              idej‡

† In the model for the mean
‡ Denotes a fixed option, not modifiable by the user.
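For a stationary model, the 'lyap' option initializes the state covariance at the solution of a discrete Lyapunov equation, P1 = Φ P1 Φ' + E Q E'. A minimal scalar illustration (plain Python, not an E4 function) solves it by fixed-point iteration; for a scalar the closed form is q / (1 − φ²):

```python
def lyapunov_scalar(phi, q, tol=1e-12):
    """Solve P = phi*P*phi + q for a scalar stationary model (|phi| < 1)
    by fixed-point iteration on the Lyapunov equation."""
    assert abs(phi) < 1, "stationarity is required for this initial condition"
    P = 0.0
    while True:
        P_new = phi * P * phi + q        # one Lyapunov iteration
        if abs(P_new - P) < tol:
            return P_new
        P = P_new
```

The iteration converges because, under stationarity, each pass shrinks the error by the factor φ².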
Computation of the gradient and the information matrix
The exact gradient and information matrix of the log-likelihood function are relevant both to the
iterative estimation process and to many inference procedures; see Engle (1984).
Evaluation of the analytical gradient
The general syntax for the E4 functions dealing with gradient computation is:
g = gmod(theta, din, z)
g = gmiss(theta, din, z)
g = ggarch(theta, din, z)
As suggested by their names, gmod computes the derivatives of lfmod and lffast, gmiss
computes the derivatives of lfmiss and ggarch computes the derivatives of lfgarch. The input
arguments are the same as those of lfmod, and the output argument, g, is a vector containing the
analytical derivatives of the corresponding log-likelihood evaluated at theta.
Details about the gradient of the log-likelihood are given in Terceiro (1990, Appendix B). For a
complete reference on the use of these functions, see Chapter 8.
Evaluation of the exact information matrix
As in the case of the gradient, there are three functions dealing with computation of the information
matrix. They are:
[std, corrm, varm, Im] = imod(theta, din, z, aprox)
[std, corrm, varm, Im] = imiss(theta, din, z, aprox)
[std, corrm, varm, Im] = igarch(theta, din, z)
The function imod computes the information matrix of lfmod and lffast, imiss computes the
information matrix of lfmiss and igarch does the same for lfgarch. The input arguments are in
general the same as those of lfmod, with one exception: imod and imiss receive an additional
input argument, aprox, which indicates whether the calculations should be exact or approximate,
the latter option being computationally more efficient; see Watson and Engle (1983).
The output arguments are std, the standard deviation of the values in theta, corrm, the
correlation matrix between these parameters, varm, which is the corresponding covariance matrix
and Im, which is the exact information matrix in the case of imod and imiss, see Terceiro (1990,
Appendices C and D) and Terceiro (1999). For a complete reference on the use of these functions,
see Chapter 8.
Quasi-maximum likelihood estimation
If the model is misspecified or its errors are nonnormal, optimization of the log-likelihood function
provides consistent (but not efficient) estimates for the parameters. In this case, we speak of quasi-
maximum likelihood estimates. This situation has an even more important consequence, as the
standard errors computed by imod and imiss are no longer adequate.
Ljung and Caines (1979) and White (1982) propose an analytical approximation to the information
matrix that is robust to these specification errors. It can be computed using the function imodg:
[std, stdg, corrm, corrmg, varm, varmg, Im] = ...
    imodg(theta, din, z, aprox)
where the input arguments and the first four output arguments are the same as those of imod. The
additional outputs stdg, corrmg and varmg are the quasi-maximum likelihood values.
Numerical optimization
Except in very simple formulations, the first-order conditions of a maximum likelihood problem are
a complex set of nonlinear equations. Therefore, their solution requires an iterative algorithm, see
Dennis and Schnabel (1983). The general iteration of an unconstrained numerical optimization
algorithm is:
θ^(i+1) = θ^(i) − ρ^(i) W^(i) g^(i)

where θ^(i) is the vector of parameters to estimate at the i-th iteration, W^(i) is a matrix that describes
the curvature of the function (often the inverse of the hessian or an approximation to it), g^(i) is the
gradient of the objective function evaluated at θ^(i), and ρ^(i) is a scalar that determines the step length
in the W^(i) g^(i) direction.
Different choices for the components of the previous expression characterize each specific
implementation. Also, it is necessary to define a criterion to stop the iterative process.
General use of e4min
The E4 function e4min implements a numerical optimization procedure, based on the techniques
described by Dennis and Schnabel (1983). It includes two main optimization algorithms, BFGS
(Broyden-Fletcher-Goldfarb-Shanno) and Newton-Raphson. MATLAB functions fmin, fmins and
fminu can be used instead of e4min. However, e4min has been carefully designed and tuned to
solve likelihood optimization problems and, in most cases, it should be more reliable and robust for
this specific use.
The general synopsis of e4min is:
[pnew,iter,fnew,gnew,hessin]=e4min(func,theta,dfunc,P1,P2,P3,P4,P5)
The operation of e4min is the following. Starting from an initial estimate of the parameters in
theta, the algorithm iterates on the objective function func, using a Newton-Raphson or BFGS
search direction. The iteration can be based on the analytical gradient or on a numerical
approximation, depending on whether dfunc contains the name of the analytical gradient function
or is an empty string ''. The step length is computed with the e4lnsrch function. Finally, the stop
criterion takes into account the relative changes in the values of the parameters and/or the size of the
gradient vector.
Parameters P1, ..., P5 are optional and, if specified, are fed to the objective function without
modifications. In the context of E4 the first parameter, P1, may have the name of a user model, see
Chapter 6.
The function sete4opt manages different options which affect the optimizer, including the
algorithm to use, the maximum step length at each iteration, the tolerance for stop criteria and the
maximum number of iterations allowed.
Once the process is stopped, either because convergence is reached or because of other causes
(exceeding the maximum number of iterations or bad conditioning of the objective function), the
function returns the following values: pnew, the value of the parameters; iter, the number of
iterations performed; fnew, the value of the objective function at pnew; gnew, the analytical
or numerical gradient, depending on the contents of dfunc; and finally hessin, a numerical
approximation to the hessian.
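The internals of e4lnsrch are not documented here, but a common way to compute the step length is backtracking under a sufficient-decrease (Armijo) condition. A scalar Python sketch of that generic technique (an assumption about the approach, not e4lnsrch's actual code):

```python
def backtrack(f, theta, g, direction, rho=1.0, beta=0.5, c=1e-4):
    """Generic Armijo backtracking: shrink the trial step until the
    objective decreases sufficiently along the search direction."""
    f0 = f(theta)
    slope = g * direction                # directional derivative (scalars here)
    while f(theta + rho * direction) > f0 + c * rho * slope:
        rho *= beta                      # halve the step and retry
    return rho
```

For f(θ) = θ² starting at θ = 1 with the steepest-descent direction −2, the full step overshoots and one halving is accepted.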
Scaling problems
When the parameters to be estimated have very different values, e.g., if they range from 10^-4 to 10^4,
the number of iterations required increases and the numerical precision of the solution can be poor.
This problem often affects the variances of the errors, which may be several orders of magnitude
different from the rest of the parameters.
There are two main solutions for this problem. First, the data can be scaled so that the values of all
the parameters fall in a relatively small range, say from 10^-2 to 10^2. The function e4preest can be
used to quickly test different scaling factors. If the scaling problem happens because the variances
are too small, another possibility consists of setting the option 'var' to the value 'factor' with
sete4opt. By so doing, optimization is done with respect to the Cholesky factors of the covariance
matrix, instead of the matrix itself. Besides improving the scale of the variances, in most cases this
choice has the advantage that the covariance matrices are then constrained to be positive definite.
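The effect of the 'factor' option can be seen in a small example: if the optimizer works on the elements of a Cholesky factor L, the implied covariance Σ = L L' is positive (semi)definite by construction, so no explicit constraint is needed. A 2×2 Python illustration (not E4 code):

```python
def chol_to_cov(l11, l21, l22):
    """Map free Cholesky parameters to a 2x2 covariance Sigma = L*L',
    with L lower triangular. Any real (l11, l21, l22) with nonzero
    diagonal yields a positive definite Sigma."""
    return [[l11 * l11,            l11 * l21],
            [l21 * l11, l21 * l21 + l22 * l22]]
```

Whatever values the optimizer proposes for the three free parameters, the resulting matrix has a positive determinant, which is exactly the property the text describes.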
Preliminary estimates
If the initial value of theta is far from the optimum, the cost of the iterative process can be very
large. It is then desirable to have a preliminary estimation algorithm which provides initial estimates
with a reasonable computational burden.
The function e4preest computes consistent estimates of the parameters in theta. In most cases,
these are adequate starting values for likelihood optimization with e4min. The syntax for calling
this function is:
theta2 = e4preest(theta, din, z)
where the input arguments are identical to those of lfmod. The estimates are returned in theta2.
The operation of e4preest is the following. It first obtains a subspace representation of the system
and then computes estimates of its parameters by solving a nonlinear least squares problem, whose
computational load does not depend on the size of the sample. Hence, this method is very efficient
when processing large samples. For more details on subspace methods, see Viberg (1995) and
Casals (1997).
Displaying the estimation results
After obtaining maximum likelihood estimates for the parameters of a model and (optionally)
computing their standard errors, it is useful to obtain a summary of all these results. To this end, E4
includes the function prtest, which allows a reduced call:
prtest(theta, din, lab, z, it, lval, g, h)
where the input arguments are: the model specification given by theta-din-lab, the data matrix,
z, the number of iterations, it, the value of the log-likelihood, lval, its gradient, g and its hessian
h. All these arguments except din, lab and z, are output values of e4min.
A more complex call is:
prtest(theta, din, lab, z, it, lval, g, h, std, corrm, t)
where the additional arguments std and corrm should be obtained with one of the functions dealing
with the computation of the information matrix. The last argument, t, is the total computing time in
minutes. It should be computed by the user using the MATLAB commands tic and toc.
The first syntax does not require the computation of an analytical information matrix and, therefore,
it is useful to obtain a quick first impression of the model adequacy. The omitted input arguments,
std and corrm, are replaced internally by an approximation to the covariance matrix, obtained by
inverting the hessian of the log-likelihood function at the optimum. This method is very fast, but its
results should be taken cautiously.
Finally, a third valid call is:
prtest(theta, din, lab, z, it, lval, g, h, [], [], t)
which does not require results from the information matrix but displays the elapsed computing time.
Examples
The examples in this Section use simulated data generated with the E4 function simmod. Its use is
described in Chapter 5. See also the corresponding reference in Chapter 8.
Example 4.2 (Simulation and estimation of an ARMA model). Consider the model:
z_t = (1 − .7B)(1 − .5B^12) a_t,    V[a_t] = .1
The following code:
1) obtains the corresponding THD format,
2) simulates 250 observations of the model, discarding the first 50 samples,
3) computes preliminary estimates with e4preest, which are displayed using prtmod, and
4) computes maximum likelihood estimates using the numerical gradient:
[theta, din, lab] = arma2thd([], [], [-.7], [-.5], .1, 12);
z = simmod(theta, din, 250); z = z(51:250,1);
theta = e4preest(theta, din, z);
prtmod(theta, din, lab)
[thopt, it, lval, g, h] = e4min('lffast', theta, '', din, z);
To use analytical derivatives in the optimization process, replace the last line with:
[thopt, it, lval, g, h] = e4min('lffast', theta,'gmod', din, z);
The estimation results can be displayed using numerical standard errors with:
prtest(thopt, din, lab, z, it, lval, g, h)
or using analytical standard errors (under normality) with:
[std, corrm, varm, Im] = imod(thopt, din, z);
prtest(thopt, din, lab, z, it, lval, g, h, std, corrm)
Finally, if the normality assumption is doubtful, one can display the results using robust standard
errors:
[std, stdg, corrm, corrmg] = imodg(thopt, din, z);
prtest(thopt, din, lab, z, it, lval, g, h, stdg, corrmg)
Example 4.3 (Simulation and estimation of an ARMA model with GARCH errors). Consider
the ARMA(2,1) model with GARCH(1,1) errors that we used in Example 3.11:

y_t = [(1 − .8B) / (1 − .7B + .3B^2)] ε_t

such that:

σ²_t = .01 + ν_t,    (1 − .8B) ν_t = (1 − .7B) v_t
The following code defines the model structure, simulates a sample, computes the maximum-
likelihood estimates of the parameters, the analytical standard errors and displays the results:
[t1, d1, lab1] = arma2thd([-.7 .3], [], [-.8], [], [.01], 1);
[t2, d2, lab2] = arma2thd([-.8], [], [-.7], [], [.01], 1);
[theta, din, lab] = garc2thd(t1, d1, t2, d2, lab1, lab2);
z = simgarch(theta, din, 450); z = z(51:400,1);
theta = e4preest(theta, din, z);
prtmod(theta, din, lab);
[thopt, it, lval, g, h] = e4min('lfgarch', theta, '', din, z);
[std, corrm, varm, Im] = igarch(thopt, din, z);
prtest(thopt, din, lab, z, it, lval, g, h, std, corrm)
In GARCH modeling the samples are usually large and, therefore, using the analytical gradient in
the iteration process is very expensive. However, to assure optimality one may want to evaluate it
after convergence. This can be done with the command:
g = ggarch(thopt, din, z)
5 Specification, forecasting, simulation and smoothing
Despite its focus on model estimation, E4 includes several functions implementing standard methods
for model building and model validation. These functions are described in the first Section of this
Chapter. The second Section reviews the forecasting techniques implemented in the toolbox. The
third Section describes the functions available for model simulation. Finally, the fourth and fifth
Sections deal with smoothing.
Tools for time series analysis
There are three groups of functions for time series analysis: a) general purpose functions, including
several standard graphs and descriptive statistics, b) data transformations and c) tools for model
specification and validation. This section provides a brief discussion of the functions in each group,
see Chapter 8 for a detailed reference on these functions.
General purpose functions
The general purpose functions include: plotsers, which plots a centered and standardized time
series; histsers, which shows the histogram of a time series; rmedser, which displays a scaled
plot of sample means versus sample standard deviations; plotqqs, which plots the quantile graph
under normality; and descser, which presents a table of descriptive statistics. The general syntax
of these functions is:
ystd = plotsers(y, mode, lab)
freqs = histsers(y, lab)
[med, dts] = rmedser(y, len, lab)
[nq, yq] = plotqqs(y, lab)
stats = descser(y, lab)
The common input arguments are: y, an N×m matrix which contains m series of N observations each,
and lab, a matrix of characters which contains in each row an optional descriptive title for each
series. In addition:
1) The function plotsers allows an optional input argument, mode, which selects the display
format.
2) The function rmedser allows an optional input argument, len, which defines the number of
observations to be used in computing sample means and standard deviations.
Further details about these functions can be consulted in Chapter 8.
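As an illustration of the kind of summary descser produces, the following Python sketch (not E4 code) computes mean, standard deviation, skewness and excess kurtosis of a single series from its central moments:

```python
import math

def describe(y):
    """Basic descriptive statistics in the spirit of descser:
    mean, standard deviation, skewness and excess kurtosis."""
    n = len(y)
    mean = sum(y) / n
    m2 = sum((v - mean) ** 2 for v in y) / n      # central moments
    m3 = sum((v - mean) ** 3 for v in y) / n
    m4 = sum((v - mean) ** 4 for v in y) / n
    std = math.sqrt(m2)
    return {'mean': mean, 'std': std,
            'skewness': m3 / std ** 3 if std else 0.0,
            'kurtosis': m4 / m2 ** 2 - 3 if m2 else 0.0}
```

For gaussian white noise, skewness and excess kurtosis should both be close to zero, which is what such a table is typically inspected for.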
Example 5.1 (Statistical analysis of gaussian white noise). The following commands simulate
(150×2) draws from a N(0,1) distribution, eliminate the first 50 values, define titles for both series
and then call the different general purpose functions:
y = randn(150,2); y = y(51:150,:);
lab = ['noise #1'; 'noise #2'];
ystd = plotsers(y, -1, lab);
ystd = plotsers(y, 1, lab);
freqs = histsers(y, lab);
[med, dts] = rmedser(y, 10, lab);
[nq, yq] = plotqqs(y, lab);
stats = descser(y, lab);
Data transformations
In time series analysis it is common to transform the data before model specification. E4 includes two
data transformation functions: lagser, which returns a series lagged or led by a specified number of
periods, and transdif, which computes the Box-Cox (1964) transformation and the seasonal and
regular differences of a time series.
The syntax of lagser is:
[yl , ys] = lagser(y, ll)
where y is an n×k data matrix and ll is a 1×l vector containing the list of lags (positive numbers)
and leads (negative numbers) applicable to all the series. The function returns yl, which contains
the lagged/led variables, and optionally ys, an nl×k data matrix (nl=n-maxlag+maxlead) which
contains the original variables resized to be conformable with yl.
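The trimming logic described above can be illustrated in Python (a single-series analogue of lagser, not E4 code): positive entries shift the series back, negative entries shift it forward, and all columns are cut to the common sample:

```python
def lagser_like(y, lags):
    """Build lagged (positive) and led (negative) copies of a series,
    trimming rows so all columns are conformable."""
    maxlag = max([l for l in lags if l > 0], default=0)
    maxlead = -min([l for l in lags if l < 0], default=0)
    rows = range(maxlag, len(y) - maxlead)      # common usable sample
    cols = [[y[t - l] for t in rows] for l in lags]
    ys = [y[t] for t in rows]                   # original series, resized
    return cols, ys
```

For y = (1, 2, 3, 4, 5) with one lag and one lead, a row is lost at each end and three conformable observations remain.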
As for the differencing and Box-Cox transformation, the syntax of transdif is:
z = transdif(y, lambda, d, ds, s);
where the input arguments are: y, a matrix whose columns correspond to the different series to be
transformed, lambda, the parameter of the Box-Cox transformation, d, the order of regular
differencing, ds, a S×1 vector containing the orders of seasonal differencing (default ds=0) and s,
a S×1 vector containing the lengths of the seasonal periods (default s=1). The last two parameters
are optional and can be omitted if seasonal differences are not required.
The output argument is the differenced and transformed series z, such that:

z_t = ∇^d ∇_s1^ds1 … ∇_sS^dsS y_t^(λ) = (1 − B)^d (1 − B^s1)^ds1 … (1 − B^sS)^dsS y_t^(λ)    (5.1)

where s = (s1, s2, …, sS), ∇_s is the difference operator of order s, such that ∇_s y_t = y_t − y_{t−s}
for any sequence y_t, and y_t^(λ) is defined as:

y_t^(λ) = ln(y_t + µ)               if λ = 0
y_t^(λ) = [(y_t + µ)^λ − 1] / λ     if λ ≠ 0    (5.2)

with µ being null if all the values of y_t are strictly positive, and equal to −min(y_t) + 10^-5
otherwise.
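The transformation in (5.1)-(5.2) can be sketched in Python for a single series, a scalar λ and one seasonal period (an illustration, not the transdif implementation):

```python
import math

def transdif_like(y, lam, d=0, ds=0, s=1):
    """Box-Cox transform followed by d regular and ds seasonal
    differences of period s, for a single series."""
    mu = 0.0 if min(y) > 0 else -min(y) + 1e-5        # shift as in (5.2)
    z = [math.log(v + mu) if lam == 0 else ((v + mu) ** lam - 1) / lam
         for v in y]
    for _ in range(d):                                 # regular differences
        z = [z[t] - z[t - 1] for t in range(1, len(z))]
    for _ in range(ds):                                # seasonal differences
        z = [z[t] - z[t - s] for t in range(s, len(z))]
    return z
```

For example, λ = 0 and d = 1 turn a series growing at a constant rate into an approximately constant one, since the first difference of the logarithm is the growth rate.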
Tools for model specification and validation
Regarding model specification and validation, E4 includes four functions: augdft, which computes
the augmented Dickey-Fuller (1981) test for unit roots, see Hamilton (1994, Chapter 17); uidents,
which computes the univariate simple and partial autocorrelation functions of a series and plots the
results; midents, which computes the analogous multivariate specification statistics, that is, the
multiple autocorrelation function and the partial autoregression matrices; and residual, which
computes the residuals and smoothed error estimates of a model.
Any call to augdft has the following structure:
[adft] = augdft(y, p, trend);
The input arguments are: y, a matrix with N observations of m variables; p, the number of lags in
the unit root regression; and trend, an optional parameter to allow for a deterministic time trend (if
trend=1). When an output argument adft is specified, the function does not display the results,
but stores them in adft.
The syntax for calling uidents is:
[acf, pacf, Qs] = uidents(y, lag, tit)
and for midents is:
[acf, prcf, Qus] = midents(y, lag, tit)
In both cases the input arguments are: y, an N×m matrix which contains m series of N observations
each; lag, the maximum lag for computing the values of the autocorrelation functions; and tit, a
matrix of characters which contains a descriptive title for each series. The output arguments are
the values of the empirical autocorrelation functions and the single or multiple Box-Ljung Q
statistic.
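The statistics returned by uidents can be illustrated with a short Python sketch (not E4 code) that computes the sample autocorrelations and the univariate Box-Ljung Q statistic:

```python
def acf(y, maxlag):
    """Sample autocorrelations r_1 .. r_maxlag."""
    n = len(y)
    mean = sum(y) / n
    c0 = sum((v - mean) ** 2 for v in y) / n
    return [sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n))
            / (n * c0)
            for k in range(1, maxlag + 1)]

def ljung_box(y, maxlag):
    """Box-Ljung Q statistic on the first maxlag autocorrelations:
    Q = n*(n+2) * sum_k r_k^2 / (n - k)."""
    n = len(y)
    return n * (n + 2) * sum(r * r / (n - k)
                             for k, r in enumerate(acf(y, maxlag), start=1))
```

Under the white-noise hypothesis Q is approximately chi-squared with maxlag degrees of freedom, which is how the statistic is used for validation.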
The function residual computes the residuals of a model and is used mainly for validation. Any
call to this function has the following format:
[z1, vT, wT, vz1, vvT, vwT] = residual(theta, din, z, stand)
The input arguments are a THD format specification (theta, din) and a data matrix (z). The
optional parameter stand selects between standardized (stand=1) or ordinary values (stand=0 or
argument omitted).
The output arguments are z1, a matrix of residuals; vT, a matrix of smoothed observation errors;
wT, a matrix of smoothed state errors; vz1, a matrix which stacks the covariance matrices of z1;
vvT, a matrix which stacks the covariance matrices of vT; and vwT, a matrix which stacks the
covariance matrices of wT. All these values are standardized if stand=1.
Forecasting
One of the main practical uses of time series analysis is forecasting. Forecasting with E4 requires
obtaining the THD representation of the model, estimating its parameters (if required), selecting
suitable initial conditions for the filter, see Chapter 4, and then calling the foremod function. This
function has the following syntax:
[yf, Bf] = foremod(theta, din, z, k, u)
where theta and din define the model structure in THD format, z is a data matrix containing the
values of the endogenous and exogenous variables, k is the forecast horizon and u contains the data
of the exogenous variables for the forecast horizon. The output arguments are the forecasts of the
endogenous variables (yf) and their corresponding covariances (Bf).
Forecasts for models with GARCH errors are computed by the function foregarc. Its syntax is:
[yf, Bf, vf] = foregarc(theta, din, z, k, u)
where the output argument yf contains the forecasts of the endogenous variable, Bf contains the
covariance matrices of these forecasts and vf contains forecasts of the conditional covariance.
Simulation
The functions simmod and simgarch generate a random sample from any model in THD format.
Their syntax is:
y = simmod(theta, din, N, u)
y = simgarch(theta, din, N, u)
The arguments are a model in THD format (theta, din), the number of observations to be
generated (N) and the exogenous variable data matrix (u). Both functions use the MATLAB function
randn to obtain N(0,1) random disturbances and select adequate initial conditions by themselves.
As a general practice, it is advisable to omit the first observations of the simulated sample.
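The burn-in advice above can be illustrated with a minimal AR(1) simulator in Python (not E4 code): the recursion is started at an arbitrary initial condition and the first observations are discarded so its effect dies out:

```python
import random

def simulate_ar1(phi, sigma, n, burn=50, seed=0):
    """Draw an AR(1) sample y[t] = phi*y[t-1] + a[t], a[t] ~ N(0, sigma^2),
    discarding the first `burn` observations (the burn-in)."""
    rng = random.Random(seed)            # fixed seed for reproducibility
    y, yt = [], 0.0                      # arbitrary initial condition
    for _ in range(n + burn):
        yt = phi * yt + rng.gauss(0.0, sigma)
        y.append(yt)
    return y[burn:]                      # keep only the last n observations
```

This mirrors the pattern used throughout the examples, where z(51:250,1) drops the first 50 simulated values.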
Example 5.2 (Simulation). To obtain a realization with 200 observations of the model:
y1_t = .9 + .3 y1_{t-1} + a1_t
y2_t = .7 + .4 y1_{t-1} + a2_t − .8 a2_{t-4}

V[a1_t, a2_t]' = [ 1  .9 ; .9  1 ]
the following code can be used:
[theta, din, lab] = arma2thd([-.3 NaN; -.4 NaN], [], [], ...
    [NaN NaN; NaN -.8], [1 .9; .9 1], 4, [.9; .7], 1);
% Generate the exogenous (constant) variable
u = ones(250,1);
% Compute the simulated sample and omit the first observations
y = simmod(theta, din, 250, u);
y = y(51:250,:)
Smoothing
E4 includes three functions, fismod, fismiss and aggrmod, that implement different specialized
versions of the fixed interval smoothing algorithm, see Anderson and Moore (1979), De Jong (1989)
and Casals, Jerez and Sotoca (2000). The main econometric applications of these functions are
“cleaning” a sample contaminated with observation errors, computing the unobservable components
of a structural time series model, see Chapter 2, interpolating missing values and disaggregating a
low frequency sample.
The syntax of fismod and fismiss is:
[xhat, Px, e] = fismod(theta, din, z)
[zhat, Pz, xhat, Px] = fismiss(theta, din, z)
Both functions receive a model in THD format (theta, din) and a data matrix (z). The function
fismiss allows for missing values in z, which should be marked by NaN.
The output arguments of fismod are xhat, the expectation of the state vector conditional on the
whole sample; Px, the covariance matrix of this expectation; and e, a matrix of smoothed errors. On
the other hand, fismiss has two additional output arguments: zhat, a matrix containing the
available values of the z series and smoothed estimates of its missing values, and Pz, the
covariance of zhat.
A common application of fixed-interval smoothing consists of computing the optimal disaggregation
of low frequency (say, yearly) samples of flow variables into high frequency (say, quarterly or
monthly) time series, so that the disaggregates add up to the sample data. The unobserved high
frequency values can be computed taking into account not only the low frequency sample
information, but also high frequency indicator(s). For example, a monthly industrial production
index can be used as an indicator to disaggregate a yearly series of GNP.
This disaggregation is performed by aggrmod, whose synopsis is:
[zhat, Bt] = aggrmod(theta, din, z, per, m1)
where the input arguments are a THD model definition (theta, din) relating all the variables in
the high frequency observation interval, the data matrix (z), the number of observations that add up
to an aggregate (per) and the number of endogenous variables that are observed as aggregates (m1).
The output arguments are the optimal disaggregates of the first m1 endogenous variables (zhat) and
the corresponding covariances (Bt).
Example 5.3 (Residual analysis, forecasting and smoothing). Consider again the model in
Example 4.2, which was simulated and estimated with the following code:
[theta, din, lab] = arma2thd([], [], [-.7], [-.5], .1, 12);
z = simmod(theta, din, 250); z = z(51:250,1);
theta = e4preest(theta, din, z);
[thopt, it, lval, g, h] = e4min('lffast', theta, '', din, z);
[std, corrm, varm, Im] = imod(thopt, din, z);
prtest(thopt, din, lab, z, it, lval, g, h, std, corrm)
After computing the maximum-likelihood estimates, the residuals can be obtained and analyzed with
the commands:
ehat = residual(thopt, din, z);
descser(ehat, 'residuals');
plotsers(ehat, -1, 'residuals');
uidents(ehat, 10, 'residuals');
One may also want to calculate out-of-sample forecasts. The following code computes and plots
10 forecasts, with the standard ±2σ limits:
[zfor, Bfor] = foremod(thopt, din, z, 10);
% The following is standard MATLAB code:
figure; whitebg('w'); hold on
plot([z(191:200); zfor], 'k-')
plot([z(191:200); zfor+2*sqrt(Bfor)], 'k--')
plot([z(191:200); zfor-2*sqrt(Bfor)], 'k--')
xlabel('Time')
hold off
Finally, some samples have missing values due to, e.g., holidays or discontinuities in the source.
Also, one may want to eliminate some observations because they are considered outliers that may
affect the analysis. The following code generates a new variable with two missing values,
interpolates them using fismiss and displays the results:
z1 = z; z1(10) = NaN; z1(40) = NaN;
[zhat, pz] = fismiss(thopt, din, z1);
[z(1:50) z1(1:50) zhat(1:50)]
6 User models
The architecture of E4 makes it easy to accommodate new formulations, provided that they can be
expressed in an equivalent SS form. To do this, the user should define a “user model”. This capacity
is very useful in three different cases:
First, if the user model has no close relationship with any of the formulations supported by E4, see
Chapter 2, the user should code the functions that generate the model in SS form and, if required,
compute its derivatives. This situation is discussed in the first section of this Chapter.
Second, some analyses require the use of reparametrized models. That is, models which contain
some parameters that are functions of the parameters in the standard formulation and, therefore,
differ slightly from a model supported by E4. In this case, the toolbox functions can be used to
simplify the definition of a user model, as the second section describes in detail.
Last, the reparametrization of a standard model, combined with fixed-value constraints, allows one
to impose general linear and non-linear equality constraints on the parameters.
Defining user models in the general case
The implementation of a user model can be divided into four steps:
Step 1: Generate a THD format for the model.
The results of this process should be
1) A theta vector, which contains the initial value of the parameters and, optionally, a second
column with the fixed/free-parameter flags, see Chapter 3.
2) A din vector. [This part is deliberately omitted].
3) An optional lab matrix, to document the contents of theta.
Step 2: Create a MATLAB function to generate the SS matrices corresponding to the user
model.
The header of this function should be:
[Phi, Gam, E, H, D, C, Q, S, R] = userf1(theta, din)
where userf1 can be any name selected by the user. This function receives theta and din as
input arguments and returns the SS matrices Φ, Γ, E, H, D, C, Q, S and R.
Step 3: If required, create a user function to generate the derivatives of the SS matrices.
If the exact information matrix and/or the analytical gradient of the model are required, it is
necessary to create a second user function, whose synopsis should be:
[dPhi,dGam,dE,dH,dD,dC,dQ,dS,dR] = userf2(theta,din,i)
where userf2 can again be any name selected by the user. This function receives theta, din and
i as arguments, and returns the first-order partial derivatives of the SS matrices with respect to the
i-th parameter in theta.
Step 4: Invoke the functions required to complete the analysis.
[ This part is deliberately omitted ]
Finally, take into account that the user models are defined by means of standard MATLAB
functions. They should be saved in ASCII files, with the names userf1.m and userf2.m, and
stored in the active directory or in a directory appearing in the variable MATLABPATH.
Example 6.1 (Structural time series models): Consider the following decomposition of a time
series into trend, cycle and irregular noise:
(6.1)  y_t = T_t + C_t + ε_t

where y_t is an observable variable, T_t and C_t are, respectively, the unobservable stochastic
components of trend and cycle, and ε_t is a white noise error. Assume also that the model for the
trend is:

(6.2)  T_t = T_{t-1} + S_{t-1}
       S_t = S_{t-1} + ξ_t
and the cycle is governed by:
(6.3)  [ C_t ; C*_t ] = ρ [ cos λ  sin λ ; -sin λ  cos λ ] [ C_{t-1} ; C*_{t-1} ] + [ κ_t ; κ*_t ]

where ρ is the damping factor and λ is the frequency of the cycle in radians, such that 0 ≤ λ ≤ π.
The period in time units is p = 2π/λ. If λ = 0 or λ = π the stochastic cycle degenerates to a
first-order autoregressive process. The errors ε_t, ξ_t, κ_t and κ*_t are independent white noise
processes such that V(κ_t) = V(κ*_t). See Harvey and Shephard (1993).
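With the noise terms switched off, the cycle in (6.3) is a damped rotation: with ρ = 1 it returns to its starting point after p = 2π/λ steps. The following Python/NumPy sketch (purely illustrative, not toolbox code) checks this for λ = π/6, i.e. a period of 12:

```python
import numpy as np

rho, lam = 1.0, np.pi / 6                 # damping and frequency; period = 2*pi/lam = 12
A = rho * np.array([[np.cos(lam),  np.sin(lam)],
                    [-np.sin(lam), np.cos(lam)]])
c = np.array([1.0, 0.0])                  # [C_t, C*_t] with the noise switched off
for _ in range(12):                       # iterate one full period
    c = A @ c
# c is back at its starting value [1, 0]
```

With 0 < ρ < 1 the same recursion spirals toward zero, which is what makes the cycle stationary.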
Expressions (6.1)-(6.3) can be written in SS form as:
(6.4)  [ T_{t+1} ; S_{t+1} ; C_{t+1} ; C*_{t+1} ] =
       [ 1  1  0       0      ;
         0  1  0       0      ;
         0  0  ρcosλ   ρsinλ  ;
         0  0  -ρsinλ  ρcosλ  ] [ T_t ; S_t ; C_t ; C*_t ] +
       [ 0 0 0 ;
         1 0 0 ;
         0 1 0 ;
         0 0 1 ] [ ξ_{t+1} ; κ_{t+1} ; κ*_{t+1} ]

(6.5)  y_t = [ 1 0 1 0 ] [ T_t ; S_t ; C_t ; C*_t ] + ε_t
and the covariance matrices of the errors are:
(6.6)  Q = diag(σ²_ξ, σ²_κ, σ²_κ),   S = [ 0 ; 0 ; 0 ],   R = σ²_ε
In this formulation, the parameters to be estimated are, therefore, ρ, λ, σ²_ξ, σ²_κ and σ²_ε.
Model (6.4)-(6.6) can be defined by the following user function:
function [Phi, Gam, E, H, D, C, Q, S, R] = trendmod(theta, din)
% Obtains the SS formulation from the values in theta:
%   theta(1) = rho,
%   theta(2) = lambda,
%   theta(3:5) = standard deviations of xi, kappa and epsilon
rho = theta(1,1); lambda = theta(2,1);
rcl = rho*cos(lambda); rsl = rho*sin(lambda);
Phi = [1 1 0 0; 0 1 0 0; 0 0 rcl rsl; 0 0 -rsl rcl];
Gam = [];
E = zeros(4,3); E(2:4,:) = eye(3);
H = [1 0 1 0];
D = [];
C = 1;
v = [theta(3,1); theta(4,1); theta(4,1)];
Q = diag(v.^2);
S = zeros(3,1);
R = theta(5,1).^2;
This code should be stored in the ASCII file trendmod.m.
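As a cross-check of the recipe in trendmod, the same matrices can be assembled in any language and tested against the structure implied by (6.4)-(6.6). A Python/NumPy sketch, using the parameter values from the simulation code in this example (ρ = .5, λ = π, σ²_ξ = .01, σ²_κ = .1, σ²_ε = 1.2):

```python
import numpy as np

rho, lam = 0.5, np.pi
sd_xi, sd_kappa, sd_eps = np.sqrt(0.01), np.sqrt(0.1), np.sqrt(1.2)

rcl, rsl = rho * np.cos(lam), rho * np.sin(lam)
Phi = np.array([[1, 1, 0,    0  ],
                [0, 1, 0,    0  ],
                [0, 0, rcl,  rsl],
                [0, 0, -rsl, rcl]])
E = np.zeros((4, 3)); E[1:, :] = np.eye(3)      # noise loads on S, C, C*
H = np.array([[1.0, 0.0, 1.0, 0.0]])            # y_t = T_t + C_t + eps_t
Q = np.diag([sd_xi**2, sd_kappa**2, sd_kappa**2])
R = sd_eps**2
```

The top-left 2×2 block of Phi is the local-linear-trend transition [1 1; 0 1], and the bottom-right block is the damped rotation of the cycle.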
Finally, the following code simulates a sample with ρ = .5, λ = π, σ²_ξ = .01, σ²_κ = .1 and
σ²_ε = 1.2, obtains maximum likelihood estimates of the parameters, constraining λ to its true
value, and obtains smoothed estimates of the unobservable components T_t, C_t and ε_t.
e4init
% Create theta and lab
theta = [.5; pi; sqrt(.01); sqrt(.1); sqrt(1.2)];
lab = str2mat('Rho','Lambda','sd(xi)','sd(kappa)','sd(eps)');
% Create din (only public header)
mtype = 7;            % model type = 7 (SS)
m = 1;                % endogenous variables
r = 0;                % exogenous variables
s = 1;                % seasonal period
n = 4;                % number of states
np = 5;               % number of parameters (rows of theta)
usflag = 1;           % flag for user models (yes)
usfunc = 'trendmod';  % name of user function (no gradient required)
innov = [0;3;1];      % innov(1), not innovation model; innov(2), size of Q;
                      % innov(3), size of R
szpriv = [0;0];       % szpriv(1) size of private din, szpriv(2) size of
                      % private header
din = e4sthead(mtype,m,r,s,n,np,usflag,usfunc,innov,szpriv);
% Constrain the value of lambda and display the resulting model
theta = [theta zeros(size(theta))]; theta(2,2) = 1;
prtmod(theta, din, lab);
% Select adequate initial conditions for the filter variables
sete4opt('econd','zero','vcond','idej','var','fac');
% Simulate the data, discarding the first 50 samples
y = simmod(theta,din,150);
y = y(51:150);
% Compute ML estimates of the unknown parameters
[thopt,it,lval,g,h] = e4min('lffast',theta,'',din,y);
prtest(thopt,din,lab,y,it,lval,g,h);
disp(sprintf('Period = %4.2f', (2*pi)/thopt(2,1)));
disp(sprintf('Damping factor = %4.2f', thopt(1,1)));
% Obtain estimates of the unobserved components and plot the results
[xhat,px,ehat] = fismod(thopt,din,y);
plotsers([y,xhat(:,1)],1,str2mat('Data','Trend'));
plotsers(xhat(:,3),-1,'cycle');
plotsers(ehat,-1,'irregular component');
% Note that the name of the user function was fed to simmod,
% e4min and fismod
Defining user models in reparametrized formulations
Some analyses require the use of a model with a nonstandard parametrization, which has the same
dynamics as a formulation supported by E4 and, therefore, the same SS representation. In this case,
we speak of a “reparametrized model”, and the steps 1, 2 and 3 of the process described in previous
section can be drastically simplified by using some E4 functions.
The whole process of user model definition can again be divided into four steps:
Step 1: Generate the reparametrized model description in THD format.
1.1) Generate the standard formulation of the model in THD format: oldtheta, olddin,
oldlab.
1.2) Define the vector theta, which should contain values coherent with those of oldtheta, but
in terms of the new parameters.
1.3) Mark din as a user model by adding the user function names (userf2.m is optional):
din=touser(olddin, 'userf1', 'userf2')
1.4) Optionally, create a vector lab, documenting the contents of theta.
Step 2: Create a function to generate the SS matrices corresponding to the reparametrized
model.
Same as Step 2 of the general case but, to simplify it, one can use the toolbox functions for
formulation of SS models (e.g., thd2ss, see Chapter 3).
Step 3: If required, create a user function to generate the derivatives of the SS matrices.
Same as Step 3 of the general case but, to simplify it, one can use the toolbox functions for
computing the derivatives of an SS model (e.g., ss_dv or ss_dvp, see Chapter 8).
Step 4: Same as Step 4 of the general case.
Example 6.2 (Reparametrized transfer functions): Consider the transfer function:

(6.7)  y_t = (ω / (1 + δB)) u_t + a_t,   E(a²_t) = σ²_a

and its corresponding steady-state gain, defined as g = ω / (1 + δ).

For many analyses the transfer function gain is more relevant than any other parameter. If this is
the case, the nonstandard parametrization:

(6.8)  y_t = (g(1 + δ) / (1 + δB)) u_t + a_t,   E(a²_t) = σ²_a
includes the gain as an explicit parameter. Note that the SS representation of models (6.7) and (6.8)
is the same. Therefore, (6.8) is a reparametrization of (6.7) and its definition can be done with the
help of the Toolbox functions. On the other hand, estimation results are not the same because (6.8)
allows one to obtain direct estimates of the gain and its standard deviation, and also to define
constraints on this parameter.
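Numerically, the reparametrization is just a rescaling of the numerator, ω = g(1 + δ). With the example values used below (ω = .8, δ = −.4), the gain is .8/.6 ≈ 1.33. A quick Python check of the two-way map (illustrative only):

```python
omega, delta = 0.8, -0.4       # numerator and denominator coefficients of (6.7)
g = omega / (1 + delta)        # steady-state gain, the first element of theta
back = g * (1 + delta)         # inverse map used inside the user functions
```

Here g comes out as 4/3 and mapping back recovers ω, which is exactly what mymodel does below when it rebuilds oldtheta from theta.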
Step 1, as defined in the previous section, can be done as follows. First, formulate the transfer
function (6.7):
[oldtheta, olddin, oldlab] = tf2thd([],[],[],[],.1,1,[.8],[-.4]);
where oldtheta is [.8; -.4; .1]. Then, theta can be generated as follows:
theta = oldtheta;
theta(1,1) = oldtheta(1,1)/(1+oldtheta(2,1));
so the first element of theta is equal to the gain, and the model should be marked as a user model
calling the touser function:
din = touser(olddin, 'mymodel', 'mymoddv');
Note that we will need the functions mymodel.m and mymoddv.m.
Last, we generate a vector of descriptive labels with the following statement:
lab=str2mat('gain',oldlab(2:3,:));
Step 2 consists of creating a function that receives theta and din and returns the matrices of the
SS formulation of (6.8). In this case, the SS formulation of (6.7) and (6.8) is the same. Hence, the
easiest way to code this function consists of rebuilding oldtheta and olddin from theta and din
and generating the SS model with a call to thd2ss. The code to do this is:
function [Phi, Gam, E, H, D, C, Q, S, R] = mymodel(theta, din)
% Converts theta to oldtheta
g = theta(1,1); delta = theta(2,1);
oldtheta = theta;
oldtheta(1,1) = g*(1+delta);
% Eliminates in din the user model flag
olddin = tomod(din);
% Obtains the SS matrices
[Phi, Gam, E, H, D, C, Q, S, R] = thd2ss(oldtheta, olddin);
Step 3 consists of creating a function that receives theta, din and i and returns the derivatives of
the SS matrices with respect to the i-th parameter of theta. Because of the equivalence between the
SS representations of (6.7) and (6.8), the easiest way to do this consists of: a) obtaining again
oldtheta and olddin from theta and din, b) defining the Jacobian of the reparametrization:
J_fuser(θ) = [ ∂ω/∂g      ∂ω/∂δ      ∂ω/∂σ²_a    ;     [ 1+δ  g  0 ;
               ∂δ/∂g      ∂δ/∂δ      ∂δ/∂σ²_a    ;  =    0    1  0 ;
               ∂σ²_a/∂g   ∂σ²_a/∂δ   ∂σ²_a/∂σ²_a ]       0    0  1 ]
and c) generating the derivatives with a call to ss_dvp, see Chapter 8. This can be done with the
following code:
function [dPhi,dGam,dE,dH,dD,dC,dQ,dS,dR] = mymoddv(theta,din,i)
% Returns the derivatives in SS formulation of the reparametrized model
% with respect to the i-th parameter of theta
g = theta(1,1); delta = theta(2,1);
oldtheta = theta;
oldtheta(1,1) = g*(1+delta);
olddin = tomod(din);
% Define the Jacobian
J = [1 + delta, g, 0; ...
     0, 1, 0; ...
     0, 0, 1];
% Derive with respect to the i-th column of J
[dPhi,dGam,dE,dH,dD,dC,dQ,dS,dR] = ss_dvp(oldtheta, olddin, J(:,i));
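An analytic Jacobian like the one in mymoddv can always be verified against finite differences of the map from new to old parameters. A Python/NumPy sketch for the map ω = g(1 + δ), using the example values (illustrative, not toolbox code):

```python
import numpy as np

def old_params(p):
    # map (g, delta, sigma2_a) -> (omega, delta, sigma2_a)
    g, delta, s2 = p
    return np.array([g * (1 + delta), delta, s2])

p0 = np.array([0.8 / 0.6, -0.4, 0.1])    # example values: g, delta, sigma2_a
g, delta = p0[0], p0[1]
J = np.array([[1 + delta, g,   0.0],     # analytic Jacobian from the text
              [0.0,       1.0, 0.0],
              [0.0,       0.0, 1.0]])

h = 1e-7                                  # central finite differences, column by column
J_num = np.column_stack([(old_params(p0 + h * e) - old_params(p0 - h * e)) / (2 * h)
                         for e in np.eye(3)])
```

The numerical and analytic Jacobians agree to within the finite-difference error.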
In Step 4 the reparametrized model can be used for estimation, simulation, forecasting or any other
analysis. The complete code for an analysis with simulated data is the following:
e4init
sete4opt('econd','ml','vcond','lyap');
% *** Step 1
[oldtheta, olddin, oldlab] = tf2thd([],[],[],[],.1,1,[.8],[-.4]);
prtmod(oldtheta,olddin,oldlab);
% Data generation
u = randn(250,1);
y = simmod(oldtheta, olddin, 250, u);
u = u(51:250); y = y(51:250);
% Reparametrization
theta = oldtheta;
theta(1,1) = oldtheta(1,1)/(1+oldtheta(2,1));
din = touser(olddin, 'mymodel','mymoddv');
lab = str2mat('gain',oldlab(2:3,:));
prtmod(theta,din,lab);
% *** Step 4
% Compute ML estimates of the gain
% Note that the names of the user functions are fed to e4min and imod
[thopt,it,lval,g,h] = e4min('lffast',theta,'',din,[y u]);
[std,corrm,varm,Im] = imod(thopt,din,[y u]);
prtest(thopt,din,lab,[y u],it,lval,g,h,std,corrm);
Once the gain becomes an explicit parameter, its value can be constrained, see Chapter 3. For
example, the following code constrains the gain to its true value (which is known because the data
have been simulated) and reestimates the model:
% Constrain the gain to its true value
theta = [theta zeros(size(theta,1),1)]; theta(1,2) = 1.;
prtmod(theta,din,lab);
% ... and then compute new estimates.
% Note that the derivatives user function is now different
din = touser(olddin, 'mymodel','mymoddvr');
[thopt,it,lval,g,H] = e4min('lffast',theta,'',din,[y u]);
[std,corrm,varm,Im] = imod(thopt,din,[y u]);
prtest(thopt,din,lab,[y u],it,lval,g,H,std,corrm);
Note that the function to obtain the SS matrices (mymodel) is the same as in the previous case, but
the derivatives function is different because of the constraints. Now it should be:
function [dPhi,dGam,dE,dH,dD,dC,dQ,dS,dR] = mymoddvr(theta,din,i)
% Returns the derivatives in SS formulation of the reparametrized model
% with respect to the i-th parameter of theta
% In this version, the gain value is constrained
g = theta(1,1); delta = theta(2,1);
oldtheta = theta;
oldtheta(1,1) = g*(1+delta);
olddin = tomod(din);
% Define the constrained Jacobian
J = [0 1 0; 0 0 1];
% Derive with respect to the i-th column of J
[dPhi,dGam,dE,dH,dD,dC,dQ,dS,dR] = ss_dvp(oldtheta, olddin, J(:,i));
7 Case studies
This chapter presents a set of case studies that illustrate the main features of E4. Most of these
examples cannot be processed by standard econometric packages but are easy to analyze with this
toolbox.
The first block of cases concentrates on the estimation of some common time-series models,
sometimes with not-so-common features. Thus, the first example applies E4 to the ARIMA modeling
of four well-known time series, comparing the estimates obtained with those computed with other
software packages. The second case focuses on VARMA modeling of the famous mink-muskrat series.
The third example shows how to estimate a standard transfer function, which is afterward
reparametrized to obtain direct estimates of a parameter - the noise model period - that is not
explicitly included in the conventional formulation. It also indicates how to obtain forecasts from a
composite formulation. In the fourth example a simultaneous equations econometric system is first
written in structural (str) form and then estimated by maximum likelihood, both with a complete
sample and with some artificial missing data. Last, the fifth example illustrates the specification and
estimation of models with GARCH errors.
Interpolation and extrapolation are illustrated in the second block of cases. Thus, the sixth example
shows how to use the Toolbox functions to forecast a time series of number of airline passengers and
compute short term objectives consistent with a medium term target value and with the series
dynamics. This application has a clear interest for management, as intermediate objectives provide
an effective way to monitor the progressive fulfilment of the target. The seventh case deals with
estimation of relationships between unequally spaced time series and shows how they can be used to
disaggregate a yearly time series into higher frequency data.
The last block of cases centers on models with a structure close to the SS formulation. Thus, the
eighth example deals with the estimation of models with observation errors. Finally, the ninth
example shows how to define and estimate a structural time series model in direct SS form.
All the final estimates are computed by exact maximum likelihood. In most cases, the optimization
algorithm (e4min) is started from preliminary estimates computed with e4preest except in the
fourth example, in which the sample is too short.
A first-time reader should concentrate on block one, skipping the more complex parts of the second
and third examples. This should be enough to provide a good start in E4.
Univariate ARIMA examples
Even the simplest modeling exercise depends crucially on the software used, see McCullough and
Vinod (1999). In a paper titled “Adventures with ARIMA software”, Newbold et al. (1994)
illustrated this idea by fitting ARIMA models, using different software packages, to the following
monthly series:
Series A: An index of electricity consumption.
Series B: Housing starts.
Series C: Housing sales.
Series D: The monthly sales of a company.
The first three series can be found in Pankratz (1991) and were represented by the following
MA(1)×(1)12 model:
(7.1)  z_t = (1 + θB)(1 + ΘB¹²) ε_t
Series D was taken from Chatfield and Prothero (1973) and can be represented by the following
ARMA(1,0)×(0,1)12 model:
(7.2)  (1 + φB) z_t = (1 + ΘB¹²) ε_t
In all cases z_t = (1 − B)(1 − B¹²)x_t, where x_t is the data, in logs when this transformation is
required, B is the backward shift operator and ε_t is an error assumed to be white noise.
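Note that E4 writes the polynomials with plus signs, so (7.1) maps white noise into z_t = ε_t + θε_{t−1} + Θε_{t−12} + θΘε_{t−13}, whose lag-1 autocorrelation is θ/(1 + θ²). The following Python/NumPy sketch simulates the model with values close to the series A estimates and checks that property (illustrative only, not toolbox code):

```python
import numpy as np

theta, Theta = -0.7, -0.8                 # illustrative MA coefficients
rng = np.random.default_rng(0)
eps = rng.standard_normal(5000)

# z_t = (1 + theta*B)(1 + Theta*B^12) eps_t, expanded term by term
z = eps.copy()
z[1:]  += theta * eps[:-1]
z[12:] += Theta * eps[:-12]
z[13:] += theta * Theta * eps[:-13]

# sample lag-1 autocorrelation; theoretical value theta/(1+theta**2) ~ -0.47
r1 = np.corrcoef(z[13:-1], z[14:])[0, 1]
```

With a sample this long, r1 lands close to the theoretical −0.47.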
The input code required to define and estimate the models for series A, B and C is:
% *** Series A. First read and transform the data
load seriesa.dat;
y = transdif(seriesa,0,1,1,12);
% Formulation and preestimation of the univariate model
[theta,din,lab] = arma2thd([],[],[0],[0],0,12);
theta = e4preest(theta,din,y);
prtmod(theta,din,lab);
% Optimize the likelihood function, compute the information matrix
% and print the results
[theta,it,lval,g,h] = e4min('lffast',theta,'',din,y);
[std,corrm,varm,Im] = imod(theta,din,y);
prtest(theta,din,lab,y,it,lval,g,h,std,corrm);
% *** Series B. Note that we do not define the THD model structure
% corresponding to series B and C, as it coincides with that of
% series A
load seriesb.dat;
y = transdif(seriesb,0,1,1,12);
theta = e4preest(theta,din,y);
prtmod(theta,din,lab);
[theta,it,lval,g,h] = e4min('lffast',theta,'',din,y);
[std,corrm,varm,Im] = imod(theta,din,y);
prtest(theta,din,lab,y,it,lval,g,h,std,corrm);
% *** Series C
load seriesc.dat;
y = transdif(seriesc,0,1,1,12);
theta = e4preest(theta,din,y);
prtmod(theta,din,lab);
[theta,it,lval,g,h] = e4min('lffast',theta,'',din,y);
[std,corrm,varm,Im] = imod(theta,din,y);
prtest(theta,din,lab,y,it,lval,g,h,std,corrm);
and the corresponding outputs are:
For series A:
******************** Results from model estimation ********************
Objective function: -252.5742        # of iterations: 11
Information criteria: AIC = -2.9889, SBC = -2.9329

 Parameter      Estimate   Std. Dev.     t-test   Gradient
AR1(1,1)         -0.7218      0.0549   -13.1555     0.0001
AS1(1,1)         -0.8242      0.0740   -11.1418     0.0001
V(1,1)            0.0511      0.0029    17.7527    -0.0001

************************* Correlation matrix **************************
AR1(1,1)  1.00
AS1(1,1) -0.01  1.00
V(1,1)    0.01  0.24  1.00
Condition number = 1.6235
Reciprocal condition number = 0.6481
***********************************************************************
For series B:
******************** Results from model estimation ********************
Objective function: -113.6077        # of iterations: 11
Information criteria: AIC = -1.8590, SBC = -1.7889

 Parameter      Estimate   Std. Dev.     t-test   Gradient
AR1(1,1)         -0.2699      0.0888    -3.0408     0.0000
AS1(1,1)         -1.0000   3663.0371    -0.0003     0.0000
V(1,1)            0.0825    151.1795     0.0005    -0.0002

************************* Correlation matrix **************************
AR1(1,1)  1.00
AS1(1,1)  0.00  1.00
V(1,1)    0.00  1.00  1.00
Condition number = 3193548113.3231
Reciprocal condition number = 0.0000
***********************************************************************
And for series C:
******************** Results from model estimation ********************
Objective function: -113.6077        # of iterations: 11
Information criteria: AIC = -1.8590, SBC = -1.7889

 Parameter      Estimate   Std. Dev.     t-test   Gradient
AR1(1,1)         -0.2699      0.0888    -3.0408     0.0000
AS1(1,1)         -1.0000   3663.0371    -0.0003     0.0000
V(1,1)            0.0825    151.1795     0.0005    -0.0002

************************* Correlation matrix **************************
AR1(1,1)  1.00
AS1(1,1)  0.00  1.00
V(1,1)    0.00  1.00  1.00
Condition number = 3193548113.3231
Reciprocal condition number = 0.0000
***********************************************************************
Note that the estimates of the last two models have a unit root in the seasonal moving average factor.
This is a clear symptom of overdifferencing and would require the cancellation of the seasonal
difference and the seasonal moving average factor.
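The mechanics of overdifferencing are easy to reproduce: differencing a series that does not need it plants a unit root in the moving average part. For instance, differencing pure white noise yields an MA(1) with coefficient −1 and a theoretical lag-1 autocorrelation of −0.5, as this Python/NumPy sketch checks (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(42)
eps = rng.standard_normal(20000)          # a series that needs no differencing
d = np.diff(eps)                          # (1 - B) eps_t: an MA(1) with a unit root
r1 = np.corrcoef(d[:-1], d[1:])[0, 1]     # theoretical value: -0.5
```

The sample autocorrelation comes out near −0.5, the signature that the difference and the MA(1) factor should cancel.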
Finally, we model series D with a richer residual diagnostic output. The code to estimate model
(7.2) and compute some standard diagnostic tests is the following:
% Series D
load seriesd.dat;
seriesd = log10(seriesd);
y = transdif(seriesd,1,1,1,12);
[theta,din,lab] = arma2thd([0],[],[],[0],0,12);
theta = e4preest(theta,din,y);
prtmod(theta,din,lab);
[theta,it,lval,g,h] = e4min('lffast',theta,'',din,y);
[std,corrm,varm,Im] = imod(theta,din,y);
prtest(theta,din,lab,y,it,lval,g,h,std,corrm);
[e,vt,wt,ve] = residual(theta,din,y);
titD = 'residuals from series D';
descser(e,titD);
plotsers(e,0,titD);
uidents(e,20,titD);
and the corresponding estimation and diagnosis output is:
******************** Results from model estimation ********************
Objective function: -72.2362        # of iterations: 15
Information criteria: AIC = -2.1636, SBC = -2.0624

 Parameter      Estimate   Std. Dev.     t-test   Gradient
FR1(1,1)          0.4531      0.1119     4.0492     0.0000
AS1(1,1)         -0.7270      0.2023    -3.5928     0.0000
V(1,1)            0.0729      0.0075     9.7303     0.0000

************************* Correlation matrix **************************
FR1(1,1)  1.00
AS1(1,1)  0.00  1.00
V(1,1)   -0.01  0.51  1.00
Condition number = 3.0836
Reciprocal condition number = 0.3563
***********************************************************************

***************** Descriptive statistics *****************
--- Statistics of residuals from series D ---
Valid observations = 64
Mean = -0.0009, t test = -0.1039
Standard deviation = 0.0680
Skewness = 0.1186
Excess Kurtosis = -0.1370
[Figures: standardized plot of residuals from series D; A.C.F. of residuals from series D
(LBQ = 34.59); P.A.C.F. of residuals from series D]
Quartiles = -0.0490, -0.0007, 0.0400
Minimum value = -0.1367, obs. # 11
Maximum value = 0.1787, obs. # 51
Jarque-Bera = 0.2000
Dickey-Fuller = -4.3217, computed with 8 lags
Dickey-Fuller = -7.5422, computed with 1 lags
Outliers list
 Obs #    Value
    25   0.1442
    51   0.1787
************************************************************
The following table summarizes previous results and compares them with the ML1 and ML2
estimates reported in Newbold et al. (1994):
Package    Series A           Series B            Series C            Series D
           θ        Θ         θ        Θ          θ        Θ          φ        Θ
ML1     -.693    -.803     -.270    -.967      -.200    -.967       .454    -.725
        (.057)   (.076)    (.087)   (.601)     (.086)   (.724)     (.114)   (.195)
ML2     -.694    -.804     -.269   -1.000      -.216   -1.000       .453    -.727
        (.056)   (.075)    (.085)  (153.6)     (.083)  (57.3)      (.113)   (.194)
E4      -.722    -.824     -.270   -1.000      -.270   -1.000       .453    -.727
        (.055)   (.074)    (.089) (3663.0)     (.089) (3663.0)     (.112)   (.202)

Notes: Figures in parentheses are standard errors. The ML1 and ML2 estimates are the opposite of
those reported in Newbold et al. (1994), to make them coherent with E4 standards.
The following commands compute and display six-months ahead forecasts in logs and levels, as in
Newbold et al. (1994, table 3):
% Compute six months ahead forecasts
% Note that the nonstationary version of the model is used
phi = [-1+theta(1) -theta(1)];
sphi = -1;
sth = theta(2);
v = theta(3);
[thetaf,dinf,labf] = arma2thd([phi],[sphi],[],[sth],[v],12);
prtmod(thetaf,dinf,labf);
[yf,bf] = foremod(thetaf,dinf,seriesd,6);
% Forecasts of log sales
[(1:6)' yf bf]
% Forecasts of sales
10.^yf
and the corresponding outputs are:
*************************** Model ***************************
VARMAX model (innovations model)
1 endogenous v., 0 exogenous v.
Seasonality: 12
SS vector dimension: 14
Parameters (* denotes constrained parameter):
FR1(1,1)   -0.5469
FR2(1,1)   -0.4531
FS1(1,1)   -1.0000
AS1(1,1)   -0.7270
V(1,1)      0.0729
*************************************************************
ans =
    1.0000    2.4513    0.0054
    2.0000    2.6300    0.0070
    3.0000    2.7402    0.0100
    4.0000    2.9335    0.0124
    5.0000    3.0507    0.0150
    6.0000    3.0775    0.0175
ans =
1.0e+003 *
    0.2827
    0.4266
    0.5498
    0.8579
    1.1238
    1.1952
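The nonstationary AR polynomial built above is the product of the estimated factor (1 + φ₁B) and the difference (1 − B), i.e. 1 + (φ₁ − 1)B − φ₁B², which is why phi was set to [-1+theta(1) -theta(1)]. The convolution can be checked directly (Python/NumPy sketch, illustrative only):

```python
import numpy as np

phi1 = 0.4531                                   # estimated FR1(1,1) for series D
prod = np.convolve([1.0, phi1], [1.0, -1.0])    # (1 + phi1*B)(1 - B)
# coefficients of B and B^2 are exactly the two entries assigned to phi
```

The result is [1, −0.5469, −0.4531], matching the FR1 and FR2 values printed by prtmod.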
The code and data required to replicate this case can be found in the directory
\EXAMPLES\NEWBOLD of the distribution diskette, files newbold.m, seriesa.dat,
seriesb.dat, seriesc.dat and seriesd.dat.
VARMA modeling: interaction between minks and muskrats
The number of muskrat (z_{1t}) and mink (z_{2t}) skins traded annually by the Hudson’s Bay Company
from 1848 to 1909 is a standard benchmark for multivariate methods, as the feedback interaction
between these series arises clearly from the fact that the mink is an important predator of the
muskrat. After a cross-correlation analysis, Jenkins and Alavi (1981) propose two alternative
VARMA models for the relationship between both series. The first is:
(7.3)  [ 1 + φ(1)_11 B + φ(2)_11 B² + φ(3)_11 B³ + φ(4)_11 B⁴ ,  0 ;
         0 ,  1 + φ(1)_22 B + φ(2)_22 B² ] [ ∇ log z_{1t} ; log z_{2t} - µ_2 ] =
       [ 1 + θ(1)_11 B ,  θ(1)_12 B ;
         θ(1)_21 B ,  1 + θ(1)_22 B ] [ a_{1t} ; a_{2t} ]
and the second:
(7.4)  [ 1 + φ(1)_11 B + φ(2)_11 B² ,  φ(1)_12 B + φ(2)_12 B² ;
         φ(1)_21 B + φ(2)_21 B² ,  1 + φ(1)_22 B + φ(2)_22 B² + φ(3)_22 B³ ]
       [ ∇ log z_{1t} ; log z_{2t} - µ_2 ] = [ a_{1t} ; a_{2t} ]
The estimation of the first model with E4 requires the code:
% Bivariate modelling of the mink-muskrat series
% Model (5.8) of Jenkins and Alavi (1981) pag. 37
% The data is already in logs
e4init
load mink.dat;
z1 = transdif(mink(:,3),1,1);
z2 = mink(:,2)-mean(mink(:,2));
z = [z1 z2(2:62)];
% Define the parameter matrices and generate
% the THD representation
phi1 = [0 NaN; NaN 0];
phi2 = [0 NaN; NaN 0];
phi3 = [0 NaN; NaN NaN];
phi4 = [0 NaN; NaN NaN];
theta = [0 0; 0 0];
sigma = [0 0; 0 0];
[theta,din,lab] = arma2thd([phi1 phi2 phi3 phi4],[],[theta],[],sigma,1);
% Compute preliminary estimates
theta = e4preest(theta,din,z);
prtmod(theta,din,lab);
% Compute ML estimates, information matrix and print the results
[thopt,it,l,g,H] = e4min('lffast',theta,'',din,z);
[std,corrm,varm,Im] = imod(thopt,din,z);
prtest(thopt,din,lab,z,it,l,g,H,std,corrm);
which yields:
******************** Results from model estimation ********************
Objective function: -9.0059        # of iterations: 33
Information criteria: AIC = 0.1310, SBC = 0.5808

 Parameter      Estimate   Std. Dev.     t-test   Gradient
FR1(1,1)         -0.6887      0.1364    -5.0510     0.0000
FR1(2,2)         -1.2680      0.1098   -11.5433     0.0000
FR2(1,1)          0.5941      0.1403     4.2346     0.0000
FR2(2,2)          0.5593      0.0921     6.0712     0.0000
FR3(1,1)         -0.0682      0.1171    -0.5821     0.0000
FR4(1,1)          0.2816      0.0856     3.2894     0.0000
AR1(1,1)         -0.2966      0.1606    -1.8468     0.0000
AR1(2,1)          0.6027      0.0804     7.4997     0.0000
AR1(1,2)         -0.8642      0.1387    -6.2299     0.0000
AR1(2,2)         -0.8352      0.1529    -5.4615     0.0000
V(1,1)            0.0639      0.0117     5.4854     0.0000
V(2,1)            0.0191      0.0072     2.6502     0.0000
V(2,2)            0.0423      0.0078     5.4491     0.0000

************************* Correlation matrix **************************
FR1(1,1)  1.00
FR1(2,2) -0.02  1.00
FR2(1,1) -0.68  0.34  1.00
FR2(2,2) -0.23 -0.84 -0.15  1.00
FR3(1,1)  0.39 -0.19 -0.89  0.14  1.00
FR4(1,1)  0.35  0.15  0.35 -0.17 -0.65  1.00
AR1(1,1)  0.70 -0.36 -0.35  0.04  0.07  0.39  1.00
AR1(2,1) -0.17  0.28  0.39 -0.35 -0.26  0.12  0.09  1.00
AR1(1,2) -0.51  0.40  0.40 -0.38 -0.19 -0.22 -0.58  0.25  1.00
AR1(2,2) -0.44  0.69  0.35 -0.45 -0.12 -0.21 -0.70  0.00  0.64  1.00
V(1,1)    0.00  0.03 -0.01 -0.03  0.01 -0.01  0.00 -0.02  0.02  0.02  1.00
V(2,1)   -0.01  0.02  0.00 -0.02  0.00 -0.01 -0.01  0.00  0.03  0.03  0.48  1.00
V(2,2)    0.01  0.01 -0.01 -0.02  0.01 -0.01  0.00 -0.02  0.02  0.02  0.12  0.48  1.00
Condition number = 333.3643
Reciprocal condition number = 0.0025
***********************************************************************
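In the parameter matrices passed to arma2thd above, a zero marks a coefficient to be estimated (initialized at zero) while NaN marks an element excluded from the model. Counting the non-NaN entries, plus the three free elements of the symmetric 2×2 covariance, reproduces the 13 parameters of the estimation output. A sketch of that bookkeeping in Python/NumPy (the zero/NaN convention is as used in this example):

```python
import numpy as np

nan = np.nan
phi1 = np.array([[0, nan], [nan, 0]])
phi2 = np.array([[0, nan], [nan, 0]])
phi3 = np.array([[0, nan], [nan, nan]])
phi4 = np.array([[0, nan], [nan, nan]])
ma1  = np.zeros((2, 2))                   # full MA(1) matrix: 4 free parameters

n_free = sum(np.count_nonzero(~np.isnan(m)) for m in (phi1, phi2, phi3, phi4, ma1))
n_free += 2 * (2 + 1) // 2                # lower triangle of the 2x2 covariance
```

This yields 13, matching the FR/AR/V rows of the table above.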
and the code to estimate the second model is:
% Bivariate modelling of the mink-muskrat series
% Model (5.9) of Jenkins and Alavi (1981) pag. 39
e4init
load mink.dat;
z1 = transdif(mink(:,3),1,1);
z2 = mink(:,2)-mean(mink(:,2));
z = [z1 z2(2:62)];
% Define the parameter matrices and generate the THD representation
phi1 = [ 0 0; 0 0];
phi2 = [ 0 0; 0 0];
phi3 = [NaN NaN; NaN 0];
sigma = [ 0 0; 0 0];
[theta,din,lab] = arma2thd([phi1 phi2 phi3],[],[],[],sigma,1);
% Compute preliminary estimates
theta = e4preest(theta,din,z);
prtmod(theta,din,lab);
% Compute ML estimates, information matrix and print the results
[thopt,it,l,g,H] = e4min('lffast',theta,'',din,z);
[std,corrm,varm,Im] = imod(thopt,din,z);
prtest(thopt,din,lab,z,it,l,g,H,std,corrm);
with the output:
******************** Results from model estimation ********************
Objective function: -3.9049        # of iterations: 21
Information criteria: AIC = 0.2654, SBC = 0.6807

 Parameter      Estimate   Std. Dev.     t-test   Gradient
FR1(1,1)         -0.2485      0.1175    -2.1160     0.0000
FR1(2,1)         -0.4247      0.1254    -3.3876     0.0000
FR1(1,2)          0.7314      0.1211     6.0401     0.0000
FR1(2,2)         -0.7892      0.1194    -6.6106     0.0000
FR2(1,1)          0.1808      0.1033     1.7502     0.0000
FR2(2,1)          0.3000      0.1164     2.5771     0.0000
FR2(1,2)         -0.3696      0.1509    -2.4490     0.0000
FR2(2,2)         -0.2818      0.1969    -1.4315     0.0000
FR3(2,2)          0.5578      0.1421     3.9245     0.0000
V(1,1)            0.0593      0.0108     5.5058     0.0000
V(2,1)            0.0179      0.0077     2.3382     0.0000
V(2,2)            0.0542      0.0098     5.5077     0.0000

************************* Correlation matrix **************************
FR1(1,1)  1.00
FR1(2,1)  0.29  1.00
FR1(1,2) -0.19 -0.06  1.00
FR1(2,2) -0.06 -0.29  0.30  1.00
FR2(1,1)  0.05  0.02 -0.38 -0.11  1.00
FR2(2,1)  0.01 -0.19 -0.10 -0.15  0.27  1.00
FR2(1,2)  0.55  0.16 -0.72 -0.21  0.43  0.11  1.00
FR2(2,2)  0.13  0.66 -0.16 -0.69  0.10 -0.10  0.23  1.00
FR3(2,2)  0.00 -0.44  0.01  0.29  0.00  0.53  0.00 -0.69  1.00
V(1,1)    0.01  0.01  0.00  0.00 -0.01 -0.01  0.00  0.00 -0.01  1.00
V(2,1)    0.00  0.01  0.01  0.01 -0.01 -0.01  0.00  0.00 -0.01  0.42  1.00
V(2,2)    0.00  0.01  0.01  0.02  0.00 -0.02 -0.01  0.01 -0.03  0.09  0.42  1.00
Condition number = 37.2248
Reciprocal condition number = 0.0232
***********************************************************************
Comparing the information criteria for both models, one concludes that the VARMA describes the
sample slightly better than the VAR.
E4 includes several functions for model validation and diagnosis. The following code computes the
residuals of the VARMA specification, performs a descriptive multiple autocorrelation analysis and,
finally, displays a standardized plot:
% Validation
[ehat,vT,wT,vz1,vvT,vwT] = residual(thopt,din,z);
tit = str2mat('muskrat residuals','mink residuals');
descser(ehat,tit);
midents(ehat,10,tit);
plotsers(ehat,0,tit);
The residual descriptive statistics are:
***************** Descriptive statistics *****************
--- Statistics of muskrat residuals ---
Valid observations = 61
Mean = 0.0164, t test = 0.5320
Standard deviation = 0.2412
Skewness = -0.0707
Excess Kurtosis = -0.5588
Quartiles = -0.1659, 0.0122, 0.2052
Minimum value = -0.5792, obs. # 58
Maximum value = 0.5555, obs. # 52
Jarque-Bera = 0.8446
Dickey-Fuller = -3.1759, computed with 7 lags
Dickey-Fuller = -7.3740, computed with 1 lags
Outliers list
 Obs #    Value
    52   0.5555
    56  -0.4752
    58  -0.5792

--- Statistics of mink residuals ---
Valid observations = 61
Mean = 0.0101, t test = 0.3395
Standard deviation = 0.2319
Skewness = -0.4582
Excess Kurtosis = 0.1015
Quartiles = -0.1235, 0.0322, 0.1641
Minimum value = -0.6061, obs. # 20
Maximum value = 0.5475, obs. # 35
Jarque-Bera = 2.1603
Dickey-Fuller = -2.9948, computed with 7 lags
Dickey-Fuller = -8.1683, computed with 1 lags
Outliers list
 Obs #    Value
    20  -0.6061
    35   0.5475
    49  -0.4905
    58  -0.5198
Sample correlation matrix
    1.0000    0.3414
    0.3414    1.0000
Eigen structure of the correlation matrix
  i   eigenval   %var  |  Eigen vectors
  1     1.3414   0.67  |   0.7071   0.7071
  2     0.6586   0.33  |   0.7071  -0.7071
************************************************************
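The eigen structure reported above can be reproduced from the 2×2 residual correlation matrix: for a matrix [1 r; r 1] the eigenvalues are 1 ± r with eigenvectors (1, ±1)/√2, which matches the 1.3414 and 0.6586 shown. A Python/NumPy check:

```python
import numpy as np

r = 0.3414                                   # sample cross-correlation of the residuals
C = np.array([[1.0, r], [r, 1.0]])
vals, vecs = np.linalg.eigh(C)               # ascending order: 1 - r, then 1 + r
# eigenvector entries are all +/- 1/sqrt(2) ~ 0.7071, as in the table
```
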
The descriptive analysis indicates that the mean of both series is not statistically different from
zero and the empirical distribution of the residuals is consistent with the normality assumption, as
the skewness, excess kurtosis and Jarque-Bera statistics are small. Perhaps there are too many values
exceeding two standard deviations from the mean, which should be further investigated. On the other
hand, the multiple autocorrelation analysis does not reveal any sign of misspecification, except a
high value of the Ljung-Box Q statistic in the (2,2) position, revealing that there could be some
autocorrelation in the mink residuals:
******** Autocorrelation and partial autoregression functions ********
            MACF   MPARF      MACF         MPARF
k =  1, Chi(k) =  3.85, AIC(k) = -5.79, SBC(k) = -5.59
muskrat re   .. |   .. |   0.03  0.08 |  0.00  0.08
mink resid   .. |   .. |   0.02 -0.06 |  0.04 -0.07
k =  2, Chi(k) =  3.01, AIC(k) = -5.72, SBC(k) = -5.37
muskrat re   .. |   .. |  -0.05 -0.03 | -0.06 -0.01
mink resid   .. |   .. |   0.05 -0.15 |  0.12 -0.20
k =  3, Chi(k) =  7.93, AIC(k) = -5.73, SBC(k) = -5.25
muskrat re   .. |   +. |   0.21 -0.07 |  0.29 -0.16
mink resid   .. |   .. |   0.21  0.10 |  0.23 -0.03
k =  4, Chi(k) =  3.49, AIC(k) = -5.67, SBC(k) = -5.04
muskrat re   .. |   .. |  -0.01 -0.07 |  0.01 -0.14
mink resid   .. |   .. |   0.09 -0.01 |  0.13 -0.11
k =  5, Chi(k) =  7.41, AIC(k) = -5.68, SBC(k) = -4.92
muskrat re   .. |   .. |  -0.02  0.17 | -0.03  0.12
mink resid   .. |   +. |   0.19  0.00 |  0.31 -0.09
k =  6, Chi(k) = 10.11, AIC(k) = -5.76, SBC(k) = -4.86
muskrat re   .. |   .. |  -0.06  0.05 | -0.19  0.11
mink resid   .. |   .- |  -0.09 -0.25 |  0.02 -0.34
k =  7, Chi(k) =  7.60, AIC(k) = -5.79, SBC(k) = -4.75
muskrat re   .. |   .. |  -0.20 -0.08 | -0.30  0.10
mink resid   .. |   .. |   0.01  0.09 |  0.08 -0.05
k =  8, Chi(k) = 10.06, AIC(k) = -5.89, SBC(k) = -4.71
muskrat re   .. |   .. |  -0.01 -0.15 |  0.10 -0.29
mink resid   .. |   .. |  -0.13 -0.12 | -0.21 -0.30
k =  9, Chi(k) =  4.58, AIC(k) = -5.86, SBC(k) = -4.55
muskrat re   .+ |   .. |   0.04  0.27 | -0.05  0.25
mink resid   .. |   .. |   0.00  0.15 |  0.12  0.18
k = 10, Chi(k) = 17.30, AIC(k) = -6.16, SBC(k) = -4.70
muskrat re   .. |   .. |  -0.24 -0.01 | -0.44  0.03
mink resid   .+ |   .+ |   0.14  0.36 |  0.07  0.25
The (i,j) element of the lag k matrix is the cross correlation (MACF)
or partial autoregression (MPACF) estimate when series j leads series i.

********************* Cross correlation functions *********************
            muskrat re  mink resid
muskrat re  ..........  ........+.
mink resid  ..........  .........+
&KDS� � 3DJ� ��
10 20 30 40 50 60-2.5
-2
-1.5
-1
-0.5
0
0.5
1
1.5
2
2.5
Standardized plot of mink residuals
10 20 30 40 50 60
-2
-1.5
-1
-0.5
0
0.5
1
1.5
2
Standardized plot of muskrat residuals
Each row is the cross correlation function when the column variable leadsthe row variable. Ljung-Box Q statistic for previous cross-correlations muskrat re mink resid muskrat re 10.73 10.76 mink resid 9.28 19.83 Summary in terms of +.- For MACF std. deviations are computed as 1/T^0.5 = 0.13 For MPARF std. deviations are computed from VAR(k) model***********************************************************************
Last, the time-series plots of the residuals are again quite satisfactory.
Further elaborations of this example may arise from inspection of the impulse-response functions of
both variables, or by using these results to build a discrete time analogue of the Lotka-Volterra
equations, as Jenkins and Alavi suggest.
The code and data required to replicate this case can be found in the directory
\EXAMPLES\MINKS of the distribution diskette, files mink.dat, mink1.m and mink2.m.
Transfer function analysis
Very often, relevant characteristics of the dynamic behaviour of a time series are not measured by
explicit parameters of a standard model, but by functions of these parameters. In this example we
show the application of user functions to reparametrize a transfer function. Hence, an implicit
parameter becomes explicit and can be subject to standard estimation and testing processes.
Unconstrained transfer function modeling
McLeod (1982) builds a transfer function model relating consumption of petrochemical products
with a seasonally adjusted UK industrial production index, using data from 1958 first quarter to
1976 fourth quarter. The model is:
$$\ln PC_t = \omega_{10}\,\ln IP_t + (\omega_{20} + \omega_{21} B + \omega_{22} B^2)\,\xi_t^{IV/74} + N_t \qquad (7.5)$$
$$(1 + \Phi_1 B^4 + \Phi_2 B^8)\,\nabla\nabla_4 N_t = (1 + \theta_1 B)(1 + \Theta_1 B^4)\,a_t$$
where $PC_t$ denotes consumption of petrochemical products, $IP_t$ is the industrial production index, and
$\xi_t^{IV/74}$ is an impulse variable which models the beginning of a period of destocking in the 4th quarter
of 1974. The code to estimate this model is:
e4init
load petro.dat;
petro(:,1:2) = log(petro(:,1:2));
y = transdif(petro,1,1,1,4);
% Defines the structure of the transfer function
sar = [0 0]; ma = [0]; sma = [0]; v = [0];
w = [0 NaN NaN; 0 0 0];
[theta, din, lab] = tf2thd([],[sar],[ma],[sma],[v],4,[w],[]);
% Computes preliminary estimates
theta = e4preest(theta,din,y);
prtmod(theta,din,lab);
% ... and ML estimates
[thopt,it,lval,g,h] = e4min('lffast', theta, '', din, y);
[std,corrm,varm,Im] = imod(thopt,din,y);
prtest(thopt,din,lab,y,it,lval,g,h,std,corrm);
% Computes the period of the noise model
period = 2*pi/acos(-thopt(1,1)/(2*sqrt(thopt(2,1))));
disp(sprintf('period = %4.2f years', period));
Note that the last statements compute and display the period of the seasonal AR(2) noise process.
This code yields the following output:
******************** Results from model estimation ********************
Objective function: -173.6745
# of iterations: 48
Information criteria: AIC = -4.6387, SBC = -4.3519
Parameter    Estimate   Std. Dev.    t-test   Gradient
FS(1,1)        0.1521      0.1307    1.1634     0.0000
FS(1,2)        0.2660      0.1222    2.1768     0.0000
AR(1,1)       -0.5450      0.0994   -5.4824     0.0000
AS(1,1)       -0.7781      0.1201   -6.4776     0.0000
W1(1,1)        1.3966      0.1089   12.8191     0.0000
W2(1,1)       -0.0350      0.0182   -1.9299     0.0000
W2(2,1)       -0.1150      0.0187   -6.1517     0.0000
W2(3,1)       -0.0478      0.0185   -2.5888     0.0000
V(1,1)         0.0200      0.0017   11.7627     0.0000
************************* Correlation matrix **************************
FS(1,1)  1.00
FS(1,2)  0.34  1.00
AR(1,1)  0.06  0.00  1.00
AS(1,1)  0.52  0.42 -0.05  1.00
W1(1,1) -0.03 -0.01  0.00 -0.02  1.00
W2(1,1)  0.00  0.00  0.00  0.00 -0.01  1.00
W2(2,1)  0.01  0.01  0.00  0.01 -0.14  0.25  1.00
W2(3,1) -0.01  0.00  0.00  0.00  0.18  0.17  0.22  1.00
V(1,1)   0.01 -0.02  0.01  0.12  0.00  0.00  0.00  0.00  1.00
Condition number = 4.2496
Reciprocal condition number = 0.2787
***********************************************************************
period = 3.66 years
[Figure: A.C.F. of petro. consumption residuals (LBQ = 6.25), P.A.C.F. of petro. consumption residuals, and standardized plot of petro. consumption residuals; generated by the validation code below.]
A standard validation output can be obtained with the following statements:
% Validation
[ehat,vT,wT,vz1,vvT,vwT] = residual(thopt,din,y);
descser(ehat,'petro. consumption residuals');
uidents(ehat,10,'petro. consumption residuals');
plotsers(ehat,0,'petro. consumption residuals');
which generate the following outputs:
***************** Descriptive statistics *****************
--- Statistics of petro. consumption residuals ---
Valid observations = 71
Mean = -0.0007, t test = -0.3021
Standard deviation = 0.0201
Skewness = -0.3747
Excess Kurtosis = -0.1037
Quartiles = -0.0134, 0.0000, 0.0123
Minimum value = -0.0522, obs. # 41
Maximum value =  0.0388, obs. # 36
Jarque-Bera = 1.6931
Dickey-Fuller = -3.4079, computed with 8 lags
Dickey-Fuller = -8.6135, computed with 1 lags
Outliers list
 Obs #    Value
    41  -0.0522
    49  -0.0512
    69  -0.0431
************************************************************
These results do not suggest any alternative specification, so the model appears to be statistically
adequate.
Period estimation
Previous estimates imply that the autoregressive factor of the noise model is periodic, with a cycle of
3.66 years. With E4 a simple reparametrization allows one to estimate this period, its standard
deviation and, if required, to constrain its value, see Terceiro and Gómez (1985). To do this, it is
first necessary to build a user function that will receive the period in theta, compute $\Phi_2$, and return
the SS formulation of the model. The relationship that links $\Phi_2$ with the values of $\Phi_1$ and the period
($p$) is:
$$\Phi_2 = \left[ \frac{\Phi_1}{2\cos(2\pi/p)} \right]^2 \qquad (7.6)$$
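As a quick sanity check of (7.6), outside the toolbox, the mapping can be inverted numerically with the unconstrained estimates FS(1,1) = 0.1521 and FS(1,2) = 0.2660. This is an illustrative sketch in plain Python, not E4 code:

```python
import math

# Seasonal AR(2) estimates from the unconstrained fit: (1 + phi1*B^4 + phi2*B^8)
phi1, phi2 = 0.1521, 0.2660

# Period of the implied cycle, in years: cos(2*pi/p) = -phi1 / (2*sqrt(phi2))
p = 2 * math.pi / math.acos(-phi1 / (2 * math.sqrt(phi2)))
print(round(p, 2))  # -> 3.66, as reported by the command file

# The forward mapping (7.6) recovers phi2 from phi1 and the period:
phi2_back = (phi1 / (2 * math.cos(2 * math.pi / p))) ** 2
```

The round trip confirms that the period is a one-to-one reparametrization of $\Phi_2$ given $\Phi_1$.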
The user function that does the transformation is:
function [Phi, Gam, E, H, D, C, Q, S, R] = pcons1(thetan, dinn)
% SS formulation with the period as explicit parameter.
% It is stored in theta(2,1).
theta = thetan;
% Parentheses around 2*cos(...) are required: otherwise MATLAB evaluates
% -theta(1,1)/2*cos(...) as (-theta(1,1)/2)*cos(...), which is not Eq. (7.6)
theta(2,1) = (-theta(1,1)/(2*cos(2*pi/thetan(2,1))))^2;
din = tomod(dinn);
[Phi, Gam, E, H, D, C, Q, S, R] = thd2ss(theta,din);
and the previous command file needs the following addition:
% Explicit period estimation
thetan = thopt;
thetan(2,1) = 3.7;
labn = str2mat(lab(1,:),'Period',lab(3:size(lab,1),:));
dinn = touser(din,'pcons1');
[theta,it,lval,g,h] = e4min('lffast',thetan,'',dinn,y);
prtest(theta,din,labn,y,it,lval,g,h);
The corresponding output is:
******************** Results from model estimation ********************
Objective function: -173.6745
# of iterations: 34
Information criteria: AIC = -4.6387, SBC = -4.3519
Parameter    Estimate   Appr.Std.Dev.    t-test   Gradient
FS(1,1)        0.1521          0.1360    1.1179     0.0000
Period         3.6556          0.2958   12.3579     0.0000
AR(1,1)       -0.5450          0.1291   -4.2218     0.0000
AS(1,1)       -0.7781          0.1108   -7.0221     0.0000
W1(1,1)        1.3966          0.1252   11.1538     0.0000
W2(1,1)       -0.0350          0.0183   -1.9136     0.0002
W2(2,1)       -0.1150          0.0185   -6.2061    -0.0001
W2(3,1)       -0.0478          0.0189   -2.5291     0.0003
V(1,1)         0.0200          0.0016   12.1620     0.0028
************************* Correlation matrix **************************
FS(1,1)  1.00
Period  -0.96  1.00
AR(1,1)  0.45 -0.54  1.00
AS(1,1)  0.40 -0.31  0.09  1.00
W1(1,1) -0.34  0.43 -0.47 -0.11  1.00
W2(1,1)  0.18 -0.26  0.25 -0.04 -0.21  1.00
W2(2,1)  0.29 -0.35  0.31 -0.07 -0.31  0.39  1.00
W2(3,1) -0.17  0.21 -0.11  0.28  0.20  0.08  0.03  1.00
V(1,1)   0.05 -0.05  0.06  0.20 -0.05  0.03  0.02  0.09  1.00
Condition number = 137.4069
Reciprocal condition number = 0.0060
***********************************************************************
Note that the standard error for the period is 0.3 years. Not surprisingly, the point estimate of the
period and the optimal value of the objective function are equal to those obtained from the standard
representation.
Estimation of a constrained transfer function
With the last parametrization it is straightforward to impose any value of the period. If external
information suggests that the period is exactly three years, this constraint can be imposed
with the following code:
thetan = theta(:,1);
thetan(2,1) = 3;
thetan = [thetan, [0; 1; 0; 0; 0; 0; 0; 0; 0]];
dinn = touser(din,'pcons1');
[theta,it,lval,g,h] = e4min('lffast',thetan,'',dinn,y);
prtest(theta,din,labn,y,it,lval,g,h);
which yields:
******************** Results from model estimation ********************
Objective function: -172.2959
# of iterations: 33
Information criteria: AIC = -4.6281, SBC = -4.3731
Parameter    Estimate   Appr.Std.Dev.    t-test   Gradient
FS(1,1)        0.2097          0.1876    1.1182     0.0000
Period *       3.0000          0.0000    0.0000     0.0000
AR(1,1)       -0.4542          0.0960   -4.7311     0.0000
AS(1,1)       -0.8419          0.1354   -6.2185     0.0000
W1(1,1)        1.3176          0.1054   12.5022     0.0000
W2(1,1)       -0.0236          0.0179   -1.3182     0.0000
W2(2,1)       -0.1054          0.0188   -5.6168     0.0000
W2(3,1)       -0.0548          0.0192   -2.8483     0.0000
V(1,1)         0.0204          0.0017   12.1003     0.0001
* denotes constrained parameter
************************* Correlation matrix **************************
FS(1,1)  1.00
AR(1,1)  0.24  1.00
AS(1,1)  0.63  0.25  1.00
W1(1,1) -0.10 -0.23 -0.15  1.00
W2(1,1) -0.06  0.04 -0.06  0.03  1.00
W2(2,1)  0.21  0.20  0.03 -0.08  0.34  1.00
W2(3,1) -0.13  0.12  0.23  0.12  0.23  0.22  1.00
V(1,1)   0.02 -0.01  0.23 -0.06  0.01 -0.06  0.07  1.00
Condition number = 9.5355
Reciprocal condition number = 0.0925
***********************************************************************
Composite model forecasts
To compute forecasts for the endogenous variable of a transfer function - here consumption of
petrochemical products - it is common to use a two-stage approximation. The first stage consists of
computing forecasts for the inputs, often with univariate models. The second stage consists of
forecasting the endogenous variable by feeding the transfer function with the first stage forecasts.
This procedure ignores the stochastic nature of the input forecasts and, therefore, underestimates the
standard error of the output forecasts.
With E4 it is not necessary to proceed in this way, as it is possible to compute simultaneous forecasts
for both variables. To see this, consider the input model proposed by McLeod (1982):
$$(1 + \phi_1 B)(1 + \Phi_1 B^4)\,\nabla \ln IP_t = a_t \qquad (7.7)$$
To formulate a simultaneous model for the input and output, we first have to compensate for the
additional seasonal difference in the transfer function. This can be done by adding to
the univariate model a seasonal difference and a seasonal moving average with a unit
parameter, that is:
$$(1 + \phi_1 B)(1 + \Phi_1 B^4)\,\nabla\nabla_4 \ln IP_t = (1 - B^4)\,a_t \qquad (7.8)$$
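A small numerical check of this equivalence (plain Python with NumPy, not toolbox code): the constrained seasonal moving average $(1 + \Theta B^4)$ with $\Theta = -1$ is exactly the seasonal difference $\nabla_4 = 1 - B^4$, so (7.8) is just (7.7) with both sides multiplied by the same factor:

```python
import numpy as np

# Lag-polynomial coefficients, lowest power of B first.
seasonal_diff = np.array([1.0, 0.0, 0.0, 0.0, -1.0])   # nabla_4 = 1 - B^4
Theta = -1.0                                           # constrained seasonal MA
ma_factor = np.array([1.0, 0.0, 0.0, 0.0, Theta])      # 1 + Theta * B^4

# The unit-parameter seasonal MA coincides with the seasonal difference.
assert np.allclose(ma_factor, seasonal_diff)
```

This is why the constraint AS1(1,1) = -1 appears in the estimation code that follows.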
We can estimate this model with the code:
e4init;
load petro.dat;
petro = log(petro(:,1:2));
% Filters deterministic effects using the estimates of model (7.5)
petro(68,1) = petro(68,1) + .0350;
petro(69,1) = petro(69,1) + .1150;
petro(70,1) = petro(70,1) + .0478;
y = transdif(petro,1,1,1,4);
% Input model definition and estimation
[t2, d2, l2] = arma2thd([0],[0],[],[0],[0],4);
t2(3,1) = -1.;
t2 = [t2 [0; 0; 1; 0]];
t2 = e4preest(t2,d2,y(:,2));
[t2,it,lval,g,h] = e4min('lffast',t2,'',d2,y(:,2));
prtest(t2,d2,l2,y(:,2),it,lval,g,h)
which yields the following output:
******************** Results from model estimation ********************
Objective function: -177.9523
# of iterations: 17
Information criteria: AIC = -4.9282, SBC = -4.8326
Parameter     Estimate   Appr.Std.Dev.    t-test   Gradient
FR1(1,1)       -0.2124          0.1187   -1.7896     0.0000
FS1(1,1)        0.1565          0.1204    1.3006     0.0000
AS1(1,1) *     -1.0000          0.0000    0.0000     0.0000
V(1,1)          0.0180          0.0015   11.9012    -0.0004
* denotes constrained parameter
************************* Correlation matrix **************************
FR1(1,1)  1.00
FS1(1,1) -0.20  1.00
V(1,1)    0.03 -0.08  1.00
Condition number = 1.5466
Reciprocal condition number = 0.6364
***********************************************************************
Note that the first part of the code filters out the deterministic effect $\xi_t^{IV/74}$; see Eq. (7.5). We define
the composite model using as starting values the estimates contained in t1 (for the transfer function)
and in t2 (for the input model):
% TF definition and estimation
sar = [0 0]; ma = [0]; sma = [0]; v = [0];
w = [0];
[t1, d1, l1] = tf2thd([],[sar],[ma],[sma],[v],4,[w],[]);
t1 = e4preest(t1,d1,y);
[t1,it,lval,g,h] = e4min('lffast',t1,'',d1,y);
prtest(t1,d1,l1,y,it,lval,g,h)
% Composite model formulation
[theta,din,lab] = stackthd(t1,d1,t2,d2,l1,l2);
[theta,din,lab] = nest2thd(theta,din,1,lab);
prtmod(theta,din,lab);
which yields:
*************************** Model ***************************
Nested model in inputs (innovations model)
2 endogenous v., 0 exogenous v.
Seasonality: 4
SS vector dimension: 13
Submodels:
{
 Transfer function model (innovations model)
 1 endogenous v., 1 exogenous v.
 Seasonality: 4
 SS vector dimension: 8
 Parameters (* denotes constrained parameter):
    FS(1,1)     0.1520
    FS(1,2)     0.2660
    AR(1,1)    -0.5450
    AS(1,1)    -0.7781
    W1(1,1)     1.3966
    V(1,1)      0.0200
 --------------
 VARMAX model (innovations model)
 1 endogenous v., 0 exogenous v.
 Seasonality: 4
 SS vector dimension: 5
 Parameters (* denotes constrained parameter):
    FR1(1,1)   -0.2124
    FS1(1,1)    0.1565
    AS1(1,1) * -1.0000
    V(1,1)      0.0180
 --------------
}
*************************************************************
To obtain forecasts for the stationary variables, two different methods are compared: the standard
one, in which the forecasts for the input variable are treated as known when forecasting the output,
and the method used in the toolbox, in which the composite model yields simultaneous forecasts for
both the input and the output, with the variances computed accordingly. These calculations can be
done with the following code:
% Forecasting. First, conventional approach ...
[xf,Bfx] = foremod(t2,d2,y(:,2),8);
[yf1,Bfy1] = foremod(t1,d1,y,8,xf);
[yf1 xf sqrt(Bfy1) sqrt(Bfx)]
% ... and then, composite model forecasts
[yf,Bf] = foremod(theta,din,y,8);
[yf sqrt([Bf(1:2:2*8,1) Bf(2:2:2*8,2)])]
The conventional approach forecasts for the output and the input, with the corresponding standard
errors, are:
ans =
   -0.0350   -0.0123    0.0200    0.0185
    0.0436   -0.0068    0.0227    0.0189
   -0.0189    0.0058    0.0227    0.0189
    0.0250   -0.0015    0.0227    0.0189
   -0.0140    0.0016    0.0294    0.0278
    0.0071    0.0010    0.0311    0.0281
   -0.0090   -0.0009    0.0311    0.0282
    0.0088    0.0002    0.0311    0.0282
and the forecasts computed with the composite model are:
ans =
   -0.0350   -0.0123    0.0327    0.0185
    0.0436   -0.0068    0.0348    0.0189
   -0.0189    0.0058    0.0349    0.0189
    0.0250   -0.0015    0.0349    0.0189
   -0.0140    0.0016    0.0487    0.0278
    0.0071    0.0010    0.0501    0.0281
   -0.0090   -0.0009    0.0501    0.0282
    0.0088    0.0002    0.0501    0.0282
As expected, the forecast standard errors of the endogenous variable are higher when computed
using the composite model. The other results are the same.
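A back-of-envelope check (not E4 output) reproduces the one-step-ahead figure: if the input forecast error and the noise innovation are roughly uncorrelated, the composite standard error should be close to the square root of $\sigma_N^2 + \omega_{10}^2\sigma_x^2$, using the estimated gain $\omega_{10} \approx 1.3966$ and the first-row standard errors from the two tables above. An illustrative sketch in plain Python:

```python
import math

# First-row, one-step-ahead standard errors from the two tables above.
w10 = 1.3966        # estimated input gain W1(1,1)
se_noise = 0.0200   # conventional output std. error (input treated as known)
se_input = 0.0185   # std. error of the input forecast
se_composite = math.sqrt(se_noise**2 + (w10 * se_input)**2)
print(round(se_composite, 4))  # -> 0.0327, matching the composite table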
The code and data required to replicate this case can be found in the directory
\EXAMPLES\PETROL of the distribution diskette, files petro.dat, petrol1.m, petrol2.m
and pcons1.m.
Structural econometric models: supply and demand of food
Kmenta (1997) proposes a simple supply-demand model to explain the consumption and prices of
food, inspired by previous work of Girshick and Haavelmo (1947). In this section we use this model
to illustrate the application of E4 to structural (str) formulations. The model includes two
behavioural equations:
Demand: $Q_t = \alpha_1 + \alpha_2 P_t + \alpha_3 D_t + u_{1t}$ \qquad (7.9)
Supply: $Q_t = \beta_1 + \beta_2 P_t + \beta_3 F_t + \beta_4 A_t + u_{2t}$ \qquad (7.10)
The endogenous variables are $Q_t$, food consumption per head, and $P_t$, the ratio of food prices to general
consumer prices. On the other hand, the exogenous variables are the constant term, the disposable
income in constant prices $D_t$, the ratio of the preceding year's prices received by farmers to general
consumer prices $F_t$, and time $A_t$. The sample includes 20 yearly observations of all the variables,
and was taken from Kmenta (1997).
Maximum-likelihood estimation
The structural model (7.9)-(7.10) in matrix notation is:
$$\begin{bmatrix} 1 & -\alpha_2 \\ 1 & -\beta_2 \end{bmatrix}
\begin{bmatrix} Q_t \\ P_t \end{bmatrix} =
\begin{bmatrix} \alpha_1 & \alpha_3 & 0 & 0 \\ \beta_1 & 0 & \beta_3 & \beta_4 \end{bmatrix}
\begin{bmatrix} 1 \\ D_t \\ F_t \\ A_t \end{bmatrix} +
\begin{bmatrix} u_{1t} \\ u_{2t} \end{bmatrix} \qquad (7.11)$$

$$V\begin{bmatrix} u_{1t} \\ u_{2t} \end{bmatrix} =
\begin{bmatrix} v_1 & c_{12} \\ c_{12} & v_2 \end{bmatrix} \qquad (7.12)$$
Note that the left-hand-side matrix in (7.11) is not normalized; therefore, we cannot use str2thd to
define and estimate the structural form. Instead, we will compute the likelihood using the
observationally equivalent reduced form:
$$\begin{bmatrix} Q_t \\ P_t \end{bmatrix} =
\begin{bmatrix} 1 & -\alpha_2 \\ 1 & -\beta_2 \end{bmatrix}^{-1}
\begin{bmatrix} \alpha_1 & \alpha_3 & 0 & 0 \\ \beta_1 & 0 & \beta_3 & \beta_4 \end{bmatrix}
\begin{bmatrix} 1 \\ D_t \\ F_t \\ A_t \end{bmatrix} +
\begin{bmatrix} 1 & -\alpha_2 \\ 1 & -\beta_2 \end{bmatrix}^{-1}
\begin{bmatrix} u_{1t} \\ u_{2t} \end{bmatrix} \qquad (7.13)$$
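The mapping from (7.11) to (7.13) can be sketched numerically in plain Python (an illustration using the 2SLS starting values that appear later in the script; not toolbox code):

```python
import numpy as np

# Structural matrices built from the 2SLS starting values
# alpha = (95, -0.24, 0.31), beta = (49, 0.24, 0.5, 0.25).
F0 = np.array([[1.0,  0.24],    # [1, -alpha2]
               [1.0, -0.24]])   # [1, -beta2]
G0 = np.array([[95.0, 0.31, 0.0, 0.0],
               [49.0, 0.0,  0.5, 0.25]])

# Reduced-form coefficients: [Q_t; P_t] = Pi [1; D_t; F_t; A_t] + F0^{-1} u_t
Pi = np.linalg.solve(F0, G0)

# Observational equivalence: premultiplying by F0 recovers the structure.
assert np.allclose(F0 @ Pi, G0)
```

The same inversion is what the user function below performs, parameter vector in, SS matrices out.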
This reparametrization is implemented by means of a user function (food2ss) which: a) receives as
input a parameter vector th that contains the parameters in (7.11) and the covariances in (7.12);
b) generates the formulation (7.13) in thd format; and c) calls thd2sss to obtain the corresponding SS
model matrices. The code to define this function is the following:
function [Phi, Gam, E, H, D, C, Q, S, R] = food2ss(th, din)
% Returns the SS representation of Kmenta's model
% th(1:7) contains the structural parameters
% th(8:10) contains the lower triangle of the noise covariance matrix
t = th(:,1);
F0 = [1 -th(2); 1 -th(5)];
G0 = [th(1) th(3) 0 0; th(4) 0 th(6) th(7)];
V = vech2m(th(8:10),2);
% THD formulation of structural model
iF0 = diag(1./diag(F0)); % Normalizes the main diagonal elements
[theta,din] = str2thd(iF0*F0,[],[],[],iF0*V*iF0',1,iF0*G0,4);
% SS conversion
[Phi, Gam, E, H, D, C, Q, S, R] = thd2sss(theta,din);
Since the model includes many parameters and the sample is very short, the degrees of freedom are
not enough to use e4preest. Hence, we use 2SLS estimates as initial conditions for likelihood
optimization. Under these conditions, the model can be estimated with the following code:
% Model for the supply and demand of food from Kmenta (1986)
e4init
load food.dat
Q = food(:,1); P = food(:,2); D = food(:,3);
F = food(:,4); A = food(:,5); cte = food(:,6);
z = [Q P cte D F A];
% 2SLS estimates
t = [ 95; -0.24; 0.31; ...
      49; 0.24; 0.5; 0.25; 3.1; 1.7; 4.6];
% Model formulation
F0 = [1 -t(2); 1 -t(5)];
G0 = [t(1), t(3), 0, 0; ...
      t(4), 0, t(6), t(7)];
V = vech2m(t(8:10),2);
lab = str2mat('alpha1','alpha2','alpha3');
lab = str2mat(lab,'beta1','beta2','beta3','beta4');
lab = str2mat(lab,'v1','c12','v2');
[tdum,din] = str2thd([F0],[],[],[],V,1,[G0],4);
din = touser(din,'food2ss');
[p,iter,lnew,g,h] = e4min('lffast',t,'',din,z);
prtest(p,din,lab,z,iter,lnew,g,h);
[e, vT, wT, Ve, VvT, VwT] = residual(p,din,z,1);
tit = ['demand residuals'; 'supply residuals'];
midents(e,5,tit);
The output from prtest is:
******************** Results from model estimation ********************
Objective function: 67.7697
# of iterations: 69
Information criteria: AIC = 7.7770, SBC = 8.2748
Parameter    Estimate   Appr.Std.Dev.    t-test   Gradient
alpha1        93.2067          7.1386   13.0567     0.0000
alpha2        -0.2253          0.0863   -2.6100     0.0017
alpha3         0.3098          0.0385    8.0462     0.0017
beta1         51.4429         10.2259    5.0307     0.0000
beta2          0.2425          0.0908    2.6718    -0.0002
beta3          0.2207          0.0352    6.2605    -0.0006
beta4          0.3695          0.0604    6.1189     0.0002
v1             3.3465          1.1199    2.9881     0.0000
c12            4.2680          1.4227    2.9999     0.0000
v2             5.6395          1.8489    3.0502     0.0000
************************* Correlation matrix **************************
alpha1  1.00
alpha2 -0.90  1.00
alpha3  0.18 -0.58  1.00
beta1   0.76 -0.43 -0.46  1.00
beta2  -0.94  0.74  0.09 -0.92  1.00
beta3   0.19 -0.57  0.97 -0.46  0.09  1.00
beta4   0.16 -0.53  0.94 -0.46  0.10  0.93  1.00
v1     -0.28  0.32 -0.20 -0.13  0.23 -0.20 -0.17  1.00
c12    -0.28  0.28 -0.11 -0.19  0.26 -0.11 -0.08  0.99  1.00
v2     -0.27  0.22  0.02 -0.25  0.28  0.00  0.06  0.95  0.98  1.00
Condition number = 158565.5617
Reciprocal condition number = 0.0000
***********************************************************************
and the multiple autocorrelation function of the residuals is:
******** Autocorrelation and partial autoregression functions ********
                MACF      MPARF          MACF           MPARF
k = 1, Chi(k) =  0.81, AIC(k) =  0.52, SBC(k) = 0.82
demand res  .. | .. | -0.10 -0.18 | -0.10 -0.19
supply res  .. | .. | -0.05  0.02 | -0.05  0.02
k = 2, Chi(k) =  4.10, AIC(k) =  0.66, SBC(k) = 1.16
demand res  .. | .. | -0.03 -0.04 | -0.05 -0.05
supply res  .. | .. |  0.15 -0.09 |  0.14 -0.12
k = 3, Chi(k) =  6.17, AIC(k) =  0.60, SBC(k) = 1.30
demand res  .. | .. | -0.17  0.23 | -0.02  0.34
supply res  .. | .. | -0.13 -0.11 | -0.30 -0.21
k = 4, Chi(k) = 17.01, AIC(k) = -0.48, SBC(k) = 0.42
demand res  .. | .. |  0.22 -0.22 |  0.25 -0.31
supply res  -. | -. | -0.51 -0.20 | -0.89 -0.73
k = 5, Chi(k) =  6.55, AIC(k) = -0.77, SBC(k) = 0.33
demand res  .. | ++ |  0.22  0.33 |  1.20  1.40
supply res  .. | .. |  0.09 -0.03 |  0.01  0.01
The (i,j) element of the lag k matrix is the cross correlation (MACF)
or partial autoregression (MPACF) estimate when series j leads series i.
********************* Cross correlation functions *********************
            demand res  supply res
demand res  .....       .....
supply res  ...-.       .....
Each row is the cross correlation function when the column variable leads
the row variable.
Ljung-Box Q statistic for previous cross-correlations
            demand res  supply res
demand res        3.77        6.65
supply res        8.46        1.65
Summary in terms of +.-
For MACF std. deviations are computed as 1/T^0.5 = 0.22
For MPARF std. deviations are computed from VAR(k) model
***********************************************************************
The code and data required to replicate this case can be found in the directory \EXAMPLES\FOOD
of the distribution diskette, files food.dat, food.m and food2ss.m.
An ARCH model for the U.S. GNP deflator
In most econometric software, ARCH modeling options allow only a regression model for the
(conditional) mean and a univariate GARCH process for the variance. In contrast, E4 allows one to
combine transfer function or VARMAX models for the mean with vector ARCH, GARCH or
IGARCH models for its conditional variance.
In this example we illustrate the use of ARCH modeling functions with a model for the quarterly
U.S. implicit price deflator of GNP, from 1948:II to 1983:IV. This series has been analyzed by
Engle and Kraft (1983) and Bollerslev (1986). These authors model the (conditional) mean of the
deflator by an AR(4) process:
$$\pi_t = \phi_0 + \phi_1 \pi_{t-1} + \phi_2 \pi_{t-2} + \phi_3 \pi_{t-3} + \phi_4 \pi_{t-4} + \epsilon_t \qquad (7.18)$$

where $\pi_t = 100 \times \ln(P_t / P_{t-1})$ and $P_t$ is the GNP deflator. The difference between the two papers
lies in the parametric model assumed for the conditional variance: Engle and Kraft (1983) consider
that $\epsilon_t$ follows a constrained ARCH(8) process, while Bollerslev (1986) assumes a GARCH(1,1) process.
Estimation under homoscedasticity
The first step in this analysis will be to estimate the AR(4) model for inflation under the assumption
of homoscedasticity, that is, $E(\epsilon_t^2) = \sigma^2$ for all $t$. To do this, we can use the following code:
e4init;
load gnpn.dat;
y = gnpn; c = ones(size(y));
% Defines the AR(4) structure and computes preliminary estimates ...
[theta, din, lab] = arma2thd([0 0 0 0],[],[],[],0,4,0,1);
theta = e4preest(theta,din,[y c]);
% ... and then, computes ML estimates under homoscedasticity
[theta,it,lval,g,h] = e4min('lffast',theta,'',din,[y c]);
[std, corrm, varm, Im] = imod(theta,din,[y c]);
prtest(theta,din,lab,[y c],it,lval,g,h,std,corrm);
% Finally, computes the residuals and their squares, to detect ARCH effects
[ehat,vT,wT,vz1,vvT,vwT] = residual(theta,din,[y c]);
plotsers(ehat,0,'AR(4) residuals');
uidents(ehat,15,'AR(4) residuals');
uidents(ehat.^2,15,'AR(4) squared residuals');
which yields:
******************** Results from model estimation ********************
Objective function: 111.6734
# of iterations: 40
Information criteria: AIC = 1.6458, SBC = 1.7701
Parameter    Estimate   Std. Dev.    t-test   Gradient
FR1(1,1)      -0.5282      0.0827   -6.3834     0.0000
FR2(1,1)      -0.1925      0.0931   -2.0681     0.0000
FR3(1,1)      -0.2016      0.0932   -2.1643     0.0000
FR4(1,1)       0.1506      0.0838    1.7975     0.0000
G0(1,1)        0.2315      0.0794    2.9150     0.0000
V(1,1)         0.2773      0.0328    8.4549     0.0001
************************* Correlation matrix **************************
FR1(1,1)  1.00
FR2(1,1) -0.46  1.00
FR3(1,1) -0.15 -0.36  1.00
FR4(1,1) -0.13 -0.15 -0.46  1.00
G0(1,1)   0.20  0.11  0.12  0.21  1.00
V(1,1)    0.01  0.00  0.01 -0.01  0.01  1.00
Condition number = 42.9291
Reciprocal condition number = 0.0177
***********************************************************************
with the following diagnostic graphs:
[Figure: standardized plot of AR(4) residuals; A.C.F. of AR(4) residuals (LBQ = 18.84) and P.A.C.F. of AR(4) residuals; A.C.F. of AR(4) squared residuals (LBQ = 64.72) and P.A.C.F. of AR(4) squared residuals.]
The autocorrelation function of the squared residuals shows a persistent, long-memory structure.
This is a symptom of conditional heteroscedasticity.
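The visual diagnosis can be backed by a McLeod-Li-type statistic, that is, a Ljung-Box Q computed on the squared residuals, which is what the LBQ figure reported by uidents for ehat.^2 summarizes. A hand-rolled sketch in plain Python on synthetic volatility-clustered data (not the GNP series, and not toolbox code):

```python
# Hand-rolled Ljung-Box Q statistic for the first m autocorrelations of x.
def ljung_box(x, m):
    n = len(x)
    mean = sum(x) / n
    xc = [v - mean for v in x]
    denom = sum(v * v for v in xc)
    q = 0.0
    for k in range(1, m + 1):
        r_k = sum(xc[t] * xc[t + k] for t in range(n - k)) / denom
        q += r_k * r_k / (n - k)
    return n * (n + 2) * q

# A calm spell followed by a volatile spell: the squares are strongly
# autocorrelated even though the raw residuals alternate in sign.
resid = [0.1] * 50 + [3.0, -3.0] * 25
q_sq = ljung_box([e * e for e in resid], 5)
# q_sq lies far above the 1% chi-square(5) critical value (about 15.1)
```

A large Q on the squares, with a modest Q on the levels, is the classic signature of ARCH effects.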
Estimation of an ARCH(8) process for the error
Engle and Kraft (1983) try to capture this last feature by assuming that the AR(4) has a conditionally
heteroscedastic error term, such that $E(\epsilon_t^2) = \sigma^2$ and $E_{t-1}(\epsilon_t^2) = \sigma_t^2$, with:

$$\sigma_t^2 = \sigma^2 + \nu_t, \qquad \nu_t = \alpha_1 \sum_{i=1}^{8} \frac{9-i}{36}\,\nu_{t-i} + v_t \qquad (7.19)$$
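A quick check of the constraint (plain Python, not toolbox code): the lag weights $(9-i)/36$, $i = 1, \dots, 8$, decline linearly and sum exactly to one, so the single parameter $\alpha_1$ controls the overall strength of the effect:

```python
from fractions import Fraction

# Exact arithmetic for the constrained ARCH(8) lag weights (9 - i)/36.
weights = [Fraction(9 - i, 36) for i in range(1, 9)]

assert weights[0] == Fraction(2, 9)              # largest weight, at lag 1
assert weights == sorted(weights, reverse=True)  # linearly declining
assert sum(weights) == 1                         # they sum exactly to one
```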
Note that the terms in the ARCH model are actually a declining function of a single parameter. To
obtain the SS formulation of the constrained ARCH model we need to define a user function:
function [Phi,Gam,E,H,D,C,Q,Phig,Gamg,Eg,Hg,Dg] = arch8(thetan,dinn)
% User function for constrained ARCH(8) model
theta = zeros(14,1);
theta(1:6) = thetan(1:6,1);
alpha1 = thetan(7,1);
for i=1:8
   theta(i+6) = alpha1*(9-i)/36;
end
din = tomod(dinn);
[Phi Gam E H D C Q Phig Gamg Eg Hg Dg] = garch2ss(theta,din);
and then estimates can be obtained with the following code:
% Model: AR(4)+constrained ARCH(8). Implicit price deflator for GNP data.
e4init;
load gnpn.dat;
y = gnpn;
c = ones(size(y));
% Model for the mean
[thetay, diny, lab1] = arma2thd([0 0 0 0],[],[],[],[0],4,[0],1);
% Model for the conditional variance
[thetae, dine, lab2] = arma2thd([0 0 0 0 0 0 0 0],[],[],[],1,4);
% Defines the composite model
[theta, din, lab3] = garc2thd(thetay, diny, thetae, dine, lab1, lab2);
% Computes preliminary estimates
theta = e4preest(theta,din,[y c]);
prtmod(theta, din, lab3);
thetan = theta(1:7,1);
labn = lab3(1:7,:);
dinn = touser(din,'arch8');
% ... and ML estimates. Note the user function in the call to e4min
[thopt,it,lval,g,h] = e4min('lfgarch',thetan,'',dinn,[y c]);
prtest(thopt,dinn,labn,[y c],it,lval,g,h,[],[]);
The estimation results are:
******************** Results from model estimation ********************
Objective function: 89.8282
# of iterations: 74
Information criteria: AIC = 1.3542, SBC = 1.4993
Parameter    Estimate   Appr.Std.Dev.    t-test   Gradient
FR1(1,1)      -0.4021          0.0829   -4.8506     0.0001
FR2(1,1)      -0.1959          0.0826   -2.3701    -0.0001
FR3(1,1)      -0.3698          0.0887   -4.1705     0.0001
FR4(1,1)       0.1241          0.0896    1.3849     0.0001
G0(1,1)        0.1365          0.0551    2.4772    -0.0001
V(1,1)         1.1816          1.1001    1.0741     0.0000
FR1(1,1)      -0.9635          0.0406  -23.7462     0.0003
************************* Correlation matrix **************************
FR1(1,1)  1.00
FR2(1,1) -0.34  1.00
FR3(1,1) -0.08 -0.42  1.00
FR4(1,1) -0.41 -0.13 -0.42  1.00
G0(1,1)   0.12  0.11  0.13  0.14  1.00
V(1,1)    0.00  0.02  0.02 -0.04  0.07  1.00
FR1(1,1) -0.03 -0.02  0.04 -0.01 -0.05 -0.92  1.00
Condition number = 70.9489
Reciprocal condition number = 0.0131
***********************************************************************
Estimation of a GARCH(1,1) process for the error
A more sophisticated parametrization for the conditional variance is that of Bollerslev (1986), who
proposes the GARCH(1,1) model:
$$\sigma_t^2 = \sigma^2 + \nu_t, \qquad \nu_t = \phi_1\,\nu_{t-1} + v_t + \theta_1\,v_{t-1} \qquad (7.20)$$
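Equation (7.20) is an ARMA(1,1) representation of the conditional variance; the same process is commonly written as the recursion $\sigma_t^2 = \omega + \alpha\,\epsilon_{t-1}^2 + \beta\,\sigma_{t-1}^2$. A minimal sketch of that recursion with made-up parameters (illustrative Python, not the E4 parameterization):

```python
# Textbook GARCH(1,1) recursion with made-up parameters; alpha + beta < 1
# gives covariance stationarity and a finite long-run variance.
omega, alpha, beta = 0.05, 0.10, 0.85
long_run = omega / (1 - alpha - beta)   # unconditional variance = 1.0

sigma2 = 2.0                            # start away from the fixed point
for _ in range(500):
    eps2 = sigma2                       # replace eps2_{t-1} by its expectation
    sigma2 = omega + alpha * eps2 + beta * sigma2
# sigma2 has now converged to the long-run variance
```

The slow, geometric pull toward the long-run variance is what makes a GARCH(1,1) behave like a high-order, constrained ARCH, which anticipates the comparison below.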
The code needed to estimate the mean and variance models and to perform a validation analysis is:
% Model: AR(4)+GARCH(1,1). Implicit price deflator for GNP data.
e4init;
load gnpn.dat;
y = gnpn;
c = ones(size(y));
% Model for the mean
[thetay, diny, lab1] = arma2thd([0 0 0 0],[],[],[],[0],4,[0],1);
% Model for the conditional variance
[thetae, dine, lab2] = arma2thd([0],[],[0],[],1,4);
% Composite model
[theta, din, lab3] = garc2thd(thetay, diny, thetae, dine, lab1, lab2);
% Computes preliminary estimates
theta = e4preest(theta,din,[y c]);
prtmod(theta, din, lab3);
% ... and ML estimates
[thopt,it,lval,g,h] = e4min('lfgarch',theta,'',din,[y c]);
[std,corrm,varm,Im] = igarch(thopt,din,[y c]);
prtest(thopt,din,lab3,[y c],it,lval,g,h,std,corrm);
% Validation. Note that residual.m only returns 'ehat' and 'vz1'
[ehat,vT,wT,vz1] = residual(thopt,din,[y c]);
figure; whitebg('w');
plot(vz1); title('Conditional variance')
plotsers(ehat,0,'original residuals');
stdres = ehat./sqrt(vz1);
plotsers(stdres,0,'standardized residuals');
uidents(stdres,15,'standardized residuals');
uidents(stdres.^2,15,'standardized squared residuals');
[Figure: standardized plots of the original residuals and of the standardized residuals.]
which yields the following results:
******************** Results from model estimation ********************
Objective function: 87.5947
# of iterations: 65
Information criteria: AIC = 1.3370, SBC = 1.5027
Parameter    Estimate   Std. Dev.     t-test   Gradient
FR1(1,1)      -0.4186      0.0927    -4.5158    -0.0002
FR2(1,1)      -0.2090      0.0951    -2.1963    -0.0013
FR3(1,1)      -0.3295      0.0934    -3.5282    -0.0004
FR4(1,1)       0.1098      0.0889     1.2351    -0.0008
G0(1,1)        0.1441      0.0617     2.3365    -0.0002
V(1,1)         1.5613      1.0813     1.4440    -0.0001
FR1(1,1)      -0.9982      0.0025  -396.3847    -0.0029
AR1(1,1)      -0.8373      0.0462   -18.1330    -0.0002
************************* Correlation matrix **************************
FR1(1,1)  1.00
FR2(1,1) -0.43  1.00
FR3(1,1) -0.19 -0.34  1.00
FR4(1,1) -0.29 -0.19 -0.38  1.00
G0(1,1)   0.18  0.10  0.11  0.11  1.00
V(1,1)   -0.04  0.03  0.10 -0.03  0.10  1.00
FR1(1,1)  0.04 -0.01 -0.13  0.02 -0.11 -0.50  1.00
AR1(1,1) -0.04  0.01 -0.08  0.07 -0.01  0.11  0.42  1.00
Condition number = 85.6653
Reciprocal condition number = 0.0087
***********************************************************************
Note that the estimates of the AR(4) model for the mean, the likelihood function and the information
criteria are very similar to those of the constrained ARCH(8). This evidence indicates that the two
representations are nearly equivalent for these data.
[Figure: A.C.F. of standardized residuals (LBQ = 8.26) and P.A.C.F. of standardized residuals; A.C.F. of standardized squared residuals (LBQ = 15.06) and P.A.C.F. of standardized squared residuals; plot of the conditional variance.]
The validation output does not show any evidence of misspecification. Note, however, that the plot
of the standardized residuals shows some outliers which are not due to heteroscedasticity, as the
autocorrelation function of the standardized residuals indicates.
Finally, the plot of the conditional variance shows the time-varying variances implied by this model.
The code and data required to replicate this case can be found in the directory
\EXAMPLES\GARCH of the distribution diskette, files gnpn.dat, gre1.m, gre2.m, gre3.m and
arch8.m.
Forecasting and monitoring of objectives
Many firms define growth objectives for strategic variables. Although these goals are usually
set for a medium-term horizon - often one year - it is desirable to monitor at a higher frequency
the degree to which they are being met. To do this, it is necessary to compute intermediate
objectives consistent with both the medium-term goals and the dynamics of the target variable.
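The idea can be illustrated with the simplest nonstationary process: for a pure random walk, the expected path conditional on the last observation and on a fixed year-end target is the straight line joining them (a Brownian-bridge-type result). This is an illustrative sketch in plain Python, not toolbox code; the airline-model path produced by fismiss in the example that follows additionally reflects the seasonal pattern.

```python
# Conditional path for a random walk y_t = y_{t-1} + a_t, given the last
# observation y0 and a fixed 12-step-ahead target:
# E[y_k | y0, y12] = y0 + (k/12) * (y12 - y0), for k = 1..12.
y0 = 100.0
target = y0 * 1.25          # a 25% growth objective for the year
path = [y0 + (k / 12) * (target - y0) for k in range(1, 13)]
# path rises linearly from y0 toward the target and ends exactly on it
```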
Consider the well-known series G of international airline passengers, from January 1949 to
December 1960, see Box, Jenkins and Reinsel (1994). This data is adequately represented by an
IMA(1,1)× IMA(1,1)12 process:
$$(1 - B)(1 - B^{12})\,y_t = (1 + \theta B)(1 + \Theta B^{12})\,a_t \qquad (7.21)$$
The following code defines and estimates this model:
e4init
load airline.dat
y = log(airline);
% Defines the nonstationary version of the airline model
% The parameters corresponding to the unit roots are constrained
[theta, din, lab] = arma2thd([-1],[-1],[0],[0],[0],12);
theta = [theta zeros(size(theta))];
theta(1,2) = 1;
theta(2,2) = 1;
% Computes preliminary estimates
theta = e4preest(theta,din,y);
% ... and then ML estimates
[thopt,it,lval,g,h] = e4min('lffast', theta, '', din, y);
[std, corrm, varm, Im] = imod(thopt,din,y);
prtest(thopt,din,lab,y,it,lval,g,h,std,corrm);
and yields the following output:
******************** Results from model estimation ********************
Objective function: -232.7503
# of iterations: 15
Information criteria: AIC = -3.1910, SBC = -3.1291
Parameter     Estimate   Std. Dev.    t-test   Gradient
FR1(1,1) *     -1.0000      0.0000    0.0000     0.0000
FS1(1,1) *     -1.0000      0.0000    0.0000     0.0000
AR1(1,1)       -0.4018      0.0766   -5.2438     0.0002
AS1(1,1)       -0.5569      0.0738   -7.5450     0.0001
V(1,1)          0.0367      0.0022   16.9706     0.0012
* denotes constrained parameter
************************* Correlation matrix **************************
AR1(1,1)  1.00
AS1(1,1)  0.00  1.00
V(1,1)    0.00  0.00  1.00
Condition number = 1.0001
Reciprocal condition number = 0.9999
***********************************************************************
which is very similar to the results reported by Box, Jenkins and Reinsel (1994).
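The reported information criteria can be reproduced from the objective function. This is a side check, not part of the toolbox output; the per-observation definitions of AIC and SBC and the count k = 3 of free parameters are assumptions inferred from the numbers above.

```python
import math

# Hypothetical check of the reported criteria, assuming per-observation
# definitions AIC = (-2*logL + 2k)/N and SBC = (-2*logL + k*ln(N))/N
N, k, logL = 144, 3, 232.7503   # 12 years of monthly data; logL = -objective
aic = (-2 * logL + 2 * k) / N
sbc = (-2 * logL + k * math.log(N)) / N
```

Both values agree with the -3.1910 and -3.1291 printed by prtest, which supports this reading of the output.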
Assume that a 25% growth objective is defined for the next year. To track this target it is useful to compute monthly growth objectives consistent with the end-of-year objective. This monthly growth path can be computed by applying fismiss to an augmented time series which contains: a) the past observations, b) eleven missing values, corresponding to the first months of the year, and c) a twelfth value equal to the sales objective. The following code computes the unconditional forecasts for the series and a set of forecasts conditional on exact accomplishment of the growth target, and plots the results:
N = size(airline,1);
[yfor,Bfor] = foremod(thopt,din,y,12);
Bfor = sqrt(Bfor);

% Computes the conditional forecasts and plots all the results
yobj = log(airline(N,1)*1.25);   % End year target
yext = [y; NaN*ones(11,1); yobj];
[yhat, Bhat] = fismiss(thopt,din,yext);

figure; hold on
plot([exp(y((N-23):N,1));yfor*NaN],'k-')
plot([y((N-23):N,1)*NaN;exp(yfor)],'k--');
plot([y((N-23):N,1)*NaN;exp(yhat(N+1:N+12))],'ko');
plot([y((N-23):N,1)*NaN;exp(yfor+1.96*Bfor)],'k-.');
plot([y((N-23):N,1)*NaN;exp(yfor-1.96*Bfor)],'k-.');
grid
whitebg('w');
hold off
The output is a plot displaying the last two years of data (continuous line), the forecasts (dashed line), their 95% confidence interval (dash-dot lines) and the projected growth path to the target (denoted by "o"):
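As a back-of-the-envelope benchmark (not part of the toolbox output), a 25% annual target spread evenly in logs corresponds to a constant monthly growth factor of 1.25^(1/12), about 1.9% per month; the model-based path above departs from such an even spread because it follows the seasonal pattern of the series.

```python
import math

# Hypothetical benchmark path: spread a 25% yearly growth target evenly in logs
monthly_log_step = math.log(1.25) / 12       # equal log-increment per month
monthly_factor = math.exp(monthly_log_step)  # = 1.25 ** (1/12), about 1.0188
annual_growth = monthly_factor ** 12 - 1     # compounds back to the 25% target
```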
The code and data required to replicate this case can be found in the directory
\EXAMPLES\AIRLINE of the distribution diskette, files airline.m and airline.dat.
Disaggregation of value added in industry
The analysis of irregularly observed time series is an important problem faced by analysts. E4 has several functions that can be used to: a) estimate models relating time series observed at unequally spaced intervals, and b) estimate high frequency data from a low frequency sample.
This section illustrates these capabilities by disaggregating the yearly series of value added in industry (VAI) using as an indicator the monthly series of the index of industrial production (IIP). Both series correspond to the Spanish economy and the sample includes data from 1975 to 1995.
Estimation of the high frequency data model
The first problem that arises when analyzing irregularly observed time series is the specification of a high frequency model from low frequency data. After a standard analysis of the sample information, the following transfer function was found adequate to model the relationship between VAI and IIP in the low frequency (yearly) sampling interval:
Y_t = .268 Y_{t-1} + 6.705 X_t + N_t
     (.079)         (.650)
                                                     (7.22)
∇N_t = a_t ,   σ_a = 96.234
                   (15.216)

where Y_t denotes the VAI in year t, X_t = x_t1 + ... + x_t12, x_ti being the value of the IIP in the i-th month of year t, and a_t is a white noise process.
We will assume that the high frequency relationship between these variables is coherent with the low
frequency transfer function. There are several models satisfying this constraint. For example, the
following models are coherent with the relationship specified for the yearly series:
y_ti = .268 y_{(t-1)i} + 6.705 x_ti + n_ti
                                                     (7.23)
∇_12 n_ti = a_ti ,   σ_a = 27.780
and:
y_ti = .268 y_{(t-1)i} + 6.705 x_ti + n_ti
                                                     (7.24)
(1 + 0.265 B^12) ∇ n_ti = a_ti ,   σ_a = 2.928
where y_ti is the VAI in month i of year t.
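A quick consistency check on (7.22) and (7.23), under the assumption that the monthly errors are mutually independent: the yearly noise increment ∇N_t is the sum of the twelve monthly seasonal differences ∇_12 n_ti, so the yearly standard deviation should be √12 times the monthly one.

```python
import math

# The yearly noise in (7.22) aggregates twelve independent monthly errors
# from (7.23), so sigma_yearly = sqrt(12) * sigma_monthly
sigma_monthly = 27.780
sigma_yearly = sigma_monthly * math.sqrt(12)  # should be close to 96.234
```

This reproduces the 96.234 reported for the yearly model, as expected.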
Disaggregation from nonstationary models
Assuming that the model (7.23) is adequate, it has to be formulated in a toolbox standard form. For
example, it can be written as a nonstationary transfer function:
y_ti = [ 6.705 / (1 - .268 B^12) ] x_ti + n_ti ;   (1 - 1.268 B^12 + .268 B^24) n_ti = a_ti      (7.25)
The data loading and the translation of model (7.25) to the equivalent THD format can be done by
means of the following commands:
% Disaggregation of value added in industry
% First, using nonstationary models
e4init;
load ipi.dat; load vai.dat;
x = ipi; y = vai;
[theta, din, lab] = tf2thd([],[-1.268 .268],[],[], [27.780], ...
    12, [6.705], [0 0 0 0 0 0 0 0 0 0 0 -.268]);
Note that, since the model includes a seasonal structure in the transfer function, the denominator of
the transfer function is defined as a 12th order polynomial, whose first eleven coefficients are zero.
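Incidentally, the noise polynomial passed to tf2thd, (1 - 1.268B^12 + .268B^24), is the product of the seasonal unit root (1 - B^12) and the factor (1 - .268B^12) from the transfer function denominator. A quick check of the coefficient arithmetic (sketched in Python, outside the toolbox; the coefficients are in powers of B^12):

```python
import numpy as np

# Multiply (1 - B^12) by (1 - 0.268 B^12); entries are the coefficients of
# successive powers of B^12
coeffs = np.convolve([1.0, -1.0], [1.0, -0.268])
# recovers [1, -1.268, 0.268], matching the [-1.268 .268] argument of tf2thd
```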
Finally we build an aggregate series. Assuming that vector y contains the VAI series, then:
yagr = NaN*zeros(252,1);
yagr(12:12:252) = y;
where 252 is the monthly sample size. Thus, yagr contains information only on the last month of each year, which corresponds to the sum of the monthly values of VAI. Then, we call the aggrmod function and plot the results:
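The same NaN-coding pattern can be sketched outside the toolbox (here in Python, with 0-based indexing and a stand-in yearly series; purely illustrative):

```python
import numpy as np

n_years, s = 21, 12                  # 1975-1995 yearly data, monthly period
y = np.arange(1.0, n_years + 1)      # stand-in for the 21 yearly VAI values
yagr = np.full(n_years * s, np.nan)  # 252 monthly slots, all missing...
yagr[s - 1 :: s] = y                 # ...except each year-end slot, which
                                     # holds the yearly aggregate
```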
[yhat, vyhat] = aggrmod(theta, din, [yagr x], 12);
figure; whitebg('w');
plot(vyhat);
title('Variance of the monthly VAI')
plotsers(yhat,0,'monthly VAI');
where the IIP monthly data, which serves as an indicator, is contained in x and the disaggregated VAI data is stored in yhat. The output shows that the variance of the interpolated data grows as we move away from the forecast origin. This happens because the variable is nonstationary.
[Figures: variance of the monthly VAI and standardized plot of the monthly VAI]
To perform the disaggregation with the alternative model (7.24) we can use the commands:
[theta2,din2] = tf2thd([-1],[-.003 -.07],[],[], [2.928], ...
    12, [6.705], [0 0 0 0 0 0 0 0 0 0 0 -.268]);
[yhat2, vyhat2] = aggrmod(theta2, din2, [yagr x],12);
figure; whitebg('w');
plot(vyhat2);
title('Variance of the monthly VAI')
plotsers(yhat2,0,'monthly VAI');
which generate the following plots:
Note that the profile of the interpolated VAI is very similar in both cases. The main differences are
in the variances which, in this last case, are bounded and substantially lower than those
corresponding to the first model. This happens because the second model implies that both variables
are seasonally cointegrated.
Disaggregation from a stationary model
An alternative way to disaggregate VAI with bounded uncertainty consists of writing the stationary
version of (7.23):
∇_12 y_ti = .268 ∇_12 y_{(t-1)i} + 6.705 ∇_12 x_ti + a_ti      (7.26)
so the commands to perform the disaggregation and display the results in this new case are:
dyagr = transdif(yagr,1,0,1,12);
dx = transdif(x,1,0,1,12);
[dtheta,ddin] = arma2thd([],[-.268],[],[], [27.780], 12,[6.705], 1);
[dyhat,dvyhat] = aggrmod(dtheta,ddin,[dyagr dx],12);
plotsers(dyhat,0,'annual increments of monthly VAI');
In this case the variance of the interpolated series is constant, due to the stationarity of the model.
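To see why differencing preserves the aggregation structure, note that the lag-12 difference of the NaN-coded series is itself NaN everywhere except at the year-end slots, where it equals the yearly increment. A Python sketch with toy numbers (an illustration of the pattern, not the transdif code):

```python
import numpy as np

yagr = np.full(36, np.nan)
yagr[11::12] = [10.0, 12.0, 15.0]  # three year-end observations
dyagr = yagr[12:] - yagr[:-12]     # lag-12 (seasonal) difference
# only the year-end slots survive: 12 - 10 = 2 and 15 - 12 = 3
```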
The code and data required to replicate this case can be found in the directory \EXAMPLES\AGGR
of the distribution diskette, files ipi.dat, vai.dat and aggr1.m.
Models with observation errors: Wölfer's sunspots data
This section illustrates the specification and estimation of models with observation errors using Wölfer's classic time series of the total number of sunspots from 1749 to 1924.
Univariate modeling
A preliminary analysis of the data reveals outliers in the observations corresponding to 1777, 1786, 1836, 1848 and 1870. After an intervention analysis to remove these effects, the standard specification tools suggest an AR(2) structure, but an overparametrization exercise reveals that all the parameters of an ARMA(2,2) process:
(1 + φ1 B + φ2 B^2)(SC_t - μ_SC) = (1 + θ1 B + θ2 B^2) a_t      (7.27)
are significant. Then, this will be our tentative univariate model. To estimate it one may use the
following code:
e4init
load wolfercc.dat;
wolf10 = wolfercc/10;
wolf10 = wolf10-mean(wolf10);

% Defines an ARMA(2,2) model and computes preliminary estimates
[theta1, din1, lab1] = arma2thd([0 0],[],[0 0],[],[0],1);
theta1 = e4preest(theta1,din1,wolf10);

% ML estimation
[thopt1,it,lval,g,h] = e4min('lffast', theta1, '', din1, wolf10);
[std, corrm, varm, Im] = imod(thopt1, din1, wolf10);
prtest(thopt1,din1,lab1,wolf10,it,lval,g,h,std,corrm);

% Computation of residuals and diagnosis
[ehat,vT,wT,vz1,vvT,vwT] = residual(thopt1,din1,wolf10);
descser(ehat,'sunspots data: residuals of ARMA(2,2)');
plotsers(ehat,0,'sunspots data: residuals of ARMA(2,2)');
uidents(ehat,25,'sunspots data: residuals of ARMA(2,2)');
Note that the series is scaled to homogenize the metrics of all the parameters. This is an advisable
practice to reduce round-off errors. The estimation and diagnosis results are:
******************** Results from model estimation ********************
 Objective function: 280.9808
 # of iterations: 17
 Information criteria: AIC = 3.2498, SBC = 3.3399
 Parameter    Estimate   Std. Dev.     t-test   Gradient
 FR1(1,1)      -1.4511      0.0756   -19.2011     0.0003
 FR2(1,1)       0.7656      0.0646    11.8439     0.0000
 AR1(1,1)      -0.1764      0.1024    -1.7227     0.0004
 AR2(1,1)       0.2105      0.0915     2.3005     0.0000
 V(1,1)         1.1850      0.0632    18.7580     0.0003
************************* Correlation matrix **************************
 FR1(1,1)  1.00
 FR2(1,1) -0.89  1.00
 AR1(1,1)  0.69 -0.65  1.00
 AR2(1,1)  0.50 -0.30  0.23  1.00
 V(1,1)    0.01 -0.01  0.00  0.00  1.00
 Condition number = 32.9985
 Reciprocal condition number = 0.0284
***********************************************************************
***************** Descriptive statistics *****************
 --- Statistics of sunspots data: residuals of ARMA(2,2) ---
 Valid observations = 176
 Mean = -0.0018, t test = -0.0202
 Standard deviation = 1.1832
 Skewness = 0.4676
 Excess Kurtosis = 0.1654
 Quartiles = -0.8183, -0.1317, 0.7015
 Minimum value = -3.0700, obs. # 170
 Maximum value = 3.6886, obs. # 169
 Jarque-Bera = 6.6134
 Dickey-Fuller = -2.0661, computed with 13 lags
 Dickey-Fuller = -13.3641, computed with 1 lags
 Outliers list
[Figures: standardized plot, A.C.F. (LBQ = 34.43) and P.A.C.F. of the ARMA(2,2) residuals]
 Obs #      Value
     3    -2.3786
    13     2.5093
    14    -2.5745
    19     2.4281
    38     2.5026
    87     3.3564
    99     2.3964
   103     2.6194
   120     2.7278
   169     3.6886
   170    -3.0700
************************************************************
The analysis of residuals indicates that they could be non-normal, perhaps due to some remaining
outliers. On the other hand, there are no symptoms of any remaining autocorrelation structure, so we
consider the model statistically adequate.
Model with observation errors
Let SC*_t be the "true" number of sunspots, assumed to follow an AR(2) process observed with white noise errors:

(1 + φ1 B + φ2 B^2) SC*_t = a_t      (7.28)

SC_t = μ_SC + SC*_t + v_t      (7.29)

where v_t is the observation error and SC_t is the observed number of sunspots in year t. As is well known, this model is observationally equivalent to an ARMA(2,2) with complex constraints on its parameters. To estimate the model (7.28)-(7.29) and perform a standard diagnosis, we can use the commands:
% Defines an AR(2)+white noise and computes preliminary estimates
[th1, d1, l1] = arma2thd([0 0],[],[],[],[0],1);
[th2, d2, l2] = arma2thd([],[],[],[],[0],1);
[theta,din,lab] = stackthd(th1,d1,th2,d2,l1,l2);
[theta2,din2] = comp2thd(theta,din,lab);
lab2 = str2mat(l1,'Vu');
theta2 = e4preest(theta2,din2,wolf10);

% Estimation
[thopt2,it,lval,g,h] = e4min('lffast', theta2, '', din2, wolf10);
[std, corrm, varm, Im] = imod(thopt2, din2, wolf10);
prtest(thopt2,din2,lab2,wolf10,it,lval,g,h,std,corrm);
period = 2*pi/acos(-thopt2(1,1)/(2*sqrt(thopt2(2,1))));
disp(sprintf('period = %4.2f years', period));

% Validation
[ehat,vT,wT,vz1,vvT,vwT] = residual(thopt2,din2,wolf10);
descser(ehat,'sunspots data: residuals of AR(2)+error');
plotsers(ehat,0,'sunspots data: residuals of AR(2)+error');
uidents(ehat,25,'sunspots data: residuals of AR(2)+error');
which yield the output:
******************** Results from model estimation ********************
 Objective function: 282.1741
 # of iterations: 9
 Information criteria: AIC = 3.2520, SBC = 3.3240
 Parameter    Estimate   Std. Dev.     t-test   Gradient
 FR1(1,1)      -1.5220      0.0527   -28.9039     0.0000
 FR2(1,1)       0.8081      0.0512    15.7817     0.0000
 V(1,1)         0.9686      0.0893    10.8449     0.0001
 Vu             0.3885      0.0681     5.7067     0.0000
************************* Correlation matrix **************************
 FR1(1,1)  1.00
 FR2(1,1) -0.88  1.00
 V(1,1)    0.46 -0.43  1.00
 Vu       -0.40  0.37 -0.59  1.00
 Condition number = 21.5291
 Reciprocal condition number = 0.0520
***********************************************************************
 period = 11.19 years
Note that these results do not reject the hypothesis of observation errors, as the estimate of their standard deviation is significant. Also, the values of the likelihood function and information criteria are very similar for both representations, indicating that they have the same explanatory power.
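The cycle period printed above comes from the complex roots of the AR(2) polynomial: for 1 + φ1 B + φ2 B² with φ1² < 4φ2, the roots have argument ω = arccos(-φ1 / (2√φ2)) and the implied period is 2π/ω, which is what the sprintf line in the script computes. Plugging in the estimates (a numeric check in Python, outside the toolbox):

```python
import math

phi1, phi2 = -1.5220, 0.8081                      # estimated FR1, FR2
omega = math.acos(-phi1 / (2 * math.sqrt(phi2)))  # argument of the complex roots
period = 2 * math.pi / omega                      # cycle length in years
```

This recovers the familiar sunspot cycle of about 11.19 years.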
The following residual analysis does not suggest any alternative specification.
***************** Descriptive statistics *****************
 --- Statistics of sunspots data: residuals of AR(2)+error ---
 Valid observations = 176
 Mean = -0.0003, t test = -0.0036
 Standard deviation = 1.1915
 Skewness = 0.4999
 Excess Kurtosis = 0.2197
 Quartiles = -0.8169, -0.1622, 0.6533
 Minimum value = -3.2339, obs. # 170
 Maximum value = 3.7674, obs. # 169
 Jarque-Bera = 7.6841
 Dickey-Fuller = -2.0962, computed with 13 lags
 Dickey-Fuller = -13.1245, computed with 1 lags
 Outliers list
 Obs #      Value
     3    -2.4194
    14    -2.5054
    19     2.4211
    38     2.6037
[Figures: standardized plot, A.C.F. (LBQ = 38.07) and P.A.C.F. of the AR(2)+error residuals; plot of smoothed versus original sunspots]
    87     3.4437
    99     2.4903
   103     2.6257
   120     2.6230
   169     3.7674
   170    -3.2339
************************************************************
Finally, we can "clean" the data by subtracting the smoothed estimates of the observation errors:
% Compute the 'clean' series
sunsp = wolf10-vT(:,2);
figure; hold on
plot(sunsp,'k-')
plot(wolf10,'ko');
grid
whitebg('w');
hold off
title('Plot of smoothed versus original sunspots (scaled deviations)');
which also generates the following plot, in which the original data are represented by hollow circles and the smoothed series by a continuous line:
The code and data required to replicate this case can be found in the directory
\EXAMPLES\WOLFER of the distribution diskette, files wolfercc.dat and wolfer.m.
Structural time series models
A recent trend in econometrics proposes the use of structural time series models. In this example we use this methodology to build a model for annual Belgian GDP data, from 1950 to 1986, see García-Ferrer et al. (1996). These authors propose the following specification:
y_t = T_t + ε_t
∇T_t = S_{t-1}
∇S_t = η_t

where y_t is the log of GDP, T_t is a trend variable, S_t is the change of the trend and ε_t, η_t are independent white noise processes. Thus, the only unknown parameters of the model are the noise variances.
It is frequent in the literature to set the ratio between the variances (known as the noise-variance ratio, or NVR) to some heuristic small value. Given this value, it is easy to estimate the trend, whose behaviour is described by an ARIMA model. With E4 it is possible to obtain exact maximum likelihood estimates of both variances. This requires defining an SS version of the previous model:
[T_t; S_t] = [1 1; 0 1] [T_{t-1}; S_{t-1}] + [0; 1] η_t      (7.35)

y_t = [1 0] [T_t; S_t] + ε_t      (7.36)
The data loading and model formulation requires the following code:
% Non observable components model
e4init
load belgi.dat;
y = log(belgi)*1000;
[theta,din,lab] = ss2thd([1 1; 0 1],[],[0;1],[1 0],[],[1],[0],[0],[0]);
Next, one must constrain the values of the known parameters and compute preliminary estimates for
the rest.
% All parameters except the variances are constrained
theta = [theta ones(12,1)];
theta(10,2) = 0;
theta(12,2) = 0;

% Compute preliminary estimates
theta = e4preest(theta,din,y);
And the commands for model estimation are:
% ... and optimize the likelihood function
[thopt,it,lval,g,h] = e4min('lffast', theta,'', din, y);
[std, corrm, varm, Im] = imod(thopt, din, y);
prtest(thopt,din,lab,y,it,lval,g,h,std,corrm);
NVR = thopt(10,1)/thopt(12,1);
disp(sprintf('NVR = %4.2f ', NVR));
which yields:
******************** Results from model estimation ********************
 Objective function: 155.3452
 # of iterations: 12
 Information criteria: AIC = 8.7414, SBC = 8.8294
 Parameter    Estimate   Std. Dev.    t-test   Gradient
 PHI(1,1) *     1.0000      0.0000    0.0000     0.0000
 PHI(2,1) *     0.0000      0.0000    0.0000     0.0000
 PHI(1,2) *     1.0000      0.0000    0.0000     0.0000
 PHI(2,2) *     1.0000      0.0000    0.0000     0.0000
 E(1,1)   *     0.0000      0.0000    0.0000     0.0000
 E(2,1)   *     1.0000      0.0000    0.0000     0.0000
 H(1,1)   *     1.0000      0.0000    0.0000     0.0000
 H(1,2)   *     0.0000      0.0000    0.0000     0.0000
 C(1,1)   *     1.0000      0.0000    0.0000     0.0000
 Q(1,1)       131.0239     59.9687    2.1849     0.0000
 S(1,1)   *     0.0000      0.0000    0.0000     0.0000
 R(1,1)        99.6790     35.3129    2.8227     0.0000
 * denotes constrained parameter
************************* Correlation matrix **************************
 Q(1,1)  1.00
 R(1,1) -0.30  1.00
 Condition number = 1.8462
 Reciprocal condition number = 0.5416
***********************************************************************
 NVR = 1.31
Note that the ML estimate of the NVR is much higher than the values usually assumed by practitioners of this methodology, implying a highly adaptive trend. This is not surprising, as the ML criterion selects the values of the parameters (in this case, both variances) that allow the closest replication of the data movements.
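The NVR printed above is just the ratio of the two estimated variances, Q (trend noise) over R (observation noise), mirroring the thopt(10,1)/thopt(12,1) line in the script. A trivial arithmetic check with the figures from the output:

```python
q, r = 131.0239, 99.6790  # estimated trend (Q) and observation (R) variances
nvr = q / r               # noise-variance ratio, about 1.31
```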
Computation and modeling of unobservable components
The second stage of the analysis requires the extraction of the trend. We can do this with fismod by
means of the following commands:
[Figures: plot of log(PIB) versus trend; standardized plot of changes of the trend]
% Smoothing to estimate the trend
[xhat,phat,e] = fismod(thopt,din,y);
trend = xhat(:,1);
deltat = transdif(trend,1,1);

figure; hold on
plot(trend/1000,'k-')
plot(y/1000,'ko');
grid
whitebg('w');
hold off
title('plot of log(PIB) versus trend');
plotsers(deltat,0,'changes of the trend');
which generate the plots:
The last stage of this analysis involves building a univariate model for the trend. After a preliminary
analysis (not shown) an ARIMA(2,2,0) model is fitted and diagnosed with the code:
% Model for the trend
y1 = transdif(xhat(:,1),1,2);
[theta3, din3, lab3] = arma2thd([0 0],[],[],[],[0],1);

% Computes preliminary and ML estimates
theta3 = e4preest(theta3,din3,y1);
[thetan,it,lval,g,h] = e4min('lffast',theta3,'',din3, y1);
[std,corrm,varm,Im] = imod(thetan,din3,y1);
prtest(thetan,din3,lab3,y1,it,lval,g,h,std,corrm);

% Residual diagnostics
[ehat,vT,wT,vz1,vvT,vwT] = residual(thetan,din3,y1);
descser(ehat,'residuals of the model for the trend');
uidents(ehat,10,'residuals of the model for the trend');
plotsers(ehat,0,'residuals of the model for the trend');
which yields the following output:
[Figures: standardized plot, A.C.F. (LBQ = 17.48) and P.A.C.F. of the residuals of the model for the trend]
******************** Results from model estimation ********************
 Objective function: 108.2797
 # of iterations: 11
 Information criteria: AIC = 6.5459, SBC = 6.6805
 Parameter    Estimate   Std. Dev.    t-test   Gradient
 FR1(1,1)      -0.7023      0.1614   -4.3508    -0.0002
 FR2(1,1)       0.3744      0.1624    2.3060     0.0001
 V(1,1)        33.5723      8.1484    4.1201     0.0000
************************* Correlation matrix **************************
 FR1(1,1)  1.00
 FR2(1,1) -0.51  1.00
 V(1,1)    0.03 -0.03  1.00
 Condition number = 3.0569
 Reciprocal condition number = 0.3561
***********************************************************************
***************** Descriptive statistics *****************
 --- Statistics of residuals of the model for the trend ---
 Valid observations = 34
 Mean = -0.3677, t test = -0.3673
 Standard deviation = 5.8371
 Skewness = -0.1785
 Excess Kurtosis = -0.9368
 Quartiles = -4.2303, -0.0981, 4.5046
 Minimum value = -11.6458, obs. # 26
 Maximum value = 10.9150, obs. # 8
 Jarque-Bera = 1.4239
 Dickey-Fuller = -3.4332, computed with 5 lags
 Dickey-Fuller = -6.0014, computed with 1 lags
 Outliers list
 Obs #      Value
************************************************************
The code and data required to replicate this case can be found in the directory
\EXAMPLES\NONOBS of the distribution diskette, files belgi.dat and belgi.m.
8 Reference guide
Model formulation
form2THD Where form may be ARMA, STR, TF, SS or GARC. Converts a VARMAX
(ARMA), structural econometric model (STR), transfer function (TF), SS model
(SS) or model with GARCH errors (GARC) to THD format.
COMP2THD Converts a stacked model into a components model in THD format.
NEST2THD Converts a stacked model into a nested model in THD format.
STACKTHD Stacks two models in THD format.
THD2form Where form may be ARMA, STR, TF or SS. Converts a model in THD format to
the corresponding VARMAX, structural econometric, transfer function or SS
formulation.
TOMOD Suppresses the user model flag in a THD model specification.
TOUSER Adds the user model flag to a THD model specification.
Model information
PRTMOD Displays information about a model.
PRTEST Displays the estimation results.
Model estimation
E4PREEST Computes a fast estimate of the parameters for a model in THD form.
LFMOD Computes the exact log-likelihood function for a model in THD form.
LFFAST A faster version of LFMOD.
LFMISS Same as LFMOD, but allowing for missing data.
LFGARCH Same as LFMOD, but allowing for GARCH errors.
GMOD Computes the analytical gradient of LFMOD.
GMISS Computes the analytical gradient of LFMISS.
GGARCH Computes the gradient of LFGARCH.
IMOD Computes the exact information matrix of LFMOD and LFFAST.
IMODG Computes the quasi-maximum likelihood information matrix of LFMOD and
LFFAST.
IMISS Computes the exact information matrix of LFMISS.
IGARCH Computes an analytical approximation to the information matrix of LFGARCH.
Functions for computing derivatives
form_DV Where form may be SS or GARCH. Computes the derivatives of the SS matrices
of a model with respect to the i-th parameter.
form_DVP Where form may be SS or GARC. Computes the derivatives of the SS matrices of
a model in the direction of any vector.
Forecasting, smoothing and simulation
AGGRMOD Disaggregates a sample of low frequency data into smoothed estimates of the
corresponding high frequency values.
FOREMOD Computes forecasts for the endogenous variables of a model in THD form.
FOREMISS Same as FOREMOD, but allowing for missing data.
FOREGARC Computes forecasts for the endogenous variables and conditional variances of a
model with GARCH errors.
FISMOD Computes fixed interval smoothing estimates of the state and observable variables
of a model in THD form.
FISMISS Same as FISMOD, but allowing for missing data.
E4TREND Decomposes a vector of time series into trend, seasonal, cycle and irregular components.
SIMMOD Simulates the endogenous variables of a model in THD form.
SIMGARCH Same as SIMMOD, but allowing for GARCH errors.
Data transformation, model specification and diagnosis
AUGDFT Computes the augmented Dickey-Fuller test for unit roots.
DESCSER Displays the main descriptive statistics for a set of time series.
HISTSERS Displays a standardized histogram for a set of time series.
LAGSER Generates lags and leads for a set of time series.
MIDENTS Computes and displays the multiple autocorrelation and partial autoregression
functions for a set of time series.
PLOTQQS Plots the quantile graphs for a set of time series.
PLOTSERS Displays a plot of centered and standardized time series versus time.
RESIDUAL Computes the residuals of a model.
RMEDSER Displays a scaled plot of sample means versus sample standard deviations for a
set of time series.
TRANSDIF Applies stationarity inducing transformations (Box-Cox and differencing) to a set
of time series.
UIDENTS Displays the univariate simple and partial autocorrelation functions for a set of
time series.
Other functions
E4INIT Initializes the global toolbox options.
E4MIN Computes the unconstrained minimum of a nonlinear function.
SETE4OPT Allows the user to modify the toolbox options.
aggrmod
Purpose
Disaggregates a sample of low frequency data into smoothed estimates of the corresponding high
frequency values.
Synopsis
[zhat, bt] = aggrmod(theta, din, z, per, m1)
Description
Computes the optimal disaggregation of low frequency (say yearly) time series into high frequency (say quarterly or monthly) time series, so that the disaggregates add up to the sample data. The unobserved high frequency values can be computed taking into account not only the low frequency sample information, but also high frequency indicator(s). For example, a monthly industrial production index can be used as an indicator to disaggregate a yearly series of GNP.
The disaggregates are computed using an algorithm known as fixed-interval smoothing. See the
reference on fismiss and fismod. Further details about this method can be found in Anderson and
Moore (1979), De Jong (1989) and Casals, Jerez and Sotoca (2000).
The input arguments of aggrmod are a model in THD format (theta-din) relating all the variables
in the high frequency sampling interval, the data matrix (z), the number of observations that add up to
an aggregate (per) and the number of endogenous variables that are observed as aggregates (m1).
The input argument z should be structured in the following way:
1) The first m columns correspond to the endogenous variables and the rest to the exogenous variables.
2) The first m1 columns correspond to endogenous variables observed in the low frequency sampling interval. The rest of the columns, up to m, correspond to high frequency endogenous variables.
3) The parameter m1 is optional. If it is not specified, aggrmod assumes that all the endogenous variables are observed with low frequency, i.e. m1 = m.
4) All the columns corresponding to variables observed with low frequency should be coded with NaN where no observation is available. For example, the column corresponding to a quarterly variable observed once a year would have the following structure:
[NaN NaN NaN y1 NaN NaN NaN y2 . . . NaN NaN NaN yn]'
where the values yi (i=1,...,n) correspond to the yearly observations.
5) All the exogenous variables should be high frequency data.
The output arguments of aggrmod are the optimal disaggregates of the first m1 endogenous variables
(zhat) as well as their covariances (bt).
References
Anderson, B. D. O. and J. B. Moore (1979). Optimal Filtering. Englewood Cliffs, N.J.: Prentice
Hall.
Casals, J., M. Jerez and S. Sotoca (2000). "Exact Smoothing for Stationary and Nonstationary Time Series", International Journal of Forecasting, 16, 59-69.
De Jong, P. (1989), “Smoothing and Interpolation with the State-Space Model”, Journal of the
American Statistical Association, 84, 408, 1085-1088.
See Also
fismod, fismiss
arma2thd
Purpose
Converts a VARMAX model to THD format.
Synopsis
[theta,din,lab] = arma2thd([FR1 ... FRp],[FS1 ... FSps], ... [AR1 ... ARq],[AS1 ... ASqs],v,s,[G0 ... Gn],r)
Description
The function arma2thd obtains the representation in THD format of the VARMAX model:
FR(B) FS(B^s) y_t = G(B) u_t + AR(B) AS(B^s) ε_t

where B is the backshift operator, such that for any sequence x_t: B^k x_t = x_{t-k}; s denotes the length of the seasonal period and:

y_t is an (m×1) vector of endogenous variables,
u_t is an (r×1) vector of exogenous variables,
ε_t is an (m×1) vector of errors,

FR(B) = I + FR1 B + ... + FRp B^p
FS(B^s) = I + FS1 B^s + ... + FSps B^(s·ps)
G(B) = G0 + G1 B + ... + Gn B^n
AR(B) = I + AR1 B + ... + ARq B^q
AS(B^s) = I + AS1 B^s + ... + ASqs B^(s·qs)

FR1, ..., FRp, AR1, ..., ARq, FS1, ..., FSps and AS1, ..., ASqs are (m×m) matrices, and G0, ..., Gn are (m×r) matrices.
The input arguments are:
1) The parameter matrices of the regular autoregressive and moving average factors, [FR1...FRp]
and [AR1...ARq].
2) The parameter matrices of the seasonal autoregressive and moving average factors,
[FS1...FSps] and [AS1...ASqs].
3) The covariance matrix of ε_t, v. If this matrix is defined as a vector, this implies the constraint of independence between the noises. In order not to impose this constraint, it is necessary to define at least the lower triangle of the matrix. This matrix cannot contain NaN. To impose independence between two errors, the user can set the corresponding covariance to zero and, afterwards, impose a fixed-parameter constraint on this value, see Chapter 5.
4) The scalar s, which indicates the length of the seasonal period (e.g. for nonseasonal data, s=1; for quarterly data, s=4; for monthly data, s=12).
5) The parameter matrix [G0 ... Gn] and the number of exogenous variables, r, need to be
included only when the model contains exogenous variables.
If any of the matrices (except v) is null, it should be specified using an empty matrix, []. If any of the
elements in these matrices (except in v) are fixed values equal to zero, they should be specified with
NaN.
The output arguments are the vectors and matrices that define a model in THD format.
Example
Consider the VARMA model:
( [1 0; 0 1] - [.3 0; .4 0] B + [.5 0; 0 0] B^2 ) [y1_t; y2_t] =
    [.9; .7] + ( [1 0; 0 1] - [0 0; 0 .8] B^12 ) [ε1_t; ε2_t]

V[ε1_t; ε2_t] = [1 .3; .3 1]
The following code defines the parameter matrices, converts them to THD format and displays the
model structure:
FR1 = [-.3 NaN; -.4 NaN];
FR2 = [ .5 NaN; NaN NaN];
AR1 = [NaN NaN; NaN -.8];
V = [1 .3; .3 1];
c = [.9; .7];
[theta, din, lab] = arma2thd([FR1 FR2],[],[],AR1,V,12,c,1);
prtmod(theta,din,lab);
Note that the constant term is included by means of an exogenous variable.
See Also
ss2thd, str2thd, garc2thd, tf2thd, comp2thd, prtmod
augdft
Purpose
Computes the augmented Dickey-Fuller test for autoregressive unit roots.
Synopsis
[adft] = augdft(y, p, trend);
Description
The Dickey and Fuller (1981) statistic tests the null hypothesis of an autoregressive unit root versus
the alternative of stationarity. Further elaborations on this idea allow for autocorrelation and a
deterministic time trend, see Hamilton (1994).
The function augdft computes a version of the augmented Dickey-Fuller statistic. The input
arguments are:
1) y, a matrix with n observations of m variables.
2) p, the number of lags (plus one) in the unit root regression. The value of p should be equal to or greater than 1.
3) trend, an optional parameter to allow (trend=1) for a deterministic time trend.
If the output argument adft is specified, the function does not display the results.
When invoked without the argument trend or with trend=0 this function computes, for each of the
m variables in the matrix y, the OLS estimates of the parameters in the unit root regression:
y_t = ζ1 ∇y_{t-1} + ... + ζ_{p-1} ∇y_{t-p+1} + α + ρ y_{t-1} + e_t

and the standard t and F statistics for the null hypotheses H0: ρ = 1, H0: α = 0 and H0: ρ = 1, α = 0. If trend=1, augdft computes the OLS estimates for the unit root regression:

y_t = ζ1 ∇y_{t-1} + ... + ζ_{p-1} ∇y_{t-p+1} + α + ρ y_{t-1} + δ t + e_t
as well as the same statistics as in the previous case and an additional t statistic for the null hypothesis H0: δ = 0. In both cases the number of lags (p) should be large enough to avoid autocorrelation of the residuals of the regression.
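As an illustration, the p = 1, trend = 0 case (no lagged-difference terms) reduces to the OLS regression y_t = α + ρ y_{t-1} + e_t. The following from-scratch sketch in Python shows the kind of regression augdft runs; it is an illustration under stated assumptions, not the toolbox code.

```python
import numpy as np

def unit_root_regression(y):
    """OLS fit of y_t = alpha + rho*y_{t-1} + e_t and the t-statistic
    for H0: rho = 1 (the p = 1, trend = 0 case)."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])  # [alpha, rho] regressors
    beta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    resid = y[1:] - X @ beta
    s2 = resid @ resid / (len(resid) - 2)               # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)                   # OLS covariance matrix
    rho = beta[1]
    t_rho = (rho - 1.0) / np.sqrt(cov[1, 1])            # t-test of rho = 1
    return rho, t_rho

rng = np.random.default_rng(0)
y = np.cumsum(rng.standard_normal(200))  # a simulated random walk
rho, t_rho = unit_root_regression(y)
```

For a random walk, the estimated rho should lie close to one and the t-statistic should be compared with the nonstandard percentiles tabulated below, not with the usual Student-t values.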
The following Tables summarize the 95% and 90% percentiles of the above mentioned statistics, see
Hamilton (1994, Chapter 17 and Appendix B).
Table 1: 95% percentiles of the t and F statistics.

            trend=0                                    trend=1
            True model:           True model:          True model:
            (α = 0 and ρ = 1)     (α ≠ 0 and ρ = 1)    (α any value, δ = 0 and ρ = 1)
Size of y   t-stat.   F-stat.                          t-stat.   F-stat.
    25      -3.00      5.18      Both statistics       -3.60      7.24
    50      -2.93      4.86      should be compared    -3.50      6.73
   100      -2.89      4.71      with standard t       -3.45      6.49
   250      -2.88      4.63      and F critical        -3.43      6.34
   500      -2.87      4.61      values                -3.42      6.30
    ∞       -2.86      4.59                            -3.41      6.25
Table 2: 90% percentiles of the t and F statistics.
                     trend=0                          trend=1
                     True model: α = 0, ρ = 1         True model: α any value, δ = 0, ρ = 1
Size of y            t-statistic    F-statistic       t-statistic    F-statistic
25                   -2.63          4.12              -3.24          5.91
50                   -2.60          3.94              -3.18          5.61
100                  -2.58          3.86              -3.15          5.47
250                  -2.57          3.81              -3.13          5.39
500                  -2.57          3.79              -3.13          5.36
∞                    -2.57          3.78              -3.12          5.34
When trend=0 and the true model has α ≠ 0 and ρ = 1, both statistics should be compared with
standard t and F critical values.
Example
The following code simulates 200 samples of a random walk process and calls augdft:
[theta,din,lab] = arma2thd([-1],[],[],[],[.1],1);
y = simmod(theta,din,200);
augdft(y,1);
The corresponding output will be similar to:
Augmented Dickey-Fuller results, p = 1
rho = 0.9862, t-test (rho=1) = -1.3388
alpha = -0.0794, t-test (alpha=0) = -1.8595
F test (rho=1,alpha=0) = 3.6708, d.f. = 2, 197
Note that none of the null hypotheses is rejected at the 95% confidence level. The call:
augdft(y,1,1);
will yield a result similar to:
Augmented Dickey-Fuller results, p = 1
rho = 0.9533, t-test (rho=1) = -2.1353
alpha = -0.0540, t-test (alpha=0) = -1.2006
delta = -0.0014, t-test (delta=0) = -1.7031
F test (rho=1,delta=0) = 5.7738, d.f. = 2, 196
which (correctly) does not detect a deterministic time trend.
References
Dickey, D.A. and W.A. Fuller (1981). "Likelihood Ratio Statistics for Autoregressive Time Series
with a Unit Root". Econometrica, 49, 4, 1057-1072.
Hamilton, J.D. (1994). Time Series Analysis. Princeton, N.J: Princeton University Press.
comp2thd
Purpose
Converts a stacked model into a components model in THD format.
Synopsis
[theta, din, label] = comp2thd(t, d, l);
Description
The input arguments of comp2thd are a stacked model in THD format (t, d, l). The function
returns the composite model defined in THD format.
Example
The model:
y*_t = .8 + .3 y*_{t-1} - .4 y*_{t-2} + a_t
y_t = y*_t + v_t
V[a_t] = .1
V[v_t] = .2
can be expressed in THD format with the following code:
[tha, da, laba] = arma2thd([-.3 .4], [], [], [], [.1], 1, [.8], 1);
[thc, dc, labc] = arma2thd([], [], [], [], [.2], 1);
[ts, ds, ls] = stackthd(tha, da, thc, dc, laba, labc);
[theta, din, lab] = comp2thd(ts, ds, ls);
lab = str2mat(laba,'V*(1,1)');
prtmod(theta,din,lab);
See Also
arma2thd, ss2thd, stackthd, str2thd, tf2thd, garc2thd, prtmod
descser
Purpose
Displays the main descriptive statistics for a set of time series.
Synopsis
[stats, aval1, avect1] = descser(y, lab)
Description
Computes and displays a set of descriptive statistics for each of the series in the matrix y. If y
contains more than one series, it also displays the correlation coefficients and the corresponding
principal components information.
The input arguments are: y, a matrix with n observations of m variables and lab, a matrix with m
rows containing descriptive names for each series. The parameter lab is optional.
The output argument stats is a matrix which contains the statistics computed for each series in this
order: number of valid observations, mean, standard deviation, skewness, excess kurtosis, the 25%,
50% and 75% percentiles, maximum value, position of the maximum value in the sample, minimum
value, position of the minimum value in the sample, Jarque-Bera statistic, the augmented Dickey-
Fuller statistic computed both with a number of lags equal to the square root of the number of
observations and with one lag, and a list of outliers.
This function accepts missing observations marked with NaN, eliminating these observations before
computing the statistics. In this case some statistics (e.g. the correlation matrix or the augmented
Dickey-Fuller statistics) will not be computed.
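The NaN-elimination step can be sketched in plain Python (nan_mean_std is a hypothetical helper mirroring the idea, not descser's code):

```python
import math

def nan_mean_std(y):
    """Mean and sample standard deviation after dropping missing observations
    (NaN), mirroring how descser cleans each series before computing its
    statistics. Illustrative sketch, not the toolbox implementation."""
    clean = [v for v in y if not math.isnan(v)]   # keep valid observations only
    n = len(clean)
    mean = sum(clean) / n
    var = sum((v - mean) ** 2 for v in clean) / (n - 1)
    return n, mean, math.sqrt(var)

print(nan_mean_std([1.0, float('nan'), 2.0, 3.0, float('nan')]))   # -> (3, 2.0, 1.0)
```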
Example
The following code generates two normal variables and computes their descriptive statistics:
y = randn(100, 2);
descser(y, str2mat('First series','Second series'));
See Also
plotsers, rmedser, plotqqs
e4init
Purpose
Initializes the global toolbox options.
Synopsis
e4init
Description
This command creates and initializes the internal variable E4OPTION. This variable is a 1×51 vector
which stores the values of the global variables that control the behaviour of E4. It also displays a listing
of the default options and initializes the matrices of error and warning messages.
Bear in mind that the toolbox does not work properly if this function has not been run.
See Also
sete4opt
e4min
Purpose
Computes the unconstrained minimum of a nonlinear function.
Synopsis
[pnew,iter,fnew,gnew,hessin] = e4min(func,p,dfunc,P1,P2,P3,P4,P5)
Description
The function e4min implements a numerical optimization procedure based on the techniques
described by Dennis and Schnabel (1983). It includes two main optimization algorithms, BFGS
(Broyden-Fletcher-Goldfarb-Shanno) and Newton-Raphson.
The operation of e4min is as follows. Starting from an initial estimate of the optimal value, p, the
algorithm iterates on the objective function func, using the BFGS (default) or Newton-Raphson
search direction and computing the optimum step length. The algorithm stops when either of two
criteria is satisfied: a) the relative changes in the values of the objective function are small, or b) the
gradient vector is small. The default tolerances for convergence are set by e4init and can be
modified using sete4opt.
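The two stopping criteria can be illustrated with a minimal scalar Newton-Raphson loop in Python (a sketch of the general idea only; it does not reproduce e4min's BFGS machinery or step-length search):

```python
def minimize_newton(f, df, d2f, p0, tol=1.0e-5, maxiter=75):
    """Scalar Newton-Raphson with the two stopping rules described above:
    a) small relative change in the objective, b) small gradient."""
    p, fp = p0, f(p0)
    for it in range(1, maxiter + 1):
        g = df(p)
        if abs(g) < tol:                      # criterion b): small gradient
            break
        p_new = p - g / d2f(p)                # Newton step
        f_new = f(p_new)
        rel = abs(f_new - fp) / max(abs(fp), 1.0)
        p, fp = p_new, f_new
        if rel < tol:                         # criterion a): small relative change
            break
    return p, fp, it

# minimize f(p) = (p - 3)^2 + 1, which has its minimum at p = 3
p, fp, it = minimize_newton(lambda p: (p - 3) ** 2 + 1,
                            lambda p: 2 * (p - 3),
                            lambda p: 2.0, p0=0.0)
print(p, fp, it)
```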
The input parameters are:
1) func is a string containing the name of the objective function (e.g. 'lfvmod' or 'lfgarch').
The input arguments of this function should be the vector p and the optional parameters P1-P5.
2) p is a vector containing the initial value of the variables in the optimization problem. When
e4min is used to estimate a model in THD format, p should be equal to theta.
3) dfunc is a string containing the name of the function that computes the gradient of func (e.g.
'gmod' or 'ggarch'). The input arguments of this function should be the vector p and the
optional parameters P1-P5. If an analytical gradient is not required, dfunc should be an empty
string: ''. In this case, e4min uses a numerical approximation to the gradient.
4) P1-P5 are optional parameters used to feed additional information to func and dfunc. When
e4min is used for estimation of a model in THD format, P1 should be the name of the variable
that contains the din specification and P2 should be the name of the data matrix.
After the iterative process ends, e4min returns the following values: pnew, the value of the
unconstrained parameters; iter, the number of iterations; fnew, the value of the objective
function at pnew; gnew, the analytical or numerical gradient of the objective function at pnew,
depending on the contents of dfunc; and finally hessin, the Hessian of the objective
function at pnew.
The user should also take into account that:
1) It is possible to impose fixed-value constraints on any parameter by augmenting p with a second
column. The values in this column should be either zero, to indicate that the parameter in the first
column is free, or any nonzero value, when the parameter is constrained to its present value.
2) When estimating the parameters of an econometric model, the user can optimize the objective
function with respect to the error covariances (the default) or their Cholesky factors by selecting this
alternative with the sete4opt function. In the univariate case, the Cholesky factor of the variance
is the standard deviation.
3) The behaviour of e4min can be altered by using sete4opt. The specific e4min-related options
are:
Option Description Possible values
'algorithm' Optimization algorithm 'bfgs'†, 'newton'
'step' Maximum step length during optimization 0.1‡
'tolerance' Stop criteria tolerance 1.0e-5‡
'maxiter' Maximum number of iterations 75‡
'verbose' Display output at each iteration 'yes'†, 'no'
† Default option.
‡ This is the default value. Other reasonable values are admissible.
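The fixed-value constraint convention of point 1 can be sketched in Python (split_free_fixed is a hypothetical helper illustrating the zero/nonzero flag convention, not e4min's internal code):

```python
def split_free_fixed(theta):
    """theta is a list of [value, flag] pairs: flag == 0 marks a free parameter,
    any nonzero flag keeps the parameter fixed at its current value. Returns the
    indices of the parameters the optimizer would actually move, and of those
    held fixed. Hypothetical helper, not e4min's internals."""
    free = [i for i, (_, flag) in enumerate(theta) if flag == 0]
    fixed = [i for i, (_, flag) in enumerate(theta) if flag != 0]
    return free, fixed

theta = [[-0.7, 0], [0.3, 0], [-1.0, 1], [0.1, 0]]   # third parameter pinned
print(split_free_fixed(theta))   # -> ([0, 1, 3], [2])
```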
Example
Consider the model:
z_t = (1 - .7B)(1 - .5B^12) a_t ;  V[a_t] = .1
The following code obtains the corresponding THD format and simulates the data:
[theta, din] = arma2thd([], [], [-.7], [-.5], .1, 12);
z=simmod(theta,din,150);
z=z(51:150,1);
In real applications, initial values of the parameters may be far from the optimum. Hence, it may be
convenient to obtain good starting values with e4preest and then use e4min to minimize lfmod
with the analytical gradient:
tnew = e4preest(theta, din, z);
[thopt, iter, fnew, gnew] = e4min('lfmod', tnew, 'gmod', din, z);
The input arguments fed to e4min are the most conservative ones. In most cases, the following
syntax will provide the same results, with a much faster optimization process:
[thopt, iter, fnew, gnew] = e4min('lffast', tnew, '', din, z);
See Also
lffast, lfgarch, lfmod, ggarch, gmod, sete4opt
References
Dennis, J. E. and R. B. Schnabel (1983). Numerical Methods for Unconstrained Optimization and
Nonlinear Equations. Englewood Cliffs, N. J.: Prentice-Hall.
e4preest
Purpose
Computes a fast estimate of the parameters for a model in THD format.
Synopsis
theta2 = e4preest(theta, din, z)
Description
This function provides fast and consistent estimates of the parameters in theta. These estimates are
adequate starting values for likelihood optimization with e4min.
The operation of e4preest is as follows. It first obtains a subspace representation of the system,
where the future of the output is expressed as a linear function of its past and of the information in the
input. The estimates are then computed as the solution of a nonlinear least squares problem. See
Casals (1997), Van Overschee and De Moor (1996) and Viberg (1995).
The input arguments are: theta and din, which define the model structure in THD format and z, a
matrix containing the values of the endogenous and exogenous variables. The values in the first
column of theta are irrelevant to the operation of e4preest, unless they are parameters
constrained to fixed values, see Chapter 5. The estimates are returned in theta2.
The user should also take into account that:
1) If the sample is too short in comparison with the dimension of the system, the function will display
the message: “The sample is too short to use e4preest”. This means that the procedure does not
have enough degrees of freedom to estimate the model in subspace form. The degrees of freedom
for any model can be computed using the formula:
df = n - 2(d + 1)(m + r)
where n is the number of observations of the sample, d is the model dynamics (as it appears in the
standard output of prtmod), m is the number of endogenous variables and r is the number of
exogenous variables.
2) The behaviour of e4preest can be altered by sete4opt. Options that are adequate for likelihood
optimization with e4min will, in general, also be adequate for e4preest.
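The degrees-of-freedom rule in point 1 is easy to check programmatically; this hypothetical Python helper simply evaluates df = n - 2(d + 1)(m + r):

```python
def e4preest_df(n, d, m, r):
    """Degrees of freedom available to the subspace procedure:
    n observations, model dynamics d (as reported by prtmod),
    m endogenous and r exogenous variables."""
    return n - 2 * (d + 1) * (m + r)

# a univariate model (m = 1, r = 0) with dynamics d = 12 needs n > 26
print(e4preest_df(150, 12, 1, 0))   # -> 124
print(e4preest_df(20, 12, 1, 0))    # negative: sample too short for e4preest
```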
Example
Consider the model:
z_t = (1 - .7B)(1 - .5B^12) a_t ;  V[a_t] = .1
The following code obtains the corresponding THD format, simulates the data, computes preliminary
estimates with e4preest and, finally, computes maximum likelihood estimates:
[theta, din, lab] = arma2thd([], [], [-.7], [-.5], .1, 12);
z=simmod(theta,din,200); z=z(51:200,1);
theta=e4preest(theta, din, z)
[thopt] = e4min('lffast', theta,'', din, z)
See Also
sete4opt, e4min
References
Casals, J. (1997). Métodos de Subespacios en Econometría. PhD Thesis. Madrid: Universidad
Complutense.
Van Overschee, P. and B. De Moor (1996). Subspace Identification for Linear Systems: Theory,
Implementation, Applications. Dordrecht: Kluwer Academic Publishers.
Viberg, M. (1995). “Subspace-based methods for the identification of linear time-invariant systems”,
Automatica, 31, 12, 1835-1851.
e4trend
Purpose
Decomposes a vector of time series into the trend, seasonal, cycle and irregular components implied
by an econometric model.
Synopsis
[trend,season,cycle,irreg,thetat,dint,ixmodes,xhat] = ...
    e4trend(theta,din,y,toinnov)
Description
The function e4trend decomposes a vector of time series, represented by an econometric model, into
several structural components corresponding to: a) unit roots (trend component), b) seasonal roots
(seasonal component), c) stationary (nonseasonal) roots (cyclic component) and d) residuals
(irregular component). These components are additive, so the command:
trend+season+cycle+irreg-y
should return a null value.
The input arguments are a model in THD format (theta-din), a data matrix (y) and, optionally, a
logical flag (toinnov). If toinnov=1, the SS model for the data is obtained by imposing a steady-state
innovations structure on the model, which yields exact estimates of the components; if
toinnov=0 (the default), the original structure of the SS model is preserved. The number of rows of y
should be equal to the number of observations; its first columns correspond to the endogenous
variables and the rest to the exogenous variables. The output arguments are:
1) trend, smoothed estimates of the trend components; it has one row per observation and one
column per independent unit root.
2) season, smoothed estimates of the seasonal components; it has one row per observation and one
column per independent seasonal component.
3) cycle, smoothed estimates of the cyclic components; it has one row per observation and one
column per independent stationary component.
4) irreg, smoothed estimates of the irregular components; it has one row per observation and one
column per endogenous variable.
5) thetat, dint, the theta-din specification corresponding to the block-diagonal SS model.
6) ixmodes, a vector of indexes identifying the different states. The value '1' corresponds to trend
states, '2' corresponds to seasonal states and '3' to cyclic states. It has the same number of rows
as the transition matrix and one column.
7) xhat, a matrix of smoothed estimates of the states.
When a model does not include one of these components the function returns a null matrix.
Internally, this function proceeds as follows. First, it calls thd2ss to obtain the matrices of the SS
equivalent representation corresponding to theta-din. Second, it transforms the SS model to a
block-diagonal equivalent structure, according to the eigenvalues of the transition matrix. Third, the
block-diagonal model is fed to fismod (or fismiss, if the sample contains missing values) to
obtain estimates of the different states. Fourth, the estimates of the states are assigned to the structural
components taking into account the frequencies at which they show a peak of spectral power. Then, all
the states with peaks at the zero frequency are assigned to the trend, the states with peaks at seasonal
frequencies are assigned to the seasonal component and the rest of the states are assigned to the cycle.
Finally, the components are computed by combining the states with the corresponding coefficients in
the observation matrix and returned as output arguments.
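The frequency-based assignment of states to components can be sketched in Python (an illustrative classification of transition-matrix eigenvalues, with an assumed tolerance; not the toolbox's actual rule):

```python
import cmath

def classify_modes(eigenvalues, s, tol=0.05):
    """Assign each transition-matrix eigenvalue to a structural component,
    following the rule described above: modulus near 1 at zero frequency ->
    trend (code 1); modulus near 1 at a seasonal frequency 2*pi*k/s ->
    seasonal (code 2); anything else -> cycle (code 3). s is the seasonal
    period; tol is an assumed tolerance."""
    seasonal_freqs = [2 * cmath.pi * k / s for k in range(1, s // 2 + 1)]
    codes = []
    for lam in eigenvalues:
        mod, freq = abs(lam), abs(cmath.phase(lam))
        if abs(mod - 1) < tol and freq < tol:
            codes.append(1)                       # trend: peak at frequency zero
        elif abs(mod - 1) < tol and any(abs(freq - w) < tol for w in seasonal_freqs):
            codes.append(2)                       # seasonal: peak at 2*pi*k/s
        else:
            codes.append(3)                       # cycle: stationary root
    return codes

# quarterly example (s = 4): unit root, seasonal roots at +/-i and -1, cycle at .5
print(classify_modes([1.0, 1j, -1j, -1.0, 0.5], s=4))   # -> [1, 2, 2, 2, 3]
```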
Examples
The following code simulates 200 samples of the nonstationary process:
(1 - .5B)(1 - B)(1 - B^4) y_t = (1 - .8B)(1 - .7B^4) a_t ;  a_t ~ iid N(0, .1)
and computes its structural components.
e4init
[theta,din,lab]=arma2thd([-1.5 .5],[-1],[-.8],[-.7],[.1],4);
y=simmod(theta,din,200);
[trend,season,cycle,irreg,thetat,dint,ixmodes]=e4trend(theta,din,y);
% Displays the block-diagonal SS model
[Phi,Gam,E,H,D,C,Q,S,R]=thd2ss(thetat,dint);
[ixmodes Phi]
[H]
% Plots the components
plotsers([trend y],1,str2mat('Trend','Data'));
plotsers(cycle,-1,'Cycle');
plotsers(season,-1,'Seasonal component');
plotsers(irreg,-1,'Irregular component');
[Figure: standardized plots of Trend vs. Data, and of the Cycle component]
The output corresponding to the command [ixmodes Phi] is:
ans =
    1.0000    1.0035    0.9919         0         0         0         0
    1.0000    0.0000    0.9965         0         0         0         0
    3.0000         0         0    0.5000         0         0         0
    2.0000         0         0         0    0.4087    1.3033         0
    2.0000         0         0         0   -0.8954   -0.4087         0
    2.0000         0         0         0         0         0   -1.0000
so the first and second states correspond to the trend, the third state corresponds to the cycle and the
remaining states correspond to the seasonal component. The components are obtained by combining
the smoothed estimates of these states with the coefficients in the observation matrix, displayed by the
command [H]:
ans =
    0.8880    0.4365    2.4631    0.4109    0.1963    0.1841
The resulting components will vary in different runs of this code, but they should be similar to:
[Figures: standardized plots of the seasonal and irregular components; trace of the covariance of the smoothed states]
When the model for the data is a VARMAX or transfer function, or when the toinnov flag is
enabled, the components obtained with e4trend have the important property of converging to exact
values, i.e., to uncorrelated estimates with null variance. To visualize this feature with the data
previously simulated, run the code:
[xhat,phat,ehat]=fismod(thetat,dint,y);
trz=[];
for i=1:6:1200
    trz=[trz;trace(phat(i:i+5,:))];
end
figure; whitebg('w'); hold on
plot(trz,'k-')
title('Trace of covariance of smoothed states');
xlabel('Time')
hold off
which yields an output similar to:
Among other implications, this means that the values of the components at the end of the sample are
not revised when the sample increases. This is a very desirable property, for example, when the
decomposition is applied to obtain seasonally adjusted data.
See Also
fismod, fismiss
fismiss, fismod
Purpose
Compute fixed interval smoothing estimates of the state and observable variables of a model in THD
form.
Synopsis
[zhat, Pz, xhat, Px] = fismiss(theta, din, z)
[xhat, P, e] = fismod(theta, din, z)
Description
These functions compute fixed interval smoothed estimates of the variables in a SS model; see
Anderson and Moore (1979) and De Jong (1989). Their main econometric applications are:
a) "cleaning" a sample contaminated by observation errors, b) estimating missing values in the
sample and c) computing unobservable components in a model.
The function fismiss allows missing observations in the endogenous variables data, coded by NaN,
while fismod requires a complete sample.
In both cases, the input arguments are a model in THD format (theta-din) and a data matrix (z).
The number of rows of z should be equal to the number of observations. The first columns of z
correspond to the endogenous variables and the rest to the exogenous variables.
The output arguments of fismiss are: zhat, a matrix that contains the smoothed estimates of the
observable variables; Pz, a matrix containing the sequence of covariances of zhat; xhat, the
expectation of the state vector conditional on all the sample; and Px, a matrix containing the sequence
of covariances of xhat. Unless the sample of the endogenous variables is affected by observation
errors or contains missing values, the values in zhat should coincide with those in z.
The output arguments of fismod are: xhat, the expectation of the state vector conditional on all the
sample; P, the covariance matrix of this expectation; and e, a matrix of smoothed errors in the
observation equation, computed as ê_t = z_t − H x̂_t − D u_t.
The details of the algorithm implemented in fismiss and fismod can be found in Casals, Jerez and
Sotoca (2000).
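The general filter-then-smooth recursion can be illustrated with a scalar local-level model in Python (a minimal Rauch-Tung-Striebel sketch; the toolbox implements the exact algorithm of Casals, Jerez and Sotoca (2000), which differs in detail):

```python
def fis_local_level(y, q, r):
    """Fixed interval smoothing for the scalar local-level model
        x_t = x_{t-1} + w_t,  V[w] = q ;   y_t = x_t + v_t,  V[v] = r.
    Kalman filter forward, then a Rauch-Tung-Striebel pass backward.
    A sketch of the general idea, not the toolbox implementation."""
    n = len(y)
    xf, Pf = [], []                   # filtered means and variances
    x, P = y[0], r                    # simple start at the first datum
    for t in range(n):
        K = P / (P + r)               # Kalman gain
        x = x + K * (y[t] - x)        # measurement update
        P = (1 - K) * P
        xf.append(x)
        Pf.append(P)
        P = P + q                     # time update: predicted variance for t+1
    xs = xf[:]                        # backward smoothing pass
    for t in range(n - 2, -1, -1):
        J = Pf[t] / (Pf[t] + q)       # smoother gain = Pf[t] / Pp[t+1]
        xs[t] = xf[t] + J * (xs[t + 1] - xf[t])
    return xs

# constant observations are reproduced exactly by the smoother
print(fis_local_level([3.0, 3.0, 3.0, 3.0], q=0.1, r=0.2))
```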
Example
Consider the stochastic process:
(1 - .4B) y_t = (1 - .7B)(1 - .8B^4) a_t ;  V[a_t] = .01
The following code defines the model, generates a sample with five missing observations and
interpolates them using fismiss.
[theta, din] = arma2thd([-.4], [], [-.7], [-.8], [.01], 4);
y = simmod(theta, din, 100);
y1 = y;
y1(50)=NaN; y1(52)=NaN; y1(55)=NaN; y1(56)=NaN; y1(58)=NaN;
[zhat, Pz, xhat, Px] = fismiss(theta, din, y1);
[y(48:60,1) y1(48:60,1) zhat(48:60,1)]
References
Anderson, B. D. O. and J. B. Moore (1979). Optimal Filtering. Englewood Cliffs, N.J.: Prentice
Hall.
Casals, J., M. Jerez and S. Sotoca (2000). "Exact Smoothing for Stationary and Nonstationary Time
Series", International Journal of Forecasting, 16, 59-69.
De Jong, P. (1989), “Smoothing and Interpolation with the State-Space Model”, Journal of the
American Statistical Association, 84, 408, 1085-1088.
foregarc
Purpose
Computes forecasts for the endogenous variables and conditional variances of a model with GARCH
errors.
Synopsis
[yf, Bf, vf] = foregarc(theta, din, z, k, u)
Description
The use of this function is exactly the same as that of foremod. The only difference is that it returns
an additional output argument, vf, which is the expectation of the conditional covariance matrices of
the errors.
Example
Consider the following ARMA(2,1) model with GARCH(1,1) errors, in conventional notation:
y_t = [(1 - .8B)/(1 - .7B + .3B^2)] ε_t ;  ε_t ~ iid(0, .01) ;  ε_t | Ω_{t-1} ~ (0, h²_t)
h²_t = .002 + .1 ε²_{t-1} + .7 h²_{t-1}
which, in the ARMA representation supported by E4, becomes:
y_t = [(1 - .8B)/(1 - .7B + .3B^2)] ε_t , such that:  ε²_t = .01 + [(1 - .7B)/(1 - .8B)] v_t
The following code defines the model structure, simulates 200 observations and computes 10 step-
ahead forecasts of both y_t and the conditional variance of the error:
% Model for the mean
[t1, d1, lab1] = arma2thd([-.7 .3], [], [-.8], [], [.01], 1);
% Model for the conditional variance
[t2, d2, lab2] = arma2thd([-.8], [], [-.7], [], [.01], 1);
% Full model
[theta, din, lab] = garc2thd(t1, d1, t2, d2, lab1, lab2);
% Simulates the data and computes the forecasts
y=simgarch(theta, din, 300); y=y(101:300,1);
[yf, Bf, vf] = foregarc(theta, din, y, 10);
[yf Bf vf]
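The conditional-variance recursion that foregarc works with can be sketched in Python for the GARCH(1,1) parameters of this example (garch_variances is a hypothetical helper, not toolbox code):

```python
def garch_variances(eps, omega, a, b):
    """Conditional variance recursion h2_t = omega + a*eps2_{t-1} + b*h2_{t-1},
    started at the unconditional variance omega / (1 - a - b). Illustrates the
    quantity whose expectation foregarc forecasts; not the toolbox code."""
    h2 = [omega / (1 - a - b)]            # unconditional variance as start value
    for e in eps[:-1]:
        h2.append(omega + a * e * e + b * h2[-1])
    return h2

# parameters of the example: omega = .002, a = .1, b = .7 -> V[eps_t] = .01
h2 = garch_variances([0.1, 0.2, 0.0], omega=.002, a=.1, b=.7)
print(h2)   # approximately [0.01, 0.01, 0.013]
```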
See Also
foremiss, foremod, garc2thd
foremiss, foremod
Purpose
Compute forecasts for the endogenous variables of a model in THD format.
Synopsis
[yf, Bf] = foremiss(theta, din, z, k, u)
[yf, Bf] = foremod(theta, din, z, k, u)
Description
The input arguments to both functions are: theta and din, which define the model in THD format;
z, a data matrix of the endogenous and exogenous variables; k, the forecast horizon; and u, the values
of the exogenous variables over the forecast horizon.
The operation of these functions is as follows. They receive a model in THD format, convert it to the
corresponding SS formulation and then propagate the forecasting equations of the Kalman filter.
The function foremiss allows for missing data in z, marked by NaN, whereas foremod requires a
complete sample. Forecasts for models with GARCH errors should be computed using foregarc.
The output arguments are forecasts of the endogenous variables (yf) and their corresponding
covariances (Bf).
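For intuition, the propagation of the forecasting equations can be sketched for the simplest case, an AR(1) process, in Python (illustrative only; foremod handles the general SS model):

```python
def ar1_forecasts(phi, sigma2, y_last, k):
    """k-step-ahead forecasts and forecast variances for an AR(1) process
        y_t = phi * y_{t-1} + a_t,  V[a_t] = sigma2,
    obtained by propagating the prediction equations forward. A sketch of
    what the Kalman prediction step does for a general SS model."""
    yf, Bf = [], []
    mean, var = y_last, 0.0
    for _ in range(k):
        mean = phi * mean                 # E[y_{t+j} | data]
        var = phi * phi * var + sigma2    # V[y_{t+j} | data]
        yf.append(mean)
        Bf.append(var)
    return yf, Bf

yf, Bf = ar1_forecasts(0.5, 1.0, 2.0, 3)
print(yf)   # forecasts decay geometrically: [1.0, 0.5, 0.25]
print(Bf)   # variances grow toward sigma2/(1 - phi^2): [1.0, 1.25, 1.3125]
```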
Example
Consider the following univariate model:
y_t = (1 - .6B)(1 - .4B^4) ε_t ;  V[ε_t] = .1
The following code simulates 200 observations of y_t and computes five forecasts:
[theta, din, lab] = arma2thd([], [], [-.6], [-.4], [.1], 4);
y = simmod(theta, din, 250); y = y(51:250,:);
[yf, Bf] = foremod(theta, din, y, 5);
[yf Bf]
See Also
comp2thd, foregarc
garc2thd
Purpose
Converts a model with GARCH errors to THD format.
Synopsis
[theta, din, lab] = garc2thd(t1, d1, t2, d2, lab1, lab2)
Description
Obtains the THD format for a model (VARMAX or transfer function) with GARCH errors.
The input arguments are:
1) t1-d1, which is the THD format associated with the model for the mean.
2) t2-d2, which is the VARMAX model for the variance in THD format.
3) lab1 and lab2, which are optional labels for the parameters in t1 and t2, respectively.
The output arguments are the vectors and matrices that define a model in THD format.
Example
Consider the following ARMA(2,1) model with GARCH(1,1) errors, in conventional notation:
y_t = [(1 - .8B)/(1 - .7B + .3B^2)] ε_t ;  ε_t ~ iid(0, .01) ;  ε_t | Ω_{t-1} ~ (0, h²_t)
h²_t = .002 + .1 ε²_{t-1} + .7 h²_{t-1}
which, in the ARMA representation supported by E4, becomes:
y_t = [(1 - .8B)/(1 - .7B + .3B^2)] ε_t , such that:  ε²_t = .01 + [(1 - .7B)/(1 - .8B)] v_t
The following code defines and displays the model structure:
% Model for the mean
[t1, d1, lab1] = arma2thd([-.7 .3], [], [-.8], [], [.01], 1);
% Model for the conditional variance
[t2, d2, lab2] = arma2thd([-.8], [], [-.7], [], [.01], 1);
% Full model
[theta, din, lab] = garc2thd(t1, d1, t2, d2, lab1, lab2);
prtmod(theta, din, lab);
Assume now the same model for the mean and an IGARCH(1,1) model for the conditional variance:
h²_t = .002 + .3 ε²_{t-1} + .7 h²_{t-1}
which in ARMA form can be written as:
ε²_t = .01 + μ_t , with:  (1 − B) μ_t = .01 + (1 − .7B) v_t
The following commands define the IGARCH structure by constraining the autoregressive parameter
to unity:
% Model for the mean
[t1, d1, lab1] = arma2thd([-.7 .3], [], [-.8], [], [.01], 1);
% Model for the conditional variance
[t2, d2, lab2] = arma2thd([-1], [], [-.7], [], [.01], 1);
% Full model
[theta, din, lab] = garc2thd(t1, d1, t2, d2, lab1, lab2);
theta1 = [theta zeros(size(theta))];
theta1(5,2)=1;
prtmod(theta1, din, lab);
See Also
arma2thd, ss2thd, str2thd, tf2thd, prtmod
ggarch, gmiss, gmod
Purpose
Compute the analytical gradient of the log-likelihood function.
Synopsis
g = ggarch(theta, din, z)
g = gmiss(theta, din, z)
g = gmod(theta, din, z)
Description
The input arguments are a THD format specification (theta-din) and a data matrix (z), which
should be structured as in the calls to lfgarch, lfmiss and lfmod (or lffast). The output
argument is g, a vector containing the gradient of the log-likelihood function at theta, see Terceiro
(1990). If theta includes a second column with constraint flags, then the gradient is computed with
respect to the free parameters.
To optimize a likelihood function with analytical derivatives, the name of the adequate function should
be passed to e4min as a parameter. Hence, ggarch should be used when optimizing lfgarch. In the
same way, gmiss and gmod should be used when optimizing lfmiss and lfmod/lffast,
respectively. The analytical gradient is also used in several inference procedures, see Engle (1984).
Example
Consider the model:
z_t = (1 - .7B)(1 - .5B^12) a_t ;  V[a_t] = .1
The following code obtains the corresponding THD format, simulates 200 samples, computes the
maximum likelihood estimates using numerical derivatives and checks the analytical gradient:
[theta, din, lab] = arma2thd([], [], [-.7], [-.5], .1, 12);
z=simmod(theta,din,250); z=z(51:250,1);
[thopt, it, lval, g, h] = e4min('lffast', theta, '', din, z);
prtest(thopt, din, lab, z, it, lval, g, h);
g1 = gmod(thopt, din, z)
The analytical derivatives can be used in the optimization process with the following syntax:
[thopt1, it1, lval1, g1, h1] = e4min('lffast', theta, 'gmod', din, z);
See Also
e4min, lfgarch, lfmiss, lfmod, lffast
References
Engle, R. (1984). “Wald, Likelihood and Lagrange Multiplier Tests in Econometrics”, in Z. Griliches
and H.D. Intriligator (editors). Handbook of Econometrics, vol. II. Amsterdam: North-Holland.
Terceiro, J. (1990). Estimation of Dynamic Econometric Models with Errors in Variables. Berlin:
Springer-Verlag.
histsers
Purpose
Displays a standardized histogram for a set of time series.
Synopsis
freqs = histsers(y, mode, tit)
Description
The input arguments are: y, a matrix with n observations of m variables; mode, where mode=0
selects relative frequencies (the default) and mode=1 absolute frequencies; and tit, a matrix of
characters whose rows contain an optional descriptive title for each series.
The output argument, freqs, is a 2×m matrix which contains the class marks and the frequency of
each of the intervals presented in the graph.
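The kind of class marks and frequencies returned in freqs can be sketched in Python (hist_freqs is a hypothetical helper; the binning conventions here are assumptions, not histsers's exact ones):

```python
def hist_freqs(y, nbins, relative=True):
    """Class marks and (relative or absolute) frequencies over equal-width
    bins, the kind of summary histsers returns in freqs. Illustrative sketch
    with assumed bin conventions."""
    lo, hi = min(y), max(y)
    width = (hi - lo) / nbins or 1.0              # degenerate data: unit width
    counts = [0] * nbins
    for v in y:
        k = min(int((v - lo) / width), nbins - 1) # clamp the maximum into the last bin
        counts[k] += 1
    marks = [lo + (k + 0.5) * width for k in range(nbins)]
    if relative:
        n = len(y)
        return marks, [c / n for c in counts]
    return marks, counts

marks, freqs = hist_freqs([0, 1, 1, 2, 2, 2, 3, 3], nbins=4, relative=False)
print(marks, freqs)
```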
Example
y=randn(100,2);
freqs = histsers(y, 0, ['first series ';'second series'])
freqs = histsers(y, 1, ['first series ';'second series'])
See Also
descser, midents, plotsers, plotqqs, rmedser, uidents
igarch
Purpose
Computes an analytical approximation to the information matrix of LFGARCH.
Synopsis
[dts, corrm, varm, Im] = igarch(theta, din, z)
Description
Computes the Watson and Engle (1983) approximation to the information matrix of a model with
GARCH errors. In general, the analytical standard errors are smaller than the corresponding
numerical approximations, thus allowing for more powerful statistical inference.
The use of this function is exactly the same as that of imod. The only difference is that it does not
accept the optional input argument aprox.
Example
Consider the following model with GARCH(1,1) errors, in conventional notation:
y_t = ε_t ;  ε_t ~ iid(0, .1) ;  ε_t | Ω_{t-1} ~ (0, h²_t) ;  h²_t = .01 + .15 ε²_{t-1} + .75 h²_{t-1}
which, in the ARMA representation supported by E4, becomes:
y_t = ε_t , such that:  ε²_t = .1 + [(1 - .75B)/(1 - .9B)] v_t
The following code defines the model structure, simulates 400 observations, computes the maximum
likelihood estimates of the parameters and prints the results:
% Model for the mean
[t1, d1, lab1] = arma2thd([], [], [], [], [.1], 1);
% Model for the conditional variance
[t2, d2, lab2] = arma2thd([-.9], [], [-.75], [], [.1], 1);
% Full model
[theta, din, lab] = garc2thd(t1, d1, t2, d2, lab1, lab2);
y=simgarch(theta, din, 500); y=y(101:500,1);
[thopt, it, lval, g, h] = e4min('lfgarch', theta, '', din, y);
prtest(thopt, din, lab, y, it, lval, g, h);
With these commands, the function prtest computes an approximation to the standard errors of the
estimates as sqrt(diag(inv(h))). To compute and display the analytical standard errors, replace
the last command by:
[std, corrm, varm, Im] = igarch(theta, din, y);
prtest(thopt, din, lab, y, it, lval, g, h, std, corrm);
See Also
lfgarch, imod, imiss
References
Watson, M. W. and R. F. Engle (1983). “Alternative Algorithms for the Estimation of Dynamic
Factor, MIMIC and Varying Coefficient Regression Models”, Journal of Econometrics, 23, 3,
385-400.
imiss, imod
Purpose
Compute the exact information matrix.
Synopsis
[std, corrm, varm, Im] = imiss(theta, din, z, aprox)
[std, corrm, varm, Im] = imod(theta, din, z, aprox)
Description
These functions receive as input a model estimated by maximum likelihood in THD format,
convert it to the corresponding SS formulation and then compute the exact information matrix of the
estimates, see Terceiro (1990). In general, the exact standard errors are smaller than the
corresponding numerical approximations, thus allowing for more powerful statistical
inference.
The input arguments are:
1) A THD format specification (theta-din). If theta includes a second column with constraint
flags, see Chapter 5, the information matrix will only be calculated with respect to the free
parameters.
2) A data matrix (z) whose number of rows is the number of observations. The first columns of z
should correspond to the endogenous variables, while the rest correspond to the exogenous.
3) The parameter aprox is a logical indicator; if it takes the value 1, the function computes the
approximation of Watson and Engle (1983), which reduces the computational load. This argument
is optional.
The output arguments are: std, a vector containing the standard deviations of the estimates; corrm, a
matrix containing the correlation matrix of the estimates; varm, a matrix containing the covariance
matrix of the estimates; and Im, which is the information matrix.
The function imiss allows missing data in the endogenous variables. These observations should be
marked with NaN.
&KDS� � 3DJ� ��
When the model is not locally identified, the information matrix is rank deficient, which affects the
calculations of the covariance matrix, see Terceiro (1990). In this case, imod and imiss print a
warning message.
Example
Consider the model:
z_t = (1 - .7B)(1 - .5B^12) a_t ;  V[a_t] = .1
The following code obtains the corresponding THD format, simulates 200 samples, computes the
maximum likelihood estimates using numerical derivatives and displays the results using approximate
standard errors:
[theta, din, lab] = arma2thd([], [], [-.7], [-.5], .1, 12);
z=simmod(theta,din,250); z=z(51:250,1);
[thopt, it, lval, g, h] = e4min('lffast', theta, '', din, z);
prtest(thopt, din, lab, z, it, lval, g, h);
With these commands, prtest computes an approximation to the standard errors of the estimates as
sqrt(diag(inv(h))). To compute and display the analytical standard errors, replace the last
command by:
[std, corrm, varm, Im] = imod(theta, din, z);
prtest(thopt, din, lab, z, it, lval, g, h, std, corrm);
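The numerical fallback used by prtest can also be sketched outside MATLAB. The following Python/NumPy fragment (an illustration, not part of the toolbox) computes approximate standard errors as the square roots of the diagonal of the inverted Hessian, i.e. the sqrt(diag(inv(h))) recipe quoted above:

```python
import numpy as np

def approx_std_errors(h):
    """Approximate standard errors of ML estimates: square roots of the
    diagonal of the inverse of the Hessian of the objective function."""
    h = np.asarray(h, dtype=float)
    return np.sqrt(np.diag(np.linalg.inv(h)))

# Toy diagonal Hessian: inv(diag(4, 25)) = diag(.25, .04)
print(approx_std_errors(np.diag([4.0, 25.0])))  # → [0.5 0.2]
```

When the model is well identified the Hessian is positive definite and this inverse exists; a rank-deficient information matrix (see the warning above) makes this computation break down in the same way.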
See Also
igarch, imodg
References
Terceiro, J. (1990). Estimation of Dynamic Econometric Models with Errors in Variables. Berlin:
Springer-Verlag.
Watson, M. W. and R. F. Engle (1983). “Alternative Algorithms for the Estimation of Dynamic
Factor, MIMIC and Varying Coefficient Regression Models”, Journal of Econometrics, 23, 3,
385-400.
imodg
Purpose
Computes the quasi-maximum likelihood information matrix of lfmod and lffast.
Synopsis
[std, stdg, corrm, corrmg, varm, varmg, Im] = ...
    imodg(theta, din, z, aprox)
Description
If the model is misspecified or its errors are non-normal, optimization of the log-likelihood function
still provides consistent estimates, but the standard errors computed by imod are no longer valid. In
this case, we will speak of quasi-maximum likelihood estimation.
The function imodg computes an information matrix robust to these specification errors, see Ljung
and Caines (1979) and White (1982).
The use of this function is exactly the same as that of imod. The only difference is that it has an
additional input argument: aprox, a logical switch. If aprox=0, the function only returns the
analytical values, which should coincide with those of imod. For any other value of aprox, the
function computes the approximation of Watson and Engle (1983).
The output arguments are the exact maximum likelihood values (std, corrm, varm and Im) and the
quasi-maximum likelihood values (stdg, corrmg, varmg).
Example
Consider the model:
z_t = (1 - .7B)(1 - .5B^12) a_t,  V[a_t] = .1
The following code obtains the corresponding THD format, simulates 200 samples, computes the
maximum likelihood estimates using numerical derivatives and displays the results using approximate,
analytical and robust standard errors:
[theta, din, lab] = arma2thd([], [], [-.7], [-.5], .1, 12);
z = simmod(theta, din, 250);
z = z(51:250, 1);
[thopt, it, lval, g, h] = e4min('lffast', theta, '', din, z);
prtest(thopt, din, lab, z, it, lval, g, h);
[std, stdg, corrm, corrmg, varm, varmg, Im] = imodg(theta, din, z);
prtest(thopt, din, lab, z, it, lval, g, h, std, corrm);
prtest(thopt, din, lab, z, it, lval, g, h, stdg, corrmg);
See Also
imod, imiss, igarch
References
Watson, M.W. and R.F. Engle (1983). “Alternative Algorithms for the Estimation of Dynamic
Factor, MIMIC and Varying Coefficient Regression Models”, Journal of Econometrics, 23, 3,
385-400.
Ljung, L. and P.E. Caines (1979). “Asymptotic Normality of Prediction Error Estimators for
Approximate System Models”, Stochastics, 3, 29-46.
White, H. (1982). “Maximum Likelihood Estimation of Misspecified Models”, Econometrica, 50, 1,
1-25.
lagser
Purpose
Generates lags and leads for a set of time series.
Synopsis
[yl, ys] = lagser(y, ll)
Description
The input arguments are y, an n×k data matrix, and ll, a 1×l vector containing the lags (positive
numbers) and leads (negative numbers) to be applied to all the series. This function returns yl, which
contains the lagged and/or led variables, and optionally ys, an nl×k data matrix (nl = n -
maxlag + maxlead) which contains the original variables, resized to be conformable with yl.
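The alignment convention can be sketched in a few lines of Python/NumPy (an illustration of the idea, not toolbox code): positive entries of ll shift the series back, negative entries shift it forward, and the common sample keeps nl = n - maxlag + maxlead rows:

```python
import numpy as np

def lagser(y, lags):
    """Build lagged (k > 0) and led (k < 0) copies of a series, plus the
    original trimmed to the common sample."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    maxlag = max(max(lags), 0)
    maxlead = min(min(lags), 0)              # leads are non-positive
    rows = np.arange(maxlag, n + maxlead)    # indices of the common sample
    yl = np.column_stack([y[rows - k] for k in lags])
    ys = y[rows]
    return yl, ys

yl, ys = lagser(np.arange(1.0, 11.0), [1, -1])
# ys keeps observations 2..9; yl holds the first lag (1..8) and first lead (3..10)
```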
See Also
transdif
lffast, lfmiss, lfmod
Purpose
Compute the exact log-likelihood function for a model in THD form.
Synopsis
[l, innov, ssvect] = lffast(theta, din, z)
[l, innov, ssvect] = lfmiss(theta, din, z)
[l, innov, ssvect] = lfmod(theta, din, z)
Description
The operation of these functions is as follows: they receive as input argument a model in THD format,
obtain the corresponding SS formulation and then compute the value of the exact log-likelihood
function.
The function lfmod computes the log-likelihood function for any of the supported formulations
except models with GARCH errors, which require lfgarch. When the endogenous variables sample
includes missing data, lfmiss should be used instead of lfmod. The missing values should be
marked with NaN. The algorithms implemented in these functions are described in Terceiro (1990).
The log-likelihood can also be computed using lffast, which is faster than lfmod, see Casals,
Sotoca and Jerez (1999).
The input arguments are a THD format specification (theta-din) and a data matrix (z). The
number of rows of z is the number of observations. The first columns of z correspond to the
endogenous variables and the rest to the exogenous.
The output arguments are:
1) l, which is a scalar that contains the value of the log-likelihood function in theta,
2) innov, which is the N×m matrix of one-step-ahead forecast errors:
z̃_t = z_t - H x̂_{t|t-1} - D u_t
3) and ssvect, which is the N×n matrix of estimated state values. Its t-th row contains the filtered
estimate of the state vector, of size n, at time t, conditional on the information available up to t-1:
x̂_{t+1|t} = Φ x̂_{t|t-1} + Γ u_t + K_t z̃_t
where K_t is the Kalman filter gain.
In many applications, the name of one of these functions is passed to e4min as an input argument
to compute maximum likelihood estimates of the parameters in theta. The values of the log-
likelihood are also used for other purposes, such as hypothesis testing by means of likelihood-ratio
statistics.
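The recursion can be illustrated with a minimal Python sketch for a scalar AR(1) with no inputs and no observation noise (so H = 1 and the innovation variance equals the state prediction variance). This is only a toy version of the computation; the toolbox implements the general multivariate case with exact initial conditions:

```python
import numpy as np

def kalman_loglik(z, phi, q):
    """Gaussian log-likelihood of z_t = phi*z_{t-1} + w_t, Var(w_t) = q,
    accumulated from the Kalman filter innovations."""
    x = 0.0
    P = q / (1.0 - phi**2)        # stationary initial state variance
    ll = 0.0
    for zt in z:
        v = zt - x                # innovation: z_t - x_{t|t-1}
        F = P                     # innovation variance (H = 1, no obs. noise)
        ll += -0.5 * (np.log(2 * np.pi) + np.log(F) + v**2 / F)
        K = phi * P / F           # Kalman gain
        x = phi * x + K * v      # x_{t+1|t}
        P = phi**2 * P + q - K * F * K
    return ll

ll = kalman_loglik([1.0, 0.5], 0.5, 1.0)
```

Summing the Gaussian innovation densities in this way is exactly what makes the prediction-error decomposition of the likelihood work.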
Example
Consider the model:
z_t = (1 - .7B)(1 - .5B^12) a_t,  V[a_t] = .1
First, we need to obtain the corresponding THD format and simulate a sample:
[theta, din] = arma2thd([], [], [-.7], [-.5], .1, 12);
z = simmod(theta, din, 250);
z = z(51:250);
The following code evaluates the log-likelihood using the true values of the parameters:
l = lffast(theta, din, z)
l = lfmod(theta, din, z)
l = lfmiss(theta, din, z)
Note that the three functions return the same values. Now, the following calls to e4min compute the
maximum likelihood estimates using the faster and slower options:
[thopt, iter, lnew, gnew] = e4min('lffast', theta, '', din, z);
[thopt, iter, lnew, gnew] = e4min('lfmod', theta, 'gmod', din, z);
Finally, we generate two missing values, compute the log-likelihood and obtain the maximum
likelihood estimates:
z(30) = NaN;
z(90) = NaN;
[thopt, iter, lnew, gnew] = e4min('lfmiss', theta, '', din, z);
See Also
lfgarch, sete4opt
References
Casals, J. and S. Sotoca (1997). “Exact Initial Conditions for Maximum Likelihood Estimation of
State Space Models with Stochastic Inputs”, Economics Letters, 57, 261-267.
Casals, J., S. Sotoca and M. Jerez (1999). “A Fast and Stable Method to Compute the Likelihood of
Time Invariant State-Space Models”, Economics Letters, 65, 329-337.
Terceiro, J. (1990). Estimation of Dynamic Econometric Models with Errors in Variables. Berlin:
Springer-Verlag.
lfgarch
Purpose
Computes the log-likelihood function of a model with GARCH errors.
Synopsis
[l, innov, hominnov, ssvect] = lfgarch(theta, din, z)
Description
The use of lfgarch is identical to that of lfmod. The only difference is that there is an additional
output argument, hominnov, which stores the sequence of standardized residuals.
Example
Consider the following model with GARCH(1,1) errors, in conventional notation:
y_t = ε_t,  ε_t ~ (0, h²_t),  h²_t = .01 + .15 ε²_{t-1} + .75 h²_{t-1}
so that the unconditional variance of ε_t is .1. In the ARMA representation supported by E4, the
model becomes y_t = ε_t, such that:
ε²_t = .1 + ((1 - .75B)/(1 - .9B)) v_t
The following code defines the model structure, simulates 400 observations, evaluates the log-
likelihood using the true values of the parameters and, finally, computes the maximum likelihood
estimates:
% Model for the mean
[t1, d1, lab1] = arma2thd([], [], [], [], [.1], 1);
% Model for the conditional variance
[t2, d2, lab2] = arma2thd([-.9], [], [-.75], [], [.1], 1);
% Full model
[theta, din, lab] = garc2thd(t1, d1, t2, d2, lab1, lab2);
y = simgarch(theta, din, 500);
y = y(101:500, 1);
l = lfgarch(theta, din, y)
[thopt, it, lval, g, h] = e4min('lfgarch', theta, '', din, y);
prtest(thopt, din, lab, y, it, lval, g, h);
Note that the estimates for the parameters in the conventional GARCH representation can be easily
computed with the following commands:
omega = thopt(1)*(1 + thopt(2))
alpha = thopt(3) - thopt(2)
beta = -thopt(3)
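The back-transformation is simple arithmetic on the stored parameters: the AR coefficient of the squared errors is alpha+beta, the MA coefficient is -beta, and the unconditional variance is omega/(1-alpha-beta). A Python sketch of the same conversion (illustrative only; the inputs keep the sign convention used by arma2thd, here -.9 and -.75):

```python
def garch_from_arma(sigma2, ar, ma):
    """Recover conventional GARCH(1,1) parameters from the ARMA
    parametrization: sigma2 is the unconditional variance, ar and ma the
    stored AR and MA coefficients of the squared-error model."""
    omega = sigma2 * (1 + ar)    # omega = sigma2 * (1 - alpha - beta)
    alpha = ma - ar              # alpha = (alpha + beta) - beta
    beta = -ma
    return omega, alpha, beta

omega, alpha, beta = garch_from_arma(0.1, -0.9, -0.75)
# recovers omega = .01, alpha = .15, beta = .75 up to rounding
```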
See Also
lffast, lfmod, lfmiss, sete4opt
midents
Purpose
Computes and displays the multiple autocorrelation and partial autoregression functions for a set of
time series.
Synopsis
[macf, mparf, Qus] = midents(y, lag, tit)
Description
The input arguments are: a) y, an n×m matrix which contains m series of n observations each; b) lag,
the maximum lag for computing the values of the autocorrelation functions, with default value n/4;
and c) tit, a matrix of characters which contains a descriptive title for each series. The last two
parameters are optional.
The output arguments, macf and mparf, contain the simple autocorrelation and partial
autoregression matrices. The argument Qus contains the matrix of Ljung-Box Q statistics computed
using the first lag values of macf.
The function also prints out the values of these functions and their representation in '+' '.' '-' format,
where '+' indicates a significant positive value, '-' a significant negative value and '.' a
non-significant value. The significance of these coefficients is tested using the asymptotic standard
deviation 2/√n. The values in macf are also displayed in cross-correlation function form.
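For a single series, the statistics printed by midents can be sketched as follows (a Python/NumPy illustration, not toolbox code): the lag-k autocorrelation, the ±2/√n significance band, and the Ljung-Box Q statistic built from the first lag autocorrelations:

```python
import numpy as np

def acf(y, nlags):
    """Sample autocorrelations r_1, ..., r_nlags of one series."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    c0 = np.dot(y, y)
    return np.array([np.dot(y[k:], y[:-k]) / c0 for k in range(1, nlags + 1)])

def ljung_box(y, nlags):
    """Ljung-Box Q = n(n+2) * sum_k r_k^2 / (n - k)."""
    n = len(y)
    r = acf(y, nlags)
    return n * (n + 2) * np.sum(r**2 / (n - np.arange(1, nlags + 1)))

y = np.random.default_rng(0).standard_normal(100)
band = 2 / np.sqrt(len(y))   # |r_k| > band would be flagged '+' or '-'
Q = ljung_box(y, 10)
```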
Example
The following code generates a 100×2 matrix of gaussian white noise and displays ten lags of the
multiple autocorrelation and partial autoregression functions:
y=randn(100,2);midents(y,10);
See Also
descser, plotsers, plotqqs, rmedser, uidents
nest2thd
Purpose
Converts a stacked model into a nested model in THD format.
Synopsis
[theta, din, label] = nest2thd(t, d, nestwat, l)
Description
The input arguments are: 1) t-d-l, the THD formulation of the stacked model; and 2) nestwat, a
logical indicator: if it takes value 1, the function nests the models in inputs; if it takes value 0, it
nests them in errors.
Example
Given the transfer function:
y_t = ((.3 + .6B)/(1 - .5B)) u_1t + ((1 - .8B)/(1 - .6B)) ε_t,  σ²_ε = 1
where u_1t is such that (1 - .7B) u_1t = a_t, with σ²_a = .3. The endogeneization of the exogenous
variable requires the following code:
% Defines the transfer function
w1 = [.3 .6]; d1 = [-.5];
fr = [-.6]; ar = [-.8];
v = [1.0];
[t1, d1, l1] = tf2thd(fr, [], ar, [], v, 1, [w1], [d1]);
% Defines the input model
[t2, d2, l2] = arma2thd(-.7, [], [], [], .3, 1);
% Stacks the models and translates the stacked model to the final nested formulation
[theta, din, lab] = stackthd(t1, d1, t2, d2, l1, l2);
[theta, din, lab] = nest2thd(theta, din, 1, lab);
prtmod(theta, din, lab);
See Also
arma2thd, comp2thd, garc2thd, prtmod, ss2thd, stackthd, str2thd, tf2thd
plotqqs
Purpose
Plots the quantile graphs for a set of time series.
Synopsis
[nq, yq] = plotqqs(y, lab)
Description
The function plotqqs displays the quantile graph under normality for a set of time series. Along
with the histogram, this is a rough tool for assessing the normality of a series. In the graph, the
theoretical quantiles under normality (a straight line with unit slope) are displayed along with the
empirical quantiles obtained from the standardized series. The OLS regression of empirical over
theoretical quantiles is also shown.
The input arguments are: y, an n×m matrix which contains m series of n observations each, and lab,
a matrix of characters whose rows contain an optional descriptive title for each series.
This function returns the theoretical quantiles in nq and the empirical quantiles in yq.
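The points of the quantile graph can be reproduced with the Python standard library and NumPy (an illustrative sketch, not the toolbox routine): sort the standardized series and pair it with the normal quantiles of the plotting positions (i - .5)/n:

```python
import numpy as np
from statistics import NormalDist

def qq_points(y):
    """Theoretical normal quantiles (nq) versus empirical quantiles (yq)
    of the standardized series."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    yq = np.sort((y - y.mean()) / y.std())        # standardized, ordered
    probs = (np.arange(1, n + 1) - 0.5) / n       # plotting positions
    nq = np.array([NormalDist().inv_cdf(p) for p in probs])
    return nq, yq

# For gaussian data the pairs (nq, yq) fall close to the 45-degree line
nq, yq = qq_points(np.random.default_rng(0).standard_normal(100))
```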
Example
The following code generates 100 samples of gaussian white noise and displays the quantile plots:
y = randn(100, 1);
plotqqs(y);
See Also
descser, histsers, plotsers, rmedser
plotsers
Purpose
Displays a plot of centered and standardized time series versus time.
Synopsis
ystd = plotsers(y, mode, lab)
Description
The input arguments of plotsers are:
1) y, an n×m matrix which contains m series of n observations each,
2) mode, an optional parameter that selects the type of display. If mode = 0, each series is displayed
in a separate graph (default value); if mode = 1, all the series (up to seven) are represented in a
single graph; finally, if mode = -1, each series is displayed in a separate graph, but all of them share
the same axes.
3) lab, a matrix of characters whose rows contain an optional descriptive title for each series.
This function returns the centered and standardized series in ystd.
The resulting plot includes bands at ±2. If a stationary and homoscedastic series is gaussian,
approximately 95% of its values should lie between these bands.
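The standardization and the band check are easy to reproduce (a Python/NumPy sketch, not toolbox code); for gaussian data roughly 95% of the standardized values fall inside the ±2 bands:

```python
import numpy as np

def standardize(y):
    """Center and scale each column to zero mean and unit variance."""
    y = np.asarray(y, dtype=float)
    return (y - y.mean(axis=0)) / y.std(axis=0, ddof=1)

ystd = standardize(np.random.default_rng(0).standard_normal((1000, 1)))
inside = np.mean(np.abs(ystd) < 2)   # fraction of values within the bands
```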
Example
The following code generates and plots 100 samples of gaussian white noise:
plotsers(randn(100,1));
See Also
histsers, rmedser, plotqqs, uidents, midents
prtest
Purpose
Displays the estimation results.
Synopsis
prtest(thopt, din, lab, y, it, lval, g, h, std, corrm, t)
Description
The input parameters are provided by e4min and, optionally, by imod, imiss, igarch or imodg.
They are the following:
1) thopt-din, the THD format specification of the model. The vector thopt is an output argument
of e4min.
2) lab, label matrix that documents the parameters in thopt.
3) y, an n×m matrix which contains the m series of n observations each that have been used for
model estimation.
4) it, number of iterations. This parameter is an output argument of e4min.
5) lval, value of the log-likelihood function in thopt. This parameter is an output argument of
e4min.
6) g, gradient of the objective function in thopt. This parameter is an output argument of e4min.
7) h, Hessian of the objective function in thopt. This parameter is an output argument of e4min.
8) std, vector of analytical standard deviations in thopt. This parameter is optional, and should be
computed using imod, imiss, igarch or imodg.
9) corrm, matrix of analytical correlations between the estimates in thopt. This parameter is an
output argument of imod, imiss, igarch or imodg.
10) t: Elapsed computing time. This value should be computed by the user using the MATLAB
functions tic and toc.
The parameters std, corrm and t are optional. If std and corrm are not specified, or are specified
as empty matrices, [], standard errors and correlations between estimates are computed using
numerical approximations.
This function returns no output arguments.
See Also
e4min, imod, imiss, igarch, imodg, prtmod
prtmod
Purpose
Displays information about a model in THD format.
Synopsis
prtmod(theta, din, lab)
Description
The input argument is a model in THD format (theta-din) and, optionally, a label matrix (lab) to
document the parameters in theta. This function returns no output arguments.
The function prtmod is used mainly to check whether the definition of a model is correct.
See Also
prtest
residual
Purpose
Computes the residuals and smoothed error estimates of a model, as well as the corresponding
covariance matrices.
Synopsis
[z1, vT, wT, vz1, vvT, vwT] = residual(theta, din, z, stand)
Description
This function is used mainly for model validation.
The input arguments are a THD format specification (theta-din) and a data matrix (z). The
optional parameter stand selects between standardized (stand=1) or ordinary values (stand=0 or
argument omitted).
The output arguments are the following:
1) z1, a matrix of residuals (standardized if stand=1) computed as:
z̃_{t|t-1} = z_t - H x̂_{t|t-1} - D u_t
which can be interpreted as one-step-ahead forecast errors.
2) vT, a matrix of smoothed residuals (standardized if stand=1) computed as:
z̃_{t|N} = z_t - H x̂_{t|N} - D u_t
This argument is returned empty in the case of GARCH models.
3) wT, a matrix of smoothed state errors (standardized if stand=1). This argument is returned
empty in the case of GARCH models.
4) vz1, a matrix which stacks the covariance matrices of z1.
5) vvT, a matrix which stacks the covariance matrices of vT. This argument is returned empty in the
case of GARCH models.
6) vwT, a matrix which stacks the covariance matrices of wT. This argument is returned empty in the
case of GARCH models.
Most empirical analyses use the innovations z1 for model validation, by testing whether they could
be a sample realization of a zero-mean homoscedastic white noise process. In structural time series
models some authors, see Harvey and Koopman (1992), propose the use of the smoothed errors, also
known in this literature as “auxiliary residuals”, to detect outliers and structural changes in the
unobservable components.
Example
Consider the model:
z_t = (1 - .7B)(1 - .5B^12) a_t,  V[a_t] = .1
The following code obtains the corresponding THD format, simulates 200 samples, computes the
maximum likelihood estimates and computes the residuals:
[theta, din, lab] = arma2thd([], [], [-.7], [-.5], .1, 12);
z = simmod(theta, din, 250);
z = z(51:250, 1);
[thopt, it, lval, g, h] = e4min('lffast', theta, '', din, z);
z1 = residual(thopt, din, z)
The resulting series z1 can then be analyzed using other functions, such as descser or uidents, to
validate the model.
References
Harvey, A.C. and S.J. Koopman (1992). “Diagnostic Checking of Unobserved-Components Time
Series Models”, Journal of Business and Economic Statistics, 10, 4, 377-389.
See Also
descser, uidents, midents, lffast, lfmod, lfmiss, fismod, fismiss, sete4opt
rmedser
Purpose
Displays a scaled plot of sample means versus sample standard deviations for a set of time series.
Synopsis
[med, std] = rmedser(y, len, lab)
Description
Computes and displays a standardized XY plot of sample means (on the X axis) versus sample
standard deviations (on the Y axis) for a set of time series.
The configuration of this plot helps to select an adequate value for the λ parameter of the Box-Cox
transformation. For example, a linear relationship between the mean and the standard deviation with
positive slope indicates that the series requires a logarithmic transformation (λ=0). On the other hand,
a random scattering of the data points indicates that the series does not require a transformation
(λ=1). A nonlinear relationship indicates that the series requires a transformation with an
intermediate value of λ.
The input arguments are:
1) y, matrix whose columns correspond to the series to be represented. All of them should have the
same number of observations.
2) len, number of observations to be used in computing sample means and standard deviations. For
seasonal series, an adequate choice is any integer multiple of the seasonal period.
3) lab, matrix of characters whose rows contain an optional descriptive title for each series.
The parameters len and lab are optional.
The output arguments are med, a matrix of sample means; and std, a matrix of sample standard
deviations.
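The points of the plot are just group statistics over consecutive blocks of len observations. A Python/NumPy sketch of that computation for a single series (illustrative only, not toolbox code):

```python
import numpy as np

def mean_std_groups(y, length):
    """Sample mean and standard deviation of consecutive groups of
    `length` observations of a single series."""
    y = np.asarray(y, dtype=float)
    g = len(y) // length                    # number of complete groups
    blocks = y[:g * length].reshape(g, length)
    return blocks.mean(axis=1), blocks.std(axis=1, ddof=1)

# Three constant blocks: group means 1, 2, 3 and zero within-group deviation
med, std = mean_std_groups(np.repeat([1.0, 2.0, 3.0], 4), 4)
```

Plotting med against std for real data reproduces the diagnostic scatter described above.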
Example
The following code generates and plots 100 samples of lognormal white noise:
y = exp(randn(100, 1));
rmedser(y, 10, 'log-normal sample');
Note that the sample standard deviation increases linearly with the sample mean. The same plot
computed for the log-transformed series should not show any clear relationship.
rmedser(log(y), 10, 'log-transformed data');
See Also
transdif, histsers, plotsers, uidents, midents, plotqqs
sete4opt
Purpose
Allows the user to modify the toolbox options.
Synopsis
opt = sete4opt(o1,v1, o2,v2, o3,v3, o4,v4, o5,v5, o6,v6, ...
    o7,v7, o8,v8, o9,v9, o10,v10)
Description
The sete4opt function manages the toolbox options by modifying the internal vector E4OPTION. It
allows three different calls:
1) sete4opt, without any argument, restores the default options.
2) sete4opt('show') shows current options. If the function is called with this argument, no other
argument should be included.
3) sete4opt(option, value, ...) is the most usual call, where:
6 The argument option stands for the name of the option to be modified, and value stands for
the new choice.
6 option must be a character string, enclosed by quotes. It is enough to indicate the first three
letters.
6 value may be a character string, enclosed by quotes, or a numeric value. If it is a character
string, it is enough to indicate its first three letters.
6 A single call may contain several option-value pairs, up to a maximum of ten.
The different options and values are summarized in the following table.

Option        Description                                        Possible values
-- Options that control the estimation process --
'filter'      Filter used in the evaluation of the               'kalman'†, 'chandrasekhar'
              likelihood function
'scale'       Scales matrices when computing their               'no'†, 'yes'
              Cholesky decomposition during filtering
'econd'       Algorithm for computing the initial value          'iu', 'au', 'ml', 'zero', 'auto'†
              of the state vector
'vcond'       Algorithm for computing the initial state          'lyapunov', 'zero', 'idejong'†
              vector covariance matrix
'var'         Selects between estimation of the covariance       'variance'†, 'factor'
              matrix or estimation of its Cholesky factor
-- Options that control the behaviour of e4min --
'algorithm'   Optimization algorithm                             'bfgs'†, 'newton'
'step'        Maximum step length during optimization            0.1‡
'tolerance'   Stopping criterion tolerance                       1.0e-5‡
'maxiter'     Maximum number of iterations                       75‡
'verbose'     Display output at each iteration                   'yes'†, 'no'

† Default option.
‡ Default value; other reasonable values are admissible.
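The three-letter abbreviation rule can be illustrated with a small Python sketch (match_option is a hypothetical helper written for this illustration, not a toolbox function): an option name is resolved by case-insensitive comparison of its first three letters against the known option names:

```python
def match_option(name, options):
    """Resolve an option by the first three letters of its name
    (case-insensitive); raise if the abbreviation is not unique."""
    key = name.lower()[:3]
    hits = [o for o in options if o.lower().startswith(key)]
    if len(hits) != 1:
        raise ValueError(f"ambiguous or unknown option: {name!r}")
    return hits[0]

opts = ['filter', 'scale', 'econd', 'vcond', 'var', 'algorithm',
        'step', 'tolerance', 'maxiter', 'verbose']
print(match_option('tol', opts))   # → tolerance
```

The same rule applies to character-string values: only their first three letters are compared.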
See Also
e4init
simgarch, simmod
Purpose
Simulate the endogenous variables of a model in THD format.
Synopsis
y = simgarch(theta, din, N, u)
y = simmod(theta, din, N, u)
Description
The input arguments of these functions are a model in THD format (theta-din), the number of
observations to be generated (N) and the exogenous variable data matrix (u). If u=[], the model
has no exogenous variables. The output argument is y, an N×m matrix which contains the
realization of the endogenous variables.
The function simgarch is used to simulate models with GARCH errors. The rest of the formulations
supported by E4 can be simulated by simmod.
These functions operate as follows: the model received as an input argument is converted to the
equivalent SS representation. Using this formulation and a white noise realization obtained with the
MATLAB function randn, a realization of the endogenous variables is computed. As a general
practice, it is advisable to discard the first observations of the sample.
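The simulate-then-discard pattern can be sketched in Python/NumPy for a scalar AR(1) (an illustration of the idea, not toolbox code): iterate the state equation on a randn-style noise sequence and drop the burn-in observations so the retained sample is unaffected by the arbitrary initial state:

```python
import numpy as np

def simulate_ar1(phi, q, n, burn=50, seed=0):
    """Simulate x_t = phi*x_{t-1} + w_t, Var(w_t) = q, and discard the
    first `burn` observations."""
    rng = np.random.default_rng(seed)
    w = np.sqrt(q) * rng.standard_normal(n + burn)
    x = np.zeros(n + burn)
    for t in range(1, n + burn):
        x[t] = phi * x[t - 1] + w[t]   # state equation recursion
    return x[burn:]                    # drop the transient

y = simulate_ar1(0.7, 0.1, 200)   # 200 retained observations
```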
Example
To obtain a realization of 200 observations of the model:
y_1t = .9 + .3 y_{1,t-1} + a_1t
y_2t = .7 + .4 y_{1,t-1} + a_2t - .8 a_{2,t-4}
V[a_1t; a_2t] = [1 .9; .9 1]
the following code can be used:
[theta, din, lab] = arma2thd([-.3 NaN; -.4 NaN], [], [], ...
    [NaN NaN; NaN -.8], [1 .9; .9 1], 4, [.9; .7], 1);
% Generate the exogenous (constant) variable
u = ones(250, 1);
% Compute the simulated sample and omit the first 50 observations
y = simmod(theta, din, 250, u);
y = y(51:250, :);
See Also
arma2thd, str2thd, ss2thd, garc2thd, tf2thd
ss_dv, garch_dv
Purpose
Compute the derivatives of the SS matrices of a model with respect to the i-th parameter.
Synopsis
[dPhi, dGam, dE, dH, dD, dC, dQ, dS, dR] = ss_dv(theta, din, i)
[dPhi, dGam, dE, dH, dD, dC, dQ, dPhig, dGamg, dEg, dHg, dDg] = ...
    garch_dv(theta, din, i)
Description
These functions return the partial derivatives of the SS matrices of any model in THD form with
respect to the i-th parameter of theta. The function ss_dv is used for SS models and garch_dv is
used for models with GARCH errors. The derivatives provided by these functions are used internally
to compute analytic gradients and information matrices. They can also be useful to simplify the
coding of user functions, see Chapter 7.
The input arguments are a THD model definition (theta-din) and the position in theta of the
parameter for which the derivatives are computed (i). The output arguments preceded by the letter d
are derivatives of the corresponding SS matrices.
In garch_dv the output arguments dPhi, dGam, dE, dH, dD, dC and dQ are the derivatives of the SS
model for the mean, and the output arguments dPhig, dGamg, dEg, dHg and dDg are the derivatives
of the SS model for the variance.
See Also
ss_dvp, garc_dvp
ss_dvp, garc_dvp
Purpose
Compute the derivatives of the SS matrices of a model in the direction of any vector.
Synopsis
[dPhi, dGam, dE, dH, dD, dC, dQ, dS, dR] = ss_dvp(theta, din, p)
[dPhi, dGam, dE, dH, dD, dC, dQ, dPhig, dGamg, dEg, dHg, dDg] = ...
    garc_dvp(theta, din, p)
Description
These functions return the partial derivatives of the SS matrices of any model in THD format in the
direction of a vector p. The function ss_dvp is used for SS models and garc_dvp is used for
models with GARCH errors.
The input arguments are a THD model definition (theta-din) and a vector chosen by the user (p).
The output arguments preceded by the letter d are derivatives of the corresponding SS matrices.
In garc_dvp the output arguments dPhi, dGam, dE, dH, dD, dC and dQ are the derivatives of the
SS model for the mean, and the output arguments dPhig, dGamg, dEg, dHg and dDg are the
derivatives of the SS model for the variance.
These functions are used mainly to simplify the coding of user functions, see Chapter 7.
See Also
ss_dv, garch_dv
ss2thd
Purpose
Converts an SS model to THD format.
Synopsis
[theta, din, lab] = ss2thd(Phi, Gam, E, H, D, C, Q, S, R)
Description
The function ss2thd obtains the THD format representation of any model in the form:
x_{t+1} = Φ x_t + Γ u_t + E w_t
z_t = H x_t + D u_t + C v_t
where:
x_t is an (n×1) vector of state variables,
u_t is an (r×1) vector of exogenous variables,
z_t is an (m×1) vector of observable variables,
w_t and v_t are white noise processes such that E[w_t] = 0, E[v_t] = 0 and
E[ [w_{t1}; v_{t2}] [w_{t1}^T  v_{t2}^T] ] = [Q S; S^T R] δ_{t1 t2}
with Q and R positive definite matrices.
The input arguments are the parameter matrices Phi (Φ), Gam (Γ), E (E), H (H), D (D), C (C), Q (Q),
S (S) and R (R). If any of the elements in these matrices, except in the covariances, are zero, they
should be specified with NaN.
The output arguments are the vectors and matrices that define a model in THD format.
The user should also take into account that:
1) If Q and R are defined as column vectors, they are considered diagonal matrices.
2) The identity Q = R = S occurs in many formulations. To specify it, call the function without the
last two parameters, R and S.
3) For deterministic models with observation errors, one may define Q = [] and S = [], which
indicates that no error exists in the state equation.
4) To formulate models without error in the observation equation, one should define R = [] and
S = [].
5) If the state and observation errors are independent, define S = [].
6) If the matrices Gam and/or D are null ([]), the variables in u_t do not affect the state and/or
observation equation.
7) If the matrices E and/or C are null, [], they are replaced internally by the identity matrix.
See Also
arma2thd, str2thd, garc2thd, tf2thd, comp2thd
stackthd
Purpose
Stacks two models in THD format.
Synopsis
[theta, din, label] = stackthd(t1, d1, t2, d2, l1, l2)
Description
The input arguments are: 1) t1, d1, l1, the THD formulation of the first model; and 2) t2, d2,
l2, the THD representation of the second model. The function returns the stacked model in
THD format, where theta = [t1; t2], din = [d1; d2] and label = [l1; l2].
Example
The model:
y_t = .8 + .3 y_{t-1} - .4 y_{t-2} + a_t
y*_t = y_t + v_t
V[a_t] = .1,  V[v_t] = .2
can be stacked in THD format with the following code:
[tha, da, laba] = arma2thd([-.3 .4], [], [], [], [.1], 1, [.8], 1);
[thc, dc, labc] = arma2thd([], [], [], [], [.2], 1);
[ts, ds, ls] = stackthd(tha, da, thc, dc, laba, labc);
See Also
arma2thd, comp2thd, nest2thd, ss2thd, str2thd, tf2thd, garc2thd, prtmod
str2thd
Purpose
Converts a structural econometric model to THD format.
Synopsis
[theta, din, lab] = str2thd([FR0 ... FRp], [FS0 ... FSps], ...
    [AR0 ... ARq], [AS0 ... ASqs], v, s, [G0 ... Gg], r)
Description
The function str2thd obtains the THD format representation of any model in the form:
FR(B) FS(B^S) y_t = G(B) u_t + AR(B) AS(B^S) ε_t
where S denotes the length of the seasonal period, B is the backshift operator, such that for any
sequence x_t: B^{±k} x_t = x_{t∓k}; y_t is an (m×1) vector of endogenous variables, u_t is an (r×1)
vector of exogenous variables, ε_t is an (m×1) vector of white noise errors and:
FR(B) = FR0 + FR1 B + ... + FRp B^p
FS(B^S) = FS0 + FS1 B^S + ... + FSps B^{ps·S}
G(B) = G0 + G1 B + ... + Gg B^g
AR(B) = AR0 + AR1 B + ... + ARq B^q
AS(B^S) = AS0 + AS1 B^S + ... + ASqs B^{qs·S}
The input arguments of str2thd are:
1) The matrices of the autoregressive and moving average factors, [FR0...FRp],[AR0...ARq].
2) The matrices of seasonal autoregressive and moving average factors
[FS0...FSps],[AS0...ASqs].
3) The covariance matrix of ε_t, v. If this matrix is defined as a vector, the disturbances are assumed
to be independent. In order not to impose this constraint, it is necessary to define at least the lower
triangle of the matrix. This matrix cannot contain NaN. To impose independence between two
errors, the user can set the corresponding covariance to zero and, afterwards, impose a
fixed-parameter constraint on this value, see Chapter 5.
4) The parameter s, which indicates the seasonal period (e.g., s=1 for nonseasonal data, s=4 for
quarterly data, s=12 for monthly data).
5) The parameter matrix [G0 ... Gg] and the number of exogenous variables, r. In this function,
the number of exogenous variables cannot be 0.
If any of the matrices (except v) is null, it should be specified using an empty matrix, []. If any of the
elements in these matrices, except in v, are null, they should be specified with NaN.
The output arguments are the vectors and matrices that define a model in THD format.
Example
Consider the structural model:
[1 -.3; 0 1] [y_1t; y_2t] = ([.9 0; 0 .7] + [0 0; 0 -.4] B) u_t
                           + ([1 0; 0 1] + [0 0; .2 -.8] B) [ε_1t; ε_2t]
V[ε_1t; ε_2t] = [1 0; 0 .8]
The following code defines the model matrices, converts them to THD format and displays the model
structure:
FR0 = [1 -.3; NaN 1];
AR1 = [NaN NaN; .2 -.8];
G0 = [.9 NaN; NaN .7];
G1 = [NaN NaN; NaN -.4];
v = [1 .8];
[theta, din, lab] = str2thd(FR0, [], AR1, [], v, 1, [G0 G1], 2);
prtmod(theta, din, lab);
Note that the constant term has been included by means of an exogenous variable.
See Also
arma2thd, ss2thd, garc2thd, tf2thd, comp2thd, nest2thd, prtmod
tf2thd
Purpose
Converts a transfer function model to THD format.
Synopsis
[theta, din, lab] = tf2thd([fr1 ... frp], [fs1 ... fsps], ...
    [ar1 ... arq], [as1 ... asqs], v, s, [w1; ...; wr], [d1; ...; dr])
Description
The function tf2thd obtains the THD format representation of any model in the form:
y_t = (ω_1(B)/δ_1(B)) u_1t + ... + (ω_r(B)/δ_r(B)) u_rt + (θ(B) Θ(B^S))/(φ(B) Φ(B^S)) ε_t
where:
y_t is the value of the endogenous variable at time t,
u_t = [u_1t, ..., u_rt]^T is an (r×1) vector of exogenous variables,
ε_t is a white noise error,
ω_i(B) = ω_i0 + ω_i1 B + ω_i2 B² + ... + ω_i,ni B^ni,  i = 1, 2, ..., r
δ_i(B) = 1 + δ_i1 B + ... + δ_i,ndi B^ndi,  i = 1, 2, ..., r
φ(B) = 1 + φ_1 B + ... + φ_p B^p
Φ(B^S) = 1 + Φ_1 B^S + ... + Φ_P B^{P·S}
θ(B) = 1 + θ_1 B + ... + θ_q B^q
Θ(B^S) = 1 + Θ_1 B^S + ... + Θ_Q B^{Q·S}
The input arguments are:
1) The parameters of the regular and seasonal AR factors of the noise model, [fr1...frp],
[fs1...fsps].
2) The parameters of the regular and seasonal MA factors of the noise model, [ar1...arq],
[as1...asqs].
3) The variance of ε_t, v.
4) The scalar s, which indicates the length of the seasonal period (e.g., s=1 for nonseasonal data,
s=4 for quarterly data, s=12 for monthly data).
5) The coefficients of the polynomials ω_i(B) and δ_i(B), which are specified in the rows of
[w1; ...; wr] and [d1; ...; dr], respectively.
The matrices fr, fs, ar and as are row vectors. All the matrices, except w, can be empty, [], and
may include the value NaN to mark parameters with a null value.
All the arguments are required. If exogenous variables are not included in the model, a VARMA
representation should be used instead.
The output arguments are the vectors and matrices that define a model in THD format.
([DPSOH
Consider the transfer function:

y_t = .4 + .9 u1,t-2 + [.5/(1 - .3B)] u2t + N_t
N_t = (1 - .7B)(1 - .8B^12) a_t
V[a_t] = .1
The following code defines and displays its structure:
[theta, din, lab] = tf2thd([], [], [-.7], [-.8], [.1], 12, ...
    [.4 NaN NaN; NaN NaN .9; .5 NaN NaN], [NaN; NaN; -.3]);
prtmod(theta, din, lab);
See Also
arma2thd, ss2thd, str2thd, garc2thd, prtmod
thd2arma, thd2str, thd2tf
Purpose
Convert a simple model in THD format to the corresponding standard formulation.
Synopsis
[F, A, V, G] = thd2arma(theta, din)
[F, A, V, G] = thd2str(theta, din)
[F, A, V, W, D] = thd2tf(theta, din)
Description
These functions convert a simple model in THD format to the standard formulation of a VARMAX
model, a structural econometric model or a transfer function. Hence, they are the reciprocals of
arma2thd, str2thd and tf2thd, respectively.
Example
The model:

[1 -.3; 0 1][y1t; y2t] = ([.9 0; 0 .7] + [0 0; 0 -.4]B)[1; ut] + ([1 0; 0 1] + [0 0; .2 -.8]B)[a1t; a2t]
V[a1t; a2t] = [1 0; 0 .8]
can be converted to THD format using the str2thd function:
[theta, din] = str2thd([1 -.3; NaN 1], [NaN NaN; .2 -.8], ...
    [1; .8], 1, [.9 NaN NaN NaN; NaN .7 NaN -.4], 2);
and the matrices in the standard representation are recovered with the command:
[F, A, V, G] = thd2str(theta, din)
The use of thd2arma is completely analogous. As for thd2tf, consider the transfer function:
y_t = .4 + .9 u1,t-2 + [.5/(1 - .3B)] u2t + N_t
(1 - B)(1 - B^12) N_t = (1 - .7B)(1 - .8B^12) a_t
V[a_t] = .1
which can be translated to THD format by the command:
[theta, din] = tf2thd([-1], [-1], [-.7], [-.8], [.1], 12, ...
    [.4 NaN NaN; NaN NaN .9; .5 NaN NaN], [NaN; NaN; -.3]);
and then the model polynomials are recovered using thd2tf:
[F, A, V, W, D] = thd2tf(theta,din)
See Also
arma2thd, str2thd, tf2thd
thd2ss
Purpose
Converts any model in THD format to the corresponding SS representation.
Synopsis
[Phi, Gam, E, H, D, C, Q, S, R] = thd2ss(theta, din)
Description
The function thd2ss is the reciprocal of ss2thd. It receives a model in THD format and returns the
matrices of its SS formulation:
x_{t+1} = Φ x_t + Γ u_t + E w_t,  E[w_t] = 0
z_t = H x_t + D u_t + C v_t,  E[v_t] = 0

where:

V[w_{t1}; v_{t2}] = [Q S; S' R]·δ_{t1,t2},  with δ_{t1,t2} = 1 if t1 = t2 and 0 if t1 ≠ t2
The input argument is a THD model definition (theta-din). The output arguments are the
parameter matrices Phi (Φ), Gam (Γ), E (E), H (H), D (D), C (C), Q (Q), S (S) and R (R).
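To make the SS notation concrete, here is a minimal NumPy sketch (not toolbox code; the function name simulate_ss is ad hoc) that propagates the two SS equations for given input and noise sequences:

```python
import numpy as np

def simulate_ss(Phi, Gam, E, H, D, C, u, w, v, x0):
    """Propagate the SS model of the manual:
        x_{t+1} = Phi x_t + Gam u_t + E w_t
        z_t     = H x_t + D u_t + C v_t
    u, w and v are (T x dim) arrays of inputs and noises; returns the outputs z."""
    x = np.asarray(x0, dtype=float)
    z = []
    for t in range(len(u)):
        z.append(H @ x + D @ u[t] + C @ v[t])   # observation equation
        x = Phi @ x + Gam @ u[t] + E @ w[t]     # state transition
    return np.array(z)
```

For example, a scalar system with Phi = .5, Gam = H = 1, D = 0, zero noises and x0 = 0 produces z = 0, 1, ... for a constant unit input.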
See Also
ss2thd
tomod, touser
Purpose
Disables or enables the user model flag in a THD model specification.
Synopsis
din = tomod(din)
din = touser(din, userf, userfg)
Description
The function tomod disables the user model flag in a THD model specification, while touser
activates the user model indicator in din and adds the user function. The input argument userf is the
name of the user function (see Chapter 7), and userfg is an optional parameter for building derivatives.
See Also
arma2thd, str2thd, garc2thd, ss2thd, tf2thd
transdif
Purpose
Applies stationarity inducing transformations to a set of time series.
Synopsis
z = transdif(y, lambda, d, ds, s)
Description
Computes the Box-Cox (1964) transformation and the regular and seasonal differences of a time
series.
The input arguments are: a) y, a matrix whose columns correspond to the different series to be
transformed; b) lambda, the parameter of the Box-Cox transformation; c) d, the order of regular
differencing; d) ds, an (S×1) vector containing the orders of seasonal differencing (default value ds=0);
and e) s, an (S×1) vector containing the lengths of the seasonal periods (default value s=1). The last
two parameters are optional and can be omitted if seasonal differences are not required.
The output argument is the differenced and transformed series z such that:
z_t = ∇^d ∇_{s1}^{ds1} ··· ∇_{sS}^{dsS} y_t^(λ) = (1 - B)^d (1 - B^{s1})^{ds1} ··· (1 - B^{sS})^{dsS} y_t^(λ),  s = [s1, s2, ..., sS]

y_t^(λ) = ln(y_t + µ) if λ = 0
y_t^(λ) = [(y_t + µ)^λ - 1]/λ if λ ≠ 0

The parameter µ is zero if all the values of y_t are positive, and equal to -min(y) + 10^-5 otherwise.
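The computation reduces to a few lines. The following NumPy sketch (an illustration of the formulas of this entry, not the toolbox code; the name transdif_sketch is ad hoc) handles a single series with one seasonal period:

```python
import numpy as np

def transdif_sketch(y, lam, d=0, ds=0, s=1):
    """Box-Cox transform followed by d regular and ds seasonal differences,
    i.e. z_t = (1 - B)^d (1 - B^s)^ds y_t^(lambda), for a single series."""
    y = np.asarray(y, dtype=float)
    mu = 0.0 if y.min() > 0 else -y.min() + 1e-5   # offset so that y + mu > 0
    z = np.log(y + mu) if lam == 0 else ((y + mu) ** lam - 1) / lam
    for _ in range(d):
        z = z[1:] - z[:-1]        # regular difference (1 - B)
    for _ in range(ds):
        z = z[s:] - z[:-s]        # seasonal difference (1 - B^s)
    return z
```

For instance, lam = 0 and d = 1 give log differences, so a series doubling each period returns a constant ln 2.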
References
Box, G. E. P. and D. R. Cox (1964). “An Analysis of Transformations”, Journal of the Royal
Statistical Society, B, 26, 211-243.
uidents
Purpose
Displays the univariate simple and partial autocorrelation functions for a set of time series.
Synopsis
[acf, pacf, qs] = uidents(y, lag, tit)
Description
The input arguments are: a) y, an n×m matrix containing m series of n observations each; b) lag,
the maximum lag for computing the values of the autocorrelation functions (default value n/4);
and c) tit, an optional character matrix containing a descriptive title for each series. The last two
parameters are optional.
The output arguments are the matrices acf and pacf, whose columns contain the sample
autocorrelation function and partial autocorrelation function of each series, and qs, which is a 1×m
vector containing the values of the Ljung-Box Q statistic for each series, computed with the first lag
values of the autocorrelation function.
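The sample autocorrelations and the Ljung-Box Q statistic returned by uidents follow the standard formulas, sketched below in NumPy (an illustration only, not the toolbox code; the names acf and ljung_box_q are ad hoc):

```python
import numpy as np

def acf(y, nlags):
    """Sample autocorrelations r_1, ..., r_nlags of a single series."""
    y = np.asarray(y, dtype=float)
    yc = y - y.mean()
    denom = np.sum(yc ** 2)
    return np.array([np.sum(yc[k:] * yc[:-k]) / denom for k in range(1, nlags + 1)])

def ljung_box_q(y, nlags):
    """Ljung-Box Q = n(n+2) * sum_{k=1..nlags} r_k^2 / (n - k)."""
    n = len(y)
    r = acf(y, nlags)
    return n * (n + 2) * np.sum(r ** 2 / (n - np.arange(1, nlags + 1)))
```

A strongly alternating series gives a large negative r_1 and hence a large Q even at lag 1, whereas white noise gives Q values close to the chi-squared expectation.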
Example
The following code generates a 100×2 matrix of Gaussian white noise and displays ten lags of the
corresponding autocorrelation functions:
y = randn(100, 2);
uidents(y, 10);
See Also
midents, histsers, plotsers, rmedser, plotqqs
9 References
Anderson, B. D. O. and J. B. Moore (1979). Optimal Filtering. Englewood Cliffs (N.J.): Prentice
Hall.
Bollerslev, T. (1986). “Generalized Autoregressive Conditional Heteroscedasticity,” Journal of
Econometrics, 31, 307-327.
Bollerslev, T., R.F. Engle and D.B. Nelson (1994). “ARCH Models”, in R.F. Engle and D.L.
McFadden (editors), Handbook of Econometrics, vol. IV. Amsterdam: North-Holland.
Box, G. E. P. and D. R. Cox (1964). “An Analysis of Transformations,” Journal of the Royal
Statistical Society, B, 26, 211-243.
Box, G.E.P., G. M. Jenkins and G.C. Reinsel (1994). Time Series Analysis, Forecasting and
Control. Englewood Cliffs (N. J.): Prentice-Hall.
Casals, J. (1997). Métodos de Subespacios en Econometría. PhD Thesis. Madrid: Universidad
Complutense.
Casals, J. and S. Sotoca (1997). “Exact Initial Conditions for Maximum Likelihood Estimation of
State Space Models with Stochastic Inputs,” Economics Letters, 57, 261-267.
Casals, J., S. Sotoca and M. Jerez (1999). “A Fast and Stable Method to Compute the Likelihood of
Time Invariant State-Space Models,” Economics Letters, 65, 3, 329-337.
Casals, J., M. Jerez and S. Sotoca (2000). “Exact Smoothing for Stationary and Nonstationary Time
Series,” International Journal of Forecasting, 16, 59-69.
Chatfield, C. and D.L. Prothero (1973). “Box-Jenkins Seasonal Forecasting: Problems in a Case
Study,” Journal of the Royal Statistical Society, A, 136, 295-336.
De Jong, P. (1989), “Smoothing and Interpolation with the State-Space Model,” Journal of the
American Statistical Association, 84, 408, 1085-1088.
De Jong, P. and S. Chu-Chun-Lin (1994). “Stationary and Non-Stationary State Space Models,”
Journal of Time Series Analysis, 15, 2, 151-166.
Dennis, J.E. and R.B. Schnabel (1983). Numerical Methods for Unconstrained Optimization and
Nonlinear Equations. Englewood Cliffs (N. J.): Prentice-Hall.
Dickey, D.A. and W.A. Fuller (1981). “Likelihood Ratio Statistics for Autoregressive Time Series
with a Unit Root,” Econometrica, 49, 1063.
Engle, R.F. (1982). “Autoregressive Conditional Heteroskedasticity with Estimates of the Variance
of U.K. Inflation,” Econometrica, 50, 987-1008.
Engle, R.F. (1984). “Wald, Likelihood and Lagrange Multiplier Tests in Econometrics”, in Z.
Griliches and M.D. Intriligator (editors), Handbook of Econometrics, vol. II. Amsterdam:
North-Holland.
Engle, R.F. and D. Kraft (1983). “Multiperiod Forecast Error Variances of Inflation Estimated from
ARCH Models,” in A. Zellner (editor), Applied Time Series Analysis of Economic Data.
Washington D.C.: Bureau of the Census.
García-Ferrer, A., J. del Hoyo, A. Novales and P. C. Young (1996). “Recursive Identification,
Estimation and Forecasting of Nonstationary Economic Time Series with Applications to
GNP International Data” in D.A. Berry, K.M. Chaloner and J.K. Geweke (editors),
Bayesian Analysis in Statistics and Econometrics: Essays in Honor of Arnold Zellner.
New York: John Wiley.
Girshick, M.A. and Haavelmo, T. (1947). “Statistical Analysis of the Demand for Food: Examples
of Simultaneous Estimation of Structural Equations,” Econometrica, 15, 79-110.
Grace, A. and MATLAB (1993). Optimization Toolbox. Natick, Mass.: The MathWorks Inc.
Greene, W.H. (1996). Econometric Analysis. New York: Macmillan Publishing Company.
Hamilton, J.D. (1994). Time Series Analysis. Princeton, N.J: Princeton University Press.
Harvey, A.C. (1989). Forecasting, Structural Time Series Models and the Kalman Filter.
Cambridge: Cambridge University Press.
Harvey, A.C. and Koopman, S.J. (1992). “Diagnostic Checking of Unobserved-Components Time
Series Models,” Journal of Business and Economic Statistics, vol. 10, 4, 377-389.
Harvey, A.C. and N. Shephard (1993). “Structural Time Series Models,” In G.S. Maddala, C.R.
Rao, and H.D. Vinod (editors), Handbook of Statistics, vol. 11. Amsterdam: North-Holland.
Jenkins, G.M. and A.S. Alavi (1981). “Some Aspects of Modelling and Forecasting Multivariate
Time Series,” Journal of Time Series Analysis, 2, 1, 1-47.
Johansen, S. (1988). “Statistical Analysis of Cointegration Vectors,” Journal of Economic
Dynamics and Control, 12, 231-254.
Johansen, S. (1991). “Estimation and Hypothesis Testing of Cointegration Vectors in Gaussian
Vector Autoregressive Models,” Econometrica, 59, 1551-1580.
Kmenta, J. (1997). Elements of Econometrics. Ann Arbor: The University of Michigan Press.
Ljung, L. and P.E. Caines (1979). “Asymptotic Normality of Prediction Error Estimators for
Approximate System Models,” Stochastics, 3, 29-46.
MATLAB (1992). MATLAB: Reference Guide. Natick, Mass.: The MathWorks Inc.
MATLAB (1992). MATLAB: External Interface Guide. Natick, Mass.: The MathWorks Inc.
MATLAB (1996). MATLAB COMPILER: Users Guide. Natick (Mass): The MathWorks Inc.
McCullough, B.D. and H.D. Vinod (1999). “The Numerical Reliability of Econometric Software,”
Journal of Economic Literature, XXXVII, 633-665.
McLeod, G. (1982). Box Jenkins in Practice. Lancaster: Gwilym Jenkins & Partners Ltd.
Newbold, P., C. Agiakloglou and J. Miller (1994). “Adventures with ARIMA software,”
International Journal of Forecasting, 10, 573-581.
Pankratz, A. (1991). Forecasting with Dynamic Regression Models. New York: John Wiley &
Sons.
Sotoca, S. (1994). “Aplicación del Filtro de Chandrasekhar a la Estimación por Máxima
Verosimilitud Exacta de Modelos Dinámicos,” Estadística Española, 36, 136, 259-285.
Swamy, P.A.V.B. and G.S. Tavlas (1995). “Random Coefficients Models: Theory and
Applications,” Journal of Economic Surveys, 9, 2, 165-196.
Terceiro, J. and P. Gómez (1985). “Theoretical and Empirical Restrictions in Time Series
Analysis,” Proceedings of the 5th World Congress of the Econometric Society, MIT,
Massachusetts.
Terceiro, J. (1990). Estimation of Dynamic Econometric Models with Errors in Variables. Berlin:
Springer-Verlag.
Terceiro, J. (1999). “Comments on Kalman Filtering Methods for Computing Information Matrices
for Time-Invariant Periodic and Generally Time-Varying VARMA Models and Samples”,
Computers & Mathematics with Applications (forthcoming).
Van Overschee, P. and B. De Moor (1996). Subspace Identification for Linear Systems: Theory,
Implementation, Applications. Dordrecht: Kluwer Academic Publishers.
Viberg, M. (1995). “Subspace-based methods for the identification of linear time-invariant
systems,” Automatica, 31, 12, 1835-1851.
Watson, M.W. and R.F. Engle (1983). “Alternative Algorithms for the Estimation of Dynamic
Factor, MIMIC and Varying Coefficient Regression Models,” Journal of Econometrics, 23,
3, 385-400.
Wells, C. (1996). The Kalman Filter in Finance. Dordrecht: Kluwer Academic Publishers.
White, H. (1982). “Maximum Likelihood Estimation of Misspecified Models,” Econometrica,
50, 1, 1-25.
Appendix A: Error and warning messages

Error messages
1. THETA and DIN do not fit
2. i inconsistent with THETA (out of range)
3. Incorrect number of arguments
4. Badly conditioned covariance matrix
5. Incorrect model specification
6. Only one series allowed
7. Should be more than 1 observation
8. Model %1d inconsistent
9. Endogenous variables model should be simple
10. Model not identified
11. Inconsistent input arguments
12. Inconsistent error model
13. User function should be passed as argument in user models
14. Incorrect model
15. Inconsistent system matrix dimension
16. Impossible to compute with missing data
17. Invalid number of lags
18. File not found: %s
19. The equation has no solution
20. SETE4OPT. Unrecognized option %s
21. SETE4OPT. Unrecognized value %s
22. SETE4OPT. Invalid value for %s
23. Run E4INIT before using E4
24. Use ARMA2THD for ARMA models
25. Initial conditions are meaningless
26. Non-stationary system. Initial conditions not compatible with Chandrasekhar
27. E4MIN. No decision variables; check second column of THETA
28. E4LNSRCH. THETA vector is meaningless
29. Multivariate time-varying parameters models are not supported
30. The sample size should be an integer multiple of the seasonal period
31. SETE4OPT. If vcond=De Jong, filter must be Kalman
32. Argument should be scalar
33. E4MIN. Objective function not found
34. For this type of model ARMA2THD or STR2THD should be used
35. Not enough data for using e4preest()
Warning messages
1. Should be one title per series
2. Invalid number of lags
3. Invalid %s option
4. PLOTSERS. A maximum of seven series can be represented in mode 2
5. RMEDSERS. Invalid group length
6. LFMODINI. Roots within the circle of radius 1
7. Approximate computation of information matrix
8. Information matrix not positive definite. Pseudo-inverse computed
9. E4MIN. Surpassed the maximum number of iterations
11. E4MIN. Hessian reinitialized
13. E4LNSRCH. Precision problem
14. CHOLP. Matrix not square
15. Kalman filter will be used
16. E4MIN. Analytic gradient function not found. Numeric approximation used
Appendix B: Structure of E4OPTION

This Appendix describes the structure of the internal vector E4OPTION, created by the command
e4init (see Chapter 1). E4OPTION is a 1×51 numeric vector which stores the general options that
control the behaviour of E4. The first ten positions, which can be modified using sete4opt, are
defined as follows:
1) E4OPTION(1,1) indicates the filter to be used in estimation: 1 = Kalman,
2 = Chandrasekhar.
2) E4OPTION(1,2) indicates whether the matrices are to be scaled when computing their Cholesky
decomposition during filtering: 1 = scale, 0 = do not scale.
3) E4OPTION(1,3) indicates the algorithm for computing the initial state vector expectation:
1 = maximum likelihood, 2 = initialize to zero, 3 = use the first value of the exogenous variables,
4 = use the average of the exogenous variables.
4) E4OPTION(1,4) indicates the algorithm for computing the initial state vector covariance:
1 = solution of the algebraic Lyapunov equation, 2 = zero, 4 = inverse of De Jong, see De
Jong and Chu-Chun-Lin (1994).
5) E4OPTION(1,5) indicates whether to use the covariance matrices or their Cholesky factors
as parameters in model estimation: 1 = covariance matrices, 2 = Cholesky factors.
6) E4OPTION(1,6) stores the optimization algorithm to use: 1 = BFGS, 2 = Newton.
7) E4OPTION(1,7) stores the maximum step length to be used by the optimizer.
8) E4OPTION(1,8) stores the stop criterion tolerance.
9) E4OPTION(1,9) stores the maximum number of iterations for the optimization algorithm.
10) E4OPTION(1,10) indicates whether to display or omit output at each iteration of e4min:
1 = yes, 0 = no.
The values stored in E4OPTION(1,11:51) are not user-modifiable through sete4opt, as they
store numeric tolerances for internal E4 functions.