ARIMA Modelling

TIME SERIES: MODELLING

DESCRIPTION

In building models based on past realizations of a time series, we implicitly assume that there is some regularity in the process generating the series.

TRANSCRIPT

  • TIME SERIES: MODELLING

  • Stages in Modeling [flowchart with Yes/No decision branches between ARMA and ARIMA modelling and a NOT GOOD branch looping back]

  • In building models based on past realizations of a time series, we implicitly assume that there is some regularity in the process generating the series.

    We view the series as if it were produced by a specially designed machine, and we attempt to find this machine.

    Precondition: one way to view such regularity is through the concept of stationarity.

  • IDENTIFICATION

  • Check for Stationarity: A time series is covariance stationary if, for all values of t,

    E[Yt] = μ,   Var(Yt) = σ²,   Cov(Yt, Yt-k) = γk (a function of the lag k only).

  • A time series is covariance stationary if its statistical properties do not change with time. In a stationary series the mean and variance are constant across time, and the covariance between current and lagged values of the series (the autocovariances) depends only on the distance between the time points.
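
    As a rough illustration of these conditions, the sketch below (not from the original slides; it assumes the series y is a one-dimensional numpy array) compares the sample mean and variance across the two halves of a series and computes a lag-k sample autocovariance.

        import numpy as np

        def rough_stationarity_check(y, k=1):
            """Informal check: compare moments across halves; report a lag-k autocovariance."""
            y = np.asarray(y, dtype=float)
            half = len(y) // 2
            first, second = y[:half], y[half:]
            print("mean:     %.3f vs %.3f" % (first.mean(), second.mean()))
            print("variance: %.3f vs %.3f" % (first.var(), second.var()))
            # Sample autocovariance at lag k: average of (Y_t - Ybar)(Y_{t-k} - Ybar).
            ybar = y.mean()
            gamma_k = np.mean((y[k:] - ybar) * (y[:-k] - ybar))
            print("autocovariance at lag %d: %.3f" % (k, gamma_k))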

  • Stationary Time Series? [plot: a stationary process]

  • Stationary Time Series? [plot: a series with non-stationary variance]

  • Interpretation: We try to predict the future in terms of what is already known. For a stationary series the mean and variance are constant: E[Yt] = μ and Var(Yt) = σ² for all t.

    But these two properties alone do not help very much in talking about the future.

  • Interpretation

    More useful are the autocovariances γk = Cov(Yt, Yt-k). If γ1 > 0, a high value of Y today will likely be followed by a high value tomorrow. By assuming that the γk are stable over time, this information can be exploited and estimated.

  • Autocorrelations: Covariances are difficult to interpret, since they depend on the units of measurement.

    Correlations are scale-free. Thus we can obtain the same information about a time series by computing its autocorrelations.

  • Autocorrelation Coefficient: The autocorrelation coefficient between Yt and Yt-k is

    ρk = Cov(Yt, Yt-k) / Var(Yt) = γk / γ0.

    A graph of the autocorrelations is called a correlogram. Knowledge of the correlogram implies knowledge of the process [model] which generated the series, and vice versa.
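
    A minimal numpy sketch of this definition (an illustration, not from the slides; y is assumed to be a one-dimensional array): each ρk is the lag-k sample autocovariance divided by the lag-0 autocovariance (the variance).

        import numpy as np

        def autocorrelations(y, max_lag=20):
            """Sample autocorrelation coefficients rho_1, ..., rho_max_lag."""
            y = np.asarray(y, dtype=float)
            ybar = y.mean()
            gamma0 = np.mean((y - ybar) ** 2)              # lag-0 autocovariance = variance
            rhos = []
            for k in range(1, max_lag + 1):
                gamma_k = np.mean((y[k:] - ybar) * (y[:-k] - ybar))
                rhos.append(gamma_k / gamma0)              # rho_k = gamma_k / gamma_0
            return np.array(rhos)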

  • Partial Autocorrelations: Another important function in analyzing time series is the partial autocorrelation function. Partial autocorrelations measure the strength of the relationship between observations in a series while controlling for the effect of the intervening time periods.

  • If the observations of Y in period t are highly related to the observations in, say, period t-12, then a plot of the partial autocorrelations for that series (the partial correlogram) should exhibit a spike, or relative peak, at lag 12. Monthly time series with a seasonal component should exhibit such a pattern in their partial correlograms. Monthly time series with seasonal components will also exhibit spikes in their correlograms at multiples of 12 lags.

  • Autocorrelations and Partial Autocorrelations [plots: correlogram and partial correlogram]
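
    In practice the correlogram and partial correlogram are usually drawn with a library; a sketch assuming statsmodels and matplotlib are available:

        import matplotlib.pyplot as plt
        from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

        def show_correlograms(y, lags=24):
            """Plot the correlogram (ACF) and partial correlogram (PACF) of a series."""
            fig, axes = plt.subplots(2, 1, figsize=(8, 6))
            plot_acf(y, lags=lags, ax=axes[0])   # for monthly seasonal data, look for spikes at lags 12, 24, ...
            plot_pacf(y, lags=lags, ax=axes[1])
            fig.tight_layout()
            plt.show()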

  • MODELING: ARMA

  • Time Series Models: Two basic models are commonly used: the Autoregressive (AR) model and the Moving Average (MA) model.

    When you know the underlying process governing a time series, you can remove its effect.

  • Autoregressive Processes (AR): Regress the variable on lags of itself only (a univariate process). How many lags? Look at the t-stats and at how smooth the data is. AR(1):

    Yt = φ0 + φ1 Yt-1 + εt

  • Autoregressive Process of order p, AR(p):

    Yt = φ0 + φ1 Yt-1 + φ2 Yt-2 + ... + φp Yt-p + εt

    Note how mechanical the model is: no attempt is made to explain Y other than to say it follows its own past history, and the lag length is arbitrary.

  • AR(1): Yt = φ0 + φ1 Yt-1 + εt, with φ1 = 0.9 [plot of a simulated series]. Note: smooth, long swings away from the mean.

  • AR(1): Yt = φ0 + φ1 Yt-1 + εt, with φ1 = 0.9 [correlogram]. Note: exponential decay.

  • AR(2): Yt = φ0 + φ1 Yt-1 + φ2 Yt-2 + εt, with φ1 = 1.4, φ2 = -0.45 [plot of a simulated series]. Note: smooth, long swings away from the mean.

  • AR(2): Yt = φ0 + φ1 Yt-1 + φ2 Yt-2 + εt, with φ1 = 1.4, φ2 = -0.45 [plot].

  • Autoregressive Models: Summary

    1) Autocorrelations decay or oscillate.

    2) Partial autocorrelations cut off after lag p for an AR(p) model.

    3) Stationarity is a big issue: very slow decay in the autocorrelations means you should re-check stationarity.
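
    As a minimal illustration of the AR(1) behaviour described above (a sketch with an arbitrary sample size, not taken from the slides), the series can be simulated with a plain numpy loop and inspected with the autocorrelations() sketch given earlier.

        import numpy as np

        def simulate_ar1(phi0=0.0, phi1=0.9, n=200, seed=0):
            """Simulate Y_t = phi0 + phi1 * Y_{t-1} + eps_t with standard-normal white noise."""
            rng = np.random.default_rng(seed)
            eps = rng.standard_normal(n)
            y = np.zeros(n)
            for t in range(1, n):
                y[t] = phi0 + phi1 * y[t - 1] + eps[t]
            return y

        # With phi1 = 0.9 the path shows smooth, long swings away from the mean;
        # its autocorrelations decay roughly geometrically (about phi1**k) and the
        # partial autocorrelations cut off after lag 1.
        y = simulate_ar1()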

  • Moving Average Process (MA): Univariate; explains a variable as a weighted average of a series of disturbances.

    MA(1):  Yt = θ0 + εt + θ1 εt-1

    MA(q):  Yt = θ0 + εt + θ1 εt-1 + θ2 εt-2 + ... + θq εt-q

    MA(∞):  Yt = θ0 + εt + θ1 εt-1 + θ2 εt-2 + ...

  • Note that this model is also mechanical in the sense that there are no explanatory variables. It is difficult to estimate an MA model directly because the disturbance terms are not observed. Surprisingly, it is easiest to estimate the infinite MA; this is because it can be represented as an AR process.

  • AR and MA: Take the AR(1) and keep substituting in for the lagged value:

    Yt = φ0 + φ1 Yt-1 + εt = φ0 + φ1(φ0 + φ1 Yt-2 + εt-1) + εt = ... = φ0 / (1 - φ1) + εt + φ1 εt-1 + φ1² εt-2 + ...   (for |φ1| < 1)

  • Every AR(p) has an MA(∞) representation, i.e. not just the AR(1) but more complicated models too. Every MA(∞) can be represented as an AR(p), but not necessarily as an AR(1). An MA(q) with finite q has an AR(∞) representation, provided it is invertible.

  • MA(1): Yt = θ0 + εt - θ1 εt-1, with θ1 = 0.9 [plot of a simulated series]. Note: jagged, frequent swings around the mean.

  • MA(1): Yt = θ0 + εt - θ1 εt-1, with θ1 = 0.9 [plot].

  • Moving Average Models: Summary

    1) Autocorrelations cut off after lag q for an MA(q) model.

    2) Partial autocorrelations decay or oscillate.
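
    An analogous simulation sketch for the MA(1) example with θ1 = 0.9 (using the minus-sign convention of the plotted slides; again not from the original deck):

        import numpy as np

        def simulate_ma1(theta0=0.0, theta1=0.9, n=200, seed=0):
            """Simulate Y_t = theta0 + eps_t - theta1 * eps_{t-1}."""
            rng = np.random.default_rng(seed)
            eps = rng.standard_normal(n + 1)
            # Each observation is a weighted combination of the current and previous disturbance.
            return theta0 + eps[1:] - theta1 * eps[:-1]

        # The path shows jagged, frequent swings around the mean; the autocorrelations
        # cut off after lag 1 while the partial autocorrelations decay.
        y = simulate_ma1()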

  • Autoregressive Moving Average (ARMA) Models: A time series may be represented by a combination of autoregressive and moving average processes. In this case the series is an ARMA(p,q) process and takes the form:

    ARMA(p,q): Yt = φ0 + φ1 Yt-1 + ... + φp Yt-p + εt - θ1 εt-1 - ... - θq εt-q

  • ARMA(1,1): Yt = φ0 + φ1 Yt-1 + εt - θ1 εt-1, with φ1 = 0.9, θ1 = 0.5 [plot of a simulated series]. Note: smooth, long swings away from the mean.

  • ARMA(1,1): Yt = φ0 + φ1 Yt-1 + εt - θ1 εt-1, with φ1 = 0.9, θ1 = 0.5 [correlogram]. Note: exponential decay, starting after lag 1.

  • AR, MA, ARMA Models: Summary [summary chart]
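
    A hedged sketch of estimating an ARMA(p,q) for a stationary series y, using statsmodels' ARIMA class with d = 0 (class and order argument as in recent statsmodels releases):

        from statsmodels.tsa.arima.model import ARIMA

        def fit_arma(y, p=1, q=1):
            """Fit ARMA(p, q), i.e. ARIMA(p, 0, q), and report the estimates."""
            result = ARIMA(y, order=(p, 0, q)).fit()
            print(result.summary())   # estimated phi and theta coefficients, AIC, BIC, ...
            return result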

  • MODELING: ARIMA (a.k.a. the Box-Jenkins method)

  • Nonstationarities I: In the level of the mean. Here the mean changes over different segments of the data. Time series with a strong trend are not mean stationary.

    Nonlinear trends will cause the covariances to change over time.

  • S&P 500 Index

  • S&P 500: Correlogram

  • Nonstationarities II: Seasonality. Time series with a seasonal component exhibit variations in the data which are a function of the time of year. The autocovariances will then depend on the location of the data points in addition to the distance between observations. Such nonstationarities are made worse by seasonal components which are not stable over time.

  • Nonstationarities III: Shocks. A drastic change in the level of a time series will cause it to be nonstationary.

  • Removing Nonstationarities

    Take logarithms, if necessary: Xt = log(Yt)

    First differencing: D(Yt) = Yt - Yt-1

    Second differencing: D(Yt) - D(Yt-1) = Yt - 2 Yt-1 + Yt-2

    Annualized rates of growth.
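
    The same transformations as a small sketch (y is assumed to be a pandas Series, positive wherever a logarithm is taken; the growth-rate line is one common convention, not taken from the slides):

        import numpy as np

        def transform(y):
            """Common transformations for removing nonstationarities."""
            x = np.log(y)                        # logarithm (stabilizes a growing variance)
            d1 = y.diff()                        # first difference: Y_t - Y_{t-1}
            d2 = y.diff().diff()                 # second difference: Y_t - 2*Y_{t-1} + Y_{t-2}
            growth = 100 * np.log(y).diff()      # approximate period-on-period growth rate, in percent
            return x, d1, d2, growth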

  • Autoregressive Integrated Moving Average, ARIMA(p,d,q), Models: an ARMA model in the d-th differences of the data.

    The first step is to find the level of differencing necessary.

    The next steps are to find the appropriate ARMA model for the differenced data.

    Overdifferencing needs to be avoided.
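
    A sketch of this workflow with statsmodels (the differencing order d is supplied by the user after inspecting the data; names are illustrative):

        from statsmodels.tsa.arima.model import ARIMA

        def fit_arima(y, p=0, d=1, q=1):
            """Fit ARIMA(p, d, q): an ARMA(p, q) model in the d-th differences of y."""
            result = ARIMA(y, order=(p, d, q)).fit()
            print(result.summary())
            return result

        # For example, an ARIMA(0,1,1) like the simulated illustration below:
        # fitted = fit_arima(y, p=0, d=1, q=1)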

  • ARIMA(0,1,1): (Yt - Yt-1) = εt - 0.8 εt-1 [plot of a simulated series]. Note: slowly wandering level of the series, with lots of variation around that level.

  • ARIMA(0,1,1): (Yt - Yt-1) = εt - 0.8 εt-1, T = 150 [correlogram and partial correlogram]. Note: the autocorrelations decay very slowly (from a moderate level); the PACF decays at rate 0.8.

  • Autoregressive Integrated Moving Average Models: Summary

    1) Autocorrelations decay slowly; the initial level is determined by how close the MA parameter is to one.

    2) Partial autocorrelations decay or oscillate, as determined by the MA parameter.

  • DIAGNOSTICS

  • Check for Stationarity with the Ljung-Box statistic: Q is a diagnostic measure of randomness for a time series, assessing whether there are patterns in a group of autocorrelations:

    Q = n(n + 2) Σ ρ̂k² / (n - k),  summed over k = 1, ..., i,  where i = number of lags and n = number of observations.

    If Q exceeds the chi-square critical value, reject the hypothesis that the autocorrelations are jointly zero: the model is not adequate (the data are not stationary).
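
    A sketch of the Ljung-Box check, assuming statsmodels' acorr_ljungbox is available (typically applied to the fitted model's residuals):

        from statsmodels.stats.diagnostic import acorr_ljungbox

        def ljung_box_check(residuals, lags=(10, 20)):
            """Ljung-Box Q statistics and p-values for several lag counts."""
            table = acorr_ljungbox(residuals, lags=list(lags))
            # Small p-values: the autocorrelations are jointly nonzero, so the model is not adequate.
            print(table)
            return table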

  • Criteria for Model Selection

    Akaike's Information Criterion:  AIC = n ln(σ̂²) + 2k

    Schwarz's Bayesian Information Criterion:  BIC = n ln(σ̂²) + k ln(n)

    where σ̂² is the estimated variance of εt, n is the number of observations, and k is the number of estimated parameters.

    Choose the model with the smallest AIC or BIC.
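
    A minimal order-selection sketch: fit a small grid of candidate ARIMA orders and keep the one with the lowest AIC (using result.bic instead gives the BIC rule). The grid bounds are arbitrary.

        import itertools
        from statsmodels.tsa.arima.model import ARIMA

        def select_order(y, d=1, max_p=2, max_q=2):
            """Grid-search ARIMA(p, d, q) orders and return the fit with the smallest AIC."""
            best = None
            for p, q in itertools.product(range(max_p + 1), range(max_q + 1)):
                try:
                    result = ARIMA(y, order=(p, d, q)).fit()
                except Exception:
                    continue                      # skip orders that fail to estimate
                if best is None or result.aic < best.aic:
                    best = result
            return best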

  • EVALUATE

  • Characteristics of a Good Forecasting Model

    It fits the past data well: plots of actual against fitted values look good, adjusted R² is high, and error is low relative to other models (MAD, MSE, MAPE and MPE).

    It forecasts the future, and withheld data, well.

  • Characteristics of a Good Forecasting Model (continued)

    It is parsimonious: simple but effective, without too many coefficients.

    The estimated coefficients (φ and θ) are statistically significant, with no redundant or unnecessary terms.

    The model is stationary and invertible (i.e., 1 > φ, θ > -1).

    No patterns are left in the ACFs and PACFs; the residuals are white noise.

  • Measuring Forecasting Error: A residual is the difference between an actual value and its forecast:

    et = Yt - Ŷt

    Mean Absolute Deviation, averaging the magnitude of the forecast errors:  MAD = (1/n) Σ |et|

    Mean Squared Error, penalizing large forecast errors:  MSE = (1/n) Σ et²

  • Measuring Forecasting Error (continued)

    Mean Absolute Percentage Error, useful when the size of the deviation is important:  MAPE = (1/n) Σ |et / Yt| × 100%

    Mean Percentage Error, used to find whether a forecasting method is biased:  MPE = (1/n) Σ (et / Yt) × 100%
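
    The same error measures as small numpy functions (a sketch; actual and forecast are assumed to be equal-length arrays, with nonzero actual values for MAPE and MPE):

        import numpy as np

        def forecast_errors(actual, forecast):
            """MAD, MSE, MAPE and MPE for a set of forecasts."""
            actual = np.asarray(actual, dtype=float)
            e = actual - np.asarray(forecast, dtype=float)   # residuals e_t = Y_t - Yhat_t
            mad = np.mean(np.abs(e))                         # mean absolute deviation
            mse = np.mean(e ** 2)                            # mean squared error (penalizes large errors)
            mape = np.mean(np.abs(e / actual)) * 100         # mean absolute percentage error (%)
            mpe = np.mean(e / actual) * 100                  # mean percentage error (sign reveals bias)
            return mad, mse, mape, mpe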