
Page 1: Basic Time Series

Linear Time Series Analysis
Lecture 1: Some Basic Time Series Concepts

Daniel Buncic

Institute of Mathematics & Statistics
University of St. Gallen

Switzerland

December 12, 2013

Version: [ltsa1-a]

Homepage: www.danielbuncic.com

University of St. Gallen

Page 2: Basic Time Series

Outline/Table of Contents

Introduction
- Overview
- Descriptive Analysis
- Probabilistic Approach
- Examples of Time Series
- Objectives of Time Series Analysis

Basic Concepts
- Some Definitions
- Examples of Time Series
- General Approach to Time Series Modeling
- Stationary Models
- Autocovariance and Autocorrelation
- Some Model Based Examples
- Sample Autocovariance and Autocorrelation
- Estimation and Elimination of Both Trend and Seasonality
- Lag (or Backshift) Operator

Exercises

Daniel Buncic (University of St. Gallen) Lecture 1: Linear Time Series Analysis December 12, 2013 2/65

Page 3: Basic Time Series

Introduction: Overview

Overview

A time series is a set of observations xt, each one being recorded at a specific time t:

- discrete-time time series, when the set T0 of times at which observations are made is a discrete set;

- continuous-time time series, when observations are recorded continuously over some time interval.

Of particular interest are discrete-time time series with observations recorded at fixed time intervals. The time distance between consecutive observations is called the frequency.

Typical frequencies used in practice: daily, monthly, quarterly, . . .


Page 4: Basic Time Series

Introduction: Descriptive Analysis

Descriptive Analysis

Plot a two-dimensional graph of recording times t (X-axis) vs. observations yt, t = 1, . . . , T (Y-axis). Generally, some smoothing techniques are applied:

- rolling means: instead of yt plot

  y∗t = (1/d) ( yt−(d−1)/2 + . . . + yt + . . . + yt+(d−1)/2 ), (1)

  where d is an odd positive integer.

- exponential smoothing: instead of yt plot

  y∗t = a yt + (1 − a) y∗t−1, (2)

  with y∗1 = y1 and 0 < a < 1.
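As a minimal sketch, the two smoothers in (1) and (2) can be implemented in a few lines (Python with numpy assumed; the function names are ours, not from the lecture):

```python
import numpy as np

def rolling_mean(y, d):
    """Centered rolling mean with odd window width d, as in Eq. (1)."""
    assert d % 2 == 1 and d > 0, "d must be an odd positive integer"
    # the result is defined only for the inner times t = (d-1)/2, ..., T-1-(d-1)/2
    return np.convolve(np.asarray(y, dtype=float), np.ones(d) / d, mode="valid")

def exp_smooth(y, a):
    """Exponential smoothing as in Eq. (2), with y*_1 = y_1 and 0 < a < 1."""
    y = np.asarray(y, dtype=float)
    out = np.empty_like(y)
    out[0] = y[0]
    for t in range(1, len(y)):
        out[t] = a * y[t] + (1 - a) * out[t - 1]
    return out
```

Note that the rolling mean shortens the series at both ends, while exponential smoothing is one-sided and keeps the full length.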


Page 5: Basic Time Series

Introduction: Descriptive Analysis

Preliminary goal of this analysis: examine the main features of the graph and check whether there is:

a) a trend component (linear, quadratic, . . .);

b) a seasonal component;

c) a cyclical component;

d) any apparent sharp changes in behavior;

e) any outlying observations.


Page 6: Basic Time Series

Introduction: Probabilistic Approach

The observed time series {yt}t∈Z is a realization or sample of an underlying unknown stochastic process {Yt}t∈Z.

Final goal of the analysis:

- explore the main characteristics of the underlying stochastic process, given the information included in the observed sample.

- generally, the analysis is performed on stationary time series. If the time series is not stationary, some transformations of the data are applied to reach stationarity.

- note that almost all modern economic time series are not stationary, and when transforming the data some important information on the original stochastic process can be lost.


Page 7: Basic Time Series

Introduction: Probabilistic Approach

Possible solution:

- if it exists, take a linear combination of two (or more) stochastic processes that are non-stationary.

- if this linear combination returns a series that is stationary, then the two series have a common stochastic trend or permanent component.

- the series are then said to be cointegrated.

- least squares is valid for fitting, but adjustments to the standard errors need to be made when using statistical tests.


Page 8: Basic Time Series

Examples of Time Series

Figure 1: Monthly sales in kiloliters of red wine by Australian winemakers from January 1980 through October 1991 (142 recorded times, frequency: monthly)

- upward trend and seasonal pattern, with peaks in July and troughs in January.


Page 9: Basic Time Series

Examples of Time Series

Figure 2: Accidental deaths in the US. Monthly accidental deaths in the US from January 1973 through December 1978 (72 observations, frequency: monthly).

- strong seasonal pattern, with the maximum for each year occurring in July and the minimum in February;

- no apparent trend.


Page 10: Basic Time Series

Examples of Time Series

Figure 3: Population in the US. Population of the US measured at ten-year intervals from 1790 to 1990 (21 observations, frequency: 10 years).

- an upward trend is evident; one can fit a quadratic or exponential trend to the data.


Page 11: Basic Time Series

Examples of Time Series

Figure 4: A signal extraction problem. 200 observations from the process Xt = cos(t/10) + Nt, t = 1, . . . , 200, are simulated, where the Nt are independent, Nt ∼ N(0, 0.25). Such a series is often referred to as a signal plus noise model.

- need to determine the unknown signal component −→ smooth the data by expressing Xt as a sum of sine waves of various frequencies;

- eliminate high-frequency components −→ spectral analysis.


Page 12: Basic Time Series

Examples of Time Series

Figure 5: Two macroeconomic factors in the US. US monthly Help Wanted Advertising in Newspapers (HELP) Index and US Industrial Production Index (IP) observations from January 1960 to December 2001 (504 observations, frequency: monthly).

- HELP Index: random fluctuation around a slowly changing level.

- IP: evident upward trend, yearly seasonality component.

Page 13: Basic Time Series

Examples of Time Series

Figure 6: Daily US S&P500 Index values and log-returns for the period from January 1, 2003 to December 30, 2005 (783 observations, frequency: daily).

- an upward trend in the Index values is evident;

- the log-returns show that one can eliminate the trend in the Index by differencing;

- heteroskedasticity in the return series −→ need a model for volatility.


Page 14: Basic Time Series

Examples of Time Series

Figure 7: Daily Swiss Credit Suisse share values and log-returns for the period from January 1, 2003 to December 30, 2005 (783 observations, frequency: daily).

- as above for the index.


Page 15: Basic Time Series

Objectives of Time Series Analysis

Objectives of Time Series Analysis

- the final goal of time series analysis is to introduce some techniques for drawing inferences from an observed time series yt, t = 1, . . . , T.

- for this purpose we need to set up a hypothetical probability model to represent the data −→ choose an appropriate family of models determined by some parameters.

- then: fit the model to the data, estimate the parameters, check the goodness of fit to the data, use the fitted model to enhance understanding of the stochastic mechanism generating the series, and use the model for prediction and other applications of interest.

- for the interpretation of the results (from both a statistical and an economic point of view), it is important to recognize and eliminate the presence of "disturbing" quantities like seasonal and/or other noisy components.


Page 16: Basic Time Series

Basic Concepts: Some Definitions

Definitions

- a time series (or stochastic process) in discrete time is a sequence of real-valued random variables: {Xt : t ∈ Z}.

- a time series model for the observed data {xt} is a specification of the joint distributions of the time series {Xt : t ∈ Z} for which {xt} is postulated to be a realization.

Remark

The definition above naturally extends to a multivariate vector of random variables.


Page 17: Basic Time Series

Basic Concepts: Some Definitions

- the laws of such a stochastic process are completely determined by the joint distributions of every set of variables (Xt1, Xt2, . . . , Xtk), k = 1, 2, . . .:

  P[Xt1 ≤ xt1, Xt2 ≤ xt2, . . . , Xtk ≤ xtk], (3)

  where −∞ < xt1, xt2, . . . , xtk < ∞, k = 1, 2, . . .

- a stochastic process is called a process of second order if

  E[Xt²] < ∞, for all t. (4)

  In this case, the laws are (at least partially) characterized only by the first two moments (what are moments?).


Page 18: Basic Time Series

Basic Concepts: Examples of Time Series

Examples of Time Series

Some Zero-Mean Models:

- iid noise: no trend or seasonal component, observations independent and identically distributed with zero mean:

  P[Xt1 ≤ xt1, Xt2 ≤ xt2, . . . , Xtk ≤ xtk] = F(xt1) · . . . · F(xtk),

  where F(·) is the cumulative distribution function of X1, X2, . . .

- A binary process: consider iid random variables with

  P[Xt = 1] = p, P[Xt = −1] = 1 − p, with p = 1/2.

- Random Walk: (starting at zero) {St, t = 0, 1, 2, . . .} with

  S0 = 0; St = X1 + X2 + . . . + Xt, for all t = 1, 2, . . . ,

  where {Xt} is iid noise.


Page 19: Basic Time Series

Basic Concepts: Examples of Time Series

If, in addition, {Xt} is the binary process above, {St, t = 0, 1, 2, . . .} is called a simple symmetric random walk.
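A minimal simulation sketch of such a walk (Python with numpy assumed; `random_walk` is our own helper name):

```python
import numpy as np

def random_walk(steps):
    """S_0 = 0, S_t = X_1 + ... + X_t for an iid step sequence {X_t}."""
    return np.concatenate(([0.0], np.cumsum(steps)))

# simple symmetric random walk: binary steps +1/-1, each with probability 1/2
rng = np.random.default_rng(0)
steps = rng.choice([-1.0, 1.0], size=100)
s = random_walk(steps)
```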

Models with trend and seasonality:

Most of the time series examples presented above show a trend and/or a seasonal component in the data:

� Australian red wine sales

� accidental deaths in the US

� population in the US

−→ in such cases a zero-mean model is clearly inappropriate. What to do?


Page 20: Basic Time Series

Basic Concepts: Examples of Time Series

Population in the US:

The earlier graph suggests trying the following model (no evident seasonal component)

Xt = mt + Yt,

where mt is a slowly changing function called the trend component and Yt has zero mean.

What type of parametric representation do we choose for mt?

mt = a0 + a1 t + a2 t², (5)

where a0, a1, a2 are constants. Fitting by least squares (in millions), we get a0 = 6.96, a1 = −2.16 and a2 = 0.651.
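The quadratic least-squares fit can be sketched as follows (Python with numpy assumed; the data below are synthetic, made up for illustration, not the actual US population series):

```python
import numpy as np

# Synthetic series with a known quadratic trend m_t = a0 + a1*t + a2*t^2
t = np.arange(21, dtype=float)           # e.g. 21 decennial observations
x = 7.0 - 2.0 * t + 0.65 * t ** 2        # made-up data, exact quadratic

# np.polyfit returns coefficients from the highest degree down
a2_hat, a1_hat, a0_hat = np.polyfit(t, x, deg=2)
```

On real data the fit is of course not exact; the residuals x − m̂t then play the role of Yt.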


Page 21: Basic Time Series

Basic Concepts: Examples of Time Series

Accidental deaths in the US:

The graph suggests the presence of a strong seasonal pattern due to seasonally varying factors (no evident trend component). This effect can be modeled by a periodic component with fixed known period:

Xt = st + Yt,

where st is a periodic function of t with period d and Yt has zero mean. Convenient choice for st: a sum of harmonics (or sine waves)

st = a0 + Σ_{j=1}^{k} ( aj cos(λj t) + bj sin(λj t) ),

where a0, . . . , ak, b1, . . . , bk are unknown parameters.


Page 22: Basic Time Series

Basic Concepts: Examples of Time Series

The parameters λ1, . . . , λk are fixed frequencies, each being some integer multiple of

2π/d. (6)

We should thus choose k = 2, which gives periods of twelve and six months.

Australian red wine sales:

From the earlier graph: both a trend and a seasonal pattern are visible.
−→ build up a model with both trend and periodic components, of the form

Xt = mt + st + Yt, (7)

where mt, st and Yt are as defined before.


Page 23: Basic Time Series

Basic Concepts: General Approach to Time Series Modeling

General Approach to Time Series Modeling

From the examples introduced above, we can derive a general strategy for time series modeling.

1) Plot the series and examine the main features of the graph (trend and seasonal patterns, outlying observations, . . .).

2) Remove the trend and seasonal components to get stationary residuals. When needed, apply a preliminary transformation of the data.

3) Choose a model to fit the "residual series", based on various sample statistics.

4) Use the fitted model to reach the final goals of the analysis.
Example: if the final goal is forecasting, use the model to forecast the residuals, then invert the transformations used in the first two steps to get forecasts of the original series.


Page 24: Basic Time Series

Basic Concepts: Introduction to Stationary Models

Introduction to Stationary Models

Idea: a time series {Xt, t ∈ Z} is said to be stationary if it has statistical properties similar to those of the time-shifted series {Xt+h, t ∈ Z} for every integer h.

Definition: The stochastic process {Xt, t ∈ Z} is strictly stationary if the joint distribution of every subset (Xt1, . . . , Xtk), k = 1, 2, . . ., equals that of (Xt1+h, . . . , Xtk+h), k = 1, 2, . . ., for every integer h.

The problem with this definition is that it is not operational.

Definition: The stochastic process {Xt, t ∈ Z} is weakly (or second order) stationary if

1) µX(t) = E[Xt] = µ is independent of t;

2) γX(t + h, t) = Cov(Xt+h, Xt) = E[(Xt+h − µX(t + h))(Xt − µX(t))] = γ(h) is independent of t for each integer h.


Page 25: Basic Time Series

Basic Concepts: Autocovariance and Autocorrelation functions

Autocovariance and Autocorrelation functions

For a stationary time series {Xt, t ∈ Z} let us write briefly

γX(h) = γX(h, 0) = γX(t + h, t). (8)

Note that in this case γX(0) = Var(Xt) is independent of t, and the process {Xt, t ∈ Z} is homoskedastic.

Let us now define the autocovariance and autocorrelation functions. Note that γX(h) = γX(−h) (symmetry).

Definition: Let {Xt, t ∈ Z} be a stationary time series. The autocovariance function (ACVF) of {Xt} at lag h is

γX(h) = Cov(Xt+h, Xt). (9)

The autocorrelation function (ACF) of {Xt} at lag h is

ρX(h) = Cov(Xt+h, Xt) / √( Var(Xt+h) Var(Xt) ) = γX(h) / γX(0). (10)


Page 26: Basic Time Series

Basic Concepts: Examples

Examples

iid noise. If {Xt} is iid noise and E[Xt²] = σ² < ∞, then

γX(t + h, t) = { σ², if h = 0,
                 0,  if h ≠ 0, (11)

which does not depend on t.
−→ iid noise with finite second moment is stationary.
Notation: {Xt} ∼ IID(0, σ²).

White noise. {Xt} is a sequence of uncorrelated random variables, with zero mean and variance σ².
−→ {Xt} is stationary with the same covariance function as iid noise.
Notation: {Xt} ∼ WN(0, σ²).


Page 27: Basic Time Series

Basic Concepts: Examples

Note that every IID(0, σ²) sequence is WN(0, σ²) but not conversely.

The random walk. If {St} is a random walk with {Xt} ∼ IID(0, σ²), then E[St] = 0 and E[St²] = tσ² < ∞ for all t, and, for h ≥ 0, γS(t + h, t) = tσ².
−→ Since γS(t + h, t) depends on t, the series {St} is not stationary.

First-order moving average or MA(1) process. Consider the series defined by the equation

Xt = µ + Zt + θZt−1, t ∈ Z, (12)

where {Zt} ∼ WN(0, σ²) and θ, with |θ| < 1, is a real-valued parameter. Then:

- E[Xt] = µ, independent of t;

- E[Xt²] = µ² + σ²(1 + θ²) < ∞;


Page 28: Basic Time Series

Basic Concepts: Examples

and

γX(t + h, t) = { σ²(1 + θ²), if h = 0,
                 σ²θ,        if h = ±1,
                 0,          if |h| > 1, (13)

which does not depend on t.

−→ {Xt} is stationary.
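The ACVF in (13) is easy to encode directly (Python assumed; `ma1_acvf` is our own helper name, not from the lecture):

```python
def ma1_acvf(h, theta, sigma2):
    """Theoretical ACVF of an MA(1) process, following Eq. (13)."""
    h = abs(h)                          # gamma is symmetric in h
    if h == 0:
        return sigma2 * (1 + theta ** 2)
    if h == 1:
        return sigma2 * theta
    return 0.0                          # the ACVF cuts off beyond lag 1
```

The cut-off after lag 1 is the distinguishing feature of the MA(1) ACVF.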

First-order autoregression or AR(1) process. Let us assume that {Xt} is a stationary series satisfying the equations

Xt = c + φXt−1 + Zt, t ∈ Z, (14)


Page 29: Basic Time Series

Basic Concepts: Examples

where {Zt} ∼ WN(0, σ²), |φ| < 1, c and φ are two real-valued parameters, and Zt is uncorrelated with Xs for each s < t.

- E[Xt] = c/(1 − φ) = µ, constant;

- E[Xt²] = µ² + σ²/(1 − φ²) < ∞;

- γX(h) = γX(−h) = Cov(Xt, Xt−h) = φ^|h| γX(0), with γX(0) = σ²/(1 − φ²);

- ρX(h) = γX(h)/γX(0) = φ^|h|, h ∈ Z.

In practice, we have to start by looking at the "observed data" {xi}, i = 1, . . . , n, and then find a link between the observed series and a good "approximating model".


Page 30: Basic Time Series

Basic Concepts: Sample Autocovariance and Autocorrelation functions

We can compute the sample autocorrelation function (sample ACF) to assess the degree of dependence in the data, and then select a model for the data that reflects this.

Definitions: Let {xi}, i = 1, . . . , n, be observations of a time series. The sample mean of x1, . . . , xn is computed as

x̄ = (1/n) Σ_{t=1}^{n} xt. (15)

The sample autocovariance function at lag h is

γ̂(h) = (1/n) Σ_{t=1}^{n−|h|} (xt+|h| − x̄)(xt − x̄), −n < h < n. (16)

The sample autocorrelation function at lag h is

ρ̂(h) = γ̂(h) / γ̂(0), −n < h < n. (17)


Page 31: Basic Time Series

Basic Concepts: Sample Autocovariance and Autocorrelation functions. Examples: iid noise

Figure 8: 200 simulated values for an IID N(0, 1) noise.

Since ρ(h) = 0 for h > 0 in the model, the sample autocorrelations should be near 0. Asymptotic theory: ρ̂(h), h > 0, is approximately IID N(0, 1/n) for n large.

Approximately 95% of the sample autocorrelations should fall between [−1.96/√n; +1.96/√n].


Page 32: Basic Time Series

Basic Concepts: Sample Autocovariance and Autocorrelation functions. Examples: a nonstationary example

Figure 9: Australian red wine sales data series

The sample autocorrelation function can be used as an indicator of non-stationarity:

- data with trend: |ρ̂(h)| exhibits slow decay as h increases;

- data with seasonal component: |ρ̂(h)| exhibits similar behavior with the same periodicity.


Page 33: Basic Time Series

Basic Concepts: Sample Autocovariance and Autocorrelation functions. Examples: test of model residuals

Figure 10: ACF of Population in the US and ACF of residuals from a quadratic time trend regression

We have seen that we can fit a model with a quadratic trend to this series.

- how good is such a model (from a preliminary graphical inspection)?

- look at the autocorrelation function of the residuals.


Page 34: Basic Time Series

Basic Concepts: Estimation and Elimination of Both Trend and Seasonality

Estimation and Elimination of Both Trend and Seasonality

First step in the analysis of a time series: plot the data; when needed, transform the data. Final purpose: get a stationary time series.

If there are any apparent discontinuities (sudden changes in level):

⇒ break the series into homogeneous segments.

If there are any outlying observations

⇒ check if there is any justification to discard them,

⇒ or try to model them

If any trend or seasonal components are evident:

⇒ represent the data as a Classical Decomposition Model.


Page 35: Basic Time Series

Basic Concepts: Estimation and Elimination of Both Trend and Seasonality

In the classical decomposition, the realizations of the process are modelled as:

Xt = mt + st + Yt,

where mt is a slowly changing function, st is a function with known period d, and Yt is a cyclical component that is stationary, with E[Yt] = 0.

Nonseasonal Model with Trend

Xt = mt + Yt, t = 1, . . . , n, where E[Yt] = 0.

Method 1: Trend estimation

Final goal: find an estimate m̂t for the trend component function mt.


Page 36: Basic Time Series

Basic Concepts: Estimation and Elimination of Both Trend and Seasonality

a) Smoothing with a finite moving average filter:

Obtain {m̂t} from {Xt} by application of a linear operator or linear filter

m̂t = Σ_{j=−∞}^{∞} aj Xt−j, with some weights aj.

For smoothing, consider the filter specified by the weights

aj = (2q + 1)⁻¹, −q ≤ j ≤ q,

q a nonnegative integer. This particular filter is a low-pass filter: it takes the data {Xt} and removes from it the rapidly fluctuating component, to leave the slowly varying estimated trend term {m̂t}.


Page 37: Basic Time Series

Basic Concepts: Estimation and Elimination of Both Trend and Seasonality

Then for q + 1 ≤ t ≤ n − q,

Wt = (1/(2q + 1)) Σ_{j=−q}^{q} Xt−j (18)

   = (1/(2q + 1)) Σ_{j=−q}^{q} mt−j + (1/(2q + 1)) Σ_{j=−q}^{q} Yt−j (19)

   ≈ mt, (20)

and

m̂t = Σ_{j=−q}^{q} (1/(2q + 1)) Xt−j,

assuming that mt is approximately linear over the interval [t − q; t + q] and that the average of the error terms over this interval is close to zero.


Page 38: Basic Time Series

Basic Concepts: Estimation and Elimination of Both Trend and Seasonality

b) Exponential smoothing:

Compute the one-sided moving averages {m̂t} as

m̂t = αXt + (1 − α)m̂t−1, t = 2, . . . , n, with α ∈ [0, 1] fixed.

Take as initial value m̂1 = X1. This method is often referred to as exponential smoothing since, for t ≥ 2,

m̂t = Σ_{j=0}^{t−2} α(1 − α)^j Xt−j + (1 − α)^{t−1} X1. (21)


Page 39: Basic Time Series

Basic Concepts: Estimation and Elimination of Both Trend and Seasonality

c) Smoothing by elimination of high-frequency components:

Using this method, the original series is smoothed by elimination of the high-frequency components of its Fourier series expansion (−→ use spectral theory).

d) Polynomial fitting:

Assumption: the trend component is of a polynomial form (i.e. linear, quadratic, cubic, . . .).
−→ fit a polynomial function to the data {x1, . . . , xn} to get estimates of the coefficients (for example, by least squares).


Page 40: Basic Time Series

Basic Concepts: Example: Strikes in the US

Number of strikes per year in the US. Time period: from 1951 to 1980.


Page 41: Basic Time Series

Basic Concepts: Estimation and Elimination of Both Trend and Seasonality

Method 2: Trend Elimination by Differencing

Final goal: eliminate the trend term by differencing instead of smoothing as in Method 1.

Let us define the lag-1 difference operator ∆ by

∆Xt = Xt − Xt−1 = (1 − L)Xt,

where L is the Lag (or Backshift) operator, LXt = Xt−1. (Note: sometimes B is used for the backshift operator and ∇ for the difference operator.)

Powers of the operator ∆ are defined recursively as

∆^j(Xt) = ∆(∆^{j−1}(Xt)), j ≥ 1, with ∆^0(Xt) = Xt. (22)

Polynomials in ∆ are manipulated in the same way as polynomial functions of real variables (see the Section on the backward shift operator for more details).
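The recursive definition (22) can be sketched directly (Python with numpy assumed; `difference` is our own helper name). The test illustrates the key fact used on the next slide: ∆² reduces a quadratic trend to a constant:

```python
import numpy as np

def difference(x, k=1):
    """Apply the lag-1 difference operator k times: Delta^k x."""
    x = np.asarray(x, dtype=float)
    for _ in range(k):
        x = x[1:] - x[:-1]   # Delta x_t = x_t - x_{t-1}; series shrinks by 1
    return x
```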


Page 42: Basic Time Series

Basic Concepts: Estimation and Elimination of Both Trend and Seasonality

Why difference data?

Any polynomial trend of degree k can be reduced to a constant by application of the operator ∆^k.

−→ possibility of obtaining a plausible realization of a stationary time series {∆^k xt} from the data {xt}.

In practice, the order k of differencing required is often quite small: k = 1 or k = 2.

Differencing is not free! It comes at the cost of:

- higher variance of the differenced process;

- possibly non-invertible MA models (we will see this later);

- differencing therefore needs to be applied with care.


Page 43: Basic Time Series

Basic ConceptsExample: Population in the US.

For population in the US already introduced above we find that differencing twice issufficient to produce a series with no apparent trend, ie.

{xt} −→ {∆2xt} = {xt − 2xt−1 + xt−2}.


Page 44: Basic Time Series

Basic ConceptsExample: S&P500 Stock price index.

For daily US S&P500 Index values the trend evident in the data can be easilyeliminated by differencing once, that is:

{xt} −→ {∆xt} = {xt − xt−1}.


Page 45: Basic Time Series

Basic ConceptsEstimation and Elimination of Both Trend and Seasonality

Classical Decomposition Model:

Xt = mt + st + Yt, t = 1, . . . , n, (23)

where E[Yt] = 0, st+d = st and∑dj=1 sj = 0.

Method 1: Estimation of trend and seasonal components

Final goal: find estimates mt, st for the trend and seasonal functions.

1) Estimate the trend by applying a moving average filter specially chosen toeliminate the seasonal component and to dampen the noise:

m∗t =

{(0.5xt−q + xt−q+1 + . . .+ xt+q−1 + 0.5xt+q)/d , if d = 2q even,(2q + 1)−1∑q

j=−q xt−j , if d = 2q + 1 odd.


Page 46: Basic Time Series

Basic Concepts: Estimation and Elimination of Both Trend and Seasonality

2) Estimate the seasonal component:

- for each k = 1, . . . , d compute the average wk of the deviations {(xk+jd − m̂*k+jd) : q < k + jd ≤ n − q};

- estimate ŝk = wk − d⁻¹ Σ_{i=1}^{d} wi, k = 1, . . . , d. Set ŝk = ŝk−d for k > d.

3) Let

dt = xt − ŝt, t = 1, . . . , n, (24)

be the deseasonalized data.

- Re-estimate the trend from the deseasonalized data {dt} −→ m̂t.
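Steps 1) and 2) can be sketched as follows (Python with numpy assumed; `estimate_seasonal` is our own name, and it indexes seasonal phases 0, . . . , d − 1 rather than k = 1, . . . , d as on the slide):

```python
import numpy as np

def estimate_seasonal(x, d):
    """Seasonal indices from the classical decomposition, steps 1)-2).

    Estimates the trend with the moving-average filter for m*_t, averages
    the deviations x_t - m*_t within each seasonal phase, and centres the
    result so the indices sum to zero.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    if d % 2 == 0:                       # d = 2q: endpoints get weight 0.5/d
        q = d // 2
        w = np.r_[0.5, np.ones(d - 1), 0.5] / d
    else:                                # d = 2q+1: simple (2q+1)^-1 filter
        q = (d - 1) // 2
        w = np.ones(d) / d
    m = np.convolve(x, w, mode="valid")  # m*_t for the inner times t = q..n-q-1
    t = np.arange(q, n - q)
    dev = x[t] - m                       # deviations from the trend estimate
    wbar = np.array([dev[t % d == k].mean() for k in range(d)])
    return wbar - wbar.mean()            # centred so the d indices sum to zero
```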


Page 47: Basic Time Series

Basic ConceptsEstimation and Elimination of Both Trend and Seasonality

The estimated cyclical component yt is then given by

yt = xt − mt − st, t = 1, . . . , n. (25)

Remark: Note that the re-estimation of the trend in step 3. above is done in order tohave a parametric form for the trend that can be used for simulation and prediction.

Method 2: Elimination of trend and seasonal components by differencing

Define the lag-d differencing operator ∆d by

∆dXt = Xt −Xt−d = (1− Ld)Xt. (26)


Page 48: Basic Time Series

Basic Concepts: Estimation and Elimination of Both Trend and Seasonality

Apply the operator ∆d to the classical decomposition model:

∆dXt = mt − mt−d + Yt − Yt−d, (27)

where st has period d.

- st thus drops out because st = st−d.

The model obtained in (27) has a trend and a noise component.

- Eliminate the trend component by differencing as before.


Page 49: Basic Time Series

Basic Concepts: Example: Accidental deaths in the US


Page 50: Basic Time Series

Basic Concepts: Testing the Estimated Residual Sequence

Testing the Estimated Residual Sequence

The objective of the data transformations described above is to produce a sequence of stationary residuals. Next step: find a model for the residuals.

- no dependence in the residual series: the residuals come from an iid process, no further modeling needed;

- significant dependence among the residuals: look for a more complex stationary model.

To determine this: use simple tests for checking the hypothesis that the residuals areobserved values of iid random variables.


Page 51: Basic Time Series

Basic Concepts: Testing the Estimated Residual Sequence

a) Sample autocorrelation function.
For large n, the sample autocorrelation function of an iid sequence Y1, . . . , Yn with finite variance is ≈ N(0, 1/n).
−→ if y1, . . . , yn is a realization of such an iid sequence, about 95% of the sample autocorrelations should fall between the bounds [−1.96/√n; +1.96/√n].

b) Portmanteau type tests.
Let us consider other statistics based on the sample autocorrelations ρ̂(j):

QP = n Σ_{j=1}^{h} ρ̂²(j) (Portmanteau test) (28)

QLB = n(n + 2) Σ_{j=1}^{h} ρ̂²(j)/(n − j) (Ljung-Box test) (29)


Page 52: Basic Time Series

Basic Concepts: Testing the Estimated Residual Sequence

QML = n(n + 2) Σ_{j=1}^{h} ρ̂²WW(j)/(n − j) (McLeod-Li test) (30)

where ρ̂WW(j) are the sample autocorrelations of the squared data.

Under the assumption that the residuals are a finite-variance iid sequence, the three statistics are ≈ χ²(h) (Chi-squared distributed).

The hypothesis of iid data is then rejected at level α if Q(·) > χ²1−α(h).
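The Ljung-Box statistic (29) can be computed directly from the sample ACF (Python with numpy assumed; `ljung_box` is our own helper name, and statistical packages provide equivalent routines):

```python
import numpy as np

def ljung_box(x, h):
    """Ljung-Box statistic Q_LB of Eq. (29), using the sample ACF of x."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xbar = x.mean()
    g0 = np.sum((x - xbar) ** 2) / n           # sample gamma-hat(0)
    q = 0.0
    for j in range(1, h + 1):
        gj = np.sum((x[j:] - xbar) * (x[:n - j] - xbar)) / n
        q += (gj / g0) ** 2 / (n - j)          # rho-hat(j)^2 / (n - j)
    return n * (n + 2) * q
```

Replacing the weight n(n + 2)/(n − j) by n gives the plain portmanteau statistic QP of (28); applying the same function to the squared data gives the McLeod-Li variant (30).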

c) The turning point test.
Let y1, . . . , yn be a sequence of observations. We say that there is a turning point at time i, 1 < i < n, if


Page 53: Basic Time Series

Basic Concepts: Testing the Estimated Residual Sequence

(i) yi−1 < yi and yi > yi+1, or

(ii) yi−1 > yi and yi < yi+1.

If NTP is the number of turning points of an iid sequence of length n, then

µNTP = E[NTP] = 2(n − 2)/3;

σ²NTP = Var(NTP) = (16n − 29)/90.

→ A large value of NTP − µNTP indicates that the series is fluctuating more rapidly than expected for an iid sequence.

→ On the other side, a value of NTP − µNTP much smaller than zero indicates a positive correlation between neighboring observations.


Page 54: Basic Time Series

Basic Concepts: Testing the Estimated Residual Sequence

In fact, for an iid sequence with large n: NTP ≈ N(µNTP, σ²NTP).

Test: reject the iid hypothesis at level α if

|NTP − µNTP| / σNTP > Φ1−α/2, (31)

where Φ1−α/2 denotes the 1 − α/2 quantile of the standard normal distribution.
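A compact sketch of the turning point test statistic (Python assumed; `turning_point_test` is our own name):

```python
def turning_point_test(y):
    """Count turning points and standardize against the iid mean/variance."""
    n = len(y)
    ntp = sum(1 for i in range(1, n - 1)
              if (y[i - 1] < y[i] > y[i + 1]) or (y[i - 1] > y[i] < y[i + 1]))
    mu = 2 * (n - 2) / 3                  # E[N_TP] under iid
    var = (16 * n - 29) / 90              # Var(N_TP) under iid
    z = (ntp - mu) / var ** 0.5           # compare |z| with the normal quantile
    return ntp, mu, var, z
```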

d) The difference-sign test.Let NS be the number of values of i such that yi > yi−1, i = 2, . . . , n. Foran iid sequence we have:

µNS = E[NS] = (n − 1)/2;

σ²NS = V(NS) = (n + 1)/12,

and for large n: NS ≈ N(µNS, σ²NS).


A large positive (or negative) value of NS − µNS indicates the presence of a trend in the data.

Test: reject the hypothesis of no trend in the data if

| NS − µNS | / σNS > φ_{1−α/2}.   (32)

This test must be taken with caution: observations exhibiting a strong cyclical component will pass the difference-sign test!
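A short sketch of the difference-sign test (not part of the slides; numpy assumed, function name our own):

```python
import numpy as np

def difference_sign_test(y):
    """Return (N_S, standardized z-statistic) for the difference-sign test."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    n_s = np.sum(np.diff(y) > 0)         # number of i with y_i > y_{i-1}
    mu = (n - 1) / 2                     # E[N_S] under iid
    sigma = np.sqrt((n + 1) / 12)        # sqrt of V(N_S) under iid
    return n_s, (n_s - mu) / sigma

# a strongly trending series is flagged (|z| > 1.96 at the 5% level)
n_s, z = difference_sign_test(np.arange(50.0))
```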

e) Fitting autoregressive moving average type models.

To be discussed later.


f) Checking for normality.

Draw a Gaussian qq-plot to verify whether the data may be assumed to come from a Gaussian iid sequence.

g) The rank test.

This test is particularly useful for detecting a linear trend in the data.

Define NP to be the number of pairs (i, j) such that

yj > yi and j > i, i = 1, . . . , n− 1. (33)

Note that there is a total of ½n(n − 1) pairs such that j > i.

If {Y1, . . . , Yn} is an iid sequence:

µNP = E[NP ] = n(n− 1)/4 (34)


σ²NP = V(NP) = n(n − 1)(2n + 5)/72,

and for large n: NP ≈ N(µNP, σ²NP).

A large positive (negative) value of NP − µNP indicates the presence of anincreasing (decreasing) trend in the data.

Test: reject the hypothesis that {yt} is a sample from an iid sequence if

| NP − µNP | / σNP > φ_{1−α/2}.   (35)
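The rank test counts concordant pairs directly; a minimal sketch (not from the slides; numpy assumed, function name our own):

```python
import numpy as np

def rank_test(y):
    """Return (N_P, standardized z-statistic) for the rank test."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # N_P = number of pairs (i, j) with j > i and y_j > y_i
    n_p = sum(np.sum(y[i + 1:] > y[i]) for i in range(n - 1))
    mu = n * (n - 1) / 4                           # E[N_P] under iid
    sigma = np.sqrt(n * (n - 1) * (2 * n + 5) / 72)
    return n_p, (n_p - mu) / sigma

# an increasing trend makes every one of the n(n-1)/2 pairs count
n_p, z = rank_test(np.arange(30.0))
```

The pairwise loop is O(n²); for the series lengths used in these lectures that is perfectly adequate.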

Basic Concepts: Example: Accidental deaths in the US, cont.

[Two figure slides; the plots for this example are not reproduced in this transcript.]

Basic Concepts: Lag (or Backshift) Operator

The Lag (or Backshift) Operator

Define as before the Lag (or Backshift) operator as: LXt = Xt−1.

Properties:

1) linearity: L(Xt + Yt) = Xt−1 + Yt−1 and L(λXt) = λXt−1, λ a constant;

2) powers: LᵏXt = Xt−k, k = 1, 2, . . .;

3) inverse: L−1Xt−1 = Xt.

∆Xt = Xt −Xt−1 = (1− L)Xt (36)

Xt = ρXt−1 + Yt ⇔ (1− ρL)Xt = Yt (37)

Lc = c, where c is a constant. (38)
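Equations (36)-(37) amount to simple array shifts; a minimal illustration (not part of the slides; numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.cumsum(rng.standard_normal(100))   # a random walk X_t = X_{t-1} + Z_t

# (1 - L)X_t = X_t - X_{t-1}: the first difference, eq. (36)
dx = x[1:] - x[:-1]                       # identical to np.diff(x)

# (1 - rho L)X_t = Y_t, eq. (37), applied as a filter with rho = 0.5
rho = 0.5
y = x[1:] - rho * x[:-1]
```

Each application of (1 − L) shortens the series by one observation, which is why differencing costs data points at the start of the sample.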


Polynomials in the Lag (or Backshift) Operator

Let us consider the expression

a0Xt + a1Xt−1 + a2Xt−2 + . . .+ anXt−n,

where ai are constant coefficients.

We can rewrite it using the properties of the backward shift operator L as

(a0 + a1L + a2L² + . . . + anLⁿ)Xt = a(L)Xt,

a polynomial of degree n (that can also be equal to ∞) in L.

All the classical operations used for polynomials can be applied to a(L):


• evaluation at L = 1:

a(1) = a0 + a1 + . . . + an = ∑_{i=0}^n ai.   (39)

• first derivative:

d/dL a(L) = a′(L) = a1 + 2a2L + 3a3L² + . . . + nanLⁿ⁻¹.   (40)

Then

a′(1) = a1 + 2a2 + . . . + nan = ∑_{i=1}^n i ai.   (41)

A polynomial a(L) is invertible if all the solutions of the characteristic equation

a0 + a1z + . . .+ anzn = 0 (42)

lie outside the unit circle.
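The quantities a(1), a′(1) and the invertibility check are easy to compute numerically; a sketch (not from the slides; numpy assumed, example polynomial our own):

```python
import numpy as np

# coefficients a_0, ..., a_n of a(L) = 1 - 0.5 L, lowest power first
a = np.array([1.0, -0.5])

a_of_1 = np.polyval(a[::-1], 1.0)          # a(1) = sum of coefficients, eq. (39)
a_prime_1 = np.sum(np.arange(len(a)) * a)  # a'(1) = sum of i * a_i, eq. (41)

# invertibility: all roots of a_0 + a_1 z + ... + a_n z^n must lie
# outside the unit circle (np.roots expects highest power first)
roots = np.roots(a[::-1])
invertible = np.all(np.abs(roots) > 1)
```

Here the single root is z = 2, so this polynomial is invertible; a root on or inside the unit circle would fail the check.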


Exercises

1) Let {Zt} be a sequence of independent normal random variables, each with mean 0 and variance σ², and let a, b and c be constants. Which, if any, of the following processes are stationary? For each stationary process specify the mean and autocovariance function.

a) Xt = a+ bZt + cZt−2;

b) Xt = Zt cos(ct) + Zt−1 sin(ct);

c) Xt = a+ bZ0;

d) Xt = ZtZt−1.

2) Let {Xt} be a moving average process of order 2 given by

Xt = Zt + θZt−2, where Zt ∼ WN(0, 1). (43)

a) Find the autocovariance and autocorrelation functions for this process when θ = 0.8.

b) Compute the variance of the sample mean X̄₄ = (1/4) ∑_{j=1}^4 Xj.

c) Repeat b) when θ = −0.8 and compare your answer with the result obtained in b).
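Answers to parts b) and c) can be checked by simulation without revealing the algebra; a Monte Carlo sketch (not part of the slides; numpy assumed, function name our own):

```python
import numpy as np

def var_sample_mean_ma(theta, n_obs=4, reps=200_000, seed=3):
    """Monte Carlo estimate of Var(sample mean of n_obs observations)
    for X_t = Z_t + theta * Z_{t-2} with Z_t ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((reps, n_obs + 2))   # two extra lags of Z
    x = z[:, 2:] + theta * z[:, :-2]             # X_t = Z_t + theta Z_{t-2}
    return x.mean(axis=1).var()

v_pos = var_sample_mean_ma(0.8)
v_neg = var_sample_mean_ma(-0.8)
# positive theta induces positive lag-2 correlation, which inflates
# the variance of the sample mean relative to the negative-theta case
```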


3) Let {Xt} be the AR(1) process defined as:

Xt = φXt−1 + Zt, Zt ∼ WN(0, σ²).   (44)

a) Compute the variance of the sample mean X̄₄ when φ = 0.9 and σ² = 1.

b) Repeat a) when φ = −0.9 and compare your answer with the result in a).

4) Consider the simple moving average filter with weights

aj = (2q + 1)⁻¹, −q ≤ j ≤ q.   (45)

a) If mt = c0 + c1t, show that ∑_{j=−q}^q aj mt−j = mt.

b) If Zt, t ∈ Z, are independent random variables with mean 0 and variance σ², show that the moving average

At = ∑_{j=−q}^q aj Zt−j

is “small” for large q in the sense that E[At] = 0 and V(At) = σ²/(2q + 1).
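Both claims in exercise 4 can be seen numerically; a sketch (not from the slides; numpy assumed):

```python
import numpy as np

q = 10
a = np.full(2 * q + 1, 1.0 / (2 * q + 1))       # weights a_j = 1/(2q+1), eq. (45)

# part a): the filter passes a linear trend m_t = c0 + c1 t through unchanged
t = np.arange(100, dtype=float)
m = 2.0 + 0.5 * t
m_smoothed = np.convolve(m, a, mode="valid")    # sum_j a_j m_{t-j} at interior points

# part b): the same filter shrinks iid noise, V(A_t) = sigma^2 / (2q+1)
rng = np.random.default_rng(5)
z_smoothed = np.convolve(rng.standard_normal(200_000), a, mode="valid")
```

With mode="valid" the output covers only the interior points t = q, ..., n − q − 1, where the two-sided filter is fully defined.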


5) Let {Yt} be a stationary process with mean zero and let a and b be constants. If

Xt = a+ bt+ st + Yt, (46)

where st is a seasonal component with period 12, show that

∆∆₁₂Xt = (1 − L)(1 − L¹²)Xt   (47)

is stationary and express its autocovariance function in terms of that of {Yt}.
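The mechanics of exercise 5 can be illustrated numerically: after (1 − L¹²) the seasonal component and intercept drop out, and the remaining (1 − L) removes the constant left by the trend. A sketch (not part of the slides; numpy assumed, the simulated components are our own choices):

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.arange(240)
s = np.tile(rng.standard_normal(12), 20)        # seasonal pattern with period 12
y = rng.standard_normal(240)                    # stands in for the stationary {Y_t}
x = 1.5 + 0.02 * t + s + y                      # X_t = a + b t + s_t + Y_t, eq. (46)

d12 = x[12:] - x[:-12]                          # (1 - L^12): removes s_t and a
dd = d12[1:] - d12[:-1]                         # (1 - L)(1 - L^12): removes b t
```

The result dd coincides exactly with (1 − L)(1 − L¹²) applied to {Yt} alone, which is the stationarity claim of the exercise.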

6) Let us consider an invertible polynomial a(L) with finite a(1). Show that such a polynomial can be rewritten as

a(L) = a(1) + (1− L)g(L), (48)

where

g0 = a0 − a(1)

g1 = a1 + g0 (49)

g2 = a2 + g1

etc.

Compute g(1).
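The recursion (49) is straightforward to verify numerically by multiplying the coefficients back out; a sketch for a finite example polynomial of our own choosing (not from the slides; numpy assumed):

```python
import numpy as np

a = np.array([2.0, -1.0, 0.5])                 # coefficients a_0, a_1, a_2 of a(L)
a1_val = a.sum()                               # a(1)

# build g_0, g_1, ... via g_0 = a_0 - a(1), g_k = a_k + g_{k-1}, eq. (49)
g = np.empty(len(a) - 1)
g[0] = a[0] - a1_val
for k in range(1, len(g)):
    g[k] = a[k] + g[k - 1]

# verify a(L) = a(1) + (1 - L) g(L) by expanding the right-hand side
one_minus_L = np.array([1.0, -1.0])            # coefficients of (1 - L)
rhs = np.convolve(one_minus_L, g)              # coefficients of (1 - L) g(L)
rhs[0] += a1_val
```

Differentiating a(L) = a(1) + (1 − L)g(L) and evaluating at L = 1 gives g(1) = −a′(1), which the coefficient check above is consistent with.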
