![Page 2: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/2.jpg)
Time series analysis
● The basic idea of time series analysis is simple: given an observed sequence, how can we build a model that can predict what comes next?
● Obvious applications in finance, business, ecology, agriculture, demography, etc.
![Page 3: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/3.jpg)
What's different about time series?
● In most of the contexts we've seen so far, there's an implicit assumption that observations are independent of each other.
● In other words, the fact that subject 27 is 165cm tall and terrible at basketball says nothing at all about what will happen with subject 28.
![Page 4: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/4.jpg)
What's different about time series?
● In time series data, this is not true.
● We're hoping for exactly the opposite: that what happens at time t contains information about what will happen at time t+1.
● Observations are treated as both outcome and then predictor variables as we move forward in time.
![Page 5: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/5.jpg)
Ways of dealing with time series
● Despite (or perhaps because of) the practical uses of time series, there is no single universal technique for handling them.
● Lots of different ways to proceed depending on the implicit theory of data generation we're proposing.
● Easiest to illustrate with examples...
![Page 6: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/6.jpg)
Example 1: Lake Huron data
● Our first example data set is a series of annual measurements of the level of Lake Huron, in feet, from 1875 to 1972.
● It's a built-in data set in R, so we only need data(LakeHuron) to access it.
● R already "knows" that this is a time series.
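As a minimal sketch of what "knows" means here, the built-in object can be asked for its own start, end, and sampling frequency. Only base R is assumed; the plot styling is an arbitrary choice.

```r
# LakeHuron ships with base R (datasets package) and is already a "ts" object.
data(LakeHuron)

class(LakeHuron)      # the time series class
start(LakeHuron)      # first time point (year, period within year)
end(LakeHuron)        # last time point
frequency(LakeHuron)  # observations per year

# Because it is a ts object, plot() draws a line against calendar year.
plot(LakeHuron, lwd = 2, col = "blue", ylab = "Level (feet)")
```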
![Page 7: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/7.jpg)
Example 1: Lake Huron data
![Page 8: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/8.jpg)
Ex. 2: Australian beer production
● Our second example is data on monthly Australian beer production, in millions of litres.
● The time series runs from January 1956 to August 1995.
● The data is available in beer.csv.
![Page 9: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/9.jpg)
Ex. 2: Australian beer production
● R doesn't yet know that this is a time series: the data comes in as a list of numbers.
● We use the ts function to specify that something should be interpreted as a time series, optionally specifying the seasonal period.
● beer = ts(beer[,1],start=1956,freq=12)
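The conversion step can be tried without beer.csv. The sketch below uses made-up placeholder numbers (an assumption, not the real beer data) purely to show how ts attaches a calendar to a plain numeric vector; the name `monthly` is hypothetical.

```r
# Placeholder values standing in for the first column of beer.csv.
# The numbers themselves are arbitrary; only the length matters here.
set.seed(1)
raw <- rnorm(476, mean = 140, sd = 20)   # 476 months = Jan 1956 .. Aug 1995

# start = c(1956, 1) means January 1956; frequency = 12 marks the data as monthly.
monthly <- ts(raw, start = c(1956, 1), frequency = 12)

is.ts(monthly)
start(monthly)   # year and month of the first observation
end(monthly)     # year and month of the last observation
```

Writing `start=1956` as in the slide is equivalent to `start = c(1956, 1)`.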
![Page 10: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/10.jpg)
Ex. 2: Australian beer production
![Page 11: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/11.jpg)
Two goals in time series modelling
● We assume there's some structure in the time series data, obscured by random noise.
● Structure = trends + seasonal variation.
● The Lake Huron data has no obvious repetitive structure, but possibly a downward trend. The beer data shows clear seasonality and a trend.
![Page 12: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/12.jpg)
Models of data generation
● The most basic model of data generation is to suppose that there is no structure in the time series at all, and that each observation is an independent random variate.
● An example: white noise.
● In this case, the best we can do is simply predict the mean value of the data set.
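One way to express this "no structure" model in R is as an ARIMA(0,0,0) fit, i.e. a constant mean plus white noise. This is an assumption about how such a null model might be fitted, shown here on the built-in Lake Huron data; the name `nullLake` is hypothetical.

```r
# Mean-plus-white-noise model: no autoregression, no differencing, no smoothing.
nullLake <- arima(LakeHuron, order = c(0, 0, 0))

fc <- predict(nullLake, n.ahead = 10)
fc$pred            # every forecast sits at (approximately) the sample mean
mean(LakeHuron)

# Approximate 95% prediction band around the forecast.
upper <- fc$pred + 1.96 * fc$se
lower <- fc$pred - 1.96 * fc$se
```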
![Page 13: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/13.jpg)
Lake Huron: prediction if observations were independent
![Page 14: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/14.jpg)
Beer production: prediction if observations were independent
![Page 15: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/15.jpg)
Producing these graphs in R
png("BeerMeanPredict.png",width=800,height=400)
plot(beer,xlim=c(1956,2000),lw=2,col="blue")
lines(predict(nullBeer,n.ahead=50)$pred,lw=2,col="red")
lines(predict(nullBeer,n.ahead=50)$pred+1.96*predict(nullBeer,n.ahead=50)$se,lw=2,lty="dotted",col="red")
lines(predict(nullBeer,n.ahead=50)$pred-1.96*predict(nullBeer,n.ahead=50)$se,lw=2,lty="dotted",col="red")
graphics.off()
![Page 16: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/16.jpg)
Simple approach to trends
● We could ignore the seasonal variation and the random noise and simply fit a linear or polynomial model to the data.
● For example:
tb = seq(1956,1995.8,length=length(beer))
tb2 = tb^2
polyBeer = lm(beer ~ tb + tb2)
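The same idea can be sketched on the built-in Lake Huron series. Centring the time index is a choice made here (not taken from the slides) to keep the linear and quadratic terms from being nearly collinear; `tl`, `tc`, and `polyLake` are hypothetical names.

```r
# Time index taken from the series itself, then centred on its mean.
tl <- as.numeric(time(LakeHuron))
tc <- tl - mean(tl)

# Quadratic regression of lake level on (centred) time.
polyLake <- lm(as.numeric(LakeHuron) ~ tc + I(tc^2))
coef(polyLake)

# Overlay the fitted curve on the data.
plot(LakeHuron, lwd = 2, col = "blue")
lines(tl, fitted(polyLake), lwd = 2, col = "red")
```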
![Page 17: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/17.jpg)
Polynomial fit of lake level on time
![Page 18: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/18.jpg)
Polynomial fit of beer production on time
![Page 19: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/19.jpg)
Regression on time a good idea?
● This is an OK start: it gives us some sense of what the trend line is.
● But we probably don't believe that beer production or lake level is a function of the calendar date.
● More likely these things are a function of their own history, and we need methods that can capture that.
![Page 20: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/20.jpg)
Autoregression
● A better approach is to ask whether the next value in the time series can be predicted as some function of its previous values.
● This is called autoregression.
● We want to build a regression model of the current value fitted on one or more previous values (lagged values). But how many?
![Page 21: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/21.jpg)
Autocorrelation and partial autocorrelation
● We can look directly at the time series and ask how much information there is in previous values that helps predict the current value.
● The acf function looks at the correlation between now and various points in the past.
● Partial autocorrelation (pacf) does the same, but "partials out" the other effects to get the unique contribution of each time-lag.
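A short sketch of both functions on the Lake Huron data, using base R only. The lag limit of 20 is an arbitrary choice.

```r
# Compute without plotting so the numbers can be inspected directly.
a <- acf(LakeHuron, lag.max = 20, plot = FALSE)    # values start at lag 0
p <- pacf(LakeHuron, lag.max = 20, plot = FALSE)   # values start at lag 1

a$acf[1:4]   # lag 0 is always 1
p$acf[1:3]   # at lag 1 the partial and ordinary autocorrelations coincide

# Side-by-side plots.
op <- par(mfrow = c(1, 2))
plot(a, main = "ACF")
plot(p, main = "PACF")
par(op)
```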
![Page 22: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/22.jpg)
ACF & PACF, Lake Huron data
![Page 23: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/23.jpg)
ACF & PACF, beer data
![Page 24: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/24.jpg)
ACF & PACF plots
● ACF shows a correlation that fades as we take longer lagged values in the Lake Huron time series.
● ACF shows periodic structure in the beer time series reflecting its seasonal nature.
![Page 25: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/25.jpg)
ACF & PACF plots
● But if t[0] is correlated with t[-1], and t[-1] is correlated with t[-2], then t[0] will typically be correlated with t[-2] as well, even if t[-2] has no direct influence on it.
● So we need to look at the PACF values.
● We find that only the most recent value is really useful in building an autoregression model for the Lake Huron data, for example.
![Page 26: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/26.jpg)
Autoregression models
● With the ar command we can fit autoregression models and ask R to use AIC to decide how many lagged values should be included in the model.
● For example: arb = ar(beer)
● The Lake Huron model includes only one lagged value; the beer model includes 24.
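A minimal sketch of the same call on the built-in lake data. The order that AIC selects can depend on the estimation method (ar defaults to Yule-Walker), so no particular order is assumed here; `arLake` and `fcLake` are hypothetical names.

```r
# Fit an autoregression; the number of lags is chosen by AIC by default.
arLake <- ar(LakeHuron)
arLake$order   # number of lagged terms selected
arLake$ar      # fitted coefficients, one per lag

# Forecast 10 years ahead, with standard errors.
fcLake <- predict(arLake, n.ahead = 10)
fcLake$pred
fcLake$se
```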
![Page 27: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/27.jpg)
Autoregression model, lake data, 1 lagged term
![Page 28: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/28.jpg)
Autoregression model, beer data, 24 lagged terms
![Page 29: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/29.jpg)
Automatically separating trends, seasonal effects, and noise
● The stl procedure uses locally weighted regression to separate out a trend line, and parcels out the seasonal effect.
● For example:
plot(stl(beer,s.window="periodic"),col="blue",lw=2)
● If things go well, there should be no autocorrelation structure left in the residuals.
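To try the decomposition without beer.csv, the sketch below substitutes the built-in monthly `co2` series (a stand-in, not the beer data) and then checks the remainder for leftover autocorrelation.

```r
# co2 is a built-in monthly time series, used here in place of the beer data.
fit <- stl(co2, s.window = "periodic")
comp <- fit$time.series      # columns: seasonal, trend, remainder
head(comp)

# The three components add back up to the original series.
max(abs(rowSums(comp) - as.numeric(co2)))

# Any clear pattern in this plot suggests structure the decomposition missed.
acf(comp[, "remainder"], main = "ACF of STL remainder")
```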
![Page 30: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/30.jpg)
![Page 31: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/31.jpg)
Exponential smoothing
● A reasonable guess about the next value in a series is that it would be an average of previous values, with the most recent values weighted more strongly.
● This assumption constitutes exponential smoothing:
$$t_0 = \alpha\,t_{-1} + \alpha(1-\alpha)\,t_{-2} + \alpha(1-\alpha)^2\,t_{-3} + \dots$$
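The weighted average can be computed directly, or through the equivalent one-step recursion. Both helper functions below are hypothetical names written for illustration; the recursion is started from 0 so that it matches the finite sum exactly.

```r
# Forecast as an exponentially weighted sum of past values, most recent first.
ewma_direct <- function(x, alpha) {
  k <- seq_along(x) - 1               # 0 for the latest value, 1 for the one before, ...
  sum(alpha * (1 - alpha)^k * rev(x))
}

# Same quantity via the recursion s <- alpha * value + (1 - alpha) * s.
ewma_recursive <- function(x, alpha) {
  s <- 0
  for (v in x) s <- alpha * v + (1 - alpha) * s
  s
}

x <- c(1, 2, 3)
ewma_direct(x, alpha = 0.5)      # 0.5*3 + 0.25*2 + 0.125*1 = 2.125
ewma_recursive(x, alpha = 0.5)   # same value
```

With a finite series the weights do not quite sum to 1; practical implementations handle this through the choice of starting value.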
![Page 32: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/32.jpg)
Holt-Winters procedure
● The logic can be applied to the basic level of the prediction, to the trend term, and to the seasonal term.
● The Holt-Winters procedure automatically does this for all three; for example:
HWB = HoltWinters(beer)
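A runnable sketch of the full procedure, again using the built-in monthly `co2` series as a stand-in for the beer data. The 24-month horizon is an arbitrary choice; `hw` and `fc` are hypothetical names.

```r
# Fit level, trend, and seasonal smoothing parameters in one call.
hw <- HoltWinters(co2)
hw$alpha   # smoothing parameter for the level
hw$beta    # smoothing parameter for the trend
hw$gamma   # smoothing parameter for the seasonal component

# Forecast two years ahead with prediction intervals, then plot.
fc <- predict(hw, n.ahead = 24, prediction.interval = TRUE)
plot(hw, fc)
```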
![Page 33: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/33.jpg)
Holt-Winters analysis on beer data
![Page 34: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/34.jpg)
Holt-Winters analysis on lake data
● The process seems to work well with the seasonal beer data.
● For the lake data, we have not specified a seasonal period, and we might also drop the trend term, thus:
HWLake = HoltWinters(LakeHuron,gamma=FALSE,beta=FALSE)
![Page 35: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/35.jpg)
Holt-Winters analysis on lake data
![Page 36: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/36.jpg)
Holt-Winters analysis on lake data
● The fitted alpha value is close to 1 (i.e., a very short memory) so the prediction is that the process will stay where it was.
● What if we put the trend term back in?
HWLake = HoltWinters(LakeHuron,gamma=FALSE)
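The two lake models can be compared by looking at the shape of their forecasts. This is a sketch with hypothetical object names (`hwLevel`, `hwTrend`); the five-year horizon is arbitrary.

```r
# Level only: simple exponential smoothing.
hwLevel <- HoltWinters(LakeHuron, gamma = FALSE, beta = FALSE)
hwLevel$alpha
pLevel <- predict(hwLevel, n.ahead = 5)
pLevel                 # flat: every forecast equals the last smoothed level

# Level plus trend: forecasts continue along a straight line.
hwTrend <- HoltWinters(LakeHuron, gamma = FALSE)
hwTrend$beta
pTrend <- predict(hwTrend, n.ahead = 5)
pTrend
```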
![Page 37: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/37.jpg)
Holt-Winters analysis on lake data
● Is the trend term overdoing it (beta = 0.17)?
![Page 38: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/38.jpg)
Differencing
● Some time series techniques (e.g., ARIMA) are based on the assumption that the series is stationary, i.e., that it has constant mean, variance, and autocorrelation values over time.
● If we want to use these techniques we may need to work with the differenced values rather than the raw values.
![Page 39: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/39.jpg)
Differencing
● This just means transforming t[1] into t[1] - t[0], etc.
● We can use the diff command to make this easy.
● To plot the beer data as a differenced series:
plot(diff(beer),lw=2,col="green")
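A small check, on the built-in lake data, that diff does exactly the subtraction described above. The names `d`, `x`, and `manual` are hypothetical.

```r
d <- diff(LakeHuron)       # first differences; still a ts object
length(LakeHuron)
length(d)                  # one observation shorter than the original

# The same thing by hand: each value minus the one before it.
x <- as.numeric(LakeHuron)
manual <- x[-1] - x[-length(x)]
all.equal(as.numeric(d), manual)

plot(d, lwd = 2, col = "green")
```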
![Page 40: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/40.jpg)
Differencing
![Page 41: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/41.jpg)
Some housekeeping in R
● To get access to some relevant ARIMA model fitting functions, we need to download the "forecast" package.
install.packages("forecast")
library(forecast)
![Page 42: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/42.jpg)
Auto-regressive integrated moving-average models (ARIMA)
● ARIMA is a method for putting together all of the techniques we've seen so far.
● A non-seasonal ARIMA model is specified with p, d, and q parameters.
● p: no. of autoregression terms.
● d: no. of difference levels.
● q: no. of moving-average (smoothing) terms.
![Page 43: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/43.jpg)
Auto-regressive integrated moving-average models (ARIMA)
● ARIMA(0,0,0) is simply predicting the mean of the overall time series, i.e., no structure.
● ARIMA(0,1,0) works with differences, not raw values, and predicts the next value without any autoregression or smoothing. This is therefore a random walk.
● ARIMA(1,0,0) and ARIMA(24,0,0) are the models we originally fitted to the lake and beer data.
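These special cases can be tried with the arima function in base R, without the forecast package. The sketch below uses the lake data; the object names are hypothetical.

```r
# order = c(p, d, q)
meanOnly <- arima(LakeHuron, order = c(0, 0, 0))   # mean plus noise
walk     <- arima(LakeHuron, order = c(0, 1, 0))   # random walk
ar1      <- arima(LakeHuron, order = c(1, 0, 0))   # one autoregressive term

# A random-walk forecast stays at the last observed value.
walkPred <- predict(walk, n.ahead = 3)$pred
walkPred
LakeHuron[length(LakeHuron)]

coef(ar1)   # the autoregressive coefficient and the fitted mean ("intercept")
```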
![Page 44: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/44.jpg)
Auto-regressive integrated moving-average models (ARIMA)
● We can also have seasonal ARIMA models: three more terms apply to the seasonal effects.
● The "forecast" library includes a very convenient auto.arima function that uses an information criterion (AICc by default) to find the most parsimonious model in the space of possible models.
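A hedged sketch of how auto.arima might be called on the lake data. The wrapper `fit_auto` is a hypothetical helper added here so the script still runs when the forecast package is not installed; the 20-step horizon is arbitrary.

```r
# Returns the selected model, or NULL if the forecast package is unavailable.
fit_auto <- function(x) {
  if (!requireNamespace("forecast", quietly = TRUE)) {
    message("forecast package not installed; skipping auto.arima")
    return(NULL)
  }
  forecast::auto.arima(x)
}

lakeAuto <- fit_auto(LakeHuron)
if (!is.null(lakeAuto)) {
  print(lakeAuto)                              # selected orders and coefficients
  plot(forecast::forecast(lakeAuto, h = 20))   # forecast with prediction intervals
}
```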
![Page 45: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/45.jpg)
ARIMA(1,1,2) model of lake data
![Page 46: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/46.jpg)
ARIMA(2,1,2)(2,0,0)[12] model of beer data
![Page 47: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/47.jpg)
Fourier transforms
● No time to discuss Fourier transforms...
● But they're useful when you suspect there are seasonal or cyclic components in the data, but you don't yet know the period of these components.
● In the beer example, we already knew the seasonal period was 12, of course.
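As a rough illustration of the idea, a periodogram can recover an unknown period. The series below is synthetic (a sine wave with period 12 plus noise, not the beer data), and the noise level and series length are arbitrary assumptions.

```r
# Synthetic monthly series with a known 12-step cycle plus random noise.
set.seed(42)
n <- 240
idx <- seq_len(n)
x <- ts(sin(2 * pi * idx / 12) + rnorm(n, sd = 0.5), frequency = 12)

# Raw periodogram; frequencies are in cycles per unit of time (here, per year).
pg <- spec.pgram(x, taper = 0, detrend = TRUE, plot = FALSE)
peak <- pg$freq[which.max(pg$spec)]
peak
frequency(x) / peak   # implied period, in observations
```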
![Page 48: autocorrelation Time series analysis, COMP6053 lectureusers.ecs.soton.ac.uk/jn2/teaching/timeSeries.pdf · Autocorrelation and partial autocorrelation We can look directly at the](https://reader031.vdocuments.us/reader031/viewer/2022030403/5a7906dc7f8b9a68148e128d/html5/thumbnails/48.jpg)
Additional material
● The beer.csv data set.
● The R script used to do the analyses.
● A general intro to time series analysis in R by Walter Zucchini and Oleg Nenadic.
● An intro to ARIMA models by Robert Nau.
● Another useful intro to time series analysis.