
Post on 03-Dec-2021


Time Series Data Analysis Using R

• Introduction to R

• Getting Started - Using RStudio IDE

– R 3.3.x

– RStudio 1.0.xxx

• Online Data Resources

– quantmod

– Quandl

Time Series Data

• Economic Time Series Data

– GDP, CPI, Oil Price, Climate Change

– Interest Rates, Exchange Rates

• Financial Time Series Data

– S&P 500, VIX (Fear Index)

– International Stock Markets

• High Frequency Time Series

$$\{y_t\} = \{\ldots, y_1, y_2, \ldots, y_n, \ldots\}$$

Time Series Data

• Decomposition

– Additive Components

– Multiplicative Components

• Deterministic vs. Stochastic Trend (and/or Seasonality)

• Transformation

– Stationary vs. Non-stationary Time Series

• Box-Cox or Log -> Stationarity in the Variance

• Difference -> Stationarity in the Mean

$$y_t = m_t + s_t + \varepsilon_t \quad \text{(additive)}$$

$$\log(y_t) = m_t + s_t + \varepsilon_t \quad \text{(multiplicative)}$$

$$y_t \ \text{or}\ \log(y_t) = \alpha + \beta t + \gamma x_t + \cdots + \varepsilon_t$$

Time Series Decomposition

• decompose – function (x, type = c("additive", "multiplicative"), filter = NULL)

• stl – function (x, s.window, s.degree = 0, t.window = NULL, t.degree = 1, l.window = nextodd(period), l.degree = t.degree, s.jump = ceiling(s.window/10), t.jump = ceiling(t.window/10), l.jump = ceiling(l.window/10), robust = FALSE, inner = if (robust) 1 else 2, outer = if (robust) 15 else 0, na.action = na.fail)
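The two decomposition functions can be tried on a built-in series. The sketch below is only an illustration: the choice of `AirPassengers`, the multiplicative form, and `s.window = "periodic"` are assumptions made for the example, not settings taken from the slides.

```r
# Monthly airline passenger counts, a dataset that ships with base R.
y <- AirPassengers

# Classical decomposition; the seasonal swings grow with the level,
# so a multiplicative form is a reasonable first guess.
dec <- decompose(y, type = "multiplicative")

# stl() is additive only, so decompose log(y) instead.
# s.window = "periodic" forces an identical seasonal pattern every year.
fit <- stl(log(y), s.window = "periodic")

# The three STL components add back up to the (logged) series.
parts   <- fit$time.series
recon   <- parts[, "seasonal"] + parts[, "trend"] + parts[, "remainder"]
max_gap <- max(abs(recon - log(y)))

print(round(dec$figure, 3))   # 12 monthly seasonal factors
print(max_gap)                # should be numerically zero
```

Calling `plot(dec)` or `plot(fit)` draws the observed series together with its trend, seasonal, and remainder panels.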

Strategies for Time Series Analysis

• Data Exploration

– Using Graphs

• Hypothesis Testing

– Normality

– Stationarity

– Serial Correlation

• Durbin-Watson

• Box-Pierce / Ljung-Box

• ACF/PACF

• Time Series Smoothing

– Exponential Smoothing

– Structural Time Series

• Model Estimation

– Regression Model

– ARIMA Model

– Dynamic Linear Model
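Several of the tests listed above are available in base R. The following is a minimal sketch, assuming the log-differenced `AirPassengers` series as input and 12 lags for the portmanteau tests; both choices are illustrative. A formal stationarity test is not part of base R; add-on packages such as tseries (`adf.test`) provide one.

```r
# Log-difference removes most of the trend and stabilises the variance.
x <- diff(log(AirPassengers))

# Normality: Shapiro-Wilk test.
norm_test <- shapiro.test(as.numeric(x))

# Serial correlation: Box-Pierce and Ljung-Box tests on the first 12 lags.
bp <- Box.test(x, lag = 12, type = "Box-Pierce")
lb <- Box.test(x, lag = 12, type = "Ljung-Box")

# Durbin-Watson statistic computed by hand for the demeaned series.
e  <- as.numeric(x) - mean(x)
dw <- sum(diff(e)^2) / sum(e^2)

# Sample ACF and PACF without drawing the plots.
acf_vals  <- acf(x,  lag.max = 24, plot = FALSE)
pacf_vals <- pacf(x, lag.max = 24, plot = FALSE)

print(c(shapiro    = norm_test$p.value,
        box_pierce = bp$p.value,
        ljung_box  = lb$p.value))
print(dw)
```

Small p-values from the Box tests indicate remaining serial correlation, which for this series is mostly seasonal.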


Time Series Forecasting

• One-Step Ahead Forecasts

• h-Step Ahead Forecasts

• Forecast Evaluation

– Training (Estimation) vs. Testing (Forecast)

– Cross Validation: Rolling Forecasts

$$\hat{y}_{t+1|t} = E(y_{t+1} \mid I_t), \quad I_t = \{\ldots, y_1, \ldots, y_t\}$$

$$\hat{y}_{n+h|n} = E(y_{n+h} \mid I_n), \quad I_n = \{y_1, \ldots, y_n\}, \quad h = 1, 2, \ldots$$

Time Series Forecasting

• Simple Forecast

– Random Walk

• Naïve Forecast

$$\hat{y}_{t+1|t} = y_t, \quad t = 1, 2, \ldots, n$$

$$\hat{y}_{n+h|n} = y_n, \quad h = 1, 2, \ldots$$

• Naïve Seasonal Forecast

$$\hat{y}_{t+1|t} = y_{t+1-p}, \quad t = p, p+1, \ldots, n$$

$$\hat{y}_{n+h|n} = y_{n+h-pk}, \quad h = 1, 2, \ldots; \quad k = [(h-1)/p] + 1$$

– Random Walk with Drift

$$\hat{y}_{n+h|n} = y_n + \frac{h}{n-1} \sum_{t=2}^{n} (y_t - y_{t-1})$$
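The three simple forecasts can be computed directly from their formulas. This is a hand-rolled sketch in base R, assuming `AirPassengers` as the series and a 24-step horizon; the variable names are chosen for the example only.

```r
y <- AirPassengers
n <- length(y)
p <- frequency(y)     # seasonal period (12 for monthly data)
h <- 1:24             # forecast horizons

# Naive: every horizon repeats the last observation.
naive_fc <- rep(y[n], length(h))

# Seasonal naive: y[n + h - p*k] with k = floor((h - 1) / p) + 1.
k <- floor((h - 1) / p) + 1
snaive_fc <- y[n + h - p * k]

# Random walk with drift: last value plus h times the average change.
drift    <- sum(diff(y)) / (n - 1)   # equals (y[n] - y[1]) / (n - 1)
drift_fc <- y[n] + h * drift

print(head(cbind(h, naive_fc, snaive_fc, drift_fc)))
```

Because the seasonal naive forecast reuses only the last `p` observations, its second year of forecasts repeats the first.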

Time Series Forecasting

• Regression-Based Forecast

– Deterministic Trend and Seasonal Forecasts

$$y_t = \alpha + \beta t + \left[\gamma t^2 + \sum_{s=2}^{p} \delta_s D_{s,t}\right] + \varepsilon_t$$

$$\hat{y}_{t+1|t} = \hat{\alpha} + \hat{\beta}(t+1), \quad t = 1, 2, \ldots, n, \ldots$$

– Stochastic Trend Forecasts

$$y_t = \alpha + \rho y_{t-1} + \varepsilon_t, \quad |\rho| \le 1$$

$$\hat{y}_{t+1|t} = \hat{\alpha} + \hat{\rho} y_t, \quad t = 1, 2, \ldots, n, \ldots$$
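Both regression-based forecasts can be fitted with `lm()`. The sketch below assumes the logged `AirPassengers` series, a linear trend with monthly dummies for the deterministic case, and a plain first-order autoregression for the stochastic case; none of these choices come from the slides.

```r
y <- log(AirPassengers)
n <- length(y)
p <- frequency(y)

# Design: linear time trend plus seasonal dummies (first season is the baseline).
train <- data.frame(
  y      = as.numeric(y),
  t      = seq_len(n),
  season = factor(cycle(y))
)
fit <- lm(y ~ t + season, data = train)

# Forecast the next 12 months by extending t and the seasonal index.
h <- 12
future <- data.frame(
  t      = n + seq_len(h),
  season = factor(((n + seq_len(h) - 1) %% p) + 1, levels = levels(train$season))
)
fc <- predict(fit, newdata = future)

# Stochastic trend: regress y_t on y_{t-1} and forecast one step ahead.
ar_fit <- lm(train$y[-1] ~ train$y[-n])
ar_fc  <- unname(coef(ar_fit)[1] + coef(ar_fit)[2] * train$y[n])

print(round(exp(fc), 1))   # deterministic forecasts on the original scale
print(round(exp(ar_fc), 1))
```

An estimated slope close to 1 in `ar_fit` is informal evidence of a stochastic trend; a formal unit-root test would be needed to confirm it.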

Time Series Forecasting

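• One-Step Ahead Forecast Error

$$\hat{\varepsilon}_{t+1|t} = y_{t+1} - \hat{y}_{t+1|t}, \quad t = p, p+1, \ldots, n \quad (p = \text{seasonal period})$$

• Forecast Error Statistics

$$\mathrm{MAE} = \mathrm{mean}_{t=p,\ldots,n}\left(|\hat{\varepsilon}_{t+1|t}|\right)$$

$$\mathrm{MAPE} = 100 \times \mathrm{mean}_{t=p,\ldots,n}\left(|\hat{\varepsilon}_{t+1|t} / y_{t+1}|\right)$$

$$\mathrm{MSE} = \mathrm{mean}_{t=p,\ldots,n}\left(\hat{\varepsilon}_{t+1|t}^{\,2}\right)$$

$$\mathrm{RMSE} = \sqrt{\mathrm{mean}_{t=p,\ldots,n}\left(\hat{\varepsilon}_{t+1|t}^{\,2}\right)}$$

$$\mathrm{RMSPE} = 100 \times \sqrt{\mathrm{mean}_{t=p,\ldots,n}\left[(\hat{\varepsilon}_{t+1|t} / y_{t+1})^2\right]}$$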

Time Series Forecasting

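• h-Step Ahead Forecast Error (if $y_{n+h}$ is known)

$$\hat{\varepsilon}_{n+h|n} = y_{n+h} - \hat{y}_{n+h|n}, \quad h = 1, 2, \ldots$$

• Forecast Error Statistics

$$\mathrm{MAE} = \mathrm{mean}_{h}\left(|\hat{\varepsilon}_{n+h|n}|\right)$$

$$\mathrm{MAPE} = 100 \times \mathrm{mean}_{h}\left(|\hat{\varepsilon}_{n+h|n} / y_{n+h}|\right)$$

$$\mathrm{MSE} = \mathrm{mean}_{h}\left(\hat{\varepsilon}_{n+h|n}^{\,2}\right)$$

$$\mathrm{RMSE} = \sqrt{\mathrm{mean}_{h}\left(\hat{\varepsilon}_{n+h|n}^{\,2}\right)}$$

$$\mathrm{RMSPE} = 100 \times \sqrt{\mathrm{mean}_{h}\left[(\hat{\varepsilon}_{n+h|n} / y_{n+h})^2\right]}$$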
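The error statistics translate almost line for line into R. In this sketch the helper name `forecast_errors` and the 24-month hold-out split are hypothetical choices made for illustration; the forecasts being scored are h-step ahead seasonal naive forecasts.

```r
# Forecast error statistics for a vector of actuals and forecasts.
forecast_errors <- function(actual, forecast) {
  e <- actual - forecast
  c(
    MAE   = mean(abs(e)),
    MAPE  = 100 * mean(abs(e / actual)),
    MSE   = mean(e^2),
    RMSE  = sqrt(mean(e^2)),
    RMSPE = 100 * sqrt(mean((e / actual)^2))
  )
}

# Hold out the last 24 months of AirPassengers as a test set.
y    <- as.numeric(AirPassengers)
n    <- length(y) - 24
test <- y[(n + 1):(n + 24)]

# h-step ahead seasonal naive forecasts made at t = n (p = 12).
h      <- 1:24
k      <- floor((h - 1) / 12) + 1
snaive <- y[n + h - 12 * k]

print(round(forecast_errors(test, snaive), 2))
```

The percentage measures (MAPE, RMSPE) divide by the actual values, so they are only meaningful for series that stay away from zero.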

Time Series Forecasting

• Using R Package forecast

– rwf

• function (y, h = 10, drift = FALSE, level = c(80, 95), fan = FALSE, lambda = NULL, biasadj = FALSE, x = y)

– accuracy

• function (f, x, test = NULL, d = NULL, D = NULL)

Time Series Smoothing

• Exponential Smoothing

– Simple Exponential Smoothing (EWMA): $b_t = 0, \ s_t = 0$

– Holt Exponential Smoothing: $s_t = 0$

– Holt-Winters Exponential Smoothing

$$y_t = m_t + s_t + \varepsilon_t \quad \text{or} \quad \log(y_t) = m_t + s_t + \varepsilon_t$$

trend: $m_t = a_t + b_t$ (level + slope)

seasonal: $s_t$; random: $\varepsilon_t$

Time Series Smoothing

• Simple Exponential Smoothing (EWMA)

– Forecast Equation

$$\hat{y}_{t+1|t} = a_t, \quad t = 1, 2, \ldots, n$$

$$\hat{y}_{n+h|n} = a_n, \quad h = 1, 2, \ldots$$

– Smoothing Equation

$$a_t = \alpha y_t + (1-\alpha) a_{t-1}, \quad 0 \le \alpha \le 1$$

$$a_t = \alpha y_t + \alpha(1-\alpha) y_{t-1} + \alpha(1-\alpha)^2 y_{t-2} + \cdots$$
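The smoothing recursion is short enough to write out by hand. The sketch below assumes the `Nile` series, a fixed `alpha = 0.3`, and an initial level equal to the first observation; the helper `ses` is a hypothetical name. It then compares the result with base R's `HoltWinters()` with trend and seasonality switched off, which should start from the same initial level.

```r
# Simple exponential smoothing by direct recursion:
#   a[t] = alpha * y[t] + (1 - alpha) * a[t - 1]
ses <- function(y, alpha, a0 = y[1]) {
  a <- numeric(length(y))
  prev <- a0
  for (t in seq_along(y)) {
    a[t] <- alpha * y[t] + (1 - alpha) * prev
    prev <- a[t]
  }
  a
}

y     <- as.numeric(Nile)   # annual Nile flow, ships with base R
n     <- length(y)
alpha <- 0.3                # fixed here; normally estimated

a <- ses(y, alpha)          # with a0 = y[1], a[1] equals y[1]

# One-step ahead forecasts are the previous level: yhat[t+1|t] = a[t].
e   <- y[2:n] - a[1:(n - 1)]
sse <- sum(e^2)

# HoltWinters() without trend or seasonal terms, same alpha.
hw <- HoltWinters(Nile, alpha = alpha, beta = FALSE, gamma = FALSE)

print(c(by_hand = sse, HoltWinters = hw$SSE))
print(a[n])   # the h-step ahead forecast for every h
```

Leaving `alpha = NULL` in `HoltWinters()` lets the function choose the value that minimises the same sum of squared one-step errors.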

Time Series Smoothing

• Simple Exponential Smoothing (EWMA)

– State Space Representation

$$y_t = a_{t-1} + e_t \quad \text{(observation equation)}$$

$$a_t = a_{t-1} + \alpha e_t \quad \text{(state equation)}$$

– Model Estimation

$$\hat{\alpha} = \arg\min_{\alpha} \sum_{t=1}^{n} e_t^2, \quad \text{with initial } a_0$$

$$\text{where } e_t = y_t - a_{t-1}, \quad t = 1, 2, \ldots, n$$

Time Series Smoothing

• Holt Exponential Smoothing

– Forecast Equation

$$\hat{y}_{t+1|t} = a_t + b_t, \quad t = 1, 2, \ldots, n$$

$$\hat{y}_{n+h|n} = a_n + h b_n, \quad h = 1, 2, \ldots$$

– Smoothing Equation

$$a_t = \alpha y_t + (1-\alpha)(a_{t-1} + b_{t-1}), \quad 0 \le \alpha \le 1$$

$$b_t = \beta(a_t - a_{t-1}) + (1-\beta) b_{t-1}, \quad 0 \le \beta \le 1$$

Time Series Smoothing

• Holt Exponential Smoothing

– State Space Representation

$$y_t = a_{t-1} + b_{t-1} + e_t$$

$$a_t = a_{t-1} + b_{t-1} + \alpha e_t$$

$$b_t = b_{t-1} + \alpha\beta e_t$$

– Model Estimation

$$(\hat{\alpha}, \hat{\beta}) = \arg\min_{(\alpha, \beta)} \sum_{t=1}^{n} e_t^2, \quad \text{with initials } (a_0, b_0)$$

$$\text{where } e_t = y_t - (a_{t-1} + b_{t-1})$$

Time Series Smoothing

• Holt-Winters Exponential Smoothing

– Forecast Equation

$$\hat{y}_{t+1|t} = a_t + b_t + s_{t+1-p}, \quad t = 1, 2, \ldots, n$$

$$\hat{y}_{n+h|n} = a_n + h b_n + s_{n-p+1+[(h-1) \bmod p]}, \quad h = 1, 2, \ldots$$

$$\hat{y}_{n+h|n} = a_n + h b_n + s_{n+h-p}, \quad h = 1, 2, \ldots, p$$

– Smoothing Equation

$$a_t = \alpha(y_t - s_{t-p}) + (1-\alpha)(a_{t-1} + b_{t-1}), \quad 0 \le \alpha \le 1$$

$$b_t = \beta(a_t - a_{t-1}) + (1-\beta) b_{t-1}, \quad 0 \le \beta \le 1$$

$$s_t = \gamma(y_t - a_t) + (1-\gamma) s_{t-p}, \quad 0 \le \gamma \le 1$$

Time Series Smoothing

• Holt-Winters Exponential Smoothing

– State Space Representation

$$y_t = a_{t-1} + b_{t-1} + s_{t-p} + e_t$$

$$a_t = a_{t-1} + b_{t-1} + \alpha e_t$$

$$b_t = b_{t-1} + \alpha\beta e_t$$

$$s_t = s_{t-p} + \gamma(1-\alpha) e_t$$

$$(\hat{\alpha}, \hat{\beta}, \hat{\gamma}) = \arg\min_{(\alpha, \beta, \gamma)} \sum_{t=1}^{n} e_t^2, \quad \text{with initials } (a_0, b_0, s_0, \ldots, s_{2-p}, s_{1-p})$$

$$\text{where } e_t = y_t - (a_{t-1} + b_{t-1} + s_{t-p})$$

Time Series Smoothing

• One-Step Ahead Forecast at t = p, p+1, …

$$\hat{y}_{t+1|t} = a_t + b_t + s_{t+1-p}$$

Initialization

$$\hat{y}_{1|0} = a_0 + b_0 + s_{1-p}$$

$$\hat{y}_{2|1} = a_1 + b_1 + s_{2-p}$$

$$\vdots$$

$$\hat{y}_{p|p-1} = a_{p-1} + b_{p-1} + s_0$$

Forecasts

$$\hat{y}_{p+1|p} = a_p + b_p + s_1$$

$$\hat{y}_{p+2|p+1} = a_{p+1} + b_{p+1} + s_2$$

$$\vdots$$

$$\hat{y}_{2p|2p-1} = a_{2p-1} + b_{2p-1} + s_p$$

$$\hat{y}_{2p+1|2p} = a_{2p} + b_{2p} + s_{p+1}$$

$$\vdots$$

$$\hat{y}_{n|n-1} = a_{n-1} + b_{n-1} + s_{n-p}$$

Forecast Error

$$\hat{\varepsilon}_{t+1|t} = y_{t+1} - \hat{y}_{t+1|t}, \quad t = p, p+1, \ldots, n$$

Time Series Smoothing

• h-Step Ahead Forecast at t = n

$$\hat{y}_{n+h|n} = a_n + h b_n + s_{n+h-p}, \quad h = 1, 2, \ldots, p$$

$$\hat{y}_{n+h|n} = a_n + h b_n + s_{n-p+1+[(h-1) \bmod p]}, \quad h > p$$

Note: $[(h-1) \bmod p] = 0, 1, 2, \ldots, p-1$ for $h = 1, 2, \ldots$

• Forecast Error (if $y_{n+h}$ is known)

$$\hat{\varepsilon}_{n+h|n} = y_{n+h} - \hat{y}_{n+h|n}$$

$$\hat{y}_{n+h|n} = a_n + h b_n + s_{n-p+1+[(h-1) \bmod p]}$$

$h = 1$: $\hat{y}_{n+1|n} = a_n + b_n + s_{n+1-p}$

$h = 2$: $\hat{y}_{n+2|n} = a_n + 2 b_n + s_{n+2-p}$

...

$h = p$: $\hat{y}_{n+p|n} = a_n + p b_n + s_n$

$h = p+1$: $\hat{y}_{n+p+1|n} = a_n + (p+1) b_n + s_{n+1-p}$

$h = p+2$: $\hat{y}_{n+p+2|n} = a_n + (p+2) b_n + s_{n+2-p}$

...

$h = 2p$: $\hat{y}_{n+2p|n} = a_n + 2p b_n + s_n$

...

Time Series Smoothing

• HoltWinters – function (x, alpha = NULL, beta = NULL, gamma = NULL, seasonal = c("additive", "multiplicative"), start.periods = 2, l.start = NULL, b.start = NULL, s.start = NULL, optim.start = c(alpha = 0.3, beta = 0.1, gamma = 0.1), optim.control = list())

• predict.HoltWinters – predict(object, n.ahead = 1, prediction.interval = FALSE, level = 0.95, ...)
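A short usage sketch for the two functions above, assuming the logged `AirPassengers` series with additive seasonality and a 12-month horizon (illustrative choices only):

```r
y <- log(AirPassengers)

# Estimate alpha, beta and gamma by minimising the one-step squared errors.
fit <- HoltWinters(y, seasonal = "additive")

# 12-step ahead forecasts with 95% prediction intervals.
fc <- predict(fit, n.ahead = 12, prediction.interval = TRUE, level = 0.95)

print(c(alpha = unname(fit$alpha),
        beta  = unname(fit$beta),
        gamma = unname(fit$gamma)))
print(round(exp(fc), 1))   # columns: fit, upr, lwr (back-transformed)
```

`plot(fit, fc)` overlays the fitted values and the forecast band on the observed series.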

Time Series Smoothing

• Using R Package forecast:

• ets – function (y, model = "ZZZ", damped = NULL, alpha = NULL, beta = NULL, gamma = NULL, phi = NULL, additive.only = FALSE, lambda = NULL, biasadj = FALSE, lower = c(rep(1e-04, 3), 0.8), upper = c(rep(0.9999, 3), 0.98), opt.crit = c("lik", "amse", "mse", "sigma", "mae"), nmse = 3, bounds = c("both", "usual", "admissible"), ic = c("aicc", "aic", "bic"), restrict = TRUE, allow.multiplicative.trend = FALSE, use.initial.values = FALSE, ...)

Time Series Smoothing

• Using R Package forecast:

• forecast – function(object, h=ifelse(frequency(object$x)>1, 2*frequency(object$x),10), level=c(80,95), fan=FALSE, lambda=NULL, biasadj=FALSE,...)

Structural Time Series Model

• Linear Gaussian State Space Model

– Measurement (Observation) Equation

$$y_t = a_t + s_t + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma_\varepsilon^2)$$

– State Equations

$$a_t = a_{t-1} + b_{t-1} + \xi_t, \quad \xi_t \sim N(0, \sigma_\xi^2)$$

$$b_t = b_{t-1} + \zeta_t, \quad \zeta_t \sim N(0, \sigma_\zeta^2)$$

$$s_t = -s_{t-1} - \cdots - s_{t-p+1} + \omega_t, \quad \omega_t \sim N(0, \sigma_\omega^2)$$

Structural Time Series Model

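• Special Cases

– Non-Seasonal (or Local Linear Trend) Model

$$y_t = a_t + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma_\varepsilon^2)$$

$$a_t = a_{t-1} + b_{t-1} + \xi_t, \quad \xi_t \sim N(0, \sigma_\xi^2)$$

$$b_t = b_{t-1} + \zeta_t, \quad \zeta_t \sim N(0, \sigma_\zeta^2)$$

– Local Level Model

$$y_t = a_t + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma_\varepsilon^2)$$

$$a_t = a_{t-1} + \xi_t, \quad \xi_t \sim N(0, \sigma_\xi^2)$$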

Structural Time Series Model

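• Linear Gaussian State Space Model

– Matrix Representation

$$y_t = Z \alpha_t + \varepsilon_t, \quad \varepsilon_t \sim N(0, \sigma_\varepsilon^2)$$

$$\alpha_t = T \alpha_{t-1} + R \eta_t, \quad \eta_t \sim N(0, V)$$

$$\alpha_0 \sim N(C_0, P_0)$$

$$\eta_t = \begin{pmatrix} \xi_t \\ \zeta_t \\ \omega_t \end{pmatrix}, \quad V = \begin{pmatrix} \sigma_\xi^2 & 0 & 0 \\ 0 & \sigma_\zeta^2 & 0 \\ 0 & 0 & \sigma_\omega^2 \end{pmatrix}$$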

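Assuming $p = 4$:

$$\alpha_t = \begin{pmatrix} a_t \\ b_t \\ s_t \\ s_{t-1} \\ s_{t-2} \end{pmatrix}, \quad Z = \begin{pmatrix} 1 & 0 & 1 & 0 & 0 \end{pmatrix}$$

$$T = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & -1 & -1 & -1 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{pmatrix}, \quad R = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$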

Structural Time Series Model

• StructTS – function (x, type = c("level", "trend", "BSM"), init = NULL, fixed = NULL, optim.control = NULL)
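As a rough illustration, `StructTS()` can fit both a local level model and the full basic structural model. The sketch assumes the `Nile` series for the former and a rescaled `log10(AirPassengers)` for the latter; the rescaling is only there to help the optimiser and is not taken from the slides.

```r
# Local level model for the annual Nile flow series.
fit_level <- StructTS(Nile, type = "level")

# Basic structural model (level + slope + seasonal) for a monthly series.
ap      <- log10(AirPassengers) - 2
fit_bsm <- StructTS(ap, type = "BSM")

# Estimated variances of the disturbances.
print(fit_level$coef)   # level, epsilon
print(fit_bsm$coef)     # level, slope, seas, epsilon

# Smoothed state estimates and a 12-step ahead forecast.
states <- tsSmooth(fit_bsm)
fc     <- predict(fit_bsm, n.ahead = 12)
print(round(10^(fc$pred + 2), 1))   # back to passenger counts
```

A variance estimated at or near zero suggests the corresponding component is effectively deterministic for that series.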

Bayesian Structural Time Series Model

• Using Package bsts – function (formula, state.specification, family = c("gaussian", "logit", "poisson", "student"), save.state.contributions = TRUE, save.prediction.errors = TRUE, data, bma.method = c("SSVS", "ODA"), prior = NULL, oda.options = list(fallback.probability = 0, eigenvalue.fudge.factor = 0.01), contrasts = NULL, na.action = na.pass, niter, ping = niter/10, timeout.seconds = Inf, seed = NULL, ...)

Bayesian Structural Time Series Model

• AddLocalLevel

– function (state.specification = NULL, y,

sigma.prior = NULL, # SdPrior

initial.state.prior = NULL, # NormalPrior

sdy,

initial.y)


Bayesian Structural Time Series Model

• AddLocalLinearTrend

– function (state.specification = NULL, y,

level.sigma.prior = NULL, # SdPrior

slope.sigma.prior = NULL, # SdPrior

initial.level.prior = NULL, # NormalPrior

initial.slope.prior = NULL, # NormalPrior

sdy, initial.y)


Bayesian Structural Time Series Model

• AddSeasonal

– function (state.specification, y, nseasons,

season.duration = 1,

sigma.prior = NULL, # SdPrior

initial.state.prior = NULL, # NormalPrior

sdy)
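The state-specification helpers are typically chained before calling `bsts()`. The sketch below assumes the bsts package is installed (it is skipped otherwise), uses the logged `AirPassengers` series, and keeps `niter` small so it runs quickly; the burn-in proportion is likewise an illustrative choice.

```r
if (requireNamespace("bsts", quietly = TRUE)) {
  library(bsts)

  y <- log(AirPassengers)

  # Build the state specification one component at a time.
  ss <- AddLocalLinearTrend(list(), y)
  ss <- AddSeasonal(ss, y, nseasons = 12)

  # Draw 500 MCMC samples; ping = 0 suppresses the progress messages.
  model <- bsts(y, state.specification = ss, niter = 500, ping = 0, seed = 1)

  # Forecast 12 months ahead, discarding an initial burn-in.
  pred <- predict(model, horizon = 12, burn = SuggestBurn(0.1, model))
  print(round(exp(pred$mean), 1))
} else {
  message("Package 'bsts' is not installed; skipping the example.")
}
```

Default priors are used here; `sigma.prior` and `initial.state.prior` can be supplied to each `Add*` call when prior information is available.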


Bayesian Structural Time Series Model

• SdPrior

– function (sigma.guess, sample.size = 0.01,

initial.value = sigma.guess, fixed = FALSE,

upper.limit = Inf)

• This sets a gamma prior on 1/σ².

– Shape (α) = sigma.guess² × sample.size/2

– Scale (β) = sample.size/2

– If an upper limit on σ is specified, the support is truncated.

• NormalPrior

– function (mu, sigma, initial.value = mu, fixed

= FALSE)


BSTS Application

• Counterfactual Inference

– Estimating the causal effect of a designed intervention on a time series: how would the response variable have evolved after the intervention if the intervention had never occurred?

– Randomized experiments vs. non-randomized (observational) approaches.

BSTS Application

• Assumptions on Counterfactual Inference

– There is a set of control time series that were themselves not affected by the intervention.

– The relationship between covariates and treated time series, as established during the pre-intervention period, remains stable throughout the post-intervention period.

BSTS Application

• Bayesian Counterfactual Inference

– A Bayesian structural time series model is used to estimate and predict the counterfactual.

– Priors are part of the model.

– Given a model of the response time series, posterior inference on the counterfactual is performed by computing estimates of the causal effect.

BSTS Application

• CausalImpact – function (data = NULL,

pre.period = NULL,

post.period = NULL,

model.args = NULL,

bsts.model = NULL,

post.period.response = NULL,

alpha = 0.05)
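A minimal sketch of a `CausalImpact()` call on simulated data, assuming the CausalImpact package is installed (the call is skipped otherwise). The series, the effect size of 10, and the period boundaries are all made up for illustration.

```r
# Simulate a control series x1 and a response y that tracks it.
set.seed(1)
x1 <- 100 + arima.sim(model = list(ar = 0.999), n = 100)
y  <- 1.2 * x1 + rnorm(100)

# Inject an artificial intervention effect of +10 from period 71 onward.
y[71:100] <- y[71:100] + 10
data <- cbind(y, x1)

pre.period  <- c(1, 70)
post.period <- c(71, 100)

if (requireNamespace("CausalImpact", quietly = TRUE)) {
  library(CausalImpact)
  impact <- CausalImpact(data, pre.period, post.period)
  print(summary(impact))
} else {
  message("Package 'CausalImpact' is not installed; skipping the model fit.")
}
```

The response must be the first column of `data`; the remaining columns are treated as control series. `plot(impact)` shows the observed series against the estimated counterfactual.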


BSTS Application

• References

– CausalImpact 1.1.0

– Inferring Causal Impact Using Bayesian Structural Time Series Models


Example 1

• GDP and GDP Growth

– GDP Quarterly Time Series

• Trend, Seasonality

– GDP Growth

• Simple vs Compound Growth Rate

• Quarterly Growth vs Annual Growth

– Time Series Decomposition

– Smoothing and Forecasting


Example 2

• China Shanghai Common Stock

– High Frequency Daily Index

– Monthly Index Time Series

• Trend, Seasonality

– Monthly Log-Return and Volatility

– Time Series Decomposition

– Exponential Smoothing and Forecasting


Example 3

• Chinese Yuan vs. U.S. Dollar

– Exchange Rate Time Series

• Trend

• Intervention

– Bayesian Structural Time Series Model

– Policy Evaluation
