Intro to Forecasting in R Part Deux! Houston R Users Group Ed Goodwin, CFA

Upload: egoodwintx

Post on 16-Jul-2015


TRANSCRIPT

Page 1: Intro To Forecasting - Part 2 - HRUG

Intro to Forecasting in R Part Deux!

Houston R Users Group Ed Goodwin, CFA

Page 2: Intro To Forecasting - Part 2 - HRUG

Last time at HRUG…

• we left off discussing linear trend models.

• there was something VERY wrong with this forecast.

• WHAT WAS IT?

Page 3: Intro To Forecasting - Part 2 - HRUG

How accurate was it?

RMSE Training = 38.3

RMSE Test = 76.6
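
For reference, a minimal sketch of how training and test RMSE for a linear trend might be computed in base R. It assumes the built-in AirPassengers series is the data behind these slides, and the 1949-1958 / 1959-1960 split is purely illustrative, so the numbers it prints need not match the 38.3 and 76.6 shown above.

```r
# Assumption: AirPassengers stands in for the slide data; the split is illustrative.
train <- window(AirPassengers, end = c(1958, 12))
test  <- window(AirPassengers, start = c(1959, 1))

# Put each series in a data frame with a numeric time index
train_df <- data.frame(y = as.numeric(train), t = as.numeric(time(train)))
test_df  <- data.frame(y = as.numeric(test),  t = as.numeric(time(test)))

# Straight-line trend fitted on the training period only
fit <- lm(y ~ t, data = train_df)

# Root mean squared error between actual and predicted values
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

rmse_train <- rmse(train_df$y, fitted(fit))
rmse_test  <- rmse(test_df$y, predict(fit, newdata = test_df))
print(c(train = rmse_train, test = rmse_test))
```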

Page 4: Intro To Forecasting - Part 2 - HRUG

Our forecast was really inaccurate!

• the 95% prediction interval is doing a poor job of covering recent values.

• there seems to be a seasonal pattern in the data whose swings grow larger over time.

• we are not accounting for things like the lower cost of travel and population growth, both of which affect the data.

Page 5: Intro To Forecasting - Part 2 - HRUG

The solution?

We need to transform the data!

Page 6: Intro To Forecasting - Part 2 - HRUG

What are transformations?

Transformations replace data with a function of that data.

Page 7: Intro To Forecasting - Part 2 - HRUG

Types of transformations

• convenience transforms - changing scale to make calculations easier (percentages, absolute values, Fahrenheit to Celsius, miles to kilometers)

• log transforms - for compounded data (CPI inflators, market returns, power laws)

• skew reductions - reduce left or right skewness

• additive transforms - make multiplicative relationships additive, and therefore linear (e.g. by taking logs)

• spread transforms - reduce heteroskedasticity
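
A small sketch of the spread-transform idea, assuming the built-in AirPassengers series as the example data. It compares how much the within-year standard deviation grows from the first year to the last, before and after taking logs; the variable names are illustrative only.

```r
# Assumption: AirPassengers (monthly, 1949-1960) is the example series.
year <- rep(1949:1960, each = 12)

# Standard deviation within each calendar year, raw and logged
sd_raw <- tapply(as.numeric(AirPassengers), year, sd)
sd_log <- tapply(log(as.numeric(AirPassengers)), year, sd)

# How many times larger is the spread in 1960 than in 1949?
growth_raw <- as.numeric(sd_raw["1960"] / sd_raw["1949"])
growth_log <- as.numeric(sd_log["1960"] / sd_log["1949"])
print(round(c(raw = growth_raw, log = growth_log), 2))
```

If the log is doing its job, the spread on the log scale grows far less than on the raw scale.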

Page 8: Intro To Forecasting - Part 2 - HRUG

Some common transforms

TRANSFORM           EXAMPLE
Reciprocal          x = 1/x
Log                 x = log(x)
Powers and roots    x = x^2; x = sqrt(x)
Common scale        y = 1:100; x = 1/y
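
As a quick illustration of the reciprocal, log, and power/root transforms, here is a sketch on a made-up vector (the values are arbitrary). It also checks that each transform can be undone, which matters later when forecasts have to be returned to the original scale.

```r
# Arbitrary positive example values (not from the slides)
x <- c(1, 4, 9, 16, 25)

reciprocal <- 1 / x      # reciprocal transform
logged     <- log(x)     # natural log
squared    <- x^2        # power transform
rooted     <- sqrt(x)    # root transform

# Each transform has an inverse that recovers the original data
back_from_recip <- 1 / reciprocal
back_from_log   <- exp(logged)
back_from_sq    <- sqrt(squared)
back_from_root  <- rooted^2

print(data.frame(x, reciprocal, logged, squared, rooted))
```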

Page 9: Intro To Forecasting - Part 2 - HRUG

Forecast with transform

• Use log( ) to account for growth factor in Air Passenger data
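
A sketch of the same linear trend fitted to log(AirPassengers) instead of the raw counts. The train/test split is an assumption made for illustration, so the RMSE values it prints are on the log scale and may differ from the figures on the next slide.

```r
# Assumption: illustrative split; the slides do not state the one actually used.
train <- window(AirPassengers, end = c(1958, 12))
test  <- window(AirPassengers, start = c(1959, 1))

# Model the log of the series against time
train_df <- data.frame(log_y = log(as.numeric(train)), t = as.numeric(time(train)))
test_df  <- data.frame(log_y = log(as.numeric(test)),  t = as.numeric(time(test)))
fit_log <- lm(log_y ~ t, data = train_df)

rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Both errors are measured in log units, not passengers
rmse_train_log <- rmse(train_df$log_y, fitted(fit_log))
rmse_test_log  <- rmse(test_df$log_y, predict(fit_log, newdata = test_df))
print(c(train = rmse_train_log, test = rmse_test_log))
```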

Page 10: Intro To Forecasting - Part 2 - HRUG

More accurate?

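RMSE Training = 0.134

RMSE Test = 0.167

(These values are on the log scale, so they are not directly comparable to the earlier RMSE figures.)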

Page 11: Intro To Forecasting - Part 2 - HRUG

Don’t forget to transform the data back!

Page 12: Intro To Forecasting - Part 2 - HRUG

Back Transformed Plot
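
A sketch of the back-transformation step, assuming the log-scale linear trend and the illustrative split used for this example: predictions and their 95% prediction bounds are produced on the log scale and then passed through exp( ) to return to passenger counts.

```r
# Assumption: illustrative split and a simple log-linear trend model.
train <- window(AirPassengers, end = c(1958, 12))
test  <- window(AirPassengers, start = c(1959, 1))

train_df <- data.frame(log_y = log(as.numeric(train)), t = as.numeric(time(train)))
test_df  <- data.frame(t = as.numeric(time(test)))
fit_log  <- lm(log_y ~ t, data = train_df)

# predict() returns a matrix with columns fit, lwr, upr (still in log units)
pred_log <- predict(fit_log, newdata = test_df,
                    interval = "prediction", level = 0.95)

# exp() undoes log(), so everything is back in passenger counts
pred <- exp(pred_log)

# RMSE on the original scale, comparable to the untransformed model
rmse_test <- sqrt(mean((as.numeric(test) - pred[, "fit"])^2))
print(rmse_test)
```

To draw the plot, the columns of `pred` can be turned back into time series with `ts(pred[, "fit"], start = c(1959, 1), frequency = 12)` and added to `plot(AirPassengers)` with `lines( )`.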

Page 13: Intro To Forecasting - Part 2 - HRUG

Linear Models

• lm( ) function to create a linear model

• tslm( ) (from the forecast package) is an lm( ) wrapper that adds season and trend variables

• season is a set of dummy variables derived from the seasonal period (frequency) of the time series
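
To show what the trend and season variables amount to, here is a sketch that builds them by hand and fits the model with plain lm( ), so it runs without the forecast package. The train window is an assumption for illustration; with forecast loaded, a call along the lines of tslm(log(train) ~ trend + season) is the usual shortcut.

```r
# Assumption: illustrative training window on the built-in AirPassengers data.
train <- window(AirPassengers, end = c(1958, 12))

df <- data.frame(
  log_y  = log(as.numeric(train)),
  trend  = seq_along(train),                   # 1, 2, 3, ... one step per month
  season = factor(as.integer(cycle(train)))    # month of year, 1 to 12
)

# lm() expands the season factor into 11 dummy variables (January is the baseline)
fit <- lm(log_y ~ trend + season, data = df)
print(coef(fit))
```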

Page 14: Intro To Forecasting - Part 2 - HRUG

What does our model look like?

• Use the summary( ) function to get details

Page 15: Intro To Forecasting - Part 2 - HRUG

How well does it fit?

• Use the residuals( ) function to inspect the model's errors (the residual standard error itself is reported by summary( ))
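
A sketch of inspecting a fitted model with summary( ) and residuals( ), assuming a log-scale trend-plus-season model built with base lm( ) on an illustrative training window.

```r
# Assumption: trend + season model on log(AirPassengers), illustrative window.
train <- window(AirPassengers, end = c(1958, 12))
df <- data.frame(
  log_y  = log(as.numeric(train)),
  trend  = seq_along(train),
  season = factor(as.integer(cycle(train)))
)
fit <- lm(log_y ~ trend + season, data = df)

# summary() gives coefficients, R-squared and the residual standard error
s <- summary(fit)
print(s$r.squared)
print(s$sigma)          # residual standard error, in log units

# residuals() returns one error per observation (actual minus fitted)
res <- residuals(fit)
print(summary(res))
```

Plotting `res` against time is a quick way to spot patterns the model has missed.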

Page 16: Intro To Forecasting - Part 2 - HRUG

Plot of Log Forecast using seasonal Dummy Variable

Page 17: Intro To Forecasting - Part 2 - HRUG

Creating our own dummy variables

• Time series with ‘1’ where variable is TRUE, ‘0’ where FALSE

• Factors are a good place to start when creating dummy variables

• Always use n-1 dummy variables for n categories (e.g. days of the week need 6 dummy variables, since all ‘0’ represents the remaining day)
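
The points above can be sketched with a day-of-week factor. The dates are made up for illustration; model.matrix( ) turns the factor into 0/1 columns, and dropping the intercept column leaves 6 dummies with Sunday as the all-zero baseline.

```r
# Two weeks of illustrative dates (hypothetical, not from the slides)
dates <- as.Date("2015-07-13") + 0:13

# Day of week as a factor; $wday is 0 = Sunday ... 6 = Saturday
wday <- as.POSIXlt(dates)$wday
day  <- factor(wday, levels = 0:6,
               labels = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"))

# model.matrix() creates an intercept plus n-1 = 6 dummy columns
mm      <- model.matrix(~ day)
dummies <- mm[, -1]      # drop the intercept, keep the 6 dummies

print(head(dummies, 7))
```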

Page 18: Intro To Forecasting - Part 2 - HRUG

Examples of dummy variables

• Employment status (for credit scores)

• Bank holidays (for econometrics and market data)

• Black Friday and Christmas shopping season for retail sales

• Days of critical events that move (e.g. Super Bowl Sunday, worker strikes, natural disasters)

Page 19: Intro To Forecasting - Part 2 - HRUG

Easter Holiday 2014-2017

• Let’s say you’re in charge of forecasting sales of Cadbury Eggs for Cadbury Schweppes. The sales peak near the Easter holiday in the US.

• Easter falls at various times of the year (March or April)

• Solution? Create a dummy variable for Easter

YEAR    EASTER HOLIDAY
2014    April 20th
2015    April 5th
2016    March 27th
2017    April 16th

Page 20: Intro To Forecasting - Part 2 - HRUG

Easter Dummy Variable
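
A minimal sketch of such a dummy, assuming monthly sales data covering 2014-2017 (the monthly frequency is an assumption; the slides do not state it). The series is 1 in the month that contains Easter and 0 everywhere else, using the dates from the table on the previous slide.

```r
# Easter dates for 2014-2017, as listed on the previous slide
easter <- as.Date(c("2014-04-20", "2015-04-05", "2016-03-27", "2017-04-16"))

# First day of every month from January 2014 to December 2017
months <- seq(as.Date("2014-01-01"), as.Date("2017-12-01"), by = "month")

# 1 where the month contains Easter, 0 otherwise
is_easter_month <- format(months, "%Y-%m") %in% format(easter, "%Y-%m")
easter_dummy <- ts(as.integer(is_easter_month),
                   start = c(2014, 1), frequency = 12)

print(easter_dummy)
```

The resulting series can then be supplied as an extra regressor alongside the trend and season terms.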