Forecasting Multiple Time Series Using the baselineforecast R Package

Post on 15-Jan-2017

Forecasting (Revenue for S&P 500 Companies) Using the baselineforecast Package

by Konstantin Golyaev, Microsoft Azure Machine Learning

Konstantin Golyaev, useR! 2016, Stanford, CA, 6/30/2016

Motivation

• “Prediction is very difficult, especially about the future”

• © Niels Bohr (allegedly)

• We want to:

• Forecast multiple time series at different horizons

• Leverage useful external information, when available

• Employ state-of-the-art methods

Note: won’t show any results due to five-minute time constraint

Two Ways to Forecast

1. Time-series methods (ARIMA, ETS, STL, etc.)

• Great for modeling trend and seasonality

2. Regression-based methods (elastic net, random forest, boosted regression trees, etc.)

• Derive power from external information (features)

Can we get the best of both worlds?

Illustration

• Take a small window of the series $y_1, y_2, \ldots, y_8, \ldots$

• Fit a model to it and make forecasts a few steps ahead: with the window ending at $y_6$, these are $f_{7|6}, f_{8|6}, \ldots$, where $f_{t|s}$ denotes the forecast of $y_t$ made from a window ending at time $s$

• Move the window forward

• Repeat the process: with the window ending at $y_7$, the forecasts are $f_{8|7}, f_{9|7}, \ldots$

• Continue until out of data, then combine the results by pairing each actual value with its forecasts: $(y_7, f_{7|6})$, $(y_8, f_{8|6})$, $(y_8, f_{8|7})$, $(y_9, f_{9|7})$, $\ldots$
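
The rolling-window procedure above can be sketched in a few lines. The slides show no package code, so this is only an illustration in Python under stated assumptions: the function names, the window size of 6, the horizon of 2, the data, and the window-mean "model" are all placeholders rather than anything taken from baselineforecast.

```python
def fit_window_mean(window):
    """Placeholder model (an assumption): forecast every step with the window mean."""
    level = sum(window) / len(window)
    return lambda steps: [level] * steps


def rolling_forecasts(series, window_size, horizon):
    """Slide a fixed-size window over `series`; at each origin fit a model
    and record (target, origin, actual, forecast) rows."""
    rows = []
    # `origin` is the 1-based index of the last observation in the window.
    for origin in range(window_size, len(series)):
        window = series[origin - window_size:origin]
        forecasts = fit_window_mean(window)(horizon)
        for step, forecast in enumerate(forecasts, start=1):
            target = origin + step
            if target <= len(series):  # keep only forecasts that have an actual
                rows.append((target, origin, series[target - 1], forecast))
    return rows


y = [10.0, 12.0, 11.0, 13.0, 14.0, 15.0, 16.0, 18.0, 17.0]  # made-up data
rows = rolling_forecasts(y, window_size=6, horizon=2)
for target, origin, actual, forecast in rows:
    print(f"y{target}={actual:.1f}  f{target}|{origin}={forecast:.2f}")
```

The combined rows are what a regression learner can later consume: each row pairs an actual value with a time-series forecast made from earlier data only.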

What Else Can We Do?

Date-Based Features

Examples:

• Year

• Quarter

• Month

• Week

• Holidays

• Etc…
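
As a rough sketch of how such features might be derived from a timestamp, the Python snippet below uses only the standard library. The `date_features` helper and the `HOLIDAYS` set are hypothetical and are not part of the package described in the talk.

```python
from datetime import date

# Hypothetical holiday calendar; a real project would supply its own.
HOLIDAYS = {date(2016, 1, 1), date(2016, 7, 4), date(2016, 12, 25)}


def date_features(d):
    """Return calendar features for a single date as a dict."""
    return {
        "year": d.year,
        "quarter": (d.month - 1) // 3 + 1,
        "month": d.month,
        "week": d.isocalendar()[1],  # ISO 8601 week number
        "is_holiday": d in HOLIDAYS,
    }


features = date_features(date(2016, 7, 4))
print(features)
```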

Lags or Other Functions of $y_t$

• R does not compute lags correctly when a series has gaps in its index (e.g., missing months or days)

• So we implemented our own lag computation
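
To make the gap problem concrete, here is a minimal Python sketch that contrasts a positional shift with a lag computed from the calendar index. It is not the package's implementation, which the slides do not show; `shift_month`, `lag_by_period`, and the data are assumptions made for illustration.

```python
def shift_month(year, month, k):
    """Return the (year, month) that lies k months before the given one."""
    index = year * 12 + (month - 1) - k
    return index // 12, index % 12 + 1


def lag_by_period(series, k):
    """Lag a monthly series by calendar period rather than by row position.

    `series` maps (year, month) -> value. A period whose k-month
    predecessor is absent gets None instead of a wrong neighbour's value.
    """
    return {
        (year, month): series.get(shift_month(year, month, k))
        for (year, month) in series
    }


# Made-up monthly data with March 2016 missing.
y = {(2016, 1): 100.0, (2016, 2): 110.0, (2016, 4): 130.0, (2016, 5): 140.0}

positional = dict(zip(list(y)[1:], list(y.values())[:-1]))  # naive row shift
by_period = lag_by_period(y, 1)

print(positional[(2016, 4)])  # 110.0: February's value, really a 2-month lag
print(by_period[(2016, 4)])   # None: March is missing, so no 1-month lag exists
```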

External Series as Features

• This is very much problem-specific

• What we used in various projects:

• Macroeconomic data from Federal Reserve Economic Data (FRED)

• Web search trends from Bing, Google, etc.

• Tweets scored for sentiment

• External business drivers such as promotions
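
Whatever the source, an external series has to be aligned with the target on a common time index before it can serve as a feature. The Python sketch below shows one way to do that with a left join by period. The `attach_external` helper, the feature names, and all values are made up for illustration.

```python
def attach_external(target, external):
    """Left-join external series onto the target by period.

    `target` maps period -> y value; `external` maps a feature name to a
    dict of period -> value. Periods missing from an external series
    yield None so that gaps stay visible.
    """
    rows = []
    for period in sorted(target):
        row = {"period": period, "y": target[period]}
        for name, values in external.items():
            row[name] = values.get(period)
        rows.append(row)
    return rows


# Made-up quarterly revenue and two hypothetical external drivers.
revenue = {"2016Q1": 5.0, "2016Q2": 5.4, "2016Q3": 5.1}
drivers = {
    "macro_index": {"2016Q1": 0.6, "2016Q2": 1.2, "2016Q3": 2.8},
    "promo": {"2016Q2": 1},
}

dataset = attach_external(revenue, drivers)
for row in dataset:
    print(row)
```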

Implementation

• All code is combined into the baselineforecast R package

• Function ConstructDataset() takes a series $y_t$ and external data $X_t$ and returns a data frame with target and features

• Function FitModel() interfaces with the caret package to train any regression learning algorithm and perform time series cross-validation
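
Time series cross-validation differs from ordinary k-fold in that every training set must precede its test set in time. The Python sketch below generates such rolling-origin splits. The `rolling_origin_splits` function and its parameters are assumptions made for illustration. caret offers createTimeSlices() for this kind of split, but the slides do not say whether FitModel() relies on it.

```python
def rolling_origin_splits(n_obs, initial_window, horizon, fixed_window=True):
    """Yield (train_indices, test_indices) pairs for rolling-origin evaluation.

    Every training set precedes its test set in time, so no future
    information leaks into model fitting.
    """
    for end in range(initial_window, n_obs - horizon + 1):
        # A fixed window slides forward; otherwise the window expands from 0.
        start = end - initial_window if fixed_window else 0
        yield list(range(start, end)), list(range(end, end + horizon))


splits = list(rolling_origin_splits(n_obs=8, initial_window=5, horizon=2))
for train, test in splits:
    print("train", train, "test", test)
```

Passing `fixed_window=False` keeps the start of the training set at the first observation, which gives an expanding window instead of a sliding one.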

Future Work

• Exploratory Data Analysis

• Computing Prediction Intervals

• Decide on the license/distribution model

Have questions?

Ping me at Konstantin.Golyaev@Microsoft.com
