ESCoE Research Seminar: Improving the Predictive Ability of ONS’ “Faster Indicators of UK Economic Activity”. Presented by Fotis Papailias (King’s College London), 10 March 2020.



Improving the Predictive Ability of ONS’ “Faster Indicators of UK Economic Activity”

Economic Statistics Centre of Excellence

Seminar

Fotis Papailias

Joint with George Kapetanios

King’s College London, ESCoE

March 10, 2020

Fotis Papailias (KCL) ESCoE March 10, 2020 1 / 48


Outline

Introduction & Motivation

Data Description

Methodologies & Models

Results

Conclusions and Suggestions


Introduction & Motivation

The Data Science Campus at the Office for National Statistics (ONS) in the UK has been leading the “Faster Indicators of UK Economic Activity” (FIEA) project.

On April 15th, 2019, ONS made this data available online¹ and the press covered the news with great interest².

The Read-me file associated with the published data states that this set of indicators has the following goals:

- identify close-to-real-time big data or administrative datasets which represent useful economic concepts,
- create a set of indicators which allow early identification of large economic changes, and
- provide insight into economic activity, at a level of timeliness and granularity not possible for official economic statistics.

¹ See LINK.
² See LINK.

Introduction & Motivation

ONS clearly states, of this project and the corresponding output data: “...We are not attempting to forecast or predict GDP or other headline economic statistics here, and the indicators should not be interpreted in this way...”.

In the recently updated narrative, ONS say: “...Rather, by exploring big, closer-to-real-time datasets of activity likely to have an impact on the economy, we provide an early picture of a range of activities that supplement official economic statistics and may aid economic and monetary policymakers and analysts in interpreting the economic situation...”.

Introduction & Motivation

Of the set of indicators from this project, ONS says: “...they should be considered early warning indicators providing timely insight into real activities in the economy, and their potential impact on headline GDP should be interpreted carefully. However, it may be that these indicators have the power to improve the performance of nowcasting or forecasting models, as components of these models...”.

This research investigates the predictive ability of a subset of the FIEA dataset when used to nowcast UK GDP growth.

Introduction & Motivation

Even though ONS argues that these indices are not constructed to be used as economic predictors (and should not be interpreted this way), our nowcasting exercise reveals that some of the VAT monthly³ diffusion indices (which are part of the FIEA dataset) provide satisfactory performance when predicting the direction (increase or decrease) of GDP growth.

Finally, we show that the individual performance of these indices can be further improved by extracting the common factors across all VAT Reporting Behaviour indices and using these common factors as predictors of economic activity.

Our empirical evidence suggests that there is nowcasting predictive power hidden in these indices and, therefore, that their forecasting performance should be examined more thoroughly.

³ Quarterly series used in an earlier draft.

Data Description

The purpose of this note is to examine the predictive performance of the ONS FIEA diffusion indices in the nowcasting of UK GDP growth.

To do so, we compare the performance of these indices to a set of univariate models, and to models which can handle a large panel of potential predictors in the manner of Stock and Watson (2002).

We build a large panel of key macroeconomic and financial variables which consists of:

- 49 daily variables,
- 3 weekly variables,
- 178 monthly variables and
- 60 quarterly variables⁴.

⁴ All variables have been downloaded using the Macrobond software.

Data Description

As in McCracken and Ng (2016), this panel includes variables of the following categories: output and income, labour market, housing, consumption, orders and inventories, money and credit, interest and exchange rates, prices, stock market and surveys.

The target variable is the period-to-period growth of the UK GDP at constant prices.

All variables are seasonally adjusted and have been transformed to stationarity in a similar manner to McCracken and Ng (2016).
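As an illustration of what a transformation to stationarity in the style of McCracken and Ng (2016) can look like, the sketch below applies the seven transformation codes used in their FRED-MD database to a single series. Which code is applied to which UK series is not stated in the slides, so the function name `transform` and the example series are purely illustrative.

```python
import numpy as np

def transform(x, tcode):
    """Apply a FRED-MD style transformation code to a 1-D series.

    Codes: 1 level, 2 first difference, 3 second difference, 4 log,
    5 first difference of logs, 6 second difference of logs,
    7 first difference of the percentage change.
    Observations lost to differencing are returned as NaN.
    """
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    if tcode == 1:
        out = x.copy()
    elif tcode == 2:
        out[1:] = np.diff(x)
    elif tcode == 3:
        out[2:] = np.diff(x, n=2)
    elif tcode == 4:
        out = np.log(x)
    elif tcode == 5:
        out[1:] = np.diff(np.log(x))
    elif tcode == 6:
        out[2:] = np.diff(np.log(x), n=2)
    elif tcode == 7:
        pct_change = x[1:] / x[:-1] - 1.0
        out[2:] = np.diff(pct_change)
    else:
        raise ValueError("tcode must be an integer between 1 and 7")
    return out

# Hypothetical level series growing by 10% per period
growth = transform([100.0, 110.0, 121.0], tcode=5)
```

Code 5 (log differences) is the natural choice for a growth-rate target such as GDP growth.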


Data Description: FIEA

We complement the standard macroeconomic dataset with the set of “VAT Reporting Behaviour indices” (SA) extracted from the ONS FIEA project.⁵

We include the Total, Agriculture, forestry and fishing, Production, Construction and Services indices, as well as the individual series (A to S).

Why “VAT Reporting Behaviour indices” and not “VAT Monthly Diffusion Indices”⁶?

- Data availability: 2008M01 vs. 2013M01.
- Given the short sample, 5 years of data (60 obs.) is an important factor.

⁵ Click here for data.
⁶ “Quarterly Monthly Diffusion Indices” used in an earlier draft.

Empirical Setting

Let $y_t$, $t = 1, \dots, T$, be the target variable and $x_t = (x_{1t}, \dots, x_{Nt})'$ be a set of potential predictors, with $N$ being very large.

We do not assume a particular data generating process for $y_t$ but simply posit the existence of a representation of the form

$$y_t = a + g(x_{1t}, \dots, x_{Nt}) + u_t, \qquad (1)$$

which implies that $E(u_t \mid x_{1t}, \dots, x_{Nt}) = 0$.

We consider an approximating linear representation of the form

$$y_t = a + \sum_{i=1}^{N} \beta_i x_{it} + u_t, \qquad (2)$$

with $u_t$ denoting a martingale difference process, and where the set of $x_{it}$ can also contain products of the original indicators in order to provide a better approximation to (1).

Empirical Setting

Our main aim is to provide estimates for current and future values of $y_t$. There are two strands in econometric methodologies.

Variable Reduction

Reducing the dimension of $x_t$ by producing a much smaller set of generated regressors, which can then be used to produce nowcasts and forecasts in standard ways.

Variable Selection

While ordinary least squares (OLS) is the benchmark method for estimating (2), it is clear that if $N$ is large this is not optimal or even feasible (when $N > T$). Therefore, other methods need to be used. We consider sparse regression, with origins in the machine learning literature.

Empirical Setting: Factor Extraction

Relatively few summaries of the large data sets are used in forecasting equations, which thereby become standard forecasting equations as they only involve a few explanatory variables.

The main assumption is that the co-movements across the indicator variables $x_t$, where $x_t = (x_{1t} \cdots x_{Nt})'$ is a vector of dimension $N \times 1$, can be captured by an $r \times 1$ vector of unobserved factors $F_t = (F_{1t} \cdots F_{rt})'$, i.e.,

$$\tilde{x}_t = \Lambda' F_t + e_t \qquad (3)$$

where $\tilde{x}_t$ may be equal to $x_t$ or may involve other variables, such as lags, leads or products of the elements of $x_t$, and $\Lambda$ is an $r \times N$ matrix of parameters describing how the individual indicator variables relate to each of the $r$ factors, which we refer to as ‘loadings’.

Empirical Setting: Factor Extraction

The number of factors is assumed to be finite. So, implicitly, in (2) $\beta' x_t = \alpha' \Lambda x_t$, where $F_t = \Lambda x_t$ and $\alpha$ is an $r \times 1$ coefficient vector, which means that a small number, $r$, of linear combinations of $x_t$ represent the factors and act as the predictors for $y_t$, the target variable.

The main difference between different factor methods relates to how $\Lambda$ and the factors are estimated.

The use of PCA for the estimation of factor models is, by far, the most popular factor extraction method. It has been popularised by Stock and Watson (2002a, 2002b), in the context of large data sets, although the idea had been well established in the traditional multivariate statistical literature.

Empirical Setting: PCA

The method of principal components is simple. Estimates of $\Lambda$ and the factors $F_t$ are obtained by solving:

$$V(r) = \min_{\Lambda, F} \frac{1}{NT} \sum_{i=1}^{N} \sum_{t=1}^{T} (\tilde{x}_{it} - \lambda_i' F_t)^2, \qquad (4)$$

where $\lambda_i$ is an $r \times 1$ vector of loadings that represent the $N$ columns of $\Lambda = (\lambda_1 \cdots \lambda_N)$.

PC estimation of the factor structure is essentially a static exercise as no lags or leads of $x_t$ are considered.

One alternative is dynamic principal components, which, as a method of factor extraction, has been suggested in a series of papers by Forni, Hallin, Lippi and Reichlin (see, e.g., Forni, Hallin, Lippi and Reichlin (2000) among others) and is designed to address this issue.
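The static principal components problem in (4) can be sketched with a singular value decomposition. The function name `pca_factors`, the standardisation step and the normalisation $F'F/T = I_r$ are assumptions made for this illustration, not details taken from the slides.

```python
import numpy as np

def pca_factors(X, r):
    """Estimate r static factors from a T x N panel by principal components.

    Returns the factors F (T x r), the loadings Lam (r x N) and the value
    of the least-squares objective V(r) evaluated on the standardised panel.
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardise each column
    T, N = Z.shape
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    F = np.sqrt(T) * U[:, :r]                  # normalised so that F'F / T = I_r
    Lam = F.T @ Z / T                          # least-squares loadings given F
    resid = Z - F @ Lam
    V = float((resid ** 2).sum() / (N * T))
    return F, Lam, V

# Hypothetical panel: 50 periods, 8 indicators
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))
F, Lam, V2 = pca_factors(X, 2)
```

The product `F @ Lam` is the best rank-r approximation of the standardised panel, so the objective value cannot increase as r grows.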


Empirical Setting: DFA

Dynamic principal components are extracted in a similar fashion to static principal components but, instead of the second moment matrix, the spectral density matrix of the data at various frequencies is used.

The dynamic PCs are then used to construct estimates of the common component of the data set, which is a function of the unobserved factors.

The basic version of this method uses leads of the data, making it unsuited to a forecasting context, but later work by the developers of the method has addressed this issue (see, e.g., Forni, Hallin, Lippi and Reichlin (2005)).

Empirical Setting: PLS

In Partial Least Squares (PLS), the basic idea is similar to PCA in that factors or components, which are linear combinations of the original regression variables, are used, instead of the original variables, as regressors.

PLS regression does not seem to have been explicitly considered for data sets with a very large number of series, i.e., when $N$ is assumed in the limit to converge to infinity.

A conceptually powerful way of defining PLS is to note that the PLS factors are those linear combinations of $x_t$, denoted by $\Upsilon x_t$, that give maximum covariance between $y_t$ and $\Upsilon x_t$ while being orthogonal to each other.
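The maximum-covariance definition above can be illustrated with the standard single-response PLS recursion, in which each weight vector is proportional to the covariance between the (deflated) predictors and the target. This is a minimal sketch under that assumption; the function name `pls_factors` and the simulated data are hypothetical and are not the authors' implementation.

```python
import numpy as np

def pls_factors(X, y, k):
    """Extract k PLS components (scores) for a single target y.

    Each weight vector w is proportional to X'y on the deflated predictors,
    so the score X w has maximal covariance with y among unit-norm weights;
    deflation makes successive scores mutually orthogonal.
    """
    Xd = np.asarray(X, dtype=float)
    Xd = Xd - Xd.mean(axis=0)
    yd = np.asarray(y, dtype=float)
    yd = yd - yd.mean()
    scores = []
    for _ in range(k):
        w = Xd.T @ yd
        w = w / np.linalg.norm(w)        # unit-norm weight vector
        t = Xd @ w                       # component (score) series
        p = Xd.T @ t / (t @ t)           # regression of each column on t
        Xd = Xd - np.outer(t, p)         # remove the part explained by t
        scores.append(t)
    return np.column_stack(scores)

# Hypothetical data: the target loads mainly on the first predictor
rng = np.random.default_rng(1)
X = rng.standard_normal((40, 6))
y = X[:, 0] + 0.1 * rng.standard_normal(40)
S = pls_factors(X, y, 3)
```

Unlike PCA, the components depend on the target, which is why PLS factors must be re-extracted for each variable being nowcast.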


Empirical Setting: Penalised Regression

Penalised regression is one of the most popular approaches to sparse regression in the literature.

Various penalties have been suggested in order to estimate the $\beta_i$ parameters effectively, assigning zeros to the variables which should not be used in the regression (meaning that these are not part of the true model) and consequently in the forecasting exercise.

In what follows we denote $\beta_N = (\beta_1, \dots, \beta_N)'$ and $x_N = (x_1, \dots, x_N)'$.

Empirical Setting: Ridge

Ridge Regression creates a linear regression model that is penalised with the squared L2-norm, i.e. the sum of the squared coefficients.

This has the effect of shrinking the coefficient values (and the complexity of the model), allowing some coefficients with minor contribution to the response to get close to zero (but not exactly equal to zero).

The parameter estimators, $\hat{\beta}^{Ridge}$, are then computed by solving the following optimisation problem:

$$\min_{\beta_N} \left\{ \sum_{t=1}^{T} \left( y_t - a - \beta_N' x_{t,N} \right)^2 + \lambda \sum_{i=1}^{N} \beta_i^2 \right\}, \qquad (5)$$

for given values of $a$ and $\lambda$. $\lambda$ is the penalty parameter.

Empirical Setting: Ridge

OLS corresponds to the no penalty case, where $\hat{\beta}^{Ridge} \to \hat{\beta}^{OLS}$ as $\lambda \to 0$.

Also, it can be easily seen that $\hat{\beta}^{Ridge} \to 0$ as $\lambda \to \infty$.

By centering the columns of $x$, the intercept becomes $a = \bar{y}$.

Therefore, we typically center $y$, $x_N$ and do not include the intercept term.
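The two limits and the centering argument can be checked numerically with the closed-form ridge solution, $(X'X + \lambda I)^{-1} X'y$ on centered data. This is a minimal sketch; the function name `ridge` and the simulated data are illustrative assumptions.

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge estimate for the sum-of-squares objective in (5).

    The data are centered first, so no intercept enters the penalised
    problem; the intercept on the original scale is recovered afterwards.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    n_pred = Xc.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_pred), Xc.T @ yc)
    a = y.mean() - X.mean(axis=0) @ beta   # equals mean(y) when X is centered
    return a, beta

# Hypothetical data with four predictors
rng = np.random.default_rng(2)
X = rng.standard_normal((60, 4))
y = 1.0 + X @ np.array([1.0, -2.0, 0.0, 0.5]) + 0.1 * rng.standard_normal(60)

a_ols, beta_ols = ridge(X, y, 0.0)       # lambda = 0 reproduces OLS
a_big, beta_big = ridge(X, y, 1e9)       # very large lambda shrinks towards 0
```

Note that the closed form requires $X'X + \lambda I$ to be invertible, which holds for any $\lambda > 0$ even when $N > T$.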


Empirical Setting: LASSO

Least Absolute Shrinkage and Selection Operator (LASSO) creates a regression model that is penalised with the L1-norm, which is the sum of the absolute values of the coefficients.

Because of the nature of this constraint, it tends to produce some coefficients that are exactly 0 and hence gives more interpretable models.

Simulation studies suggest that the LASSO enjoys some of the favourable properties of both subset selection and ridge regression.

Empirical Setting: LASSO

As originally noted by Tibshirani (1996), LASSO regression is better suited to predictor selection than Ridge regression because it performs model selection, keeping those variables which are more suitable for forecasting.

The optimisation problem now becomes:

$$\min_{\beta_N} \left\{ \sum_{t=1}^{T} \left( y_t - a - \beta_N' x_{t,N} \right)^2 + \lambda \sum_{i=1}^{N} |\beta_i| \right\}. \qquad (6)$$
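Problem (6) has no closed-form solution, but it can be solved one coefficient at a time. The sketch below uses cyclic coordinate descent with soft-thresholding, a common solver that the slides do not name, so it should be read as one possible approach. With the objective written as in (6), without a factor of 1/2, the threshold for each coordinate is $\lambda/2$. The function names and the simulated data are illustrative.

```python
import numpy as np

def soft_threshold(z, g):
    """Soft-thresholding operator: sign(z) * max(|z| - g, 0)."""
    return np.sign(z) * max(abs(z) - g, 0.0)

def lasso_cd(X, y, lam, n_iter=500):
    """Cyclic coordinate descent for the LASSO objective in (6), centered data."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    beta = np.zeros(Xc.shape[1])
    col_ss = (Xc ** 2).sum(axis=0)       # x_j'x_j for each column
    for _ in range(n_iter):
        for j in range(Xc.shape[1]):
            # residual with predictor j's current contribution added back
            partial = yc - Xc @ beta + Xc[:, j] * beta[j]
            z = Xc[:, j] @ partial
            beta[j] = soft_threshold(z, lam / 2.0) / col_ss[j]
    return beta

# Hypothetical data: only the first of five predictors matters
rng = np.random.default_rng(3)
X = rng.standard_normal((80, 5))
y = 3.0 * X[:, 0] + 0.1 * rng.standard_normal(80)
beta_sparse = lasso_cd(X, y, lam=20.0)
```

Once $\lambda/2$ exceeds the largest absolute value of $x_j'y$ on centered data, every coefficient is set exactly to zero, which is the selection behaviour described above.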


Empirical Setting: EN

Elastic Net (EN) creates a regression model that is penalised with both the L1-norm and the squared L2-norm.

Introduced by Zou and Hastie (2005), the elastic net has the effect of shrinking coefficients (as in ridge regression) and setting some coefficients to zero (as in LASSO).

The optimisation problem now is:

$$\hat{\beta}^{naive\,EN} = \arg\min_{\beta_N} \left\{ \sum_{t=1}^{T} \left( y_t - a - \beta_N' x_{t,N} \right)^2 + \lambda_1 \sum_{i=1}^{N} |\beta_i| + \lambda_2 \sum_{i=1}^{N} \beta_i^2 \right\}. \qquad (7)$$

The main advantage of the elastic net is its usefulness when the number of predictors is much bigger than the number of observations, which is usually the case in our big data context.
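The naive elastic net in (7) can be solved with the same kind of coordinate-wise update as the LASSO: the L1 term soft-thresholds each coefficient and the L2 term inflates the denominator. This is a sketch under that assumption (the slides do not specify a solver), and the function names and simulated data are hypothetical.

```python
import numpy as np

def soft_threshold(z, g):
    """Soft-thresholding operator: sign(z) * max(|z| - g, 0)."""
    return np.sign(z) * max(abs(z) - g, 0.0)

def elastic_net_cd(X, y, lam1, lam2, n_iter=500):
    """Cyclic coordinate descent for the naive elastic net in (7), centered data."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    beta = np.zeros(Xc.shape[1])
    col_ss = (Xc ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(Xc.shape[1]):
            partial = yc - Xc @ beta + Xc[:, j] * beta[j]
            z = Xc[:, j] @ partial
            # L1 part: threshold lam1 / 2; L2 part: add lam2 to the denominator
            beta[j] = soft_threshold(z, lam1 / 2.0) / (col_ss[j] + lam2)
    return beta

# Hypothetical data
rng = np.random.default_rng(4)
X = rng.standard_normal((80, 5))
y = 3.0 * X[:, 0] + 0.1 * rng.standard_normal(80)
beta_en = elastic_net_cd(X, y, lam1=10.0, lam2=5.0)
```

Setting `lam1 = 0` recovers ridge regression and setting `lam2 = 0` recovers the LASSO, so (7) nests (5) and (6).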


Empirical Setting: EN

The addition of a squared L2-norm penalty is motivated by Zou and Hastie (2005) as follows. For strongly correlated covariates, the LASSO may select one but typically not both of them (and the non-selected variable can then be approximated as a linear function of the selected one).

From the point of view of sparsity, this is what we would like to do.

However, in terms of interpretation, we may want to keep both of two strongly correlated variables among the selected variables: this is motivated by the idea that we do not want to miss a “true” variable due to the selection of a “non-true” one which is highly correlated with it.

Empirical Setting: SSlab

Spike and Slab regressions were originally proposed by Mitchell and Beauchamp (1988) and recently used by Scott and Varian (2013).

The idea is to include an indicator variable $\gamma_i = 1$ if $\beta_i \neq 0$ (i.e. the corresponding regressor is included in the model), and $\gamma_i = 0$ if $\beta_i = 0$.

Denoting the nonzero elements of $\beta$ by $\beta_\gamma$, the spike and slab prior for $\beta$ and $\gamma$ can be written as

$$p(\beta, \gamma, \sigma^2) = p(\beta_\gamma \mid \gamma, \sigma^2)\, p(\sigma^2 \mid \gamma)\, p(\gamma).$$

The vector of indicator variables $\gamma$ is assumed to have a Bernoulli prior (independent across elements)

$$p(\gamma) = \prod_{i=1}^{N} \pi_i^{\gamma_i} (1 - \pi_i)^{(1 - \gamma_i)},$$

so it represents a spike as it places positive probability mass at zero.
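The Bernoulli prior on the inclusion indicators is easy to evaluate and to sample from. The sketch below covers only $p(\gamma)$; the slab densities $p(\beta_\gamma \mid \gamma, \sigma^2)$ and $p(\sigma^2 \mid \gamma)$ are not specified in the slides and are left out. The function names and the inclusion probabilities are illustrative.

```python
import numpy as np

def log_prior_gamma(gamma, pi):
    """Log of the independent Bernoulli prior p(gamma) for inclusion indicators.

    pi may be a scalar or a vector of inclusion probabilities strictly
    between 0 and 1.
    """
    gamma = np.asarray(gamma, dtype=float)
    pi = np.broadcast_to(np.asarray(pi, dtype=float), gamma.shape)
    return float(np.sum(gamma * np.log(pi) + (1.0 - gamma) * np.log(1.0 - pi)))

def draw_gamma(pi, rng):
    """Draw one vector of inclusion indicators from the Bernoulli prior."""
    pi = np.asarray(pi, dtype=float)
    return (rng.random(pi.shape) < pi).astype(int)

# Hypothetical setting: three candidate regressors, each included with
# prior probability 0.5
lp = log_prior_gamma([1, 0, 0], 0.5)
gamma_draw = draw_gamma(np.full(3, 0.5), np.random.default_rng(5))
```

Choosing small values of $\pi_i$ expresses a prior belief that the model is sparse, since the expected number of included regressors is the sum of the $\pi_i$.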


Empirical Setting: Nowcasting Setup

Our nowcasting algorithm is simple and works as follows.

1. We leave a number of observations, $T^O$, out of the sample, in order to use them for the evaluation of the nowcast output of the different models.

2. The initial sample we use for the first round of estimation and forecasting is $T^{IN}_1 = \{1, \dots, (T - T^O + 1)\}$. Then, we estimate the parameters and produce the nowcasts for each model.

3. We repeat Step 2 in a recursive manner, i.e. $T^{IN}_2 = \{1, \dots, (T - T^O + 2)\}$ and generally $T^{IN}_j = \{1, \dots, (T - T^O + j)\}$. We stop when $T^{IN}_j = \{1, \dots, (T - 1)\}$, so that we can evaluate the nowcasts in the last period.
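The recursive (expanding-window) scheme in Steps 1 to 3 can be sketched as a simple loop. The exact window indexing below, where each held-out observation is nowcast from a model estimated on all earlier observations, is an assumption, as are the function names and the OLS model used for the example.

```python
import numpy as np

def recursive_nowcasts(y, X, T_O, fit, predict):
    """Expanding-window nowcasts for the last T_O observations.

    For each held-out period the model is re-estimated on all earlier
    observations and evaluated at that period's predictors.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    T = len(y)
    nowcasts = np.empty(T_O)
    for j in range(T_O):
        end = T - T_O + j                 # size of the estimation window
        model = fit(X[:end], y[:end])
        nowcasts[j] = predict(model, X[end])
    return nowcasts

def fit_ols(X, y):
    """OLS with an intercept; returns the coefficient vector."""
    Z = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(Z, y, rcond=None)[0]

def predict_ols(coef, x):
    """Fitted value for a single row of predictors."""
    return coef[0] + x @ coef[1:]

# Hypothetical exact linear relationship, so nowcasts match the targets
x = np.arange(20.0).reshape(-1, 1)
y = 1.0 + 2.0 * x[:, 0]
nc = recursive_nowcasts(y, x, 5, fit_ols, predict_ols)
```

Passing `fit` and `predict` as arguments lets the same loop be reused for each of the candidate models, so that all are evaluated on identical out-of-sample periods.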


Empirical Setting: Nowcasting Setup

At the end of the above recursive procedure we end up with $T^O$ nowcasts for each model under consideration.

Once we have computed the $T^O$ nowcasts, we evaluate the output using the Mean Absolute Nowcast Error (MAE) statistic defined as:

$$MAE_i = \frac{1}{T^O} \sum_{t=1}^{T^O} |e_{i,t}|,$$

where $e_{i,t}$ is the out-of-sample forecast error (in levels) for model $i$.

All our tables present the relative MAE with respect to each appropriate benchmark.
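The MAE and the relative MAE reported in the tables can be computed as below. This is a minimal sketch; the function names and the error values are made up for illustration.

```python
import numpy as np

def mae(errors):
    """Mean absolute error of a vector of out-of-sample errors."""
    return float(np.mean(np.abs(errors)))

def relative_mae(errors_model, errors_benchmark):
    """MAE of a model divided by the MAE of its benchmark."""
    return mae(errors_model) / mae(errors_benchmark)

# Hypothetical errors: the candidate model halves the benchmark's errors
benchmark_errors = np.array([1.0, -1.0, 2.0, -2.0])
model_errors = 0.5 * benchmark_errors
ratio = relative_mae(model_errors, benchmark_errors)
```

By construction the benchmark's own relative MAE is 1, which is why one column in each results table equals 1.000 throughout.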


Empirical Setting: Nowcasting Setup

We further calculate the Diebold-Mariano statistic for predictive accuracy as follows:

$$DM = \frac{\bar{d}}{\left( \widehat{LRV}_{\bar{d}} / T \right)^{1/2}},$$

where

$$\bar{d} = \frac{1}{T^O} \sum_{t=1}^{T^O} d_t,$$

$$d_t = |e_{1,t}| - |e_{2,t}|, \quad \text{corresponding to MAE},$$

$$LRV_{\bar{d}} = \gamma_0 + 2 \sum_{v=1}^{\infty} \gamma_v, \qquad \gamma_v = cov(d_t, d_{t-v}),$$

for candidate models 1 and 2 (model 1 always being the corresponding benchmark).

Empirical Setting: Nowcasting Setup

We use the one-sided test where the null hypothesis states equal predictive ability between models:

$$H_0 : E[d_t] = 0, \qquad H_A : E[d_t] > 0,$$

so that rejection indicates that the candidate model is more accurate than the benchmark.

In the tables we flag the significance of the test in the standard manner (*, **, *** for rejection at the 10%, 5% and 1% levels, respectively).

Any model with relative MAE < 1 has a smaller nowcast error than the appropriate benchmark.
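The one-sided Diebold-Mariano test with absolute-error loss can be sketched as follows. Several choices here are assumptions rather than details from the slides: the infinite sum in the long-run variance is truncated at `max_lag` with Bartlett weights, the sample size is taken to be the number of loss differentials, and the p-value uses the standard normal approximation. The function name and the simulated errors are illustrative.

```python
import math
import numpy as np

def dm_test(e1, e2, max_lag=0):
    """One-sided Diebold-Mariano test with absolute-error loss.

    e1 are the benchmark's errors and e2 the candidate model's errors.
    Returns the DM statistic and the p-value for H_A: E[d_t] > 0.
    """
    d = np.abs(np.asarray(e1, dtype=float)) - np.abs(np.asarray(e2, dtype=float))
    n = len(d)
    d_bar = d.mean()
    dc = d - d_bar
    lrv = dc @ dc / n                                # gamma_0
    for v in range(1, max_lag + 1):
        gamma_v = dc[v:] @ dc[:-v] / n               # sample autocovariance
        lrv += 2.0 * (1.0 - v / (max_lag + 1)) * gamma_v
    dm = d_bar / math.sqrt(lrv / n)
    p_value = 0.5 * (1.0 - math.erf(dm / math.sqrt(2.0)))   # 1 - Phi(dm)
    return dm, p_value

# Hypothetical errors: the benchmark's errors are twice the candidate's
rng = np.random.default_rng(6)
e_model = rng.standard_normal(90)
e_bench = 2.0 * e_model
dm_stat, p_val = dm_test(e_bench, e_model, max_lag=2)
```

A large positive statistic, and hence a small p-value, indicates that the benchmark's absolute errors are systematically larger than the candidate model's.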


Empirical Setting: Models

Benchmarks

Univariate Series: Naïve, AR(4), LR[O_Total, YL4]

Factor Extraction

DFA, PCA(1), PCA(MSE), PLS(1), PLS(MSE)

Variable Selection

Ridge, Lasso, Elastic Net

For all methods

- Using the standard macro dataset [MF]
- Using the standard macro dataset including the ONS factors [MFO]
- Extracting separate factors for MF and O [MF, O]
- Using the ONS factors only [O]

Empirical Setting: Details

First in-sample: 12 obs - dataset too short. 2008Q2-2011Q1

Out-of-sample: 90 monthly nowcasts. 2011-07-22 to 2018-12-28

Unbalancedness: quarter averages

Further consideration: UMIDAS, problems with too short dataset

Analysis

- Total
- Nowcast Month (M1, M2, M3) and distance to target date
- Annually

Comparison

- All
- MF vs MFO/O

Results: Benchmarks Evaluation

| | AR(4) | Naive | LR[O_Total, YL4] |
|---|---|---|---|
| Total | 0.866** | 1.000 | 0.887* |
| M1 | 0.903 | 1.000 | 0.853 |
| M2 | 0.903 | 1.000 | 0.941 |
| M3 | 0.809** | 1.000 | 0.872 |

| | AR(4) | Naive | LR[O_Total, YL4] |
|---|---|---|---|
| 2011-2018 | 0.866** | 1.000 | 0.887* |
| 2011 | 1.278 | 1.000 | 1.650 |
| 2012 | 0.903 | 1.000 | 0.998 |
| 2013 | 0.777 | 1.000 | 0.594** |
| 2014 | 1.115 | 1.000 | 1.305 |
| 2015 | 0.667*** | 1.000 | 0.627** |
| 2016 | 0.78** | 1.000 | 0.712** |
| 2017 | 0.645*** | 1.000 | 0.644*** |
| 2018 | 1.075 | 1.000 | 1.088 |
| 2011-2014 | 0.936 | 1.000 | 0.994 |
| 2015-2018 | 0.775*** | 1.000 | 0.748*** |

Results: Benchmarks Evaluation

[Figure: two panels, Cum. RMSFE and Rolling RMSFE, 2012 to 2018. Series: AR(4), Naive, LR[O_Total, YL4].]

Results: DFA

| | AR(4) | DFA[MF, YL4] | DFA[MFO, YL4] | DFA[MF, O, YL4] | DFA[O, YL4] |
|---|---|---|---|---|---|
| Total | 1.102 | 1.000 | 0.994 | 1.089 | 1.156 |
| M1 | 1.024 | 1.000 | 1.002 | 1.061 | 1.063 |
| M2 | 1.058 | 1.000 | 0.986 | 1.212 | 1.239 |
| M3 | 1.226 | 1.000 | 0.994 | 0.994 | 1.170 |

| | AR(4) | DFA[MF, YL4] | DFA[MFO, YL4] | DFA[MF, O, YL4] | DFA[O, YL4] |
|---|---|---|---|---|---|
| 2011-2018 | 1.102 | 1.000 | 0.994 | 1.089 | 1.156 |
| 2011 | 3.701 | 1.000 | 0.918 | 3.530 | 4.330 |
| 2012 | 0.879* | 1.000 | 1.003 | 1.003 | 0.930 |
| 2013 | 1.278 | 1.000 | 0.991 | 1.189 | 1.299 |
| 2014 | 1.707 | 1.000 | 0.966** | 1.266 | 1.854 |
| 2015 | 0.655*** | 1.000 | 1.016 | 0.979 | 0.716* |
| 2016 | 1.202 | 1.000 | 0.988 | 1.045 | 1.113 |
| 2017 | 0.992 | 1.000 | 1.011 | 1.088 | 1.085 |
| 2018 | 1.332 | 1.000 | 0.959*** | 0.766** | 1.323 |
| 2011-2014 | 1.186 | 1.000 | 0.992 | 1.178 | 1.267 |
| 2015-2018 | 0.991 | 1.000 | 0.997 | 0.972 | 1.011 |

Results: DFA

[Figure: two panels, Cum. RMSFE and Rolling RMSFE, 2012 to 2018. Series: AR(4), DFA(2,2,1)[MF, YL4], DFA(2,2,1)[MFO, YL4], DFA(2,2,1)[MF, O, YL4], DFA(2,2,1)[O, YL4].]

Results: PCA(1)

| | AR(4) | PCA(1)[MF, YL4] | PCA(1)[MFO, YL4] | PCA(1)[MF, O, YL4] | PCA(1)[O, YL4] |
|---|---|---|---|---|---|
| Total | 1.039 | 1.000 | 0.99* | 0.999 | 1.100 |
| M1 | 0.952 | 1.000 | 1.003 | 0.975 | 0.939 |
| M2 | 0.978 | 1.000 | 0.983* | 1.043 | 1.080 |
| M3 | 1.199 | 1.000 | 0.983 | 0.977 | 1.296 |

| | AR(4) | PCA(1)[MF, YL4] | PCA(1)[MFO, YL4] | PCA(1)[MF, O, YL4] | PCA(1)[O, YL4] |
|---|---|---|---|---|---|
| 2011-2018 | 1.039 | 1.000 | 0.99* | 0.999 | 1.100 |
| 2011 | 2.893 | 1.000 | 0.910 | 2.491 | 3.452 |
| 2012 | 0.929 | 1.000 | 1.000 | 0.935 | 0.980 |
| 2013 | 0.894 | 1.000 | 0.973** | 0.886*** | 0.772 |
| 2014 | 0.995 | 1.000 | 1.012 | 1.079 | 1.149 |
| 2015 | 0.858 | 1.000 | 0.997 | 0.971 | 0.893 |
| 2016 | 1.081 | 1.000 | 0.945** | 0.871** | 1.178 |
| 2017 | 1.036 | 1.000 | 1.043 | 1.091 | 1.158 |
| 2018 | 1.398 | 1.000 | 0.971 | 0.949 | 1.483 |
| 2011-2014 | 1.018 | 1.000 | 0.991 | 1.018 | 1.066 |
| 2015-2018 | 1.075 | 1.000 | 0.988 | 0.966 | 1.157 |

Results: PCA(1)

[Figure: two panels, Cum. RMSFE and Rolling RMSFE, 2012 to 2018. Series: AR(4), PCA(1)[MF, YL4], PCA(1)[MFO, YL4], PCA(1)[MF, O, YL4], PCA(1)[O, YL4].]

Results: PCA(1)

[Figure: two panels, “PCA(1), Average Absolute Loading” and “PCA(1), Average Median Loading”, 2012 to 2018. Series: MF, MFO, O.]

Results: PCA(mse)

| | AR(4) | PCA(mse)[MF, YL4] | PCA(mse)[MFO, YL4] | PCA(mse)[MF, O, YL4] | PCA(mse)[O, YL4] |
|---|---|---|---|---|---|
| Total | 0.897 | 1.000 | 1.084 | 1.029 | 1.009 |
| M1 | 0.806* | 1.000 | 1.007 | 0.987 | 0.867 |
| M2 | 0.952 | 1.000 | 1.504 | 1.193 | 1.099 |
| M3 | 0.941 | 1.000 | 0.804 | 0.933 | 1.074 |

| | AR(4) | PCA(mse)[MF, YL4] | PCA(mse)[MFO, YL4] | PCA(mse)[MF, O, YL4] | PCA(mse)[O, YL4] |
|---|---|---|---|---|---|
| 2011-2018 | 0.897 | 1.000 | 1.084 | 1.029 | 1.009 |
| 2011 | 0.904 | 1.000 | 1.037 | 0.285*** | 1.095 |
| 2012 | 0.858 | 1.000 | 1.220 | 1.058 | 0.904 |
| 2013 | 0.61* | 1.000 | 1.229 | 0.579*** | 0.527** |
| 2014 | 0.987 | 1.000 | 0.73** | 0.888 | 1.097 |
| 2015 | 0.926 | 1.000 | 1.127 | 1.443 | 0.998 |
| 2016 | 0.789 | 1.000 | 0.906 | 1.175 | 0.828 |
| 2017 | 2.377 | 1.000 | 1.292 | 3.041 | 4.126 |
| 2018 | 1.185 | 1.000 | 0.869** | 1.494 | 1.466 |
| 2011-2014 | 0.807** | 1.000 | 1.126 | 0.8** | 0.84* |
| 2015-2018 | 1.086 | 1.000 | 0.995 | 1.515 | 1.366 |

Results: PLS(1)

| | AR(4) | PLS(1)[MF, YL4] | PLS(1)[MFO, YL4] | PLS(1)[MF, O, YL4] | PLS(1)[O, YL4] |
|---|---|---|---|---|---|
| Total | 1.059 | 1.000 | 0.986* | 1.006 | 1.122 |
| M1 | 0.962 | 1.000 | 1.000 | 0.977 | 0.954 |
| M2 | 0.998 | 1.000 | 0.982 | 1.060 | 1.090 |
| M3 | 1.232 | 1.000 | 0.976 | 0.980 | 1.342 |

| | AR(4) | PLS(1)[MF, YL4] | PLS(1)[MFO, YL4] | PLS(1)[MF, O, YL4] | PLS(1)[O, YL4] |
|---|---|---|---|---|---|
| 2011-2018 | 1.059 | 1.000 | 0.986* | 1.006 | 1.122 |
| 2011 | 3.039 | 1.000 | 0.86** | 2.404 | 3.617 |
| 2012 | 0.937 | 1.000 | 1.000 | 0.940 | 0.990 |
| 2013 | 1.008 | 1.000 | 0.965** | 0.948 | 0.864 |
| 2014 | 0.967 | 1.000 | 1.022 | 1.072 | 1.092 |
| 2015 | 0.875 | 1.000 | 0.995 | 0.977 | 0.913 |
| 2016 | 1.025 | 1.000 | 0.934** | 0.88** | 1.130 |
| 2017 | 1.075 | 1.000 | 1.060 | 1.104 | 1.207 |
| 2018 | 1.401 | 1.000 | 0.94* | 0.918 | 1.516 |
| 2011-2014 | 1.050 | 1.000 | 0.990 | 1.032 | 1.093 |
| 2015-2018 | 1.075 | 1.000 | 0.98* | 0.964 | 1.169 |

Results: PLS(1)

[Figure: two panels, "Cum. RMSFE" and "Rolling RMSFE", plotted over 2012-2018 for AR(4), PLS(1)[MF, YL4], PLS(1)[MFO, YL4], PLS(1)[MF, O, YL4] and PLS(1)[O, YL4].]
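The slides do not define the two panels. One common reading is sketched below in Python: the cumulative RMSFE uses all forecast errors up to each date, and the rolling RMSFE uses only the most recent `window` errors. The function names, the window length and the error values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def cumulative_rmsfe(errors):
    """RMSFE over all forecast errors up to and including each date."""
    sq = np.asarray(errors, dtype=float) ** 2
    return np.sqrt(np.cumsum(sq) / np.arange(1, sq.size + 1))


def rolling_rmsfe(errors, window):
    """RMSFE over the last `window` errors; NaN until the window is full."""
    sq = np.asarray(errors, dtype=float) ** 2
    out = np.full(sq.size, np.nan)
    # Prefix sums let each window total be read off as a difference.
    csum = np.concatenate(([0.0], np.cumsum(sq)))
    out[window - 1:] = np.sqrt((csum[window:] - csum[:-window]) / window)
    return out


# Made-up forecast errors for illustration only.
errors = [3.0, 4.0, 0.0, 0.0]
cum = cumulative_rmsfe(errors)
roll = rolling_rmsfe(errors, window=2)
```

The cumulative version smooths out over time, whereas the rolling version reacts to recent episodes of poor performance, which is why the two panels can tell different stories about the same model.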


Results: PLS(1)

[Figure: two panels, "PLS(1), Average Absolute Loading" and "PLS(1), Average Median Loading", plotted over 2012-2018 for the MF, MFO and O series.]


Results: PLS(mse)

Sample     AR(4)     PLS(mse)[MF, YL4]  PLS(mse)[MFO, YL4]  PLS(mse)[MF, O, YL4]  PLS(mse)[O, YL4]
Total      0.879**   1.000              0.92***             1.005                 0.944
M1         0.785**   1.000              0.906***            0.974                 0.778**
M2         0.842*    1.000              0.925**             1.032                 0.932
M3         1.024     1.000              0.931               1.012                 1.142

2011-2018  0.879**   1.000              0.92***             1.005                 0.944

2011       0.755*    1.000              0.984               1.102                 0.899
2012       0.864     1.000              0.91*               0.963                 0.913
2013       0.636**   1.000              0.85***             0.999                 0.545***
2014       0.711*    1.000              1.002               1.009                 0.803
2015       1.388     1.000              1.034               1.001                 1.448
2016       0.894     1.000              0.856**             1.024                 0.988
2017       1.097     1.000              0.77*               1.039                 1.269
2018       1.549     1.000              1.065               1.004                 1.829

2011-2014  0.757***  1.000              0.921**             0.999                 0.788***
2015-2018  1.181     1.000              0.917**             1.019                 1.327


Results: Ridge

Sample     AR(4)     Ridge[MF, YL4]  Ridge[MFO, YL4]  Ridge[O, YL4]
Total      1.017     1.000           1.005            1.026
M1         0.931     1.000           0.989            0.933
M2         0.991     1.000           1.007            1.102
M3         1.132     1.000           1.020            1.048

2011-2018  1.017     1.000           1.005            1.026

2011       1.809     1.000           0.982            0.471**
2012       0.82**    1.000           1.000            0.943
2013       0.900     1.000           1.008            0.945
2014       0.623***  1.000           1.017            1.115
2015       1.269     1.000           0.981            0.902
2016       1.493     1.000           0.9*             1.058
2017       1.860     1.000           1.215            1.753
2018       1.525     1.000           1.004            1.314

2011-2014  0.843**   1.000           1.005            0.961
2015-2018  1.504     1.000           1.006            1.208


Results: EN

Sample     AR(4)     ElNet[MF, YL4]  ElNet[MFO, YL4]  ElNet[O, YL4]
Total      1.042     1.000           0.975            1.038
M1         0.966     1.000           0.974            0.931
M2         1.026     1.000           0.997            1.083
M3         1.135     1.000           0.955            1.101

2011-2018  1.042     1.000           0.975            1.038

2011       4.636     1.000           1.000            1.429
2012       0.921     1.000           0.987            1.021
2013       0.764     1.000           1.003            0.79**
2014       0.726**   1.000           0.944            1.111
2015       1.000     1.000           0.888            0.758*
2016       1.245     1.000           0.918            1.213
2017       1.663     1.000           1.267            1.586
2018       1.546     1.000           0.885*           1.257

2011-2014  0.921     1.000           0.982            0.988
2015-2018  1.314     1.000           0.960            1.148


Results: LASSO

Sample     AR(4)     Lasso[MF, YL4]  Lasso[MFO, YL4]  Lasso[O, YL4]
Total      1.011     1.000           1.003            1.065
M1         0.918     1.000           0.979            0.942
M2         0.987     1.000           1.016            1.099
M3         1.134     1.000           1.017            1.161

2011-2018  1.011     1.000           1.003            1.065

2011       4.867     1.000           1.000            1.711
2012       0.904     1.000           0.992            0.998
2013       0.717     1.000           0.991            0.832
2014       0.662***  1.000           0.972            1.152
2015       1.026     1.000           0.976            0.818
2016       1.237     1.000           0.896            1.233
2017       1.659     1.000           1.376            1.663
2018       1.580     1.000           1.060            1.328

2011-2014  0.878     1.000           0.987            1.006
2015-2018  1.328     1.000           1.042            1.205


Results: SSlab

Sample     AR(4)     SSlab[MF, YL4]  SSlab[MFO, YL4]  SSlab[O, YL4]
Total      0.82**    1.000           1.068            1.268
M1         0.704**   1.000           1.073            1.133
M2         0.725*    1.000           1.055            1.176
M3         1.100     1.000           1.078            1.570

2011-2018  0.82**    1.000           1.068            1.268

2011       0.794     1.000           1.618            1.184
2012       0.982     1.000           1.060            1.094
2013       0.473***  1.000           1.009            1.113
2014       0.584***  1.000           1.017            1.840
2015       0.771     1.000           0.999            1.375
2016       1.337     1.000           1.017            1.224
2017       1.283     1.000           0.981*           1.302
2018       1.335     1.000           1.008            0.999

2011-2014  0.697***  1.000           1.093            1.281
2015-2018  1.133     1.000           1.002            1.236


Conclusions

This paper examines the predictive ability of the recently introduced ONS FIEA indicators in a simple nowcasting setup.

Even though FIEA is not specifically designed for (or targeted at) forecasting, our results suggest that these indicators do have some predictive power in nowcasting economic activity.

The exercise is short because of data availability issues; more work should be done to maximise the utility of these indicators in applications.


Future Work & Suggestions

Based on this “first approach” to the ONS FIEA data, we want to further investigate the predictive abilities of this dataset.

1. First, the current results are based on real-time nowcasting. Extend to real-time forecasting.

2. Investigate the “Shipping indicators” and consider cases for nowcasting and forecasting.

3. Investigate the backcasting abilities of the dataset.

CURRENT WORK IN PROGRESS

Forecasting the direction of GDP growth using ONS FIEA traffic indicators (regional and aggregate).
