kth.diva-portal.org1255660/fulltext01.pdf · denna rapport fokuserar p a att unders oka anv...

IN DEGREE PROJECT MATHEMATICS,SECOND CYCLE, 30 CREDITS

, STOCKHOLM SWEDEN 2018

Estimating fuel consumption using regression and machine learning

LUKAS EKSTRÖM

KTH ROYAL INSTITUTE OF TECHNOLOGYSCHOOL OF ENGINEERING SCIENCES

Estimating fuel consumption using regression and machine learning

LUKAS EKSTRÖM

Degree Projects Systems Engineering (30 ECTS credits) Degree Programme in Aerospace Engineering (120 credits) KTH Royal Institute of Technology year 2018 Supervisor at Scania: Henrik Wentzel Supervisor at KTH: Johan Karlsson Examiner at KTH: Johan Karlsson

TRITA-SCI-GRU 2018:378 MAT-E 2018:80

Royal Institute of Technology School of Engineering Sciences KTH SCI SE-100 44 Stockholm, Sweden URL: www.kth.se/sci

Abstract

This thesis focuses on investigating the usage of statistical models for esti-mating fuel consumption of heavy duty vehicles. Several statistical modelsare assessed, along with machine learning using artificial neural networks.

Data recorded by sensors on board trucks in the EU describe the oper-ational usage of the vehicle. The usage of this data for estimating the fuelconsumption is assessed, and several variables originating from the opera-tional data is modelled and tested as possible input parameters.

The estimation model for real world fuel consumption uses 8 parametersdescribing the operational usage of the vehicles, and 8 parameters describingthe vehicles themselves. The operational parameters describe the averagespeed, topography, variation of speed, idling, and more. This model has anaverage relative error of 5.75%, with a prediction error less than 11.14% for95% of all tested vehicles.

When only vehicle parameters are considered, it is possible to makepredictions with an average relative error of 9.30%, with a prediction errorless than 19.50% for 95% of all tested vehicles.

Furthermore, a computer software called Vehicle Energy ConsumptionCalculation tool(VECTO) must be used to simulate the fuel consumption forall heavy duty vehicles, according to legislation by the EU. Running VECTOis a slow process, and this thesis also investigates how well statistical modelscan be used to quickly estimate the VECTO fuel consumption. The modelestimates VECTO fuel consumption with an average relative error of 0.32%and with a prediction error less than 0.65% for 95% of all tested vehicles.

I

Estimering av bransleforbrukning med re-gression och maskininlarning

Sammanfattning

Denna rapport fokuserar pa att undersoka anvandningen av statistiska mod-eller for att uppskatta bransleforbrukningen hos tunga fordon. Flera statis-tiska modeller utvarderas, tillsammans med maskinlarning med artificiellaneurala natverk.

Data som registreras av sensorer ombord pa Scania-lastbilar i EU beskriverfordonets drift. Anvandningen av dessa data for att uppskatta bransleforbrukningenundersoks och flera variabler som kommer fran operativa data modellerasoch testas som mojliga in-parametrar.

Uppskattningsmodellen for den verkliga bransleforbrukningen anvander8 parametrar som beskriver anvandningen av fordonet och 8 parametrar sombeskriver sjalva fordonet. Bland annat beskrivs medelhastighet, topografi,hastighetsvariation, andel tomgang. Denna modell har ett genomsnittligtrelativfel pa 5.75 %, med ett skattningsfel mindre an 11.14% for 95% av dede fordon som testats.

Om endast fordonsparametrar beaktas som in-parametrar ar det mojligtatt gora skattningar med ett genomsnittligt relativfel pa 9.30 %, med ettskattningsfel mindre an 19.50% for 95% av de de fordon som testats. Ett da-torprogram kallat VECTO maste anvandas for att simulera bransleforbrukningenfor alla tunga fordon enligt EU-lagstiftning. Att kora VECTO ar en tid-skravande process, och denna rapport undersoker ocksa hur val statistiskamodeller kan anvandas for att snabbt uppskatta VECTO-bransleforbrukningen.Modellen uppskattar VECTO-bransleforbrukningen med ett genomsnittligtrelativfel pa 0.32% och med ett skattningsfel mindre an 0.65% for 95% avde de fordon som testats.

II

Thanks

I would like to express my gratitude to Henrik Wentzel, Johan Karlsson,Marcus Forslin, Antonius Kies, and Goran Svensson, for the help and guid-ance they have given me.

III

Contents

Contents IV

1 Introduction 1

2 Theory 32.1 Preliminary math . . . . . . . . . . . . . . . . . . . . . . . . . 3

2.1.1 Basic statistical measures . . . . . . . . . . . . . . . . 32.1.2 Normalizing . . . . . . . . . . . . . . . . . . . . . . . . 42.1.3 Standardized euclidean distance . . . . . . . . . . . . 42.1.4 Prediction error . . . . . . . . . . . . . . . . . . . . . . 52.1.5 Categorical variables . . . . . . . . . . . . . . . . . . . 6

2.2 Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72.2.1 Linear regression . . . . . . . . . . . . . . . . . . . . . 82.2.2 KNN regression . . . . . . . . . . . . . . . . . . . . . . 92.2.3 Multivariate adaptive regression splines . . . . . . . . 10

2.3 Artificial neural network . . . . . . . . . . . . . . . . . . . . . 162.3.1 Artificial neuron . . . . . . . . . . . . . . . . . . . . . 172.3.2 Artificial neural network . . . . . . . . . . . . . . . . . 182.3.3 Input . . . . . . . . . . . . . . . . . . . . . . . . . . . 192.3.4 Output . . . . . . . . . . . . . . . . . . . . . . . . . . 192.3.5 Learning . . . . . . . . . . . . . . . . . . . . . . . . . . 202.3.6 Batch learning . . . . . . . . . . . . . . . . . . . . . . 252.3.7 Evaluating observations . . . . . . . . . . . . . . . . . 25

3 Data 263.1 Vehicle parameters . . . . . . . . . . . . . . . . . . . . . . . . 263.2 Operational parameters . . . . . . . . . . . . . . . . . . . . . 28

3.2.1 Bin data . . . . . . . . . . . . . . . . . . . . . . . . . . 293.2.2 Considered variables . . . . . . . . . . . . . . . . . . . 30

3.3 Output data . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323.4 Data gathering . . . . . . . . . . . . . . . . . . . . . . . . . . 32

3.4.1 Estimation scenario A . . . . . . . . . . . . . . . . . . 333.4.2 Estimation scenario B and C . . . . . . . . . . . . . . 33

IV

4 Method 344.1 Validation and testing . . . . . . . . . . . . . . . . . . . . . . 344.2 Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

4.2.1 Linear regression . . . . . . . . . . . . . . . . . . . . . 354.2.2 KNN regression . . . . . . . . . . . . . . . . . . . . . . 354.2.3 KNN Regression with inverse distance weighting . . . 364.2.4 Multivariate adaptive regression splines (MARS) . . . 364.2.5 Artificial neural network . . . . . . . . . . . . . . . . . 37

4.3 Choice of input parameters . . . . . . . . . . . . . . . . . . . 384.3.1 Estimation scenario A - VECTO fuel consumption . . 38

4.4 Estimation scenario B and C - Real world fuel consumption . 404.5 Selection of model . . . . . . . . . . . . . . . . . . . . . . . . 42

5 Results 435.1 Estimation scenario A - VECTO fuel consumption . . . . . . 43

5.1.1 Selected model . . . . . . . . . . . . . . . . . . . . . . 435.2 Estimation scenario B - Real world fuel consumption using

only vehicle parameters . . . . . . . . . . . . . . . . . . . . . 465.2.1 Selected model . . . . . . . . . . . . . . . . . . . . . . 46

5.3 Estimation scenario C - Real world fuel consumption usingvehicle and operational parameters . . . . . . . . . . . . . . . 485.3.1 Selected model . . . . . . . . . . . . . . . . . . . . . . 48

5.4 Results summary . . . . . . . . . . . . . . . . . . . . . . . . . 505.4.1 Comparison of estimation scenario B and C . . . . . . 505.4.2 Variable influence . . . . . . . . . . . . . . . . . . . . 51

6 Discussion and conclusion 526.1 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 526.2 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

References 54

A Glossary 55A.1 Acronyms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55A.2 Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

B Fundamental statistics 59

C Assumed values for bin data 60

D Additional tables 63

E Activation functions 66

V

F Model training result 72F.1 Estimation scenario A . . . . . . . . . . . . . . . . . . . . . . 72F.2 Estimation scenario B . . . . . . . . . . . . . . . . . . . . . . 84F.3 Estimation scenario C . . . . . . . . . . . . . . . . . . . . . . 96

G Additional figures 109

H Code 110

VI

1

Chapter 1

Introduction

This report is the result of a thesis made at Scania AB, in cooperation withRoyal Institute of Technology (KTH). The thesis is being conducted in thegroup YDMC at Scania in Sodertalje, Sweden, which is specialized in analy-sis of fuel consumption, and by extent, CO2 emissions. It is of great interestto be able to quickly and accurately estimate the fuel consumption (FC) andCO2 emissions for vehicles. Current methods for estimation relies on sim-ulation, which is too slow and too inaccurate. More advanced simulationsrequire several minutes to complete, which is too slow for being implementedinto a sales process, and are still not very accurate. Furthermore, a new leg-islation by the European union require all new heavy duty vehicles (HDVs)to be simulated using a software called Vehicle Energy Consumption Cal-culation Tool (VECTO). The simulation tool is used for certification of thevehicles. Therefore, it is also of great interest to be able to quickly deliverand estimate the VECTO certified CO2 to haulage contractors. This paperassesses statistical models for predicting fuel consumption. Several regres-sional models are created and tested. Furthermore, the use of Artificialintelligence (AI) using artificial neural networks are also tested. The paperalso investigates the usage of Operational data - data uploaded from vehiclesin use - and their potential role in estimating the real world FC.

When VECTO is used for certification purposes, simulations are per-formed for a set of predefined transport missions, each with a set of prede-fined transport payloads. Since the driving patterns are predefined, the FCobtained through VECTO, for a given predefined transport mission, can beregarded as a function only of the vehicle itself. This thesis focuses on thetransport mission called long haulage with reference payload, which simu-lates transportation across a large distance, with few stops and dominantlymotorway velocity. A previous thesis, [1] by Max af Klintberg, has assessedhow several regressional models can be used to evaluate the VECTO FCfor the long haulage cycle. That thesis has heavily influenced the methodsassessed in this thesis for estimating VECTO FC. New prediction models

CHAPTER 1. INTRODUCTION 2

Figure 1.1: The three estimation scenarios

are tested and their performance compared.The main objective of this thesis is to investigate how well prediction

models can be used to estimate fuel consumption. Estimation of two differ-ent values of fuel consumption - VECTO and real world - are to be assessed.The VECTO fuel consumption describes the average fuel consumption dur-ing one simulated transport mission. The real world fuel consumption de-scribes the average fuel consumption in a longer term perspective.

The input for the models should be a set of variables that can be de-rived from vehicle specification and/or operational data. The estimation ofVECTO is done solely based on vehicle data, and does not consider anyoperational data. Estimation of real world fuel consumption considers bothvehicle and operational data. A scenario when the real world FC is estimatedsolely based on vehicle parameters is also assessed.

Consequentially, three estimation scenarios can be introduced, accordingto

• Estimation scenario A - Estimation of VECTO fuel consumptionwith vehicle data as input.

• Estimation scenario B - Estimation of real world fuel consumptionwith vehicle data as input.

• Estimation scenario C - Estimation of real world fuel consumptionwith vehicle data and operational data as input.

The three estimation scenarios are also illustrated in Figure 1.1.

3

Chapter 2

Theory

2.1 Preliminary math

This section introduces mathematical concepts which form a basis for fol-lowing sections. Methods for describing properties in a quantifiable manner,measuring similarity between observations, and for modelling estimation er-ror, are explained.

2.1.1 Basic statistical measures

If X1,X2, ...,Xn are random variables, and they form a vector

X =[X1 X2 ... Xn

]. (2.1)

Let xij be the ith independent observation on the jth random variable.Assume that m ∈ N independent observations have been drawn for all n ∈ Nrandom variables. These observations then form a data set X = {xi ∈Rn, i = 1, ...,m}. Further, let xj be the sample mean and sj be the samplestandard deviation of the jth random variable. Obtained as described inAppendix B.

The sample covariance qjk of the j:th and k:th random variables is de-fined as

qjk =1

m− 1

m∑i=1

(xij − xj)(xik − xk). (2.2)

The pearson correlation coefficient r can then be computed using

rjk =qjksjsk

. (2.3)

Pearson correlation coefficient takes a value in the range [−1, 1] where 1is positive linear correlation, -1 is negative linear correlation and 0 is nocorrelation. The value r2jk is called the coefficient of determination, whichtakes a value in the range [0, 1] where 1 is linear correlation and 0 is nocorrelation.

CHAPTER 2. THEORY 4

2.1.2 Normalizing

Consider a random variable xj in a sample, that has a sample mean xj anda sample standard deviation sj , obtained as in subsection 2.1.1. The samplecan be normalized to have a new sample mean x0j and sample standard

deviation s0j . This is done by modifying all observations xij according to

xij :=xij − sjxj

s0j + x0j . (2.4)

2.1.3 Standardized euclidean distance

For comparing similarity of quantitative components of observations, a mea-sure of distance between the observations are to be obtained. A formulationof the distance is derived below.

According to [2], a metric on the set X d(x1,x2) is a function such thatfor all x1,x2 ∈X , the following holds:

1. d(x1,x2) ≥ 0

2. d(x1,x2) = 0 if, and only if x1 = x2

3. d(x1,x2) = d(x2,x1)

4. d(x1,x2) ≤ d(x1,x3) + d(x2,x3).

A metric can then be considered to be the distance between observations.There are a lot of ways to define a metric. However, the following metricsare relevant for this thesis.

The euclidean distance between x1 ∈ Rn and x2 ∈ Rn is the L2-norm(also called the euclidean norm) of the difference x1 − x2, i.e. ||x1 − x2||2.The L2-norm of a vector v ∈ Rn is defined as

||v||2 =

√√√√ n∑j=1

v2j . (2.5)

Therefore, the euclidean distance deuc(x1,x2) is defined as:

deuc(x1,x2) = ||x1 − x2||2 =

√√√√ n∑j=1

(x1,j − x2,j)2. (2.6)

Problems can arise when the euclidean distance is used and the com-ponents of the vectors vary in size, and larger components can come todominate the value of the norm. The standardized euclidean distance nor-malizes the components in X and makes them dimensionless. If xj and sj

CHAPTER 2. THEORY 5

are the sample mean and sample standard deviation of component j in thedata set X , the standardized euclidean distance deuc,s(x1,x2) is defined as

deuc,s(x1,x2) =

√√√√ n∑j=1

(x1,j − sj

xj− x2,j − sj

xj

)2

. (2.7)

2.1.4 Prediction error

Consider a set of m observations for one random variable

y = {yi, i = 1, ...,m} ,

and a set of predictions for the same variable

y = {yi, i = 1, ...,m} .

The prediction error for each observation is defined as

ei = yi − yi (2.8)

and the relative prediction error is

erel,i =eiyi

=yi − yiyi

. (2.9)

The root-mean-squared error (RMSE) is defined as

RMSE(y) =

√√√√ 1

m

m∑i=1

e2i . (2.10)

The root-mean-squared for the relative errors is

RMSErel(y) =

√√√√ 1

m

m∑i=1

e2rel,i. (2.11)

Quantiles

The performance of an estimation model can also be measures by quantilesof the prediction error. The quantiles are associated with a probability p,which creates a threshold for the prediction error, given that probability p.This is useful for measuring the performance regarding outliers. This thesisuses quantiles for p = 95%, 99%, 99.5%.

CHAPTER 2. THEORY 6

2.1.5 Categorical variables

Categorical variables are variables that is set to one of a finite set of possiblevalues. They are not trivially quantifiable, i.e. it is not clear how values forone categorical variable relate to each other, other than being equal or not.There are several different ways to implement categorical variables into aquantitative model.

Dummy variables

Assume that a categorical variable xc is given which can take on one valuefrom {v1, ..., vn}. This categorical variable can be represented by a series ofbinary variables yj , j = 1, ..., n− 1 [3], such that

yj(xc) =

{0 if xc = v1 or x 6= vj+1

1 if xc = vj+1. (2.12)

The result is a n − 1 binary variables. For example, if x is set to the firstvalue v1, the resulting vector is

v|xc=v1 =[0 0 ... 0

].

If xc is set to the second or third value:

v|xc=v2 =[1 0 ... 0

]v|xc=v3 =

[0 1 ... 0

].

Exclusivity

Assume that a categorical variable xc is given, which can take on one valuefrom {v1, ..., vn}, where xc is one component of vectors x ∈ X . Subsetscan be created for every value {v1, ..., vn} and X can then be split intothese subsets depending on the categorical variable. This would result in ndifferent data sets.

This effectively removes the categorical variable xc from the data sets,and the resulting data sets would be on the form Y = {xi ∈ Rn−1}. Thedrawback of this method is that some of the resulting data sets may besmall, and using this method on several categorical variables will partitionthe subsets into even smaller subsets.

Penalized distance

Using penalized distance, dp(·, ·), is a method of measuring distance betweencategorical observations. Assume that x1c and x2c are two observations for

CHAPTER 2. THEORY 7

the categorical variable xc. The penalized distance weight wc > 0 is definedfor xc. The penalized distance for the two observations are then

dpc(x1c, x2c) = wcI(x1c = x2c) =

{wc if x1c = x2c0 otherwise.

(2.13)

The concept can be expanded to vectors of categorical variable. As-sume the vectors x1 ∈ Rn and x2 ∈ Rn are observations of the same set ofcategorical variable, and that penalized distance weights wi, i = 1, ..., n aredefined for all variables. The penalized distance between the two categoricalobservations are then

dp(x1,x2) = dp1(x11, x21) + dp2(x12, x22) + ...+ dpn(x1n, x2n)

= w1I(x11 = x21) + w2I(x12 = x22) + ...+ wnI(x1n = x2n).(2.14)

2.2 Regression

Regressional analysis is a branch within statistic that aims to estimate therelationship between variables. A set of independet variables X (input vari-ables) are used to estimate one or more dependent variables Y (output vari-ables).

A regressional f(X) model is constructed using a set o training data,which is a collection of previous observations consisting of both independentand one dependent variable where xi ∈ X and yi ∈ Y forms a pair (xi, yi).The regressional model will then be constructed to make estimates Y of thedependent variable, using the independent variables:

yi = E(yi|xi) = f(xi). (2.15)

The above equation combined with Equation 2.8 yields the following func-tion for the prediction error

ei = f(xi)− yi. (2.16)

The aim of regressional analysis is to construct the prediction function f soto minimize the prediction error. Assume a set of m previous observations,and therefore i = 1, ...,m. A measure on the total prediction error can beformulated as

E =1

m

m∑i=1

|ei| (2.17)

or

E =1

m

m∑i=1

e2i . (2.18)

CHAPTER 2. THEORY 8

The process of constructing a regressional model can then be formulated as

minimizef

N∑i=1

(f(xi)− yi)2 , (2.19)

which is called the method of least squares.There are different methods for regressional analysis. The ones used in

this thesis are linear regression, K nearest neighbors-regression, Multivariateadaptive regression splines and Artificial neural network.

2.2.1 Linear regression

Assume that there are k independent variables. Linear regression assumesthat the relationship between X and Y is linear, and that f(X) is a linearfunction, according to:

f(xi,β) =β0 + β1xi,1 + β2xi,2 + ...+ +βkxi,k

=βTai (2.20)

where there a m independent variables and ai =[1 xTi

]T ∈ Rk+1. Theparameter vector β ∈ Rk+1 is obtained through

β = arg minβ

m∑i=1

(f(xi,β)− yi)2 . (2.21)

Let A ∈ R(k+1)×m be the design matrix, and y a column vector with all theobservations for the dependent variable, according to

A =

aT1aT2...aTm

=

1 xT11 xT2...

...1 xTm

, y =

y1y2...ym

.The optimization problem in Equation 2.21 can then be written as

β = arg minβ

(Aβ − y)T (Aβ − y) = arg minβ

Φ(β) (2.22)

. Simplifications of the objective function Φ yield

Φ(β) = (Aβ − y)T (Aβ − y)

=(Aβ)TAβ − (Aβ)Ty − yT (Aβ) + yTy

=βTATAβ − 2(Aβ)Ty + yTy. (2.23)

Differentiate twice:∂Φ

∂β= 2ATAβ − 2ATy (2.24)

CHAPTER 2. THEORY 9

∂2Φ

∂β2= 2ATA (2.25)

The second derivative is positive semi-definite and therefore Φ is convex.The optimum to Equation 2.21 and Equation 2.22 is then obtained from

∂Φ

∂β=2ATAβ − 2ATy = 0

⇒ ATAβ = ATy. (2.26)

If ATA is invertible, there exists a unique solution

β∗ = (ATA)−1ATy. (2.27)

Thus, predictions for new observations x0 can be computed with

y0 = f(x0) = β∗Ta0 = β∗T[1 xT0

]T.. (2.28)

2.2.2 KNN regression

K nearest neighbors (KNN) is a non-parametric regressional model, whereK ∈ N is a parameter for the model. An observation x0 is evaluated bymeasuring the standardized euclidean distance to observations in thetraining data xi, i = 1, ...,m. The value of the dependent variables are thenequal to the average of the K nearest observations, i.e, the K items withthe shortest distance.

deuc,s(x(q)0 ,x

(q)i ) (2.29)

is obtained as explained in Equation 2.7. Penalized distance is used to

evaluate the distance over the categorical input variables x(c)i , according to

Equation 2.14, which yields

dp(x(c)0 ,x

(c)i ). (2.30)

The total distance between the observations is then

dtot(x,0xi) = deuc,s(x

(q)0 ,x

(q)i ) + dp(x

(c)0 ,x

(c)i ). (2.31)

Inverse distance weighting

KNN regression may also implement Inverse distance weighting (IDW).The value of a dependent variable o for new observations x0 will then bea weighted average for the K nearest observations in the training data:

f(x0) =

{∑Kk=1 wk(x0,xk)ok∑Kk=1 wk(x0,xk)

if dtot(x,0xi) 6= 0∀i

oi if dtot(x,0xi) = 0 for some i.

(2.32)

CHAPTER 2. THEORY 10

In the above equation x1,x2, ...,xK are the K nearest observations to x0.The weights will be inversely proportional to the distance, according to

wk(x0, xk) =1

dtot(x,0xi)

u(2.33)

where u ∈ R+ is a parameter that can be used to control the magnitude ofthe distance weighting.

2.2.3 Multivariate adaptive regression splines

Multivariate adaptive regression splines (MARS) is a nonlinear regressionalgorithm introduced in [4]. The algorithm results in an estimation functionthat is a finite sum of basis functions BM (·):

f(xi) =

Mfinal∑M=1

cMBM (xi) (2.34)

where cM , for M = 1, ...,Mfinal, are coefficients. An example of a functionfit to data by MARS can be seen in Figure 2.1.

Figure 2.1: The function f(x) = 0.10968+1.1890∗max(0, x−5.5102) createdby MARS to fit the data.


The MARS algorithm consists of two sequential parts

1. The forward stepwise part finds a set of Mmax basis functions.

2. The backwards stepwise part tries to reduce the set of basis functionprovided from the forward stepwise part, to increase generalizability.The number of basis functions in the final set of basis functions Mfinal,where 1 ≤Mfinal ≤Mmax, is selected as a trade-of of generalizabilityand fit to training data.

Basis functions

The original MARS algorithm implements methods to form basis functionsof quantitative input variables, as explained in [4]. Futhermore, an approachfor managing categorical input variables is given in [5]. These methods areexplained here.

The most basic form of basis function considered by MARS is

B1(xi) = 1. (2.35)

All new basis functions are created as a product of a previously created basisfunction and some function b(xij) of one variable xj . Assume there are Mbasis functions already created, basis function M + 1 will then be on theform

BM+1(xi) = BMp,M+1(xi) · b(xij). (2.36)

The function b(xij) can assume two different forms, depending on whetherxj is quantitative or categorical.

For quantitative input variables, MARS uses hinge functions with onequantitative variable as a parameter. Assume xj is a quantitative inputvariable. The hinge functions are on the form

ba(xij) = h(xij) = max(0, xij − c) (2.37)

orbb(xij) = h(xij) = max(0, c− xij) (2.38)

where c is a constant, called the knot. The two above equations are consid-ered a mirrored pair.

Assume now that xj is a categorical input variable, which assumes Kj

different values so that xij ∈ Cj where Cj ={Cj1, ..., CjKj

}. Let Aj1, Aj2, ...

be subsets of Cj . According to the binomial theorem, there are

2Kj (2.39)

subsets of a set with Kj elements, as explained in [6]. MARS considersfunctions with a binary output value, on the form

ba(xj) = I(xij ∈ Ajl) (2.40)


orbb(xj) = I(xij /∈ Ajl) (2.41)

where Ajl is some of the subsets of Cj and

I(xij ∈ Ajl) =

{1 if xij ∈ Ajl0 otherwise

I(xij /∈ Ajl) =

{1 if xij /∈ Ajl0 otherwise

.

Thus, the basis functions created by MARS are a product of 1 and anumber of hinge functions defined in Equation 2.37 and Equation 2.38, andcategorical functions defined in Equation 2.40 and Equation 2.41. The in-teraction of a basis function is the number of hinge and categorical functionspresent in a basis function. Some examples of basis functions are

• B(xi) = 1, the most basic basis function considered, which has inter-action = 0.

• B(xi) = h1(xij), where h1(xij) = max(0, xij − c1), where c1 is theknot of h1, This basis function has interaction = 1.

• B(xi) = h1(xij) · I(xik ∈ Akl), where Akl is a subset of the the valuesthat the categorical variable xk can assume, h1(xij) = max(0, xij−c1),where c1 is the knot of h1. This basis function has interaction = 2.

• B(xi) = h1(xij) · h2(xit), where h1(xij) = max(0, xij − c1), h2(xit) =max(0, xit− c2) where c1 and c2 are the knots of h1 and h2. This basisfunction has interaction = 2.

The function in Figure 2.1 have two basis functions, B1(xi) = 1 andB2(xi) = max(0, xi1− 5.5102). B2 implements a hinge function with a knotat 5.5102.

Finding coefficients

For a set of basis functions B = {B1(·), B2(·), ..., BM (·)}. The coefficientsc =

[c1 c2 ... M

]present in Equation 2.34 are found using linear regres-

sion. The design matrix is obtained by evaluating the basis functions in Bfor every observation xi in the training data.

A =

B1(x1) B2(x1) ... BM (x1)B1(x2) B2(x2) ... BM (x2)

......

. . ....

B1(xm) B2(xm) ... BM (xm)

. (2.42)

The output vector is y =[y1 y2 ... ym

]. This results in a linear problem.

If ATA is invertible, the solution c∗, for B is obtained using Equation 2.27.


Forward stepwise part

The forward stepwise part of MARS creates a basis function set B containingMmax basis functions. The set B of basis functions increases in size as thealgorithm proceeds. At the start of the algorithm the set contains the basisfunction B1(xi) = 1. New basis function created for MARS are products ofone of the basis functions already in B and a hinge function or categoricalfunction. Furthermore, a pair of basis functions are added at a time, i.e.two basis functions are added simultaneously.

Consider B = {B1(·), B2(·), ..., BM (·)}, i.e. there are currently M basisfunctions. Assume M < Mmax. A pair of new basis functions

BM+1(·) and BM+2(·)

should now be added. The new pair of basis function are based on the samebasis function BMp,M+1

(·) ∈ B and appends a mirrored pair ba(xi,M+1) andbb(xi,M+1). So that

BM+1(xi) = BMp,M+1(xi)ba,M+1(xi,jM+1)

BM+2(xi) = BMp,M+1(xi)bb,M+1(xi,jM+1)

where xjM+1 is one of the variables xj , j = 0, ..., n. If xjM+1 is quantitative,the new basis functions take the form

BM+1(xi) = BMp,M+1(xi)max(0, xi,jM+1 − cM+1)

BM+2(xi) = BMp,M+1(xi)max(0, cM+1 − xi,jM+1)

for some knot c used for this pair. If xj is categorical, the new basis functionstake the form

BM+1(xi) = Bk(xi)I(xi,M+1 ∈ AM+1)

BM+2(xi) = Bk(xi)I(xi,M+1 /∈ AM+1)

where AM+1 is some of the subsets Aj1, Aj2, ... of Cj .Thus, for every new pair BM+1(·) and BM+2(·), the forward stepwise

algorithm should determine:

• BMp,M+1(xi) - One of the basis function already in B.

• xM+1 - The variable to use in the mirrored pair ba,M+1 and bb,M+1.

• ba,M+1(xM+1) and bb,M+1(xjM+1) - The mirror paired of b(·)-functionsto use. If xM+1 is

– quantitative - ba,M+1 and bb,M+1 are hinge-functions as describedin Equation 2.37 and Equation 2.38. The knot location cM+1 hasto be determined.


– categorical - ba,M+1 and bb,M+1 are categorical functions as de-scribed in Equation 2.40 and Equation 2.41. The set AM+1 hasto be determined.

Determining the above values is an iterative process. MARS considersevery combination of BMp,M+1

(xi) ∈ B and xjM+1 . Furthermore, the algo-rithm considers a range of different ways to formulate ba,M+1(xi,jM+1) andbb,M+1(xi,jM+1):

• If xjM+1 is quantitative, then bM+1(·) are a pair of hinge functions asdescribed in Equation 2.37 and Equation 2.38. There is an infinitenumber of possible knot locations. However, MARS considers everyobservation xijM+1 , i = 1, ..., n in the training data and as a possiblelocation for the knot.

• If xjM+1 is categorical, then bM+1(·) are a pair of of categorical func-tions as described in Equation 2.40 and Equation 2.41. All possiblesubsets Aj1, Aj2, ... of Cj are candidates to form AM+1.

For all of these combinations, two candidate basis functions B1 and B2 arecreated:

B1(xi) = BMp,M+1(xi)ba,M+1(xi,jM+1)

B2(xi) = BMp,M+1(xi)bb,M+1(xi,jM+1).

MARS then considers a set consisting of B, B1 and B2:

B = {B1, B2, ..., BM , BM+1, BM+2} (2.43)

and thereafter solves the linear optimization problem with the designmatrix provided by Equation 2.42 given B, which yields the coefficientsc ∈ RM+2 corresponding to B. Thereafter the RMSE, with respect to thetraining data, is calculated for the current solution.

When RMSE has been computed for every combination of the elementsin B and b , the pair B1, B2 associated with the smallest RMSE is selectedand added as basis functions to B. Hence, M := M + 2. The procedurerepeats untill M ≥Mmax.

Backward stepwise part

The algorithm featured in the forward stepwise part is a greedy algorithmand resulting in overfitting of the model given by B, meaning the thatmodel has adapted to noise in the training data and does not generalize well.Therefore, the backward stepwise part attempts to remove basis functionsfrom B. Removing these functions will increase the prediction error for thetraining data collection. However, the basis function that gives the smallest


increase in RMSE are removed one at a time. The final basis function setB∗ is selected by a trade-off of generalizability and fit to training data.

The algorithm proceeds as following:

1. The initial data set is B1 := B, obtained from the forward stepwisealgorithm.

2. The basis function that can be removed with the smallest increasein LOF is removed from the previous BM . The resulting set of basisfunction is BM+1. The process is repeated until there is a set that onlycontains one basis function. The result is a range of basis function setsB1,B1, ...,BMmax where N(M) = Mmax−M+1 is the number of basisfunctions in BM .

3. The final set Bfinal is determined using generalized cross-validation(GCV). Let RMSEM be the root-mean-squared error obtained fromperforming regression with the data set BM on the training data.Furthermore, MARS defines effective number of parameters (ENOF )of BM as

ENOFM = N(M) + ψN(M)− 1

2(2.44)

where ψ = 2.3 is a penalty, which is fixed in this thesis. The GCV forBM is then

GCVM =RMSE2

M

N(M) (1− ENOFM/N(M))2. (2.45)

The final set of basis functions is the set that has the lowest GCV:

M∗ = arg minM∈{1,...,Mmax}

GCVM , (2.46)

B∗ := BM∗

Mfinal := N(M∗).

Given B∗, the corresponding optimal coefficients,

c∗ =[c1 c2 ... CMfinal

]is obtained using linear least squares fitting with the design matrixprovided in Equation 2.42.

Optimization

The forward stepwise algorithm may be very computationally demanding.For every new basis function to add to B, the algorithm will consider everycombination of


• Basis functions already in B.

• Every input variable xj .

• Every point xij in the training data if xj is quantitative, or everysubset of Cj if xj is categorical.

For combination of these variables, the MARS algorithm solves the linearproblem resulting from Equation 2.42. Solving the problem using the normalequation solution as given in Equation 2.27 is computationally fast. Thenumber of columns in the design matrix are proportional to the amountof basis functions in B, computation time for solving every linear problemincreases as B grows.

The computational time for the entire forward stepwise part grows rapidlywith Mmax and the training data size, as described in [7]. Furthermore, thecomputation time grows exponentially with the length of Ck for every cat-egorical variable xj , according to Equation 2.39.

Two methods of reducing the amount of computations for the forwardstepwise part are presented in [7]. These include:

• Introducing an interaction limit. This limits the interaction of basisfunctions to a certain positive integer. Basis functions in B that haveinteraction equal to the interaction limit will not be considered to forma factor in new basis functions.

• Introducing a cutoff count and a priority queue for which basis func-tions B to consider. The basis functions in B are ordered in a de-scending order according to their latest computed improval of fit - thereduction in LOF. The algorithm then only considers the basis func-tions at the top of the priority queue. The cutoff count is the numberof basis functions considered.

• Ignoring categorical parameters with many subsets.

• Only considering some of the points xij for quantitative parametersxj as possible knot location for hinge-functions. The points should beselected so that they are evenly distributed over the range of values ofthe variable.

2.3 Artificial neural network

Artificial neural networks (ANNs) are systems of computation in the field ofAI that can be used for regression and classification. Inspired by biologicalneural networks, which exists in brains, artificial neural networks have aset of artificial neurons distributed in different layers. There is one inputlayer and one output layer. Between them, there can also be a number of


hidden layers. Signals are passed from the input layers through the net-work to the output layer. The neurons are connected by synapses, which allhave synaptic weights that signifies how the transported value is amplifiedbetween the two neurons connected by the synapse. The neural networkcan learn by adjusting the synaptic weights to produce certain outputs de-pending on the inputs. This section describes the type of Artificial neuralnetwork considered in this thesis.

2.3.1 Artificial neuron

A neuron that has n input synapses receives input values x1, x2, ..., xn andproduces one output value. The procedure is described in Figure 2.2.

Figure 2.2: An artificial neuron

The following components are used by the neuron:

1. Inputs x1, x2, ..., xn. There is one input from input synapse - one inputfrom every neuron in the previous neurons layer.

2. Synapse weights w1, w2, ..., wn. The weights of the input synapses.

3. Total net input is the weighted sum of all the inputs, according to

u =n∑i=1

wixi. (2.47)

4. Output value y. The total net input is transformed using an activationfunction g to produce the output, according to

y = g(u). (2.48)

The output of the neuron can then be defined as

y(x1, x2, ..., xn) = g

(n∑i=1

wixi

). (2.49)


Activation function

The activation function g : R 7→ R, is a function that transforms the total netinput. Determining the activation function of an Artificial neural network(ANN) is directly linked to the performance of the network. Non-linear ac-tivation functions give ANN:s their ability to work with non-linear patterns.Some common activation functions are

• Logistic function g(u) = σ(u) = 11+e−u

• Hyperbolic tangent g(u) = tanh(u) = eu−e−u

eu+e−u

• Identity function g(u) = u

• Rectified linear unit (ReLU) g(u) =

{u for x ≥ 00 for x < 0

• Leaky rectified linear unit (L-ReLU) g(u) =

{u for x ≥ 0

0.01u for x < 0

More information about activation functions can be found in Appendix E.

2.3.2 Artificial neural network

The type of ANNs considered in this thesis are called Feed forward artificialneural network. The neurons are organized into layers. All neurons in thehidden layers and output layer are called receiving neurons. All receivingneurons are connected through synapses to every neuron in the previouslayer so that the outputs of a layer are the inputs of the next layers, as canbe seen in Figure 2.3. Every synapse has a specific weight associated withit. Furthermore, all neurons in the hidden layers and output layer have atransfer function associated with it. Is is possible to have different transferfunctions for different neurons in the same ANN, however this thesis onlyconsiders one transfer function for all hidden layer neurons and one transferfunction for the output layer neurons. This concept is explained further insubsection 2.3.4.

Bias neuron

It may be beneficial to add a constant term to the ANN. This is done byadding a neuron with a constant output, called bias neuron. All neurons inhidden layers and output layers are connected to the bias neuron by regularsynapses. This allows the weight of the constant term to be adjusted in thesame way as other weights in the ANN. The output value of the bias neuronis usually set to yb = 1. An example of an ANN with a bias neuron can beseen in Figure 2.4.


Figure 2.3: Example of an artificial neural network with 3 input values, 1output value and one hidden layer with 4 neurons. The arrows representsynapses. Every synapses has an individual weight w associated with it.

2.3.3 Input

The input values of an ANN can be normalized as explained in subsec-tion 2.1.2. According to [8], this increases the computational efficiency ofthe learning process.

Categorical input variables are managed using the dummy variable ap-proach explained in subsection 2.1.5.

2.3.4 Output

Many activation functions have a limited codomain. For example:

• The codomain of the logistic function σ(u) = 11+e−u is described by

0 < σ(u) < 1 ∀ u ∈ R.

• The codomain of the hyperbolic tangent function tanh(u) is describedby −1 < tanh(u) < 1 ∀ u ∈ R.

The output values of a pattern that the ANN is to learn must be in thecodomain of the transfer functions of the output layer. To achieve this, 2methods are considered:


Figure 2.4: Example of an artificial neural network that features a biasneuron.

1. Normalizing the output data so it is distributed in a range compatiblewith the codomain of the output transfer functions. The normalizingis performed as described in subsection 2.1.2.

2. Setting the transfer function of the output layer neurons to some func-tion g(x) that has a codomain that includes the range of the outputvalues. The identity function g(u) = u can conveniently be used, asits codomain is R.

2.3.5 Learning

The learning process of an ANN is done using the backpropagation algorithm(BPA). Using this algorithm, the network adapts to the training data. Thisis done by adjusting the synaptic weights

Assume an ANN created for a set of data consisting of nin input variablesand nout output variables, and which contains m observations. Let vi ∈ Rnin

and oi ∈ Rnout be the input and output vectors for the i:th observation.


There are therefore nin neurons in the input layer and nout neurons in theoutput layer. There are NH hidden layers. The number of neurons in thehidden layers are nhi, 1 ≥ i ≥ NH . Total number of layers is N = NH + 2.The number of neurons in all layers are described by the set

n = {nin, nh1, nh2, ..., nhNH, nout} . (2.50)

The activation functions for the hidden layer neurons and output layer neu-rons are gh(u) and go(u) respectively. Furthermore, assume gh(u) and go(u)are C0 or a higher level of differentiability.

The BPA can be summarized by the following stages:

1. Forward pass - The procedure of sending values through the ANN toobtain an estimate of the output values and a metric of the predictionerror.

2. Backward pass - The process of finding the influence of the synapticweights on the prediction error.

3. Updating synaptic weights using the Gradient descent algorithm.

For this subsection, the following notation is introduced:

• LH = {2, 3, ..., N − 1} is the collection of hidden layers. L = 1 repre-sents the input layer and L = N represents the output layer.

• n(L) is the number of neurons in layer L, indexed as the set in Equa-tion 2.50.

• y(L) is a vector consisting of the current output values of the neurons

in layer L. Consequentially, y(L)j is the output value of the j:th neuron

in layer L.

• oi are the output values generated by the ANN for the i:th observation.That is, the estimations of oi made with the input vi.

• u(L)j is the total net input of the j:th neuron in layer L.

• wLjk is the synaptic weight of the synapse from neuron j in layer L− 1to neuron k in layer L.

• If the ANN features a bias neuron, wLbk is the synaptic weight of thesynapse from the bias neuron to neuron k in layer L.


Cost function

A cost function is defined to represent the error of prediction. The BPA isused to minimize the cost function. The cost function used in this thesis isthe Quadratic cost function.

Assume that the ANN has been used to predict the i:th observationwhere the target values are oi ∈ Rnout and the predicted values are oi ∈Rnout . The quadratic cost function is defined as

φ(oi, oi) =1

2

nout∑k=1

(oik − oik)2. (2.51)

Forward pass

The forward pass is the process of sending values through the ANN formthe input layer to the output layer. Before the forward pass, an input vectorvi is assigned to the input neurons, so that

y(1) := vi.

Thereafter, all receiving neuron layers compute their output values y(L)j

according to Equation 2.49, sequentially, starting with the first receivinglayer, i.e. layer L = 2. At this point, the following holds for the input andoutput vectors:

oi = y(N). (2.52)

The estimations for the i:th observation is thereby obtained, and predictionerror φ(oi, oi) can be calculated using Equation 2.51.

Backward pass

The objective of the backward pass is to find the partial derivative

∂φ

∂wLjk

for all synaptic weights. To find these partial derivatives, the chain rule isapplied. The derivatives are computed sequentially, starting with the finallayer. The formulation of the chain rule product used for all synaptic weightsis

∂φ

∂wLjk=

∂φ

∂y(L)k

·∂y

(L)k

∂u(L)k

·∂u

(L)k

∂wLjk. (2.53)

For the synapses leading into the output layer, L = N , Equation 2.53 yields

∂φ

∂wNjk=

∂φ

∂y(N)k

·∂y

(N)k

∂u(N)k

·∂u

(N)k

∂wNjk. (2.54)


For the first partial derivative, the following holds, as explained in Equa-tion 2.52:

∂φ

∂y(N)k

=∂φ

∂oik

∣∣∣∣oik=y

(N)k

. (2.55)

Differentiation yields

∂φ

∂oik=

∂

∂oik

[1

2

nout∑k=1

(oik − oik)2]

=∂

∂oik

[1

2(oik − oik)2

]= oik − oik

⇒ ∂φ

∂y(N)k

= y(N)k − oik. (2.56)

The second partial derivative is given by differentiating Equation 2.48, usingthe transfer function go(u). This yields:

∂y(N)k

∂u(N)k

=dgodu

∣∣∣∣u=u

(N)k

. (2.57)

The third partial derivative∂u

(N)k

∂wNjk

can be obtained by differentiating Equa-

tion 2.47, according to

∂u(N)k

∂wNjk=

∂

∂wNjk

[n∑i=1

wiy(N−1)j

]= y

(N−1)j (2.58)

where y(N−1)j is the output of the neuron in the preceding layer connected

sending its output through the current synapse.By considering Equation 2.56, Equation 2.57 and Equation 2.58, all par-

tial derivatives of Equation 2.54 are known and the procedure can be usedto calculate

∂φ

∂wLjk

for L = N .For partial derivatives of synaptic weights leading into a hidden layer,

i.e. for L ∈ LH , Equation 2.59 becomes

∂φ

∂wLjk=

∂φ

∂y(L)k

·∂y

(L)k

∂u(L)k

·∂u

(L)k

∂wLjk∀ L ∈ LH . (2.59)

The second and third partial derivatives of the above product are obtainedin a manner consistent with the output layer,

∂y(L)k

∂u(L)k

=dghdu

∣∣∣∣u=u

(N)k

(2.60)


and∂u

(L)k

∂wLjk=

∂

∂wLjk

[n∑i=1

wiy(L−1)j

]= y

(N−1)j (2.61)

for all L ∈ LH . Because the backward pass starts with the last layer, allpartial derivatives for layer L+ 1 are known. This can be used to calculatethe partial derivative ∂φ

∂y(L)k

∀ L ∈ LH :

∂φ

∂y(L)k

=n(L+1)∑l=1

w(L+1)kl · ∂φ

∂u(L+1)l

=n(L+1)∑l=1

w(L+1)kl · ∂φ

∂y(L+1)l

·∂y

(L+1)l

∂u(L+1)l

∀ L ∈ LH , (2.62)

where n(L+1) is the number of neurons in layer L+1. The partial derivatives∂φ

∂y(L+1)l

and∂y

(L+1)l

∂u(L+1)l

have previously been obtained and Equation 2.62 can be

computed.By considering Equation 2.62, Equation 2.60 and Equation 2.61, all par-

tial derivatives of Equation 2.59 are known and the procedure can be usedto calculate

∂φ

∂wLjk

for all L ∈ LH .If the ANN has a bias neuron, the synaptic weights wLbk are updated in

the same way as the synaptic weights wLjk.

Gradient descent

After the backward pass is completed, all partial derivatives are known.Gradient descent is used to update the synaptic weights. The method isexplained in detail in [9]. The weights are updated according to

w(L)jk := w

(L)jk − η∆w

(L)jk , where ∆w

(L)jk =

∂φ

∂wLjk(2.63)

for all L, j and k. The value η is called step size, and is central to thegradient descent method. Specifically for ANNs, η is usually called learningrate. The choice of learning rate greatly affects training performance - Ahigh value may result in divergence and failure of the algorithm, whereas alow value may result in entrapment in local minima or slow learning.


2.3.6 Batch learning

During training, the observations in the training data may be used multipletimes. Each complete cycle of the entire training data through the BPA iscalled a training epoch.

The BPA can also be defined so that the forward pass and backwardpass for several observations in the training data are performed before thesynaptic weights are updated. This is called batch learning. Let γ ≥ 1be the batch size. During batch learning, a sample of γ observations arerandomly selected from the training data. The forward pass and backwardpass are performed for the observations. The weights are then updated using

gradient descent as defined in Equation 2.63, but instead letting ∆w(L)jk be

the average of ∂φ∂wL

jk

for observations in the batch.

Furthermore, two special cases of batch learning are described in [9].These are

• Online learning. γ = 1 - The weights are updated after every iterationof forward pass and backward. pass, i.e. m times per training epoch.

• Offline learning. γ = m, where m is the number of samples in thetraining data - The weights are updated once per training epoch.

2.3.7 Evaluating observations

When an ANN has been trained, observations can quickly be evaluated.Consider a new observations with input vector v0. An estimation o0 of theoutput can be obtained using the forward pass, explained above.

26

Chapter 3

Data

This chapter describes parameters considered as variables in the models.The variables introduced here are candidates for being in the final estimationmodels. Therefore, a variable introduced here will not necessarily be usedin the final models.

The data gathering shall result in a database where the considered vari-ables can be obtained using vehicle Chassis number .

The first two sections of this chapter describes variables considered asinput for the models, i.e. variables that might be used to predict the fuelconsumption. The third section describes the two output variables.

Two categories of variables are considered. Vehicle data and operationaldata. Operational data variables are not present in estimation of VECTOFC.

3.1 Vehicle parameters

Vehicle parameters are variables that are part of, or can be derived from,the vehicle specification. The variables can be determined from the vehicleitself, regardless of driving patterns and other operational data. Most ofthe variables can be directly obtained from vehicle specification and vehicledata.

• Wheel configuration cwc - A string representing the total numberof wheels, number and location of driven wheels and the number oflocation of supporting wheel axles.

• Chassis adaption cca- Whether the vehicle is a tractor or rigid.

• Rolling resistance coefficient RRC - Ratio between the rollingresistance force and normal force. Assumed to be temperature- andspeed invariant. Two different values of rolling resistance are consid-ered:

CHAPTER 3. DATA 27

– RRCV ECTO - Computed as a weighted average of the rolling resis-tance of different wheel axles. The average is weighted accordingto how the vehicle mass is distributed over the axles.

– RRCV O - The sum of the rolling resistance of all tires.

• Air drag factor CdA - Product of the air drag coefficient Cd andvehicle front area A. The total drag force of a vehicle is

D = CdA1

2ρv2 (3.1)

where ρ is the density of the air and v is the speed of the vehicle, asexplained in [10]. Therefore, CdA can be considered a measure of theair drag properties of the vehicle itself. Two different values of CdAare considered:

– [CdA]V ECTO - Calculated from data resulting from tests in a testtrack of entire vehicles, featuring combinations of chassis adap-tion, wheel configuration, cab dimensions and configuration of airdeflectors. The tests have been done with a reference trailer.

– [CdA]V O - Weighted sum of air drag contributions from differentvehicle parts. No trailer is included.

• Total cruising powertrain ratio PTRcruise- The relationship be-tween the engine Revolutions Per Minute (RPM) and the vehicle speedfor the highest gear. It is computed using

PTRcruise =imaxgear · irearaxle

r2wheelπ(3.2)

where imaxgear is the gearbox ratio of the highest gear, irearaxle is theratio of the rear axle transmission and rwheel is the wheel radius.

• Curb mass mcurb - Defined according to the in accordance with thedefinition in Swedish legislation [11], i.e. the mass of the vehicle withstandard equipment such as spare wheel, necessary consumables suchmotor oil and coolant, full tank of fuel and a 75 kg driver. No traileris included.

• Engine displacement Ve - The volume of the swept volume of allpistons in cylinders of the engine.

• Engine model ceng.

• Gearbox model cgbx.

• Rear axle model crax.

CHAPTER 3. DATA 28

• HDV CO2 vehicle class cco2 - Combination of wheel configurationand chassis adaption. The classification is done according to Table D.2in Appendix D.

• Country ccountry - The country is which the vehicle was registeredwhen it was initially purchased.

• Transmission type ctrans - Type of gearbox, can be:

– Manual transmission (MT)

– Automatic transmission (AT)

– Automatic manual transmission (AMT).

The considered vehicle parameters, introduced above, are listed in Ta-ble 3.1.

Table 3.1: Vehicle parameters considered as potential input for the estima-tion models.

Parameter Type

Name Symbol Quant. Cat.Rolling resistance coefficient (VECTO) RRCV ECTO ×Rolling resistance coefficient (Vehicle Optimizer (VO)) RRCV O ×Air drag factor (VECTO) [CdA]V ECTO ×Air drag factor (VO) [CdA]V O ×Total cruising powertrain ratio PTRcruise ×Curb mass mcurb ×Engine displacement Ve ×Engine model ceng ×Gearbox model cgbx ×Rear axle model crax ×Wheel configuration cwc ×Chassis adaption cca ×HDV CO2 vehicle class cco2 ×Country ccountry ×Transmission type ctrans ×

3.2 Operational parameters

Operational parameters describe operational usage of a vehicle, collectedfrom sensors on board. Variables introduced here are only considered forestimation of real world fuel consumption in estimation scenario C.

CHAPTER 3. DATA 29

3.2.1 Bin data

Operational information that in reality are distributions over continuousintervals are stored as bin data, where the continuous interval are discretized,i.e. split into sub-intervals, each associated with a sum of the data containedin that sub-interval. From this data, it is not possible to find how dataare distributed inside each sub-interval. This thesis considers three suchdata distributions, which are converted into a virtual form, where each sub-interval are assumed to correspond to one number instead of an interval,as explained below. Most assumed values are the average of the originalinterval.

• Vehicle velocity distribution stored as a vector of how long dis-tance a vehicle has driven in different speed intervals. Table C.1 de-scribes the velocity assumed for each sub-interval. Thus, a vector ofvirtual velocity distribution is obtained.

• Road gradient distribution is the distribution of road slope for avehicle. This is stored as a vector of how long distance a vehicle hasdriven in different intervals of road slope. A vector of virtual roadgradient is obtained by the slope assumed as described in Table C.2.

• Ambient temperature distribution is the distribution of the tem-perature of the surrounding air. This is stored as a vector consisting ofthe fraction of how much time is spent in different temperature inter-vals. A vector of virtual temperature is obtained by the temperatureassumed as described in Table C.3. The temperature intervals ob-tained from the operational data database are not mutually exclusive,i.e. they coincide. Still, the virtual temperature is obtained in thesame way as the other virtual variables, and the virtual temperatureshould still explain the overall ambient temperature, so that a highvirtual temperature will still signify a higher overall temperature.

It is then possible to obtain the mean µvirt, variance Varvirt and stan-dard deviation σvirt of these virtual distributions, according to the belowequations. Assume a set of intervals K = {k1, k2, ..., kn} where dk is thedistance travelled in interval k in K and rk is the assumed virtual value ofof interval k in K, then

µvirt(K) =

∑k∈K dkrk∑k∈K dk

, (3.3)

Varvirt(K) =

∑k∈K dk(rk − µ)2∑

k∈K dk, (3.4)

σvirt(K) =√

Varvirt(K). (3.5)

CHAPTER 3. DATA 30

3.2.2 Considered variables

The following list describes variables that can be attained from operationaldata, in addition to vehicle data.

• Average GTW mgtw - Gross train weight (GTW) is the mass of thevehicle including trailers and payload, at a given time. This parameteris the mean GTW.

• Average mass of payload and trailer mpayload - Since the averageGTW mgtw can be obtained, and the curb mass mcurb is known forevery vehicle the average weight mpl of the payload and trailers canbe attained from

mpl = mgtw −mcurb. (3.6)

• Average speed vavg - The average speed when driving. Idling timeis not included when this variable is calculated.

• Average break frequency ξbreak - The average spatial frequencyof break starts, i.e. how many breaks occur on average for a givendistance of travelling.

• Stop frequency ξstop - The average spatial frequency of the vehiclevelocity reaching 0 km/h.

• Velocity variation - A measure how how much the velocity variesfor a vehicle. A high value should signify that a vehicle is driven a lotin both high and low velocity. For example:

– A truck used for long haulage that spends the majority of its timeand distance driving on motorways, should presumably have alower variation of speed.

– A garbage truck would probably have a high variation in speed.

Two different models for variation of speed are considered, both basedon the virtual velocity introduced in subsection 3.2.1:

– vvirt,var - Virtual velocity variance, obtained according to Equa-tion 3.4.

– vvirt,std - Virtual velocity standard deviation, obtained accordingto Equation 3.5.

• Topography - A measure of the hillyness of where the vehicle isdriving. Three different formulations of topography are considered:

– Topography classification ctop - Scania’s classification of topogra-phy. There are three classes:

CHAPTER 3. DATA 31

∗ Flat

∗ Hilly

∗ Very Hilly.

The classification procedure is not explained here due to confi-dentiality.

– Virtual road gradient average absolute value φvirt,mean,abs. Com-puted according to

φvirt,mean,abs =

∑k∈K dk|φk|∑k∈K dk

(3.7)

where K is the set of road gradient intervals as given in Table C.2,dk is the distance travelled with a slope in interval k and φk isthe assumed slope for interval k.

– Virtual road gradient average square φvirt,mean,sq - Computed ac-cording to

φvirt,mean,sq =

∑k∈K dkφ

2k∑

k∈K dk. (3.8)

• Idling - Idling is when the vehicle is standing still with the enginerunning. Furthermore, many vehicles are fitted with a device for Powertake-off (PTO), which allows the vehicle’s engine to power externalmachinery, for example a concrete mixer. The engine consumes fuelwhen idling, and may consume significantly more fuel when idling withPTO. The amount of idling with and without PTO can be obtainedseparately, as a fraction of the time when the vehicle’s ignition is on.The following three variables are considered:

– Idling with PTO τidling,pto.

– Idling without PTO τidling.

– Total idling τidling,tot = τidling,pto + τidling.

• Ambient temperature - The mean of the virtual temperature, Tvirt,meanis obtained in accordance with Equation 3.3.

• Cruise control usage τcc - The fraction of the driving time whencruise control is activated.

The considered operational parameters, introduced above, are listed in Ta-ble 3.2.

CHAPTER 3. DATA 32

Table 3.2: Operational parameters considered as potential input for theestimation models

Parameter Type

Name Symbol Quant. Cat.Average GTW mgtw ×Average mass of payload and trailer mpayload ×Average speed vavg ×Average break frequency ξbreak ×Stop frequency ξstop ×Virtual velocity variance vvirt,var ×Virtual velocity standard deviation vvirt,std ×Topopgraphy classification ctop ×Virtual road gradient average absolute value φvirt,mean,abs ×Virtual road gradient average square φvirt,mean,sq ×Idling with PTO τidling,pto ×Idling without PTO τidling ×Total idling τidling,tot ×Virtual temperature mean Tvirt,mean ×Cruise control usage τcc ×

3.3 Output data

The output value of the models should be some measure of fuel consumption,which in this thesis is the fuel consumption per unit of length. The unit usedis

l

100km.

Data for the output variables must be obtained, so that the estimationmodels can be trained. Data for the following two variables are required:

• VECTO fuel consumption FCV ECTO, used in estimation scenario A -Data is obtained from previous completed VECTO simulations. Datais only gathered for the long haulage VECTO-cycle with referencepayload. This data has a sample standard deviation corresponding to8% of its mean value.

• Real world fuel consumption FCReal, used in estimation scenario C -Data is obtained from the operational data database. This data has astandard deviation corresponding to 17% of its mean value.

3.4 Data gathering

Data for estimation scenario A, B and C are obtained from different sources.The procedure of gathering data are described below

CHAPTER 3. DATA 33

3.4.1 Estimation scenario A

Data for estimation modelling of VECTO variables are obtained from pre-vious VECTO-simulation input and output data. Data is obtained for thelong haulage cycle with reference payload, for trucks produced in 2017.

3.4.2 Estimation scenario B and C

For real world fuel consumption, only chassis numbers of the - at the time ofwriting - newest truck generation, called development level 6 , are considered.Furthermore, only trucks that run on diesel are considered. To improvevalidity of operational data, only vehicles that have driven for at least 20000 km are considered. VECTO input parameters are being considered asinput and must therefore be available for the chassis numbers. Thus, datais gathered for

• Trucks of development level 6.

• Produced in 2016-2017.

• Mileage of at least 20 000 km.

• Input data for VECTO simulation are available.

Filtering

For many chassis numbers, some of the operational data parameters aremissing. These chassis numbers are removed from the data set.

Furthermore, to counteract the usage of erroneous data, operational datais filtered to remove some of the outliers for some of the quantitative vari-ables. This is done by removing chassis numbers corresponding to lower andupper quantiles of the data. Information about the filtering can be viewedin Table D.1 in Appendix D.

34

Chapter 4

Method

This chapter explains how different models are created and how input pa-rameters, among those introduced in chapter 3, are selected. Models arecreated separately for estimation scenario A, B and C.

4.1 Validation and testing

The data sets resulting from the procedures described in section 3.4 are

• Data for estimation scenario A, 12243 vehicles.

• Data for estimation scenario B and estimation scenario C, 28255 ve-hciles.

Both data sets are split according to:

• Training data 70% - used for training of estimation models.

• Testing data 20% - used for evaluating performance of different mod-els and selecting the best model.

• Verification data 10% - used for verifying the performance of thebest model.

The splitting is done randomly to reduce interference from potential patternsin the data. The training data set is used to create regressional models byimplementation of different procedures described in chapter 2. Thereafter,the models are evaluated using all items in the testing data set. The purposeof evaluating the models using a different data set than the one used fortraining is to reduce overfitting. The model that performs best with respectto the testing data is selected as the best model. The verification data willbe used to evaluate the performance of that model. The same data sets areused for estimation scenario B and C.

CHAPTER 4. METHOD 35

4.2 Models

A range of different models will be considered. The models are denoted withnumbers in the range 1.1 to 5.4. Every model is tested for each estimationscenario. An example of how the models are denoted is Model A1.1 - Model1.1 for estimation scenario A.

This section introduces the considered models. For some models, multi-ple settings configurations are considered. All possible combinations of theconsidered settings are tested, and the best setting is determined by thelowest RMSErel.

4.2.1 Linear regression

Three linear regression models are considered:

• Model 1.1 - Linear regression using dummy variables to representcategorical data.

• Model 1.2 - Linear regression using exclusivity for all categoricalvariables.

• Model 1.3 - Linear regression using exclusivity Engine model andusing dummy variables to represent remaining categorical variables.

4.2.2 KNN regression

The following models are considered:

• Model 2.1 - KNN-regression with penalized distance for all categor-ical variables.

• Model 2.2 - KNN-regression with exclusivity for all categorical vari-ables.

• Model 2.3 - KNN-regression with exclusivity for Engine model andpenalized distance for all other categorical variables.

The penalized distance for all categorical variables are set to 1. Furthermore,the following values are considered for the parameters K:

Table 4.1: Considered parameters for models 2.1-2.3

Parameter Values

K 1,2,3,4,6,10,15


4.2.3 KNN Regression with inverse distance weighting

The following models are considered:

• Model 3.1 - KNN-regression with IDW and penalized distance for allcategorical variables.

• Model 3.2 - KNN-regression with IDW and exclusivity for all cate-gorical variables.

• Model 3.3 - KNN-regression with IDW, exclusivity for Engine modeland penalized distance for all other categorical variables.

For the above models, the following values are considered for the parametersK and u:

Table 4.2: Considered parameters for models 3.1-3.3

Parameter Values

K 1,2,3,4,6,10,15

u 1,2,5,10

4.2.4 Multivariate adaptive regression splines (MARS)

MARS is implemented using two of the optimization methods described insubsection 2.2.3, to reduce computation time. These methods are: intro-ducing an interaction limit, introducing a cutoff count, introducing a limitfor how many quantitative points to consider as knot location and intro-ducing a limit for the number of levels of categorical variables. Even withthese optimization methods, MARS takes a long time. Therefore, only afew combinations of parameters are considered:

• Model 4.1 - MARS-regression with Mmax = 40, interaction limit 1and categorical level limit 12.




• Model 4.5 - MARS-regression with Mmax = 80, interaction limit 3and categorical level limit 0, i.e. no categorical variables are consid-ered.


Furthermore, the following setting parameters are used for all above MARSmodels.

Table 4.3: Consdered parameters for models 4

Parameter Values

Cutoff count 10

Considered points as knot location 50

4.2.5 Artificial neural network

The following ANN:s are implemented:

• Model 5.1 - ANN without bias neuron and with no output normal-izing.

• Model 5.2 - ANN with bias neuron and with no output normalizing.

• Model 5.3 - ANN without bias neuron and with output normalizing.

• Model 5.4 - ANN with bias neuron and with output normalizing.

Multiple configurations of hidden layers, transfer functions gh(u) andgo(u) are tested. These are listed in Table 4.4, Table 4.5 and Table 4.6. Allpossible combinations of these configurations are evaluated.

Training of ANNs uses batch learning with batch size γ = 10. For everyconfiguration, training proceeds in such a way that the initial learning rateis set to

η := 1.

Training is then attempted using the procedures described in subsection 2.3.5.An increasing average prediction error between epochs signifies that thelearning rate is too high. Hence, if the average error increases, the learningrate is adjusted

η := 0.5η (4.1)

and training restarts. This procedure is repeated until training continues for50 epochs without the error increasing between epochs. Training more than50 epochs may improve performance of the ANN. However, training of anANN requires a lot of computational power, and therefore 50 epochs is usedas limit for finding the best configuration. When this is determined, trainingof the network is performed for more epochs to improve the ANN. In thisthesis, the selected configuration is used in training with up to 500 epochs.The ANN are evaluated multiple times during the procedure, using thetesting data set. This is because too many epochs may result in overfitting.The amount of epochs resulting in the lowest RMSErel for the testing data,is again selected.


Table 4.4: Number of hidden layers (HLs) and number of neurons in respec-tive hidden layer

Hidden layer(s) Neurons (HL1) Neurons (HL2)

1 5 -

1 10 -

1 20 -

1 30 -

2 5 5

2 10 10

2 20 20

2 30 30

Table 4.5: Transfer functions for hidden layer neurons

Activation function

Hyperbolic tangent (tanh)

Rectified linear unit (ReLU)

Leaky rectified linear unit (L-ReLU)

Table 4.6: Transfer functions for output layer neurons

Activation function

Identity function



4.3 Choice of input parameters

This section describes the set of input parameter chosen from those intro-duced in chapter 3.

4.3.1 Estimation scenario A - VECTO fuel consumption

The input data of the this model should be vehicle parameters. The choice ofparameters for estimating VECTO fuel consumption are selected accordingto those described in [1], with some changes that were found to improve theresults. The chosen parameters are listed in Table 4.7. Table 4.8 displaysthe number of levels for the categorical parameters. Correlation coefficientsfor the selected quantitative parameters can be seen in Table 4.9.


Table 4.7: Input variables in estimation scenario A

Parameter name Symbol Type

Quant. Cat.Rolling resistance coefficient (VECTO) RRCV ECTO ×Air drag factor (VECTO) [CdA]V ECTO ×Total cruising powertrain ratio PTRcruise ×Curb mass mcurb ×Engine displacement Ve ×Engine model ceng ×Gearbox model cgbx ×Rear axle model crax ×HDV CO2 vehicle class cco2 ×

Table 4.8: Categorical variables in estimation scenario A and their numberof levels

Parameter name Symbol Levels

Engine model ceng 11

Rear axle model crax 5

HDV CO2 vehicle class cco2 2

Table 4.9: Correlation between selected quantitative parameters andFCV ECTO, displaying the pearson correlation coefficient r and r2. Thetable is sorted by r2

Parameter Correlation

Name Symbol r r2

Air drag factor (VECTO) RRCV ECTO 0.694838 0.482800Engine displacement Ve 0.612359 0.374983Rolling resistance coefficient (VECTO) RRCV ECTO 0.448840 0.201458Curb mass mcurb 0.418463 0.175111Total cruising powertrain ratio PTRcruise 0.004119 1.70·10−5


4.4 Estimation scenario B and C - Real world fuelconsumption

The input parameters estimation scenario C are parameters resulting fromboth vehicle information and operational information. All parameters intro-duced in chapter 3 are considered. The pearson correlation coefficient, r, isobtained between each quantitative parameter and FCReal. Then r2 is usedto get a measure of how correlated the variables are. The correlation valuesare listed in Table 4.10.

Table 4.10: Correlation between quantitative parameters and FCReal, dis-playing the pearson correlation coefficient r and r2. The table is sorted byr2

Parameter Correlation

Name Symbol r r2

Engine displacement Ve 0.503785 0.253800Total idling τidling.tot 0.458542 0.210261Rolling resistance coefficient (VO) RRCV O 0.443529 0.196718Average GTW mgtw 0.418234 0.174920Virtual road gradient average square φvirt.mean.sq 0.408120 0.166562Virtual road gradient average abs. value φvirt.mean.abs 0.393723 0.155018Average mass of payload and trailer mpayload 0.380182 0.144538Idling with PTO (%) τidling.pto 0.354579 0.125726Virtual velocity variance vvirt.var 0.341609 0.116697Average break frequency εbreak 0.339337 0.115150Virtual velocity standard deviation vvirt.std 0.339023 0.114937Average speed vavg -0.33614 0.112993Rolling resistance coefficient (VECTO) RRCV ECTO 0.326117 0.106352Curb mass mcurb 0.324734 0.105452Cruise control usage (%) τcc -0.32323 0.104478Idling without power take off (%) τidling 0.286038 0.081817Stop frequency per 100 km (stop/100km) εstop 0.262033 0.068661Air drag factor (VECTO) [CdA]V ECTO 0.179994 0.032398Air drag factor (VO) [CdA]V O 0.111580 0.012450Virtual temp mean Tvirt.mean 0.077307 0.005976Total cruising powertrain ratio PTRcruise 0.008055 6.49·10−5

The variables selected as input for estimation scenario C are listed in Ta-ble 4.12. These are obtained by a combination by both considering the cor-relation values listed in Table 4.10, and by iterativelly adding and removingparameters to regressional models to evaluate how the mean error RMSErel,for the testing data, is affected. Some of the considered variables that arenot selected does yield a small improvement in estimation accuracy. How-ever, they were discarded because the improvement was deemed to small, inrelation to the increase in model complexity resulting from adding more pa-


rameters. Estimation scenario B uses vehicle data to predict real world fuelconsumption. The choice of parameters are therefore the vehicle parametersused for estimation scenario C. These are listed in Table 4.11. Table 4.13displays the number of levels for the categorical parameters considered inboth estimation scenarios for real world FC.

Table 4.11: Input variables in estimation scenario B

Parameter Type

Name Symbol Quant. Cat.Rolling resistance coefficient (VO) RRCV O ×Air drag factor (VECTO) [CdA]V ECTO ×Total cruising powertrain ratio PTRcruise ×Curb mass mcurb ×Engine displacement Ve ×Engine model ceng ×Gearbox model cgbx ×HDV CO2 vehicle class cco2 ×

Table 4.12: Input variables in estimation scenario C

Parameter Type

Name Symbol Quant. Cat.Rolling resistance coefficient (VO) RRCV O ×Air drag factor (VECTO) [CdA]V ECTO ×Total cruising powertrain ratio PTRcruise ×Curb mass mcurb ×Engine displacement Ve ×Engine model ceng ×Gearbox model cgbx ×HDV CO2 vehicle class cco2 ×Average mass of payload and trailer mpayload ×Average speed vavg ×Average break frequency ξbreak ×Virtual velocity variance vvirt,var ×Virtual road gradient average abs. value φvirt,mean,abs ×Idling with PTO τidling,pto ×Total idling τidling,tot ×Cruise control usage τcc ×

Table 4.13 shows the number of levels for the categorical variables for


this estimation scenario.

Table 4.13: Categorical variables in estimation scenario B and estimationscenario C and their number of levels

Parameter name Symbol Levels

Engine model ceng 17

Gearbox model cgbx 12

HDV CO2 vehicle class cco2 2

4.5 Selection of model

For every estimation scenario, the models are compared in terms ofRMSErel,quantiles and the amount of failed predictions, with respect to the testingdata sets. A final model is then selected for each estimation scenario. Theverification data sets are used to evaluate the performance of the selectedmodels.

43

Chapter 5

Results

This chapter presents the results for each estimation scenario separately,in three sections. All three section contain a table describing the selectedconfigurations for every model, selected by having the lowest RMSErel forthe testing data. Results for all configurations can be seen in Appendix F.Each section also contain a table showing RMSE, RMSErel and quantilesfor testing results with the selected configurations. Estimations may fail forsome of the models, due to lack of training data. This is especially true formodels implementing exclusivity for categorical variables, resulting in somesubsets being too small for that particular model to function. Therefore, acolumn Failure, which describes the amount of failed estimations, are alsoincluded in the tables.

5.1 Estimation scenario A - VECTO fuel consump-tion

This section presents the result of the considered models for estimation sce-nario A. Table 5.1 shows the selected configuration for each model, and Ta-ble 5.2 shows results for each model with the selected configuration. Resultsfor all considered configurations can be seen in the Appendix, in section F.1.It should be noted that linear regression performs better than KNN regres-sion, KNN-IDW regression and MARS, which is different from the resultspresented in [1].

5.1.1 Selected model

Model A5.4 is selected for estimation scenario A. The results for the verifi-cation data set is shown in Table 5.3.

CHAPTER 5. RESULTS 44

Table 5.1: Selected configuration for models A1-5 and correspondingRMSErel with the testing data.

Model Selected configuration RMSErel

A1.1 N/A 0.42 %

A1.2 N/A > 100%

A1.3 N/A 0.37%

A2.1 K = 2 0.62%

A2.2 K = 1 1.00%

A2.3 K = 2 0.67%

A3.1 K = 4, u = 2 0.57 %

A3.2 K = 1, u = 1 1.00 %

A3.3 K = 6, u = 5 0.63 %

A4.1 N/A 0.48 %

A4.2 N/A 14.03 %

A4.3 N/A 0.50 %

A4.4 N/A 13.23 %

A4.5 N/A 23.47 %

A5.1 1 hidden layer, 30 neurons, tanh, ReLU 0.39%

A5.2 1 hidden layer, 20 neurons, tanh, ReLU 0.37%

A5.3 2 hidden layer, 10:10 neurons, L-ReLU, Identity 0.37%

A5.4 1 hidden layer, 20 neurons, ReLU, Identity 0.33%


Table 5.2: Results for models A1-5 with selected configurations. The resultsare for the chassis in the testing data set.

Model Failures RMSE Quantiles

Linear Abs. Rel. 95% 99% 99.9%A1.1 0.00% 0.138 0.42% 0.72% 1.23% 2.85%A1.2 0.94% 1.87·1013 > 100% 4.61% 7.64% > 100%A1.3 49.47% 0.117 0.37% 0.72% 0.88% 1.82%

KNNA2.1 0.00% 0.218 0.62% 1.02% 2.62% 5.95%A2.2 0.33% 0.368 1.00% 0.95% 2.78% 15.14%A2.3 0.00% 0.236 0.67% 1.03% 2.74% 7.10%

KNN-IDWA3.1 0.00% 0.202 0.57% 0.87% 2.34% 6.35%A3.2 0.33% 0.368 1.00% 0.95% 2.78% 15.14%A3.3 0.00% 0.226 0.63% 0.92% 2.76% 7.84%

MARSA4.1 0.00% 0.159 0.48% 0.80% 2.06% 3.33%A4.2 0.00% 5.063 14.03% 0.75% 1.25% 5.25%A4.3 0.00% 0.165 0.50% 0.79% 2.26% 4.11%A4.4 0.00% 4.776 13.23% 0.75% 2.20% 12.09%A4.5 0.00% 8.474 23.47% 1.33% 3.05% 20.57%

ANNA5.1 0.00% 0.127 0.39% 0.79% 1.00% 2.92%A5.2 0.00% 0.118 0.36% 0.75% 0.95% 3.28%A5.3 0.00% 0.107 0.33% 0.64% 0.84% 1.48%A5.4 0.00% 0.103 0.32% 0.65% 0.85% 1.32%

Table 5.3: Results for the selected model for estimation scenario A with theverification data


Abs. Rel. 95% 99% 99.9%A5.4 0.00% 0.114 0.36% 0.77% 0.94% 1.20%


5.2 Estimation scenario B - Real world fuel con-sumption using only vehicle parameters

This section presents the result of the considered models for estimation sce-nario B. Table 5.4 shows the selected configuration for each model, and Ta-ble 5.8 shows results for each model with the selected configuration. Resultsfor all considered configurations can be seen in the Appendix, in section F.2.

Table 5.4: Selected configuration for models B1-5 and correspondingRMSErel with the testing data.


B1.1 N/A 10.68%

B1.2 N/A > 100%

B1.3 N/A 11.13%

B2.1 K = 15 9,77%

B2.2 K = 15 9,80%

B2.3 K = 15 9.77%

B3.1 K = 15, u = 1 9.13%

B3.2 K = 15, u = 1 9.15%

B3.3 K = 15, u = 1 9.13%

B4.1 N/A 11.24%

B4.2 N/A 10.98%

B4.3 N/A 11.08%

B4.4 N/A 11.14%

B4.5 N/A 11.14%

B5.1 1 hidden layer, 20 neurons, tanh, L-ReLU 10.21%

B5.2 1 hidden layer, 30 neurons, tanh, ReLU 10.25%

B5.3 2 hidden layers, 20:20 neurons, tanh, identity 10.40%

B5.4 2 hidden layers, 30:30 neurons, tanh, identity 10.35%


Model B3.1 is selected for estimation scenario B. The results for the verifi-cation data set is shown in Table 5.6.


Table 5.5: Results for models B1-5 with selected configurations. The resultsare for the chassis in the testing data set

Model Failures RMSE Quantiles. for rel. error

Linear Abs. Rel. 95% 99% 99.9%B1.1 0.00% 3.70 10.68% 21.02% 30.11% 45.48%B1.2 1.17% 2.4·1014 > 100% 27.72% > 100% > 100%B1.3 33.55% 3.62 11.13% 21.96% 33.23% 48.13%

KNNB2.1 0.00% 3.47 9.77% 19.74% 30.07% 46.92%B2.2 0.12% 3.47 9.80% 19.77% 30.28% 47.62%B2.3 0.00% 3.47 9.77% 19.73% 30.07% 47.52%

KNN-IDWB3.1 0.00% 3.25 9.13% 19.18% 30.57% 50.57%B3.2 0.12% 3.25 9.15% 19.10% 30.60% 50.68%B3.3 0.00% 3.25 9.13% 19.16% 30.57% 50.57%

MARSB4.1 0.00% 3.90 11.24% 22.17% 31.48% 48.29%B4.2 0.00% 3.82 10.98% 21.60% 30.93% 48.45%B4.3 0.00% 3.87 11.08% 21.93% 30.86% 47.77%B4.4 0.00% 3.88 11.14% 22.03% 31.40% 48.20%B4.5 0.00% 3.88 11.14% 21.82% 31.83% 48.67%

ANNB5.1 0.00% 3.59 10.44% 20.65% 30.80% 46.90%B5.2 0.00% 3.65 10.21% 20.66% 30.53% 46.93%B5.3 0.00% 3.59 10.28% 20.84% 30.56% 46.43%B5.4 0.00% 3.59 10.30% 20.69% 30.85% 48.13%

Table 5.6: Results for the selected model for estimation scenario B with theverification data


Abs. Rel. 95% 99% 99.9%B3.1 0.00% 3.338 9.30% 19.50% 30.98% 50.00%


5.3 Estimation scenario C - Real world fuel con-sumption using vehicle and operational param-eters

This section presents the result of the considered models for estimation sce-nario C. Table 5.7 shows the selected configuration for each model, and Ta-ble 5.8 shows results for each model with the selected configuration. Resultsfor all considered configurations can be seen in the Appendix, in section F.3.

Table 5.7: Selected configuration for models C1-5 and correspondingRMSErel with the testing data.


C1.1 N/A 6.55 %

C1.2 N/A > 100%

C1.3 N/A 6.64%

C2.1 K = 6 6.94%

C2.2 K = 6 7.08%

C2.3 K = 6 6.96%

C3.1 K = 15, u = 5 6.60 %

C3.2 K = 15, u = 5 6.68 %

C3.3 K = 15, u = 5 6.62 %

C4.1 N/A 6.69%

C4.2 N/A 6.53%

C4.3 N/A 6.89%

C4.4 N/A 6.42%

C4.5 N/A 6.42%

C5.1 1 hidden layer, 30 neurons, ReLU, ReLU 6.01%

C5.2 2 hidden layers, 30:30 neurons, L-ReLU, L-ReLU 5.88%

C5.3 2 hidden layers, 30:30 neurons, L-ReLU, Identity 5.88%

C5.4 2 hidden layers, 20:20 neurons, ReLU, Identity 5.88%


Model C5.3 is selected for estimation scenario C. The results for the verifi-cation data set is shown in Table 5.9.


Table 5.8: Results for models C1-5 with selected configurations. The resultsare for the chassis in the testing data set


Linear Abs. Rel. 95% 99% 99.9%C1.1 0.00% 2.39 6.55% 12.12% 20.31% 49.24%C1.2 2.11% 3.48·1013 > 100% 17.05% 42.33% > 100%C1.3 33.55% 2.36 6,64% 12.14% 20.40% 49.65%

KNNC2.1 0.00% 2.59 6.94% 13.89% 21.33% 49.04%C2.2 0.12% 2.65 7.08% 13.89% 22.22% 50.33%C2.3 0.00% 2.60 6.96% 13.89% 21.33% 49.04%

KNN-IDWC3.1 0.00% 2.48 6.60% 12.94% 21.47% 48.04%C3.2 0.12% 2.53 6.68% 13.02% 21.59% 49.32%C3.3 0.00% 2.49 6.62% 12.96% 21.69% 48.04%

MARSC4.1 0.00% 2.47 6.69% 12.65% 20.09% 49.21%C4.2 0.00% 2.42 6.53% 12.19% 20.10% 48.88%C4.3 0.00% 2.52 6.89% 13.07% 20.44% 49.08%C4.4 0.00% 2.40 6.42% 12.05% 19.25% 48.82%C4.5 0.00% 2.39 6.42% 12.17% 20.09% 49.36%

ANNC5.1 0.00% 2.23 6.01% 11.12% 18.36% 48.25%C5.2 0.00% 2.22 5.90% 11.06% 18.16% 48.64%C5.3 0.00% 2.19 5.86% 10.96% 17.16% 48.17%C5.4 0.00% 2.20 5.90% 11.14% 17.85% 48.26%

Table 5.9: Results for the selected model for estimation scenario C with theverification data


Abs. Rel. 95% 99% 99.9%C5.3 0.00% 2.17 5.75% 11.14% 16.63% 45.26%


5.4 Results summary

5.4.1 Comparison of estimation scenario B and C

This section compares the performance between the two estimation scenar-ios for predicting real world FC. The difference in mean error (RMSE)illustrates the prediction reduction in average error gained by including op-erational parameters when estimating FC, as shown in Table 5.10.

Table 5.10: Comparison of prediction performance between estimation sce-nario B and estimation scenario C. Both scenarios predict real world fuelconsumption.

Model RMSE[

l100km

]Linear B C Diff. (C-B) Rel. diff (*)

1.1 3.70 2.39 -1.31 -35.4%1.2 2.4·1014 3.48·1013 -2.05·1013 -85.5%1.3 3.62 2.36 -1.26 -34.8%

KNN2.1 3.47 2.59 -0.88 -25.4%2.2 3.47 2.65 -0.82 -23.6%2.3 3.47 2.60 -0.87 -25.1%

KNN-IDW3.1 3.25 2.48 -0.77 -23.7%3.2 3.25 2.53 -0.72 -22.2%3.3 3.25 2.49 -0.77 -23.4%

MARS4.1 3.90 2.47 -1.43 -36.7%4.2 3.82 2.42 -1.40 -36.6%4.3 3.87 2.52 -1.35 -34.9%4.4 3.88 2.40 -1,48 -38.1%4.5 3.88 2.39 -1.49 -38.4%

ANN5.1 3.59 2.23 -1.36 -37.9%5.2 3.65 2.22 -1.43 -39.2%5.3 3.59 2.19 -1.40 -39.0%5.4 3.59 2.20 -1.39 -38.7%

Average**: 3.60 2.42 -1.14 -32.5%

*Relative to B. **Does not include model 1.2


5.4.2 Variable influence

Using linear regression, as described in subsection 2.2.1, a coefficient is ob-tained for every quantitative input variable and for every possible value forcategorical variables. This section evaluates the coefficients obtained usingmodel 1.1 for each estimation scenario. This evaluation yields a measure ofhow, according to model 1.1, the input variables affect the FC.

The data presented is the maximum variation in fuel consumption re-sulting from the variables. For quantitative variables, this is obtained bymultiplying the variable coefficient with the range of the input data, which isthe difference between the maximum and minimum values. For categoricalvariables, this is obtained as the difference between the highest and lowestcoefficients of the dummy variables associated to the categorical variable.These values are presented in Table 5.11. Negative values are possible forquantitative variables, in the case of a negative coefficient.

Because of linear dependency between the variables engine model andengine displacement, the variable engine displacement is omitted be-fore obtaining the linear regression coefficients. This does not affect theRMSE-values or the error quantiles of this model.

Table 5.11: FC range for model 1.1 for each estimation scenario

Variable Type FC range[

l100km

]Quant. Cat. A B C

Air drag factor (VECTO) × 7.44 4.95 0.76Average break frequency × - - 5.37Average mass of payload and trailer × - - 18.88Average speed × - - -3.94Cruise control usage × - - -1.04Curb mass × 2.04 -8.22 1.40Engine model × 4.88 3.04 1.16Gearbox model × 0.87 5.13 4.17HDV CO2 vehicle class × 0.04 3.04 1.16Idling with PTO × - - 3.04Rear axle model × 1.99 - -Rolling resistance coefficient (VECTO) × 3.01 - -Rolling resistance coefficient (VO) × - 10.30 4.48Total cruising powertrain ratio × 4.82 8.63 4.58Total idling × - - 4.51Virtual road gradient average absolute value × - - 16.65Virtual velocity variance × - - 2.53

52

Chapter 6

Discussion and conclusion

6.1 Discussion

Artificial neural networks gives the best accuracy for 2 of 3 estimation sce-narios. Estimation of VECTO FC can be done with high accuracy. Thisthesis has only considered the Long haulage VECTO-cycle. It is possiblethat estimating fuel consumption for other VECTO-cycles will perform dif-ferently. Furthermore, a potential future estimation model for other cyclesshould maybe use a different set of parameters. As an example, the param-eter Total cruising powertrain ratio may not be as relevent for other cyclesas it is for Long haulage, due to other cycles not featuring the same amountof cruising. Estimation of VECTO FC is easier than that of real world FC.This is to be expected, because the VECTO FC is the result of a simulationprogram, and all parameters used in the simulation is available. Theoreti-cally, if all the input data to VECTO would be considered in a statisticalmodel, it should be possible to estimate the VECTO FC with no error atall, provided the estimation model is complex enough. The parameters usedin this thesis are only a few of all the input parameters used by VECTO,but the model is still able to estimate the VECTO FC with a an averageerror of 0.36% and with a 95% probability that the error is less than 0.77%.

Estimation of real world FC yield larger errors. The results concludethat considering operational parameters improves estimation performance.However, one potential issue with the operational input parameters is know-ing them. In this thesis, the testing data is obtained in the same way as thetraining data, i.e. from vehicles in operation. If the estimation models are tobe used for estimating FC for hypothetical operations, it can be difficult topredict some of the parameters. For example, predicting the virtual velocityvariance and virtual average road gradient may be difficult. However, it ispossible to assume values for operational parameters based on operationaldata from trucks already in operation that are known to have similar, orequal, transport missions. This thesis has not taken uncertainty of oper-

CHAPTER 6. DISCUSSION AND CONCLUSION 53

ational data into account. The performance of the estimations presentedassumes that the input data is accurate. In reality, uncertainty of inputdata should result in more uncertainty and wider confidence intervals thanthat presented in this report. Contrary, determining vehicle parameters iseasier, since it can be obtained from product data and from tests.

The choice of input parameters for estimating real world FC was selectedas a trade-off between prediction performance and model complexity. It ispossible to include more parameters and get slightly better results. Fur-thermore, considering other variables than those that were introduced inchapter 3 could also improve prediction performance.

6.2 Conclusion

Statistical models can successfully be used to quickly estimate VECTO FC.The probability of the relative prediction error being less than 0.77% is 95%.The best model for estimating VECTO FC is an Artificial neural network.

Real world FC can be estimated with less accuracy than VECTO FC.Making estimations solely bases on vehicle parameters can be done with arelative error less than 19.50 % for 95% of all vehicles. The model selected forthis estimation scenario is a KNN regression with Inverse distance weighting.

If operational parameters are considered in addition to vehicle parame-ters the average prediciton error is reduced by about one third. An Artificialneural network is the best model for this scenario. The prediction error isless than 11.14% for 95% of all chassis.

54

References

[1] Max af Klintberg. Predictive modeling of emissions : Heavy duty ve-hicles. Technical report, Umea University, Department of Mathematicsand Mathematical Statistics, 2016.

[2] Harkrishan L. Vasudeva. Metric spaces. Springer, 2000. pg. 27-29.

[3] M.A. Hardy. Regression with Dummy Variables. Number nr. 91–93 inQuantitative Applications in t. SAGE Publications, 1993.

[4] Jerome H. Friedman. Multivariate adaptive regression splines. TheAnnals of Statistics, 19(1):1–67, 1991.

[5] Jerome H. Friedman. Estimating functions of mixed ordinal and cat-egorical variables using adaptive splines. Technical report, StanfordUniversity California, lab for computational statistics, 1991.

[6] Graham Jameson. 80.29 counting subsets and the binomial theorem.The Mathematical Gazette, 80(488):395–396, 1996.

[7] Jerome H. Friedman. Fast mars. Technical report, Stanford UniversityCalifornia, lab for computational statistics, 1993.

[8] Ivan Nunes Da Silva, Danilo Hernane Spatti, Rogerio AndradeFlauzino, Luisa Helena Bartocci Liboni, and Silas Franco dosReis Alves. Artificial Neural Networks. Springer, 2017.

[9] Erdal Kayacan and Mojtaba Ahmadieh Khanesar. Chapter 5 - gra-dient descent methods for type-2 fuzzy neural networks. In ErdalKayacan and Mojtaba Ahmadieh Khanesar, editors, Fuzzy Neural Net-works for Real Time Control Applications, pages 47 – 52. Butterworth-Heinemann, 2016.

[10] Thomas Gleason Paul Fahlstrom. Basic Aerodynamics, chapter 3,page 35. Wiley-Blackwell, 2012.

[11] Sveriges riksdag (Swedish parlament). SFS 2001:559, 2001. Definitionof curb mass (Swedish: Tjanstevikt).

55

Appendix A

Glossary

A.1 Acronyms

AI Artificial Intelligence 1, 16

ANN Artificial Neural Network 8, 16–22, 24, 25, 37, 45, 47, 49, 50, 52, 53,56, 58

BPA Backpropagation algorithm 20–22, 25, Terms: Backpropagation al-gorithm

FC Fuel Consumption 1, 2, 26, 41, 50–53

GTW Gross train weight 30, 32, 40, 64, Terms: Gross train weight

HDV Heavy Duty Vehicle 1, 28, 39, 41, 42, 51

HL Hidden Layer 38

IDW Inverse Distance Weighting 9, 36, 43, 45, 47, 49, 50, 53

KNN K Nearest Neighbors 8, 9, 35, 36, 43, 45, 47, 49, 50, 53

MARS Multivariate adaptive regression splines 10–16, 36, 37, 43, 45, 47,49, 50, Terms: Multivariate adaptive regression splines

PTO Power take-off 31, 32, 40, 41, 51, 64, Terms: Power take-off

RPM Revolutions Per Minute 27

VECTO Vehicle Energy Consumption Calculation Tool 1, 2, 26, 28, 32, 33,38–41, 51–53, 56, Terms: Vehicle Energy Consumption CalculationTool

VO Vehicle Optimizer 28, 40, 41, 51, Terms: Vehicle Optimizer

APPENDIX A. GLOSSARY 56

A.2 Terms

Air deflector

Device to redirect flow of air to reduce drag 27

Ambient temperature

The temperature of the air outside the vehicle 29

Artifical neuron

Fundamental component of Artificial neural networks 16–18, 21–24,38, 56, 58

Backpropagation algorithm

The algorithm used to train an Artificial neural network 20,

Bias neuron

A neuron that has a constant output value that can be optionallyimplemented into an Artificial neural network 18, 21, 24, 37

Break start

A unit corresponding to pressing down on the break handle so thevehicle goes from not breaking at all to some magnitude of breaking.30

Chassis number

Unique number identifying a truck 26, 33

Development level 6

At the time of writing, Scania’s newst truck generation 33

Diesel fuel

A type of fuel 33

Estimation scenario A

Estimation of VECTO fuel consumption based on vehicle parameters2, 32, 34, 35, 39, 43, 45

Estimation scenario B

Estimation of real world fuel consumption based only on vehicle pa-rameters 2, 34, 41, 42, 46, 47, 50


Estimation scenario C

Estimation of real world fuel consumption based on an vehicle param-eters and operational parameters 2, 28, 32, 34, 40–42, 48–50

Gross train weight

Mass of the entire vehicle train, including trailers and payload 30,

Multivariate adaptive regression splines

Non-parametric, non-linear regressional algorithm 8, 10

Operational data

Data collected about driving patterns and other operational usagefrom the SCANIA fleet of vehicles by on board sensors and controlunits 1

Outlier

Data point in a set of data that is distant from the bulk of the obser-vations. This may be due to faulty measurements or data handling 5,33

Overfitting

Fitting of a model to noise in the training data. See Figure G.1 for anexample 14, 34, 37

Pearson correlation coefficient

Measure of linear correlation between two variables. It takes a valuein the range [−1, 1] where

• 1 is positive linear correlation

• -1 is negative linear correlation

• 0 is no correlation.

3, 39, 40

Power take-off

Extraction of power from the vehicle’s engine to power an externalmachine 31,

Road gradient

The slope of the road 29, 62


Synapse

Connection between two neurons in an Artificial neural network 17,18, 21–23

Synaptic weight

Represents the amplitude of the connection between two neurons 17,20–25

Vehicle Energy Consumption Calculation Tool

(VECTO) Simulation program used for certification of heavy dutyvehicles in the European union 1,

Vehicle Optimizer

Simulation program developed by Scania 28,

Virtual road gradient

Road gradient (road slope) obtained as explained in subsection 3.2.129, 32, 41, 51, 62, 64

Virtual temperatire

Ambient temperatire obtained as explained in subsection 3.2.1 29, 31,32, 62, 64

Virtual velocity

Velocity obtained as explained in subsection 3.2.1 29, 30, 32, 41, 51,61, 64

59

Appendix B

Fundamental statistics

This chapter describes how sample mean and sample standard deviationsare obtained for a set of random variables.

If X1,X2, ...,Xn are random variables, and they form a vector

X =[X1 X2 ... Xn

](B.1)

Let xij be the ith independent observation on the jth random variable.Assume that m independent observations have been drawn for all n randomvariables. These observations then form a data set X = {xi ∈ Rn, i =1, ...,m}. The sample mean of the jth random variable

xj =1

m

m∑i=1

xij (B.2)

, which can also be formulated in vector form for all n variables

x =1

m

m∑i=1

xi =

x1...xn.

(B.3)

The sample standard deviation of of the sample of the jth random variable

sj =

√√√√ 1

m− 1

m∑i=1

(xij − xj)2 (B.4)

, and in vector form

s =1

m− 1

m∑i=1

(xi1 − x1)2...

(xin − xn)2

=

s1...sn.

(B.5)

60

Appendix C

Assumed values for bin data

APPENDIX C. ASSUMED VALUES FOR BIN DATA 61

Table C.1: Velocity assumed for the bin data intervals of the vehicle velocitydistribution in the operational data database. This yields the distributionof virtual velocity.

Velocity interval [km/h] Assumed velocity [km/h]

Less than 3 1.53 to 10 6.510 to 15 12.515 to 20 17.520 to 25 22.525 to 30 27.530 to 35 32.535 to 40 37.540 to 50 45.050 to 60 55.060 to 65 62.565 to 70 67.570 to 75 72.575 to 80 77.580 to 85 82.585 to 90 87.590 to 95 92.595 to 100 97.5100 to 105 102.5Greater than 105 107.5

APPENDIX C. ASSUMED VALUES FOR BIN DATA 62

Table C.2: Road slope assumed for the bin data intervals of the road gradientin the operational data database. This yields the distribution of virtualroad gradient. The units describe the ratio of the vertical component tothe horizontal component of the vehicle velocity. Hence, 0% is horizontal.Positive numbers correspond to travelling uphill.

Road slope interval Assumed slope

Less than -8% -10%-8% to -5% -6.5%-5% to -3% -4.5%-3% to -2% -2.5%-2% to -1% -1.5-1% to 1% 0.0%1% to 2% 1.5%2% to 3% 2.5%3% to 5% 4.5%5% to 8% 6.5%Greater than 8% 10%

Table C.3: Ambient temperature assumed for the bin data intervals in theoperational data database. The data are not sub-intervals in the same senseas the other bin-data, since the intervals coincide. However, the virtualtemperature is still obtained in a manner consistent with the other virtualdistributions.

Ambient temperature interval Assumed temperature

Greater than 30 °C 35 °CGreater than 40 °C 45 °CLess than 0 °C -5 °CLess than 10 °C -5 °CLess than 20 °C 15 °CLess than -20 °C -15 °C

63

Appendix D

Additional tables

APPENDIX D. ADDITIONAL TABLES 64T

able

D.1

:N

um

eric

alfi

lter

ing

ofop

erat

ion

ald

ata

Para

mete

rn

am

eS

ym

bol

Low

er

filt

er

Up

per

filt

er

Init

ial

valu

era

nge

Resu

ltin

gvalu

era

nge

Un

it

Ave

rage

GT

Wmgtw

2%

2%

1000

0-85

000

1500

0-52

000

kg

Ave

rage

spee

dv avg

2%

0%

0-10

026

-100

km

/h

Ave

rage

bre

ak

freq

uen

cyξ break

2%

2%

0-52

0019

.6-3

28b

reak

s/10

0km

Sto

pfr

equ

ency

ξ stop

2%

2%

0-93

05-

109

stop

s/10

0km

Vir

tual

velo

c-it

yva

rian

cev virt,var

2%

2%

0-11

38,2

1632

134-

576

(km

/h)2

Vir

tual

velo

c-it

yst

and

ard

dev

iati

on

v virt,std

2%

2%

0-33

.711

,6-2

4.0

km

/h

Vir

tual

road

gra

die

nt

aver

-age

abso

lute

valu

e

φvirt,mean,abs

2%

2%

0-7.

50.

184-

5.15

%

Vir

tual

road

gra

die

nt

aver

-age

squ

are

φvirt,mean,sq

2%

2%

0-56

.30.

0337

-26.

5%

2

Idli

ng

wit

hP

TO

τ idling,pto

0%

2%

0-78

0-47

%

Idli

ng

wit

hou

tP

TO

τ idling

2%

2%

1-79

3-54

%

Vir

tual

tem

-p

eratu

rem

ean

Tvirt,mean

1%

1%

0-40

4.69

-36.

4°C

Cru

ise

contr

ol

usa

ge

τ cc

0%

2%

0-10

00-

80%

Fu

elco

nsu

mp

-ti

onFCReal

2%

2%

2-99

23-6

8l/

100k

m

APPENDIX D. ADDITIONAL TABLES 65

Tab

leD

.2:

Cla

ssifi

cati

onof

HD

VC

O2

veh

icle

clas

s.©

Eu

rop

ean

Un

ion

,htt

p:/

/eu

r-le

x.e

uro

pa.

eu/,

1998

-201

8’.

Sou

rce:

Offi

cial

Jou

rnal

ofth

eE

uro

pea

nU

nio

n,

L349

,29

Dec

emb

er20

17,

AN

NE

XI.

Axle

sA

xle

con

figu

rati

on

Ch

ass

isad

ap

tion

Tech

nic

ally

perm

issi

ble

maxi-

mu

mla

den

mass

[1000kg]

HD

VC

O2

veh

icle

cla

ss

2

4x2

Rig

id3.

5-

7.5

0R

igid

/T

ract

or7.

5-

101

Rig

id/T

ract

or10

-12

2R

igid

/T

ract

or12

-16

3R

igid

>16

4T

ract

or>

165

4x4

Rig

id7.

5-

166

Rig

id>

167

Tra

ctor

>16

8

3

6x2

Rig

idA

ll9

Tra

ctor

All

10

6x4

Rig

idA

ll11

Tra

ctor

All

12

6x6

Rig

idA

ll13

Tra

ctor

All

14

4

8x2

Rig

idA

ll15

8x4

Rig

idA

ll16

8x6

or

8x8

Rig

idA

ll17

66

Appendix E

Activation functions

This appendix chapter shows additional information about activation func-tions considered for artificial neural networks.

APPENDIX E. ACTIVATION FUNCTIONS 67

Logistic function

Defined as:

g(u) = σ(u) =1

1 + e−u. (E.1)

First order derivative:dg

du= g(u) (1− g(u)) . (E.2)

Figure E.1: The logistic function


Hyperbolic tangent function

Defined as:

g(u) = tanh(u) =eu − e−u

eu + e−u. (E.3)


du= 1− g(u)2. (E.4)

Figure E.2: The hyperbolic tangent function


Identity function

Defined as:g(u) = u. (E.5)


du= 1. (E.6)

Figure E.3: The identity function



Defined as:

g(u) =

{u for x ≥ 00 for x < 0

(E.7)


du=

{1 for x ≥ 00 for x < 0

(E.8)

Figure E.4: Rectified linear unit (ReLU)



Defined as:

g(u) =

{u for x ≥ 0

0.01u for x < 0(E.9)

First order derivative:

dg

du=

{1 for x ≥ 0

0.01 for x < 0(E.10)

Figure E.5: Leaky rectified linear unit (L-ReLU)

72

Appendix F

Model training result

F.1 Estimation scenario A

This appendix section contains detailed results for estimation of VECTOfuel consumption using all considered models.

Table F.1: Model A1 - Results

Model Failed evaluations RMSE Quantiles

Abs. Rel. 95% 99% 99.9%A1.1 0.00% 0.138427 0.42% 0.72% 1.23% 2.85%A1.2 0.94% 1.87·1013 > 100% 4.61% 7.64% > 100%A1.3 49.47% 0.116836 0.37% 0.72% 0.88% 1.82%

Table F.2: Model A2.1 - Results for all considered configurations

Configuration Failed evaluations RMSE Quantiles

K Abs. Rel. 95% 99% 99.9%1 0.00% 0.236142 0.67% 0.96% 2.78% 7.27%2 0.00% 0.217545 0.62% 1.02% 2.62% 5.95%3 0.00% 0.228116 0.65% 1.00% 2.84% 6.28%4 0.00% 0.244785 0.69% 1.10% 3.17% 6.61%6 0.00% 0.2841 0.78% 1.26% 3.81% 7.29%10 0.00% 0.326559 0.89% 1.59% 4.09% 8.37%15 0.00% 0.370864 0.99% 1.78% 4.49% 9.98%

APPENDIX F. MODEL TRAINING RESULT 73



K Abs. Rel. 95% 99% 99.9%1 0.33% 0.3683 1.00% 0.95% 2.78% 15.14%2 0.33% 0.413519 1.11% 1.02% 3.09% 16.72%3 0.33% 0.46109 1.23% 0.99% 4.39% 17.53%4 0.33% 0.505515 1.36% 1.13% 5.00% 16.84%6 0.33% 0.586908 1.57% 1.40% 8.54% 17.43%10 0.33% 0.691282 1.84% 1.88% 11.71% 18.28%15 0.33% 0.781705 2.07% 2.33% 13.30% 18.84%



K Abs. Rel. 95% 99% 99.9%1 0.00% 0.275892 0.79% 0.96% 2.78% 13.10%2 0.00% 0.235522 0.67% 1.03% 2.74% 7.10%3 0.00% 0.25192 0.71% 0.99% 3.35% 9.16%4 0.00% 0.277191 0.77% 1.10% 3.58% 8.58%6 0.00% 0.338854 0.91% 1.29% 4.30% 9.49%10 0.00% 0.43831 1.16% 1.70% 6.13% 11.97%15 0.00% 0.516334 1.35% 1.98% 6.82% 14.23%




K u Abs. Rel. 95% 99% 99.9%1 1 0.00% 0.23616 0.67% 0.96% 2.78% 7.27%2 1 0.00% 0.207781 0.59% 0.92% 2.53% 5.93%3 1 0.00% 0.202559 0.58% 0.85% 2.50% 6.27%4 1 0.00% 0.205472 0.58% 0.86% 2.47% 6.29%6 1 0.00% 0.219174 0.60% 0.88% 3.00% 6.69%10 1 0.00% 0.241398 0.64% 0.98% 2.91% 6.33%15 1 0.00% 0.271204 0.71% 0.99% 2.95% 8.45%1 2 0.00% 0.23616 0.67% 0.96% 2.78% 7.27%2 2 0.00% 0.208963 0.60% 0.92% 2.54% 5.92%3 2 0.00% 0.201913 0.58% 0.88% 2.33% 6.26%4 2 0.00% 0.20173 0.57% 0.87% 2.34% 6.35%6 2 0.00% 0.205519 0.57% 0.86% 2.74% 6.56%10 2 0.00% 0.215468 0.59% 0.86% 2.77% 6.06%15 2 0.00% 0.232471 0.62% 0.87% 2.81% 6.53%1 5 0.00% 0.23616 0.67% 0.96% 2.78% 7.27%2 5 0.00% 0.215632 0.61% 0.95% 2.47% 6.10%3 5 0.00% 0.211064 0.60% 0.95% 2.47% 6.43%4 5 0.00% 0.209273 0.60% 0.93% 2.49% 6.48%6 5 0.00% 0.204 0.57% 0.92% 2.67% 6.04%10 5 0.00% 0.202858 0.57% 0.92% 2.72% 5.48%15 5 0.00% 0.204908 0.57% 0.90% 2.60% 6.41%1 10 0.00% 0.23616 0.67% 0.96% 2.78% 7.27%2 10 0.00% 0.221114 0.63% 0.96% 2.54% 6.56%3 10 0.00% 0.219499 0.62% 0.96% 2.57% 6.83%4 10 0.00% 0.218726 0.62% 0.96% 2.54% 6.67%6 10 0.00% 0.213918 0.60% 0.96% 2.58% 6.57%10 10 0.00% 0.214333 0.60% 0.96% 2.54% 6.47%15 10 0.00% 0.215527 0.60% 0.96% 2.55% 6.35%




K u Abs. Rel. 95% 99% 99.9%1 1 0.33% 0.368248 1.00% 0.95% 2.78% 15.14%2 1 0.33% 0.383746 1.03% 0.91% 2.71% 16.72%3 1 0.33% 0.404995 1.08% 0.86% 3.40% 17.51%4 1 0.33% 0.420645 1.12% 0.88% 3.93% 16.85%6 1 0.33% 0.448329 1.19% 0.91% 3.94% 15.99%10 1 0.33% 0.488296 1.29% 1.01% 4.08% 15.90%15 1 0.33% 0.526501 1.38% 1.09% 5.52% 16.43%1 2 0.33% 0.368248 1.00% 0.95% 2.78% 15.14%2 2 0.33% 0.378821 1.02% 0.91% 2.43% 16.72%3 2 0.33% 0.392089 1.05% 0.88% 2.75% 17.49%4 2 0.33% 0.397993 1.06% 0.88% 3.07% 16.86%6 2 0.33% 0.410532 1.10% 0.86% 3.41% 16.02%10 2 0.33% 0.432166 1.15% 0.89% 3.77% 15.90%15 2 0.33% 0.454133 1.20% 0.93% 3.61% 15.90%1 5 0.33% 0.368248 1.00% 0.95% 2.78% 15.14%2 5 0.33% 0.374039 1.01% 0.93% 2.47% 16.72%3 5 0.33% 0.382207 1.02% 0.93% 2.58% 17.44%4 5 0.33% 0.380783 1.02% 0.92% 2.66% 16.86%6 5 0.33% 0.380999 1.02% 0.90% 2.62% 16.08%10 5 0.33% 0.385348 1.03% 0.90% 2.64% 15.94%15 5 0.33% 0.392139 1.05% 0.89% 2.74% 15.94%1 10 0.33% 0.368248 1.00% 0.95% 2.78% 15.14%2 10 0.33% 0.371422 1.00% 0.95% 2.54% 16.71%3 10 0.33% 0.376706 1.01% 0.95% 2.42% 17.35%4 10 0.33% 0.373787 1.01% 0.95% 2.45% 16.85%6 10 0.33% 0.372197 1.00% 0.95% 2.40% 16.19%10 10 0.33% 0.372228 1.00% 0.95% 2.37% 16.03%15 10 0.33% 0.372869 1.00% 0.95% 2.40% 16.03%




K u Abs. Rel. 95% 99% 99.9%1 1 0.00% 0.275902 0.79% 0.96% 2.78% 13.10%2 1 0.00% 0.225882 0.64% 0.91% 2.49% 7.25%3 1 0.00% 0.227363 0.64% 0.86% 2.61% 8.08%4 1 0.00% 0.236627 0.66% 0.88% 3.14% 7.82%6 1 0.00% 0.261897 0.70% 0.91% 3.39% 8.08%10 1 0.00% 0.318803 0.83% 1.01% 3.71% 10.11%15 1 0.00% 0.370293 0.95% 1.03% 4.10% 11.65%1 2 0.00% 0.275902 0.79% 0.96% 2.78% 13.10%2 2 0.00% 0.225789 0.64% 0.92% 2.54% 7.41%3 2 0.00% 0.223924 0.63% 0.89% 2.42% 7.78%4 2 0.00% 0.22817 0.64% 0.89% 2.83% 7.71%6 2 0.00% 0.241039 0.66% 0.88% 3.03% 7.00%10 2 0.00% 0.280666 0.74% 0.89% 3.12% 8.05%15 2 0.00% 0.317071 0.83% 0.92% 3.51% 9.63%1 5 0.00% 0.275902 0.79% 0.96% 2.78% 13.10%2 5 0.00% 0.229954 0.66% 0.95% 2.47% 7.87%3 5 0.00% 0.226473 0.65% 0.95% 2.52% 8.04%4 5 0.00% 0.225667 0.64% 0.93% 2.68% 7.09%6 5 0.00% 0.226132 0.63% 0.92% 2.76% 7.84%10 5 0.00% 0.244193 0.67% 0.92% 2.85% 7.36%15 5 0.00% 0.258097 0.70% 0.91% 2.68% 8.48%1 10 0.00% 0.275902 0.79% 0.96% 2.78% 13.10%2 10 0.00% 0.237389 0.68% 0.96% 2.54% 8.62%3 10 0.00% 0.235451 0.67% 0.96% 2.42% 7.87%4 10 0.00% 0.23442 0.67% 0.96% 2.45% 7.22%6 10 0.00% 0.232328 0.66% 0.96% 2.40% 7.18%10 10 0.00% 0.242098 0.68% 0.96% 2.41% 7.94%15 10 0.00% 0.24528 0.68% 0.96% 2.50% 8.35%

Table F.8: Model BA - Results


Abs. Rel. 95% 99% 99.9%A4.1 0.00% 0.15889 0.48% 0.80% 2.06% 3.33%A4.2 0.00% 5.06309 14.03% 0.75% 1.25% 5.25%A4.3 0.00% 0.16527 0.50% 0.79% 2.26% 4.11%A4.4 0.00% 4.77633 13.23% 0.75% 2.20% 12.09%A4.5 0.00% 8.47366 23.47% 1.33% 3.05% 20.57%



Configuration RMSE

HL. config. Act. func. HL Act. func. OL Abs. Rel.5 Tan H Identity 0.176364 0.51%5 Tan H ReLU 0.178486 0.52%5 Tan H L-ReLU 0.175929 0.51%5 ReLU Identity 0.14034 0.43%5 ReLU ReLU 0.137846 0.42%5 ReLU L-ReLU 0.132291 0.40%5 L-ReLU Identity 0.144692 0.45%5 L-ReLU ReLU 0.179616 0.54%5 L-ReLU L-ReLU 0.169028 0.51%10 Tan H Identity 0.147371 0.44%10 Tan H ReLU 0.163152 0.47%10 Tan H L-ReLU 0.144118 0.43%10 ReLU Identity 0.137102 0.42%10 ReLU ReLU 0.124328 0.38%10 ReLU L-ReLU 0.135228 0.41%10 L-ReLU Identity 0.135383 0.41%10 L-ReLU ReLU 0.77426 2.12%10 L-ReLU L-ReLU 0.753519 2.23%20 Tan H Identity 0.180699 0.55%20 Tan H ReLU 0.135691 0.41%20 Tan H L-ReLU 0.124059 0.38%20 ReLU Identity 0.130838 0.40%20 ReLU ReLU 0.126355 0.39%20 ReLU L-ReLU 0.127621 0.39%20 L-ReLU Identity 0.132914 0.40%20 L-ReLU ReLU 0.125995 0.39%20 L-ReLU L-ReLU 0.153998 0.48%30 Tan H Identity 0.128767 0.40%30 Tan H ReLU 0.125813 0.39%30 Tan H L-ReLU 0.121995 0.38%30 ReLU Identity 0.125142 0.38%30 ReLU ReLU 0.124991 0.38%30 ReLU L-ReLU 0.131005 0.40%30 L-ReLU Identity 0.134464 0.41%30 L-ReLU ReLU 0.124617 0.38%30 L-ReLU L-ReLU 0.138567 0.42%5:5 Tan H Identity 19.20609 58.67%5:5 Tan H ReLU 0.153902 0.45%5:5 Tan H L-ReLU 0.170419 0.50%5:5 ReLU Identity 31.49289 100.00%5:5 ReLU ReLU 0.151838 0.46%5:5 ReLU L-ReLU 0.144832 0.44%5:5 L-ReLU Identity 0.200518 0.59%

Continued on next page


A5.1 Continued from previous page

Configuration RMSE

HL. config. Act. func. HL Act. func. OL Abs. Rel.5:5 L-ReLU ReLU 0.151043 0.47%5:5 L-ReLU L-ReLU 0.154585 0.47%

10:10 Tan H Identity 0.227519 0.64%10:10 Tan H ReLU 0.177627 0.51%10:10 Tan H L-ReLU 0.142477 0.42%10:10 ReLU Identity 0.129238 0.40%10:10 ReLU ReLU 0.145034 0.44%10:10 ReLU L-ReLU 0.128502 0.39%10:10 L-ReLU Identity 0.162421 0.49%10:10 L-ReLU ReLU 0.142775 0.44%10:10 L-ReLU L-ReLU 0.149684 0.45%20:20 Tan H Identity 0.139896 0.43%20:20 Tan H ReLU 0.134078 0.41%20:20 Tan H L-ReLU 0.122902 0.38%20:20 ReLU Identity 0.128682 0.40%20:20 ReLU ReLU 0.128475 0.40%20:20 ReLU L-ReLU 31.49289 100.00%20:20 L-ReLU Identity 0.135785 0.42%20:20 L-ReLU ReLU 0.181691 0.53%20:20 L-ReLU L-ReLU 0.150874 0.46%30:30 Tan H Identity 7.054764 21.58%30:30 Tan H ReLU 0.131715 0.41%30:30 Tan H L-ReLU 0.129051 0.40%30:30 ReLU Identity 0.126234 0.39%30:30 ReLU ReLU 0.135422 0.42%30:30 ReLU L-ReLU 0.123737 0.38%30:30 L-ReLU Identity 0.125359 0.38%30:30 L-ReLU ReLU 0.130436 0.40%30:30 L-ReLU L-ReLU 0.153058 0.46%


Configuration RMSE

HL. config. Act. func. HL Act. func. OL Abs. Rel.5 Tan H Identity 2.181752 6.30%5 Tan H ReLU 0.169337 0.49%5 Tan H L-ReLU 0.166278 0.48%5 ReLU Identity 0.144327 0.44%5 ReLU ReLU 0.131493 0.40%5 ReLU L-ReLU 0.148099 0.45%5 L-ReLU Identity 0.126256 0.39%




Configuration RMSE

HL. config. Act. func. HL Act. func. OL Abs. Rel.5 L-ReLU ReLU 0.135365 0.42%5 L-ReLU L-ReLU 0.23025 0.67%10 Tan H Identity 0.142033 0.44%10 Tan H ReLU 0.128449 0.39%10 Tan H L-ReLU 0.131368 0.40%10 ReLU Identity 0.159049 0.49%10 ReLU ReLU 0.147112 0.45%10 ReLU L-ReLU 0.132751 0.41%10 L-ReLU Identity 0.127005 0.39%10 L-ReLU ReLU 0.131232 0.40%10 L-ReLU L-ReLU 0.188325 0.56%20 Tan H Identity 0.13895 0.42%20 Tan H ReLU 0.114257 0.35%20 Tan H L-ReLU 0.125995 0.39%20 ReLU Identity 0.123909 0.38%20 ReLU ReLU 0.123211 0.38%20 ReLU L-ReLU 0.126857 0.39%20 L-ReLU Identity 0.1253 0.38%20 L-ReLU ReLU 0.12138 0.38%20 L-ReLU L-ReLU 0.121285 0.38%30 Tan H Identity 0.119338 0.37%30 Tan H ReLU 0.117402 0.37%30 Tan H L-ReLU 0.124115 0.39%30 ReLU Identity 0.126795 0.39%30 ReLU ReLU 0.121227 0.37%30 ReLU L-ReLU 0.126488 0.39%30 L-ReLU Identity 0.125398 0.39%30 L-ReLU ReLU 0.128211 0.39%30 L-ReLU L-ReLU 0.11989 0.37%5:5 Tan H Identity 0.140668 0.42%5:5 Tan H ReLU 0.146312 0.44%5:5 Tan H L-ReLU 0.141954 0.43%5:5 ReLU Identity 0.138641 0.42%5:5 ReLU ReLU 0.134114 0.40%5:5 ReLU L-ReLU 0.139842 0.43%5:5 L-ReLU Identity 0.13575 0.41%5:5 L-ReLU ReLU 0.131298 0.40%5:5 L-ReLU L-ReLU 0.119766 0.37%

10:10 Tan H Identity 1.825003 5.16%10:10 Tan H ReLU 0.133343 0.41%10:10 Tan H L-ReLU 0.132422 0.41%10:10 ReLU Identity 0.124974 0.39%10:10 ReLU ReLU 0.123341 0.38%10:10 ReLU L-ReLU 2.22971 6.43%




Configuration RMSE

HL. config. Act. func. HL Act. func. OL Abs. Rel.10:10 L-ReLU Identity 0.127324 0.39%10:10 L-ReLU ReLU 0.147056 0.46%10:10 L-ReLU L-ReLU 0.132331 0.41%20:20 Tan H Identity 0.127376 0.40%20:20 Tan H ReLU 0.123484 0.39%20:20 Tan H L-ReLU 0.126588 0.39%20:20 ReLU Identity 0.122504 0.37%20:20 ReLU ReLU 0.130763 0.40%20:20 ReLU L-ReLU 31.39199 99.68%20:20 L-ReLU Identity 0.143053 0.43%20:20 L-ReLU ReLU 0.131557 0.40%20:20 L-ReLU L-ReLU 0.130226 0.39%30:30 Tan H Identity 0.130234 0.40%30:30 Tan H ReLU 0.136991 0.43%30:30 Tan H L-ReLU 0.145571 0.45%30:30 ReLU Identity 0.116919 0.36%30:30 ReLU ReLU 0.126116 0.39%30:30 ReLU L-ReLU 31.42161 99.77%30:30 L-ReLU Identity 0.128928 0.40%30:30 L-ReLU ReLU 0.117798 0.36%30:30 L-ReLU L-ReLU 0.123977 0.38%


Configuration RMSE

HL. config. Act. func. HL Act. func. OL Abs. Rel.5 Tan H Identity 0.147294 0.44%5 Tan H ReLU 0.981665 3.27%5 Tan H L-ReLU 0.803064 2.64%5 ReLU Identity 0.123399 0.37%5 ReLU ReLU 0.979927 3.27%5 ReLU L-ReLU 1.950363 5.11%5 L-ReLU Identity 0.112046 0.35%5 L-ReLU ReLU 0.984266 3.28%5 L-ReLU L-ReLU 0.163551 0.49%10 Tan H Identity 0.140787 0.44%10 Tan H ReLU 0.979681 3.27%10 Tan H L-ReLU 0.481741 1.54%10 ReLU Identity 0.115084 0.35%10 ReLU ReLU 0.979831 3.27%10 ReLU L-ReLU 0.152709 0.46%




Configuration RMSE

HL. config. Act. func. HL Act. func. OL Abs. Rel.10 L-ReLU Identity 0.11663 0.36%10 L-ReLU ReLU 0.979209 3.27%10 L-ReLU L-ReLU 0.154389 0.47%20 Tan H Identity 0.139033 0.43%20 Tan H ReLU 0.979033 3.27%20 Tan H L-ReLU 0.300261 0.96%20 ReLU Identity 0.109526 0.34%20 ReLU ReLU 0.97928 3.27%20 ReLU L-ReLU 0.164668 0.51%20 L-ReLU Identity 0.109664 0.34%20 L-ReLU ReLU 0.979422 3.27%20 L-ReLU L-ReLU 0.167726 0.51%30 Tan H Identity 0.139317 0.44%30 Tan H ReLU 0.979345 3.27%30 Tan H L-ReLU 0.359757 1.07%30 ReLU Identity 0.110596 0.34%30 ReLU ReLU 0.979252 3.27%30 ReLU L-ReLU 0.176206 0.54%30 L-ReLU Identity 0.107759 0.33%30 L-ReLU ReLU 0.979716 3.27%30 L-ReLU L-ReLU 0.175515 0.53%5:5 Tan H Identity 0.146842 0.44%5:5 Tan H ReLU 0.982051 3.27%5:5 Tan H L-ReLU 0.426805 1.41%5:5 ReLU Identity 0.123554 0.38%5:5 ReLU ReLU 0.980434 3.27%5:5 ReLU L-ReLU 1.950047 5.11%5:5 L-ReLU Identity 0.118565 0.37%5:5 L-ReLU ReLU 0.979825 3.27%5:5 L-ReLU L-ReLU 0.143815 0.44%

10:10 Tan H Identity 0.128829 0.40%10:10 Tan H ReLU 0.979662 3.27%10:10 Tan H L-ReLU 0.48583 1.52%10:10 ReLU Identity 0.117449 0.36%10:10 ReLU ReLU 0.979605 3.27%10:10 ReLU L-ReLU 1.949656 5.11%10:10 L-ReLU Identity 0.107209 0.33%10:10 L-ReLU ReLU 0.979888 3.27%10:10 L-ReLU L-ReLU 0.150474 0.45%20:20 Tan H Identity 0.130705 0.41%20:20 Tan H ReLU 0.979352 3.27%20:20 Tan H L-ReLU 0.2083 0.62%20:20 ReLU Identity 0.109638 0.34%20:20 ReLU ReLU 0.979702 3.27%




Configuration RMSE

HL. config. Act. func. HL Act. func. OL Abs. Rel.20:20 ReLU L-ReLU 1.949962 5.11%20:20 L-ReLU Identity 0.109392 0.33%20:20 L-ReLU ReLU 0.979308 3.27%20:20 L-ReLU L-ReLU 0.122606 0.38%30:30 Tan H Identity 0.163509 0.52%30:30 Tan H ReLU 0.979567 3.27%30:30 Tan H L-ReLU 0.266783 0.76%30:30 ReLU Identity 0.108539 0.33%30:30 ReLU ReLU 0.979376 3.27%30:30 ReLU L-ReLU 0.146605 0.44%30:30 L-ReLU Identity 0.110414 0.34%30:30 L-ReLU ReLU 0.979189 3.27%30:30 L-ReLU L-ReLU 0.117147 0.36%


Configuration RMSE

HL. config. Act. func. HL Act. func. OL Abs. Rel.5 Tan H Identity 0.127339 0.39%5 Tan H ReLU 0.981929 3.27%5 Tan H L-ReLU 0.574317 1.85%5 ReLU Identity 0.121961 0.37%5 ReLU ReLU 0.980031 3.27%5 ReLU L-ReLU 0.194839 0.60%5 L-ReLU Identity 0.157536 0.47%5 L-ReLU ReLU 0.979871 3.27%5 L-ReLU L-ReLU 0.16569 0.50%10 Tan H Identity 0.118288 0.37%10 Tan H ReLU 0.979454 3.27%10 Tan H L-ReLU 0.530164 1.70%10 ReLU Identity 0.113427 0.35%10 ReLU ReLU 0.979248 3.27%10 ReLU L-ReLU 0.165125 0.49%10 L-ReLU Identity 0.117806 0.36%10 L-ReLU ReLU 0.979902 3.27%10 L-ReLU L-ReLU 0.155122 0.47%20 Tan H Identity 0.151067 0.48%20 Tan H ReLU 0.979216 3.27%20 Tan H L-ReLU 0.397753 1.22%20 ReLU Identity 0.106967 0.33%20 ReLU ReLU 0.979123 3.27%




Configuration RMSE

HL. config. Act. func. HL Act. func. OL Abs. Rel.20 ReLU L-ReLU 0.179483 0.55%20 L-ReLU Identity 0.11001 0.34%20 L-ReLU ReLU 0.979178 3.27%20 L-ReLU L-ReLU 0.174551 0.54%30 Tan H Identity 0.138034 0.44%30 Tan H ReLU 0.97934 3.27%30 Tan H L-ReLU 0.305231 0.91%30 ReLU Identity 0.109835 0.34%30 ReLU ReLU 0.979048 3.27%30 ReLU L-ReLU 0.191791 0.59%30 L-ReLU Identity 0.109132 0.34%30 L-ReLU ReLU 0.979028 3.27%30 L-ReLU L-ReLU 0.183399 0.57%5:5 Tan H Identity 0.148968 0.46%5:5 Tan H ReLU 0.980558 3.27%5:5 Tan H L-ReLU 0.494908 1.57%5:5 ReLU Identity 0.129364 0.39%5:5 ReLU ReLU 0.979689 3.27%5:5 ReLU L-ReLU 0.191156 0.59%5:5 L-ReLU Identity 0.11656 0.36%5:5 L-ReLU ReLU 0.979804 3.27%5:5 L-ReLU L-ReLU 0.129802 0.40%

10:10 Tan H Identity 0.1362 0.42%10:10 Tan H ReLU 0.98023 3.27%10:10 Tan H L-ReLU 0.213183 0.68%10:10 ReLU Identity 0.114893 0.35%10:10 ReLU ReLU 0.979607 3.27%10:10 ReLU L-ReLU 0.13749 0.43%10:10 L-ReLU Identity 0.113134 0.35%10:10 L-ReLU ReLU 0.980175 3.27%10:10 L-ReLU L-ReLU 0.12666 0.39%20:20 Tan H Identity 0.143011 0.45%20:20 Tan H ReLU 0.979321 3.27%20:20 Tan H L-ReLU 0.207221 0.65%20:20 ReLU Identity 0.113166 0.35%20:20 ReLU ReLU 0.979244 3.27%20:20 ReLU L-ReLU 0.140461 0.43%20:20 L-ReLU Identity 0.110881 0.34%20:20 L-ReLU ReLU 0.97923 3.27%20:20 L-ReLU L-ReLU 0.11088 0.34%30:30 Tan H Identity 0.136676 0.43%30:30 Tan H ReLU 0.979425 3.27%30:30 Tan H L-ReLU 0.194509 0.60%30:30 ReLU Identity 0.109806 0.34%




Configuration RMSE

HL. config. Act. func. HL Act. func. OL Abs. Rel.30:30 ReLU ReLU 0.979739 3.27%30:30 ReLU L-ReLU 8.59·1014 > 100%30:30 L-ReLU Identity 0.1109 0.34%30:30 L-ReLU ReLU 0.979244 3.27%30:30 L-ReLU L-ReLU 0.127812 0.39%

F.2 Estimation scenario B

This appendix section contains detailed results for estimation of real fuelconsumption using all considered models.

Table F.13: Model B1 - Results


Abs. Rel. 95% 99% 99.9%B1.1 0.00% 3.700188 10.68% 21.02% 30.11% 45.48%B1.2 1.17% 2.4·1014 > 100% 27.72% > 100% > 100%B1.3 33.55% 3.62087 11.13% 21.96% 33.23% 48.13%

Table F.14: Model B2.1 - Results for all considered configurations


K Abs. Rel. 95% 99% 99.9%1 0.00% 4.02661 11.74% 25.00% 40.00% 87.10%2 0.00% 3.634381 10.46% 22.00% 37.04% 51.96%3 0.00% 3.5124 10.02% 21.14% 34.48% 50.00%4 0.00% 3.499428 10.01% 21.15% 34.00% 49.56%6 0.00% 3.474289 9.88% 20.48% 32.99% 48.40%10 0.00% 3.479035 9.85% 20.29% 31.22% 47.92%15 0.00% 3.465574 9.77% 19.74% 30.07% 46.92%




K Abs. Rel. 95% 99% 99.9%1 0.12% 4.052696 11.92% 25.00% 40.74% 87.10%2 0.12% 3.64648 10.56% 22.22% 37.93% 52.08%3 0.12% 3.52336 10.12% 21.09% 35.48% 50.00%4 0.12% 3.518697 10.09% 21.15% 35.00% 51.32%6 0.12% 3.487171 9.94% 20.45% 33.69% 48.40%10 0.12% 3.494715 9.89% 20.34% 31.48% 47.92%15 0.12% 3.471843 9.80% 19.77% 30.28% 47.62%



K Abs. Rel. 95% 99% 99.9%1 0.00% 4.001062 11.62% 25.00% 40.00% 62.96%2 0.00% 3.642448 10.58% 22.22% 37.50% 52.08%3 0.00% 3.515894 10.10% 21.33% 34.67% 49.71%4 0.00% 3.498356 10.02% 21.09% 33.59% 49.56%6 0.00% 3.472989 9.87% 20.43% 32.99% 48.40%10 0.00% 3.484477 9.85% 20.34% 31.40% 47.92%15 0.00% 3.468565 9.77% 19.73% 30.07% 47.52%




K u Abs. Rel. 95% 99% 99.9%1 1 0.00% 3.839699 11.05% 23.81% 38.33% 55.22%2 1 0.00% 3.568723 10.23% 21.75% 36.28% 52.66%3 1 0.00% 3.455312 9.84% 21.07% 33.85% 51.72%4 1 0.00% 3.410024 9.71% 20.57% 33.68% 50.07%6 1 0.00% 3.336052 9.46% 20.00% 32.85% 49.34%10 1 0.00% 3.285896 9.27% 19.50% 31.11% 50.28%15 1 0.00% 3.247007 9.13% 19.18% 30.57% 50.57%1 2 0.00% 3.839699 11.05% 23.81% 38.33% 55.22%2 2 0.00% 3.599302 10.32% 21.95% 36.28% 53.16%3 2 0.00% 3.501227 9.98% 21.29% 34.86% 52.72%4 2 0.00% 3.45838 9.85% 20.92% 34.91% 51.26%6 2 0.00% 3.395029 9.64% 20.27% 34.67% 49.69%10 2 0.00% 3.348719 9.47% 20.18% 32.55% 49.78%15 2 0.00% 3.316117 9.36% 19.88% 31.72% 50.07%1 5 0.00% 3.839699 11.05% 23.81% 38.33% 55.22%2 5 0.00% 3.665686 10.53% 22.50% 37.00% 53.97%3 5 0.00% 3.601661 10.30% 22.09% 36.26% 54.52%4 5 0.00% 3.573594 10.22% 21.73% 35.82% 52.78%6 5 0.00% 3.53649 10.09% 21.39% 35.75% 52.63%10 5 0.00% 3.51245 10.01% 21.21% 35.39% 52.63%15 5 0.00% 3.500119 9.97% 21.20% 34.99% 52.63%1 10 0.00% 3.839699 11.05% 23.81% 38.33% 55.22%2 10 0.00% 3.722667 10.70% 22.94% 37.50% 54.75%3 10 0.00% 3.683972 10.56% 22.50% 36.43% 55.22%4 10 0.00% 3.66952 10.52% 22.57% 36.36% 54.74%6 10 0.00% 3.651419 10.46% 22.26% 36.40% 54.06%10 10 0.00% 3.641247 10.42% 22.22% 36.28% 54.06%15 10 0.00% 3.637733 10.41% 22.22% 36.11% 54.06%




K u Abs. Rel. 95% 99% 99.9%1 1 0.12% 3.838573 11.04% 23.68% 38.89% 58.33%2 1 0.12% 3.560785 10.20% 21.67% 36.54% 52.66%3 1 0.12% 3.444947 9.81% 20.81% 34.38% 51.35%4 1 0.12% 3.412916 9.72% 20.57% 34.30% 50.27%6 1 0.12% 3.338852 9.48% 20.00% 33.16% 49.34%10 1 0.12% 3.291848 9.29% 19.51% 31.25% 50.68%15 1 0.12% 3.250783 9.15% 19.10% 30.60% 50.68%1 2 0.12% 3.838573 11.04% 23.68% 38.89% 58.33%2 2 0.12% 3.592407 10.30% 21.95% 36.52% 53.16%3 2 0.12% 3.491651 9.95% 21.14% 34.86% 52.72%4 2 0.12% 3.45756 9.85% 20.80% 34.91% 51.26%6 2 0.12% 3.395416 9.65% 20.24% 34.67% 49.69%10 2 0.12% 3.352189 9.49% 20.18% 32.55% 49.78%15 2 0.12% 3.318547 9.37% 19.93% 31.85% 50.25%1 5 0.12% 3.838573 11.04% 23.68% 38.89% 58.33%2 5 0.12% 3.661927 10.51% 22.48% 37.04% 53.97%3 5 0.12% 3.595825 10.28% 21.97% 36.26% 54.52%4 5 0.12% 3.570194 10.21% 21.64% 35.82% 52.78%6 5 0.12% 3.53566 10.09% 21.36% 35.75% 52.63%10 5 0.12% 3.514029 10.01% 21.21% 35.45% 52.63%15 10 0.12% 3.502288 9.97% 21.20% 34.99% 52.63%1 10 0.12% 3.838573 11.04% 23.68% 38.89% 58.33%2 10 0.12% 3.7202 10.69% 22.89% 37.67% 54.75%3 10 0.12% 3.681261 10.55% 22.36% 36.43% 55.22%4 10 0.12% 3.665958 10.51% 22.35% 36.36% 54.74%6 10 0.12% 3.649935 10.45% 22.23% 36.40% 54.06%10 10 0.12% 3.640947 10.42% 22.22% 36.36% 54.06%15 10 0.12% 3.638165 10.41% 22.22% 36.28% 54.06%




K u Abs. Rel. 95% 99% 99.9%1 1 0.00% 3.839377 11.06% 23.53% 38.89% 62.96%2 1 0.00% 3.55983 10.22% 21.62% 36.54% 52.66%3 1 0.00% 3.446119 9.82% 20.93% 34.12% 50.05%4 1 0.00% 3.405603 9.69% 20.59% 33.68% 50.07%6 1 0.00% 3.335196 9.46% 20.00% 32.44% 49.34%10 1 0.00% 3.286983 9.27% 19.51% 31.19% 50.28%15 1 0.00% 3.247518 9.13% 19.16% 30.57% 50.57%1 2 0.00% 3.839377 11.06% 23.53% 38.89% 62.96%2 2 0.00% 3.591119 10.31% 21.84% 36.52% 53.16%3 2 0.00% 3.492538 9.96% 21.19% 34.67% 52.63%4 2 0.00% 3.453838 9.84% 20.83% 34.52% 51.26%6 2 0.00% 3.394173 9.64% 20.23% 33.33% 49.69%10 2 0.00% 3.349154 9.47% 20.18% 32.55% 49.78%15 2 0.00% 3.316455 9.36% 19.88% 31.85% 50.07%1 5 0.00% 3.839377 11.06% 23.53% 38.89% 62.96%2 5 0.00% 3.65872 10.51% 22.39% 37.04% 53.97%3 5 0.00% 3.594657 10.28% 22.00% 36.26% 53.46%4 5 0.00% 3.569231 10.20% 21.64% 35.77% 52.78%6 5 0.00% 3.535698 10.09% 21.43% 35.56% 52.63%10 5 0.00% 3.512449 10.01% 21.21% 35.39% 52.63%15 10 0.00% 3.500372 9.97% 21.20% 34.99% 52.63%1 10 0.00% 3.839377 11.06% 23.53% 38.89% 62.96%2 10 0.00% 3.715986 10.69% 22.85% 37.50% 54.75%3 10 0.00% 3.678588 10.55% 22.50% 36.43% 54.75%4 10 0.00% 3.665528 10.50% 22.35% 36.36% 54.74%6 10 0.00% 3.650342 10.45% 22.26% 36.40% 54.06%10 10 0.00% 3.640529 10.42% 22.22% 36.36% 54.06%15 10 0.00% 3.637312 10.41% 22.22% 36.28% 54.06%

Table F.20: Model B4 - Results


Abs. Rel. 95% 99% 99.9%B4.1 0.00% 3.903643 11.24% 22.17% 31.48% 48.29%B4.2 0.00% 3.82209 10.98% 21.60% 30.93% 48.45%B4.3 0.00% 3.868441 11.08% 21.93% 30.86% 47.77%B4.4 0.00% 3.880771 11.14% 22.03% 31.40% 48.20%B4.5 0.00% 3.883087 11.14% 21.82% 31.83% 48.67%



Configuration RMSE




B5.1 Continued from previous page

Configuration RMSE




Configuration RMSE





Configuration RMSE


10:10 Tan H Identity 3.642982 10.65%10:10 Tan H ReLU 3.58383 10.35%10:10 Tan H L-ReLU 2510388 > 100%10:10 ReLU Identity 3.773282 10.52%10:10 ReLU ReLU 3.738492 10.42%10:10 ReLU L-ReLU 3.649875 10.40%




Configuration RMSE

HL. config. Act. func. HL Act. func. OL Abs. Rel.10:10 L-ReLU Identity 3.654286 10.33%10:10 L-ReLU ReLU 3.679719 10.39%10:10 L-ReLU L-ReLU 3.638704 10.44%20:20 Tan H Identity 3.593615 10.39%20:20 Tan H ReLU 3.598787 10.50%20:20 Tan H L-ReLU 3.624851 10.54%20:20 ReLU Identity 3.685819 10.33%20:20 ReLU ReLU 3.620399 10.42%20:20 ReLU L-ReLU 31.18493 99.72%20:20 L-ReLU Identity 3.686011 10.34%20:20 L-ReLU ReLU 3.624601 10.34%20:20 L-ReLU L-ReLU 3.621062 10.36%30:30 Tan H Identity 3.59659 10.41%30:30 Tan H ReLU 3.6007 10.46%30:30 Tan H L-ReLU 3.628952 10.40%30:30 ReLU Identity 3.684541 10.31%30:30 ReLU ReLU 3.61748 10.40%30:30 ReLU L-ReLU 3.71·1018 > 100%30:30 L-ReLU Identity 3.643245 10.30%30:30 L-ReLU ReLU 3.59576 10.38%30:30 L-ReLU L-ReLU 3.600698 10.29%


Configuration RMSE





Configuration RMSE






Configuration RMSE



Configuration RMSE





Configuration RMSE

HL. config. Act. func. HL Act. func. OL Abs. Rel.20 ReLU L-ReLU 3.69461 10.87%20 L-ReLU Identity 3.730209 10.49%20 L-ReLU ReLU 4.10388 13.30%20 L-ReLU L-ReLU 3.692452 10.80%30 Tan H Identity 3.65875 10.52%30 Tan H ReLU 4.099541 13.28%30 Tan H L-ReLU 3.831626 11.53%30 ReLU Identity 3.578413 10.49%30 ReLU ReLU 4.105879 13.31%30 ReLU L-ReLU 3.698009 10.77%30 L-ReLU Identity 3.636988 10.51%30 L-ReLU ReLU 4.099476 13.29%30 L-ReLU L-ReLU 3.699137 10.77%5:5 Tan H Identity 3.670727 10.45%5:5 Tan H ReLU 4.117885 13.35%5:5 Tan H L-ReLU 3.951055 12.10%5:5 ReLU Identity 3.667857 10.51%5:5 ReLU ReLU 4.128494 13.33%5:5 ReLU L-ReLU 3.727404 10.90%5:5 L-ReLU Identity 3.685579 10.58%5:5 L-ReLU ReLU 4.125362 13.35%5:5 L-ReLU L-ReLU 3.693958 10.63%





Configuration RMSE

HL. config. Act. func. HL Act. func. OL Abs. Rel.30:30 ReLU ReLU 4.103098 13.33%30:30 ReLU L-ReLU 3.707979 10.86%30:30 L-ReLU Identity 3.572188 10.43%30:30 L-ReLU ReLU 4.099036 13.31%30:30 L-ReLU L-ReLU 3.629528 10.54%

F.3 Estimation scenario C

This appendix section contains detailed results for estimation of real fuelconsumption using all considered models.

Table F.25: Model C1 - Results


Abs. Rel. 95% 99% 99.9%C1.1 0.00% 2.398317912 6.55% 12.12% 20.31% 49.24%C1.2 2.11% 3.47658·1013 > 100% 17.05% 42.33% > 100%C1.3 33.55% 2.355034344 6.64% 12.14% 20.40% 49.65%

Table F.26: Model C2.1 - Results for all considered configurations


K Abs. Rel. 95% 99% 99.9%1 0.00% 3.052365 8.63% 16.67% 28.00% 57.14%2 0.00% 2.691695 7.42% 14.52% 25.00% 49.15%3 0.00% 2.634188 7.14% 14.29% 23.33% 48.72%4 0.00% 2.60858 7.03% 13.95% 22.41% 48.21%6 0.00% 2.594548 6.94% 13.89% 21.33% 49.04%10 0.00% 2.61735 6.98% 13.64% 22.07% 48.00%15 0.00% 2.645702 7.05% 13.85% 22.30% 47.88%




K Abs. Rel. 95% 99% 99.9%1 0.12% 3.075935 8.66% 16.67% 28.00% 57.14%2 0.12% 2.742532 7.53% 14.58% 26.00% 50.00%3 0.12% 2.691028 7.26% 14.44% 24.07% 50.28%4 0.12% 2.6715 7.17% 14.06% 23.13% 49.51%6 0.12% 2.650347 7.08% 13.89% 22.22% 50.33%10 0.12% 2.678659 7.15% 13.71% 23.06% 50.33%15 0.12% 2.717978 7.26% 14.02% 23.42% 50.26%



K Abs. Rel. 95% 99% 99.9%1 0.00% 3.051902 8.63% 16.67% 28.00% 57.14%2 0.00% 2.694578 7.43% 14.52% 25.58% 49.15%3 0.00% 2.637057 7.15% 14.29% 23.66% 48.72%4 0.00% 2.618643 7.05% 14.00% 22.60% 48.21%6 0.00% 2.602907 6.96% 13.89% 21.33% 49.04%10 0.00% 2.626713 7.00% 13.61% 22.07% 48.00%15 0.00% 2.659179 7.08% 13.89% 22.38% 47.88%




K u Abs. Rel. 95% 99% 99.9%1 1 0.00% 3.052365 8.63% 16.67% 28.00% 57.14%2 1 0.00% 2.675036 7.37% 14.44% 24.17% 50.11%3 1 0.00% 2.597011 7.03% 13.99% 23.21% 48.96%4 1 0.00% 2.564695 6.90% 13.71% 22.49% 48.36%6 1 0.00% 2.544593 6.79% 13.52% 21.06% 49.04%10 1 0.00% 2.559776 6.80% 13.36% 21.39% 48.01%15 1 0.00% 2.585375 6.86% 13.44% 21.83% 47.90%1 2 0.00% 3.052365 8.63% 16.67% 28.00% 57.14%2 2 0.00% 2.676659 7.37% 14.27% 24.93% 50.22%3 2 0.00% 2.584895 7.00% 13.91% 23.17% 49.11%4 2 0.00% 2.542858 6.84% 13.59% 22.31% 48.63%6 2 0.00% 2.511718 6.69% 13.31% 21.39% 49.03%10 2 0.00% 2.510847 6.66% 13.06% 21.07% 48.03%15 2 0.00% 2.525834 6.68% 13.27% 21.48% 47.93%1 5 0.00% 3.052365 8.63% 16.67% 28.00% 57.14%2 5 0.00% 2.716616 7.52% 14.46% 24.85% 50.48%3 5 0.00% 2.620276 7.14% 14.19% 23.33% 50.53%4 5 0.00% 2.568671 6.96% 13.85% 22.71% 50.56%6 5 0.00% 2.52399 6.77% 13.43% 22.23% 49.40%10 5 0.00% 2.494963 6.65% 13.11% 21.79% 48.34%15 5 0.00% 2.484064 6.60% 12.94% 21.47% 48.04%1 10 0.00% 3.052365 8.63% 16.67% 28.00% 57.14%2 10 0.00% 2.78705 7.76% 14.83% 25.02% 53.85%3 10 0.00% 2.715385 7.50% 14.66% 24.41% 53.39%4 10 0.00% 2.675368 7.35% 14.54% 24.01% 52.89%6 10 0.00% 2.641292 7.22% 14.17% 23.70% 52.74%10 10 0.00% 2.613504 7.11% 13.90% 23.07% 50.68%15 10 0.00% 2.599631 7.05% 13.78% 23.09% 50.68%




K u Abs. Rel. 95% 99% 99.9%1 1 0.12% 3.075935 8.66% 16.67% 28.00% 57.14%2 1 0.12% 2.717003 7.45% 14.59% 25.45% 50.13%3 1 0.12% 2.645741 7.12% 14.09% 23.75% 50.28%4 1 0.12% 2.617743 7.00% 13.80% 22.74% 49.58%6 1 0.12% 2.59103 6.89% 13.59% 22.20% 50.29%10 1 0.12% 2.610141 6.92% 13.46% 21.79% 50.29%15 1 0.12% 2.64276 7.00% 13.59% 22.42% 50.29%1 2 0.12% 3.075935 8.66% 16.67% 28.00% 57.14%2 2 0.12% 2.715245 7.44% 14.33% 25.00% 50.22%3 2 0.12% 2.629322 7.08% 13.96% 23.32% 50.28%4 2 0.12% 2.590737 6.93% 13.69% 22.39% 49.96%6 2 0.12% 2.554187 6.78% 13.36% 21.79% 50.24%10 2 0.12% 2.556474 6.75% 13.14% 21.32% 50.24%15 2 0.12% 2.575519 6.79% 13.38% 21.67% 50.24%1 5 0.12% 3.075935 8.66% 16.67% 28.00% 57.14%2 5 0.12% 2.750753 7.58% 14.56% 25.22% 50.48%3 5 0.12% 2.658415 7.21% 14.22% 23.54% 50.53%4 5 0.12% 2.609074 7.02% 13.86% 23.02% 50.56%6 5 0.12% 2.561698 6.84% 13.43% 22.57% 50.11%10 5 0.12% 2.535091 6.73% 13.19% 21.90% 48.94%15 5 0.12% 2.525299 6.68% 13.02% 21.59% 49.32%1 10 0.12% 3.075935 8.66% 16.67% 28.00% 57.14%2 10 0.12% 2.817649 7.81% 14.94% 25.47% 53.85%3 10 0.12% 2.748376 7.55% 14.66% 24.68% 53.39%4 10 0.12% 2.709719 7.41% 14.53% 24.28% 52.89%6 10 0.12% 2.673745 7.27% 14.16% 23.76% 52.74%10 10 0.12% 2.646855 7.17% 13.92% 23.31% 50.68%15 10 0.12% 2.632854 7.11% 13.79% 23.21% 50.68%




K u Abs. Rel. 95% 99% 99.9%1 1 0.00% 3.051902 8.63% 16.67% 28.00% 57.14%2 1 0.00% 2.677778 7.38% 14.45% 24.97% 50.11%3 1 0.00% 2.599696 7.04% 13.99% 23.50% 48.96%4 1 0.00% 2.574015 6.92% 13.75% 22.53% 48.36%6 1 0.00% 2.551245 6.80% 13.55% 21.17% 49.04%10 1 0.00% 2.568026 6.82% 13.29% 21.32% 48.01%15 1 0.00% 2.596831 6.89% 13.44% 21.86% 47.90%1 2 0.00% 3.051902 8.63% 16.67% 28.00% 57.14%2 2 0.00% 2.679248 7.39% 14.29% 24.94% 50.22%3 2 0.00% 2.587379 7.01% 13.92% 23.27% 49.11%4 2 0.00% 2.551622 6.86% 13.61% 22.39% 48.63%6 2 0.00% 2.517444 6.71% 13.34% 21.69% 49.03%10 2 0.00% 2.518433 6.68% 13.06% 21.09% 48.03%15 2 0.00% 2.535794 6.71% 13.26% 21.32% 47.93%1 5 0.00% 3.051902 8.63% 16.67% 28.00% 57.14%2 5 0.00% 2.718777 7.53% 14.50% 24.95% 50.48%3 5 0.00% 2.622236 7.15% 14.19% 23.52% 50.53%4 5 0.00% 2.576146 6.97% 13.86% 23.03% 50.56%6 5 0.00% 2.528381 6.78% 13.43% 22.25% 49.40%10 5 0.00% 2.501205 6.67% 13.14% 22.02% 48.34%15 5 0.00% 2.49119 6.62% 12.96% 21.69% 48.04%1 10 0.00% 3.051902 8.63% 16.67% 28.00% 57.14%2 10 0.00% 2.788692 7.77% 14.94% 25.02% 53.85%3 10 0.00% 2.716945 7.50% 14.67% 24.68% 53.39%4 10 0.00% 2.681715 7.37% 14.54% 24.28% 52.89%6 10 0.00% 2.645154 7.23% 14.17% 23.76% 52.74%10 10 0.00% 2.618648 7.12% 13.89% 23.29% 50.68%15 10 0.00% 2.604508 7.07% 13.78% 23.19% 50.68%

Table F.32: Model C4 - Results


Abs. Rel. 95% 99% 99.9%C4.1 0.00% 2.467006575 6.69% 12.65% 20.09% 49.21%C4.2 0.00% 2.42467463 6.53% 12.19% 20.10% 48.88%C4.3 0.00% 2.523092788 6.89% 13.07% 20.44% 49.08%C4.4 0.00% 2.398799757 6.42% 12.05% 19.25% 48.82%C4.5 0.00% 2.394519819 6.42% 12.17% 20.09% 49.36%



Configuration RMSE




C5.1 Continued from previous page

Configuration RMSE




Configuration RMSE





Configuration RMSE


10:10 Tan H Identity 2.251716254 6.09%10:10 Tan H ReLU 2.248413693 6.03%10:10 Tan H L-ReLU 2.271435403 6.09%10:10 ReLU Identity 2.261611095 6.07%10:10 ReLU ReLU 2.272969293 6.09%10:10 ReLU L-ReLU 4.37913·1014 -




Configuration RMSE

HL. config. Act. func. HL Act. func. OL Abs. Rel.10:10 L-ReLU Identity 2.253002069 6.01%10:10 L-ReLU ReLU 2.262105091 6.02%10:10 L-ReLU L-ReLU 2.319226508 6.30%20:20 Tan H Identity 2.263081596 6.10%20:20 Tan H ReLU 2.28745402 6.14%20:20 Tan H L-ReLU 2.310157286 6.15%20:20 ReLU Identity 2.239455942 5.96%20:20 ReLU ReLU 2.22106589 5.93%20:20 ReLU L-ReLU 11480581.03 > 100%20:20 L-ReLU Identity 2.225170968 5.95%20:20 L-ReLU ReLU 2.225384881 5.90%20:20 L-ReLU L-ReLU 2.221103527 5.89%30:30 Tan H Identity 2.28350961 6.14%30:30 Tan H ReLU 2.311960927 6.13%30:30 Tan H L-ReLU 2.323748049 6.22%30:30 ReLU Identity 2.227360287 5.90%30:30 ReLU ReLU 2.221764098 5.90%30:30 ReLU L-ReLU 2.22256383 5.95%30:30 L-ReLU Identity 2.228973075 5.95%30:30 L-ReLU ReLU 2.216725357 5.92%30:30 L-ReLU L-ReLU 2.22062401 5.88%


Configuration RMSE





Configuration RMSE






Configuration RMSE



Configuration RMSE





Configuration RMSE

HL. config. Act. func. HL Act. func. OL Abs. Rel.20 ReLU L-ReLU 2.33294137 6.36%20 L-ReLU Identity 2.233583172 5.97%20 L-ReLU ReLU 3.402064409 11.78%20 L-ReLU L-ReLU 2.311852444 6.28%30 Tan H Identity 2.28622928 6.09%30 Tan H ReLU 3.408024951 11.79%30 Tan H L-ReLU 591774.1528 > 100%30 ReLU Identity 2.225864873 5.93%30 ReLU ReLU 3.394428747 11.78%30 ReLU L-ReLU 2.32696041 6.35%30 L-ReLU Identity 2.214892641 5.94%30 L-ReLU ReLU 3.399791674 11.78%30 L-ReLU L-ReLU 2.3275426 6.36%5:5 Tan H Identity 2.312734136 6.25%5:5 Tan H ReLU 3.419211918 11.83%5:5 Tan H L-ReLU 2.57864064 7.68%5:5 ReLU Identity 2.27722664 6.17%5:5 ReLU ReLU 3.423959025 11.83%5:5 ReLU L-ReLU 2.352725549 6.46%5:5 L-ReLU Identity 2.284281376 6.11%5:5 L-ReLU ReLU 3.42096507 11.83%5:5 L-ReLU L-ReLU 2.293240132 6.24%





Configuration RMSE

HL. config. Act. func. HL Act. func. OL Abs. Rel.30:30 ReLU ReLU 3.394887508 11.78%30:30 ReLU L-ReLU 2.291584132 6.26%30:30 L-ReLU Identity 2.201082283 5.90%30:30 L-ReLU ReLU 3.394255672 11.78%30:30 L-ReLU L-ReLU 2.272760732 6.12%

109

Appendix G

Additional figures

(a) Green line - Overfitted model. Blackline - properly fitted model.

(b) Blue line - Overfitted model. Greenline - properly fitted model.

Figure G.1: Example of overfitting in classification (a) and regression (b)

110

Appendix H

Code

An implementation of artificial neural networks was implemented in C# (C-sharp) for this thesis. The source code can be found at:

https://sourceforge.net/projects/lukas-ann

TRITA -SCI-GRU 2018:378

www.kth.se

kth.diva-portal.org1255660/fulltext01.pdf · denna rapport fokuserar p a att unders oka anv...

Documents