Prediction of Automotive Engine Power and Torque Using
Least Squares Support Vector Machines and Bayesian
Inference
CHI-MAN VONG 1*, PAK-KIN WONG 2, YI-PING LI 1
1 (Department of Computer and Information Science, University of Macau, P.O. Box 3001, Macau, China)
2 (Department of Electromechanical Engineering, University of Macau, P.O. Box 3001, Macau, China)
* E-mail: cmvong@umac.mo, Phone: (853) 3974476
Abstract: Automotive engine power and torque are significantly affected by effective tune-up. Current practice of engine tune-up relies on the experience of the automotive engineer. The tune-up is usually done by a trial-and-error method, and the vehicle engine is then run on a dynamometer to show the actual engine output power and torque. Obviously, the current practice costs a large amount of time and money, and may even fail to tune up the engine optimally because a formal power and torque function of the engine has not been determined. With an emerging technique, Least Squares Support Vector Machines (LS-SVM), the approximate power and torque functions of a vehicle engine can be determined by training on sample data acquired from the dynamometer. The number of dynamometer tests for an engine tune-up can therefore be reduced, because the estimated engine power and torque functions can replace the dynamometer tests to a certain extent. In addition, a Bayesian framework is applied to infer the hyper-parameters used in LS-SVM so as to eliminate the work of cross-validation, which leads to a significant reduction in training time. In this paper, the construction, validation and accuracy of the functions are discussed. The study shows that the predicted results are in good agreement with the actual test results. To illustrate the significance of the LS-SVM methodology, the results are also compared with those regressed using a multilayer feedforward neural network.
Keywords: Automotive engine setup, Least Squares Support Vector Machines, Bayesian inference, Engine power and torque.
1. INTRODUCTION
Modern automotive gasoline engines are controlled by the electronic control unit (ECU).
The engine output power and torque are significantly affected by the setup of control
parameters in the ECU. Many parameters are stored in the ECU in the form of look-up tables or maps (Fig. 1). Normally, the power and torque data of a car engine are obtained through dynamometer tests. An example of the performance data of an engine, output horsepower and torque against speed, is shown in Fig. 2. The engine power and torque reflect the dynamic performance of an engine. Traditionally, the setup of the ECU is done by the vehicle manufacturer. In recent years, however, programmable ECUs and ECU read-only memory (ROM) editors have been widely adopted for many passenger cars. These devices allow non-OEM engineers to tune up engines according to different add-on components and drivers' requirements.
Current practice of engine tune-up relies on the experience of the automotive engineer, who must handle a huge number of combinations of engine control parameters. The relationship between the input and output parameters of a modern car engine is a complex multi-variable nonlinear function that is very difficult to find, because a modern automotive engine is an integration of thermo-fluid, electromechanical and computer control systems. Consequently, engine tune-up is usually done by a trial-and-error method. Firstly, the engineer guesses an ECU setup based on his/her experience and then stores the setup values in the ECU. The engine is then run on a dynamometer to test the actual engine power and torque. If the performance falls short, the engineer adjusts the ECU setting and repeats the procedure until the performance is satisfactory. That is why vehicle manufacturers normally spend many months tuning up the ECU optimally for a new car model. Moreover, the power and torque functions are engine dependent as well; every engine requires a similar tune-up procedure.
By knowing the power and torque functions, the automotive engineer can predict whether a trial ECU setup results in a gain or loss of performance. The car engine then only needs to go through a dynamometer test for verification after a satisfactory setup has been estimated from the functions. Hence the number of unnecessary dynamometer tests for trial setups can be drastically reduced, saving a large amount of testing time and money.
Recent research (Brace, 1998; Traver et al., 1999; Su et al., 2002; Yan et al., 2003; Liu et al., 2004) has described the use of neural networks for modelling diesel engine emission performance based on experimental data. It is well known that a neural network (Bishop, 1995; Haykin, 1999) is a universal estimator. It has, however, two main drawbacks (Smola et al., 1996; Schölkopf and Smola, 2002).
(1) The architecture, including the number of hidden neurons, has to be determined a priori or modified during training by heuristics, which results in a network structure that is not necessarily optimal.
(2) Neural networks can easily get stuck in local minima. Various ways of preventing local minima, such as early stopping and weight decay, are employed. However, those methods greatly affect the generalization of the estimated function, i.e., its capacity for handling new input cases.
Traditional mathematical methods of nonlinear regression (Sen and Srivastava, 1990; Ryan, 1996; Harrell, 2001; Tabachnick and Fidell, 2001; Seber and Wild, 2003) may be applied to construct the engine power and torque functions. However, an engine setup involves many parameters and a large amount of data, and constructing the functions in such a high-dimensional and nonlinear data space is very difficult for traditional regression methods. With an emerging technique, Support Vector Machines (SVM) (Cristianini and Shawe-Taylor, 2000; Schölkopf and Smola, 2002; Suykens et al., 2002), the issues of high dimensionality as well as the previous drawbacks of neural networks are overcome. Using SVM, the regressed engine power and torque functions can be used for precise prediction so that the number of dynamometer tests can be significantly reduced. Dynamometer tests normally cost a large amount of money and time. Moreover, a dynamometer is not always available, particularly in the case of on-road fine tune-up. Research on the prediction of modern gasoline engine output power and torque subject to various parameter setups in the ECU is still quite rare, so the use of SVM for modelling engine output power and torque is a first attempt.
2. SUPPORT VECTOR MACHINES
SVM is an interdisciplinary field of machine learning, optimization, statistical learning and generalization theory. Basically it can be used for pattern classification and nonlinear regression (Gunn, 1998). In either application, SVM formulates the task as a Quadratic Programming (QP) problem for the weights, with a regularization factor included. Since a QP problem is convex, the solution of the QP problem is global (or even unique) instead of a local solution. The advantages of SVM (Smola et al., 1996) as opposed to neural networks are:
(1) The architecture of the system does not have to be determined before training. Input data of arbitrary dimensionality can be treated with only linear cost in the number of input dimensions.
(2) SVM treats regression as a QP problem of minimizing the data-fitting error plus regularization, which produces a global (or even unique) solution having minimum fitting error, while high generalization of the estimated function can also be obtained.
2.1 Least squares support vector machine
Least squares support vector machines (LS-SVM) (Suykens et al., 2002) is a variant of SVM that employs least-squares errors in the objective function of the optimization problem. SVM solves nonlinear regression problems by means of convex quadratic programs, and sparseness is obtained as a result of this QP problem. However, QP problems are inherently difficult to solve. Although many commercial packages exist for solving QP problems, a simpler formulation is still preferred. LS-SVM is the variant that modifies the original SVM formulation, leading to a set of linear equations that is easier to solve than a QP problem. In addition, LS-SVM requires only two hyper-parameters for the Radial Basis Function (RBF) kernel, whereas SVM requires three. Moreover, the threshold b is returned as part of the LS-SVM solution, whereas SVM must calculate the threshold b separately.
2.2 LS-SVM for nonlinear function estimation
Consider a data set D = {(x1, y1), …, (xN, yN)} with N data points, where xk ∈ Rn and yk ∈ R, k = 1, …, N. LS-SVM deals with the following optimization problem in the primal weight space:
$$\min_{\mathbf{w},b,\mathbf{e}} J_P(\mathbf{w},\mathbf{e}) = \frac{1}{2}\mathbf{w}^T\mathbf{w} + \gamma\,\frac{1}{2}\sum_{k=1}^{N} e_k^2 \quad \text{such that}\quad y_k = \mathbf{w}^T\varphi(\mathbf{x}_k) + b + e_k,\ k = 1,\dots,N \qquad (1)$$

where w ∈ R^{nh} is the weight vector of the target function, e = [e1; …; eN] is the residual vector, φ: R^n → R^{nh} is a nonlinear mapping, n is the dimension of xk, and nh is the dimension of the unknown feature space. Solving the dual of Eq. (1) avoids the high (and unknown) dimensionality of w. The LS-SVM dual formulation of nonlinear function estimation is then expressed as follows (Suykens et al., 2002):
Solve in α, b:
$$\begin{bmatrix} 0 & \mathbf{1}_v^T \\ \mathbf{1}_v & \boldsymbol{\Omega} + \gamma^{-1}\mathbf{I}_N \end{bmatrix}\begin{bmatrix} b \\ \boldsymbol{\alpha} \end{bmatrix} = \begin{bmatrix} 0 \\ \mathbf{y} \end{bmatrix} \qquad (2)$$

where IN is the N × N identity matrix, y = [y1, …, yN]^T, 1v = [1, …, 1]^T is an N-dimensional vector of ones, α = [α1, …, αN]^T, and γ ∈ R is a scalar regularization factor (a hyper-parameter for tuning). The kernel trick is employed as follows:
$$\Omega_{kl} = \varphi(\mathbf{x}_k)^T\varphi(\mathbf{x}_l) = K(\mathbf{x}_k, \mathbf{x}_l), \qquad k, l = 1, \dots, N \qquad (3)$$
where K is a predefined kernel function.
The resulting LS-SVM model for function estimation becomes
$$M(\mathbf{x}) = \sum_{k=1}^{N}\alpha_k\,\varphi(\mathbf{x})^T\varphi(\mathbf{x}_k) + b = \sum_{k=1}^{N}\alpha_k\,K(\mathbf{x},\mathbf{x}_k) + b = \sum_{k=1}^{N}\alpha_k\exp\!\left(-\frac{\lVert\mathbf{x}-\mathbf{x}_k\rVert^2}{\sigma^2}\right) + b \qquad (4)$$
where αk, b ∈ R are the solutions of Eq. (2), xk is a training data point, x is the new input case, and the Radial Basis Function (RBF) is chosen as the kernel function K. From the viewpoint of the current application, some parameters in Eq. (2) are specified as:
N: total number of engine setups (data points)
xk: engine input control parameters in the kth sample data point, k = 1, 2, …, N (i.e., the kth engine setup)
yk: engine output torque in the kth sample data point
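As an illustration of Eqs. (2)-(4), the following is a minimal NumPy sketch (not the authors' LS-SVMlab code) of LS-SVM training and prediction with an RBF kernel; the function and variable names are illustrative only:

```python
import numpy as np

def rbf_kernel(X1, X2, sigma):
    # Eq. (3) with an RBF kernel: K(x_k, x_l) = exp(-||x_k - x_l||^2 / sigma^2)
    d2 = np.sum(X1**2, axis=1)[:, None] + np.sum(X2**2, axis=1)[None, :] - 2.0 * X1 @ X2.T
    return np.exp(-d2 / sigma**2)

def lssvm_train(X, y, gamma, sigma):
    # Solve the linear system of Eq. (2) for the threshold b and support values alpha
    N = X.shape[0]
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = 1.0                                   # 1_v^T
    A[1:, 0] = 1.0                                   # 1_v
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(N) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]                           # b, alpha

def lssvm_predict(X_new, X_train, alpha, b, sigma):
    # Eq. (4): M(x) = sum_k alpha_k K(x, x_k) + b
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b
```

For this application, X would hold the eight normalized setup parameters of one speed subset and y the corresponding normalized torque values.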
3. APPLICATION OF LS-SVM TO GASOLINE ENGINE MODELLING
In the current application, M(x) in Eq. (4) is the torque function of an automotive engine.
The power of the engine is calculated based on the engine torque as discussed in Section
4. The issues of LS-SVM for this application domain are discussed in the following
sub-sections.
3.1 Schema
The training data set is expressed as D = dk =(xk, yk), k = 1 to N. Practically, there
are many input control parameters and they are also ECU and engine dependent.
Moreover, the engine power and torque curves are normally obtained at full-load
condition. For the demonstration purpose of the LS-SVM methodology, the following
common adjustable engine parameters and environmental parameter are selected to be
the input (i.e., engine setup) at engine full-load condition.
x = < Ir, O, tr, f, Jr, d, a, p > and y = <Tr>
where
r: Engine speed (RPM) and r ∈ 1000, 1500, 2000, 2500,…, 8000
Ir: Ignition spark advance at the corresponding engine speed r (degree before top
dead centre)
O: Overall ignition trim ( ± degree before top dead centre)
tr: Fuel injection time at the corresponding engine speed r (millisecond)
f: Overall fuel trim ( ± %)
Jr: Timing for stopping the fuel injection at the corresponding engine speed r (degree
before top dead centre)
d: Ignition dwell time at 15V (millisecond)
a: Air temperature (°C)
p: Fuel pressure (Bar)
Tr: Engine torque at the corresponding engine speed r (Kg-m)
The engine speed range for this project has been selected to be from 1000 RPM to 8000 RPM. Although the engine speed r is a continuous variable, in a practical ECU setup the engineer normally fills in the setup parameters for each category of engine speed in a map format. The map usually divides the speed range discretely with an interval of 500 RPM, as shown in Fig. 1, i.e. r = 1000, 1500, 2000, 2500, …. Therefore, it is unnecessary to build a function across all speeds. For this reason, r is manually categorized with a specified interval of 500 instead of taking any integer value ranging from 0 to 8000.
As some data is engine speed dependent, another notation Dr is used to further specify a
data set containing the data with respect to a specific r. For example, D1000 contains the
following parameters: <I1000, O, t1000, f, J1000, d, a, p, T1000>, while D8000 contains <I8000,
O, t8000, f, J8000, d, a, p, T8000> (Fig.3).
Consequently, D is separated into fifteen subsets, namely D1000, D1500, …, D8000. An example of the training data (engine setup) for D1000 is shown in Table 1. Each subset Dr is passed to the LS-SVM regression module, Eq. (2), one by one in order to construct fifteen torque functions Mr(x) with respect to engine speed r, i.e. Mr(x) = Mr = M1000, M1500, …, M8000.
In this way, the LS-SVM module is run fifteen times. In every run, a different subset Dr is used as the training set to estimate its corresponding torque function. An engine torque against engine speed curve is therefore obtained by fitting a curve that passes through all data points generated by M1000, M1500, M2000, …, M8000.
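The per-speed modelling loop described above can be sketched as follows (illustrative Python; subsets and infer_hyperparameters are hypothetical placeholders, the latter standing in for the Bayesian procedure of Section 4.3, and lssvm_train is the helper from the earlier sketch):

```python
speeds = range(1000, 8500, 500)       # 1000, 1500, ..., 8000 RPM: fifteen subsets D_r

models = {}
for r in speeds:
    X_r, y_r = subsets[r]                               # D_r: setups and torques at speed r
    gamma_r, sigma_r = infer_hyperparameters(X_r, y_r)  # Section 4.3 (Bayesian inference)
    b_r, alpha_r = lssvm_train(X_r, y_r, gamma_r, sigma_r)
    models[r] = (X_r, alpha_r, b_r, sigma_r)            # one torque function M_r per speed

# A torque-vs-speed curve for a new setup x is obtained by evaluating every M_r(x)
# and fitting a curve through the fifteen predicted points.
```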
4. DATA SAMPLING AND IMPLEMENTATION
In practical engine setup, the automotive engineer determines an initial setup, which can
basically start the engine, and then the engine is fine-tuned by adjusting the parameters
about the initial setup values. Therefore, the input parameters are sampled based on the
data points about an initial setup supplied by the engine manufacturer. In the experiment,
a sample data set D of 200 different engine setups along with output torque has been
acquired from a Honda B16A DOHC engine controlled by a programmable ECU,
MoTeC M4 (Fig. 4), running on a chassis dynamometer (Fig. 5) at wide open throttle.
The engine output data is only the torque against the engine speeds because the
horsepower HP of an engine is calculated using:
$$HP = \frac{2\pi \times r \times T \times 9.81}{746 \times 60} \qquad (5)$$
Vong, Wong, and Li, Engineering Applications of Artificial Intelligence, vol.19(3):227-297, 2006
7
where HP : Engine horsepower (Hp)
r : Engine speed (RPM : Revolution per minute)
T: Engine torque (Kg-m)
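The constants in Eq. (5) simply convert units: the torque in kg-m is converted to N·m using g ≈ 9.81 m/s², the engine speed in RPM is converted to rad/s using 2π/60, and the resulting power in watts is converted to horsepower using 1 hp ≈ 746 W:

$$P\,[\mathrm{W}] = (9.81\,T)\times\frac{2\pi r}{60}, \qquad HP = \frac{P}{746} = \frac{2\pi \times r \times T \times 9.81}{746\times 60}$$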
After collection of the sample data set D, every data subset Dr ⊂ D is randomly divided into two sets: TRAINr for training and TESTr for testing, such that Dr = TRAINr ∪ TESTr, where TRAINr contains 80% of Dr and TESTr holds the remaining 20% (Fig. 6). Every TRAINr is then sent to the LS-SVM module for training, which has been implemented using LS-SVMlab (Pelckmans et al., 2003), a MATLAB toolbox, under MS Windows XP. The implementation and other important issues are discussed in the following sub-sections.
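A minimal sketch of this random 80/20 split, assuming each Dr is stored as NumPy arrays (the paper's MATLAB workflow is analogous):

```python
import numpy as np

def split_subset(X_r, y_r, train_frac=0.8, seed=0):
    # Randomly divide D_r into TRAIN_r (80%) and TEST_r (20%)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y_r))
    n_train = int(train_frac * len(y_r))
    train_idx, test_idx = idx[:n_train], idx[n_train:]
    return (X_r[train_idx], y_r[train_idx]), (X_r[test_idx], y_r[test_idx])
```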
4.1 Data pre-processing and post-processing
In order to obtain a more accurate regression result, the data set is conventionally normalized before training (Pyle, 1999). This prevents any parameter from dominating the output value. All input and output values are normalized to the range [0, 1] through the following min-max transformation:
$$v^* = N(v) = \frac{v - v_{\min}}{v_{\max} - v_{\min}} \qquad (6)$$
where vmin and vmax are the minimum and maximum domain values of the input or output parameter v, respectively. For example, if v ∈ [7, 39], then vmin = 7 and vmax = 39. The limits for each input and output parameter of an engine should be predetermined via a number of experiments, expert knowledge or manufacturer data sheets. As all input values are normalized, the output torque value v* produced by the LS-SVM is not the actual value. It must be de-normalized using the inverse transformation N^(-1) of Eq. (6) in order to obtain the actual output value v.
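A sketch of Eq. (6) and its inverse, with the predetermined limits vmin and vmax passed in explicitly (illustrative names):

```python
def normalize(v, v_min, v_max):
    # Eq. (6): map v in [v_min, v_max] to v* in [0, 1]
    return (v - v_min) / (v_max - v_min)

def denormalize(v_star, v_min, v_max):
    # Inverse of Eq. (6): recover the actual value from the LS-SVM output
    return v_star * (v_max - v_min) + v_min
```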
4.2 Error function
To verify the accuracy of each function Mr, an error function has been established. For a certain function Mr, the corresponding validation error is:
$$E_r = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(\frac{y_k - M_r(\mathbf{x}_k)}{y_k}\right)^2} \qquad (7)$$
where xk ∈ Rn is the vector of engine input parameters of the kth data point in a test set or a validation set, yk is the true torque value in the data point dk (dk = <xk, yk> represents the kth data point), and N is the number of data points in the test set or validation set.
The error Er is the root mean square of the difference between the true torque value yk of a test point dk and its corresponding estimated torque value Mr(xk). The difference is also divided by the true torque yk, so that the result is normalized within the range [0, 1].
This ensures that the error Er also lies in that range. Hence the accuracy rate for each torque function Mr is calculated using the following formula:
$$\mathrm{Accuracy}_r = (1 - E_r)\times 100\% \qquad (8)$$
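A short sketch of Eqs. (7) and (8), assuming the predicted torques have already been de-normalized back to actual values:

```python
import numpy as np

def validation_error(y_true, y_pred):
    # Eq. (7): root mean square of the relative prediction error
    return np.sqrt(np.mean(((y_true - y_pred) / y_true) ** 2))

def accuracy(y_true, y_pred):
    # Eq. (8): accuracy rate in percent
    return (1.0 - validation_error(y_true, y_pred)) * 100.0
```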
4.3 Procedures for selection of hyper-parameters
According to Eqs. (2) and (4), the user has to adjust two hyper-parameters (γ, σ), where γ is the regularization factor and σ specifies the kernel width. Without good values for these hyper-parameters, the engine torque functions cannot perform well. In order to select the best values for these hyper-parameters, 10-fold cross-validation is usually applied, but it takes a very long time. Recently, a more sophisticated Bayesian framework (Suykens et al., 2002; Van Gestel et al., 2001a) has been developed that can infer the hyper-parameter values for LS-SVM.
As Bayesian inference is outside the scope of this research, it is not discussed in detail; only the basic inference procedure is given. The procedure is based on a modified version of the LS-SVM formulation as follows, where µ is now the regularization factor instead of γ, and ζ is a parameter associated with the noise variance of the residuals ek (assumed constant):
$$\min_{\mathbf{w},b,\mathbf{e}} J_P(\mathbf{w},\mathbf{e}) = \mu E_W + \zeta E_D \quad \text{such that}\quad y_k = \mathbf{w}^T\varphi(\mathbf{x}_k) + b + e_k,\ k = 1,\dots,N \qquad (9)$$

with

$$E_W = \frac{1}{2}\mathbf{w}^T\mathbf{w}, \qquad E_D = \frac{1}{2}\sum_{k=1}^{N} e_k^2 = \frac{1}{2}\sum_{k=1}^{N}\bigl(y_k - (\mathbf{w}^T\varphi(\mathbf{x}_k) + b)\bigr)^2 \qquad (10)$$

whose dual program is the same as Eq. (2), where w ∈ R^{nh} is the weight vector of the target function and e = [e1; …; eN] is the residual vector. The relationship of γ with µ and ζ is γ = ζ/µ.
The Bayesian inference for LS-SVM regression has three inference levels, as described below. Each level corresponds to the inference of different parameters.
Level 1: [inference of parameters w, b]
$$p(\mathbf{w}, b \mid D, \mu, \zeta, M_\sigma) = \frac{p(D \mid \mathbf{w}, b, \mu, \zeta, M_\sigma)\; p(\mathbf{w}, b \mid \mu, \zeta, M_\sigma)}{p(D \mid \mu, \zeta, M_\sigma)} \qquad (11)$$

where D is the training data set and Mσ = M(x) is the regressed function, similar to Eq. (4), with a specified (or guessed) kernel width σ. After derivation, the posterior probability in Eq. (11) is expressed as (Seeger, 2004):
$$p(\mathbf{w}, b \mid D, \mu, \zeta, M_\sigma) \propto \exp\!\left(-\frac{\mu}{2}\mathbf{w}^T\mathbf{w} - \frac{\zeta}{2}\sum_{k=1}^{N} e_k^2\right) \qquad (12)$$

Minimizing $\frac{\mu}{2}\mathbf{w}^T\mathbf{w} + \frac{\zeta}{2}\sum_{k=1}^{N} e_k^2$ over w and b gives the solution for the function to be estimated, and that is exactly what LS-SVM does (Eqs. (9) and (10)). So, Level 1 is not necessary in this work for guessing the hyper-parameters; it merely completes the Bayesian framework.
Level 2: [inference of hyper-parameters µ, ζ]
$$p(\mu, \zeta \mid D, M_\sigma) = \frac{p(D \mid \mu, \zeta, M_\sigma)\; p(\mu, \zeta \mid M_\sigma)}{p(D \mid M_\sigma)} \qquad (13)$$

It was shown (MacKay, 1995; Van Gestel et al., 2001b) that maximizing Eq. (13) is equivalent to the following optimization problem in the hyper-parameter γ = ζ/µ:
$$\min_{\gamma} J(\gamma) = \sum_{i=1}^{N_{eff}} \log\!\left(\lambda_{G,i} + \frac{1}{\gamma}\right) + (N-1)\log\bigl(E_W + \gamma E_D\bigr) \qquad (14)$$
where

$$E_W + \gamma E_D \approx \frac{1}{2}\,(\mathbf{y} - \hat{m}_y\mathbf{1}_v)^T\,\mathbf{V}_G\left(\mathbf{D}_G + \frac{1}{\gamma}\mathbf{I}_{N_{eff}}\right)^{-1}\mathbf{V}_G^T\,(\mathbf{y} - \hat{m}_y\mathbf{1}_v) \qquad (15)$$

with the empirical mean $\hat{m}_y = \frac{1}{N}\sum_{i=1}^{N} y_i$, $\mathbf{D}_G = \mathrm{diag}([\lambda_{G,1},\dots,\lambda_{G,N_{eff}}])$, and $\mathbf{V}_G = [\mathbf{v}_{G,1};\dots;\mathbf{v}_{G,N_{eff}}]$.
The unknown quantities Neff, vG,i, and λG,i above are obtained by solving a symmetric eigenvalue problem for the centered Gram matrix G = MΩM^T, with M = I_N − (1/N)1_v 1_v^T and Ω as defined in Eq. (3):

$$\mathbf{G}\,\mathbf{v}_{G,i} = \lambda_{G,i}\,\mathbf{v}_{G,i}, \qquad i = 1, \dots, N_{eff} \le N - 1 \qquad (16)$$
In Eq. (16), Neff is defined as the number of non-zero eigenvalues, λG,i and vG,i are
the eigenvalues and eigenvectors of G, respectively.
Once the best hyper-parameter value γMP is found (MP stands for Maximum
Posterior) using a simple one-variable optimization (e.g., a line search) with Eq.
(14) as the objective function, the concerned hyper-parameters µMP and ζMP can be
found as well, using Eq. (17), the relationship γMP = ζMP /µMP, and Eq. (15):
$$\mu_{MP} = \frac{N - 1}{2\,\bigl(E_W + \gamma_{MP} E_D\bigr)} \qquad (17)$$
These hyper-parameters µMP and ζMP are used in the next level, while the user actually
is interested in γMP only.
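The Level 2 computation, as reconstructed in Eqs. (14)-(17), can be sketched as follows (illustrative Python, not the authors' implementation): the centered Gram matrix is eigendecomposed once, J(γ) is evaluated through the non-zero eigenpairs, and a bounded one-dimensional search over log γ plays the role of the line search. SciPy's minimize_scalar is assumed for the search.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def centered_gram(Omega):
    # G = M Omega M^T with M = I_N - (1/N) 1_v 1_v^T, as used in Eq. (16)
    N = Omega.shape[0]
    M = np.eye(N) - np.ones((N, N)) / N
    return M @ Omega @ M.T

def ew_plus_gamma_ed(gamma, lam, proj):
    # Eq. (15): E_W + gamma*E_D, with proj = V_G^T (y - m_hat 1_v)
    return 0.5 * np.sum(proj**2 / (lam + 1.0 / gamma))

def level2_cost(gamma, lam, proj, N):
    # Eq. (14): J(gamma) = sum_i log(lam_i + 1/gamma) + (N-1) log(E_W + gamma*E_D)
    return np.sum(np.log(lam + 1.0 / gamma)) + (N - 1) * np.log(ew_plus_gamma_ed(gamma, lam, proj))

def infer_gamma(Omega, y):
    # Level 2: infer gamma_MP, mu_MP and zeta_MP for a fixed kernel width
    N = len(y)
    lam, V = np.linalg.eigh(centered_gram(Omega))        # symmetric eigenvalue problem, Eq. (16)
    keep = lam > 1e-10                                   # keep the N_eff non-zero eigenvalues
    lam, V = lam[keep], V[:, keep]
    proj = V.T @ (y - y.mean())                          # V_G^T (y - m_hat 1_v)
    res = minimize_scalar(lambda t: level2_cost(np.exp(t), lam, proj, N),
                          bounds=(-10.0, 10.0), method='bounded')  # line search over log(gamma)
    gamma_mp = float(np.exp(res.x))
    mu_mp = (N - 1) / (2.0 * ew_plus_gamma_ed(gamma_mp, lam, proj))  # Eq. (17)
    zeta_mp = gamma_mp * mu_mp                                       # gamma_MP = zeta_MP / mu_MP
    return gamma_mp, mu_mp, zeta_mp, lam
```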
Level 3: [inference of kernel parameter σ]
$$p(M_\sigma \mid D) = \frac{p(D \mid M_\sigma)\; p(M_\sigma)}{p(D)} \qquad (18)$$
where Mσ is the hypothesis, or function, using an RBF kernel parameterized with kernel width σ. The purpose of this step is to optimize the value of σ, so that the regressed function Mσ using σ as the kernel width best fits the training data set D. It was shown that (MacKay, 1995; Van Gestel et al., 2001b):
$$p(D \mid M_\sigma) \propto p(D \mid \mu_{MP}, \zeta_{MP}, M_\sigma) \propto \sqrt{\frac{\mu_{MP}^{\,N_{eff}}\;\zeta_{MP}^{\,N-1}}{\gamma_{eff}\,(N - 1 - \gamma_{eff})\,\prod_{i=1}^{N_{eff}}\bigl(\mu_{MP} + \zeta_{MP}\lambda_{G,i}\bigr)}} \qquad (19)$$
where

$$\gamma_{eff} = 1 + \sum_{i=1}^{N_{eff}} \frac{\gamma_{MP}\,\lambda_{G,i}}{1 + \gamma_{MP}\,\lambda_{G,i}} \qquad (20)$$
To optimize the value of σ, another line search is usually invoked, using Eq. (19) as the objective function. This can easily be done with a commercial optimization package such as the MATLAB Optimization Toolbox.
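Continuing the sketch, the Level 3 search can be written as an outer line search over σ that re-runs the Level 2 inference for each candidate kernel width and scores it with the log of the evidence as reconstructed in Eq. (19). It reuses the hypothetical rbf_kernel and infer_gamma helpers from the earlier sketches:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def log_evidence(X, y, sigma):
    # Log of Eq. (19) for one candidate sigma (up to an additive constant)
    N = len(y)
    Omega = rbf_kernel(X, X, sigma)                          # Eq. (3)
    gamma_mp, mu_mp, zeta_mp, lam = infer_gamma(Omega, y)    # Level 2
    gamma_eff = 1.0 + np.sum(gamma_mp * lam / (1.0 + gamma_mp * lam))   # Eq. (20)
    n_eff = len(lam)
    return 0.5 * (n_eff * np.log(mu_mp) + (N - 1) * np.log(zeta_mp)
                  - np.log(gamma_eff) - np.log(N - 1 - gamma_eff)
                  - np.sum(np.log(mu_mp + zeta_mp * lam)))

def infer_hyperparameters(X, y):
    # Level 3: line search over log(sigma); returns (gamma_MP, sigma_MP)
    res = minimize_scalar(lambda s: -log_evidence(X, y, np.exp(s)),
                          bounds=(-3.0, 3.0), method='bounded')
    sigma_mp = float(np.exp(res.x))
    gamma_mp, _, _, _ = infer_gamma(rbf_kernel(X, X, sigma_mp), y)
    return gamma_mp, sigma_mp
```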
4.4 Training
The training data is first preprocessed using Eq. (6). The hyper-parameters (γ, σ) for the target torque functions are then estimated. Since there are fifteen target torque functions, fifteen individual sets of hyper-parameters (γr, σr) are inferred with respect to r. The detailed inference procedure for a certain training data set TRAINr is listed in Fig. 7. The following paragraph also describes the procedure in detail.
In order to find out the best hyper-parameters, a line search on σ is initially performed.
A value of σ is guessed for the objective function to be optimized. The objective
function for σ is evaluated as follows. Firstly, considering the training data set TRAINr =
(xk, yk), k = 1 to N, a matrix Ω is prepared with the guessed σr (the RBF kernel width), where Ωkl = K(xk, xl), together with the matrix M = I_N − (1/N)1_v 1_v^T, where xk and xl are the kth and lth data points in TRAINr and N is the number of data points in TRAINr. Then the centered Gram matrix G = MΩM^T can be calculated. After that, the eigenvalue problem in Eq. (16) is solved using a commercial package such as MATLAB. Two sets of results, the eigenvalues λG,i and eigenvectors vG,i of G, are returned, where i = 1 to Neff and Neff is the number of (non-zero) eigenvalues returned in the result.
Knowing the values of Neff, vG,i, and λG,i, the matrices DG and VG in Eq. (15) can be constructed. The empirical mean m̂y = (1/N) Σ yi can also be calculated, where yi is found in the ith data point in TRAINr. So EW + γED can be estimated as listed in Eq. (15), where γ is still an unknown scalar. Hence, all
necessary information is prepared for Eq. (14) at this stage. To find out the best
hyper-parameter γMP, another line search is carried out using Eq. (14) as the objective
function. After obtaining the hyper-parameter γMP, another hyper-parameter µMP can be
calculated easily from Eq. (17). According to the relationship γMP = ζMP /µMP, ζMP can
be known as well. At this point, the values of γMP, ζMP, µMP, λG,i, Neff, and N are known. According to Eq. (20), γeff can be calculated as well. The posterior of the function using the guessed σ can then be evaluated from Eq. (19). Up to this stage, only one iteration of the line search on σ is done. The algorithm then goes back, guesses another value for σ, and evaluates the whole procedure again, until the best σ is found and returned as σMP. γMP is also returned as part of the solution. This pair of values (γMP, σMP) is considered to be the best hyper-parameters for the training data set TRAINr, and it is denoted as (γMP,r, σMP,r).
After obtaining the inferred hyper-parameters (γMP,r, σMP,r), the training data set TRAINr is used for calculating the support values α and the threshold b in Eq. (2). Finally, the target function Mr can be constructed using Eq. (4).
5. RESULTS
To illustrate the advantage of LS-SVM regression, the results are compared with those obtained from training a multilayer feedforward neural network (MFN) with backpropagation. Since the MFN is a well-known universal estimator, the results from the MFN can be considered a rather standard benchmark.
5.1 LS-SVM Results
After obtaining all torque functions for an engine, their accuracies are evaluated one by one against their own test sets TESTr using Eqs. (7) and (8). According to the accuracies in Table 2, the predicted results are in good agreement with the actual test results under the hyper-parameters (γMP,r, σMP,r) inferred using the procedure described in Fig. 7. However, it is believed that the function accuracy could be improved by increasing the amount of training data. An example of a comparison between the predicted and actual engine torque and horsepower under the same ECU configuration is shown in Fig. 8.
5.2 MFN results
Fifteen neural networks NETr = NET1000, NET1500, …, NET8000, one for each engine speed r, are built based on the fifteen sets of training data TRAINr = TRr ∪ VALIDr. TRr is used for training the corresponding network NETr, whereas VALIDr is used as a validation set for early stopping of training so as to provide better network generalization.
Vong, Wong, and Li, Engineering Applications of Artificial Intelligence, vol.19(3):227-297, 2006
12
Every neural network consists of 8 input neurons (the parameters of an engine setup at a certain engine speed r), 1 output neuron (the output torque value Tr), and 50 hidden neurons, the latter being an initial guess. Normally, 50 hidden neurons provide enough capacity to approximate a highly nonlinear function. The activation function used in the hidden neurons is the tan-sigmoid transfer function, while a pure linear transfer function is employed for the output neuron (Fig. 9).
The training method employs the standard backpropagation algorithm (i.e., gradient descent along the negative direction of the gradient) so that the results of the MFN can be considered a standard. The learning rate for weight updates is set to 0.05. Each network is trained for 300 epochs. The training results of all NETr are shown in Table 3. The same test sets TESTr are used so that the accuracies of the engine torque functions built by LS-SVM and MFN can be compared fairly. The average accuracy of each NETr shown in Table 3 is calculated using Eqs. (7) and (8).
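For reference, a comparable network can be sketched with scikit-learn's MLPRegressor (the paper used a MATLAB implementation, so this is only an approximate equivalent); the hyper-parameters mirror those stated above:

```python
from sklearn.neural_network import MLPRegressor

net_r = MLPRegressor(hidden_layer_sizes=(50,),   # 50 hidden neurons
                     activation='tanh',          # tan-sigmoid hidden units; the output unit is linear
                     solver='sgd',               # gradient descent backpropagation
                     momentum=0.0,               # no momentum term, i.e. plain gradient descent
                     learning_rate_init=0.05,    # learning rate of the weight update
                     max_iter=300,               # 300 epochs
                     early_stopping=True,        # hold out a validation split for early stopping
                     validation_fraction=0.1)    # roughly the 10% VALID_r split
# net_r.fit(X_train_r, y_train_r); accuracy is then evaluated on TEST_r via net_r.predict(...)
```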
5.3 Comparison of results
With reference to Tables 2 and 3, LS-SVM outperforms the MFN by about 7.07% in overall accuracy under the same test sets TESTr. In addition, the issues of hyper-parameters and training time have also been compared. In LS-SVM, two hyper-parameters (γMP, σMP) are required. They can be inferred using the Bayesian framework, which totally eliminates this burden on the user. In the MFN, the learning rate and the number of hidden neurons are required to be supplied by the user. Certainly, these parameters can also be selected by 10-fold cross-validation. However, the user has to predetermine a grid of guessed values for these parameters, and the grid may not cover the best values for the hyper-parameters. For this reason, LS-SVM can often produce a better generalization rate than the MFN, as indicated in Tables 2 and 3. The MFN produces a smaller training error than LS-SVM since there is no regularization factor controlling the tradeoff between training error and generalization. In contrast, LS-SVM produces much better generalization, by about 7.07%, due to the regularization factor γMP introduced in the objective function.
Another issue is the time required for training. On an 800 MHz Pentium III PC with 512 MB of RAM, LS-SVM takes about 25 minutes to train on 200 data points of 8 attributes once. The Bayesian inference for the two hyper-parameters takes about 35 minutes. In other words, fifteen engine torque functions require (25+35) × 15 = 900 minutes of training time. For the MFN, an epoch takes about 0.5 minutes and each network takes 300 epochs for training. Consequently, it takes about (300 × 0.5) × 15 = 2250 minutes for the fifteen neural networks. According to this estimation, LS-SVM takes only 40% of the training time of the MFN. The major time reduction comes from doing a one-time optimization in LS-SVM as opposed to 300 epochs of training in the MFN. Even when LS-SVM is compared with standard SVM, LS-SVM requires less training time because of the elimination of 10-fold cross-validation for guessing the hyper-parameters.
6. CONCLUSIONS
The LS-SVM method is applied to produce a set of torque functions for an automotive engine according to different engine speeds. According to Eq. (5), the engine power is calculated based on the engine torque. In this research, the torque functions are separately regressed based on fifteen sets of sample data acquired from an automotive engine through a chassis dynamometer. The engine torque functions developed are very useful for vehicle fine tune-up because the effect of any trial ECU setup can be predicted to be a gain or loss before running the vehicle engine on a dynamometer or road test.
If the engine performance with a trial ECU setup is predicted to be a gain, the vehicle engine is then run on a dynamometer for verification. If the engine performance is predicted to be a loss, the dynamometer test is unnecessary and another engine setup should be made. Hence the prediction function can greatly reduce the number of expensive dynamometer tests, which saves not only the time taken for optimal tune-up, but also a large amount of expenditure on fuel, spare parts, lubricants, etc. It is also believed that the function lets the automotive engineer predict whether a new engine setup is a gain or loss during road tests, where a dynamometer is unavailable.
Moreover, experiments have been done to indicate the accuracy of the torque functions, and the results are highly satisfactory. In comparison with the traditional neural network method, the LS-SVM method performs better by about 7.07% in overall accuracy, and its training time is approximately 60% less than that of the neural networks.
From the perspective of automotive engineering, the construction of modern automotive gasoline engine power and torque functions using LS-SVM is a new attempt, and this methodology can also be applied to different kinds of vehicle engines.
REFERENCES
Bishop, C., 1995. Neural Networks for Pattern Recognition. Oxford University Press,
New York.
Brace, C., 1998. Prediction of Diesel Engine Exhaust Emission using Artificial Neural
Networks. IMechE Seminar S591, Neural Networks in Systems Design, U.K.
Cristianini, N., Shawe-Taylor, J., 2000. An Introduction to Support Vector Machines and
Other Kernel-based Learning Methods. Cambridge University Press, U.K.
Gunn, S., 1998. Support Vector Machines for Classification and Regression. ISIS Technical Report ISIS-1-98. Image Speech & Intelligent Systems Research Group, University of Southampton, May 1998, U.K.
Harrell, F., 2001. Regression Modelling Strategies with Applications to Linear Models,
Logistic Regression, and Survival Analysis. Springer-Verlag, New York.
Haykin, S., 1999. Neural Networks: A comprehensive foundation. Prentice Hall, second
ed., USA.
Liu, Z., Fei, S., 2004. Study of CNG/diesel dual fuel engine’s emissions by means of
RBF neural network. Journal of Zhejiang University SCIENCE, 5(8): 960-965.
MacKay, D., 1995. Probable Networks and Plausible Predictions – A Review of
Practical Bayesian Methods for Supervised Neural Networks. Network Computation in
Neural Systems, 6, 469-505.
Pelckmans, K., Suykens, J., Van Gestel, T., De Brabanter, J., Lukas, L., Hamers, B., De
Moor, B., and Vandewalle, J., 2003. LS-SVMlab: a MATLAB/C toolbox for Least
Squares Support Vector Machines. Available at
http://www.esat.kuleuven.ac.be/sista/lssvmlab
Pyle, D., 1999. Data Preparation for Data Mining. Morgan Kaufmann, USA.
Ryan, T., 1996. Modern Regression Methods. Wiley-Interscience, USA.
Schölkopf, B., Smola, A., 2002. Learning with Kernels: Support Vector Machines,
Regularization, Optimization, and Beyond. MIT Press, USA.
Seber, G., Wild, C., 2003. Nonlinear regression, New Edition. Wiley-Interscience, USA.
Seeger, M., 2004. Gaussian processes for machine learning. International Journal of
Neural Systems, 14(2):1-38.
Sen, A., Srivastava, M., 1990. Regression Analysis: Theory, Methods, and Applications.
Springer-Verlag, New York.
Smola, A., Burges, C., Drucker, H., Golowich, S., Van Hemmen, L., Muller, K.,
Scholkopf, B., Vapnik, V., 1996. Regression Estimation with Support Vector Learning
Machines, available at http://www.first.gmd.de/~smola
Su, S., Yan, Z., Yuan, G., Cao, Y., Zhou, C., 2002. A Method for Prediction In-Cylinder
Compound Combustion Emissions. Journal of Zhejiang University SCIENCE, 3(5):
543-548.
Suykens, J., Van Gestel, T., De Brabanter, J., De Moor, B., and Vandewalle, J., 2002. Least Squares Support Vector Machines. World Scientific, Singapore.
Tabachnick, B., Fidell, L., 2001. Using Multivariate Statistics. Allyn and Bacon, fourth
ed., USA.
Traver, M., Atkinson, R. and Atkinson, C., 1999. Neural Network-based Diesel Engine
Emissions Prediction Using In-Cylinder Combustion Pressure. SAE Paper
1999-01-1532.
Van Gestel, T., Suykens, J., De Moor, B., Vandewalle, J., 2001a. Automatic relevance
determination for least squares support vector machine classifiers, in Proc. of the
European Symposium on Artificial Neural Networks (ESANN'2001), Bruges, Belgium,
Apr. 2001: 13-18.
Van Gestel, T., Suykens, J., Lambrechts, D., Lanckriet, A., Vandaele, G., De Moor, B.,
Vandewalle, J., 2001b. Predicting Financial Time Series using Least Squares Support
Vector Machines within the Evidence Framework, IEEE Trans. On Neural Networks,
Special Issue on Financial Engineering, 12(4), 809-821.
Yan, Z., Zhou, C., Su, S., Liu, Z., Wang, X., 2003. Application of Neural Network in the Study of Combustion Rate of Natural Gas/Diesel Dual Fuel Engine. Journal of Zhejiang University SCIENCE, 4(2): 170-174.
Fig. 1 Example of fuel map in a typical ECU setup where the engine speed (RPM) is
discretely divided
Fig. 2 Example of engine output horsepower and torque curves
Fig. 3 Division of data set D into 15 subsets Dr according to various engine speeds
Fig. 4 Adjustment of engine input parameters using MoTeC M4 programmable ECU
Fig. 5 Car engine performance data acquisition on a chassis dynamometer
[Diagram: each Dr is divided 80%/20% into TRAINr and TESTr; for the MFN experiments, TRAINr is further divided 90%/10% into TRr and VALIDr.]
Fig. 6 Further division of data randomly into training sets (TRAINr) and test sets (TESTr)
Fig. 7 Inference procedure for hyper-parameters (γ, σ):
Optimize σr with respect to P(TRAINr | Mσr). For each candidate σr, calculate the optimal µMP,r, ζMP,r, and γMP,r as follows:
A. Solve the eigenvalue problem in Eq. (16), obtaining λG,i, vG,i, and Neff.
B. Find the optimal value γMP,r in Eq. (14) using a line search.
C. Given γMP,r, calculate µMP,r and ζMP,r using Eqs. (17) and (15) and the relationship γMP,r = ζMP,r/µMP,r.
D. Given γMP,r and λG,i, calculate γeff in Eq. (20).
E. Using Eq. (19), find the posterior P(TRAINr | Mσr) for σr.
F. If the current σr produces the highest posterior, return the values (γMP,r, σr). Otherwise, choose another value for σr and go back to Step A. The next choice of σr can be generated by a well-known optimization method (e.g., a line search); Steps A to E merely prepare the objective function to be optimized.
Fig. 8 Example of comparison between predicted and actual engine torque and power
Fig. 9 Architecture (layer diagram) of every MFN
     I1000  O  t1000  f  J1000  d    a   p    T1000
d1   8      0  7.1    0  385    3    25  2.8  20
d2   10     2  6.5    0  360    3    25  2.8  11
...  ...
dN   12     0  7.5    3  360    2.7  30  2.8  12
Table 1 Example of training data di in data set D1000
Torque function Mr | γMP,r | σMP,r | Mean square error with training set TRAINr | Average accuracy with test set TESTr
M1000 0.28 2.32 0.43% 91.2%
M1500 0.31 8.77 0.65% 91.1%
M2000 0.22 4.91 0.89% 90.5%
M2500 1.14 5.64 0.44% 91.2%
M3000 0.59 2.42 0.32% 91.3%
M3500 0.74 4.37 0.27% 91.6%
M4000 0.98 3.38 0.08% 92.5%
M4500 1.33 5.89 1.25% 84.2%
M5000 0.10 10.71 2.10% 81.1%
M5500 0.49 6.87 1.89% 83.2%
M6000 0.59 10.92 1.24% 88.7%
M6500 1.23 7.43 0.58% 90.0%
M7000 0.43 3.05 0.77% 91.3%
M7500 0.75 6.34 0.66% 90.5%
M8000 0.61 3.28 0.39% 90.4%
Overall 0.80% 90.32%
Table 2 Accuracy of different functions Mr and its corresponding hyper-parameter
values
Neural network NETr | Training error (mean square error) | Average accuracy with test set TESTr
NET1000 0.01% 86.1%
NET1500 0.03% 87.2%
NET2000 0.07% 87.9%
NET2500 0.25% 83.4%
NET3000 0.04% 85.5%
NET3500 0.24% 81.4%
NET4000 0.04% 86.3%
NET4500 0.08% 79.3%
NET5000 0.12% 75.2%
NET5500 0.85% 77.3%
NET6000 0.23% 82.9%
NET6500 0.45% 85.3%
NET7000 0.07% 82.4%
NET7500 0.12% 84.8%
NET8000 0.21% 83.8%
Overall 0.19% 83.25%
Table 3 Training errors and average accuracy of the fifteen neural networks