Non-Linear Models
Non-Linear Growth Models
• Many models cannot be transformed into a linear model.

The Mechanistic Growth Model
Equation: Y = α(1 − e^(−kx)) + ε
or (ignoring ε) "rate of increase in Y": dY/dx = k(α − Y)
The Logistic Growth Model
Equation: Y = α / (1 + βe^(−kx)) + ε
or (ignoring ε) "rate of increase in Y": dY/dx = (k/α) Y(α − Y)

[Figure: Logistic Growth Model, y plotted against x (x from 0 to 10, y from 0.0 to 1.0) for k = 1/4, 1/2, 1, 2, 4]
The Gompertz Growth Model
Equation: Y = αe^(−βe^(−kx)) + ε
or (ignoring ε) "rate of increase in Y": dY/dx = kY ln(α/Y)

[Figure: Gompertz Growth Model, y plotted against x (x from 0 to 10, y from 0.0 to 1.0) for k = 1]
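As a quick sketch, the three growth curves above can be evaluated numerically. The parameter values below are illustrative choices, not taken from the slides:

```python
import numpy as np

# Mechanistic growth: Y = alpha*(1 - exp(-k*x)); dY/dx = k*(alpha - Y)
def mechanistic(x, alpha, k):
    return alpha * (1.0 - np.exp(-k * x))

# Logistic growth: Y = alpha / (1 + beta*exp(-k*x)); dY/dx = (k/alpha)*Y*(alpha - Y)
def logistic(x, alpha, beta, k):
    return alpha / (1.0 + beta * np.exp(-k * x))

# Gompertz growth: Y = alpha*exp(-beta*exp(-k*x)); dY/dx = k*Y*ln(alpha/Y)
def gompertz(x, alpha, beta, k):
    return alpha * np.exp(-beta * np.exp(-k * x))

# All three rise toward the upper asymptote alpha as x grows
x = np.linspace(0.0, 10.0, 5)
print(mechanistic(x, 1.0, 1.0))
print(logistic(x, 1.0, 1.0, 1.0))
print(gompertz(x, 1.0, 1.0, 1.0))
```

All three curves share the upper asymptote α; they differ in the shape of the approach (the logistic and Gompertz curves are S-shaped, the mechanistic curve is not).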
Non-Linear Regression
Introduction
Previously we have fitted, by least squares, the general linear model, which was of the type:
Y = β0 + β1X1 + β2X2 + ... + βpXp + ε
• The above equation can represent a wide variety of relationships.
• There are many situations in which a model of this form is not appropriate and is too simple to represent the true relationship between the dependent (or response) variable Y and the independent (or predictor) variables X1, X2, ..., Xp.
• When we are led to a model of nonlinear form, we would usually prefer to fit such a model whenever possible, rather than fit an alternative, perhaps less realistic, linear model.
• Any model which is not of the form given above will be called a nonlinear model.
This model will generally be of the form:
Y = f(X1, X2, ..., Xp | θ1, θ2, ..., θq) + ε    (*)
where the function (expression) f is known except for the q unknown parameters θ1, θ2, ..., θq.
Least Squares in the Nonlinear Case
• Suppose that we have collected data on Y: (y1, y2, ..., yn),
• corresponding to n sets of values of the independent variables X1, X2, ..., Xp:
– (x11, x21, ..., xp1),
– (x12, x22, ..., xp2),
– ..., and
– (x1n, x2n, ..., xpn).
• For a set of possible values θ1, θ2, ..., θq of the parameters, a measure of how well these values fit the model described in equation * above is the residual sum of squares function

S(θ1, θ2, ..., θq) = Σ_{i=1}^n (yi − ŷi)² = Σ_{i=1}^n [yi − f(x1i, x2i, ..., xpi | θ1, θ2, ..., θq)]²

• where ŷi = f(x1i, x2i, ..., xpi | θ1, θ2, ..., θq) is the predicted value of the response variable yi from the values of the p independent variables x1i, x2i, ..., xpi, using the model in equation * and the values of the parameters θ1, θ2, ..., θq.
• The least squares estimates of θ1, θ2, ..., θq are the values θ̂1, θ̂2, ..., θ̂q which minimize S(θ1, θ2, ..., θq).
• It can be shown that if the error terms are independent, normally distributed with mean 0 and common variance σ², then the least squares estimates are also the maximum likelihood estimates of θ1, θ2, ..., θq.
• To find the least squares estimates we need to determine where all the derivatives of S(θ1, θ2, ..., θq) with respect to each parameter θ1, θ2, ..., θq are equal to zero.
• This quite often leads to a set of equations in θ1, θ2, ..., θq that are difficult to solve, even with one parameter and a comparatively simple nonlinear model.
• When more parameters are involved and the model is more complicated, the solution of the normal equations can be extremely difficult to obtain, and iterative methods must be employed.
• To compound the difficulties, it may happen that multiple solutions exist, corresponding to multiple stationary values of the function S(θ1, θ2, ..., θq).
• When the model is linear, these equations form a set of linear equations in θ1, θ2, ..., θq which can be solved for the least squares estimates θ̂1, θ̂2, ..., θ̂q.
• In addition the sum of squares function, S(θ1, θ2, ..., θq), is a quadratic function of the parameters and is constant on ellipsoidal surfaces centered at the least squares estimates θ̂1, θ̂2, ..., θ̂q.
Example
We have collected data on two variables (Y, X)
The model we will consider is
Y = θ1e^(−θ2X) + ε
The data:
(x1, y1), (x2, y2), (x3, y3), ..., (xn, yn)
The predicted value of yi using the model is:
ŷi = θ1e^(−θ2xi)
The Residual Sum of Squares is:

S(θ1, θ2) = Σ_{i=1}^n (yi − ŷi)² = Σ_{i=1}^n (yi − θ1e^(−θ2xi))²

The least squares estimates θ̂1, θ̂2 of θ1 and θ2 satisfy

S(θ̂1, θ̂2) = min over θ1, θ2 of S(θ1, θ2)
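As a sketch of how a fit of this kind could be carried out numerically: the synthetic data, the "true" values (2.0, 0.8), and scipy's least_squares routine below are all our own illustrative choices, not part of the original example.

```python
import numpy as np
from scipy.optimize import least_squares

# Synthetic data from the model Y = theta1*exp(-theta2*X) + error
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 30)
theta_true = np.array([2.0, 0.8])
y = theta_true[0] * np.exp(-theta_true[1] * x) + 0.05 * rng.standard_normal(x.size)

# Residuals y_i - yhat_i; least_squares minimizes their sum of squares S(theta1, theta2)
def resid(theta):
    return y - theta[0] * np.exp(-theta[1] * x)

fit = least_squares(resid, x0=[1.0, 1.0])
theta_hat = fit.x   # least squares estimates, close to theta_true
```

The routine does internally what the next section describes: it iterates from the starting point x0 toward the minimizer of S(θ1, θ2).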
Techniques for Estimating the Parameters of a Nonlinear System
• In some nonlinear problems it is convenient to determine equations (the normal equations) for the least squares estimates θ̂1, θ̂2, ..., θ̂q,
• the values that minimize the sum of squares function, S(θ1, θ2, ..., θq).
• These equations are nonlinear and it is usually necessary to develop an iterative technique for solving them.
• In addition to this approach there are several currently employed methods available for obtaining the parameter estimates by a routine computer calculation.
• We shall mention three of these:
– 1) Steepest descent,
– 2) Linearization, and
– 3) Marquardt's procedure.
• In each case an iterative procedure is used to find the least squares estimates θ̂1, θ̂2, ..., θ̂q.
• That is, initial estimates θ1^(0), θ2^(0), ..., θq^(0) for these values are determined.
• The procedure then finds successively better estimates, θ1^(i), θ2^(i), ..., θq^(i), that hopefully converge to the least squares estimates θ̂1, θ̂2, ..., θ̂q.
Steepest Descent
• The steepest descent method focuses on determining the values of θ1, θ2, ..., θq that minimize the sum of squares function, S(θ1, θ2, ..., θq).
• The basic idea is to determine, from an initial point θ1^(0), θ2^(0), ..., θq^(0) and the tangent plane to S(θ1, θ2, ..., θq) at this point, the vector along which the function S(θ1, θ2, ..., θq) will be decreasing at the fastest rate.
• The method of steepest descent then moves from this initial point along the direction of steepest descent until the value of S(θ1, θ2, ..., θq) stops decreasing.
• It uses this point, θ1^(1), θ2^(1), ..., θq^(1), as the next approximation to the value that minimizes S(θ1, θ2, ..., θq).
• The procedure then continues until the successive approximations arrive at a point where the sum of squares function, S(θ1, θ2, ..., θq), is minimized.
• At that point, the tangent plane to S(θ1, θ2, ..., θq) will be horizontal and there will be no direction of steepest descent.
• It should be pointed out that steepest descent is a technique of great value in experimental work for finding stationary values of response surfaces.
• While, theoretically, the steepest descent method will converge, it may do so in practice with agonizing slowness after some rapid initial progress.
• Slow convergence is particularly likely when the S(θ1, θ2, ..., θq) contours are attenuated and banana-shaped (as they often are in practice), and it happens when the path of steepest descent zigzags slowly up a narrow ridge, each iteration bringing only a slight reduction in S(θ1, θ2, ..., θq).
• A further disadvantage of the steepest descent method is that it is not scale invariant.
• The indicated direction of movement changes if the scales of the variables are changed, unless all are changed by the same factor.
• The steepest descent method is, on the whole, slightly less favored than the linearization method (described later) but will work satisfactorily for many nonlinear problems, especially if modifications are made to the basic technique.
[Figure: Steepest Descent, showing the contours of S and the steepest descent path from the initial guess]
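The steps above can be sketched in code. This is a minimal illustration on the earlier two-parameter exponential model, with noise-free data of our own choosing; the step-halving line search stands in for "move until S stops decreasing":

```python
import numpy as np

# Model yhat = theta1*exp(-theta2*x), fitted to noise-free data with theta = (2.0, 0.8)
x = np.linspace(0.0, 5.0, 30)
y = 2.0 * np.exp(-0.8 * x)

def S(t):
    # sum of squares S(theta1, theta2)
    r = y - t[0] * np.exp(-t[1] * x)
    return r @ r

def grad(t):
    # analytic gradient of S with respect to (theta1, theta2)
    e = np.exp(-t[1] * x)
    r = y - t[0] * e
    return np.array([-2.0 * (r @ e), 2.0 * t[0] * (r @ (x * e))])

t = np.array([1.0, 1.0])            # initial guess theta^(0)
for _ in range(500):
    g = grad(t)
    step = 1.0
    # move along -g, shrinking the step until S actually decreases
    while step > 1e-12 and S(t - step * g) >= S(t):
        step /= 2.0
    t = t - step * g
```

Running this shows the behavior the text describes: rapid early progress, then slow zigzagging improvement along the narrow valley of S.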
Linearization
• The linearization (or Taylor series) method uses the results of linear least squares in a succession of stages.
• Suppose the postulated model is of the form:
Y = f(X1, X2, ..., Xp | θ1, θ2, ..., θq) + ε
• Let θ1^(0), θ2^(0), ..., θq^(0) be initial values for the parameters θ1, θ2, ..., θq.
• These initial values may be intelligent guesses or preliminary estimates based on whatever information is available.
• These initial values will, hopefully, be improved upon in the successive iterations to be described below.
• The linearization method approximates f(X1, X2, ..., Xp | θ1, θ2, ..., θq) with a linear function of θ1, θ2, ..., θq, using a Taylor series expansion of f(X1, X2, ..., Xp | θ1, θ2, ..., θq) about the point θ1^(0), θ2^(0), ..., θq^(0) and curtailing the expansion at the first derivatives.
• The method then uses the results of linear least squares to find values θ1^(1), θ2^(1), ..., θq^(1) that provide the least squares fit of this linear function to the data.
• The procedure is then repeated until the successive approximations converge, hopefully, to the least squares estimates θ̂1, θ̂2, ..., θ̂q.
[Figure: Linearization, showing the contours of the RSS for the linear approximation, with the initial guess followed by the 2nd, 3rd, and 4th guesses converging toward the minimum]
The linearization procedure has the following possible drawbacks:
1. It may converge very slowly; that is, a very large number of iterations may be required before the solution stabilizes, even though the sum of squares S(θ1, θ2, ..., θq) may decrease consistently as the number of iterations increases. This sort of behavior is not common but can occur.
2. It may oscillate widely, continually reversing direction, and often increasing, as well as decreasing the sum of squares. Nevertheless the solution may stabilize eventually.
3. It may not converge at all, and even diverge, so that the sum of squares increases iteration after iteration without bound.
• The linearization method is, in general, a useful one and will successfully solve many nonlinear problems.
• Where it does not, consideration should be given to reparameterization of the model or to the use of Marquardt's procedure.
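The linearization iteration described above (often called Gauss-Newton) can be sketched as follows. The example model and noise-free data are our own choices, as is the step-halving safeguard against the oscillation and divergence noted in drawbacks 2 and 3:

```python
import numpy as np

# Model yhat = theta1*exp(-theta2*x); noise-free data with theta = (2.0, 0.8)
x = np.linspace(0.0, 5.0, 30)
y = 2.0 * np.exp(-0.8 * x)

def S(t):
    r = y - t[0] * np.exp(-t[1] * x)
    return r @ r

t = np.array([1.0, 1.0])                        # initial values theta^(0)
for _ in range(50):
    e = np.exp(-t[1] * x)
    # first-derivative (Taylor) terms: d(yhat)/d(theta1) and d(yhat)/d(theta2)
    J = np.column_stack([e, -t[0] * x * e])
    r = y - t[0] * e
    # linear least squares fit of the linearized model gives the update
    delta = np.linalg.lstsq(J, r, rcond=None)[0]
    # halve the step if the sum of squares would increase (a common safeguard)
    while S(t + delta) > S(t) and np.linalg.norm(delta) > 1e-12:
        delta /= 2.0
    t = t + delta
```

On this benign problem the iteration converges to the least squares estimates in a handful of steps, far faster than steepest descent.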
Example
Non-Linear Regression
In this example we are measuring the amount of a compound in the soil:
1. 7 days after application
2. 14 days after application
3. 21 days after application
4. 28 days after application
5. 42 days after application
6. 56 days after application
7. 70 days after application
8. 84 days after application
This is carried out at two test plot locations
1. Craik
2. Tilson
6 measurements per location are made each time
The data
days Craik Tilson   days Craik Tilson   days Craik Tilson
7    49.7  53.4     28   29.4  37.2     70    9.7  27.0
7    40.4  52.7     28   21.7  42.4     70   18.6  30.0
7    43.2  52.7     28   24.3  38.2     70    6.2  24.8
7    44.6  59.1     28   26.5  35.2     70   11.4  25.8
7    50.4  56.6     28   20.8  35.1     70   20.1  20.9
7    44.1  59.5     28   26.9  39.2     70   12.0  31.7
14   37.3  43.8     42   25.3  33.7     84   13.9  23.6
14   34.9  39.8     42   26.4  37.7     84   19.6  29.5
14   35.8  55.7     42   27.2  38.2     84    1.4  29.9
14   32.7  44.8     42   24.7  36.3     84   17.9  25.8
14   32.5  46.9     42    8.9  41.7     84   26.4  18.2
14   41.7  49.7     42   22.8  33.3     84   27.2  27.7
21   30.2  46.9     56   30.2  30.1
21   33.6  40.9     56   23.8  36.3
21   27.7  40.1     56   23.6  28.0
21   32.2  40.4     56   21.0  27.7
21   30.2  38.0     56   23.6  30.1
21   34.4  41.3     56   26.0  25.8
[Figure: Graph of amount vs. days (0 to 100) for the Craik and Tilson plots; vertical axis 0.0 to 70.0]
The Model: Exponential decay with nonzero asymptote

y = ae^(−kx) + c

[Figure: the curve y = ae^(−kx) + c for x from 0 to 100 (y from 0 to 70), annotated with the asymptote c, the initial drop a above the asymptote, the rate relation dy/dx = −k(y − c), and the half-life x.50 = ln 2 / k]
[Figure: the Craik and Tilson data with the fitted curves (fitted Craik, fitted Tilson) overlaid; amount vs. days]

Some starting values of the parameters, found by trial and error in Excel:

     Craik  Tilson
a    50     45
c    10     20
k    0.04   0.03
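The same fit can be sketched in Python. Here scipy's curve_fit stands in for the SPSS routine (our own substitution); the data are the Craik column from the table above, and the starting values are the trial-and-error ones:

```python
import numpy as np
from scipy.optimize import curve_fit

# Craik data: 6 measurements at each of 8 sampling times
days = np.repeat([7, 14, 21, 28, 42, 56, 70, 84], 6)
craik = np.array([
    49.7, 40.4, 43.2, 44.6, 50.4, 44.1,   # day 7
    37.3, 34.9, 35.8, 32.7, 32.5, 41.7,   # day 14
    30.2, 33.6, 27.7, 32.2, 30.2, 34.4,   # day 21
    29.4, 21.7, 24.3, 26.5, 20.8, 26.9,   # day 28
    25.3, 26.4, 27.2, 24.7,  8.9, 22.8,   # day 42
    30.2, 23.8, 23.6, 21.0, 23.6, 26.0,   # day 56
     9.7, 18.6,  6.2, 11.4, 20.1, 12.0,   # day 70
    13.9, 19.6,  1.4, 17.9, 26.4, 27.2,   # day 84
])

# Exponential decay with nonzero asymptote
def model(x, a, c, k):
    return a * np.exp(-k * x) + c

# start from the trial-and-error values a=50, c=10, k=0.04
popt, pcov = curve_fit(model, days, craik, p0=[50.0, 10.0, 0.04])
rss = np.sum((craik - model(days, *popt)) ** 2)
```

The estimates should land near the SPSS results shown below (A ≈ 39.40, C ≈ 16.60, K ≈ 0.0481, residual SS ≈ 1455).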
Non-linear least squares iteration by SPSS (Craik):

Iteration  Residual SS   A           C           K
1          1993.670464   50.0000000  10.0000000  .040000000
1.1        1467.177752   38.7647951  16.5707533  .044923587
2          1467.177752   38.7647951  16.5707533  .044923587
2.1        1455.069561   39.2335353  16.5430270  .047536045
3          1455.069561   39.2335353  16.5430270  .047536045
3.1        1454.990580   39.3731019  16.5866215  .048017245
4          1454.990580   39.3731019  16.5866215  .048017245
4.1        1454.988775   39.3918983  16.5975879  .048099100
5          1454.988775   39.3918983  16.5975879  .048099100
5.1        1454.988724   39.3948934  16.5995306  .048112911
6          1454.988724   39.3948934  16.5995306  .048112911
6.1        1454.988722   39.3953930  16.5998606  .048115236
ANOVA Table (Craik)
Source             DF  Sum of Squares  Mean Square
Regression          3     38959.23151  12986.41050
Residual           45      1454.98872     32.33308
Uncorrected Total  48     40414.22023
(Corrected Total)  47      5573.52411
R squared = 1 − Residual SS / Corrected SS = .73895
Parameter Estimates (Craik)
                                                  Asymptotic 95% Confidence Interval
Parameter  Estimate      Asymptotic Std. Error    Lower         Upper
A          39.395392982  4.322521115              30.689388555  48.101397409
C          16.599860620  2.090542197              12.389292496  20.810428745
K           .048115236    .011889354               .024168847    .072061625
Testing Hypotheses: similar to linear regression

F = [(SSE_Reduced − SSE_Complete) / q1] / MSE_Complete

where q1 is the number of parameters set by the null hypothesis.

Caution: This statistic has only an approximate F-distribution when the sample size is large.
Example: Suppose we want to test
H0: c = 0 against HA: c ≠ 0

Complete model: y = ae^(−kx) + c
Reduced model:  y = ae^(−kx)

ANOVA Table (Complete model)
Source             DF  Sum of Squares  Mean Square
Regression          3     38959.23151  12986.41050
Residual           45      1454.98872     32.33308
Uncorrected Total  48     40414.22023
(Corrected Total)  47      5573.52411
R squared = 1 − Residual SS / Corrected SS = .73895

ANOVA Table (Reduced model)
Source             DF  Sum of Squares  Mean Square
Regression          2     38667.23245  19333.61623
Residual           46      1746.98778     37.97800
Uncorrected Total  48     40414.22023
(Corrected Total)  47      5573.52411
R squared = 1 − Residual SS / Corrected SS = .68656
The Test

F = [(SSE_Reduced − SSE_Complete) / q1] / MSE_Complete
  = [(1746.98778 − 1454.98872) / 1] / 32.33308
  = 9.031

df1 = 1, df2 = 45: p = 0.004328
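This F computation can be checked numerically; scipy is our own choice of tool here:

```python
from scipy.stats import f

sse_reduced, sse_complete = 1746.98778, 1454.98872
mse_complete, q1 = 32.33308, 1

# F = [(SSE_Reduced - SSE_Complete) / q1] / MSE_Complete
F = (sse_reduced - sse_complete) / q1 / mse_complete
p = f.sf(F, 1, 45)   # upper-tail probability with df1 = 1, df2 = 45
print(F, p)          # F close to 9.031, p close to 0.0043
```

Since p is well below 0.05, we reject H0: c = 0; the nonzero asymptote belongs in the model.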
Use of Dummy Variables
Non Linear Regression
The Model:

y = a1e^(−k1·x) + c1   (Craik)
y = a2e^(−k2·x) + c2   (Tilson)

or

y = (a + Δa·D)e^(−(k + Δk·D)x) + (c + Δc·D)

where D = 1 for Craik and D = 0 for Tilson,
and a = a2, Δa = a1 − a2;  k = k2, Δk = k1 − k2;  c = c2, Δc = c1 − c2.
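The combined dummy-variable model can be written as a single function for fitting. The tuple-of-arrays calling convention below is our own choice of how to pass (x, D) to a fitting routine:

```python
import numpy as np

# y = (a + da*D) * exp(-(k + dk*D) * x) + (c + dc*D), with D = 1 for Craik, 0 for Tilson
def model(X, a, c, k, da, dc, dk):
    x, D = X
    return (a + da * D) * np.exp(-(k + dk * D) * x) + (c + dc * D)

# D = 0 recovers the Tilson curve, D = 1 the Craik curve
x = np.array([0.0, 10.0])
tilson = model((x, np.zeros_like(x)), 45.0, 20.0, 0.03, 5.0, -10.0, 0.01)
craik = model((x, np.ones_like(x)), 45.0, 20.0, 0.03, 5.0, -10.0, 0.01)
```

With scipy one could then fit all six parameters at once, e.g. curve_fit(model, (days, D), amt, p0=[45, 20, 0.03, 5, -10, 0.01]), using the same starting values as the SPSS run below.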
The data file (excerpt):

days  Amt   Plot  D
7     49.7  1     1
7     40.4  1     1
7     43.2  1     1
7     44.6  1     1
7     50.4  1     1
7     44.1  1     1
14    37.3  1     1
14    34.9  1     1
14    35.8  1     1
14    32.7  1     1
14    32.5  1     1
14    41.7  1     1
...
56    27.7  2     0
56    30.1  2     0
56    25.8  2     0
70    27.0  2     0
70    30.0  2     0
70    24.8  2     0
70    25.8  2     0
70    20.9  2     0
70    31.7  2     0
84    23.6  2     0
84    29.5  2     0
84    29.9  2     0
84    25.8  2     0
84    18.2  2     0
84    27.7  2     0
Non-linear least squares iteration by SPSS:

Iteration  Residual SS   A           C           K           DA          DC          DK
1          2814.326009   45.0000000  20.0000000  .030000000  5.00000000  -10.000000  .010000000
1.1        2158.807195   38.2450038  23.9878780  .032927242  .519790880  -7.4171248  .011996344
2          2158.807195   38.2450038  23.9878780  .032927242  .519790880  -7.4171248  .011996344
2.1        2143.323187   38.4247069  24.0137675  .033967225  .808811330  -7.4707428  .013568788
3          2143.323187   38.4247069  24.0137675  .033967225  .808811330  -7.4707428  .013568788
3.1        2143.230932   38.4455241  24.0566179  .034143651  .927567736  -7.4699980  .013873572
4          2143.230932   38.4455241  24.0566179  .034143651  .927567736  -7.4699980  .013873572
4.1        2143.228790   38.4472426  24.0652038  .034173705  .944657860  -7.4676160  .013925396
5          2143.228790   38.4472426  24.0652038  .034173705  .944657860  -7.4676160  .013925396
5.1        2143.228730   38.4474868  24.0667016  .034178821  .947409756  -7.4671709  .013934093
6          2143.228730   38.4474868  24.0667016  .034178821  .947409756  -7.4671709  .013934093
6.1        2143.228728   38.4475303  24.0669567  .034179693  .947860559  -7.4670964  .013935539

Run stopped after 12 model evaluations and 6 derivative evaluations.
Iterations have been stopped because the relative reduction between successive residual sums of squares is at most SSCON = 1.000E-08.
ANOVA Table
Nonlinear Regression Summary Statistics, Dependent Variable AMT
Source             DF  Sum of Squares  Mean Square
Regression          6    111107.59858  18517.93310
Residual           90      2143.22873     23.81365
Uncorrected Total  96    113250.82731
(Corrected Total)  95     13329.99590
R squared = 1 − Residual SS / Corrected SS = .83922

Parameter Estimates
                                                   Asymptotic 95% Confidence Interval
Parameter  Estimate       Asymptotic Std. Error    Lower          Upper
A          38.447530261   2.872214696              32.741374449   44.153686073
C          24.066956710   2.831618146              18.441453031   29.692460390
K           .034179693     .008677372               .016940579     .051418807
DA          .947860559    4.691561308              -8.372744848   10.268465967
DC         -7.467096366   3.352145804             -14.12671909     -.807473641
DK          .013935539     .013394312              -.012674600     .040545677
Testing Hypotheses:

F = [(SSE_Reduced − SSE_Complete) / q1] / MSE_Complete

Suppose we want to test
H0: Δa = a1 − a2 = 0 and Δk = k1 − k2 = 0
The Reduced Model:

y = ae^(−kx) + c1   (Craik)
y = ae^(−kx) + c2   (Tilson)

or

y = ae^(−kx) + (c + Δc·D)
ANOVA Table
Nonlinear Regression Summary Statistics, Dependent Variable AMT
Source             DF  Sum of Squares  Mean Square
Regression          4    111084.31069  27771.07767
Residual           92      2166.51662     23.54909
Uncorrected Total  96    113250.82731
(Corrected Total)  95     13329.99590
R squared = 1 − Residual SS / Corrected SS = .83747

Parameter Estimates
                                                   Asymptotic 95% Confidence Interval
Parameter  Estimate       Asymptotic Std. Error    Lower          Upper
A          38.551192997   2.255883543              34.070813560   43.031572434
C          25.878556272   1.619720431              22.661651687   29.095460858
K           .040862515     .006593482               .027767290     .053957740
DC        -10.64103918     .990561565             -12.60837995    -8.673698408
The F Test

F = [(SSE_Reduced − SSE_Complete) / q1] / MSE_Complete
  = [(2166.51662 − 2143.22873) / 2] / 23.81365
  = 0.4890

df1 = 2, df2 = 90: p = 0.615
Thus we accept the null hypothesis: the reduced model, with a common a and k for the two locations, is adequate.