Calibrating a Thermocouple, by Jean Giraud
Thermocouples measure temperature indirectly by measuring the voltage produced across dissimilar metals when there is a temperature gradient present at the junction. Converting any measured voltage to a temperature requires calibration of the thermocouple. Typically, thermocouples are calibrated using standard reference tables of coefficients published by NIST (http://srdata.nist.gov/its90/main/its90_main_page.html), which fit a piecewise polynomial curve to the standard ITS-90 reference temperatures (http://srdata.nist.gov/its90/tables/table_iii.html).
There are several problems with the temperature conversions obtained in this manner. First, the curve describing the voltage-to-temperature data is highly nonlinear and is, perhaps, better fit with a more complicated fit function. The calibrations published by NIST are piecewise for this reason, so you must know which temperature range you will measure before you can choose a set of calibration parameters. Next, the NIST data only represents the calibration of the thermocouple wires themselves, not any extended cabling, thermal gradient effects, time-dependent degradation, or other sources of error.
We will use a rational function to fit the calibration data. The fit parameters will be stored so that subsequent measurements at non-fixed-point temperatures can be calculated whenever new measurements are taken.
It is unlikely that all the calibration points will be measured, but you can only fit as many parameters as there are data points. For demonstration purposes, use the calibrated values from NIST.
Values at the fixed points (type 'T' thermocouple, from NBS Monograph 125):

Fixed point           °C          mV
Helium NBP          -268.935   -6.25629
Hydrogen TP         -259.340   -6.22919
Hydrogen NBP        -252.870   -6.19773
Neon TP             -248.595   -6.17138
Neon NBP            -246.048   -6.15358
Oxygen TP           -218.789   -5.87302
Nitrogen TP         -210.002   -5.75328
Nitrogen NBP        -195.802   -5.53559
Oxygen NBP          -182.962   -5.31472
Carbon Dioxide SP    -78.476   -2.74070
Mercury FP           -38.862   -1.43494
Ice Point              0.00     0.00
Ether TP              26.87     1.0679
Water BP             100.00     4.2773
Benzoic TP           122.37     5.3414
Indium FP            156.634    7.0364
Tin FP               231.9681  11.0133
Bismuth FP           271.442   13.2188
Cadmium FP           321.108   16.0953
Lead FP              327.502   16.4733
Mercury BP           356.66    18.2179

ET := the 21 × 2 matrix whose first column holds the voltages (mV) and whose second column holds the temperatures (°C) listed above.
i := 0 .. rows(ET) − 1
Divide the matrix into X and Y vectors.
X := ET⟨0⟩    Y := ET⟨1⟩
[Plot: temperature Y (°C) versus voltage X (mV)]
Regression
Here, rationalfit is used to fit the data.
top_order := 7    bottom_order := 6
These would need to be reduced if there were fewer data points available.
cfit := rationalfit(X, Y, 0.95, top_order, bottom_order, 10⁻⁵, "noscale")
Poly(x, c, On, Od) := [ Σ_{n = 0..On} (cₙ·xⁿ) ] / [ 1 + Σ_{n = 1..Od} (c_{n+On}·xⁿ) ]
yfitᵢ := Poly(Xᵢ, cfit⟨0⟩, top_order, bottom_order)
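rationalfit is a function of the Mathcad Data Analysis Extension Pack. As a rough analogue outside Mathcad, the same kind of fit can be sketched with SciPy; the snippet below is only an illustration (not the worksheet's algorithm), and it fits a deliberately small quadratic-over-linear rational function to a handful of the fixed-point pairs from the table above.

```python
import numpy as np
from scipy.optimize import curve_fit

# A few of the fixed-point pairs from the table above (voltage in mV, temperature in deg C).
mv   = np.array([-6.25629, -5.75328, -2.74070, 0.0, 4.2773, 11.0133, 18.2179])
degc = np.array([-268.935, -210.002, -78.476, 0.0, 100.0, 231.9681, 356.66])

def rational(v, a0, a1, a2, b1):
    """Quadratic-over-linear rational function: (a0 + a1*v + a2*v**2) / (1 + b1*v)."""
    return (a0 + a1 * v + a2 * v**2) / (1.0 + b1 * v)

# Start from a plain quadratic (denominator = 1) as the initial guess.
p0 = [*np.polyfit(mv, degc, 2)[::-1], 0.0]
popt, _ = curve_fit(rational, mv, degc, p0=p0)

print("fit coefficients:", popt)
print("max |residual| (deg C):", np.max(np.abs(rational(mv, *popt) - degc)))
```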
Look at the final fit. This seems to have good convergence at first glance.
[Plot: Y and yfit versus X]

[Plot: residuals Y − yfit versus X]
Use the stored fitted parameters for subsequent calculations.
K := (0.14298647377   25.89470606652   4.75419821808   −0.41190837679   −0.08307537523   1.33815495105·10⁻³   2.46943457583·10⁻⁴   0.21474459551   −0.01184922606   −3.88461364789·10⁻³   −1.01779135807·10⁻⁵   1.36789923053·10⁻⁵   5.82209744522·10⁻⁸)ᵀ

cfit⟨0⟩ = (0.06053529506   25.87152590017   2.90755340593   −1.58842445974   −0.26452928703   1.93318016539·10⁻³   1.58775766658·10⁻³   3.30408223193·10⁻⁵   0.1426986327   −0.05952407034   −0.01212247532   −1.38551568319·10⁻⁴   7.14362618209·10⁻⁵   2.51908911013·10⁻⁶)ᵀ
Define a temperature conversion function:

temp(mV) := Poly(mV, cfit⟨0⟩, top_order, bottom_order)

Input the measured voltage here:   point := 3

Calculate the measured temperature given a measured voltage (in mV) here. We've used one of the fixed point voltages, to compare results, but any temperature in the range of the calibration is acceptable.

measured := ET_{point,0}    measured = −6.1714

temp(measured) = −248.5921    Actual value:  ET_{point,1} = −248.595

Let's compare this with the polynomial expression. For a type T thermocouple, the coefficients from the table are

NIST := (0.100860910   25727.94369   −767345.8295   78025595.81   −9247486589   6.97688·10¹¹   −2.66192·10¹³   3.94078·10¹⁴)ᵀ

T(V) := NIST₀ + NIST₁·V + NIST₂·V² + NIST₃·V³ + NIST₄·V⁴ + NIST₅·V⁵ + NIST₆·V⁶ + NIST₇·V⁷

where V is in units of volts.

NISTY := (T(X/10³))→

Calculate the residuals for the NIST fit and compare them with the new calibrated fit.

T(measured/10³) = −227.5048

Comparison of rational fit χ² error with the polynomial fit parameters:

Σ[ ((Y − temp(X))→)² ] = 0.0135        Σ[ (Y − NISTY)² ] = 3647.8134
Improving the fit further
We can further improve and stabilize this fit by transforming the data. The closer the trend is to a straight line, the faster and better converged the rational polynomial will be. We should also be able to use a smaller number of fit coefficients, which gives more stable results from the solvers and allows us to measure fewer data points for calibration. We can achieve this with the current data set by transforming the y variable to be y/x. To accomplish this, we'll strip out the (0,0) point from the data set.
ET1 := the fixed-point data of ET with the (0,0) ice-point row removed (20 rows; voltages in mV in column 0, temperatures in °C in column 1)

i := 0 .. rows(ET1) − 1

X1 := ET1⟨0⟩    Y1 := (ET1⟨1⟩ / ET1⟨0⟩)→
top_order := 6    bottom_order := 6

[Plot: transformed data Y1 versus X1]
Finally, we'll use genfit to do the fit, so we can use analytical derivatives, which will give a few more places of accuracy than the numerical derivatives. To do this, we'll need to construct the vector of analytical derivatives required by genfit.
f'Poly(x, c, k) :=
    BottomSum ← 1 + Σ_{n = 1..bottom_order} (c_{n+top_order}·xⁿ)
    if k > top_order
        return −x^(k − top_order) · [ Σ_{n = 0..top_order} (cₙ·xⁿ) ] / BottomSum²
    return x^k / BottomSum
GenfitFunctionMatrix(x, C) :=
    A₀ ← Poly(x, C, top_order, bottom_order)
    for i ∈ 0 .. top_order + bottom_order
        A_{i+1} ← f'Poly(x, C, i)
    return A
To construct a guess value, calculate a set of polynomial fit coefficients to the data, which is the same as a numerator in the rational polynomial with a denominator of 1 (order 0).
j := 0 .. top_order    cguess_{top_order + bottom_order} := 0

M_{i,j} := (X1ᵢ)^j    q := (Mᵀ·M)⁻¹·Mᵀ·Y1    cguessⱼ := qⱼ
Calculate the rational fit with genfit.
cfit := genfit(X1, Y1, cguess, GenfitFunctionMatrix)

y1ᵢ := Poly(X1ᵢ, cfit, top_order, bottom_order)
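The same workflow (a fit driven by hand-written analytical derivatives, as genfit does here) can be sketched outside Mathcad by handing SciPy's least_squares an explicit Jacobian. The snippet below is an illustration under assumed, deliberately small orders (2/2) and uses a subset of the fixed-point pairs transformed to y = T/V, as in the worksheet.

```python
# Sketch: rational-function fit of y = T/V with an analytical Jacobian,
# playing the role of genfit's vector of derivative expressions.
import numpy as np
from scipy.optimize import least_squares

mv   = np.array([-6.25629, -5.75328, -2.74070, -1.43494, 1.0679, 4.2773, 11.0133, 18.2179])
degc = np.array([-268.935, -210.002, -78.476, -38.862, 26.87, 100.0, 231.9681, 356.66])
y = degc / mv                      # transformed, nearly linear quantity

M, N = 2, 2                        # numerator / denominator orders (small: only 8 points)

def split(c):
    return c[:M + 1], c[M + 1:]

def model(c, x):
    a, b = split(c)
    num = sum(ak * x**k for k, ak in enumerate(a))
    den = 1.0 + sum(bk * x**(k + 1) for k, bk in enumerate(b))
    return num / den

def jac(c, x=mv):
    a, b = split(c)
    num = sum(ak * x**k for k, ak in enumerate(a))
    den = 1.0 + sum(bk * x**(k + 1) for k, bk in enumerate(b))
    cols = [x**k / den for k in range(M + 1)]                 # d(model)/d(a_k)
    cols += [-num * x**(k + 1) / den**2 for k in range(N)]    # d(model)/d(b_k)
    return np.column_stack(cols)

def resid(c, x=mv, t=y):
    return model(c, x) - t

c0 = np.zeros(M + 1 + N)
c0[:M + 1] = np.polyfit(mv, y, M)[::-1]      # polynomial guess, denominator = 1
fit = least_squares(resid, c0, jac=jac)
print(fit.x, np.abs(resid(fit.x)).max())
```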
[Plot: Y1 and y1 versus X1]

[Plot: residuals Y1 − y1 versus X1, on the "reduced" data set]

Σ[ (Y1 − y1)² ] = 2.0373 × 10⁻⁶    corr(Y1, y1) = 1.0000000
Go back and look at this fit with respect to the original, untransformed data, and its residuals, by multiplying by the x values.
fitrs(x) := Poly(x, cfit, top_order, bottom_order)

x := −6.26, −6.25 .. 20

[Plot: (X1·y1)→, Y, and x·fitrs(x) versus X1, X, and x]

residual := Y − (X·fitrs(X))→
[Plot: residuals on the original data set]

Transforming back to the original data:

Σ[ residual² ] = 0.0001
The χ² from above was Σ[ ((Y − temp(X))→)² ] = 0.0135.
One extra operation, that of transforming the data, has improved the fit substantially (the sum of squared residuals drops from 0.0135 to 0.0001), with fewer fit parameters to calculate.
References
The Omega Instruments Web Site: http://www.omega.com/temperature/Z/pdf/z021-032.pdf
The NIST ITS90 Calibration Standard page: http://srdata.nist.gov/its90/main/its90_main_page.html
Comparing Splines with a Polynomial Fit, by Robert Adair

The statistical B-spline functions introduced in the data analysis extension pack will calculate a string of knots using the Durbin-Watson statistic to accept or reject spline fits. In this way, statistical B-splines supply a minimal number of knots to reflect all of the data features.

a := READPRN("example1.txt")

x := a⟨0⟩    y := a⟨1⟩    w := a⟨2⟩

where w is a vector of weights giving the estimated standard deviations of the random error in y. We'll fit with spline polynomials of order 3.

SplineDegree := 3

b := Spline2(x, y, SplineDegree, w)

The number of knots returned for the optimal spline fit is b₁ = 40, which is a good compression of the data from the original length(x) = 536 points. We can take a look at the quality of the spline fit:

i := 0 .. b₁    knotsᵢ := b_{i+2}

i := 0 .. 100    rangeᵢ := i·(max(x) − min(x))/101 + min(x)

Spline⟨i⟩ := Binterp(rangeᵢ, b)    Spline := Splineᵀ
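Spline2 and Binterp, and the Durbin-Watson knot selection, are specific to the Mathcad Data Analysis Extension Pack. As a rough, assumed analogue outside Mathcad, a weighted smoothing spline (SciPy's splrep, which chooses its own knot set subject to a smoothness target) gives similar "few knots that still follow the data" behaviour. The data below are a synthetic stand-in for example1.txt.

```python
# Sketch: weighted smoothing B-spline as a rough analogue of Spline2/Binterp.
import numpy as np
from scipy.interpolate import splrep, splev

rng = np.random.default_rng(0)
x = np.linspace(0.0, 2000.0, 536)
y = 4e4 * np.exp(-((x - 800.0) / 150.0) ** 2) + 200.0 * rng.standard_normal(x.size)
w = np.full(x.size, 200.0)                 # estimated standard deviation of the noise

# splrep weights are 1/std; s ~ number of points is the usual smoothness target.
tck = splrep(x, y, w=1.0 / w, k=3, s=x.size)
knots = tck[0]
print("interior knots kept:", len(knots) - 2 * (3 + 1))   # compression of 536 points

xi = np.linspace(x.min(), x.max(), 101)
yi = splev(xi, tck)                        # evaluate the spline fit, like Binterp
```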
[Plot: original data and interpolated spline]
Polynomial fits
But why couldn't we fit this same data with an equivalent global polynomial? Do a global fit to the data using the same number of free parameters as the spline has knots, to see why the spline is a better option.
NumFreeParm := b₁    PolyDegree := NumFreeParm − 1

vs := regress(x, y, PolyDegree)

fglobal.regress(x1) := interp(vs, x, y, x1)    yregress := (fglobal.regress(x))→
The regress function fails in this fit, which is not surprising given the high polynomial degree.
[Plot: yregress versus x; the regressed values blow up to the order of ±10¹⁹]
Try fitting to Chebyshev polynomials for better numerical stability.
xmin := min(knots)    xmax := max(knots)

fPoly(n, x) := Tcheb(n, 2·(x − xmin)/(xmax − xmin) − 1)

i := 0 .. last(x)    j := 0 .. PolyDegree

M_{i,j} := fPoly(j, xᵢ)    c := (Mᵀ·M)⁻¹·Mᵀ·y
The matrix inversion is nearly singular in a numerical sense as can be seen by evaluating the determinant.
|Mᵀ·M| = 2.528236 × 10⁹²
This is testament to the numerical complexity of fitting a polynomial with enough coefficients to represent all the inflections in the data, as well as the large number of points.
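The same construction can be sketched in NumPy; the snippet below builds a Chebyshev design matrix and inspects the normal-equations matrix MᵀM (the worksheet looks at its determinant). The data are synthetic stand-ins, so the exact numbers will differ from the worksheet's.

```python
# Sketch: Chebyshev design matrix for a global polynomial fit, and a look at M^T M.
import numpy as np
from numpy.polynomial import chebyshev as C

rng = np.random.default_rng(1)
x = np.linspace(0.0, 2000.0, 536)                      # synthetic stand-in for example1.txt
y = 4e4 * np.exp(-((x - 800.0) / 150.0) ** 2) + 200.0 * rng.standard_normal(x.size)

deg = 40                                               # ~ one coefficient per spline knot
t = 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0    # map the abscissa onto [-1, 1]
M = C.chebvander(t, deg)                               # columns are T_0(t) .. T_deg(t)

print("det(M^T M)  =", np.linalg.det(M.T @ M))
print("cond(M^T M) =", np.linalg.cond(M.T @ M))

c = np.linalg.lstsq(M, y, rcond=None)[0]               # solve the least-squares problem directly
yglobal = M @ c                                        # global polynomial evaluated at x
```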
fglobal(x) := Σⱼ (cⱼ·fPoly(j, x))    yglobal := (fglobal(x))→
[Plot: original data, interpolated spline, and global polynomial]
Even the more stable Chebyshev polynomials will oscillate on the tails of the data because of the large number of terms, and fail to accurately represent the peaks and finer features of the data. While the splines are perhaps less informative as a physical model, they are a much better predictor of the behavior of the data at any arbitrary point in the range.
Complex Cubic Splines of Prescribed Length, by Alex Kushkuley

Cubic splines are a common method of approximating a curve with cubic polynomials. This article discusses a method for specifying the length of a section of the cubic spline fit between two endpoints. Splines are used to describe a curve without a generating function. Spline interpolation can also be used to approximate a curve which has an exceptionally complicated function associated with it. The utility of splines is based on the fact that a cubic polynomial is completely determined if its value and the value of its derivative are specified at two distinct points. This means that one can prescribe directions of the curve defined by a cubic polynomial at its endpoints. This property of cubic polynomials is used for creating continuous and well-behaved interpolation curves.

To be more precise, examine the third-order polynomial

p(x) = a₀ + a₁·x + a₂·x² + a₃·x³

and assume the initial and boundary conditions

p(0) = 0    p(1) = 0    (d/dx)p(0) = c    (d/dx)p(1) = b

Plugging these values back into the cubic equation, we get a system of linear equations

a₀ = 0    a₁ = c    a₁ + a₂ + a₃ = 0    a₁ + 2·a₂ + 3·a₃ = b

which yields the equation of an approximating cubic spline in the form

spline(x, c, b) = c·x − (2·c + b)·x² + (c + b)·x³

The curve defined by this equation passes through points (0,0) and (1,0), and its slopes at these points are equal to c and b.
On the other hand, suppose that we want the arc of our curve between the endpoints to have a prescribed length — this requirement is also quite natural for various CAD/CAM problems. Now we have a nonlinear equation
(1)    L(c, b) = s    where    L(c, b) = ∫₀¹ √( 1 + ((d/dx) spline(x, c, b))² ) dx
and s is a specified length. Fixing c, we can solve this equation for b in order to obtain a cubic curve which has a given length and which satisfies all the conditions of the "plain" cubic spline, except that now we can't prescribe the slope of the curve at its destination point. The integral involved is an elliptic one, so it is impossible to solve equation (1) analytically (see, e. g. [1] ). Mathcad can be used to get a numeric solution.
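A minimal numeric sketch of the same idea (not the worksheet's code): evaluate L(c, b) for the spline(x, c, b) defined above by quadrature and solve L(c, b) = s for b with a bracketing root finder. The bracket split point below (−0.83) is close to the minimizing slope found later in the article, and c = −4, s = 1.8 are the values used there.

```python
# Sketch: solve the prescribed-length equation L(c, b) = s for b numerically.
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

def dspline_dx(x, c, b):
    # derivative of spline(x, c, b) = c x - (2c + b) x^2 + (c + b) x^3
    return c - 2.0 * (2.0 * c + b) * x + 3.0 * (c + b) * x**2

def length(c, b):
    # arc length of y = spline(x, c, b) between x = 0 and x = 1
    return quad(lambda x: np.sqrt(1.0 + dspline_dx(x, c, b) ** 2), 0.0, 1.0)[0]

c, s = -4.0, 1.8

# length(c, b) is convex in b, so each root can be bracketed on one side of
# the length-minimizing slope (about -0.83 for c = -4).
b_left  = brentq(lambda b: length(c, b) - s, -20.0, -0.83)
b_right = brentq(lambda b: length(c, b) - s,  -0.83, 20.0)
print(b_left, b_right, length(c, b_left), length(c, b_right))
```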
The important thing to understand is that the length function on the left-hand side of equation (1) is a convex function of its second argument, i.e. the second derivative of L in b is positive and hence, dL/db is a monotonically increasing function of b. This property means that equation (1) has no more than two solutions. This is a rather general property of such functions (cf. [2]), but in our simple case we can use live symbolics to verify that it is correct.
The integrand, as a function of its second argument, has the form of the expression below, where x and y are arbitrary constants. Using the symbolic processor, differentiate twice on b to get
(d²/db²) √( 1 + (x + b·y)² )   simplify →   y² / ( 1 + x² + 2·x·b·y + b²·y² )^(3/2)
This expression confirms that the second derivative of the length function is positive, and therefore that equation (1) has no more than two solutions. We can find these solutions by the following procedure.
First, solve the equation

(d/db) L(b) = 0

which has only one solution, since the left-hand side is a monotone function. Using the previously derived expression for spline(x, c, b) we get

L(c, b) := ∫₀¹ √[ 1 + ( c − 2·(2·c + b)·x + 3·(c + b)·x² )² ] dx

Fix c := −4 and define L1(b) := L(c, b).

The derivative of L(c, b) in b is

DerivL(c, b) := ∫₀¹ [ ( c − 2·(2·c + b)·x + 3·(c + b)·x² )·( −2·x + 3·x² ) ] / √[ 1 + ( c − 2·(2·c + b)·x + 3·(c + b)·x² )² ] dx

Define DerivL1(b) := DerivL(c, b), take the initial value b := 0, and compute the resulting value of the first derivative at the endpoint:

Slope := root(DerivL1(b), b)    Slope = −0.834195

And find the length corresponding to this value of the derivative:

L1(Slope) = 1.599284    DerivL1(Slope) = −2.320678 × 10⁻⁴

Note that L1(Slope) is the minimum possible length of the cubic curve which satisfies three given conditions (values at the endpoints and the slope at the initial point). If we want our curve to have a given length, this length should be no less than L1(Slope). In this case the number Slope separates the two roots of our equation (1).

Let's say we look for a spline with the length

L0 := 1.8

then we will have one root on the left of Slope

guess := Slope − 1    b1 := root(L1(guess) − L0, guess)

Check the result:   b1 = −3.26509    L1(b1) = 1.80003

The second root is on the right of Slope

guess := Slope + 1    b2 := root(L1(guess) − L0, guess)

Check the result:   b2 = 1.335276    L1(b2) = 1.799278

This gives us two curves, corresponding to the two roots of L1(b) = L0:

spline1(x) := c·x − (2·c + b1)·x² + (c + b1)·x³
spline2(x) := c·x − (2·c + b2)·x² + (c + b2)·x³

which we plot below over x := 0, 0.01 .. 1.

[Plot: spline1(x) and spline2(x) versus x]

So far our methodology is of limited use, since we cannot change the endpoints of the curve, which, for the sake of simplicity, were fixed at (0,0) and (1,0). Below is a more general formulation. After the formulations given above, only a few changes are necessary.
Arbitrary splines of prescribed length

Points on the plane will now be represented as complex numbers, and cubic curves as cubic polynomials with complex coefficients. So, in specifying the initial conditions, use complex numbers as follows:

starting point:   start := 1 + i
end point:        end := −1 − i
derivative of the curve at the starting point (the velocity of the curve):   c := −1 + 5·i

To solve for the derivative at the end point, we can fix the ratio between its imaginary and real parts and solve the length equation for the real part. This will also ensure that the curve has a prescribed tangent line at the end point. Note that we were unable to achieve this with real cubic polynomials. If we set, for example, the slope at the finishing point to be

Slope := 1

then the velocity at the finishing point is Slope · x · i + x, where x is the real part of a complex variable. Define the coefficients of the complex spline by setting up a system of equations from the initial conditions, just as we did for the non-complex case. In this instance, the second and third order coefficients will be functions of the real variable, x, since the velocity at the endpoint is written in terms of x, as above. This yields:

A(x) := (−3·start + 3·end − 2·c − x − Slope·x·i)
B(x) := (x + Slope·x·i + 2·start − 2·end + c)

so that the curve we are looking for has the form

spline(x, t) := start + c·t + A(x)·t² + B(x)·t³

where the real parameter t belongs to the interval [0,1]. The velocity vector is equal to the derivative of spline(x, t) with respect to t:

C(x, t) := c + 2·A(x)·t + 3·B(x)·t²
DerivL x( )
0
1
tRe 1 Slope i⋅+( ) C x t,( )
⋅ 2− t⋅ 3 t
2⋅+( )⋅
C x t,( )
⌠⌡
d:=
Method 2:Differentiating on x we get the derivative of the length function
The value of L(d) found this way will still be good because the minimum is shallow.
L d( ) 3.601786=d 0.137445=d Minerr x( ):=
L x( ) 0=Given
x 0:=Pick a value of x near the minimum and enter this as a guess for the solve block.
To find a guess for the minimum possible length for the spline, plot L over a range of x values (you may need to change the range of x above to see the minimum.) 5 0 5
L x( )
x
x 5− 4.5−, 5..:=
There are two ways to proceed at this juncture. One way to find the minimum possible length is to use a solve block and the Minerr function. The accuracy of the value for the root of the function L(x) will be poor with the low tolerance used, but it will calculate quickly, and allow you to play with the values start, end, c and slope above. The second, more familiar way is to find the derivative of L(x), set it to zero, and solve for x. This method will be covered later, since it raises some interesting issues.
L x( )0
1
tC x t,( )⌠⌡
d:=
Hence, the length of the curve is equal to
ε λ TOL⋅:=λ 10:=Define the size of the intervals to be removed from the integral:
T2 x( ) if t2 x( ) t1 x( )> t2 x( ), t1 x( ),( ):=
T1 x( ) if t1 x( ) t2 x( )< t1 x( ), t2 x( ),( ):=
Order the roots in such a way that T1 < T2, using the intermediate values t1 and t2:
t2 x( ) Q t2 x( )( ):=t1 x( ) Q t1 x( )( ):=
Q x( ) if Im x( )( ) TOL<[ ] Re x( ) 0≥( )⋅ Re x( ) 1≤( )⋅ Re x( ), 1,[ ]:=
We look for those roots which are real and belong to the interval [0,1]:
t2 x( )A x( )− D x( )−
3 B x( )⋅:=t1 x( )
A x( )− D x( )+3 B x( )⋅
:=
D x( ) A x( )2
3 c⋅ B x( )⋅−:=
To do this, compute the roots t1 and t2 of C(x,t):
DerivL(x) is differentiable even when |C(x,t)| =0, and in many such cases Mathcad will compute this integral correctly, despite the apparent singularity. To avoid this issue and still get a good approximation for DerivL(x), remove small intervals of integration in the neighborhood of the roots of |C(x,t)|. The following expressions calculate the roots of |C(x,t)|, and create three integrals that leave out the intervals ±ε around each root.
L0 4.2:=Choose a length L0 to be greater than L(d):
L d( ) 3.602334=
With either method, the minimal possible length for our curve is approximately
DerivL d( ) 6.637192 104−
×=d 0.215161=
d root DerivL guessx( ) guessx,( ):=guessx 0.7:=
Now we can find the root of the derivative.
DerivL x( ) DerivL1 x( ) T1 x( ) ε+ T2 x( ) ε−<( ) DerivL2 x( )⋅+ T2 x( ) ε+ <(+:=
The new expression for the derivative becomes:
DerivL3 x( )
T2 x( ) ε+
1
tRe 1 Slope i⋅+( ) C x t,( )
⋅ 2− t⋅ 3 t
2⋅+( )⋅
C x t,( )
⌠⌡
d:=
DerivL2 x( )
T1 x( ) ε+
T2 x( ) ε−
tRe 1 Slope i⋅+( ) C x t,( )
⋅ 2− t⋅ 3 t
2⋅+( )⋅
C x t,( )
⌠⌡
d:=
DerivL1 x( )
0
T1 x( ) ε−
tRe 1 Slope i⋅+( ) C x t,( )
⋅ 2− t⋅ 3 t
2⋅+( )⋅
C x t,( )
⌠⌡
d:=
and rewrite the expression for the derivative in the following way:
This is an "equilateral fish" — the red solid curve has the same length as the blue dotted one. Both curves start at the same point on the "fish tail." Their starting velocities are equal. They finish simultaneously at the same point. Their finishing velocities are different. However the curves have the same tangent line at the finishing point which has a prescribed angle with the x-axis (45 degrees in this case).
1.5 1 0.5 0 0.5 12
1
0
1
2
Yi
Wi
Xi Ui,
W Im spline2 s( )( )→
:=U Re spline2 s( )( )→
:=
Y Im spline1 s( )( )→
:=X Re spline1 s( )( )→
:=
siiI
:=i 0 I..:=I 100:=range variable:
To graph these splines we need to plot imaginary part of our polynomials against their real parts.
spline2 t( ) start c t⋅+ A r2( ) t2
⋅+ B r2( ) t3
⋅+:=
and the second one is
spline1 t( ) start c t⋅+ A r1( ) t2
⋅+ B r1( ) t3
⋅+:=
So the first curve is
L r2( ) 4.20001=r2 root L x( ) L0− x,( ):=x guessq:=
L r1( ) 4.199986=r1 root L x( ) L0− x,( ):=x guessp:=
Check:guessq d 5+:=guessp d 5−:=
Find two roots of equation L(x) = L0.
To verify the results let's calculate the velocities of our curves at the endpoints (these match the complex value of c set at the start of this example):
t 0:=tspline1 t( )d
d1− 5i+=
tspline2 t( )d
d1− 5i+=
t 1:= vel1 argtspline1 t( )d
d
:= vel2 argtspline2 t( )d
d
:=
To see that the slope at the finishing point really is equal to 45 degrees, compare the tangents of the two velocities (the tangent will equal the value of Slope, set earlier):
tan vel1( ) 1= tan vel2( ) 1=
The reader is advised to play with all the parameters in this document. For example, setting the parameter Slope to be a big number will force both curves to be perpendicular to the x-axis at the finishing point (the correct way of doing this is, of course, to reparametrize the problem in terms of y instead of x). Changing the velocity at the starting point can radically change the shape of the solutions even if the slope at the starting point remains the same. Try, for example, multiplying the starting velocity by 5 and changing length L0 to 12. Note also that if the required length approaches L(d), the curves will move closer and closer to each other until they merge when the length is approximately equal to L(d). When the length is less than L(d), the root functions fail to converge (bear in mind, however, that some precision loss is inevitable).
Another experiment is to impose other linear initial conditions or to relax some of these conditions. This article can be used for the purposes of practical curve design. Some experimentation will be required in choosing initial guesses for the Mathcad root functions, but the information on the length function presented here is sufficient for doing this quickly. The calculations performed here are applicable to any parametric families of curves which depend linearly on a parameter which controls the length. (cf. [2]).
The author is grateful to Frank Purcell, Paul Lorczak and Leslie Bondaryk for very useful comments.
References
[1] Handbook of Mathematical Functions, edited by M. Abramowitz and I. Stegun, National Bureau of Standards, 1964.
[2] A. Kushkuley and S. Rozenberg, "Length function on a parametric family of curves," Latvian Math. Ezhegodnik, vol. 27, Riga, pp. 154-159, 1983 (in Russian).
Conical Surface Regression and Analysis, by Xavier Colonna de Lega, Zygo Corporation
Introduction
This worksheet illustrates how to calculate the parameters defining a conical surface that best matches a set of (x,y,z) points in a least-squares sense. The deviation of the original data points with respect to the best-fit cone is then calculated to create radial and tangential profiles of the surface deviation.
Valves used for hydraulic and fuel injection systems in cars and trucks are typically made of a moving ball or needle that mates to a conical surface. The region on the cone where the two surfaces contact in the closed position is called the valve seat. In order for the seat to be an effective sealing surface its deviation from a perfect cone is tightly toleranced. Dedicated profilers are used on the production floor and in the QC lab to verify that these tolerances are met. The data used in this worksheet are the result of a measurement of a machined valve seat with a surface profiling interferometer.
Coordinate system
The (X,Y,Z) coordinate system is defined by the interferometer geometry, see figure. The center is the point P. The axes are defined by the unitless X, Y and Z normed vectors.
In the figure the conical surface is perfectly aligned to the instrument: its axis defined by the cone center point C and the unit vector D is collinear with the Z-axis. The cone semi included angle is γ. The inspection diameter where roundness deviation must be measured is φ.
Finally, the location of the cone center on the Z-axis is such that the normal to the surface at a point M belonging to the inspection diameter passes through the center of the coordinate system P. In other words, the conical surface is tangent to a sphere of radius MP centered at P. The nominal distance MP is defined by the instrument. It is called "radius" in the worksheet.

Notations: Vector variables are bold faced. Lower-case variables are scalar values. Upper-case variables correspond to (x,y,z) points.

Predefined units

Coordinate system:

Unit vectors:   X := (1 0 0)ᵀ    Y := (0 1 0)ᵀ    Z := X × Y
Center position:   P := (0 0 0)ᵀ·m
Nominal cone semi included angle:   γ0 ≡ 45·deg

Nominal cone direction: we define the normed vector D by two angles α and β that correspond to two successive rotations of the cone axis about the X and Y axes.

Nominal rotations:   α0 := 0·mrad    β0 := 0·mrad
Rotation matrix:   rot(α, β) := Rx(α)·Ry(β), the product of the elementary rotation matrices for a rotation about the X-axis by α and about the Y-axis by β
Nominal cone direction:   D0 := rot(α0, β0)·(−Z)
Radius of tangent sphere:   radius ≡ 2.0·mm
Nominal cone center or apex:   C0 := P + (radius/sin(γ0))·Z
[Figure: cone of semi angle γ with center C and axis direction D, inspection diameter φ, point M on the inspection circle, and the instrument coordinate system (X, Y, Z) centered at P]
We have defined the conical surface and its location in space by 6 scalar values: the semi included angle γ, the (x,y,z) coordinates of the cone center or apex and the two rotation angles describing the normed vector D.

In practice, the measured data corresponds to a part that is not perfectly aligned to the instrument. The least-squares fit cone will provide estimates for these 6 parameters.

Import measured data

Load measured (x,y,z) coordinates:   MeasuredXYZ := READBIN("ConeXYZ.bin", "float", 0, 3, 0)·mm

Number of (x,y,z) measurement points:   rows(MeasuredXYZ) = 51518

Extract coordinates:   x := MeasuredXYZ⟨0⟩    y := MeasuredXYZ⟨1⟩    z := MeasuredXYZ⟨2⟩

Best-fit conical surface

We first define a merit function to minimize. Here we choose to calculate the sum of the squared distances from the measured data points to the best-fit cone. The distance is measured along the local cone normal.

[Figure: calculation of the distance HM for a measured point M, showing the cone apex C, axis direction D, semi angle γ, and the auxiliary points H, M', M'']

The figure illustrates the calculation of the distance HM for a given measured point M.

We have:   CM' = CM·D > 0

The signed length M'M'' is defined as positive:   M'M'' = CM'·tan(γ)

The signed length M'M is also positive:   M'M = |MM'| = √(CM² − CM'²)

This yields an expression for the signed length MM'':   MM'' = M'M'' − M'M

We have:   MM'' = MH / cos(γ)

If we define the surface deviation as HM, this signed quantity is given by:

HM = cos(γ)·√(CM² − CM'²) − sin(γ)·CM'

The sum calculation has been optimized in this program. It turns out that calculation with vectors is slower than scalar calculations. The least square sum is scaled to mm² in order to avoid large exponents.
SumSquares(points, cx, cy, cz, α, β, γ) :=
    Σ ← 0
    sγ ← sin(γ)/mm
    cγ ← cos(γ)/mm
    (Dx Dy Dz)ᵀ ← ( sin(β)   sin(α)·cos(β)   −cos(α)·cos(β) )ᵀ
    "Loop on all data points"
    for k ∈ 0 .. rows(points) − 1
        CMx ← points_{k,0} − cx
        CMy ← points_{k,1} − cy
        CMz ← points_{k,2} − cz
        CM2 ← CMx² + CMy² + CMz²
        cmz ← CMx·Dx + CMy·Dy + CMz·Dz
        HM ← cγ·√(CM2 − cmz·cmz) − sγ·cmz
        Σ ← Σ + HM²
    return Σ
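The same merit function translates directly into a vectorized NumPy sketch using the HM expression derived above. The test points below are an assumption for illustration (the worksheet's ConeXYZ.bin data are not reproduced here); in practice this function would be handed to an optimizer such as scipy.optimize.minimize in place of the Minerr solve block used next.

```python
# Sketch: sum of squared normal distances from (x, y, z) points to a cone,
# mirroring the HM expression derived above. Distances in mm.
import numpy as np

def cone_direction(alpha, beta):
    # unit axis vector obtained by rotating -Z about X by alpha, then about Y by beta
    return np.array([np.sin(beta),
                     np.sin(alpha) * np.cos(beta),
                     -np.cos(alpha) * np.cos(beta)])

def sum_squares(points, cx, cy, cz, alpha, beta, gamma):
    D = cone_direction(alpha, beta)
    CM = points - np.array([cx, cy, cz])          # vectors from apex to points
    cm_axis = CM @ D                              # projection onto the cone axis
    cm2 = np.einsum('ij,ij->i', CM, CM)           # squared lengths |CM|^2
    hm = np.cos(gamma) * np.sqrt(cm2 - cm_axis**2) - np.sin(gamma) * cm_axis
    return np.sum(hm**2)

# Assumed test data: points on a perfect 45-degree cone with apex at the origin.
t = np.linspace(0.0, 2.0 * np.pi, 200)
r = 1.0
pts = np.column_stack([r * np.cos(t), r * np.sin(t), -r * np.ones_like(t)])
print(sum_squares(pts, 0.0, 0.0, 0.0, 0.0, 0.0, np.radians(45.0)))   # ~ 0
```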
Guess values for the fit:
α0 0mrad= β0 0 mrad= γ0 45 deg=
C0 = (0   0   2.828)ᵀ mm    (cx0 cy0 cz0)ᵀ := C0
Minimize the sum of squares using a solve block:
Set convergence tolerance:   TOL := 10⁻⁴
On slower computers we can use a data subset to perform the fit.

sub := 20    i := 0, sub .. rows(MeasuredXYZ) − 1
T0 := MeasuredXYZᵀ    subMeasuredXYZ⟨i/sub⟩ := T0⟨i⟩    subMeasuredXYZ := subMeasuredXYZᵀ

Sum of Squares Solve Block

Enable one or the other equality statement, depending on whether you wish to use the truncated data or the full set. The Quasi-Newton algorithm gives the best results here. The right-click option on Minerr was used to force this choice.

Given
SumSquares(MeasuredXYZ, cx0, cy0, cz0, α0, β0, γ0) = 0
SumSquares(subMeasuredXYZ, cx0, cy0, cz0, α0, β0, γ0) = 0

(cx cy cz α β γ) := Minerr(cx0, cy0, cz0, α0, β0, γ0)ᵀ

Verify improvement of sum of squares:

SumSquares(MeasuredXYZ, cx0, cy0, cz0, α0, β0, γ0) = 441.274195467072
SumSquares(MeasuredXYZ, cx, cy, cz, α, β, γ) = 0.010398

Best fit cone parameters:

Best fit cone included angle:   2·γ = 90.091 deg
Note: A typical cone angle tolerance is on the order of 1°.
Tilt of cone axis:   α = −0.296 deg    β = 0.244 deg
Direction:   D := rot(α, β)·D0
Cone center:   C := (cx cy cz)ᵀ
Displacement from nominal:   C − C0 = (−0.011   0.013   −0.133)ᵀ mm
3D plot of measured data and fitted cone
Angular width of displayed best fit cone surface: ∆θ 6 deg⋅:=
Best-fit 3D plotThe red line corresponds to the best-fit cone axis.
The light-blue wireframe surface represents a section of the best fit cone.
[3D scatter plot: measured data (x, y, z), best-fit cone surface (FitX, FitY, FitZ), and cone axis (AxisX, AxisY, AxisZ), all in mm]
Calculation of deviation from best fit cone
This step consists in calculating the distance of each data point to the best-fit cone along the local normal. This distance h is associated to the (X,Y) projection of the data point onto a plane perpendicular to the cone axis.
Residuals
Each line in the residual matrix contains the height deviation d from the best-fit cone, X, Y, ρ and ω. (X,Y) are the rectangular coordinates of the projection of the data point onto a plane perpendicular to the best-fit cone axis. (ρ,ω) are the distance of the data point from the cone center and the angular position of the data point projection in the plane XY.
Residuals := GetResiduals(MeasuredXYZ, C, D, γ)

Surface deviation:   δ := Residuals⟨0⟩
Look at the distribution of values:
hh := histogram(1000, δ·nm⁻¹)
[Histogram: cone flatness deviation; frequency (count) versus height deviation (nm)]
Peak to valley surface deviation:
Range(δ) = 3.661 µm
Projected points coordinates:
Px := Residuals⟨1⟩    Py := Residuals⟨2⟩
Plots
φ = 2.68 mm    ψ = 45 deg

[3D scatter plot: surface deviation δ (µm) over the projected coordinates Px, Py (mm), with the roundness circle (Cx, Cy, Cz) and the straightness segment (strx, stry, strz) overlaid]
This surface is plotted using a 3D scatter plot because the surface plot can not handle data with holes. The style for the 3 plots is scatter plot.
The blue line position is defined by the angle ψ. The so-called straightness profile is generated as the intersection of the surface with a plane that contains the blue segment and is perpendicular to the plane of the figure.
Similarly, the red trace is the location where we want to extract a roundness profile, which is the intersection of the measured surface with a cylinder of diameter φ that has its axis perpendicular to the plane of the figure.
In the following we will use the projection shown above to extract a straightness and a roundness profile. Note that the projection introduces geometrical distortions that we will ignore in this worksheet.
stρ' Straightness' 0⟨ ⟩:=stZ' Straightness' 1⟨ ⟩
:=
Straightness' Interpolate4PtPlane X Y, Z, min subρ( ), max subρ( ), 10 µm⋅, dρ,( ):=
Z subZ:=
Y subρ mod subω ψ− 2 π⋅,( )⋅( )→
:=X subρ:=
4-point interpolation
We also create an interpolated profile by fitting a plane to the four closest neighbors of each interpolation location. The corresponding function needs data in a new coordinate system. The first parameter X is the position of the data points projected on the plane of the profile. The second parameter Y is the distance of the data points to the plane of the profile. The third parameter Z is the surface deviation. The last two parameters are the maximum distance from the profile plane for a point to be valid and the sampling step for the interpolation.
stρ Straightness 3⟨ ⟩:=stZ Straightness 0⟨ ⟩
:=
Straightness StProf ψ subData, γ, lp, dρ,( ):=
dρ 2 µm⋅:=Interpolation sampling step:
lp 0.05:=
We use loess to create an interpolated profile. The loess proximity parameter is chosen to be small to limit lowpass filtering:
subωsubData 4⟨ ⟩
m:=subρ subData 3⟨ ⟩
:=
subZ subData 0⟨ ⟩:=subY subData 2⟨ ⟩
:=subX subData 1⟨ ⟩:=
subData SelectDataSt ψ Residuals, 2 deg⋅,( ):=
Straightness
We select a subset of data points based on their proximity to the generatrix of interest. Here we retain points whose angular position is within +/-1° of the generatrix direction ψ.
ψ 45 deg⋅≡Straightness profile at angular position:
Straightness profile
dp 5µm=dpπ φ⋅
2 roundπ φ⋅2 dp⋅
⋅
:=
Exact sampling step:
dp 5 µm⋅:=
Nominal interpolation sampling step along the perimeter:
subωsubData2 4⟨ ⟩
m:=subρ subData2 3⟨ ⟩
:=
subZ subData2 0⟨ ⟩:=subY subData2 2⟨ ⟩
:=subX subData2 1⟨ ⟩:=
subData2 SelectDataRd φ Residuals, 10 µm⋅,( ):=
We first select the data points that fall within a certain distance of the circle of diameter φ. The last parameter of the function call below defines this maximum distance.
Roundness
φ 2.68 mm⋅≡
Cone diameter where roundness is measured:
Roundness profile
1.75 1.8 1.85 1.9 1.95 2 2.05 2.12
1.6
1.2
0.8
0.4
0
0.4
0.8
1.2
1.6
2
Loess interpolation4-point linear interpolationProjection of selected data points
Radial profile (from cone center)
Distance from cone center (mm)
Surf
ace
devi
atio
n (m
icro
met
er)
Range stZ'( ) 1.861 µm=
0 1 2 3 4 5 6 7 8 91
0.8
0.6
0.4
0.2
0
0.2
0.4
0.6
0.8
1
4-point linear interpolation
Roundness profile
Position along the perimeter (mm)
Surf
ace
devi
atio
n (m
icro
met
er)
Note: A typical roundness tolerance is on the order of 1 micrometer.
Range rdZ( ) 0.381 µm=max rdZ( ) min rdZ( )− 0.381 µm=
The roundness error at diameter φ is equal to:
rdp Roundness 0⟨ ⟩:=rdZ Roundness 1⟨ ⟩
:=
Roundness Interpolate4PtPlane X Y, Z, 0 mm⋅, π φ⋅ dp−, 10 µm⋅, dp,( ):=
Z stack submatrix Z cut, last Z( ), 0, 0,( ) Z, submatrix Z 0, cut, 0, 0,( ),( ):=
Y stack submatrix Y cut, last Y( ), 0, 0,( ) Y, submatrix Y 0, cut, 0, 0,( ),( ):=
X stack submatrix X cut, last X( ), 0, 0,( ) π φ⋅− X, submatrix X 0, cut, 0, 0,( ) π φ⋅+,( ):=
cut round 0.5 last X( )⋅( ):=
Because the data wrap around in this case we pad both ends of the data sent to the interpolation function.
Z subZ:=
Y subρ sin γ( )⋅φ
2−:=X
φ
2subω π+( )⋅:=
Use 4-point linear interpolation:
Profile filtering
Roundness profiles are usually lowpass, band-pass and highpass filtered to extract form, waviness and roughness information. We illustrate here how to extract form with a Gaussian filter. A typical cutoff frequency (50% transfer function) for this type of data is "50 upr", which corresponds to 50 undulations per revolution.
Since there may be missing data points in the interpolated roundness profile, we use linear interpolation to fill in the missing data and then use a Fourier transform to apply the Gaussian filter. Note that since the data is effectively periodic for a roundness profile, there are no edge discontinuity issues for the Fourier transformation.
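As an assumed illustration of the same filtering step in Python: build a Gaussian transfer function that falls to 50% at 50 undulations per revolution and apply it multiplicatively in the Fourier domain of a (periodic) roundness profile. The profile below is a synthetic stand-in for the worksheet's data.

```python
# Sketch: Gaussian low-pass (50% transmission at 50 upr) applied via FFT to a
# periodic roundness profile.
import numpy as np

npt = 1024
theta = np.linspace(0.0, 2.0 * np.pi, npt, endpoint=False)
profile = 0.3 * np.sin(8 * theta) + 0.05 * np.sin(120 * theta)   # form + roughness, um

upr = np.fft.fftfreq(npt, d=1.0 / npt)          # undulations per revolution per FFT bin
cutoff = 50.0                                    # 50% transfer at 50 upr
transfer = np.exp(-np.log(2.0) * (upr / cutoff) ** 2)

form = np.real(np.fft.ifft(np.fft.fft(profile) * transfer))
print("P-V before: %.3f  after: %.3f" % (np.ptp(profile), np.ptp(form)))
```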
Create regularly sampled data
Generate Gaussian filter for the frequency domain:
Npt := rows(rdp)    j := 0 .. Npt/2

filtⱼ := exp(−ln(2)·j²/50²)    filt_{mod(Npt − j, Npt)} := filtⱼ
Filtered profile:   rdZ2 := (icfft(cfft(rdZ2)·filt))→

The roundness form error at diameter φ is equal to:
max rdZ2( ) min rdZ2( )− 0.277 µm=
Roundness profiles are frequently presented as polar plots.
Offset radius:   R := 1.5·µm

Plot range:   sc := ceil[ (max(rdZ) + R)/(0.2·µm) ]·0.2·µm

The number of grid points on the polar axis should be set manually to sc·10/µm = 14 to get a scale of 0.1 µm per radial division.
[Polar plot: 4-point interpolation and 50-upr filtered roundness profiles]
3D plot detail
As a final step, let's display the subregion of the data where the roundness and straightness profiles intersect. Rotate the figure to observe how the profiles follow the original surface deviation.
Select data
[3D scatter plot: selected data (subX, subY, subZ), roundness profile (rdX, rdY, rdZ), and straightness profile (stX, stY, stZ); x and y in mm, z in µm]
References
Metrology data courtesy of Zygo Corporation, all rights reserved, used by permission.
For more information on surface metrology: D. J. Whitehouse (1994), Handbook of Surface Metrology, Institute of Physics Publishing, ISBN 0-7503-0039-6.
For more information on optical profilers: Peter de Groot, Xavier Colonna de Lega (May 2003), “Valve cone measurement using white light interference microscopy in a spherical measurement geometry,” Optical Engineering, Vol. 42, No. 5, pp. 1232-1237.
Cosine Smoothing, by James C. (Jim) Bach, Delphi Delco Electronic Systems, Kokomo, IN, USA
Introduction
Mathcad provides a number of built-in smoothing filters, which are often sufficient for cleaning-up your data/waveforms prior to the primary processing operation. Sometimes, however, the built-in filters oversmooth data, removing features that are not noise, and you cannot obtain the desired level or quality of filtering you need. For example, here is a noisy signal which we might want to filter before analyzing its key characteristics.
Signal := READPRN("signal.prn")    i := 0 .. last(Signal)    Timeᵢ := i/300
Passing the signal through the three built-in smoothing filters we see:
NFilt1 := 41    NFilt2 := NFilt1/300

FilteredMed := medsmooth(Signal, NFilt1)
FilteredK := ksmooth(Time, Signal, NFilt2)
FilteredSup := supsmooth(Time, Signal)
[Plot: Median Smoothing — filtered signal and original signal]

[Plot: K (Gaussian Weighted) Smoothing — filtered signal and original signal]

[Plot: Super (Variable Bandwidth) Smoothing — filtered signal and original signal]
With this dataset, there are problems with all of the above smoothing techniques.
Median smoothing artificially 'squared-up' the high-frequency ripple.

Gaussian smoothing over-attenuated the high-frequency ripple, and artificially 'rounded-off' the square pulse.

Super smoothing removed the high-frequency ripple entirely (highly over-filtered), and artificially rounded the square pulse.
Mean smoothing
In cases such as this, you can write your own sliding window filters using a Mathcad program block (multiline function), as shown in the example below. This "Mean" smoother will filter your data (input vector "Data") by taking the average (mean) of the data points within the sliding window (width specified by "Width"):
MeanSmooth(Data, Width) ≡
    "Calculate 1/2-width of filtering window"
    WidthHalf ← trunc(Width/2)
    "Calculate where we need to start collapsing the window"
    NearEnd ← last(Data) − WidthHalf
    "Iterate through all of the data points"
    for Pt ∈ 0 .. last(Data)
        "Calculate beginning (Start) of sliding window"
        Start ← Pt − WidthHalf   if Pt ≥ WidthHalf
        Start ← 0                otherwise
        "Calculate ending (Stop) of sliding window"
        Stop ← Pt + WidthHalf    if Pt ≤ NearEnd
        Stop ← last(Data)        otherwise
        "Use 'submatrix' to extract the window of data, then take mean of that chunk"
        Out_Pt ← mean(submatrix(Data, Start, Stop, 0, 0))
    "Return the filtered data"
    return Out
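A minimal NumPy sketch of the same sliding-window mean (an assumed translation, not the worksheet's program); the window is clipped at both ends of the data exactly as the program block above does with Start and Stop.

```python
# Sketch: sliding-window mean with the window clipped at the data ends.
import numpy as np

def mean_smooth(data, width):
    half = width // 2
    out = np.empty(len(data))
    for pt in range(len(data)):
        start = max(pt - half, 0)              # window start, clipped at 0
        stop = min(pt + half, len(data) - 1)   # window end, clipped at last point
        out[pt] = data[start:stop + 1].mean()
    return out

signal = np.sin(np.linspace(0, 10, 300)) + 0.2 * np.random.default_rng(2).standard_normal(300)
smoothed = mean_smooth(signal, 41)
```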
Testing this filter with the signal yields

FilteredMean := MeanSmooth(Signal, NFilt1)
[Plot: Mean Smoothing — filtered signal and original signal]
Cosine smoothing
A variant of the mean smoother, shown above, is the "Cosine Weighted Mean" smoother, which weights the averaged data points so that those near the center of the window have more weight (importance) than those near the edges of the window:
CosMeanSmooth(Data, Width) :=
    "Calculate 1/2-width of filtering window"
    WidthHalf ← trunc(Width/2)
    "Calculate where we need to start collapsing the window"
    NearEnd ← last(Data) − WidthHalf
    "Store away pi divided by 2 because we'll need it often"
    Π ← π/2
    "Iterate through all of the data points"
    for Pt ∈ 0 .. last(Data)
        "Calculate beginning (Start) of sliding window"
        Start ← Pt − WidthHalf   if Pt ≥ WidthHalf
        Start ← 0                otherwise
        "Calculate ending (Stop) of sliding window"
        Stop ← Pt + WidthHalf    if Pt ≤ NearEnd
        Stop ← last(Data)        otherwise
        "Calculate 'Area Correction Factor' to compensate for size of sliding window"
        η ← 1 / Σ_{j = Start..Stop} cos(Π·(j − Pt)/WidthHalf)
        "Take sum of cosine-weighted data points in the window"
        Out_Pt ← η · Σ_{j = Start..Stop} [ Dataⱼ·cos(Π·(j − Pt)/WidthHalf) ]
    "Return the filtered data"
    return Out
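The same window logic with cosine weights, sketched in NumPy (again an assumed translation): each point in the window is weighted by cos(π/2 · (j − Pt)/WidthHalf) and the weighted sum is normalized by the total weight, which is the role the η factor plays in the program above.

```python
# Sketch: cosine-weighted sliding-window mean with end-of-data window clipping.
import numpy as np

def cos_mean_smooth(data, width):
    half = width // 2
    out = np.empty(len(data))
    for pt in range(len(data)):
        start = max(pt - half, 0)
        stop = min(pt + half, len(data) - 1)
        j = np.arange(start, stop + 1)
        w = np.cos(0.5 * np.pi * (j - pt) / half)     # weights peak at the window center
        out[pt] = np.sum(w * data[start:stop + 1]) / np.sum(w)   # eta normalization
    return out

signal = np.sin(np.linspace(0, 10, 300)) + 0.2 * np.random.default_rng(3).standard_normal(300)
smoothed = cos_mean_smooth(signal, 41)
```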
Note the truncated (narrowed) window widths at the ends of the data, as controlled by the variables "Start" (beginning of sliding window) and "Stop" (ending of sliding window). At the 1st data point, only 1/2 of the specified window width is available for averaging. As the program block iterates through the data, the averaging window expands until it is the specified width, which occurs when it has reached the "WidthHalf" data point. It continues to slide across the data until it nears the end of the data, at which point (at "NearEnd" data point) it begins to collapse. When it finally reaches the last data point, the averaging window is once again only 1/2 of the specified width. It is because of this automatically expanding/collapsing window that the summation of weighted points needs to be multiplied by the scaling factor containing the "Start" and "Stop" indices:
η ← 1 / Σ_{j = Start..Stop} cos(Π·(j − Pt)/WidthHalf)
Testing this filter, we see:
FilteredCos := CosMeanSmooth(Signal, NFilt1)
[Plot: Cosine Mean Smoothing — filtered signal and original signal]
Note, however, that the area compensation factor (η) needs to be adjusted for each shape by performing an iterated sum of the shape's values in order to find the area under the curve. Since the curve (window) expands and collapses dynamically at the data endpoints, this needs to be calculated on-the-fly, or, at least when near the ends. This process can be made more efficient (since the collapsed window occurs a small fraction of the time) by only repeatedly calculating the window area when near the ends of the data, as shown in the following revision of the program block:
CosMeanSmooth(Data, Width) :=
    "Calculate 1/2-width of filtering window"
    WidthHalf ← trunc(Width/2)
    "Calculate where we need to start collapsing the window"
    NearEnd ← last(Data) − WidthHalf
    "Store away pi divided by 2 because we'll need it often"
    Π ← π/2
    "Calculate window area when free-and-clear of the ends of data"
    ηmid ← 1 / Σ_{j = 0..2·WidthHalf} cos(Π·(j − WidthHalf)/WidthHalf)
    "Iterate through all of the data points"
    for Pt ∈ 0 .. last(Data)
        "Calculate beginning (Start) of sliding window"
        Start ← Pt − WidthHalf   if Pt ≥ WidthHalf
        Start ← 0                otherwise
        "Calculate ending (Stop) of sliding window"
        Stop ← Pt + WidthHalf    if Pt ≤ NearEnd
        Stop ← last(Data)        otherwise
        "Calculate 'Area Correction Factor' to compensate for size of sliding window"
        η ← ηmid   if WidthHalf ≤ Pt ≤ NearEnd
        η ← 1 / Σ_{j = Start..Stop} cos(Π·(j − Pt)/WidthHalf)   otherwise
        "Take sum of cosine-weighted data points in the window"
        Out_Pt ← η · Σ_{j = Start..Stop} [ Dataⱼ·cos(Π·(j − Pt)/WidthHalf) ]
    "Return the filtered data"
    return Out
Testing this more efficient version of the smoother:
FilteredCos2 := CosMeanSmooth(Signal, NFilt1)
[Plot: Cosine Mean Smoothing (revised program) — filtered signal and original signal]
Notice that this filter doesn't suffer (as much) the same problems of the built-in smoothing filters. The high-frequency ripple isn't as severely attenuated and the square pulse is more accurately reproduced:
The graph below illustrates three variants of "Weighted Mean" smoothing windows that are easily implemented with the programming block above:
We can find the area under the curves, presuming untruncated (full-width) windows, as:
Π := π/2    Width := 10000    WidthHalf := 50%·Width

"Cosine" window:          [ Σ_{j = 0..Width} cos(Π·(j − WidthHalf)/WidthHalf) ] / Width = 63.661977 %

"Cosine Squared" window:  [ Σ_{j = 0..Width} cos(Π·(j − WidthHalf)/WidthHalf)² ] / Width = 50 %

"Linear Ramp" window:     [ Σ_{j = 0..Width} ( 1 − |j − WidthHalf| / WidthHalf ) ] / Width = 50 %
Limitations of Linear Correlation and Extrapolation, by Paul Lorczak

Linear correlation has the potential to be misinterpreted. This example shows the potential for misunderstanding, and the dangers of extrapolation from fitted parameters. Consider the following example. Let z and y be related by the following formula:

y(z) := 0.4·z² + 5·z + 1

This is not a linear relationship. However, if we have only data from a small interval, we might compute the correlation coefficient based on an insufficient sampling to see the relationship:

i := 0 .. 19    zᵢ := 0.001·i − 0.01    fzᵢ := y(zᵢ)

It is interesting to note that the covariance is quite small:

cvar(z, fz) = 1.662367 × 10⁻⁴

But from Pearson's correlation coefficient, it looks as if the points are collinear:

corr(z, fz) = 1

The intercept and slope of this line are

(a b)ᵀ := line(z, fz)    a = 1.000013    b = 4.9996

When we plot the points and the line, we get this graph:

[Plot: fzᵢ and a + b·zᵢ versus zᵢ]
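The same observation can be reproduced in a few lines of NumPy using the article's own y(z) = 0.4·z² + 5·z + 1: over the narrow interval above the correlation coefficient is essentially 1, while over the wide interval used next it collapses.

```python
# Sketch: Pearson correlation of a quadratic sampled on a narrow vs. a wide interval.
import numpy as np

def y(z):
    return 0.4 * z**2 + 5.0 * z + 1.0

z_small = 0.001 * np.arange(20) - 0.01        # the narrow interval used above
z_large = 10.0 * np.arange(20) - 100.0        # the wide interval used below

print(np.corrcoef(z_small, y(z_small))[0, 1])   # ~ 1.0: looks perfectly linear
print(np.corrcoef(z_large, y(z_large))[0, 1])   # ~ 0.05: the nonlinearity is obvious
```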
It seems conclusive that z and fz are linearly correlated over this small range. Once we move to larger values of z, however, the difference between the true functional value and the assumed linear one becomes apparent.

i := 0 .. 19    zᵢ := 10·i − 100    fzᵢ := y(zᵢ)

corr(z, fz) = 0.048599

[Plot: graph of y(z) and the extrapolation of the best-fitting line a + b·z, fit for |z| small]

Here is another set of data with what looks like a linear functional relationship.

S := (0  1  2  3  4  5  6)ᵀ    T := (1  1.1  1.2  1.3  1.4  1.5  1.6)ᵀ

Again, we compute the intercept and slope of the best-fitting line.

SL := slope(S, T)    INT := intercept(S, T)

SL = 0.1    INT = 1

These points also lie on the following, very different, function:

F(u) := 3·sin(12·π·(u − 3)) + 1 + u/10

j := 0 .. 6    range := 0.01, 0.05 .. 6.01

When we graph the points, we see that they might be drawn from any number of functions that pass through them. Adequate sampling frequency and range of measurement is critical to draw meaningful conclusions about your data.

[Plot: Tⱼ versus Sⱼ, with the best-fitting line INT + SL·range and F(range) versus range]
Numerical Integration of Data, by Jean Giraud, Richard Jackson, and Leslie Bondaryk

One common problem in data analysis is that of integrating over a data set which can't be accurately fit with an analytical function. Various discrete techniques can be used, or the data can be interpolated with a function that can then be used to integrate over an arbitrarily close spacing of points, yielding more accurate results.

Here are two vectors that represent the same exponentially shaped data. They present several challenges for integration routines. First, they are sparse, that is, there is wide spacing between points and they cover a large range of magnitudes. X1 has approximately half the number of points as X2, spanning the same range of values. We wish to integrate over the same limits for both vectors so that we can compare five methods of numerical integration.

f(x) := 25·exp(−41·x) + 0.015

n1 := 25    n2 := 51

Create a range variable for each data set:

xstart := 0    xend := 0.5

ix1 := 0 .. n1 − 1    ix2 := 0 .. n2 − 1

Generate vectors for the 51 and 25 point data sets in x and y:

X1_{ix1} := (xend − xstart)/(n1 − 1)·ix1    Y1_{ix1} := f(X1_{ix1})
X2_{ix2} := (xend − xstart)/(n2 − 1)·ix2    Y2_{ix2} := f(X2_{ix2})

∆x1 := X1₁ − X1₀ = 0.020833    ∆x2 := X2₁ − X2₀ = 0.01
Y1 = (25.015   10.65594354   4.54418717   1.942793   0.83554146 ...)ᵀ
Y2 = (25.015   16.60625625   11.02579136   7.32231444   4.86450106 ...)ᵀ

(only the first five elements of each vector are displayed)
Spacing for the first data set is approximately 0.021, and for the second, 0.01.
[Plot: Y1 and Y2 versus X1 and X2]
Ideally, when integrated, these two sets of data should yield the same area under the curve.
Trapezoidal and Simpson's rules
Traditional numerical integration methods use quadrature rules to obtain approximate areas. The simplest of these quadrature methods are the Trapezoidal rule and Simpson's rule. The Trapezoidal rule can be applied to any vector of data, whereas Simpson's rule requires that the number of data points is odd (i.e. an even number of divisions). While the Trapezoidal rule fits a straight line to each division (a straight line is drawn from data point to data point) forming trapezoidal areas, which are summed, Simpson's rule fits second degree polynomials or parabolas to three subsequent points, in many instances giving a better approximation to the integrated area.
Trapezoidal(V, h) := h·[ (V_{last(V)} + V₀)/2 + Σ_{i = 1..last(V)−1} Vᵢ ]
Trapezoidal(Y1, ∆x1) = 0.65389    Trapezoidal(Y2, ∆x2) = 0.62577
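Both quadrature rules named above are available directly in SciPy; the sketch below rebuilds the two samplings from f(x) = 25·e^(−41x) + 0.015 and reproduces the comparison, including the exact value for reference.

```python
# Sketch: trapezoidal and Simpson's rule on the two samplings of f(x).
import numpy as np
from scipy.integrate import trapezoid, simpson

f = lambda x: 25.0 * np.exp(-41.0 * x) + 0.015
x1 = np.linspace(0.0, 0.5, 25)    # sparse set
x2 = np.linspace(0.0, 0.5, 51)    # finer set

print("trapezoid:", trapezoid(f(x1), x1), trapezoid(f(x2), x2))
print("Simpson:  ", simpson(f(x1), x=x1), simpson(f(x2), x=x2))
print("exact:    ", (25.0 / 41.0) * (1.0 - np.exp(-41.0 * 0.5)) + 0.015 * 0.5)
```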
Simpsons(V, h) := (h/3)·[ V_{last(V)} + V₀ + 4·Σ_{i = 1..last(V)/2} V_{2·i−1} + 2·Σ_{i = 1..(last(V)−2)/2} V_{2·i} ]

Simp51 := Simpsons(Y1, ∆x1)    Simp25 := Simpsons(Y2, ∆x2)

Simp51 = 0.618914    Simp25 = 0.61735

Note that there is much better agreement between the two values using Simpson's rule. Sometimes, it may even be warranted to fit higher degree polynomials to the data, in which case the Newton-Cotes formula may be used.

Integration using interpolating functions

A better approach is to create an interpolating function, for example using splines, and then apply Mathcad's numerical integration methods to the interpolating function. The advantage of this approach is that the numerical integration methods can have an arbitrarily small step size. What is important in this approach is that the chosen interpolation function is well suited to the shape of the curve it must interpolate. We'll compare three interpolation methods: a cubic spline with parabolic endpoints (pspline), linear interpolation, and a rational function interpolation.

The pspline:

sx1 := pspline(X1, Y1)    sx2 := pspline(X2, Y2)
psp(x) := interp(sx1, X1, Y1, x)    PSP(x) := interp(sx2, X2, Y2, x)

The linear interpolation:

lin(x) := linterp(X1, Y1, x)    Lin(x) := linterp(X2, Y2, x)

And the rational function interpolation:

r(x) := rationalint(X1, Y1, x)₀    R(x) := rationalint(X2, Y2, x)₀

Compare the three methods and the previous results from Simpson's rule. Since both sets of data represent the same curve, we would like the values for a similar method to yield similar results.
(Lower-case interpolants use the sparse vectors X1, Y1; upper-case interpolants use the 51-point vectors X2, Y2.)

∫₀^0.5 psp(x) dx = 0.62081      ∫₀^0.5 PSP(x) dx = 0.617497
∫₀^0.5 lin(x) dx = 0.65388      ∫₀^0.5 Lin(x) dx = 0.62572
∫₀^0.5 r(x) dx = 0.617256       ∫₀^0.5 R(x) dx = 0.617256

Simp51 = 0.618914    Simp25 = 0.61735

The integral of the actual function used to generate this data is

∫₀^0.5 f(x) dx → 0.61725609679868727660

The parabolic-endpoint cubic spline does well in the case where there are more points, but fails somewhat when the data is sparse, as it cannot adequately predict the interim values for this type of curve. Linear interpolation produces much the same result as the Trapezoidal method, as we might expect, since both are drawing straight lines from one data point to the next and finding the area of the resulting trapezoid.

Rational spline integration produces a correct value to six significant figures, and, what's more, it produces consistent values for both numbers of data points. Rational function interpolation is particularly well-suited to asymptotic data, and interpolates between the values well even in the sparse case.

This should convince you of the importance of getting the most appropriate interpolating method you can, as the errors propagate when you make subsequent use of the data. For example, look at the errors generated by two of the interpolation methods for the missing data points in the sparse vector.
Generate the interpolated points for the 51 point spacing using the interpolation for the 26 point spacing.

I := (R(X1))→
Find the associated errors with each of these points, using the second value returned by rationalint.
E_{ix2} := rationalint(X1, Y1, X2_{ix2})₁    err := 2·E
The error is 0 at the points which overlap in Y2, since these were used to generate the interpolating values, but at the interleaved points, the errors are very small:

(err/Y2)→ = (0   1.03136·10⁻⁸   2.676739·10⁻¹⁰   −5.796653·10⁻¹⁰   −5.205209·10⁻¹¹   6.770412·10⁻¹¹   9.668197·10⁻¹²   −1.082638·10⁻¹¹   −2.72565·10⁻¹²   2.064213·10⁻¹²   6.988379·10⁻¹³ ...)ᵀ %
If we do the same analysis using the pspline values for the sparse case,

Ipsp := psp(X2)    Epsp := Y2 − Ipsp

we see much larger errors, particularly at the hard-to-fit initial values on the asymptotic slope:

(Epsp/Y2)→ = (0   −2.077208   −0.220314   1.471206   0.324831   −0.758896   −0.236028   0.583565   0.296593 ...)ᵀ %
This difference accounts for the difference in the integration values.
Optimal Spacing of Interpolation Points, by Robert Adair

In his seminal book on spline fitting, de Boor says, "Polynomial interpolation at appropriately chosen points (e.g. the Chebyshev points) produces an approximation which ... differs very little from the best possible approximant by polynomials of the same order". That is, if your points are well chosen, a single polynomial of appropriate order will fit all your data well. If they are not, then a spline fitting method may be a better choice. In particular, uniformly spaced points can have bad consequences. Further, "...If the function to be approximated is badly behaved anywhere in the interval of approximation, then the approximation is poor everywhere. This global dependence on local properties can be avoided when using piecewise polynomial approximants.", by which he means cubic splines or B-splines.

What this means is that a polynomial of the same order as the number of the data points will pass through all the points, but can give a terrible approximation to the function elsewhere if the points are badly chosen. For example, consider the following uniformly spaced sample points across a Lorentzian function:

Endpoints:   a := −0.9    b := 1    n = 8 points

i := 0 .. n    vxᵢ := a + i·(b − a)/n

Sample the function

g1(x) := 1/(1 + 25·x²)    vyᵢ := g1(vxᵢ)

Create a global polynomial interpolation using this data spacing

fit(x) := polyint(vx, vy, x)₀

j := 0 .. 200    step := (b − a)/200    xⱼ := a + j·step
You might expect that as you increase the number of sampled points and the order of the polynomial, n, the approximation would improve, but it does not. Try n = 20.
n ≡ 8
[Plot: vy, fit(x), and g1(x) versus vx and x]
What de Boor recommends is instead choosing the interpolating points as the zeros of the Chebyshev polynomial of degree n,

vxChebᵢ := [ a + b − (a − b)·cos( (2·i + 1)·π / (2·(n + 1)) ) ] / 2
vxCheb := sort(vxCheb)    vyChebᵢ := g1(vxChebᵢ)

data⟨0⟩ := vxCheb    data⟨1⟩ := vyCheb
Interpolate the same function using these new points. Go back and try a few values of n, to see that these points generally provide a nicer fit everywhere for larger values of n.
fitCheb(x) := polyint(vxCheb, vyCheb, x)₀
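The comparison is easy to reproduce in Python (a sketch using the article's g1(x) = 1/(1 + 25x²)): interpolate through equally spaced points and through Chebyshev-type points and compare the worst-case error over the interval.

```python
# Sketch: polynomial interpolation through uniform vs. Chebyshev-spaced points.
import numpy as np
from scipy.interpolate import BarycentricInterpolator

g1 = lambda x: 1.0 / (1.0 + 25.0 * x**2)
a, b, n = -0.9, 1.0, 8

x_uni = a + (b - a) * np.arange(n + 1) / n
x_cheb = 0.5 * (a + b - (a - b) * np.cos((2 * np.arange(n + 1) + 1) * np.pi / (2 * (n + 1))))

xx = np.linspace(a, b, 400)
for name, nodes in [("uniform", x_uni), ("Chebyshev", np.sort(x_cheb))]:
    p = BarycentricInterpolator(nodes, g1(nodes))
    print(name, "max |error| =", np.max(np.abs(p(xx) - g1(xx))))
```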
[Plot: vyCheb, fitCheb(x), and g1(x) versus vxCheb and x]
The Chebyshev point spacing does indeed provide a better interpolation for higher values of n, as we expect. However, it would be nice to know what the optimal spacing of points was so that we could use a lower value of n and still get good results. This can be calculated if we recall that a global polynomial interpolation is the same as a polynomial regression of the same order as the number of data points.
Optimal point spacing
Based on the notion that a polynomial of order n can always be found that passes through n points, we can find the polynomial that best approximates the given function g, and find which points on the curve it fits exactly. These points are the optimal interpolation points, that is, the points that, when interpolated with a polynomial, provide the best possible approximation everywhere on the interval.
We need to work with the integral version of the least squares problem to get the optimal solution for a polynomial fit.
Error := ∫ₐᵇ [ Σ_{n = 0..degree} (cₙ·xⁿ) − g(x) ]² dx
The normal equations are generated by taking the derivative of the error with respect to the polynomial coefficients and setting the result equal to 0.
(d/dc_k) Error = 2·∫ₐᵇ [ Σ_{n = 0..degree} (cₙ·xⁿ) − g(x) ]·x^k dx
To solve the equations, we will need to evaluate the integrals
fI0(n, k) := ∫[a to b] x^(n + k) dx        fI1(k) := ∫[a to b] g1(x)·x^k dx
In general, the integral fI1 will require numerical evaluation. In the case of a Runge function, it could be evaluated symbolically, but the speed of evaluation is good enough that both fI0 and fI1 will be left as numerical integrals.
The least squares matrix equation for the coefficients then becomes
M·c = yintegral
where:
degree := n    k := 0..degree    k1 := 0..degree
M_(k,k1) := fI0(k, k1)    yintegral_k := fI1(k)
Solving for the optimal interpolation coefficients, c, gives:
L := cholesky(M)    Linverse := L⁻¹    c := Linverseᵀ·Linverse·yintegral
Then the optimal fit equation is given by the new polynomial of order n
fitOpt(x) := Σ[k] c_k·x^k
ChebErr(x) := g1(x) − fitCheb(x)    OptErr(x) := g1(x) − fitOpt(x)
Take a look at the interpolated error over the data range:
(Plot: OptErr(x) and ChebErr(x) across the interval.)
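The same normal-equation solution can be sketched in Python; scipy's quad stands in for the numerical integrals fI0 and fI1, and a dense solve replaces the worksheet's Cholesky step. The Runge-function assumption from the previous sketch is carried over.

    import numpy as np
    from scipy.integrate import quad

    a, b, degree = -1.0, 1.0, 8
    g1 = lambda x: 1.0 / (1.0 + 25.0 * x**2)   # assumed test function

    # M[k, k1] = integral of x**(k + k1); yintegral[k] = integral of g1(x) x**k
    M = np.array([[quad(lambda x: x**(k + k1), a, b)[0] for k1 in range(degree + 1)]
                  for k in range(degree + 1)])
    yintegral = np.array([quad(lambda x, k=k: g1(x) * x**k, a, b)[0]
                          for k in range(degree + 1)])

    c = np.linalg.solve(M, yintegral)          # coefficients of the best L2 polynomial
    fit_opt = lambda x: sum(ck * x**k for k, ck in enumerate(c))
    opt_err = lambda x: g1(x) - fit_opt(x)     # error whose zeros are the optimal points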
Finding the zeros of the error will give the points at which the optimal fit polynomial is exactly equal to the function, and these are the optimal interpolation points. To do this, let's bracket the zero crossings, then use these brackets as guesses for the root solver.
Bracket all the roots
fGuess(a, b) :=  npts ← 200
                 dx ← (b − a)/npts
                 x0 ← a
                 y0 ← OptErr(x0)
                 k ← 0
                 for i ∈ 1..npts
                     x1 ← a + i·dx
                     y1 ← OptErr(x1)
                     if y0·y1 = 0
                         x1 ← x1 + dx/2
                         y1 ← OptErr(x1)
                     if y0·y1 ≤ 0
                         RootBracket_(k,0) ← x0
                         RootBracket_(k,1) ← x1
                         k ← k + 1
                     x0 ← x1
                     y0 ← y1
                 return RootBracket
Refine the roots from the initial bracketing guesses
xOpt :=  guess ← fGuess(a, b)
         nRoots ← rows(guess)
         for i ∈ 0..nRoots − 1
             x ← (guess_(i,0) + guess_(i,1))/2
             r_i ← root(OptErr(x), x, guess_(i,0), guess_(i,1))
         return r
yOpt := fitOpt(xOpt)
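A short Python rendering of the bracket-then-refine idea, with scipy's brentq standing in for the worksheet's root function (opt_err and fit_opt are the functions from the previous sketch):

    import numpy as np
    from scipy.optimize import brentq

    def bracket_roots(f, a, b, npts=200):
        """Scan [a, b] on a uniform grid and return (lo, hi) pairs where f changes sign."""
        xs = np.linspace(a, b, npts + 1)
        ys = f(xs)
        return [(x0, x1) for x0, x1, y0, y1 in zip(xs[:-1], xs[1:], ys[:-1], ys[1:])
                if y0 * y1 < 0]

    x_opt = [brentq(opt_err, lo, hi) for lo, hi in bracket_roots(opt_err, a, b)]
    y_opt = [fit_opt(x) for x in x_opt]     # the optimal interpolation points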
Compare the optimal fit to the Chebyshev fit. Overall the two fits have the same general appearance, and the ripple is much smaller than the equally spaced data points fit. Comparing the location of the interpolation points, one finds the Chebyshev points have a similar distribution to the optimal points which helps to reduce the overall ripple. Adjust the value of n to see how these two approximations change with sampling.
(Plot: the original function, the optimal polynomial interpolation, and the Chebyshev-points polynomial interpolation, with the optimal and Chebyshev interpolation points marked.)
If you have the luxury of choosing where your data will be sampled, you may wish to use guidelines of this sort when choosing your data spacing.
Reference
de Boor, Carl (1978), A Practical Guide to Splines, Springer-Verlag, Chapter 2.
EXAMPLES
Principal Component Regression of NIR Spectra for Alcohol Mixtures
by Richard Jackson, Bruker Optics
This data represents the near-infrared (NIR) spectra of 15 mixtures of three alcohols: methanol, ethanol, and propanol. The first column is wavenumber (1/wavelength), in cm⁻¹; the remaining 30 columns are the spectral absorbances, 2 spectra of each mixture. The objective is to produce a calibration based on this data that can later be used to predict the concentrations of the alcohols in an unknown mixture.
DATA :=  (first rows and columns shown)
  8900      -0.00673  -0.00682  -0.00586  -0.00572
  8896.144  -0.00638  -0.00646  -0.00528  -0.00513
  8892.287  -0.00603  -0.00611  -0.00465  -0.00449
  8888.431  -0.00573  -0.00583  -0.00401  -0.00386
  8884.575  -0.00552  -0.0056   -0.00338  -0.00324
  8880.719  -0.00537  -0.00542  -0.00273  -0.00261
  8876.862  -0.00525  -0.00528  -0.00206  -0.00193
  8873.006  -0.00512  -0.00517  -0.00134  -0.00119
  8869.15   -0.00495  -0.00501  -0.00055  -0.00037
  8865.294  -0.00471  -0.00475   0.000305  0.000509
  8861.437  -0.00438  -0.00441   0.001213  0.001413
  8857.581  -0.00395  -0.00399   0.002172  0.002355
  8853.725  -0.00344  -0.00348   0.003187  0.003363
  8849.869  -0.00283  -0.00288   0.004242  0.004431
  8846.012  -0.00212  -0.00218   0.005344  0.005554
Create Data vectors
First, split the X (independent variables - the wavenumbers) and Y (dependent variables - the measured absorbances) data into a vector and a matrix. Also transpose the data so that the spectra are in the rows and each column corresponds to an independent variable.
X := DATA⟨0⟩
A1 := submatrix(DATA, 0, rows(DATA) − 1, 1, cols(DATA) − 1)ᵀ
Plot a few of the spectra to see what they look like (by convention, wavenumbers are plotted in decreasing order):
(Plot: Absorbance vs. Wavenumber for several of the spectra.)
Multiple Linear Regression
One method we can use for the calibration is multiple linear regression (MLR), also sometimes called Inverse Least Squares (ILS) or k-matrix. The first step is to generate a calibration curve. We can model the data as
C = A1·B + E
where, for l components (in this case 3), m standards (in this case 30), and n variables (in this case the absorbances), C is an m × l matrix of concentrations, A1 is the m × n matrix of variables, B is the n × l matrix of calibration coefficients, and E is an m × l matrix of residuals. The calibration coefficients are found using the equation
B = (A1ᵀ·A1)⁻¹·A1ᵀ·C
An estimate of the concentrations for an unknown spectrum, a, can then be predicted using the equation
c = aᵀ·B
The problem with this methodology is that the dimension n (i.e. the number of variables) cannot exceed the dimension m (i.e. the number of standards); otherwise the matrix (A1ᵀ·A1) has no inverse (the problem is underdetermined).
This is clearly a problem with the data above, which has only 30 standards, but 1142 absorbances. Looking at the data, however, most of it must be redundant. Not only are the spectra at different wavenumbers interrelated, but we only have three concentrations varying (which, since they add to 100%, represents only two degrees of freedom), plus other minor effects due to interactions between the alcohols, temperature changes, etc. It must therefore be possible to reduce the number of absorbances so that the calibration coefficients can be found. One obvious way to do this is to just keep the absorbances at a few wavenumbers and throw the rest of the data away. This leaves us with the problem of which wavenumbers to choose, and if the absorbances that are chosen for the calibration are collinear, the matrix (A1ᵀ·A1) still has no inverse.
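For illustration, here is a small numpy sketch of the calibrate-and-predict equations above; the matrices are random stand-ins, not the NIR data.

    import numpy as np

    rng = np.random.default_rng(0)
    m, n_vars, l = 30, 5, 3                    # standards, selected absorbances, components
    A1 = rng.normal(size=(m, n_vars))          # stand-in for the selected absorbances
    C = rng.uniform(0, 100, size=(m, l))       # stand-in for the known concentrations

    # B = (A1' A1)^-1 A1' C -- solve() is preferred over forming the inverse explicitly
    B = np.linalg.solve(A1.T @ A1, A1.T @ C)

    a_unknown = A1[0]                          # one "unknown" spectrum (row vector)
    c_pred = a_unknown @ B                     # c = a' B
    print(c_pred)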
Now we need to pick a number of wavelengths. We could pick up to 30, but let's just choose 5 because it's easier (and because there is a better way to do this anyway).
WaveNumbers := (4600  5700  6500  7500  8500)ᵀ
Use the Match function to get a smaller data matrix corresponding only to those wavenumbers:
Anew :=  NewData ← A1⟨Match(WaveNumbers_0, X, "near")_0⟩
         for i ∈ 1..rows(WaveNumbers) − 1
             NewData ← augment(NewData, A1⟨Match(WaveNumbers_i, X, "near")_0⟩)
         NewData
Now we can calibrate for the concentrations. These are the concentrations of the alcohols methanol, ethanol, and propanol in each of the 30 samples (first rows shown):
C :=    0    0  100
        0    0  100
      100    0    0
      100    0    0
        0  100    0
Calculate the calibration coefficients:
B := (Anewᵀ·Anew)⁻¹·Anewᵀ·C
Now predict each concentration for all the samples, to see how good the calibration is:
c := Anew·B    i := −10, 0..110
(Plots: Predicted Value vs. Reference Value for Methanol, Ethanol, and Propanol.)
The r2 values for the three calibrations are:
corr(C⟨0⟩, c⟨0⟩)² = 0.99895    corr(C⟨1⟩, c⟨1⟩)² = 0.98736    corr(C⟨2⟩, c⟨2⟩)² = 0.99386
These calibrations look OK (although they could certainly be better!), but this is not a good way to evaluate how well the data has been modelled. We are interested in the ability of the model to predict concentrations from data that was not included in the calibration. We can get a measure of this by performing what is termed a cross validation. In this scheme we remove the data for sample 1 (in this case, the first two spectra), calibrate using the remaining data, then predict the concentrations for sample 1. The process is then repeated for sample 2, sample 3, etc. Here is a function that performs a cross validation given a matrix of data, a matrix of concentrations, and the number of samples to remove each time.
Function to split a matrix into two, by pulling out N rows starting at index:
SplitMatrix(M, index, N) :=  OUT_0 ← submatrix(M, index, index + N − 1, 0, cols(M) − 1)
                             for i ∈ index..index + N − 1
                                 ind_(i − index) ← i
                             OUT_1 ← trim(M, ind)
                             OUT
Cr_Val(DATA, C, N) :=  RMSECV ← 0
                       for i ∈ 0, N..rows(DATA) − N
                           "calibrate"
                           SplitData ← SplitMatrix(DATA, i, N)
                           CalData ← SplitData_1
                           PredData ← SplitData_0
                           CalC ← SplitMatrix(C, i, N)_1
                           B ← (CalDataᵀ·CalData)⁻¹·CalDataᵀ·CalC
                           c ← PredData·B
                           PredictedC ← c                       if i = 0
                           PredictedC ← stack(PredictedC, c)    otherwise
                       PredictedC
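The same leave-N-out scheme can be sketched in Python (drop N consecutive rows at a time, calibrate on the rest, predict the held-out rows, and stack the predictions), together with the RMSECV measure used later:

    import numpy as np

    def cross_val(data, conc, n_out):
        """Leave-n_out-out cross validation for an MLR-style calibration."""
        preds = []
        for start in range(0, data.shape[0], n_out):
            hold = np.arange(start, start + n_out)
            keep = np.setdiff1d(np.arange(data.shape[0]), hold)
            B = np.linalg.solve(data[keep].T @ data[keep], data[keep].T @ conc[keep])
            preds.append(data[hold] @ B)
        return np.vstack(preds)

    def rmsecv(ref, pred):
        """Root mean square error of cross validation for one component."""
        return np.sqrt(np.mean((ref - pred) ** 2))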
Now we can look at the results of the cross validation:
c := Cr_Val(Anew, C, 2)
(Plots: Predicted Value vs. Reference Value for Methanol, Ethanol, and Propanol, from the cross validation.)
The r2 values for the three calibrations are:
corr(C⟨0⟩, c⟨0⟩)² = 0.9972    corr(C⟨1⟩, c⟨1⟩)² = 0.96529    corr(C⟨2⟩, c⟨2⟩)² = 0.98241
An indicator of the average error in the cross validation is given by the root mean square error of cross validation, or RMSECV:
RMSECV(Ref, Pred) := √[ Σ[i = 0 to last(Ref)] (Ref_i − Pred_i)² / rows(Ref) ]
For methanol, ethanol, and propanol the values are:
RMSECV(C⟨0⟩, c⟨0⟩) = 1.55091    RMSECV(C⟨1⟩, c⟨1⟩) = 5.48239    RMSECV(C⟨2⟩, c⟨2⟩) = 3.87515
We could improve the cross validations by changing the number of absorbances used, and by finding the best wavenumbers to use, but this is not easy. There are obviously a huge number of possible combinations we could try. We have also thrown away a lot of information, and the variables we are left with are not orthogonal. If they are not chosen carefully they may in fact be collinear, in which case we cannot obtain the calibration coefficients.
Principal Component Regression
Fortunately, there is a much better way to compress the data than just throwing most of it away. Principal Component Analysis will yield the minimum number of variables required to describe the data, and the variables are guaranteed to be orthogonal. The regression is then performed on the new variables (the scores). This is termed principal component regression, or PCR.
The first step in PCA is to mean center the data. This is done automatically by the Nipals function. By mean centering the data we remove everything that is common to all the spectra.
i := 0..cols(A1) − 1    MeanA_i := mean(A1⟨i⟩)
i := 0..rows(A1) − 1    CenteredA⟨i⟩ := (A1ᵀ)⟨i⟩ − MeanA
CenteredA := CenteredAᵀ
Since the data has been mean centered, we will also mean center the concentrations:
i := 0..cols(C) − 1    MeanC_i := mean(C⟨i⟩)
i := 0..rows(C) − 1    CenteredC⟨i⟩ := (Cᵀ)⟨i⟩ − MeanC
CenteredC := CenteredCᵀ
We now wish to compress the data to the minimum possible number of orthogonal variables. To do this we will use the Nipals function. We do not wish to scale the data to the standard deviations, so we set the last argument to "noscale". To start, we will just try a calibration using the first two PCs:
NumPC := 2    Acc := 10⁻¹⁰    MaxIter := 100
PCA_result := Nipals(A1, NumPC, MaxIter, "noscale", Acc)
Get the Scores:
SCORES := scores(PCA_result)
Get the Loadings:
LOADINGS := loadings(PCA_result)
The loading vectors are vectors of the same length as the original spectra. Here is what the first two look like. It can sometimes be instructive to look at the loadings, because they can indicate which parts of the data are important.
(Plot: loading 1 and loading 2 vs. wavenumber.)
Now perform a cross validation:
Centeredc := Cr_Val(SCORES, CenteredC, 2)
Add the mean concentrations back in:
i := 0..rows(C) − 1    c⟨i⟩ := (Centeredcᵀ)⟨i⟩ + MeanC    c := cᵀ    i := −10, 0..110
(Plots: Predicted Value vs. Reference Value for Methanol, Ethanol, and Propanol.)
The r2 values for the three calibrations are:
corr(C⟨0⟩, c⟨0⟩)² = 0.9959    corr(C⟨1⟩, c⟨1⟩)² = 0.91845    corr(C⟨2⟩, c⟨2⟩)² = 0.94823
For methanol, ethanol, and propanol the RMSECV values are:
RMSECV(C⟨0⟩, c⟨0⟩) = 1.86236    RMSECV(C⟨1⟩, c⟨1⟩) = 8.40882    RMSECV(C⟨2⟩, c⟨2⟩) = 6.63735
These are comparable to the values obtained using MLR. We can do better, though, by optimizing the number of PCs used in the calibration. To do this, cross validate using 1 PC, then 2 PCs, then 3 PCs, etc. For each number of PCs we calculate the sums of the squares of the prediction errors for each component. The prediction error sum of squares (PRESS) will tend to decrease until we start to overfit the data, when it will increase. The PRESS function below will keep adding PCs until the PRESS for all three components increases. We will also calculate the r2 value for each cross validation. To save calculating the scores and loadings again, the PRESS function will also return these.
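A hedged numpy sketch of this PRESS procedure, with an SVD standing in for the Nipals function and the cross_val helper from the earlier sketch:

    import numpy as np

    def press_by_npc(A1, C, n_out, max_pc=10):
        """PRESS per component for an increasing number of principal components."""
        Ac = A1 - A1.mean(axis=0)              # mean-center the spectra
        Cc = C - C.mean(axis=0)                # mean-center the concentrations
        U, s, Vt = np.linalg.svd(Ac, full_matrices=False)
        scores_all = U * s                     # PCA scores, ordered by explained variance
        press = []
        for npc in range(1, max_pc + 1):
            pred = cross_val(scores_all[:, :npc], Cc, n_out)   # previous sketch
            press.append(((Cc - pred) ** 2).sum(axis=0))
        return np.array(press)                 # rows: number of PCs, columns: components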
PRESS(DATA, C, N) :=  "Center the concentrations"
                      for i ∈ 0..cols(C) − 1
                          MeanC_i ← mean(C⟨i⟩)
                      for i ∈ 0..rows(C) − 1
                          CenteredC⟨i⟩ ← (Cᵀ)⟨i⟩ − MeanC
                      CenteredC ← CenteredCᵀ
                      "Get the first PC and calculate the PRESS and r-squared"
                      PCA_result ← Nipals(DATA, 1, MaxIter, "noscale", Acc)
                      S ← PCA_result_0
                      L ← PCA_result_1
                      Centeredc ← Cr_Val(S, CenteredC, N)
                      for i ∈ 0..cols(C) − 1
                          PRESS_(0,i) ← Σ[j = 0 to rows(C) − 1] [ (CenteredC⟨i⟩)_j − (Centeredc⟨i⟩)_j ]²
                          R_squared_(0,i) ← corr(CenteredC⟨i⟩, Centeredc⟨i⟩)²
                      "Calculate more PCs until the PRESS increases or we get the maximum number"
                      for k ∈ 1..rows(C) − 1
                          PCA_result ← Nipals2(PCA_result, 1)
                          S ← PCA_result_0
                          L ← PCA_result_1
                          Centeredc ← Cr_Val(S, CenteredC, N)
                          Num_Greater ← 0
                          for i ∈ 0..cols(C) − 1
                              PRESS_(k,i) ← Σ[j = 0 to rows(C) − 1] [ (CenteredC⟨i⟩)_j − (Centeredc⟨i⟩)_j ]²
                              R_squared_(k,i) ← corr(CenteredC⟨i⟩, Centeredc⟨i⟩)²
                              Num_Greater ← Num_Greater + 1    if PRESS_(k,i) > PRESS_(k−1,i)
                          break    if Num_Greater = cols(C)
                      OUT_0 ← PRESS
                      OUT_1 ← R_squared
                      OUT_2 ← S
                      OUT_3 ← L
                      OUT
PRESS_result := PRESS(A1, C, 2)    PRESS_result = ( {10,3}  {10,3}  {30,10}  {1142,10} )
The PRESS values indicate that for all components 6 PCs give the lowest prediction error:
i := 0..rows(PRESS_result_0) − 1
(Plot: PRESS vs. Number of PCs, log scale, for methanol, ethanol, and propanol.)
Although the two indicators do not always give the same results, in this case the r2 values also indicate that 6 PCs are optimum:
(Plot: R_squared vs. Number of PCs for methanol, ethanol, and propanol.)
It is worth noting here that if we use the nth PC, there is no requirement that we use all the PCs lower than n. In fact, the graphs above show that when we include the 4th PC the RMSECV and r2 for ethanol are worse, but improve again when we include the 5th PC. The lowest RMSECV for ethanol is in fact obtained if only PCs 1,2,3,5,6 are used. This is because PCA compresses the data to the most dominant factors, not the most relevant factors. The most common method used for multivariate calibration that compresses the data to the most relevant factors is Partial Least Squares (PLS).
Final Calibration
We will keep the first 6 scores from those returned by the PRESS calculation (for simplicity, we will keep the 4th PC for the ethanol calibration):
SCORES := submatrix(PRESS_result_2, 0, rows(PRESS_result_2) − 1, 0, 5)
We will keep the first 6 loadings from those returned by the PRESS calculation:
LOADINGS := submatrix(PRESS_result_3, 0, rows(PRESS_result_3) − 1, 0, 5)
Now perform a cross validation:
Centeredc := Cr_Val(SCORES, CenteredC, 2)
Add the mean concentrations back in:
i := 0..rows(C) − 1    c⟨i⟩ := (Centeredcᵀ)⟨i⟩ + MeanC    c := cᵀ    i := −10, 0..110
(Plots: Predicted Value vs. Reference Value for Methanol, Ethanol, and Propanol.)
The r2 values for the three calibrations are:
corr(C⟨0⟩, c⟨0⟩)² = 0.99991    corr(C⟨1⟩, c⟨1⟩)² = 0.99893    corr(C⟨2⟩, c⟨2⟩)² = 0.99943
For methanol, ethanol, and propanol the RMSECV values are:
RMSECV(C⟨0⟩, c⟨0⟩) = 0.275962    RMSECV(C⟨1⟩, c⟨1⟩) = 0.950189    RMSECV(C⟨2⟩, c⟨2⟩) = 0.69443
These results are excellent! The final calibration is
B := (SCORESᵀ·SCORES)⁻¹·SCORESᵀ·CenteredC
Prediction
To predict the unknown concentrations from a spectrum, first we subtract the mean spectrum, as the Nipals function did for the calibration data:
Unknown := (first rows shown: −0.0065  −0.0062  −0.0059 ...)
Unk_centered := Unknown − MeanA
Next calculate an estimate of the scores using the first 6 loadings:
S := Unk_centeredᵀ·LOADINGS
S = (6.1464  0.52431  0.10577  0.070234  0.010719  0.026901)
Finally, calculate the predicted concentrations, remembering to add the mean concentrations to get the final answer:
y := (S·B)ᵀ + MeanC    yᵀ = (5.95299  16.2882  77.75)
We can combine all these steps into a single statement
y := [ ((Unknown − MeanA)ᵀ·LOADINGS·B)ᵀ + MeanC ]ᵀ    y = (5.95299  16.2882  77.75)
To conclude, we have used PCA to compress the original data to 6 variables, which are used for the final regression. This has the advantages over MLR that we do not have a wavenumber selection problem, and the variables are guaranteed to be orthogonal.
References
Spectroscopic data courtesy of Bruker Optics, Inc., all rights reserved, used by permission.
EXAMPLES
Savitzky-Golay and Median Filtering
by Erik Esveld, Wageningen University and Research Center
For the analysis of kinetic processes, the derivative of a signal measured at regular time intervals is often required. Because of the noise in real measured systems, the data needs to be smoothed in order to obtain meaningful results for the derivative of the data.
Savitzky-Golay (SG) filtering is a smoothing technique which relies on a local polynomial fit of regularly spaced data. Since the nth order derivative for x=0 is directly determined by the respective polynomial coefficient, the filtering method can also conveniently be used to directly obtain the derivative of the data.
SG filtering is a convolution method. The coefficients of the convolution window, which follow from the least-squares polynomial fit, are obtained from the Moore-Penrose inverse of a fixed matrix.
Real data is often also cluttered with spikes, which can be efficiently removed by a median filter. This filter is especially useful for monotonic ascending or descending data or data which contains sudden level changes. It doesn't work very well for spectral data with peaks.
Filter function definition
The filter functions have error messages defined in this vector:
CustErrMsg
"Odd window width expected"
"Order of polynomial should be less \nthan width of the window"
"Order of derivative cannot be greater \nthan the order of polynomial"
:=
The following function calculates the Savitzky-Golay coefficients for the smoothing, with odd-length window width w and a kth order polynomial, to obtain the dth order derivative by convolution.
SGcoef(k, w, d) :=  error(CustErrMsg_0)    if mod(w, 2) = 0
                    error(CustErrMsg_1)    otherwise if k ≥ w
                    error(CustErrMsg_2)    otherwise if d > k
                    otherwise
                        m ← (w − 1)/2
                        for r ∈ 0..2·m
                            for c ∈ 0..k
                                S_(r,c) ← (r − m)^c
                        MPinverse ← (Sᵀ·S)⁻¹·Sᵀ
                        d!·(MPinverseᵀ)⟨d⟩
Convolution of a vector X with window coefficients c:
Convol(X, c) :=  width ← length(c)
                 error(CustErrMsg_0)    if mod(width, 2) = 0
                 otherwise
                     m ← (width − 1)/2
                     Con_(last(X)) ← 0
                     for i ∈ m..(last(X) − m)
                         Con_i ← c·submatrix(X, i − m, i + m, 0, 0)
                     Con
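The same construction in numpy terms — build the local design matrix, take its Moore-Penrose inverse, and read off the convolution weights for the d-th derivative — might look like this (a sketch, not the worksheet's functions):

    import numpy as np
    from math import factorial

    def sg_coef(k, w, d):
        """Savitzky-Golay convolution weights for the d-th derivative (unit spacing)."""
        if w % 2 == 0: raise ValueError("Odd window width expected")
        if k >= w:     raise ValueError("Polynomial order must be less than window width")
        if d > k:      raise ValueError("Derivative order cannot exceed polynomial order")
        m = (w - 1) // 2
        r = np.arange(-m, m + 1)
        S = np.vander(r, k + 1, increasing=True)   # S[i, c] = r_i**c
        mp_inverse = np.linalg.pinv(S)             # Moore-Penrose inverse, (S'S)^-1 S'
        return factorial(d) * mp_inverse[d]

    def convol(x, c):
        """Apply the window c over the interior of x (the ends are left at zero)."""
        m = (len(c) - 1) // 2
        out = np.zeros(len(x))
        for i in range(m, len(x) - m):
            out[i] = c @ x[i - m:i + m + 1]
        return out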
The error messages are handy to track misuse of the functions. For example, click on this faulty expression:
SGcoef(2, 5, 3) =    (here d > k, so the derivative-order error message is returned)
Filter application example
Consider the example data with increasing chirp.
f(t) := 2 − cos(exp(t/300))
Add white noise with stdevwn := 0.2.
whitenoise(t) := qnorm(rnd(1), 0, stdevwn)
Add some outliers with stdevout := 2, on average every n := 20 points.
outliers(t) := if(rnd(1) ≤ 1/n, qnorm(rnd(1), 0, stdevout), 0)
Store the original data and the noisy data in vectors.
i := 0..1000    F_i := f(i)    X_i := f(i) + whitenoise(i) + outliers(i)
Stdev(X − F) = 0.5403
(Plot: the noisy data X_i and the original function F_i vs. i.)
Let's get rid of the spikes by the application of a narrow (5pt window) median filter.
XM := medsmooth(X, 5)
(Plot: the median-filtered data XM_i and the original function F_i vs. i.)
Smooth the result with a 2nd order polynomial fit over 41 points.
k := 2    w := 41
XMSG := Convol(XM, SGcoef(k, w, 0))
(Plot: the smoothed data XMSG_i and the original function F_i vs. i.)
Now look at the first derivative of the smoothed data.
dXMSG := Convol(XM, SGcoef(k, w, 1))    df_i := d/di f(i)
(Plot: the Savitzky-Golay first derivative dXMSG_i and the exact derivative df_i vs. i.)
The smoothing of the function worked well, with the exception of the region ends, where the window width extends past the end of the data. The first derivative shows an underestimation of the steepest slopes. This is due to the limitations of the second order (parabolic) fit over the large window. With a smaller window or with a higher order polynomial fit, these steep features can be better modeled at the cost of decreased noise smoothing. Try changing the values of k and w to examine these effects.
Loess smoothing
Finally, it's worth comparing this method with another polynomial fitting and windowing method, the loess fit. Loess, with small scale factors, similar to windows, provides an effective smoother. The difference with the Savitzky-Golay method is that loess uses a weighted and adaptive window to locally fit a 2nd degree polynomial.
I_i := i
smoothed := loess(I, XM, 0.07)
XML_i := interp(smoothed, I, XM, i)
(Plot: the loess-smoothed data XML_i and the original function F_i vs. i.)
Since the loess method cannot be used for extrapolation, and numerical derivatives require differencing on either side of the endpoints, we'll contract the range over which we find the derivatives.
i := 1..length(X) − 2
To obtain the first derivative we can use the simple linear difference, since the local parabolic fit is discontinuous in the second derivative.
dXML_i := (XML_(i+1) − XML_(i−1)) / 2
(Plot: the loess-based first derivative dXML_i and the exact derivative df_i vs. i.)
In the case of loess, the fit is applicable over a wider range, since it uses an adaptive and weighted window. Unlike the Savitzky-Golay method, the derivative at the right side is not underestimated. However, better fidelity in the derivative implies poorer smoothing. The Savitzky-Golay method, on the other hand, is capable of using higher order fits to obtain all the derivatives directly.
Reference
W.H. Press et al. (1992), Numerical recipes in C : the art of scientific computing, Cambridge University Press, 2nd ed. Chapter 14.8, page 640.
EXAMPLES
Stabilizing and Normalizing the Error Variance
by Paul Lorczak
Suppose we have a multivariate dataset where the value of σ2, the random error, is increasing with respect to one of the independent variables
sample size:  n := 75
σINCR :=  DATA⟨0⟩ ← runif(n, 0, 1)
          DATA⟨1⟩ ← runif(n, 0, 1)          "Two columns of random data for x1 and x2"
          σincr ← DATA⟨0⟩                   "create a column of errors that depend on the first data column"
          ε ← (σincr·rnorm(n, 0, 1))→
          β ← (4  7  2)ᵀ                    "create a linear relationship between x1, x2, and a dep. variable"
          DATA⟨2⟩ ← β_0 + β_1·DATA⟨0⟩ + β_2·DATA⟨1⟩ + ε
          DATA
y := σINCR⟨2⟩    i := 0..last(y)    X⟨0⟩ := σINCR⟨0⟩    X⟨1⟩ := σINCR⟨1⟩
The error term ε is from a normal distribution with a mean of zero.
To check model assumptions, do a multivariate polynomial fit with the two columns of independent variables to the single column of dependent variables:
params := regress(X, y, 2)
yfit_i := interp(params, X, y, (Xᵀ)⟨i⟩)
resid := y − yfit
stERR := √[ Σ resid² / (last(y) − 5) ]    stERR = 0.573246
If scatterplots of the standardized residuals versus each of the independent variables or versus the predicted values show no discernible pattern, then the variance of the error terms is most likely constant. On the plot of the residuals versus x1 (shown in red), the points spread out increasingly from left to right, indicating that ε increases with x1.
(Plot: standardized residuals resid/stERR vs. x1 = σINCR⟨0⟩.)
On a similar graph for x2, the pattern is not clear.
(Plot: standardized residuals resid/stERR vs. x2 = σINCR⟨1⟩.)
The pattern, however, is again in evidence on the graph of the residuals versus the predicted values,
(Plot: standardized residuals resid/stERR vs. the predicted values yfit.)
To perform a valid regression, we'll need to counteract the increasing error variance by transforming the dependent data. The variance stabilizing transformations are listed below in order of increasing severity. The first equation is enabled. To see the effect of the other transformations, you can disable the first equation and enable another equation.
ystab := (√y)→
ystab := (ln(y))→
ystab := (1/y)→
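As an illustration, here is a small numpy sketch that regenerates data with an error variance growing with x1, applies each transformation in turn, and checks how strongly the residual magnitude still tracks x1 (a plain linear design is used instead of the worksheet's quadratic regress):

    import numpy as np

    rng = np.random.default_rng(1)
    n = 75
    X = rng.uniform(0, 1, size=(n, 2))
    y = 4 + 7 * X[:, 0] + 2 * X[:, 1] + X[:, 0] * rng.normal(size=n)   # sigma grows with x1

    A = np.column_stack([np.ones(n), X])         # simple linear design matrix
    for name, transform in [("sqrt", np.sqrt), ("log", np.log), ("reciprocal", lambda v: 1 / v)]:
        yt = transform(y)
        beta, *_ = np.linalg.lstsq(A, yt, rcond=None)
        resid = yt - A @ beta
        # crude severity check: correlation between |residual| and x1
        print(name, np.corrcoef(np.abs(resid), X[:, 0])[0, 1].round(3))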
Below, we've plotted the new resulting residuals versus the predicted values, to see the effect of stabilization.
params := regress(X, ystab, 2)
yfits_i := interp(params, X, ystab, (Xᵀ)⟨i⟩)
resids := ystab − yfits
stERR := √[ Σ resid² / (last(y) − 5) ]    stERR = 0.573246
(Plot: the new standardized residuals resids/stERR vs. the predicted values yfits.)
You may find that some of the transformations are too severe, actually causing a violation rather than removing it. For instance, the graph pattern could be curved rather than random, indicating that a nonlinear model might provide a better fit to the data.
Correcting Nonnormality
Besides helping to stabilize the error variance, the same transformations may also correct any nonnormality of errors. To see this, we'll generate a sample from a population having error terms that are exponentially distributed.
sample size:  n_expo := 25
εEXPO :=  DATA⟨0⟩ ← runif(n_expo, 0, 1)
          DATA⟨1⟩ ← runif(n_expo, 0, 1)
          ε1 ← rexp(n_expo, 10)
          β ← (4  7  2)ᵀ
          DATA⟨2⟩ ← β_0 + β_1·DATA⟨0⟩ + β_2·DATA⟨1⟩ + ε1
          DATA
ye := εEXPO⟨2⟩    i := 0..last(ye)    Xe_(i,0) := 1    Xe⟨1⟩ := εEXPO⟨0⟩    Xe⟨2⟩ := εEXPO⟨1⟩
Nonnormality can be detected in a normal plot of the standardized residuals. If the plot resembles a straight line, then most likely the errors are from a normal distribution.
params := (Xeᵀ·Xe)⁻¹·(Xeᵀ·ye)    yfit := Xe·params    params = (4.041308  7.087907  1.991035)ᵀ
reside := ye − yfit
stERR := √[ Σ reside² / (last(ye) − cols(Xe)) ]    stERR = 0.107443
rplot := qqplot(reside/stERR, "normal")
(Plot: rplot⟨1⟩ vs. rplot⟨0⟩, the normal plot of the standardized residuals.)
As expected, the residuals are not normally distributed, as shown on this nonlinear scatter plot.
Transforming for normality
Again, we've defined the transformations in order of increasing severity. The first equation is enabled. To see the effect of the other transformations, you can disable the first equation, and enable another equation.
ynorm := √ye
ynorm := ln(ye)
ynorm := 1/ye
Inspecting the new standardized residual normal plot.
params := (Xeᵀ·Xe)⁻¹·(Xeᵀ·ynorm)    yfit := Xe·params    params = (2.109525  1.248504  0.339393)ᵀ
reside := ynorm − yfit
stERR := √[ Σ reside² / (last(ynorm) − cols(Xe)) ]    stERR = 0.028774
rplot := qqplot(reside/stERR, "normal")
(Plot: rplot⟨1⟩ vs. rplot⟨0⟩, the normal plot of the standardized residuals after transformation.)
You can investigate the relationship between the transformation and the nonnormality of error terms through enabling and disabling different equations. Again, you'll probably find that some transformations work better than others.
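For reference, a scipy sketch of the same normal-plot check, using probplot in place of qqplot and a stand-in exponential sample for the residuals:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    resid = rng.exponential(scale=0.1, size=25)        # stand-in for exponential errors
    st_resid = (resid - resid.mean()) / resid.std(ddof=1)

    # Ordered residuals against normal quantiles; r near 1 means a nearly straight line
    (osm, osr), (slope, intercept, r) = stats.probplot(st_resid, dist="norm")
    print(f"straight-line correlation of the normal plot: r = {r:.3f}")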
EXAMPLES
Statistical Analysis of Water Meter Data
by D. M. Griffin, Louisiana Tech University
The data used here are daily water usage values expressed as gal/min. These are being collected as part of an ongoing research project between Louisiana Tech University and the Louisiana Department of Transportation and Development to quantify water usage and wastewater generation at Interstate Rest Areas in Louisiana.
The medsmooth function was used to examine water flowrates over a period of record from 7/1/97 to the present. Using a smoothed data set (bandwidth = 7 days) it was found that daily flows after day 1200 were somewhat lower and less variable than those before day 1200. This results in substantial changes in less frequent flow rates before and after day 1200. A bandwidth of 7 days was used because significant autocorrelation occurs within a one-week period, less so after one week.
Read in daily water usage rates at Grand Prairie Interstate rest area, I-49 in gal/min. Missing values given value of -1. There are 10 to 20 missing values in the data set.
DATA :=  (daily flow values read from the data file GP.PRN)
i := 0..length(DATA) − 1    length(DATA) = 2059
elapsed_time_i := i·day    gpm := gal/min
(Plot: DAILY FLOW (GPM) VS ELAPSED TIME (DAYS) — average daily flow (gpm) vs. elapsed time in days.)
The raw data exhibits no obvious changes in pattern. There were about 20 days where flow data were not available, and these were given the value -1.
We first need to deal with the -1 values. There is no generally accepted way of doing this. If they are removed, the data set is discontinuous, which is undesirable. In this case I will first remove the -1 values to create another set of "real" values. Then I will compute the median of that data set. Finally, I will substitute the median value of that data set for all -1 values in the original data set. In this way I have a continuous data set (not a perfect solution, but this is the real world!) with no missing values.
Get all the indices of the data which are of value -1:
index := match(−1, DATA)
Trim them from the data set:
DATAtrim := trim(DATA, index)
rows(DATA) = 2059    rows(DATAtrim) = 2024
Now compute the median of DATAtrim and substitute it for all -1 values.
medianval := median(DATAtrim)
DATA := DATA·gpm    DATAorig := DATA
j := 0..last(index)    DATA_(index_j) := medianval·gpm
mean(DATA) = 4.274 gpm    mean(DATAorig) = 4.199 gpm
median(DATA) = 3.417 gpm    median(DATAorig) = 3.401 gpm
As expected, the median changes less in this data transformation, since it's a more robust measure of centrality.
Autocorrelation
Autocorrelation is a procedure to determine if a data set is correlated with itself over time. Data which are correlated are not independent and standard statistical hypothesis tests are not valid, strictly speaking. Data which are not autocorrelated are said to be independent.
First, compute and plot the residuals, the differences between each daily flow value and the overall mean flow for the data set.
µ := mean(DATA)    RESID := DATA − µ    mark1 := 320    mark2 := 360
(Plot: FLOWRATE RESIDUALS — (data value − data mean) vs. elapsed days, with the y = 0 line and markers at days 320 and 360.)
Comparing the data points to the y = 0 line, we see that the flow was generally larger than the mean for the first 90 days or so; then was less than the mean until about day 240. Recently the flow has been increasing again.
The high values occurring between 320 and 360 were caused by watering of trees and grass during May and June 1998. In addition, problems occurred with the well around 6/15/98, (day 349) which necessitated pumping the well for an extended period while working on it. High flows around the end of September (day 450) were probably a result of large numbers of evacuees heading north from New Orleans and surrounding area as hurricane Georges approached New Orleans (actually it turned out that the riser pipe in the well had split). High residuals occurring near day 880 and day 890 are due to the conduct of dye tests in the rock plants filters. All of these events tend to wash out legitimate long-term trends in the data, which is what we are interested in.
Plot the correlogram for the daily flow. A correlogram is the sequence of autocorrelation coefficients of flow over a series of specified time intervals, called lag periods or lags.
n := 0..28    N := length(DATA)    σ := stdev(DATA)
lagcorr_n := [ 1/(N·σ²) ]·Σ[i = n to N − 1] (DATA_i − µ)·(DATA_(i−n) − µ)
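A numpy version of the lag-correlation sum and the approximate 95% limits used below might look like this (data stands for the gap-filled daily flow vector):

    import numpy as np

    def correlogram(data, max_lag=28):
        """Autocorrelation coefficient at each lag from 0 to max_lag."""
        mu, sigma2, N = data.mean(), data.var(), len(data)
        return np.array([np.sum((data[lag:] - mu) * (data[:N - lag] - mu)) / (N * sigma2)
                         for lag in range(max_lag + 1)])

    def conf_limits(N):
        """Approximate 95% confidence limits on the autocorrelation coefficients."""
        half_width = 1.96 * np.sqrt(N - 2)
        return (-1 + half_width) / (N - 2), (-1 - half_width) / (N - 1)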
(Plot: CORRELOGRAM — autocorrelation coefficient vs. lag in days, for lags 0 to 28.)
The autocorrelation coefficient over a lag period of 7 days is approximately 0.4. This means that the flow on any day is positively correlated with the flow 7 days previous. Similarly, the value at 21 days is somewhat over 0.2. This indicates that the flow is correlated with the flow 21 days previous, but not as well as the flow 7 or 14 days ago.
Now look at the overall curve. It has a regular pattern that repeats every 7- 8 days. This is not immediately obvious from the raw data. It is also interesting to note that the value of the coefficients drop with each succeeding cycle.
Here are the upper and lower 95% confidence limits on the autocorrelation coefficients. Values inside this region cannot be assumed different than zero in a statistical sense.
u := [ −1 + 1.96·√(length(DATA) − 2) ] / (length(DATA) − 2)    u = 0.043
l := [ −1 − 1.96·√(length(DATA) − 2) ] / (length(DATA) − 1)    l = −0.044
Median Smoothing
Now use medsmooth to smooth the data. Based on the serial correlation analysis, it was decided to use a bandwidth of 7 days.
Q7 := medsmooth(DATA, 7)
The median of the 7 points surrounding each data point is used to replace each data point. The procedure is followed for each point, so there are as many points in the smoothed data set as in the original.
(Plot: 7-Day Median Smooth of the daily flow vs. elapsed days.)
Note the change in water use highlighted by the smoothed data. The curve above suggests that daily water use appeared to increase up until about day 1200, which would correspond to about October of 2000. After that time, water usage dropped and appeared to be less variable. This is not at all evident in the original data. The main reason for this change is the fact that before day 1200, testing was occurring at the site, requiring extra water to be run through the treatment system. Based on these results it could be argued that we have, in effect, 2 data sets, one before day 1200, and one from day 1200 to the present. It seems reasonable to plot probability curves for the data before and after day 1200.
Weibull plots
First, we extract the data before and after day 1200.
k1 := 0..1199    k2 := 0..last(DATA) − 1200
DATAbef_1200_(k1) := DATA_(k1)    DATAaft_1200_(k2) := DATA_(k2+1200)
mean(DATAbef_1200) = 4.708 gal/min    mean(DATAaft_1200) = 3.666 gal/min
median(DATAbef_1200) = 3.785 gal/min    median(DATAaft_1200) = 3.125 gal/min
We sort the data and create the Weibull plotting variable for three probability curves. The first is for all data, the second for data collected before day 1200, and the third for data collected after day 1200.
x1 := sort(DATA)/gpm    y1 := Rank(sort(DATA))/length(DATA)·100
x2 := sort(DATAbef_1200)/gpm    y2 := Rank(sort(DATAbef_1200))/length(DATAbef_1200)·100
x3 := sort(DATAaft_1200)/gpm    y3 := Rank(sort(DATAaft_1200))/length(DATAaft_1200)·100
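A small numpy sketch of the rank-based probability curves and the before/after percentile comparison (flow stands for the gap-filled daily flows; 1200 is the break point identified above):

    import numpy as np

    def prob_curve(values):
        """Return (sorted values, percent of flows less than each value)."""
        x = np.sort(values)
        pct = 100.0 * np.arange(1, len(x) + 1) / len(x)
        return x, pct

    def compare(flow, split=1200, q=50):
        """Percentile before/after the break point and the signed fractional change."""
        before, after = flow[:split], flow[split:]
        b, a = np.percentile(before, q), np.percentile(after, q)
        return b, a, -(b - a) / a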
(Plot: percent of flows less than stated value vs. water use, for all data, data before day 1200, and data after day 1200.)
The curves may be interpreted as follows:
1. The curve for all the data combined shows that the flow rate with a 50% probability of occurrence is 3.42 gal/min. That is, 50% of future flows will be less than this value. Corresponding values for data collected before and after day 1200 are 3.8 and 3.1 gal/min respectively.
percentile(DATA, 50%) = 3.417 gpm
percentile(DATAbef_1200, 50%) = 3.785 gpm
percentile(DATAaft_1200, 50%) = 3.125 gpm
2. Using the curve for data collected after day 1200, 90% of future flows will be less than 5.8 gal/min. The corresponding value for data collected before day 1200 is 8.9 gal/min.
percentile(DATAbef_1200, 90%) = 8.981 gpm
percentile(DATAaft_1200, 90%) = 5.833 gpm
Now, let's examine changes in flowrate as a function of how often the flow occurs. We will look at 20 percentile flows, median flows, and 90 percentile flows.
20 percentile flow
Before day 1200:  B1200_20 := percentile(DATAbef_1200, 20%)
After day 1200:   A1200_20 := percentile(DATAaft_1200, 20%)
change20 := −(B1200_20 − A1200_20)/A1200_20    change20 = −7.845%
Median flows
Before day 1200:  B1200_50 := percentile(DATAbef_1200, 50%)
After day 1200:   A1200_50 := percentile(DATAaft_1200, 50%)
changemed := −(B1200_50 − A1200_50)/A1200_50    changemed = −21.111%
90 percentile flow
Before day 1200:  B1200_90 := percentile(DATAbef_1200, 90%)
After day 1200:   A1200_90 := percentile(DATAaft_1200, 90%)
change90 := −(B1200_90 − A1200_90)/A1200_90    change90 = −53.96%
Summary:  change20 = −7.8%    changemed = −21.1%    change90 = −54%
These data suggest that changes in the more frequent flows before and after day 1200 are small, about -5%. However, the change in median flows before and after day 1200 is -21%, and the change in the 90 percentile flowrate is twice that, or -54%. This suggests that:
1. Changes in flow before and after day 1200 were substantial. The flow dropped considerably after day 1200.
2. Percentage changes in the mean or median flowrate underpredict changes at higher, less frequent, flow rates. This makes sense because the testing that was done required large water flows and thus affected the probability of occurrence of the less frequent flows.
EXAMPLES
Two-Point, Two-Body Elements for the Planet Jupiter
by Roger L. Mansfield, Astronomical Data Service
The problem to be solved is how to generate a set of orbital elements for the planet Jupiter that can be used to calculate Jupiter's position at any instant of the year 2004, i.e., during the period 2004 January 1.0 to 2005 January 1.0. Orbital elements are a dynamical astronomer's way of describing an orbit in a manner that is useful for calculating positions in the orbit, as we shall see in what follows.
To begin our task of generating elements for Jupiter, we retrieve a dataset from a U.S. Naval Observatory publication [1].
DataSet :=
  row   Julian date      x              y              z             calendar date
   0    2452880.5    −4.590859157    2.525277766    1.195317055    "2003 Aug 29"
   1    2452920.5    −4.744349733    2.296008220    1.100718447    "2003 Oct 08"
   2    2452960.5    −4.883439395    2.059769646    1.002778822    "2003 Nov 17"
   3    2453000.5    −5.007785421    1.817312563    0.901811754    "2003 Dec 27"
   4    2453040.5    −5.117088140    1.569396401    0.798135691    "2004 Feb 05"
   5    2453080.5    −5.211090635    1.316787749    0.692073192    "2004 Mar 16"
   6    2453120.5    −5.289578417    1.060258691    0.583950212    "2004 Apr 25"
   7    2453160.5    −5.352379063    0.800585218    0.474095406    "2004 Jun 04"
   8    2453200.5    −5.399361849    0.538545706    0.362839474    "2004 Jul 14"
   9    2453240.5    −5.430437358    0.274919463    0.250514520    "2004 Aug 23"
  10    2453280.5    −5.445557097    0.010485323    0.137453446    "2004 Oct 02"
  11    2453320.5    −5.444713104   −0.253979697    0.023989362    "2004 Nov 11"
  12    2453360.5    −5.427937567   −0.517701717   −0.089544984    "2004 Dec 21"
  13    2453400.5    −5.395302447   −0.779911265   −0.202817767    "2005 Jan 30"
  14    2453440.5    −5.346919109   −1.039844529   −0.315498628    "2005 Mar 11"
  15    2453480.5    −5.282937969   −1.296744589   −0.427259208    "2005 Apr 20"
  16    2453520.5    −5.203548148   −1.549862639   −0.537773687    "2005 May 30"
DataSet is called an ephemeris (plural: ephemerides), because it is a table of times (zeroth column) and positions of the planet Jupiter at those times (the first, second and third columns are the x, y, and z components of Jupiter's 3D position vectors). Our ephemeris gives Jupiter's positions at equal 40-day intervals. The times are Julian dates; their corresponding calendar dates are given in the fourth column.
Note that calendar date 2004 January 1.0, having Julian date 2453005.5, lies between the dates in rows 3 and 4 of the table. Also note that calendar date 2005 January 1.0, having Julian date 2453371.5, lies between the dates in rows 12 and 13 of the table.
We want our orbital elements to work for the period 2004 January 1.0 (midnight on 2003 December 31) to 2005 January 1.0 (midnight on 2004 December 31). To generate the orbital elements, we will need to calculate the positions of Jupiter very precisely on these two dates.
Aitken-Neville iterated polynomial interpolation [2], as implemented in polyiter, is useful for this calculation. It allows us not only to interpolate for the two positions of Jupiter on the two dates of interest, but also to put a tolerance, ε, on how good the interpolation must be.
The x, y, and z positional coordinates of Jupiter in DataSet columns 1, 2, and 3, respectively, have been rounded off to 9 places past the decimal, and are known to be accurate to this number of decimal places. Therefore, we should specify our Aitken-Neville interpolation tolerance as
ε := 5·10⁻¹⁰
We will use eight positions from DataSet, the zeroth through the 7th, to interpolate for Jupiter's position on 2004 January 1.0, and eight positions, the 9th through the 16th, to interpolate for Jupiter's position on 2005 January 1.0, as these two sets of eight positions bracket the two ephemeris points of interest.
With eight data points input, polyiter can compute a polynomial of, at most, degree seven. Therefore, we input a maximum iteration count of seven to polyiter.
N := 7
We now extract the column vectors of x, y, and z coordinates for interpolation. The dot subscript "1" denotes the column vectors of values as needed for interpolation in x, y, and z, at the first Julian date. The subscript "2" denotes the column vectors of values for interpolation at the second Julian date. (See also additional comments on notation, Note 6, at the end of the worksheet.)
JD1 := submatrix(DataSet, 0, 7, 0, 0)    JD2 := submatrix(DataSet, 9, 16, 0, 0)
x1 := submatrix(DataSet, 0, 7, 1, 1)     x2 := submatrix(DataSet, 9, 16, 1, 1)
y1 := submatrix(DataSet, 0, 7, 2, 2)     y2 := submatrix(DataSet, 9, 16, 2, 2)
z1 := submatrix(DataSet, 0, 7, 3, 3)     z2 := submatrix(DataSet, 9, 16, 3, 3)
We specify the Julian dates for the interpolated ephemeris positions 1 and 2.
JDT1 := 2453005.5    JDT2 := 2453371.5
Aitken-Neville Interpolation
We perform iterated interpolation for Jupiter's position r1 at the first Julian date JDT1 by invoking polyiter three times, once for each of x, y, and z.
Out := polyiter(JD1, x1, JDT1, N, ε)    Out = (1  6  −5.0222766203)ᵀ    r1_0 := Out_2
Out := polyiter(JD1, y1, JDT1, N, ε)    Out = (1  6  1.7866059269)ᵀ     r1_1 := Out_2
Out := polyiter(JD1, z1, JDT1, N, ε)    Out = (1  6  0.8889938158)ᵀ     r1_2 := Out_2
This is Jupiter's interpolated position vector, in A.U., on the first Julian date, 2453005.5, corresponding to 2004 January 1.0 TT (Terrestrial Time).
r1 = (−5.0222766203  1.7866059269  0.8889938158)ᵀ
What we see is that the x, y, and z coordinates were all successfully found by Aitken-Neville interpolation, because the first element of each polyiter output vector, the "Converged" flag, is set to 1. We see that only six iterations were needed in all cases, one iteration less than seven, the maximum degree and iteration count permissible for eight input data points.
We should note at this point a strength of Aitken-Neville interpolation: if there is a manual transcription error in the input data points, iteration will usually go up to the maximum permissible degree and will not converge (Converged = 0). So if you are pretty sure you have input enough data points to meet your convergence criterion, then Aitken-Neville is probably telling you, "Better check your input data for manual entry errors."
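For readers without the polyiter function, here is a compact Python sketch of iterated (Aitken-Neville) interpolation with a convergence tolerance; it returns a converged flag, the number of iterations used, and the interpolated value.

    import numpy as np

    def neville(x, y, x0, eps=5e-10):
        """Iterated polynomial interpolation of (x, y) at x0 with tolerance eps."""
        x, p = np.asarray(x, float), np.asarray(y, float).copy()
        prev = p[0]
        for k in range(1, len(x)):      # order-k estimates from the Neville tableau
            p = ((x0 - x[k:]) * p[:-1] - (x0 - x[:-k]) * p[1:]) / (x[:-k] - x[k:])
            if abs(p[0] - prev) < eps:
                return 1, k, p[0]       # converged: flag, iterations, value
            prev = p[0]
        return 0, len(x) - 1, p[0]      # did not converge to eps

    # e.g. neville(JD1, x1, 2453005.5) for the first coordinate, with the data above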
Now we perform iterated interpolation for Jupiter's position r2 at the second Julian date JDT2.
Out := polyiter(JD2, x2, JDT2, N, ε)    Out = (1  6  −5.4205399099)ᵀ    r2_0 := Out_2
Out := polyiter(JD2, y2, JDT2, N, ε)    Out = (1  6  −0.5899925915)ᵀ    r2_1 := Out_2
Out := polyiter(JD2, z2, JDT2, N, ε)    Out = (1  6  −0.1207351015)ᵀ    r2_2 := Out_2
This is Jupiter's interpolated position vector, in A.U., on the second Julian date, 2453371.5, corresponding to 2005 January 1.0 TT.
r2 = (−5.4205399099  −0.5899925915  −0.1207351015)ᵀ
Orbit Determination via Two-Body Mechanics
We now have two precise positions for the planet Jupiter on two dates, 2004 January 1.0, and 2005 January 1.0. These dates are exactly 366 days apart because the year 2004 is a leap year.
Let us assume for a moment that the solar system consists of just two bodies, Jupiter and the Sun, and that Jupiter travels from its position on the first date to its position on the second under the gravitational influence of the Sun alone. Motion under this assumption is called "two-body orbital mechanics," and the orbit that results is called a "two-body orbit." The larger body is called the primary and the smaller body is called the secondary.
The great mathematician Karl Friedrich Gauss deduced a little more than 200 years ago a method of determining a two-body orbit from two position vectors and the time of flight from the first to the second. We will use Gauss's method below to determine the orbital path of Jupiter from 2004 January 1.0 to 2005 January 1.0. This path will be quite close to the DataSet path, as we will see.
Since we will be using the two points of Jupiter's orbit that we have just calculated above via polyiter, and since we are assuming two-body mechanics, we will call the resulting orbital elements "two-point, two-body" (2P2B) orbital elements for the planet Jupiter.
Gauss's method can be summed up in the two functions, VEL1 and TWOPOS, defined below in a collapsed area. I won't say too much about these two functions here. The mathematics that is implemented in these functions is fully documented in Chapter 9 of [3].
TWOPOS and VEL1 definitions
We now use TWOPOS to calculate Jupiter's heliocentric equatorial velocity vector at the first interpolated position, as needed to travel to the second interpolated position under the two-body assumptions. First we set up the arguments K and ∆t for input to TWOPOS.
k1 := 0.01720209895      Gaussian constant for two-body motion when the Sun is the primary body.
µ := 1.000954786         Sum of the Sun's mass and Jupiter's mass, expressed in solar masses.
K := k1·√µ               Gravitational parameter composed of k1 and µ.
∆t := JDT2 − JDT1        We specify a one-year time of flight, in this case, 366 days for leap year 2004.
The output from TWOPOS is the position and velocity vectors side by side. We put these into the 3-by-2 array PV. Then we extract position r1 and velocity v1 for use in calculating orbital elements.
PV := TWOPOS(K, ∆t, r1, r2)
r1 := PV⟨0⟩    v1 := PV⟨1⟩
r1 = (−5.0222766203  1.7866059269  0.8889938158)ᵀ    v1 = (−0.0028745571  −0.0061497325  −0.002567772)ᵀ
Transformation to Ecliptic Coordinates
The components of position are expressed in astronomical units (A.U.). The components of velocity are expressed in A.U./day.
The two vectors r1 and v1 are "heliocentric equatorial cartesian." This means that they are the (x, y, z) components of position and (vx, vy, vz) components of velocity in a reference frame whose origin is at the Sun and whose fundamental plane is Earth's equatorial plane. We will want to transform r1 and v1 to a set of classical orbital elements for the planet Jupiter. Before we can do this we must make the vectors "heliocentric ecliptic cartesian." This means that the fundamental reference plane must be changed to the plane of Earth's orbit around the sun, called the ecliptic plane. We need function EQ2EC to do the transformation, and we will need EC2EQ later to go back the other way, so we define both functions now in a collapsed area.
EC2EQ and EQ2EC definitions
Now we transform r1 and v1 from heliocentric equatorial to heliocentric ecliptic.
r1 := EQ2EC(r1)    v1 := EQ2EC(v1)
Transformation to Classical Elements
Position and velocity at some epoch (in this case, 2004 January 1.0) are called "fundamental" orbital elements. What we want are called "classical" orbital elements. Classical orbital elements are derived and discussed in Chapter 5 of [3]. We use the function PV2CL ("position and velocity to classical elements") to do the transformation (see Chapter 8 of [3]). We define PV2CL in a collapsed area.
PV2CL definition
We invoke PV2CL to transform position and velocity to classical elements.
Elmts := PV2CL(K, r1, v1)
This is what the classical elements are, for a body in orbit around the Sun. We now have our 2P2B elements for Jupiter, for the year 2004:
Elmts =    5.20183217      Semimajor axis, in A.U.
           0.04896418      Orbital eccentricity
           1.30557531      Orbital inclination, in deg
         100.08496669      Celestial longitude of ascending node, in deg
         273.99147004      Argument of perihelion, in deg
         140.91224419      Mean anomaly, in deg
Generation and Plot of 2P2B Ephemeris
We will now use these elements to generate positions of Jupiter all around its orbit, which we will plot. But before we do so, let us take a quick look at a plot of the DataSet that we started with:
(3D scatterplot: DataSet⟨1⟩, DataSet⟨2⟩, DataSet⟨3⟩.)
This is a 3D scatterplot of the raw DataSet data.
The points that we see are positions of Jupiter at equal 40-day intervals over the time span 2003 Aug 29 to 2005 May 30.
To generate our 2P2B Jupiter ephemeris for plotting, we use functions PQ2EQ and Ephem. The math in these two functions is derived and discussed in Chapter 5 of [3]. Expand the collapsible area below to see them.
PQ2EQ and Ephem definitions
In order to know how many points to plot, we need to know the orbital period of Jupiter, in days. This is the amount of time it takes Jupiter to travel once around the Sun. The formula from Kepler's Third Law of Planetary Motion is
P := (2·π / K)·(Elmts_0)^(3/2)    P = 4331.374
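A two-line numeric check of this formula, using the constants defined above:

    from math import pi, sqrt

    k1, mu, a = 0.01720209895, 1.000954786, 5.20183217
    K = k1 * sqrt(mu)          # Gaussian constant times square root of combined mass
    P = 2 * pi / K * a**1.5    # Kepler's third law
    print(P)                   # ~4331.4 days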
Let us choose a time increment that gives us 36 equally-spaced ephemeris points to plot (the 37th gets plotted over the first). We have
∆t := P/36
We input this time step to Ephem and ask for 37 points (points 0 through 36), then we plot them.
Orbit := Ephem(Elmts, K, JD1, ∆t, 36)
(3D plot: (Orbit⟨1⟩, Orbit⟨2⟩, Orbit⟨3⟩) together with (DataSet⟨1⟩, DataSet⟨2⟩, DataSet⟨3⟩).)
This 3D plot shows the raw data, plus the 2P2B orbit plotted for 37 ephemeris points.
We have used our 2P2B orbital elements for Jupiter to generate position points around the entire orbit (see Note 4). We also see our DataSet points (red) superimposed upon the 2P2B orbital trace and position points (blue).
Accuracy of the Ephemeris and the DataSet
Why didn't we simply fit three coordinate polynomials to DataSet and use the coefficients to generate Jupiter's positions during the year 2004? This is indeed the preferred approach for highest accuracy. But the orbital elements, as we have seen, are much more descriptive, and using Aitken-Neville interpolation on just two points has allowed us to generate 2P2B orbital elements for all of 2004.
This concludes our data analysis. But the orbital analyst who wants to generate 2P2B elements for Jupiter for years 2005, 2006, 2007, and so on, will want quantitative answers to the following two questions.
1. How accurate are the positions generated by Ephem using 2P2B elements?
2. How accurate are the positions in DataSet in the first place?
To answer the first question, let us look at the 2P2B elements error in the 8th dataset point, expressed in seconds of arc, i.e., in arcseconds.
SecPerRad := r2d·3600        (SecPerRad is the number of arcseconds in one radian.)
Orbit := Ephem(Elmts, K, JDT1, DataSet_(8,0) − JDT1, 1)
i := 0..3    datarow_i := ((DataSetᵀ)⟨8⟩)_i
datarow = (2453200.5  −5.399361849  0.538545706  0.362839474)ᵀ    (Orbitᵀ)⟨1⟩ = (2453200.5  −5.39935018  0.53854764  0.36283998)ᵀ
|datarow − (Orbitᵀ)⟨1⟩|·SecPerRad = 2.44128    (arcseconds)
This suggests how to write a function that calculates all of the errors in the 2P2B predicted positions vs. the tabular points in DataSet during year 2004.
Errors :=  for i ∈ 0..8
               Orbit ← Ephem(Elmts, K, JDT1, DataSet_(i+4,0) − JDT1, 1)
               for j ∈ 0..3
                   datarow_j ← ((DataSetᵀ)⟨i+4⟩)_j
               Out_i ← |datarow − (Orbitᵀ)⟨1⟩|·SecPerRad
           Out
Errors = (0.89309  1.65831  2.16386  2.42132  2.44128  2.23164  1.79887  1.14679  0.27748)ᵀ
We see that the 2P2B elements predict the positions of Jupiter at the tabular dates in 2004 to better than 3 arcseconds.
This accuracy is quite adequate for planetary visibility predictions for Jupiter made from 2P2B elements in 2004.
On to the final question: the positions in DataSet were obtained from a U.S. Naval Observatory publication from 1951. So it is quite appropriate to ask how accurate they are today. It will suffice just to look at the two positions r1 and r2 that we obtained from DataSet by Aitken-Neville iterated interpolation.
Both r1 and r2 are referred to the mean equator and equinox of the Besselian epoch B1950.0, having Julian date 2433282.423. We precess them to the Julian epoch J2000.0, having Julian date 2451545.0, so that we can compare them with results from the U.S. Naval Observatory's most recent MICA program [5]. We use the following precession function, PRECESS. Expand the collapsible area below to see this function.
PRECESS definition
JD2000.0 := 2451545.0
r1m was obtained from the U.S. Naval Observatory's MICA 1990-2005 program.
r1m := (−5.046196922  1.730326266  0.864522613)ᵀ
We need to convert r1 from ecliptic back to equatorial:  r1 := EC2EQ(r1)
This is r1 precessed from B1950.0 to J2000.0.
PRECESS(r1, JD2000.0) = (−5.04619017  1.730339222  0.864536908)ᵀ
Now we compute the angle between r1m and r1 in arcseconds.
SecPerRad·acos[ (r1m·PRECESS(r1, JD2000.0)) / (|r1m|·|PRECESS(r1, JD2000.0)|) ] = 0.78015
We do the same for r2m and r2.
r2m was obtained from the U.S. Naval Observatory's MICA 1990-2005 program.
r2m := (−5.412958671  −0.650550743  −0.147064256)ᵀ
This is r2 precessed from B1950.0 to J2000.0.
PRECESS(r2, JD2000.0) = (−5.412956867  −0.65053459  −0.147050304)ᵀ
Finally, we compute the angle between r2m and r2 in arcseconds.
SecPerRad·acos[ (r2m·PRECESS(r2, JD2000.0)) / (|r2m|·|PRECESS(r2, JD2000.0)|) ] = 0.79517
We see that the angle between r1m and r1 and the angle between r2m and r2 are both less than an arcsecond. That is, calculations that the U.S. Naval Observatory made in the late 1940s on vacuum-tube computers, and that were published in 1951 in "Coordinates of the Five Outer Planets, 1653-2060", are still good, more than 50 years later, to better than an arcsecond.
References and Notes
[1] Eckert, W. J., Brouwer, Dirk and Clemence, G. M., "Coordinates of the Five Outer Planets 1653-2060", Astronomical Papers, Vol. XII, U.S. Naval Observatory, Washington, 1951.
[2] McCalla, Thomas Richard, Introduction to Numerical Methods and FORTRAN Programming, John Wiley (1967). Although I have programmed extensively in Borland's Turbo Pascal and in several implementations of C and C++, I still find this book to be quite useful. The FORTRAN programs are short and elegant; the emphasis of the book is on deriving numerical methods. Try an author search at http://www.alibris.com to find this out-of-print book.
[3] Mansfield, Roger L., Topics in Astrodynamics, Astronomical Data Service, Colorado Springs, Colorado (September 2003). See http://home.att.net/~astrotopics/ for availability.
[4] The 2P2B elements that we have generated for Jupiter are most accurate during the year 2004, yet we can use them to see what the entire orbit looks like, and that is what we have done in the second 3D scatterplot. We could not have plotted the entire orbit using the DataSet points alone.
[5] Nautical Almanac Office, U.S. Naval Observatory, Multiyear Interactive Computer Almanac 1990-2005, Washington, DC (1998). Available from Willmann-Bell, Richmond, Virginia. See http://www.willbell.com.
[6] A note on notation: In physics and math, and especially in dynamical astronomy, vectors and matrices are typically denoted by boldface type. The convention was originally implemented in this worksheet by using a "user-defined" Math style called "Vectors & Matrices" to create vector and matrix variables, rather than simply to use the standard, default Math style "Variables", which uses the "regular" typeface.
However, adoption of a user-defined Math style is not a good idea for an Electronic book, because the Ebook user who is not familiar with that Math style might encounter problems (a) when trying to modify math regions in the Ebook, or (b) after copying and pasting math regions into another worksheet, and then trying to modify them.
So a compromise was struck: the math regions do not use boldface for vectors and matrices, but the text regions that describe them still do.
EXAMPLES
Using a Rational Function to Fit Optical Constants
by Robert Adair
The refractive index of a material is a particularly good candidate for fitting with a rational function because the dispersion model is a sum of partial fractions, which is another form of a rational function.
n(λ)² = A + B·λ²/(λ² − C) + D·λ²/(λ² − E) + ...
which is another form of:
n(λ)² = (c0 + c1·λ² + c2·λ⁴ + ...) / (1 + d1·λ² + d2·λ⁴ + ...)
where λ is the wavelength of light.
One problem encountered in optics is obtaining this dispersion model from a set of data points. This is normally done using a nonlinear least-squares solver to solve the unknown constants in the partial fraction expansion. If the data is accurate enough, rationalint can also be used to interpolate between the data points.
As an example, consider the dispersion of fused silica. This data was taken from a table in the Melles Griot Optics catalog:
SiO2_Data :=  (first rows shown)
  180    1.5853
  190    1.5657
  200    1.5505
  213.9  1.5343
  226.7  1.5228
nm ≡ 10⁻⁹·m    µm ≡ 10⁻⁶·m
λtest := SiO2_Data⟨0⟩·nm    n := SiO2_Data⟨1⟩
rows(SiO2_Data) = 56
Typically, the experimenter will only have a small number of data points, so extract a subset of data which rationalint will interpolate.
InterpData :=
  302.2   1.48719
  404.7   1.4696
  496.5   1.4625
  632.8   1.457
  706.5   1.4551
  1500    1.4446
  1800    1.4409
  2100    1.4366
where the wavelength is in nanometers (first column), and n is dimensionless (second column).
λinterp := InterpData⟨0⟩·nm    ninterp := InterpData⟨1⟩    i := 0..rows(InterpData) − 1
Create a rational function that can evaluate the data between the interpolation points. Note that we expect the index of refraction squared to follow a rational function, not the index. It is best, then, to transform the measured data by squaring it before interpolating, then take the square root when comparing with the original data.
vx1 := (λinterp²)→    vy1 := (ninterp²)→
fnsq(x) := rationalint(vx1, vy1, x²)_0    fn(x) := rationalint(λinterp, ninterp, x)_0
Compute the refractive index at the full range of wavelengths.
nfit := (√(fnsq(λtest)))→    n2fit := (fn(λtest))→
Compare the computed values of the refractive index to the experimental data.
(Plot: the full data range, the 8 points provided, and the values extrapolated from the 8 points via the squared and direct interpolations.)
nfit λ( ) fλ
µm
2param 0⟨ ⟩
,
:=wav λmin λmin 1 nm⋅+, λmax..:=
λmax max λtest( ):=λmin min λtest( ):=
f x β,( ) β0 β1 x⋅+ β2 x2
⋅+
1 β3 x⋅+ β4 x2
⋅+:=
param
1.2605
201.8952−
3.1568
95.9621−
1.0871
1.2598
202.0415−
3.1465
96.0318−
1.0823
1.2612
201.7489−
3.1671
95.8924−
1.0919
=
param rationalfitnp vx1_sc vy1, 0.98, 2, 2, 1015−
,( ):=
Fit the data with the rationalfitnp function which creates a rational function with no poles in the data fitting region. If the poles are not in an area of concern, then rationalfit may be used for slightly higher accuracy.
StdYi 3 105−
⋅:=
The data also has a published accuracy:
vx1_scλinterp
µm
2→
:=
Let's give ourselves a squared dataset to work with that is scaled to microns for numerical accuracy:
fε1 λ( ) 10.6961663 λ
2⋅
λ2
.0684043 µm⋅( )2−
+.4079426 λ
2⋅
λ2
.1162414 µm⋅( )2−
+.8974794 λ
2⋅
λ2
9.896161 µm⋅( )2−
+:=
The Dispersion formula is given in Melles Griot as
Here is an example using rationalfit to fit the data.
Regression
The interpolation for the squared data lies directly on top of the actual data, even though it must extrapolate outside of the data range, and bridge large gaps between points. The direct interpolation without squaring overpredicts n when it is used to extrapolate.
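For readers reproducing this step outside Mathcad, here is a rough Python/NumPy stand-in for the regression above (it is not the rationalfitnp algorithm itself): multiplying the (2,2) rational model through by its denominator makes the problem linear in the five coefficients, so an ordinary least-squares solve suffices. The eight wavelength/index pairs are the InterpData values above, and the recovered coefficients should land close to, though not exactly on, param⟨0⟩:

import numpy as np

# The 8-point subset (InterpData): wavelength in nm, refractive index n.
lam_nm = np.array([302.2, 404.7, 496.5, 632.8, 706.5, 1500.0, 1800.0, 2100.0])
n      = np.array([1.48719, 1.4696, 1.4625, 1.457, 1.4551, 1.4446, 1.4409, 1.4366])

# Work with squared, micron-scaled quantities, as in the worksheet.
x = (lam_nm / 1000.0) ** 2        # x = (λ/µm)²
y = n ** 2                        # y = n²

# Model: y ≈ (b0 + b1·x + b2·x²)/(1 + b3·x + b4·x²).  Multiplying through by
# the denominator gives a problem that is linear in b0..b4:
#   b0 + b1·x + b2·x² − b3·x·y − b4·x²·y ≈ y
A = np.column_stack([np.ones_like(x), x, x**2, -x * y, -x**2 * y])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

def n_fit(lam):
    # Evaluate the fitted rational model and return the refractive index.
    xv = (lam / 1000.0) ** 2
    num = beta[0] + beta[1] * xv + beta[2] * xv**2
    den = 1.0 + beta[3] * xv + beta[4] * xv**2
    return np.sqrt(num / den)

print("coefficients:", beta)                                 # compare with param<0> above
print("max |n_fit - n| at the 8 points:", np.max(np.abs(n_fit(lam_nm) - n)))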
[Plot: refractive index vs. wavelength, comparing the full data set, the rationalfit curve, the 8-point data set, and the Sellmeier equation.]
All the curves, including the well-established Sellmeier equation, appear to lie approximately on top of one another within the range of the measured data.
residratfit := (ninterp − nfit(λinterp))→        residSell := (ninterp − fε1(λinterp))→
[Plot: residuals residratfit and residSell vs. λinterp.]
Σ(residratfit²) = 8.6934·10⁻¹⁰        Σ(residSell²) = 4.356·10⁻⁹
Note that scaling is critical to this calculation: it helps the nonlinear solver converge quickly to a numerically accurate solution. Squaring the wavelength also helps, because it produces a rational function in λ² that matches the physics of refractive-index dispersion.
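The value of that scaling can also be seen outside Mathcad: using the same linearized design matrix as in the earlier Python sketch, the condition number of the problem posed in µm² is many orders of magnitude smaller than that of the same problem posed in nm² (a sketch, assuming NumPy):

import numpy as np

lam_nm = np.array([302.2, 404.7, 496.5, 632.8, 706.5, 1500.0, 1800.0, 2100.0])
n      = np.array([1.48719, 1.4696, 1.4625, 1.457, 1.4551, 1.4446, 1.4409, 1.4366])
y = n ** 2

def design(x):
    # Design matrix of the linearized (2,2) rational fit used in the sketch above.
    return np.column_stack([np.ones_like(x), x, x**2, -x * y, -x**2 * y])

cond_um = np.linalg.cond(design((lam_nm / 1000.0) ** 2))   # λ² expressed in µm²
cond_nm = np.linalg.cond(design(lam_nm ** 2))              # λ² expressed in nm²
print(f"condition number with µm² scaling: {cond_um:.3g}")
print(f"condition number with nm² scaling: {cond_nm:.3g}")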
Reference
Melles Griot Optics Guide, Melles Griot, Irvine, CA, 1988, pp. 3-5.
EXAMPLES
Wilcoxon Signed-Rank Test
The following are the compressive strengths of a material manufactured by two different methods, A and B:
The data are paired: each measurement made with method A corresponds to a measurement made with method B, so the two samples contain the same number of points.
A := (60.3  50.2  56.5  60.6  59.3  49.7  50.8  59.8  52.5  57.4  55.8  54.5  53.6  56.8  57.1)ᵀ
B := (56.0  56.2  55.1  59.2  62.3  54.5  56.5  57.1  56.2  56.1  58.5  63.5  58.2  48.9  53.0)ᵀ

n := length(A)        n = 15
The signed-rank test will tell us, to a chosen level of statistical significance, whether the means of the two data sets are equal, that is, whether the paired samples could have come from the same distribution. The test compares the ranks of the positive and negative differences between data pairs. First, find the differences:
diff := B − A
and determine which are positive and which are negative:
positive(v) :=
   count ← 0
   for i ∈ 0 .. last(v)
      if v_i > 0
         out_count ← i
         count ← count + 1
   out
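The positive program simply collects the indices of the entries of a vector that are greater than zero. Outside Mathcad the same extraction is a one-liner; a small Python sketch, hard-coding the B − A differences that are tabulated below:

import numpy as np

diff = np.array([-4.3, 6.0, -1.4, -1.4, 3.0, 4.8, 5.7, -2.7, 3.7, -1.3, 2.7, 9.0, 4.6, -7.9, -4.1])
indexp = np.nonzero(diff > 0)[0]   # indices of the positive differences
indexn = np.nonzero(diff < 0)[0]   # indices of the negative differences
print(indexp, indexn)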
indexp := positive(diff)        indexn := positive(−diff)

These definitions will not count differences that are 0. Then, find the ranks of the absolute values of these differences, and pull and sum the ranks corresponding to the positive and negative differences independently:

ranks := Rank(|diff|)

diffᵀ  = (−4.3  6  −1.4  −1.4  3  4.8  5.7  −2.7  3.7  −1.3  2.7  9  4.6  −7.9  −4.1)
ranksᵀ = (9  13  2.5  2.5  6  11  12  4  7  1  5  15  10  14  8)

posranks :=
   for i ∈ 0 .. last(indexp)
      p_i ← ranks_(indexp_i)
   p

negranks :=
   for i ∈ 0 .. last(indexn)
      p_i ← ranks_(indexn_i)
   p

posranksᵀ = (13  6  11  12  7  5  15  10)        negranksᵀ = (9  2.5  2.5  4  1  14  8)

T+ := Σ posranks        T− := Σ negranks

T+ = 79        T− = 41

Note that the variable names T+ and T− were made by typing T[Ctrl-Shift-K]+[Ctrl-Shift-K]. This key sequence lets you type literal characters that would otherwise be interpreted as operators in Mathcad.

The test statistic for smaller sample sizes (n < 5) is the smaller of the two rank sums:

T := if(T+ < T−, T+, T−)        T = 41

Its critical values are ordinarily read from tables of the exact sampling distribution, but here we will use the percent point function for the normal distribution, which is reasonably close to the appropriate shape for this test.
T0 := qnorm(0.95, 0, 1)        T0 = 1.644854

µT := n·(n + 1)/4        µT = 60

σ2T := n·(n + 1)·(2·n + 1)/24        σ2T = 310

T := (T+ − µT)/√σ2T        T = 1.079127
WRStest "null hypothesis rejected, means are unequal" T T0>if
"null hypothesis OK, means are equal" otherwise
:=
WRStest "null hypothesis OK, means are equal"=
So we may conclude, at the 95% confidence level, that the mean compressive strength is the same for both samples. This has implications for the manufacturing processes for these materials.
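The whole calculation can also be reproduced outside Mathcad. The Python/SciPy sketch below follows the same steps; the only real difference is that tied absolute differences receive average ranks here, so T+ and T− come out within 0.5 of the worksheet's 79 and 41, and the conclusion is unchanged. (SciPy's scipy.stats.wilcoxon performs the same test directly, if the step-by-step version is not needed.)

import numpy as np
from scipy import stats

# Compressive strengths from the worksheet: paired measurements, methods A and B.
A = np.array([60.3, 50.2, 56.5, 60.6, 59.3, 49.7, 50.8, 59.8, 52.5, 57.4,
              55.8, 54.5, 53.6, 56.8, 57.1])
B = np.array([56.0, 56.2, 55.1, 59.2, 62.3, 54.5, 56.5, 57.1, 56.2, 56.1,
              58.5, 63.5, 58.2, 48.9, 53.0])

diff = np.round(B - A, 10)             # round away floating-point noise before ranking
diff = diff[diff != 0.0]               # zero differences are discarded, as in the worksheet
ranks = stats.rankdata(np.abs(diff))   # ranks of |differences|; ties get average ranks

T_plus  = ranks[diff > 0].sum()
T_minus = ranks[diff < 0].sum()

n = diff.size
mu_T     = n * (n + 1) / 4
sigma2_T = n * (n + 1) * (2 * n + 1) / 24
z      = (T_plus - mu_T) / np.sqrt(sigma2_T)
z_crit = stats.norm.ppf(0.95)          # same critical value as qnorm(0.95, 0, 1)

print(f"T+ = {T_plus}, T- = {T_minus}")    # near the worksheet's 79 and 41
print(f"z = {z:.4f}, critical value = {z_crit:.4f}")
print("reject H0: means are unequal" if z > z_crit else "fail to reject H0: means are equal")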
Reference
Kottegoda and Rosso, Statistics, Probability and Reliability for Civil and Environmental Engineers, McGraw-Hill, 1997, pp. 274-275.