

Total Least Squares Estimation of Dynamical Systems

Manoranjan Majji∗ and John L. Junkins †

Texas A & M University, College Station, TX, 77843-3141.

The total least squares error criterion is considered for estimation problems. Exact necessary conditions for the total error criterion have been derived. Several solution methodologies are presented to solve the modified normal equations obtained from the necessary conditions. The results are applied to the parameter identification of a novel morphing wing developed at Texas A & M University (static problem). Subsequently, a filter is derived, with a multilinear measurement model, whose minimum variance estimator is shown to also minimize the total least squares type cost function. The filter thus derived is compared with a classical Kalman filter on a numerical example.

I. Introduction

The least squares error criterion was invented by Carl Friedrich Gauss and to this day remains the most widely used in many diverse areas, owing to its computational and statistical properties. Around the turn of the twentieth century, this criterion was generalized to include uncertainty in the basis functions by Adcock.1 Consequently, a new theory was developed, called the "errors in variables" theory.2 Golub and Van Loan3 applied this to numerical mathematics problems. Sabine Van Huffel and Vandewalle4 applied the theory developed by Gleser to parameter estimation problems and brought the theory to systems science.

I.A. The Total Least Squares Error Criterion

The total least squares error criterion is based on a generalization of the least squares error criterion. It aims at changing the basis as little as possible while capturing as much of the measurement vector as possible.

I.A.1. Paper Outline

The paper is presented as follows. The first section introduces the total least squares error criterion and derives the necessary conditions in a direct manner, obtaining a modest set of nonlinear equations that are a modification of the least squares solution. Geometrical insights into the problem are presented and a comparison is made with the existing literature. Methods to solve the equations thus obtained are discussed in the next section. Two novel approaches are presented along with the celebrated SVD solution of Golub et al.,3 Van Huffel,4 and the well known eigenvalue problem proposed by Villegas.5 We note the strong correlation of the concepts developed herein with the minimum model error estimation proposed by Mook and Junkins.6 Subjecting the error criterion to dynamical systems is the topic of Section VI. The subsequent section (VII) compares the performance of the filter derived there with the classical Kalman estimator.

II. Total Least Squares

As pointed out in the introduction, the least squares error criterion for minimization of residual error does not apply when the linear equations involve uncertainty in the basis functions. Therefore, suppose we only have access to measurements Ã of the basis functions and ỹ of the range vector, in the linear error model

ỹ ≈ Ãx + v    (1)

∗Graduate Student, Department of Aerospace Engineering, 3141 TAMU, and Student Member, AIAA.
†Distinguished Professor, Holder of Royce E. Wisenbaker Chair in Engineering, Regents Professor, Department of Aerospace Engineering, Fellow, AIAA.

1 of 19

American Institute of Aeronautics and Astronautics


where Ã = A + V_A, V_A is a matrix of random variables ∼ N(0, σ), and v is a vector of random variables ∼ N(0, σ). We do not require the errors to have the same statistics; however, for simplicity of the derivations, we impose these conditions here. Straightforward generalizations to arbitrary statistics can be performed with arbitrary weighting, as shown by Van Huffel4 and Gleser.2 Though the simplicity of our approach in obtaining these results makes this generalization obvious and natural, we choose not to present the general method, in favor of clarity. However, we do summarize the results obtained for general weight matrices in a separate section. The total least squares error criterion minimizes the cost associated with estimation errors in the basis, defined by ∆ := Ã − A, r := ỹ − Ax,

J = ½ tr(∆ᵀ∆) + ½ rᵀr    (2)

The necessary conditions for a minimum of this cost are given by

∂J/∂A = 0,  ∂J/∂x = 0    (3)

leading to the equations (using matrix derivative identities7),

∆(I + xxᵀ) = (Ãx − ỹ)xᵀ    (4)

which (using the Sherman-Morrison-Woodbury matrix inversion lemma7) is equivalent to

A = Ã − (Ãx − ỹ)xᵀ/(1 + xᵀx) = Ã + exᵀ    (5)

where e := (ỹ − Ãx)/(1 + xᵀx), leading to the fact that the optimal correction in the data matrix is the rank-one correction exᵀ. The second necessary condition, ∂J/∂x = 0, yields

(AᵀA)x = Aᵀỹ    (6)

which, using the expression A = Ã + exᵀ, yields

[ÃᵀÃ − ((ỹ − Ãx)ᵀ(ỹ − Ãx)/(1 + xᵀx)) I] x = Ãᵀỹ    (7)

which we call the modified normal equations. We found these equations to be the same as the ones derived by Van Huffel.4 Substituting the necessary conditions into the cost function, we obtain the optimal cost

J₁ = (ỹ − Ãx)ᵀ(ỹ − Ãx)/(1 + xᵀx)    (8)

The solution of the necessary conditions derived here is a nonlinear problem, but the solution that minimizes J₁ is the eigenvector corresponding to the smallest eigenvalue of the symmetric positive definite form

TᵀT = [ÃᵀÃ  Ãᵀỹ; ỹᵀÃ  ỹᵀỹ]

with T defined as T = [Ã  ỹ]. Defining the associated vector z = [xᵀ −1]ᵀ, the minimum

value of J₁ hence becomes (after making the necessary substitutions of the necessary conditions)

J₁ = (ỹ − Ãx)ᵀ(ỹ − Ãx)/(1 + xᵀx) = zᵀTᵀTz / (zᵀz)    (9)

whose extremals are the eigenvalues of the quadratic form TᵀT; the ratio itself is the Rayleigh quotient.8 The associated eigenvectors are the candidate solutions, and in this problem the smallest eigenvalue and its corresponding eigenvector are the solution.


II.A. Geometry of the problem

By a careful analysis of the necessary conditions for a solution to the total least squares problem, we can make some observations on the geometry of the problem. Consider the space of rectangular matrices of fixed order n × m together with the inner product definition ⟨A, B⟩ := tr(AᵀB). The norm derived from this inner product satisfies the parallelogram law, as shown below:

‖P + Q‖² + ‖P − Q‖² = tr((P + Q)ᵀ(P + Q)) + tr((P − Q)ᵀ(P − Q))    (10)

= 2 tr(PᵀP) + 2 tr(QᵀQ) = 2(‖P‖² + ‖Q‖²)    (11)

A result in analysis9 states that a norm is derived from an inner product, which is then unique, if and only if it satisfies the parallelogram law. This allows us to define orthogonality in matrix spaces. Now consider the inner products ŷᵀ(ỹ − Ax) and tr(Aᵀ∆). These are the inner products of the best estimates with the residual object in the corresponding space. Then,

ŷᵀ(ỹ − Ax) = xᵀAᵀ(ỹ − Ax) = 0    (12)

tr(Aᵀ∆) = −tr(Aᵀexᵀ) = −xᵀAᵀe = 0    (13)

In the above expressions, the identity Aᵀe = 0 has been used. It follows directly from the necessary conditions: since Ax − ỹ = Ãx + e(xᵀx) − ỹ = −e, equation (6) gives Aᵀ(Ax − ỹ) = −Aᵀe = 0. Therefore, the total least squares problem enforces a geometry and performs an orthogonal regression in both the range space and the space of the basis functions.
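These two conditions are easy to verify numerically. The sketch below (a minimal illustration with randomly generated Ã, ỹ, not the paper's wind tunnel data; all variable names are our own) solves the TLS problem by the SVD, forms the rank-one corrected matrix of equation (5), and checks both orthogonality conditions to machine precision.

```python
import numpy as np

# Hypothetical measured data (At plays A-tilde, yt plays y-tilde).
rng = np.random.default_rng(5)
At = rng.standard_normal((30, 3))
yt = At @ np.array([1.0, 2.0, -1.0]) + 0.05 * rng.standard_normal(30)

# TLS solution: smallest right singular vector of T = [At | yt].
_, _, Vt = np.linalg.svd(np.column_stack([At, yt]))
v = Vt[-1]
x = -v[:3] / v[3]

# Rank-one correction A = At + e x^T, with e = (yt - At x)/(1 + x^T x), eq. (5).
e = (yt - At @ x) / (1.0 + x @ x)
A = At + np.outer(e, x)

# Basis-space orthogonality (A^T e = 0) and range-space orthogonality
# (y-hat = A x is orthogonal to the residual yt - A x).
print(np.abs(A.T @ e).max(), abs((A @ x) @ (yt - A @ x)))
```

Both printed quantities should be at round-off level, confirming the orthogonal-regression interpretation.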

III. Weighted Total Least Squares

Researchers have claimed equivalence of the weighted total least squares problems to an error criterion given by Golub and Van Loan,8 and Van Huffel.4 However, there is neither sufficient freedom nor an insightful conclusion to be drawn from the resulting advantages unless a symbolic expression for the estimate is available. In this section we show that our technique yields a more general necessary condition that gives considerable design freedom to alter the weights. The cost function with appropriate weights is given by

J_w = tr(∆ᵀP∆) + rᵀQr    (14)

where P, Q are arbitrary positive definite weight matrices. They allow the designer to control the magnitudes of the corrections to the range and basis function tolerance levels. The first order necessary conditions ∂J_w/∂A = 0, ∂J_w/∂x = 0 yield

P∆ + Q∆xxᵀ = Q(Ãx − ỹ)xᵀ    (15)

x = (AᵀQA)⁻¹AᵀQỹ    (16)

Clearly, there is no obvious way of determining the "best" correction giving an expression for A. But indeed there is. We choose

A = Ã + P⁻¹(Q⁻¹ + P⁻¹(xᵀx))⁻¹(ỹ − Ãx)xᵀ    (17)

We can easily verify that this expression for A satisfies the first necessary condition. Upon substitution into the second condition, we get nothing resembling the nice eigenvalue problem. We observe that if the range measurements are weighted more heavily, Q >> P, then the full correction to A is applied,

A_{Q→∞} = Ã + (ỹ − Ãx)xᵀ/(xᵀx)

and when P >> Q we receive no correction, A_{P→∞} = Ã, which is the least squares solution. In between, when there is comparable uncertainty in both, P ∼ Q, we have

A = Ã + α (ỹ − Ãx)xᵀ/(1 + xᵀx)

which is the eigenvalue problem presented above. This design freedom via the weighted total least squares solution is not present in the literature, to the best knowledge of the authors. However, the payoff is the lack of algorithms to solve this problem. Therefore we propose some new methods, besides the eigenvalue iterations, that may be extensible to the more general problem.

IV. Solution Methodologies

The necessary conditions, being eigenspace computations of a symmetric quadratic form, can be computed stably using the singular value decomposition.8 Therefore we have the first algorithm.


IV.A. SVD Method

Since the matrix TᵀT is symmetric, the singular vector associated with the smallest singular value is the solution to the problem (we recall that the eigenvectors and singular vectors span the same spaces in the case of a symmetric matrix).

SVD Approach.

Step 1: Compute the SVD TᵀT = UΣVᵀ and save V.
Step 2: x_SVD = −V(1:n, n+1)/V(n+1, n+1).
Step 3: x_SVD is the required solution.

This algorithm is fairly robust but computationally expensive.
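The three steps can be sketched in a few lines of numpy (a minimal illustration, not the authors' code; taking the SVD of T directly is equivalent to the eigendecomposition of TᵀT, and the function name is our own):

```python
import numpy as np

def tls_svd(A_meas, y_meas):
    """TLS estimate via the SVD. The right singular vectors of T = [A | y]
    are the eigenvectors of T^T T, so the singular vector for the smallest
    singular value plays the role of V(:, n+1) in the steps above."""
    T = np.column_stack([A_meas, y_meas])
    Vt = np.linalg.svd(T)[2]   # rows of Vt: right singular vectors,
    v = Vt[-1]                 # ordered by decreasing singular value
    n = A_meas.shape[1]
    return -v[:n] / v[n]       # Step 2: x = -V(1:n, n+1)/V(n+1, n+1)
```

With noise-free data the smallest singular value is zero and the recovered x is exact; with noisy Ã and ỹ the estimate minimizes J₁ of equation (8).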

IV.B. Eigenvalue Problem

The solution, which is also the eigenvector corresponding to the smallest eigenvalue, can be computed by inverse iterations.8

Rayleigh quotient iteration.

Step 1: Start with z = [xᵀ −1]ᵀ, where x = (ÃᵀÃ)⁻¹Ãᵀỹ is the least squares solution.
Step 2: Solve (TᵀT − λI)v⁽ᵏ⁾ = v⁽ᵏ⁻¹⁾ for k = 1, 2, ..., with λ updated as the Rayleigh quotient of the current iterate.
Step 3: x_TLS = −v⁽ᵏ⁾(1:n)/v⁽ᵏ⁾(n+1) is the converged solution.

The algorithm has cubic convergence and we can get 10 digits of accuracy in 3 iterations.

IV.C. Davidenko’s Homotopy Method

The structure of the necessary conditions clearly indicates that their solution is "close" to the solution of the normal equations. This motivates us to explore perturbation methods10 to solve this problem. The method gains importance in light of the expressions for the necessary conditions of the weighted total least squares formulation developed by the authors, where the eigenvalue problem is not "obvious" from the nonlinear equations (obtained in Section III).

Davidenko's Method.

Step 1: Start with the least squares solution of G(x) = (ÃᵀÃ)x − Ãᵀỹ = 0.
Step 2: Let F(x) = (ÃᵀÃ − λ(x)I)x − Ãᵀỹ = 0 (or the weighted version of it).
Step 3: Consider the homotopy H(z) = tF(z) + (1 − t)G(z), t ∈ [0, 1].
Step 4: Integrate dz/dt = −[∂H/∂z]⁻¹[∂H/∂t] from t = 0 to t = 1.
Step 5: x = z(1) is the required estimate.

The accuracy of the solution depends on the numerical integrator, and the method is also comparatively slow.
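The scheme can be prototyped with a fixed-step Euler integrator and a finite-difference Jacobian (a sketch under our own choices of step count and differencing; Davidenko's method itself does not prescribe the integrator, and λ(x) below is the Rayleigh-type expression from equation (8)):

```python
import numpy as np

def tls_homotopy(A, y, steps=400):
    """Davidenko-style homotopy from the least squares root (t = 0) to the
    TLS necessary conditions (t = 1): Euler integration of
    dz/dt = -[dH/dz]^{-1} dH/dt with a central-difference Jacobian."""
    S, b = A.T @ A, A.T @ y
    n = b.size

    def lam(z):
        r = y - A @ z
        return (r @ r) / (1.0 + z @ z)

    def H(z, t):
        # H = t F + (1 - t) G  =  G(z) - t * lam(z) * z
        return S @ z - b - t * lam(z) * z

    z = np.linalg.solve(S, b)        # G(z) = 0: the least squares start
    dt = 1.0 / steps
    eps = 1e-7
    for k in range(steps):
        t = k * dt
        J = np.empty((n, n))         # finite-difference Jacobian dH/dz
        for j in range(n):
            e = np.zeros(n)
            e[j] = eps
            J[:, j] = (H(z + e, t) - H(z - e, t)) / (2 * eps)
        dHdt = -lam(z) * z           # dH/dt holding z fixed
        z = z - dt * np.linalg.solve(J, dHdt)
    return z
```

The Euler drift shrinks with the step size, consistent with the remark that accuracy depends on the integrator; a predictor-corrector continuation code would do better.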

IV.D. A QUEST type algorithm

QUEST is an attitude estimation algorithm, proposed by Shuster,11 which determines the "best" attitude matrix from vector measurements. This algorithm receives attention owing to the possibility of its recursive implementation.12 Our recursive (rather, accumulative) formulation of the TLS problem has a strong correlation with the REQUEST methodology. We also derive some additional benefit from this algorithm, presented next. Remarkably, the developments carry forward exactly from the three dimensional case of the QUEST computations to a general dimension. The result is summarized and the algorithm is presented next. In the eigenvalue problem,

[ÃᵀÃ  Ãᵀỹ; ỹᵀÃ  ỹᵀỹ] [x; −1] = λ [x; −1]    (18)

(S − λI)x = z    (19)

λ = α + zᵀ(λI − S)⁻¹z    (20)

where S := ÃᵀÃ, z := Ãᵀỹ, and α := ỹᵀỹ. The fact that (S − λI)⁻¹ can be expanded in lower powers of S (due to the Cayley-Hamilton theorem13) enables us to compute x algebraically.


Generalized QUEST type algorithm.

Step 1: Compute the characteristic polynomial associated with the matrix TᵀT.
Step 2: With λ₀ = 0 as the initial guess, compute the smallest eigenvalue (Newton's root solver).
Step 3: Calculate x = (S − λI)⁻¹z.

V. Static Parameter Estimation Application : Morphing Wing

The algorithms presented above were used in the identification of sensitivity coefficients of a novel morphing wing developed at Texas A&M University (Fig. 1). The twisting wing actuator being developed was amenable to quasi-static aerodynamic models. As an alternative approach, we wanted to develop models directly from the input-output data of the wind tunnel tests. The experimental setup and aerodynamic models, along with the specifications of the tests performed, are discussed in an accompanying paper.14

Figure 1. Morphing Wing : Experimental Setup

V.A. Discussion

The idea of using the total least squares method, as opposed to the least squares method, in fitting the data was to obtain a better approximation in regions where physics based models (any strictly linear models) fail. Least squares approximation is known to "filter" the data in such regions (especially where the wing stalls) and therefore yields poor models of the physics. The total least squares algorithms, on the other hand, possessing more knobs to turn, can model the physics to an arbitrary extent (the user controls this through the weights). This objective could only partially be realized because the current TLS approximation deals with "equal" magnitudes of uncertainty, thereby staying close to the least squares estimates. Upon careful observation, the better approximation of TLS is revealed by the plots in Figures 2 and 3. This would potentially be improved by the incorporation of weights.

V.B. Model Validation

Once the fit was completed, a time varying test result was obtained, and the predictions from the least squares and TLS methods are compared in the following plots. The time varying test was performed with a small


[Figure: two panels, "TLS Data Fit" and "Residual Errors"; lift coefficient CL vs. angle of attack; curves: Data, TLS fit, LS fit.]

Figure 2. Baseline (no twist) test : Linear Model and Residuals

stalling period and a large stalling period, and the results were compared with the outputs of the models. It is observed that the approximation of the model fails in regions where near-stall conditions prevail (Figs. 4, 5).

VI. Applications to Dynamic Systems: The Total Least Squares Kalman Filter

Having discussed methods of determining the best estimates for a static problem, we are now interested in extending the methodology to applications involving dynamical systems. In other words, we are interested in applying the constraints of differential equations to the optimization problem and obtaining the associated filters. The filter is developed by considering the following problem. Consider the discrete time dynamical system given by

x_{k+1} = A_k x_k + B_k u_k + G_k w_k    (21)

where u_k is the control input to the system and w_k is the random forcing function, most popularly known as the process noise. The measurements of linear combinations of the states are given by

ỹ_k = H_k x_k + v_k    (22)

In contrast to the classical Kalman filtering framework, in this case the true measurement sensitivity matrix H_k is unknown. But its measurements are available at every update time step, given by

H̃_k = H_k + E^m_k    (23)

The state process and measurement noise vectors are assumed to be zero mean Gaussian random vectors, with covariances defined by w_k ∼ N(0, R^{sp}_k) and v_k ∼ N(0, R^{sm}_k). The matrix measurement noise corrupting the measurement sensitivity matrix is also assumed to be zero mean Gaussian. However, to simplify the developments, each row of this matrix is assumed to be an independent random vector, identically distributed with all the other rows. That is, the statistics of the measurement noise matrix, given


[Figure: two panels, "TLS Data Fit − Tip Section Twists" and "Residual Errors"; lift coefficient CL vs. tip section twist angle; curves: Data, TLS fit, LS fit and TLS Residuals, LS Residuals.]

Figure 3. Tip Section Twist test: Fit of Model and Residuals

by E^m_k := [e₁; e₂; ... ; e_i; ... ; e_m], are defined by the statistics of each row, e_i ∼ N(0, R_i), uncorrelated with the other rows and uncorrelated in time. This implies that

E[Vec(E^mᵀ_k)] = 0_{mn×1}    (24)

E[Vec(E^mᵀ_k) Vec(E^mᵀ_k)ᵀ] = [E(e₁ᵀe₁) E(e₁ᵀe₂) ··· E(e₁ᵀe_m); E(e₂ᵀe₁) E(e₂ᵀe₂) ··· E(e₂ᵀe_m); ⋮ ⋱ ⋮; E(e_mᵀe₁) E(e_mᵀe₂) ··· E(e_mᵀe_m)]    (25)

= [R₁ R_{1,2} ··· R_{1,m}; R_{2,1} R₂ ··· R_{2,m}; ⋮ ⋱ ⋮; R_{m,1} R_{m,2} ··· R_m]    (26)

=: R^{EHm}_k    (27)

The process noise statistics involved with the evolution of the truth model of the measurement sensitivity matrix, given by

H_{k+1} = H_k + E^p_k    (28)

Vec(E^pᵀ_k) ∼ N(0, R^{EHp}_k)    (29)

are similarly defined, and the second moment matrix (the first moment being zero) of all the elements arranged vectorially (exactly as in the developments above) is accordingly denoted by R^{EHp}_k, as defined above. With the appropriate noise statistics defined as above, the problem is to produce a state estimate, x̂, by processing


[Figure: single panel, "TLS Data Fit testing − Time Varying Input"; lift coefficient CL vs. time (sec); curves: Data, TLS fit, LS fit.]

Figure 4. Model Validation : time varying input with large stalling periods

the vector measurements ỹ_k that arrive at each update time step. To solve this problem, we propose an estimator of the form

x̂⁻_{k+1} = A_k x̂⁺_k + B_k u_k    (30)

Ĥ⁻_{k+1} = Ĥ⁺_k    (31)

The estimated output is assumed to be computed from the updated measurement sensitivity matrix and therefore attains the following structure:

ŷ_k = Ĥ⁺_k x̂⁻_k    (32)

Kalman updates are performed on both the state and the measurement sensitivity matrix, and are given by

x̂⁺_k = x̂⁻_k + K_k [ỹ_k − ŷ_k]    (33)

Ĥ⁺_k = Ĥ⁻_k + L_k [H̃_k − Ĥ⁻_k]    (34)

Defining the state estimation error δx±_k := x_k − x̂±_k and the measurement sensitivity matrix estimation error δH±_k := H_k − Ĥ±_k, the innovation error ỹ_k − ŷ_k can be expressed as

ỹ_k − ŷ_k = H̃_k x_k + v_k − Ĥ⁺_k x̂⁻_k    (35)

= H_k x_k + E^m_k x_k + v_k − Ĥ⁺_k x_k + Ĥ⁺_k x_k − Ĥ⁺_k x̂⁻_k    (36)

= δH⁺_k x_k + Ĥ⁺_k δx⁻_k + E^m_k x_k + v_k    (37)

With the above definitions, the estimation error in the state update equation is given by

δx⁺_k = x_k − x̂⁺_k    (38)

= x_k − x̂⁻_k − K_k [ỹ_k − ŷ_k]    (39)

= δx⁻_k − K_k [δH⁺_k δx⁻_k + δH⁺_k x̂⁻_k + Ĥ⁺_k δx⁻_k + E^m_k x_k + v_k]    (40)

where x_k = x̂⁻_k + δx⁻_k has been used in the δH⁺_k x_k term.

The corresponding update error of the measurement sensitivity matrix is given by

δH⁺_k = H_k − Ĥ⁺_k    (41)

= H_k − Ĥ⁻_k − L_k (H̃_k − Ĥ⁻_k)    (42)

= (I_{m×m} − L_k) δH⁻_k − L_k E^m_k    (43)


[Figure: single panel, "TLS Data Fit testing − Time Varying Input while stalling"; lift coefficient CL vs. time (sec); curves: Data, TLS fit, LS fit.]

Figure 5. Model Validation : time varying inputs with small stalling periods

This enables us to write the state estimation error covariance update equation in the form

P^{s+}_k = E[δx⁺_k δx⁺ᵀ_k]    (44)

= E[(δx⁻_k − K_k [δH⁺_k x_k + Ĥ⁺_k δx⁻_k + E^m_k x_k + v_k])(δx⁻_k − K_k [δH⁺_k x_k + Ĥ⁺_k δx⁻_k + E^m_k x_k + v_k])ᵀ]    (45)

To simplify the expression, we make use of the conditional expectation identity from probability theory, E[X] = E(E[X|Y]) for all random variables X, Y. Using this property, the assumption that the closed loop of the discrete Kalman filter is stable, leading to unbiased estimates of the state and the measurement sensitivity matrix (i.e., E[δH±_k] = 0_{m×n}, E[δx±_k] = 0), and the zero mean and uncorrelated nature of the measurement and process noise terms, E(w_k) = 0, E(v_k) = 0, E(E^m_k) = 0_{m×n}, the following terms in the state covariance update equation vanish:

E[v_k δx⁻ᵀ_k] = 0    (46)

E[δH⁺_k x̂⁻_k δx⁻ᵀ_k] = E(E[δH⁺_k x̂⁻_k δx⁻ᵀ_k | δH⁺_k]) = 0    (47)

E[δH⁺_k δx⁻_k δx⁻ᵀ_k] = E(E[δH⁺_k δx⁻_k δx⁻ᵀ_k | δH⁺_k]) = E(δH⁺_k P^{s−}_k) = 0    (48)

E(E^m_k x_k δx⁻ᵀ_k) = E(E^m_k) E(x_k δx⁻ᵀ_k) = 0    (49)

Similarly,

K_k E{[δH⁺_k (δx⁻_k + x̂⁻_k) + Ĥ⁺_k δx⁻_k] [E^m_k x_k + v_k]ᵀ} = 0    (50)

E[δH⁺_k δx⁻_k δx⁻ᵀ_k Ĥ⁺ᵀ_k] = E(E[δH⁺_k δx⁻_k δx⁻ᵀ_k Ĥ⁺ᵀ_k | δH⁺_k]) = E[δH⁺_k P^{s−}_k Ĥ⁺ᵀ_k] = 0    (51)

E[δH⁺_k δx⁻_k δx⁻ᵀ_k δH⁺ᵀ_k] = E(E[δH⁺_k δx⁻_k δx⁻ᵀ_k δH⁺ᵀ_k | δH⁺_k]) = E[δH⁺_k P^{s−}_k δH⁺ᵀ_k]    (52)

This simplifies the covariance update equation to the form

P^{s+}_k = P^{s−}_k − K_k Ĥ⁺_k P^{s−}_k − P^{s−}_k Ĥ⁺ᵀ_k Kᵀ_k + K_k ∆_k Kᵀ_k    (53)

∆_k = E[δH⁺_k δx⁻_k δx⁻ᵀ_k δH⁺ᵀ_k + δH⁺_k x̂⁻_k x̂⁻ᵀ_k δH⁺ᵀ_k + Ĥ⁺_k P^{s−}_k Ĥ⁺ᵀ_k + E^m_k x_k xᵀ_k E^mᵀ_k + R^{sm}_k]    (54)

The expression for ∆_k simplifies further using the fact that x_k = x̂⁻_k + δx⁻_k and the additional abbreviation Ξ_k := P^{s−}_k + x̂⁻_k x̂⁻ᵀ_k, into

∆_k = Ĥ⁺_k P^{s−}_k Ĥ⁺ᵀ_k + R^{sm}_k + E[δH⁺_k Ξ_k δH⁺ᵀ_k + E^m_k Ξ_k E^mᵀ_k]    (55)

= Ĥ⁺_k P^{s−}_k Ĥ⁺ᵀ_k + R^{sm}_k + ∆¹_k    (56)


Now, to determine the best estimates of the state, we determine the optimal gains L_k, K_k such that the covariance of the updated estimation error is minimized. Consider the scalar performance index

J = trace(P^{s+}_k)    (57)

= trace(P^{s−}_k − K_k Ĥ⁺_k P^{s−}_k − P^{s−}_k Ĥ⁺ᵀ_k Kᵀ_k + K_k ∆_k Kᵀ_k)    (58)

Necessary conditions for a minimum, ∂J/∂K_k = 0 and ∂J/∂L_k = 0, lead to the gain equations:

∂J/∂K_k = −2 P^{s−}_k Ĥ⁺ᵀ_k + 2 K_k ∆_k = 0    (59)

implying

K_k = P^{s−}_k Ĥ⁺ᵀ_k [Ĥ⁺_k P^{s−}_k Ĥ⁺ᵀ_k + R^{sm}_k + ∆¹_k]⁻¹    (60)

The other necessary condition is rather unique. Notice that the only terms containing L_k are contained in trace(K_k ∆²_k Kᵀ_k), where ∆²_k := E[δH⁺_k Ξ_k δH⁺ᵀ_k]. Hence the necessary condition becomes

∂J/∂L_k = ∂ trace(K_k ∆²_k Kᵀ_k)/∂L_k = 0    (61)

Given ∆²_k, we can express it as a function of the propagated measurement sensitivity error covariance and the gain L_k to be determined, as follows:

∆²_k = E[δH⁺_k Ξ_k δH⁺ᵀ_k]    (62)

= (I_{m×m} − L_k) E[δH⁻_k Ξ_k δH⁻ᵀ_k] (I_{m×m} − L_k)ᵀ + L_k E[E^m_k Ξ_k E^mᵀ_k] Lᵀ_k    (63)

From the above, the second necessary condition becomes

∂ trace(K_k ∆²_k Kᵀ_k)/∂L_k = 2 Kᵀ_k K_k {−E[δH⁻_k Ξ_k δH⁻ᵀ_k] + L_k (E[δH⁻_k Ξ_k δH⁻ᵀ_k] + E[E^m_k Ξ_k E^mᵀ_k])} = 0    (64)

leading to the gain equation

L_k = E[δH⁻_k Ξ_k δH⁻ᵀ_k] (E[δH⁻_k Ξ_k δH⁻ᵀ_k] + E[E^m_k Ξ_k E^mᵀ_k])⁻¹    (65)

However, to facilitate these gain computations, it turns out that we need to compute the covariance associated with all the elements of the measurement sensitivity matrix. In what follows, we derive the relations used in the computation of weighted covariances of the form E[δH⁻_k Ξ_k δH⁻ᵀ_k]. The required covariance propagation and update equations of the measurement sensitivity matrix estimation error are first derived using the full covariance matrix (of all mn elements), given by

P^{H±}_k := E[Vec(δH±ᵀ_k) Vec(δH±ᵀ_k)ᵀ]    (66)

where, if δH±ᵀ_k = [h₁ᵀ ··· h_mᵀ] with h_i the i-th row of the matrix δH±_k, the Vec operator acts on a matrix (of dimensions, say, m × n) and produces a vector of length mn. Accordingly,

Vec(δH±ᵀ_k) = [h₁ᵀ; ... ; h_mᵀ]    (67)

Consequently, the expression for the covariance is given by the mn × mn matrix

P^{H±}_k = E[Vec(δH±ᵀ_k) Vec(δH±ᵀ_k)ᵀ] = [E(h₁ᵀh₁) E(h₁ᵀh₂) ··· E(h₁ᵀh_m); E(h₂ᵀh₁) E(h₂ᵀh₂) ··· E(h₂ᵀh_m); ⋮ ⋱ ⋮; E(h_mᵀh₁) ··· ··· E(h_mᵀh_m)]    (68)


an expression quite similar to equation (25), which denoted the statistics of the noise matrix. At this stage, we point out that there has been no assumption so far on the nature of the random variables. We assumed that the rows were i.i.d. Gaussian, but clearly, as we are tracking the evolution of all possible correlations of the rows, E(h_iᵀh_j), this is not required, provided we know the correlations a priori. Writing the measurement sensitivity update error equation (43) another way,

δH⁺ᵀ_k = δH⁻ᵀ_k (I_{m×m} − L_k)ᵀ − E^mᵀ_k Lᵀ_k    (69)

Taking the Vec operator on both sides and using the identity Vec(ACB) = (Bᵀ ⊗ A) Vec(C), we get

Vec(δH⁺ᵀ_k) = [(I_{m×m} − L_k) ⊗ I_{n×n}] Vec(δH⁻ᵀ_k) − (L_k ⊗ I_{n×n}) Vec(E^mᵀ_k)    (70)

= Φ⁺_k Vec(δH⁻ᵀ_k) + Ψ⁺_k Vec(E^mᵀ_k)    (71)

with Φ⁺_k := (I_{m×m} − L_k) ⊗ I_{n×n} and Ψ⁺_k := −L_k ⊗ I_{n×n}.

This, together with the definition of the measurement sensitivity estimation error covariance P^{H±}_k := E[Vec(δH±ᵀ_k) Vec(δH±ᵀ_k)ᵀ], leads to the measurement sensitivity covariance update equation

P^{H+}_k = Φ⁺_k P^{H−}_k Φ⁺ᵀ_k + Ψ⁺_k R^{EHm}_k Ψ⁺ᵀ_k    (72)

where the expression for the measurement noise statistics from equation (27) has been used. Similarly, the measurement sensitivity estimation error propagation equations are given by

Vec(δH⁻ᵀ_{k+1}) = Vec(δH⁺ᵀ_k) + Vec(E^pᵀ_k)    (73)

P^{H−}_{k+1} = P^{H+}_k + R^{EHp}_k    (74)

Using the propagation and update equations derived above, a filter can be constructed. An example demonstrating this filter and comparing the results with a classical Kalman filter under three different circumstances is presented in the next section.
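To make the construction concrete, here is one propagate/update cycle of the filter, specialized to a single scalar measurement (m = 1), in which the Vec/Kronecker covariances collapse to an n × n matrix and the weighted covariances to scalar traces. This is a sketch, not the authors' implementation: the state covariance propagation (standard Kalman form) is assumed, since the excerpt derives only the sensitivity covariance equations; R^{EHm} and R^{EHp} are taken as scalar multiples of the identity; and all names are our own.

```python
import numpy as np

def tls_kf_step(xh, Hh, Ps, PH, y_meas, H_meas, u,
                Ak, Bk, Gk, Rsp, Rsm, REHm, REHp):
    """One propagate/update cycle of the Section VI filter for m = 1.
    xh, Hh: posterior state and sensitivity-row estimates at step k;
    Ps, PH: state and sensitivity error covariances (n x n each)."""
    n = xh.size
    # Propagation, eqs (30), (31), (74); state covariance: standard Kalman form
    xm = Ak @ xh + Bk * u
    Hm = Hh.copy()
    Psm = Ak @ Ps @ Ak.T + Rsp * np.outer(Gk, Gk)
    PHm = PH + REHp * np.eye(n)
    # Sensitivity gain L_k, eq (65): for m = 1, E[dH Xi dH^T] = trace(PH Xi)
    Xi = Psm + np.outer(xm, xm)                  # Xi_k = Ps^- + x^- x^-T
    a = np.trace(PHm @ Xi)                       # E[dH- Xi dH-^T]
    b = REHm * np.trace(Xi)                      # E[Em Xi Em^T]
    L = a / (a + b) if (a + b) > 0 else 0.0
    # Sensitivity update, eqs (34), (72)
    Hp = Hm + L * (H_meas - Hm)
    PHp = (1.0 - L) ** 2 * PHm + L ** 2 * REHm * np.eye(n)
    # Innovation variance, eqs (55)-(56), and state gain K_k, eq (60)
    d2 = (1.0 - L) ** 2 * a + L ** 2 * b         # E[dH+ Xi dH+^T]
    Delta = Hp @ Psm @ Hp + Rsm + d2 + b
    K = Psm @ Hp / Delta
    # State update, eqs (32), (33), and covariance update, eq (53)
    xp = xm + K * (y_meas - Hp @ xm)
    Psp = (Psm - np.outer(K, Hp @ Psm)
           - np.outer(Psm @ Hp, K) + Delta * np.outer(K, K))
    return xp, Hp, Psp, PHp
```

Driving this step with simulated measurements of a stable two-state system (truth H held constant, its measurements corrupted per equation (23)) shows the state error contracting while the covariance stays positive semidefinite.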

VII. Numerical Simulation

We now consider a simple example to evaluate the performance of the newly developed filter and the classical Kalman filter. The problem is a linear oscillator where only the position is available for measurement. The dynamics of the plant are given by

ẋ = [0 1; −2 −0.5] x + [0; 1] u + [0; 1] w    (75)

y = Cx = [1 0] x    (76)

The measurement model is given by

ỹ = Cx + v    (77)

C̃ = C + E^m_k    (78)

The filter is required to generate position and velocity estimates. Since the filter has to be compared with the classical Kalman filter, we implement both and base the Kalman gain calculation on the measured measurement sensitivity matrix C̃. The tuning parameters for the filter implementations are summarized for the three different cases in Table 1.

VIII. Conclusion and Future Directions

VIII.A. Conclusions

The least squares error criterion is generalized to account for errors in both the range space and the basis functions of a measurement model. This was shown to lead to nonlinear necessary conditions. The necessary conditions were then realized as solutions to an eigenvalue problem associated with the measurement matrix and the measurement vector. A novel weighted total least squares criterion was presented and the associated necessary conditions were derived. Several methods to solve this problem, including two novel methods, were presented. These were applied to a parameter identification problem for a morphing wing model developed at Texas A&M University.

11 of 19

American Institute of Aeronautics and Astronautics


Table 1. Continuous Discrete Constrained Attitude Filter

| Case Number                                           | 1                 | 2                 | 3                 |
| Process Noise (State), E(ww^T) = R_k^{sp}             | 10^{-6} I_{2x2}   | 10^{-4} I_{2x2}   | 10^{-2} I_{2x2}   |
| Measurement Noise (State), E(vv^T) = R_k^{sm}         | 10^{-4}           | 10^{-1}           | 10^{-1}           |
| Process Noise (Meas. Sensitivity), R_k^{E_H^p}        | 10^{-4} I_{2x2}   | 10^{-4} I_{2x2}   | 10^{-4} I_{2x2}   |
| Measurement Noise (Meas. Sensitivity), R_k^{E_H^m}    | 0 x I_{2x2}       | 10^{-4} I_{2x2}   | 10^{-4} I_{2x2}   |
| P^H_k(0)                                              | 10^{1} I_{2x2}    | 10^{1} I_{2x2}    | 10^{1} I_{2x2}    |
| P^x_k(0)                                              | 10^{2} I_{2x2}    | 10^{2} I_{2x2}    | 10^{2} I_{2x2}    |
| H(0)                                                  | [1, 1]^T          | [1, 1]^T          | [1, 1]^T          |
| x(0)                                                  | [-1, 1]^T         | [-1, 1]^T         | [-1, 1]^T         |
| Estimates vs. Truth                                   | Figure 6          | Figure 9          | Figure 14         |
| Absolute Value of Estimation Errors                   | Figure 7          | Figure 10         | Figure 15         |
| Estimation Errors and Covariance Bounds               | Figure 8          | Figure 11         | Figure 16         |
| Estimation Errors and Covariance Bounds (Magnified)   | NA                | Figure 12         | Figure 17         |
| Parameter Estimation Errors                           | NA                | Figure 13         | Figure 18         |

VIII.B. Future Directions

Most importantly, because of the large degree of design freedom, the method tunes itself to fit the measurements as closely as possible. While in some problems this captures physics not essentially modeled by linear least squares, in filtering problems it is not always desirable, since some kind of signal reconstruction is anticipated. Therefore, work is in progress toward modifying this error criterion so as to reduce the large over-parameterization. Smoother formulations for dynamical state estimation are also being considered. As mentioned in the paper, work is in progress to develop algorithms for the weighted total least squares criterion, whose necessary conditions do not form an eigenvalue problem. Static problems were dealt with in the above discussion; the extension to dynamical system state estimation is expected to yield improved filters where there is uncertainty in both the measurement and plant dynamics models. This is currently being investigated, and the developments so far are encouraging steps toward this goal.

Acknowledgments

The authors wish to acknowledge the support of the Texas Institute for Intelligent Bio-Nano Materials and Structures for Aerospace Vehicles, funded by NASA Cooperative Agreement No. NCC-1-02038.

References

1. Adcock, R. J., "A Problem in Least Squares," The Analyst, Vol. 5, Jan.-Feb. 1878, pp. 53-54.
2. Gleser, L. J., "Estimation in a Multivariate 'Errors in Variables' Regression Model: Large Sample Results," Annals of Statistics, Vol. 9, 1981, pp. 24-44.
3. Golub, G. H. and Van Loan, C. F., "An Analysis of the Total Least Squares Problem," SIAM Journal of Numerical Analysis, Vol. 17, 1980, pp. 883-893.
4. Van Huffel, S. and Vandewalle, J., The Total Least Squares Problem: Computational Aspects and Analysis, SIAM Publications, Philadelphia, 1991.
5. Villegas, C., "Maximum Likelihood Estimation of a Linear Functional Relationship," The Annals of Mathematical Statistics, Vol. 32, No. 4, Dec. 1961, pp. 1048-1062.
6. Mook, D. J. and Junkins, J. L., "Minimum Model Error Estimation for Poorly Modeled Dynamic Systems," Journal of Guidance, Control, and Dynamics, Vol. 3, No. 4, 1988, pp. 367-375.
7. Crassidis, J. L. and Junkins, J. L., Optimal Estimation of Dynamic Systems, Chapman and Hall/CRC Press, Boca Raton, FL, 2004.
8. Golub, G. H. and Van Loan, C. F., Matrix Computations, The Johns Hopkins University Press, Baltimore, MD, 3rd ed., 1996.
9. Riesz, F. and Sz.-Nagy, B., Functional Analysis, reprinted by Dover Publications, Mineola, NY, 1990.
10. Davidenko, D. F., "A New Method of Solution of a System of Nonlinear Equations," Dokl. Akad. Nauk SSSR, Vol. 88, 1953, p. 601.


[Plot: position and velocity vs. time, "Comparison of the Kalman filters in multilinear uncertainty environment"; legend: truth, Discrete Kalman Filter estimate, Total Kalman Filter estimate.]

Figure 6. Case 1: Estimates vs. Truth

11. Shuster, M. D. and Oh, S. D., "Three-Axis Attitude Determination from Vector Observations," Journal of Guidance and Control, Vol. 4, Jan.-Feb. 1981, pp. 70-77.
12. Bar-Itzhack, I. Y., "REQUEST: A Recursive QUEST Algorithm for Sequential Attitude Determination," Journal of Guidance, Control, and Dynamics, Vol. 19, No. 5, Sep.-Oct. 1996, pp. 1034-1038.
13. Junkins, J. L. and Kim, Y., Introduction to Dynamics and Control of Flexible Structures, AIAA Education Series, Washington, DC, 1993.
14. Majji, M., Rediniotis, O. K., and Junkins, J. L., "Design of a Morphing Wing: Modeling and Experiments," AIAA Guidance, Navigation and Control Conference, 2007 (submitted).


[Plot: position and velocity estimation errors vs. time (log scale), "Estimation Error in Two Filters"; legend: Estimation Error - DKF, Estimation Error - TDKF.]

Figure 7. Case 1: Estimation error

[Plot: position and velocity estimation errors vs. time, "Estimation Error and Error Covariance Comparison"; legend: Estimation Error KF, Error Covariance KF, Estimation Error TLSKF, Error Covariance TLSKF.]

Figure 8. Case 1: Estimation error and covariance bounds


[Plot: position and velocity vs. time, "Comparison of the Kalman filters in multilinear uncertainty environment"; legend: truth, Discrete Kalman Filter estimate, Total Kalman Filter estimate.]

Figure 9. Case 2: Estimates vs. Truth

[Plot: position and velocity estimation errors vs. time (log scale), "Estimation Error in Two Filters"; legend: Estimation Error - DKF, Estimation Error - TDKF.]

Figure 10. Case 2: Estimation error (log scale)


[Plot: position and velocity estimation errors vs. time, "Estimation Error and Error Covariance Comparison"; legend: Estimation Error KF, Error Covariance KF, Estimation Error TLSKF, Error Covariance TLSKF.]

Figure 11. Case 2: Estimation error and covariance bounds

[Plot: zoomed view of position and velocity estimation errors, "Estimation Error and Error Covariance Comparison"; legend: Estimation Error KF, Error Covariance KF, Estimation Error TLSKF, Error Covariance TLSKF.]

Figure 12. Case 2: Estimation error and covariance bounds, magnified view


[Plot: parameter estimation errors vs. time (log scale), "Measurement Sensitivity Matrix Parameter Estimation Errors"; legend: H(1,1), H(2,1).]

Figure 13. Case 2: Measurement Sensitivity Matrix element estimation errors

[Plot: position and velocity vs. time, "Comparison of the Kalman filters in multilinear uncertainty environment"; legend: truth, Discrete Kalman Filter estimate, Total Kalman Filter estimate.]

Figure 14. Case 3: Estimates vs. Truth


[Plot: position and velocity estimation errors vs. time (log scale), "Estimation Error in Two Filters"; legend: Estimation Error - DKF, Estimation Error - TDKF.]

Figure 15. Case 3: Estimation error (log scale)

[Plot: position and velocity estimation errors vs. time, "Estimation Error and Error Covariance Comparison"; legend: Estimation Error KF, Error Covariance KF, Estimation Error TLSKF, Error Covariance TLSKF.]

Figure 16. Case 3: Estimation error and covariance bounds


[Plot: zoomed view of position and velocity estimation errors, "Estimation Error and Error Covariance Comparison"; legend: Estimation Error KF, Error Covariance KF, Estimation Error TLSKF, Error Covariance TLSKF.]

Figure 17. Case 3: Estimation error and covariance bounds, magnified view

[Plot: parameter estimation errors vs. time (log scale), "Measurement Sensitivity Matrix Parameter Estimation Errors"; legend: H(1,1), H(2,1).]

Figure 18. Case 3: Measurement Sensitivity Matrix estimation errors
