part 9 - model validation and structure selection

Upload: farideh

Post on 05-Jul-2018

219 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    1/41

    System IdentificationControl Engineering B.Sc., 3rd yearTechnical University of Cluj-Napoca

    Romania

    Lecturer: Lucian Buşoniu

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    2/41

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    3/41

    Model validation   Structure selection   Overparametrization

    Recall: Importance of validation

    Model validation is a crucial step: the model must be good enough(for our purposes).

    If validation is unsuccessful, previous steps in the workflow must beredone – e.g., the model structure can be changed.

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    4/41

    Model validation   Structure selection   Overparametrization

    Motivation

    So far, we validated and selected models mostly in a heuristic way, byexamining plots – using common sense .

    Next, some mathematically well-founded tests will be given.

    However, common sense remains indispensable – mathematical testswork under assumptions that may not always be satisfied.

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    5/41

    Model validation   Structure selection   Overparametrization

    Focus: Prediction error methods

    This lecture focuses on single-output  models obtained by prediction error methods .

    Some of the tests can be extended to other settings.

    M d l lid ti St t l ti O t i ti

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    6/41

    Model validation   Structure selection   Overparametrization

    Table of contents

    1   Model validation with correlation tests

    Basic correlation tests

    Matlab example

    Global correlation tests

    2   Structure selection

    3   Testing for overparametrization

    Model validation Structure selection Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    7/41

    Model validation   Structure selection   Overparametrization

    Whiteness: Intuition

    Recall general PEM model structure:

    y (k ) = G (q −1)u (k ) + H (q −1)e (k )

    where e (k ) is  assumed  to be zero-mean white noise.

    PEM are derived so that the prediction errorε(k ) = y (k )− y (k ) = e (k ). If the system satisfies the model structure(so the assumption holds), and moreover the model is accurate, thenε(k ) is also zero-mean white noise.

    Whiteness hypothesis

    (W)   The prediction errors ε(k ) are zero-mean white noise.

    Model validation Structure selection Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    8/41

    Model validation   Structure selection   Overparametrization

    Independence of past inputs: Intuition

    y (k ) = G (q −1)u (k ) + v (k )

    If the model is accurate, it entirely explains the influence of inputs

    u (k ) on current and future outputs  y (k  + τ ). Therefore, the errorsε(k  + τ ) = y (k  + τ )− y (k  + τ ) are only influenced by thedisturbances v , and are independent  of inputs u (k ).

    Independence hypothesis 1

    (I1)   The prediction errors ε(k  + τ ) are independent of inputs  u (k ) forτ  ≥ 0 (i.e., of past and current inputs).

    Model validation Structure selection Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    9/41

    Model validation   Structure selection   Overparametrization

    Independence of all inputs: Intuition

    y (k ) = G (q −1)u (k ) + v (k )

    If the experiment is closed-loop, u (k ) depends on past outputsy (k  − j ) and this will lead to a correlation of ε(k  − j ) with u (k ) (notethe independence of ε(k  + τ ) and  u (k ) is unaffected). If open-loop,

    then ε(k  − j ) is also independent from u (k ).

    Independence hypothesis 2

    (I2)   The prediction errors ε(k  + τ ) are independent of u (k ) for any  τ (i.e., of all inputs).

    Model validation Structure selection Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    10/41

    Model validation   Structure selection   Overparametrization

    All hypotheses

    (W)   The prediction errors ε(k ) are zero-mean white noise.

    (I1)   The prediction errors ε(k  + τ ) are independent of u (k ) for τ  ≥ 0(i.e., of past and current inputs).

    (I2)   The prediction errors ε(k  + τ ) are independent of u (k ) for any  τ (i.e., of all inputs).

    A good model should satisfy (W) and (I1), and if there is no feedback,also (I2).

    We will develop tests that allow to either accept or reject thesehypotheses for a given model, and therefore validate or reject themodel.

    Model validation   Structure selection   Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    11/41

    p

    Whiteness: Correlations

    Recall the correlation function (equal to the covariance in zero-meancase):

    r ε(τ ) = E {ε(k  + τ )ε(k )}

    If ε(k ) is zero-mean white noise:

    The correlation function is zero, r ε(τ ) = 0 for any nonzero τ .

    At zero, r ε(0) is the variance σ2 of the white noise.

    Model validation   Structure selection   Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    12/41

    p

    Whiteness: Correlations from data

    Estimating correlations from data:

     r ε(τ ) =  1

    N −τ 

    k =1 ε(k  + τ )ε(k )Normalization by the (estimated) variance:

    x (τ ) = r ε(τ )

     r ε(0)Then, x (τ ) should be small for nonzero τ .

    Model validation   Structure selection   Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    13/41

    Whiteness test at   τ 

    For any Gaussian distribution N (µ, σ2), the probability of drawing asample in the interval µ − 1.96σ, µ + 1.96σ  is found (from theGaussian formula) to be 0.95.

    It can be shown that for large  N , x (τ ) is distributed according to

    Gaussian N (0,  1N  ), and therefore that:

    P (|x (τ )| ≤   1.96√ N 

    ) = 0.95

    Model validation   Structure selection   Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    14/41

    Whiteness test at   τ   (continued)

    Whiteness test at τ 

    If |x (τ )| ≤   1.96√ N 

     , then the whiteness hypothesis r ε(τ ) = 0 is accepted.

    Otherwise, the whiteness hypothesis is rejected.

    Intuition: If the test succeeds, then the hypothesis is likely to be true.

    (Mathematically, the probability of rejecting the hypothesis when it isin fact true is 0.05.)

    In practice, we require the hypothesis to hold at all  τ  = 0 supportedby the data.

    Model validation   Structure selection   Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    15/41

    Independence: Correlations

    To verify independence of ε  from u , use cross-correlation  function:

    r εu (τ ) =  E {ε(k  + τ )u (k )}

    1 If (I1) is true, then r εu (τ ) for τ 

     ≥0.

    2 If (I2) is true, then r εu (τ ) for any τ .

    Estimation from data and normalization:

     r εu (τ ) = 1N 

    N −τ k =1   ε(k  + τ )u (k )   if τ  ≥ 0

    1N N k =1−τ  ε(k  + τ )u (k )   if τ

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    16/41

    Independence test at   τ 

    Like before, for large N , x (τ ) is distributed according to Gaussian N (0,   1N 

     ), and therefore:

    P (|x (τ )| ≤   1.96√ N 

    ) = 0.95

    Independence test at τ 

    If |x (τ )| ≤   1.96√ N 

     , then the independence hypothesis r εu (τ ) = 0 is

    accepted.Otherwise, the independence hypothesis is rejected.

    In practice, we require the hypothesis to hold at all  τ  ≥ 0 supportedby the data. If the model is accurate, then checking it at τ

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    17/41

    Table of contents

    1   Model validation with correlation tests

    Basic correlation tests

    Matlab example

    Global correlation tests

    2   Structure selection

    3   Testing for overparametrization

    Model validation   Structure selection   Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    18/41

    Matlab example: Experimental data

    The real system is in output-error form:

    y (k ) =  B (q −1)F (q −1)

    u (k ) + e (k )

    and has order n  = 3.

    plot(id);  and  plot(val);

    Model validation   Structure selection   Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    19/41

    Matlab: ARX model

    First, we try an ARX model:mARX = arx(id, [3, 3, 1]);   resid(mARX, id);

    The model fails the whiteness test (W). This is because the system isnot within the model class, leading to an inaccurate model. Themodel is rejected.

    Model validation   Structure selection   Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    20/41

    Matlab: ARX model (continued)

    Simulating on the validation data confirms the fact that the model ispoor.

    Model validation   Structure selection   Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    21/41

    Matlab: OE model

    mOE = oe(id, [3, 3, 1]);   resid(mOE, id);

    The OE model passes all the tests – as expected because the OEmodel class contains the real system. The model is thereforevalidated.

    Important note: The Matlab functions use 0.99 probability instead of

    0.95.

    Model validation   Structure selection   Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    22/41

    Matlab: OE model (continued)

    Simulating the model on the validation data confirms the model hasgood quality.

    Model validation   Structure selection   Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    23/41

    Table of contents

    1   Model validation with correlation tests

    Basic correlation tests

    Matlab example

    Global correlation tests

    2   Structure selection

    3   Testing for overparametrization

    Model validation   Structure selection   Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    24/41

    Chi-squared distribution

    Let  x  be a vector of m  random variables x i  so that each x i   is hasGaussian distribution N (0, 1).Define χ2(m ) as the distribution of the sum

    m i =1 x 

    2i    =  x 

    x . This is

    called the Chi-squared distribution  (Matlab:   chi2pdf).

    Image credit: Wikipedia

    Model validation   Structure selection   Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    25/41

    Whiteness test

    Define r  = [ r ε(1), . . . , r ε(m )], and choose a small α, e.g. 0.05.Define β (α) to be the value β  so that a random variable z  ≥ 0distributed with χ2(m ) has probability 1 − α of being at most β :

    P (z  ≤ β ) = 1 − α

    (Compare:  P (|x (τ )|) ≤   1.96√ N 

     ) = 0.95 for the single-τ   case.)

    Whiteness test

    If N   r r 

    br 2ε(0)

     ≤β (α), then the hypothesis that the sequence  ε(k ) is white

    noise is accepted.Otherwise, the hypothesis is rejected.

    (Compare: |x (τ )| ≤   1.96√ N 

     )

    Model validation   Structure selection   Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    26/41

    Independence test

    Can be extended in a similar way, formalism is complicated and willnot be given here.

    Model validation   Structure selection   Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    27/41

    Table of contents

    1   Model validation with correlation tests

    2   Structure selection

    3   Testing for overparametrization

    Model validation   Structure selection   Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    28/41

    Consider we are given several model structures M1,M2, . . . ,M.Example: ARX structures of increasing order.

    How to choose among them?

    First idea: choose Mi  leading to the smallest mean squared error:

    V ( θ) =   1N 

    N k =1

    ε(k )2

    This does not take into account the complexity of the model, so it maylead to overfitting. We explore other options (without going into their

    derivation).

    Model validation   Structure selection   Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    29/41

    Akaike’s information criterion

    W AIC = N  log V (

     θ) + 2p ,   or equivalently: log V (

     θ) +

     2p 

    where N  is the number of data points and  p  the number ofparameters (e.g., na  + nb  in ARX).

    Choice: Model with smallest W AIC.

    Intuition: The term 2p  penalizes the complexity of the model (numberof parameters). Taking the logarithm of the MSE allows to better take

    into account small values of the MSE.

    Model validation   Structure selection   Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    30/41

    Final prediction error

    W FPE  = V (

     θ)

    1 + p /N 

    1− p /N 

    Choice: Model with smallest W FPE.

    Intuition: When N   is large:

    V (

     θ)

    1 + p /N 

    1− p /N   = V ( θ)(1 +

      2p /N 

    1− p /N ) ≈ V ( θ)(1 +

     2p 

    N  )

    The term   2p N 

      penalizes model complexity.

    Model validation   Structure selection   Overparametrization

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    31/41

    F-test

    Take just two model classes M1,M2  so that M1 ⊂M2  (e.g., theyare both ARX models and M2  has larger na  and  nb ).Choice: M1  is chosen over M2  (M2  does not offer a significantimprovement) if:

    V 1( θ1)− V 2( θ2)V 2( θ2) ≤ β (α)β (α) is defined like before using the Chi-squared distribution, as thevalue β  so that a random variable z  ≥ 0 distributed with χ2(p 2 − p 1)has probability 1− α of being at most β :

    P (z  ≤ β ) = 1 − α

    Subscripts 1 and 2 refer to the respective model classes. E.g.  p 2   isthe number of parameters of M2, and is larger than p 1.

    Model validation   Structure selection   Overparametrization

    M l b l

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    32/41

    Matlab example

    An OE system with n  = 2.

    Model validation   Structure selection   Overparametrization

    M tl b ith AIC

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    33/41

    Matlab:   selstruc with AIC

    Recall arxstruc:

    Na = 1:15; Nb = 1:15; Nk = 1:5;

    NN = struc(Na, Nb, Nk); V = arxstruc(id, val, NN);

    struc generates all combinations of orders in  Na, Nb, Nk.

    arxstruc identifies for each combination an ARX model on thedata id, simulates it on the data  val, and returns information

    about the MSEs, model orders etc. in  V.

    Model validation   Structure selection   Overparametrization

    M tl b ith AIC ( ti d)

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    34/41

    Matlab:   selstruc with AIC (continued)

    To choose the structure with the Akaike’s information criterion:N = selstruc(V, ’aic’);

    For our data,  N= [8, 8, 1].

    Alternatively, graphical selection also allows using AIC:N = selstruc(V, ’plot’);

    Model validation   Structure selection   Overparametrization

    M tl b R lt

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    35/41

    Matlab: Results

    Model validation   Structure selection   Overparametrization

    Remarks

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    36/41

    Remarks

    AIC, FPE also work if the system is not in the model class.

    Matlab offers functions  aic, fpe that compute these criteria for alist of models.

    Model validation   Structure selection   Overparametrization

    Table of contents

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    37/41

    Table of contents

    1   Model validation with correlation tests

    2

      Structure selection

    3   Testing for overparametrization

    Model validation   Structure selection   Overparametrization

    Motivation

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    38/41

    Motivation

    Consider the real system  obeys the ARMAX structure:

    A0(q −1)y (k ) = B 0(q −1)u (k ) + C 0(q −1)e (k )

    where subscript 0 indicates quantities related to the real system.

    This is equivalent to the model:

    M (q −1)A0(q −1)y (k ) = M (q −1)B 0(q −1)u (k ) + M (q −1)C 0(q −1)e (k )

    with M (q −1) some polynomial of order  nm .

    So, using ARMAX identification with  na  =  na 0 + nm , nb  = nb 0 + nm ,nc  = nc 0 + nm  will produce an accurate model. This model ishowever too complicated (overparametrized), and will have some(nearly) common factors M (q −1) in all polynomials.

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    39/41

    Model validation   Structure selection   Overparametrization

    Matlab: overparameterized OE model

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    40/41

    Matlab: overparameterized OE model

    On the same data as for correlation tests (recall system has ordern  =  3):

    mOE = oe(id, [5, 5, 1]);

    Looking at the validation data, the model is accurate:

    Model validation   Structure selection   Overparametrization

    Matlab: testing for pole-zero cancellations

  • 8/15/2019 Part 9 - Model Validation and Structure Selection

    41/41

    Matlab: testing for pole-zero cancellations

    pzmap(mOE, ’sd’, 1.96);

    Arguments ’sd’, nsd

     specify the confidence region as a number ofstandard deviations (nsd taken here 1.96 to get 0.95 confidence).

    Two pairs of poles and zeros have overlapping confidence regions ⇒likely they are canceling each other. This indicates the order should

    be 3 (the true system order).