knowledge representation and reasonning bayesian ... · knowledge representation and reasonning...

Knowledge Representation and Reasonning

Bayesian filterting - part 2

Elise [email protected]

December 1, 2011

Elise Arnaud [email protected] M2R

Overview

1. Introduction on HMM

2. Filtering : Problem Statement

3. Forward algorithm

4. Kalman filter

5. Particle filter


Problem Statement

Dynamic system modeled as a Hidden Markov Chain

described by

1. state space X ; measurement space Z2. the initial distribution p(X0)

3. an evolution model (transition law)p(xk|x0:k−1, z1:k−1) = p(xk|xk−1)

4. an observation model (likelihood)p(zk|x0:k−1, z1:k−1) = p(zk|xk)


Problem Statement

Filtering

� estimation of the state given the past and presentmeasurements

filtering distribution: p(xk|z1:k)

� This estimation has to be sequential ie :

p(xk−1|z1:k−1) → Algorithm → p(xk|z1:k)


Forward algorithm

� Both X and Z are discrete and finite

� p(xk|xk−1) is embedded in the matrix Q

� p(zk|xk) is embedded in the matrix B


Forward algorithm

Generalization

We define: αk(i) = P (z1:k,xk = i)

Initialization: α0(i) = µ(i)

Induction:

αk+1(i) =

N�

j=1

αk(j)qji

bi(zk+1)


Viterbi algorithm

Generalization

We define: αk(i) = P (z1:k,xk = i)Initialization: α0(i) = µ(i)Induction:

αk+1(i) =

�maxj=1:N

αk(j) qji

�bi(zk+1)

ψk+1(i) = argmaxj

[αk(j) qji]

The retro-propagation enables to find the Viterbi pathx∗1, . . . ,x

∗K :

x∗K = argmax

j[αK(j)]

x∗k = ψk+1 [x∗

k+1]


But ...

what to do when the state space and or the measurement space isinfinite or huge ?

� p(xk|xk−1) not any more described in the matrix Q

� p(zk|xk) not any more described in the matrix B


But ...


� p(xk|xk−1) not any more described in the matrix Q

but we have an expression that links xk−1 to xk :

xk = f(xk−1,wk)

� p(zk|xk) not any more described in the matrix B

but we have an expression that links zk to xk :

zk = h(xk,vk)


But ...


�xk = fk(xk−1,wk)zk = hk (xk,vk)

� if f and h are linear, additive with Gaussian noise:�

xk = Fkxk−1 +wk wk ∼ N (0;Qk)zk = Hk xk + vk vk ∼ N (0;Rk)

� then the solution lies in the Kalman filter


1

Introduction to Kalman Filters

Michael Williams 5 June 2003

2

Overview

•  The Problem – Why do we need Kalman Filters?

•  What is a Kalman Filter? •  Conceptual Overview •  The Theory of Kalman Filter •  Simple Example

3

The Problem

•  System state cannot be measured directly •  Need to estimate “optimally” from measurements

Measuring Devices Estimator

Measurement Error Sources

System State (desired but not known)

External Controls

Observed Measurements

Optimal Estimate of

System State

System Error Sources

System

Black Box

4

What is a Kalman Filter? •  Recursive data processing algorithm •  Generates optimal estimate of desired quantities

given the set of measurements •  Optimal?

–  For linear system and white Gaussian errors, Kalman filter is “best” estimate based on all previous measurements

–  For non-linear system optimality is ‘qualified’

•  Recursive? –  Doesn’t need to store all previous measurements and

reprocess all data each time step

5

Conceptual Overview

•  Simple example to motivate the workings of the Kalman Filter

•  Theoretical Justification to come later – for now just focus on the concept

•  Important: Prediction and Correction

6

Conceptual Overview

•  Lost on the 1-dimensional line •  Position – y(t) •  Assume Gaussian distributed measurements

y

7

Conceptual Overview

•  Sextant Measurement at t1: Mean = z1 and Variance = !z1

•  Optimal estimate of position is: !(t1) = z1

•  Variance of error in estimate: !2x (t1) = !2

z1

•  Boat in same position at time t2 - Predicted position is z1

8

Conceptual Overview

•  So we have the prediction !-(t2) •  GPS Measurement at t2: Mean = z2 and Variance = !z2

•  Need to correct the prediction due to measurement to get !(t2) •  Closer to more trusted measurement – linear interpolation?

prediction !-(t2) measurement z(t2)

9

Conceptual Overview

•  Corrected mean is the new optimal estimate of position •  New variance is smaller than either of the previous two variances

measurement z(t2)

corrected optimal estimate !(t2)

prediction !-(t2)

10

Conceptual Overview

•  Lessons so far: Make prediction based on previous data - !-, !-

Take measurement – zk, !z

Optimal estimate (!) = Prediction + (Kalman Gain) * (Measurement - Prediction)

Variance of estimate = Variance of prediction * (1 – Kalman Gain)

11

Conceptual Overview

•  At time t3, boat moves with velocity dy/dt=u •  Naïve approach: Shift probability to the right to predict •  This would work if we knew the velocity exactly (perfect model)

!(t2) Naïve Prediction !-(t3)

12

Conceptual Overview

•  Better to assume imperfect model by adding Gaussian noise •  dy/dt = u + w •  Distribution for prediction moves and spreads out

!(t2)

Naïve Prediction !-(t3)

Prediction !-(t3)

13

Conceptual Overview

•  Now we take a measurement at t3

•  Need to once again correct the prediction •  Same as before

Prediction !-(t3)

Measurement z(t3)

Corrected optimal estimate !(t3)

14

Conceptual Overview •  Lessons learnt from conceptual overview:

–  Initial conditions (!k-1 and !k-1) –  Prediction (!-

k , !-k)

•  Use initial conditions and model (eg. constant velocity) to make prediction

–  Measurement (zk) •  Take measurement

–  Correction (!k , !k) •  Use measurement to correct prediction by ‘blending’

prediction and residual – always a case of merging only two Gaussians

•  Optimal estimate with smaller variance

15

Theoretical Basis •  Process to be estimated:

yk = Ayk-1 + Buk + wk-1

zk = Hyk + vk

Process Noise (w) with covariance Q

Measurement Noise (v) with covariance R

•  Kalman Filter Predicted: !-

k is estimate based on measurements at previous time-steps

!k = !-k + K(zk - H !-

k )

Corrected: !k has additional information – the measurement at time k

K = P-kHT(HP-

kHT + R)-1

!-k = Ayk-1 + Buk

P-k = APk-1AT + Q

Pk = (I - KH)P-k

16

Blending Factor

•  If we are sure about measurements: –  Measurement error covariance (R) decreases to zero –  K decreases and weights residual more heavily than prediction

•  If we are sure about prediction –  Prediction error covariance P-

k decreases to zero –  K increases and weights prediction more heavily than residual

17

Theoretical Basis

!-k = Ayk-1 + Buk

P-k = APk-1AT + Q

Prediction (Time Update)

(1) Project the state ahead

(2) Project the error covariance ahead

Correction (Measurement Update)

(1) Compute the Kalman Gain

(2) Update estimate with measurement zk

(3) Update Error Covariance

!k = !-k + K(zk - H !-

k )

K = P-kHT(HP-

kHT + R)-1

Pk = (I - KH)P-k

18

Quick Example – Constant Model

Measuring Devices Estimator

Measurement Error Sources

System State

External Controls

Observed Measurements

Optimal Estimate of

System State

System Error Sources

System

Black Box

19


Prediction

!k = !-k + K(zk - H !-

k )

Correction

K = P-k(P-

k + R)-1

!-k = yk-1

P-k = Pk-1

Pk = (I - K)P-k

20


21


Convergence of Error Covariance - Pk

22

Quick Example – Constant Model Larger value of R – the measurement error covariance (indicates poorer quality of measurements)

Filter slower to ‘believe’ measurements – slower convergence

23

References 1.  Kalman, R. E. 1960. “A New Approach to Linear Filtering and Prediction

Problems”, Transaction of the ASME--Journal of Basic Engineering, pp. 35-45 (March 1960).

2.  Maybeck, P. S. 1979. “Stochastic Models, Estimation, and Control, Volume 1”, Academic Press, Inc.

3.  Welch, G and Bishop, G. 2001. “An introduction to the Kalman Filter”, http://www.cs.unc.edu/~welch/kalman/

Beyond the Kalman filter

Non linear - Gaussian case

�xk = fk(xk−1) +wk wk ∼ N (0;Qk)zk = hk (xk) + vk vk ∼ N (0;Rk)

Extensions of the Kalman filter

Extended Kalman - [Jazwinski70]: Kalman filter applied on thelinearized systemUnscented Kalman filter [Julier 97]: Use of the “unscentedtransform”

Based on a Gaussian approximation of the filtering densityAdapted to weakly non linear and unimodal case


Beyond the Kalman filter

Non linear - non Gaussian case

�xk = fk(xk−1,wk)zk = hk (xk,vk)

Grid based methods

Discretization of the state space on a grid [Kitagawa 87][Sorenson 88]Highly computational method, small dimension (d < 4)

Sequential Monte Carlo methods (particle fiter)

Independant of the dimension and of the non linearity of thesystem


Overview

1. Introduction on HMM

2. Filtering : Problem Statement

3. Forward algorithm

4. Kalman filter

5. Particle filter


Kalman filter vs Particle filter


Particle filter

p(xk|z1:k) is described with a set of weighted particles= set of possible solutions, with associated trust


Sequential Monte Carlo methods

p(xk|z1:k) is described with a set of weighted particles

1 exploration of the state space

2 evaluation of the particles quality with respect to theobservations

3 mutation/selection of the particles


Why particle filter ?

� Simple management of several possible solutions, even in highdimensional space

� Constraint-free modelizationenable the introducing of any a priori informationenable a simple fusion of different observations

� Multimodality, non linearity and non Gaussian noises are not aproblem ...

� Easy to implement


Particle filter

advantages of particle filtering

� easy to implement, and to extand

� robust approach

� plenty of theoretical results

problems of particle filtering

� jitter of the final estimate

� computational cost

� short term multimodality


Overview of existing solutions

� Both X and Z are discrete and finite → Forward algorithm

� Otherwise

� Linear Gaussian model → Kalman filter

� weakly non linear, Gaussian → Extensions of Kalman filter

� non linear, non Gaussian → Particle filter (Sequential MonteCarlo methods)

A good algorithm does not fix a weak model and vice versa


knowledge representation and reasonning bayesian ... · knowledge representation and reasonning...

Documents