knowledge representation and reasonning bayesian ... · knowledge representation and reasonning...
TRANSCRIPT
Knowledge Representation and Reasonning
Bayesian filterting - part 2
Elise [email protected]
December 1, 2011
Elise Arnaud [email protected] M2R
Overview
1. Introduction on HMM
2. Filtering : Problem Statement
3. Forward algorithm
4. Kalman filter
5. Particle filter
Elise Arnaud [email protected] M2R
Problem Statement
Dynamic system modeled as a Hidden Markov Chain
described by
1. state space X ; measurement space Z2. the initial distribution p(X0)
3. an evolution model (transition law)p(xk|x0:k−1, z1:k−1) = p(xk|xk−1)
4. an observation model (likelihood)p(zk|x0:k−1, z1:k−1) = p(zk|xk)
Elise Arnaud [email protected] M2R
Problem Statement
Filtering
� estimation of the state given the past and presentmeasurements
filtering distribution: p(xk|z1:k)
� This estimation has to be sequential ie :
p(xk−1|z1:k−1) → Algorithm → p(xk|z1:k)
Elise Arnaud [email protected] M2R
Forward algorithm
� Both X and Z are discrete and finite
� p(xk|xk−1) is embedded in the matrix Q
� p(zk|xk) is embedded in the matrix B
Elise Arnaud [email protected] M2R
Forward algorithm
Generalization
We define: αk(i) = P (z1:k,xk = i)
Initialization: α0(i) = µ(i)
Induction:
αk+1(i) =
N�
j=1
αk(j)qji
bi(zk+1)
Elise Arnaud [email protected] M2R
Viterbi algorithm
Generalization
We define: αk(i) = P (z1:k,xk = i)Initialization: α0(i) = µ(i)Induction:
αk+1(i) =
�maxj=1:N
αk(j) qji
�bi(zk+1)
ψk+1(i) = argmaxj
[αk(j) qji]
The retro-propagation enables to find the Viterbi pathx∗1, . . . ,x
∗K :
x∗K = argmax
j[αK(j)]
x∗k = ψk+1 [x∗
k+1]
Elise Arnaud [email protected] M2R
But ...
what to do when the state space and or the measurement space isinfinite or huge ?
� p(xk|xk−1) not any more described in the matrix Q
� p(zk|xk) not any more described in the matrix B
Elise Arnaud [email protected] M2R
But ...
what to do when the state space and or the measurement space isinfinite or huge ?
� p(xk|xk−1) not any more described in the matrix Q
but we have an expression that links xk−1 to xk :
xk = f(xk−1,wk)
� p(zk|xk) not any more described in the matrix B
but we have an expression that links zk to xk :
zk = h(xk,vk)
Elise Arnaud [email protected] M2R
But ...
what to do when the state space and or the measurement space isinfinite or huge ?
�xk = fk(xk−1,wk)zk = hk (xk,vk)
� if f and h are linear, additive with Gaussian noise:�
xk = Fkxk−1 +wk wk ∼ N (0;Qk)zk = Hk xk + vk vk ∼ N (0;Rk)
� then the solution lies in the Kalman filter
Elise Arnaud [email protected] M2R
1
Introduction to Kalman Filters
Michael Williams 5 June 2003
2
Overview
• The Problem – Why do we need Kalman Filters?
• What is a Kalman Filter? • Conceptual Overview • The Theory of Kalman Filter • Simple Example
3
The Problem
• System state cannot be measured directly • Need to estimate “optimally” from measurements
Measuring Devices Estimator
Measurement Error Sources
System State (desired but not known)
External Controls
Observed Measurements
Optimal Estimate of
System State
System Error Sources
System
Black Box
4
What is a Kalman Filter? • Recursive data processing algorithm • Generates optimal estimate of desired quantities
given the set of measurements • Optimal?
– For linear system and white Gaussian errors, Kalman filter is “best” estimate based on all previous measurements
– For non-linear system optimality is ‘qualified’
• Recursive? – Doesn’t need to store all previous measurements and
reprocess all data each time step
5
Conceptual Overview
• Simple example to motivate the workings of the Kalman Filter
• Theoretical Justification to come later – for now just focus on the concept
• Important: Prediction and Correction
6
Conceptual Overview
• Lost on the 1-dimensional line • Position – y(t) • Assume Gaussian distributed measurements
y
7
Conceptual Overview
• Sextant Measurement at t1: Mean = z1 and Variance = !z1
• Optimal estimate of position is: !(t1) = z1
• Variance of error in estimate: !2x (t1) = !2
z1
• Boat in same position at time t2 - Predicted position is z1
8
Conceptual Overview
• So we have the prediction !-(t2) • GPS Measurement at t2: Mean = z2 and Variance = !z2
• Need to correct the prediction due to measurement to get !(t2) • Closer to more trusted measurement – linear interpolation?
prediction !-(t2) measurement z(t2)
9
Conceptual Overview
• Corrected mean is the new optimal estimate of position • New variance is smaller than either of the previous two variances
measurement z(t2)
corrected optimal estimate !(t2)
prediction !-(t2)
10
Conceptual Overview
• Lessons so far: Make prediction based on previous data - !-, !-
Take measurement – zk, !z
Optimal estimate (!) = Prediction + (Kalman Gain) * (Measurement - Prediction)
Variance of estimate = Variance of prediction * (1 – Kalman Gain)
11
Conceptual Overview
• At time t3, boat moves with velocity dy/dt=u • Naïve approach: Shift probability to the right to predict • This would work if we knew the velocity exactly (perfect model)
!(t2) Naïve Prediction !-(t3)
12
Conceptual Overview
• Better to assume imperfect model by adding Gaussian noise • dy/dt = u + w • Distribution for prediction moves and spreads out
!(t2)
Naïve Prediction !-(t3)
Prediction !-(t3)
13
Conceptual Overview
• Now we take a measurement at t3
• Need to once again correct the prediction • Same as before
Prediction !-(t3)
Measurement z(t3)
Corrected optimal estimate !(t3)
14
Conceptual Overview • Lessons learnt from conceptual overview:
– Initial conditions (!k-1 and !k-1) – Prediction (!-
k , !-k)
• Use initial conditions and model (eg. constant velocity) to make prediction
– Measurement (zk) • Take measurement
– Correction (!k , !k) • Use measurement to correct prediction by ‘blending’
prediction and residual – always a case of merging only two Gaussians
• Optimal estimate with smaller variance
15
Theoretical Basis • Process to be estimated:
yk = Ayk-1 + Buk + wk-1
zk = Hyk + vk
Process Noise (w) with covariance Q
Measurement Noise (v) with covariance R
• Kalman Filter Predicted: !-
k is estimate based on measurements at previous time-steps
!k = !-k + K(zk - H !-
k )
Corrected: !k has additional information – the measurement at time k
K = P-kHT(HP-
kHT + R)-1
!-k = Ayk-1 + Buk
P-k = APk-1AT + Q
Pk = (I - KH)P-k
16
Blending Factor
• If we are sure about measurements: – Measurement error covariance (R) decreases to zero – K decreases and weights residual more heavily than prediction
• If we are sure about prediction – Prediction error covariance P-
k decreases to zero – K increases and weights prediction more heavily than residual
17
Theoretical Basis
!-k = Ayk-1 + Buk
P-k = APk-1AT + Q
Prediction (Time Update)
(1) Project the state ahead
(2) Project the error covariance ahead
Correction (Measurement Update)
(1) Compute the Kalman Gain
(2) Update estimate with measurement zk
(3) Update Error Covariance
!k = !-k + K(zk - H !-
k )
K = P-kHT(HP-
kHT + R)-1
Pk = (I - KH)P-k
18
Quick Example – Constant Model
Measuring Devices Estimator
Measurement Error Sources
System State
External Controls
Observed Measurements
Optimal Estimate of
System State
System Error Sources
System
Black Box
19
Quick Example – Constant Model
Prediction
!k = !-k + K(zk - H !-
k )
Correction
K = P-k(P-
k + R)-1
!-k = yk-1
P-k = Pk-1
Pk = (I - K)P-k
20
Quick Example – Constant Model
21
Quick Example – Constant Model
Convergence of Error Covariance - Pk
22
Quick Example – Constant Model Larger value of R – the measurement error covariance (indicates poorer quality of measurements)
Filter slower to ‘believe’ measurements – slower convergence
23
References 1. Kalman, R. E. 1960. “A New Approach to Linear Filtering and Prediction
Problems”, Transaction of the ASME--Journal of Basic Engineering, pp. 35-45 (March 1960).
2. Maybeck, P. S. 1979. “Stochastic Models, Estimation, and Control, Volume 1”, Academic Press, Inc.
3. Welch, G and Bishop, G. 2001. “An introduction to the Kalman Filter”, http://www.cs.unc.edu/~welch/kalman/
Beyond the Kalman filter
Non linear - Gaussian case
�xk = fk(xk−1) +wk wk ∼ N (0;Qk)zk = hk (xk) + vk vk ∼ N (0;Rk)
Extensions of the Kalman filter
Extended Kalman - [Jazwinski70]: Kalman filter applied on thelinearized systemUnscented Kalman filter [Julier 97]: Use of the “unscentedtransform”
Based on a Gaussian approximation of the filtering densityAdapted to weakly non linear and unimodal case
Elise Arnaud [email protected] M2R
Beyond the Kalman filter
Non linear - non Gaussian case
�xk = fk(xk−1,wk)zk = hk (xk,vk)
Grid based methods
Discretization of the state space on a grid [Kitagawa 87][Sorenson 88]Highly computational method, small dimension (d < 4)
Sequential Monte Carlo methods (particle fiter)
Independant of the dimension and of the non linearity of thesystem
Elise Arnaud [email protected] M2R
Overview
1. Introduction on HMM
2. Filtering : Problem Statement
3. Forward algorithm
4. Kalman filter
5. Particle filter
Elise Arnaud [email protected] M2R
Kalman filter vs Particle filter
Elise Arnaud [email protected] M2R
Kalman filter vs Particle filter
Elise Arnaud [email protected] M2R
Particle filter
p(xk|z1:k) is described with a set of weighted particles= set of possible solutions, with associated trust
Elise Arnaud [email protected] M2R
Sequential Monte Carlo methods
p(xk|z1:k) is described with a set of weighted particles
1 exploration of the state space
2 evaluation of the particles quality with respect to theobservations
3 mutation/selection of the particles
Elise Arnaud [email protected] M2R
Why particle filter ?
� Simple management of several possible solutions, even in highdimensional space
� Constraint-free modelizationenable the introducing of any a priori informationenable a simple fusion of different observations
� Multimodality, non linearity and non Gaussian noises are not aproblem ...
� Easy to implement
Elise Arnaud [email protected] M2R
Particle filter
advantages of particle filtering
� easy to implement, and to extand
� robust approach
� plenty of theoretical results
problems of particle filtering
� jitter of the final estimate
� computational cost
� short term multimodality
Elise Arnaud [email protected] M2R
Overview of existing solutions
� Both X and Z are discrete and finite → Forward algorithm
� Otherwise
� Linear Gaussian model → Kalman filter
� weakly non linear, Gaussian → Extensions of Kalman filter
� non linear, non Gaussian → Particle filter (Sequential MonteCarlo methods)
A good algorithm does not fix a weak model and vice versa
Elise Arnaud [email protected] M2R