
Using Prior Information in Bayesian Inference - with Application to Fault Diagnosis

Anna Pernestål and Mattias Nyberg

Department of Electrical Engineering, Linköping University, Sweden

Scania CV AB, Sweden

MaxEnt 2007, Saratoga Springs, 8–13 July

Outline

• Motivation: The Fault Diagnosis Problem

• Problem Formulation

• Our Approach

• Small Example

• Conclusions

Motivation: Automotive Fault Diagnosis

Go to workshop?

Stop immediately?

Ignore and go on?

Motivation: Automotive Fault Diagnosis

Why Fault Diagnosis?

Safety

Uptime

Fuel Consumption

Environmental Issues (Emissions)

Guidance at the workshop

The Diagnosis Problem

[Figure: block diagram. Observations, x, from the system under diagnosis pass through pre-processing to the diagnosis system, which outputs the probability of faults, c.]

Complex system: several hundreds of faults and observations.

Uncertainty due to noise, missing information, lack of understanding of the system under diagnosis.

No model of the probabilistic relations between observations and faults available.

Training data from some, but not all faults.

Training data are collected by implementing faults and running the system.

Prior probabilities of faults are known.

Engineering skills and prior knowledge may be available.

Observations have different characteristics: sensor readings, model-based diagnostic tests, and "ad hoc tests" constructed by engineers.

The Diagnosis Problem

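Example: catalyst diagnosis

• Gas flow in a catalyst, with temperatures $T_i$ and $T_o$ related by $T_o = g(T_i)$

• Sensor readings $T_{S1}$ and $T_{S2}$:

$$T_{S1} = T_i + w_i, \qquad T_{S2} = T_o + w_o$$

• Two tests:

$$x_1 = \begin{cases} -1, & T_{S1} < T_{low} \\ 0, & T_{low} \le T_{S1} \le T_{high} \\ 1, & T_{S1} > T_{high} \end{cases}$$

$$x_2 = \begin{cases} 0, & |T_{S2} - g(T_{S1})| \le \text{threshold} \\ 1, & \text{otherwise} \end{cases}$$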
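The two catalyst tests can be sketched in a few lines of Python. This is only an illustration: the function names, the model `g_assumed`, the threshold values and the sign convention for x1 (-1 below the range, 1 above it) are assumptions made for the sketch, not values from the slides.

```python
def range_test(t_s1, t_low, t_high):
    """Test x1: is the first sensor reading inside its expected range?"""
    if t_s1 < t_low:
        return -1   # assumed convention: below the range
    if t_s1 > t_high:
        return 1    # assumed convention: above the range
    return 0        # inside the range, no alarm


def residual_test(t_s1, t_s2, g, tolerance):
    """Test x2: does the second reading agree with the model g(T_S1)?"""
    return 0 if abs(t_s2 - g(t_s1)) <= tolerance else 1


def g_assumed(t_in):
    """Hypothetical catalyst model, used only to make the sketch runnable."""
    return 0.9 * t_in + 50.0


# Hypothetical readings: the first sensor is in range, the second disagrees with the model.
x1 = range_test(400.0, t_low=200.0, t_high=600.0)
x2 = residual_test(400.0, 445.0, g_assumed, tolerance=10.0)
print(x1, x2)
```

The observation vector x = (x1, x2) produced this way is what the diagnosis system receives after pre-processing.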

Prior Response Knowledge


            C = c1      C = c2    C = c3    C = c4
            No Fault    Ts1       Ts2       Catalyst

x1 = -1     0           √         0         0

x1 = 0      √           √         √         √

x1 = 1      0           √         0         0

x2 = 0      √           √         √         √

x2 = 1      0           √         √         √

This means that some values of the observations are impossible under some faults.

Assume that the thresholds are such that the probabilities for false alarms are zero (in practice)

Simple in Bayesian framework!
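A minimal sketch of how the response table enters a Bayesian computation: every fault under which an observed value is impossible gets likelihood zero, and the remaining prior mass is renormalised. The prior probabilities below are made up for illustration, and giving every allowed combination the placeholder likelihood 1 is a simplifying assumption (in the full method these likelihoods are learned from training data).

```python
FAULTS = ["c1 (no fault)", "c2 (Ts1)", "c3 (Ts2)", "c4 (catalyst)"]

# 1 = value possible under the fault, 0 = impossible (from the response table).
POSSIBLE = {
    ("x1", -1): [0, 1, 0, 0],
    ("x1", 0): [1, 1, 1, 1],
    ("x1", 1): [0, 1, 0, 0],
    ("x2", 0): [1, 1, 1, 1],
    ("x2", 1): [0, 1, 1, 1],
}

# Hypothetical prior fault probabilities p(c | I); not taken from the slides.
PRIOR = [0.97, 0.01, 0.01, 0.01]


def posterior(observed):
    """Posterior over faults using only the 0/1 response knowledge."""
    weights = list(PRIOR)
    for key in observed.items():          # key is an (observation, value) pair
        weights = [w * m for w, m in zip(weights, POSSIBLE[key])]
    total = sum(weights)
    if total == 0:
        raise ValueError("observation is impossible under every fault")
    return [w / total for w in weights]


# x1 = 1 is only possible under c2, so c2 gets all the probability mass.
print(dict(zip(FAULTS, posterior({"x1": 1, "x2": 1}))))
# x2 = 1 alone rules out "no fault" and leaves c2, c3 and c4 equally likely here.
print(dict(zip(FAULTS, posterior({"x1": 0, "x2": 1}))))
```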

Prior Causality Knowledge


        C = c1      C = c2    C = c3    C = c4
        No Fault    Ts1       Ts2       Catalyst

x1      0           •         0         0

x2      0           •         •         •

We know that some observations are not affected by certain faults.

$$p(x_1 \mid c_1, I_C) = p(x_1 \mid c_3, I_C)$$

Summary of Problem Formulation

• We have

  – data from some faults

  – prior knowledge about causality

• Determine the probability of different faults

• Previous works use either prior information or data.

Now: Be Bayesian and combine training data and prior knowledge!

Notation

• Observation vector X = x, with x = (x1, x2, …, xm)

• State of the system C, with values c1, c2, …

• Z = (X, C), Zd = {1, 2, 3, …, K}

• Training data, D

• State of knowledge I

Sought: $p(C \mid X, D, I) \propto p(Z \mid D, I)$, with the known prior $p(C \mid I)$.
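Because Z = (X, C), the probability of each system state for an observed x follows from the joint probabilities by normalising over C. A small sketch, in which the joint table and the function name `class_posterior` are hypothetical and chosen only for illustration:

```python
def class_posterior(joint, x):
    """Return p(C = c | X = x) from a table of joint probabilities.

    `joint` maps (x, c) pairs to p(X = x, C = c | D, I).
    """
    scores = {c: p for (x_value, c), p in joint.items() if x_value == x}
    total = sum(scores.values())
    if total == 0:
        raise ValueError("observation has zero probability under the table")
    return {c: p / total for c, p in scores.items()}


# Hypothetical joint distribution for one binary observation and two states.
joint = {
    (0, "c1"): 0.4,
    (0, "c2"): 0.1,
    (1, "c1"): 0.1,
    (1, "c2"): 0.4,
}

print(class_posterior(joint, 0))
```

The rest of the method is about obtaining the joint probabilities from training data and prior knowledge.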

Training Data Only

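Assume that

$$p(Z = k \mid \theta, I) = \theta_k, \qquad \theta_k > 0, \qquad \sum_{k=1}^{K} \theta_k = 1$$

with a Dirichlet distribution as prior for $\theta$:

$$f(\theta \mid I) = \frac{\Gamma\left(\sum_{k=1}^{K} \alpha_k\right)}{\prod_{k=1}^{K} \Gamma(\alpha_k)} \prod_{k=1}^{K} \theta_k^{\alpha_k - 1}, \qquad \alpha_k > 0$$

Then

$$p(Z = k \mid D, I) = \frac{n_k + \alpha_k}{N + A}$$

where

$$N = \sum_{k=1}^{K} n_k \ \text{(counts of training samples)}, \qquad A = \sum_{k=1}^{K} \alpha_k \ \text{(hypothetical samples)}$$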
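The training-data-only result is the standard predictive probability under a Dirichlet prior: observed counts plus hypothetical counts, divided by their totals. A short sketch with exact fractions follows; the counts are invented for illustration (15 samples spread over 8 cells, with no samples in the odd-numbered cells) and the uniform choice of one hypothetical sample per cell is an assumption.

```python
from fractions import Fraction


def predictive(counts, alphas):
    """p(Z = k | D, I) = (n_k + alpha_k) / (N + A) for every cell k."""
    n_total = sum(counts)   # N: number of training samples
    a_total = sum(alphas)   # A: number of hypothetical samples
    return [Fraction(n + a, n_total + a_total) for n, a in zip(counts, alphas)]


# Hypothetical training counts n_1..n_8 (N = 15) and a uniform prior (A = 8).
counts = [0, 4, 0, 3, 0, 6, 0, 2]
alphas = [1] * 8

probs = predictive(counts, alphas)
for k, p in enumerate(probs, start=1):
    print(f"p(Z = {k} | D, I) = {p}")
```

Cells without any training data all receive the same probability, here 1/23, so training data alone cannot distinguish between them.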

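Prior Causality Knowledge

$$p(x_i \mid c_j, \theta, I_C) = p(x_i \mid c_k, \theta, I_C)$$

Writing each side as a joint probability divided by a class probability:

$$\frac{p(x_i, c_j \mid \theta, I_C)}{p(c_j \mid \theta, I_C)} = p(x_i \mid c_j, \theta, I_C) = p(x_i \mid c_k, \theta, I_C) = \frac{p(x_i, c_k \mid \theta, I_C)}{p(c_k \mid \theta, I_C)}$$

so that, with the ratio of class probabilities denoted $\rho_{jk}$,

$$p(x_i, c_j \mid \theta, I_C) = \rho_{jk}\, p(x_i, c_k \mid \theta, I_C), \qquad \rho_{jk} = \frac{p(c_j \mid \theta, I_C)}{p(c_k \mid \theta, I_C)}$$

This is a linear constraint on the parameters $\theta = (\theta_1, \theta_2, \ldots, \theta_K)^T$:

$$M\theta - b = 0$$

Example:

x1    c    P(x1, c | I)

0     1    θ1

0     2    θ2

1     1    θ3

1     2    θ4

Let $\rho_{12} = 1$. Then

$$M = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 \end{pmatrix}, \qquad b = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$$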
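For the small table with θ1 to θ4, the causality knowledge can be checked numerically. The sketch below assumes equal class probabilities, so that the constraints read θ1 = θ2 and θ3 = θ4 together with the normalisation; the signs and row order of `M` are this sketch's reading of the constraint, and the parameter vectors are made-up examples.

```python
# Rows: normalisation, theta1 - theta2 = 0, theta3 - theta4 = 0 (assumed reading).
M = [
    [1, 1, 1, 1],
    [1, -1, 0, 0],
    [0, 0, 1, -1],
]
b = [1, 0, 0]


def satisfies_constraints(theta, tol=1e-12):
    """Check M * theta - b = 0 row by row."""
    return all(
        abs(sum(m * t for m, t in zip(row, theta)) - b_i) <= tol
        for row, b_i in zip(M, b)
    )


def p_x1_is_0_given_c(theta):
    """p(x1 = 0 | c) for both classes; theta is ordered as in the table."""
    p_c1 = theta[0] + theta[2]
    p_c2 = theta[1] + theta[3]
    return {"c1": theta[0] / p_c1, "c2": theta[1] / p_c2}


theta_ok = [0.35, 0.35, 0.15, 0.15]    # satisfies the constraints
theta_bad = [0.40, 0.10, 0.10, 0.40]   # x1 depends on the class

print(satisfies_constraints(theta_ok), p_x1_is_0_given_c(theta_ok))
print(satisfies_constraints(theta_bad), p_x1_is_0_given_c(theta_bad))
```

Any θ that satisfies the constraints gives the same distribution of x1 under both classes, which is exactly what the causality knowledge states.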

Causality knowledge, cont.

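$$p(Z \mid D, I_C) = \int p(Z \mid \theta, D, I_C)\, f(\theta \mid D, I_C)\, d\theta$$

$$f(\theta \mid D, I_C) = \frac{p(D \mid \theta, I_C)\, f(\theta \mid I_C)}{\int p(D \mid \theta, I_C)\, f(\theta \mid I_C)\, d\theta}$$

$$f(\theta \mid I_C) = f(\theta \mid I_M, I) \propto f(I_M \mid \theta, I)\, f(\theta \mid I)$$

Here $f(\theta \mid I)$ is Dirichlet, and $f(I_M \mid \theta, I)$ restricts $\theta$ to the parameters that satisfy the causality constraints. With $\theta^*$ denoting the free parameters that remain:

$$p(Z = i \mid D, I_C) = \frac{\int \theta_i \prod_{k=1}^{K} \theta_k^{n_k + \alpha_k - 1}\, d\theta^*}{\int \prod_{k=1}^{K} \theta_k^{n_k + \alpha_k - 1}\, d\theta^*}$$

Can be solved e.g. using variable substitution.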
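A sketch of the variable substitution for the four-cell table (one binary observation, two classes), under assumptions that are not from the slides: equal class probabilities, so that θ1 = θ2 = u/2 and θ3 = θ4 = (1 - u)/2, and a flat prior on the single free parameter u. The likelihood then depends on u only through u^(n1+n2) (1 - u)^(n3+n4), which is a Beta integral. The code compares the closed form with a plain midpoint-rule integration; the counts are invented.

```python
from fractions import Fraction


def closed_form(counts):
    """Predictive probabilities after substituting theta = (u, u, 1-u, 1-u) / 2."""
    n_a = counts[0] + counts[1]   # samples with x1 = 0, from either class
    n_b = counts[2] + counts[3]   # samples with x1 = 1, from either class
    mean_u = Fraction(n_a + 1, n_a + n_b + 2)   # mean of Beta(n_a + 1, n_b + 1)
    return [mean_u / 2, mean_u / 2, (1 - mean_u) / 2, (1 - mean_u) / 2]


def numeric(counts, steps=20000):
    """The same ratio of integrals, evaluated with the midpoint rule over u."""
    n_a = counts[0] + counts[1]
    n_b = counts[2] + counts[3]
    numerator = 0.0
    denominator = 0.0
    for j in range(steps):
        u = (j + 0.5) / steps
        weight = u ** n_a * (1.0 - u) ** n_b   # likelihood up to a constant
        numerator += u * weight
        denominator += weight
    mean_u = numerator / denominator
    return [mean_u / 2, mean_u / 2, (1 - mean_u) / 2, (1 - mean_u) / 2]


# Hypothetical data from class c2 only: 12 samples with x1 = 0, 3 with x1 = 1.
counts = [0, 12, 0, 3]
exact = closed_form(counts)
approx = numeric(counts)
print([str(p) for p in exact])
print([round(p, 6) for p in approx])
```

With these counts the cell (x1 = 0, c1) gets probability 13/34 and the cell (x1 = 1, c1) gets 2/17, although no training data from c1 exist. With training data only, and one hypothetical sample per cell, both would get (0 + 1)/(15 + 4) = 1/19.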

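Example:

Two classes and two binary observations.

Training data from fault c2 only.

Do inference about c1.

With training data only:

$$p(Z = 5 \mid D, I) = p(Z = 7 \mid D, I) = \frac{1}{23}$$

With causality knowledge as well:

$$p(X_1 = 0, X_2 = 1, C = c_1 \mid D, I_C) = p(Z = 5 \mid D, I_C) = \frac{1}{2\left(\sum_{i \in \{1,4,5,8\}} n_i + 4\right)} = \frac{1}{2(0 + 4)} = \frac{1}{8}$$

$$p(X_1 = 1, X_2 = 1, C = c_1 \mid D, I_C) = p(Z = 7 \mid D, I_C) = \frac{1}{2\left(\sum_{i \in \{2,3,6,7\}} n_i + 4\right)} = \frac{1}{2(15 + 4)} = \frac{1}{38}$$

[Figure: plot over the observations x1 and x2, with the values 0 and 1 on each axis.]

The training data from c2 have been reused: we have learned that x1 = 0 is far more probable than x1 = 1 under c1 as well!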

Conclusion

• Formulated the fault isolation problem in the Bayesian framework

• Emphasized the use of prior information

• Data and prior knowledge solve different parts of the diagnosis problem; the best solution uses both together!

Future work

• General solution of the integral

• Compare to MaxEnt


Thank you!

Some Previous Work

• Determine the faults that are logically consistent with the observations, using prior information only. Ignores noise.

  – DeKleer & Williams (1992), Reiter (1992), …

• Use response information and fault models.

  – Gertler (1998), Blanke et al. (2003), …

• Qualitative information about signs, magnitudes etc.

  – Pulido et al. (2005), Daigle et al. (2006), …

• Fuzzy logic.

  – Fagarasan et al. (2001)

• Construct a Bayesian network from expert knowledge.

  – Schwall and Gerdes (2002), Lerner et al. (2000), …

• Use training data only. Classification methods, SVM.

  – Pernestål and Nyberg (2007), Lee et al. (2007)

Now: Be Bayesian and combine training data and prior knowledge!

References

1. DeKleer & Williams (1992), Diagnosis with Behavioral Modes, Readings in Model Based Diagnosis.

2. Reiter (1992), A Theory of Diagnosis From First Principles, Readings in Model Based Diagnosis.

3. Gertler (1998), Fault Detection and Diagnosis in Engineering Systems, Marcel Dekker.

4. Blanke, Kinnaert, Lunze, Staroswiecki and Schröder (2003), Diagnosis and Fault Tolerant Control, Springer.

5. Pulido, Puig, Escobet and Quevedo (2005), A New Fault Localization Algorithm that Improves the Integration Between Fault Detection and Localization in Dynamic Systems, 16th International Workshop on Principles of Diagnosis, DX05.

6. Daigle, Koutsoukos and Biswas (2006), Multiple Fault Diagnosis in Complex Systems, 17th International Workshop on Principles of Diagnosis, DX06.

7. Sala (2006), Fuzzy Logic Diagnostic Rules – a Constraint Optimization Viewpoint, Proceedings of ECC 2006.

8. Schwall and Gerdes (2002), A Probabilistic Approach to Residual Processing for Vehicle Fault Detection, Proceedings of ACC.

9. Lerner, Parr, Koller and Biswas (2000), Bayesian Fault Detection and Diagnosis in Dynamic Systems, AAAI/IAAI.

10. Pernestål and Nyberg (2007), Probabilistic Fault Diagnosis Based on Incomplete Data with Application to an Automotive Engine, Proceedings of ECC.

11. Lee, Bahri, Shastri and Zaknich (2007), A Multi-Category Decision Support for the Tennessee Eastman Problem, Proceedings of ECC.
