Jeffreys' and BDeu Priors for Model Selection
WITMSE 2016
Helsinki, Finland, September 20
Joe Suzuki (Osaka Univ., Japan)
Goal and Contributions
[Goal] Compare for model selection
• BDeu (Bayesian Dirichlet equivalent uniform)
• Jeffreys prior (T-K estimator)
[Contribution]
Mathematically Proves
Road Map
1. Bayesian Dirichlet Scores
2. BDeu and Jeffreys Scores
3. A Found Property and its Proof
4. Main Theorem
5. Regularity in Model Selection
6. Summary
Assign a Prob. to each Seq.
Express a Prob. by the product of Cond. Probs.
Simultaneous Probs.
Cond. Probs.
BDeu and Jeffreys’ Prior
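A minimal sketch of how a Bayesian Dirichlet score assigns a probability to a sequence as a product of conditional probabilities, and how the two priors differ only in the hyperparameter. The function names, the symbol `a` for the hyperparameter, and `delta` for the equivalent sample size are illustrative assumptions, not notation taken from the slides. Jeffreys' prior is taken as a = 1/2 per symbol; BDeu is taken as a = delta / (number of states).

```python
import math

def log_dirichlet_score(seq, alphabet_size, a):
    """Log of Q(x^n) = prod_i (c_{i-1}(x_i) + a) / (i - 1 + alphabet_size * a),
    where c_{i-1}(x) counts occurrences of x among the first i-1 symbols."""
    counts = [0] * alphabet_size
    log_q = 0.0
    for i, x in enumerate(seq):  # i symbols have been seen before x
        log_q += math.log((counts[x] + a) / (i + alphabet_size * a))
        counts[x] += 1
    return log_q

def jeffreys_score(seq, alphabet_size):
    # Jeffreys' prior (assumed here as a = 1/2 for every symbol)
    return log_dirichlet_score(seq, alphabet_size, 0.5)

def bdeu_score(seq, alphabet_size, delta=1.0):
    # BDeu (assumed here as a = delta / alphabet_size for a variable without parents)
    return log_dirichlet_score(seq, alphabet_size, delta / alphabet_size)

seq = [0, 0]  # two identical symbols from a 4-letter alphabet
print(math.exp(jeffreys_score(seq, 4)))
print(math.exp(bdeu_score(seq, 4)))
```

For a binary variable with delta = 1 the two hyperparameters coincide (both 1/2), so the difference only shows up for larger alphabets or when parents are added.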
Example 1 : Bayesian Network Structure Learning (BNSL)
Example 2: Independence Testing
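One way to read the independence-testing example is as a comparison between the score of the joint sequence and the product of the marginal scores. The sketch below assumes that reading and uses Jeffreys' hyperparameter 1/2 throughout; the statistic name and the decision rule "dependent if the statistic is positive" are assumptions for illustration, since the transcript does not spell out the definition.

```python
import math

def log_score(seq, alphabet_size, a=0.5):
    """Log Bayesian Dirichlet score of a sequence with symmetric hyperparameter a."""
    counts = {}
    log_q = 0.0
    for i, x in enumerate(seq):
        log_q += math.log((counts.get(x, 0) + a) / (i + alphabet_size * a))
        counts[x] = counts.get(x, 0) + 1
    return log_q

def dependence_statistic(xs, ys, size_x, size_y):
    """(1/n) * [log Q(x, y) - log Q(x) - log Q(y)]; positive favours dependence."""
    n = len(xs)
    log_joint = log_score(list(zip(xs, ys)), size_x * size_y)
    log_indep = log_score(xs, size_x) + log_score(ys, size_y)
    return (log_joint - log_indep) / n

# Hypothetical toy data: Y is a copy of X, versus X and Y unrelated.
copied = dependence_statistic([0, 1] * 4, [0, 1] * 4, 2, 2)
unrelated = dependence_statistic([0, 0, 1, 1] * 2, [0, 1] * 4, 2, 2)
print(copied > 0, unrelated < 0)
```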
A Motivating Example
A Found Property
Sketch of J(n)>0 for BDeu
Sketch of J(n)≦0 for Jeffreys’
An Intuitive Reasoning
Main Theorem
Examples
more likely
unlikely
Regularity in Model Selection
Fitness + Simplicity → optimal
(-1) x Likelihood + Penalty Term → min
Newton's Law of Motion
Maxwell Equations
If model A is better than model B w.r.t. fitness and simplicity, model A should be chosen (regularity).
Information Criteria
LASSO
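Two standard instances of the "(-1) x Likelihood + Penalty Term" form, written out as a sketch. The symbols are assumptions introduced here: \hat{L}_M is the maximized likelihood of model M, k_M its number of parameters, n the sample size, and \lambda the LASSO regularization weight.

```latex
\mathrm{BIC}(M) = -\log \hat{L}_M + \frac{k_M}{2}\log n \;\to\; \min_{M},
\qquad
\text{LASSO:}\quad \frac{1}{2}\lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1 \;\to\; \min_{\beta}
```

In both cases the first term measures fitness and the second measures simplicity, so a model that is no worse on both terms can never get a worse objective value.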
BDeu violates regularity in model selection
[Figure: graph structures over the variables X, Y, and Z]
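A minimal sketch of how such a violation can show up, on a hypothetical toy data set that is not taken from the talk. Here Y already determines X exactly and Z is a copy of Y, so the parent sets {Y} and {Y, Z} fit equally well and {Y} is simpler. The function name, the `prior` switch, and `delta` (equivalent sample size) are assumptions for illustration; BDeu is taken as a = delta / (parent configurations x child states) and Jeffreys' as a = 1/2 per cell.

```python
import math
from collections import Counter

def local_log_score(rows, child, parents, n_states, prior, delta=1.0):
    """Log Bayesian Dirichlet local score of `child` given `parents`.
    rows: list of dicts (variable name -> state); every variable has n_states states."""
    q = n_states ** len(parents)  # number of parent configurations
    if prior == "bdeu":
        a = delta / (q * n_states)
    elif prior == "jeffreys":
        a = 0.5
    else:
        raise ValueError(prior)
    parent_counts = Counter(tuple(r[p] for p in parents) for r in rows)
    cell_counts = Counter((tuple(r[p] for p in parents), r[child]) for r in rows)
    score = 0.0
    # Unobserved configurations and cells contribute log(1) = 0, so they are skipped.
    for n_j in parent_counts.values():
        score += math.lgamma(n_states * a) - math.lgamma(n_states * a + n_j)
    for n_jk in cell_counts.values():
        score += math.lgamma(a + n_jk) - math.lgamma(a)
    return score

# Hypothetical data: X = Y = Z in every sample.
rows = [{"X": v, "Y": v, "Z": v} for v in (0, 0, 1, 1)]

bdeu_y = local_log_score(rows, "X", ["Y"], 2, "bdeu")
bdeu_yz = local_log_score(rows, "X", ["Y", "Z"], 2, "bdeu")
jeff_y = local_log_score(rows, "X", ["Y"], 2, "jeffreys")
jeff_yz = local_log_score(rows, "X", ["Y", "Z"], 2, "jeffreys")

print("BDeu prefers the larger parent set:", bdeu_yz > bdeu_y)
print("Jeffreys' scores tie:", math.isclose(jeff_y, jeff_yz))
```

On this data BDeu scores the larger parent set strictly higher even though it adds no fit, while Jeffreys' prior gives the two parent sets the same score.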
B&B for efficient BNSL (Depth First Search)
Those bounds utilize regularity
Campos and Ji (2011) found one (a nice result),
but experiments show that the bound is not efficient.
Designing pruning rules for BDeu is harder
because regularity cannot be assumed.
Bayes Prior
Based on his/her Belief:
Nobody should reject it from a general point of view.
BDeu violates regularity
and thus contradicts Newton, Maxwell, Information Criteria, LASSO, etc.
After learning the new result in this paper, people might notice that their beliefs have been wrong.
Summary
The prior behind BDeu might have been based on a wrong belief that contradicts regularity in model selection.
Future Work: Consider NML and others in a similar way