Bayesian network classification using spline-approximated KDE
Y. Gurwicz, B. Lerner
Journal of Pattern Recognition

TRANSCRIPT

Page 1: Bayesian network classification using spline-approximated KDE

Bayesian network classification using spline-approximated KDE

Y. Gurwicz, B. Lerner

Journal of Pattern Recognition

Page 2: Bayesian network classification using spline-approximated KDE

Outline

• Introduction

• Background on Naïve Bayesian Network

• Computational Issue with KDE

• Proposed solution: Spline Approximated KDE

• Experiments

• Conclusion

Page 3: Bayesian network classification using spline-approximated KDE

Introduction

• Bayesian network (BN) classifiers have been successfully applied to a variety of domains

• A BN classifier attains the asymptotically optimal classification error (i.e., the Bayes risk), provided that the conditional and prior density estimates are asymptotically consistent (e.g., KDE)

• A particular form of the BN is the Naïve BN (NBN), which has been shown to provide good performance in practice and can help alleviate the curse of dimensionality [Zhang 2004]

• Hence NBN is the focus of this work

Page 4: Bayesian network classification using spline-approximated KDE

Naïve Bayesian Network (NBN)

• A BN expresses joint probability distributions (nodes = RVs, edges = dependencies)

• Because estimating the node densities is difficult in high dimensions (the sample density becomes sparse), the BN can be constrained so that the attributes (RVs) are conditionally independent given the class (which increases the effective sample density per estimate)

• This constrained BN is called the Naïve BN (a minimal classification sketch follows below)

• The following introductory slides are adapted from A. Moore's tutorial
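
A minimal sketch of an NBN classifier of this form, assuming continuous attributes and using scipy's gaussian_kde for the per-attribute class-conditional densities (the class name, default bandwidths, and numerical guard below are illustrative choices, not the paper's implementation):

```python
import numpy as np
from scipy.stats import gaussian_kde

class NaiveBayesKDE:
    """Naive BN: attributes assumed conditionally independent given the class."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.priors_ = {}      # P(C), estimated from class frequencies
        self.densities_ = {}   # one univariate KDE per (class, attribute)
        for c in self.classes_:
            Xc = X[y == c]
            self.priors_[c] = len(Xc) / len(X)
            self.densities_[c] = [gaussian_kde(Xc[:, f]) for f in range(X.shape[1])]
        return self

    def predict(self, X):
        # Posterior is proportional to P(C) * prod_f p(x_f | C); work in log space.
        log_post = np.zeros((len(X), len(self.classes_)))
        for j, c in enumerate(self.classes_):
            log_post[:, j] = np.log(self.priors_[c])
            for f, kde in enumerate(self.densities_[c]):
                log_post[:, j] += np.log(kde(X[:, f]) + 1e-300)  # guard against log(0)
        return self.classes_[np.argmax(log_post, axis=1)]
```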

Pages 5–11: [introductory NBN slides from A. Moore's tutorial; no transcribed text]
Page 12: Bayesian network classification using spline-approximated KDE

Estimating prior and conditional probabilities

• Methods for estimating the prior P(C) and conditional P(e|C) probabilities
  – Parametric
    • Gaussian forms are mainly used (CRV)
    • Fast to compute
    • May not accurately reflect the true distribution
  – Non-parametric
    • KDE
    • Slow
    • Can accurately model the true distribution

• Can we come up with a fast non-parametric method?
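
As a small illustration of this trade-off (data and numbers assumed for the example, not taken from the paper), a single Gaussian fit and a KDE can be compared on a bimodal sample:

```python
import numpy as np
from scipy.stats import norm, gaussian_kde

rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(-3, 1, 500), rng.normal(3, 1, 500)])

# Parametric: a single Gaussian, O(1) per query, but unimodal by construction.
mu, sigma = sample.mean(), sample.std()
p_gauss = norm.pdf(0.0, mu, sigma)

# Non-parametric: KDE, O(N_tr) per query, but it follows the two-mode shape.
kde = gaussian_kde(sample)
p_kde = kde(0.0)[0]

print(f"Gaussian estimate at x=0: {p_gauss:.4f}")  # places substantial mass in the valley
print(f"KDE estimate at x=0:      {p_kde:.4f}")    # much lower, tracking the dip between the modes
```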

Page 13: Bayesian network classification using spline-approximated KDE

Cost of calculating conditionals

• Let N_ts = # of test patterns, N_tr = # of training patterns, N_f = # of features (dimensions), N_c = # of classes

• Parametric approach: O(N_ts * N_c * N_f)

• Non-parametric approach: O(N_ts * N_tr * N_f)

• Since N_c << N_tr, the non-parametric approach is far more expensive (a worked example follows below)
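
As a purely illustrative worked example (the numbers are assumed, not from the paper): with N_ts = 1,000 test patterns, N_tr = 10,000 training patterns, N_f = 10 features, and N_c = 2 classes, the parametric approach needs on the order of 1,000 × 2 × 10 = 20,000 density evaluations, while the direct KDE needs 1,000 × 10,000 × 10 = 100,000,000 kernel evaluations, roughly 5,000 times more.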

Page 14: Bayesian network classification using spline-approximated KDE

Reducing N_tr: Spline approximation

• Approximate the KDE using splines

• Splines are piecewise polynomial regressions of order P, interpolated over K intervals of the domain and constrained to satisfy a smoothness property at the interval boundaries (e.g., s1'' = s2'')

• A spline density query requires only O(P * log K) time, or O(P) if a hash function (direct interval lookup) is employed (see the sketch below)

• Usually P = 4

• Hence significant computational savings can be attained over the direct KDE
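
A minimal sketch of the per-query cost, assuming the per-interval cubic coefficients have already been computed (the array layout and function name are my assumptions, not the paper's):

```python
import numpy as np

def spline_query(x, knots, coeffs):
    """Evaluate a piecewise cubic density approximation at scalar x.

    knots  : array of K+1 increasing breakpoints
    coeffs : (K, 4) array; coeffs[i] = (c3, c2, c1, c0) for interval i
    """
    # Interval lookup: O(log K) by binary search; for equally spaced knots this
    # can be replaced by i = int((x - knots[0]) / step), i.e. an O(1) "hash".
    i = int(np.searchsorted(knots, x, side="right")) - 1
    i = min(max(i, 0), len(coeffs) - 1)
    c3, c2, c1, c0 = coeffs[i]
    t = x - knots[i]
    # Horner evaluation of the cubic: O(P) with P = 4 coefficients per interval.
    return ((c3 * t + c2) * t + c1) * t + c0
```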

Page 15: Bayesian network classification using spline-approximated KDE

Constructing the Splines

• Calculate the endpoints for the K intervals to interpolate
  – K+1 estimates from the KDE
  – O(K * N_tr)

• Calculate the P coefficients for each of the K individual splines
  – O(K * P)

• Once the splines have been obtained, a density query can be computed in O(P) time (a construction sketch follows below)
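
A sketch of the construction step under the assumptions above, using scipy's gaussian_kde as the reference density and CubicSpline for the piecewise fit (the libraries, K value, and knot placement are illustrative choices, not the paper's implementation):

```python
import numpy as np
from scipy.stats import gaussian_kde
from scipy.interpolate import CubicSpline

def build_spline_kde(train, K=64):
    kde = gaussian_kde(train)                             # direct KDE over the N_tr points
    knots = np.linspace(train.min(), train.max(), K + 1)  # endpoints of the K intervals (K+1 knots)
    values = kde(knots)                                   # K+1 KDE estimates: O(K * N_tr)
    return CubicSpline(knots, values)                     # piecewise cubic fit: O(K * P)

rng = np.random.default_rng(1)
train = rng.normal(size=5000)
spline = build_spline_kde(train)

# After construction, each density query costs O(P) instead of O(N_tr).
print(spline(0.3))
```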

Page 16: Bayesian network classification using spline-approximated KDE

Experiment

• Measurements
  – Approximation accuracy
  – Classification accuracy
  – Classification speed

• Classifiers
  – BN-KDE
  – BN-Spline
  – BN-Gauss

• Synthetic and real-world data sets

Page 17: Bayesian network classification using spline-approximated KDE

Approximation Accuracy

Page 18: Bayesian network classification using spline-approximated KDE

Classification Accuracy

Page 19: Bayesian network classification using spline-approximated KDE

Classification Speed

Page 20: Bayesian network classification using spline-approximated KDE

Conclusion

• The spline-based method can approximate the univariate standard KDE well

• Speed gains can be realized over the direct KDE

• Comments
  – How to determine the number of intervals in the splines? This is a problem analogous to bandwidth specification in KDE.
  – The method assigns static intervals, which raises the same problem as a global bandwidth
  – This is an approximation of the global-bandwidth KDE; how well do the splines approximate the adaptive KDE (AKDE)?
  – The proposed method works for a static data set; if the data distribution changes, the splines will need to be reconstructed
    • May not be directly applicable to data streams
  – Implications for LR-KDE
    • Develop multi-query algorithms (e.g., deriving the K+1 endpoints/knots)
    • Assign dynamic spline intervals based on regularized LR, since each LR models a simple density

Page 21: Bayesian network classification using spline-approximated KDE

Reference

• H. Zhang, “The optimality of Naïve Bayes”, AAAI 2004