a tour of kernel smoothing - mvstat.netoct 10, 2007  · cv 3 (1992, 2004) n−5/14 n−2/(d+6) t....

33
Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion A tour of kernel smoothing Tarn Duong Institut Pasteur October 2007 T. Duong Institut Pasteur A tour of kernel smoothing

Upload: others

Post on 07-Aug-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

A tour of kernel smoothing

Tarn Duong

Institut Pasteur

October 2007

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 2: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

The journey up till now

I 1995–1998 Bachelor, Univ. of Western Australia, PerthI 1999–2000 Researcher, Australian Bureau of Statistics,

Canberra and SydneyI 2001–2004 PhD, Univ. of Western Australia, PerthI 2005 Lecturer, Macquarie Univ., SydneyI 2005–2007 Post-doc, Univ. of New South Wales, SydneyI 2007– present Post-doc, Institut Pasteur, Paris

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 3: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Research interests

I Kernel smoothingI Nonparametric statisticsI Statistical software

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 4: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Today

I Kernel density estimation (KDE)I 1st stage of inference (estimation)I translation is Éstimation de densité à noyau

I Feature significanceI 2nd stage of inference (formal inference)I translation is ?I extension of density estimation to significance testing

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 5: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 6: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Kernel (1)

I NOT cell nucleusI NOT kernel of an operating systemI NOT kernel/nullspace of a matrix A: {x : Ax = 0}

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 7: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Kernel (2)

Kernel K : Rd → R isI K (x) ≥ 0

I

∫Rd

K (x) dx = 1

I K is symmetric about 0

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 8: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Kernel density estimation

Let X 1, X 2, . . . , X n be a random sample drawn from a commondensity f . A kernel density estimate f is

f (x ; H) = n−1n∑

i=1

KH(x − X i)

where

KH(x − X i) = normal (Gaussian) pdf with mean X i , variance HH = bandwidth or window width (fenetre)

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 9: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Graphical illustration

Scaled kernels KH(x − X i) Kernel density estimate f

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 10: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Advantages of kernel density estimates

I non-parametricI easy to constructI easy to interpretI suitable for multivariate dataI smooth, no discretisation effectsI no anchor points effects

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 11: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 12: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Bandwidth selectors

I single most important factor effecting performance of fI ideal bandwidth selector: H0 = argmin

HAMISE(H)

where AMISE = asymptotic∫

Rd E[f (x ; H)− f (x)]2 dx

I data-driven selector: H = argminH

AMISE(H)

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 13: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Relative convergence rates (1)

I a data-driven selector H = argminH

AMISE(H) converges to

H0 with rate n−α, α > 0 if

vech(H− H0) = Op(n−αJ) vech H0

where Op is order in probability, J = matrix of ones, and

vech[a bb c

]=

abc

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 14: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Relative convergence rates (2)

I H converges to H0 with rate n−α if

MSE(H) = Var(H) + Bias(H) BiasT (H)

= O(n−2α)(vech H0)(vechT H0)

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 15: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Relative convergence rates (3)

Easier(?!) to compute

Bias(H) = O(

E[

∂ vech H(AMISE− AMISE)(H0)

])Var(H) = O

(Var

[∂

∂ vech H(AMISE− AMISE)(H0)

])

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 16: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Table of convergence rates

Convergence rateSelector d = 1 d > 1Plug-in 1 (1994) n−4/13 n−4/(d+12)

Plug-in 2 (2003) n−2/7 n−2/(d+6)

CV 1 (1982, 1984) n−1/10 n−min(d ,4)/(2d+8)

CV 2 (1994) n−1/10 n−min(d ,4)/(2d+8)

CV 3 (1992, 2004) n−5/14 n−2/(d+6)

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 17: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 18: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Software

I ks: R library available on CRAN www.r-project.org

I comprehensive package for kernel density estimation andbandwidth selection

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 19: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Flow cytometry (FACS) data (1)

Data sample KDE

● ●

● ●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

● ●

● ●

●● ●

●●

●●

●●

●●

● ●

●●

●●

●●

● ●

●●

●●

●●

●●

●●

●● ●

●●

●●

●●

● ●●

● ●

●●

●●

●●

●●

● ●

●●

● ●

●●

●●

●●

●●

●●

●●●

●●

●●

● ●●

●●● ●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

● ●

●●

●●

●●

● ●● ●

● ●

●●

●●

●●

●●

●●

●●

●●

●●

●●

● ●●●

●●

● ●

●●

●●

●●

●●

●●

●●

●●

● ●

●●

●●

●●

●●

●●

●●●

●●

●●

● ●

●●

●●

●●

●●

●●●

●●

● ●

●●

●● ●

●●

● ●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

● ●●

●● ●

●●

●●

● ●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

● ●

● ●

●●

●●

● ●

●●

●●

●●

● ●

●●

●●

●●

●●

●●

●●

●● ●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●● ●

●●

●●

●●

●●

● ●●

●●

●● ●

●●

●●

●●

●●

●●

●●●

●●

● ●

●●

●●

●●

●●

●●

●●●

●●

● ●

● ●

●●

●●

●●

●●

●●

●●

●●

0.2 0.4 0.6 0.8

0.0

0.2

0.4

0.6

0.8

1.0

x

y

0.0 0.2 0.4 0.6 0.8

0.0

0.2

0.4

0.6

0.8

1.0

x

y

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 20: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Flow cytometry (FACS) data (2)

Contour plot Wireframe plot

0.2 0.4 0.6 0.8

0.0

0.2

0.4

0.6

0.8

1.0

x

y

x

y

Density function

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 21: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Independent citations in other fields

I Zago, A. and Dongili P. (2006) Bad loans and efficiency inItalian banks, Working paper no. 28, Università di Verona

I Fieberg, J. (2007) Kernel density estimators of homerange: smoothing and the autocorrelation red herring.Ecology, 88, 1059–1066

I Peng T.G., Wang Y.H. and Wu T.H. (2007) Mean shiftalgorithm equipped with the intersection of confidenceintervals rule for image segmentation. Pattern RecognitionLetters, 28, 268–277

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 22: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 23: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Features

I d = 1, 2: mode, valley, saddle-point, ridge etc.I d > 2: mode

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 24: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Modes and modal regions

I mode x∗ of function f : Rd → RI D f (x∗) = 0, D2 f (x∗) < 0I D f (x∗) = 0, eigenvalues λ1(x∗), λ2(x∗), . . . , λd (x∗) of

D2 f (x∗) < 0I modal region M of f

I M = {x : ‖D f (x)‖ ≤ δ,−ε ≤ λj(x) ≤ 0}I δ, ε ‘small’ positive

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 25: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Kernel density derivative estimation

I density (zero-th derivative):

f (x ; H) = n−1n∑

i=1

KH(x − X i)

I gradient (first derivative):

D f (x ; H) = n−1n∑

i=1

D KH(x − X i)

I curvature (second derivative):

D2 f (x ; H) = n−1n∑

i=1

D2 KH(x − X i)

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 26: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Kernel curvature estimators

I asymptotic distribution:vech D2 f (x ; H)

approx.∼ N(vech D2 f (x),Σ(x))

I local null hypothesis: H0(x) : vech D2 f (x) = 0

I null distribution: vech D2 f (x ; H)approx.∼ N(0,Σ(x))

I test statistic:W (x) = ‖Σ(x)−1/2 vech D2 f (x ; H)‖2 approx.∼ χ2

d(d+1)/2

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 27: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Significant curvature regions

I extension of kernel density estimation suited to findingmodal regions

I modal region estimate at significance level α: significantcurvature region M = {x : W (x) ≥ χ2

d(d+1)/2;1−α′}I α′ is adjusted significance level to account for multiple

hypothesis tests

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 28: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Software

I feature: R library available on CRAN

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 29: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Flow cytometry (FACS) data (3)

Density estimate Modal regions estimates

0.2 0.4 0.6 0.8

0.0

0.2

0.4

0.6

0.8

1.0

x

y

0.2 0.4 0.6 0.8

0.0

0.2

0.4

0.6

0.8

1.0

x

y

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 30: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 31: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Summary

I Multivariate kernel density estimatorsI theoretical development of optimal bandwidth selectorsI software implementation

I Feature significanceI some theoretical development of multivariate modal region

estimationI software implementation

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 32: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Future directions

I Comparing two kernel density estimatorsI Optimal bandwidth selection for kernel density derivative

estimators

T. Duong Institut Pasteur

A tour of kernel smoothing

Page 33: A tour of kernel smoothing - mvstat.netOct 10, 2007  · CV 3 (1992, 2004) n−5/14 n−2/(d+6) T. Duong Institut Pasteur Introduction Kernel density estimation Bandwidth selection

Introduction Kernel density estimation Bandwidth selection Applications of KDE Feature significance Conclusion

Acknowledgements

I Kernel density estimationI Prof. Martin Hazelton, then Univ. of Western Australia, now

at Massey Univ. (New Zealand), as PhD supervisorI Feature significance

I Dr Inge Koch, Univ. of New South Wales (Australia),I Prof. Matt Wand, then Univ. of New South Wales

(Australia), now at Univ. of Wollongong (Australia)

T. Duong Institut Pasteur

A tour of kernel smoothing