Page 1: P-spline mixed models for spatio-temporal datahalweb.uc3m.es/esp/Personal/personas/durban/esp/... · P-spline mixed models for spatio-temporal data María Durbán joint work with

P-spline mixed models for spatio-temporal data

María Durbán
joint work with Dae-Jin Lee

DEPARTMENT OF STATISTICS
UNIVERSIDAD CARLOS III DE MADRID

June 2009

Uc3m / Dept. of Statistics
First Workshop on Spatio-temporal Disease Mapping, Valencia 2009


Outline

1 P-splines
  • Mixed models approach
  • Multidimensional P-splines

2 P-splines for spatial count data
  • Spatial smoothing
  • Smooth-CAR model
  • Application: Scottish Lip Cancer data

3 Spatio-temporal data Smoothing with P-splines
  • ANOVA-Type Interaction Models
  • Application Environmental spatio-temporal data

4 Spatio-temporal Disease Mapping


P-splines: “The Flexible Smoother”

• Penalized likelihood splines (Eilers & Marx, 1996):
  • Given the data $(x_i, y_i)$, $i = 1, \dots, n$.
  • Fit a sum of local basis functions:

    $y_i = f(x_i) + \varepsilon_i, \quad \varepsilon_i \sim N(0, \sigma^2)$

    where $f(x) = B\theta$ and
    • $B = B(x)$ is a regression basis, and
    • $\theta$ is a vector of coefficients.
  • Control the fit through a smoothing parameter ($\lambda$).


P-splines: “The Flexible Smoother”

• B-splines basis:
  • Each B-spline is made of p + 1 piecewise polynomials of degree p.
  • The polynomial pieces are connected at the knots.
  • In general the choice is p = 3 (cubic spline).

[Figure: B-splines of degree p, two panels, each plotted against x on [0, 1] with vertical axis from 0 to 1]
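A B-spline basis on equally spaced knots can be built in a few lines. The sketch below uses one possible construction (scaled differences of truncated power functions); the function name `bspline_basis` and its arguments `xl`, `xr`, `nseg` and `deg` are illustrative choices, not taken from the slides, and the construction assumes `deg >= 1`.

```python
import math
import numpy as np

def bspline_basis(x, xl, xr, nseg=10, deg=3):
    """B-spline basis of degree `deg` (>= 1) on `nseg` equal segments of [xl, xr].

    Built as scaled differences of truncated power functions.
    Returns an array of shape (len(x), nseg + deg).
    """
    x = np.asarray(x, dtype=float)
    dx = (xr - xl) / nseg
    # Equally spaced knots, extended `deg` segments beyond each boundary.
    knots = xl + dx * np.arange(-deg, nseg + deg + 1)
    # Truncated powers (x - t)_+^deg, one column per knot.
    tp = np.maximum(x[:, None] - knots[None, :], 0.0) ** deg
    # Differences of order deg + 1 across knots turn them into B-splines.
    d = np.diff(np.eye(len(knots)), n=deg + 1, axis=0)
    d /= math.gamma(deg + 1) * dx ** deg
    return (-1) ** (deg + 1) * tp @ d.T

x = np.linspace(0.0, 1.0, 101)
B = bspline_basis(x, 0.0, 1.0, nseg=10, deg=3)
row_sums = B.sum(axis=1)                    # B-splines sum to one on [xl, xr]
nonzero_per_row = (B > 1e-10).sum(axis=1)   # local support: at most deg + 1
```

Each row of `B` holds the values of all basis functions at one `x`; only a few neighbouring columns are non-zero, which is what keeps the basis local.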


P-splines: “The Flexible Smoother”

• B-splines basis:
  • $y = f(x) + \varepsilon$, with $f(x) = B\theta$
  • B-splines regression:

    $\min_\theta S(\theta; y) = \|y - B\theta\|^2$

    $\hat\theta = (B'B)^{-1} B'y$

    • Optimal selection of knots (complex).
  • P-splines: add a penalty to control smoothness.

[Figure: Example, scatterplot of the data for x in [0, 1], vertical axis from −2 to 6]


P-splines: “The Flexible Smoother”

Methodology:

• Minimize the penalized sum of squares (PSS):

  $S_p(\theta; y, \lambda) = \|y - B\theta\|^2 + \text{PENALTY}$

• The PENALTY term controls the smoothness of the fit through $\lambda$.
  • Eilers & Marx (1996): (discrete) penalty over adjacent coefficients $\theta$.
  • Lang & Brezger (2004): “Bayesian P-splines”, random walk priors for $\theta$, e.g.:

    $\theta_m \mid \theta_{m-1} \sim N(\theta_{m-1}, \tau^2)$, or

    $\theta_m \mid \theta_{m-1}, \theta_{m-2} \sim N(2\theta_{m-1} - \theta_{m-2}, \tau^2)$
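The two views above lead to the same quadratic form. As a sketch (assuming flat priors on the first two coefficients, writing $c$ for the number of coefficients and $D_2$ for the second-order difference matrix, a notation introduced here for illustration), the second-order random walk prior can be written jointly as:

```latex
p(\theta \mid \tau^2)
  \propto \exp\!\Big(-\frac{1}{2\tau^2}\sum_{m=3}^{c}
          (\theta_m - 2\theta_{m-1} + \theta_{m-2})^2\Big)
  = \exp\!\Big(-\frac{1}{2\tau^2}\,\theta' D_2' D_2\,\theta\Big)

% With y | theta ~ N(B theta, sigma^2 I), the negative log-posterior is,
% up to an additive constant,
-\log p(\theta \mid y)
  = \frac{1}{2\sigma^2}\Big(\|y - B\theta\|^2
    + \frac{\sigma^2}{\tau^2}\,\theta' D_2' D_2\,\theta\Big)
```

Under these assumptions the posterior mode minimizes a penalized sum of squares with a difference penalty and smoothing parameter $\lambda = \sigma^2/\tau^2$.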


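P-splines: “The Flexible Smoother”

• PSS becomes:

  $S_p(\theta; y, \lambda) = \|y - B\theta\|^2 + \theta' P \theta$

  • $P = \lambda D'D$.
  • $\lambda$ is the smoothing parameter.
  • $D$ are difference matrices.

• For given $\lambda$, minimizing $S_p(\theta; y, \lambda)$ gives:

  $\hat\theta = (B'B + \lambda D'D)^{-1} B'y$

  • $\lambda$ can be selected by CV, GCV, AIC or BIC.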
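The closed-form estimate and a grid search for $\lambda$ can be sketched as follows. This is an illustration on simulated data, not the authors' code: the helper names (`bspline_basis`, `pspline_fit`, `gcv`), the grid of $\lambda$ values and the test function are assumptions, and GCV is used here simply as one of the criteria listed above.

```python
import math
import numpy as np

def bspline_basis(x, xl, xr, nseg, deg=3):
    """Equally spaced B-spline basis via truncated-power differences."""
    dx = (xr - xl) / nseg
    knots = xl + dx * np.arange(-deg, nseg + deg + 1)
    tp = np.maximum(x[:, None] - knots[None, :], 0.0) ** deg
    d = np.diff(np.eye(len(knots)), n=deg + 1, axis=0)
    return (-1) ** (deg + 1) * tp @ d.T / (math.gamma(deg + 1) * dx ** deg)

def pspline_fit(B, y, lam, order=2):
    """Solve (B'B + lam D'D) theta = B'y with a difference penalty."""
    D = np.diff(np.eye(B.shape[1]), n=order, axis=0)
    theta = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
    return theta, D

def gcv(B, y, lam, order=2):
    """Generalized cross-validation score n * RSS / (n - tr(H))^2."""
    theta, D = pspline_fit(B, y, lam, order)
    BtB = B.T @ B
    # tr(H) = tr((B'B + lam D'D)^{-1} B'B), the effective dimension.
    ed = np.trace(np.linalg.solve(BtB + lam * D.T @ D, BtB))
    rss = np.sum((y - B @ theta) ** 2)
    n = len(y)
    return n * rss / (n - ed) ** 2

rng = np.random.default_rng(0)            # simulated data, not from the slides
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)
B = bspline_basis(x, 0.0, 1.0, nseg=20)

lams = 10.0 ** np.arange(-4, 5)           # candidate smoothing parameters
scores = [gcv(B, y, lam) for lam in lams]
lam_best = lams[int(np.argmin(scores))]
theta_hat, D = pspline_fit(B, y, lam_best)
fitted = B @ theta_hat
```

Setting `lam=0` recovers the unpenalized B-spline regression, while larger values of `lam` shrink the differences of adjacent coefficients and give a smoother curve.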


P-splines: “The Flexible Smoother”

• 1d P-splines:
  • No penalty over coefficients.
  • Penalty over coefficients.

[Figure: Example, data for x in [0, 1] with the B-splines basis and θ without penalty]


[Figure: Example, data for x in [0, 1] with the B-splines basis and θ with penalty]


P-splines: “The Flexible Smoother”

Advantages over other smoothers:

• Low-rank: “dim(B) < dim(data)”.
• Computationally efficient: “# knots ≤ 40”.
• Selection of the number and location of knots is NOT an issue.
• Discrete penalties over θ, not over the fitted curve.
• Easy extension to:
  • Mixed models,
  • Non-Gaussian data (GLMs),
  • Multidimensional smoothing,
  • Spatial and spatio-temporal smoothing.


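P-splines: A mixed model approach

• Reformulate:
  • the model $y = B\theta + \varepsilon$ into

    $y = X\beta + Z\alpha + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I)$

    • where $X$ and $Z$ are the “fixed” and “random” effects matrices,
    • with coefficients $\beta$ and $\alpha \sim N(0, G)$, and $G = \sigma_\alpha^2 R$,
    • $\lambda = \sigma^2 / \sigma_\alpha^2$.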


P-splines: A mixed model approach

• Reparameterization:

  $B \equiv [\,X : Z\,] \;\Rightarrow\; B\theta = X\beta + Z\alpha$

• We use the Singular Value Decomposition (SVD) of $D'D$.


P-splines: A mixed model approach

• Singular Value Decomposition (SVD):

  $D'D = U \Sigma U'$

  • with $U = [\,U_n : U_s\,]$:

  $D'D = [\,U_n : U_s\,] \begin{bmatrix} 0_d & \\ & \Sigma \end{bmatrix} \begin{bmatrix} U_n' \\ U_s' \end{bmatrix}$

  • $\Sigma$ ≡ non-null eigenvalues.
  • $U_n$ ≡ eigenvectors corresponding to the null eigenvalues.
  • $U_s$ ≡ eigenvectors corresponding to the non-null eigenvalues.
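The split into null and non-null eigenvectors can be checked numerically. The sketch below assumes a second-order difference matrix and an arbitrary number of coefficients (`c = 12`); since $D'D$ is symmetric, its SVD coincides with its eigendecomposition, so `numpy.linalg.eigh` is used.

```python
import numpy as np

c, d = 12, 2                                  # illustrative sizes (assumed)
D = np.diff(np.eye(c), n=d, axis=0)           # (c - d) x c difference matrix
P = D.T @ D

# Eigendecomposition of the symmetric matrix D'D (ascending eigenvalues).
evals, U = np.linalg.eigh(P)
null = evals < 1e-10
U_n, U_s = U[:, null], U[:, ~null]            # null / non-null eigenvectors
Sigma = np.diag(evals[~null])                 # non-null eigenvalues

# The penalty on theta becomes a diagonal penalty on alpha = U_s' theta.
rng = np.random.default_rng(1)
theta = rng.normal(size=c)
alpha = U_s.T @ theta
penalty_theta = theta @ P @ theta
penalty_alpha = alpha @ Sigma @ alpha
```

With a difference penalty of order `d` there are `d` null eigenvalues, so `U_n` has `d` columns and the corresponding part of the model is left unpenalized.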


P-splines: A mixed model approach

• The fixed effects ($\beta$) are unpenalized, and
• the penalty $\theta' P \theta$ becomes $\alpha' F \alpha$, where $F = \lambda\Sigma$ is diagonal.
• The random effects ($\alpha$) have covariance matrix $G$:

  $G = \sigma^2 F^{-1}$

• Mixed model basis:

  $X = [\,1 : x\,]$

  $Z = B U_s$
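For a fixed $\lambda$, the mixed model basis should reproduce the fitted curve of the original P-spline. The sketch below checks this on simulated data, assuming cubic B-splines on equally spaced knots and a second-order penalty; `bspline_basis` and the simulated curve are illustrative choices, not part of the slides.

```python
import math
import numpy as np

def bspline_basis(x, xl, xr, nseg, deg=3):
    """Equally spaced B-spline basis via truncated-power differences."""
    dx = (xr - xl) / nseg
    knots = xl + dx * np.arange(-deg, nseg + deg + 1)
    tp = np.maximum(x[:, None] - knots[None, :], 0.0) ** deg
    d = np.diff(np.eye(len(knots)), n=deg + 1, axis=0)
    return (-1) ** (deg + 1) * tp @ d.T / (math.gamma(deg + 1) * dx ** deg)

rng = np.random.default_rng(2)                # simulated data (assumed)
x = np.linspace(0.0, 1.0, 150)
y = np.cos(3 * x) + rng.normal(scale=0.2, size=x.size)
lam = 5.0

B = bspline_basis(x, 0.0, 1.0, nseg=15)
D = np.diff(np.eye(B.shape[1]), n=2, axis=0)

# Original parameterization: theta = (B'B + lam D'D)^{-1} B'y
theta = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)

# Mixed model parameterization: X = [1 : x], Z = B U_s, F = lam * Sigma
evals, U = np.linalg.eigh(D.T @ D)
keep = evals > 1e-10
U_s, Sigma = U[:, keep], np.diag(evals[keep])
X = np.column_stack([np.ones_like(x), x])
Z = B @ U_s
F = lam * Sigma

# Penalized least squares on [X : Z]; only alpha is penalized.
C = np.hstack([X, Z])
K = np.zeros((C.shape[1], C.shape[1]))
K[2:, 2:] = F
coef = np.linalg.solve(C.T @ C + K, C.T @ y)
beta, alpha = coef[:2], coef[2:]

fit_pspline = B @ theta
fit_mixed = X @ beta + Z @ alpha
```

In a mixed model fit the variance components, and hence $\lambda$, would be estimated (for example by REML) rather than fixed as here.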


P-splines: A mixed model approach

Advantages:

• Flexibility:
  • Easy incorporation of smoothing in complex models (“spatial” random effects and/or correlated errors).
• Mixed models theory:
  • Estimation and inference.
• Software implementation:
  • R, Splus, MATLAB or SAS.
• Extension to non-Gaussian data:
  • Generalized Linear Mixed Models (GLMM).


Multidimensional P-splines

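Example: 2d-array

• Data $Y = (y_{ij})$, $i = 1, \dots, n_1$ and $j = 1, \dots, n_2$
• Array structure: $n_1$ rows and $n_2$ columns

  $Y = \begin{pmatrix} y_{11} & y_{12} & \cdots & y_{1 n_2} \\ y_{21} & y_{22} & \cdots & y_{2 n_2} \\ \vdots & \vdots & \ddots & \vdots \\ y_{n_1 1} & \cdots & \cdots & y_{n_1 n_2} \end{pmatrix}$

• Regressors:

  $x_1 = (x_{11}, \dots, x_{1 n_1})'$

  $x_2 = (x_{21}, \dots, x_{2 n_2})'$

[Figure: (a) Simulated data, surface over (x1, x2)]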


Multidimensional P-splines

• Use of tensor products of B-splines (Durbán et al, 2002):

Example: 2d-array

• Marginal bases:
  • $B_1 = B_1(x_1)$, of dim. $n_1 \times c_1$.
  • $B_2 = B_2(x_2)$, of dim. $n_2 \times c_2$.
• 2d B-splines basis:
  • Kronecker product ($\otimes$) of the marginal bases:

    $B = B_2 \otimes B_1$, of dim. $n_1 n_2 \times c_1 c_2$

[Figure: two panels, each showing a 2d B-spline over (x1, x2)]


Multidimensional P-splines

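Model:

$y = f(x_1, x_2) + \varepsilon$,

with $y$ of dimension $n_1 n_2 \times 1$.

• In matrix form, $y = B\theta$ can be written as:

  $Y = B_1 A B_2'$, of dim. $n_1 \times n_2$

  where $A$ is the $c_1 \times c_2$ matrix of coefficients and $\theta = \mathrm{vec}(A)$ has length $c_1 c_2$.

IDEA:

• Set penalties over the coefficients $\theta$ (the rows and columns of $A$).
• Row-wise penalty: $\theta'(I_{c_2} \otimes D_1'D_1)\theta$
• Column-wise penalty: $\theta'(D_2'D_2 \otimes I_{c_1})\theta$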
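The equivalence between the vector form and the array form, and the meaning of the two penalties, can be verified with small random matrices. In the sketch below `B1` and `B2` are random stand-ins for the marginal bases (an assumption made to keep the example short), and $\theta$ is taken as the column-major vectorization of $A$.

```python
import numpy as np

rng = np.random.default_rng(3)
n1, c1, n2, c2 = 6, 4, 5, 3                   # illustrative sizes (assumed)
B1 = rng.normal(size=(n1, c1))                # stand-in for B1(x1)
B2 = rng.normal(size=(n2, c2))                # stand-in for B2(x2)
A = rng.normal(size=(c1, c2))                 # coefficient matrix
theta = A.flatten(order="F")                  # theta = vec(A), column-major

# Vector form with the Kronecker basis and array form give the same values.
B = np.kron(B2, B1)                           # (n1*n2) x (c1*c2)
y_vec = B @ theta
Y = B1 @ A @ B2.T                             # n1 x n2

# Second-order difference matrices for each dimension.
D1 = np.diff(np.eye(c1), n=2, axis=0)
D2 = np.diff(np.eye(c2), n=2, axis=0)

# The Kronecker penalties are squared differences taken along each
# dimension of the coefficient matrix A.
pen1 = theta @ np.kron(np.eye(c2), D1.T @ D1) @ theta
pen2 = theta @ np.kron(D2.T @ D2, np.eye(c1)) @ theta
pen1_array = np.sum((D1 @ A) ** 2)            # differences down each column of A
pen2_array = np.sum((A @ D2.T) ** 2)          # differences along each row of A
```

The array form never builds the full Kronecker basis, which is the starting point for the efficient array algorithms mentioned later in the talk.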


Multidimensional P-splines

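• Penalty matrix in 2d:

  $P = \underbrace{\lambda_1\, I_{c_2} \otimes D_1'D_1}_{P_1} + \underbrace{\lambda_2\, D_2'D_2 \otimes I_{c_1}}_{P_2}$

  • $\lambda_1$ and $\lambda_2$ are the smoothing parameters in each dimension.
  • Anisotropy: $\lambda_1 \neq \lambda_2$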
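A small gridded example illustrates the effect of anisotropy. This is a sketch on simulated data with fixed smoothing parameters: the surface, the grid sizes and the values of $\lambda_1$ and $\lambda_2$ are arbitrary assumptions, and the full Kronecker basis is formed explicitly only because the problem is small.

```python
import math
import numpy as np

def bspline_basis(x, xl, xr, nseg, deg=3):
    """Equally spaced B-spline basis via truncated-power differences."""
    dx = (xr - xl) / nseg
    knots = xl + dx * np.arange(-deg, nseg + deg + 1)
    tp = np.maximum(x[:, None] - knots[None, :], 0.0) ** deg
    d = np.diff(np.eye(len(knots)), n=deg + 1, axis=0)
    return (-1) ** (deg + 1) * tp @ d.T / (math.gamma(deg + 1) * dx ** deg)

rng = np.random.default_rng(4)                # simulated surface (assumed)
x1 = np.linspace(0.0, 1.0, 30)
x2 = np.linspace(0.0, 1.0, 25)
Y = np.sin(2 * np.pi * x1)[:, None] * np.cos(np.pi * x2)[None, :]
Y = Y + rng.normal(scale=0.2, size=Y.shape)
y = Y.flatten(order="F")                      # stack the columns of Y

B1 = bspline_basis(x1, 0.0, 1.0, nseg=8)      # 30 x 11
B2 = bspline_basis(x2, 0.0, 1.0, nseg=6)      # 25 x 9
c1, c2 = B1.shape[1], B2.shape[1]
B = np.kron(B2, B1)                           # 750 x 99

D1 = np.diff(np.eye(c1), n=2, axis=0)
D2 = np.diff(np.eye(c2), n=2, axis=0)
P1 = np.kron(np.eye(c2), D1.T @ D1)           # penalty in the x1 direction
P2 = np.kron(D2.T @ D2, np.eye(c1))           # penalty in the x2 direction

def fit(lam1, lam2):
    """Penalized least squares with P = lam1 * P1 + lam2 * P2."""
    return np.linalg.solve(B.T @ B + lam1 * P1 + lam2 * P2, B.T @ y)

theta_iso = fit(1.0, 1.0)                     # same smoothing in both directions
theta_aniso = fit(100.0, 1.0)                 # heavier smoothing along x1 only
rough1_iso = theta_iso @ P1 @ theta_iso
rough1_aniso = theta_aniso @ P1 @ theta_aniso
```

Raising only `lam1` reduces the roughness measured by `P1`, which is how separate smoothing parameters adapt the fit to each dimension.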


Multidimensional P-splines: Mixed Models Representation

• As in the 1d case, the mixed model consists of:

  $y = \underbrace{X\beta}_{\text{linear/fixed}} + \underbrace{Z\alpha}_{\text{non-linear/random}}$

[Figure: Example, three panels over (x1, x2): fitted surface, linear/fixed part, non-linear/random part]


Multidimensional P-splines

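• Mixed models representation:
  • As in the 1d case, the aim is:

    $B \equiv [\,X : Z\,] \;\Longrightarrow\; B\theta = X\beta + Z\alpha$

  • The SVD over $P$ allows the simultaneous diagonalization of $D_1'D_1$ and $D_2'D_2$.
  • The penalty $P$ becomes $F$ (block diagonal matrix):

    $F = \begin{pmatrix} \lambda_2 \Sigma_2 \otimes I_2 & & \\ & \lambda_1 I_2 \otimes \Sigma_1 & \\ & & \lambda_1 I_{c_2-2} \otimes \Sigma_1 + \lambda_2 \Sigma_2 \otimes I_{c_1-2} \end{pmatrix}$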


Multidimensional P-splines

ANOVA-type decomposition of smooth surfaces:

$y = \underbrace{f(x_1)}_{\text{additive term for } x_1} + \underbrace{f(x_2)}_{\text{additive term for } x_2} + \underbrace{f(x_1, x_2)}_{\text{interaction term for } x_1, x_2}$

[Figure: four panels over (X1, X2) with vertical axis Y: fitted surface, additive term for x1, additive term for x2, non-additive term]


Multidimensional P-splines

Advantages:

• Extension to d-dimensions:

B = B2 ⊗ B1 ⊗ · · · ⊗ Bd

• Efficient algorithms:

• Currie et al (2006): Generalized Linear Array Models (GLAM)

• Anisotropy (different smoothing for each dimension):

• Complex models: spatial data smoothing


P-splines for spatial count data: P-splines for spatial smoothing

• We propose:
  • 2d P-splines:
    • Geostatistics: at sampling locations.
    • Regional/areal: at the centroids.

• Models of the form:

  $y = f(\text{lon}, \text{lat}) + \varepsilon$

  where
  • $f(\text{lon}, \text{lat})$ is a large-scale spatial smooth trend: $X\beta + Z\alpha$.
  • The mixed model allows the simultaneous estimation of smoothing and spatial correlation.


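P-splines for spatial count data: Basis for Spatial Data

• B-spline basis for spatial data:
  • Given that the data are NOT in an array, $B = B_2 \otimes B_1$ is replaced by $B_2 \,\Box\, B_1$,
    where $\Box$ denotes the “row-wise Kronecker” or box product.

Def. Box product:

$B_2 \,\Box\, B_1 = (B_2 \otimes \mathbf{1}'_{c_1}) \odot (\mathbf{1}'_{c_2} \otimes B_1)$

where $\odot$ is the “element-wise” product and $\mathbf{1}_c$ is a vector of ones of length $c$.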
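The box product can be sketched directly from its definition. In the example below `B1` and `B2` are random stand-ins for the marginal bases evaluated at the same `n` scattered locations (an assumption for illustration), and `box_product` is a hypothetical helper name.

```python
import numpy as np

def box_product(B2, B1):
    """Row-wise Kronecker product of two matrices with the same number of rows."""
    c1, c2 = B1.shape[1], B2.shape[1]
    left = np.kron(B2, np.ones((1, c1)))      # each column of B2 repeated c1 times
    right = np.kron(np.ones((1, c2)), B1)     # B1 tiled c2 times side by side
    return left * right                       # element-wise product

rng = np.random.default_rng(5)
n, c1, c2 = 7, 4, 3                           # illustrative sizes (assumed)
B1 = rng.normal(size=(n, c1))                 # stand-in for B1(lon)
B2 = rng.normal(size=(n, c2))                 # stand-in for B2(lat)
B = box_product(B2, B1)                       # n x (c1 * c2)

# Row i of the box product is the Kronecker product of row i of B2 and B1.
row_check = np.vstack([np.kron(B2[i], B1[i]) for i in range(n)])
```

The result has one row per observation, unlike `B2 ⊗ B1`, which has one row per point of the full grid.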


P-splines for spatial count data

• In many applications:
  • Collect count data observed in regions or areas.
  • E.g.: # of cases of disease or deaths.
  • Counts are Poisson distributed:

    $y \sim \mathcal{P}(\mu)$

Penalized-GLMM

• P-splines as mixed models:
  • Linear predictor: $\eta = B\theta \;\Longrightarrow\; X\beta + Z\alpha$
  • Penalized log-likelihood:

    $\ell_p(\beta, \alpha; y) = \ell(\beta, \alpha; y) - \tfrac{1}{2}\alpha' F \alpha$

  • Estimation via PQL.
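For a fixed smoothing parameter, maximizing the penalized log-likelihood reduces to penalized iteratively reweighted least squares. The sketch below shows that inner step only, in the $\theta$ parameterization ($\eta = B\theta$ with penalty $\lambda\,\theta'D'D\theta$), on simulated Poisson counts; PQL would alternate such a step with updates of the variance components, which are not shown. All names and values are illustrative assumptions.

```python
import math
import numpy as np

def bspline_basis(x, xl, xr, nseg, deg=3):
    """Equally spaced B-spline basis via truncated-power differences."""
    dx = (xr - xl) / nseg
    knots = xl + dx * np.arange(-deg, nseg + deg + 1)
    tp = np.maximum(x[:, None] - knots[None, :], 0.0) ** deg
    d = np.diff(np.eye(len(knots)), n=deg + 1, axis=0)
    return (-1) ** (deg + 1) * tp @ d.T / (math.gamma(deg + 1) * dx ** deg)

def penalized_poisson(B, y, lam, order=2, n_iter=100, tol=1e-10):
    """Maximize l(theta) - 0.5 * lam * theta'D'D theta by penalized IRLS."""
    D = np.diff(np.eye(B.shape[1]), n=order, axis=0)
    P = lam * D.T @ D
    theta = np.full(B.shape[1], np.log(y.mean()))   # constant starting value
    for _ in range(n_iter):
        eta = B @ theta
        mu = np.exp(eta)                      # log link
        z = eta + (y - mu) / mu               # working response
        BtW = B.T * mu                        # B'W with W = diag(mu)
        theta_new = np.linalg.solve(BtW @ B + P, BtW @ z)
        converged = np.max(np.abs(theta_new - theta)) < tol
        theta = theta_new
        if converged:
            break
    return theta, P

rng = np.random.default_rng(6)                # simulated counts (assumed)
x = np.linspace(0.0, 1.0, 200)
mu_true = np.exp(1.0 + np.sin(2 * np.pi * x))
y = rng.poisson(mu_true).astype(float)

B = bspline_basis(x, 0.0, 1.0, nseg=15)
theta_hat, P = penalized_poisson(B, y, lam=10.0)
mu_hat = np.exp(B @ theta_hat)
# Penalized score, which should be close to zero at the solution.
score = B.T @ (y - mu_hat) - P @ theta_hat
```

An offset such as $\log(e)$ for expected counts would enter by adding it to `eta` before exponentiating and subtracting it from the working response.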


Smooth-CAR model: CAR model

• Most popular approach:
  • Conditional Autoregressive Models (CAR), Besag (1991)
  • Spatial dependence across “neighbours”.
  • Different neighbourhood criteria:
    • Common border.
    • Centroids distance, 4-nearest neighbours.


Smooth-CAR model: CAR model

• Formulation:

  $y = X\beta + b$,

  where $b = (b_1, b_2, \dots, b_n)'$ is a vector for the spatial effects.

  • Impose a spatial dependency structure through a prior distribution for $b$:

    $b \sim N(0, G_b)$

    where $G_b$ depends on the “neighbourhood structure”,
    • defined by the contiguity matrix ($Q$).


Smooth-CAR model: CAR model

We follow an Empirical Bayes approach:

• Intrinsic CAR:

  $G_b = \sigma_b^2 Q^- + \kappa^{-1} I$ (Besag, 1991)

  • Two independent and separate variance components:
    • Spatially-structured variation: $\sigma_b^2 Q^-$
    • Unstructured non-spatial correlation: $\kappa^{-1} I$

• Alternative CAR model structures:

  $G_b = \sigma_b^2 (\phi Q + (1 - \phi) I)^{-1}$ (Leroux et al, 1999)

  $G_b = \sigma_b^2 (\phi Q^- + (1 - \phi) I)$ (Dean et al, 2001)

  where
  • $\phi$ measures the relative weight between structured and unstructured variability,
  • $0 \le \phi \le 1$.
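The three covariance structures can be written out for a toy map. The sketch below assumes a hypothetical map of four areas in a row and assumes that $Q$ has the number of neighbours on its diagonal and $-1$ for each pair of neighbours (the slides do not spell out its entries); $Q^-$ is computed as the Moore-Penrose generalized inverse, and the parameter values are arbitrary.

```python
import numpy as np

# Hypothetical map: 4 areas in a row; areas sharing a border are neighbours.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Assumed form of Q: number of neighbours on the diagonal, -1 for neighbours.
Q = np.diag(W.sum(axis=1)) - W
Q_ginv = np.linalg.pinv(Q)                    # generalized inverse Q^-

sigma2_b, kappa, phi = 1.5, 4.0, 0.7          # illustrative values (assumed)
n = Q.shape[0]
I = np.eye(n)

G_intrinsic = sigma2_b * Q_ginv + I / kappa                   # intrinsic CAR
G_leroux = sigma2_b * np.linalg.inv(phi * Q + (1 - phi) * I)  # Leroux et al
G_dean = sigma2_b * (phi * Q_ginv + (1 - phi) * I)            # Dean et al
```

Because the rows of this `Q` sum to zero it is singular, which is why the intrinsic and Dean forms use a generalized inverse while the Leroux form, with $\phi < 1$, can be inverted directly.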


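Smooth-CAR model (Lee and Durban, 2009)

• We propose a “hybrid” model:
  • Spatial P-spline with CAR structure: “Smooth-CAR” model
  • Model:

    $\eta = X\beta + Z\alpha + b$, where $b \sim N(0, G_b)$

Our approach:

$\eta = \underbrace{\text{Spatial trend}}_{X\beta + Z\alpha} + \underbrace{\text{Local area-level spatial correlation}}_{\text{spatial random effects}}$

where the spatial trend is the large-scale component and the spatial random effects are the small-scale component.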


Smooth-CAR model

Summary:

Model        Linear predictor                Area-level var.
Poisson      $X\beta + Z\alpha$              −
CAR          $X\beta + b$                    $b \sim N(0, G_b)$
Smooth-CAR   $X\beta + Z\alpha + b$          $b \sim N(0, G_b)$

• The Smooth-CAR model allows us to model:
  - the spatial trend ($X\beta + Z\alpha$) over large geographical distances, and
  - local area-level correlation through a CAR component ($b$).


Application: Scottish Lip Cancer data

Example: Scottish Lip Cancer

• Breslow and Clayton (1993)

• Observed (y) and expected (e) cases of lip cancer

• 56 counties in Scotland

• Period: 1975-1980.

[Figure: SCOTTISH LIP CANCER, panels OBSERVED and EXPECTED; scale 0 to 80]


Fitted models

• We fit several models:
  - Smooth P-spline model (Poisson):
    $\eta = \log(e) + X\beta + Z\alpha$,
    where $\log(e)$ is the offset term.
  - CAR model:
    $\eta = \log(e) + X\beta + b$, $b \sim N(0, G_b)$,
    with $G_b = \sigma_b^2 (\phi Q^- + (1 - \phi) I)$ (Dean)
  - Smooth-CAR model:
    $\eta = \log(e) + X\beta + Z\alpha + b$, $b \sim N(0, G_b)$
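The role of the offset in these linear predictors can be illustrated as follows. This is a minimal sketch under a log link, with hypothetical toy values for `e`, `X`, `beta` and `b`; the helper name `expected_counts` is not from the original slides.

```python
import numpy as np

def expected_counts(e, X, beta, Z=None, alpha=None, b=None):
    """Poisson mean mu = exp(eta), with eta = log(e) + X beta [+ Z alpha] [+ b]."""
    eta = np.log(e) + X @ beta
    if Z is not None:
        eta = eta + Z @ alpha          # smooth (penalised) part, if present
    if b is not None:
        eta = eta + b                  # area-level CAR random effects, if present
    return np.exp(eta)

# Hypothetical toy data: 3 areas, intercept plus one covariate
e = np.array([10.0, 5.0, 20.0])                      # expected cases
X = np.column_stack([np.ones(3), [0.0, 1.0, 2.0]])
beta = np.array([0.1, -0.2])
b = np.array([0.3, 0.0, -0.3])

mu = expected_counts(e, X, beta, b=b)   # CAR-type predictor: log(e) + X beta + b
relative_risk = mu / e                  # the offset cancels: exp(X beta + b)
```

Because $\log(e)$ enters with a fixed coefficient of one, the rest of the predictor models the log relative risk rather than the raw counts.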


Model comparison criteria

• To compare the proposed models we use:

  $\mathrm{AIC} = \mathrm{Dev} + 2 \times \mathrm{df}$

  $\mathrm{BIC} = \mathrm{Dev} + \log(n) \times \mathrm{df}$

  where Dev is the deviance and df is the effective dimension of the model (“degrees of freedom”):
  - it is a measure of the complexity of the fitted model,
  - it is calculated as $\mathrm{trace}(H)$, where $H$ is the hat matrix, $\hat{y} = Hy$.
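The criteria above can be sketched in a few lines. The example below is illustrative only: it assumes the hat matrix of a penalised fit has the form $H = B(B'WB + P)^{-1}B'W$ (so that $\mathrm{trace}(H) = \mathrm{trace}((B'WB + P)^{-1}B'WB)$), uses the standard Poisson deviance, and works on hypothetical toy data with an intercept-only model.

```python
import numpy as np

def poisson_deviance(y, mu):
    """Dev = 2 * sum(y*log(y/mu) - (y - mu)), with y*log(y/mu) taken as 0 when y == 0."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    safe_y = np.where(y > 0, y, 1.0)
    term = np.where(y > 0, y * np.log(safe_y / mu), 0.0)
    return 2.0 * np.sum(term - (y - mu))

def effective_df(B, P, w=None):
    """trace(H) for an assumed hat matrix H = B (B'WB + P)^{-1} B'W."""
    w = np.ones(B.shape[0]) if w is None else np.asarray(w, dtype=float)
    BtWB = B.T @ (w[:, None] * B)
    return float(np.trace(np.linalg.solve(BtWB + P, BtWB)))

def aic(dev, df):
    return dev + 2.0 * df

def bic(dev, df, n):
    return dev + np.log(n) * df

# Hypothetical counts; the intercept-only Poisson fit has mu = mean(y) and df = 1
y = np.array([2, 0, 1, 3, 1, 0, 2, 4, 3, 5])
B0 = np.ones((len(y), 1))
mu = np.full(len(y), y.mean())
df0 = effective_df(B0, np.zeros((1, 1)), w=mu)
dev0 = poisson_deviance(y, mu)
aic0, bic0 = aic(dev0, df0), bic(dev0, df0, n=len(y))

# A penalty lowers the effective dimension below the number of columns
x = np.linspace(0.0, 1.0, 10)
B = np.column_stack([np.ones(10), x, x**2])
df_pen = effective_df(B, 5.0 * np.diag([0.0, 0.0, 1.0]))
```

In the second part, only the quadratic coefficient is penalised, so the effective dimension falls strictly between 2 and 3.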


Comparison of fitted models

                     Parameters
Model                $\lambda_1$   $\lambda_2$   $\sigma_s^2$   $\kappa^{-1}$   $\phi$   AIC      BIC      df
Smooth: Poisson      11.75         3.63          -              -               -        114.04   228.46   15.90
CAR: Dean            -             -             0.78           -               0.99     89.36    179.56   32.78
Smooth-CAR: Dean     30.11         16.37         0.53           -               0.97     87.46    175.70   30.64

Observations:

• $\phi \approx 1$: overdispersion is due to “structured” spatial correlation ($\sigma_b^2 Q^-$).

• Smooth-CAR performs better in terms of the selected criteria.


Dean’s CAR model:

[Figure: three panels, (a) Linear Trend, (b) CAR random effect, (c) CAR; common scale from −1.0 to 1.5]

(a) Large-scale linear trend: $X\beta$

(b) CAR structured random effects: $b \sim N(0, G_b)$

(c) $X\beta + b$


Smooth-CAR model:

[Figure: three panels, (a) Smooth Trend, (b) CAR component, (c) Trend+CAR; common scale from −1.0 to 1.5]

(a) Smooth large-scale spatial trend: $X\beta + Z\alpha$

(b) CAR structured random effects: $b \sim N(0, G_b)$

(c) $X\beta + Z\alpha + b$


Spatio-temporal data Smoothing with P-splines


Spatio-temporal data

• Response variable $y_{ijt}$,
  - measured over geographical locations $s = (x_i, x_j)$, with $i, j = 1, \dots, n$,
  - and over time periods $x_t$, for $t = 1, \dots, T$.

• ISSUE: huge amount of data available
  - e.g. environmental data, epidemiological studies, disease mapping applications, ...

• Smoothing techniques:
  - Study spatial and temporal trends.
  - Space and time interactions.

• 3-dimensional smoothing: P-splines and GLAM.


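Example of GLAM in 3d (Currie et al., 2006)

• 3d case: $f(x_1, x_2, x_3) = B\theta$

• Basis: $B = B_1 \otimes B_2 \otimes B_3$

• $\theta$ can be expressed as a 3d array $A = \{\theta_{ijk}\}$ of dimension $c_1 \times c_2 \times c_3$

[Figure: the coefficient array drawn as a cube with rows $1, \dots, c_1$, columns $1, \dots, c_2$ and layers $1, \dots, c_3$; the corners are labelled $\theta_{(1,1,1)}$, $\theta_{(1,c_2,1)}$, $\theta_{(c_1,1,1)}$, $\theta_{(c_1,c_2,1)}$, $\theta_{(1,1,c_3)}$, $\theta_{(1,c_2,c_3)}$, $\theta_{(c_1,1,c_3)}$ and $\theta_{(c_1,c_2,c_3)}$]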
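The point of the array form is that $B\theta$ can be evaluated without ever building the full Kronecker product. The sketch below checks this numerically; random matrices stand in for the three marginal B-spline bases, and the dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: n1 x n2 x n3 grid of observations, c1 x c2 x c3 coefficients
n1, n2, n3 = 4, 5, 6
c1, c2, c3 = 2, 3, 4

# Random stand-ins for the marginal B-spline bases B1, B2, B3
B1 = rng.normal(size=(n1, c1))
B2 = rng.normal(size=(n2, c2))
B3 = rng.normal(size=(n3, c3))

# Coefficient array A = {theta_ijk}; theta is its row-major vectorisation
A = rng.normal(size=(c1, c2, c3))
theta = A.ravel()

# Direct evaluation: build the full (n1*n2*n3) x (c1*c2*c3) basis
B_full = np.kron(np.kron(B1, B2), B3)
f_kron = B_full @ theta

# Array evaluation: apply each marginal basis along its own dimension of A
f_glam = np.einsum("ai,bj,ck,ijk->abc", B1, B2, B3, A)

same = np.allclose(f_kron, f_glam.ravel())
```

The array route only ever stores the three small marginal bases, which is where the computational savings come from on large grids.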


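• 3d penalty matrix:
  - Set penalties over the 3d array $A$:

$$P = \underbrace{\lambda_1 D_1'D_1 \otimes I_{c_2} \otimes I_{c_3}}_{\text{row-wise}} + \underbrace{\lambda_2 I_{c_1} \otimes D_2'D_2 \otimes I_{c_3}}_{\text{column-wise}} + \underbrace{\lambda_t I_{c_1} \otimes I_{c_2} \otimes D_t'D_t}_{\text{layer-wise}}$$

• For spatio-temporal data:

$$f(\underbrace{\text{longitude}, \text{latitude}}_{\text{space}}, \text{time})$$

  - Spatial anisotropy ($\lambda_1 \neq \lambda_2$): different amounts of smoothing for latitude and longitude.
  - Temporal smoothing ($\lambda_t$).
  - Space-time interaction.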
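The three-term penalty can be assembled directly from difference matrices. This is a minimal sketch assuming second-order differences for every $D$; the function names, dimensions and smoothing parameters are hypothetical.

```python
import numpy as np

def diff_penalty(c, order=2):
    """D'D for the difference matrix D of the given order acting on c coefficients."""
    D = np.diff(np.eye(c), n=order, axis=0)   # shape (c - order, c)
    return D.T @ D

def penalty_3d(c1, c2, ct, lam1, lam2, lamt, order=2):
    """P = lam1 D1'D1 (x) I (x) I + lam2 I (x) D2'D2 (x) I + lamt I (x) I (x) Dt'Dt."""
    I1, I2, It = np.eye(c1), np.eye(c2), np.eye(ct)
    row_wise = np.kron(np.kron(diff_penalty(c1, order), I2), It)
    col_wise = np.kron(np.kron(I1, diff_penalty(c2, order)), It)
    layer_wise = np.kron(np.kron(I1, I2), diff_penalty(ct, order))
    return lam1 * row_wise + lam2 * col_wise + lamt * layer_wise

# Hypothetical dimensions; lam1 != lam2 gives anisotropic spatial smoothing
c1, c2, ct = 4, 5, 6
P = penalty_3d(c1, c2, ct, lam1=10.0, lam2=1.0, lamt=0.5)

# A coefficient array that is linear in each index is not penalised at order 2
i, j, k = np.meshgrid(np.arange(c1), np.arange(c2), np.arange(ct), indexing="ij")
theta_linear = (i + 2 * j + 3 * k).ravel().astype(float)
unpenalised = np.allclose(P @ theta_linear, 0.0)
```

With second-order differences the unpenalised part is spanned by coefficient arrays that are linear in each index separately, which gives a null space of dimension $2 \times 2 \times 2 = 8$.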


• For spatio-temporal data, we propose the B-spline basis

  $B = B_s \otimes B_t$,

  where $B_s$ is the spatial B-spline basis (the row-wise Kronecker product $B_1 \,\Box\, B_2$) and $B_t$ is the B-spline basis for time, of dimension $t \times c_3$.

• As GLAM: given $y_{ijt} = Y_{t \times n}$ and $\theta_{ijt} = A_{c_t \times c_s}$, we have

  $E[Y] = B_t A B_s'$

• As mixed models: $B\theta = X\beta + Z\alpha$
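The equivalence between the Kronecker form and the array form $B_t A B_s'$ can be checked numerically. In the sketch below, random matrices stand in for the marginal B-spline bases, the spatial basis is formed as a row-wise Kronecker product (the helper name `box_product` is hypothetical), and vectors are stacked column by column.

```python
import numpy as np

def box_product(B1, B2):
    """Row-wise Kronecker product: row i of the result is kron(B1[i], B2[i])."""
    n = B1.shape[0]
    return (B1[:, :, None] * B2[:, None, :]).reshape(n, -1)

rng = np.random.default_rng(1)

# Hypothetical sizes: n locations, T time points
n, T = 6, 5
c1, c2, ct = 3, 2, 4

# Random stand-ins for the marginal bases
B1 = rng.random((n, c1))
B2 = rng.random((n, c2))
Bt = rng.random((T, ct))

Bs = box_product(B1, B2)          # n x cs spatial basis, cs = c1 * c2
cs = Bs.shape[1]
A = rng.normal(size=(ct, cs))     # coefficient array, ct x cs

# Array (GLAM) form: T x n matrix of fitted values
fit_array = Bt @ A @ Bs.T

# Kronecker form: (Bs (x) Bt) vec(A), with column-major vec
fit_kron = np.kron(Bs, Bt) @ A.ravel(order="F")

same = np.allclose(fit_kron, fit_array.ravel(order="F"))
```

This is the identity $\mathrm{vec}(B_t A B_s') = (B_s \otimes B_t)\,\mathrm{vec}(A)$, so the full $nt \times c_s c_t$ basis never has to be stored.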


ANOVA-Type Interaction Models

Smooth-ANOVA decomposition models

• Chen (1993), Gu (2002):
  - “Smoothing-Spline ANOVA” (SS-ANOVA).
  - Interpretation as “main effects” and “interactions”.

• Models of type:

  $y = f(x_1) + f(x_2) + f(x_t)$   (“main/additive effects”)
  $\quad + f(x_1, x_2) + f(x_1, x_t) + f(x_2, x_t)$   (“2-way interactions”)
  $\quad + f(x_1, x_2, x_t)$   (“3-way interactions”)

• PROBLEMS:
  - identifiability, and
  - basis dimension (“curse of dimensionality”)


P-spline ANOVA model for spatio-temporal smoothing

• Lee and Durbán (2009a) consider:

  $y = \gamma + f_s(x_1, x_2) + f_t(\text{time}) + f_{st}(x_1, x_2, \text{time}) + \varepsilon$,

  where
  $f_s(x_1, x_2) \equiv$ spatial 2d smooth surface,
  $f_t(\text{time}) \equiv$ smooth time trend,
  $f_{st}(x_1, x_2, \text{time}) \equiv$ space-time interaction.

• We need to construct an identifiable model.

• Our approach is based on:
  - low-rank basis (P-splines),
  - the mixed model representation and SVD properties.


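Basis, Coefficients and Penalty

• For each smooth term $f(\cdot)$ in the spatio-temporal ANOVA model we have:
  - B-spline basis:

    $B = [\, 1_{nt} : B_s \otimes 1_t : 1_n \otimes B_t : B_s \otimes B_t \,]$

  - vector of coefficients:

    $\theta = (\gamma, \theta^{(s)\prime}, \theta^{(t)\prime}, \theta^{(st)\prime})'$

  - and a block-diagonal penalty:

    $$P = \begin{pmatrix} 0 & & & \\ & P_s & & \\ & & P_t & \\ & & & P_{st} \end{pmatrix},$$

    where $P_s$ = 2d spatial penalty, $P_t$ = 1d penalty for time, $P_{st}$ = 3d space-time penalty.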
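The basis and penalty above can be assembled as in the following sketch. Several choices here are assumptions made for illustration: degree-1 B-splines ("hat" functions) are used as marginal bases, the spatial basis is a row-wise Kronecker product, all penalties use second-order differences with smoothing parameters set to 1, and every function name and dimension is hypothetical.

```python
import numpy as np

def hat_basis(x, c):
    """Degree-1 B-spline basis on [0, 1] with c equally spaced knots; rows sum to 1."""
    knots = np.linspace(0.0, 1.0, c)
    h = knots[1] - knots[0]
    return np.clip(1.0 - np.abs(x[:, None] - knots[None, :]) / h, 0.0, None)

def box_product(B1, B2):
    """Row-wise Kronecker product."""
    return (B1[:, :, None] * B2[:, None, :]).reshape(B1.shape[0], -1)

def diff_penalty(c, order=2):
    D = np.diff(np.eye(c), n=order, axis=0)
    return D.T @ D

def block_diag(*blocks):
    """Place square blocks along the diagonal of a zero matrix."""
    size = sum(b.shape[0] for b in blocks)
    out = np.zeros((size, size))
    pos = 0
    for b in blocks:
        k = b.shape[0]
        out[pos:pos + k, pos:pos + k] = b
        pos += k
    return out

rng = np.random.default_rng(2)
n, T = 30, 12                 # hypothetical: 30 locations, 12 time points
c1, c2, ct = 4, 4, 5

Bs = box_product(hat_basis(rng.random(n), c1), hat_basis(rng.random(n), c2))
Bt = hat_basis(np.linspace(0.0, 1.0, T), ct)
cs = Bs.shape[1]

# B = [ 1_nt : Bs (x) 1_t : 1_n (x) Bt : Bs (x) Bt ]
B = np.hstack([
    np.ones((n * T, 1)),
    np.kron(Bs, np.ones((T, 1))),
    np.kron(np.ones((n, 1)), Bt),
    np.kron(Bs, Bt),
])

# Block-diagonal penalty diag(0, Ps, Pt, Pst), all smoothing parameters equal to 1
I1, I2, It = np.eye(c1), np.eye(c2), np.eye(ct)
D1, D2, Dt = diff_penalty(c1), diff_penalty(c2), diff_penalty(ct)
Ps = np.kron(D1, I2) + np.kron(I1, D2)
Pt = Dt
Pst = (np.kron(np.kron(D1, I2), It) + np.kron(np.kron(I1, D2), It)
       + np.kron(np.kron(I1, I2), Dt))
P = block_diag(np.zeros((1, 1)), Ps, Pt, Pst)

rank_B = np.linalg.matrix_rank(B)
```

Because B-spline bases sum to one across each row, the intercept and main-effect columns are linear combinations of the interaction columns, so `rank_B` is at most `cs * ct` and strictly smaller than the number of columns of `B`.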


• However, B is NOT full column-rank (“linear dependency”)

• Model is NOT identifiable

Solution:

• Reparameterize as a mixed model (using SVD).

• For each term we have a basis [ X : Z ]:

  $f_s(x_1, x_2) \equiv x_1 : x_2$   (1)

  $f_t(x_t) \equiv x_t$   (2)

  $f_{st}(x_1, x_2, x_t) \equiv x_1 : x_2 : x_t$   (3)

• Some terms in (1) and (2) also appear in (3).
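To illustrate the reparameterization step, the sketch below applies one common construction from the P-spline mixed-model literature to a single 1d term: the eigendecomposition (equivalently, the SVD) of $D'D$ splits the coefficients into an unpenalised part and a penalised part. Whether this matches the exact transformation used on these slides is an assumption; the degree-1 basis and all names are hypothetical.

```python
import numpy as np

def hat_basis(x, c):
    """Degree-1 B-spline basis on [0, 1] with c equally spaced knots."""
    knots = np.linspace(0.0, 1.0, c)
    h = knots[1] - knots[0]
    return np.clip(1.0 - np.abs(x[:, None] - knots[None, :]) / h, 0.0, None)

def mixed_model_bases(B, order=2):
    """Return X, Z and the transformation T_mat such that B @ T_mat = [X : Z]."""
    c = B.shape[1]
    D = np.diff(np.eye(c), n=order, axis=0)
    evals, U = np.linalg.eigh(D.T @ D)          # eigenvalues in ascending order
    U_n, U_s = U[:, :order], U[:, order:]       # null space / penalised directions
    scale = np.sqrt(evals[order:])              # square roots of positive eigenvalues
    X = B @ U_n                                 # fixed-effects (unpenalised) part
    Z = B @ (U_s / scale)                       # random-effects part, identity penalty
    T_mat = np.hstack([U_n, U_s / scale])
    return X, Z, T_mat

c = 8
x = np.linspace(0.0, 1.0, 50)
B = hat_basis(x, c)
X, Z, T_mat = mixed_model_bases(B, order=2)

# In the new parameterisation the penalty D'D becomes diag(0, 0, 1, ..., 1)
D = np.diff(np.eye(c), n=2, axis=0)
F = T_mat.T @ D.T @ D @ T_mat
```

Here `X` has two columns spanning the unpenalised (linear) part and `Z` carries the penalised part, so the fit $B\theta$ is rewritten as $X\beta + Z\alpha$ with an identity penalty on $\alpha$. In the multidimensional case, writing each term this way makes the repeated columns visible.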


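• The mixed model representation allows us to identify the columns to remove in order to maintain the identifiability of the model, and to obtain a block-diagonal penalty $F$:

$$F = \begin{pmatrix} 0_8 & & & \\ & F_s & & \\ & & F_t & \\ & & & F_{st} \end{pmatrix},$$

  with smoothing parameters $\lambda_1, \lambda_2$ (for $F_s$), $\lambda_t$ (for $F_t$) and $\tau_1, \tau_2, \tau_t$ (for $F_{st}$).

• In the P-splines context, this is equivalent to applying constraints over the regression coefficients $\theta_{i,j,k}$.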


• For the ANOVA spatio-temporal model, the resulting mixed model reparameterization is equivalent to applying the following constraints:
  - time effect coefficients:

    $$\sum_{t=1}^{c_t} \theta^{(t)}_t = 0,$$

  - constraints over the spatio-temporal array of coefficients $\Theta^{(st)}$, of dimensions $c_t \times c_s$:

    $$\sum_{i}^{c_1} \theta^{(st)}_{t,ij} = \sum_{j}^{c_2} \theta^{(st)}_{t,ij} = \sum_{i}^{c_1} \sum_{j}^{c_2} \theta^{(st)}_{t,ij} = 0.$$


In practice

• We only need to construct the matrices X, Z and the penalty F:

                         $f_s(x_1, x_2)$            $f_t(x_t)$      $f_{st}(x_1, x_2, x_t)$
X ≡ by columns           $x_1 : x_2$                $x_t$           $(x_1, x_2, x_t)$
Z ≡ by blocks            ″                          ″               ″
F ≡ block-diagonal       $F_s$                      $F_t$           $F_{st}$
                         $(\lambda_1, \lambda_2)$   $\lambda_t$     $(\tau_1, \tau_2, \tau_t)$


Application: Environmental spatio-temporal data

Ozone pollution in Europe (Lee and Durbán, 2009a)

• Sample of 45 monitoring stations

• Monthly averages of O3 levels (in µg/m³)

• from January 1999 to December 2005 ($t = 1, \dots, 84$)

Models:

• Additive: $f_s(x_1, x_2) + f_t(x_t)$

• Spatio-temporal interaction (ANOVA): $f_s(x_1, x_2) + f_t(x_t) + f_{st}(x_1, x_2, x_t)$


Spatial 2d + time

$f_s(x_1, x_2) + f_t(x_t)$

[Figure: two panels. Left: fitted spatial surface plotted against longitude and latitude, scale 40 to 90. Right: fitted f(time) against year, 1999 to 2005, vertical axis from −20 to 20]

• Space-time interaction is not considered

• The smooth time trend is additive


Spatio-temporal ANOVA model

[Animation: y = f(space) + f(time) + f(space,time); frame shown: 1999 : 1]


Comparison of fitted values: Additive vs ANOVA

• Additive model fit: $f_s(x_1, x_2) + f_t(x_t)$

• ANOVA model fit: $f_s(x_1, x_2) + f_t(x_t) + f_{st}(x_1, x_2, x_t)$

[Figure: two panels (additive fit, ANOVA fit) of O3 against year, 1999 to 2006, with curves for Spain, Sweden, Austria and UK; vertical axis from 20 to 140]

• The additive model assumes a smooth spatial surface over all monitoring stations that remains constant over time.

• The ANOVA model captures individual characteristics of the stations throughout time.


Comparison of models: ANOVA and Additive

Model      AIC        df
ANOVA      14280.73   366.03
Additive   16506.28   65.98

Observations:

• Best overall performance of ANOVA in terms of AIC.

• The ANOVA model is more realistic than the additive one, and easier to decompose and interpret in terms of the fit.


Spatio-temporal Disease Mapping

P-spline ANOVA model for disease mapping

• $Y$ and $E$ are $t \times n$ arrays of observed and expected cases of disease over $t$ time periods, and $M = \log(Y / E)$.

• Consider an ANOVA model for $\eta$:

  $f_s(x_1, x_2) + f_t(x_t) + f_{st}(x_1, x_2, x_t)$

[Figure: ten panels labelled t1 to t10; scales from −0.5 to 1.5 and from 0 to 8]


Summary

• New flexible approach for spatial and spatio-temporal data smoothing:
  - based on P-splines as mixed models and
  - ANOVA decomposition

• Methodology also extensible to disease mapping applications.

• Computationally efficient algorithms (GLAM)


Bibliography

Lee, D.-J. and Durbán, M. (2009). Smooth-CAR mixed models for spatial count data. Computational Statistics and Data Analysis, 53(8):2968-2979.

Lee, D.-J. and Durbán, M. (2009a). P-spline ANOVA-Type interaction models for spatio-temporal smoothing. Submitted.

Eilers, P.H.C., Currie, I.D. and Durbán, M. (2006). Fast and compact smoothing on large multidimensional grids. Computational Statistics and Data Analysis, 50(1):61-76.

Currie, I.D., Durbán, M. and Eilers, P.H.C. (2006). Generalized linear array models with applications to multidimensional smoothing. Journal of the Royal Statistical Society B, 68:1-22.
