open-source glmm tools
DESCRIPTION
Talk at UWO, 7 April 2011, on open-source tools for generalized linear mixed models
TRANSCRIPT
Precursors GLMMs Results Conclusions References
Open-source tools for estimation and inference using generalized linear mixed models
Ben Bolker
McMaster University, Departments of Mathematics & Statistics and Biology
7 April 2011
Ben Bolker McMaster University Departments of Mathematics & Statistics and Biology
Open-source GLMMs
Outline
1 Precursors: Examples; Definitions
2 GLMMs: Estimation; Inference: tests; Inference: confidence intervals
3 Results: Glycera; Arabidopsis
4 Conclusions
Examples
Coral protection by symbionts
[Figure: bar chart. x-axis: Symbionts (none, shrimp, crabs, both); y-axis: Number of blocks (0 to 10); bars labelled by Number of predation events (0, 1, 2)]
Environmental stress: Glycera cell survival
[Figure: Glycera cell survival (colour scale 0.0 to 1.0) as a function of H2S and Copper (tick values 0, 33.3, 66.6, 133.3 and 0, 0.03, 0.1, 0.32); panels: Osm = 12.8, 22.4, 32, 41.6, 51.2, each under Normoxia and Anoxia]
Arabidopsis response to fertilization & clipping
panel: nutrient, color: genotype
[Figure: Log(1 + fruit set) (y-axis, 0 to 5) for unclipped vs. clipped plants; panels: nutrient 1 and nutrient 8]
Glossary: data
Fixed effects: predictors where interest is in specific levels
Random effects (RE): predictors where interest is in the distribution rather than the levels (e.g. blocks) (Gelman, 2005)
Crossed RE: multiple REs where levels of one occur in more than one level of another (e.g. block × year; cf. nested); see http://lme4.r-forge.r-project.org/book/ and Pinheiro and Bates (2000)
Data challenges
Estimation | Computation | Inference
Small # RE levels (<5–6) | Large n | Small N (< 40)
Overdispersion | Multiple REs | Small n
Crossed REs | Crossed REs |
Spatial/temporal correlation | |
Unusual distributions (Gamma, neg. binom. ...) | |
Definitions
Generalized linear models
Distributions from the exponential family (Poisson, binomial, Gaussian, Gamma, neg. binomial (known k), ...)
Means = linear functions of predictors on the scale of the link function (identity, log, logit, ...)
$Y \sim D(g^{-1}(X\beta), \phi)$
φ often set to 1 (Poisson, binomial) except for quasi-likelihood approaches
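As a minimal sketch of the GLM definition above, the following R code simulates Poisson data with a log link and recovers β with `glm()`. The seed, sample size, and coefficient values are illustrative assumptions, not taken from the talk.

```r
# Simulate Y ~ Poisson(exp(X %*% beta)) and fit a log-link GLM (phi fixed at 1).
set.seed(101)                          # arbitrary seed (assumption)
n    <- 500                            # illustrative sample size
x    <- runif(n)                       # one continuous predictor
X    <- cbind(1, x)                    # design matrix: intercept and slope
beta <- c(0.5, 1.2)                    # "true" coefficients (assumed values)
eta  <- drop(X %*% beta)               # linear predictor on the link scale
y    <- rpois(n, lambda = exp(eta))    # inverse link g^{-1} = exp

fit <- glm(y ~ x, family = poisson(link = "log"))
est <- unname(coef(fit))               # estimates of beta
print(round(est, 2))
```

The estimates should land close to the assumed β, with differences of the order of the standard errors reported by `summary(fit)`.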
Generalized linear mixed models
Add random effects:
$Y \sim D(g^{-1}(X\beta + Zu), \phi)$
$u \sim \mathrm{MVN}(0, \Sigma)$
Synonyms: multilevel, hierarchical models
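To make the roles of X, Z, and u concrete, here is a small simulation sketch of a random-intercept binomial GLMM in base R. The block structure, parameter values, and the choice Σ = σ_b² I are assumptions made for illustration only.

```r
# Simulate Y ~ Bernoulli(plogis(X beta + Z u)), u ~ MVN(0, sigma_b^2 * I).
set.seed(202)                                   # arbitrary seed (assumption)
n_block <- 10                                   # number of RE levels (assumed)
n_per   <- 20                                   # observations per block (assumed)
block   <- factor(rep(seq_len(n_block), each = n_per))
x       <- runif(n_block * n_per)

X       <- cbind(1, x)                          # fixed-effect design matrix
Z       <- model.matrix(~ 0 + block)            # RE design: one indicator column per block
beta    <- c(-0.5, 1)                           # fixed effects (assumed values)
sigma_b <- 0.7                                  # among-block SD (assumed value)
u       <- rnorm(n_block, mean = 0, sd = sigma_b)

eta <- drop(X %*% beta + Z %*% u)               # linear predictor including REs
y   <- rbinom(length(eta), size = 1, prob = plogis(eta))
```

With lme4 installed, data like these could be fitted with `glmer(y ~ x + (1 | block), family = binomial)`; that call is left out of the block so the sketch runs in base R.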
Marginal likelihood
Likelihood (Prob(data|parameters)) requires integrating over possible values of the REs to get the marginal likelihood, e.g.:
likelihood of the i-th obs. in block j is $L(x_{ij} \mid \theta_j, \sigma^2_w)$
likelihood of a particular block mean $\theta_j$ is $L(\theta_j \mid 0, \sigma^2_b)$
marginal likelihood is $\int L(x_{ij} \mid \theta_j, \sigma^2_w) \, L(\theta_j \mid 0, \sigma^2_b) \, d\theta_j$
Balance (dispersion of RE around 0) with (dispersion of data conditional on RE)
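For the Gaussian case the integral can be checked numerically. The sketch below assumes normal likelihoods for both terms and made-up values for x, σ_w, and σ_b; in that special case the marginal density has the closed form N(x | 0, σ_w² + σ_b²), which the numerical integral should reproduce.

```r
# Marginal likelihood of one observation x_ij, integrating out the block mean theta_j.
sigma_w <- 1.0    # within-block SD (assumed value)
sigma_b <- 2.0    # among-block SD (assumed value)
x       <- 1.5    # a single observation (made up)

integrand <- function(theta) {
  dnorm(x, mean = theta, sd = sigma_w) *      # L(x_ij | theta_j, sigma_w^2)
    dnorm(theta, mean = 0, sd = sigma_b)      # L(theta_j | 0, sigma_b^2)
}

# Finite limits are wide enough here; the integrand is negligible beyond them.
marg_numeric <- integrate(integrand, lower = -20, upper = 20, rel.tol = 1e-8)$value

# Closed form for the Gaussian-Gaussian case.
marg_exact <- dnorm(x, mean = 0, sd = sqrt(sigma_w^2 + sigma_b^2))
```

For non-Gaussian responses no such closed form exists, which is why the estimation methods later in the talk approximate this integral.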
Shrinkage
[Figure: Arabidopsis block estimates. x-axis: Genotype (0 to 25); y-axis: Mean(log) fruit set; each point carries a numeric label]
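Shrinkage can be illustrated with a simplified Gaussian sketch in which both variances and the overall mean are treated as known. All numbers below are made up, and this is not the model fitted to the Arabidopsis data; it only shows how small blocks are pulled more strongly toward the overall mean.

```r
# Shrinkage of block means toward the overall mean (Gaussian case, known variances).
sigma_b <- 1.0                      # among-block SD (assumed value)
sigma_w <- 2.0                      # within-block SD (assumed value)
mu      <- 0                        # overall mean, assumed known
n       <- c(2, 5, 10, 23)          # block sample sizes (made up)
ybar    <- c(-3.0, 1.5, 0.8, 0.2)   # raw block means (made up)

# Weight on the raw block mean: close to 1 for large blocks, small for tiny blocks.
w      <- sigma_b^2 / (sigma_b^2 + sigma_w^2 / n)
shrunk <- mu + w * (ybar - mu)

print(round(cbind(n, ybar, w, shrunk), 3))
```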
RE examples
Coral symbionts: simple experimental blocks, RE affects intercept (overall probability of predation in block)
Glycera: applied to cells from 10 individuals, RE again affects intercept (cell survival prob.)
Arabidopsis: region (3 levels, treated as fixed) / population / genotype: affects intercept (overall fruit set) as well as treatment effects (nutrients, herbivory, interaction)
Estimation
Penalized quasi-likelihood (PQL)
alternate steps of estimating a GLM using known RE variances to calculate weights, and estimating LMMs given the GLM fit (Breslow, 2004)
flexible (allows spatial/temporal correlations, crossed REs)
biased for small unit samples (e.g. counts < 5, binary or low-survival data)
widely used (SAS PROC GLIMMIX, R MASS:glmmPQL), including in ≈ 90% of small-unit-sample cases
Laplace approximation
approximate marginal likelihood
for given β, θ (RE parameters), find conditional modes by penalized, iteratively reweighted least squares; then use a second-order Taylor expansion around the conditional modes
more accurate than PQL
reasonably fast and flexible
lme4:glmer, glmmML, glmmADMB, R2ADMB (AD Model Builder)
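The idea can be sketched for a single block with a Poisson response and one normal random intercept. This is an illustration under assumed values (the counts, β0, and σ are made up), and it finds the conditional mode with a generic one-dimensional optimizer rather than the penalized IRLS used by the packages above.

```r
# Laplace approximation to the marginal likelihood of one block:
#   integral over u of  prod_i Poisson(y_i | exp(beta0 + u)) * N(u | 0, sigma^2)
y     <- c(2, 0, 3, 1)   # counts in one block (made up)
beta0 <- 0.3             # fixed intercept (assumed value)
sigma <- 0.8             # RE standard deviation (assumed value)

# Log of the integrand (joint log-density of data and random effect).
log_joint <- function(u) {
  sum(dpois(y, lambda = exp(beta0 + u), log = TRUE)) +
    dnorm(u, mean = 0, sd = sigma, log = TRUE)
}

# Conditional mode of u (stand-in for penalized IRLS).
u_hat <- optimize(log_joint, interval = c(-10, 10), maximum = TRUE)$maximum

# Negative second derivative of log_joint at the mode (analytic for this model).
curv <- length(y) * exp(beta0 + u_hat) + 1 / sigma^2

# Second-order Taylor expansion turns the integral into a Gaussian integral.
laplace <- exp(log_joint(u_hat)) * sqrt(2 * pi / curv)

# Brute-force numerical integral for comparison.
exact <- integrate(function(u) exp(sapply(u, log_joint)),
                   lower = -10, upper = 10, rel.tol = 1e-8)$value
print(c(laplace = laplace, exact = exact))
```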
Adaptive Gauss-Hermite quadrature (AGQ)
as above, but compute additional terms in the integral (typically 8, but often up to 20)
most accurate
slowest, hence not flexible (2–3 RE at most, maybe only 1)
lme4:glmer, glmmML, repeated
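A rough sketch of adaptive Gauss-Hermite quadrature for one block with a single random intercept follows. The helper `gauss_hermite()` is a hypothetical name; it builds nodes and weights with the standard Golub-Welsch eigenvalue construction. The data and parameter values are made up, and the mode is found with a generic optimizer. With one node the rule reduces to the Laplace approximation.

```r
# Gauss-Hermite rule: integral of f(x) * exp(-x^2) dx  ~  sum(weights * f(nodes)).
gauss_hermite <- function(n) {
  i <- seq_len(n - 1)
  J <- matrix(0, n, n)                       # Jacobi matrix of the Hermite recurrence
  J[cbind(i, i + 1)] <- sqrt(i / 2)
  J[cbind(i + 1, i)] <- sqrt(i / 2)
  e <- eigen(J, symmetric = TRUE)
  list(nodes = e$values, weights = sqrt(pi) * e$vectors[1, ]^2)
}

y     <- c(2, 0, 3, 1)   # counts in one block (made up)
beta0 <- 0.3             # fixed intercept (assumed value)
sigma <- 0.8             # RE standard deviation (assumed value)

log_joint <- function(u) {
  sum(dpois(y, lambda = exp(beta0 + u), log = TRUE)) +
    dnorm(u, mean = 0, sd = sigma, log = TRUE)
}

# Centre and scale the nodes at the conditional mode (the "adaptive" part).
u_hat <- optimize(log_joint, interval = c(-10, 10), maximum = TRUE)$maximum
s_hat <- 1 / sqrt(length(y) * exp(beta0 + u_hat) + 1 / sigma^2)

gh      <- gauss_hermite(8)                  # 8 quadrature points
u_nodes <- u_hat + sqrt(2) * s_hat * gh$nodes
agq     <- sqrt(2) * s_hat *
  sum(gh$weights * exp(gh$nodes^2 + sapply(u_nodes, log_joint)))

# Brute-force numerical integral for comparison.
exact <- integrate(function(u) exp(sapply(u, log_joint)),
                   lower = -10, upper = 10, rel.tol = 1e-8)$value
print(c(agq = agq, exact = exact))
```

The cost grows quickly with the number of random effects, because the nodes have to be laid out on a grid in every RE dimension.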
Bayesian approaches
Bayesians have to do nasty integrals anyway (to normalize the posterior probability density)
various flavours of stochastic Bayesian computation (Gibbs sampling, MCMC, etc.)
generally slower but more flexible
solves many problems of assessing confidence intervals
must specify priors, assess convergence
specialized: glmmAK, MCMCglmm (Hadfield, 2010), INLA
general: glmmBUGS, R2WinBUGS, BRugs (WinBUGS/OpenBUGS), R2jags, rjags (JAGS)
Overdispersion (slight tangent)
Variance greater than expected from statistical model
Quasi-likelihood approaches: MASS:glmmPQL
Extended distributions (e.g. negative binomial): glmmADMB
Observation-level random effects (e.g. lognormal-Poisson): lme4
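One common way to detect overdispersion is the ratio of the Pearson chi-square to the residual degrees of freedom. The sketch below shows this for a plain Poisson GLM (no random effects) fitted to simulated negative binomial counts; the data-generating values are assumptions, and the quasi-Poisson fit stands in for the quasi-likelihood idea in its simplest GLM form.

```r
# Detect overdispersion: Pearson chi-square / residual df should be near 1 for Poisson data.
set.seed(303)                                  # arbitrary seed (assumption)
n  <- 300
x  <- runif(n)
mu <- exp(0.5 + x)                             # assumed mean structure
y  <- rnbinom(n, mu = mu, size = 1)            # variance mu + mu^2 / size, so overdispersed

fit  <- glm(y ~ x, family = poisson)
disp <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)

# Quasi-likelihood: same coefficients, standard errors inflated by sqrt(dispersion).
qfit  <- glm(y ~ x, family = quasipoisson)
qdisp <- summary(qfit)$dispersion

print(c(pearson_ratio = disp, quasi_dispersion = qdisp))
```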
Comparison of coral symbiont results
[Figure: regression estimates (x-axis, −6 to 2) for the Symbiont, Crab vs. Shrimp, and Added symbiont effects, compared across GLM (fixed), GLM (pooled), PQL, Laplace, and AGQ fits]
Inference: tests
Wald tests [non-quadratic likelihood surfaces]
For OLS/linear models, the likelihood surface is quadratic; this is only asymptotically true for GLM(M)s
Wald tests (e.g. typical results of summary) assume a quadratic surface, based on curvature (information matrix)
always approximate, sometimes awful (Hauck-Donner effect)
do model comparison (F, score or likelihood ratio tests [LRT]) instead
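An extreme case makes the difference visible. In the made-up data below the two groups are completely separated, so the likelihood surface is far from quadratic: the Wald test from `summary()` finds nothing, while the likelihood ratio test is overwhelming. This uses an ordinary binomial GLM as a stand-in, since the same issue arises without random effects.

```r
# Wald vs. likelihood ratio test under complete separation (made-up data).
dat <- data.frame(group = factor(c("A", "B")),
                  succ  = c(0, 20),            # successes per group
                  fail  = c(20, 0))            # failures per group

# glm() warns about fitted probabilities of 0 or 1; that is the point of the example.
m1 <- suppressWarnings(
  glm(cbind(succ, fail) ~ group, family = binomial, data = dat))
m0 <- glm(cbind(succ, fail) ~ 1, family = binomial, data = dat)

# Wald p-value: estimate / standard error, taken from the summary table.
wald_p <- coef(summary(m1))["groupB", "Pr(>|z|)"]

# LRT p-value: difference in deviance between null and full model.
lrt_stat <- deviance(m0) - deviance(m1)
lrt_p    <- pchisq(lrt_stat, df = 1, lower.tail = FALSE)

print(c(wald_p = wald_p, lrt_p = lrt_p))
```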
But . . .
Conditional F tests [Uncertainty in scale parameters]
Model comparison: in general $-2 \log L = D = \sum_i \mathrm{deviance}_i / \phi$
Classical linear models: $\sum_i \mathrm{deviance}_i$ and $\phi$ are both $\chi^2$ distributed, so $D \sim F(\nu_1, \nu_2)$
Denominator degrees of freedom (df) ($\nu_2$) for complex (unbalanced, crossed, R-side effects) models? Approximations: Satterthwaite, Kenward-Roger (Kenward and Roger, 1997; Schaalje et al., 2002)
Is D really ∼ F in these situations?
Scale parameters usually not estimated in GLMMs (Gamma, quasi-likelihood cases only).
But . . .
Likelihood ratio tests [non-normality of likelihood]
What about cases where φ is specified (e.g. ≡ 1)?
in GLM(M) case, numerator is only asymptotically χ2 anyway
Bartlett corrections (Cordeiro et al., 1994; Cordeiro and Ferrari, 1998), higher-order asymptotics: cond [neither extended to GLMMs!]
Tests of random effects [boundary problems]
LRT depends on the null hypothesis being within the parameter’s feasible range (Goldman and Whelan, 2000; Molenberghs and Verbeke, 2007)
violated e.g. by $H_0: \sigma^2 = 0$
In simple cases the null distribution is a mixture of $\chi^2$ distributions (e.g. $0.5\chi^2_0 + 0.5\chi^2_1$; emdbook:dchibarsq)
ignoring this leads to conservative tests (e.g. true p-value = 1/2 · nominal p-value)
simulation-based testing: RLRsim
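For the simple 50:50 mixture the corrected p-value is easy to compute by hand. The helper name `pchibarsq_upper()` below is hypothetical (it is not the emdbook function), and the test statistic is a made-up value.

```r
# Upper-tail probability under the mixture 0.5 * chi^2_0 + 0.5 * chi^2_1,
# where chi^2_0 is a point mass at zero.
pchibarsq_upper <- function(lrt) {
  ifelse(lrt <= 0, 1, 0.5 * pchisq(lrt, df = 1, lower.tail = FALSE))
}

lrt       <- 2.71                                         # made-up LRT statistic
naive_p   <- pchisq(lrt, df = 1, lower.tail = FALSE)      # ignores the boundary
mixture_p <- pchibarsq_upper(lrt)                         # accounts for the boundary

print(c(naive = naive_p, mixture = mixture_p))
```

For any positive statistic the mixture p-value is exactly half the naive one, which is the factor of 1/2 mentioned above.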
Information-theoretic approaches
Above issues apply, but less well understood (Greven, 2008; Greven and Kneib, 2010)
AIC is asymptotic
“corrected” AIC (AICc) (Hurvich and Tsai, 1989) derived for linear models, widely used but not tested elsewhere (Richards, 2005)
For comparing models with different REs, or for AICc, what is p?
AICcmodavg, MuMIn
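The usual small-sample correction is AICc = AIC + 2p(p + 1)/(n − p − 1). The sketch below is a bare-bones version with a hypothetical function name; it deliberately leaves p and n as user-supplied arguments, because what to count as a parameter (and as a sample) in a mixed model is exactly the open question raised above.

```r
# AIC with the small-sample correction, for a given log-likelihood.
aicc <- function(loglik, p, n) {
  stopifnot(n - p - 1 > 0)                 # correction is undefined otherwise
  aic <- -2 * loglik + 2 * p               # ordinary AIC
  aic + 2 * p * (p + 1) / (n - p - 1)      # add the correction term
}

# Example with made-up numbers: log-likelihood -50, 3 parameters, 20 observations.
print(aicc(loglik = -50, p = 3, n = 20))
```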
Parametric bootstrapping
fit null model to data
simulate “data” from null model
fit null and working model, compute likelihood difference
repeat to estimate null distribution
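library(lme4)  # provides simulate() and refit() methods for fitted mixed models
pboot <- function(m0, m1) {
  s <- simulate(m0)[[1]]        # one response vector simulated from the null model
  L0 <- logLik(refit(m0, s))    # null model refitted to the simulated response
  L1 <- logLik(refit(m1, s))    # working model refitted to the same response
  as.numeric(2 * (L1 - L0))     # likelihood ratio statistic
}
# fm2 is the fitted null model, fm1 the fitted working model
nulldist <- replicate(1000, pboot(fm2, fm1))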
Finite-sample problems
How far are we from “asymptopia”?
How much data (number of samples, number of RE levels)?
How many parameters (number of fixed-effect parameters, number of RE levels, number of RE parameters)?
Hope (#data) − (#parameters) ≫ 1, but if not?
Levels of focus
How many parameters does a RE take?
Somewhere between q and r (e.g., 1 and the number of levels for a variance) . . . shrinkage
Conditional vs. marginal AIC
Similar issues with Deviance Information Criterion (Spiegelhalter et al., 2002)
Inference: confidence intervals
Wald tests
a sometimes-crude approximation
computationally easy, especially for many-parameter models
use Wald Z (assume “residual df” large)? Or t, guessing at the residual df?
Available from most packages
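A Wald Z interval is just the estimate plus or minus a normal quantile times the standard error. The sketch below computes it by hand for a simulated Poisson GLM (the data are made up, and there are no random effects) and checks it against `confint.default()`, which uses the same construction.

```r
# Wald Z confidence intervals: estimate +/- z * standard error.
set.seed(404)                                 # arbitrary seed (assumption)
n <- 200
x <- runif(n)
y <- rpois(n, lambda = exp(0.2 + 0.8 * x))    # assumed data-generating model

fit <- glm(y ~ x, family = poisson)
est <- coef(fit)
se  <- sqrt(diag(vcov(fit)))                  # from the curvature (information matrix)
z   <- qnorm(0.975)                           # 95% two-sided interval

wald_ci <- cbind(lower = est - z * se, upper = est + z * se)
print(round(wald_ci, 3))
```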
Profile confidence intervals
Tedious to program
Computationally challenging
Inherits finite-size sample problems from LRT
lme4a (in development/soon!)
Bayesian posterior intervals
Marginal quantile or highest posterior density intervals
Computationally “free” with results of stochastic Bayesian computation
Easily extended to confidence intervals on predictions, etc.
Post hoc Markov chain Monte Carlo sampling available for some packages (glmmADMB, R2ADMB, eventually lme4a)
Summary
Large data
computation can be limiting
asymptotics better
Small data
RE variances may be poorly estimated / set to zero (informative priors can help)
inference tricky
Glycera
Effect on survival
[Figure: coefficient estimates (x-axis, −60 to 60) for Osm, Cu, H2S, Anoxia and all of their two-, three-, and four-way interactions, compared across MCMCglmm, glmer(OD:2), glmer(OD), glmmML, and glmer]
Effect on survival
[Figure: coefficient estimates (x-axis, −20 to 30) for H2S, Cu, Osm, Oxygen and their interactions up to Osm : Cu : H2S : Oxygen; terms grouped as main effects, 2-way, and 3-way]
Parametric bootstrap results
[Figure: inferred p value (y-axis) vs. true p value (x-axis), both 0.02 to 0.08; panels: Osm, Cu, H2S, Anoxia]
Arabidopsis
Arabidopsis: AIC comparison of REs
[Figure: ∆AIC (x-axis, 0 to 6) for the random-effects specifications clip(gen) X clip(popu), clip(gen) X nut(popu), clip(gen) X int(popu), nut(gen) X clip(popu), nut(gen) X nut(popu), nut(gen) X int(popu), int(gen) X clip(popu), int(gen) X nut(popu), int(gen) X int(popu), int(popu), nointeract]
Arabidopsis: fits with and without nutrient(genotype)
[Figure: regression estimates (x-axis, −1.0 to 1.5) for nutrient8, amdclipped, rack2, statusPetri.Plate, statusTransplant, and nutrient8:amdclipped, two estimates per term]
Primary tools
lme4: multiple/crossed REs, (profiling): fast
MCMCglmm: Bayesian, very flexible
glmmADMB: negative binomial, zero-inflated etc.
Most flexible: R2ADMB/AD Model Builder, R2WinBUGS/WinBUGS/R2jags/JAGS, INLA
Loose ends
Overdispersion and zero-inflation: MCMCglmm, glmmADMB
Spatial and temporal correlation (R-side effects): MASS:glmmPQL (sort of), GLMMarp, INLA; WinBUGS, AD Model Builder
Additive models: amer, gamm4, mgcv
Penalized methods (Jiang, 2008) (?)
Hierarchical GLMs: hglm, HGLMMM
Marginal models: geepack, gee
To be done
Many holes in knowledge (but what can be done?)
Faster algorithms, more parallel computation
Lots of implementation and clean-up
Benefits & costs of staying within the GLMM framework
Benefits & costs of diversity
More info: glmm.wikidot.com
Acknowledgements
Data: Josh Banta and Massimo Pigliucci (Arabidopsis); Adrian Stier and Sea McKeon (coral symbionts); Courtney Kagan, Jocelynn Ortega, David Julian (Glycera)
Co-authors: Mollie Brooks, Connie Clark, Shane Geange, John Poulsen, Hank Stevens, Jada White
References
Breslow, N.E., 2004. In D.Y. Lin and P.J. Heagerty, editors, Proceedings of the second Seattle symposium in biostatistics: Analysis of correlated data, pages 1–22. Springer. ISBN 0387208623.
Cordeiro, G.M. and Ferrari, S.L.P., 1998. Journal of Statistical Planning and Inference, 71(1-2):261–269. ISSN 0378-3758. doi:10.1016/S0378-3758(98)00005-6.
Cordeiro, G.M., Paula, G.A., and Botter, D.A., 1994. International Statistical Review / Revue Internationale de Statistique, 62(2):257–274. ISSN 03067734. doi:10.2307/1403512.
Gelman, A., 2005. Annals of Statistics, 33(1):1–53. doi:10.1214/009053604000001048.
Goldman, N. and Whelan, S., 2000. Molecular Biology and Evolution, 17(6):975–978.
Greven, S., 2008. Non-Standard Problems in Inference for Additive and Linear Mixed Models. Cuvillier Verlag, Göttingen, Germany. ISBN 3867274916.
Greven, S. and Kneib, T., 2010. Biometrika, 97(4):773–789.
Hadfield, J.D., 2010. Journal of Statistical Software, 33(2):1–22. ISSN 1548-7660.
Hurvich, C.M. and Tsai, C., 1989. Biometrika, 76(2):297–307. doi:10.1093/biomet/76.2.297.
Jiang, J., 2008. The Annals of Statistics, 36(4):1669–1692. ISSN 0090-5364. doi:10.1214/07-AOS517.
Kenward, M.G. and Roger, J.H., 1997. Biometrics, 53(3):983–997.
Molenberghs, G. and Verbeke, G., 2007. The American Statistician, 61(1):22–27. doi:10.1198/000313007X171322.
Pinheiro, J.C. and Bates, D.M., 2000. Mixed-effects models in S and S-PLUS. Springer, New York. ISBN 0-387-98957-9.
Richards, S.A., 2005. Ecology, 86(10):2805–2814. doi:10.1890/05-0074.
Schaalje, G., McBride, J., and Fellingham, G., 2002. Journal of Agricultural, Biological & Environmental Statistics, 7(4):512–524.
Spiegelhalter, D.J., Best, N., et al., 2002. Journal of the Royal Statistical Society B, 64:583–640.