introduction of mixed effect model

Introduction of Mixed effect model Learning by simulation Supstat Inc. Introduction of Mixed effect model http://nycdatascience.com/mixed_effect_model_supstat/index.html#1 1 of 34 1/29/14, 10:51 PM

Upload: vivian-s-zhang

Post on 10-Dec-2014


DESCRIPTION

Event link: http://www.meetup.com/NYC-Open-Data/events/161342472/ A free R workshop given by SupStat Inc at New York R user group and NYC Open Data Meetup group

TRANSCRIPT

Page 1: Introduction of mixed effect model

Introduction of Mixed effect model

Learning by simulation

Supstat Inc.


Page 2: Introduction of mixed effect model

Outline

· What is mixed effect model
· Fixed effect model
· Mixed effect model
  - Random Intercept model
  - Random Intercept and Slope Model
· General Mixed effect model
· Case study


Page 3: Introduction of mixed effect model

What is mixed effect model


Page 4: Introduction of mixed effect model

Classical normal linear model

Formulation:

Yi = b0 + b1*Xi + ei

· Yi is the response from subject i.
· Xi are the covariates.
· b0, b1 are the parameters that we want to estimate.
· ei are the random terms in the model, and are assumed to be independently and identically distributed from Normal(0, sigma^2). It is very important that there is no structure in ei: it represents the variation that could not be controlled in our study.


Page 5: Introduction of mixed effect model

Violation of independence assumption

In many cases, responses are not independent of each other. Such data usually have some cluster structure.

· Repeated measures, where measurements are taken multiple times from the same subjects. (clustered by subject)
· A survey of all the family members. (clustered by family)
· A survey of students from 20 classrooms in a high school. (clustered by classroom)
· Longitudinal data, also known as panel data, where several responses are collected from the same subjects over time. (clustered by subject)

We need new tools: the mixed effect model.
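To see what a violated independence assumption looks like in practice, here is a minimal sketch in base R. The cluster count, the sizes and the standard deviations are hypothetical values chosen for illustration and do not come from the slides. It simulates clustered data, fits the classical linear model anyway, and checks whether the residuals still carry cluster structure.

```r
# Hypothetical clustered data: 8 clusters (e.g. subjects), 25 observations each.
set.seed(42)
n_groups <- 8
n_per <- 25
group <- rep(seq_len(n_groups), each = n_per)
x <- runif(n_groups * n_per, min = 1, max = 5)

# Each cluster gets its own shift, so observations in a cluster move together.
shift <- rnorm(n_groups, mean = 0, sd = 3)
y <- 1 + 2 * x + shift[group] + rnorm(n_groups * n_per, mean = 0, sd = 1)

# Fit the classical linear model, which ignores the clusters.
fit <- lm(y ~ x)

# With independent errors the residual mean of every cluster would be near 0.
res_means <- tapply(resid(fit), group, mean)
print(round(res_means, 2))

# One-way ANOVA of the residuals on the cluster label.
p_value <- anova(lm(resid(fit) ~ factor(group)))[["Pr(>F)"]][1]
print(p_value)
```

A very small p-value here means the residuals are not exchangeable across clusters, which is the situation the mixed effect model is meant to handle.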


Page 6: Introduction of mixed effect model

Mixed effect model

Mixed effect model = Fixed effect + Random effect

· Fixed effects
  - expected to have a systematic and predictable influence on your data.
  - exhaust “the levels of a factor”. Think of sex (male/female).
· Random effects
  - expected to have a non-systematic, unpredictable, or “random” influence on your data.
  - have factor levels that are drawn from a large population, but we do not know exactly how or why they differ.


Page 7: Introduction of mixed effect model

Example of Fixed effects and Random effects

FIXED EFFECTS               | RANDOM EFFECTS
Male or female              | Individuals with repeated measures
Insecticide sprayed or not  | Block within a field
Upland or lowland           | Brood
One country versus another  | Split plot within a plot
Wet versus dry              | Family


Page 8: Introduction of mixed effect model

Fixed effect model


Page 9: Introduction of mixed effect model

Fixed effect model

The fixed effect model is just the linear model that you may already know.

Yi = b0 + b1*Xi + ei

1 <= i <= n, where n is the sample size

· Yi: response variable
· b0: fixed intercept
· b1: fixed slope
· Xi: explanatory variable (fixed effect)
· ei: noise (error)


Page 10: Introduction of mixed effect model

Data generation of fixed effect model

set.seed(1)
# generate x
x <- seq(1, 5, length.out = 100)
# generate error
noise <- rnorm(n = 100, mean = 0, sd = 1)
b0 <- 1
b1 <- 2
# generate y
y <- b0 + b1 * x + noise


Page 11: Introduction of mixed effect model

Data generation of fixed effect model

plot(y ~ x)


Page 12: Introduction of mixed effect model

Coefficient estimation of fixed effect model

model <- lm(y ~ x)
summary(model)

Call:
lm(formula = y ~ x)

Residuals:
    Min      1Q  Median      3Q     Max
-2.3401 -0.6058  0.0155  0.5851  2.2975

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.1424     0.2491    4.59  1.3e-05 ***
x             1.9888     0.0774   25.70  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.903 on 98 degrees of freedom
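Because the data were simulated, the estimates can be checked against the values that generated them (b0 = 1, b1 = 2). The sketch below repeats the simulation and adds a confidence-interval check with confint(). The `covers` helper vector is an illustrative addition, not part of the original slides.

```r
# Re-create the simulated data from the slides.
set.seed(1)
x <- seq(1, 5, length.out = 100)
noise <- rnorm(n = 100, mean = 0, sd = 1)
b0 <- 1
b1 <- 2
y <- b0 + b1 * x + noise
model <- lm(y ~ x)

# 95% confidence intervals for the intercept and the slope.
ci <- confint(model, level = 0.95)
print(ci)

# TRUE when the interval covers the value used in the simulation.
covers <- c(
  intercept = ci["(Intercept)", 1] <= b0 && b0 <= ci["(Intercept)", 2],
  slope     = ci["x", 1] <= b1 && b1 <= ci["x", 2]
)
print(covers)
```

This kind of check is only possible when learning by simulation, since with real data the true parameters are unknown.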


Page 13: Introduction of mixed effect model

Plot of fixed effect model

plot(y ~ x)
abline(model)


Page 14: Introduction of mixed effect model

Mixed effect model


Page 15: Introduction of mixed effect model

Random Intercept model

Suppose we measure several people (indexed by i) and repeat the measurement several times (indexed by j) on each person. The people differ from one another in ways we do not observe, so each person contributes a random effect; on top of that, every single measurement carries its own random noise.

Yij = b0 + b1*Xij + bi + eij

· b0: fixed intercept
· b1: fixed slope
· Xij: fixed effect
· bi: random effect (influences the intercept)
· eij: noise
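The model above implies a specific dependence between measurements on the same person. As a short worked derivation, assume (this is not stated on the slide) that bi and eij are independent normals with standard deviations written here as \sigma_b and \sigma, and that Xij is treated as fixed:

```latex
\begin{aligned}
Y_{ij} &= b_0 + b_1 X_{ij} + b_i + e_{ij},
  \qquad b_i \sim N(0, \sigma_b^2), \quad e_{ij} \sim N(0, \sigma^2) \\
\operatorname{Var}(Y_{ij}) &= \sigma_b^2 + \sigma^2 \\
\operatorname{Cov}(Y_{ij}, Y_{ik}) &= \sigma_b^2 \qquad (j \neq k,\ \text{same person } i) \\
\operatorname{Cov}(Y_{ij}, Y_{i'k}) &= 0 \qquad (i \neq i',\ \text{different people}) \\
\operatorname{Corr}(Y_{ij}, Y_{ik}) &= \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2}
\end{aligned}
```

With the values used in the simulation on the following slides (sd = 10 for bi and sd = 2 for the noise), this within-person correlation would be 100 / 104, roughly 0.96, which is why ignoring the grouping is costly here.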


Page 16: Introduction of mixed effect model

Data generation of Random Intercept model

b0 <- 9.9
b1 <- 2
# number of repeated measurements for each of the 6 people
n <- c(13, 14, 14, 15, 12, 13)
npeople <- length(n)
set.seed(1)
# generate x (fixed effect)
x <- matrix(rep(0, length.out = max(n) * npeople), ncol = npeople)
for (i in 1:npeople) {
  x[1:n[i], i] <- runif(n[i], min = 1, max = 5)
  x[1:n[i], i] <- sort(x[1:n[i], i])
}
# random effect
bi <- rnorm(npeople, mean = 0, sd = 10)


Page 17: Introduction of mixed effect model

Data generation of Random Intercept model

xall <- NULL
yall <- NULL
peopleall <- NULL
for (i in 1:npeople) {
  xall <- c(xall, x[1:n[i], i])  # combine x
  # generate y
  y <- rep(b0 + bi[i], length.out = n[i]) +
    b1 * x[1:n[i], i] +
    rnorm(n[i], mean = 0, sd = 2)  # noise
  yall <- c(yall, y)  # combine y
  people <- rep(i, length.out = n[i])
  peopleall <- c(peopleall, people)
}
# final dataset
data1 <- data.frame(yall, peopleall, xall)


Page 18: Introduction of mixed effect model

Coefficient estimation of Random Intercept model

library(nlme)
# xall is the fixed effect
# bi influences the intercept of the model
lme1 <- lme(yall ~ xall, random = ~ 1 | peopleall, data = data1)
summary(lme1)

Linear mixed-effects model fit by REML
 Data: data1
  AIC BIC logLik
  358 368   -175

Random effects:
 Formula: ~1 | peopleall
        (Intercept) Residual
StdDev:         7.3     1.77

Fixed effects: yall ~ xall
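The pieces of the fitted model can also be pulled out one at a time instead of reading them off the summary. The following is a sketch that assumes the nlme package is installed. It rebuilds the simulated data compactly, drawing random numbers in the same order as the loops on the slides. The names `x_list`, `y_list`, `fixed_est`, `var_comp` and `person_shift` are illustrative only.

```r
library(nlme)

# Compact re-creation of the random intercept simulation.
set.seed(1)
b0 <- 9.9
b1 <- 2
n <- c(13, 14, 14, 15, 12, 13)
npeople <- length(n)
x_list <- lapply(n, function(k) sort(runif(k, min = 1, max = 5)))
bi <- rnorm(npeople, mean = 0, sd = 10)
y_list <- lapply(seq_len(npeople), function(i) {
  b0 + bi[i] + b1 * x_list[[i]] + rnorm(n[i], mean = 0, sd = 2)
})
data1 <- data.frame(
  yall = unlist(y_list),
  peopleall = rep(seq_len(npeople), times = n),
  xall = unlist(x_list)
)

lme1 <- lme(yall ~ xall, random = ~ 1 | peopleall, data = data1)

fixed_est <- fixef(lme1)     # estimates of b0 and b1
var_comp <- VarCorr(lme1)    # variance of bi and of the residual noise
person_shift <- ranef(lme1)  # predicted bi, one row per person

print(fixed_est)
print(var_comp)
print(person_shift)
```

Note that with only 6 people the standard deviation of bi is estimated from 6 draws, so it can sit some distance from the simulated value of 10, while the residual standard deviation is estimated from all observations.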


Page 19: Introduction of mixed effect model

Plot of Random Intercept model


Page 20: Introduction of mixed effect model

Random Intercept and slope model

Yij = b0 + (b1+si)*Xij + bi + eij

· b0: fixed intercept
· b1: fixed slope
· Xij: fixed effect
· bi: random effect (influences the intercept)
· eij: noise
· si: random effect (influences the slope)
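Adding a random slope makes the variance of the response depend on X. As a worked derivation, assume (these symbols are not on the slide) that (bi, si) are jointly normal with mean zero, standard deviations \sigma_b and \sigma_s and correlation \rho, and that eij is independent noise with standard deviation \sigma:

```latex
\begin{aligned}
Y_{ij} &= b_0 + (b_1 + s_i) X_{ij} + b_i + e_{ij} \\
\operatorname{Var}(Y_{ij} \mid X_{ij} = x)
  &= \sigma_b^2 + 2 x \rho \sigma_b \sigma_s + x^2 \sigma_s^2 + \sigma^2 \\
\operatorname{Cov}(Y_{ij}, Y_{ik} \mid X_{ij} = x, X_{ik} = x')
  &= \sigma_b^2 + (x + x') \rho \sigma_b \sigma_s + x x' \sigma_s^2
  \qquad (j \neq k)
\end{aligned}
```

The quantities \sigma_b, \sigma_s, \rho and \sigma correspond to the StdDev and Corr entries that lme reports under "Random effects" for this kind of model.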


Page 21: Introduction of mixed effect model

Data generation of Random Intercept and slope model

a0 <- 9.9
a1 <- 2
n <- c(12, 13, 14, 15, 16, 13)
npeople <- length(n)
set.seed(1)
si <- rnorm(npeople, mean = 0, sd = 0.5)  # random slope
x <- matrix(rep(0, length.out = max(n) * npeople), ncol = npeople)
for (i in 1:npeople) {
  x[1:n[i], i] <- runif(n[i], min = 1, max = 5)
  x[1:n[i], i] <- sort(x[1:n[i], i])
}


Page 22: Introduction of mixed effect model

Data generation of Random Intercept and slope model

bi <- rnorm(npeople, mean = 0, sd = 10)  # random intercept
xall <- NULL
yall <- NULL
peopleall <- NULL
for (i in 1:npeople) {
  xall <- c(xall, x[1:n[i], i])
  y <- rep(a0 + bi[i], length.out = n[i]) +
    (a1 + si[i]) * x[1:n[i], i] +
    rnorm(n[i], mean = 0, sd = 0.5)
  yall <- c(yall, y)
  people <- rep(i, length.out = n[i])
  peopleall <- c(peopleall, people)
}
# generate final dataset
data2 <- data.frame(yall, peopleall, xall)


Page 23: Introduction of mixed effect model

Coefficient estimation of Random Intercept and slope model

# bi influences the intercept and si influences the slope of the model
lme2 <- lme(yall ~ xall, random = ~ 1 + xall | peopleall, data = data2)
print(summary(lme2))

Linear mixed-effects model fit by REML
 Data: data2
  AIC BIC logLik
  179 194  -83.6

Random effects:
 Formula: ~1 + xall | peopleall
 Structure: General positive-definite, Log-Cholesky parametrization
            StdDev Corr
(Intercept) 11.593 (Intr)
xall         0.464 0.044
Residual     0.445
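Since the data are simulated, each person's true intercept (a0 + bi) and true slope (a1 + si) are known and can be set next to what the model recovers. A sketch, assuming nlme is installed and using a compact re-creation of the simulation with the same order of random draws. The names `x_list`, `y_list`, `per_person` and `comparison` are illustrative.

```r
library(nlme)

# Compact re-creation of the random intercept and slope simulation.
set.seed(1)
a0 <- 9.9
a1 <- 2
n <- c(12, 13, 14, 15, 16, 13)
npeople <- length(n)
si <- rnorm(npeople, mean = 0, sd = 0.5)   # random slopes
x_list <- lapply(n, function(k) sort(runif(k, min = 1, max = 5)))
bi <- rnorm(npeople, mean = 0, sd = 10)    # random intercepts
y_list <- lapply(seq_len(npeople), function(i) {
  a0 + bi[i] + (a1 + si[i]) * x_list[[i]] + rnorm(n[i], mean = 0, sd = 0.5)
})
data2 <- data.frame(
  yall = unlist(y_list),
  peopleall = rep(seq_len(npeople), times = n),
  xall = unlist(x_list)
)

lme2 <- lme(yall ~ xall, random = ~ 1 + xall | peopleall, data = data2)

# coef() adds each person's predicted random effects to the fixed effects.
per_person <- coef(lme2)
comparison <- data.frame(
  true_intercept = a0 + bi,
  est_intercept = per_person[["(Intercept)"]],
  true_slope = a1 + si,
  est_slope = per_person[["xall"]]
)
print(round(comparison, 2))
```

Each row is one person; close agreement between the true and estimated columns suggests the model separates the person-level effects from the measurement noise.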


Page 24: Introduction of mixed effect model

Plot of Random Intercept and slope model


Page 25: Introduction of mixed effect model

What if we just use linear model

· complete pooling

# wrong estimation: the grouping by person is ignored
lm1 <- lm(yall ~ xall, data = data2)
summary(lm1)

Call:
lm(formula = yall ~ xall, data = data2)

Residuals:
   Min     1Q Median     3Q    Max
-17.80  -6.27  -3.67   2.19  24.33

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)     6.86       3.72    1.84  0.06874 .
xall            4.31       1.15    3.76  0.00032 ***
---


Page 26: Introduction of mixed effect model

What if we just use linear model

· no pooling

# wrong estimation: it wastes too many degrees of freedom, and we don't
# care about the exact differences between people. we jus...
lm2 <- lm(yall ~ xall + factor(peopleall) + xall * factor(peopleall), data = data1)
summary(lm2)

Call:
lm(formula = yall ~ xall + factor(peopleall) + xall * factor(peopleall),
    data = data1)

Residuals:
   Min     1Q Median     3Q    Max
-2.983 -1.194  0.054  1.092  4.238

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   18.818      1.342   14.02  < 2e-16 ***
xall           0.929      0.413    2.25    0.028 *
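To put complete pooling, no pooling and the mixed model side by side, the sketch below fits all three to the same simulated dataset. It assumes nlme is installed and, unlike the slides (which use data1 for the no-pooling fit), uses the random intercept and slope data for every model so the rows are comparable. The object names and the `summary_table` layout are illustrative.

```r
library(nlme)

# Compact re-creation of the random intercept and slope simulation.
set.seed(1)
a0 <- 9.9
a1 <- 2
n <- c(12, 13, 14, 15, 16, 13)
npeople <- length(n)
si <- rnorm(npeople, mean = 0, sd = 0.5)
x_list <- lapply(n, function(k) sort(runif(k, min = 1, max = 5)))
bi <- rnorm(npeople, mean = 0, sd = 10)
y_list <- lapply(seq_len(npeople), function(i) {
  a0 + bi[i] + (a1 + si[i]) * x_list[[i]] + rnorm(n[i], mean = 0, sd = 0.5)
})
data2 <- data.frame(
  yall = unlist(y_list),
  peopleall = rep(seq_len(npeople), times = n),
  xall = unlist(x_list)
)

# Complete pooling: one line for everybody.
pooled <- lm(yall ~ xall, data = data2)
# No pooling: a separate intercept and slope for every person.
unpooled <- lm(yall ~ xall * factor(peopleall), data = data2)
# Mixed model: one average line plus person-level random deviations.
mixed <- lme(yall ~ xall, random = ~ 1 + xall | peopleall, data = data2)

summary_table <- data.frame(
  model = c("complete pooling", "no pooling", "mixed"),
  n_coef = c(length(coef(pooled)), length(coef(unpooled)), length(fixef(mixed))),
  residual_sd = c(summary(pooled)$sigma, summary(unpooled)$sigma, mixed$sigma)
)
print(summary_table)
```

The no-pooling fit spends one intercept and one slope per person, while the mixed model keeps two fixed coefficients and describes the people through a few variance parameters, which is the trade-off the two slides above point at.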


Page 27: Introduction of mixed effect model

General Mixed effect model


Page 28: Introduction of mixed effect model

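Logistic Mixed effect model

P(Yij = 1) = exp(eta_ij)/(1+exp(eta_ij))

eta_ij = b0 + b1*Xij + bi

· b0: fixed intercept
· b1: fixed slope
· Xij: fixed effect
· bi: random effect (influences the intercept)
· Yij: binary response, drawn as a Bernoulli variable with the probability above; this random draw takes the place of the additive noise eij of the linear models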
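The link between the linear predictor eta and a probability can be tried out directly in base R. In this sketch b0 = -6 and b1 = 2.1 are the values used in the simulation on the following slides, while the three bi values and the x grid are hypothetical. It also checks the hand-written inverse logit against the built-in plogis().

```r
b0 <- -6
b1 <- 2.1

# Hypothetical people (random intercepts) evaluated at three x values.
grid <- expand.grid(x = c(1, 3, 5), bi = c(-1.5, 0, 1.5))

# Linear predictor on the logit scale.
grid$eta <- b0 + b1 * grid$x + grid$bi

# Inverse logit written out by hand, and with the built-in function.
grid$prob_manual <- exp(grid$eta) / (1 + exp(grid$eta))
grid$prob_builtin <- plogis(grid$eta)

print(round(grid, 3))
```

A shift in bi moves a person's whole probability curve left or right on the x axis, which is how a random intercept shows up in a logistic model.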


Page 29: Introduction of mixed effect model

Data generation of Logistic Mixed effect model

b0 <- -6
b1 <- 2.1
set.seed(1)
n <- c(12, 13, 14, 15, 16, 13)
npeople <- length(n)
x <- matrix(rep(0, length.out = max(n) * npeople), ncol = npeople)
bi <- rnorm(npeople, mean = 0, sd = 1.5)
for (i in 1:npeople) {
  x[1:n[i], i] <- runif(n[i], min = 1, max = 5)
  x[1:n[i], i] <- sort(x[1:n[i], i])
}


Page 30: Introduction of mixed effect model

Data generation of Logistic Mixed effect model

xall <- NULL
yall <- NULL
peopleall <- NULL
for (i in 1:npeople) {
  xall <- c(xall, x[1:n[i], i])
  y <- NULL
  for (j in 1:n[i]) {
    eta1 <- b0 + b1 * x[j, i] + bi[i]
    y <- c(y, rbinom(n = 1, size = 1, prob = exp(eta1) / (exp(eta1) + 1)))
  }
  yall <- c(yall, y)
  people <- rep(i, length.out = n[i])
  peopleall <- c(peopleall, people)
}
data3 <- data.frame(xall, peopleall, yall)


Page 31: Introduction of mixed effect model

Coefficient estimation of Logistic Mixed effect model

library(lme4)
# the formula is different from lme: the random effect is written inside it as (1 | peopleall)
lmer3 <- glmer(yall ~ xall + (1 | peopleall), data = data3, family = binomial)
print(summary(lmer3))

Generalized linear mixed model fit by maximum likelihood ['glmerMod']
 Family: binomial ( logit )
Formula: yall ~ xall + (1 | peopleall)
   Data: data3

     AIC      BIC   logLik deviance
    69.8     77.1    -31.9     63.8

Random effects:
 Groups    Name        Variance Std.Dev.
 peopleall (Intercept) 3.94     1.98
Number of obs: 83, groups: peopleall, 6


Page 32: Introduction of mixed effect model

Plot of Logistic Mixed effect model


Page 33: Introduction of mixed effect model

Case study


Page 34: Introduction of mixed effect model
