TRANSCRIPT
Advanced Regression Methods Symposium on Updates on Clinical
Research Methodology
March 18, 2013
Lloyd Mancl, PhD, Oral Health Sciences, University of Washington ([email protected])
Charles Spiekerman, PhD, Oral Health Sciences, University of Washington ([email protected])
Outline
• Introduce common regression methods for different types of outcomes
  – Logistic regression
  – Multiple linear regression
  – Cox proportional hazards regression
  – Poisson or log-linear regression
• Uses of regression:
  – Adjust for confounding
  – Assess for effect modification or interaction
  – Account for non-independent outcomes
Uses for multiple regression analysis
• Used to adjust for confounding
  – In observational studies, groups of interest can differ on other variables that may be related to the outcome.
  – In an RCT, randomization may not result in balanced groups.
• Used to assess simultaneously the associations of several explanatory variables, as well as interactions between variables
  – In observational studies, you may be interested in how several explanatory variables are related to an outcome.
  – In an RCT, we are often interested in testing whether the treatment effect is modified by another variable (i.e., interaction or moderation).
  – In designed experiments, we typically test for interactions between the different study factors.
• Used to develop a prediction equation
  – Not commonly used for this purpose; not covered in this workshop.
Multiple regression analysis
Models the association between one outcome variable, Y, and multiple variables of interest, X1, X2,…,Xk.
Multiple Linear regression model
Y = α + β1X1 + β2X2 + … + βkXk + random error
Generalized Linear Model
G(Y) ~ α + β1X1 + β2X2 + … + βkXk
(basically, a transformed version of the outcome, Y, is related to a linear combination of the variables of interest)
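For example, with a binary outcome, G is the logit (log-odds) link, and inverting it maps the linear predictor back to a probability. A small numeric sketch of the link and its inverse (the helper functions are written here for illustration):

```python
import math

# Logistic regression is a generalized linear model with the logit link:
#   G(p) = log(p / (1 - p)) = a + b1*X1 + ... + bk*Xk
# Inverting the link maps a linear predictor value back to a probability.

def logit(p):
    """Log-odds (logit) link function."""
    return math.log(p / (1 - p))

def inverse_logit(z):
    """Map a linear predictor value z back to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-z))

p = 0.8
z = logit(p)             # linear-predictor scale
print(round(inverse_logit(z), 6))  # 0.8: the two functions invert each other
```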
Common Multiple Regression Methods

Outcome variable           Regression method                     Regression results
Quantitative/continuous    Linear regression                     Slopes & differences between means
Binary (2 categories)      Logistic regression                   Odds ratio
Count or count rate        Poisson or log-linear regression      Relative risk or rate ratio
Time to an event           Cox proportional hazards regression   Hazard ratio
Ordinal (>2 categories)    Ordinal logistic regression           Odds ratio
Nominal (>2 categories)    Multinomial logistic regression       Odds ratio
Regression method depends on the outcome
• Continuous or quantitative outcome – linear regression
  – Amount of attachment loss (mm)
  – Change in dmfs
• Binary outcome – logistic regression
  – Any new decay
  – Incident TMD
• Time-to-event outcome – Cox proportional hazards regression
  – Time to tooth loss
  – Time to pulp cap failure
• Count outcome – Poisson or log-linear regression
  – Number of new caries
  – Rate of new caries
Example: heart disease and periodontitis
• NHANES II – an observational study that examined a large number of participants at a baseline visit and followed them for over 10 years to ascertain morbid events.
• We are interested in assessing the association between periodontal disease evaluated at baseline and the occurrence of heart disease (CHD) within 10 years of study entry.
Logistic regression
CHD risk by exposure group

                Healthy gums   Periodontal disease   Odds Ratio   95% Conf. Int.
CHD incidence   4.9%           13.5%                 3.0          (2.5, 3.7)
In this analysis the outcome variable, CHD incidence, is binary, so one regression method we could employ is logistic regression.
Logistic regression uses the odds ratio as the estimate of association between the independent variable and the outcome.
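The odds ratio in the table can be reproduced by hand from the two incidence proportions. A minimal Python sketch (the `odds` helper is illustrative):

```python
# Odds ratio from the two CHD incidence proportions on the slide.
# odds = p / (1 - p); OR = odds(exposed) / odds(unexposed).

def odds(p):
    """Convert a probability to an odds."""
    return p / (1 - p)

p_healthy = 0.049  # CHD incidence, healthy gums
p_perio = 0.135    # CHD incidence, periodontal disease

odds_ratio = odds(p_perio) / odds(p_healthy)
print(round(odds_ratio, 1))  # 3.0, matching the table
```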
Confounding
• Confounding occurs when there is a third variable that is strongly related to both the dependent and the independent variable.
• This can bias an estimate of association.
[Diagram: a third variable X2 is related to both X1 and the outcome Y, distorting the apparent X1–Y association]
• With respect to periodontitis and CHD, an obvious potential confounder is age.
• We can adjust for the potential confounding effects by entering age into the logistic regression model as an additional independent variable.
CHD incidence by periodontal disease

Logistic regression with periodontal disease as the only independent variable

Independent Variable      Odds Ratio   95% Conf. Int.
Healthy Gums              1            -
Periodontal Disease       3.0          (2.5, 3.7)

Logistic regression model with Age added

Independent Variable      Odds Ratio   95% Conf. Int.
Healthy Gums              1            -
Periodontal Disease       1.6          (1.3, 2.0)
Age (10 year increment)   2.1          (1.9, 2.2)
CHD incidence by periodontal disease
• By controlling for age, the estimated association of periodontal disease with CHD is weaker.
• The association is still statistically significant (the confidence interval does not contain 1).
• The age coefficient indicates 2.1 times higher odds of CHD associated with a 10-year increase in age.
Effect Modification / Interaction
• An interaction occurs when the association between an explanatory variable and the outcome variable depends on the value of another explanatory variable.
• Also called effect modification or moderation.
• In extreme cases, an interaction may completely reverse the relationship between the explanatory variable and the outcome.
• More commonly, the effect is stronger (or weaker) depending on the value of another explanatory variable.
• Stratification can be used to identify an interaction.
• Regression can be used to test for an interaction by adding an interaction term/variable to the model, computed as the product of two explanatory variables.
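In practice the interaction term is literally a new column equal to the product of two existing columns. A minimal sketch (the variable names and values are made up):

```python
# Build an interaction term as the product of two explanatory variables.
# Here group_c is a 0/1 group indicator and baseline is a quantitative covariate.

group_c = [0, 0, 1, 1]           # indicator: 1 if the subject is in Group C
baseline = [2.0, 5.0, 3.0, 6.0]  # baseline DMFS for the same subjects

# The interaction column entered into the regression model:
interaction = [g * b for g, b in zip(group_c, baseline)]
print(interaction)  # [0.0, 0.0, 3.0, 6.0]
```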
Chewing Gum Study

                DMFS Change    Baseline DMFS
Group   n       Mean (SD)      Mean (SD)
A       25      -0.72 (5.37)   4.68 (1.02)
B       35      -0.83 (3.57)   3.77 (0.55)
C       40      2.63 (3.80)    3.67 (0.57)

• Subjects randomly assigned to 3 different chewing gums
• Outcome was continuous: change in DMFS
• Linear regression used to compare the 3 groups, adjusting for baseline DMFS
Linear regression results

Coefficient      Estimate   Standard Error   P-value
Intercept        1.30       0.90             .15
Group B (vs A)   -0.50      1.01             .62
Group C (vs A)   2.91       0.98             .004
Baseline DMFS    -0.43      0.10             <.001

Group main effect, p-value <.001
• Model estimates a constant group difference
• DMFS change 2.91 greater for Group C than for Group A
• Model estimates a constant group difference
• DMFS change 2.91 greater for Group C than for Group A
• DMFS change 0.50 less for Group B than for Group A
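The fitted main-effects model defines a prediction equation; plugging in the estimates shows why the estimated group difference is constant. A small sketch (the helper function and example baseline values are illustrative):

```python
# Predicted change in DMFS from the main-effects model:
#   change = 1.30 - 0.50*GroupB + 2.91*GroupC - 0.43*baseline

def predicted_change(group, baseline_dmfs):
    """Predicted DMFS change for group 'A', 'B', or 'C'."""
    est = 1.30 - 0.43 * baseline_dmfs
    if group == "B":
        est += -0.50
    elif group == "C":
        est += 2.91
    return est

# The C-vs-A difference is the same at any baseline DMFS value:
diff_at_4 = predicted_change("C", 4.0) - predicted_change("A", 4.0)
diff_at_8 = predicted_change("C", 8.0) - predicted_change("A", 8.0)
print(round(diff_at_4, 2), round(diff_at_8, 2))  # 2.91 2.91
```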
[Scatterplot: change in DMFS versus baseline DMFS for the treatment groups (A, B, C)]
• Group x baseline DMFS interaction added to the linear regression model to test if group differences are affected by baseline DMFS
Coefficient               Estimate   Standard Error   P-value
Intercept                 2.96       0.97             .003
Group B (vs A)            -1.53      1.33             .25
Group C (vs A)            -0.79      1.26             .53
Baseline DMFS             -0.79      0.14             <.001
Group B x Baseline DMFS   0.19       0.23             .42
Group C x Baseline DMFS   0.91       0.21             <.001

• Group x Baseline DMFS interaction, p-value <.001
• Difference between Group C and A increases with baseline DMFS
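With the interaction terms in the model, the estimated C-vs-A difference depends on baseline DMFS. A small sketch using the coefficient estimates above (the example baseline values are illustrative):

```python
# With a Group C x baseline DMFS interaction, the C-vs-A difference in
# predicted DMFS change varies with baseline DMFS:
#   difference = -0.79 + 0.91 * baseline_dmfs

def c_vs_a_difference(baseline_dmfs):
    """Estimated Group C minus Group A difference at a given baseline DMFS."""
    return -0.79 + 0.91 * baseline_dmfs

print(round(c_vs_a_difference(2.0), 2))   # 1.03: small difference at low baseline
print(round(c_vs_a_difference(10.0), 2))  # 8.31: much larger at high baseline
```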
[Scatterplot: change in DMFS versus baseline DMFS for the treatment groups (A, B, C)]
• Difference between Group C and A increases with baseline DMFS
Effect Modification / Interaction
• Test for interactions after all main effects are included in the regression model.
• Typically, only assess two-way interactions.
• Usually test an interaction only when at least one of the variables has a significant main effect. (Exceptions: designed experiments involving a small number of factors, where all possible interactions may be assessed.)
• A lower significance level (e.g., p < 0.01) may be used when testing a large number of interactions, to control type I error due to multiple comparisons.
Survival Analysis: Time to event data
• In some studies the outcome of interest is the time until an event
– Time to implant failure
– Time to tooth loss
– Time to death
• Analyses of this type of data are commonly called “Survival Analysis”
Censored events
• In most time to event studies a non-trivial portion of the events will not be observed because they don’t occur during the period of observation.
• These unobserved events are considered “censored”
• For the censored events we don’t know the actual time until the event, but we do know that the time until event is at least as great as the time until the patient was last seen.
• Survival analysis uses the information on the complete observations and the censored observations in a smart way.
Censored events
• The 2nd, 4th and 5th patients have censored times to event.
• For the 1st, 3rd and 6th patients, we know the exact times to event.

[Diagram: follow-up timelines for six patients plotted against time, showing each enrollment date and event date between the start of enrollment and the end of study]
Periodontal disease and tooth loss
• One hundred periodontal patients under maintenance care*.
• Interest in assessing factors associated with tooth loss.
• Outcome is time to loss of tooth.
• We will look at oral hygiene, patient age, and smoking.
*M. McGuire & M. Nunn, J Periodontology, 1996; 67:666-674.
Kaplan-Meier survival plots

[Figure: Kaplan-Meier survival curves (survival probability 0.92-1.00 over 15 years) in three panels: Hygiene (good or fair vs poor), Age (less than 50 vs 50 or older), and Smoking (not smoker vs smoker)]
Kaplan-Meier plots present estimates of the survival function, S(t).
S(t) = Probability of surviving to time t
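The estimator behind these plots is simple: at each distinct event time, multiply the running survival probability by the fraction of at-risk subjects who survive that time; censored subjects leave the risk set without contributing an event. A minimal pure-Python sketch (the toy follow-up times and censoring indicators are made up):

```python
# Kaplan-Meier estimate of S(t): at each event time t_i, multiply the
# running survival probability by (n_i - d_i) / n_i, where n_i is the
# number still at risk and d_i the number of events at t_i.

def kaplan_meier(times, events):
    """times: follow-up times; events: 1 if event observed, 0 if censored.
    Returns a list of (event_time, survival_probability) steps."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather all subjects with this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= (at_risk - deaths) / at_risk
            steps.append((t, round(surv, 3)))
        at_risk -= removed  # events and censorings both leave the risk set
    return steps

# Toy data: times to tooth loss in years; event=0 marks a censored observation.
times = [2, 3, 3, 5, 8, 10]
events = [1, 1, 0, 1, 0, 1]
print(kaplan_meier(times, events))  # [(2, 0.833), (3, 0.667), (5, 0.444), (10, 0.0)]
```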
Cox proportional hazards regression
• If some simplifying assumptions hold, then one can compare survival probabilities using a regression framework.
• Cox proportional hazards regression assumes that the hazards in different groups are proportional.
• The hazard is the instantaneous probability of failure and is related to S(t).
• The hazard at time t can be thought of as the probability of failure right after t, given that one has survived up until time t.
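In discrete terms, the hazard at an event time can be estimated as the number failing at that time divided by the number still at risk. A toy illustration (the counts are made up):

```python
# Discrete hazard estimate at an event time: h(t_i) = d_i / n_i,
# the proportion of subjects still at risk that fail at t_i.

n_at_risk = 50  # subjects still tooth-loss-free just before year 5 (made-up)
failures = 2    # tooth losses occurring at year 5 (made-up)

hazard = failures / n_at_risk
print(hazard)  # 0.04
```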
Cox regression results
• The output from a Cox regression is the hazard ratio.
• The hazard ratio can be interpreted as an instantaneous relative risk.
• A value of 1 means no association.
• Since the “smoker” hazard ratio is significantly different from 1, we can say the teeth from the smokers are at significantly higher risk.

Variable       Hazard ratio   95% confidence interval*
Poor Hygiene   1.9            (0.7, 4.9)
Age > 50       1.3            (0.7, 2.4)
Smoker         2.1            (1.1, 3.9)

*Corrected for correlation between teeth within the same patient
Count outcome – “Poisson” regression

Treatment   n     Surface-years at risk   Surfaces of new decay
group             Mean ± SD               per person, Mean ± SD   P-value¹
Standard    264   168 ± 60                7.4 ± 7.7               <.001
Intensive   251   161 ± 57                9.8 ± 8.6

• Children randomly assigned to intensive application of fluoride varnish (3 times in 2 weeks) versus standard application applied semiannually*
• Outcome was number of new primary caries per time at risk
*P Weinstein, C Spiekerman & P Milgrom. Caries Research 2009
• ¹“Poisson” regression (with robust standard errors) indicates a higher caries rate for the intensive treatment.
• There are some minor (?) differences between the two treatment groups at baseline, which we may want to adjust for when comparing the two treatment groups.
                          Treatment Group
                          Standard (n=306)   Intensive (n=294)
Age at baseline, months
  Mean ± SD               54.7 ± 4.5         55.8 ± 4.7
Baseline dmfs, n (%)
  0 dmfs                  159 (52%)          96 (33%)
  1-7 dmfs                81 (26%)           100 (34%)
  >7 dmfs                 66 (22%)           98 (33%)
Poisson Regression
• If counts have a Poisson distribution, the most common model is the log-linear regression model:

  log(outcome) = log(time) + explanatory variables

  log(time) is often called an offset.
• If time is included in the model (as an offset), the exponentiated regression coefficients have a relative risk or rate ratio interpretation.
• Because the Poisson distribution is somewhat restrictive (e.g., it requires the mean and variance to be equal), the Poisson model may not be ideal for every count outcome. (E.g., the Standard group's mean count of new decayed surfaces was 7.4, but the variance was 7.7² ≈ 59.)
• Other distributions and regression methods are available. For example, negative binomial regression can be used as an alternative to Poisson regression to account for variance > mean (i.e., over-dispersion).
• Another option is to fit a log-linear model but use robust standard error estimates to compute p-values & confidence intervals (i.e., “Poisson” regression). The interpretation of the regression coefficients is the same as in Poisson regression.
• Other options are zero-inflated Poisson and zero-inflated negative binomial regression:
  – account for extra “zeros” in the data.
  – appropriate if the outcomes are generated by two processes (one producing only zero counts [e.g., will never get caries] and one that can produce nonzero counts [e.g., may get caries]).
  – regression coefficients have a different interpretation.
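As a back-of-envelope illustration of the rate ratio interpretation, the crude (unadjusted) rate ratio for the varnish trial can be computed from the per-person means in the earlier table; it differs from the adjusted ratio reported next because it ignores baseline caries and the other covariates:

```python
# Crude caries rate per surface-year = mean new decayed surfaces divided by
# mean surface-years at risk (per-person means from the trial table).

rate_standard = 7.4 / 168   # Standard group
rate_intensive = 9.8 / 161  # Intensive group

rate_ratio = rate_intensive / rate_standard
print(round(rate_ratio, 2))  # 1.38: higher crude caries rate with intensive varnish
```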
“Poisson” regression results (with robust standard errors*)

Covariate            Adjusted Rate Ratio   95% CI       P-value
Treatment group                                         .20
  Standard           1                     ---
  Intensive          1.13                  0.94, 1.37
Caries at baseline                                      <.0001
  0 dmfs             1                     ---
  1-7 dmfs           1.93                  1.52, 2.45
  >7 dmfs            3.24                  2.54, 4.13

(Also adjusted for enrollment cohort & site, gender and age)
*Robust standard errors used to compute 95% CI and P-value; does not require the mean = variance assumption.
• The adjusted rate ratio indicates the caries rate is only 13% higher in the intensive group (after adjusting for caries at baseline, etc.)
• The 95% CI and P-value indicate this difference is not statistically significant
Regression methods when outcomes are not independent

Non-independent outcomes can be due to:
• Multiple outcomes within a mouth
  – outcomes & explanatory variables measured on multiple teeth or surfaces within the same patient
• Multiple outcomes over time
  – outcomes & explanatory variables measured on the same patient at multiple visits over time
• Cluster sampling
  – National surveys (e.g., NHANES) involve cluster sampling
• Cluster randomization
  – Intervention randomized by school, classroom or dental practice
When the outcomes are not independent, but we assume they are:
• Regression coefficient estimates typically still valid as long as sample size is sufficiently large.
• Standard errors (SEs) for regression coefficient estimates may not be valid.
• Hence, p-values and confidence intervals may not be valid
• For explanatory variables that do not vary within a cluster, p-values tend to be too small and confidence intervals too narrow.
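A standard way to gauge how badly naive standard errors understate uncertainty is the design effect, 1 + (m - 1)ρ, for clusters of size m with within-cluster correlation ρ. A small sketch (the cluster size and correlation are made-up values):

```python
import math

# Design effect: factor by which the variance of an estimate is inflated
# when outcomes within clusters of size m share correlation rho.

def design_effect(m, rho):
    return 1 + (m - 1) * rho

# E.g., 28 teeth per mouth with a modest within-mouth correlation of 0.1:
deff = design_effect(28, 0.1)
print(round(deff, 1))             # 3.7: variance inflated nearly four-fold
print(round(math.sqrt(deff), 2))  # 1.92: standard errors nearly double
```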
Generalized Estimating Equations (GEE)
• One commonly used regression method that takes into account the dependence or correlation between observations from the same cluster or individual is generalized estimating equations (GEE).
• The GEE method can be used to fit linear, logistic, Poisson, multinomial logistic and a variety of other regression models.
• GEE-like methods are also available for time-to-event / survival outcomes.
• The GEE method doesn't require the same number of outcomes per individual or cluster.
• Regression coefficient estimates are interpreted the same way as in ordinary linear regression, logistic regression, Poisson regression, etc.
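To see the mechanics of why clustering matters, compare a naive standard error for an overall mean with a cluster-robust (sandwich-style) one that sums residuals within each patient before squaring. This is a simplified sketch with made-up data, not the full GEE machinery:

```python
import math

def naive_and_robust_se(clusters):
    """clusters: list of lists of outcomes, one inner list per patient.
    Returns (naive SE, cluster-robust SE) for the overall mean."""
    values = [y for c in clusters for y in c]
    n = len(values)
    mean = sum(values) / n
    # Naive SE treats all n outcomes as independent.
    var = sum((y - mean) ** 2 for y in values) / (n - 1)
    naive = math.sqrt(var / n)
    # Robust SE sums residuals within each cluster first, so correlated
    # outcomes from the same patient are not over-counted.
    cluster_sums = [sum(y - mean for y in c) for c in clusters]
    robust = math.sqrt(sum(s ** 2 for s in cluster_sums)) / n
    return naive, robust

# Toy data: outcomes are very similar within a patient (high correlation).
patients = [[1.0, 1.1, 0.9], [3.0, 3.2, 2.8], [5.0, 5.1, 4.9]]
naive, robust = naive_and_robust_se(patients)
print(naive < robust)  # True: the naive SE is too small here
```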
Radiographic bone loss study

                Coefficient Estimate   Standard error   Odds ratio   95% CI       P-value
Logistic regression*
Treatment       -0.381                 0.130            0.68         0.53, 0.88   0.0035
Prior disease   0.391                  0.144            1.38         1.04, 1.83   0.027

• RCT of a non-steroidal anti-inflammatory drug on radiographic bone loss
• Outcome was binary, >2.5% annual bone loss, measured on each tooth
• Explanatory variables were treatment (drug vs placebo), randomly assigned to a subject, and prior disease activity (yes vs no), measured on each tooth
*Logistic regression assumes the outcomes are all independent
Radiographic bone loss study

                Coefficient Estimate   Standard error   Odds ratio   95% CI       P-value
Logistic regression
Treatment       -0.381                 0.130            0.68         0.53, 0.88   0.0035
Prior disease   0.391                  0.144            1.38         1.04, 1.83   0.027
GEE logistic regression*
Treatment       -0.381                 0.327            0.68         0.35, 1.30   0.25
Prior disease   0.391                  0.205            1.38         0.92, 2.06   0.12

*GEE logistic regression using an independence working correlation; robust standard errors account for the correlation within subjects.
Final Thoughts
• Regression modeling is an “art”
  – All models are “wrong”, but some are useful
  – Often several plausible models can fit the data equally well
  – Should validate the model, e.g., check model assumptions and look for outliers (McCullagh & Nelder, 1989)
• Limitations of multiple regression methods
  – Multiple regression can be used to control for confounding, but the best way to avoid confounding is to do a controlled experiment.
  – Can only control for known & measured confounders, and need sufficient variation among the variables to “adjust” for confounding.
Further reading
• Katz M.H. Multivariable Analysis: A Practical Guide for Clinicians and Public Health Researchers, 3rd edition. Cambridge University Press, 2011.
• Vittinghoff E., Glidden D.V., Shiboski, S.C., McCulloch, C.E. Regression Methods in Biostatistics: Linear, Logistic, Survival, and Repeated Measures, 2nd edition, 2012, Springer.
• Collett D. Modelling Survival Data in Medical Research, 2nd edition, 2003, Chapman & Hall/CRC.
• Dobson A.J., Barnett A.G. An Introduction to Generalized Linear Models, 3rd edition, 2008, Chapman & Hall/CRC.
• Kleinbaum D.G., Klein M. Logistic Regression: A Self-Learning Text, 3rd edition, 2010, Springer.
• Kleinbaum D.G., Klein M. Survival Analysis: A Self-Learning Text, 2nd edition, 2005, Springer.
• Hardin J.W., Hilbe, J.M. Generalized Estimating Equations. Chapman & Hall/CRC, 2003.
• Diggle P.J., Heagerty P., Liang K-Y, Zeger S.L. Analysis of Longitudinal Data, 2nd edition, 2002, Oxford University Press.
• Raudenbush S.W., Bryk A.S. Hierarchical Linear Models, 2nd edition, 2002, Sage Publications.
• Verbeke G., Molenberghs G. Linear Mixed Models for Longitudinal Data, 2009, Springer.
• Molenberghs G., Verbeke G. Models for Discrete Longitudinal Data, 2005, Springer.
Addendum
Additional information on options for non-independent (correlated) outcomes
Repeated measures ANOVA and Multivariate Analysis of Variance (MANOVA)
• “Traditional” methods for correlated outcomes
• Appropriate for continuous/quantitative outcomes
• Outcomes should have (approximately) a multivariate normal distribution
• Requires same number of outcomes per individual or cluster
• Difficult to adjust for explanatory variables that vary within an individual or cluster
Random-effects or Mixed-effects models
• Another commonly used method for analyzing correlated continuous outcomes is linear mixed-effects models
• Also known as linear random-effects models, multilevel linear models and hierarchical linear models.
• The method of generalized linear mixed models (GLMM) is an extension of the linear mixed-effects models to other types of outcomes, including binary and count outcomes (e.g., mixed-effects logistic regression).
• Like GEE, these models do not require the same number of outcomes per individual or cluster.
• These methods require the correlation structure to be correctly specified, but in return allow statistical inference about the correlation or clustering.