biostatistics
![Page 1: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/1.jpg)
BIOSTATISTICS II
Capita Selecta, 2009
Part I Analysis of Variance
Part II Generalized Linear Models
Part III Multiple regression and model building
Part IV Sample size calculations
Part V Measuring agreement
Part VI Systematic review and meta-analysis
Søren Lundbye Christensen
Johannes J. Struijk
![Page 2: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/2.jpg)
Part III
Multiple regression & model building
Literature: any serious book on statistics
Martin Bland, "Introduction to medical statistics", Oxford Univ. Press, 2000, chapter 17.
![Page 3: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/3.jpg)
Multiple regression & model building
Basic model:
$Y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k + e$
$Y = Xb + e$
Best (minimum mean square error) estimator:
$\hat{Y} = X\hat{b}$
Solution for b:
$S_{xx}\hat{b} = S_{xy}$
$\hat{b} = S_{xx}^{-1} S_{xy}$
$\hat{b} = (X^T X)^{-1} X^T Y$
We immediately see a problem: if some of the independent variables are linearly related, then the inverse of the covariance matrix doesn't exist.
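The least-squares solution and the collinearity problem can be illustrated numerically. The sketch below assumes NumPy is available and uses simulated data (not the MVC data from the slides); all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: n = 41 subjects, two predictors (NOT the MVC data set).
n = 41
x1 = rng.normal(170.0, 8.0, n)
x2 = rng.normal(45.0, 10.0, n)
y = 5.0 + 2.0 * x1 - 3.0 * x2 + rng.normal(0.0, 1.0, n)

# Design matrix with a leading column of ones for the intercept b0.
X = np.column_stack([np.ones(n), x1, x2])

# Solve the normal equations X^T X b = X^T Y without forming the inverse.
b_hat = np.linalg.solve(X.T @ X, X.T @ y)

# A predictor that is a linear combination of the others makes the
# design matrix rank deficient, so (X^T X)^{-1} does not exist.
X_bad = np.column_stack([X, 2.0 * x1 + x2])
rank_ok = np.linalg.matrix_rank(X)
rank_bad = np.linalg.matrix_rank(X_bad)
```

Here `rank_bad` stays at 3 although `X_bad` has 4 columns, which is the numerical face of the non-existent inverse.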
![Page 4: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/4.jpg)
Multiple regression & model building
Maximum voluntary contraction (MVC) of the quadriceps muscle as a function of age and height in 41 alcoholics.
![Page 5: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/5.jpg)
Multiple regression & model building
![Page 6: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/6.jpg)
Multiple regression & model building
Model: MVC = b0 + b1 × Height + b2 × Age
Squared multiple correlation coefficient: R² = SSReg / SST (proportion of variability accounted for)
![Page 7: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/7.jpg)
Multiple regression & model building
![Page 8: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/8.jpg)
Multiple regression & model building
Interaction:
MVC = b0 + b1 × Height + b2 × Age + b3 × Height × Age
Note: adjusted Ra² = 1 − (1 − R²)(n − 1)/(n − p − 1)
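A small sketch of the adjusted R² formula, taking p as the number of predictors (intercept not counted). The R² values are hypothetical and only show that an extra term can raise R² while lowering the adjusted value.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for n observations and p predictors (intercept excluded)."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Hypothetical R^2 values; n = 41 as in the MVC example.
two_predictors = adjusted_r2(0.30, n=41, p=2)    # Height, Age
three_predictors = adjusted_r2(0.31, n=41, p=3)  # plus Height x Age
```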
![Page 9: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/9.jpg)
Multiple regression & model building
Polynomial regression: MVC = b0 + b1 × Height + b2 × Height²
![Page 10: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/10.jpg)
Multiple regression & model building
![Page 11: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/11.jpg)
Multiple regression & model building
Dichotomous variables
Examples
Sex: man / woman
Liver disease: yes / no
Assign 0’s and 1’s to those variables and use the standard techniques
![Page 12: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/12.jpg)
Multiple regression & model building
Variance inflation factor: VIF = 1/(1 − Ri²), where Ri² is the R² from regressing xi on the other x's
VIF > 10 is a real problem (Ri² > 90%: more than 90% of the variation in xi is explained by the other x's)
Leverage: Cook's distance (influential points)
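The variance inflation factor can be computed by regressing each predictor on the remaining ones. This is a minimal sketch assuming NumPy; the function name `vif` and the simulated predictors are illustrative only.

```python
import numpy as np

def vif(X: np.ndarray) -> np.ndarray:
    """VIF for each column of X (predictor columns only, no intercept column)."""
    n, k = X.shape
    out = np.empty(k)
    for i in range(k):
        target = X[:, i]
        # Regress x_i on an intercept plus all other predictors.
        others = np.column_stack([np.ones(n), np.delete(X, i, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2_i = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        out[i] = 1.0 / (1.0 - r2_i)
    return out

# Simulated predictors: c is nearly a copy of a, b is unrelated.
rng = np.random.default_rng(1)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.1 * rng.normal(size=200)
vifs = vif(np.column_stack([a, b, c]))
```

With this construction the first and third predictors should show VIFs far above 10, while the second stays close to 1.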
![Page 13: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/13.jpg)
Multiple regression & model building
Many variables?
Step-up (forward)
Step-down (backward)
Forward-backward
Best subset
$F_{1,\,n-q} = \dfrac{SSE(q) - SSE(q+1)}{SSE(q+1)/(n-q)}$
![Page 14: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/14.jpg)
Part IV
Sample Size Calculation
Literature:
Machin et al., (1997), ”Sample size tables for clinical studies”, Blackwell, Oxford
Altman (1982), ”How large a sample?” In: Statistics in Practice (Eds. Gore & Altman), Blackwell Publishing Ltd., London
Lehr (1992), ”Sixteen s squared over d squared: a relation for crude sample size estimates”, Stat. in Med., 11:1099-1102
Martin Bland, "Introduction to medical statistics", Oxford Univ. Press, 2000, chapter 18.
![Page 15: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/15.jpg)
Sample Size Calculation
Importance of sample size
A common error is to use a sample that is too small. The result is low power and a high risk of a Type II error: failing to reject a null hypothesis that is false.
![Page 16: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/16.jpg)
Sample Size Calculation
A little taxonomy of sample size calculations
Power: chance of rejecting the null hypothesis if it is false
Significance level: cut-off level of the p-value below which we reject the null hypothesis
Variability: e.g., standard deviation for numerical data
Smallest effect of interest: magnitude of the effect that we want to be able to detect as being statistically significant
![Page 17: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/17.jpg)
Sample Size Calculations
Sample size calculations are important for:
- Estimation: effect on confidence intervals
  - Examples: estimation of population mean, estimation of correlation coefficient
- Tests: effect on confidence level and power
  - Example: 1-sample test
Literature: Martin Bland, "Introduction to medical statistics", Oxford Univ. Press, 2000, chapter 18.
![Page 18: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/18.jpg)
Sample Size Calculations
Methods of Sample Size Calculations
- Do the math
- Special tables
- Nomograms
- Simulation
- Computer software
![Page 19: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/19.jpg)
Sample Size Calculations
Estimation of population mean μ.
Assume sample size = n.
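Estimated mean: $M = \frac{1}{n}\sum_{i=1}^{n} X_i$
Estimated variance: $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - M)^2$
Estimated standard error: $SE(M) = S/\sqrt{n}$
Confidence interval: $\left[\,M - z_{\alpha/2}\,SE(M),\; M + z_{\alpha/2}\,SE(M)\,\right]$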
![Page 20: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/20.jpg)
Sample Size Calculations
Estimation of population mean μ.
Width of the confidence interval: $\text{Width} = 2\,z_{\alpha/2}\,S/\sqrt{n}$
For a desired width, $W_d$, of the CI we can thus calculate n: $n = \left(2\,z_{\alpha/2}\,S / W_d\right)^2$
Thus, n depends on
• confidence level,
• desired width of the confidence interval,
• variance,
• distribution of the data.
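The calculation of n from a desired confidence-interval width can be sketched as follows. The standard deviation and width are hypothetical inputs, and rounding up to the next integer is a choice made here, not something stated on the slide.

```python
import math
from statistics import NormalDist

def n_for_ci_width(sd: float, width: float, confidence: float = 0.95) -> int:
    """Smallest n for which the CI of the mean is no wider than `width`."""
    # Two-sided critical value z_{alpha/2} of the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return math.ceil((2.0 * z * sd / width) ** 2)

# Hypothetical inputs: S = 10, desired total width W_d = 4.
n_needed = n_for_ci_width(sd=10.0, width=4.0)
```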
![Page 21: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/21.jpg)
Sample Size Calculations
Estimation of correlation coefficient ρ.
Assume sample size = n
Estimated correlation = r (has a very nasty distribution)
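Fisher's z transformation: $z = \frac{1}{2}\ln\frac{1+r}{1-r}$ has an (approximately) normal distribution with
Mean: $\mu_z = \frac{1}{2}\ln\frac{1+\rho}{1-\rho} + \frac{\rho}{2(n-1)}$, where $\frac{1}{2}\ln\frac{1+\rho}{1-\rho}$ is the transform of the population correlation $\rho$
SE: $SE(z) = 1/\sqrt{n-3}$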
![Page 22: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/22.jpg)
Sample Size Calculations
Estimation of correlation coefficient ρ.
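Confidence interval: $\left[\,z - z_{\alpha/2}/\sqrt{n-3},\; z + z_{\alpha/2}/\sqrt{n-3}\,\right]$
Example: expected r = 0.5; desired 95% CI = [0.4, 0.6]
z0.4 = 0.424; z0.5 = 0.549; z0.6 = 0.693
z0.6 − z0.5 = 0.144; z0.5 − z0.4 = 0.126
$1.96/\sqrt{n-3} = 0.126 \;\Rightarrow\; n \approx 246$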
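The worked example for the correlation coefficient can be checked with a few lines of code. The sketch uses `math.atanh`, which equals Fisher's z transformation, and keeps the unrounded z values; taking the narrower of the two half-widths is an assumption that matches the 0.126 used in the example.

```python
import math

z_crit = 1.96                      # two-sided 95% critical value
r_expected, r_lower, r_upper = 0.5, 0.4, 0.6

# Fisher's z transformation: atanh(r) = 0.5 * ln((1 + r) / (1 - r)).
half_width = min(
    math.atanh(r_upper) - math.atanh(r_expected),
    math.atanh(r_expected) - math.atanh(r_lower),
)

# Solve z_crit / sqrt(n - 3) = half_width for n.
n_required = (z_crit / half_width) ** 2 + 3.0
```

With unrounded z values, `n_required` comes out at roughly 246, in line with the example.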
![Page 23: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/23.jpg)
Sample Size Calculations
Paired-sample test.
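Test statistic: $z = \dfrac{\bar{d} - \mu_d}{se(\bar{d})}$, which under $H_0$: $\mu_d = 0$ reduces to $\bar{d}/se(\bar{d})$
[Figure: distribution of the test statistic under H0 (centred at 0) and under the alternative (centred at μd/se(d)); marked are the critical value zα, the point −zβ + μd/se(d), and the Type II error probability β.]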
![Page 24: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/24.jpg)
Sample Size Calculations
Altman's nomogram
Altman (1982), "How large a sample?", in Statistics in practice, eds. Gore & Altman, BMA London.
Example: difference of capillary density (per mm²) in the feet of ulcerated patients (better foot minus worse foot):
Min. diff. to be detected = 4 mm⁻²
SD(difference) = 6.1
Standardized difference = 2 × (4/6.1) = 1.31
Required power = 0.80
Significance level = 0.05
![Page 25: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/25.jpg)
Sample Size Calculations
Using the formula:
zα = 1.96 (α = 0.05)
zβ = 0.84 (Power = 80%)
Min. μd = 4.0
VAR(d) = 6.1² = 37.21
n ≈ 18
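[Figure: distribution of the test statistic under H0 (centred at 0) and under the alternative (centred at μd/se(d)); marked are the critical value zα, the point −zβ + μd/se(d), and the Type II error probability β.]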
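The formula itself did not survive in this transcript. The sketch below assumes the usual normal-approximation result that follows from setting zα = −zβ + μd/se(d) with se(d) = SD(d)/√n, namely n = (zα + zβ)² · VAR(d) / μd². The z values are computed rather than read from a table, so they differ slightly from rounded table values.

```python
from statistics import NormalDist

def n_paired(min_diff: float, sd_diff: float,
             alpha: float = 0.05, power: float = 0.80) -> float:
    """Approximate number of pairs for a two-sided paired z-test (assumed formula)."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96
    z_beta = NormalDist().inv_cdf(power)               # about 0.84
    return (z_alpha + z_beta) ** 2 * sd_diff ** 2 / min_diff ** 2

# Capillary density example: minimum difference 4, SD of differences 6.1.
n_pairs = n_paired(min_diff=4.0, sd_diff=6.1)
```

The result is a little over 18 pairs, consistent with the value on the slide; in practice one would round up.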
![Page 26: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/26.jpg)
Part V
Measuring agreement
Literature:
Bland, Altman, (1999), ”Measuring agreement in method comparison studies”, Stat Meth Med Res, 8:135-160
Landis, Koch, (1977), ”The measurement of observer agreement for categorical data”, Biometrics, 33:159-174
![Page 27: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/27.jpg)
Measuring agreement
Methods used in the literature:
Ordinal data:
- Cohen's kappa
- Spearman's rank-order correlation coefficient
- Kendall's tau
- Kendall's coefficient of concordance
Interval/ratio data:
- Pearson's correlation coefficient
- Intraclass correlation coefficient
- Tukey's mean-difference plot (Bland-Altman plot)
![Page 28: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/28.jpg)
Measuring agreement
![Page 29: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/29.jpg)
Measuring agreement
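Cohen's kappa (Ordinal data)
Raters: Doctor 1 and Doctor 2 (one along the rows, the other along the columns)

| | Schizophrenia | Bipolar | Other | Row sum |
|---|---|---|---|---|
| Schizophrenia | 31 | 4 | 2 | 37 |
| Bipolar | 6 | 29 | 8 | 43 |
| Other | 10 | 7 | 3 | 20 |
| Column sum | 47 | 40 | 13 | 100 |

agreement rate = 0.63
κ = 0.41
σκ = 0.077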
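The agreement rate and κ on this slide can be reproduced from the 3 × 3 table of counts. The sketch uses the standard definition κ = (p_o − p_e)/(1 − p_e), with p_e the agreement expected by chance from the marginal totals; the standard error σκ is not computed here.

```python
# Counts from the slide: rows are one doctor, columns the other
# (order: Schizophrenia, Bipolar, Other).
table = [
    [31, 4, 2],
    [6, 29, 8],
    [10, 7, 3],
]

total = sum(sum(row) for row in table)
row_sums = [sum(row) for row in table]
col_sums = [sum(col) for col in zip(*table)]

# Observed agreement: proportion of cases on the diagonal.
p_observed = sum(table[i][i] for i in range(len(table))) / total

# Chance agreement: product of the marginal proportions, summed over classes.
p_expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2

kappa = (p_observed - p_expected) / (1.0 - p_expected)
```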
![Page 30: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/30.jpg)
Measuring agreement
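More than two judges (Ordinal data)
For example: Kendall's coefficient of concordance
(related to Friedman's two-way ANOVA on ranks)

MGP 2009: rank given to each song by each district

| District | Song 1 | Song 2 | Song 3 | Song 4 | Song 5 | Totals |
|---|---|---|---|---|---|---|
| NJutl | 1 | 2 | 3 | 5 | 4 | |
| MJutl | 1 | 2 | 4 | 3 | 5 | |
| SJutl | 1 | 2 | 3 | 5 | 4 | |
| Sjæll | 1 | 2 | 3 | 4 | 5 | |
| Cophn | 1 | 2 | 4 | 3 | 5 | |
| Sum | 5 | 10 | 17 | 20 | 23 | T = 75 |
| Sumsq | 25 | 100 | 289 | 400 | 529 | U = 1343 |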
![Page 31: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/31.jpg)
Measuring agreement
Kendall’s coefficient of concordance, W
m = number of raters
n = number of classes
W = 218 / 250 = 0.872
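The formula for W is missing from this transcript. The sketch below assumes the standard definition for untied ranks, W = 12·S / (m²(n³ − n)) with S = U − T²/n, which reproduces 218/250 from the MGP 2009 rank table.

```python
# Ranks from the MGP 2009 table: one row per district (rater),
# one column per song (ranked object).
ranks = [
    [1, 2, 3, 5, 4],  # NJutl
    [1, 2, 4, 3, 5],  # MJutl
    [1, 2, 3, 5, 4],  # SJutl
    [1, 2, 3, 4, 5],  # Sjæll
    [1, 2, 4, 3, 5],  # Cophn
]

m = len(ranks)       # number of raters
n = len(ranks[0])    # number of ranked objects (classes)

col_sums = [sum(col) for col in zip(*ranks)]
T = sum(col_sums)                    # grand total of rank sums
U = sum(s * s for s in col_sums)     # sum of squared rank sums

S = U - T ** 2 / n                   # squared deviations of the rank sums
W = 12.0 * S / (m ** 2 * (n ** 3 - n))
```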
![Page 32: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/32.jpg)
Measuring agreement
NUMERICAL VARIABLES
Correlation coefficient
Intraclass correlation coefficient
Bland-Altman plot (Tukey plot)
[Figure: scatter plot with axes "Manual" and "Automated", shown together with the identity line.]
![Page 33: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/33.jpg)
Measuring agreement
Pearson’s product-moment correlation coefficient
Ignores bias and gain!
Only for two raters.
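That the correlation coefficient ignores bias and gain is easy to demonstrate. In this sketch the second method returns 2 × (first method) + 3, so the two never agree, yet r equals 1; the numbers are invented for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical measurements: gain of 2 and bias of 3 between the methods.
manual = [1.0, 2.0, 3.0, 4.0, 5.0]
automated = [2.0 * v + 3.0 for v in manual]

r = pearson_r(manual, automated)
largest_disagreement = max(abs(a - m) for a, m in zip(automated, manual))
```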
![Page 34: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/34.jpg)
Measuring agreement
Intraclass correlation coefficient (also for multiple raters) = Between pairs variance / Total variance.
k = number of subjects (or measured objects)
n = number of raters (or methods)
This takes into account the systematic difference!
![Page 35: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/35.jpg)
Measuring agreement
Bland-Altman plot
Tukey mean-difference plot
![Page 36: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/36.jpg)
Measuring agreement
Bias
Proportional error
Heterogeneous variance
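The quantities behind a Bland-Altman plot can be sketched as follows: for each subject, the difference between the two methods is plotted against their mean, and the bias is the mean difference. The 95% limits of agreement (bias ± 1.96 SD) follow Bland & Altman (1999) and are an addition here, not something shown in the transcript; the data are invented.

```python
from statistics import mean, stdev

# Hypothetical paired measurements of the same subjects by two methods.
manual = [10.2, 11.5, 9.8, 12.1, 10.9, 11.7]
automated = [10.8, 11.9, 10.5, 12.4, 11.6, 12.0]

# Coordinates of the plot: x = mean of the pair, y = difference of the pair.
averages = [(a + m) / 2.0 for a, m in zip(automated, manual)]
differences = [a - m for a, m in zip(automated, manual)]

bias = mean(differences)        # systematic difference between methods
sd_diff = stdev(differences)    # spread of the differences

# Approximate 95% limits of agreement.
limits = (bias - 1.96 * sd_diff, bias + 1.96 * sd_diff)
```

A trend of the differences with the averages would point to proportional error, and a spread that grows with the averages to heterogeneous variance.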
![Page 37: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/37.jpg)
Part VI
Systematic review and meta-analysis
Literature:
Chalmers, Altman, (eds), (1995), ”Systematic reviews”, Br. Med. J. Publ. Group, London
Higgins et al., (2003), ”Measuring inconsistency in meta-analyses”, Br. Med. J., 327:557-560
Cochrane Handbook: at http://www.cochrane.org
![Page 38: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/38.jpg)
Systematic review and meta-analysis
Systematic review =
Formalized and stringent process of combining the information from all (published and unpublished) studies of the same health condition.
![Page 39: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/39.jpg)
Systematic review and meta-analysis
Why systematic reviews?
Reduction of information
Generalization to a wider population
Consistency by comparing different studies
Reliability of recommendations
Power and precision increases
![Page 40: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/40.jpg)
Systematic review and meta-analysis
Meta-analysis =
Systematic review with focus on numerical results
To combine results from individual studies to estimate an overall / average effect of interest (example: the relative risk of getting cancer because of using mobile phones)
![Page 41: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/41.jpg)
Systematic review and meta-analysis
Meta-analysis
From a statistical angle, meta-analysis is an application of multifactorial methods:
Multiple studies of the same thing. Combine the results of the studies:
- Treatment / risk factor is one independent factor
- Study is a second independent factor
![Page 42: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/42.jpg)
Systematic review and meta-analysis
Meta-analysis
Clear definition of the question / effect of interest.
Example:
- Does lowering serum cholesterol reduce risk of dying from coronary artery disease?
- Does a diet to lower serum cholesterol reduce risk of dying from coronary artery disease?
Should a study in which the attempt to lower cholesterol failed be included?
![Page 43: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/43.jpg)
Systematic review and meta-analysis
Meta-analysis – PUBLICATION BIAS
Simple literature search is not good enough!
- Bias towards positive results (sometimes to negative results)
- More positive results in English literature?
- Unpublished studies are important.
![Page 44: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/44.jpg)
Systematic review and meta-analysis
Meta-analysis – Example from M. Bland, ch. 17
![Page 45: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/45.jpg)
Systematic review and meta-analysis
Meta-analysis – Example from M. Bland, ch. 17
![Page 46: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/46.jpg)
Systematic review and meta-analysis
Meta-analysis – Example from M. Bland, ch. 17
$\ln(o) = b_0 + b_1 T + b_2 S_1 + \dots + b_5 S_4 + b_6 S_5 + b_7 T S_1 + \dots + b_{11} T S_5$
![Page 47: BIOSTATISTICS](https://reader035.vdocuments.us/reader035/viewer/2022062514/55851768d8b42aff298b4fcb/html5/thumbnails/47.jpg)
Systematic review
Example (Mailis-Gagnon et al., (2004), ”Spinal cord stimulation for chronic pain”, The Cochrane Library, issue 3)
1692 papers: only 2 admitted to the review
Result: further study needed(!)
http://thecochranelibrary.com