types of data & variables statistical procedures for ......types of data: quantitative data...
TRANSCRIPT
Statistical Procedures for Biomedical Study
Jaranit KaewkungwalFaculty of Tropical Medicine
Types of Data & Variables
Types of Data
QUALITATIVE
Data expressed by type
Data that has been described
QUANTITATIVE
Data classified by numeric value
Data that has been measured or counted
Adapted from: Dr. Craig Jackson, University of Central England
QUALITATIVE and QUANTITATIVE data are not mutually exclusive
Types of Data: Qualitative (Categorical) Data
NOMINAL DATA• values that the data may have do not have specific order• values act as labels with no real meaning• Binomial: two possible values (categories, states)• Multinomial: more than two possible values (categories, states)e.g. Health status healthy =1 sick=2e.g. Treatment new regimen = 1 standard regimen = 2e.g. hair colour brown =1 blond =2 black =100
ORDINAL DATA• values with some kind of ordering • data that has been measured or countede.g. social class: upper=1 middle = 2 working = 3e.g. glioblastoma tumor grade: 1 2 3 4 5e.g. position in a race: 1 st 2 nd 3 rd
Adapted from: Dr. Craig Jackson, University of Central England
Types of Data: Quantitative Data
DISCRETE• distinct or separate parts, with no finite detaile.g children in family
CONTINUOUS• between any two values, there would be a thirde.g between meters there are centimetres
INTERVAL• equal intervals between values and an arbitrary zero on the scalee.g temperature gradient
RATIO• equal intervals between values and an absolute zeroe.g body mass index
Adapted from: Dr. Craig Jackson, University of Central England
Examples of Data Coding
1 2 99Cat.
Example of Descriptive Statistics Constant vs. Variable
Variables are the specific properties that have the ability to take different values.
Constants are the specific properties that cannot vary or won’t be made to vary.
Terminology - Variables
INDEPENDENT(syn: treatment, predictor, input, exposure, explanatory variable) is a stimulus or activity that is identified or manipulated to predict the dependent variable e.g. new drug, sex, smokingDEPENDENT(syn: effect, criterion, outcome, output variable) is a response that the researcher wanted to assess and it happens due to the independent variables. e.g., cure rates, attitudes, performance on neuropsychological test
Adapted from: Dr. Craig Jackson, University of Central England
X Y X (independent) Y (dependent)
Terminology - Variables
EXTRANEOUSExtraneous variable is a variable that has a potential to distorts the relationship between dependent and independent variables.
• Controlled extraneous variables are recognized before the study is initiated and are controlled in the design and selection criteria.• Uncontrolled extraneous variables are recognized before or after the study is initiated but are not controlled in the design. They can be assessed and adjusted later by statistical methods.
e.g., diet, income, age
Adapted from: Dr. Craig Jackson, University of Central England
X (independent) Y (dependent)X1 (independent) Y (dependent)
X2 (independent)
Study Variables
Confounding Variable- When the effects of two or more variables cannot be separated.
“Condom Use increases the risk of STD”BUT ...
Explanation: Individuals with more partners are more likely to use condoms. But individuals with more partners are also more likely to get STD.
Study VariablesConfounding Variable- When the effects of two or more variables cannot be separated.
Types of Study
•Types of Observational Study• Descriptive(Parameter Estimation)
- Cross-sectional• Analytic
(Hypothesis Testing)- Cross-sectional- Case-Control- Cohort
• Types of Experimental Study• True Experimental
- Randomize Control Trial (RCT)• Quasi Experimental
• Other Types of MedicalResearch Study
• Diagnosis• Prognostic Factor Study
Types of Study Design
•Types of Observational Study• Descriptive(Parameter Estimation)
- Cross-sectional
• Analytic(Hypothesis Testing)
- Cross-sectional- Case-Control- Cohort
Types of Study Design:Descriptive Study
Sampling Techniques
Generalization/Inferential Statistics
P
X
•Types of Observational Study• Descriptive(Parameter Estimation)
- Cross-sectional
• Analytic(Hypothesis Testing)
- Cross-sectional- Case-Control- Cohort
Types of Study Design:Analytic Study
Generalization/
P1 X1
P2 X2
Ho:P1 = P2 X1 = X2
Types of Study Design:Analytic - Observational
•Types of Observational Study• Descriptive(Parameter Estimation)
- Cross-sectional• Analytic
(Hypothesis Testing)- Cross-sectional- Case-Control- Cohort
Cross‐sectionalExposure <‐> Outcome
Smoking (y/n)<‐> Cancer (y/n)Depress <‐> QOL
Case‐ControlOutcome Exposure
Cancer (y/n) Smoking (y/n)Stroke after surgery(y/n) surgery time
CohortExposure Outcome
Smoking (y/n) Cancer (y/n)Diabetes(y/n) Cognitive & physical function score
D D
E a b m
E c d n
o p N
Coho
rt
Case-Control
• Types of Experimental Study• True Experimental
- Randomize Control Trial (RCT)• Quasi Experimental
Types of Study Design:Analytic - Experimental
ExperimentalExposure Outcome
Treatment (A/B) Cure (y/n)Intensive counseling (y/n) Coping score
D D
E a b m
E c d n
o p N
Coho
rtR
CT
Case-Control
Subjects, i = 1,…,n
A
C
B
Randomize
Treatment groups
• Types of Experimental Study• True Experimental
- Randomize Control Trial (RCT)• Quasi Experimental
Truly randomized subjects
Not‐ truly randomized subjects
Types of Study Design:Analytic - Experimental
Subjects, i = 1,…,n
A
C
B
Treatment groupsBaseline Exposure groups Measurement times
Types of Study Design:Analytic – Repeated measures (Longitudinal)
Subjects, i = 1,…,n
A
C
B
Measurement times
Xi0 Xi1 Xi2 Xi3 Xi4
Treatment groupsBaseline Exposure groups
Types of Study Design:Analytic – Repeated measures (Longitudinal)
Subjects, i = 1,…,n
A
B
Study Closure
LostStudy closed
OutcomeOutcome
OutcomeOutcome of other cause
OutcomeStudy closed
Failure
Censor
Time from randomized to outcomes / censor / study closure
Treatment groupsBaseline Exposure groups
Types of Study Design:Analytic – Time-to-Event (Longitudinal)
Types of Statistics
Types of Statistics
• By Level of Underlying Distribution– Parametric Statistics– Non-parametric Statistics
Sampling Techniques
Generalization/Inferential Statistics
Types of Statistics
• By Level of Generalization– Descriptive Statistics– Inferential Statistics
• Parameter Estimation• Hypothesis Testing
– Comparison– Association– Multivariable data analysis– Multivariate data analysis
Sampling Techniques
Generalization/Inferential Statistics
Types of Statistics
• By Level of Generalization– Descriptive Statistics– Inferential Statistics
• Parameter Estimation• Hypothesis Testing
– Comparison– Association– Multivariable data analysis
Sampling Techniques
Generalization/Inferential Statistics
Describe Statistics of SampleP X SD
Descriptive Statistics
• Describe sample statistics– Categorical Variables
Proportion, Percent (%)– Continuous Variable
• Normal Distribution Mean SD
• Not normal DistributionMedian Range/IQR
Examples: Descriptive Statistics
Examples: Descriptive Statistics Types of Statistics
• By Level of Generalization– Descriptive Statistics– Inferential Statistics
• Parameter Estimation• Hypothesis Testing
– Comparison– Association– Multivariable data analysis
Descriptive StudyEstimate Parameter in Population from Statistics of Sample
P P (95%CI of P) Ex: 20% (17%‐23%) = X (SD) X (SE) or X (95%CI of X) Ex: 2.75 (1.60‐3.90) =
Sampling Techniques
Generalization/Inferential Statistics
P
X
Inferential Statistics
• Estimate population parameters– Categorical Variables
Proportion, Percent (%) with 95%CI– Continuous Variable
Mean SE (or 95%CI)
Examples: Parameter Estimation
Examples: Parameter Estimation Types of Statistics
• By Level of Generalization– Descriptive Statistics– Inferential Statistics
• Parameter Estimation• Hypothesis Testing
– Comparison– Association– Multivariable data analysis
Generalization/
P1 X1
P2 X2
Ho:P1 = P2 X1 = X2
Analytic StudyAssociation or Causation
between Exposure and Outcome
Inferential Statistics
• Comparison between groups – Categorical outcome
Chi-square (2) test– Continuous Variable
• Normal Distributiont-test (2 groups) F-test/ANOVA (>= 2 groups)
• Not normal DistributionWilcoxon test, Mann-Whitney U test
Examples: 2‐test & t‐test
Examples: Paired t-test Types of Statistics
• By Level of Generalization– Descriptive Statistics– Inferential Statistics
• Parameter Estimation• Hypothesis Testing
– Comparison– Association– Multivariable data analysis
Analytic StudyAssociation or Causation
between One or Many Exposures and Outcome
Example:SimpleY XBP Drug (A/B)BP Age
MultipleY X1 X2 X3BP Drug(A/B) Sex (M/F) Age
Types of Statistics
Regression Model• Prediction of outcome from exposures• Hypothesis testing with adjusted (controlled) for confounding
other variablesSimple Y X• BP Drug (A/B)
BP = a + b Drug• BP Age
BP = a + b AgeMultiple Y X1 X2 X3
BP Drug(A/B) Sex (M/F) AgeBP = a + b1 Drug + b2 Drug + b3 Age BP = 100 + 3 Drug + 5 Sex ‐ 7 Age
A=0 / B=1 M=0 / F=1
Multi-variables analysis
• Linear Regression
• Logistic Regression
• Poisson Regression
• Cox’s Proportional Hazard Regression
Y X1 X2 X3 …Continuous Continuous, Categorical
(normal dist)
Y X1 X2 X3 …Categorical Continuous, Categorical
(Binary, Ordinal, Multinomial))
Y X1 X2 X3 …Incidence / Count Continuous, Categorical
Y X1 X2 X3 …Time to event Continuous, Categorical
Typical Regression Models in Clinical Research
• Linear Regression
• Binary Logistic Regression
• Poisson Regression
• Cox’s Proportional Hazard Regression
Y X1 X2 X3 …Continuous Continuous, Categorical
Asthma scores Age Sex Smoking
Y X1 X2 X3 …Categorical Continuous, Categorical
Asthma(1) vs. No Asthma(0) Age Sex Smoking
Y X1 X2 X3 …Incidence / Count Continuous, Categorical
Asthma episodes in past year Age Sex Smoking
Y X1 X2 X3 …Time to event Continuous, Categorical
Time to asthma relapse Age Sex Smoking
Comparisons Among Different Regression Models
Linear regression model
OR = exp Binary logistic regression model
Cox regression modelHR = exp
IRR = exp
Poisson regression model
Note: If P0 (T) is small, when comparing two groups
Simple and Multiple Models Crude and Adjusted RRin Case-Control study
Relative Risk in Experimental Study Relative Risk in Experimental Study
Relative Risk in Experimental Study
4/9 = 0.441/9 = 0.11
0.44 – 0.110.44
วเิคราะห์แบบได้รับการรักษาครบตามทีก่าํหนดไว้ในแผนการวจิยั(Per‐Protocol – PP)
Relative Risk in Experimental Study
Example from RV144 (2009)
ITT Population
Modified ITT Population
IRR / HR and Efficacy in Clinical TrialExample from RV144 (2009)
ประสิทธิผล0.279 – 0.192
0.279
ประสิทธิผล1- RR = 1 - 0.192
0.279
Types of Statistics
• By Level of Generalization– Descriptive Statistics– Inferential Statistics
• Parameter Estimation• Hypothesis Testing
– Comparison– Association– Multivariate data analysis
Analytic StudyAssociation or Causation
between Many Exposures and Many Outcomes Over Time
Example:Single OutcomeY XY X1 X2 X3BP Drug(A/B) Sex (M/F) Age
Repeated OutcomesYt1 Yt2 Yt3 XYt1 Yt2 Yt3 X1 X2 X3BPt1 BPt2 BPt3 Drug(A/B) Sex (M/F) Age
Different Analytic Modelsin Longitudinal Data Analysis
• Generalized Linear Model (GLM)– Basic Statistical Methods for Hypothesis Testing
(t-test, F-test, Regressions)– Repeated Measures Analysis of Variance
• Generalized Estimating Equation (GEE)• Time-varying Covariates• Generalized Mixed Model• Survival & Time-to-event Model• Multiple Failures Model• Time series Model (secondary data)
Example: Comparing means at each time point
Example: Comparing means at each time point and against baseline
Example: Comparing change over time
The End Statistical Procedures for Clinical Study