ADVANCED STATISTICS FOR MEDICAL STUDIES
Mwarumba Mwavita, Ph.D.
School of Educational Studies
Research Evaluation Measurement and Statistics (REMS)
Oklahoma State University
Statistics
Set of methods and rules for organizing, summarizing, and interpreting information.
Two categories of statistical procedures are used to organize and interpret data:
Descriptive and Inferential Statistics
Descriptive statistics are statistical
procedures that are used to summarize,
organize, and simplify data.
Inferential statistics – techniques used to
study samples and make generalizations about
the populations from which they were selected.
Descriptive Statistics
Descriptive measure computed from the
data of a sample is called a statistic
Descriptive measure computed from the
data of a population is called a parameter
Central Tendency
A statistical measure that identifies a single score as representative of an entire distribution
The goal of central tendency is to find the single score that is most typical or most representative of the entire group
Mean – commonly referred to as the average
Mode – most frequent score in a distribution
Median – the middle value in a distribution
Variability
Range – highest score minus lowest score
Semi-interquartile range – (Q3 – Q1)/2
Standard Deviation – the standard distance from the mean
Variance – the mean of the squared deviations
Coefficient of Variation (CV) – useful for comparing two or more data sets with different units of measurement because it is expressed as a percentage (CV = SD/mean x 100%)
Confidence Interval (CI) - is a measure of the precision of the point estimate
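The measures of central tendency and variability listed above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library. The `scores` values are invented for the example, the sample (n - 1) formulas are used for the variance and standard deviation, and the 95% confidence interval uses a normal approximation, which is an assumption that would normally be replaced by a t critical value for a sample this small.

```python
import math
import statistics

# Hypothetical sample of systolic blood pressure readings (mmHg), invented for illustration.
scores = [118, 122, 125, 125, 130, 134, 141, 150]

# Central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Variability
value_range = max(scores) - min(scores)
q1, _, q3 = statistics.quantiles(scores, n=4)  # quartile cut points
semi_iqr = (q3 - q1) / 2
sd = statistics.stdev(scores)            # sample SD (divides by n - 1)
variance = statistics.variance(scores)   # sample variance (divides by n - 1)
cv_percent = sd / mean * 100             # CV = SD / mean x 100%

# 95% confidence interval for the mean (normal approximation, an assumption here)
z_crit = statistics.NormalDist().inv_cdf(0.975)
standard_error = sd / math.sqrt(len(scores))
ci_95 = (mean - z_crit * standard_error, mean + z_crit * standard_error)

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={value_range}, semi-IQR={semi_iqr:.2f}, SD={sd:.2f}, variance={variance:.2f}")
print(f"CV={cv_percent:.1f}%, 95% CI=({ci_95[0]:.1f}, {ci_95[1]:.1f})")
```

A narrower interval indicates a more precise point estimate. `statistics.pvariance` and `statistics.pstdev` give the population versions, which match "the mean of the squared deviations" exactly.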
Normal distribution
A bell shape distribution
It is symmetrical
Terms
IV – Independent variable (treatment)
DV – Dependent variable (outcome)
Z-Test
Used in hypothesis testing when a sample mean is used to test a hypothesis about an unknown population, generally a population that has received treatment
Note: the parameters of the population that did not receive treatment are known
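As a rough sketch of the z-test described above, the statistic compares the treated sample mean with the known mean of the untreated population, scaled by the standard error. The function name `z_test` and all numbers below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, n, pop_mean, pop_sd):
    """Return the z statistic and two-tailed p-value for a one-sample z-test.

    pop_mean and pop_sd are the known parameters of the untreated population.
    """
    standard_error = pop_sd / math.sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed

# Hypothetical example: untreated population has mean 100 and SD 10;
# a treated sample of 25 participants has a mean of 104.
z, p = z_test(sample_mean=104.0, n=25, pop_mean=100.0, pop_sd=10.0)
print(f"z = {z:.2f}, two-tailed p = {p:.4f}")
```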
T-test
The t statistic is used to test hypotheses about µ when the value of the population standard deviation is not known
Uses a t-distribution, and thus degrees of freedom (the number of scores in a sample that are free to vary)
Sample size determines the degrees of freedom (df = n – 1) and therefore which t-distribution is used
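A minimal sketch of the one-sample t statistic follows. It differs from the z-test only in that the sample standard deviation stands in for the unknown population standard deviation. The function name `one_sample_t` and the data are hypothetical. The p-value is left out because the standard library has no t-distribution; the statistic would be compared against a t table using the returned degrees of freedom.

```python
import math
import statistics

def one_sample_t(scores, mu):
    """Return the t statistic and degrees of freedom for H0: population mean = mu."""
    n = len(scores)
    sample_mean = statistics.mean(scores)
    sample_sd = statistics.stdev(scores)          # estimates the unknown population SD
    estimated_standard_error = sample_sd / math.sqrt(n)
    t = (sample_mean - mu) / estimated_standard_error
    df = n - 1                                    # scores free to vary
    return t, df

# Hypothetical scores, tested against a hypothesized population mean of 10.
t, df = one_sample_t([12, 15, 11, 14, 13], mu=10)
print(f"t({df}) = {t:.3f}")
```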
Independent and Dependent t-test
The independent t-test uses two separate samples, one for each treatment condition. (Rule of thumb: at least 10 subjects per group)
The dependent t-test is also referred to as the repeated-measures t-test. A single sample of individuals is measured more than once on the same dependent variable
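The two designs lead to different calculations, which the sketch below tries to make concrete. The independent version shown here pools the two sample variances, which assumes roughly equal population variances. The dependent version works on the difference scores from each individual. Function names and data are hypothetical.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Pooled-variance t statistic for two separate samples (assumes equal variances)."""
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    pooled_variance = (
        (n1 - 1) * statistics.variance(group_a)
        + (n2 - 1) * statistics.variance(group_b)
    ) / df
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df

def dependent_t(first_measure, second_measure):
    """Repeated-measures t statistic computed on each individual's difference score."""
    differences = [second - first for first, second in zip(first_measure, second_measure)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    t = statistics.mean(differences) / standard_error
    return t, n - 1

# Hypothetical data: 10 participants per group, following the rule of thumb.
treatment = [24, 27, 22, 30, 26, 25, 29, 28, 23, 27]
control = [21, 23, 20, 25, 22, 24, 19, 23, 22, 21]
print("independent:", independent_t(treatment, control))

# Hypothetical data: the same 10 participants measured before and after treatment.
before = [140, 152, 138, 147, 160, 155, 149, 143, 151, 158]
after = [135, 150, 136, 140, 152, 151, 147, 141, 144, 150]
print("dependent:", dependent_t(before, after))
```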
ANOVA (Analysis of Variance)
ANOVA - hypothesis-testing procedure used to
1. test hypotheses about population variances
2. evaluate mean differences between two or more
treatments (or populations)
Uses variances to determine if the means are
significantly different.
1. Single factor (one-way) – one treatment under different levels
2. Factorial designs – involve more than one factor (treatment)
3. Repeated measures – assess a measurement on the same participants under different conditions/times
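To show how ANOVA uses variances to compare means, here is a minimal sketch of the single-factor (one-way) case only. The F-ratio divides the variance between treatment means by the variance within treatments. The function name `one_way_anova` and the data are hypothetical, and the factorial and repeated-measures designs need additional partitioning that is not shown.

```python
import statistics

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a single-factor, independent-groups design."""
    all_scores = [score for group in groups for score in group]
    grand_mean = statistics.mean(all_scores)

    # Variability of the group means around the grand mean
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2 for group in groups
    )
    # Variability of the scores around their own group mean
    ss_within = sum(
        (score - statistics.mean(group)) ** 2 for group in groups for score in group
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within

# Hypothetical outcome scores under three levels of one treatment.
low_dose = [4, 5, 6, 5, 4]
medium_dose = [6, 7, 8, 7, 6]
high_dose = [9, 8, 10, 9, 8]
f_ratio, df_b, df_w = one_way_anova([low_dose, medium_dose, high_dose])
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

An F-ratio near 1 suggests the treatment means differ no more than would be expected by chance. Larger values are compared against an F table at the chosen alpha level.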
Correlation and Regression Analysis
Correlation analyses mathematically identify and
describe relationships between variables
Regression analysis attempts to predict or
estimate the value of a response variable from the
known values of one or more explanatory
variables
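The distinction between describing a relationship and predicting from it can be sketched with the Pearson correlation and a simple one-predictor regression line. Both are computed by hand here so the sums of squares are visible. The function names and the age/blood-pressure data are hypothetical, and only the single-explanatory-variable case is shown.

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson correlation: the direction and strength of a linear relationship."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sum_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sum_products / math.sqrt(ss_x * ss_y)

def fit_line(x, y):
    """Least-squares slope and intercept for predicting y (response) from x (explanatory)."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sum_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    slope = sum_products / ss_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: age in years (explanatory) and systolic blood pressure (response).
age = [35, 42, 50, 58, 63, 70]
systolic = [118, 124, 131, 138, 142, 151]

r = pearson_r(age, systolic)
slope, intercept = fit_line(age, systolic)
predicted_at_55 = intercept + slope * 55
print(f"r = {r:.3f}; predicted systolic at age 55 = {predicted_at_55:.1f}")
```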
Factor Analysis
Exploratory factor analysis – used when the researcher does not know how many factors are necessary to explain the inter-relationships among a set of characteristics, indicators, or items (Reduction)
Confirmatory factor analysis – assesses the extent to which the hypothesized organization of a set of identified factors fits the data
Survival Analysis
Survival/failure analysis is a family of techniques dealing with the time it takes for something to happen: a cure, a failure, a relapse, a death, and so on
There are two major varieties of the technique. The first is life tables, which describe the course of survival of one or more groups of cases
The second encompasses a set of regression techniques in which the DV is survival time
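A life table can be illustrated with a very small sketch. It assumes every case is followed until the event occurs (no censoring), which real studies rarely satisfy, and it uses fixed-width time intervals. The function name `life_table`, its parameters, and the relapse times are hypothetical.

```python
def life_table(event_times, interval_width, n_intervals):
    """Build a simple life table, assuming no censored cases.

    Each row is (start, end, at_risk, events, cumulative_survival).
    """
    rows = []
    at_risk = len(event_times)
    cumulative_survival = 1.0
    for i in range(n_intervals):
        start = i * interval_width
        end = start + interval_width
        # Cases whose event falls inside this interval
        events = sum(1 for t in event_times if start <= t < end)
        proportion_surviving = (at_risk - events) / at_risk if at_risk else 0.0
        cumulative_survival *= proportion_surviving
        rows.append((start, end, at_risk, events, cumulative_survival))
        at_risk -= events
    return rows

# Hypothetical months until relapse for 10 patients.
months_to_relapse = [2, 3, 5, 8, 9, 13, 14, 20, 22, 30]
for start, end, at_risk, events, surviving in life_table(months_to_relapse, 6, 4):
    print(f"{start:>2}-{end:<2} months: at risk={at_risk}, events={events}, "
          f"cumulative survival={surviving:.2f}")
```

The regression-based variety, in which survival time is the DV, is not shown here.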
Nonparametric techniques
Usually do not state hypotheses in terms of a specific parameter
They make very few assumptions about the population distribution (distribution-free tests)
Suited for data measured on ordinal and nominal scales
Not as sensitive as parametric tests; more likely to fail to detect a real difference between two treatments
Types of nonparametric tests
Chi-square statistic tests for Goodness of Fit
(how well the obtained sample proportions fit the
population proportions specified by the null
hypothesis)
Test for independence – tests whether or not
there is a relationship between two variables
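Both chi-square tests compare observed frequencies with the frequencies expected under the null hypothesis, as the following sketch tries to show. The function names and the counts are hypothetical, and the resulting statistic would be compared against a chi-square table using the returned degrees of freedom.

```python
def chi_square_goodness_of_fit(observed, expected_proportions):
    """Compare observed category counts with proportions specified by the null hypothesis."""
    n = sum(observed)
    expected = [p * n for p in expected_proportions]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1

def chi_square_independence(table):
    """Test whether the row variable and the column variable are related.

    table is a list of rows of observed counts.
    """
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated
            expected = row_totals[i] * column_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical goodness of fit: are three clinics chosen equally often?
print(chi_square_goodness_of_fit([45, 30, 25], [1 / 3, 1 / 3, 1 / 3]))

# Hypothetical independence test: smoking status (rows) by diagnosis (columns).
print(chi_square_independence([[30, 70], [10, 90]]))
```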
More Terms
Type I error – rejecting a true null hypothesis
(concluding that the treatment has an effect when in fact the
treatment has no effect)
Alpha level for a hypothesis test is the probability
that the test will lead to a Type I error
Scenario 1
Alcohol appears to be involved in a variety of birth defects, including low birth weight and retarded growth. A researcher would like to investigate the effect of prenatal alcohol on birth weight.
How will the researcher do this?
D.V.
I.V.
Participants
Scenario 2
A researcher would like to know whether room temperature affects eating behavior.
Design
I.V.
D.V.
Others
Participants
Scenario 3
A patient recently visited her physician complaining
of backache. The physician is aware of a new
technique of disc replacement. The physician
would like to test the technique but does not want
to use it on the patient.
What would you advise the physician to
do in this case?
Scenario 4
You notice that students from a nearby
elementary school that you have attended suffer
from the common cold, a disease that has been
at the school for a while. How does this school
compare to an elementary school across town?
How would you go about investigating this
problem?
Scenario 5
Suppose you are interested in finding out how well a new treatment for osteoporosis among women will work.
Design
IV
DV
Others
Scenario 6
Using Scenario 5, how can we make it a two-way ANOVA?
How could we make it a Repeated-measures
ANOVA?
Scenario 7
Diabetes has been on the increase among
American adolescents. A researcher is
interested in determining factors that
contribute to the rise of diabetes among
adolescents
Scenario 8
A physician is interested in finding out the
factors that contribute to lung cancer. How
would you design this study?
Scenario 9
How would you investigate factors that
contribute to high blood pressure among
people?
Summary
Problem
Design issues
Variables
Participants
Sample size