Statistics in Clinical Research
By
Dr. Tamer M.S. Hifnawy, Associate Professor of Public Health
Faculty of Medicine, BSU.
• Types of Variables
• Commonly used Statistical tests
• Normal distribution curve
• Measures of Central Tendency and Measures of Dispersion
• Null and alternative hypothesis
• Type I and type II errors
• P-value & statistical significance
What is Statistics?
Statistics are numerical statements.
Biomedical Statistics are numerical statements about medical matters.
• Number with a certain disease.
• Number or proportion cured.
• Proportion of deaths.
• Number of hospital beds.
Biomedical Statistics
Discipline concerned with treatment of numerical data derived from a group of individuals.
Used for:
• Administrative purposes.
• Research.
Why Statistics?
• I have smoked cigarettes for 30 years and have suffered no disease.
• Therefore smoking cigarettes is safe!
• I smoked cigarettes for 2 years and developed bronchitis, which became chronic. Now I have been diagnosed with lung malignancy.
• Therefore smoking cigarettes is damaging to your health!
Why Statistics?
Who are we going to believe?
Get Statistics, but how?
• Dr. “X” observed 1000 smokers.
• Health problems were found among 800 of them.
• Only 200 had no health problems.
• Therefore smoking cigarettes is damaging to your health!
Why Statistics?
Who are we going to believe?
• Dr. “Y” observed 2000 smokers.
• Health problems were also found among 800 of them.
• More smokers (1200) are healthy.
• Therefore smoking cigarettes does not affect your health!
Wait!
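One way to see why the two doctors disagree is to turn their raw counts into proportions. The sketch below only restates the numbers from the two observations; the dictionary layout and function name are illustrative assumptions, not part of the slides.

```python
# Raw counts reported in the two observations above.
observations = {
    "Dr. X": {"smokers": 1000, "with_problems": 800},
    "Dr. Y": {"smokers": 2000, "with_problems": 800},
}


def proportion_affected(smokers, with_problems):
    """Share of the observed smokers who had health problems."""
    return with_problems / smokers


proportions = {
    name: proportion_affected(**counts) for name, counts in observations.items()
}

for name, share in proportions.items():
    print(f"{name}: {share:.0%} of observed smokers had health problems")
```

The same 800 cases are 80% of one group and 40% of the other, and neither observation includes non-smokers for comparison.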
Variables definition
Characteristics that vary from one subject to the other
Needs to be measured for every subject
Relate to person, environment, cause etc.
Statistics
• Data Collection
• Data Presentation
• Data Analysis
• Interpretation of results
At the beginning, there must be a clear understanding of the words data and variables.
Data: “Numerical information suitable for processing or organized for analysis.”
Data describe a well-defined situation objectively.
Statistical methods will be mainly concerned with data collection, presentation and analysis.
Interpretation by the specialist, in collaboration with the statistician, turns the results into information.
Research Steps (backbone)
The three main steps are:
STEP I: Proper data collection:
• Problem definition and objective setting
• Study design
• Sample type
• Sample size
• Sources of data
• Tools for data collection
STEP II: Proper data presentation:
• Tables: When details of data are needed.
• Graphs: When only impressions are needed.
• Parameters: Precise mathematical summary, useful for comparison.
STEP III: Proper use of statistical data analysis:
• Comparisons: Tests of significance are the main statistical tools used.
• Associations: Correlation and regression are the tools used.
VARIABLE TYPES
Quantitative: the data expressing the different levels of the variable are measured as numbers.
• Continuous (e.g. age)
• Discrete (e.g. pulse)
Qualitative: the variable is assessed as a description.
• Nominal (e.g. sex)
• Ordinal (e.g. disease severity)
Normal Distribution Curve
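[Figure: normal curve. About 34% of values fall on each side of the mean within one standard deviation and a further 13.5% on each side between one and two standard deviations, about 95% in total.]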
It has a peak and two symmetrical sides.
The peak coincides with all measures of central tendency.
The curve extends to infinity on both sides.
Confidence limits (95% CI): the two values that bound the 95% of values closest to the mean. Values between these two limits are considered to be adequately represented by the mean. Any value above the upper limit or below the lower limit is considered significantly different from the mean.
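The areas under the normal curve and the 95% limits can be checked numerically. This is a minimal sketch using Python's standard `statistics.NormalDist`; the mean of 100 and standard deviation of 15 are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal curve: mean 0, SD 1

# Area on one side of the mean, in bands of one standard deviation.
one_side_0_to_1 = std_normal.cdf(1) - std_normal.cdf(0)  # about 34%
one_side_1_to_2 = std_normal.cdf(2) - std_normal.cdf(1)  # about 13.5%
within_2_sd = std_normal.cdf(2) - std_normal.cdf(-2)     # about 95%

# Limits that enclose the central 95% of values.
# Hypothetical mean and SD, not taken from the slides.
mean, sd = 100.0, 15.0
z_95 = std_normal.inv_cdf(0.975)  # about 1.96
lower_limit = mean - z_95 * sd
upper_limit = mean + z_95 * sd

print(f"0 to 1 SD: {one_side_0_to_1:.1%}, 1 to 2 SD: {one_side_1_to_2:.1%}")
print(f"within 2 SD: {within_2_sd:.1%}")
print(f"95% limits: {lower_limit:.1f} to {upper_limit:.1f}")
```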
Summarizing Data
Measures of Central Tendency
Also called measures of the middle or measures of “center”.
Three most common:
• Mean: Arithmetic average of observations
• Median: The middle observation
• Mode: The most frequent observation
Sample Mean
Simple arithmetic average or sum of the scores divided by the number of items
Example: Xi=10, 12, 11, 15, 19
Sample mean=(10+12+11+15+19)/5=13.4
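The calculation above can be reproduced in a few lines. This sketch uses only the five scores from the example; the variable names are illustrative.

```python
scores = [10, 12, 11, 15, 19]

# Sum of the scores divided by the number of items.
sample_mean = sum(scores) / len(scores)

print(sample_mean)
```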
Median
Divides the bottom 50% of the data from the top 50%
Appropriate for use with skewed distributions
Approximately equal to mean when distribution is symmetric
Distribution Shapes
Symmetric (normal): Mean = Median
Right (positive) Skewed: Mean > Median
Left (negative) Skewed: Mean < Median
[Figure: example distribution on an axis from 50 to 190. Mean: 106.7; Median: 105.0]
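How skew separates the mean from the median can be seen on a small data set. The values below are hypothetical and chosen only to give a long right tail; the left-skewed set is simply its mirror image.

```python
from statistics import mean, median

# Hypothetical data with a long right tail (one large value).
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
# Mirror image of the same data: long left tail.
left_skewed = [21 - value for value in right_skewed]
symmetric = [1, 2, 3, 4, 5]

for label, data in [
    ("right skewed", right_skewed),
    ("left skewed", left_skewed),
    ("symmetric", symmetric),
]:
    print(f"{label}: mean={mean(data)}, median={median(data)}")
```

The extreme value pulls the mean toward the tail, while the median stays with the bulk of the data, which is why the median suits skewed distributions.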
Descriptive Statistics
   Central Tendency (Location)   Dispersion
a  Mid range                     Range
b  Mode                          Minimum, Maximum
c  Median                        25th percentile, 75th percentile
d  Arithmetic Mean               Standard deviation
Summarizing Data
Measures of Variation
Also called “measures of spread”.
Most commonly used:
• Sample range
• Sample variance
• Sample standard deviation
• Sample interquartile range
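The four measures listed above can be computed for the scores from the sample mean example. This is a sketch with Python's standard `statistics` module; note that the interquartile range depends on the quantile convention, and the module's default ("exclusive") method is assumed here.

```python
from statistics import quantiles, stdev, variance

scores = [10, 12, 11, 15, 19]

sample_range = max(scores) - min(scores)
sample_variance = variance(scores)  # divides by n - 1
sample_sd = stdev(scores)           # square root of the sample variance

# Quartiles with the module's default method; other conventions give
# slightly different values for small samples.
q1, _, q3 = quantiles(scores, n=4)
interquartile_range = q3 - q1

print(sample_range, sample_variance, round(sample_sd, 2), interquartile_range)
```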
62 °C and 12 °C
Average = 37 °C
Statistically speaking: “He is comfortable!”
Arithmetic Mean: CENTER
Standard deviation: DISPERSION
Hypothesis testing
It is a procedure for deciding whether a hypothesis about a quantitative feature of a population is true or false.
Practical Example
It is known that in a particular country the mean birth weight of male babies is 3.3 ± 0.5 kg.
Suppose that a random sample of 100 male babies born to a particular ethnic subgroup was found to have a mean birth weight of 3.2 ± 0.4 kg.
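The slides do not say which test is applied to this example. One possible approach, assuming the population standard deviation of 0.5 kg is treated as known, is a one-sample z-test; the sketch below is offered under that assumption only.

```python
from math import sqrt
from statistics import NormalDist

# Figures from the practical example.
population_mean = 3.3  # kg
population_sd = 0.5    # kg, assumed known for a z-test
sample_mean = 3.2      # kg
sample_size = 100

# Standard error of the sample mean and the resulting z statistic.
standard_error = population_sd / sqrt(sample_size)
z_statistic = (sample_mean - population_mean) / standard_error

# Two-sided P value from the standard normal curve.
p_two_sided = 2 * NormalDist().cdf(-abs(z_statistic))

print(f"z = {z_statistic:.2f}, P = {p_two_sided:.4f}")
```

Under these assumptions z is about -2 and the two-sided P value is about 0.046.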
Null hypothesis and alternative hypothesis
Null hypothesis:
There is no difference between the means of the two compared groups (the observed difference would be expected to occur by chance).
Alternative hypothesis:
There is a difference between the means of the two groups.
Statistical testing: If the difference between the means of the samples is among those that would occur by chance, this means that the results are not statistically significant.
When the null hypothesis is rejected and alternative hypothesis is accepted, the investigator describes the results as statistically significant.
A hypothesis is never proven to be true or false; it is only accepted or rejected on the basis of statistical tests.
Commonly used statistical tests
1- Chi-square test (X²):
To examine the relationship (association or difference) between qualitative variables.
Ex.
              Lung cancer   Control
Smokers            A           B
Non-smokers        C           D
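For a 2x2 table with cells A, B, C and D, the chi-square statistic has a closed form. The sketch below uses hypothetical counts (the slides give only letters) and applies no continuity correction; the P value uses the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable.

```python
from math import erfc, sqrt

# Hypothetical counts, not taken from the slides.
a, b = 60, 40  # smokers: lung cancer, control
c, d = 30, 70  # non-smokers: lung cancer, control

total = a + b + c + d

# Closed-form chi-square statistic for a 2x2 table (no continuity correction).
chi_square = total * (a * d - b * c) ** 2 / (
    (a + b) * (c + d) * (a + c) * (b + d)
)

# P value for 1 degree of freedom.
p_value = erfc(sqrt(chi_square / 2))

print(f"chi-square = {chi_square:.2f}, P = {p_value:.6f}")
```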
2- Paired t test: used to compare the means of one variable in the same group (pre and post an event).
Non-parametric alternative: Wilcoxon’s matched pairs test
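The paired t statistic is the mean of the within-subject differences divided by its standard error. The measurements below are hypothetical. The standard library has no t distribution, so the sketch stops at the statistic and its degrees of freedom, to be compared with a t table.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical measurements on the same five subjects before and after an event.
pre = [140, 150, 138, 160, 155]
post = [132, 145, 135, 150, 148]

differences = [before - after for before, after in zip(pre, post)]
n = len(differences)

mean_difference = mean(differences)
standard_error = stdev(differences) / sqrt(n)
t_statistic = mean_difference / standard_error
degrees_of_freedom = n - 1

# For 4 degrees of freedom the two-sided 5% critical value is 2.776.
print(f"t = {t_statistic:.3f} with {degrees_of_freedom} degrees of freedom")
```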
3- Student t test: used to evaluate the difference in means between two groups.
Non-parametric alternative: Mann-Whitney test (U-test)
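For two independent groups, one common form of the Student t statistic pools the two sample variances. The sketch below assumes equal variances in the two groups and uses hypothetical values.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical measurements in two independent groups.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.2, 4.5, 4.8, 4.1, 4.9]

n_a, n_b = len(group_a), len(group_b)
degrees_of_freedom = n_a + n_b - 2

# Pooled variance, assuming both groups share the same population variance.
pooled_variance = (
    (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
) / degrees_of_freedom

standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
t_statistic = (mean(group_a) - mean(group_b)) / standard_error

# For 8 degrees of freedom the two-sided 5% critical value is 2.306.
print(f"t = {t_statistic:.3f} with {degrees_of_freedom} degrees of freedom")
```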
4- ANOVA (F test): used to test for the difference of means of the same variable between more than two groups.
Non-parametric alternative: Kruskal-Wallis test
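The F statistic compares the variation between group means with the variation within groups. This is a minimal one-way ANOVA sketch on three hypothetical groups; obtaining a P value would additionally need an F table or a statistics package.

```python
from statistics import mean

# Three hypothetical groups measured on the same variable.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

all_values = [value for group in groups for value in group]
grand_mean = mean(all_values)
group_count = len(groups)
total_count = len(all_values)

# Sum of squares between groups and within groups.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)

mean_square_between = ss_between / (group_count - 1)
mean_square_within = ss_within / (total_count - group_count)
f_statistic = mean_square_between / mean_square_within

print(f"F = {f_statistic:.2f}")
```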
5- LSD Post Hoc Test: used, following a significant F test, to test for the difference of the means of the same variable between each two groups individually.
6- Pearson’s coefficient (r):
It is a measure of the degree of linear relationship between two variables.
It is described by two main things:
• Its strength: from -1.0 (perfect) to +1.0 (perfect).
• Its direction: +ve, -ve or neutral.
Non-parametric alternative: Spearman rank correlation coefficient
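Pearson's r can be computed directly from its definition. The function name and the paired values below are hypothetical and serve only to illustrate strength and direction.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_perfect_positive = pearson_r(x, [2 * value for value in x])
r_perfect_negative = pearson_r(x, [-value for value in x])

print(f"r = {r:.3f}")
```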
7- Regression line (Prediction):
It is important as it shows the average expected values of the two variables, based on the observed values.
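A regression line is usually fitted by least squares. The slides do not name a fitting method, so that choice is an assumption here, as are the function name and the hypothetical data.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = fit_line(x, y)

# Average expected y for a new x value.
predicted_y = intercept + slope * 6

print(f"y = {intercept:.1f} + {slope:.1f} * x; predicted y at x=6: {predicted_y:.1f}")
```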
Non-Parametric Statistics
Used when:
• Sample size is very small (as small as 10).
• Data are not normally distributed, as judged via a histogram or by performing a normality test (Kolmogorov-Smirnov test).
• Scale of measurement (scores, titer).
[Figure: “Poor Man” illustration with the amounts $500,000, $1,000 and $1,000,000]
Clinical significance vs. Statistical significance
Statistical significance examines the likelihood that the difference found between groups could have occurred by chance alone.
In most clinical trials, a result is statistically significant if the difference between groups could have occurred by chance alone in less than 5% of cases.
This is expressed as a P value < 0.05.
Clinical significance has little to do with statistics and is a matter of clinical judgment.
It answers the question “Is the difference between groups large enough to be worth achieving?”
Studies can be statistically significant yet clinically insignificant and vice versa.
Alpha error (Type I error):
P value (traditionally, levels of 0.05 are used for statistical significance).
It indicates the probability of rejecting the statistical hypothesis tested when, in fact, that hypothesis is true.
ART (Alpha: Rejecting a True hypothesis)
Beta error (Type II error):
The probability that the test will accept the hypothesis tested when, in fact, it is false. It determines the power of the test: power = 1 - β error.
Power of the test: probability of rejecting the null hypothesis when it is false.
BAF (Beta: Accepting a False hypothesis)
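The relation between beta error and power can be illustrated for a two-sided z-test. The sketch borrows the 0.1 kg difference, the 0.5 kg standard deviation and the sample of 100 from the birth-weight example purely as illustrative inputs, and it assumes the 0.1 kg difference is the true one.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05           # accepted Type I error
true_difference = 0.1  # kg, assumed true difference
sd = 0.5               # kg, assumed known
sample_size = 100

std_normal = NormalDist()
z_critical = std_normal.inv_cdf(1 - alpha / 2)  # about 1.96

# True difference expressed in standard-error units.
shift = true_difference / (sd / sqrt(sample_size))

# Probability that the test statistic lands in either rejection region.
power = std_normal.cdf(shift - z_critical) + std_normal.cdf(-shift - z_critical)
beta_error = 1 - power

print(f"power = {power:.3f}, beta error = {beta_error:.3f}")
```

With these inputs the power is only about 0.52, so a true difference of this size would be missed in nearly half of such samples.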
EXAMPLE: Randomized control trial of drugs
Type I error:
It may be concluded on the basis of the results that a new treatment is better when in fact it is not better than the standard treatment.
Type II error:
On the other hand, a new treatment that is actually effective may be concluded to be ineffective.
Type I error (P-value): level of significance of the test
P < 0.05 ??
P > 0.05, P < 0.05, P < 0.01, P < 0.001, P < 0.0001
Type II error: power of the test
Remember:
Be sure of the distribution of your data before doing any statistical analysis to choose the right statistical test.
Special Thanks To My Professor,
Dr. Mohamed Hassan, Professor of Public Health and Community Medicine
Faculty of Medicine - Cairo University
THANK YOU
Tamer Hifnawy MD, Dr PH.
Associate Professor of Public Health & Community Medicine
Email: [email protected]
[email protected]: +20124130107