
Statistics in Clinical Research

By

Dr. Tamer M.S. Hifnawy
Associate Professor of Public Health

Faculty of Medicine, BSU.

• Types of Variables
• Commonly used Statistical tests
• Normal distribution curve
• Measures of Central Tendency and Measures of Dispersion
• Null and alternative hypothesis
• Type I and type II errors
• P-value & statistical significance

Statistics are numerical statements.

Biomedical Statistics are numerical statements about medical matters.

• Number with a certain disease.
• Number or proportion cured.
• Proportion of deaths.
• Number of hospital beds.

What is Statistics?

Biomedical Statistics

Discipline concerned with treatment of numerical data derived from a group of individuals.

Used for:

• Administrative purposes.

• Research.

Why Statistics?

• I have smoked cigarettes for 30 years and suffered no disease. Therefore smoking cigarettes is safe!

• I smoked cigarettes for 2 years and developed bronchitis, which became chronic. Now I have been found to have a lung malignancy.

• Therefore smoking cigarettes is damaging to your health!

Why Statistics?

Who are we going to believe?

Get Statistics, but how?

• Dr. “X” observed 1000 smokers.
• Health problems were found among 800 of them.
• Only 200 had no health problems.
• Therefore smoking cigarettes is damaging to your health!

Why Statistics?

Who are we going to believe?

• Dr. “Y” observed 2000 smokers.
• Health problems were also found among 800 of them.
• More smokers (1200) are healthy.
• Therefore smoking cigarettes does not affect your health!

Wait!
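Before believing either doctor, it helps to turn the raw counts into proportions. The sketch below is only illustrative (the dictionary layout and function name are assumptions, not part of the lecture). Note also that neither doctor observed any non-smokers, so neither proportion on its own shows what smoking adds to the risk.

```python
# Counts taken from the two slides above.
observations = {
    "Dr. X": {"smokers": 1000, "with_problems": 800},
    "Dr. Y": {"smokers": 2000, "with_problems": 800},
}


def proportion_with_problems(record):
    """Return the fraction of observed smokers who had health problems."""
    return record["with_problems"] / record["smokers"]


proportions = {
    name: proportion_with_problems(record)
    for name, record in observations.items()
}

for name, p in proportions.items():
    print(f"{name}: {p:.0%} of observed smokers had health problems")
```

The same 800 affected smokers give very different proportions depending on how many smokers were observed, which is why counts alone cannot settle the question.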

Variables definition

Characteristics that vary from one subject to the other

Need to be measured for every subject

Relate to person, environment, cause etc.

Statistics

• Data Collection
• Data Presentation
• Data Analysis
• Interpretation of results

At the beginning, there must be a clear understanding of the word “data” or “variables”.

Data : “Numerical information suitable for processing or organized for analysis”

It is a well-defined situation and objectively described.

Statistical methods will be mainly concerned with data collection, presentation and analysis.

The interpretation by the specialist in collaboration with the statistician would result in the information.

Research Steps: (backbone)
The three main steps are:

STEP I: Proper data collection:

• Problem definition and objective setting.
• Study design
• Sample type
• Sample size
• Sources of data
• Tools for data collection

STEP II: Proper data presentation:

• Tables: When details of data are needed.
• Graphs: When only impressions are needed.
• Parameters: Precise mathematical summary, useful for comparison.

STEP III: Proper use of statistical data analysis:

• Comparisons: Tests of significance are the main statistical tools used.
• Associations: Correlation and regression are the tools used.

VARIABLE TYPES

Quantitative: When data expressing the different variable levels are measured as numbers.
• Continuous (e.g. Age)
• Discrete (e.g. Pulse)

Qualitative: Variable is assessed as description.
• Nominal (e.g. Sex)
• Ordinal (e.g. Disease severity)

Normal Distribution Curve

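[Figure: normal distribution curve. About 34% of values lie on each side of the mean within one standard deviation, and a further 13.5% on each side between one and two standard deviations, giving about 95% of values within two standard deviations of the mean.]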

It has a peak and two symmetrical sides.

The peak coincides with all measures of central tendency.

The curve extends to infinity on both sides.


Normal Distribution Curve

Confidence limits (95% CI): the two values that enclose the 95% of values closest to the mean. Any value between these two limits can be regarded as represented by the mean. Any value above the upper limit or below the lower limit is considered significantly different from the mean value.
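The percentages on the curve and the 95% limits can be checked numerically. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the mean of 100 and standard deviation of 15 are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Share of values within one and within two standard deviations of the mean.
within_1_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_2_sd = standard_normal.cdf(2) - standard_normal.cdf(-2)

# Multiplier that leaves 2.5% in each tail, i.e. encloses the central 95%.
z_95 = standard_normal.inv_cdf(0.975)


def limits_95(mean, sd):
    """Return the lower and upper values enclosing the central 95%."""
    return mean - z_95 * sd, mean + z_95 * sd


# Hypothetical variable with mean 100 and standard deviation 15.
lower, upper = limits_95(mean=100.0, sd=15.0)

print(f"within 1 SD: {within_1_sd:.1%}, within 2 SD: {within_2_sd:.1%}")
print(f"95% multiplier: {z_95:.2f}, limits: {lower:.1f} to {upper:.1f}")
```

The exact multiplier for 95% is slightly less than two standard deviations, which is why the curve's rounded percentages (34% + 34% + 13.5% + 13.5%) add up to 95%.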

Summarizing Data

Measures of Central Tendency

Also called measures of the middle or measures of “center”

Three most common:
• Mean: Arithmetic average of observations
• Median: The middle observation
• Mode: The most frequent observation

Sample Mean

Simple arithmetic average or sum of the scores divided by the number of items

Example: Xi=10, 12, 11, 15, 19

Sample mean=(10+12+11+15+19)/5=13.4
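The same calculation can be reproduced with Python's standard-library `statistics` module. This is only a sketch; the variable names are illustrative.

```python
import statistics

scores = [10, 12, 11, 15, 19]

# Sum of the scores divided by the number of items.
manual_mean = sum(scores) / len(scores)

# Library equivalents of the measures of central tendency.
sample_mean = statistics.mean(scores)
sample_median = statistics.median(scores)  # middle value once sorted

print(manual_mean, sample_mean, sample_median)
```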

Median

Divides the bottom 50% of the data from the top 50%

Appropriate for use with skewed distributions

Approximately equal to mean when distribution is symmetric

Distribution Shapes

Symmetric (normal): Mean = Median

Right (positive) Skewed: Mean > Median

Left (negative) Skewed: Mean < Median
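A quick way to see how skewness separates the mean from the median is to compute both on a small skewed sample. The two samples below are hypothetical and chosen only for illustration; the long tail pulls the mean toward it, while the median stays near the bulk of the data.

```python
import statistics

# Hypothetical right-skewed sample: most values are small, one is very large.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]

# Mirror image of the same sample, which gives a left-skewed shape.
left_skewed = [21 - x for x in right_skewed]

for label, sample in (("right-skewed", right_skewed),
                      ("left-skewed", left_skewed)):
    mean = statistics.mean(sample)
    median = statistics.median(sample)
    print(f"{label}: mean = {mean}, median = {median}")
```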

[Figure: example distribution, horizontal axis from 50 to 190. Mean: 106.7; Median: 105.0]

Descriptive Statistics

     Central Tendency (Location)    Dispersion
a    Mid range                      Range
b    Mode                           Minimum, Maximum
c    Median                         25th percentile, 75th percentile
d    Arithmetic Mean                Standard deviation

Summarizing Data

Measures of Variation

Also called “measures of spread”

Most commonly used:
• Sample range
• Sample variance
• Sample standard deviation
• Sample interquartile range
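The four measures listed above can be sketched with the standard-library `statistics` module. The data reuse the five scores from the sample mean example; note that quartiles can be defined in several ways, so the interquartile range from another package may differ slightly.

```python
import statistics

data = [10, 12, 11, 15, 19]

sample_range = max(data) - min(data)

# Sample variance divides the sum of squared deviations by n - 1.
sample_variance = statistics.variance(data)

# Sample standard deviation is the square root of the sample variance.
sample_sd = statistics.stdev(data)

# Quartile cut points (Python's default "exclusive" method).
q1, q2, q3 = statistics.quantiles(data, n=4)
sample_iqr = q3 - q1

print(f"range = {sample_range}")
print(f"variance = {sample_variance:.2f}, SD = {sample_sd:.2f}")
print(f"IQR = {sample_iqr}")
```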

[Figure: a person exposed to 62 °C on one side and 12 °C on the other]

Average = 37 °C

Statistically speaking, “He is comfortable!”

Arithmetic Mean: CENTER

Standard deviation: DISPERSION

Hypothesis testing

It is a procedure for deciding whether a hypothesis about a quantitative feature of a population is true or false.

Practical Example

It is known that in a particular country, the mean birth weight of male babies is 3.3 ± 0.5 kg.

Suppose that a random sample of 100 male babies born to a particular ethnic subgroup was found to have a mean birth weight of 3.2 ±0.4 kg.
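One possible way to analyse this example is a one-sample z test of the subgroup mean against the national mean. The sketch below assumes that “± 0.5” is the population standard deviation and treats it as known; the lecture itself does not say which test was intended.

```python
import math
from statistics import NormalDist

population_mean = 3.3  # kg, national mean from the example
population_sd = 0.5    # kg, ASSUMED to be the population standard deviation
sample_mean = 3.2      # kg, mean of the ethnic subgroup sample
n = 100                # sample size

# Standard error of the mean for a sample of size n.
standard_error = population_sd / math.sqrt(n)

# How many standard errors the sample mean lies from the population mean.
z = (sample_mean - population_mean) / standard_error

# Two-sided P value: chance of a difference at least this large in either
# direction if the null hypothesis (no difference) is true.
p_two_sided = 2 * NormalDist().cdf(-abs(z))

print(f"z = {z:.2f}, two-sided P = {p_two_sided:.4f}")
```

Under these assumptions the sample mean lies about two standard errors below the national mean, giving a two-sided P value just under 0.05.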

Null hypothesis and alternative hypothesis

Null hypothesis: There is no difference between the means of the two compared groups (the observed difference would be expected to occur by chance).

Alternative hypothesis: There is a difference between the means of the two groups.

Statistical testing: If the difference between the means of the samples is among those that would occur by chance, this means that the results are not statistically significant.

When the null hypothesis is rejected and alternative hypothesis is accepted, the investigator describes the results as statistically significant.

A hypothesis is never proven to be true or false; it is only accepted or rejected on the basis of statistical tests.

Commonly used statistical tests

1- Chi-square test (χ²):

To examine the relationship (association or difference) between qualitative variables.

Ex.
                Lung cancer    Control
Smokers              A            B
Non-smokers          C            D
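For a 2×2 table with cells A, B, C and D, the chi-square statistic can be computed directly from the four counts. This is a minimal sketch without continuity correction; the counts passed in are hypothetical, and the P value uses the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Chi-square statistic (no continuity correction) and P value, 1 df."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    # Upper-tail probability of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows are smokers / non-smokers,
# columns are lung cancer / control.
chi2, p_value = chi_square_2x2(a=60, b=40, c=30, d=70)

print(f"chi-square = {chi2:.2f}, P = {p_value:.5f}")
```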


2- Paired t test: used to compare the means of one variable in the same group (pre and post an event).

Non-parametric alternative: Wilcoxon’s matched pairs test
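The paired t statistic is the mean of the within-subject differences divided by its standard error. The sketch below uses hypothetical before/after blood pressure readings and stops at the t statistic and degrees of freedom, which would then be compared with a t table or passed to a statistics package.

```python
import math
import statistics

# Hypothetical systolic blood pressure (mmHg) in the same six patients,
# measured before and after an intervention.
before = [150, 142, 160, 155, 148, 152]
after = [144, 140, 151, 150, 147, 146]

# The paired test works on the difference within each subject.
differences = [b - a for b, a in zip(before, after)]

n = len(differences)
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)

# t = mean difference / standard error of the mean difference.
t_statistic = mean_diff / (sd_diff / math.sqrt(n))
degrees_of_freedom = n - 1

print(f"t = {t_statistic:.2f} with {degrees_of_freedom} degrees of freedom")
```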

3- Student t test: used to evaluate the difference in means between two groups.

Non-parametric alternative: Mann-Whitney test (U-test)
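For two independent groups, the classical Student t statistic pools the variances of the two samples. The function below is a sketch that assumes equal variances in the two groups; the function name and the example data are hypothetical.

```python
import math
import statistics


def pooled_t(group_a, group_b):
    """Student t statistic and degrees of freedom, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances.
    pooled_variance = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df


# Hypothetical haemoglobin values (g/dL) in two independent groups.
treated = [12.1, 13.4, 12.8, 13.0, 12.5]
control = [11.2, 12.0, 11.8, 11.5, 12.3]

t_statistic, df = pooled_t(treated, control)
print(f"t = {t_statistic:.2f} with {df} degrees of freedom")
```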

4- ANOVA (F test): used to test for the difference of means of the same variable between more than two groups.

Non-parametric alternative: Kruskal-Wallis test
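The F statistic in one-way ANOVA is the ratio of the variation between group means to the variation within groups. The following is a minimal sketch; the function name and the three small groups are hypothetical.

```python
import statistics


def one_way_anova_f(groups):
    """Return F, between-groups df and within-groups df for one-way ANOVA."""
    all_values = [x for group in groups for x in group]
    grand_mean = statistics.mean(all_values)
    k = len(groups)
    n_total = len(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2
        for group in groups
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (x - statistics.mean(group)) ** 2
        for group in groups
        for x in group
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, k - 1, n_total - k


# Three hypothetical groups measured on the same variable.
f_statistic, df_between, df_within = one_way_anova_f(
    [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
)
print(f"F = {f_statistic:.2f} on {df_between} and {df_within} df")
```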

5- LSD Post Hoc Test: used, following a significant F test, to test for the difference of the means of the same variable between each two groups individually.

6- Pearson’s coefficient (r):

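It is a measure of the degree of linear relationship between two variables. It is described by two main things:

• Its strength: ranges from -1.0 (perfect negative relationship) to +1.0 (perfect positive relationship).
• Its direction: positive (+ve), negative (-ve) or neutral.

Non-parametric alternative: Spearman rank correlation coefficient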
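Pearson's r can be computed from the deviations of each variable from its mean. This sketch uses only the standard library; the function name and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements, e.g. age (years) and systolic BP (mmHg).
age = [30, 40, 50, 60, 70]
systolic_bp = [118, 125, 131, 140, 148]

r = pearson_r(age, systolic_bp)
print(f"r = {r:.3f}")
```

The sign of r gives the direction of the relationship, and its distance from zero gives the strength.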

7- Regression line (Prediction):

It is important as it shows the average expected values of one variable, based on the observed values of the other.
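A least-squares regression line can be fitted by hand from the same sums used for correlation. The sketch below is illustrative only; the function name and data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical paired measurements: age (years) and systolic BP (mmHg).
age = [30, 40, 50, 60, 70]
systolic_bp = [118, 125, 131, 140, 148]

slope, intercept = fit_line(age, systolic_bp)

# Average expected blood pressure for a 55-year-old, read off the line.
predicted_bp = intercept + slope * 55
print(f"BP = {intercept:.1f} + {slope:.2f} * age; at age 55: {predicted_bp:.1f}")
```

Predictions are only meaningful within the range of the observed x values.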

Non-Parametric Statistics

Used in the following situations:

• If the sample size is very small (“as small as 10”).
• Abnormally distributed data, checked:
  - Via histogram
  - By performing a normality test (“Kolmogorov-Smirnov test”)
• Scale of measurement (scores, titer).

[Figure: “Poor Man” illustration with the amounts $1000, $500,000 and $1,000,000]

Clinical significance vs. Statistical significance

Statistical significance examines the likelihood that the difference found between groups could have occurred by chance alone.

In most clinical trials, a result is statistically significant if the difference between groups could have occurred by chance alone in less than 5% of cases.

This is expressed as a P value <0.05.

Clinical significance has little to do with statistics and is a matter of clinical judgment.

It answers the question “Is the difference between groups large enough to be worth achieving?”

Studies can be statistically significant yet clinically insignificant and vice versa.

Alpha error (Type I error):

P value (traditionally, a level of 0.05 is used for statistical significance). It indicates the probability of rejecting the statistical hypothesis tested when, in fact, that hypothesis is true.

ART (Alpha: Reject True)

Beta error (Type II error):

The probability that the test will accept the hypothesis tested when, in fact, it is false. It determines the power of the test: Power = 1 - β error.

Power of the test: probability of rejecting the null hypothesis when it is false.

BAF (Beta: Accept False)
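The link between the beta error and power can be illustrated for a two-sided z test. The sketch below assumes a known standard deviation and reuses the birth-weight figures (a true difference of 0.1 kg, standard deviation 0.5 kg, 100 babies) purely as an illustration; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def z_test_power(true_difference, sd, n, alpha=0.05):
    """Power of a two-sided one-sample z test with known standard deviation."""
    std_normal = NormalDist()
    # Critical value that leaves alpha / 2 in each tail under the null.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # True difference expressed in standard errors.
    shift = abs(true_difference) / (sd / math.sqrt(n))
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


power = z_test_power(true_difference=0.1, sd=0.5, n=100)
beta_error = 1 - power

print(f"power = {power:.2f}, beta error = {beta_error:.2f}")
```

When the true difference is zero, the same function returns alpha, the Type I error rate; increasing the sample size raises the power and lowers the beta error.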

EXAMPLE

Randomized controlled trial of drugs

Type I error: It may be concluded on the basis of the results that a new treatment is better when in fact it is not better than the standard treatment.

Type II error: On the other hand, a new treatment that is actually effective may be concluded to be ineffective.

Type I error (P-value): Level of significance of the test

P < 0.05 ??

P > 0.05, P < 0.05, P < 0.01, P < 0.001, P < 0.0001

Type II error: Power of the test

Remember:

Be sure of the distribution of your data before doing any statistical analysis to choose the right statistical test.


Special Thanks To My Professor,

Dr. Mohamed Hassan
Professor of Public Health and Community Medicine

Faculty of Medicine - Cairo University

THANK YOU

Tamer Hifnawy MD. Dr PH.
Associate Professor of Public Health & Community Medicine
Email: [email protected]

[email protected]: +20124130107
