Statistics
Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU
Faculty of Medicine, Ain Shams University
When guessing is an informed decision
Objectives
Define statistics and understand its terminology
Discuss the importance of and need for statistics in the medical field
Distinguish types of data and variables
Describe types of statistics & statistical tests
What is statistics?
Statistics is the science of collecting, describing, and analyzing data in order to reach a good decision.
Why do we need statistics?
Variability (atropine)
Causes:
  Uncontrollable (too many) factors
  Immeasurable factors
  Unknown factors
Why do we need statistics?
Variability (atropine)
Effect of variability:
  Large amount of data describing the same thing (many values for one variable)
  No certainty: “deterministic vs probabilistic”
  Sampling
Functions of statistics (new hypothetical β-blocker drug)
  Describe (descriptive statistics)
  Infer (inferential statistics)
Variables & Data
A variable is something whose value can vary. For example, age, gender and blood type are variables.
Data are the values you get when you measure a variable
Variable      Mrs Brown   Mr Patel   Ms Manda
Age           32          24         20
Gender        Female      Male       Female
Blood type    O           O          A

The first column lists the variables; the other columns hold the data.
Types of variables
1. Qualitative variables (data)
a) Categorical variable: the values (data) of a categorical variable are categories
  Gender (dichotomous, binary): Male, Female
  Type of ICU admission: Medical, Surgical, Physical injuries, Poisoning, Others
Types of variables
1. Qualitative variables (data)
b) Ordinal variable: a categorical variable whose values are ordered
  Degree of illness: Mild, Moderate, Severe
  Physical status: ASA I, ASA II, ASA III, ASA IV, ASA V
  Glasgow Coma Scale
Types of variables
2. Quantitative (numerical) variables
a) Discrete variable: episodes of myocardial ischemia
b) Continuous (interval, ratio) variable: cardiac index, creatinine clearance
Sample and Population
A sample is a group (subset) taken from a population.
A population is the group of ALL individuals (entities) sharing specific characteristics.
Sample and Population
  All human beings
  Day case surgical patients undergoing general anesthesia
  Low-risk CABG surgery patients
  ICU patients with septic shock
  Women undergoing CS under spinal anesthesia
  Human skeletal muscle fibers
  Cardiac muscles of rats
Descriptive statistics
Descriptive statistics is a series of procedures designed to illuminate the data, so that its principal characteristics and main features are revealed.
This may mean sorting the data by size, putting it into a table, presenting it in an appropriate chart, summarizing it numerically, and so on.
Descriptive statistics
Qualitative variables
  Frequency
  Relative frequency
  Cumulative frequency
  Cross tabulation (what about numerical variables? grouping)
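The four summaries listed above can be sketched in a few lines of Python. This is only an illustration: the severity and gender observations are hypothetical, and the names `frequency_table`, `severity`, and `genders` are made up for the example. Cumulative frequency is shown for an ordinal variable, where the order of the categories is meaningful.

```python
from collections import Counter

# Hypothetical data: degree of illness and gender for 10 patients.
severity = ["Mild", "Moderate", "Mild", "Severe", "Moderate",
            "Mild", "Mild", "Severe", "Moderate", "Mild"]
genders = ["Female", "Male", "Male", "Female", "Male",
           "Female", "Male", "Female", "Female", "Male"]
order = ["Mild", "Moderate", "Severe"]  # the ordinal ordering of the categories


def frequency_table(values, categories):
    """Rows of (category, frequency, relative frequency, cumulative relative frequency)."""
    counts = Counter(values)
    total = len(values)
    rows = []
    running = 0
    for category in categories:
        running += counts[category]
        rows.append((category, counts[category],
                     counts[category] / total, running / total))
    return rows


table = frequency_table(severity, order)
for category, freq, rel, cum in table:
    print(f"{category:10s} {freq:3d} {rel:7.1%} {cum:7.1%}")

# Cross tabulation: one count per combination of the two variables.
crosstab = Counter(zip(severity, genders))
for category in order:
    print(category, crosstab[(category, "Female")], crosstab[(category, "Male")])
```

For a numerical variable, the same table only becomes useful after grouping the values into classes (for example age bands).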
Descriptive statistics
Qualitative variables
In terms of describing data, an appropriate chart is almost always a good idea.
What ‘appropriate’ means depends primarily on the type of data, as well as on what particular features of it you want to explore.
Finally, a chart can often be used to illustrate or explain a complex situation for which a form of words or a table might be clumsy, lengthy or otherwise inadequate.
[Pie chart: Postoperative Complications. Categories: Nausea, Vomiting, Pain, Cough. Slice labels: 23.5%, 9.8%, 39.2%, 27.5%.]
Advantages:
1. Summarizes the data (area represents relative frequency)
2. Shows magnitude (relative frequency)
Disadvantages:
1. One variable only
2. Loses clarity if there are more than 4-5 categories
3. No cross tabulation (“separate pies”)
[Bar chart: Postoperative Complications. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (0 to 25). Callouts: Alternative, Width, Spacing.]
[Bar chart: Postoperative Complications. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%.]
[Bar chart: Postoperative Complications. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%; series: Group I, Group II.]
Descriptive statistics
Qualitative variables
Charts
  Pie chart
  Bar chart
    Simple
    Clustered bar chart
    Stacked bar chart
[Bar chart: Postoperative Complications. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 45%; series: Group I, Group II.]
[Bar chart: Postoperative Complications. x-axis: Group I, Group II; y-axis: Number of patients (%), 0% to 100%; legend: Cough, Pain, Vomiting, Nausea.]
Descriptive statistics
Qualitative variables
Charts
  Pie chart
  Bar chart
    Simple
    Clustered bar chart
    Stacked bar chart
  Histogram
  Frequency polygon
  Cumulative frequency polygon (ogive)
[Chart: Postoperative Complications. x-axis: Nausea, Vomiting, Pain, Cough; y-axis: Number of patients (%), 0% to 40%; series: Group I, Group II.]
Descriptive statistics
Numerical variables
Measures of central tendency (summary measures of location)
  Mean (average)
  1, 1, 3, 5, 10
  Advantages, disadvantages, ratio
Descriptive statistics
Numerical variables
Measures of central tendency (summary measures of location)
  Median
  1, 1, 3, 5, 10
  1, 3, 5, 9 (odd vs even number of values)
Descriptive statistics
Numerical variables
Measures of central tendency (summary measures of location)
  Mode
  1, 1, 3, 5, 10
Descriptive statistics
Numerical variables
Measures of central tendency (summary measures of location)
  Midrange
  1, 1, 3, 5, 10
Descriptive statistics
Numerical variables
Measures of central tendency (summary measures of location)
  1, 1, 3, 5, 10
  Mean = 4
  Median = 3
  Mode = 1
  Midrange = 5.5
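As a quick check, the four measures can be computed for the slide's data set with Python's standard `statistics` module. The midrange has no library function, so it is written out here as (smallest + largest) / 2, which gives 5.5 for these values. The second, even-sized data set is the one from the median slide.

```python
import statistics

data = [1, 1, 3, 5, 10]

mean = statistics.mean(data)            # sum of the values / number of values
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
midrange = (min(data) + max(data)) / 2  # halfway between smallest and largest

# With an even number of values, the median is the mean of the two middle values.
median_even = statistics.median([1, 3, 5, 9])

print(mean, median, mode, midrange, median_even)
```

The single large value (10) pulls the mean and the midrange upward while leaving the median and mode unchanged, which is why the median is often preferred for skewed data.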
Descriptive statistics
Numerical variables
Measures of central tendency (summary measures of location)
Measures of degree of dispersion (summary measures of spread)
  Range
  1, 1, 3, 5, 10
Descriptive statistics
Numerical variables
Measures of central tendency (summary measures of location)
Measures of degree of dispersion (summary measures of spread)
  Percentiles, quartiles
  1, 1, 3, 5, 10
  1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22
  1st quartile, 3rd quartile
  70th percentile
  90th percentile
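One way to locate these percentiles is linear interpolation between the sorted values. The sketch below assumes that convention; other textbooks and software packages use slightly different rules and can give slightly different answers. The function name `percentile` is made up for the example.

```python
def percentile(values, p):
    """p-th percentile (0 <= p <= 100) by linear interpolation between sorted values."""
    ordered = sorted(values)
    position = p * (len(ordered) - 1) / 100   # fractional index into the sorted data
    lower = int(position)                      # index of the value just below
    upper = min(lower + 1, len(ordered) - 1)   # index of the value just above
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


data = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]

q1 = percentile(data, 25)    # 1st quartile
q3 = percentile(data, 75)    # 3rd quartile
p70 = percentile(data, 70)
p90 = percentile(data, 90)

print(q1, q3, p70, p90)
```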
Descriptive statistics
Numerical variables
Measures of central tendency (summary measures of location)
Measures of degree of dispersion (summary measures of spread)
  Inter-quartile range (Q3 - Q1)
  1, 1, 3, 5, 10
  1st quartile
  3rd quartile
Descriptive statistics
Numerical variables
Measures of central tendency (summary measures of location)
Measures of degree of dispersion (summary measures of spread)
  Variance, standard deviation
  1, 1, 3, 5, 10
Descriptive statistics
Numerical variables
Measures of central tendency (summary measures of location)
Measures of degree of dispersion (summary measures of spread)
  1, 1, 3, 5, 10
  Range = 9
  Inter-quartile range = 4
  Variance = 11.2 (dividing by n)
  Standard deviation = 3.35 (dividing by n)
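These figures can be reproduced with Python's `statistics` module. Two conventions matter: 11.2 and 3.35 are the population variance and standard deviation (divisor n), whereas the sample versions (divisor n - 1) are larger; and the quartiles here use the "inclusive" interpolation method, which is an assumption about the convention the slide used.

```python
import statistics

data = [1, 1, 3, 5, 10]

value_range = max(data) - min(data)

# Quartiles by the "inclusive" method; other methods can give a different IQR.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

pop_variance = statistics.pvariance(data)    # divides by n
pop_sd = statistics.pstdev(data)             # square root of the population variance
sample_variance = statistics.variance(data)  # divides by n - 1
sample_sd = statistics.stdev(data)

print(value_range, iqr, pop_variance, round(pop_sd, 2))
print(sample_variance, round(sample_sd, 2))
```

When the data are a sample used to estimate a population value, the n - 1 versions are the usual choice.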
Inferential statistics (Informed guess)
Making inferences about population parameters from sample statistics
  Standard error (SD of the statistic)
  Confidence interval (95% CI)
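A minimal sketch of both quantities for a sample mean, using hypothetical heart-rate data. It assumes the normal approximation (multiplier of about 1.96); for a sample this small a t-distribution multiplier would normally be used and would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample: resting heart rates (beats/min) of 10 patients.
heart_rates = [72, 68, 75, 80, 66, 71, 77, 69, 74, 78]

n = len(heart_rates)
sample_mean = statistics.mean(heart_rates)
sample_sd = statistics.stdev(heart_rates)    # sample SD, divides by n - 1
standard_error = sample_sd / math.sqrt(n)    # SD of the sample mean

# Normal-approximation multiplier for a 95% interval (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)
ci_lower = sample_mean - z * standard_error
ci_upper = sample_mean + z * standard_error

print(f"mean = {sample_mean}, SE = {standard_error:.2f}")
print(f"95% CI: {ci_lower:.1f} to {ci_upper:.1f}")
```

The standard error shrinks as the sample size grows, so a larger sample gives a narrower confidence interval around the same mean.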
Inferential statistics (Informed guess)
Hypothesis testing
Almost all clinical research begins with a question. For example, is stress a risk factor for breast cancer?
To answer questions like this you have to transform the research question into a testable hypothesis called the null hypothesis, conventionally labeled H0.
This usually takes the following form:
  H0: Stress is NOT a risk factor for breast cancer
  H0: The drug has NO effect on mean heart rate
Inferential statistics (Informed guess)
Hypothesis testing
Null hypotheses reflect the conservative position of no difference, no risk, no effect, etc.
To test a null hypothesis, researchers take samples, measure outcomes, and decide whether the data from the sample provide strong enough evidence to reject the null hypothesis.
If the evidence against the null hypothesis is strong enough for us to reject it, then we are implicitly accepting that some specified alternative hypothesis, usually labeled H1, is probably true.
Inferential statistics (Informed guess)
Hypothesis testing
Example: Is the new hypothetical β-blocker more effective than a conventional β-blocker (e.g. Inderal) in decreasing heart rate?
Inferential statistics (Informed guess)
Hypothesis testing
Example:
Let the mean heart rate of all people taking Inderal be μ1.
Let the mean heart rate of all people taking the new drug be μ2.
Inferential statistics (Informed guess)
Type I & Type II Errors

The decision about H0    H0 is true       H0 is false
Accept                   Good decision    Type II error
Reject                   Type I error     Good decision
Inferential statistics (Informed guess)
α: probability of committing a type I error
β: probability of committing a type II error
p-value: the probability of getting the outcome observed (or one more extreme), assuming the null hypothesis to be true.
Sample size & power of the study
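The p-value definition above can be made concrete with an exact permutation test, which is one of several ways to compare two independent samples and is not necessarily the test the lecture had in mind. If the null hypothesis is true, the group labels are interchangeable, so the p-value is the share of all possible relabelings whose difference in means is at least as extreme as the one observed. The heart-rate values and the function name `permutation_p_value` are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    # Enumerate every way of assigning len(group_a) of the pooled values to group A.
    for chosen in combinations(indices, len(group_a)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point rounding in the comparison.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total, total


# Hypothetical heart rates (beats/min) under each drug.
inderal = [72, 75, 78, 80, 74, 77]
new_drug = [68, 70, 73, 71, 69, 74]

p_value, n_splits = permutation_p_value(inderal, new_drug)
alpha = 0.05  # accepted probability of a type I error
print(f"p = {p_value:.4f} over {n_splits} relabelings; reject H0: {p_value < alpha}")
```

Full enumeration is only practical for small samples; with larger groups a random subset of relabelings, or a t-test, is the usual substitute.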
Inferential statistics (Informed guess)
Some examples of hypothesis testing
  Comparisons
    One sample
    Two independent samples
    Two dependent samples
    More than two samples (independent, dependent)
    Comparing two or more factors
  Association
  Prediction
  Diagnostic test (dichotomous, continuous)
Diagnostic tests

The test is    Disease (outcome) present    Disease (outcome) absent
Positive       TP                           FP
Negative       FN                           TN
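The usual summary measures follow directly from the four cells of this table. The sketch below uses hypothetical counts; the function name `diagnostic_summary` is made up for the example.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard measures derived from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # disease-free patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly disease-free
    }


# Hypothetical counts: 50 patients with the disease, 100 without.
summary = diagnostic_summary(tp=40, fp=10, fn=10, tn=90)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity describe the test itself, while the predictive values also depend on how common the disease is in the group tested.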
[Figure, shown on two slides: ROC curve. x-axis: 1 - Specificity (0.0 to 1.0); y-axis: Sensitivity (0.0 to 1.0). Note on the plot: diagonal segments are produced by ties.]
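For a continuous test, a ROC curve is built by trying every observed value as the cut-off and plotting sensitivity against 1 - specificity. A minimal sketch with hypothetical test results follows; the names `roc_points` and `area_under_curve` are made up for the example. When a diseased and a disease-free patient share the same test value, both coordinates change at once, which produces the diagonal segments mentioned on the plot.

```python
def roc_points(scores, has_disease):
    """(1 - specificity, sensitivity) for every cut-off, calling score >= cut-off positive."""
    positives = sum(has_disease)
    negatives = len(has_disease) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nobody tests positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, has_disease) if s >= threshold and d)
        fp = sum(1 for s, d in zip(scores, has_disease) if s >= threshold and not d)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical test values and true disease status (1 = present, 0 = absent).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
has_disease = [1, 1, 0, 1, 1, 0, 0, 0]

points = roc_points(scores, has_disease)
auc = area_under_curve(points)
print(points)
print(f"AUC = {auc:.3f}")
```

An area of 0.5 corresponds to a test no better than chance, and 1.0 to a test that separates the two groups perfectly.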
Inferential statistics (Informed guess)
Some examples of hypothesis testing
  Comparisons
    One sample
    Two independent samples
    Two dependent samples
    More than two samples (independent, dependent)
  Association
  Prediction
  Diagnostic test (dichotomous, continuous)
  Survival analysis (censored data)
Final words
If valid data are analyzed improperly, then the results become invalid and the conclusions may well be inappropriate.
At best, the net effect is to waste time, effort, and money for the project.
At worst, therapeutic decisions may well be based upon invalid conclusions and patients’ wellbeing may be jeopardized.