Statistics

Dr. Ahmed Abd Elmaksoud
Lecturer of Anesthesiology & ICU

Faculty of Medicine
Ain Shams University

When guessing is an informed decision

Objectives

Define statistics and understand its terminology

Discuss the importance and need of statistics in medical field

Distinguish types of data and variables

Describe types of statistics & statistical tests

What is statistics?

Statistics is the science of collecting, describing, and analyzing data in order to reach a good decision.

Collecting data

Describing data

Analyzing data

Good Decision

Why do we need statistics?

Variability (atropine)

Causes:

Uncontrollable (too many) factors
Immeasurable factors
Unknown factors

Effect of variability:

Large amount of data describing the same thing (many values for one variable)
No certainty ("deterministic vs probabilistic")
Sampling

Functions of statistics (new hypothetical β-blocker drug)

Describe (descriptive statistics)
Infer (inferential statistics)

Variables & Data

A variable is something whose value can vary. For example, age, gender and blood type are variables.

Data are the values you get when you measure a variable.

Variable      Mrs Brown   Mr Patel   Ms Manda
Age           32          24         20
Gender        Female      Male       Female
Blood type    O           O          A

The first column lists the variables; the other columns hold the data.

Types of variables

1. Qualitative variables (data)

a) Categorical variable: the values (data) of a categorical variable are categories.

Gender (dichotomous, binary): Male, Female

Type of ICU admission: Medical, Surgical, Physical injuries, Poisoning, Others

b) Ordinal variable: a categorical variable whose values are ordered.

Degree of illness: Mild, Moderate, Severe

Physical status: ASA I, ASA II, ASA III, ASA IV, ASA V

Glasgow Coma Scale

2. Quantitative (numerical) variables

a) Discrete variable: episodes of myocardial ischemia

b) Continuous (interval, ratio) variable: cardiac index, creatinine clearance

Types of variables

Changing data scales

Sample and Population

A sample is a group (subset) taken from a population.

Population is the group of ALL individuals (entities) sharing specific characteristics

Sample and Population

All human beings
Day case surgical patients undergoing general anesthesia
Low-risk CABG surgery patients
ICU patients with septic shock
Women undergoing CS under spinal anesthesia
Human skeletal muscle fibers
Cardiac muscles of rats

Descriptive statistics

Descriptive statistics is a series of procedures designed to illuminate the data, so that its principal characteristics and main features are revealed.

This may mean sorting the data by size, putting it into a table, presenting it in an appropriate chart, summarizing it numerically, and so on.


Qualitative variables

Frequency
Relative frequency
Cumulative frequency
Cross tabulation (what about numerical variables? grouping)
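The first three summaries listed above can be sketched in a few lines of Python. This is only an illustration: the ten "degree of illness" observations and the helper name `frequency_table` are hypothetical and do not come from the slides. Cumulative frequency is meaningful here because the categories are ordered.

```python
from collections import Counter

# Hypothetical ordinal data: degree of illness for 10 patients (not from the slides).
ORDER = ["Mild", "Moderate", "Severe"]
observations = ["Mild", "Severe", "Moderate", "Mild", "Mild",
                "Moderate", "Severe", "Mild", "Moderate", "Mild"]


def frequency_table(values, order):
    """Return rows of (category, frequency, relative freq., cumulative relative freq.)."""
    counts = Counter(values)
    total = len(values)
    rows = []
    running = 0  # running count used for the cumulative column
    for category in order:
        freq = counts.get(category, 0)
        running += freq
        rows.append((category, freq, freq / total, running / total))
    return rows


table = frequency_table(observations, ORDER)
for category, freq, rel, cum in table:
    print(f"{category:<10}{freq:>4}{rel:>8.0%}{cum:>8.0%}")
```

A cross tabulation would extend the same idea by counting pairs of values, for example (group, complication), instead of single values.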


Qualitative variables

In terms of describing data, an appropriate chart is almost always a good idea.

What 'appropriate' means depends primarily on the type of data, as well as on what particular features of it you want to explore.

Finally, a chart can often be used to illustrate or explain a complex situation for which a form of words or a table might be clumsy, lengthy or otherwise inadequate.


Qualitative variables: Charts

Pie chart

[Figure: pie chart of postoperative complications (nausea, vomiting, pain, cough); slice labels 23.5%, 9.8%, 39.2%, 27.5%]

Advantages:
1. Summarizes the data (slice area shows relative frequency)
2. Shows magnitude (relative frequency)

Disadvantages:
1. One variable only
2. Loses clarity with more than 4-5 categories
3. No cross tabulation (needs separate pies)



Bar chart

Simple bar chart

[Figure: simple bar chart of number of patients by postoperative complication (nausea, vomiting, pain, cough)]

Alternative

[Figure: the same bar chart with the y-axis as number of patients (%)]

Width
Spacing



Clustered bar chart

[Figure: clustered bar chart of number of patients (%) by postoperative complication, Group I vs Group II]



Stacked bar chart

[Figure: stacked bar chart with one bar per group (Group I, Group II); y-axis number of patients (%) from 0% to 100%; segments for cough, pain, vomiting, nausea]



Histogram

One variable, no gaps between bars



Frequency polygon

Cumulative frequency polygon (ogive)

[Figure: frequency polygon of postoperative complications, number of patients (%), Group I vs Group II]


Numerical variables

Measures of central tendency (summary measures of location)

Mean (average)

1 – 1 – 3 – 5 – 10

Advantages – Disadvantages – Ratio

Median

1 – 1 – 3 – 5 – 10

1 – 3 – 5 – 9 (odd vs even number of values)

Mode

1 – 1 – 3 – 5 – 10

Midrange

1 – 1 – 3 – 5 – 10

Summary for 1 – 1 – 3 – 5 – 10:

Mean = 4
Median = 3
Mode = 1
Midrange = 5.5
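As a quick check of these location measures, the sketch below recomputes them for the slide's example data with Python's standard `statistics` module. The `midrange` helper is not part of that module; it is defined here as (minimum + maximum) / 2, which gives 5.5 for this data. The second call shows the even-count case, where the median is the average of the two middle values.

```python
from statistics import mean, median, multimode

data = [1, 1, 3, 5, 10]  # example data from the slide


def midrange(values):
    """Midpoint between the smallest and largest value."""
    return (min(values) + max(values)) / 2


summary = {
    "mean": mean(data),        # sum of values / number of values
    "median": median(data),    # middle value of the sorted data
    "mode": multimode(data),   # most frequent value(s), returned as a list
    "midrange": midrange(data),
}

# Even number of values: the median averages the two middle values (3 and 5).
even_median = median([1, 3, 5, 9])

print(summary)
print(even_median)
```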




Measures of degree of dispersion (summary measures of spread)

Range

1 – 1 – 3 – 5 – 10

Percentiles and quartiles

1 – 1 – 3 – 5 – 10

1 – 2 – 5 – 6 – 8 – 9 – 11 – 13 – 17 – 20 – 22

1st quartile, 3rd quartile, 70th percentile, 90th percentile

Interquartile range (Q3 – Q1)

1 – 1 – 3 – 5 – 10 (1st quartile, 3rd quartile)

Variance and standard deviation

1 – 1 – 3 – 5 – 10

Summary for 1 – 1 – 3 – 5 – 10:

Range = 9
Interquartile range = 4
Variance = 11.2 (dividing by n)
Standard deviation = 3.35
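The spread measures can be reproduced with the standard `statistics` module, under two assumptions worth stating. First, a variance of 11.2 and a standard deviation of 3.35 correspond to the population formulas (dividing by n), so `pvariance` and `pstdev` are used; the sample versions (dividing by n - 1) would give larger values. Second, quartile conventions differ between textbooks and software; the `"inclusive"` method is chosen here because it yields Q1 = 1 and Q3 = 5, and therefore an interquartile range of 4, for this data.

```python
from statistics import pstdev, pvariance, quantiles

data = [1, 1, 3, 5, 10]  # example data from the slide
longer = [1, 2, 5, 6, 8, 9, 11, 13, 17, 20, 22]  # second data set from the slide

value_range = max(data) - min(data)

# Quartiles: three cut points that split the sorted data into four parts.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Population variance and standard deviation (divisor n, not n - 1).
variance = pvariance(data)
sd = pstdev(data)

# Percentiles: 99 cut points; the k-th percentile sits at index k - 1.
percentiles = quantiles(longer, n=100, method="inclusive")
p70 = percentiles[69]
p90 = percentiles[89]

print(value_range, iqr, round(variance, 2), round(sd, 2))
print(p70, p90)
```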


Numerical variables

Inferential statistics (informed guess)

Making inferences about population parameters from sample statistics

Standard error (SD of the statistic)

Confidence interval (95% CI)
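A minimal sketch of both ideas for a sample mean follows. The heart-rate values and the function name `mean_confidence_interval` are hypothetical. The interval uses the normal approximation (critical value about 1.96 for 95%); for a sample this small a t-based critical value would be more appropriate and would give a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical sample of heart rates in beats/min (not data from the slides).
heart_rates = [68, 72, 75, 70, 66, 74, 71, 69, 73, 72]


def mean_confidence_interval(sample, level=0.95):
    """Return (mean, standard error, (lower, upper)) using a normal approximation."""
    n = len(sample)
    m = mean(sample)
    se = stdev(sample) / sqrt(n)               # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return m, se, (m - z * se, m + z * se)


m, se, (low, high) = mean_confidence_interval(heart_rates)
print(f"mean = {m}, SE = {se:.3f}, 95% CI = ({low:.2f}, {high:.2f})")
```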


Hypothesis testing

Almost all clinical research begins with a question. For example, is stress a risk factor for breast cancer?

To answer questions like this you have to transform the research question into a testable hypothesis called the null hypothesis, conventionally labeled H0.

This usually takes the following form:

H0: Stress is NOT a risk factor for breast cancer
H0: The drug has NO effect on mean heart rate


Hypothesis testing

Null hypotheses reflect the conservative position of no difference, no risk, no effect, etc.

To test this null hypothesis, researchers take samples, measure outcomes, and decide whether the data from the sample provide strong enough evidence to reject the null hypothesis.

If the evidence against the null hypothesis is strong enough for us to reject it, then we are implicitly accepting that some specified alternative hypothesis, usually labelled H1, is probably true.


Hypothesis testing: example

Is the new hypothetical β-blocker more effective than a conventional β-blocker (e.g. Inderal) in decreasing heart rate?

Let the mean heart rate of all people taking Inderal be μ1.

Let the mean heart rate of all people taking the new drug be μ2.


Type I & Type II Errors

Decision about H0   H0 is true      H0 is false
Accept              Good decision   Type II error
Reject              Type I error    Good decision


α: probability of committing a type I error
β: probability of committing a type II error
p-value: the probability of getting the outcome observed (or one more extreme), assuming the null hypothesis to be true

Sample size & power of the study
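The p-value definition can be made concrete with an exact permutation test, sketched below. This is a swapped-in illustration and not necessarily the test the lecture has in mind: if the null hypothesis is true, the group labels are interchangeable, so the p-value is the fraction of all possible relabelings whose difference in means is at least as extreme as the one observed. The heart-rate values and the function name are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heart rates in beats/min (not data from the slides).
inderal = [72, 75, 70, 74, 73]
new_drug = [65, 66, 64, 68, 67]


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    count = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # as extreme as, or more extreme than, observed
            extreme += 1
        count += 1
    return extreme / count


p = permutation_p_value(inderal, new_drug)
alpha = 0.05  # chosen probability of a type I error
print(f"p = {p:.4f}; reject H0 at alpha = {alpha}: {p < alpha}")
```

With five values per group there are 252 relabelings, so full enumeration is cheap; larger samples would normally use random resampling or a parametric test instead.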


How does it work? Probability distributions


How does it work? Parametric vs non-parametric tests


Some examples of hypothesis testing

Comparisons

One sample
Two independent samples
Two dependent samples
More than two samples (independent or dependent)
Comparing two or more factors

Association

Prediction

[Figure: scatter plot of weight (kg) against age (y) with the fitted line y = 2.03x + 8]
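Prediction with a fitted line amounts to plugging a value of x into the equation. The sketch below uses the slope and intercept shown on the chart (y = 2.03x + 8, with x as age in years and y as weight in kg); the function name is hypothetical, and predictions outside the age range covered by the underlying data should be treated with caution.

```python
SLOPE = 2.03      # kg gained per year of age, from the fitted line on the chart
INTERCEPT = 8.0   # predicted weight in kg at age 0


def predict_weight(age_years):
    """Predicted weight (kg) for a given age (years) from y = 2.03x + 8."""
    return SLOPE * age_years + INTERCEPT


for age in (2, 5, 10):
    print(f"age {age} y -> predicted weight {predict_weight(age):.2f} kg")
```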




Diagnostic test (dichotomous, continuous)

Diagnostic tests

                Disease present   Disease absent
Test positive   TP                FP
Test negative   FN                TN
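The four cells of this table are the basis of the usual diagnostic accuracy measures. The sketch below computes sensitivity, specificity, and the positive and negative predictive values from their standard definitions; the counts and the function name are hypothetical and only illustrate the arithmetic.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # disease-free patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are truly diseased
        "npv": tn / (tn + fn),          # negative tests that are truly disease-free
    }


# Hypothetical counts (not from the slides).
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```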

[Figure: ROC curve plotting sensitivity against 1 - specificity. Diagonal segments are produced by ties.]
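For a continuous test, each possible cut-off gives one (1 - specificity, sensitivity) pair, and the ROC curve joins those pairs. The sketch below builds the points by sweeping the threshold from high to low and estimates the area under the curve with the trapezoidal rule. The scores, labels, and function names are hypothetical; a higher score is assumed to indicate disease. Observations that share a score enter at the same threshold, which is what produces diagonal segments.

```python
def roc_points(scores, labels):
    """Return ROC points as (1 - specificity, sensitivity), from strictest cut-off down.

    labels: 1 = disease present, 0 = disease absent.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nothing is called positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under consecutive ROC points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical test results (not from the slides).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)
print(points)
print(f"AUC = {auc:.4f}")
```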




Survival analysis (censored data)

Final words

If valid data are analyzed improperly, then the results become invalid and the conclusions may well be inappropriate.

At best, the net effect is to waste time, effort, and money for the project.

At worst, therapeutic decisions may well be based upon invalid conclusions and patients’ wellbeing may be jeopardized.

Thank you
