Why do we need statistics? - Johns Hopkins University
Sharon L. Kozachik, PhD, RN, FAAN 1
Biostatistics for Evidence-Based Practice
Module 1
Sharon L. Kozachik, PhD, RN, FAAN
Johns Hopkins University
School of Nursing
Why do we need statistics?
• Evidence based practice depends on solid statistical evidence
• “Evidence based practice is a problem-solving approach to clinical decision making within a healthcare organization that integrates the best available scientific evidence with the best available experiential (patient and practitioner) evidence.” (pg. 3)
• Considers internal and external influences on nursing practice
Newhouse, R. P., Dearholt, S. L., Poe, S. S., Pugh, L. C., & White, K. M. (2007). Johns Hopkins nursing evidence based practice: Model and guidelines. Indianapolis: Sigma Theta Tau.
Evidence based practice
• Develop an answerable clinical question
– What is the best practice for managing pain in cancer patients?
• Search for relevant research-based evidence
• Appraise and synthesize the evidence
• Integrate the evidence with other factors
• Assess effectiveness of the change
Types of studies that guide EBP
• Descriptive studies
– What symptoms emerge during cancer treatment?
– Use descriptive statistics
• Explanatory studies
– Among persons with lung cancer, are women more likely to report pain than men?
– Use inferential statistics
• Prediction and control studies (RCT)
– Will mindfulness meditation reduce pain to a greater degree than distraction?
– Use inferential statistics
What is nursing research?
• A systematic inquiry
– Disciplined methods
– Answers questions/solves problems of importance to nurses & nursing profession
• Two basic categories
– Basic research: knowledge production
– Applied research: knowledge implementation (problem-solving)
Nursing research develops knowledge to:
• Build the scientific foundation for clinical practice
• Prevent disease and disability
• Manage and eliminate symptoms caused by illness
• Enhance end-of-life and palliative care
http://www.ninr.nih.gov/
Two paradigms guide nursing research
1. Positivist: reality exists and there is only 1 truth; the real world is driven by natural causes that can be quantified and analyzed
– Answers research questions
– Tests hypotheses
2. Naturalist: reality is multiple, subjective, and constructed by individuals within their context
This course focuses on the positivist paradigm
What is a Hypothesis?
• A prediction that specifies the expected relationship between variables
• 3 types:
1. Null (used in statistics)
– There is no association between sleep and pain
2. Non-Directional
– There is an association between sleep and pain
3. Directional
– Persons with poor sleep will have greater subsequent day pain
What is a Variable?
• A characteristic that varies
– From person to person
– Within a person over time
– Examples: Hair color, Blood type, BP, Ht, Wt
• What do we call a characteristic that does not vary?
• In research, there are two categories of variables
What is Measurement?
• The assignment of numbers to represent the amount of an attribute present in an object or person, using specific rules (metrics)
• Temperature, BP, ht & wt have rules for measuring
• Advantages:
– Removes guesswork
– Provides precise information
– Less vague than words
Scales for Measurement
• Provides the unit of measurement
– Level of measurement
• Provides the range and type of possible values
– Infinite
– Finite, as few as two
• What if there is only one value?
– Measurement unit can be continuous
– Measurement unit can be discrete
Levels of measurement
• Researchers strive to use the highest level of measurement possible, especially for the dependent variable (DV)
– More information about DV
– Can use more powerful statistical tests
• Determines what type of data analysis you are able to perform
• There are four levels of measurement in statistics
Nominal
• Also called categorical
• Lowest level of measurement
• Exclusive & exhaustive
• Uses numbers to categorize attributes
– Examples: Sex, Race, Blood Type, Religion
• Discrete variable
• Each category is assigned a number for the purpose of analyses; the number does not have quantitative importance
– Sex: Male = 1; Female = 2
Ordinal
• Exclusive, exhaustive & rank ordered
• Ranks an object based on its relative standing on an attribute
• Discrete variable
• Does not tell how much greater one level is than another: unequal intervals between rankings
– Assistance with ADLs or IADLs
– Patient satisfaction with care
Interval
• Exclusive, exhaustive, ranked, and numerically equal intervals
• Does not have a meaningful/true zero, only defines position on the scale
– Temperature (Celsius, Fahrenheit)
– IQ
• Continuous or discrete variable
Ratio
• Highest level of measurement
• Exclusive, exhaustive, ranked, equal distance between intervals and a meaningful zero (point at which the variable is absent)
– Provides information about the absolute magnitude of the attribute
– Weight: someone who weighs 200 lbs is twice as heavy as someone who weighs 100 lbs
– Urine output, bleeding, burn surface area, BP, AR, RR
• Continuous or discrete variable, depending upon how measured
How we use Statistics in Research
1. Describe and summarize data
2. Make predictions about future events based on current evidence
3. Make generalizations about population occurrences based on sample observations
4. Identify associations/relationships or differences between sets of observations
Two types of statistics
• Descriptive statistics
– Used to describe or characterize sample characteristics by summarizing them
• Inferential statistics
– A set of statistical techniques that provide predictions about population characteristics based on information obtained from a sample taken from that population
Descriptive statistics - Univariate
• Univariate = 1 variable
• Frequency distributions (counts, percentages)
• Central tendency = where the masses huddle
• Dispersion/variability = spread
Example: Data Table of Descriptive Statistics
Commonly presented descriptive statistics
• Mean
• Median
• Mode
• Percentage & percentiles
• Count
• Minimum/Maximum
• Range
• Standard deviation (sd)
• Variance
• Inter-Quartile Range
Measures of central tendency
Measures of variability/dispersion
Data Organization
• Frequency Distribution
– Systematic arrangement of data values
– Imposes order on the data
– List from lowest to highest
– Provides a frequency count (f) and the percentage of times each value occurred
– The sum of all value frequencies = sample size: Σf = n
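The properties listed above can be checked with a short Python sketch. The education codes and the ten raw values below are made up for illustration (the slide's own raw data table did not survive in this transcript); only the procedure follows the slide.

```python
from collections import Counter

# Hypothetical raw data: education level for 10 participants, coded
# 1 = high school, 2 = some college, 3 = bachelor's, 4 = graduate.
education = [2, 1, 3, 2, 4, 2, 3, 1, 2, 3]

counts = Counter(education)
n = len(education)

# One row per value, ordered lowest to highest: (value, f, percent).
table = [(value, f, 100 * f / n) for value, f in sorted(counts.items())]

for value, f, pct in table:
    print(f"value={value}  f={f}  percent={pct:.1f}%")

# The frequencies sum to the sample size (sum of f equals n).
total_f = sum(f for _, f, _ in table)
print("sum of f =", total_f, " n =", n)
```

Sorting the counted values is what imposes the lowest-to-highest order; the percentage column is simply f divided by n.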
Example: Frequency distribution
Systematic arrangement of values
1. Lowest to highest (rank-ordered)
2. Indicates the count and percentage of the occurrence of each value in the data set
Raw data
Frequency Distribution: Education
Frequency Distributions for Variables with Many Values
• When a variable has many possible values, a regular frequency distribution may be unwieldy
– For example, weight values (here, in pounds)
Grouped Frequency Distributions
• Forming groups communicates information more conveniently than individual weights
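As a rough sketch of grouping, the snippet below bins weights into classes. It borrows the twelve weights listed in the IQR example later in this transcript; the 20-pound class width and the starting point of 110 are assumptions chosen for illustration, not values from the slide.

```python
from collections import Counter

# Weights (pounds) borrowed from the IQR example later in this transcript.
weights = [110, 120, 130, 135, 140, 150, 150, 165, 170, 170, 180, 190]

CLASS_WIDTH = 20   # assumed class width
START = 110        # assumed lower limit of the first class

def class_limits(weight):
    """Return the (lower, upper) limits of the class containing weight."""
    lower = START + ((weight - START) // CLASS_WIDTH) * CLASS_WIDTH
    return (lower, lower + CLASS_WIDTH - 1)

grouped = Counter(class_limits(w) for w in weights)

for (lower, upper), f in sorted(grouped.items()):
    print(f"{lower}-{upper} lb: f={f}  ({100 * f / len(weights):.1f}%)")
```

Five classes replace twelve separate rows, at the cost of no longer seeing the individual weights.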
Reporting Frequency Information
• Narrative in text (e.g., “83% of study participants were male”)
• Frequency distribution table (multiple variables often presented in a single table)
• Graphically
Graphic displays of frequency distributions
• Bar graphs and pie charts
• Histograms, frequency polygon
• Shapes of distributions
• Modality
• Symmetry and skewness
• Kurtosis
• The Normal distribution
Bar Graphs
• Used for nominal (and many ordinal) level variables
• Horizontal dimension (X axis) that specifies categories (i.e., data values)
• Vertical dimension (Y axis) specifies either frequencies or percentages
• Bars for each category drawn to the height that indicates the frequency or %
Bar Graph: Education
Bars do not touch
Pie Chart
• Nominal (and many ordinal) level variables
• Circle is divided into pie-shaped wedges corresponding to percentages for a given category or data value
• All pieces add up to 100%
• Should place wedges in order, with biggest wedge starting at “12 o’clock”
Pie Chart
Histograms
• Interval- and ratio-level data
• Similar to a bar graph, with an X and Y axis, but adjacent values are on a continuum so bars touch one another
• Data values on X axis arranged from lowest to highest
• Bars drawn to height to show frequency or percentage (Y axis)
• May include a superimposed normal curve
Histogram: Age
Bars touch
Normal curve superimposed
Frequency polygon
• Similar to a histogram
• Resembles a line graph
• Can be used to display a cumulative frequency
• Used in economic research
Frequency Polygon
Measures of Central Tendency
• An indicator of the center of the data
– Typical / “average” data point
– Center data point
– Most frequently occurring data point
• Is it important to know where the center of the data is located?
Measures of Central Tendency
• Mean – the average or typical value
– Interval or ratio level data
– Sample mean: $\bar{X} = \frac{\sum X}{n}$
• Median – the value that cuts the data in half, 50th %ile
– Ordinal, interval or ratio level data (if outliers)
• Mode – the most frequently occurring value
– Categorical, ordinal, interval or ratio level data
Example: Mean
Using the data values below, what is the mean?
86, 82, 94, 76, 88, 92, 92, 94, 94, 94
1. Rank order the data:
76, 82, 86, 88, 92, 92, 94, 94, 94, 94
2. Sum the values and divide by n (number of values)
Mean = (76 + 82 + 86 + 88 + 92 + 92 + 94 + 94 + 94 + 94) / 10
Mean = 892 / 10 = 89.2
How might an outlier value affect the mean?
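One way to explore the outlier question is to recompute the mean after changing a single value. The sketch below uses the ten scores from the example; the outlier value of 16 is hypothetical and chosen only to make the effect visible.

```python
from statistics import mean

scores = [86, 82, 94, 76, 88, 92, 92, 94, 94, 94]

# Sum the values and divide by n, as in the worked example.
sample_mean = sum(scores) / len(scores)
print("mean:", sample_mean)

# Hypothetical outlier: suppose the lowest score had been 16 instead of 76.
with_outlier = [16 if s == 76 else s for s in scores]
outlier_mean = mean(with_outlier)
print("mean with outlier:", outlier_mean)
```

Because every value enters the sum, one extreme score moves the mean by (76 - 16) / 10 = 6 points.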
Example: Median
Using the data values below, what is the median?
86, 82, 94, 76, 88, 92, 92, 94, 94, 94
• Steps
1. Rank order the data:
76, 82, 86, 88, 92, 92, 94, 94, 94, 94
2. Find the value that ‘cuts’ the data in half; that is the median (50th percentile)
76, 82, 86, 88, 92, 92, 94, 94, 94, 94
Median for our Data
76, 82, 86, 88, 92, 92, 94, 94, 94, 94 -> 10 values
For even numbers of values, we calculate the average of the (n/2)th and (n/2 + 1)th values. With n = 10, these are the 5th and 6th values.
For our data values: (92 + 92) / 2 = 92
92 splits our data in half:
76, 82, 86, 88, 92, 92, 94, 94, 94, 94
How might an outlier value affect the median?
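The median steps can be sketched in Python as follows. The odd-n branch is added for completeness, and the outlier value of 16 is again hypothetical.

```python
from statistics import median

scores = [86, 82, 94, 76, 88, 92, 92, 94, 94, 94]

# Step 1: rank order the data.
ordered = sorted(scores)
n = len(ordered)

# Step 2: find the middle. For even n, average the (n/2)th and (n/2 + 1)th
# values (Python lists are 0-indexed, hence the -1).
if n % 2 == 0:
    med = (ordered[n // 2 - 1] + ordered[n // 2]) / 2
else:
    med = ordered[n // 2]
print("median:", med)

# Hypothetical outlier: replace the lowest score (76) with 16.
with_outlier = [16 if s == 76 else s for s in scores]
outlier_median = median(with_outlier)
print("median with outlier:", outlier_median)
```

The outlier changes only the smallest value, so the two middle values, and therefore the median, stay where they were.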
Mode
Let’s return to our data:
76, 82, 86, 88, 92, 92, 94, 94, 94, 94
Make a frequency count for each value:
76 – 1
82 – 1
86 – 1
88 – 1
92 – 2
94 – 4
94 is the mode of this data array
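The same frequency count can be reproduced with Python's standard library; this is a minimal sketch using the scores above.

```python
from collections import Counter
from statistics import mode

scores = [76, 82, 86, 88, 92, 92, 94, 94, 94, 94]

# Frequency count for each value.
counts = Counter(scores)
for value in sorted(counts):
    print(value, "-", counts[value])

# The mode is the value with the largest frequency.
mode_value, mode_frequency = counts.most_common(1)[0]
print("mode:", mode_value, "occurs", mode_frequency, "times")
```

`statistics.mode(scores)` returns the same value directly; the explicit count is shown because it mirrors the slide's procedure.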
Central Tendency Comparisons: Normal Distribution
• In a normal distribution, the mean, median, and mode are equal
Symmetry
• Symmetrical distribution: the two halves of the distribution, folded over in the middle, are identical
Kurtosis
• Concerned with peakedness relative to the normal distribution
Central Tendency: Skewed Distributions
• In a skewed distribution, the mean is pulled “off center” in the direction of the skew
– What causes a distribution to skew?
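The ten scores from the mean, median, and mode examples earlier in this transcript give a small illustration: most scores cluster near the top and a few low scores form a tail. This sketch simply compares the three measures; reading the ordering as a sign of negative (left) skew is a common rule of thumb, not a formal test.

```python
from statistics import mean, median, mode

scores = [76, 82, 86, 88, 92, 92, 94, 94, 94, 94]

m = mean(scores)
med = median(scores)
mo = mode(scores)
print("mean:", m, " median:", med, " mode:", mo)

# The low scores (76, 82) drag the mean below the median and mode,
# that is, toward the tail of the distribution.
pulled_toward_low_tail = m < med < mo
print("mean pulled toward the low tail:", pulled_toward_low_tail)
```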
Measures of Variability/Dispersion
• The spread of the data in a distribution
– Two distributions with the same mean could have different dispersion
• Reported through 4 mechanisms
– Range: highest value (maximum) minus lowest value (minimum)
– Interquartile range
– Standard deviation: a measure of how far, on average, scores fall from the sample mean
– Variance: (standard deviation)²
Variability
High variability: (A) heterogeneous distribution
Low variability: (B) homogeneous distribution
Range
• Difference between highest and lowest value in distribution
• Weights (pounds):
110 120 130 140 150 150 160 170 180 190
• The range for these data is 80 (190 – 110)
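A short sketch of the range calculation, using the ten weights above. The extra value of 300 pounds is hypothetical and is there only to show how strongly the range depends on the two extreme values.

```python
weights = [110, 120, 130, 140, 150, 150, 160, 170, 180, 190]

# Range = highest value minus lowest value.
value_range = max(weights) - min(weights)
print("range:", value_range)

# Hypothetical: add one extreme weight and recompute.
with_extreme = weights + [300]
extreme_range = max(with_extreme) - min(with_extreme)
print("range with one extreme value:", extreme_range)
```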
Interquartile Range (IQR)
• Reported with median value
• Based on quartiles
– Lower quartile (Q1): Point below which 25% of scores lie
– Upper quartile (Q3): Point below which 75% of scores lie
• IQR = Q3 - Q1
• IQR Example: Weights (pounds):
110 120 130 135 140 150 150 165 170 170 180 190
Q3 = 170, Q1 = 130
• IQR = 40.0 (170 – 130 )
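Quartiles can be computed under several conventions, and software packages do not all agree. The sketch below assumes one simple rule: take the value at position ceil(n × p) in the rank-ordered data. This rule is an assumption, not something the slide states, but it reproduces the slide's Q1 = 130 and Q3 = 170; other conventions (including the defaults in Python's `statistics.quantiles`) interpolate and give slightly different quartiles for these data.

```python
import math

def quartile(sorted_values, p):
    """Value at 1-based position ceil(n * p) in rank-ordered data.

    This is one of several quartile conventions (assumed here).
    """
    n = len(sorted_values)
    position = math.ceil(n * p)
    return sorted_values[position - 1]

weights = [110, 120, 130, 135, 140, 150, 150, 165, 170, 170, 180, 190]

q1 = quartile(weights, 0.25)   # 3rd of 12 values
q3 = quartile(weights, 0.75)   # 9th of 12 values
iqr = q3 - q1
print("Q1:", q1, " Q3:", q3, " IQR:", iqr)
```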
Standard Deviation
• An index that conveys how much, on average, scores in a distribution vary
• Based on deviation scores, calculated by subtracting the mean from each individual score
$X' = X - \bar{X}$
Computing Standard Deviation
$$\text{Standard Deviation} = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$
• $\bar{X}$ = mean of all scores
• $X$ = each individual score
• $\sum$ = sum (in this case, the sum of the differences of each score from the mean, squared)
• $n$ = number of sample values
Standard Deviation
• Advantages:
– Takes all data into account in describing variability
– Is more stable as a measure of variability than the range or IQR
– Helpful in interpreting individual scores when data are distributed approximately normally
• Disadvantages:
– Can be influenced by extreme scores/outliers
– Not as “intuitive” or as easy to interpret as the range
Variance
• An important variability concept in inferential statistics, but not used descriptively
• The variance = SD²
• Not easily interpreted because it is not in units of original data; it is in units squared
• Formula for sample variance: $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
• Formula for population variance: $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
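Sample Variance: Example

| $X$ | $X - \bar{X}$ | $(X - \bar{X})^2$ |
|---|---|---|
| 110 | -40 | 1600 |
| 120 | -30 | 900 |
| 130 | -20 | 400 |
| 140 | -10 | 100 |
| 150 | 0 | 0 |
| 150 | 0 | 0 |
| 160 | 10 | 100 |
| 170 | 20 | 400 |
| 180 | 30 | 900 |
| 190 | 40 | 1600 |
| ∑ = 1500 | ∑ = 0 | ∑ = 6000 |

$\bar{X} = 1500 / 10 = 150$
Variance $= 6000 / (10 - 1) = 666.67$
$SD = \sqrt{666.67} = 25.82$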
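The columns of the worked example can be rebuilt step by step in Python. This is a minimal sketch using the ten weights from the example; it divides by n − 1, matching the sample formula, and cross-checks against the standard library.

```python
import math
from statistics import stdev, variance

weights = [110, 120, 130, 140, 150, 150, 160, 170, 180, 190]

n = len(weights)
mean_weight = sum(weights) / n                    # 1500 / 10

# Deviation scores and squared deviations (the table's 2nd and 3rd columns).
deviations = [x - mean_weight for x in weights]
squared_deviations = [d ** 2 for d in deviations]

sum_of_squares = sum(squared_deviations)          # sum of (X - mean)^2
sample_variance = sum_of_squares / (n - 1)        # divide by n - 1
sd = math.sqrt(sample_variance)

print("mean:", mean_weight)
print("sum of deviations:", sum(deviations))
print("sum of squares:", sum_of_squares)
print("variance:", round(sample_variance, 2))
print("SD:", round(sd, 2))

# Cross-check with the standard library's sample statistics.
library_variance = variance(weights)
library_sd = stdev(weights)
```

The deviations always sum to zero, which is why they are squared before being averaged.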
Measurement Scales and Descriptive Statistics

| Level of Measurement | Central Tendency Statistic | Variability Statistic |
|---|---|---|
| Nominal | Mode | -- |
| Ordinal | Median | Range, IQR |
| Interval or Ratio | Mean (what if outliers present?) | Standard deviation, Variance |
Normal Distribution
• Bell shaped symmetric curve
– What do we mean by symmetric curve?
• Mean, median and mode have the same value
• Approximately 68% of values lie within 1 SD of the mean
• Approximately 95% of values lie within 2 SD of the mean
• > 99% of values lie within 3 SD of the mean
• Range is −∞ to +∞
Normal Distribution - sd
You are in a class of 100 students. It is exam day, and the scores are normally distributed.
If you score 1 sd below the mean score, what percentage of the class scored higher than you?
If you score 1 sd above the mean score, what percentage of the class scored higher than you?
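One way to check answers to questions like these is with the standard normal distribution in Python's `statistics.NormalDist`. The sketch assumes exam scores follow an exactly normal curve, which real classes of 100 only approximate.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of the class scoring higher than someone 1 sd below the mean.
higher_if_1sd_below = 1 - standard_normal.cdf(-1)

# Proportion scoring higher than someone 1 sd above the mean.
higher_if_1sd_above = 1 - standard_normal.cdf(1)

print(f"1 sd below: {higher_if_1sd_below:.1%} scored higher")
print(f"1 sd above: {higher_if_1sd_above:.1%} scored higher")

# Area within 1, 2, and 3 sd of the mean.
within = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}
for k, area in within.items():
    print(f"within {k} sd: {area:.2%}")
```

In a class of 100, roughly 84 students score higher than someone 1 sd below the mean, and roughly 16 score higher than someone 1 sd above it.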
Relative Standing
• Central tendency and variability indexes describe a distribution
• There are descriptive statistics that tell us the relative standing or position of a score in a distribution
• Two types:
1. Standard Score
2. Percentile Rank
Standard Scores
• An index of relative standing of raw scores/values
• Each value is standardized using the mean and SD of the distribution
• Called a z-score
• z-distribution: mean = 0; sd = 1
• If a normal distribution is standardized, it is called the standardized or standard normal distribution
Standard Scores and Relative Standing: the z score
Z-score example
• Heart rate data from a sample has a mean = 65.21 and a sample SD = 4.50
• What is the z-score for a heart rate score of 70?
– z = (70 − 65.21) / 4.50 = 1.06
• What is the z-score for a heart rate score of 56?
– z = (56 − 65.21) / 4.50 = −2.05
• What is the probability that an individual has a heart rate between 56 and 70 bpm, given that heart rate follows a normal distribution with mean = 65.21 and SD = 4.50?
– Probabilities range from 0 to 1
P(−2.05 < z < 1.06) = .8554 − .0202 = .8352
We can roughly estimate by looking at the normal curve: 13.59 + 34.13 + 34.13 = 81.85%
This rough estimate only accounts for the area between 2 sd below and 1 sd above the mean
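The heart rate example can be reproduced with a short Python sketch. The helper name `z_score` is introduced here for illustration. Rounding each z to two decimals imitates a printed z-table lookup; the unrounded version is shown alongside because it differs slightly in the third decimal place.

```python
from statistics import NormalDist

MEAN_HR = 65.21   # sample mean heart rate (bpm)
SD_HR = 4.50      # sample standard deviation

def z_score(value, mean, sd):
    """Standardize a raw value: distance from the mean in sd units."""
    return (value - mean) / sd

z70 = z_score(70, MEAN_HR, SD_HR)
z56 = z_score(56, MEAN_HR, SD_HR)
print("z for 70:", round(z70, 2))
print("z for 56:", round(z56, 2))

standard_normal = NormalDist()  # mean 0, sd 1

# Table-style lookup: round z to two decimals first, as a printed table does.
table_prob = standard_normal.cdf(round(z70, 2)) - standard_normal.cdf(round(z56, 2))
print("P(56 < HR < 70), z rounded:", round(table_prob, 4))

# Same probability without rounding z.
exact_prob = standard_normal.cdf(z70) - standard_normal.cdf(z56)
print("P(56 < HR < 70), unrounded:", round(exact_prob, 4))
```

Both results sit a little above the 81.85% curve-reading estimate, because the actual z-scores extend slightly past −2 and +1.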