Handbook for Health Care Research, Second Edition Chapter 10 © 2010 Jones and Bartlett Publishers, LLC CHAPTER 10 Basic Statistical Concepts

Upload: emerald-johnston

Post on 18-Jan-2018


TRANSCRIPT

Page 1:

Handbook for Health Care Research, Second Edition Chapter 10

© 2010 Jones and Bartlett Publishers, LLC

CHAPTER 10 Basic Statistical Concepts

Page 2:

Preliminary Concepts

The following terms are used frequently in this chapter:

• Population- Collection of data or objects that describes some phenomenon of interest.
• Sample- Subset of a population that is accessible for measurement.
• Variable- Characteristic or entity that can take on different values.
• Qualitative variable- Categorical variable not placed on a meaningful number scale.
• Quantitative variable- One that is measurable using a meaningful scale of numbers.
• Discrete variable- Quantitative variable with gaps or interruptions in the values it may assume.
• Continuous variable- Quantitative variable that can take on any value, including fractional ones, limited only by instrumentation or application.

Page 3:

Levels of Measurement

• Numbers (0, 1, 2, …) have the following properties:
- Distinguishability: 0, 1, 2, and so on, are different numbers.
- Ranking (greater than or less than): 1 is less than 2.
- Equal intervals: Between 1 and 2, we assume the same distance as between 3 and 4.
• Nominal- Named categories without any particular order to them.
• Ordinal- Discrete categories that have an order to them, with no indication of equal intervals.
• Continuous (Interval)- Can assume any value, rather than just whole numbers, with equal, uniform intervals assumed.
• Continuous (Ratio)- The mathematically strongest level, where numbers represent equal intervals and the scale has a true zero point.

Page 4:

Significant Figures

• The number of digits used to express a measured number is a rough indication of the error.
• Zeros as significant figures:
- Final zeros to the right of the decimal point that are used to indicate accuracy are significant.
- For numbers less than one, zeros between the decimal point and the first nonzero digit are not significant.
• Calculations using significant figures- The least precise measurement used in a calculation determines the number of significant figures in the answer.
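The calculation rule can be illustrated with a short Python sketch. The helper name `round_sig` and the two measurements are assumptions made for illustration; they do not come from the chapter.

```python
import math

def round_sig(value: float, sig: int) -> float:
    """Round value to `sig` significant figures (hypothetical helper)."""
    if value == 0:
        return 0.0
    # Position of the leading digit: 2 for 123.4, -3 for 0.00456.
    exponent = math.floor(math.log10(abs(value)))
    return round(value, sig - 1 - exponent)

# Assumed measurements: 4.2 has two significant figures, 12.37 has four.
# The least precise factor (4.2) limits the product to two figures.
area = round_sig(4.2 * 12.37, 2)
print(area)
```

Leading zeros are handled by the exponent term, so a value such as 0.004567 keeps only its nonzero leading digits when rounded to two significant figures.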

Page 5:

Rounding

• Done so that you do not imply accuracy in the result that was not present in the measurements.
• Universal rounding rules:
- If the final digit of a number is 0, 1, 2, 3, or 4, the number is rounded down (the digit is dropped, and the preceding figure is retained unaltered).
- If the final digit is 5, 6, 7, 8, or 9, the number is rounded up (the digit is dropped, and the preceding figure is increased by one).
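A minimal Python sketch of the rounding rules above follows. It uses the standard `decimal` module because Python's built-in `round` sends a final 5 to the nearest even digit rather than always up. The helper name `round_half_up` and the sample values are assumptions for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(text: str, places: int) -> Decimal:
    """Round a decimal string to `places` decimals; a final 5 rounds up."""
    quantum = Decimal(1).scaleb(-places)  # places=1 gives Decimal("0.1")
    return Decimal(text).quantize(quantum, rounding=ROUND_HALF_UP)

print(round_half_up("7.34", 1))  # final digit 4: preceding figure unchanged
print(round_half_up("7.35", 1))  # final digit 5: preceding figure increased
print(round_half_up("2.5", 0))
```

Passing the number as a string avoids binary floating-point representation error before the rule is applied.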

Page 6:

Descriptive Statistics

• Methods for organizing data and reducing a large set of numbers to a few informative numbers.
• Data representation- The data set should be organized for inspection through use of a frequency distribution.
- Histogram- A bar graph in which the height of the bar indicates the frequency of occurrence of a value or class of values.
- Frequency polygon- A graph in which a point indicates the frequency of a value, and the points are connected to form a broken line (hence a polygon).
- Percentage- The numerical frequency on the Y-axis is replaced with the percentage of occurrence in this form of the polygon.
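As a rough sketch of a frequency distribution, the Python below tallies a small, made-up data set and prints a text stand-in for a histogram alongside the percentage form. The respiratory-rate values are hypothetical.

```python
from collections import Counter

# Hypothetical sample: respiratory rates (breaths/min) for 10 patients.
rates = [12, 14, 14, 16, 16, 16, 18, 18, 20, 14]

frequency = Counter(rates)  # value -> number of occurrences
total = len(rates)
percentage = {value: 100 * count / total for value, count in frequency.items()}

for value in sorted(frequency):
    bar = "#" * frequency[value]  # text stand-in for a histogram bar
    print(f"{value:>3} {bar:<5} {frequency[value]:>2} {percentage[value]:5.1f}%")
```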

Page 7:

Descriptive Statistics

• Percentile- The value of a variable in a data set below which a certain percent of observations fall.
• Cumulative percentage curve- This graph plots the cumulative percentage on the Y-axis against the values of the variable on the X-axis. The curve then describes the rate of accumulation for the values of the variable.
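The two ideas above can be sketched in Python as follows. The data are hypothetical, and `percent_below` is an illustrative helper that counts observations strictly below a value; other percentile conventions exist.

```python
from bisect import bisect_left
from collections import Counter
from itertools import accumulate

rates = [12, 14, 14, 16, 16, 16, 18, 18, 20, 14]  # hypothetical data
ordered = sorted(rates)

def percent_below(value: float) -> float:
    """Percentage of observations strictly below `value`."""
    return 100 * bisect_left(ordered, value) / len(ordered)

# Points of the cumulative percentage curve: (value, cumulative %).
counts = Counter(ordered)
values = sorted(counts)
running = accumulate(counts[v] for v in values)
cumulative = [(v, 100 * c / len(ordered)) for v, c in zip(values, running)]

print(percent_below(16))
print(cumulative)
```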

Page 8:

Measures of the Typical Value of a Set of Numbers

• Summation operator- Denoted by the Greek capital letter sigma (∑); it simply indicates addition over the values of a variable.
• Three statistics are used to represent the typical value (also called the central tendency):
- Mean- The sum of all observations divided by the number of observations.
- Median- The 50th percentile of a distribution, or the point that divides the distribution into equal halves.
- Mode- The most frequently occurring observation in the distribution.
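A brief Python illustration of the three measures, using the standard `statistics` module on a made-up data set:

```python
import statistics

observations = [2, 3, 3, 5, 7, 10]  # hypothetical data

# Mean: the summation operator applied to the data, divided by n.
mean = sum(observations) / len(observations)
median = statistics.median(observations)  # midpoint of the sorted values
mode = statistics.mode(observations)      # most frequent value

print(mean, median, mode)
```

With an even number of observations, `statistics.median` averages the two middle values, which is one common convention for the 50th percentile.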

Page 9:

Measures of Dispersion

• Dispersion indicates the variability, or how spread out the data are.
• Range- The distance between the smallest and the largest values of the variable.
• Variance- A measure of how different the values in a set of numbers are from each other. It is calculated as the average squared deviation of the values from the mean.
• Standard deviation- The square root of the variance; a typical deviation from the mean, in the original units.
• Coefficient of variation- Expresses the standard deviation as a percentage of the mean.
• Standard score (or z score)- A deviation from the mean expressed in units of standard deviation.
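The measures above can be computed with Python's `statistics` module, as in this sketch on hypothetical data. It assumes the "average squared deviation" form of the variance, which divides by n; the sample form, `statistics.variance`, divides by n − 1 and gives a slightly larger value.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

mean = statistics.fmean(values)
value_range = max(values) - min(values)
variance = statistics.pvariance(values)  # average squared deviation (divides by n)
std_dev = statistics.pstdev(values)      # square root of the variance
cv_percent = 100 * std_dev / mean        # standard deviation as % of the mean
z_scores = [(v - mean) / std_dev for v in values]

print(value_range, variance, std_dev, cv_percent, z_scores[-1])
```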

Page 10:

Propagation of Errors in Calculations and Correlation and Regression

• Propagation of errors in calculations- Often the physical quantity of interest is not measured directly but is rather a function of one or more measurements made in an experiment, so the errors in those measurements carry through to the calculated result.
• Correlation- A descriptive measure of the relationship or association between two variables.
• Regression- Assuming a linear relationship between two variables, we use the value of one variable to predict the value of the other.
- When we measure X and predict Y, Y is said to be regressed on X.
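As a sketch of correlation and regression, the Python below computes the Pearson correlation coefficient and the least-squares line for Y regressed on X. The data and the `predict` helper are hypothetical, and the choice of Pearson's r and ordinary least squares is an assumption, since the slide does not name specific methods.

```python
from math import sqrt

x = [1, 2, 3, 4, 5]  # hypothetical measured variable X
y = [2, 4, 5, 4, 5]  # hypothetical variable Y to be predicted

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

r = sxy / sqrt(sxx * syy)            # Pearson correlation coefficient
slope = sxy / sxx                    # least-squares slope of Y on X
intercept = mean_y - slope * mean_x  # line passes through (mean_x, mean_y)

def predict(new_x: float) -> float:
    """Predicted Y for a measured X, from the fitted line."""
    return intercept + slope * new_x

print(r, slope, intercept, predict(6))
```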

Page 11:

Inferential Statistics

• Although measuring a sample from a population is economical, we still wish to use the sample measurements (statistics) to make inferences about the population measures (parameters).
• Concept of probability- The probability of an event can be defined as the relative frequency, or proportion, of occurrence of that event out of some total number of events.
- Values lie between 0 and 1.
• Normal distribution and standard scores- For a normally distributed variable, the mean is at the center of the distribution, and therefore the mean is also the median and the mode.
- In a normal distribution, the z score for the mean must always be zero.

Page 12:

Normal Curve

[Figure: Approximate areas under the normal curve within one, two, and three standard deviations around the mean.]
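The areas the figure refers to can be reproduced numerically. This Python sketch uses the standard relation between the normal distribution and the error function; the helper name `area_within` is an assumption for illustration.

```python
from math import erf, sqrt

def area_within(k: float) -> float:
    """Area under the standard normal curve between -k and +k SDs."""
    return erf(k / sqrt(2))

areas = {k: area_within(k) for k in (1, 2, 3)}
for k, area in areas.items():
    print(f"within {k} SD: {100 * area:.1f}%")
```

The results are roughly 68%, 95%, and 99.7%, the figures usually quoted for one, two, and three standard deviations.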

Page 13:

Inferential Statistics

• Sampling distribution- The probability distribution of a statistic, and the most important concept in inferential statistics.
• Confidence interval- The range of values believed to contain the true parameter value, at a stated level of confidence.
• Error intervals- Describe the combined effects of systematic and random errors on individual measurements.
- We can also say something about how much confidence should be placed in the estimate.
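A confidence interval for a mean can be sketched in Python as below. This is a simplified version that assumes the sampling distribution of the mean is approximately normal and uses a z multiplier; for a sample this small, a t multiplier would give a wider interval. The readings and the function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    center = mean(sample)
    standard_error = stdev(sample) / sqrt(n)  # SD of the sampling distribution
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return center - z * standard_error, center + z * standard_error

readings = [98, 102, 101, 97, 100, 103, 99, 100]  # hypothetical measurements
lower, upper = mean_confidence_interval(readings)
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```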

Page 14:

Inferential Statistics

• Data Analysis for Device Evaluation Studies
- Step 1- Create a scatter plot of the raw data to get a subjective impression of their validity.
- Step 2- Make sure the data comply with the assumption of normality.
- Step 3- Once the data are judged to conform to the underlying assumptions, use the mean and standard deviation to calculate error intervals.
- Step 4- Present the data in graphic form, labeled with the numerical values for the error intervals.

Page 15:

Inferential Statistics

• Interpreting manufacturers’ error specifications- When evaluating a new device, a major concern is how much error can be expected in normal use.
- Knowing that any specification of error is just an estimate, we want to know how much confidence to place in it.
- Manufacturers can be rather obscure about their error specifications.
• Hypothesis testing- A technique for quantifying our guess about a hypothesis. We never know the “real” situation.
- Does drug X cause Y or not? We can figure the odds and quantify our probability of being right or wrong.
- Chance difference

Page 16:

Inferential Statistics

• Type I and II Errors
– Type I- The error of rejecting the null hypothesis when it is true.
– Type II- The error of failing to reject (accepting) a false null hypothesis.
• Power Analysis and Sample Size- Power is the probability of correctly rejecting a false null hypothesis.
– The most practical means to control power is to manipulate sample size.
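To show how sample size drives power, here is a sketch for one specific and deliberately simple case: a one-sided, one-sample z test with known standard deviation. The test choice, the effect size, and the function name `z_test_power` are assumptions for illustration, not the chapter's procedure.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(delta: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Power of a one-sided, one-sample z test (hypothetical helper).

    delta: true difference from the null value
    sigma: known population standard deviation
    alpha: accepted Type I error rate
    """
    standard_normal = NormalDist()
    critical = standard_normal.inv_cdf(1 - alpha)  # rejection cutoff in z units
    shift = delta * sqrt(n) / sigma                # true difference in SE units
    return 1 - standard_normal.cdf(critical - shift)

powers = {n: z_test_power(delta=0.5, sigma=1.0, n=n) for n in (10, 25, 50)}
for n, power in powers.items():
    print(f"n = {n:>2}: power = {power:.2f}")
```

With the difference and alpha held fixed, power rises as n grows. When the true difference is zero, the same function returns alpha, the Type I error rate.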

Page 17:

Inferential Statistics

• Rules of Thumb for Estimating Sample Size
- Estimates Based on Mean and Standard Deviation
- Estimates Based on Proportionate Change and Coefficient of Variation
- Estimates for Confidence Intervals
- Sample Size for Binomial Test
- Unequal Sample Sizes
- Rule of Threes
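The slide only names these rules, so the sketch below covers just the last one, using the commonly stated form of the rule of three: if no events are seen in n independent trials, the upper 95% confidence bound on the event rate is roughly 3/n. The chapter's own wording may differ, and both function names are hypothetical.

```python
def rule_of_three_upper(n: int) -> float:
    """Approximate 95% upper bound on an event rate after 0 events in n trials."""
    return 3 / n

def exact_upper(n: int, confidence: float = 0.95) -> float:
    """Exact bound: the rate p at which (1 - p)**n equals 1 - confidence."""
    return 1 - (1 - confidence) ** (1 / n)

for n in (30, 100, 1000):
    print(n, round(rule_of_three_upper(n), 4), round(exact_upper(n), 4))
```

The approximation is slightly conservative and tracks the exact bound more closely as n increases.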

Page 18:

Inferential Statistics

• Clinical Importance Versus Statistical Significance
– The size of the test statistic for a given difference is determined by the standard error, which in turn is determined by the sample size.
– If the difference between two mean values (treatment group vs. control group) is statistically significant but so small that it does not have any practical effect, then we must conclude that the results are not clinically important.
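The dependence of the test statistic on sample size can be seen in a small Python sketch. It assumes two equal-sized groups with a common standard deviation and a z-type statistic; the numbers and the function name are hypothetical.

```python
from math import sqrt

def two_sample_z(difference: float, sd: float, n_per_group: int) -> float:
    """Test statistic for a difference in means with equal group sizes."""
    standard_error = sd * sqrt(2 / n_per_group)
    return difference / standard_error

# The same 1-unit difference (SD = 10) at two sample sizes.
z_small = two_sample_z(1.0, 10.0, 50)
z_large = two_sample_z(1.0, 10.0, 5000)
print(z_small, z_large)
```

The difference between groups is identical in both cases; only the larger sample pushes the statistic past conventional cutoffs, which says nothing about whether a 1-unit change matters clinically.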

Page 19:

Inferential Statistics

• Matched Versus Unmatched Data
– Unmatched (or unpaired or independent) data- Values in one group are unrelated in any way to the data values in the other group.
– Matched (or paired or dependent) data- Data are selected so that they will be as nearly identical as possible.
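The practical consequence of the distinction is that the two kinds of data are analyzed differently. The sketch below compares a paired t statistic with an unpaired (unequal-variance) t statistic on the same hypothetical before/after readings; the choice of t statistics is an assumption, since the slide does not name a test.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical readings on the same five subjects, before and after treatment.
before = [120, 130, 140, 150, 160]
after = [118, 127, 137, 148, 157]
n = len(before)

# Paired analysis: work with each subject's own difference.
differences = [b - a for b, a in zip(before, after)]
t_paired = mean(differences) / sqrt(variance(differences) / n)

# Unpaired analysis: treat the two columns as independent groups.
t_unpaired = (mean(before) - mean(after)) / sqrt(
    variance(before) / n + variance(after) / n
)

print(t_paired, t_unpaired)
```

Because subject-to-subject variation cancels in the differences, the paired statistic is far larger here than the unpaired one for the same mean change.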
