
Page 1: DATA ANALYSIS FOR RESEARCH AND APPLICATION

John J. Bennett, Fellow in Healthcare Systems Engineering and all-around Swell Guy

Page 2: Lean Six Sigma / VA-TAMMCS Crosswalk

[Diagram: a crosswalk of improvement frameworks. Lean Six Sigma DMAIC (Define, Measure, Analyze, Improve, Control) is mapped to VA-TAMMCS (Vision, Analysis, Team, Aim, Map, Measure, Change, Sustain/Spread) and to the PDCA cycle (Plan, Do, Check, Act), all oriented toward customer satisfaction.]

Page 3: Table of Contents

Variables
Types of Data
Hypothesis Testing
Types of Errors
Methods
Middle-Range Analysis
Testing Differences
Relationships Between Variables
When to Use Each Method
Correlation
Sample Size
Sampling Methods
Bias
Reliability
Validity
Philosophy of Four Worldviews
Alternate Strategies of Inquiry
Approaches Described
A Framework for Research Design
Pre-Experimental Designs
Quasi-Experimental Designs
True Experimental Designs
Qualitative Methodology
Aspects to Consider When Planning a Mixed Methods Design
Criteria for Choosing and Selecting Statistical Tests

Page 4: Philosophy of Four Worldviews

Postpositivism: determination; reductionism; empirical observation and measurement; theory verification
Constructivism: understanding; multiple participant meanings; social and historical construction; theory generation
Advocacy/Participatory: political; empowerment and issue-oriented; collaborative; change-oriented
Pragmatism: consequences of actions; problem-centered; pluralistic; real-world practice oriented

Page 5: Alternate Strategies of Inquiry

Quantitative: experimental designs; non-experimental designs, such as surveys
Mixed Methods: sequential; concurrent; transformative
Qualitative: narrative research; phenomenology; ethnographies; grounded theory; case study

Quantitative Methods: pre-determined, instrument-based questions; performance data, attitude data, observational data, and census data; statistical analysis; statistical interpretation
Mixed Methods: both pre-determined and emerging methods; both open- and closed-ended questions; multiple forms of data drawing on all possibilities; statistical and text analysis; across-databases interpretation
Qualitative Methods: emerging methods; open-ended questions; interview data, observation data, document data, and audio-visual data; text and image analysis; themes and patterns interpretation

Page 6: Approaches Described (Qualitative, Quantitative, and Mixed Methods Approaches)

Tend to or typically...

Use these philosophical assumptions. Qualitative: constructivist/advocacy/participatory knowledge claims. Quantitative: postpositivist knowledge claims. Mixed methods: pragmatic knowledge claims.

Employ these strategies of inquiry. Qualitative: phenomenology, grounded theory, ethnography, case study, and narrative. Quantitative: surveys and experiments. Mixed methods: sequential, concurrent, and transformative.

Employ these methods. Qualitative: open-ended questions, emerging approaches, text or image data. Quantitative: closed-ended questions, predetermined approaches, numeric data. Mixed methods: both open- and closed-ended questions, both emerging and predetermined approaches, and both quantitative and qualitative data and analysis.

Use these practices of research, as the researcher. Qualitative: positions himself or herself; collects participant meanings; focuses on a single concept or phenomenon; brings personal values into the study; studies the context or settings of participants; validates the accuracy of findings; makes interpretations of the data; creates an agenda for change or reform; collaborates with the participants. Quantitative: tests or verifies theories or explanations; identifies variables to study; relates variables in questions or hypotheses; uses standards of validity and reliability; observes and measures information numerically; uses unbiased approaches; employs statistical procedures. Mixed methods: collects both quantitative and qualitative data; develops a rationale for mixing; integrates the data at different stages of inquiry; presents visual pictures of the procedures in the study; employs the practices of both qualitative and quantitative research.

Page 7: A Framework for Research Design

[Figure: philosophical worldviews (postpositive, social construction, advocacy/participatory, pragmatic), selected strategies of inquiry (qualitative, quantitative, and mixed methods strategies), and research methods (questions, data collection, data analysis, interpretation, write-up, validation) combine to shape research designs.]

Deductive approach used in quantitative research: the researcher tests or verifies a theory; tests hypotheses or research questions from the theory; defines and operationalizes variables derived from the theory; and measures or observes variables using an instrument to obtain scores.

Inductive logic of research in a qualitative study: the researcher gathers information (e.g., interviews, observations); asks open-ended questions of participants or records field notes; analyzes data to form themes or categories; looks for broad patterns, generalizations, or theories from those themes or categories; and finally poses generalizations or theories, informed by past experiences and the literature.

Page 8: Pre-Experimental Designs

One-Shot Case Study
This design involves an exposure of a group to a treatment followed by a measure.
Group A:  X-----------O

One-Group Pre-Test/Post-Test Design
This design includes a pre-test measure followed by a treatment and a post-test for a single group.
Group A:  O1----------X----------O2

Static Group Comparison, or Post-Test Only with Nonequivalent Groups
Experimenters use this design after implementing a treatment. After the treatment, the researcher selects a comparison group and provides a post-test to both the experimental group(s) and the comparison group(s).
Group A:  X------------------------O
------------------------------------
Group B:  -------------------------O

Alternative Treatment Post-Test Only with Nonequivalent Groups Design
This design uses the same procedure as the Static Group Comparison, except that the nonequivalent comparison group receives a different treatment.
Group A:  X1-----------------------O
------------------------------------
Group B:  X2-----------------------O

Page 9: Quasi-Experimental Designs

Nonequivalent (Pre-Test and Post-Test) Control-Group Design
In this design, a popular approach to quasi-experiments, the experimental group A and the control group B are selected without random assignment. Both groups take a pre-test and a post-test. Only the experimental group receives the treatment.
Group A:  O----------X----------O
----------------------------------
Group B:  O---------------------O

Single-Group Interrupted Time-Series Design
In this design, the researcher records measures for a single group both before and after a treatment.
Group A:  O---O---O---O---X---O---O---O---O

Control-Group Interrupted Time-Series Design
A modification of the Single-Group Interrupted Time-Series Design in which two groups of participants, not randomly assigned, are observed over time. A treatment is administered to only one of the groups (i.e., Group A).
Group A:  O---O---O---O---X---O---O---O---O
--------------------------------------------
Group B:  O---O---O---O---O---O---O---O---O

Page 10: True Experimental Designs

Pre-Test/Post-Test Control-Group Design
A traditional, classical design, this procedure involves random assignment of participants to two groups. Both groups are administered a pre-test and a post-test, but the treatment is provided only to experimental Group A.
Group A:  R-------O-------X-------O
Group B:  R-------O---------------O

Post-Test-Only Control-Group Design
This design controls for any confounding effects of a pre-test and is a popular experimental design. The participants are randomly assigned to groups, a treatment is given only to the experimental group, and both groups are measured on the post-test.
Group A:  R-------X-------O
Group B:  R---------------O

Solomon Four-Group Design
A special case of a 2 x 2 factorial design, this procedure involves the random assignment of participants to four groups. Pre-tests and treatments are varied for the four groups. All groups receive a post-test.
Group A:  R----O----X----O
Group B:  R----O---------O
Group C:  R---------X----O
Group D:  R---------------O

A-B-A Single-Subject Design
This design involves multiple observations of a single individual. The target behavior of the individual is established over time and is referred to as the baseline behavior. The baseline behavior is assessed, the treatment is provided, and then the treatment is withdrawn.
Baseline A - Treatment B - Baseline A
O-O-O-O-O-O-X-X-X-X-X-X-X-O-O-O-O-O-O-O

Page 11: Aspects to Consider When Planning a Mixed Methods Design

Timing: no sequence (concurrent); sequential, qualitative first; sequential, quantitative first
Weighting: equal; qualitative; quantitative
Mixing: integrating; connecting; embedding
Theorizing: explicit; explicit/implicit; implicit

Page 12: Sequential Mixed Methods Designs

Sequential Explanatory Design: QUAN data collection and data analysis, followed by qual data collection and data analysis, followed by interpretation of the entire analysis.

Sequential Exploratory Design: QUAL data collection and data analysis, followed by quan data collection and data analysis, followed by interpretation of the entire analysis.

Sequential Transformative Design: either QUAN followed by qual, or QUAL followed by quan, framed within a social science theory, qualitative theory, or advocacy worldview.

Page 13: Concurrent Mixed Methods Designs

Concurrent Triangulation Design: QUAN and QUAL data are collected and analyzed at the same time (QUAN + QUAL), and the results of the two analyses are compared.

Concurrent Embedded Design: one form of data is nested within the other during a single collection phase (QUAN with embedded qual, or QUAL with embedded quan).

Concurrent Transformative Design: QUAN and QUAL data are collected concurrently, framed within a social science theory, qualitative theory, or advocacy worldview.

Page 14: Variables

Independent variable = (antecedent) the presumed cause of the dependent variable
Dependent variable = the presumed effect; the independent variable influences the dependent variable
Dichotomies = two-valued variables (such as male-female, correct-incorrect, young-old)
Polytomies = multi-valued variables (such as Catholic, Muslim, Jew, Buddhist, etc.)
Active variable = an experimental or manipulated variable
Attribute variable = a measured variable

Page 15: Moderator Variables

A variable is considered a moderator if it explains when the relationship between an independent variable and a dependent variable is larger or smaller.

[Diagram: the moderator variable acts on the path between the independent and dependent variables, changing the strength of their relationship.]

Page 16: Mediator Variable

A variable is considered a mediator if it explains how the relationship between an independent variable and a dependent variable occurs.

[Diagram: the independent variable influences the dependent variable through the mediator.]

Page 17: Types of Variable Data

Nominal = the numbers or symbols have no numeric meaning beyond indicating the presence or absence of the property or attribute being measured (e.g., 0 = male, 1 = female)
Ordinal = data that can be ordered in terms of importance, quantity, or similar hierarchical attributes
Interval = possesses the characteristics of nominal and ordinal data, with numerically equal distances on the scale representing equal distances in the property being measured
Ratio = in addition to possessing the characteristics of nominal, ordinal, and interval scales, has an absolute or natural zero that has empirical meaning

Page 18: Hypothesis Testing

Research hypothesis (H1) and null hypothesis (H0): H1 states that there is a difference (or relationship) between two variables; H0 states that there is no difference (or relationship) between the two variables.
H0 is the negation of H1. Note that H1 and H0 refer to populations.
A sample is used to test H1, and the results are interpreted probabilistically. The idea behind the scenes is that one sample is enough to determine whether H1, which states a difference or relationship in the population, can be supported statistically (this rests on the central limit theorem).
In the sample, a statistic is calculated: t for differences in the means of a variable between two groups (t-test), F for differences in the means of a variable among three or more groups (one-way ANOVA and MANOVA), and r for relationships between two variables (Pearson correlation coefficient).

Page 19: Hypothesis Testing (continued)

The question is: based on the results in the sample, can H1 be supported? The answer is probabilistic. A p value is determined from the calculated statistic (t, F, or r), the sample size (N), and the degrees of freedom (df); SPSS provides all the information you need, so tables are not required when using SPSS.
A level of significance is selected for the study; usually this level is .05 (the alpha level, i.e., the probability of committing a Type I error).
The calculated p value is then compared with the alpha level (assume it is .05):
If p < .05, H0 is rejected and therefore H1 is accepted.
If p >= .05, H0 cannot be rejected and therefore H1 cannot be supported.
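
As a concrete illustration of this decision rule (added here; not part of the original slides), the sketch below runs an independent-samples t-test in Python with SciPy. The scores are made-up example data, and the availability of scipy is an assumption.

    # Hypothetical example: compare the mean scores of two groups at alpha = .05.
    from scipy import stats

    group_a = [72, 85, 78, 90, 66, 81, 77, 88]   # fabricated scores, group A
    group_b = [64, 70, 59, 75, 68, 62, 71, 66]   # fabricated scores, group B

    alpha = 0.05                                 # chosen level of significance
    t_stat, p_value = stats.ttest_ind(group_a, group_b)

    if p_value < alpha:
        print(f"t = {t_stat:.2f}, p = {p_value:.4f}: reject H0, so H1 is supported")
    else:
        print(f"t = {t_stat:.2f}, p = {p_value:.4f}: fail to reject H0")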

Page 20: Type I and Type II Errors

A Type I error (alpha) is rejecting H0 when it should have been accepted.
A Type II error (beta) is accepting H0 when it should have been rejected. (Power = 1 - beta = the probability of correctly rejecting H0.)
Here is another way of remembering these errors. If, in your research, you make a Type I error, you incorrectly reject H0; therefore you incorrectly support H1, which you then publish. What a shame!
On the other hand, if you make a Type II error, you incorrectly fail to reject H0 and therefore incorrectly reject H1. You drop the research, incorrectly thinking that there was nothing worth publishing. All that work for nothing. What a pity!
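
A small simulation (an illustration added here, with arbitrary numbers) shows why alpha is read as the Type I error rate: when H0 is actually true, about 5% of samples still produce p < .05.

    # Simulate many studies in which H0 is true (both samples come from the same
    # population) and count how often a t-test would wrongly reject H0.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    n_studies, alpha = 10_000, 0.05
    false_rejections = 0

    for _ in range(n_studies):
        a = rng.normal(loc=100, scale=15, size=30)   # same population as b,
        b = rng.normal(loc=100, scale=15, size=30)   # so any "difference" is noise
        if stats.ttest_ind(a, b).pvalue < alpha:
            false_rejections += 1                    # a Type I error

    print(f"Observed Type I error rate: {false_rejections / n_studies:.3f}")  # about 0.05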

Page 21: Methods

Quantitative: used to describe trends, attitudes, or opinions of a population by studying a sample of that population (Creswell, 2003). A common tool is the survey; from the sample results, the researcher generalizes or makes claims about the population. An experiment tests the impact of a treatment or intervention on an outcome while all other factors are controlled, in order to isolate the variable in question.

Qualitative: a type of field study used where quantitative approaches cannot adequately capture the appropriate information (Kerlinger & Lee, 2000). It provides a deeper understanding of a process or experience and is more descriptive, addressing the 'why' behind observations. It uses direct observation and semi-structured interviews in the natural environment, and the researcher may even develop new hypotheses during the research process; the approach is flexible (Creswell, 2003).

Page 22: Data Analysis in Qualitative Research

[Figure: the typical flow of qualitative data analysis, from the bottom up: raw data (transcripts, field notes, images, etc.); organizing and preparing the data for analysis; reading through all the data; coding the data (by hand or computer); generating themes and descriptions; interrelating themes and descriptions (e.g., grounded theory, case study, phenomenology, etc.); interpreting the meaning of the themes and descriptions; and validating the accuracy of the information.]

Page 23: Criteria for Choosing Selected Statistical Tests

Nature of question | Number of independent variables | Number of dependent variables | Number of control variables (covariates) | Type of score (independent/dependent variables) | Distribution of scores | Statistical test
Group comparison | 1 | 1 | 0 | Categorical/Continuous | Normal | t-test
Group comparison | 1 or more | 1 | 0 | Categorical/Continuous | Normal | Analysis of variance
Group comparison | 1 or more | 1 | 1 | Categorical/Continuous | Normal | Analysis of covariance
Group comparison | 1 | 1 | 0 | Categorical/Continuous | Non-normal | Mann-Whitney U test
Association between groups | 1 | 1 | 0 | Categorical/Categorical | Non-normal | Chi-square
Relate variables | 1 | 1 | 0 | Continuous/Continuous | Normal | Pearson product-moment correlation
Relate variables | 2 or more | 1 | 0 | Continuous/Continuous | Normal | Multiple regression
Relate variables | 1 | 1 or more | 0 | Categorical/Categorical | Non-normal | Spearman rank-order correlation
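
To make the table concrete, here is a hypothetical helper (not from the slides; the choose_test function and its argument names are illustrative only) that encodes several rows of the table and returns the suggested test.

    # Illustrative lookup over a simplified subset of the rows above.
    def choose_test(question, n_iv, n_dv, n_cov, score_type, distribution):
        rows = [
            # (question, iv rule, dv, covariates, score type, distribution, test)
            ("group comparison", lambda k: k == 1, 1, 0, "categorical/continuous", "normal", "t-test"),
            ("group comparison", lambda k: k >= 1, 1, 0, "categorical/continuous", "normal", "analysis of variance"),
            ("group comparison", lambda k: k >= 1, 1, 1, "categorical/continuous", "normal", "analysis of covariance"),
            ("group comparison", lambda k: k == 1, 1, 0, "categorical/continuous", "non-normal", "Mann-Whitney U test"),
            ("relate variables", lambda k: k == 1, 1, 0, "continuous/continuous", "normal", "Pearson correlation"),
            ("relate variables", lambda k: k >= 2, 1, 0, "continuous/continuous", "normal", "multiple regression"),
        ]
        for q, iv_ok, dv, cov, score, dist, test in rows:
            if (q == question and iv_ok(n_iv) and n_dv == dv
                    and n_cov == cov and score == score_type and dist == distribution):
                return test
        return "no match in this simplified table"

    print(choose_test("group comparison", 3, 1, 0, "categorical/continuous", "normal"))
    # -> analysis of variance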

Page 24: Two Kinds of Operational Definitions

Measured = describes how a variable will be measured
Experimental = spells out the details of the investigator's manipulation of the variable

Page 25: Middle-Range Analysis

There are three general ways of relating theory to data:

Grand Theory: theory built from abstract concepts using deductive logic; mathematical theories are a good example. [theory to theory]
Middle-Range Analysis: theory and data are related in both directions, with theory guiding data collection and data refining the theory. [theory to data to theory]
Grounded Theory: theory inductively derived from systematic data collection and analysis; see Glaser & Strauss (1967) and Strauss & Corbin (1990) for more details on this type of qualitative research. [data to theory]

Page 26: Middle-Range Analysis Illustration

[Figure: at the theoretical level, the theoretic hypothesis links two concepts: "Innovativeness is positively related to cosmopoliteness." At the empirical level, the empirical hypothesis links the two corresponding operations: "Early adoption of hybrid corn is positively related to number of trips to Des Moines." Each concept is connected to its operation by an epistemic relationship.]

Page 27: Testing Differences Between Independent Groups

The t-test is used to compare two parametric samples with respect to their mean value on some variable of interest.
Nonparametric alternatives for this test are the Wald-Wolfowitz runs test, the Mann-Whitney U test, and the Kolmogorov-Smirnov two-sample test.
If we have multiple groups, we would use analysis of variance (ANOVA/MANOVA).
Nonparametric equivalents to this method are the Kruskal-Wallis analysis of ranks and the median test.
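
For reference, SciPy exposes most of the tests named above. A minimal sketch with fabricated data (added here; the group means are arbitrary, and scipy/numpy availability is assumed):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    a, b, c = (rng.normal(mean, 1.0, 25) for mean in (5.0, 5.5, 6.0))  # three made-up groups

    t, p_t = stats.ttest_ind(a, b)        # parametric: two independent groups
    u, p_u = stats.mannwhitneyu(a, b)     # nonparametric alternative
    d, p_ks = stats.ks_2samp(a, b)        # Kolmogorov-Smirnov two-sample test
    f, p_f = stats.f_oneway(a, b, c)      # one-way ANOVA for three or more groups
    h, p_kw = stats.kruskal(a, b, c)      # nonparametric: Kruskal-Wallis

    print(f"t-test p={p_t:.3f}, Mann-Whitney p={p_u:.3f}, K-S p={p_ks:.3f}")
    print(f"ANOVA p={p_f:.3f}, Kruskal-Wallis p={p_kw:.3f}")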

Page 28: Testing Differences Between Dependent Groups

If we want to compare two variables measured in the same sample, we would customarily use the t-test for dependent samples.
Nonparametric alternatives to this test are the sign test and Wilcoxon's matched pairs test. If the variables of interest are dichotomous in nature (e.g., "pass" vs. "no pass"), then McNemar's chi-square test is appropriate.
If more than two variables were measured in the same sample, we would customarily use repeated measures ANOVA.
Nonparametric alternatives to this method are Friedman's two-way analysis of variance and Cochran's Q test (if the variable was measured in terms of categories, e.g., "passed" vs. "failed"). Cochran's Q is particularly useful for measuring changes in frequencies (proportions) across time.
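
The same is true for the dependent-samples tests. This sketch (added for illustration) uses fabricated paired measurements, and the McNemar example additionally assumes statsmodels is installed.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    before = rng.normal(50, 10, 20)                 # made-up paired measurements
    after = before + rng.normal(2, 5, 20)
    third = before + rng.normal(4, 5, 20)

    t, p_t = stats.ttest_rel(before, after)         # t-test for dependent samples
    w, p_w = stats.wilcoxon(before, after)          # Wilcoxon matched pairs test
    f, p_f = stats.friedmanchisquare(before, after, third)  # Friedman's ANOVA by ranks
    print(f"paired t p={p_t:.3f}, Wilcoxon p={p_w:.3f}, Friedman p={p_f:.3f}")

    # McNemar's test for a paired dichotomous outcome ("pass" vs. "no pass").
    from statsmodels.stats.contingency_tables import mcnemar
    table = [[30, 10],   # counts: pass/pass, pass/fail (made-up)
             [5, 15]]    #         fail/pass, fail/fail
    print(f"McNemar p={mcnemar(table).pvalue:.3f}")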

Page 29: Relationships Between Variables

To express a relationship between two variables, one usually computes the correlation coefficient.
Nonparametric equivalents to the standard correlation coefficient are Spearman's R, Kendall's tau, and the coefficient gamma (see nonparametric correlations).
If the two variables of interest are categorical in nature (e.g., "passed" vs. "failed" by "male" vs. "female"), appropriate nonparametric statistics for testing the relationship between the two variables are the chi-square test, the phi coefficient, and Fisher's exact test.
In addition, a simultaneous test for relationships between multiple cases is available: Kendall's coefficient of concordance. This test is often used to express inter-rater agreement among independent judges who are rating (ranking) the same stimuli.
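
A short sketch of these measures with fabricated data (SciPy assumed available; the counts in the contingency table are invented for illustration):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    x = rng.normal(size=50)
    y = 0.6 * x + rng.normal(scale=0.8, size=50)    # loosely related made-up data

    r, p_r = stats.pearsonr(x, y)          # standard correlation coefficient
    rho, p_rho = stats.spearmanr(x, y)     # nonparametric: Spearman R
    tau, p_tau = stats.kendalltau(x, y)    # nonparametric: Kendall tau
    print(f"r={r:.2f}, rho={rho:.2f}, tau={tau:.2f}")

    # Two categorical variables ("passed"/"failed" by "male"/"female"):
    # chi-square and Fisher's exact test work on a table of counts.
    table = np.array([[18, 7],
                      [12, 13]])
    chi2, p_chi2, dof, expected = stats.chi2_contingency(table)
    odds, p_fisher = stats.fisher_exact(table)
    print(f"chi-square p={p_chi2:.3f}, Fisher exact p={p_fisher:.3f}")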

Page 30: When to Use Which Method

Nonparametric methods are most appropriate when sample sizes are small. When the data set is large (e.g., n > 100), it often makes little sense to use nonparametric statistics at all.
Each nonparametric procedure has its own sensitivities and blind spots. For example, the Kolmogorov-Smirnov two-sample test is not only sensitive to differences in the location of distributions (for example, differences in means) but is also greatly affected by differences in their shapes.
The Wilcoxon matched pairs test assumes that one can rank-order the magnitude of differences in matched observations in a meaningful manner. If this is not the case, one should use the sign test instead.
In general, if the result of a study is important (e.g., does a very expensive and painful drug therapy help people get better?), then it is advisable to run several different nonparametric tests; should discrepancies in the results occur depending on which test is used, one should try to understand why.
On the other hand, nonparametric statistics are less statistically powerful (sensitive) than their parametric counterparts, and if it is important to detect even small effects, one should be very careful in the choice of a test statistic.
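
The power difference is easy to see directly. The following sketch (illustrative only, with an arbitrary effect and sample size) runs a t-test and its nonparametric counterpart on the same small samples many times and compares how often each detects a real difference.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    trials, alpha = 2_000, 0.05
    hits_t = hits_u = 0

    for _ in range(trials):
        a = rng.normal(0.0, 1.0, 12)     # small samples with a real difference
        b = rng.normal(0.8, 1.0, 12)     # in population means (0.0 vs. 0.8)
        if stats.ttest_ind(a, b).pvalue < alpha:
            hits_t += 1
        if stats.mannwhitneyu(a, b).pvalue < alpha:
            hits_u += 1

    print(f"t-test power ~ {hits_t / trials:.2f}")        # usually slightly higher than
    print(f"Mann-Whitney power ~ {hits_u / trials:.2f}")  # the nonparametric test here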

Page 31: Correlation / Cause and Effect

The mathematical correlation between two variables is represented by an italicized, lowercase r.
The correlation coefficient r measures the strength of the relationship between any two variables. Values of r range between -1 and +1, and the sign (+ or -) gives the direction of the relationship.
A value of zero means there is no relationship; +1 indicates a perfect positive relationship (when one variable increases, the other increases), and -1 indicates a perfect negative relationship (one decreases as the other increases).
When two variables are correlated, it is tempting to assume a cause-and-effect relationship. This cannot be concluded from r alone; it requires consideration of supporting research and/or a sound theoretical argument.
It is possible to find what are known as spurious correlations due to "freaks of nature," coincidence, or "accidents."
Kerlinger & Lee (2000) argued that when an experimental design is properly executed, the researcher can determine a cause-and-effect relationship between the independent and dependent variables.
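
To see how a correlation can arise without any cause-and-effect link, here is a small simulation added for illustration: two made-up variables that never influence each other are both driven by a third, and r between them is nevertheless large.

    import numpy as np

    rng = np.random.default_rng(4)
    season = rng.normal(size=500)                        # a lurking third variable
    ice_cream_sales = 2.0 * season + rng.normal(size=500)
    sunburns = 1.5 * season + rng.normal(size=500)       # no causal link to sales

    r = np.corrcoef(ice_cream_sales, sunburns)[0, 1]
    print(f"r = {r:.2f}")   # strongly positive, driven entirely by the shared cause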

Page 32: Sample Size

Creswell (2003) suggested that the sample size for groups should be calculated using the level of significance, the amount of power desired for the study, and the effect size.
Hair et al. (2006) noted that a researcher must appreciate sample size considerations for multiple regression testing. Based on their recommendations, an effective sample size can be as low as 20 while maintaining a desired statistical power of .80 for simple regression testing.
Hair et al. recommend a minimum sample size of 50 for multiple regression testing but note that 100 is the preferred sample size.
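
The calculation Creswell describes (level of significance, desired power, and effect size in; required sample size out) can be sketched as follows. The use of statsmodels and the effect size of 0.5 are assumptions made for this example, not values from the slides.

    # Required sample size per group for an independent-samples t-test.
    from statsmodels.stats.power import TTestIndPower

    n_per_group = TTestIndPower().solve_power(
        effect_size=0.5,   # assumed medium effect (Cohen's d)
        alpha=0.05,        # level of significance
        power=0.80,        # desired statistical power
    )
    print(f"About {n_per_group:.0f} participants are needed per group.")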

Page 33: Sampling Methods

Sampling refers to taking a portion of the population that is representative of the total population (Kerlinger & Lee, 2000).
Random: each member of the population sampled has an equal chance of being selected (Kerlinger & Lee, 2000).
Stratified random: the population is divided into smaller groups, or strata, based on some shared characteristic within each stratum, and a random sample is taken from each stratum. The size of each sample is proportional to the stratum's size relative to the overall population, and the samples are then combined. Stratified random sampling helps ensure that the sample accurately reflects the population (Zikmund, 2003).
Cluster: successive random sampling of units, or sets, held together by some common characteristic (Kerlinger & Lee, 2000).
Convenience: individuals are asked to participate, so the sample is not random. Creswell (2003) describes this as a non-probability sample and as less desirable.
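
A proportional stratified random sample is straightforward to draw in code. This sketch uses pandas (an assumption; the slides do not specify any software) with a fabricated population and a made-up 'clinic' column as the stratum.

    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(5)
    # Fabricated population: 1,000 patients spread unevenly across three clinics.
    population = pd.DataFrame({
        "patient_id": range(1000),
        "clinic": rng.choice(["A", "B", "C"], size=1000, p=[0.5, 0.3, 0.2]),
    })

    # Draw 10% from each stratum so the sample mirrors the population's make-up.
    sample = population.groupby("clinic", group_keys=False).sample(frac=0.10, random_state=7)

    print(sample["clinic"].value_counts(normalize=True))  # roughly 0.5 / 0.3 / 0.2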

Page 34: Research Bias

Bias is unknown or unacknowledged error created during the design, measurement, sampling, procedure, or choice of problem studied.
There are two types of error associated with most forms of research: random and systematic. Random errors, i.e., those due to sampling variability or measurement precision, occur in essentially all quantitative studies and can be minimized but not avoided.
Systematic errors, or biases, are reproducible inaccuracies that produce a consistently false pattern of differences between observed and true values.
Both random and systematic errors can threaten the validity of any research study.
However, random errors can be readily quantified and addressed using statistical analysis; most systematic errors, or biases, cannot. This is because biases can arise from innumerable sources, including complex human factors. For this reason, avoiding systematic errors or biases is the task of proper research design.
A key difference between qualitative and quantitative research is the attempt to eliminate bias by the quantitative researcher versus the explicit acknowledgement of bias by the qualitative researcher.

Page 35: Major Categories of Bias

Selection biases, which may result in the subjects in the sample being unrepresentative of the population of interest
Measurement biases, which include issues related to how the outcome of interest was measured
Intervention (exposure) biases, which involve differences in how the treatment or intervention was carried out, or how subjects were exposed to the factor of interest

(Hartman et al., 2002)

Page 36: Selection Bias

Selection biases occur when the groups to be compared are different. These differences may influence the outcome. Common types of sample (subject selection) biases include volunteer or referral bias, and non-respondent bias. By definition, nonequivalent group designs also introduce selection bias.

Volunteer or referral bias. Volunteer or referral bias occurs because people who volunteer to participate in a study (or who are referred to it) are often different than non-volunteers/non-referrals. This bias usually, but not always, favors the treatment group, as volunteers tend to be more motivated and concerned about their health.

Non-respondent bias. Non-respondent bias occurs when those who do not respond to a survey differ in important ways from those who respond or participate. This bias can work in either direction.

Page 37: Measurement Bias

Measurement biases involve systematic error that can occur in collecting relevant data. Common measurement biases include instrument bias, insensitive measure bias, expectation bias, recall or memory bias, attention bias, and verification or work-up bias.

Instrument bias. Instrument bias occurs when calibration errors lead to inaccurate measurements being recorded, e.g., an unbalanced weight scale.

Insensitive measure bias. Insensitive measure bias occurs when the measurement tool(s) used are not sensitive enough to detect what might be important differences in the variable of interest.

Expectation bias. Expectation bias occurs in the absence of masking or blinding, when observers may err in measuring data toward the expected outcome. This bias usually favors the treatment group.

Recall or memory bias. Recall or memory bias can be a problem if outcomes being measured require that subjects recall past events. Often a person recalls positive events more than negative ones. Alternatively, certain subjects may be questioned more vigorously than others, thereby improving their recollections.

Attention bias. Attention bias occurs because people who are part of a study are usually aware of their involvement, and as a result of the attention received may give more favorable responses or perform better than people who are unaware of the study’s intent.

Verification or work-up bias. Verification or work-up bias is associated mainly with test validation studies. In these cases, if the sample used to assess a measurement tool (e.g., a diagnostic test) is restricted to those who have the condition or factor being measured, the sensitivity of the measure can be overestimated.

Page 38: Intervention Bias

Intervention or exposure biases generally are associated with research that compares groups. Common intervention biases include contamination bias, co-intervention bias, timing bias(es), compliance bias, withdrawal bias, and proficiency bias.

Contamination bias. Contamination bias occurs when members of the 'control' group inadvertently receive the treatment or are exposed to the intervention, thus potentially minimizing the difference in outcomes between the two groups.

Co-intervention bias. Co-intervention bias occurs when some subjects are receiving other (unaccounted for) interventions at the same time as the study treatment.

Timing bias(es). Different issues related to the timing of the intervention can introduce bias. If an intervention is provided over a long period of time, maturation alone could be the cause of improvement. If treatment is very short in duration, there may not have been sufficient time for a noticeable effect in the outcomes of interest.

Compliance bias. Compliance bias occurs when differences in subject adherence to the planned treatment regimen or intervention affect the study outcomes.

Withdrawal bias. Withdrawal bias occurs when subjects who leave the study (drop-outs) differ significantly from those that remain.

Proficiency bias. Proficiency bias occurs when the interventions or treatments are not applied equally to subjects. This may be due to skill or training differences among personnel and/or differences in resources or procedures used at different sites.

Page 39: Design Bias

Research design bias is introduced not when the study fails to control for threats to internal and external validity, but rather when the study fails to identify the validity problems, or when publicity about the research fails to incorporate the researcher's cautions.

Page 40: Measurement Bias

Measurement bias exists when the researcher fails to control for the effects of data collection and measurement, e.g., the tendency of people to give socially desirable answers; the use of self-report is often biased by social desirability.

Page 41: Sampling Bias

Sampling bias exists (beyond regression) when the sampling procedure introduces bias.
Key sampling problem #1: omission of women, Hispanics, or other minorities from samples, or studying only minorities.
Key sampling problem #2: targeting the most desirable or most accessible sample.

Page 42: Procedural Bias

Procedural bias exists most often when we administer the research interview or questionnaire under adverse conditions, for example by using students as subjects or by paying subjects.

Page 43: "Type III" Error or Problem Bias

Failing to acknowledge a Type I or Type II error influences the researcher to see a result that is not there.

Page 44: Other Biases

The intervention-related biases described on Page 38 (contamination, co-intervention, timing, compliance, withdrawal, and proficiency bias) also belong in this list of biases to watch for.

Page 45: Reliability

Reliability is the opposite of random error; it refers to the systematic or consistent portion of scores (Schwab, 1999).
Three contexts of reliability:
Internal consistency: applies to multiple items in a measure
Interrater reliability: applies to multiple observers or raters
Stability: applies to multiple time periods
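
Internal consistency is often summarized with Cronbach's alpha. Below is a minimal sketch of the textbook formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), applied to a fabricated respondents-by-items score matrix.

    import numpy as np

    def cronbach_alpha(items: np.ndarray) -> float:
        """items: a respondents-by-items matrix of scores."""
        k = items.shape[1]
        item_variances = items.var(axis=0, ddof=1)        # variance of each item
        total_variance = items.sum(axis=1).var(ddof=1)    # variance of the total scores
        return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

    rng = np.random.default_rng(6)
    trait = rng.normal(size=(10, 1))                      # shared underlying trait
    scores = trait + rng.normal(scale=0.5, size=(10, 5))  # five correlated items
    print(f"Cronbach's alpha = {cronbach_alpha(scores):.2f}")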

Page 46: Validity

Are you measuring what you think you are measuring?
Three types are important:
Content validity = sampling adequacy of the content (is the substance or content of this measure representative of the content being measured?)
Criterion-related validity = comparison of test or scale scores with one or more external variables known or believed to measure the attribute under study
Construct validity = the meaning of the test (what factors or constructs account for the variance in test performance?)

(Kerlinger & Lee, 2000)

Page 47: Selected References

Creswell, J. W. (2003). Research design: Qualitative, quantitative, and mixed methods approaches (2nd ed.). Thousand Oaks, CA: SAGE.
Glaser, B., & Strauss, A. (1967). The discovery of grounded theory. Chicago: Aldine.
Hair, J. F., et al. (2006). Multivariate data analysis (6th ed.). Upper Saddle River, NJ: Pearson Prentice Hall.
Hartman, J. M., Forsen, J. W., Wallace, M. S., & Neely, J. G. (2002). Tutorials in clinical research: Part IV: Recognizing and controlling bias. Laryngoscope, 112, 23-31.
Kerlinger, F. N., & Lee, H. B. (2000). Foundations of behavioral research (4th ed.). Fort Worth, TX: Harcourt College Publishers.
Patton ().
Rogers, E. M. (1995). The diffusion of innovations (4th ed.). New York, NY: Free Press.
Strauss, A., & Corbin, J. (1990). Basics of qualitative research: Grounded theory procedures and techniques. Newbury Park, CA: SAGE.