
Chapter 7: Experiments

Introduction

Experimental research is most appropriate for answering research questions about the effect of a treatment or some other variable whose values can be manipulated by the researcher.

It is the most powerful design for testing causal hypotheses.

True Experiments

True experiments must have at least three things:

1. Two groups (in the simplest case, an experimental and a control group).

2. Variation in the independent variable before assessment of change in the dependent variable.

3. Random assignment to the two (or more) comparison groups.

True Experiments, cont.

The combination of these features permits us to have much greater confidence in the validity of causal conclusions than is possible in other research designs.

Our confidence in the validity of an experiment’s findings is further enhanced by two more things:

1. Identification of the causal mechanism

2. Control over the context of an experiment

Experiments and Comparison Groups

True experiments must have at least one experimental group (subjects who receive some treatment) and at least one comparison group (subjects to whom the experimental group can be compared).

The comparison group differs from the experimental group in terms of one or more independent variables whose effects are being tested. In other words, the difference between the experimental and comparison groups is determined by variation in the independent variable.

Experimental and Comparison Groups, cont.

Experimental group. In an experiment, the group of subjects that receives the treatment or experimental manipulation.

Comparison group. In an experiment, a group that has been exposed to a different treatment (or value of the independent variable) than the experimental group.

In many experiments, the independent variable indicates the presence or absence of something, such as receiving a treatment program or not receiving it.

In these experiments, the comparison group, consisting of the subjects who do not receive the treatment, is termed a control group.

Pretest and Posttest Measures

All true experiments have a posttest—that is, measurement of the outcome in both groups after the experimental group has received the treatment.

Many true experiments also have pretests that measure the dependent variable prior to the experimental intervention.

A pretest is exactly the same as a posttest, just administered at a different time.

Pretest scores permit a direct measure of how much the experimental and comparison groups changed over time.
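To make the idea of measuring change concrete, here is a minimal sketch in Python. The scores are invented for illustration, and the helper name `mean_change` is an assumption of this sketch, not something taken from the chapter. It computes the average pretest-to-posttest change in each group and then the difference between those two changes.

```python
from statistics import mean


def mean_change(pretest, posttest):
    """Average posttest-minus-pretest change for one group of subjects."""
    if len(pretest) != len(posttest):
        raise ValueError("each subject needs both a pretest and a posttest score")
    return mean(post - pre for pre, post in zip(pretest, posttest))


# Hypothetical scores, invented purely for illustration.
experimental_pre = [10, 12, 11, 13]
experimental_post = [15, 16, 14, 19]
comparison_pre = [11, 12, 10, 13]
comparison_post = [12, 13, 10, 15]

experimental_change = mean_change(experimental_pre, experimental_post)
comparison_change = mean_change(comparison_pre, comparison_post)

# How much more the experimental group changed than the comparison group.
estimated_effect = experimental_change - comparison_change
print(experimental_change, comparison_change, estimated_effect)
```

Subtracting the comparison group's change is one simple way to separate the treatment from change that would have happened anyway. A real analysis would also test whether the difference is larger than chance alone would produce.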

Pretest and Posttest Measures, cont.

A randomized experimental design with a pretest and posttest is termed a pretest-posttest control group design.

An experiment may have multiple posttests and perhaps even multiple pretests.

Multiple posttests can identify just when the treatment has its effect and for how long.

This is particularly important when treatments are delivered over a period of time.

Exhibit 7.2

Randomization

Randomization, or random assignment, is what makes the comparison group in a true experiment such a powerful tool for identifying the effects of the treatment. (This is not the same as random sampling.)

Randomization, cont.

A researcher cannot determine for sure what the unique effects of a treatment are if the comparison group differs from the experimental group in any way other than not receiving the treatment.

Assigning subjects randomly to the experimental and comparison groups ensures that systematic bias does not affect the assignment of subjects to groups.

What random assignment does—create two (or more) equivalent groups—is useful for maximizing the likelihood of internal validity, not generalizability.
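Random assignment can be sketched in a few lines of Python. This is a minimal illustration, assuming a simple two-group split of a list of already recruited subjects. The function name `randomly_assign`, the subject labels, and the seed are all hypothetical.

```python
import random


def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into two groups by chance alone."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(subjects)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical subject labels, for illustration only.
subjects = [f"subject_{i}" for i in range(1, 21)]
experimental, comparison = randomly_assign(subjects, seed=7)
print(len(experimental), len(comparison))
```

The function takes whatever subjects it is given and only decides who goes into which group. It says nothing about how those subjects were selected from a larger population, which is why random assignment supports internal validity rather than generalizability.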

Exhibit 7.3

Randomization, cont.

Matching is another procedure sometimes used to equate experimental and comparison groups, but by itself it is a poor substitute for randomization.

Matching of individuals in a treatment group with those in a comparison group might involve pairing persons on the basis of similarity of gender, age, years in school, or some other characteristic.

The basic problem is that, as a practical matter, individuals can be matched on only a few characteristics; unmatched differences between the experimental and comparison groups may still influence outcomes.
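The following sketch shows one simple way individual matching could be carried out and where it falls short. It assumes matching on exact gender and on age within a small tolerance. The records, field names, the `max_age_gap` parameter, and the function `match_individuals` are all invented for illustration.

```python
def match_individuals(treatment, pool, max_age_gap=2):
    """Pair each treated person with the closest-aged unused person of the same gender."""
    available = list(pool)
    pairs, unmatched = [], []
    for person in treatment:
        candidates = [
            c for c in available
            if c["gender"] == person["gender"]
            and abs(c["age"] - person["age"]) <= max_age_gap
        ]
        if not candidates:
            unmatched.append(person["id"])
            continue
        best = min(candidates, key=lambda c: abs(c["age"] - person["age"]))
        available.remove(best)  # each comparison case is used at most once
        pairs.append((person["id"], best["id"]))
    return pairs, unmatched


# Hypothetical records, for illustration only.
treatment_group = [
    {"id": "T1", "gender": "F", "age": 20},
    {"id": "T2", "gender": "M", "age": 25},
    {"id": "T3", "gender": "F", "age": 40},
]
comparison_pool = [
    {"id": "C1", "gender": "F", "age": 21},
    {"id": "C2", "gender": "M", "age": 24},
    {"id": "C3", "gender": "M", "age": 30},
    {"id": "C4", "gender": "F", "age": 33},
]

pairs, unmatched = match_individuals(treatment_group, comparison_pool)
print(pairs, unmatched)
```

Even with only two matching characteristics, one treated case here finds no acceptable partner. The matched pairs may also still differ on anything that was not used for matching, such as years in school.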

Exhibit 7.4

Limitations of True Experimental Designs

The hypothesis test itself does not require any analysis of mechanism, and if the experiment was conducted under carefully controlled conditions during a limited span of time, the causal effect (if any) may seem to be quite direct.

True experimental designs also do not guarantee that the researcher has been able to maintain control over the conditions to which subjects are exposed after they are assigned to the experimental and comparison groups.

Limitations of True Experimental Designs

Association between the hypothesized independent and dependent variables. Experiments can provide unambiguous evidence of association by comparing the experimental and comparison groups on the dependent variable.

Limitations of True Experimental Designs, cont.

Time order of effects of one variable on the others. In experiments with a pretest, time order can be established by comparing posttest to pretest scores.

In experiments with random assignment of subjects to the experimental and comparison groups, time order can be established by comparison of posttest scores only.

Limitations of True Experimental Designs, cont.

Context in which change occurs. Control over conditions is more feasible in many experimental designs than it is in nonexperimental designs. Bear in mind that it is often difficult to control conditions in experiments conducted outside of a laboratory setting.

Quasi-Experiments

Often, testing a hypothesis with a true experimental design is not feasible with the desired subjects and in the desired setting.

Such a test may be too costly or take too long to carry out; it may not be ethical to randomly assign subjects to the different conditions; or it may be too late to do so.

Researchers may instead use “quasi-experimental” designs that retain several components of experimental design but differ in important details.

Quasi-Experiments, cont.

A quasi-experimental design is one in which the comparison group is predetermined to be comparable to the treatment group in critical ways, such as being eligible for the same services or being in the same school cohort (Rossi & Freeman 1989:313).

These research designs are only “quasi” experimental because subjects are not randomly assigned to the comparison and experimental groups.

As a result, we cannot be as confident in the comparability of the groups as in true experimental designs.

Two Major Types of Quasi-Experimental Designs

Nonequivalent control group designs. Nonequivalent control group designs have experimental and comparison groups that are designated before the treatment occurs and are not created by random assignment.

Before-and-after designs. Before-and-after designs have a pretest and posttest but no comparison group. In other words, the subjects exposed to the treatment serve, at an earlier time, as their own controls.

Nonequivalent Control Group Designs

In this type of quasi-experimental design, a comparison group is selected to be as comparable as possible to the treatment group.

Individual Matching

In individual matching, individual cases in the treatment group are matched with similar individuals in the comparison group.

In some situations, this can create a comparison group that is very similar to the experimental group, as when Head Start participants were matched with their siblings in order to estimate the effect of participation in Head Start (Currie & Thomas 1995:341).

However, in many studies, it may not be possible to match on the most important variables.

Aggregate Matching

In most situations when random assignment is not possible, the second method of matching makes more sense: identifying a comparison group that matches the treatment group in the aggregate rather than trying to match individual cases.

This means finding a comparison group that has similar distributions on key variables: the same average age, the same percentage female, and so on.

For this design to be considered quasi-experimental, however, individuals must not have been able to choose whether to be in the treatment group or the control group.
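Checking whether two groups match in the aggregate amounts to comparing their distributions on the key variables. The sketch below compares average age and percentage female for two groups. The records, field names, and the helpers `group_profile` and `aggregate_gaps` are assumptions made for illustration.

```python
from statistics import mean


def group_profile(group):
    """Summarize a group on the key matching variables."""
    return {
        "mean_age": mean(person["age"] for person in group),
        "pct_female": 100 * sum(person["female"] for person in group) / len(group),
    }


def aggregate_gaps(treatment, comparison):
    """Absolute difference between the two groups on each summary measure."""
    t, c = group_profile(treatment), group_profile(comparison)
    return {key: abs(t[key] - c[key]) for key in t}


# Hypothetical records, for illustration only.
treatment_group = [
    {"age": 20, "female": True},
    {"age": 30, "female": False},
    {"age": 40, "female": True},
    {"age": 50, "female": False},
]
candidate_comparison = [
    {"age": 26, "female": True},
    {"age": 34, "female": True},
    {"age": 36, "female": False},
    {"age": 48, "female": True},
]

gaps = aggregate_gaps(treatment_group, candidate_comparison)
print(gaps)
```

In this made-up case the groups are close on average age but far apart on percentage female, so the candidate would be a weak aggregate match. How small a gap counts as "similar" is a judgment the researcher has to make and defend.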

Before-and-After Designs

The common feature of before-and-after designs is the absence of a comparison group.

Because all cases are exposed to the experimental treatment, the basis for comparison is provided by comparing the pretest to the posttest measures.

These designs are thus useful for studies of interventions that are experienced by virtually every case in some population.

Before-and-After Designs, cont.

Types of before-and-after designs

Multiple group before-and-after design. In this design, several before-and-after comparisons are made involving the same variables but different groups.

Repeated measures panel designs. These include several pretest and posttest observations.

Time series designs. These include many (preferably 30 or more) such observations in both pretest and posttest periods.

Summary: Causality in Quasi-Experiments

Let’s now examine how well quasi-experiments meet the criteria for identifying a cause.

Association between the hypothesized independent and dependent variables. Quasi-experiments can provide evidence of association between the independent and dependent variables that is as unambiguous as that provided by a true experiment.

Summary: Causality in Quasi-Experiments, cont.

Time order of effects of one variable on the others. This is a strength of the various quasi-experimental before-and-after designs.

Nonspurious relationships between variables. We cannot be certain of ruling out all potentially extraneous influences with either type of quasi-experimental design, but it is important to note that the criteria for these designs do give us considerable confidence that most extraneous influences could not have occurred.

Validity in Experiments

Like any research design, experimental designs must be evaluated in terms of their ability to yield valid conclusions.

True experiments are particularly well suited to producing valid conclusions about causality (internal validity), but they are likely to fare less well in achieving generalizability.

Causal (Internal) Validity

An experiment’s ability to yield valid conclusions about causal effects is determined by the comparability of its experimental and comparison groups.

First, of course, a comparison group must be created.

Second, this comparison group must be so similar to the experimental group or groups that it shows what the experimental group would be like if it had not received the experimental treatment—if the independent variable had not varied.

Causal (Internal) Validity, cont.

There are four basic sources of noncomparability (other than the treatment) between a comparison group and an experimental group.

They produce four of the five sources of internal invalidity:

1. Selection bias. When characteristics of the experimental and comparison group subjects differ.

2. Endogenous change. When the subjects develop or change during the experiment as part of an ongoing process independent of the experimental treatment.

3. History effects. When something occurs during the experiment, other than the treatment, which influences outcome scores.

4. Contamination. When either the experimental group or the comparison group is aware of the other group and is influenced in the posttest as a result (Mohr 1992:64).

5. Treatment misidentification. Variation in the independent variable (the treatment) is associated with variation in the observed outcome, but the change occurs through a process that the researcher has not identified.

Causal (Internal) Validity, cont.

Generalizability

The need for generalizable findings can be thought of as the Achilles heel of true experimental design.

The design components that are essential for a true experiment and that minimize the threats to causal validity make it much more difficult to achieve sample generalizability (being able to apply the findings to some clearly defined larger population).

Generalizability is not a strong point of experiments, but it is a strong point of survey research.

Sample Generalizability

Subjects who can be recruited for a laboratory experiment, randomly assigned to a group, and kept under carefully controlled conditions for the study’s duration are unlikely to be a representative sample of any large population of interest to social scientists.

The more artificial the experimental arrangements are, the greater the problem will be (Campbell & Stanley 1966:20–21).

External Validity

Researchers are often interested in determining whether treatment effects identified in an experiment hold true for subgroups of subjects and across different populations, times, or settings.

Of course, determining that a relationship between the treatment and the outcome variable holds true for certain subgroups does not establish that the relationship also holds true for these subgroups in the larger population, but it suggests that the relationship might be externally valid.

There is always an implicit trade-off in experimental design between maximizing causal validity and generalizability.

Interaction of Testing and Treatment

A variant on the problem of external validity occurs when the experimental treatment has an effect only when particular conditions created by the experiment occur.

One such problem occurs when the treatment has an effect only if subjects have had the pretest.

The pretest sensitizes the subjects to some issue, so that when they are exposed to the treatment, they react in a way that differs from how they would have reacted had they not taken the pretest.

In other words, testing and treatment interact to produce the outcome.

Ethical Issues in Experimental Research

Social science experiments often raise difficult ethical issues.

In spite of the ethical standard of “informed consent” by subjects, deception is an essential part of many experimental designs.

As a result, contentious debate continues about the interpretation of this standard.

Experimental evaluations of social programs also pose ethical dilemmas because they require researchers to withhold possibly beneficial treatment from some of the subjects just on the basis of chance (Boruch 1997).

Deception

Deception is used in social experiments to create more “realistic” treatments, often within the confines of a laboratory.

The American Sociological Association’s Code of Ethics (1997) does not discuss experimentation explicitly, but it does highlight the ethical dilemma posed by deceptive research.

The ASA approach is to allow deception when it is unlikely to cause harm, is necessary for the research, and is followed by adequate explanation after the experiment is over.

Selective Distribution of Benefits

Field experiments conducted to evaluate social programs also can involve issues of informed consent (Hunt 1985:275–276).

One ethical issue that is somewhat unique to field experiments is the distribution of benefits: How much are subjects harmed by the way treatments are distributed in the experiment?

Is it ethical to give some potentially advantageous or disadvantageous treatment to people on a random basis?

Random distribution of benefits is justified when the researchers do not know whether some treatment actually is beneficial or not—and, of course, it is the goal of the experiment to find out.

Conclusions

True experiments are the best research design for testing causal hypotheses.

Even when conditions preclude use of a true experimental design, many research designs can be improved by adding some experimental components.

Conclusions, cont.

In spite of their obvious strengths, true experiments are used infrequently to study many of the research problems that interest social scientists.

There are three basic reasons:

1. The experiments required to test many important hypotheses require far more resources than most social scientists have access to.

2. Most of the research problems of interest to social scientists simply are not amenable to experimental designs, for reasons ranging from ethical considerations to the limited possibilities for randomly assigning people to different conditions in the real world. (Experimental designs are better suited to medical research.)

Conclusions, cont.

3. The requirements of experimental design usually preclude large-scale studies and so limit generalizability to a degree that is unacceptable to many social scientists. Survey research often works better for social scientists in fields such as psychology and sociology.

Conclusions, cont.

Even laboratory experiments are inadvisable when they do not test the real hypothesis of interest but test instead a limited version amenable to laboratory manipulation.

Conclusions, cont.

The intersecting complexity of societies, social relationships, and social beings—of people and the groups to which they belong—is so great that it often defies reduction to the simplicity of a laboratory or restriction to the requirements of experimental design.

Yet, the virtues of experimental designs mean that they should always be considered when explanatory research is planned.
