Experiment Notes

Y520 Spring 2000 Page 1

Michael Y520

    Experimental Method

    "The best method, indeed the only fully compelling method, of establishing causation is to conduct a carefully

    designed experiment in which the effects of possible lurking variables are controlled. To experiment means to

    actively change x and to observe the response in y" (p. 202).

    Moore, D., & McCabe, D. (1993). Introduction to the practice of statistics. New York: Freeman.

    "The experimental method is the only method of research that can truly test hypotheses concerning cause-and-effect

    relationships. It represents the most valid approach to the solution of educational problems, both practical and

    theoretical, and to the advancement of education as a science" (p. 298).

    Gay, L. R. (1992). Educational research (4th ed.). New York: Merrill.

    Importance of Good Design: (http://www.tufts.edu/~gdallal/study.htm)

    "100% of all disasters are failures of design, not analysis." Ron Marks, Toronto, August 16, 1994.

    "To propose that poor design can be corrected by subtle [statistical] analysis techniques is contrary to good scientific

    thinking." Stuart Pocock (Controlled Clinical Trials, p. 58), regarding the use of retrospective adjustment for trials

    with historical controls.

    "Issues of design always trump issues of analysis." G. E. Dallal, 1999, explaining why it would be wasted effort to

    focus on the analysis of data from a study under challenge whose design was fatally flawed.

    Unique Features of Experiments:

    1. The investigator manipulates a variable directly (the independent variable).

    2. Empirical observations based on experiments provide the strongest argument for cause-effect relationships.

    Additional features:

    1. Problem statement → theory → constructs → operational definitions → variables → hypotheses.

    2. The research question (hypothesis) is often stated as the alternative to a null hypothesis, which is

    used to interpret differences in the empirical data.

    3. Random sampling of subjects from the population (ensures the sample is representative of the population).

    4. Random assignment of subjects to treatment and control (comparison) groups (ensures equivalency of groups;

    i.e., unknown variables that may influence the outcome are equally distributed across groups).

    5. Extraneous variables are controlled by 3 & 4 and other procedures if needed.

    6. After treatment, performance of subjects (dependent variable) in both groups is compared.
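    The random-assignment step (features 4 and 6 above) can be sketched in a few lines of Python. The pool of 20 subject IDs and the even split are hypothetical details for illustration; the point is that shuffling, not the researcher's judgment, determines group membership.

```python
import random

def randomly_assign(subjects, seed=None):
    """Shuffle the subject pool and split it evenly into treatment
    and control groups. In expectation, shuffling distributes unknown
    subject characteristics equally across the two groups."""
    rng = random.Random(seed)
    pool = list(subjects)
    rng.shuffle(pool)
    half = len(pool) // 2
    return pool[:half], pool[half:]

# Hypothetical pool of 20 subject IDs.
treatment, control = randomly_assign(range(20), seed=0)
```

    After treatment, the dependent variable would be measured in both groups and compared (feature 6).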

    Ways to control extraneous variables:

    1. Random assignment of subjects to groups. This is the best way to control extraneous variables in

    experimental research. Provides control for subject characteristics, maturation, and statistical regression.

    2. Variables that may still exist:

    a. Subject mortality (i.e., dropouts due to treatment)

    b. Hawthorne effect

    c. Fidelity of treatment (manipulation check)

    d. Data collector bias (double-blind studies)

    e. Location, history

    3. Additional procedures for controlling extraneous variables (use as needed)

    a. Exclude certain variables.

    b. Blocking.

    c. Matching subjects on certain characteristics.

    d. Use subject as own control.

    e. Analysis of covariance.

    Y520 Spring 2000 Page 2

    True Experimental Designs

    A. Randomized Post-test Only Control Group Design

    Treatment R X1 O R = random assignment

    Comparison R X2 O X = Treatment occurs for X1 only

    O = Observation (dependent variable)

    This is the best of all designs for experimental research. Random assignment controls for subject characteristics,
    maturation, and statistical regression.

    Potential threats not controlled: subject mortality, Hawthorne effect, fidelity of treatment, data collection bias, unique

    features of location, history of subjects.

    B. Randomized Pretest Post-test Control Group Design

    Treatment R O1 X1 O2 R = random assignment

    Comparison R O1 X2 O2 X = Treatment occurs for X1 only

    O1 = Observation (Pre-test)

    O2 = Observation (Post-test, dependent variable)

    Potential threat: effect of pre-testing.

    C. Randomized Solomon Four Group Design

    Treatment R O1 X1 O2 R = random assignment

    Comparison R O1 X2 O2 X = Treatment occurs for X1 only

    O1 = Observation (Pre-test)

    Treatment R X1 O2

    Comparison R X2 O2

    O2 = Observation (Post-test, dependent variable)

    Random sampling, random assignment.

    Best control of threats to internal validity, particularly the threat introduced by pretesting.

    Requires a relatively large number of subjects.

    D. Randomized Assignment with Matching

    1. Randomized (Sampling & Assignment), Matched Ss, Post-test only, Control Group

    Treatment M,R X1 O M = Matched Subjects

    R = Random assignment of matched pairs

    Comparison M,R X2 O X = Treatment (for X1 only)

    O = Observation (dependent variable)

    Example: An experimenter wants to test the impact of a novel instructional program in formal logic. The investigator

    infers from reports in the literature that high ability students and those with programming, mathematical, or

    music backgrounds are likely to excel in formal logic regardless of type of instruction. The experimenter

    randomly samples subjects, looks at subjects' SAT scores, matches subjects on the basis of SAT scores, and randomly

    assigns matched pairs (one of each pair to each group). The other concomitant variables (previous

    programming, mathematical, and music experience) could also be matched.
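    The match-then-randomize procedure in this example can be sketched as follows. The subject IDs and SAT scores are hypothetical; the sketch ranks subjects, pairs adjacent ones, and flips a coin within each pair, dropping any subject left without a match (a problem the notes return to below).

```python
import random

def match_and_assign(scores, seed=None):
    """Mechanical matching: rank subjects on the matching variable
    (e.g., SAT score), pair adjacent subjects, then randomly assign one
    member of each pair to treatment and the other to control. A subject
    left without a match is dropped."""
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get, reverse=True)
    treatment, control = [], []
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)
        treatment.append(pair[0])
        control.append(pair[1])
    return treatment, control

# Hypothetical SAT scores for five subjects; "e" has no match and is dropped.
sat = {"a": 1400, "b": 1390, "c": 1200, "d": 1190, "e": 980}
treatment, control = match_and_assign(sat, seed=0)
```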

    Y520 Spring 2000 Page 3

    2. Randomized Pretest-Post-test Control Group, Matched Ss

    Treatment O1 M,R X1 O2 O1 = Pretest

    M = Matched Subjects

    Comparison O1 M,R X2 O2 R = Random assignment of matched pairs

    X = Treatment (for X1 only)

    O2 = Observation (dependent variable)

    Subjects are matched on the basis of their pretest score and pairs of subjects are randomly assigned to groups.

    3. Matching Methods

    a. Mechanical matching

    1). Rank order subjects on variable, take top two, randomly assign members of pairs to groups. Repeat for

    all pairs.

    2). Problems:

    Impossible to match on more than one or two variables simultaneously.

    May need to eliminate some Ss due to no appropriate match for one of the groups.

    b. Statistical Matching

    1). The purpose is to control for factors that cannot be randomized but nonetheless can be measured on (at

    least) an interval scale (but in practice we often treat ordinal scales as if they were interval). Statistical
    control is achieved by measuring one or more concomitant variables (referred to as the covariate) in

    addition to the variable (variate) of primary interest (i.e., the dependent or response variable).

    Statistical control can be used in experimental designs and because no direct manipulation of subjects

    or conditions is required, it can also be used in quasi-experimental and non-experimental designs.

    2). Analysis of covariance is used to test the main and interaction effects of categorical variables on a

    continuous dependent variable, controlling for the effects of selected other continuous variables which

    covary with the dependent. The control variable is called the covariate.

    (http://www2.chass.ncsu.edu/garson/pa765/ancova.htm)

    3). To control a covariate statistically means the same as to adjust for the covariate, to correct for the

    covariate, to hold it constant, or to partial it out. (http://www.psych.uiuc.edu/~mho/psy307a.html)

    4). But see:

    Loftin, L., & Madison, S. (1991). The extreme dangers of covariance corrections. In B. Thompson
    (Ed.), Advances in educational research: Substantive findings, methodological developments

    (Vol. 1, pp. 133-148). Greenwich, CT: JAI Press. (ISBN: 1-55938-316-X)

    Thompson, B. (1992). Misuse of ANCOVA and related "statistical control" procedures. Reading

    Psychology, 13, iii-xviii.
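    The idea behind statistical control can be sketched with plain numpy rather than a full ANCOVA model: measure a covariate, regress it out of the outcome, and compare the covariate-adjusted group means. The data, the built-in treatment effect of 5, and the seed are all hypothetical; a real analysis would fit a proper ANCOVA (and heed the warnings in the references above).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
group = np.repeat([0, 1], n)              # 0 = control, 1 = treatment
pretest = rng.normal(50, 10, 2 * n)       # covariate (hypothetical)
# Outcome depends on the covariate plus a built-in treatment effect of 5.
y = 0.8 * pretest + 5 * group + rng.normal(0, 3, 2 * n)

# Statistical control: regress the outcome on the covariate, then
# compare the covariate-adjusted (residual) means of the two groups.
slope, intercept = np.polyfit(pretest, y, 1)
adjusted = y - (slope * pretest + intercept)
effect = adjusted[group == 1].mean() - adjusted[group == 0].mean()
# `effect` recovers approximately the built-in treatment effect of 5.
```

    Because the groups were randomly assigned, the covariate is (in expectation) unrelated to group membership, so adjusting for it sharpens the estimate rather than biasing it.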

    Y520 Spring 2000 Page 4

    Pre-Experimental Designs

    A. One-Shot Case Study

    X O X = treatment

    O = Observation (dependent variable)

    Problems: No control group; cannot tell if treatment had any effect.

    Comments from Campbell and Stanley (1963): "As has been pointed out (e.g., Boring, 1954; Stouffer, 1949), such studies have such a total absence of

    control as to be of almost no scientific value" (p. 6).

    "Basic to scientific evidence (and to all knowledge-diagnostic processes including the retina of the eye) is the

    process of comparison, of recording differences, or of contrast. Any appearance of absolute knowledge, or

    intrinsic knowledge about singular isolated objects, is found to be illusory upon analysis. Securing scientific

    evidence involves making at least one comparison" (p. 6).

    "It seems well-nigh unethical... to allow, as theses or dissertations in education, case studies of this nature

    (i.e., involving a single group observed at one time only)" (p. 7).

    B. One-Group Pretest-Post-test Design

    O1 X O2 O1 = Pretest
    X = treatment

    O2 = Observation (dependent variable)

    Problems: No control group. Changes between pre- and post-test may be due not to the treatment but to:

    history, maturation, instrument decay, data collection characteristics, data collection bias, testing, statistical

    regression, attitude of subjects, problems with implementation, etc.

    C. Static-group comparison design

    X O1 X = treatment

    O2 O1, O2 = Observation (dependent variable)

    Intact, existing groups are used. No random selection of subjects; no random assignment to groups. No way to ensure
    equivalence of groups.

    Comments from Campbell and Stanley (1963):

    "Instances of this kind of research include, for example, the comparison of school systems which require the

    bachelor's degree of teachers (the X) versus those which do not; the comparison of students in classes given

    speed-reading training versus those not given it; the comparison of those who heard a certain TV program with

    those who did not, etc." (p. 12).

    "There is ... no formal means of certifying that the groups would have been equivalent had it not been for the

    X.... If O1 and O2 differ, this difference could well have come through the differential recruitment of persons

    making up the groups: the groups might have differed anyway, without the occurrence of X" (p. 12).

    Y520 Spring 2000 Page 5

    Quasi-Experimental Designs

    No random sampling of subjects. Intact groups often used.

    No random assignment of Ss to groups. Confidence in equivalency of groups is lower.

    A. Matching-only Group Design

    Treatment M X1 O X = treatment

    Control M X2 O

    B. Matching-only Pretest-Post test Group Design

    Treatment O1 M X1 O2 O1 = Pretest

    X1 = treatment

    Control O1 M X2 O2 O2 = Post test

    Existing, intact groups.

    Subjects matched on one or more variables; can't be certain if groups are equivalent on remaining unmatched

    variables.

    Matching is never a substitute for random sampling and random assignment to groups.

    C. Single Group Time Series Design

    "The essence of the time-series design is the presence of a periodic measurement process on some group or

    individual and the introduction of an experimental change into this time series of measurements, the results of

    which are indicated by a discontinuity in the measurements recorded in the time series" (Campbell & Stanley,

    1963, p. 37).

    O1 O2 O3 O4 O5 X1 O6 O7 O8 O9 O10 X1 = treatment
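    The discontinuity Campbell and Stanley describe can be illustrated with a toy series. The observation values below are hypothetical; a real time-series analysis would model trend and autocorrelation rather than just comparing means.

```python
# Hypothetical observations O1..O10 for one group, with treatment X1
# introduced between O5 and O6.
series = [10, 11, 10, 12, 11,   # O1..O5 (before X1)
          16, 17, 16, 18, 17]   # O6..O10 (after X1)

before, after = series[:5], series[5:]
shift = sum(after) / len(after) - sum(before) / len(before)
# A large shift relative to the ordinary period-to-period variation
# marks a discontinuity at the point of treatment.
```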

    Factorial Designs

    Requires, at a minimum, two levels of variable A crossed with two levels of variable B. That is, all levels of A

    occur with all levels of B.

    Factorial designs enable the investigator to observe an interaction, if one exists. An interaction means
    that the effect of one independent variable on the dependent variable differs across the levels of another independent variable.

    "Let us suppose that three types of teachers are all, in general, effective (e.g., the spontaneous extemporizers,

    the conscientious preparers, and the close supervisors of student work). Similarly, three teaching methods in

    general turn out to be equally effective (e.g., group discussion, formal lecture, and tutorial). In such a case...,

    teaching methods could plausibly interact strongly with types, the spontaneous extemporizer doing best with

    group discussion and poorest with tutorial, and the close supervisor doing best with tutorial and poorest with

    group discussion" (Campbell & Stanley, 1963, p. 29).
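    The crossover interaction in Campbell and Stanley's example can be made concrete with hypothetical cell means for a 2x2 slice of the design (two teacher types x two methods; all numbers invented for illustration):

```python
# Hypothetical cell means (outcome scores) for a 2x2 slice of the
# teacher-type x teaching-method factorial described above.
means = {
    ("extemporizer", "discussion"): 85,
    ("extemporizer", "tutorial"):   70,
    ("supervisor",   "discussion"): 70,
    ("supervisor",   "tutorial"):   85,
}

def simple_effect(teacher):
    """Discussion-minus-tutorial effect for one teacher type."""
    return means[(teacher, "discussion")] - means[(teacher, "tutorial")]

# An interaction exists when the simple effects differ across the
# levels of the other factor; here they cross over (+15 vs. -15),
# even though each method is equally effective on average.
interaction = simple_effect("extemporizer") - simple_effect("supervisor")
```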

    Threats to Internal Validity

    Is the investigator's conclusion correct? Are the changes in the independent variable indeed responsible for the observed

    variation in the dependent variable? Or might the variation in the dependent variable be attributable to other causes?

    This is the question of internal validity. The following list is from Campbell and Stanley (1963) as interpreted by

    Kirk (1995):

    1. History. Events other than the administration of a treatment level that occur between the time the treatment

    level is assigned to subjects and the time the dependent variable is measured may affect the dependent

    variable.

    2. Maturation. Processes not related to the administration of a treatment level that occur within subjects

    simply as a function of the passage of time (growing older, stronger, larger, more experienced, and so on) may

    affect the dependent variable.

    3. Testing. Repeated testing of subjects may result in familiarity with the testing situation or acquisition of

    information that can affect the dependent variable.

    Y520 Spring 2000 Page 6

    4. Instrumentation. Changes in the calibration of a measuring instrument, shifts in the criteria used by

    observers and scorers, or unequal intervals in different ranges of a measuring instrument can affect the

    measurement of the dependent variable.

    5. Statistical regression. When the measurement of the dependent variable is not perfectly reliable, there is a

    tendency for extreme scores to regress or move toward the mean. Statistical regression operates to (a) increase

    the scores of subjects originally found to score low on a test, (b) decrease the scores of subjects originally

    found to score high on a test, and (c) not affect the scores of subjects at the mean of the test. The amount of

    statistical regression is inversely related to the reliability of the test.

    6. Selection. Differences among the dependent-variable means may reflect prior differences among the
    subjects assigned to the various levels of the independent variable.

    7. Mortality. The loss of subjects in the various treatment conditions may alter the distribution of subject

    characteristics across the treatment groups.

    8. Interactions with selection. Some of the foregoing threats to internal validity may interact with selection to

    produce effects that are confounded with or indistinguishable from treatment effects. Among these are

    selection-history effects and selection-maturation effects. For example, selection-maturation effects occur

    when subjects with different maturation schedules are assigned to different treatment levels.

    9. Ambiguity about the direction of causal influence. In some types of research (for example,

    correlational studies) it may be difficult to determine whether X is responsible for the change in Y or vice

    versa. This ambiguity is not present when X is known to occur before Y.

    10. Diffusion or imitation of treatments. Sometimes the independent variable involves information that is

    selectively presented to subjects in the various treatment levels. If the subjects in different levels can

    communicate with one another, differences among the treatment levels may be compromised.

    11. Compensatory rivalry by respondents receiving less desirable treatments. When subjects in some

    treatment levels receive goods or services generally believed to be desirable and this becomes known to

    subjects in treatment levels that do not receive those goods and services, social competition may motivate the

    subjects in the latter group, the control subjects, to attempt to reverse or reduce the anticipated effects of the

    desirable treatment levels. Saretsky (1972) named this the John Henry effect in honor of the steel driver who,

    upon learning that his output was being compared with that of a steam drill, worked so hard that he

    outperformed the drill and died of overexertion.

    12. Resentful demoralization of respondents receiving less desirable treatments. If subjects learn that the

    treatment level to which they have been assigned received less desirable goods or services, they may

    experience feelings of resentment and demoralization. Their response may be to perform at an abnormally low

    level, thereby increasing the magnitude of the difference between their performance and that of units assigned

    to the desirable treatment level.
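    Regression toward the mean (threat 5 above) is easy to demonstrate by simulation. The population parameters, error variance, and cutoffs below are hypothetical; the point is that with imperfectly reliable measurement, extreme groups drift toward the mean on retest even with no treatment at all.

```python
import numpy as np

rng = np.random.default_rng(1)
true_score = rng.normal(100, 15, 10_000)
# Two administrations of an imperfectly reliable test: observed
# score = true score + independent measurement error.
test1 = true_score + rng.normal(0, 10, 10_000)
test2 = true_score + rng.normal(0, 10, 10_000)

low = test1 < 80      # subjects who scored low the first time
high = test1 > 120    # subjects who scored high the first time
# With no treatment whatsoever, both extreme groups move toward the
# mean on retest: the low group's average rises, the high group's falls.
low_change = test2[low].mean() - test1[low].mean()     # positive
high_change = test2[high].mean() - test1[high].mean()  # negative
```

    This is why, as the notes say, the amount of regression is inversely related to the reliability of the test: shrinking the error term toward zero shrinks both changes toward zero.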

    Campbell, D. T., & Stanley, J. C. (1963). Experimental and quasi-experimental designs for research. Chicago, IL:

    Rand McNally.

    Kirk, R. E. (1995). Experimental design: Procedures for the behavioral sciences. Pacific Grove, CA: Brooks/Cole.