
Table of Contents

Qualitative and Quantitative Research
Types of Quantitative Scales
Descriptive Statistics and Measurement Scales
Measures of Central Tendency
Measures of Dispersion
Shape of the Distribution
Why Does the Measure of Central Tendency Reported Matter?
Types of Quantitative Studies
Correlations
Quasi-Experimental Designs (AKA Natural Manipulation Studies)
True Experiments


Qualitative and Quantitative Research

Quantitative research differs from qualitative research in that it uses quantitative techniques to assign numbers to the constructs under study. Qualitative studies use non-numerical material such as words and pictures. While we will be focusing on quantitative techniques in this course, we will take a couple of minutes to briefly describe some qualitative techniques.

Qualitative research is generally carried out to gain an in-depth understanding of selected participants, communicated from their own perspective. An in-depth understanding requires that the researcher go beyond typical questionnaires, surveys, tests, and psychological scales. There are several techniques that are used to conduct qualitative research. For example, ethnography is a qualitative approach used to learn about cultures. Ethnographers use in-depth, semi-structured interviews or direct observations over a long period of time. Some qualitative researchers go so far as to live and work with their participants. For example, in 1963 Gloria Steinem conducted a qualitative study of Playboy Club culture by taking a job as a Playboy Bunny. Similarly, in 1993 Schouten and McAlexander undertook a study of Harley-Davidson motorcycle culture and brand loyalty by joining a motorcycle club and taking part in the day-to-day activities of Harley enthusiasts. Other examples of ethnographic studies include Margaret Mead's work on childhood, adolescence, and childbearing in places like New Guinea, Samoa, and Bali, and Jane Goodall's studies of wild chimpanzees in what is today known as Tanzania.

Another form of qualitative research that is often used in medicine, forensics, psychology, and psychiatry is the case study. Case studies involve in-depth, detailed descriptions of a single case. Oliver Sacks, for example, published several interesting studies of individuals with unusual neurological conditions. In his 1985 book The Man Who Mistook His Wife for a Hat, Sacks describes the case of a man who suffered from a form of agnosia (a perceptual disorder) that impaired his ability to visually identify objects and faces. Unusual cases, such as individuals with rare disorders or talents, lend themselves well to these descriptive methods. In general, qualitative research is recommended for new areas of research or areas that behavioral scientists have limited knowledge about.

[Photo: Field trip to Manus, Papua New Guinea, 1953-54. (Mead Archives, Library of Congress.)]

[Photo: Jane Goodall. Photograph by Hugo van Lawick.]

There are several limitations associated with qualitative research. First, because the study is conducted by a single researcher, there is a strong tendency for their interpretation of their observations to be biased. For example, if you took the same serial killer and had two different psychologists produce a case study, their interpretations could be very different. If one of the psychologists was a Freudian, the information they would seek out and the questions they would ask would be very different from those that would interest a cognitive neurologist. While the Freudian might focus on childhood issues such as toilet training and weaning, the neurologist would be interested in neurological symptoms and brain abnormalities.

A second limitation is a lack of generalizability. No matter how well done, a study of one individual or one small group will not necessarily generalize to other individuals or groups. For example, what Schouten and McAlexander learned from their study of one Harley Davidson club may not generalize to other Harley clubs.

While the behavioral sciences have learned a great deal from qualitative studies, they are generally very time consuming and complex. Learning how to conduct sound qualitative studies requires that the researcher take one or more specialized courses. As Patten (2002) points out:

. . . students who have limited knowledge of statistics may be drawn to qualitative research simply because statistics are not required . . . It is relatively easy to find someone to help in the quantitative analysis and interpretation of scores obtained by using objective measures. Getting help from an expert with sorting through hundreds of pages of transcribed material such as interview verbatims . . . is a much more difficult matter. In other words, it is a myth that analyzing the results of qualitative research is inherently easier than analyzing the data collected in quantitative research. (pp. 29-30)

Types of Quantitative Scales

Quantitative analysis is concerned with making sense of data that are expressed in terms of quantities (numbers). Data comes in several forms. The type of statistics and analysis you will use for reporting the results of your study will depend on the nature of variables in your study. Two types of measures can be used. Often a study will include a combination of these different types of scales.

Categorical (also known as discrete) scales. A categorical scale defines the qualities we are measuring according to non-overlapping, discrete categories. There are two types of categorical scales: nominal scales and ordinal scales.

Nominal Scales. Numbers are used to label categories which differ in type. The magnitude of the numbers is not meant to indicate that one category is better than another; no rank ordering of categories is intended. A common example of a nominal category is sex. We might enter the number one to indicate that a participant is male and the number two to indicate that a participant is female. The number should only be considered a label (name) and should not be viewed as a value.


Ordinal Scales. Numbers in this case are meant to indicate an order to the levels of the categories being measured. For example, education level could be measured on an ordinal scale. We might enter a zero to indicate that a participant has not completed a high school degree, a one to indicate that a person's highest level of education is a high school degree, a two to indicate an Associate degree, a three to indicate a Bachelor's degree, a four to indicate a Master's degree, and a five to indicate a Doctorate. In this case, the higher the number, the more education a person has. So an ordinal scale not only defines a difference between categories but also assigns an order to them. The intervals on an ordinal scale, however, are not equal. By this I mean that you could not logically argue that the difference in education between an Associate degree (labeled 2) and a Bachelor's degree (labeled 3) is the same as the difference between a Bachelor's degree and a Master's degree (labeled 4), even though the numerical differences are the same.

Another example of an ordinal scale is placement in a race. The first person to cross the finish line we call 1, the second person we call 2, and the third we call 3. This does not imply that the time that passed between the first and second person crossing the finish line is the same as the time that passed between the second- and third-place finishers. Ordinal scales give us a measure of order but, as categorical measures, do not imply that there is an equal distance (value) between each number on the scale.

Continuous scales are numerical scales in which the underlying dimension can (at least in theory) be measured to ever greater precision. For example, if I were to measure the amount of sleep that a person gets in a night, I could (at least in theory) measure to a level of precision beyond even that of a nanosecond. In general, when we use continuous scales we simplify our measures by rounding to an appropriate unit of measure. For example, my sleep measures would likely be precise enough rounded to the nearest minute. There are two types of continuous scales.

Interval Scales. Similar to ordinal scales, interval scales are used to indicate differences in the construct we are measuring and to infer order; however, the numbers on this scale have one additional property: the units indicate equal intervals that allow us to measure the degree of difference between two measures. An example of an interval scale is temperature measured in either Fahrenheit or Celsius. In this case we can say that 50 degrees is as many degrees higher than 25 degrees as 75 degrees is higher than 50 degrees. Thus, the intervals indicated by the numbers are equal, and we can use the numbers as a consistent measurement at any point on the scale. An interval scale, however, does not have an absolute zero. In our example, zero does not indicate an absence of temperature; therefore, we cannot say that 50 degrees is twice as hot as 25 degrees.

Ratio Scales. Numbers on this scale have all the properties of an interval scale with the addition of an absolute (naturally occurring) zero point. Zero on a ratio scale indicates an absence of the quality being assessed. For example, the scale of time measured in minutes is a ratio scale: zero indicates no time. The importance of an absolute zero point is that it allows us to make ratio-based comparisons (e.g., half as much or double the amount). On a ratio scale of time we can say that a person who takes 5 minutes to complete a task completed it twice as fast as a person who took 10 minutes.
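To make the distinction concrete, here is a minimal sketch in Python (hypothetical values and field names of my own choosing) showing how a single participant might be coded on each of the four scale types, and why ratio comparisons are only meaningful on a ratio scale.

```python
# Hypothetical participant record illustrating the four scale types.
participant = {
    "sex": 2,               # nominal: 1 = male, 2 = female (labels only, no order)
    "education": 3,         # ordinal: 0 = no HS diploma ... 3 = Bachelor's ... 5 = Doctorate
    "temperature_f": 50.0,  # interval: equal units, but no absolute zero
    "task_minutes": 5.0,    # ratio: absolute zero, so ratios are meaningful
}

# Ratio comparisons make sense only on a ratio scale:
other_task_minutes = 10.0
print(participant["task_minutes"] / other_task_minutes)  # 0.5 -> "twice as fast"

# The same arithmetic would be misleading on the interval scale:
# 50 degrees F is not "twice as hot" as 25 degrees F, because 0 F is not "no temperature".
```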

Descriptive Statistics and Measurement Scales

Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample and the measures. Descriptive statistics are typically distinguished from inferential statistics. Inferential statistics are used to reach conclusions that extend beyond the immediate data. For instance, we use inferential statistics to judge how likely it is that an observed difference between groups could have happened by chance in the study (i.e., tests of statistical significance). Thus, we use inferential statistics to make inferences from our data to more general conditions; we use descriptive statistics simply to describe what's going on in our data.

One way to summarize data is to report the number of people who obtained given scores on a scale. In such a case we would be tallying the responses and reporting frequencies. In some cases, frequency is more meaningfully expressed as a percentage or as a proportion. For example, if I tell you that last semester 20 of my students scored an A or higher on the first exam, I am giving you the frequency of A grades in that class. This frequency would indicate very different things if my class were composed of 25 students rather than 200 students: in one case 80% of the students earned an A, whereas in the other only 10% did. Percentages and proportions adjust a frequency and report it as though the sample size were 100 or 1, respectively.
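As a quick check of the arithmetic, the sketch below (hypothetical helper functions written for this example) converts the same frequency of 20 A grades into a percentage and a proportion for class sizes of 25 and 200.

```python
# Convert a frequency to a percentage (out of 100) or a proportion (out of 1).
def percentage(frequency, sample_size):
    return 100 * frequency / sample_size

def proportion(frequency, sample_size):
    return frequency / sample_size

a_grades = 20
print(percentage(a_grades, 25), proportion(a_grades, 25))    # 80.0 0.8
print(percentage(a_grades, 200), proportion(a_grades, 200))  # 10.0 0.1
```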

Measures of Central Tendency

Measures of central tendency attempt to quantify the "typical" or "average" score in a data set. The concept is extremely important, and we encounter it frequently in daily life. For example, before purchasing a car we often want to know its average gas mileage, and before accepting a job you might want to know what a typical salary is for people in that position so you will know whether or not you are going to be paid what you are worth. We often ask questions in psychological science revolving around how groups differ from each other "on average". There are three common ways of determining the central tendency of a distribution of scores.

Mode. By far the simplest, but also the least widely used measure of central tendency is the mode. The mode in a distribution of data is simply the score that occurs most frequently. One way of describing a distribution is in terms of the number of modes in the data. A unimodal distribution has one mode. In contrast, a bimodal distribution has two.


The biggest limitation of the mode is that it is not very stable: small changes in the distribution can produce large changes in the mode. For example, if in a class 20 students have an A, 19 students have a B, 13 students have a C, and 3 students have less than a C, the modal grade would be A (yeah!!!!!). If one student who previously held an A slipped into the B category, the mode would change from A to B. A change in one data point can produce a large change in this measure of central tendency. The biggest strength of this measure, on the other hand, is that it can be used with both categorical and continuous measures (see above).
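A short sketch of the grade example (made-up data, using Python's collections.Counter) shows how moving a single student from the A category to the B category changes the mode.

```python
from collections import Counter

# Grade counts from the example: 20 A's, 19 B's, 13 C's, 3 below a C.
grades = ["A"] * 20 + ["B"] * 19 + ["C"] * 13 + ["D"] * 3
print(Counter(grades).most_common(1))  # [('A', 20)] -> the mode is A

# One student slips from an A to a B, and the mode changes from A to B.
grades = ["A"] * 19 + ["B"] * 20 + ["C"] * 13 + ["D"] * 3
print(Counter(grades).most_common(1))  # [('B', 20)] -> the mode is B
```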

Median. Technically, the median of a distribution is the value that cuts the distribution exactly in half, such that an equal number of scores are larger than that value as are smaller than it. The median is also called the 50th percentile. The median is most easily computed by arranging the data set from smallest to largest; the median is then the "middle" score in the distribution. Suppose we have the following scores in a data set: 5, 7, 6, 1, 8. Arranging the data in order we have: 1, 5, 6, 7, 8. The "middle" score is 6, so the median is 6. Half of the remaining scores are larger than 6 and half of the remaining scores are smaller than 6. The biggest strength of this measure is that it is not sensitive to "outliers" (extreme scores). For example, compare the following two distributions of scores:

Set A 2,5,6,6,7,8,9 Median = 6

Set B 2,5,6,6,7,8,100 Median = 6.

The limitation of this measure is that it can only be used with continuous scales (see above).

Mean. The mean is the most widely used measure of central tendency. The mean is defined as the arithmetic average (the sum of all the data scores divided by the number of scores in the distribution). In a sample, we often symbolize the mean with a letter with a bar over it. If the letter is "X", the mean is symbolized as $\bar{X}$ (pronounced "X-bar"). If we use the letter X to represent the variable being measured, the mean is defined symbolically as

$$\bar{X} = \frac{\sum X}{n}$$

where $\Sigma$ indicates "sum of" and $n$ represents the number of data points in the set.


The mean can be used only with continuous scales of measurement. A second limitation of the mean is that it is affected by outliers. Looking at the two distributions used above, the mean of set A is 6.14 whereas the mean of Set B is 19.14. This difference in the mean is produced by the difference of one extreme score.
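A brief sketch using Python's statistics module (with Set A and Set B as given above) shows that the mode and median are identical for the two sets, while only the mean is pulled up by the single extreme score.

```python
import statistics

set_a = [2, 5, 6, 6, 7, 8, 9]
set_b = [2, 5, 6, 6, 7, 8, 100]  # identical except for one extreme score

for name, data in [("Set A", set_a), ("Set B", set_b)]:
    print(name,
          "mode =", statistics.mode(data),            # 6 for both sets
          "median =", statistics.median(data),        # 6 for both sets
          "mean =", round(statistics.mean(data), 2))  # 6.14 vs. 19.14
```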

Measures of Dispersion

The central score in a distribution is important in many research contexts. So too is another set of statistics that quantify how spread out (or "how dispersed") the scores tend to be. Do the scores vary a lot, or do they tend to be very similar or near each other in value?

The simplest measure of dispersion is the range. The range is the difference between the largest and smallest scores in the data set. One advantage of this measure is that it can be used with any measurement scale that has the property of order (i.e., ordinal, interval, and ratio scales). A weakness is that it is strongly affected by extreme scores. For example, Set A and Set B above have hugely different ranges even though they differ by only one score.

The two most commonly used measures of dispersion, the variance (s²) and the standard deviation (s), are calculated by finding the average amount that the scores in a distribution differ from the mean. One might think that the way to calculate this is to subtract the mean from each score (calculate deviation scores), then add these up and divide by the number of scores. The problem is that the sum of the deviation scores from the mean will always be 0. This will always be the case because, by definition, the mean is the point at which the deviations of the scores above it and the deviations of the scores below it are exactly equal in magnitude. The negative deviations thus cancel out the positive ones.

We use a very simple mathematical trick to get around this in calculating variance. We simply square all the deviation scores and then add them up (referred to as the sum of squares). Recall that the square of any number is a positive number; therefore, the sum of all the squared deviation scores will be a positive number. We simply divide by the number of scores to get the average squared deviation from the mean.
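In symbols, the sum of squares and variance described above can be written as follows (this follows the text in dividing by the number of scores, n; many statistics texts instead divide by n - 1 when estimating the variance from a sample):

$$SS = \sum (X - \bar{X})^2, \qquad s^2 = \frac{SS}{n}$$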


Since most of us do not think in terms of squared distances, the most commonly used statistic for describing dispersion is the square root of the variance. This statistic is called the standard deviation, and it is (roughly speaking) the average distance of scores from the mean of the distribution. The larger the standard deviation, the more spread out the distribution is.
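The sketch below (plain Python, using Set A from the earlier examples) works through the whole calculation: deviation scores that sum to essentially zero, the sum of squares, the variance, and the standard deviation.

```python
set_a = [2, 5, 6, 6, 7, 8, 9]

mean = sum(set_a) / len(set_a)
deviations = [x - mean for x in set_a]
print(round(sum(deviations), 10))  # 0.0 -- deviations from the mean always sum to zero

sum_of_squares = sum(d ** 2 for d in deviations)
variance = sum_of_squares / len(set_a)  # dividing by n, as in the text
std_dev = variance ** 0.5
print(round(variance, 2), round(std_dev, 2))  # about 4.41 and 2.10 for Set A
```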

Shape of the Distribution

Another important thing to know about a distribution is whether it is symmetric or skewed. A distribution is symmetric if, when folded in half, the two sides match up. If a curve is not symmetric, it is skewed. When a curve is positively skewed, most of the scores occur at the lower values of the horizontal axis and the curve tails off towards the higher end. When a curve is negatively skewed, most of the scores occur at the higher values and the curve tails off towards the lower end of the horizontal axis. Note that the shape of the distribution is named after the tail. A negatively skewed distribution has its mean pulled down by the extreme scores, whereas a positively skewed distribution has its mean pulled up by the extreme scores.

If a distribution is a unimodal symmetrical curve, the mean, median, and mode of the distribution will all be the same value. When the distribution is skewed, the mean and median will not be equal. Since the mean is most affected by extreme scores, it will have a value closer to the extreme scores than will the median.

Why Does the Measure of Central Tendency Reported Matter?

Consider a country so small that its entire population consists of a queen (Queen Cori) and four subjects. Their annual incomes are:

Citizen       Annual Income
Queen Cori        1,000,000
Subject 1             5,000
Subject 2             4,000
Subject 3             4,000
Subject 4             2,000


Queen Cori might boast that this is a fantastic country with an "average" annual income of $203,000. Before rushing off to become a citizen, you would be wise to find out what measure of central tendency Queen Cori is using! The mean is indeed $203,000, so she is not lying, but this is not a very accurate representation of the incomes of the population. In this case, the median ($4,000) is a more representative value of the "average" income. The point to be made here is that the appropriate measure of central tendency is affected not only by the type of measurement but also by the distribution of the scores. The mean of a distribution is strongly affected by extreme scores, whereas the median is not.
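A two-line check with Python's statistics module reproduces the contrast for the incomes above:

```python
import statistics

incomes = [1_000_000, 5_000, 4_000, 4_000, 2_000]
print(statistics.mean(incomes))    # 203000 -- pulled up by the queen's income
print(statistics.median(incomes))  # 4000   -- a more representative "average"
```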

Types of Quantitative Studies

There are three goals of science that apply regardless of the topic of study. At the most basic level, we want to gain a good description of the phenomena we are interested in. In learning about a phenomenon, however, we want to go further than simple description; we want an understanding that will allow us to predict future outcomes. In other words, we want to know about the relationship between variables so that if we observe that characteristic X is present, we can predict what other characteristics are also likely to be present or to follow.

While descriptions and predictions are useful, scientists want to go further and be able to explain phenomena. In the sciences the word "explain" has a special meaning: it means that we understand the underlying causes of a phenomenon. We know what factors need to be present in order to produce a desired outcome. If we can understand the causal relationships between factors and outcomes, we can then engage in the ultimate goal of science: applying what we know to improve our professional practice.

In the sciences we use the results of studies as evidence upon which to form theories. Theories summarize our current understanding of an issue and can be used to predict future outcomes. For example, Jennifer Cromley, a graduate student in educational psychology at the University of Maryland, College Park, was interested in the effectiveness of using computer-based technology (CBT) in adult basic education (ABE) instruction. She conducted a literature search to identify studies relevant to the topic.

Some of the studies she found simply reported descriptive information (e.g., the percentage of ABE instruction that used computers, or the reasons instructors choose to use, and students choose to take, courses that use computer-based technology). While descriptive studies give relevant background information, they do not tell us what qualities an effective CBT program is likely to have.

For example, Parke and Tracy-Mumford (2000, cited in Cromley, 2000) found that distance learning programs in which students studied on their own and rarely interacted with others led to little learning and high student dropout rates. In other words, they found a positive relationship between the amount of interaction with other students and with the instructor, and retention rates. Relational studies point to possible causal relationships between variables; however, they are weak evidence because relational studies do not eliminate other possible explanations for the findings. For example, perhaps instructors who choose to incorporate more interaction in their CBT are simply more skilled and/or experienced. It could be their skill that leads to a better understanding of the material and a lower drop-out rate. In other words, it may not be that A (interaction) causes B (effectiveness); it may be that a third factor, the instructor's skill, produces both A (a more interactive class) and B (better learning and retention).

Relational studies always leave open more than one interpretation. In order to determine what the cause of a given outcome is we have to look at evidence gained from true experiments. True Experiments use techniques that allow the researcher to eliminate all but one interpretation of the results of their study.

For example, to control for differences in teaching skill, we could use one instructor and have them design two CBT programs that incorporate the same material but differ in the amount of interaction they allow. If we found a difference in learning and retention between the two CBT programs, and the only difference between them was the amount of interaction, we could reasonably conclude that interaction is a causal factor underlying the difference. In the next section we will discuss the most common forms of relational studies: correlational studies and quasi-experimental studies. We will then look at various true experimental designs.

Correlations

Correlations deal with the relationship between two or more variables. With this measure we can answer questions such as: Are achievement test scores related to grade point averages? Is counselor empathy related to counseling outcome? Is student toenail length related to success in graduate school? Correlational studies, however, are limited in that they do not allow us to draw conclusions about causes. No matter how high a correlation we find between two variables, we can only say the two variables are related; we cannot say that one is a cause of the other.

To begin, let's review what we mean by "cause". When we make the claim that changes in one variable cause a change in another variable, we are saying that we have evidence that changing one variable will produce a predictable change in a second variable. We can only make this claim if we have in fact changed (manipulated) one variable and observed the effects of this change on a second variable (while holding all other variables constant). Correlational studies do not meet this standard for making causal conclusions. In a correlational study we simply measure the level of two variables for the same individuals and then statistically determine whether knowing an individual's score on one variable allows us to predict their score on the second variable better than we could without that knowledge. If we find a significant correlation between two variables, there are always three possible causal explanations for why that relationship exists.

You have likely heard the story of the stork that parents used to tell children who asked "where do babies come from?"

According to European folklore, the stork is responsible for bringing babies to new parents. The legend is very ancient, but was popularized by a 19th-century Hans Christian Andersen story called The Storks. German folklore held that storks found babies in caves or marshes and brought them to households in a basket on their backs or held in their beaks. These caves contained Adebarsteine, or "stork stones". The babies would then be given to the mother or dropped down the chimney. Households would signal that they wanted children by placing sweets for the stork on the window sill (http://www.planetofbirds.com/the-white-stork-and-the-bring-baby-story).

Why are we talking about storks and babies? The famous case of the storks of Oldenburg, in which a high correlation was reported between the population growth (birth rate) of Oldenburg, Germany, during the 1930s and the number of storks observed each year, has often been used as a classic example of the limitations of correlational studies. One explanation for this finding is that the folklore is correct and the storks bring the babies. This conclusion, however, would fail to take into account two other likely explanations.

These explanations can be summarized as two problems:

1) The Directional Problem - If we find a correlation between two variables, for example the number of storks nesting in a village (Variable A) and the human birth rate (Variable B), one possible explanation for this relationship is that A causes B (i.e., higher numbers of storks cause higher birth rates; perhaps my mother is correct and the storks really do bring babies). An equally good explanation of this correlation is that B causes A (i.e., high birth rates in humans cause higher nesting rates of storks; perhaps the storks are attracted to babies).

2) The Third Variable Problem - Some outside variable (or set of variables) may affect both A and B in a manner such that they co-vary in a predictable way, but A does not cause B and B does not cause A. In our example, a possible explanation of the relationship between human birth rates and the number of nesting storks is the size of the population itself. The larger the population, the higher the birth rate; the larger the population, the more houses that are built. As it turns out, European storks nest in chimneys. Thus as the population grew, so did the birth rate and the number of places for storks to roost.


What correlations do allow us to do is conclude that two variables are (or are not) related. This can be very interesting and valuable information, but it does not justify a conclusion that states which variable is causing the other. In a correlational study the researcher measures variables as they naturally occur. They do not do anything to manipulate the variables.

Example: You may well see this type of question on your exam.

Dr. Tippy Bakafew conducted a study looking at the relationship between religion and alcohol consumption. Dr. Bakafew begins by obtaining a list of all villages, towns and cities (we will refer to these as communities) in Wisconsin. From this list he randomly selects a sample of 20% of the communities. For each of these communities he obtains a count of the number of Bars and the number of churches listed in the Yellow Pages. He finds that there is a correlation of +0.88 between these two measures. Based on this Dr. Bakafew concludes that religion causes people to drink. Consider Dr. Bakafew’s conclusion – and present two other possible INTERPRETATIONS OF THIS CORRELATION.


The correlation statistic (r) provides two separate pieces of information. (1) The sign, negative or positive, tells us the direction of the relationship. If a correlation is positive, it indicates that higher levels of one variable predict higher levels of the second variable (and conversely that lower levels of one variable predict lower levels of the other). A negative correlation indicates that higher levels of one variable predict lower levels of the second variable (and vice versa). When interpreting a correlation it is important to keep in mind that a negative correlation is not a negative result; it is just as informative as a positive correlation.

For example, in my Introduction to Experimental Psychology course I have found a fairly high positive correlation between class attendance and final grades: the more classes a student attends, the higher their final grade tends to be. Another equally valid way to describe this correlation is to say that the higher a student's final grade, the more classes they tended to be present for. If instead of measuring the number of classes students attended, I had measured the number of classes they missed, I would find a negative correlation between classes missed and final grades. Both correlations are equally informative. Knowing how many classes a student attended allows me to predict their final grade better than I could without this information; the relationship is positive because higher levels of one variable predict higher levels of the other. On the other hand, knowing how many classes a student missed also allows me to predict their final grade better than I could without this information, but in this case the correlation is negative: the more classes a student misses, the lower their final grade tends to be.


The second piece of information the correlation statistic gives us is the strength of the relationship: how accurately we can predict one variable from the other, or how good a predictor one variable is of another. The value of a correlation ranges from +1 to -1, so its absolute value can never exceed 1. The closer the absolute value of the correlation is to 1, the stronger the relationship; the closer the absolute value is to zero, the weaker the relationship.
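The sketch below (hypothetical attendance and grade data for six students, with a Pearson correlation written out by hand for illustration) shows both pieces of information at once: classes attended out of 45 correlates positively with final grade, while classes missed produces a correlation of the same strength but with a negative sign.

```python
# Hypothetical data: classes attended (out of 45) and final grades for six students.
attended = [45, 42, 40, 38, 35, 30]
grades = [95, 88, 80, 75, 70, 60]
missed = [45 - a for a in attended]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5

print(round(pearson_r(attended, grades), 3))  # strong positive correlation
print(round(pearson_r(missed, grades), 3))    # same strength, negative sign
```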

One way to represent the correlation between two variables is to graph the relationship between them. On the Y (vertical) axis we put the scale for one variable, and on the X (horizontal) axis we put the second variable. For now it really does not matter which variable goes on which axis. Each subject's scores on the two variables are then represented as a single data point on this graph, which is called a scatter diagram. On the diagram below, one data point represents one subject's scores on the two variables: Subject 1 attended 40 out of 45 classes and obtained a final grade of 75%.

[Scatter diagram: Number of Classes Attended (34-46) vs. Final Grade (50-100); each data point represents one subject.]

When we plot the data from a sample of scores, the pattern of the scores can tell us something about the relationship between the two variables. Two measures that are related to each other will produce scatterplots which approximate a line. The less the scores deviate from the line, the stronger the correlation. If the line slopes upward, the direction of the correlation is positive (the higher the scores on one axis, the higher the scores on the other axis). If the slope is downward, the direction of the correlation is negative (higher scores on one variable are associated with lower scores on the second variable).

The correlation statistic is a mathematical technique for finding the slope of a straight line that best summarizes the relationship between the two variables. This line is called the line of best fit. It is the line which fits the data in a manner that reduces the overall distance between itself and all the data points. In other words, if you drew a series of lines on your scatterplot and then measured the total amount the data points varied from each line, you would find that there is one line that produces the lowest total deviation; that line summarizes the relationship between the two variables best, or best fits the data.
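One standard way to make "best fit" precise is ordinary least squares, which minimizes the sum of the squared vertical deviations of the points from the line (a sketch of the usual formulation, not necessarily the exact criterion the author has in mind). The line $\hat{Y} = a + bX$ is chosen so that $\sum (Y - \hat{Y})^2$ is as small as possible, which gives

$$b = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}, \qquad a = \bar{Y} - b\,\bar{X}.$$

The slope b is positive when the correlation is positive and negative when it is negative, matching the upward or downward tilt described above.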

[Scatterplot: Number of Classes Attended (20-50) vs. Final Grade (40-100), with the line of best fit drawn through the data points.]

The line of best fit for a perfect correlation will have two characteristics: 1) all the data points will line up exactly on the line of best fit, and 2) the slope of the line will be 45 degrees (positive or negative).

For example, the scatterplot below shows a perfect positive correlation.

[Scatterplot: a perfect positive correlation; all data points fall exactly on an upward-sloping line of best fit (Number of Classes Attended vs. Final Grade).]


When two variables are not related, the data points may appear random or circular, or may form a straight vertical or horizontal line. For example, the scatterplot below shows a correlation of +.06 (very low).

[Scatterplot: a correlation of +.06; Number of Classes Attended vs. Final Grade, with no apparent linear pattern in the data points.]

Zero correlations may also appear as straight vertical or horizontal lines. On the graph below we see that no matter how many classes a person attends, the best prediction we can make of their final grade is 65, which is also the class mean. In other words, knowing how many classes a person attends does not give me any information about their final grade; my best prediction in every case would be the class mean. The line of best fit for a random or circular scatterplot is likewise a flat line at the mean of the variable being predicted.

[Scatterplot: a zero correlation; the line of best fit is horizontal at the class mean of about 65, regardless of the Number of Classes Attended.]


Quasi-Experimental Designs (AKA Natural Manipulation Studies)

Quasi-experimental studies compare two naturally occurring groups. Unlike in a true experiment, the difference between the two conditions (groups) is not produced by the researcher, nor are potential confounds (differences between conditions other than the variable of interest) controlled for. As with correlational studies, conclusions about causation cannot be made on the basis of quasi-experimental studies.

For example, studies looking at sex differences have found consistent and reliable differences between males and females on both math and language scores. Can we conclude from these studies that biological sex causes a person to be better or worse at math or language? No! The reason is that males and females do not differ only biologically; other factors might explain these performance differences. For example, from the very beginning of life we treat little girls differently than little boys (e.g., different toys, chores, freedoms, and expectations). These childhood influences may be what underlies the skill differences found between males and females throughout the lifespan.

True Experiments

True experiments are studies that are conducted in such a manner that conclusions about cause and effect can be made. The criteria that must be met for a study to be a true experiment are as follows:

1) Measures of the variables must be objective.

2) Differences between the conditions being compared must be produced by, and be under the control of, the researcher.

3) All potential differences between conditions other than the independent variable are controlled (i.e., eliminated or accounted for).

When these criteria are met, a researcher can claim to have evidence of cause and effect. We will discuss various ways of designing a true experiment that meets these criteria in future readings.


References

Cromley, J. G. (2000). Learning with computers: The theory behind the practice. Focus on Basics: Connecting Research and Practice, 4(C). Retrieved from http://www.ncsall.net/index.html@id=771&pid=303.html

Patten, M. L. (2002). Proposing empirical research: A guide to the fundamentals (2nd ed.). Los Angeles, CA: Pyrczak Publishing.

Sacks, O. (1985). The man who mistook his wife for a hat. New York, NY: Harper Perennial.

Schouten, J. W., & McAlexander, J. H. (1993). Market impact of a consumption subculture: The Harley-Davidson mystique. In W. F. van Raaij & G. J. Bamossy (Eds.), European Advances in Consumer Research (Vol. 1, pp. 389-393). Provo, UT: Association for Consumer Research.

Van Lawick, H. (n.d.). Jane Goodall [Photograph]. Retrieved from http://www.nationalgeographic.com/explorers/women-of-national-geographic/
