common flaws in research



Post on 17-Nov-2014






Part of a course in my master of educational technology program.


Common flaws

Running head: COMMON FLAWS

Scientific Research in Education:

Common Flaws

John Koetsier

University of British Columbia




Research studies are difficult to do right and easy to do wrong. There are

many potholes to avoid, and many factors can impact a study’s validity and

reliability. To find and understand some of the common problems, I’m going

to look at three different types of studies, see what the researchers did, how

they did it, and what problems they encountered. The studies are Beck and

Fetherston’s The effects of incorporating a word processor into a year three

writing program (2003), Miller, Schweingruber, and Brandenburg’s Middle School

Students’ Technology Practices and Preferences: Re-examining Gender

Differences (2001), and Hayse’s A comparison of fifth graders’ frequency using

web-based activities versus traditional activities for self-directed enrichment (2003).




In the first study, Natalie Beck and Tony Fetherston studied the effects of

teaching writing with a word processor in primary grades. For six weeks, they

studied both how students felt about using word processing technology

versus paper and pencil, and what effects technology had on the quality of

their writing. As a result, they concluded that students who used word

processors wrote significantly better than students using pencil and paper.

Unfortunately, the quality of the study was severely

undermined by several design and procedural decisions. Together those flaws

cause it to miss the standard for research that is generalizable to other

settings and can be counted upon when creating programs and curricula.

In brief, the problems with the study include a very small sample size (only

seven students), which basically eliminates any opportunity for external

validity. The sample cannot possibly be representative enough. And, not

that it matters much with such a small sample, the researchers used

convenience sampling rather than random sampling.
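
To make the sample-size concern concrete, here is a minimal Python sketch. It is not drawn from Beck and Fetherston's data: it assumes an entirely hypothetical population of 5,000 writing scores (the scale, mean, and spread are invented for illustration) and compares how much the average of a random sample of seven students varies against the average of a random sample of 200. The function name spread_of_sample_means is my own.

```python
import random
import statistics


def spread_of_sample_means(population, sample_size, trials, rng):
    """Return the standard deviation of the means of repeated random samples."""
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(trials)
    ]
    return statistics.stdev(means)


# Hypothetical population: 5,000 writing scores on an invented 0-100 scale.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [min(100.0, max(0.0, rng.gauss(60, 15))) for _ in range(5000)]

# Draw 1,000 samples of each size and see how far their averages scatter.
spread_small = spread_of_sample_means(population, 7, 1000, rng)
spread_large = spread_of_sample_means(population, 200, 1000, rng)

print(f"Spread of sample means, n=7:   {spread_small:.2f}")
print(f"Spread of sample means, n=200: {spread_large:.2f}")
```

On this invented data, the seven-student averages should scatter far more widely than the 200-student averages, which is one reason a result from so few subjects is hard to generalize. Note that the sketch assumes random sampling; a convenience sample adds bias on top of this variability.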

In addition, the short six-week study ensured that researchers could not

compensate for the effects of novelty: any new technique employed in an

educational setting might result in a temporary bump in performance as the

sheer newness galvanizes student attention and effort. Oddly, in what must

be a rare problem for a study with a novelty issue, maturation was also a

problem, since the students had apparently used the word processing software

before the study began.



Finally, and perhaps most importantly, the design was pre-experimental.

There was no control group receiving a placebo or an equal but different

treatment. The sample group essentially was its own control group.

In the second study, Miller, Schweingruber, and Brandenburg looked at

middle school students’ use of technology in America, specifically at

male/female differences. They administered a survey to 512

students in Texas middle schools and used the results to argue that

historical differences are disappearing as technology, particularly the web,

becomes more prevalent.

The conclusions are valid and supported by subsequent research, but the

methodology (particularly the sampling) could have been significantly

improved. Therefore, the study is not as generalizable as it could have been,

and follow-up research was required.

Problems included a significantly skewed urban/suburban mix that was heavily

weighted in favor of urban students, while rural students were

entirely excluded. In addition, ethnicity was a factor that was not addressed

at all in the study, even though the schools from which students were drawn

were in a city and state where certain racial groups are over-represented. A final

sampling complication was the fact that the schools from which subjects were drawn

were significantly undersized compared to the average middle school.



In addition to sampling concerns, the authors’ assertion that the web is

predominantly responsible for male/female technology preferences becoming

more similar is problematic, as there are many potentially confounding

variables. Finally, high mortality in the course of the study due to data

collection problems adds yet another question mark.

In the third study, teacher Karen Hayse engaged in action research to guide a

school district’s recommended practice with regard to using web-based

versus traditional enrichment resources. Working with a single class of fifth

graders over a period of 10 weeks, Hayse introduced 15 web resources and

15 traditional resources as activities that students could explore and use

during non-graded personal enrichment time every third school day. Students

self-reported which resources they used.

Hayse discovered that students preferred web resources to traditional

resources most of the time, with web resources being the clear leader

initially, trailing off halfway through the 10 weeks, and then regaining

popularity in the final few weeks. Hayse also noticed, anecdotally, that giving

students a choice between different types of enrichment activities seemed to

result in students choosing to engage in enrichment more often, regardless of

which type they chose.

There are a number of concerns with this study, starting with sampling.

Specifically, Hayse apparently used convenience sampling, probably with

her own class. Clearly, there are no guarantees of representativeness.



Another concern is the lack of pre-testing. It would be important to know whether, before

the study started, there was already a student preference for technology and

web-based resources, and a simple survey could have provided helpful

insight when interpreting the study data.

Another concern I have is that there may have been a novelty effect: that

students who had previously only been exposed to traditional enrichment

activities in the classroom may have chosen web activities simply due to their

newness. A longer study would have reduced any novelty effects that might

be operating.

Hayse mentions that she has controlled for a number of factors, including

types of activities and opportunities to work with peers, but I wonder if the

web resources were as potentially social or perceived as potentially social by

the students as the traditional resources. The spike in traditional

resource usage came after a student asked friends to play a trivia game; was

a similar thing possible with the web resources? It’s difficult to say without

being able to examine the actual websites.

A number of other questions suggest themselves: while “neither or both”

were options, as students could choose to use any combination of resources

including no resources, they do not show up in the data in Table 1. It seems

unlikely that over 10 weeks these options were never chosen. Also, students

self-reported use of activities at the end of the day: this may not be the most

accurate method of collecting data. And finally, while not necessarily to be



expected in action research, it would still be ideal to have a better study

design than pre-experimental.

Looking over these three studies, it seems clear that sampling is an

enormous challenge and frequent source of external validity concerns. Each

of the studies had, to varying degrees, sampling problems. This probably

shouldn’t be too much of a surprise, as finding subjects and convincing them

to participate in studies is difficult, time-consuming, and potentially

expensive. However, it is worth researchers’ while to expend considerable time

and effort on this specific facet of their studies, since without a good random

or appropriately stratified sample, results are not generalizable anyway. In

other words, garbage in, garbage out.
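
As a rough illustration of what an appropriately stratified sample could look like, the Python sketch below draws the same fraction of students at random from each stratum. Everything in it is assumed for the example: the roster of 1,000 students, the urban/suburban/rural split, the 10% sampling fraction, and the helper name stratified_sample. None of it comes from the three studies.

```python
import random
from collections import defaultdict


def stratified_sample(records, stratum_of, fraction, rng):
    """Draw the same fraction at random from every stratum (at least one each)."""
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)

    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical roster: 1,000 students tagged by school setting.
rng = random.Random(7)  # fixed seed so the sketch is repeatable
roster = (
    [{"id": i, "setting": "urban"} for i in range(600)]
    + [{"id": i, "setting": "suburban"} for i in range(600, 900)]
    + [{"id": i, "setting": "rural"} for i in range(900, 1000)]
)

sample = stratified_sample(roster, lambda s: s["setting"], 0.1, rng)

# Count how many students from each setting ended up in the sample.
counts = defaultdict(int)
for student in sample:
    counts[student["setting"]] += 1
print(dict(counts))
```

With these made-up numbers the sample holds 60 urban, 30 suburban, and 10 rural students, so every setting appears in proportion to its share of the roster rather than being skewed or left out.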

Secondly, an appropriate degree of control over the variables in the study is

critical. Knowing that students had used the particular type of word

processing software that they were testing should have impelled Beck and

Fetherston to find other subjects. And although I can’t prove it without access to

the resources that Hayse used, I suspect that the web and traditional resources,

while equivalent on the surface and in terms of topic, may not have been equivalent in terms of

presentation and use by students.

In conclusion, academic research is a difficult process to do well, and pitfalls

exist at every stage. These three studies illuminate some of the common

issues, and provide insight for researchers about what to avoid and minimize

in study design in order to maximize internal and external validity.


References


Beck, N., & Fetherston, T. (2003). The effects of incorporating a word

processor into a year three writing program. Information Technology

in Childhood Education Annual, 2003, 139-161.

Hayse, K. (2003). A comparison of fifth graders’ frequency using web-based

activities versus traditional activities for self-directed enrichment.

Retrieved from

Hayse.htm on March 5, 2008.

Miller, L. M., Schweingruber, H., & Brandenburg, C. L. (2001). Middle School

Students’ Technology Practices and Preferences: Re-examining Gender

Differences. Journal of Educational Multimedia & Hypermedia, 10(2),
