

Intelligence 35 (2007) 393–399

Editorial

The standards for conducting research on topics of immediate social relevance☆

Earl Hunt a,⁎, Jerry Carlson b

a Department of Psychology, Box 351525, University of Washington, Seattle, WA 98195-1525, United States
b University of California, Riverside, United States

Received 30 June 2006; received in revised form 16 October 2006; accepted 18 October 2006
Available online 4 December 2006

Abstract

Should studies with immediate social relevance be held to exceptionally high standards of scientific excellence? This question is most often raised concerning studies of racial, ethnic, or gender differences in intelligence, but there are other areas where the question is appropriate. We treat the problem as one in signal detection. We argue that there is an excellent case for requiring unusually high quality work on a topic of immediate social relevance, regardless of the outcome of the study. Whether or not a decision concerning publication of a paper should hinge on the outcome of the research raises deeper issues. Nevertheless, decisions concerning threshold for publication will be influenced by such things as a priori beliefs and estimates of the costs and benefits of publishing (or not publishing) a particular finding. We believe that these considerations should be explicit rather than leaving them implicit.
© 2006 Elsevier Inc. All rights reserved.

Keywords: Cognitive; Criteria for publication; Intelligence; Sex differences; Racial differences; Theory of signal detection

Should some fields of scientific inquiry be held to higher standards than others? Our concern for this issue stems from the publication of a number of articles dealing with racial differences in intelligence. We felt that some of these articles were poorly done, either due to methodology or faulty reasoning.1 On reflection, we decided that the issue was more general than the study of racial differences in intelligence. While one rarely knows where a scientific advance will lead, some scientific topics have more immediate social relevance than others. For example, studies of the role of environmental pollutants in the development of intelligence deal with a putative danger that might affect millions of people, and could lead (in the case of atmospheric lead, has led) to environmental regulations costing billions of dollars. Studies in fields that have immediate social consequence attract more controversy outside of psychological science than, say, studies of the location of intellectual functions within the brain, although we quite agree that the latter may ultimately be of more importance in answering questions about the causes of individual differences in cognitive capabilities.

The question we explore is whether or not studies with immediate social relevance should be held to a different standard of scholarship than is applied to studies whose initial impact is likely to be largely within science itself. Our argument is not directed at the outcome of such research, whether it be on this or that side of a particular social dispute. Our argument is directed at the quality to be demanded of research with immediate relevance, so that whatever the scientific findings are, non-scientists are not misinformed.

☆ We thank Thomas Bouchard, Nathan Brody, and Ian Deary for their constructive comments on an earlier version of this paper. Naturally, the conclusions expressed are entirely our own.
⁎ Corresponding author. E-mail address: [email protected] (E. Hunt).
1 We have purposely omitted citations in order to avoid controversy over the details of a particular study. We assure the reader that there was more than one article, that they appeared in several journals, and that our concern extended to articles on both sides of the debate over racial differences in intelligence.

0160-2896/$ - see front matter © 2006 Elsevier Inc. All rights reserved. doi:10.1016/j.intell.2006.10.002

In discussing this issue with colleagues we have found strongly held opinions, both pro and con. Three arguments have been advanced for applying the same standards to all studies, without regard to likely social relevance. One is that the highest standards should be applied to any study, making the issue non-existent. A second is that any elevation of standards for a particular type of study amounts to censorship, usually in an attempt to suppress findings that are seen as threatening to conventional morals or order. In the extreme, people who hold this view invoke the image of the prosecution of Galileo. The third is that any lowering of standards amounts to pandering to a specific viewpoint, either within science itself or from the more general society. For instance, a methodologically flawed study that claims to show that intelligence tests do not predict workforce effectiveness fits well with a hostility held by many toward selection by impersonal test scores. Subsequent technical criticisms of that study are likely to be lost outside of the scientific literature itself.

We disagree with all three of these arguments. There is no clear line between “good” and “bad” science. Everyone is for “good” research; it is to be on the side of the Angels. But what is “good” research and what are the standards that must be met before a particular piece of research or a research program can be given that label? Studies in the social and behavioral sciences do not fall into neat categories of “good” and “bad”. There are several reasons for this: only under very unusual circumstances can truly random samples of the populations to be generalized to be obtained; seldom, if ever, do measurements assess exact equivalents of the concepts they are supposed to evaluate; there are choices about the ways that statistical analyses will be done; and often there is more than one arguable interpretation of a finding. The estimated quality of a paper is a continuous measure. Journals publish papers of varying quality all the time. For that matter, scientists themselves differ in their own standards for being convinced by their data. The decisions to initiate research projects and to submit findings are acceptance–rejection decisions just as much as the decision to publish.

How should the threshold for submission or acceptance be set? The “pure science” argument is that the criterion should be based solely on scientific merit. As straightforward as this may seem, the argument is flawed because it is based on an oversimplified view of scientific inquiry. It fails to consider the fact that science is a human activity, embedded in and supported by the larger society. Scientific studies alone seldom, if ever, determine social policy decisions. However, one of the major motivations for the funding of scientific research is to obtain information that can inform policy makers and the general public. Research findings may then be used to influence social decisions, in either a responsible or irresponsible manner.

What is the duty of scientists when such debates are likely to erupt? Certainly the research itself should be continued. Scientists should do research on socially important and sometimes controversial topics. If scientifically solid findings are misused, the authors, as citizens, may protest, but as scientists they have behaved ethically. If an article is published providing evidence of a genetic basis for racial differences in intelligence, and the Ku Klux Klan trumpets the findings, this does not make the author either a poor scientist or a racist. Similarly, if a critical review is published that concludes that the case for racial differences is weak, and this review is then used by activists who want strict quotas for racial groups in college admissions, this does not make the author, or the editor, an advocate of racial quotas. Neither authors nor editors have the obligation or ability to police the use of published findings and interpretations.

What scientists at every level, authors, reviewers, and editors, do have is an obligation to get things right, i.e. to make sure that findings and interpretations likely to be injected into social debates are clearly stated, solidly supported, and unlikely to admit of accidental misinterpretation. Of course it can be argued that every finding should be clearly stated and supported. Consider, however, the following analogy. You should not drop a box of dishes. Nor should you drop a box of explosives. It is rational to take more care with the explosives than with the dishes.

We advocate taking an approach philosophically akin to the theory of signal detection (TSD). TSD advises decision makers to consider two things: how likely it is that a given signal comes from a target distribution, and what the consequences are of acting as if the signal did or did not come from the target distribution. Let us first look at probabilities, and then at the issue of consequences.

A decision maker's subjective probability that a particular conclusion follows from a scientific study depends upon two things: the decision maker's a priori beliefs in the conclusion and the probability that the observed data would occur if the conclusion is indeed correct. To be concrete, suppose that an empirical study shows that for the applicants to University X the gap in Scholastic Assessment Test (SAT) scores between African American and White applicants has diminished (or risen) by .4 standard deviation units from its value ten years ago. Should authors, reviewers, and editors believe, with some degree of strength, that the gap has changed in “intending college students” or, still more generally, that the gap has changed in the current cohort of university-aged people?

That depends. A person whose a priori beliefs hold that there is ample evidence that the gap is permanent may doubt the findings, and therefore be inclined to recommend against publication, emphasizing various technical concerns about the test involved or the lack of representative sampling. A person whose a priori beliefs are that the gap is malleable in the observed direction may regard the study as strong evidence for the claimed change, and accordingly recommend publication. This is natural, and is a specific example of the dictum that findings ought to be subjected to special scrutiny if they are unexpected or unusual. We ask the readers of this journal how strong the evidence would have to be to make them believe in the ability of some students to benefit from telepathic communications while taking an intelligence test. This is an extreme example, but the principle applies to any scientific topic, including the study of racial differences. A priori beliefs will inevitably influence an author, reviewer, or editor's interpretations. It is naïve to act as if they do not. In fact, Bayesian reasoning is usually considered to be quite rational.
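The way identical evidence moves reviewers with different priors toward different conclusions can be sketched with Bayes' rule. The sketch below is purely illustrative: the priors and the likelihood ratio are invented numbers, not estimates drawn from any actual study.

```python
# Illustrative only: how the same evidence leaves two reviewers with
# different prior beliefs holding different posterior beliefs (Bayes' rule).
def posterior(prior, p_data_if_true, p_data_if_false):
    """P(claim | data), given a prior and the two likelihoods."""
    num = prior * p_data_if_true
    return num / (num + (1 - prior) * p_data_if_false)

# Suppose the observed result is 4x more likely if the claimed change in
# the gap is real than if it is not (an assumed likelihood ratio of 4).
skeptic = posterior(prior=0.05, p_data_if_true=0.8, p_data_if_false=0.2)
believer = posterior(prior=0.50, p_data_if_true=0.8, p_data_if_false=0.2)
print(round(skeptic, 2))   # 0.17: still doubts the finding
print(round(believer, 2))  # 0.8: regards it as strong evidence
```

Both reviewers apply the same rational updating rule to the same data; the disagreement lives entirely in the priors, which is exactly why leaving those priors implicit is unhelpful.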

The next thing TSD tells us to look at is the costs. What are the relative costs of hits (publication of solid findings on either side of an issue), misses (rejection of a solid finding), false alarms (publication of a finding that turns out to be erroneous, misinterpreted, etc.) and correct rejections (turning down a paper that indeed does contain fatal flaws)? Considering these costs forces us to confront difficult issues, but ignoring them is worse. Here are some of the considerations, using the language of signal detection.
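The cost side of the argument can also be made concrete. The sketch below applies the TSD expected-cost logic to a publish-or-reject decision; all cost values are invented placeholders, chosen only to show how a publication threshold follows from the cost ratio.

```python
# A sketch of TSD cost logic applied to a publication decision.
# The cost values are invented placeholders, not estimates from the text.
def expected_cost(p_solid, publish, cost_false_alarm, cost_miss):
    """Expected cost of a decision, given P(the finding is solid).
    Hits and correct rejections are treated as costless here."""
    if publish:
        return (1 - p_solid) * cost_false_alarm  # risk of a false alarm
    return p_solid * cost_miss                   # risk of a miss

# If false alarms are judged 3x as costly as misses, publishing only
# becomes the cheaper option once confidence passes 0.75.
for p in (0.6, 0.8):
    risky = expected_cost(p, True, 3.0, 1.0)
    safe = expected_cost(p, False, 3.0, 1.0)
    print(p, "publish" if risky < safe else "reject")
```

The point of the sketch is the one the editorial makes in prose: whoever sets the cost ratio sets the publication threshold, whether or not that choice is ever stated explicitly.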

Hits: True positives are findings and/or interpretations that do indeed have validity. For example, the finding that in industrial societies IQ scores predict both academic and non-academic success to a moderate degree is a fact, and should be part of any consideration of racial differences in IQ scores, however inconvenient this fact may be to proponents of certain political positions. The decision to publish Schmidt and Hunter's (1998) excellent meta-analysis showing that intelligence tests are important predictors of workplace performance was certainly a correct one. Unfortunately, numerous popular media articles on intelligence have ignored their findings, but the editors and reviewers of Psychological Bulletin, where the article appeared, cannot be blamed for that.

More generally, if writers, editors, and reviewers are confident that a paper has been correctly done, the article should be published. High standards are acceptable; impossible standards are not. Not publishing an article because it establishes an inconvenient truth violates the canons of science itself.

We do add one caveat, though. Clarity, which is always desirable, is especially important in articles with immediate social relevance. This is particularly an issue with the discussion section. If we regard an article in a scientific journal as a communication between scientists, the important parts of the article are the methods and results sections. Authors are certainly entitled to draw their own conclusions. Readers who are themselves specialists in the field will use the methods and results sections, plus their own background knowledge, to draw their own conclusions. The situation is different when a non-specialist reads the same article. Non-specialists, particularly journalists and policy makers, are likely to move quickly to the discussion section. They will spend their time asking what the author's conclusions mean for them, not whether the author's conclusions are justified by the methods and results. Therefore, when apparently valid results are published that bear on an important social issue, extra care has to be exercised to be sure that conclusions are stated unambiguously. Alternative explanations and limitations on generalization must be stated in equally unambiguous terms.

While everyone should be for clarity, the burden of obtaining it will inevitably fall on reviewers and editors, for writers are notoriously poor censors of their own ideas.

Correct rejections: Poorly done studies, no matter what they purport to establish, should not be published, independently of whether or not authors, reviewers, and editors think that the conclusion is correct. Let us return to our example of applicants to University X. Suppose University X is very specialized, e.g. a military academy, a school with a strong religious orientation, or a school that has historically been associated with a particular ethnic or cultural group, such as Howard University or Yeshiva. The students who choose to apply to such schools exercise substantial self-selection, and the mechanisms of self-selection may well vary across different racial and ethnic groups and be different for males and females. Therefore, regardless of the facts of changes in the race–IQ gap, a study of such a population would be very weak evidence for any conclusions about the gap in a more general population.

The same reasoning applies to weaknesses in methodology. Many people talk about social issues, including journalists, politicians, and the ubiquitous person on the street. The only claim that a scientist can make to special status in these conversations is that the scientist applies methodologies that have been validated within the scientific discipline. Therefore, when social and behavioral scientists report on a socially relevant issue they have an obligation to be especially concerned about such topics as sampling and the use of appropriate methodologies for data collection and analysis. Editors and reviewers are particularly important here. They should be concerned over methodological issues even when they believe that the proffered conclusion is correct. There is an analogy here to the actions of the American Civil Liberties Union, which is famous for insisting that proper procedure be followed in criminal cases, even if many members of the union believe that society would be better off if the accused were in jail.

The wise author ought to be as concerned about this as editors and reviewers. If someone publishes a paper on a topic that is of immediate social relevance, and if the results run counter either to commonly accepted knowledge or to current social–political beliefs, the author can expect challenges, both by experts within a field and by “powerful individuals” outside the field. Both the author and the editor of the journal are well advised to be prepared for the challenge. While we can think of bad examples, we prefer to offer a good one.

In The Bell Curve (Herrnstein & Murray, 1994) Richard Herrnstein and Charles Murray argued that IQ scores are valid indicators of the mental competences required by our society and that the distributions of IQ scores indicate that intelligence is generally weaker in the African American population. The firestorm that followed was predictable. Murray was well prepared for it. (Herrnstein had passed away.) Herrnstein and Murray were very clear about the data that they had analyzed. One of us (EH) corresponded with Murray on this topic and felt that very few laboratories would have been as able to assist in alternative analyses as Murray was. A number of very competent statisticians reanalyzed their data, added further variables, and, insofar as we know, were able to offer only relatively small modifications to their conclusions.

There is a lesson here. If you have solid evidence that bears on a current social issue, by all means publish it.

You will not be able to defend yourself from ad hominem attacks. There is no way that Murray can defend himself from the charge that he is a racist because he published certain facts. You can defend yourself, as, in our opinion, Murray has very effectively done, from the charge of improperly collecting or analyzing data.

The same argument applies at the editorial level. One of the “dirty little secrets” about peer review (which is quite obvious when one thinks about it) is that editors can exert considerable influence behind everyone's back by selecting reviewers for articles. Editors who publish solid findings on controversial topics, as they certainly should, have to be ready to defend themselves against charges that they have biased the reviewing process by selecting reviewers who are so committed to a particular viewpoint that they are willing to overlook weaknesses in a paper because they agree with the conclusion.

The two cases we have just considered, hits and correct rejections, are easy to deal with because they involve correct decisions. The next two cases, false positives and misses, are much harder to deal with. Ideally we would like them to go away, but they are going to happen. What are the costs and benefits involved?

False positives: Recall that our definition of a false positive is the publication of results that are methodologically flawed, regardless of how these results bear on a particular topic. By our reasoning a false positive would occur if a flawed study reporting a change in the racial gap in IQ scores were published, regardless of the direction of the change.

Within the scientific community false alarms do relatively little harm. They are exposed after a time, either by failures to replicate or by subsequent critical analysis. The situation is quite different when false alarms are picked up and amplified upon outside of science. While such amplifications are seldom decisive in debates on social issues, they do provide faulty ammunition for one or the other side of a debate. If scientists know, as we often do, that a particular finding is going to be used in an important public debate, we also know that we have an obligation to get the facts right in the initial publication, over and above our general obligation to get facts right in order to avoid needless debates and expense within science. We can legitimately adjust our standards upwards when we think that there is a chance that publication of an erroneous finding might do social harm.

Consider our two prototypical examples: racial differences in intelligence and the influence of environmental agents on intelligence. In the first example a false positive finding of differences could obviously contribute to unwarranted discrimination. A false positive finding of equality (or, to be more correct, of only trivial differences) is also serious, because such findings quite possibly will be used as evidence for discrimination, followed by a call for ameliorative action, when no discrimination exists. In the second case, flawed results suggesting that a pollutant or medical treatment had a deleterious effect upon intelligence, especially in children, could result in public health actions, often with considerable cost, taken to eliminate a risk that did not exist. On the other hand, flawed results suggesting that the environmental agent had no risk could also have large economic and social consequences.

Reporting results on controversial issues that subsequently have to be withdrawn also has a cost to the scientific endeavor as a whole, and to the scientists involved. If the findings of a study go against firmly held beliefs, one has to be prepared to counter an attack on one's credibility. If the reported research is, in fact, weak, the attack may well succeed. Obviously a scientist should be willing to publish and defend good research on controversial topics. Galileo is the prototypical example. Attempting to defend weak research on a controversial topic can have severe career consequences. The analogy is to carelessly dropping a box of dynamite, not to the prosecution of Galileo.

Misses: A similar argument can be made concerning misses, decisions not to publish a socially important finding because of doubts concerning its validity. Obviously if the doubts are high enough a finding should not be published. But what about a finding that might be true, and if true would have serious implications for a current social issue? This is possibly the most difficult question to decide, because it is much harder to identify the costs associated with not publishing a finding, or not even being willing to investigate it, than it is to identify the costs of a poor publication.

Nevertheless, there are real costs to a decision not to publicize an inconvenient truth. The most trivial one is a blow to one's self-esteem. A much more serious concern is that, as no less an authority than Socrates pointed out, the roles of philosophers and scientists are to be gadflies to the state, and, we would add, to the conventional wisdom. This is particularly the case for research on group differences in intelligence. It is a well documented fact that Hispanics and African Americans, as a group, lag behind Whites and Asians on a wide variety of socioeconomic and educational measures. This is a problem for our society and its causes need to be understood. Oversimplifying greatly, suppose that there are only three possible causes for this discrepancy: differences in intellectual competence (intelligence in the most general sense), differences in social organization and motivation in different communities, and discrimination on the part of the majority against the minority. It does no one any good if programs are developed to attack one of these putative causes when the problem is actually caused by another. There are costs to society if psychological and educational scientists fail to report what the situation actually is. The fact that these costs occur over a long period of time does not make them less real.

How are we to balance these issues? Formally, signal detection is of no help, for applying TSD literally would lead to an impossible problem of quantification. However, the philosophy of TSD can help.

TSD tells us that we can avoid errors, in general, by increasing our ability to discriminate between signals emanating from the target and noise distributions, technically by increasing the d' parameter, which is a measure of the distance between the means of the signal and noise distributions, measured in standard deviation units of the noise distribution. In the present context, if d' were infinite we would always publish the good and never publish the bad, and let the chips of social relevance fall where they may. Since such a Nirvana is not possible, either in this or any other field, TSD further tells us that we can control the rate of errors of a particular type by adjusting our decision boundary, our confidence that the conclusion of a study is, in fact, correct. Where the decision boundary is set depends upon two factors determined outside of the context of the study under consideration: the decision maker's a priori belief about the likelihood of the outcome, prior to the study's being conducted, and the decision maker's estimate of the relative costs of hits, misses, false alarms, and correct rejections.
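For readers less familiar with TSD, the relationship between d', the decision criterion, and the two error rates can be illustrated with the standard equal-variance normal model; the particular numbers below are illustrative only.

```python
# Illustrative only: d' and the error rates it implies under the standard
# equal-variance normal model of TSD.
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def error_rates(d_prime, criterion):
    """Miss and false-alarm rates when the criterion is placed
    (in standard deviation units) above the noise-distribution mean."""
    false_alarm = 1 - phi(criterion)      # noise mistaken for signal
    miss = phi(criterion - d_prime)       # signal falling below criterion
    return miss, false_alarm

# Raising d' (doing more discriminating research) shrinks both error
# rates at once; moving the criterion alone only trades one for the other.
print(error_rates(1.0, 0.5))   # modest d': both error rates near 0.31
print(error_rates(3.0, 1.5))   # high d': both error rates near 0.07
```

This is the quantitative version of the editorial's point: improving research quality raises d' and reduces both kinds of error simultaneously, whereas adjusting the publication threshold merely redistributes errors between false alarms and misses.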

Our ability to discriminate erroneous, doubtful, and solid findings depends upon the effort that we put into design, execution, analysis, and review of research. Such effort is costly. Researchers may have to decide not to do a study because sufficiently appropriate populations are not available for sampling, money is not available for administering the best or desirably comprehensive measures, or because desirably appropriate statistics require larger samples than the researcher can afford. Virtually every empirical study represents a compromise between what is desirable and what is affordable in money and effort. So does reviewing. Editors and publishers have a legitimate interest in receiving rapid reviews and in receiving accurate reviews. Speed–accuracy tradeoffs are inevitable, for careful reviews require time, and competent reviewers are likely to be busy reviewers.

2 This example has long been a favorite of EH's in classroom presentations. We thank Ian Deary for reminding us of it for use in this paper, and for providing the statistics.

In TSD terms, these issues affect d', not the decision threshold. When research has immediate social implications we believe that it is desirable to demand that d' be as high as possible, in order to minimize the far more difficult issue of deciding what the decision threshold should be. We say this in no small part because the d' requirement for high quality research can be applied without consideration of the outcome of the research. We can be equally demanding of quality for studies that do or do not indicate racial differences in intelligence. A consideration of the threshold for publication that is dependent upon the outcome of the research inevitably involves consideration of social costs. The way to avoid such complex, controversial considerations is to try to get a clear signal, to move judgments about research quality as far as possible from the decision boundary.

But we are never going to reach a situation where our ability to discriminate is perfect. A decision boundary has to be set and, as we have pointed out, this involves an amalgam of a priori beliefs and judgments of costs. It would be nice to avoid them, but we cannot, and acting as if we are ignoring them is simply not being honest.

A priori beliefs influence how a researcher decides to do a study and submit results for publication. They influence reviewers, and they influence editors. Inevitably, a priori beliefs will differ across individuals. There are competent researchers in the intelligence field who are convinced that a genetic explanation for racial differences is highly likely; there are equally competent researchers who feel that the evidence for genetic differences is woefully weak. When dealing with controversial topics editors would be well advised to avoid selecting reviewers who are known to be in strong agreement or disagreement with an article's conclusion, simply because the implicit cognitive biases of such reviewers cannot help but exert an influence over their explicit evaluation of a study.

Adjusting the decision boundary controls the rates of different types of errors. When the issue is put this way, it seems to us obvious that either implicitly or explicitly authors, reviewers, and editors balance the costs and benefits of different types of errors. Given that we have a certain level of confidence in a result, which is never going to be infinite, what are the relative costs of publishing or not publishing? This is going to depend upon the perceived costs of publishing erroneous results or suppressing results that may turn out to be true. Some commentators on earlier versions of this article have argued that such considerations should never enter into a decision about publication, because to do so brings politics into science. This is a real concern. We point out, however, that a failure to vary criteria depending upon the topic is a political judgment, just as much as varying the criterion is. Such a policy amounts to a claim that, for example, research on the relation between intelligence and critical flicker fusion and research on the relation between intelligence and racial identity have identical potentials for influencing social policies in the short term. (In the long term, no one knows.) We do not see how this position can be maintained.

A counter to this argument is that when research is “controversial” it is best to let the issue be thrashed out in the literature, by inviting commentaries on published articles (Detterman, 2006). We have a good deal of sympathy for this position, and believe that this is often a reasonable compromise between publishing and not publishing. We would point out two qualifications. First, we agree that this is the best way to deal with situations in which the methodology and empirical findings are well established, but the interpretation may be ambiguous. In other words, arguments over the discussion section are often profitable; arguments over the methods section are usually less desirable (and far easier to resolve prior to publication). Second, when a finding is dramatic, the scientific journals, and even more so the media outside of science, are far more likely to talk about the original findings than about criticisms of them. There is a dramatic example in our own field. Ceci and Liker (1986) found that in a group of race track bettors there was essentially no relation between IQ scores and the ability to predict handicaps at post time. They interpreted this finding as an example of a general lack of relationship between IQ scores and ‘real world’ performance. Detterman and Spry (1988) pointed out substantial errors in Ceci and Liker's study. As of August 2006, the Ceci and Liker article had been cited 79 times; the Detterman and Spry article had been cited 11 times.²

On the one hand, there is no way to avoid considering the social relevance of a scientific finding when deciding whether or not to publicize it. On the other hand, we do want to avoid letting the direction of science be controlled by current social policies and beliefs. We do not believe that it is desirable to ignore this issue by claiming that criteria are always the same, or by acting as if science is uninfluenced by social considerations. We strongly advise another course: be explicit about the issues that go into the decision process.

There is an excellent example here: the way in which the journal SCIENCE treated the decision to publish information about the genetics of the avian-derived influenza virus responsible for the 1918 pandemic. The argument for publishing was that, in addition to the usual scientific criterion of adding to knowledge, publication might advance research on what is currently a very real threat, the mutation of a widespread avian influenza virus into a form capable of rapid human-to-human transmission. The argument against publication was that the information in the article might assist a bioterrorist who wanted to develop and release such a virus. The editors of SCIENCE eventually decided to publish the information; but they also considered the risk–benefit tradeoffs and publicly described their procedure for doing so (Sharp, 2005).

Researchers interested in intelligence are unlikely to face such a dramatic example. Although the stakes are lower, the principle is the same. When dealing with findings that have immediate social relevance it is appropriate that researchers, editors, and reviewers consider risk–benefit tradeoffs. The process by which this has been done should be public, either in published commentaries, in explicit statements of value judgments within reviews, or in documents that describe the judgment process and that, on challenge, can be made available to anyone who wishes to inquire.

We argue that it is always appropriate to demand, on an a priori basis, that studies with immediate social relevance be held to a higher standard of technical excellence than studies directed at purely scientific issues with less immediate social relevance.

We cannot stress too strongly that this emphasis applies regardless of the outcome of the research. The appropriate standard of excellence adheres to the topic of the research, not to the outcome of the study. This is our most important message. Even when this caution is observed, there will be cases where the decision to publish involves a cost–benefit analysis that considers both the likelihood of errors of different types and their consequences. In these cases the a priori believability of the results and the presumed costs of different errors will be at issue. They will be considered either implicitly or explicitly. We believe that it is far better that consideration of the issues be explicit and documented rather than implicit and denied.

References

Ceci, S. J., & Liker, J. (1986). A day at the races: A study of IQ, expertise, and cognitive complexity. Journal of Experimental Psychology: General, 115, 255–266.

Detterman, D. K. (2006). Editorial note on controversial papers. Intelligence, 34(1), iv.

Detterman, D. K., & Spry, K. M. (1988). Is it smart to play the horses? Comment on “A day at the races: A study of IQ, expertise, and cognitive complexity” (Ceci & Liker, 1986). Journal of Experimental Psychology: General, 117(1), 91–95.

Herrnstein, R. J., & Murray, C. (1994). The bell curve: Intelligence and class structure in American life. New York: The Free Press.

Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124(2), 262–274.

Sharp, P. A. (2005, October 7). The 1918 flu and responsible science. Science, 310, 17.