Are we doing any good by doing really well? (Where's the Bacon?)
Donald Herbert
Citation: Medical Physics 30, 489 (2003); doi: 10.1118/1.1555493
View online: http://dx.doi.org/10.1118/1.1555493
View Table of Contents: http://scitation.aip.org/content/aapm/journal/medphys/30/4?ver=pdfcov
Published by the American Association of Physicists in Medicine



Are we doing any good by doing really well? (Where's the Bacon?)

Donald Herbert a)

University of South Alabama College of Medicine, Department of Radiology, Mobile, Alabama 36617

(Received 22 November 2002; accepted for publication 17 December 2002; published 17 March 2003)

Francis Bacon, who with René Descartes laid the intellectual foundations for Western science in the seventeenth century, asserted that the purpose of all knowledge is "action in the production of works for . . . the relief of man's estate." We briefly assess several aspects of a few of the current efforts directed to the production of such "works" with respect to such "relief" as they may provide: cancer mortality, the medical literature, evidence-based medicine, clinical trials, observational databases, and criteria for the promotion and tenure of the medical faculty. We suggest why each of these efforts appears to have failed to some degree and then propose some measures that may possibly serve as correctives. © 2003 American Association of Physicists in Medicine. [DOI: 10.1118/1.1555493]

I. INTRODUCTION

In the preface to his seminal work, the Novum Organum, published in 1620, which laid the philosophical foundations for a new, empirical, approach to learning, and from which he proceeded with the "instauration," or renovation, of science that was to become Western science, Francis Bacon, Baron of Verulam, Viscount St. Albans, castigated the weaknesses of its immediate predecessors, the sciences of Aristotle, Plato, et al.: "Observe also that if sciences of this kind had any life in them, that could never come to pass which has been the case now for many ages—that they stand almost at a stay, without receiving any augmentations . . . insomuch that many times not only what was asserted once is asserted still, but what was a question once is a question still."1

In a recent paper Mohan et al. asserted that one of the difficulties encountered in defining an objective function for IMRT (the most recent—and by far the most complex—of the technologies and hardwares that implement in daily practice the current radiation oncology beliefs on dose–volume-response) is that, ". . . in spite of many decades of radiotherapy experience, the available dose- and dose-volume-response data are meager and of inadequate quality and reliability. [emphasis added] Furthermore, specification of tradeoffs among the tumor and normal tissue end points is highly subjective,"2 thereby substantially repeating what had been asserted more than a quarter-century before (and what ". . . is a question still": How to objectively and routinely maximize the probability of uncomplicated control of disease?) by another eminent group: "After 7 decades of radiotherapeutic practice, precise knowledge of tumoricidal doses and tolerance of normal tissues is lacking. . . . The radiation therapist is admittedly treating to 'tolerance' doses rather than to specific 'tumoricidal' doses. The risks are poorly defined, and therapeutic ratios remain largely abstractions rather than concrete estimates. The amount of radiation damage that is acceptable for the purpose of curing a cancer remains one of personal philosophy as long as the overlapping zones of normal tissue tolerance and tumor curability are inadequately defined" (Rubin and Casarett).3

Considered together, the foregoing assertions of Rubin and Casarett (1973) and of Mohan et al. (2002) might suggest to some, in yet another phrase from Bacon's animadversions on Aristotle et al., that in modern radiation oncology ". . . the state of knowledge [as distinct from the state of hardware] is not prosperous nor greatly advancing . . ."

The 1994 ASTRO Presidential address by Lester Peters does not appear to wholly confute that suggestion: ". . . from the perspective of the hard-nosed businessman, there has yet to be any measurable improvement in the 'bottom line': the age-adjusted mortality from cancer in the population as a whole since 1970 [at the beginning of the 'War on Cancer']. Unfortunately, our major successes have been in relatively rare diseases, which have little impact on the overall picture."4

The medical knowledge stored in medical journals is now vast, and the efforts at the systematic deployment of that knowledge in medical practice have given rise to the new field of Evidence-based Medicine:5 "In the early days of medicine, decisions about methods of diagnosis and treatment were based on authority . . . We are now in the era of evidence-based medicine. This is a shift in the paradigm of medicine, from authority-based to evidence-based practice."6 It might be termed a belated neo-Baconian shift.

For Bacon, the purpose of all knowledge is "action in the production of works for the promotion of human happiness and the relief of man's estate."1 However, Brook has asserted, on the evidence of a RAND study, that ". . . the purpose of journals is not to disseminate information but to promote faculty—this is the sole reason and justification for the journal's existence,"7 an observation quite consistent with all previous and current evaluations of the quality of the information in the medical literature, since, "Put simply, much poor research arises because researchers feel compelled for career reasons to carry out research that they are ill equipped to perform and nobody stops them."8 (See, for example, Altman,8,9 Williamson et al.,10 and, especially, AAPM Report 43.11) But Anderson (1960) remarks that in Bacon's vision of his new science (i.e., what is now our science), "there is no place for knowledge which has for its purpose mere contemplation"1 (say, by a Faculty Committee on Promotion and Tenure?).

Med. Phys. 30 (4), 489, April 2003; 0094-2405/2003/30(4)/489/6/$20.00; © 2003 Am. Assoc. Phys. Med.

Given the findings of all reviews of the quality of medical evidence (for example, "This is a sorry state of affairs . . . you . . . can rarely take what is published or presented at clinical and scientific meetings at face value."12 Or, ". . . the evidence substantiating the effectiveness of many current and emerging practices is frequently questionable and in many instances is entirely lacking. Research on medical interventions is often poorly designed and methodologically flawed."13), it may well be that it is not only sometimes more useful but also more prudent to "merely contemplate" most of the overabundant medical literature rather than to attempt to exploit it either to inform medical practice or to guide medical research. A lack of evidence in the medical literature is obviously an existential issue for the practice of evidence-based medicine.

II. CLINICAL ROLES FOR OBSERVATIONAL DATABASES

Evolutionary Operation (EVOP), a common statistical method for process improvement, is also a management tool. "Its basic philosophy is that it is inefficient to run an industrial process in such a way that only a product is produced, and that a process should be operated so as to produce not only a product but also information on how to improve that product."14 An extrapolation of the EVOP philosophy to radiation oncology practice suggests that a course of treatment should produce not only an acceptable set of radiation responses (in tumor and normal tissues) in the present patient but also information on how to improve that set of responses in the next patient. Thus the rigorous collection, editing, collation, formatting, and sophisticated analysis (". . . a [mathematical] model is the tool that converts data into insights."15) of cumulative, detailed, and circumstantial local data, both prospective and follow-up, on outcomes (in tumor and normal tissues) and treatment, in particular "dose and dose–volume-response data," if properly deployed, can help to bring radiation oncology purposes more in line with the Baconian purpose. Of course, ". . . the credibility of any investigation depends mainly upon its adherence to high scientific standards in obtaining data and the care taken in analyzing the data . . . The high scientific standards maintained by good observational studies can and should equal the rigour of a properly conducted RCT [randomized controlled trial]."16 Califf et al.17 have ". . . identified several characteristics common to successful clinical databases. First, important data items should be delineated and collected prospectively. The quality control mechanisms of a database should be the same as those used in a randomized controlled trial. Second, patients must be followed compulsively at regular intervals. Again, the methods of complete, accurate follow-up should be no different between databases and randomized controlled trials. For a clinician to be aware of his or her clinical experience, these first two principles will allow for adequate collection and tabulation of data. If the database is to be used to evaluate therapy, a third essential component should be included: a team of investigators, including clinicians, clinical epidemiologists, biostatisticians and computer scientists." In sum, observational databases can and should be developed locally—including local implementation of the requisite sophisticated analyses—and then regularly deployed to inform local radiation treatment protocols that can guide local practices and inform local research.
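Box's EVOP cycle lends itself to a compact numerical sketch: run the process at a small factorial pattern of perturbations around the current operating settings, estimate each factor's effect from the (noisy) observed responses, and nudge the operating point in the favorable direction. Everything below (the two factors, the response function, and the step sizes) is invented purely for illustration; it is not a clinical model.

```python
import random

def run_process(temp, dose_rate):
    """Hypothetical noisy process response; stands in for the observed
    outcome of one production run (or, in the EVOP analogy, one treatment)."""
    true_value = 100 - (temp - 70) ** 2 - 10 * (dose_rate - 2.0) ** 2
    return true_value + random.gauss(0, 0.5)

def evop_cycle(center, steps, n_reps=3):
    """One EVOP cycle: a replicated 2x2 factorial around the current
    operating point. Returns the estimated main effect of each factor."""
    t0, d0 = center
    dt, dd = steps
    corners = {}
    for st in (-1, 1):
        for sd in (-1, 1):
            obs = [run_process(t0 + st * dt, d0 + sd * dd) for _ in range(n_reps)]
            corners[(st, sd)] = sum(obs) / n_reps
    # Main effect of a factor: average response at its high settings
    # minus average response at its low settings.
    eff_t = (corners[(1, -1)] + corners[(1, 1)] - corners[(-1, -1)] - corners[(-1, 1)]) / 2
    eff_d = (corners[(-1, 1)] + corners[(1, 1)] - corners[(-1, -1)] - corners[(1, -1)]) / 2
    return eff_t, eff_d

random.seed(1)
center, steps = (65.0, 1.6), (1.0, 0.1)
for _ in range(20):
    eff_t, eff_d = evop_cycle(center, steps)
    # Move the operating point one step in the direction of improvement.
    center = (center[0] + steps[0] * (1 if eff_t > 0 else -1),
              center[1] + steps[1] * (1 if eff_d > 0 else -1))
print(center)  # drifts toward the (hypothetical) optimum near (70, 2.0)
```

The point of the sketch is the EVOP philosophy itself: every cycle yields both "product" (the runs) and information (the estimated effects) that improves the next cycle.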

Continuing this theme (see Herbert18), clinical databases are a crucial component in the useful extrapolation of the results of RCTs to local clinical practice, since (1) "Clinical trials are typically done under carefully controlled circumstances in a university setting and do not reflect the use of a technology in community practice"19 and (2) "Patients in a trial are not to be regarded as a random sample from some population,"20 i.e., clinical trials have high internal validity but low external validity. Indeed, "Data on the quality of community practice by hospital and physician are essential for making proper use of data from randomized trials . . . the quality with which each arm of the trial may be carried out in the community may overwhelm any differences between alternative therapies that are demonstrated in the trial itself." . . . "In the absence of information on the effectiveness of a procedure, information on the efficacy of a procedure may do more harm than good."7 Although clinical databases have low internal validity, they have high external validity: ". . . the observational database offers a mechanism for tailoring therapy to individuals;"16 i.e., it provides a solution to Bernard's Dilemma: "The dilemma articulated by Bernard in 1865 still haunts the clinician today: the response of the average patient to therapy is not necessarily the response of the patient being treated."21

Brook further remarks that, "I submit that the data published in the 'New England Journal of Medicine' may harm more people than it helps in the absence of a system to identify which [say] surgeons and their hospital teams have complication rates that are sufficiently low to justify performing this procedure. Without a commitment to making information available on outcomes by physician and hospital, efficacy data produced by randomized trials may be elegant, but may also be misleading. If we are going to invest money in performing expensive randomized trials, should we also insist that valid information on performance at the physician and hospital level must also be released so that the trial results can be properly used?"7

But obviously, before "valid information" can be released it must be carefully collected, collated, edited, formatted, analyzed, etc., to form readily useful implementations residing in observational databases. It is obvious from the Mohan et al. paper2 that such information is not now available even for the narrow task of defining objective functions for IMRT, much less for broadly guiding extrapolation of the results of RCTs to inform local practice, or for release to local physicians to guide their referrals. And local clinical databases are quite far from being in an acceptable state for the useful deployment thereon of either a meta-analysis (the following statement is taken from the Introduction to the 1996 ACR Appropriateness Criteria: "Since data available from existing scientific studies are usually insufficient for meta-analysis, broad-based consensus techniques are needed for reaching agreement in the formulation of the appropriateness criteria. The ACR uses a modified Delphi technique to arrive at a consensus level."22 But a Delphi technique is a data-poor substitute for meta-analysis.) or the "newest new thing"—Data Mining (Hand23): "Data mining is a new discipline lying at the interface of statistics, database technology, pattern recognition, machine learning and other areas. It is concerned with the secondary analysis of large databases in order to find previously unsuspected relationships which are of interest or value to database owners." (See also Hastie, Tibshirani, and Friedman.24) In healthcare, of course, it is the "previously unsuspected relationships" between treatment and outcome that are of interest to the "database owners."
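By way of contrast with the Delphi consensus route, the arithmetic of a fixed-effect (inverse-variance) meta-analysis is itself elementary; what is scarce, as the ACR statement notes, is study data of sufficient quality to feed it. The three "studies" below are invented solely for illustration.

```python
import math

def pooled_estimate(effects, variances):
    """Fixed-effect (inverse-variance) pooling.
    effects:   per-study effect estimates (e.g., log odds ratios)
    variances: their squared standard errors."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))  # standard error of the pooled estimate
    return pooled, se

# Three hypothetical studies reporting log odds ratios and standard errors.
log_ors = [0.40, 0.10, 0.25]
ses = [0.20, 0.15, 0.30]
pooled, se = pooled_estimate(log_ors, [s * s for s in ses])
ci = (pooled - 1.96 * se, pooled + 1.96 * se)
print(round(pooled, 3), [round(x, 3) for x in ci])  # pooled log OR ~ 0.214
```

Each study is weighted by the precision of its estimate; a Delphi panel, by contrast, weights opinions rather than data.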

III. ON PRODUCING "WORKS" FOR "THE RELIEF OF MAN'S ESTATE"

From Brook's remarks, does it follow that we should insist that local, clinically useful, observational databases must be designed and implemented? And should we also insist that medical reporting practices become less "career-oriented"? That is, should medical science become more Baconian, i.e., produce more knowledge for "relief of man's estate" and less for "mere contemplation"?

If we do so insist, then the evidence of past experiences in the introduction into medical research and medical practice of the randomized controlled trial (RCT) and, more recently, of medical outcome studies, especially "severity-adjusted" assessments of provider performance, suggests that it is only by the introduction of strong coercive administrative and legislative measures that we can expect to continually assure that most scientific effort is directed to "relief of man's estate" and not to "mere contemplation." We give some examples.

The methodology for, as well as a strong case for, the implementation and deployment of clinically useful observational databases were well described more than a decade ago. (See, for example, the papers by Califf et al.;17 Hlatky et al.;16 Pryor et al.;25 Marinez et al.;26 and McDonald and Hui.27) Yet apparently they are still mostly developed, implemented, and deployed only where the response to demand by payors for empirical quantitative evidence of provider-specific, severity-adjusted performance has been mandated by state legislation (or where it is required for reimbursement). In such cases both the format and content of clinical databases are now strongly influenced by the requirements of the payors. For example, "Severity measurement systems hold important implications for physicians. At a minimum, they could standardize the content, structure and terminology of the medical record . . . . At the outside, they could affect the choice and timing of diagnostic and therapeutic interventions" (Iezzoni, Schwartz, and Rustica28). Among the earliest and perhaps the best examples are the observational databases supporting the published reports of provider-specific (physician, physician group, hospital) severity-adjusted performances released annually since 1991 by the payor-driven Pennsylvania Health Care Cost Containment Council (PHC4).29 The corresponding HCFA performance-assessment program based on the Medicare database30 was quietly dropped several years ago after only a few years' experience. The reason given was the difficulty of adequately capturing in a severity-adjustment model the effect on provider performance of some socioeconomic features characteristic of the large inner-city hospital.31
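A common form for severity-adjusted performance reporting of the PHC4 kind is the observed-to-expected (O/E) ratio: each patient's expected risk is taken from a severity model fitted to pooled data, and a provider's observed event count is compared with the sum of those expected risks. The sketch below uses invented numbers and assumes the expected risks have already been produced by such a model.

```python
def oe_ratio(patients):
    """Observed/expected complication ratio for one provider.
    patients: list of (observed_event, expected_risk) pairs, where
    expected_risk comes from a severity model fitted on pooled data."""
    observed = sum(ev for ev, _ in patients)
    expected = sum(risk for _, risk in patients)
    return observed / expected

# Hypothetical case mix: provider A treats much sicker patients than B,
# yet both have the same raw complication rate (2 events in 5 patients).
provider_a = [(1, 0.30), (0, 0.25), (1, 0.40), (0, 0.20), (0, 0.35)]
provider_b = [(1, 0.05), (0, 0.05), (0, 0.10), (1, 0.05), (0, 0.05)]
print(round(oe_ratio(provider_a), 2))  # 2 observed vs 1.5 expected -> 1.33
print(round(oe_ratio(provider_b), 2))  # 2 observed vs 0.3 expected -> 6.67
```

The example shows why unadjusted rates mislead: the two providers look identical on raw counts, but severity adjustment flags only the one whose events far exceed the risk profile of its patients, which is precisely the socioeconomic-capture problem that reportedly defeated the HCFA program.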

Similarly, both the weaknesses in the medical literature and the methods for their remediation have been clearly and repeatedly identified over the past 140 years. For example, in 1858, Radicke, a professor of physics at the University of Bonn, remarked that the "mathematical ignorance of physicians" leads to a "stream of baseless and, to a greater extent, erroneous doctrines which daily threaten to overwhelm medical science" (Matthews32). Despite the rapid development and medical deployments of statistical methods in the following half-century, Greenwood, in a 1908 letter to Karl Pearson, remarked, "With all your experience you have no conception of the intense badness of much that appears in the medical press in the name of experimental science" (Matthews32). Major Greenwood, a member of the British Medical Research Council and a protégé of Pearson, was (along with Florence Nightingale) foremost among those who initially sought to extend quantitative, statistical methods to inform and guide medical practice and research. Karl Pearson was the principal leader in the development of statistical methods during that period. (He founded at University College London the first department of mathematical statistics.) Many well-informed and detailed criticisms of the quality of the offerings in the "medical press" have appeared regularly since then. (See again Altman,8,9 Williamson et al.,10 and AAPM Report 43.11)

One of the more novel remedies for the poor quality of much of the medical literature was recently proposed by Irwin Bross, an eminent biostatistician, in a 1990 paper in which he observes that, "It may be no exaggeration to say that scientific fraud in biomedical research has reached epidemic proportions," and further remarks that the problem of fraud in biomedical research is ". . . only compounded when the 'findings' are endorsed by 'blue-ribbon' scientific panels. This does not make them any less fraudulent but it shows how gullible scientists can be when it comes to statistical methods." Bross suggests that in the long run, ". . . the surest way to eradicate such fraud is for biostatisticians to do their own science" (Bross33). However, to several in the AAPM his proposal seems to have it backward. A different response to Bross's "epidemic" was proposed in AAPM Report 43:11 medical scientists must "do their own statistics." The basis for the AAPM response seems obvious: "Now it is possible to maintain . . . that statistics in its broadest sense is the matrix of all experimental science and is consequently a branch of scientific method, if not Scientific Method itself; and, hence, that it transcends the application of the scientific method in sundry fields of specialization. The scientist should know statistics as he knows logic and formal language for communicating his ideas" (see Kendall34). AAPM Report 43 also differed from previous published critiques of the medical literature by explicitly demonstrating not only "what went wrong" (often risibly so), but also how to "do it right," in each of several peer-reviewed, published, and frequently cited studies in radiation biology that were anatomized at some length in the 350-page report.

IV. ON THE "GOLD STANDARD"

That so little movement has occurred on either of these issues bespeaks the intransigence of professional parochialism in general and, still more generally, the durability of the old contention between public and private ("tacit") knowledge, or, in the sense made famous by Thomas Kuhn in 1962, "paradigmatic" knowledge. (For those who may have forgotten, the dominant paradigm in a given field prescribes both the problems that are to be addressed and what will be counted as a solution thereto.) For it is the case that ". . . the medical practitioner and the medical researcher both share an antipathy towards methods of quantitative or statistical inference" (Matthews32). Famously, in the case of the randomized clinical trial, now accepted by these two groups as the "gold standard" of scientific inference, it was necessary to overcome their shared antipathy to "methods of quantitative or statistical inference" by government ukase: the Kefauver–Harris Act of 1962.35

As was the case with the problems of the medical databases and of the medical literature, the remedies for such inferential problems as led to the "Thalidomide disaster" were known to, but simply neglected by, these two groups. Indeed, the statistical principles and methods of the randomized controlled clinical trial were well known as early as 1930 (randomization as a basis for inference was initially proposed by the American physicist/philosopher C. S. Peirce in the late nineteenth century) and had been demonstrated to yield clinically valuable information and insight by 1950. (The results of the first RCT—of streptomycin plus bed rest versus bed rest alone in the treatment of respiratory tuberculosis—were published in 1948 by Bradford Hill of the British Medical Research Council.36) However, it was not until the Kefauver–Harris Amendment to the Food, Drug and Cosmetic Act in 1962 mandated—for the very first time—that both the efficacy and safety of new technologies be empirically demonstrated before being marketed for patient treatment that the two professional groups accepted, and then began reluctantly to deploy, the methods of the randomized controlled trial.

It is noteworthy that the signature of Western (i.e., Baconian) science—requiring empirical evidence as the basis for belief—was insisted upon not by the medical community, as might have been expected, but by the political community. (The latter may have thereby demonstrated that the "profession of arms" is not necessarily the only profession whose practice may be too important to be left entirely to its professionals.)

It is the case, of course, that the randomized controlled trial, as currently practiced, is encumbered by many inherent inferential—as well as ethical—problems. Many were apparent at its inception. (Although not to the legislators who were its earliest champions, nor to more than a very few in the medical field at any time.) Advances in computational speed and numerical algorithms have made it possible to overcome many of the inherent problems—including problems of compliance and of ethics—in RCTs by, for example, the use of Bayesian methods. (See Berry;37 Chickering and Pearl;38 Brophy and Joseph;39 and Kadane.40) But, again, the medical research and clinical communities are, characteristically, resisting the deployment of these still newer and improved "methods of quantitative or statistical inference," because for most members of these two communities the older statistical methods were regarded as a xenograft anyway and were only accepted as "a cost of doing business;" they were never fully understood in either their strengths or their weaknesses.
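The Bayesian approach mentioned above can be sketched for the simplest case, a two-arm trial with a binary outcome: with conjugate Beta priors, the posterior probability that the new treatment is superior can be recomputed as each patient accrues, which is what underwrites the flexible monitoring and stopping rules discussed by Berry and others. The interim counts below are hypothetical.

```python
import random

def prob_treatment_better(s_t, n_t, s_c, n_c, draws=100_000, seed=0):
    """Posterior P(p_treatment > p_control) under uniform Beta(1,1) priors,
    estimated by Monte Carlo sampling from the two Beta posteriors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        p_t = rng.betavariate(1 + s_t, 1 + n_t - s_t)  # posterior for treatment arm
        p_c = rng.betavariate(1 + s_c, 1 + n_c - s_c)  # posterior for control arm
        wins += p_t > p_c
    return wins / draws

# Hypothetical interim data: 18/25 responses on treatment, 11/25 on control.
print(prob_treatment_better(18, 25, 11, 25))  # high posterior probability of benefit
```

Because the posterior is a legitimate summary at any accrual point, a monitoring committee can, for example, stop when this probability crosses a prespecified threshold, one route around the compliance and ethics problems the text mentions.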

V. ". . . IN THE MANNER DICTATED BY THEIR CURIOSITY"

The concept that became "paradigmatic knowledge" was first embodied in the "principle of intellectual self-determination" that became enshrined in U.S. science policy by Vannevar Bush, President of the Carnegie Institution and wartime head of the Office of Scientific Research and Development (OSRD), in his 1945 letter to President Roosevelt: "Scientific progress on a broad front results from the free play of free intellects working on subjects of their own choice in the manner dictated by their curiosity for the exploration of the unknown."41 Fifteen years later the concept was more formally given a normative status, as well as its now familiar name, by Thomas Kuhn (1962) in his classic "The Structure of Scientific Revolutions,"42 which is "probably the best-known academic book of the second half of the twentieth century" (Fuller43). Indeed, it is ". . . by any measure, the most influential book on the nature of science yet to be published in the twentieth century" (Giere44). Kuhn's acknowledged debt to a still earlier work, "Genesis and Development of a Scientific Fact," by the pathologist Ludwik Fleck45 (first published in 1935) is another story. Kuhn, a Harvard physics professor, was a protégé of the polymath and Harvard president James Conant, who was Director of the National Defense Research Committee during World War II and who, with Bush, was largely responsible for the formation of the National Science Foundation in 1950. Conant, an elitist, was an early proponent of the principle of absolute autonomy for science and thus a forceful and articulate advocate of the "principle of intellectual self-determination" (see Fuller43).

In his remark in the 1945 letter, Bush rather unreflectively extrapolated as normative for all of science a principle that worked for (indeed, that defined) a small elite group and the narrow special circumstances in which they practiced it and achieved enormous success thereby: the physical scientists of the early 20th century. For them, "the manner dictated by their curiosity" was (as it is for their successors) inherently and entirely that of "quantitative or statistical inference." However, it is well known that such a manner of inference (albeit the only proper one) is not always found in much of science; certainly it is not often found among most of the biological and medical scientists, as correctly reported, for example, by Matthews,32 and as is clearly evidenced in their literature. For this latter group, "the manner dictated by their curiosity" has led to the "epidemic" of indefensible results described by Bross. Thus, for these scientists, their "manner" must be circumscribed by controls imposed on it by others from outside the group (who are thus not possessed by its paradigm) and who hence will be able to challenge its autonomy in whatever the matter at issue, e.g., by post hoc statistical review and secondary analysis: "Statistics is a form of social control over the professional behavior of researchers. The ultimate justification for any statistical procedure lies in the kind of research behavior it encourages or discourages" (see Harris46).

VI. CHANGING ". . . HOW THE GAME WILL BE PLAYED"

During a joint ABA–AAAS investigation of scientific fraud a few years ago, the attorney Patricia Woolf famously remarked, "Tell me how the score will be kept, and I'll tell you how the game will be played." Her remark suggests a way to improve the quality of the literature: change the criteria for promotion and/or tenure. Altman8 says much the same thing: "As the system encourages poor research it is the system that should be changed. We need less research, better research, and research done for the right reasons. Abandoning using the number of publications as a measure of ability would be a start." It is now a commonplace that at many institutions the current criteria for assessing the scientific productivity of the aspiring faculty member often appear to be more quantitative than qualitative, i.e., they appear to be based on a perverse version of the Law of Large Numbers: "There must be a pony in there somewhere." But, "The length of a list of publications is a dubious indicator of ability to do good research; . . ." (Altman8). (A recent story in the Wall Street Journal suggests that such criteria, i.e., quantitative rather than qualitative, will lead to mischief in other medical enterprises as well.47)

Perhaps the criteria for promotion and tenure should be revised so that only a small number (say, a half-dozen or a dozen?) of more profound publications (probably of much greater length than is now typically the case) could be submitted as evidence of a level of ability that merits promotion and/or tenure. Then the unique insights and perspectives achieved by a single scientist (with perhaps a few colleagues) in pursuing a few sustained, thoughtful, and broadly informed ("Indeed, it is obvious that invention or discovery, be it in mathematics or anywhere else, takes place by combining ideas."48) lines of inquiry (that may lie a bit beyond "the lamppost"), and the more informative reports that would surely follow, would be, in fact, "actions in the production of works for the promotion of human happiness and the relief of man's estate," and would thereby tend to assure that advances in medical knowledge proceeded more in step with advances in careers within the respective domains of medical practice and medical research.

However, after remarking again, in an article in 2000, that "Too much research is done primarily to benefit the careers of researchers," Altman ends with the observation that "Major improvements in the quality of research published in medical journals are unlikely in the present climate."9 One way to change the climate might be to change ". . . how the score will be kept."

a!Electronic mail: [email protected]; [email protected]. Bacon,The New Organon, edited by F. H. Anderson~Macmillan/Library of Liberal Arts, New York, 1960!.

2R. Mohan, Q. Wu, A. Niemierko, and R. Schmidt-Ullrich, ‘‘Optimizationof IMRT plans based on biologically equivalent uniform dose, inBiologi-cal & Physical Basis of IMRT & Tomotherapy,’’Proceedings of the 6thInternational Conference On Dose, Time and Fractionation in RadiationOncology, 23–25 September, edited by B. Paliwal, D. Herbert, J. Fowler,and M. Mehita~Medical Physics Publishing, Madison, WI, 2002!, pp.151–166.

3. R. Rubin and G. Casarett, "Concepts of clinical radiation pathology," in Medical Radiation Biology, edited by G. Dalrymple, M. Gaulden, G. Kollmorgen, and H. Vogel (W. B. Saunders, Philadelphia, 1973), pp. 160–189.

4. L. Peters, Presidential Address at the 1994 ASTRO Annual Meeting.
5. Evidence-Based Medicine Working Group, "Evidence-based medicine: A new approach to teaching the practice of medicine," J. Am. Med. Assoc. 268, 2420–2425 (1992).

6. M. Bland and J. Peacock, Statistical Questions in Evidence-Based Medicine (Oxford University Press, Oxford, 1996).

7. R. H. Brook, "Using scientific information to improve quality of health care," in Doing More Good Than Harm: The Evaluation of Health Care Interventions, edited by K. S. Warren and F. Mosteller (New York Academy of Sciences, New York, 1993), pp. 74–85.

8. D. G. Altman, "The scandal of poor medical research," Br. Med. J. 308, 283–284 (1994).

9. D. G. Altman, "Statistics in medical journals: Some recent trends," Stat. Med. 19, 3275–3289 (2000).

10. J. R. Williamson, F. Goldschmidt, and T. Colton, "The quality of medical literature: An analysis of validation assessments," in Medical Uses of Statistics, edited by J. C. Bailar III and F. Mosteller (NEJM Books, Waltham, MA, 1986), pp. 370–391.

11. D. Herbert (Principal Author and Chairman), A. Feldman, E. Krishnan, C. Orton, J. Ovadia, B. Paliwal, T. Schultheiss, P. Shrivastava, A. Smith, M. Stovall, and L. Cohen (Consultant), AAPM Report 43, "Quality assessment and improvement of dose response models: Some effects of study weaknesses on study findings. 'C'est magnifique,'" Report of AAPM Biological Effects Committee Task Group 1, Evaluation of Models for Dose-Response in Radiation Oncology (Medical Physics Publishing, Madison, WI, 1993).

12. S. A. Glantz, Primer of Biostatistics, 3rd ed. (McGraw-Hill, New York, 1992).

13. W. L. Roper, W. Winkenwerder, G. M. Hackbarth, and H. Krakauer, "Effectiveness in health care: An initiative to evaluate and improve medical practice," N. Engl. J. Med. 319, 1197–1202 (1988).

14. G. E. P. Box and N. R. Draper, Evolutionary Operation (Wiley, New York, 1969).

15. Institute of Medicine, National Academy of Sciences, Assessing Medical Technologies (National Academy Press, Washington, DC, 1985).

16. M. A. Hlatky, K. L. Lee, F. E. Harrell, R. M. Califf, D. B. Pryor, B. M. Daniel, and R. A. Rosati, "Tying clinical research to patient care by use of an observational database," Stat. Med. 3, 375–384 (1984).

17. R. M. Califf, D. B. Pryor, and J. C. Greenfield, "Beyond randomized clinical trials: Applying clinical experience in the treatment of patients with coronary artery disease," Circulation 7, 1191–1194 (1986).

18. D. E. Herbert III, "Methodological issues in radiation dose-volume outcome analyses: Summary of a joint AAPM/NIH workshop," Med. Phys. 29, 2109–2127 (2002).

19. S. B. Thacker and R. L. Berkelman, "Surveillance of medical technologies," J. Public Health Policy 7, 363–377 (1986).

20. S. J. Senn, "Falsificationism and clinical trials," Stat. Med. 10, 1679–1692 (1991).

21. S. Yusuf, J. Wittes, J. Probstfield, and H. A. Tyroler, "Analysis and interpretation of treatment effects in subgroups of patients in randomized clinical trials," J. Am. Med. Assoc. 266, 93–98 (1991).

493 Donald Herbert: Where’s the Bacon? 493

Medical Physics, Vol. 30, No. 4, April 2003


22. ACR, Appropriateness Criteria for Imaging and Treatment Decisions, Pocket Guide, Reston, VA, 1996, Vols. 1, 2.

23. D. J. Hand, "Data mining: Statistics and more?" Am. Stat. 52, 112–118 (1998).

24. T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Springer-Verlag, New York, 2001).

25. D. B. Pryor et al., "Clinical data bases: Accomplishments in medical care," Med. Care 23, 623–647 (1985).

26. J. N. Marinez, C. A. McMahan, G. M. Barnwell, and H. S. Wigodsky, "Ensuring data quality in medical research through an integrated data management system," Stat. Med. 3, 101–111 (1984).

27. C. J. McDonald and S. L. Hui, "The analysis of humongous databases: Problems and promises," Stat. Med. 10, 511–518 (1991).

28. L. I. Iezzoni, M. Schwartz, and R. Restuccia, "The role of severity information in health policy debates: A survey of state and regional concerns," Inquiry 28, 117–128 (1991).

29. Pennsylvania Health Care Cost Containment Council, 225 Market Street, Suite 400, Harrisburg, PA 17101.

30. Health Care Financing Administration, U.S. Dept. of Health and Human Services, Room 2230, Oak Meadows Bldg., 6325 Security Blvd., Baltimore, MD 21207-5817.

31. B. Vladek (private communication, 1998).
32. J. R. Matthews, Quantification and the Quest for Medical Certainty (Princeton University Press, Princeton, NJ, 1995).
33. I. D. Bross, "How to eradicate fraudulent statistical methods: Statisticians must do science," Biometrics 46, 1213–1225 (1990).
34. M. G. Kendall, "On the future of statistics—A second look," J. Roy. Stat. Soc. Ser. A 131, 182–204 (1968).

35. Kefauver–Harris Drug Amendment to the Federal Food, Drug and Cosmetic Act of 1938, 1962.

36. "Streptomycin treatment of pulmonary tuberculosis: A Medical Research Council investigation," Br. Med. J. 2, 769–782 (1948).

37. D. A. Berry, "A case for Bayesianism in clinical trials (with discussion)," Stat. Med. 12, 1377–1404 (1993).

38. D. M. Chickering and J. Pearl, "A clinician's tool for analyzing non-compliance," Comput. Sci. Stat. Proc. 29, 424–431 (1997).

39. J. M. Brophy and L. Joseph, "Placing trials in context using Bayesian analysis: GUSTO revisited by Reverend Bayes," J. Am. Med. Assoc. 273, 871–875 (1995).

40. J. B. Kadane, "Introduction," in Bayesian Methods and Ethics in a Clinical Trial Design, edited by J. B. Kadane (Wiley, New York, 1996), pp. 3–18.

41. V. Bush, Science—The Endless Frontier: A Report to the President, 1945.

42. T. Kuhn, The Structure of Scientific Revolutions, 2nd ed. (University of Chicago Press, Chicago, IL, 1970).

43. S. Fuller, Thomas Kuhn: A Philosophical History for Our Times (University of Chicago Press, Chicago, IL, 2000).

44. R. Giere, "Theories and generalizations," in The Limitations of Deductivism, edited by A. Grunbaum and W. C. Salmon (University of California Press, Berkeley, CA, 1988), pp. 37–46.

45. L. Fleck, Genesis and Development of a Scientific Fact, translated by T. F. Bradley and T. Trenn (University of Chicago Press, Chicago, IL, 1979) (first published 1935).

46. R. J. Harris, A Primer of Multivariate Statistics (Academic, New York, 1975).

47. The Wall Street Journal, "History and science: In Waksal's past, repeated ousters," 27 September 2002.

48. J. Hadamard, The Psychology of Invention in the Mathematical Field (Dover, New York, 1945).

