
LETTER TO THE EDITOR

Decision-making and diagnosis in disease management

G. Hughes a*, N. McRoberts b and F. J. Burnett c

a Institute of Ecology and Resource Management, University of Edinburgh, Edinburgh EH9 3JG; b Plant Biology Department, Scottish Agricultural College, Auchincruive KA6 5HW; and c Crop Health Department, Scottish Agricultural College, Edinburgh EH9 3JG, UK

Plant Pathology (1999) 48, 147–153. © 1999 BSPP
*To whom correspondence should be addressed. Accepted 2 November 1998.

Introduction

Suppose a decision is to be made on whether or not to apply crop protection measures, based on the use of a risk algorithm. The term ‘risk algorithm’ is used here to refer to any calculation that uses observations of one or more components of the ‘disease triangle’ – the host crop, the pathogen population and the environment – to make an assessment of the need for crop protection measures, judged by comparison of the result of the calculation with some predetermined threshold value. Depending on the outcome of this comparison, a decision on whether or not to treat a crop will be made. It has to be accepted that decision-making, based on whatever risk algorithm is employed, will not be perfect. Along with correct decisions to treat when treatment is required, and not to treat when treatment is not required, incorrect decisions will sometimes be reached. That is to say, sometimes the decision to treat will be made when treatment is not required, and sometimes the decision not to treat will be made when treatment is required. Obviously, in order to be of practical use, it is a requirement that risk algorithms lead to correct decisions most of the time. Some method of evaluation is therefore required. The main objectives of this article are to draw attention to a paper by Murtaugh (1996) on the statistical evaluation of ecological indicators, and to try to explain its significance in the context of disease management decision-making.

An indicator is an easily measured substitute for a property of a system that is difficult to measure directly. Murtaugh (1996) discussed the use of receiver operating characteristic (ROC) curves to assess the usefulness of indicators in the general context of monitoring environmental quality. In the context of disease management decision-making, we can think of risk algorithms as indicators. Economic yield loss cannot be measured directly until it is too late to prevent. The purpose of a risk algorithm is to provide a substitute for the measurement of economic yield loss, allowing an earlier assessment of the need for crop protection measures. Yuen et al. (1996) suggested the use of ROC curves as a means of evaluating risk algorithms. This is discussed, along with some related problems, including the assessment of accuracy, the use of discriminant function analysis for assessment of the need for crop protection measures, and the use of data obtained by sampling in the establishment of evaluation methods.

Receiver operating characteristic curve analysis

ROC curve analysis is widely used by clinicians as a means of evaluating diagnostic tests for decision-making in the context of patient management (see, for example, Metz, 1978; Zweig & Campbell, 1993; Schulzer, 1994). We outline the analysis in that context before discussing its application in disease management decision-making. ROC curve analysis of a laboratory test proposed as a basis for clinical diagnosis proceeds as follows. First, from a group of subjects, two subgroups are established. One subgroup comprises the ‘cases’ – all those individuals known definitively to be suffering from the particular condition in question; the other subgroup comprises the ‘controls’ – all those individuals known definitively not to be suffering from the condition. The classification into cases and controls is made independent of the diagnostic test, the performance of which is being evaluated. The diagnostic test is then performed on all the individuals in both subgroups.

Typically, this procedure results in two overlapping frequency distributions of test scores, one distribution for the cases, the other for the controls (Fig. 1). Since the two distributions overlap, the test does not provide perfect discrimination between cases and controls. In Fig. 1, most of the cases have test scores above the indicated threshold (these are true positives), but some have test scores below the threshold (these are false negatives). Most of the controls have test scores below the threshold (these are true negatives), but some have test scores above the threshold (these are false positives). The question therefore arises as to what threshold test score should be adopted for the implementation of treatment in a subject for whom the test score is the main information available (the definitive condition being unknown and, say, too risky, time-consuming or expensive to establish at the outset). Setting the threshold at a lower test score than the one indicated in Fig. 1 will reduce the number of false negatives but, at the same time, increase the number of false positives. Conversely, setting the threshold at a higher test score than the one indicated in Fig. 1 will reduce the number of false positives but, at the same time, increase the number of false negatives.

Before an ROC curve analysis, some notation is required. The cases are all definitively disease-positive (D+) individuals; the controls are all definitively disease-negative (D−) individuals. Most of the cases, and some of the controls, provide test scores (T) above the threshold value (T > Tthresh). Most of the controls, and some of the cases, provide test scores at or below the threshold (T ≤ Tthresh). We make the following definitions: the true positive proportion (TPP) is the number of true positive decisions divided by the total number of cases; the false negative proportion (FNP) is the number of false negative decisions divided by the total number of cases; the true negative proportion (TNP) is the number of true negative decisions divided by the total number of controls; and the false positive proportion (FPP) is the number of false positive decisions divided by the total number of controls. Then:

TPP is an estimate of Prob(T > Tthresh | D+) (read as ‘the probability of a test score above the threshold, given the presence of disease’), and similarly,
FNP is an estimate of Prob(T ≤ Tthresh | D+),
FPP is an estimate of Prob(T > Tthresh | D−),
TNP is an estimate of Prob(T ≤ Tthresh | D−).
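As a small worked example of these definitions (the counts are invented for illustration and are not from any of the studies discussed), the four proportions follow directly from the counts of decisions; a minimal sketch in Python:

```python
# Invented counts for illustration only: 50 cases and 50 controls assessed
# against a fixed threshold test score.
true_positives, false_negatives = 40, 10   # decisions among the 50 cases
true_negatives, false_positives = 42, 8    # decisions among the 50 controls

cases = true_positives + false_negatives
controls = true_negatives + false_positives

TPP = true_positives / cases      # estimate of Prob(T > Tthresh | D+): sensitivity
FNP = false_negatives / cases     # estimate of Prob(T <= Tthresh | D+)
FPP = false_positives / controls  # estimate of Prob(T > Tthresh | D-)
TNP = true_negatives / controls   # estimate of Prob(T <= Tthresh | D-): specificity

print(TPP, FNP, FPP, TNP)  # 0.8 0.2 0.16 0.84
```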

The TPP is often referred to as the ‘sensitivity’ of a diagnostic test, and the TNP is often referred to as its ‘specificity’. An ROC curve is a graphical plot of TPP (sensitivity) against FPP (1 – specificity), the values of TPP and FPP being calculated by allowing the threshold test score (Tthresh) to vary over the whole range of test scores (T). The ROC curve shown in Fig. 2 is based on the distributions of test scores for cases and controls shown in Fig. 1. The plot passes through the points (0, 0) (which in decision-making terms corresponds to never treating) and (1, 1) (which in decision-making terms corresponds to always treating). For the purposes of evaluating a diagnostic test, an ROC curve that passes close to the point (0, 1) (in the top left-hand corner of the plot) shows the test has both desirable sensitivity and specificity characteristics (that is, relatively high values of both can be achieved with an appropriate choice of threshold test score). A straight line joining the points (0, 0) and (1, 1) is the ‘no discrimination’ line. On an ROC plot, this line is indicative of a diagnostic test that does not provide a basis for discriminating between cases and controls (Hanley & McNeil, 1982).

Figure 1 Frequency distributions of test scores for a hypothetical diagnostic test. By convention, the frequency distribution of test scores for cases is shown above the test score axis and the frequency distribution of test scores for controls is shown below the test score axis.

Figure 2 The receiver operating characteristic (ROC) curve derived from the frequency distributions of test scores for cases and controls shown in Fig. 1. The point indicated by a solid circle corresponds to the threshold test score set as indicated in Fig. 1. The diagonal line (– – –) is the ‘no discrimination’ line.
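The construction just described can be sketched numerically; the following minimal Python example (assuming numpy is available) generates two overlapping distributions of invented test scores in the spirit of Fig. 1 and sweeps the threshold to obtain the (FPP, TPP) pairs that trace an ROC curve like Fig. 2:

```python
import numpy as np

rng = np.random.default_rng(1)
# Invented test scores: controls centred on 0, cases centred on 1.5 (cf. Fig. 1).
control_scores = rng.normal(0.0, 1.0, 500)
case_scores = rng.normal(1.5, 1.0, 500)

# Sweep the threshold over the whole range of observed scores (cf. Fig. 2).
thresholds = np.sort(np.concatenate([case_scores, control_scores]))
TPP = np.array([(case_scores > t).mean() for t in thresholds])     # sensitivity
FPP = np.array([(control_scores > t).mean() for t in thresholds])  # 1 - specificity

# The (FPP, TPP) pairs run from (1, 1) (always treat) to (0, 0) (never treat);
# plotting TPP against FPP (e.g. with matplotlib) gives a curve like Fig. 2.
```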

Application of ROC curve analysis to plant disease data

There is an important distinction between diagnosis and decision-making as practised by plant pathologists, and as practised by clinicians. For both plant pathologists (see, for example, Grogan, 1981; Miller, 1998) and clinicians, diagnosis is concerned with the problem of disease identification. However, while clinicians treat disease at the same level at which they diagnose it (that is, the individual patient), plant pathologists often diagnose at the level of the individual (the plant), but treat at the level of the population (the crop). In the context of plant disease management decision-making, a ‘diagnostic test’ is not synonymous with an ‘indicator’ of the need for crop protection measures. This raises some problems for the use of ROC curve analysis in the evaluation of risk algorithms, which are indicators of the need for treatment at the crop level.

Consider the data of Yuen et al. (1996) in this context. The final incidence of Sclerotinia stem rot was estimated from a random sample of 200 plants taken from untreated plots in each of 267 fields, in which various covariates were also recorded. On economic grounds, it was established that treatment was required in fields in which, without treatment, final disease incidence would be >0·2, but not in those with a final disease incidence ≤0·2. In threshold theory (Stern, 1973; Pedigo et al., 1986), a final disease incidence of 0·2 corresponds to the ‘economic injury level’ (EIL). Thus, the fields were divided into two subgroups, depending on whether final disease incidence was above the EIL (these were the cases) or less than or equal to the EIL (these were the controls). Two risk algorithms, based on covariate data, were to be compared. For each algorithm in turn, an indicator score (referred to as the ‘risk point sum’) was calculated for each field, and frequency distributions of indicator scores were compiled separately for the cases and the controls. A comparative evaluation of the two risk algorithms was then carried out by plotting the ROC curve for each. These curves plot the TPP as a function of the FPP at all decision thresholds (Yuen et al., 1996). The term ‘decision threshold’ was used to denote the threshold value of the indicator score (the ‘risk point sum’) and corresponds to the term ‘economic threshold’ (ET) as used in the development of threshold theory (Stern, 1973; Pedigo et al., 1986); the latter terminology is adopted here.

The application of ROC curve analysis described by Yuen et al. (1996) does not correspond exactly with the application of ROC curve analysis by clinicians in the evaluation of diagnostic tests, as outlined above. Clinicians have definitive subgroups of cases and controls. In circumstances such as those described by Yuen et al. (1996), plant pathologists have subgroups of cases and controls defined by reference to an EIL. Were the EIL for Sclerotinia stem rot to be increased to a final disease incidence of, say, 0·25, a field with a final disease incidence of 0·22, that previously would have been regarded as a case, would now be regarded as a control. Thus, in disease management decision-making, crops are not classified definitively (as either D+ or D−), but only in relation to the adopted EIL (as either D > Dthresh or D ≤ Dthresh). The definitions of the various outcomes in the establishment of the risk algorithm are then:

TPP is an estimate of Prob(I > Ithresh | D > Dthresh) (read as ‘the probability of an indicator score above the adopted ET, given that disease is above the adopted EIL’), and similarly,
FNP is an estimate of Prob(I ≤ Ithresh | D > Dthresh),
FPP is an estimate of Prob(I > Ithresh | D ≤ Dthresh),
TNP is an estimate of Prob(I ≤ Ithresh | D ≤ Dthresh).

The implication of this is that the ROC curves plotted by Yuen et al. (1996) are not complete descriptions of the performance of the risk algorithms they evaluated. They are descriptions of the performance when the EIL is set at a final incidence of 0·2. Since an EIL is established primarily on economic grounds, it may change with, for example, the cost of treatment or the potential value of the crop. A complete description of the performance of a risk algorithm for use in disease management decision-making requires that TPP and FPP are calculated by varying Ithresh (i.e. ET) over the whole range of indicator scores (I) and varying Dthresh (i.e. EIL) over the whole range of disease levels (D). This problem was addressed by Murtaugh (1996) in the context of environmental monitoring. In situations like those with which Yuen et al. (1996) dealt, where the response (here, either D > Dthresh or D ≤ Dthresh) is actually a dichotomization of a continuous variable (here, D), ‘sensitivity and specificity can be thought of as double integrals of the conditional density of the indicator, given the value of the continuous response’ (Murtaugh, 1996). That is to say, an ROC surface, rather than an ROC curve, can be plotted. In effect, such a surface would show a sequence of ROC curves, each curve corresponding to a different choice of Dthresh. When a satisfactory basis for discriminating between cases and controls is provided by an indicator, the advantage of the ROC curve format is that it makes explicit the implications of choosing any particular ET, in terms of the risks involved.
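A minimal sketch of such an ROC surface, in Python with numpy and using invented disease levels and indicator scores (not the data of Yuen et al., 1996), might tabulate TPP and FPP over a grid of candidate EILs and ETs:

```python
import numpy as np

rng = np.random.default_rng(2)
n_crops = 200
# Invented data: a continuous disease level D for each crop, and an indicator
# score I that is loosely related to D.
D = rng.beta(2, 5, n_crops)                  # e.g. final disease incidence
I = 10 * D + rng.normal(0.0, 1.0, n_crops)   # indicator score

EILs = np.linspace(0.05, 0.5, 10)            # candidate values of Dthresh
ETs = np.linspace(I.min(), I.max(), 50)      # candidate values of Ithresh

# tpp[i, j] and fpp[i, j] are TPP and FPP when the EIL is EILs[i] and the ET is
# ETs[j]; each row of the pair of arrays is one slice (an ordinary ROC curve)
# through the ROC surface.
tpp = np.full((len(EILs), len(ETs)), np.nan)
fpp = np.full((len(EILs), len(ETs)), np.nan)
for i, eil in enumerate(EILs):
    case = D > eil       # crops above the adopted EIL
    control = ~case
    for j, et in enumerate(ETs):
        if case.any():
            tpp[i, j] = (I[case] > et).mean()
        if control.any():
            fpp[i, j] = (I[control] > et).mean()
```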

Data from an ongoing series of trials devoted principally to forecasting the need for fungicide treatment of wheat to control eyespot disease (caused by Pseudocercosporella herpotrichoides) are presented as a brief illustrative example. These data were collected during 1992–97 from experimental crops in 32 unsprayed plots of winter wheat (cv. Beaver or cv. Riband) and comprise the eyespot index at growth stage (GS) 85 (from which the potential yield loss, and so the need for treatment, may be determined) and disease incidence based on a (visual) assessment of the percentage tillers affected at GS 30/31/32 (which is the indicator). All assessments were made in accordance with Anonymous (1986) and Goulds & Polley (1990). Retrospectively, the experimental crops were divided into subgroups comprising cases and controls on the basis of the eyespot index at GS 85. The level of this index that corresponds to the EIL depends on the cost of treatment and the potential value of the crop, among other things. For this illustration, four different EILs were investigated: eyespot index at GS 85 equal to 20, 30, 40 or 50. For each of these four EILs, frequency distributions of scores for the early assessment of disease incidence were compiled, separately for cases and controls. These are shown in Fig. 3 (a–d). As the EIL increases, more of the 32 crops are classified as controls, rather than cases. For each EIL, an ROC curve was then produced by allowing the ET to vary over the range of observed indicator scores, and calculating the corresponding TPP and FPP values. The four ROC curves in Fig. 4 (a–d) represent slices through an ROC surface. They are shown separately here in order to illustrate the correspondence between the four ROC curves in Fig. 4 (a–d) and the four EILs by which the crops were classified in Fig. 3 (a–d), respectively.

Figure 3 Frequency distributions of cases and controls, determined according to whether the eyespot index at GS 85 (Goulds & Polley, 1990) is greater than the economic injury level (EIL) (the cases) or less than or equal to the EIL (the controls) in each of 32 plots of winter wheat. (a) EIL set at eyespot index = 20; (b) EIL set at eyespot index = 30; (c) EIL set at eyespot index = 40; (d) EIL set at eyespot index = 50. The scale on the ‘frequency’ axis is such that each division represents a single case (above the ‘indicator score’ axis) or control (below the ‘indicator score’ axis). The indicator score is based on an early visual assessment of disease incidence (number of tillers affected out of 25; Anonymous, 1986; Goulds & Polley, 1990). The scale on the ‘indicator score’ axis shows 6–10 tillers affected as ‘10’ and 11–15 tillers affected as ‘15’.

Figure 4 Receiver operating characteristic (ROC) curves derived from the data shown in Fig. 3. Curves (a)–(d) correspond, respectively, to the frequency distributions of cases and controls shown in Fig. 3 (a–d). On each plot, the point indicated by a solid circle corresponds to a threshold indicator score of five tillers affected out of 25 (i.e. 20%, as recommended; Anonymous, 1986) and the diagonal line (– – –) is the ‘no discrimination’ line.

Generally, the ROC curves shown in Fig. 4 indicate that visual assessment of disease incidence at GS 30/31/32 was a poor indicator of whether or not a crop required treatment, whichever of the four EILs was used. Only when an EIL of eyespot index at GS 85 of 40 was used (Figs 3c and 4c) did the visual assessment of disease incidence at GS 30/31/32 offer any basis for discriminating between cases and controls. This could be shown more formally by comparing the areas under the ROC curves (see, for example, Hanley & McNeil, 1982). The data set on which this illustration is based is too small to reach any definitive conclusions, either about the general value of early visual assessment of disease incidence as an indicator for use in eyespot disease management decision-making or about the particular use of the currently recommended threshold level. However, eyespot assessment in spring has long been known to be an unreliable indicator of subsequent disease development (Scott & Hollins, 1978). In view of changes in fungicides, wheat cultivars and in the pathogen population since the currently recommended threshold was devised (Fitt et al., 1988; Jones, 1994), the results shown in Fig. 4 are perhaps not surprising. On the basis of the data presented here, the currently recommended threshold seems to provide specificity at the expense of sensitivity, resulting in a relatively low FPP but a relatively high FNP (Fig. 4). A high FNP results when the adopted ET is such that treatment tends not to be recommended when in fact it would be justified.
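The comparison of areas under ROC curves can be sketched as follows; this minimal Python example uses the identity that the area equals the probability that a randomly chosen case scores higher than a randomly chosen control (counting ties as half), rather than the exact procedure of Hanley & McNeil (1982), and the scores are invented:

```python
import numpy as np

def auc(case_scores, control_scores):
    """Area under the ROC curve as the proportion of (case, control) pairs in
    which the case has the higher indicator score, counting ties as half."""
    case_scores = np.asarray(case_scores, dtype=float)
    control_scores = np.asarray(control_scores, dtype=float)
    higher = (case_scores[:, None] > control_scores[None, :]).mean()
    ties = (case_scores[:, None] == control_scores[None, :]).mean()
    return higher + 0.5 * ties

# Invented indicator scores for two hypothetical risk algorithms evaluated on
# the same cases and controls: an area near 0.5 corresponds to the
# 'no discrimination' line, an area near 1 to a useful indicator.
print(auc([3, 5, 6, 8], [1, 2, 4, 5]))   # hypothetical algorithm A
print(auc([2, 3, 4, 5], [1, 3, 4, 6]))   # hypothetical algorithm B
```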

Jones (1994, see Table 11) discussed the ‘accuracy’ of the ET for use of fungicides to control eyespot disease. Data from 58 sites were presented. The case and control subgroups were identified on the basis of the increase in yield resulting from prochloraz treatment at GS 30–31. For cases, this increase was ≥0·2 t ha–1; for controls, this increase was <0·2 t ha–1. The data set comprised 41 cases and 17 controls. Eyespot incidence at GS 30–31 was the indicator, with the ET set, in this example, so that the decision was to treat when ≥20% of tillers were affected, and not to treat when <20% of tillers were affected. Of the 41 cases, 28 had ≥20% tillers affected (true positives), so sensitivity (TPP) was 0·68. Of the 17 controls, seven had <20% of tillers affected (true negatives), so specificity (TNP) was 0·41. Jones (1994) calculated accuracy from the proportion of total decisions that were correct, giving 35/58, or 60%. The problem with this calculation is that, because the indicator is more accurate for cases than for controls, the calculation of accuracy depends on the proportions of cases and controls in the data set. Consider the following hypothetical data set. Of 58 sites, 19 were classified as cases and 39 as controls. Of the 19 cases, 13 were correctly identified by the indicator (true positives), so sensitivity (TPP) was 0·68. Of the 39 controls, 16 were correctly identified by the indicator (true negatives), so specificity (TNP) was 0·41. This hypothetical data set thus comprises the same number of sites as the data set of Jones (1994), and has the same TPP and TNP. However, in this case, the accuracy is only 29/58 (50%), because the proportion of cases in the hypothetical data set is lower than in the data set of Jones (1994). Sensitivity and specificity represent two kinds of accuracy, respectively, for cases and controls (see, for example, Johnson et al., 1998). Unlike the calculation of accuracy from the proportion of correct decisions, an ROC curve analysis does not depend on the proportions of cases and controls in a data set, because sensitivity and specificity are independent of these proportions (Metz, 1978).
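The dependence of ‘accuracy’ on the proportions of cases and controls can be checked directly from the counts quoted above; a minimal Python sketch:

```python
# Counts quoted in the text: Jones (1994) has 41 cases (28 true positives) and
# 17 controls (7 true negatives); the hypothetical data set has 19 cases
# (13 true positives) and 39 controls (16 true negatives).
for label, cases, tp, controls, tn in [("Jones (1994)", 41, 28, 17, 7),
                                       ("hypothetical", 19, 13, 39, 16)]:
    sensitivity = tp / cases            # TPP
    specificity = tn / controls         # TNP
    accuracy = (tp + tn) / (cases + controls)
    print(f"{label}: TPP={sensitivity:.2f} TNP={specificity:.2f} accuracy={accuracy:.2f}")
# Both data sets give TPP of about 0.68 and TNP of about 0.41, but accuracies of
# about 0.60 and 0.50: accuracy depends on the proportion of cases, whereas TPP
# and TNP do not.
```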

Discriminant function analysis

Discriminant function analysis is another technique used by clinicians in classification and diagnosis that has attracted some attention from plant pathologists (see Hau & Kranz, 1990). Snedecor & Cochran (1967) summarized as follows: ‘With two diseases that are often confused, it is helpful to learn what measurements are most effective in distinguishing between the conditions, how best to combine these measurements, and how successfully the distinction can be made.’ As with ROC curve analysis, a clinician begins by identifying individuals, definitively, as members either of the subgroup comprising cases or the one comprising controls. The discriminant function is constructed, from data comprising the covariates measured on the individuals in both groups, in such a way as to produce as accurate a prediction as possible of the disease status of an individual for whom only the (relevant) covariates have been measured. Ahlers & Hindorf (1987) applied discriminant function analysis in forecasting Sclerotinia stem rot of winter rape. Without going into the same level of detail as with the above discussion of ROC curve analysis, we note that the same problem arises when cases and controls are identified according to whether a population (crop) is above or below an EIL, as in the study by Ahlers & Hindorf (1987), rather than definitively at the level of the individual, as in most clinical studies. The discriminant function as formulated applies only to the adopted EIL. If economic considerations dictate that this EIL should be, say, increased, then some crops that were classified as cases in the data set from which the function was originally formulated may be reclassified into the control subgroup, necessitating reformulation of the discriminant function.
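As an indication of how a discriminant function is constructed and used in practice, the following is a minimal Python sketch using scikit-learn’s linear discriminant analysis on invented covariates; it is not the procedure or data of Ahlers & Hindorf (1987):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(3)
# Invented covariates for 60 crops, classified as cases (1) or controls (0)
# according to whether disease exceeded the adopted EIL.
X = np.vstack([rng.normal([2.0, 5.0], 1.0, (30, 2)),   # covariates for the cases
               rng.normal([1.0, 3.5], 1.0, (30, 2))])  # covariates for the controls
y = np.array([1] * 30 + [0] * 30)

lda = LinearDiscriminantAnalysis().fit(X, y)
# Predicted status of a crop for which only the covariates have been measured.
print(lda.predict([[1.6, 4.2]]))
# If the adopted EIL changes, y must be redefined and the function refitted,
# as noted above.
```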

Uncertainty in the classification of cases and controls

Classification of cases and controls by reference to an EIL is one way in which the use of ROC curve analysis and discriminant function analysis by plant pathologists in the context of disease management decision-making differs from usage by clinicians; there is another. In the establishment of an ROC curve or a discriminant function, clinicians usually have a method of classifying each individual definitively as either a case or a control. However, in plant pathology, there is uncertainty attached to the classification of crops into the case and control subgroups, because this information is derived from sampling. For example, Yuen et al. (1996) classified fields as cases or controls on the basis of a random sample of 200 plants taken from an untreated plot in each field. Inevitably there is some uncertainty attached to this classification. In fact, it is a relatively simple matter to quantify this uncertainty, given the details of the sampling scheme. For any particular EIL adopted, a sampling likelihood can be plotted. This shows the probability of a decision (correct or otherwise) that disease incidence is less than or equal to the adopted EIL, for any actual value of incidence (Fig. 5). Up to and including the EIL, the curve gives the probability of a true negative; above the EIL, the curve gives the probability of a false negative. Typically, such curves show probability values near 1 when the actual incidence is much less than the incidence value adopted as the EIL, near 0 when the actual incidence is much larger than the incidence value adopted as the EIL, and near 0·5 when actual incidence is near the EIL. Further, the probabilities of false positives and true positives are, respectively, (1 – the probability of a true negative) and (1 – the probability of a false negative). Thus, when crops are being classified as cases or controls on the basis of information obtained by sampling, there is a chance that some crops will be wrongly classified (as either false negatives or false positives), and this chance is greatest for crops near the adopted EIL. There is further discussion of the problem of evaluation when there is uncertainty in the classification of cases and controls in Zweig & Campbell (1993) and Schulzer (1994).

Figure 5 A sampling likelihood for the sampling scheme adopted by Yuen et al. (1996). The likelihood shows, for any true disease incidence, the probability that disease incidence assessed by sampling will be less than or equal to the adopted economic injury level (EIL) of 0·2. This probability is calculated from

Prob(X ≤ 40) = Σ_{x=0}^{40} Prob(X = x),

where there are X diseased plants in a random sample of 200 plants and Prob(X = x) is based on the binomial distribution.
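The sampling likelihood of Fig. 5 can be reproduced directly; a minimal Python sketch (assuming scipy is available), using the sample size of 200 and EIL of 0·2 adopted by Yuen et al. (1996):

```python
import numpy as np
from scipy.stats import binom

n, eil = 200, 0.2                 # sample size and EIL of Yuen et al. (1996)
threshold_count = int(n * eil)    # 40 diseased plants out of 200
true_incidence = np.linspace(0.0, 0.5, 101)

# Probability that a random sample of 200 plants contains at most 40 diseased
# plants, i.e. the probability of deciding that incidence <= EIL, for each
# possible true incidence (cf. Fig. 5).
prob_decide_at_or_below_eil = binom.cdf(threshold_count, n, true_incidence)

# Close to 1 well below the EIL, close to 0 well above it, and near 0.5 when
# the true incidence equals the EIL:
print(binom.cdf(threshold_count, n, eil))
```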

Diagnosis and decision-making

Many new assays are being developed that make possible very accurate and early detection and diagnosis of infection by plant pathogens (e.g. Duncan & Torrance, 1992). These include assays for the causal agent of eyespot of cereals (for example, Poupard et al., 1993; Priestley & Dewey, 1993; Beck et al., 1996; Gac et al., 1996; Nicholson et al., 1997) and for the causal agent of Sclerotinia stem rot of oilseed rape (Jamaux & Spire, 1994). Such assays can contribute to earlier and more detailed diagnoses than were previously possible (for example, Anonymous, 1996), but where a threshold approach to disease management decision-making is adopted, pathogen detection at the level of individual plants is not the only issue. In this context, an indicator of the need for treatment of the crop is required. It is the extent to which such an indicator, together with an appropriate choice of ET, allows discrimination between crops in which the adopted EIL would subsequently be exceeded, and those in which it would not, that is crucial. The proper evaluation of risk algorithms therefore remains the basis for good disease management decision-making.

References

Ahlers VD, Hindorf H, 1987 Epidemiologische Untersuchungen über den Schaderreger Sclerotinia sclerotiorum an Winterraps im Hinblick auf eine Prognose. Nachrichtenblatt des Deutschen Pflanzenschutzdienstes (Braunschweig) 39, 113–9.

Anonymous, 1986 Use of Fungicides and Insecticides on Cereals. Booklet 2257 (86). Alnwick, UK: Ministry of Agriculture, Fisheries and Food.

Anonymous, 1996 Take careful aim at eyespot. Farmers Weekly 125, 55.

Beck JJ, Beebe JR, Stewart SJ, Bassin C, Etienne L, 1996 Colorimetric PCR and ELISA diagnostics for the detection of Pseudocercosporella herpotrichoides in field samples. In: Proceedings of the Brighton Crop Protection Conference, Pests and Diseases, 1996, Vol. 1. Farnham, UK: BCPC, 221–6.

Duncan JM, Torrance L, eds, 1992 Techniques for the Rapid Detection of Plant Pathogens. Oxford, UK: Blackwell Scientific Publications.

Fitt BDL, Goulds A, Polley RW, 1988 Eyespot (Pseudocercosporella herpotrichoides) epidemiology in relation to disease severity and yield loss in winter wheat – a review. Plant Pathology 37, 311–28.

Gac ML, Montfort F, Cavelier N, 1996 An assay based on the polymerase chain reaction for the detection of N- and L-types of Pseudocercosporella herpotrichoides in wheat. Journal of Phytopathology 144, 513–8.

Goulds A, Polley RW, 1990 Assessment of eyespot and other stem base diseases of winter wheat and winter barley. Mycological Research 94, 819–22.

Grogan RG, 1981 The science and art of plant disease diagnosis. Annual Review of Phytopathology 19, 333–51.

Hanley JA, McNeil BJ, 1982 The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143, 29–36.

Hau B, Kranz J, 1990 Mathematics and statistics for analyses in epidemiology. In: Kranz J, ed. Epidemics of Plant Diseases, 2nd edn. Berlin: Springer-Verlag, 12–52.

Jamaux I, Spire D, 1994 Development of a polyclonal antibody-based immunoassay for the early detection of Sclerotinia sclerotiorum in rapeseed petals. Plant Pathology 43, 847–62.

Johnson DA, Alldredge JR, Hamm PB, 1998 Expansion of potato late blight forecasting models for the Columbia Basin of Washington and Oregon. Plant Disease 82, 642–5.

Jones DR, 1994 Evaluation of fungicides for control of eyespot disease and yield loss relationships in winter wheat. Plant Pathology 43, 831–46.

Metz CE, 1978 Basic principles of ROC analysis. Seminars in Nuclear Medicine 8, 283–98.

Miller SA, 1998 Impacts of molecular diagnostic technologies on plant disease management – past, present and future. In: Abstracts – Vol. 1, 7th International Congress of Plant Pathology. International Society for Plant Pathology, Edinburgh, August 1998, 9–16.

Murtaugh PA, 1996 The statistical evaluation of ecological indicators. Ecological Applications 6, 132–9.

Nicholson P, Rezanoor HN, Simpson DR, Joyce D, 1997 Differentiation and quantification of the cereal eyespot fungi Tapesia yallundae and Tapesia acuformis using a PCR assay. Plant Pathology 46, 842–56.

Pedigo LP, Hutchins SH, Higley LG, 1986 Economic injury levels in theory and practice. Annual Review of Entomology 31, 341–68.

Poupard P, Simonet P, Cavelier N, Bardin R, 1993 Molecular characterisation of Pseudocercosporella herpotrichoides isolates by amplification of ribosomal DNA internal transcribed spacers. Plant Pathology 42, 873–81.

Priestley RA, Dewey FM, 1993 Development of a monoclonal antibody immunoassay for the eyespot pathogen Pseudocercosporella herpotrichoides. Plant Pathology 42, 403–12.

Schulzer M, 1994 Diagnostic tests – a statistical review. Muscle and Nerve 17, 815–9.

Scott PR, Hollins TW, 1978 Prediction of yield loss due to eyespot in winter wheat. Plant Pathology 27, 125–31.

Snedecor GW, Cochran WG, 1967 Statistical Methods, 6th edn. Ames, IA, USA: Iowa State University Press.

Stern VM, 1973 Economic thresholds. Annual Review of Entomology 18, 259–80.

Yuen J, Twengström E, Sigvald R, 1996 Calibration and verification of risk algorithms using logistic regression. European Journal of Plant Pathology 102, 847–54.

Zweig MH, Campbell G, 1993 Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine. Clinical Chemistry 39, 561–77.
