For Peer Review
Abundance estimation in the presence of zero inflation and detection error using single visit data
Journal: Environmetrics
Manuscript ID: Draft
Wiley - Manuscript type: Research Article
Date Submitted by the Author:
n/a
Complete List of Authors: Solymos, Peter; University of Alberta, Department of Biological Sciences
Lele, S.; Bayne, Erin
Keywords: Open populations, Conditional likelihood, Ecological Monitoring, Mixture models, Pseudo-likelihood
John Wiley & Sons
Abundance estimation in the presence of zero inflation and detection error using single
visit data

Péter Sólymos1, Subhash Lele2 and Erin Bayne3

1Alberta Biodiversity Monitoring Institute, Department of Biological Sciences, University of
Alberta, e-mail: [email protected]
2Department of Mathematical and Statistical Sciences, University of Alberta, e-mail:
3Department of Biological Sciences, University of Alberta, e-mail: [email protected]

Running title: Abundance estimation using single visit data
Word count in the abstract: 142
Word count in the manuscript as a whole: 7035
Word count in the main text: 4821 (from Introduction to Acknowledgements)
Number of references: 30
Number of figures and tables: 3 figures, 2 tables

Address of correspondence: Péter Sólymos, Alberta Biodiversity Monitoring Institute,
Department of Biological Sciences, CW 405, Biological Sciences Bldg., University of Alberta,
Edmonton, Alberta, T6G 2E9, Canada, Phone: 780-492-8534, Fax: 780-492-7635, e-mail:
Abstract
It is well established that population surveys are subject to detection error. Current methods to
correct for detection error using mixture models require multiple visits to survey locations, and
assume closed populations that do not change during the full survey period. We show that
contrary to popular belief, multiple visits are not necessary to correct for detection error. The
parameters of the Binomial-zero inflated Poisson mixture model can be estimated using single
visit data. The use of conditional likelihood leads to estimators that are more stable than full
likelihood based estimators used in multiple visit survey approaches. Our single visit method has
several advantages: 1) it does not require the hard to satisfy closed population assumption; 2) it is
cost effective, enabling ecologists to cover a larger geographical region than possible with
multiple visit methods; and 3) resultant estimators are statistically efficient.

Keywords: Closed populations, Conditional likelihood, Ecological Monitoring, Mixture models,
Open populations, Pseudo-likelihood.
Introduction
Ecologists are fundamentally interested in understanding the environmental factors that
influence variation in the size of populations. To understand variation in population size requires
information on how the occurrence or abundance of species changes in time and space. Many
ecologists rely on relative differences in occurrence or counts of the number of individuals
observed to draw inferences about factors influencing populations (Krebs 1985). However,
models that predict naïve estimates of occurrence (e.g. logistic regression) or abundance (e.g.
Poisson regression) are known to underestimate true occurrence and abundance because of
detection error. Detection error is the probability that a species (occurrence) or individual of a
species (counts) is present during the period of observation but is not detected. The probability of
detecting all individuals present in a survey area is rarely one (Yoccoz et al. 2001, Gu and
Swihart 2004). Environmental factors that influence population change also affect probability of
detection. Thus the issue of imperfect detection needs to be addressed if ecologists are to draw
correct conclusions about factors influencing population change (MacKenzie et al. 2002, Tyre et
al. 2003).
The last decade has seen an enormous growth in the statistical methodology to deal with
the problem of detection error (MacKenzie et al. 2006). One approach that has been widely
adopted is that of multiple visit surveys that use an N-mixture approach to estimate detection
error from count data (Royle 2004). In the N-mixture approach, true abundance has typically
been modeled using a Poisson or a Negative Binomial (NB) distribution, while detection error
has been modelled as a Binomial observation process. True abundance rates in the Poisson or
Negative Binomial model and detection probabilities of individuals in the Binomial model are
commonly modeled as a function of habitat and survey-specific characteristics. By accounting
for detection error in the observed counts, N-mixture models differentiate between two kinds of
zeros: “false” zeros due to detection error where true abundance is greater than 0 but observed
count is 0; and “true” zeros due to the state process where the true abundance is 0 and the
observed count is also 0.
In many situations, a third type of zero can exist. When surveys take place on larger
geographic scales, “true” zeros arise not only as zeros due to the Poisson or NB distribution but
as a result of true zero-inflation (Martin et al. 2005). True zero-inflation can happen when a
species’ range is only partly covered by the extent of the area sampled, the species is quite rare,
or the distribution of individuals is highly aggregated. Joseph et al. (2009) proposed zero-inflated
Poisson (ZIP) and zero-inflated NB (ZINB) mixture models to account for this third type of true
zeros. They used Binomial-ZIP and Binomial-ZINB models with a multiple visit sampling
approach to account for detection error in overdispersed counts.
The goal of multiple visit methodologies is to provide a more accurate estimator of true
abundance, by considering detection error, than the naïve estimator that ignores detection error.
However, there is growing evidence that multiple survey models overestimate true abundance in
many situations (Joseph et al. 2009, Moreno and Lele 2010, Bayne et al. under review). For an
N-mixture estimator to be an accurate estimator of the true abundance of a species, it is
necessary that the population size does not change during the total duration of the repeated visits
(closure assumption). Violations of closure can happen for non-sessile organisms due to
dispersal or even daily movement (Joseph et al. 2009, Bayne et al. under review). One way to
ensure closure is to decrease the time elapsed between successive visits; however, this can lead to
the violation of the assumption of independent visits, which is also required for multiple visit
approaches to provide accurate estimates of abundance.
Given the challenges inherent in meeting the assumptions of multiple visit N-mixture
models for some common ecological situations, a new approach to dealing with detection error
in count data is needed. In this paper, we show that detection error in abundance surveys can be
corrected using only a single visit to a site, hence avoiding the assumption of closure. Our
approach requires that covariates that affect abundance or detectability are available. Such
covariates are commonly available in most biological studies. We show that the parameters of the
Binomial-ZIP N-mixture model, which accounts for all three kinds of zeros, can be consistently and
efficiently estimated based on a single visit to sites. We also show that abundance estimators
based on a single visit are robust and ecologically sensible.
The Binomial-ZIP model
We consider the zero-inflated Poisson (ZIP) model for the true state. A hierarchical
representation of the ZIP model is (Ni | λi, Ai) ~ Poisson(λi Ai), (Ai | φ) ~ Bernoulli(1 − φ), where
Ni is the population abundance at location i (i = 1, 2, …, n; n is the total number of sites), and λi is the rate
parameter of the Poisson distribution when the species is present at location i. The probability
that Ai = 0 is φ; consequently, the probability that at least one individual is present is
(1−φ)(1−e^(−λi)). The φ = 0 case corresponds to a Poisson model for the true state. The Poisson
rate parameter can be modelled as a function of covariates using the log link function: log(λi) =
Xi^T β, where β is a vector of regression coefficients including the intercept (β0), and Xi is the
covariate matrix with n rows and as many columns as the number of variables in the model.
Links other than the log-link for the Poisson model can be used.
The observation process is modeled using the Binomial distribution as (Yi | Ni) ~
Binomial(Ni, pi), where Yi is the observed count at site i, and pi is the probability of detecting an
individual given the true abundance Ni is greater than 0. The probability of detection can be
modeled as a function of covariates using the logistic link function: logit(pi) = Zi^T θ, where θ is a
vector of regression coefficients including the intercept (θ0), and Zi is a covariate matrix similar
to Xi. One can use links other than the logistic link in the Binomial model. The covariate vectors
Xi and Zi can have common covariates, but need to have at least one covariate that is unique to
either the abundance or detection error vectors.
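The hierarchical model above can be read as a recipe for generating data. The following is a minimal sketch, using only the Python standard library, of one way to simulate single visit counts from the Binomial-ZIP model with one abundance covariate and one detection covariate. The function names, covariates, coefficient values and seeds are illustrative assumptions and are not taken from the manuscript or its software.

```python
import math
import random


def rpois(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_binomial_zip(x, z, beta, theta, phi, seed=1):
    """Simulate (A_i, N_i, Y_i) for each site from the Binomial-ZIP model.

    x, z: one abundance covariate and one detection covariate per site.
    beta, theta: (intercept, slope) pairs for the log and logit links.
    """
    rng = random.Random(seed)
    sites = []
    for xi, zi in zip(x, z):
        lam = math.exp(beta[0] + beta[1] * xi)                   # log link
        p = 1.0 / (1.0 + math.exp(-(theta[0] + theta[1] * zi)))  # logit link
        a = 1 if rng.random() < 1.0 - phi else 0                 # A_i ~ Bernoulli(1 - phi)
        n_true = rpois(rng, lam) if a == 1 else 0                # N_i ~ Poisson(lam_i * A_i)
        y = sum(1 for _ in range(n_true) if rng.random() < p)    # Y_i ~ Binomial(N_i, p_i)
        sites.append((a, n_true, y))
    return sites


# Hypothetical covariates and coefficients, chosen only for illustration.
cov_rng = random.Random(42)
x = [cov_rng.gauss(0, 1) for _ in range(1000)]
z = [cov_rng.gauss(0, 1) for _ in range(1000)]
sim = simulate_binomial_zip(x, z, beta=(0.7, 0.5), theta=(-0.5, 1.0), phi=0.25)
```

In this sketch only the third element of each tuple (the observed count) would be available to the analyst; the first two are the latent states that the estimation procedure has to account for.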
Parameter estimation
The likelihood function corresponding to the Binomial-ZIP mixture based on a single
visit is:
\[
L(\beta, \theta, \phi; y) = \prod_{i=1}^{n} \left\{ \phi \, I(y_i = 0) + (1 - \phi) \sum_{N_i = y_i}^{\infty} \binom{N_i}{y_i} p_i^{y_i} (1 - p_i)^{N_i - y_i} \frac{e^{-\lambda_i} \lambda_i^{N_i}}{N_i!} \right\},
\]
where I(.) is an indicator function. Because Ni is unknown, the likelihood involves summation
over all possible values of Ni. Direct maximization of this function leads to substantial
confounding between the parameters. We have observed that the parameter φ and the intercept
parameter θ0 in the detection model are especially confounded. To reduce this confounding, we
divide the problem in two parts. In the first part, we condition on a sufficient statistic for the
parameter φ and use the conditional distribution of the data given the sufficient statistic to form
a conditional likelihood function (Anderson, 1970) for the parameters (β, θ). The conditional
likelihood estimators are known to be consistent and asymptotically normal under fairly general
conditions. To estimate φ, we construct a new random variable Wi = I(Yi > 0). Then, we write the
likelihood function for (β, θ, φ) based on the distribution of Wi. This likelihood function does
not involve infinite summation and hence is easy to maximize. Further, it is a concave function
of φ and hence has a unique solution. Based on the idea of pseudo-likelihood described in Gong
and Samaniego (1981), we fix the values of (β, θ) at their conditional likelihood based estimates
and maximize the likelihood with respect to φ to obtain its estimate. The results in Gong
and Samaniego (1981) show that this pseudo-likelihood estimator is consistent and
asymptotically normal. The derivation of the conditional and pseudo-likelihood functions is
described in the Appendix. We use the bootstrap procedure (Efron and Tibshirani 1994) to
calculate confidence intervals for the estimated parameters. The software implementation is
available in the statistical package ‘occupy’ (Sólymos and Moreno 2010) written in the free
statistical software R (R Development Core Team 2009). To compare single visit conditional
likelihood estimators with the multiple visits Binomial-ZIP, we follow Joseph et al. (2009) and
maximize the full likelihood function.
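Two ingredients of the estimation procedure described above can be sketched in a few lines of Python. The first function evaluates the marginal probability of an observed count by truncating the sum over Ni; the second uses the standard result that a binomially thinned Poisson variable is Poisson with rate λi pi, which gives the same probability in closed form. The third function illustrates the pseudo-likelihood step for φ: with λi and pi held fixed, Wi = I(Yi > 0) is Bernoulli with success probability (1 − φ)(1 − e^(−λi pi)), and the concave log-likelihood is maximized by bisection on its derivative. This is a sketch under those assumptions, not the implementation in the authors' R package; the function names, the truncation point `n_max` and the bisection tolerance are hypothetical choices.

```python
import math


def site_likelihood(y, lam, p, phi, n_max=200):
    """Marginal P(Y_i = y) under the Binomial-ZIP model, truncating the sum at n_max."""
    total = 0.0
    for n in range(y, n_max + 1):
        log_binom = math.log(math.comb(n, y)) + y * math.log(p) + (n - y) * math.log(1.0 - p)
        log_pois = -lam + n * math.log(lam) - math.lgamma(n + 1)
        total += math.exp(log_binom + log_pois)
    return phi * (1.0 if y == 0 else 0.0) + (1.0 - phi) * total


def site_likelihood_closed(y, lam, p, phi):
    """Same quantity, using the fact that a thinned Poisson count is Poisson(lam * p)."""
    pois = math.exp(-lam * p + y * math.log(lam * p) - math.lgamma(y + 1))
    return phi * (1.0 if y == 0 else 0.0) + (1.0 - phi) * pois


def estimate_phi(w, lam, p, tol=1e-10):
    """Maximize the Bernoulli likelihood of W_i = I(Y_i > 0) over phi,
    holding lam_i and p_i fixed at previously estimated values."""
    q = [1.0 - math.exp(-l * pi) for l, pi in zip(lam, p)]  # P(Y_i > 0 | A_i = 1)

    def score(u):  # derivative of the log-likelihood with respect to u = 1 - phi
        return sum(wi / u - (1 - wi) * qi / (1.0 - u * qi) for wi, qi in zip(w, q))

    if score(1.0) >= 0.0:   # log-likelihood still increasing at u = 1, so phi_hat = 0
        return 0.0
    lo, hi = 1e-12, 1.0     # the score is decreasing in u, so bisection finds the root
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if score(mid) > 0.0 else (lo, mid)
    return 1.0 - 0.5 * (lo + hi)


# Hypothetical inputs: 100 sites with identical lam and p, 40 of them with detections.
w = [1] * 40 + [0] * 60
phi_hat = estimate_phi(w, [2.0] * 100, [0.5] * 100)
```

When λi and pi are the same at every site, the maximizer has the closed form 1 − mean(W) / (1 − e^(−λp)), which provides a simple check on the bisection.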
We use probability plots to evaluate model fit under the Binomial-Poisson and Binomial-ZIP
models. The model fit is adequate if the values of the empirical and fitted cumulative
distribution function (CDF) fall along a line with intercept 0 and slope 1.
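The manuscript does not spell out how the fitted CDF is computed, so the following Python sketch makes an assumption: the fitted CDF at count k is taken as the average over sites of P(Yi ≤ k) under the fitted Binomial-ZIP model, and it is paired with the empirical CDF of the observed counts. The function names and the toy inputs are hypothetical.

```python
import math


def fitted_cdf(k, lam, p, phi):
    """Average over sites of P(Y_i <= k) under the Binomial-ZIP model."""
    total = 0.0
    for l, pi in zip(lam, p):
        mu = l * pi  # rate of the observed count when the species is present
        pois_cdf = sum(math.exp(-mu + j * math.log(mu) - math.lgamma(j + 1))
                       for j in range(k + 1))
        total += phi + (1.0 - phi) * pois_cdf
    return total / len(lam)


def empirical_cdf(k, y):
    """Proportion of observed counts that are <= k."""
    return sum(1 for yi in y if yi <= k) / len(y)


def probability_plot_points(y, lam, p, phi):
    """(fitted, empirical) CDF pairs for k = 0, ..., max(y)."""
    return [(fitted_cdf(k, lam, p, phi), empirical_cdf(k, y))
            for k in range(max(y) + 1)]


# Hypothetical counts and fitted values for ten sites.
y = [0, 0, 0, 1, 0, 2, 0, 0, 1, 3]
points = probability_plot_points(y, lam=[2.0] * 10, p=[0.5] * 10, phi=0.4)
```

Plotting the second coordinate against the first and comparing with the 1:1 line gives the kind of diagnostic described above; setting `phi=0.0` gives the corresponding points for the Binomial-Poisson model.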
Simulation study
To study the properties of the estimation procedure described in the previous section, we
performed several simulations. We considered the situation where the covariates that affect
detection and abundance are distinct from each other and where some of the covariates are
common, that is, we had covariates that affected both detection and abundance. Furthermore, we
considered eight different scenarios corresponding to combinations of low (mean λ = 2.13) vs. high
abundance (mean λ = 5.25), zero-inflated (φ = 0.25) vs. non zero-inflated data (φ = 0), and low (mean p =
0.25) vs. high (mean p = 0.65) detection probability (for more details, see Appendix S1 in Supporting
Information).
We fitted the Binomial-ZIP mixture model to each simulated data set using 100, 300,
500, 700, 1000 sites. Altogether we used 160 different settings (4 settings x 8 scenarios x 5
sample sizes) and ran 100 simulations for each. The average of the true abundances varied between
1.6 and 5.2, while the average of the observed counts varied between 0.4 and 3.4 depending on the
parameter settings and the covariates used in the simulations. These settings represented a wide
range of ecologically plausible situations.
To compare the single visit results with those obtained from multiple surveys, assuming
the closed population assumption is satisfied, we considered our worst case setup (common
discrete covariate under all eight scenarios of combinations of abundance levels, zero inflation
and detection probabilities). We assumed four independent visits to each location. We then
compared the single visit n = 500 results with the 2 visits with n = 250 results, and the single
visit n = 1000 case with the 2 visits n = 500, and 4 visits n = 250 results.
Abundance parameters (β) were consistently estimated as the sample size increased, and
reliable estimates were obtained with n = 100 in most situations. Detection parameters (θ) were
also consistently estimated as sample size increased. The zero inflation parameter φ was well
estimated even at small sample sizes. Predicted λ values were somewhat overestimated for n =
100; for larger sample sizes they were consistent with the true values. Predicted p
values were consistent for all sample sizes. The correlations between the true and predicted λ and p
values were high, ranging from 0.8 to 1 for sample sizes n = 300 and above. Even when the
data were simulated under no zero-inflation (φ = 0), the parameter φ was well estimated. Figure 1
represents the worst case scenario with a common discrete covariate for the abundance and
detection models, and low abundance – zero-inflated data – low detectability scenario. Even in
this difficult situation, it is clear that the conditional likelihood method works well. A complete
summary of the results obtained for the 160 cases is available in Appendix S1 in Supporting
Information.
Simulation results for estimation based on multiple visits indicate that parameter
estimates (especially intercepts β0, θ0 and φ) were quite biased and confounded when the worst
case scenario was used (common discrete covariate, low abundance – zero-inflated data – low
detectability). This resulted in biased estimates of mean abundance and detection probability.
Comparatively, the single survey conditional likelihood estimator worked very well in this
situation (Fig. 2). Under different simulation settings (i.e. high detectability), multiple visits
estimates were not as biased as in this case (see Appendix S1 in Supporting Information). But
even in these best case scenarios, single survey conditional likelihood estimators were more
efficient than the often biased and inefficient estimators based on multiple visits (cf. Figs. 1 and
2). Sometimes, single survey estimators at smaller sample size (n = 100) were better than even
the multiple survey estimators based on larger sample sizes (n = 250 or n = 500). We believe this
gain in small sample efficiency is because of the separation of the parameters (β, θ) and φ by the
use of the conditional and pseudo-likelihood method.
Analysis of the Ovenbird data
For data analysis, we used two examples where a zero-inflation component was suspected.
We used observed counts of Ovenbirds (Seiurus aurocapilla) to illustrate the estimation of the
parameters for the Binomial-ZIP model. Data were collected in 1999 using Breeding Bird Survey
(BBS) Protocols (Downes and Collins 2003) in the boreal plains eco-region of Saskatchewan.
The goal of the study was to determine whether the occupancy of this species was influenced by
the amount of forest around each survey point. Data were collected along 36 BBS routes each
consisting of 50 survey locations with survey locations separated by 800 meters. To increase
independence of observations we used every second survey point along each route in our
analysis (n = 891 survey locations). Attributes about the forest type and amount of forest
remaining within a 400 meter radius were estimated from the Saskatchewan Digital Land Cover
Project (MacTavish 1995). The same data set was used in Lele et al. (2010, manuscript) for
studying single survey based estimation of site occupancy of the species.
The habitat requirements of the Ovenbird are well understood in the boreal forest
(Hobson and Bayne 2002) and we expected that Ovenbird abundance would be positively
influenced by the amount of forest, or deciduous forest remaining, and negatively by the amount of
agricultural land. The zero-inflation component is likely to be present because of the marked
difference in habitat suitability for the species along the agricultural area gradient. We also
included latitude and longitude, as the study covered an east-west gradient of over 1000 kilometers and
a north-south gradient of 400 km, although a priori we were not sure what effect this
would have on abundance.
We expected four factors to influence detection probability: observer, time of day, time
of year, and amount of forest. Observers differ in their ability to hear birds in part due to skill but
also due to fundamental differences in the distance over which they hear things. In general, male
songbirds sing very regularly early in the breeding season making it easy to detect individuals
that are present. As the breeding season progresses however, males spend less time singing as
they focus on other activities. This often results in lower detectability later in the breeding
season. We included Julian date as a variable influencing detection error. Male songbirds also
have a tendency to sing earlier in the day, shortly after sunrise, and then later in the morning
focus on guarding the mate or foraging. To account for this, we included time of the day as a
factor influencing detectability. Detectability can also be influenced by habitat attributes. In more
open environments where forest loss has occurred it is plausible that birds can be heard from
long distances increasing the likelihood an individual is detected (Schieck 1997). Alternatively,
in areas with more forest the chance of multiple males singing may be higher, increasing
detection probability relative to areas with less forest where only one individual may exist.
Proportion covariates (ranging from 0 to 1) were logit transformed and all covariates were
scaled to unit variance and centered. We performed backward stepwise model selection starting
with the full model including all abundance and detection covariates, and dropped non-significant
terms until all remaining terms were significant. Then we compared the models based on
Akaike’s Information Criterion (AIC) to select the final model. We calculated 90% confidence
limits based on 100 bootstrap samples.
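The covariate preparation described above can be sketched as follows. The manuscript does not say how proportions of exactly 0 or 1 were handled before the logit transform, so the clipping constant `eps` below is an assumption made only to keep the transform finite; the function names and example values are hypothetical.

```python
import math
import statistics


def logit(prop, eps=1e-3):
    """Logit transform of a proportion, clipped to [eps, 1 - eps] (assumed handling of 0 and 1)."""
    prop = min(max(prop, eps), 1.0 - eps)
    return math.log(prop / (1.0 - prop))


def standardize(values):
    """Center to mean 0 and scale to unit (sample) variance."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical proportions of forest cover at five survey points.
forest = [0.0, 0.1, 0.5, 0.9, 1.0]
forest_z = standardize([logit(v) for v in forest])
```

Centering and scaling put the regression coefficients of different covariates on a comparable scale, which helps when comparing effect sizes and can improve numerical stability during optimization.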
We fitted the Binomial-ZIP mixture model to the single visit Ovenbird data set. We
started with the full model including habitat characteristics and geographic coordinates for the
abundance model, and observer, Julian day and time of day for the detection model. Proportion
of forest area was used in both the abundance and detection model, because it was a priori
assumed to influence both processes (model 1; Table 1). We started by simplifying the detection
model first. We dropped the time of day, because that term was not significant based on a Wald
test (model 2). All remaining terms in the detection model were significant. Then we started
dropping terms from the abundance model. After eliminating non-significant variables
(proportion of deciduous forest and agricultural area that were correlated with proportion of
forest area), we found the best fit model (model 4), which could not be further simplified without
an increase in the Akaike’s information criterion (AIC) value. The AIC value corresponding to the
Binomial-Poisson mixture with the same covariates as model 4 was 1456.7. This is much higher
than the AIC value 860.7 of the Binomial-ZIP model. Aside from the better AIC value, the
probability plot clearly shows that the Binomial-ZIP model fit is better than the Binomial-Poisson
model (Fig. 3B).
Proportion of forest area had a significant positive effect on Ovenbird abundance.
Geographic coordinates were not significant predictors of abundance as the confidence intervals
overlapped zero. However, their inclusion based on AIC suggested there was a spatial pattern
that explained some of the variation in Ovenbird abundance. Ovenbird abundance increases as
one goes further north and east in the study area. Observer effects were pronounced. Julian date
had a significant negative effect on detectability of individuals, probably because of decreased
singing activity later in the season. Time of day had a negative relationship (in the full model)
with detectability. This is concordant with the singing behaviour of the males but the effect was
not significant. Proportion of forest area had a significant negative effect on detectability. This
indicates that individuals are more detectable in open habitats, in spite of lower abundances in
such habitats. The zero-inflation component was 0.41, and the average probability of Poisson
zeros (P(N = 0) = mean{(1−φ)e^(−λi)}) was 0.15 (Table 1, Fig. 3A). The probability of occurrence
(mean{(1−φ)(1−e^(−λi))}) was 0.44 and predicted mean abundance for the entire study area was
mean{(1−φ)λi} = 1.54. This translates into 11.36 birds per point count station at point count stations
where the entire area was forested (100% forest cover). Mean probability of detection of
individual Ovenbirds was 0.51.
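The summaries reported above are simple functions of φ and the site-level rates λi. The sketch below, with a hypothetical function name and made-up rates, shows how they can be computed and makes explicit that the three site states (zero-inflation zero, Poisson zero, occupied) partition the probability, in line with the reported 0.41 + 0.15 + 0.44 = 1.

```python
import math


def zip_summaries(lam, phi):
    """Average state probabilities and mean abundance under the ZIP model."""
    n = len(lam)
    poisson_zero = sum((1.0 - phi) * math.exp(-l) for l in lam) / n
    occurrence = sum((1.0 - phi) * (1.0 - math.exp(-l)) for l in lam) / n
    mean_abundance = (1.0 - phi) * sum(lam) / n
    return {
        "zero_inflation": phi,         # P(A_i = 0)
        "poisson_zero": poisson_zero,  # mean P(A_i = 1, N_i = 0)
        "occurrence": occurrence,      # mean P(N_i > 0)
        "mean_abundance": mean_abundance,
    }


# Hypothetical fitted rates for five sites; not the Ovenbird estimates.
summary = zip_summaries(lam=[0.2, 0.8, 1.5, 4.0, 11.0], phi=0.41)
```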
Given that Breeding Bird Survey uses an unlimited sampling distance to count birds,
absolute density cannot be directly estimated from the Ovenbird example. However, Rosenberg
and Blancher (2004) as part of the Partners in Flight planning process estimated that the
maximum distance over which Ovenbirds could be heard was 200 metres. Using this as the area
sampled by BBS counts, our mean count when a point count station has 100% forest cover
converted to a density of 0.904 male Ovenbirds per hectare. This is very close to the density
estimate of 0.99 (95% confidence limits (CL): 0.85-1.12) found by Bayne (2000) who mapped
the territories of color-banded male Ovenbirds and determined absolute density in the same
region.
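The density conversion above is a short calculation: a circle of radius 200 m covers π × 200² m², or about 12.57 ha, and 11.36 birds spread over that area give roughly 0.904 birds per hectare. A minimal Python check, with a hypothetical function name:

```python
import math


def density_per_hectare(birds_per_station, radius_m):
    """Convert a per-station count to birds per hectare over a circular plot."""
    area_ha = math.pi * radius_m ** 2 / 10_000.0  # 1 ha = 10,000 square metres
    return birds_per_station / area_ha


density = density_per_hectare(11.36, 200.0)
```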
Lele et al. (2010, manuscript) used the same Ovenbird data set but treated it as
detection/non-detection data to estimate site occupancy and detectability based on a single visit.
They found that proportion of forest positively influenced the probability of detecting a species.
This finding is in contrast with our results but is easy to resolve. The probability of detecting a
species (at least one individual) is highest where population density is higher (i.e. in forest). But
the probability of detecting an individual can be higher in open areas where sound can travel
greater distances than in more forested landscapes. Lele et al. (2010, manuscript) found the
average detection probability to be 0.49 (90% CL: 0.38-0.61). This coincides well with our mean
detectability estimate of 0.51. Lele et al. (2010, manuscript) found the mean probability of
occurrence to be 0.5 (90% CL: 0.41-0.64). Based on the abundance data, the average probability
of occupancy was 0.44, which is within the confidence limits of their estimates. By using the
abundance data instead of the detection/non-detection transformation of it, we were able to
differentiate between zero-inflation, Poisson and non-detection zeros. Using occupancy, only the
distinction between non-detection (false) and true zeros is possible, which for many species does
not provide a complete picture.
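The reconciliation between species-level and individual-level detection can be illustrated numerically. Assuming individuals are detected independently with the same probability p, the probability of detecting at least one of N individuals is 1 − (1 − p)^N. The abundances and detection probabilities below are made up for illustration and are not estimates from the Ovenbird data.

```python
def species_detection(n_individuals, p_individual):
    """P(at least one of n individuals is detected), assuming independent detections."""
    return 1.0 - (1.0 - p_individual) ** n_individuals


# Hypothetical contrast: many birds that are each harder to detect (forest)
# versus a single bird that is easier to detect (open habitat).
forest = species_detection(10, 0.3)
open_habitat = species_detection(1, 0.6)
```

In this example the species is more likely to be detected in forest even though each individual is less detectable there, which is the pattern described in the text.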
Analysis of the Mallard data
To compare multiple and single visits based estimates, we used the data set reported in
Kéry et al. (2005) for Mallards (Anas platyrhynchos) from the Swiss monitoring program for
common breeding bird species. This species is easy to detect, has a narrow local distribution and
low abundance in wetland habitats in Switzerland. The dataset contained n = 235 sites with 1-3
visits to the sites (2 sites had only one visit, 42 sites had 2 visits and 191 sites had 3 visits). Sites
represented 1 km² quadrats distributed in a grid across Switzerland. Territory mapping was
carried out along quadrat specific routes by experienced observers. The data set included route
length, elevation and forest cover for the sites and date and survey effort for each visit to the
sites. All variables were scaled to unit variance and centered in the original data set.
We fitted the multiple-visit Binomial-ZIP model to the Mallard data set, and also fitted
individual single-visit Binomial-ZIP models to each visit separately. We also fitted the naïve ZIP
model (without detection error) using the ‘pscl’ (Zeileis et al. 2008) R package based on the
maximum counts per site over all 3 visits. We calculated 90% confidence limits based on 100
bootstrap samples.
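The manuscript cites Efron and Tibshirani (1994) for the bootstrap but does not state which variant was used. The sketch below assumes a nonparametric percentile bootstrap that resamples sites with replacement; the estimator here is just the mean count, standing in for a full model refit. The function name, seed and data are hypothetical.

```python
import math
import random
import statistics


def percentile_bootstrap_ci(data, estimator, n_boot=100, level=0.90, seed=1):
    """Nonparametric percentile bootstrap interval for estimator(data)."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]  # sites drawn with replacement
        estimates.append(estimator(resample))
    estimates.sort()
    alpha = (1.0 - level) / 2.0
    lower = estimates[math.floor(alpha * (n_boot - 1))]
    upper = estimates[math.ceil((1.0 - alpha) * (n_boot - 1))]
    return lower, upper


# Hypothetical site counts; in practice the estimator would refit the model
# to each resampled set of sites and return the parameter of interest.
counts = [0, 0, 1, 0, 2, 0, 0, 3, 1, 0, 0, 4, 0, 1, 0, 2, 0, 0, 1, 0]
ci = percentile_bootstrap_ci(counts, statistics.fmean)
```

With only 100 bootstrap samples the interval endpoints are themselves fairly variable, so a larger `n_boot` is often preferred when model refits are cheap.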
Kéry et al. (2005) used this data to fit Binomial-Poisson and Binomial-NB N-mixture
models. Out of the 235 sites, only 39 contained non-zero counts for at least one visit, out of
which only 15 sites had maximum counts larger than 1. This indicated the possibility of zero
inflation prior to any analysis. The AIC value (509.2) corresponding to the Binomial-Poisson
mixture (Kéry et al. 2005) was substantially higher than the AIC value (-393.9) for the multiple
visits Binomial-ZIP model (Table 2).
As the Binomial-ZIP model fits the data better than the Binomial-Poisson model, we
compared the naïve ZIP and the Binomial-ZIP models based on single versus multiple visits. The
naïve estimate of Mallard mean abundance (mean{(1−φ)λi}) using the ZIP model (without detection
error) was 0.36 per km², and gave 0.21 as probability of occurrence (mean{(1−φ)(1−e^(−λi))})
(Table 2). Based on the multiple visits Binomial-ZIP model, mean abundance was 4.32 per km²
with probability of occurrence 0.17. There were no Poisson zeros due to the high λ value. We
used the single visit Binomial-ZIP model estimation for each visit separately. The mean
abundances were 2.57, 0.58 and 0.39 per km² respectively, indicating a negative trend in
abundance over the 3 visits. This was accompanied by only a slight change in probability of
occurrence (mean{(1−φ)(1−e^(−λi))}; 0.24, 0.14, 0.17 for the three visits), but the probability of
Poisson zeros increased with the third visit substantially (from 0.03 to 0.14), indicating a drop in
abundance. Given the difference between predicted abundance values based on the multiple and
single visit approaches, and given the trend in the single visit estimates, we suspect that the
closed population assumption is violated for this data set.
The mallard data that was originally analyzed using the multiple visit approach in Kéry et 319
al. (2005) predicted mean abundance under the Binomial-NB model to be 0.43 per km2. They 320
argued this was an inaccurate estimate given that observed average density (with no detection 321
error) was 0.41. They interpret this small difference to indicate that three visits were insufficient 322
for detecting all territories. Based on the comparison of Binomial-Poisson and Binomial-ZIP 323
AIC values, the alternative explanation is that the data is zero inflated with very high probability 324
of zero-inflated zeros (0.69-0.83). Given that the zero-inflation model represents the data better, 325
and that wetlands that this species prefer are inherently patchily distributed, the non-zero inflated 326
Poisson or NB model is hard to interpret ecologically without other habitat characteristics. Using 327
the multiple visits Binomial-ZIP likelihood estimates, average abundance was a magnitude 328
higher (4.3) than the estimate of Kéry et al. (2005). The multiple visits method estimated very 329
low average probability of detection (0.016), which resulted in the high abundance estimate. The 330
single visit based Binomial-ZIP estimates were lower and showed a decrease in mean abundance
(2.6, 0.6, 0.4 for the three visits) over the 3 month interval of the visits (15 April – 15 July).
Given the 3 month time span of the repeated visits, and that Mallards are not strongly territorial,
the closed population assumption is likely violated. The decreasing trend during the breeding
season is somewhat surprising, but might be explained by movement out of sites to non-breeding
wetlands prior to migration.
Discussion
The N-mixture models that account for detection error in wildlife studies represent an
important class of models. However, it is widely believed that correcting for detection error
requires temporal replication (e.g. Royle et al. 2005). Our results show that this is not true. Under
fairly general conditions, detection error can be corrected with single visit survey data for
occupancy and abundance studies, and the resulting estimates are similar to those from multiple visit
approaches when the assumption of closure is met. The single survey methodology, however, does not
require the closed population assumption to correctly estimate population abundance. Thus,
single-visit approaches can save money, time and effort for field ecologists while accounting for
detection error.
For zero-inflated models, when the probability of detection is low, we found that
maximizing the likelihood function based on multiple visits leads to unstable inference. This is
because the different sources of zeros are confounded. As the use of N-mixture models is
growing and new variants are appearing in the literature, we feel that parameter identifiability
issues are often neglected (Lele, 2010, manuscript). For example, Royle (2004) carefully used
extensive numerical simulations to judge the validity of the estimators under multiple visits when
all assumptions are satisfied. On the other hand, Joseph et al. (2009) provided no such evidence
for the ZIP and ZINB extensions of the N-mixture models. We acknowledge their efforts in raising the
important problem of zero-inflation, but our simulation results suggest that there is substantial
confounding of the parameters. The conditional likelihood proposed in our paper separates the
parameter space and hence reduces the extent of confounding. According to our simulations,
even when the closed population assumption was satisfied, the single visit, conditional likelihood
based estimators (n = 100, one visit) outperformed the multiple visits, likelihood based
estimators (n = 250, 4 visits; n = 500, 2 visits) under many scenarios. As a consequence, even
when the assumption of closure is met, multiple visits results should be viewed cautiously.
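The confounding of zero sources can be made concrete with a small simulation. The sketch below, in Python using only the standard library, generates data from the Binomial-ZIP hierarchy and tallies which mechanism produced each observed zero. The function names, the constant parameter values and the seed are hypothetical choices for illustration; they are not the simulation settings used in this paper.

```python
import math
import random

def rpois(lam, rng):
    """Draw one Poisson(lam) variate (Knuth's multiplication method, fine for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_binomial_zip(n_sites, phi, lam, p, rng):
    """Return (A_i, N_i, Y_i) triples for n_sites sites with constant phi, lam and p."""
    rows = []
    for _ in range(n_sites):
        a = 1 if rng.random() < 1 - phi else 0               # A_i ~ Bernoulli(1 - phi)
        big_n = rpois(lam, rng) if a else 0                  # N_i | A_i ~ Poisson(A_i * lam)
        y = sum(1 for _ in range(big_n) if rng.random() < p) # Y_i | N_i ~ Binomial(N_i, p)
        rows.append((a, big_n, y))
    return rows

def zero_sources(rows):
    """Classify every site by the mechanism behind its observed count."""
    counts = {"zero_inflation": 0, "poisson_zero": 0, "non_detection": 0, "detected": 0}
    for a, big_n, y in rows:
        if a == 0:
            counts["zero_inflation"] += 1
        elif big_n == 0:
            counts["poisson_zero"] += 1
        elif y == 0:
            counts["non_detection"] += 1
        else:
            counts["detected"] += 1
    return counts

rows = simulate_binomial_zip(2000, phi=0.5, lam=2.0, p=0.3, rng=random.Random(1))
counts = zero_sources(rows)
print(counts)
```

In the field only $Y_i$ is recorded, so the first three categories are indistinguishable at the level of a single observation. This is the confounding referred to above.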
N-mixture models based on multiple visits should be highly suspect when the assumption
of closure is thought to be violated. When closure is violated, Bayne et al. (under review) found
that multiple visit N-mixture models overestimated density by several hundred percent. This was
likely the case in the Mallard example. In most practical situations, the closed population
assumption is likely to be violated for simple ecological reasons such as within-territory
movement, dispersal, etc. This is a widely acknowledged fact among wildlife biologists. A
number of papers are appearing that try to deal with the lack of closure by redefining the time
or space interval over which multiple surveys are done (e.g. Kendall and White 2009). However,
the results in Lele et al. (2010, manuscript) and this paper suggest that, when covariates are
available, multiple surveys are not necessary and hence the closed population assumption is
irrelevant. In addition to robustness against violations of the closed population assumption, our results
indicate that for most practical situations, conditional likelihood based estimation from single survey
data requires smaller sample sizes and provides more efficient estimators than multiple survey
approaches.
Many sampling methods and statistical analyses have been developed to estimate species
abundance. Even when it is possible to measure density accurately, the economics of doing so
can be prohibitive for large-scale applications. As a result, collecting presence-absence data at a
series of locations to get coarse measures of species abundance has become a preferred method
of evaluating ecological status and trends because of the simplicity of data collection.
Comparison of the simulations in Lele et al. (2010, manuscript) with the simulations in this paper
suggests that one may need substantially larger samples for occupancy data than for
abundance data. When abundance data are available, the estimators are stable and efficient, even
at much smaller sample sizes. Furthermore, using the zero-inflated Poisson model for the true
abundance, one can differentiate between zero-inflation and Poisson zeros. This is not possible
when using detected/not-detected data to model site occupancy. Hence we encourage ecologists
to collect count data whenever possible. Single survey methods increase the cost effectiveness of
monitoring studies without sacrificing statistical validity and efficiency of the estimates.
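The advantage of counts over detected/not-detected data can be illustrated with a small numerical example. The Python sketch below makes two simplifying assumptions that are not part of the analyses in this paper: detection is perfect ($p_i = 1$) and there are no covariates. Under those assumptions it constructs two hypothetical parameter settings that give the same probability of a non-zero record, and shows that the count distribution still tells them apart.

```python
import math

def prob_nonzero(phi, lam):
    """P(N > 0) under a zero-inflated Poisson: (1 - phi) * (1 - exp(-lam))."""
    return (1 - phi) * (1 - math.exp(-lam))

def zip_pmf(y, phi, lam):
    """P(N = y) under a zero-inflated Poisson with zero-inflation probability phi."""
    poisson = math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))
    return phi + (1 - phi) * poisson if y == 0 else (1 - phi) * poisson

# Setting A: strong zero inflation, higher rate (hypothetical values)
phi_a, lam_a = 0.5, 1.0
# Setting B: weak zero inflation; the rate is chosen so that P(N > 0) matches setting A
phi_b = 0.2
lam_b = -math.log(1 - prob_nonzero(phi_a, lam_a) / (1 - phi_b))

same_occupancy = prob_nonzero(phi_a, lam_a), prob_nonzero(phi_b, lam_b)
different_counts = zip_pmf(1, phi_a, lam_a), zip_pmf(1, phi_b, lam_b)
print(same_occupancy, different_counts)
```

Both settings predict the same proportion of occupied sites, so occupancy data alone cannot separate zero-inflation zeros from Poisson zeros in this simplified case, whereas the probability of observing exactly one individual differs between them.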
Acknowledgements
We would like to thank Stan Boutin, Steve Cumming, Monica Moreno, Jim Schieck,
Fiona Schmiegelow, Samantha Song, and the Boreal Avian Modeling Project Technical
committee for helpful discussions on the issue of detection error. Funding for this research was
provided by the Alberta Biodiversity Monitoring Institute, Environment Canada, North
American Migratory Bird Conservation Act, and Natural Sciences and Engineering Research
Council.
References
Anderson, E. B. (1970) Asymptotic properties of conditional maximum likelihood estimators. J.
Royal Stat. Soc. B 32: 283-301.
Bayne, E. M. (2000) Effects of forest fragmentation on the demography of ovenbirds (Seiurus
aurocapillus) in the boreal forest. University of Saskatchewan, Saskatoon, Canada. PhD
Thesis.
Bayne, E., Lele, S. R. & Sólymos, P. (2010) Bias in the estimation of bird density and relative
abundance when the closure assumption of multiple survey approaches is violated: a
simulation study. The Auk, under review.
Downes, C. M. & Collins, B. T. (2003) The Canadian breeding bird survey, 1967-2000.
Canadian Wildlife Service, Progress Notes No. 219. National Wildlife Research Centre,
Ottawa, ON.
Efron, B. & Tibshirani, R. (1994) An introduction to the bootstrap. Chapman & Hall/CRC. 436 p.
Casella, G. & Berger, R. L. (2002) Statistical inference. 2nd edn. Australia, Pacific Grove, CA.
Thomson Learning. 660 p.
Gong, G. & Samaniego, F. J. (1981) Pseudo-likelihood estimation: theory and applications.
Annals of Statistics 9: 861-869.
Gu, W. & Swihart, R. K. (2004) Absent or undetected? Effects of non-detection of species
occurrence on wildlife-habitat models. Biol. Conserv. 116: 195-203.
Hobson, K. A. & Bayne, E. M. (2002) Breeding bird communities in boreal forest of Western
Canada: Consequences of “unmixing” the mixed woods. Condor 102: 759-769.
Joseph, L. N., Elkin, C., Martin, T. G. & Possingham, H. P. (2009) Modeling abundance using
N-mixture models: the importance of considering ecological mechanisms. Ecol. Appl. 19:
631-642.
Kendall, W. L. & White, G. C. (2009) A cautionary note on substituting spatial subunits for
repeated temporal sampling in studies of site occupancy. J. Appl. Ecol. 46: 1182-1188.
Kéry, M., Royle, J. A. & Schmid, H. (2005) Modeling avian abundance from replicated counts
using binomial mixture models. Ecol. Appl. 15: 1450-1461.
Krebs, C. J. (1985) Ecology: The experimental analysis of distribution and abundance. 3rd edn.
Harper and Row, New York, USA.
Lele, S. R. (2010) Model complexity and information in the data: could it be a house built on
sand? Ecology, in press.
Lele, S. R., Moreno, M. & Bayne, E. (2010) Dealing with detection error in site occupancy
surveys: What can we do with a single survey? (Manuscript)
MacKenzie, D. I., Nichols, J. D., Lachman, G. B., Droege, S., Royle, J. A. & Langtimm, C. A.
(2002) Estimating site occupancy rates when detection probabilities are less than one.
Ecology 83: 2248-2255.
MacKenzie, D. I., Nichols, J. D., Royle, J. A., Pollock, K. H., Bailey, L. L. & Hines, J. E. (2006)
Occupancy estimation and modeling: inferring patterns and dynamics of species
occurrence. Elsevier, Amsterdam, Netherlands. 324 pp.
MacTavish, P. (1995) Saskatchewan digital landcover mapping project. Report I-4900-15-B-95.
Saskatchewan Research Council, Saskatoon, SK.
Martin, T. G., Wintle, B. A., Rhodes, J. R., Kuhnert, P. M., Field, S. A., Low-Choy, S. J., Tyre, A. J. &
Possingham, H. P. (2005) Zero tolerance ecology: improving ecological inference by
modeling the source of zero observations. Ecol. Lett. 8: 1235-1246.
Moreno, M. & Lele, S. R. (2010) Improved estimation of site occupancy using penalized
likelihood. Ecology 91: 341-346.
R Development Core Team (2009) R: A language and environment for statistical computing. R
Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL
http://www.R-project.org.
Rosenberg, K. V. & Blancher, P. J. (2005) Setting numerical population objectives for priority
landbird species. In: Bird Conservation and Implementation in the Americas:
Proceedings of the Third International Partners in Flight Conference (eds. Ralph, C. J. &
Rich, T. D.). U.S. Department of Agriculture, Forest Service, General Technical Report
PSW-GTR-191. Vol. 1, pp. 57-67.
Royle, J. A. (2004) N-mixture models for estimating population size from spatially replicated
counts. Biometrics 60: 108-115.
Royle, J. A., Nichols, J. D. & Kéry, M. (2005) Modelling occurrence and abundance of species
when detection is imperfect. Oikos 110: 353-359.
Schieck, J. (1997) Biased detection of bird vocalizations affects comparisons of bird abundance
among forested habitats. The Condor 99: 179-190.
Sólymos, P. & Moreno, M. (2010) ‘occupy’: analyzing single visit data with detection error. R
package version 1.0-0. URL: http://cran.r-project.org/package=occupy
Tyre, A. J., Tenhumberg, B., Field, S. A., Niejalke, D., Parris, K. & Possingham, H. P. (2003)
Improving precision and reducing bias in biological surveys: estimating false negative
error rates. Ecol. Appl. 13: 1790-1801.
Zeileis, A., Kleiber, C. & Jackman, S. (2008) Regression models for count data in R. J. Stat.
Soft. 27(8). URL http://www.jstatsoft.org/v27/i08/.
Yoccoz, N. G., Nichols, J. D. & Boulinier, T. (2001) Monitoring of biological diversity in space
and time. Trends in Ecol. Evol. 16: 446-453.
Appendix: Conditional and pseudo-likelihood estimation for the Binomial-Zero Inflated
Poisson mixture model
Let $Y_i \mid N_i \sim \mathrm{Binomial}(N_i, p_i)$, where $p_i = p(Z_i, \theta)$ is a function of detection covariates $Z_i$. Let
$N_i \mid A_i \sim \mathrm{Poisson}(A_i \lambda_i)$, where $\lambda_i = \lambda(X_i, \beta)$ is a function of abundance covariates $X_i$. Further,
$A_i \sim \mathrm{Bernoulli}(1 - \phi)$. Then the random variable $Y_i$ is said to follow a Binomial-Zero Inflated
Poisson distribution. We first derive some elementary mathematical statistics results related to
this distribution.
Result 1: Consider the conditional distribution
$$P(Y_i = y_i \mid Y_i > 0) = \frac{P(Y_i = y_i)}{1 - P(Y_i = 0)} \quad \text{for } y_i = 1, 2, 3, \ldots$$
The probability mass function for this conditional distribution is given by:
$$P(Y_i = y_i \mid Y_i > 0) = \frac{\sum_{N_i = y_i}^{\infty} \binom{N_i}{y_i} p_i^{y_i} (1 - p_i)^{N_i - y_i} e^{-\lambda_i} \lambda_i^{N_i} / N_i!}{1 - e^{-\lambda_i p_i}} \quad \text{for } y_i = 1, 2, 3, \ldots$$
Notice that this conditional distribution does not depend on the parameter $\phi$.
Proof: This proof follows elementary probability theory (e.g. Casella and Berger, 2002).
$$P(Y_i = y_i \mid Y_i > 0) = \frac{P(Y_i = y_i)}{1 - P(Y_i = 0)} = \frac{(1 - \phi) \sum_{N_i = y_i}^{\infty} \binom{N_i}{y_i} p_i^{y_i} (1 - p_i)^{N_i - y_i} e^{-\lambda_i} \lambda_i^{N_i} / N_i!}{1 - P(Y_i = 0)} \qquad (1)$$
Further,
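$$\begin{aligned}
P(Y_i = 0) &= \phi + (1 - \phi) \sum_{N_i = 0}^{\infty} \binom{N_i}{0} p_i^{0} (1 - p_i)^{N_i - 0} e^{-\lambda_i} \lambda_i^{N_i} / N_i! \\
&= \phi + (1 - \phi) e^{-\lambda_i} \sum_{N_i = 0}^{\infty} \left[ (1 - p_i) \lambda_i \right]^{N_i} / N_i! \\
&= \phi + (1 - \phi) e^{-\lambda_i} e^{(1 - p_i) \lambda_i} \\
&= \phi + (1 - \phi) e^{-\lambda_i p_i}
\end{aligned}$$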
Hence, we can write
$$1 - P(Y_i = 0) = (1 - \phi) \left( 1 - e^{-\lambda_i p_i} \right) \qquad (2)$$
Combining equations (1) and (2), it follows that:
$$P(Y_i = y_i \mid Y_i > 0) = \frac{\sum_{N_i = y_i}^{\infty} \binom{N_i}{y_i} p_i^{y_i} (1 - p_i)^{N_i - y_i} e^{-\lambda_i} \lambda_i^{N_i} / N_i!}{1 - e^{-\lambda_i p_i}}.$$
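Result 1 can also be checked numerically. The Python sketch below evaluates the Binomial-ZIP probability mass function by truncating the sum over the latent $N_i$, and then confirms that the conditional distribution given $Y_i > 0$ is the same for two different values of $\phi$. The function names, the truncation point `n_max` and the parameter values are hypothetical choices made for this illustration.

```python
import math

def binzip_pmf(y, lam, p, phi, n_max=200):
    """P(Y = y) under the Binomial-ZIP mixture, summing over latent N = y, ..., n_max."""
    total = 0.0
    for n in range(y, n_max + 1):
        log_binom = (math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
                     + y * math.log(p) + (n - y) * math.log(1 - p))
        log_pois = -lam + n * math.log(lam) - math.lgamma(n + 1)
        total += math.exp(log_binom + log_pois)
    mass = (1 - phi) * total
    # zero counts also receive the zero-inflation mass phi
    return mass + phi if y == 0 else mass

def conditional_pmf(y, lam, p, phi):
    """P(Y = y | Y > 0) for y >= 1, computed directly from the definition."""
    return binzip_pmf(y, lam, p, phi) / (1 - binzip_pmf(0, lam, p, phi))

cond_low_phi = conditional_pmf(2, lam=3.0, p=0.4, phi=0.2)
cond_high_phi = conditional_pmf(2, lam=3.0, p=0.4, phi=0.8)
print(cond_low_phi, cond_high_phi)  # the two values agree up to rounding error
```

The same functions reproduce equation (2): `1 - binzip_pmf(0, lam, p, phi)` agrees with $(1-\phi)(1-e^{-\lambda p})$ up to the truncation error of the sum.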
Result 2: The binary random variable defined by $W_i = I(Y_i > 0)$ has the following distribution:
$$P(W_i = 0) = \phi + (1 - \phi) e^{-\lambda_i p_i}$$
$$P(W_i = 1) = (1 - \phi) \left( 1 - e^{-\lambda_i p_i} \right).$$
Proof: Follows from equation (2) in the proof of the previous result.
Conditional likelihood estimation of $(\beta, \theta)$:
To estimate the parameters $(\beta, \theta)$, we use the likelihood based only on those sites that have at
least one individual observed. This is called the conditional likelihood function (Anderson 1970).
The conditional likelihood is given by
$$CL(\beta, \theta) = \prod_{y_i > 0} P(Y_i = y_i \mid Y_i > 0),$$
where the product is taken only over those sites where $y_i > 0$. We maximize this function to obtain
the estimates of the parameters $(\beta, \theta)$. The conditional likelihood estimators are known to be
consistent (Anderson 1970) as the number of sites that have at least one individual observed increases.
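As a sketch of how the conditional likelihood might be evaluated in practice, the Python function below computes $\log CL(\beta, \theta)$. Two things in it are assumptions rather than statements from this appendix: a log link for $\lambda_i$ and a logit link for $p_i$, and the use of the standard Poisson thinning identity, under which the infinite sum in Result 1 equals $e^{-\lambda_i p_i} (\lambda_i p_i)^{y_i} / y_i!$. The function name, covariate values and parameter values are hypothetical.

```python
import math

def conditional_loglik(beta, theta, X, Z, y):
    """Log conditional likelihood: sum of log P(Y_i = y_i | Y_i > 0) over sites with y_i > 0.

    X, Z -- lists of covariate rows (include a leading 1.0 for an intercept)
    y    -- observed counts, one per site
    """
    total = 0.0
    for x_i, z_i, y_i in zip(X, Z, y):
        if y_i <= 0:
            continue  # sites with zero counts do not enter the conditional likelihood
        lam = math.exp(sum(b * x for b, x in zip(beta, x_i)))                # assumed log link
        p = 1.0 / (1.0 + math.exp(-sum(t * z for t, z in zip(theta, z_i))))  # assumed logit link
        mu = lam * p
        # zero-truncated Poisson(mu) log-pmf; -expm1(-mu) equals 1 - exp(-mu)
        total += (y_i * math.log(mu) - mu - math.lgamma(y_i + 1)
                  - math.log(-math.expm1(-mu)))
    return total

# Hypothetical data: three sites, an intercept plus one covariate in each sub-model
X = [[1.0, 0.2], [1.0, -0.5], [1.0, 1.1]]
Z = [[1.0, 0.7], [1.0, 0.1], [1.0, -0.3]]
y = [2, 0, 1]
beta, theta = [0.3, 0.5], [-0.2, 0.8]
loglik = conditional_loglik(beta, theta, X, Z, y)
print(loglik)
```

Maximizing this function over $(\beta, \theta)$ requires a numerical optimizer, which is not shown here. Written this way, $\lambda_i$ and $p_i$ enter only through their product, so separating them relies on the covariates, in line with the remark in the Discussion that covariates must be available.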
Pseudo-likelihood estimation of $\phi$:
To estimate the parameter $\phi$, we consider the likelihood based on the random
variables $W_i$, where the parameters $(\beta, \theta)$ are fixed at their conditional likelihood estimates $(\hat{\beta}, \hat{\theta})$.
Gong and Samaniego (1981) call such a likelihood a ‘pseudo-likelihood’:
$$PL(\phi; W, \hat{\beta}, \hat{\theta}) = \prod_{i=1}^{n} \left\{ (1 - \phi) \left( 1 - e^{-\hat{\lambda}_i \hat{p}_i} \right) \right\}^{W_i} \left\{ \phi + (1 - \phi) e^{-\hat{\lambda}_i \hat{p}_i} \right\}^{1 - W_i}$$
Because the conditional likelihood estimates $(\hat{\beta}, \hat{\theta})$ are consistent, the pseudo-likelihood
estimator of $\phi$ obtained by maximizing the pseudo-likelihood is also consistent (Gong and
Samaniego 1981).
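The second estimation step is a one-dimensional maximization and can be sketched briefly in Python. The code below evaluates the log pseudo-likelihood for given plug-in values $\hat{\lambda}_i$ and $\hat{p}_i$ and maximizes it over $\phi$ by ternary search, which is adequate because each term is the logarithm of a function that is linear in $\phi$, so the sum is concave. The function names, the choice of search method and the plug-in values are hypothetical and are not a description of the software used for this paper.

```python
import math

def pseudo_loglik(phi, w, lam_hat, p_hat):
    """Log pseudo-likelihood of phi given indicators w_i = I(y_i > 0) and plug-in estimates."""
    total = 0.0
    for w_i, lam, p in zip(w, lam_hat, p_hat):
        q = math.exp(-lam * p)  # probability of a zero count at a site that is not zero inflated
        if w_i:
            total += math.log((1 - phi) * (1 - q))
        else:
            total += math.log(phi + (1 - phi) * q)
    return total

def estimate_phi(w, lam_hat, p_hat, tol=1e-8):
    """Maximize the concave log pseudo-likelihood over (0, 1) by ternary search."""
    lo, hi = 1e-9, 1 - 1e-9
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if pseudo_loglik(m1, w, lam_hat, p_hat) < pseudo_loglik(m2, w, lam_hat, p_hat):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2

# Hypothetical example: 10 sites, 3 with detections, constant plug-in values
w = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
lam_hat = [2.0] * 10
p_hat = [0.5] * 10
phi_hat = estimate_phi(w, lam_hat, p_hat)
print(phi_hat)
```

With constant plug-in values the maximizer has a closed form, $\hat{\phi} = 1 - \bar{W} / (1 - e^{-\hat{\lambda}\hat{p}})$, which gives a convenient check on the search.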

SUPPORTING INFORMATION
The following Supporting Information is available for this article:
Appendix S1 Simulation results

Table 1. Model selection results for the Ovenbird data set based on the Binomial-ZIP mixture.
Model terms not significant (based on Wald test) were backward dropped until only significant
terms remained (model 4). Bootstrap based 90% confidence intervals are provided in parentheses
for the best fit model 4.

                                    Model 1  Model 2  Model 3  Model 4
Abundance
  Intercept                           0.300    0.211    0.161    0.508 (-0.071, 0.578)
  Proportion of forest area           0.820    0.977    0.999    0.825 (0.697, 1.138)
  Proportion of deciduous area        0.044    0.005
  Proportion of agricultural area    -0.058    0.067    0.029
  Latitude                            0.300    0.282    0.205    0.236 (-0.011, 0.379)
  Longitude                           0.214    0.201    0.137    0.195 (-0.037, 0.330)
Detection
  Intercept                          -0.607   -0.107   -0.916   -1.719 (-3.275, -0.315)
  Observer (DW)                      -1.037   -0.360   -0.510   -0.453 (-1.889, -0.886)
  Observer (RDW)                      0.818    0.737    1.813    1.753 (-6.379, 2.770)
  Observer (SVW)                      1.436    1.415    2.444    2.329 (1.099, 4.045)
  Proportion of forest area          -1.155   -1.355   -1.444   -1.019 (1.700, 4.428)
  Julian day                         -0.512   -0.542   -0.501   -0.470 (-0.689, -0.357)
  Time of day                        -0.079
φ                                     0.366    0.389    0.376    0.410 (0.311, 0.451)
P(N = 0)                              0.200    0.206    0.218    0.150 (0.136, 0.276)
Mean λ                                2.387    2.214    2.142    2.604 (1.858, 2.717)
Mean λ(1-φ)                           1.513    1.352    1.337    1.537 (1.171, 1.595)
Mean p                                0.559    0.641    0.646    0.513 (0.514, 0.720)
AIC                                   977.7   1146.1   1093.4    860.7

Table 2. Naïve (detection probability is 1), multiple visits and single visit N-mixture estimates
for the Mallard dataset based on the zero-inflated Poisson distribution for abundances. Naïve
estimates are based on maximum counts from the visits to the sites, multiple visits include the
three visits, single visit N-mixture model was fitted for each visit separately. Bootstrap based
90% confidence limits are in parentheses.

                Naïve (max counts)  Multiple visits     Visit 1             Visit 2             Visit 3
Abundance
  Intercept     -1.264              3.232               1.253               0.736               -0.430
                (-2.224, -0.833)    (2.909, 3.280)      (-0.626, 2.094)     (-3.641, 2.910)     (-8.126, 1.039)
  Route length  -0.453              -0.043              -0.744              -0.839              -1.134
                (-0.964, 0.164)     (-0.084, -0.011)    (-1.216, 0.117)     (-1.988, 0.572)     (-2.099, 0.941)
  Elevation     -1.163              -0.055              0.595               -0.459              -0.908
                (-1.923, -0.625)    (-0.136, 0.018)     (-2.307, 1.261)     (-4.198, 1.620)     (-6.883, 1.551)
  Forest (%)    -0.600              0.007               -0.132              0.465               0.036
                (-1.185, -0.262)    (-0.050, 0.065)     (-0.837, 0.344)     (-1.217, 2.283)     (-4.481, 1.278)
Detection
  Intercept                         -4.307              -8.437              1.285               13.169
                                    (-4.688, -3.910)    (-29.378, -3.321)   (-4.181, 35.904)    (2.768, 47.476)
  Effort                            0.326               0.681               0.794               0.939
                                    (-0.589, 0.870)     (-0.720, 5.796)     (-3.847, 20.326)    (-2.756, 12.341)
  Date                              -0.448              -5.069              1.667               -10.650
                                    (-0.538, -0.287)    (-20.432, -0.923)   (-11.770, 33.068)   (-33.749, -4.707)
φ               0.418               0.830               0.729               0.824               0.688
                (0.131, 0.619)      (0.790, 0.877)      (0.520, 0.817)      (0.373, 0.879)      (0.226, 0.797)
P(N = 0)        0.370               0.000               0.033               0.034               0.141
                (0.239, 0.686)      (0.000, 0.000)      (0.007, 0.223)      (0.000, 0.399)      (0.053, 0.654)
Mean λ          0.540               25.364              9.468               3.295               1.261
                (0.303, 0.833)      (18.395, 26.653)    (1.102, 82.829)     (0.796, 1188.788)   (0.397, 13.532)
Mean λ(1-φ)     0.360               4.318               2.565               0.581               0.394
                (0.243, 0.492)      (2.785, 4.932)      (0.367, 25.047)     (0.307, 286.227)    (0.242, 4.491)
Mean p                              0.016               0.502               0.672               0.563
                                    (0.012, 0.027)      (0.100, 0.624)      (0.092, 0.825)      (0.323, 0.692)

Figure captions
Figure 1. Simulation results with a common discrete covariate used both for the
abundance (β2) and the detection (θ2) model. Each box and whiskers corresponds to 100
simulations; horizontal axes give the sample size (n) used for estimation. As n increases, the medians
(thick black lines) get closer to the true parameter values (thick grey lines), and the estimates
become more precise (inter-quartile boxes and range whiskers get narrower). The low
abundance – zero inflated data – low detectability scenario was used. β, θ, and φ are model
parameters (see text), $\bar{\lambda}$ is the mean of the predicted rate parameter of the Poisson distribution,
$\bar{p}$ is the mean of the detection probabilities. Correlations between true and predicted λ and p values
are shown in the lowest row. The bottom right insert represents the count distribution for an example
data set out of the 100 simulated ones; black bars are true, grey bars are observed counts.
Figure 2. Comparison of single (1x) and multiple (2x, 4x) visits estimates based on 100
simulations (settings are the same as for Fig. 1). Sample sizes (visits x number of sites) are 500
(1 and 2 visits) and 1000 (1, 2, and 4 visits). Single visit estimation was based on the conditional
approach (see text) whereas likelihood based estimation was used for multiple survey analysis.
Results show that intercept (β0, θ0) and φ estimates based on multiple visits are biased, and
variability of the estimates is greater than for the single visit estimator.
Figure 3. Count distribution for the Ovenbird data set (A) and probability plot (B) for the
N-mixture model fitted to the data set. Ovenbird abundances are actual counts from 891
locations (grey bars); the estimated proportions of zero-inflation (black) and Poisson zeros (white)
are shown beside the zero point mass bar; the difference between the observed and predicted zero
point mass is due to non-detection zeros. The probability plot shows the values of the empirical
and fitted cumulative distribution functions (CDF) based on the Binomial-Poisson (open circles)
and the Binomial-ZIP (filled circles) mixtures. The scattered line represents the line with slope 1;
values closer to this line indicate better fit.
[Figure 1]
[Figure 2]
[Figure 3]