published jas paper

This article was downloaded by: [Universiti Pendidikan Sultan Idris], [Nor Azah Samat]
On: 27 June 2012, At: 19:09
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of Applied Statistics
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/cjas20

Vector-borne infectious disease mapping with stochastic difference equations: an analysis of dengue disease in Malaysia
N. A. Samat a & D. F. Percy b
a Department of Mathematics, Faculty of Science and Mathematics, Universiti Pendidikan Sultan Idris, 35900 Tanjong Malim, Perak, Malaysia
b Salford Business School, University of Salford, Greater Manchester, M5 4WT, UK
Version of record first published: 27 Jun 2012

To cite this article: N. A. Samat & D. F. Percy (2012): Vector-borne infectious disease mapping with stochastic difference equations: an analysis of dengue disease in Malaysia, Journal of Applied Statistics, DOI:10.1080/02664763.2012.700450
To link to this article: http://dx.doi.org/10.1080/02664763.2012.700450

PLEASE SCROLL DOWN FOR ARTICLE

Full terms and conditions of use: http://www.tandfonline.com/page/terms-and-conditions
This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. The publisher does not give any warranty express or implied or make any representation that the contents will be complete or accurate or up to date. The accuracy of any instructions, formulae, and drug doses should be independently verified with primary sources.
The publisher shall not be liable for any loss, actions, claims, proceedings, demand, or costs or damages whatsoever or howsoever caused arising directly or indirectly in connection with or arising out of the use of this material.



 Journal of Applied Statistics

2012, iFirst article

Vector-borne infectious disease mapping with stochastic difference equations: an analysis of dengue disease in Malaysia

N.A. Samat a,∗ and D.F. Percy b

a Department of Mathematics, Faculty of Science and Mathematics, Universiti Pendidikan Sultan Idris,

35900 Tanjong Malim, Perak, Malaysia;  b Salford Business School, University of Salford, Greater 

 Manchester, M5 4WT, UK 

( Received 3 May 2011; final version received 2 June 2012)

Few publications consider the estimation of relative risk for vector-borne infectious diseases. Most of 

these articles involve exploratory analysis that includes the study of covariates and their effects on disease

distribution and the study of geographic information systems to integrate patient-related information. The

aim of this paper is to introduce an alternative method of relative risk estimation based on discrete time–space stochastic SIR-SI models (susceptible–infective–recovered for human populations; susceptible–

infective for vector populations) for the transmission of vector-borne infectious diseases, particularly

dengue disease. First, we describe deterministic compartmental SIR-SI models that are suitable for dengue

disease transmission. We then adapt these to develop corresponding discrete time–space stochastic SIR-

SI models. Finally, we develop an alternative method of estimating the relative risk for dengue disease

mapping based on these models and apply them to analyse dengue data from Malaysia. This new approach

offers a better model for estimating the relative risk for dengue disease mapping compared with the other

common approaches, because it takes into account the transmission process of the disease while allowing

for covariates and spatial correlation between risks in adjacent regions.

Keywords:   relative risk; disease mapping; dengue disease; tract-count data; SIR-SI models

1. Introduction

Dengue is a common, serious, infectious, mosquito-borne, viral disease in tropical and subtropical

regions of the world. Dengue viruses are transmitted to humans through the bites of infective

female  Aedes  mosquitoes, which live in clear and stagnated water that is mostly generated by

human activity and rainfall. There is currently no vaccine available for the prevention or treatment

of dengue disease. However, dengue can be prevented and controlled if detected early. Therefore,

the use of statistical models for studying the transmission of dengue disease and the estimation

∗Corresponding author. Email: [email protected]

ISSN 0266-4763 print/ISSN 1360-0532 online
© 2012 Taylor & Francis
http://dx.doi.org/10.1080/02664763.2012.700450
http://www.tandfonline.com


of relative risk for disease mapping are important contributions to the prevention and control

strategies for dengue.

This paper investigates geographical distribution and disease mapping particularly for dengue

disease. Relative risk estimation is one of the most important issues when studying geographical

distributions of disease occurrence. Many studies of disease mapping use regression-type models

in which observable (fixed effects) and unobservable (random effects) variables are included to give a clean map and so depict the true excess risk [2–4,13,16,19,32]. In spite of this, published

studies that use structural disease transmission models for disease mapping are scarce  [9].

Specifically for the case of dengue disease, few researchers use stochastic processes to estimate

the relative risk for disease mapping. Rather, most dengue studies are based on exploratory

data analysis accompanied by pictorial maps, which includes the study of covariates and their

effects on dengue disease distribution. See, for example, [10,27]. Furthermore, some authors use

a geographic information system to integrate the patient-related information [31].

In attempting to develop an improved model and a complementary analysis, our research

introduces an alternative method to estimate the relative risk of dengue disease transmission

based initially on discrete-time, discrete-space, stochastic SIR-SI models (susceptible–infective–recovered for human populations; susceptible–infective for vector populations). This method is

designed to overcome the drawbacks of relative risk estimation in disease mapping using the

classical approach based on standardized morbidity ratios (SMRs). It involves extending the

fundamental Poisson-gamma model and developing a Bayesian analytic approach.

In the remainder of this paper, we first describe existing deterministic compartmental SIR-SI

models for dengue disease transmission. Then, we derive a discrete time–space stochastic SIR-

SI model for dengue disease transmission, which adapts and extends the stochastic SIR models

described by Lawson [14]. We then continue with explanations about an alternative method of 

relative risk estimation for dengue disease mapping, which we develop based on this new stochastic

SIR-SI model. This method is then applied to dengue data of Malaysia to demonstrate the models

in practice.

2. Compartmental SIR-SI models for dengue disease transmission

The compartmental model displayed in Figure 1 is the most common model used in the study of dengue disease transmission and is adapted from [7,22]. In this study, for $i = 1, 2, \ldots, M$ study regions and $j = 1, 2, \ldots, T$ time periods, $S^{(h)}_{i,j}$ represents the total number of susceptible humans at time $j$, $I^{(h)}_{i,j}$ represents the total number of infective humans at time $j$, and $R^{(h)}_{i,j}$ represents the total number of recovered humans at time $j$. We use the superscript $(h)$ to distinguish the variables and parameters as representing the human population rather than the vector population, for which we use the superscript $(v)$. Furthermore, in Figure 1, $S^{(v)}_{i,j}$ represents the total number of susceptible mosquitoes at time $j$, $I^{(v)}_{i,j}$ represents the total number of infective mosquitoes at time $j$, $\mu^{(h)}$ and $\mu^{(v)}$ represent the (assumed equal) birth and death rates of humans per week and the (assumed equal) birth and death rates of mosquitoes per week, respectively, $\gamma^{(h)}$ represents the rate at which humans recover per week, $b$ represents the biting rate per week, $m$ represents the number of alternative hosts available as the blood source, $A$ represents the constant recruitment rate for the mosquito vector, $\beta^{(h)}$ represents the transmission probability from mosquitoes to humans, $\beta^{(v)}$ represents the transmission probability from humans to mosquitoes, $N^{(h)}_i$ represents the human population size for the study region $i$ and $N^{(v)}_i$ represents the mosquito population size for the study region $i$. These definitions and notations hold throughout this paper.

Figure 1. Compartmental SIR-SI model for dengue disease transmission.

For the case of dengue, susceptible people can become infective and then recover or die due

to the infection. However, susceptible  Aedes mosquitoes can become infective but they will not

recover or die due to the infection because infective mosquitoes stay infective for the remainder

of their lifetimes.

For discrete-time intervals, the compartmental model in Figure 1 can also be written mathemat-

ically as a system of difference equations. Therefore, the deterministic SIR-SI model for dengue

disease transmission in human populations is given by

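$$S^{(h)}_{i,j} = \mu^{(h)} N^{(h)}_i + \left(1 - \mu^{(h)} - \frac{\beta^{(h)} b}{N^{(h)}_i + m}\, I^{(v)}_{i,j-1}\right) S^{(h)}_{i,j-1}, \tag{1}$$

$$I^{(h)}_{i,j} = \left(1 - \mu^{(h)} - \gamma^{(h)}\right) I^{(h)}_{i,j-1} + \frac{\beta^{(h)} b}{N^{(h)}_i + m}\, I^{(v)}_{i,j-1} S^{(h)}_{i,j-1}, \tag{2}$$

$$R^{(h)}_{i,j} = \left(1 - \mu^{(h)}\right) R^{(h)}_{i,j-1} + \gamma^{(h)} I^{(h)}_{i,j-1}. \tag{3}$$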

Similarly, the deterministic SIR-SI model for dengue disease transmission in vector populations

is given by

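$$S^{(v)}_{i,j} = \mu^{(v)} N^{(v)}_i + \left(1 - \mu^{(v)} - \frac{\beta^{(v)} b}{N^{(h)}_i + m}\, I^{(h)}_{i,j-1}\right) S^{(v)}_{i,j-1}, \tag{4}$$

$$I^{(v)}_{i,j} = \left(1 - \mu^{(v)}\right) I^{(v)}_{i,j-1} + \frac{\beta^{(v)} b}{N^{(h)}_i + m}\, I^{(h)}_{i,j-1} S^{(v)}_{i,j-1}. \tag{5}$$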

The combined model derived above has the same form as the deterministic SIR-SI model used by Esteva and Vargas [7]. Here, $N^{(h)}_i$ and $N^{(v)}_i$ are assumed to be constant, such that $N^{(h)}_i = S^{(h)}_{i,j} + I^{(h)}_{i,j} + R^{(h)}_{i,j}$ and $N^{(v)}_i = S^{(v)}_{i,j} + I^{(v)}_{i,j}$. This formulation can then be used to provide a link to stochastic means, which will be explained in the next section.
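As an illustration only, the following Python sketch advances the deterministic system in Equations (1)–(5) by one week for a single region. The names `Params` and `deterministic_step` are hypothetical and not part of the original analysis; the default parameter values are the weekly values quoted in Section 5.1. The sketch assumes the infection term stays below one, which holds only when the number of infective mosquitoes is small relative to the human population.

```python
from dataclasses import dataclass


@dataclass
class Params:
    """Weekly parameter values quoted in Section 5.1 (hypothetical container)."""
    mu_h: float = 0.0002736   # human birth and death rate
    mu_v: float = 0.4028      # mosquito birth and death rate
    gamma_h: float = 0.7903   # human recovery rate
    beta_h: float = 0.50      # transmission probability, mosquito to human
    beta_v: float = 0.75      # transmission probability, human to mosquito
    b: float = 2.33           # biting rate
    m: float = 0.0            # alternative hosts available as blood source


def deterministic_step(state, n_h, n_v, p):
    """Advance (S_h, I_h, R_h, S_v, I_v) by one week using Equations (1)-(5)."""
    s_h, i_h, r_h, s_v, i_v = state
    contact = p.b / (n_h + p.m)
    force_h = p.beta_h * contact * i_v   # infection term acting on susceptible humans
    force_v = p.beta_v * contact * i_h   # infection term acting on susceptible mosquitoes
    new_s_h = p.mu_h * n_h + (1 - p.mu_h - force_h) * s_h      # Eq. (1)
    new_i_h = (1 - p.mu_h - p.gamma_h) * i_h + force_h * s_h   # Eq. (2)
    new_r_h = (1 - p.mu_h) * r_h + p.gamma_h * i_h             # Eq. (3)
    new_s_v = p.mu_v * n_v + (1 - p.mu_v - force_v) * s_v      # Eq. (4)
    new_i_v = (1 - p.mu_v) * i_v + force_v * s_v               # Eq. (5)
    return (new_s_h, new_i_h, new_r_h, new_s_v, new_i_v)
```

Because births balance deaths, each step leaves the human total and the mosquito total unchanged, which gives a quick check on any implementation.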

3. Stochastic SIR-SI model for dengue disease transmission

A deterministic analysis provides a good approximation to the stochastic means for a major out-

break when the sample size is large [12]. Therefore, in the following analysis we use a formulation

of the deterministic model to provide an approximation to the stochastic means.


Lawson [14] developed a stochastic SIR model for direct transmission of infectious diseases. Although it only considered discrete time and discrete space, this model proved very effective for analysing the spread of influenza. We now extend this model to enable the analysis of indirectly transmitted infectious diseases, similarly taking account of correlations among neighbouring regions, using a spatial prior as described later in this section. However, in this study we include the terms $\mathfrak{I}^{(h)}_{i,j}$, $\mathfrak{R}^{(h)}_{i,j}$ and $\mathfrak{I}^{(v)}_{i,j}$ to represent the numbers of newly infective humans, newly recovered humans and newly infective mosquitoes, respectively, all in the interval or time period $(j-1, j]$, and study region $i$. This is because the dengue data that we observe are weekly new infective cases in human populations, and we are interested in finding the posterior mean of the new infective dengue cases each week.

For $i = 1, 2, \ldots, M$ study regions and $j = 1, 2, \ldots, T$ time periods, our discrete time–space stochastic SIR-SI model for dengue disease transmission in human populations follows by adapting Equations (1)–(5) and including a probability distribution to reflect the randomness inherent in the data as shown:

$$S^{(h)}_{i,j} = \mu^{(h)} N^{(h)}_i + \left(1 - \mu^{(h)}\right) S^{(h)}_{i,j-1} - \mathfrak{I}^{(h)}_{i,j}, \tag{6}$$

$$\mathfrak{I}^{(h)}_{i,j} \sim \mathrm{Poisson}\!\left(\lambda^{(h)}_{i,j}\right), \tag{7}$$

$$\lambda^{(h)}_{i,j} = \exp\!\left(\beta^{(h)}_0 + c^{(h)}_i\right) \frac{\beta^{(h)} b}{N^{(h)}_i + m}\, I^{(v)}_{i,j-1} S^{(h)}_{i,j-1}, \tag{8}$$

$$I^{(h)}_{i,j} = \left(1 - \mu^{(h)}\right) I^{(h)}_{i,j-1} + \mathfrak{I}^{(h)}_{i,j} - \mathfrak{R}^{(h)}_{i,j}, \tag{9}$$

$$R^{(h)}_{i,j} = \left(1 - \mu^{(h)}\right) R^{(h)}_{i,j-1} + \mathfrak{R}^{(h)}_{i,j}, \tag{10}$$

$$\mathfrak{R}^{(h)}_{i,j} = \gamma^{(h)} I^{(h)}_{i,j-1}. \tag{11}$$

Furthermore, in this study and due to the general unavailability of sufficient data for vectors, the discrete-time discrete-space SIR-SI models for dengue disease transmission in vector populations are assumed non-stochastic and are as follows:

$$S^{(v)}_{i,j} = \mu^{(v)} N^{(v)}_i + \left(1 - \mu^{(v)}\right) S^{(v)}_{i,j-1} - \mathfrak{I}^{(v)}_{i,j}, \tag{12}$$

$$\mathfrak{I}^{(v)}_{i,j} = \frac{\beta^{(v)} b}{N^{(h)}_i + m}\, I^{(h)}_{i,j-1} S^{(v)}_{i,j-1}, \tag{13}$$

$$I^{(v)}_{i,j} = \left(1 - \mu^{(v)}\right) I^{(v)}_{i,j-1} + \mathfrak{I}^{(v)}_{i,j}. \tag{14}$$

We use the Poisson distribution to model the number of new infectives, as this is the fundamental model for count data. Its mean $\lambda^{(h)}_{i,j}$ is chosen to match the deterministic form in Equation (2) with a positive multiplicative factor to represent spatial correlation as explained below.

The formulations above show that the counts of new infective humans are assumed to follow

independent Poisson distributions, where the expected numbers of new infectives include elements

of the transmission, which are the simple direct dependence of current infective counts on previous

counts in the same spatial unit and a linear predictor term that can include covariates or random

effects.

As these counts are conditional upon other variables, the Poisson assumption cannot be tested in

isolation, but rather by trying other candidate distributions and comparing overall goodness-of-fit

as described in Section 5.3. However, the Poisson assumption is the default for log linear models

such as this, and we leave the testing of other distributions to future investigations.
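To make the stochastic human update concrete, the sketch below simulates one week of Equations (6)–(11) in Python. It is not the WinBUGS model fitted in this paper: here $\beta^{(h)}_0$ and $c^{(h)}_i$ are treated as known inputs rather than estimated, and the function names `poisson_draw` and `stochastic_human_step` are hypothetical. The Poisson draw counts unit-rate arrivals in an interval of length $\lambda$, which is a standard construction but slow for very large means. The sketch does not guard against a draw exceeding the susceptible count.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw a Poisson(lam) count by counting unit-rate arrivals in [0, lam]."""
    if lam <= 0:
        return 0
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed <= lam:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def stochastic_human_step(s_h, i_h, r_h, i_v_prev, n_h, beta0, c_i, rng,
                          mu_h=0.0002736, gamma_h=0.7903,
                          beta_h=0.50, b=2.33, m=0.0):
    """One week of the human model; beta0 and c_i are assumed known here."""
    lam = math.exp(beta0 + c_i) * beta_h * b / (n_h + m) * i_v_prev * s_h  # Eq. (8)
    new_inf = poisson_draw(lam, rng)                    # Eq. (7)
    new_rec = gamma_h * i_h                             # Eq. (11)
    s_next = mu_h * n_h + (1 - mu_h) * s_h - new_inf    # Eq. (6)
    i_next = (1 - mu_h) * i_h + new_inf - new_rec       # Eq. (9)
    r_next = (1 - mu_h) * r_h + new_rec                 # Eq. (10)
    return s_next, i_next, r_next, new_inf, lam
```

Passing a seeded `random.Random` instance as `rng` makes a simulated trajectory reproducible.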

In Equation (8), $\beta^{(h)}_0$ is a constant term to describe the overall rates of the process for human populations, and $c^{(h)}_i$ is a random effect that is designed to absorb residual spatial variation for human populations. In this study, a conditional autoregressive (CAR) prior is used as a family of prior distributions for the random effect. This CAR model was proposed by Besag et al. [3], where the probability densities of values at any given location are conditional on the neighbouring areas. The advantage of this intrinsic CAR model is that the conditional moments are defined as simple functions of the neighbouring values and the number of neighbours $m_i$ by means of a conditional distribution defined by

$$c^{(h)}_i \mid c^{(h)}_j \ (j \neq i) \sim \mathrm{Normal}\!\left(\bar{c}^{(h)}_i, \frac{r}{m_i}\right).$$

In other words, under the CAR prior, the random effect $c^{(h)}_i$ at site $i$, conditional upon the random effects at all other sites, is normally distributed with mean equal to the average of the neighbouring $c^{(h)}_j$ and variance equal to $r/m_i$, where $r$ is an unknown variance parameter. This intrinsic Gaussian CAR model allows for over-dispersion and spatial correlation among neighbouring areas. However, Lawson [15] points out that this intrinsic CAR model is not the only available specification of a Gaussian Markov random field model. In fact, a proper CAR model formulation can also be used. The application, comparison and discussion of a proper CAR prior to the analysis of our stochastic SIR-SI dengue disease transmission model will be included in future investigations to improve this methodology.
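The conditional moments of the intrinsic CAR prior can be computed directly from a neighbourhood structure. The following Python sketch is illustrative only: the function name `car_conditional`, the dictionary-based adjacency structure and the example values are assumptions made here, not part of the original analysis.

```python
def car_conditional(i, effects, neighbours, r):
    """Mean and variance of c_i given all other effects under the intrinsic CAR prior.

    effects    -- mapping region -> current random effect c_j
    neighbours -- mapping region -> list of adjacent regions
    r          -- variance parameter of the prior
    """
    nbrs = neighbours[i]
    m_i = len(nbrs)
    if m_i == 0:
        raise ValueError("region has no neighbours; the conditional is undefined")
    mean = sum(effects[j] for j in nbrs) / m_i   # average of neighbouring effects
    variance = r / m_i                           # shrinks as neighbours increase
    return mean, variance
```

For example, a region with two neighbours whose effects are -0.1 and 0.5 has conditional mean 0.2 and, for r = 0.4, conditional variance 0.2.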

The discrete time–space stochastic SIR-SI model for dengue disease transmission that we

propose here will be used in the estimation of relative risk for dengue disease mapping. However,

the methods extend readily to apply more generally to other vector-borne infectious diseases. A

discussion about this is presented and explained in the next section.

4. Relative risk estimation for disease mapping

Many studies on disease mapping use regression-type models to estimate the risk. Here, we

introduce an alternative method of relative risk estimation of disease mapping based on the disease

transmission model adapted specially for dengue disease. Our computational analysis is performed

using WinBUGS software, which is a package designed to carry out Markov chain Monte Carlo

computations for a wide variety of Bayesian models [29]. A discussion and application of Bayesian

analysis of disease mapping using this software can be found in Lawson and Clark  [17].

In general, for $i = 1, 2, \ldots, M$ study regions and $j = 1, 2, \ldots, T$ time periods, a pseudo-random sample of observations $\lambda^{(h)}_{ijk}$ for $k = 1, 2, \ldots, n$ is generated from the posterior distribution for the mean number of infectives $\lambda^{(h)}_{ij}$. From this sample, the posterior expected mean number of infectives can be approximated using the unbiased sample mean

$$\tilde{\lambda}^{(h)}_{ij} = \frac{1}{n} \sum_{k=1}^{n} \lambda^{(h)}_{ijk}. \tag{15}$$

Next, the relative risk parameter $\theta^{(h)}_{ij}$ is defined by

$$\theta^{(h)}_{ij} = \frac{\lambda^{(h)}_{ij}}{e^{(h)}_{ij}}. \tag{16}$$

Therefore, the posterior expected relative risk can also be approximated using an unbiased sample mean

$$\tilde{\theta}^{(h)}_{ij} = \frac{1}{n} \sum_{k=1}^{n} \theta^{(h)}_{ijk} = \frac{1}{n} \sum_{k=1}^{n} \frac{\lambda^{(h)}_{ijk}}{e^{(h)}_{ij}} = \frac{\tilde{\lambda}^{(h)}_{ij}}{e^{(h)}_{ij}}. \tag{17}$$

In other words, the posterior expected relative risk is equal to the posterior expected mean number of infectives, $\tilde{\lambda}^{(h)}_{ij}$, divided by the corresponding naïve mean number of infectives based on the human population across all study regions, $e^{(h)}_{ij}$.

We then use this formulation in the estimation of relative risk for disease mapping, based on

the discrete time–space stochastic SIR-SI model for disease transmission using data in the form

of counts of cases for all tracts under consideration.
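Equations (15)–(17) amount to averaging posterior draws and dividing by the expected count. A minimal Python sketch is given below, assuming the posterior draws of $\lambda^{(h)}_{ij}$ have already been exported from the sampler as a list and that $e^{(h)}_{ij}$ is supplied as a known positive number; the function name `posterior_relative_risk` is hypothetical.

```python
def posterior_relative_risk(lambda_samples, expected):
    """Approximate posterior means of lambda_ij and theta_ij from MCMC draws."""
    n = len(lambda_samples)
    if n == 0 or expected <= 0:
        raise ValueError("need at least one draw and a positive expected count")
    lam_tilde = sum(lambda_samples) / n                          # Eq. (15)
    theta_samples = [lam / expected for lam in lambda_samples]   # Eq. (16), per draw
    theta_tilde = sum(theta_samples) / n                         # Eq. (17)
    return lam_tilde, theta_tilde
```

Because the expected count is fixed within a region and time period, averaging the ratios gives the same value as dividing the averaged draws by the expected count, as Equation (17) states.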

5. Application of relative risk estimation for dengue disease in Malaysia

This section demonstrates and displays the results of relative risk estimation based on an

application of the preceding discrete time–space stochastic SIR-SI models for dengue disease

transmission with five alternative assumptions about the mosquito population. The results are

compared and presented in tables and a map, and a powerful model for relative risk estimation

and dengue disease mapping is revealed.

5.1   Data set

Data used in this study were provided by the Ministry of Health, the Institute for Medical Research

and the Department of Statistics, all in Malaysia. All methods presented here are applied to dengue

data in the form of counts of cases within the states of Malaysia for epidemiology weeks 1–53

during a 1-year period spanning 2008–2009. Figure 2 displays the available data, which refer

to observed new infective dengue cases of humans in time periods or intervals   ( j − 1, j]   for

 j  =  1, 2, . . . , 53.

The values for $\beta^{(h)}$ and $\beta^{(v)}$ are chosen to be 0.50 and 0.75, respectively, and the number of alternative hosts available as the blood source $m$ is assumed to be zero. Furthermore, the weekly rate values for $\mu^{(h)}$, $\mu^{(v)}$ and $\gamma^{(h)}$ are 0.0002736, 0.4028 and 0.7903, respectively, and $b$ is 2.33. All of these rates are converted from daily rates that we derived from the literature [22,25]. Moreover, since there are no routine data available for dengue mosquitoes, we impute suitable values based on studies conducted by other researchers. This process is explained in the next section.

Figure 2. Time series plot for numbers of new infective dengue cases from epidemiology weeks 1–53 during 1-year period spanning 2008–2009 for all 16 states in Malaysia. (Horizontal axis: epidemiology week; vertical axis: numbers of new infective dengue cases; series: Perlis, Kedah, P.Pinang, Perak, Kelantan, Terengganu, Pahang, Selangor, K.Lumpur, Putrajaya, N.Sembilan, Melaka, Johor, Sarawak, Labuan, Sabah.)

5.2   Estimation of vector mosquito populations

Implementation of the SIR-SI models requires dengue mosquito vector data. Since there are

no available routine data for vector mosquito populations, specifically data for newly infective

mosquitoes that are difficult to collect, we propose three simple methods to impute values in

order to generate better results for relative risk estimation than would otherwise be possible. First,

the estimation is based on seasonal averages reported in relevant journal publications, which

specifically study dengue in Malaysia, written by Rohani et al. [26] and Lee and Inder Singh [18].

Second, the estimation is based on the SIR-SI model for dengue disease transmission where the

starting values are set and the estimation propagates from the SIR-SI equations. Here, some of the

estimation is based on information taken from an article by Nishiura [22]. Third, the estimation

is based on an assumption that the infective mosquito data follow the pattern of weekly data for

new infective humans.

5.2.1   Estimation of vector mosquito populations based on seasonal averages

Rohani et al. [26] identified about 40 infective adult mosquitoes in a sample of 5508. In order to

progress, it is feasible to interpret this information as

$$\frac{S^{(v)}_{i,0}}{I^{(v)}_{i,0}} \approx \frac{5508 - 40}{40} = \frac{1367}{10}, \tag{18}$$

$$\Rightarrow I^{(v)}_{i,0} \approx 0.00732\, S^{(v)}_{i,0}. \tag{19}$$

The calculation above clearly assumes that the ratio of susceptibles to infectives for mosquitoes

is approximately constant, which is a reasonable first-order assumption.

Now consider a study by Lee and Inder Singh  [18], who conducted monthly surveillance of 

adult mosquitoes in Kuala Lumpur, Malaysia, continuously from January to December 1990 to

monitor their population. Results of the study give the distribution and numbers of adult  Aedes

collected in sentinel traps and the total number of adult mosquitoes for each month. Sentinel

traps are typically huts or rooms or houses, which are used to collect mosquitoes. Normally, two

humans stay inside the hut as a bait to attract mosquitoes. Therefore, the numbers of mosquitoes

collected here refer to the numbers corresponding to the population of susceptible humans at risk.

In this investigation, Lee and Inder Singh [18] observed a total of 8518 mosquitoes among 556

susceptible humans in the year 1990. Since we plan to use the number of susceptible vectors as

the starting point at time j  = 0 for each region in our analysis, we have

$$\frac{S^{(v)}_{i,0} + I^{(v)}_{i,0}}{S^{(h)}_{i,0}} \approx \frac{8518}{556} = \frac{4259}{278}. \tag{20}$$

Rearranging Approximation (20) gives

$$I^{(v)}_{i,0} \approx \frac{4259}{278}\, S^{(h)}_{i,0} - S^{(v)}_{i,0}. \tag{21}$$

Substituting Approximation (21) into Approximation (19) now gives

$$S^{(v)}_{i,0} \approx \frac{1367}{10} \left(\frac{4259}{278}\, S^{(h)}_{i,0} - S^{(v)}_{i,0}\right) \Rightarrow S^{(v)}_{i,0} \approx 15.21\, S^{(h)}_{i,0}. \tag{22}$$

However, the data for adult mosquitoes in Lee’s paper represent monthly periods, and the lifespan

of  Aedes mosquitoes in nature typically ranges from 2 weeks to a month depending on environ-

mental conditions [21]. Consequently, we need to redefine Equation (22) by transforming to a

single generation of  Aedes mosquito. Under this redefinition, Approximation (20) changes so that

the appropriate revised form of Equation (22) becomes

$$S^{(v)}_{i,0} \approx \frac{15.21\, S^{(h)}_{i,0}}{2} = 7.605\, S^{(h)}_{i,0}. \tag{23}$$

Hence, there are seven or eight susceptible mosquitoes for every susceptible human, on average.
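The arithmetic behind Relations (18)–(23) can be checked with a few lines of Python. The variable names are illustrative; the input counts are those quoted from Rohani et al. [26] and Lee and Inder Singh [18], and the final step halves the rounded coefficient 15.21 exactly as in Equation (23).

```python
# Counts quoted in the text
infective_sampled, total_sampled = 40, 5508   # Rohani et al. [26]
mosquitoes, humans = 8518, 556                # Lee and Inder Singh [18]

# Eq. (18): ratio of susceptible to infective mosquitoes
s_to_i = (total_sampled - infective_sampled) / infective_sampled
# Eq. (19): infective mosquitoes per susceptible mosquito
i_per_s = 1 / s_to_i
# Eq. (20): total mosquitoes per susceptible human
vectors_per_human = mosquitoes / humans
# Eq. (22): solve S_v = s_to_i * (vectors_per_human * S_h - S_v) for S_v / S_h
s_v_per_s_h = s_to_i * vectors_per_human / (1 + s_to_i)
# Eq. (23): halve the rounded coefficient for a single mosquito generation
s_v_per_s_h_generation = round(s_v_per_s_h, 2) / 2

print(round(i_per_s, 5), round(s_v_per_s_h, 2), s_v_per_s_h_generation)
```

The three printed values correspond to the coefficients in Relations (19), (22) and (23).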

Relations (19)–(23) give some idea of what the average values are for the infective mosquito population and susceptible mosquito population, which we assume as initial values for our investigation. In this analysis, the value for the infective mosquito count $I^{(v)}_{i,0}$ is used as the average value over the first time period, which we then propagate using one of three alternative assumptions. These values are then imputed in Equation (8), giving three similar sets of results arising from our relative risk estimation.

First, we assume that the data for infective mosquitoes are constant over time for all the states

in Malaysia (Assumption 1). Figure 3 shows a graph of the estimated infective mosquito data for

each state in Malaysia from epidemiology weeks 1–53 during the course of 2008–2009. That is,

from Equations (19) and (23) we estimate the number of infective mosquitoes for the start of the

time period for each state, which we then assume constant for all time periods.

Second, we assume that the data for infective mosquitoes follow a cyclical seasonal pattern

(Assumption 2). This is because many researchers have reported in their studies that the seasonal

patterns of outbreak of dengue coincide with the rainy season [8,23,28,30].

Figure 3. Imputed infective mosquitoes without seasonality.


Figure 4. Imputed infective mosquitoes with piecewise constant seasonality.

According to Okogun et al. [23], rainfall is an important factor which regulates the abundance

of outdoor breeding mosquito populations and consequently directly associates with the higher

prevalence levels of mosquito diseases. This view is supported by Foo et al. [8], who found that the

monthly incidence of dengue is associated with the monthly rainfall, which provides the breeding

sites for mosquito populations. In Malaysia, the northeast monsoon is the major rainy season in the country, which brings heavy rainfall from mid-November to early March [20]. Therefore, it

is expected that the number of infective mosquitoes will increase during this monsoon season. In

this study, we assume that the number of infective mosquitoes is piecewise constant over time,

where the value is in a range between 10% above and 10% below the estimated average number of 

infective mosquitoes in each state (Figure 4). Here, the number of infective mosquitoes is assumed

to be large during epidemiology weeks 1–11 and 46–53, corresponding to the rainy season in

Malaysia, and small during the other epidemiology weeks.

Third, we again assume that the data for infective mosquitoes follow a cyclical seasonal pattern,

but that this seasonality is now represented by a sinusoidal function ranging from 20% below the

estimated average value to 20% above the estimated average value in each state (Assumption 3).

The idea of using a sinusoidal function is to model the seasonal variation continuously throughout

the year, as a better representation of the true cyclical behaviour than in Assumption 2. Figure  5

shows the imputed infective mosquito data based on this assumption.

In any particular state $i$, we fit the sinusoidal function for infective mosquitoes by considering the continuous-time equivalent to $I^{(v)}_{i,j}$, which is

$$I^{(v)}_i(t) = a_i + b_i \sin(c + dt),$$

where $a_i$ is the mean response, $b_i$ is the amplitude, $c$ is the phase, $2\pi/d$ is the period and $t$ represents time. In this research, $a_i$ represents the estimated average value used in Assumptions 1 and 2, $b_i$ reflects the amplitude of $\pm 20\%$ about the average and $t$ interpolates epidemiology weeks $j = 1, 2, \ldots, 53$. The parameters $c = 39/53$ and $d = 2\pi/53$ are assumed constant across all states.


Figure 5. Imputed infective mosquitoes with sinusoidal seasonality.

Therefore, $dt$ measures annual cycles, taking the values $[0, 2\pi)$ for year 1, $[2\pi, 4\pi)$ for year 2, $[4\pi, 6\pi)$ for year 3 and so on. Here, we choose $c$ in the interval $[0, 2\pi)$, but any value equal to this plus a multiple of $2\pi$ will give the same imputed values for $I^{(v)}_i(t)$. As for Assumption 2, the rainy season falls during epidemiology weeks 1–11 and 46–53. Therefore, it is assumed that the number of infective mosquitoes is high during this period compared with the other epidemiology weeks.

These three alternative assumptions for mosquito data are then imputed in the discrete time–space stochastic SIR-SI model for dengue disease transmission for all states in Malaysia, to obtain comparable posterior expected relative risks.
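The three imputation schemes can be sketched in Python as follows. This is an illustration under stated assumptions: the function names are hypothetical, the piecewise scheme is taken to sit exactly 10% above the state average in rainy weeks and exactly 10% below otherwise, and the sinusoid uses $b_i = 0.2\,a_i$ with the quoted values of $c$ and $d$.

```python
import math

# Rainy-season epidemiology weeks quoted in the text: 1-11 and 46-53
RAINY_WEEKS = set(range(1, 12)) | set(range(46, 54))


def constant_series(a_i, weeks=53):
    """Assumption 1: infective mosquitoes constant at the state average a_i."""
    return [a_i] * weeks


def piecewise_series(a_i, weeks=53, swing=0.10):
    """Assumption 2: a_i*(1+swing) in rainy weeks, a_i*(1-swing) otherwise (assumed levels)."""
    return [a_i * (1 + swing) if j in RAINY_WEEKS else a_i * (1 - swing)
            for j in range(1, weeks + 1)]


def sinusoidal_series(a_i, weeks=53, swing=0.20, c=39 / 53, d=2 * math.pi / 53):
    """Assumption 3: a_i + b_i*sin(c + d*t) with amplitude b_i = swing*a_i."""
    b_i = swing * a_i
    return [a_i + b_i * math.sin(c + d * j) for j in range(1, weeks + 1)]
```

Each function returns one value per epidemiology week, so the output can be substituted for $I^{(v)}_{i,j-1}$ in Equation (8) for a given state.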

5.2.2   Estimation of vector mosquito populations based on propagation

Several articles used the same information as Nishiura [22] in their studies of dengue disease

transmission [7,25]. Here, we use information from Nishiura [22] in order to estimate the total

mosquito population $N_i^{(v)}$ in state $i = 1, 2, \ldots, M$. In his analysis, Nishiura assumed the total human population $N_i^{(h)}$ to be 10,000 and the recruitment rate of mosquitoes to be 5000 per day. Converting this daily rate to a weekly rate gives a recruitment rate of 35,000 mosquitoes per week. We know that the recruitment rate of the mosquito population is $\mu^{(v)} N_i^{(v)}$, and in this study the birth and death rates for the mosquito population are both $\mu^{(v)} \approx 0.4028$ per week. Therefore,

$$N_i^{(v)} \approx \frac{35{,}000}{0.4028} \approx 86{,}892,$$

and this leads to

$$N_i^{(v)} \approx 8.6892\, N_i^{(h)}. \qquad (24)$$

Based on Approximation (24), we can now estimate the total mosquito population for each state. These data are then imputed in Equation (12), which is then substituted in Equations (13) and (14). This subsequently gives estimated values for the numbers of infective mosquitoes $I_{i,j}^{(v)}$, which we propagate from Equation (14). We refer to this approach as Assumption 4.
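The arithmetic behind Approximation (24) can be reproduced in a few lines of Python. This is a minimal sketch: the constants are the values quoted above, whereas the function name and the example state population of 250,000 are hypothetical and serve only to show how the ratio would be applied.

```python
RECRUITMENT_PER_DAY = 5000   # mosquito recruitment rate assumed by Nishiura [22]
REFERENCE_HUMANS = 10_000    # human population assumed by Nishiura [22]
MU_V = 0.4028                # weekly mosquito birth and death rate used in this study

# Convert the daily recruitment rate to a weekly rate: 35,000 per week.
recruitment_per_week = RECRUITMENT_PER_DAY * 7

# Recruitment rate = mu_v * N_v, so N_v = recruitment rate / mu_v.
reference_mosquitoes = recruitment_per_week / MU_V

# Mosquitoes per human, as in Approximation (24).
ratio = reference_mosquitoes / REFERENCE_HUMANS


def total_mosquitoes(human_population):
    """Estimate a state's total mosquito population from its human population."""
    return ratio * human_population


example_population = 250_000  # hypothetical state population, for illustration only
print(round(reference_mosquitoes), round(ratio, 4))
print(round(total_mosquitoes(example_population)))
```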


Figure 6. Imputed infective mosquitoes based on propagation. [Line plot of imputed infective mosquitoes (vertical axis, 0 to 35,000) against epidemiology week (horizontal axis, 1 to 53), with one series for each state: Perlis, Kedah, P. Pinang, Perak, Kelantan, Terengganu, Pahang, Selangor, K. Lumpur, Putrajaya, N. Sembilan, Melaka, Johor, Sarawak, Labuan and Sabah.]

Figure 6 shows the corresponding values of $I_{i,j}^{(v)}$ for each state in Malaysia for epidemiology weeks 1–53, corresponding to the 12 months from 1 January 2008 to 3 January 2009. These values

are finally imputed in Equation (8) to give the posterior expected means of new infective humans,

which subsequently give the posterior expected relative risks of dengue disease.

5.2.3   Estimation of vector mosquito populations from human populations

Here, we assume that the infective mosquito population counts follow the cyclical pattern of infective human population counts, with a constant ratio between infective mosquitoes and infective humans (Assumption 5).

This assumption is based on our belief that there is a positive correlation between the numbers

of infective mosquitoes and the numbers of new infective humans. We assume that when there

is an increase in the number of new infective humans, there will also be an increase in the

number of infective mosquitoes. Figure 7 shows the pattern of $I_{i,j}^{(v)}$ for each state in Malaysia for epidemiology weeks 1–53 during 2008–2009, based on Assumption 5.

These data are then imputed in Equation (8) to give the posterior expected means of new infective

humans, which subsequently give the posterior expected relative risks of dengue disease.
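Assumption 5 amounts to scaling the weekly infective human counts by a constant. The sketch below illustrates this in Python; the weekly counts and the ratio of 2.0 are hypothetical values chosen for illustration, not the data or the ratio used in our analysis.

```python
def impute_infective_mosquitoes(infective_humans, ratio):
    """Scale weekly infective-human counts by a constant ratio (Assumption 5)."""
    return [ratio * count for count in infective_humans]


weekly_infective_humans = [12, 15, 21, 18, 9]  # hypothetical counts for five weeks
assumed_ratio = 2.0                            # hypothetical constant ratio

imputed = impute_infective_mosquitoes(weekly_infective_humans, assumed_ratio)

# The imputed series peaks in the same week as the human series.
peak_week_humans = weekly_infective_humans.index(max(weekly_infective_humans))
peak_week_mosquitoes = imputed.index(max(imputed))
print(imputed, peak_week_humans, peak_week_mosquitoes)
```

Because the ratio is constant, the imputed mosquito series rises and falls in step with the human series, which is the positive correlation that this assumption encodes.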

5.3   Analysis and results: comparison of posterior expected relative risks

The aim of this research is to improve the accuracy and reliability of the existing methods for

mapping vector-borne infectious diseases. In this paper, the estimation of relative risk is based

on our stochastic SIR-SI model for disease transmission. Many published studies of general

infectious diseases, including  [1,5,6,24], use stochastic terms in their models as probabilistic

statements about the progression of the disease. These studies generally agree that stochastic

models are more realistic than deterministic models, the latter being a special case of the former.

To demonstrate the possible benefits of our approach, we focus on the spread of dengue disease

in Malaysia. We adopt Bayesian methods of analysis for improved robustness in estimation and

decision-making. However, this paper is primarily concerned with the models and methods, so we choose reference (uniform) priors for illustration, except for the CAR prior for spatial variability. Future work will investigate the impact of more informative priors by means of sensitivity analyses.

Figure 7. Imputed infective mosquitoes based on human populations.

We now present the results of relative risk estimation based on our discrete time–space stochastic SIR-SI model for dengue disease transmission using the five alternative methods for imputing vector mosquito populations described in the previous section. The model in this analysis is fitted by posterior sampling, run to convergence, using WinBUGS software. Figures 8–12 show time series plots for posterior expected relative risks across all states, based on our discrete time–space stochastic SIR-SI models for dengue disease transmission in epidemiology weeks 1–53 during 2008–2009 using Assumptions 1–5, respectively.

Figures 8–12 suggest that all states have similar patterns of posterior expected relative risk for all epidemiology weeks, though different methods give different values of posterior expected relative risk. Based on the posterior expected relative risks for epidemiology week 53 in Table 1, all methods lead to the same conclusion that the state with the highest risk is Putrajaya and the state with the lowest risk is Sabah, except for Assumption 5, which concludes that the state of Labuan has the lowest risk. The risks for the other 14 states seem to be quite similar for all five assumptions. This consistency is most encouraging and suggests that the disease

maps are not overly sensitive to the accuracy of the assumption made for imputing mosquito

counts. Consequently, there appears to be little to gain from expensive efforts to collect actual

data on mosquito populations, so long as reference values are available, such as those used in

our analysis. Mathematical considerations lead towards Assumption 3 as best representing the

physical process, but we now evaluate model goodness-of-fit measures to help us determine

which mosquito population assumption is most appropriate.
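The week 53 comparison can be checked directly from the values in Table 1. The Python sketch below transcribes the table and finds the states with the highest and lowest posterior expected relative risk under each assumption; the variable and function names are ours and are chosen for illustration.

```python
ASSUMPTIONS = ["Assumption 1", "Assumption 2", "Assumption 3",
               "Assumption 4", "Assumption 5"]

# Posterior expected relative risks for epidemiology week 53 (Table 1).
RELATIVE_RISK = {
    "Perlis": (0.3471, 0.3955, 0.4171, 0.6454, 0.2960),
    "Kedah": (0.3881, 0.4422, 0.4663, 0.8397, 0.8963),
    "Pulau Pinang": (0.6753, 0.7695, 0.8115, 0.9692, 1.0710),
    "Perak": (0.7790, 0.8876, 0.9361, 0.9843, 0.8168),
    "Kelantan": (0.6972, 0.7944, 0.8377, 0.5154, 0.5351),
    "Terengganu": (0.7176, 0.8177, 0.8623, 0.6239, 0.5518),
    "Pahang": (0.3891, 0.4433, 0.4675, 0.5375, 0.7151),
    "Selangor": (1.9350, 2.2040, 2.3250, 3.0420, 3.1830),
    "Kuala Lumpur": (1.4450, 1.6470, 1.7370, 1.6210, 1.5570),
    "Putrajaya": (2.2420, 2.5550, 2.6950, 5.7400, 5.2370),
    "Negeri Sembilan": (0.6184, 0.7046, 0.7430, 0.6123, 0.8455),
    "Melaka": (0.4911, 0.5595, 0.5900, 0.3274, 0.2817),
    "Johor": (0.5444, 0.6204, 0.6542, 0.6684, 0.5983),
    "Sarawak": (0.2766, 0.3151, 0.3323, 0.4213, 0.2829),
    "Labuan": (0.5723, 0.6522, 0.6878, 0.1551, 0.000000112),
    "Sabah": (0.1532, 0.1745, 0.1841, 0.1251, 0.0926),
}


def extremes(column):
    """Return (highest-risk state, lowest-risk state) for one assumption."""
    risks = {state: values[column] for state, values in RELATIVE_RISK.items()}
    return max(risks, key=risks.get), min(risks, key=risks.get)


summary = {name: extremes(k) for k, name in enumerate(ASSUMPTIONS)}
for name, (highest, lowest) in summary.items():
    print(f"{name}: highest {highest}, lowest {lowest}")
```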

The use of goodness-of-fit measures is common in statistics for comparing fitted models. Lawson [15] discusses several methods that can be used to assess goodness-of-fit, including chi-square statistics, Akaike information criterion, Bayesian information criterion, deviance information criterion (DIC) and posterior predictive loss. In this study, we use the DIC because it is readily available in WinBUGS software and because Lawson [15] identifies weaknesses with the other measures, particularly for models that involve several random effects. The DIC is defined by Spiegelhalter et al. [29] as

$$\mathrm{DIC} = 2 E_{\theta|x}\{D\} - D\{E_{\theta|x}(\theta)\},$$

where $D(\cdot)$ is the deviance of the model and $x$ represents the observed data. It uses the average of the posterior samples of $\theta$ to produce an expected value of $\theta$. This value can also be computed from a sample output from a chain. According to Spiegelhalter et al. [29], the model with the smallest DIC is the model that would best predict a replicate data set of the same structure as that currently observed. While Lawson and Clark [17] point out that the other overall goodness-of-fit measures are useful for helping model selection, they give little help in assessing how well the model fits the data.

Figure 8. Posterior expected relative risks under Assumption 1.

Figure 9. Posterior expected relative risks under Assumption 2.

Figure 10. Posterior expected relative risks under Assumption 3.

Figure 11. Posterior expected relative risks under Assumption 4.
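To make the DIC definition concrete, the following Python sketch computes it from posterior samples for a deliberately simple case: independent Poisson counts with a single rate $\theta$. This toy likelihood, the observed counts and the posterior draws are all hypothetical and stand in for the SIR-SI model and the WinBUGS chain output; only the formula for the DIC is taken from the text.

```python
import math
from statistics import fmean


def poisson_deviance(rate, counts):
    """Deviance D(theta) = -2 * log-likelihood for independent Poisson counts."""
    log_lik = sum(x * math.log(rate) - rate - math.lgamma(x + 1) for x in counts)
    return -2.0 * log_lik


def dic(posterior_samples, counts):
    """Return (DIC, effective number of parameters) from posterior draws."""
    # Posterior mean of the deviance, E{D}.
    mean_deviance = fmean([poisson_deviance(r, counts) for r in posterior_samples])
    # Deviance evaluated at the posterior mean of theta, D{E(theta)}.
    deviance_at_mean = poisson_deviance(fmean(posterior_samples), counts)
    p_d = mean_deviance - deviance_at_mean
    return 2.0 * mean_deviance - deviance_at_mean, p_d


counts = [3, 5, 4, 6, 2]                       # hypothetical observed counts
samples = [3.2, 4.1, 3.8, 4.6, 4.0, 3.5, 4.3]  # hypothetical posterior draws of the rate

dic_value, p_d = dic(samples, counts)
print(dic_value, p_d)
```

The difference between the mean deviance and the deviance at the posterior mean is the usual effective number of parameters, so the DIC can equivalently be read as the mean deviance plus this penalty.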

Table 2 shows the DIC values for the new infective humans for epidemiology weeks 1–53 for

all states in Malaysia based on our five different assumptions for the mosquito populations. From

the DIC values in Table 2, we can say that the model with Assumption 5 fits best because it gives the smallest DIC. We conclude that the discrete time–space

stochastic SIR-SI model that assumes that infective mosquito counts are proportional to infective

human counts is the best model to be used in the analysis specifically for estimating relative risk.
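The selection rule is simply to choose the assumption with the smallest DIC. As a brief illustration, the Python sketch below ranks the five assumptions using the values reported in Table 2; the variable names are ours.

```python
# DIC values for new infective humans, as reported in Table 2.
DIC_BY_ASSUMPTION = {
    "Assumption 1": 8993.57,
    "Assumption 2": 9515.41,
    "Assumption 3": 10087.4,
    "Assumption 4": 10137.5,
    "Assumption 5": 7982.23,
}

# Rank the assumptions from smallest (best) to largest DIC.
ranked = sorted(DIC_BY_ASSUMPTION, key=DIC_BY_ASSUMPTION.get)
best = ranked[0]

# Difference in DIC between each assumption and the best one.
differences = {
    name: round(value - DIC_BY_ASSUMPTION[best], 2)
    for name, value in DIC_BY_ASSUMPTION.items()
}

print(best)
print(ranked)
print(differences)
```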


Figure 12. Posterior expected relative risks under Assumption 5.

Table 1. Posterior expected relative risks for epidemiology week 53.

Assumption 1: $I^{(v)}$ constant
Assumption 2: $I^{(v)}$ piecewise constant seasonality
Assumption 3: $I^{(v)}$ sinusoidal seasonality
Assumption 4: $I^{(v)}$ propagated from SIR-SI equations
Assumption 5: $I^{(v)}$ estimated from human infectives $I^{(h)}$

State                 Assumption 1  Assumption 2  Assumption 3  Assumption 4  Assumption 5
1. Perlis             0.3471        0.3955        0.4171        0.6454        0.2960
2. Kedah              0.3881        0.4422        0.4663        0.8397        0.8963
3. Pulau Pinang       0.6753        0.7695        0.8115        0.9692        1.0710
4. Perak              0.7790        0.8876        0.9361        0.9843        0.8168
5. Kelantan           0.6972        0.7944        0.8377        0.5154        0.5351
6. Terengganu         0.7176        0.8177        0.8623        0.6239        0.5518
7. Pahang             0.3891        0.4433        0.4675        0.5375        0.7151
8. Selangor           1.9350        2.2040        2.3250        3.0420        3.1830
9. Kuala Lumpur       1.4450        1.6470        1.7370        1.6210        1.5570
10. Putrajaya         2.2420        2.5550        2.6950        5.7400        5.2370
11. Negeri Sembilan   0.6184        0.7046        0.7430        0.6123        0.8455
12. Melaka            0.4911        0.5595        0.5900        0.3274        0.2817
13. Johor             0.5444        0.6204        0.6542        0.6684        0.5983
14. Sarawak           0.2766        0.3151        0.3323        0.4213        0.2829
15. Labuan            0.5723        0.6522        0.6878        0.1551        0.000000112
16. Sabah             0.1532        0.1745        0.1841        0.1251        0.0926

Table 2. DIC evaluated for Assumptions 1–5.

                                 Assumption 1  Assumption 2  Assumption 3  Assumption 4  Assumption 5
New infective humans, $I^{(h)}$  8993.57       9515.41       10087.4       10137.5       7982.23


can overcome the problems of SMR, especially when there are no observed count data in certain regions, and the problems of the Poisson-gamma model, where covariate adjustments are impossible and it is not possible to allow for spatial correlation between risks in adjacent areas.

Possible extensions to this work include the development of a model for dengue disease mapping with continuous time and discrete space, in order to improve the accuracy of disease mapping further and for particular applicability to vector-borne infectious diseases that are rare or in their early stages. We anticipate that the results of this analysis will further strengthen our conclusions about tract-count data using the above analysis. The techniques presented in this paper offer an alternative method for estimating the relative risk in the study of disease mapping, particularly for diseases with indirect transmission.

Acknowledgements

The authors acknowledge Universiti Pendidikan Sultan Idris and the Ministry of Higher Education in Malaysia for their

financial support in respect of this study.

References

[1] C.L. Addy, I.M. Longini, Jr., and M. Haber, A generalized stochastic model for the analysis of infectious disease

 final size data, Biometrics 47 (1991), pp. 961–974.

[2] L. Bernardinelli, D.G. Clayton, C. Pascutto, C. Montomoli, M. Ghislandi, and M. Songini, Bayesian analysis of space–time variation in disease risk, Stat. Med. 14 (1995), pp. 2433–2443.

[3] J. Besag, J. York, and A. Mollie, Bayesian image restoration with two applications in spatial statistics, Ann. Inst.

Stat. Math. 43 (1991), pp. 1–59.

[4] D. Boehning, E. Dietz, and P. Schlattmann,  Space–time mixture modelling of public health data, Stat. Med. 19

(2000), pp. 2333–2344.

[5] D. Clancy, A stochastic SIS infection model incorporating indirect transmission, J. Appl. Probab. 42 (2005), pp. 726–737.

[6] D. Clancy and P.D. O’Neill, Bayesian estimation of the basic reproduction number in stochastic epidemic models, Bayesian Anal. 3 (2008), pp. 737–758.

[7] L. Esteva and C. Vargas, Analysis of a dengue disease transmission model, Math. Biosci. 150 (1998), pp. 131–151.

[8] L.C. Foo, T.W. Tim, H.L. Lee, and R. Fang, Rainfall, abundance of Aedes aegypti and dengue infection in Selangor,

 Malaysia, Southeast Asian J. Trop. Med. Public Health 16 (1985), pp. 560–568.

[9] A. Gemperli, P. Vounatsou, N. Sogoba, and T. Smith, Malaria mapping using transmission models: An application

to survey data from Mali, Am. J. Epidemiol. 163 (2006), pp. 289–297.

[10] D.J. Gubler, Dengue and dengue haemorrhagic fever , Clin. Microbiol. Rev. 11 (1998), pp. 480–496.

[11] D.J. Gubler, Epidemic dengue/dengue haemorrhagic fever as a public health, social and economic problem in the

21st century, Trends Microbiol. 10 (2002), pp. 100–103.

[12] V. Isham, Stochastic models for epidemics: Current issues and development, in Celebrating Statistics, A.C. Davison, Y. Dodge and N. Wermuth, eds., Oxford University Press, Oxford, 2005, pp. 27–54.

[13] L. Knorr-Held and J. Besag, Modelling risk from a disease in time and space, Stat. Med. 17 (1998), pp. 2045–2060.

[14] A.B. Lawson, Statistical Methods in Spatial Epidemiology, 2nd ed., John Wiley & Sons, Chichester, UK, 2006.

[15] A.B. Lawson, Bayesian Disease Mapping, CRC Press, Boca Raton, FL, 2009.

[16] A.B. Lawson, W.J. Browne, and C.L. Vidal Rodeiro, Disease Mapping with WinBUGS and MLwiN, John Wiley & Sons, Chichester, UK, 2003.

[17] A.B. Lawson and A. Clark, Spatial mixture relative risk models applied to disease mapping, Stat. Med. 21 (2002),

pp. 359–370.

[18] H.L. Lee and K. Inder Singh, Sequential sampling for Aedes aegypti and Aedes albopictus (Skuse) adults: Its use in

estimation of vector density threshold in dengue transmission and control, J. Biosci. 2 (1991), pp. 9–14.

[19] Y.C. MacNab and C.B. Dean, Spatio-temporal modelling of rates for the construction of disease maps, Stat. Med. 21 (2002), pp. 347–358.

[20] Malaysian Meteorological Department, Monsoon season in Malaysia. Available at http://www.met.gov.my (2 April

2010).

[21] Maricopa County Environmental Services, Lifecycle and information on Aedes aegypti mosquitoes, Maricopa County. Available at http://www.maricopa.gov/EnvSvc/VectorControl/Mosquitos/MosqInfo.aspx (20 July 2009).

[22] H. Nishiura, Mathematical and statistical analysis of the spread of dengue, Dengue Bull. 30 (2006), pp. 51–67.


[23] R.A.G. Okogun, E.B.N. Bethran, N.O. Anthony, C.A. Jude, and C.E. Anegbe,   Epidemiological implication of 

 preferences of breeding sites of mosquito species in Midwestern Nigeria, Ann. Agric. Environ. Med. 10 (2003),

pp. 217–222.

[24] P.D. O’Neill, A tutorial introduction to Bayesian inference for stochastic epidemic models using Markov chain Monte Carlo methods, Math. Biosci. 180 (2002), pp. 103–114.

[25] P. Pongsumpun, K. Patanarapelert, M. Sripom, S. Varamit, and I.M. Tang, Infection risk to travellers going to dengue fever endemic regions, Southeast Asian J. Trop. Med. Public Health 35 (2004), pp. 155–159.

[26] A. Rohani, I. Asmaliza, S. Zainah, and H.L. Lee, Detection of dengue from field Aedes aegypti and Aedes albopictus adults and larvae, Southeast Asian J. Trop. Med. Public Health 28 (1997), pp. 138–142.

[27] M.G. Rosa-Freitas, P. Tsouris, A. Sibajev, E.T. Weimann, A.U. Marques, R.L Ferreire, and F.C.L. Gards-Moura,

 Exploratory temporal and spatial distribution analysis of dengue notifications in Boa Vista, Roraima, Brazilian

 Amazon, 1999–2001, Dengue Bull. 27 (2003), pp. 63–80.

[28] H. Rozilawati, J. Zairi, and C.R. Adanan, Seasonal abundance of Aedes albopictus in selected urban and suburban

areas in Penang, Malaysia, Trop. Biomed. 24 (2007), pp. 83–94.

[29] D. Spiegelhalter, A. Thomas, N. Best, and D. Lunn, WinBUGS User Manual Version 1.4, MRC Biostatistics Unit,

Cambridge, UK, 2003.

[30] S. Sulaiman, Z.A. Pawanchee, J. Jeffery, I. Ghauth, and V. Buspavani, Studies on the distribution and abundance of Aedes aegypti (L.) and Aedes albopictus (Skuse) (Diptera: Culicidae) in an endemic area of dengue/dengue haemorrhagic fever in Kuala Lumpur, Mosq.-Borne Dis. Bull. 8 (1991), pp. 35–39.

[31] A. Tran, X. Deparis, P. Dussart, J. Morran, P. Rabarison, F. Remy, L. Polidori, and J. Gardon, Dengue spatial and 

temporal patterns, French Guiana, 2001, Emerg. Infect. Dis. 10 (2004), pp. 615–621.

[32] L.A. Waller, B.P. Carlin, H. Xia, and A.E. Gelfand, Hierarchical spatio temporal mapping of disease rates, J. Am.

Stat. Assoc. 92 (1997), pp. 607–617.