an epidemiological perspective david buckeridge, md phd ... · mcgill university, montréal, canada...

27
Detection of outbreaks using laboratory data An epidemiological perspective David Buckeridge, MD PhD McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique du Québec (INSPQ) October 23, 2006

Upload: others

Post on 10-Aug-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Detection of outbreaks using laboratory dataAn epidemiological perspective

David Buckeridge, MD PhDMcGill University, Montréal, Canada

Direction de santé publique de MontréalInstitut national de santé publique du Québec (INSPQ)

October 23, 2006

Page 2: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Outline

The Context of Laboratory Surveillance

Evidence and Methods for Automated SurveillanceCase detectionOutbreak detection

Evaluation of Outbreak Detection

Page 3: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Outline

The Context of Laboratory Surveillance

Evidence and Methods for Automated SurveillanceCase detectionOutbreak detection

Evaluation of Outbreak Detection

Page 4: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

The Process of Automated Surveillance

Surveillance methodology and approaches to surveillance

The process of surveillance

All approaches to surveillance share some common principles. While some of theunderlying methods used in public health surveillance have evolved considerably inrecent years, the general approach to surveillance has remained relatively constant.At a fundamental level, surveillance aims to (1) identify individual cases, (2) detectpopulation patterns in identified cases, and then (3) convey information to deci-sion-makers about population health patterns (Fig. 3).

Identification of individual cases

The definition of a case for a surveillance system (Fig. 3, Step 1) has importantimplications for the design and performance of the system. In settings where asurveillance system is intended to follow cases of a well-understood disease, it maybe possible to make the case definition highly specific. For example, public healthagencies in many developed countries conduct routine surveillance for communi-cable diseases such as measles. Definitions of cases in these systems tend to relyupon highly specific diagnostic tests. As a result, communicable disease surveillance

1

3

5

7

9

11

13

15

17

19

21

23

25

27

29

31

33

35

37

39

41

43

PMVI� V016 : 16013

EventReports

Individual EventDefinitions

Population PatternDefinitions

EventDetectionAlgorithm

Pattern Report

Population UnderSurveillance

InterventionDecision

InterventionGuidelines Public

HealthAction

Data DescribingPopulation

PatternDetectionAlgorithm

1. Identifying individual cases

2. Detecting population patterns

3. Conveying information for action

Decision Algorithm

Knowledge

Fig. 3 The process of surveillance. Critical points in this process include the detection of events in

individuals (e.g., a diagnosis of measles), the identification of patterns in the population (e.g., a rapid rise

in incidence in a geographic location), and the incorporation of information about identified patterns

into decisions about interventions. (For colour version: see colour section on page xxx).

D. Buckeridge and G. Cadieux332

(Buckeridge & Cadieux, 2007)

Page 5: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Laboratory Surveillance Contexts

I Within hospitalsI Nosocomial surveillance.I Adverse event detection (e.g., post-op infection).

I Regional, National, InternationalI Notifiable disease surveillance .I Syndromic surveillance.I . . .

Page 6: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Data Elements Used in Laboratory Surveillance

I Incidence of test orders (Ma et al, 2005)

I Timely, sensitive signal.I Reflects clinical suspicion.

I Incidence of positive resultsI Results analyzed by organism or host characteristics.I Organism – type, subtype (e.g., Bender 2001 for Salmonella),

antimicrobial resistance.I Person – Demographics, Location, Comorbidities.

Page 7: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Purposes of Laboratory Surveillance

I Monitoring antimicrobial resistance.I Monitoring disease management

I Diabetes – Haemoglobin A1C.I Cardiovascular disease – Lipids.

I . . .

I Detecting disease cases.I Detecting disease outbreaks.

Page 8: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Outline

The Context of Laboratory Surveillance

Evidence and Methods for Automated SurveillanceCase detectionOutbreak detection

Evaluation of Outbreak Detection

Page 9: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Determinants of Disease Outbreak Detection

Data Collection Case Detection Outbreak Detection Public Health Action

EndemicDisease

EpidemicDisease- incidence- variation / shape- timing of onset

- case definition- case classification algorithm- processing frequency

- algorithm- threshold- analysis frequency

- interpretation of analysis- investigation- response

- incidence- variation / shape

- data source(s)- sampling frame, method, frequency

System Factors

OutbreakFactors

(Buckeridge, 2007)

Page 10: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Outline

The Context of Laboratory Surveillance

Evidence and Methods for Automated SurveillanceCase detectionOutbreak detection

Evaluation of Outbreak Detection

Page 11: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

There is Strong Evidence that Automated CaseReporting is Effective

I Many studies support automated case-reporting within areportable disease context. (e.g., Effler, 1999)

I Result reporting is faster, more representative and morecomplete.

I Reporting delay decreases from approximately 5 days toapproximately 1 day.

Page 12: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Simple Case Definitions are Used in Practice, butMore Information is Better

I Single code matching on diagnosis and / or result is theusual approach.

I Reports tests, not truly cases.I Inconsistent coding is a practical problem. (Overhage, 2001)

I In a hospital setting, automated lab surveillance hasreasonable sensitivity compared to review of patients –Patient review uses additional sources.

I When lab data are combined with other indicators,†

sensitivity increases, but impact on specificity (falsepositives) is not well-defined.

†Prescriptions, risk factors such as age, time in hospital, alternate diagnoses

Page 13: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Outline

The Context of Laboratory Surveillance

Evidence and Methods for Automated SurveillanceCase detectionOutbreak detection

Evaluation of Outbreak Detection

Page 14: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

The Appropriate Analytic Approach Varies

I The statistical algorithms should be selected to match thedata and the likely outbreak signal. (Buckeridge, 2005)

I Resolution of dataI Temporal frequency.I Spatial precision.I Other covariates.

I Type of outbreak signalI Spatially clustering.I Slowly or rapidly increasing.I Focussed in high-risk populations.

Page 15: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Temporal Control Charts are Used Commonly

yt = max(

0, yt−1 +

(xt − µt

σt− k

))I Cumulative sum (cusum) method works well for data

grouped by weeks or months.I Statistical Process Control (SPC) Chart methods are taken

from manufacturingI System is ‘in control’ or ‘out of control’.I Other methods include Shewhart, Exponentially Weighted

Moving Average (EWMA).I Also, ad hoc smoothing methods (e.g., Stern 1999).

Page 16: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Using a CUSUM Method for Salmonella Surveillance

395Vol. 3, No. 3, July–September 1997 Emerging Infectious Diseases

Dispatches

Using Laboratory-Based SurveillanceData for Prevention: An Algorithm for

Detecting Salmonella Outbreaks

By applying cumulative sums (CUSUM), a quality control method commonly usedin manufacturing, we constructed a process for detecting unusual clusters amongreported laboratory isolates of disease-causing organisms. We developed a computeralgorithm based on minimal adjustments to the CUSUM method, which cumulates sumsof the differences between frequencies of isolates and their expected means; we usedthe algorithm to identify outbreaks of Salmonella Enteritidis isolates reported in 1993. Bycomparing these detected outbreaks with known reported outbreaks, we estimated thesensitivity, specificity, and false-positive rate of the method. Sensitivity by state in whichthe outbreak was reported was 0%(0/1) to 100%. Specificity was 64% to 100%, and thefalse-positive rate was 0 to 1.

Effective surveillance systems providebaseline information on incidence trends andgeographic distribution of known infectiousagents. The ability to provide such informationis a prerequisite to detecting new or reemer-ging threats (1). Laboratory-based surveillancecan provide data on the location and frequencyof isolation of specific pathogens, which can beused to rapidly detect unusual increases orclusters. These data can be transmitted elec-tronically from multiple public health sites to acentral location for analysis.

Many acute outbreaks of infectious diseasesare detected by astute clinical observers, localpublic health authorities, or the affected personsthemselves. However, outbreaks dispersed over abroad geographic area, with relatively few casesin any one jurisdiction, are much more difficult todetect locally. Rapid analysis of data to detectunusual disease clusters is the first step inrecognizing outbreaks. We developed an algorithmfor the Public Health Laboratory InformationSystem (PHLIS) (2) that detects unusual clustersby using a statistical quality control methodcalled cumulative sums (CUSUM), a methodcommonly used in manufacturing. CUSUM hasalso been applied to medical audits of influenzasurveillance in England and Wales (3,4).

The AlgorithmThe statistical problem of detecting unusual

disease clusters in public health surveillance issimilar to that of detecting clusters of defectiveitems in manufacturing. In both cases, the aim is

to detect an unusual number of occurrences.Manufacturing operations use several existingquality control methods, e.g., Shewhart Charts,moving average control, and CUSUM, to indicateabnormalities in data collected (5,6). Of thesemethods, CUSUM has two unique attributes thatmake it especially suitable for disease outbreakdetection. CUSUM detects smaller shifts fromthe mean, and it detects similar shifts in themean more quickly (6-8). The computational sim-plicity of this method also makes it especiallywell suited for use on personal computers. Otherpublished methods (9-11) require more personalinteractions, e.g., model building, and use moreintense computations.

Applying the Algorithm toSurveillance Data

To evaluate how well the CUSUM algorithmdetects unusual clusters of disease, we applied itto the Centers for Disease Control and Pre-vention (CDC) National Salmonella SurveillanceSystem dataset. Since 1962, this surveillancesystem has collected reports of laboratory-confirmed Salmonella isolates from humansources from all U.S. state public health labora-tories and the District of Columbia (12). Thelaboratories serotype clinical isolates of Sal-monella by the Kauffman-White methods, whichsubdivide this diverse bacterial genus into morethan 2,000 named serotypes (13). Each week,laboratories report to CDC each Salmonellastrain they have serotyped, along with the age,sex, county of residence of the person from whom

(Hutwagner, 1997)

Page 17: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Time-Series Methods in Surveillance

I Time-series forecast methods are more technicallydemanding, but useful for data with regular temporalcycles, such as day of the week.

I A regression model that accounts for temporalautocorrelation is used to forecast expected values.

I The observed value is compared to the forecast and theresidual or difference is used to detect outbreaks.

I Time-series methods also provide a framework forcombing data from laboratories and other sources in ahybrid surveillance scheme.

Page 18: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Using Time-Series Methods for Reportable DiseaseSurveillance

STATISTICS IN MEDICINE

Statist. Med. 18, 3283}3298 (1999)

A MONITORING SYSTEM FOR DETECTING ABERRATIONSIN PUBLIC HEALTH SURVEILLANCE REPORTSs

G. DAVID WILLIAMSON1* AND GINNER WEATHERBY HUDSON2

1Epidemiology Program Ozce, MS K73, Centers for Disease Control and Prevention, 4770 Buford Highway, Atlanta,GA 30341, U.S.A.

2Drexel University, College of Business and Administration, Matheson Hall, 32nd and Market Streets,Philadelphia, PA 19104, U.S.A.

SUMMARY

Routine analysis of public health surveillance data to detect departures from historical patterns of diseasefrequency is required to enable timely public health responses to decrease unnecessary morbidity andmortality. We describe a monitoring system incorporating statistical &#ags' identifying unusually largeincreases (or decreases) in disease reports compared to the number of cases expected. The two-stagemonitoring system consists of univariate Box}Jenkins models and subsequent tracking signals from severalstatistical process control charts. The analyses are illustrated on 1980}1995 national noti"able disease datareported weekly to the Centers for Disease Control and Prevention (CDC) by state health departments andpublished in CDC's Morbidity and Mortality=eekly Report. Published in 1999 by John Wiley & Sons, Ltd.This article is a U.S. Government work and is in the public domain in the United States.

INTRODUCTION

Public health surveillance provides the foundation for much of e!ective epidemiologic andpublic health practice. It is the ongoing collection, analysis, interpretation and dissemination ofoutcome-speci"c information for use in planning, implementing and evaluating public healthpractice.1,2 Data collected from public health surveillance systems can provide important clues tothe aetiology of a disease and assist in identi"cation of important risk factors, onset of epidemicsand detection of unusual observations in reports of infectious diseases and other conditions, thusfacilitating early public health response to minimize undue morbidity and mortality.3~5

There is a long distinguished history in modelling public health data to provide greater insightinto aetiology, spread, prediction and control of diseases. One such early e!ort was by WilliamFarr when, in 1840, he "t a normal curve to deaths from smallpox in hopes of discovering whyepidemics appear and disappear.6 Other epidemiologic and statistical analyses have focused onmodelling disease incidence and prevalence, geographical distribution and spread of disease, andon forecasting disease counts and associated health care needs.7~17 Models have been developedto detect time and/or spatial clusters of disease and, recently, to smooth rates in small areaestimation problems (that is, to overcome statistical issues occurring when the unit of analysis isassociated with a geographic area which is small relative to the area spanned by a set of

* Correspondence to: G. David Williamson, Epidemiology Program O$ce, MS K73, Centers for Disease Control andPrevention, 4770 Buford Highway, Atlanta, GA 30341, U.S.A. E-mail: [email protected] article is a U.S. Government work and is in the public domain in the United States.

Published in 1999 by John Wiley & Sons, Ltd.

Page 19: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Forecasting Expected Cases of Hepatitis A

Figure 5. Predicted and actual number of reported cases of hepatitis A, by week, United States, 1993}1995 (the 1993predicted numbers are model estimates, the 1994}1995 predicted numbers are rolling forecasts and the actual number of

cases are provisional data from 9 January 1993 to 13 May 1995)

Figure 6. Shewhart control chart for the number of reported cases of hepatitis A, by week, United States, 1989

identi"ed any signi"cantly high values for the 1994}1995 forecasted national hepatitis A diseasereport series.

For 1989 hepatitis A forecasted data, monitoring results varied by state with no evidentrelationships for statistically high values among the three states or with the national diseasemonitoring results. High aberrations were detected by at least one control chart for each of the

3292 G. D. WILLIAMSON AND G. WEATHERBY HUDSON

Published in 1999 by John Wiley & Sons, Ltd. Statist. Med. 18, 3283}3298 (1999)

(Williamson, 1999)

Page 20: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Outline

The Context of Laboratory Surveillance

Evidence and Methods for Automated SurveillanceCase detectionOutbreak detection

Evaluation of Outbreak Detection

Page 21: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

The Metrics of Outbreak Detection

I Sensitivity, Specificity, Receiver Operating Characteristics(ROC) curves, and Area Under the Curve (AUC) are usedto measure accuracy.

I Timeliness, Activity Monitoring Operating Characteristics(AMOC) curves, TROC (Timeliness Receiver OperatingCharacteristics) surface, and Volume Under the Surface(VUS) are used to assess the delay until detection.

I Other measures, such as the proportion of infectionsaverted, are used less frequently.

Page 22: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Receiver Operating Characteristics (ROC) Curves

0.00 0.05 0.10 0.15 0.20

0.0

0.2

0.4

0.6

0.8

1.0

1 − Specificity

Sen

sitiv

ity

0 1 2 3 4 5 6

False alarms per month (1 analysis per day)

100 infected500 infected1000 infected5000 infected

Page 23: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Measuring Accuracy and Timeliness of OutbreakDetection

1 − Prop. time saved

0.0

0.2

0.4

0.6

0.8

1.01 − Specificity

0.00

0.05

0.10

0.15

0.20

Sensitivity

0.0

0.2

0.4

0.6

0.8

1.0

Timeliness−Receiver Operating Characteristic Surface

Page 24: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Approaches to Evaluating Outbreak Detection

I Real outbreaks, if available, are preferable.I The superimposition of simulated outbreaks onto real data

is a good alternative.

Page 25: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

A Superimposed OutbreakR

espi

rato

ry V

isits

Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec Jan2003 2004

020

040

060

080

010

00

Res

pira

tory

Vis

its

Mon Tue Wed Thu Fri Sat Sun Mon Tue Wed Thu Fri Sat Sun Mon Tue Wed Thu Fri Sat SunSep 1, 2003 Sep 5, 2003 Sep 9, 2003 Sep 13, 2003 Sep 17, 2003 Sep 21, 2003

020

040

060

080

010

00

Real dataReal + Simulated data

ForecastUpper 95% CI

Cusum alarm

Page 26: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

Summary

I Laboratory surveillance can be performed in variouscontexts for a range of reasons.

I Automated case reporting improves completeness andtimeliness.

I A range of outbreak detection methods exists for differenttypes of data.

I Metrics exist for accuracy and timeliness of detection.

Page 27: An epidemiological perspective David Buckeridge, MD PhD ... · McGill University, Montréal, Canada Direction de santé publique de Montréal Institut national de santé publique

References

→ Bender JB et. al. Use of molecular subtyping in surveillance for Salmonella enterica serotype typhimurium. NEngl J Med. 2001 Jan 18;344(3):189-95.→ Buckeridge DL, Burkom H, Campbell M, Hogan WR, Moore AW. Algorithms for rapid outbreak detection: aresearch synthesis. J Biomed Inform. 2005 Apr;38(2):99-113.→ Buckeridge DL. Outbreak Detection through Automated Surveillance: A Review of the Determinants ofDetection. J Biomed Inform. 2007. To Appear.→ Buckeridge DL and Cadieux G (2007). Surveillance for Newly Emerging Viruses. In: Tabor E (Ed). Perspectivesin Medical Virology: Emerging Viruses in Human Populations. London: Elsevier.→ Effler P, Ching-Lee M, Bogard A, Ieong MC, Nekomoto T, Jernigan D. Statewide system of electronic notifiabledisease reporting from clinical laboratories: comparing automated reporting with conventional methods. JAMA.1999 Nov 17;282(19):1845-50.→ Hutwagner LC, Maloney EK, Bean NH, Slutsker L, Martin SM. Using laboratory-based surveillance data forprevention: an algorithm for detecting Salmonella outbreaks. Emerg Infect Dis. 1997 Jul-Sep;3(3):395-400.→ Ma H, Rolka H, Mandl K, Buckeridge DL, Endy T, Fleischauer A, Pavlin J. Lab order data implementation inBioSense early outbreak detection system. In: Syndromic Surveillance: Reports from a National Conference, 2004.MMWR 2005;54(Suppl):27-30.→ Overhage JM, Suico J, McDonald CJ. Electronic laboratory reporting: barriers, solutions and findings. J PublicHealth Manag Pract. 2001 Nov;7(6):60-6.→ Stern L, Lightfoot D. Automated outbreak detection: a quantitative retrospective analysis. Epidemiol Infect. 1999Feb;122(1):103-10.