

HEALTH ECONOMICS

Health Econ. 14: 471–485 (2005)

Published online 14 June 2004 in Wiley InterScience (www.interscience.wiley.com). DOI:10.1002/hec.914

COST EFFECTIVENESS ANALYSIS

Assessing generalisability by location in trial-based cost-effectiveness analysis: the use of multilevel models

Andrea Manca a,*, Nigel Rice a, Mark J. Sculpher a and Andrew H. Briggs b
a Centre for Health Economics, University of York, UK
b Health Economics Research Centre, University of Oxford, UK

Summary

Cost-effectiveness analysis (CEA) in health care is increasingly conducted alongside multicentre and multinational randomised controlled clinical trials (RCTs). The increased use of stochastic CEA is designed to account for between-patient sampling variability in cost-effectiveness data assuming that observations are independently distributed. However, between-location variability in cost-effectiveness may result if there is a hierarchical structure in the data; that is, if there is correlation in costs and outcomes between patients recruited in particular locations. This may be expected in multi-location trials given that centres and countries often differ in factors such as clinical practice, patient case-mix and the unit costs of delivering health care. A failure to acknowledge this feature may lead to misleading conclusions in a trial-based economic study. Multilevel modelling (MLM) is an analytical framework that can be used to handle hierarchical cost-effectiveness data. Using data from a recently conducted economic analysis, this paper shows how multilevel modelling can be used to obtain (a) more appropriate estimates of the population average incremental cost-effectiveness and associated standard errors compared to standard stochastic CEA; and (b) location-specific estimates of incremental cost-effectiveness which can be used to explore appropriately the variability between centres/countries of the cost-effectiveness results. Copyright © 2004 John Wiley & Sons, Ltd.

Keywords generalisability; trial-based cost-effectiveness analysis; multilevel modelling; multinational and multi-centre RCTs

Introduction

Economic evaluation in health care is often conducted alongside multicentre and multinational randomised controlled clinical trials (RCTs). In such studies, patient-level data on resource use and health outcomes are collected in a number of different sites, with the usual objective to generate a cost-effectiveness estimate that is generalisable across geographical locations. In recent years, there have been developments in statistical methods in cost-effectiveness analysis (CEA) that reflect the inherent between-patient variability in costs and outcomes, and hence quantify sampling uncertainty in CEA results [1–5]. These methods typically assume that patient-specific data are independent of the location in which they are collected.

It is widely recognised, however, that a comparison of health services in different locations – both within one country and between countries – will reveal important differences in a range of economically relevant parameters [Sculpher MJ, Pang FS, Manca A et al. Generalisability in economic evaluation studies in health care: a review and case-studies. Health Technol Assess 2004, in press, 6–8]. These will include clinical factors such as patient case-mix and clinical practice, but also economic variables including resource use and factor prices, whether a location is technically efficient and preferences about health states. If locations vary markedly in these factors, this is likely to influence the resource use, unit costs and outcome data observed in trials, resulting in the dataset taking on hierarchical characteristics. That is, there may be a correlation in the costs and outcomes relating to patients treated in the same location, which may not be the case when comparing such data for patients treated in different locations. In other words, data may be clustered within locations. Furthermore, if there is similarity in clinical and economic factors between sites participating in the trial in a given country, centres may be considered clustered in countries.

Copyright © 2004 John Wiley & Sons, Ltd.
Received 30 June 2003
Accepted 25 March 2004

*Correspondence to: Centre for Health Economics, University of York, York YO10 5DD, UK. E-mail: [email protected]

The key implication of clustering in economic data in multicentre and multinational trials is that the cost-effectiveness of the interventions of interest may vary between locations. Most trial-based economic studies of this type, however, ignore this potential source of variability. Many studies assume that all data are exchangeable across locations, including the prices (unit costs) of health care resources. In the context of multinational trials, for example, a single set of unit costs from one country in the trial is often used to value resource use measured in all countries [9–13]. Other evaluations recognise the need to explore the implications of different unit costs on the results of a multinational study [14,15]. Few studies, however, recognise the inter-relationship between unit costs, resource use and outcomes [16]. For example, if medical staff is particularly expensive in one country, a substitution may take place towards nursing staff, and this may have implications for resource use and outcomes.

Few trial-based economic studies have sought to address variability between locations simultaneously in all forms of data (resource use, unit costs and effects). Two studies have suggested methods to assess variability between locations in cost-effectiveness studies alongside trials. In the context of a multinational trial in subarachnoid haemorrhage, Willke et al. used regression analysis to model separately the effects of treatment on costs and outcomes, and of outcomes on costs [17]. The use of a treatment-country interaction term allowed for the estimation of country-specific incremental cost-effectiveness ratios based on country-specific resource use and outcomes as measured in the trial, and each country’s unit costs. The considerable extent of variation between countries in cost-effectiveness indicated the potential lack of generalisability of the trial’s overall (i.e. pooled) results. These methods do not, however, formally allow for the potentially hierarchical nature of economic data in these trials. Another approach, suggested by Cook et al. [18], is based on statistical tests for heterogeneity designed to inform the decision regarding whether data from different centres/countries can be pooled into a single analysis or if, instead, separate analyses by location are necessary. This method is analogous to clinical tests of heterogeneity, and relies on the identification of ‘statistically significant heterogeneity’ to define whether the overall results of studies are generalisable.

The concept of net-benefit, where both the costs and outcomes of an intervention are scaled onto the same (monetary or outcome) scale on the basis of the maximum value associated with an additional unit of health gain [19], offers a convenient way of modelling the variability in cost-effectiveness between geographical locations. Recently, Hoch et al. demonstrated how the use of patient-level net-benefit data as the dependent variable in regression analysis can allow more precise estimates of treatment cost-effectiveness by adjusting for covariates which may be unevenly distributed between the arms of a trial; the same framework was used to explore how cost-effectiveness could be estimated for specific patient sub-groups by modelling interactions between treatment and patient covariates [20]. This paper considers how the net-benefit regression framework can be extended to analyse trial datasets where patient-level costs and outcomes are clustered by trial centre or by country.

The particular focus here is on the use of multilevel regression modelling (MLM). MLM has been used in many fields of research [21–26]. Its use in health economics has recently been reviewed [27], but no application has been identified relating to cost-effectiveness analysis. MLM not only provides a means of estimating location-specific measures of cost-effectiveness, but also allows correct quantification of uncertainty by adjusting standard errors to reflect variability in net-benefit both within and between locations.

This paper is structured as follows. The next section provides a brief recap on the principles of stochastic CEA and the use of net-benefit regression. The implications of clustering in economic evaluation data collected in multi-site trials are then discussed. The case study used to illustrate MLM is subsequently introduced; the methods employed to analyse the trial data, and their results are compared with those obtained using standard approaches to cost-effectiveness alongside randomised clinical trials. The paper concludes with a discussion of the potential advantages of MLM, and suggestions for future research.

Stochastic CEA and net-benefit regression

Stochastic CEA conducted alongside RCTs uses patient-level data to estimate mean costs, outcomes and cost-effectiveness for each treatment under scrutiny, producing appropriate measures of sampling uncertainty around their point estimates. In recent years, key contributions in this research area have focussed on methods to estimate confidence intervals around mean incremental cost-effectiveness ratios (ICERs) [1–3], and how to present and characterise sampling uncertainty to inform the decision making process [4,5]. This methodological work indicated that there are two important problems associated with representing the stochastic nature of the ICER: the qualitative interpretation of negative ICERs [28], and the quantification of the sampling uncertainty around the ratio statistic when there is a non-negligible probability that the denominator takes values close to zero.

Given difficulties in formally expressing sampling uncertainty in the ICER [28], the adoption of the net-benefit framework has been suggested as a way of dealing with uncertainty in CEA [19,29,30], by reinterpreting the traditional ratio-based decision rule shown in Equation (1)

$$\mathrm{ICER} = \frac{\bar{C}_1 - \bar{C}_0}{\bar{E}_1 - \bar{E}_0} < \lambda \qquad (1)$$

as a linear expression, formulating the decision-making problem in terms of incremental net monetary benefit (Equation (2a)), or incremental net health benefit (Equation (2b))

$$\mathrm{INMB} = (\bar{E}_1 - \bar{E}_0) \times \lambda - (\bar{C}_1 - \bar{C}_0) > 0 \qquad (2a)$$

$$\mathrm{INHB} = (\bar{E}_1 - \bar{E}_0) - \frac{(\bar{C}_1 - \bar{C}_0)}{\lambda} > 0 \qquad (2b)$$

where $\bar{C}_1$, $\bar{E}_1$, $\bar{C}_0$ and $\bar{E}_0$ are, respectively, the mean cost and effect for the ‘intervention’ and ‘standard treatment’. The parameter $\lambda$ represents the decision maker’s (or society’s) maximum willingness to pay to achieve an additional unit of effectiveness, which can be interpreted in several ways. In other words, $\lambda$ is the value of the ceiling cost-effectiveness ratio with which the health system operates. Once $\lambda$ is defined, the adoption of the net-benefit approach facilitates unequivocal decisions about the cost-effectiveness of interventions. Furthermore, the quantification of the sampling uncertainty around the mean incremental net-benefit becomes straightforward as the linearity of the net-benefit statistic helps to overcome the problems suffered by the ICER [19].
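The link between the ratio rule in Equation (1) and the linear rules in Equations (2a) and (2b) can be checked in a few steps. The sketch below assumes that the intervention is more effective ($\bar{E}_1 > \bar{E}_0$) and that $\lambda > 0$; the shorthand $\Delta E$ and $\Delta C$ is introduced here for illustration only.

```latex
\begin{align*}
\Delta E &= \bar{E}_1 - \bar{E}_0, \qquad \Delta C = \bar{C}_1 - \bar{C}_0 \\
\frac{\Delta C}{\Delta E} < \lambda
  &\iff \Delta C < \lambda\,\Delta E
  && (\text{multiply by } \Delta E > 0) \\
  &\iff \lambda\,\Delta E - \Delta C > 0
  && (\mathrm{INMB} > 0) \\
  &\iff \Delta E - \frac{\Delta C}{\lambda} > 0
  && (\text{divide by } \lambda > 0;\ \mathrm{INHB} > 0)
\end{align*}
```

If instead $\Delta E < 0$, the first step reverses the inequality, so the ratio rule changes direction while the net-benefit rules keep the same form.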

When individual patient data on costs and effects exist, as in trial-based studies, net monetary benefit ($\mathrm{NMB}_i$) and net health benefit ($\mathrm{NHB}_i$) can be calculated for each individual ($i$) in the trial as shown in Equations (2c) and (2d).

$$\mathrm{NMB}_i = E_i \lambda - C_i \qquad (2c)$$

$$\mathrm{NHB}_i = E_i - \frac{C_i}{\lambda} \qquad (2d)$$

For different levels of $\lambda$, sampling uncertainty around the mean cost-effectiveness estimate can be explicitly represented using a cost-effectiveness acceptability curve (CEAC). The CEAC has a naturally Bayesian interpretation [31], as it shows the probability that expected (mean) net-benefit is greater for the intervention compared to standard treatment given the data in the trial, and this is usually how this curve is presented in applied studies [9,32].
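As a minimal sketch of Equations (2c) and (2d), the patient-level net benefits can be computed directly from cost and effect data once a ceiling ratio is chosen. The function name `net_benefits` and the three example patients are hypothetical and are not taken from the trial.

```python
def net_benefits(costs, effects, lam):
    """Return per-patient (NMB, NHB) lists for a ceiling ratio lam > 0.

    NMB_i = E_i * lam - C_i   (Equation 2c)
    NHB_i = E_i - C_i / lam   (Equation 2d)
    """
    nmb = [e * lam - c for c, e in zip(costs, effects)]
    nhb = [e - c / lam for c, e in zip(costs, effects)]
    return nmb, nhb


# Hypothetical patients: costs in pounds, effects in QALYs.
costs = [1200.0, 1500.0, 900.0]
effects = [0.09, 0.10, 0.08]
nmb, nhb = net_benefits(costs, effects, lam=30000.0)
```

The two scales carry the same information: each NHB value multiplied by the ceiling ratio gives the corresponding NMB value.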

Using a model regressing patient-level net monetary benefit against the treatment arm dummy variable ($t_i$), Hoch et al. demonstrated the equivalence of a regression-based approach to CEA with the ‘standard’ stochastic CEAs that are now prevalent in the applied evaluation literature [20]. This method also facilitated covariate adjustment to cost-effectiveness estimates, and patient sub-group analysis. Their regression framework is illustrated in Equation (3) below.

$$\mathrm{NMB}_i = \beta_0 + \beta_1 t_i + \varepsilon_i \qquad (3)$$

In this formulation, the net monetary benefit for the $i$th patient in the trial is the patient-level net-benefit defined in Equation (2c). The coefficients $\beta_0$ and $\beta_1$ are, respectively, the intercept and the slope term obtained from a standard ordinary least squares (OLS) regression. The term $\varepsilon_i$ is an idiosyncratic error term with zero mean and constant variance which is often assumed to be normally distributed. An important assumption is that the errors are identically and independently distributed (i.i.d.). This requires zero covariance between the error terms. Should this assumption not hold, OLS estimation of $\beta_0$ and $\beta_1$ will be inefficient. In terms of the interpretation of the results from the OLS regression, the estimated coefficient $\beta_0$ represents the mean net-benefit in the group receiving the ‘standard treatment’ in the trial, the sum of the two estimated coefficients $(\beta_0 + \beta_1)$ is the mean net-benefit in the ‘intervention’ arm, and $\beta_1$ is the incremental net-benefit between the two arms of the trial.
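The interpretation of the coefficients in Equation (3) can be illustrated with a small closed-form OLS fit. This is a sketch only: the helper `ols_on_treatment` and the four NMB values are hypothetical, and the closed-form expressions apply to a regression on a single regressor.

```python
from statistics import mean


def ols_on_treatment(nmb, treat):
    """OLS of NMB on a 0/1 treatment dummy; returns (b0, b1)."""
    t_bar, y_bar = mean(treat), mean(nmb)
    # Slope = sample covariance of (t, y) divided by sample variance of t.
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(treat, nmb))
    sxx = sum((t - t_bar) ** 2 for t in treat)
    b1 = sxy / sxx
    b0 = y_bar - b1 * t_bar
    return b0, b1


# Hypothetical NMB values: two control patients, two intervention patients.
nmb = [1000.0, 1200.0, 1500.0, 1700.0]
treat = [0, 0, 1, 1]
b0, b1 = ols_on_treatment(nmb, treat)

# Arm means, for comparison with the fitted coefficients.
mean_control = mean(y for y, t in zip(nmb, treat) if t == 0)
mean_intervention = mean(y for y, t in zip(nmb, treat) if t == 1)
```

Here the intercept equals the control-arm mean and the slope equals the difference between the arm means, which is the incremental net benefit.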

Dealing with clustered data

In trials that randomise by location rather than by patient (i.e. cluster-randomised trials), the hierarchical nature of the data available for economic analysis is an inevitable implication of the design of the study. However, at least for economic analysis, some degree of clustering is also likely to exist in trials where the patient is the ‘unit of randomisation’ due to variation between locations in clinical and economic parameters. Indeed, this is true for some individually randomised clinical trials as far as the clinical outcome variables are concerned [33,34]. In these cases, one of the assumptions underpinning the use of standard OLS regression – that the random errors, $\varepsilon_i$, are independently distributed – does not hold, as observations within clusters will be correlated. Failure to acknowledge this feature of the data in the analysis will result in exaggerated precision in the estimated incremental NMB in the trial as well as the average NMB in each treatment arm. More formally, without adjustment for clustering, standard OLS can produce inefficient parameter estimates and biased standard errors.

The extent to which standard approaches to stochastic CEA will produce misleading results depends on the degree of clustering in the data. In a hierarchical (or nested) dataset, the degree of clustering is generally measured by the intraclass correlation coefficient (ICC), a statistic summarising the degree of dependency in nested observations. In a simple 2-level hierarchical data structure (for example, patients clustered within hospitals), the concept of the ICC can be illustrated with the help of a simple variance components random effects regression model

$$\mathrm{NMB}_{ij} = \beta_0 + u_{0j} + \varepsilon_{ij} \qquad (4)$$

Here, $\beta_0$ is the average NMB from pooling all the observations in the dataset, $u_{0j}$ is a random quantity applying to all individuals within the $j$th ‘centre’, and $\varepsilon_{ij}$ is another random quantity applying to the $i$th patient within the $j$th centre. Unlike the expression in Equation (3), NMB is now expressed with two subscripts, $i$ and $j$, which illustrates the individuals’ ($i$) membership of specific groups ($j$). In general terms, we define $M$ as the number of groups in the study (e.g. centres), and $n_j$ as the number of individuals within the $j$th group. Accordingly, $i = 1, 2, \ldots, n_j$; $j = 1, 2, \ldots, M$ and the total sample size $N = \sum_{j=1}^{M} n_j$.

The mean NMB for a particular centre, $j$, can then be expressed as $(\beta_0 + u_{0j})$, with each individual-patient observation departing from this group mean by a random value $\varepsilon_{ij}$. Notice that this model has two random components, $u_{0j}$ and $\varepsilon_{ij}$. Both are assumed to have zero mean and are uncorrelated; $u_{0j}$ has variance $\sigma^2_{u0}$ and $\varepsilon_{ij}$ has variance $\sigma^2_{e}$.

In this model, the quantity $\beta_0$ is often referred to as the fixed part of the model, while $u_{0j}$ and $\varepsilon_{ij}$ constitute the random part of the model. The quantities of interest which are estimated from the data are $\beta_0$, $\sigma^2_{u0}$ and $\sigma^2_{e}$. The latter two quantities are often referred to as variance components and are used to derive the ICC as follows:

$$\mathrm{ICC} = \frac{\sigma^2_{u0}}{\sigma^2_{u0} + \sigma^2_{e}} \qquad (5)$$

The ICC can take values between 0 and 1 inclusive, and can be interpreted as the proportion of the total variance that can be attributed to between-centre variation. The ICC is discussed more fully in the sections that follow.
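Equation (5) is simple enough to compute by hand, but a small helper makes the interpretation concrete. The function `icc` and the variance values used below are hypothetical illustrations, not estimates from the case study.

```python
def icc(var_u0, var_e):
    """Intraclass correlation coefficient from Equation (5).

    var_u0: between-centre variance; var_e: within-centre (patient) variance.
    """
    total = var_u0 + var_e
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_u0 / total


# Hypothetical variance components: a quarter of the variance is between centres.
example_icc = icc(var_u0=1.0, var_e=3.0)

# No between-centre variation gives an ICC of zero (no clustering).
no_clustering = icc(var_u0=0.0, var_e=3.0)
```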

Multilevel modelling in economic evaluation using patient-level data

Variance component specification

There are a number of ways in which the simple variance components model might be elaborated to analyse cost-effectiveness data. Perhaps the most obvious extension is to expand the fixed part of the model to accommodate a treatment arm dummy variable as follows:

$$\mathrm{NMB}_{ij} = \beta_0 + \beta_1 t_{ij} + u_{0j} + \varepsilon_{ij} \qquad (6)$$

with $\beta_0$ and $\beta_1$, respectively, being the population average intercept and slope, and $u_{0j}$ the $j$th group departure from the average intercept. This model takes the form of the multiple regression Equation (3) but it has in addition a level 2 error or residual term, $u_{0j}$. Snijders and Bosker refer to this as a random effects analysis of variance (ANOVA) model [35]. It is also often referred to as a random intercept model.

In the context of an RCT with patients clustered within hospitals, the coefficient $\beta_0$ can be regarded as the average net-benefit for the group of patients receiving the ‘standard treatment’, with $u_{0j}$ reflecting the departure from this average for the $j$th hospital. Similarly, the term $(\beta_0 + \beta_1)$ corresponds to the average net-benefit for the group receiving the ‘intervention’. Again, hospitals will deviate from this average by an amount $u_{0j}$. We can summarise the variability across the quantity $u_{0j}$ by $\sigma^2_{u0}$. The coefficient $\beta_1$ is the average incremental NMB. Notice that, in the variance component specification, the average NMB in each treatment arm is allowed to vary by centre, but their difference (i.e. $\beta_1$) is not. In other words, in this class of models the implicit assumption is that the average incremental net monetary benefit does not vary by location.

Random coefficient specification

For CEA using patient-level data from a number of centres and/or countries, a major interest is in the variability of average incremental NMB across different locations. In this case, it is necessary to move from a variance component specification to a random coefficient model (this model is often referred to as a random slope model as both the intercept and slope are allowed to vary randomly across level-2 units). This is achieved by specifying the treatment effect to have both a fixed and random component

$$\mathrm{NMB}_{ij} = \beta_0 + \beta_1 t_{ij} + u_{0j} + u_{1j} t_{ij} + \varepsilon_{ij} \qquad (7)$$

or equivalently

$$\begin{aligned}
\mathrm{NMB}_{ij} &= \beta_{0ij} x_0 + \beta_{1j} t_{ij} \\
\beta_{0ij} &= \beta_0 + u_{0j} + \varepsilon_{ij} \\
\beta_{1j} &= \beta_1 + u_{1j}
\end{aligned} \qquad (8)$$

Here, the fixed and random parts of the model, respectively, are represented by the terms $(\beta_0 + \beta_1 t_{ij})$ and $(u_{0j} + u_{1j} t_{ij} + \varepsilon_{ij})$. The fixed part of the model is as before, but the random part now includes the term $u_{1j} t_{ij}$, which allows for the fact that both the intercept and the slope of the regression on treatment vary randomly across centres. We are now interested in estimating the random parameters $\sigma^2_{u0}$, $\sigma^2_{u1}$ and $\sigma^2_{e}$. However, the random slopes and random intercepts may be correlated (for example, centres/countries with higher regression intercepts may have lower regression slopes and vice versa). This makes intuitive sense because, in the context of multicentre/country studies, we can expect that, within the same centre, the NMB of the two alternative treatments being compared will be correlated: on the cost side because both interventions will be affected by the hospital production function; on the outcome side, the general level of care in the two arms will probably not differ apart from the interventions being investigated. As a result of this correlation, we are also interested in estimating the intercept-slope covariance, $\sigma_{u01}$.
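To make the structure of the random coefficient model in Equation (7) concrete, the following sketch simulates patient-level NMB with centre-specific intercepts and slopes. It is an illustration under stated assumptions, not the estimation procedure used in the paper: the function `simulate_nmb`, the normality of all random terms, the 50:50 allocation and every parameter value are hypothetical.

```python
import math
import random


def simulate_nmb(n_per_centre, beta0, beta1, var_u0, var_u1, cov_u01,
                 var_e, seed=0):
    """Simulate (centre, treatment, NMB) rows from Equation (7).

    Centre effects (u0j, u1j) are drawn as correlated normals using a
    2x2 Cholesky factor of their covariance matrix.
    """
    rng = random.Random(seed)
    l11 = math.sqrt(var_u0)
    l21 = cov_u01 / l11 if l11 > 0 else 0.0
    l22 = math.sqrt(max(var_u1 - l21 ** 2, 0.0))
    rows = []
    for j, n_j in enumerate(n_per_centre, start=1):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        u0j = l11 * z1                 # centre departure in the intercept
        u1j = l21 * z1 + l22 * z2      # centre departure in the slope
        for _ in range(n_j):
            t = rng.randint(0, 1)      # hypothetical 50:50 allocation
            e = rng.gauss(0, math.sqrt(var_e))
            rows.append((j, t, beta0 + beta1 * t + u0j + u1j * t + e))
    return rows


# Hypothetical parameters; a negative covariance means centres with a
# higher intercept tend to have a lower treatment slope.
rows = simulate_nmb([3, 110, 34], beta0=1000.0, beta1=-200.0,
                    var_u0=10000.0, var_u1=40000.0, cov_u01=-5000.0,
                    var_e=900000.0, seed=1)
```

Setting all variance terms to zero collapses the simulation to the fixed part of the model, which is a quick way to check the bookkeeping.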

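In this specification, the total unexplained variance of the $\mathrm{NMB}_{ij}$ conditional on $t_{ij}$ is

$$\mathrm{Var}(\mathrm{NMB}_{ij} \mid t_{ij}) = \sigma^2_{u0} + 2\sigma_{u01} t_{ij} + \sigma^2_{u1} t^2_{ij} + \sigma^2_{e} \qquad (9)$$

Since $t_{ij} = 0, 1$ for all $i$ and $j$ then

$$\begin{aligned}
\mathrm{Var}(\mathrm{NMB}_{ij} \mid t_{ij} = 0) &= \sigma^2_{u0} + \sigma^2_{e} \\
\mathrm{Var}(\mathrm{NMB}_{ij} \mid t_{ij} = 1) &= \sigma^2_{u0} + 2\sigma_{u01} + \sigma^2_{u1} + \sigma^2_{e}
\end{aligned} \qquad (10)$$

and, therefore, we can define an intraclass correlation coefficient for each arm of the trial.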
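One way to read the arm-specific ICC is as the between-centre share of the conditional variance in each arm. The sketch below follows that reading, which is an assumption here since the text gives no explicit formula; the function `arm_variance_and_icc` and the variance values are hypothetical.

```python
def arm_variance_and_icc(t, var_u0, cov_u01, var_u1, var_e):
    """Total conditional variance and ICC for treatment arm t (0 or 1).

    The between-centre part is var_u0 + 2*cov_u01*t + var_u1*t**2, and the
    total adds the patient-level variance var_e, as in Equation (9).
    """
    between = var_u0 + 2.0 * cov_u01 * t + var_u1 * t ** 2
    total = between + var_e
    return total, between / total


# Hypothetical variance components.
var_ctrl, icc_ctrl = arm_variance_and_icc(0, var_u0=4.0, cov_u01=-1.0,
                                          var_u1=6.0, var_e=12.0)
var_int, icc_int = arm_variance_and_icc(1, var_u0=4.0, cov_u01=-1.0,
                                        var_u1=6.0, var_e=12.0)
```

With these illustrative values, clustering is stronger in the intervention arm because the slope variance outweighs the negative intercept-slope covariance.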

Location specific cost-effectiveness

The estimation process of the multilevel family of models is based on the assumption that the group-level units (here centres/countries) are randomly selected from a population of such units. For this reason the use of MLM is particularly useful to assess the generalisability by location of the results of economic evaluation studies alongside clinical trials. This can be achieved in two ways. Firstly, MLM can be used to estimate cost-effectiveness results across all locations which allow for the hierarchical nature of the data by adding additional uncertainty to the results. Secondly, MLM can be used to generate cost-effectiveness estimates which are specific to each location in the study; the extent to which the results of the analysis are consistent across locations can then be established. How MLM is used in a particular study will partly depend on whether the study is a multicentre or multinational trial.

Location-specific estimates of cost-effectiveness are facilitated through a feature of MLM. The random effects (or residuals) $u_{0j}$ and $u_{1j}$ in Equation (8) are latent variables and not statistical parameters, so they are not obtained as a part of the parameter estimation process (the parameters of interest are the fixed part coefficients and the variances of the random components). Nevertheless, it can be useful to quantify the residuals and this can be achieved through what is termed shrinkage estimation. Consider the raw residual from a variance components model, calculated as $r_{ij} = \mathrm{NMB}_{ij} - \widehat{\mathrm{NMB}}_{ij}$, where patients are clustered in centres. The raw residual for the $j$th centre, $r_j$, could then be calculated as the mean of $r_{ij}$ over all patients within the centre. Shrinkage estimates of centre-specific residuals for the intercept term, for instance, are obtained as follows:

$$\hat{u}_{0j} = \frac{\sigma^2_{u0}}{\left[\sigma^2_{u0} + (\sigma^2_{e}/n_j)\right]} \times r_j \qquad (11)$$

The multiplier in (11) is always less than or equal to 1 so that the estimated residual cannot be greater than the raw residual for a specific centre. This multiplier represents the shrinkage factor. Shrinkage is large when either $\sigma^2_{e}$ is large relative to $\sigma^2_{u0}$, or when $n_j$ is small, or both. In these circumstances, shrinkage reflects a lack of information about centres and so the raw centre-specific residual is shrunken towards zero. A similar principle can be used to obtain shrinkage estimation of centre-specific residuals for the slope term which is used to generate centre-specific incremental NMB. For both intercept and slope, the less information, or more uncertainty, there is relating to an individual centre’s results, the more it will be appropriate to rely on the overall mean as the best estimate for that centre and the more the centre-specific mean will be shrunk towards the overall mean.
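The effect of centre size on the shrinkage factor in Equation (11) can be seen with a short calculation. The helper names and the variance components below are hypothetical; only the centre sizes of 3 and 110 echo the smallest and largest centres in the case study.

```python
def shrinkage_factor(n_j, var_u0, var_e):
    """Multiplier in Equation (11): var_u0 / (var_u0 + var_e / n_j)."""
    return var_u0 / (var_u0 + var_e / n_j)


def shrunken_residual(raw_centre_residual, n_j, var_u0, var_e):
    """Shrinkage estimate of the centre-level intercept residual u0j."""
    return shrinkage_factor(n_j, var_u0, var_e) * raw_centre_residual


# Hypothetical variance components (between-centre 100, within-centre 900).
small_centre = shrinkage_factor(n_j=3, var_u0=100.0, var_e=900.0)
large_centre = shrinkage_factor(n_j=110, var_u0=100.0, var_e=900.0)

# The same raw residual of 400 is pulled much closer to zero in the small centre.
u0_small = shrunken_residual(400.0, n_j=3, var_u0=100.0, var_e=900.0)
u0_large = shrunken_residual(400.0, n_j=110, var_u0=100.0, var_e=900.0)
```

A factor near 1 means the centre's own data dominate; a factor near 0 means the estimate relies mostly on the overall mean.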

Case study: the EVALUATE trial

Background

In this section, the use of MLM to analyse hierarchical datasets from multi-location trials is illustrated using a specific case study. This is a multicentre trial undertaken in one country, but the analytical principles would remain the same for multinational studies. The EVALUATE trial compared laparoscopic-assisted hysterectomy and standard hysterectomy (vaginal or abdominal). Full clinical and economic results have been published elsewhere [36,37]. Given the design of the trial, in effect two separate comparisons were undertaken but, for purposes of the present application, only the comparison of laparoscopic-assisted hysterectomy (N=573) and abdominal hysterectomy (N=286) is included. To avoid re-analysing data which have already been published, a simplified dataset is used here based on 6 weeks follow-up (rather than 12 months), and data from the 25 English centres (out of a total of 30). A range of resource use data was collected, and centre-specific unit costs were estimated for daily ward cost and theatre overheads; other costs were taken from national sources. Outcomes were expressed in terms of quality-adjusted life-years (QALYs) based on women’s responses to the EQ-5D [38,39].

Methods

There are a number of ways to estimate multilevel models. Perhaps the most common methods are via maximum likelihood (assuming normality of the components of the random part of the model) and generalized least squares or an iterative version of it (GLS – which is equivalent to maximum likelihood estimation if components of the random part are normally distributed). In this paper, models are estimated using Bayesian Markov chain Monte Carlo (MCMC) methods. These methods are easily implemented in the package MLwiN [40]. An advantage of MCMC in this context is that it provides output in a convenient format for calculating cost-effectiveness acceptability curves. The probability that the intervention is cost-effective is simply the probability that $\beta_1$ is greater than zero. This can be obtained in a straightforward manner from the posterior distribution of $\beta_1$. If we wish to represent the sampling uncertainty around the mean incremental net benefit estimate for different levels of $\lambda$ for the $j$th centre, we can do so by plotting the CEAC for the $j$th centre using a random coefficient model. This curve can be built by deriving the posterior distribution for $(\beta_1 + u_{1j})$ and again reading off the probability that this composite parameter is greater than zero for a given value of the ceiling ratio. To implement MCMC methods, we use uninformative priors and assume that the random components are normally distributed with zero mean and either constant variances (in the variance components model) or variances as a function of the predictor variables (as in the random coefficient model).^a
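Given stored MCMC draws, the probability of cost-effectiveness at one value of the ceiling ratio is the share of draws above zero; repeating this over a grid of ceiling ratios traces out a CEAC. The sketch below assumes the draws have already been exported from the fitted model as plain lists; the function `prob_cost_effective` and the four draws are hypothetical and far too few for real use.

```python
def prob_cost_effective(beta1_draws, u1j_draws=None):
    """Share of posterior draws with positive incremental net benefit.

    With u1j_draws supplied (same length, same iterations), the
    centre-specific quantity beta1 + u1j is evaluated draw by draw.
    """
    if u1j_draws is None:
        combined = list(beta1_draws)
    else:
        if len(u1j_draws) != len(beta1_draws):
            raise ValueError("draws must come from the same iterations")
        combined = [b + u for b, u in zip(beta1_draws, u1j_draws)]
    return sum(1 for d in combined if d > 0) / len(combined)


# Hypothetical posterior draws at a single ceiling ratio.
beta1_draws = [-100.0, 50.0, 200.0, 300.0]
u1j_draws = [-100.0, -100.0, -100.0, -400.0]

p_overall = prob_cost_effective(beta1_draws)
p_centre_j = prob_cost_effective(beta1_draws, u1j_draws)
```

Pairing the draws by iteration, rather than summarising each parameter separately, preserves any posterior correlation between the overall slope and the centre-level departure.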

Summary data by centre

Table 1 presents summary data for the EVALUATE trial by centre (hospital). These might be described as ‘naïve’ centre-specific results as they are generated simply by analysing data from each centre individually, rather than using all data as in a MLM framework. In total 859 patients were included in the study distributed across the 25 hospitals. The distribution of patients across hospitals is unbalanced, with a minimum of 3 patients observed in Centre 13 and a maximum of 110 in Centre 1. The average number of patients per centre is 34. Columns 3–5 in Table 1 present mean costs, outcomes (QALYs) and net monetary benefits ($\lambda$ = £30 000) for each of the 25 centres.

Table 1. Descriptive data by centre

Centre nj Costs (d) QALYs NMB (d) INMB (d)(l=d30 000) (l=d30 000)

1 110 1358.5 (482.6) 0.0878 (0.0194) 1276.8 (767.9) 238.4 (155.1)2 5 1966.0 (1572.4) 0.0923 (0.0101) 802.0 (1829.0) �1633.1 (1681.6)3 12 1669.0 (770.8) 0.0957 (0.0135) 1202.0 (985.6) �1014.8 (545.6)4 71 1171.2 (440.6) 0.0833 (0.0229) 1328.0 (950.7) 296.4 (243.2)5 26 2098.0 (676.0) 0.0952 (0.0128) 757.3 (733.3) �775.6 (275.8)6 11 1191.0 (253.4) 0.0785 (0.0238) 1164.7 (708.7) 658.2 (413.6)7 13 1609.0 (328.7) 0.0983 (0.0085) 1341.0 (415.9) �108.9 (245.4)8 43 1757.2 (1594.3) 0.0934 (0.0179) 1044.9 (1698.9) 267.9 (558.0)9 44 1478.2 (490.3) 0.0834 (0.0221) 1025.3 (921.1) �117.6 (301.1)10 72 1751.1 (1587.7) 0.0862 (0.0214) 835.7 (1924.3) �1009.4 (469.2)11 74 1497.6 (731.9) 0.0917 (0.0198) 1253.1 (959.1) �6.289 (242.6)12 56 1847.2 (701.9) 0.0885 (0.0180) 806.6 (851.5) �385.0 (240.2)13 3 2355.9 (218.5) 0.1095 (0.0101) 930.3 (172.2) �76.3 (288.3)14 8 2101.3 (1053.5) 0.0818 (0.0259) 353.5 (1496.3) �923.0 (1118.6)15 67 1154.2 (784.1) 0.0919 (0.0215) 1604.0 (999.3) �279.9 (259.7)16 52 1510.5 (671.9) 0.0891 (0.0207) 1162.9 (854.4) �393.9 (242.2)17 10 955.7 (137.8) 0.0855 (0.0117) 1609.4 (408.0) 248.0 (265.2)18 32 1840.1 (716.9) 0.0931 (0.0175) 952.8 (876.3) 163.0 (330.2)19 24 1630.4 (307.5) 0.0995 (0.0370) 1354.3 (1069.1) 269.7 (457.3)20 72 2464.0 (1041.4) 0.0943 (0.0141) 364.2 (1150.2) 136.1 (289.1)21 4 1754.5 (1270.2) 0.0745 (0.0178) 481.8 (1466.8) �1155.9 (1599.7)22 15 1575.7 (242.3) 0.0860 (0.0189) 1002.9 (634.8) 83.8 (360.1)23 18 1658.6 (1633.6) 0.0901 (0.0044) 1045.1 (1626.1) �620.6 (795.4)24 7 1659.2 (455.8) 0.0830 (0.0253) 831.8 (694.4) 289.7 (811.3)25 10 1924.5 (951.7) 0.0887 (0.0036) 737.5 (994.2) �1166.1 (599.7)

Means and standard deviations (in parentheses) for costs, QALYs and net monetary benefits (NMB). Incremental NMB were obtained using OLS by regressing NMB on a treatment arm indicator variable as in Equation (3). The estimated coefficient, $\hat{\beta}_1$, together with its standard error (in parentheses) is reported. INMB is laparoscopic minus abdominal hysterectomy.


Standard deviations are presented in parentheses. A simple inspection of the results reveals a great deal of variability in costs and outcomes, both within and across the centres. For example, Centre 10 has a mean cost of £1751 with a standard deviation of £1588, and mean costs range from £956 in Centre 17 to £2464 in Centre 20. This, in turn, relates to variability in net monetary benefits both across and within centres.

The final column of the table presents incremental net monetary benefits. These were obtained by an OLS regression of NMB ($\lambda$ = £30 000) on a treatment dummy (refer to Equation (3)); incremental NMB (INMB) is simply the estimated coefficient, $\hat{\beta}_1$. The estimated coefficients together with their standard errors (in parentheses) are reported. Once again, it is clear that variability exists both across and within centres. The minimum INMB is observed for Centre 2 (−£1633) and the maximum for Centre 6 (£658). Indeed, inspection of the results by centre shows that the sign of INMB also varies. Although there is appreciable uncertainty around these centre-specific estimates, they indicate that decisions based on incremental cost-effectiveness may have the potential to vary by centre.
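As an illustration of how each entry in the final column of Table 1 can be obtained, the sketch below fits an OLS regression of NMB on a 0/1 treatment dummy for a single centre and checks that the slope equals the difference in arm means. The function name `inmb_ols` and the simulated data are hypothetical and are not taken from the EVALUATE trial; the regression is assumed to have the single-covariate form described for Equation (3).

```python
import math
import random


def inmb_ols(nmb, treat):
    """OLS of NMB on a 0/1 treatment dummy; returns (b0, b1, se_b1)."""
    n = len(nmb)
    mean_t = sum(treat) / n
    mean_y = sum(nmb) / n
    sxx = sum((t - mean_t) ** 2 for t in treat)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(treat, nmb))
    b1 = sxy / sxx                    # incremental NMB (treatment coefficient)
    b0 = mean_y - b1 * mean_t         # mean NMB in the comparator arm
    rss = sum((y - b0 - b1 * t) ** 2 for t, y in zip(treat, nmb))
    se_b1 = math.sqrt(rss / (n - 2) / sxx)   # conventional OLS standard error
    return b0, b1, se_b1


# Hypothetical centre with 10 patients (simulated values, not trial data)
rng = random.Random(0)
treat = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
nmb = [1200.0 - 150.0 * t + rng.gauss(0.0, 900.0) for t in treat]

b0, b1, se_b1 = inmb_ols(nmb, treat)
mean_lap = sum(y for t, y in zip(treat, nmb) if t == 1) / treat.count(1)
mean_abd = sum(y for t, y in zip(treat, nmb) if t == 0) / treat.count(0)
print(f"INMB = {b1:.1f} (SE {se_b1:.1f}); difference in means = {mean_lap - mean_abd:.1f}")
```

With so few patients the standard error is large relative to the estimate, which mirrors the wide uncertainty seen for the smallest centres in Table 1.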

Pooled results using standard (OLS) methods

The results of the standard pooled CEA, using OLS regressions of net monetary benefit on the treatment dummy (0 for abdominal hysterectomy and 1 for laparoscopic assisted abdominal hysterectomy), are shown in Table 2. For values of the ceiling ratio of £0 and £30 000 the results refer to a regression of NMB on the treatment arm (see Equation (2c)), whereas the results reported for $\lambda \to \infty$ represent a regression of NHB on treatment (see Equation (2d)). NHB is the appropriate dependent variable here as NMB $\to \infty$ as $\lambda \to \infty$. In this limit, NHB is equivalent to using the effectiveness measure (QALYs) as the dependent variable. The pooled results indicate that laparoscopic hysterectomy is associated with higher cost and higher QALYs, generating an ICER of £93 783. At $\lambda$ = £30 000, this is equivalent to an INMB of −£145 (95% confidence interval: −£307.60 to £17.00).
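The relationship between the pooled estimates can be checked with a short calculation. The sketch below is illustrative only: it takes the rounded point estimates from Table 2 (an incremental cost of £215.7, which is minus the treatment coefficient at a ceiling ratio of zero, and an incremental effect of 0.0023 QALYs) and applies the standard definitions INMB = $\lambda \Delta E - \Delta C$ and ICER = $\Delta C / \Delta E$. The function and constant names are assumptions introduced for the example.

```python
def inmb(delta_cost, delta_effect, lam):
    """Incremental net monetary benefit at ceiling ratio lam."""
    return lam * delta_effect - delta_cost


def icer(delta_cost, delta_effect):
    """Incremental cost-effectiveness ratio (cost per QALY gained)."""
    return delta_cost / delta_effect


# Rounded pooled OLS estimates from Table 2
DELTA_COST = 215.7    # GBP; Table 2 reports -215.7 for NMB at lambda = 0
DELTA_QALY = 0.0023   # treatment coefficient in the NHB (QALY) regression

icer_value = icer(DELTA_COST, DELTA_QALY)
inmb_30k = inmb(DELTA_COST, DELTA_QALY, 30000)
print(f"ICER = {icer_value:.0f} per QALY; INMB at 30 000 = {inmb_30k:.1f}")
```

The INMB obtained this way is close to, but not identical to, the −£145.3 reported in Table 2, presumably because the inputs used here are rounded. By construction the INMB is zero when the ceiling ratio equals the ICER.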

Multilevel analysis

The results of the multilevel analyses are reported in Tables 3 and 4 relating, respectively, to the variance components and random coefficient models.

Variance components models. The results of the analysis of the variance components specification are reported in Table 3. As shown by a comparison of the parameter estimates detailed in Tables 2 and 3, the parameter estimates of $\beta_0$ and $\beta_1$ are, on the whole, similar to those obtained using OLS. However, the multilevel analysis facilitates the decomposition of the level-1 and level-2 variance components and hence the calculation of the ICC. This information is important as it can be employed usefully to explore the extent of the generalisability of economic evaluation results across locations. A high ICC indicates that the level-2 variation is an important component of the total variation and, accordingly, centres differ substantially in measured health outcomes and/or costs. In such circumstances, ignoring the hierarchical structure in the data could lead to misleading results when quantifying the sampling variability around the estimates of interest. Conversely, a near zero ICC would normally be expected to indicate that the role of level-2 variation is modest and that centres can be assumed to have similar results. The extent of

Table 2. Results of standard economic evaluation based on pooled (OLS) analysis for the EVALUATE Trial using three different specifications of net monetary benefits (NMB)

| N = 859 | NMB (£), $\lambda$ = 0 | NMB (£), $\lambda$ = 30 000 | NHB^a (QALYs), $\lambda \to \infty$ |
|---|---|---|---|
| Constant $\beta_0$ | −1472.1 (−1581.5 to −1362.7) | 1166.8 (1034.3 to 1299.3) | 0.088 (0.088 to 0.092) |
| Treatment $\beta_1$ | −215.7 (−349.6 to −81.8) | −145.3 (−307.6 to 17.0) | 0.0023 (−0.002 to 0.006) |

Coefficients and 95% confidence intervals (in parentheses) are reported. ^a This column reports the coefficient for the Effect regression and not the NMB regression. The change in dependent variable (from NMB to NHB) is a function of how the units are reported (NMB in £ and NHB in QALYs) and not because of some intrinsic limitation in the NMB.


clustering in these data is best observed when considering the two extreme cases of modelling cost data (NMB with $\lambda$ = 0) and modelling health outcome (NHB with $\lambda \to \infty$) data. For the EVALUATE trial, the respective ICCs are 11.9 and 27.8%.
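The ICCs quoted above can be reproduced from the variance components reported in Table 3. The sketch below assumes the usual definition for a variance components model, ICC = $\sigma^2_{u0} / (\sigma^2_{u0} + \sigma^2_e)$, and uses the rounded posterior estimates from the table; the dictionary keys and function name are illustrative labels only.

```python
def icc(var_between, var_within):
    """Intra-class correlation: share of total variance lying between centres."""
    return var_between / (var_between + var_within)


# (sigma^2_u0, sigma^2_e) as reported in Table 3
TABLE3 = {
    "NMB, lambda = 0": (105097, 779731),
    "NMB, lambda = 30 000": (75424, 1234346),
    "NHB, lambda -> infinity": (0.00015, 0.00039),
}

icc_pct = {name: round(100 * icc(u, e), 1) for name, (u, e) in TABLE3.items()}
for name, value in icc_pct.items():
    print(f"{name}: ICC = {value}%")
```

The calculation also makes clear why the ICC is not monotonic in the ceiling ratio: the between-centre and within-centre variances of NMB change at different rates as $\lambda$ increases.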

Table 3. Results of the variance components specification for the EVALUATE Trial

| N = 859 | NMB (£), $\lambda$ = 0 | NMB (£), $\lambda$ = 30 000 | NHB^a (QALYs), $\lambda \to \infty$ |
|---|---|---|---|
| Chain length | 50 000 | 10 000 | 10 000 |
| Burn-in | 2000 | 1000 | 1000 |
| Fixed part | | | |
| Constant $\beta_0$ | −1509.3 (−1686.2 to −1336.4) | 1142.3 (959.6 to 1316.9) | 0.088 (0.082 to 0.093) |
| Treatment $\beta_1$ | −220.1 (−345.8 to −94.0) | −147.5 (−303.1 to 8.5) | 0.0025 (−0.0003 to 0.005) |
| Random part | | | |
| $\sigma^2_{u0}$ | 105 097 (46 894 to 210 306) | 75 424 (27 612 to 166 035) | 0.00015 (0.00008 to 0.0003) |
| $\sigma^2_e$ | 779 731 (708 726 to 859 026) | 1 234 346 (1 123 403 to 1 356 718) | 0.00039 (0.00036 to 0.00043) |
| ICC | 11.9% | 5.8% | 27.8% |

Coefficients and 95% credibility intervals (in parentheses) are reported. ‘Burn-in’ represents the number of initial iterations which are discarded in estimating the final parameter distributions, to allow the Markov chain to converge to the posterior distribution. The ‘chain length’ (or monitoring period) is the number of iterations, after the burn-in period, for which the Markov chain is run, to ensure the chain of values sampled is stable and converges to a particular mean estimate. ^a This column reports the coefficient for the Effect regression and not the NMB regression. The change in dependent variable (from NMB to NHB) is a function of how the units are reported (NMB in £ and NHB in QALYs) and not because of some intrinsic limitation in the NMB.

Table 4. Random coefficients specification for the EVALUATE Trial

| N = 859 | NMB (£), $\lambda$ = 0 | NMB (£), $\lambda$ = 30 000 | NHB^a (QALYs), $\lambda \to \infty$ |
|---|---|---|---|
| Chain length | 50 000 | 50 000 | 100 000 |
| Burn-in | 2000 | 2000 | 5000 |
| Fixed part | | | |
| Constant $\beta_0$ | −1493.8 (−1663.3 to −1324.2) | 1159.0 (986.8 to 1331.4) | 0.088 (0.085 to 0.091) |
| Treatment $\beta_1$ | −243.8 (−407.6 to −88.1) | −176.3 (−385.6 to 23.3) | 0.0024 (−0.003 to 0.008) |
| Random part | | | |
| $\sigma^2_{u0}$ | 94 821 (35 721 to 206 039) | 61 134 (15 489 to 158 570) | 0.00001 (0.000002 to 0.00004) |
| $\sigma^2_{u1}$ | 40 555 (7623 to 126 824) | 71 936 (14 522 to 213 021) | 0.00014 (0.00007 to 0.0003) |
| $\sigma_{u01}$ | −6430 (−72 055 to 36 071) | −13 627 (−98 753 to 32 192) | −0.000006 (−0.00003 to 0.00001) |
| $\sigma^2_e$ | 772 717 (701 394 to 851 271) | 1 220 511 (1 108 103 to 1 344 094) | 0.0004 (0.00036 to 0.00044) |
| ICC, abdominal hysterectomy | 10.9% | 4.8% | 2.4% |
| ICC, laparoscopic hysterectomy | 13.7% | 8.0% | 25.7% |

Coefficients and 95% credibility intervals (in parentheses) are reported. ‘Burn-in’ represents the number of initial iterations which are discarded in estimating the final parameter distributions, to allow the Markov chain to converge to the posterior distribution. The ‘chain length’ (or monitoring period) is the number of iterations, after the burn-in period, for which the Markov chain is run, to ensure the chain of values sampled is stable and converges to a particular mean estimate. ^a This column reports the coefficient for the Effect regression and not the NMB regression. The change in dependent variable (from NMB to NHB) is a function of how the units are reported (NMB in £ and NHB in QALYs) and not because of some intrinsic limitation in the NMB.


Random coefficient models. Table 4 reports the results of estimating a random coefficient model (Equation (7)). This specification allows the effect of the intervention to vary across centres. In the EVALUATE trial, the estimates of the fixed part intervention parameter, $\beta_1$, are similar (given the level of sampling variability evidenced by the width of the credibility intervals) to those reported for the variance components model in Table 3.

As with the variance components specification, the random coefficients analysis confirms that the structure of the dataset is clustered; more so for cost data compared to health outcome data. For example, for $\lambda$ = 0, the ICC was 11 and 14%, respectively, in the abdominal and laparoscopic arms. The net health benefit analysis, with $\lambda \to \infty$, shows considerable clustering within the laparoscopic arm, but much less clustering within the abdominal arm of the trial. However, the estimates of the variance components are close to zero, rendering the ICC estimate susceptible to sampling variability. Hence, although it would appear that greater correlation between patient health outcomes within centres exists for the laparoscopic arm of the trial, the absolute value of the ICC should be interpreted with caution.
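The arm-specific ICCs in Table 4 are consistent with the usual variance function of a random coefficients model, in which the between-centre variance for treatment indicator $t$ is $\sigma^2_{u0} + 2t\sigma_{u01} + t^2\sigma^2_{u1}$. The sketch below applies that formula to the rounded estimates in Table 4. It is an assumption about how the ICCs were derived rather than a statement of the authors' procedure, although it does reproduce the reported percentages; all names are illustrative.

```python
def arm_icc(var_u0, var_u1, cov_u01, var_e, treat):
    """ICC in one trial arm (treat = 0 abdominal, 1 laparoscopic)."""
    between = var_u0 + 2 * treat * cov_u01 + treat ** 2 * var_u1
    return between / (between + var_e)


# (sigma^2_u0, sigma^2_u1, sigma_u01, sigma^2_e) as reported in Table 4
TABLE4 = {
    "NMB, lambda = 0": (94821, 40555, -6430, 772717),
    "NMB, lambda = 30 000": (61134, 71936, -13627, 1220511),
    "NHB, lambda -> infinity": (0.00001, 0.00014, -0.000006, 0.0004),
}

arm_icc_pct = {
    name: tuple(round(100 * arm_icc(*params, treat=t), 1) for t in (0, 1))
    for name, params in TABLE4.items()
}
for name, (abdominal, laparoscopic) in arm_icc_pct.items():
    print(f"{name}: abdominal {abdominal}%, laparoscopic {laparoscopic}%")
```

In the net health benefit column the numerators are of the order of 0.00001 to 0.0001, so small changes in the variance estimates move the ICC considerably, which is the reason for the caution expressed above.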

Location-specific cost-effectiveness

Shrinkage estimation. As indicated above, location-specific estimates of cost-effectiveness are facilitated through the residual shrinkage feature of MLM. Figure 1 illustrates the effect of shrinkage on centre-specific estimates of net monetary benefits (with the ceiling ratio ($\lambda$) at £30 000). The figure is used simply to illustrate the shrinkage process as applied to a variance components model (Equation (6)). The same principles apply to more complex specifications, such as the random coefficients model (Equation (7)), where the focus of shrinkage would be the centre-specific INMB. The horizontal line represents the mean NMB for the abdominal hysterectomy group obtained by estimating a simple variance components random effects regression model (Equation (4)); its value is simply the resulting estimated coefficient, $\hat{\beta}_0$, and for the EVALUATE trial this is £1166.80. Centre-specific net monetary benefits are simply calculated as the mean NMB for each centre independently. These are the values given in column 5 of Table 1, which are termed naïve centre effects and are displayed as circles. Centre-specific shrinkage estimates are also calculated using Equation (11). These are displayed as triangles. The relative

[Figure 1: net monetary benefits at a ceiling ratio of £30 000 (vertical axis, approximately £300 to £1900) plotted against centre code (horizontal axis, centres 1 to 25). Series: naïve centre effect; MLM shrunken centre effect.]

Figure 1. Shrinkage estimation of centre-specific NMBs at $\lambda$ = £30 000 based on a variance components specification. The size of the objects represents the sample size in each centre. The horizontal line represents the term $\beta_0$ in Equation (6), and the vertical distance between the horizontal line and the centre-specific MLM shrunken centre effects (i.e. triangles) represents the term $u_{0j}$ in Equation (6)


size of the symbols is intended to provide a guide to the number of observations within each centre. It is easily seen that the naïve centre-specific estimates of mean NMB always lie further from the overall mean than the corresponding MLM shrunken estimates. Further, in general, the smaller the number of patients within a centre, the greater the discrepancy between the naïve and shrunken estimates; that is, the greater the shrinkage. This is to be expected, as the smaller the number of patients, the less information is available from which to derive the centre-specific estimate. Accordingly, one would have less confidence in the resulting estimate and this is reflected by applying greater weight to the shrinkage factor that pulls the naïve estimate towards the overall mean. The same process is applied to the calculation of centre-specific incremental net monetary benefits.
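The dependence of shrinkage on centre size can be sketched numerically. The example below assumes the standard shrinkage weight for a random intercept, $\sigma^2_{u0} / (\sigma^2_{u0} + \sigma^2_e / n_j)$, which may differ in detail from Equation (11) and from the MCMC-based estimates plotted in Figure 1. It also ignores the treatment term in the model, so it is purely illustrative. The variance components are the Table 3 estimates at $\lambda$ = £30 000, the overall mean is the £1166.80 quoted above, and the naïve means are taken from Table 1; function names are hypothetical.

```python
def shrinkage_factor(n_j, var_between, var_within):
    """Weight placed on the centre's own mean (between 0 and 1)."""
    return var_between / (var_between + var_within / n_j)


def shrunken_mean(naive_mean, overall_mean, n_j, var_between, var_within):
    """Pull the naive centre mean towards the overall mean."""
    weight = shrinkage_factor(n_j, var_between, var_within)
    return overall_mean + weight * (naive_mean - overall_mean)


VAR_U0, VAR_E = 75424, 1234346   # Table 3, lambda = 30 000
OVERALL_MEAN = 1166.80           # value of the horizontal line quoted in the text

# (centre, n_j, naive mean NMB) from Table 1
centres = [(1, 110, 1276.8), (13, 3, 930.3)]

results = {}
for centre, n_j, naive in centres:
    weight = shrinkage_factor(n_j, VAR_U0, VAR_E)
    shrunk = shrunken_mean(naive, OVERALL_MEAN, n_j, VAR_U0, VAR_E)
    results[centre] = (weight, shrunk)
    print(f"Centre {centre} (n = {n_j}): weight {weight:.2f}, "
          f"naive {naive:.1f}, shrunken {shrunk:.1f}")
```

Under these assumptions the largest centre retains most of its own deviation from the overall mean, whereas the three-patient centre is pulled almost all the way to it.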

Incremental net monetary benefits. Location-specific cost-effectiveness can also be illustrated using incremental net monetary benefit curves as a function of the ceiling ratio, $\lambda$ (Figure 2). Several alternative INMB curves are presented. The first was obtained by specifying and estimating the single-level model (Equation (3)) where, for a given value of the ceiling ratio, INMB was estimated as $\hat{\beta}_1$ (curve with circles). The second INMB curve was estimated from the fixed part only (again using estimate $\hat{\beta}_1$) of the random coefficient multilevel model (Equation (7)) (curve with triangles). Finally, centre-specific INMB curves were calculated, for selected hospitals, as the sum of the fixed and random INMB coefficients, $\hat{\beta}_1 + \hat{u}_{1j}$, and are denoted by dotted curves. Because the figures become unwieldy when all centre-specific INMB curves are displayed, the largest centres were chosen, since these provide the most robust centre-specific estimates, being based on larger numbers of patients.

A couple of points can be made about this analysis. First, the results illustrate the wide variability in hospital-specific incremental net monetary benefits. This is apparent across the full range of values of the ceiling ratio considered. This feature is also apparent for values of $\lambda$ for which the corresponding model ICCs suggest centre-specific variation is negligible relative to total residual variation. Indeed, the centre-specific estimates reveal that there is at least one centre for which INMB are positive over a wide range of values of the ceiling ratio. This is evident for Centre 1, for which the results are contrary to those obtained by considering only the fixed part estimate of incremental cost-effectiveness $\hat{\beta}_1$,

[Figure 2: incremental net monetary benefits (vertical axis, approximately −£500 to £300) plotted against the ceiling ratio (horizontal axis, £10 000 to £100 000). Series: single-level model INMB; MLM INMB (fixed part only: $\beta_1$); MLM centre-specific INMBs for Centre 1 ($n_j$ = 110), Centre 10 ($n_j$ = 72), Centre 11 ($n_j$ = 74) and Centre 20 ($n_j$ = 72).]

Figure 2. Incremental Net Monetary Benefit curves based on a random coefficients specification. MLM, multilevel model; INMB, incremental net monetary benefit


obtained through either the single-level specification or the random coefficient specification, both of which display negative estimates for values of $\lambda$ up to £95 000. Secondly, the centre-specific curves are a non-linear function of the ceiling ratio. This is due to the non-linear relationship between the ICC and the ceiling ratio. This affects the random part estimate of the INMB ($\hat{u}_{1j}$) through the shrinkage estimation process since this is, in turn, a function of the ICC.

Cost-effectiveness acceptability curves. Similar features are revealed in terms of cost-effectiveness acceptability curves (Figure 3). Again, circles denote the curve produced from a single-level model specification (Equation (3)), while triangles denote the curve obtained from the fixed part only (using estimate $\hat{\beta}_1$) of the random coefficient multilevel model (Equation (7)). Curves are also shown for the same hospitals used in the INMB analysis in Figure 2.

Once again, the curves display great variability across hospitals in cost-effectiveness for given values of $\lambda$. This variability appears greatest at a value of $\lambda$ of around £60 000, although caution is required here as this observation is based on only those selected hospitals displayed. For example, the probability of laparoscopic hysterectomy being cost-effective, with a ceiling ratio of £50 000, is approximately 0.16 applying the results of the single-level model or fixed-part random coefficient model specification. The corresponding probability for Centre 10 is 0.01 and for Centre 1 it is 0.72. The observed maximum probability that the intervention is cost-effective for Centre 10 is approximately 0.1 (at $\lambda$ = £100 000). For Centre 1, the maximum is 0.8. For values of $\lambda$ greater than £20 000, laparoscopic hysterectomy would probably be considered cost-effective based on the results of Centre 1. However, for values of $\lambda$ less than at least £100 000, laparoscopic hysterectomy would probably not be considered cost-effective based on patient cost and outcomes reported for Centre 10. Results obtained from the single-level model specification indicate that the probability that the

[Figure 3: probability of the intervention being cost-effective (vertical axis, 0 to 1) plotted against the ceiling ratio (horizontal axis, £10 000 to £100 000). Series: single-level model CEAC; MLM CEAC (fixed part only: $\beta_1$); MLM centre-specific CEACs for Centre 1 ($n_j$ = 110), Centre 10 ($n_j$ = 72), Centre 11 ($n_j$ = 74) and Centre 20 ($n_j$ = 72).]

Figure 3. Cost-effectiveness acceptability curves based on a random coefficient specification. MLM, multilevel model


intervention is cost-effective is less than 0.5 for the values of $\lambda$ considered.
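A single point on a cost-effectiveness acceptability curve can be approximated from an INMB estimate and its standard error. The sketch below is a normal approximation applied to the pooled OLS result in Table 2 at $\lambda$ = £30 000, with the standard error backed out of the reported 95% confidence interval. It is not the procedure used for the multilevel curves in Figure 3, which are based on MCMC output, and the function names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Back out a standard error from a symmetric 95% interval."""
    return (upper - lower) / (2 * z)


def prob_cost_effective(inmb_mean, inmb_se):
    """P(INMB > 0) under a normal approximation to the INMB estimate."""
    z = inmb_mean / inmb_se
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Pooled OLS estimate at lambda = 30 000 (Table 2)
INMB_MEAN = -145.3
INMB_SE = se_from_ci(-307.6, 17.0)

p_30k = prob_cost_effective(INMB_MEAN, INMB_SE)
print(f"SE = {INMB_SE:.1f}; P(cost-effective at 30 000) = {p_30k:.3f}")
```

Repeating the calculation over a grid of ceiling ratios, with the INMB and its standard error re-estimated at each value, would trace out an approximate curve of the kind labelled as the single-level model in Figure 3.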

Discussion

This paper has illustrated the use of MLM in multi-location trials. The case-study was based on a multicentre trial in one country, but the methods are of equal relevance to the analysis of multinational trials. The extent to which the use of MLM is crucial in a particular study depends on the proportion of overall variability in cost-effectiveness that takes place between locations. Although, with relatively small values of the ICC, reasonably good agreement between the multilevel and the OLS estimates can be expected, in practical terms it is impossible to establish a rigid threshold value of the ICC above which the use of MLM should be recommended. Relating sample size and ICC, Barcikowski [41] showed that, when using models which ignore the clustering feature of the data, even an ICC as small as 1% could lead to a considerable increase in type I errors. Furthermore, Figures 2 and 3 illustrate the wide variability in hospital-specific cost-effectiveness across the full range of values of the ceiling ratio considered. This feature is also apparent for values of $\lambda$ for which the corresponding model ICCs suggest centre-specific variation is negligible relative to total residual variation. There is, therefore, a strong basis for starting the analysis with a multilevel regression to understand the data structure in more depth.

Could multi-location trials be analysed more simply using fixed effect models? The limitation of regression results obtained from fixed effect models is that they are only valid within the sample of locations that participated in the study [42], and as such are not generalisable to the population of centres outside the trial. Furthermore, the use of fixed effect estimation to explore location-specific cost-effectiveness may not be feasible when there is a large number of level-2 units, as it requires a series of centre-by-treatment interaction terms which will result in a significant loss of degrees of freedom [42]. Random effect models, on the other hand, have the property of being generalisable to centres outside the study sample that share similar characteristics with the level-2 units participating in the trial.

The analyses presented here can be extended in two important ways. The first would be to include location-specific unit cost data to value all resource use measured in that location. Given the difficulty in acquiring these data in all locations, the EVALUATE case study presented here used a mix of centre-specific and national average unit costs. This has the effect of constraining the degree of variation between centres, as a proportion of this will be generated by differences in unit costs. Furthermore, economic theory would suggest that unit costs and resource use will be related by location-specific production functions [16]. In future research, where it is possible to identify centre- and country-specific unit costs, MLM can be used to explore this type of substitution more formally. Moreover, it should be noted that a potentially crucial assumption of a random effects framework is that the random components are not correlated with the fixed part covariates.

The second extension to the MLM framework presented here is the use of additional covariates within the regression specification. For ease of exposition, the only covariate included in the methods described here is the treatment arm to which a patient was randomised. The inclusion of additional covariates at hospital and patient level may help to reduce the estimated variability in the model. Through the use of interactions between these level-1 covariates and treatment, it is also possible to undertake patient sub-group analysis; that is, to estimate incremental net-benefit in sub-groups of patients (e.g. defined by age and/or gender) [20]. Within the framework of MLM, it is also possible to include covariates which explain some of the variation between the higher level groups in the model (e.g. hospitals or countries). These covariates might include factors such as hospital type (teaching vs non-teaching) [6], patient throughput and experience of clinical staff. The rationale is the same as for patient-level covariates: their inclusion facilitates control over important differences between locations that might be correlated with treatment, and allows cost-effectiveness to be explored in sub-groups of locations.

A caveat regarding the use of MLM in the analysis of multi-location trials is that the theory underlying these models assumes that participating centres/countries are selected at random. Unfortunately, largely for practical reasons, this is very rarely the case. This is most


likely to cause problems when the selected locations are clearly unrepresentative of those for which decisions are being taken. The extent to which the absence of randomly selected locations compromises the results of MLM, and methods to overcome this, are topics for future research.

An important policy issue is raised by this work: the extent to which location-specific estimates of incremental net-benefit are useful to decision-makers. In the context of multinational trials, the ability to generate estimates of cost-effectiveness by country would seem potentially useful to country-level decision-makers. In the case of multicentre trials in a single country, however, this may not be so straightforward. The implication of generating centre-specific estimates is that the decision-maker may be willing to fund a particular intervention in some centres, but not in others. Alternatively, the analysis might be useful to centre-level decision-makers. The use of higher-level covariates in these models, in which cost-effectiveness is assessed by sub-groups of types of location (e.g. medical schools or regional centres), may provide more policy-relevant outputs. Although the implications of this need further consideration, it remains the case that the multilevel population average from the fixed part of a random coefficient model (see Figures 2 and 3) is the most appropriate way of estimating average cost-effectiveness (i.e. across centres), given clustered data, if the decision-maker is not interested in implementing different decisions in different centres.

Acknowledgements

This work was developed as part of a project on generalisability in economic evaluation studies in health care (98/22/05) funded by the NHS Health Technology Assessment Programme. An earlier version of this paper was presented at the January 2002 UK Health Economists’ Study Group meeting, held in Norwich (UK). Mark Sculpher and Andrew Briggs are funded by Career Awards in Public Health funded by the NHS Research and Development Programme. We are grateful to Alastair Leyland and Andy Willan and two anonymous referees for helpful comments. Remaining errors are our own responsibility. The views and opinions expressed therein are those of the authors and do not necessarily reflect those of the UK Department of Health.

Notes

a. In this application, we have used MLwiN, which is standard software for multilevel modelling. This software allows multilevel regression to be performed through GLS estimation (i.e. iterative generalised least squares and restricted iterative generalised least squares) as well as MCMC procedures. However, there is a variety of alternative software that researchers can use to perform multilevel regression analysis. For a review, please see http://multilevel.ioe.ac.uk/softrev/index.html

References

1. Chaudhary MA, Stearns SC. Estimating confidence intervals for cost-effectiveness ratios: an example from a randomised trial. Stat Med 1996; 15: 1447–1458.
2. Willan AR, O’Brien B. Confidence intervals for cost-effectiveness ratios: an application of Fieller’s theorem. Health Econ 1996; 5: 297–305.
3. Briggs AH, Mooney CZ, Wonderling DE. Constructing confidence intervals for cost-effectiveness ratios: an evaluation of parametric and non-parametric techniques using Monte Carlo simulation. Stat Med 1999; 18: 3245–3262.
4. Van Hout BA, Al MJ, Gordon GS, Rutten FFH. Costs, effects and c/e-ratios alongside a clinical trial. Health Econ 1994; 3: 309–319.
5. Fenwick E, Claxton K, Sculpher M. Representing uncertainty: the role of cost-effectiveness acceptability curves. Health Econ 2001; 10: 779–789.
6. Sloan F, Feldman RD, Steinwald AB. Effects of teaching on hospital costs. J Health Econ 1983; 2: 1–28.
7. Drummond MF, Bloom BS, Carrin G et al. Issues in the cross-national assessment of health technology. Int J Technol Assess Health Care 1992; 8: 671–682.
8. O’Brien BJ. A tale of two (or more) cities: geographic transferability of pharmacoeconomic data. Am J Managed Care 1997; 3: S33–S39.
9. Sculpher MJ, Poole L, Cleland J et al. Low doses vs. high doses of the angiotensin converting-enzyme inhibitor lisinopril in chronic heart failure: a cost-effectiveness analysis based on the assessment of treatment with Lisinopril and survival (ATLAS) study. Eur J Heart Fail 2000; 2: 447–454.
10. Scott WC, Cooper BC, Scott HM. Pharmacoeconomics evaluation of Roxithromycin versus Amoxycillin/Clavulanic acid in a community-acquired lower respiratory tract infection study. Infection 1995; 23: S21–S24.
11. Rajan R, Gafni A, Levine M, Hirsh J, Gent M. Very low-dose Warfarin prophylaxis to prevent thromboembolism in women with metastatic breast cancer receiving chemotherapy: an economic evaluation. J Clin Oncol 1995; 13: 42–46.

12. Dasbach EJ, Rich MW, Segal R et al. The cost-effectiveness of losartan versus captopril in patients with symptomatic heart failure. Cardiology 1999; 91: 189–194.
13. Nord E, Wisloff F, Hjorth M, Westin J. Cost-utility analysis of Melphalan plus Prednisone with or without Interferon-alpha 2b in newly diagnosed multiple myeloma. Results from a randomised controlled trial. Pharmacoeconomics 1997; 12: 89–103.
14. Schulman KA, Buxton M, Glick H et al. Results of the economic evaluation of the FIRST study. A multinational prospective economic evaluation. Int J Technol Assess Health Care 1996; 12: 698–713.
15. Jonsson B, Cook JR, Pedersen TR. The cost-effectiveness of lipid lowering in patients with diabetes: results from the 4S trial. Diabetologia 1999; 42: 1293–1301.
16. Raikou M, Briggs A, Gray A, McGuire A. Centre-specific or average unit costs in multi-centre studies. Some theory and simulation. Health Econ 2000; 9: 191–198.
17. Willke RJ, Glick HA, Polsky D, Schulman KA. Estimating country-specific cost-effectiveness from multinational clinical trials. Health Econ 1998; 7: 481–493.
18. Cook JR, Drummond M, Glick H, Heyse JF. Assessing the appropriateness of combining economic data from multinational clinical trials. Stat Med 2003; 22: 1955–1976.
19. Stinnett A, Mullahy J. Net health benefits: a new framework for the analysis of uncertainty in cost-effectiveness analysis. Med Decision Making 1998; 18: S68–S80.
20. Hoch JS, Briggs AH, Willan A. Something old, something new, something borrowed, something BLUE: a framework for the marriage of health econometrics and cost-effectiveness analysis. Health Econ 2002; 11: 415–430.
21. DiPrete TA, Forristal JD. Multilevel models: methods and substance. Ann Rev Sociol 1994; 20: 331–357.
22. Goldstein H, Rasbash J, Yang M et al. A multilevel analysis of school examination results. Oxford Rev Educ 1993; 19: 425–433.
23. Aitkin M, Longford N. Statistical modelling issues in school effectiveness studies. J R Stat Soc, Ser A 1986; 149: 1–43.
24. Brown H, Prescott R. Applied Mixed Models in Medicine. Wiley: Chichester, 1998.
25. Cardoso AR. Wage differentials across firms: an application of multilevel modelling. J Appl Econ 2000; 15: 343–354.
26. Rice N, Leyland A. Multilevel models: applications to health data. J Health Services Res and Policy 1996; 1: 154–164.
27. Rice N, Jones AM. Multilevel models and health economics. Health Econ 1997; 6: 561–575.
28. Briggs AH. Handling uncertainty in economic evaluation and presenting the results. In Economic Evaluation in Health Care. Merging Theory with Practice, Drummond MF, McGuire A (eds). Oxford University Press: Oxford, 2001.
29. Claxton K, Posnett J. An economic approach to clinical trial design and research priority-setting. Health Econ 1996; 5: 513–524.
30. Tambour M, Zethraeus NMJ. A note on confidence intervals in cost-effectiveness analysis. Int J Technol Assess Health Care 1998; 14: 467–471.
31. Briggs AH. A Bayesian approach to stochastic cost-effectiveness analysis. Health Econ 1999; 8: 257–262.
32. UK Prospective Diabetes Study Group. Cost effectiveness analysis of improved blood pressure control in hypertensive patients with type 2 diabetes: UKPDS 40. Br Med J 1998; 317: 720–726.
33. Roberts C. The implication of variation in outcome between health care professionals for the design and analysis of randomised controlled trials. Stat Med 1999; 18: 2605–2615.
34. Moerbeek M, Breukelen GJP, Berger MPF. A comparison between traditional methods and multilevel regression for the analysis of multicenter intervention studies. J Clin Epidemiol 2003; 56: 341–350.
35. Snijders TAB, Bosker RJ. Multilevel Analysis. An Introduction to Basic and Advanced Multilevel Modeling. Sage Publications: London, 1999.
36. Sculpher M, Manca A, Abbott J, Fountain J, Mason S, Garry R. The cost-effectiveness of laparoscopic-assisted hysterectomy in comparison with standard hysterectomy: The EVALUATE Trial. Br Med J 2004; 328(7432): 134–139.
37. Garry R, Hawe J, Abbott J et al. The EVALUATE study: randomised trials comparing laparoscopic with abdominal and vaginal hysterectomy. Br Med J 2004; 328(7432): 129–133.
38. Kind P. The EuroQoL instrument: an index of health-related quality of life. In Quality of Life and Pharmacoeconomics in Clinical Trials, Spilker B (ed.). Lippincott-Raven: Philadelphia, 1996.
39. Kind P, Hardman GSM. UK Population Norms for EQ-5D. Discussion Paper. Centre for Health Economics, University of York: York, 1999. Report No.: 172.
40. Rasbash J, Browne W, Goldstein H et al. A User’s Guide to MLwiN Version 2.1. Institute of Education, University of London: London, 2000.
41. Barcikowski RS. Statistical power with group mean as the unit of analysis. J Educ Stat 1981; 6: 267–285.
42. Baltagi BH. Econometric Analysis of Panel Data. Wiley: Chichester, 2001.
