TRANSCRIPT
Utilities in HTA: Challenges for Theory and Practice Now and in the Future
Workshop at the 18th ISPOR European Congress, Milan
10 November 2015
Discussion Leaders
• Michael Drummond, Professor of Health Economics, University of York
• Nancy Devlin, Research Director, Office of Health Economics, London
• Jenny Berg, Senior Scientist, Mapi, Stockholm
Workshop Objectives
• To discuss the methodological and practical challenges regarding the generation and use of utility data
• To compare the different sources of utility data, through examples from the literature and recent analyses of real world data
• To provide guidance on how to generate utility data for submissions to authorities with different requirements
• To discuss how these requirements may differ in the future
Workshop Agenda
• Introduction
• Why utilities matter and how HTA agencies view them
• Theoretical foundations and challenges in utility measurement
• Practical implications of the choice of utility measurement
• Interactive discussion
Why Utilities Matter and How HTA Agencies Deal with Them
Michael Drummond
Centre for Health Economics, University of York
Background
• Health state preference values, or ‘utilities’, are required to calculate the quality-adjusted life-years (QALYs) gained from therapy
• The incremental cost-effectiveness ratio (ICER) is particularly sensitive to the utility estimates when the treatment impacts primarily on quality of life, or when the life-years gained are not lived at full health
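A minimal sketch of why the ICER reacts so strongly to the utility estimate. It assumes the simplest possible QALY calculation (utility multiplied by years lived, no discounting); all numbers, and the function names `qalys` and `icer`, are hypothetical and chosen only for illustration.

```python
def qalys(profile):
    """Total QALYs for a list of (utility, years) pairs; no discounting."""
    return sum(utility * years for utility, years in profile)


def icer(delta_cost, delta_qalys):
    """Incremental cost per QALY gained."""
    if delta_qalys == 0:
        raise ValueError("ICER is undefined when incremental QALYs are zero")
    return delta_cost / delta_qalys


# Hypothetical inputs, for illustration only.
comparator = [(0.60, 5.0)]   # 5 years lived at utility 0.60
treatment = [(0.70, 6.0)]    # 6 years lived at utility 0.70
delta_cost = 30_000.0        # incremental cost, any currency

base_gain = qalys(treatment) - qalys(comparator)
base_icer = icer(delta_cost, base_gain)

# Same survival, but a lower utility estimate for the treated state.
low_gain = qalys([(0.65, 6.0)]) - qalys(comparator)
low_icer = icer(delta_cost, low_gain)

print(f"Base case:     {base_gain:.2f} QALYs gained, ICER {base_icer:,.0f}")
print(f"Lower utility: {low_gain:.2f} QALYs gained, ICER {low_icer:,.0f}")
```

In this made-up case, lowering the utility on treatment from 0.70 to 0.65 raises the ICER by about a third, because the extra life-years are not lived at full health.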
Considerations in the Choice of Utility Values for Decision-Making
• Characteristics of the various methods of estimation (e.g. VAS, TTO, SG, DCE)
• Views on the desirability of experienced valuations versus stated preferences (e.g. patients’ values versus the general population)
• Need for standardization (e.g. generic versus bespoke utilities)
• Need for local data
Characteristics of the Various Methods
• Visual Analogue Scale (VAS) is not choice-based
• Time Trade-Off (TTO) is choice-based under certainty
• Standard Gamble (SG) is choice-based under uncertainty
• Discrete Choice Experiments (DCEs) can more easily be administered, facilitating large sample sizes
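The difference between a choice under certainty (TTO) and a choice under uncertainty (SG) can be made concrete with the textbook scoring rules for states regarded as better than dead. The sketch below assumes those standard rules: in TTO the utility is the ratio of years in full health to years in the health state at the point of indifference, and in SG it is the indifference probability of the full-health outcome. The function names are illustrative, not taken from any valuation protocol.

```python
def tto_utility(years_full_health, years_in_state):
    """TTO: indifferent between `years_in_state` years in the health state
    and `years_full_health` years in full health (a certain trade)."""
    if years_in_state <= 0:
        raise ValueError("years_in_state must be positive")
    if not 0 <= years_full_health <= years_in_state:
        raise ValueError("years_full_health must lie in [0, years_in_state]")
    return years_full_health / years_in_state


def sg_utility(p_full_health):
    """SG: indifferent between the health state for certain and a gamble
    giving full health with probability p (immediate death otherwise)."""
    if not 0 <= p_full_health <= 1:
        raise ValueError("p_full_health must be a probability")
    return p_full_health


# Hypothetical respondent: gives up 3 of 10 years to avoid the state,
# and accepts up to a 15% risk of death for a cure.
print(tto_utility(7, 10))
print(sg_utility(0.85))
```

The same respondent can therefore produce two different utilities for the same state, depending on which task is used.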
Experienced Valuations Versus Stated Preferences
• Health state valuations can be obtained from individuals who are either living in the health state, or who have experienced it in the past (e.g. patients)
• Stated preferences are often obtained from individuals with no direct experience of the health state (e.g. the general public)
• There are also normative reasons for preferring valuations either from patients or the general public
• Generic instruments usually have a tariff generated using the preferences of the general public
Need for Standardization
• Many decision-makers feel that analytical methods should be standardized in order to facilitate comparisons between the assessments of different technologies
• In the case of utilities, this normally implies using a generic instrument
• It is often argued that generic instruments cannot capture all the impacts on quality of life of all health conditions and may lack sensitivity
Need for Local Data
• Ideally, decision-makers would prefer local data for all elements of an economic evaluation, including utility estimates
• In an analysis of 27 international methods guidelines, Barbieri et al (2010) found that 11 discussed the jurisdiction/source of the utility estimates; of these, 6 indicated a need for local data
• Tariffs do vary between countries for the generic instruments, but it is not clear whether the variations are great enough to have a substantial impact on cost-effectiveness results
Which Jurisdictions Require QALYs?
• Strong preference for QALYs/Cost-utility analysis
- Australia, Canada, England/Wales, Ireland, Netherlands, New Zealand, Norway
• QALYs/Cost-utility analysis mentioned as one possible approach
- Belgium, Colombia, Portugal, Slovakia, Sweden, Switzerland, Taiwan
• QALYs/Cost-utility analysis not encouraged
- Germany, United States
www.ispor.org Pharmacoeconomic Guidelines Around the World
The Position in the US Public Sector
• “The Patient-Centered Outcomes Research Institute . . . shall not develop or employ a dollars per quality adjusted life year (or similar measure that discounts the value of a life because of an individual's disability) as a threshold to establish what type of health care is cost effective or recommended. The Secretary shall not utilize such an adjusted life year (or such a similar measure) as a threshold to determine coverage, reimbursement, or incentive programs under title XVIII.”
The Patient Protection and Affordable Care Act, 2010
What Methods are Prescribed?
• Belgium – generic instrument, encourage EQ-5D
• Canada – justify the approach used
• Colombia – EQ-5D with Latin American tariffs
• England/Wales – EQ-5D with UK tariff
• Ireland – indirect methods such as EQ-5D, SF-6D preferred
• Netherlands – can be VAS, TTO or SG; justify selection
• New Zealand – EQ-5D with New Zealand tariff
• Norway – generic instruments preferred
• Sweden – SG, TTO or EQ-5D, prefer weights from patients
• Taiwan – any approach using the general public’s views
www.ispor.org Pharmacoeconomic Guidelines Around the World
Extracts from the NICE Methods Guide (1)
5.3.4 The valuation of health-related quality of life measured in patients (or by their carers) should be based on a valuation of public preferences from a representative sample of the UK population using a choice-based method. This valuation leads to the calculation of utility values.
5.3.5 Different methods used to measure health-related quality of life produce different utility values; therefore, results from different methods or instruments cannot always be compared. Given the need for consistency across appraisals, one measurement method, the EQ-5D, is preferred for the measurement of health-related quality of life in adults.
National Institute for Health and Care Excellence. Guide to the methods of technology appraisal. London, NICE, April 2013. http://publications.nice.org.uk/pmg9
Extracts from the NICE Methods Guide (2)
5.3.8 If not available in the relevant clinical trials, EQ-5D data can be sourced from the literature. When obtained from the literature, the methods of identification of the data should be systematic and transparent. The justification for choosing a particular data set should be clearly explained. When more than 1 plausible set of EQ-5D data is available, sensitivity analyses should be carried out to show the impact of the alternative utility values.
5.3.9 When EQ-5D data are not available, these data can be estimated by mapping other health-related quality of life measures or health-related benefits observed in the relevant clinical trial(s) to EQ-5D. The mapping function chosen should be based on data sets containing both health-related quality of life measures and its statistical properties should be fully described, its choice justified, and it should be adequately demonstrated how well the function fits the data. Sensitivity analyses to explore variation in the use of the mapping algorithms on the outputs should be presented.
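To illustrate what "mapping" and "how well the function fits the data" can mean in the simplest case, the sketch below fits a straight line from a disease-specific score to EQ-5D utility by ordinary least squares and reports the root mean squared error. This is an assumption-laden toy: the paired data are invented, the single-predictor linear form is chosen only for brevity, and published mapping studies generally use richer models and validation samples.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def rmse(x, y, intercept, slope):
    """Root mean squared error of the fitted line on the same data."""
    errors = [(yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y)]
    return sqrt(sum(errors) / len(errors))


# Hypothetical data set containing both measures for the same patients:
# a disease-specific score (0-100) and an observed EQ-5D utility.
scores = [20, 35, 50, 65, 80, 95]
eq5d = [0.35, 0.48, 0.55, 0.70, 0.78, 0.90]

intercept, slope = fit_line(scores, eq5d)
fit_error = rmse(scores, eq5d, intercept, slope)

print(f"utility = {intercept:.3f} + {slope:.4f} * score, RMSE = {fit_error:.3f}")
```

A sensitivity analysis in the sense of 5.3.9 would then rerun the economic model with utilities from alternative mapping functions and compare the outputs.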
Use of Utility Measures in Practice
• Most decision-makers are pragmatic and consider the QALY data presented to them
• It is rare for QALY estimates to be cited as the main reason for rejecting a manufacturer’s submission
• QALY data may be a source of uncertainty in some submissions, either because the data are from a different jurisdiction, or generated using a non-preferred method
Theoretical foundations and challenges
Professor Nancy J Devlin
Director of Research, Office of Health Economics
ISPOR Milan 2015
WORKSHOP W18: UTILITIES IN HTA: CHALLENGES FOR THEORY AND PRACTICE NOW AND IN THE FUTURE
Three things that determine QoL utilities used in HTA
1. What method do we use to elicit data?
• SG, TTO, VAS, DCE, other…?
2. Who do we ask?
• The general public, patients, someone else?
3. How do we model the data to use individuals’ data to represent the ‘average’ preferences for a ‘society’?
• What measure of ‘average’: mean, median, mode?
• Wide variety of econometric modelling approaches can be used to model preferences data
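The choice of 'average' is easy to demonstrate. The sketch below applies the mean, median and mode to the same set of individual valuations using Python's standard `statistics` module; the six valuations are hypothetical and deliberately include one respondent who rates the state as worse than dead.

```python
import statistics

# Hypothetical valuations of one health state from six respondents.
valuations = [1.0, 0.9, 0.9, 0.8, 0.7, -0.5]

mean_value = statistics.mean(valuations)      # pulled down by the -0.5
median_value = statistics.median(valuations)  # midpoint of 0.8 and 0.9
mode_value = statistics.mode(valuations)      # most frequent value

print(f"mean   = {mean_value:.3f}")
print(f"median = {median_value:.3f}")
print(f"mode   = {mode_value:.3f}")
```

With skewed data of this kind the three summaries differ noticeably, so the 'societal' value of the state depends on which one is chosen.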
Fundamental problem no.1:
Each of these researcher choices has a non-trivial impact on QoL utilities – and cannot be determined with recourse to statistical properties alone.
Or in other words: theory matters (a lot)
How do we choose?
Who chooses – HTA bodies? Academics?
Fundamental problem no.2: What theoretical foundations are relevant?
• All current methods for valuing HRQoL rely on stated preferences – there is no corresponding market in which to reveal preferences, to validate results or help to choose between methods.
• How do we as researchers choose our approaches, given the importance of this to HTA and patients’ access to medicines?
(a) Do the results look like we expected?
• Tautological: what results we think are OK depends on what results we saw before, which are a product of previous methodological choices
(b) What theories do we ‘believe’/subscribe to?
• Entirely normative.
• Might be derivable from the client (real or imagined)
Choice and theoretical foundations
What are we measuring?
• Utility/welfare? Health? Health-related QoL?
Which method?
• No single theory. The theoretical foundations of each method are different.
• SG: utility under uncertainty
• TTO: empirical proxy for SG, but can be given its own theoretical foundation in Hicks’ utility theory
• VAS: psychometric theory; Parducci
• DCE: random utility theory
• Choosing between methods = choosing between theories.
Whose preferences?
• Welfarism: utility of those affected by the state of the world (but not via QALYs or measures of HRQoL!)
• Extra-welfarism: by convention, the general public, but alternative interpretations are possible.
How to model?
• Value sets are sensitive to choices about how to model the data. We need to be much more transparent about that. The normative basis for these choices is often weak.
Extra welfarism
• If we assume extra welfarism to be the relevant theoretical foundation for HTA – what guidance does this provide on these issues?
• Important to note that extra-welfarism is much less prescriptive (e.g. about the sources of QoL weights) than the current orthodoxy that has emerged in the practice of HTA (Morris, Devlin, Parkin 2007)
Key quote no. 1: what method?
‘The extra welfarist approach differs from the welfarist in four general ways: (1) it permits the use of outcomes other than utility (2) it permits the use of sources of valuation other than the affected individuals (3) It permits the weighting of outcomes (whether utility or other) according to principles that need not be preference based (4) It permits interpersonal comparisons of wellbeing in a variety of dimensions, thus enabling movements beyond Paretian economics.’ (Culyer 2012 p. 72).
• Note: ‘permits’ ≠ ‘requires’.
Isn’t it ironic?
• Extra welfarism arises from a rejection of utilities (a la welfarism) as an acceptable sole basis for making public choices (vis a vis Sen)
• Yet in measuring/valuing HRQoL for HTA, our current approaches are deeply influenced by our (i.e. economists’) attachment to utility theory
Key quote no.2: Whose values?
In extra welfarism: “…any number of stakeholders might be regarded as the appropriate source of different values” (Culyer 2012).
And these sources of values might appropriately come from: “…an authority (decision makers, wise women, the general public, an elected or appointed committee, a citizen’s jury, or some other organ)” (Culyer 2012)
• Who’s the client (real or imagined)? That’s the big question.
Key quote no.3: how to model?
“The choice of summary statistics is not merely a technical matter, but invokes ethical issues which need to be resolved” (Nicholls 1989, cited by us:)
“Which approach to aggregation of individual preferences is chosen can have an important effect on conclusions about what ‘society’s’ preferences are – with implications for decision making and the allocation of public funds. Ultimately, what approach to calculating the average should be used is a normative question: it cannot be answered with recourse to empirical evidence alone.” (Devlin and Buckingham, 2015).
Key quote no.4: who should be making these choices?
“…economists may be able to derive values from experimental groups or samples of the relevant population through modern methods for eliciting preferences…the choice about which groups to sample are not normally for the analyst to make but for the ultimate decision maker, advised by the analyst” (Culyer 2012 p. 77).
Concluding remarks
• Choices about what utilities to use in HTA are strongly influenced by the choices made by researchers.
• Choice about what utilities to use ‘should’ be driven by well-informed HTA bodies, reflecting the socio-political, economic and health care system context within which they operate.
• Implications:
• Researchers should commit to being transparent in reporting of HRQoL utilities, and report a wide range of sensitivity analyses e.g. relating to different modelling approaches; confidence intervals.
• HTA bodies should check the sensitivity of cost-effectiveness results to different utilities, rather than relying on single point estimates from value sets.
References
Buckingham K, Devlin N. (2006) A theoretical framework for TTO valuations of health. Health Economics 15(10): 1149-54.
Buckingham K, Devlin N. (2009) An exploration of the marginal utility of time in health. Social Science and Medicine 68: 362-367.
Culyer A. (2012) Extra welfarism. Chapter 2 in: The humble economist. Cookson R, Claxton K (eds). London: OHE.
Devlin N, Buckingham K. (2013) What is the normative basis for selecting the measure of ‘average’ preferences to use in social choices? OHE Research Paper (forthcoming).
Morris S, Devlin N, Parkin D. (2012) Economic analysis in health care. Wiley (2nd ed.).
Parkin D, Devlin N. (2006) Is there a case for using visual analogue scale valuations in cost-utility analysis? Health Economics 15: 653-664.
Disclaimer:
Views expressed in this presentation are my own, and not necessarily those of the EuroQol Group, or any other organisation with which I work.
Practical implications of the choice of utility measurement
ISPOR 18th Annual European Congress, Milan, 10 November 2015
WORKSHOP W18: Utilities in HTA – Challenges for theory and practice now and in the future
Jenny Berg, PhD
Senior Scientist, Mapi, Sweden
© Mapi 2015, All rights reserved
Differences between selected value sets for EQ-5D (1/2): Methods
• France (Chevalier et al, 2013): general population sample (n=443); hypothetical health states; 24 health states valued; valuation technique TTO
• Germany (Greiner et al, 2006): general population sample (n=339); hypothetical health states; 36 health states valued; valuation technique TTO
• Germany (Leidl et al, 2011): general population sample (n=2,032); experience-based health states; 49 health states valued; valuation technique VAS
• Sweden (Burström et al, 2013): general population sample (n=45,477); experience-based health states; 148 health states valued; valuation technique TTO
• UK (Dolan, 1997): general population sample (n=2,997); hypothetical health states; 42 health states valued; valuation technique TTO
• US (Shaw et al, 2005): general population sample (n=4,048); hypothetical health states; 13 health states valued; valuation technique TTO
TTO=time trade-off, VAS=visual analogue scale
Differences between selected value sets for EQ-5D (2/2): Methods and results
• France: random effects model, N3 term, negative scores transformed; range of utilities [-0.53, 1]; most influential dimensions: mobility, self-care
• Germany (TTO): additive linear model, N3 term, negative scores transformed; range of utilities [-0.18, 1]; most influential dimensions: mobility, pain/discomfort
• Germany (VAS): generalized linear model, no N3 term; range of utilities [0.18, 0.89]; most influential dimension: pain/discomfort
• Sweden: ordinary least squares, N3 term; range of utilities [0.34, 0.97]; most influential dimension: anxiety/depression
• UK: random effects model, N3 term, negative scores transformed; range of utilities [-0.54, 1]; most influential dimension: pain/discomfort
• US: random effects model, D1 term, interaction terms, negative scores transformed; range of utilities [-0.11, 1]; most influential dimensions: mobility, pain/discomfort
N3 term: severe problems in any dimension
D1 term: number of dimensions with some or severe problems beyond first dimension
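How a value set with an N3 term turns an EQ-5D profile into a utility can be sketched as follows. The structure assumed here is an additive model: start from 1, subtract a constant for any departure from full health, subtract a decrement per dimension and level, and subtract the N3 term once if any dimension is at level 3. Every coefficient below is HYPOTHETICAL and does not reproduce any of the published value sets listed above.

```python
DIMENSIONS = ("mobility", "self_care", "usual_activities",
              "pain_discomfort", "anxiety_depression")

# Hypothetical decrements by dimension and level (level 1 = no problems).
DECREMENTS = {
    "mobility":           {2: 0.05, 3: 0.30},
    "self_care":          {2: 0.10, 3: 0.20},
    "usual_activities":   {2: 0.05, 3: 0.10},
    "pain_discomfort":    {2: 0.10, 3: 0.40},
    "anxiety_depression": {2: 0.05, 3: 0.25},
}
CONSTANT = 0.08  # hypothetical: applied for any departure from full health
N3 = 0.25        # hypothetical: applied once if any dimension is at level 3


def eq5d3l_utility(state):
    """Score a five-digit EQ-5D-3L profile such as '11223'."""
    levels = [int(ch) for ch in state]
    if len(levels) != 5 or any(lv not in (1, 2, 3) for lv in levels):
        raise ValueError("state must be five digits, each 1, 2 or 3")
    if all(lv == 1 for lv in levels):
        return 1.0  # full health
    value = 1.0 - CONSTANT
    for dimension, level in zip(DIMENSIONS, levels):
        if level > 1:
            value -= DECREMENTS[dimension][level]
    if 3 in levels:
        value -= N3
    return value


for profile in ("11111", "11121", "11131", "33333"):
    print(profile, round(eq5d3l_utility(profile), 3))
```

With these invented coefficients the worst state scores below zero, i.e. worse than dead, and the step from moderate to severe pain is much larger than the level decrements alone suggest because the N3 term is triggered.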
Different value sets give different utilities
Utilities by NYHA class in chronic heart failure for different types of elicitation methods and samples
[Figure: utility (y-axis, 0.0 to 1.0) by NYHA class I to IV (x-axis). Series: HUI-3, primary care (Pressler 2011); EQ-5D, post-MI RCT (Göhler 2009); TTO, primary care (Alehagen 2008); TTO, hospital care (Lewis 2001); SG, hospital care (Lewis 2001); EQ-5D SE tariff, mostly hospital (Berg 2015); EQ-5D UK tariff, mostly hospital (Berg 2015)]
Note: Berg 2015 – utilities derived for men, 70-79 years, other variables at reference level
References: Pressler et al., J Card Fail. 2011 Feb;17(2):143-50. Göhler et al., Value Health. 2009 Jan-Feb;12(1):185-7. Alehagen et al., Eur J Heart Fail. 2008 Oct;10(10):1033-9. Lewis et al., J Heart Lung Transplant. 2001 Sep;20(9):1016-24. Berg et al., Value Health. 2015 Jun;18(4):439-48.
Observation 1: The same method in different samples does not imply same levels or slope of utilities
[Figure: utility (y-axis, 0.0 to 1.0) by NYHA class I to IV (x-axis). Series: TTO, primary care (Alehagen 2008); TTO, hospital care (Lewis 2001)]
Observation 2: Different samples, valuation techniques and value sets may still give similar utilities
[Figure: utility (y-axis, 0.0 to 1.0) by NYHA class I to IV (x-axis). Series: EQ-5D, post-MI RCT (Göhler 2009); TTO, primary care (Alehagen 2008); EQ-5D SE tariff, mostly hospital (Berg 2015)]
Note: Berg 2015 – utilities derived for men, 70-79 years, other variables at reference level
Observation 3: Different value sets in the same sample and instrument can have large impact on utilities
[Figure: utility (y-axis, 0.0 to 1.0) by NYHA class I to IV (x-axis). Series: EQ-5D SE tariff, mostly hospital (Berg 2015); EQ-5D UK tariff, mostly hospital (Berg 2015)]
Note: Berg 2015 – utilities derived for men, 70-79 years, other variables at reference level
Observation 4: Methods allowing for health states worse than death lead to lower utilities
[Figure: utility (y-axis, 0.0 to 1.0) by NYHA class I to IV (x-axis). Series: HUI-3, primary care (Pressler 2011); EQ-5D UK tariff, mostly hospital (Berg 2015)]
Note: Berg 2015 – utilities derived for men, 70-79 years, other variables at reference level
Example: Different value sets give different incremental effects
Based on Karlsson et al., 2011; Ann Rheum Dis; 70: 2163-2166. Notes: Differences adjusted for baseline differences in EQ-5D utility, disease duration and TNF inhibitor used
• Work based on rheumatoid arthritis register (Karlsson et al., 2011)
• Patients starting either anti-TNF or anti-TNF plus methotrexate
• EQ-5D available at baseline, 3, 6 and 12 months
• Utilities and accumulated QALYs calculated using different value sets
• Worst health: -0.62 (Danish), -0.59 (UK), -0.11 (US)
• US utilities generally higher: mean difference 0.23 (versus UK), 0.08 (versus Danish)
• Differences increase for worse health states
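One common way to turn utilities observed at baseline, 3, 6 and 12 months into accumulated QALYs is the area under the utility curve. The sketch below assumes linear interpolation between visits (the trapezoid rule); the slides do not state which method the cited study used, and the utility series is hypothetical.

```python
def accumulated_qalys(months, utilities):
    """Area under the utility curve, in QALYs, using the trapezoid rule."""
    if len(months) != len(utilities) or len(months) < 2:
        raise ValueError("need matching series with at least two visits")
    total = 0.0
    for i in range(1, len(months)):
        years = (months[i] - months[i - 1]) / 12.0
        total += years * (utilities[i - 1] + utilities[i]) / 2.0
    return total


# Visit schedule from the slide; utilities are hypothetical.
visit_months = [0, 3, 6, 12]
utilities = [0.50, 0.60, 0.65, 0.70]

print(f"{accumulated_qalys(visit_months, utilities):.5f} QALYs over 12 months")
```

Repeating the calculation with the same EQ-5D profiles scored by a different value set changes every point on the curve, and hence the accumulated QALYs and the incremental effect between treatment arms.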
Implications for economic evaluations
• Differences in utility estimates can lead to different incremental effects and thus cost-effectiveness results
• Different value sets can result in different distributions of utilities, e.g. Kiadaliri et al., HQLO 2015; 13:145
- Comparison of TTO value sets for EQ-5D in type 2 diabetes in Sweden, UK, Germany, US and Denmark
- The Swedish value set (the only experience-based one) was the only one not sensitive to treatment modality (insulin treatment)
- Compared to other value sets, the Swedish value set had higher discriminative ability for macrovascular complications and lower discriminative ability for microvascular complications
• Case of Sweden: TLV guidelines recommend use of utilities from those with experience of the health states
- In practice, reimbursement applications to TLV should include both value sets, which allows comparison of the impact of value sets
Examples of recent TLV decisions incorporating Swedish and UK value sets
• Xtandi (enzalutamide), prostate cancer
- Comparator: best supportive care
- Base case ICER (by tariff): SE 614 000 SEK/QALY; UK 698 000 SEK/QALY
- TLV decision: rejected
- Main reasons: high ICER when assuming different survival extrapolation
• Stivarga (regorafenib), colorectal cancer
- Comparator: best supportive care
- Base case ICER (by tariff): SE 1 300 000 SEK/QALY; UK 1 600 000 SEK/QALY
- TLV decision: rejected
- Main reasons: high ICER
• Vargatef (nintedanib), NSCLC
- Comparator: docetaxel
- Base case ICER (by tariff): SE 424 000 SEK/QALY; UK 550 000 SEK/QALY
- TLV decision: positive
- Main reasons: high disease severity, reasonable price in relation to health gain
Concluding remarks
• The effect of different utility values applied to the same sample differs depending on type of treatment and disease
- For interventions affecting survival, the Swedish value set will, for example, lead to more QALYs than the UK value set
- For treatments affecting mainly/only quality of life, the Swedish value set will, for example, lead to smaller QALY gains than the UK value set
• Differences in funding decisions based on utilities alone (i.e. given same efficacy, costs, WTP thresholds, etc.) would depend on:
- Distribution and severity of health states (cf. domains)
- Treatment effect in terms of change in transitions between health states
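The opposite effects for survival and quality-of-life interventions follow from the range of the value set. The sketch below uses two HYPOTHETICAL value sets, a narrow one (higher utilities for poor health, in the spirit of an experience-based set) and a wide one; the utilities and time horizons are invented and are not the Swedish or UK tariffs.

```python
# Hypothetical utilities for a 'poor' and a 'good' health state.
VALUE_SETS = {
    "narrow": {"poor": 0.60, "good": 0.85},
    "wide":   {"poor": 0.30, "good": 0.80},
}


def survival_gain(value_set, extra_years):
    """QALYs gained from extra years lived in the poor state."""
    return VALUE_SETS[value_set]["poor"] * extra_years


def quality_gain(value_set, years):
    """QALYs gained from moving from the poor to the good state."""
    utilities = VALUE_SETS[value_set]
    return (utilities["good"] - utilities["poor"]) * years


for name in VALUE_SETS:
    print(f"{name:6s}  survival: {survival_gain(name, 2.0):.2f} QALYs"
          f"   quality of life: {quality_gain(name, 5.0):.2f} QALYs")
```

Under these assumptions the narrow value set credits more QALYs to the life-extending treatment and fewer to the quality-of-life treatment, which is the pattern described above.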
Issues for Discussion
• What are the main challenges facing manufacturers in generating utility data?
• Should utility estimates be based on experienced utilities (i.e. from patients), or stated preferences (i.e. from any group, but most often the general public)?
• Should the requirements for utility data be standardized within each jurisdiction (à la NICE)?
• Should the requirements for utility data be standardized across jurisdictions (à la DALYs)?
Conclusions
• Utility data are often important in assessments of the cost-effectiveness of new technologies
• The choice of estimation method depends on both technical and value judgments
• The choice of estimation method can make a difference to the results of studies
• There needs to be a thorough discussion of the choice of method in each jurisdiction