grade background • two stepsebm.mcmaster.ca/documents/large_group_presentations/grade... · •...
TRANSCRIPT
Plan
• GRADE background
• two steps
  – quality of evidence
  – strength of recommendation
• quality and strength can differ
• profiles and summary of findings
• importance of values/preferences
Plan
• GRADE background
• two steps
  – quality of evidence
  – strength of recommendation
• evidence profiles
• an exercise in applying GRADE
Plan
• GRADE background
• two steps
  – quality of evidence
  – strength of recommendation
• quality and strength can differ
• profiles and summary of findings
• importance of values/preferences
• an exercise in applying GRADE
Plan
• GRADE background
• two steps
  – quality of evidence
  – strength of recommendation
• importance of values/preferences
• an exercise in applying GRADE
Plan
• GRADE background
• two steps
  – quality of evidence
  – strength of recommendation
• application to breast cancer screening
  – contrast with USPSTF
Summarizing recommendations
• clinicians need succinct summaries
• should include
  – quality of evidence
  – summaries of best estimates of effect
    • all patient-important outcomes
  – strength of recommendations
• GRADE working group
  – BMJ 2004 and 2008
• Is grading recommendations a good idea?
• Why?
• experience with grading
  – systems used?
Why Grade Recommendations?
• strong recommendations
  – strong methods
  – large precise effect
  – few down sides of therapy
• weak recommendations
  – weak methods
  – imprecise estimate
  – small effect
  – substantial down sides
Which grading system to use?
• many available
  – Australian National and MRC
  – Oxford Centre for Evidence-based Medicine
  – Scottish Intercollegiate Guidelines Network (SIGN)
  – US Preventive Services Task Force
  – American professional organizations
    • AHA/ACC, ACCP, AAP, Endocrine Society, etc.
• cause of confusion, dismay
A common international grading system?
• GRADE (Grades of Recommendation, Assessment, Development and Evaluation)
• international group
  – Australian NMRC, SIGN, USPSTF, WHO, NICE, Oxford CEBM, CDC, CC
• ~25 meetings over last ten years
  – ~10 to 50 attendees
GRADE Uptake
• Agencia sanitaria regionale, Bologna, Italia
• Agency for Health Care Research and Quality (AHRQ)
• Allergic Rhinitis and Group - Independent Expert Panel
• American Association for the Study of Liver Diseases
• American College of Cardiology Foundation
• American College of Chest Physicians
• American College of Emergency Physicians
• American College of Physicians
• American Endocrine Society
• American Society of Gastrointestinal Endoscopy
• American Society of Interventional Pain Physicians
• American Thoracic Society (ATS)
• BMJ Clinical Evidence
• British Medical Journal
• Canadian Agency for Drugs and Technology in Health
• Canadian Cardiovascular Society
• Canadian Task Force on Preventive Health Care
• Centers for Disease Control
• Cochrane Collaboration
• EBM Guidelines Finland
• Emergency Medical Services for Children National Resource Center
• European Association for the Study of the Liver
• European Respiratory Society
• European Society of Thoracic Surgeons
• Evidence-based Nursing Sudtirol, Alta Adiga, Italy
• Finnish Office of Health Technology Assessment
• German Agency for Quality in Medicine
• Health Inspectorate for Scotland
• Infectious Disease Society of America
• Japanese Society of Oral and Maxillofacial Radiology
• Joslin Diabetes Center
• Journal of Infection in Developing Countries
• Kaiser Permanente
• Kidney Disease International Guidelines Organization
• National and Gulf Centre for Evidence-based Medicine
• National Institute for Clinical Excellence (NICE)
• National Kidney Foundation
• Norwegian Knowledge Centre for the Health Services
• Ontario MOH Medical Advisory Secretariat
• Panama and Costa Rica National Clinical Guidelines Program
• Polish Institute for EBM
• Scottish Intercollegiate Guideline Network (SIGN)
• Society of Critical Care Medicine
• Society of Pediatric Endocrinology
• Society of Vascular Surgery
• Spanish Society of Family Practice (SEMFYC)
• Stop TB Diagnostic Working Group
• Surviving Sepsis Campaign
• Swedish Council on Technology Assessment in Health Care
• Swedish National Board of Health and Welfare
• University of Pennsylvania Health System for EB Practice
• UpToDate
• WINFOCUS
• World Allergy Organization
• World Health Organization (WHO)
What are we grading?
• two components
• quality of body of evidence
  – extent to which confidence in estimate of effect adequate to support decision
    • high, moderate, low, very low
• strength of recommendation
  – strong and weak
What are we grading?
• two components
• quality of body of evidence
  – confidence in estimate of effect
    • high, moderate, low, very low
• strength of recommendation
  – strong and weak
Interpretation of quality
• High quality: Further research is very unlikely to change our confidence in the estimate of effect
• Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate
• Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate
• Very low quality: Any estimate of effect is very uncertain
Interpretation of quality
• High: We are very confident that the true effect lies close to that of the estimate of the effect.
• Moderate: We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
• Low: Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect.
• Very low: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect.
Figure: the GRADE process, from question to recommendation
• Health care question (PICO) → systematic reviews → studies (S1 to S5) → outcomes (OC1 to OC4) → important outcomes → critical outcomes
• Generate an estimate of effect for each outcome
• Rate the quality of evidence for each outcome, across studies
  – RCTs start high, observational studies start low
  – (-) study limitations, imprecision, inconsistency of results, indirectness of evidence, publication bias likely
  – (+) large magnitude of effect, dose response, plausible confounders would ↓ effect when an effect is present or ↑ effect if effect is absent
• Final rating of quality for each outcome: high, moderate, low, or very low
• Rate overall quality of evidence (lowest quality among critical outcomes)
• Decide on the direction (for/against) and grade strength (strong/weak*) of the recommendation considering:
  – quality of the evidence
  – balance of desirable/undesirable outcomes
  – values and preferences
• Decide if any revision of direction or strength is necessary considering: resource use
* also labeled "conditional" or "discretionary"
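The rating steps in the process above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the integer encoding of the four levels, and the idea of passing the total number of levels moved down or up are hypothetical conveniences, not part of GRADE itself, where each move is a judgment.

```python
# Quality levels ordered from worst to best (encoding is an assumption).
LEVELS = ["very low", "low", "moderate", "high"]


def rate_quality(design, downgrades=0, upgrades=0):
    """Rate one outcome: RCTs start high, observational studies start low.

    downgrades: total levels lost to study limitations, imprecision,
                inconsistency, indirectness, publication bias.
    upgrades:   total levels gained from large effect, dose response,
                plausible confounding working against the finding.
    """
    start = {"rct": 3, "observational": 1}[design]
    score = start - downgrades + upgrades
    # Clamp to the four available levels.
    score = max(0, min(score, len(LEVELS) - 1))
    return LEVELS[score]


def overall_quality(critical_outcome_ratings):
    """Overall quality is the lowest rating among the critical outcomes."""
    return min(critical_outcome_ratings, key=LEVELS.index)
```

For example, `rate_quality("rct", downgrades=1)` gives "moderate", and a body of evidence whose critical outcomes are rated high, moderate and low is rated low overall.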
Structured question
• patients: lymphoma patients at risk of developing chemotherapy-induced febrile neutropenia
• intervention: granulocyte colony-stimulating factor (G-CSF)
• alternative: not using G-CSF
Structured question
• patients
  – women considering breast cancer screening
  – age 40-49; 50 to 74; > 75
  – no added risk from genetic mutation or chest radiation
• intervention
  – film mammography
• alternative
  – no screening
Need to define all patient-important outcomes and evaluate their importance
• desirable consequences
  – reduction in breast cancer mortality
• undesirable consequences
  – false positive screening results
  – invasive procedures from positive results
  – complications of invasive procedures
  – unnecessary diagnosis and treatment
Determinants of quality
• RCTs start high
• observational studies start low
• what can lower quality?
  – detailed design and execution
  – inconsistency
  – indirectness
  – imprecision
  – reporting bias
Determinants of quality
• RCTs start high
• observational studies start low
• 5 limitations can lower quality
• detailed design and execution
  – concealment, blinding, loss to follow-up
• inconsistency
  – variability in results (heterogeneity)
• publication bias
Determinants of quality
• RCTs start high
• observational studies start low
• 5 limitations can lower quality
• Bias
  – detailed design and execution
    • concealment, blinding, loss to follow-up
  – publication bias
• Imprecision
  – wide confidence intervals
Determinants of quality
• RCTs start high
• observational studies start low
• limitations can lower quality?
• Bias
  – detailed design and execution
    • concealment, blinding, loss to follow-up
• Imprecision
  – wide confidence intervals
Determinants of quality
• RCTs start high
• observational studies start low
• limitations can lower quality?
Determinants of quality
• 5 limitations can lower quality
• risk of bias
  – concealment, blinding, loss to follow-up
• imprecision
• inconsistency
  – variability in results (heterogeneity)
• indirectness
• publication bias
Determinants of quality
• 5 limitations can lower quality
• risk of bias
  – concealment, blinding, loss to follow-up
• imprecision
• inconsistency
  – variability in results (heterogeneity)
• publication bias
Risk of Bias
• well established
  – concealment
  – intention to treat principle observed
  – blinding
  – completeness of follow-up
• more recent
  – early stopping for benefit
  – selective outcome reporting bias
Breast cancer risk of bias
• most trials not concealed
• blinding
  – ? adjudication of outcome
  – no other blinding
• ? loss to follow-up
• all trials rated as "fair" by USPSTF
Consistency of results
• if inconsistency, look for explanation
  – patients, intervention, outcome, methods
• judgment of consistency
  – variation in size of effect
  – overlap in confidence intervals
  – statistical significance of heterogeneity
  – I²
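The last two criteria above, the heterogeneity test and I², can be computed from published relative risks and confidence intervals. The sketch below is a minimal illustration that assumes inverse-variance weights on the log relative risk scale and 95% intervals; the function names are hypothetical, and results from it need not match the figures quoted on the slides, which may have been produced with different software or methods.

```python
import math


def log_rr_and_se(rr, ci_low, ci_high):
    """Log relative risk and its standard error, recovered from a 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    return math.log(rr), se


def heterogeneity(studies):
    """studies: list of (rr, ci_low, ci_high).

    Returns Cochran's Q, its degrees of freedom, and I-squared in percent.
    """
    estimates = [log_rr_and_se(*study) for study in studies]
    weights = [1 / se ** 2 for _, se in estimates]
    # Fixed-effect (inverse-variance) pooled log relative risk.
    pooled = sum(w * y for (y, _), w in zip(estimates, weights)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for (y, _), w in zip(estimates, weights))
    df = len(studies) - 1
    # I-squared: share of variability beyond chance, floored at zero.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2
```

Q is compared against a chi-squared distribution with `df` degrees of freedom to obtain the p-value for heterogeneity; that step is left out here to keep the sketch free of third-party dependencies.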
Relative Risk with 95% CI for Vitamin D, Non-vertebral Fractures
Study                          Relative Risk (95% CI)
Chapuy et al (1994)            0.79 (0.69, 0.92)
Lips et al (1996)              1.10 (0.87, 1.39)
Dawson-Hughes et al (1997)     0.46 (0.24, 0.88)
Pfeifer et al (2000)           0.48 (0.13, 1.78)
Chapuy et al (2002)            0.85 (0.64, 1.13)
Meyer et al (2002)             0.92 (0.68, 1.24)
Trivedi et al (2003)           0.67 (0.46, 0.99)
Pooled random effect model     0.82 (0.69 to 0.98)
p = 0.05 for heterogeneity, I² = 53%
[Forest plot; x-axis: relative risk, 0.1 to 10 (log scale); values below 1 favour vitamin D, values above 1 favour control]
Relative Risk with 95% CI for Vitamin D (Non-Vertebral Fractures, Dose > 400)
Study                          Relative Risk (95% CI)
Chapuy et al (1994)            0.79 (0.69, 0.92)
Dawson-Hughes et al (1997)     0.46 (0.24, 0.88)
Pfeifer et al (2000)           0.48 (0.13, 1.78)
Chapuy et al (2002)            0.85 (0.64, 1.13)
Trivedi et al (2003)           0.67 (0.46, 0.99)
Pooled random effect model     0.75 (0.63 to 0.89)
p = 0.26 for heterogeneity, I² = 24%
[Forest plot; x-axis: relative risk, 0.1 to 10 (log scale); values below 1 favour vitamin D, values above 1 favour control]
Relative Risk with 95% CI for Vitamin D (Non-Vertebral Fractures, Dose = 400)
Study                          Relative Risk (95% CI)
Lips et al (1996)              1.10 (0.87, 1.39)
Meyer et al (2002)             0.92 (0.68, 1.24)
Pooled random effect model     1.03 (0.86 to 1.24)
p = 0.35 for heterogeneity, I² = 0%
[Forest plot; x-axis: relative risk, 0.1 to 10 (log scale); values below 1 favour vitamin D, values above 1 favour control]
Should we believe sub-group analysis?
• within-study comparison? No
• large difference in effect? Borderline
• unlikely chance? Yes, p = 0.006
• consistent across studies? Yes
• a priori hypothesis? Yes
• one of small number of hypotheses? Yes
• biologically compelling? Yes
• shall we believe sub-group analysis?
Directness of Evidence
• differences in
  – patients
  – interventions
  – comparators
• differences in outcomes
  – surrogates
Quality judgments: Directness
• populations
  – older, sicker or more co-morbidity
• interventions
  – new statins versus old
• outcomes
  – important versus surrogate outcomes
  – glucose control versus CV events
Figure 6: Hierarchy of outcomes according to their patient-importance to assess the effect of phosphate lowering drugs in patients with renal failure and hyperphosphatemia
Importance of endpoints (rated 1 to 9)
• Critical for decision making: mortality (9), myocardial infarction (8), fractures (7)
• Important, but not critical for decision making: pain due to soft tissue calcification / function (6)
• Of low patient-importance: flatulence
• Surrogates of declining importance
  – coronary calcification, bone density, soft tissue calcification: lower by one level for indirectness
  – Ca2+/P-product: lower by two levels for indirectness
Directness
• interested in A versus B
• available data: A vs C, B vs C
• example: alendronate versus risedronate, each compared only with placebo
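When only A vs C and B vs C data exist, one common way to approximate A vs B is an adjusted indirect comparison (often attributed to Bucher). The slide does not name a method, so the sketch below is an assumption chosen for illustration, with hypothetical function names. It shows why such evidence is less direct: the variances of the two comparisons add, so the indirect interval is wider than either direct one.

```python
import math


def se_from_ci(ci_low, ci_high):
    """Standard error of a log relative risk from its 95% CI."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)


def indirect_rr(rr_ac, ci_ac, rr_bc, ci_bc):
    """Indirect A-vs-B relative risk from A-vs-C and B-vs-C results.

    ci_ac and ci_bc are (low, high) tuples of 95% confidence limits.
    Returns (rr_ab, ci_low, ci_high).
    """
    log_rr = math.log(rr_ac) - math.log(rr_bc)
    # Uncertainty from both comparisons accumulates.
    se = math.sqrt(se_from_ci(*ci_ac) ** 2 + se_from_ci(*ci_bc) ** 2)
    return (
        math.exp(log_rr),
        math.exp(log_rr - 1.96 * se),
        math.exp(log_rr + 1.96 * se),
    )
```

If both drugs show the same relative risk against placebo, the indirect estimate is 1 with an interval wider than either trial's, which is one reason indirect comparisons are rated down.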
Issues in directness: breast cancer screening
• estimated screening effects in over 75
  – could also use observational studies
• no direct comparisons of screening intervals
Imprecision
• small sample size
  – small number of events
• wide confidence intervals
  – uncertainty about magnitude of effect
• primary criterion:
  – would decisions differ at ends of CI
Meta-analysis of women 39 to 49 years of age
25% RRR, reduction in breast cancer deaths 1/1,000
4% RRR, reduction in breast cancer deaths 1/6,000
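The two absolute figures above follow from applying each relative risk reduction to a baseline risk. A minimal sketch, assuming a baseline risk of breast cancer death of 4 per 1,000 (a value back-calculated from the slide's "1/1,000" figure, not one the slide states) and hypothetical function names:

```python
def absolute_reduction(baseline_risk, rrr):
    """Absolute risk reduction = baseline risk x relative risk reduction."""
    return baseline_risk * rrr


def number_needed_to_screen(baseline_risk, rrr):
    """Women screened to prevent one breast cancer death."""
    return 1 / absolute_reduction(baseline_risk, rrr)


ASSUMED_BASELINE = 0.004  # assumption: 4 deaths per 1,000 without screening

for rrr in (0.25, 0.04):
    n = number_needed_to_screen(ASSUMED_BASELINE, rrr)
    print(f"RRR {rrr:.0%}: about 1 death prevented per {n:,.0f} screened")
```

Under that assumption a 25% RRR corresponds to 1 in 1,000 and a 4% RRR to roughly 1 in 6,000, so the question for imprecision is whether the decision would be the same at both ends.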
Imprecision
• small sample size
  – small number of events
• wide confidence intervals
  – uncertainty about magnitude of effect
• problems
  – analogy to stopping early
  – lack of prognostic balance
• solution: optimal information size
  – # of pts from conventional sample size calculation
  – specify CER, effect, α, β, Δ
Fluoroquinolone prophylaxis in neutropenia: infection-related mortality
• sample size 1,002
• α 0.05, β 0.20, Δ 0.25, N = 6,000
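The optimal information size is the number of patients a conventional sample size calculation would demand for a single trial. The sketch below uses the standard two-proportion formula with a normal approximation. It reads Δ as a 25% relative risk reduction and assumes a control event rate (CER) of 7%; the slide does not state the CER, so that value is purely illustrative, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def optimal_information_size(cer, rrr, alpha=0.05, beta=0.20):
    """Total patients (both groups) from a two-proportion sample size formula.

    cer:  control event rate (assumed by the user)
    rrr:  relative risk reduction to detect
    """
    p_control = cer
    p_treated = cer * (1 - rrr)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided alpha
    z_beta = NormalDist().inv_cdf(1 - beta)        # power = 1 - beta
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    per_group = (z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2
    return 2 * math.ceil(per_group)


# Assumed CER of 7%; alpha, beta and delta as on the slide.
print(optimal_information_size(cer=0.07, rrr=0.25))
```

With these inputs the result is a little under 6,000, the same order as the slide's N. A meta-analysis of 1,002 patients falls well short of it, which is the argument for rating down.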
Stroke: Fixed Effects
Study     Year   Overall Event Rate   Relative Risk (95% CI)
Wallace   1998   5 / 200              3.06 (0.49 to 19.02)
Pobble    2005   1 / 103              2.63 (0.11 to 62.97)
DIPOM     2006   2 / 921              4.97 (0.24 to 103.19)
MaVS      2006   6 / 496              1.83 (0.39 to 8.50)
Zaugg     2007   1 / 119              2.97 (0.12 to 72.19)
POISE     2007   60 / 8351            2.13 (1.25 to 3.64)
Fixed Effects Estimate                2.22 (1.39 to 3.56)
p = 0.99 for heterogeneity, I² = 0%
[Forest plot; x-axis: relative risk, 0.1 to 100 (log scale)]
Downgrading for precision
• if OIS not met, rate down for imprecision
  – unless very large sample size (? > 1,000 per group)
• if OIS met and CI excludes RR = 1, don't downgrade
• if OIS met and CI includes RR = 1, downgrade only if RR < 0.75 or > 1.25
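The three rules above can be written as a small decision function. This is one reading of the slide, not an official algorithm: it assumes that "RR < 0.75 or > 1.25" refers to the confidence limits reaching appreciable benefit or harm, and it treats the tentative "> 1,000 per group" exception as a hard cut-off. All names are hypothetical.

```python
def downgrade_for_imprecision(ois_met, ci_low, ci_high, n_per_group=None,
                              benefit_limit=0.75, harm_limit=1.25):
    """Return True if the evidence should be rated down for imprecision."""
    if not ois_met:
        # Exception on the slide: a very large sample may still suffice.
        very_large = n_per_group is not None and n_per_group > 1000
        return not very_large
    if ci_low > 1 or ci_high < 1:
        # OIS met and the CI excludes no effect: do not downgrade.
        return False
    # OIS met, CI includes RR = 1: downgrade only if the interval still
    # allows appreciable benefit or appreciable harm (assumed reading).
    return ci_low < benefit_limit or ci_high > harm_limit
```

For instance, an interval of 0.98 to 1.55 with the OIS met is rated down because it still allows appreciable harm, while 0.90 to 1.10 is not.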
Precision
• atrial fib at risk of stroke
• warfarin increases serious GI bleeding
  – 3% per year
• 1,000 patients, 1 fewer stroke
  – 30 more bleeds for each stroke prevented
• 1,000 patients, 100 fewer strokes
  – 3 strokes prevented for each bleed
• where is your threshold?
  – how many strokes in 100 with 3% bleeding?
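The trade-off arithmetic on this slide is simple enough to check directly. A minimal sketch with a hypothetical helper name; the inputs are the slide's own numbers.

```python
def bleeds_per_stroke_prevented(n_patients, bleed_rate, strokes_prevented):
    """Extra serious bleeds incurred for each stroke prevented."""
    return n_patients * bleed_rate / strokes_prevented


# 1,000 patients, 3% serious bleeding per year.
few = bleeds_per_stroke_prevented(1000, 0.03, strokes_prevented=1)
many = bleeds_per_stroke_prevented(1000, 0.03, strokes_prevented=100)

print(f"1 stroke prevented:    {few:.0f} bleeds per stroke prevented")
print(f"100 strokes prevented: {1 / many:.1f} strokes prevented per bleed")
```

The threshold question is where, between these two extremes, a clinician or patient would stop accepting the bleeding risk. If the confidence interval for strokes prevented spans that threshold, the estimate is too imprecise to support a confident decision.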
Example: clopidogrel or ASA?
• pts with threatened stroke
• RCT of clopidogrel vs ASA
  – 19,185 patients
• ischaemic stroke, MI, or vascular death compared
  – 939 events (5.32%) with clopidogrel
  – 1021 events (5.83%) with aspirin
• RR 0.91 (95% CI 0.83 – 0.99) (p = 0.043)
• downgrade for precision?
Clopidogrel or ASA for threatened vascular events
• RCT, 19,185 patients
• RR 0.91 (95% CI 0.83 – 0.99)
[Figure: absolute risk reduction scale; values shown: 0, 1.0%, 1.7%, 0.9 – 0.1%]
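To judge precision it helps to translate the relative risk and its interval into absolute terms. The sketch below applies the RR and its 95% limits to the aspirin event rate of 5.83% quoted for this trial. The function name is hypothetical, and treating the aspirin rate as the baseline risk is an assumption of the sketch.

```python
def arr_range(baseline_risk, rr, ci_low, ci_high):
    """Absolute risk reduction at the point estimate and at both CI limits.

    Returns (point, largest, smallest) as proportions.
    """
    point = baseline_risk * (1 - rr)
    largest = baseline_risk * (1 - ci_low)    # most favourable end of the CI
    smallest = baseline_risk * (1 - ci_high)  # least favourable end of the CI
    return point, largest, smallest


point, largest, smallest = arr_range(0.0583, 0.91, 0.83, 0.99)
print(f"ARR {point:.2%} (from {largest:.2%} down to {smallest:.2%})")
```

The absolute benefit runs from about 1% down to well under 0.1%, so whether to rate down depends on the smallest benefit that would still justify choosing clopidogrel.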
Avoiding misleading conclusions
• analogy
  – SR accumulating data just as single trial
• risk of spurious effect by stopping early
• how to avoid
  – insist on sufficient data
  – optimal information size
Avoiding misleading conclusions
• optimal information size
  – # of pts from conventional sample size calculation
  – specify CER, effect, α, β
• alternative, number of events
  – ? 300 ?
Criteria for NOT downgrading
• CI narrow enough to permit confident recommendation for or against
• if positive benefit outcome, safeguard against false positive
  – OIS or number-of-events threshold met (300)
Publication bias
• high likelihood could lower quality
• when to suspect
  – number of small studies
  – industry sponsored
What can raise quality?
• large magnitude can upgrade one level
  – very large two levels
• common criteria
  – everyone used to do badly
  – almost everyone does well
  – quick action
• hip replacement for hip osteoarthritis
• mechanical ventilation in respiratory failure
Dose-response gradient
• childhood lymphoblastic leukemia
• risk for CNS malignancies 15 years after cranial irradiation
  – no radiation: 1% (95% CI 0% to 2.1%)
  – 12 Gy: 1.6% (95% CI 0% to 3.4%)
  – 18 Gy: 3.3% (95% CI 0.9% to 5.6%)
Observational study starts low
What can move up?
• plausible confounders strengthen association
• for-profit (FP) hospitals have higher mortality than not-for-profit (NFP) hospitals
  – NFP treat sicker patients
  – FP treat wealthier patients
Overall level of evidence
• most systems just use evidence about primary benefit outcome
• but what about others (risk)?
• what to do?
• options
  – ignore all but primary
  – weakest of any outcome
  – some blended approach
  – weakest of critical outcomes
Beta blockers in non-cardiac surgery
Quality Assessment and Summary of Findings
• Myocardial infarction
  – participants (studies): 10,125 (9)
  – risk of bias, consistency, directness, precision: no serious limitations
  – reporting bias: not detected
  – quality: High
  – relative effect (95% CI): 0.71 (0.57 to 0.86)
  – absolute risk difference: 1.5% fewer (0.7% fewer to 2.1% fewer)
• Mortality
  – participants (studies): 10,205 (7)
  – risk of bias: no serious limitations
  – consistency: possibly inconsistent
  – directness: no serious limitations
  – precision: imprecise
  – reporting bias: not detected
  – quality: Moderate or low
  – relative effect (95% CI): 1.23 (0.98 – 1.55)
  – absolute risk difference: 0.5% more (0.1% fewer to 1.3% more)
• Stroke
  – participants (studies): 10,889 (5)
  – risk of bias, consistency, directness: no serious limitations
  – precision: possible imprecision
  – reporting bias: not detected
  – quality: Moderate or High
  – relative effect (95% CI): 2.21 (1.37 – 3.55)
  – absolute risk difference: 0.5% more (0.2% more to 1.3% more)
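Pressure limited ventilation (PVL)
Quality Assessment and Summary of Findings
• Hospital mortality
  – patients (studies): 1,664 (9)
  – risk of bias: inability to blind; 2 trials stopped early with few events and large effects, and were also confounded by 'open lung' strategies
  – inconsistency: p = 0.07, I² = 45.6%; varied populations, interventions; not robust in sensitivity analyses
  – indirectness: direct
  – imprecision: precise
  – publication bias: undetected
  – quality: Moderate (due to inconsistency)
  – relative risk (95% CI): 0.82 (0.68 – 0.99), p = 0.04
  – illustrative risks: example control rate 40%; associated risk with PVL 32.8% (27.2 – 39.6)
• Barotrauma
  – patients (studies): 1,497 (7)
  – risk of bias: inability to blind
  – inconsistency: p = 0.24, I² = 25.3%; varied populations, interventions
  – indirectness: direct
  – imprecision: imprecise
  – publication bias: undetected
  – quality: Moderate (due to imprecision)
  – relative risk (95% CI): 0.90 (0.66 – 1.24), p = 0.53
  – illustrative risks: NS
• Paralysis
  – patients (studies): 1,202 (5)
  – risk of bias: inability to blind
  – inconsistency: p = 0.004, I² = 59%; varied populations, interventions, measurements
  – indirectness: direct
  – imprecision: precise
  – publication bias: not assessed
  – quality: Moderate (due to inconsistency)
  – relative risk (95% CI): 1.37 (1.04 – 1.82), p = 0.03
  – illustrative risks: example control rate 30%; associated risk with PVL 41.1% (31.2 – 54.6)
• Dialysis
  – patients (studies): 173 (2)
  – risk of bias: inability to blind
  – inconsistency: p = 0.26, I² = 22.8%; varied populations, interventions
  – indirectness: direct
  – imprecision: imprecise
  – publication bias: not assessed
  – quality: Moderate (due to imprecision)
  – relative risk (95% CI): 1.76 (0.79 – 3.90), p = 0.16
  – illustrative risks: NS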
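Zoster vaccine
Quality Assessment and Summary of Findings
• Zoster episodes
  – patients (studies): 38,546 (1)
  – risk of bias: no serious risk
  – inconsistency: only one study
  – indirectness: direct
  – imprecision: precise
  – publication bias: undetected
  – quality: High
  – relative risk (95% CI): not reported
  – illustrative risks: control rate 11.12 per 1,000 patient-years; vaccinated rate 5.42 (difference 5.7 per 1,000 pt-years, p < 0.001)
• Post-herpetic neuralgia
  – patients (studies): 38,546 (1)
  – risk of bias: no serious risk
  – inconsistency: only one study
  – indirectness: direct
  – imprecision: precise
  – publication bias: undetected
  – quality: High
  – relative risk (95% CI): not reported
  – illustrative risks: control rate 1.38 per 1,000 patient-years; vaccinated rate 0.46 (difference 0.92 per 1,000 pt-years, p < 0.001)
• Serious adverse events
  – patients (studies): 38,546 (1)
  – risk of bias: no serious risk
  – inconsistency: only one study
  – indirectness: direct
  – imprecision: precise
  – publication bias: undetected
  – quality: High
  – relative risk (95% CI): not reported
  – illustrative risks: control rate 13 per 1,000; vaccinated rate 19 (difference 6 per 1,000)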
High versus low PEEP in ALI and ARDS
• Patients with ARDS
  – no. of participants (trials) †: 1892 (3)
  – higher PEEP: 324/951 (34.1%)
  – lower PEEP: 368/941 (39.1%)
  – adjusted relative risk (95% CI; P-value) ‡: 0.90 (0.81 to 1.00; 0.049)
  – adjusted absolute risk difference (95% CI): -3.9% (-7.4% to -0.04%)
  – quality: High
• Patients without ARDS
  – no. of participants (trials) †: 404 (3)
  – higher PEEP: 50/184 (27.2%)
  – lower PEEP: 41/220 (18.6%)
  – adjusted relative risk (95% CI; P-value) ‡: 1.37 (0.98 to 1.92; 0.065)
  – adjusted absolute risk difference (95% CI): 6.9% (-0.4% to 17.1%)
  – quality: Moderate (imprecision)
Patients or population: Anyone taking a long flight (lasting more than 6 hours)
Settings: International air travel
Intervention: Compression stockings1
Comparison: Without stockings
Summary of findings (illustrative comparative risks*, with range of uncertainty)
• Symptomatic deep vein thrombosis (DVT)
  – assumed risk without stockings: see comment
  – corresponding risk with stockings (95% CI): see comment
  – relative effect (95% CI): not estimable
  – number of participants (studies): 2637 (9 studies)
  – quality of the evidence (GRADE): see comment
  – comments: 0 participants developed symptomatic DVT in these studies.
• Symptomatic deep vein thrombosis, surrogate: symptomless deep vein thrombosis
  – low risk population2: 5 per 10,000 without stockings; 0.5 per 10,000 (0 to 1) with stockings
  – high risk population2: 18 per 10,000 without stockings; 1.8 per 10,000 (0.5 to 4) with stockings
  – relative effect (95% CI): RR 0.10 (0.04 to 0.25)
  – number of participants (studies): 2637 (9 studies)
  – quality of the evidence (GRADE): ⊕⊕⊕ Moderate3
  – comments: Estimates of control group asymptomatic thrombosis from the primary studies range from 15 per 1,000 in low risk patients to 25 per 1,000 in high risk patients
• Superficial vein thrombosis
  – assumed risk without stockings: 13 per 1000
  – corresponding risk with stockings (95% CI): 6 per 1000 (2 to 15)
  – relative effect (95% CI): RR 0.45 (0.18 to 1.13)
  – number of participants (studies): 1804 (8 studies)
  – quality of the evidence (GRADE): ⊕⊕⊕ Moderate4
Diagnostic tests
• same logic as for treatment
• judges quality of evidence NOT for accuracy, but for change in patient-important outcome
• ideally establish through RCT
  – focus on patient-important outcome
  – screening RCTs (breast cancer, colon cancer)
• most of time not available
  – new complexities, process in evolution
Test accuracy is a surrogate for patient important outcomes
• When clinicians think about diagnostic tests, they focus on their accuracy
• Underlying assumption: obtaining a better idea of whether a target condition is present or absent will result in superior patient management and improved outcome.
Example of new test and reference test or strategy: Helical CT for renal calculus compared with intravenous pyelogram (IVP)
• Putative benefit of new test: detection of more (but smaller) calculi
• Diagnostic accuracy: sensitivity greater, specificity equal
• Patient outcomes and expected impact on management, by test outcome
  – True positives: certain benefit for larger stones; for smaller stones the benefit is less clear and unnecessary treatment can result. Directness of the evidence for patient-important outcomes: some uncertainty
  – False positives: likely detriment from unnecessary additional invasive tests. Directness: no uncertainty
  – True negatives: almost certain benefit from avoiding unnecessary tests. Directness: no uncertainty
  – False negatives: likely detriment for large stones, less certain for small stones; more testing. Directness: major uncertainty
• Balance between presumed patient outcomes, complications and cost: fewer complications and downsides compared to IVP would support the new test's usefulness, but the balance between desirable and undesirable effects is not clear in view of the uncertain consequences of identifying smaller stones.
Strength of recommendations
• degree of confidence that desirable effects of adhering to recommendation outweigh undesirable effects.
• strong recommendation
  – benefits clearly outweigh risks/hassle/cost
  – risk/hassle/cost clearly outweighs benefit
Strength of recommendations
• degree of confidence that desirable effects of adhering to recommendation outweigh undesirable effects.
• strong recommendation
  – benefits clearly outweigh risks/hassle/cost
  – risk/hassle/cost clearly outweighs benefit
• what can downgrade strength?
Strength of Recommendation
• strong recommendation
  – benefits clearly outweigh risks/hassle/cost
  – risk/hassle/cost clearly outweighs benefit
• what can downgrade strength?
• low quality evidence
• close balance between up and downsides
Grades Translations
Strong recommendations
• Just do it
• Virtually all well informed individuals would want the intervention and only a small proportion would not
• Most individuals should receive the intervention
• Use of the intervention according to the guideline could be used as a quality criterion or performance indicator
Grades Translations
Weak recommendation
• Examine the evidence yourself
• The majority of well informed individuals would want the intervention, but a substantial proportion would not
• Many but not all individuals should receive the intervention
• The intervention is not a candidate for a quality criterion or performance indicator.
Risk/Benefit tradeoff
• aspirin after myocardial infarction
  – 25% reduction in relative risk
  – side effects minimal, cost minimal
  – benefit obviously much greater than risk/cost
• warfarin in low risk atrial fibrillation
  – warfarin reduces stroke vs ASA by 50%
  – but if risk only 1% per year, ARR 0.5%
  – increased bleeds by 1% per year
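The warfarin arithmetic on this slide can be checked with a short sketch. The 1% baseline stroke risk, the 50% relative risk reduction and the 1% increase in bleeds come from the slide; the function name and the number-needed-to-treat and number-needed-to-harm figures are illustrative additions, not part of the presentation.

```python
def absolute_risk_reduction(baseline_risk, relative_risk_reduction):
    """ARR = baseline risk x relative risk reduction."""
    return baseline_risk * relative_risk_reduction

# Figures taken from the slide (annual risks, low risk atrial fibrillation).
baseline_stroke_risk = 0.01   # 1% per year on ASA
rrr_warfarin_vs_asa = 0.50    # warfarin reduces stroke vs ASA by 50%
extra_bleed_risk = 0.01       # bleeds increased by 1% per year

arr = absolute_risk_reduction(baseline_stroke_risk, rrr_warfarin_vs_asa)

# Illustrative derived quantities (not on the slide).
nnt = 1 / arr                # patients treated for a year to prevent one stroke
nnh = 1 / extra_bleed_risk   # patients treated for a year to cause one extra bleed

print(f"ARR = {arr:.1%}, NNT = {nnt:.0f}, NNH = {nnh:.0f}")
```

The sketch only reproduces the numbers; whether a prevented stroke outweighs an extra bleed depends on patient values, which is why the balance is close.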
Strength of Recommendations
• Resuscitate fast in septic patient
  – do it!
• Prone ventilation in failing patient with ARDS
  – Probably do it
  – Probably do not do it
Strength of Recommendations
Aspirin after MI
  – do it
Warfarin rather than ASA in Afib
  – probably do it
  – probably don’t do it
Significance of strong vs weak
• variability in patient preference
  – strong, almost all same choice (> 90%)
  – weak, choice varies appreciably
• interaction with patient
  – strong, just inform patient
  – weak, ensure choice reflects values
• use of decision aid
  – strong, don’t bother
  – weak, use the aid
• quality of care criterion
  – strong, consider
  – weak, don’t consider
USPSTF documents
  – clinical summary
  – supporting article (decision analysis)
  – evidence summary
  – recommendations
For women age 40 – 49 we suggest NOT screening. (Weak recommendation, moderate quality evidence. 2B)
Values and Preferences: This recommendation places a relatively low value on a very small, uncertain mortality decrease and reflects concerns with false positive results, unnecessary biopsies, and unnecessary diagnosis of breast cancer. Women who place a higher value on a small reduction in breast cancer mortality and are less concerned about the undesirable consequences will choose screening.
For women age 50 – 74 we suggest screening. (Weak recommendation, moderate quality evidence. 2B)
Women who do not place a high value on a small reduction in breast cancer mortality and are concerned about false positive results, unnecessary biopsies, and unnecessary diagnosis of breast cancer will decline screening.
Using GRADE there is no “insufficient” category: either make no recommendation, or recommend for or against.
Breast self-examination: We recommend against breast self-examination. (strong recommendation, high/moderate quality evidence)
Weak recommendation
• practice will vary – according to what?
• interpretation of evidence
  – clopidogrel in stroke
• patients’ values and preferences
  – atrial fibrillation
• inclination to gamble (risk aversion)
  – HRT
When evidence is low quality
• choice more preference dependent
• risk aversion
• steroids for pulmonary fibrosis
  – low quality evidence in support of benefit
  – high quality evidence of toxicity
When evidence is low quality
• recommendation to the hopeful patient
  – I’m likely to deteriorate
  – if something might work, let’s try it
  – damn the torpedoes
• recommendation to the fearful patient
  – doctor, you mean you know it’s toxic
    • diabetes, skin changes, body habitus, infection, osteoporosis
  – you don’t know for sure it works?
  – are you crazy?
• weak recommendation mandated
Strong recommendation when evidence is low quality?
• known benefit, strong recommendation for one of two alternatives
  – antipyretics in children with chickenpox
  – but which one: ASA or acetaminophen
• benefit: high quality evidence of equivalence
• harm: low quality evidence that harm differs appreciably
  – Reye syndrome from ASA
• strong recommendation for acetaminophen
Strong recommendation when evidence is low quality?
• Blastomycosis
  – low quality evidence amphotericin more effective than itraconazole
  – high quality evidence more toxic
• patients with life threatening blasto
  – life and death situation
  – strong recommendation for ampho
Strong recommendation when evidence is low quality?
• head to toe CT scanning
  – prevent cancer deaths
• very low quality evidence of benefits
• moderate quality evidence re risks, high re costs
• strong recommendation against
Presentation
• strong and weak
  – discomfort with “weak”
  – alternative wording: discretionary, conditional
• strong
  – “we recommend…”
• discretionary
  – “we suggest…”
• never
  – we recommend (or suggest) you consider…
• always: quality of evidence and grade
When (not to) GRADE
“good to remind/alert”
• if no systematic review undertaken
• no sensible person would consider contrary
  – We recommend that the patient, and the clinician responsible for the patient’s care, should be made aware of any change in a prescribed medication, including change to a generic drug
• very general (not sufficiently specific)
  – We suggest that long-term maintenance immunosuppression be tailored to individual patient’s adverse events or risk of adverse events
Explicit comparator
• we recommend hourly urine volume measurement for at least 24 hours
  – in contrast to every 2 hours, every 3…?
• we suggest measuring serum creatinine in all KTRs at least
  – daily for 7 days
  – 2 to 3 X per week for weeks 2 to 4
  – every 2 weeks for months 4 to 6
Value and preference statements
• underlying values and preferences always present
• sometimes crucial
• important to make explicit
Values and preferences
Stroke guideline: patients with TIA clopidogrel over aspirin (Grade 2B).
Underlying values and preferences: This recommendation to use clopidogrel over aspirin places a relatively high value on a small absolute risk reduction in stroke rates, and a relatively low value on minimizing drug expenditures.
Values and preferences
peripheral vascular disease: aspirin be used instead of clopidogrel (Grade 2A).
Underlying values and preferences: This recommendation places a relatively high value on avoiding large expenditures to achieve small reductions in vascular events.
Flavonoids for Hemorrhoids
• venotonic agents
  – mechanism unclear, increase venous return
• popularity
  – 90 venotonics commercialized in France
  – none in Sweden and Norway
  – France 70% of world market
• possibilities
  – French misguided
  – rest of world missing out
Systematic Review
• 14 trials, 1432 patients
• key outcome
  – risk not improving/persistent symptoms
  – 11 studies, 1002 patients, 375 events
  – RR 0.4, 95% CI 0.29 to 0.57
• minimal side effects
• is France right?
• what is the quality of evidence?
What can lower quality?
• risk of bias
  – lack of detail re concealment
  – questionnaires not validated
• indirectness – no problem
• inconsistency, need to look at the results
Review: Phlebotonics for hemorrhoids
Comparison: 01 Venotonics vs placebo
Outcome: 08 Overall improvement: no improvement/some improvement

Study or sub-category      log[RR] (SE)        Weight %   RR (random) [95% CI]

01 Up to seven days
  Chauvenet                -0.8916 (0.2376)    12.67      0.41 [0.26, 0.65]
  Cospite                  -2.2073 (0.6117)     5.51      0.11 [0.03, 0.36]
  Thanapongsathorn         -0.4308 (0.2985)    11.18      0.65 [0.36, 1.17]
  Subtotal (95% CI)                            29.36      0.37 [0.18, 0.77]
  Test for heterogeneity: Chi² = 6.92, df = 2 (P = 0.03), I² = 71.1%
  Test for overall effect: Z = 2.67 (P = 0.008)

02 Up to four weeks
  Annoni F                 -1.6094 (0.7073)     4.50      0.20 [0.05, 0.80]
  Clyne MB                 -0.9943 (0.3983)     8.94      0.37 [0.17, 0.81]
  Pirard J                 -1.1712 (0.3086)    10.94      0.31 [0.17, 0.57]
  Thanapongsathorn         -1.1087 (1.1098)     2.18      0.33 [0.04, 2.91]
  Thorp                     0.2624 (0.3291)    10.46      1.30 [0.68, 2.48]
  Titapan                  -0.8916 (0.3691)     9.56      0.41 [0.20, 0.85]
  Wijayanegara             -0.5978 (0.1375)    14.97      0.55 [0.42, 0.72]
  Subtotal (95% CI)                            61.54      0.48 [0.32, 0.72]
  Test for heterogeneity: Chi² = 13.87, df = 6 (P = 0.03), I² = 56.7%
  Test for overall effect: Z = 3.57 (P = 0.0004)

03 Further than four weeks
  Godeberg                 -1.7719 (0.3906)     9.10      0.17 [0.08, 0.37]
  Subtotal (95% CI)                             9.10      0.17 [0.08, 0.37]
  Test for heterogeneity: not applicable
  Test for overall effect: Z = 4.54 (P < 0.00001)

Total (95% CI)                                100.00      0.40 [0.29, 0.57]
Test for heterogeneity: Chi² = 28.66, df = 10 (P = 0.001), I² = 65.1%
Test for overall effect: Z = 5.14 (P < 0.00001)

[Forest plot axis: RR on a log scale from 0.001 to 1000; values below 1 favour treatment, values above 1 favour control]
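The RR and 95% CI columns of the forest plot follow from the log[RR] and SE columns. A minimal sketch of that conversion, assuming the usual normal approximation on the log scale (z = 1.96); the function name is illustrative, and the three studies are copied from the table above.

```python
import math

def rr_with_ci(log_rr, se, z=1.96):
    """Back-transform a log relative risk and its standard error
    into the RR and a 95% confidence interval."""
    rr = math.exp(log_rr)
    lower = math.exp(log_rr - z * se)
    upper = math.exp(log_rr + z * se)
    return rr, lower, upper

# (log[RR], SE) pairs as reported in the forest plot.
studies = {
    "Chauvenet": (-0.8916, 0.2376),
    "Thorp": (0.2624, 0.3291),
    "Godeberg": (-1.7719, 0.3906),
}

for name, (log_rr, se) in studies.items():
    rr, lower, upper = rr_with_ci(log_rr, se)
    print(f"{name}: RR {rr:.2f} [{lower:.2f}, {upper:.2f}]")
```

Thorp is the one study whose interval crosses 1 with a point estimate favouring control, which is part of what the inconsistency judgment has to weigh.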
Publication bias?
• size of studies
  – 40 to 234 patients, most around 100
• all industry sponsored
[Funnel plot. Review: Phlebotonics for hemorrhoids. Comparison: 01 Venotonics vs placebo. Outcome: 08 Overall improvement: no improvement/some improvement. Horizontal axis: RR (fixed), log scale from 0.001 to 1000; vertical axis: 0.0 to 1.6]
What can lower quality?
• detailed design and execution
  – lack of detail re concealment
  – questionnaires not validated
• inconsistency
  – almost all show positive effect, trend
  – heterogeneity p = 0.001; I² 65.1%
• indirectness
• imprecision
  – RR 0.4, 95% CI 0.29 to 0.57
• reporting bias
  – 40 to 234 patients, most around 100
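The I² figure quoted for inconsistency can be reproduced from the heterogeneity chi-square (Cochran's Q) and its degrees of freedom. A minimal sketch, assuming the standard definition I² = (Q − df) / Q, floored at zero; the function name is illustrative and the inputs are the values reported in the forest plot.

```python
def i_squared(q, df):
    """Percentage of variability across studies attributed to
    heterogeneity rather than chance: I^2 = (Q - df) / Q."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100

# Chi-square and df values as reported in the forest plot.
print(f"Overall:            {i_squared(28.66, 10):.1f}%")
print(f"Up to seven days:   {i_squared(6.92, 2):.1f}%")
print(f"Up to four weeks:   {i_squared(13.87, 6):.1f}%")
```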
Recommendation
• for clinician
  – offer to patient
  – don’t offer to patient?
• strength of recommendation
  – strong or weak
• for the funding body
  – publicly funded
  – not publicly funded
  – strong or weak?
Is France right?
• recommendation
  – yes
  – no, against use
• strength
  – strong
  – weak
Resource Use
• why not cost?
  – may lead to focus on cost of intervention rather than downstream resource use
  – resource use emphasizes alternative uses of resources (opportunity cost)
Resource Use just another outcome?
• yes and no
• who benefits?
  – other outcomes usually clear
  – costs borne by different payers
    • across societies and within (age)
• some argue costs aren’t relevant to clinicians when third party payer
Why resource use different
• costs vary much more than other outcomes
  – across jurisdictions
  – within jurisdictions
  – over time
• even when resource use the same, implications may differ
  – year’s supply of expensive drug
  – nurses’ salary in U.S., 6 in Poland, 30 in China
Why resource use different
• opportunity cost differs by perspective
• hospital pharmacy, fixed budget
  – new expensive drug, clear what give up
• envelope public spending
  – more on health, less on education, social services
  – will refraining from spending on drugs really mean more for other services?
  – should envelope include military spending?
Implications
• unbearable lightness of resource use
• consider balance of desirable and undesirable before considering resource use
• may decide not to consider resource use at all
  – intervention not useful
  – desirable consequences >>>> undesirable
  – relevant only when difference small
Similarities with other outcomes
• only consider important resource use
• need estimate of difference between trt and control
• explicit judgments about the quality of the evidence, special judgments
  – perspective
  – how to judge quality of evidence
  – ? use of economic model
Evidence summary
• includes quality of evidence, summary of findings
  – “balance sheet”, special form of grade profile
• resource use and not just costs
  – can judge whether resource use applicable to local setting
  – focus on costs relevant to them (pharmacy)
  – apply unit costs to local setting
Example question
• patients
  – women with pre-eclampsia
• intervention
  – intravenous magnesium
• RCT done in 33 countries
  – over 9,000 patients
• for presentation of resource use evidence need to specify perspective
  – health system
Quality assessment

Outcome         Studies     Design  Limitations  Inconsistency   Indirectness  Imprecision  No of patients  Relative effect (95% CI)  Quality
Eclampsia       Duley 2003  RCT     No           one trial only  No            No           9,992           RR 0.41 (0.29-0.58)       High
Maternal death  Duley 2003  RCT     No           one trial only  No            Imprecision  9,992           RR 0.54 (0.26-1.10)       Moderate
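Quality assessment: resources and costs per patient (costs in US $; year 2001)

Magnesium sulphate
  Simon 2005; RCT; Limitations: No; Inconsistency: one trial only; Indirectness: No; Imprecision: No; Quality: High
                Resources per patient     Costs per patient
                Placebo    MgSO4          Placebo    MgSO4
  High GNI      0          6              0          20
  Middle GNI    0          6              0          3
  Low GNI       0          6              0          5

Administration of the drug
  Simon 2005; RCT; Limitations: No; Inconsistency: one trial only; Indirectness: No; Imprecision: No; Quality: High
                Resources per patient     Costs per patient
                Placebo    MgSO4          Placebo    MgSO4
  High GNI      0          1              0          66
  Middle GNI    0          1              0          14
  Low GNI       0          1              0          8

Other hospital resources (a, b)
  Simon 2005; RCT; Limitations: No; Inconsistency: one trial only; Indirectness: No; Imprecision: large variation in resources (c); Quality: Moderate
                Resources per patient     Costs per patient
                Placebo    MgSO4          Placebo    MgSO4
  High GNI      NA         NA             12,839     12,818
  Middle GNI    NA         NA             1,412      1,416
  Low GNI       NA         NA             155        157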
Summary of findings
Columns: Outcomes; Typical control group risk; Typical absolute effect (95% CI); Relative effect (95% CI); Nr. of participants (studies); Quality of the evidence; Comments

Clinical outcomes

Eclampsia: RR 0.41 (0.29 - 0.58); 11,444 participants; ⊕⊕⊕⊕ High
  Severe: 27 per 1,000; 16 fewer per 1,000 (11 to 19)
  Not severe: 15 per 1,000; 9 fewer per 1,000 (6 to 11)

Maternal death: RR 0.54 (0.26 - 1.10); 10,795 participants; ⊕⊕⊕ Moderate (2)
  Severe: 6 per 1,000; 3 fewer per 1,000 (0.6 more to 4 fewer)
  Not severe: 3 per 1,000; 1 fewer per 1,000 (0.3 more to 2 fewer)

Side effects: 46 per 1,000 (3); 196 more per 1,000 (165 to 231); RR 5.26 (4.59 - 6.03); 9,992 participants; ⊕⊕⊕⊕ High
  Comments: Mostly flushing. Other side effects include nausea, vomiting, slurred speech, muscle weakness, dizziness, drowsiness, confusion and headache.

Resource use from the perspective of the health system
Columns: control group; difference trt vs control

Magnesium sulphate ampoules: control group 0; 6 10 ml ampoules per woman; 9,996 participants; ⊕⊕⊕⊕ High
  Cost: High GNI $20 more per patient; Middle GNI $3 more per patient; Low GNI $5 more per patient

Administration of magnesium sulphate: control group 0; 1 per woman; 9,996 participants; ⊕⊕⊕⊕ High
  Cost: High GNI $66 per patient; Middle GNI $14 per patient; Low GNI $8 per patient
  Comments: Resources for administering magnesium sulphate included midwife time (main cost), intravenous cannula/needle, syringe, IV fluids, drug.

Other hospital resources: Varied widely; 9,996 participants; ⊕⊕⊕ Moderate (5)
  Comments: There was large variation in the use of other hospital resources in both intervention and control groups.
  Cost, High GNI: $12,839; $20 less per woman (0 to 60)
  Cost, Middle GNI: $1,416; $4 less per woman (0 to 10)
  Cost, Low GNI: $157; $2 less per woman (1 to 3)
  Comments: Other hospital costs have been adjusted based on the influence of eclampsia to control for the many other factors that influenced these costs.
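The "typical absolute effect" entries in the summary of findings can be derived from the control group risk and the relative effect. A minimal sketch, assuming the absolute effect is simply control risk × (RR − 1), applied to the point estimate and to each CI limit; the function name is illustrative and the inputs are the figures in the table.

```python
def absolute_effect_per_1000(control_risk_per_1000, rr):
    """Change in events per 1,000 patients implied by a relative risk.
    Negative values mean fewer events with treatment, positive mean more."""
    return control_risk_per_1000 * (rr - 1)

# Eclampsia, severe pre-eclampsia: 27 per 1,000, RR 0.41 (0.29 - 0.58)
eclampsia = [absolute_effect_per_1000(27, rr) for rr in (0.41, 0.29, 0.58)]

# Maternal death, severe pre-eclampsia: 6 per 1,000, RR 0.54 (0.26 - 1.10)
death = [absolute_effect_per_1000(6, rr) for rr in (0.54, 0.26, 1.10)]

# Side effects: 46 per 1,000, RR 5.26 (4.59 - 6.03)
side_effects = [absolute_effect_per_1000(46, rr) for rr in (5.26, 4.59, 6.03)]

for label, values in [("eclampsia", eclampsia),
                      ("maternal death", death),
                      ("side effects", side_effects)]:
    print(label, [round(v, 1) for v in values])
```

Because the maternal death interval for RR runs past 1, the absolute interval runs from fewer deaths to slightly more, which is the imprecision that rates that outcome down to moderate.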
Issues in resource use
• broad perspective desirable
  – narrow perspectives ignore much resource use
  – users can pick costs relevant to them
  – either health care system or societal
• indirect costs controversial
• indirect evidence of resources use
  – costs only reported
  – RCT but doesn’t reflect practice
    • ulcer prevention everyone gets repeat endoscopy
Quality of evidence for resource use
• rules basically the same
  – RCTs start high, observational low
• may need multiple sources of evidence
  – RCTs may not fully report resource use
• variation across settings
  – RCT may not reflect real world
  – time frame may extend beyond trial
• different quality for different resources
  – mag sulphate versus hospital resources
Formal economic models
• limitations
  – supported by industry, biased
  – setting specific
  – reduces transparency
  – if evidence low quality, speculative
  – often many assumptions
• solution: develop own model
  – OK if you are NICE with lots of resources
• even so, don’t include in profile
Costs versus affordability
• intervention may be “cost-effective”
  – $10,000 per QALY gained
• but if applicable to huge proportion of population, may still be unaffordable
healthy asymptomatic postmenopausal women: HRT in 1992?
Possible benefits
  – CHD, Hip fracture, Colorectal cancer
Possible harms
  – Breast cancer
  – Stroke
  – Thrombosis
  – Gall bladder disease
Can GRADE lead to change?
Evidence profile: Quality assessment
Oestrogen + progestin for prevention in 1992 (before WHI and HERS)
Oestrogen + progestin versus usual care
Oestrogen + progestin for prevention after WHI and HERS
Postulate
• major work in preparing guideline/HTA assessment is systematic review
• If already doing this, GRADE framework should add little
• history: Rolls-Royce and Volkswagen
VW and RR approaches
• Rolls Royce (NICE)
  – systematic review for every outcome
  – production of evidence profiles
  – involvement of multiple constituencies
    • including patients
  – inclusion of economic analysis
• cost $1 million per guideline
MOPED GRADE
• UpToDate
  – 5,000 graded recommendations
• generate PICO (informal)
  – no formal rating of outcome importance
• use of existing reviews, primary studies
  – no new evidence syntheses
• quality for key outcomes
  – 5 reasons rating down, 3 up
  – no new evidence profiles, SoF tables
• recommendations
  – strong or weak, consider 3 factors
  – value and preference statements
ACCP
• formal structured questions
• no formal rating of outcome importance
  – trying to change
• hit-and-miss systematic reviews
  – largely only available ones
• hit-and-miss individual study evidence summaries
• rare evidence profiles
  – trying to change
VW approach
• take systematic reviews if available
• if not, review key, accessible evidence
• no meta-analysis if not done
• no evidence profiles
• small group make expert judgement
Conclusion
• clinicians, policy makers need summaries
  – quality of evidence
  – strength of recommendations
• explicit rules
  – transparent, informative
• GRADE
  – simple, transparent, systematic
  – increasingly wide adoption