
Plan
• GRADE background
• two steps
  – quality of evidence
  – strength of recommendation
• quality and strength can differ
• profiles and summary of findings
• importance of values/preferences

Plan
• GRADE background
• two steps
  – quality of evidence
  – strength of recommendation
• evidence profiles
• an exercise in applying GRADE

Plan
• GRADE background
• two steps
  – quality of evidence
  – strength of recommendation
• quality and strength can differ
• profiles and summary of findings
• application to breast cancer screening
  – contrast with USPSTF

Summarizing recommendations
• clinicians need succinct summaries
• should include
  – quality of evidence
  – summaries of best estimates of effect
    • all patient-important outcomes
  – strength of recommendations
• GRADE working group
  – BMJ 2004 and 2008

• Is grading recommendations a good idea?
• Why?
• experience with grading
  – systems used?

Why Grade Recommendations?
• strong recommendations
  – strong methods
  – large precise effect
  – few down sides of therapy
• weak recommendations
  – weak methods
  – imprecise estimate
  – small effect
  – substantial down sides

Which grading system to use?
• many available
  – Australian National Health and Medical Research Council (NHMRC)
  – Oxford Centre for Evidence-based Medicine
  – Scottish Intercollegiate Guidelines Network (SIGN)
  – US Preventive Services Task Force
  – American professional organizations
    • AHA/ACC, ACCP, AAP, Endocrine Society, etc.
• cause of confusion, dismay

A common international grading system?
• GRADE (Grades of Recommendation, Assessment, Development and Evaluation)
• international group
  – Australian NHMRC, SIGN, USPSTF, WHO, NICE, Oxford CEBM, CDC, CC
• ~ 25 meetings over last ten years
• (~10 to 50 attendees)

GRADE Uptake
Agencia sanitaria regionale, Bologna, Italia
Agency for Health Care Research and Quality (AHRQ)
Allergic Rhinitis and Group - Independent Expert Panel
American Association for the Study of Liver Diseases
American College of Cardiology Foundation
American College of Chest Physicians
American College of Emergency Physicians
American College of Physicians
American Endocrine Society
American Society of Gastrointestinal Endoscopy
American Society of Interventional Pain Physicians
American Thoracic Society (ATS)
BMJ Clinical Evidence
British Medical Journal
Canadian Agency for Drugs and Technology in Health
Canadian Cardiovascular Society
Canadian Task Force on Preventive Health Care
Centers for Disease Control
Cochrane Collaboration
EBM Guidelines Finland
Emergency Medical Services for Children National Resource Center
European Association for the Study of the Liver
European Respiratory Society
European Society of Thoracic Surgeons
Evidence-based Nursing Sudtirol, Alto Adige, Italy
Finnish Office of Health Technology Assessment
German Agency for Quality in Medicine
Health Inspectorate for Scotland
Infectious Disease Society of America
Japanese Society of Oral and Maxillofacial Radiology
Joslin Diabetes Center
Journal of Infection in Developing Countries
Kaiser Permanente
Kidney Disease International Guidelines Organization
National and Gulf Centre for Evidence-based Medicine
National Institute for Clinical Excellence (NICE)
National Kidney Foundation
Norwegian Knowledge Centre for the Health Services
Ontario MOH Medical Advisory Secretariat
Panama and Costa Rica National Clinical Guidelines Program
Polish Institute for EBM
Scottish Intercollegiate Guideline Network (SIGN)
Society of Critical Care Medicine
Society of Pediatric Endocrinology
Society of Vascular Surgery
Spanish Society of Family Practice (SEMFYC)
Stop TB Diagnostic Working Group
Surviving Sepsis Campaign
Swedish Council on Technology Assessment in Health Care
Swedish National Board of Health and Welfare
University of Pennsylvania Health System for EB Practice
UpToDate
WINFOCUS
World Allergy Organization
World Health Organization (WHO)

What are we grading?
• two components
• quality of body of evidence
  – extent to which confidence in estimate of effect adequate to support decision
    • high, moderate, low, very low
• strength of recommendation
    • strong and weak

What are we grading?
• two components
• quality of body of evidence
  – confidence in estimate of effect
    • high, moderate, low, very low
• strength of recommendation
    • strong and weak

Interpretation of quality
• High quality: Further research is very unlikely to change our confidence in the estimate of effect
• Moderate quality: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate
• Low quality: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate
• Very low quality: Any estimate of effect is very uncertain

Interpretation of quality
• High: We are very confident that the true effect lies close to that of the estimate of the effect.
• Moderate: We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
• Low: Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect.
• Very low: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect.

[Flow diagram: the GRADE process]
Health care question (PICO) → systematic reviews → studies (S1 to S5) → outcomes (OC1 to OC4) → important outcomes and critical outcomes
Generate an estimate of effect for each outcome
Rate the quality of evidence for each outcome, across studies
• RCTs start high, observational studies start low
• (-) Study limitations; Imprecision; Inconsistency of results; Indirectness of evidence; Publication bias likely
• (+) Large magnitude of effect; Dose response; Plausible confounders would ↓ effect when an effect is present or ↑ effect if effect is absent
Final rating of quality for each outcome: high, moderate, low, or very low
Rate overall quality of evidence (lowest quality among critical outcomes)
Decide on the direction (for/against) and grade strength (strong/weak*) of the recommendation considering:
• Quality of the evidence
• Balance of desirable/undesirable outcomes
• Values and preferences
Decide if any revision of direction or strength is necessary considering: Resource use
*also labeled "conditional" or "discretionary"
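The rating logic in the diagram (start level by study design, move down for limitations, move up for the three raising factors, then take the lowest rating among critical outcomes) can be sketched in a few lines. This is an illustration only: the function names, the one-step-per-factor scoring, and the choice of which outcomes count as critical are assumptions made for the example, and real GRADE ratings are judgments rather than arithmetic.

```python
# Quality levels ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]


def rate_outcome(design, downgrades=0, upgrades=0):
    """Rate one outcome across studies.

    design: "rct" starts high; anything else is treated as observational
    and starts low. downgrades/upgrades are counts of levels (hypothetical
    simplification: each serious concern moves the rating one level).
    """
    start = LEVELS.index("high") if design == "rct" else LEVELS.index("low")
    score = start - downgrades + upgrades
    score = max(0, min(len(LEVELS) - 1, score))  # clamp to the four levels
    return LEVELS[score]


def overall_quality(ratings, critical_outcomes):
    """Overall quality is the lowest rating among the critical outcomes."""
    return min((ratings[name] for name in critical_outcomes), key=LEVELS.index)


# Hypothetical ratings for the four outcomes named in the diagram.
ratings = {
    "OC1": rate_outcome("rct"),                       # no concerns
    "OC2": rate_outcome("observational"),             # not critical here
    "OC3": rate_outcome("rct", downgrades=1),         # e.g. imprecision
    "OC4": rate_outcome("observational", upgrades=1)  # e.g. large effect
}
overall = overall_quality(ratings, ["OC1", "OC3", "OC4"])
print(ratings, overall)
```

Note that OC2, rated low, does not pull the overall rating down in this example because it was not designated critical.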

Structured question
• patients: lymphoma patients at risk of developing chemotherapy-induced febrile neutropenia
• intervention: granulocyte colony-stimulating factor (G-CSF)
• alternative: not using G-CSF

Structured question
• patients:
  – women considering breast cancer screening
  – age 40 to 49; 50 to 74; > 75
  – not at increased risk from a genetic mutation or chest radiation
• intervention
  – film mammography
• alternative
  – no screening

Need to define all patient-important outcomes and evaluate their importance
• desirable consequences
  – reduction in breast cancer mortality
• undesirable consequences
  – false positive screening results
  – invasive procedures from positive results
  – complications of invasive procedures
  – unnecessary diagnosis and treatment

Determinants of quality
• RCTs start high
• observational studies start low
• what can lower quality?
  – detailed design and execution
  – inconsistency
  – indirectness
  – imprecision
  – reporting bias

• 5 limitations can lower quality
• detailed design and execution
  – concealment, blinding, loss to follow-up
• inconsistency
  – variability in results (heterogeneity)
• publication bias

Determinants of quality
• RCTs start high
• observational studies start low
• 5 limitations can lower quality
• Bias
  – detailed design and execution
    • concealment, blinding, loss to follow-up
  – publication bias
• Imprecision
  – wide confidence intervals

• limitations can lower quality?
• Bias
  – detailed design and execution
    • concealment, blinding, loss to follow-up
• Imprecision
  – wide confidence intervals

Determinants of quality
• RCTs start high
• observational studies start low
• limitations can lower quality?

Determinants of quality
• 5 limitations can lower quality
• risk of bias
  – concealment, blinding, loss to follow-up
• imprecision
• indirectness
• publication bias

Determinants of quality
• 5 limitations can lower quality
• risk of bias
  – concealment, blinding, loss to follow-up
• imprecision
• publication bias

Risk of Bias
• well established
  – concealment
  – intention to treat principle observed
  – blinding
  – completeness of follow-up
• more recent
  – early stopping for benefit
  – selective outcome reporting bias

Breast cancer risk of bias
• most trials not concealed
• blinding
  – ? adjudication of outcome
  – no other blinding
• ? loss to follow-up
• all trials rated as "fair" by USPSTF

Consistency of results
• if inconsistency, look for explanation
  – patients, intervention, outcome, methods
• judgment of consistency
  – variation in size of effect
  – overlap in confidence intervals
  – statistical significance of heterogeneity
  – I²

Relative Risk with 95% CI for Vitamin D, Non-vertebral Fractures
Chapuy et al. (1994) 0.79 (0.69, 0.92)
Lips et al. (1996) 1.10 (0.87, 1.39)
Dawson-Hughes et al. (1997) 0.46 (0.24, 0.88)
Pfeifer et al. (2000) 0.48 (0.13, 1.78)
Meyer et al. (2002) 0.92 (0.68, 1.24)
Chapuy et al. (2002) 0.85 (0.64, 1.13)
Trivedi et al. (2003) 0.67 (0.46, 0.99)
Pooled Random Effect Model 0.82 (0.69 to 0.98)
p = 0.05 for heterogeneity, I² = 53%
[Forest plot: relative risk (95% CI) on a log axis from 0.1 to 10; left favours vitamin D, right favours control]
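To make the pooled estimate and I² on the vitamin D plot concrete, here is a minimal sketch that pools the seven relative risks listed on the slide. It assumes a DerSimonian-Laird random-effects model and back-calculates each standard error from the reported 95% CI; the slide does not say which method was used, so the output should land near, but not necessarily exactly on, the slide's 0.82 (0.69 to 0.98) and I² = 53%.

```python
import math

# (study, RR, lower 95% limit, upper 95% limit) as listed on the slide.
STUDIES = [
    ("Chapuy 1994", 0.79, 0.69, 0.92),
    ("Lips 1996", 1.10, 0.87, 1.39),
    ("Dawson-Hughes 1997", 0.46, 0.24, 0.88),
    ("Pfeifer 2000", 0.48, 0.13, 1.78),
    ("Meyer 2002", 0.92, 0.68, 1.24),
    ("Chapuy 2002", 0.85, 0.64, 1.13),
    ("Trivedi 2003", 0.67, 0.46, 0.99),
]


def pool_random_effects(studies, z=1.96):
    """DerSimonian-Laird pooling on the log relative risk scale.

    Returns (pooled RR, lower limit, upper limit, I-squared in percent).
    """
    y = [math.log(rr) for _, rr, _, _ in studies]
    # Standard error recovered from the width of the CI on the log scale.
    se = [(math.log(hi) - math.log(lo)) / (2 * z) for _, _, lo, hi in studies]
    w = [1 / s ** 2 for s in se]

    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(studies) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)            # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    w_re = [1 / (s ** 2 + tau2) for s in se]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    return (math.exp(pooled),
            math.exp(pooled - z * se_pooled),
            math.exp(pooled + z * se_pooled),
            i2)


rr, low, high, i2 = pool_random_effects(STUDIES)
print(f"pooled RR {rr:.2f} ({low:.2f} to {high:.2f}), I2 = {i2:.0f}%")
```

Passing only the higher-dose trials or only the 400-unit trials to the same function is one way to explore the subgroup question raised on the following slides.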

Relative Risk with 95% CI for Vitamin D (Non-Vertebral Fractures, Dose > 400)
Chapuy et al. (1994) 0.79 (0.69, 0.92)
Dawson-Hughes et al. (1997) 0.46 (0.24, 0.88)
Pfeifer et al. (2000) 0.48 (0.13, 1.78)
Chapuy et al. (2002) 0.85 (0.64, 1.13)
Trivedi et al. (2003) 0.67 (0.46, 0.99)
Pooled Random Effect Model 0.75 (0.63 to 0.89)
p = 0.26 for heterogeneity, I² = 24%
[Forest plot: relative risk on a log axis from 0.1 to 10]



Relative Risk with 95% CI for Vitamin D (Non-Vertebral Fractures, Dose = 400)
Lips et al. (1996) 1.10 (0.87, 1.39)
Meyer et al. (2002) 0.92 (0.68, 1.24)
Pooled Random Effect Model 1.03 (0.86 to 1.24)
p = 0.35 for heterogeneity, I² = 0%
[Forest plot: relative risk on a log axis from 0.1 to 10]



Meta-analysis of women 39 to 49 years of age

Should we believe subgroup analysis?
• within-study comparison? No
• large difference in effect: Borderline
• unlikely chance: Yes, p = 0.006
• consistent across studies: Yes
• a priori hypothesis: Yes
• one of small number of hypotheses: Yes
• biologically compelling: Yes
• shall we believe subgroup analysis?

Directness of Evidence
• differences in
  – patients
  – interventions
  – comparators
• differences in outcomes
  – surrogates

Quality judgments: Directness
• populations
  – older, sicker or more co-morbidity
• interventions
  – new statins versus old
• outcomes
  – important versus surrogate outcomes
  – glucose control versus CV events

Figure 6: Hierarchy of outcomes according to their patient-importance to assess the effect of phosphate lowering drugs in patients with renal failure and hyperphosphatemia
Importance of endpoints (scale 1 to 9)
• Critical for decision making (9, 8, 7): Mortality 9; Myocardial infarction 8; Fractures 7
• Important, but not critical for decision making (6, 5, 4): Pain due to soft tissue calcification / function 6
• Of low patient-importance (3, 2, 1): Flatulence
Surrogates of declining importance
• Coronary calcification, then Ca2+/P-product
• Bone density, then Ca2+/P-product
• Soft tissue calcification, then Ca2+/P-product
[Figure labels: lower by one level for indirectness; lower by two levels for indirectness]

Directness
interested in A versus B
available data A vs C, B vs C
[Diagram: Alendronate and Risedronate, each compared with Placebo]

Issues in directness: breast cancer screening
• estimated screening effects in over 75
  – could also use observational studies
• no direct comparisons of screening intervals

Imprecision
• small sample size
  – small number of events
• wide confidence intervals
  – uncertainty about magnitude of effect
• primary criterion:
  – would decisions differ at ends of CI

Meta-analysis of women 39 to 49 years of age

25% RRR, reduction in breast cancer deaths 1/1,000

4% RRR, reduction in breast cancer deaths 1/6,000
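The two annotations above translate the ends of the confidence interval into absolute terms, which is what the "would decisions differ at ends of CI" criterion asks for. A small sketch of that conversion follows. The baseline risk of 4 breast cancer deaths per 1,000 women is an assumption back-calculated from the slide's own pairing of 25% RRR with 1 in 1,000; it is not stated in the source.

```python
def absolute_reduction(baseline_risk, rrr):
    """Absolute risk reduction implied by a relative risk reduction."""
    return baseline_risk * rrr


def one_in(arr):
    """Express an absolute reduction as '1 death prevented per N screened'."""
    return round(1 / arr)


# Assumption: baseline risk of breast cancer death without screening.
BASELINE = 0.004

optimistic = one_in(absolute_reduction(BASELINE, 0.25))   # 25% RRR end of CI
pessimistic = one_in(absolute_reduction(BASELINE, 0.04))  # 4% RRR end of CI
print(f"1 in {optimistic:,} versus 1 in {pessimistic:,}")
```

Under this assumed baseline the pessimistic end works out to roughly 1 in 6,250, which the slide rounds to 1 in 6,000. If a decision maker would screen at 1 in 1,000 but not at 1 in 6,000, the estimate is imprecise for that decision.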

Imprecision
• small sample size
  – small number of events
• wide confidence intervals
  – uncertainty about magnitude of effect
• problems
  – analogy to stopping early
  – lack of prognostic balance
• solution: optimal information size
  – # of pts from conventional sample size calculation
  – specify CER, effect, α, β, Δ

Fluoroquinolone prophylaxis in neutropenia: infection-related mortality
Total number of events: 47
sample size 1,002
α 0.05, β 0.20, Δ 0.25, N = 6,000
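The optimal information size is the number of patients a conventional two-group sample size calculation would ask for. A minimal sketch using the standard normal-approximation formula for comparing two proportions is shown below. The slide gives α, β and Δ but not the control event rate (CER); the 7% used here is a hypothetical value chosen because it gives a total close to the slide's N = 6,000, and Δ is read as a 25% relative risk reduction.

```python
import math
from statistics import NormalDist


def ois_per_group(cer, rrr, alpha=0.05, beta=0.20):
    """Patients needed per group to detect a relative risk reduction `rrr`
    from a control event rate `cer` (two-sided alpha, power 1 - beta)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(1 - beta)
    p1 = cer                  # event rate in the control group
    p2 = cer * (1 - rrr)      # event rate in the treated group
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)


# Hypothetical CER of 7%; alpha, beta and the 25% effect come from the slide.
total = 2 * ois_per_group(cer=0.07, rrr=0.25)
print(f"optimal information size is about {total:,} patients")
```

Against a requirement of this order, a pooled sample of 1,002 patients falls well short, which is the slide's argument for rating down despite an apparently favourable estimate.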

Stroke – Fixed Effects
Study | Year | Overall Event Rate | Relative Risk (95% CI)
Wallace | 1998 | 5 / 200 | 3.06 (0.49 to 19.02)
Pobble | 2005 | 1 / 103 | 2.63 (0.11 to 62.97)
DIPOM | 2006 | 2 / 921 | 4.97 (0.24 to 103.19)
MaVS | 2006 | 6 / 496 | 1.83 (0.39 to 8.50)
Zaugg | 2007 | 1 / 119 | 2.97 (0.12 to 72.19)
POISE | 2007 | 60 / 8351 | 2.13 (1.25 to 3.64)
Fixed Effects Estimate | | | 2.22 (1.39 to 3.56)
p = 0.99 for heterogeneity, I² = 0%
[Forest plot: relative risk on a log axis from 0.1 to 100]

Downgrading for precision
• if OIS not met, rate down for imprecision
  – unless very large sample size (? > 1,000 per group)
• if OIS met and CI excludes RR = 1, don't downgrade
• if OIS met and CI includes RR = 1, downgrade only if CI also includes RR < 0.75 or > 1.25
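The three rules on this slide form a small decision procedure. The sketch below encodes one reading of them: the last rule is interpreted as "rate down when the confidence interval crosses RR = 1 and also reaches appreciable benefit (RR < 0.75) or appreciable harm (RR > 1.25)". That interpretation, the function name, and the `very_large_sample` flag are assumptions for illustration.

```python
def rate_down_for_imprecision(ois_met, ci_low, ci_high,
                              very_large_sample=False,
                              benefit=0.75, harm=1.25):
    """Return True if the evidence should be rated down for imprecision.

    ois_met: whether the optimal information size is reached.
    ci_low, ci_high: limits of the 95% CI for the relative risk.
    very_large_sample: the slide's tentative exception (> 1,000 per group).
    """
    if not ois_met:
        # Rule 1: too little information, unless the sample is very large.
        return not very_large_sample
    if ci_low > 1 or ci_high < 1:
        # Rule 2: OIS met and the CI excludes no effect.
        return False
    # Rule 3: CI includes no effect; rate down only if it is also wide
    # enough to include appreciable benefit or appreciable harm.
    return ci_low < benefit or ci_high > harm


print(rate_down_for_imprecision(True, 0.83, 0.99))   # CI excludes 1
print(rate_down_for_imprecision(True, 0.98, 1.55))   # includes 1 and > 1.25
print(rate_down_for_imprecision(True, 0.90, 1.10))   # narrow around 1
print(rate_down_for_imprecision(False, 0.83, 0.99))  # OIS not met
```

The thresholds 0.75 and 1.25 are the slide's defaults; a guideline panel could pass different `benefit` and `harm` values when smaller effects matter.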

Precision
• atrial fib at risk of stroke
• warfarin increases serious GI bleeding
  – 3% per year
• 1,000 patients, 1 fewer stroke
  – 30 more bleeds for each stroke prevented
• 1,000 patients, 100 fewer strokes
  – 3 strokes prevented for each bleed
• where is your threshold?
  – how many strokes in 100 with 3% bleeding?
[Scale: 0, 0.5%, 1.0%]
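The arithmetic behind the two scenarios is simple enough to check directly. The sketch below takes the slide's 3% yearly bleeding rate and a hypothetical number of strokes prevented per 1,000 patients, and returns the bleeds caused per stroke prevented; the function name is an assumption for the example.

```python
def bleeds_per_stroke_prevented(strokes_prevented_per_1000, bleed_rate=0.03):
    """Serious bleeds caused for each stroke prevented, per 1,000 treated."""
    bleeds_per_1000 = bleed_rate * 1000
    return bleeds_per_1000 / strokes_prevented_per_1000


low_benefit = bleeds_per_stroke_prevented(1)     # 1 fewer stroke per 1,000
high_benefit = bleeds_per_stroke_prevented(100)  # 100 fewer strokes per 1,000

print(f"{low_benefit:.0f} bleeds per stroke prevented")
print(f"{1 / high_benefit:.1f} strokes prevented per bleed")
```

The threshold question on the slide amounts to finding the value of `strokes_prevented_per_1000` at which a patient judges the trade acceptable; the precision of the estimate matters only if the confidence interval straddles that value.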

Example: clopidogrel or ASA?
• pts with threatened stroke
• RCT of clopidogrel vs ASA
  – 19,185 patients
• ischaemic stroke, MI, or vascular death compared
  – 939 events (5.32%) with clopidogrel
  – 1021 events (5.83%) with aspirin
• RR 0.91 (95% CI 0.83 – 0.99) (p = 0.043)
• downgrade for precision?
[Scale: 0, 1.0%]

Clopidogrel or ASA for threatened vascular events

RCT 19,185 patients

1.7% - 0.9 – 0.1%

RR 0.91 (95% CI 0.83 – 0.99)

0

Non-inferiority

Avoiding misleading conclusions
• analogy
  – SR accumulating data just as single trial
• risk of spurious effect by stopping early
• how to avoid
  – insist on sufficient data
  – optimal information size

Avoiding misleading conclusions
• optimal information size
  – # of pts from conventional sample size calculation
  – specify CER, effect, α, β
• alternative, number of events
  – ? 300 ?

Criteria for NOT downgrading
• CI narrow enough to permit confident recommendation for or against
• if positive benefit outcome, safeguard against false positive
  – OIS or number-of-events threshold met (300)

Publication bias
• high likelihood could lower quality
• when to suspect
  • number of small studies
  • industry sponsored

What can raise quality?
• large magnitude can upgrade one level
  – very large two levels
• common criteria
  – everyone used to do badly
  – almost everyone does well
  – quick action
• hip replacement for hip osteoarthritis
• mechanical ventilation in respiratory failure

Dose-response gradient
• childhood lymphoblastic leukemia
• risk for CNS malignancies 15 years after cranial irradiation
  • no radiation: 1% (95% CI 0% to 2.1%)
  • 12 Gy: 1.6% (95% CI 0% to 3.4%)
  • 18 Gy: 3.3% (95% CI 0.9% to 5.6%)

Observational study starts low. What can move up?
• plausible confounders strengthen association
• for-profit (FP) hospitals have higher mortality than not-for-profit (NFP) hospitals
  – NFP treat sicker patients
  – FP treat wealthier patients

Quality assessment criteria

Overall level of evidence
• most systems just use evidence about primary benefit outcome
• but what about others (risk)?
• what to do?
• options
  – ignore all but primary
  – weakest of any outcome
  – some blended approach
  – weakest of critical outcomes

Beta blockers in non-cardiac surgery
Quality Assessment and Summary of Findings
Outcome | Number of participants (studies) | Risk of Bias | Consistency | Directness | Precision | Reporting Bias | Quality | Relative Effect (95% CI) | Absolute risk difference
Myocardial infarction | 10,125 (9) | No serious limitations | No serious limitations | | | Not detected | High | 0.71 (0.57 to 0.86) | 1.5% fewer (0.7% fewer to 2.1% fewer)
Mortality | 10,205 (7) | | Possibly inconsistent | No serious limitations | Imprecise | Not detected | Moderate or low | 1.23 (0.98 to 1.55) | 0.5% more (0.1% fewer to 1.3% more)
Stroke | 10,889 (5) | No serious limitations | | | Possible imprecision | Not detected | Moderate or High | 2.21 (1.37 to 3.55) | 0.5% more (0.2% more to 1.3% more)

Pressure limited ventilation
Quality Assessment and Summary of Findings
Outcome | No. of patients (studies) | Risk of Bias | Inconsistency | Indirectness | Imprecision | Publication Bias | Quality | Relative Risk (95% CI), p-value | Illustrative risks: example control rate | Illustrative risks: associated risk with PVL
Hospital mortality | 1,664 (9) | Inability to blind. 2 trials stopped early with few events and large effects; were also confounded by 'open lung' strategies. | p = 0.07, I² = 45.6%. Varied populations, interventions. Not robust in sensitivity analyses | Direct | Precise | Undetected | Moderate (due to inconsistency) | 0.82 (0.68 – 0.99), p = 0.04 | 40% | 32.8% (27.2 – 39.6)
Barotrauma | 1,497 (7) | Inability to blind. | p = 0.24, I² = 25.3%. Varied populations, interventions | Direct | Imprecise | Undetected | Moderate (due to imprecision) | 0.90 (0.66 – 1.24), p = 0.53 | NS | NS
Paralysis | 1,202 (5) | Inability to blind. | p = 0.004, I² = 59%. Varied populations, interventions, measurements | Direct | Precise | Not assessed | Moderate (due to inconsistency) | 1.37 (1.04 – 1.82), p = 0.03 | 30% | 41.1% (31.2 – 54.6)
Dialysis | 173 (2) | Inability to blind. | p = 0.26, I² = 22.8%. Varied populations, interventions | Direct | Imprecise | Not assessed | Moderate (due to imprecision) | 1.76 (0.79 – 3.90), p = 0.16 | NS | NS

Zoster vaccine
Quality Assessment and Summary of Findings
Outcome | No. of patients (studies) | Risk of Bias | Inconsistency | Indirectness | Imprecision | Publication Bias | Quality | Relative Risk (95% CI), p-value | Illustrative risks: control rate | Illustrative risks: vaccinated rate
Zoster episodes | 38,546 (1) | No serious risk | only one study | Direct | Precise | Undetected | High | not reported | 11.12 per 1,000 patient-years | 5.42 (difference 5.7 per 1,000 pt-years, p < 0.001)
Post-herpetic neuralgia | 38,546 (1) | No serious risk | only one study | Direct | Precise | Undetected | High | not reported | 1.38 per 1,000 patient-years | 0.46 (difference 0.92 per 1,000 pt-years, p < 0.001)
Serious adverse events | 38,546 (1) | No serious risk | only one study | Direct | Precise | Undetected | High | Not reported | 13 per 1,000 | 19 (difference 6 per 1,000)

High versus low PEEP in ALI and ARDS
Population | No. of participants (trials) † | Higher PEEP | Lower PEEP | Adjusted Relative Risk (95% CI; P-value) ‡ | Adjusted Absolute Risk Difference (95% CI) | Quality
Patients with ARDS | 1892 (3) | 324/951 (34.1%) | 368/941 (39.1%) | 0.90 (0.81 to 1.00; 0.049) | -3.9% (-7.4% to -0.04%) | High
Patients without ARDS | 404 (3) | 50/184 (27.2%) | 41/220 (18.6%) | 1.37 (0.98 to 1.92; 0.065) | 6.9% (-0.4% to 17.1%) | Moderate (imprecision)

Whipple's procedure for pancreatic cancer, with or without duodenectomy

Patients or population: Anyone taking a long flight (lasting more than 6 hours)
Settings: International air travel
Intervention: Compression stockings¹
Comparison: Without stockings

Illustrative comparative risks* (RANGE OF UNCERTAINTY)
Outcomes | Assumed risk: without stockings | Corresponding risk: with stockings (95% CI) | Relative effect (95% CI) | Number of participants (studies) | Quality of the evidence (GRADE) | Comments
Symptomatic deep vein thrombosis (DVT) | See comment | See comment | Not estimable | 2637 (9 studies) | See comment | 0 participants developed symptomatic DVT in these studies.
Symptomatic deep vein thrombosis, surrogate symptomless deep vein thrombosis: low risk population² | 5 per 10,000 | 0.5 per 10,000 (0 to 1) | RR 0.10 (0.04 to 0.25) | 2637 (9 studies) | ⊕⊕⊕ Moderate³ | Estimates of control group asymptomatic thrombosis from the primary studies range from 15 per 1,000 in low risk patients to 25 per 1,000 in high risk patients
Symptomatic deep vein thrombosis, surrogate symptomless deep vein thrombosis: high risk population² | 18 per 10,000 | 1.8 per 10,000 (0.5 to 4) | | | |
Superficial vein thrombosis | 13 per 1000 | 6 per 1000 (2 to 15) | RR 0.45 (0.18 to 1.13) | 1804 (8 studies) | ⊕⊕⊕ Moderate⁴ |
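The "corresponding risk" column in a summary of findings table is the assumed risk multiplied by the relative effect, with the same multiplication applied to the CI limits. A minimal sketch, using the superficial vein thrombosis row from the table; the function name is an assumption for the example.

```python
def corresponding_risk(assumed_risk, rr, rr_low, rr_high):
    """Risk with the intervention, and its CI, from an assumed control risk.

    assumed_risk is a count per fixed denominator (for example per 1,000);
    the result is expressed per the same denominator.
    """
    return (assumed_risk * rr, assumed_risk * rr_low, assumed_risk * rr_high)


# Superficial vein thrombosis: 13 per 1,000 without stockings, RR 0.45
# (0.18 to 1.13), as given in the table.
point, low, high = corresponding_risk(13, 0.45, 0.18, 1.13)
print(f"{point:.0f} per 1,000 ({low:.0f} to {high:.0f})")
```

The same calculation applied to the low and high risk DVT rows gives point estimates of 0.5 and 1.8 per 10,000; the table's interval limits for those rows appear to have been rounded more coarsely.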

Diagnostic tests

• same logic as for treatment

• judges quality of evidence NOT for accuracy, but for change in patient-important outcome

• ideally establish through RCT– focus on patient-important outcome– screening RCTs (breast cancer, colon cancer)

• most of time not available– new complexities, process in evolution

Study designs

Study designs II

Test accuracy is a surrogate for patient important outcomes

• When clinicians think about diagnostic tests, they focus on their accuracy

• Underlying assumption: obtaining a better idea of whether a target condition is present or absent will result in superior patient management and improved outcome.

Example of new test and reference test or strategy: Helical CT for renal calculus compared with intravenous pyelogram
Putative benefit of new test: Detection of more (but smaller) calculi
Diagnostic accuracy: sensitivity greater, specificity equal

Patient outcomes and expected impact on management for the following test outcomes
Test outcome | Presumed influence on patient-important outcomes | Directness of the evidence (test results) for patient-important outcomes
True positives | Certain benefit for larger stones; for smaller stones the benefit is less clear and unnecessary treatment can result | Some uncertainty
False positives | Likely detriment from unnecessary additional invasive tests | No uncertainty
True negatives | Almost certain benefit from avoiding unnecessary tests | No uncertainty
False negatives | Likely detriment for large stones, less certain for small stones. More testing | Major uncertainty

Balance between presumed patient outcomes, complications and cost: Fewer complications and downsides compared to IVP would support the new test's usefulness, but the balance between desirable and undesirable effects is not clear in view of the uncertain consequences of identifying smaller stones.

Strength of recommendations
• degree of confidence that desirable effects of adhering to recommendation outweigh undesirable effects.
• strong recommendation
  – benefits clearly outweigh risks/hassle/cost
  – risk/hassle/cost clearly outweighs benefit

Strength of recommendations
• degree of confidence that desirable effects of adhering to recommendation outweigh undesirable effects.
• what can downgrade strength?

Strength of Recommendation
• what can downgrade strength?
• low quality evidence
• close balance between up and downsides

Grades TranslationsStrong recommendations• Just do it• Virtually all well informed individuals would want

the intervention and only a small proportion would not

• Most individuals should receive the intervention • Use of the intervention according to the

guideline could be used as a quality criterion or performance indicator

Grades Translations
Weak recommendation
• Examine the evidence yourself
• The majority of well informed individuals would want the intervention, but a substantial proportion would not
• Many but not all individuals should receive the intervention
• The intervention is not a candidate for a quality criterion or performance indicator.

Risk/Benefit tradeoff
• aspirin after myocardial infarction
– 25% reduction in relative risk
– side effects minimal, cost minimal
– benefit obviously much greater than risk/cost
• warfarin in low risk atrial fibrillation
– warfarin reduces stroke vs ASA by 50%
– but if risk only 1% per year, ARR 0.5%
– increased bleeds by 1% per year
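The warfarin arithmetic on this slide can be checked with a short sketch. The function names are illustrative, not from the slides, and comparing strokes and bleeds per 1,000 patient-years side by side does not say how the two events should be weighted against each other; that weighting is a values-and-preferences question.

```python
# Minimal sketch of the slide's arithmetic; function names are hypothetical.

def absolute_risk_reduction(baseline_risk, relative_risk_reduction):
    """ARR = baseline risk x relative risk reduction."""
    return baseline_risk * relative_risk_reduction


def number_needed_to_treat(arr):
    """Patients treated for one year to prevent one event."""
    return 1.0 / arr


# Figures taken from the slide.
baseline_stroke_risk = 0.01   # 1% per year in low risk atrial fibrillation
rrr_vs_asa = 0.50             # warfarin reduces stroke vs ASA by 50%
extra_bleed_risk = 0.01       # bleeds increased by 1% per year

arr = absolute_risk_reduction(baseline_stroke_risk, rrr_vs_asa)
strokes_prevented_per_1000 = arr * 1000
extra_bleeds_per_1000 = extra_bleed_risk * 1000

print(f"ARR: {arr:.3%} per year")
print(f"NNT: {number_needed_to_treat(arr):.0f} patient-years per stroke prevented")
print(f"Per 1,000 patient-years: {strokes_prevented_per_1000:.0f} strokes prevented, "
      f"{extra_bleeds_per_1000:.0f} extra bleeds")
```

With a large relative effect but a low baseline risk, the absolute benefit is small enough that the bleeding harm is of similar size, which is why the balance is close.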

Strength of Recommendations
• Resuscitate fast in septic patient
– do it!
• Prone ventilation in failing patient with ARDS
– Probably do it
– Probably do not do it

Strength of Recommendations
Aspirin after MI – do it
Warfarin rather than ASA in Afib
– probably do it
– probably don’t do it

Significance of strong vs weak
• variability in patient preference
– strong, almost all same choice (> 90%)
– weak, choice varies appreciably
• interaction with patient
– strong, just inform patient
– weak, ensure choice reflects values
• use of decision aid
– strong, don’t bother
– weak, use the aid
• quality of care criterion
– strong, consider
– weak, don’t consider

USPSTF documents
- clinical summary
- supporting article (decision analysis)
- evidence summary
- recommendations

For women age 40 – 49 we suggest NOT screening. (Weak recommendation, moderate quality evidence. 2B)

Values and Preferences: This recommendation places a relatively low value on a very small, uncertain mortality decrease and reflects concerns with false positive results, unnecessary biopsies, and unnecessary diagnosis of breast cancer. Women who place a higher value on a small reduction in breast cancer mortality and are less concerned about the undesirable consequences will choose screening.

For women age 50 - 74 we suggest screening. (Weak recommendation, moderate quality evidence. 2B)

Women who do not place a high value on a small reduction in breast cancer mortality and are concerned about false positive results, unnecessary biopsies, and unnecessary diagnosis of breast cancer will decline screening

Using GRADE there is no “insufficient” category: either make no recommendation, or recommend for or against

Breast self-examination: We recommend against breast self-examination. (strong recommendation, high/moderate quality evidence)

Weak recommendation
• practice will vary – according to what?
• interpretation of evidence
– clopidogrel in stroke
• patients’ values and preferences
– atrial fibrillation
• inclination to gamble (risk aversion)
– HRT

When evidence is low quality
• choice more preference dependent
• risk aversion
• steroids for pulmonary fibrosis
– low quality evidence in support of benefit
– high quality evidence of toxicity

When evidence is low quality
• recommendation to the hopeful patient
– I’m likely to deteriorate
– if something might work, let’s try it
– damn the torpedoes
• recommendation to the fearful patient
– doctor, you mean you know it’s toxic
  • diabetes, skin changes, body habitus, infection, osteoporosis
– you don’t know for sure it works?
– are you crazy?
• weak recommendation mandated

Strong recommendation when evidence is low quality?
• known benefit, strong recommendation for one of two alternatives
– antipyretics in children with chickenpox
– but which one: ASA or acetaminophen
• benefit: high quality evidence of equivalence
• harm: low quality evidence that harm differs appreciably
– Reye syndrome from ASA
• strong recommendation for acetaminophen

Strong recommendation when evidence is low quality?
• Blastomycosis
– low quality evidence amphotericin more effective than itraconazole
– high quality evidence more toxic
• patients with life threatening blasto
– life and death situation
– strong recommendation for ampho

Strong recommendation when evidence is low quality?
• head to toe CT scanning
– prevent cancer deaths
• very low quality evidence of benefits
• moderate quality evidence re risks, high re costs
• strong recommendation against

Presentation
• strong and weak
– discomfort with “weak”
– alternative wording: discretionary, conditional
• strong
– “we recommend…”
• discretionary
– “we suggest…”
• never
– we recommend (or suggest) you consider…
• always: quality of evidence and grade

When (not to) GRADE
“good to remind/alert”
• if no systematic review undertaken
• no sensible person would consider contrary
– We recommend that the patient, and the clinician responsible for the patient’s care, should be made aware of any change in a prescribed medication, including change to a generic drug
• very general (not sufficiently specific)
– We suggest that long-term maintenance immunosuppression be tailored to individual patient’s adverse events or risk of adverse events

Explicit comparator
• we recommend hourly urine volume measurement for at least 24 hours
– in contrast to every 2 hours, every 3…?
• we suggest measuring serum creatinine in all KTRs at least
– daily for 7 days
– 2 to 3 X per week for weeks 2 to 4
– every 2 weeks for months 4 to 6

Value and preference statements
• underlying values and preferences always present
• sometimes crucial
• important to make explicit

Values and preferences

Stroke guideline: patients with TIA clopidogrel over aspirin (Grade 2B).

Underlying values and preferences: This recommendation to use clopidogrel over aspirin places a relatively high value on a small absolute risk reduction in stroke rates, and a relatively low value on minimizing drug expenditures.

Values and preferences

peripheral vascular disease: aspirin be used instead of clopidogrel (Grade 2A).

Underlying values and preferences: This recommendation places a relatively high value on avoiding large expenditures to achieve small reductions in vascular events.

Flavonoids for Hemorrhoids
• venotonic agents
– mechanism unclear, increase venous return
• popularity
– 90 venotonics commercialized in France
– none in Sweden and Norway
– France 70% of world market
• possibilities
– French misguided
– rest of world missing out

Systematic Review
• 14 trials, 1432 patients
• key outcome
– risk not improving/persistent symptoms
– 11 studies, 1002 patients, 375 events
– RR 0.4, 95% CI 0.29 to 0.57
• minimal side effects
• is France right?
• what is the quality of evidence?

What can lower quality?
• risk of bias
– lack of detail re concealment
– questionnaires not validated
• indirectness – no problem
• inconsistency, need to look at the results

Review: Phlebotonics for hemorrhoids
Comparison: 01 Venotonics vs placebo
Outcome: 08 Overall improvement: no improvement/some improvement

Study or sub-category      log[RR] (SE)        Weight %   RR (random) [95% CI]

01 Up to seven days
Chauvenet                  -0.8916 (0.2376)     12.67     0.41 [0.26, 0.65]
Cospite                    -2.2073 (0.6117)      5.51     0.11 [0.03, 0.36]
Thanapongsathorn           -0.4308 (0.2985)     11.18     0.65 [0.36, 1.17]
Subtotal (95% CI)                               29.36     0.37 [0.18, 0.77]
Test for heterogeneity: Chi² = 6.92, df = 2 (P = 0.03), I² = 71.1%
Test for overall effect: Z = 2.67 (P = 0.008)

02 Up to four weeks
Annoni F                   -1.6094 (0.7073)      4.50     0.20 [0.05, 0.80]
Clyne MB                   -0.9943 (0.3983)      8.94     0.37 [0.17, 0.81]
Pirard J                   -1.1712 (0.3086)     10.94     0.31 [0.17, 0.57]
Thanapongsathorn           -1.1087 (1.1098)      2.18     0.33 [0.04, 2.91]
Thorp                       0.2624 (0.3291)     10.46     1.30 [0.68, 2.48]
Titapan                    -0.8916 (0.3691)      9.56     0.41 [0.20, 0.85]
Wijayanegara               -0.5978 (0.1375)     14.97     0.55 [0.42, 0.72]
Subtotal (95% CI)                               61.54     0.48 [0.32, 0.72]
Test for heterogeneity: Chi² = 13.87, df = 6 (P = 0.03), I² = 56.7%
Test for overall effect: Z = 3.57 (P = 0.0004)

03 Further than four weeks
Godeberg                   -1.7719 (0.3906)      9.10     0.17 [0.08, 0.37]
Subtotal (95% CI)                                9.10     0.17 [0.08, 0.37]
Test for heterogeneity: not applicable
Test for overall effect: Z = 4.54 (P < 0.00001)

Total (95% CI)                                 100.00     0.40 [0.29, 0.57]
Test for heterogeneity: Chi² = 28.66, df = 10 (P = 0.001), I² = 65.1%
Test for overall effect: Z = 5.14 (P < 0.00001)

[Forest plot: RR on a log scale from 0.001 to 1000; left of 1 favours treatment, right of 1 favours control]
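The per-study RR column and the I² values in the forest plot follow from the log[RR], SE, Chi² and df figures shown. The sketch below reproduces them, assuming the usual normal approximation (z = 1.96) for the 95% CI and the standard definition I² = (Q − df)/Q; the function names are illustrative and not taken from the review software.

```python
import math


def rr_with_ci(log_rr, se, z=1.96):
    """Back-transform a log relative risk and its SE to (RR, lower, upper)."""
    return (
        math.exp(log_rr),
        math.exp(log_rr - z * se),
        math.exp(log_rr + z * se),
    )


def i_squared(q, df):
    """Share of variability across studies beyond chance, in percent."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Chauvenet row of the forest plot: log[RR] = -0.8916, SE = 0.2376
rr, low, high = rr_with_ci(-0.8916, 0.2376)
print(f"Chauvenet: RR {rr:.2f} [{low:.2f}, {high:.2f}]")

# Heterogeneity statistics as reported: (Chi², df) per subgroup and overall
for label, q, df in [("Up to seven days", 6.92, 2),
                     ("Up to four weeks", 13.87, 6),
                     ("Total", 28.66, 10)]:
    print(f"{label}: I² = {i_squared(q, df):.1f}%")
```

An I² of about 65% overall is what prompts the question of whether to rate down for inconsistency, even though almost all studies point in the same direction.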

Publication bias?
• size of studies
– 40 to 234 patients, most around 100
• all industry sponsored

[Funnel plot: Phlebotonics for hemorrhoids, Venotonics vs placebo, Overall improvement. Horizontal axis: RR (fixed), log scale from 0.001 to 1000; vertical axis from 0.0 to 1.6]

What can lower quality?
• detailed design and execution
– lack of detail re concealment
– questionnaires not validated
• inconsistency
– almost all show positive effect, trend
– heterogeneity p = 0.001; I² 65.1%
• indirectness
• imprecision
– RR 0.4, 95% CI 0.29 to 0.57
• reporting bias
– 40 to 234 patients, most around 100

Recommendation
• for clinician
– offer to patient
– don’t offer to patient?
• strength of recommendation
– strong or weak
• for the funding body
– publicly funded
– not publicly funded
– strong or weak?

Is France right?
• recommendation
– yes
– no against use
• strength
– strong
– weak

Resource Use
• why not cost?
– may lead to focus on cost of intervention rather than downstream resource use
– resource use emphasizes alternative uses of resources (opportunity cost)

Resource Use just another outcome?
• yes and no
• who benefits?
– other outcomes usually clear
– costs borne by different payers
• across societies and within (age)
• some argue costs aren’t relevant to clinicians when third party payer

Why resource use different
• costs vary much more than other outcomes
– across jurisdictions
– within jurisdictions
– over time
• even when resource use the same, implications may differ
– year’s supply of expensive drug
– nurses’ salary in U.S., 6 in Poland, 30 in China

Why resource use different
• opportunity cost differs by perspective
• hospital pharmacy, fixed budget
– new expensive drug, clear what give up
• envelope public spending
– more on health, less on education, social services
– will refraining from spending on drugs really mean more for other services?
– should envelope include military spending?

Implications
• unbearable lightness of resource use
• consider balance of desirable and undesirable before considering resource use
• may decide not to consider resource use at all
– intervention not useful
– desirable consequences >>>> undesirable
– relevant only when difference small

Similarities with other outcomes
• only consider important resource use
• need estimate of difference between trt and control
• explicit judgments about the quality of the evidence, special judgments
– perspective
– how to judge quality of evidence
– ? use of economic model

Evidence summary
• includes quality of evidence, summary of findings
– “balance sheet”, special form of GRADE profile
• resource use and not just costs
– can judge whether resource use applicable to local setting
– focus on costs relevant to them (pharmacy)
– apply unit costs to local setting

Example question
• patients
– women with pre-eclampsia
• intervention
– intravenous magnesium
• RCT done in 33 countries
– over 9,000 patients
• for presentation of resource use evidence need to specify perspective
– health system

Quality assessment (Duley 2003; RCT; 9,992 patients)

Outcome         Limitations  Inconsistency   Indirectness  Imprecision  Relative effect (95% CI)  Quality
Eclampsia       No           one trial only  No            No           RR 0.41 (0.29-0.58)       High
Maternal death  No           one trial only  No            Imprecision  RR 0.54 (0.26-1.10)       Moderate

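Quality assessment: resource use (Simon 2005; RCT; costs in US $, year 2001)

Magnesium sulphate
Limitations: No; Inconsistency: one trial only; Indirectness: No; Imprecision: No; Quality: High
Setting      Resources per patient     Costs per patient
             Placebo    MgSO4          Placebo    MgSO4
High GNI     0          6              0          20
Middle GNI   0          6              0          3
Low GNI      0          6              0          5

Administration of the drug
Limitations: No; Inconsistency: one trial only; Indirectness: No; Imprecision: No; Quality: High
Setting      Resources per patient     Costs per patient
             Placebo    MgSO4          Placebo    MgSO4
High GNI     0          1              0          66
Middle GNI   0          1              0          14
Low GNI      0          1              0          8

Other hospital resources (a, b)
Limitations: No; Inconsistency: one trial only; Indirectness: No; Imprecision: Large variation in resources (c); Quality: Moderate
Setting      Resources per patient     Costs per patient
             Placebo    MgSO4          Placebo    MgSO4
High GNI     NA         NA             12,839     12,818
Middle GNI   NA         NA             1,412      1,416
Low GNI      NA         NA             155        157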

Summary of findings
Columns: Outcomes; Typical control group risk; Typical absolute effect (95% CI); Relative effect (95% CI); Nr. of participants (studies); Quality of the evidence; Comments

Clinical outcomes

Eclampsia: RR 0.41 (0.29 - 0.58); 11,444 participants; ⊕⊕⊕⊕ High
– Severe: 27 per 1,000; 16 fewer per 1,000 (11 to 19)
– Not severe: 15 per 1,000; 9 fewer per 1,000 (6 to 11)

Maternal death: RR 0.54 (0.26 - 1.10); 10,795 participants; ⊕⊕⊕ Moderate [note 2]
– Severe: 6 per 1,000; 3 fewer per 1,000 (0.6 more to 4 fewer)
– Not severe: 3 per 1,000; 1 fewer per 1,000 (0.3 more to 2 fewer)

Side effects: RR 5.26 (4.59 - 6.03); 9,992 participants; ⊕⊕⊕⊕ High
– 46 per 1,000 [note 3]; 196 more per 1,000 (165 to 231)
– Comments: Mostly flushing. Other side effects include nausea, vomiting, slurred speech, muscle weakness, dizziness, drowsiness, confusion and headache.

Resource use from the perspective of the health system
(control group; difference trt vs control)

Magnesium sulphate ampoules: 9,996 participants; ⊕⊕⊕⊕ High
– control group: 0; difference: 6 10 ml ampoules per woman
– Cost, High GNI: $20 more per patient
– Cost, Middle GNI: $3 more per patient
– Cost, Low GNI: $5 more per patient

Administration of magnesium sulphate: 9,996 participants; ⊕⊕⊕⊕ High
– control group: 0; difference: 1 per woman
– Cost, High GNI: $66 per patient
– Cost, Middle GNI: $14 per patient
– Cost, Low GNI: $8 per patient
– Comments: Resources for administering magnesium sulphate included midwife time (main cost), intravenous cannula/needle, syringe, IV fluids, drug.

Other hospital resources: Varied widely; 9,996 participants; ⊕⊕⊕ Moderate [note 5]
– Comments: There was large variation in the use of other hospital resources in both intervention and control groups.
– Cost, High GNI: control group $12,839; $20 less per woman (0 to 60)
– Cost, Middle GNI: control group $1,416; $4 less per woman (0 to 10)
– Cost, Low GNI: control group $157; $2 less per woman (1 to 3)
– Comments: Other hospital costs have been adjusted based on the influence of eclampsia to control for the many other factors that influenced these costs.
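The "typical absolute effect" figures in the summary of findings can be derived from the typical control group risk and the relative effect. The sketch below assumes the simple calculation control risk × (RR − 1), applied to the point estimate and to each end of the confidence interval; the function name is illustrative.

```python
def absolute_effect_per_1000(control_risk_per_1000, rr, rr_low, rr_high):
    """Difference in events per 1,000 (negative = fewer with treatment).

    Returns (point estimate, effect at lower RR bound, effect at upper RR bound).
    """
    return tuple(control_risk_per_1000 * (x - 1.0) for x in (rr, rr_low, rr_high))


# Values taken from the summary of findings above.
rows = [
    ("Eclampsia, severe", 27, 0.41, 0.29, 0.58),
    ("Eclampsia, not severe", 15, 0.41, 0.29, 0.58),
    ("Maternal death, severe", 6, 0.54, 0.26, 1.10),
    ("Side effects", 46, 5.26, 4.59, 6.03),
]

for label, risk, rr, low, high in rows:
    point, at_low, at_high = absolute_effect_per_1000(risk, rr, low, high)
    print(f"{label}: {point:+.1f} per 1,000 ({at_low:+.1f} to {at_high:+.1f})")
```

For severe pre-eclampsia this gives about 16 fewer cases of eclampsia per 1,000 (19 fewer to 11 fewer), matching the table. The same relative effect gives a smaller absolute benefit when the control group risk is lower.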

Issues in resource use
• broad perspective desirable
– narrow perspectives ignore much resource use
– users can pick costs relevant to them
– either health care system or societal
• indirect costs controversial
• indirect evidence of resources use
– costs only reported
– RCT but doesn’t reflect practice
  • ulcer prevention everyone gets repeat endoscopy

Quality of evidence for resource use
• rules basically the same
– RCTs start high, observational low
• may need multiple sources of evidence
– RCTs may not fully report resource use
• variation across settings
– RCT may not reflect real world
– time frame may extend beyond trial
• different quality for different resources
– mag sulphate versus hospital resources
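The rule "RCTs start high, observational low" with rating down and up can be sketched as a small function. This is a deliberate simplification under stated assumptions: the four levels (high, moderate, low, very low) are GRADE's, but the names `LEVELS` and `grade_quality` are hypothetical, and in practice each step is a panel judgment rather than a mechanical count.

```python
# Hypothetical helper; GRADE ratings are judgments, not a formula.
LEVELS = ["very low", "low", "moderate", "high"]


def grade_quality(design, levels_down=0, levels_up=0):
    """Start high for RCTs, low for observational studies, then move
    down or up by the stated number of levels, staying within the scale."""
    start = LEVELS.index("high") if design == "RCT" else LEVELS.index("low")
    index = max(0, min(len(LEVELS) - 1, start - levels_down + levels_up))
    return LEVELS[index]


# Magnesium sulphate example: one RCT, no serious concerns for eclampsia,
# rated down one level for imprecision for maternal death.
print(grade_quality("RCT"))                  # high
print(grade_quality("RCT", levels_down=1))   # moderate
print(grade_quality("observational"))        # low
```

The same starting point and rules apply to resource use, which is why different resources from the same trial (drug ampoules versus other hospital resources) can end up with different ratings.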

Formal economic models
• limitations
– supported by industry, biased
– setting specific
– reduces transparency
– if evidence low quality, speculative
– often many assumptions
• solution: develop own model
– OK if you are NICE with lots of resources
• even so, don’t include in profile

Costs versus affordability
• intervention may be “cost-effective”
– $10,000 per QALY gained
• but if applicable to huge proportion of population, may still be unaffordable

healthy asymptomatic postmenopausal women: HRT in 1992?
Possible benefits
– CHD, Hip fracture, Colorectal cancer
Possible harms
– Breast cancer
– Stroke
– Thrombosis
– Gall bladder disease

Can GRADE lead to change?

Evidence profile: Quality assessment
Oestrogen + progestin for prevention in 1992 (before WHI and HERS)

Oestrogen + progestin versus usual care

Oestrogen + progestin for prevention after WHI and HERS

Postulate

• major work in preparing guideline/HTA assessment is systematic review

• If already doing this, GRADE framework should add little

• history: Rolls-Royce and Volkswagen

VW and RR approaches
• Rolls Royce (NICE)
– systematic review for every outcome
– production of evidence profiles
– involvement of multiple constituencies
  • including patients
– inclusion of economic analysis
• cost $1 million per guideline

MOPED GRADE
• UpToDate
– 5,000 graded recommendations
• generate PICO (informal)
– no formal rating of outcome importance
• use of existing reviews, primary studies
– no new evidence syntheses
• quality for key outcomes
– 5 reasons rating down, 3 up
– no new evidence profiles, SoF tables
• recommendations
– strong or weak, consider 3 factors
– value and preference statements

ACCP
• formal structured questions
• no formal rating of outcome importance
– trying to change
• hit-and-miss systematic reviews
– largely only available ones
• hit-and-miss individual study evidence summaries
• rare evidence profiles
– trying to change

VW approach
• take systematic reviews if available
• if not, review key, accessible evidence
• no meta-analysis if not done
• no evidence profiles
• small group make expert judgement

Conclusion
• clinicians, policy makers need summaries
– quality of evidence
– strength of recommendations
• explicit rules
– transparent, informative
• GRADE
– simple, transparent, systematic
– increasingly wide adoption
