Quantitative Critical Appraisal October 2015

An Introduction to Critical Appraisal

Isla Kuhn

Medical Librarian

Last updated: September 2014

Learning Outcomes

By the end of this session you will:

• Understand what Critical Appraisal is

• Be aware of some of the different types of research

• Be able to interpret basic statistics within a research paper

• Gain experience in critically appraising a research paper

How do I Appraise?

• You don’t need to be a statistics expert

• Ready-made checklists help you focus on the most important aspects of the article

• Different checklists are available for different types of research (RCTs, systematic reviews, case-control studies, etc.)

• Checklist for Qualitative research

• Available free from CASP

http://www.casp-uk.net


Critical Appraisal

Critical appraisal of any study design must assess:

Validity

Were sound scientific methods used? Chance / Bias / Confounding Factors

Results

What are the results and how are they expressed?

Relevance

Are the findings generalisable – can they be applied to settings / situations outside the research study? Do these results apply to my local context?

Event Rates

Number of people experiencing an event as a proportion of the number of people in the population

• Form the basis of other calculations

• Control Event Rate (CER)

• Experimental Event Rate (EER)

Emerg Med J 2008 25: 26-29:

Proportion with recurrent headache (whole sample)

CER = 12/31 = 39%

EER = 8/30 = 27%
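The event rates above can be reproduced in a few lines of Python. This is a minimal sketch rather than part of the original slides: the function name `event_rate` is an illustrative assumption, and the counts are the ones quoted from the headache trial.

```python
def event_rate(events: int, total: int) -> float:
    """Proportion of a group that experienced the event."""
    if total <= 0:
        raise ValueError("group size must be positive")
    return events / total


# Counts quoted on the slide: 12 of 31 controls, 8 of 30 treated patients
cer = event_rate(12, 31)  # Control Event Rate
eer = event_rate(8, 30)   # Experimental Event Rate

print(f"CER = {cer:.0%}, EER = {eer:.0%}")
```

Keeping the unrounded values of `cer` and `eer` avoids carrying rounding error into later calculations.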

Risk of benefit and harm

Relative Risk (RR) compares the risk in 2 different groups of people

Tells us how many times more likely it is that an event will occur in the treatment group relative to the control group

RR = EER / CER

• RR = 1: the risk is the same in each group

• RR < 1: treatment reduces risk of event

• RR > 1: treatment increases risk of event

27/39 = 0.69, so treatment reduces risk of event

The risk of recurrent headache in the treatment group is 0.69 times the risk in the control group.
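As a hedged illustration of the relative risk calculation, the sketch below works from the raw counts (8/30 treated, 12/31 control) rather than the rounded percentages. The names `relative_risk` and `interpret_rr` are hypothetical helpers, not taken from any library.

```python
def relative_risk(exp_events: int, exp_total: int,
                  ctrl_events: int, ctrl_total: int) -> float:
    """RR = EER / CER, computed from raw counts."""
    eer = exp_events / exp_total
    cer = ctrl_events / ctrl_total
    return eer / cer


def interpret_rr(rr: float) -> str:
    """Plain-language reading of a relative risk point estimate."""
    if rr < 1:
        return "treatment reduces risk of event"
    if rr > 1:
        return "treatment increases risk of event"
    return "risk is the same in each group"


rr = relative_risk(8, 30, 12, 31)
print(f"RR = {rr:.2f}: {interpret_rr(rr)}")
```

This reading describes the point estimate only; whether the difference could be due to chance is a separate question for p values and confidence intervals.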

Risk continued

Absolute risk reduction (ARR)

Difference in risk between experimental and control groups

ARR = Risk of Event in Control Group – Risk of Event in Intervention Group (CER – EER)

• ARR = 0: treatment has no effect

• ARR positive: treatment is beneficial

• ARR negative: treatment is harmful

39% – 27% = 12%

Dexamethasone reduces the absolute risk of recurrent headache by 12 percentage points

Relative Risk Reduction (RRR)

Tells us the reduction in the rate of the outcome in the treatment group relative to that in the control group

RRR = ARR / CER, or equivalently 1 – RR

0.12 / 0.39 = 0.31 = 31%

1 – 0.69 = 0.31 = 31%

Dexamethasone reduces the risk of recurrent headache by 31% relative to that occurring in the control group.
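The two risk-reduction measures can be checked against each other in code. The following is a small sketch using the trial counts quoted earlier; the function names are illustrative assumptions.

```python
def absolute_risk_reduction(cer: float, eer: float) -> float:
    """ARR = CER - EER (positive means the treatment is beneficial)."""
    return cer - eer


def relative_risk_reduction(cer: float, eer: float) -> float:
    """RRR = ARR / CER, which is algebraically the same as 1 - RR."""
    return absolute_risk_reduction(cer, eer) / cer


cer = 12 / 31  # Control Event Rate
eer = 8 / 30   # Experimental Event Rate

arr = absolute_risk_reduction(cer, eer)
rrr = relative_risk_reduction(cer, eer)

print(f"ARR = {arr:.0%}, RRR = {rrr:.0%}")
```

The same relative reduction can correspond to very different absolute reductions depending on how common the event is in the control group, which is why both figures are worth reporting.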

Absolute Risk Reduction & Relative Risk Reduction

Results of hypothetical trial of a new drug for myocardial infarction

Numbers Needed to Treat

Measures the impact of a treatment or intervention

States how many patients need to be treated in order to prevent an event which would otherwise occur.

NNT = 10 means that 10 patients need to be treated to prevent one adverse outcome

The closer to 1 the better

Calculation:

• 1 / ARR (if ARR expressed as a proportion)

• 100 / ARR (if ARR expressed as a %)

100 / 12 ≈ 8.3, conventionally rounded up to 9
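A short sketch of the NNT calculation, using the unrounded trial counts. The function name is hypothetical, and rounding up to a whole patient is a common convention that is assumed here rather than stated in the slides.

```python
import math


def number_needed_to_treat(cer: float, eer: float) -> float:
    """NNT = 1 / ARR, with ARR expressed as a proportion."""
    arr = cer - eer
    if arr <= 0:
        # No benefit shown, so an NNT is not meaningful
        raise ValueError("ARR must be positive to calculate an NNT")
    return 1 / arr


nnt = number_needed_to_treat(12 / 31, 8 / 30)

# Assumption: round up, since a fraction of a patient cannot be treated
nnt_rounded_up = math.ceil(nnt)

print(f"NNT = {nnt:.1f}, rounded up to {nnt_rounded_up}")
```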

P = Probability

A p-value is a measure of statistical significance: it tells us the probability of seeing a result at least as extreme as the one observed if chance alone were at work (that is, if there were really no effect)

In simple terms, probability (p-value) can only take values between 0 and 1:

0|-----------------------|--------------------|1

Impossible…....... Absolutely certain…

If p=0.001 the likelihood of a result happening by chance is extremely low: 1 in 1000

If p=0.05 it is fairly unlikely that the result happened by chance: 1 in 20

If p=0.5 it is fairly likely that the result happened by chance: 1 in 2

If p=0.75 it is very likely that the result happened by chance: 3 in 4

P Values

Confidence intervals:

“The recurrent headache rate in the control group was 39% (12/31, 95% CI 22% to 57%) compared with 27% (8/30, 95% CI 13% to 46%) in the dexamethasone group (relative risk (RR) 0.69, 95% CI 0.33 to 1.45; p=0.47)”.

Why 95%? It is a conventional measure of the reliability of an estimate: if you repeated the same study many times and calculated a 95% CI each time, about 95% of those intervals would contain the true value. Here the interval is 0.33 to 1.45. CIs are typically reported at 95%, but when presented graphically they are sometimes shown as intervals of 50%, 95% and 99%.

Confidence Intervals

• An alternative way of assessing the effects of chance

• The result of the trial is a “point estimate” – if you ran the trial again you would get a different result

• The Confidence Interval gives the range in which you think the real answer lies

• The 95% CI is the range in which we are 95% certain that the true population value lies

• Look at how wide the interval is, and the values at each end

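E.g. RR = 0.69, 95% CI 0.33 to 1.45. The interval is wide and includes 1 (no difference between groups), so the result is not statistically significant.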
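To show where an interval like this can come from, here is a sketch of one common approach: a normal approximation on the log of the relative risk. This is an assumption for illustration, since the slides do not say which method the paper's authors used, and the function name is hypothetical.

```python
import math


def rr_confidence_interval(exp_events: int, exp_total: int,
                           ctrl_events: int, ctrl_total: int,
                           z: float = 1.96):
    """Relative risk with an approximate CI (log method).

    z = 1.96 corresponds to a 95% interval.
    """
    rr = (exp_events / exp_total) / (ctrl_events / ctrl_total)
    # Standard error of log(RR) for two independent groups
    se_log_rr = math.sqrt(1 / exp_events - 1 / exp_total
                          + 1 / ctrl_events - 1 / ctrl_total)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


rr, lower, upper = rr_confidence_interval(8, 30, 12, 31)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")

# An interval that includes 1 is compatible with no difference between groups
includes_no_effect = lower <= 1 <= upper
```

With the trial counts this gives an interval very close to the published 0.33 to 1.45; small differences in the last decimal place can arise from the exact method used.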

Forest Plot – Simple Example

[Figure: forest plot with labelled parts: individual sample size, best estimate, confidence interval, line of no effect, combined results. Axis labels: Favours Treatment, Favours Control.]

The narrower the Confidence Interval (CI), the more precise the estimate and the more confident we can be in the result

If the CI crosses the line of no effect, then the results of that study are not statistically significant

Heterogeneity – what is it?

• Relevant to statistical meta-analysis, so you are more likely to come across this in a systematic review or similar review of studies

• It occurs when multiple studies of an effect are actually measuring somewhat different effects, due to differences in subject population, intervention, choice of analysis, experimental design, etc.

• This can cause problems when trying to summarise what the studies mean overall

What is df?

• Degrees of freedom – frequently expressed with the Chi² test.

• The number of independent pieces of information available for the statistician to make the calculations

What is Chi²?

• The chi-square test is used to determine whether there is a significant difference between the expected frequencies and the observed frequencies in one or more categories. Does the number of individuals or objects that fall in each category differ significantly from the number you would expect? Is the difference between the expected and observed numbers due to sampling error, or is it a real difference?
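The observed-versus-expected idea can be sketched for the headache trial's 2x2 table (12 of 31 controls and 8 of 30 treated patients had a recurrent headache). This is an illustration only: the function name is hypothetical, and the use of Yates' continuity correction is an assumption, because the slides do not say which test the paper used. A 2x2 table has 1 degree of freedom, which allows the p value to be computed with the standard library alone.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int, yates: bool = True):
    """Chi-square test for a 2x2 table (df = 1).

    a, b = events / non-events in group 1
    c, d = events / non-events in group 2
    """
    n = a + b + c + d
    observed = [a, b, c, d]
    row_totals = [a + b, a + b, c + d, c + d]
    col_totals = [a + c, b + d, a + c, b + d]

    statistic = 0.0
    for obs, row, col in zip(observed, row_totals, col_totals):
        expected = row * col / n  # count expected if groups did not differ
        diff = abs(obs - expected)
        if yates:
            diff = max(diff - 0.5, 0.0)  # continuity correction
        statistic += diff ** 2 / expected

    # Upper-tail probability of a chi-square distribution with df = 1
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Control: 12 events, 19 non-events; treatment: 8 events, 22 non-events
chi2_stat, p_value = chi_square_2x2(12, 19, 8, 22)
print(f"Chi² = {chi2_stat:.2f}, df = 1, p = {p_value:.2f}")
```

With the correction applied, the p value comes out close to the p=0.47 quoted from the paper, although that agreement does not confirm which test the authors ran.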

How do I understand and interpret different statistical information?

• The short answer is, you don’t have to understand it, you only need to look at the p value

• As a general rule, remember the following:

• Statistics that describe data – percentages, mean, median, mode, standard deviation

• Statistics that test confidence – confidence intervals, p values

• Statistics that test difference – t tests and other parametric tests, Mann-Whitney and other non-parametric tests, Chi² test

• Statistics that compare risk – risk and odds ratios, risk reduction and numbers needed to treat

Source: Medical and Health Science Statistics Made Easy by Michael Harris and Gordon Taylor

Conclusion

Critical Appraisal is part of Evidence Based Healthcare

It takes practice

Use CASP checklists

Depth of Appraisal is your choice

Only you can assess usefulness

Useful websites

www.healthknowledge.org.uk/interactive-learning/finding-and-appraising-the-evidence

www.thennt.com/

www.casp-uk.net/

www.wikipedia.org

http://www.nhs.uk/news/Pages/NewsIndex.aspx NHS Choices Behind the Headlines


Help!

Isla Kuhn

Medical Librarian

Medical Library

Box 111

Addenbrooke’s Hospital

email: [email protected]

twitter: @ilk21

phone: (01223 3) 36750

web: library.medschl.cam.ac.uk

Thank you.