Download - Validity-reliability Screening Tests Rbs-feltp

04/08/23 Validity and reliability of Tests 1

VALIDITY AND RELIABILITY OF SCREENING TESTS

Rashida B Syed, Epidemiologist, Consultant Faculty, Field Epidemiology Training Program (FETP)-Pakistan


Objectives

Calculate and interpret measures of the validity of a screening test: sensitivity and specificity.

Understand the relationship between sensitivity and specificity.

Calculate and interpret measures of the performance (yield) of a screening test: predictive value positive (PV+) and predictive value negative (PV-).

Understand factors that influence PV+ and PV-

Recognize issues and sources of bias in evaluating screening programs.


Purpose of screening

The early detection of disease in individuals who do not show any signs of disease.

Aims to reduce morbidity and mortality from disease among persons being screened.

Is the application of relatively simple, inexpensive tests, examinations or other procedures to people.

Is a means of identifying persons at increased risk for the presence of disease, who warrant further evaluation.


Diagnosis ≠ Screening

Screening tests can also often be used as diagnostic tests

Diagnosis involves confirmation of presence or absence of disease in someone suspected of or at risk for disease

Screening is generally done among individuals who are not suspected of having disease


Requirements

Is there a truly effective treatment available for the discovered disease?

Is that treatment more effective in screened than non-screened cases?

What are the side effects of the screening process?

How efficient is screening? Do we have the right threshold? i.e. how many people must be screened to obtain a case?


[Figure: Natural History of Disease. Stages along the timeline: Susceptible Host, Subclinical Disease, Clinical Disease, Stage of Recovery, Disability, or Death. Marked events: Point of Exposure, Screening (during detectable sub-clinical disease), Onset of symptoms, Diagnosis sought.]


Examples of Screening Tests

Questions

Clinical Examinations

Laboratory Tests

Genetic Tests

X-rays

Goel


Diseases for which screening has been recommended

Cervical cancer

Breast cancer

Prostate cancer

Colon cancer

Diabetes

Hypertension


Terminology

Validity is analogous to accuracy

The validity of a screening test is how well the given screening test reflects another test of known greater accuracy

Validity assumes that there is a gold standard to which a test can be compared

Paneth


Three key measures of validity

• Sensitivity
• Specificity
• Predictive value


Sensitivity and Specificity

Sensitivity tells us how well a positive test detects disease.

It is defined as the ability of the test to identify correctly as diseased, those who have the disease.

Specificity tells us how well a negative test detects non-disease.

Defined as the ability of the test to identify correctly those who do not have the disease as test negative.


                    Disease
Screening Test      Present            Absent
Positive            True positives     False positives
Negative            False negatives    True negatives


                    Disease
Screening Test      Present    Absent    Total
Positive            a          b         a + b
Negative            c          d         c + d
Total               a + c      b + d     N


Sensitivity

Proportion of individuals who have the disease who test positive (true positive rate). Tells us how well a “+” test picks up disease.

Sensitivity = a / (a + c)

where a = true positives and c = false negatives in the 2 × 2 table of screening test result against disease status.


Specificity

Proportion of individuals who don’t have the disease who test negative (true negative rate). Tells us how well a “-” test detects no disease.

Specificity = d / (b + d)

where d = true negatives and b = false positives in the 2 × 2 table of screening test result against disease status.


Predictive value

Positive predictive value: the proportion of all those who test positive who actually have the condition.

Negative predictive value: the proportion of all those who test negative who do not have the condition.


Positive Predictive Value

Proportion of individuals who test positive who actually have the disease

P.P.V. = a / (a + b)

where a = true positives and b = false positives in the 2 × 2 table of screening test result against disease status.


Negative Predictive Value

Proportion of individuals who test negative who don’t have the disease

N.P.V. = d / (c + d)

where d = true negatives and c = false negatives in the 2 × 2 table of screening test result against disease status.
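The four measures defined on the preceding slides can be sketched in a few lines of Python. This is only an illustration: the class name `TwoByTwo` and the counts used below are hypothetical, chosen to show the arithmetic, and are not taken from the lecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """2 x 2 screening table: a = true positives, b = false positives,
    c = false negatives, d = true negatives."""
    a: int
    b: int
    c: int
    d: int

    def sensitivity(self) -> float:
        # Of those with disease (a + c), the fraction testing positive.
        return self.a / (self.a + self.c)

    def specificity(self) -> float:
        # Of those without disease (b + d), the fraction testing negative.
        return self.d / (self.b + self.d)

    def ppv(self) -> float:
        # Of those testing positive (a + b), the fraction with disease.
        return self.a / (self.a + self.b)

    def npv(self) -> float:
        # Of those testing negative (c + d), the fraction without disease.
        return self.d / (self.c + self.d)


# Hypothetical counts, for illustration only.
table = TwoByTwo(a=40, b=30, c=10, d=120)
print(f"Sensitivity = {table.sensitivity():.1%}")
print(f"Specificity = {table.specificity():.1%}")
print(f"PPV         = {table.ppv():.1%}")
print(f"NPV         = {table.npv():.1%}")
```

Sensitivity and specificity are computed down the columns (fixed disease status), while the predictive values are computed across the rows (fixed test result).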


Determinants of predictive value

The predictive value of a test is determined by 3 factors:

1. Sensitivity
2. Specificity
3. Prevalence of the disease in the population being tested


Effect of prevalence on PPV

As prevalence rates decrease, the positive predictive value of a test also decreases

This explains why diagnostic tests which are developed in clinical populations (where the prevalence of the disease being tested is often high) often perform poorly in general population settings (where disease prevalence tends to be lower).

In our example: prove it.
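One way to check the claim about prevalence is to hold sensitivity and specificity fixed and recompute PPV at several prevalences. The sketch below assumes a sensitivity of 0.90 and a specificity of 0.80 (the values that arise in the hypothyroidism example in this deck); the function name `ppv_from_prevalence` and the three prevalences are illustrative choices, not part of the lecture.

```python
def ppv_from_prevalence(sensitivity: float, specificity: float,
                        prevalence: float) -> float:
    """PPV = true positives / all positives, per unit of population."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Assumed test characteristics, for illustration only.
SENSITIVITY = 0.90
SPECIFICITY = 0.80

results = {}
for prevalence in (0.50, 0.10, 0.01):
    results[prevalence] = ppv_from_prevalence(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV = {results[prevalence]:.1%}")
```

With the test itself unchanged, PPV falls from about 82% at 50% prevalence to about 33% at 10% and about 4% at 1%, because false positives from the large disease-free group come to outnumber true positives.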


Scenarios

Tests with dichotomous results (positive or negative)

Tests with continuous results. Examples:

Systolic blood pressure (mm Hg)

Tuberculin reaction (induration diameter, mm)


Examples

In a sample of 200 people: 100 people have the disease Hypothyroidism, and 100 people do not have it.

In the same sample of 200 people: 110 people test positive for Hypothyroidism using a new diagnostic test, and 90 people test negative for Hypothyroidism using the same diagnostic test.

Of the 110 people who test positive, 90 have the disease and 20 do not.

Of the 90 people who test negative, 10 have the disease and 80 do not.

Sensitivity and Specificity?


Solution

Sensitivity = TP / (TP + FN) = 90 / (90 + 10) = 90%

Specificity = TN / (TN + FP) = 80 / (80 + 20) = 80%
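The solution can be checked directly from the counts given in the hypothyroidism example. The short sketch below also confirms that the counts agree with the stated row and column totals, and reports the predictive values as an extra that the slide does not ask for; the variable names are illustrative.

```python
# Counts from the hypothyroidism example.
tp, fp = 90, 20   # test positive: with disease, without disease
fn, tn = 10, 80   # test negative: with disease, without disease

# Consistency checks against the totals stated in the example.
assert tp + fp == 110 and fn + tn == 90      # test positives / negatives
assert tp + fn == 100 and fp + tn == 100     # diseased / non-diseased

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)

print(f"Sensitivity = {sensitivity:.0%}")
print(f"Specificity = {specificity:.0%}")
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```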


A test is used in 50 people with disease and 50 people without. These are the results.

                    Disease
Screening Test      Present    Absent    Total
Positive            48         3         51
Negative            2          47        49
Total               50         50        100

Paneth


Sensitivity = 48/50 = 96%

Specificity = 47/50 = 94%

Positive Predictive Value = 48/51 = 94.1%

Negative Predictive Value = 47/49 = 95.9%

Paneth


So… you understand the accuracy of a screening test …

What is the next step?

Put screening to use in the population


Sensitive vs. Specific tests

A test with high levels of sensitivity is usually positive when disease is present and has few false negatives – useful when it is important not to miss a diagnosis (e.g. if the disease is dangerous but has an effective treatment)

A test with high levels of specificity is usually negative when disease is absent and has few false positives – useful when a false positive diagnosis would be harmful (e.g. if it resulted in unnecessary treatment)


Balancing sensitivity vs. specificity

A really good test would be highly sensitive and highly specific. In practice, this is often not the case.

Instead, there is often a trade-off between the sensitivity and the specificity of diagnostic tests

This occurs in cases where the test result is expressed on a continuous scale (e.g. blood pressure, blood sugar levels)

In such circumstances, a cut-point has to be chosen to define normal vs. abnormal

The decision for the cut point involves weighing the consequences of leaving cases undetected (false negatives) against erroneously classifying healthy persons as diseased (false positives).

Refer to Gordis


NET SENSITIVITY AND SPECIFICITY

Use of multiple tests. Refer to Gordis.
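The slide defers the details to Gordis, so the following is only a sketch of the usual textbook formulas for combining two tests. It assumes the two tests err independently of each other, which may not hold in practice. The function names and the test characteristics (0.80/0.60 and 0.90/0.90) are hypothetical.

```python
def sequential_testing(sens1, spec1, sens2, spec2):
    """Two-stage testing: positive overall only if BOTH tests are positive.
    Assumes the tests are conditionally independent."""
    net_sensitivity = sens1 * sens2
    net_specificity = spec1 + spec2 - spec1 * spec2
    return net_sensitivity, net_specificity


def simultaneous_testing(sens1, spec1, sens2, spec2):
    """Parallel testing: positive overall if EITHER test is positive.
    Assumes the tests are conditionally independent."""
    net_sensitivity = sens1 + sens2 - sens1 * sens2
    net_specificity = spec1 * spec2
    return net_sensitivity, net_specificity


# Hypothetical test characteristics, for illustration only.
seq = sequential_testing(0.80, 0.60, 0.90, 0.90)
sim = simultaneous_testing(0.80, 0.60, 0.90, 0.90)
print(f"Sequential:   sensitivity {seq[0]:.2f}, specificity {seq[1]:.2f}")
print(f"Simultaneous: sensitivity {sim[0]:.2f}, specificity {sim[1]:.2f}")
```

Under these assumptions sequential testing loses net sensitivity and gains net specificity, while simultaneous testing does the reverse.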


Balancing sensitivity vs. specificity

Blood sugar level 2 hrs after eating (mg/100 ml)    Sensitivity (%)    Specificity (%)
70                                                  98.6               8.8
90                                                  94.3               47.6
110                                                 85.7               84.1
130                                                 64.3               96.9
170                                                 42.9               100.0
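The trade-off in the blood sugar figures can be made concrete by scoring each cut-point. The sketch below uses Youden's index (sensitivity + specificity − 1), which is one common summary and is not named in the slides; it weights false negatives and false positives equally, which may not match the clinical priorities. The variable names are illustrative.

```python
# (cut-point in mg/100 ml, sensitivity %, specificity %) from the blood sugar table.
cut_points = [
    (70, 98.6, 8.8),
    (90, 94.3, 47.6),
    (110, 85.7, 84.1),
    (130, 64.3, 96.9),
    (170, 42.9, 100.0),
]


def youden_index(sensitivity_pct: float, specificity_pct: float) -> float:
    """Youden's J = sensitivity + specificity - 1, on a 0 to 1 scale."""
    return sensitivity_pct / 100 + specificity_pct / 100 - 1


for level, sens, spec in cut_points:
    print(f"cut-point {level:>3}: J = {youden_index(sens, spec):.3f}")

# Cut-point with the largest J among those tabulated.
best_level, best_sens, best_spec = max(
    cut_points, key=lambda row: youden_index(row[1], row[2])
)
print(f"Largest J at cut-point {best_level} mg/100 ml")
```

Lowering the cut-point raises sensitivity at the cost of specificity, and raising it does the opposite; among the tabulated values the index is largest at 110 mg/100 ml.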


ROC curves

One method for determining the best cut-off point is by constructing a ROC curve

ROC=receiver operating characteristic, a term that comes from radar science

ROC curves are constructed by plotting the sensitivity (or true positive rate) against the false positive rate (1-specificity)
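As a sketch of that construction, the snippet below converts the blood sugar sensitivity and specificity figures from this deck into ROC points, one (false positive rate, true positive rate) pair per cut-point. The variable names are illustrative, and plotting is left out so that the example needs no third-party library.

```python
# (cut-point in mg/100 ml, sensitivity %, specificity %) from the blood sugar table.
cut_points = [
    (70, 98.6, 8.8),
    (90, 94.3, 47.6),
    (110, 85.7, 84.1),
    (130, 64.3, 96.9),
    (170, 42.9, 100.0),
]

# Each cut-point gives one ROC point: x = 1 - specificity, y = sensitivity.
roc_points = sorted(
    (round(1 - spec / 100, 3), round(sens / 100, 3), level)
    for level, sens, spec in cut_points
)

for fpr, tpr, level in roc_points:
    print(f"cut-point {level:>3}: FPR = {fpr:.3f}, TPR = {tpr:.3f}")
```

Joining these points, together with the corners (0, 0) and (1, 1), traces the ROC curve; stricter cut-points sit towards the lower left and more lenient ones towards the upper right.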


ROC curve for blood sugar readings

Source: Fletcher, Fletcher and Wagner, Clinical epidemiology: the essentials (3rd ed)


Shows trade-off between sensitivity and specificity

The closer the curve is to the left-hand and top borders, the more accurate the test

Slope of tangent at cut point gives the Likelihood Ratio (LR) for that value of the test

The area under the curve is a measure of test accuracy


The Area under an ROC Curve


Good tests lie close to the upper left hand corner of the graph – where sensitivity and specificity are both high

Generally the best cut-off point lies at or near the “shoulder” of the curve*

The overall accuracy of the test is represented by the area under the curve

Tests that plot close to the diagonal across the middle of the graph are least useful, as this is where the test is no better than chance

ROC curves can also be used to compare different tests

*unless there are clinical reasons for preferring a highly sensitive or highly specific test
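The area under the curve can be approximated with the trapezoidal rule. The sketch below applies it to the five blood sugar cut-points from this deck, adding the corners (0, 0) and (1, 1) as end points. With so few points the result is a rough estimate only, and the function name `trapezoid_auc` is illustrative.

```python
# (sensitivity %, specificity %) for the blood sugar cut-points 70 to 170 mg/100 ml.
sens_spec = [(98.6, 8.8), (94.3, 47.6), (85.7, 84.1), (64.3, 96.9), (42.9, 100.0)]


def trapezoid_auc(points):
    """Area under a piecewise-linear curve through (x, y) points."""
    ordered = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(ordered, ordered[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# ROC points are (false positive rate, true positive rate), plus the two corners.
roc = [(1 - spec / 100, sens / 100) for sens, spec in sens_spec]
roc += [(0.0, 0.0), (1.0, 1.0)]

auc = trapezoid_auc(roc)
print(f"Approximate AUC = {auc:.2f}")  # about 0.90 for these five cut-points
```

An area of 0.5 corresponds to the diagonal (no better than chance) and 1.0 to a test that separates diseased from non-diseased perfectly.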


Sources of Bias in the Evaluation of Screening Programs

Lead time bias

Length bias

Volunteer bias


Lead time bias

Lead time: interval between the diagnosis of a disease at screening and the usual time of diagnosis (by symptoms)

[Figure: timeline running from diagnosis by screening to diagnosis via symptoms; the interval between the two is the lead time.]


Lead-Time Bias

Consider a condition where the natural history allows for an earlier diagnosis, but survival does not improve despite identifying it earlier.

With a screening program here, survival will appear to increase,

but in reality the apparent increase is exactly the amount of time by which the screening program advanced the diagnosis.

Thus there is no benefit to screening from a survival standpoint.


Lead time bias

Assumes survival is the time between screening and death.

Does not take into account lead time between diagnosis at screening and usual diagnosis.

Diagnosis by screening in 1994

Death in 2008

Survival = 14 years


Lead time bias

Diagnosis by screening in 1994

Usual time of diagnosis via symptoms in 1998

Lead Time = 4 years

Death in 2008

True Survival = 10 years

Survival = 14 years
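The arithmetic behind this timeline can be written out explicitly. The years come from the slide; the variable names are illustrative.

```python
# Years taken from the lead time example.
screen_diagnosis = 1994    # diagnosis by screening
symptom_diagnosis = 1998   # usual time of diagnosis via symptoms
death = 2008

apparent_survival = death - screen_diagnosis      # measured from screening
true_survival = death - symptom_diagnosis         # measured from usual diagnosis
lead_time = symptom_diagnosis - screen_diagnosis  # time gained by earlier diagnosis

print(apparent_survival, true_survival, lead_time)  # 14 10 4

# The extra "survival" is exactly the lead time; the date of death is unchanged.
assert apparent_survival == true_survival + lead_time
```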


Length Bias

Most chronic diseases, especially cancers, do not progress at the same rate in everyone.

Any group of diseased people will include some in whom the disease developed slowly and some in whom it developed rapidly.

Screening will preferentially pick up slowly developing disease (longer opportunity to be screened) which usually has a better prognosis

Paneth
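A small simulation can illustrate why a single round of screening over-samples slowly developing disease. Everything in this sketch is an assumption made for illustration: half of cases are "slow" with a 5-year detectable pre-clinical phase, half are "fast" with a 1-year phase, onset times are uniform over 100 years, and one screen takes place at year 50.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

SCREEN_TIME = 50.0                               # single screening round (assumed)
DETECTABLE_YEARS = {"slow": 5.0, "fast": 1.0}    # assumed pre-clinical durations

detected = {"slow": 0, "fast": 0}
for _ in range(100_000):
    kind = rng.choice(["slow", "fast"])          # equal numbers of each in the population
    onset = rng.uniform(0.0, 100.0)              # start of the detectable phase
    # A case is picked up only if the screen falls inside its detectable phase.
    if onset <= SCREEN_TIME < onset + DETECTABLE_YEARS[kind]:
        detected[kind] += 1

slow_share = detected["slow"] / (detected["slow"] + detected["fast"])
print(f"Slow cases among screen-detected: {slow_share:.0%}")
```

Although slow and fast cases are equally common in this simulated population, slow cases make up roughly five sixths of those detected, in proportion to how long each type remains detectable.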


[Figure: Length bias. Timelines for several cases plotted against time, with a single screening point marked. Legend: O = biological onset of disease; P = disease detectable via screening; Y = symptoms begin; D = death.]


Volunteer bias

Type of bias where those who choose to participate are likely to be different from those who don’t

Volunteers tend to have:

Better health

Lower mortality

Greater likelihood of adhering to prescribed medical regimens


A worked example

The fecal occult blood (FOB) screen test is used in 203 people to look for bowel cancer:

                    Patients with bowel cancer (as confirmed on endoscopy)
FOB screen test     Present    Absent    Total
Positive            2          18        20
Negative            1          182       183
Total               3          200       203

False positive rate (α) = FP / (FP + TN) = 18 / (18 + 182) = 9% = 1 − specificity.

False negative rate (β) = FN / (TP + FN) = 1 / (2 + 1) = 33% = 1 − sensitivity.

Power = sensitivity = 1 − β

Hence, with large numbers of false positives and few false negatives, a positive FOB screen test is in itself poor at confirming cancer (PPV = 10%), and further investigations must be undertaken; it will, though, pick up 66.7% of all cancers (the sensitivity). However, as a screening test, a negative result is very good at reassuring that a patient does not have cancer (NPV = 99.5%), and this initial screen correctly identifies 91% of those who do not have cancer (the specificity).
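The figures quoted for the FOB example can be reproduced from the four cell counts that appear in the false positive and false negative rate formulas (TP = 2, FP = 18, FN = 1, TN = 182). The sketch below is a check of that arithmetic; the variable names are illustrative.

```python
# Cell counts implied by the rate formulas in the FOB example.
tp, fp, fn, tn = 2, 18, 1, 182
assert tp + fp + fn + tn == 203   # total number of people screened

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)
false_positive_rate = fp / (fp + tn)   # equals 1 - specificity
false_negative_rate = fn / (tp + fn)   # equals 1 - sensitivity

print(f"Sensitivity = {sensitivity:.1%}, Specificity = {specificity:.1%}")
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
print(f"FPR = {false_positive_rate:.1%}, FNR = {false_negative_rate:.1%}")
```

With only 3 cancers among 203 people, the 18 false positives swamp the 2 true positives, which is why PPV is so low even though specificity is 91%.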


Reliability

Validity (accuracy)

Reliability (repeatability)

Refer to Epidemiology by Gordis


Review questions from Gordis


Likelihood-ratio positive = sensitivity / (1 − specificity) = 66.67% / (1 − 91%) = 7.4

Likelihood-ratio negative = (1 − sensitivity) / specificity = (1 − 66.67%) / 91% = 0.37
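These two ratios can be recomputed from the sensitivity (2/3) and specificity (0.91) of the FOB example. The function name `likelihood_ratios` is illustrative.

```python
def likelihood_ratios(sensitivity: float, specificity: float):
    """Return (LR+, LR-) for a dichotomous test."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Sensitivity and specificity from the FOB worked example.
lr_pos, lr_neg = likelihood_ratios(2 / 3, 0.91)
print(f"LR+ = {lr_pos:.1f}")   # 7.4
print(f"LR- = {lr_neg:.2f}")   # 0.37
```

An LR+ above 1 means a positive result raises the odds of disease, and an LR- below 1 means a negative result lowers them.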
