
ORIGINAL RESEARCH

Reliability of the Guide to Pregnancy Risk Grading of the Ontario Antenatal Record in assessing obstetric risk

Brian G. Hutchison,* MD, MSc, CCFP; Ruth Milner,† MSc

Objective: To assess the reliability of the Guide to Pregnancy Risk Grading of the Ontario Antenatal Record through evaluation of inter- and intraobserver agreement on the grading of obstetric risk.

Design: Retrospective chart review.

Setting: Urban community teaching hospital in Hamilton, Ont.

Patients: Obstetric charts of 77 women were randomly selected from those of all women who delivered at the hospital or were transferred before delivery to the regional perinatal centre between Apr. 1, 1987, and Mar. 31, 1988. Six family physicians and two obstetricians participated as chart reviewers.

Main outcome measures: Agreement beyond chance (kappa [κ] statistic) between (a) different reviewers, (b) the same reviewer at different times and (c) the majority of reviewers (majority risk grade) and the antenatal record.

Main results: The κ value for interobserver agreement ranged from 0.48 (95% confidence interval [CI] 0.34 to 0.62) to 0.51 (95% CI 0.36 to 0.66). For intraobserver agreement it was 0.69 (95% CI 0.37 to 1.0). Agreement between the majority risk grade and the risk grade last recorded in the antenatal record had a κ value of 0.58 (95% CI 0.54 to 0.61).

Conclusion: The guide possesses only modest reliability. Efforts should be made to make descriptions of risk factors more explicit and to improve the training of health care providers in the use of the guide in order to prevent errors in pregnancy risk assessment and resulting inappropriate patient care and misdirection of health care resources.

From *the departments of Family Medicine and of Clinical Epidemiology and Biostatistics and the Centre for Health Economics and Policy Analysis, McMaster University, Hamilton, Ont., and †the departments of Pediatrics and of Health Care and Epidemiology, University of British Columbia, Vancouver, BC

Reprint requests to: Dr. Brian G. Hutchison, First Place Family Medical Centre, 350 King St. E, Hamilton, ON L8N 3Y3; fax (905) 527-4486

CAN MED ASSOC J 1994; 150 (12), JUNE 15, 1994

The Ontario Antenatal Record, with its Guide to Pregnancy Risk Grading, was introduced in 1979¹ and has since become "widely, although not universally used and accepted."² From the outset, the Ontario Medical Association recommended that all physicians providing obstetric care use the antenatal record. The Ontario Ministry of Health produces and distributes the forms across the province.

The Ontario Antenatal Record is in two parts. The first part is for recording demographic information, obstetric history, medical history, findings on physical examination, history of the pregnancy to date and a nutrition assessment and includes a checklist of discussion topics. A space is provided for recording a risk grade. The second part is designed for recording all information concerning the balance of the pregnancy. It includes a space for recording risk grade at each antenatal visit.

The Guide to Pregnancy Risk Grading is printed on the back of both parts of the antenatal record. The guide consists of three separate checklists under the following headings: grade A (pregnancy at no predictable risk), grade B (pregnancy at risk) and grade C (pregnancy at high risk). Grade A pregnancies are those that satisfy all four criteria in the checklist: no prior perinatal death or low-birth-weight infant, no significant medical disease, no pregnancy complications now or in the past, and adequate fetal growth. Grade B pregnancies meet one or more of the criteria on the relevant checklist; clinicians are advised to consider consultation with a specialist obstetrician. Pregnancies that meet one or more of the criteria on the grade C checklist are classified as being at high risk; transfer to a regional perinatal centre for intensive care and delivery is recommended. Clinicians are instructed to assess pregnancy risk at each antenatal visit.
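The grading rule described above can be sketched in Python. This is an illustration only: the function and argument names are hypothetical, and the precedence of grade C over grade B when markers from both checklists are present is an assumption that the description of the guide does not state explicitly.

```python
def risk_grade(meets_all_grade_a_criteria, grade_b_markers, grade_c_markers):
    """Return 'A', 'B' or 'C' from checklist findings, or None if undetermined.

    Assumption: grade C takes precedence over grade B when markers from
    both checklists are present.
    """
    if grade_c_markers:
        return "C"  # one or more grade C criteria: pregnancy at high risk
    if grade_b_markers:
        return "B"  # one or more grade B criteria: pregnancy at risk
    if meets_all_grade_a_criteria:
        return "A"  # all four grade A criteria satisfied
    # Grade A criteria not all met, yet no listed B or C marker was found:
    # the description gives no rule for this case.
    return None


print(risk_grade(True, [], []))
print(risk_grade(False, ["mild toxemia"], []))
print(risk_grade(False, ["mild toxemia"], ["early uncontrolled premature labour"]))
```

The marker strings are examples of criteria named elsewhere in the article; the complete checklists are not reproduced here.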

Although there have been several modifications to the guide, most recently in 1992,³ the format of the grading system has remained unchanged. The system's performance in predicting or altering pregnancy outcome has not been assessed.

Several reviews have concluded that the available systems for antenatal risk assessment perform poorly in predicting pregnancy outcomes.⁴⁻⁷ For example, Alexander and Keirse⁴ assessed 31 systems used to screen for increased risk of perinatal death, low birth weight, preterm birth or low Apgar score at birth. They found that, depending on the cutoff point and the test chosen, only between 10% and 30% of the women actually experienced the adverse outcome for which the scoring system had declared them to be at high risk. On the other hand, between 20% and 50% of women who had preterm or low-birth-weight infants had low-risk scores. Possible sources of this low predictive validity are summarized in Table 1. Of these, the reliability (reproducibility) of risk assessment systems has not been evaluated, even though Wall⁵ defined an effective scoring system as one that is "objective and reliable, with consistent scores given for an individual patient by multiple users."

Reliability is a prerequisite for the validity of an assessment tool. An instrument that does not provide reliable (reproducible) results will invariably lack validity (although an instrument can be reliable without necessarily being valid). It is therefore possible that the unreliability of antenatal risk assessment systems is a substantial contributor to their poor performance in predicting pregnancy outcomes. In this study we addressed this possibility by examining inter- and intraobserver agreement on grading of obstetric risk using the Ontario Antenatal Record and its Guide to Pregnancy Risk Grading.

Methods

The study setting was St. Joseph's Hospital, Hamilton, Ont., a community hospital affiliated with McMaster University. The reliability study was carried out in conjunction with an audit of 250 obstetric charts randomly selected from the charts of all women who delivered at the hospital or were transferred before delivery to the regional perinatal centre from Apr. 1, 1987, to Mar. 31, 1988. Approximately 20% of these women were admitted under the care of family physicians.

Table 1: Possible sources of low predictive validity of antenatal risk assessment instruments (table body not legible in the source)


Six family physicians and two obstetricians participated as the chart reviewers. Each of the family physicians provided full obstetric care (antenatal and intrapartum care) in their practices.

Each chart was reviewed by at least one physician. To assess interobserver reliability, 82 charts were reviewed independently by any three of the eight raters; thus, each of these charts was reviewed by three people. The scheme for allocating charts to the reviewers is shown in Table 2. In addition, each of the eight reviewers reviewed five charts again for intraobserver agreement. In total, each rater reviewed either 56 or 57 charts, 31 (or 32) of which were for assessing interobserver agreement and 5 of which were repeat reviews for intraobserver agreement. The charts were reviewed in two equal sets, approximately 3 months apart. The time interval and high proportion of new charts ensured that the reviewers were unlikely to recognize the charts they had previously examined.
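The chart counts in this allocation can be checked with a short Python sketch. The chart-number ranges are those listed in Table 2; which reviewer received each block is not used, and the variable names are illustrative assumptions.

```python
# Chart-number ranges as listed in Table 2 (inclusive bounds).
# Blocks reviewed independently by three raters (interobserver assessment):
triple_blocks = [(1, 10), (11, 20), (21, 30), (52, 61),
                 (83, 92), (114, 123), (145, 155), (177, 187)]
# Blocks reviewed by a single rater:
single_blocks = [(31, 51), (62, 82), (93, 113), (124, 144),
                 (156, 176), (188, 208), (209, 229), (230, 250)]


def block_size(block):
    """Number of charts in an inclusive (first, last) range."""
    first, last = block
    return last - first + 1


interobserver_charts = sum(block_size(b) for b in triple_blocks)
single_review_charts = sum(block_size(b) for b in single_blocks)
total_charts = interobserver_charts + single_review_charts
repeat_reviews = 8 * 5  # eight reviewers, five repeat charts each

print(interobserver_charts, single_review_charts, total_charts, repeat_reviews)
```

The totals match the 82 interobserver charts and 250 audited charts stated in the text, and the 40 repeat reviews match the number of charts reported for intraobserver agreement in Table 3.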

The number of charts selected for assessment of inter- and intraobserver agreement was the minimum number that we felt, based on previous experience, would provide an acceptably precise estimate of agreement beyond chance. The reviewers were instructed to determine risk status on admission by examining the admission history and physical record as well as parts 1 and 2 of the Ontario Antenatal Record. They were further instructed to consult the Guide to Pregnancy Risk Grading and to assign a grade with reference to the risk factors (markers) listed in the guide. No special training in use of the guide was provided, and no attempt was made to discuss or resolve disagreement. A risk grade was assigned only to women at term (36 to 42 weeks' gestation) on admission; this reduced the number of cases available for assessing interobserver agreement from 82 to 77.

As well as assessing inter- and intraobserver agreement we examined agreement between the risk grade recorded in the antenatal record and the grade assigned by the majority of the three reviewers (majority risk grade). For this analysis, we excluded cases in which the relevant antenatal record was missing from the chart or was lacking a risk grade.

In our analysis of results we calculated a generalized kappa (κ) statistic, which provides a measure of agreement beyond that expected to occur by chance.⁸
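The formula is not reproduced in the article, so the following Python sketch assumes the multiple-rater kappa commonly attributed to Fleiss, in which every chart is graded by the same number of reviewers. The function name and the toy counts are illustrative assumptions, not study data, and the handling of charts with a missing grade is not addressed.

```python
def generalized_kappa(counts):
    """Fleiss-type kappa for several raters (assumed form of the statistic).

    counts[i][j] = number of raters who assigned chart i to grade j.
    Every chart must have been graded by the same number of raters.
    """
    n_charts = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("each chart needs the same number of raters")

    # Observed agreement: share of agreeing rater pairs, averaged over charts.
    per_chart = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_chart) / n_charts

    # Chance agreement: sum of squared overall proportions of each grade.
    n_grades = len(counts[0])
    grade_share = [
        sum(row[j] for row in counts) / (n_charts * n_raters)
        for j in range(n_grades)
    ]
    p_chance = sum(p * p for p in grade_share)

    # Undefined (division by zero) if every rating falls in a single grade.
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical toy data: 4 charts, 3 reviewers, columns = grades [A, B].
toy = [[3, 0], [2, 1], [0, 3], [3, 0]]
kappa = generalized_kappa(toy)
print(round(kappa, 3))
```

A value of 1 indicates complete agreement and a value of 0 indicates agreement no better than chance.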

Results

In all of the cases the reviewers assigned a risk grade of A or B. The levels of inter- and intraobserver agreement beyond chance on risk grading are summarized in Table 3.

The levels of agreement beyond chance between the majority risk grade and the grade recorded in the antenatal record are shown in Table 4. Two of the 77 cases were eliminated because there was no majority risk grade (in each case one reviewer did not record a grade and the other two assigned different grades). In the remaining 75 cases all three reviewers assigned the same grade in 49, and two of the three reviewers did so in 26.

Pregnancies were substantially more likely to be considered at risk by the reviewers than by the attending physician. For example, of the pregnancies for which a risk grade was recorded in the antenatal record, 21% had grade B as the last recorded risk grade, whereas 33% had a majority risk grade of B. Similarly, of the pregnancies with a risk grade assigned at or after 35 weeks' gestation, 21% had grade B recorded, whereas 37% had a majority risk grade of B.

Table 2: Allocation of obstetric charts to six family physicians and two obstetricians for review of pregnancy risk grading*

Charts | No. of reviewers (from reviewers A to H)
1-10 | 3
11-20 | 3
21-30 | 3
31-51 | 1
52-61 | 3
62-82 | 1
83-92 | 3
93-113 | 1
114-123 | 3
124-144 | 1
145-155 | 3
156-176 | 1
177-187 | 3
188-208 | 1
209-229 | 1
230-250 | 1

*Charts reviewed by three reviewers were those allocated for assessment of interobserver agreement. The columns identifying the individual reviewers (A to H) are not legible in the source.

Table 3: Level of agreement on risk grading among the reviewers using the Guide to Pregnancy Risk Grading of the Ontario Antenatal Record

Agreement | No. of charts | Pooled observed agreement | Pooled chance agreement | Generalized kappa (κ) value (and 95% CI)*
Interobserver | 77† | 0.75 | 0.52 | 0.48 (0.34-0.62)
Interobserver | 74‡ | 0.77 | 0.54 | 0.51 (0.36-0.66)
Intraobserver | 40 | 0.85 | 0.50 | 0.69 (0.37-1.0)

*CI = confidence interval.
†Includes three charts with missing data for one reviewer.
‡Excludes three charts with missing data for one reviewer.
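As a rough check on Table 3, kappa can be recomputed from the pooled observed and chance agreement using κ = (observed − chance) / (1 − chance). The Python sketch below uses the rounded table entries, so small discrepancies from the published values are to be expected; the function name is an illustrative assumption.

```python
def kappa_from_agreement(p_observed, p_chance):
    """Agreement beyond chance as a fraction of the maximum possible."""
    return (p_observed - p_chance) / (1 - p_chance)


# (label, pooled observed agreement, pooled chance agreement, published kappa)
table_3 = [
    ("interobserver, 77 charts", 0.75, 0.52, 0.48),
    ("interobserver, 74 charts", 0.77, 0.54, 0.51),
    ("intraobserver, 40 charts", 0.85, 0.50, 0.69),
]

recomputed = [round(kappa_from_agreement(po, pe), 2) for _, po, pe, _ in table_3]
for (label, _, _, published), value in zip(table_3, recomputed):
    print(f"{label}: recomputed {value:.2f}, published {published:.2f}")
```

The recomputed values differ from the published ones by at most 0.01, which is consistent with rounding of the pooled agreement figures.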

Discussion

We found that the level of interobserver agreement beyond chance on antenatal risk grading was only fair according to the criteria suggested by Landis and Koch⁹ and endorsed by Fleiss.¹⁰ (A κ value of less than 0.40 represents poor agreement, 0.40 to 0.75 fair to good agreement and more than 0.75 excellent agreement.) The level of intraobserver agreement was good (but not excellent) according to those standards. These modest levels of agreement suggest that the unreliability of systems for obstetric risk assessment may be a significant contributor to their poor predictive validity.⁴⁻⁷
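The cutoffs quoted above can be written as a small Python helper. The function name is an illustrative assumption, and the treatment of the boundary values follows the wording in the text (0.40 to 0.75 inclusive is "fair to good").

```python
def agreement_label(kappa):
    """Label a kappa value using the cutoffs quoted in the text."""
    if kappa < 0.40:
        return "poor"
    if kappa <= 0.75:
        return "fair to good"
    return "excellent"


# The interobserver and intraobserver values reported in this study.
for value in (0.48, 0.51, 0.69):
    print(value, agreement_label(value))
```

All three reported values fall in the "fair to good" band.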

Our finding of only fair interobserver agreement indicates that the guide's unreliability can be expected to result in failure to identify some pregnancies at risk and incorrect labelling of others as being at risk, possibly leading to unnecessary and potentially hazardous interventions.

One possible reason for the guide's unreliability may be the ambiguity of some of the descriptions of risk factors in the guide. For example, among the criteria for a grade B pregnancy are "significant tobacco, alcohol, drug intake", "premature rupture of membranes 34 weeks or more", "other significant illness" and "mild toxemia," each of which is subject to varying interpretation. Another possible source of unreliability is the guide's encouragement of physicians to add further risk factors. However, in our study the reviewers were instructed to base their grading only on the risk factors listed in the guide. If physicians were to add their own risk factors this would result in even greater unreliability of the assessment instrument.

Agreement between the majority risk grade and the grade recorded in the antenatal record was fair to good, the κ value ranging from 0.58 to 0.69 depending on how close to hospital admission the risk grade was recorded. Disagreement between these risk grades could have been due to a combination of the guide's unreliability, failure of the attending physicians to recognize risk factors or to alter risk grading in response to new risk factors, or the emergence of new risk factors between the last antenatal visit and admission to hospital. A significant incidental finding was the unavailability of risk grades for a substantial proportion of the 75 pregnancies. No risk grade was available for 18 (24%) because the antenatal record was either missing or did not include a risk grade. A risk grade at or after 35 weeks' gestation was unavailable in 51 (68%) of the 75 cases. In computing the level of agreement between the majority risk grade and the grade recorded in the antenatal record, we excluded cases in which the antenatal record was missing or did not provide a risk grade. Had we considered these cases to represent disagreement with the majority risk grade, the level of agreement would have been much lower.

To assess the implications of our findings, the representativeness of our reviewers needs to be considered. The 3:1 ratio of family physicians to obstetricians in our study corresponds fairly closely to the distribution of antenatal (as opposed to intrapartum) care among family physicians and obstetricians in Ontario. In a survey of a random sample of Ontario family physicians 51.5% reported that they provided antenatal care to all patients, and a further 17.0% reported that they provided antenatal care to low-risk patients only.¹¹

The family physicians who participated in our study all provide intrapartum as well as antenatal care and have a strong commitment to family-practice obstetrics. Because of this shared commitment, their level of agreement on risk grading may have been higher than the level expected in a random sample of family physicians providing antenatal care. Also, our results tend to overstate the level of interobserver agreement, because all of the reviewers were performing risk assessment using identical information. In clinical practice different clinicians will elicit and note somewhat different information, which will in turn contribute to disagreement regarding risk classification.

Table 4: Level of agreement between the majority risk grade and the risk grade recorded in parts 1 and 2 of the antenatal record (table body not legible in the source)

Another potential limitation of our study is that the information used for risk grading was collected in 1987 and 1988, and the charts were reviewed in 1989. However, we do not believe that the level of inter- and intraobserver agreement would have changed during the intervening period. Subsequent modifications to the guide were minor,³ and the format and instructions to clinicians were unchanged. The level of agreement could have increased in response to improved training of care providers in the use of the guide. However, we are not aware of any significant educational activity of this nature.

None of the pregnancies in our study were assigned a risk grade of C. With random sampling, this is perhaps not surprising given the rarity of the conditions leading to a grade C designation. However, by limiting our sample to women between 36 and 42 weeks' gestation we may have excluded some grade C pregnancies for which the criterion for high risk was early uncontrolled premature labour. The inter- and intraobserver agreement may conceivably be greater on the assignment of grade C than on the assignment of grades A and B. However, in usual clinical practice the principal distinction to be made is between grades A and B, because these represent most pregnancies.

Our findings indicate that the guide in the antenatal record possesses only modest reliability. This suggests that descriptions of risk factors should be more explicit and the training of care providers in the use of the guide should be improved. Such efforts can be expected to reduce errors in pregnancy risk assessment and resulting inappropriate patient care and misdirection of health care resources.

We thank Drs. Fionnella Crombie, Bob Lancaster, Henry Muggah, Phil Shea, David Small, Paul Steinberg and Jacqui Wakefield for their conscientious work as chart reviewers.

References

1. Goodwin JW, Chance GW: New system for managing high-risk pregnancies. Ont Med Rev 1969; Nov: 563-567

2. Advisory Committee on Reproductive Care: Reproductive Care: Towards the 1990s, Ontario Ministry of Health, Toronto, 1987: 24

3. Chance GW: Changes to the Ontario Antenatal Record. Ont Med Rev 1992; Dec: 23-25

4. Alexander S, Keirse MJNC: Formal risk scoring during pregnancy. In Chalmers I, Enkin M, Keirse MJNC (eds): Effective Care in Pregnancy and Childbirth, Oxford University Press, Oxford, England, 1989: 345-365

5. Wall EM: Assessing obstetric risk: a review of obstetric risk-scoring systems. J Fam Pract 1988; 27: 153-163

6. Lumley J: Prediction of preterm births. In Yu VYH, Wood EC (eds): Prematurity, Churchill Livingstone, Edinburgh, 1987: 43-53

7. Committee to Study the Prevention of Low Birthweight: Screening for obstetric risk. In Preventing Low Birthweight, National Academy Press, Washington, 1985: 76-93

8. Fleiss JL: Statistical Methods for Rates and Proportions, 2nd ed, John Wiley & Sons, New York, 1981: 225-232

9. Landis JR, Koch GG: The measurement of observer agreement for categorical data. Biometrics 1977; 33: 159-174

10. Fleiss JL: Statistical Methods for Rates and Proportions, 2nd ed, John Wiley & Sons, New York, 1981: 218

11. Bain ST, Grava-Gubins I, Edney R: The family doctor in obstetrics: Who's looking after the shop? Can Fam Physician 1987; 33: 2693-2701
