

Wolfe Mammographic Parenchymal Patterns

A Study of the Masking Hypothesis of Egan and Mosteller

JOHN WHITEHEAD, PHD,* THOMAS CARLILE, MD,† KENNETH J. KOPECKY, PHD,‡ DONOVAN J. THOMPSON, PHD,§ FRED I. GILBERT, JR., MD,‖ ARTHUR J. PRESENT, MD,¶ BARBARA ANNE THREATT, MD,# PETER KROOK, MD,** AND EVELYN HADAWAY, RN, MSW†

Wolfe defined four different classes of breast parenchymal patterns and claimed that they were associated with different risks for the subsequent development of breast cancer. Egan and Mosteller suggested that these patterns did not constitute a true risk factor; rather, the effect was caused by the greater difficulty of detecting breast cancers in the dense (P2, DY) patterns compared with the fatty (N1, P1) patterns. Similarly, Mendell believed that a bias was introduced into Wolfe’s work by requiring a negative mammogram before a patient entered the study. This study of 221 prevalent and 706 incident cancers followed for up to 10 years indicates that a masking effect does exist, but that it operates in addition to a difference in risk of breast cancer within the four Wolfe classes. Wolfe’s hypothesis is found to be valid.

Cancer 56:1280-1286, 1985.

Wolfe¹,² proposed that the radiographic appearance of breast parenchyma provides a method of predicting who will develop breast cancer. He divided mammograms into four classes depending upon the proportion of fat, ducts, and “dysplasia,” or densities present in the breast (Table 1). “Dysplasia” is descriptive of generalized density, and is not related to pathologic, microscopic dysplasia.

Wolfe conducted two retrospective studies on women older than 30 years of age. The first concerned women who had been referred to Hutzel Hospital, Detroit, Michigan, because of presumed breast abnormalities. The women’s mammograms were taken between January 1967 and January 1972. Breast cancer cases were found by searching the files of the Tumor Registry of the Michigan Cancer Foundation, and also the files of Hutzel Hospital. The second study concerned women who were referred between January 1972 and January 1973. This time cases were found using a mailed questionnaire to each woman; the response rate was 75%.

Supported by NCI grant 3-9 18-CA-2607 I.

* University of Reading, England; formerly of the Fred Hutchinson Cancer Research Center, Seattle, Washington.

† Virginia Mason Research Center, Seattle, Washington.

‡ Radiation Effects Research Foundation, Hiroshima City, Japan; formerly of the Fred Hutchinson Cancer Research Center, Seattle, Washington.

§ University of Washington, Seattle, Washington.

‖ Pacific Health Research Institute, Honolulu, Hawaii.

¶ University of Arizona College of Medicine, Tucson, Arizona.

# University of Michigan, Ann Arbor, Michigan.

** Bess Kaiser Medical Center, Portland, Oregon.

Address for reprints: John Whitehead, PhD, Department of Applied Statistics, University of Reading, Whiteknights, Reading RG6 2AN, England.

The authors thank the following Project Coordinators: Pamela Roselle, MBA, Patricia Hackley, BA (Ann Arbor), Sue Anderson, RN, Gloria Low, RN, MPH (Honolulu), Evelyn Hadaway, RN, MSW (Seattle), Terry Tobin, BA, and Bobbe Dexter, RN (Tucson). Pat Mueller and John Weaver provided technical assistance.

Accepted for publication November 2, 1984.

Wolfe states that, “Criteria for classification were deliberately varied in the two studies. An attempt was made in study 2 to capture within a larger population a greater number of the developing cancers.” However, only the definitions given in Table 1 were published. The results of the two studies are shown in Table 2. Risks are quoted relative to the N1 pattern.

Wolfe’s work stimulated other radiologists to investigate his findings. Over 50 reports have appeared in the international literature evaluating the Wolfe classification, with varying results. Some authors confirmed his findings but with lower risk ratios, whereas others found no confirmation. The studies have varied in such aspects as use of prevalent or incident cases, duration of follow-up after initial mammogram, methods, and design. A number of studies are based on small numbers of cancers and short periods of observation.

Several suggestions were offered as explanations of why Wolfe’s findings could not be reproduced. Mendell and co-workers³ believed that Wolfe’s work was biased by requiring a negative mammogram before entry into the study. This criticism is similar to that of Egan and Mosteller,⁴ who suggested that the differences in risk observed by Wolfe were due to the difficulty in diagnosing small cancers in the dense P2 and DY type breasts, rather than being truly associated with parenchymal pattern. They called this phenomenon “masking.”

No. 6 PARENCHYMAL PATTERNS AND MASKING · Whitehead et al.

TABLE 1. The Wolfe Classification of Parenchymal Patterns

Class N1

As defined by Wolfe in 1976¹: Parenchyma composed primarily of fat with at most small amounts of “dysplasia.” No ducts visible.

As defined, in consultation with Wolfe, for the current study: The breasts are composed primarily of fat with, at most, minimal areas of increased density varying somewhat according to age. No ducts visible. Biopsy changes, calcifications, alterations of vascularity, or the presence of cysts, fibroadenomata, or other masses are ignored in making the classification. If subareolar structures are present suggesting ducts, but the identity is not certain, the classification is N1.

Class P1

As defined by Wolfe in 1976¹: Parenchyma chiefly fat with prominent ducts in the anterior portion up to 1/4 of volume of breast. Also may be a thin band of ducts extending into a quadrant.

As defined, in consultation with Wolfe, for the current study: Same as N1 with the addition of ducts. The ducts occupy less than 1/4 of the volume of the breast. They may form a small dense triangle or cone with apex at the nipple. Nodularity or beading of the ducts varies in amount and may be the only evidence of ducts. As in N1, nonductal densities may be present to a minimal degree.

Class P2

As defined by Wolfe in 1976¹: Severe involvement with prominent duct pattern occupying more than 1/4 of volume of breast.

As defined, in consultation with Wolfe, for the current study: Involvement with prominent ductal pattern occupying 1/4 or more of the breast volume. The patterns may be linear, nodular, or both. The ducts may present as a cone or funnel occupying more than 1/4 of the volume in the central part of the breast. Often the periductal connective tissue produces coalescence of the area of involvement of prominent ducts. The superficial margin of the involved area is smooth and not scalloped. There is a rim of subcutaneous fatty tissue when the ducts appear as a coalescent area. A homogeneously dense breast, with a sharply marginated, smooth rim of subcutaneous fatty tissue, even without visible ducts, is also a P2. Some cases present with extensive sheetlike areas of density consistent with DY but also have ducts. The presence of ducts requires a P2 classification.

Class DY

As defined by Wolfe in 1976¹: Severe involvement with “dysplasia.” Often obscures an underlying prominent duct pattern.

As defined, in consultation with Wolfe, for the current study: Sheetlike areas of irregularly increased density which often contain islands of fat. The volume of breast involved by this density is usually more than 25%, but may involve the whole breast. The classic DY breast has at least 50%-75% involvement. This usually has a scalloped peripheral margin. A breast completely filled with a nonhomogeneous density is also included. No ducts are visible. In the breast of usual DY density the presence of ducts of any amount precludes DY and leads to a P2 classification. In the equivocal, less dense N1-DY, the presence of ducts less than 25% leads to a P1 classification. If there is a question about the presence or absence of ducts, and no certain decision, the classification is DY.

In this report, data from the study of Carlile et al.⁵ are used to investigate masking. The main objective of the study was to estimate the relative risks of breast cancer in the four parenchymal pattern classes, and to determine whether they were significantly different from one another. For the purpose of the study, the definitions of the four parenchymal pattern classes were expanded, in consultation with Dr. Wolfe himself. The only modification was the exclusion of ducts from the DY classification (see also Wolfe et al.⁶). The definitions used are shown together with Wolfe’s original versions in Table 1. Relative to the N1 pattern, the risks for P1, P2, and DY were found to be 2.0, 3.5, and 3.1, respectively; full details of that analysis were reported by Carlile et al.⁵ In the current article it will be shown that masking does exist. Because of the study design, however, the relative risk estimates reported by Carlile et al.⁵ were virtually unaffected by this masking. Before describing the methods and results of the current investigation, it is necessary to consider the action of masking in more detail.

Masking

The hypothesis of Egan and Mosteller⁴ is that a masking phenomenon exists that renders mammography less efficient in detecting cancers when used on dense breasts than when used on fatty breasts. They equate fatty breasts with Wolfe’s N1 and P1 patterns and dense breasts with P2 and DY. They further suggest that masking, by protecting cancers in P2 and DY breasts from detection by mammography, can lead to an apparent excess of cancers in these dense breasts. Such an excess has been observed in studies with a short-term follow-up after a negative mammogram. In particular, they claim that the masking phenomenon alone could account for the large relative risks reported by Wolfe¹,² and seen again in the data from the first part of their own study.

TABLE 2. Results of the Two Studies by Wolfe

                                        Relative risks
Study  No. of subjects  No. of cases    N1    P1    P2     DY
1      5284             56              1     3.7   14.0   37.3
2      1930             20              1     0.0   7.1    21.2

CANCER September 15 1985 Vol. 56

FIG. 1. Diagrammatic representation of the influence of masking on study A. [Diagram not reproduced. Two panels, Fatty Breasts (tumor ages 0 to 3 years) and Dense Breasts (tumor ages 0 to 4 years), each with one row per examination from year -3 to year 4 and a column for the number of tumors detected. P = physical examination only; M = physical examination and mammography. Arrows mark a tumor growing for one year, undetected, and a tumor being detected at the next examination.]

For the masking hypothesis to give rise to spurious relative risks it is necessary that the advantage of using mammography rather than physical examination alone is greater for fatty breasts than for dense breasts. As the issue is crucial to the assessment of parenchymal pattern relative risks, and also quite complex, the effect will be illustrated using a simplified model of the masking phenomenon of Egan and Mosteller.⁴ The intention is not to suggest that the model is quantitatively realistic, but rather to indicate qualitatively what influence masking can and cannot have.

FIG. 2. Diagrammatic representation of the influence of masking on study B. [Diagram not reproduced. Two panels, Fatty Breasts and Dense Breasts, each with one row per examination and columns for the number of tumors detected and the relative risk (RR). P = physical examination; M = physical examination and mammography. Arrows mark a tumor growing for one year, undetected, and a tumor being detected at the next examination. The RR column reads 1, 1, 1, 1, 1, 1.25, 1.20, 1.17.]

We shall consider only two breast types, “fatty” and “dense,” and assume that the risk of breast cancer is the same for both. It will be assumed that an examination that includes mammography as well as physical examination detects a tumor in a fatty breast after 1 year of development, and in a dense breast after 3 years. A physical examination alone is assumed to detect 3-year-old tumors in fatty breasts and 4-year-old tumors in dense breasts. This we believe to be a qualitative truth: that tumors are easier to detect using mammography and easier to detect in fatty breasts, and that they are especially easy to detect using mammography on fatty breasts.

Now consider a large population of women, equally divided into those with fatty breasts and those with dense breasts. Assume that each year 100 new tumors begin development in 100 different women in this population. As we are assuming breast type to be independent of risk, assume that 50 of these are in fatty breasts and 50 are in dense breasts.

Every year the women are examined. Usually this consists of physical examination only. Two distinct uses of mammography will be considered. Study A resembles Wolfe’s¹,² and that of Egan and Mosteller⁴: mammography is used routinely at only one of the annual examinations. Study B resembles our own investigation in that a series of five consecutive annual examinations include mammography.

Figures 1 and 2 show what will happen in the two studies. A P-exam is one using physical examination only, and an M-exam includes mammography as well. Year 0 is the year of the first M-exam, and the figures include examinations in previous and in subsequent years. In Figure 1, the first line represents the situation at the P-exam in year -3. Among women with fatty breasts, there are 50 with tumors that have just started, 50 with tumors that are 1 year old, and 50 with tumors 2 years old. None of these tumors are detected, and thus they are present the following year, 1 year more advanced. In addition, at year -3, physical examination detects 50 tumors in fatty breasts, which are 3 years old and were undetected the previous year. This pattern recurs throughout the premammography period, with 50 new tumors appearing each year, tumors in place growing 1 year older, and 50 tumors being detected each year after 3 years of growth. The situation in dense breasts is similar, although detection takes place only after 4 years of development. For dense breasts too, 50 tumors are detected each year.

Then at year 0, mammography is added. In fatty breasts, tumors that are 1, 2, or 3 years old are detected, 150 in all. In dense breasts the advantage is not as great; 100 3- and 4-year-old tumors are detected. These 150 + 100 cancers form the prevalent series. Notice that from this prevalent series the risk of breast cancer will appear to be greater for women with fatty breasts than for women with dense breasts. All subsequent tumors are included in the incident series. At year 1 examination reverts to physical examination only. All detectable tumors have been cleared out by mammography. At year 2 there are 50 tumors in dense breasts ready to be detected, but none yet in fatty breasts. Calculated now, the risk of breast cancer for dense breasts relative to fatty breasts is infinite. From year 3 onwards, the steady premammography pattern is restored. The incident series includes the excess of tumors in dense breasts detected in year 2, and the relative risk decreases back to 1 as shown in the RR (relative risk) column.

Figure 2 describes Study B in a similar way. The situation is the same as for Study A until year 1. As mammography is used for 5 years, the first 4 years of the incident series produce 50 tumors from each type of breast, and relative risks of 1. Only when mammography ceases does masking have its effect. As shown in the figure, no tumors are detected in year 5. Detection of tumors in dense breasts resumes in year 6 and detection in fatty breasts resumes in year 7. The inflation of relative risk is less marked because by year 5 the incident series already includes 400 tumors collected in an unbiased manner and evenly divided between the breast types. Thus, the model has shown that, without any real difference in risk between fatty and dense breasts, the risk of cancer in dense breasts relative to fatty breasts has been estimated to be 0.67 from the prevalent series, and greater than 1 from the incident series. The model further shows that the problem is alleviated in the incident series of a study with a series of follow-up mammograms. Finally, it indicates that in such a study the effect of masking is likely to be seen only after mammography ceases. The model is too naive to be useful in adjusting relative risks for the effect of masking. It would be interesting to develop a stochastic model allowing for variation in tumor rates and detection rates for that purpose.
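The arithmetic of the simplified model can be checked with a short deterministic simulation. The sketch below is illustrative only and is not part of the original analysis. It takes from the text the 50 new tumors per breast type per year and the four detection ages; the function names (`detections`, `relative_risks`), the ten-year run-in before year 0, and the use of cumulative case counts as a stand-in for relative risk (reasonable here because the two breast types are assumed equally common) are choices made for this sketch.

```python
import math

NEW_PER_YEAR = 50  # new tumors per year in each breast type (from the text)

# Youngest tumor age (years) that each examination type detects (from the text).
DETECT_AGE = {
    "fatty": {"P": 3, "M": 1},
    "dense": {"P": 4, "M": 3},
}


def detections(breast, m_years, first_year=-10, last_year=10):
    """Return {year: number of tumors detected} for one breast type.

    m_years is the set of years whose examination includes mammography;
    every other year is a physical examination only.
    """
    undetected = {}  # tumor age in years -> count of undetected tumors
    found = {}
    for year in range(first_year, last_year + 1):
        # Existing tumors grow one year older; a new cohort starts at age 0.
        undetected = {age + 1: n for age, n in undetected.items()}
        undetected[0] = NEW_PER_YEAR
        threshold = DETECT_AGE[breast]["M" if year in m_years else "P"]
        found[year] = sum(n for age, n in undetected.items() if age >= threshold)
        undetected = {age: n for age, n in undetected.items() if age < threshold}
    return found


def relative_risks(m_years, last_year=10):
    """Dense-vs-fatty ratio: prevalent (year 0) and cumulative incident."""
    fatty = detections("fatty", m_years, last_year=last_year)
    dense = detections("dense", m_years, last_year=last_year)
    prevalent = dense[0] / fatty[0]
    incident = {}
    cum_fatty = cum_dense = 0
    for year in range(1, last_year + 1):
        cum_fatty += fatty[year]
        cum_dense += dense[year]
        if cum_fatty:
            incident[year] = cum_dense / cum_fatty
        else:  # no fatty-breast cases yet: ratio is infinite or undefined
            incident[year] = math.inf if cum_dense else math.nan
    return prevalent, incident


study_a = relative_risks(m_years={0})              # one mammographic examination
study_b = relative_risks(m_years={0, 1, 2, 3, 4})  # five consecutive ones
print("Study A:", round(study_a[0], 2), study_a[1])
print("Study B:", round(study_b[0], 2), study_b[1])
```

Under these assumptions the sketch reproduces the figures quoted above: 150 and 100 prevalent cancers (a ratio of 0.67), an infinite incident ratio at year 2 of Study A, and incident ratios of 1.25, 1.20, and 1.17 at years 6, 7, and 8 of Study B.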

Methods

The study was based on the follow-up of 40,000 women who had been screened in four Breast Cancer Detection Demonstration Projects (BCDDP), two using film-screen (Ann Arbor and Honolulu) and two using xerography (Seattle and Tucson). After a 5-year screening program, the subjects were contacted annually by telephone or letter to determine the development of new incident cancers and changes in health status. A data-coordinating center (DCC) was established at the Fred Hutchinson Cancer Research Center, Seattle, Washington. The ability of participating radiologists to classify mammograms was established satisfactorily in a preliminary study by Carlile et al. in 1983.⁷ Other background data were collected and analyzed and demonstrated that the effects of conventional risk factors in the 40,000 women were comparable to those found in other populations. In the follow-up of the women in the study, 92% were interviewed, 4% were deceased, and only 4% were lost to follow-up or refused to participate.

The investigation was conducted as a matched case-control study. As cases of breast cancer were diagnosed they were matched with controls enrolled at the same BCDDP clinic, within the same 5-year age range and with a similar date of initial mammogram. Two controls were selected for each case. Because of the racial mix of subjects in Honolulu, race was used as a matching factor for that clinic only; two categories were specified, Asian and non-Asian. Initial mammograms were taken between September 1973 and July 1976. Data collection ceased on February 29, 1984.

Women diagnosed as a result of their initial screening mammograms formed a prevalent series (P-series) of cases. Women diagnosed subsequently formed an incident series (I-series) of cases. Relative risks were analyzed using conditional logistic regression.⁸

Physical examinations were performed by experienced nurses who were highly trained in breast examination. The project directors in most situations confirmed the physical findings when abnormal. Mammograms were done on dedicated mammography units. The mammograms were read and classified by experienced radiologists who had worked with Wolfe. The conventional two views were taken and augmented by additional projections when needed.

As a result of concern about radiation carcinogenesis,⁹ in July 1977 the National Cancer Institute discontinued mammography in the BCDDPs for women younger than 50 years, unless there was a personal history of breast cancer or primary family history (mother or sister). Because of the widespread publicity, other women refused mammography and a few dropped out of the program. This resulted in reduced numbers of subjects, particularly those younger than 50.

Visits and examinations were designated numerically by the year of the visit and whether it was the first or second examination of that year. Thus, the 5-1 mammograms mentioned below were taken in the patient’s fifth year, and were the first examination of that year.

FIG. 3. [Diagram and caption not recovered. Number of cases: 221, 141, 188, 128, 130; final value illegible.]

TABLE 3. The Joint Distribution of Method of Detection and Parenchymal Pattern (Cases Only)

(Rows for N1, P1, P2, and DY not recovered.)

        Mamm only  Mamm and PE  PE only  Between screens  After screening  Total
Total   114 (100)  140 (100)    77 (100) 78 (100)         297 (100)        706 (100)

Chi-square = 39.4; df = 12; P < 0.001. PP: parenchymal pattern; Mamm: mammography; PE: physical examination.

Measurements of tumor size were usually obtained when possible from the hospital pathology report. If the dimensions were not recorded the project pathologist was asked to attempt to obtain this information from the examining pathologist or, when possible, to measure the tumor on the slide.

Results

Method of Detection

A direct assessment of whether masking actually exists can be made by considering how cases in the incident series were diagnosed. These data are displayed in Table 3. The first three columns concern women diagnosed during screening, and are ordered according to the decreasing influence of mammography on diagnosis. Thus, for women in the first column, the tumors were detected by mammography alone and were not detected by physical examination. In the second column, both the mammogram and the physical examination identified cancer. In the third column, physical examination was the sole means of diagnosis. The fourth column comprises cases diagnosed between screening examinations, and the fifth, cases diagnosed after screening ceased: routine BCDDP mammography could not be responsible for detection in either situation.

There is a clear and highly significant (P < 0.001) relationship between parenchymal pattern and method of detection. Mammography revealed a higher proportion of cancers in N1 and P1 breasts than did physical examination. This accords with, and supports, the masking hypothesis.

The Effect of Masking on the Current Study

The effect of masking is illustrated in Figure 3. This displays the proportions of the four breast types among prevalent cases, together with corresponding proportions for five subdivisions of the I-series classified by duration between initial mammogram and diagnosis. The first three of the I-series subdivisions comprise cancers detected during the 5-year screening process and the last two concern postscreening diagnoses and are thus subject to distortion due to masking. Indeed, the proportions remain essentially constant over the first three subdivisions of the I-series, after which the N1 pattern becomes more rare and the P2 pattern more common. The P1 and DY proportions are less obviously affected. Table 4 shows the numbers corresponding to Figure 3. The variation within the I-series is significant (P < 0.05). The figure indicates that the masking phenomenon does indeed have the effect of inflating the apparent relative risks calculated from cases diagnosed after screening. The effect is most marked within the final group with diagnoses more than 7 years after initial mammogram. There is no suggestion that the parenchymal pattern proportions will return to those observed during screening among women diagnosed long after the cessation of screening. Such a return is predicted by the masking hypothesis. However, the postscreening follow-up varies between 2 and 5 years, and may not be sufficiently long for the return to be detected. The figure also shows the proportions of the four breast types among the controls. In the P-series cases, and all five subdivisions of the I-series cases, the N1 pattern is less common and the P2 pattern more common than among the controls. This demonstrates that masking cannot completely explain away the Wolfe findings.


TABLE 4. Joint Distribution of Months From Mammogram to Diagnosis by Parenchymal Pattern (Cases Only)

No. of cases (%)

                I series, during screening    I series, postscreening
PP   P series   <24     24-41     42-59       60-83     ≥84             Total

Figure 3 and Table 4 demonstrate that masking does cause a significant increase in the number of P2 and DY tumors detected after screening ceases. However, this does not necessarily imply that masking has a major effect on parenchymal pattern relative risks. Table 5 shows, in fact, that the effect is slight. The table shows relative risks calculated from the prevalent series, relative risks from cases diagnosed during the 60-month screening period and their associated controls, and relative risks derived from postscreening cases. The last are inflated, but not to a significant extent, and the magnitude of the exaggeration is small. The relative risks calculated from the women diagnosed within 60 months of their initial mammogram are significantly different from 1 (chi-square = 35.1, df = 3, P < 0.001). Relative risks from the P-series are the smallest, but they are still significantly different from 1 (chi-square = 10.2, df = 3, P < 0.025). The masking hypothesis, in the absence of a true effect of parenchymal pattern, would predict that in the P-series fatty breasts would be associated with higher risk; in fact, the reverse has been observed.
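Table 5 quotes relative risks against the P1 pattern, whereas the relative risks elsewhere in this article are quoted against N1. The two presentations can be reconciled by dividing every entry by the N1 entry. The sketch below does this for the overall I-series column; the function name `rebaseline` is hypothetical, and the step assumes that the relative risks come from a single fitted model, so that ratios of ratios are meaningful. Because the Table 5 entries are rounded to two decimals, the result agrees with the quoted 2.0, 3.5, and 3.1 only approximately.

```python
# Overall I-series relative risks from Table 5 (P1 is the reference class).
rr_vs_p1 = {"N1": 0.51, "P1": 1.00, "P2": 1.75, "DY": 1.58}


def rebaseline(rr, new_reference):
    """Re-express relative risks against a different reference class."""
    base = rr[new_reference]
    return {cls: value / base for cls, value in rr.items()}


rr_vs_n1 = rebaseline(rr_vs_p1, "N1")
for cls, value in rr_vs_n1.items():
    print(f"{cls}: {value:.2f}")
```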

Tumor Size

Table 6 concerns all cases in the I-series, and is a cross-tabulation of tumor size by parenchymal pattern. There is a slight tendency for the N1 and P1 patterns to be more common among women with small tumors, and for the P2 and DY patterns to be less common in this group. Such a relationship would be predicted by the masking hypothesis. However, the relationship is by no means significant, so this table cannot be said to provide additional evidence supporting the masking hypothesis.
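The test statistic quoted with Table 6 can be recomputed from the published counts. The sketch below applies an ordinary Pearson chi-square test of independence to the five completely specified size categories; the function name `pearson_chi_square` is hypothetical, and it is an assumption of this sketch that the authors used the uncorrected Pearson statistic.

```python
# Counts from Table 6, completely specified size categories only
# (<10, 10-14, 15-19, 20-29, >=30 mm); rows are N1, P1, P2, DY.
observed = [
    [9, 11, 6, 5, 5],
    [23, 32, 30, 35, 19],
    [44, 55, 57, 63, 51],
    [14, 21, 17, 13, 13],
]


def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


stat, df = pearson_chi_square(observed)
print(f"chi-square = {stat:.1f}, df = {df}")
```

The statistic comes out close to the 8.7 on 12 degrees of freedom reported with the table, well short of conventional significance.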

TABLE 5. Months From Mammogram to Diagnosis: Interaction With Parenchymal Pattern. Relative Risks

                      I series
PP    P series   <60 mo   ≥60 mo   Overall
N1    0.55       0.55     0.31     0.51
P1    1          1        1        1
P2    1.33       1.57     2.08     1.75
DY    1.00       1.71     1.37     1.58

Chi-square for comparison of the two subdivisions of the I series = 5.3; df = 3; NS. PP: parenchymal pattern.

Analysis Using the 5-1 Mammograms

The data available from the 5-1 mammograms, taken during the fifth annual screen, allow us to undertake a study similar in design to Wolfe’s original investigation.

The analysis of this section is confined to an incident series of cases comprising women diagnosed after screening ceased. Only cases and controls with a classified 5-1 mammogram could be used; 122 cases and 200 matched controls were available for analysis. The distribution of parenchymal pattern for these subjects is shown in Table 7, and Table 8 presents the relative risks. With an N1 baseline, the risks N1:P1:P2:DY are 1:2.4:8.2:1.6, and the effect is highly significant (P < 0.001). Although not attaining the size of the risks reported by Wolfe, the 8.2 risk for P2 relative to N1 is the largest value found in this study. Clearly, it has been inflated by masking.
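For orientation, crude (unmatched) odds ratios can be formed directly from the case and control counts in Table 7. The sketch below is not the authors' method: the relative risks in the text come from conditional logistic regression on matched sets, whereas this calculation ignores the matching, so the two need not agree. The function name `crude_odds_ratios` is hypothetical.

```python
# Case and control counts by 5-1 parenchymal pattern, from Table 7.
cases = {"N1": 4, "P1": 39, "P2": 74, "DY": 5}
controls = {"N1": 24, "P1": 96, "P2": 67, "DY": 13}


def crude_odds_ratios(cases, controls, reference="N1"):
    """Unmatched odds ratio of each class against the reference class."""
    ref_odds = cases[reference] / controls[reference]
    return {cls: (cases[cls] / controls[cls]) / ref_odds for cls in cases}


for cls, value in crude_odds_ratios(cases, controls).items():
    print(f"{cls}: {value:.2f}")
```

The crude ratios point in the same direction as the matched estimates, with P2 the highest, but their magnitudes differ, as expected when the matching is ignored.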

Discussion

A detailed account of how masking might work has been given, and evidence has been furnished for this mode of action from the current case-control study data. That masking does occur is demonstrated convincingly by the data shown in Table 3, and that it affects our study is shown by Figure 3 and Table 4. However, because of the study design, its influence is not sufficiently strong to invalidate our conclusions on parenchymal pattern relative risks. Table 5 shows that the exaggeration of parenchymal pattern relative risks due to masking is small, and that masking certainly does not provide an alternative explanation for them.

TABLE 6. Joint Distribution of Tumor Size (mm) by Parenchymal Pattern

No. of cases (%)

                                                               Not completely
PP      <10       10-14      15-19      20-29      ≥30        specified    Total
N1      9 (10)    11 (9)     6 (5)      5 (4)      5 (6)      8 (4)        44 (6)
P1      23 (26)   32 (27)    30 (27)    35 (30)    19 (22)    48 (26)      187 (26)
P2      44 (49)   55 (46)    57 (52)    63 (54)    51 (58)    102 (56)     372 (53)
DY      14 (16)   21 (18)    17 (16)    13 (11)    13 (15)    25 (14)      103 (15)
Total   90 (100)  119 (100)  110 (100)  116 (100)  88 (100)   183 (100)    706 (100)

For completely specified categories: chi-square = 8.7; df = 12; NS. PP: parenchymal pattern.

TABLE 7. The Distribution of 5-1 Parenchymal Patterns*

No. of subjects (%)

P51     Cases       Controls    All subjects
N1      4 (3)       24 (12)     28 (9)
P1      39 (32)     96 (48)     135 (42)
P2      74 (61)     67 (34)     141 (44)
DY      5 (4)       13 (6)      18 (6)
Total   122 (100)   200 (100)   322 (100)

* This analysis is restricted to subjects with 5-1 mammograms, and in triads where the case was diagnosed after screening.

P51: Parenchymal pattern of 5-1 mammogram.

The analysis of the postscreening cases using their 5-1 mammograms demonstrates the misleading effect that masking can have on a study that relies on a single mammogram. Wolfe¹,² reported even larger risk ratios than the 1:8.2 ratio we found for N1:P2. His study differed from ours in many ways: it used slightly different definitions of the four breast types; it was a cohort study rather than a case-control study; his risks were not adjusted for age, whereas ours were because of the matching; he reported an average follow-up of just 2.5 years after mammogram, whereas our follow-up ranged from 2 to 5 years after the 5-1 mammogram. Although his study was based on data from a large number of women, this included only a small proportion of cases: our study with 122 cases and 200 controls was statistically more powerful and reliable.

TABLE 8. Relative Risks Associated with 5-1 Parenchymal Pattern§

                                       Standard
PP     RR      (RR‡)     Log (RR)*     error†

N1     0.42    (1)       -0.860        0.602
P1     1       (2.38)    0             0
P2     3.44    (8.19)    1.237         0.312
DY     0.67    (1.60)    -0.402        0.620

* All logarithms are to base e.
† Standard error of the estimate of log (RR).
‡ Relative risks taking N1 as the baseline.
§ This analysis is restricted to the 322 subjects who had 5-1 mammograms, and who were in triads where the case was diagnosed after screening.

Chi-square = 28.7; df = 3; P < 0.001. PP: parenchymal pattern; RR: relative risk.
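The two relative risk columns of Table 8 differ only in the choice of baseline, and the conversion can be sketched as follows. The Wald interval in the second function is an illustration added here and is not reported in the paper; it assumes approximate normality of log (RR). The names `rebase` and `wald_interval` are hypothetical.

```python
import math

# Table 8: relative risk with P1 as baseline, log(RR), and its standard error.
table8 = {
    "N1": {"rr": 0.42, "log_rr": -0.860, "se": 0.602},
    "P1": {"rr": 1.00, "log_rr": 0.000, "se": 0.000},
    "P2": {"rr": 3.44, "log_rr": 1.237, "se": 0.312},
    "DY": {"rr": 0.67, "log_rr": -0.402, "se": 0.620},
}


def rebase(rows, baseline):
    """Re-express each relative risk against a different baseline pattern."""
    base = rows[baseline]["rr"]
    return {pattern: round(row["rr"] / base, 2) for pattern, row in rows.items()}


def wald_interval(log_rr, se, z=1.96):
    """Approximate 95% interval for RR (assumption: log RR is roughly normal)."""
    return math.exp(log_rr - z * se), math.exp(log_rr + z * se)


rebased = rebase(table8, baseline="N1")
low, high = wald_interval(table8["P2"]["log_rr"], table8["P2"]["se"])
print(rebased)
print(f"P2 vs P1: {low:.2f} to {high:.2f}")
```

Dividing by the N1 value recovers the bracketed column of Table 8. The standard errors refer to contrasts with P1, so an interval for a contrast with N1 (such as the 8.19 for P2) cannot be obtained from the table alone; it would also need the covariance between the estimates, which is not given.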

Conclusion

This study has shown that women with dense breasts are disadvantaged in two different ways. First, the P2 and DY patterns do indicate risks of breast cancer of 3.5 and 3.1, respectively, relative to the N1 pattern. Second, tumors are more difficult to detect in these breast types, and although mammography aids detection, its use does not improve detection rates as much as when it is used on fatty breasts. However, women with dense breasts comprised 53% of the controls in our study, whereas the N1 pattern accounted for only 15%. Rather than considering women with dense breasts to be especially disadvantaged, it is more appropriate to consider women with the N1 pattern as a fortunate minority.

The elevated risk of breast cancer faced by women with dense breasts is of interest in the epidemiologic study of the disease, but its clinical importance is less clear. Certainly the current authors would not recommend that a prophylactic mastectomy be performed solely or even largely because of the Wolfe classification. Although the elevation of risk due to parenchymal pattern is not nearly as large as originally claimed by Wolfe, it is of the same magnitude as the effects of other known risk factors. Therefore, it must take its place alongside these established risk factors in clinical decision-making and in determining the frequency of mammographic screening. Unfortunately, as was shown more clearly by Whitehead et al.,¹⁰ even when parenchymal pattern is added to the usual list of risk factors, their combined use fails to discriminate reliably between women in whom breast cancer will develop and women in whom it will not.

REFERENCES

1. Wolfe JN. Risk for breast cancer development determined by mammographic parenchymal pattern. Cancer 1976; 37:2486-2492.

2. Wolfe JN. Breast patterns as an index of risk for developing breast cancer. Am J Roentgenol 1976; 126:1130-1139.

3. Mendell L, Rosenbloom M, Naimark A. Are breast patterns a risk index for breast cancer? A reappraisal. Am J Roentgenol 1977; 128:547.

4. Egan RL, Mosteller RC. Breast cancer mammography patterns. Cancer 1977; 40:2087-2090.

5. Carlile T, Thompson DJ, Whitehead J et al. Breast cancer prediction and the Wolfe classification of mammograms. JAMA 1984; in preparation.

6. Wolfe JN, Albert S, Belle S, Salane M. Breast parenchymal patterns: Analysis of 332 incident breast carcinomas. Am J Roentgenol 1982; 138:113-118.

7. Carlile T, Thompson DJ, Kopecky KJ et al. Reproducibility and consistency in classification of breast parenchymal patterns. Am J Roentgenol 1983; 140:1-7.

8. Breslow NE, Day NE. Statistical Methods in Cancer Research. Vol. 1, The Analysis of Case-Control Studies. Lyon: IARC, 1980; ch. 7.

9. Bailar JC III. Mammography: A contrary view. Ann Intern Med 1976; 84:77-84.

10. Whitehead J, Carlile T, Kopecky KJ et al. The relationship between Wolfe's classification of mammograms, accepted breast cancer risk factors, and the incidence of breast cancer. Am J Epidemiol (submitted).