Effect modification and the limits of biological inference from epidemiologic data



J Clin Epidemiol Vol. 44, No. 3, pp. 221-232, 1991 Printed in Great Britain. All rights reserved

0895-4356/91 $3.00 + 0.00 Copyright © 1991 Pergamon Press plc

Commentary

EFFECT MODIFICATION AND THE LIMITS OF BIOLOGICAL INFERENCE FROM EPIDEMIOLOGIC DATA

W. DOUGLAS THOMPSON

Department of Applied Medical Sciences, University of Southern Maine, Portland, ME 04103, U.S.A.

(Received in revised form 27 June 1990)

INTRODUCTION

A decade ago the concept of interaction among causes of disease was at the center of a lively debate [1-10]. Since that time, controversy over the nature of interaction has largely subsided, even though there seems never to have been an adequate resolution of the conceptual and pragmatic issues that had been raised [11]. In this commentary I return to some of these issues, in hopes of clarifying them and of encouraging appropriately cautious interpretation of epidemiologic findings regarding the joint effects of multiple risk factors.

This and other attempts to clarify further the problems in the assessment of interaction are warranted by the considerable attention that epidemiologists give to interactions in the analysis and interpretation of their data. For example, among the 17 papers dealing with the etiology of cancer and published in the American Journal of Epidemiology during the final 3 months of 1988 [12-28], joint effects of multiple factors were considered in 12 [13, 17-26, 28].

CONCEPTUAL AND DEFINITIONAL ISSUES

In terms of their causal effects on the incidence of a disease, two risk factors either may act independently or they may interact. Risk factors interact when one factor modifies the effect that another factor has on the occurrence of disease. Consequently, the term effect modification is often used by epidemiologists in reference to such interactive phenomena. Although shades of meaning between interaction and effect modification can be distinguished, I will use the two terms interchangeably in this paper. Also, the term risk factor will be used throughout to denote some factor that, in the absence of the other factors under consideration, has the causal effect of increasing the incidence of the disease of interest. To avoid confusion, I will treat in a separate section those variables that have an effect on disease only when some other factor is also present.

If two risk factors do actually interact, they may do so in one of two general ways. When the presence of one risk factor augments the biologic effect of another, the two risk factors are said to have synergistic effects. On the other hand, when the presence of one risk factor reduces, eliminates, or reverses the effect of another, the two risk factors are said to have antagonistic effects.

At the conceptual level of mechanisms, independence, synergy and antagonism are logically distinct and mutually exclusive. Suppose that, among persons who drink no alcohol, exposure to cigarette smoke increases the incidence of a particular type of epithelial cancer. Similarly, suppose that, among persons who are not exposed to cigarette smoke, the consumption of alcohol increases the incidence of the cancer. Finally, suppose that, among those exposed to both cigarette smoke and alcohol, the alcohol serves as a solvent that increases the intensity of exposure of epithelial cells to a given amount of carcinogens in cigarette smoke [29,30]. The joint biological effects of alcohol and cigarette smoke can in this example be characterized only as synergistic. Clearly, the two factors are not acting independently, and in order to invoke the concept of antagonism one would have to hypothesize an entirely different or at least an additional mechanism.

In general, for well-specified etiologic mechanisms such as this one, the designation of joint effects as independent, synergistic or antagonistic is straightforward. However, when, in the absence of knowledge of underlying mechanisms, one starts with an observed set of rates of disease for various combinations of values for risk factors and then attempts to infer whether the joint effects are independent versus synergistic versus antagonistic, the inference is far from straightforward. As will be demonstrated below, such inference is often totally inappropriate.

SIMPLE MODELS OF CAUSATION

Essential to the biological interpretation of data regarding the joint effects of two risk factors is clear specification of how the results would have looked if the two factors had acted independently in the pathogenic process. In this section I review a few simple causal models that have been proposed for describing the induction of disease. For each of these models I indicate the pattern of occurrence of disease that would be anticipated under independence of effects for two risk factors. This pattern specifies the point of reference from which synergism or antagonism is assessed.

Single-hit model

Walter and Holford [1] have considered the assessment of effect modification when the occurrence of a single adverse event is sufficient for the development of disease. If, for this “single-hit” model, two risk factors act independently in increasing the rate at which the adverse events occur, then for a rare disease the observed pattern of risks will be approximately additive, i.e. the relative risk for the subgroup of the population with both risk factors will be one less than the sum of the relative risks for the two subgroups with just one of the risk factors. If, however, the effects for the two risk factors are observed to be more than additive, then, given that the single-hit model holds, the factors must operate synergistically in producing the disease. Similarly, if the effects for the two risk factors are observed to be less than additive, then, again given that the single-hit model holds, the factors must operate in an antagonistic manner.
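The additivity expected under the single-hit model can be checked numerically. The following is a minimal sketch, assuming that adverse events arrive as a Poisson process and that each factor independently adds a fixed amount to the event rate; the function name and the rate values are hypothetical and chosen only for illustration.

```python
import math

def single_hit_risk(rate, t=1.0):
    """Probability of at least one adverse event by time t (Poisson assumption)."""
    return -math.expm1(-rate * t)  # equals 1 - exp(-rate * t), computed accurately

# Hypothetical event rates per person-year (illustrative values only).
baseline = 2e-5   # neither factor present
extra_a = 1e-5    # amount added by Factor A
extra_b = 3e-5    # amount added by Factor B

r0 = single_hit_risk(baseline)
rr_a = single_hit_risk(baseline + extra_a) / r0
rr_b = single_hit_risk(baseline + extra_b) / r0
rr_ab = single_hit_risk(baseline + extra_a + extra_b) / r0

print(f"RR_A = {rr_a:.4f}, RR_B = {rr_b:.4f}, RR_AB = {rr_ab:.4f}")
print(f"RR_A + RR_B - 1 = {rr_a + rr_b - 1:.4f}")
```

With rates this small the two printed quantities agree closely; the agreement degrades as the disease becomes more common, which is why the additivity is stated only for a rare disease.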

No-hit model

Walter and Holford [1] also considered the “no-hit” model, which specifies that the individuals who become diseased are those who fail to experience one or more occurrences of a particular beneficial event. The authors demonstrated that if, for the no-hit model, two protective factors act independently in increasing the rate at which the beneficial events occur, then the observed pattern of risks will be multiplicative, i.e. the relative risk for the subgroup of the population with both protective factors will be equal to the product of the relative risks for the two subgroups with just one of the protective factors. If, however, the effects for the two protective factors are observed to be more than multiplicative, then, given that the no-hit model holds, the factors must lessen the capability of one another to prevent the disease. Likewise, if the effects for the two factors are observed to be less than multiplicative, then, again given that the no-hit model holds, the factors must augment the capability of one another to prevent the disease. A more detailed consideration of the joint effects of preventive factors is available elsewhere [31].
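The multiplicative pattern under the no-hit model can be illustrated in the same way. This sketch assumes, as an illustration only, that beneficial events follow a Poisson process and that each protective factor independently adds to their rate; the names and numbers are hypothetical.

```python
import math

def no_hit_risk(rate, t=1.0):
    """Probability of experiencing zero beneficial events by time t (Poisson assumption)."""
    return math.exp(-rate * t)

# Hypothetical rates of beneficial events (illustrative values only).
baseline = 3.0   # neither protective factor present
extra_a = 0.5    # amount added by protective Factor A
extra_b = 1.0    # amount added by protective Factor B

r0 = no_hit_risk(baseline)
rr_a = no_hit_risk(baseline + extra_a) / r0
rr_b = no_hit_risk(baseline + extra_b) / r0
rr_ab = no_hit_risk(baseline + extra_a + extra_b) / r0

print(f"RR_A = {rr_a:.4f}, RR_B = {rr_b:.4f}")
print(f"RR_AB = {rr_ab:.4f}, RR_A * RR_B = {rr_a * rr_b:.4f}")
```

Under these assumptions the product relation holds exactly, not just for a rare disease, because the risk is itself an exponential function of the summed rates.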

Multi-stage models

In attempts to describe the processes of carcinogenesis, a number of multistage models have been proposed [32-34]. Here I consider only a simple two-stage model.

Suppose that the pathogenic process involves an initial transition from the normal state to Stage 1, followed by a subsequent transition from Stage 1 to Stage 2. For simplicity, suppose also that no individuals make the transition directly from the normal state to Stage 2 without having first passed through Stage 1. If within the context of this pathogenic process there is one risk factor for the disease that influences only the transition from the normal state to Stage 1 and a second risk factor that influences only the transition from Stage 1 to Stage 2, then the causal actions of the two risk factors lead to a multiplicative pattern of incidence rates [35].
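A rough numerical illustration of the two-stage case follows. It is a sketch under an added assumption that is not stated in the text: each transition occurs at a constant rate, so the time to reach Stage 2 is the sum of two independent exponential waiting times. The rates and multipliers are hypothetical.

```python
import math

def two_stage_risk(rate_1, rate_2, t=1.0):
    """P(Stage 2 reached by time t) for two sequential constant-rate transitions.

    Uses the closed-form distribution of a sum of two independent exponential
    waiting times; requires rate_1 != rate_2.
    """
    surviving = (rate_2 * math.exp(-rate_1 * t) - rate_1 * math.exp(-rate_2 * t)) / (rate_2 - rate_1)
    return 1.0 - surviving

# Hypothetical baseline transition rates (illustrative values only).
base_1, base_2 = 0.001, 0.002
mult_a = 3.0   # Factor A acts only on the normal -> Stage 1 transition
mult_b = 2.0   # Factor B acts only on the Stage 1 -> Stage 2 transition

r0 = two_stage_risk(base_1, base_2)
rr_a = two_stage_risk(base_1 * mult_a, base_2) / r0
rr_b = two_stage_risk(base_1, base_2 * mult_b) / r0
rr_ab = two_stage_risk(base_1 * mult_a, base_2 * mult_b) / r0

print(f"RR_A = {rr_a:.3f}, RR_B = {rr_b:.3f}")
print(f"RR_AB = {rr_ab:.3f}, RR_A * RR_B = {rr_a * rr_b:.3f}")
```

For small rates the risk is approximately proportional to the product of the two transition rates, so a factor acting on one stage and a factor acting on the other combine multiplicatively.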

Sufficient-component-cause model

A rather different type of causal model, the sufficient-component-cause model, has been proposed by Rothman [36]. Within the framework of this model Rothman has argued that independent causal actions for two risk factors would lead to additivity of risks. A detailed discussion of the sufficient-component-cause model is beyond the scope of this commentary, but is available elsewhere [36-41].

Fig. 1. Schematic representation of two risk factors that have their influence on the incidence of disease via direct effects on the same intervening variable.

UNMEASURED INTERVENING VARIABLES

The causal models discussed in the preceding section are unlikely to provide adequately for the complexity of pathogenic processes. One important shortcoming of all of the models, except for certain types of multistage models, is that they make no direct provision for unmeasured intervening (intermediate) variables in the causal sequence from measured risk factors to the observed manifestations of disease. In this section I argue for the ubiquity of such unmeasured intervening variables and illustrate with a numerical example the profound complication that they pose for the interpretation of interactions in epidemiologic research.

Conceptual framework

Figure 1 illustrates a simple situation in which Variable X is an intervening variable affected by two risk factors, denoted Factor A and Factor B. It is assumed for simplicity that Factors A and B have their influence on the disease only via their effects on Variable X and not via some other pathway not involving Variable X. The box around Variable X is drawn as a broken one because in practice many intervening variables are unknown or at least unmeasured. In fact, if Variable X could be measured, then the epidemiologic investigation of this situation could be separated into two components, namely the etiologic relationships of Factors A and B to Variable X and the etiologic relationship between Variable X and the incidence of disease. The question of a possible interaction between Factors A and B could then be assessed most directly through their observable effects on Variable X.

Fig. 2. Schematic representation of two risk factors that have their influence on the incidence of disease via indirect effects on the same intervening variable.

In virtually all situations where epidemiologic methods are appropriate, there will be one or more unmeasured intervening variables in the causal sequence between the risk factors we are able to measure and the ultimate development of clinically recognizable disease. Once researchers have identified the variables that represent the immediate causes of disease, they have generally moved well beyond the domain of epidemiology. Consequently, the situation shown in Fig. 1 is not an esoteric one, but one which can be used to represent nearly every epidemiologic problem of multifactorial causation. The only limiting assumption of this formulation is that Factors A and B have their effects through a common pathway. This assumption is not as restrictive as it may appear at first. It does not, for example, require that the direct effects of Factors A and B be on the single Variable X. Instead, the two factors may operate through two pathways that are distinct up until the final step in the causal sequence, as is illustrated in Fig. 2. Because the situation in Fig. 2 is merely an elaboration of the situation in Fig. 1, I will restrict attention here to the simpler of the two.


Numerical example

Let us consider a scenario in which the value of a continuously distributed intervening variable (Variable X) is 5 units when neither Factor A nor Factor B is present. Suppose that when only Factor A is present, the value of Variable X is increased by 10 units to a value of 15 units and that when only Factor B is present, the value of Variable X is increased by 30 units to a value of 35 units. Finally, suppose that the joint effects of Factor A and Factor B on Variable X are strictly additive, i.e. when both Factor A and Factor B are present, the value of Variable X is increased over its baseline level by 40 units to a value of 45 units. Many epidemiologists would probably consider these additive effects of Factors A and B on Variable X as neither synergistic nor antagonistic, but as independent. However, the central point of this example does not depend on how one views the additive effects of Factors A and B on Variable X.

Given that we have specified the nature of the relationship of A and B to X, the incidence rates for the four combinations of presence vs absence of Factor A and Factor B depend only on the functional relationship between Variable X and the incidence of the disease. Figure 3 illustrates a few of the infinite number of possible functional forms for this relationship. Each of the curves in the figure specifies a rate of 2 per 100,000 when the value of Variable X is 5 units and a rate of 4 per 100,000 when the value of Variable X is 25 units. For one of the curves the incidence rate is a logarithmic function of Variable X (rate = log_e[-4.413 + 2.360X]). For a second curve the incidence rate is a linear function of Variable X (rate = 1.5 + 0.1X). For a third curve the incidence rate is an exponential function of Variable X (rate = exp[0.520 + 0.03466X]). For the final curve the incidence rate is an exponential function of Variable X raised to a power greater than 1 (rate = exp[0.625 + 0.00609X^1.5]).

When, as in this example, the intervening Variable X has not been identified, the only available means for assessing the joint effects of Factors A and B is to compare the magnitudes of the incidence rates for the four subgroups defined by the presence/absence of the two factors. It is important to note that for each of the curves in Fig. 3 the same additive relationship of Factors A and B to Variable X has been assumed. All that varies for the individual curves is the functional form of the relationship between Variable X and the incidence of the disease. An example related to this one was discussed briefly by Kupper and Hogan in 1978 [3], but the authors did not address the issue of intervening variables directly.

In the figure, vertical lines have been drawn at the values of 15, 35 and 45 for Variable X in order to indicate on each curve the respective values for the rate when Factor A only is present, when Factor B only is present, and when both are present. Table 1 gives the values of the rate difference and the rate ratio that would be calculated from epidemiologic data if Variable X bore the various functional relationships to the incidence rate as illustrated in Fig. 3. Homogeneity of the rate difference for Factor A across the two levels of Factor B corresponds to additivity of effects, whereas homogeneity of the rate ratio corresponds to multiplicativity of effects.

Fig. 3. Illustration of various functional forms for the association between a continuously distributed variable (X) and the incidence of disease.

Table 1. Illustration of the dependence of conclusions regarding effect modification on the measure of association used and on the functional form of underlying relationships

| Functional form for the relationship between Variable X and the incidence rate (per 100,000) | Rate difference for Factor A, Factor B present | Rate difference for Factor A, Factor B absent | Rate ratio for Factor A, Factor B present | Rate ratio for Factor A, Factor B absent |
|---|---|---|---|---|
| log_e(-4.413 + 2.360X) | 0.26 | 1.43 | 1.06 | 1.72 |
| 1.5 + 0.1X | 1.00 | 1.00 | 1.20 | 1.50 |
| exp(0.520 + 0.03466X) | 2.34 | 0.83 | 1.41 | 1.41 |
| exp(0.625 + 0.00609X^1.5) | 5.15 | 0.66 | 1.78 | 1.33 |

The results in the table indicate that if the incidence of disease is actually a logarithmic function of the unmeasured Variable X, then use of either the rate difference or the rate ratio as the measure of effect would suggest that the effects of Factor A and Factor B are antagonistic rather than either independent or synergistic. That is, the magnitude of the effect for Factor A appears to be smaller when Factor B is present than when Factor B is absent (0.26 vs 1.43 per 100,000 for the rate difference and 1.06 vs 1.72 for the rate ratio). If, on the other hand, the functional form for the relationship between Variable X and the incidence rate is a linear one, then the rate difference for the effect of Factor A will appear to be homogeneous across the two levels of Factor B, whereas the rate ratio will again appear to indicate antagonism. Finally, if the incidence rate varies as an exponential function of Variable X to the first power or to a power greater than one, then the rate difference for Factor A will be larger when Factor B is present than when it is absent, suggesting synergistic effects for the two factors. On the other hand, the rate ratio will be homogeneous when the rate varies as an exponential function of Variable X to the first power, but there will be an apparent synergy if the rate ratio is the measure used and if the rate itself is an exponential function of Variable X to a power greater than one.
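The entries of Table 1 follow directly from the four functional forms and the additive values of Variable X (5, 15, 35 and 45 units). The short sketch below recomputes them; the dictionary keys and the helper name `effects_of_a` are illustrative labels, not part of the original analysis.

```python
import math

# The four functional forms from the numerical example. Each gives a rate of
# about 2 per 100,000 at X = 5 and about 4 per 100,000 at X = 25.
curves = {
    "logarithmic": lambda x: math.log(-4.413 + 2.360 * x),
    "linear": lambda x: 1.5 + 0.1 * x,
    "exponential": lambda x: math.exp(0.520 + 0.03466 * x),
    "exponential-power": lambda x: math.exp(0.625 + 0.00609 * x ** 1.5),
}

# Value of Variable X under the strictly additive effects assumed in the text.
X = {"neither": 5, "A only": 15, "B only": 35, "both": 45}

def effects_of_a(f):
    """Rate difference and rate ratio for Factor A, with Factor B present and absent."""
    r = {group: f(value) for group, value in X.items()}
    return {
        "rd_b_present": r["both"] - r["B only"],
        "rd_b_absent": r["A only"] - r["neither"],
        "rr_b_present": r["both"] / r["B only"],
        "rr_b_absent": r["A only"] / r["neither"],
    }

table_1 = {name: effects_of_a(f) for name, f in curves.items()}
for name, row in table_1.items():
    print(f"{name:18s} " + "  ".join(f"{value:5.2f}" for value in row.values()))
```

Only the curve changes from row to row; the inputs on the scale of Variable X are identical, which is the point of the example.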

The results in Table 1 clearly indicate that when the joint biological effects of two risk factors are strictly additive, the ultimate impact on incidence rates may appear to be antagonistic, synergistic or independent, depending on the measure of association used to quantify the effects and depending on the unknown functional form of the relationship between an intervening variable and the incidence of the disease. Although not illustrated here, joint effects of Factor A and Factor B on Variable X that are less than or more than additive will generally produce a similar variety of outcomes, depending again on the measure of association used and on the unknown functional form of the relation of Variable X to the incidence of disease.

This example, like the simple biological models considered above, illustrates that there is no universally “correct” measure of association to use in the statistical assessment of effect modification. It also illustrates that even when the rate difference and the rate ratio both suggest the same qualitative conclusion regarding synergy vs antagonism, there is no guarantee that both measures of effect do not simply give the same misleading impression, as may be the case in the first and last rows of Table 1.

REJECTION OF THEORIES OF PATHOGENESIS

If two risk factors have each been shown to be causally related to a particular disease in the absence of the other, then epidemiologic evaluation of their joint effects provides important etiologic insight only to the extent that some specific theories of pathogenesis can be discarded and others retained as compatible with one’s observations. If, for example, a particular set of observations permitted an investigator to rule out all theories of causation that postulate either independent or antagonistic effects for two risk factors, such demonstration of synergistic effects could well represent an important scientific contribution. Unfortunately, choice among theories of pathogenesis is enhanced hardly at all by the epidemiologic assessment of interaction.


Table 2. Some possible causal mechanisms if the multiplicative model fits the data for the joint effects of two risk factors

Causal mechanism

Independent effects within a no-hit process

Synergistic effects within a single-hit process

Each having its only effect on two different stages of a multistage process

Additive (and therefore possibly independent) effect on an intervening variable that bears an exponential relationship to the incidence of disease

Less than additive (and therefore possibly antagonistic) effects on an intervening variable that bears an exponential-power relationship to the incidence of the disease

Consider a study of two risk factors in which one finds that the multiplicative model fits the data perfectly and that the sample size is enormous so that one does not have to be concerned about the low precision that the typical study has for estimating interactions. Table 2 lists some of the underlying causal systems with which the observed pattern of results would be consistent. Note that independent effects, antagonistic effects, and synergistic effects are all possibilities, depending on the typically unknown nature of the underlying pathologic process and the functional forms of relationships involving unmeasured intervening variables. Clearly, it is impossible to discern from the data how or even whether the two risk factors interact in any biologically meaningful way.

What few causal systems can be rejected on the basis of the observed results would provide decidedly limited etiologic insight. If one ignored the issue of unmeasured intervening variables, then the fit with the multiplicative model might be interpreted as evidence against a single-hit process in which the two risk factors contribute independently. However, if one acknowledges the role of unmeasured intervening variables, then independent effects can be ruled out only for causal systems in which the relationships between unmeasured intervening variables and the incidence of disease have very specific functional forms. Rejection of such narrowly defined pathologic processes represents only a minute increment in knowledge.

Even an extremely large numeric difference in the values of two stratum-specific estimates of effect may not always have a clear interpretation. For example, if the rate ratio were 15.0 in one of two strata and 1.5 in the other, then one might be inclined to interpret the results as indicative of a stronger biological effect within the first stratum than within the second. However, if the actual incidence rate for the reference category within the first stratum were just 2 per 100,000 per year as opposed to 200 per 100,000 per year in the second stratum, then the rate differences corresponding to the rate ratios of 15.0 and 1.5 would be 28 per 100,000 per year and 100 per 100,000 per year, respectively. On the basis of the rate difference, it would appear that the association is stronger in the second stratum than in the first. This reversal demonstrates that, despite the substantial disparity in rate ratios, shifting to a different measure of effect could transform what may have been interpreted as clearly synergistic effects into apparent antagonistic effects.
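The arithmetic behind this reversal is simple enough to write out. In the sketch below the rate among the exposed is taken to be the reference rate multiplied by the rate ratio; the function and variable names are illustrative.

```python
def rate_difference(reference_rate, rate_ratio):
    """Excess rate among the exposed, in the same units as reference_rate."""
    return reference_rate * (rate_ratio - 1.0)

# Reference rates are per 100,000 per year, as in the text.
strata = {
    "first stratum": {"reference_rate": 2.0, "rate_ratio": 15.0},
    "second stratum": {"reference_rate": 200.0, "rate_ratio": 1.5},
}

differences = {name: rate_difference(**values) for name, values in strata.items()}
for name, diff in differences.items():
    print(f"{name}: rate ratio {strata[name]['rate_ratio']}, rate difference {diff:.0f} per 100,000 per year")
```

The stratum with the tenfold larger rate ratio has the smaller rate difference, so the ordering of the strata depends entirely on the measure chosen.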

CROSSOVER EFFECTS

Although conclusions regarding synergy vs antagonism vs independent effects cannot in general be drawn validly from epidemiologic data, there is an extreme form of interaction for which the interpretation is considerably less problematic. This extreme form is sometimes known as a “crossover effect” and involves the reversal of the direction of an association across levels of some other factor. Crossover effects are sometimes referred to as “qualitative interactions” [42].

Provided that the confidence intervals are very narrow, even crossover effects of small magnitude (e.g. rate ratios of 0.9 and 1.1 within two subgroups) may be of biologic interest, since the qualitative conclusion is independent of one’s choice of measure of effect. On the other hand, small effects may result from small biases in the selection of subjects or in the measurement of variables, so that small crossover effects, like all small effects, should be interpreted cautiously.

Example

An empirical example of a crossover effect is provided by a case-control study of breast cancer conducted in Connecticut as part of the Cancer and Steroid Hormone Study of CDC [43]. It involves the relationship of nulliparity to breast cancer. Table 3 gives age-specific odds ratios for this relationship from the study in Connecticut. Among young women nulliparity seems to be protective against breast cancer, whereas among older women nulliparity increases the risk. Other evidence in support of this particular crossover effect has been provided by several earlier studies [23,44-47].


Table 3. Relationship between nulliparity and breast cancer within 5-year categories of age: Connecticut component of the Cancer and Steroid Hormone Study, 1980-1982

| Age | Nulliparous cases | Nulliparous controls | Parous cases | Parous controls | Odds ratio (95% CI) |
|---|---|---|---|---|---|
| 20-24 | 3 | 9 | 11 | 7 | 0.21 (0.03-1.34) |
| 25-29 | 7 | 16 | 41 | 12 | 0.13 (0.04-0.43) |
| 30-34 | 15 | 18 | 75 | 62 | 0.69 (0.3?-1.58) |
| 35-39 | 22 | 15 | 123 | 88 | 1.05 (0.49-2.27) |
| 40-44 | 26 | 10 | 162 | 128 | 2.05 (0.91-4.75) |
| 45-49 | 25 | 16 | 178 | 223 | 1.96 (0.97-3.98) |
| 50-54 | 32 | 23 | 159 | 250 | 2.19 (1.19-4.03) |

When, as in this instance, an actual crossover is observed, one’s qualitative assessment of whether the factors interact or have independent effects does not depend on the particular measure of association used to quantify the effects. That is, regardless of whether the rate ratio, the rate difference, or some other measure is used in analyzing the data, it would be equally clear that young nulliparous women have a lowered risk of breast cancer whereas older nulliparous women have an elevated risk. Consequently, it would seem to be quite appropriate to proceed to search for the biological basis for these opposite effects in different age groups. One possibility is that the hormonal changes that accompany pregnancy inhibit the initiation of malignant transformation but promote the growth of existing tumors.

The interpretation of crossover effects is constrained somewhat by the likely presence of unmeasured intervening variables. However, the constraints are not nearly so severe as they are for interactions that do not involve a crossover. Provided that the relationship between an unmeasured intervening variable and the incidence of the disease is monotonic, the precise functional form for that relationship does not affect the qualitative interpretation of an observed crossover effect. Consider, for example, two binary factors that each alone increase the incidence of a disease. By definition, a crossover effect is observed if and only if the incidence of disease when both factors are present is actually lower than when the stronger of the two risk factors is the only one present. This lower incidence could not occur if the two risk factors had no crossover effect in terms of their relation to an unmeasured intervening variable and if that intervening variable bore a monotonically increasing relationship to the incidence of the disease.

Statistical evaluation

For binary risk factors the statistical evaluation of crossover effects is relatively straightforward. Specifically, application of the following criteria guarantees that, when the null hypothesis of no crossover effect is true, the probability is less than or equal to 5% for falsely concluding from the data that there is a crossover effect in the population: (1) stratification by one of the two risk factors, but not necessarily by the other, produces two stratum-specific estimates of effect for the other risk factor that are on opposite sides of the null value, and (2) the 95% confidence intervals for these two stratum-specific estimates both exclude the null value. The Appendix describes the rationale for these criteria.
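The two criteria translate directly into a small check. The sketch below is illustrative only: the function name, its argument layout, and the example estimates are hypothetical, and it applies to the two-stratum (binary) case described here, not to comparisons across many strata.

```python
def crossover_supported(est_1, ci_1, est_2, ci_2, null_value=1.0):
    """Apply the two criteria to a pair of stratum-specific estimates.

    est_1, est_2: point estimates of effect in the two strata.
    ci_1, ci_2:   their 95% confidence intervals as (lower, upper).
    null_value:   1.0 for ratio measures, 0.0 for difference measures.
    """
    # Criterion 1: the two estimates lie on opposite sides of the null value.
    opposite_sides = (est_1 - null_value) * (est_2 - null_value) < 0
    # Criterion 2: both confidence intervals exclude the null value.
    excludes_1 = ci_1[0] > null_value or ci_1[1] < null_value
    excludes_2 = ci_2[0] > null_value or ci_2[1] < null_value
    return opposite_sides and excludes_1 and excludes_2

# Hypothetical rate ratios in two strata of a second binary factor.
print(crossover_supported(0.6, (0.4, 0.9), 1.8, (1.2, 2.7)))  # both intervals exclude 1.0
print(crossover_supported(0.6, (0.3, 1.2), 1.8, (1.2, 2.7)))  # first interval includes 1.0
print(crossover_supported(2.25, (1.3, 3.9), 1.0, (0.8, 1.25)))  # no reversal of direction
```

Requiring both intervals to exclude the null value is deliberately stricter than testing an interaction term, since a significant interaction can arise without any reversal of direction.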

It should be noted that statistical assessment of an interaction parameter is not useful for the evaluation of crossover effects since the null value for the interaction parameter (e.g. the ratio of two stratum-specific odds ratios) may be excluded with a high degree of confidence even when the criteria for a crossover effect are not met.

Table 4. Hypothetical example for illustration of appropriate uses of estimates of interaction

| Factor 1 | Factor 2 | Proportion of population | Rate ratio | Expected cases in case-control study (n = 1000) | Expected controls in case-control study (n = 1000) |
|---|---|---|---|---|---|
| + | + | 0.02 | 9.0 | 144 | 20 |
| + | - | 0.03 | 4.0 | 96 | 30 |
| - | + | 0.38 | 1.0 | 304 | 380 |
| - | - | 0.57 | 1.0* | 456 | 570 |

*Reference category.

Table 4 presents a simple numeric example that illustrates this point. In the table, the incidence rates for three of the combinations of values for two binary factors are expressed relative to the rate for the fourth. These rate ratios, in conjunction with the joint distribution of the two factors in the population, determine the expected outcome for a case-control study of a given size, here 1000 cases and 1000 controls. For instance, the expected number of cases with both factors is (1000)(0.02)(9)/[(0.02)(9) + (0.03)(4) + (0.38)(1) + (0.57)(1)] = 144 for a rare disease. Note that although Factor 2 increases the risk among persons with Factor 1, it neither increases nor decreases the risk among persons without Factor 1. (Therefore, Factor 2 does not qualify as a risk factor according to the definition adopted at the beginning of this commentary.) The interaction parameter for the multiplicative model, i.e. (9/4)/(1/1) = 2.25, is statistically significant (95% confidence interval = 1.17-4.32), but there is no crossover.
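The arithmetic behind Table 4 and the interaction parameter can be reproduced with the following sketch. The commentary does not state how the confidence interval was obtained; the code assumes the usual large-sample (Woolf) variance for a log odds ratio, summed over all eight cells, and that assumption happens to match the reported interval.

```python
import math

# Table 4 inputs: (factor 1, factor 2) -> (proportion of population, rate ratio)
cells = {
    ("+", "+"): (0.02, 9.0),
    ("+", "-"): (0.03, 4.0),
    ("-", "+"): (0.38, 1.0),
    ("-", "-"): (0.57, 1.0),
}
n_cases = n_controls = 1000

# For a rare disease, cases are distributed in proportion to (proportion x rate ratio),
# and controls in proportion to the population distribution.
total_weight = sum(p * rr for p, rr in cells.values())
cases = {k: n_cases * p * rr / total_weight for k, (p, rr) in cells.items()}
controls = {k: n_controls * p for k, (p, rr) in cells.items()}

def odds_ratio(exposed, unexposed):
    return (cases[exposed] * controls[unexposed]) / (cases[unexposed] * controls[exposed])

# Effect of Factor 2 within each stratum of Factor 1, and their ratio.
or_f1_present = odds_ratio(("+", "+"), ("+", "-"))
or_f1_absent = odds_ratio(("-", "+"), ("-", "-"))
ratio_of_ors = or_f1_present / or_f1_absent

# Assumed Woolf-type standard error: square root of the sum of reciprocal counts.
se = math.sqrt(sum(1.0 / n for n in list(cases.values()) + list(controls.values())))
ci_lower = math.exp(math.log(ratio_of_ors) - 1.96 * se)
ci_upper = math.exp(math.log(ratio_of_ors) + 1.96 * se)

print({k: round(v) for k, v in cases.items()})
print(round(ratio_of_ors, 2), round(ci_lower, 2), round(ci_upper, 2))
```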

Evaluation of crossover effects across more than two strata, for example, the seven categories of age in Table 3, requires somewhat different statistical methods than does evaluation of just two strata. As the number of strata increases, the probability increases that when there is no crossover in the population the upper confidence limit for the rate ratio or odds ratio will be less than 1.0 in at least one stratum and the lower confidence limit will be greater than 1.0 in at least one stratum. Prior to direct assessment of crossovers in such instances, it is helpful to derive a simplified description of how the measure of association changes across the strata. From the data in Table 3 a logistic model can be fit with the seven categories of age coded as the integers 1-7 and with the interaction between age and nulliparity represented as a single cross-product term. The fitted odds ratios for this model are given in Table 5. The interaction term for the linear change in the logarithm of the odds ratio as a function of age is highly significant (p < 0.001) and there is little evidence for systematic departure from this linear trend. In terms of an actual crossover effect, the confidence intervals for the fitted odds ratios in Table 5 indicate that there is statistical evidence that nulliparity protects young women against breast cancer but that it has the opposite effect of increasing the risk among women in their late 40s and early 50s.

Table 5. Smoothed values for the relationship between nulliparity and breast cancer within 5-year categories of age: Connecticut component of the Cancer and Steroid Hormone Study, 1980-1982*

Age      Odds ratio (95% CI)
20-24    0.21 (0.11-0.41)
25-29    0.33 (0.19-0.55)
30-34    0.51 (0.34-0.76)
35-39    0.79 (0.58-1.08)
40-44    1.24 (0.93-1.64)
45-49    1.93 (1.37-2.71)
50-54    3.01 (1.92-4.72)

*Based on the counts given in Table 3 and on a logistic model that provides for a linear change with age in the logarithm of the odds ratio.

AN ADDITIONAL PITFALL

Although the pattern of incidence rates in Table 4 does not represent a crossover, such a pattern would nevertheless be of etiologic interest regarding joint effects if it were possible to infer from a sample that Factor 2 has no effect whatever on disease in the absence of Factor 1. In that instance the effect of Factor 2 in the presence of Factor 1 would permit an inference of synergistic effects under a wide range of causal systems. However, in practice it is impossible to distinguish between no effect whatever for Factor 2 in the absence of Factor 1 and a small positive effect. As a result, the interpretation is just as problematic as in other situations not meeting the criteria for a crossover effect.

APPROPRIATE USES OF ASSESSMENTS OF INTERACTION

Although substantive biological interpretation of observed patterns of interaction can seldom be validly made in epidemiology, that fact should not be taken to imply that statistical assessment of interaction is usually without value. Actually, there are at least three justifiable reasons for examining interaction even when there is no evidence for a crossover effect. These will be discussed briefly below. They relate not to etiologic inference regarding joint effects but to prediction, to the enhancement of one’s ability to detect an effect for a single risk factor, and to the targeting of interventions.


Table 6. Results of fitting a logistic model to the data in Table 4 without provision for interaction

                      Fitted frequencies in case-control study
Factor 1   Factor 2   Cases (n = 1000)   Controls (n = 1000)   Fitted odds ratio
+          +          136.48             27.52                 6.39
+          -          103.52             22.48                 5.93
-          +          311.52             372.48                1.08
-          -          448.48             577.52                1.00*

*Reference category.

Prediction

An analysis that includes the assessment of interaction permits a fuller description of how incidence varies as a function of risk factors than would an analysis that considers only the overall or average effects for risk factors. Therefore, the provision for interaction improves one’s ability to predict disease on the basis of an individual’s profile for a set of risk factors.

Referring again to the example in Table 4, suppose that one ignored any interaction on the multiplicative scale and considered only the main effects for the two factors. The fitted frequencies from a logistic model without an interaction term would be as given in Table 6. Clearly, failure to take account of departure from multiplicativity leads to an inferior basis for predicting disease for individuals according to their values on the two factors. For each of the subgroups with only one of the factors, we would predict a higher incidence relative to the referent group than would be appropriate. For the subgroup with both of the factors, we would underestimate the risk.
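The fitted frequencies of the no-interaction model can be reproduced with a short sketch. Instead of fitting the logistic model directly, the code below uses iterative proportional fitting of the equivalent log-linear model with no three-way term; this is a substituted technique chosen for brevity, not necessarily the procedure used for Table 6, and the function names are illustrative.

```python
# Expected counts from Table 4, keyed by (factor 1, factor 2, status); 1 = present.
observed = {
    (1, 1, "case"): 144, (1, 0, "case"): 96, (0, 1, "case"): 304, (0, 0, "case"): 456,
    (1, 1, "control"): 20, (1, 0, "control"): 30, (0, 1, "control"): 380, (0, 0, "control"): 570,
}

def margin(table, axes):
    """Sum a table over every axis that is not listed in `axes`."""
    out = {}
    for key, value in table.items():
        sub = tuple(key[a] for a in axes)
        out[sub] = out.get(sub, 0.0) + value
    return out

def fit_without_interaction(table, sweeps=200):
    """Iterative proportional fitting: match all three two-way margins,
    which leaves no factor 1 x factor 2 x status interaction in the fit."""
    fitted = {key: 1.0 for key in table}
    for _ in range(sweeps):
        for axes in [(0, 1), (0, 2), (1, 2)]:
            target = margin(table, axes)
            current = margin(fitted, axes)
            for key in fitted:
                sub = tuple(key[a] for a in axes)
                fitted[key] *= target[sub] / current[sub]
    return fitted

def odds(table, f1, f2):
    return table[(f1, f2, "case")] / table[(f1, f2, "control")]

fitted = fit_without_interaction(observed)
fitted_or = {cell: odds(fitted, *cell) / odds(fitted, 0, 0) for cell in [(1, 1), (1, 0), (0, 1)]}
true_rate_ratio = {(1, 1): 9.0, (1, 0): 4.0, (0, 1): 1.0}

for cell in fitted_or:
    print(cell, round(fitted_or[cell], 2), "fitted vs", true_rate_ratio[cell], "in the population")
```

The comparison shows the pattern described in the text: the main-effects model overstates the relative incidence for each single-factor subgroup and understates it for the subgroup with both factors.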

Enhanced detection of effects

Formal evaluation of interaction can be useful in assessing whether a factor has any effect whatsoever. As an illustration of enhanced detection of effects through the assessment of interaction, consider again the example in Tables 4 and 6. Fitting the logistic model that ignores interaction (Table 6) yields an estimate of 1.08 for the odds ratio for Factor 2 with adjustment for Factor 1. This point estimate suggests that Factor 2 may have a small effect on the incidence of the disease, but the 95% confidence interval of 0.89-1.30 indicates that the data are also not terribly inconsistent with the hypothesis that the value of the odds ratio in the population is unity. The p-value for testing the null hypothesis is 0.43.
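The adjusted odds ratio for Factor 2, its confidence interval, and the p-value can be checked with a minimal sketch of a main-effects logistic fit. The code assumes maximum likelihood by Newton-Raphson on the grouped Table 4 counts and a Wald test; the commentary does not say which software or test was used, and the helper names are illustrative.

```python
import math

# Grouped data from Table 4: (factor 1, factor 2, cases, controls); 1 = present.
groups = [(1, 1, 144, 20), (1, 0, 96, 30), (0, 1, 304, 380), (0, 0, 456, 570)]

def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]

def fit_main_effects(data, iterations=25):
    """Newton-Raphson for logit(p) = b0 + b1*factor1 + b2*factor2 (no interaction)."""
    beta = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        score = [0.0] * 3
        info = [[0.0] * 3 for _ in range(3)]
        for f1, f2, cases, controls in data:
            x = [1.0, f1, f2]
            n = cases + controls
            p = 1.0 / (1.0 + math.exp(-sum(b * xi for b, xi in zip(beta, x))))
            for i in range(3):
                score[i] += x[i] * (cases - n * p)
                for j in range(3):
                    info[i][j] += x[i] * x[j] * n * p * (1.0 - p)
        beta = [b + s for b, s in zip(beta, solve(info, score))]
    # `info` was evaluated just before the final (negligible) update.
    return beta, info

beta, info = fit_main_effects(groups)
variance_b2 = solve(info, [0.0, 0.0, 1.0])[2]   # third diagonal element of the inverse
se_b2 = math.sqrt(variance_b2)

or_factor2 = math.exp(beta[2])
lower = math.exp(beta[2] - 1.96 * se_b2)
upper = math.exp(beta[2] + 1.96 * se_b2)
p_value = math.erfc(abs(beta[2] / se_b2) / math.sqrt(2))   # two-sided Wald test

print(round(or_factor2, 2), round(lower, 2), round(upper, 2), round(p_value, 2))
```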

When a logistic model provides for the interaction between Factors 1 and 2 as they relate to the disease, then the fitted frequencies are identical to those given in Table 4 and the parameter for the interaction is interpretable as the ratio between the following two odds ratios: the odds ratio for the effect of Factor 2 when Factor 1 is present and the odds ratio for the effect of Factor 2 when Factor 1 is absent. As was noted above, this ratio of odds ratios is estimated to be 2.25, with a 95% confidence interval of 1.17-4.32. Since the null hypothesis of a ratio of unity is excluded with reasonably high statistical confidence (p = 0.015), Factor 2 must have an effect in one or in both of the two strata. That is, the stratum-specific odds ratios could not possibly differ from each other if neither differed from the null value. Here we have rather compelling evidence of an effect for Factor 2, whereas the results from the analysis that ignored the interaction provided only weak statistical evidence for an effect.
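A sketch of the stratum-specific view of the same data follows. It assumes Woolf-type standard errors and a two-sided Wald test for the difference between the two log odds ratios; these choices are assumptions rather than the commentary's stated method, though they reproduce the reported p-value.

```python
import math

def log_or_and_se(cases_exp, controls_exp, cases_unexp, controls_unexp):
    """Log odds ratio and its large-sample (Woolf) standard error for one 2x2 table."""
    log_or = math.log((cases_exp * controls_unexp) / (cases_unexp * controls_exp))
    se = math.sqrt(1 / cases_exp + 1 / controls_exp + 1 / cases_unexp + 1 / controls_unexp)
    return log_or, se

def interval(log_or, se, z=1.96):
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

# Effect of Factor 2 within each stratum of Factor 1 (counts from Table 4).
f1_present = log_or_and_se(144, 20, 96, 30)
f1_absent = log_or_and_se(304, 380, 456, 570)

# Wald test that the two stratum-specific odds ratios are equal.
difference = f1_present[0] - f1_absent[0]
se_difference = math.sqrt(f1_present[1] ** 2 + f1_absent[1] ** 2)
p_interaction = math.erfc(abs(difference / se_difference) / math.sqrt(2))

print("Factor 1 present:", round(math.exp(f1_present[0]), 2), interval(*f1_present))
print("Factor 1 absent: ", round(math.exp(f1_absent[0]), 2), interval(*f1_absent))
print("p for interaction:", round(p_interaction, 3))
```

Under these assumptions the interval for the stratum with Factor 1 present lies wholly above unity while the interval for the other stratum straddles it, which locates the effect of Factor 2 in the small subgroup with Factor 1.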

Assessment of interaction for enhancing the detection of effects apparently runs counter to two important principles. The first has to do with power, which is substantially lower for the detection of interaction than for the detection of main effects [48,49]. Since this is the case, it may seem paradoxical that assessment of interaction should detect an effect that cannot be readily detected when the interaction is ignored. However, ignoring the interaction leads to the averaging of stratum-specific effects, and when the subgroup in which the effect occurs is small, the effect can be seriously obscured, as in the example given in Tables 4 and 6.

The second principle has to do with simplicity. Walter and Holford have suggested that in the absence of knowledge of underlying mechanisms, a statistical model should be as simple as possible [1]. By “simple” they mean incorporating as few parameters as are needed to describe the data adequately. Thus, if a multiplicative model without an interaction term fits well, then that model would be preferred to an additive model in which an interaction term is required. However, for the enhanced detection of effects, models that do require an interaction term may be the most informative.

Once statistical evidence for interaction has been found via the examination of variation in a particular measure of effect across levels of some other variable, a logical next step is to examine in which level or levels an effect can be demonstrated. One can accomplish this either by computing stratum-specific measures of effect or by evaluating a multivariable model at specific combinations of values for variables.

Targeting of interventions

Statistical evaluation of interaction is also appropriate when one is interested in planning interventions. As Saracci [6] and Rothman et al. [7] have pointed out, elimination of a particular factor will have a non-uniform impact on the control of disease whenever the combined effects with other risk factors are non-additive. For example, in Table 4 the effects of Factors 1 and 2 on the incidence rate are non-additive. Intervening to eliminate Factor 2 among those individuals who have Factor 1 would reduce the incidence of disease substantially among those with both Factor 1 and Factor 2. On the other hand, such an intervention among those individuals who do not have Factor 1 would not reduce their incidence of disease at all. Intervention on Factor 2 should clearly be targeted exclusively at individuals who have Factor 1 as well.

As an example of additive effects, suppose that the rate ratio for individuals who are negative for Factor 1 but positive on Factor 2 were 6.0 instead of the value of 1.0 given in the table. In that instance, elimination of Factor 2 in a specified number of individuals would, on average, prevent exactly the same number of cases of disease if those with Factor 1 were targeted as it would if those without Factor 1 were targeted. Consequently, from the point of view of intervention, additivity of effects is a “natural” model for the assessment of joint effects, and departure from additivity has a straightforward interpretation.
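The contrast between the two scenarios reduces to rate differences, as the following sketch shows. Rates are expressed in units of the reference-category rate, and the function name and `baseline_rate` parameter are illustrative rather than taken from the commentary.

```python
def reduction_from_removing_factor2(rate_ratios, baseline_rate=1.0):
    """Drop in incidence rate achieved by eliminating Factor 2,
    separately for persons with and without Factor 1."""
    return {
        "factor1_present": baseline_rate * (rate_ratios[("+", "+")] - rate_ratios[("+", "-")]),
        "factor1_absent": baseline_rate * (rate_ratios[("-", "+")] - rate_ratios[("-", "-")]),
    }

# Rate ratios from Table 4 (non-additive joint effects).
table4 = {("+", "+"): 9.0, ("+", "-"): 4.0, ("-", "+"): 1.0, ("-", "-"): 1.0}

# The modified example in the text: rate ratio of 6.0 for Factor 2 alone (additive).
additive = dict(table4)
additive[("-", "+")] = 6.0

print(reduction_from_removing_factor2(table4))    # 5.0 with Factor 1, 0.0 without
print(reduction_from_removing_factor2(additive))  # 5.0 in both groups
```

With the Table 4 values the whole benefit of intervening on Factor 2 falls on persons with Factor 1, whereas in the additive variant the benefit per person treated is the same in both groups.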

CONCLUSION

The message of this review has been admittedly a rather gloomy one. Nevertheless, it would seem to be important to face up to the limitations of our discipline. With well-designed studies epidemiologists can hopefully detect and estimate the magnitude of effects for individual factors, conditional on the values of other factors. Unfortunately, that is often the limit to which available methods can be pushed, given the macro level at which epidemiologists typically work. For the association between an individual risk factor and a particular disease, the null hypothesis of no effect can be readily formulated and empirically evaluated. For the joint effects of two risk factors, there are a myriad of possible null hypotheses for independence of causal action. Consequently, definitive conclusions regarding synergistic and antagonistic effects are generally beyond our grasp.

It is my hope that these comments will not be misconstrued as advocating a retreat from efforts to use epidemiologic methods for the elucidation of mechanisms and for choosing among theories of pathogenesis. When evaluating the effects of individual factors, thoughtful consideration should always be given to possible mechanisms. However, when considering interactions, extreme caution is in order.

Acknowledgements-This commentary is based in part on a paper presented at a plenary session of the Twenty-First Annual Meeting of the Society for Epidemiologic Research, Vancouver, British Columbia, 15-17 June 1988. The work was supported in part by Contract 200-80-0561 from the Centers for Disease Control and by Research Grant R01-CA39477 from the National Cancer Institute. The author thanks Sander Greenland for his helpful comments on an earlier draft of the manuscript.

REFERENCES

1. Walter SD, Holford TR. Additive, multiplicative, and other models for disease risks. Am J Epidemiol 1978; 108: 341-346.
2. Rothman KJ. Occam’s razor pares the choice among statistical models. Am J Epidemiol 1978; 108: 347-349.
3. Kupper LL, Hogan MD. Interaction in epidemiologic studies. Am J Epidemiol 1978; 108: 447-453.
4. Greenland S. Limitations of the logistic analysis of epidemiologic data. Am J Epidemiol 1979; 110: 693-698.
5. Blot WJ, Day NE. Synergism and interaction: are they equivalent? Am J Epidemiol 1979; 110: 99-100.
6. Saracci R. Interaction and synergism. Am J Epidemiol 1980; 112: 465-466.
7. Rothman KJ, Greenland S, Walker AM. Concepts of interaction. Am J Epidemiol 1980; 112: 467-470.
8. Dayal HH. Additive excess risk model for epidemiologic interaction in retrospective studies. J Chron Dis 1980; 33: 653-660.
9. Gardner MJ, Munford AG. The combined effect of two factors on disease in a case-control study. Appl Stat 1980; 29: 276-281.
10. Mantel N. Epidemiologic interactions. Appl Stat 1981; 30: 311-312.
11. Weed DL, Selmon M, Sinks T. Links between categories of interaction. Am J Epidemiol 1988; 127: 17-27.
12. Buffler PA, Cooper SP, Stinnett S et al. Air pollution and lung cancer mortality in Harris County, Texas, 1979-1981. Am J Epidemiol 1988; 128: 683-699.
13. Kristal AR, Nasca PC, Burnett WS et al. Changes in the epidemiology of non-Hodgkin’s lymphoma associated with epidemic human immunodeficiency virus (HIV) infection. Am J Epidemiol 1988; 128: 711-718.
14. Mori M, Harabuchi I, Miyake H et al. Reproductive, genetic and dietary risk factors for ovarian cancer. Am J Epidemiol 1988; 128: 771-777.
15. Musicco M, Sant M, Molinari S et al. A case-control study of brain gliomas and occupational exposure to chemical carcinogens: the risk to farmers. Am J Epidemiol 1988; 128: 778-785.
16. Dong MH, Redmond CK, Mazumdar S et al. A multistage approach to the cohort analysis of lifetime lung cancer risk among steelworkers exposed to coke oven emissions. Am J Epidemiol 1988; 128: 860-873.
17. Kampert JB, Whittemore AS, Paffenbarger RS Jr. Combined effect of childbearing, menstrual events, and body size on age-specific breast cancer risk. Am J Epidemiol 1988; 128: 962-979.
18. Mackerras D, Buffler PA, Randall DE et al. Carotene intake and the risk of laryngeal cancer in coastal Texas. Am J Epidemiol 1988; 128: 980-988.
19. Slattery ML, Schumacher MC, Smith KR et al. Physical activity, diet and risk of colon cancer in Utah. Am J Epidemiol 1988; 128: 989-999.
20. Lyon JL, Mahoney AW. Fried foods and the risk of colon cancer. Am J Epidemiol 1988; 128: 1000-1006.
21. Klatsky AL, Armstrong MA, Friedman GD et al. The relations of alcoholic beverage use to colon and rectal cancer. Am J Epidemiol 1988; 128: 1007-1015.
22. Schairer C, Hartge P, Hoover RN et al. Racial differences in bladder cancer risk: a case-control study. Am J Epidemiol 1988; 128: 1027-1037.
23. Negri E, La Vecchia C, Bruzzi P et al. Risk factors for breast cancer: pooled results from three Italian case-control studies. Am J Epidemiol 1988; 128: 1207-1215.
24. Wu ML, Whittemore AS, Paffenbarger RS Jr et al. Personal and environmental characteristics related to epithelial ovarian cancer. I. Reproductive and menstrual events and oral contraceptive use. Am J Epidemiol 1988; 128: 1216-1227.
25. Whittemore AS, Wu ML, Paffenbarger RS Jr et al. Personal and environmental characteristics related to epithelial ovarian cancer. II. Exposures to talcum powder, tobacco, alcohol and coffee. Am J Epidemiol 1988; 128: 1228-1240.
26. Goodman MT, Kolonel LN, Yoshizawa CN et al. The effect of dietary cholesterol and fat on the risk of lung cancer in Hawaii. Am J Epidemiol 1988; 128: 1241-1255.
27. Nasca PC, Baptiste MS, MacCubbin PA et al. An epidemiologic case-control study of central nervous system tumors in children and parental occupational exposures. Am J Epidemiol 1988; 128: 1256-1265.
28. Morrison HI, Semenciw RM, Mao Y et al. Cancer mortality among a group of fluorspar miners exposed to radon progeny. Am J Epidemiol 1988; 128: 1266-1275.
29. McCoy GD, Wynder EL. Etiological and preventive implications in alcohol carcinogenesis. Cancer Res 1979; 39: 2844-2850.
30. Flanders WD, Rothman KJ. Interaction of alcohol and tobacco in laryngeal cancer. Am J Epidemiol 1982; 115: 371-379.
31. Miettinen OS. Causal and preventive interdependence: elementary principles. Scand J Work Environ Health 1982; 8: 159-168.
32. Armitage P, Doll R. Stochastic models for carcinogenesis. In: Neyman J, Ed. Proc Fourth Berkeley Symposium Math Stat Prob IV. Berkeley, Calif: University of California Press; 1961; 19-38.
33. Moolgavkar SH. The multistage theory of carcinogenesis and the age distribution of cancer in man. J Natl Cancer Inst 1978; 61: 49-52.
34. Moolgavkar SH, Day NE, Stevens RG. The two-stage model for carcinogenesis: epidemiology of breast cancer in females. J Natl Cancer Inst 1980; 65: 59-69.
35. Siemiatycki J, Thomas DC. Biological models and statistical interactions: an example from multistage carcinogenesis. Int J Epidemiol 1981; 10: 383-387.
36. Rothman KJ. Synergism and antagonism in cause-effect relationships. Am J Epidemiol 1974; 99: 385-388.
37. Rothman KJ. The estimation of synergy or antagonism. Am J Epidemiol 1976; 103: 506-511.
38. Rothman KJ. Causes. Am J Epidemiol 1976; 104: 587-593.
39. Koopman JS. Interaction between discrete causes. Am J Epidemiol 1981; 113: 716-724.
40. Greenland S, Poole C. Invariants and noninvariants in the concept of interdependent effects. Scand J Work Environ Health 1988; 14: 125-129.
41. Thomas DC, Whittemore AS. Methods for testing interactions, with applications to occupational exposures, smoking and lung cancer. Am J Industrial Med 1988; 13: 131-147.
42. Peto R. Statistical aspects of cancer trials. In: Halnan KE, Ed. Treatment of Cancer. London: Chapman and Hall; 1982: 867-871.
43. Cancer and Steroid Hormone Study of the Centers for Disease Control and the National Institute of Child Health and Human Development: Oral contraceptive use and the risk of breast cancer. N Engl J Med 1986; 315: 405-411.
44. Janerich DT, Hoff MB. Evidence for a crossover in breast cancer risk factors. Am J Epidemiol 1982; 116: 737-742.
45. Lubin JH, Burns PE, Blot WJ et al. Risk factors for breast cancer in women in northern Alberta, Canada, as related to age at diagnosis. J Natl Cancer Inst 1982; 68: 211-217.
46. Ron E, Lubin F, Wax Y. Re: “Evidence for a crossover in breast cancer risk factors.” Am J Epidemiol 1984; 119: 139-140.
47. Pathak DR, Speizer FE, Willett WC et al. Parity and breast cancer risk: possible effect on age at diagnosis. Int J Cancer 1986; 37: 21-25.
48. Smith PG, Day NE. The design of case-control studies: the influence of confounding and interaction effects. Int J Epidemiol 1984; 13: 356-365.
49. Greenland S. Tests for interaction in epidemiologic studies: A review and study of power. Stat Med 1983; 2: 243-251.
50. Gail M, Simon R. Testing for qualitative interactions between treatment effects and patient subsets. Biometrics 1985; 41: 361-372.

APPENDIX

Justification of Statistical Criteria for Inferring a Crossover

Let $R_{++}$, $R_{+-}$, $R_{-+}$ and $R_{--}$ denote the incidence rates for four subgroups of the population according to the presence (+) or absence (-) of two binary variables. The specified criteria for concluding that there is a crossover effect in the population are equivalent to requiring that one of the following four conditions is met (a circumflex denotes an estimate obtained from a sample):

(1) lower 95% confidence limit for $\hat{R}_{+-}/\hat{R}_{--}$ > 1 and upper 95% confidence limit for $\hat{R}_{++}/\hat{R}_{-+}$ < 1

(2) upper 95% confidence limit for $\hat{R}_{+-}/\hat{R}_{--}$ < 1 and lower 95% confidence limit for $\hat{R}_{++}/\hat{R}_{-+}$ > 1

(3) lower 95% confidence limit for $\hat{R}_{-+}/\hat{R}_{--}$ > 1 and upper 95% confidence limit for $\hat{R}_{++}/\hat{R}_{+-}$ < 1

(4) upper 95% confidence limit for $\hat{R}_{-+}/\hat{R}_{--}$ < 1 and lower 95% confidence limit for $\hat{R}_{++}/\hat{R}_{+-}$ > 1

When the null hypothesis of no crossover is true, the probability that one of these four conditions is met can be substantially less than 5%. Suppose, for example, that $R_{++} = R_{+-} = R_{-+} = R_{--}$. The probability that a given one of the four conditions is met would be $(0.025)^2 = 0.000625$. Since Conditions (1) and (3) can hold simultaneously, as can Conditions (2) and (4), the probability that at least one of the four conditions is met would be less than (4)(0.000625) = 0.0025.


For other sets of values for the rates in the population, however, the probability that at least one of the four conditions is met when there is no crossover in the population would be 0.05. Suppose that $R_{++} = R_{+-} = R_{-+} > R_{--}$ and that the sample size is large enough and the elevation above $R_{--}$ is high enough so that the lower 95% confidence limits for $\hat{R}_{+-}/\hat{R}_{--}$ and $\hat{R}_{-+}/\hat{R}_{--}$ will virtually always be greater than 1. Suppose further that the distribution of the two binary variables in the population is such that the estimate of $R_{++}$ is considerably more precise than the estimates of $R_{+-}$ and $R_{-+}$. Under these circumstances, the probability that each of the four conditions is met would therefore be 0.025, 0.0, 0.025, and 0.0, respectively. Since Conditions (1) and (3) would be virtually independent in this particular situation, the probability that either condition is met would be approximately 0.05. Consequently, the need for adequate protection against false positive findings of a crossover effect dictates that criteria as stringent as the ones specified in the text be adopted.
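The two probability calculations in this appendix can be checked with a few lines of arithmetic. The sketch below simply restates the idealized assumptions of the text (a 0.025 chance that a single 95% confidence limit falls on the wrong side of a true null ratio, and independence where the text asserts it); the variable names are illustrative.

```python
# Chance that one 95% confidence limit excludes a ratio whose true value is 1,
# in a specified direction.
tail = 0.025

# Scenario 1: all four rates are equal. Each condition requires two such
# tail events, which involve disjoint subgroups and so are independent.
p_single_condition = tail ** 2
bound_equal_rates = 4 * p_single_condition   # upper bound for "at least one of four"

# Scenario 2: R++ = R+- = R-+ > R--. Conditions (2) and (4) cannot occur,
# while (1) and (3) each occur with probability 0.025 and are nearly independent.
p_scenario_2 = 1 - (1 - tail) ** 2

print(p_single_condition)        # about 0.000625
print(bound_equal_rates)         # about 0.0025
print(round(p_scenario_2, 4))    # 0.0494, i.e. approximately 0.05
```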

Provided that one of the two risk factors has been designated a priori as the one to be used as the basis for stratification, the criteria specified in the text may be validly reformulated as follows: (1) stratification by the designated risk factor produces two stratum-specific estimates of effect that are on opposite sides of the null value and (2) the 90% confidence intervals for these two stratum-specific estimates both exclude the null value. These criteria still guarantee that the probability is less than or equal to five per cent for falsely concluding from the data that there is a crossover effect in the population [50].

Despite this technical justification for using 90% confidence intervals in certain circumstances, it may nevertheless be prudent to adopt a consistent practice of calculating 95% confidence intervals when assessing crossover effects. In many situations there would be no clear rationale for designating one risk factor a priori as the one to be used as the basis for stratification and another risk factor as the one to be treated as the exposure. Consequently, the researcher’s ability to explore the data fully would be unnecessarily constrained. Additionally, it would seem undesirable to adopt an analytic strategy whereby, depending on what is going on in the researcher’s head rather than in the data themselves, the same dataset would yield potentially different conclusions as to whether the statistical evidence for a crossover effect reaches the 95% level of confidence. In light of these considerations, 95% rather than 90% confidence intervals are given in Table 5. This has been done even though in that particular example it would be impossible, in light of the age-matching of controls to cases, to assess whether the direction of the change in breast cancer with age among parous women is in the opposite direction from the change with age in nulliparous women.