

Measurement Error and Results From Analytic Epidemiology: Dietary Fat and Breast Cancer

Ross L. Prentice*

Background: International correlational analyses have suggested a strong positive association between fat consumption and breast cancer incidence, especially among postmenopausal women. However, case-control studies have been taken to indicate a weaker association, and a recent, pooled cohort analysis reported little evidence of an association. Differences among study results could be due to differences in the populations studied, differences in the control for total energy intake, recall bias in the case-control studies, and dietary measurement error biases. Existing measurement error models assume either that the sample data used to validate dietary self-report instruments are without measurement error or that any such error is independent of both the true dietary exposure and other study subject characteristics. However, growing evidence indicates that total energy and, presumably, both total fat and percent energy from fat are increasingly underreported as percent body fat increases.

Purpose: A relaxed dietary measurement model is introduced that allows all measurement error parameters to depend on body mass index (weight in kilograms divided by the square of height in meters) and incorporates a random underreporting quantity that applies to each dietary self-report instrument. The model was applied to results from international correlational analyses to determine whether the differing associations between dietary fat and postmenopausal breast cancer can be explained by measurement errors in dietary assessment.

Methods: The relaxed measurement model was developed by use of data on total fat intake and percent energy from fat from 4-day food records (4DFRs) and food-frequency questionnaires (FFQs) from the original Women's Health Trial. This trial was a randomized, controlled, feasibility study of a low-fat dietary intervention carried out from 1985 through 1988 in Cincinnati (OH), Houston (TX), and Seattle (WA) among 303 women (184 intervention and 119 control) who were 45-69 years of age. The relaxed model was used to project results from the international correlational analyses onto 4DFR and FFQ fat-intake categories.

Results and Conclusions: If measurement errors in dietary assessment are overlooked entirely, the projected relative risks (RRs) for breast cancer based on the international data vary substantially across percentiles of total fat intake. The projected RR for the 90% versus the 10% fat-intake percentile is 3.08 with the 4DFR and 4.00 with the FFQ. If random (i.e., noise) aspects of measurement error are acknowledged, the projected RR for the same comparison is reduced to 1.54 with the 4DFR and 1.42 with the FFQ. If both systematic and noise aspects of measurement error are acknowledged, the projected RR is reduced to about 1.10 with either instrument. Acknowledgment of measurement error also leads to a projected RR of about 1.10 for the 90% versus the 10% percentile of percent energy from fat with either dietary instrument.

Implications: Dietary self-report instruments may be inadequate for analytic epidemiologic studies of dietary fat and disease risk because of measurement error biases. [J Natl Cancer Inst 1996;88:1738-47]

The hypothesis that a low-fat diet may reduce the risk for breast cancer has been promulgated for several decades. Experimental studies in rodents (1-3) indicate specific roles for both fat reduction and calorie restriction in inhibiting mammary tumorigenesis. International correlational studies (4,5) suggest strong relationships between breast cancer incidence and mortality and fat consumption, particularly for postmenopausal women, and they gain support from time trend and migrant studies. For example, Prentice and Sheppard (5) used breast cancer incidence rates among women in the age range of 55-69 years in 21 countries with representative cancer registration and per capita nutrient supply data to suggest that a 50% reduction in total fat intake in the United States could eventually reduce postmenopausal breast cancer incidence rates to about 40% of present levels. There have been a large number of case-control studies of this association reported during the past two decades. Howe et al. (6) carried out a joint analysis of the data from 12 such case-control studies that included 4247 breast cancer patients and 6095 control subjects; about two thirds of both groups were postmenopausal. The authors reported a highly significant (P<.0001) positive association between breast cancer risk and estimated total fat intake among postmenopausal women, with estimated relative risks (RRs) of 1.00, 1.20, 1.24, 1.24, and 1.46 across fat-intake quintiles. However, this trend was interpreted as being less than would be anticipated from the international correlational analyses. Recently, Hunter et al. (7) reported on a pooled analysis of seven cohort studies of dietary fat and breast cancer, which included 4980 breast cancer cases arising in the follow-up of more than 300 000 women. They reported an estimated RR of only 1.05 (95% confidence interval = 0.94-1.16) for the highest compared with the lowest quintile of calorie-adjusted total fat intake. The authors noted a similar lack of trend across percent energy from fat quintiles, with a somewhat elevated breast cancer incidence among women whose reported energy intake from fat was less than 15% or less than 20%. The seven cohort studies each used a food-frequency instrument for dietary assessment and included a "validation" subsample in which an additional, more comprehensive dietary approach, involving multiple food records or recalls, was used. Hunter et al. commented that using these validation data to provide a measurement error correction had little impact on their analyses. Differences between these cohort and case-control study results could be due to differing populations studied and differing dietary instruments, to differing control for energy intake, to recall bias in the case-control studies, or to dietary measurement error biases in one or both sets of studies.

* Correspondence to: Ross L. Prentice, Ph.D., Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 1124 Columbia St., Seattle, WA 98104.

See "Notes" section following "References."

1738 ARTICLES Journal of the National Cancer Institute. Vol. 88, No. 23, December 4, 1996

Existing measurement error models (8-11) assume either that the validation sample data are without measurement error or that any such error is independent of both the true dietary exposure and all other study subject characteristics. However, measurement errors for protein consumption under various dietary self-report instruments have been shown to be correlated (12). Also, there is a growing body of literature (13-17) describing the use of doubly labeled water and other techniques to obtain objective measures of total energy expenditure while controlling for physical activity. These studies consistently showed obese persons to underreport calorie intake substantially, perhaps by 25%-50%, on food records and other forms of dietary self-report. For example, a recent study in Denmark (15) indicated that self-reported energy intake among middle-aged and older women was underestimated in an approximately linear fashion across deciles of percent body fat. The estimated degree of underreporting increased from near zero at the lowest body fat decile to 30%-40% in the upper three deciles. Furthermore, protein energy, calculated from urinary nitrogen output, was underreported to a considerably lesser degree than was total energy, making it likely that both total fat and percent energy from fat are increasingly underreported as percent body fat increases.

In this study, a more relaxed measurement model is introduced by allowing all measurement error parameters to depend on body mass index (BMI) (i.e., weight in kilograms divided by the square of height in meters) categories and by incorporating a random underreporting quantity that applies to each dietary self-report instrument. The implication of these measurement model improvements is then examined in an attempt to understand the varying results of analytic epidemiologic studies of dietary fat and breast cancer among postmenopausal women and to identify future research strategies in the diet-and-disease arena.

Methods

The relaxed measurement model referred to above was applied to food record and food-frequency data from the National Cancer Institute-sponsored Women's Health Trial. The Women's Health Trial was a randomized, controlled, feasibility trial of a low-fat dietary pattern carried out among 303 women (184 intervention and 119 control) in the age range of 45-69 years; the trial took place from 1985 through 1988 in Cincinnati (OH), Houston (TX), and Seattle (WA) (18,19).

All Women's Health Trial participants were instructed on how to keep food records on standardized forms. Records that involved 4 consecutive days of food recording, including 1 weekend day (i.e., 4-day food records [4DFRs]), were obtained at baseline (before randomization) and at 6, 12, and 24 months after randomization. A nutritionist reviewed the records for accuracy, legibility, and completeness. The inability to provide a food record of sufficient quality precluded trial participation. 4DFRs were coded and analyzed at the Nutrition Coordinating Unit at Tufts University (Boston, MA). The University of Massachusetts Nutrient Databank (Worcester, MA), based on the U.S. Department of Agriculture Revised Handbook 8 (20), was used to derive nutrient consumption estimates from food record information. A self-administered food-frequency questionnaire (FFQ), which attempted to ascertain usual dietary habits during the preceding months (12 months at baseline and 6 months subsequently), was also provided at baseline and at 6, 12, and 24 months. At baseline and at 6 months, a questionnaire developed by Willett et al. (21) and its accompanying nutrient database were used. At 12 and 24 months, a similar questionnaire developed by Block et al. (22) was used. To avoid dietary intervention influences on measurement properties, only baseline data and post-randomization dietary data from control group women were used in these analyses. These data, along with baseline height and weight, and hence BMI, were used to build a measurement model for both total (daily) fat intake and for percent energy from fat, the details of which will be presented in the "Results" section.

To characterize the influence of measurement error in fat-intake assessment on the results of analytic epidemiologic studies, we assumed that the strong associations between measures of dietary fat and postmenopausal breast cancer seen in international correlational analyses were due entirely to fat consumption. The measurement model developed using Women's Health Trial data was then used to examine the extent to which such strong associations are attenuated and distorted by measurement error in dietary assessment. For example, if the strong signal assumed from international comparisons is largely masked by measurement error in 4DFRs and FFQs, it would follow that these instruments may well be inadequate for the reliable assessment of trends between disease risk and fat intake in cohort or case-control studies, regardless of study size.

The international data from 21 countries previously mentioned (5) were used to specify models for postmenopausal breast cancer risk as a function of fat intake. These data consisted of age-adjusted breast cancer incidence rates among women in the age range of 55-69 years around 1980, from volume 5 of Cancer Incidence in Five Continents (23), and national per capita supply data for total fat and total calories for the time period 1975-1977 (24). These data suggest an approximately linear relationship between breast cancer risk and either the logarithm of per capita total fat or the logarithm of percent energy from fat, and parameters were estimated by use of simple linear regression. Because logarithms of dietary data are used, the regression coefficients would be unaltered if the per capita supply estimates were rescaled to enhance their agreement with corresponding nutrient consumption estimates. The 21 countries include the United States and several other Western countries, precluding any noteworthy extrapolation in application to Women's Health Trial fat-intake data.
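The invariance of the regression coefficient to rescaling can be checked numerically. The following is a minimal sketch, not the analysis used in this article: the supply and incidence values and the rescaling factor are hypothetical, and the outcome is taken here to be the log incidence rate. Multiplying the per capita supply by a constant shifts its logarithm by a constant, so only the intercept changes.

```python
import math

def ols_slope_intercept(x, y):
    """Simple linear regression of y on x; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical per capita fat supply and incidence rates (NOT the 21-country data).
fat_supply = [60.0, 85.0, 110.0, 135.0, 150.0]
incidence = [70.0, 115.0, 160.0, 210.0, 240.0]
log_rate = [math.log(r) for r in incidence]

slope_raw, icpt_raw = ols_slope_intercept([math.log(f) for f in fat_supply], log_rate)

# Hypothetical rescaling of supply toward consumption (e.g., allowing for wastage).
scale = 0.7
slope_scaled, icpt_scaled = ols_slope_intercept(
    [math.log(scale * f) for f in fat_supply], log_rate
)

print(slope_raw, slope_scaled)   # the two slopes agree
print(icpt_raw, icpt_scaled)     # the intercepts differ by slope * log(scale)
```

Only the intercept absorbs the rescaling, which is why RR projections that depend on differences in log intake are unaffected by it.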

Results

Measurement Model Development

Denote by $Z_i$ the "true" fat-intake measure for the $i$th study subject. For example, $Z_i$ may be defined as the logarithm of average daily grams of fat consumed over a time period pertinent to cancer risk, say during the past 10 years. The use of logarithmic transformations is common in nutritional epidemiology as a means of inducing nutrient intakes that are approximately normally distributed. Denote by $X_{1i}$ the corresponding 4DFR fat-intake measure. A standard measurement model (11) for the "validation" sample measure $X_{1i}$ would assume

$$X_{1i} = Z_i + \varepsilon_{1i}, \qquad [1]$$

where $\varepsilon_{1i}$ is a measurement error that is assumed to be statistically independent of $Z_i$ and of all other study subject characteristics. In this context, one can think of $\varepsilon_{1i}$ as primarily due to variations in dietary habits among 4-day periods within the preceding decade, but other sources of error include recording errors or omissions by the study subject and nutrient database inaccuracies. These latter sources of error can be expected to lead to positive correlations among $\varepsilon_{1i}$ values based on 4DFRs kept by the $i$th subject at different 4-day recording periods. Similarly, trends in fat consumption for a study subject during the preceding decade would tend to induce positive correlations among errors ($\varepsilon_{1i}$) if food recording periods are close in time.

A similar measurement model,

$$X_{2i} = Z_i + \varepsilon_{2i}, \qquad [2]$$

could be considered for an FFQ fat-intake measure $X_{2i}$. Here, the error term $\varepsilon_{2i}$ can reflect variations among the time periods (e.g., preceding 6 or 12 months) for which an FFQ would attempt to ascertain usual dietary habits, errors due to the choice of food items listed in the FFQ or due to the related nutrient database, and inaccuracies on the part of the study subject in recalling the frequency of consuming various foods and usual portion sizes.

A traditional measurement model would assume $Z_i$, $\varepsilon_{1i}$, and $\varepsilon_{2i}$ to be statistically independent and normally distributed. Under these circumstances, measurement error merely adds noise to the dietary assessment, and RRs are simply attenuated toward unity. Specifically, if the breast cancer RR depends on the true fat measure $Z$ according to

$$RR(Z) = \exp\{\beta(Z - Z_0)\},$$

where $\beta$ is a regression coefficient and $Z_0$ is a reference value, then the RR as a function of the 4DFR measure $X_1$ is to an excellent approximation

$$RR(X_1) = \exp\{\lambda\beta(X_1 - X_{10})\}, \qquad [3]$$

where $X_{10}$ is a reference value and $\lambda = \operatorname{correlation}(X_1, X_2)\,(\text{standard deviation } X_2 / \text{standard deviation } X_1)$ is an attenuation coefficient. A corresponding expression holds for $RR(X_2)$, the RR as a function of an FFQ measure of dietary fat.
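The attenuation coefficient can be illustrated by simulation. The sketch below is not part of the original analysis: the variances, sample size, and seed are hypothetical choices, and the data are generated under the traditional model in which both instruments measure the same true intake with independent normal noise. Under that model the covariance of the two measures equals the variance of the true intake, so the coefficient reduces to the ratio of true-intake variance to 4DFR variance.

```python
import math
import random

def attenuation_coefficient(x1, x2):
    """lambda = corr(X1, X2) * SD(X2) / SD(X1), which equals cov(X1, X2) / var(X1)."""
    n = len(x1)
    m1 = sum(x1) / n
    m2 = sum(x2) / n
    v1 = sum((a - m1) ** 2 for a in x1) / (n - 1)
    v2 = sum((b - m2) ** 2 for b in x2) / (n - 1)
    c12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2)) / (n - 1)
    corr = c12 / math.sqrt(v1 * v2)
    return corr * math.sqrt(v2) / math.sqrt(v1)

# Hypothetical variances for true log intake and the two noise terms.
VAR_Z, VAR_E1, VAR_E2 = 0.016, 0.069, 0.100

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 50_000
z = [rng.gauss(0.0, math.sqrt(VAR_Z)) for _ in range(n)]
x1 = [zi + rng.gauss(0.0, math.sqrt(VAR_E1)) for zi in z]  # 4DFR-like measure
x2 = [zi + rng.gauss(0.0, math.sqrt(VAR_E2)) for zi in z]  # FFQ-like measure

lam_hat = attenuation_coefficient(x1, x2)
lam_true = VAR_Z / (VAR_Z + VAR_E1)  # var(Z) / var(X1) under independent errors

beta = 1.341  # log-total fat coefficient quoted later in the article
dx = 0.5      # hypothetical difference in log intake between two subjects
print(lam_hat, lam_true)
print(math.exp(beta * dx), math.exp(lam_hat * beta * dx))  # true vs. attenuated RR
```

With noise variance several times the true-intake variance, the coefficient is well below 1 and the RR seen on the measured scale is pulled strongly toward unity.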

It may be important to relax the assumption that measurement error is independent of the true fat intake for the nonvalidation (FFQ) measure $X_{2i}$. This can be accomplished by writing this measurement error as the sum of two components. The first may depend on $Z_i$, but it does not vary with repeat applications of an FFQ, whereas the second does vary with repeat applications and will be taken to be independent of $Z_i$ and of the first error component, giving

$$X_{2i} = Z_{2i} + \varepsilon_{2i}, \qquad [4]$$

where $\varepsilon_{2i}$ has been redefined to be the second component of error just mentioned and $Z_{2i}$ is the sum of $Z_i$ and the first error component. Expressions [1] and [4] and normal distribution assumptions imply that the conditional expectation of the true fat intake $Z_i$ is a linear function of the corresponding 4DFR and FFQ measures, consistent with the regression calibration measurement model approach (8,9,11) used by Hunter et al. (7) in their measurement error corrections of cohort studies of fat and breast cancer.

The measurement model given by [1] and [4], though state of the art, does not acknowledge the underreporting of energy and fat as a function of body mass, as noted in the introduction. Hence, the measurement model will be refined by allowing all model parameters to vary across categories $v = 1, \ldots, m$ of BMI and by introducing an underreporting variable $W$ that applies to both 4DFR and FFQ assessments. For study subject $i$ in BMI category $v$, one then has

$$X_{1iv} = Z_{iv} + W_{iv} + \varepsilon_{1iv}; \quad X_{2iv} = Z_{2iv} + W_{iv} + \varepsilon_{2iv}, \qquad [5]$$

where the mean and variance of the true fat intake $Z_{iv}$ will be written as $\mu_v$ and $\sigma_v^2$, respectively; for $Z_{2iv}$, these will be denoted by $\mu_{2v}$ and $\sigma_{2v}^2$, respectively; for $W_{iv}$, they will be denoted by $\alpha_v$ and $\eta_v^2$, respectively. Without loss of generality, the mean of $\varepsilon_{jiv}$ can be taken to be zero, while the variance will be denoted $\delta_{jv}^2$, $j = 1, 2$. All quantities on the right side of the expressions in model [5], except $Z_{iv}$ and $Z_{2iv}$, are assumed to be independent and normally distributed, conditional on BMI category. This measurement model will be applied with baseline BMI values grouped into tertiles ($m = 3$), but BMI quintiles and deciles were also examined with RR projections very similar to those shown below. For convenience of terminology, $\varepsilon_{1iv}$ and $\varepsilon_{2iv}$ will be referred to as the noise, or random, aspects of measurement error, while $W_{iv}$ will be referred to as the systematic error component, although it should be remembered that $Z_{2iv}$ may also include systematic measurement error.

Measurement Model Fitting

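From [5], the 4DFR and FFQ measures $X_{1iv}$ and $X_{2iv}$ in BMI category $v$ have a mean ($\mu_v + \alpha_v$, $\mu_{2v} + \alpha_v$) and variance matrix

$$\begin{pmatrix} \sigma_v^2 + \eta_v^2 + \delta_{1v}^2 & \gamma_v\sigma_v\sigma_{2v} + \eta_v^2 \\ \gamma_v\sigma_v\sigma_{2v} + \eta_v^2 & \sigma_{2v}^2 + \eta_v^2 + \delta_{2v}^2 \end{pmatrix}. \qquad [6]$$

Note that the covariance between $X_{1iv}$ and $X_{2iv}$ is the sum of a term $\gamma_v\sigma_v\sigma_{2v}$ arising from the correlation $\gamma_v$ between $Z_{iv}$ and $Z_{2iv}$ and a term $\eta_v^2$ coming from their shared systematic bias.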
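The moment structure can be written out directly from model [5]. The helper below is an illustrative sketch: the function name and the numerical inputs are hypothetical, and it assumes, as stated for model [5], that the underreporting variable and the two noise terms are independent of each other and of the true-intake terms. Standard deviations and the correlation are passed in, and variances are formed inside.

```python
def mean_and_variance_matrix(mu, mu2, alpha, sigma, sigma2, gamma, eta, delta1, delta2):
    """Mean vector and 2x2 variance matrix of (X1, X2) within one BMI category.

    sigma, sigma2  : SDs of the true-intake terms Z and Z2
    gamma          : correlation between Z and Z2
    eta            : SD of the shared underreporting variable W (mean alpha)
    delta1, delta2 : SDs of the 4DFR and FFQ noise terms
    """
    # Covariance = part due to correlated true-intake terms + shared systematic error.
    shared = gamma * sigma * sigma2 + eta ** 2
    mean = (mu + alpha, mu2 + alpha)
    variance = [
        [sigma ** 2 + eta ** 2 + delta1 ** 2, shared],
        [shared, sigma2 ** 2 + eta ** 2 + delta2 ** 2],
    ]
    return mean, variance

# Hypothetical parameter values chosen only to exercise the function.
mean, variance = mean_and_variance_matrix(
    mu=6.6, mu2=6.7, alpha=-0.15,
    sigma=0.13, sigma2=0.15, gamma=0.8,
    eta=0.18, delta1=0.27, delta2=0.24,
)
print(mean)
print(variance)
```

Setting `eta` to zero leaves only the true-intake term in the off-diagonal entry, which shows how a shared systematic error inflates the apparent agreement between the two instruments.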

The parameters in [6] were estimated by use of data from the Women's Health Trial. The repeat 4DFRs and FFQs on women randomly assigned to the dietary control group can help in parameter estimation. We focus on the control group 4DFR and FFQ assessments $\tilde{X}_{1iv}$ and $\tilde{X}_{2iv}$ at 1 year after randomization. In accordance with [5], assume that

$$\tilde{X}_{1iv} = Z_{iv} + W_{iv} + \tilde{\varepsilon}_{1iv}; \quad \tilde{X}_{2iv} = Z_{2iv} + W_{iv} + \tilde{\varepsilon}_{2iv}, \qquad [7]$$

where the noise variates $\tilde{\varepsilon}_{1iv}$ and $\tilde{\varepsilon}_{2iv}$ are normally distributed, with the mean equal to zero and the variance equal to $\delta_{1v}^2$ and $\delta_{2v}^2$, respectively, independent of the other variates in [5]. The fact that the FFQ of Willett et al. (21) was used at baseline and the FFQ of Block et al. (22) was used at 1 year allows instrument differences related to choice of food items and portion-size ascertainment strategies to be included in the FFQ noise measurement error component.

An additional assumption is required to separately estimate $\sigma_v^2$ and $\eta_v^2$. We will proceed, somewhat arbitrarily, by supposing that the 4DFR noise-to-signal ratios $\delta_{1v}^2/\sigma_v^2$ are constant across BMI categories, so that the relative degree of variability of the fat consumption between 4-day recording periods is assumed to be common across BMI tertiles. It turns out that this assumption leads to estimates of the true log-fat and log-percent energy from fat-intake variances, $\sigma_v^2$, that are nearly constant across BMI categories. An assumption of $\sigma_v^2 = \sigma^2$ for all $v$ would provide an equally attractive modeling step and would yield results very similar to those given below.


The measurement model fitting can be completed by specifying a value for $\eta_1^2$, the systematic error variance in the lowest BMI category, or equivalently by specifying the correlation $\rho_{11} = \eta_1^2/(\eta_1^2 + \delta_{11}^2)$ between the baseline and 1-year 4DFR measurement errors at $v = 1$. In view of the lesser average underreporting of calories at lower body mass, one expects $\rho_{11}$ to be less than the corresponding measurement error correlations $\rho_{1v} = \eta_v^2/(\eta_v^2 + \delta_{1v}^2)$ in the other BMI categories. A value of $\rho_{11} = 0$ (uncorrelated measurement errors) will be considered, and the sensitivity of RR projections to small values of $\rho_{11}$ in the range of 0.0-0.20 will be examined.

With these specifications, all the parameters in the variance model [6] can be estimated as functions of the sample variances $s_{jv}^2$ of $X_{jiv}$ values, the sample correlations $r_{jv}$ between baseline $X_{jiv}$ and 1-year $\tilde{X}_{jiv}$ values, and the sample correlations between $X_{1iv}$ and $X_{2iv}$, as is elaborated in Appendix 1. Table 1 shows these values and sample means for both log-total fat and log-percent energy from fat broken down by baseline BMI tertile ($m = 3$). Baseline calculations included all 303 women entered in the Women's Health Trial, whereas baseline to 1-year correlations were restricted to the 109 control group women who provided both a 4DFR and an FFQ at 1 year. Each BMI tertile consisted of 101 women. BMI ranges in the lowest, middle, and highest tertiles were 16.6 to less than 23.3, 23.3 to less than 26.7, and 26.7 to 35.4, respectively. The similarity in fat-intake sample means across BMI tertiles for the two dietary instruments suggests a similar degree of underreporting as a function of body mass. The higher baseline to 1-year 4DFR fat-intake correlations in higher BMI tertiles perhaps reflect more strongly correlated measurement errors.

Table 2 shows estimated measurement model parameters at specified values of the correlation $\rho_{11}$ between baseline and 1-year 4DFR measurement errors in the lowest BMI tertile. Note the higher values of the estimated correlation between baseline and 1-year 4DFR measurements in higher BMI categories, as would be expected with more variable underreporting. The estimated baseline to 1-year FFQ measurement error correlations are also larger in higher BMI categories, as are the estimated correlations between baseline 4DFR and baseline FFQ measurement errors. As mentioned previously, the estimated variance $\sigma_v^2$ of the true fat-intake measurement, whether log-total fat or log-percent energy from fat, varies little across BMI categories. Constancy of $\sigma_v^2$ suggests a similar degree of variability among individuals in true fat intake across BMI tertiles, but with possibly differing absolute amounts. Table 2 shows that the estimated true fat-intake variance constitutes only a small fraction of the total intake sample variance for the 4DFR measurements and an even smaller fraction for the FFQ measurements and that these estimated fractions are lower among women having larger BMI values. The final rows of Table 2 show the estimated fraction, $\eta_v^2/c_v$, of the sample covariance between baseline 4DFR and FFQ fat-intake estimates that is attributable to measurement error correlation. Most of the covariance is attributable to measurement error correlation in the higher BMI categories under this measurement model. Table 2 also indicates that these fractions are quite sensitive to the magnitude of the 4DFR measurement error correlation ($\rho_{11}$) in the lowest BMI tertile.
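The fitting steps described above can be sketched as a method-of-moments calculation. This is a reading of the text, not the Appendix 1 derivation, which is not reproduced here. It assumes that the baseline to 1-year covariance for each instrument estimates the sum of the true-intake and systematic-error variances, that the remainder of the baseline variance is noise, and that the baseline standard deviation also applies at 1 year. The function name is hypothetical; inputs are the log-total fat entries of Table 1.

```python
import math

# Table 1, log-total fat, by baseline BMI tertile (v = 1, 2, 3).
S1 = [0.292, 0.348, 0.342]   # baseline 4DFR SD
S2 = [0.345, 0.406, 0.456]   # baseline FFQ SD
R12 = [0.242, 0.374, 0.293]  # baseline 4DFR/FFQ correlation
R1 = [0.188, 0.395, 0.422]   # baseline to 1-year 4DFR correlation
R2 = [0.478, 0.639, 0.515]   # baseline to 1-year FFQ correlation

def fit_moments(s1, s2, r12, r1, r2, rho11):
    """Moment estimates per tertile, given the tertile-1 4DFR error correlation rho11."""
    # Noise variances: the part of each instrument's variance not shared across repeats.
    d1 = [s * s * (1 - r) for s, r in zip(s1, r1)]
    d2 = [s * s * (1 - r) for s, r in zip(s2, r2)]
    # The repeat-4DFR covariance identifies only the sum sigma_v^2 + eta_v^2.
    shared = [s * s * r for s, r in zip(s1, r1)]
    # rho11 fixes eta_1^2; the tertile-1 noise-to-signal ratio is then held constant.
    eta1 = rho11 / (1 - rho11) * d1[0]
    ratio = d1[0] / (shared[0] - eta1)
    rows = []
    for v in range(len(s1)):
        sigma = d1[v] / ratio     # true-intake variance sigma_v^2
        eta = shared[v] - sigma   # systematic-error variance eta_v^2
        rows.append({
            "rho1": eta / (eta + d1[v]),
            "rho2": eta / (eta + d2[v]),
            "rho12": eta / math.sqrt((eta + d1[v]) * (eta + d2[v])),
            "sigma_sq_x100": 100 * sigma,
            "frac_4dfr": sigma / s1[v] ** 2,
            "frac_ffq": sigma / s2[v] ** 2,
            "frac_cov": eta / (r12[v] * s1[v] * s2[v]),
        })
    return rows

rows0 = fit_moments(S1, S2, R12, R1, R2, rho11=0.0)
for v, row in enumerate(rows0, start=1):
    print(v, {key: round(value, 2) for key, value in row.items()})
```

With the correlation in the lowest tertile set to zero, this calculation comes out within about 0.02 of the log-total fat entries in the first column of Table 2, which suggests the reading is close to the published procedure. Small differences are to be expected because the inputs are rounded.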

Postmenopausal Breast Cancer RR Projections

The estimated coefficients from the regression of log-breast cancer incidence among women aged 55-69 years in 21 countries on corresponding log-per capita fat supply estimates were $\beta = 1.341$ for log-total fat and $\beta = 1.380$ for log-percent energy from fat. As previously mentioned, a corresponding exponential form RR model $\exp\{\beta(Z - Z_0)\}$ will be assumed, where $Z$ is the "true" fat intake, and RRs as a function of 4DFR and FFQ fat-intake measures will be estimated under varying levels of acknowledgment of measurement error. Estimated RRs will be presented for the 30th, 50th, 70th, and 90th percentiles of the fat-intake distribution in comparison with the 10th percentile, since these will correspond closely with the RRs across fat-intake quintile categories, which are commonly used in nutritional epidemiology reporting. Under the normal distribution assumptions previously mentioned, the estimated 10th, 30th, 50th, 70th, and 90th percentiles for daily grams of total fat are 47.0, 60.2, 71.4, 84.8, and 108.6, respectively, from baseline 4DFRs in the Women's Health Trial and 44.8, 60.8, 75.2, 92.9, and 126.1, respectively, from baseline FFQs. The corresponding percentiles for percent energy from fat are 30.9, 35.2, 38.5, 42.1, and 47.8 from baseline 4DFRs and 29.8, 34.0, 37.3, 40.9, and 46.7 from baseline FFQs. These distributions are generally comparable to those for the Nurses' Health Study and other cohort studies included in the pooled analysis of Hunter et al. (7).

Table 1. Four-day food record (4DFR) and food-frequency questionnaire (FFQ) estimates of means, standard deviations, and correlations for log-total fat and log-percent energy from fat from baseline and 1-year control group dietary assessments in the Women's Health Trial*

Columns are body mass index tertiles ($v$ = 1, 2, 3) for each fat-intake measure.

| | Log-total fat, $v$ = 1 | Log-total fat, $v$ = 2 | Log-total fat, $v$ = 3 | Log-percent energy from fat, $v$ = 1 | Log-percent energy from fat, $v$ = 2 | Log-percent energy from fat, $v$ = 3 |
|---|---|---|---|---|---|---|
| Baseline 4DFR mean ($\mu_v + \alpha_v$) | 6.462 | 6.474 | 6.463 | 3.628 | 3.679 | 3.645 |
| Baseline 4DFR SD ($s_{1v}$) | 0.292 | 0.348 | 0.342 | 0.146 | 0.171 | 0.188 |
| Baseline FFQ mean ($\mu_{2v} + \alpha_v$) | 6.495 | 6.543 | 6.513 | 3.599 | 3.642 | 3.648 |
| Baseline FFQ SD ($s_{2v}$) | 0.345 | 0.406 | 0.456 | 0.164 | 0.174 | 0.188 |
| Baseline 4DFR/FFQ corr. | 0.242 | 0.374 | 0.293 | 0.323 | 0.516 | 0.339 |
| Baseline and 1-y 4DFR corr. ($r_{1v}$) | 0.188 | 0.395 | 0.422 | 0.242 | 0.444 | 0.446 |
| Baseline and 1-y FFQ corr. ($r_{2v}$) | 0.478 | 0.639 | 0.515 | 0.322 | 0.552 | 0.332 |

*Mean = $\mu + \alpha$; SD = standard deviation; corr. = correlation ($r$).

Journal of the National Cancer Institute, Vol. 88, No. 23, December 4, 1996 ARTICLES 1741


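Table 2. Selected measurement model parameter estimates for log-total fat and log-percent energy from fat as a function of the measurement error correlation ($\rho_{11}$) between baseline and 1-year control group 4-day food record (4DFR) dietary assessments in the Women's Health Trial*

| Measure | $\rho_{11}$ | Body mass index tertile (v) | Baseline to 1-y 4DFR meas. error corr. ($\rho_{1v}$) | Baseline to 1-y FFQ† meas. error corr. ($\rho_{2v}$) | Baseline 4DFR and FFQ† meas. error corr. ($\rho_{12v}$) | "True" fat-intake variance estimate ($\sigma_v^2$) x 10² | Fraction of 4DFR fat-intake variance due to true fat ($\sigma_v^2/s_{1v}^2$) | Fraction of FFQ† fat-intake variance due to true fat ($\sigma_v^2/s_{2v}^2$) | Fraction of 4DFR/FFQ† covariance attributable to measurement error ($\tau_v^2/c_v$) |
|---|---|---|---|---|---|---|---|---|---|
| Log-total fat | 0 | 1 | 0.00 | 0.00 | 0.00 | 1.61 | 0.19 | 0.13 | 0.00 |
| Log-total fat | 0 | 2 | 0.30 | 0.34 | 0.32 | 1.70 | 0.14 | 0.10 | 0.58 |
| Log-total fat | 0 | 3 | 0.33 | 0.25 | 0.29 | 1.58 | 0.13 | 0.08 | 0.74 |
| Log-total fat | 0.1 | 1 | 0.10 | 0.11 | 0.11 | 0.84 | 0.10 | 0.07 | 0.30 |
| Log-total fat | 0.1 | 2 | 0.35 | 0.40 | 0.37 | 0.89 | 0.07 | 0.05 | 0.63 |
| Log-total fat | 0.1 | 3 | 0.38 | 0.29 | 0.33 | 0.82 | 0.07 | 0.04 | 0.91 |
| Log-percent energy from fat | 0 | 1 | 0.00 | 0.00 | 0.00 | 0.52 | 0.24 | 0.19 | 0.00 |
| Log-percent energy from fat | 0 | 2 | 0.32 | 0.37 | 0.34 | 0.52 | 0.18 | 0.17 | 0.51 |
| Log-percent energy from fat | 0 | 3 | 0.33 | 0.29 | 0.31 | 0.62 | 0.18 | 0.18 | 0.79 |
| Log-percent energy from fat | 0.1 | 1 | 0.10 | 0.09 | 0.10 | 0.34 | 0.16 | 0.13 | 0.23 |
| Log-percent energy from fat | 0.1 | 2 | 0.37 | 0.42 | 0.39 | 0.34 | 0.12 | 0.11 | 0.63 |
| Log-percent energy from fat | 0.1 | 3 | 0.37 | 0.33 | 0.35 | 0.41 | 0.12 | 0.11 | 0.98 |

*Meas. error corr. = measurement error correlations.
†FFQ = food-frequency questionnaire.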

The first row of Table 3 shows that international data-projected RRs would vary substantially across total fat-intake percentiles if dietary assessment measurement errors could be entirely overlooked. For example, the projected RR for the 90th versus the 10th fat-intake percentile is 3.08 for the 4DFR and 4.00 for the FFQ. Corresponding RRs would be about 1.8-1.9 for the 90th versus the 10th percentile of percent energy from fat if both random and systematic aspects of dietary assessment measurement error could be ignored.

Table 3 also shows RR projections under the oversimplified measurement model given by [1] and [2], which assumes baseline 4DFR and FFQ measurement errors not to be correlated and makes no provision for systematic measurement errors. The sample correlation between baseline 4DFR and FFQ total fat measures is 0.310, giving estimated attenuation coefficients $\lambda$ in [3] of (0.310)(0.404)(0.327)$^{-1}$ = 0.383 for the 4DFR and 0.251 for the FFQ. Based on a sample correlation of 0.407 between baseline 4DFR and FFQ percent energy from fat measures, the attenuation coefficient in [3] is estimated by (0.407)(0.176)(0.170)$^{-1}$ = 0.421 for the 4DFR and 0.393 for the FFQ percent energy from fat measures. As shown in Table 3, the international data-projected RRs are much reduced by this acknowledgment of the noise aspect of measurement error, with RRs for the 90th percentile versus the 10th percentile reduced to about 1.5 for total fat and to less than 1.3 for percent energy from fat for both dietary assessment instruments.

Table 3. International data-projected relative risks for breast cancer at various percentiles of the Women's Health Trial fat-intake distribution, under oversimplified measurement error models*

| Measure | Measurement error overlooked | 4DFR 10th† | 4DFR 30th | 4DFR 50th | 4DFR 70th | 4DFR 90th | FFQ 10th† | FFQ 30th | FFQ 50th | FFQ 70th | FFQ 90th |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Log-total fat | Systematic and noise | 1.00 | 1.39 | 1.75 | 2.21 | 3.08 | 1.00 | 1.51 | 2.00 | 2.66 | 4.00 |
| Log-total fat | Systematic | 1.00 | 1.13 | 1.24 | 1.36 | 1.54 | 1.00 | 1.11 | 1.19 | 1.28 | 1.42 |
| % energy from fat | Systematic and noise | 1.00 | 1.20 | 1.35 | 1.53 | 1.82 | 1.00 | 1.20 | 1.36 | 1.55 | 1.86 |
| % energy from fat | Systematic | 1.00 | 1.07 | 1.13 | 1.20 | 1.28 | 1.00 | 1.07 | 1.12 | 1.19 | 1.27 |

*Entries are relative risks at the stated percentile. 4DFR = 4-day food record; FFQ = food-frequency questionnaire.
†Reference category.
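The attenuation arithmetic in this passage can be reproduced directly. Formula [3] is not shown in this section, so the sketch below assumes the usual regression-calibration form, in which the coefficient for one instrument is the sample correlation times the ratio of the other instrument's SD to its own, and the attenuated RR is $\exp\{\lambda\beta(X - X_0)\}$. Reading 0.327 as the pooled 4DFR SD and 0.404 as the pooled FFQ SD is an inference from the arithmetic quoted in the text, and the function names are illustrative.

```python
import math

BETA_TOTAL_FAT = 1.341  # coefficient for log-total fat quoted in the text


def attenuation(corr, sd_this, sd_other):
    """Assumed form of the attenuation coefficient: r * sd_other / sd_this."""
    return corr * sd_other / sd_this


def attenuated_rr(high, low, beta, lam):
    """RR for `high` versus `low` reported intake with slope lam * beta."""
    return math.exp(lam * beta * math.log(high / low))


# Sample correlation 0.310; SDs inferred from the arithmetic in the text.
lam_4dfr = attenuation(0.310, sd_this=0.327, sd_other=0.404)
lam_ffq = attenuation(0.310, sd_this=0.404, sd_other=0.327)

# 90th versus 10th percentile of daily grams of total fat.
rr_4dfr = attenuated_rr(108.6, 47.0, BETA_TOTAL_FAT, lam_4dfr)
rr_ffq = attenuated_rr(126.1, 44.8, BETA_TOTAL_FAT, lam_ffq)

print(round(lam_4dfr, 3), round(lam_ffq, 3))
print(round(rr_4dfr, 2), round(rr_ffq, 2))
```

Under these assumptions the coefficients agree with the 0.383 and 0.251 in the text, and the projected RRs fall close to the 1.54 and 1.42 shown for total fat in Table 3.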

These RR projections still ignore a systematic underreporting of fat intake, especially among obese persons. Variation between individuals in the extent of such underreporting causes correlations among the measurement errors associated with the 4DFR and FFQ instruments and correlation among measurement errors when such instruments are used at multiple points in time. The measurement model [5] acknowledges these systematic biases, as well as the noise aspect of measurement error.

Appendix 2 shows formulae for estimating RRs under model [5]. Tables 4 and 5 present international data-projected RRs under [5], using the parameter estimates given in Table 2, as a function of the correlation ($\rho_{11}$) between baseline and 1-year 4DFR measurement errors in the lowest BMI tertile and as a function of $\alpha_{1v}$, the average fat-intake underreporting in BMI tertile v, v = 1, 2, 3. Motivated by the data of Heitmann and Lessner (15), we assume $\alpha_{11}$, $\alpha_{12}$, and $\alpha_{13}$ to be of the form 0, $\alpha_0/2$, and $\alpha_0$, respectively, and let $\alpha_0$ take values of 0, log(0.75) = -0.288, or log(0.50) = -0.693, corresponding to 0%, 25%, and 50% average fat-intake underreporting, respectively. For simplicity, we assume that BMI is not a breast cancer risk factor, except by virtue of its relationship to fat intake.

Table 4 presents international data-projected RRs for total fat. In comparison with the data in Table 3, one sees that allowing for a systematic component to measurement error has led to much reduced projected RRs. For example, at $\rho_{11}$ = 0.10, which assumes a small positive correlation between 4DFR measurement errors at baseline and 1 year, and $\alpha_0$ = -0.288, which corresponds to 0%, 12.5%, and 25% average total fat-intake underreporting in the lowest, middle, and upper BMI tertiles, respectively, one has projected RRs for the 90th versus the 10th total fat-intake percentiles of only 1.09 for the 4DFR and 1.11 for the FFQ. The projected RRs in Table 4 are not highly sensitive to the average underreporting parameter ($\alpha_0$), but they decrease markedly as the 4DFR measurement error correlation in the lowest BMI tertile is allowed to increase. In fact, the RR trends (not shown) are completely eliminated at values of 0.2 or greater for this correlation.

Table 5 shows corresponding projected RRs across percent energy from fat percentiles. For example, at $\rho_{11}$ = 0.10 and $\alpha_0$ = -0.288, the projected RRs range from 1.00 up to 1.11 for both the 4DFR and FFQ in comparison with ranges from 1.00 to 1.28 or 1.27 when only the noise aspect of measurement error in percent calories from fat is acknowledged. Again, these projections are rather sensitive to the $\rho_{11}$ specification but not to the average underreporting parameter ($\alpha_0$).

As mentioned previously, Hunter et al. (7) argued that the absence of reduction in breast cancer risk among women reporting very low levels of percent energy from fat provided evidence of a lack of importance of dietary fat as a breast cancer risk factor. Table 6 presents international data-projected breast cancer RRs under the measurement model [5] at 20% and 15% energy from fat. The sensitivities seen in Tables 4 and 5 reverse for these extreme projections, with substantial sensitivity to the average percent energy from fat-intake underreporting across BMI tertiles. For example, at $\rho_{11}$ = 0.10 and $\alpha_0$ = -0.288, one projects RRs of 1.08 and 1.06 at 20% and 15% energy from fat, respectively, as measured by the 4DFR and RRs of 1.07 and 1.18 at 20% and 15% energy from fat, respectively, as measured by the FFQ. Similar projected RR elevations were also obtained at very low levels of log-total fat (not shown). It may seem paradoxical that the strong positive relationship between fat intake and breast cancer risk estimated from the international correlational analyses could yield projected excess risk at very low reported fat intakes when taking account of measurement errors. These increases in projected RR occur because of the greater measurement error variance in the highest BMI tertile in conjunction with very weak associations between true and measured fat intake in this tertile. Specifically, the estimated probabilities that a person reporting a 20%-calories-from-fat diet is in the lowest, middle, and highest BMI tertile are 3.7%, 12.8%, and 83.6%, respectively, for the 4DFR and 16.2%, 14.4%, and 69.3%, respectively, for the FFQ. Also, the breast cancer probabilities increase as a function of $\alpha_0$ in higher BMI categories, explaining the sensitivity of these projections to the $\alpha_0$ specification.
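The tertile probabilities quoted for the 4DFR can be approximated with Bayes' rule. The sketch below assumes that the log of reported percent energy from fat is normally distributed within each BMI tertile, takes the baseline 4DFR means and SDs from Table 1, and weights the tertiles equally at 1/3 each (an assumption made here; the article uses the observed fractions of study subjects). The names `TERTILES_4DFR` and `tertile_posterior` are illustrative.

```python
import math

# Baseline 4DFR log-percent energy from fat: (mean, SD) by BMI tertile (Table 1).
TERTILES_4DFR = [(3.628, 0.146), (3.679, 0.171), (3.645, 0.188)]
PRIOR = [1 / 3, 1 / 3, 1 / 3]  # assumption: equal-sized tertiles


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def tertile_posterior(x, tertiles, prior):
    """P(tertile v | reported value x) via Bayes' rule."""
    weights = [p * normal_pdf(x, m, s) for (m, s), p in zip(tertiles, prior)]
    total = sum(weights)
    return [w / total for w in weights]


# A reported diet with 20% of energy from fat, on the log scale.
posterior = tertile_posterior(math.log(20.0), TERTILES_4DFR, PRIOR)
print([round(100 * p, 1) for p in posterior])
```

With these rounded inputs the three probabilities come out within about one percentage point of the 3.7%, 12.8%, and 83.6% quoted above, which illustrates why a very low reported fat intake points so strongly to the highest BMI tertile.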

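Table 4. International data-projected relative risks for breast cancer across percentiles of log-total fat intake under measurement model [5]* that acknowledges both systematic and noise aspects of measurement error in dietary assessment†

| Correlation ($\rho_{11}$) between baseline and 1-y 4DFR measurement errors in lowest BMI tertile | Mean underreporting ($\alpha_0$) in highest BMI tertile‡ | 4DFR 10th§ | 4DFR 30th | 4DFR 50th | 4DFR 70th | 4DFR 90th | FFQ 10th§ | FFQ 30th | FFQ 50th | FFQ 70th | FFQ 90th |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1.00 | 1.05 | 1.09 | 1.13 | 1.19 | 1.00 | 1.05 | 1.09 | 1.14 | 1.19 |
| 0 | -0.288 | 1.00 | 1.04 | 1.07 | 1.11 | 1.19 | 1.00 | 1.02 | 1.06 | 1.11 | 1.19 |
| 0 | -0.693 | 1.00 | 1.02 | 1.04 | 1.09 | 1.18 | 1.00 | 0.99 | 1.01 | 1.06 | 1.18 |
| 0.10 | 0 | 1.00 | 1.03 | 1.05 | 1.07 | 1.10 | 1.00 | 1.03 | 1.06 | 1.08 | 1.12 |
| 0.10 | -0.288 | 1.00 | 1.01 | 1.03 | 1.05 | 1.09 | 1.00 | 1.00 | 1.02 | 1.05 | 1.11 |
| 0.10 | -0.693 | 1.00 | 0.99 | 1.00 | 1.03 | 1.09 | 1.00 | 0.96 | 0.98 | 1.02 | 1.11 |
| 0.15 | 0 | 1.00 | 1.01 | 1.02 | 1.03 | 1.04 | 1.00 | 1.02 | 1.04 | 1.05 | 1.07 |
| 0.15 | -0.288 | 1.00 | 1.00 | 1.00 | 1.02 | 1.04 | 1.00 | 0.99 | 1.00 | 1.03 | 1.07 |
| 0.15 | -0.693 | 1.00 | 0.98 | 0.98 | 1.00 | 1.04 | 1.00 | 0.96 | 0.96 | 0.99 | 1.07 |

*See text for details.
†Entries are relative risks at the stated percentile. 4DFR = 4-day food record; BMI = body mass index; FFQ = food-frequency questionnaire.
‡Mean underreporting of log-fat intake is assumed to be of the form 0, $\alpha_0/2$, and $\alpha_0$ in lowest, middle, and highest BMI tertiles, respectively.
§Reference category.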


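Table 5. International data-projected relative risks for breast cancer across percentiles of log-percent calories from fat under measurement model [5]* that acknowledges both systematic and noise aspects of measurement error in dietary assessment†

| Correlation ($\rho_{11}$) between baseline and 1-y 4DFR measurement errors in lowest BMI tertile | Mean underreporting ($\alpha_0$) in highest BMI tertile‡ | 4DFR 10th§ | 4DFR 30th | 4DFR 50th | 4DFR 70th | 4DFR 90th | FFQ 10th§ | FFQ 30th | FFQ 50th | FFQ 70th | FFQ 90th |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1.00 | 1.04 | 1.06 | 1.09 | 1.13 | 1.00 | 1.04 | 1.07 | 1.10 | 1.14 |
| 0 | -0.288 | 1.00 | 1.02 | 1.04 | 1.08 | 1.15 | 1.00 | 1.03 | 1.06 | 1.09 | 1.15 |
| 0 | -0.693 | 1.00 | 0.99 | 1.02 | 1.06 | 1.17 | 1.00 | 1.01 | 1.04 | 1.08 | 1.15 |
| 0.10 | 0 | 1.00 | 1.02 | 1.04 | 1.06 | 1.09 | 1.00 | 1.03 | 1.05 | 1.07 | 1.10 |
| 0.10 | -0.288 | 1.00 | 1.00 | 1.02 | 1.05 | 1.11 | 1.00 | 1.02 | 1.04 | 1.06 | 1.11 |
| 0.10 | -0.693 | 1.00 | 0.98 | 1.00 | 1.03 | 1.12 | 1.00 | 1.00 | 1.02 | 1.05 | 1.11 |
| 0.15 | 0 | 1.00 | 1.02 | 1.03 | 1.04 | 1.06 | 1.00 | 1.02 | 1.04 | 1.05 | 1.07 |
| 0.15 | -0.288 | 1.00 | 1.00 | 1.01 | 1.03 | 1.08 | 1.00 | 1.01 | 1.03 | 1.05 | 1.08 |
| 0.15 | -0.693 | 1.00 | 0.97 | 0.98 | 1.02 | 1.10 | 1.00 | 0.99 | 1.01 | 1.03 | 1.09 |

*See text for details.
†Entries are relative risks at the stated percentile. 4DFR = 4-day food record; BMI = body mass index; FFQ = food-frequency questionnaire.
‡Mean underreporting of log-fat intake is assumed to be of the form 0, $\alpha_0/2$, and $\alpha_0$ in lowest, middle, and highest BMI tertiles, respectively.
§Reference category.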

Discussion

The measurement model [5] has been proposed for 4DFR and FFQ measures of total fat and percent energy from fat. This model is more flexible than the models previously proposed in that it incorporates an underreporting variable, common to the 4DFR and FFQ measures. Total energy-intake underreporting is well documented, and the average underreporting of energy intake appears to be rather similar among dietary self-report instruments and to be greater among persons having greater body mass. Undoubtedly, the extent of underreporting varies among persons at a given body mass, leading to a random underreporting variable as in [5]. It seems likely that selected high-fat foods are particularly underreported, especially since energy from protein, as estimated from urinary nitrogen output, is evidently underreported to a lesser extent on average than is total energy (15).

Application of this measurement model to baseline and 1-year 4DFR and FFQ data from the Women's Health Trial indicates that both of these dietary instruments are sufficiently limited that even the strong association between dietary fat and postmenopausal breast cancer incidence suggested by international correlational analyses is reduced to a very weak or nonexistent RR association across total fat or percent energy from fat quintiles. Since the fat-intake distributions estimated by 4DFR and FFQ measurements in the Women's Health Trial are similar to those in the cohort studies pooled by Hunter et al. (7), these projections cast doubt on the claim by Hunter et al. that lowering fat intake in midlife is unlikely to reduce the risk for breast cancer substantially. The RR projections shown in Table 5 seem generally compatible with RRs of 1.00, 1.01, 1.12, 1.07, and 1.05 across quintiles of calorie-adjusted fat intake, as reported by Hunter et al. (7), based on FFQs. Hence, under the measurement model [5], one could equally regard the results of Hunter et al. as supportive of the strong international correlational association.

Table 6. International data-projected relative risks for breast cancer at very low levels of percent calories from fat under measurement model [5]* that acknowledges both systematic and noise aspects of measurement error in dietary assessment†

| Correlation ($\rho_{11}$) between baseline and 1-y 4DFR measurement errors | Mean underreporting ($\alpha_0$) in highest BMI tertile‡ | 4DFR, 20% energy from fat§ | 4DFR, 15% energy from fat§ | FFQ, 20% energy from fat§ | FFQ, 15% energy from fat§ |
|---|---|---|---|---|---|
| 0 | 0 | 0.90 | 0.84 | 0.94 | 0.95 |
| 0 | -0.288 | 1.04 | 1.00 | 1.04 | 1.12 |
| 0 | -0.693 | 1.24 | 1.23 | 1.17 | 1.35 |
| 0.10 | 0 | 0.94 | 0.89 | 0.98 | 1.01 |
| 0.10 | -0.288 | 1.08 | 1.06 | 1.07 | 1.18 |
| 0.10 | -0.693 | 1.29 | 1.30 | 1.21 | 1.43 |
| 0.15 | 0 | 0.96 | 0.92 | 0.99 | 1.03 |
| 0.15 | -0.288 | 1.11 | 1.10 | 1.09 | 1.22 |
| 0.15 | -0.693 | 1.31 | 1.35 | 1.23 | 1.47 |

*See text for details.
†4DFR = 4-day food record; BMI = body mass index; FFQ = food-frequency questionnaire.
‡Mean underreporting of log-percent energy from fat is assumed to be of the form 0, $\alpha_0/2$, and $\alpha_0$ in lowest, middle, and highest BMI tertiles, respectively.
§Relative risks compared with 10th percentile of baseline percent energy from fat distribution (i.e., 30.9% for 4DFR and 29.8% for FFQ).

It is more difficult to judge the compatibility of the projected RRs with values of 1.00, 1.20, 1.24, 1.24, and 1.46 for postmenopausal breast cancer across the total fat-intake quintiles in a Canadian case-control study, as reported by Howe et al. (6) on the basis of the data from 12 case-control studies. These studies used a variety of dietary instruments, each having its own measurement properties, and included populations having low-fat eating patterns, as well as Western populations. Hence, the lowest fat-intake quintile presumably includes a substantial fraction of the study sample in the low-fat-consuming populations, perhaps explaining the corresponding lower breast cancer incidence. An RR of 1.46/1.20 = 1.22 between the fifth and the second quintiles appears to be more compatible with the projections of Table 4.

Available data do not allow one to apply the measurement model [5] without some additional assumptions. In the application to the data of the Women's Health Trial, pertinent data included only baseline 4DFR and FFQ fat-intake estimates and BMIs, along with 4DFR and FFQ assessments at 1 year from dietary control group women. For parameter identification, the signal-to-noise ratio for 4DFR log-total fat and 4DFR log-percent energy from fat assessments was assumed to be constant across BMI tertiles. This specification implies that, aside from systematic errors, women have a similar degree of variability in their dietary patterns across dietary-recording periods, regardless of BMI. This assumption is plausible, but it requires actual validation data for verification. Under this measurement model, RRs can be projected as a function of 4DFR and FFQ assessments upon specifying $\rho_{11}$, the measurement error correlation among 4DFR assessments in the lowest BMI tertile, and the average 4DFR underreporting across BMI tertiles.

Concerning $\rho_{11}$, one may expect some positive measurement error correlation between baseline and 1-year 4DFRs, owing to possible lack of social desirability of a very low body mass. For example, correlations between baseline and 1-year 4DFR total fat estimates are 0.19, 0.06, and 0.02 in the lowest three deciles of baseline BMI, suggesting a positive measurement error correlation in the lowest decile. The results of Heitmann and Lessner (15) also suggested an average overreporting of total energy among women in the lowest body mass decile. In addition, one can note that the correlation between baseline and 2-year control group 4DFR total fat estimates is only 0.080, compared with 0.188 for the corresponding baseline to 1-year correlation, suggesting an estimate of at least 0.1 for the total fat measurement error correlation.

The RR projections in Tables 4 and 5 are not very sensitive to the average underreporting specifications 0, $\alpha_0/2$, and $\alpha_0$ across BMI tertiles. Nevertheless, one could use the results of the Danish study (15) to suggest an approximate 35% average underreporting of total energy in the highest BMI tertile, compared with an approximate 20% average underreporting of protein energy. If, for example, carbohydrate and alcohol calories were also underreported by 20% on average in the highest BMI tertile, one would obtain under the 4DFR baseline macronutrient distribution reported in the Women's Health Trial (19) a corresponding underreporting of total fat by 49% on average and of percent energy from fat by 22% on average. These exercises may cause one to pay particular attention to RR projections at $\rho_{11}$ = 0.10 and at $\alpha_0$ = log(0.75) = -0.288 in interpreting Tables 4-6. These specifications lead to international data-projected RRs of 1.00, 1.00, 1.02, 1.05, and 1.11 across FFQ total fat percentiles (Table 4) and 1.00, 1.02, 1.04, 1.06, and 1.11 across FFQ percent energy from fat percentiles (Table 5). Clearly, it would be inadvisable to undertake a cohort or case-control study in an attempt to identify RRs of this magnitude. Corresponding projected RRs at 20%, 15%, and 10% of energy from fat as compared with 29.8% of energy (the 10th percentile of the baseline FFQ distribution) are 1.07, 1.18, and 1.23, based on the FFQ, so that dietary assessment measurement error can provide a possible explanation for the observation (7) of somewhat elevated breast cancer risk if extremely low percent energy from fat is reported by Western women, even if there is a highly positive association between actual fat consumption and disease risk.
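The 49% and 22% figures follow from simple energy bookkeeping. The sketch below assumes a reported diet with 40% of energy from fat; that share is a hypothetical stand-in, since the macronutrient distribution from reference (19) is not reproduced here. All non-fat energy is taken to be underreported by 20% and total energy by 35%, as in the text. The function name is illustrative.

```python
def implied_fat_underreporting(fat_share_reported, total_under, nonfat_under):
    """Return (fat underreporting, percent-energy-from-fat underreporting).

    Reported total energy is normalized to 1. True energy is reported
    energy divided by (1 - underreporting fraction).
    """
    reported_fat = fat_share_reported
    reported_nonfat = 1.0 - fat_share_reported

    true_total = 1.0 / (1.0 - total_under)
    true_nonfat = reported_nonfat / (1.0 - nonfat_under)
    true_fat = true_total - true_nonfat          # fat makes up the remainder

    fat_under = 1.0 - reported_fat / true_fat
    true_fat_share = true_fat / true_total
    share_under = 1.0 - fat_share_reported / true_fat_share
    return fat_under, share_under


# Assumption: 40% of reported energy comes from fat (hypothetical value).
fat_under, share_under = implied_fat_underreporting(0.40, 0.35, 0.20)
print(round(100 * fat_under), round(100 * share_under))
```

With the assumed 40% share the two fractions come out near 49% and 22%; a somewhat different reported share moves both figures by a point or so.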

The measurement model proposed here leads to certain predictions that could be examined in relation to the data from cohort or case-control studies. For example, RR trends across measured fat-intake categories are predicted to be stronger among women having lower BMI values. Also, this model predicts that women reporting very low levels of fat intake will tend to have high BMI values. Alternatively, such women may have variable BMIs if they undertake "yo-yo" patterns of dieting in an attempt to lose weight.

It is also interesting to examine the implications of the RR model leading to Tables 4-6 for BMI as a risk factor. The breast cancer probability as a function of BMI category v alone is proportional to $\exp(\mu_v\beta + \tfrac{1}{2}\sigma_v^2\beta^2)$. For example, at $\rho_{11}$ = 0.10 and $\alpha_0$ = -0.288, projected RRs for BMI tertiles, taking the lowest tertile as reference, are 1.00, 1.23, and 1.47 based on the fitted model for total fat and 1.00, 1.31, and 1.52 based on the model for percent energy from fat (the slight difference being due to the differing RR models for the two fat-intake measures), rather similar to previously reported RRs for postmenopausal breast cancer (25). Corresponding projected RRs at $\alpha_0$ = 0 are nearly flat; at $\alpha_0$ = -0.693, they rise from 1 to more than 2 across BMI tertiles and, hence, are less consistent with the literature on BMI in relation to postmenopausal breast cancer.
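These BMI tertile RRs can be checked from the published tables. The sketch below assumes that the mean true intake in tertile v is the baseline 4DFR mean minus the mean underreporting (following the Table 1 footnote, mean = $\mu + \alpha$), with underreporting 0, $\alpha_0/2$, and $\alpha_0$ across tertiles, and it uses the rounded true-intake variances from Table 2 at a measurement error correlation of 0.1. The function name `bmi_tertile_rrs` is illustrative.

```python
import math

ALPHA0 = math.log(0.75)              # 25% mean underreporting, highest tertile
ALPHAS = [0.0, ALPHA0 / 2, ALPHA0]   # lowest, middle, highest BMI tertile


def bmi_tertile_rrs(xbars, sigma2s, beta, alphas=ALPHAS):
    """RRs by tertile from exp(mu_v * beta + 0.5 * sigma_v^2 * beta^2)."""
    log_risk = [
        (xbar - alpha) * beta + 0.5 * sigma2 * beta ** 2
        for xbar, sigma2, alpha in zip(xbars, sigma2s, alphas)
    ]
    return [math.exp(value - log_risk[0]) for value in log_risk]


# Log-total fat: 4DFR means (Table 1) and variances x 10^-2 (Table 2).
rr_total_fat = bmi_tertile_rrs([6.462, 6.474, 6.463],
                               [0.0084, 0.0089, 0.0082], beta=1.341)
# Log-percent energy from fat.
rr_pct_energy = bmi_tertile_rrs([3.628, 3.679, 3.645],
                                [0.0034, 0.0034, 0.0041], beta=1.380)

print([round(r, 2) for r in rr_total_fat])
print([round(r, 2) for r in rr_pct_energy])
```

Under these assumptions the results agree to within rounding with the 1.00, 1.23, 1.47 and 1.00, 1.31, 1.52 quoted above; nearly all of the gradient comes from the assumed underreporting, since the tertile means of reported intake are almost equal.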

The RR projections leading to Tables 4-6 were repeated with the use of the control group FFQs at 6 months rather than 1 year from randomization, in which case the FFQ of Willett et al. (21) is used at both time points. Projected RRs were substantially similar to but slightly more disparate from unity than those shown in Tables 4-6. Similarly, the use of control group dietary data at 2 years rather than at 1 year leads to RR projections that agree substantially with those shown in Tables 4-6.

Corresponding RRs were also projected for colon cancer among women aged 55-69 years, again beginning with a model $\exp\{\beta(Z - Z_0)\}$ for RR as a function of log-fat intake. These projections were again similar to but more disparate from unity than those given in Tables 4-6, owing to colon cancer RR parameter estimates ($\beta$ = 1.526 for log-total fat and $\beta$ = 1.867 for log-percent energy from fat) that are somewhat larger than the corresponding breast cancer estimates given above. A reviewer has pointed out that some nutritional epidemiologists view the analytic epidemiologic data on dietary fat to be more consistent for colon cancer than for breast cancer and that they have used this perspective as an argument against the fat-and-breast-cancer hypothesis. Explanations for any such difference may again include dietary fat assessment measurement error. For example, there are reasons to think (5,26) that polyunsaturated fats (e.g., omega-6 fatty acids) may have a particular role in breast cancer promotion, with saturated fats also contributing to risk, whereas saturated fats may play a central promotional role in colon cancer. Differential measurement error properties of saturated and polyunsaturated fats using food records or food frequencies could then easily lead to noticeably different RR estimates as a function of self-reported fat intake, even if breast and colon cancers have an equally important overall relationship with dietary fat. It also seems worth noting that analytic epidemiologic studies of self-reported diet have sometimes [e.g., (27)] been unable to provide consistent evidence of an association between intake of dietary fat and coronary heart disease, in spite of the widespread acceptance of such an association based on well-documented effects of dietary fat on blood cholesterol (28).

It is natural to ask whether the inclusion of biomarker substudies can enhance the reliability of analytic epidemiologic studies of fat consumption and disease risk. While there is at present no satisfactory biomarker of long-term fat consumption, there are various blood markers that respond to variations in fat intake and are presumably free of systematic bias related to study subject body mass or other social desirability (29) characteristics. For example, one could add to model [5] a biomarker measurement

$$X_{3v} = Z_{3v} + \varepsilon_{3v},$$

where $Z_{3v}$ is positively correlated with the true intake $Z_v$ and independent of the error terms in [5], while the noise term $\varepsilon_{3v}$ is independent of $Z_{3v}$ and all quantities in [5]. This setup may permit at least a valid test of trend between disease risk and true fat intake, but this topic will not be pursued here.

The measurement model fitted here suggests that 4DFRs and FFQs are very weak instruments for use in analytic epidemiologic studies of dietary fat intake and disease. Only a small fraction of the reported variation in dietary fat is attributable to true variations in fat intake, and this latter variation may be dominated by systematic and random measurement errors. Hence, other research strategies may be necessary to make progress in this important research area. One such strategy is that of full-scale dietary intervention trials with disease outcomes, as is included in the Women's Health Initiative (30), although cost and logistics imply that very few intervention trials of this type will be practical. Another promising strategy would improve upon international correlational analyses by relating age- and sex-specific disease rates from good-quality cancer registries worldwide to corresponding dietary recall and cancer risk factor data obtained by surveying moderate numbers of study subjects in the catchment areas of such registries (31,32). By virtue of aggregating dietary and confounding factor data among study subjects in a given registry area, one can essentially eliminate the noise aspect of measurement error, while the potential wide variation among dietary habits of populations covered by existing cancer registries further reduces the systematic error-to-signal ratio in dietary self-report. A study of this type is currently in the planning stages.

Appendix 1: Measurement Model Parameter Estimation

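As before, let $s_{jv}^2$ denote the sample variance based on $X_{j1v}$ values and let $r_{jv}$ denote the sample correlation between $X_{j1v}$ and $X_{j2v}$, j = 1, 2, in BMI tertile v. It is straightforward to show that $\delta_{jv}^2$ can be estimated by $(1 - r_{jv})s_{jv}^2$, while $\sigma_{jv}^2 + \tau_v^2$ can be estimated by $r_{jv}s_{jv}^2$, j = 1, 2, where $\sigma_{1v}^2 = \sigma_v^2$.

Setting $\delta_{1v}^2/\sigma_v^2 = d$ and noting that $\rho_{1v} = \tau_v^2/(\tau_v^2 + \delta_{1v}^2)$ allow one to estimate d by

$$\hat{d} = (1 - r_{11})(1 - \rho_{11})(r_{11} - \rho_{11})^{-1},$$

from which $\sigma_v^2 = \delta_{1v}^2/d$ can be estimated by $\hat{\sigma}_v^2 = s_{1v}^2(1 - r_{1v})/\hat{d}$ and $\tau_v^2$ by $\hat{\tau}_v^2 = s_{1v}^2\{r_{1v} - (1 - r_{1v})/\hat{d}\}$. It follows that $\sigma_{2v}^2$ can be estimated by $\hat{\sigma}_{2v}^2 = r_{2v}s_{2v}^2 - \hat{\tau}_v^2$ and $\gamma$ by $\hat{\gamma} = (c_v - \hat{\tau}_v^2)(\hat{\sigma}_v\hat{\sigma}_{2v})^{-1}$, where $c_v$ is the sample covariance of $X_{11v}$ and $X_{21v}$, completing the estimation of the variance matrix [6].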
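The estimation steps in this appendix can be sketched in a few lines. The code below applies them to the log-percent energy from fat inputs of Table 1 with the 4DFR measurement error correlation in the lowest tertile set to 0.1. The function and key names are illustrative; the expression used for the FFQ measurement error correlation, the underreporting variance over the sum of that variance and the FFQ noise variance, is assumed by analogy with the 4DFR expression and is not stated in the appendix.

```python
def estimate_d(r11, rho11):
    """Noise-to-signal ratio d, identified from the lowest BMI tertile."""
    return (1 - r11) * (1 - rho11) / (r11 - rho11)


def tertile_estimates(s1, r1, s2, r2, r12, d):
    """Variance-component estimates for one BMI tertile.

    s1, s2: baseline 4DFR and FFQ SDs; r1, r2: baseline to 1-year
    correlations; r12: baseline 4DFR/FFQ correlation.
    """
    delta1_sq = (1 - r1) * s1 ** 2             # 4DFR noise variance
    delta2_sq = (1 - r2) * s2 ** 2             # FFQ noise variance
    sigma_sq = delta1_sq / d                   # "true" fat-intake variance
    tau_sq = s1 ** 2 * (r1 - (1 - r1) / d)     # underreporting variance
    cov_12 = r12 * s1 * s2                     # sample covariance c_v
    return {
        "sigma_sq": sigma_sq,
        "tau_sq": tau_sq,
        "rho_1v": tau_sq / (tau_sq + delta1_sq),
        "rho_2v": tau_sq / (tau_sq + delta2_sq),   # assumed by analogy
        "frac_4dfr": sigma_sq / s1 ** 2,
        "frac_ffq": sigma_sq / s2 ** 2,
        "frac_cov_error": tau_sq / cov_12,
    }


# Log-percent energy from fat; d uses tertile 1 (r_11 = 0.242), rho_11 = 0.1.
d_hat = estimate_d(r11=0.242, rho11=0.1)
middle = tertile_estimates(s1=0.171, r1=0.444, s2=0.174, r2=0.552,
                           r12=0.516, d=d_hat)
for name, value in middle.items():
    print(name, round(value, 4))
```

For the middle tertile these inputs give values that round to, or sit within about 0.01 of, the corresponding Table 2 entries (0.34 x 10⁻² for the true variance, 0.37, 0.42, and 0.63 for the two error correlations and the covariance fraction).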

Appendix 2: Relative Risk Projection Methods

For simplicity, assume that BMI is not a risk factor beyond its relationship to fat intake, so that the disease probability as a function of true fat intake Z and BMI category v can be taken to be proportional to

$$p(D \mid Z, v) = \exp\{Z\beta\}.$$

Using the moment-generating function for the normally distributed Z, given $X_1$ and v, allows one to estimate the disease probability as a function of $X_1$ and v as proportional to

$$p_1(D \mid X_1, v) = \exp\left[(\bar{x}_{1v} - \alpha_{1v})\beta + \tfrac{1}{2}\sigma_v^2(1 - \sigma_v^2/s_{1v}^2)\beta^2 + (\sigma_v^2/s_{1v}^2)(X_1 - \bar{x}_{1v})\beta\right],$$

where $\bar{x}_{1v}$ is the sample mean of the $X_{11v}$ values. The estimated disease probability given $X_1$ alone is then proportional to

$$p_1(D \mid X_1) = \sum_{v=1}^{3} p_1(D \mid X_1, v)\,p_1(X_1 \mid v)\,p(v) \Big/ \sum_{v=1}^{3} p_1(X_1 \mid v)\,p(v),$$

where $p_1(X_1 \mid v) = (2\pi s_{1v}^2)^{-1/2}\exp\{-\tfrac{1}{2}(X_1 - \bar{x}_{1v})^2/s_{1v}^2\}$ and p(v) is the fraction of study subjects in BMI category v. Hence, the RR at $X_1$ compared with a standard value $X_{10}$ is

$$RR_1(X_1) = p_1(D \mid X_1)/p_1(D \mid X_{10}).$$

Projected RRs as a function of the FFQ measure $X_2$ are given by the corresponding expressions with

$$p_2(D \mid X_2, v) = \exp[(\bar{x}_{1v} - \alpha_{1v})\beta + \ldots]$$

and

$$p_2(X_2 \mid v) = (2\pi s_{2v}^2)^{-1/2}\exp\{-\tfrac{1}{2}(X_2 - \bar{x}_{2v})^2/s_{2v}^2\}.$$

Note that these projections do not involve the means $\mu_{2v}$ of the $Z_{2v}$ variables.
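A minimal numerical sketch of the 4DFR projection follows. It assumes the standard conditional-normal result: within tertile v, the true intake given the 4DFR value has mean equal to the tertile's true mean plus a slope times the deviation of the reported value from its tertile mean, where the slope is the true variance over the 4DFR variance, and variance equal to the true variance times one minus that slope. It also assumes equal tertile fractions of 1/3. Inputs are the log-percent energy from fat values from Tables 1 and 2 at a measurement error correlation of 0.1 and 25% mean underreporting in the highest tertile. All function and variable names are illustrative.

```python
import math

BETA = 1.380              # coefficient for log-percent energy from fat
ALPHA0 = math.log(0.75)   # 25% mean underreporting in the highest tertile

# Per BMI tertile: baseline 4DFR mean and SD (Table 1), true-intake variance
# at rho_11 = 0.1 (Table 2), and assumed mean underreporting.
TERTILES = [
    {"xbar": 3.628, "s": 0.146, "sigma2": 0.0034, "alpha": 0.0},
    {"xbar": 3.679, "s": 0.171, "sigma2": 0.0034, "alpha": ALPHA0 / 2},
    {"xbar": 3.645, "s": 0.188, "sigma2": 0.0041, "alpha": ALPHA0},
]
P_V = [1 / 3, 1 / 3, 1 / 3]  # assumption: equal tertile fractions


def risk_given_x_and_tertile(x, t, beta=BETA):
    """Normal moment-generating function of Z given X1 = x in tertile t."""
    slope = t["sigma2"] / t["s"] ** 2
    cond_mean = (t["xbar"] - t["alpha"]) + slope * (x - t["xbar"])
    cond_var = t["sigma2"] * (1 - slope)
    return math.exp(cond_mean * beta + 0.5 * cond_var * beta ** 2)


def density(x, t):
    """Normal density of the reported value within tertile t."""
    z = (x - t["xbar"]) / t["s"]
    return math.exp(-0.5 * z * z) / (t["s"] * math.sqrt(2 * math.pi))


def risk_given_x(x):
    """Mixture over tertiles, weighted by p(X1 | v) p(v)."""
    weights = [p * density(x, t) for t, p in zip(TERTILES, P_V)]
    numerator = sum(w * risk_given_x_and_tertile(x, t)
                    for w, t in zip(weights, TERTILES))
    return numerator / sum(weights)


def projected_rr(pct_energy, reference_pct=30.9):
    """RR at a reported percent energy from fat versus the 10th percentile."""
    return risk_given_x(math.log(pct_energy)) / risk_given_x(math.log(reference_pct))


rr_90th = projected_rr(47.8)    # 90th versus 10th 4DFR percentile
rr_20pct = projected_rr(20.0)   # reported 20% energy from fat
print(round(rr_90th, 2), round(rr_20pct, 2))
```

These inputs give roughly 1.11 and 1.08, in line with the corresponding 4DFR entries of Tables 5 and 6. The FFQ projection is not sketched because its conditional expression is incomplete in this copy of the appendix.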


References

(1) Tannenbaum A. Genesis and growth of tumors. III. Effects of a high fat diet. Cancer Res 1942;2:468-75.

(2) Newberne PM, Shrager TE, Conner MW. Experimental evidence on the nutritional prevention of cancer. In: Moon TE, Micozzi MS, editors. Nutrition and cancer prevention: investigating the role of micronutrients. New York: Marcel Dekker, 1989:33-82.

(3) Freedman L, Clifford C, Messina M. Analysis of dietary fat, calories, body weight, and the development of mammary tumors in rats and mice: a review. Cancer Res 1990;50:5710-9.

(4) Carroll KK, Gammal EB, Plunkett ER. Dietary fat and mammary cancer. Can Med Assoc J 1968;98:590-4.

(5) Prentice RL, Sheppard L. Dietary fat and cancer: consistency of the epidemiologic data, and disease prevention that may follow from a practical reduction in fat consumption [published erratum appears in Cancer Causes Control 1990;1:253]. Cancer Causes Control 1990;1:81-97.

(6) Howe GR, Hirohata T, Hislop TG, Iscovich JM, Yuan JM, Katsouyanni K, et al. Dietary factors and risk of breast cancer: combined analysis of 12 case-control studies [see comment citation in Medline]. J Natl Cancer Inst 1990;82:561-9.

(7) Hunter DJ, Spiegelman D, Adami HO, Beeson L, van den Brandt PA, Folsom AR, et al. Cohort studies of fat intake and the risk of breast cancer--a pooled analysis [see comment citations in Medline]. N Engl J Med 1996;334:356-61.

(8) Rosner B, Willett WC, Spiegelman D. Correction of logistic regression relative risk estimates and confidence intervals for systematic within-person measurement error. Stat Med 1989;8:1051-69.

(9) Rosner B, Spiegelman D, Willett WC. Correction of logistic regression relative risk estimates and confidence intervals for measurement error: the case of multiple covariates measured with error. Am J Epidemiol 1990;132:734-45.

(10) Wacholder S, Armstrong B, Hartge P. Validation studies using an alloyed gold standard [see comment citations in Medline]. Am J Epidemiol 1993;137:1251-8.

(11) Carroll RJ, Ruppert D, Stefanski LA. Measurement error in nonlinear models. New York: Chapman and Hall, 1995.

(12) Plummer M, Clayton D. Measurement error in dietary assessment: an investigation using covariance structured models. Part II. Stat Med 1993;12:937-48.

(13) Lichtman SW, Pisarska K, Berman ER, Pestone M, Dowling H, Offenbacher E, et al. Discrepancy between self-reported and actual caloric intake and exercise in obese subjects [see comment citations in Medline]. N Engl J Med 1992;327:1893-8.

(14) Bandini LG, Schoeller DA, Cyr HN, Dietz WH. Validity of reported energy intake in obese and nonobese adolescents [see comment citation in Medline]. Am J Clin Nutr 1990;52:421-5.

(15) Heitmann BL, Lessner L. Dietary underreporting by obese individuals--is it specific or non-specific? BMJ 1995;311:986-9.

(16) Martin LJ, Su W, Jones PJ, Lockwood GA, Tritchler DL, Boyd NF. Comparison of energy intakes determined by food records and doubly labeled water in women participating in a dietary-intervention trial. Am J Clin Nutr 1996;63:483-90.

(17) Sawaya AL, Tucker K, Tsay R, Willett W, Saltzman E, Dallal GE, et al. Evaluation of four methods for determining energy intake in young and older women: comparison with doubly labeled water measurements of total energy expenditure. Am J Clin Nutr 1996;63:491-9.

(18) Insull W Jr, Henderson MM, Prentice RL, Thompson DJ, Clifford C, Goldman S, et al. Results of a randomized feasibility study of a low-fat diet. Arch Intern Med 1990;150:421-7.

(19) Henderson MM, Kushi LH, Thompson DJ, Gorbach SL, Clifford LK, Insull W Jr, et al. Feasibility of a randomized trial of a low-fat diet for the prevention of breast cancer: dietary compliance in the Women's Health Trial Vanguard Study. Prev Med 1990;19:115-33.

(20) Composition of foods. Agricultural handbook 8-1 to 8-16. Hyattsville (MD): Human Nutrition Information Service, US Department of Agriculture, 1976-87.

(21) Willett WC, Sampson L, Stampfer MJ, Rosner B, Bain C, Witschi J, et al. Reproducibility and validity of a semiquantitative food frequency questionnaire. Am J Epidemiol 1985;122:51-65.

(22) Block G, Hartman AM, Dresser CM, Carroll MD, Gannon J, Gardner L. A data-based approach to diet questionnaire design and testing. Am J Epidemiol 1986;124:453-69.

(23) Muir C, Waterhouse J, Mack T, Powell J, Whelan S, editors. Cancer incidence in five continents. Vol 5. Lyon, France: International Agency for Research on Cancer, 1987.

(24) Food and Agriculture Organization of the United Nations. Food balance sheets 1975-77 average. Rome: FAO, 1980.

(25) Paffenbarger RS Jr, Kampert JB, Chang H. Characteristics that predict risk of breast cancer before and after the menopause. Am J Epidemiol 1980;112:258-68.

(26) Carroll KK, Khor HT. Effect of level and the type of dietary fat on incidence of mammary tumors induced in female Sprague-Dawley rats by 7,12-dimethylbenz(a)anthracene. Lipids 1971;6:415-20.

(27) Kuller LK. The etiology of breast cancer--from epidemiology to prevention. Publ Health Rev 1995;23:157-213.

(28) Hegsted DM, McGrandy RB, Meyers ML, Stare FJ. Quantitative effects of dietary fat on serum cholesterol in man. Am J Clin Nutr 1965;17:281-95.

(29) Hebert JR, Clemow L, Pbert L, Ockene IS, Ockene JK. Social desirability bias in dietary self-report may compromise the validity of dietary intake measures. Int J Epidemiol 1995;24:389-98.

(30) Rossouw JE, Finnegan LP, Harlan WR, Pinn VW, Clifford C, McGowan JA. The evolution of the Women's Health Initiative: perspectives from the NIH. J Am Med Women's Assoc 1995;50:50-5.

(31) Prentice RL, Sheppard L. Aggregate data studies of disease risk factors. Biometrika 1995;82:113-25.

(32) Sheppard L, Prentice RL. On the reliability and precision of within- and between-population estimates of relative risk parameters. Biometrics 1995;51:853-63.

Notes

Supported by Public Health Service grant CA53996 from the National Cancer Institute, National Institutes of Health, Department of Health and Human Services.

I thank Mark W. Mason for substantial computational support and Molly Edmonds and Joy Poutr£ for technical support in manuscript preparation. Valuable comments by the referees and by colleagues Larry Freedman, Victor Kipnis, C. Y. Wang, John Potter, Alan Kristal, Mary Anne Rossing, Polly Newcomb, and Maureen Henderson are also gratefully acknowledged.

Manuscript received April 1, 1996; revised August 14, 1996; accepted August 26, 1996.
