Measuring Dietary Intake
Raymond J. Carroll
Department of Statistics
Faculty of Nutrition and Faculty of Toxicology
Texas A&M University
http://stat.tamu.edu/~carroll
_________________________________________________________
I Still Cook
Me in the kitchen, Yokohama (my birthplace), 1953
_________________________________________________________
Advertisement
College Station, home of Texas A&M University
I-35
I-45
Big Bend National Park
Wichita Falls, my hometown
West Texas
Palo Duro Canyon, the Grand Canyon of Texas
Guadalupe Mountains National Park
East Texas
What I am Not
I know that potato chips are not a basic healthy food group. However, if you ask me a detailed question about nutrition, then I will ask
Joanne Lupton, Nancy Turner, Meeyoung Hong
_________________________________________________________
You are what you eat, but do you know who you are?
• This talk is concerned with a simple question.
• Will lowering her intake of fat decrease a woman’s chance of developing breast cancer?
_________________________________________________________
Basic Outline
• Diet affects health. Many (not all!) studies though are not statistically significant.
• Focus: quality of the instruments used to measure diet
• Conclusion #1: The instruments are largely to blame.
• Conclusion #2: Expect studies to disagree
_________________________________________________________
Evidence in Favor of the Fat-Breast Cancer Hypothesis
• Animal studies
• Ecological comparisons
• Case-control studies
_________________________________________________________
International Comparisons
_________________________________________________________
Evidence against the Fat-Breast Cancer Hypothesis
• Prospective studies
  • These studies try to assess a woman’s diet, then follow her health progress to see if she develops breast cancer
• The diets of those who developed breast cancer are compared to those who do not
• Only (?) 1 prospective study has found firm evidence suggesting a fat and breast cancer link, and 1 has a negative link
_________________________________________________________
Prospective Studies
• NHANES (National Health and Nutrition Examination Survey): n = 3,145 women aged 25-50
• Nurses Health Study: n = 100,000+
• Pooled Project: n = 300,000+
• Norfolk (UK) study: n = 15,000+
_________________________________________________________
The Nurses Health Study, Fat and Breast Cancer
_________________________________________________________
60,000 women, followed for 10 years
Prospective study
Note that the breast cancer cases reported eating less fat
Donna Spiegelman, the NHS statistician
Clinical Trials
• The lack of consistent (even positive) findings led to the Women’s Health Initiative
• Approximately 40,000 women randomized to two groups: healthy eating and typical eating
_________________________________________________________
WHI Diet Study Objectives
_________________________________________________________
Prior Objections to WHI
• Cost ($415,000,000)
• Whether North Americans can really lower % Calories from Fat to 20%, from the current 38%
• Even if the study was successful, difficulties in measuring diet mean that we will not know what components led to the decrease in risk.
_________________________________________________________
Change in Fat Calories Over Time
_________________________________________________________
[Figure: calories from fat (vertical axis 0 to 40) at Y-0, Y-1, Y-3 and Y-6; series: Control, Intervention, Goal]
Women reported a decrease in fat-calories, but not to 20%
How do we measure diet in humans?
• 24 hour recalls
• Diaries
• Food Frequency Questionnaires (FFQ)
_________________________________________________________
Walt Willett has a popular book and a popular FFQ
Food diaries
• Hot topic at NCI
• Only measures a few days’ diet, not typical diet
• A single 3-day diary finding a diet-cancer link is not universally scientifically acceptable
• Need for repeated applications
• Induces behavioral change??
_________________________________________________________
[Figure: reported caloric intake (axis 1350 to 1800, in steps of 50) for FFQ and Diary 1 through Diary 6]
Typical (Median) Values of Reported Caloric Intake Over 6 Diary Days: WISH Study
The Food Frequency Questionnaire
• Do you remember the SAT?
_________________________________________________________
The Pizza Question
_________________________________________________________
The Norfolk Study with ~Diaries and FFQ
_________________________________________________________
15,000 women, aged 45-74, followed for 8 years
163 breast cancer cases
Diary: p = 0.005
FFQ: p = 0.229
Summary
• FFQ does not find a fat and breast cancer link
• 24 hour recalls and diaries are expensive
  • They have found links, but in opposite directions
  • Diaries also appear to modify behavior
• Question: do any of these things actually measure dietary intake?
  • How well or how badly?
• These are statistical questions!
_________________________________________________________
Do We Know Who We Are?
• Karl Pearson was arguably the 1st great modern statistician
• Pearson chi-squared test
• Pearson correlation coefficient
_________________________________________________________
Karl Pearson at age 30
Do We Know Who We Are?
• Pearson was deeply interested in self-reporting errors
• In 1896, Pearson ran the following experiment.
• For each of 3 people, he set up 500 lines on a set of sheets of paper, and had each person bisect them by hand
_________________________________________________________
A gaggle of lines
Pearson’s Experiment
• He then had a postdoc measure the error made by each person on each line, and averaged the errors
• “Dr. Lee spent several months in the summer of 1896 in the reduction of the observations”
_________________________________________________________
A gaggle of lines, with my bisections
Pearson’s Personal Equations
• Pearson computed the mean error committed by each individual: the “personal equations”
• He found: the errors were individual. His errors were to the right, Dr. Lee’s to the left
_________________________________________________________
Karl Pearson in later life
What Do Personal Equations Mean?
• Given the same set of data, when we are asked to report something, we all make errors, and our errors are personal
• In the context of reporting diet, we call this “person-specific bias”
_________________________________________________________
Laurence Freedman of NCI, with whom I did the work
Model Details for Statisticians
• The model in symbols
• The existence of person-specific bias means that the variance of true intake is less than one would have thought
_________________________________________________________
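\[
Q_{ij} = \beta_0 + \beta_1 X_i + r_i + \varepsilon_{ij};
\]
\[
X_i = \text{true intake};
\]
\[
r_i = \text{personal equation} = \mathrm{Normal}(0, \sigma_r^2);
\]
\[
\varepsilon_{ij} = \text{random error} = \mathrm{Normal}(0, \sigma_\varepsilon^2)
\]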
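Why the personal equation shrinks the apparent variance of true intake can be seen in a small simulation. This is only a sketch: the parameter values are made up, and it assumes that X, r and ε are mutually independent and that each person fills in two FFQs. The covariance between a person's two FFQs contains both the true-intake variance and the person-specific bias variance, so reading it as "true variation in diet" overstates the latter.

```python
import numpy as np

def simulate_ffq(n, beta0, beta1, sd_x, sd_r, sd_eps, seed=0):
    """Simulate Q_ij = beta0 + beta1*X_i + r_i + eps_ij for j = 1, 2.

    All parameter values are hypothetical; X, r and eps are drawn independently.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sd_x, n)             # true intake X_i (centered)
    r = rng.normal(0.0, sd_r, n)             # personal equation r_i
    eps = rng.normal(0.0, sd_eps, (n, 2))    # random error eps_ij, two replicates
    q = beta0 + beta1 * x[:, None] + r[:, None] + eps
    return x, q

x, q = simulate_ffq(n=200_000, beta0=0.0, beta1=1.0, sd_x=1.0, sd_r=1.0, sd_eps=1.0)

# Covariance between the two replicate FFQs: beta1^2 * var(X) + var(r)
between_person = np.cov(q[:, 0], q[:, 1])[0, 1]
# Actual variance of true intake: var(X)
true_variance = x.var(ddof=1)

print(f"between-person covariance of FFQs: {between_person:.2f}")
print(f"variance of true intake:           {true_variance:.2f}")
```

With these made-up values the replicate covariance is about twice the true-intake variance; repeating the FFQ averages away ε but never r.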
Model Details for Statisticians
• The OPEN Study had the following measurements
  • Two FFQ
  • Two Protein biomarkers
  • Two Energy biomarkers
_________________________________________________________
Model Details for Statisticians
• The model in symbols
• Linear mixed model, fit by PROC MIXED
_________________________________________________________
\[
Q_{ij} = \beta_{0Q} + \beta_{1Q} X_i + r_{iQ} + \varepsilon_{ijQ};
\]
\[
M_{ij} = X_i + U_{ijF};
\]
Attenuation
• The attenuation is the slope in the linear regression of X on Q
_________________________________________________________
\[
Q_{ij} = \beta_{0Q} + \beta_{1Q} X_i + r_{iQ} + \varepsilon_{ijQ};
\]
\[
M_{ij} = X_i + \varepsilon_{ijF};
\]
\[
\lambda_Q = \mathrm{cov}(X, Q) / \mathrm{var}(Q)
\]
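A sketch of how the attenuation follows from the FFQ model on this slide. It assumes X, r and ε are mutually independent (the slide does not say so), in which case cov(X, Q) = β1Q·var(X) and var(Q) = β1Q²·var(X) + var(r) + var(ε). The function names and the numbers are illustrative only, not values from OPEN.

```python
import numpy as np

def attenuation(beta1, var_x, var_r, var_eps):
    """lambda = cov(X, Q) / var(Q) for Q = b0 + beta1*X + r + eps, terms independent."""
    return beta1 * var_x / (beta1 ** 2 * var_x + var_r + var_eps)

def simulated_attenuation(beta1, var_x, var_r, var_eps, n=200_000, seed=1):
    """Empirical slope of the regression of X on Q from simulated data."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, np.sqrt(var_x), n)
    r = rng.normal(0.0, np.sqrt(var_r), n)
    eps = rng.normal(0.0, np.sqrt(var_eps), n)
    q = beta1 * x + r + eps
    c = np.cov(x, q)
    return c[0, 1] / c[1, 1]

# Hypothetical variance components: same random error, with and without person-specific bias
lam_without_bias = attenuation(1.0, 1.0, 0.0, 1.0)
lam_with_bias = attenuation(1.0, 1.0, 2.0, 1.0)
lam_simulated = simulated_attenuation(1.0, 1.0, 2.0, 1.0)

print(lam_without_bias, lam_with_bias, round(lam_simulated, 3))
```

In this made-up example, adding person-specific bias halves the attenuation factor, from 0.5 to 0.25, even though the random error is unchanged.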
Relative Risk and Attenuation
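• Start with a logistic model, with H the logistic function:
\[
\mathrm{pr}(D = 1) = H(\alpha_0 + \alpha_1 X)
\]
• True relative risk:
\[
R = \exp(\alpha_1)
\]
• Observed relative risk (regression calibration):
\[
R_Q = R^{\lambda_Q} < R \quad \text{since } \lambda_Q < 1
\]
_________________________________________________________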
Relative Risk and Attenuation
_________________________________________________________
Attenuation                   Relative Risk
1.0 (no measurement error)    2.0
0.8                           1.74
0.5                           1.41
0.25                          1.19
0.10                          1.07
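The table can be reproduced by raising the true relative risk of 2.0 to the power of the attenuation. A minimal sketch; the function name is mine:

```python
def observed_relative_risk(true_rr, attenuation):
    """Observed relative risk = (true relative risk) ** attenuation."""
    return true_rr ** attenuation

table = {lam: round(observed_relative_risk(2.0, lam), 2)
         for lam in (1.0, 0.8, 0.5, 0.25, 0.10)}

for lam, rr in table.items():
    print(f"{lam:<5} {rr:.2f}")
```

Because the attenuation acts on the log scale, a factor of 0.10 turns a doubling of risk into a 7% increase.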
Our Hypothesis
• We hypothesized that when measuring Fat intake
  • The personal equation, or person-specific bias, unique to each individual, is large and debilitating.
• The problem: the actual variability in American diets is much smaller than suspected.
_________________________________________________________
Can We Test Our Hypothesis?
• We need biomarker data that are not much subject to the personal equation
• There is no biomarker for Fat
• There are biomarkers for energy (calories) and Protein
• We expect that studies are too small by orders of magnitude
_________________________________________________________
Biomarker Data
Calories and Protein: Available from NCI’s OPEN study
Results are surprising
Victor Kipnis was the driving force behind OPEN
_________________________________________________________
Sample Size Inflation
There are formulae for how large a study needs to be to detect a doubling of risk between low and high Fat/Energy diets
These formulae ignore the personal equation
We recalculated the formulae
_________________________________________________________
Biomarker Data: Sample Size Inflation
[Figure: sample size inflation factor (axis 0 to 12) for Protein, Calories, and %-Protein]
_________________________________________________________
If you are interested in the effect of calories on health, multiply the sample size you thought you needed by 11. For protein, by 4.5
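One way to read these factors is sketched below. It rests on an assumption the talk does not state: the common approximation that the required sample size scales with 1/ρ², where ρ is the correlation between reported and true intake. Under that assumption, an inflation factor translates directly into an implied correlation. The function name is hypothetical.

```python
import math

def implied_correlation(inflation_factor):
    """Correlation rho between reported and true intake implied by a sample size
    inflation factor, ASSUMING required n scales with 1 / rho**2."""
    return 1.0 / math.sqrt(inflation_factor)

rho_calories = implied_correlation(11.0)  # factor quoted above for calories
rho_protein = implied_correlation(4.5)    # factor quoted above for protein

print(f"calories: rho ~ {rho_calories:.2f}")
print(f"protein:  rho ~ {rho_protein:.2f}")
```

If the approximation holds, a factor of 11 corresponds to a correlation of roughly 0.3 between what the FFQ reports and what is actually eaten.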
Relative Risk
_________________________________________________________
If high calories increases the risk of breast cancer by 100% in fact, and you change your intake dramatically, the FFQ thinks doing so increases the risk by 4%
[Figure: Relative Risk for Changing Your Food Intake (axis 1 to 2)]
True: 2.00
Observed, Protein: 1.09
Observed, Calories: 1.04
Result: It is not possible to tell if changing your absolute caloric intake, or your fat intake, or your protein intake will have any health effects
Relative Risk, Food Composition
_________________________________________________________
If high protein (fat) increases the risk of breast cancer by 100%, your calories remain the same, you dramatically lower your protein (fat) intake, then FFQ thinks your risk increases by 20%-30%
[Figure: Relative Risk for Food Composition (axis 1 to 2)]
True: 2.00
Observed, Protein Density: 1.31
Result: It is pretty difficult to tell if changing your food composition while maintaining your caloric intake will have any health effects
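The observed relative risks can be turned back into attenuation factors. The sketch below assumes the relation from the earlier attenuation table, observed RR = (true RR) raised to the attenuation, with a true relative risk of 2.00; the function name is hypothetical.

```python
import math

def implied_attenuation(observed_rr, true_rr=2.0):
    """Solve observed_rr = true_rr ** lam for lam."""
    return math.log(observed_rr) / math.log(true_rr)

lam_calories = implied_attenuation(1.04)         # absolute calories
lam_protein = implied_attenuation(1.09)          # absolute protein
lam_protein_density = implied_attenuation(1.31)  # protein density

for name, lam in [("calories", lam_calories),
                  ("protein", lam_protein),
                  ("protein density", lam_protein_density)]:
    print(f"{name:<16} {lam:.2f}")
```

On this reading, the FFQ retains roughly 6% of the log relative risk for calories, about 12% for protein, and about 39% for protein density, which is why food composition is somewhat easier to study than absolute intake.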
New Results
The AARP Study: 250,000+ women, by far the greatest number in any single study
Results according to rumor: huge size, statistical significance
FFQ: small measured increase in risk for dramatic behavioral change
Statistician’s dream: use Pearson’s idea to get at the true increase in risk
_________________________________________________________
A happy statistician dreaming about AARP
New Results
The WHI Controls Study: 30,000+ women
All with > 32% Calories from Fat via FFQ
Diaries in a nested case-control study
Highly significant fat effect in the diaries (RR in quantiles of 1.6)
_________________________________________________________
A happy statistician doing field biology in Northwest Australia (the Kimberley)
Summary
WHI, 2006, clinical trial
My best case conjecture in 2005:
Probably no statistically significant effects
The p-value was 0.07, relative risk about 1.2
My best case conjecture in 2008 after further follow-up:
Statistically significant, modest effects
_________________________________________________________
You are what you eat, but do you know who you are?
Diet is incredibly hard to measure
Even 100% increases in risk cannot be seen in large cohort studies with an FFQ
If you read about a diet intervention, measured by an FFQ, and it achieves statistical significance multiple times: wow!
_________________________________________________________
You are what you eat, but do you know who you are?
Much work at NCI and WHI and EPIC on new ways of measuring diet
EPIC (a multi-country study) may be a model, because of the wide distribution of intakes
_________________________________________________________
What Was Done
• The OPEN analysis actually fit Protein and Energy together.
• We call this the Seemingly Unrelated Measurement Error Model
• Can get major gains in efficiency
_________________________________________________________
SUMEM
• Gains in efficiency come from the correlations of the random effects
_________________________________________________________
\[
Q_{ijP} = \beta_{0QP} + \beta_{1QP} X_{iP} + r_{iQP} + \varepsilon_{ijQP};
\]
\[
M_{ijP} = X_{iP} + U_{ijP};
\]
\[
Q_{ijE} = \beta_{0QE} + \beta_{1QE} X_{iE} + r_{iQE} + \varepsilon_{ijQE};
\]
\[
M_{ijE} = X_{iE} + U_{ijE};
\]