
European Journal of Epidemiology 14: 259–267, 1998. © 1998 Kluwer Academic Publishers. Printed in the Netherlands.

Reliability of data on smoking habit and coffee drinking collected by personal interview in a hospital-based case-control study

Francesco Donato1, Paolo Boffetta2, Raffaella Fazioli3, Umberto Gelatti1 & Stefano Porru3

1 Institute of Hygiene, University of Brescia, Brescia; 2 Unit of Environmental Cancer Epidemiology, International Agency for Research on Cancer, Lyon, France; 3 Institute of Occupational Health, University of Brescia, Brescia, Italy

Accepted in revised form 7 February 1998

Abstract. A study on the reliability of information on smoking habits and coffee drinking collected via interview was conducted among 500 subjects enrolled in a case–control study on bladder cancer in Brescia, northern Italy. A total of 215 cases (incident and prevalent) and 285 controls were interviewed personally in the hospital setting by a first interviewer, and then re-interviewed by telephone by either the same interviewer or another one. Agreement between the first and second interview was evaluated using the kappa statistic and the intra-class correlation coefficient and via multiple logistic regression modelling. No important differences in reliability were found according to sex, education or case/control status, while agreement was better among subjects below 65 than among older ones, and among incident than among prevalent cases. Slightly better agreement was found among subjects interviewed twice by the same interviewer than among those interviewed by two different individuals, which may reflect inter-observer variability in the latter group. Overall, these results show a very high reliability of data on smoking and a fairly high reliability of data on coffee drinking as collected through face-to-face interviews.

Key words: Bladder cancer, Coffee drinking, Reliability, Tobacco smoking

Introduction

Validity of exposure measurement is a key problem in epidemiologic research [1]. When no objective measure of the true values of exposure is available, such as for past tobacco smoking or present or past coffee drinking, validity cannot be assessed. However, the reliability of these data can be measured through replicated measures, by computing a measure of agreement between them, i.e. the correlation coefficient of reliability (‘reliability coefficient’). The reliability coefficient is of great interest, since it is equal to the square of the validity coefficient, which is the correlation between any observed measure and the true value [1]. The test–retest technique has been widely used to estimate reliability, due to its low cost and feasibility [2]. Various studies have been carried out to assess the reliability of data collected in hospital-based case–control studies. However, these studies often involved a limited number of subjects and did not assess reliability according to case/control status, sex, age or education. Furthermore, the most widely used measures of agreement between two categorical or continuous variables, such as Cohen’s kappa and the intraclass correlation coefficient [3], have frequently been misinterpreted and misused, as pointed out by Maclure and Willett [4], and Muller and Buttner [5].
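The relation between the reliability and validity coefficients stated above can be written compactly. The notation is an assumption introduced here for illustration and does not appear in the source: X and X' are two replicate measurements of the same exposure, T is the unobserved true value, and the measurement errors are taken to be uncorrelated with T and with each other and to have equal variance in both replicates.

```latex
% Reliability coefficient = squared validity coefficient
% (classical measurement-error model, X = T + E, X' = T + E')
\rho_{XX'} \;=\; \rho_{XT}^{2}
\qquad\Longrightarrow\qquad
\rho_{XT} \;=\; \sqrt{\rho_{XX'}}
```

Under these assumptions, a reliability coefficient of 0.64, for example, would correspond to a validity coefficient of 0.80.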

Tobacco smoking and coffee drinking are among the most widely investigated exposures as possible risk factors for human diseases. In spite of a large number of epidemiologic investigations, discrepancies still exist regarding the association of bladder cancer risk with coffee drinking [6], which may – at least in part – be due to exposure misclassification.

The aims of this study were: (1) to assess the reliability of data on cigarette smoking and coffee drinking collected through personal interviews in case-control studies in the hospital setting, and (2) to evaluate the role of factors that may influence the reliability of data collected through personal interviews, such as interviewer, case/control status, sex, age at interview, education and time since diagnosis for cases.

Methods

A hospital-based case–control study on bladder cancer and occupation, tobacco smoking, coffee and alcohol drinking has been carried out in Brescia, northern Italy [7, 8]. All 931 participants were interviewed personally by one interviewer, herein named A, using a standardised questionnaire. All subjects with bladder cancer admitted to the main hospital in the province of Brescia for diagnosis, treatment or follow-up during 1992–1993 were enrolled (n = 353), including newly diagnosed, or incident, cases (n = 129) and previously diagnosed, or prevalent, cases (n = 224). Subjects admitted to the same hospital and to two other hospitals in the town for benign urological disease were enrolled as controls (n = 578). The controls were interviewed at the same time as cases and were group-matched with them for age and sex.

Figure 1. Design of the study (number of cases and controls in parentheses).

Information on lifelong history of tobacco smoking and coffee drinking was collected. Data on current tobacco smoking and coffee drinking referred to the date of diagnosis for cases. Subjects were classified as current smokers if they smoked at least one cigarette per day, and current coffee drinkers if they drank at least one cup of coffee per day. All subjects were interviewed between January 1992 and September 1993.

Of the 931 subjects included in the case-control study (353 cases and 578 controls), 909 (338 cases and 571 controls) were eligible for a second interview (Figure 1); the others were excluded because they were known to have died in the meantime or had not answered some questions on tobacco smoking or coffee drinking at the first interview. A random sample of male cases and of male and female controls was selected, while all 78 female cases were included in the reliability study. Six hundred thirty-one subjects (292 cases and 339 controls) were selected for telephone re-interview. The subjects were then randomly assigned either to the previous interviewer (interviewer A) or to another one (interviewer B). One hundred thirty-one subjects (77 cases and 54 controls) were not re-interviewed, because they had died (n = 56), were too ill to answer (n = 28), or had no telephone or could not be reached after three calls on different days and at different times (n = 47). The proportion of subjects re-interviewed was slightly lower among cases (males: 74.7%, females: 70.5%) than controls (males: 82.8%, females: 86.4%), mainly due to the death of the subjects (42 cases and 14 controls). A comparison between the subjects re-interviewed and those selected but not re-interviewed showed a lower proportion of people aged 65 years and over among re-interviewed subjects than among the others, while no differences were seen as regards sex or education.

All the re-interviews were performed by telephone between January and June 1995. The second questionnaire included the same questions on smoking status and coffee drinking as the previous one, and also some new questions on the usual consumption of hot beverages (tea, chocolate, ‘milk and coffee’, others). The telephone interviews lasted about 10 minutes, and subjects were asked again about their habits at the time of the first interview (incident cases and controls) or the first diagnosis (prevalent cases).

All the variables were analysed as categorical. The analysis of agreement was restricted to individuals who gave full answers at both interviews for each variable. The odds ratios (ORs) of bladder cancer for smoking habit and coffee drinking were computed using unconditional logistic regression analysis in order to adjust the estimates for age, sex, years of education and residence. The ORs for coffee drinking were also adjusted for smoking habits.

Table 1. Demographic characteristics of the cases and controls in the reliability study

| Demographic characteristics | Cases: No. | Cases: % | Controls: No. | Controls: % |
|---|---|---|---|---|
| Sex: Male | 160 | 74.4 | 183 | 64.2 |
| Sex: Female | 55 | 25.6 | 102 | 35.8 |
| Age (years): < 55 | 59 | 27.4 | 145 | 50.9 |
| Age (years): 55–64 | 74 | 34.4 | 68 | 23.9 |
| Age (years): 65+ | 82 | 38.1 | 72 | 25.3 |
| Residence: Province of Brescia | 199 | 92.6 | 220 | 77.2 |
| Residence: Other areas | 16 | 7.4 | 65 | 22.8 |
| Education (years): 0–5 | 145 | 67.4 | 155 | 54.4 |
| Education (years): 6–8 | 37 | 17.2 | 66 | 23.2 |
| Education (years): 9+ | 33 | 15.3 | 64 | 22.5 |
| Total | 215 | (100) | 285 | (100) |

Agreement between the first and the second interview was assessed by computing Cohen’s kappa statistic (K) for dichotomous variables, and the intra-class correlation coefficient (ICC) for ordinal variables [4]. The ICC was computed using a two-way analysis of variance (ANOVA) with fixed effects, since a systematic difference between the first, face-to-face interview at the hospital and the second, telephone interview at home can be suspected a priori [5]. A value of 0.75 was considered indicative of good agreement, according to others [9], but lower values of K were not automatically regarded as indicative of non-agreement, since this statistic is strongly dependent on true prevalence [10] and may be low even when agreement is high, owing to unbalanced, symmetrical marginals [11]. Since the indices of agreement cannot identify the existence of a bias, i.e. when one interview gives systematically higher values than the other, we computed Bhapkar’s W statistic to test for marginal homogeneity [12] and the χ2 test of symmetry [12, 13], which reduces to McNemar’s test for 2 × 2 tables.
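The dependence of kappa on the marginals can be illustrated with a short Python sketch. The 2 × 2 cell counts below are not printed in the paper; they are a reconstruction, assumed here, from the figures reported in the Results for dichotomous coffee drinking (446 of 500 concordant answers, 19.4% non-drinkers at the first interview and 19.0% at the second). The "balanced" table is purely hypothetical and is included only for contrast. The function names are illustrative, and McNemar's statistic is shown without a continuity correction.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table (rows: 1st rating, cols: 2nd)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    p_obs = sum(table[i][i] for i in range(k)) / n
    row_p = [sum(table[i]) / n for i in range(k)]
    col_p = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    p_exp = sum(r * c for r, c in zip(row_p, col_p))  # chance-expected agreement
    return (p_obs - p_exp) / (1 - p_exp)


def mcnemar_chi2(b, c):
    """McNemar's chi-square (1 df, no continuity correction) on discordant pairs."""
    return (b - c) ** 2 / (b + c)


# Reconstructed table (assumption): order is [drinker, non-drinker];
# rows = first interview, columns = second interview.
coffee = [[377, 26],
          [28, 69]]

# Hypothetical table with the same 89.2% agreement but balanced marginals.
balanced = [[223, 27],
            [27, 223]]

kappa_coffee = cohen_kappa(coffee)
kappa_balanced = cohen_kappa(balanced)
chi2_coffee = mcnemar_chi2(coffee[0][1], coffee[1][0])

print(f"kappa, reconstructed coffee table: {kappa_coffee:.2f}")
print(f"kappa, balanced hypothetical table: {kappa_balanced:.2f}")
print(f"McNemar chi-square, coffee table: {chi2_coffee:.3f}")
```

With identical observed agreement, kappa is lower for the table with unbalanced marginals because the chance-expected agreement is higher there; the small McNemar statistic reflects the near-equal numbers of discordant answers in the two directions.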

We assessed reliability in the total sample and according to case/control status, sex and age of subject, the interviewer who conducted the second interview, and, for cases only, the interval between diagnosis of the disease and the first interview. We compared the differences between the first and second interview across subgroups (cases versus controls, males versus females, people aged less than 65 versus 65 years and over, etc.) using the Mann–Whitney rank test for unpaired data [14]. Then we used logistic regression [15, 16] to assess the independent associations of the variables investigated with concordance between the first and second interview. The dependent variable was defined as 1 if the answers given in the first and second interview were identical (e.g. the same category for categorical variables), and 0 otherwise. We computed the OR as a measure of association between each category of independent variable and agreement; that is, an OR > 1 in one category indicates better agreement in that category as compared to the reference. To determine the statistical significance of differences in agreement across subgroups, we used the likelihood-ratio test [15].
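As a rough illustration of how an OR for agreement is read, the sketch below computes a crude (unadjusted) OR with a Woolf-type 95% confidence interval from agreement counts. It is not the authors' multivariable logistic model: it ignores the other covariates, and the function name and interface are assumptions made for this example. The input counts are the smoking-status agreement figures given in Table 3 (interviewer B: 221 of 242 concordant; interviewer A, the reference: 251 of 258 concordant).

```python
import math


def agreement_odds_ratio(agree_grp, total_grp, agree_ref, total_ref, z=1.96):
    """Crude OR of agreement (group vs reference) with a Woolf confidence interval."""
    a, b = agree_grp, total_grp - agree_grp   # concordant / discordant in the group
    c, d = agree_ref, total_ref - agree_ref   # concordant / discordant in the reference
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Smoking status: second interview by B versus second interview by A (reference).
or_b, lo_b, hi_b = agreement_odds_ratio(221, 242, 251, 258)
print(f"crude OR = {or_b:.2f} (95% CI {lo_b:.2f} to {hi_b:.2f})")
```

An OR below 1 here means that answers were less often identical when the second interview was conducted by B. The crude value is close to the adjusted estimate for the same contrast in Table 5, 0.3 (0.1–0.7), although in general adjusted and crude estimates need not coincide.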

Finally, we compared some observed ORs, computed by using data from the first interview, with those ‘corrected’ by the information on measurement errors for cigarette smoking and coffee drinking given by the reliability study. For this purpose, we computed the true ORs as a function of the observed ORs and of the reliability coefficients (ICCs) using the equations given by Armstrong et al., which are valid under some assumptions [17]. As regards dichotomous variables, we computed the corrected ORs by using the method suggested by Bashir et al. [18] and Duffy et al. [19], which is based on the computation of α as the probability that the risk factor is correctly classified.
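The source does not print the correction equation. A standard attenuation relationship for a logistic model with non-differential error in the exposure measurement is sketched below; it is an assumption that this is the form used, and the symbols (OR_O for the observed OR, OR_T for the true OR, ρ for the reliability coefficient) are introduced here for illustration only.

```latex
% Attenuation of the log odds ratio by the reliability coefficient \rho (0 < \rho \le 1)
\ln \mathrm{OR}_{O} \;=\; \rho \,\ln \mathrm{OR}_{T}
\qquad\Longrightarrow\qquad
\mathrm{OR}_{T} \;=\; \mathrm{OR}_{O}^{\,1/\rho}
```

Under this relationship, the lower the reliability, the more an observed OR above 1 underestimates the true OR.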

All the statistical analyses were performed using the BMDP/Dynamic package for personal computer [20].

Results

The demographic characteristics of the subjects included in the reliability study are shown in Table 1. The distribution of cases and controls was similar according to sex, residence and education, while cases were, on average, older than controls.

Test–retest agreement was high as regards smoking habits (Table 2), including semi-quantitative estimates of the number of cigarettes smoked per day, years of smoking, years since quitting and total number of cigarettes smoked in lifetime. A lower agreement was found for coffee drinking, as both a dichotomous and an ordinal variable. The relatively low value of K (0.65) for coffee drinking considered as a dichotomous variable, in spite of the high percentage of agreement (89.2%), is due to the unbalanced symmetrical marginals, since the percentage of subjects classified as non-drinkers was low in both the first (19.4%) and the second interview (19.0%). No lack of symmetry or marginal homogeneity was found for any of the variables in Table 2 when comparing the first with the second interview. No substantial differences in agreement were found between cases and controls as regards smoking habits; by contrast, significant differences were observed regarding coffee drinking (Mann–Whitney test for unpaired data: p = 0.02 for both dichotomous and categorical variables). Among cases, a slight decrease was found in both the proportion of consumers (from 90.7% to 86.5%) and the proportion of those consuming 3 cups/day and over among current drinkers (from 52.7% to 42.7%) (tests of symmetry and marginal homogeneity: p = 0.05). On the other hand, no changes were observed among controls from the first to the second interview.

Table 2. Test–retest reliability: agreement and values of kappa (K) and intraclass correlation coefficient (ICC) according to case-control status

| Variable | Total subjects: agreement, No. (%) | Total subjects: K / ICC | Cases: agreement, No. (%) | Cases: K / ICC | Controls: agreement, No. (%) | Controls: K / ICC |
|---|---|---|---|---|---|---|
| Smoking habit (never/ex-smoker/current smoker) | 472/500 (94.4) | K = 0.92 | 204/215 (94.9) | K = 0.92 | 268/285 (94.0) | K = 0.91 |
| Current smoking: No. cigarettes a day (1–5/6–10/11–20/21+) (a) | 134/185 (72.4) | ICC = 0.79 | 79/107 (73.8) | ICC = 0.82 | 55/78 (70.5) | ICC = 0.75 |
| Years of smoking (1–20/21–40/41+) (b) | 291/330 (88.2) | ICC = 0.87 | 155/176 (88.1) | ICC = 0.83 | 136/154 (88.3) | ICC = 0.88 |
| Years since quitting smoking (1–5/6–20/21+) (c) | 116/130 (89.2) | ICC = 0.78 | 54/60 (90.0) | ICC = 0.76 | 62/70 (88.6) | ICC = 0.79 |
| Lifetime smoking: No. cigarettes smoked (1–99,999/100,000–199,999/200,000+) (b) | 254/330 (77.0) | ICC = 0.80 | 132/176 (75.0) | ICC = 0.77 | 122/154 (79.2) | ICC = 0.81 |
| Current coffee drinking (yes/no) | 446/500 (89.2) | K = 0.65 | 200/215 (93.0) | K = 0.66 | 246/285 (86.3) | K = 0.64 |
| Current coffee drinking: No. cups a day (1–2/3–4/5+) (d) | 249/352 (70.7) | ICC = 0.63 | 106/171 (62.0) | ICC = 0.54 | 143/181 (79.0) | ICC = 0.73 |

(a) Current smokers only. (b) Current and ex-smokers. (c) Ex-smokers only. (d) Current drinkers only.

Table 3. Test–retest reliability: agreement and values of kappa (K) and intraclass correlation coefficient (ICC) for cigarette smoking and coffee drinking according to the interviewers

| Variable | Both interviews by A: agreement, No./total (%) | Both interviews by A: K / ICC | First interview by A, re-interview by B: agreement, No./total (%) | First interview by A, re-interview by B: K / ICC |
|---|---|---|---|---|
| Smoking status (never/ex-smoker/current smoker) | 251/258 (97.3) | K = 0.96 | 221/242 (91.3) | K = 0.87 |
| Current smoking: No. cigarettes a day (1–5/6–10/11–20/21+) (a) | 65/87 (74.7) | ICC = 0.75 | 69/98 (70.4) | ICC = 0.77 |
| Years of smoking (1–20/21–40/41+) (b) | 146/165 (88.5) | ICC = 0.89 | 145/165 (87.9) | ICC = 0.84 |
| Years since quitting smoking (1–5/6–20/21+) (c) | 69/74 (93.2) | ICC = 0.88 | 47/56 (83.9) | ICC = 0.69 |
| Lifetime smoking: No. cigarettes smoked (1–99,999/100,000–199,999/200,000+) (b) | 132/165 (80.0) | ICC = 0.83 | 122/165 (73.9) | ICC = 0.76 |
| Current coffee drinking (yes/no) | 237/258 (92.0) | K = 0.77 | 209/242 (86.4) | K = 0.48 |
| Current coffee drinking: No. of cups a day (1–2/3–4/5+) (d) | 125/174 (71.8) | ICC = 0.53 | 124/178 (70.0) | ICC = 0.60 |

The intraclass correlation coefficient (ICC) was computed by two-way ANOVA with fixed effects. (a) Current smokers only. (b) Current and ex-smokers. (c) Ex-smokers only. (d) Current drinkers only.

Table 4. Test–retest reliability: agreement and values of kappa (K) and of intraclass correlation coefficient (ICC) according to age at first interview

| Variable | Aged 30–64 years: agreement, No. (%) | Aged 30–64 years: K / ICC | Aged 65+ years: agreement, No. (%) | Aged 65+ years: K / ICC |
|---|---|---|---|---|
| Smoking habit (never/ex-smoker/current smoker) | 290/305 (95.1) | K = 0.92 | 182/195 (93.3) | K = 0.90 |
| Current smoking: No. cigarettes a day (1–5/6–10/11–20/21+) (a) | 78/124 (62.9) | ICC = 0.78 | 46/61 (68.9) | ICC = 0.82 |
| Years of smoking (1–20/21–40/41+) (b) | 171/190 (90.0) | ICC = 0.88 | 120/140 (85.7) | ICC = 0.77 |
| Years since quitting smoking (1–5/6–20/21+) (c) | 54/60 (90.0) | ICC = 0.83 | 62/70 (88.6) | ICC = 0.73 |
| Lifetime smoking: No. cigarettes smoked (1–99,999/100,000–199,999/200,000+) (b) | 148/190 (77.9) | ICC = 0.82 | 106/140 (75.7) | ICC = 0.74 |
| Current coffee drinking (yes/no) | 281/305 (92.1) | K = 0.72 | 165/195 (84.6) | K = 0.56 |
| Current coffee drinking: No. cups a day (1–2/3–4/5+) (d) | 165/229 (72.1) | ICC = 0.60 | 84/123 (68.3) | ICC = 0.46 |

(a) Current smokers only. (b) Current and ex-smokers. (c) Ex-smokers only. (d) Current drinkers only.

The subjects were divided into two groups according to the second interviewer (A or B): in one group, both interviews were performed by A, while in the other the first interviews were conducted by A and the re-interviews by B. The comparison between the two groups is shown in Table 3. As expected, the indices suggested better agreement between the first interviews and the re-interviews when both were performed by the same person than when they were performed first by A and then by B. However, no significant differences were found between the two groups except for coffee drinking (Mann–Whitney test: p = 0.01).

Subjects aged 30–64 years when first interviewed (Table 4) showed slightly better agreement than those aged 65 and over for almost all variables, especially coffee drinking. The proportion of drinkers did not substantially change from the first to the second interview in either subgroup, while the number of cups of coffee drunk per day showed a mean increase among the younger subjects and a mean decrease among the others (Mann–Whitney test: p = 0.02).

No major differences were found between males and females or between subjects with different levels of education (data not shown in detail). However, some differences were seen as regards the number of cups of coffee drunk per day: a mean decrease among males and an increase among females was found from the first to the second interview (Mann–Whitney test: p = 0.02).

Cases interviewed up to 2 years after the diagnosis of bladder cancer (incident cases) gave more reliable answers than those interviewed after a longer time (prevalent cases) for variables concerning cigarette smoking, but not coffee drinking (data not shown). For instance, the agreement proportions among the cases interviewed 0–2 years after diagnosis and among those interviewed after this interval were, respectively, 99.0% and 91.6% for smoking status, 91.1% and 84.6% for duration of smoking, and 91.7% and 94.1% for coffee drinking as a dichotomous variable.

The ORs for agreement between the first and second interview according to case/control status, sex, interviewer, age at first interview and education were computed by logistic regression analysis with all the variables entered in the model simultaneously. None of the independent variables investigated was found to influence in a systematic way the probability of a subject giving the same response twice (Table 5). According to the estimates in Tables 2 and 5, cases gave more reliable answers than controls as regards current coffee drinking (dichotomous) but less reliable answers as regards the number of cups drunk per day.

Finally, we computed the ORs of bladder cancer for cigarette smoking and coffee drinking, adjusted for age, sex, residence and education, by logistic regression analysis using the data from the first and second interviews, for each interviewer (Table 6). Among the subjects interviewed twice by A, the OR values for the various exposure categories were similar at the two interviews. The ORs for the highest category of exposure were almost always higher when using data from the second rather than the first interview as concerns cigarette smoking, while no differences were seen for coffee drinking. By contrast, among the subjects interviewed first by A and then by B, lower OR values were found at the re-interview for almost all categories of exposure, especially for the highest levels. However, both interviewers found positive associations of bladder cancer with both smoking status and coffee drinking, with a linear increase in ORs for increasing levels of exposure.

Table 5. Odds ratios (OR) and 95% confidence interval (CI) for agreement between first and second interview according to various variables, estimated by logistic regression analysis. All cells are OR (95% CI).

| Variable | Smoking habit (never/ex-smoker/current smoker) | Current smoking: cigarettes/day (a) | Years of smoking (a) | Years since quitting smoking (a) | Lifetime smoking: No. of cigarettes smoked (a) | Current coffee drinking (yes/no) | Current coffee drinking: No. of cups/day (a) |
|---|---|---|---|---|---|---|---|
| Case/control status: Case | Reference | Reference | Reference | Reference | Reference | Reference | Reference |
| Case/control status: Control | 0.7 (0.3–1.5) | 0.8 (0.4–1.7) | 0.9 (0.5–1.9) | 0.8 (0.2–2.7) | 1.2 (0.7–2.1) | 0.3 (0.1–0.6) (b) | 2.5 (1.5–4.2) (b) |
| Sex: Male | Reference | Reference | Reference | Reference | Reference | Reference | Reference |
| Sex: Female | 1.8 (0.7–4.6) | 1.1 (0.5–2.5) | 1.3 (0.5–3.8) | 1.0 (0.1–10.4) | 0.9 (0.4–1.9) | 0.7 (0.4–1.3) | 0.9 (0.5–1.5) |
| Interviewer: A | Reference | Reference | Reference | Reference | Reference | Reference | Reference |
| Interviewer: B | 0.3 (0.1–0.7) | 0.8 (0.4–1.5) | 0.9 (0.5–1.8) | 0.4 (0.1–1.2) | 0.7 (0.4–1.2) | 0.6 (0.3–1.0) (b) | 1.0 (0.6–1.6) |
| Age at first interview (years): 30–64 | Reference | Reference | Reference | Reference | Reference | Reference | Reference |
| Age at first interview (years): 65+ | 0.7 (0.3–1.7) | 1.4 (0.7–3.1) | 0.7 (0.3–1.5) | 0.8 (0.2–2.8) | 0.9 (0.5–1.6) | 0.3 (0.1–0.6) (b) | 1.1 (0.6–1.8) |
| Education (years): 0–5 | Reference | Reference | Reference | Reference | Reference | Reference | Reference |
| Education (years): 6–8 | 1.3 (0.5–3.7) | 3.2 (1.1–9.3) | 1.2 (0.5–2.8) (b) | 1.5 (0.3–6.7) | 1.9 (1.0–3.9) | 0.5 (0.3–1.1) | 0.9 (0.5–1.6) |
| Education (years): 9+ | 1.6 (0.5–4.8) | 1.5 (0.7–3.5) | 6.4 (1.4–28.7) (b) | 1.3 (0.3–7.2) | 1.5 (0.7–3.1) | 0.5 (0.3–1.2) | 0.6 (0.3–1.2) |

(a) The categories of these variables are shown in the previous tables. (b) p < 0.05 by the likelihood-ratio test.

Corrected ORs were also computed for subjects interviewed twice by A (first column of Table 6) by using the ICCs or the number of disagreements calculated when both interviews had been performed by A (Table 3, first two columns). The ICC for smoking status considered as a 4-category variable (non-smoker, ex-smoker, smoker of 1–14 cigarettes/day and smoker of 15+ cigarettes/day) was 0.97. The ICCs for lifetime smoking and current coffee drinking were re-calculated as 0.94 and 0.73, respectively, when including the 0 category (no exposure), which is the reference for estimating the ORs in Table 6. The corrected ORs (uncorrected in parentheses) were as follows: ex-smoker status: 2.8 (2.7); current smoker status: 9.5 (8.9); smoker of 1–14 cigarettes/day: 7.8 (7.3); smoker of 15+ cigarettes/day: 14.3 (13.2); lifetime smoking, 3 categories of exposure: 3.4 (3.2), 7.6 (6.7) and 13.9 (11.9), respectively; current coffee drinking as a yes/no variable: 3.2 (2.9); current coffee drinking as categories of exposure: 4.3 (2.9), 5.6 (3.5) and 14.7 (7.1), respectively.
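The ICC-based corrections listed above can be checked numerically. The sketch below assumes the usual attenuation relationship, corrected OR = observed OR raised to the power 1/ICC; the source does not print the equation, so this form is an assumption, and the function name is illustrative. The dichotomous coffee variable, 3.2 (2.9), is left out because it was corrected with a different, α-based method.

```python
import math


def corrected_or(observed_or, reliability):
    """De-attenuated OR, assuming ln(OR_obs) = reliability * ln(OR_true)."""
    if not 0 < reliability <= 1:
        raise ValueError("reliability must lie in (0, 1]")
    return math.exp(math.log(observed_or) / reliability)


# (exposure category, observed OR, ICC, corrected OR as reported in the text)
REPORTED = [
    ("ex-smoker", 2.7, 0.97, 2.8),
    ("current smoker", 8.9, 0.97, 9.5),
    ("1-14 cigarettes/day", 7.3, 0.97, 7.8),
    ("15+ cigarettes/day", 13.2, 0.97, 14.3),
    ("lifetime smoking, category 1", 3.2, 0.94, 3.4),
    ("lifetime smoking, category 2", 6.7, 0.94, 7.6),
    ("lifetime smoking, category 3", 11.9, 0.94, 13.9),
    ("coffee, 1-2 cups/day", 2.9, 0.73, 4.3),
    ("coffee, 3-4 cups/day", 3.5, 0.73, 5.6),
    ("coffee, 5+ cups/day", 7.1, 0.73, 14.7),
]

results = [(label, corrected_or(obs, icc), rep) for label, obs, icc, rep in REPORTED]
for label, computed, reported in results:
    print(f"{label:30s} computed {computed:5.1f}   reported {reported:5.1f}")
```

The computed values agree with the reported ones to one decimal place, which suggests, without proving, that this is the relationship applied. It also shows why the coffee ORs change far more than the smoking ORs: an ICC of 0.73 inflates the log OR by about 37%, against roughly 3% for an ICC of 0.97.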

Discussion

Overall, this study showed a fairly good reliability of data on smoking habits and coffee drinking as collected through face-to-face and telephone interviews among cases with bladder cancer and controls affected with benign urological diseases. As expected, reliability was higher for dichotomous than for categorical semi-quantitative variables. By applying a K or ICC value of 0.75 or higher as the criterion indicating good agreement, reproducibility appears acceptable for tobacco smoking in each subgroup. Other results include the following: (1) no differences between males and females or according to education; (2) no differences between cases and controls as regards smoking, but some differences regarding coffee drinking; (3) slightly better agreement among subjects aged 30–64 years than among those aged 65 and over, and among incident than among prevalent cases; (4) slightly better agreement among subjects interviewed twice by A than among those interviewed by A and B; (5) higher reliability for variables concerning smoking habits than for coffee drinking. However, most of these differences were small and not statistically significant.

Table 6. Odds ratios (OR) and 95% confidence interval (95% CI) of bladder cancer for cigarette smoking and coffee drinking adjusted for sex, age, education and residence according to interview. Ca/Co = number of cases/number of controls.

| Variable | Interviewed twice by A, first interview: Ca/Co | OR (95% CI) | Interviewed twice by A, second interview: Ca/Co | OR (95% CI) | Interviewed by A and B, first interview by A: Ca/Co | OR (95% CI) | Interviewed by A and B, second interview by B: Ca/Co | OR (95% CI) |
|---|---|---|---|---|---|---|---|---|
| Smoking status (a): Non-smoker | 26/65 | Reference | 25/67 | Reference | 13/61 | Reference | 13/57 | Reference |
| Smoking status (a): Ex-smoker | 29/51 | 2.7 (1.1–6.6) | 29/46 | 3.3 (1.3–8.4) | 37/28 | 4.5 (1.8–10.3) | 35/30 | 3.5 (1.4–8.6) |
| Smoking status (a): Current smoker | 49/39 | 8.9 (3.7–21.6) | 49/42 | 8.7 (3.6–20.9) | 62/41 | 7.0 (3.2–15.2) | 64/43 | 6.3 (2.9–14.0) |
| Current smoker, 1–14 cigarettes/day | 22/18 | 7.3 (2.8–19.3) | 21/22 | 6.0 (2.3–15.8) | 27/22 | 5.2 (2.2–12.8) | 24/24 | 4.4 (1.8–10.9) |
| Current smoker, 15+ cigarettes/day | 27/21 | 13.2 (4.4–39.6) | 28/20 | 16.6 (5.5–49.6) | 35/19 | 10.0 (3.9–25.4) | 40/19 | 9.0 (3.6–22.4) |
| Lifetime smoking, No. cigarettes smoked (b): 1–99,999 | 19/43 | 3.2 (1.3–8.0) | 13/42 | 2.5 (1.0–6.6) | 17/24 | 3.1 (1.2–7.9) | 19/26 | 3.2 (1.3–8.1) |
| Lifetime smoking, No. cigarettes smoked (b): 100,000–199,999 | 21/21 | 6.7 (2.4–19.0) | 23/20 | 9.9 (3.4–28.7) | 24/18 | 6.0 (2.3–15.6) | 24/20 | 5.0 (2.0–12.5) |
| Lifetime smoking, No. cigarettes smoked (b): 200,000+ | 38/26 | 11.9 (4.3–33.2) | 42/25 | 16.6 (5.8–47.8) | 58/27 | 10.1 (4.2–24.0) | 56/27 | 8.4 (3.5–20.4) |
| Current coffee drinking (c): No | 15/48 | Reference | 13/41 | Reference | 5/27 | Reference | 16/27 | Reference |
| Current coffee drinking (c): Yes | 88/107 | 2.9 (1.4–5.9) | 90/114 | 2.8 (1.3–6.0) | 107/103 | 4.3 (1.4–14.0) | 96/103 | 1.2 (0.5–2.7) |
| Current coffee drinking, No. cups a day (d): 1–2 | 56/70 | 2.9 (1.1–7.8) | 51/62 | 3.8 (1.2–11.7) | 55/64 | 2.4 (0.7–8.6) | 43/52 | 1.9 (0.5–6.8) |
| Current coffee drinking, No. cups a day (d): 3–4 | 33/34 | 3.5 (1.2–10.1) | 27/43 | 2.4 (0.7–8.1) | 42/32 | 3.3 (0.9–12.5) | 42/40 | 2.4 (0.7–9.0) |
| Current coffee drinking, No. cups a day (d): 5+ | 5/3 | 7.1 (1.1–45.4) | 12/9 | 6.3 (1.4–27.5) | 10/7 | 5.2 (1.0–28.5) | 11/11 | 4.3 (0.9–20.7) |

(a) Reference category: non-smokers. (b) Current and ex-smokers. Reference category: non-smokers. (c) Odds ratios adjusted for smoking status too. Reference category: non-drinkers at the time of the interview (those who had never drunk + ex-drinkers). (d) Current drinkers only. Odds ratios adjusted for smoking status too. Reference category: those who had never drunk.

Most of these findings were not unexpected, such as the lack of differences between males and females and between subjects with different levels of education, and the higher reliability among younger people and among incident than among prevalent cases. Furthermore, the lack of differences in reliability between cases and controls is one of the fundamental assumptions for having no differential bias in the relative risk estimate. We found no systematic differences between cases and controls as regards tobacco smoking and coffee drinking. The finding of no influence of sex and case/control status on reliability is in agreement with other studies [21, 22].

High agreement on tobacco smoking between two interviews is a common finding in reliability studies. For instance, Kelly et al. [23] carried out a reliability study among hospitalised patients interviewed twice at the hospital. They found a K value of 0.81 for smoking status as a variable in three categories, and an ICC value of 0.84 when considering the continuous variable ‘no. of cigarettes smoked/day’. They found a lower reliability when the two interviews occurred more than 1 year apart. In this study we could not evaluate the influence of the interval between interviews, since it was similar in all subjects, namely 2–3 years. However, we assessed the reliability of data on past smoking habits according to the time between disease diagnosis and first interview. We observed a lower reliability among those with longer intervals (more than 2 years), for almost all variables. This finding supports the hypothesis that interview data from prevalent cases can be less reliable than those from incident cases, and that the interval between the onset of the disease and the interview should be as short as possible.

The lower agreement observed as regards coffee drinking with respect to tobacco smoking may be due to various factors. First of all, it may be partly overstated due to the use of K or ICC in tables with unbalanced, symmetrical marginals. Second, a lower agreement for coffee drinking with respect to tobacco smoking has often been found in reliability studies, especially when long intervals occurred between the two interviews. For instance, an ICC of 0.70 for the number of cigarettes smoked per day and 0.49 for the number of cups of coffee drunk per day was found among patients recruited in a hospital-based case–control study when the two interviews took place at least one year apart [23]. This may be due to more variation in the number of cups of coffee usually drunk than in the number of cigarettes smoked daily. Third, collecting data on coffee consumption may require more care over the questions included in the questionnaire than is usually believed. In fact, in the first interview only consumption of ‘pure’ coffee was asked about, while in the second interview some new questions concerning hot beverages other than ‘pure’ coffee were asked, since some Italians often drink ‘milk and coffee’, which contains the same amount of ‘pure’ coffee. When the consumption of ‘milk and coffee’ was also taken into account, some people were classified differently depending on whether the first or the second interview was used: some of those who claimed to be non-drinkers in the first interview reported some cups of ‘milk and coffee’ in the second, and were re-classified as current drinkers. Likewise, some subjects were re-classified in a higher category of coffee drinking on the basis of the second questionnaire, with respect to the first. Therefore, the first interview underestimated true coffee consumption in some subjects. This underestimation was found in all the subgroups, including both cases and controls, males and females, and was independent of the interviewer. In order to assess reliability, we used the same questions in the two interviews, and therefore took into account only the consumption of ‘pure’ coffee from both. By contrast, we estimated the OR of bladder cancer for coffee drinking from the second interview using the complete data on both coffee and ‘milk and coffee’ drinking, since they were likely to better estimate the subjects’ true consumption.
A comparison between the OR estimates from the first interview (pure coffee only) and the second does not, however, show any substantial differences. For instance, in males the OR estimates for 1–2, 3–4 and 5+ cups/day were 2.6, 2.5 and 9.1 when using data from the first interview, and 2.5, 2.1 and 5.2 when using the complete data from the second interview (ORs adjusted for age, residence, education and smoking by logistic regression). When we assessed separately the ORs from the first and the second interview according to the person who performed the re-interview (Table 6), we observed a reduction in the ORs for coffee drinking among those interviewed first by A and then by B, but not among those interviewed twice by A.
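Returning to the first of the factors listed above, the sensitivity of K to unbalanced marginals can be illustrated numerically. The sketch below computes Cohen's kappa for two hypothetical 2 × 2 tables with the same observed agreement (90%) but different marginal distributions. The counts and the function name `cohen_kappa` are invented for the example and are not data from this study.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table.

    Rows are categories at the first interview, columns are categories
    at the second interview; cells hold subject counts.
    """
    size = len(table)
    if any(len(row) != size for row in table):
        raise ValueError("table must be square")
    n = sum(sum(row) for row in table)
    if n == 0:
        raise ValueError("table must contain at least one subject")
    # Observed proportion of agreement: the diagonal cells
    observed = sum(table[i][i] for i in range(size)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(size)) for j in range(size)]
    # Agreement expected by chance from the marginal totals
    expected = sum(row_totals[i] * col_totals[i] for i in range(size)) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical tables, both with 90 of 100 subjects classified identically
balanced = [[45, 5], [5, 45]]    # marginals 50/50 at both interviews
unbalanced = [[85, 5], [5, 5]]   # marginals 90/10 at both interviews

print(round(cohen_kappa(balanced), 2))
print(round(cohen_kappa(unbalanced), 2))
```

Although both tables show identical observed agreement, chance-expected agreement rises from 0.50 to 0.82 when the marginals become unbalanced, so kappa falls from about 0.80 to about 0.44. A lower K for an exposure with a skewed distribution therefore need not reflect poorer recall.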

It is noteworthy that data regarding coffee drinking showed a lower reliability than those on cigarette smoking in this study, as in others. As a consequence of this relatively low reliability, the corrected (‘true’) ORs for coffee consumption were somewhat higher than the ORs found by using data from the first interview, which are usually the only data available in case–control or cohort studies. This means that the true association between coffee drinking and human pathologies may have been underestimated by studies that used only one interview to classify subjects. Based on these findings, the results of some meta-analyses on the association between coffee drinking and human diseases such as bladder cancer [6], which do not take account of reliability problems, should be reconsidered.
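To make the effect of imperfect reliability on an odds ratio concrete, the sketch below applies the classical regression-dilution correction, in which the observed log odds ratio is divided by the reliability coefficient. This is an assumption made purely for illustration: it holds only approximately, for a continuous exposure with non-differential measurement error, and it is not necessarily the correction procedure used in this study. The function name `correct_odds_ratio` and the input values are hypothetical.

```python
import math


def correct_odds_ratio(observed_or, reliability):
    """Approximate de-attenuated odds ratio.

    Assumes classical, non-differential measurement error, so that
    log(OR_true) is roughly log(OR_observed) / reliability.
    """
    if observed_or <= 0:
        raise ValueError("odds ratio must be positive")
    if not 0 < reliability <= 1:
        raise ValueError("reliability must lie in (0, 1]")
    return math.exp(math.log(observed_or) / reliability)


# Hypothetical inputs: an observed OR of 2.0 at three reliability levels
for r in (1.0, 0.8, 0.5):
    print(r, round(correct_odds_ratio(2.0, r), 2))
```

Under this assumption, perfect reliability leaves the odds ratio unchanged, while a reliability of 0.5 doubles the log odds ratio, so an observed OR of 2.0 corresponds to a corrected OR of about 4.0. The lower the reliability, the further an observed OR above 1 is moved away from the null.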

Some studies have shown that people have a tendency in telephone interviews to underreport their current smoking and overreport quitting [24, 25]. A recent reliability study performed in Italy showed a substantially good agreement for smoking habits, coffee and alcohol drinking between in-hospital interviews and telephone re-interviews, but it also showed a systematic tendency to report lower tobacco, coffee and alcohol consumption in the hospital setting [26]. In our study, we found no systematic bias when comparing the personal and telephone interviews as regards cigarette smoking, in agreement with others [27]. In fact, when considering the data collected by interviewer A only, no relevant changes in the OR estimates were found as regards cigarette smoking and coffee drinking using data from either the first or the second interview (see Table 6).

Conclusions

This study showed a fairly high reliability of data on present and past smoking habits, and a lower reliability of data on coffee drinking, with only modest differences due to age at interview and status of incident or prevalent case. The relatively low reliability of data on coffee drinking may lead epidemiological studies based on data collected through personal interviews to underestimate the true measures of association (odds ratios or relative risks).

References

1. Armstrong BK, White E, Saracci R. Principles of exposure measurement in epidemiology. New York: Oxford Medical Publications, 1992.

2. Dunn G. Design and analysis of reliability studies. New York: Oxford University Press, 1989.

3. Fleiss JL. Statistical methods for rates and proportions. New York: John Wiley and Sons, 1981.

4. Maclure M, Willett WC. Misinterpretations and misuse of the kappa statistic. Am J Epidemiol 1987; 126: 161–169.

5. Muller R, Buttner P. A critical discussion of intraclass correlation coefficients. Stat Med 1994; 13: 2465–2476.

6. Viscoli CM, Lachs MS, Horwitz RI. Bladder cancer and coffee drinking: A summary of case–control research. Lancet 1993; 341: 1432–1437.

7. Porru S, Aulenti V, Donato F, Boffetta P, Fazioli R, Cosciani Cunico S, Alessio L. Bladder cancer and occupation: A case control study in northern Italy. Occup Environ Med 1995; 53: 6–10.

8. Donato F, Boffetta P, Fazioli R, Aulenti V, Gelatti U, Porru S. Bladder cancer, tobacco smoking, coffee and alcohol drinking in Brescia, northern Italy. Eur J Epidemiol 1997; 13: 795–800.

9. Tammemagi MC, Frank JW, Leblank M, Artsob H, Streiner DL. Methodological issues in assessing reproducibility – a comparative study of various indices of reproducibility applied to repeat ELISA serologic tests for Lyme disease. J Clin Epidemiol 1995; 48: 1123–1132.

10. Thompson WG, Walter DW. A reappraisal of the kappa coefficient. J Clin Epidemiol 1988; 41: 949–958.

11. Feinstein AR, Cicchetti DV. High agreement but low kappa: I. The problems of two paradoxes. J Clin Epidemiol 1990; 43: 543–549.

12. Agresti A. Categorical data analysis. New York: John Wiley and Sons, 1990.

13. Brennan P, Silman A. Statistical methods for assessing observer variability in clinical measures. Br Med J 1992; 304: 1491–1494.

14. Armitage P, Berry G. Statistical methods in medical research. London: Blackwell Scientific Publications, 1994.

15. Coughlin SC, Pickle LW, Goodman MT, Wilkens LR. The logistic modeling of interobserver agreement. J Clin Epidemiol 1992; 45: 1237–1241.

16. Stein AD, Courval JM, Lederman RI, Shea S. Reproducibility of responses to telephone interviews: Demographic predictors to discordance in risk factor status. Am J Epidemiol 1995; 141: 1097–1106.

17. Armstrong BK, White E, Saracci R. Principles of exposure measurement in epidemiology. Oxford, UK: Oxford Medical Publications, 1992: 121–123.

18. Bashir SA, Duffy SW, Qizilbash N. Repeat measurement of case-control data: Corrections for measurement error in a study of ischaemic stroke and haemostatic factors. Int J Epidemiol 1997; 26: 64–70.

19. Duffy SW, Maximovitch DM, Day NE. External validation, repeat determination, and precision of risk estimation in misclassified exposure data in epidemiology. J Epidemiol Community Health 1992; 46: 620–624.

20. Dixon WJ, Brown MB, Engelman L, Jennrich NI. BMDP statistical software manual. Los Angeles, CA: University of California Press, 1993.

21. Lindsted KD, Kuzma JW. Reliability of eight-year recall in cancer cases and controls. Epidemiology 1990; 1: 392–401.

22. Cumming RG, Klineberg RJ. A study of the reproducibility of long-term recall in the elderly. Epidemiology 1994; 5: 116–119.

23. Kelly JP, Rosenberg L, Kaufman DK, Shapiro S. Reliability of personal interview data in a hospital-based case–control study. Am J Epidemiol 1990; 131: 79–90.

24. Luepker RV, Pallonen UE, Murray DM, Pirie PL. Validity of telephone surveys in assessing cigarette smoking in young adults. Am J Public Health 1989; 79: 202–204.

25. Bowlin SJ, Morrill BD, Nafziger AN, et al. Validity of cardiovascular disease risk factors assessed by telephone survey: The Behavioral Risk Factor Survey. J Clin Epidemiol 1992; 46: 561–572.

26. D’Avanzo B, La Vecchia C, Katsouyanni K, Negri E, Trichopoulos D. Reliability of information on cigarette smoking and beverage consumption provided by hospital controls. Epidemiology 1996; 7: 312–315.

27. Jackson C, Jatulis DE, Fortmann SP. The Behavioral Risk Factor Survey and the Stanford Five-City Project Survey: A comparison of cardiovascular risk behavior estimates. Am J Pub Health 1992; 82: 412–416.

Address for correspondence: Francesco Donato, Cattedra di Igiene, Universita di Brescia, Via Valsabbina 19, 25124 Brescia, Italy
Phone: + 39 30 3838601; Fax: + 39 30 3701404
E-mail: nardi6master.cci.unibs.it