session 5_introduction to epidemiological study design


Upload: soraya-alencar

Post on 01-Nov-2014


SESSION 5

INTRODUCTION TO EPIDEMIOLOGICAL STUDY DESIGN

Objectives

At the end of this session students should be able to:
• describe the major study designs used in epidemiological research
• identify the study design used in a particular epidemiological investigation

See Hennekens and Buring Chapter 2 and the list of references at the end of the session.

Exposures and outcomes

Before looking at study design we need to consider what is being studied. In an epidemiological study there is:
(a) the outcome of interest
(b) the primary exposure (or risk factor) of interest
(c) other exposures that may influence the outcome (potential confounders)

As we saw in the first lecture, in epidemiology the term ‘exposure’ is used in a very broad sense. It is not necessarily limited to an environmental hazard such as air pollution or a chemical and may be something as simple as age. It may even be a genetic factor such as a blood group or sickle cell trait.

The primary exposure of interest is the one which is included in the hypothesis. For example, if the hypothesis is that aflatoxin (a toxin produced by a mould which may grow on peanuts) causes liver cancer, then aflatoxin is the primary exposure of interest. If the hypothesis is that individuals' ability to metabolise aflatoxin determines their risk of liver cancer, then the metabolic enzyme phenotype or genotype is the primary exposure.

There may be more than one exposure. If, for example, a study has been set up to examine the hypothesis that alcohol is a cause of lung cancer independent of smoking, then one clearly has to measure smoking exposure as well as alcohol consumption. Here smoking is a ‘potential confounder’ which may ‘get in the way’ when studying the relationship of alcohol consumption with lung cancer.

The concept of confounding will be discussed in detail in a later session. However, awareness of the existence of confounding is crucial to any discussion of epidemiological studies. Indeed, many of the study designs are specifically built around the control of confounding. Briefly, for a factor to be regarded as a confounder, the rules are:


1. The factor must be associated with the exposure being investigated, and
2. The factor must be independently associated with the risk of developing the outcome of interest

For example, if in the study of alcohol and lung cancer an association were observed between high alcohol intake and risk of lung cancer, then that association may in part (or wholly) be due to the fact that people with high alcohol consumption are more likely to be smokers (and smoking is a risk factor for lung cancer). Thus in assessing the relationship of lung cancer with alcohol, smoking status would have to be taken into account. In this context smoking would be considered to be a confounding factor (confounder). In diagrammatic terms, this looks like:

    EXPOSURE (e.g. alcohol) -------- DISEASE (e.g. lung cancer)
                  \                    /
                   \                  /
                CONFOUNDER (e.g. smoking)

We will return to this in detail in the session on confounding.

‘Outcome’ is also a broad term. Death is an easily defined and important outcome. A specific disease, or even a state of health, may also be the outcome. In some studies there may be multiple outcomes. For example, degrees of severity may be important. In a study of malaria the outcomes could be asymptomatic infection, fever with a positive blood slide and/or cerebral malaria.

A final point to make here is that a factor may be an outcome in one study and an exposure in another. For example, low birth weight (<2500 g) can be the outcome of interest in a study investigating determinants of poor fetal growth, and the primary exposure in another investigating the effect of poor fetal growth on mortality later in life.

The first step in an epidemiological design is to define the hypothesis that you wish to test. This should be done in a way that makes clear what are the primary outcome(s), exposure(s), and potential confounders in the study. One must then choose the most appropriate design.
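
The alcohol, smoking and lung cancer example from this section can be made concrete with a small numerical sketch. The counts below are hypothetical, invented purely for illustration (they are not from any study), and the names `Group`, `risk_ratio` and `pool` are assumptions of this sketch. The counts are chosen so that both confounder rules hold: smokers are over-represented among people with high alcohol intake, and smokers have a higher risk of lung cancer regardless of alcohol intake.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    """Number of people in a group and how many developed the outcome."""
    cases: int
    total: int

    @property
    def risk(self) -> float:
        return self.cases / self.total


def risk_ratio(exposed: Group, unexposed: Group) -> float:
    """Risk in the exposed divided by risk in the unexposed."""
    return exposed.risk / unexposed.risk


def pool(groups: list[Group]) -> Group:
    """Collapse several groups into one, ignoring the stratifying factor."""
    return Group(sum(g.cases for g in groups), sum(g.total for g in groups))


# Hypothetical counts, invented for illustration only.
# Each stratum of the potential confounder (smoking) holds
# (high alcohol intake, low alcohol intake).
strata = {
    "smokers": (Group(cases=80, total=800), Group(cases=20, total=200)),
    "non-smokers": (Group(cases=2, total=200), Group(cases=8, total=800)),
}

# Crude comparison: smoking status is ignored.
crude_rr = risk_ratio(
    pool([high for high, _ in strata.values()]),
    pool([low for _, low in strata.values()]),
)

# Stratified comparison: alcohol groups are compared within each smoking stratum.
stratum_rr = {name: risk_ratio(high, low) for name, (high, low) in strata.items()}

print(f"Crude risk ratio (smoking ignored): {crude_rr:.2f}")
for name, rr in stratum_rr.items():
    print(f"Risk ratio among {name}: {rr:.2f}")
```

With these invented counts the crude comparison suggests a strong association between alcohol and lung cancer, while within each smoking stratum the risk is the same for both alcohol groups. This is the sense in which smoking "gets in the way" when it is not taken into account.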


Study Designs

As shown in the diagram below, study designs can be split into observational (or non-experimental) and intervention (or experimental).¹

¹ Although most textbooks agree on this division, please note that there is variation in the classification system within these broad groups.


EPIDEMIOLOGICAL STUDIES

    OBSERVATIONAL (NON-EXPERIMENTAL)
        DATA FROM GROUPS
            DESCRIPTIVE (a)
            ANALYTIC: ECOLOGICAL STUDY (b)
        DATA FROM INDIVIDUALS
            DESCRIPTIVE or ANALYTIC: CROSS-SECTIONAL STUDY (c), COHORT STUDY (d)
            ANALYTIC: CASE-CONTROL STUDY (e)

    INTERVENTION (EXPERIMENTAL)
        DATA FROM GROUPS
            COMMUNITY TRIAL (f)
        DATA FROM INDIVIDUALS
            CLINICAL TRIAL, INDIVIDUAL FIELD TRIAL (g)


Observational studies

As discussed in the first session, observational studies collect information on events over which we have no control. We are simply observing what is happening, or what happened in the past.

A useful way to classify observational studies is firstly to group them into those where data are collected from populations or groups (aggregated data) and those where data are collected from individuals. Secondly, we can split the designs into those where the outcome of interest is described with no reference to exposure (descriptive studies) and those where exposure, and its association with the outcome of interest, is considered (analytical studies). This is a useful way to classify studies, but sometimes there may be overlap: some cross-sectional studies and cohort studies may have both a descriptive and an analytical component (see the diagram of study designs above).

(a) Descriptive studies using grouped (aggregated) data

These studies examine disease occurrence by age, sex, region, time period (Person, Place, Time). No exposure data are examined. They often make use of routinely-collected data such as national mortality or cancer incidence rates (see session on Vital Statistics).

(b) Ecological studies

These studies describe disease (or outcome) in the population or group as in (a) above, but also include information on exposure. They are thus analytical studies. The average (or other summary statistic) exposure of a population is plotted against the rate of the outcome for that population. This is done for several populations and the data are then examined for evidence of an association between exposure and outcome. For example, the average intake of salt in a population can be compared to the mortality rate from stroke. This can be done for several different countries and the data examined for a relationship between the salt intake of a population and rate of disease (often done using a scatter plot).
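
As a rough illustration of the ecological analysis just described, the sketch below computes a correlation coefficient and a regression slope from one data point per population. The five "countries" and their values are hypothetical, made up for this example, and the function names `pearson_r` and `regression_slope` are assumptions of the sketch rather than part of any particular package.

```python
from math import sqrt

# Hypothetical population-level data, invented for illustration only.
# One pair per country: (average salt intake in g/day,
#                        stroke deaths per 100,000 per year).
populations = {
    "Country A": (6.0, 40.0),
    "Country B": (8.0, 55.0),
    "Country C": (10.0, 60.0),
    "Country D": (12.0, 80.0),
    "Country E": (14.0, 90.0),
}


def _centred_sums(xs, ys):
    """Return the sums of cross-products and squares about the means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy, sxx, syy


def pearson_r(xs, ys) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    sxy, sxx, syy = _centred_sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def regression_slope(xs, ys) -> float:
    """Least-squares slope of y on x (change in y per unit change in x)."""
    sxy, sxx, _ = _centred_sums(xs, ys)
    return sxy / sxx


salt = [exposure for exposure, _ in populations.values()]
stroke = [rate for _, rate in populations.values()]

r = pearson_r(salt, stroke)
slope = regression_slope(salt, stroke)

print(f"Correlation between population salt intake and stroke mortality: {r:.3f}")
print(f"Extra stroke deaths per 100,000 per additional g/day of salt: {slope:.2f}")
```

Each data point here is a whole population, so the resulting coefficient describes populations, not individuals within them.
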
It is important to understand that the only conclusion one can draw relates to the population. It is not possible from an ecological study to draw conclusions about exposure in the individual and the risk of the outcome. Attempting to do this is referred to as the ‘ecological fallacy’ (see session on ecological studies).

(c) Cross-sectional studies

Cross-sectional studies collect information from individuals to measure prevalence at one point in time. The prevalence of the outcome can be measured without reference to exposure (for example the % of LSHTM students suffering a headache on a particular day), or the prevalence of exposure may be measured without reference to disease (for example the % of persons aged under 18 years in London who have used illegal drugs in the past week). These studies are classified as descriptive because outcome is not reported in relation to exposure, or the exposure is not reported in relation to the outcome. But if the prevalence of disease is measured in those with, and without, the exposure of interest, the study is classified as analytical.

Cross-sectional studies are relatively simple to conduct, take only a short time and are relatively cheap. For these reasons they are frequently used for planning purposes. However, they have the major drawback that they can be difficult to interpret, as it is not possible to answer whether the outcome followed the exposure in time or the exposure resulted from the outcome. For example, schizophrenia will often result in an inability to hold down a job or may impair the ability of the person to hold a responsible job. In this situation a cross-sectional study might show that people with the disease have lower socio-economic status. This could be misinterpreted as meaning that being in a lower social class increases the risk of schizophrenia when in fact the occupational status was determined by the disease.

In reality one would not use a cross-sectional study to examine this question at all because the disease is too uncommon. It would be necessary to survey a very large population in order to have sufficient people with the outcome to draw any conclusions. In general, therefore, cross-sectional surveys are used to estimate the prevalence of common conditions of a reasonably long duration or to determine the distribution of continuous variables within a population. They are particularly efficient for the latter situation.

(d) Cohort studies

The starting point in cohort studies is the definition of a group of people by their exposure status. For example, one might take a group of smokers and a group of non-smokers. These are then followed up over time to see which ones develop a disease or condition, such as gangrene of the leg or a condition requiring an amputation. Rates of disease in the exposed and unexposed groups can then be calculated. Since the definition of the study group is by exposure, one could also look at the incidence of chronic bronchitis, lung cancer, depression etc. in the study. Cohort studies thus have the ability to look at multiple outcomes. If they are well designed they also ensure that the exposure precedes the outcome.

Cohort studies usually compare rates of disease in groups with different exposures, and are thus analytical studies. But rates of disease can be calculated for a single group (for example an occupational group) and presented as a descriptive study (with no analysis of exposure).
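
The comparison of disease occurrence between exposed and unexposed groups in a cohort can be sketched as follows. All numbers are hypothetical, invented for illustration, and the class and field names (`CohortGroup`, `person_years` and so on) are assumptions of this sketch. It computes risks (cases per person at the start of follow-up) and rates (cases per person-year of follow-up), then the ratio and difference measures that compare the two groups.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CohortGroup:
    """Follow-up summary for one exposure group in a cohort study."""
    people: int          # number of people at the start of follow-up
    cases: int           # number who developed the outcome during follow-up
    person_years: float  # total time at risk contributed by the group

    @property
    def risk(self) -> float:
        """Proportion of the group developing the outcome."""
        return self.cases / self.people

    @property
    def rate(self) -> float:
        """Cases per person-year at risk."""
        return self.cases / self.person_years


# Hypothetical follow-up data, invented for illustration only.
smokers = CohortGroup(people=1000, cases=30, person_years=4800.0)
non_smokers = CohortGroup(people=2000, cases=20, person_years=9900.0)

risk_ratio = smokers.risk / non_smokers.risk
risk_difference = smokers.risk - non_smokers.risk
rate_ratio = smokers.rate / non_smokers.rate

print(f"Risk in smokers: {smokers.risk:.3f}, in non-smokers: {non_smokers.risk:.3f}")
print(f"Risk ratio: {risk_ratio:.2f}")
print(f"Risk difference: {risk_difference:.3f}")
print(f"Rate ratio: {rate_ratio:.2f}")
```

Because the groups are defined by exposure, the same calculation could be repeated for any other outcome recorded during follow-up.
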
Cohort studies will be discussed further in a later session (and accompanying nested case-control studies in the session on case-control studies).

(e) Case-control studies

The starting point in case-control studies is the definition of a group of people with a particular disease or condition. Suitable controls without the disease, and representing the population from which the cases originated, are also selected. Information on the prevalence of past exposure in cases and controls is then collected. Thus in a study of deaths from respiratory infection the cases could be children who died from pneumonia. Controls might be healthy children of the same age. If you were interested in household crowding as a risk factor (exposure), information would be gathered on how many other children the case children, and the control children, shared a room with. The odds of exposure in cases is then compared with the odds of exposure in controls.

Although a case-control study is limited to one outcome, many different exposures can be measured and the combined effects of different exposures examined. Case-control studies are always analytical. A later session will deal with this design in more detail.

Intervention studies

In intervention (or experimental) studies, we (the researchers) deliberately allocate the "exposure" to individuals or communities. As discussed in the Introductory session, for ethical reasons we cannot expose people to factors which might increase the risk of disease or death, but we can intervene to reduce exposure (e.g. to smoking) or to allocate alternative treatments. The preferred form of intervention study is the randomised controlled trial, in which the intervention (or exposure) is randomly assigned at either the group level (a community trial: box (f) in the diagram of study designs above) or at the individual level (a clinical trial or individual field trial: box (g) in the same diagram).

All intervention studies are analytical since they all study the effect of exposures. To measure the effect of the exposure, the outcomes of interest are measured and compared between the different exposure groups. Intervention studies will be discussed in a later session.

Choice of Study Design

If the outcome of interest is a rare disease (e.g. childhood leukaemia) the choice would probably be between an ecological and a case-control study (see Table 1). If the exposure is known to be rare, an ecological or a cohort study would be the most appropriate study design. Cohort studies are also favoured for the investigation of multiple outcomes, obtaining measures of incidence, and detailed investigation of time sequences between exposure and disease onset. Case-control studies are able to examine multiple exposures.

The strengths and weaknesses of the various types of study must also be used to decide which design to use (see Table 2). If little is known about the outcome of interest, or the factors affecting it, it is sensible to start with a descriptive or ecological study before embarking on a more costly and time-consuming case-control or cohort study. Cohort studies are heavy on both time and money. Bias (selection and information), confounding and loss to follow-up are important weaknesses, and will be dealt with in later lectures.

Intervention studies provide the strongest evidence with which to test hypotheses.
Randomisation of people or whole communities to exposures ensures that many of the problems of confounding are overcome. However, intervention studies are not the most common design, mainly because of ethical considerations. They are used where there are grounds to believe that the "exposure" will provide potential benefit to individuals, for example a vaccination or nutritional supplement program or an alternative treatment for a specific condition.

Table 1: Applications of different observational and analytical study designs (number of crosses indicates usefulness)

                                      Ecological  Cross-sectional  Case-control  Cohort
Investigation of rare disease         ++++        -                +++++         -
Investigation of rare exposures       ++          -                -             +++++
Examining multiple outcomes           +           ++               -             +++++
Studying multiple exposures           ++          ++               ++++          +++
Measurement of time relationship      +           -                +             +++++
  between exposure and outcome
Direct measurement of incidence       -           -                +             +++++
Investigation of long latent periods  -           -                +++           +++*

* If historical (see later session on Cohort studies)


Table 2: Strengths and weaknesses of different observational analytic study designs

                        Ecological  Cross-sectional  Case-control  Cohort
Probability of:
  selection bias        NA          medium           high          low
  information bias      NA          high             high          low
  loss to follow-up     NA          NA               low           high
  confounding           high        medium           medium        low
Time required           low         medium           medium        high
Cost                    low         medium           medium        high

NA = Not applicable

Measures of disease (outcome) and measures of exposure effect

Different types of analytic study lead to (or use) different measures of disease occurrence, and thus have different measures of the effect of exposure. Table 3 summarises the main points. Further details will be discussed in later lectures throughout the year.

Table 3: Measures of disease occurrence and exposure effect in analytic study designs

Ecological
    Measure of disease (outcome) occurrence: Rate, Risk, Prevalence
    Measure of exposure effect: Correlation or Regression Coefficient

Cross-sectional
    Measure of disease (outcome) occurrence: Prevalence
    Measure of exposure effect: Prevalence Ratio, Prevalence Difference, Odds Ratio

Cohort
    Measure of disease (outcome) occurrence: Rate, Risk, Odds, Mean or Median
    Measure of exposure effect: Rate Ratio, Risk Ratio, Odds Ratio, Rate Difference, Risk Difference, Vaccine Efficacy, Difference between Means or Medians

Case-control
    Measure of disease (outcome) occurrence: None¹
    Measure of exposure effect: Odds Ratio, Vaccine Efficacy

Intervention
    Measure of disease (outcome) occurrence: Rate, Risk, Odds, Mean or Median
    Measure of exposure effect: Rate Ratio, Risk Ratio, Odds Ratio, Rate Difference, Risk Difference, Vaccine Efficacy, Difference between Means or Medians

¹ Unless the sampling fraction is known for both cases and controls; i.e. unless the proportion of cases and proportion of controls sampled from the population is known.
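
To illustrate why a case-control study yields an odds ratio but no direct measure of disease occurrence, here is a minimal sketch loosely based on the household crowding example earlier in the session. The counts are hypothetical, invented for illustration, and the function names `odds` and `odds_ratio` are assumptions of this sketch.

```python
def odds(exposed: int, unexposed: int) -> float:
    """Odds of exposure within one group: exposed divided by unexposed."""
    return exposed / unexposed


def odds_ratio(cases_exposed: int, cases_unexposed: int,
               controls_exposed: int, controls_unexposed: int) -> float:
    """Odds of exposure in cases divided by odds of exposure in controls."""
    return odds(cases_exposed, cases_unexposed) / odds(controls_exposed, controls_unexposed)


# Hypothetical counts, invented for illustration only.
# Exposure = living in a crowded household (by whatever definition the study uses).
cases_exposed, cases_unexposed = 60, 40          # 100 children who died from pneumonia
controls_exposed, controls_unexposed = 60, 140   # 200 healthy children of the same age

or_crowding = odds_ratio(cases_exposed, cases_unexposed,
                         controls_exposed, controls_unexposed)

print(f"Odds of exposure in cases: {odds(cases_exposed, cases_unexposed):.2f}")
print(f"Odds of exposure in controls: {odds(controls_exposed, controls_unexposed):.2f}")
print(f"Odds ratio: {or_crowding:.2f}")

# Note: the ratio of cases to controls (100:200 here) is fixed by the
# investigator, so the proportion of "diseased" in this sample says nothing
# about risk or prevalence in the population.
```

The numbers of cases and controls are chosen by the investigator, which is why the table lists no measure of disease occurrence for this design unless the sampling fractions are known.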

References

The references below are for the whole of the block on different study designs.

General texts
dos Santos Silva I. Cancer Epidemiology: Principles and Practice. IARC, Lyon (France), 1998.
Beaglehole R, Bonita R, Kjellstrom T. Basic Epidemiology. WHO, 1993.
Kelsey JL, Thompson WD, Evans AS. Methods in Observational Epidemiology. Oxford University Press, 1986.
Rothman KJ. Modern Epidemiology. Little Brown & Co, 1986.

Cohort studies
Breslow N, Day N. Statistical methods in cancer research. Volume 2: Analysis of cohort studies. International Agency for Research on Cancer.

Case-control studies
Schlesselman JJ. Case control studies. Design, conduct and analysis. Oxford University Press, 1982.
Breslow N, Day N. Statistical methods in cancer research. Volume 1: Analysis of case-control studies. International Agency for Research on Cancer, 1980.

Intervention studies
Armitage P. Sequential medical trials. 2nd edition. Blackwell Scientific Publications, Oxford, 1975.
Peto R et al. Design and analysis of randomised trials requiring prolonged observation of each patient. I. Introduction and design. British J Cancer 1976;34:585-612. II. Analysis and examples. British J Cancer 1977;35:1-39.
Pocock SJ. Clinical trials: a practical approach. John Wiley & Sons, Chichester, 1983.
Smith PG, Morrow RH (eds). Methods for field trials of interventions against tropical diseases: a toolbox. Oxford University Press, 1991.

Ethics
Beauchamp TL, Childress JF. Principles of Biomedical Ethics. 2nd edition. Oxford University Press, 1983.
Levine RJ. Ethics and regulation of clinical research. 2nd edition. Urban & Schwarzenberg, 1986.