Race and Ethnicity in Genetic Epidemiology
Neil Risch
Does Race/Ethnicity Matter?
• Editorial, New England Journal of Medicine:
– “Race is biologically meaningless.”
• Nature Genetics Editorial:
– “Commonly used ethnic labels are both insufficient and inaccurate representations of inferred genetic clusters.”
– “Genetic data … show that any two individuals within a particular population are as different genetically as any two people selected from any two populations in the world.”
Does Race/Ethnicity Matter?
• Jack Kemp:
– “The human genome project shows there is no genetic way to tell the races apart. For scientific purposes, race doesn’t exist.”
• President Bill Clinton:
– “All the schoolchildren will soon be learning in their biology classes that all the people in the world – all the people in the world, in terms of their genetic makeup, scientifically, are 99.9% the same. The Serbs, the Albanians, the Irish, the Latins, the Asians.”
Does Race/Ethnicity Matter?
• J. Craig Venter:
– “It is disturbing to see reputable scientists and physicians even categorizing things in terms of race … there is no basis in the genetic code for race.”
Does Race/Ethnicity Matter?
• Eric Lander (Nova Interview):
– “The genetic difference between any two people, whether it’s a Sumo wrestler or a Sports Illustrated bathing suit model – one tenth of a percent. Those two, and any two people on this planet, are 99.9% identical at the DNA level.”
Does Race/Ethnicity Matter?
• Eric Lander (continued):
– “So race is not a very helpful category to a geneticist, because it’s focusing on a fairly small number of genes that describe appearance. But if we’re talking about the 30,000 genes that run the human symphony, that’s a tapestry that weaves through every population. That’s why geneticists really don’t think race is a terribly helpful concept.”
– “But then to define all the human variation on top of it, we sequenced millions and millions of DNA segments from a worldwide sample of 24 people: Pacific Islanders, Asians, Africans, Americans.”
Does Race/Ethnicity Matter?
• Haga and Venter (Science, July 2003):
– “We are concerned that applying antiquated labels to the analysis and interpretation of scientific data could result in misleading and biologically meaningless conclusions.”
Does Race/Ethnicity Matter?
• Shields et al (Am Psychol, 2005):
– “The authors examine the history of racial categories, current research practices, and arguments for and against using race variables in genetic analyses. The authors argue that the sociopolitical constructs appropriate for monitoring health disparities are not appropriate for use in genetic studies investigating the etiology of complex diseases.”
What is the evidence regarding genetic structure and race?
Results from Population Genetics Studies
• Bowcock et al, Nature, 1994:
– 30 microsatellite loci
– 14 populations, 148 subjects:
• African: CAR pygmy, Zaire pygmy, Lisongo
• Caucasian: Northern European, Italians
• Oceania: Melanesian, New Guinean, Australian
• East Asia: Chinese, Japanese, Cambodian
• Americas: Maya, Surui, Karitiana
Calafell et al, Eur J Hum Genet, 1998
• 45 microsatellite loci
• 10 populations, 504 subjects
– African: CAR pygmy, Zaire pygmy
– Caucasian: Dane, Druze
– Oceania: Melanesian (Nasioi)
– East Asia: Chinese, Japanese, Yakut
– Americas: Maya, Surui
Unpublished data (Collaboration with Ken and Judy Kidd)
• 49 SNPs in 14 loci
• 33 populations, 1716 subjects
– African: Biaka, Mbuti, Yoruba, Ibo, Hausa, Ethiopia, African American
– Caucasian: Yemen, Druze, Samaritan, Adygei, Russia, Finn, Dane, Irish, European American
– Oceania: Nasioi, Micronesian
– East Asia: SF Chinese, Taiwan Chinese, Hakka, Ami, Atayal, Japanese, Cambodian, Yakut
– Americas: Cheyenne, AZ Pima, MX Pima, Maya, Ticuna, Surui, Karitiana
What is the evidence regarding genetic structure and race?
• How much correlation is there between self-identified race/ethnicity (SIRE) and genetic structure in the human population?
• Results from the Family Blood Pressure Program (FBPP)
FBPP
• Study of genetic and environmental determinants of hypertension in families
• Four networks, 15 field centers (collection sites), four major race/ethnicity groups: Caucasian (CAU), African American (AFR), East Asian (Chinese, Japanese) (EAS), Hispanic (Mexican American) (HIS)
• Our analysis includes one subject per family
FBPP
• Total of 3,636 individuals included (one per family)
• CAU 1349, 6 sites
• AFR 1308, 4 sites
• HIS 412, 1 site
• EAS 567 (407 CHI, 160 JAP), 5 sites
• 18 SIRE-site combinations total
FBPP
• Genome Screen STR markers, all typed at the NHLBI sponsored Mammalian Genotyping Service, Marshfield, Wisconsin (James Weber)
• Total number of markers included = 366.
Analysis
• Genetic Distances (Reynolds, 1983; Nei, 1978) between all pairs of SIRE-sites (18 x 17 / 2 = 153 comparisons)
• Multidimensional scaling (MDS) for two dimensional depiction of genetic distances
• Branching tree relating 18 SIRE-sites
• Genetic Cluster Analysis (GCA) using STRUCTURE on all 3,636 subjects (326 markers), comparison with SIRE
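The count of 153 comparisons is simply the number of unordered pairs among the 18 SIRE-site groups. A minimal sketch in Python; the group labels are placeholders, not the actual site names:

```python
from itertools import combinations

# Placeholder labels for the 18 SIRE-site combinations (hypothetical names).
sire_sites = [f"group_{i}" for i in range(1, 19)]

# One genetic distance is computed for every unordered pair of groups.
pairs = list(combinations(sire_sites, 2))

n = len(sire_sites)
n_pairs = n * (n - 1) // 2  # 18 x 17 / 2

print(len(pairs), n_pairs)
```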
Genetic Cluster Analysis: 4 Clusters
SIRE   Cluster A   Cluster B   Cluster C   Cluster D
CAU    1348        0           0           1
AFR    3           0           1305        0
HIS    1           0           0           411
CHI    0           407         0           0
JAP    0           160         0           0
Genetic Cluster Analysis: East Asians Alone
SIRE   Cluster A   Cluster B
CHI    405         2
JAP    4           156
GCA Classification versus SIRE
• Concordant: 3,631
• Discordant: 5
• Discordance Rate: .0014
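The concordance figures can be recomputed from the 4-cluster table. The sketch below assumes a subject counts as concordant when assigned to the cluster that holds the majority of his or her SIRE group; that definition is an assumption, not stated on the slides.

```python
# Counts from the 4-cluster table: rows are SIRE groups, columns are clusters A-D.
table = {
    "CAU": [1348, 0, 0, 1],
    "AFR": [3, 0, 1305, 0],
    "HIS": [1, 0, 0, 411],
    "CHI": [0, 407, 0, 0],
    "JAP": [0, 160, 0, 0],
}

# Assumed definition: concordant = falls in the majority cluster of the SIRE group.
concordant = sum(max(row) for row in table.values())
total = sum(sum(row) for row in table.values())
discordant = total - concordant
rate = discordant / total

print(concordant, discordant, round(rate, 4))
```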
Reynolds-Stanford-Kaiser Cardiovascular Disease Project
• Investigators:
– Stanford: Tom Quertermous, Mark Hlatky, Steve Fortmann, Rick Myers, Richard Olshen, Neil Risch
– Kaiser: Alan Go, Carlos Iribarren, Malini Chandra, Phenius Lathon
• Analysis by Analabha Basu
SELF-IDENTIFIED RACE ETHNICITIES
White (Caucasoid) 2281
Black (African-American) 438
Hispanic 197
Indian-Pakistani (South-Asian) 55
Asian/ Asian-American (East-Asian) 223
Native Hawaiian or Other Pacific Islander 9
American-Indian/Native American 2
Mixed-Hispanic 326
Mixed-Other 138
Overview of Genetic data
• 467 Markers (SNPs)
• 452 Autosomal Markers + 15 X-chromosomal Markers
• 77 Candidate Genes
• 73 on Autosomal Chromosomes + 4 on X-chromosome
Multidimensional Scaling (using Reynolds Distance)
South-Asians are with Hispanics
Multidimensional Scaling
Structure with 4 ancestral populations
Self-Identified     Inferred Clusters                  Number of
Population          1       2       3       4          Individuals
Caucasian           0.943   0.004   0.004   0.050      265
African-American    0.011   0.989   0.000   0.000      183
Hispanic            0.138   0.000   0.000   0.862      181
South-Asian         0.287   0.000   0.006   0.706      55
East-Asian          0.014   0.000   0.981   0.005      215
Structure with 5 ancestral populations
Self-Identified     Inferred Clusters                          Number of
Population          1       2       3       4       5          Individuals
Caucasoid           0.858   0.027   0.108   0.004   0.004      265
African-American    0.011   0.000   0.000   0.000   0.989      183
Hispanic            0.126   0.742   0.132   0.000   0.000      181
South-Asian         0.046   0.018   0.935   0.000   0.000      55
East-Asian          0.014   0.005   0.000   0.981   0.000      215
Analysis of Group Differences
• SIRE and GCA give nearly identical results with enough genetic markers
• Important environmental/social/cultural differences also exist between SIRE groups
• High correlation between SIRE and GCA leads to strong confounding between genetic and non-genetic factors when examining group differences in prevalence of diseases or traits
Analysis of Group Differences
• Ignoring the SIRE/GCA relationship (and avoiding SIRE, using GCA only) runs the risk of false inference of genetic explanations for group differences
• Distinguishing between genetic and non-genetic sources of group differences is best done within a single admixed group, but this depends on variation in admixture levels and is still possibly subject to residual correlation and confounding
Analysis of Individuals: Admixture Analysis
• Even though the four ethnic groups were easily separable based on genetic markers, African Americans and Latino Americans typically have ancestry from multiple continents. Using the same genetic markers, it is possible to estimate for each individual the proportions of ancestry, or individual ancestry (IA) from each continental/ancestral group.
Admixture Analysis - FBPP
• Estimation of ancestry requires genotypes of individuals representing the original indigenous ancestors. For our analyses, we included 1,378 unrelated Caucasians from the FBPP, 127 unrelated sub-Saharan Africans and 50 Native Americans from the World Diversity Panel.
Admixture Analysis - FBPP
• These various data sources shared 284 microsatellite markers from the Marshfield Screening Set 10, where all subjects were genotyped.
• IA estimates were obtained from the genetic cluster analysis program Structure (Pritchard et al).
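To illustrate what an individual ancestry estimate is, here is a minimal sketch. It is not the Structure program, which fits a Bayesian clustering model; instead it swaps in a simple grid-search maximum-likelihood estimate for one individual with two ancestral populations. The biallelic markers, the allele frequencies, and the function name `estimate_ancestry` are all hypothetical, chosen only for illustration.

```python
import math

def estimate_ancestry(genotypes, freq_a, freq_b, steps=100):
    """Grid-search maximum-likelihood estimate of the proportion of ancestry
    from population A, assuming two ancestral populations, known allele
    frequencies, and independent biallelic markers.

    genotypes: copies (0, 1 or 2) of the reference allele at each marker
    freq_a, freq_b: reference-allele frequency in populations A and B
    """
    best_alpha, best_loglik = 0.0, -math.inf
    for i in range(steps + 1):
        alpha = i / steps
        loglik = 0.0
        for g, pa, pb in zip(genotypes, freq_a, freq_b):
            # Expected allele frequency for an individual with ancestry alpha.
            q = alpha * pa + (1 - alpha) * pb
            loglik += g * math.log(q) + (2 - g) * math.log(1 - q)
        if loglik > best_loglik:
            best_alpha, best_loglik = alpha, loglik
    return best_alpha

# Hypothetical allele frequencies at 6 markers (illustrative values only).
freq_a = [0.9, 0.8, 0.9, 0.1, 0.2, 0.1]
freq_b = [0.1, 0.2, 0.1, 0.9, 0.8, 0.9]

# An individual heterozygous at every marker looks like an even mixture.
ia_mixed = estimate_ancestry([1, 1, 1, 1, 1, 1], freq_a, freq_b)
print(ia_mixed)
```

Markers whose allele frequencies differ strongly between the ancestral groups carry most of the information, which is why the estimate depends on having genotypes from representative ancestral samples.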
African Ancestry in African Americans
Ancestry in Mexican Americans from Starr County, Texas
Admixture Analysis
• Distinguishing between genetic and non-genetic sources of group differences can be examined within a single admixed population.
• Depends on variation in admixture levels within that population
• Examine correlation of individual ancestry (IA) with trait of interest (e.g. does blood pressure correlate with African ancestry?)
Admixture Analysis - FBPP
• 3,207 African Americans representing 1,801 sibships from 4 recruitment sites
• 1,506 Mexican Americans representing 453 sibships from 1 recruitment site
• Estimated IA and its correlation with blood pressure, hypertension, and BMI
Admixture Analysis – Blood Pressure and BMI
• For blood pressure and BMI, performed linear regression on estimated African IA for the African Americans (n=1424) and on African IA and Caucasian IA for the Mexican Americans (n=1122), adjusted for age, age², sex and field center. BMI was included as a covariate for blood pressure.
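A covariate-adjusted regression of this kind can be sketched as below. The data are synthetic (not FBPP data) and are built without noise and with a known IA slope of 5.0, so the fit can be checked. Field center is omitted for brevity, age and BMI are centered for numerical stability, and the helper names `solve` and `ols` are illustrative.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Synthetic subjects: columns are intercept, IA, age, age squared, sex, BMI.
random.seed(0)
rows, sbp = [], []
for _ in range(200):
    ia = random.uniform(0.5, 1.0)              # African individual ancestry
    age = (random.uniform(30, 70) - 50) / 10   # centered, in decades
    sex = random.choice([0, 1])
    bmi = random.uniform(20, 40) - 30          # centered BMI
    rows.append([1.0, ia, age, age ** 2, sex, bmi])
    sbp.append(120 + 5.0 * ia + 4.0 * age + 0.5 * age ** 2 - 3.0 * sex + 0.8 * bmi)

beta = ols(rows, sbp)
b_ia = beta[1]  # adjusted slope of SBP on African IA
print(round(b_ia, 3))
```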
African IA in hypertensives versus normotensives
Site      Group        Hypertensive            Normotensive            Delta    P value
                       Number   Mean (sd)      Number   Mean (sd)
Maywd     Afr. Amer.   49       .863 (.097)    141      .867 (.092)    -.004    .805
Jackson   Afr. Amer.   223      .851 (.123)    37       .827 (.113)    .024     .264
Forsyth   Afr. Amer.   144      .845 (.114)    47       .820 (.139)    .025     .225
Birming   Afr. Amer.   351      .881 (.086)    34       .860 (.102)    .021     .170
Starr     Mex. Amer.   101      .043 (.029)    161      .043 (.030)    .000     .89
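The Delta column is the difference in mean African IA between hypertensives and normotensives. The sketch below recomputes it from the tabulated summary statistics and adds a pooled-variance two-sample t statistic. The choice of test is an assumption, since the slides do not say how the P values were obtained, so the output is not expected to reproduce them exactly.

```python
import math

# (site, n_hyper, mean_hyper, sd_hyper, n_normo, mean_normo, sd_normo), from the table.
rows = [
    ("Maywd", 49, 0.863, 0.097, 141, 0.867, 0.092),
    ("Jackson", 223, 0.851, 0.123, 37, 0.827, 0.113),
    ("Forsyth", 144, 0.845, 0.114, 47, 0.820, 0.139),
    ("Birming", 351, 0.881, 0.086, 34, 0.860, 0.102),
    ("Starr", 101, 0.043, 0.029, 161, 0.043, 0.030),
]

def pooled_t(n1, m1, s1, n2, m2, s2):
    """Two-sample t statistic with pooled variance (an assumed test)."""
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

deltas = []
for site, n1, m1, s1, n2, m2, s2 in rows:
    delta = round(m1 - m2, 3)
    deltas.append(delta)
    t = pooled_t(n1, m1, s1, n2, m2, s2)
    print(f"{site:8s} delta={delta:+.3f} t={t:+.2f}")
```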
Results of ANOVA of African IA
Factor         df     Sum of Sq.   Mean Sq.   F value   P value
Site           3      .271         .093       8.269     .00002
Hypertension   1      .035         .035       3.185     .075
Resid.         1021   11.148       .011
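In an ANOVA table each mean square is the sum of squares divided by its degrees of freedom, and each F value is the factor's mean square divided by the residual mean square. A short sketch using the tabulated sums of squares; because those inputs are rounded to three decimals, the recomputed values agree with the table only approximately.

```python
# Sums of squares and degrees of freedom as printed in the ANOVA table.
ss_site, df_site = 0.271, 3
ss_hyp, df_hyp = 0.035, 1
ss_res, df_res = 11.148, 1021

ms_site = ss_site / df_site
ms_hyp = ss_hyp / df_hyp
ms_res = ss_res / df_res

# F statistic = factor mean square / residual mean square.
f_site = ms_site / ms_res
f_hyp = ms_hyp / ms_res

print(round(ms_res, 3), round(f_site, 2), round(f_hyp, 2))
```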
Linear Regression on African IA in African Americans
Outcome   b(IA)
SBP       5.4 (4.5)
DBP       3.0 (3.1)
MAP       6.2 (3.3)
BMI       4.0 (2.0)*
Regression in Mexican Americans on African and Caucasian IA
Outcome   b(IA) African   b(IA) Caucasian
SBP       9.5 (21.6)      -8.9 (5.8)
DBP       18.9 (10.0)*    -1.0 (2.6)
MAP       15.6 (12.6)     -3.9 (3.3)
BMI       3.9 (6.0)       4.3 (1.7)*
Admixture Analysis
• Caveat: Still possibly subject to residual correlation and confounding
• For example, within African Americans, discrimination may be related to both skin pigment and adverse health outcomes
• Skin pigment is likely to be genetically correlated with degree of European versus African ancestry
Admixture Mapping
• As opposed to ancestry estimates based on the entire genome, which may be confounded with non-genetic factors, ancestry at a specific genetic location is less likely to be so confounded
• The power of the method depends on how large the effect of an allele is on the trait, and the difference in the frequency of that allele between ancestral groups
Admixture Analysis
• If only a small number of genes contribute to an ethnic difference, the global ancestry estimate may be only poorly correlated with ancestry at those specific locations
• Therefore, locus-specific analysis might be more informative (admixture mapping)
Admixture Mapping
• If the admixture occurred recently in history (e.g. over the past 10 generations), then the ancestry excess will extend over large segments of the chromosome
• Thus, markers in the vicinity of the trait locus will also show excess ancestry from the population with the higher allele frequency
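One simple way to flag a marker with excess ancestry is to standardize locus-specific ancestry against its distribution across markers. The sketch below assumes that definition of the Z score, which may differ from the statistic used in the study, and the ancestry values are made up for illustration.

```python
import statistics

# Hypothetical mean African ancestry among cases at 8 markers (made-up values).
locus_ancestry = [0.842, 0.851, 0.848, 0.845, 0.871, 0.849, 0.846, 0.850]

genome_mean = statistics.mean(locus_ancestry)
genome_sd = statistics.stdev(locus_ancestry)

# Assumed definition: deviation from the genome-wide mean, in SD units.
z_scores = [(a - genome_mean) / genome_sd for a in locus_ancestry]

# Index of the marker with the largest excess ancestry.
top = max(range(len(z_scores)), key=z_scores.__getitem__)
print(top, round(z_scores[top], 2))
```

Because recent admixture leaves long ancestry segments, neighboring markers would be expected to show elevated scores together rather than in isolation.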
Admixture Mapping in FBPP
• Estimated locus-specific African ancestry for hypertensives from 3 networks separately; also a pooled group of cases based on more stringent criteria; performed similar analysis on controls (normotensives)
[Figure: genome-wide locus-specific ancestry, shown separately for cases and controls. Red line = marker information; black line = genome-wide Z scores.]
[Figure: Distribution of Z Scores]
Table 2. Marker locations associated with the largest excess of African ancestry in hypertensive subjects for each individual network
Network and marker   Location (cM)   Excess African ancestry   Z score
GenNet
  GATA184A08         6q24.1 (146)    0.021                     3.08
  D6S2436            6q25.1 (155)    0.021                     3.08
  D21S1437           21q21 (13)      0.017                     2.55
GENOA
  GATA184A08         6q24.1 (146)    0.011                     4.23
  D6S2436            6q25.1 (155)    0.010                     3.01
HyperGEN
  GATA184A08         6q24.1 (146)    0.017                     4.69
  D6S2436            6q25.1 (155)    0.011                     2.91
  D21S1437           21q21 (13)      0.011                     2.88
Lessons from Asthma
• Data from Esteban Burchard and colleagues.
• Example of complex interplay between ancestry and environmental factors
Lifetime Asthma Prevalence in US
• Puerto Rican: 25.8%
• African American: 15.8%
• Caucasian: 12.7%
• Mexican American: 10.1%
Lara et al, 2006
Genetics of Asthma in Latino Americans (GALA)
• Esteban Burchard, PI
• Study of Mexican and Puerto Rican asthmatics from Mexico, Puerto Rico and the US.
Genetics of Asthma in Latino Americans (GALA)
• Estimated African, European and Native American ancestry in Puerto Ricans with ancestry informative markers (AIMS)
• Examined relationship of ancestry and socio-economic status (SES) on asthma risk
• Found an interaction between ancestry, SES and asthma risk
Ancestry-Socioeconomic Status Interaction & Risk of Asthma
[Figure: Percent African Ancestry (axis 0 to 25) in asthma cases and controls, by SES category (Mod / Mid, Upper)]
In the lower SES category, Puerto Rican patients with asthma had less African and more European ancestry than healthy controls, whereas in the upper SES category, patients with asthma had more African and less European ancestry than healthy controls
Conclusion
• Epidemiologic and genetic studies in admixed populations (e.g. African Americans and Latinos) offer unique opportunities to unravel complex genetic and environmental contributors to disease
Two Examples of Ethnic-Specific Alleles in Pharmacogenetics
• Irinotecan (Camptosar) and colon cancer
• Carbamazepine and Stevens-Johnson Syndrome
Irinotecan and Colon Cancer
• Extreme side effects in some patients
– Severe diarrhea, neutropenia
– Recommended reduced starting dosage
• Metabolized by uridine diphosphate glucuronosyltransferase isoform 1A1 (UGT1A1)
• Homozygotes/compound heterozygotes for deficiency alleles at greatly increased risk for side effects
Frequency of UGT1A1 Deficiency Genotypes by Ethnic Group
Genotype          Blacks   Whites   Asians   Pac Isl’s
*28/*28           20%      15%      1%       <0.1%
*6/*6 + *6/*28    <0.1%    <0.1%    5.5%     ?
Stevens-Johnson Syndrome and Carbamazepine (Tegretol)
• Carbamazepine most common cause of SJS in Asians
• HLA B*1502 a major risk factor in Han Chinese
• Relative Risk estimated at 2,500 (Chung et al, Nature 2004)
• B*1502 carrier frequency about 8% in Chinese, very rare or non-existent in other racial groups
• May explain greater proportion of SJS due to carbamazepine in Asians than other groups