the genetics of mexico recapitulates native american

Upload: ale-ortega

Post on 03-Jun-2018

213 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/11/2019 The Genetics of Mexico Recapitulates Native American

    1/7

    DOI: 10.1126/science.1251688, 1280 (2014);344Science

    et al.Andrs Moreno-Estradaand affects biomedical traitsThe genetics of Mexico recapitulates Native American substructure

    This copy is for your personal, non-commercial use only.

    clicking here.colleagues, clients, or customers by, you can order high-quality copies for yourIf you wish to distribute this article to others

    here.following the guidelines

    can be obtained byPermission to republish or repurpose articles or portions of articles

    ):August 6, 2014www.sciencemag.org (this information is current as of

    The following resources related to this article are available online at

    http://www.sciencemag.org/content/344/6189/1280.full.htmlversion of this article at:

    including high-resolution figures, can be found in the onlineUpdated information and services,

    http://www.sciencemag.org/content/suppl/2014/06/11/344.6189.1280.DC2.htmlhttp://www.sciencemag.org/content/suppl/2014/06/11/344.6189.1280.DC1.html

    can be found at:Supporting Online Material

    http://www.sciencemag.org/content/344/6189/1280.full.html#relatedfound at:

    can berelated to this articleA list of selected additional articles on the Science Web sites

    http://www.sciencemag.org/content/344/6189/1280.full.html#ref-list-1, 13 of which can be accessed free:cites 60 articlesThis article

    http://www.sciencemag.org/cgi/collection/geneticsGenetics

    subject collections:This article appears in the following

    registered trademark of AAAS.is aScience2014 by the American Association for the Advancement of Science; all rights reserved. The title

    CopyrighAmerican Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005.(print ISSN 0036-8075; online ISSN 1095-9203) is published weekly, except the last week in December, by thScience

    http://www.sciencemag.org/about/permissions.dtlhttp://www.sciencemag.org/about/permissions.dtlhttp://www.sciencemag.org/about/permissions.dtlhttp://www.sciencemag.org/about/permissions.dtlhttp://www.sciencemag.org/about/permissions.dtlhttp://www.sciencemag.org/about/permissions.dtlhttp://www.sciencemag.org/content/344/6189/1280.full.htmlhttp://www.sciencemag.org/content/344/6189/1280.full.htmlhttp://www.sciencemag.org/content/344/6189/1280.full.htmlhttp://www.sciencemag.org/content/suppl/2014/06/11/344.6189.1280.DC2.htmlhttp://www.sciencemag.org/content/suppl/2014/06/11/344.6189.1280.DC2.htmlhttp://www.sciencemag.org/content/suppl/2014/06/11/344.6189.1280.DC1.htmlhttp://www.sciencemag.org/content/suppl/2014/06/11/344.6189.1280.DC1.htmlhttp://www.sciencemag.org/content/344/6189/1280.full.html#relatedhttp://www.sciencemag.org/content/344/6189/1280.full.html#relatedhttp://www.sciencemag.org/content/344/6189/1280.full.html#ref-list-1http://www.sciencemag.org/content/344/6189/1280.full.html#ref-list-1http://www.sciencemag.org/content/344/6189/1280.full.html#ref-list-1http://www.sciencemag.org/content/344/6189/1280.full.html#ref-list-1http://www.sciencemag.org/cgi/collection/geneticshttp://www.sciencemag.org/cgi/collection/geneticshttp://www.sciencemag.org/cgi/collection/geneticshttp://www.sciencemag.org/content/344/6189/1280.full.html#ref-list-1http://www.sciencemag.org/content/344/6189/1280.full.html#relatedhttp://www.sciencemag.org/content/suppl/2014/06/11/344.6189.1280.DC2.htmlhttp://www.sciencemag.org/content/suppl/2014/06/11/344.6189.1280.DC1.htmlhttp://www.sciencemag.org/content/344/6189/1280.full.htmlhttp://www.sciencemag.org/about/permissions.dtlhttp://www.sciencemag.org/about/permissions.dtlhttp://oascentral.sciencemag.org/RealMedia/ads/click_lx.ads/sciencemag/cgi/reprint/L22/540226873/Top1/AAAS/PDF-R-and-D-Systems-Science-1709891/SfN2014_TG_ScienceBanner.raw/1?x
  • 8/11/2019 The Genetics of Mexico Recapitulates Native American

    2/7

    HUMAN GENETICS

    The genetics of Mexico recapitulatesNative American substructure andaffects biomedical traits

    Andrs Moreno-Estrada,1*Christopher R. Gignoux,2Juan Carlos Fernndez-Lpez,3

    Fouad Zakharia,1 Martin Sikora,1 Alejandra V. Contreras,3 Victor Acua-Alonzo,4,5

    Karla Sandoval,1 Celeste Eng,6 Sandra Romero-Hidalgo,3 Patricia Ortiz-Tello,1

    Victoria Robles,1 Eimear E. Kenny,1 Ismael Nuo-Arana,7Rodrigo Barquera-Lozano,4

    Gastn Macn-Prez,4 Julio Granados-Arriola,8 Scott Huntsman,6 Joshua M. Galanter,6,9

    Marc Via,6|| Jean G. Ford,10 Roco Chapela,11 William Rodriguez-Cintron,12

    Jose R. Rodrguez-Santana,1,3 Isabelle Romieu,14 Juan Jos Sienra-Monge,15

    Blanca del Rio Navarro,15 Stephanie J. London,16 Andrs Ruiz-Linares,5

    Rodrigo Garcia-Herrera,3 Karol Estrada,3 Alfredo Hidalgo-Miranda,3

    Gerardo Jimenez-Sanchez,3# Alessandra Carnevale,3 Xavier Sobern,3

    Samuel Canizales-Quinteros,3,17 Hctor Rangel-Villalobos,7 Irma Silva-Zolezzi,3**

    Esteban Gonzalez Burchard,6,9*Carlos D. Bustamante1*

    Mexico harbors great cultural and ethnic diversity, yet fine-scale patterns of humangenome-wide variation from this region remain largely uncharacterized. We studied

    genomic variation within Mexico from over 1000 individuals representing 20 indigenous

    and 11 mestizo populations. We found striking genetic stratification among indigenous

    populations within Mexico at varying degrees of geographic isolation. Some groups were as

    differentiated as Europeans are from East Asians. Pre-Columbian genetic substructure is

    recapitulated in the indigenous ancestry of admixed mestizo individuals across the

    country. Furthermore, two independently phenotyped cohorts of Mexicans and Mexican

    Americans showed a significant association between subcontinental ancestry and lung

    function. Thus, accounting for fine-scale ancestry patterns is critical for medical and

    population genetic studies within Mexico, in Mexican-descent populations, and likely in

    many other populations worldwide.

    Understanding patterns of human popula-

    tion structure, where regional surveys arekey for delineating geographically restricted

    variation, is important for the design and

    interpretation of medical genetic studies.

    In particular, we expect rare genetic variants,

    including functionally relevant sites, to exhibit

    little sharing among diverged populations (1).

    Native Americans display the lowest genetic

    diversity of any continental group, but there is

    high divergence among subpopulations (2). As

    a result, present-day American indigenous pop-

    ulations (and individuals with indigenous an-

    cestry) may harbor local private alleles rare or

    absent elsewhere, including functional and med-

    ically relevant variants (3, 4). Mexico serves as

    an important focal point for such analyses, be-cause it harbors one of the largest sources of

    pre-Columbian diversity and has a long history

    of complex civilizations with varying contribu-

    tions to the present-day population.

    Previous estimates of Native Mexican genetic

    diversity examined single loci or were limited to

    a reduced number of populations or small sam-

    ple sizes (58). We examined local patterns of

    variation from nearly 1 million genome-wide

    autosomal single-nucleotode polymorphisms

    (SNPs) for 511 Native Mexican individuals from

    20 indigenous groups, covering most geographic

    regions across Mexico (table S1). Standard prin-

    cipal components analysis (PCA) summarizesthe major axes of genetic variation in the data

    set [see (9)]. Whereas PC1 and PC2 separate

    Africans and Europeans from Native Mexicans,

    PC3 differentiates indigenous populations with-

    in Mexico, following a clear northwest-southeast

    cline (Fig. 1A). A total of 0.89% of the variation

    is explained by PC3, nearly three times as much

    as the variation accounted for by the north-south

    axis of differentiation within Europe [0.30%, in

    (10)]. The northernmost (Seri) and southern-

    most (Lacandon) populations define the extremes

    of the distribution, with very clear clustering

    of individuals by population, indicating high

    levels of divergence among groups (fig. S1).

    Seri and Lacandon show the highest level ofpopulation differentiation as measured with

    Wrights fixation index FST(0.136, Fig. 1B and

    table S4), higher than the FST between Euro-

    peans and Chinese populations in HapMap3

    (0.11) (11). Other populations within Mexico

    also show extreme FSTvalues; for example, the

    Huichol and Tojolabal have a pairwise FST of

    0.068, similar to that observed between the

    Gujarati Indians and the Chinese in HapMap3

    (0.076).

    The high degree of differentiation between

    populations measured byFST argues that these

    populations have experienced high degrees

    isolation. Indeed, when autozygosity is inferr

    using runs of homozygosity (ROH), all popu

    tions on average have long homozygous trac

    with the Huichol, Lacandon, and Seri all havi

    on average over 10% of the genome in ROH [fi

    S2 and S3 (9)]. These populations are relativ

    small, increasing the effects of genetic drift a

    driving some of the high FSTvalues. In contra

    the Mayan and Nahuan populations have mu

    smaller proportions of the genome in ROH, co

    sistent with ROH levels found in Near Easte

    populations in HGDP (12). These populatio

    are the descendants of large Mesoamerican

    vilizations, and concordant with large histori

    populations, have relatively low proportions

    ROH. The high degree of variance in ROH amo

    populations is an additional indicator of pop

    lation substructure and suggests a large varian

    in historical population sizes. Comparing the o

    served ROH patterns to those derived from co

    lescent simulations, we find that Native Americ

    groups within Mexico are characterized by sm

    effective population sizes under a model with

    strong bottleneck, in agreement withother studof NativeAmerican populations(13). The degr

    of population size recovery to the current day

    consistent with the degree of isolation of the e

    tant populations, ranging from 1196 chromosom

    [95% confidence interval (CI) 317 to 1548] for t

    Seri in the Sonora desert, to 3669 (95% CI 2588

    5522) for the Mayans from Quintana Roo (figs.

    to S6; (9)).

    1Department of Genetics, Stanford University School ofMedicine, Stanford, CA, USA. 2Department of Bioengineerand Therapeutic Sciences, University of California, SanFrancisco, CA, USA. 3 Instituto Nacional de MedicinaGenmica (INMEGEN), Mexico City, Mexico. 4EscuelaNacional de Antropologa e Historia (ENAH), Mexico City,Mexico. 5Department of Genetics, Evolution andEnvironment, University College London, London, UK.6Department of Medicine, University of California, SanFrancisco, CA, USA. 7 Instituto de Investigacin en GenticMolecular, Universidad de Guadalajara, Ocotln, Mexico.8Instituto Nacional de Ciencias Mdicas y Nutricin SalvadZubirn, Mexico City, Mexico. 9Department of Bioengineerand Therapeutic Sciences, University of California, SanFrancisco, CA, USA. 10The Brooklyn Hospital Center,Brooklyn, NY, USA. 11Instituto Nacional de EnfermedadesRespiratorias (INER), Mexico City, Mexico. 12VeteransCaribbean Health Care System, San Juan, Puerto Rico.13Centro de Neumologa Pediatrica, San Juan, Puerto Rico14International Agency for Research on Cancer, Lyon, Fran15Hospital Infantil de Mxico Federico Gomez, Mexico CityMexico. 16National Institute of Environmental HealthSciences, National Institutes of Health, Department of Heaand Human Services, Research Triangle Park, NC, USA.17

    Facultad de Qumica, Universidad Nacional Autnoma deMxico, Mexico City, Mexico.*Corresponding author. E-mail: [email protected]

    (C.D.B.); [email protected] (A.M.-E.); esteban.burchard

    ucsf.edu (E.G.B.) These authors contributed equally to this

    work.Present address: Department of Genetics, Stanford Unive

    School of Medicine, Stanford, CA, USA. (C.R.G.) Present address

    Center for Statistical Genetics, Mount Sinai School of Medicine, N

    York, USA. (E.E.K.) ||Present address: Department of Psychiatry

    Clinical Psychobiology - IR3C, Universitat de Barcelona, Spain. (M

    Present address: Analytic and Translational Genetics Unit,

    Massachusetts General Hospital, Boston, USA. (K.E.) #Present

    address: Harvard School of Public Health and Global Biotech

    Consulting Group. (G.J.-S.) **Present address: Nutrition and

    Health Department Nestec Ltd, Nestle Research Center, Lausa

    Switzerland. (I.S.-Z.)

    1280 13 JUN E 2014 VO L 344 IS S UE 6189 sciencemag.org SCIEN

    RESEARCH | REPORTS

  • 8/11/2019 The Genetics of Mexico Recapitulates Native American

    3/7

    Isolation also correlates with the degree of

    relatedness within and between ethnic groups, ul-

    timately shaping the pattern of genetic relation-

    ships among populations. We built a relatedness

    graph (Fig. 1C) of individuals sharing >13 cM of

    the genome identically by descent (IBD) (cor-

    responding to third/fourth cousins or closer

    relatives). Almost all the connections are within-

    versus among-population, consistent with the

    populations being discrete rather than exhib-

    iting large-scale gene flow [figs. S7 and S8 (9)].

    As seen with the ROH calculations, the Mayan

    and Nahuan groups have fewer internal connec-

    tions. The few between-population connections

    appear in populations close to each coastline,

    such as the connections between the Campeche

    Mayans and populations to the west along the

    Gulf of Mexico.

    The long-tract ROH and IBD analyses are es-

    pecially relevant to the recent history of isolation

    of Native American populations. We ran TreeMix

    (14) to generate a probabilistic model of diver-

    gence and migration among the Native Am

    ican populations (Fig. 1D). The inferred tree w

    no migration paths recapitulates the north/sou

    and east/west gradients of differentiation fro

    the PCA and IBD analyses, with populatio

    with high ROH values also exhibiting long

    tip branches. The primary branches divide po

    ulations by geography. All northern populatio

    (dark blue) branch from the same initial sp

    at the root. We also find two additional maj

    clades: a grouping of populations from the sout

    ern states of Guerrero and Oaxaca (green labe

    and aMayan clade composed of Mayan-speaki

    populations from Chiapas and the Yucatan p

    ninsula in the southeast (orange labels). Int

    ducing migratory edges to the model conne

    the Maya in Yucatan to a branch leading to t

    Totonac, whose ancestors occupied the large p

    Columbian city of El Tajin in Veracruz (15). Th

    result points to an Atlantic coastal corridor

    gene flow between the Yucatan peninsula a

    central/northern Mexico (fig. S9), consistent w

    our IBD analysis. Indeed, the only Mayan la

    guage outside the Mayan territory is spoken

    the Huastec, nearby in northern Veracruz, suporting a shared history (16).

    These signals remain today as a legacy of t

    pre-Columbian diversity of Mexican populatio

    Over the past 500 years, population dynam

    have changed drastically. Today, the majority

    Mexicans are admixed and can trace their a

    cestry back not only to indigenous groups b

    also to Europe and Africa. To investigate patter

    of admixture, we combined data from continen

    source populations (including the 20 nat

    Mexican groups, 16 European populations, a

    50 West African Yorubas) with 500 admix

    mestizo individuals from 10 Mexican sta

    recruited by the National Institute of Genom

    Medicine (INMEGEN) for this study, Mexicafrom Guadalajara in the POPRES collection (1

    and individuals of Mexican descent from L

    Angeles in the HapMap Phase 3 project (ta

    S1). We ran the unsupervised mixture model

    gorithm ADMIXTURE (18) to estimate ances

    proportions for individuals in our combined da

    set (Fig. 2, fig. S10, and table S5). Allowing

    three ancestral clusters (K= 3), we find that mo

    individuals have a large amount of Native a

    European ancestry, with a small (typically

  • 8/11/2019 The Genetics of Mexico Recapitulates Native American

    4/7

    Fig. 2. Mexican population structure.(A) Map of sampled pop-

    ulations (detailed in table S1) and admixture average proportions

    (table S5). Dots correspond to Native Mexican populations color-

    coded according to K= 9 clusters identified in (B) (bottom), and

    shaded areas denote states in which cosmopolitan populations

    were sampled. Pie charts summarize per-state average propor-

    tions of cosmopolitan samples at K= 3 (European in red, West

    African in green, and Native American in gray). Bars show the

    total Native American ancestry decomposed into average propor-

    tions of the native subcomponents identified at K= 9. (B) Global

    ancestry proportions at K= 3 (top) and K= 9 (bottom) estimated

    with ADMIXTURE, including African, European, Native Mexican, and

    cosmopolitan Mexican samples (tables S1 and S2). From left to right,

    Mexican populations are displayed north-to-south. (C) Interpola-

    tion maps showing the spatial distribution of the six native com-

    ponents identified at K= 9. Contour intensities are proportional to

    ADMIXTURE values observed in Native Mexican samples, with crosses indi-

    cating sampling locations. Scatter plots with linear fits show ADMIXTURE

    values observed in cosmopolitan samples versus a distance metric summariz-

    ing latitude and longitude (long axis) for the sampled states. From left to right:

    Yucatan, Campeche, Oaxaca, Veracruz, Guerrero, Tamaulipas, Guanajuato,

    Zacatecas, Jalisco, Durango, and Sonora. Values are adjusted relative to

    the total Native American ancestry of each individual (9).

    1282 13 JUNE 2014 VOL 344 ISSUE 6189 sciencemag.org SCIEN

    RESEARCH | REPORTS

  • 8/11/2019 The Genetics of Mexico Recapitulates Native American

    5/7

  • 8/11/2019 The Genetics of Mexico Recapitulates Native American

    6/7

    These changes due to ancestry are compar-

    able to other factors affecting lung function. Com-

    paring the expected effect of ancestry across

    Mexico with the known effects of age in the

    standard Mexican-American reference equa-

    tions (28), the inferred 7.3% change in FEV1associated with subcontinental ancestry is sim-

    ilar to the decline in FEV1 that a 30-year-old

    Mexican-American individual of average height

    would experience by aging 10.3 years if male

    and 11.8 years if female. Similarly, comparing

    our results from the Mexican data with the

    model incorporating ancestry in African Amer-

    icans, a difference of 7.3% in FEV1would corre-

    spond to a 33% difference in African ancestry

    (29). The association between FEV1 and ASPC1

    is not an indicator of impaired lung function

    on its ownrather, it contributes to the distribu-

    tion of FEV1 values and would modify clinical

    thresholds. This finding indicates that diagnoses

    of diseases such as asthma and chronic obstruc-

    tive pulmonary disease (COPD) relying on spe-

    cific lung function thresholds may benefit from

    taking finer-scale ancestry into consideration.

    An important implication of our work is that

    multi- and transethnic mapping efforts will

    benefit from including individuals of Mexican

    ancestry, because the Mexican population har-

    bors rich amounts of genetic variation that may

    underlie important biomedical phenotypes. A

    key question in this regard is whether existing

    catalogs of human genome variation capture

    the genetic variation present in the samples

    analyzed here. We performed targeted SNP tag-

    ging and genome-wide haplotype sharing anal-

    ysis within 100-kb sliding windows to assess

    the degree to which haplotype diversity in the

    Mexican mestizo samples could be captured

    by existing reference panels [figs. S18 to 20 (9

    Although Mexican-American samples (MXL) w

    included in both the HapMap and 1000 G

    nomes catalogs, average haplotype sharing f

    the INMEGEN mestizo samples is limited

    81.2 and 90.5% when combined with all con

    nental HapMap populations. It is only after

    cluding the Native American samples genotyp

    here that nearly 100% of haplotypes are shar

    maximizing the chances of capturing most of t

    variation in Mexico.

    Much effort has been invested in detecti

    common genetic variants associated with co

    plex disease and replicating associations acro

    populations. However, functional and medica

    relevant variation may be rare or populatio

    specific, requiring studies of diverse hum

    populations to identify new risk factors (

    Without detailed knowledge of the geograph

    Fig. 3. Subcontinental ancestry of admixed Mexican genomes and bio-

    medical implications. (A) ASPCA of Native American segments from Mexican

    cosmopolitan samples (colored circles) together with 20 indigenous Mexican

    populations (population labels). Samples with >10% of non-native ad-

    mixture were excluded from the reference panel, as well as population

    outliers such as Seri, Lacandon, and Tojolabal. (B) Zoomed detail of the

    distribution of the Native American fraction of cosmopolitan samples

    throughout Mexico. Native ancestral populations were used to define PCA

    space (prefixed by NAT) but removed from the background to highlight the

    subcontinental origin of admixed genomes (prefixed by MEX). Each cir

    represents the combined set of haplotypes called Native American alo

    the haploid genome of each sample with >25% of Native American a

    cestry. The inset map shows the geographic origin of cosmopolitan sa

    ples per state, color-coded by region ( ). (C) Coefficients and 95% CIs

    associations between ASPC1 and lung function (FEV1) from Mexic

    participants in the GALA I study, and the MCCAS, as well as both stud

    combined (table S6 and fig. S17) ( ). (D) Means and CIs of predict

    change in FEV1 by state, extrapolated from the model in (C).

    1284 13 J UN E 2014 VO L 344 IS S UE 6189 sciencemag.org SCIEN

    RESEARCH | REPORTS

  • 8/11/2019 The Genetics of Mexico Recapitulates Native American

    7/7