gene-flow in a mosaic hybrid zone: is local introgression ... · quence (m. ohresser, personal...

35
INVESTIGATION Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression Adaptive? Christelle Fraïsse,* ,,,1 Camille Roux,* ,,§ John J. Welch, and Nicolas Bierne* ,*Université Montpellier 2, 34095 Montpellier Cedex 5, France, CNRS, Institut des Sciences de lEvolution, ISEM Unité Mixte de Recherche 5554, 34200 SETE, France, Department of Genetics, University of Cambridge, CB2 3EH Cambridge, United Kingdom, and § Department of Ecology and Evolution, Lausanne University, Biophore/Sorge, CH-1015 Lausanne, Switzerland ABSTRACT Genome-wide scans of genetic differentiation between hybridizing taxa can identify genome regions with unusual rates of introgression. Regions of high differentiation might represent barriers to gene ow, while regions of low differentiation might indicate adaptive introgressionthe spread of selectively benecial alleles between reproductively isolated genetic backgrounds. Here we conduct a scan for unusual patterns of differentiation in a mosaic hybrid zone between two mussel species, Mytilus edulis and M. galloprovincialis. One outlying locus, mac-1, showed a characteristic footprint of local introgression, with abnormally high frequency of edulis-derived alleles in a patch of M. galloprovincialis enclosed within the mosaic zone, but low frequencies outside of the zone. Further analysis of DNA sequences showed that almost all of the edulis allelic diversity had introgressed into the M. galloprovincialis background in this patch. We then used a variety of approaches to test the hypothesis that there had been adaptive introgression at mac-1. Simulations and model tting with maximum-likelihood and approximate Bayesian computation approaches suggested that adaptive introgression could generate a soft sweep,which was qualitatively consistent with our data. Although the migration rate required was high, it was compatible with the functioning of an effective barrier to gene ow as revealed by demographic inferences. As such, adaptive introgression could explain both the reduced intraspecic differentiation around mac-1 and the high diversity of introgressed alleles, although a localized change in barrier strength may also be invoked. Together, our results emphasize the need to account for the complex history of secondary contacts in interpreting outlier loci. G ENETIC barriers between related lineages are often semipermeable, varying in strength across the genome (Harrison 1986). As such, hybridization can lead to adaptive introgression, the transfer of selectively benecial alleles between species (Parsons et al. 1993; Arnold 2004; Helico- nius Genome Consortium 2012; Hedrick 2013). However, the frequency and importance of such hybridization events remain hotly debated (Mallet 2007; Abbott et al. 2013; Barton 2013). It is now common to address such questions by scan- ning genome sequences for regions of enhanced or reduced differentiation. Universally advantageous alleles can usually cross barriers to gene ow without much delay (e.g., Piálek and Barton 1997), and neutral loci linked to such alleles are expected to display unusually low levels of differentiation. In contrast, barrier loci will delay the introgression of neutral alleles in proportion to their linkage (Barton 1979; Barton and Bengtsson 1986); that is the basis for the investigation of genomic islands of differentiation (Turner et al. 2005; Hohenlohe et al. 2010; Nadeau et al. 2012). Interpreting regions of unusual differentiation can be difcult, not only because of variation in recombination across the genome (Nachman and Payseur 2012; Roesti et al. 2013), but also because reduced differentiation might be caused by processes other than adaptive introgression. In particular, pop- ulation history may promote multiple contacts between the same lineages in different locations, and so the resulting bar- riers to gene ow might vary from place to place, according to local ecological gradients and population connectivity. In ac- cordance with this hypothesis, reversed or modied associa- tions between genetic differentiation and habitat variation (Bierne et al. 2011; Jackson et al. 2012) and partial parallel- ism in genomic divergence (Gagnaire et al. 2013) have both been observed in replicated contact zones of the same species pair. Differentiating between these hypotheses is challenging, because adaptive introgression leaves a genomic signature that can be substantially different from the classic hard sweep(Maynard Smith and Haigh 1974) where adaptive substitution Copyright © 2014 by the Genetics Society of America doi: 10.1534/genetics.114.161380 Manuscript received January 10, 2014; accepted for publication April 17, 2014; published Early Online April 29, 2014. Supporting information is available online at http://www.genetics.org/lookup/suppl/ doi:10.1534/genetics.114.161380/-/DC1. 1 Corresponding author: SMEL, 2 rue des Chantiers, 34200 Sète, France. E-mail: [email protected] Genetics, Vol. 197, 939951 July 2014 939

Upload: others

Post on 19-Jun-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

INVESTIGATION

Gene-Flow in a Mosaic Hybrid Zone: Is LocalIntrogression Adaptive?

Christelle Fraïsse,*,†,‡,1 Camille Roux,*,†,§ John J. Welch,‡ and Nicolas Bierne*,†

*Université Montpellier 2, 34095 Montpellier Cedex 5, France, †CNRS, Institut des Sciences de l’Evolution, ISEM Unité Mixte deRecherche 5554, 34200 SETE, France, ‡Department of Genetics, University of Cambridge, CB2 3EH Cambridge, United Kingdom,

and §Department of Ecology and Evolution, Lausanne University, Biophore/Sorge, CH-1015 Lausanne, Switzerland

ABSTRACT Genome-wide scans of genetic differentiation between hybridizing taxa can identify genome regions with unusual rates ofintrogression. Regions of high differentiation might represent barriers to gene flow, while regions of low differentiation might indicate adaptiveintrogression—the spread of selectively beneficial alleles between reproductively isolated genetic backgrounds. Here we conduct a scan forunusual patterns of differentiation in a mosaic hybrid zone between two mussel species, Mytilus edulis and M. galloprovincialis. One outlyinglocus, mac-1, showed a characteristic footprint of local introgression, with abnormally high frequency of edulis-derived alleles in a patch ofM. galloprovincialis enclosed within the mosaic zone, but low frequencies outside of the zone. Further analysis of DNA sequences showed thatalmost all of the edulis allelic diversity had introgressed into the M. galloprovincialis background in this patch. We then used a variety ofapproaches to test the hypothesis that there had been adaptive introgression atmac-1. Simulations and model fitting with maximum-likelihoodand approximate Bayesian computation approaches suggested that adaptive introgression could generate a “soft sweep,” which wasqualitatively consistent with our data. Although the migration rate required was high, it was compatible with the functioning of an effectivebarrier to gene flow as revealed by demographic inferences. As such, adaptive introgression could explain both the reduced intraspecificdifferentiation around mac-1 and the high diversity of introgressed alleles, although a localized change in barrier strength may also beinvoked. Together, our results emphasize the need to account for the complex history of secondary contacts in interpreting outlier loci.

GENETIC barriers between related lineages are oftensemipermeable, varying in strength across the genome

(Harrison 1986). As such, hybridization can lead to adaptiveintrogression, the transfer of selectively beneficial allelesbetween species (Parsons et al. 1993; Arnold 2004; Helico-nius Genome Consortium 2012; Hedrick 2013). However,the frequency and importance of such hybridization eventsremain hotly debated (Mallet 2007; Abbott et al. 2013; Barton2013). It is now common to address such questions by scan-ning genome sequences for regions of enhanced or reduceddifferentiation. Universally advantageous alleles can usuallycross barriers to gene flow without much delay (e.g., Piálekand Barton 1997), and neutral loci linked to such alleles areexpected to display unusually low levels of differentiation.In contrast, barrier loci will delay the introgression of neutral

alleles in proportion to their linkage (Barton 1979; Bartonand Bengtsson 1986); that is the basis for the investigationof genomic islands of differentiation (Turner et al. 2005;Hohenlohe et al. 2010; Nadeau et al. 2012).

Interpreting regions of unusual differentiation can bedifficult, not only because of variation in recombination acrossthe genome (Nachman and Payseur 2012; Roesti et al. 2013),but also because reduced differentiation might be caused byprocesses other than adaptive introgression. In particular, pop-ulation history may promote multiple contacts between thesame lineages in different locations, and so the resulting bar-riers to gene flow might vary from place to place, according tolocal ecological gradients and population connectivity. In ac-cordance with this hypothesis, reversed or modified associa-tions between genetic differentiation and habitat variation(Bierne et al. 2011; Jackson et al. 2012) and partial parallel-ism in genomic divergence (Gagnaire et al. 2013) have bothbeen observed in replicated contact zones of the same speciespair. Differentiating between these hypotheses is challenging,because adaptive introgression leaves a genomic signature thatcan be substantially different from the classic “hard sweep”(Maynard Smith and Haigh 1974) where adaptive substitution

Copyright © 2014 by the Genetics Society of Americadoi: 10.1534/genetics.114.161380Manuscript received January 10, 2014; accepted for publication April 17, 2014;published Early Online April 29, 2014.Supporting information is available online at http://www.genetics.org/lookup/suppl/doi:10.1534/genetics.114.161380/-/DC1.1Corresponding author: SMEL, 2 rue des Chantiers, 34200 Sète, France.E-mail: [email protected]

Genetics, Vol. 197, 939–951 July 2014 939

Page 2: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

is associated with a single new mutation. Adaptive introgres-sion may involve multiple migrant copies of the same beneficialallele (Pennings and Hermisson 2006) leading to a “soft-sweep”signature, which can be difficult to discriminate from a localincrease in the neutral introgression rate (see Table 1).

Here we present a multilocus scan for local introgressionbetween the marine mussels Mytilus edulis and M. gallopro-vincialis. These species appear to have had a complex historyof fragmentation and colonization during the Quaternary, asis the case for almost all temperate organisms (Hewitt 2011).Recent analyses suggest thatM. edulis andM. galloprovincialisdiverged �2.5 million years (MY), followed by a secondarycontact beginning about 0.7 MY (Roux et al. 2014). Today, theyare isolated by multifarious pre- and postzygotic mechanisms(Bierne et al. 2002, 2003a, 2006). Due to natural replications ofcontact zones, the Mytilus species complex is particularly suit-able for studying the different possible outcomes of secondarycontact. Here we study an original mosaic hybrid zone inEurope, extending from the Mediterranean Sea to the NorthSea (Bierne et al. 2003b; Hilbish et al. 2012) and to the BritishIsles (Skibinski et al. 1983). We focus on the French section ofthe mosaic zone that includes two patches per species, oneperipheral and one enclosed within the zone (Figure 1).

To study introgression between these populations, weused previously published data at 440 loci (422 amplifiedfragment length polymorphism, AFLP, and 18 codominantnuclear markers; Gosset and Bierne 2012). Initial analysesidentified a nuclear marker, mac-1, with high interspecificdifferentiation in the peripheral patch but an anomalouslylow level of differentiation in the enclosed patch in Brittany.This outlier locus was therefore a candidate for adaptiveintrogression in a geographically localized region.

To test the plausibility of this hypothesis, we first sequenceda 3.1-kb region around mac-1 and analyzed how allele fre-quency varies along the chromosomal region. Unlike anotherM. edulis locus analyzed previously (Bierne 2010), we failed toidentify variation in differentiation on such a small chromo-somal scale. We then determined how well our data might beexplained by a model of adaptive introgression (Pennings andHermisson 2006). Inferences indicated that the level of geneflow implied by the adaptive introgression hypothesis washigh, but still compatible with the strength of the genetic bar-rier between the two species. We further propose that thebarrier to gene flow in the vicinity of mac-1 could be simplymore permeable in certain hybrid zones of the mosaic. Overall,we show that although the molecular signature of adaptiveintrogression is difficult to identify, dedicated methods thataccount for the history of speciation can help in interpretingoutlying levels of introgression.

Materials and Methods

Study sites and sampling

The zone of study, along the European coast, is characterizedby three successive transitions between panmictic patches

of each species (Figure 1; Bierne et al. 2003c; Hilbish et al.2012). We used two geographical samples of M. edulis: theperipheral patch of the North Sea (Wadden Sea, Holland);and the enclosed patch of the Bay of Biscay (Lupin, France).We also used two geographical samples ofM. galloprovincialis:the peripheral patch of the Iberian Coast (Faro, Portugal); andthe enclosed patch of Brittany (Roscoff, France). These foursamples have been described in Faure et al. (2008) and havebeen established to be representative of monospecific pan-mictic patches. In addition, we used a sample of M. trossulus(Tadoussac, Canada) to serve as an outgroup. Forty-eightindividuals per sample were examined, except for the Brit-tany sample, which comprised 87 individuals. Genomic DNAwas extracted from adults using the DNeasy Blood and TissueKit (Qiagen) following the manufacturers protocol.

Multilocus scan

For the multilocus scan, we combined existing data, comprising422 AFLP markers, 13 codominant nuclear markers (Gossetand Bierne 2012), and 5 allozyme markers (Faure et al. 2008).To combine loci with various level of diversity, nuclear codom-inant markers were transformed into biallelic loci by poolingalleles according to their frequencies in the M. galloprovincialisand M. edulis reference samples (see McDonald 1994). AFLPmarkers are subject to several well-known caveats, notablyfragment-size homoplasy (Caballero et al. 2008; Whitlock et al.2008). To reduce this problem, we excluded AFLP bands smallerthan 50 bp, which are most prone to homoplasy (Vekemanset al. 2002). Allele frequencies of AFLP markers were esti-mated with the Bayesian method of Zhivotovsky (1999) inAFLPsurv v. 1.0 (Vekemans et al. 2002). All data were com-bined for the outlier tests.

To ensure robustness, and following the recommendationof Pérez-Figueroa et al. (2010), we used six distinct methodsto identify loci with unusual levels of genetic differentiation.These are the methods of Lewontin and Krakauer (1973),Beaumont and Nichols (1996), Vitalis et al. (2003), Beaumontand Balding (2004), Bonhomme et al. (2010), and a custom

Table 1 Reduction in diversity after introgression in studies ofadaptive introgression

Genus

Species pair

Locus Her/HedDonor Recipient

Canisa latrans lupus Mitochondria 0.752Eucalyptusb cordata globulus Chloroplast 0.751Euremac hecabe b hecabe y Wolbachia 0.388Littorinad obtusata fabalis SS ARK 0.143Mytiluse edulis trossulus Mitochondria 0.608Mytilusf edulis galloprovincialis mac-1 0.957

d denotes the donor species; r denotes the recipient species. Only donor-derivedalleles were considered. Her/Hed, proportional diversity after introgression.a Lehman et al. (1991).b McKinnon et al. (2004).c Narita et al. (2006).d Kemppainen et al. (2011).e Quesada et al. (1999); Rawson and Hilbish (1998).f Present study.

940 C. Fraïsse et al.

Page 3: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

simulation method in which the neutral envelope of FST isobtained from simulations with the parameter estimates ofthe best-supported demographic model. Full details are givenin Supporting Information, File S1.

Data analyses at the outlier locus mac-1

Chromosomal walking along mac-1: We sequenced a re-gion of 3.1 kb around mac-1 taking advantage of publishedsequences (Daguin et al. 2001), an additional upstream se-quence (M. Ohresser, personal communication) and lookingfor homology between coding regions and expressed sequencetags of Mytilus (Tanguy et al. 2008). To describe the localvariation of the edulis allele frequency around mac-1, wewalked along the sequence and looked for evenly spacedgenetic polymorphisms (see Figure S2 for a diagram). Fourpolymorphisms, two PCR product-length polymorphismsand two SNPs, were chosen, based on the allelic genealogies,to be discriminant between the two species. We extensivelygenotyped our four samples (North Sea, Brittany, Bay of Biscay,and Iberian Coast) for these four polymorphisms. See File S1for technical details.

Genetic analysis: For the PCR product-length polymorphismsdescribed above, we computed classical genetic statistics onthe subset of edulis alleles (i.e., alleles that group within thediversity of M. edulis within the mac-1 genealogy; see Figure3) using Genetix 4.05.2 (Belkhir et al. 1996–2004): the num-ber of allelic classes (k), the heterozygosity (He, Nei 1978),departure from Hardy–Weinberg equilibrium, and genetic dif-ferentiation were estimated by f and Q (Weir and Cockerham1984), respectively, and significance was tested with 1000permutations. Finally, we evaluated the departure from theneutral expectation at mutation–drift equilibrium with theEwens test (e, Slatkin 1994; Slatkin 1996).

Sequence alignment was performed with Multalin (Corpet1988) and verified by eye in BioEdit 7.5.3 (Hall 1999). Align-ment gaps were excluded from further analyses. We inferredallelic genealogies for the three regions sequenced using theneighbor-joining algorithm implemented in Mega 5.0 (Kumaret al. 2004). We rooted the allele genealogy with the outgroupM. trossulus. We computed classical genetic statistics on DNAsequences using DNAsp (Rozas et al. 2003): the number ofpolymorphic sites (S), the number of synonymous (Ss), non-synonymous (Sns) and noncoding (Snc) mutations, levels ofnucleotide diversity estimated from the number of polymor-phic sites (uS, Watterson 1975) and from pairwise differences(up, Tajima 1983) and the minimum number of recombinationevents (Rm, Hudson and Kaplan 1985). We also computedtwo indicators of the distortion of the allele-frequency spec-trum from the neutral mutation–drift equilibrium expecta-tion: (1) Tajima’s D (Tajima 1989a; 1989b) and (2) Fay andWu’s H (Fay and Wu 2000); significant departure from stan-dard neutral model were assessed using standard-coalescentsimulations.

Model of adaptive introgression: To assess the hypothesisof adaptive introgression atmac-1, we considered a model ofadaptive introgression via recurrent migration, introducedby Pennings and Hermisson (2006).

This model considers a universally beneficial allele thatintrogresses from the donor population(s) to the recipientpopulation via recurrent migration and then sweeps to globalfixation. Formally, the model considers two Wright-Fisherpopulations, each of N individuals. Each individual is char-acterized by two completely linked haploid loci. The firstlocus is biallelic, with alleles B and b. The B allele has a se-lective advantage s over the b allele. The second, linkedlocus, is neutral with an infinite number of possible alleles,and new mutations arising at rate m (the per-gene mutationrate). The donor population is initially fixed for the benefi-cial B allele, while the recipient population is initially fixedfor the alternative b allele, and both populations are as-sumed to have reached mutation–drift equilibrium at theneutral locus. After equilibration follows a period of second-ary contact in which the populations exchange migrants atrate m. During this secondary contact, the beneficial B alleleis introduced into the recipient population, where it will even-tually reach fixation. The three most important parameters of

Figure 1 Localities of Mytilus spp. samples along the European coast.Panmictic patches of M. edulis are indicated by thick lines; their corre-sponding samples are (1) Wadden Sea (solid square) in the peripheralpatch of the North Sea and (2) Lupin (solid circle) in the enclosed patchof the Bay of Biscay. Panmictic patches of M. galloprovincialis are indi-cated by double lines; their corresponding samples are (1) Faro (opensquare) in the peripheral patch of the Iberian Coast and (2) Roscoff (opencircle) in the enclosed patch of Brittany. The three hybrid zones describedby Bierne et al. (2003b) are indicated by dotted lines. HZ1, Iberian Coast/Bay of Biscay; HZ2, Bay of Biscay/Brittany; HZ3, Brittany/North Sea.

Local Introgression in a Mosaic Hybrid Zone 941

Page 4: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

this model are the scaled quantities u = 2Nm, M = 2Nm, anda = 2Ns (Pennings and Hermisson 2006).

To fit this model to data, we used the mac-1 alleles sam-pled from the donor populations (the two M. edulis patches)and the recipient population (the Brittany patch of M. gallo-provincialis). Analytical results assume complete linkage andrecent completion of the sweep (see below), and so, to makeour inferences robust to the violation of these assumptions,we take as data only those M. galloprovincialis alleles thatgroup within the diversity of M. edulis within the genealogy(i.e., only the alleles denoted with open circles in the edulisclade, Figure 3).

Maximum-likelihood approach: We first fit the data usinga maximum-likelihood (ML) approach. Our data in this casecomprised the number of alleles sampled in the donor andrecipient populations (nd and nr) and the number of distinctallelic classes in these samples (kd and kr). Under the assump-tions described above, the probability of observing the datafrom the donor population is given by the Ewens’ samplingformula (Ewens 1972),

Pðkdjnd; uÞ ¼ukdSkdnd

u3 ðuþ 1Þ . . . ðuþ nd 2 1Þ ¼GðuÞukdSkdnd

Gðuþ ndÞ; (1)

where Skn is an unsigned Stirling number of the first kind,i.e., the number of permutations of n elements in k disjointcycles (Charalambides and Singh 1988). Pennings andHermisson (2006) showed that adaptive introgression hassome formal similarities to Equation 1. In particular, if a =2Ns � 1, then after the beneficial allele has reached fixationin the donor population, the number of distinct immigrantalleles that contribute to the sweep also approximate Ewens’distribution, but with the scaled migration rate,M, playing therole of the scaled mutation rate, u (Pennings and Hermisson2006). We cannot observe the number of immigrant allelesdirectly, but we do know that these migrants were drawn atrandom from the donor population and that the diversityin the donor population is described by Ewens’ distributionwith parameter u. As such, the likelihood of observing ourdata are well approximated by

L ðu;Mjkd; kr; nd; nrÞ � Pðkdjnd; uÞPi¼nr

i¼krPðijnr;MÞ Pðkrji; uÞ

¼ GðMÞGðuÞukdþkrSkdnd

GðM þ nrÞGðuþ ndÞXnr

i¼kr

GðuÞMi SinrSkri

Gðuþ iÞ

(2)

(where the summation is over the number of possible migrantsfrom the donor population, which appear in the sample fromthe recipient population). Note that the approximation ofEquation 2 does not depend explicitly on the strength ofselection, a, but this new result does allow us to estimatethe parameters u and M from our mac-1 data. The parame-ters that maximize Equation 2, u and M, are the maximum-likelihood estimates. Confidence intervals on a given parameterwere the values that reduced the log likelihood by two units—obtained by maximizing the likelihood, conditional on the

parameter of interest taking a suboptimal value (Edwards1992). An R script that implements Equation 2 is included asFile S2.

Approximate Bayesian computation approach: Equation 2is an approximation (Pennings and Hermisson 2006); fur-thermore, it considers only the number of allelic classes (kdand kr) and, therefore, ignores some of the information inthe data, such as the allele-frequency spectrum. Accordingly,we also estimated model parameters using an approximateBayesian computation (ABC) approach (Beaumont et al. 2002).This approach uses forward simulation of the adaptive intro-gression model and then compares summary statistics of thesimulated and observed data.

To simulate the model, mutation–drift equilibriumwas firstachieved by sampling directly from the full Ewens (1972)distribution, as implemented in the program MONTECARLO(Slatkin 1994). Then, we ran forward-in-time simulations ofthe secondary contact, ending each simulation when the ben-eficial allele had reached fixation in the recipient population.

To compare the fit of the simulations to our data, we usedseveral summary statistics. First, for each of the two specieswhere d denotes the donor species and r denotes the recipientspecies, we calculated (1) the number of allelic classes (kdand kr), (2) the expected heterozygosity at the locus (Hed andHer; Nei 1978), (3) the standard deviation in frequenciesover all allelic classes (Sdd and Sdr), and (4) the proportionof private alleles (not found in the other species, %Pd and %Pr). We also used (5) the between-species diversity FST (Nei1973) and (6) the proportional diversity after introgressiondescribed by Her/Hed. All of these summary statistics for thetrue data are given in Table S1.

To estimate the parameters, we performed 950,000 for-ward simulations in total. For each simulation, parameterswere log-tangent transformed (Hamilton et al. 2005), and wethen calculated the Euclidean distance between the observedand simulated transformed parameters. The 1000 simulationswith the smallest associated Euclidean distance were thenretained, and the posterior distribution of the scaled param-eters was estimated by means of weighted nonlinear multi-variate regressions of the parameters on the summary statistics(Blum and Francois 2010). For each regression, 50 feed-forwardneural networks and 25 hidden networks were trained usingthe R package abc (Csilléry et al. 2012) and results wereaveraged over the replicate networks. Priors on the modelparameters were all uniform (flat), and used the followingranges: u � [0.2, 40], M � [0.2, 200], a � [2, 2000].

We evaluated the performance of the ABC method in twoways. First, we performed a goodness-of-fit test (Cornuetet al. 2010). This involved simulating 50,000 data sets withparameters drawn from their estimated posterior distributionsand then comparing the summary statistics of these simulateddata sets to the statistics of the observed data. Second, wecomputed the mean bias statistic, b ¼ 1=n

Pi½ei=vi�, where n

is the number of summary statistics, ei is the median estimation,and vi is the true median value of the ith data set (Excoffieret al. 2005). We generated 100 pseudo-observed data sets

942 C. Fraïsse et al.

Page 5: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

with known parameter values drawn from the prior distri-butions and computed b for each scaled parameter.

Forward simulations of adaptive introgression written inC++ are available (see File S2) together with R scriptsimplementing a simple version of the ABC approach.

Models of speciation: To compare our estimate of M at mac-1 to the extent of gene flow between the Brittany patch ofM. galloprovincialis and M. edulis, we took advantage of theABC approach described in Roux et al. (2013) that accountsfor a putative genome-wide heterogeneity in introgressionrates. We obtained DNA sequence data at the eight nuclearloci described in Roux et al. (2014) for the North Sea patchof M. edulis and the Brittany patch of M. galloprovincialis.Only silent positions (i.e., synonymous polymorphisms incoding regions and noncoding polymorphisms in introns orintergenic regions) were retained in the analysis.

We investigated four models of speciation differentiatedby their temporal patterns of introgression (see Figure S8):

(1) strict isolation (SI) between the two daughter species,(2) isolation with migration (IM) assuming continuous geneflow since the two species started to diverge, (3) ancientmigration (AM), where migration is restricted to the earlyperiod of speciation, and (4) secondary contact (SC), wherethe two daughter populations first evolve in strict isolationand then experience gene flow in a secondary contact. Formodels with gene flow (IM, AM, and SC), we compared twoalternative scenarios in which the effective migration ratewas either homogeneous or heterogeneous among loci.

Five million multilocus simulations were performed foreach model using the coalescent simulator Msnsam (Hudson2002; Ross-Ibarra et al. 2008). We then compared the simu-lated and observed data sets by using an array of summarystatistics widely used in the literature (Wakeley andHey 1997;Fagundes et al. 2007; Ross-Ibarra et al. 2008): (1) nucleotidediversity (p, Tajima 1983); (2) Watterson’s uw (Watterson1975); (3) net interspecific divergence (netdivedu-gal); and(4) between-species differentiation computed as 1 2 pS/pT,

Figure 2 Genome scan of inter- and intraspecies dif-ferentiation. (A) FST values between species againstvalues within M. galloprovincialis (Iberian Coast vs.Brittany). Values for the FST between species werebased on the mean allele frequency of the two patchesfor each species. Dotted lines represent the thresholdat 99% of the Lewontin and Krakauer test for the twocomparisons. Dashed lines represent the 95% and99% quantiles of the neutral envelope of FST obtainedfrom our custom simulation method. Loci detected byall methods as outliers within M. galloprovincialis aredepicted by open circles while those outliers in four ofsix methods are shaded. The locus mac-1 is the onlyoutlier with all methods in the interspecific and intra-specific comparisons (top right square). On the marginsare represented the corresponding FST distributions.Genome scan was based on 440 loci (422 AFLPs and18 codominant nuclear markers). All methods are de-scribed in File S1. (B) Frequency of the galloprovincialisallele at the three outlier loci in the four panmicticpatches of the studied area.M. edulis patches are solid(North Sea and Bay of Biscay), while M. galloprovincia-lis patches are open (Iberian Coast and Brittany). Wedefined the galloprovincialis allele as the less frequentin the two patches of M. edulis.

Local Introgression in a Mosaic Hybrid Zone 943

Page 6: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

where pS is the average pairwise nucleotide diversity withinspecies and pT is the total pairwise nucleotide diversity of thepooled sample across species. We assessed departure frommutation–drift equilibrium using Tajima’s D (Tajima 1989a,b). We also classified the variable genomic positions accord-ing to whether they are exclusively polymorphic in M. edulis(Sx.edu), exclusively polymorphic inM. galloprovincialis (Sx.gal),with different alleles fixed in both species (Sf), or sharing thesame polymorphism (SS). We computed the average and stan-dard deviation of each of these statistics across the surveyedloci by using MScalc (available from http://www.abcgwh.sitew.ch/; see Roux et al. 2011). We provide the observedvalues computed from the sequenced data in Table S2. Fulldetail of priors, model choice and model checking procedures,as well as parameter estimation, are described in File S1.

Results

Detection of interspecific and intraspecific outliers

Figure 2A compares interspecific and intraspecific differen-tiation at screened loci. The corresponding figure in M.edulis is provided (see Figure S1). Differentiation betweenpatches of the same species was generally low (in M.gallo-provincialis FST = 0.021; in M. edulis FST = 0.024), buthighly variable between loci, and a few loci were identifiedas outliers using several methods of detection. In particular,all methods agreed that three loci were outliers within M.galloprovincialis (ACG.CGA.206 FST = 0.189; mac-1 FST =0.192 and CAG.CTC.317 FST = 0.224; Figure 2A), and twoloci within M. edulis (ACG.CGA.152 FST = 0. 294 and CAG.CGA.363 FST = 0.374; Figure S1). In the comparison be-tween species, differentiation was also low (FST = 0.068)but the distribution of FST values was overdispersed, as onewould expect when a semipermeable barrier to gene flowisolates two species. The locus mac-1, which was highlydifferentiated within M. galloprovincialis, was also highlydifferentiated between species (FST = 0.507) and was theonly locus to be detected as an outlier in both kinds ofcomparison.

To see this in more detail, Figure 2B shows allele frequen-cies in all four patches at the three loci identified as outlierswithin M. galloprovincialis. While all three loci show allele-frequency differences within M. galloprovincialis, only mac-1shows allele-frequency differences between the Brittanypatch of M. galloprovincialis and the neighboring patchesof M. edulis.

Detailed analysis of mac-1

Because the pattern of differentiation at mac-1 was unique,this locus was investigated in more detail. Specifically, wesequenced three regions, comprising a total of 3.1 kb in thevicinity of mac-1 (Figure S2). All three regions containedseveral SNPs, and the first region also contained indels thatwere used to define PCR primers that generate length poly-morphism of the PCR product, indel-in0. Summary statisticsof the genetic diversity within each sequenced region arefound in Table 2, while Table 3 describes the genetic diver-sity of the two PCR product-length polymorphisms.

We next constructed genealogies of the three regions.Figure 3 shows the genealogy of region 2 where alleles fromthe Brittany patch of M. galloprovincialis are found in twodistinct clades, grouping with (1) alleles from both M. edulispatches (“edulis clade”; Figure 3) and (2) with the othergalloprovincialis alleles (“galloprovincialis clade”; Figure 3).Similar patterns are also found in regions 1 and 3 (FigureS3) and were robust to the exclusion of possible recombi-nants (Table 2).

Together, these patterns are consistent with the results ofFST outlier tests and suggest that there has been an intro-gression of mac-1 alleles from M. edulis into the Brittanypatch of M. galloprovincialis, which lies between them. Fur-thermore, comparison of the Brittany alleles that group inthe edulis clade (open circles within the “edulis clade”; Fig-ure 3), with the related alleles found in the two M. edulispatches (solid squares and circles, Figure 3), suggests thatvery similar levels of genetic diversity are found in all threepatches (see Table 2 and Table 3). For example, the pro-portion of diversity introgressed at the locus indel-in0 is

Table 2 Genetic statistics of the sequenced regions

Region nedu ngal Sedu Sgal Ssedu Ssgal Snsedu Snsgal Sncedu Sncgal

1 17 19 14 34 1 3 2 4 11 272 19 11 53 51 0 0 0 0 53 513 27 27 25 19 16 11 9 8 0 0

Region usedu usgal upeduupgal

RmeduRmgal

Dedu Dgal Hedu Hgal

1 0.004 0.006 0.005 0.011 2 2 20.44 21.78* 0.82 2.912 0.103 0.013 0.138 0.016 7 4 21.03 20.77 20.25 6.43 0.005 0.004 0.009 0.007 2 2 21.4 21.37 2.58 23.26

edu denotes the patches of M. edulis (North Sea and Bay of Biscay); gal denotes the Brittany patch of M. galloprovincialis. Only edulis-derived alleleswere considered. Significance of D and H were tested with standard-coalescent simulations: *P-value ,0.05. n: number of sequences sampled; S,total number of polymorphic sites; Ss, number of synonymous mutations; Sns, number of nonsynonymous mutations; Snc, number of noncodingmutations; uS, diversity estimated from the number of polymorphic sites (Watterson 1975); up, diversity estimated from pairwise differences (Tajima1983); Rm, minimum number of recombination events (Hudson and Kaplan 1985); D, Tajima’s D (Tajima 1989a; 1989b); H, Fay and Wu’s H (Fay andWu 2000).

944 C. Fraïsse et al.

Page 7: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

Hegal/Heedu = 0.67 and at mac-1 is Hegal/Heedu = 0.96 (seeTable 3). Additionally, there is no evidence of departurefrom neutrality in any of the three sets of alleles (see theresults of Tajima’s D and Fay and Wu’s H in Table 2).

One possible interpretation of these results is that mac-1has been subject to adaptive introgression from M. edulisinto Brittany patch of M. galloprovincialis. The followingsections use several methods to test the validity of thishypothesis.

Testing for the chromosomal signature ofadaptive introgression

One genomic signature of local adaptation is a single sharppeak in the frequencies of linked neutral alleles, centered onthe adaptive polymorphism (Charlesworth et al. 1997). Totest for this signature we identified four polymorphisms, twoindels and two SNPs, that mapped to an internal branch oftheir corresponding genealogy (Figure 3 for region 2 andFigure S3 for regions 1 and 3). These polymorphisms wereevenly spaced along the 3.1-kb sequence, and their locationsare shown in Figure S2 (triangles for indels and stars forSNPs). Figure 4 shows the variation of the edulis allele alongthe sequence in the four patches. Whereas the frequency ofthe edulis allele remains consistently low in the IberianCoast patch (Fqedu , 0.05) and consistently high in theNorth Sea and Bay of Biscay patches (Fqedu . 0.95), wenoted a slight increase of the edulis allele toward the 59 sideof mac-1 in the Brittany patch. However, there was a surpris-ing lack of variation along the whole sequence: the edulisallele frequency remains �0.4 over several kilobase pairs.

Model fitting of adaptive introgression

The limited scale of our chromosomal walking does notallow us to reject any hypotheses with high confidence.Accordingly, we next tested whether adaptive introgressionmight plausibly explain the pattern of diversity observed atmac-1.

First, to illustrate the effects of adaptive introgression ondiversity, Figure 5 plots the expected reduction in diversityagainst the rate of gene flow (see also Pennings and Her-misson 2006). When M , 0.01 we see a clear “hard sweep”with a single haplotype reaching high frequency in the re-cipient deme, while when M . 1, most of the heterospecificdiversity introgresses into the recipient background (a clas-sic soft sweep with no detectable reduction in diversity). For

intermediate values of M, the effect is also intermediate,with both a soft sweep and a reduction in diversity.

Building on the results of Pennings and Hermisson(2006), we fitted a model of adaptive introgression to ourmac-1 data. We first developed an ML approach to estimatethe population mutation rate, u = 2Nm, in the M. edulispopulations (North Sea and Bay of Biscay), and the effectivenumber of migrants, M = 2Nm, from these populations tothe Brittany patch of M. galloprovincialis (see Equation 2).Considering only edulis-derived alleles, the total number ofalleles sampled, and the number of distinct allelic classesobserved were nd = 177 and kd = 15 for the North Seaand Bay of Biscay patches and nr = 60 and kr = 6 for theBrittany patch (Table 3). From these data, we obtained a rel-atively precise estimate of the population mutation rate:u ¼ 3:79 [1.95 – 6.66] (see solid lines, Figure 6A). However,the effective number of migrants was imprecisely estimated,with low migration rates rejected but no upper bound:M ¼ 4:63 [0.95–] (see solid lines, Figure 6B). The lack ofan upper bound on the estimate reflects the fact that nearlyall of the allelic diversity from the edulis patches had beenintrogressed into the Brittany patch of M. galloprovincialis.

In an attempt to improve our inference, we next used anABC approach, which allowed us to relax the assumption ofstrong selection and to consider more of the information inthe data. Parameter posterior distributions, corrected formean bias, are shown in Figure 6 as shaded circles. In thefollowing, we give their median and 95% credible intervals.

Figure 3 Allele genealogy reconstructed by the neighbor-joining algo-rithm apply to the number of nucleotide differences (region 2). Sequencessampled in M. edulis patches (solid squares, North Sea; solid circles, Bayof Biscay) cluster within the “edulis” clade, while those sampled in theIberian Coast patch (open squares) of M. galloprovincialis cluster withinthe “galloprovincialis” clade. In Brittany (open circles), sequences arefound in both clades.

Table 3 Genetic statistics of the PCR product lengthpolymorphisms

Locus nedu ngal kedu kgal Heedu Hegal fedu fgal Q

indel-in0 92 28 9 5 0.364 0.244 0.049 20.071 20.006mac-1 177 60 15 6 0.761 0.728 0.0962 20.053 0.0097

edu denotes the patches of M. edulis (North Sea and Bay of Biscay); gal denotes theBrittany patch of M. galloprovincialis. Only edulis-derived alleles were considered.Significance of f and Q were tested with 1000 permutations: *P-value ,0.05. n,number of alleles sampled; k, number of allelic classes; He, heterozygosity (Nei1978); f, departure from HWE (Weir and Cockerham 1984); Q, genetic differenti-ation (Weir and Cockerham 1984).

Local Introgression in a Mosaic Hybrid Zone 945

Page 8: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

We obtained an even more precise estimate of the scaledmutation rate (Figure 6A): umedian = 2.82 [1.42 – 4.53]whose bounds are similar to the ML estimates. Estimationof the scaled migration rate (Figure 6B): Mmedian = 99 [24 –

154] showed a clear discrepancy between the two methodswith respect to the lower bound. Compared to ML, the lowerbound estimation was M � 1 while its upper bound waspoorly estimated (value approximates the prior upperbound). Finally, the estimation of selective strength, amedian =402 [203 – 594] (Figure 6C) was imprecise but compati-ble with the assumption of strong selection used in the MLinference.

To better understand the discrepancy between the twomethods, we reran the ABC estimation, using only thesummary statistics that appear in the ML equations (namely,nd, kd, nr, and kr). In this case, estimates of u and M werevery similar to the MLEs (Figure S4), and there was no in-formation on a as expected. The results of the ABC approachtherefore come from the additional use of summary statisticsthat contain information that is not used in the ML approach.We therefore conducted a goodness-of-fit test to evaluatehow well the model fit the data at mac-1 when estimatingthe parameters by each method (ML and ABC using all sta-tistics). Figure S5 shows the goodness-of-fit of ML estimation(blue dots) and ABC estimation (red dots) for each statistic.In the main, both methods fit the data well (see Table S1 forquantiles and P-values) but looking at the median values, MLperforms better. In particular, ABC tends to overestimate krand Her, which basically explains the discrepancy in the esti-mation of M. The ML method proved to be the most under-standable approach with a simpler model that retrieved thebest fit to the data and consequently our interpretations relyon the ML estimates.

Finally, we used our simulations to assess the effects ofviolating some assumptions of our model: (1) the assumption

of complete linkage between the selected and neutral loci and(2) the assumption that migration rates remained constant.Figure S6 shows that the sweep pattern was indeed robust toincluding recombination. Figure S7 suggests that our resultswill also be fairly robust to temporal variation in migrationrates, particularly in mussels where dispersal rates are veryhigh.

Insights into secondary contact history

Both estimation methods agree that M . 1 best explains ourdata if adaptive introgression has taken place. To assess theplausibility of our estimation, we analyzed several models ofspeciation (Figure S8) between M. edulis and the Brittanypatch of M. galloprovincialis.

We first applied a model choice procedure for each ofthe three speciation models with gene flow to test whetherheterogeneity in introgression rates was supported. Weobserved clear support for heterogeneous migration (TableS3, “within model”). We then applied the model-choice pro-cedure to the speciation models, assuming heterogeneousmigration (Table S3, “between models”). Secondary contactwas the best-supported model with posterior probability,Ppost = 0.61, confirming a previous study (Roux et al. 2014).We tested the robustness of this result by simulating 500pseudo-observed data sets for each model. We empiricallyestimated that given a posterior probability of 0.61, theprobability that secondary contact with heterogeneous in-trogression rates is the correct model was 0.987 (TableS4). It is worth mentioning that when migration rate wasassumed to be homogeneous among loci, the method failed

Figure 5 Expected sweep effect in the adaptive introgression model. Theproportional diversity after introgression from d (donor species) to r (re-cipient species), Her/Hed, is shown as a function of the scaled migrationrate, M. Errors bars represent standard deviation calculated over 5000replicates. Median values are depicted by circles. Parameter values wereN = 10000, u = 2, a = 1000, r = 0 (complete linkage), and nd = nr = 50(sample size). Simulations were run until the beneficial allele reachedfixation in the recipient background.

Figure 4 Frequency of the edulis allele at the four polymorphisms alongthe studied genomic region in the four patches. Position (in kilobase) isindicated by asterisks for SNPs and by inverted triangles for PCR product-length polymorphisms. M. edulis patches are solid (squares, North Sea;circles, Bay of Biscay), while M. galloprovincialis patches are open(squares, Iberian Coast; circles, Brittany). Sample size was n = 48, exceptin Brittany (n = 87).

946 C. Fraïsse et al.

Page 9: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

to infer secondary contact; instead ancient migration wasthe best-supported model (SC, Ppost = 0.26 and AM, Ppost =0.65).

We then estimated the multilocus distributions of in-trogression rates in the best-supported SC model withheterogeneous migration (Figure 7). We found an asymme-try of introgression in favor ofM. edulis: Mgallo to edulis = 0.15[0.002 – 1.06] (see open bars, Figure 7) and Medulis to gallo =0.45 [0.013 – 2.84] (see shaded bars, Figure 7). However,compared to the multilocus inference, introgression at mac-1was strongly unidirectional from M. edulis to M. gallopro-vincialis as suggested by the comparison of different scenar-ios of introgression (SC with introgression into M. edulis:Ppost = 0.0013; SC with introgression into M. galloprovincia-lis: Ppost = 0.9997; see Table S5). Furthermore introgressionrate into M. galloprovincialis at mac-1 was very high com-pared to the extent of gene flow across loci (Mmedian = 2.50,solid line, Figure 7) that is consistent with the mac-1 gene-alogy (Figure 3). In comparison, our ML estimation of M atmac-1 in the adaptive introgression model (M ¼ 4:63, Fig-ure 6B) was outside the 95% quantile of the multilocusdistribution, but its lower bound (MML lower bound = 0.95,dashed line, Figure 7) remained consistent with the second-ary contact inference.

Discussion

Adaptive introgression between hybridizing species has longbeen proposed as a potentially important source of newadaptations in plants and, more recently, in animals too(Arnold 2004; Hedrick 2013). Most of all, compared to ad-aptation from new mutations or standing variation, adaptiveintrogression might allow big evolutionary jumps by theacquisition of complex variants (Wright 1949; Mallet 2007;Abbott et al. 2013).

Genome scan surveys of differential introgression amongloci can help to localize genomic regions that introgressadaptively (Payseur 2010). Here, we took advantage of themosaic structure of the hybrid zone of mussels along thewestern European coastline, to perform a multilocus scanof local introgression. As pointed out by Harrison and Rand(1989), an exciting feature of mosaic hybrid zones is thatthey involve independent contacts between species, eachwith a unique evolutionary trajectory. In addition to randomfluctuations, the outcomes will be contingent to the relativeabundance of the two species, local environmental condi-tions, and the genetic architecture of their reproductiveisolation.

Our multilocus scan, based on AFLP and codominantmarkers, showed some genetic differences between theperipheral patch of M. galloprovincialis in the Iberian Coastand the patch enclosed in Brittany. In our analysis, AFLPmarkers were used mainly to obtain an estimation of thegenomic distribution of FST values. We also have fitted a spe-ciation model to an independent DNA sequence data set. Wetherefore checked whether the parameters inferred couldreproduce the FST distribution observed at AFLP markers.The two distributions—AFLPs vs. simulations under the bestdemographic model inferred—were very similar (FigureS9), suggesting that we have a good estimate of the genomicdistribution of FST. To mitigate the problem of false posi-tives, due to an underestimation of the neutral variance ofFST (Robertson 1975), we cross checked the results of sixmethods, all of which assume different population struc-tures. In this way, we identified three loci with a robust de-viation from the neutral expectations (open circles, Figure2A). When compared to other methods, simulations of theinferred secondary contact model with heterogeneous geneflow produced an elevated variance of FST in both thebetween-species comparison, and—more surprisingly—in the

Figure 6 Parameter estimation in the adaptive introgres-sion model. Maximum-likelihood estimation of (A) u and(B) M are depicted by black curves. Maximum-likelihoodvalues are denoted by solid vertical lines and confidenceintervals by dotted lines. Posterior distributions of (A) u, (B)M, and (C) a are depicted by shaded circles. Medians aredenoted by solid vertical lines and 95% highest posteriordensity values by dotted lines. Values are corrected forparameter bias in the ABC inference (see text for moredetails).

Local Introgression in a Mosaic Hybrid Zone 947

Page 10: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

within-species comparison. Nevertheless, even under thisscenario, the combination of a high differentiation withinand between species observed at mac-1 was not observedunder neutrality (see dashed lines, Figure 2A). The geneal-ogy of mac-1 confirmed that this was due to local introgres-sion of M. edulis alleles into the Brittany patch of M.galloprovincialis, but not into the patch on the Iberian Coast(Figure 3). This pattern contrasts strongly with results at theother loci analyzed.

The locus-specific nature of the introgression at mac-1suggested that selective processes were acting rather thanpurely demographic ones. The highly asymmetrical intro-gression rates from M. edulis (Table S5) contrasts with themultilocus pattern (Figure 7), and so we hypothesized thatmac-1 might have been subject to adaptive introgression,with edulis alleles sweeping to fixation in the Brittany patchof M. galloprovincialis.

Demonstrating adaptive introgression requires a combi-nation of evidence that is hard to obtain in most species(Vekemans 2010). Few studies have undertaken experi-ments to confirm the fitness advantage of the introgressedtraits/alleles in the recipient backgrounds. Exceptions in-clude the adaptive introgression of drought tolerance in Hel-ianthus annuus (Whitney et al. 2010) and the adaptiveintrogression of rodent poison resistance in the housemouse, Mus musculus domesticus (Song et al. 2011). In con-trast, most studies rely on indirect evidence involving geo-graphical variation in allele frequencies, analysis of DNApolymorphism, and the reconstruction of gene genealogies.For example, Heliconius Genome Consortium (2012), Pardo-Diaz et al. (2012), and Smith and Kronforst (2013) provided

strong evidence for the adaptive introgression of wing colorpattern genes involved in Müllerian mimicry between hy-bridizing species of Heliconius. In humans, Mendez et al.(2012a,b, 2013) found a high frequency of divergent hap-lotypes at two immune genes, STAT2 and OAS1, in modernMelanesians. These haplotypes share recent common ances-try with archaic hominin sequences (STAT2, Neanderthal;OAS1, Denisovan). This result is suggestive of ancient intro-gressions, but their adaptive nature remains hypothetical.Similarly Roux et al. (2013) identified genomic hotspots ofintrogression between two highly divergent sea squirt spe-cies but could draw no conclusions about their adaptivenature, illustrating that specific tests need to be developed.

The patterns that we observe at mac-1 differ in one re-spect from previously studied cases of outlying introgres-sion. In particular, in most previous cases, there has beena reduction in the genetic diversity in the recipient popula-tion, when compared to the donor population (see Table 1).In contrast, at themac-1 locus, 96% of theM. edulis diversitywas introgressed to M. galloprovincialis, and there were noother signs of a classical selective sweep (Table 1, Table 2,and Table 3).

However, it remains possible that the mac-1 alleles haveintrogressed via a soft sweep, with little loss of diversity(Pennings and Hermisson 2006). We tried two approachesto test for a soft sweep at this locus. First, we analyzed fourpolymorphisms along the sequence (Figure 4) with the hopeof observing a gradient of introgression. Unfortunately, thechromosomal scale proved to be too small. Second, we usedtwo new inference methods to test whether the rate of mi-gration required to generate the soft sweep effect was con-sistent with the existence of an efficient genetic barrier atmac-1 elsewhere in the range. Our inference suggested thatM = 2Nm . 1 was required to explain our data under theadaptive introgression model, and the ML estimate (M ¼ 4:63;Figure 6B) was outside of the 95% quantile of the multilocusdistribution of introgression rates between M. edulis and theBrittany patch of M. galloprovincialis (Figure 7). However, itslower bound remained consistent with the secondary contactinference (MML lower bound = 0.95, dashed line, Figure 7), andso we cannot reject the hypothesis of adaptive introgression.

While we cannot reject adaptive introgression, we mustalso recognize alternative hypotheses. Migration pressurecan easily spread gene combinations, even if these areuniversally selected against (e.g., Barton 1992). It is there-fore possible that our results could be explained by a geo-graphically and genomically localized change in the barrierstrength between M. edulis and M. galloprovincialis. Thegenomic region around mac-1 may contain barrier loci thatwere maintained in some patches, but not in the Brittanypatch—perhaps due to differences in local ecological selec-tion or because an asymmetric intrinsic incompatibility man-aged to uncouple from the tension zone.

Together, our results emphasize the difficulties in inter-preting outlier loci in populations that have undergonesecondary contact. This is unsuprising given that patterns of

Figure 7 Multilocus distributions of introgression rates estimated in thesecondary contact model between M. edulis (North Sea) and M. gallo-provincialis (Brittany). Introgression from M. edulis into M. galloprovincia-lis is shown as shaded, while introgression in the opposite direction isshown as open. Distributions were obtained after randomly sampling1000 values from each of the 2000 beta distributions retained by theABC analysis. The solid line shows the median introgression rate intoM. galloprovincialis estimated at mac-1 in the secondary contact model(MSC median = 2.50). For comparison, the lower bound of the maximum-likelihood estimation in the adaptive introgression model is reported asa dotted line (MML lower bound = 0.95).

948 C. Fraïsse et al.

Page 11: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

introgression after such a contact can vary dramatically,depending on the way divergence occurred in allopatry, theresulting genetic architecture of the barrier, and the locallandscape where the contact takes place—including its pop-ulation density, connectivity, and environmental variation(Bierne et al. 2011, 2013a;Domingues et al. 2012; Strasburget al. 2012; Abbott et al. 2013; Gagnaire et al. 2013). Nev-ertheless, our best hope of understanding these issues is toapply dedicated methods, such as those developed here, tomuch larger genome-wide data sets.

Acknowledgments

The authors are grateful to Marie-Thérèse Augé and CéliaGosset for their technical help with this work and MarcOhresser who kindly provided the sequence of the mac-1region. Molecular data were produced through the ISEMplatform Génomique marine at the Station Méditerranennede l’Environnement Littoral [OSU OREME (Observatoire deRecherche Méditerranéen de l’Environnement)] and theplatform Génomique Environnementale of the LabExCeMEB (Laboratoire d’Excellence Centre Méditerranéen del’Environnement et de la Biodiversité). Numerical resultspresented in this article were carried out using the ISEMcomputing cluster at the platform Montpellier Bioinforma-tique et Biodiversité of the LabEx CeMEB, and we are grate-ful to its staff. This work was funded by the Agence Nationalde la Recherche (HYSEA project, ANR-12-BSV7- 0011) andthe project Aquagenet (SUDOE, INTERREG IV B). This isarticle 2014-044 of Institut des Sciences de l’Evolution deMontpellier (ISEM).

Literature Cited

Abbott, R., D. Albach, S. Ansell, J. W. Arntzen, S. J. E. Baird et al.,2013 Hybridization and speciation. J. Evol. Biol. 26(2): 229–246.

Arnold, M. L., 2004 Transfer and origin of adaptations throughnatural hybridization: Were Anderson and Stebbins right? PlantCell 16(3): 562–570.

Barton, N. H., 1979 The dynamics of hybrid zones. Heredity43(3): 341–359.

Barton, N. H., 1992 On the spread of new gene combinations inthe third phase of Wright’s shifting-balance. Evolution 46(2):551–557.

Barton, N. H., 2013 Does hybridization influence speciation? J.Evol. Biol. 26(2): 267–269.

Barton, N. H., and B. O. Bengtsson, 1986 The barrier to geneticexchange between hybridising populations. Heredity 57: 357–376.

Beaumont, M. A., and D. J. Balding, 2004 Identifying adaptivegenetic divergence among populations from genome scans.Mol. Ecol. 13(4): 969–980.

Beaumont, M. A., and R. A. Nichols, 1996 Evaluating loci for usein the genetic analysis of population structure. Proc. Biol. Sci.263(1377): 1619–1626.

Beaumont, M. A., W. Zhang, and D. J. Balding, 2002 ApproximateBayesian computation in population genetics. Genetics 162:2025–2035.

Belkhir, K., P. Borsa, L. Chikhi, N. Raufaste, and F. Bonhomme,1996–2004 GENETIX, version 4.05. http://kimura.univ-montp2.fr/genetix/

Bierne, N., 2010 The distinctive footprints of local hitchhiking ina varied environment and global hitchhiking in a subdividedpopulation. Evolution 64(11): 3254–3272.

Bierne, N., P. David, P. Boudry, and F. Bonhomme, 2002 Assortativefertilization and selection at larval stage in the musselsMytilus edulisand M. galloprovincialis. Evolution 56(2): 292–298.

Bierne, N., F. Bonhomme, and P. David, 2003a Habitat preferenceand the marine-speciation paradox. Proc. Biol. Sci. 270(1522):1399–1406.

Bierne, N., P. Borsa, C. Daguin, D. Jollivet, F. Viard et al.,2003b Introgression patterns in the mosaic hybrid zone be-tween Mytilus edulis and M. galloprovincialis. Mol. Ecol. 12(2):447–461.

Bierne, N., C. Daguin, F. Bonhomme, P. David, and P. Borsa,2003c Direct selection on allozymes is not required to explainheterogeneity among marker loci across a Mytilus hybrid zone.Mol. Ecol. 12(9): 2505–2510.

Bierne, N., F. Bonhomme, P. Boudry, M. Szulkin, and P. David,2006 Fitness landscapes support the dominance theory ofpost-zygotic isolation in the mussels Mytilus edulis and M. gallo-provincialis. Proc. Biol. Sci. 273(1591): 1253–1260.

Bierne, N., J. Welch, E. Loire, F. Bonhomme, and P. David,2011 The coupling hypothesis: why genome scans may failto map local adaptation genes. Mol. Ecol. 20(10): 2044–2072.

Bierne, N., P.-A. Gagnaire, and P. David, 2013a The geography ofintrogression in a patchy environment and the thorn in the sideof ecological speciation. Curr. Zool. 59(1): 72–86.26

Bierne, N., D. Roze, and J. J. Welch, 2013b Pervasive selection oris it. . .?: Why are FST outliers sometimes so frequent? Mol. Ecol.22(8): 2061–2064.

Blum, M. G. B., and O. Francois, 2010 Non-linear regression mod-els for approximate Bayesian computation. Stat. Comput. 20(1):63–73.

Bonhomme, M., C. Chevalet, B. Servin, S. Boitard, J. Abdallah et al.,2010 Detecting selection in population trees: the Lewontinand Krakauer test extended. Genetics 186: 241–262.

Caballero, A., H. Quesada, and E. Roldan-Alvarez, 2008 Impact ofamplified fragment length polymorphism size homoplasy on theestimation of population genetic diversity and the detection ofselective loci. Genetics 179: 539–554.

Charalambides, C. A., and J. Singh, 1988 Review of the stirlingnumbers, their generalizations and statistical applications. Com-mun. Stat. Theory 17(8): 2507–2532.

Charlesworth, B., M. Nordborg, and D. Charlesworth, 1997 Theeffects of local selection, balanced polymorphism and back-ground selection on equilibrium patterns of genetic diversityin subdivided populations. Genet. Res. 70(2): 155–174.

Cornuet, J.-M., V. Ravigné, and A. Estoup, 2010 Inference onpopulation history and model checking using DNA sequenceand microsatellite data with the software DIYABC (v1.0). BMCBioinformatics 11(1): 401.

Corpet, F., 1988 Multiple sequence alignment with hierarchicalclustering. Nucleic Acids Res. 16(22): 10881–10890.

Csilléry, K., O. François, and M. G. B. Blum, 2012 abc: an R pack-age for approximate Bayesian computation (ABC). MethodsEcol. Evol. 3(3): 475–479.

Daguin, C., F. Bonhomme, and P. Borsa, 2001 The zone of sym-patry and hybridization of Mytilus edulis and M. galloprovincia-lis, as described by intron length polymorphism at locus mac-1.Heredity 86: 342–354.

Domingues, V. S., Y.-P. Poh, B. K. Peterson, P. S. Pennings, J. D.Jensen et al., 2012 Evidence of adaptation from ancestral var-iation in young populations of beach mice. Evolution 66(10):3209–3223.

Edwards, A. W. F., 1992 Likelihood. Johns Hopkins University Press,Baltimore, MD.

Local Introgression in a Mosaic Hybrid Zone 949

Page 12: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

Ewens, W. J., 1972 The sampling theory of selectively neutralalleles. Theor. Popul. Biol. 3(1): 87–112.

Excoffier, L., A. Estoup, and J.-M. Cornuet, 2005 Bayesian analysisof an admixture model with mutations and arbitrarily linkedmarkers. Genetics 169: 1727–1738.

Fagundes, N. J. R., N. Ray, M. Beaumont, S. Neuenschwander, F. M.Salzano et al., 2007 Statistical evaluation of alternative mod-els of human evolution. Proc. Natl. Acad. Sci. USA 104(45):17614–17619.

Faure, M. F., P. David, F. Bonhomme, and N. Bierne, 2008 Genetichitchhiking in a subdivided population of Mytilus edulis. BMCEvol. Biol. 8: 164.

Fay, J. C., and C. I. Wu, 2000 Hitchhiking under positive Darwin-ian selection. Genetics 155: 1405–1413.

Gagnaire, P.-A., S. A. Pavey, E. Normandeau, and L. Bernatchez,2013 The genetic architecture of reproductive isolation duringspeciation-with-gene-flow in lake whitefish species pairs as-sessed by RAD sequencing. Evolution 67(9): 2483–2497.

Gosset, C. C., and N. Bierne, 2012 Differential introgression froma sister species explains high FST outlier loci within a musselspecies. J. Evol. Biol. 26(1): 14–26.

Hall, T., 1999 BioEdit: a user-friendly biological sequence align-ment editor and analysis program for Windows 95/98/NT.Nucleic Acids Symp. Ser. 41: 95–98.

Hamilton, G., M. Stoneking, and L. Excoffier, 2005 Molecularanalysis reveals tighter social regulation of immigration in pat-rilocal populations than in matrilocal populations. Proc. Natl.Acad. Sci. USA 102(21): 7476–7480.

Harrison, R. G., 1986 Pattern and process in a narrow hybridzone. Heredity 56(3): 337–349.

Harrison, R. G., and D. M. Rand, 1989 Mosaic hybrid zone andthe nature of species boundaries, pp. 111–133 in Speciation andIts Consequences, edited by D. Otte, and J. A. Endler. SinauerAssociates, Sunderland, MA.

Hedrick, P. W., 2013 Adaptive introgression in animals: examplesand comparison to new mutation and standing variation assources of adaptive variation. Mol. Ecol. 22(18): 4606–4618.

Heliconius Genome Consortium, 2012 Butterfly genome revealspromiscuous exchange of mimicry adaptations among species.Nature 487(7405): 94–98.

Hewitt, G. M., 2011 Quaternary phylogeography: the roots ofhybrid zones. Genetica 139(5): 617–638.

Hilbish, T. J., F. P. Lima, P. M. Brannock, E. K. Fly, R. L. Rognstadet al., 2012 Change and stasis in marine hybrid zones in re-sponse to climate warming. J. Biogeogr. 39(4): 676–687.

Hohenlohe, P. A., S. Bassham, P. D. Etter, N. Stiffler, E. A. Johnsonet al., 2010 Population genomics of parallel adaptation inthreespine stickleback using sequenced RAD tags. PLoS Genet.6(2): e1000862.

Hudson, R. R., 2002 Generating samples under a Wright–Fisherneutral model of genetic variation. Bioinformatics 18(2): 337–338.

Hudson, R. R., and N. L. Kaplan, 1985 Statistical properties of thenumber of recombination events in the history of a sample ofDNA sequences. Genetics 111: 147–164.

Jackson, B., T. Kawakami, S. Cooper, J. Galindo, and R. Butlin,2012 A genome scan and linkage disequilibrium analysisamong chromosomal races of the Australian grasshopper Van-diemenella viatica. PLoS ONE 7(10): e47549.

Kemppainen, P., T. Lindskog, R. Butlin, and K. Johannesson,2011 Intron sequences of arginine kinase in an intertidal snailsuggest an ecotype-specific selective sweep and a gene duplica-tion. Heredity 106(5): 808–816.

Kumar, S., K. Tamura, and M. Nei, 2004 MEGA3: integrated soft-ware for molecular evolutionary genetics analysis and sequencealignment. Brief. Bioinform. 5(2): 150–163.

Lehman, N., A. Eisenhawer, K. Hansen, L. D. Mech, R. O. Petersonet al., 1991 Introgression of coyote mitochondrial DNA intosympatric North American gray wolf populations. Evolution45(1): 104–119.

Lewontin, R. C., and J. Krakauer, 1973 Distribution of gene fre-quency as a test of the theory of the selective neutrality of poly-morphisms. Genetics 74: 175–195.

Mallet, J., 2007 Hybrid speciation. Nature 446(7133): 279–283.Maynard Smith, J., and J. Haigh, 1974 The hitchhiking effect of

a favourable gene. Genet. Res. 23(01): 23–35.McDonald, J. H., 1994 Detecting natural selection by comparing

geographic variation in protein and DNA polymorphisms, pp.88–100 in Non-neutral Evolution, edited by B. Golding, Springer,New York.

McKinnon, G. E., R. E. Vaillancourt, D. A. Steane, and B. M. Potts,2004 The rare silver gum, Eucalyptus cordata, is leaving itstrace in the organellar gene pool of Eucalyptus globulus. Mol.Ecol. 13(12): 3751–3762.

Mendez, F. L., J. C. Watkins, and M. F. Hammer, 2012a Globalgenetic variation at OAS1 provides evidence of archaic admix-ture in melanesian populations. Mol. Biol. Evol. 29(6): 1513–1520.

Mendez, F. L., J. C. Watkins, and M. F. Hammer, 2012b A haplo-type at STAT2 introgressed from neanderthals and serves asa candidate of positive selection in Papua New Guinea. Am. J.Hum. Genet. 91(2): 265–274.

Mendez, F. L., J. C. Watkins, and M. F. Hammer, 2013 Neandertalorigin of genetic variation at the cluster of OAS immunity genes.Mol. Biol. Evol. 30(4): 798–801.

Nachman, M. W., and B. A. Payseur, 2012 Recombination ratevariation and speciation: theoretical predictions and empiricalresults from rabbits and mice. Philos. Trans. R. Soc. B 367(1587): 409–421.

Nadeau, N. J., A. Whibley, R. T. Jones, J. W. Davey, K. K. Dasmahapatraet al., 2012 Genomic islands of divergence in hybridizingHeliconius butterflies identified by large-scale targeted sequencing.Philos. Trans. R. Soc. B 367(1587): 343–353.

Narita, S., M. Nomura, Y. Kato, and T. Fukatsu, 2006 Geneticstructure of sibling butterfly species affected by Wolbachia in-fection sweep: evolutionary and biogeographical implications.Mol. Ecol. 15(4): 1095–1108.

Nei, M., 1973 Analysis of gene diversity in subdivided popula-tions. Proc. Natl. Acad. Sci. USA 70(12): 3321–3323.

Nei, M., 1978 Estimation of average heterozygosity and genetic dis-tance from a small number of individuals. Genetics 89: 583–590.

Pardo-Diaz, C., C. Salazar, S. W. Baxter, C. Merot, W. Figueiredo-Ready et al., 2012 Adaptive introgression across species bound-aries in Heliconius butterflies. PLoS Genet. 8(6): e1002752.

Parsons, T. J., S. L. Olson, and M. J. Braun, 1993 Unidirectionalspread of secondary sexual plumage traits across an avian hy-brid zone. Science 260(5114): 1643–1646.

Payseur, B. A., 2010 Using differential introgression in hybridzones to identify genomic regions involved in speciation. Mol.Ecol. Resour. 10(5): 806–820.

Pennings, P. S., and J. Hermisson, 2006 Soft sweeps II: molecularpopulation genetics of adaptation from recurrent mutation ormigration. Mol. Biol. Evol. 23(5): 1076–1084.

Pérez-Figueroa, A., M. J. García-Pereira, M. Saura, E. Rolán-Alvarez, and A. Caballero, 2010 Comparing three differentmethods to detect selective loci using dominant markers. J.Evol. Biol. 23(10): 2267–2276.

Piálek, J., and N. H. Barton, 1997 The spread of an advantageousallele across a barrier: the effects of random drift and selectionagainst heterozygotes. Genetics 145: 493–504.

Quesada, H., R. Wenne, and D. O. Skibinski, 1999 Interspeciestransfer of female mitochondrial DNA is coupled with role-reversals

950 C. Fraïsse et al.

Page 13: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

and departure from neutrality in the mussel Mytilus trossulus. Mol.Biol. Evol. 16(5): 655–665.

Rawson, P. D., and T. J. Hilbish, 1998 Asymmetric introgressionof mitochondrial DNA among european populations of bluemussels (Mytilus spp.). Evolution 52(1): 100–108.

Robertson, A., 1975 Gene frequency distributions as a test of se-lective neutrality. Genetics 81: 775–785.

Roesti, M., D. Moser, and D. Berner, 2013 Recombination in thethreespine stickleback genome-patterns and consequences. Mol.Ecol. 22(11): 3014–3027.

Ross-Ibarra, J., S. I. Wright, J. P. Foxe, A. Kawabe, L. DeRose-Wilsonet al., 2008 Patterns of polymorphism and demographic history innatural populations of Arabidopsis lyrata. PLoS ONE 3(6): e2411.

Roux, C., V. Castric, M. Pauwels, S. I. Wright, P. Saumitou-Lapradeet al., 2011 Does speciation between Arabidopsis halleri andArabidopsis lyrata coincide with major changes in a moleculartarget of adaptation? PLoS ONE 6(11): e26872.

Roux, C., G. Tsagkogeorga, N. Bierne, and N. Galtier, 2013 Crossingthe species barrier: genomic hotspots of introgression betweentwo highly divergent Ciona intestinalis species. Mol. Biol. Evol.30(7): 1574–1587.

Roux, C., C. Fraisse, V. Castric, X. Vekemans, G. H. Pogson et al.,2014 Can we continue to neglect genomic variation in intro-gression rates when inferring the history of speciation?: a casestudy in a Mytilus hybrid zone. J. Evol. Biol. (in press).

Rozas, J., J. C. Sánchez-DelBarrio, X. Messeguer, and R. Rozas,2003 DnaSP, DNA polymorphism analyses by the coalescentand other methods. Bioinformatics 19(18): 2496–2497.

Skibinski, D. O. F., J. A. Beardmore, and T. F. Cross, 1983 Aspectsof the population genetics of Mytilus (Mytilidae; Mollusca) inthe British Isles. Biol. J. Linn. Soc. Lond. 19(2): 137–183.

Slatkin, M., 1994 An exact test for neutrality based on the Ewenssampling distribution. Genet. Res. 64(1): 71–74.

Slatkin, M., 1996 A correction to the exact test based on theEwens sampling distribution. Genet. Res. 68(3): 259–260.

Smith, J., and M. R. Kronforst, 2013 Do Heliconius butterfly spe-cies exchange mimicry alleles? Biol Lett. 9, 20130503.

Song, Y., S. Endepols, N. Klemann, D. Richter, F.-R. Matuschkaet al., 2011 Adaptive introgression of anticoagulant rodent33 poison resistance by hybridization between old world mice.Curr. Biol. 21(15): 1296–1301.

Strasburg, J. L., N. A. Sherman, K. M. Wright, L. C. Moyle, J. H.Willis et al., 2012 What can patterns of differentiation acrossplant genomes tell us about adaptation and speciation? Philos.Trans. R. Soc. B 367(1587): 364–373.

Tajima, F., 1983 Evolutionary relationship of DNA sequences infinite populations. Genetics 105: 437–460.

Tajima, F., 1989a The effect of change in population size on DNApolymorphism. Genetics 123: 597–601.

Tajima, F., 1989b Statistical method for testing the neutral muta-tion hypothesis by DNA polymorphism. Genetics 123: 585–595.

Tanguy, A., N. Bierne, C. Saavedra, B. Pina, E. Bachère et al.,2008 Increasing genomic information in bivalves throughnew EST collections in four species: development of new geneticmarkers for environmental studies and genome evolution. Gene408(1–2): 27–36.

Turner, T. L., M. W. Hahn, and S. V. Nuzhdin, 2005 Genomicislands of speciation in Anopheles gambiae. PLoS Biol. 3: e285.

Vekemans, X., 2010 What’s good for you may be good for me:evidence for adaptive introgression of multiple traits in wildsunflower. New Phytol. 187(1): 6–9.

Vekemans, X., T. Beauwens, M. Lemaire, and I. Roldán-Ruiz,2002 Data from amplified fragment length polymorphism(AFLP) markers show indication of size homoplasy and of a re-lationship between degree of homoplasy and fragment size.Mol. Ecol. 11(1): 139–151.

Vitalis, R., K. Dawson, P. Boursot, and K. Belkhir, 2003 DetSel 1.0:a computer program to detect markers responding to selection.J. Hered. 94(5): 429–431.

Wakeley, J., and J. Hey, 1997 Estimating ancestral populationparameters. Genetics 145: 847–855.

Watterson, G., 1975 On the number of segregating sites in geneticalmodels without recombination. Theor. Popul. Biol. 7(2): 256–276.

Weir, B. S., and C. C. Cockerham, 1984 Estimating F-statistics forthe analysis of population structure. Evolution 38(6): 1358–1370.

Whitlock, R., H. Hipperson, M. Mannarelli, R. K. Butlin, and T.Burke, 2008 An objective, rapid and reproducible method forscoring AFLP peak-height data that minimizes genotyping error.Mol. Ecol. Res. 8(4): 725–735.

Whitney, K. D., R. A. Randell, and L. H. Rieseberg, 2010 Adaptiveintrogression of abiotic tolerance traits in the sunflower Helian-thus annuus. New Phytol. 187(1): 230–239.

Wright, S., 1949 Population structure in evolution. Proc. Am.Philos. Soc. 93(6): 471–478.

Zhivotovsky, L. A., 1999 Estimating population structure in dip-loids with multilocus dominant DNA markers. Mol. Ecol. 8(6):907–913.

Communicating editor: M. A. Beaumont

Local Introgression in a Mosaic Hybrid Zone 951

Page 14: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

GENETICSSupporting Information

http://www.genetics.org/lookup/suppl/doi:10.1534/genetics.114.161380/-/DC1

Gene-Flow in a Mosaic Hybrid Zone: Is LocalIntrogression Adaptive?

Christelle Fraïsse, Camille Roux, John J. Welch, and Nicolas Bierne

Copyright © 2014 by the Genetics Society of AmericaDOI: 10.1534/genetics.114.161380

Page 15: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

!"#$%&'(()#!"#$%&# *#+,!

!"#$%&'%

&())#$*$+,-./%*$,0123%

4(,#"$.%2$,$5,"1+%"+%,0$%*(#,"#15(3%35-+6%

-)./0#&%)#12)#3)1&'.(#/4#12)#('5#6)12/3(#7()3#1/#3)1)81#$+9#/71.')%#./8'"#

'()#:)#7()3#&#3);)./<6)=1#/4#12)#>)0/=1'=#&=3#?%&@&7)%#ABCDEF#1)(1#4/%#G)10))=H./87(#2)1)%/I)=)'1J#'=#:%'I21K(#4'5&1'/=#

'=3'8)L# $+9"# 9/# 6'='6'M)# 4&.()# </('1';)(# %)(7.1'=I# 4%/6# 8/6<.)5# </<7.&1'/=# (1%7817%)L# 0)# 4/../0)3# 12)# 8/=()%;&1';)#

(7II)(1'/=# /4# $/7%8&3)# )1# &."# A*NBEF# &=3# 4'11)3# 12)# 3&1&# 1/# &# O)=)%&.'M)3# P51%)6)#Q&.7)# 3'(1%'G71'/=# %&12)%# 12&=# &# !2'H

(R7&%)3#3'(1%'G71'/="#92'(#&=&.J('(#7()3#12)#S);'%T#<&8@&I)#A;)%('/=#B"DHEU#-)%=2&%3#V4&44F#'=#W#AW#!/%)#X);)./<6)=1#9)&6#

*NBBFL#0'12#&=#/71.')%#12%)(2/.3#/4#<YN"NB#4/%#12'(#8/=()%;&1';)#1)(1"##

'*)#:)# 7()3# 12)# 6)12/3# /4# -)&76/=1# &=3# Z'82/.(# ABCC[F# 12&1# )(1'6&1)(# 12)# =7..# 3'(1%'G71'/=# /4# 12)# 1)(1# (1&1'(1'8# GJ#

('67.&1'/="#+'67.&1'/=(#7()3# 12)#$X,+9*#(/410&%)# 4/%# 12)#8/3/6'=&=1#6&%@)%(# A-)&76/=1#&=3#Z'82/.(#BCC[F#&=3#X$X,+9#

(/410&%)# 4/%# 12)# 3/6'=&=1# \$>V# 6&%@)%(# A-)&76/=1# *NN]F"# 92)# 6)&=# =)71%&.# $+9# ;&.7)# 0&(# 8&.87.&1)3# GJ# &=# '1)%&1';)#

<%/8)37%)L#%)6/;'=I#./8'#/71.J'=I#12)#C^_#R7&=1'.)"#:)#/G1&'=)3#&#=7..#3'(1%'G71'/=#7('=I#&#10/H3)6)#6/3).L#I)=)%&1'=I#

BNNLNNN#./8'#(&6<.)3#4/%#`]#'=3';'37&.(#<)%#</<7.&1'/="#92)#I./G&.#8%'1'8&.#4%)R7)=8J#/4#12)#6/(1#8/66/=#&..).)#0&(#()1#1/#

N"CCL#&=3#12)#/71.')%#12%)(2/.3#()1#&1#<YN"NB"#

'+)#:)#7()3#12)#6)12/3#'6<.)6)=1)3#'=#XP9+P>#AQ'1&.'(#)1#&."#*NNEF#02'82#'=8.73)(#&=/12)%#4/%6#/4#</<7.&1'/=#(1%7817%)#'=#

02'82#&#8/66/=#&=8)(1%&.#</<7.&1'/=#/4#('M)#Z#(<.'1(#'=1/#10/#3&7I21)%#</<7.&1'/=(#12&1#3';)%I)#GJ#3%'41#4/%#9#I)=)%&1'/=(L#

&=3#3)%';)(#'1(#=7..#3'(1%'G71'/=#7('=I#8/&.)(8)=1#('67.&1'/=("#$/7%#()1(#/4#<&%&6)1)%(#0'12/71#&#G/11.)=)8@#0)%)#8/6<71)3a#

A'F#9bBNNL#ZbBNNLNNNU#A''F#9bBNNNL#ZbBNNLNNNU#A'''F#9bBNNL#ZbBLNNNLNNNU#&=3#A';F#9bBNNNL#ZbBLNNNLNNN"#92)#6&5'676#&..).)#

4%)R7)=8J#0&(#()1#1/#N"CCL#&=3#12)#/71.')%#12%)(2/.3#()1#&1#<YN"NB"#

',)#:)#7()3#12)#&<<%/&82#/4#-)&76/=1#&=3#-&.3'=I#A*NN`FL#02'82#'=8.73)(#&#6/%)#8/6<.)5#6/3).#/4#</<7.&1'/=#(1%7817%)L#

&(# '6<.)6)=1)3# '=# -\cP+!\Z# A$/..# &=3# O&II'/11'# *NN]F"# X)<&%17%)# 4%/6# =)71%&.'1J# &1# &# I';)=# ./87(# '(# '=4)%%)3# 0'12# &#

%);)%('G.)Hd76<#e!e!#G)10))=#6/3).(#0'12#&=3#0'12/71#().)81'/="#92)#e!e!#7()3#&#(&6<.)#('M)#/4#^NNNL#&=3#*N#<'./1#

%7=(#/4#12)#(&6)#.)=I12L#&#12'=='=I#'=1)%;&.#/4#*NL#&=3#&#G7%=H'=#/4#BNNLNNN#0)%)#7()3"#$/%#12)#\$>V#3&1&L#0)#7()3#&#-)1&#

<%'/%#4/%#$,+#0'12#&#6)&=#/4#N"NN^#&=3#(1&=3&%3#3);'&1'/=#/4#N"NNB"#:)#7()3#<%'/%#/33(#4/%#12)#=)71%&.#6/3).#/4#^#A'")"L#0)#

&((76)3#12&1#=)71%&.'1J#0&(#^#1'6)(#6/%)#.'@).J#$#-./0./F"#92)#/71.')%#12%)(2/.3#0&(#3)4'=)3#&(#&#-&J)(#$&81/%#f#BN"##

'1)#:)#7()3#&#6)12/3# 12&1#&88/7=1(# 4/%# 12)#</(('G.)# '=4.&1'/=#/4# 12)#;&%'&=8)# '=#=)71%&.#$+9#37)# 1/#8/H&=8)(1%J#G)10))=#

'=3';'37&.(#AW/G)%1(/=#BCD^F"#,=#<&%1'87.&%L#0)#7()3#12)#6)12/3#/4#-/=2/66)#)1#&."#A*NBNFL#&=3#)(1'6&1)3#12)#8/H&=8)(1%J#

6&1%'5# 4%/6#(7G()1#/4# 12)#\$>V#6&%@)%(# A%)6/;'=I#&=J#/71.')%# ./8'# '3)=1'4')3#0'12# 12)#6)12/3#/4#-)&76/=1#&=3#Z'82/.(#

Page 16: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

!"#$%&'(()#!"#$%&# E#+,!

BCC[FL#&=3#12)#2&#".0334%43#(&6<.)#&(#/71I%/7<"#:)#<)%4/%6)3#BLNNNLNNN#('67.&1'/=(#8/=3'1'/=)3#/=#12)#)(1'6&1)3#6&1%'5#

1/#3)%';)#12)#)6<'%'8&.#=7..#3'(1%'G71'/=#/4#12)#1)(1#(1&1'(1'8L#&=3#12)#/71.')%#12%)(2/.3#0&(#()1#&1#<YN"NNB"##

A[F#543"06#3/64%$"/07#6!"809:#:)#3);)./<)3#&#87(1/6#6)12/3#1/#)(1'6&1)#12)#=)71%&.#)=;)./<)#/4#$+9#G/12#G)10))=#&=3#

0'12'=# (<)8')(# 7('=I# 12)# G)(1H(7<</%1)3# 3)6/I%&<2'8#6/3).(# A())# G)./0# Se/3).(# /4# (<)8'&1'/=TF"# 92)()# &%)# +)8/=3&%J#

!/=1&81# 0'12# 2)1)%/I)=)/7(# %&1)(# /4# '=1%/I%)(('/=# 4/%# 12)# G)10))=H(<)8')(# 8/6<&%'(/=(# &=3# &# 10/H3)6)# 6/3).# 0'12#

2/6/I)=)/7(#I)=)#4./0#4/%#12)#0'12'=H(<)8')(#8/6<&%'(/="#:)#('67.&1)3#BNN#3&1&()1(#/4#`**#+ZV(#'=#&#EH<&182#6/3).#A2&#

!94%/3L#2&#;$%%0-.0</7=/$%/3#/4#,G)%'&=#!/&(1#&=3#2&#;$%%0-.0</7=/$%/3#/4#-%'11&=JF#'=#02'82#2&#!94%/3#&=3#12)#&=8)(1%&.#.'=)&I)#

/4#2&#;$%%0-.0</7=/$%/3# <&182)(# (<.'1# 9(<.'1#eJ#&I/U# &=3# 8&6)#G&8@# '=1/# 8/=1&81#9+!#eJ#&I/"#:)#&((76)3# 12&1# 12)# 10/#2&#

;$%%0-.0</7=/$%/3#<&182)(# (<.'1# 9+!#eJ#&I/# 4%/6#02'82# 12)# 12%))#<&182)(# (1&%1)3# 1/#)582&=I)#6'I%&=1(#&1# %&1)(#eg# 1/# c# A'=#

/12)%#0/%3(#0)#6&3)#12)#&((76<1'/=#12&1#12)#6/(&'8#(1%7817%)#&(#87%%)=1.J#/G()%;)3#)(1&G.'(2)3#&1#12)#(&6)#1'6)#&(#12)#

()8/=3&%J#8/=1&81F"#:)#()1#7<# 12)#;&.7)(#/4#3)6/I%&<2'8#<&%&6)1)%(# 1/# 12)'%#</(1)%'/%#)(1'6&1)# A())#G)./0#Se/3).(#/4#

(<)8'&1'/=TFa# )44)81';)# </<7.&1'/=# ('M)(# 4'5)3# 1/# Z&=8hBLNNNLNNNU# Z)37.'(h*NNLNNNU# ZI&../i,G)%'&=i!/&(1h*NNLNNNU#

ZI&../i-%'11&=Jh*NNLNNNU#1'6)#/4#(<)8'&1'/=#9(<.'1h*"^#eJ#&=3#1'6)#/4#()8/=3&%J#8/=1&81#9+!#hN"D#eJU#2)1)%/I)=)/7(#I)=)#4./0#

G)10))=#2&# !94%/3# &=3#2&# ;$%%0-.0</7=/$%/3# /4# 12)# ,G)%'&=# !/&(1# A())# W/75# !"# $%&# *NB`F# 0'12#6)3'&=# ;&.7)(# /4#e)37.'(# 1/#

,G)%'&=i!/&(1# b# B"**# &=3#e,G)%'&=i!/&(1# 1/# )37.'(# b# N"]^U# 2)1)%/I)=)/7(# I)=)# 4./0# G)10))=#2&# !94%/3# &=3#2&# ;$%%0-.0</7=/$%/3# /4#

-%'11&=J#0'12#6)3'&=#;&.7)(#/4#e)37.'(#1/#-%'11&=Jb#N"E#&=3#e-%'11&=J#1/#)37.'(#b#N"CU#(J66)1%'8#&=3#2/6/I)=)/7(#I)=)#4./0#0'12'=#

2&#;$%%0-.0</7=/$%/3#0'12#6)3'&=#;&.7)(#/4#e-%'11&=J#1/#,G)%'&=i!/&(1#b#e,G)%'&=i!/&(1#1/#-%'11&=J#b#*E"#

70.1*131*-#%8-#9"+:%-#1+:%!"#$%&''

'()#>!?4!7=/7;#$%07;#6$=@(:#9/#&;/'3#12)#&6<.'4'8&1'/=#/4#<&%&./I7)(L#&1# .)&(1#/=)#<%'6)%#/4#&#<&'%#0&(#3)4'=)3#'=('3)#&=#

'=1%/="# 9/# 3'%)81.J# ()R7)=8)# 12)# V!W# <%/3781(L# &..).)H(<)8'4'8# <%'6)%(#0)%)# 3)4'=)3# /=# '=1%&H(<)8')(# </.J6/%<2'(6# AV!W#

<%/3781#.)=I12#</.J6/%<2'(6(#/%#+ZVF"#92)=L#0)#I)=/1J<)3#12)#<%)()=8)j&G()=8)#/4#12)#10/#;)%('/=(#/4#12)#<%'6)%(L#&=3#

()R7)=8)3#12)#2)1)%/MJI/7(#'=3';'37&.(#A())#-')%=)#*NBNF"#:)#8/%%)81)3#12)#(&6<.'=I#G'&(#GJ#%)H(&6<.'=I#12)#()R7)=8)(#

&88/%3'=I# 1/# 12)# 4%)R7)=8J#/4# 12)#&..).)# '=#)&82# (&6<.)"#V!W(#0)%)#<)%4/%6)3#&(#3)(8%'G)3# '=#X&I7'=#)1#&."# A*NNBF"#V!W#

<%/3781(#0)%)#<7%'4')3#0'12#&=#P5/+\V#@'1#Ak+-F#&=3#(7Gd)81)3#1/#&#()R7)=8'=I#%)&81'/=#7('=I#12)#-'I#XJ)#9)%6'=&1/%#@'1#

A\<<.')3#-'/(J(1)6(F"#+)R7)=8'=I#7()3#&=#\-,#EBENg># '=(1%76)=1#&1# 12)#>&GP5#!)eP-#()R7)=8'=I#<.&14/%6L#e/=1<)..')%"##

92)#]]NHG<#%)I'/=#7<(1%)&6#/4#6$=@(#0&(#&6<.'4')3#0'12#12)#4/%0&%3#<%'6)%L#.!;/07(@A#&=3#&#()1#/4#%);)%()#<%'6)%(#12&1#

<%'6)#)&82#/4#12)#10/#&..).)("#$/%#12)#2&#;$%%0-.0</7=/$%/3#(&6<.)(L#()1(#0)%)a#A'F#.!;/07(@;$%B@C(#&=3#.!;/07(@;$%B@C*L#A''F#

.!;/07(@;$%D@C(# &=3# .!;/07(@;$%D@C*L# A'''F# .!;/07(@;$%5@C(# &=3# .!;/07(@;$%5@C*"# $/%# 12)#2&# !94%/3# (&6<.)(L# 12)# ()1#0&(a#

.!;/07(@!94@C(# &=3# .!;/07(@!94@C*"# \# ()8/=3# B"^H@G# %)I'/=# '(# ./8&1)3# G)10))=# 6$=@(# &=3# 12)# ()8/=3# )5/=# /4# 12)#

()R7)=8)3#()I6)=1"# ,1#0&(#&6<.'4')3#7('=I#&#()1#/4#G'&..).'8# 4/%0&%3#<%'6)%(#&=3#&#%);)%()#<%'6)%L#.!;/07*@C"#$/%#12)#2&#

;$%%0-.0</7=/$%/3#(&6<.)(L#12)#()1#0&(a#.!;/07*@;$%@A(#&=3#.!;/07*@;$%@A*"#$/%#12)#2&#!94%/3#(&6<.)(L#12)#()1#0&(a#.!;/07*@

Page 17: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

!"#$%&'(()#!"#$%&# `#+,!

!94@A(#&=3#.!;/07*@!94@A*"#9/#/G1&'=#2'I2#R7&.'1J#()R7)=8)(L#0)#3)4'=)3#=)0#<%'6)%(#'=#12)#6'33.)#/4#12)#()8/=3#%)I'/=#

A.!;/07*@6/99%!@A#&=3#.!;/07*@6/99%!@CF#&=3#%)H()R7)=8)3#4%/6#12)#G)I'=='=I#1/#12)#6'33.)#A[NN#G<F#&=3#4%/6#12)#6'33.)#

1/# 12)# )=3# ACNN# G<F# /4# 12)# V!W# <%/3781"# 92)# 12'%3# %)I'/=# AD^N# G<F# 8/%%)(</=3(# 1/# 12)# ()8/=3# )5/=# /4# 12)# ()R7)=8)3#

()I6)=1"# 92)# ./0# .);).# /4# </.J6/%<2'(6# /4# 12)# 12'%3# %)I'/=# &../0)3# 7(# ()R7)=8'=I# 0'12/71# ()<&%&1'=I# 12)# 10/#

82%/6/(/6)("#9/#&;/'3#()R7)=8'=I#<&%&./I7)(L#0)#4'%(1#7()3#12)#%);)%()#<%'6)%#>EF@C#0'12#&#()1#/4#./87(H(<)8'4'8#4/%0&%3#

<%'6)%(a# A'F# .!;/07+@!94@A# 4/%# 12)#2&#!94%/3# (&6<.)(L# A''F# .!;/07+@;$%@A# 4/%# 12)#2&# ;$%%0-.0</7=/$%/3# (&6<.)("# 92)=L#0)# %)H

()R7)=8)3#4%/6#12)#G)I'=='=I#1/#12)#)=3#/4#12)#()8/=3#)5/=#0'12#12)#4/%0&%3#<%'6)%#>EF@!G*@A"#V%'6)%(#&%)#.'(1)3#'=#9&G.)#

+["###

'*)# H!70"I-/7;# $"# J04.# -0%I60.-8/363:# $/%# I)=/1J<'=I# 12)# V!W# <%/3781# .)=I12# </.J6/%<2'(6(# A/79!%K/7LL# &=# '=3).#

</.J6/%<2'(6# &=3# 6$=@(L# &=# S)5/=H<%'6)3# '=1%/=H8%/(('=I# V!W# 6&%@)%TFL# 12)# 4.7/%)(8)=1# 3J)# ^l# )=3H.&G)..)3H<%'6)%#

1)82='R7)#0&(#7()3#0'12#3J)#mPg# A+'I6&#O)=/(J(F"#:)# 4'%(1# &6<.'4')3# /79!%K/7L#0'12#<%'6)%(# /79!%@/7L@Aj/79!%@/7L@C# &=3#

6$=@(#0'12#<%'6)%(#6$=@(@Aj6$=@(@C"# 92)=L# 12)#V!W#<%/3781(#0)%)# ()<&%&1)3#&=3#3)1)81)3#7('=I# &=#\<<.')3#\-,# V%'(6#

EBENg>#A\<<.')3#-'/(J(1)6(F#&=3#('M)3#GJ#O)=)+8&=9e#^NN#Wng9e#+'M)#+1&=3&%3#A\<<.')3#-'/(J(1)6(F"#O)=)e&<<)%o#;`"^#

(/410&%)#A\<<.')3#-'/(J(1)6(F#0&(#7()3#1/#%)&3#12)#%)(7.1'=I#82%/6&1/I%&6("#$/%#>EFK/7(L#0)#7()3#&#<%)()=8)j&G()=8)#

I)=/1J<'=I#/=#&I&%/()#I).#GJ#3)4'='=I#&# %);)%()#<%'6)%# A>EF@CF#&=3#(<)8')(H(<)8'4'8# 4/%0&%3#<%'6)%(a#>EF@/7(@A(# 4/%# 12)#

!94%/3#&..).)#&=3#>EF@/7(@A*#4/%#12)#;$%%0-.0</7=/$%/3#&..).)"#$/%#>EFK!G*L#0)#I)=/1J<)3#GJ#()R7)=8'=I#12)#()8/=3#)5/=#/4#

12)#()R7)=8)3#()I6)=1#&(#3)(8%'G)3#&G/;)"#V%'6)%(#&%)#.'(1)3#'=#9&G.)#+D"##

;12$#3%1<%3)$5"-,"1+6%%

'()#50$%!3=!7"# 3/64%$"/073:#:)#8&%%')3#/71# ('67.&1'/=(#7=3)%#;&%'/7(#6/3).(#/4# (<)8'&1'/=#0'12#;&%J'=I#<&11)%=(#/4#I)=)#

4./0a#+1%'81#,(/.&1'/=L#,(/.&1'/=#0'12#e'I%&1'/=L#\=8')=1#e'I%&1'/=#&=3#+)8/=3&%J#!/=1&81#A$'I7%)#+]F"#e'I%&1'/=#%&1)(#0)%)#

&../0)3# 1/# G)# &(J66)1%'8&.# G)10))=# 12)# </<7.&1'/=(L# &=3# 0)%)# &../0)3# 1/# G)# )'12)%# 2/6/I)=)/7(# /%# 2)1)%/I)=)/7(#

G)10))=#./8'"#,=#12)#S2/6/I)=)/7(T#(8)=&%'/(L#(8&.)3#6'I%&1'/=#%&1)(#'=#)&82#3'%)81'/=#0)%)#%&=3/6.J#(&6<.)3#4%/6#12)#

7='4/%6#<%'/%# '=1)%;&.# pN# H#BNq"#$/%# 12)#S2)1)%/I)=)/7(T#(8)=&%'/(L# 12)#6'I%&1'/=# %&1)(#&1#)&82# ./87(#0)%)#3%&0=# 4%/6#&#

(8&.)3#-)1&#3'(1%'G71'/=L# L#&=3#12)#<&%&6)1)%(#/4#12'(#3'(1%'G71'/=#0)%)#%&=3/6.J#

(&6<.)3#4%/6#12)#7='4/%6#'=1)%;&.(#pN#H#^q#ArF#&=3#pN#H#*NNq#AsFL#&=3#pN#H#BNq#A=F"#:)#&.(/#<.&8)3#<%'/%(#/=#12)#</<7.&1'/=#

671&1'/=#%&1)(# '=#12)#&=8)(1%&.#&=3#10/#)51&=1#</<7.&1'/=("#P&82#/4#12)()#;&.7)(#0&(#(8&.)3#GJ#12)#%)4)%)=8)#%&1)L#t%)4#b#

`Z%)4u#02)%)#Z%)4#'(#12)#)44)81';)#=76G)%#/4#'=3';'37&.(#/4#&#%)4)%)=8)#</<7.&1'/=#7()3#'=#8/&.)(8)=1#('67.&1'/=(L#&%G'1%&%'.J#

4'5)3# &1# BNNLNNN# &=3# u# 12)#671&1'/=# %&1)# /4# *"D[E5BNH]# jG<jI)=)%&1'/="# P&82# /4# 12)()# (8&.)3# R7&=1'1')(#0&(# &(('I=)3# &#

7='4/%6# <%'/%# /=# 12)# '=1)%;&.# pN# H*Nq"# :)# (&6<.)3# 12)# (8&.)3# (<)8'&1'/=# 1'6)# 9(<.'1j`Z%)4# 4%/6# 12)# '=1)%;&.# pN# H# *^q#

8/%%)(</=3'=I#1/#pN#H#BNDq#I)=)%&1'/=(#'=#3)6/I%&<2'8#7='1("#92)#1'6)#/4#+)8/=3&%J#!/=1&81#9+!#&=3#12)#'(/.&1'/=#1'6)#&41)%#

\=8')=1#e'I%&1'/=#9'(/#0)%)#3%&0=#4%/6#&#7='4/%6#3'(1%'G71'/=#/=#12)#'=1)%;&.#pN#H#9(<.'1q"###

Page 18: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

!"#$%&'(()#!"#$%&# ^#+,!

'*)# 209!%# =80/=!:# 9/# (1&1'(1'8&..J# 8/6<&%)# 12)# 3'44)%)=1# 6/3).(# /4# (<)8'&1'/=L# 0)# 4'%(1# );&.7&1)3# 12)# %).&1';)# </(1)%'/%#

<%/G&G'.'1')(# 4/%# 12)# 10/# &.1)%=&1';)# (8)=&%'/(# AS2/6/I)=)/7(T# <3# S2)1)%/I)=)/7(TF# 0'12'=# 6/3).(# '6<.)6)=1'=I# I)=)#

4./0"# Z)51L# 0)# 8/6<&%)3# 12)# G)(1H(7<</%1)3# ;)%('/=# 4%/6# 12)()# 6/3).(L# '")"L! 12)# /=)(# 0'12# 12)# 2'I2)(1# ;&.7)# /4# 12)#</(1)%'/%#<%/G&G'.'1JL#1/#12)#+1%'81#,(/.&1'/=#6/3)."#V/(1)%'/%#<%/G&G'.'1')(#4/%#)&82#8&=3'3&1)#6/3).#0)%)#)(1'6&1)3#7('=I#&#

4))3H4/%0&%3# =)7%&.# =)10/%@# '6<.)6)=1'=I# &# =/=H.'=)&%# 67.1';&%'&1)# %)I%)(('/=# GJ# 8/=('3)%'=I# 12)# 6/3).# '1().4# &(# &=#

&33'1'/=&.#<&%&6)1)%#1/#G)# '=4)%%)3"#92)#*LNNN#5#7# %)<.'8&1)#('67.&1'/=(#=)&%)(1#1/#12)#/G()%;)3#;&.7)(#4/%#12)#(766&%J#

(1&1'(1'8(#0)%)#().)81)3#A02)%)#7#'(#12)#=76G)%#/4#8/6<&%)3#6/3).(FL#&=3#12)()#0)%)#0)'I21)3#GJ#&=#P<&=)82='@/;#@)%=).#

12&1# %)&82)(#&#6&5'676#02)=#+1&1'(1'8/G(#b#+1&1'(1'8('6"#!/6<71&1'/=(#0)%)#<)%4/%6)3#7('=I#^N# 1%&'=)3#=)7%&.#=)10/%@(#

&=3#B^#2'33)=#=)10/%@(#'=#12)#%)I%)(('/=#AW#<&8@&I)#S&G8TL#!('..v%J#!"#$%&#*NB*F"#,=#&#('6'.&%#0&JL#0)#8&%%')3#/71#<&'%0'()#

8/6<&%'(/=(#/4#3'44)%)=1#(8)=&%'/(#/4#'=1%/I%)(('/=#&1#12)#6$=@(#./87(L#'=#12)#()8/=3&%J#8/=1&81#6/3).U#12)()#0)%)#ABF#=/#

'=1%/I%)(('/=U#A*F#'=1%/I%)(('/=#'=1/#2&#!94%/3#/=.JU#AEF#'=1%/I%)(('/=#'=1/#2&#;$%%0-.0</7=/$%/3#/=.JL#&=3#A`F#'=1%/I%)(('/=#'=#

G/12# 3'%)81'/=(# &1# 3'44)%)=1# %&1)("# 92)# </(1)%'/%# <%/G&G'.'1J# /4# )&82# (8)=&%'/# 0&(# )(1'6&1)3# &(# 3)(8%'G)3# &G/;)"# # $/%#

<&%&6)1)%# )(1'6&1'/=L# 0)# ./IH1&=I)=1# 1%&=(4/%6)3# 12)# <&%&6)1)%(# Am&6'.1/=# !"# $%&# *NN^F# &=3# /=.J# 12)# *LNNN# %)<.'8&1)#

('67.&1'/=(#<%/378'=I#12)#(6&..)(1#&((/8'&1)3#P78.'3)&=#3'(1&=8)#0)%)#8/=('3)%)3"##

'+)#209!%# =8!=M/7;:#e/3).# 82)8@'=I#0&(#<)%4/%6)3#&88/%3'=I# 1/# 12)#<%/8)37%)#3)(8%'G)3# '=#$&I7=3)(#)1#&."# A*NNDF"#:)#

%&=3/6.J#(&6<.)3#^NN#%)<.'8&1)(#4%/6#12)#4';)#6'..'/=#('67.&1'/=(#<)%4/%6)3#4/%#)&82#6/3).L#&=3#7()3#12)6#&(#S<()73/H

/G()%;)3T#3&1&()1("#$/%#)&82#3&1&()1#0)#&<<.')3#12)#(&6)#6/3).#82/'8)#<%/8)37%)#1/#8/6<71)#12)#</(1)%'/%#<%/G&G'.'1')(#

/4#)&82#/4#12)#8/6<&%)3#6/3).("#92)#%).&1';)#3'(1%'G71'/=(#/4#12)()#<%/G&G'.'1')(#/;)%#12)#^NN#%)<.'8&1)(#0)%)#12)=#7()3#1/#

8/6<71)# 12)# <%/G&G'.'1J# 12&1# 12)# G)(1H(7<</%1)3# 6/3).L# /G1&'=)3# 37%'=I# 6/3).H82/'8)# <%/8)37%)L# '(# '=3))3# 12)# 1%7)#

('67.&1)3#6/3).#A$&I7=3)(#)1#&."L#*NNDU#!/%=7)1#)1#&."L#*NBNU#W/75#)1#&."L#*NBBF"##

',)# F$.$6!"!.# !3"/6$"/07:# 92)# d/'=1# </(1)%'/%# 3'(1%'G71'/=# /4# <&%&6)1)%(# 3)(8%'G'=I# 12)# G)(1# 6/3).# 0&(# /G1&'=)3# GJ#

0)'I21)3#=/=H.'=)&%#67.1';&%'&1)#%)I%)(('/=(#/4#12)#<&%&6)1)%(#/=#12)#(766&%JH(1&1'(1'8(#A-.76#&=3#$%&=8/'(#*NBNF"#:2)=#

12)#G)(1#6/3).#'=;/.;)3#2)1)%/I)=)/7(#I)=)#4./0L#0)#12)=#)(1'6&1)3#12)#./87(H(<)8'4'8#6'I%&1'/=#%&1)#<&%&6)1)%("#m)=8)L#

4/%# )&82# ./87(# 0)# %&=# B"^5BN[# %&=3/6# 8/&.)(8)=1# ('67.&1'/=(# 7('=I# <&%&6)1)%# ;&.7)(# (&6<.)3# '=# 12)# d/'=1H</(1)%'/%#

3'(1%'G71'/=#4/%#12)#4';)#<&%&6)1)%(#8/66/=#1/#&..#./8'#A</<7.&1'/=#671&1'/=#%&1)(#&=3#1'6)(F#/G1&'=)3#7('=I#12)#<%/8)37%)#

3)(8%'G)3#&G/;)"#$'=&..JL#0)#&<<.')3#12)#3)(8%'G)3#%)d)81'/=j%)I%)(('/=#&=&.J('(#1/#12)#('67.&1'/=(#<)%4/%6)3#4/%#)&82#./87(#

1/# d/'=1.J# )(1'6&1)# G/12# )44)81';)# 6'I%&1'/=# %&1)# <&%&6)1)%("# !/6<71&1'/=(# 0)%)# <)%4/%6)3# 7('=I# ^N# 1%&'=)3# =)7%&.#

=)10/%@(#&=3#B^#2'33)=#=)10/%@(#'=#12)#%)I%)(('/=#AW#<&8@&I)#S&G8TL#!('..v%J#!"#$%&#*NB*F"##

#

Page 19: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

!"#$%&'(()#!"#$%&# [#+,!

!"#$%&=%

712$%"*)#$*$+,"+:%"+<$.$+,"-#%*$,0123%

$'.)#+*#'(#&;&'.&G.)#4/%#3/0=./&3#&(#&#M'<#4'.)"#

,1#8/=1&'=(#(/7%8)#8/3)#'6<.)6)=1'=I#./IH.'@).'2//3#'=4)%)=8)#A())#4/.3)%#!/3)i.=>F#&(#0)..#&(#&#('6<.)#;)%('/=#/4#12)#\-!#

'=4)%)=8)#A())#4/.3)%#!/3)i\-!F"##

#

#

>",$.-,(.$%7",$2%

-)&76/=1L#e"#\"L#*NN]#X$X,+9"#211<ajj000"6&12("G%'("&8"7@jh6&6&Gj(1744j#

#

-)&76/=1L#e"#\"L#&=3#W"#\"#Z'82/.(L#BCC[#P;&.7&1'=I#./8'#4/%#7()#'=#12)#I)=)1'8#&=&.J('(#/4#</<7.&1'/=#(1%7817%)"#V%/8"#-'/."#+8'"#

*[EABEDDFa#B[BCwB[*["#

#

-)&76/=1L#e"#\"L#&=3#X"#x"#-&.3'=IL#*NN`#,3)=1'4J'=I#&3&<1';)#I)=)1'8#3';)%I)=8)#&6/=I#</<7.&1'/=(#4%/6#I)=/6)#(8&=("#e/."#

P8/."#BEA`Fa#C[CwC]N"#

#

-')%=)L#Z"L#*NBN#92)#3'(1'=81';)#4//1<%'=1(#/4#./8&.#2'1822'@'=I#'=#&#;&%')3#)=;'%/=6)=1#&=3#I./G&.#2'1822'@'=I#'=#&#(7G3';'3)3#

</<7.&1'/="#P;/.71'/=#[`ABBFa#E*^`wE*D*"#

#

-.76L#e"#O"#-"L#&=3#n"#$%&=8/'(L#*NBN#Z/=H.'=)&%#%)I%)(('/=#6/3).(#4/%#&<<%/5'6&1)#-&J)('&=#8/6<71&1'/="#+1&1"#!/6<71"#*NABFa#

[EwDE"#

#

-/=2/66)L#e"L#!"#!2);&.)1L#-"#+)%;'=L#+"#-/'1&%3L#x"#\G3&..&2#)1#&."L#*NBN#X)1)81'=I#().)81'/=#'=#</<7.&1'/=#1%))(a#12)#>)0/=1'=#&=3#

?%&@&7)%#1)(1#)51)=3)3"#O)=)1'8(#B][a#*`Bw*[*"#

#

!/%=7)1L#x"He"L#Q"#W&;'I=vL#&=3#\"#P(1/7<L#*NBN#,=4)%)=8)#/=#</<7.&1'/=#2'(1/%J#&=3#6/3).#82)8@'=I#7('=I#XZ\#()R7)=8)#&=3#

6'8%/(&1)..'1)#3&1&#0'12#12)#(/410&%)#X,c\-!#A;B"NF"#-e!#-'/'=4/%6&1'8(#BBABFa#`NB"#

#

!('..v%JL#?"L#n"#$%&=y/'(L#&=3#e"#O"#-"#-.76L#*NB*#&G8a#&=#W#<&8@&I)#4/%#&<<%/5'6&1)#-&J)('&=#8/6<71&1'/=#A\-!F"#e)12/3(#P8/."#

P;/."#EAEFa#`D^w`DC"#

#

X&I7'=L#!"L#$"#-/=2/66)L#&=3#V"#-/%(&L#*NNB#92)#M/=)#/4#(J6<&1%J#&=3#2JG%'3'M&1'/=#/4#2I"/%43#!94%/3#&=3#2&#;$%%0-.0</7=/$%/3L#&(#3)(8%'G)3#GJ#'=1%/=#.)=I12#</.J6/%<2'(6#&1#./87(#6$=@("#m)%)3'1J#][a#E`*wE^`"##

$&I7=3)(L#Z"#x"#W"L#Z"#W&JL#e"#-)&76/=1L#+"#Z)7)=(820&=3)%L#$"#e"#+&.M&=/#)1#&."L#*NND#+1&1'(1'8&.#);&.7&1'/=#/4#&.1)%=&1';)#6/3).(#

/4#276&=#);/.71'/="#V%/8"#Z&1."#\8&3"#+8'"#k+\#BN`A`^Fa#BD[B`wBD[BC"#

#

$/..L#e"L#&=3#n"#O&II'/11'L#*NN]#\#I)=/6)H(8&=#6)12/3#1/#'3)=1'4J#().)81)3#./8'#&<<%/<%'&1)#4/%#G/12#3/6'=&=1#&=3#8/3/6'=&=1#

6&%@)%(a#&#-&J)('&=#<)%(<)81';)"#O)=)1'8(#B]Na#CDDwCCE"#

#

$/7%8&3)L#c"L#\"#!2&<71H-&%3JL#x"#+)8/=3'L#!"#$.)7%&=1L#&=3#!"#>)6&'%)L#*NBE#,(#./8&.#().)81'/=#(/#0'3)(<%)&3#'=#%';)%#/%I&='(6(za#

4%&81&.#I)/6)1%J#/4#%';)%#=)10/%@(#.)&3(#1/#2'I2#G'&(#'=#/71.')%#3)1)81'/="#e/."#P8/."#**A]Fa#*N[^w*NDE"#

#

m&6'.1/=L#O"L#e"#+1/=)@'=IL#&=3#>"#P58/44')%L#*NN^#e/.)87.&%#&=&.J('(#%);)&.(#1'I21)%#(/8'&.#%)I7.&1'/=#/4#'66'I%&1'/=#'=#<&1%'./8&.#

</<7.&1'/=(#12&=#'=#6&1%'./8&.#</<7.&1'/=("#V%/8"#Z&1."#\8&3"#+8'"#k+\#BN*A*BFa#D`D[wD`]N"#

#

>)0/=1'=L#W"#!"L#&=3#x"#?%&@&7)%L#BCDE#X'(1%'G71'/=#/4#I)=)#4%)R7)=8J#&(#&#1)(1#/4#12)#12)/%J#/4#12)#().)81';)#=)71%&.'1J#/4#

</.J6/%<2'(6("#O)=)1'8(#D`a#BD^wBC^"#

#

W/G)%1(/=L#\"L#BCD^#O)=)#4%)R7)=8J#3'(1%'G71'/=(#&(#&#1)(1#/4#().)81';)#=)71%&.'1J"#O)=)1'8(#]Ba#DD^wD]^"#

#

W/75L#!"L#Q"#!&(1%'8L#e"#V&70).(L#+"#,"#:%'I21L#V"#+&76'1/7H>&<%&3)#)1#&."L#*NBB#X/)(#(<)8'&1'/=#G)10))=#B.$N/90-3/3#8$%%!./#&=3#B.$N/90-3/3#%I.$"$#8/'=8'3)#0'12#6&d/%#82&=I)(#'=#&#6/.)87.&%#1&%I)1#/4#&3&<1&1'/=z#V>/+#nZP#[ABBFa#)*[]D*"#

#

W/75L#!"L#!"#$%&'(()L#Q"#!&(1%'8L#g"#Q)@)6&=(L#O"#m"#V/I(/=#)1#&."L#*NB`#!&=#0)#8/=1'=7)#1/#=)I.)81#I)=/6'8#;&%'&1'/=#'=#'=1%/I%)(('/=#

%&1)(#02)=#'=4)%%'=I#12)#2'(1/%J#/4#(<)8'&1'/=za#&#8&()#(173J#'=#&#2I"/%43#2JG%'3#M/=)"#x"#P;/."#-'/."#/7#-.!33&##

Q'1&.'(L#W"L#?"#X&0(/=L#V"#-/7%(/1L#&=3#?"#-).@2'%L#*NNE#X)1+).#B"Na#&#8/6<71)%#<%/I%&6#1/#3)1)81#6&%@)%(#%)(</=3'=I#1/#().)81'/="#x"#

m)%)3"#C`A^Fa#`*Cw`EB"#

Page 20: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

A

B

!"#$%&'()*!!!"#$%&#!'()$!%*!+$,#-!)$.!+$,-)/'0#(+#'!.+1#-#$2)2%$3!456!'7%8'!!"#!9):;#'!<#,8##$!'0#(+#'!)=)+$',!9):;#'!8+,7+$!$%&'()*+,&4>%-,7!?#)!-,!@)A!%*!@+'()A63!5::!%,7#-!.#,)+:'!&),(7!B+=;-#!C53!4@6!'7%8'!*-#D;#$(A!%*!,7#!'()*+,&)::#:#!),!,7#!,8%!%;,:+#-!:%(+!+$!,7#!*%;-!0)$&+(2(!0),(7#'!%*!,7#!',;.+#.!)-#)3!5::!%,7#-!.#,)+:'!&),(7!B+=;-#!C@3!!

0

10

20

30

40

50

60

0.0 0.2 0.4 0.6 0.8 1.0

0 10 20 30 40 50 60

0.0

0.1

0.2

0.3

0.4

all methods Outliers within M. edulis

CAG.CGA.363

ACG.CGA.152

0.0

0.1

0.2

0.3

0.4

FST w

ithin

M. e

dulis

0.0 0.2 0.4 0.6 0.8 1.0FST between species

CAG.CGA.363

0.0 0.4 0.8

CAG.CGA.363 ACG.CGA.152

0.0 0.4 0.8

ACG.CGA.152

North Sea

Brittany

Bay of Biscay

Iberian Coast

C. Fraisse et al. 7 SI

Page 21: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

!"#$%&'(+*! ! !?,;.+#.!'#D;#$(#!)-%;$.!./0123!E%'+2%$!%*!,7#!*%;-!0%:A&%-07+'&'!(7%'#$!*%-!#F,#$'+9#!=#$%,A0+$=!+'!+$.+(),#.!<A!',)-'!*%-!?>E'!)$.!<A!,-+)$=:#'!*%-!EGH!0-%.;(,! :#$=,7!0%:A&%-07+'&'3!I7#!'#(%$.!0%:A&%-07+'&J!./012J!(%--#'0%$.'!,%!,7#!%;,:+#-! :%(;'!+.#$2K#.!<A!=#$%&#!'()$3!I7#!'#D;#$(#!7)'!<##$!'#=&#$,#.!+$!,7-##!-#=+%$'!4-#=+%$!L!%*!MMN!<0O!-#=+%$!C!%*!LPNN!<0!O!-#=+%$!Q!%*!RPN!<06!*%-!=#$#!,-##!-#(%$',-;(2%$3!!

1 20

indel-in0 mac-1 SNP-in1 SNP-ex2

region 1 region 2 region 3

Exon 1 Exon 2

3 Kb

C. Fraisse et al. 8 SI

Page 22: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

!"#$%&' (,*! 5::#:#! =#$#):%=+#'! -#(%$',-;(,#.! <A! ,7#! $#+=7<%;-/S%+$+$=! ):=%-+,7&! )00:A! ,%! ,7#! $;&<#-! %*! $;(:#%2.#! .+1#-#$(#'3! 456!-#=+%$!L!)$.! 4@6! -#=+%$!Q!%*! ,7#!',;.+#.!'#D;#$(#3!?#D;#$(#'!')&0:#.! +$!$%&3/**4564-+70+/*+,&0),(7#'!)-#! +$! -#.! 4T<#-+)$!G%)',6!)$.!%-)$=#!4@-+U)$A6O!87+:#!,7%'#!')&0:#.!+$!$%&'()*+,&0),(7#'!)-#!+$!.)-V!<:;#!4>%-,7!?#)6!)$.!(A)$!4@)A!%*!@+'()A63!!

North Sea Brittany Bay of Biscay Iberian Coast

0.52

A B

C. Fraisse et al. 9 SI

Page 23: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

!"#$%&'(-*!!!E)-)&#,#-!#'2&)2%$!+$!,7#!).)029#!+$,-%=-#''+%$!&%.#:3!W)F+&;&!:+V#:+7%%.!#'2&)2%$!%*!!&456!)$.!$!4@6!+'!.#0+(,#.!+$!<:)(V3!E%',#-+%-!.+',-+<;2%$!%*!!&456J!$! 4@6!)$.!8! 4G6! +'!.#0+(,#.!<A!=-#A!.%,'3!WX!)$.!5@G!+$*#-#$(#'!8#-#!<)'#.!%$!,7#!$;&<#-!%*!)::#:#'!')&0:#.!+$!,7#!.%$%-!)$.!-#(+0+#$,!0%0;:)2%$'!47(!)$.!766!)$.!,7#!$;&<#-!%*!.+'2$(,!)::#:+(!(:)''#'!+$!,7#'#!')&0:#'!49(&)$.!9663!5::!%,7#-!.#,)+:'!&),(7!B+=;-#!Y3!

-331

-329

-327

lnL

0.00

0.10

freq

0 2 4 6 8 10

A !=2Nu

-329.0

-328.0

-327.0

0.00.10.20.30.4

0.5 2.0 10.0 50.0

B M=2Nm

0.08

0.12

0.16

0 200 400 600 800

C !=2Ns

C. Fraisse et al. 10 SI

Page 24: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

5 15 25

0.00

0.15

kd

5 15 25

0.00

0.15

kr

0.0 0.4 0.8

0.00

0.15

Hed

0.0 0.4 0.8

0.0

0.2

Her

0.1 0.3 0.5

0.0

0.2

0.4

sdd

0.0 0.2 0.4 0.6

0.0

0.2

sdr

-0.3 0.0 0.2 0.4

0.0

0.3

0.6

FST

0.0 0.4 0.8

0.00

0.10

0.20

privated

0.1 0.3

0.00

0.15

0.30

privater

0.0 1.0 2.0 3.0

0.00.20.4

HerHed

!"#$%&'(.*'''"%%.$#''/%*/K,!,%!,7#!).)029#!+$,-%=-#''+%$!&%.#:3!B%-!#)(7!',)2'2(J!4L6!,7#!.+',-+<;2%$!+$!-#.!8)'!%<,)+$#.!*-%&!PNJNNN!'+&;:)2%$'!;'+$=! ,7#! 0%',#-+%-! 0)-)&#,#-! #'2&),#! %*! :J! $! )$.! !O! 4C6! ,7#!.+',-+<;2%$! +$! <:;#! 8)'! %<,)+$#.! *-%&! LNNN! '+&;:)2%$'! ;'+$=! ,7#!&)F+&;&!:+V#:+7%%.!9):;#!%*!!!)$.!$J!%,7#-!0)-)&#,#-'!8#-#!'#,!),!:&Z!PNNNNJ!8&Z!RNNJ!7(&Z!LRR!)$.!$-!Z! YN3!5!.)'7#.! 9#-2():! :+$#! '7%8'! ,7#!%<'#-9#.!9):;#!*%-!#)(7!',)2'2(3!?,)2'2('!;'#.!8#-#[!9J!$;&<#-!%*!)::#:+(!(:)''#'O! ;'J! #F0#(,#.! 7#,#-%\A=%'+,AO! "(J! ',)$.)-.! .#9+)2%$! +$!*-#D;#$(+#'!%9#-!)::!)::#:+(!(:)''#'O!<=J!0-%0%-2%$!%*!0-+9),#!)::#:#'O!!"#J!:#9#:!%*!.+1#-#$2)2%$!%*!'()*+,&)::#:#'O!;'6&>&;'(J!0-%0%-2%$):!.+9#-'+,A!)]#-!+$,-%=-#''+%$!87#-#!(!.#$%,#'!,7#!.%$%-!'0#(+#'!)$.!6!.#$%,#'!,7#!-#(+0+#$,!'0#(+#'3!

C. Fraisse et al. 11 SI

Page 25: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

!"#$%&'(/*' ' '^1#(,!%*!-#(%&<+$)2%$!+$!,7#!).)029#!+$,-%=-#''+%$!&%.#:3!I7#!0-%0%-2%$):!.+9#-'+,A!)]#-!+$,-%=-#''+%$!*-%&!(&4.%$%-!'0#(+#'6! ,%! 6& 4-#(+0+#$,! '0#(+#'6J!;'6& >& ;'(J! +'! '7%8$! )'! )! *;$(2%$! %*! ,7#! '():#.!&+=-)2%$! -),#J!$3! _+1#-#$,! (%:%;-'! (%--#'0%$.! ,%!.+1#-#$,! 9):;#'!%*! ,7#! -#(%&<+$)2%$! -),#J! 6! 4'##! :#=#$.63! ^--%-!<)-'! -#0-#'#$,! ',)$.)-.!.#9+)2%$'! ():(;:),#.!%9#-!PJNNN! -#0:+(),#'3!W#.+)$!9):;#'!)-#!.#0+(,#.!<A!.%,'3!`,7#-!0)-)&#,#-'!8#-#!'#,!),[!:&Z!LNNNNJ!!&Z!CJ!8&Z!LNNN!)$.!7(&Z!76&Z!PN!4')&0:#!'+\#63!?+&;:)2%$'!8#-#!-;$!;$2:!,7#!<#$#K(+):!)::#:#!-#)(7#.!KF)2%$!+$!,7#!-#(+0+#$,!<)(V=-%;$.3!!

r=0r=0.05r=0.5

1e-03 1e-02 1e-01 1e+00 1e+01 5e+01M=2Nm

0.00

0.25

0.50

0.75

1.00

He rHed

C. Fraisse et al. 12 SI

Page 26: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

!"#$%&' (0*' ' ' ^1#(,! %*! ,#&0%-):! 9)-+)<+:+,A! +$! &+=-)2%$! -),#! +$! ,7#! ).)029#! +$,-%=-#''+%$! &%.#:3! I7#! 0-%0%-2%$):! .+9#-'+,A! )]#-!+$,-%=-#''+%$!*-%&!(&4.%$%-!'0#(+#'6!,%!6&4-#(+0+#$,!'0#(+#'6J!;'6&>&;'(J! +'!'7%8$!)'!)!*;$(2%$!%*!,7#!'():#.!&+=-)2%$!-),#J!$3!"-##$!.%,'!+$.+(),#!)!(%$',)$,!&+=-)2%$!-),#!#9#-A!=#$#-)2%$!4$?@:.63!@:;#!.%,'!-#*#-!,%!)!0;:'#.!&+=-)2%$!-#=+&#!+$!87+(7!$?2AB@:.&#9#-A! LN! =#$#-)2%$'J! %,7#-8+'#!$?AO! '%! ,7),! %$! )9#-)=#!$?@:.3! ^--%-! <)-'! -#0-#'#$,! ',)$.)-.! .#9+)2%$! ():(;:),#.! %9#-! LJNNN!-#0:+(),#'3!W#.+)$!9):;#'!)-#!.#0+(,#.!<A!.%,'3!`,7#-!0)-)&#,#-'!8#-#!'#,!),[!:&Z!LNNNNJ!!&Z!CJ!8&Z!LNNN!)$.!7(&?&76Z!PN!4')&0:#!'+\#63!?+&;:)2%$'!8#-#!-;$!;$2:!,7#!<#$#K(+):!)::#:#!-#)(7#.!KF)2%$!+$!,7#!-#(+0+#$,!<)(V=-%;$.3!!

constant migrationmigration every 10 generations

0.01 0.10 1.00 10.00M=2Nm in average

0.00

0.25

0.50

0.75

1.00

He rHed

C. Fraisse et al. 13 SI

Page 27: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

!"#$%&'(1*' ' 'W%.#:'!%*!'0#(+)2%$[!?,-+(,! T'%:)2%$!4?T6J! T'%:)2%$!8+,7!W+=-)2%$!4TW6J!5$(+#$,!W+=-)2%$!45W6J!)$.!?#(%$.)-A!G%$,)(,!4?G63!B%;-!0)-)&#,#-'!)-#!'7)-#.!<A!)::!&%.#:'[!#,5*+C!+'!,7#!$;&<#-!%*!=#$#-)2%$'!'+$(#!,7#!'0#(+)2%$!2&#O!:/70J!:'()*+,!)$.!:3/**4!)-#!,7#!$;&<#-! %*! #1#(29#! +$.+9+.;):'! +$! ,7#! )$(#',-):! 0%0;:)2%$J!$%& '()*+,! )$.!$%& 3/**4564-+70+/*+,& -#'0#(29#:A3! #+,4! +'! ,7#! $;&<#-! %*!=#$#-)2%$'!'+$(#!,7#!,8%!$)'(#$,!'0#(+#'!',%00#.!#F(7)$=+$=!&+=-)$,'!+$!,7#!5W!&%.#:3!#"D! +'!,7#!$;&<#-!%*!=#$#-)2%$'!'+$(#!,7#!,8%!.);=7,#-!'0#(+#'!#F0#-+#$(#.!'#(%$.)-A!(%$,)(,!)]#-!)!0#-+%.!%*!+'%:)2%$!+$!,7#!?G!&%.#:3!I7#!)'A&&#,-+():!&+=-)2%$!-),#'&$3/**4&

,%& '()*+,&)$.!$'()*+,& ,%&3/**4! )-#!#F0-#''#.! +$!@:.&;$+,'J!87#-#!.! +'! ,7#!0-%0%-2%$!%*!)!0%0;:)2%$!&).#!;0!%*!&+=-)$,'! *-%&!,7#!%,7#-!0%0;:)2%$!0#-!=#$#-)2%$3!

C. Fraisse et al. 14 SI

>)$(!

>'()*+,& >3/**4!

I '0:+,!

SI

AM

I +'%!

SC

I ?G!

Past

Past

IM

W'()*+,!,%!3/**4&W3/**4!!

,%!'()*+,&

Page 28: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

C. Fraisse et al. 15 SI

!"#$%&' (2*' ' ' ' "%%.$#''/%*/K,! %*! ,7#! <#',! ';00%-,#.! .#&%=-)07+(! &%.#:! ,%! ,7#! 0)+-8+'#! B?I! (%&0)-+'%$'3! 456! )$.! 4@6! '7%8! ,7#!.+',-+<;2%$!%*! B?I! <#,8##$! '0#(+#'! =+9#$! -#'0#(29#:A! *%-! ,7#!$%&3/**4564-+70+/*+,&0),(7!%*! ,7#! T<#-+)$!G%)',! )$.!@-+U)$AO!87+:#! 4G6!'7%8'!,7#!.+',-+<;2%$!%*!B?I!8+,7+$!$%&3/**4564-+70+/*+,3&`<'#-9#.!B?I!9):;#'!),!,7#!aCC!5BXE'!&)-V#-'!)-#!'7%8$!+$!<:)(V3!?+&;:),#.!B?I!9):;#'!4+$!=-#A6!8#-#!%<,)+$#.!<)'#.!%$!LNN!'+&;:)2%$'!%*!aCC!?>E'!+$!)!Q/0),(7!&%.#:!;'+$=!,7#!0%',#-+%-!#'2&),#'!%*!,7#!'#(%$.)-A!(%$,)(,!&%.#:!8+,7!7#,#-%=#$#+,A!%*! +$,-%=-#''+%$! -),#'! 4'##! B+:#! ?L63!_)'7#.!9#-2():! :+$#'! '7%8!&#)$!B?I! 9):;#'O! -#.! 9#-2():! :+$#'!.#$%,#!,7#!B?I!9):;#'!%<'#-9#.!),!./0123&T$!#)(7!0)+-8+'#!(%&0)-+'%$J!./012!+'!%;,!%*!,7#!bPc/D;)$2:#!%*!,7#!'+&;:),#.!.+',-+<;2%$[!456!519):;#!Z!N3NNQRJ!4@6!519):;#!Z!N3LNLN!4G6J!519):;#!Z!N3NNab3!

A

C

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

FST between species: Mytilus edulis - Iberian Coast

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

FST between species: Mytilus edulis - Brittany

0 0.1 0.2 0.3

FST within M. galloprovincialis: Iberian Coast - Brittany

B?I!%<'!Z!N3NMaO!B?I!'+&!Z!N3LNN! B?I!%<'!Z!N3NPbO!B?I!'+&!Z!N3LNa!

B?I!%<'!Z!N3NCLO!B?I!'+&!Z!N3NCL!

B

Page 29: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

!"#$%&'(()#!"#$%& #*+#,-

./ .0/ 1./ ./ .0/ 1./23 *. *0 *. 4* 0"54 6 *7 4* 0"7*

2% + 5 6 *4 0"*7 . *0 *. 0"0.

8)3 0"9+ 0"+ 0"6* 0"61 0"41 0".* 0"96 0"61 0"54

8)% 0"97 0"71 0"94 0"65 0"5+ 0"51 0"96 0"61 0"75

,33 0"** 0"0+ 0"01 0"*9 0"77 0"0+ 0"** 0"44 0"51

,3% 0"*. 0"01 0"*. 0"7* 0"56 0"0+ 0"*4 0"4+ 0"7*

/:3 0"+ 0"79 0".+ 0"94 0"7* 0"*5 0"7. 0".. 0"04;

/:% 0 0 0"*5 0"41 0;;; 0 0"01 0"4. 0;;;

$,< 0"0*+ 0"0* 0"0+ 0"*1 0"*4 =0"04 0"0* 0"0. 0"45

8)%>8)3 0"1+ 0".5 0"1 *"** 0"46 0"6. * *"*5 0"4+

!"#$%&'(&)**+,-../*0/12&3,&24-&5+5678-&3,29*:9-..3*,&;*+-<

<?) 3@A@% B@BCD&E@A '( %)B%)()AF)3 GH F?) FI@ B&FJ?)( @K'& !()%*+ LM@%F? ,)& &A3 N&H @K N'(J&HO" <?)%)J'B')AF B@BCD&E@A '( %)B%)()AF)3 GH F?) N%'P&AH B&FJ? @K '& ,$%%-./-0*12*$%*+" QADH )3CD'(=3)%'R)3&DD)D)(#I)%)#J@A('3)%)3"!S ACTG)% @K &DD)D'J JD&(()(U "#S )VB)JF)3 ?)F)%@WHX@('FH LM)' *196OU $%S (F&A3&%3 3)R'&E@A 'A

K%)YC)AJ')( @R)% &DD &DD)D'J JD&(()(U &'S B%@B@%E@A @K B%'R&F) &DD)D)(U ($)S D)R)D @K 3'Z)%)AE&E@A @K )3CD'(

&DD)D)(#LM)'#*197OU#=-9&>&=-+S#B%@B@%E@A&D#3'R)%('FH#&[)%#'AF%@X%)(('@A"

(F&E(EJ 3&F&"#?@$

3)$14%!+.50$%)!

3)$14%!+.50$%)!

Page 30: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

!"#$%&'(()#!"#$%& #*9#,-

\@JC( A)3C AX&D \F@F \('D ]!() ],$% ^)3C ^X&D _)3C _X&D L*=],>]<O A)F3'R)3C=X&D ,V")3C ,V"X&D ,K ,(`$* 40 40 146 +06 0"0441 0"0*77 0"074 0"045+ =*"*.5+ =*"6.+. 0".669 0"0.0. .9 5* 0 *4`$4 40 40 **06 96* 0"0077 0"0*4+ 0"001 0"0*+4 =4"556* =0"614+ 0"7666 0"0011 45 55 * *XDCJ&A&() *6 *5 661 5+0 0"04+1 0"0*6 0"0496 0"04+9 =0"*745 =*"5017 0"0+91 0"004* 4* *+ 0 476$257 *4 40 645 645 0"0057 0"0*.9 0"005 0"04*+ 0"4106 =*"*07 0"*64+ 0"00*9 5 .9 0 +a&AA&A&() 40 *6 167 ..6 0"0*15 0"047+ 0"04.7 0"0746 =0"1755 =*"*901 0"0465 0"00*5 41 54 0 4*TJ*4. *1 40 .04 *05 0"049 0"095 0"076. 0"065 =*"*011 =0"5+61 0"05+7 0"007+ 9 45 0 9TX34 40 40 *067 1.0 0"07.1 0"054 0"07+4 0"0.69 =0"071+ =*"*6.+ 0"0*5. 0"00** .0 *4+ 0 94aHED'A *+ *+ 9.5 .59 0"05* 0"05*1 0"0545 0"077* =0"*549 *"*715 0"0104 0"006 49 *0 0 .0

!"#$%&'A&'B;;59C&.257.7D.&*E.-98-+&52&24-&-3:42&,BD<-59&<*D3

'&#!()%*+#'(#%)B%)()AF)3#GH#F?)#M@%F?#,)&#B&FJ?"#'&#,$%%-./-0*12*$%*+#'(#%)B%)()AF)3#GH#F?)#N%'P&AH#B&FJ?"

*S F@F&D ACTG)% @K ()YC)AJ)(U +,-,S F@F&D D)AXF?U +S ('D)AF D)AXF? )VJDC3'AX &DD X&B( K%@T F?) F@F&D &D'XAT)AFU .S ACTG)% @K B&'%I'() 3'Z)%)AJ)(

L<&b'T&c *167OU /S d&P)%(@Ae( f Ld&P)%(@Ac *19.OU 0S <&b'T&e( _ L<&b'T&c *161&c *161GOU 12.$3.)S D)R)D @K (B)J')( 3'Z)%)AE&E@Ac I?)%) .$ '( F?)

&R)%&X) B&'%I'() ACJD)@E3) 3'R)%('FH I'F?'A (B)J')( &A3 .) '( F?) F@F&D B&'%I'() ACJD)@E3) 3'R)%('FH @K F?) B@@D)3 (&TBD) &J%@(( (B)J')(U *#,%45 #%62

789S A)F T@D)JCD&% 3'R)%X)AJ) T)&(C%)3 &F (HA@AHT@C( B@('E@A(U $:S ACTG)% @K )VJDC('R) B@DHT@%B?'J ('F)(U $;S ACTG)% @K gV)3 3'Z)%)AJ)(U $ < S#

ACTG)%#@K#(?&%)3#B@DHT@%B?'J#('F)("

&R)%&X)#&J%@((#F?)#(&TBD)3#()YC)AJ)(

Page 31: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

!"#$%&'(()#!"#$%& #*6#,-

I'F?'A#T@3)D(L?@T@#0+#?)F)%@O

G)FI))A#T@3)D(

?@T@ 0"0+9. V?)F)%@ FGHIAJ 0"70+.?@T@ 0"544+ V?)F)%@ FGJKKL 0"0915?@T@ 0"**99 V?)F)%@ FGMMAI FGN(AJ

!"#$%&'I&O-<578-&6*.2-93*9&69*E5E3<37-.&*0&+3P-9-,2&;*+-<.&*0&.6-D357*,

!@TB&%'(@Aa@3)D(

d) g%(F J@TB&%) ?@T@X)A)@C( Lh?@T@iO 0+ ?)F)%@X)A)@C( Lh?)F)%@iO (J)A&%'@( @K'AF%@X%)(('@A %&F)( I'F?'A F?) -ac ja &A3 ,! T@3)D(" d) F?)A J@TB&%) F?) G)(FR)%('@A#@K#)&J?#T@3)D#I'F?#X)A)#k@I#L-ac#ja#&A3#,!O#F@#F?)#,-#T@3)D"#

0"00*9

-(@D&E@A#I'F?#a'X%&E@A#L-aO

jAJ')AF#a'X%&E@A#LjaO

,)J@A3&%H#!@AF&JF#L,!O

,F%'J#-(@D&E@A#L,-O V V

Page 32: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

!"#$%&'(()#!"#$%& #*1#,-

l&DC)( &%) F?) )(ET&F)3 %)D&ER) B%@G&G'D'E)( @K F?) ,! T@3)D I?)A F?) ,-c F?) -ac F?) ja &A3 F?) ,! T@3)D( &%) F?) F%C) T@3)D(" <?)H I)%) J&DJCD&F)3 &F

F?)#,!#B@(F)%'@%#B%@G&G'D'FH#F?&F#I)#@G()%R)3#L:B@(F#m#0"+*O"#`(ET&E@A(#&%)#G&()3#@A#.00#B()C3@=@G()%R)3#3&F&()F(#K@%#)&J?#T@3)D"

d) C()3 F?)T F@ J@TBCF) F?) B%@G&G'D'FH F?&F ,! '( F?) J@%%)JF T@3)D X'R)A F?&F :B@(F m 0"+*" :%@G&G'D'FH @K FHB) - )%%@% '( @A) T'AC( F?'( B%@G&G'D'FH (@ F?&F

*#=#0"169#m#0"0*7"#d) &((CT)3 ?)F)%@X)A)'FH @K 'AF%@X%)(('@A %&F)( K@% T@3)D( 'AJDC3'AX T'X%&E@A L-ac jac ,!O &( F?)H &%) (F%@AXDH (CBB@%F)3 J@TB&%)3 F@ F?)'%?@T@X)A)@C(#&DF)%A&ER)(#L<&GD)#,7O"

!"#$%&'L&!-.2&*0&9*EB.2,-..&0*9&24-&.-D*,+59C&D*,25D2&;*+-<&Q324&4-2-9*:-,-*B.&3,29*:9-..3*,&952-.&

:FHB)#-#)%%@%jAJ')AF#a'X%&E@A#LjaO ,)J@A3&%H#!@AF&JF#L,!O

<%C)#T@3)D

0"0004 0"*96. 0"009+ 0"6*79

-(@D&E@A#I'F?#a'X%&E@A#L-aO

:%@G#L,!#n#<%C)#T@3)D#O

,F%'J#-(@D&E@A#L,-O

0"0*7

Page 33: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

!"#$%&'(()#!"#$%& #40#,-

,!0S#0"700. ,!0S#0"0007 ,!0S#0"004.

'?(R&FGNHHJ '?AR&FGHHHK '?IR&FGHHKJ

,!*S#0"00*7 ,!*S#0"04.*

'?AR&FGHHMK '?IR&FGHKLH

'?AR&FGH(IM

,!7S#0"06+4

,!*

,!4

,!0

!"#$% 'J O-<578- 6*.2-93*9 69*E5E3<37-. *0 +3P-9-,2 .D-,593*. *03,29*:9-..3*,&52&=8>21&3,&24-&.-D*,+59C&D*,25D2&;*+-<

,!FS#A@#'AF%@X%)(('@AU

,!(S#'AF%@X%)(('@A#'AF@#a"#)3CD'(#@ADHU

,!AS#'AF%@X%)(('@A#'AF@#a"#X&DD@B%@R'AJ'&D'(#@ADHU

,!IS#'AF%@X%)(('@A#'A#G@F?#3'%)JE@A(#&F#3'Z)%)AF#%&F)("

,!* ,!4 ,!7

Page 34: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

!"#$%&'(()#!"#$%& #4*#,-

/!,*-175,$%8=o*S#.i=j<jjp!jpj<<pj<jjjj<jjj<!p/!,*-175,$%8=o4S#.=ijj<j<jj<<j<j!jj<j<jjp!j<p/!,*-175,$%9=o*S#.i=!<ppp<j!!j<jp!jpj<<jp<<<jp/!,*-175,$%9=o4S#.i=<j!<ppp<j!!j<jp!jpj<<pj<jj/!,*-175,$%:=o*S#.i=<<<p!<<j<<p<!p<!jjppjj<<p/!,*-175,$%:=o4S#.i=<j<pj<<<p!<<j<<p<!p<ppjjp/!,*-175!()=o*S#.i=<<jj<!jj<<<jjp<j!<j!p!<jj/!,*-175!()=o4S##.e=<<jj<!jj<<<jjp<j!<j!p!<jp

/!,*-1;5,$%=$*S#.i=jj<<!pj<<<j<<<<j<!jj<!<p!/!,*-1;5,$%=$4S#.i=pj!j<jjj<<j<pjj<jj!j<p!</!,*-1;5!()=$*S#.i=<<<<j!<jjj<jjp<!pjjj<<j!!/!,*-1;5!()=$4S#.i=<<<<j!<jj<<jjp<!pjjj<<j!p

4T'33D)

=)A3

/!,*-1;56*((%!=$S#.i=jp<j!j<j!!pjj<j<jp!! /!,*-1;=oS#.i=<p<pj!jj<<!!p<p<<!jj<<ppp

/!,*-1<5,$%5$S#.i=<j<jjj!!!!j!<<!!p<<<</!,*-1<5!()=$S#.i=<j<jjj!!!!j!<<!!<<<!<

7T'33D)

=)A3

=>?5!@;=$S#.i=pp<j<pppj!jpjjjpj!jp!<j!p =>?=oS#.i=!<p<<<p!<pj<!!j!j<!<p<<pp

!"#$%&'N&$3.2&*0&693;-9.&B.-+&0*9&.-SB-,D3,:&59*B,+&=8>21

* /!,*-17=$S#.i=jjpj<<<j<<<j<j<j<p!p<p<p

4(F&%F=

T'33D)

/!,*-1;56*((%!=oS#.i=jpp!<j<j<<!pp<j<p<j!<

7 =>?=oS#.i=!<p<<<p!<pj<!!j!j<!<p<<pp

o)X'@A $@%I&%3#B%'T)%( o)R)%()#B%'T)%(

Page 35: Gene-Flow in a Mosaic Hybrid Zone: Is Local Introgression ... · quence (M. Ohresser, personal communication) and looking for homologybetween codingregions and expressed sequence

!"#$%&'(()#!"#$%& #44#,-

*1(!%5*1A *1(!%5*1A=$S#.i=!j<p<!<!<j<j!!j<<p<j!<<<pp *1(!%5*1A=oS#.i=jpj!!j<j!p<jjjppjjj<p<j<p6$257 6$257=$S#.i=!p<!<jp!p<jp<j!<<jjj<<p 6$257=oS#.i=!pjjjj<<p<jp<!<jp<<<<p<p

=>?5*17=$*#.i=<j<jjj!!!!j!<<!!<<<!<=>?5*17=$4S#.i=<j<jjj!!!!j!<<!!p<<<<

=>?5!@; =>?5!@;=$S#.i=pp<j<pppj!jpjjjpj!jp!<j!p =>?=oS#.i=!<p<<<p!<pj<!!j!j<!<p<<pp

=>?5*17 =>?=oS#.i=!<p<<<p!<pj<!!j!j<!<p<<pp

!"#$%&'K&$3.2&*0&693;-9.&B.-+&0*9&:-,*2C63,:&-5D4&*0&24-&0*B9&6*<C;*9643.;.&5<*,:&24-&.-SB-,D-

\@JC( $@%I&%3#B%'T)%( o)R)%()#B%'T)%(