

EVOLUTION AND APPLICATIONS OF PINE MICROSATELLITES

AULI KARHU

Department of Biology, University of Oulu

OULU 2001


AULI KARHU

EVOLUTION AND APPLICATIONS OF PINE MICROSATELLITES

Academic Dissertation to be presented with the assent of the Faculty of Science, University of Oulu, for public discussion in Kuusamonsali (Auditorium YB210), Linnanmaa, on March 30th, 2001, at 12 noon.

OULUN YLIOPISTO, OULU 2001


Copyright © 2001
University of Oulu, 2001

Manuscript received 19 February 2001
Manuscript accepted 27 February 2001

Communicated by
Professor Veikko Koski
Doctor G. G. Vendramin

ISBN 951-42-5924-6
(URL: http://herkules.oulu.fi/isbn9514259246/)

ALSO AVAILABLE IN PRINTED FORMAT
ISBN 951-42-5923-8
ISSN 0355-3191
(URL: http://herkules.oulu.fi/issn03553191/)

OULU UNIVERSITY PRESS
OULU 2001


Karhu, Auli, Evolution and applications of pine microsatellites
Department of Biology, University of Oulu, P.O.Box 3000, FIN-90014 University of Oulu, Finland
2001
Oulu, Finland
(Manuscript received 19 February 2001)

Abstract

The evolution of microsatellites was studied within and between the pine species. Sequences showed that microsatellites do not necessarily mutate in a stepwise fashion and that size homoplasy is common due to flanking sequence and repeat area changes within and between the species. Thus, some assumptions of statistical methods based on changes in repeat numbers may not hold.

Sequences from cross-species amplifications revealed evidence of duplications of microsatellite loci in pines. On two independent occasions, the repeat area of the microsatellite had undergone a rapid expansion during the last 10-25 million years.

Microsatellite markers were used together with other molecular markers (allozymes, RFLPs, RAPDs, rDNA RFLPs) and an adaptive trait (date of bud set) to study patterns of genetic variation in Scots pine (Pinus sylvestris) in Finland. All molecular markers showed a high level of within-population variation, while differentiation among populations was low (FST = 0.02). Of the total variation in bud set, 36.4 % was found among the populations, which experience a steep climatic gradient. Thus, the markers applied were poor predictors of population differentiation of the quantitative trait studied.

The distribution of genetic variation was studied in five natural populations of radiata pine (Pinus radiata), a species which has gone through bottlenecks in the past. Null allele frequencies were estimated and used in later analyses. Microsatellites showed a high level of variability within populations (He = 0.68-0.77). Allele length distributions and the average number of alleles per locus showed some traces of bottlenecks. In contrast, comparison of observed genetic diversities and expected diversities suggested post-bottleneck expansion of populations. Genetic differentiation (FST and RST) among populations was over 10 %, reflecting the situation in the isolated radiata pine populations.

Using microsatellites and a newly developed Bayesian method, individual inbreeding coefficients were estimated in five populations of radiata pine. Most individuals were outbred while some were selfed. Presumably, in ancestral radiata pine populations the recessive deleterious alleles have been eliminated after bottlenecks and the mating system has changed as a consequence.

Keywords: population structure, microsatellite evolution, Pinus, inbreeding coefficient


Acknowledgements

This work was carried out at the Department of Biology, Division of Genetics, in the University of Oulu. I warmly thank my supervisor Professor Outi Savolainen for introducing me to the interesting field of plant genetics and evolution. I am grateful to Outi for her guidance and support during my studies and for her patience with me. I thank Professor Pekka Pamilo and Professor emeritus Seppo Lakovaara for providing excellent working facilities. I am indebted to all of my co-authors for their invaluable contribution to this work. I also wish to thank the former and present members of the Plant Genetics Group, Elena Baena, Anita deHaan, Jens-Holger Dieterich, Vladimir Dvornik, Rosario Garcia, Hans Peter Koelewijn, Helmi Kuittinen, Katri Kärkkäinen, Merja Mikkonen, Robert Podolsky, Anu Sirviö, Niina Tero, Patrik Waldmann, Claus Vogl and Jaana Vuosku. Especially I thank Päivi Hurme, Sami Oikarinen and Mona-Anitta Riihimäki for all the inspiring and very often pleasantly confusing discussions we had, especially on Friday afternoons. It was very therapeutic to notice that I was not the only “loony” in the group. I thank Minna Ruokonen for discussions concerning everything beneath the heaven. Our coffee breaks (TKT-breaks) were relaxing moments during these years.

I also wish to thank Dr. Craig Echt, Dr. Gavin Moran and Charlie Bell for providing me with samples and information on the species studied. I thank our excellent technicians Soile Finne, Anne Karjalainen and Hannele Parkkinen for their skilful technical assistance in parts of this work. I would like to extend my thanks to everybody in the Department for making my working days so pleasant and for tolerating my “savolaisuuttani”. Sorry guys, it is in my genes.

I am grateful to Professor Veikko Koski and Dr. G. G. Vendramin for the helpful comments on my thesis and to Dr. Tek Tay for revising the language.

Finally, I want to express my gratitude to my parents Mirjami and Tauno, and my sister Heli, and brother Kimmo and all of my friends for their support and encouragement throughout my studies. They all helped me to remember that life is so much more than a microsatellite fragment on a gel.

This work was supported by the Graduate School in Forest Sciences, the Academy of Finland and the University of Oulu.

Oulu, February 2001 Auli Karhu


Abbreviations

bp      Base pair
cDNA    Complementary DNA
IAM     Infinite allele model
KAM     K-allele model
MCMC    Markov Chain Monte Carlo
MYA     Million years ago
PCR     Polymerase chain reaction
RAPD    Random amplified polymorphic DNA
rDNA    Ribosomal DNA
RFLP    Restriction fragment length polymorphism
SMM     Stepwise mutation model
STR     Short tandem repeat
TPM     Two phase model


List of original papers

This thesis is based on the following publications, which are referred to in the text by their Roman numerals.

I Karhu A, Dieterich J-H & Savolainen O (2000) Rapid expansion of microsatellite sequences in pines. Mol Biol Evol 17: 259-265.

II Karhu A, Hurme P, Karjalainen M, Karvonen P, Kärkkäinen K, Neale D & Savolainen O (1996) Do molecular markers reflect patterns of differentiation in adaptive traits of conifers? Theor Appl Genet 93: 215-221.

III Karhu A, Vogl C, Moran GF, Bell JC & Savolainen O (2001) Microsatellites are sensitive genetic markers to study genetic structure, effects of bottlenecks and colonization in five natural populations of radiata pine (Pinus radiata). (manuscript).

IV Vogl C, Karhu A, Moran GF & Savolainen O (2001) High resolution analysis of mating systems: inbreeding in natural populations of Pinus radiata. (submitted to J Evol Biol).


Contents

Abstract
Acknowledgements
Abbreviations
List of original papers
1 Introduction
    1.1 Microsatellites
        1.1.1 General characters of microsatellites
        1.1.2 Microsatellite evolution
        1.1.3 Theoretical models of microsatellite mutations
        1.1.4 Testing models of microsatellite evolution
        1.1.5 Potential problems associated with microsatellites
        1.1.6 Applications of microsatellites
    1.2 Evolution and population genetics of pines
        1.2.1 Evolution and distribution of pines
        1.2.2 Genetic variation in pines
        1.2.3 Inbreeding in pines
    1.3 Goals of this work
2 Materials and methods
    2.1 Pine material
    2.2 DNA isolation
    2.3 Microsatellite loci and sequence analysis in cross-species study
    2.4 Scots pine population study
        2.4.1 Date of bud set and molecular markers
    2.5 Radiata pine studies
        2.5.1 Fragment analysis and sequencing of microsatellite alleles
        2.5.2 Data analysis in population structure study
        2.5.3 Mating system study
3 Results and discussion
    3.1 Microsatellite evolution
        3.1.1 Short term evolution within species
        3.1.2 Long term evolution between species
    3.2 Analysis of natural populations
        3.2.1 Scots pine
        3.2.2 Radiata pine
            3.2.2.1 Estimation of null alleles and distribution of microsatellite variation
            3.2.2.2 Effects of bottleneck and colonization
            3.2.2.3 Relationships between populations
    3.3 Estimation of inbreeding in radiata pine populations using microsatellites
4 Concluding remarks
5 References


1 Introduction

Molecular markers have become essential tools for conservation biology, evolutionary and population studies as well as for mapping projects (Queller et al. 1993, Jarne & Lagoda 1996). The ideal class of genetic marker would have many scorable and highly variable loci with codominant alleles, and markers should be densely distributed throughout the genome. Microsatellite markers meet these requirements, and they have become a marker of choice for mapping, forensic investigations, and population analyses as well as in ecological studies. In addition to these applications, the excellent properties of microsatellites allow even more innovative approaches in future projects. This means, however, that we have to know how these tools work. Knowledge of such molecular level properties as the mode of genetic transmission, the mutation rate, and the nature of the mutational process itself is critical to proper interpretation of microsatellite markers in a population context.

1.1 Microsatellites

1.1.1 General characters of microsatellites

Microsatellites are sequences made up of a single sequence motif (1-6 bp) which is repeated many times side-by-side. Historically, the term microsatellite has been used to describe only repeats of the dinucleotide motif CA/GT (Litt & Luty 1989, Weber & May 1989). The term microsatellite has now become the most common term to describe these tandem repeats of short motifs. They are also called simple sequences (Tautz 1989) and short tandem repeats (STRs) (Edwards et al. 1991). If these repeats are long enough and uninterrupted, they are excellent genetic markers due to their high level of polymorphism (Powell et al. 1996). Estimates of microsatellite mutation rates in Escherichia coli in in vivo systems are about 10⁻² events per locus per replication (Levinson & Gutman 1987b), in yeast 10⁻⁴-10⁻⁵ (Henderson & Petes 1992, Strand et al. 1993), and in Drosophila about 6 x 10⁻⁶ (Schug et al. 1997). Pedigree analysis in humans gave an estimate of 10⁻³ events per locus per generation (Weber & Wong 1993).


Microsatellites have been found in every organism studied so far. In the human genome poly(A)/poly(T) stretches are the most common repeat types (Stallings 1992). However, the poly(A)/poly(T) type is not suitable as a genetic marker because of instability during PCR reactions. The study of Beckmann and Weber (1992) showed that the most common dinucleotide repeat type in the human genome is CA/GT. Other mammalian genomes seem to have repeat compositions similar to that of the human genome. In plants GA/CT and AT repeats are the most common (Stallings 1992, Lagercrantz et al. 1993). In conifers the most common repeat type varies among species, but GA/CT and CA/GT seem to be the most common ones (Lagercrantz et al. 1993, Smith & Devey 1994, Pfeiffer et al. 1997, Scotti et al. 2000).

Microsatellites are generally assumed to be evenly distributed over genomes (e.g. Dietrich et al. 1996) but rare within coding regions (Hancock 1995). There are, however, some human diseases, such as fragile X syndrome and myotonic dystrophy, that are caused by expansions of polymorphic trinucleotide repeats in genes (e.g. Fu et al. 1991, Aslandis et al. 1992, Rubinsztein 1999). In conifers, it is known that nuclear microsatellite repeats are often embedded within repetitive DNA sequences (Smith & Devey 1994, Pfeiffer et al. 1997). The sequencing of the whole chloroplast genome of Pinus thunbergii (Wakasugi et al. 1994) has allowed development of very useful and universal chloroplast microsatellite markers for conifers (e.g. Powell et al. 1995, Vendramin et al. 1996, Echt et al. 1998, Vendramin et al. 2000). In addition, microsatellites have also been found in the mitochondrial genome of conifers (Soranzo et al. 1999, Sperisen et al. 2001).

1.1.2 Microsatellite evolution

There are two potential mechanisms which can explain the high mutation rates of microsatellites. The first is recombination between DNA molecules by unequal crossing-over or by gene conversion (Smith 1976, Jeffreys et al. 1994). The second mechanism involves slipped-strand mispairing during DNA replication (Levinson & Gutman 1987a). Studies using yeast and E. coli as model organisms have shown that replication slippage seems to be the main mechanism generating length mutations in microsatellites (Levinson & Gutman 1987b, Henderson & Petes 1992). In replication slippage the nascent DNA strand dissociates from the template strand during the replication of the repeat area and the nascent strand can reanneal out-of-phase with the template strand. When replication is continued, the eventual nascent strand will be longer or shorter than the template, depending on whether the looped-out bases have occurred in the template strand or the nascent strand. Microsatellites will then lose or gain a single or a few repeats (Fig. 1). These kinds of small length changes are the most common mutational types in microsatellite loci, and have been detected in E. coli and yeast (Levinson & Gutman 1987b, Sia et al. 1997, Wierdl et al. 1997) as well as in humans (Weber & Wong 1993). If recombination were the major mechanism, mutations would be expected to give rise to a wider range of novel mutants.


Fig. 1. Model of the mutation process at microsatellite loci. a) slippage of the DNA polymerase during replication, b) misalignment of the template or the newly replicated strand and c) continuation of replication. DNA strands are represented by lines, repeat units by small boxes, and direction of replication by small arrows.

The length of the microsatellite repeats may have an effect on the mutation rate such that longer repeats are more polymorphic than shorter ones (Weber 1990, Chakraborty et al. 1997, Sia et al. 1997, Primmer et al. 1998, Ellegren 2000b). This is probably because the opportunity for a stable misaligned configuration is greater for longer repeat arrays. The second parameter that influences microsatellite stability is the purity of the repeat. Interrupted microsatellite repeats (due to insertion of bases or base substitution) seem to have lower mutation rates than perfect repeats. This might be due to the greater difficulty of forming slipped intermediates in the presence of sequence interruptions (e.g. Kunst et al. 1997, Petes et al. 1997). In addition to these parameters it has been noticed that the sex of the mutating individual has an influence on the mutation process. In barn swallow the mutation rate was almost twice as high in males as in females (Primmer et al. 1997). In humans, an excess of paternally transmitted mutations supported a male-biased mutation rate (Ellegren 2000a).

Many studies have shown that microsatellite loci involve more gains than losses of repeat units (Weber & Wong 1993, Talbot et al. 1995). Amos et al. (1996) and Primmer et al. (1996a) observed in their germline studies that significantly more gains than losses of repeat units occurred in humans and barn swallows, respectively. However, it is unclear whether this asymmetry in the distribution of mutations occurs only in hypervariable and long microsatellites, or in all types of microsatellites. The molecular mechanism resulting in this kind of upwardly biased mutation is still unclear.

Most microsatellite arrays are shorter than a few tens of repeat units, although a few large repeat arrays have also been found e.g. in humans (Wilkie & Higgs 1992) and in barn swallow (Primmer et al. 1996a). This strongly suggests that there must be size constraints restricting the expansion of repeat arrays. However, there is no direct evidence for selective constraints acting on allele length at microsatellite loci, although several mechanisms have been suggested. For instance, Primmer et al. (1996a) and Ellegren (2000a) suggested that repeat losses might be more common or involve larger deletions among long alleles than shorter alleles. Because large alleles are strongly counter-selected at loci associated with genetic diseases, Samadi et al. (1998) suggested that selection might act as an upper truncating mechanism, imposing a ceiling on alleles with large repeat counts. Taylor et al. (1999a) suggested that interruptions were associated with repeat shortening, and thus restricted the expansions of microsatellite alleles.

The analysis of germline mutations of the parental genotypes of human families suggested that mutations are more common in heterozygous individuals whose two alleles differ greatly in repeat number (Amos et al. 1996). However, Ellegren (2000a) showed that in humans the size difference between an individual’s two alleles has no effect on the mutation rate. In addition, if the theory of Amos et al. (1996) were true, the mutation rate would be correlated with heterozygosity, and loci in larger populations would evolve faster than those in smaller ones. This phenomenon has not been observed at the population level, although Rubinsztein et al. (1995) noticed that human microsatellites were longer than their homologues in chimpanzee. However, it has been argued that microsatellites tend to be longer and thus more polymorphic in the species they were cloned from, due to the selection during the cloning procedure (Ellegren et al. 1995, 1997).

In summary, the mutational process of microsatellites seems to be very complex. It is very likely that these processes are heterogeneous with differences between loci and alleles (see Ellegren 2000b).

1.1.3 Theoretical models of microsatellite mutations

To estimate population differentiation measures and genetic distances from microsatellite data, theoretical mutation models for the evolutionary processes of microsatellites are needed. Two theoretical models have been considered for microsatellites (Deka et al. 1991). In the infinite allele model (IAM, Kimura & Crow 1964) mutation can involve any number of tandem repeats and always results in a new allele state that did not previously exist in the population. However, as discussed above, slipped-strand mispairing is currently accepted as the main mechanism for microsatellite length variation. This mechanism mostly causes small changes in repeat numbers such that alleles of similar lengths should be more closely related to each other than alleles of very different sizes. Alleles may also mutate towards allele states that are already present in the population. The stepwise mutation model (SMM) (Kimura & Ohta 1978) developed for allozymes provides a better description for these kinds of evolutionary processes. In addition to this model, Di Rienzo et al. (1994) described the two phase model (TPM), where a limited proportion of mutations involve several repeats. Although rarely cited in microsatellite literature, a K-allele model (KAM) could also be considered for microsatellites. Under this model, there are K possible allelic states, and any allele has a constant probability of mutating towards any of the other K–1 allelic states (Crow & Kimura 1970). Due to size constraints acting on microsatellite loci, the KAM seems to be more realistic than the IAM.
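The four models differ only in which allele states a single mutation event can reach, and this can be sketched in a few lines of Python. The sketch below is illustrative only: alleles are represented as repeat counts, the lower bound of one repeat, the uniform draw of multi-step sizes in the TPM, and all parameter values (p_multi, max_step, max_repeats) are assumptions made for the example and are not taken from the cited papers.

```python
import random

def _clamp(repeats):
    # Assumption for this sketch: an allele keeps at least one repeat.
    return max(1, repeats)

def smm(repeats, rng):
    """Stepwise mutation model: gain or lose exactly one repeat."""
    return _clamp(repeats + rng.choice((-1, 1)))

def tpm(repeats, rng, p_multi=0.1, max_step=5):
    """Two phase model: mostly single steps, with a proportion p_multi of
    mutations involving several repeats (step size drawn uniformly here)."""
    step = rng.randint(2, max_step) if rng.random() < p_multi else 1
    return _clamp(repeats + rng.choice((-1, 1)) * step)

def kam(repeats, rng, states):
    """K-allele model: move to any of the other K-1 states with equal probability."""
    return rng.choice([s for s in states if s != repeats])

def iam(existing, rng, max_repeats=1000):
    """Infinite allele model: the mutant is a state not yet present in the population."""
    while True:
        candidate = rng.randint(1, max_repeats)
        if candidate not in existing:
            return candidate

rng = random.Random(1)
print(smm(10, rng), tpm(10, rng), kam(10, rng, range(8, 13)), iam({10, 11}, rng))
```

Under the SMM and TPM a mutant can land on a size that is already present in the population, whereas the IAM function excludes this by construction, which is the distinction the paragraph above draws.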

Different kinds of repeat number variance estimators based on the stepwise mutation model (SMM) have been developed for estimating phylogenetic relationships, genetic distances and population differentiation [(δµ)², Goldstein 1995a,b; DSW, Shriver 1995; RST, Slatkin 1995] from microsatellite data. These estimators are based on the following assumptions: (i) mutation results in a change of one repeat unit, (ii) the mutation rate is constant and independent of repeat length, (iii) there is no asymmetry in the distribution of mutations, and (iv) there are no allele size constraints. Significant discrepancies between known divergence times and microsatellite genetic distances (Deka et al. 1994, Garza et al. 1995, Valsecchi et al. 1997) imply that one or more of these basic assumptions may be wrong or may not hold at least for all microsatellite loci (see Ellegren 2000). The factors relevant to the evolution of microsatellites have been incorporated into mutation models, such as allele size constraints (Garza et al. 1995, Nauta & Weissing 1996, Feldman et al. 1997), the possibility of multistep mutations (Di Rienzo et al. 1994) and directionally biased changes in allele size (Kimmel & Chakraborty 1996). However, the dependence of the mutation rate on the repeat numbers and on the purity of the repeat sequence has not been taken into account in these models.
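As an illustration of how such an estimator uses repeat numbers rather than allele identities, the (δµ)² distance can be sketched as the squared difference between the mean repeat numbers of two populations, averaged over loci. The locus names and allele sizes below are invented for the example, and the sketch omits any sample-size correction.

```python
def mean_repeats(alleles):
    """Mean allele size, in repeat units, of one population at one locus."""
    return sum(alleles) / len(alleles)

def delta_mu_squared(pop_a, pop_b):
    """(delta mu)^2: squared difference in mean repeat number, averaged over
    the loci typed in both populations.

    pop_a, pop_b: dict mapping locus name -> list of allele sizes (repeat units).
    """
    loci = sorted(set(pop_a) & set(pop_b))
    per_locus = [
        (mean_repeats(pop_a[locus]) - mean_repeats(pop_b[locus])) ** 2
        for locus in loci
    ]
    return sum(per_locus) / len(per_locus)

# Hypothetical data: two loci, four sampled alleles per population.
pop_a = {"locus1": [10, 11, 12, 11], "locus2": [20, 20, 22, 22]}
pop_b = {"locus1": [13, 14, 13, 12], "locus2": [21, 21, 21, 21]}

print(delta_mu_squared(pop_a, pop_b))  # (11 - 13)^2 = 4 and (21 - 21)^2 = 0, mean 2.0
```

Because only the means enter the calculation, the distance depends on assumptions (i) to (iv) above: multi-step changes, upwardly biased mutation or a ceiling on allele size all alter how the difference in means accumulates with time.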

1.1.4 Testing models of microsatellite evolution

Theoretical mutation models like SMM and TPM may provide adequate measures if populations are relatively closely related, but these simple models become inadequate when divergence between populations and especially between species increases (Takezaki & Nei 1996). The main questions of interest when studying microsatellite evolution are whether replication slippage is the only mechanism contributing to size differences between alleles, whether the mutation distribution is symmetric, and whether there are allele size constraints.

There are several ways to study microsatellite evolution. First, theoretical studies attempt to model the process of microsatellite evolution by applying assumptions to a range of parameters considered to be important to the mutational process. After computer simulations the resulting data can be compared to the observed distribution of allele frequencies and/or the heterozygosity of a locus (e.g. Deka et al. 1991, Shriver et al. 1993). Valdes et al. (1993) and Di Rienzo et al. (1994) used an alternative method where they compared the empirical and modeled allele frequency distributions. These studies have shown that SMM and TPM can explain relatively well the evolutionary processes of microsatellites. There are still open questions as to whether all mutations at microsatellite loci involve changes of only one or two repeat units or whether mutations of larger effects also occur.

It is possible to study the short-term evolution of microsatellites by analysing germline mutations (e.g. Zhang et al. 1994, Primmer et al. 1996a). Although the mutation rate of microsatellites is several orders of magnitude higher than that for nucleotide substitutions in non-coding DNA, spontaneous mutations are still quite rare in the genome. More mutations have been recorded from immortal cell lines (Weber & Wong 1993) and carcinomas, but it is possible that these kinds of somatic events may show elevated rates of mutations (Weber & Wong 1993). These short term evolution studies have revealed that gains of repeats are more common than losses (Weber & Wong 1993, Amos et al. 1996, Primmer et al. 1996a), that multi-step changes may also occur (Weber & Wong 1993, Primmer et al. 1996a), that mutation rate may differ between sexes (Primmer et al. 1998, Ellegren 2000a), and that mutation rate might be positively correlated with the number of repeats (Primmer et al. 1996a).

Microsatellite allele sequencing both within and between species allows the analysis of past mutation events. Such studies have confirmed that changes in allele length are most often due to size alterations in the repeat area (e.g. Estoup et al. 1995a, Angers & Bernatchez 1997). Nevertheless, many studies have indicated that stepwise changes in repeat number are not the only mode of evolution of microsatellite alleles. For instance, Estoup et al. (1995a) noticed that irregular and composite repeat structures seemed to reduce the amount of single-step mutations, so that the mutation process may be more similar to the IAM. Homoplasic microsatellite alleles may also differ in repeat compositions and/or flanking sequences. Size homoplasy has been reported among microsatellite alleles from the same species (e.g. Blanquer-Maumont & Crouau-Roy 1995, Grimaldi & Crouau-Roy 1997, Viard et al. 1998, van Oppen et al. 1999, Makova et al. 2000), and between species (e.g. van Treuren et al. 1997, Primmer & Ellegren 1998, Colson & Goldstein 1999).

These studies of evolutionary processes have shown that (i) the mutation of repeat units depends on the allele size and purity; (ii) the mutation process is upwardly biased; and (iii) some constraints on allele length exist. It is very likely that these are allele and locus-dependent processes (Ellegren 2000b). Theoretical mutation models, such as SMM and TPM, may accurately represent the evolutionary processes of microsatellites when closely related populations are considered. However, over long evolutionary distances the mutation process seems to be more complex. Thus, theoretical mutation models that can more accurately represent the evolutionary processes of microsatellites are needed to obtain better estimates of population differentiation measures.


1.1.5 Potential problems associated with microsatellites

Microsatellites also have some drawbacks as markers. The first problem is reduction or complete loss of amplification of some alleles due to base substitutions or indels within the priming site. These so-called null alleles will not necessarily be recognized when there is a product from the other allele homologue. This can lead to serious underestimation of heterozygosity, compared with that expected on the basis of Hardy-Weinberg equilibrium (e.g. Callen et al. 1993, Paetkau & Strobeck 1995). This problem can be overcome by designing a new primer which does not include the site of the indel or base substitution. This can be very time consuming and may not always be possible, for instance due to the base composition of the flanking sequences.
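The effect of a null allele on heterozygosity estimates can be made concrete with a small sketch. It assumes a toy sample in which every individual is truly heterozygous, and mimics scoring by reading any heterozygote that carries the null allele as a homozygote for the visible allele. The allele sizes and the choice of which allele is null are hypothetical, and the functions are not the estimation procedure used in this thesis.

```python
from collections import Counter

def expected_heterozygosity(genotypes):
    """He = 1 - sum(p_i^2) from scored allele frequencies (no small-sample correction)."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    n = len(alleles)
    return 1.0 - sum((count / n) ** 2 for count in Counter(alleles).values())

def observed_heterozygosity(genotypes):
    """Proportion of individuals scored with two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)

def score_with_null(genotypes, null_allele):
    """Mimic scoring when one allele fails to amplify: a heterozygote carrying
    the null allele is read as a homozygote for the visible allele, and a
    null homozygote gives no product and is dropped."""
    scored = []
    for a, b in genotypes:
        visible = [x for x in (a, b) if x != null_allele]
        if len(visible) == 2:
            scored.append((a, b))
        elif len(visible) == 1:
            scored.append((visible[0], visible[0]))
    return scored

# Hypothetical true genotypes (fragment sizes in bp); all six are heterozygotes.
true_genotypes = [(100, 104), (100, 108), (104, 108),
                  (100, 104), (104, 108), (100, 108)]
scored = score_with_null(true_genotypes, null_allele=108)

print(observed_heterozygosity(true_genotypes))  # 1.0
print(observed_heterozygosity(scored))          # 2 of 6 scored as heterozygotes
print(expected_heterozygosity(scored))          # 0.5 from the two visible alleles
```

In the scored data the observed heterozygosity falls below the Hardy-Weinberg expectation, which is the kind of heterozygote deficiency that points to null alleles.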

There are also problems associated with the PCR process itself. Taq polymerase generates slippage during PCR and the tendency of Taq polymerase to add an additional dATP to PCR products can sometimes make allele scoring problematic (Ginot et al. 1996, Gill et al. 1997).

Microsatellite variation is based on length variation (in bp) of the amplified fragments. It is possible that two fragments of the same length are not derived from the same ancestral sequence, introducing the possibility of size homoplasy. Under the IAM there should not be any homoplasy, but SMM and TPM can generate size homoplasy. If microsatellite loci evolve in a stepwise fashion, size homoplasy will depend on the mutation rate of the locus and the divergence time of two populations. The degree of homoplasy will increase with the mutation rate and time of divergence (Estoup & Cornuet 1999). In addition, the selective size constraints that reduce the number of possible allelic states increase size homoplasy (Nauta & Weissing 1996). Size homoplasy can lead to underestimates of population subdivision and genetic divergence between populations and species (e.g. Estoup et al. 1995b, Viard et al. 1998, Taylor et al. 1999b). Size homoplasy is taken into account by several distance measures, which are based on the SMM (Goldstein et al. 1995a and b, Slatkin 1995, Rousset 1996, Feldman et al. 1997).
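The dependence of size homoplasy on mutation rate and divergence time can be explored with a simple simulation. The sketch below is a toy model, not the analysis of Estoup & Cornuet (1999): two lineages start from the same allele, evolve independently under a symmetric single-step SMM without size constraints, and among the pairs that end with the same length it reports the share that are separated by at least one mutation. The starting size, rates and replicate number are arbitrary assumptions.

```python
import random

def evolve(repeats, generations, mu, rng):
    """Return (final repeat count, number of mutations) under a symmetric SMM."""
    mutations = 0
    for _ in range(generations):
        if rng.random() < mu:
            repeats += rng.choice((-1, 1))
            mutations += 1
    return repeats, mutations

def homoplasy_share(generations, mu, replicates=500, start=20, seed=1):
    """Among lineage pairs ending with equal length, the share in which at
    least one mutation has occurred (equal in size but not by descent)."""
    rng = random.Random(seed)
    same_size = 0
    hidden = 0
    for _ in range(replicates):
        a, mutations_a = evolve(start, generations, mu, rng)
        b, mutations_b = evolve(start, generations, mu, rng)
        if a == b:
            same_size += 1
            if mutations_a + mutations_b > 0:
                hidden += 1
    return hidden / same_size if same_size else float("nan")

low = homoplasy_share(generations=50, mu=0.001)   # few mutations expected
high = homoplasy_share(generations=500, mu=0.02)  # many mutations expected
print(low, high)
```

With a low rate and short divergence, nearly all equal-sized pairs are unmutated copies of the ancestor. With a higher rate or longer time, equal size mostly reflects mutations that happened to cancel out, so identity in length says little about identity by descent.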

Hedrick (1999) showed that measures of differentiation for highly polymorphic microsatellites using traditional F-statistics can be underestimates. The reason for this is the high within-population heterozygosity (He). FST measures the proportion of the total variation (HT) that is found between subpopulations, but does not specify the identity of the alleles involved (Hedrick 1999). When using microsatellites, populations can have nonoverlapping sets of alleles, and because under Hardy-Weinberg HT > He, the differentiation estimates can be underestimates.
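A small numerical sketch shows the size of this effect. It uses the standard definition FST = (HT - HS) / HT, where HS is the mean within-population gene diversity (He in the text above), for two equally weighted populations. The allele frequencies are invented for illustration. Because HT cannot exceed 1, FST computed this way cannot exceed 1 - HS, however different the populations are.

```python
def gene_diversity(freqs):
    """Expected heterozygosity 1 - sum(p_i^2) for a dict allele -> frequency."""
    return 1.0 - sum(p * p for p in freqs.values())

def fst(pop1, pop2):
    """FST = (HT - HS) / HT for two equally weighted populations."""
    hs = (gene_diversity(pop1) + gene_diversity(pop2)) / 2
    alleles = set(pop1) | set(pop2)
    pooled = {a: (pop1.get(a, 0.0) + pop2.get(a, 0.0)) / 2 for a in alleles}
    ht = gene_diversity(pooled)
    return (ht - hs) / ht

# Low-variability marker: each population is fixed for a different allele.
fixed_1 = {"A": 1.0}
fixed_2 = {"B": 1.0}

# Highly variable marker: ten equally frequent alleles per population and
# no allele shared between the populations (hypothetical fragment sizes).
msat_1 = {size: 0.1 for size in range(100, 120, 2)}
msat_2 = {size: 0.1 for size in range(120, 140, 2)}

print(fst(fixed_1, fixed_2))           # 1.0
print(round(fst(msat_1, msat_2), 3))   # 0.053, although no alleles are shared
```

In both cases the populations share no alleles, but with HS = 0.9 the second marker gives FST of about 0.05, which illustrates why differentiation at highly polymorphic loci can appear low.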

1.1.6 Applications of microsatellites

Microsatellites have become the preferred marker in many studies because of their high variability, ease and reliability of scoring and codominant inheritance. Microsatellite markers were first used for genetic mapping (e.g. Weissenbach et al. 1992) and as a diagnostic tool to detect human diseases (e.g. Murray et al. 1992). Nowadays microsatellites are regularly used in population and ecological studies. Microsatellites are excellent markers for studying gene flow, effective population size (Ne), dispersal and migration related issues, and parentage and relatedness (e.g. Taylor et al. 1994, Coulson et al. 1998, Ciofi & Bruford 1999, Goldstein et al. 1999, Luikart & England 1999).

Microsatellites can also be used to study the effects and level of inbreeding (Beaumont & Bruford 1999, Pemberton et al. 1999, Sweigart et al. 1999). Allozymes have been used to study mating systems in populations. Due to the low level of polymorphism, the estimation of individual inbreeding coefficients has been difficult. It has been possible to estimate only the average population inbreeding level. However, in many areas of ecological and evolutionary studies it is often important to know how much individuals differ in their inbreeding histories and to estimate the degrees of relatedness between individuals. The average heterozygosity of an individual measured from microsatellite data should realistically reflect the level of inbreeding. New advanced statistical methods have also enabled the use of microsatellite markers in such studies (Sweigart et al. 1999).

1.2 Evolution and population genetics of pines

This section briefly covers topics of pine evolution and genetics relevant to this thesis.

1.2.1 Evolution and distribution of pines

The first gymnosperms arose in the Middle Devonian (~365 million years ago). Fossilized cones have shown that ancestors of the Pinaceae family evolved by the mid-Jurassic (~160 million years ago). Pinaceae is divided into 10 or 11 genera. More than half the species in the Pinaceae are included in the genus Pinus (111 species) (Price et al. 1998). During the early part of the Cretaceous (nearly 130 million years ago) pines diversified into two subgenera, Strobus (haploxylon or soft pines, with one fibrovascular bundle in the needle) and Pinus (diploxylon or hard pines, with two fibrovascular bundles in the needle) (Mirov 1967, Richardson & Rundel 1998) (Fig. 2). Several sections (e.g. Strobus and Pinus) and further subsections (e.g. Sylvestres, Attenuatae and Strobi) have evolved since the diversification of these two subgenera (Krupkin et al. 1996, Price et al. 1998).

After the diversification of the subgenera Pinus and Strobus, pines migrated throughout the middle latitudes of the northern hemisphere super-continent, Laurasia. Major environmental changes in the early Cretaceous led to a splitting of several subsections of Pinus into northern refugial populations in western Siberia, mid-latitude populations in eastern Asia, and southern refugial populations in other parts of Asia and Europe (Kremenetski et al. 1998, Willis et al. 1998). Intensive mountain building events together with climate changes created the environmental heterogeneity that drove the radiation of pine taxa in several areas which became secondary centres of diversification of Pinus (e.g. Mexico and north-eastern Asia). At the end of the Eocene (55-37 MYA) the genus Pinus diversified further due to climatic changes (Richardson & Rundel 1998). The Eocene had the effect of dissecting the genus and concentrating pines into widely disjunct regions. During the Pleistocene (1.7-0.01 MYA) pine populations and species shifted first south, then north following the glacial and interglacial periods. The climatic fluctuations of the Pleistocene may have played an important role in speciation or at least in the preservation of distinctive genotypes (Richardson & Rundel 1998). The 10 000 years since the last glacial period have shaped the current distribution of pines. Nowadays, the natural distribution of pines ranges from arctic and subarctic regions of North America and Eurasia south to subtropical and tropical regions of Central America and Asia, with one species extending even south of the equator (P. merkusii) (Mirov 1967, Price et al. 1998).

Fig. 2. Outline of classification of genus Pinus according to Price et al. (1998), Mirov (1967) and Krupkin et al. (1996). Only those sections, subsections and species used in our studies are included. Details about classification are provided in the main text.

Pine species have a major economic value as sources of timber, pulp and other products. Pine species are an important part of the ecosystem (Bonan et al. 1992). In northern Europe and Asia the taiga or boreal forest forms the most extensive area of coniferous forest in the world. However, different historical, ecological and genetic factors have interacted to determine the limits of distribution of individual pine species. The extremes of distribution patterns are represented by widespread species such as Scots pine (Pinus sylvestris), and highly localized endemic species such as radiata pine (Pinus radiata).

Scots pine (subgenus Pinus) has the largest geographic distribution of any pine, ranging from the Scottish Highlands along the Atlantic to the Pacific coast of eastern Siberia. In addition, Scots pine has relict populations from the Pleistocene in the Mediterranean region and central Europe (Mirov 1967). It ranges from latitudes 70° N in Norway and Finland to 37° N in Spain, and elevations from sea level to 2600 m. With such a wide range of biogeographic and ecological distribution, it is not surprising to find that Scots pine is highly plastic and contains a considerable amount of morphological and physiological variability (Mirov 1967).

Red pine (Pinus resinosa) belongs to the subgenus Pinus and is closely related to Scots pine (Fig. 2). Red pine is native to the northeastern part of the American continent and it also occurs locally in northeastern West Virginia (Mirov 1967). Red pine is very uniform morphologically and it has very little genetic variation (Fowler & Morris 1977, DeVerno & Mosseler 1997). Despite the current large range, the low level of variation observed in red pine could have resulted from one or a series of population bottleneck(s) in the past (Fowler & Morris 1977). However, chloroplast microsatellites have revealed a substantial amount of genetic diversity in red pine (Echt et al. 1998).

Radiata pine (a member of the subgenus Pinus) once ranged more widely, but now has only five distinct natural populations: three along the coast of central California and two on islands off the coast of Baja California. The mainland populations have hundreds of thousands of trees, while the island populations are much smaller (Moran et al. 1988, Lavery & Mead 1998). In particular, the Guadalupe population is close to extinction with fewer than 400 trees. The extant populations of radiata pine are disjunct relicts of much larger ancestral populations. According to Axelrod (1980, 1981), fragmentation has occurred within the last 8000 years. Millar (1997) suggested that repeated cycles of population growth and fragmentation have occurred at least in the last two million years. The island populations, Cedros and Guadalupe, are well separated from the mainland populations. Cedros seems to have separated from the mainland up to 10 million years ago and it is assumed that radiata pine colonized Guadalupe Island 1-4 MYA (Axelrod 1980). In the case of radiata pine there is good historical evidence that populations have gone through bottlenecks in the past (Axelrod 1980 and 1981, Lavery & Mead 1998), while the reduction of the population size of the Guadalupe population is a recent phenomenon. Radiata pine is the most widely cultivated exotic conifer, mainly in Australia, New Zealand and Chile. Breeding programmes in these countries have resulted in substantial genetic improvement (e.g. gains in stem volume) (Lavery & Mead 1998). Nowadays, radiata pine is a target for conservation efforts, and the accurate determination of inbreeding levels and genetic structure of natural populations would help refine conservation strategies.

Eastern white pine (Pinus strobus) is a member of the subgenus Strobus. It extends from Newfoundland to northern Georgia in North America. It thus currently has a large effective population size. Eastern white pine is an ecologically and economically important forest species in the area (Mirov 1967).

Sugar pine (Pinus lambertiana) is a member of the subgenus Strobus (Fig. 2). The effective population size of sugar pine is large. It is distributed from western Oregon through the Sierra Nevada to California. Sugar pine is considered a very important timber species due to the quality of its wood (Mirov 1967).

1.2.2 Genetic variation in pines

Pines are considered among the most genetically variable of all species, as revealed by measures of quantitative genetic variation (Cornelius 1994) and by diversity at allozyme loci (e.g. Hamrick et al. 1979, Hamrick & Godt 1990). According to many allozyme studies the modal value of expected heterozygosity (He) lies between 0.13 and 0.16 in pines (see Ledig 1998). DNA-based markers have also revealed high He estimates. For example, microsatellite markers have revealed expected heterozygosities between 0.50 and 0.80 in pines (Smith & Devey 1994, Thomas et al. 1999, Keys et al. 2000).

Variation at neutral loci may be governed mostly by mutation and drift (e.g. Kimura 1983). Further, the level of differentiation between populations at neutral loci depends on a balance between migration and genetic drift (e.g. Hartl & Clark 1989). Pollen-mediated gene flow among adjacent populations is effective in preventing population differentiation (Koski 1970, Ledig 1998). The homogenizing effect of gene flow can be seen in the distribution of allele frequencies in pines, the high estimates of migration, and the low values for population differentiation (e.g. Muona & Harju 1989, Beaulieu & Simon 1994, Goncharenko et al. 1994). The proportion of total genetic diversity that exists between populations is often less than 5 %, which means that more than 95 % of total variation is within populations. Some pine species with disjunct populations and restricted gene flow have more genetic diversity among populations: for example 16-27 % for P. radiata and 22 % for P. muricata (e.g. Moran et al. 1988, Millar et al. 1997, Wu et al. 1999).

In quantitative traits, the level of genetic variation should depend on a balance between mutation and selection, or between different selection pressures (Barton & Turelli 1989). When the trait has an influence on the survival or reproduction of the individual (e.g. cold tolerance), the pattern of variation will be clinal due to the strong selection (Endler 1977, Hurme 1999). Thus, when there is diversifying selection, the balance between selection and migration can result in considerable genetic differences between populations. Hence, neutral loci do not necessarily predict patterns of variation in traits that are subjected to differential selection.

1.2.3 Inbreeding in pines

The mating system denotes the levels of selfing, other consanguineous mating and outcrossing. Wind-pollinated pines are monoecious, which means that the female and the male reproductive structures are on the same tree. Studies using controlled pollinations have failed to reveal any incompatibility systems. Thus, prezygotic mechanisms to prevent selfing seem to be absent or poorly developed. Owens et al. (1998) suggested that a primitive prezygotic incompatibility mechanism might exist in conifers. Self-fertilization in pines occurs at a low or moderate level (Koski 1971, Muona & Harju 1989, Ledig 1998) and a high level of outcrossing at the mature seed stage is maintained by some mechanism. There is evidence of selection at the embryonic stage so that the number of inbreds is low already at the seedling stage (Koski 1971, Kärkkäinen & Savolainen 1993). For instance, Koski (1973) estimated that an average of nearly 90 % of the inbred embryos are destroyed before the seed is mature. Selection after the seedling stage is still severe. This was noticed when comparing the survival of selfed seedlings in P. sylvestris (Muona et al. 1987) and in Pinus leucodermis (Morgante et al. 1993). The major force acting against selfing in an outcrossing population is inbreeding depression. Inbred individuals have a higher level of homozygosity in general, and thereby homozygosity for recessive deleterious alleles is also higher (Charlesworth & Charlesworth 1998). Koelewijn et al. (1999) showed that total inbreeding depression is close to one in P. sylvestris. Nevertheless, some selfing is found in pines. The ability to self may allow pines to migrate successfully to new habitats (Bannister 1965).

Allozymes have been used in many studies to calculate the population average fixation indices (FIS) in conifers. For instance, Moran et al. (1988) and Plessas and Strauss (1986) estimated the level of inbreeding in radiata pine from allozyme data. They compared expected and observed heterozygosities between two life stages, seeds and adults. Both studies showed that in the seed stage there was a homozygosity excess, as a reflection of partial selfing. The genotypes of adult trees were, however, close to Hardy-Weinberg equilibrium, probably due to elimination of recessive homozygotes (Strauss & Libby 1987).
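The comparison of observed and expected heterozygosity between life stages can be sketched as follows. The genotype counts are hypothetical and are not taken from the cited allozyme studies; the fixation index is computed as FIS = 1 - Ho/He without any small-sample correction, and the function name is illustrative.

```python
from collections import Counter

def fixation_index(genotypes):
    """Return (Ho, He, FIS) for one locus from a list of diploid genotypes."""
    n = len(genotypes)
    allele_counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    # Expected heterozygosity under Hardy-Weinberg proportions
    he = 1.0 - sum((c / total) ** 2 for c in allele_counts.values())
    # Observed proportion of heterozygous individuals
    ho = sum(1 for a, b in genotypes if a != b) / n
    return ho, he, 1.0 - ho / he

# Hypothetical counts: a heterozygote deficit in seeds, none in adults
seeds = [("A", "A")] * 30 + [("A", "B")] * 40 + [("B", "B")] * 30
adults = [("A", "A")] * 25 + [("A", "B")] * 50 + [("B", "B")] * 25

for stage, sample in (("seeds", seeds), ("adults", adults)):
    ho, he, fis = fixation_index(sample)
    print(f"{stage}: Ho = {ho:.2f}, He = {he:.2f}, FIS = {fis:.2f}")
```

With these invented counts the seed sample gives a positive FIS (homozygote excess) and the adult sample gives FIS close to zero, which mirrors the qualitative pattern described above.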

1.3 Goals of this work

The present study has the following aims. First, we examined the evolution of microsatellite loci both among and within pine species by sequencing alleles both among species and within populations (I, III). The persistence of microsatellites was studied among pine species with very different effective population sizes (I). Second, we used microsatellite markers together with other molecular markers and an adaptive trait (date of bud set) to study patterns of genetic variation in Scots pine (Pinus sylvestris) in Finland. We wanted to determine whether patterns of variation are comparable between molecular markers and quantitative traits (II). Third, we evaluated the usefulness of microsatellites in population studies. We examined the distribution of genetic variation in five natural populations of radiata pine (Pinus radiata). We were able to compare our results to the earlier results based on allozymes. In addition, we examined whether the population size reductions in the past have an effect on the allele frequency distributions. Genetic distance methods based on both IAM and SMM were compared (III). Finally, we used microsatellite markers and a newly developed Bayesian method to estimate the inbreeding coefficients of individual adult trees from natural populations of radiata pine in the presence of null alleles and PCR amplification failures (IV).

2 Materials and methods

Only a brief outline of the material and the methods is given in this chapter. For instance, the PCR conditions, primer sequences and details of materials are described in detail in the original papers (I-IV).

2.1 Pine material

To study amplifications of microsatellites across species (I), we isolated DNA from seeds of five pine species. P. sylvestris samples were from southern Finland. P. resinosa and P. lambertiana seeds were from standard forests in the United States. P. strobus and P. radiata seeds from standard forest were kindly provided by Dr. C.S. Echt and Dr. G.F. Moran, respectively.

To examine the levels of molecular marker and quantitative trait variability and to study the patterns of variation across a north-south gradient, eleven populations of P. sylvestris were sampled (II). The southernmost population was Bromarv (60° N) and the northernmost Alalompolo (70° N). Not all populations were sampled for every marker. The origin of samples and sample sizes for each marker are described in Paper II.

Needle samples for P. radiata population and inbreeding studies (III and IV) were collected from a plantation at Urriarra near Canberra, Australia (Eldridge 1978). Each megagametophyte from the Año Nuevo population was a half-sib of the corresponding tree used in the population analysis.

2.2 DNA isolation

For cross-species amplification studies (I) DNA was isolated from diploid embryos and haploid megagametophytes using a modified CTAB method (Doyle & Doyle 1990). P. sylvestris DNA samples from needles (II) were isolated following the modified CTAB method of Wagner et al. (1987). DNA was isolated from fresh needles and megagametophytes in P. radiata studies (III and IV) using a Fast Prep instrument according to the manufacturer's instructions with slight modifications (Savant Instrument, Inc.).

Concentrations of DNA samples were measured using a DNA-specific fluorometer (Hoefer Scientific).

2.3 Microsatellite loci and sequence analysis in cross-species study

In a cross-species amplification study, twenty-eight primer pairs developed for P. strobus (Echt et al. 1996) and two primer pairs developed for P. radiata (Smith & Devey 1994) were used (I). Cloned PCR amplification products were sequenced using dye termination sequencing reagents and an ABI automated sequencer, model 377 (Applied Biosystems). At least three separate clones were sequenced from each individual for each microsatellite locus. For sequence alignment we used either the SeqEd program (Applied Biosystems) or the Dnasis alignment program (Hitachi Software Engineering Co.).

To test the dependence of the mutation rate on distance from the repeat region of the microsatellite, we used a log-likelihood ratio test separately for point mutations and indels.
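The exact formulation of the test is given in Paper I and is not reproduced here. As a simplified illustration of the general idea only, the sketch below compares two nested binomial models: a single mutation rate for all flanking sites versus separate rates for sites near to and far from the repeat. The two-bin split, the site counts and the function names are assumptions made for this example.

```python
import math

def _xlogy(x, y):
    """x * log(y), with the convention that 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def loglik(k, n, p):
    """Binomial log-likelihood of k changed sites out of n (constant dropped)."""
    return _xlogy(k, p) + _xlogy(n - k, 1.0 - p)

def lrt_two_rates(k_near, n_near, k_far, n_far):
    """Likelihood ratio test of one common rate versus two bin-specific rates."""
    p0 = (k_near + k_far) / (n_near + n_far)
    ll_null = loglik(k_near, n_near, p0) + loglik(k_far, n_far, p0)
    ll_alt = (loglik(k_near, n_near, k_near / n_near)
              + loglik(k_far, n_far, k_far / n_far))
    g = 2.0 * (ll_alt - ll_null)
    # Upper tail of the chi-square distribution with one degree of freedom
    p_value = math.erfc(math.sqrt(max(g, 0.0) / 2.0))
    return g, p_value

# Hypothetical counts: 5 of 100 near sites and 10 of 200 far sites differ
g, p_value = lrt_two_rates(5, 100, 10, 200)
print(f"G = {g:.3f}, p = {p_value:.3f}")
```

When the two bins show the same proportion of changed sites, as in this invented example, the statistic is zero and there is no evidence for a distance effect.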

2.4 Scots pine population study

We used different molecular markers (allozymes, RFLPs of ribosomal DNA and anonymous low-copy number DNA, RAPDs and microsatellites) and an adaptive trait (date of bud set) to examine the patterns of variation across a south-north gradient in Finland. Populations and sample sizes are described in Paper II.

2.4.1 Date of bud set and molecular markers

The timing of the terminal bud set was scored from the seedlings from four populations of Scots pine from different latitudes. The southernmost population was from Bromarv (60° N) and the northernmost population was from Salla (67° N). Differences between populations were tested by ANOVA (Sokal & Rohlf 1981) with populations being a random effect. Analyses were done using the SAS/STAT computer software (SAS Institute Inc. 1987).

In allozyme analyses, eight enzyme systems were studied, yielding ten polymorphic loci. Genotypic frequencies were used to calculate allelic frequencies and expected heterozygosities. The proportion of variation between populations was estimated using GST (Nei 1973).

RFLPs were studied by using one complementary DNA (cDNA) and two genomic DNA probes from loblolly pine (Devey et al. 1991). Hybridization conditions are described in Paper II. The inheritance of the variation of RFLP bands was determined from the progeny of a full-sib cross. Statistics were calculated as for the allozymes.

RFLP analyses of ribosomal DNA (rDNA) were done as described by Karvonen and Savolainen (1993). Shannon’s index of phenotypic diversity (Hutcheson 1970) was used to quantify the levels of variation and to partition it between and within populations.
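One common way to partition Shannon's index is to compare the mean within-population diversity with the diversity of the pooled phenotype frequencies. The sketch below follows that scheme; it is not necessarily the exact procedure used in Paper II. The natural logarithm, the equal population weights and the phenotype frequencies are assumptions made for illustration.

```python
import math

def shannon_index(freqs):
    """Shannon diversity H = -sum(p * ln p) over phenotype frequencies."""
    return -sum(p * math.log(p) for p in freqs if p > 0)

def partition_diversity(pop_freqs):
    """Return (mean within-population H, total H, between-population share).

    pop_freqs holds one list of phenotype frequencies per population,
    with phenotypes in the same order in every list.
    """
    n = len(pop_freqs)
    h_within = sum(shannon_index(f) for f in pop_freqs) / n
    pooled = [sum(column) / n for column in zip(*pop_freqs)]
    h_total = shannon_index(pooled)
    return h_within, h_total, (h_total - h_within) / h_total

# Hypothetical frequencies of three rDNA phenotypes in two populations
north = [0.6, 0.3, 0.1]
south = [0.5, 0.3, 0.2]

h_within, h_total, between = partition_diversity([north, south])
print(f"H within = {h_within:.3f}, H total = {h_total:.3f}, "
      f"between share = {between:.3f}")
```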

RAPD variation and divergence between northern and southern Finnish Scots pine populations were estimated from haploid megagametophytes. The average heterozygosity was estimated by examining a large number of loci in a single individual. In a large mating population we can assume that loci are independent, i.e. there is no correlation between the heterozygosity of different loci and there is probably no linkage disequilibrium (Hartl & Clark 1989, Savolainen & Hedrick 1995). For each tree we counted the segregating and non-segregating bands; the proportion of segregating bands (observed segregating bands/total number of bands) can be regarded as an estimate of heterozygosity in the genome. The same approach was also used to study divergence between populations. Microsatellite variation between northern and southern Scots pine populations in Finland was studied using two microsatellite loci from P. radiata (Smith & Devey 1994).
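The proportion of segregating RAPD bands for a single tree can be computed as in the following sketch. The presence/absence matrix is hypothetical: each row is one haploid megagametophyte of the same mother tree and each column is one band. A band counts as segregating when it is present in some megagametophytes and absent in others; bands never observed in the tree are ignored. The function name is illustrative.

```python
def segregating_fraction(band_matrix):
    """Proportion of observed bands that segregate among megagametophytes."""
    columns = list(zip(*band_matrix))
    observed = [col for col in columns if any(col)]       # band seen at least once
    segregating = [col for col in observed if not all(col)]
    return len(segregating) / len(observed)

# Hypothetical scores for four megagametophytes of one tree and five bands
tree = [
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
]

print(segregating_fraction(tree))  # bands 2 and 4 segregate out of 3 observed
```

In this invented example three bands are observed (the first, second and fourth), of which two segregate, so the estimate for the tree is 2/3.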

2.5 Radiata pine studies

Microsatellite markers developed for P. radiata (Devey et al. 2001) were used to study the population structure and the frequency of null alleles in five natural populations of P. radiata. We also sequenced some microsatellite alleles to examine the mutational mechanisms of microsatellite loci (III). The same data were used to estimate the inbreeding coefficients of individual trees with high resolution in these populations (IV).

2.5.1 Fragment analysis and sequencing of microsatellite alleles

Of 30 microsatellite loci, 19 were selected for population studies. The fluorescently labelled PCR products were run on an ABI 310 sequencer. The sizes of microsatellite fragments were determined using the GeneScan and Genotyper software packages (Applied Biosystems).

Sequencing of selected alleles was performed by the dideoxynucleotide chain termination method using an ABI 377 sequencer (Applied Biosystems). Both strands were sequenced for each allele and sequence alignments were performed using the GeneDoc program (Nicholas & Nicholas 1997).

2.5.2 Data analysis in population structure study

The general procedure for inferring frequencies of null alleles is described in detail in Paper III. The frequencies of null alleles were incorporated in the data and the expected genotype frequencies were calculated under Hardy-Weinberg from the estimated allele frequencies (III).

FST and RST among populations were calculated using methods described by Weir and Cockerham (1984), and Slatkin (1995), respectively. To calculate RST we had to bin alleles to fit the stepwise mutation model.
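The binning rule actually applied is described in Paper III. Purely as an illustration of what binning means, the sketch below converts fragment sizes to whole repeat steps relative to the smallest allele of a dinucleotide locus. The sizes, the choice of reference and the rule that half-steps are rounded upward are assumptions of this example.

```python
import math

def bin_to_repeat_steps(sizes_bp, motif_len=2, reference_bp=None):
    """Map fragment sizes (bp) to whole repeat steps relative to a reference."""
    ref = min(sizes_bp) if reference_bp is None else reference_bp
    # floor(x + 0.5) rounds half-steps upward; this tie rule is arbitrary
    return [math.floor((size - ref) / motif_len + 0.5) for size in sizes_bp]

# Hypothetical allele sizes at a dinucleotide locus, including an odd-sized one
sizes = [100, 101, 102, 106]
steps = bin_to_repeat_steps(sizes)
print(steps)  # [0, 1, 1, 3]
```

Alleles of 101 and 102 bp end up in the same class, which shows how binning discards the odd base pair differences that do not fit a strictly stepwise model.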

The departure from mutation-drift equilibrium was examined by comparing the observed Hardy-Weinberg heterozygosity with the expected gene diversity based on the observed number of alleles (Watterson 1978, 1986). The method of Luikart et al. (1998) was used to compare the distribution of allele frequencies observed in a population suspected to have undergone a bottleneck to the distribution expected in a non-bottlenecked population (so-called mode-shift distortion). Both of these tests were performed using the program Bottleneck (Cornuet & Luikart 1996).
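The idea behind the mode-shift indicator can be sketched in a few lines. This is not the Bottleneck program itself, only a minimal illustration under stated assumptions: alleles are grouped into ten frequency classes of width 0.1, and a sample is flagged when the lowest class is no longer the most abundant one. The allele counts and function names are hypothetical.

```python
def frequency_classes(allele_counts, n_classes=10):
    """Number of alleles falling in each frequency class of width 1/n_classes."""
    total = sum(allele_counts)
    histogram = [0] * n_classes
    for count in allele_counts:
        if count > 0:
            # Integer arithmetic avoids floating point edge cases at class limits
            histogram[min(count * n_classes // total, n_classes - 1)] += 1
    return histogram

def mode_shifted(allele_counts):
    """True when rare alleles (frequency < 0.1) are not the largest class."""
    histogram = frequency_classes(allele_counts)
    return histogram[0] < max(histogram[1:])

# Hypothetical allele counts at one locus (each list sums to 100 gene copies)
stable = [1, 1, 2, 2, 3, 4, 7, 30, 50]
bottlenecked = [20, 25, 25, 30]

print(mode_shifted(stable), mode_shifted(bottlenecked))  # False True
```

In practice the distribution would be pooled over many loci before it is inspected, since a single locus carries too few alleles to show a reliable shape.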

The method described by Pritchard et al. (2000) was used to infer the population structure and to assign individuals (probabilistically) to populations. The method uses the multilocus genotype data to examine how many groups the total set of individuals forms. Further, it tests correspondence between the genetic and geographic groupings. Genetic distances among populations were estimated by Nei's genetic distance (Nei 1973) and the (δµ)² statistic of Goldstein et al. (1995b).
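The (δµ)² statistic is the squared difference between the mean allele sizes (in repeat units) of two populations, averaged over loci. A minimal sketch with hypothetical allele frequencies is given below; the data structures and function names are illustrative, and no sampling correction is applied.

```python
def mean_repeat_number(freqs):
    """Mean allele size in repeat units; freqs maps repeat number to frequency."""
    return sum(repeats * p for repeats, p in freqs.items())

def delta_mu_squared(pop_x, pop_y):
    """Average over loci of the squared difference in mean repeat number."""
    per_locus = [
        (mean_repeat_number(x) - mean_repeat_number(y)) ** 2
        for x, y in zip(pop_x, pop_y)
    ]
    return sum(per_locus) / len(per_locus)

# Hypothetical frequencies at two loci in two populations
pop_x = [{10: 0.5, 12: 0.5}, {20: 1.0}]
pop_y = [{13: 1.0}, {20: 0.5, 22: 0.5}]

print(delta_mu_squared(pop_x, pop_y))  # (4 + 1) / 2 = 2.5
```

Because the statistic depends only on mean allele sizes, it relies on the stepwise assumption that size differences reflect the number of mutational steps between alleles.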

2.5.3 Mating system study

We used the probability of identity by descent (IBD) of two gametes approach (Malécot 1948) and a newly developed Bayesian method to estimate the individual inbreeding coefficients from microsatellite data (IV). The statistical methods are described in detail in Paper IV. The population level inbreeding coefficients were obtained by averaging individual inbreeding coefficients. The frequencies of null alleles were estimated with the additional megagametophyte data using Mendelian transmission probabilities. The frequency of PCR errors was also estimated using a probabilistic approach.

3 Results and discussion

3.1 Microsatellite evolution

The evolution and persistence of microsatellite loci were studied both within radiata pine (III) and between different pine species (I). This was done by sequencing alleles within populations and loci among species. In addition, we tested whether the repeat areas increase the evolutionary rate in the immediately adjacent flanking sequences (I).

3.1.1 Short term evolution within species

Microsatellite evolution within populations was studied by sequencing microsatellite alleles from P. radiata populations (III). This study had two goals. First, we wanted to confirm that detected fragments were really alleles from one locus and not from closely related loci. Second, we wanted to find a cause for the odd numbers of base pair differences and unexpectedly large gaps between allele lengths in some populations. All sequenced microsatellite loci contained compound or perfect dinucleotide repeats.

Based on the flanking sequences, the amplification products of each primer pair were alleles from a single locus, although some indels and/or base substitutions were found. From the Guadalupe population we sequenced alleles from locus Pr161 to resolve the reason for a large gap observed between alleles 210 and 228 bp. The length difference was due to 15 CT repeats in allele 228 bp and six in 210 bp. At locus Pr9.3 alleles 77, 94, and 103 bp were sequenced from the Cambria population. In this case a deletion of one C in the repeat area was the reason for the odd base pair difference between alleles. At locus Pr111, again in Cambria, there was one extra C base in the middle of the repeat area of allele 100 bp which caused the irregularity in the length differences between alleles 100 and 101 bp.
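The arithmetic behind these observations is simple to check. The sketch below uses the allele sizes and repeat counts reported above; the helper names are illustrative only. A difference of nine CT repeats accounts for exactly 18 bp, whereas odd base pair differences cannot arise from whole dinucleotide steps.

```python
MOTIF_LEN = 2  # dinucleotide repeat such as CT

def repeat_length_difference(repeats_long, repeats_short, motif_len=MOTIF_LEN):
    """Length difference (bp) explained by a difference in repeat number."""
    return (repeats_long - repeats_short) * motif_len

def fits_stepwise(size_a, size_b, motif_len=MOTIF_LEN):
    """True if two allele sizes differ by a whole number of repeat units."""
    return abs(size_a - size_b) % motif_len == 0

# Pr161: 15 versus 6 CT repeats explain the gap between 210 and 228 bp
print(repeat_length_difference(15, 6), 228 - 210)

# Pr9.3 and Pr111: odd differences cannot come from whole CT repeats alone
for pair in ((77, 94), (94, 103), (100, 101)):
    print(pair, fits_stepwise(*pair))
```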

Indels in the flanking sequences as well as undetectable homoplasy in the repeat areas cause size homoplasy and make it difficult to compare microsatellite alleles strictly as differences in repeat numbers. Although we did not sequence alleles of the same length, these sequences indicated that size homoplasy due to flanking sequence and repeat area changes must be relatively common both within and between radiata pine populations. Size homoplasy has also been reported among microsatellite alleles from the same population or same species (e.g. Viard et al. 1998, Colson & Goldstein 1999, Makova et al. 2000). For instance, Viard et al. (1998) found that the detection of size homoplasy through allele sequencing had a substantial effect on the resolution of population structure.

Furthermore, our sequences showed that repeat areas of microsatellite loci do not necessarily mutate purely in a stepwise fashion due to the indels or base substitutions in the repeat area. This means that the assumptions of statistical methods based on variances in repeat numbers are not necessarily always valid. In addition to the sequence data, microsatellite allele distributions of five natural populations of radiata pine also confirmed that there are a large number of odd base pair differences between alleles.

Earlier studies have shown that SMM or TPM can explain relatively well the evolutionary processes of most microsatellite loci (e.g. Weber & Wong 1993, Deka et al. 1993). It is, however, important to have more than just a few microsatellite loci so that the noise caused by irregularly evolving loci is reduced. Takezaki & Nei (1996) tested the probability of finding the correct topology when using microsatellites under IAM and SMM. They showed that the probability of detecting the correct topology increased as the number of loci increased under both models. In the case of PCR based microsatellites, it is relatively easy to examine many loci at the same time, but unfortunately for some species there are just a few microsatellite loci available. The sequencing of alleles would resolve the behaviour of loci, but it is extremely labour-intensive. Less time consuming techniques, such as SSCP (single strand conformation polymorphism) (Orita et al. 1989), could be considered to resolve the identity of alleles.

3.1.2 Long term evolution between species

Microsatellite persistence and evolutionary change were studied among five species of pines (I), which included a pair of closely related species (P. sylvestris and P. resinosa) in the subgenus Pinus, their relative P. radiata (a member of the same section, but different subsection) and another closely related species pair (P. strobus and P. lambertiana) in the subgenus Strobus. These two subgenera diverged over 100 MYA. The conservation of sequences that flank repeat areas of microsatellites allows cross-species amplification among related species. We wanted to examine whether microsatellite primers originally developed for P. strobus and P. radiata could amplify corresponding loci in related pine species. Further, we compared the sequences of the microsatellite loci between species to find out changes in the structure of the repeat areas, as well as in the flanking sequences. The effective population sizes of these species are known to have ranged from the very small bottlenecks of P. resinosa to vast populations of P. sylvestris. All this background information allowed us to place the microsatellite persistence and evolution study in a well-defined phylogenetic setting.

We were able to amplify homologous microsatellite loci in pine species which had diverged more than 100 MYA. Nine of 28 (32 %) P. strobus (subgenus Strobus) primer pairs and both P. radiata (subgenus Pinus) primer pairs resulted in specific amplification in P. sylvestris (subgenus Pinus). Further, four of these loci (three from P. strobus and one from P. radiata) were used to amplify microsatellites in all studied species: P. sylvestris, P. resinosa, P. radiata, P. strobus, and P. lambertiana. All four loci needed high stringency of PCR conditions to produce only the expected size of amplification products. It is known that the probability of successful cross-species amplification of microsatellites depends on the relatedness between species. Amplification success between closely related species can be high. For instance, within species belonging to the family Hirundinidae, 90 % of all the studied marker-species combinations worked (Primmer et al. 1996b). In the Arabis species 43 to 50 % of the microsatellites could be amplified (van Treuren et al. 1997). There are, however, some examples where microsatellite loci have been conserved over very long evolutionary distances, such as in different genera of turtles over 300 million years (FitzSimmons et al. 1995) or in fish species over 450 million years (Rico et al. 1996). Echt et al. (1999) showed that it is possible to share primers among members of the subgenera of Pinus. Their amplification success rate was 29 % when they tested primer pairs among subgenus Strobus and subgenus Pinus. Thus, this result is similar to our results.

Homology of the amplification products was assessed by comparing flanking sequences to the known phylogeny of these species. In P. strobus primers of locus RPS 105 gave two amplification products (RPS 105a and RPS 105b), whereas the other four species had just one locus. According to the repeat structure and shared bases the locus RPS 105b in P. strobus was orthologous with the locus obtained in the other species. The rest of the loci gave high base pair identities between species and the repeat areas did not show any major discrepancies among species. As all these species are assumed to have separated over 100 MYA, the short flanking sequences may not be enough to fully distinguish the order of divergence. However, we were able to compare sequences in a well-defined phylogenetic setting.

The flanking sequence comparison between RPS 105a and b and unsuccessful PCR efforts to amplify the locus RPS 105a in the other species indicated that duplication of the locus had taken place in P. strobus after P. lambertiana and P. strobus diverged (10-25 MYA). Duplication of locus RPS 105 in P. strobus and the finding that in P. radiata locus RPS 150 resulted in amplification of the related microsatellite (RPS 140) provide evidence of families of microsatellites in pines. Earlier studies on conifers have also shown that even species-specific primers can amplify several loci and that microsatellite regions are often embedded within repetitive DNA sequences (Smith & Devey 1994, Kostia et al. 1995, Pfeiffer et al. 1997). In addition, Elsik and Williams (2001) found families of clustered microsatellites in P. taeda.

We tested whether the unstable repeat areas destabilize the adjacent flanking sequence and thus increase its evolutionary rate. For point mutations as well as indels, we could not detect any evidence that mutation rates depend on distance from the microsatellite region.

Loci RPS 105, RPS 150 and RPS 160 from P. strobus that also amplified in the other pine species did not show within-species polymorphism among the individuals used in amplification testing (2-6 individuals). These loci were also classified as monomorphic in P. strobus (Echt et al. 1999). When Echt et al. (1999) tested microsatellites between the subgenera Strobus and Pinus, only monomorphic loci amplified in both subgenera. They suggested that natural selection could explain the conservation of these loci. The repeat length at a locus may be constrained by the same selective forces that constrain variation in the adjacent PCR primer annealing areas. However, our results showed that repeat areas seemed to evolve somewhat independently from the evolutionary rate of the adjacent single-copy areas. In addition, the polymorphic locus PR 4.6 from P. radiata (also polymorphic in P. sylvestris) amplified in species from both subgenera, Strobus and Pinus (II). To test polymorphism of this locus in P. strobus we would have needed more individuals.

The main repeat type of PR 4.6 was different in each of the three species, P. radiata, P. sylvestris, and P. strobus. In P. radiata, the polymorphism is based on variation in the number of CA repeats, but according to our sequence data, the polymorphism in P. sylvestris is based on a different motif, TAA repeats. Loci RPS 150 and 160 had very degenerated repeat motifs, probably due to point mutations. Both loci may have had longer perfect repeats earlier, but interruptions have made these structures almost unrecognisable. According to Taylor et al. (1999a), interruptions are responsible for monomorphism and shortening, i.e., death of the microsatellite.

Sequence comparisons revealed that microsatellite repeat sequences had persisted in all species despite the very different population sizes. This result is in accordance with the predictions of Stephan and Kim (1998), who concluded that microsatellites should persist independently of population size. On two independent occasions, the repeat area of microsatellite RPS 105 had undergone a rapid expansion, in P. strobus (AC repeat at RPS 105a) and in P. sylvestris (GT repeat at RPS 105), in the last 10-25 million years. In both species a low number of repeat units has served as the basis for expansion. It is possible that base substitutions have provided material for replication slippage and thus enabled the further expansion of the repeat areas. Xu et al. (2000) showed that there is a strong bias toward expansion or contraction for a particular allele depending on its length. Young microsatellites that evolve from short allele lengths have an overall bias toward expansion. Primmer and Ellegren (1998) found that a limited number of repeats, (NN)2-4, was sufficient for further expansion through slippage in avian microsatellites. Further, Messier et al. (1996) showed that in primates a point mutation initially led to the generation of a run of two tetranucleotides and, in the lineage where this occurred, to a later expansion to four tetranucleotides. In another lineage at the same microsatellite, a point mutation led to a run of five dinucleotides, which was later expanded by one dinucleotide over tens of millions of years. Expansions of repeat areas in two different pine species have been very rapid compared with those in birds and primates. We are not able to conclude whether the increase of repeats occurred one step at a time or through larger additions. If expansions like this can occur through larger additions, these loci would cause serious overestimation of genetic distances based on the stepwise mutation model (Slatkin 1995, Goldstein et al. 1995b). However, according to the literature these kinds of rapid expansions of repeat numbers must be very rare among microsatellite loci in general.

Overall, according to our study the patterns of microsatellite evolution are quite variable, and the simple stepwise mutation model does not hold over the time of divergence between relatively closely related pine species. Length variation due to indels in the flanking areas was common at the loci examined. This has also been noticed between more closely related species (e.g. Blanquer-Maumont & Crouau-Roy 1995, Steinkellner et al. 1997, van Treuren et al. 1997, Makova et al. 2000). We also noticed that the evolutionary rates at different microsatellite loci, and even in different parts of the same complex microsatellite, can be highly variable. Complex changes of repeat units among pine species are possible. Thus, inference of phylogenies from SMM-based distances that use allele size information alone can be unreliable, and polymorphism comparison between species is not recommended. Nucleotide substitutions in the flanking area can, however, be useful when estimating phylogenetic relationships among alleles (Orti et al. 1997) or among species.

3.2 Analysis of natural populations

3.2.1 Scots pine

We examined patterns of variation of several kinds of molecular markers (isozymes, RFLPs of ribosomal DNA and anonymous low-copy number DNA, RAPDs and microsatellites) and an adaptive trait (date of bud set) in Pinus sylvestris (Scots pine) (II). The study included Finnish Scots pine populations (from latitude 60°N to 70°N), which experience a steep climatic gradient. Our study allowed a comparison of variation between allozymes and different types of DNA markers. We were also able to study the relationship of marker variation to the variation of a quantitative trait.

Single-locus markers differed greatly in the level of variability. Although allozymes are highly variable in P. sylvestris (average within-population heterozygosity, He = 0.34), RFLPs (He = 0.49) and especially two microsatellite loci showed even higher variability (He = 0.77). Smith and Devey (1994) found an expected heterozygosity of 0.60 for those same microsatellite loci in P. radiata (radiata pine). For RAPDs the proportion of segregating loci in an individual tree was about 30 %. If we take this as an estimate of heterozygosity, this estimate is similar to the allozyme estimates for P. sylvestris. With RAPDs there can be problems in the identification of loci (Lynch & Milligan 1994), so a direct comparison may not be appropriate. Hurme and Savolainen (1999) found that 6 % of bands thought to be homologous based on size were found to arise from different loci upon closer study. The study of Isabel et al. (1995) showed that allozymes and RAPDs gave similar genetic diversity estimates in black spruce populations. These results indicate that RAPDs are comparable with other types of nuclear loci, at least in within-species studies. Our result based on RAPDs might not be completely accurate, but it should indicate the general level of heterozygosity. The phenotypically scored rDNA RFLPs are also quite variable within populations, but their usefulness is limited because of a lack of direct genetic interpretation.
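The expected heterozygosity (He) values compared above can be illustrated with a minimal sketch of Nei's gene diversity, He = 1 - Σ p_i², computed from sampled allele copies at one locus. The function name, the small-sample correction switch and the example allele sizes are illustrative assumptions and are not taken from the original analyses.

```python
from collections import Counter


def expected_heterozygosity(alleles, unbiased=True):
    """Gene diversity at one locus from a list of sampled allele copies.

    alleles  : one entry per sampled gene copy (two per diploid tree)
    unbiased : apply the n / (n - 1) small-sample correction
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two allele copies")
    counts = Counter(alleles)
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    h = 1.0 - sum_p2
    return h * n / (n - 1) if unbiased else h


# Hypothetical allele sizes (bp) scored at one microsatellite locus.
sample = [120, 120, 122, 124, 124, 124, 126, 128]
print(round(expected_heterozygosity(sample), 3))
```

Averaging the per-locus values over loci gives a population-level estimate of the kind reported in the text.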

Among all these markers, microsatellites are clearly the markers of choice for mapping and when neutral markers are needed. Disadvantages such as the high cost and the difficulty of finding new microsatellites in the complex pine genome can be overcome by cross-species amplification. For instance, microsatellites developed for Pinus halepensis work in the very close relative Pinus brutia (Keys et al. 2000), and many of these primers also give polymorphic amplification in P. sylvestris in our mapping project (unpublished result).

All populations were equally variable for all single-locus markers. For the quantitative trait, our measurements were at the phenotypic level in a common garden environment. Phenotypic coefficients of variation of bud set date in individual populations ranged from 0.09 to 0.14. Thus, the level of variation did not differ between northern and southern populations. In the long run, the adaptability of a trait is governed by the amount of genetic variation, and not by the ratio of additive genetic variation to the total phenotypic variation (Houle 1992). Thus, the desirable measure is the additive coefficient of variation. Hurme (1999) estimated the additive genetic variance for bud set date. According to this result, the southern population had a lower estimate of additive genetic variance than the northern one (0.24 and 0.37). The extensive variation in the northernmost populations is probably maintained by migration from the south (Kärkkäinen 1991), and a balance between migration and selection determines the amount of genetic variation.

All molecular markers (allozymes, RFLPs, and microsatellites) showed high levels of within-population variation, while differentiation among populations was low (FST ≤ 0.02). This pattern of genetic structure is common in conifers with large distributional ranges. Allozyme studies have shown that there is no latitudinal differentiation among the northern European Scots pine populations, as the FST values were less than 0.02 (Gullberg et al. 1985). In an allozyme analysis among populations geographically as distant as China and Sweden, 92.5 % of genetic diversity was within individual populations (Wang et al. 1991).

Differentiation between northern and southern Scots pine populations accounts for 14 % of the rDNA variation. The phenotypic rDNA results were not directly comparable to other markers. However, we analysed the genotypic microsatellite data of locus PR 9.3 with Shannon-Weaver statistics (Hutcheson 1970), and the estimate for the proportion of diversity between populations was 10 %. Thus the rDNA data were in a range similar to other markers.
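As an illustration of how a between-population share of Shannon-Weaver diversity can be obtained, the sketch below partitions the diversity of the pooled sample into a within-population mean and a remainder. Equal population weights, the function names and the use of phenotype or genotype class frequencies as input are assumptions made for the example; the exact procedure used in Paper II may differ.

```python
import math


def shannon(freqs):
    """Shannon-Weaver index H = -sum(p * ln p) over non-zero class frequencies."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


def between_population_share(pop_freqs):
    """Share of total Shannon diversity found between populations.

    pop_freqs : list of frequency lists, one per population, with the
                phenotype or genotype classes in the same order
    Populations are weighted equally (an assumption of this sketch).
    """
    k = len(pop_freqs)
    n_classes = len(pop_freqs[0])
    pooled = [sum(f[i] for f in pop_freqs) / k for i in range(n_classes)]
    h_total = shannon(pooled)
    if h_total == 0:
        return 0.0  # no diversity at all, nothing to partition
    h_within = sum(shannon(f) for f in pop_freqs) / k
    return (h_total - h_within) / h_total


# Two hypothetical populations with three genotype classes each.
print(round(between_population_share([[0.6, 0.3, 0.1], [0.2, 0.3, 0.5]]), 3))
```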

Common garden experiments showed that in the northernmost population, Salla, bud set occurred about 21 days earlier than in the southernmost population, Bromarv. Of the total variation in bud set, 36.4 % was found among the populations. Thus, the clinal variation observed in this study is in accordance with earlier work on the timing of bud set (Mikola 1982) and the development of frost hardiness in Scots pine (Hurme et al. 1997), suggesting adaptive significance. In addition, evidence for local genetic adaptation has been obtained from transfer experiments (Eriksson et al. 1980).

According to our study, molecular markers may not be good predictors of variability in all quantitative traits. The genetic variation observed for these markers seems to be selectively neutral, at least with respect to climatic adaptation. The high amount of pollen-mediated gene flow among adjacent populations (Koski 1970, Harju & Muona 1989) and the lack of differential selection are effective in preventing their divergence. Low levels of differentiation have also been found for most other species of conifers (e.g. Hamrick et al. 1992). Instead, strong differential selection diversifies those loci that are responsible for local adaptation. Although we had only four Scots pine populations from Finland in our bud set study, our finding emphasizes the general pattern of adaptation. Similar findings have been reported earlier, e.g. in Scots pine (Aho 1994), Pinus contorta (Jonsson et al. 1981), and Picea abies (Ekberg et al. 1979).

Complete lack of marker variation may predict lack of morphological variation, as in red pine (Mosseler et al. 1992). On the other hand, in Acacia mangium some growth variation is present even if marker variation is lacking (Moran et al. 1989). These kinds of different patterns are due to differences in selection strength and different dynamics of traits after bottlenecks (Lande 1994, Lynch 1996).

Although molecular markers may not carry much information on quantitative traits, they are excellent tools for many other applications such as mapping, identification, and monitoring changes of variability due to drift.

3.2.2 Radiata pine

Microsatellite variation was studied in the five natural populations of P. radiata (III). The populations comprised three large mainland populations in northern California (Año Nuevo, Monterey, and Cambria) and two smaller island populations (Cedros and Guadalupe) on Mexican oceanic islands. First we had to estimate the proportion of null alleles, because undetected null alleles can cause an overestimation of the level of inbreeding. We wanted to examine the resolution power of microsatellites as compared with earlier allozyme studies. In the case of P. radiata there is good historical evidence that populations have gone through bottlenecks (Axelrod 1980, 1981, Lavery & Mead 1998). Therefore we wanted to examine the departure of populations from the mutation-drift equilibrium by comparing observed gene diversity with the diversity (heterozygosity) expected from the observed numbers of alleles. In addition, the relationships between populations were studied using a test developed by Pritchard et al. (2000), which infers the population structure (i.e. number of populations) and assigns individuals to populations from genotype data. The genetic distances among populations were estimated by using both Nei's genetic distances (Nei 1973) and the (δµ)² statistics of Goldstein et al. (1995a,b) developed for microsatellites.

3.2.2.1 Estimation of null alleles and distribution of microsatellite variation

Among the 19 microsatellites, most loci showed significant departures from Hardy-Weinberg equilibrium. Fixation indices were between 0.136 and 0.279 in all populations. There can be two reasons for such a high deficiency of heterozygotes: null alleles or inbreeding. It is known from earlier studies that adult trees are likely to be close to Hardy-Weinberg equilibrium due to inbreeding depression (Moran et al. 1988, Muona 1990). Thus, null alleles are a likely alternative. Null alleles are alleles that do not amplify due to base substitutions or indels within the priming site. Frequencies of null alleles were estimated using a probabilistic approach based on the probability of identity by descent (IBD).

Frequencies of null alleles varied among locus/population combinations from close to zero to 0.333. The null allele frequencies were incorporated into the data, and the expected genotypes were calculated under Hardy-Weinberg from the estimated allele frequencies.
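The way a null allele inflates apparent homozygosity under Hardy-Weinberg proportions can be shown with a short sketch. With visible alleles at frequencies p_i and a null allele at frequency r, an apparent homozygote for allele i is expected at frequency p_i² + 2 p_i r, a heterozygote at 2 p_i p_j, and a failed amplification (null homozygote) at r². The function name and the example frequencies are hypothetical; this is not the IBD-based estimator used in Paper III.

```python
def apparent_genotype_freqs(visible, null_freq):
    """Expected frequencies of scored genotype classes under HWE with a null allele.

    visible   : dict mapping visible allele -> frequency
    null_freq : frequency r of the non-amplifying (null) allele
    """
    total = sum(visible.values()) + null_freq
    if abs(total - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    out = {}
    names = sorted(visible)
    for i, a in enumerate(names):
        pa = visible[a]
        # True homozygotes plus allele/null heterozygotes scored as homozygotes.
        out[(a, a)] = pa * pa + 2 * pa * null_freq
        for b in names[i + 1:]:
            out[(a, b)] = 2 * pa * visible[b]
    out["no product"] = null_freq ** 2
    return out


# Hypothetical locus: two visible alleles and a null allele at frequency 0.2.
for genotype, freq in apparent_genotype_freqs({"A": 0.5, "B": 0.3}, 0.2).items():
    print(genotype, round(freq, 3))
```

In this example only 30 % of trees would be scored as heterozygotes, which is how undetected null alleles can mimic inbreeding.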

Earlier studies have shown that null alleles are common for microsatellite loci (e.g. Callen et al. 1993, Paetkau & Strobeck 1995, Pemberton et al. 1995). Fisher et al. (1998) observed that two out of the nine microsatellite loci in P. radiata had null alleles. Thomas et al. (1999) found a deficiency of heterozygotes (FIS = 0.360) in lodgepole pine (Pinus contorta var. latifolia) populations when they used five microsatellite loci. They concluded that this was due to inbreeding, although locus-specific FIS estimates varied from -0.064 to 0.648. Thus, null alleles could at least partially explain the high inbreeding value.

We were able to compare our microsatellite variation in P. radiata to earlier allozyme results (Plessas & Strauss 1986, Moran et al. 1988). All populations showed a high level of variability (He = 0.68-0.77). The mean gene diversity was much higher than that estimated from allozymes (He: 0.73 and 0.089, respectively). The large mainland populations had more alleles and higher expected heterozygosities than the smaller island populations, while allozymes had similar expected heterozygosities for all five populations (Moran et al. 1988). Wu et al. (1999) were not able to detect differences in within-population variation between Año Nuevo, Cambria and Guadalupe when using RAPDs. Thus, only microsatellites were able to reveal differences in the measures of genetic diversity among populations with different effective population sizes.

We used both the traditional FST statistic and the RST statistic (Slatkin 1995) developed for microsatellites to calculate the level of genetic differentiation among populations. A substantial amount of genetic differentiation was detected among populations. FST was 11.9 % from the original data and 14.1 % from pooled data (alleles were binned to fit the stepwise mutation model). RST was 15.6 % among natural populations of P. radiata. Although the proportion of total genetic diversity found among populations of widespread wind-pollinated outcrossing conifer species is usually less than 10 % (Ledig 1998), our results reflect well the situation in isolated P. radiata populations with restricted gene flow between populations. Moran et al. (1988) found that 16.2 % of total variation was between populations when using allozymes, and Wu et al. (1999) found 17-26 % when using RAPDs. Thus, our results were comparable with earlier results. Irregularly evolving loci may cause problems when estimating differentiation of populations using methods based on the stepwise mutation model (i.e. RST). However, as the number of loci increases, the noise caused by irregularly evolving loci should be reduced. In our study, 19 loci were sufficient for obtaining comparable values for differentiation estimates based on microsatellite and allozyme data.
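The proportion of diversity found among populations can be sketched for a single locus as FST = (HT - HS) / HT, where HS is the mean within-population gene diversity and HT is the gene diversity of the averaged allele frequencies. This is a simplified illustration with equal population weights and no sample-size correction; the function name and input layout are assumptions, and the estimators actually used in Paper III may differ in detail.

```python
def fst_from_freqs(pop_freqs):
    """Single-locus FST = (HT - HS) / HT from per-population allele frequencies.

    pop_freqs : list of dicts, one per population, mapping allele -> frequency
    Populations are weighted equally and no bias correction is applied.
    """
    alleles = set().union(*pop_freqs)
    k = len(pop_freqs)
    # Mean within-population gene diversity.
    hs = sum(1 - sum(p.get(a, 0.0) ** 2 for a in alleles) for p in pop_freqs) / k
    # Gene diversity of the averaged (total) allele frequencies.
    mean = {a: sum(p.get(a, 0.0) for p in pop_freqs) / k for a in alleles}
    ht = 1 - sum(v ** 2 for v in mean.values())
    if ht == 0:
        return 0.0  # locus is monomorphic overall
    return (ht - hs) / ht


# Two hypothetical populations with opposite allele frequencies.
print(round(fst_from_freqs([{"A": 0.8, "B": 0.2}, {"A": 0.2, "B": 0.8}]), 2))
```

For several loci, HS and HT would be averaged over loci before taking the ratio.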

3.2.2.2 Effects of bottleneck and colonization

According to Nei et al. (1975), populations lose genetic variability after bottlenecks. However, as soon as the population size becomes large, variability starts to increase due to new mutations. Allele number is reduced first, and it recovers slowly. However, the average number of alleles per locus increases faster than the average heterozygosity after the bottleneck. The island populations Cedros and Guadalupe had lower expected heterozygosities and mean numbers of alleles per locus. The reasonably large Cedros population (80 000 trees), which colonized the island 10 million years ago (500 000 overlapping generations ago), should have reached the new equilibrium after colonization. The Cedros population is, however, divided into two separate stands (Moran et al. 1988). Thus, the effective population size of Cedros is likely to be smaller than the actual number of individuals. The average heterozygosity has probably reached the new equilibrium level, which is lower than in the mainland populations. The average levels of heterozygosity of the mainland populations were close to those estimated from other hard pines (e.g. Thomas et al. 1999). Thus, we can assume that the mainland populations have probably reached mutation-drift equilibrium.

The very small Guadalupe population (< 400 trees) had a level of heterozygosity similar to that of Cedros. The mean number of alleles per locus was the smallest among all the populations, indicating the smallest effective population size. Populations that have recently undergone a bottleneck are likely to have lost rare alleles, but may still contain a substantial amount of heterozygosity (Nei et al. 1975).
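Why a bottleneck removes rare alleles much faster than heterozygosity can be shown with two standard drift expressions. Ignoring mutation, heterozygosity after t generations at effective size Ne is H_t = H_0 (1 - 1/(2Ne))^t, whereas an allele at frequency p is absent from a single sample of 2Ne gene copies with probability (1 - p)^(2Ne). The function names and the numbers used are illustrative assumptions, not estimates for the radiata pine populations.

```python
def heterozygosity_after_drift(h0, ne, generations):
    """Expected heterozygosity after drift alone (no mutation, constant Ne)."""
    return h0 * (1 - 1 / (2 * ne)) ** generations


def prob_allele_lost(p, ne):
    """Probability that an allele at frequency p is missing from 2*Ne sampled copies."""
    return (1 - p) ** (2 * ne)


# Hypothetical one-generation bottleneck of Ne = 20 starting from He = 0.75.
print(round(heterozygosity_after_drift(0.75, 20, 1), 3))  # heterozygosity retained
print(round(prob_allele_lost(0.01, 20), 3))               # chance a rare allele is lost
```

In this example heterozygosity falls by only 2.5 %, while an allele at frequency 0.01 is lost in roughly two cases out of three.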

Allele length distributions in P. radiata were multi-modal at many loci and in all populations. When microsatellite loci are in mutation-drift equilibrium, loci under the SMM should have contiguous allelic states. Thus, microsatellite alleles should differ by multiples of the repeat unit, with most multiples represented. A bottleneck can cause gaps in the distributions due to the loss of alleles with intermediate lengths. This means that the number of alleles decreases faster than the range in allele size at microsatellite loci (Cornuet & Luikart 1996, Garza & Williamson 1999). When a locus follows the SMM, the number of alleles can increase rapidly as soon as population size starts to grow, but it can take a long time for alleles to reach equilibrium in the allele frequency distribution.
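Gaps of this kind can be listed with a simple check against the ladder of allele sizes expected under strict stepwise change. The sketch below reports the missing repeat states between the smallest and largest observed allele, and separately the alleles that do not fit the repeat ladder at all (which would point to indels rather than repeat-number changes). The function name and the example sizes are hypothetical.

```python
def allele_ladder_gaps(allele_sizes, repeat_len):
    """Compare observed allele sizes (bp) with a contiguous stepwise ladder.

    Returns (missing, off_ladder):
      missing    : ladder sizes between min and max with no observed allele
      off_ladder : observed sizes that are not a whole number of repeats
                   away from the smallest allele
    """
    sizes = sorted(set(allele_sizes))
    lo, hi = sizes[0], sizes[-1]
    present = set(sizes)
    missing = [s for s in range(lo, hi + 1, repeat_len) if s not in present]
    off_ladder = [s for s in sizes if (s - lo) % repeat_len != 0]
    return missing, off_ladder


# Hypothetical dinucleotide locus with a gap between 102 and 108 bp.
print(allele_ladder_gaps([100, 102, 108, 110], 2))
```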

After a bottleneck event the observed number of alleles is less than the number predicted from the Hardy-Weinberg heterozygosity under the assumption that the population is at mutation-drift equilibrium (Nei et al. 1975, Watterson 1984). Populations that have passed through a recent bottleneck should show a significant heterozygosity excess compared to the heterozygosity expected from the observed number of alleles. Thus, bottlenecks can be studied by comparing expected gene diversities (based on the number of alleles) and observed gene diversities (Watterson 1978, 1986). This was done by using the program Bottleneck (Cornuet & Luikart 1996), which uses an approach similar to the Ewens-Watterson test (Watterson 1978). Under the SMM and TPM none of the populations showed heterozygosity excess, but rather heterozygosity deficiency, resulting from a recent population expansion. Thus, populations have probably reached a new, lower mutation-drift equilibrium after the bottlenecks, but they have expanded since (Maruyama & Fuerst 1984). In addition, an analysis of the allele frequency distribution revealed no traces of recent bottlenecks, as a mode shift from the L-shaped distribution would have resulted had a bottleneck reduced genetic variability in a population (Luikart et al. 1998).
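The mode-shift idea can be sketched as a histogram of allele frequencies in ten classes: in a non-bottlenecked population the lowest class (rare alleles) is expected to be the most abundant, giving an L-shaped distribution, and a mode shift is indicated when some higher class contains more alleles than the lowest one. This is a simplified indicator written for illustration, with hypothetical function names; it is not the implementation in the Bottleneck program.

```python
def frequency_class_histogram(allele_freqs, n_classes=10):
    """Count alleles (pooled over loci) per frequency class of width 1/n_classes."""
    counts = [0] * n_classes
    for p in allele_freqs:
        if p <= 0:
            continue  # absent alleles are not counted
        idx = min(int(p * n_classes), n_classes - 1)
        counts[idx] += 1
    return counts


def shows_mode_shift(allele_freqs, n_classes=10):
    """True if any higher frequency class holds more alleles than the lowest class."""
    counts = frequency_class_histogram(allele_freqs, n_classes)
    return any(c > counts[0] for c in counts[1:])


# Hypothetical locus dominated by rare alleles: L-shaped, no mode shift.
print(shows_mode_shift([0.02, 0.03, 0.05, 0.05, 0.85]))
```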

The gaps in the allele frequency distributions may be the reason for the unexpected heterozygosity deficiency in the Guadalupe population, which is close to extinction with less than 400 trees. At equilibrium, loci under the SMM and TPM should have reasonably contiguous allelic states. If gaps that follow the bottleneck are progressively "filled in" by mutations, there can be a transient excess of alleles (i.e. deficiency of heterozygosity) (Cornuet & Luikart 1996). The negative result in the mode shift analysis might be due to a combination of the long generation time of pines and the degree of the bottleneck. Luikart et al. (1998) concluded in their simulation studies that a bottleneck size likely to be detectable is approximately Ne = 20 and that the power of the test depends on the number of generations since the bottleneck. The Guadalupe population has less than 400 individuals; thus the severity of the bottleneck does not fulfil this criterion, and the bottleneck might be so recent that there have not been enough generations to show any traces of it.

3.2.2.3 Relationships between populations

The test of Pritchard et al. (2000) showed that genotype data from 19 microsatellite loci indicated the presence of five very distinct groups, corresponding to the three mainland and two island populations (P = 1). When geographical distribution was used as prior information, the analysis allowed correct assignment of individuals to their collection locations with very high probabilities (P ≈ 1). Thus, there were no misclassified individuals and no migration between populations.

A topology based on Nei's genetic distances showed that mainland and island populations clustered in their own groups. Among mainland populations Año Nuevo had separated first. This result is in accordance with the study of Plessas and Strauss (1986). Contrary to this, the allozyme study of Moran et al. (1988) showed that Año Nuevo and Cambria are the most similar mainland populations. This is also the traditional view based on morphological characters (Guinon et al. 1982), which may be influenced by selection. Fossil records of pine cones (Axelrod 1980) and the allozyme study of Moran et al. (1988) suggest that the two-needled, small-coned P. radiata var. cedrosensis is the most distinct from the other populations. Microsatellites were not able to separate the Cedros population from the Guadalupe population. A consensus tree from (δµ)² distances based on the SMM (Goldstein et al. 1995a) mixed the mainland and island populations. The result is therefore not consistent with our present knowledge about the history of radiata pine (Axelrod 1980, Plessas & Strauss 1986, Moran et al. 1988, Millar 1997).

The method of Pritchard et al. (2000) to infer population structure and assign individuals to populations does not assume any particular mutation model. This method needs a relatively small number of loci (e.g. seven microsatellite loci) to detect a very strong signal of population structure and assign individuals appropriately (Pritchard et al. 2000). Our 19 loci were more than enough for the analysis. Further, we do not expect any admixture, and allele frequency differences among populations are clear. Thus, our result should be very reliable. The IAM-based distance estimator (Nei 1973) gave a more reliable topology than the SMM-based estimator (Goldstein et al. 1995a). There might be several explanations for the inconsistent behaviour of (δµ)². Evolutionary patterns of microsatellites may be irregular, and there may be an upper limit to the number of repeats (Goldstein et al. 1995a,b). The mutation rates can also vary between loci, which will increase the variance of distance values. According to Takezaki & Nei (1996), the bottleneck effect would also contribute to reducing the percentage of replications in which the correct topology will be obtained.
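The sensitivity of (δµ)² to irregular loci follows directly from its definition: it is the squared difference in mean allele size (in repeat units) between two populations, averaged over loci. The sketch below is a minimal illustration with a hypothetical function name and made-up allele sizes; it shows how a single locus with a large shift in mean size dominates the average.

```python
def delta_mu_squared(pop_a, pop_b):
    """(delta mu)^2 distance between two populations.

    pop_a, pop_b : dicts mapping locus name -> list of allele sizes
                   expressed in repeat units (one entry per gene copy)
    Only loci scored in both populations are used.
    """
    loci = sorted(set(pop_a) & set(pop_b))
    if not loci:
        raise ValueError("no shared loci")
    total = 0.0
    for locus in loci:
        mean_a = sum(pop_a[locus]) / len(pop_a[locus])
        mean_b = sum(pop_b[locus]) / len(pop_b[locus])
        total += (mean_a - mean_b) ** 2
    return total / len(loci)


# Hypothetical data: locus L1 differs by four repeats, locus L2 by one.
a = {"L1": [10, 12], "L2": [5, 5]}
b = {"L1": [14, 16], "L2": [5, 7]}
print(delta_mu_squared(a, b))
```

Here locus L1 contributes 16 and locus L2 only 1 to the sum, so the distance is driven almost entirely by one locus.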


3.3 Estimation of inbreeding in radiata pine populations using microsatellites

We used highly polymorphic microsatellite markers to estimate inbreeding coefficients of individual adult trees of radiata pine with a newly developed Bayesian MCMC (Markov Chain Monte Carlo simulation) scheme (IV). In addition, we were able to estimate parental population allele frequencies, including null alleles. With a slight modification, the proportion of PCR failures was also estimated. With allozymes the estimation of the inbreeding level of individuals has been nearly impossible due to low levels of polymorphism and the low number of loci. Only calculations of population average inbreeding have been possible.

The frequency of PCR errors was about 0.05, and the frequency of null alleles was generally low to modest (see Paper III). The superior resolution of microsatellites revealed that the distribution of individual tree inbreeding coefficients (Fi) was bimodal. Most of the individuals in all five populations were the result of outcrossing. About one individual per mainland population and three to four per island population were probably the result of selfing. The posterior mean of the inbreeding level for some of the individuals was well above 0.5, but the 0.9 posterior interval often included zero. Though we used 19 highly polymorphic microsatellites, additional markers would improve resolution.

The average inbreeding coefficients (corresponding to FIS) from the microsatellite data were about 0.04 for the mainland and 0.1 for the island populations. Thus, the island populations had higher inbreeding levels than the mainland populations, although only for Cedros did the posterior interval not overlap those of any mainland population. Moran et al. (1988) reported that fixation indices (FIS) did not significantly differ from zero for maternal trees from the wild population when using a different statistical method. These data were reanalysed using the same method as with microsatellites. For each of the five populations, posterior intervals of the allozyme and microsatellite analyses overlapped broadly. Savolainen et al. (2001) estimated outcrossing rates from seeds in five natural populations of radiata pine. The selfing rate was about 0.1 for the mainland populations and about 0.5 to 0.7 for the island populations. Thus, it seems evident that selection against inbreds takes place during embryonic and juvenile stages.
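The contrast between selfing rates in seeds and inbreeding levels in adults can be made concrete with the standard mixed-mating expectation: if a fraction s of offspring comes from selfing every generation and there is no selection against inbreds, the inbreeding coefficient settles at F = s / (2 - s). The sketch below applies this textbook relation to the selfing rates quoted above; it is an illustration under that assumption and not a calculation from Paper IV.

```python
def equilibrium_inbreeding(selfing_rate):
    """Equilibrium inbreeding coefficient under mixed mating without selection."""
    if not 0.0 <= selfing_rate <= 1.0:
        raise ValueError("selfing rate must be between 0 and 1")
    return selfing_rate / (2.0 - selfing_rate)


# Selfing rates reported for seeds: about 0.1 (mainland), 0.5 to 0.7 (islands).
for s in (0.1, 0.5, 0.7):
    print(s, round(equilibrium_inbreeding(s), 3))
```

Without selection the island populations would thus be expected to show F of roughly 0.33 to 0.54, well above the adult estimate of about 0.1, whereas the mainland expectation of roughly 0.05 is close to the adult estimate of about 0.04.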

Our samples were from a common garden environment, an experimental plantation near Canberra. Thus, the results may not be strictly transferable to the wild. However, inbreeding levels of adult individuals in the natural populations [reanalysed allozyme data from Moran et al. (1988)] and the experimental populations (our microsatellite data) were very similar. Furthermore, inbreeding depression is expressed at an early life stage, such that seeds taken from wild populations have experienced some selection against selfers (Muona et al. 1987, Koelewijn 1998). For these reasons we can assume that our data from a common garden environment are comparable with the situation in the wild.

It is well known that pines have a considerable level of self-pollination (e.g. Koski 1971, 1973) and that inbreeding depression during seed development is severe in conifers (Koski 1971, Muona 1990, Kärkkäinen & Savolainen 1993). Under the assumption that all inbreeding is due to selfing, Lande et al. (1994) showed that there is a threshold of inbreeding depression above which all selfed individuals die. Under such conditions the mating system will be complete outcrossing, as has been found e.g. in Scots pine (Koelewijn et al. 1999). In a predominantly outcrossing population, recessive deleterious alleles with low frequencies can be effectively shielded from selection, while partial selfing and other forms of inbreeding can lead to purging of recessive deleterious alleles. Savolainen et al. (2001) noticed that in the island populations of radiata pine pollination patterns have changed moderately towards selfing, while the mainland populations have pollination patterns close to those observed in other species (Muona 1990). Thus, an increase of self-pollination is not a very likely explanation for purging. It is known that radiata pine populations have undergone bottlenecks due to fragmentation and colonization (Axelrod 1980, Millar 1997). Bottlenecks increase the probability of selfing but even more the probability of weak inbreeding, e.g., mating of distant cousins. Thus, weak inbreeding may have been a first step in purging in radiata pine. Once the level of inbreeding depression had decreased, closer inbreeding and nowadays selfing would have become possible. Consequently, the mating system of radiata pine would have changed from almost completely outcrossing to mixed mating with an intermediate level of selfing.

We also presented a probabilistic method for estimating individual inbreeding coefficients and parental population allele frequencies (IV). The Bayesian MCMC scheme was used with the Gibbs sampler (Gelman et al. 1995) to produce the full posterior distribution of all parameters with confidence limits (Weir 1996). The Bayesian method is more efficient than the method of moments estimator for the inbreeding coefficient (see Sweigart et al. 1999), as it allows the estimation of null alleles and the proportion of PCR failures when using allelic data from the maternal genome (megagametophytes) of related individuals. The precision of estimation of individual inbreeding coefficients should be very high with this approach, because allele frequencies from each population are estimated. When using method of moments estimators, parental allele frequencies are estimated from pooled populations. However, if population differentiation is very low (FST well below 0.1), the method of moments estimator is as good as the probabilistic method used here.
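The logic of estimating an individual inbreeding coefficient from multilocus genotypes can be sketched with a grid approximation to the posterior. Given allele frequencies, a homozygote A_i A_i has probability F p_i + (1 - F) p_i² and a heterozygote A_i A_j has probability (1 - F) 2 p_i p_j. The sketch assumes known allele frequencies, a uniform prior on F, and no null alleles or PCR failures, so it is a deliberately simplified stand-in for the Gibbs sampler of Paper IV; all names and numbers are hypothetical.

```python
def posterior_inbreeding(genotypes, allele_freqs, grid_size=101):
    """Grid posterior of an individual inbreeding coefficient F.

    genotypes    : list of (allele, allele) tuples, one per locus
    allele_freqs : list of dicts (same order as genotypes), allele -> frequency
    Returns (grid, posterior, posterior_mean) under a uniform prior on [0, 1].
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    weights = []
    for f in grid:
        like = 1.0
        for (a, b), freqs in zip(genotypes, allele_freqs):
            if a == b:
                p = freqs[a]
                like *= f * p + (1 - f) * p * p    # identical by descent or by chance
            else:
                like *= (1 - f) * 2 * freqs[a] * freqs[b]
        weights.append(like)
    total = sum(weights)
    posterior = [w / total for w in weights]
    mean = sum(f * w for f, w in zip(grid, posterior))
    return grid, posterior, mean


# Hypothetical tree scored at five loci, each with two equally common alleles.
freqs = [{"A": 0.5, "B": 0.5}] * 5
_, _, mean_het = posterior_inbreeding([("A", "B")] * 5, freqs)
_, _, mean_hom = posterior_inbreeding([("A", "A")] * 5, freqs)
print(round(mean_het, 2), round(mean_hom, 2))
```

With only five weakly informative loci the posterior remains broad, which mirrors the observation that posterior intervals often included zero even when the posterior mean was high.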

Normally, only population averages (FST and FIS) are reported and not individual inbreeding coefficients. The reason for this has been the limited resolution of allozyme data. Microsatellites are extremely polymorphic compared to allozymes, but they have their own problems, such as null alleles and PCR errors. Thus, estimating the frequencies of null alleles and removing PCR errors might be essential, especially if the frequency of null alleles is very high. Our 19 polymorphic microsatellites are more than are available for most wild organisms. As the posterior intervals showed, resolution can be improved with additional markers, although data based on 19 microsatellites already gave a promising result.

4 Concluding remarks

Although microsatellite markers provide an excellent tool for different kinds of studies, knowledge of the evolutionary processes of these markers is critical in a population context. The first part of the present study concerns the evolution of microsatellites both within species and between pine species (I, III). These studies showed that length variation between alleles is not necessarily due to differences in repeat numbers and that the mutational pattern of repeat areas can be irregular due to indels and base substitutions. This means that the stepwise mutation model (SMM) (Kimura & Ohta 1978) does not necessarily hold, especially over longer evolutionary distances. Genetic distance estimators based on the variance of repeat numbers have been improved by incorporating factors relevant for the evolution of microsatellites into mutation models, like allele size constraints (Garza et al. 1995) or directionally biased changes in allele size (Kimmel & Chakraborty 1996). However, these processes seem to be locus specific. Takezaki & Nei (1996) concluded that as the number of loci increases, the noise caused by irregularly evolving loci should be reduced. In our radiata pine population study (III), 19 microsatellites seemed to be enough to give differentiation estimates comparable to earlier allozyme results.

Although molecular markers may not be good predictors of variability in all quantitative traits (II), their high variability makes them ideal tools for monitoring changes of variability due to drift. Highly polymorphic microsatellites in particular have good resolving power compared to many other markers. Microsatellites were able to reveal differences in the measures of genetic diversity among P. radiata populations with different effective population sizes (III), whereas allozymes (Moran et al. 1988) and RAPDs (Wu et al. 1999) were not. The population differentiation estimates based on microsatellite data were, however, comparable with those from other single-locus markers both in the Scots pine population study (II) and in the radiata pine population study (III).
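
As a rough illustration of why effective population size leaves a trace in diversity measures, the following Python sketch computes expected heterozygosity from allele frequencies and its expected decay under drift alone, using the textbook relation H_t = H_0 (1 - 1/(2Ne))^t for an idealised population without mutation or migration. The allele frequencies, population sizes and function names are hypothetical and are not estimates from the populations studied here.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity, 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)

def heterozygosity_after_drift(h0, ne, generations):
    """Expected heterozygosity after drift alone in an idealised population."""
    return h0 * (1.0 - 1.0 / (2.0 * ne)) ** generations

# Hypothetical loci: a microsatellite with ten equally frequent alleles
# and a two-allele allozyme locus with one rare allele.
microsatellite = [0.1] * 10
allozyme = [0.95, 0.05]

h_ssr = expected_heterozygosity(microsatellite)
h_allo = expected_heterozygosity(allozyme)

# Hypothetical small and large populations, followed for 50 generations.
for ne in (25, 500):
    print(
        f"Ne={ne}:",
        "microsatellite", round(heterozygosity_after_drift(h_ssr, ne, 50), 3),
        "allozyme", round(heterozygosity_after_drift(h_allo, ne, 50), 3),
    )
```

The proportional loss is the same for both marker types, but the absolute difference between a small and a large population is much larger for the highly variable locus, so it is easier to detect with a finite sample.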

In the last part of the present study, microsatellites were used to estimate inbreeding coefficients of individual radiata pine trees with a newly developed Bayesian MCMC scheme (IV). The Bayesian method allowed estimation of null allele frequencies and the proportion of PCR failures. With less variable markers it has been possible to estimate only the population average inbreeding (see e.g. Plessas & Strauss 1986, Moran et al. 1988). Our 19 microsatellite markers already gave very promising results, although the resolution could be improved by adding more loci.
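
The idea of an individual inbreeding coefficient can be sketched with a much simpler estimator than the Bayesian MCMC scheme of paper IV. The Python example below is only an illustrative stand-in: it finds the value of F on a grid that maximises the likelihood of one individual's multilocus genotype, using the standard genotype probabilities under inbreeding (homozygote: F p + (1 - F) p^2; heterozygote: 2 (1 - F) p q). It assumes known allele frequencies and ignores null alleles and PCR failures, which the method of paper IV accounts for. All names, genotypes and frequencies are hypothetical.

```python
import math

def genotype_prob(a, b, freqs, f):
    """Probability of genotype (a, b) at one locus given inbreeding coefficient f."""
    pa, pb = freqs[a], freqs[b]
    if a == b:
        return f * pa + (1.0 - f) * pa * pa
    return 2.0 * (1.0 - f) * pa * pb

def log_likelihood(genotypes, locus_freqs, f):
    """Log-likelihood of a multilocus genotype, treating loci as independent."""
    return sum(
        math.log(genotype_prob(a, b, freqs, f))
        for (a, b), freqs in zip(genotypes, locus_freqs)
    )

def ml_inbreeding(genotypes, locus_freqs, grid_size=100):
    """Grid search for f in [0, 1); f = 1 is excluded to keep logs finite."""
    grid = [i / grid_size for i in range(grid_size)]
    return max(grid, key=lambda f: log_likelihood(genotypes, locus_freqs, f))

# Hypothetical individual typed at four loci with alleles "A" and "B".
freqs = [{"A": 0.1, "B": 0.9}] * 4
individual = [("A", "A"), ("A", "A"), ("A", "B"), ("A", "B")]

print("grid ML estimate of F:", ml_inbreeding(individual, freqs))
```

With only a handful of loci the likelihood surface is flat, which is consistent with the observation above that resolution improves as loci are added.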

5 References

Aho M-L (1994) Autumn frost hardening of one-year-old Pinus sylvestris (L.) seedlings. Effect of origin and parent trees. Scand J For Res 9: 17-24.

Amos W, Sawcer SJ, Feakes RW & Rubinsztein DC (1996) Microsatellites show mutational bias and heterozygote instability. Nature Genetics 13: 390-391.

Angers B & Bernatchez L (1997) Complex evolution of a salmonid microsatellite locus and its consequences in inferring allelic divergence from size information. Mol Biol Evol 14: 230-238.

Aslanidis C, Jansen G, Amemiya C, Shutler G, Mahadevan M, Tsilfidis C, et al. (1992) Cloning of the essential myotonic dystrophy region and mapping of the putative defect. Nature 355: 548-551.

Axelrod DI (1980) History of the maritime closed-cone pines, Alta and Baja California. University of California Publications in Geological Sciences Vol. 120.

Axelrod DI (1981) Holocene climatic changes in relation to vegetation disjunction and speciation. Amer Natural 117: 847-870.

Bannister MH (1965) Variation in the breeding system of Pinus radiata. In: Baker HG & Stebbins GL (eds) The Genetics of Colonizing Species. Academic Press, New York, p 353-372.

Barton NH & Turelli M (1987) Adaptive landscapes, genetic distance and the evolution of quantitative characters. Genet Res 49: 157-173.

Beaulieu J & Simon J-P (1994) Genetic structure and variability in Pinus strobus in Quebec. Can J For Res 24: 1726-1733.

Beaumont MA & Bruford MW (1999) Microsatellites in conservation genetics. In: Goldstein DB & Schlötterer C (eds) Microsatellite evolution and applications. Oxford University Press, New York, p 165-182.

Beckmann JS & Weber JL (1992) Survey of human and rat microsatellites. Genomics 12: 627-631.

Blanquer-Maumont A & Crouau-Roy B (1995) Polymorphism, monomorphism, and sequences in conserved microsatellites in primate species. J Mol Evol 41: 492-497.

Bonan GB, Pollard D & Thompson SL (1992) Effects of boreal forest vegetation on global climate. Nature 359: 716-718.

Callen DF, Thompson AD, Shen Y, Phillips HA, Richards RI, Mulley JC & Sutherland GR (1993) Incidence and origin of “null” alleles in the (AC)n microsatellite markers. Am J Hum Genet 52: 922-927.

Chakraborty R, Kimmel M, Stivers DN, Davison J & Deka R (1997) Relative mutation rates at di-, tri-, and tetranucleotide microsatellite loci. Proc Natl Acad Sci USA 94: 1041-1046.

Charlesworth B & Charlesworth D (1998) Some evolutionary consequences of deleterious mutations. Genetica 102/103: 3-19.

Ciofi C & Bruford MW (1999) Genetic structure and gene flow among Komodo dragon populations inferred by microsatellite loci analysis. Mol Ecol 8: 17-30.

Colson I & Goldstein DB (1999) Evidence for complex mutations at microsatellite loci in Drosophila. Genetics 152: 617-627.

Cornelius J (1994) Heritabilities and additive genetic coefficients of variation in forest trees. Can J For Res 24: 372-379.

Cornuet JM & Luikart G (1996) Description and power analysis of two tests for detecting recent population bottlenecks from allele frequency data. Genetics 144: 2001-2014.

Coulson TN, Pemberton JM, Albon SD, Beaumont M, Marshall TC, Slate J, Guinness FE & Clutton-Brock TH (1998) Microsatellites reveal heterosis in red deer. Proc R Soc London Ser B 265: 489-495.

Crow JF & Kimura M (1970) An introduction to population genetics theory. Harper and Row, New York, Evanston and London.

Deka R, Chakraborty R & Ferrell RE (1991) A population genetic study of six VNTR loci in three ethnically defined populations. Genomics 11: 83-92.

Deka R, Shriver MD, Yu LM, Aston C, Chakraborty R & Ferrell RE (1994) Conservation of human chromosome 13 polymorphic microsatellite (CA)n repeats in chimpanzees. Genomics 22: 226-230.

DeVerno L & Mosseler A (1997) Genetic variation in red pine (Pinus resinosa Ait.) revealed by RAPD and RAPD/RFLP analysis. Can J For Res 27: 1316-1320.

Devey ME, Jermstad KD, Tauer CG & Neale DB (1991) Inheritance of RFLP loci in a loblolly pine three-generation pedigree. Theor Appl Genet 83: 238-242.

Devey ME, Moran GF & Bell JC (2001) A set of microsatellite markers in Pinus radiata for the tree breeding applications. (Manuscript).

Dietrich WF, Miller J, Steen R, Merchant MA, Damron-Boles D, Husain Z, et al. (1996) A comprehensive genetic map of the mouse genome. Nature 380: 152-154.

DiRienzo A, Peterson AC, Garza JC, Valdes AM, Slatkin M & Freimer NB (1994) Mutational processes of simple-sequence repeat loci in human populations. Proc Natl Acad Sci USA 91: 3166-3170.

Echt CS, DeVerno LL, Anzidei M & Vendramin GG (1998) Chloroplast microsatellites reveal population genetic diversity in red pine, Pinus resinosa Ait. Mol Ecol 7: 307-317.

Echt CS, May-Marquardt P, Hseih M & Zahorchak R (1996) Characterization of microsatellite markers in Eastern white pine. Genome 39: 1102-1108.

Echt CS, Vendramin GG, Nelson CD & Marquardt P (1999) Microsatellite DNA as shared genetic markers among conifer species. Can J For Res 29: 365-371.

Edwards A, Civitello A, Hammond HA & Caskey CT (1991) DNA typing and genetic mapping with trimeric and tetrameric tandem repeats. Am J Hum Genet 49: 749-756.

Ekberg I, Eriksson G & Dormling I (1979) Photoperiodic reactions in conifer species. Holarct Ecol 2: 255-263.

Eldridge KG (1978) Seed collections in California in 1978. Aust. CSIRO Div For Res Annu Rep 1977-1978, p 8-17.

Ellegren H (2000a) Heterogeneous mutation processes in human microsatellite DNA sequences. Nature Genetics 24: 400-402.

Ellegren H (2000b) Microsatellite mutations in the germline; implications for evolutionary inference. Trends in Genet 16: 551-558.

Ellegren H, Moore S, Robinson N, Byrne K, Ward W & Sheldon BC (1997) Microsatellite evolution – A reciprocal study of repeat lengths at homologous loci in cattle and sheep. Mol Biol Evol 14: 854-860.

Ellegren H, Primmer CR & Sheldon BC (1995) Microsatellite evolution: directionality or bias in locus selection. Nature Genetics 11: 360-362.

Elsik CG & Williams CG (2001) Families of clustered microsatellites in a conifer genome. Submitted to Mol Gen Genet.

Endler JA (1977) Geographic variation, speciation, and clines. Princeton University Press, Princeton N.J.

Eriksson G, Andersson S, Eiche V, Ifver J & Persson A (1980) Severity index and transfer effects on survival and volume production of Pinus sylvestris in northern Sweden. Stud For Suecica 156: 1-32.

Estoup A & Cornuet J-M (1999) Microsatellite evolution: inferences from population data. In: Goldstein DB & Schlötterer C (eds) Microsatellite evolution and applications. Oxford University Press, New York, p 49-65.

Estoup A, Garnery L, Solignac M & Cornuet JM (1995a) Microsatellite variation in honey bee (Apis mellifera L.) populations: hierarchical genetic structure and test of the infinite allele and stepwise mutation models. Genetics 140: 679-695.

Estoup A, Tailliez C, Cornuet JM & Solignac M (1995b) Size homoplasy and mutational processes of interrupted microsatellites in two bee species, Apis mellifera and Bombus terrestris (Apidae). Mol Biol Evol 12: 1074-1084.

Feldman M, Bergman A, Pollock DD & Goldstein DB (1997) Microsatellite genetic distances with range constraints: analytic description and problems of estimation. Genetics 145: 207-216.

Fisher PJ, Richardson TE & Gardner RC (1998) Characterization of single- and multi-copy microsatellites from Pinus radiata. Theor Appl Genet 96: 969-979.

FitzSimmons NN, Moritz C & Moore SS (1995) Conservation and dynamics of microsatellite loci over 300 million years of marine turtle evolution. Mol Biol Evol 12: 432-440.

Fowler DP & Morris RW (1977) Genetic diversity in red pine: evidence for low genetic heterozygosity. Can J For Res 7: 343-347.

Fu YH, Kuhl DP, Pizzuti A, Pieretti M, Sutcliffe JS, Richards S et al. (1991) Variation of the CGG repeat at the fragile X site results in genetic instability: resolution of the Sherman paradox. Cell 67: 1047-1058.

Garza JC, Slatkin M & Freimer NB (1995) Microsatellite allele frequencies in humans and chimpanzees, with implications for constraints on allele size. Mol Biol Evol 12: 594-603.

Garza JC & Williamson E (1999) Detection of reduction in population size using data from microsatellites. VII ESEB congress, Barcelona.

Gelman A, Carlin JB, Stern HS & Rubin DB (1995) Bayesian Data Analysis. Chapman and Hall.

Gill P, Sparkes R & Kimpton C (1997) Development of guidelines to designate alleles using an STR multiplex system. Forensic Sci Int 89: 185-197.

Ginot F, Bordelais I, Nguyen S & Gyapay G (1996) Correction of some genotyping errors in automated fluorescent microsatellite analysis by enzymatic removal of one base overhangs. Nucleic Acids Res 24: 540-541.

Goldstein DB, Roemer GW, Smith DA, Reich DE, Bergman A & Wayne RK (1999) The use of microsatellite variation to infer population structure and demographic history in a natural model system. Genetics 151: 797-801.

Goldstein DB, Ruiz Linares A, Cavalli-Sforza LL & Feldman MW (1995a) Genetic absolute dating based on microsatellites and origin of modern humans. Proc Natl Acad Sci USA 92: 6723-6727.

Goldstein DB, Ruiz Linares A, Cavalli-Sforza LL & Feldman MW (1995b) An evaluation of genetic distances for use with microsatellite loci. Genetics 139: 463-471.

Goncharenko GG, Silin AE & Padutov VE (1994) Allozyme variation in natural populations of Eurasian pines. III. Population structure, diversity, differentiation and gene flow in central and isolated populations of Pinus sylvestris L. in eastern Europe and Siberia. Silvae Genetica 43: 119-132.

Grimaldi M & Crouau-Roy B (1997) Microsatellite allelic homoplasy due to variable flanking sequences. J Mol Evol 44: 336-340.

Guinon M, Hood JV & Libby WJ (1982) A clonal study of intraspecific variability in radiata pine. Aust For Res 12: 191-201.

Gullberg U, Yazdani R, Rudin D & Ryman N (1985) Allozyme variation in Scots pine (Pinus sylvestris L.) in Sweden. Silvae Genet 34: 193-200.

Hamrick JL, Godt MJ & Sherman-Broyles SL (1992) Factors influencing levels of genetic diversity in woody plant species. New For 6: 95-124.

Hamrick JL & Godt MJW (1990) Allozyme diversity in plant species. In: Brown AHD, Clegg MT, Kahler AL & Weir BS (eds) Plant Population Genetics, Breeding and Genetic Resources. Sinauer Associates Inc., Sunderland, Massachusetts, p 43-63.

Hamrick JL, Linhart YB & Mitton JB (1979) Relationships between life history characteristics and electrophoretically detectable genetic variation in plants. Ann Rev Ecol Sys 10: 173-200.

Hancock JM (1995) The contribution of slippage-like processes to genome evolution. J Mol Evol 41: 1038-1047.

Harju A & Muona O (1989) Background pollination in Pinus sylvestris seed orchards. Scand J For Res 4: 513-520.

Hartl DL & Clark AG (1989) Principles of population genetics. Sinauer Associates Inc., Sunderland, Massachusetts.

Hedrick PW (1999) Perspective: Highly variable loci and their interpretation in evolution and conservation. Evolution 53: 313-318.

Henderson ST & Petes TD (1992) Instability of simple sequence DNA in Saccharomyces cerevisiae. Mol Cell Biol 12: 2749-2757.

Houle D (1992) Comparing evolvability and variability of quantitative traits. Genetics 130: 195-204.

Hurme P (1999) Genetic basis of adaptation: bud set date and frost hardiness variation in Scots pine. Thesis, Acta Univ. Oul. A339. Oulu University Press, Oulu, Finland.

Hurme P, Repo T, Savolainen O & Pääkkönen T (1997) Climatic adaptation of bud set and frost hardiness in Scots pine (Pinus sylvestris). Can J For Res 27: 716-723.

Hurme P & Savolainen O (1999) Comparison of homology and linkage of random amplified polymorphic DNA (RAPD) markers between individual trees of Scots pine (Pinus sylvestris L.). Mol Ecol 8: 15-22.

Hutcheson K (1970) A test for comparing diversities based on the Shannon formula. J Theor Biol 29: 151-154.

Jarne P & Lagoda PJL (1996) Microsatellites, from molecules to populations and back. Trends Ecol Evol 11: 424-429.

Jeffreys AJ, Tamaki K, MacLeod A, Monckton DG, Neil DL & Armour JAL (1994) Complex gene conversion events in germline mutation at human minisatellites. Nature Genetics 6: 136-145.

Jonsson A, Eriksson G, Dormling I & Ifver J (1981) Studies on frost hardiness of Pinus contorta Dougl. seedlings grown in climate chambers. Stud For Suec 157: 1-47.

Karvonen P & Savolainen O (1993) Variation and inheritance of ribosomal DNA in Pinus sylvestris L. (Scots pine). Heredity 71: 614-622.

Keys RN, Autino A, Edwards J, Fady B, Pichot C & Vendramin GG (2000) Characterization of nuclear microsatellites in Pinus halepensis Mill. and their inheritance in P. halepensis and Pinus brutia Ten. Mol Ecol 9: 2157-2159.

Kimmel M & Chakraborty R (1996) Measures of variation at DNA repeat loci under a general stepwise mutation model. Theor Pop Biol 50: 345-367.

Kimura M (1983) The neutral theory of molecular evolution. Cambridge University Press, Cambridge.

Kimura M & Crow JF (1964) The number of alleles that can be maintained in a finite population. Genetics 49: 725-738.

Kimura M & Ohta T (1978) Stepwise mutation model and distribution of allelic frequencies in a finite population. Proc Natl Acad Sci USA 75: 2868-2872.

Koelewijn HP (1998) Effects of different levels of inbreeding on progeny fitness in Plantago coronopus. Evolution 52: 692-703.

Koelewijn HP, Koski V & Savolainen O (1999) Magnitude and timing of inbreeding depression in Scots pine (Pinus sylvestris L.). Evolution 53: 758-768.

Koski V (1970) A study of pollen dispersal as a mechanism of gene flow in conifers. Commun Inst For Fenn 70.4: 1-78.

Koski V (1971) Embryonic lethals of Picea abies and Pinus sylvestris. Commun Inst For Fenn 75.3: 1-30.

Koski V (1973) On self-pollination, genetic load, and subsequent inbreeding in some conifers. Commun Inst For Fenn 78.10: 1-42.

Kostia S, Varvio S-L, Vakkari P & Pulkkinen P (1995) Microsatellite sequences in a conifer, Pinus sylvestris. Genome 38: 1244-1248.

Kremenetski CV, Liu K & MacDonald GM (1998) The late Quaternary dynamics of pines in northern Asia. In: Richardson DM (ed.) Ecology and Biogeography of Pinus. Cambridge University Press, Cambridge, p 95-106.

Krupkin AB, Liston A & Strauss SH (1996) Phylogenetic analysis of hard pines (Pinus subgenus Pinus, Pinaceae) from chloroplast DNA restriction site analysis. Am J Bot 83: 489-498.

Kunst CB, Leeflang EP, Iber JC, Arnheim N & Warren ST (1997) The effect of FMR1 CGG repeat interruptions on mutation frequency as measured by sperm typing. J Med Genet 34: 627-631.

Kärkkäinen K (1991) Itsesiitos, kukintamuuntelu ja sukusiitosheikkous pohjoisissa mäntypopulaatioissa. M.Sc. thesis. University of Oulu.

Kärkkäinen K & Savolainen O (1993) The degree of early inbreeding depression determines the selfing rate at the seed stage: model and results from Pinus sylvestris (Scots pine). Heredity 71: 160-166.

Lagercrantz U, Ellegren H & Andersson L (1993) The abundance of various polymorphic microsatellite motifs differs between plants and vertebrates. Nucleic Acids Res 21: 1111-1115.

Lande R (1998) Anthropogenic, ecological and genetic factors in extinction and conservation. Res Popul Ecol 40: 259-269.

Lande R, Schemske D & Schultz ST (1994) High inbreeding depression, selective interference among loci, and the threshold rate for purging recessive deleterious mutations. Evolution 48: 956-978.

Lavery PB & Mead DJ (1998) Pinus radiata: a narrow endemic from North America takes on the world. In: Richardson DM (ed) Ecology and Biogeography of Pinus. Cambridge University Press, Cambridge, p 432-449.

Ledig FT (1998) Genetic variation in Pinus. In: Richardson DM (ed) Ecology and Biogeography of Pinus. Cambridge University Press, Cambridge, p 251-280.

Levinson G & Gutman GA (1987a) Slipped-strand mispairing: a major mechanism for DNA sequence evolution. Mol Biol Evol 4: 203-221.

Levinson G & Gutman GA (1987b) High frequencies of short frameshift in poly-CA/GT tandem repeats borne by bacteriophage M13 in Escherichia coli K-12. Nucleic Acids Res 15: 5323-5338.

Li W-H (1997) Molecular Evolution. Sunderland, Mass., Sinauer.

Litt M & Luty JA (1989) A hypervariable microsatellite revealed by in vitro amplification of a dinucleotide repeat within a cardiac muscle actin gene. Am J Hum Genet 44: 397-401.

Luikart G, Allendorf FW, Cornuet JM & Sherwin WB (1998) Distortion of allele frequency distributions provides a test for recent population bottlenecks. J Hered 89: 238-247.

Luikart G & England PR (1999) Statistical analysis of microsatellite DNA data. Trends Ecol Evol 14: 253-256.

Lynch M (1996) A quantitative-genetic perspective on conservation issues. In: Avise J & Hamrick J (eds.) Conservation genetics: case histories from nature. Chapman and Hall, NY, p 471-501.

Lynch M & Milligan BG (1994) Analysis of population genetic structure with RAPD markers. Mol Ecol 3: 91-99.

Makova KD, Nekrutenko A & Baker RJ (2000) Evolution of microsatellite alleles in four species of mice (Genus Apodemus). J Mol Evol 51: 166-172.

Malécot G (1948) Les mathématiques de l'hérédité. Masson, Paris.

Maruyama T & Fuerst PA (1984) Population bottlenecks and nonequilibrium models in population genetics. II. Allele numbers when populations evolve from zero variability. Genetics 108: 745-763.

Messier W, Li S-H & Stewart C-B (1996) The birth of microsatellites. Nature 381: 483.

Mikola J (1982) Bud set phenology as an indicator of climatic adaptation of Scots pine in Finland. Silvae Fenn 16: 178-184.

Millar CI (1997) Quaternary evolution of Pinus radiata. In: Burdon RD & Moore JM (eds) Genetics of radiata pine. New Zealand Forest Research Institute, Rotorua, NZ, p 22-25.

Mirov NT (1967) The genus Pinus. New York, Ronald Press.

Moore SS, Sargeant LL, King TJ, Mattick JS, Georges M & Hetzel DJS (1991) The conservation of dinucleotide microsatellites among mammalian genomes allows the use of heterologous PCR primer pairs in closely related species. Genomics 10: 654-660.

Moran GF, Bell JC & Eldridge KG (1988) The genetic structure and the conservation of the five natural populations of Pinus radiata. Can J For Res 18: 506-514.

Moran GF, Muona O & Bell JC (1989) Acacia mangium: a tropical forest tree of the coastal lowlands with low genetic diversity. Evolution 43: 231-235.

Morgante M, Vendramin GG, Rossi P & Olivieri AM (1993) Selection against inbreds in early life-cycle phases in Pinus leucodermis. Heredity 70: 622-627.

Mosseler A, Egger KN & Hughes GA (1992) Low levels of genetic diversity in red pine confirmed by random amplified polymorphic DNA markers. Can J For Res 22: 1332-1337.

Muona O (1990) Population genetics in forest tree improvement. In: Brown AHD, Clegg MT, Kahler AL & Weir BS (eds) Plant Population Genetics, Breeding, and Genetic Resources. Sinauer Assoc, Sunderland, p 282-298.

Muona O & Harju A (1989) Effective population sizes, genetic variability and mating system in natural stands and seed orchards of Pinus sylvestris. Silvae Genet 38: 221-228.

Muona O, Yazdani R & Rudin D (1987) Genetic change between life stages in Pinus sylvestris: allozyme variation in seeds and planted seedlings. Silvae Genet 35: 39-42.

Murray JC, Bennett SR, Kwitek AE, Small KW, Schinzel A, Alward WL, Weber JL, Bell GI & Buetow KH (1992) Linkage of Rieger syndrome to the region of the epidermal growth factor gene on chromosome 4. Nature Genetics 2: 46-49.

Nauta MJ & Weissing FJ (1996) Constraints on allele size at microsatellite loci: implications for genetic differentiation. Genetics 143: 1021-1032.

Nei M (1973) Analysis of gene diversity in subdivided populations. Proc Natl Acad Sci USA 70: 3321-3323.

Nei M, Maruyama T & Chakraborty R (1975) The bottleneck effect and genetic variability in populations. Evolution 29: 1-10.

Nicholas KB & Nicholas HB Jr (1997) GeneDoc: a tool for editing and annotating multiple sequence alignments. Distributed by author.

van Oppen MJH, Rico C, Turner GF & Hewitt GM (2000) Extensive homoplasy, nonstepwise mutations, and shared ancestral polymorphism at a complex microsatellite locus in Lake Malawi cichlids. Mol Biol Evol 17: 489-498.

Orita M, Iwahana H, Kanazawa H, Hayashi K & Sekiya T (1989) Detection of polymorphism of human DNA by gel electrophoresis as single strand conformation polymorphisms. Proc Natl Acad Sci USA 86: 2766-2770.

Orti G, Pearse DE & Avise JC (1997) Phylogenetic assessment of length variation at a microsatellite locus. Proc Natl Acad Sci USA 94: 10745-10749.

Owens JN, Takaso T & Runions CJ (1998) Pollination in conifers. Trends in Plant Sci 3: 479-485.

Paetkau D & Strobeck C (1995) The molecular basis and evolutionary history of a microsatellite null allele in bears. Mol Ecol 4: 519-520.

Peakall R, Gilmore S, Keys W, Morgante M & Rafalski A (1998) Cross-species amplification of soybean (Glycine max) simple sequence repeats (SSRs) within the genus and other legume genera: implications for the transferability of SSRs in plants. Mol Biol Evol 15: 1275-1287.

Pemberton JM, Coltman DW, Coulson TN & Slate J (1999) Using microsatellites to measure the fitness consequences of inbreeding and outbreeding. In: Goldstein DB & Schlötterer C (eds) Microsatellite evolution and applications. Oxford University Press, New York, p 151-164.

Petes TD, Greenwell PW & Dominska M (1997) Stabilization of microsatellite sequences by variant repeats in the yeast Saccharomyces cerevisiae. Genetics 146: 491-498.

Pfeiffer A, Olivieri AM & Morgante M (1997) Identification and characterization of microsatellites in Norway spruce (Picea abies K.). Genome 40: 411-419.

Plessas ME & Strauss SH (1986) Allozyme differentiation among populations, stands, and cohorts in Monterey pine. Can J For Res 16: 1155-1164.

Powell W, Morgante M, McDevitt R, Vendramin GG & Rafalski JA (1995) Polymorphic simple sequence repeat regions in chloroplast genomes: applications to the population genetics of pines. Proc Natl Acad Sci USA 92: 7759-7763.

Powell W, Machray GC & Provan J (1996) Polymorphism revealed by simple sequence repeats. Trends in Plant Sci 7: 215-222.

Price RA, Liston A & Strauss SH (1998) Phylogeny and systematics of Pinus. In: Richardson DM (ed) Ecology and Biogeography of Pinus. Cambridge University Press, Cambridge, p 49-68.

Primmer CR & Ellegren H (1998) Pattern of molecular evolution in avian microsatellites. Mol Biol Evol 15: 997-1008.

Primmer CR, Saino N, Møller AP & Ellegren H (1998) Unraveling the processes of microsatellite evolution through analysis of germline mutations in barn swallows Hirundo rustica. Mol Biol Evol 15: 1047-1054.

Primmer CR, Ellegren H, Saino N & Møller AP (1996a) Directional evolution in germline microsatellite mutations. Nature Genetics 13: 391-393.

Primmer CR, Møller AP & Ellegren H (1996b) A wide-range survey of cross-species microsatellite amplification in birds. Mol Ecol 5: 365-378.

Pritchard JK, Stephens M & Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics 155: 945-959.

Queller DC, Strassmann JE & Hughes CR (1993) Microsatellites and kinship. Trends Ecol Evol 8: 285.

Richardson DM & Rundel PW (1998) Ecology and biogeography of Pinus: an introduction. In: Richardson DM (ed) Ecology and Biogeography of Pinus. Cambridge University Press, Cambridge, p 3-46.

Rico C, Rico I & Hewitt G (1996) 470 million years of conservation of microsatellite loci among fish species. Proc R Soc Lond B Biol Sci 263: 549-557.

Rousset F (1996) Equilibrium values of measures of population subdivision for stepwise mutation processes. Genetics 142: 1357-1362.

Rubinsztein DC (1999) Trinucleotide expansion mutations cause diseases which do not conform to classical Mendelian expectations. In: Goldstein DB & Schlötterer C (eds) Microsatellite evolution and applications. Oxford University Press, Inc., New York, p 80-97.

Rubinsztein DC, Amos W, Leggo J, Goodburn S, Margolis RL, Ross CA & Ferguson-Smith MA (1995) Microsatellite evolution – evidence for directionality and variation in rate between species. Nature Genetics 10: 337-343.

Samadi S, Erard F, Estoup A & Jarne P (1998) The influence of mutation, selection and reproductive systems on microsatellite variability: a simulation approach. Genet Res 71: 213-222.

Savolainen O & Hedrick P (1995) Heterozygosity and fitness: no association in Scots pine. Genetics 140: 755-766.

Savolainen O, Moran GF & Bell JV (2001) Variation in outcrossing in native populations of Pinus radiata due to purging of early acting deleterious genes. (Manuscript).

Schug MD, Mackay TFC & Aquadro CF (1997) Low mutation rates of microsatellite loci in Drosophila melanogaster. Nature Genetics 15: 99-102.

Scotti I, Magni F, Fink R, Powell W, Binelli G & Hedley PE (2000) Microsatellite repeats are not randomly distributed within Norway spruce (Picea abies K.) expressed sequences. Genome 43: 41-46.

Shriver MD, Jin L & Boerwinkle LE (1993) VNTR allele frequency distribution under the stepwise mutation model. Genetics 134: 983-993.

Shriver MD, Jin L, Boerwinkle LE, Deka R, Ferrell RE & Chakraborty R (1995) A novel measure of genetic distance for highly polymorphic tandem repeat loci. Mol Biol Evol 12: 914-920.

Sia EA, Kokoska RJ, Dominska M, Greenwell P & Petes TD (1997) Microsatellite instability in yeast: dependence on repeat unit size and DNA mismatch repair genes. Mol Cell Biol 17: 2851-2858.

Slatkin M (1995) A measure of population subdivision based on microsatellite allele frequencies. Genetics 139: 457-462.

Smith DN & Devey ME (1994) Occurrence and inheritance of microsatellites in Pinus radiata. Genome 37: 977-983.

Smith GP (1976) Evolution of repeated DNA sequences by unequal crossover. Science 191: 528-535.

Sokal RR & Rohlf FJ (1981) Biometry. WH Freeman, San Francisco, California.

Soranzo N, Provan J & Powell W (1999) An example of microsatellite length variation in the mitochondrial genome of conifers. Genome 42: 158-161.

Sperisen C, Büchler U, Gugerli F, Mátyás G, Geburek T & Vendramin GG (2001) Tandem repeats in plant mitochondrial genomes: application to the analysis of population differentiation in the conifer Norway spruce. Mol Ecol 10: 257-263.

Stallings RL (1992) CpG suppression in vertebrate genomes does not account for the rarity of (CpG) microsatellite repeats. Genomics 17: 890-891.

Steinkellner H, Lexer C, Turetschek E & Glössl J (1997) Conservation of (GA)n microsatellite loci between Quercus species. Mol Ecol 6: 1189-1194.

Stephan W & Kim Y (1998) Persistence of microsatellite arrays in finite populations. Mol Biol Evol 15: 1332-1336.

Strand M, Prolla TA, Liskay RM & Petes TD (1993) Destabilization of tracts of simple repetitive DNA in yeast by mutations affecting DNA mismatch repair. Nature 365: 274-276.

Strauss SH & Libby WJ (1987) Allozyme heterosis in radiata pine is poorly explained by overdominance. Amer Natur 130: 879-890.

Sweigart A, Karoly K, Jones A & Willis JH (1999) The distribution of individual inbreeding coefficients and pairwise relatedness in a population of Mimulus guttatus. Heredity 83: 625-632.

Takezaki N & Nei M (1996) Genetic distances and reconstruction of phylogenetic trees from microsatellite DNA. Genetics 144: 389-399.

Talbot CC, Avramopoulos D, Gerken S, Chakravarti A, Armour JA, Matsunami N, et al. (1995) The tetranucleotide repeat polymorphism D12S1245 demonstrates hypermutability in germline and somatic cells. Hum Mol Genet 4: 1193-1199.

Tautz D (1989) Hypervariability of simple sequences as a general source for polymorphic DNA markers. Nucleic Acids Res 17: 6463-6471.

Taylor JS, Durkin JM & Breden F (1999a) The death of a microsatellite: A phylogenetic perspective on microsatellite interruptions. Mol Biol Evol 16: 567-572.

Taylor JS, Sanny P & Breden F (1999b) Microsatellite allele size homoplasy in the guppy (Poecilia reticulata). J Mol Evol 48: 245-247.

Taylor AC, Sherwin WB & Wayne RK (1994) Genetic variation of microsatellite loci in a bottlenecked species: the northern hairy-nosed wombat Lasiorhinus krefftii. Mol Ecol 3: 277-290.

Thomas BR, Macdonald SE, Hicks M, Adams DL & Hodgetts RB (1999) Effects of reforestation methods on genetic diversity of lodgepole pine: an assessment using microsatellite and randomly amplified polymorphic DNA markers. Theor Appl Genet 98: 793-801.

van Treuren R, Kuittinen H, Kärkkäinen K, Baena-Gonzalez E & Savolainen O (1997) Evolution of microsatellites in Arabis petraea and Arabis lyrata, outcrossing relatives of Arabidopsis thaliana. Mol Biol Evol 14: 220-229.

Valdes AM, Slatkin M & Freimer NB (1993) Allele frequencies at microsatellite loci: the stepwise mutation model revisited. Genetics 133: 737-749.

Valsecchi E, Palsboll P, Hale P, et al. (1997) Microsatellite genetic distances between oceanic populations of the humpback whale (Megaptera novaeangliae). Mol Biol Evol 14: 355-362.

Vendramin GG, Lelli L, Rossi P & Morgante M (1996) A set of primers for the amplification of 20 chloroplast microsatellites in Pinaceae. Mol Ecol 5: 595-598.

Vendramin GG, Anzidei M, Madaghiele A, Sperisen C & Bucci G (2000) Chloroplast microsatellite analysis reveals the presence of population subdivision in Norway spruce (Picea abies K.). Genome 43: 68-78.

Viard F, Franck P, Dupois MP, Estoup A & Jarne P (1998) Variation of microsatellite size homoplasy across electromorphs, loci, and populations in three invertebrate species. J Mol Evol 47: 42-51.

Wagner DB, Furnier GR, Saghai-Maroof MA, Williams SM, Dancik BP & Allard RW (1987) Chloroplast DNA polymorphism in lodgepole and jack pines and their hybrids. Proc Natl Acad Sci USA 84: 2097-2100.

Wakasugi T, Tsudzuki J, Ito S, Nakasima K, Tsudzuki T & Sugiura M (1994) Loss of all ndh genes as determined by sequencing the entire chloroplast genome of the black pine Pinus thunbergii. Proc Natl Acad Sci USA 91: 9794-9798.

Wang X-R, Szmidt AE & Lindgren D (1991) Allozyme differentiation among populations of Pinus sylvestris (L.) from Sweden and China. Hereditas 114: 219-226.

Watterson GA (1978) The homozygosity test of neutrality. Genetics 88: 405-417.

Watterson GA (1984) Allele frequencies after a bottleneck. Theor Pop Biol 26: 387-407.

Watterson GA (1986) The homozygosity test after a change in population size. Genetics 112: 899-907.

Weber JL (1990) Informativeness of human (dC-dA)n.(dG-dT)n polymorphisms. Genomics 7: 524-530.

Weber JL & May PE (1989) Abundant class of human DNA polymorphisms which can be typed using the polymerase chain reaction. Am J Hum Genet 44: 388-396.

Weber JL & Wong C (1993) Mutation of human short tandem repeats. Hum Mol Genet 2: 1123-1128.

Weir BS (1996) Genetic Data Analysis II. Sinauer, Sunderland MA.

Weir BS & Cockerham CC (1984) Estimating F-statistics for the analysis of population structure. Evolution 38: 1358-1370.

Weissenbach J, Gyapay G, Dib C, Vignal A, Morissette J, Millasseau P, Vaysseix G & Lathrop M (1992) A second-generation linkage map of the human genome. Nature 359: 794-801.

Wierdl M, Dominska M & Petes TD (1997) Microsatellite instability in yeast: dependence on the length of the microsatellite. Genetics 146: 769-779.

Wilkie AOM & Higgs DR (1992) An unusually large (CA)n repeat in the region of divergence between subtelomeric alleles of human chromosome 16p. Genomics 13: 81-88.

Willis KJ, Bennett KD & Birks HJB (1998) The late Quaternary dynamics of pines in Europe. In: Richardson DM (ed.) Ecology and Biogeography of Pinus. Cambridge University Press, Cambridge, p 107-121.

Wu J, Krutovskii KV & Strauss SH (1999) Nuclear DNA diversity, population differentiation, and phylogenetic relationship in the Californian closed-cone pines based on RAPD and allozyme markers. Genome 42: 893-908.

Xu X, Peng M & Xu X (2000) The direction of microsatellite mutations is dependent upon allele length. Nature Genetics 24: 396-399.

Zhang L, Leeflang E-P, Yu J & Arnheim N (1994) Studying human mutations by sperm typing. Instability of CAG trinucleotide repeats in the human androgen receptor gene. Nature Genetics 7: 531-535.