TRANSCRIPT
Summary of 8th lesson
• Exotic microbes have a reduced level of genetic variability
• If genotypes fall on clades separated by long branches, it may be an indication there is no sex going on between individuals belonging to the two branches. Formal tests like the Index of association can test for that
• Anonymous multilocus analysis can be done without any knowledge of the genome using markers such as RAPDs or AFLPs
• Need to eliminate co-segregating markers and to use Jaccard’s similarity index
Dealing with dominant anonymous multilocus markers
• Need to use large numbers (linkage)
• Repeatability
• Graph distribution of distances
• Calculate distance using Jaccard’s similarity index
Jaccard’s
• Only 1-1 and 1-0 count, 0-0 do not count
Profile       Similarity   Distance (1 - similarity)
A: 1010011    AB = 0.6     0.4
B: 1001011    BC = 0.5     0.5
C: 1001000    AC = 0.2     0.8
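The A, B, C values above can be checked with a few lines of Python. This is a minimal sketch, assuming each profile is a string of 0/1 band scores of equal length; the function name `jaccard_similarity` is made up for illustration.

```python
from itertools import combinations

def jaccard_similarity(a, b):
    """Shared bands (1-1) divided by all positions where at least one band is present."""
    if len(a) != len(b):
        raise ValueError("profiles must score the same number of markers")
    shared = sum(1 for x, y in zip(a, b) if x == "1" and y == "1")
    mismatched = sum(1 for x, y in zip(a, b) if x != y)  # the 1-0 and 0-1 positions
    if shared + mismatched == 0:
        raise ValueError("no bands present in either profile")
    return shared / (shared + mismatched)  # 0-0 positions are ignored

# Profiles from the slide
profiles = {"A": "1010011", "B": "1001011", "C": "1001000"}
for (n1, p1), (n2, p2) in combinations(profiles.items(), 2):
    s = jaccard_similarity(p1, p2)
    print(f"{n1}{n2}: similarity = {s:.1f}, distance = {1 - s:.1f}")
```

Because 0-0 matches are left out, two isolates that simply lack the same bands are not scored as similar, which suits dominant markers where absence is less informative.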
Now that we have distances….
• Plot their distribution (clonal vs. sexual)
• Analysis:
– Similarity (cluster analysis); a variety of algorithms. Most common are NJ and UPGMA
– AMOVA; requires a priori grouping
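As an illustration of cluster analysis, here is a bare-bones UPGMA sketch run on the A, B, C Jaccard distances from the earlier example (AB = 0.4, BC = 0.5, AC = 0.8). It is a teaching sketch, not a replacement for a phylogenetics package; the function name `upgma` and the nested-tuple tree format are assumptions made for this example.

```python
def upgma(labels, dist):
    """Repeatedly merge the two closest clusters, averaging distances by cluster size."""
    clusters = {i: (labels[i], 1) for i in range(len(labels))}  # id -> (subtree, size)
    # Pairwise distances keyed as (larger id, smaller id)
    d = {(i, j): dist[i][j] for i in range(len(labels)) for j in range(i)}
    next_id = len(labels)
    merges = []
    while len(clusters) > 1:
        (i, j), dij = min(d.items(), key=lambda kv: kv[1])
        (ti, ni), (tj, nj) = clusters.pop(i), clusters.pop(j)
        new_d = {}
        for k in clusters:
            dik = d[(max(i, k), min(i, k))]
            djk = d[(max(j, k), min(j, k))]
            # Size-weighted average = mean over all original pairs
            new_d[(next_id, k)] = (ni * dik + nj * djk) / (ni + nj)
        d = {pair: v for pair, v in d.items() if i not in pair and j not in pair}
        d.update(new_d)
        clusters[next_id] = ((tj, ti), ni + nj)
        merges.append(((tj, ti), dij))
        next_id += 1
    tree = next(iter(clusters.values()))[0]
    return tree, merges

labels = ["A", "B", "C"]
dist = [[0.0, 0.4, 0.8],
        [0.4, 0.0, 0.5],
        [0.8, 0.5, 0.0]]
tree, merges = upgma(labels, dist)
for subtree, distance in merges:
    print(subtree, round(distance, 3))
```

A and B join first at distance 0.4; C then joins the AB cluster at the average of AC and BC.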
AMOVA groupings
• Individuals within population
• Among populations
• Among regions
AMOVA: partitions molecular variance amongst a priori defined groupings
Results: Jaccard similarity coefficients
(Histograms: frequency of Jaccard similarity coefficients, 0.90 to 1.00, for P. nemorosa and for P. pseudosyringae: U.S. and E.U.)
(Histogram: frequency by Jaccard coefficient of similarity, 0.90 to 0.99, for Pp U.S. and Pp E.U.)
P. pseudosyringae genetic similarity patterns are different in U.S. and E.U.
Results: P. nemorosa
(Tree of isolates, scale bar 0.1; clades labelled P. nemorosa, P. ilicis, P. pseudosyringae)
Results: P. pseudosyringae
(Tree of isolates, scale bar 0.1; clades labelled P. nemorosa, P. ilicis, P. pseudosyringae; E.U. isolates are marked)
Have we sampled enough?
• Resampling approaches
• Saturation curves
– A total of 30 polymorphic alleles
– Our sample is either 10 or 20
– Calculate whether each new sample is characterized by new alleles
Saturation curves
(Plot: number of new alleles against samples 1 to 20)
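A saturation curve can be tabulated by counting, for each added sample, how many alleles have not been seen before. A minimal sketch follows; the allele sets are invented for illustration and `new_alleles_per_sample` is a hypothetical helper name.

```python
def new_alleles_per_sample(samples):
    """For each sample in order, count the alleles not seen in any earlier sample."""
    seen = set()
    new_counts = []
    for alleles in samples:
        fresh = set(alleles) - seen
        new_counts.append(len(fresh))
        seen |= fresh
    return new_counts

# Made-up example: each sample is the set of allele IDs it carries
samples = [{1, 2, 3}, {2, 3, 4}, {1, 4}, {5}, {1, 2}]
print(new_alleles_per_sample(samples))
```

When the counts settle at zero, additional samples are no longer revealing new alleles. Since the order of samples affects the curve, it is reasonable to repeat the count over several random orderings (a resampling approach).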
If we have codominant markers, how many do we need?
• IDENTITY tests = probability calculation based on allele frequency: multiply the frequencies of the alleles
• 10 alleles at locus 1 (if equally frequent): P1 = 0.1
• 5 alleles at locus 2 (if equally frequent): P2 = 0.2
• Total P = P1*P2 = 0.02
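The multiplication can be written out as a short sketch. It assumes the loci are independent and, as in the slide's numbers, that alleles at a locus are equally frequent; both function names are hypothetical.

```python
from math import prod

def equal_frequency(n_alleles):
    """Frequency of any one allele if all alleles at the locus are equally common."""
    return 1 / n_alleles

def identity_probability(allele_freqs):
    """Chance that two unrelated individuals share the same allele at every locus,
    assuming independent loci: the product of the allele frequencies."""
    return prod(allele_freqs)

p = identity_probability([equal_frequency(10), equal_frequency(5)])
print(round(p, 4))
```

Each added locus multiplies the probability by a number below 1, so more loci (or loci with more alleles) make a chance match less likely.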
White mangroves: Coriolopsis caperata
Distances between study sites
            Coco Solo   Mananti   Ponsok   David
Coco Solo   0
Mananti     237         0
Ponsok      273         60        0
David       307         89        113      0
Coriolopsis caperata on Laguncularia racemosa
Forest fragmentation can lead to loss of gene flow among previously contiguous populations. The negative repercussions of such genetic isolation should most severely affect highly specialized organisms such as some plant-parasitic fungi.
AFLP study on single spores
Site        # of isolates   # of loci   % fixed alleles
Coco Solo   11              113         2.6
David       14              104         3.7
Bocas       18              92          15.04
Distances = PhiST between pairs of populations. Above diagonal is the probability that random distance > observed distance (1000 iterations).
            Coco Solo   Bocas    David
Coco Solo   0.000       0.000    0.000
Bocas       0.2083      0.000    0.000
David       0.1109      0.2533   0.000
Using DNA sequences
• Obtain sequence
• Align sequences, number of parsimony informative sites
• Gap handling
• Picking sequences (order)
• Analyze sequences (similarity/parsimony/exhaustive/Bayesian)
• Analyze output: CI, HI, bootstrap/decay indices
Using DNA sequences
• Testing alternative trees: Kishino-Hasegawa test
• Molecular clock
• Outgroup
• Spatial correlation (Mantel)
• Networks and coalescence approaches
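For the Mantel test, a small permutation sketch in plain Python is shown here. The geographic matrix reuses the study-site distances given earlier (Coco Solo, Mananti, Ponsok, David); the genetic matrix is invented purely for illustration, and the function names are hypothetical. It correlates the two distance matrices, then shuffles the site order of one matrix to see how often a correlation at least as large arises by chance.

```python
import random
from math import sqrt

def lower_triangle(m, order=None):
    """Flatten the below-diagonal entries, optionally after reordering rows/columns."""
    n = len(m)
    idx = order if order is not None else list(range(n))
    return [m[idx[i]][idx[j]] for i in range(1, n) for j in range(i)]

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mantel(geo, gen, permutations=999, seed=0):
    """Return (observed r, one-sided permutation p-value)."""
    rng = random.Random(seed)
    x = lower_triangle(geo)
    r_obs = pearson(x, lower_triangle(gen))
    n = len(gen)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)  # relabel the sites in the genetic matrix
        if pearson(x, lower_triangle(gen, order)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)

geo = [[0, 237, 273, 307],
       [237, 0, 60, 89],
       [273, 60, 0, 113],
       [307, 89, 113, 0]]
# Hypothetical genetic distances, NOT data from the lecture
gen = [[0.00, 0.21, 0.25, 0.30],
       [0.21, 0.00, 0.05, 0.09],
       [0.25, 0.05, 0.00, 0.11],
       [0.30, 0.09, 0.11, 0.00]]
r, p = mantel(geo, gen)
print(round(r, 3), round(p, 3))
```

With only four sites there are just 24 possible orderings, so the p-value cannot get very small; real studies need more sites.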
Good chromatogram!
Bad chromatogram…
Pull-up (too much signal)
Loss of fidelity leads to slips, skips and mixed signals
Reverse reaction suffers same problems in opposite direction
Alignments (Se-Al)
Distance vs. parsimony
• Distance simply counts at how many positions sequences are similar or different
– (Matteo) ACGTAACGTT-AG
– (Amanda) AGTTAACGTTAAG
– (Patrick) ACTTAACGTTAAG
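The distance count for the three aligned sequences can be reproduced with a short sketch. Whether a gap counts as a difference is a choice (see gap handling above), so it is exposed as a parameter; the function name `count_differences` is made up for this example.

```python
from itertools import combinations

def count_differences(s1, s2, count_gaps=True):
    """Number of aligned positions at which two sequences differ."""
    if len(s1) != len(s2):
        raise ValueError("sequences must be aligned to the same length")
    diffs = 0
    for a, b in zip(s1, s2):
        if a == b:
            continue
        if not count_gaps and "-" in (a, b):
            continue  # skip positions where either sequence has a gap
        diffs += 1
    return diffs

aligned = {
    "Matteo": "ACGTAACGTT-AG",
    "Amanda": "AGTTAACGTTAAG",
    "Patrick": "ACTTAACGTTAAG",
}
for (n1, s1), (n2, s2) in combinations(aligned.items(), 2):
    print(n1, n2, count_differences(s1, s2), count_differences(s1, s2, count_gaps=False))
```

Patrick differs from each of the other two at fewer positions than they differ from each other, so a distance method places Patrick between Matteo and Amanda.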
Distance vs. Parsimony
(Tree diagrams: Patrick shown between Amanda and Matteo; a second tree lists Matteo, Patrick, Amanda)
OUTGROUP can allow us to pick Matteo as ancestral
Confidence
• Bootstrap (resampling approach)
• Decay indices (threshold approach)
• Consistency index
• Homoplasy index
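The consistency and homoplasy indices follow from two step counts, as in this small sketch: CI is the minimum number of character changes the data could require divided by the number of changes observed on the tree, and HI is 1 - CI. The step counts below are invented for illustration and the function names are hypothetical.

```python
def consistency_index(min_steps, observed_steps):
    """CI = minimum possible steps / steps actually required on the tree."""
    if min_steps <= 0 or observed_steps < min_steps:
        raise ValueError("need 0 < min_steps <= observed_steps")
    return min_steps / observed_steps

def homoplasy_index(min_steps, observed_steps):
    """HI = 1 - CI: the share of steps attributable to homoplasy."""
    return 1 - consistency_index(min_steps, observed_steps)

# Made-up example: the data need at least 10 changes, the tree requires 12
print(round(consistency_index(10, 12), 3), round(homoplasy_index(10, 12), 3))
```

A CI of 1 means no homoplasy: every character changes only as often as it must.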
(Map labels: Pacifico, Caribe)
The “scale” of disease
• Dispersal gradients depend on propagule size, resilience, and ability to desiccate. NOTE: not linear
• Important interaction with environment, habitat, and niche availability. Examples: Heterobasidion in the Western Alps; Matsutake mushrooms, which offer an example of habitat tracking
• Scale of dispersal (implicitly correlated to metapopulation structure)
From Garbelotto and Chapela, Evolution and biogeography of matsutakes
Biodiversity within species as significant as between species
Other important types of markers (co-dominant)
• Restriction Fragment Length Polymorphisms (RFLP) of a locus
• Single Nucleotide Polymorphisms (SNPs)
• Microsatellites (SSR)
Restriction Fragment Length Polymorphisms (RFLP) of a locus
aacccacgtcaataaaaa
aacccac +
gtcaataaaaa
One restriction site
aacccaggtcaataaaaa
aacccaggtcaataaaaa
No restriction sites
Restriction Fragment Length Polymorphisms (RFLP) of a locus
Two alternate alleles = codominant marker
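The two RFLP alleles can be mimicked with a toy digest. The recognition site `cacgtc` and the cut position used here are assumptions chosen only so that the output matches the fragments shown on the slide; they are not taken from a real enzyme.

```python
def digest(seq, site, cut_offset):
    """Cut seq at every occurrence of site, cut_offset bases into the site."""
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments

SITE = "cacgtc"   # hypothetical recognition site
CUT_OFFSET = 3    # hypothetical: cut after the third base of the site

print(digest("aacccacgtcaataaaaa", SITE, CUT_OFFSET))  # allele with the site
print(digest("aacccaggtcaataaaaa", SITE, CUT_OFFSET))  # allele without the site
```

One allele yields two fragments and the other stays uncut, so a heterozygote would show both patterns on a gel, which is why the marker is codominant.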
Single Nucleotide Polymorphisms (SNPs)
• nnACGTnnnnnnTAAGnnnnnn
• nnAGGTnnnnnnTATGnnnnnn
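The SNPs in the two example sequences can be located by comparing them position by position. A minimal sketch, assuming the sequences are already aligned and of equal length; `find_snps` is a hypothetical name.

```python
def find_snps(seq_a, seq_b):
    """Return (1-based position, base in a, base in b) for every differing position."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]

print(find_snps("nnACGTnnnnnnTAAGnnnnnn", "nnAGGTnnnnnnTATGnnnnnn"))
```

The two sequences differ at positions 4 (C vs. G) and 15 (A vs. T).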
Dept. of Energy / Joint Genome Institute (www.JGI.gov)
Shotgun sequencing
Completed May 2004 - 7x coverage
66 MB... much larger than first calculated
Used 445,030 reads (FASTA format) - 3x coverage
FASTA format
1. SSR detection
* batch search
* di and tri-nucleotide repeats
* differ in repeat length
CCGAAATCGGACCTTGAGTGCGGAGAGAGAGAGAGACTGTACGAGCCCGAGTCTCGCAT
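A batch search for di- and tri-nucleotide repeats can be sketched with a regular expression. This is only an illustration of the idea, not the tool used for the genome search; the minimum of 6 repeats is an assumption, and `find_ssrs` is a hypothetical name.

```python
import re

def find_ssrs(seq, min_repeats=6):
    """Return (motif, repeat count, 0-based start) for di- and tri-nucleotide runs."""
    seq = seq.upper()
    hits = []
    for unit_len in (2, 3):
        # A motif of unit_len bases followed by at least min_repeats - 1 more copies
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if len(set(motif)) == 1:
                continue  # skip single-base runs such as AAAAAA...
            hits.append((motif, len(m.group(0)) // unit_len, m.start()))
    return hits

read = "CCGAAATCGGACCTTGAGTGCGGAGAGAGAGAGAGACTGTACGAGCCCGAGTCTCGCAT"
for motif, count, start in find_ssrs(read):
    print(motif, count, start)
```

Run on the read above, the search reports the GA run in the middle of the sequence. The same function can be looped over every read in a FASTA file.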
                       repeat
micro_gt_0135.seq0     gt(6)
micro_gt_0135.seq01    gt(6)
micro_ac_0462.seq012   ac(6)
micro_ac_0312.seq      ac(7)
micro_gt_0067.seq      gt(6)
micro_gt_0594.seq      gt(7)
micro_ag_0145.seq      ag(8)
micro_ac_0689.seq      ac(6)
micro_gt_0382.seq012   gt(7)
micro_gt_0396.seq0     gt(7)
micro_ag_0316.seq      ag(12)
micro_ct_0079.seq      ct(6)
micro_ct_0639.seq0     ct(6)
micro_ac_0478.seq012   ac(7)
micro_ctg_0109.seq     ctg(6)
micro_ag_0305.seq      ag(7)
micro_cg_0053.seq      cg(6)
micro_ct_0541.seq      ct(6)
Tm   length      forw sequence              rev sequence
60   100      0  AGCGCGTGACACACACTAAG    0  CACAACACACGCGCTCTATC
60   100      0  AGCGCGTGACACACACTAAG    0  CACAACACACGCGCTCTATC
61   101      0  ACCCACCACCACTCTTCACC    0  TATGATGGGGTGGGTGATTG
60   101      0  GAACAGCCTCGTTCAAGAGC    0  GGGGTGATTTTACTGGCTGC
60   101      0  GAAATGGGGCCAGCTCTAAC    0  CGGGGTCAAATTTACGAATG
60   101      0  GGAGTACGCGGACGAAGC      0  ACACCACGACACACAACACC
60   103      0  TTCTACACGCCTGCCCTTAC    0  TTATCCAGCCCTTCGTCATC
60   104      0  CCATCCTCTCTCTTCTCTCGC   0  ACAATATGCCCTCCCTCCTC
60   104      0  TGGTATGTTGTGTGTGTGCG    0  ACCACATCCGCGTAGAGAAC
60   104      0  CTGTTGCTCGTGTTTGATGG    0  AAAGCACGCCAAAACCATAC
60   106      0  CCAGCTCTCTCTGCTGCAC     0  ATTGATTGGGGCAAACAGTG
60   106      0  AGCCTCTCCCAGAGCATACC    0  ACAGAGCCATCGACTTGACC
60   107      0  CTTGCTCTTCTCCTTCCGTG    0  GAGATGAGGAGTCGACGAGC
60   110      0  CAAAACTCCACTCATCCCCC    0  TTTGTTGGTTGTTGTGGTGG
60   110      0  TTGAGCTCGTGAGCCTTCTC    0  CAACAACCCAGACAACCACC
60   112      0  GAAGAGGGGGAAAGAGGGAG    0  CGCGTGCTTTTCTTCTTCTC
60   112      0  ACGAAGGCTGTTCTGGACAC    0  GTTTGCTGCTGCTCTCCAAG
60   112      0  TTCCTCCCACTCTGTCGTTC    0  CACAGAAGCGGAAGAGAAGC
locus ct 0070
Microsatellites (SSR)
• Supposed to be neutral
• Stepwise mutation model
• Very sensitive because loci are prone to mutation
• An allele is a fragment of DNA that includes the flanking regions of the microsatellite and then a certain number of tandem repeats (variation in size should be in multiples of the SSR repeat unit)
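The "multiples of the repeat unit" rule gives a quick sanity check on fragment sizes. The sketch below converts allele sizes into repeat counts, assuming the combined flanking length is known; the sizes and flank length are invented for illustration and `repeat_counts` is a hypothetical name.

```python
def repeat_counts(allele_sizes, flank_length, unit_length):
    """Convert fragment sizes (bp) to numbers of tandem repeats.

    Raises ValueError if a size is not flank + a whole number of repeat units,
    which would suggest a sizing error or an indel in the flanking region.
    """
    counts = []
    for size in allele_sizes:
        repeat_bases = size - flank_length
        if repeat_bases < 0 or repeat_bases % unit_length != 0:
            raise ValueError(f"allele of {size} bp does not fit the repeat unit")
        counts.append(repeat_bases // unit_length)
    return counts

# Made-up dinucleotide locus with 88 bp of flanking sequence
print(repeat_counts([100, 104, 110], flank_length=88, unit_length=2))
```

Under the stepwise mutation model, alleles that differ by fewer repeats are taken to be more closely related, so the repeat counts, not just the raw sizes, carry the signal.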