polymorphism structure of the human genome
![Page 1: Polymorphism Structure of the Human Genome](https://reader036.vdocuments.us/reader036/viewer/2022062314/56812b08550346895d8eea47/html5/thumbnails/1.jpg)
Polymorphism Structure of the Human Genome
Gabor T. Marth
Department of Biology, Boston College, Chestnut Hill, MA 02467
![Page 2: Polymorphism Structure of the Human Genome](https://reader036.vdocuments.us/reader036/viewer/2022062314/56812b08550346895d8eea47/html5/thumbnails/2.jpg)
Human variation structure is heterogeneous
chromosomal averages
polymorphism density along chromosomes
![Page 3: Polymorphism Structure of the Human Genome](https://reader036.vdocuments.us/reader036/viewer/2022062314/56812b08550346895d8eea47/html5/thumbnails/3.jpg)
Heterogeneity at the level of distributions
[Figure: distribution of marker density, from “sparse” to “dense”; labels: 4 kb, 8 kb, 12 kb, 16 kb]
[Figure: distribution of allele frequency, from “rare” to “common”; x-axis: 1 to 10]
![Page 4: Polymorphism Structure of the Human Genome](https://reader036.vdocuments.us/reader036/viewer/2022062314/56812b08550346895d8eea47/html5/thumbnails/4.jpg)
What explains nucleotide diversity?
G+C nucleotide content
[Figure: SNP rate [per 10,000 bp] vs. G+C content [%]]
CpG di-nucleotide content
[Figure: SNP rate [per 10,000 bp] vs. CpG content [%]]
recombination rate
[Figure: SNP rate [per 10,000 bp] vs. recombination rate [per Mb]]
functional constraints
3’ UTR: 5.00 x 10^-4
5’ UTR: 4.95 x 10^-4
Exon, overall: 4.20 x 10^-4
Exon, coding: 3.77 x 10^-4
synonymous: 366 / 653
non-synonymous: 287 / 653
The variance is so high that these quantities are poor predictors of nucleotide diversity in local regions. Hence random processes are likely to govern the basic shape of the genome variation landscape: (random) genetic drift.
![Page 5: Polymorphism Structure of the Human Genome](https://reader036.vdocuments.us/reader036/viewer/2022062314/56812b08550346895d8eea47/html5/thumbnails/5.jpg)
Components of drift: Genealogy
present generation
In a randomly mating population, the genealogy evolves in a non-deterministic fashion
![Page 6: Polymorphism Structure of the Human Genome](https://reader036.vdocuments.us/reader036/viewer/2022062314/56812b08550346895d8eea47/html5/thumbnails/6.jpg)
Components of drift: Mutation
mutations randomly “drift”: they die out, go to higher frequency, or get fixed
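The fate of a single drifting mutation can be illustrated with a small simulation. The sketch below assumes a haploid Wright-Fisher population of constant size, a model the slide does not name; the function name `simulate_drift` and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def simulate_drift(pop_size, start_copies=1, max_generations=100_000, seed=None):
    """Follow one mutation in a haploid Wright-Fisher population (assumed model).

    Returns (fate, generations, trajectory), where fate is "lost", "fixed",
    or "segregating" if neither happened within max_generations.
    """
    rng = random.Random(seed)
    copies = start_copies
    trajectory = [copies]
    for generation in range(1, max_generations + 1):
        freq = copies / pop_size
        # Every offspring draws its parent at random, so the next copy number
        # is a binomial sample with success probability equal to the frequency.
        copies = sum(1 for _ in range(pop_size) if rng.random() < freq)
        trajectory.append(copies)
        if copies == 0:
            return "lost", generation, trajectory
        if copies == pop_size:
            return "fixed", generation, trajectory
    return "segregating", max_generations, trajectory


# Repeat the experiment to see how often a new mutation dies out or fixes.
fates = [simulate_drift(50, seed=s)[0] for s in range(200)]
print({fate: fates.count(fate) for fate in sorted(set(fates))})
```

Under these assumptions most new mutations are lost within a few generations, and only a small fraction reach fixation.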
![Page 7: Polymorphism Structure of the Human Genome](https://reader036.vdocuments.us/reader036/viewer/2022062314/56812b08550346895d8eea47/html5/thumbnails/7.jpg)
Modulators: Changing population size
mutations randomly “drift”: they die out, go to higher frequency, or get fixed
genetic bottleneck
![Page 8: Polymorphism Structure of the Human Genome](https://reader036.vdocuments.us/reader036/viewer/2022062314/56812b08550346895d8eea47/html5/thumbnails/8.jpg)
Modulators: Population subdivision
subdivision
subdivision promotes private polymorphisms, and skews allele frequency
![Page 9: Polymorphism Structure of the Human Genome](https://reader036.vdocuments.us/reader036/viewer/2022062314/56812b08550346895d8eea47/html5/thumbnails/9.jpg)
Modulators: Recombination
accgttatgcaga acagttatgtaga
acagttatgcaga
accgttatgtaga
accgttatgcaga acagttatgtaga
recombination
different nucleotide sites within the same DNA segment no longer share the same genealogy
![Page 10: Polymorphism Structure of the Human Genome](https://reader036.vdocuments.us/reader036/viewer/2022062314/56812b08550346895d8eea47/html5/thumbnails/10.jpg)
Modulators: Natural selection
negative (purifying) selection
positive selection
the genealogy is no longer independent of (and hence cannot be decoupled from) the mutation process
![Page 11: Polymorphism Structure of the Human Genome](https://reader036.vdocuments.us/reader036/viewer/2022062314/56812b08550346895d8eea47/html5/thumbnails/11.jpg)
Modeling ancestral processes
“forward simulations”
the “Coalescent” process
By focusing on a small sample, the complexity of the relevant part of the ancestral process is greatly reduced. There are, however, limitations.
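Why a small sample keeps the problem tractable can be sketched by simulating only the waiting times between coalescence events. The code assumes the standard neutral (Kingman) coalescent with constant population size, where k lineages coalesce at rate k(k-1)/2 in rescaled time units; the slide does not specify a model, and the function names are hypothetical.

```python
import random


def coalescent_times(sample_size, seed=None):
    """Waiting times between coalescence events for a sample of lineages.

    Assumes the standard neutral coalescent: while k lineages remain, the
    time to the next coalescence is exponential with rate k * (k - 1) / 2.
    """
    rng = random.Random(seed)
    times = []
    for k in range(sample_size, 1, -1):
        rate = k * (k - 1) / 2
        times.append(rng.expovariate(rate))
    return times


def time_to_mrca(sample_size, seed=None):
    """Total time back to the most recent common ancestor of the sample."""
    return sum(coalescent_times(sample_size, seed))


# Only sample_size - 1 events are simulated, regardless of population size.
print(coalescent_times(5, seed=0))
print(time_to_mrca(5, seed=0))
```

Only n - 1 events are needed for a sample of n sequences, whereas a forward simulation has to track every individual in every generation.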
![Page 12: Polymorphism Structure of the Human Genome](https://reader036.vdocuments.us/reader036/viewer/2022062314/56812b08550346895d8eea47/html5/thumbnails/12.jpg)
Inferences from variation data
larger population size (N) -> more mutations -> higher diversity (θ)
larger mutation rate (μ) -> more mutations -> higher diversity (θ)
higher diversity -> larger population size OR higher mutation rate (θ = 4Nμ)
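The ambiguity in the last line follows directly from θ = 4Nμ: diversity depends only on the product of N and μ. A minimal sketch, with hypothetical parameter values chosen purely for illustration:

```python
def theta(pop_size, mutation_rate):
    """Diversity parameter from the slide: theta = 4 * N * mu."""
    return 4 * pop_size * mutation_rate


# Two hypothetical scenarios: a smaller population with a higher mutation
# rate, and a larger population with a lower mutation rate.
scenario_a = theta(pop_size=10_000, mutation_rate=2e-8)
scenario_b = theta(pop_size=20_000, mutation_rate=1e-8)

print(scenario_a, scenario_b)
```

Both scenarios yield the same θ, so an observed diversity value alone cannot separate population size from mutation rate.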
![Page 13: Polymorphism Structure of the Human Genome](https://reader036.vdocuments.us/reader036/viewer/2022062314/56812b08550346895d8eea47/html5/thumbnails/13.jpg)
Ancestral inference: modeling
[Figure: population histories from past to present: stationary, expansion, collapse, bottleneck. For each history, two panels: AFS (direct form), x-axis 1 to 10; MD (simulation), x-axis 0 to 10]
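For the stationary history, the expected spectrum can be written down directly. The sketch below assumes that AFS denotes the allele frequency spectrum and uses the standard neutral, constant-size result that the expected share of segregating sites with derived allele count i is proportional to 1/i. The slides do not state this formula, and the other histories would need additional modeling not shown here. Function names are hypothetical.

```python
def expected_afs(sample_size):
    """Expected unfolded spectrum under a stationary neutral model (assumed).

    Entry i - 1 is the expected fraction of segregating sites where the
    derived allele appears i times in the sample, for i = 1 .. n - 1.
    """
    weights = [1 / i for i in range(1, sample_size)]
    total = sum(weights)
    return [w / total for w in weights]


def folded_afs(sample_size):
    """Fold the spectrum by minor allele count, for i = 1 .. n // 2."""
    unfolded = expected_afs(sample_size)
    folded = []
    for i in range(1, sample_size // 2 + 1):
        j = sample_size - i
        share = unfolded[i - 1]
        if j != i:
            # Counts i and n - i are indistinguishable without an outgroup.
            share += unfolded[j - 1]
        folded.append(share)
    return folded


print([round(x, 3) for x in expected_afs(11)])
print([round(x, 3) for x in folded_afs(20)])
```

Departures from this baseline, such as an excess of rare alleles, are the kind of signal that model fitting against alternative histories would try to explain.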
![Page 14: Polymorphism Structure of the Human Genome](https://reader036.vdocuments.us/reader036/viewer/2022062314/56812b08550346895d8eea47/html5/thumbnails/14.jpg)
Ancestral inference: model fitting
[Figure: model fit; x-axis: minor allele count, 1 to 10]
bottleneck
modest but uninterrupted expansion
![Page 15: Polymorphism Structure of the Human Genome](https://reader036.vdocuments.us/reader036/viewer/2022062314/56812b08550346895d8eea47/html5/thumbnails/15.jpg)
Allelic association
accgttatgcaga
acagttatgtaga
acagttatgcaga
accgttatgtaga
possible allele combinations (2-marker haplotypes)
higher recombination rate (r)
![Page 16: Polymorphism Structure of the Human Genome](https://reader036.vdocuments.us/reader036/viewer/2022062314/56812b08550346895d8eea47/html5/thumbnails/16.jpg)
Allelic association: LD
[Figure: r2 vs. recombination fraction and distance (kb); populations: European, Asian, African American]
measure of allelic association: “linkage disequilibrium (LD)”
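The r2 statistic on the plot axis can be computed from a set of phased haplotypes. The sketch below uses the usual definition r2 = D^2 / (pA(1 - pA) pB(1 - pB)) with D = pAB - pA pB, and the four 2-marker haplotypes from the allelic association slide (variable sites at 0-based positions 2 and 9). The sample compositions are hypothetical, and the function name `r_squared` is an assumption for illustration.

```python
def r_squared(haplotypes, site_a, site_b):
    """Pairwise LD (r^2) between two biallelic sites in phased haplotypes."""
    n = len(haplotypes)
    # Use the alleles carried by the first haplotype as the reference alleles.
    allele_a = haplotypes[0][site_a]
    allele_b = haplotypes[0][site_b]
    p_a = sum(h[site_a] == allele_a for h in haplotypes) / n
    p_b = sum(h[site_b] == allele_b for h in haplotypes) / n
    p_ab = sum(
        h[site_a] == allele_a and h[site_b] == allele_b for h in haplotypes
    ) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic in the sample")
    return d * d / denom


# Hypothetical sample with only two of the four haplotypes: complete LD.
no_recombinants = ["accgttatgcaga", "acagttatgtaga"] * 5
# Hypothetical sample with all four haplotypes at equal frequency: no LD.
all_four = ["accgttatgcaga", "acagttatgtaga", "acagttatgcaga", "accgttatgtaga"] * 5

print(r_squared(no_recombinants, 2, 9))
print(r_squared(all_four, 2, 9))
```

When only two of the four possible haplotypes are present, the alleles at one site predict the alleles at the other perfectly. Once recombination has produced all four combinations at equal frequency, the association disappears.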
![Page 17: Polymorphism Structure of the Human Genome](https://reader036.vdocuments.us/reader036/viewer/2022062314/56812b08550346895d8eea47/html5/thumbnails/17.jpg)
Haplotype structure
“haplotype block”