![Page 1: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/1.jpg)
Casey M. Bergman
Faculty of Life Sciences, University of Manchester
Inferring functional constraints on Drosophila noncoding DNA from patterns of sequence evolution.
![Page 2: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/2.jpg)
Outline of Talk
• Noncoding DNA, cis-regulatory annotation and Drosophila as a system
• Conserved noncoding sequences are selectively constrained.
• Spatial constraints on noncoding sequences
![Page 3: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/3.jpg)
Higher organisms have a higher proportion of noncoding DNA
Bacteria: 15%
Yeast: 30%
Worm: 70%
Fly: 75%
![Page 4: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/4.jpg)
The function of most noncoding DNA is unknown & unannotated
Bioinformatic & functional analysis of noncoding DNA ⇒
Genome organization
Transcriptional regulation
[Figure: gene map of the genomic region around eve; boxes mark exons. Gene labels: Mef2, CG15863, CG12130, CG1418, CG12133, Adam, CG12134, eve, TER94, Pka-R2, CG12128, BS 1360]
![Page 5: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/5.jpg)
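[Figure: gene map of the genomic region around eve, annotated with noncoding features: enhancers (labels as extracted: AR3/7, 2, APRCQ4/6, mes, 15RP2), an (A)n tract, and transposable elements. Gene labels: Mef2, CG15863, CG12130, CG1418, CG12133, Adam, CG12134, eve, TER94, Pka-R2, CG12128, BS 1360]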
Goal: comprehensive functional annotation of noncoding sequences in Drosophila
![Page 6: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/6.jpg)
Why is annotation of cis-regulatory sequences important?
• Better understand development
• Better understand mechanisms of transcription
• Provide material for forward genetics
• Provide material for evolutionary biology
• Generate data for systems biology
![Page 7: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/7.jpg)
Why Drosophila as a model system?
~120 Mb of euchromatin
~15,000 genes
75% noncoding
Compact, deletion bias
![Page 8: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/8.jpg)
“Pseudogenes” decay rapidly by deletion in Drosophila
Petrov and Hartl (1998) Mol. Biol. Evol. 15:293-302
![Page 9: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/9.jpg)
Genes with complex expression have longer intergenic regions in compact genomes
Nelson, Hersh & Carroll (2004) Genome Biology 5:R25
![Page 10: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/10.jpg)
Longer introns & intergenic regions have slower rates of sequence evolution in Drosophila
Halligan & Keightley (2006) Genome Research 16:875-884
![Page 11: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/11.jpg)
A wealth of comparative genomic data exists for the genus Drosophila
http://species.flybase.net
http://rana.lbl.gov/drosophila
![Page 12: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/12.jpg)
image from Pavel Tomancak (MPI-Dresden)
Thousands of candidate expression patterns: BDGP embryonic in situ database
http://www.fruitfly.org/cgi-bin/ex/insitu.pl
![Page 13: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/13.jpg)
Systematic annotation of cis-regulatory data in Drosophila: FlyReg & REDfly databases
Bergman et al. (2005) Bioinformatics 21:1747-1749
Gallo et al. (2006) Bioinformatics 22:381-383
[Figure: genome browser view of chr2R:5,485,000-5,500,000 with three tracks. FlyBase Protein-Coding Genes: CG12134, eve, TER94. FlyReg: Drosophila DNase I Footprint Database: footprints for kni, hb, Kr, bcd, gt, ttk, prd, eve, pan, Med, tin and zfh1, plus sites with an unspecified factor. Regulatory elements from ORegAnno: OREG0005967 through OREG0005985]
![Page 14: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/14.jpg)
ORegAnno: Open Regulatory Annotation
Montgomery et al. (2006) Bioinformatics 22:637-640
![Page 15: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/15.jpg)
cis-regulatory annotation & systems biology
Ashburner & Bergman (2005) Genome Research 15:1661-1667
[Figure: network diagram of regulatory interactions. Node labels: shn, Abd-A, fkh, ko, Dll, dpp, mus209, tsh, bcd, salm, Antp, dl, Ubx, zen, kni, ftz, eve, hb, tll, Kr, Trl, grh, cad, h, en, gt, ttk]
![Page 16: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/16.jpg)
Outline of Talk
• Noncoding DNA, cis-regulatory annotation and Drosophila as a system
• Conserved noncoding sequences are selectively constrained.
• Spatial constraints on noncoding sequences
![Page 17: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/17.jpg)
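Pattern of noncoding sequence evolution in Drosophila: the eve stripe 2 enhancer
[Figure: sequence comparison of the eve stripe 2 enhancer between mel and sim, yak, ere, tak, ana, pse; scale bar: 500 bp; one conserved block and one spacer are labeled]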
![Page 18: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/18.jpg)
Are conserved blocks functionally constrained or simply mutational cold spots?
Bergman & Kreitman (2001) Genome Research 11:1335-1345
Clark (2001) Genome Research 11:1319-1320
median: 19 bp
![Page 19: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/19.jpg)
Using population genetics to test the mutational cold-spot hypothesis
If blocks are functionally constrained we predict the following:
1. Excess of rare derived mutations in blocks relative to spacers
(Non-parametric test - blocks vs. spacers, frequency spectrum)
![Page 20: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/20.jpg)
Using population genetics to test the mutational cold-spot hypothesis
If blocks are functionally constrained we predict the following:
1. Excess of rare derived mutations in blocks relative to spacers
(Non-parametric test - blocks vs. spacers, frequency spectrum)
2. Excess of mutations in blocks relative to fixed differences
(“MK” test - blocks vs. spacers, polymorphism & divergence)
![Page 21: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/21.jpg)
Using population genetics to test the mutational cold-spot hypothesis
If blocks are functionally constrained we predict the following:
1. Excess of rare derived mutations in blocks relative to spacers
(Non-parametric test - blocks vs. spacers, frequency spectrum)
2. Excess of mutations in blocks relative to fixed differences
(“MK” test - blocks vs. spacers, polymorphism & divergence)
[Figure: expected frequency spectrum. x-axis: Derived Allele Frequency in bins 0-0.1, 0.1-0.2, ..., 0.9-1.0; y-axis: Fraction of SNPs; series shown: spacer]
![Page 22: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/22.jpg)
Using population genetics to test the mutational cold-spot hypothesis
If blocks are functionally constrained we predict the following:
1. Excess of rare derived mutations in blocks relative to spacers
(Non-parametric test - blocks vs. spacers, frequency spectrum)
2. Excess of mutations in blocks relative to fixed differences
(“MK” test - blocks vs. spacers, polymorphism & divergence)
[Figure: expected frequency spectrum. x-axis: Derived Allele Frequency in bins 0-0.1, 0.1-0.2, ..., 0.9-1.0; y-axis: Fraction of SNPs; series shown: block, spacer]
![Page 23: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/23.jpg)
Conserved blocks in humans are not mutational cold-spots
Drake et al. (2005) Nat. Genet. 38:223-7
[Figure: block vs. spacer comparison in three human population samples: Yoruba (African), American (European), Beijing+Tokyo (Asian)]
![Page 24: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/24.jpg)
Using population genetics to test the mutational cold-spot hypothesis
If blocks are functionally constrained we predict the following:
1. Excess of rare derived mutations in blocks relative to spacers
(Non-parametric test - blocks vs. spacers, frequency spectrum)
2. Excess of mutations in blocks relative to fixed differences
(“MK” test - blocks vs. spacers, polymorphism & divergence)
[Figure: schematic of a block, spacer, block arrangement with the expected profiles of divergence (div.) and polymorphism (π) along the sequence]
![Page 25: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/25.jpg)
Harvesting data from GenBank using PDA: a pipeline to study polymorphism
Casillas & Barbadilla (2004) Nucl. Acids Res. 32:W166-W169
[Figure: PDA pipeline flowchart with numbered steps (1a, 1b, 2 to 9). Recoverable labels: Get sequences & annotations (input from sequences from GenBank corresponding to the Drosophila genus); MySQL database; Group by species & gene; Sequences organized in categories (gene, CDS, exon, intron, 5' UTR, 3' UTR, promoter; minimum of 2 sequences per category); Muscle with MSA parameters; Alignment validation (alignments with scores, align quality values, low quality sequences excluded); Sequences subgroups; Alignments subgroups; Read gene annotations; Extract gene regions (sequences, positions and orientations); Diversity Analysis Module (polymorphism, syn & non-syn polymorphisms, linkage disequilibrium, codon bias); Web-based output (alignments in Jalview, diversity analysis output). Legend: seq. manipulations, external programs, output]
![Page 26: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/26.jpg)
Highly conserved noncoding sequences: the UCSC PhastCons track
[Figure: UCSC Genome Browser view of chr2R:5,485,000-5,500,000. Tracks: FlyBase Protein-Coding Genes (CG12134, eve, TER94); 12 Flies, Mosquito, Honeybee, Beetle Multiz Alignments & phastCons Scores (Conservation, with alignments to d_simulans, d_sechellia, d_yakuba, d_erecta, d_ananassae, d_pseudoobscura, d_persimilis, d_willistoni, d_virilis, d_mojavensis, d_grimshawi); PhastCons Conserved Elements (12 Flies, Mosquito, Honeybee, Beetle), each element labeled with its lod score, ranging from lod=10 to lod=1301 in this window]
![Page 27: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/27.jpg)
The data: alignments of ~12 D. melanogaster alleles with reference sequence and 1 D. simulans allele
High frequency derived spacer allele
Low frequency derived block alleles
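A minimal sketch of how a derived allele frequency could be obtained from one alignment column, assuming the single D. simulans allele is taken as the ancestral state. The function name and the example columns are hypothetical and are not taken from the talk or from the PDA pipeline.

```python
from collections import Counter


def derived_allele_frequency(ingroup, outgroup):
    """Return the derived allele frequency at one aligned site, or None.

    ingroup  -- string of bases, one per sampled D. melanogaster allele
    outgroup -- single base from the D. simulans allele (assumed ancestral)
    """
    bases = [b for b in ingroup.upper() if b in "ACGT"]
    counts = Counter(bases)
    outgroup = outgroup.upper()
    # Skip monomorphic sites, sites with more than two alleles, and sites
    # where the outgroup base matches neither segregating allele.
    if len(counts) != 2 or outgroup not in counts:
        return None
    derived = sum(n for base, n in counts.items() if base != outgroup)
    return derived / len(bases)


# Hypothetical columns: 12 ingroup alleles and 1 outgroup base each.
rare_derived = derived_allele_frequency("AAAAAAAAAAAT", "A")    # one derived T
common_derived = derived_allele_frequency("CCCCCCCCCCGG", "G")  # ten derived C
```

With a single outgroup sequence, a site where the outgroup itself has changed would be polarized wrongly; the sketch ignores that case.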
![Page 28: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/28.jpg)
Summary of the polymorphism data sets

| | Glinka (2003) + Ometto (2005), African | Glinka (2003) + Ometto (2005), European | Orengo (2004), European |
|---|---|---|---|
| Intronic | 167 | 173 | 28 |
| Intergenic | 90 | 93 | 80 |
| Total loci | 257 | 266 | 108 |
| # Alleles | 11.7 | 11.8 | 12.7 |
| bp block | 30,683 | 33,292 | 28,721 |
| bp spacer | 79,317 | 87,379 | 47,590 |
![Page 29: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/29.jpg)
Single nucleotide polymorphisms & fixed differences are reduced in conserved blocks
66% reduction in polymorphism
77% reduction in divergence
[Figure: bar chart of observed number (y-axis, 0 to 5,000) for blocks and spacers. Polymorphism: block 437, spacer 3334. Divergence: block 374, spacer 4854]
Casillas, Barbadilla & Bergman (2007) Mol. Biol. Evol. 24:2222-2234
![Page 30: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/30.jpg)
Conserved blocks have an excess of rare derived point mutations
KS test: p < 6 × 10⁻¹¹
[Figure: histogram of derived allele frequency (DAF; x-axis, bins up to 0.1, 0.2, ..., 1.0) against frequency (y-axis, 0 to 0.65) for blocks and spacers]
Casillas, Barbadilla & Bergman (2007) Mol. Biol. Evol. 24:2222-2234
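The KS comparison of block and spacer frequency spectra can be illustrated with a small sketch of the two-sample Kolmogorov-Smirnov statistic, the largest gap between the two empirical cumulative distributions. The DAF values below are hypothetical, and the helper is an illustration rather than the code behind the slide; it returns only the statistic, not a p-value.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: maximum distance between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The empirical CDFs only change at observed values, so checking
    # each distinct observation is sufficient.
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical derived allele frequencies (12 sampled alleles per site).
block_daf = [1 / 12] * 6 + [2 / 12] * 2 + [6 / 12, 11 / 12]
spacer_daf = [1 / 12] * 3 + [3 / 12] * 2 + [5 / 12, 7 / 12, 9 / 12, 10 / 12, 11 / 12]
d_block_vs_spacer = ks_statistic(block_daf, spacer_daf)
```

An excess of rare derived alleles in blocks shows up as a block CDF that rises faster at low frequencies than the spacer CDF.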
![Page 31: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/31.jpg)
Excess of polymorphism in conserved blocks relative to fixed differences between species
[Figure: bar chart of the polymorphism : divergence ratio (y-axis, 0 to 1.5) for blocks and spacers]
χ² test: p < 5 × 10⁻¹³

| | Block | Spacer |
|---|---|---|
| Poly. | 437 | 3334 |
| Div. | 374 | 4854 |

Casillas, Barbadilla & Bergman (2007) Mol. Biol. Evol. 24:2222-2234
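As a worked check, the 2×2 table of polymorphism and divergence counts on this slide can be run through a Pearson χ² test with 1 degree of freedom. This is a sketch that assumes no continuity correction (the slide does not say which variant was used); the function name is hypothetical. Under that assumption the statistic comes out near 52.5, with a p-value below the 5 × 10⁻¹³ bound quoted on the slide.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p), where p is the upper tail of a chi-square
    distribution with 1 degree of freedom: p = erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Counts from the slide: rows are polymorphism and divergence,
# columns are block and spacer.
poly_block, poly_spacer = 437, 3334
div_block, div_spacer = 374, 4854

chi2, p_value = chi_square_2x2(poly_block, poly_spacer, div_block, div_spacer)

# Polymorphism : divergence ratios, the quantity plotted in the bar chart.
ratio_block = poly_block / div_block
ratio_spacer = poly_spacer / div_spacer
```

The block ratio is above 1 and the spacer ratio below 1, which is the direction predicted if blocks are constrained rather than mutational cold spots.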
![Page 32: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/32.jpg)
Excess polymorphism is observed in both intergenic and intronic conserved blocks
[Figure: two bar charts (panels: Intergenic, Intronic) of the polymorphism : divergence ratio (y-axis, 0 to 1.5) for blocks and spacers]
χ² tests, one per panel: p < 5 × 10⁻⁵ and p < 3 × 10⁻⁹
Casillas, Barbadilla & Bergman (2007) Mol. Biol. Evol. 24:2222-2234
![Page 33: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/33.jpg)
Blocks and spacers are more constrained than 4-fold degenerate “silent” sites in genes.
[Figure: histogram of derived allele frequency (DAF; x-axis, bins up to 0.1, 0.2, ..., 1.0) against frequency (y-axis, 0 to 0.65) for blocks, spacers and 4-fold sites]
Block vs 4-fold: P = 2.48e−12
Spacer vs 4-fold: P = 0.00471
Casillas, Barbadilla & Bergman (2007) Mol. Biol. Evol. 24:2222-2234
![Page 34: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/34.jpg)
Conserved noncoding sequences in Drosophila are selectively constrained for point mutations
• Reduction in polymorphism and divergence in blocks
• Excess of rare alleles in blocks
• Excess of polymorphism relative to divergence in blocks
• Not due to use of spacers as inappropriate control sequences, differences in GC content, or alignment error
• Both intergenic and intronic blocks are constrained
![Page 35: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/35.jpg)
Outline of Talk
• Noncoding DNA, cis-regulatory annotation and Drosophila as a system
• Conserved noncoding sequences are selectively constrained.
• Spatial constraints on noncoding sequences
![Page 36: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/36.jpg)
Pattern of noncoding sequence evolution in Drosophila: the eve stripe 2 enhancer
[Figure: sequence comparison of the eve stripe 2 enhancer between mel and sim, yak, ere, tak, ana, pse; scale bar: 500 bp; annotation: |slope| ~ 1]
![Page 37: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/37.jpg)
Spacing between conserved noncoding sequences is maintained in divergent Drosophila species
[Figure: scatter plot of spacer interval length (log[bp]) in D. melanogaster (x-axis, 0.0 to 4.0) against spacer interval length (log[bp]) in D. pseudoobscura (y-axis, 0.0 to 4.0)]
r = 0.85, p < 10⁻⁶
Bergman et al. (2002) Genome Biology 3:0086.
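The correlation reported on this slide is between log-transformed spacer lengths in the two species. A minimal sketch of that calculation follows, assuming base-10 logs and using made-up spacer lengths for orthologous intervals; neither the data nor the helper name comes from the talk.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical spacer interval lengths (bp) for the same five intervals.
mel_spacers = [12, 45, 150, 800, 2300]   # D. melanogaster
pse_spacers = [15, 30, 210, 650, 3100]   # D. pseudoobscura

r_log = pearson_r(
    [math.log10(v) for v in mel_spacers],
    [math.log10(v) for v in pse_spacers],
)
```

Working on the log scale keeps a few kilobase-long spacers from dominating the correlation.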
![Page 38: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/38.jpg)
Indels are under constraint in mammalian noncoding DNA
Lunter et al. (2005) PLoS Comp. Biol 2:e5
[Figure: Log10(Frequency) (y-axis) against distance between indels (x-axis) for two sequence classes: Unique noncoding and Ancestral Repeat]
![Page 39: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/39.jpg)
Similar constraints on insertions and deletions in blocks and spacers
[Figure: two plots for blocks and spacers: a histogram of derived allele frequency (bins up to 0.1, 0.2, ..., 1.0) against frequency (y-axis, 0 to 0.55), and a bar chart of the polymorphism : divergence ratio (y-axis, 0 to 1.5)]
χ² test: p = 0.029
χ² test: p = 0.568
Casillas, Barbadilla & Bergman (2007) Mol. Biol. Evol. 24:2222-2234
![Page 40: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/40.jpg)
Different selective constraint on indels not due to low power or small sample size

| Data set | | Block | Spacer | χ² test |
|---|---|---|---|---|
| Indel Observed | Poly. | 66 | 380 | p = 0.029 |
| | Div. | 107 | 901 | |
| Point Mutation Observed | Poly. | 437 | 3334 | p < 5 × 10⁻¹³ |
| | Div. | 374 | 4854 | |
| Point Mutation Rescaled to Indel Observed | Poly. | 51 | 394 | p = 0.007 |
| | Div. | 72 | 935 | |
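The slide does not say how the point-mutation counts were rescaled. One reading that is arithmetically consistent with the numbers shown is to shrink each row (polymorphism, divergence) of the point-mutation table so that its total matches the corresponding indel row total, rounding down. The sketch below is that assumption only; the function name is hypothetical.

```python
def rescale_rows(table, target_row_totals):
    """Scale each row of `table` to a target row total, rounding down."""
    rescaled = []
    for row, target in zip(table, target_row_totals):
        total = sum(row)
        # Integer arithmetic avoids floating-point rounding at the boundary.
        rescaled.append([count * target // total for count in row])
    return rescaled


# Rows: polymorphism, divergence. Columns: block, spacer.
point_observed = [[437, 3334], [374, 4854]]
indel_observed = [[66, 380], [107, 901]]

indel_row_totals = [sum(row) for row in indel_observed]  # 446 and 1008
point_rescaled = rescale_rows(point_observed, indel_row_totals)
```

Because the rescaled point-mutation table has about the same sample size as the indel table and still gives a small p-value, the weaker indel result is unlikely to be a power problem.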
![Page 41: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/41.jpg)
A molecular interpretation of conservation in Drosophila noncoding regions
[Figure: schematic of a noncoding region. Legend: conserved noncoding sequence, spacer intervals, transcription factors]
![Page 42: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/42.jpg)
A molecular interpretation of conservation in Drosophila noncoding regions
[Figure: schematic of a noncoding region. Legend: conserved noncoding sequence, spacer intervals, transcription factors]
![Page 43: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/43.jpg)
A hierarchical model of spatial constraints on cis-regulatory regions
[Figure: nested diagram of a cis-regulatory region containing enhancers, which contain modules, which contain binding sites; length scales labeled ~5 bp, ~100 bp and ~kbp]
after Ondek et al. (1988) Nature 333:40-45
![Page 44: Inferring functional constraints on Drosophila noncoding ...bergmanlab.genetics.uga.edu/wp-content/uploads/...Conservation d_simulans d_sechellia d_yakuba d_erecta d_ananassae d_pseudoobscura](https://reader034.vdocuments.us/reader034/viewer/2022042811/5fa0cd9568833546e54d90c7/html5/thumbnails/44.jpg)
Acknowledgements
Marty Kreitman
Michael Ashburner
Sue Celniker, Gerry Rubin, Eddy Rubin
Sonia Casillas, Antonio Barbadilla
Stephen Montgomery, Obi Griffiths
Marc Halfon, Steve Gallo