in silico cis-analysis promoter analysis - promoters and cis-elements - searching for patterns -...
Post on 21-Dec-2015
229 views
TRANSCRIPT
In silico cis-analysispromoter analysis
- Promoters and cis-elements
- Searching for patterns
- Searching redundant patterns
What is a promoter?and why care about it?
CDS
QuickTime™ and aTIFF (LZW) decompressor
are needed to see this picture.
QuickTime™ and aTIFF (LZW) decompressor
are needed to see this picture.
A little about thesecis-elements
- Transcription Factors (TF) often bind to them
- They are normally from 5-10 bp long
- They are often placed in the region from TSS and upstream
- Often they come in clusters i.e. must be placed in some ‘syntax’ to be functional
- We assume, they are shared by the promoters in a regulon
- They are not always conserved 100%
Some tricksmakes it possible
Regulon (cluster)
Comparing promoters from a regulon to all other promotersi.e. using a negative set (all other promoters in species X)
This allows us to use hypergeometric statistics
Ranked list of genes
The distribution of sequences with a given pattern along a ranki.e. is the pattern overrepresented in promoters with low (or high)
p-values in a microarray experiment
This allows us to use Kolmogorov-Smirnov
Hypergeometric Statistic
10x5x
Takeout 6 balls
1x5x p=0.04
Kolmogorov-Smirnov, here test for deviations from a uniform distribution of patterns along a sorted list og promoters
Patte
rnO
ccurre
nce
Promoters ranked by e.g. p-value
So, the elements maynot always be 100% conserved
If we only had a weight matrice to describe our pattern we could find less conserved patterns
With a Gibbs sampler we can work backwards• We give it promoters and it builds a weight matrix
Motif model (residue frequency x 100):POS A C G T Info1 89 . . . 1.02 . . 92 . 2.03 . . . 94 1.24 . 92 . . 2.05 94 . . . 1.26 94 . . . 1.2
weight matrix
Gibbs sample evaluation
TTGACT
Gibbs sampling on 1000 random sampled sets of 17 Arabidopsis promoters, evaluated by information content.
monte carlo
GACTTTTC
Gibbs sample evaluation
Gibbs sampling on 1000 random sampled sets of 17 Arabidopsis promoters, evaluated by information content.
monte carlo
Here for 8bp patterns