XPRIME: A Novel Motif Searching Method
DESCRIPTION
Presentation prepared for the WNAR conference held at Portland State University in 2009
TRANSCRIPT
![Page 1: XPRIME: A Novel Motif Searching Method](https://reader033.vdocuments.us/reader033/viewer/2022042700/55961fe21a28ab7b0e8b4899/html5/thumbnails/1.jpg)
XPRIME: A Novel Motif Searching Method
Rachel L. Poulsen
Department of Statistics
Brigham Young University
June 15, 2009
![Page 2: XPRIME: A Novel Motif Searching Method](https://reader033.vdocuments.us/reader033/viewer/2022042700/55961fe21a28ab7b0e8b4899/html5/thumbnails/2.jpg)
Introduction
DNA contains the genetic instructions that uniquely define an organism
RNA is created to carry genetic instructions from the DNA to the rest of the cell
The process of DNA “talking” to the rest of the cell is called transcription
![Page 4: XPRIME: A Novel Motif Searching Method](https://reader033.vdocuments.us/reader033/viewer/2022042700/55961fe21a28ab7b0e8b4899/html5/thumbnails/4.jpg)
Transcription
DNA
RNA
![Page 7: XPRIME: A Novel Motif Searching Method](https://reader033.vdocuments.us/reader033/viewer/2022042700/55961fe21a28ab7b0e8b4899/html5/thumbnails/7.jpg)
Position Weight Matrix (PWM) (Hertz et al 1990)
ETS1 TF binding motif
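| Position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| A | 0.067 | 0.333 | 0.0 | 0.0 | 1.0 | 0.533 | 0.267 | 0.067 |
| C | 0.933 | 0.600 | 0.0 | 0.0 | 0.0 | 0.133 | 0.067 | 0.400 |
| G | 0.000 | 0.000 | 1.0 | 1.0 | 0.0 | 0.000 | 0.667 | 0.000 |
| T | 0.000 | 0.067 | 0.0 | 0.0 | 0.0 | 0.333 | 0.000 | 0.533 |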
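As a minimal sketch, the ETS1 matrix can be held as one row of probabilities per base and one column per motif position. The names `ETS1_PWM`, `column_sums`, and `consensus` are illustrative and do not come from the presentation. The sketch checks that each column is (up to rounding) a probability distribution and reads off the most likely base at each position.

```python
# ETS1 PWM as shown on the slide: keys are bases, list index is motif position (0-based).
ETS1_PWM = {
    "A": [0.067, 0.333, 0.0, 0.0, 1.0, 0.533, 0.267, 0.067],
    "C": [0.933, 0.600, 0.0, 0.0, 0.0, 0.133, 0.067, 0.400],
    "G": [0.000, 0.000, 1.0, 1.0, 0.0, 0.000, 0.667, 0.000],
    "T": [0.000, 0.067, 0.0, 0.0, 0.0, 0.333, 0.000, 0.533],
}


def column_sums(pwm):
    """Sum the four base probabilities at each motif position."""
    width = len(pwm["A"])
    return [sum(pwm[base][j] for base in "ACGT") for j in range(width)]


def consensus(pwm):
    """Pick the highest-probability base at each motif position."""
    width = len(pwm["A"])
    return "".join(max("ACGT", key=lambda base: pwm[base][j]) for j in range(width))


print(column_sums(ETS1_PWM))
print(consensus(ETS1_PWM))
```

The consensus contains the GGAA core in positions 3 to 6, which is where the matrix puts nearly all of its weight.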
![Page 9: XPRIME: A Novel Motif Searching Method](https://reader033.vdocuments.us/reader033/viewer/2022042700/55961fe21a28ab7b0e8b4899/html5/thumbnails/9.jpg)
Sequence Logos
Figure: DNA binding motif for the ETS1 TF
![Page 10: XPRIME: A Novel Motif Searching Method](https://reader033.vdocuments.us/reader033/viewer/2022042700/55961fe21a28ab7b0e8b4899/html5/thumbnails/10.jpg)
De Novo motif searching
Regular expression enumeration
1 Actual count vs. expected count
2 Dictionary-based sequence model (Bussemaker et al. 2000)
PWM updating
1 MEME (Bailey et al 1995)
2 Gibbs Motif Sampler (GMS) (Lawrence et al 1993)
3 BioProspector (Liu et al 2001)
4 AlignACE (Roth et al 1998)
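The "actual count vs. expected count" idea can be sketched as follows. This is an illustration only: it assumes an independent-base background whose frequencies are estimated from the sequence itself, which is one simple choice and not necessarily what any of the cited methods use. The function names and the toy sequence are hypothetical.

```python
from collections import Counter


def kmer_counts(seq, w):
    """Count every overlapping w-mer in seq."""
    return Counter(seq[i:i + w] for i in range(len(seq) - w + 1))


def expected_count(kmer, seq):
    """Expected occurrences of kmer if bases were drawn independently
    with the base frequencies observed in seq (an assumed background)."""
    n = len(seq)
    freqs = {base: seq.count(base) / n for base in "ACGT"}
    prob = 1.0
    for base in kmer:
        prob *= freqs[base]
    return prob * (n - len(kmer) + 1)


seq = "ACGGAAGTTTCGGAAGTACCGGAAGTGA"  # toy sequence, not real data
observed = kmer_counts(seq, 4)
for kmer, count in observed.most_common(3):
    print(kmer, count, round(expected_count(kmer, seq), 3))
```

A w-mer whose observed count is far above its expected count is a candidate motif; how "far" is judged is left open here.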
![Page 15: XPRIME: A Novel Motif Searching Method](https://reader033.vdocuments.us/reader033/viewer/2022042700/55961fe21a28ab7b0e8b4899/html5/thumbnails/15.jpg)
Known Motif Search
1 GREP
2 Database search with scoring function (Hertz et al 1990)
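A GREP-style search for a known motif amounts to matching a regular expression against the sequence. The sketch below uses Python's `re` module; the pattern `GGA[AT]` and the sequence are made-up examples, and `find_sites` is a hypothetical helper name.

```python
import re


def find_sites(pattern, seq):
    """Return (start, match) for every occurrence of pattern in seq.
    The lookahead wrapper lets overlapping occurrences be reported too."""
    return [(m.start(), m.group(1)) for m in re.finditer(f"(?=({pattern}))", seq)]


print(find_sites("GGA[AT]", "ACGGAAGTGGATC"))
```

A regular expression either matches or does not, so it cannot rank near-matches; that is the gap a scoring function over a PWM fills.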
![Page 16: XPRIME: A Novel Motif Searching Method](https://reader033.vdocuments.us/reader033/viewer/2022042700/55961fe21a28ab7b0e8b4899/html5/thumbnails/16.jpg)
XPRIME: An Improved Method
TRANSFAC (Matys et al 2003)
Information pulled from in vitro experiments and literature
Most methods justify results using TRANSFAC
XPRIME incorporates prior information
XPRIME can search for both de novo motifs and known motifs simultaneously
![Page 20: XPRIME: A Novel Motif Searching Method](https://reader033.vdocuments.us/reader033/viewer/2022042700/55961fe21a28ab7b0e8b4899/html5/thumbnails/20.jpg)
Notation and Data
Indices
w: width of motif
L: length of sequence
m: motif indicator
i: position in sequence
j: position in motif
s: indicates sequence
The data, $z_s$
$$z_s = (y_{is}, \Delta_{1i}, \Delta_{2i}, \ldots, \Delta_{(m+1)i})$$
$y_i$ represents the position (w-mer)
$\Delta_{mi}$ indicates if $y_i$ belongs to motif $m$ or not
$\Delta_{(m+1)i}$ indicates if $y_i$ belongs to the background motif or not
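One way to picture the data is sketched below. It assumes that the w-mers are the overlapping windows starting at each position i, and that the indicators form a one-hot vector of length m+1 with the background in the last slot. Both are assumptions made for illustration, and `wmers` and `make_record` are hypothetical names.

```python
def wmers(seq, w):
    """Return the L - w + 1 overlapping w-mers of a sequence of length L."""
    return [seq[i:i + w] for i in range(len(seq) - w + 1)]


def make_record(y, component, m):
    """Pair a w-mer with a one-hot indicator vector of length m + 1.
    component is 0-based; index m stands for the background."""
    delta = [0] * (m + 1)
    delta[component] = 1
    return (y, delta)


ys = wmers("ACGGAAGT", 4)
print(ys)
print(make_record(ys[2], 0, 2))  # third w-mer assigned to the first motif
print(make_record(ys[0], 2, 2))  # first w-mer assigned to the background
```

In practice the indicators are unobserved, which is why the sampler has to draw them.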
![Page 24: XPRIME: A Novel Motif Searching Method](https://reader033.vdocuments.us/reader033/viewer/2022042700/55961fe21a28ab7b0e8b4899/html5/thumbnails/24.jpg)
The Scoring Function
$$\text{MotifScore} = f(y) = \prod_{j=1}^{w} \sum_{i \in \{A,C,G,T\}} p_{ij} \, I(y_j = i)$$
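The indicator in the scoring function simply selects, at each motif position, the PWM entry for the base actually observed, so the score is a product of w probabilities. A minimal sketch using the ETS1 matrix from the earlier slide (the dictionary layout and the name `motif_score` are illustrative choices):

```python
# ETS1 PWM from the slide: keys are bases, list index is motif position (0-based).
ETS1_PWM = {
    "A": [0.067, 0.333, 0.0, 0.0, 1.0, 0.533, 0.267, 0.067],
    "C": [0.933, 0.600, 0.0, 0.0, 0.0, 0.133, 0.067, 0.400],
    "G": [0.000, 0.000, 1.0, 1.0, 0.0, 0.000, 0.667, 0.000],
    "T": [0.000, 0.067, 0.0, 0.0, 0.0, 0.333, 0.000, 0.533],
}


def motif_score(pwm, y):
    """Product over motif positions j of the probability of base y[j] at j."""
    width = len(pwm["A"])
    if len(y) != width:
        raise ValueError(f"expected a {width}-mer, got length {len(y)}")
    score = 1.0
    for j, base in enumerate(y):
        score *= pwm[base][j]
    return score


print(motif_score(ETS1_PWM, "CCGGAAGT"))  # consensus w-mer
print(motif_score(ETS1_PWM, "AAAAAAAA"))  # zero: A is impossible at position 3
```

Because the score is a plain product, a single zero entry in the PWM rules a w-mer out entirely.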
![Page 25: XPRIME: A Novel Motif Searching Method](https://reader033.vdocuments.us/reader033/viewer/2022042700/55961fe21a28ab7b0e8b4899/html5/thumbnails/25.jpg)
Methods: Complete Data Likelihood
(m+1)-component mixture model
$$L(\theta \mid z) = \prod_{i=1}^{L_s} C(y_i) \, [r_1 f_1(y_i)]^{\Delta_{1i}} [r_2 f_2(y_i)]^{\Delta_{2i}} \cdots [r_{m+1} f_{m+1}(y_i)]^{\Delta_{(m+1)i}}$$
$f(y)$ is the Motif Score equation
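On the log scale the complete data likelihood becomes a sum in which each w-mer contributes only the term of the component it is assigned to. The sketch below assumes that C(y_i) does not depend on the parameters and drops it, and it represents each component score as a plain Python callable; both choices, and all names and numbers, are illustrative.

```python
import math


def log_complete_likelihood(ys, deltas, r, fs):
    """Sum over w-mers i and components k of Delta_ki * log(r_k * f_k(y_i)).
    C(y_i) is omitted (assumed not to involve the parameters).
    Raises ValueError if an assigned component gives a w-mer probability zero."""
    total = 0.0
    for y, delta in zip(ys, deltas):
        for k, d in enumerate(delta):
            if d:
                total += math.log(r[k] * fs[k](y))
    return total


# Toy two-component example: one made-up motif score and a uniform background.
toy_motif = lambda y: 1.0 if y == "GG" else 0.5
background = lambda y: 0.25 ** len(y)

ys = ["GG", "AC"]
deltas = [[1, 0], [0, 1]]  # first w-mer is motif, second is background
r = [0.2, 0.8]
print(log_complete_likelihood(ys, deltas, r, [toy_motif, background]))
```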
![Page 27: XPRIME: A Novel Motif Searching Method](https://reader033.vdocuments.us/reader033/viewer/2022042700/55961fe21a28ab7b0e8b4899/html5/thumbnails/27.jpg)
Methods: Priors
$f_{m+1}(y)$ is fixed a priori
$\Delta_{(m+1)i}$'s are missing a priori
$f_1(y), \ldots, f_m(y)$ have product Dirichlet priors such that
$$\pi(f_m(y)) \propto \prod_{j=1}^{L} \prod_{k \in \{A,C,G,T\}} p_{mjk}^{\,a_{p_{mij}} - 1}$$
$r$ also has a Dirichlet prior
$$\pi(r) \propto \prod_{i=1}^{M} r_i^{\,a_{r_i} - 1}$$
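A product Dirichlet prior on a PWM means each motif position gets its own independent Dirichlet over the four bases. The sketch below draws from such a prior using only the standard library, via the usual construction of a Dirichlet draw from normalized Gamma variates. The prior parameters shown are invented for illustration (a strong preference for G at one position, flat elsewhere), and the function names are hypothetical.

```python
import random


def sample_dirichlet(alpha, rng=random):
    """Draw one probability vector from Dirichlet(alpha) by normalizing
    independent Gamma(alpha_k, 1) variates. All alpha_k must be positive."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_pwm_from_prior(prior_params, rng=random):
    """prior_params[j] lists the Dirichlet parameters for bases A, C, G, T
    at motif position j; positions are drawn independently."""
    return [sample_dirichlet(column, rng) for column in prior_params]


rng = random.Random(0)
prior = [
    [1.0, 1.0, 1.0, 1.0],   # flat: no prior knowledge at this position
    [0.5, 0.5, 20.0, 0.5],  # informative: strongly favors G
    [1.0, 1.0, 1.0, 1.0],
]
for column in sample_pwm_from_prior(prior, rng):
    print([round(p, 3) for p in column])
```

Setting every parameter to the same value leaves a motif uninformed, while loading parameters from a known matrix encodes prior knowledge about that motif.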
![Page 28: XPRIME: A Novel Motif Searching Method](https://reader033.vdocuments.us/reader033/viewer/2022042700/55961fe21a28ab7b0e8b4899/html5/thumbnails/28.jpg)
Methods: Gibbs Algorithm
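1 Draws $\Delta$'s from a multinomial distribution
$$p_{\Delta} \propto r_M \, f_M(y)$$
2 Draws $r$ from a Dirichlet distribution
$$\alpha_r = \sum_{i=1}^{L} \Delta_{Mi} + a_M$$
3 Draws $p_{mij}$ from a Dirichlet distribution
$$\alpha_{p_{mij}} = \sum_{i=1}^{L} \sum_{k \in \{A,C,G,T\}} \Delta_{mi} \, I(y_{ij} = k) + a_{p_{mij}}$$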
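The three draws can be sketched for the simplest case of one motif plus a fixed background, where the multinomial draw for each indicator reduces to a coin flip. This is a simplified illustration under stated assumptions, not the XPRIME implementation: w-mers are treated as independent, the PWM is stored as a list of per-position dictionaries, and every name and number is hypothetical.

```python
import random

BASES = "ACGT"


def dirichlet(alpha, rng):
    """Dirichlet draw via normalized Gamma variates (all alpha must be > 0)."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    s = sum(g)
    return [x / s for x in g]


def score(pwm, y):
    """Product over positions j of pwm[j][y[j]]."""
    out = 1.0
    for j, base in enumerate(y):
        out *= pwm[j][base]
    return out


def gibbs_sweep(ys, pwm, r, background, a_r, a_p, rng):
    """One sweep: component 0 is the motif, component 1 the fixed background."""
    # Step 1: draw each indicator with probability proportional to r_k * f_k(y).
    deltas = []
    for y in ys:
        w_motif = r[0] * score(pwm, y)
        w_bg = r[1] * score(background, y)
        deltas.append(0 if rng.random() < w_motif / (w_motif + w_bg) else 1)
    # Step 2: draw r from a Dirichlet with (assignment counts + prior parameters).
    n_motif = deltas.count(0)
    r = dirichlet([n_motif + a_r[0], len(ys) - n_motif + a_r[1]], rng)
    # Step 3: draw each PWM column from a Dirichlet with
    # (base counts among motif-assigned w-mers + prior parameters).
    new_pwm = []
    for j in range(len(pwm)):
        counts = {b: a_p[j][b] for b in BASES}
        for y, d in zip(ys, deltas):
            if d == 0:
                counts[y[j]] += 1
        new_pwm.append(dict(zip(BASES, dirichlet([counts[b] for b in BASES], rng))))
    return deltas, r, new_pwm


rng = random.Random(1)
uniform = [{b: 0.25 for b in BASES} for _ in range(4)]
flat_prior = [{b: 1.0 for b in BASES} for _ in range(4)]
ys = ["GGAA", "GGAA", "ACGT", "TTTT", "GGAT"]  # toy w-mers
pwm, r = uniform, [0.5, 0.5]
for _ in range(50):
    deltas, r, pwm = gibbs_sweep(ys, pwm, r, uniform, [1.0, 1.0], flat_prior, rng)
print(deltas, [round(x, 3) for x in r])
```

With more than one motif, step 1 becomes a genuine multinomial draw over m+1 components and step 3 is repeated for each motif's matrix.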
![Page 32: XPRIME: A Novel Motif Searching Method](https://reader033.vdocuments.us/reader033/viewer/2022042700/55961fe21a28ab7b0e8b4899/html5/thumbnails/32.jpg)
An Example: ETS1
We hypothesize that ETS1 has a specific binding site
The Data
1 ETS1 only
2 GABP only
3 ETS1 and GABP
![Page 33: XPRIME: A Novel Motif Searching Method](https://reader033.vdocuments.us/reader033/viewer/2022042700/55961fe21a28ab7b0e8b4899/html5/thumbnails/33.jpg)
ETS1 Binding Motifs
(a) ETS1 from TRANSFAC (b) ETS1 from ETS1 only
(c) ETS1 from GABP only (d) ETS1 from ETS1/GABP
![Page 34: XPRIME: A Novel Motif Searching Method](https://reader033.vdocuments.us/reader033/viewer/2022042700/55961fe21a28ab7b0e8b4899/html5/thumbnails/34.jpg)
Justification of Prior Information
Pete Hollenhorst sequence logo
![Page 35: XPRIME: A Novel Motif Searching Method](https://reader033.vdocuments.us/reader033/viewer/2022042700/55961fe21a28ab7b0e8b4899/html5/thumbnails/35.jpg)
Justification of Prior Information
Figure: Motif found without prior specification
Figure: Motif found with prior specification
![Page 36: XPRIME: A Novel Motif Searching Method](https://reader033.vdocuments.us/reader033/viewer/2022042700/55961fe21a28ab7b0e8b4899/html5/thumbnails/36.jpg)
Conclusions and Future Research
XPRIME successfully searches for de novo and known motifs
Evidence found suggesting ETS1 has its own binding motif
Hidden Markov Models and forward-backward algorithm
Prior information on r