Genomic identification of Structural RNAs using phylo-SCFGs
Jakob Skou Pedersen
Bioinformatics Center, University of Copenhagen
Structural RNA identification problem
Structural RNA: any transcribed region with functional structure.
Such as:
• independently transcribed ncRNAs
• ncRNAs co-transcribed with protein-coding genes
[Figure: ncRNA and protein-coding gene]
Highly diverse set
Examples: tRNA, Xist (~20 kb long), miRNA, UTR, RNase P
Little single-sequence signal:
• Lack of common nucleotide biases
• Lack of common sequence motifs
Evolutionary signal
Structure functionally important
Primary sequence subs tolerated
Signal
Unprecedented comparative data
Introduction to phylogenetic models
Captures:
Nucleotide biases
Patterns of substitution
Evolutionary sequence correlations
Correlated changes (multi-nucleotide models)
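Phylogenetic model: continuous-time Markov chain acting on the branches of a phylogenetic tree
Alignment column: $x$
Substitution rates of Markov chain: rate matrix $Q$
Transition prob.: $P(t) = e^{Qt}$ for a branch of length $t$
Felsenstein 81: likelihood of an alignment column given the tree and $Q$, computed by the pruning algorithm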
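The likelihood of a single alignment column under such a model can be sketched with Felsenstein's pruning recursion. This is a minimal illustration, not EvoFold's implementation: it assumes a Jukes-Cantor rate matrix (whose transition probabilities have a closed form) in place of learned rates, and the tree, branch lengths, and function names are hypothetical.

```python
import math

NUCS = "ACGU"

def jc_transition(t):
    """Jukes-Cantor transition probabilities for branch length t
    (expected substitutions per site). Assumption: JC stands in for
    a learned rate matrix."""
    e = math.exp(-4.0 * t / 3.0)
    same = 0.25 + 0.75 * e
    diff = 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]

def partial_likelihoods(node, column):
    """Return L[a] = P(observed leaves below node | node is in state a).
    A node is a leaf name (str) or a list of (child, branch_length) pairs."""
    if isinstance(node, str):
        obs = column[node]
        if obs == "-":  # gap treated as missing data
            return [1.0] * 4
        return [1.0 if NUCS[a] == obs else 0.0 for a in range(4)]
    result = [1.0] * 4
    for child, t in node:
        P = jc_transition(t)
        child_l = partial_likelihoods(child, column)
        for a in range(4):
            result[a] *= sum(P[a][b] * child_l[b] for b in range(4))
    return result

def column_likelihood(tree, column):
    """Sum over root states, weighted by the uniform JC equilibrium."""
    return sum(0.25 * l for l in partial_likelihoods(tree, column))

# Hypothetical tree: ((human:0.1, mouse:0.3):0.05, fly:0.8)
tree = [([("human", 0.1), ("mouse", 0.3)], 0.05), ("fly", 0.8)]
print(column_likelihood(tree, {"human": "A", "mouse": "A", "fly": "G"}))
```

Each leaf is visited once per column, so the cost per column grows linearly with the number of sequences.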
EvoFold phylogenetic models
Single-nucleotide model
• 4x4 rate matrix
• Marginal average of di-nucleotide matrix
• Fast substitution rate
Di-nucleotide model
• 16x16 rate matrix
• Learned from data
• Favors pairing di-nucs
• Slow substitution rate
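To make the two state spaces concrete, the sketch below enumerates the 16 di-nucleotide states, marks the pairing ones, and marginalizes a 16-state distribution down to 4 states. Several things here are assumptions: treating exactly the Watson-Crick and G-U wobble pairs as "pairing", the toy weights (which are not learned values), and reading "marginal average" as summing out one position.

```python
from itertools import product

NUCS = "ACGU"
DINUCS = ["".join(p) for p in product(NUCS, repeat=2)]  # 16 states

# Assumption: canonical pairs are Watson-Crick plus G-U wobble.
PAIRING = {"AU", "UA", "GC", "CG", "GU", "UG"}

def toy_pair_distribution(pair_weight=10.0):
    """Toy 16-state distribution that favors pairing di-nucleotides.
    The weights are illustrative, not learned from data."""
    weights = {d: (pair_weight if d in PAIRING else 1.0) for d in DINUCS}
    total = sum(weights.values())
    return {d: w / total for d, w in weights.items()}

def marginal(dist, position):
    """Sum out the other position to get a 4-state distribution."""
    out = {n: 0.0 for n in NUCS}
    for dinuc, p in dist.items():
        out[dinuc[position]] += p
    return out

dist = toy_pair_distribution()
print(marginal(dist, 0))
```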
EvoFold SCFGs
Structural model:
Non-structural model:
Structure derivation
Structure
Single nucleotide phylogenetic model
Di-nucleotide phylogenetic model
fold
Algorithms and training
Algorithms: Traditional SCFG algorithms (CYK and inside-outside) combined with Felsenstein 81.
Training of EvoFold: Rfam structures mapped onto genomic alignments
Complexities: For an alignment of length n with m sequences:
Space:
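The way a structure-finding dynamic program consumes per-column scores can be sketched as follows. This is not the EvoFold grammar or its CYK implementation: it swaps in a simpler Nussinov-style recursion, and `toy_pair_score` is a hypothetical stand-in for the log-odds between the di-nucleotide and single-nucleotide phylogenetic models. The sketch fills an O(n^2) table in O(n^3) time for n columns.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def toy_pair_score(alignment, i, j):
    """Hypothetical score for pairing columns i and j:
    +1 per sequence that can pair there, -1 otherwise."""
    return sum(1 if (s[i], s[j]) in PAIRS else -1 for s in alignment)

def fold(alignment, min_loop=3):
    """Return (dot-bracket structure, score) maximizing the summed pair scores."""
    n = len(alignment[0])
    # best[i][j]: best score for columns i..j-1 (half-open interval)
    best = [[0.0] * (n + 1) for _ in range(n + 1)]
    choice = [[None] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            best[i][j] = best[i + 1][j]  # column i left unpaired
            for k in range(i + min_loop + 1, j):  # column i paired with k
                s = toy_pair_score(alignment, i, k) + best[i + 1][k] + best[k + 1][j]
                if s > best[i][j]:
                    best[i][j] = s
                    choice[i][j] = k
    # Trace back the choices to recover the structure.
    structure = ["."] * n
    stack = [(0, n)]
    while stack:
        i, j = stack.pop()
        if j - i < 1:
            continue
        k = choice[i][j]
        if k is None:
            stack.append((i + 1, j))
        else:
            structure[i], structure[k] = "(", ")"
            stack.append((i + 1, k))
            stack.append((k + 1, j))
    return "".join(structure), best[0][n]

# Hypothetical three-sequence alignment with compensatory changes in the stem.
alignment = ["GGGAAAACCC", "GCGAUAACGC", "GGCAAAAGCC"]
print(fold(alignment))
```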
vertebrates & drosophilids
Input: conserved segments
......(((((((..))).(((....)))))))...................(((((....)))))..
[Figure: each of the two paired regions is labeled as a sub-fold]
Output: sub-folds
Performance
Sensitivity: 43%
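The dot-bracket notation used for the screen output can be split into sub-folds mechanically. The sketch below assumes a sub-fold is a maximal top-level paired region, which is one reading of the figure rather than a definition given on the slide; the function names are hypothetical, and the example string is the one from the slide with ellipsis characters expanded to dots.

```python
def parse_pairs(dot_bracket):
    """Return sorted (open, close) index pairs; raise on malformed input."""
    stack, pairs = [], []
    for pos, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)

def sub_folds(dot_bracket):
    """Return (start, end) inclusive spans of top-level paired regions."""
    parse_pairs(dot_bracket)  # validates balance before splitting
    folds, depth, start = [], 0, None
    for pos, ch in enumerate(dot_bracket):
        if ch == "(":
            if depth == 0:
                start = pos
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                folds.append((start, pos))
    return folds

structure = ("." * 6 + "(((((((..))).(((....)))))))"
             + "." * 19 + "(((((....)))))" + "..")
for start, end in sub_folds(structure):
    print(start, end, structure[start:end + 1])
```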
Experimentally studied vertebrate cases
HAR1: expression in developing neocortex
GABRA3: A-to-I RNA editing substrate (I/M site); editing in mouse brain
High confidence subset from Drosophila screen
Selection criteria:
• Min. two compensatory substitutions
• #compensatory subs > 2 x #contradictory subs
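The two selection criteria translate directly into a filter. In the sketch below, the function name, the record layout, and the example counts are hypothetical; only the two thresholds come from the slide.

```python
def is_high_confidence(n_compensatory, n_contradictory):
    """Apply both criteria: at least two compensatory substitutions, and
    more than twice as many compensatory as contradictory substitutions."""
    return n_compensatory >= 2 and n_compensatory > 2 * n_contradictory

# Hypothetical predictions: (identifier, compensatory, contradictory)
predictions = [("fold_a", 5, 1), ("fold_b", 1, 0), ("fold_c", 4, 2)]
kept = [name for name, comp, contra in predictions
        if is_high_confidence(comp, contra)]
print(kept)
```

The second criterion is a strict inequality, so a prediction with four compensatory and two contradictory substitutions is rejected.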
[Figure: predictions (394 total) compared with genomic background]
UTR structures
Gene involved in biogenesis and assembly of the ribosome
(by homology to RPL24)
High fraction of UTR predictions on transcribed strand (5’UTR: 80% & 3
Significant enrichment of genes with regulatory roles.
Acknowledgments
GABRA3 study: Johan Ohlson, Marie Ohman (Stockholm University), and David Haussler (UCSC)
HAR1 study: Katherine S. Pollard (UCSC), Sofie R. Salama (UCSC), Nelle Lambert (ULB), Marie-Alexandra Lambot (ULB), Sandra Coppens (ULB), Sol Katzman (UCSC), Bryan King (UCSC), Courtney Onodera (UCSC), Adam Siepel (Cornell), Andrew D. Kern (UCSC), Colette Dehay (Lyon), Haller Igel (UCSC), Manuel Ares, Jr (UCSC), Pierre Vanderhaeghen (ULB)
EvoFold: Gill Bejerano (Stanford), Adam Siepel (Cornell), Kate Rosenbloom (UCSC), Kerstin Lindblad-Toh (Broad), Eric S. Lander (Broad), Jim Kent (UCSC), Webb Miller (Penn State), and David Haussler (UCSC)
Fly screen: Manolis Kellis (MIT), David Haussler (UCSC), Drosophila Sequencing and Analysis Consortium
Structure predictions
[Figure: predicted hairpin structure with the A -> I (G) editing site marked]
New case of A-to-I RNA editing
[Figure: genomic vs. cDNA sequence at the I/M editing site, mouse brain]
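Because inosine is read as G when cDNA is sequenced, candidate A-to-I sites show up as positions where the genomic sequence has A and the cDNA has G. The sketch below illustrates that comparison on aligned sequences of equal length; the function name and the example sequences are hypothetical, and an A/G mismatch alone does not distinguish editing from a polymorphism or a sequencing error.

```python
def candidate_a_to_i_sites(genomic, cdna):
    """Return 0-based positions where the genomic base is A and the
    cDNA base is G. Both inputs must be aligned and of equal length."""
    if len(genomic) != len(cdna):
        raise ValueError("sequences must be aligned to the same length")
    return [
        pos
        for pos, (g, c) in enumerate(zip(genomic.upper(), cdna.upper()))
        if g == "A" and c == "G"
    ]

# Hypothetical aligned sequences, not data from the study.
print(candidate_a_to_i_sites("CATAGA", "CGTAGG"))
```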
Slide with structures, perhaps browser screenshot.
Highlight collaborations: RNA editing, regulation
Intronic hairpin in RDL flanked by A-to-I edited exons
Spen
Spen function: Transcription co-factor involved in neuronal cell fate, survival, and axon guidance. It has three RNA recognition motifs.
Staufen hairpin similar to known localization element in Orb
Stau
Orb
Hairpin in highly expressed intergenic region
Structure can be extended
EvoFold str.
RNAfold str.
Alignment & full structure