Introduction to RNA Bioinformatics
Craig L. Zirbel
October 5, 2010
Based on a talk originally given by Anton Petrov.
Outline
Lecture 1
• Importance of RNA, examples (miRNA, riboswitches).
• RNA 2D and 3D structure.
• RNA structure prediction.
Lecture 2
• RNA basepairs and 3D motifs.
• Predicting secondary structure from sequence (mfold).
Lecture 3
• Statistical variability of protein and RNA sequences.
ENCODE Project Consortium. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007 Jun 14;447(7146):799-816.
In the human genome, out of approximately 3 billion nucleotides, only about 1.5% code for proteins, although up to 93% are transcribed into RNA. What is this “non-coding” RNA doing?
Mattick, J.S. (2004) The hidden genetic program of complex organisms. Scientific American 291 (4): 60-67.
DNA
RNA
Protein
Transcription
Translation of exons
Reverse Transcription
Splicing
tRNA
Ribosomal RNA
Many other types of ncRNA
Introns (RNA)
micro RNA
Mattick, J.S. (2004) The hidden genetic program of complex organisms. Scientific American 291 (4): 60-67.
Kim VN, MicroRNA biogenesis: coordinated cropping and dicing. Nat Rev Mol Cell Biol. 2005 May;6(5):376-85
microRNA
Mattick, J.S. (2004) The hidden genetic program of complex organisms. Scientific American 291 (4): 60-67.
Bioinformatic challenge: given a DNA sequence, predict microRNA genes and their respective targets.
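As a rough illustration of the target half of this challenge, here is a minimal Python sketch that scans a transcript for sites complementary to a miRNA "seed". The seed definition (nucleotides 2-8 of the miRNA), the function names, and the example sequences are assumptions made for illustration; they are not from the talk, and real target predictors also weigh conservation, site context, and binding energy.

```python
def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    complement = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(rna))


def seed_sites(mirna, target, seed_start=1, seed_end=8):
    """Find 0-based positions in `target` that are perfectly complementary
    to the miRNA seed. The default slice [1:8] corresponds to miRNA
    nucleotides 2-8, which is an assumed (commonly used) seed definition."""
    seed = mirna[seed_start:seed_end]
    site = reverse_complement(seed)  # what the seed would pair with
    hits = []
    pos = target.find(site)
    while pos != -1:
        hits.append(pos)
        pos = target.find(site, pos + 1)  # allow overlapping sites
    return hits


# Illustrative sequences only (hypothetical target fragment).
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
target = "AAACUACCUCAGGGCUACCUCUU"
print(seed_sites(mirna, target))
```

Exact seed matching is only a first filter: it tends to produce many false positives on long transcripts, which is part of why target prediction remains a challenge.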
miRNAs in a transcript, waiting to be diced out
Peterson, K.J., Dietrich, M.R. and McPeek, M.A. (2009) MicroRNAs and metazoan macroevolution: insights into canalization, complexity, and the Cambrian explosion. BioEssays 31:736–747.
Acquisition of novel microRNAs (shown in white boxes) may be a driving force of recent evolution. Also a factor in cancers?
There are 84 mammal-specific microRNAs, and 84 more that are found exclusively in apes.
Montange, R. K., & Batey, R. T. (2008). Riboswitches: emerging themes in RNA structure and function. Annu Rev Biophys 37:117-133.
RIBOSWITCHES
Bioinformatic challenges: find riboswitches in genomic sequences, design novel riboswitches.
Riboswitches are RNAs that bind to other molecules when those molecules are present; binding alters the shape and function of the RNA.
http://en.wikipedia.org/wiki/List_of_RNAs
Types of RNA
Bioinformatic challenges: Is this list final? Could there be more types of non-coding RNA (ncRNA) that we don’t know about yet? How can we search for novel ncRNAs in genomes?
Goals of RNA bioinformatics
• Find and classify RNA genes in genomic sequences (using both experimental and computational methods).
• Predict secondary and 3D structure from RNA sequence.
• Infer function from structure.
• Rationally design RNA molecules for biotechnology.
• Find diseases associated with RNAs (e.g., cancer and miRNA).
Why RNA is unique
• Similar to DNA in chemical composition, primary and secondary structure, and information content, but with more complicated structure than helices.
• Similar to proteins in tertiary (3D) structure and function, but also very different: interactions are mostly base-base, with fewer backbone-backbone interactions.
• Binds substrates and catalyzes reactions, just as proteins do.
• Participates in all stages of gene expression and information transfer: transcription, splicing, translation.
• Frequent target of antibiotics.
Similarities Between Protein and RNA 3D Structures
• Compact folding
• Hierarchical organization
• Modular domains
• Specific tertiary interactions
• Molecular “mimicry”: proteins that “mimic” RNA
Liang, H., & Landweber, L. F. (2005). Molecular mimicry: Quantitative methods to study structural similarity between protein and RNA. RNA, 11(8), 1167-1172.
The tertiary structures of tRNA-mimic translation factors and tRNA. (a) Thermus thermophilus EF-G:GDP (PDB accession code 1DAR). (b) Thermus aquaticus EF-Tu:GDPNP:Phe-tRNAPhe (1TTT). (c) Thermus thermophilus RRF (1EH1). (d) Yeast Phe-tRNAPhe.
RNA 2D Structure Elements
Bioinformatics: sequence and genome analysis By David W. Mount
Bioinformatic challenges: predict the most stable 2D structures, resolve pseudoknotted regions, etc.
Basepairs are the basic units of secondary structure.
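To make the idea of building a 2D structure out of basepairs concrete, here is a minimal Python sketch of the classic Nussinov dynamic program, which finds a nested (pseudoknot-free) structure with the maximum number of basepairs. This is a deliberately simplified stand-in, not the method used by mfold, which minimizes an estimated free energy instead of counting pairs. The allowed pairs (Watson-Crick plus G-U wobble), the minimum hairpin loop of 3 unpaired bases, and all names are assumptions chosen for illustration.

```python
# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov(seq, min_loop=3):
    """Return (number of pairs, dot-bracket string) for one nested structure
    that maximizes the basepair count. `min_loop` is the minimum number of
    unpaired bases enclosed by a hairpin (an assumed value)."""
    n = len(seq)
    # best[i][j] = max number of pairs within seq[i..j]; short spans stay 0.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i + 1][j]  # case 1: base i is unpaired
            for k in range(i + min_loop + 1, j + 1):  # case 2: i pairs with k
                if (seq[i], seq[k]) in PAIRS:
                    inside = best[i + 1][k - 1]
                    outside = best[k + 1][j] if k + 1 <= j else 0
                    score = max(score, 1 + inside + outside)
            best[i][j] = score

    # Traceback: recover one optimal structure in dot-bracket notation.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue  # too short to contain a pair
        if best[i][j] == best[i + 1][j]:
            stack.append((i + 1, j))
            continue
        for k in range(i + min_loop + 1, j + 1):
            if (seq[i], seq[k]) in PAIRS:
                inside = best[i + 1][k - 1]
                outside = best[k + 1][j] if k + 1 <= j else 0
                if best[i][j] == 1 + inside + outside:
                    structure[i], structure[k] = "(", ")"
                    stack.append((i + 1, k - 1))
                    stack.append((k + 1, j))
                    break
    return (best[0][n - 1] if n else 0), "".join(structure)


count, fold = nussinov("GGGAAAACCC")
print(count, fold)  # 3 (((....)))
```

Because the recursion only ever splits a sequence into an inside and an outside part, it cannot represent pseudoknots, which is one reason pseudoknotted regions are listed as a separate challenge.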