![Page 1: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/1.jpg)
Transcriptomics
Marta Puig
Institut de Biotecnologia i Biomedicina
Universitat Autònoma de Barcelona
![Page 2: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/2.jpg)
Central dogma of molecular biology
![Page 3: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/3.jpg)
Central dogma of molecular biology
Genome: Complete DNA content of an organism with all its genes and regulatory sequences
Transcription
Transcriptome: Complete set of transcripts and their relative levels of expression in a particular cell or tissue under defined conditions at a given time
Translation
Proteome: Complete collection of proteins and their relative levels in each cell
Phenotype
![Page 4: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/4.jpg)
Why is the study of RNA so important?
RNA profiling provides information about:
Expressed sequences and genes of a genome
Gene regulation and regulatory sequences
Function and interaction between genes
Functional differences between tissues and cell types
Identification of candidate genes for any given process or disease
![Page 5: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/5.jpg)
Overview
Methods
Alternative splicing
Types of transcripts
Regulatory sequences
ENCODE project
![Page 6: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/6.jpg)
Transcriptome analysis methods
SINGLE GENES
Northern
RT-PCR
5’ and 3’ RACE
Quantitative RT-PCR (Real-Time RT-PCR)
WHOLE TRANSCRIPTOME
EST sequencing
Microarrays
RNA-Seq
![Page 7: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/7.jpg)
Transcriptome analysis using microarrays
Gene expression arrays
- Quantification of transcript abundance
- Single/multiple 3’ probes
Genome tiling arrays
- Identification of transcribed sequences
- Multiple probes covering the genome
Alternative splicing arrays
- Quantification of different RNA isoforms
- Probes in exons and exon-exon junctions
(Diagram labels: Gene, Probes, Inclusion form, Exclusion form)
![Page 8: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/8.jpg)
Expressed Sequence Tags (EST)
cDNA synthesis
cDNA library
Sanger sequencing of insert ends
ESTs
Alignment with genome
Brent (2008) Nature Reviews Genetics 9: 62-73
![Page 9: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/9.jpg)
RNA-seq
Sequencing of all the transcripts in a sample using NGS technologies
Figure 1. Wang et al. (2009) Nature Reviews Genetics 10: 57-63
![Page 10: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/10.jpg)
RNA-seq
RNA-seq mapping of short reads in exon-exon junctions
CCGAAAATCAAGTCATCCCTAAAGACTAAGTAAGTAACCATATTACATTAAGGAAGGCACTTTAAAAGTTTATAATCATTTGTAGACTCCCACCAAAGCCACTGACTCGCAAGG
(Diagram labels: Exon, Intron, Exon)
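A read that spans an exon-exon junction comes from the spliced transcript, so it has no contiguous match in the genome and has to be aligned in two pieces. The sketch below illustrates this on the sequence shown on this slide. The intron boundaries (the first `GTAAGT` motif and the `AG` just before `ACTCCC`) are an assumption made for illustration, since the slide only labels the regions, and `split_read_map` is a hypothetical helper rather than the algorithm of any particular aligner.

```python
# Genomic sequence from the slide (exon, intron, exon).
GENOME = "CCGAAAATCAAGTCATCCCTAAAGACTAAGTAAGTAACCATATTACATTAAGGAAGGCACTTTAAAAGTTTATAATCATTTGTAGACTCCCACCAAAGCCACTGACTCGCAAGG"

# Assumed intron boundaries (not stated on the slide): the intron starts at the
# first GTAAGT motif and ends with the AG immediately before ACTCCC.
INTRON_START = GENOME.index("GTAAGT")
INTRON_END = GENOME.index("ACTCCC")

# Spliced transcript: the two exons joined with the intron removed.
TRANSCRIPT = GENOME[:INTRON_START] + GENOME[INTRON_END:]

# A 20-nt read centred on the exon-exon junction.
READ = TRANSCRIPT[INTRON_START - 10:INTRON_START + 10]


def split_read_map(read, genome, min_anchor=4):
    """Try every split point and return (left_start, left_end, right_start)
    for the first split whose two parts both match the genome in order,
    separated by a gap (the candidate intron). Only the first genomic
    occurrence of the left part is considered. Returns None if no split works."""
    for cut in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:cut], read[cut:]
        left_start = genome.find(left)
        if left_start < 0:
            continue
        left_end = left_start + len(left)
        right_start = genome.find(right, left_end + 1)
        if right_start >= 0:
            return left_start, left_end, right_start
    return None


hit = split_read_map(READ, GENOME)
print("contiguous match in genome:", READ in GENOME)
print("split alignment:", hit)
if hit is not None:
    print("implied intron:", GENOME[hit[1]:hit[2]])
```

Real spliced aligners use indexed seeds, mismatch tolerance and splice-site scoring; the exhaustive split search here is only meant to show why junction reads need special handling.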
![Page 11: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/11.jpg)
RNA-seq examples
Discovery of new transcripts by RNA-seq in D. melanogaster
Figures 1 and 2. Graveley et al. (2011) Nature 471: 473-479
![Page 12: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/12.jpg)
RNA-seq examples
Quantification and determination of expression profiles
Expression profile by RNA-seq of the D. melanogaster gene eve in different developmental stages
D. melanogaster RNA-seq data as shown in GBrowse (FlyBase)
![Page 13: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/13.jpg)
RNA-seq advantages
Does not depend on an available genome sequence
Detection of new transcripts
Single-nucleotide precision
Detection of splicing variants and alternative transcription starts and ends
Detection of SNPs in transcribed regions
Detection of allele-specific transcription
Accurate quantification of expression levels (wide range of measurements)
High reproducibility
Small amount of initial RNA needed
![Page 14: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/14.jpg)
RNA-seq advantages
Table 1. Wang et al. (2009) Nature Reviews Genetics 10: 57-63
![Page 15: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/15.jpg)
mRNAs
Gene structure (DNA), 5’ to 3’:
Regulatory elements
Promoters (TATA)
TRANSCRIPTION START SITE
5’ UTR
TRANSLATION INITIATION: Methionine (ACTGATGTCCA)
ORF
TRANSLATION TERMINATION: STOP codon (CCGATAAATCC)
3’ UTR
Polyadenylation signal (CTGAATAAATCCA)
TRANSCRIPTION TERMINATION SITE
Processing: DNA, Splicing, mRNA with polyA tail (AAAAAAAAA)
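Each labelled fragment on this slide contains a short signal that can be located by simple string search: the ATG start codon, a TAA stop codon and the AATAAA polyadenylation signal. The following is a minimal sketch of how an ORF and its flanking UTRs could be delimited from such signals. `TOY_MRNA` is invented for illustration and is not from the slides, and `find_orf` is a hypothetical helper that ignores real-world complications such as upstream ATGs or sequence context.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Fragments as shown on the slide.
START_FRAGMENT = "ACTGATGTCCA"    # contains the ATG (methionine) start codon
STOP_FRAGMENT = "CCGATAAATCC"     # contains a TAA stop codon
POLYA_FRAGMENT = "CTGAATAAATCCA"  # contains the AATAAA polyadenylation signal


def find_orf(mrna):
    """Return (start, end) of the first ORF: from the first ATG to the first
    in-frame stop codon (end is exclusive and includes the stop codon).
    Returns None if there is no ATG or no in-frame stop codon."""
    start = mrna.find("ATG")
    if start < 0:
        return None
    for pos in range(start, len(mrna) - 2, 3):
        if mrna[pos:pos + 3] in STOP_CODONS:
            return start, pos + 3
    return None


# Toy mRNA (invented): 5' UTR + ORF + 3' UTR carrying a polyadenylation signal.
TOY_MRNA = "GGCACT" + "ATGTCCAAAGGATAA" + "ATCCTGAATAAATCCA"

orf = find_orf(TOY_MRNA)
five_utr = TOY_MRNA[:orf[0]]
coding = TOY_MRNA[orf[0]:orf[1]]
three_utr = TOY_MRNA[orf[1]:]
print("5' UTR:", five_utr)
print("ORF:   ", coding)
print("3' UTR:", three_utr, "| polyA signal at offset", three_utr.find("AATAAA"))
```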
![Page 16: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/16.jpg)
Alternative splicing
Internal exons
Initial/final exons
Exon inclusion/skipping
Alternative 5’ splice site selection
Alternative 3’ splice site selection
Intron retention
Figure 1. Nielsen and Graveley (2010) Nature 463: 457-463
Figure 1. Li et al. (2007) Nature Reviews Neuroscience 8: 819-831.
![Page 17: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/17.jpg)
Alternative splicing example: α-tropomyosin
Alternative promoters
Exon inclusion/skipping
Alternative polyA sites
Alternative 3’ splice site selection
Figure 8.22. Evolution. Barton et al. (2007) Cold Spring Harbor Laboratory Press
![Page 18: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/18.jpg)
Extreme alternative splicing examples
Number of isoforms: >500, 38016, 28
Figure 2. Nielsen and Graveley (2010) Nature 463: 457-463
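The count of 38016 isoforms on this slide matches the combinatorial arithmetic usually given for the Drosophila Dscam gene: four clusters of mutually exclusive exons with 12, 48, 33 and 2 alternatives, with one exon chosen per cluster. The slide text does not name the gene, so that attribution is an assumption here. A quick check of the arithmetic:

```python
from math import prod

# Number of mutually exclusive alternatives in each Dscam exon cluster
# (exons 4, 6, 9 and 17); taken from the literature, not from the slide text.
alternatives = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}

# One alternative is chosen from each cluster, so the number of possible
# isoforms is the product of the cluster sizes.
n_isoforms = prod(alternatives.values())
print(n_isoforms)  # 38016
```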
![Page 19: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/19.jpg)
Prevalence of alternative splicing in Drosophila
7473 genes are alternatively spliced (60.7% of the 12295 expressed genes with multiple exons)
Table 1. Graveley et al. (2011) Nature 471: 473-479
![Page 20: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/20.jpg)
Prevalence of alternative splicing in humans
92-94% of human genes show alternative splicing
86% of human genes generate at least two different transcripts in significant amounts (minor isoform frequency of 15% or more)
Many alternative isoforms are produced in different tissues as a result of specific regulation
Figure 2. Wang et al. (2008) Nature 456: 470-476
![Page 21: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/21.jpg)
Regulation of alternative splicing
Tissue-regulated splicing variants in humans
Figure 1. Wang et al. (2008) Nature 456: 470-476
Not all possible isoforms exist
Figure 2. Nielsen and Graveley (2010) Nature 463: 457-463
Developmentally regulated splicing variants in D. melanogaster
Figure 4. Graveley et al. (2011) Nature 471: 473-479
![Page 22: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/22.jpg)
Regulation of alternative splicing in humans
Genes tend to express many isoforms simultaneously
One isoform dominates in a given condition
¾ of the protein-coding genes have at least two different major isoforms
Variability of gene expression contributes more than variability in splicing ratios to the variability of transcript abundance across cell lines
Figure 4. Djebali et al. (2012) Nature 489: 101-108
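The points on this slide (one isoform dominates in a given condition, yet many genes have different major isoforms in different conditions) can be made concrete by computing splicing ratios per cell line. This is a minimal sketch with invented expression values; the names `expression`, `splicing_ratios` and `major_isoform` are hypothetical and the numbers are not from Djebali et al.

```python
# Hypothetical isoform expression values (arbitrary units) for one gene in two
# cell lines; the numbers are invented for illustration.
expression = {
    "cell_line_A": {"iso1": 80.0, "iso2": 15.0, "iso3": 5.0},
    "cell_line_B": {"iso1": 6.0, "iso2": 12.0, "iso3": 2.0},
}


def splicing_ratios(isoforms):
    """Fraction of the gene's total expression contributed by each isoform."""
    total = sum(isoforms.values())
    return {name: value / total for name, value in isoforms.items()}


def major_isoform(isoforms):
    """Name and splicing ratio of the most abundant isoform."""
    ratios = splicing_ratios(isoforms)
    name = max(ratios, key=ratios.get)
    return name, ratios[name]


for cell_line, isoforms in expression.items():
    gene_level = sum(isoforms.values())
    name, share = major_isoform(isoforms)
    print(f"{cell_line}: gene expression={gene_level:.0f}, "
          f"major isoform={name} ({share:.0%})")
```

In this toy case the gene-level expression differs five-fold between the two cell lines and the major isoform switches from `iso1` to `iso2`, which separates the two sources of variability named on the slide: gene expression and splicing ratios.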
![Page 23: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/23.jpg)
Unanswered questions
How many of the observed isoforms are functionally relevant?
Can alternative splicing account for the higher complexity of some organisms?
Table 2. Nielsen and Graveley (2010) Nature 463: 457-463
![Page 24: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/24.jpg)
Types of transcripts

| Type | Name | Size | Transcripts | Function |
|---|---|---|---|---|
| Small non-coding RNAs | | | | |
| rRNAs | ribosomal RNAs | 114-5000 nt | 531 | Component of ribosome |
| tRNAs | transfer RNAs | 73-93 nt | 624* | Translation |
| snRNAs | small nuclear RNAs | 100-300 nt | 1923 | Splicing |
| snoRNAs | small nucleolar RNAs | 60-300 nt | 1529 | RNA modification |
| miRNAs | micro RNAs | 21-23 nt | 3116 | Gene expression regulation |
| Long non-coding RNAs | | | | |
| lncRNAs | long non-coding RNAs | >200 nt | 21271 | Regulation, imprinting… |

* Number of transcripts from GENCODE v7 data
Number of transcripts from GENCODE v14 data
![Page 25: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/25.jpg)
rRNAs and tRNAs
Essential components of the protein translation process
rRNAs are transcribed as a polycistronic transcript that is modified and processed to generate the mature 18S, 5.8S and 28S rRNAs
rRNAs assemble with proteins to form the two subunits of the ribosome
tRNAs carry an amino acid to the protein synthetic machinery of a cell (ribosome) as directed by a three-nucleotide sequence (codon) in the mRNA
![Page 26: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/26.jpg)
snRNAs and snoRNAs
snRNAs: Part of the splicing machinery
snoRNAs: Guide chemical modifications of other RNAs
Dredge et al. (2001) Nature Reviews Neuroscience 2: 43-50
Eddy (2001) Nature Reviews Genetics 2: 919-929
![Page 27: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/27.jpg)
microRNAs
Small non-coding RNAs (21-23 nt) involved in the post-transcriptional regulation of gene expression by binding to the 3’ UTR of target mRNAs
Identified in the early 1990s, but recognized as a distinct class of regulators in the early 2000s
Detected in multiple species, from humans and mice to Drosophila, C. elegans and even plants (Arabidopsis)
Abundant in many cell types and may be involved in many different processes
Target around 60% of mammalian genes
Figure 2. He and Hannon (2004) Nature Reviews Genetics 5: 522-531.
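Binding of a miRNA to the 3’ UTR of a target is often approximated by searching the UTR for sites complementary to the miRNA "seed" (nucleotides 2-8). The seed rule is not stated on the slide and is used here as a simplifying assumption. The miRNA and UTR sequences below are illustrative only, and `seed_site` and `find_seed_matches` are hypothetical helpers.

```python
# Watson-Crick complement for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna):
    """mRNA sequence (5'->3') that pairs with miRNA nucleotides 2-8."""
    seed = mirna[1:8]
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_matches(mirna, utr):
    """Start positions (0-based) of all seed-complementary sites in the UTR."""
    site = seed_site(mirna)
    hits = []
    pos = utr.find(site)
    while pos >= 0:
        hits.append(pos)
        pos = utr.find(site, pos + 1)
    return hits


# Illustrative 22-nt miRNA and an invented 3' UTR carrying two seed sites.
MIRNA = "UGAGGUAGUAGGUUGUAUAGUU"
TOY_UTR = "AAG" + "CUACCUC" + "AGGAUUU" + "CUACCUC" + "UUAA"

print("seed site:", seed_site(MIRNA))
print("matches at:", find_seed_matches(MIRNA, TOY_UTR))
```

Real target prediction also weighs site context, conservation and pairing outside the seed, so a bare seed match is best read as a candidate site rather than a confirmed target.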
![Page 28: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/28.jpg)
Long non-coding RNAs (lncRNAs)
Definition: Non-coding RNAs longer than 200 nucleotides
Genomic organization
Rinn and Chang (2012) Annual Review of Biochemistry 81: 145–166
![Page 29: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/29.jpg)
Expression of long non-coding RNAs
Lower expression levels in all tissues compared to protein-coding genes
More tissue-specific expression patterns compared to mRNAs
Distribution of the number of Human Body Map tissues in which lncRNA and protein-coding transcripts are detected
Figure 5. Derrien et al. (2012) Genome Research 22: 1775-1789
![Page 30: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/30.jpg)
Long non-coding RNAs
Currently 21,271 annotated transcripts transcribed from 12,933 loci in the human genome
Significantly more conserved than neutrally evolving sequences, but less conserved than protein-coding genes
Are lncRNAs functional?
Byproduct
Guide
Scaffold
Baker (2011) Nature Methods 8: 379–383
![Page 31: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/31.jpg)
Examples of long non-coding RNAs
lincRNA-p21 represses many genes and results in cellular apoptosis
GAS5 is induced under starvation and growth arrest. It competes with DNA binding sites for the glucocorticoid receptor and results in reduced metabolism
A lncRNA transcribed from the promoter region of CCND1 (cyclin D1) is induced by DNA damage; it recruits the TLS protein to CCND1 and represses its expression, interrupting the cell cycle
(Figure labels: 3.1 kb, 1 kb)
Figure 2. Huarte and Rinn (2010) Hum. Mol. Genet. 19: R152-R161
![Page 32: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/32.jpg)
Pseudogenes
Definition: Genes that have lost their coding ability
Types (counts shown in the figure: ≈10,000, ≈3,000, ≈29, ≈175)
Figure 3. Harrow et al. (2012) Genome Research 22: 1760-1774
![Page 33: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/33.jpg)
Pseudogenes
863 pseudogenes are transcribed and associated with active chromatin in the human genome
Can pseudogenes have a function, or are they just the remains of inactivated genes?
The PTENP1 pseudogene protects PTEN from miRNA silencing, and therefore has a tumor suppressive function
Figure 1. Poliseno et al. (2010) Nature 465: 1033-1038
![Page 34: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/34.jpg)
Transcript profiling across tissues
Mouse
Human
Data from Su et al. (2004) PNAS 101: 6062-6067
http://biogps.org
![Page 35: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/35.jpg)
Transcript profiling across individuals
Figure 1. Cheung and Spielman (2009) Nature Reviews Genetics 10: 595-604
Different expression levels of a given gene are detected in different individuals
![Page 36: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/36.jpg)
Coding vs. Regulatory changes
Regulatory changes have unique properties that could make them especially important in phenotypic evolution
Reduced pleiotropic effects
Fine-tuning of gene function
Co-dominance and more efficient selection
![Page 37: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/37.jpg)
Persistence of lactase expression
In most mammals, the ability to digest milk disappears with age and depends on the production of the lactase enzyme
Lactase production in adults varies widely among human populations and appears to be associated with pastoralism
Figure 1. Itan et al. (2010) BMC Evolutionary Biology 10:36
![Page 38: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/38.jpg)
Persistence of lactase expression
Figure 1. Tishkoff et al. (2007) Nature Genetics 39: 31-40
![Page 39: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/39.jpg)
Regulatory elements
Regulatory elements:
Core promoter
Proximal elements
Distal enhancers (upstream / downstream)
Regulatory elements are difficult to predict:
Small (<50 bp)
Variable sequence motifs
Few nucleotide positions are really important
Poorly conserved and without a defined location
Figure 1. Ong and Corces (2011) Nature Reviews Genetics 12: 283-293
![Page 40: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/40.jpg)
ChIP-seq
Chromatin immunoprecipitation (ChIP) + Sequencing
Detection of transcription factor binding sites and other DNA-protein interactions
Figure 1. Massie and Mills (2008) EMBO reports 9: 337-343.
Figure 2. Park (2009) Nature Reviews Genetics 10: 669-680.
![Page 41: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/41.jpg)
ENCODE project
ENCyclopedia Of DNA Elements
International project funded by the National Human Genome Research Institute (NHGRI) with the goal of identifying all functional elements in the human genome.
PHASES
• Pilot phase (2003-2007): 1% of human genome (44 regions, a total of ≈30 Mb)
• Production phase (2007-): Whole genome
![Page 42: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/42.jpg)
ENCODE project – Functional elements
Figure 1. Ecker et al. (2012) Nature 489: 52-55
![Page 43: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/43.jpg)
ENCODE project data
1,640 genome-wide data sets prepared from 147 cell types
Maher (2012) Nature 489: 46-48
![Page 44: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/44.jpg)
ENCODE project main results
A total of 62.1% and 74.7% of the human genome is covered by either processed or primary transcripts, respectively
No cell line expresses more than 56.7% of the union of the expressed transcriptomes across all cell lines
A large number of previously unknown transcription start sites and new transcript isoforms have been identified
Thousands of new non-coding transcripts have been detected (22,531 long non-coding RNAs)
An initial set of 399,124 regions with enhancer-like features and 70,292 regions with promoter-like features have been described
80% of the genome has been annotated with potentially functional elements
![Page 46: Central dogma of molecular biology - UABbioinformatica.uab.es/base/documents/masterGP/Transcriptomics2013610-1354.pdf · Genome Proteome Transcriptome Complete DNA content of an organism](https://reader030.vdocuments.us/reader030/viewer/2022041210/5dd09f04d6be591ccb61e189/html5/thumbnails/46.jpg)