Long non-coding RNAs in disease and cancer. FACTS: Less than 2% of the human genome encodes proteins...

Upload: vicente-burwell

Post on 15-Jan-2016

Page 1

Page 2

FACTS:

• Less than 2% of the human genome encodes proteins

• Recent evidence from genomic tiling arrays and transcriptome deep sequencing showed that >50% of the human genome is transcribed

• The bulk of transcriptional products consists of small and long RNAs with little or no coding potential

Most transcribed eukaryotic DNA is non-coding

• C-value paradox: genome size does not correlate with organismal complexity

• G-value paradox: the number of protein-coding genes in a genome does not correlate with morphological complexity

• simplistic expectation + contradictory data = “paradox”

• The number of human genes is about the same as the number required to specify a nematode worm

• Does the secret of evolution lie in the complexity of gene regulation?

Page 3

• Main observations:

• Intronic transcripts

• Non-polyadenylated RNAs

• Antisense and overlapping transcription

• Concept of “transcribed dark matter”, i.e. transcripts with unknown function and meaning

• Tests of function that depend on gene knockout or overexpression work for only a fraction of genes, even among known protein-coding genes. The function of non-coding transcripts still needs to be established.

Page 4

• RNA molecules both encode sequence information and possess great structural plasticity

• RNA can directly interact with DNA and with other RNAs by base pairing

• Highly structured RNA can also provide docking sites for binding proteins

• RNA has a compact size and significant sequence specificity

• Non-coding RNAs known for a long time:

• rRNA and tRNA in translation

• snRNA in mRNA splicing and snoRNA in rRNA processing and modification

• ribozymes

• Recall the RNA world hypothesis
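The point that RNA can interact with other RNAs by base pairing can be illustrated with a minimal Python sketch. It assumes Watson-Crick pairs only (G-U wobble pairs are ignored) and perfect antiparallel pairing; the function names and the short example sequence are hypothetical and chosen only for illustration.

```python
# Watson-Crick partners for RNA bases (G-U wobble pairs are deliberately ignored).
RNA_PARTNER = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the strand that pairs perfectly with `rna`, written 5'->3'."""
    return "".join(RNA_PARTNER[base] for base in reversed(rna.upper()))


def forms_perfect_duplex(strand_a: str, strand_b: str) -> bool:
    """True if the two strands pair base-for-base in antiparallel orientation."""
    return strand_b.upper() == reverse_complement(strand_a)


if __name__ == "__main__":
    print(reverse_complement("AUGGC"))             # GCCAU
    print(forms_perfect_duplex("AUGGC", "GCCAU"))  # True
```

Real RNA-RNA and RNA-DNA interactions tolerate mismatches and depend on structure, so this is only a picture of the sequence-specificity argument, not a predictor of binding.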

Page 5

• Genome-wide surveys have revealed that eukaryotic genomes are extensively transcribed into thousands of long and short ncRNAs

• Important small ncRNAs with regulatory roles:

• miRNAs

• siRNAs

• piRNAs (Piwi-interacting RNA, transposon silencing in spermatogenesis)

• Long ncRNAs (lncRNAs): >200 nt

• Many lncRNAs show spatial- and temporal-specific patterns of expression, indicating that lncRNA expression is strongly regulated

• lncRNAs have specific biological functions

• if they are by-products of other regulatory events, they can be convenient biomarkers of ongoing regulation
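The size cut-off used above to separate long from small ncRNAs can be written as a small Python sketch. The 200 nt threshold comes from the slide; the function name, the labels, and the example lengths are assumptions made for illustration, and length alone says nothing about function.

```python
LNCRNA_MIN_LENGTH_NT = 200  # threshold from the slide: lncRNAs are >200 nt


def classify_ncrna(length_nt: int) -> str:
    """Label a non-coding transcript as long or small by length alone."""
    if length_nt <= 0:
        raise ValueError("transcript length must be a positive number of nucleotides")
    return "long ncRNA" if length_nt > LNCRNA_MIN_LENGTH_NT else "small ncRNA"


if __name__ == "__main__":
    # Example lengths are illustrative only.
    for length in (22, 200, 6500):
        print(length, classify_ncrna(length))
```

The cut-off is a working convention rather than a biological boundary, which is why the threshold is kept as a named constant.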

Page 6

lncRNA mechanisms can be described by four archetypes, which are not mutually exclusive:

1. As signals

2. As decoys

3. As guides

4. As scaffolds

Page 7

lncRNAs as signals

• lncRNAs can serve as molecular signals, because transcription of individual lncRNAs occurs at a very specific time and place to integrate developmental cues, interpret cellular context, or respond to diverse stimuli.

• Some lncRNAs in this archetype possess regulatory functions, while others are merely by-products of transcription—it is the act of initiation, elongation, or termination that is regulatory.

• An advantage of using RNA as the medium is that regulatory functions can be carried out quickly, without the need for protein translation.

Page 8

lncRNAs as signals

• Allele specificity: Xist in X chromosome inactivation; Air is expressed only from the paternal chromosome and represses imprinted genes

• Anatomic specific expression: HOTAIR and HOTTIP from Hox loci

• Induction by DNA damage: lincRNA-p21 acts as a transcriptional repressor in the p53 pathway and triggers apoptosis

• Induction by cold: RNAs expressed after vernalization control flowering

• Coordinated activity: eRNAs (enhancer RNAs)

Page 9

lncRNAs as decoys

• As decoys, lncRNAs can titrate transcription factors and other proteins away from chromatin, or titrate protein factors into nuclear subdomains

• As decoys, lncRNAs can compete with mRNAs for miRNA target sites

Page 10

lncRNAs as decoys (molecular sinks)

• TERRA (telomeric repeat-containing RNA) sequesters telomerase. It is a large non-coding RNA found in animals and fungi that forms an integral component of telomeric heterochromatin. Accumulation of TERRA at telomeres interferes with telomere replication, leading to a sudden loss of telomere tracts.

• ncRNAs compete for miRNA binding. The 3′ UTR of PTENP1 (tumor suppressor pseudogene) RNA was found to bind the same set of regulatory miRNA sequences that target the tumor-suppressor gene PTEN, reducing the downregulation of PTEN mRNA and allowing its translation into the tumor-suppressor protein PTEN.
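The competition for miRNA binding described above can be sketched in Python as a search for miRNAs whose seed sites occur in two transcripts at once. The sketch assumes a simplified rule: a site is a perfect Watson-Crick match to miRNA nucleotides 2-8. All sequences and names below (toy-miR-1, toy-miR-2, the two UTR strings) are invented for illustration and are not the real PTEN, PTENP1, or miRNA sequences.

```python
RNA_PARTNER = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the strand that pairs perfectly with `rna`, written 5'->3'."""
    return "".join(RNA_PARTNER[base] for base in reversed(rna.upper()))


def seed_site(mirna: str) -> str:
    """Sequence a transcript must contain to pair with miRNA nucleotides 2-8."""
    seed = mirna.upper()[1:8]  # nucleotides 2-8, counted from the 5' end
    return reverse_complement(seed)


def count_sites(transcript: str, mirna: str) -> int:
    """Count (possibly overlapping) perfect seed matches in a transcript."""
    site = seed_site(mirna)
    text = transcript.upper()
    return sum(
        1
        for start in range(len(text) - len(site) + 1)
        if text[start:start + len(site)] == site
    )


def shared_mirnas(utr_a: str, utr_b: str, mirnas: dict) -> list:
    """Names of miRNAs with at least one seed site in both transcripts."""
    return sorted(
        name
        for name, seq in mirnas.items()
        if count_sites(utr_a, seq) > 0 and count_sites(utr_b, seq) > 0
    )


# Invented toy data, not real sequences.
TOY_MIRNAS = {
    "toy-miR-1": "AACGGUACUUAGCCGAUUCGAA",
    "toy-miR-2": "UCCAGGAUUCGGAAUCGCAUGA",
}
TOY_GENE_UTR = "CCCGUACCGUAAAUUU"
TOY_PSEUDOGENE_UTR = "GGGUACCGUCCAUCCUGGAA"

if __name__ == "__main__":
    print(shared_mirnas(TOY_GENE_UTR, TOY_PSEUDOGENE_UTR, TOY_MIRNAS))
```

A miRNA reported by `shared_mirnas` is one the pseudogene transcript could, under this simplified rule, titrate away from the gene's mRNA. Real target prediction also weighs site context and conservation, which this sketch leaves out.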

Page 11

lncRNAs as guides

• The third archetype of lncRNA is the guide—RNA binds protein(s), then directs the localization of ribonucleoprotein complex to specific targets.

• lncRNAs can guide changes in gene expression either in cis (on neighboring genes) or in trans (distantly located genes) in a manner that is not easily predicted based on lncRNA sequence.

Page 12

lncRNAs as guides

• The gene regulatory components brought on by the lncRNAs include both repressive (e.g., polycomb) and activating (MLL) complexes, as well as transcription factors (TFIIB). However, no matter the distance or mechanism (either cis or trans), the principle remains the same: to convey regulatory information across an intervening stretch of DNA to control target gene expression, bringing about changes in the epigenome.

• Expression of the Hox lncRNA HOTAIR has recently been associated with cancer metastasis and was observed in primary and metastatic breast cancer.

• Depletion of HOTAIR from cancer cells leads to a reduced invasiveness of cells that express a high level of polycomb proteins (PRC2). These findings suggest that ncRNA-mediated targeting of polycomb complexes is a crucial event in breast tumorigenesis.

• lncRNAs such as HOTAIR are able to alter and regulate epigenetic states in cells through their targeting of chromatin-modifying complex occupancy/localization/enzymatic activity in trans.

Page 13

lncRNAs as scaffolds

• As scaffolds, lncRNAs can bring together multiple proteins to form ribonucleoprotein complexes.

• The lncRNA-RNP may act on chromatin as illustrated to affect histone modifications.

• In other instances, the lncRNA scaffold is structural and stabilizes nuclear structures or signaling complexes.

• Traditionally, proteins were thought to be the major players in various scaffolding complexes

Page 14

lncRNAs as scaffolds

• lncRNAs can serve as central platforms upon which relevant molecular components are assembled.

• Telomerase catalytic activity requires the association of two universal telomerase subunits: an integral RNA subunit, the telomerase RNA (TERC), which provides the template for repeat synthesis, and a catalytic protein subunit (TERT), as well as several species-specific accessory proteins. TERC in particular also possesses structures that contribute to TERT binding and catalytic activity, in addition to those that play major roles in the stability of the complex.

• Thus, the primary functional role for TERC is to be a scaffold

Page 15

• Alterations in the primary structure, secondary structure, and expression levels of lncRNAs as well as their cognate RNA-binding proteins underlie diseases ranging from neurodegeneration to cancer.

• Recent progress points to the involvement of lncRNAs in diverse human diseases.

• The evidence highlights fundamental concepts in lncRNA biology that still need to be clarified to provide a robust framework for lncRNA genetics.

Page 16

LncRNAs involved in epigenetic silencing

lncRNA ANRIL

• The INK4b/ARF/INK4a locus encodes three tumor suppressor genes that have been linked to various types of cancers.

• A recent study has characterized the mechanism by which the lncRNA ANRIL mediates INK4a transcriptional repression in cis.

• ANRIL was shown to interact with the Pc/Chromobox 7 (CBX7) protein, a member of the polycomb repressive complex 1 (PRC1).

• Altered ANRIL activity might result in dysregulated silencing of the INK4b/ARF/INK4a locus, contributing to cancer initiation.

• Genome-wide association studies have shown that the intergenic region encompassing ANRIL is significantly associated with increased susceptibility to coronary disease, intracranial aneurysm, type 2 diabetes, as well as several types of cancers.

Page 17

Splicing regulation by lncRNAs

lncRNA MALAT-1

• MALAT-1 (metastasis-associated lung adenocarcinoma transcript 1) was identified in an attempt to characterize transcripts associated with early-stage non-small-cell lung cancer (NSCLC).

• MALAT-1 is an abundant lncRNA of about 6.5 kb, transcribed from chromosome 11q13 and primarily localized in nuclear speckles.

• MALAT-1 regulates alternative splicing through its interaction with the serine/arginine-rich (SR) family of nuclear phosphoproteins which are involved in the splicing machinery.

• In NSCLC metastasizing tumors, MALAT-1 expression is three-fold higher than in non-metastasizing tumors.

• In patients with stage I disease, MALAT-1 expression is closely correlated with poor prognosis.

• lncRNAs as prognostic markers for metastasis and survival?

Page 18

Page 19

• The bulk of sequence mutations in the genome occur in non-coding and intergenic regions.

• A substantial portion of the genome is transcribed

• Mutations are transmitted to the transcriptome, potentially affecting a large number of lncRNAs.

How do small mutations in lncRNAs contribute to disease?

How does primary sequence translate into lncRNA function?