Call for non-coding mRNA resource

A call for the creation of a Human long ncRNA Clone Set. Matthias Harbers - January 2012

Upload: matthias-harbers

Post on 11-May-2015


Category:

Health & Medicine



DESCRIPTION

Genomic projects provided clone resources for coding mRNAs, but non-coding mRNAs are still missing for functional studies. Seeing the growing number of non-coding mRNAs, we hope the community will prepare better resources for studying human non-coding mRNAs.

TRANSCRIPT

Page 1: Call for non-coding mRNA resource

A call for the creation of a Human long ncRNA Clone Set

Matthias Harbers - January 2012


Page 2: Call for non-coding mRNA resource

Long non-coding mRNAs


The dogma that an “mRNA has to encode a Protein” directed the creation of protein coding clone collections

The Mammalian Gene Collection (MGC) and ORFeome Collaboration (OC) focused on the cloning of protein coding transcripts

However, it is now generally accepted that many mRNAs are non-coding transcripts that exert their functions through other, mostly unknown, mechanisms

All clone collections are incomplete because of the use of oligo-dT priming

Up to 40% or more of mRNAs may lack polyadenylation and are hence not covered by classical cDNA libraries

Many non-polyadenylated mRNAs could be non-coding mRNAs

Page 3: Call for non-coding mRNA resource

Research is driven by sequencing


High-speed sequencing changed the way genomic research is done

High-speed sequencing has an unmatched power in “discovery”

However, expressed sequences need confirmation by other means

New transcripts have to be analyzed for their functions

Loss-of-function studies became easier with RNAi knock-down experiments

However, gain-of-function experiments are essential to understand mechanisms and functions

We should have the resources needed to study the functions of ncRNAs!

Page 4: Call for non-coding mRNA resource

“Real” versus “predicted”


Public databases like NCBI and others use “reference sequences” for transcripts

This implies that there is only one transcript per gene!

Reference sequences ignore splicing!

Actual cDNA clones in the public domain often do not match reference sequences

High-speed sequencing data provide more “predicted sequences” based on the assembly of short reads into contigs or the like

Predicted sequences are not experimentally verified!

Cloned cDNAs have at least an “experimental origin”!

Page 5: Call for non-coding mRNA resource


The “Knowledge Cycle”

• Creation of genomic resources and data: Public databases
• Functional Studies: Gene annotation
• Find the clones you need: Clone Distribution Services

Physical resources are needed to bring “life” to in silico data!

Page 6: Call for non-coding mRNA resource

What is needed?


Define what long non-coding mRNAs are (ongoing in the community)

Description of the human non-coding mRNA set

Consensus on the features of the human non-coding mRNA set

Starting materials available in the community?

New starting materials required?

Build a consortium to create a human non-coding mRNA clone set

Consider non-coding mRNA collections from other organisms

Small non-coding RNAs are not considered here because they are in part already covered by some public collections (e.g. Netherlands Cancer Institute)

Page 7: Call for non-coding mRNA resource

Features of non-coding mRNA set


ORFeome Collaboration committed to Invitrogen Gateway system

Broad Institute also uses Invitrogen Gateway system

Suggestion to stay with Invitrogen Gateway system for ncRNA set?

However, many clone customers do not want Invitrogen Gateway clones!

Addition of restriction sites could enable sub-cloning without use of the Invitrogen Gateway system

For example, Promega offers Flexi® Vectors using SgfI and PmeI

Should the parental clones from cDNA libraries be made available?

Should the collection include splice variants?

Are there special requirements we do not know of?
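The suggestion above to add SgfI and PmeI sites for sub-cloning raises one practical check: an insert that already contains one of these sites internally would presumably be cut during such sub-cloning. The following is a minimal sketch of that check, not part of the original slides. The recognition sequences used are SgfI (GCGATCGC) and PmeI (GTTTAAAC); the function names and the example sequences are hypothetical.

```python
# Recognition sequences for the two enzymes named on the slide.
# Both sites are palindromic, so searching one strand is sufficient.
SITES = {"SgfI": "GCGATCGC", "PmeI": "GTTTAAAC"}


def find_internal_sites(insert_seq):
    """Return {enzyme: [0-based start positions]} for sites inside the insert."""
    seq = insert_seq.upper().replace("U", "T")  # accept RNA or DNA letters
    hits = {}
    for enzyme, site in SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)  # allow overlapping matches
        if positions:
            hits[enzyme] = positions
    return hits


def has_no_internal_sites(insert_seq):
    """True if the insert contains neither an SgfI nor a PmeI site."""
    return not find_internal_sites(insert_seq)


if __name__ == "__main__":
    # Hypothetical sequences, for illustration only.
    print(find_internal_sites("ATGGCGATCGCTTTAGGC"))
    print(has_no_internal_sites("ATGCCCGGGTTT"))
```

Because both enzymes recognize 8-bp sites, internal sites should be rare, but a collection of thousands of clones would still be expected to contain some inserts that need separate handling.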

Page 8: Call for non-coding mRNA resource

Available starting materials


Want to use high-quality full-length cDNA clones and libraries where possible!

Human cDNA collections:
• RIKEN: ~311,000 end-sequenced human cDNA clones in NCBI?
• Other human cDNA clones: e.g. ORIGENE?

Human full-length cDNA libraries in the public domain?

I do not see gene synthesis based on predicted sequences as a “general option”

I prefer starting from cDNA libraries using “real transcripts”

Classical cDNA libraries have not been sequenced deeply enough to cover rare genes!

There is an option to find more unique clones in old libraries

Need new cDNA libraries/pools to cover important biological samples?

Page 9: Call for non-coding mRNA resource

New technologies will help


In the past, sequencing cost was the limiting factor for building clone collections

Many clones in the public domain are not full-length sequenced

Lack of sequence information limits clone annotation

New high-speed sequencing methods can overcome limitation on sequencing cost

Use high-speed sequencing instead of end-sequencing of individual clones to screen cDNA libraries more deeply

Use high-speed sequencing to obtain full-length sequences of all clones within ncRNA collection

Use high-speed sequencing to assure high quality standards of entire collection!

Page 10: Call for non-coding mRNA resource


End-sequencing workflow: RNA → cDNA Library → Clone Picking → End Sequencing → Annotated Clone Collection
(limited by sequencing cost; redundancy in clone collection)

Shotgun-sequencing workflow: RNA → cDNA Library/cDNA Pool → Shotgun sequencing → Library Screening → Clones for Targets
(much higher coverage; focus on new targets)
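The "Library Screening → Clones for Targets" step can be pictured as a selection problem: among all clones that match a target transcript, keep one per target and skip targets that are already in the collection. The sketch below illustrates that idea only and is not the author's method. The `Hit` record, the `coverage` score, and the 0.9 threshold are all hypothetical.

```python
from collections import namedtuple

# Hypothetical screening result: one clone matched to one target transcript,
# with the fraction of the target sequence that the clone covers.
Hit = namedtuple("Hit", ["clone_id", "target_id", "coverage"])


def pick_clones(hits, already_covered, min_coverage=0.9):
    """Pick the best-covering clone for each target not yet in the collection.

    hits            -- iterable of Hit records from library screening
    already_covered -- set of target IDs that already have a clone
    min_coverage    -- assumed threshold for calling a clone acceptable
    """
    best = {}
    for hit in hits:
        if hit.target_id in already_covered:
            continue  # avoid redundancy in the collection
        if hit.coverage < min_coverage:
            continue  # skip clones covering too little of the target
        current = best.get(hit.target_id)
        if current is None or hit.coverage > current.coverage:
            best[hit.target_id] = hit
    return {target: hit.clone_id for target, hit in best.items()}


if __name__ == "__main__":
    example_hits = [
        Hit("clone_A", "target_1", 0.95),
        Hit("clone_B", "target_1", 0.99),
        Hit("clone_C", "target_2", 1.00),
        Hit("clone_D", "target_3", 0.50),
    ]
    print(pick_clones(example_hits, already_covered={"target_2"}))
```

Keeping a set of already covered targets is what separates this from random clone picking, where the same abundant transcripts are picked and sequenced repeatedly.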

Page 11: Call for non-coding mRNA resource

New starting materials required


Many mRNAs lack polyadenylation and require a new cloning method

Total RNA from cell

Removal of polyA mRNA

Ligation of 3’ adapter to mRNA

cDNA synthesis using 3’ adapter

Cap selection using Cap Trapper

Cloning into cDNA library

Size does matter: Classical cDNA projects had a size cutoff
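To make the difference between the two library types concrete, the steps above can be modelled as simple filters on transcript properties. This is a toy sketch under stated assumptions: oligo-dT priming is treated as capturing only polyadenylated transcripts, and the adapter-based protocol as capturing only capped transcripts that remain after polyA removal. The `Transcript` class and the example entries are hypothetical, and the size cutoff mentioned on the slide is not modelled because its value is not given.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    name: str
    polyadenylated: bool
    capped: bool


def captured_by_oligo_dt(transcript):
    """Classical library: cDNA synthesis is primed from the poly(A) tail."""
    return transcript.polyadenylated


def captured_by_adapter_protocol(transcript):
    """Adapter-based library: polyA mRNA is removed first, a 3' adapter primes
    cDNA synthesis, and cap selection keeps only capped transcripts."""
    return (not transcript.polyadenylated) and transcript.capped


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    pool = [
        Transcript("polyA_plus_capped", polyadenylated=True, capped=True),
        Transcript("polyA_minus_capped", polyadenylated=False, capped=True),
        Transcript("polyA_minus_uncapped", polyadenylated=False, capped=False),
    ]
    for t in pool:
        print(t.name, captured_by_oligo_dt(t), captured_by_adapter_protocol(t))
```

In this model the two library types are complementary: no transcript is captured by both, which is why the slides call for new starting materials and not only deeper sequencing of existing libraries.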

Page 12: Call for non-coding mRNA resource

Build internet presence


Any clone collection requires an internet presence with a database!

Clone related information can only be provided by a database

Annotation of the clones by reference to other resources is important

Application notes and references could be a great draw for users

Good documentation of the project is needed

Provide all clones to community without limitations on the rights to use

(follow example of “Good Faith Agreement” of ORFeome Collaboration)

ncRNAs may require “more” for a better understanding of how to study new mechanisms and functions!

Become the “home” for research on ncRNAs!
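As a rough illustration of what a database entry for such a collection might hold, the sketch below defines one clone record with cross-references to other resources and attached application notes. Every field name and value here is a hypothetical assumption chosen to mirror the points on this slide; the slides do not specify a schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class CloneRecord:
    clone_id: str                # identifier within the collection (hypothetical)
    organism: str
    vector: str                  # cloning vector the insert is held in
    full_length_sequenced: bool  # whether the whole insert has been sequenced
    # Annotation by reference to other resources: {resource name: identifier}.
    cross_references: dict = field(default_factory=dict)
    # Free-text application notes or literature references for users.
    application_notes: list = field(default_factory=list)


def to_json(record):
    """Serialize a record so it can be served by a web front end."""
    return json.dumps(asdict(record), sort_keys=True)


if __name__ == "__main__":
    record = CloneRecord(
        clone_id="EXAMPLE-0001",
        organism="Homo sapiens",
        vector="hypothetical-entry-vector",
        full_length_sequenced=True,
        cross_references={"example_database": "EXAMPLE_ID_123"},
    )
    print(to_json(record))
```

Keeping cross-references as a mapping lets new external resources be added later without changing the record layout.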

Page 13: Call for non-coding mRNA resource

Conclusion


MGC and OC set standards for human clone resources!

We should build on the great record of MGC and OC to move from coding mRNAs to long non-coding mRNAs

After most coding genes have been covered by at least one cDNA clone, we need to work on the non-coding transcripts to move forward

Non-coding genes are essential players in life and we want to provide comprehensive resources for their study and analysis

Starting with human ncRNAs will greatly benefit medical research

Including ncRNAs from other organisms could be an option where those are key model systems to study ncRNA functions (RIKEN FANTOM clone set from mouse includes many long ncRNAs)

Page 14: Call for non-coding mRNA resource
