alvis brazma, array express gene expression atlas, fged_seattle_2013
DESCRIPTION
From microarrays to RNA-seq. From archiving experiments to Gene Expression Atlas.
TRANSCRIPT
From microarrays to RNA-seq. From archiving experiments to Gene Expression Atlas
Alvis Brazma, European Bioinformatics Institute
European Bioinformatics Institute (EBI)
• EBI is in Hinxton, ~10 miles south of Cambridge, UK, on the Wellcome Trust Genome Campus
• EBI is part of EMBL, roughly the equivalent of CERN for molecular biology
• ~500 scientific and IT staff at EBI
• Hosting the ELIXIR node (details to follow)
Building an archive of public gene expression data (www.ebi.ac.uk/arrayexpress)
Search for cancer
150 citations of ArrayExpress publications using the data in 2011:
From microarrays to RNA-seq
NGS vs Array data in ArrayExpress
[Figure: array data and sequencing data in ArrayExpress by year, 2009 to 2012; logarithmic y-axis with ticks from 1 to 10000]
Experiment details
MINSEQE - Minimum Information about a high-throughput SeQuencing Experiment
1. The description of the biological system and the particular states that are studied
2. The sequence read data for each assay
3. The 'final' processed (or summary) data for the set of assays in the study
4. The experiment design including sample data relationships
5. General information about the experiment
6. Essential experimental and data processing protocols
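The six elements above can be read as a completeness checklist for a submission. The sketch below is only an illustration of that idea: the field names and the `missing_minseqe_elements` helper are hypothetical, since MINSEQE lists what information is needed but does not prescribe a schema, and this is not how ArrayExpress validates submissions.

```python
# Hypothetical keys for the six MINSEQE elements (names are assumptions,
# not part of the MINSEQE guideline or any ArrayExpress format).
MINSEQE_ELEMENTS = {
    "biological_system": "Biological system and the particular states studied",
    "read_data": "Sequence read data for each assay",
    "processed_data": "Final processed (or summary) data for the set of assays",
    "experiment_design": "Experiment design, including sample-data relationships",
    "general_info": "General information about the experiment",
    "protocols": "Essential experimental and data processing protocols",
}


def missing_minseqe_elements(submission):
    """Return checklist keys that are absent or empty in a submission dict."""
    return [key for key in MINSEQE_ELEMENTS if not submission.get(key)]


# Made-up example submission, for illustration only.
example = {
    "biological_system": "Mouse pancreatic islets, pregnant vs non-pregnant",
    "read_data": ["run1.fastq.gz", "run2.fastq.gz"],
    "processed_data": [],  # present but empty, so it counts as missing
    "experiment_design": "2 conditions x 3 replicates",
    "general_info": "Illustrative submission, not a real accession",
}

for key in missing_minseqe_elements(example):
    print("Missing:", MINSEQE_ELEMENTS[key])
```

Treating an empty value the same as an absent one keeps the check strict: a placeholder field does not count as supplied information.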
ArrayExpress
• ArrayExpress Archive (Primary DB): submissions and GEO import, curation. Users – bioinformaticians
• Expression Atlas (Added Value DB): curation, EFO annotation, statistical computations. Users – all biologists
• Expression Atlas queries: query for genes, query for conditions
Mission - making ArrayExpress data accessible to every biologist
Igfbp4
Experiment view
Baseline Expression, e.g. which genes are expressed in a normal human kidney?
Differential Expression, e.g. which genes are up-regulated in pancreatic islets of pregnant mice?
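The difference between the two question types can be made concrete with a toy sketch. Everything here is an assumption for illustration: the gene names and values are made up, the expression cutoff and fold-change threshold are arbitrary, and a real differential analysis relies on replicates and statistical testing rather than a bare fold change. It is not the method used by the Atlas.

```python
import math


def expressed_genes(expression, threshold=1.0):
    """Baseline question: which genes are at or above a cutoff in one condition?"""
    return sorted(g for g, value in expression.items() if value >= threshold)


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of two expression values; the pseudocount avoids division by zero."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


def up_regulated(treated, control, min_log2fc=1.0):
    """Differential question: which genes are higher in `treated` than in `control`?"""
    return sorted(
        g for g in treated
        if g in control
        and log2_fold_change(treated[g], control[g]) >= min_log2fc
    )


# Made-up values, for illustration only.
kidney = {"GENE_A": 12.0, "GENE_B": 0.2, "GENE_C": 3.5}
islets_pregnant = {"GENE_A": 7.0, "GENE_B": 2.0}
islets_control = {"GENE_A": 1.0, "GENE_B": 2.0}

print(expressed_genes(kidney))
print(up_regulated(islets_pregnant, islets_control))
```

A baseline query needs only one condition, while a differential query always compares two, which is why the two kinds of experiment are presented separately.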
www-test.ebi.ac.uk/gxa
www-test.ebi.ac.uk/gxa Baseline Expression
RNA-seq Processing Pipeline
• Quality control: remove low-quality reads (e.g. NNACGANNNNN)
• Mapping
• Baseline quantification
• Summarization
(Painting by shardcore, http://www.shardcore.org/)
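The quality-control step can be illustrated with a minimal read filter. This is a sketch under stated assumptions, not the pipeline's actual implementation: the slide only shows a read dominated by uncalled bases (N), so the rule used here (drop reads whose fraction of N exceeds a cutoff), the 10% cutoff, and the function names are all hypothetical.

```python
def n_fraction(read):
    """Fraction of uncalled bases (N) in a read; an empty read counts as all N."""
    if not read:
        return 1.0
    return read.upper().count("N") / len(read)


def filter_reads(reads, max_n_fraction=0.1):
    """Keep reads whose fraction of N bases does not exceed the cutoff."""
    return [read for read in reads if n_fraction(read) <= max_n_fraction]


# The first read is the example from the slide; the others are made up.
reads = ["NNACGANNNNN", "ACGTACGTACG", "ACGTNCGTACGT"]

for read in filter_reads(reads):
    print(read)
```

In practice, quality control usually also considers per-base quality scores; only the base calls are used here because that is all the slide shows.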
Experimental Factors:
• Tissue
• RNA type
• Cellular component
• Cell line
Experiments so far (RNA-seq):
• Illumina Body Map (16 runs)
• Transcriptome of DBAxC57BL/6J mice (36 runs)
• RNA-seq of long coding and long non-coding RNAs from ENCODE cell lines (162 runs)
• RNA-seq of 6 tissues from 10 species to investigate the evolution of gene expression levels in mammalian organs (127 runs)
Illumina Body Map (16 human tissues)
*Rando, T. A. Stem cells, ageing and the quest for immortality. Nature 441, 1080-1086 (29 June 2006). doi:10.1038/nature04958
Future Plans – Inclusion of Proteomics Data
Future Plans – Genome Browser View
Future Plans – Comparable Experiments
Now Future
Future Plans – Atlas Home Page
www-test.ebi.ac.uk/gxa We need your feedback!
Robert Petryszak
Funding
• EMBL member countries
• European Commission FP7 grants
Questions
www-test.ebi.ac.uk/gxa