Using NGS to answer biological questions Usadellab.org @ RWTH Aachen, Forschungszentrum Jülich

Post on 24-Feb-2016

Page 1: Using NGS to answer biological questions

Using NGS to answer biological questions

Usadellab.org @ RWTH Aachen, Forschungszentrum Jülich

Page 2: Using NGS to answer biological questions

You have heard it all before

• Open platform
• Good for SNP calling
• Higher dynamic range
• Better reflects RT-PCR data

• Was considered big data once

Microarrays and RNA-Seq: the old and the new

Page 3: Using NGS to answer biological questions

[Figure: next-generation sequencing publications in PubMed per year, 1979 to 2014 (y-axis: 0 to 3500)]

The gold rush is still at full steam

But all that glitters is not gold

Sometimes a closed platform is not too bad: it also means standardization, and of course microarray data take much less time to download

Did you ever say to yourself: "Oh, let's have a brief look at this dataset…"

All that glitters is not gold

Page 4: Using NGS to answer biological questions

And then there is still mapping and stats

All that glitters is not gold

Page 5: Using NGS to answer biological questions

And storage

By the way, did you buy that new storage pod?

Another 180 TB
Another 20k€
Or worse, one you can't build yourself

All that glitters is not gold

Page 6: Using NGS to answer biological questions

So does it all come to naught?

The gold rush is still at full steam, so of course there is something to it

Did you ever think it would be possible to sequence a full genome yourself, de novo of course?

Well, that's a PhD topic now. (Within reason: up to medium-sized plant genomes; bacteria can be dealt with in a BSc if you get lucky.)

So where are good claims to be had and what can one do about it?

Can Bioinformatics help biologists?

But the gold rush is still on

Page 7: Using NGS to answer biological questions

Wild relative of S. lycopersicum

Moyle 2008
Source: Tomato Genetics Resource Center (TGRC)

Grows in Peru and northern Chile (TGRC accessions shown)

Solanum pennellii - a wild tomato relative

Page 8: Using NGS to answer biological questions

Schauer et al., 2006

Metabolites

Introgression Lines

Solanum pennellii - a great source of genetic variation

[Diagram: S. lyc M82 x S. pennellii; S. lyc M82 x F1; Introgression Line Population]

Page 9: Using NGS to answer biological questions

Trimmomatic: fast & precise

Page 10: Using NGS to answer biological questions

Filtering Effects

Page 11: Using NGS to answer biological questions

Scaffolds                      Total       N50 (>2000)
S. pennellii (V2.00)           943M        1,741,129
S. lycopersicum (V2.4)         781M        16,467,796
S. pimpinellifolium (A-1.0)    -           -
S. tuberosum (V3)              715M        1,354,002

Final Contigs                  Total Size  N50
S. pennellii (V2.00)           ~870M       45.7k
S. lycopersicum (V2.4)         738M        86.9k
S. pimpinellifolium (A-1.0)    689M        6k
S. tuberosum (V3)              683M        31.4k

Split on 'N's

SNP / small indel: <0.03%

Solanum pennellii Assembly
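As a reading aid for the scaffold and contig columns above, the following is a minimal Python sketch of how an N50 value can be computed and how contigs can be derived by splitting scaffolds on runs of 'N'. The function names, the `min_length` parameter (mirroring the ">2000" cutoff in the table header) and the toy sequences are assumptions for illustration only; this is not the pipeline used for the assemblies listed.

```python
import re

def n50(lengths, min_length=0):
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total size.
    Sequences shorter than min_length are ignored (hypothetical cutoff)."""
    ordered = sorted((n for n in lengths if n >= min_length), reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # no sequences left after filtering

def split_on_n(scaffold):
    """Split one scaffold into contigs at every run of N/n characters."""
    return [piece for piece in re.split(r"[Nn]+", scaffold) if piece]

# Toy scaffolds, made up for illustration (not real assembly data)
scaffolds = ["ACGTACGTNNNNACGT", "ACGTACNNAC", "ACGTACGTACGT"]
contigs = [c for s in scaffolds for c in split_on_n(s)]

scaffold_n50 = n50([len(s) for s in scaffolds])
contig_n50 = n50([len(c) for c in contigs])
print("scaffold N50:", scaffold_n50, "contig N50:", contig_n50)
```

Splitting on gaps always gives a contig N50 no larger than the scaffold N50, which is why the two blocks of the table differ by orders of magnitude.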

Page 12: Using NGS to answer biological questions

Unfortunately it was the cultivar Heinz and not M82 that was sequenced

Luckily, re-sequencing is relatively straightforward (sometimes)

Page 13: Using NGS to answer biological questions

Solanaceae... what makes them what they are

Physalis alkekengi (Chinese lantern)
Physalis peruviana (Cape gooseberry)
Physalis ixocarpa (tomatillo)

Page 14: Using NGS to answer biological questions

More than 2000 terms

Redundancy-reduced terms for better visualization and statistical analysis

~ 20 plant species

Automatic tool for whole transcriptome annotation

The MapMan Plant Ontology

Page 15: Using NGS to answer biological questions

Mercator

Data Submission

Mercator is an online resource that accepts large FASTA files containing plant sequences

Mercator compares the sequences to in-house annotated and classified plant sequences and searches for domains

Mercator then classifies all genes/proteins

Mercator typically processes one genome equivalent in 2-3 days in accurate mode (and faster in draft mode)

FASTA Sequence Results Summary and Tables

Mercator: Bulk Sequence classification

Page 16: Using NGS to answer biological questions

MapMan

MapMan: Omics on Plant Pathway visualization, testing

Pathway Visualization

MapMan is a graphical tool allowing
• Pathway visualization for about 20 plant species, including all major crops
• Testing for enriched pathways and processes
• Interactive data exploration and visualization, e.g. Venn diagrams, clustering, …

Expression Data / Enrichment testing / Interactive data exploration
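To make "testing for enriched pathways" concrete, here is a minimal Python sketch of a one-sided hypergeometric (over-representation) test for a single functional bin. This is a commonly used generic technique swapped in for illustration; it is not claimed to be the test MapMan itself applies, and the function name, arguments and counts are hypothetical.

```python
from math import comb

def enrichment_p_value(total_genes, genes_in_bin, selected_genes, selected_in_bin):
    """One-sided hypergeometric p-value: the probability of seeing at least
    `selected_in_bin` bin members when `selected_genes` genes are drawn
    without replacement from `total_genes`, of which `genes_in_bin`
    belong to the bin."""
    upper = min(genes_in_bin, selected_genes)
    all_draws = comb(total_genes, selected_genes)
    tail = sum(
        comb(genes_in_bin, k) * comb(total_genes - genes_in_bin, selected_genes - k)
        for k in range(selected_in_bin, upper + 1)
    )
    return tail / all_draws

# Made-up counts: 20 genes in total, 5 in the bin of interest,
# 4 genes called differentially expressed, 3 of them in the bin.
p = enrichment_p_value(total_genes=20, genes_in_bin=5,
                       selected_genes=4, selected_in_bin=3)
print(f"p = {p:.4f}")
```

With thousands of bins tested at once, such p-values would normally also be corrected for multiple testing before being interpreted.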

Page 17: Using NGS to answer biological questions

Physalis alkekengi leaf versus root

Something was done right

Bringing it together

Page 18: Using NGS to answer biological questions

Carbon Status

Arrays RNA Seq

Metabolic profiling

day night extended night

Diurnal Cycles and an Extended Night across species

Page 19: Using NGS to answer biological questions

The microarray was pretty useless (<10k genes)

Page 20: Using NGS to answer biological questions

Peak times seem to be conserved

If you are a cycling gene it seems to be good to peak around midday or midnight
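One simple way to assign a peak time to a cycling gene, and to compare peak times between orthologs, is sketched below in Python. It fits a single 24 h cosine (a first-harmonic, cosinor-style estimate) and assumes evenly spaced samples covering a whole period; the function names and the synthetic expression values are assumptions for illustration, not the analysis behind these slides.

```python
import math

def peak_time(times_h, values, period_h=24.0):
    """Estimate the time of peak expression from the phase of the first
    harmonic. Assumes evenly spaced samples spanning full periods."""
    omega = 2 * math.pi / period_h
    n = len(values)
    a = 2 / n * sum(v * math.cos(omega * t) for t, v in zip(times_h, values))
    b = 2 / n * sum(v * math.sin(omega * t) for t, v in zip(times_h, values))
    phase = math.atan2(b, a)  # radians, in (-pi, pi]
    return (phase / omega) % period_h

def circular_difference(t1, t2, period_h=24.0):
    """Shortest distance between two peak times on a clock of period_h hours."""
    d = abs(t1 - t2) % period_h
    return min(d, period_h - d)

# Synthetic profiles sampled every 4 h (made-up data)
times = [0, 4, 8, 12, 16, 20]
midday_gene = [5 + 2 * math.cos(2 * math.pi * (t - 12) / 24) for t in times]
morning_gene = [3 + 1 * math.cos(2 * math.pi * (t - 6) / 24) for t in times]

print("midday gene peaks near", round(peak_time(times, midday_gene), 2), "h")
print("phase difference:", round(circular_difference(
    peak_time(times, midday_gene), peak_time(times, morning_gene)), 2), "h")
```

The circular difference matters when comparing orthologs: peaks at 23 h and 1 h are 2 h apart, not 22 h.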

Page 21: Using NGS to answer biological questions

Phases for orthologs seem to do much worse...

Genes ordered by phase in Arabidopsis; if you stand very far away you might see some conservation

Page 22: Using NGS to answer biological questions

Diurnal Cycles and an Extended Night across species

Looking at individual genes can help....

Page 23: Using NGS to answer biological questions

Myo-inositol pathway (MIOX) shows a conserved response

[Figure: the same pathway drawn for Arabidopsis, tomato and maize. Nodes: myo-inositol, glucuronic acid, glucuronic acid-1-P, UDP-glucuronic acid, UDP-glucose, cell wall. Enzymes: MIOX, UGD. Colour code: blue up, red down.]

Conserved Pathways

Page 24: Using NGS to answer biological questions

MIOX pathway shows a correlated change in metabolites and transcripts

[Figure: pathway with nodes myo-inositol, glucuronic acid, glucuronic acid-1-P, UDP-glucuronic acid, UDP-glucose, cell wall. Enzymes: MIOX, glucuronokinase, UGD. Colour code: blue up, red down.]

Conserved Pathways... And metabolites

Page 25: Using NGS to answer biological questions

UDP-sugars drop in response to carbon depletion

[Figure: panels for UGD, MIOX and GK; x-axis: ED, EN, XN]

Carbon and the Wall

Page 26: Using NGS to answer biological questions

UDP-sugars drop in carbon depletion

[Figure: panels for UGD and GK; x-axis: ED, EN, XN]

Carbon and the Wall

Page 27: Using NGS to answer biological questions

MIOX mutants show a stronger drop in UDP-sugars

[Figure: panels for UGD and GK; x-axis: ED, EN, XN]

Carbon and the Wall

Page 28: Using NGS to answer biological questions

• Not all that glitters is gold, but treated well, NGS data can yield many more unexpected stories (S. pimp)

• NGS does allow us to actually get a handle on genomes and transcriptomes we couldn't dream of before (S. penn, Physalis)

• Using the openness of NGS one starts seeing new things and can compare between species

Summary

Page 29: Using NGS to answer biological questions

Zhangjun Fei, Jim Giovannoni, Cornell University

Raimund Tenhaken, Salzburg University

Alisdair Fernie, Mark Stitt, MPI Golm

Detlef Weigel, MPI Tübingen

Acknowledgements

Thomas Herter, LC-MS

usadellab.org