[2013.11.01] visualizing omics_data

Visualizing omics data CENTER FOR MICROBIAL COMMUNITIES Mads Albertsen Introduction to community systems microbiology 2013

Upload: madsalbertsen

Post on 14-Jun-2015


Page 1: [2013.11.01] visualizing omics_data

Visualizing omics data

CENTER FOR MICROBIAL COMMUNITIES

Mads Albertsen

Introduction to community systems microbiology

2013

Page 2: [2013.11.01] visualizing omics_data

• Visualizing omics data

• Re-introduction to 16S analysis

• Hands-on 16S analysis in RStudio

• There is so much to learn. How do I start?

CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY

Agenda

Page 3: [2013.11.01] visualizing omics_data

Visualizing data?

http://mkweb.bcgsc.ca/

Martin Krzywinski

Page 4: [2013.11.01] visualizing omics_data

Who - when, where and why?

Re-introduction to 16S analysis

Page 5: [2013.11.01] visualizing omics_data

Who - when, where and why?

Page 6: [2013.11.01] visualizing omics_data

Who - when, where and why?

http://phil.cdc.gov/phil/details.asp?pid=2226

http://en.wikipedia.org/wiki/File:EBPR_FISH_Floc.jpg

P. Larsen 2012

Accumulibacter Competibacter Bacillus anthracis

Page 7: [2013.11.01] visualizing omics_data

The affinities of all the beings of the same class have sometimes been represented by a great tree... The green and budding twigs may represent existing species; and those produced during former years may represent the long succession of extinct species.

C. Darwin, 1872

http://tolweb.org

Nothing in biology makes sense, except in the light of evolution.

T. Dobzhansky, 1973

Taking advantage of evolution

Page 8: [2013.11.01] visualizing omics_data

Why do we use the 16S gene?

Ribosomes are universal

rRNA = Structural RNA

http://www.rna.icmb.utexas.edu/SAE/2B/ConsStruc/Diagrams/cons.16.b.Bacteria.pdf

Page 9: [2013.11.01] visualizing omics_data

Why do we use the 16S gene?

http://www.rna.icmb.utexas.edu/SAE/2B/ConsStruc/Diagrams/cons.16.b.Bacteria.pdf

8F universal primer

Page 10: [2013.11.01] visualizing omics_data

Why do we use the 16S gene?

http://www.rna.icmb.utexas.edu/SAE/2B/ConsStruc/Diagrams/cons.16.b.Bacteria.pdf

Ashelford et al. AEM. 2005;71:7724-7736

• Advantages:
  • Universal gene (No horizontal gene transfer)
  • Conserved regions
  • Variable regions
  • Great databases and alignments

• Problems:
  • Variable copy number
  • No universal (unbiased) primers
  • (Not directly correlated with activity)
  • (Lack of functional information)

Page 11: [2013.11.01] visualizing omics_data

Typical workflow

Sampling → Extraction → Sample prep → Sequencing → Bioinformatics

There are a lot of steps!

Page 12: [2013.11.01] visualizing omics_data

• Standardisation, standardization, standardizasion..!

• Use biological replicates and evaluate your variation…!

• Design a good experiment with realistic expectations of the outcome (Most studies fail here!!!)

Typical workflow

AAU activated sludge standard @ midasfieldguide.org

Page 13: [2013.11.01] visualizing omics_data

Extraction parameters evaluated (figure):

• eDNA removal: PMA, 650 W, 10 min
• Input (mg)
• Bead beating: intensity (m s-1) and duration (s)
• Storage: Fresh, 24 h @ 4 °C, 24 h @ 20 °C

Tick values recovered from the figure: 4, 6; 20, 40, 80, 160, 400; 1, 2, 4, 9, 22

Typical workflow

AAU activated sludge standard @ midasfieldguide.org

Page 14: [2013.11.01] visualizing omics_data

[Figure: mean frequency of the most common residue in a 50 bp window (y-axis, 0.6 to 1.0) along the 16S gene (x-axis, 0 to 1500 bp). Variable regions V1 to V9 are marked, together with the amplicon targets V1.3, V3.4 and V4.]

Typical workflow

AAU activated sludge standard @ midasfieldguide.org

Ashelford et al. AEM. 2005;71:7724-7736

Page 15: [2013.11.01] visualizing omics_data

Typical workflow

PCR with modified 16S primers

Forward primer (Illumina adapter, Pad, linker, 27F):

5’-AATGATACGGCGACCACCGAGATCTACAC GTACGTACG GT AGAGTTTGATCCTGGCTCAG-3’

Reverse primer (Illumina adapter, Barcode, Pad, linker, 534R):

5’-CAAGCAGAAGACGGCATACGAGAT TCCCTTGTCTCC ACGTACGTAC CCG ATTACCGCGGCTGCTGG-3’

[Figure: the primers and the target region over PCR cycles 1, 2 and 3]

AAU activated sludge standard @ midasfieldguide.org
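The layout of the two modified primers can be checked with a short Python sketch. The segment sequences are copied from the slide; pairing each space-separated segment with the label in the same position (adapter, barcode, pad, linker, 16S primer) is an assumption, and the names `FORWARD_PARTS`, `REVERSE_PARTS` and `assemble` are made up for illustration.

```python
# Segments as printed on the slide; the label-to-segment pairing is assumed
# from the order of the labels and is not confirmed by the source.
FORWARD_PARTS = [
    ("Illumina adapter", "AATGATACGGCGACCACCGAGATCTACAC"),
    ("Pad", "GTACGTACG"),
    ("Linker", "GT"),
    ("27F", "AGAGTTTGATCCTGGCTCAG"),
]
REVERSE_PARTS = [
    ("Illumina adapter", "CAAGCAGAAGACGGCATACGAGAT"),
    ("Barcode", "TCCCTTGTCTCC"),
    ("Pad", "ACGTACGTAC"),
    ("Linker", "CCG"),
    ("534R", "ATTACCGCGGCTGCTGG"),
]


def assemble(parts):
    """Concatenate labelled segments into one 5'->3' primer string."""
    return "".join(seq for _, seq in parts)


forward_primer = assemble(FORWARD_PARTS)
reverse_primer = assemble(REVERSE_PARTS)

for name, parts in (("forward", FORWARD_PARTS), ("reverse", REVERSE_PARTS)):
    primer = assemble(parts)
    print(f"{name} primer, {len(primer)} nt")
    for label, seq in parts:
        print(f"  {label:<17}{len(seq):>3} nt  {seq}")
```

Only the reverse primer carries a barcode here, so in this design the sample identity is read from that one segment.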

Page 16: [2013.11.01] visualizing omics_data

Typical workflow

Mardis, 2008 (PMID 18576944)

≈ 500 bp target amplicon

Page 17: [2013.11.01] visualizing omics_data

Typical workflow

Read 1: 300 bp

Read 2: 300 bp

Read 1, Read 2, Barcode

≈ 500 bp target amplicon

After Sequencing:
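Two 300 bp reads sequenced from opposite ends of a ≈ 500 bp amplicon have to overlap in the middle, and that overlap is what makes merging possible. A minimal sketch of the arithmetic follows; the function name `expected_overlap` is made up, and primers and quality trimming are ignored.

```python
def expected_overlap(read1_len, read2_len, amplicon_len):
    """Number of amplicon bases covered by both reads (0 if they do not meet)."""
    return max(0, read1_len + read2_len - amplicon_len)


overlap = expected_overlap(300, 300, 500)
print(overlap)  # 100
```

With shorter reads, for example 2 x 150 bp on the same amplicon, the function returns 0 and the reads could not be merged.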

Page 18: [2013.11.01] visualizing omics_data

Typical workflow

How many sequences are needed? It depends on your question! (although 50,000 raw sequences per sample is usually fine)

AAU raw kit and chemical costs (DKK)

                                        Cost      Cost v2
DNA extraction                          105       70 (a)
Library preparation                     40        40
Sequencing (min 100k reads / sample)    190 (b)   70 (c)
Total                                   335       180

(a) Kits discounted
(b) 50 samples per run
(c) 150 samples per run (can run up to 300)

Page 19: [2013.11.01] visualizing omics_data

Typical workflow

Merge → Cluster → Assign taxonomy (Compare to database)

Read counts per cluster: 3, 11, 3, 1

OTU table

OTU                   Count
Accumulibacter        3
Unknown               11
Competibacter         3
Bacillus anthracis    1

Page 20: [2013.11.01] visualizing omics_data

Typical workflow

Merge → Cluster → Assign taxonomy (Compare to database)

Barcode: AAAAAAAAA BBBBBBBBB (nine reads from sample A, nine from sample B)

OTU table

OTU                   A    B
Accumulibacter        2    1
Unknown               3    8
Competibacter         3    0
Bacillus anthracis    1    0
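How the barcode turns a list of classified reads into a per-sample OTU table can be sketched in a few lines of Python. The toy reads reproduce the counts on the slide; the data layout and the function `build_otu_table` are illustrative and are not taken from QIIME or from the AAU workflow.

```python
from collections import Counter

# Toy input: one (sample barcode, OTU label) pair per merged read,
# i.e. the state after clustering and taxonomy assignment.
reads = (
    [("A", "Accumulibacter")] * 2 + [("B", "Accumulibacter")] * 1
    + [("A", "Unknown")] * 3 + [("B", "Unknown")] * 8
    + [("A", "Competibacter")] * 3
    + [("A", "Bacillus anthracis")] * 1
)


def build_otu_table(reads):
    """Count reads per (sample, OTU) and return rows keyed by OTU."""
    counts = Counter(reads)
    samples = sorted({sample for sample, _ in reads})
    otus = list(dict.fromkeys(otu for _, otu in reads))  # keep first-seen order
    table = {otu: [counts[(sample, otu)] for sample in samples] for otu in otus}
    return table, samples


table, samples = build_otu_table(reads)
print("OTU".ljust(20), *samples)
for otu, row in table.items():
    print(otu.ljust(20), *row)
```

Summing each row gives the single-sample counts from the previous slide (3, 11, 3, 1).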

Page 21: [2013.11.01] visualizing omics_data

Typical workflow

Sequence errors, chimeras and weird stuff...

The chance of a perfect read as a function of the read length

Chimeras
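The relation between read length and the chance of a perfect read can be approximated with a simple model. The sketch below assumes a constant, independent per-base error rate, which real sequencers do not have (quality usually drops towards the end of a read); the 1% error rate and the function name `p_perfect_read` are illustrative only.

```python
def p_perfect_read(read_length, per_base_error):
    """Probability that every base is correct, assuming independent errors."""
    return (1.0 - per_base_error) ** read_length


# Illustrative error rate of 1% per base (an assumption, not a measured value).
for length in (50, 100, 200, 300):
    print(length, round(p_perfect_read(length, 0.01), 3))
```

Under this model the chance of an error-free read falls exponentially with read length, which is why longer reads produce more spurious unique sequences.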

Page 22: [2013.11.01] visualizing omics_data

Typical workflow

Merge → Cluster → Assign taxonomy (Compare to database)

Read counts per cluster: 3, 11, 3

OTU table

OTU               Count
Accumulibacter    3
Unknown           11
Competibacter     3

Removing unique sequences makes the subsequent steps 10-100x faster and removes the majority of errors and chimeras.

This is dependent on sequencing depth and sample complexity! Be careful!
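Removing unique sequences can be sketched as follows, reading "unique" as a merged sequence that occurs exactly once in the data set. The function name `drop_unique_sequences` and the `min_count` parameter are made up for illustration; the actual workflow may implement this step differently.

```python
from collections import Counter


def drop_unique_sequences(sequences, min_count=2):
    """Keep only reads whose exact sequence occurs at least min_count times."""
    counts = Counter(sequences)
    return [seq for seq in sequences if counts[seq] >= min_count]


reads = ["ACGT", "ACGT", "ACGA", "TTGC", "TTGC", "TTGC"]
kept = drop_unique_sequences(reads)
print(len(reads), "->", len(kept))  # 6 -> 5
```

In a shallow or very diverse sample many genuine low-abundance organisms are also seen only once, so the same filter discards real signal there.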

Page 23: [2013.11.01] visualizing omics_data

AAU workflow

16SAMP-145
16SAMP-146
16SAMP-147
16SAMP-148
16SAMP-149
16SAMP-150

16S.V13.workflow.sh

Find sample IDs on Google Drive

OTU table (+ R version)

Plain text file

OTU               A    B
Accumulibacter    2    1
Unknown           3    8
Competibacter     3    0

Page 24: [2013.11.01] visualizing omics_data

AAU workflow

What 16S.V13.workflow.sh does:
1. Find and unpack your samples
2. Optional subsampling
3. Remove potential phiX contamination (bowtie2)
4. Merge read 1 and read 2 (flash)
5. Remove reads outside length criteria
6. Optional removal of unique reads and subsampling to even depth
7. Format reads for QIIME
8. Cluster reads to OTUs (Uclust, QIIME)
9. Assign taxonomy (RDP classifier, QIIME + database: MiDAS, Greengenes or Silva)
10. Generate OTU table (QIIME)
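Steps 5 and 6 are the easiest to picture in code. The Python sketch below is not what 16S.V13.workflow.sh runs internally; it only illustrates a length filter and subsampling to even depth on toy data. The function names, the length bounds and the fixed random seed are all assumptions.

```python
import random


def filter_by_length(reads, min_len, max_len):
    """Step 5 (illustrative): keep reads whose length is within the bounds."""
    return [read for read in reads if min_len <= len(read) <= max_len]


def subsample_even(samples, depth=None, seed=0):
    """Step 6 (illustrative): draw the same number of reads from every sample.

    samples maps a sample name to its list of reads. If depth is None,
    the size of the smallest sample is used.
    """
    if depth is None:
        depth = min(len(reads) for reads in samples.values())
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return {name: rng.sample(reads, depth) for name, reads in samples.items()}


# Toy reads; real merged V1-3 reads would be hundreds of bases long.
samples = {
    "A": ["A" * n for n in (3, 5, 5, 6, 6, 9)],
    "B": ["C" * n for n in (5, 5, 6)],
}
filtered = {name: filter_by_length(reads, 4, 7) for name, reads in samples.items()}
even = subsample_even(filtered)
for name, reads in even.items():
    print(name, len(reads))
```

Subsampling to the smallest sample keeps the samples comparable, at the cost of throwing away reads from the deeper ones.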

Page 25: [2013.11.01] visualizing omics_data

Where do I start?

• Get online (Twitter, blogs, seqanswers.com)

• Learn basic multivariate statistics

• Learn R (with RStudio)

• Analyzing Ecological Data (2007) by Zuur, Ieno & Smith