Sept 9, 2018

A metagenomic tool for cheese ecosystems

Anne-Laure Abraham, Quentin Cavaillé, Thibaut Guirimand, Sandra Dérozier, Charlie Pauvert, Mahendra Mariadassou, Bedis Dridi, Valentin Loux, Pierre Renault

Jouy en Josas – France

RCAM 2018

Cheesemaking

Evolution of the ecosystem during cheese making

Inoculated micro organisms

House microbiota

Starters

Micro organisms from: animal milk,

water flows, air flows

Micro organisms from salt

Ripening cultures

Micro organisms from shelves, cellar

Micro organisms: bacteria, yeasts, fungi, phages

Properties of cheese micro organisms

Starters

Micro organisms from: animal milk,

water flows, air flows

Micro organisms from salt

Ripening cultures

Micro organisms from shelves, cellar

Organoleptic properties

Acid flavor

Fruity flavor

Formation of bubbles

Production of lactic acid, carbon dioxide, alcohol, aldehydes, ketones…

Rind texture

Rind color

Knowledge of cheese micro organisms

Starters

Micro organisms from: animal milk,

water flows, air flows

Micro organisms from salt

Ripening cultures

Micro organisms from shelves, cellar

Defined starter cultures

Undefined complex starters: not completely known

more vulnerable to bacteriophage attack

Known

“domesticated cultures”

Inoculated micro organisms

House microbiota: not completely known

Why study the cheese ecosystem?

Protect functional properties of strains

Identify origin of organoleptic properties of strains

Quality control

Follow ecosystem during cheese manufacturing

Compare production lines

Study strain diversity

Major reduction in the diversity of micro-organisms due to sanitary pressure & intensification of production

Food microbiomes project

• Project with academic and dairy-industry partners

• Use metagenomics to achieve a better understanding of cheese ecosystems

Develop a user-friendly tool to analyze cheese samples

• Characteristics of cheese ecosystems

  • Few species (a few dozen)

  • More than 4000 sequenced dairy genomes (≥ 1 genome for most species)

• Needs

  • Precise taxonomic assignment (strain level)

  • Identification of low-abundance species

  • Identification of genes (and their functions)

  • A user-friendly interface for non-bioinformaticians

  • A database with dairy genomes

  • Results easy to understand

  • Public / private genomes & metagenomes

Metagenomic shotgun taxonomic assignment

Methods based on k-mers or the Burrows–Wheeler transform

Kraken (Wood, 2014)

CLARK (Ounit, 2015)

Kaiju (Menzel, 2015)

Centrifuge (Kim, 2016)

Methods based on genome/contig mapping

Sigma (Ahn, Bioinformatics, 2015)

MicrobeGPS (Lindner, PLoS ONE, 2015)

DESMAN (Quince, Genome Biology, 2017)

MetaSNV (Costea, PLoS ONE, 2017)

ConStrains (Luo, Nat Biotech, 2015)

metaMLST (Zolfo, NAR, 2017)

StrainPhlAn (Truong, Genome Research, 2017)

Methods based on marker genes

Limited taxonomic assignment precision

Precise taxonomic assignment

Fast, large database

Slow, limited database

Identification of strain-level variation

Metagenomic alignment

Ecosystem

Alignment

sequencing

mismatches

Unaligned reads

Reference genomes

Sequencing errors & absence of a good reference genome

Choice of alignment parameters

Metagenomic alignment

Ecosystem

Alignment

sequencing

Reference genomes

Regions with high read coverage

Repeated regions

Heterogeneous sequencing depth

Transposable elements

Conserved regions

Low abundance

High abundance

Choice of cleaning of alignment results

Coverage of genomes

Close strain – intermediate abundance

Absent strain

Very close strain – high abundance

Characteristics of alignment

Software: Bowtie (Langmead, Genome Biology 2009)

• 3 mismatches allowed (-v)

• If several best hits, choose one randomly (-a --best --strata -M 1)

[Figure: CDS and filtered CDS along a genome]

Select reads that align on CDS

Filter some CDS:

• Annotated: integrase, transposases, IS, phage

• Length < 300 nt
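The CDS filter described on this slide can be sketched as follows. This is a minimal illustration, not the tool's actual code: the `Cds` record, the `keep_cds` helper and the idea of matching keywords against the product annotation are assumptions, and "IS" is matched through the hypothetical phrase "insertion sequence" to avoid matching the letters "is" inside unrelated words.

```python
from dataclasses import dataclass

# Keywords follow the slide; matching them in the product field is an assumption.
EXCLUDED_KEYWORDS = ("integrase", "transposase", "insertion sequence", "phage")
MIN_CDS_LENGTH = 300  # nt, threshold given on the slide


@dataclass
class Cds:
    """Hypothetical minimal CDS record (coordinates are 1-based, inclusive, as in GFF)."""
    name: str
    product: str
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def keep_cds(cds: Cds) -> bool:
    """Return True if the CDS is long enough and not annotated as a mobile element."""
    if cds.length < MIN_CDS_LENGTH:
        return False
    product = cds.product.lower()
    return not any(keyword in product for keyword in EXCLUDED_KEYWORDS)


cds_list = [
    Cds("geneA", "lactate dehydrogenase", 1, 900),
    Cds("geneB", "IS30 family transposase", 1000, 2000),
    Cds("geneC", "hypothetical protein", 2100, 2250),
    Cds("geneD", "phage tail protein", 3000, 4000),
]
kept = [cds.name for cds in cds_list if keep_cds(cds)]
```

Only `geneA` survives here: `geneB` and `geneD` are excluded by annotation, `geneC` by length.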

Characteristics of mapping

[Figure: CDS and filtered CDS along a genome]

Samtools & bedtools:

• Identify variant positions

• VCF file

Compute expected coverage

• Fraction of genome that should be covered by at least one read if the genome is present

• Lander & Waterman statistics

$$C = 1 - \exp\left(-\frac{\text{ReadNumber} \times \text{ReadLength}}{\text{GenomeLength}}\right)$$

[Figure: observed distribution vs. expected distribution]

htslib.org
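The Lander & Waterman expectation used on this slide, C = 1 - exp(-ReadNumber × ReadLength / GenomeLength), can be computed in a few lines. This is a sketch under the usual assumption that reads fall uniformly at random along the genome; the function name and the example numbers are illustrative, not taken from the tool.

```python
import math


def expected_covered_fraction(read_number: int, read_length: int, genome_length: int) -> float:
    """Expected fraction of genome positions covered by at least one read."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    # Mean sequencing depth: total sequenced bases divided by genome size.
    depth = read_number * read_length / genome_length
    return 1.0 - math.exp(-depth)


# Illustrative values: 10,000 reads of 100 nt on a 2 Mb genome (depth 0.5).
fraction = expected_covered_fraction(10_000, 100, 2_000_000)
print(f"{100 * fraction:.1f} % of positions expected to be covered")
```

At low depth the expected fraction is close to the depth itself, which is why a low-abundance but truly present genome still shows an observed coverage close to its expectation.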

Genome indexes

Summary (Samtools – Bedtools)

Reference creation

Alignment (bowtie)

Reference genomes database (GenBank)

Metagenome (FASTQ)

Gene annotations (GFF)

CDS

Reads alignment (BAM)

Summary for each genome (CSV)

Reads for each CDS (GFF)

Schema

Software output

Summary for each genome (CSV)

• Genome name

• CDS number

• % CDS with at least 1 read

• Mean, median, sd coverage

• Number of variant positions

• % positions covered by reads

• Expected % positions covered by reads (Lander & Waterman)

Reads for each CDS (GFF)

• CDS localization

• CDS name & product

• CDS length, length covered by reads & number of positions with mismatches

• CDS coverage
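As an illustration of the per-genome summary fields listed on this slide, the coverage statistics can be derived from a per-position depth vector. This is a minimal sketch: the function name, the dictionary keys and the use of the population standard deviation are assumptions, and the real tool derives these values with Samtools and Bedtools.

```python
import statistics


def summarize_depth(depths: list[int]) -> dict[str, float]:
    """Summarize per-position read depth for one genome."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for depth in depths if depth > 0)
    return {
        "mean_coverage": statistics.mean(depths),
        "median_coverage": statistics.median(depths),
        "sd_coverage": statistics.pstdev(depths),  # population sd (assumption)
        "pct_positions_covered": 100.0 * covered / len(depths),
    }


# Toy depth vector for a 6-position genome.
summary = summarize_depth([0, 0, 2, 4, 4, 2])
```

The `pct_positions_covered` value is the observed counterpart of the Lander & Waterman expectation reported in the same CSV.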

A dedicated dairy database

• Based on organisms known to be in dairy products

• Database enrichment: sequencing and assembly of new species isolated from dairy products: 150 bacterial species & 15 filamentous fungi and yeasts

• 4000 genomes, manually selected

• Work in progress:

  • Use text mining to:

    • Identify dairy species in the literature

    • Identify the habitat of species found in metagenomics (for example: sea for salt bacteria)

  • Annotation enrichment: genes of technological interest

(Almeida et al. 2014, BMC Genomics)

Collaboration: C. Nedellec team, MaIAGE

Web interface & server

Quentin Cavaillé, Thibaut Guirimand, Sandra Dérozier, Pierre Renault, Valentin Loux

• User-friendly interface

• Public/private genomes and samples

• Personalized analyses

• Tchapalo: traditional beer in Côte d’Ivoire

• Mean production: 38,000 t/year

• Daily familial consumption

• Income-generating economic activity

• Production process:

  • Sorghum malt goes through a double fermentation:

    • Natural lactic fermentation => sour wort

    • Alcoholic fermentation => Tchapalo

Tchapalo ecosystem

Racha ZAARIR

Tchapalo ecosystem analysis

72.3%

25.1%

80.2%

15.9%

Metagenomic analysis

Microbiology analysis

Tchapalo ecosystem abundant species

genome                                              % CDS covered  mean coverage  % coverage  expected % coverage
Lactobacillus fermentum S6                          100            54.979         99.215      100
Lactobacillus delbrueckii subsp. lactis KCCM 34717  95.503         150.326        91.717      100
Lactobacillus delbrueckii subsp. jakobsenii         99.669         164.759        99.119      100

The strain Lactobacillus fermentum S6 is very close to the strain of the ecosystem

Lactobacillus delbrueckii subsp. jakobsenii is closer to the strain of the ecosystem than Lactobacillus delbrueckii subsp. lactis KCCM 34717

Tchapalo ecosystem low abundant species

genome                              % CDS covered  mean coverage  % coverage  expected % coverage  # reads
Saccharomyces cerevisiae YJM326     90.727         0.094          8.145       8.908                21405
Pediococcus acidilactici DSM 20284  81.706         0.577          9.418       46.005               28706

The strain Saccharomyces cerevisiae YJM326 is very close to the strain of the ecosystem

Pediococcus acidilactici DSM 20284 is absent from the ecosystem (reads coming from other Lactobacillaceae)
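The reasoning on this slide compares the observed % coverage with the expected % coverage (Lander & Waterman): when the observed value falls far below the expectation, the reads are piled up on a few regions and probably come from related organisms. The sketch below applies that comparison to the two rows of the table. The `interpret` function and its 0.5 cut-off are hypothetical choices made for illustration; the slides do not state a threshold.

```python
def coverage_ratio(observed_pct: float, expected_pct: float) -> float:
    """Ratio of observed to expected % of positions covered."""
    if expected_pct <= 0:
        raise ValueError("expected_pct must be positive")
    return observed_pct / expected_pct


def interpret(observed_pct: float, expected_pct: float, min_ratio: float = 0.5) -> str:
    """Label a genome from its coverage ratio (min_ratio is a hypothetical cut-off)."""
    if coverage_ratio(observed_pct, expected_pct) >= min_ratio:
        return "consistent with presence"
    return "likely absent"


# (observed % coverage, expected % coverage) from the table on this slide.
genomes = {
    "Saccharomyces cerevisiae YJM326": (8.145, 8.908),
    "Pediococcus acidilactici DSM 20284": (9.418, 46.005),
}
calls = {name: interpret(obs, exp) for name, (obs, exp) in genomes.items()}
```

With these values the ratios are about 0.91 and 0.20, which matches the two conclusions stated on the slide.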

Conclusion

• The tool will be publicly available for research purposes

• An account on the INRA migale platform is required

• Software and database development is still ongoing

Genome indexes

Summary (Samtools – Bedtools)

Reference creation

Alignment (bowtie)

Reference genomes database (GenBank)

Metagenome (FASTQ)

Gene annotations (GFF)

CDS

Reads alignment (BAM)

Summary for each genome (CSV)

Reads for each gene (GFF)

Reference genome database

Metagenomic software

Web interface

Providing a user-friendly tool for metagenomic analysis

Perspectives

Genome pre-selection using a faster method (k-mer or Burrows–Wheeler transform) to speed up computation

Allow comparison of metagenome analyses

Apply it to the MetaPDOCheese project (next slide)

Application to other ecosystems with enough reference genomes (for example: fermented food, animal digestive ecosystems…)

Compare ecosystems of the same PDO area

MetaPDO Cheese Project

INRA: MaIAGE (S. Dérozier, V. Loux, M. Mariadassou, C. Nedellec, Q. Cavaillé), Micalis (P. Renault, T. Guirimand, B. Dridi, C. Pauvert), GMPA (F. Irlinger), URF (C. Delbès), CNIEL

Follow the ecosystem over the time scale of cheesemaking

What are the structural and functional diversities of cheese ecosystems?

What are the evolutionary mechanisms of microbial populations?

• 44 Protected Designation of Origin French cheeses

• 1200 samples: 16S & ITS sequencing

• Some samples with shotgun sequencing

• Sequencing of 100 new genomes

Thanks to

StatInfOmics and Bibliome teams

Migale platform

Robert Bossy

Quentin Cavaillé

Estelle Chaix

Hélène Chiapello

Louise Deleger

Sandra Dérozier

Valentin Loux

Mahendra Mariadassou

Claire Nédellec

Pierre Nicolas

Sophie Schbath

Micalis

Pierre Renault

Charlie Pauvert

Thibaut Guirimand

Bédis Dridi

Racha Zaarir

[Figure; axes: % ID, Pos covered 100 nt / Pos covered 35 nt]

[Figure; axes: % ID, Nb Reads 100 nt / Nb Reads 35 nt]

From: Irlinger et al., FEMS Microbiol Lett., 2014

Cheese ecosystems

Challenges of taxonomic assignment ??? Or not ?? Also functional challenges ??

We don’t have reference genomes for each strain of the ecosystem

Some genera have many reference genomes, others have none

Impossible to sequence every strain (non-cultivable species, cost of DNA extraction, sequencing and storage…)

Computational challenge: impossible to compare reads to every sequenced genome

November 2017: 124,481 prokaryotic genomes

A metagenome: millions of reads per sample

Tree of life

Reference genomes

Ecosystem strains

GeDI method

Sequencing bias & repeated regions

Heterogeneous genome coverage

Artefact

Alignment on close genome

Gene position

Very close strain – high abundance

% coverage

Genome position

Strains present in different proportions

Knowledge of cheese micro organisms

Inoculated micro organisms

House microbiota

Starters

Micro organisms from: animal milk,

water flows, air flows

Micro organisms from salt

Ripening cultures

Micro organisms from shelves, cellar

Not completely known

Defined starter cultures

Undefined complex starters: not completely known

more vulnerable to bacteriophage attack

Known

“domesticated cultures”