meta-proteome-analyzer: a graph database backed protein analysis software thilo muth mpi magdeburg /...

22
Meta-Proteome-Analyzer: a graph database backed protein analysis software Thilo Muth MPI Magdeburg / Germany (cooperation with CompOmics group in Ghent / Belgium)

Upload: sandra-small

Post on 16-Dec-2015

219 views

Category:

Documents


0 download

TRANSCRIPT

Meta-Proteome-Analyzer: a graph database backed protein analysis software

Thilo MuthMPI Magdeburg / Germany

(cooperation with CompOmics group in Ghent / Belgium)

Meta-Proteome-Analyzer – a graph database backed protein analysis software

Life Sciences + Graph Database?Life scientists are interested in genes, proteins, metabolites, relationships, interactions and biological networks...

Introduction

Meta-Proteome-Analyzer – a graph database backed protein analysis software

...but not so much in a bunch of boring tables.

Introduction

Meta-Proteome-Analyzer – a graph database backed protein analysis software

Classical way:Most biological data are being stored and provided in excel sheets or SQL databases.

Shortcomings:• Scalability and performance issues (data growth)• Complexity • Lack of information • Models far away from biological reality

Introduction

Meta-Proteome-Analyzer – a graph database backed protein analysis software

Metaproteomics

• Metaproteomics studies reveal functional activities in microbial communities• Metaproteomics results can also be used for taxonomic assignment (proteins to species).

What is metaproteomics ?

Meta-Proteome-Analyzer – a graph database backed protein analysis software

One Protein - Different Organisms

Protein Species B1 : n

Species A

Species C

Metaproteomics

Meta-Proteome-Analyzer – a graph database backed protein analysis software

How to handle more complex structures?

Metaproteomics

Peptide A

Protein B

Protein A

Protein C

Species A

Species BPeptide B

Meta-Proteome-Analyzer – a graph database backed protein analysis software

Find a more „natural“ model:Transfer the graph model of biological data back to the database storage level.

Metaproteomics

USE A GRAPH DATABASE

Meta-Proteome-Analyzer – a graph database backed protein analysis software

Graph Database

We implemented the graph database Neo4jinto our free software Meta-Proteome-Analyzer.

Biology + Graphs!

Meta-Proteome-Analyzer – a graph database backed protein analysis software

Graph Database

• Peptides• Proteins• Species• Enzymes• Pathways• Ontologies

Graph nodes/vertices:NODE

NODE

Meta-Proteome-Analyzer – a graph database backed protein analysis software

Graph Database

• Peptide-BELONGS-TO-Protein• Species-HOLDS-Protein• Protein-BELONGS-TO-Pathway

Graph edges/relationships:NODE

NODE

Meta-Proteome-Analyzer – a graph database backed protein analysis software

Meta-Proteome-Analyzer – a graph database backed protein analysis software

„Complex“ Data

Did you find that guy in here?

Meta-Proteome-Analyzer – a graph database backed protein analysis software

„Complex“ Data

No ?

Asara et al. (Science 2007) reported the detection of collagen protein fragments in a 68-million-year-old Tyrannosaurus rex bone by shotgun proteomics.

This finding has been called into question by researchers marking these proteins as contaminants from modern species.

Me neither!

Meta-Proteome-Analyzer – a graph database backed protein analysis software

Graph Database Queries

Simple Query: Filter out (T.Rex) contaminants

Meta-Proteome-Analyzer – a graph database backed protein analysis software

Graph Database Queries

Find a certain pathway and its related entitites

Meta-Proteome-Analyzer – a graph database backed protein analysis software

Graph Database Queries

START pathways = nodeathways("Identifier:*")MATCH (pathways)<-[rel:BELONGS_TO_PATHWAY]-(proteins)<-[:IS_METAPROTEIN_OF]-(metaproteins)WHERE (pathways.Identifier='K00399')WITH pathways, proteins, metaproteinsMATCH (proteins)-[:BELONGS_TO]->(taxa)RETURN pathways, taxa, metaproteins, proteins

Neo4j Cypher Query

Meta-Proteome-Analyzer – a graph database backed protein analysis software

Performance

A B C247 277 280

2584

12095

19740

Performance comparisonClient Neo4J (ms) Server MySQL (ms)

Test Datasets: A) 1388 proteins B) 4946 proteinsC) 9393 proteins

Meta-Proteome-Analyzer – a graph database backed protein analysis software

Software

Meta-Proteome-Analyzer – a graph database backed protein analysis software

Software

Meta-Proteome-Analyzer – a graph database backed protein analysis software

Software

Current State• Tested by four differents labs (Magdeburg, Leipzig, Helsinki, Alghero)• Documentation and manuscript in preparation• Testing and fixing bugs for release version 1.0

PR TEOMEANALYZER

META

PA

P

Meta-Proteome-Analyzer – a graph database backed protein analysis software

Thanks for your attention!

Questions ? Comments ?

The software is available here:http://meta-proteome-analyzer.googlecode.com