EBI is an Outstation of the European Molecular Biology Laboratory. Bird‘s Eye View of ... Molecular Interaction Standards: PSI-MI XML PSI-MI Tool support (APIs, Validator) ChEBI APO-SYS workshop 20 – 21st January 2009 Berlin

EBI is an Outstation of the European Molecular Biology Laboratory.

Bird‘s Eye View of ...

Molecular Interaction Standards: PSI-MI XML

PSI-MI Tool support (APIs, Validator)

ChEBI

APO-SYS workshop, 20 – 21st January 2009, Berlin

A gentle introduction to the PROTEOMICS STANDARD INITIATIVE

Engineering 1850

• Nuts and bolts fit perfectly together, but only if they originate from the same factory

• Standardisation proposal in 1864 by William Sellers

• It was not generally accepted until after WWII, though …

Proteomics 2003

• Proteomics data are perfectly compatible, but only if they are from the same lab / database / software

• “Publish and vanish” by data producers

• Collecting all publicly available data requires huge effort

• Urgent need for standardisation

PSI-MI XML format

• Community standard for Molecular Interactions

• XML schema and detailed controlled vocabularies

• Jointly developed by major data providers: BIND, CellZome, DIP, GSK, HPRD, Hybrigenics, IntAct, MINT, MIPS, Serono, U. Bielefeld, U. Bordeaux, U. Cambridge, and others

• Version 1.0 published in February 2004: "The HUPO PSI Molecular Interaction Format - A community standard for the representation of protein interaction data." Henning Hermjakob et al., Nature Biotechnology 2004, 22, 176-183.

• Version 2.5 published in October 2007: "Broadening the Horizon – Level 2.5 of the HUPO-PSI Format for Molecular Interactions." Samuel Kerrien et al., BioMed Central, 2007.

PSI-MI XML benefits

• Collecting and combining data from different sources has become easier

• Standardized annotation through PSI-MI ontologies

• Tools from different organizations can be chained, e.g. analysis of IntAct data in Cytoscape.

Home page: http://www.psidev.info/MI

An overview of the PSI-MI CONTROLLED VOCABULARIES

Ontology Lookup Service

• Makes available OBO controlled vocabularies
• Web site allows for searching and browsing their hierarchy

http://www.ebi.ac.uk/ontology-lookup

Ontology Lookup Service

• Each term has a definition as well as a literature reference

http://www.ebi.ac.uk/ontology-lookup

An overview of the PSI-MI XML 2.5 DATA MODEL

PSI-MI 2.5 Standards

Bird’s eye view of PSI-MI XML 2.5

• Top level structure unchanged compared to PSI-MI 1.0

• Use of Id/Ref on main objects
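The Id/Ref idea can be sketched in a few lines: main objects (experiments, interactors) are declared once with an id, and interactions point back to them by reference. The fragment below is a simplified, namespace-free stand-in loosely modelled on PSI-MI XML 2.5; all ids and labels are invented, and `resolve_refs` is a hypothetical helper, not part of any PSI-MI tool.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free fragment loosely modelled on PSI-MI XML 2.5.
# All ids and labels are invented for illustration.
ENTRY = """
<entry>
  <experimentList>
    <experimentDescription id="1">
      <names><shortLabel>example-2009-1</shortLabel></names>
    </experimentDescription>
  </experimentList>
  <interactorList>
    <interactor id="2"><names><shortLabel>protein_a</shortLabel></names></interactor>
    <interactor id="3"><names><shortLabel>protein_b</shortLabel></names></interactor>
  </interactorList>
  <interactionList>
    <interaction id="4">
      <experimentList><experimentRef>1</experimentRef></experimentList>
      <participantList>
        <participant id="5"><interactorRef>2</interactorRef></participant>
        <participant id="6"><interactorRef>3</interactorRef></participant>
      </participantList>
    </interaction>
  </interactionList>
</entry>
"""


def label(element):
    """Return the shortLabel of an element."""
    return element.findtext("names/shortLabel")


def resolve_refs(entry_xml):
    """Map each interaction id to the labels its references point to."""
    root = ET.fromstring(entry_xml)
    # Index the objects that are declared once, keyed by their id attribute.
    experiments = {e.get("id"): e for e in root.iter("experimentDescription")}
    interactors = {i.get("id"): i for i in root.iter("interactor")}
    resolved = {}
    for interaction in root.iter("interaction"):
        exp_labels = [label(experiments[r.text]) for r in interaction.iter("experimentRef")]
        int_labels = [label(interactors[r.text]) for r in interaction.iter("interactorRef")]
        resolved[interaction.get("id")] = (exp_labels, int_labels)
    return resolved


print(resolve_refs(ENTRY))
```

Declaring an object once and referring to it by id keeps files compact when many interactions share the same experiment or interactor.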

Main objects - Experiment

Controlled by Ontologies

Literature references

Confidence measures

Main objects - Interactor

Generic interactor

Reference to a public database

Main objects - Interaction

Controlled by Ontology

Copyright

Experiment

Kinetics parameters

Confidence value

Basics – Controlled Vocabularies

• Why?

• Ensure data consistency

• Provide a reliable means of searching & filtering data

• How?

• By providing a reference to an ontology term, using Xref!
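As a rough illustration of "a reference to an ontology term using Xref", the sketch below reads the term accession from a simplified, namespace-free fragment in the style of PSI-MI XML 2.5. The helpers `cv_term` and `uses_term` are hypothetical names introduced here, and the fragment is an assumption about how such an element typically looks rather than a quote from a real file.

```python
import xml.etree.ElementTree as ET

# Simplified fragment: a CV-controlled element carries a human-readable
# name plus an xref whose id attribute is the ontology term accession.
METHOD = """
<interactionDetectionMethod>
  <names><shortLabel>two hybrid</shortLabel></names>
  <xref>
    <primaryRef db="psi-mi" dbAc="MI:0488" id="MI:0018"
                refType="identity" refTypeAc="MI:0356"/>
  </xref>
</interactionDetectionMethod>
"""


def cv_term(element):
    """Return (ontology database, term accession) from the primary xref."""
    ref = element.find("xref/primaryRef")
    return ref.get("db"), ref.get("id")


def uses_term(element, accession):
    """Filter on the accession, not on the free-text label."""
    return cv_term(element)[1] == accession


method = ET.fromstring(METHOD)
print(cv_term(method))
print(uses_term(method, "MI:0018"))
```

Matching on the accession keeps searches consistent even when two data sources spell the label differently.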

Main objects - Participant

e.g. enzyme target

Interactor

e.g. bait, prey

Delivery method, expression level…

Interactor used experimentally

Building of Complex

An overview of the PSI-MI TAB DATA MODEL

PSIMITAB Standard Columns

Standard columns (15):
• ID(s) interactor A & B
• Alt. ID(s) interactor A & B
• Alias(es) interactor A & B
• Interaction detection method(s)
• Publication 1st author(s)
• Publication Identifier(s)
• Taxid interactor A & B
• Interaction type(s)
• Source database(s)
• Interaction identifier(s)
• Confidence value(s)
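A minimal sketch of reading one such row, assuming the columns appear in the order listed above, that cells are tab-separated, that "|" separates multiple values and that "-" marks an empty cell. The column keys, the `parse_line` helper and the sample row are all invented for illustration; the identifiers are placeholders, not real records.

```python
# Column keys follow the order of the list above (15 columns in total).
COLUMNS = [
    "id_a", "id_b", "alt_id_a", "alt_id_b", "alias_a", "alias_b",
    "detection_method", "first_author", "publication_id",
    "taxid_a", "taxid_b", "interaction_type", "source_db",
    "interaction_id", "confidence",
]


def parse_line(line):
    """Split one tab-delimited line into a dict of column key -> list of values."""
    cells = line.rstrip("\n").split("\t")
    if len(cells) < len(COLUMNS):
        raise ValueError(f"expected at least {len(COLUMNS)} columns, got {len(cells)}")
    # Assumption: '|' separates multiple values and '-' marks an empty cell.
    return {
        name: [] if cell == "-" else cell.split("|")
        for name, cell in zip(COLUMNS, cells)
    }


# Invented example row; identifiers are placeholders.
SAMPLE = "\t".join([
    "exampledb:A1", "exampledb:B1", "-", "-",
    "exampledb:gene_a", "exampledb:gene_b|exampledb:gene_b_alt",
    'psi-mi:"MI:0018"(two hybrid)', "Doe et al. (2009)", "pubmed:00000000",
    "taxid:9606(human)", "taxid:9606(human)",
    'psi-mi:"MI:0915"(physical association)', 'psi-mi:"MI:0469"(intact)',
    "exampledb:INT-1", "-",
])

record = parse_line(SAMPLE)
print(record["id_a"], record["alias_b"], record["confidence"])
```

A naive split on "|" is enough for a sketch; a production parser would also need to respect quoted values.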

A quick look into INTACT EXTENDED MITAB

PSIMITAB Extended Columns

Standard columns (15):
• ID(s) interactor A & B
• Alt. ID(s) interactor A & B
• Alias(es) interactor A & B
• Interaction detection method(s)
• Publication 1st author(s)
• Publication Identifier(s)
• Taxid interactor A & B
• Interaction type(s)
• Source database(s)
• Interaction identifier(s)
• Confidence value(s)

+

IntAct specific columns (+11):
• Experimental role(s) of interactors
• Biological role(s) of interactors
• Properties (CrossReference) of interactors
• Type(s) of interactors
• HostOrganism(s)
• Expansion method(s)
• Dataset name(s)

A hands-on introduction to the PSI-MI XML 2.5 JAVA API

PSI-MI XML Java API

• Uses Java 5
• Provides binding between XML and a Java object model
• Tools to read/write XML from/to file
• Reading can be done in 2 ways:
  • Load a whole file into an EntrySet
    • Large files can only be loaded if you have enough memory
    • Easy to update content and write back to file
  • Index the XML data and give access through an IndexedEntry
    • Memory efficient with large files
    • Allows browsing through interactions, experiments…
    • Trickier to write updated content (yet feasible)
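The memory trade-off between the two reading styles can be shown with a language-neutral sketch. This is not the Java API: it uses Python's standard `xml.etree.ElementTree` on a tiny invented document, with `load_all` standing in for "load everything" and `stream` standing in for incremental access (streaming rather than true indexing).

```python
import io
import xml.etree.ElementTree as ET

# Tiny stand-in document; a real PSI-MI XML file is namespaced and far richer.
XML = b"""<entrySet><entry><interactionList>
<interaction id="1"/><interaction id="2"/><interaction id="3"/>
</interactionList></entry></entrySet>"""


def load_all(source):
    """Parse the whole document into memory, then collect interaction ids."""
    tree = ET.parse(source)
    return [i.get("id") for i in tree.getroot().iter("interaction")]


def stream(source):
    """Yield interaction ids one at a time, clearing each element after use."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "interaction":
            yield elem.get("id")
            elem.clear()  # drop the element's attributes and children


print(load_all(io.BytesIO(XML)))
print(list(stream(io.BytesIO(XML))))
```

The first style keeps the full tree available for editing and writing back; the second touches one interaction at a time, which is what makes large files manageable.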

A hands-on introduction to the PSI-MI TAB 2.5 JAVA API

PSI-MI TAB Java API

• Uses Java 5
• Provides binding between TAB and a Java object model
• Tools to read/write TAB from/to file
• You can read in 2 ways:
  • Load a whole file into a Collection<BinaryInteraction>
    • Large files can only be loaded if you have enough memory
  • Load interactions one at a time using an Iterator<BinaryInteraction>
    • Memory efficient with large files
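The collection-versus-iterator choice looks much the same in any language. The sketch below is a Python analogy, not the Java API: `load_rows` plays the role of the Collection and `iter_rows` the role of the Iterator. Both function names, the "#" header convention and the two-column sample data are assumptions made for illustration.

```python
import io
from typing import Iterator, List, TextIO


def iter_rows(handle: TextIO) -> Iterator[List[str]]:
    """Yield one row (a list of tab-separated cells) at a time."""
    for line in handle:
        line = line.rstrip("\n")
        # Assumption: blank lines and '#' header lines carry no interaction.
        if line and not line.startswith("#"):
            yield line.split("\t")


def load_rows(handle: TextIO) -> List[List[str]]:
    """Read every row into a list; simple, but memory grows with file size."""
    return list(iter_rows(handle))


# Invented input: one header line and two rows, only two columns for brevity.
DATA = "#ID A\tID B\nexampledb:A1\texampledb:B1\nexampledb:A2\texampledb:B2\n"

rows = load_rows(io.StringIO(DATA))
count = sum(1 for _ in iter_rows(io.StringIO(DATA)))  # holds one row at a time
print(len(rows), count)
```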

Summary

• PSI-MI XML is the de facto standard for molecular interactions

• We have code samples & exercises for both APIs! Let me know if you want access to them …

• The Java API makes it easy to handle

PSI-MI Home page: http://psidev.info/MI

API Download: http://www.psidev.info/index.php?q=node/60#tools

Data: ftp://ftp.ebi.ac.uk/pub/databases/intact/current/psi25

Quick introduction to R packages for PSI-MI

Rintact & RpsiXML

• Initiative from Wolfgang Huber’s group at the EBI

• Allows reading PSI-MI XML data into R data structures

• Enables data analysis using existing packages such as: RBGL, ppiStats, apComplex, …

• Currently supports: IntAct, MINT, HPRD, DIP, BioGRID, MIPS/CORUM, MatrixDB, MPACT.

API Download: http://www.bioconductor.org/packages/2.1/bioc/html/Rintact.html

Documentation: http://www.bioconductor.org/packages/2.3/bioc/vignettes/RpsiXML/inst/doc/RpsiXML.pdf

Quick introduction to the PSI SEMANTIC VALIDATOR

The Framework (in the context of PSI)

The PSI validator framework automatically checks that experimental data reported using a specific XML format and various CVs are compliant with the overall MIAPE recommendations.

The semantic validator checks:

- the XML syntax

- that the appropriate CV terms are used in specific locations of a document

- miscellaneous consistency checks

Components of the Validator

[Diagram. Labels: Ontology Manager, Ontology, Mapping Rule, Object Rule, Semantic Validator, Messages, Data Model, Config (×3), OBO, OLS, Data File]

The Ontology Manager

Declaration of ontologies or Controlled Vocabularies:

• location,

• format,

• retrieval method (local file or via web services)

Ontology Lookup Service

Currently 61 Ontologies available

Web Service for easy access

CV Mapping Rules

An explicit specification of which CV terms may/should/must be used in a given location.

• Crucial to bind a data model to a set of CVs

• Necessary to enforce MIAPE guidelines

• Allows CVs to be developed independently of a schema (necessary to comply with CV guidelines)

• This mapping is specified in an XML file

CV Mapping Rules – example with MzML

Exchange Format

Referenced ontologies and CVs

Resulting mapping file:

<CvMappingRule scopePath="/mzML/sampleList/sample" cvTermsCombinationLogic="OR" requirementLevel="MAY">
  <CvTerm termAccession="GO:0005575" useTerm="false" termName="cellular_component" allowChildren="true" isRepeatable="true" cvIdentifierRef="GO"/>
  <CvTerm termAccession="BTO:0000000" useTerm="false" termName="brenda source tissue ontology" allowChildren="true" isRepeatable="true" cvIdentifierRef="BTO"/>
</CvMappingRule>

<CvMappingRule scopePath="/mzML/instrumentList/instrument/componentList/analyzer" cvTermsCombinationLogic="AND" requirementLevel="MUST">
  <CvTerm termAccession="MS:1000443" useTerm="false" termName="data file checksum type" allowChildren="true" isRepeatable="true" cvIdentifierRef="MS"/>
  <CvTerm termAccession="MS:1000480" useTerm="false" termName="Mass Analyzer" allowChildren="true" isRepeatable="true" cvIdentifierRef="MS"/>
</CvMappingRule>

<CvMappingRule scopePath="/mzML/instrumentList/instrument/componentList/detector" cvTermsCombinationLogic="AND" requirementLevel="MUST">
  <CvTerm termAccession="MS:1000026" useTerm="false" termName="Detector Type" allowChildren="true" isRepeatable="false" cvIdentifierRef="MS"/>
  <CvTerm termAccession="MS:1000027" useTerm="false" termName="detector acquisition mode" allowChildren="false" isRepeatable="true" cvIdentifierRef="MS"/>
</CvMappingRule>
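To make the attributes of a mapping rule more concrete, here is a minimal sketch of how such a rule might be evaluated. It is not the validator's implementation. It assumes that `useTerm` permits the listed term itself, that `allowChildren` permits its descendants, and that `cvTermsCombinationLogic` decides whether one (OR) or all (AND) of the listed terms must be matched. The `EX:` accessions, the `PARENT` map and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative rule in the same shape as a mapping file entry.
# The EX: accessions and the parent map below are hypothetical.
RULE_XML = """
<CvMappingRule scopePath="/doc/sample" cvTermsCombinationLogic="OR" requirementLevel="MUST">
  <CvTerm termAccession="EX:0001" useTerm="false" allowChildren="true"
          termName="tissue" isRepeatable="true" cvIdentifierRef="EX"/>
  <CvTerm termAccession="EX:0002" useTerm="true" allowChildren="false"
          termName="unknown" isRepeatable="false" cvIdentifierRef="EX"/>
</CvMappingRule>
"""
PARENT = {"EX:0011": "EX:0001", "EX:0111": "EX:0011"}  # child -> parent


def ancestors(accession):
    """Yield every ancestor of a term according to PARENT."""
    while accession in PARENT:
        accession = PARENT[accession]
        yield accession


def term_matches(cv_term, accession):
    """Assumed semantics: useTerm allows the term itself, allowChildren its descendants."""
    target = cv_term.get("termAccession")
    if cv_term.get("useTerm") == "true" and accession == target:
        return True
    return cv_term.get("allowChildren") == "true" and target in ancestors(accession)


def check(rule, found):
    """Return (ok, requirement level) for the accessions found at the rule's scopePath."""
    hits = [any(term_matches(t, a) for a in found) for t in rule.findall("CvTerm")]
    logic = rule.get("cvTermsCombinationLogic")
    ok = all(hits) if logic == "AND" else any(hits)
    return ok, rule.get("requirementLevel")


rule = ET.fromstring(RULE_XML)
print(check(rule, ["EX:0111"]))  # a descendant of EX:0001
print(check(rule, ["EX:0001"]))  # the parent itself, which useTerm="false" excludes
```

A real validator would take the term hierarchy from the ontology (for example via OLS) instead of a hard-coded map, and would turn the requirement level into an error, warning or information message.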

CV Mapping Rules – final thoughts

• A data model is not bound to a single mapping

• The PSI MI and MS workgroups each provide a mapping corresponding to their respective minimum reporting guidelines (MIAPE)

• A mapping can be customized by any end user of a standard to be more or less granular

The Object Rules

List of consistency checks tailored to specific data types

Examples:
- taxid is an existing entry at NCBI
- PubMed ID is an existing publication
- protein and DNA sequences defined using the appropriate alphabet
- CV dependency rules

Note: These rules are to be programmed in Java
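The logic of one such rule, the sequence alphabet check, fits in a few lines. The validator's own rules are written in Java; the Python sketch below only illustrates the idea. The alphabets (unambiguous DNA bases and the 20 standard amino acids), the `check_sequence` name and the message wording are assumptions.

```python
# Alphabets assumed for this sketch: unambiguous DNA bases and the
# 20 standard amino acids (one-letter codes).
ALPHABETS = {
    "dna": set("ACGT"),
    "protein": set("ACDEFGHIKLMNPQRSTVWY"),
}


def check_sequence(interactor_type, sequence):
    """Return a list of messages; an empty list means the rule passed."""
    allowed = ALPHABETS.get(interactor_type)
    if allowed is None:
        return []  # the rule does not apply to this interactor type
    bad = sorted(set(sequence.upper()) - allowed)
    if bad:
        return [f"{interactor_type} sequence contains unexpected characters: {''.join(bad)}"]
    return []


print(check_sequence("dna", "ACGTTGCA"))
print(check_sequence("dna", "ACGU"))
print(check_sequence("protein", "MKTAYIAKQR"))
```

Returning messages instead of raising an exception lets a validator collect every problem in a file in one pass.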

Fancy Building Your Own ?

We are currently finalizing a tutorial to guide users in writing a validator based on their own data model. It provides:

• Additional explanation on the Validator’s modules

• Example of configuration files

• A working prototype based on a made up data model

• Source code available to get you quick-started.

http://psidev.info/validator

IntAct team

Rolf Apweiler

• Henning Hermjakob
• Sandra Orchard
• Jyoti Khadake
• Luisa Montecchi
• Dave Thorneycroft
• Cathy Derow
• Prem Achuthan
• Bruno Aranda
• Samuel Kerrien

IntAct is funded by the European Commission under FELICS, contract number 021902 (RII3)

PSI participants (direct contributors to the validator)

• Luisa Montecchi-Palazzi
• Florian Reisinger
• Lennart Martens
• Andy Jones
• Mathias Oesterheld
• Bruno Aranda
• Prem Achuthan
• Henning Hermjakob

Other PSI participants

• Juan A Vizcaino
• Chris Taylor
• Eric Deutsch
• Pierre Alain Binz
• Susanna Sansone
• Frank Gibson
• Zsuzsanna Bencsath
• Daniel Schober
• Trish Wetzel
• Pete Souda

Questions?