The Complex Portal: A ‘one-stop shop’ for protein complexes
Birgit Meldal, IntAct Curator
Rationale
1. Most protein-protein interaction (PPI) databases display PPIs as binary interactions:
• to allow user-friendly displays in web interfaces (tables)
• to allow building of protein networks
• but this means losing the information on the in vivo topology of multi-protein complexes
2. There was no central database that amalgamated information about protein complexes from their constituent parts to their molecular function and involvement in biological processes.
• Many databases included snippets of such information but were lacking cross-references or lost financial support to update the database
Complex Portal @ Networks & Pathways Course, EBI, 9th June 2014
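To make the first point of the rationale concrete, here is a minimal sketch of how a multi-protein complex is typically flattened into binary pairs. The "spoke" and "matrix" expansions are common conventions in the PPI field, not something named in this talk, and the subunit labels are placeholders rather than real proteins.

```python
from itertools import combinations


def spoke_expansion(bait, preys):
    """Pair the bait with each co-purified protein (spoke model)."""
    return [(bait, prey) for prey in preys]


def matrix_expansion(members):
    """Pair every member with every other member (matrix model)."""
    return list(combinations(sorted(members), 2))


# Hypothetical four-subunit complex; "A".."D" are placeholder labels.
complex_members = ["A", "B", "C", "D"]

spoke = spoke_expansion("A", ["B", "C", "D"])
matrix = matrix_expansion(complex_members)

print(len(spoke), len(matrix))  # prints: 3 6
```

Neither list records that the four subunits form one assembly or which contacts are direct, and the matrix model may include pairs that never touch. That is the topology information a complex-level record is meant to keep.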
Objectives
• To design an Online Portal to search and visualise protein complexes
• Including cross-referencing to source databases and beyond
• Export to interested parties in a format of their choice
• Incorporate the data into network analysis tools
• Emphasis on major model organisms, chosen to span the taxonomic range:
• Homo sapiens, Mus musculus, Saccharomyces cerevisiae, Escherichia coli
• Caenorhabditis elegans, Drosophila melanogaster, Schizosaccharomyces pombe, Arabidopsis thaliana
• All data held in IntAct DB but separated from experimental evidence
Definition of stable complexes
A stable set (2 or more) of interacting protein molecules which
• can be co-purified and
• have been shown to exist as a functional unit in vivo.
Non-protein molecules (e.g. small molecules, nucleic acids) may also be present in the complex.
What is not a stable complex?
1. Enzyme/substrate, receptor/ligand or similar transient interactions,
Unless:
• one element is a complex in its own right and/or
• it is an obligate complex that requires substrate/ligand binding, e.g. PDGF receptors
2. Two proteins associated in a pulldown / coimmunoprecipitation with no functional link
Source Databases
• MatrixDB (Sylvie Ricard-Blum, Univ. of Lyon)
• Mining UniProt – yeast (Bernd Roechert, SIB – manually)
• PDBe (EBI)
• ChEMBL (EBI)
• Reactome – human (EBI)
• Manual curation from IMEx DBs & the literature (Sandra & Birgit)
• User requests (prioritised!)
Data captured
• Participants
• proteins (UniProt)
• small molecules (ChEBI)
• nucleic acids (Ensembl, ChEBI, RNACentral?)
• Species (visible in search results)
• Stoichiometry – when known
• Topology (= binding sites) – when known
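The captured fields can be pictured as a small record type. The following is only an illustrative sketch: the class names, field names and the identifier "PROT_X" are hypothetical and do not reflect the actual IntAct schema. The size check follows the "2 or more interacting protein molecules" wording from the definition of stable complexes earlier in the talk.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Participant:
    identifier: str                      # e.g. a UniProt accession or ChEBI id
    source_db: str                       # "UniProt", "ChEBI", "Ensembl", ...
    kind: str                            # "protein", "small molecule", "nucleic acid"
    stoichiometry: Optional[int] = None  # None means "not known"


@dataclass
class ComplexRecord:
    species: str
    participants: List[Participant] = field(default_factory=list)

    def protein_molecule_count(self) -> int:
        """Count protein molecules; unknown stoichiometry is assumed to be one copy."""
        return sum(
            1 if p.stoichiometry is None else p.stoichiometry
            for p in self.participants
            if p.kind == "protein"
        )

    def meets_size_criterion(self) -> bool:
        """A complex needs at least two protein molecules."""
        return self.protein_molecule_count() >= 2


# Hypothetical homodimer: one protein entry with stoichiometry 2.
homodimer = ComplexRecord(
    species="Homo sapiens",
    participants=[Participant("PROT_X", "UniProt", "protein", stoichiometry=2)],
)

# Hypothetical protein plus ligand: only one protein molecule.
protein_with_ligand = ComplexRecord(
    species="Homo sapiens",
    participants=[
        Participant("PROT_X", "UniProt", "protein", stoichiometry=1),
        Participant("LIGAND_Y", "ChEBI", "small molecule"),
    ],
)

print(homodimer.meets_size_criterion(), protein_with_ligand.meets_size_criterion())
```

Keeping stoichiometry optional mirrors the "when known" qualifier on the slide: a missing value is recorded as unknown rather than guessed.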
Data captured
• Complex-specific, free-text annotation fields:
• Function and context – UniProt-style (visible in search results)
• Assembly, e.g. homodimer, heterotetramer…
• Physical properties, e.g. MW, size, topology/assembly
• Ligands
• Disease
Data captured
• Complex names:
• Recommended name:
most recognisable name from literature; use GO component if specific complex exists in GO (visible in search results)
• Systematic name:
based on Reactome’s new CV names – ‘string of gene names with stoichiometry’
• Synonyms:
all other names the complex may be known as
Data captured
• Structured annotation using GO (BP, MF, CC)
• Cross references to experimental evidence:
• IMEx (+ non-IMEx IntAct & DIP), PDB, EMDB
• Cross references to related complex data:
• Reactome (human)
• ChEMBL
• IntEnz (enzyme EC numbers)
• OMIM / EFO / Orphanet (disease)
• ECO (evidence code ontology)
• PubMed (for further information)
ECO – Evidence Code Ontology
• ECO:0000353: physical interaction evidence used in manual assertion, if:
• full experimental evidence for the complexes is present either in a PPI DB, PDB or EMDB
• ECO:0000266: sequence orthology evidence used in manual assertion, if:
• only limited experimental evidence exists for a complex in one species (e.g. mouse) but it is desirable to curate the complex which has been curated in another species (e.g. human) and orthologous gene products exist in the former species, e.g. PDGFs
• ECO:0000250: sequence similarity evidence used in manual assertion, if:
• only limited experimental evidence exists for one complex but full experimental evidence exists for a similar complex of the same species
• ECO:0000306: inference from background scientific knowledge used in manual assertion, if:
• no or only partial experimental evidence can be found but the complexes are generally assumed to exist, e.g. GABA receptors exist in ChEMBL
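The four rules above can be read as a small decision function. This is a sketch only: the function and parameter names are hypothetical, and the order in which orthology is checked before similarity is an assumption, since the slide does not say which takes precedence when both apply.

```python
ECO_PHYSICAL_INTERACTION = "ECO:0000353"  # physical interaction evidence
ECO_SEQUENCE_ORTHOLOGY = "ECO:0000266"    # sequence orthology evidence
ECO_SEQUENCE_SIMILARITY = "ECO:0000250"   # sequence similarity evidence
ECO_BACKGROUND_KNOWLEDGE = "ECO:0000306"  # inference from background knowledge


def choose_eco_code(
    full_experimental_evidence: bool,
    curated_ortholog_in_other_species: bool = False,
    similar_complex_in_same_species: bool = False,
) -> str:
    """Pick an ECO code for a curated complex (hypothetical helper)."""
    if full_experimental_evidence:
        # Full evidence present in a PPI DB, PDB or EMDB.
        return ECO_PHYSICAL_INTERACTION
    if curated_ortholog_in_other_species:
        # Limited evidence here, but the complex is curated in another species.
        return ECO_SEQUENCE_ORTHOLOGY
    if similar_complex_in_same_species:
        # Limited evidence here, full evidence for a similar complex, same species.
        return ECO_SEQUENCE_SIMILARITY
    # No or only partial evidence, but the complex is generally assumed to exist.
    return ECO_BACKGROUND_KNOWLEDGE


print(choose_eco_code(False, curated_ortholog_in_other_species=True))
```

In practice the choice is a curator's judgement; the function only shows how the four conditions partition the cases.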
Future plans
• More content!
• Other groups have already voiced interest in joining our curation effort.
• We welcome any volunteer contributions and can train you and set you up on the system.
• Expand search filters / add Advanced Search:
• Species
• GO terms
• ECO
Future plans
• Visualisation:
• Schematic views
• Crystal / EM structures (PDBe / EMDB)
• Integration of pathway view (Reactome)
• Drug targets (ChEMBL)
• Small molecule structures & Ontology (ChEBI)
• Enzyme reactions (Rhea)
• Gene expression data (Expression Atlas)
• Direct link to search for related complexes and interactions
Live demo!
www.ebi.ac.uk/intact/complex
Direct Submissions and Contact
Acknowledgements
Henning Hermjakob
Group leader
Sandra Orchard
Coordinator
Pablo Porras Milan
Birgit Meldal
Margaret Duesbury
Curation team
Marine Dumousseau
Developing team
Oscar Forner-Martinez
Acknowledgements
GO
• Jane Lomax
• Paola Roncaglia
• David Osumi-Sutherland
• Rebecca Foulger
• Rachel Huntley
• Heiko Dietze
SIB
• Bernd Roechert
MatrixDB
• Sylvie Ricard-Blum
Reactome
• Steve Jupe
• David Croft
ChEMBL
• Anna Gaulton
• Yvonne Light
PDBe
• Sameer Velankar
• Jose Dana
Thank you!