The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Post on 30-Dec-2015


TRANSCRIPT

Page 1: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

The Complex Portal - relationship to Gene Ontology

Sandra Orchard (IntAct)

Page 2: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Project Aim

• To design an online portal to search and visualise protein complexes

• Including cross-referencing to source databases and beyond

• Export to interested parties in a format of their choice

• Incorporate the data into network analysis tools

• Emphasis on major model organisms, chosen to span the taxonomic range –

• Homo sapiens, Saccharomyces cerevisiae, Escherichia coli

• Mus musculus, Caenorhabditis elegans, Drosophila melanogaster, Schizosaccharomyces pombe, Arabidopsis thaliana

• All data held in IntAct DB – shares the editor, protein update mechanism and QC procedures

• Separate search and visualisation facility

• wwwdev.ebi.ac.uk/intact/complex/

Page 3: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Definition: stable protein complexes

A stable set (2 or more) of interacting protein molecules which

• can be co-purified and

• have been shown to exist as a functional unit in vivo.

Non-protein molecules (e.g. small molecules, nucleic acids) may also be present in the complex.

What is not a stable complex?

• Two proteins associated in a pulldown / coimmunoprecipitation with no functional link

• Enzyme/substrate, receptor/ligand or similar transient interactions

• Exception - obligate complex that requires substrate/ligand, e.g. PDGF receptors

Page 4: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Source Databases

• PDBe (EBI) – almost 1000 complexes imported

• ChEMBL (EBI) – 81 complexes imported, more to come with each release

• MatrixDB (Sylvie Ricard-Blum, Univ. of Lyon)

• Mining UniProt – yeast (Bernd Roechert, SIB – manually)

• Reactome – human (EBI)

• Manual curation from IMEx DBs & the literature

• Gramene – Arabidopsis

• Unmaintained web resources – CYGD (yeast), CORUM (human), E. coli website, 3D Complexes (Sarah Teichmann, EBI)

Page 5: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Data captured currently for IntAct complexes

• Participants – proteins (UniProt), small molecules (ChEBI), nucleic acids (Ensembl, ChEBI, RNAcentral?)

• Species

• Stoichiometry – when known

• Topology (= binding sites) – when known
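The fields on this slide can be read as a small record type. The following is a minimal sketch only, not the IntAct schema: the class names, field names and the `EXAMPLE-…` identifiers are hypothetical, and unknown stoichiometry is assumed to count as one molecule when checking the "2 or more protein molecules" criterion from the definition slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Participant:
    identifier: str                      # accession in the source database (hypothetical values below)
    source_db: str                       # e.g. "UniProt", "ChEBI", "Ensembl"
    molecule_type: str                   # "protein", "small molecule" or "nucleic acid"
    stoichiometry: Optional[int] = None  # None when not known


@dataclass
class ComplexRecord:
    name: str
    species: str
    participants: List[Participant] = field(default_factory=list)
    # Topology as pairs of participant identifiers that bind each other; empty when not known.
    binding_pairs: List[Tuple[str, str]] = field(default_factory=list)

    def protein_molecule_count(self) -> int:
        """Count protein molecules, treating unknown stoichiometry as 1 (an assumption)."""
        return sum(
            p.stoichiometry if p.stoichiometry is not None else 1
            for p in self.participants
            if p.molecule_type == "protein"
        )

    def meets_size_criterion(self) -> bool:
        """True when the record contains 2 or more protein molecules."""
        return self.protein_molecule_count() >= 2


# A hypothetical homodimer: one protein participant with stoichiometry 2.
homodimer = ComplexRecord(
    name="hypothetical homodimer",
    species="Homo sapiens",
    participants=[Participant("EXAMPLE-1", "UniProt", "protein", stoichiometry=2)],
)
```

The size check covers only the countable part of the definition; whether a complex is co-purified and functional in vivo remains a curation decision.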

Page 6: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Data captured currently for IntAct complexes

• Complex-specific, free-text annotation fields:

• Function and context – UniProt-style (visible in search results)

• Assembly, e.g. homodimer, heterotetramer…

• Physical properties, e.g. MW, size, topology/assembly

• Ligands

• Disease

Page 7: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Data captured currently for IntAct complexes

• Complex names:

• Recommended name: most recognisable name from literature; use GO component if specific complex exists in GO

• Systematic name: based on Reactome’s new CV names – ‘string of gene names with stoichiometry’

• Synonyms: all other names the complex may be known as
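As an illustration of a 'string of gene names with stoichiometry', here is a minimal sketch. The exact Reactome CV format is not given on the slide, so the `GENE(n)` notation, the `:` separator and the gene names `GENE1`/`GENE2` are assumptions made purely for the example.

```python
from typing import Iterable, Optional, Tuple


def systematic_name(components: Iterable[Tuple[str, Optional[int]]]) -> str:
    """Join gene names into one string, appending the stoichiometry when it is known.

    The "(n)" suffix and ":" separator are illustrative choices, not a confirmed format.
    """
    parts = []
    for gene, count in components:
        # Omit the suffix when stoichiometry is unknown rather than guessing a value.
        parts.append(gene if count is None else f"{gene}({count})")
    return ":".join(parts)


example = systematic_name([("GENE1", 2), ("GENE2", None)])
print(example)  # GENE1(2):GENE2
```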

Page 8: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Data captured currently for IntAct complexes

• Structured annotation using GO (BP, MF, CC)

• Cross references to experimental evidence:

• IMEx (+ non-IMEx IntAct & DIP), PDB, EMDB

• Cross references to related complex data:

• Reactome (human)

• ChEMBL

• PubMed (for further information)

• IntEnz (enzyme EC numbers)

• OMIM (disease)

• ECO (evidence code ontology)

Page 9: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Parallel Annotation of complexes in GO

• At project start: > 400 complex terms in GO CC, mostly children of GO:0043234 protein complex, lacking hierarchical structure

• Good collaboration with GO to provide structured annotation

• Parent terms mainly based on complex function

• TermGenie (TG) Standard Form <protein_complex_by_activity>

• Otherwise use TG Free Form

• Some complexes still direct children of GO:0043234 protein complex

• Adding “logical definitions” / “cross-products” / “extensions”

• e.g. “capable_of x activity”

Page 10: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

ECO – Evidence Code Ontology

• ECO:0000353 physical interaction evidence used in manual assertion (=IPI)

• full experimental evidence for the complexes is present

• ECO:0000266 - sequence orthology evidence used in manual assertion (=ISO)

• used when only limited experimental evidence exists for a complex in one species (e.g. mouse), the complex has already been curated in another species (e.g. human), and orthologous gene products exist in the former species, e.g. PDGFs

• ECO:0000306: inference from background scientific knowledge used in manual assertion, if:

• no or only partial experimental evidence can be found but the complexes are generally assumed to exist, e.g. GABA receptors exist in ChEMBL
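The three evidence codes above amount to a small decision rule. The sketch below is one possible reading of it: the function name, the boolean flags and the order in which the conditions are checked are assumptions, and only the ECO identifiers come from the slide.

```python
ECO_PHYSICAL_INTERACTION = "ECO:0000353"   # physical interaction evidence (=IPI)
ECO_SEQUENCE_ORTHOLOGY = "ECO:0000266"     # sequence orthology evidence (=ISO)
ECO_BACKGROUND_KNOWLEDGE = "ECO:0000306"   # inference from background scientific knowledge


def choose_eco(
    full_experimental_evidence: bool,
    curated_ortholog_in_other_species: bool,
    generally_assumed_to_exist: bool,
) -> str:
    """Pick an evidence code for a complex; the precedence order is an assumption."""
    if full_experimental_evidence:
        return ECO_PHYSICAL_INTERACTION
    if curated_ortholog_in_other_species:
        return ECO_SEQUENCE_ORTHOLOGY
    if generally_assumed_to_exist:
        return ECO_BACKGROUND_KNOWLEDGE
    # None of the three situations on the slide applies.
    raise ValueError("no evidence code applies to this complex")


print(choose_eco(False, True, False))  # ECO:0000266
```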

Page 11: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Download

• At present:

• One PSI-MI XML 2.5.4 file for all complexes on ftp site

• From next IntAct release:

• One file per complex within a folder per species on ftp site and a zip file per species

• Future:

• Separate files for each complex accessible on each complex details page

• List of files for complexes from search results list

• Database-specific dumps

• Network analysis appropriate format (as developed by MIPS)
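The planned layout of one file per complex within a folder per species, plus a zip file per species, could be produced along these lines. This is a sketch under assumptions: the folder naming (lower-cased species name with underscores), the `complex-1` style identifiers and the placeholder XML content are hypothetical and do not reflect the actual ftp site.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import Dict, List


def write_species_export(root: Path, complexes_by_species: Dict[str, Dict[str, str]]) -> List[Path]:
    """Write one XML file per complex in a folder per species, then zip each folder."""
    zip_paths = []
    for species, complexes in complexes_by_species.items():
        folder = root / species.lower().replace(" ", "_")
        folder.mkdir(parents=True, exist_ok=True)
        for complex_id, xml_text in sorted(complexes.items()):
            (folder / f"{complex_id}.xml").write_text(xml_text, encoding="utf-8")
        # One archive per species, sitting next to the species folder.
        zip_path = root / f"{folder.name}.zip"
        with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
            for xml_file in sorted(folder.glob("*.xml")):
                archive.write(xml_file, arcname=f"{folder.name}/{xml_file.name}")
        zip_paths.append(zip_path)
    return zip_paths


# Usage with placeholder content in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    created = write_species_export(
        Path(tmp),
        {"Homo sapiens": {"complex-1": "<entrySet/>", "complex-2": "<entrySet/>"}},
    )
    print([path.name for path in created])
```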

Page 12: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Project status

• Website will move to the production site at the end of March

• Further development (particularly graphics) will be made public over the next 6 months

• Curation priorities – human (mouse), yeast, E. coli

- user requests

Exports to GOA (process and component) and UniProt under discussion.

Page 13: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Future Plans - Display

• Add search filters, e.g.

• Species – almost done

• GO terms

• ECO

Advanced Search

• Links to ‘experimental evidence’ and ‘related complexes’ searches

• Schematic view of complex

• Add existing widgets/BioJS components to show content from other databases directly in the Complex Portal (BioJS)

- crystal structure, pathway, enzyme reactions etc
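The planned search filters (species, GO terms, ECO) can be pictured as optional constraints applied to a list of complex records. The sketch below assumes a hypothetical record shape (plain dictionaries with `species`, `go_terms` and `eco` keys) and hypothetical complex names; it says nothing about how the portal itself implements search.

```python
from typing import Dict, Iterable, List, Optional


def filter_complexes(
    records: Iterable[Dict],
    species: Optional[str] = None,
    go_term: Optional[str] = None,
    eco: Optional[str] = None,
) -> List[Dict]:
    """Keep records matching every filter that is set; a filter left as None is ignored."""
    matches = []
    for record in records:
        if species is not None and record["species"] != species:
            continue
        if go_term is not None and go_term not in record["go_terms"]:
            continue
        if eco is not None and record["eco"] != eco:
            continue
        matches.append(record)
    return matches


# Hypothetical records for illustration only.
records = [
    {"name": "complex A", "species": "Homo sapiens", "go_terms": ["GO:0043234"], "eco": "ECO:0000353"},
    {"name": "complex B", "species": "Mus musculus", "go_terms": ["GO:0043234"], "eco": "ECO:0000266"},
]
mouse_only = filter_complexes(records, species="Mus musculus")
```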

Page 14: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Future Plans - Functionality

• Concept of ‘sets’ – important for Reactome import

• Hierarchy of complex sets → specific complex → sub-complex

• Introducing features to indicate, e.g. complex-drug binding sites

Page 15: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Complexes on demand

1. Request via ‘Contact us’ button

1. Name & components

2. Experimental paper

3. Full details including function, stoichiometry and topology

... or we give you access to the editor to create your own

Page 16: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)


Page 17: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Summary of ‘User Survey’ and own goals

Page 18: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Summary of ‘User Survey’ - Search

Page 19: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Summary of ‘User Survey’ - Display

Page 20: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Summary of ‘User Survey’ - Features

Expression Atlas?

Page 21: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Summary of ‘User Survey’ - Features

Manually for mouse

ECO xref to exp-evidence

Page 22: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Summary of ‘User Survey’ - Features

Definition ??? Reactome

Page 23: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Summary of ‘User Survey’ - Features

Page 24: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Summary of ‘User Survey’ - Downloads

Page 25: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

IntAct and Complex Portal homepage

Page 26: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Complex Portal: UniProt-style display

Page 27: The Complex Portal - relationship to Gene Ontology Sandra Orchard (IntAct)

Complex Portal: tab-style display