protein interactions and pathways

Post on 14-Jan-2016


DESCRIPTION

Protein interactions and Pathways. Jyoti Khadake & Vicky Schneider. Joint Wellcome Trust–EBI Summer School, 24th June 2011. This morning session outline: Where do protein sequences come from? Introduction to protein databases. Introduction to protein interactions.

TRANSCRIPT

EBI is an Outstation of the European Molecular Biology Laboratory.

Jyoti Khadake & Vicky Schneider

Joint Wellcome Trust –EBI

Summer School

24th June 2011

Protein interactions and Pathways

This morning session outline

• Where do protein sequences come from?

• Introduction to protein databases

• Introduction to protein interactions

• Standardisation of the protein interaction data

• IntAct and demo

• PSICQUIC/Cytoscape and demo

• Data visualisation and network building

• Including protein information from other sources to enhance networks

Where do protein sequences come from?


Protein Sequences

Protein databases

• Based on nucleotide sequence similarity

• Based on peptide sequences

Organism database

• Organism of protein is important as is sequence – taxonomy databases

Can you name THE database of protein sequences?

UniProtKB

factsheet

Let’s explore a protein: CDC42


• Cell division control protein 42 homolog also known as CDC42 is a protein involved in regulation of the cell cycle.

• It is a small GTPase of the Rho-subfamily, which regulates signaling pathways that control diverse cellular functions including cell morphology, migration, endocytosis and cell cycle progression.

What could go wrong if CDC42 is not doing its job?

UniProtKB (CDC42 protein)

• Search for gene: CDC42

• Check the different proteins retrieved

• Organisms

• Same organism: Swiss-Prot/TrEMBL

• Different referenced databases: PRIDE

• Sequences and references

• Information about the protein: where is it present, how does it act, what are its properties…

• IntAct, Reactome, GOA, InterPro, PDB

How are TrEMBL entries generated?

UniProt Knowledge Base

• Swiss-Prot: manual annotation (~450,000 proteins)

• TrEMBL: automatic annotation (~3,300,000 proteins)

http://www.ebi.uniprot.org/


UniProt Knowledge Base

• Interactions in IntAct use splice variants


UniProt Knowledge Base

• Summary:

• Master protein: P60953

• Splice variants / isoforms: P60953-1, P60953-2
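The accessions on this slide suggest that an isoform identifier is the master accession plus a numeric suffix. A minimal Python sketch of that reading follows; the function name `split_isoform` is illustrative and not part of any UniProt tool.

```python
from typing import Optional, Tuple


def split_isoform(accession: str) -> Tuple[str, Optional[int]]:
    """Split 'P60953-2' into ('P60953', 2); a bare accession gives (accession, None)."""
    master, sep, suffix = accession.partition("-")
    if sep and suffix.isdigit():
        return master, int(suffix)
    return accession, None


for acc in ["P60953", "P60953-1", "P60953-2"]:
    print(acc, split_isoform(acc))
```

Grouping interaction records by the first element of the returned pair would collect all isoforms under their master protein.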


UniProt Knowledge Base

Protein Families, domains and motifs

What is a protein family? A protein domain? A protein motif?

Why bother creating a database that groups proteins that share the same domain?

InterPro

Protein families, domains (and motifs)

factsheet

UniProt Knowledge Base

• Summary:

• Master protein: P60953

• Interaction and pathway databases


UniProt Taxonomy

• Web interface to the NCBI taxonomy

Newt

PRIDE: where does the data come from?

PRIDE

factsheet


Protein interactions

Interactions

• Basis of protein action

• Types

• Self

• Binary: homomeric or heteromeric

• N-ary complexes

• Co-localisations

• Biological types of interactions

• Information in literature and websites
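The structural types listed above can be told apart from the participant list alone. The sketch below is one simplified reading of the slide, assuming an interaction is given as a list of protein accessions; the function name and the returned labels are illustrative.

```python
from typing import List


def classify_interaction(participants: List[str]) -> str:
    """Classify an interaction from the accessions of its participants."""
    if not participants:
        raise ValueError("an interaction needs at least one participant")
    if len(participants) == 1:
        return "self"
    if len(participants) == 2:
        first, second = participants
        # Two copies of the same protein form a homomeric pair.
        return "binary homomeric" if first == second else "binary heteromeric"
    return "n-ary complex"


print(classify_interaction(["P60953", "P60953"]))
```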

Types of interaction data in IntAct

1. Direct interactions

2. Association

3. Functional interaction

In pairs, start the next activity: match the types of experimental techniques (you can find information in the cards provided) with the types of interaction Jyoti just explained:

Direct Interactions

Association

Functional Interaction

Standardisation of the protein interaction data


Ontologies

factsheet

www.ebi.ac.uk/ols for controlled vocabularies

Format for storage and exchange – PSI-MI XML 2.5

Interaction Databases

Deep curation

• IntAct – active curation, broad species coverage, all molecule types

• MINT – active curation, broad species coverage, PPIs

• DIP – active curation, broad species coverage, PPIs

• MPACT – ? curation, limited species coverage, PPIs

• MatrixDB – active curation, extracellular matrix molecules only

• BIND – ceased curating 2006/7, broad species coverage, all molecule types – information becoming dated

Shallow curation

• BioGRID – active curation, limited number of model organisms

• HPRD – active curation, human-centric, modelled interactions

• MPIDB – active curation, microbial interactions

The IMEx consortium


IntAct

How to model an interaction

[Diagram labels: Publication; Experiment1, Experiment2; Interaction1 to Interaction4; Participant1 to Participant3 (roles, features, preparations); Protein1, Protein2]

Main objects - Experiment

Controlled by Ontologies

Literature references

Confidence measures


Main objects - Participant

e.g. enzyme target

Interactor

e.g. bait, prey

Delivery method, expression level…

Interactor used experimentally

Building of Complex
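The objects named on these slides (publication, experiment, interaction, participant, interactor) can be pictured as nested records. The following Python sketch is only an illustration of that nesting, not IntAct's actual schema; all class and field names are assumptions, and the partner protein and reference are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Interactor:
    accession: str   # database accession, e.g. from UniProtKB
    name: str


@dataclass
class Participant:
    interactor: Interactor
    experimental_role: str = "unspecified"   # e.g. bait, prey
    biological_role: str = "unspecified"     # e.g. enzyme target
    features: List[str] = field(default_factory=list)


@dataclass
class Interaction:
    participants: List[Participant] = field(default_factory=list)

    def is_binary(self) -> bool:
        return len(self.participants) == 2


@dataclass
class Experiment:
    detection_method: str                    # ontology-controlled term
    interactions: List[Interaction] = field(default_factory=list)


@dataclass
class Publication:
    reference: str                           # literature reference
    experiments: List[Experiment] = field(default_factory=list)


# Illustrative data only: the partner protein and the reference are placeholders.
cdc42 = Interactor("P60953", "CDC42")
partner = Interactor("PLACEHOLDER", "hypothetical partner")
interaction = Interaction([
    Participant(cdc42, experimental_role="bait"),
    Participant(partner, experimental_role="prey"),
])
paper = Publication("placeholder reference",
                    [Experiment("two hybrid", [interaction])])
print(interaction.is_binary())
```

Keeping the participant separate from the interactor lets the same protein appear with different roles and features in different experiments.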

IntAct

• Search MITAB

• From MITAB to detailed view

• Expanding network

• Network view - TBC

• Other data that can be visualised

IntAct – Home Page

http://www.ebi.ac.uk/intact

Software demonstration

• Many ways to search data!

• Simple, yet powerful search engine

• Advanced search – how to build complex queries

• Searching by ontology terms

• Searching by chemical substructure


Simple Search

First search from the home page…

[Screenshot labels: UniProt, Taxonomy, PubMed, OLS, Details of interaction, Complex?]

Downloading & Customizing


Searching – more

How to build complex queries…

Searching – Fields


• Unsure how to build your own complex query?

Searching – Fields

• Some fields provide easy ways to select terms
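A field-based query of the kind these slides describe can be assembled from field and value pairs. The sketch below assumes a Lucene-style `field:value` syntax joined by `AND`; the field names `species` and `detmethod` are used here as assumptions, so check the search help for the fields actually supported.

```python
from typing import Dict


def build_query(terms: Dict[str, str], operator: str = "AND") -> str:
    """Join field:value pairs into one query string."""
    parts = []
    for field_name, value in terms.items():
        # Quote values that contain whitespace so they stay one phrase.
        if any(ch.isspace() for ch in value):
            value = f'"{value}"'
        parts.append(f"{field_name}:{value}")
    return f" {operator} ".join(parts)


print(build_query({"species": "human", "detmethod": "two hybrid"}))
```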

Software demonstration

• Single interaction details

• Selecting an interaction

• Looking at the details

• Fetching all other interactions reported in the same paper

• Searching for similar interactions in the database


Interaction Details

Selecting an interaction…

Interaction Details

Looking at the details…


Interaction Details

Searching for similar interactions…

Network Visualisation

PSICQUIC, Cytoscape

Network visualisation

• In IntAct

• From IntAct: binary and expanded

• From IntAct: n-ary and expanded

Important: type of interaction and method used

• In PSICQUIC

• Data from other interaction databases

What is PSICQUIC ?

• Proteomics Standards Initiative Common QUery InterfaCe.

• Community effort to standardise the way to access and retrieve data from Molecular Interaction databases.

• PSICQUIC is a specification of a web service.

• Resources already implementing PSICQUIC are listed in a registry.

• Based on the PSI standard formats (XML and MITAB)

• Documentation: http://psicquic.googlecode.com
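Since PSICQUIC is a web service returning the PSI formats, a client mostly needs to build a query address and read tab-separated MITAB lines. The Python sketch below shows both steps without making a network call. The base URL is an assumption about one service's REST address (the registry lists the current ones), the column keys are names chosen for this example, and the sample record is made up.

```python
from typing import Dict
from urllib.parse import quote

# Assumed address of one PSICQUIC REST service; the registry lists the real ones.
ASSUMED_BASE_URL = ("http://www.ebi.ac.uk/Tools/webservices/psicquic/"
                    "intact/webservices/current/search/")

# Keys for the 15 tab-separated MITAB 2.5 fields, in order.
MITAB25_COLUMNS = [
    "id_a", "id_b", "alt_ids_a", "alt_ids_b", "aliases_a", "aliases_b",
    "detection_method", "first_author", "publication_ids",
    "taxid_a", "taxid_b", "interaction_type", "source_db",
    "interaction_ids", "confidence",
]


def build_query_url(query: str, base_url: str = ASSUMED_BASE_URL) -> str:
    """Return the address for a query, percent-encoding the query text."""
    return base_url + "query/" + quote(query)


def parse_mitab_line(line: str) -> Dict[str, str]:
    """Map one MITAB 2.5 line onto the column keys above."""
    values = line.rstrip("\n").split("\t")
    if len(values) < len(MITAB25_COLUMNS):
        raise ValueError("expected at least 15 tab-separated fields")
    return dict(zip(MITAB25_COLUMNS, values))


# Made-up record for illustration; "-" marks an empty field.
sample = "\t".join(["uniprotkb:P60953", "uniprotkb:PLACEHOLDER"] + ["-"] * 7
                   + ["taxid:9606", "taxid:9606"] + ["-"] * 4)
record = parse_mitab_line(sample)
print(build_query_url("P60953"))
print(record["id_a"], record["taxid_a"])
```

Because every provider returns the same columns, the same parser can be reused for each service found in the registry.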

PSICQUIC implementation

[Diagram labels: Sample; Observation error; Publications; Annotation error; Interaction databases; PSICQUIC sources; PSICQUIC Registry; PSICQUIC client; User]

http://www.ebi.ac.uk/Tools/webservices/psicquic/view/

PSICQUIC View

http://bit.ly/psicquic-view

• Enables clustering of queries across providers

• Visualization of graphical network

• Linking back to the original source for more details

• …

PSICQUIC Services Tagging

• Content: protein-protein, small molecule-protein, nucleic acid-protein

• Interaction representation: evidence, clustered

• Curation standards: MIMIx curation, IMEx curation, rapid curation

• Source: internally curated, text mining, predicted, imported

• Complex expansion: spoke, matrix, bipartite

PSICQUIC View

How to deal with Complexes

• Some experimental protocols generate complex data, e.g. tandem affinity purification (TAP)

• One may want to convert these complexes into sets of binary interactions; two algorithms are available: spoke and matrix expansion
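The two ways of turning a complex into binary pairs can be sketched in a few lines of Python. This assumes the usual definitions: spoke expansion pairs the bait with each prey, and matrix expansion pairs every member with every other member. The function names are illustrative.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def spoke_expansion(bait: str, preys: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair the bait with each prey: n preys give n binary interactions."""
    return [(bait, prey) for prey in preys]


def matrix_expansion(members: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair every member with every other: n members give n*(n-1)/2 pairs."""
    return list(combinations(members, 2))


print(spoke_expansion("A", ["B", "C", "D"]))
print(matrix_expansion(["A", "B", "C", "D"]))
```

For the same four-protein complex, spoke expansion yields 3 pairs and matrix expansion yields 6, so the matrix model reports more pairs that were never directly observed.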

In pairs, start the next activity:

Binary or n-ary? Spoke or matrix?

Please identify the type of interaction for each of the interaction method cards given

Also choose the expansion you think is best for each method

Software demonstration

• Visualising a network in Cytoscape

• Selecting a network

• Import into Cytoscape

• Change layout

• Add attributes and change view based on these

• Change and add properties to nodes and edges
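One simple way to get binary interactions into Cytoscape is its SIF (simple interaction format) text file, where each line holds a source node, a relationship type and a target node. The helper below is a sketch; the function name and the default relationship label `pp` are choices made for this example.

```python
from typing import Iterable, Tuple


def to_sif(pairs: Iterable[Tuple[str, str]], relation: str = "pp") -> str:
    """Render binary interactions as tab-delimited SIF lines."""
    return "".join(f"{a}\t{relation}\t{b}\n" for a, b in pairs)


# Node names here are placeholders apart from the CDC42 accession.
sif_text = to_sif([("P60953", "PARTNER_1"), ("P60953", "PARTNER_2")])
print(sif_text, end="")
```

Writing `sif_text` to a file with a `.sif` extension gives something that can be imported as a network; node and edge attributes would then be loaded separately.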

Cytoscape network visualisation

Visualization

Highlighting network layout…

Visualization

Highlighting network properties: edges

Visualization

Highlighting network properties: nodes

Attributes and analysis using Cytoscape


What else?

How to look deeper into a dataset…

GOA

• Click on the interaction count to restrict your dataset

• This operation can be done several times to add multiple filters

Improving and increasing protein annotations

IntAct team

• Rolf Apweiler

• Henning Hermjakob

• Sandra Orchard

• Margaret Duesbury

• Samuel Kerrien

• Bruno Aranda

• Marine Dumousseau

IntAct is funded by the European Commission under FELICS, contract number 021902 (RII3)

PSI, IMEx, Enfin Proteomics community

PANDA

Proteomics

Acknowledgements

What data are we dealing with?

Systems Biology?

Genomics, Proteomics

Functional Genomics/Proteomics

Transcriptomics, Metabolomics

DNA

RNA

Protein

Small Molecules

Databases

Pathways:

This afternoon session outline

• Reactome overview

• What type of data it contains

• Where the data comes from

• What and how can you access through Reactome

• Have a go: tutorial

A database of human biological pathways

Steve Jupe

Rationale – Journal information

Nature 407(6805):770-6. The Biochemistry of Apoptosis.

“Caspase-8 is the key initiator caspase in the death-receptor pathway. Upon ligand binding, death receptors such as CD95 (Apo-1/Fas) aggregate and form membrane-bound signalling complexes (Box 3). These complexes then recruit, through adapter proteins, several molecules of procaspase-8, resulting in a high local concentration of zymogen. The induced proximity model posits that under these crowded conditions, the low intrinsic protease activity of procaspase-8 (ref. 20) is sufficient to allow the various proenzyme molecules to mutually cleave and activate each other (Box 2). A similar mechanism of action has been proposed to mediate the activation of several other caspases, including caspase-2 and the nematode caspase CED-3 (ref. 21).”

How can I access the pathway described here and reuse it?

Nature. 2000 Oct 12;407(6805):770-6. The biochemistry of apoptosis.

Rationale - Figures

A picture paints a thousand words…

but…

• Just pixels

• Omits key details

• Assumes

• Fact or hypothesis?

Reactome is…

Free, online, open-source curated database of pathways and reactions in human biology

Authored by expert biologists, maintained by Reactome editorial staff (curators)

Mapped to cellular compartment

Extensively cross-referenced

Tools for data analysis – Pathway Analysis, Expression Overlay, Species Comparison, Biomart…

Used to infer orthologous events in 20 non-human species

Reactome is…

Using model organism data to build pathways – Inferred pathway events

[Diagram labels: human, mouse, cow; direct evidence, indirect evidence; PMID:5555, PMID:4444, PMID:8976, PMID:1234]

Theory - Reactions

Pathway steps = the “units” of Reactome

= events in biology

TRANSPORT

CLASSIC BIOCHEMICAL

BINDING

DISSOCIATION

DEGRADATION

PHOSPHORYLATION

DEPHOSPHORYLATION

Reaction Example 1: Enzymatic

Reaction Example 2: Transport

REACT_945.4

Transport of Ca++ from platelet dense tubular system to cytoplasm

Other Reaction Types

Binding

Dimerization

Phosphorylation

Reactions Connect into Pathways

[Diagram labels: three chained reactions, each with INPUT, OUTPUT and CATALYST]
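The idea that reactions connect into pathways through their inputs, outputs and catalysts can be sketched as a small data structure. This is an illustration under simple assumptions, not Reactome's data model: a reaction is taken to follow another when it consumes, or is catalysed by, something the other produces. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Reaction:
    name: str
    inputs: FrozenSet[str]
    outputs: FrozenSet[str]
    catalyst: Optional[str] = None


def feeds_into(first: Reaction, second: Reaction) -> bool:
    """True when an output of `first` is an input or the catalyst of `second`."""
    return bool(first.outputs & second.inputs) or second.catalyst in first.outputs


def pathway_links(reactions: Sequence[Reaction]) -> List[Tuple[str, str]]:
    """List (upstream, downstream) pairs of reaction names."""
    return [(a.name, b.name)
            for a in reactions for b in reactions
            if a is not b and feeds_into(a, b)]


# Placeholder molecules A to D and enzymes E1, E2.
steps = [
    Reaction("r1", frozenset({"A"}), frozenset({"B"}), catalyst="E1"),
    Reaction("r2", frozenset({"B"}), frozenset({"C"}), catalyst="E2"),
    Reaction("r3", frozenset({"C"}), frozenset({"D"})),
]
print(pathway_links(steps))
```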

Data Expansion - Link-outs From Reactome

• GO

• Molecular function

• Compartment

• Biological process

• KEGG, ChEBI – small molecules

• UniProt – proteins

• Sequence dbs – Ensembl, OMIM, Entrez Gene, RefSeq, HapMap, UCSC, KEGG Gene

• PubMed references – literature evidence for events

Species Selection

Data Expansion – Projecting to Other Species

[Diagram labels: Human: A + ATP, A + ADP-P, B; Mouse: A + ATP, A + ADP-P, B; Drosophila: A, B, + ATP; "No orthologue - Protein not inferred"; "Reaction not inferred"]
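The projection shown in this diagram can be approximated with a lookup of orthologues. The sketch below assumes a deliberately simple rule, that a reaction is projected only when every participating protein has an orthologue in the target species; the rule Reactome actually applies may differ, and which Drosophila protein lacks an orthologue is an assumption made for the example.

```python
from typing import Dict, List, Optional


def project_reaction(proteins: List[str],
                     orthologues: Dict[str, str]) -> Optional[List[str]]:
    """Map each protein to its orthologue; return None if any is missing."""
    projected = []
    for protein in proteins:
        if protein not in orthologues:
            return None  # no orthologue: protein, and so reaction, not inferred
        projected.append(orthologues[protein])
    return projected


human_reaction = ["A", "B"]                    # placeholder protein names
mouse = {"A": "mouse_A", "B": "mouse_B"}       # both proteins have orthologues
drosophila = {"A": "fly_A"}                    # assumed: B has no orthologue

print(project_reaction(human_reaction, mouse))
print(project_reaction(human_reaction, drosophila))
```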

Exportable Protein-Protein Interactions

Inferred from complexes and reactions

Interactions between proteins in the same complex, reaction, or adjoining reaction

Lists available from Downloads

See Readme document for more details

Coverage – Content, TOC

And many more...

Planned Coverage – Editorial Calendar

Reactome Tools

• Interactive Pathway Browser

• Pathway Mapping and Over-representation

• Expression overlay onto pathways

• Molecular Interaction overlay

• Biomart

Summary

• Pathway databases are an integral part of the scientific enterprise.

• Reactome has deployed a user-friendly web site that promotes integrated research on pathways and networks.

• Data visualization

• Data analysis

• Data expansion

• Data integration

• Data standards/exports

• Develop and distribute open software and standard operating procedures for the management of pathway information.

Credits

OICR/CSHL NYU EBI

Lincoln Stein Peter D'Eustachio Ewan Birney

Michael Caudy Shahana Mahajan Henning Hermjakob

Marc Gillespie Lisa Matthews David Croft

Robin Haw Veronica Shamovsky Phani Garapati

Irina Kalatskaya Bijay Jassal

Bruce May Steven Jupe

Leontius Pradhana

Nelson Ndegwa

Guanming Wu Gavin O’Kelly

Christina Yung Esther Schmidt

Supported by grants from the US National Institutes of Health (P41 HG003751) and EU grant LSHG-CT-2005-518254 "ENFIN"

In pairs start the Reactome Tutorial
