Reactome: a pathways knowledgebase
Imre Vastrik
EMBL-European Bioinformatics Institute
6/10/2005
The Plan
• Why?
• How?
• What does it look like/what can you do with it?
From data to knowledge
Decrease in computational access
Insulin binds the insulin receptor, causing it to dimerise. The dimerised form then autophosphorylates on 6 cytoplasmic tyrosines. This phosphorylated form recruits the IRS adaptor....
Decrease in computational access
…and exhaustion
• Why?
• How?
• What does it look like/what can you do with it?
History of Reactome
• Started as Genome Knowledgebase in spring 2001.
• Aim: capture the knowledge of biological experts in a form that could be searched and reasoned over electronically, and which could act as a connecting link between sequence records and primary biomedical literature.
• Initially tried to capture and standardise the language used to describe molecular processes.
• In 2001/2002 we realised that what we were trying to capture are reactions and pathways.
• Rebranded as Reactome June 2004.
Reactome data model
[Figure, built up over three slides: insulin signalling drawn as reactions between insulin, the insulin receptor, the phosphorylated (P) receptor and IRS, placed in cellular compartments and annotated with external identifiers.]
• Pathway: Insulin signalling
• Compartments: extracellular region [GO:0005576], plasma membrane [GO:0005886], cytosol [GO:0005829]
• Molecular function: transmembrane receptor protein tyrosine kinase activity [GO:0004714]
• Proteins: UniProt:P01308, UniProt:P06213; IRS-1, IRS-2, DOK1 (UniProt:Q9Y4H2, UniProt:Q99704, UniProt:P35568)
• Small molecules: ATP x12, ADP x12 (ChEBI:2359, ChEBI:2342)
• Literature: PMID:8276779, PMID:8039601, PMID:11737239, PMID:7781591
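The annotations on the data model slides can be read as a small data structure: entities that live in a GO compartment and carry an external identifier, and reactions that turn input entities into output entities, optionally with a GO molecular function and supporting literature. The Python sketch below is only an illustration of that reading. The class and field names are hypothetical and are not Reactome's actual schema, the dimerisation step is left out, and the pairing of the PMID with a particular reaction is assumed for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Entity:
    """A physical entity in one compartment (hypothetical model class)."""
    name: str
    compartment: str             # GO cellular component identifier
    xref: Optional[str] = None   # external identifier such as a UniProt accession


@dataclass(frozen=True)
class Complex(Entity):
    """An entity assembled from other entities."""
    components: tuple = ()


@dataclass
class Reaction:
    name: str
    inputs: List[Entity]
    outputs: List[Entity]
    catalyst_activity: Optional[str] = None   # GO molecular function identifier
    literature: List[str] = field(default_factory=list)


@dataclass
class Pathway:
    name: str
    reactions: List[Reaction]


def preceding(pathway: Pathway, reaction: Reaction) -> List[Reaction]:
    """Reactions in the pathway whose outputs feed the given reaction."""
    return [
        other for other in pathway.reactions
        if other is not reaction
        and any(out in reaction.inputs for out in other.outputs)
    ]


insulin = Entity("insulin", "GO:0005576", "UniProt:P01308")
receptor = Entity("insulin receptor", "GO:0005886", "UniProt:P06213")
atp = Entity("ATP", "GO:0005829")   # ChEBI identifier deliberately left out
adp = Entity("ADP", "GO:0005829")
bound = Complex("insulin:receptor", "GO:0005886",
                components=(insulin, receptor))
phospho = Complex("insulin:phospho-receptor", "GO:0005886",
                  components=(insulin, receptor))

binding = Reaction("insulin binds receptor", [insulin, receptor], [bound],
                   literature=["PMID:8276779"])   # pairing assumed
autophosphorylation = Reaction("receptor autophosphorylation",
                               [bound, atp], [phospho, adp],
                               catalyst_activity="GO:0004714")

insulin_signalling = Pathway("Insulin signalling",
                             [binding, autophosphorylation])

for r in preceding(insulin_signalling, autophosphorylation):
    print(r.name)
```

Because the order of reactions is derived from shared entities rather than stored as arrows, the pathway topology follows from the reactions themselves.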
Ambiguity of connection maps…
[Figure: connection map in which A and B each point to C with a "+", and C points to D with a "+".]
Do you need A & B or just A | B to get active C?
…is avoided by using states and reactions
[Figure: two panels redrawing the map as reactions between explicit states. Panel "A & B": A, B, C, C', C'', D, D'. Panel "A | B": A, B, C, C'', D, D'.]
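A small sketch can make the difference between the two panels concrete. It assumes one plausible reading of the figure: in the "A & B" case C becomes C' through A and then C'' through B, while in the "A | B" case either A or B converts C directly into C''; in both cases C'' converts D into D'. The `Reaction` tuple and the `reachable` function are hypothetical names, and the model is purely qualitative (states are never used up).

```python
from typing import FrozenSet, List, NamedTuple, Set


class Reaction(NamedTuple):
    inputs: FrozenSet[str]      # states converted by the reaction
    catalysts: FrozenSet[str]   # states that must be present but are unchanged
    outputs: FrozenSet[str]     # states produced


def rx(inp: str, cat: str, out: str) -> Reaction:
    """Build a one-input, one-catalyst, one-output reaction."""
    return Reaction(frozenset({inp}), frozenset({cat}), frozenset({out}))


def reachable(start: Set[str], reactions: List[Reaction]) -> Set[str]:
    """Return every state that can be produced from the starting states."""
    present = set(start)
    changed = True
    while changed:
        changed = False
        for r in reactions:
            ready = (r.inputs | r.catalysts) <= present
            if ready and not r.outputs <= present:
                present |= r.outputs
                changed = True
    return present


# "A & B": two successive modifications are needed to reach C''.
AND_MODEL = [rx("C", "A", "C'"), rx("C'", "B", "C''"), rx("D", "C''", "D'")]
# "A | B": either A or B alone is enough.
OR_MODEL = [rx("C", "A", "C''"), rx("C", "B", "C''"), rx("D", "C''", "D'")]

only_a = {"A", "C", "D"}
print("D'" in reachable(only_a, AND_MODEL))   # False
print("D'" in reachable(only_a, OR_MODEL))    # True
```

With states and reactions written out, the question "A & B or A | B?" has exactly one answer per model, which a bare connection map cannot give.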
About mice and men…
[Figure: one pathway whose steps are labelled human, mouse, rat, human, with references PMID:5555, PMID:4444, PMID:8976, PMID:3924.]
…and how not to mix them
[Figure: the reactions are kept separate by species (human, mouse, rat). Labels: direct evidence, indirect evidence. References: PMID:5555, PMID:4444, PMID:8976, PMID:3924.]
Two FAQs
• What about tissue-specific reactions?
– We annotate the union of all possible reactions; gene expression data then gives the set of reactions feasible in a given cell.
• What about fine dynamic balances?
– We only capture qualitative information. The quantitative/modelling aspects have to be handled by ODEs/Kds and SBML-like techniques. We can link to these resources, but they are out of scope for the moment.
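The first answer amounts to a filter over the annotated reactions. As a minimal sketch, assume each reaction is known by the set of proteins it needs and that an expression experiment yields the set of proteins present in a cell type; the function name, the reaction names and the protein names below are all hypothetical.

```python
from typing import Dict, List, Set


def feasible_reactions(reaction_proteins: Dict[str, Set[str]],
                       expressed: Set[str]) -> List[str]:
    """Names of reactions whose required proteins are all expressed."""
    return sorted(
        name for name, proteins in reaction_proteins.items()
        if proteins <= expressed
    )


# Union of all annotated reactions (toy data, not taken from Reactome).
all_reactions = {
    "R1": {"protA", "protB"},
    "R2": {"protB", "protC"},
    "R3": {"protD"},
}

# Proteins detected in one hypothetical cell type.
expressed_in_cell = {"protA", "protB", "protD"}

print(feasible_reactions(all_reactions, expressed_in_cell))
```

Here R2 drops out because protC is not expressed; the annotation itself stays tissue-independent.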
Release cycle
[Figure: an expert (external), a curator (staff) and a reviewer (external) work on the Repository; the steps below produce the ReleaseDB, which is served at www.reactome.org.]
• Extract finished & reviewed topics
• Computationally project pathways to other organisms
• Add cross-references (Ensembl, Entrez Gene, MIM, KEGG,…)
Reactome in numbers (release 15, 26/9/2005)
Human:
• Reactions 1524
• Pathways 659
• Proteins 1095
• “Small molecules” 379
• Complexes 982
• Literature references 1408
• Interactions 19471
• Why?
• How?
• What does it look like/what can you do with it?
[Figure: species labelled with three-letter codes.]
• HSA: Homo sapiens
• MMU: Mus musculus
• TNI: Tetraodon nigroviridis
• DME: Drosophila melanogaster
• CEL: Caenorhabditis elegans
• SCE: Saccharomyces cerevisiae
• SPO: Schizosaccharomyces pombe
• ANI: Aspergillus nidulans
• ATH: Arabidopsis thaliana
• DDI: Dictyostelium discoideum
• PFA: Plasmodium falciparum
• MJA: Methanococcus jannaschii
• SSO: Sulfolobus solfataricus
• ECO: Escherichia coli
• BSU: Bacillus subtilis
• ANA: Anabaena
Rules for orthology-based inference
[Figure: human reactions projected onto Species 1 and Species 2.]
• At least 75% of the components of a complex must have orthologs
• Lineage-specific paralogs are allowed
• All small molecules are presumed to exist if the reactions exist
• Otherwise every input, output and catalyst must be present
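The rules above can be sketched as a predicate over a human reaction. This is an illustration under stated assumptions, not the actual Reactome inference code: the 75% rule is taken to mean the fraction of protein components of a complex, orthologs are supplied as a hypothetical mapping from a human protein identifier to a set of identifiers in the target species (more than one member covers lineage-specific paralogs), and all class and function names are invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Protein:
    id: str


@dataclass
class SmallMolecule:
    name: str


@dataclass
class Complex:
    name: str
    components: list   # Protein, SmallMolecule or nested Complex


def proteins_of(entity) -> List[Protein]:
    """Flatten an entity into its protein components."""
    if isinstance(entity, Protein):
        return [entity]
    if isinstance(entity, Complex):
        return [p for c in entity.components for p in proteins_of(c)]
    return []   # small molecules contribute no proteins


def entity_inferable(entity, orthologs: Dict[str, Set[str]],
                     min_fraction: float = 0.75) -> bool:
    """Apply the slide's rules to one input, output or catalyst."""
    if isinstance(entity, SmallMolecule):
        return True   # small molecules are presumed to exist
    proteins = proteins_of(entity)
    if not proteins:
        return True
    with_ortholog = sum(1 for p in proteins if orthologs.get(p.id))
    if isinstance(entity, Complex):
        return with_ortholog / len(proteins) >= min_fraction
    return with_ortholog == 1   # a free protein needs an ortholog


def reaction_inferable(inputs, outputs, catalysts,
                       orthologs: Dict[str, Set[str]]) -> bool:
    """A reaction is projected only if every participant is inferable."""
    return all(entity_inferable(e, orthologs)
               for e in [*inputs, *outputs, *catalysts])


# Toy ortholog table; P2 has two lineage-specific paralogs, P4 has none.
orthologs = {"P1": {"m1"}, "P2": {"m2a", "m2b"}, "P3": {"m3"}}
cx = Complex("cx", [Protein("P1"), Protein("P2"), Protein("P3"),
                    Protein("P4"), SmallMolecule("ATP")])

print(reaction_inferable([cx, SmallMolecule("ATP")], [cx],
                         [Protein("P1")], orthologs))
print(reaction_inferable([Protein("P4")], [cx], [], orthologs))
```

The first reaction is projected because three of the four proteins in the complex have orthologs; the second is not, because the free input P4 has none.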
Finding lineage-specific deletions
[Figure: the species HSA, MMU, ANA, BSU, ECO, SSO, MJA, PFA, DDI, ATH, ANI, SPO, SCE, CEL, DME, TNI, annotated with "+", "-" and "?" marks.]
Lineage-specific deletion rates
[Figure: chart with two series of data labels per species.]
Species axis: ANA, BSU, ECO, SSO, MJA, PFA, DDI, ATH, SPO, ANI, SCE, CEL, DME, TNI, MMU
First series: 4.4, 3.7, 4.9, 10.2, 26.1, 9.0, 4.0, 9.1, 26.1, 14.7, 24.5, 8.3, 6.7, 5.1, 0.2
Second series: 20.2, 26.2, 26, 18.6, 13.9, 26.1, 43.4, 43.7, 38.1, 44.2, 39.5, 53.1, 60.1, 74.4, 92.4
Absent in S. cerevisiae and S. pombe, but present in Aspergillus
• Lipid metabolism
• Xenobiotic metabolism
• Metabolism of amino acids
• Nucleotide metabolism (transport)
Lineage deletion rates
[Figure labels: Trp catabolism; Head or Tail; DNA repair; Redundant paths; Insulin signalling; Pathway modules.]
Presence of “small molecules”
[Figure: chart with two series of data labels per species. Legend: perc_inferred, perc_compounds.]
Species axis: ANA, BSU, ECO, SSO, MJA, PFA, DDI, ATH, SPO, ANI, SCE, CEL, DME, TNI, MMU
First series: 50.0, 60.2, 57.9, 48.5, 31.0, 40.1, 79.5, 75.4, 53.5, 80.7, 59.4, 88.9, 84.8, 90.9, 98.8
Second series: 18.7, 24.4, 24.1, 17.3, 12.8, 24.6, 43.0, 42.0, 36.7, 42.2, 37.8, 52.8, 58.9, 74.0, 92.0
Tissue expression
[Figure: frequency distribution (y axis "Frequency", 0 to 0.25) of Pearson correlation (x axis, -0.16 to 0.22) for four series: None, Complex, Reaction, Neighbour. Annotation: "more correlated". Data from Human Novartis Affy scan.]
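The quantity on the x axis is the Pearson correlation between two expression profiles. As a minimal sketch, assume each gene has one expression value per tissue and that gene pairs are grouped by how they are related in the pathway data; the profiles below are made-up numbers, and the function name is hypothetical.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


# Toy expression values across four tissues (not real data).
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.1, 3.9, 6.2, 8.0]   # rises and falls with gene_a
gene_c = [4.0, 1.0, 3.5, 2.0]   # unrelated pattern

print(round(pearson(gene_a, gene_b), 3))
print(round(pearson(gene_a, gene_c), 3))
```

Computing this value for every gene pair in a group and binning the results would give one frequency curve of the kind named in the figure.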
Reactome at a glance
• Catalogue of all possible reactions (topology) in an organism: the reactome
• Authored by experts
• Currently human orientated
• Computational predictions to other species
• Data & code freely available (www.reactome.org/download):
– MySQL database, SBML, BioPAX + specialised datasets
– Perl and Java APIs
– Website mirror
– Data entry tool
Groups & People
• Cold Spring Harbor Laboratory: Lincoln Stein, Peter D'Eustachio, Lisa Matthews, Gopal Gopinath, Marc Gillespie, Guanming Wu, Elizabeth Nickerson, Marcela Tello-Ruiz, Geeta Joshi-Tope
• European Bioinformatics Institute: Ewan Birney, Imre Vastrik, Esther Schmidt, Bijay Jassal, Bernard de Bono, David Croft
• Gene Ontology Consortium: Suzanna Lewis
Funding: NHGRI Grant # R01 HG002639; EU STREP EMI-CD; EBI Industry program
www: http://www.reactome.org
e-mail: [email protected]