GoPubMed and beyond: Rules and reasoning for ontology-based literature search
Michael Schroeder, BioTechnological Center, TU Dresden, [email protected], http://www.biotec.tu-dresden.de
GoPubMed and beyond: Rules and reasoning for
ontology-based literature search
By Michael Schroeder, Biotec, 2003 2
REWERSE: a European Network of Excellence
Reasoning on the Web with Rules and Semantics
Technology
  Rule markup languages
  Policy specification, composition and conformance
  Composition and typing
  Reasoning-aware querying
  Evolution and reactivity
Application
  Web-based Decision Support for Event, Temporal, and Geographical Data
  Towards a Bioinformatics Semantic Web
  Personalised Information Systems
Towards a Bioinformatics Semantic Web
  Groups: Dresden, Jena, Lisbon, Linkoeping, Edinburgh, Bucharest, Manchester, Paris
  Rules and constraints for structure prediction, metabolic pathways, gene expression analysis, ontologies, workflows.
Problem
PubMed >12M articles
Example Task
Which enzymes does Levamisole inhibit?
It is well known that Levamisole inhibits alkaline phosphatase.
It is not well known that Levamisole inhibits phosphofructokinase.
PubMed Example
A keyword search for "levamisole inhibitor" produces well over 100 hits in PubMed.
To find out about specific functions, we have to go through all these papers!
We are interested in the relevant enzymatic functions.
A refined search for "Levamisole inhibitor enzymatic activity" produces only 5 hits: a lot of relevant papers have been dropped.
GoPubMed Example
Query: Levamisole inhibitor
Maximum papers: 100
Strict matching
54 papers in biological process
20 in cellular components
72 in molecular function
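The per-category counts on this slide can be illustrated with a small sketch. The slides do not describe GoPubMed's matching algorithm, so everything below is an assumption made for illustration: the class name, the four-term vocabulary, the sample abstracts, and the reading of "strict matching" as case-insensitive exact phrase containment. A paper is counted once in every category for which it mentions at least one term, which would be consistent with the three counts above summing to more than the 100-paper maximum.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class CategoryCountSketch {
    // Hypothetical mini-vocabulary: term label -> top-level GO category.
    static final Map<String, String> TERM_CATEGORY = Map.of(
        "alkaline phosphatase", "molecular function",
        "phosphofructokinase", "molecular function",
        "glycolysis", "biological process",
        "membrane", "cellular component");

    // Hypothetical abstracts, invented for this example only.
    static final List<String> SAMPLE_ABSTRACTS = List.of(
        "Levamisole inhibits alkaline phosphatase at the cell membrane.",
        "Levamisole directly inhibits tumor phosphofructokinase and slows glycolysis.",
        "Alkaline phosphatase and phosphofructokinase were both assayed.");

    /** Counts, per category, the abstracts that mention at least one of its terms. */
    static Map<String, Integer> countPapersPerCategory(List<String> abstracts) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String text : abstracts) {
            String lower = text.toLowerCase(Locale.ROOT);
            // Collect categories first so a paper counts once per category.
            Set<String> hitCategories = new LinkedHashSet<>();
            for (Map.Entry<String, String> entry : TERM_CATEGORY.entrySet()) {
                if (lower.contains(entry.getKey())) {
                    hitCategories.add(entry.getValue());
                }
            }
            for (String category : hitCategories) {
                counts.merge(category, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countPapersPerCategory(SAMPLE_ABSTRACTS));
    }
}
```

Collecting the matched categories in a set before counting keeps the third sample abstract, which mentions two molecular-function terms, from being counted twice in that category.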
GoPubMed Example
Let’s look for some functions:
70 papers include terms that are enzyme activities
GoPubMed Example
Transferase 8
Kinase 6
Hydrolase 58
Oxidoreductase 2
Lyase 1
GoPubMed Example
Alkaline Phosphatase: 52 papers
…alkaline phosphatase inhibitor levamisole…
Effects of alkaline phosphatase and its inhibitor Levamisole…
GoPubMed Example
Phosphofructokinase
GoPubMed Example
Levamisole directly inhibits tumor phosphofructokinase
GoPubMed Example
In PubMed the article is listed at position 84! It is hence unlikely to be read.
Prova
Prova
Rule-based Java scripting for middleware
Combination of object-oriented and declarative programming
Offer a rule-based platform for distributed agent programming
Expose logic and agent behaviour as rules
Access data sources via wrappers written in Java
Make all Java API from available packages directly accessible from rules
Run within the Java runtime
Enable rapid prototyping of applications
PubMed XML Output
<PubmedArticle>
  <MedlineCitation Owner="NLM" Status="Publisher">
    <PMID>15469972</PMID>
    <DateCreated>
      <Year>2004</Year>
      <Month>10</Month>
      <Day>7</Day>
    </DateCreated>
    <Article>
      <Journal>
        <ISSN>0950-1991</ISSN>
        <JournalIssue PrintYN="N">
          <PubDate>
            <Year>2004</Year>
            <Month>10</Month>
            <Day>6</Day>
          </PubDate>
        </JournalIssue>
        <Coden>DEVPED</Coden>
        <Title>Development (Cambridge, England)</Title>
        <ISOAbbreviation>Development</ISOAbbreviation>
      </Journal>
      …
Output
Year: 2004
ArticleTitle: Developmental potential of defined neural progenitors derived from mouse embryonic stem cells.
LastName: Plachta
FirstName: Nicolas
Initials: N
**********************
LastName: Bibel
FirstName: Miriam
Initials: M
**********************
LastName: Tucker
FirstName: Kerry Lee
Initials: KL
**********************
LastName: Barde
FirstName: Yves-Alain
Initials: YA
******************************************
Year: 2004
ArticleTitle: Eye evolution: a question of genetic promiscuity.
LastName: Nilsson
ForeName: Dan-E
Initials: DE
**********************
…
Code Snippet
%%% XML reading tests

:- eval(test_xml1()).

% Walk every MedlineCitation, then each selected sub-element, then its children.
test_xml1() :-
    Document = XML("pubmed.xml"),
    Root = Document.getDocumentElement(),
    tagname(Root,Elements),
    Elements.nodes(Element),
    subtag(Element,SubElements),
    SubElements.nodes(SubElement),
    SubElementNodeName = SubElement.getNodeName(),
    ChildNodes = SubElement.getChildNodes(),
    ChildNodes.nodes(ChildNode),
    evaluate(ChildNode).

tagname(Root,Elements) :-
    Elements = Root.getElementsByTagName("MedlineCitation").

% One clause per sub-element of interest; backtracking tries each in turn.
subtag(Element,SubElements) :-
    SubElements = Element.getElementsByTagName("PubDate").
subtag(Element,SubElements) :-
    SubElements = Element.getElementsByTagName("Article").
subtag(Element,SubElements) :-
    SubElements = Element.getElementsByTagName("Author").

% Print only the child nodes whose tag name matches one of these clauses.
evaluate(ChildNode) :-
    "Year" = ChildNode.getNodeName(), printout(ChildNode).
evaluate(ChildNode) :-
    "Initials" = ChildNode.getNodeName(), printout(ChildNode),
    println(["**********************"]).
evaluate(ChildNode) :-
    "FirstName" = ChildNode.getNodeName(), printout(ChildNode).
evaluate(ChildNode) :-
    "ForeName" = ChildNode.getNodeName(), printout(ChildNode).
evaluate(ChildNode) :-
    "LastName" = ChildNode.getNodeName(), printout(ChildNode).
evaluate(ChildNode) :-
    "ArticleTitle" = ChildNode.getNodeName(), printout(ChildNode).

printout(ChildNode) :-
    ChildNodeName = ChildNode.getNodeName(),
    DataName = ChildNode.getFirstChild(),
    StringName = DataName.getNodeValue(),
    println([ChildNodeName, ": ",StringName]).

Callouts: Prolog rules, Java objects, unification
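For comparison, the same traversal can be written directly against the standard Java DOM API that the Prova rules call into. This is a minimal sketch and not part of the original slides: the class name, the `extractFields` helper, and the sample record are hypothetical, and the nested loops stand in for what backtracking over the `subtag` and `evaluate` clauses does in the rule version.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class PubMedDomSketch {
    // Tag names handled by the evaluate/1 clauses on the slide.
    private static final Set<String> WANTED = Set.of(
        "Year", "ArticleTitle", "LastName", "FirstName", "ForeName", "Initials");

    // Hypothetical record in the shape of the PubMed XML shown earlier.
    static final String SAMPLE =
        "<PubmedArticle><MedlineCitation><Article><Journal><JournalIssue>"
        + "<PubDate><Year>2004</Year></PubDate></JournalIssue></Journal>"
        + "<ArticleTitle>A title.</ArticleTitle><AuthorList><Author>"
        + "<LastName>Doe</LastName><ForeName>Jane</ForeName><Initials>J</Initials>"
        + "</Author></AuthorList></Article></MedlineCitation></PubmedArticle>";

    /** Returns "Tag: text" lines in the order the rule version would print them. */
    static List<String> extractFields(String xml) {
        List<String> lines = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList citations = doc.getElementsByTagName("MedlineCitation");
            for (int i = 0; i < citations.getLength(); i++) {
                Element citation = (Element) citations.item(i);
                // Same order as the three subtag/2 clauses.
                for (String tag : new String[] {"PubDate", "Article", "Author"}) {
                    NodeList subElements = citation.getElementsByTagName(tag);
                    for (int j = 0; j < subElements.getLength(); j++) {
                        NodeList children = subElements.item(j).getChildNodes();
                        for (int k = 0; k < children.getLength(); k++) {
                            Node child = children.item(k);
                            if (child.getNodeType() == Node.ELEMENT_NODE
                                    && WANTED.contains(child.getNodeName())) {
                                lines.add(child.getNodeName() + ": "
                                    + child.getTextContent().trim());
                            }
                        }
                    }
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return lines;
    }

    public static void main(String[] args) {
        extractFields(SAMPLE).forEach(System.out::println);
    }
}
```

Only direct children of the selected elements are inspected, as in the rule version, so the `Year` inside `DateCreated` is not reported.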
Traverse the GO tree and find all phosphofructokinases
:- eval(consult("utils.prova")).

location(database,"go","jdbc:mysql://comas.soi.city.ac.uk","guest","guest").
location(database,"GO","jdbc:mysql://dbserver","guest","guest").

:- solve(isPhosphofructokinase()).

% Entry point: look up the root term, then enumerate all of its descendants.
isPhosphofructokinase() :-
    dbopen("GO",DB),
    println(["DB open"]),
    sql_select(DB,term,[id,TermID],[name,Name],[where,"name = 'phosphofructokinase activity'"]),
    println(["Looking for all children of GO-Term (",TermID,"): ",Name]),
    findall(TermID,isPhosphofructokinase(DB,TermID),_).

% Recursive case: report each child of TermID, then descend into it.
isPhosphofructokinase(DB,TermID) :-
    concat(["term1_id=",TermID],WhereClause),
    sql_select(DB,term2term,[term2_id,ChildTermID],[where, WhereClause]),
    concat(["id=",ChildTermID],WhereClause2),
    sql_select(DB,term,[name,ChildName],[where,WhereClause2]),
    println([TermID," has child ", ChildName,",",ChildTermID]),
    isPhosphofructokinase(DB,ChildTermID).

% Fallback clause, reached on backtracking once no further child is found.
isPhosphofructokinase(DB,TermID) :-
    println([TermID," does not have any children."]).

Callouts: backtracking, built-in DB access, recursion
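The recursion in the rules above amounts to a depth-first walk over parent-to-child links. A minimal Java sketch of that walk follows, with the `term2term` table replaced by an in-memory map so that it runs without a database. The class and method names are hypothetical; the term IDs in `main` are the ones that appear in the output of the Prova run. Because GO is a directed acyclic graph in which a term may have several parents, the sketch also tracks visited terms, which the rule version does not do.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class GoDescendantsSketch {
    /** Returns all descendants of termId in depth-first order, each listed once. */
    static List<Integer> descendants(Map<Integer, List<Integer>> children, int termId) {
        List<Integer> result = new ArrayList<>();
        collect(children, termId, result, new HashSet<>());
        return result;
    }

    private static void collect(Map<Integer, List<Integer>> children, int termId,
                                List<Integer> result, Set<Integer> seen) {
        // A term without an entry has no children, which ends the recursion.
        for (int child : children.getOrDefault(termId, List.of())) {
            if (seen.add(child)) {          // skip terms reached via another parent
                result.add(child);
                collect(children, child, result, seen);
            }
        }
    }

    public static void main(String[] args) {
        // Parent -> children, standing in for rows of the term2term table.
        Map<Integer, List<Integer>> children = Map.of(2153, List.of(2154, 2155, 2156));
        System.out.println(descendants(children, 2153));
    }
}
```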
Output
Looking for all children of GO-Term (2153): phosphofructokinase activity
2153 has child 1-phosphofructokinase activity,2154
2154 does not have any children.
2153 has child 6-phosphofructo-2-kinase activity,2155
2155 does not have any children.
2153 has child 6-phosphofructokinase activity,2156
2156 does not have any children.
2153 does not have any children.
yes
Messaging and Reaction Rules
Prova is designed to implement agents in a distributed system
Prova is based on theoretical work on multi-agent systems
Prova provides predicates to send and receive messages and to realise reaction rules
Code Snippet
:- eval(ex004()).

a(1).
a(2).
b(3).
b(4).

% Send out messages
ex004() :-
    println(["==========ex004=========="]),
    iam(Me),
    sendMsg(XID1,self,Me,queryref,a(I)),
    rcvMult(XID1,self,Me,reply,a(I)),
    println(["Inline reaction ",rcvMult(XID1,self,Me,reply,a(I))]),
    sendMsg(XID2,self,Me,queryref,b(J)),
    rcvMult(XID2,self,Me,reply,b(J)),
    println(["Inline reaction ",rcvMult(XID2,self,Me,reply,b(J))]).

% Reaction rule to general queryref
rcvMsg(XID,Protocol,From,queryref,[X|Xs]|LocalContext) :-
    println(["Rule reaction 1 ",rcvMsg(XID,Protocol,From,queryref,[X|Xs]|LocalContext)]),
    derive([X|Xs]),
    sendMsg(XID,Protocol,From,reply,[X|Xs]|LocalContext).
rcvMsg(XID,Protocol,From,queryref,[X|Xs]|LocalContext) :-
    println(["Rule reaction 2 ",rcvMsg(XID,Protocol,From,queryref,[X|Xs]|LocalContext)]),
    sendMsg(XID,Protocol,From,end_of_transmission,[X|Xs]|LocalContext).

% A testing harness for printing incoming end_of_transmission messages.
rcvMsg(XID,Protocol,From,end_of_transmission|Extra) :-
    println(["end_of_transmission for conversation-id ",XID,": "|Extra]).

Built-in agent communication: sendMsg and rcvMsg
Abstracting from protocols like JMS, Jade or inline
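The conversation pattern in this snippet (a queryref is answered with one reply per solution, followed by an end_of_transmission) can be sketched in plain Java. This is an illustration of the message flow only, not of how Prova dispatches messages: the class name, the `reactToQueryRef` helper, and the string encoding of messages are all assumptions, and no real transport such as JMS or Jade is involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class QueryRefSketch {
    // Facts from the slide: a(1). a(2). b(3). b(4).
    static final Map<String, List<Integer>> FACTS =
        Map.of("a", List.of(1, 2), "b", List.of(3, 4));

    /**
     * Reacts to a queryref for the given predicate. The first loop plays the
     * role of reaction rule 1 (one reply per derived solution); the final
     * message plays the role of reaction rule 2 (end_of_transmission).
     */
    static List<String> reactToQueryRef(String conversationId, String predicate) {
        List<String> outbox = new ArrayList<>();
        for (int value : FACTS.getOrDefault(predicate, List.of())) {
            outbox.add(conversationId + " reply " + predicate + "(" + value + ")");
        }
        outbox.add(conversationId + " end_of_transmission " + predicate);
        return outbox;
    }

    public static void main(String[] args) {
        // Hypothetical conversation IDs; Prova generates its own.
        reactToQueryRef("xid1", "a").forEach(System.out::println);
        reactToQueryRef("xid2", "b").forEach(System.out::println);
    }
}
```

Tagging every message with the conversation ID is what lets a receiver match replies to the query that caused them, as the `XID` argument does in `sendMsg` and `rcvMult`.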
Output
==========ex004==========
Rule reaction 1 ["rcvMsg","mediator@bioinf-mobile3","self","mediator","queryref",["a",I]]
Rule reaction 2 ["rcvMsg","mediator@bioinf-mobile3","self","mediator","queryref",["a",I]]
Inline reaction ["rcvMult","mediator@bioinf-mobile3","self","mediator","reply",["a",1]]
Inline reaction ["rcvMult","mediator@bioinf-mobile3","self","mediator","reply",["a",2]]
end_of_transmission for conversation-id mediator@bioinf-mobile3: ["a",I]
Rule reaction 1 ["rcvMsg","mediator@bioinf-mobile6","self","mediator","queryref",["b",N@@16]]
Rule reaction 2 ["rcvMsg","mediator@bioinf-mobile6","self","mediator","queryref",["b",N@@16]]
Rule reaction 1 ["rcvMsg","mediator@bioinf-mobile8","self","mediator","queryref",["b",N@@16]]
Rule reaction 2 ["rcvMsg","mediator@bioinf-mobile8","self","mediator","queryref",["b",N@@16]]
Inline reaction ["rcvMult","mediator@bioinf-mobile6","self","mediator","reply",["b",3]]
Inline reaction ["rcvMult","mediator@bioinf-mobile6","self","mediator","reply",["b",4]]
end_of_transmission for conversation-id mediator@bioinf-mobile6: ["b",N@@16]
Inline reaction ["rcvMult","mediator@bioinf-mobile8","self","mediator","reply",["b",3]]
Inline reaction ["rcvMult","mediator@bioinf-mobile8","self","mediator","reply",["b",4]]
end_of_transmission for conversation-id mediator@bioinf-mobile8: ["b",N@@16]
end_of_transmission for conversation-id mediator@bioinf-mobile3: ["a",I]
Shutdown complete.
Conclusion
GoPubMed facilitates exploration of literature abstracts with the GeneOntology
Prova implements Prolog-style rules and reasoning with Java
Thanks:
  GoPubMed: Andreas Doms, Ralf Delfs, Alex Kozlenkov
  Prova: Alex Kozlenkov
  Support: EU Projects REWERSE, GeneStream, BioGrid
Contact: Michael Schroeder: [email protected]
URLs
  www.biotec.tu-dresden.de
  www.rewerse.net (REWERSE project including info on bioinfo group)
  www.semanticwebrules.org (Prova)
  www.gopubmed.org (GoPubMed)