Wrapping analytical services for caBIG
Taverna-caGrid technical review meeting
Stian Soiland-Reyes, myGrid, University of Manchester, UK
2009-01-23
http://www.mygrid.org.uk/dev/wiki/display/caGrid
Agenda
• Project overview
• Primary goals
• Service selection
• Services identified
• Architecture
• Service outputs
• UML model
• Template workflow
• Work so far
• Implementation plan
Project overview
• Taverna caGrid cooperation
  • Taverna workbench enhancements for caGrid
  • Grid-enabling analytical services
  • caGrid security support for Taverna
• This presentation deals with the analytical services
Primary goals
• Identify two publicly available analytical web services currently accessible through Taverna
• caGrid-enable the services and describe them semantically using caBIG’s infrastructure
• Demonstrate building of workflows combining the new services with existing caBIG services
Service selection
• Selected services in collaboration with the caGrid Workflow working group, led by Juli
• Winners:
  • NCBI Blast hosted by EBI
  • InterProScan hosted by EBI
Why these services?
• Freely available
• Highly reliable, hosted by EBI
• Widely used by the scientific community
• Can be combined with existing caBIG tools in biologically meaningful workflows
  • caBIO, GridPIR, etc.
Services identified
• NCBI Blast
  • A popular similarity search tool using local sequence alignment
  • Supports sequences of proteins, DNA, RNA
  • Searches sequences in a whole range of databases
    • SWISSPROT, UNIPROT, NCBI, EMBL, etc.
  • SOAP web service hosted by EMBL-EBI
Services identified
• InterProScan
  • Integrates various databases of protein domains and functional sites
  • Searches using protein signature recognition methods
  • SOAP web service hosted by EMBL-EBI
Architecture
Architecture as pseudo code

    class CaGridClient:
        def main():
            endpointReference = wrappedService.invoke(inputs)
            endpointReference.subscribe()

        def resourcePropertyChanged():
            outputs = endpointReference.getResourceProperty()
            print "Result", outputs

    class WrappedService:
        def invoke(inputs):
            convertedInputs = dataConverter.convertFromCaGrid(inputs)
            jobId = serviceInvoker.invoke(convertedInputs)
            endpointReference = new EndpointReference(jobId)
            return endpointReference

        def outputReturned(jobId, outputs):
            convertedOutputs = dataConverter.convertToCaGrid(outputs)
            endpointReference.setResourceProperty(convertedOutputs)

    class ServiceInvoker:
        def invoke(convertedInputs):
            jobId = originalService.invoke(convertedInputs)
            return jobId
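The pseudo code above can be turned into a small runnable Python sketch. This is only an illustration of the asynchronous pattern (invoke, get a reference back, be notified when the result is set), not the caGrid or WS-Resource API. The `EndpointReference` class here is an in-memory stand-in, the two converter functions and `fake_original_service` are hypothetical, and a background thread plays the role of the remote EBI job.

```python
import threading
import uuid


class EndpointReference:
    """In-memory stand-in for a WS-Resource reference with one property."""

    def __init__(self, job_id):
        self.job_id = job_id
        self._value = None
        self._subscribers = []
        self._lock = threading.Lock()

    def subscribe(self, callback):
        with self._lock:
            self._subscribers.append(callback)
            value = self._value
        if value is not None:  # result arrived before the subscription
            callback(value)

    def set_resource_property(self, value):
        with self._lock:
            self._value = value
            subscribers = list(self._subscribers)
        for callback in subscribers:
            callback(value)


def convert_from_cagrid(inputs):
    # Hypothetical conversion: caGrid-style dict to a plain sequence string.
    return inputs["sequence"].strip().upper()


def convert_to_cagrid(outputs):
    # Hypothetical conversion: raw service output to a caGrid-style dict.
    return {"result": outputs}


class ServiceInvoker:
    def __init__(self, original_service, on_output):
        self._original_service = original_service
        self._on_output = on_output

    def invoke(self, converted_inputs):
        job_id = str(uuid.uuid4())

        def run():
            outputs = self._original_service(converted_inputs)
            self._on_output(job_id, outputs)

        threading.Thread(target=run).start()
        return job_id


class WrappedService:
    def __init__(self, original_service):
        self._references = {}
        self._lock = threading.Lock()
        self._invoker = ServiceInvoker(original_service, self.output_returned)

    def invoke(self, inputs):
        converted = convert_from_cagrid(inputs)
        # Hold the lock so the reference is registered before any output arrives.
        with self._lock:
            job_id = self._invoker.invoke(converted)
            reference = EndpointReference(job_id)
            self._references[job_id] = reference
        return reference

    def output_returned(self, job_id, outputs):
        with self._lock:
            reference = self._references.pop(job_id)
        reference.set_resource_property(convert_to_cagrid(outputs))


def fake_original_service(sequence):
    # Stand-in for the remote analytical service.
    return f"matches for {sequence}"


# Client side: invoke, subscribe, and wait for the notification.
results = []
done = threading.Event()


def resource_property_changed(outputs):
    results.append(outputs)
    done.set()


service = WrappedService(fake_original_service)
reference = service.invoke({"sequence": " mkvl "})
reference.subscribe(resource_property_changed)
done.wait(timeout=5)
print("Result", results)
```

Unlike the pseudo code, the sketch looks the reference up by job id in `output_returned`, since one service instance may have several jobs in flight.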
Output InterProScan (Untranslated)

    <EBIInterProScanResults xmlns="http://www.ebi.ac.uk/schema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="http://www.ebi.ac.uk/schema/InterProScanResult.xsd">
      <Header>..</Header>
      <interpro_matches>
        <protein id="uniprot|P01174|WAP_RAT" length="137" crc64="1C2E8ADA9FD97949">
          <interpro id="IPR008197" name="Whey acidic protein, 4-disulphide core"
              type="Domain" parent_id="IPR015874">
            <child_list><rel_ref ipr_ref="IPR008198"/></child_list>
            <contains><rel_ref ipr_ref="IPR002098"/></contains>
            <classification id="GO:0030414" class_type="GO">
              <category>Molecular Function</category>
              <description>protease inhibitor activity</description>
            </classification>
            <match id="G3DSA:4.10.75.10" name="Whey_acidic_protein_4-diS_core" dbname="GENE3D">
              <location start="77" end="128" score="9.899996308397199E-5" status="T" evidence="Gene3D" />
            </match>
            <match id="PF00095" name="WAP" dbname="PFAM">
              <location start="30" end="72" score="6.30000254573025E-5" status="T" evidence="HMMPfam" />
              <location start="79" end="126" score="1.59999889349247E-14" status="T" evidence="HMMPfam" />
            </match>
          </interpro>
          <interpro id="IPR008198" name="Proteinase inhibitor I17" type="Domain" parent_id="IPR008197">
            ...
          </interpro>
        </protein>
      </interpro_matches>
    </EBIInterProScanResults>
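To give a feel for what a data converter has to read from this output, here is a minimal sketch that extracts the match locations with Python's standard `xml.etree.ElementTree`. The embedded sample is a trimmed copy of the XML on this slide. The function name `extract_matches` and the shape of the returned rows are assumptions made for illustration; they are not the UML model or the wrapper's actual converter.

```python
import xml.etree.ElementTree as ET

# The result document declares a default namespace, so every tag must be qualified.
NS = {"ebi": "http://www.ebi.ac.uk/schema"}

SAMPLE = """<EBIInterProScanResults xmlns="http://www.ebi.ac.uk/schema">
  <interpro_matches>
    <protein id="uniprot|P01174|WAP_RAT" length="137" crc64="1C2E8ADA9FD97949">
      <interpro id="IPR008197" name="Whey acidic protein, 4-disulphide core"
                type="Domain" parent_id="IPR015874">
        <match id="G3DSA:4.10.75.10" name="Whey_acidic_protein_4-diS_core" dbname="GENE3D">
          <location start="77" end="128" score="9.899996308397199E-5" status="T" evidence="Gene3D"/>
        </match>
        <match id="PF00095" name="WAP" dbname="PFAM">
          <location start="30" end="72" score="6.30000254573025E-5" status="T" evidence="HMMPfam"/>
          <location start="79" end="126" score="1.59999889349247E-14" status="T" evidence="HMMPfam"/>
        </match>
      </interpro>
    </protein>
  </interpro_matches>
</EBIInterProScanResults>"""


def extract_matches(xml_text):
    """Return one flat row per <location>, with its protein, entry and match."""
    root = ET.fromstring(xml_text)
    rows = []
    for protein in root.iterfind(".//ebi:protein", NS):
        for entry in protein.iterfind("ebi:interpro", NS):
            for match in entry.iterfind("ebi:match", NS):
                for location in match.iterfind("ebi:location", NS):
                    rows.append({
                        "protein": protein.get("id"),
                        "interpro": entry.get("id"),
                        "match": match.get("id"),
                        "database": match.get("dbname"),
                        "start": int(location.get("start")),
                        "end": int(location.get("end")),
                        "score": float(location.get("score")),
                    })
    return rows


rows = extract_matches(SAMPLE)
for row in rows:
    print(row["interpro"], row["database"], row["match"], row["start"], row["end"])
```

Flattening to one row per location is only one possible choice; a caGrid data model would more likely keep the protein, entry, match and location objects as separate classes.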
UML model: wrapped InterproScan
UML model: wrapped NCBIBlast
Template workflow
http://www.myexperiment.org/workflows/230
EBI_dbfetch_fetchBatch will be replaced with the caBIG service caBIO
This workflow uses both NCBIBlast and InterProScan, which will be replaced with the wrapped services
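The planned substitution can be pictured as swapping the implementations behind fixed workflow steps. The sketch below is hypothetical throughout: the step names, the lambda stand-ins, and the data flow (fetch one sequence, then run both analyses on it) are assumptions, and nothing here calls Taverna, caBIO or the EBI services.

```python
from typing import Callable, Dict

Service = Callable[[str], str]


def run_template_workflow(entry_id: str, services: Dict[str, Service]) -> Dict[str, str]:
    # Assumed data flow: fetch a sequence, then analyse it with both tools.
    sequence = services["fetch"](entry_id)
    return {name: services[name](sequence) for name in ("blast", "interproscan")}


# Hypothetical stand-ins for the services used by the template workflow.
original_services: Dict[str, Service] = {
    "fetch": lambda entry_id: f"dbfetch:{entry_id}",
    "blast": lambda seq: f"NCBIBlast({seq})",
    "interproscan": lambda seq: f"InterProScan({seq})",
}

# Same workflow, with each step replaced by a caGrid counterpart (also stand-ins).
cagrid_services: Dict[str, Service] = {
    "fetch": lambda entry_id: f"caBIO:{entry_id}",
    "blast": lambda seq: f"WrappedNCBIBlast({seq})",
    "interproscan": lambda seq: f"WrappedInterProScan({seq})",
}

before = run_template_workflow("P01174", original_services)
after = run_template_workflow("P01174", cagrid_services)
print(before)
print(after)
```

The point of the sketch is that the workflow definition itself stays unchanged while the services behind each step are exchanged.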
Work so far
• Identified services and example workflow
• Described services (Deliverable 3.2)
• Modelled service inputs and outputs in UML according to caGrid guidelines
• Still a few tweaks needed for WS-Resource usage
• Architecture and implementation plan for wrapping services (Deliverable 3.3)
• JavaDoc needs updating for WS-Resource
Implementation plan
• Generate Common Data Elements for inputs and outputs and verify Silver compatibility
• Generate semantically annotated XMIs
• Submit Silver compatibility review package
• Implement and deploy wrapped services
• Using Introduce and possibly gRavi
• Implement, test, deploy
• We’ll start with this before submitting CDEs
• Build caGrid-based workflow using services
Any questions..?