BioIT Europe 2010 - BioCatalogue
![Page 1: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/1.jpg)
The Reality of Web Services in the Life Sciences
Professor Carole Goble
University of Manchester, UK
myGrid Project
BioIT World Europe 2010, Hannover
http://www.biocatalogue.org
![Page 2: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/2.jpg)
Web Services
• Programmatic Interfaces to Services.
• Machine-Machine communication
• Software Lego™ that works across the web and underpins enterprise SOA.
• Standard interfaces.
• Two big families:
– SOAP and REST.
![Page 3: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/3.jpg)
Programmatic Interfaces to Services on the up…..
• Specialisation and segregation of methods from monolithic servers.
• Component packaging.
• Publishing data and analyses.
• Tools / resources integration.
• Applications, analytic workflows, workbenches and enterprise platforms
• Agile software development
• Remote and in house execution
• Loosely coupled systems.
http://www.myexperiment.org/workflows/158.html
![Page 4: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/4.jpg)
Service Providers and Consumers
• Core facility (EMBL-EBI, DDBJ, NCBI …)
• EMBL-EBI 8-10 million hits/month
• 329 services
• Community projects and labs
• Single Investigator projects
• Enterprises (e.g. Pharmas)
Public Private
![Page 5: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/5.jpg)
Web Service Rhetoric
• Pistoia Alliance
• BioIT Alliance
• ELIXIR
• But not all rosy … see Christian Hauck’s talk 16.00 Thursday.
![Page 6: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/6.jpg)
Web Service Technology Standards
• Simple Object Access Protocol (SOAP)
– Remote Procedure Call based
– HTTP as transport protocol only
– Web Service Description Language (WSDL) in XML, UDDI registry
– Extensible
• Representational State Transfer (REST)
– Resource (document) style
– HTTP and URI as application protocol
– XML and JSON responses, usually
– GET / PUT / POST
– Lightweight, webby
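The contrast between the two families can be sketched in Python. This is only an illustration: the endpoint, service namespace, operation name and parameters are hypothetical placeholders rather than a real service, and nothing is sent over the network.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/seqservice"  # hypothetical service namespace


def soap_request(operation, params):
    """Build an RPC-style SOAP envelope; it would be POSTed to a single endpoint."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def rest_request(base, resource, params):
    """Describe a resource-style call: an HTTP verb plus a URI naming the resource."""
    return ("GET", f"{base}/{resource}?{urlencode(params)}")


# Hypothetical operation, resource path and parameter values.
soap_body = soap_request("getSequence", {"accession": "P12345"})
rest_call = rest_request("http://example.org/api", "sequences/P12345", {"format": "json"})
```

In the SOAP case the operation name lives inside the message body and HTTP merely carries it; in the REST case the URI and the HTTP verb together say what is being asked for.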
![Page 7: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/7.jpg)
Bio Service Special Flavours
• Distributed Annotation Services (www.biodas.org)
• BioMOBY (www.biomoby.org)
• SADI
• SSWAP (iPlant Collaborative)
![Page 8: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/8.jpg)
Where…can I find them? advertise mine?
What…do they do? can I use them?
How…do they work? up to date? reliable?
Who…provides them? recommends them? knows about them?
Reusing Public and Third Party Web Services
![Page 9: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/9.jpg)
Web Service Description Language
<wsdl:message name="getGlimmersResponse">
  <wsdl:part name="getGlimmersReturn" type="xsd:string"/>
</wsdl:message>
<wsdl:message name="aboutServiceRequest"/>
<wsdl:message name="getGlimmersRequest">
  <wsdl:part name="in0" type="xsd:string"/>
  <wsdl:part name="in1" type="xsd:string"/>
  <wsdl:part name="in2" type="xsd:string"/>
  <wsdl:part name="in3" type="xsd:string"/>
  <wsdl:part name="in4" type="xsd:string"/>
  <wsdl:part name="in5" type="xsd:string"/>
  <wsdl:part name="in6" type="xsd:string"/>
  <wsdl:part name="in7" type="xsd:int"/>
  <wsdl:part name="in8" type="xsd:string"/>
Pathport Web service from the Virginia Bioinformatics Institute http://pathport.vbi.vt.edu/services/wsdls/beta/glimmer.wsd
Name of the service
Uninformative names for parameters
What kind of string?
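A catalogue curator could flag such descriptions automatically. The sketch below is one possible check, not BioCatalogue's actual method: it parses an abridged copy of the fragment (wrapped in a root element and closed so that it is well-formed, both of which are additions made here) and reports message parts whose names match a hypothetical pattern for generic names.

```python
import re
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace

# Abridged fragment from the slide. The <wsdl:definitions> wrapper and the
# closing tag of the last message are added here so the text parses.
FRAGMENT = f"""
<wsdl:definitions xmlns:wsdl="{WSDL_NS}">
  <wsdl:message name="getGlimmersResponse">
    <wsdl:part name="getGlimmersReturn" type="xsd:string"/>
  </wsdl:message>
  <wsdl:message name="getGlimmersRequest">
    <wsdl:part name="in0" type="xsd:string"/>
    <wsdl:part name="in7" type="xsd:int"/>
    <wsdl:part name="in8" type="xsd:string"/>
  </wsdl:message>
</wsdl:definitions>
"""

# Assumed heuristic: names such as in0, out1, arg2 or param3 carry no meaning.
GENERIC_NAME = re.compile(r"^(in|out|arg|param|input|output)\d*$", re.IGNORECASE)


def uninformative_parts(wsdl_text):
    """Return (message, part, type) for every part with a generic name."""
    root = ET.fromstring(wsdl_text)
    flagged = []
    for message in root.iter(f"{{{WSDL_NS}}}message"):
        for part in message.iter(f"{{{WSDL_NS}}}part"):
            if GENERIC_NAME.match(part.get("name", "")):
                flagged.append((message.get("name"), part.get("name"), part.get("type")))
    return flagged


flagged = uninformative_parts(FRAGMENT)
```

A check like this finds the symptom only; deciding what `in0` actually means still needs a human annotator or the service provider.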
![Page 10: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/10.jpg)
Services In the Wild
Find
• EMBOSS clustalw program called ‘emma’
Execute
• SOAP / REST / Quasi-REST / REST-like
Understand
• Input0: string, Output0: string
• What does SeqRet actually do?
• Example data? Parameter configurations? Input-Output correlations?
Use
• Quality of Service, Monitoring, Robustness
• Volatility, Sustained, License, Conditions of Use
![Page 11: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/11.jpg)
Cataloguing to avoid reinvention
• Investigator and project specific registries
• Community lists
• Specialist registries
• General catalogues and search engines
![Page 12: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/12.jpg)
An Open, Public, Curated, Boutique Catalogue for Web Services serving the Life Sciences for the Bioinformatics Community
http://www.biocatalogue.org
Launched June 2009
Nucl Acids Res, June 2010, Web Servers issue doi: 10.1093/nar/gkq394
![Page 13: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/13.jpg)
![Page 14: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/14.jpg)
![Page 15: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/15.jpg)
UNDERSTAND and USE
![Page 16: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/16.jpg)
Service categories (chart labels): Protein Seq. Alignment, Protein Structure Prediction, Protein Function Prediction, Nucleotide Seq. Alignment, RNA structure prediction, Gene Prediction, Text Mining, Ontology, Phylogeny, Microarray, Sequence Retrieval, Identifier Retrieval, Structure Retrieval, Literature Retrieval, Genomics, Proteomics, Systems Biology, Biostatistics, Chemoinformatics
Service Coverage
1719 services – SOAP and REST
– 92% with service description
– 57.5% with all ops/methods described
More than 60 classifications
Big players: EBI, NCBI, DDBJ etc.
60 operations on chemistry and chem-informatics data
![Page 17: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/17.jpg)
[June 09 - Sep10]
Steady use: 2K+ unique IPs/month.
![Page 18: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/18.jpg)
Building Content and Community
• Chiefly public services
• Community contributed
– Service Providers: 127
– Third Parties: 92 submitters
– 420 registered members
– 27 countries (UK>Spain>USA>Canada)
• Partners and registries
– EMBRACE Registry, SeekDa!, (BioMOBY, DAS)
• Automated crawling
• Manual mining
![Page 19: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/19.jpg)
EMBL-EBI
DDBJ
NCBI
But these statistics have to be interpreted…..
![Page 20: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/20.jpg)
Curation
Change logs
Quantitative Annotations
Tags
Semantic Annotations
Ontologies
Functional Capabilities
Provenance
Operational Capabilities
Operational Metrics
Use Policy
Social Status
Ratings
Attribution
Free text
Instrumentation
Usable and Useful
Understandable
Annotations
Bio-Services
• EDAM
• myGrid
• BioMOBY…
Bioontologies
• OBO Foundry
• BioPortal…
Services
• WSMO
• SAWSDL
• SA-REST…
![Page 21: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/21.jpg)
Incremental Annotation
50,672
• accumulate, aggregation, types, attribution
![Page 22: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/22.jpg)
Archived Service
Annotations
Attribution
![Page 23: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/23.jpg)
Tagging
Social
Annotate Anything
Categories
![Page 24: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/24.jpg)
Operations, Inputs, Outputs
Example use
![Page 25: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/25.jpg)
• Availability
• API changes
• Test script sandbox
• Based on EMBRACE Registry Monitoring Framework
Social Sharing
Feeds
![Page 26: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/26.jpg)
WSDL, SAWSDL, SA-REST, WSMO
RDF and SPARQL
Service annotation formats
Gadgets, Apps
Customised and Private instances
A service / resource
Open Source (BSD)
Open Platform
Read (Write) REST APIs
EDAM, BioMOBY, myGrid, OBO family, BioXSD
Annotation Ontologies
![Page 27: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/27.jpg)
People Powered Content
Reward and Attribution
Sensitivities
Tools
Bringing a Community together
Automation
Core Contribution & Curation
Coordination
Governance
Content Capture & Curation
![Page 28: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/28.jpg)
Governance Black Hole
• Submission
• Content
• Ownership / submitter / curator responsibilities
• Responsibility migrations
• Service update
• Metadata update
• Notifications
• Withdrawal
• Take-down
• Archiving
• Preservation
![Page 29: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/29.jpg)
Curating third party services is HARD
The Reality of Web Services in the Life Sciences
The Reality of (Expert) Crowd Sourcing Contributions
for a Web Service Catalogue
![Page 30: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/30.jpg)
Eight years ago Lincoln Stein said…
“An interface is a contract between data provider and
data consumer”
Stein L Creating a bioinformatics nation. Nature 2002;417:119-120.
![Page 31: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/31.jpg)
A Public interface means a Public Service
• Thinking local not global
– Local configuration bake-ins
– Scalability – I/O and load
– Interface granularity and interaction chattiness
• Interface churn
– Silent API volatility
– BioCatalogue Change logs
– Web Interface trumps API
– Local application trumps dependent external ones
Ensembl API: updated on every release, not backward compatible with obscured versioning.
BioMART: exposed internal identifier formats and then changed them.
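Silent interface changes can be surfaced by comparing snapshots of a service description taken at different times. The following is a minimal sketch of that idea, assuming a snapshot is simply a mapping from operation name to parameter names; the snapshot contents are invented for illustration and this is not BioCatalogue's actual change-log mechanism.

```python
def diff_interface(old, new):
    """Compare two snapshots mapping operation name -> tuple of parameter names."""
    changes = []
    for op in sorted(old.keys() - new.keys()):
        changes.append(f"removed operation {op}")
    for op in sorted(new.keys() - old.keys()):
        changes.append(f"added operation {op}")
    for op in sorted(old.keys() & new.keys()):
        if old[op] != new[op]:
            changes.append(f"changed signature of {op}: {old[op]} -> {new[op]}")
    return changes


# Hypothetical snapshots of one service taken a few months apart.
snapshot_june = {"search": ("query", "database"), "fetch": ("id",)}
snapshot_sept = {"search": ("query", "database", "program"), "fetchById": ("id",)}

change_log = diff_interface(snapshot_june, snapshot_sept)
for entry in change_log:
    print(entry)
```

A renamed operation shows up as one removal plus one addition, which is exactly how a dependent client experiences it: the old call stops working without notice.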
![Page 32: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/32.jpg)
Preservation
(Public) Service Sustainability
Staff/funding/project churn
• 2 year availability, responsibility migration/hole, service decay -> application decay
• 58% developed by students, 24% stated not maintained (Schultheiss et al. (2010) PLoS Comp Biol (in review))
• 146 services archived, >90% availability
Sustainability strategy
• Make it portable
• Provide documentation
• Use existing frameworks and practices
• Involve the community and know your users
• Plan sunset or migration
• Funding models for sustainability
![Page 33: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/33.jpg)
Schultheiss et al. (2010) PLoS Comp Biol (in review)
![Page 34: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/34.jpg)
Geek Usability
Quasi-Standards
• http://xml.nig.ac.jp/rest/Invoke?service={x}&method={y}&...
• Which service? Need to know precisely what is expected for every service at the same endpoint
• http://xml.nig.ac.jp/{service}/{method}?...
• Service-method pairs
• REST-like: http://BASE/op?parameter={value}
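The two URL conventions on this slide can be put side by side in a short sketch. The base URL and the two patterns are taken from the slide; the service, method and parameter names are hypothetical, and no request is made.

```python
from urllib.parse import urlencode

BASE = "http://xml.nig.ac.jp"


def single_endpoint_url(service, method, params):
    """Style 1: one Invoke endpoint; service and method travel as query parameters."""
    query = urlencode({"service": service, "method": method, **params})
    return f"{BASE}/rest/Invoke?{query}"


def path_style_url(service, method, params):
    """Style 2: the service-method pair is part of the URL path."""
    return f"{BASE}/{service}/{method}?{urlencode(params)}"


# Hypothetical service, method and parameter names, for illustration only.
args = {"query": "P12345"}
url_a = single_endpoint_url("ExampleService", "exampleMethod", args)
url_b = path_style_url("ExampleService", "exampleMethod", args)
print(url_a)
print(url_b)
```

In the first style every service shares one endpoint, so the URL itself says nothing about which parameters are valid; the client has to know that for each service-method combination.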
![Page 35: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/35.jpg)
Usability: The What and How are Implicit knowledge
• No or lots of docs, poor examples
• Complexity
• Interfaces and Operation
• Service families
Service
Operation (×10)
Input
Output
Parameters
Errors
![Page 36: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/36.jpg)
![Page 37: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/37.jpg)
Behaviour families
Function
Polymorphic
Patterns
e.g. KEGG, TFmodeller
e.g. searchSimple operation in BLAST DDBJ
e.g. InterProScan (EBI), RapidMiner, Soaplab Server
Domain Tasks
Invocable operations
![Page 38: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/38.jpg)
BLAST (DDBJ): polymorphic, one operation with multiple functional units
1 operation: searchSimple (inputs: query, database, program)
5 functional units:
• proteinBlast: blastp, protein database (PD)
• nucleotideBlast: blastn, nucleotide database (ND)
• proteinNucleotideBlast: tblastn, nucleotide database (ND)
• nucleotideProteinBlast: blastx, protein database (PD)
• nucleotideBlastFrameTranslation: tblastx, nucleotide database (ND)
PD: protein sequence database
ND: nucleotide sequence database
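The polymorphism can be made explicit with a lookup from the `program` argument to the functional unit that a `searchSimple` call actually performs. The sketch below uses the unit names and program/database pairings listed on the slide, read here as program plus expected database type; the function itself is hypothetical and is not part of the DDBJ service.

```python
# program -> (functional unit, expected database type), as listed on the slide.
# PD = protein sequence database, ND = nucleotide sequence database.
FUNCTIONAL_UNITS = {
    "blastp": ("proteinBlast", "PD"),
    "blastn": ("nucleotideBlast", "ND"),
    "tblastn": ("proteinNucleotideBlast", "ND"),
    "blastx": ("nucleotideProteinBlast", "PD"),
    "tblastx": ("nucleotideBlastFrameTranslation", "ND"),
}


def functional_unit(program, database_type):
    """Name the task a searchSimple call performs, or reject a bad combination."""
    if program not in FUNCTIONAL_UNITS:
        raise ValueError(f"unknown program: {program}")
    unit, expected = FUNCTIONAL_UNITS[program]
    if database_type != expected:
        raise ValueError(f"{program} expects a {expected} database, got {database_type}")
    return unit


print(functional_unit("blastx", "PD"))
```

Annotating the five units separately, rather than the single operation, is what lets a user search the catalogue by task.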
![Page 39: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/39.jpg)
Server Wrapper Pattern
• Soaplab service operations
• clear | describe | getLastEvent | getResults | getResultsInfo | getStatus | run | runAndWaitFor | terminate | waitfor
• All 100 or so services have the same WSDL document.
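Why every wrapped service ends up with the same description can be seen in a toy version of the pattern. This is a simplified sketch, not Soaplab's implementation: it borrows four of the operation names listed above, runs jobs synchronously, and uses made-up status strings and tools.

```python
import uuid


class ToolWrapper:
    """Generic job-style facade: every wrapped tool exposes the same operations."""

    def __init__(self, tool):
        self._tool = tool  # any callable taking a dict of named inputs
        self._jobs = {}    # job id -> (status, result)

    def run(self, inputs):
        job_id = uuid.uuid4().hex
        # Executed synchronously for simplicity; a real server would queue the job.
        try:
            self._jobs[job_id] = ("COMPLETED", self._tool(inputs))
        except Exception as exc:
            self._jobs[job_id] = ("FAILED", str(exc))
        return job_id

    def getStatus(self, job_id):
        return self._jobs[job_id][0]

    def getResults(self, job_id):
        return self._jobs[job_id][1]

    def clear(self, job_id):
        self._jobs.pop(job_id, None)


# Two unrelated hypothetical tools behind an identical interface.
reverse = ToolWrapper(lambda inputs: inputs["sequence"][::-1])
length = ToolWrapper(lambda inputs: len(inputs["sequence"]))

job = reverse.run({"sequence": "ACGT"})
print(reverse.getStatus(job), reverse.getResults(job))
```

Because `run`, `getStatus` and `getResults` look the same whatever tool sits behind them, the interface description alone cannot tell a user what a given service does.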
![Page 40: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/40.jpg)
The SOAP/REST technical view over services is not enough
Need a functional / task-oriented view
![Page 41: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/41.jpg)
Functional Unit annotation
• Service description abstraction
• Services as functional tasks
• Within the boundary of a service
• Independent from technology used
Service
Operation (×10)
WSDL
REST
DAS
[Missier, et al 2010 Functional Units: Abstractions for Web Service Annotations]
![Page 42: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/42.jpg)
Complexity because it’s a database really
SABIO–RK Service only
Taverna workflow
find chemical reactions that are associated with a given metabolite, and the kinetics associated with those reactions.
![Page 43: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/43.jpg)
Reflections
• Writing reusable, reliable (public) services with good and stable interfaces for others is hard
• A service interface is different to a web interface or a database query interface.
• Public interfaces – internal interfaces mismatch
• Publishing an interface is a publishing step.
• Technologist – User mismatch
• Eat your own dog food
• Takes resource, time and trouble
• But will pay off! We can’t afford to reinvent.
![Page 44: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/44.jpg)
Enterprise Concerns: real or perceived?
• Security
– HTTPS trusted peers inside a firewall
– WS-Security and OAuth (REST)
– Or is it fear of using external data?
• Performance
– Signature granularity and chattiness
– Data shipping vs reference shipping
– XML and JSON are not the only formats
• Governance
– Service Level Agreements
Technical or social issues?
![Page 45: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/45.jpg)
Collaborative Curating
• Socialising the community
• Rewarding contributors
• 10:90 long tail rule
• Content feedback spiral
• Feedback sensitivities
• Reputation protection
• Widen - Smart application feeds
• Resourced core content team
![Page 46: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/46.jpg)
Cost of Crowd Curation
![Page 47: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/47.jpg)
Take home
• Emerging, evolving, exciting and challenging Web service ecosystem
• BioCatalogue draws together services, knowledge and community to provide intelligence.
• Crowd collaboration to scale contribution, core to coordinate
• Open effort – contribute or adopt
• Core resource – for Alliances and Journals
• Social + technical challenges
• Christian Hauck’s talk 16.00 Thursday.
![Page 48: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/48.jpg)
Credits
Thomas Laurent
Hamish McWilliams
Franck Tanoh
Jiten Bhagat
Carole Goble
Rodrigo Lopez
Eric Nzuobontane
Steve Pettifer
Katy Wolstencroft
Robert Stevens
David De Roure
Mannie Tagarira
Jerzy Orlowski
Sergejs Aleksejevs
![Page 49: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/49.jpg)
![Page 50: BioIT Europe 2010 - BioCatalogue](https://reader038.vdocuments.us/reader038/viewer/2022103114/5550126eb4c905af648b49b7/html5/thumbnails/50.jpg)
Thank You
http://www.biocatalogue.org
About Us - http://wiki.biocatalogue.org
API Docs - http://apidocs.biocatalogue.org
11th July 2010, ISMB 10
Bhagat, J., Tanoh, F., Nzuobontane, E., Laurent, T., Orlowski, J., Roos, M., Wolstencroft, K., Aleksejevs, S., Stevens, R., Pettifer, S., Lopez, R., Goble, C.A.: BioCatalogue: a universal catalogue of web services for the life sciences, Nucl. Acids Res., 2010.
doi:10.1093/nar/gkq394