The Taverna Software Suite
Carole Goble at EGI User Forum, SHIWA, 2013
The Taverna Software Suite
Prof Carole Goble FREng FBCS CITP
The University of Manchester, [email protected]
http://www.mygrid.org.uk
http://www.taverna.org.uk
The Taverna Suite of Tools
Client User Interfaces
User Interfaces
Workflow Repository
Service Catalogue
Third Party Tools
Web Portals / Gateways
Activity and Service Plug-in Manager
Workflow Provenance
Workflow Server
Secure Service Access: OAuth 1 & 2, username/password, certificates.
Workflow Engine
Virtual Machine
Prog APIs
Command Line
Player
Workflow Components
Workbench Taverna Lite
Interaction Server
VPH-Share Project: Models of Human Physiology
Eagle Genomics & NHS: Next Generation Sequencing based Patient Diagnostics
Astronomy & HelioPhysics
Library Doc Preservation
Systems Biology of Micro-Organisms
OpenTox Project Chemistry Development Kit
Drug Toxicity
BioDiversity Invasive Species Modelling
Metagenomics
5820 members, 304 groups, 2415 workflows, 604 files and 229 packs (research objects)
biovel.myexperiment.org
The Wf4Ever Components
http://www.wf4ever-project.org
Models: Encoded in Standards, Contributed to Standards
Services: Foundational, Extension, User; APIs, Architecture; Web protocols/services
Policy and Planning: Leveraging established protocols; Preservation planning, policies; Best workflow design practices
Reference Systems: Command line+; Third party systems; User Driver
The Research Object www.researchobjects.org
Execution Platform
Using and Making Standards
Standard id for each component: ORCID, DOI, URI
OAI-ORE: Structuring and Bundling descriptions and components.
W3C Open Annotation Data Model (AO): Wf4Ever instrumental and hosting rollout meeting in Manchester
Transferable annotations
Structured and semantically tagged packs for exchange and for linking across repositories
Semantic Web Encoding
Aggregation
Annotation
Identity
ro Ontology
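As a rough illustration of the three ideas on this slide (identity, aggregation, annotation), the sketch below builds a research-object-like record as a plain Python dictionary. The ORE and RO namespace URIs are the published ones; the function name, the dictionary keys and all example.org identifiers are assumptions made for illustration and are not the Wf4Ever manifest format.

```python
ORE = "http://www.openarchives.org/ore/terms/"
RO = "http://purl.org/wf4ever/ro#"


def make_research_object(ro_uri, creator_id, resources, annotations):
    """Build a dictionary describing one research object (illustrative only).

    resources   -- URIs of the aggregated items (workflows, data, papers)
    annotations -- (target_uri, text) pairs; each target must be the research
                   object itself or one of its aggregated resources
    """
    known = set(resources) | {ro_uri}
    for target, _ in annotations:
        if target not in known:
            raise ValueError(f"annotation target not aggregated: {target}")
    return {
        "id": ro_uri,                                          # identity
        "types": [RO + "ResearchObject", ORE + "Aggregation"],
        "creator": creator_id,
        "aggregates": list(resources),                         # aggregation
        "annotations": [{"target": t, "body": b} for t, b in annotations],
    }


example = make_research_object(
    ro_uri="http://example.org/ro/1/",
    creator_id="https://orcid.org/0000-0000-0000-0000",  # placeholder ORCID
    resources=[
        "http://example.org/ro/1/workflow.t2flow",
        "http://example.org/ro/1/inputs.csv",
    ],
    annotations=[
        ("http://example.org/ro/1/workflow.t2flow",
         "Hypothetical workflow description"),
    ],
)
```

Rejecting annotations whose target is not aggregated is a design choice of this sketch: it keeps the pack self-describing, so that it can be exchanged between repositories without dangling references.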
Preservation Checklist: Monitoring environment, Metadata Completeness
Release, not Publish: Software release practice for workflows and scripts, services, data, articles, research objects
Gamble, Zhao, Klyne, Goble. MIM: A Minimum Information Model Vocabulary and Framework for Scientific Linked Data, 8th IEEE e-Science 2012, Chicago, USA
W3C PROV: Repair record, Preserved record of execution
Gil, Miles, Belhajjame, Deus, Garijo, Klyne, Missier, Soiland-Reyes, Zednik. Primer for the PROV Provenance Model. World Wide Web Consortium (W3C). 2012.
Belhajjame, Goble, Soiland-Reyes, De Roure. Fostering Scientific Workflow Preservation Through Discovery of Substitute Services. Proc 7th IEEE eScience 2011 Stockholm Sweden
Schopf, Treating Data Like Software: A Case for Production Quality Data, JCDL 2012
minim
wfprov roevo
Preservation Model
Experiment Descriptions
Organise workflows into structured studies
wfdesc: Inputs, outputs, dependencies
Workflow Decay: Component, Data & Infrastructure unavailability or inaccessibility
Taverna Components
Experiment Decay
Methodological changes: New technologies, resources, components, data
Workflow Motifs
IEEE e-Science 2012, FGCS submission
Best Practices: SWAT4LS
http://www.researchobject.org/
W3C Research Object for Scholarly Communication (ROSC) Community Group
http://www.w3.org/community/rosc/
Taverna Engine Execution
• Scufl2 language
• Functional dataflow, simple control flows, implicit iteration
• Linking services and tools
• Data movement, monitoring, staging, reference
• “In Workflow Programming” Beanshell scripting
• Provenance collection: W3C PROV(+) format
• Plug-in Framework
– Infrastructures: Grid, HPC, Web Services (SOAP, REST)
– Domain: CDK, BioMart, VOTable, SADI
– Common Tools: Excel Spreadsheets, Google Refine, R
• OAuth security plug-in
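The “implicit iteration” bullet can be illustrated with a small sketch: when a step that expects single values is handed lists, the step is invoked once per item, and several lists are paired either as a cross product or element by element (dot product). This is a simplified Python model, not Taverna code; the function `invoke` and its `strategy` parameter are hypothetical names, and the sketch returns a flat list and ignores nesting depth.

```python
from itertools import product


def invoke(service, inputs, strategy="cross"):
    """Call `service` once per combination of items (illustrative only).

    inputs   -- mapping of port name to a single value or a list of values
    strategy -- "cross": every combination; "dot": pair items by position
    """
    # No list on any port: a plain single invocation.
    if not any(isinstance(v, list) for v in inputs.values()):
        return service(**inputs)

    names = list(inputs)
    # Wrap single values so that every port holds a list of items.
    columns = [v if isinstance(v, list) else [v] for v in inputs.values()]

    if strategy == "cross":
        combos = product(*columns)
    elif strategy == "dot":
        combos = zip(*columns)  # stops at the shortest list
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    return [service(**dict(zip(names, combo))) for combo in combos]


def label(sample, method):
    """Stand-in for a service that takes two single values."""
    return f"{sample}/{method}"


all_pairs = invoke(label, {"sample": ["s1", "s2"], "method": ["a", "b"]})
matched = invoke(label, {"sample": ["s1", "s2"], "method": ["a", "b"]}, "dot")
```

The workflow author never writes the loop: the same `label` step works for one sample or many, which is the point of iteration being implicit.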
Taverna Pro-Workbench
• Desktop application
• GUI
• Intermediate results views
• Gateway to BioCatalogue and myExperiment
• Plug-in Framework
Workflow Blocks made of a workflow
• Well described
• Well behaved
• Well looked after
• Agreed fail
• Agreed formats in and out
• Agreed provenance
Deposited in myExperiment
Grouped into families
Components
Desktop Client
http://www.xworx.org/
Data Centric Interface
BIFI (Beautiful Interfaces for Inputs) Taverna Workbench Plug-in, GUI definition language
Data services
• Vanilla Taverna
– Domain data type neutral
• AstroTaverna plug-in
– IVOA data services
– VOTables
• PyWPS plug-in
– Exposes OGC-compliant Web Processing Services that can handle large data
Taverna Server
• Multiple clients, Multi-user
• SOAP and REST API
[Diagram: Server Host; Taverna Server; “Client”; Taverna Server Front End; TavServ Back End (×3); Service (×3)]
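To make the REST bullet concrete, the sketch below builds (but does not send) the two HTTP requests that would submit and start a run. It assumes the resource layout described in the Taverna Server 2.x documentation (POST a t2flow document to `.../rest/runs`, then PUT `Operating` to the run's `status` resource). The base URL and helper names are hypothetical, and the paths and media type should be checked against the server version in use.

```python
from urllib.request import Request

# Hypothetical deployment URL; replace with a real server address.
BASE = "http://localhost:8080/taverna-server/rest"


def new_run_request(t2flow_bytes, base=BASE):
    """Request that submits a workflow document to create a new run."""
    return Request(
        f"{base}/runs",
        data=t2flow_bytes,
        headers={"Content-Type": "application/vnd.taverna.t2flow+xml"},
        method="POST",
    )


def start_run_request(run_url):
    """Request that asks the server to start an existing run."""
    return Request(
        f"{run_url}/status",
        data=b"Operating",
        headers={"Content-Type": "text/plain"},
        method="PUT",
    )
```

Under these assumptions, sending the first request with `urllib.request.urlopen` returns a `Location` header naming the new run, and that URL is what `start_run_request` expects. Authentication and input staging are omitted.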
Taverna Server Family
• Taverna Server
– Multiple clients, Multi-user
– SOAP and REST API
• Taverna Server Amazon Machine Image
– Bundled R server, Atom feed server
– Multiple instances in Amazon Cloud and as required, for multiple users/uses and different security scenarios
• Taverna Virtual Machine
• Taverna Command Line
• Bundled Servers
Calling DCI Grid/Cloud Services
• Expose services/tools as WSDL/REST services
– HELIO: Fixed host name
– VPH-Share: Services running on dynamically started instances
– SZTAKI Desktop Grid – BOINC/Debian Package
• Specific service/extension to Taverna
– UNICORE plugin: Ask grid what services are available, Include services in a workflow, Invoke services on the grid (see talk by Shahbaz Memon)
• Library to control job submission to grid
– PBS plugin: beanshells in a workflow include invocations of jobs
– KnowARC plugin: Advanced Resource Connector to submit jobs to NorduGrid
[Diagram: Web interface; Input SNPs; Results; Storage (S3); Ensembl (MySQL); Cache (S3); Taverna Server (×3); Workflow engine orchestrator: e-Hive, Taverna, other; Application specific tools and Web Services (×3)]
All user interaction via web interface
User data stored in the Cloud
Data for all tools and Web Services stored in the Cloud
Unified access to different workflow engines with our common REST API
Tools and Web Services for each workflow are installed together for easy replication
Cloud Analytics for Life Sciences
Tavoop: Taverna & Hadoop
• Compiles Taverna Workflow to collection of Hadoop jobs
• Designed for handling very large amounts of data
– Overhead to using Hadoop, but wins if enough data
– Data ingest (expensive step) must have already been done
• Supports Taverna Platform Execution interface
• Parallelisable service types
• http://wiki.opf-labs.org/display/SP/P
PL Hadoop Cluster
Taverna Execution Interface
Tavoop Compiler
Portal (Taverna Player)
GUI Application (Workbench)
Interacting with a workflow
• Many workflows need user interaction
• A workflow on a server does not need to be “press a button and wait”
– VPH-Share opens a VNC connection to the spawned instance.
• Taverna Interaction Service
– Users interact with a workflow (wherever it is running) in a web browser.
– Interaction Service Plug-in in workbench
URLs and Frames
Taverna Tool Spectrum
Users: Technical Computational Scientist, Domain Scientist
Tools: Workbench, Workbench Components, Lite, Domain-Specific Website / Tool / Portal, Player, Command Line
Workflow Visibility: High, Low
Concept Knowledge: Taverna, Domain
Taverna Client Family
• Java library / Ruby GEM
• Run a Taverna workflow in another workflow system e.g. Galaxy tools
• Command line
• Simple Taverna “player”
– Fixed workflow
• Upload & run workflows and choose data
– Universitat Pompeu Fabra’s “Soaplab MajorDomo”
– Taverna Lite
Taverna-Lite
Generic Web-based Client
Hide complexity
Access to datasets
Upload and interact with workflows
Build Portal
• Homepage
• User-Sessions
• Workflow Management
• Run Management
• Server Credentials
Uses Components for simpler assembly and workflow edits
Web apps to create and run workflows
Service Chaining Editor
Pete Walker et al, Plymouth Marine Laboratory
For chaining OGC Web Processing Service geospatial Web services
Web apps to create and run workflows: Online Taverna
• Dr Vadim Surpin and Vitaly Sharanutsa
• Institute for Information Transmission Problems of Russian Academy of Sciences (IITP RAS)
An online, in-browser application for assembling and running Taverna Workflows over a HPC platform
Software Sustainability Institute Booth: Dr Vadim Surpin
Upload workflow by URL Online Taverna
Taverna 3
Beta July 2013
Summary
• Taverna Suite for interactive and batch workflows
• Flexible Plug-ins and Flexibly Plugged-in
• Themed Taverna
• Establishing Taverna Foundation
• We welcome collaboration/contribution
• http://www.taverna.org.uk
Learn more…
• myGrid
– http://www.mygrid.org.uk
• Taverna
– http://www.taverna.org.uk
• myExperiment
– http://www.myexperiment.org
• BioCatalogue
– http://www.biocatalogue.org
• Wf4ever
– http://www.wf4ever-project.org
• SCAPE
– http://www.scape-project.eu
• Software Sustainability Institute
– http://www.software.ac.uk
• BioVeL
– http://www.biovel.eu
• Virtual data objects
– Johan
• MOU
– Portals for BioVeL
– DCI platforms
• myExperiment – SHIWA repository (execution)
– How can we interchange