e-Labs and Research Objects
What is an e-Laboratory?
• A laboratory is a facility that provides controlled conditions in which scientific research, experiments and measurements may be performed, offering a work space for researchers.
• An e-Laboratory is a set of integrated components that, used together, form a distributed and collaborative space for e-Science, enabling the planning and execution of in silico experiments -- processes that combine data with computational activities to yield experimental results
• An e-Lab consists of:
1. a community;
2. work objects;
3. generic resources for building and transforming work objects.
• Sharing infrastructure and content across projects
e-Labs
People, Data, Methods
Research Objects
• The common currency for e-Labs
• A story about an investigation
• An aggregation of resources
– With a particular purpose, reason or rationale for the aggregation
• Capturing the investigation process “from soup to nuts”
• Intended to be
– Reusable
– Repeatable
– Replayable
e-Labs + Research Objects
• An e-Lab is built from a collection of services, consuming and producing Research Objects
[Diagram: an RO Bus with Services, a Workbench/RO-driven UI, and RO-aware services (Visualisation, Notification, Annotation, etc.)]
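The idea of services consuming and producing Research Objects over a shared bus can be sketched as follows. This is only an illustration: the `ROBus` class, the `ResearchObject` fields and the two example services are hypothetical names, not part of any e-Lab implementation described in these slides.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ResearchObject:
    """Hypothetical minimal RO: an identifier plus free-form annotations."""
    ro_id: str
    annotations: Dict[str, str] = field(default_factory=dict)


# A service is modelled here as a function that consumes an RO and produces an RO.
Service = Callable[[ResearchObject], ResearchObject]


class ROBus:
    """Hypothetical in-process bus that hands each published RO to every service."""

    def __init__(self) -> None:
        self._services: List[Service] = []

    def subscribe(self, service: Service) -> None:
        self._services.append(service)

    def publish(self, ro: ResearchObject) -> ResearchObject:
        # Services run in subscription order; each may enrich the RO.
        for service in self._services:
            ro = service(ro)
        return ro


def annotation_service(ro: ResearchObject) -> ResearchObject:
    ro.annotations["tagged"] = "yes"
    return ro


def notification_service(ro: ResearchObject) -> ResearchObject:
    ro.annotations["notified"] = "subscribers"
    return ro


bus = ROBus()
bus.subscribe(annotation_service)
bus.subscribe(notification_service)
result = bus.publish(ResearchObject("ro-1"))
print(result.annotations)
```

A real e-Lab would presumably use networked services and a message broker rather than in-process function calls; the sketch only shows the consume/produce contract.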
Research Objects
[Diagram labels: Workflows, Data sets, Services, Scripts, Publications; Development e-Lab; Application e-Lab; Research Methods Experts; Policy makers; Delivery Experts]
Knowledge Burying (Mons)
• Publishing/mining cycle results in loss of knowledge
– ≥ 40% of information lost
• RIP – Rest in Paper
• ROs as a mechanism for publication of knowledge, preserving information about the process.
[Diagram labels: Experiment, Paper, Knowledge, Publication, Text Mining]
(Current) RO Principles
• Common Schema for internal structure
• References + metadata rather than Data
• Graceful degradation of understanding
– Not all services understand everything
– cf. RDF/OWL
• Reflective
– Clickable
– Displayable
• Mailable
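Two of these principles, "references + metadata rather than data" and "graceful degradation of understanding", can be illustrated with a small sketch. All names here (`ResourceRef`, `describe`, the `kind` labels and the example.org URIs) are assumptions made for illustration, not a schema defined by the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ResourceRef:
    """A reference to a resource plus metadata; the data itself is not embedded."""
    uri: str
    kind: str  # hypothetical type label, e.g. "workflow" or "dataset"
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class ResearchObject:
    title: str
    rationale: str
    resources: List[ResourceRef] = field(default_factory=list)


def describe(ro: ResearchObject, understood_kinds: Set[str]) -> List[str]:
    """List the RO's resources as seen by a service that knows only some kinds.

    Unknown kinds are not an error: they degrade to a generic "resource" entry.
    """
    lines = []
    for ref in ro.resources:
        label = ref.kind if ref.kind in understood_kinds else "resource"
        lines.append(f"{label}: {ref.uri}")
    return lines


ro = ResearchObject(
    title="Example investigation",
    rationale="Illustrative aggregation",
    resources=[
        ResourceRef("http://example.org/workflow/1", "workflow"),
        ResourceRef("http://example.org/data/1", "dataset", {"format": "csv"}),
    ],
)
workflow_only_view = describe(ro, {"workflow"})
print(workflow_only_view)
```

A service that understands only workflows still processes the whole RO and treats the data set as an opaque reference, loosely analogous to how an RDF consumer can ignore vocabulary it does not recognise.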
Anatomy of an RO
Flavours of RO
• RO as encapsulation of a process
– Up to date references to appropriate resources
• RO as a record of what happened
– Curated, “fossilised”, immutable aggregation
• RO as collection
– e.g. tutorial materials
• RO as protocol
• General templates that may be specialised for specific domains/tasks
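The contrast between an RO whose references are kept up to date and a "fossilised" record can be sketched as a mutable object with a method that yields an immutable snapshot. The class names `LiveRO` and `FossilisedRO` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import List, Tuple


@dataclass(frozen=True)
class FossilisedRO:
    """Immutable record of what happened: fields cannot be reassigned."""
    title: str
    resource_uris: Tuple[str, ...]


@dataclass
class LiveRO:
    """Encapsulation of a process: references may be updated over time."""
    title: str
    resource_uris: List[str]

    def fossilise(self) -> FossilisedRO:
        # Copy the references into a tuple so later edits do not leak in.
        return FossilisedRO(self.title, tuple(self.resource_uris))


live = LiveRO("Example", ["http://example.org/data/1"])
record = live.fossilise()
live.resource_uris.append("http://example.org/data/2")  # the live RO moves on

try:
    record.title = "changed"
    frozen = False
except FrozenInstanceError:
    frozen = True
```

After the append, `record` still holds only the first reference, and attempting to modify it raises an error.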
What’s inside?
• A research problem
• A hypothesis
• Experimental design
• Data sets
• Measurements
• Workflows used to analyse data
• Results of data analysis
• Information about ethical approval
• Governance policies
• Publications, e.g. papers, reports, slide-decks
• The investigators involved in the experiment
• References to other SROs that the work depends on or cites
• Descriptions of relationships between resources
– Lilly experiment ontology
– SWAN/SIOC – Scholarly discourse
– OBO relations ontology
RO Lifecycle
• ROs have a lifecycle: they may be created, manipulated, edited, interrogated and published.
• Appropriate services support this lifecycle
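One way to picture such a lifecycle is as a small state machine. The slides name the activities (create, manipulate, edit, interrogate, publish) but do not specify which transitions are allowed, so the states and the rule that a published RO can no longer be edited are assumptions made for this sketch.

```python
from enum import Enum


class State(Enum):
    CREATED = "created"
    EDITED = "edited"
    PUBLISHED = "published"


# Assumed transition table: editing may repeat; publishing is final.
ALLOWED = {
    State.CREATED: {State.EDITED, State.PUBLISHED},
    State.EDITED: {State.EDITED, State.PUBLISHED},
    State.PUBLISHED: set(),
}


def advance(current: State, target: State) -> State:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target


state = State.CREATED
state = advance(state, State.EDITED)
state = advance(state, State.PUBLISHED)
```

Interrogation (querying an RO) does not change state, so it is left out of the table.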
e-Labs services
• Registry
• Repository
• Workflow Monitoring
• Event Logging
– News feeds, activities
• Social Metadata
– Tagging, groups, users, sharing
• Annotation
• Search
• Visualisation
• Notification
• Authentication, Authorisation and Role-based Access
• Job Execution: workflow engine, HPC scripts, etc.
• Naming and Identity: centralised vs. distributed
• Synchronisation
– To support on-line and off-line working
• Anonymisation
– e.g. for health records
• Text Mining
e-Labs activity
• Obesity e-Lab (details next)
• myExperiment
– Packs as a precursor to ROs
– Sharing/Social networking services
• Biocatalogue
– Curated collection of bio web services
• LifeGuide
– myExperiment for storing/sharing Internet interventions
• NW eHealth
– e-Labs as a “sense-making layer” on top of NHS Information Systems
• ONDEX
– Linking bio data sets
• Sysmo-DB
– Web-based exchange of data
• Shared Genomics
– HPC Infrastructure for analysis of large-scale genetic data
e-Labs TAG
Evolution
1st Generation
•Current practice of early adopters of e-Labs tools such as Taverna
•Characterised by researchers using tools within their particular problem area, with some re-use of tools, data and methods within the discipline.
•Traditional publishing is supplemented by publication of some digital artefacts like workflows and links to data.
•Provenance is recorded but not shared and re-used.
•Science is accelerated and practice is beginning to shift to emphasise in silico work
2nd Generation
•Designing and delivering now, e.g. Obesity e-Lab
•Builds on experience with Taverna and myExperiment and on our research results arising from these activities
•Key characteristic is re-use - of the increasing pool of tools, data and methods across areas/disciplines.
•Contain some freestanding, recombinant, reproducible research objects. Provenance analytics plays a role.
•New scientific practices are established and opportunities arise for completely new scientific investigations.
3rd Generation
•The vision - the e-Labs we'll be delivering in 5 years - illustrated by open science.
•Characterised by global reuse of tools, data and methods across any discipline, and surfacing the right levels of complexity for the researcher.
•Key characteristic is radical sharing
•Research is significantly data driven - plundering the backlog of data, results and methods.
•Increasing automation and decision-support for the researcher - the e-Laboratory becomes assistive.
•Provenance assists design
•Curation is autonomic and social
ROs and e-Labs
• Research Objects
– Aggregations of resources (people + data + methods)
– Rationale, purpose, story
– Lifecycle
– Share and Exchange: Reuse, Replay, Repeat
• e-Labs
– Collection of services consuming and producing Research Objects
A dream…
http://www.flickr.com/photos/fatdeeman/2879894
Problem
e-Lab