
Post on 21-Dec-2015


Annotating Experimental Records using Ontologies

Olga Giraldo, Unal de Colombia/CIAT

Jael Garcia, Universität der Bundeswehr

Alexander Garcia, UAMS


Motivation and Research Question

• Knowledge-based approach to managing laboratory information: it combines elements from the Semantic Web (SW), e.g. ontologies supporting organization and classification, with elements from Social Tagging Systems, e.g. collaboration and ad hoc organization strategies.

• How can we semantically annotate laboratory records?

• How can we facilitate the coexistence of laboratory notebooks and electronic laboratory records?


Motivation and Research Question

• Easy to use, highly portable, easy to share, low cost…

• Great artifacts for supporting design

• Legal requirement

[Images: da Vinci, Mutis, Marie Curie]


Research Question

• How can we facilitate the coexistence of laboratory notebooks and electronic laboratory records?

• How can we semantically annotate laboratory records?


Our Approach

• Documents should be able to “know about” their own content for automated processes to “know what to do” with them.

Semantics….


Materials and Methods

• Our scenario: supporting the annotation of experimental data for some of the processes routinely run at the biotechnology laboratory of the International Center for Tropical Agriculture (CIAT)

• 15 laboratory notebooks together with their corresponding electronic records, e.g. XLS files, outputs from lab equipment, etc.

• 10 biologists

• Direct non-intrusive observation: 6 months

• Ontology and prototype development: iterative and collaborative process

• Existing ontologies


Results

• Data types
• Rhetorical structure
• Ontologies
• Orchestration of ontologies
• Tags and ontologies
• Lessons


Results

• Data types
– Manuscript
– Digital
– Digital data with manuscript annotations


Results

• Manuscript
– Lists
– To-dos
– How-tos (protocols)
– Incomplete results
– Dates
– Formulas
– Electronic paths
– Sources for information (URLs)


Results

• Digital
– Photos
– Lists
– Incomplete results
– Protocols
– Figures
– Sequences


Results

• Digital + Manuscript
– Digital files and print-outs, tagged with manuscript information.


Results

• We identified the rhetorical structure implicit in those laboratory notebooks we studied

• And the metadata describing such structure


Lab Notebook

Rhetorical structure: Header, Body.

Header: metadata describing a lab notebook
– Title (DC)
– Notes (AgMes)
– Date of creation (DC)
– Laboratory notebook number (M4L)
– Creator (DC/AgMes)
– Date of finalization (M4L)
– Language (DC)
– Project (OBI/AGROVOC)

Body: metadata describing an experimental activity
– Laboratory procedure (M4L)
– Comments (BioPortal, NCIt, SNOMED)
– Date (DC)
– Page number (M4L)
– Purpose (M4L)
– Security measurements (M4L)
– Outcome (NCIt)
– Materials & Methods, experimental design
– Protocol (OBI)
– Recorded by (M4L)

Materials & Methods: Samples, Reagents, Assays, Equipment and supplies.
– Samples: DNA, RNA, whole plant, etc. (OBI, CHEBI, PO)
– Reagents: buffer, dNTP mix (CHEBI, M4L)
– Assay: DNA extraction, PCR, gel electrophoresis (OBI, M4L)
– Equipment & supplies: freezer, centrifuge, shaker, glove, etc. (OBI, PEO, SEP, SNOMED, BIRNLex, M4L)

Experimental design: (OBI, M4L)
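
The header and body metadata listed on this slide can be sketched as a small record type. This is only an illustration under stated assumptions, not the authors' implementation: the split of fields between header and body follows the order in which they appear on the slide, and the class name `NotebookPage`, the dictionary keys, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Field name -> vocabulary named on the slide.
# The key spellings and the header/body split are assumptions.
HEADER_FIELDS: Dict[str, str] = {
    "title": "DC",
    "notes": "AgMes",
    "date_of_creation": "DC",
    "notebook_number": "M4L",
    "creator": "DC/AgMes",
    "date_of_finalization": "M4L",
    "language": "DC",
    "project": "OBI/AGROVOC",
}

BODY_FIELDS: Dict[str, str] = {
    "laboratory_procedure": "M4L",
    "comments": "BioPortal, NCIt, SNOMED",
    "date": "DC",
    "page_number": "M4L",
    "purpose": "M4L",
    "security_measurements": "M4L",
    "outcome": "NCIt",
    "protocol": "OBI",
    "recorded_by": "M4L",
}


@dataclass
class NotebookPage:
    """One notebook page: notebook-level header plus activity-level body."""
    header: Dict[str, str] = field(default_factory=dict)
    body: Dict[str, str] = field(default_factory=dict)

    def unknown_fields(self) -> List[str]:
        """Return, sorted, the field names that the schema does not define."""
        unknown = [name for name in self.header if name not in HEADER_FIELDS]
        unknown += [name for name in self.body if name not in BODY_FIELDS]
        return sorted(unknown)

    def annotated(self) -> Dict[str, Tuple[str, str]]:
        """Pair each recognised value with the vocabulary that describes it."""
        pairs: Dict[str, Tuple[str, str]] = {}
        for name, value in self.header.items():
            if name in HEADER_FIELDS:
                pairs[name] = (value, HEADER_FIELDS[name])
        for name, value in self.body.items():
            if name in BODY_FIELDS:
                pairs[name] = (value, BODY_FIELDS[name])
        return pairs


# Hypothetical sample values, for illustration only.
page = NotebookPage(
    header={"title": "Rice PCR notebook", "language": "es", "volume": "2"},
    body={"purpose": "Amplify a promoter region", "page_number": "14"},
)
print(page.unknown_fields())
print(page.annotated()["purpose"])
```

Keeping the vocabulary next to each field name makes it possible to report which ontology a value should be checked against, without committing to any particular RDF toolkit.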


We focused on: DNA extraction, PCR and Electrophoresis

DNA Extraction


A typical process in a plant biotechnology laboratory

Mechanical pulverization of plant material


Results

• M4L: our ontology for the experimental processes we studied
– Based on OBI.
– Terms proposed to OBI: 197, including new terms plus terms from other ontologies
– Other terms will be proposed to other ontologies, e.g. ChEBI, GO, PO


No. | Ontology | N. of concepts
0 | Metadata for Laboratory Notebook (M4L) | 149
1 | Chemical Entities of Biological Interest (CHEBI) (Degtyarenko et al., 2008) | 87
2 | Ontology for Biomedical Investigations (OBI) (Brinkman et al., 2010) | 59
3 | Medical Subject Headings ontology (MSH) (Moerchen et al., 2008) | 17
4 | Gene Ontology (GO) (Ashburner et al., 2000) | 14
5 | Sample Processing and Separation Techniques (SEP) (http://psidev.info/index.php?q=node/312) | 6
6 | BIRN Project lexicon (BIRNLex) (Bug et al., 2008) | 6
7 | Gene Regulation Ontology (GRO) (Beisswanger et al., 2008) | 5
8 | National Cancer Institute thesaurus (NCIt) (Ceusters et al., 2005) | 5
9 | Plant Ontology Consortium (POC) (Jaiswal et al., 2005) | 5
10 | SNOMED-CT (http://www.nlm.nih.gov/research/umls/Snomed/snomed_main.html) | 5
11 | BioTop Ontology (Beisswanger et al., 2007) | 1
12 | Foundational Model of Anatomy (FMA) (Rosse and Mejino, 2003) | 1
13 | Ontology for Genetic Interval (OGI) (Lin et al., 2010) | 1
14 | Parasite Experiment Ontology (PEO) (http://wiki.knoesis.org/index.php/Parasite_Experiment_ontology) | 1
15 | Proteomics Data and Process Provenance (PDPP) (Sahoo et al., 2006) | 1


Results

• We have structured the descriptive layers by reusing and extending existing ontologies.

• For supporting the annotation within our scenario we have identified three main layers, namely:
– i) that related to the document itself,
– ii) the annotation layer, and
– iii) that related to the experiment.


Results

• Orchestration of ontologies: Annotation Ontology

The Annotation Ontology (AO) is a vocabulary for performing several types of annotation (comments, entity annotations or semantic tags, textual annotations or classic tags, notes, examples, errata, ...) on any kind of electronic document (text, images, audio, tables, ...) and on document parts. AO does not provide any domain ontology; instead it fosters the reuse of existing ones, so as not to break the Semantic Web principle of scalability.


[Figure: example annotation graph built with the Annotation Ontology. Labels recoverable from the figure:
– Annotations: ANNOT1, ANNOT2 (Annotation, Qualifier, Definition; rdf:type, rdfs:subClassOf)
– Selector: InitEndCornerSelector, ImageSelector; aos:init (304,507), aos:end (360,618); ao:context
– Provenance: pav:createdBy http://www.tags4lab.org/foaf.rdf#olga.giraldo (rdf:type foaf:Person); pav:createdOn June 1, 2010
– Document: aof:annotatesDocument, aof:onDocument http://www.ncbi.nlm.nih.gov/pubmed/12520345
– Topic: ao:hasTopic GenBank:AB005238
– Body: ann:body, "Partial sequence on psy promoter"
– MOAT: moat:Tag, tags:name, moat:tagMeaning, aoex:hasMoatMeaning, moat:Meaning, moat:hasMeaning]
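
The annotation shown on this slide can be approximated as a handful of subject, predicate, object triples. The sketch below is illustrative only: the property names and values are the labels visible on the slide, but the way they are connected here is a reading of the figure rather than a confirmed model, the prefixed class name `ao:Annotation` is an assumption, and plain Python tuples stand in for an RDF library. The function names `build_annotation` and `objects` are hypothetical.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def build_annotation(annotation_id: str, document: str, topic: str,
                     body: str, creator: str, created_on: str) -> List[Triple]:
    """Return the triples describing one annotation of a document."""
    return [
        (annotation_id, "rdf:type", "ao:Annotation"),  # assumed class name
        (annotation_id, "aof:annotatesDocument", document),
        (annotation_id, "ao:hasTopic", topic),
        (annotation_id, "ann:body", body),
        (annotation_id, "pav:createdBy", creator),
        (annotation_id, "pav:createdOn", created_on),
        (creator, "rdf:type", "foaf:Person"),
    ]


def objects(triples: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return every object asserted for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


# Values taken from the slide; the date is written in ISO form.
annot1 = build_annotation(
    annotation_id="ANNOT1",
    document="http://www.ncbi.nlm.nih.gov/pubmed/12520345",
    topic="GenBank:AB005238",
    body="Partial sequence on psy promoter",
    creator="http://www.tags4lab.org/foaf.rdf#olga.giraldo",
    created_on="2010-06-01",
)
print(objects(annot1, "ANNOT1", "ao:hasTopic"))
```

Separating provenance (who, when) from topic and body mirrors the three layers named earlier: the document, the annotation, and the experiment.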


Results

• The AO structures the semantic annotations as well as the tags generated by users.
– In this way we support complex SPARQL queries involving several ontologies, for instance:

• “Retrieve from the eLabBook the pages tagged by Tim Andrews or Lisa Watson with the tags rice and iron for which there is a LIMS data entry”
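
The example request above could be phrased roughly as follows. This is a sketch under stated assumptions: the `ex:` vocabulary in the SPARQL text is hypothetical (the slides do not show the actual query or schema), and the request is read as "the page carries both tags, each applied by one of the two people". The Python function `pages_matching` applies the same filter to in-memory sample data so the logic can be checked without a triple store; all sample data are invented.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical SPARQL phrasing; the ex: terms are placeholders, not AO terms.
EXAMPLE_QUERY = """
PREFIX ex: <http://example.org/elabbook#>
SELECT DISTINCT ?page WHERE {
  ?t1 ex:onPage ?page ; ex:tagName "rice" ; ex:taggedBy ?who1 .
  ?t2 ex:onPage ?page ; ex:tagName "iron" ; ex:taggedBy ?who2 .
  ?page ex:hasLimsEntry ?entry .
  FILTER (?who1 IN ("Tim Andrews", "Lisa Watson"))
  FILTER (?who2 IN ("Tim Andrews", "Lisa Watson"))
}
"""

Tagging = Tuple[str, str, str]  # (page, tagger, tag)


def pages_matching(taggings: Iterable[Tagging], taggers: Set[str],
                   required_tags: Set[str],
                   pages_with_lims: Set[str]) -> List[str]:
    """Pages that carry every required tag (applied by one of the given
    taggers) and that also have a LIMS data entry."""
    tags_by_page: Dict[str, Set[str]] = {}
    for page, tagger, tag in taggings:
        if tagger in taggers:
            tags_by_page.setdefault(page, set()).add(tag)
    return sorted(
        page for page, tags in tags_by_page.items()
        if required_tags <= tags and page in pages_with_lims
    )


# Invented sample data, for illustration only.
TAGGINGS: List[Tagging] = [
    ("page-12", "Tim Andrews", "rice"),
    ("page-12", "Lisa Watson", "iron"),
    ("page-15", "Tim Andrews", "rice"),
    ("page-20", "Ana Perez", "rice"),
    ("page-20", "Ana Perez", "iron"),
    ("page-31", "Lisa Watson", "rice"),
    ("page-31", "Lisa Watson", "iron"),
]
LIMS_PAGES = {"page-12", "page-20"}

result = pages_matching(TAGGINGS, {"Tim Andrews", "Lisa Watson"},
                        {"rice", "iron"}, LIMS_PAGES)
print(result)
```

In the sample data only page-12 satisfies all three conditions: page-15 lacks the iron tag, page-20 was tagged by someone else, and page-31 has no LIMS entry.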


Concluding Remarks

• Although several electronic laboratory notebooks (ELNs) have been proposed and replacing paper-based records has been a consistent trend for several years, the technology has not yet been widely adopted. Laboratory Information Management Systems (LIMS) in combination with paper-based laboratory notebooks continue to be commonly used, particularly in academic environments.


Concluding Remarks

• Sharing and organizing information happens on a concept basis
– Researchers studying genes involved in iron transport share information with those who undertake nutritional studies assessing the effects of iron intake in human populations
– Clustering information based on concepts


Concluding Remarks

• Simple tagging mechanisms proved to be valuable resources for organizing information
– Tag clouds were used as tables of contents (TOCs)
– Tags were also used to support a quick view of laboratory pages
– Tags tend to stabilize over time
– Tags were a valuable resource of terms and evidence (use cases) for those terms
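
How a tag cloud can double as a table of contents can be sketched as below. This is a hypothetical illustration, not the prototype described in the talk: the function names `tag_cloud` and `tag_index` and the sample page tags are invented.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def tag_cloud(page_tags: Dict[int, List[str]]) -> List[Tuple[str, int]]:
    """Return (tag, number of pages) pairs, most used first, ties by name."""
    counts = Counter(tag for tags in page_tags.values() for tag in set(tags))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def tag_index(page_tags: Dict[int, List[str]]) -> Dict[str, List[int]]:
    """Map each tag to the sorted list of pages that carry it (the TOC)."""
    index: Dict[str, List[int]] = defaultdict(list)
    for page in sorted(page_tags):
        for tag in set(page_tags[page]):
            index[tag].append(page)
    return dict(index)


# Invented sample: notebook page number -> tags written on that page.
PAGE_TAGS = {3: ["rice", "iron"], 7: ["PCR", "rice"], 9: ["iron", "rice"]}

cloud = tag_cloud(PAGE_TAGS)
toc = tag_index(PAGE_TAGS)
print(cloud)
print(toc["iron"])
```

The counts give the relative weight of each tag in the cloud, while the index gives the pages to jump to, which is what lets the cloud serve as a table of contents.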


Concluding Remarks

• Time is difficult to model
• Incremental prototyping and participatory design were key: community engagement
• Limitations in the technology:
– Tablets, electronic pen, first-generation iPad, now Motorola XOOM
– Browser compatibility
• Laboratory notebooks look like specialized wikis


Future Work

• Focus on one technology: Android OS
• Semantic LIMS
• Support the whole cycle (LIMS record – notebook – machine-generated data)
• Automatic annotation of machine-generated data
• Adopt minimal amounts of information
• Adopt techniques from Personal Information Management approaches
• Look more like a wiki


Acknowledgments

• John Bateman, Oscar Corcho, Joe Tohme, Cesar Montana, Alberto Labarga

• The CIAT biotech lab