semtech web-protege-tutorial

Collaborative Editing of Lightweight Ontologies with WebProtégé. Natasha Noy, Matthew Horridge, Tania Tudorache. Stanford University

Upload: matthewhorridge

Post on 21-Nov-2014

DESCRIPTION

Slides for the hands on WebProtégé tutorial held at Semtech 2013.

TRANSCRIPT

Page 1: Semtech web-protege-tutorial

Collaborative Editing of Lightweight Ontologies

with WebProtégé

Natasha Noy, Matthew Horridge, Tania Tudorache

Stanford University

Page 2: Semtech web-protege-tutorial

Download the slides

http://tinyurl.com/semtech-webprotege

Page 3: Semtech web-protege-tutorial

Plan

• Introduction

• What is collaborative ontology editing?

• A guided tour of WebProtégé

• Hands-on exercise

• Discussion, Roadmap and Wrap up

Page 4: Semtech web-protege-tutorial

What is Protégé?

• An open-source ontology editor

• developed at Stanford University

• has more than 200,000 registered users

• has dozens of plugins for

– visualization

– inference

– import and export

– …

• has an API for developers

Page 5: Semtech web-protege-tutorial

A bit of Protégé history

• Started more than 20 years ago

• Has gone through many iterations

• Was the first editor to support OWL 1

• Informed the design of OWL 2

• Has a thriving user community:

• conferences

• mailing list

• short courses

Page 6: Semtech web-protege-tutorial

Protégé short course: Vienna, September 2-4

http://protege.stanford.edu/shortcourse/protege-owl/201309/

Page 7: Semtech web-protege-tutorial

The “Classic” Protégé

Not what this tutorial is about!

Page 8: Semtech web-protege-tutorial

WebProtégé

• A Web-based application

• edit ontologies in your Web browser

• nothing to install

• Supports distributed editing

• multiple editors can make changes at the same time

• Includes many collaboration features

• discussion, watches, feeds

Page 9: Semtech web-protege-tutorial

Plan

• Introduction

• Collaborative ontology editing

• Hands-on

• WebProtégé in large projects

• Discussion, Roadmap and Wrap up

Page 10: Semtech web-protege-tutorial
Page 11: Semtech web-protege-tutorial

Collaborative Ontology Development

Page 12: Semtech web-protege-tutorial

Collaborative Ontology Development

Collaboration: several users contribute to the development of one ontology

– Small group → larger community

– Larger ontologies that concern a certain community

– Individual process → social process

Each community does it in its own way

Page 13: Semtech web-protege-tutorial

Use cases of collaborative development in the biomedical domain

• Gene Ontology (GO)

• NCI Thesaurus

• BiomedGT

• OBI, BIRNLex, RadLex

• Open Biomedical Ontologies (OBO)

• International Classification:

– of Diseases (ICD-11)

– of Traditional Medicine (ICTM)

– of Patient Safety (ICPS)

Page 14: Semtech web-protege-tutorial

The NCI Thesaurus collaborative development process

● Simultaneous editing in Protégé clients

● Custom UI for restricting user input and enforcing business rules

● Development cycle begins after baseline

● ~20 full-time editors making changes; 1 “lead editor” who approves the changes and assigns new tasks

● Released version on NCI website and BioPortal

Reference ontology for cancer biology, translational science, and clinical oncology

Page 15: Semtech web-protege-tutorial

ICD-11

● 11th Revision of the International Classification of Diseases

● Over 10,000 categories used for coding, billing, statistics and policy making all over the world

● Collaborative and international effort

● Current version: published as books

● Goal for the new version: use a more formal representation and publish in electronic format; use Web-based collaboration and social platforms for editing

Page 16: Semtech web-protege-tutorial

Construction of ICD-10: Revision Process in the 20th Century

● 8 Annual Revision Conferences (1982 - 89)

● 17 – 58 Countries participated

– 1-5 person delegations

– Mainly Health Statisticians

● Manual curation

– List exchange

– Index was done later

● “Decibel” method of discussion

● Output: Paper Copy

● Work in English only

● Limited testing in the field

Page 17: Semtech web-protege-tutorial

ICD-11 process today

● Over 250 domain experts from around the world

● Organized in groups, which edit different parts of the ontology

Page 18: Semtech web-protege-tutorial

ICD-11 process today (cont.)

● Each night, a snapshot of the jointly edited ontology is published on a public platform to encourage feedback from the larger community: http://apps.who.int/classifications/icd11/browse/f/en

● Editorial workflow

● Centrally overseen by WHO

● Peer-reviewed process for the content and structure

● WebProtégé used as the collaborative ontology development platform

Page 19: Semtech web-protege-tutorial

Other ways of collaborating: Wikis

● Wikis are well known, e.g., Wikipedia

● Semantic Wikis – add semantic extensions to the wiki platforms

● Assign a wiki page to an entity in the ontology (e.g. the class “Mountain”)

● Export/import RDF

Page 20: Semtech web-protege-tutorial

Semantic Wiki: MoKi

Source: https://moki.fbk.eu/website/userfiles/image/entmod.png

Page 21: Semtech web-protege-tutorial

The challenge with wikis

Source: Hoehndorf, Robert, et al. "BOWiki: an ontology-based wiki for annotation of data and integration of knowledge in biology."

BMC bioinformatics 10.Suppl 5 (2009): S5.

Page 22: Semtech web-protege-tutorial

Using SourceForge to manage change proposals for the Gene Ontology

Page 23: Semtech web-protege-tutorial

myExperiment: social platform for sharing scientific workflows

Page 24: Semtech web-protege-tutorial

Other collaboration processes

● Use source control repositories – SVN, CVS

– Text based mechanisms

– Hard to merge local copies into the shared copy

● Locking mechanisms (lock parts of an ontology for editing)

● Use specialized (domain dependent) ontology repositories, e.g., BioPortal

Page 25: Semtech web-protege-tutorial

BioPortal

● An open repository of biomedical ontologies developed by NCBO at Stanford

● Publishing of ontologies, versioning (over 350 ontologies)

● Discussions and structured proposals

● Mappings, views

● Storing metadata

● Search over all ontologies

● Browsing different versions of an ontology

● All content and functionality also available as REST Web services → mash-up of applications

● Technology is domain independent

● http://bioportal.bioontology.org
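Because the content is exposed as REST Web services, a client only has to build a URL and parse the response. The sketch below builds a term-search URL and does not send it. The base URL `https://data.bioontology.org`, the `/search` path and the `q`, `apikey` and `ontologies` parameters are assumptions that should be checked against the BioPortal API documentation; `YOUR_API_KEY` is a placeholder.

```python
from urllib.parse import urlencode

# Assumed base URL of the BioPortal REST service; verify before use.
BASE_URL = "https://data.bioontology.org"


def search_url(term, api_key, ontologies=None):
    """Build (but do not send) a term-search request URL."""
    params = {"q": term, "apikey": api_key}
    if ontologies:
        # Assumed parameter: restrict the search to the given ontology acronyms.
        params["ontologies"] = ",".join(ontologies)
    return f"{BASE_URL}/search?{urlencode(params)}"


url = search_url("melanoma", "YOUR_API_KEY", ["NCIT"])
print(url)
```

Keeping URL construction separate from the HTTP call makes it easy to point the same client at a different repository, in line with the "domain independent" bullet above.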

Page 26: Semtech web-protege-tutorial

BioPortal Statistics

Page 27: Semtech web-protege-tutorial

Ontology list in BioPortal

Page 28: Semtech web-protege-tutorial

NCI Thesaurus details in BioPortal

Page 29: Semtech web-protege-tutorial

Useful features for collaboration

● Tools for discussion and reaching consensus

– Add notes to ontology entities (classes, properties, individuals, axioms)

– Add reviews and change proposals anywhere in the ontology

– Document the decision process and final decisions

● Complete Change history

– Establish provenance

– Retrieve ontology snapshots at any time

– Implement different conflict resolution mechanisms

● Personalized views of an ontology based on:

– User’s role and tasks

– User’s level of expertise
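The "complete change history" item above implies that any earlier state of the ontology can be rebuilt by replaying the log up to a chosen revision. A minimal sketch of that idea follows, assuming a simplified log of add/remove records. The `Change` record and the axiom strings are illustrative only and are not WebProtégé's storage format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    revision: int  # increasing revision number
    author: str    # who made the change (provenance)
    op: str        # "add" or "remove"
    axiom: str     # axiom written as a plain string for illustration


def snapshot(log, revision):
    """Replay every change up to and including `revision`."""
    axioms = set()
    for change in sorted(log, key=lambda c: c.revision):
        if change.revision > revision:
            break
        if change.op == "add":
            axioms.add(change.axiom)
        elif change.op == "remove":
            axioms.discard(change.axiom)
        else:
            raise ValueError(f"unknown operation: {change.op}")
    return axioms


log = [
    Change(1, "alice", "add", "SubClassOf(Editorial Article)"),
    Change(2, "bob", "add", "SubClassOf(Column Article)"),
    Change(3, "alice", "remove", "SubClassOf(Column Article)"),
]

print(sorted(snapshot(log, 2)))
print(sorted(snapshot(log, 3)))
```

Because each record carries an author, the same log also answers provenance questions such as who introduced or removed a given axiom.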

Page 30: Semtech web-protege-tutorial

Useful features for collaboration (cont.)

● User roles and access control

– Fine-grained control for editing and viewing rights

– Sharing of ontologies

● Publishing released versions of an ontology in a central location, e.g. a repository

● Scalability, reliability and robustness

Page 31: Semtech web-protege-tutorial
Page 32: Semtech web-protege-tutorial

WebProtégé

A Quick Tour of the UI

Page 33: Semtech web-protege-tutorial

Creating an Account I

Create a new account

Page 34: Semtech web-protege-tutorial

Creating an Account II

Email address - used for notifications such as ontology changes

User name - displayed next to changes you make and notes that you post

Page 35: Semtech web-protege-tutorial

The “Home Screen”

Side bar

Project list. Click project name to open

Create project

Download project

Sign In/Sign Out

Trash project

Upload project

Page 36: Semtech web-protege-tutorial

The Side Bar

All public projects plus your projects that are not in the trash

Your projects that are in the trash

Only projects owned by you that are not in the trash

Page 37: Semtech web-protege-tutorial

Projects

A project encompasses:

• A collection of ontologies

• Notes & discussions and watches

• Some user interface settings

• Some sharing settings

• A list of revisions and a log of changes

Page 38: Semtech web-protege-tutorial

Creating a Project

Create New Project

Project name - does not need to be unique

Project description - appears in the project list

Page 39: Semtech web-protege-tutorial

Uploading a Project

Upload Project

Project name - does not need to be unique

Project description - appears in the project list

Local OWL file name

Page 40: Semtech web-protege-tutorial

Sharing

Share link (top right corner)

Page 41: Semtech web-protege-tutorial

Public Projects

➊ Select public

➋ Assign permissions for anyone including guests

➌ Assign more fine-grained access for specific users. Enter names in list and press “Add”

Page 42: Semtech web-protege-tutorial

Private Projects

➊ Select private

Access is restricted to specific users

➋ Assign more permissions for specific users. Enter names in list and press “Add”
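The two sharing slides combine a default permission (for anyone, including guests) with per-user permissions. A minimal sketch of how such settings could be resolved is shown below. The permission names, their ordering and the `SharingSettings` type are assumptions for illustration and are not WebProtégé's actual model.

```python
from dataclasses import dataclass, field

# Assumed permission levels, ordered from weakest to strongest.
LEVELS = ["none", "view", "comment", "edit"]


@dataclass
class SharingSettings:
    default_permission: str = "none"  # applies to anyone, including guests
    user_permissions: dict = field(default_factory=dict)

    def permission_for(self, user):
        """Return the stronger of the default and the user-specific level."""
        specific = self.user_permissions.get(user, "none")
        return max(specific, self.default_permission, key=LEVELS.index)

    def can(self, user, action):
        return LEVELS.index(self.permission_for(user)) >= LEVELS.index(action)


# Public project: everyone may view, one named user may edit.
public = SharingSettings("view", {"tania": "edit"})
# Private project: no default access, only listed users get in.
private = SharingSettings("none", {"matthew": "edit", "natasha": "comment"})

print(public.can("guest", "view"), private.can("guest", "view"))
```

In this model a private project is simply one whose default permission is "none".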

Page 43: Semtech web-protege-tutorial

Editing Class Descriptions

Class tree

Editor (similar for properties and individuals)

Notes & Discussions

Project feed

Page 44: Semtech web-protege-tutorial

Adding Subclasses

Create subclasses button

Enter one or more names (one class name per line). Press CTRL+Enter to accept and close

Page 45: Semtech web-protege-tutorial

Editing Class Descriptions

Display name - corresponds to the value of rdfs:label here

IRI - Internationalized Resource Identifier. Auto-generated, globally unique

“Property values” (class expressions under the hood - rdfs:subClassOf)

Annotation assertions

Values can be class names, datatype names, individual names, numbers, dates and strings

Language editor for plain literals

Delete row
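To make the mapping on this slide concrete, the sketch below writes out the triples that a display name and a parent class correspond to. The display name becomes an `rdfs:label` annotation and the parent becomes an `rdfs:subClassOf` statement (the RDF Schema property behind subclass axioms). The `http://example.org/newspaper#` namespace and the `class_triples` helper are hypothetical. WebProtégé generates its own IRIs, and label escaping is omitted here.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
EX = "http://example.org/newspaper#"  # hypothetical namespace


def class_triples(local_name, label, parent=None, lang="en"):
    """Return N-Triples-style lines describing one class."""
    iri = f"<{EX}{local_name}>"
    triples = [
        f"{iri} <{RDF}type> <{OWL}Class> .",
        # Display name: a language-tagged rdfs:label annotation.
        f'{iri} <{RDFS}label> "{label}"@{lang} .',
    ]
    if parent is not None:
        triples.append(f"{iri} <{RDFS}subClassOf> <{EX}{parent}> .")
    return triples


for line in class_triples("Editorial", "Editorial", parent="Article"):
    print(line)
```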

Page 46: Semtech web-protege-tutorial

Auto-Completion

Type in name. Popup shows possible matches. Dublin Core and SKOS properties “recognised”
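The popup behaviour can be approximated with a case-insensitive prefix match over the known names. This is only a sketch of the general technique. The slides do not say how WebProtégé matches (it may, for example, also match inside names), and the list of known names below is invented.

```python
def complete(prefix, names, limit=10):
    """Return up to `limit` names that start with `prefix`, ignoring case."""
    needle = prefix.casefold()
    matches = [n for n in names if n.casefold().startswith(needle)]
    return sorted(matches, key=str.casefold)[:limit]


# Illustrative vocabulary: a few Dublin Core and SKOS properties plus a project property.
known = ["dc:title", "dc:creator", "skos:prefLabel", "skos:altLabel", "editedBy"]

print(complete("skos:", known))
print(complete("DC", known))
```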

Page 47: Semtech web-protege-tutorial

On-the-Fly Creation

New property warning (helps prevent typos!)

Press the tab key and enter value to create property (property type will be determined from the value)

Page 48: Semtech web-protege-tutorial

Editing Individual Descriptions

Class tree

Editor

Notes & Discussions

Project feed

Page 49: Semtech web-protege-tutorial

Editing Individual Descriptions

Display name - corresponds to the value of rdfs:label here

IRI - Internationalized Resource Identifier. Auto-generated, globally unique

“Property values” (annotations, property assertions or class expressions under the hood - rdfs:subClassOf)

Type assertions (rdf:type)

Values can be class names, datatype names, individual names, numbers, dates and strings

Delete row

Same individuals (owl:sameAs)

Page 50: Semtech web-protege-tutorial

Icon Cheat Sheet

Class

Individual (named)

Datatype (xsd:integer, xsd:double etc.)

Property (object/data property)

Annotation property

Number

Date-Time

Literal

Link (http:)

IRI

Page 51: Semtech web-protege-tutorial
Page 52: Semtech web-protege-tutorial

Hands On

Online Newspaper

Page 53: Semtech web-protege-tutorial

Modelling Task

Build an ontology to describe an online newspaper or news website, e.g. www.nyt.com or www.bbc.com

Goal: Become familiar with WebProtégé and some aspects of collaborative ontology editing

Page 54: Semtech web-protege-tutorial

Content

Model different kinds of articles and their properties. For example:

Articles:

title, author, date published, edited by, keywords/topics, published in section, media (pictures, video), external links etc.

Advertisements:

Standard ad, personal ad, service ad etc.

Page 55: Semtech web-protege-tutorial

Structure

Model the structure of a newspaper: the different sections and how they fit together. For example:

Newspaper:

date published, issue, front matter etc.

Sections:

Domestic News, World News, Editorial, Magazine, Letters, Commentary, Television Listings, Advertisements, Appointments/Jobs, Sport, Business etc.

Sections and subsections

Page 56: Semtech web-protege-tutorial

People

Model the people who contribute to the newspaper and people who are the subject of articles. For example:

Employees:

Columnist, Editor, Section Editor, Reporter, International Reporter, Manager

name, contact details: email, phone number, role

Other people:

Politician, President, Actor etc. Individual people, e.g. Barack Obama.
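One possible starting point for the exercise is sketched below as a nested mapping, printed as an indented class tree. The class names come from the examples on the Content, Structure and People slides. The top-level groupings `Content`, `Section` and `Person`, and the placement of subclasses, are assumptions and only one of many reasonable designs.

```python
# Each key is a class name; each value maps to its subclasses.
HIERARCHY = {
    "Content": {
        "Article": {},
        "Advertisement": {"Standard ad": {}, "Personal ad": {}, "Service ad": {}},
    },
    "Section": {"Domestic News": {}, "World News": {}, "Editorial": {}, "Sport": {}},
    "Person": {
        "Employee": {
            "Columnist": {},
            "Editor": {"Section Editor": {}},
            "Reporter": {"International Reporter": {}},
            "Manager": {},
        },
        "Politician": {"President": {}},
        "Actor": {},
    },
}


def tree_lines(tree, depth=0):
    """Flatten the hierarchy into indented lines, one class per line."""
    lines = []
    for name, children in tree.items():
        lines.append("  " * depth + name)
        lines.extend(tree_lines(children, depth + 1))
    return lines


def ancestors(tree, target, path=()):
    """Return the superclasses of `target` from the top down, or None if absent."""
    for name, children in tree.items():
        if name == target:
            return list(path)
        found = ancestors(children, target, path + (name,))
        if found is not None:
            return found
    return None


print("\n".join(tree_lines(HIERARCHY)))
```

The indented output uses one class name per line, the same convention as the create-subclasses dialog.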

Page 57: Semtech web-protege-tutorial
Page 58: Semtech web-protege-tutorial

Custom entry forms for editing the ontology content

● Easy to create user interfaces for the domain experts

● Use common entry forms, but still keep the ontology “intelligence” behind it

● A form widget (e.g., text field) is linked to a property in the ontology

● Easy to create custom forms with different views for different users

● Hides the complexity of the underlying ontology from domain experts
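The link between a form widget and an ontology property can be pictured as a small configuration table. The sketch below assumes a hypothetical form description. The widget kinds, the chosen properties and the two helper functions are invented for illustration, and WebProtégé's real configuration format is not shown in these slides.

```python
# Hypothetical form configuration: each widget is bound to one property.
FORM = [
    {"label": "Title", "widget": "textfield", "property": "rdfs:label"},
    {"label": "Definition", "widget": "textarea", "property": "skos:definition"},
    {"label": "Synonyms", "widget": "list", "property": "skos:altLabel"},
]


def render_form(form, property_values):
    """Pair each widget with the current values of the property it is bound to."""
    rows = []
    for entry in form:
        values = property_values.get(entry["property"], [])
        rows.append((entry["label"], entry["widget"], values))
    return rows


def visible_fields(form, hidden_properties):
    """A per-user view: drop widgets whose property is hidden for that user."""
    return [entry for entry in form if entry["property"] not in hidden_properties]


entity = {"rdfs:label": ["Editorial"], "skos:altLabel": ["Leader", "Leading article"]}
for row in render_form(FORM, entity):
    print(row)
```

The domain expert sees only labels and input fields, while every value still lands on a well-defined property.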

Page 59: Semtech web-protege-tutorial

Form configuration in WebProtégé

Form-based editing and configuration of the user interface for the development of ICD-11

http://icatdemo.stanford.edu

Page 60: Semtech web-protege-tutorial

Examples of form-based editing

Page 61: Semtech web-protege-tutorial

Importing BioPortal terms into WebProtégé

(1) Search term in BioPortal ontologies

(2) Get search results

(3) Browse details of results

(4) Import into WebProtégé with single click

Page 62: Semtech web-protege-tutorial

WebProtégé – Make Up

[Architecture diagram. Labels: Protégé Collaboration Framework; WebProtégé; WebProtégé Server; GWT RPC; server side (Java); client side (Java at development time, JavaScript at run time).]

• Two parts: server and client

• The server is implemented entirely in Java and makes API calls to the OWL API and other libraries

• The client side is developed in Java and later compiled by GWT into JavaScript

• Communication between server and client is done via GWT RPC or simple HTTP calls

Page 63: Semtech web-protege-tutorial

WebProtégé is pluggable

[Architecture diagram. WebProtégé User Interface (GWT): portlets, event manager, other managers. WebProtégé Server (Java): ontology service, notes and changes service, access policies service, ... The label “pluggable” appears twice in the diagram.]

Page 64: Semtech web-protege-tutorial

Extending WebProtégé

Plug-in infrastructure very similar to Protégé's: create your own tabs and portlets

Extend: AbstractTab or AbstractEntityPortlet

Implement your own RPCs, if needed

Reuse existing portlet code

Writing a tab – as easy as creating an empty class that extends AbstractTab

http://protegewiki.stanford.edu/wiki/WebProtegeImplementationGuide

Page 65: Semtech web-protege-tutorial

Resources

● Online WebProtégé server: http://webprotege.stanford.edu

● WebProtégé documentation: http://protegewiki.stanford.edu/wiki/WebProtege

● WebProtégé paper: “WebProtégé: A Collaborative Ontology Editor and Knowledge Acquisition Tool for the Web”, Tania Tudorache, Csongor Nyulas, Natalya F. Noy, Mark A. Musen, Semantic Web Journal (SWJ) 4 (Number 1 / 2013), 89-99

● WebProtégé in use: “Will Semantic Web Technologies Work for the Development of ICD-11?”, T. Tudorache, S. M. Falconer, C. I. Nyulas, N. F. Noy, M. A. Musen. The 9th International Semantic Web Conference, ISWC 2010 (In-Use track), Shanghai, China, Springer, 2010. http://bmir.stanford.edu/file_asset/index.php/1646/BMIR-2010-1427.pdf

● Other References: http://protegewiki.stanford.edu/wiki/WebProtege#References