Terminology Curation with the Semantic MediaWiki

Page 1: Terminology Curation  with the  Semantic MediaWiki

04/18/2007

Terminology and the Semantic MediaWiki

Ecoterm IV – Vienna, 17–18 April 2007

EcoInformatics Initiative

Terminology Curation with the Semantic MediaWiki

Harold Solbrig
Informatics Architect
Apelon, Inc.

Page 2: Terminology Curation  with the  Semantic MediaWiki

Page 3: Terminology Curation  with the  Semantic MediaWiki

The Primary Task

Evaluate the roles, categories and organization of the National Cancer Institute (NCI)’s Cancer Thesaurus with respect to:

- Upper Level Ontological Principles
- ISO TC37 & Related principles

As with Ontology construction, it was understood by all parties that this was a process – not a goal.

Page 4: Terminology Curation  with the  Semantic MediaWiki

Approach

1. Gather appropriate upper-level ontologies (BFO, DOLCE, Top Bio, UMLS Semantic Network and OBO Relation Ontology) into a single, readily referenced format

2. Load the NCI Thesaurus into the same format

3. Multiple parties review, annotate, recommend and categorize

4. Publish, analyze and evaluate results

Page 5: Terminology Curation  with the  Semantic MediaWiki

Solution

By using the Semantic MediaWiki (SMW), we were able to accomplish all of the goals in a (very) reasonable period of time

Page 6: Terminology Curation  with the  Semantic MediaWiki

Discussion

We also discovered that, with some extensions, the SMW could be useful for publishing, annotating and cross-referencing other terminological (and other…) resources.

Page 7: Terminology Curation  with the  Semantic MediaWiki

Questions?

… just kidding.

Page 8: Terminology Curation  with the  Semantic MediaWiki

Wikis

- Community developed
- Collaborative
- “Organic” – to the very core…
- Primary focus (to date) is human consumption
- Traceable: provenance automatically recorded, differences, undo and redo

Page 9: Terminology Curation  with the  Semantic MediaWiki

MediaWiki

- http://en.wikipedia.org/wiki/Wiki
- Base for Wikipedia and many others…
- Key characteristics
  - Web based editing
  - Page links
  - Categories
  - Templates

Page 10: Terminology Curation  with the  Semantic MediaWiki

MediaWiki

Fully documented using (surprise!) MediaWiki

Rich mechanisms for discussion, curation, export, etc.

Page 11: Terminology Curation  with the  Semantic MediaWiki

Page 12: Terminology Curation  with the  Semantic MediaWiki

Common constructs

[[Train Transport]] – hyperlink to page named “Train_Transport”
''Italic'', '''Bold'''
* Bullet point
[http://www.w3c.org/ The W3C] – hyperlink
… and much more
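As a rough illustration of how the constructs listed above map to HTML, the following Python sketch handles only those four constructs. The `/wiki/` link prefix and the function names are assumptions made for illustration; MediaWiki's real parser is far more involved.

```python
import html
import re


def _internal_link(match):
    # [[Train Transport]] links to the page "Train_Transport".
    title = match.group(1).strip()
    return '<a href="/wiki/{}">{}</a>'.format(title.replace(" ", "_"), title)


def render_inline(text):
    """Convert bold, italic, internal links and external links to HTML."""
    out = html.escape(text, quote=False)
    # Bold (three apostrophes) must be handled before italic (two).
    out = re.sub(r"'''(.+?)'''", r"<b>\1</b>", out)
    out = re.sub(r"''(.+?)''", r"<i>\1</i>", out)
    out = re.sub(r"\[\[([^\]|]+)\]\]", _internal_link, out)
    # [http://example.org/ Label] is an external link with a label.
    out = re.sub(r"\[(https?://\S+) ([^\]]+)\]", r'<a href="\1">\2</a>', out)
    return out


def render_line(line):
    """Render one line of wikitext; a leading '* ' marks a bullet point."""
    if line.startswith("* "):
        return "<li>" + render_inline(line[2:]) + "</li>"
    return render_inline(line)


if __name__ == "__main__":
    for sample in ["[[Train Transport]]", "''Italic'', '''Bold'''",
                   "* Bullet point", "[http://www.w3c.org/ The W3C]"]:
        print(render_line(sample))
```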

Page 13: Terminology Curation  with the  Semantic MediaWiki

Templates

Page 14: Terminology Curation  with the  Semantic MediaWiki

Templates

Page 15: Terminology Curation  with the  Semantic MediaWiki

Sample Template

Parameter

Extension call

Page 16: Terminology Curation  with the  Semantic MediaWiki

Semantic MediaWiki

Page 17: Terminology Curation  with the  Semantic MediaWiki

Semantic MediaWiki

3 key extensions to MediaWiki:

1. Categories == Class
   – PageA … [[Category:X]] → PageA rdf:type Category:X
   – Category:Y … [[Category:X]] → Category:Y rdfs:subClassOf Category:X
2. Links == Role
   – PageA … [[PageB]] → PageA … [[hasPart::PageB]]
3. Attributes == DataProperty
   – [[population:=32,154,773]]
   – Includes datatypes
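The mapping above can be sketched in a few lines of Python. This is an illustrative reading of the slide, not SMW's own export code: the function name, the tuple layout and the quoting of literal values are assumptions. The `:=` attribute syntax is the one shown on the slide; to my knowledge later SMW releases use `::` for all properties.

```python
import re

# Matches the body of any [[...]] annotation or link.
ANNOTATION = re.compile(r"\[\[([^\[\]]+)\]\]")


def triples(page, wikitext):
    """Return (subject, predicate, object) tuples for annotations on a page."""
    result = []
    for body in ANNOTATION.findall(wikitext):
        body = body.split("|", 1)[0].strip()  # drop any display label
        if body.startswith("Category:"):
            # A category page placed in a category is a subclass;
            # any other page is an instance.
            if page.startswith("Category:"):
                predicate = "rdfs:subClassOf"
            else:
                predicate = "rdf:type"
            result.append((page, predicate, body))
        elif ":=" in body:
            # Attribute (data property): the object is a literal value.
            name, value = body.split(":=", 1)
            result.append((page, name.strip(), '"' + value.strip() + '"'))
        elif "::" in body:
            # Typed link (role): the object is another page.
            name, target = body.split("::", 1)
            result.append((page, name.strip(), target.strip()))
        # Plain [[PageB]] links carry no role and are skipped.
    return result


if __name__ == "__main__":
    text = "[[Category:X]] [[hasPart::PageB]] [[population:=32,154,773]] [[PageC]]"
    for triple in triples("PageA", text):
        print(triple)
```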

Page 18: Terminology Curation  with the  Semantic MediaWiki

Categories and Relations

Page 19: Terminology Curation  with the  Semantic MediaWiki

Attributes

Page 20: Terminology Curation  with the  Semantic MediaWiki

Semantic Rendering

Type (or superClass)

Attribute Value

Relation RDF (!)

Page 21: Terminology Curation  with the  Semantic MediaWiki

Thesaurus Content

Page 22: Terminology Curation  with the  Semantic MediaWiki

Templates?

; Gene_Product_Is_Biomarker_Type

: The role is used to designate the type of …

Kind: [[:Category:NCI_Kind]]

'''Semantic Type:''' [[NCI_Semantic_Type::Category:SN_Conceptual_Entity|Conceptual Entity]]

Brittle, not readily changed…

Page 23: Terminology Curation  with the  Semantic MediaWiki

Templates?

{{OntylogDescription|ns=NCI|text=“The role is used to designate…”}}

{{Kind|ns=NCI|target=Kind}}

{{ResourceRef|name=Semantic_Type|ns=NCI|target=Conceptual_Entity|targetns=SN}}

Can readily be updated via template…
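To make the contrast with hand-written markup concrete, here is a minimal Python sketch of template expansion. The template bodies in `TEMPLATES` are assumptions: the slides show only the calls and the kind of markup they replace, not the actual definitions. The expander is deliberately simplified (no nesting, no defaults, no positional parameters).

```python
import re

# Hypothetical template bodies; {{{name}}} marks a parameter.
TEMPLATES = {
    "Kind": "Kind: [[:Category:{{{ns}}}_{{{target}}}]]",
    "ResourceRef": "[[{{{ns}}}_{{{name}}}::Category:{{{targetns}}}_{{{target}}}]]",
}

CALL = re.compile(r"\{\{([^{}|]+)((?:\|[^{}|]*)*)\}\}")
PARAM = re.compile(r"\{\{\{(\w+)\}\}\}")


def expand(wikitext, templates=TEMPLATES):
    """Replace each {{Name|key=value|...}} call with its template body."""

    def replace_call(match):
        name = match.group(1).strip()
        if name not in templates:
            return match.group(0)  # leave unknown templates untouched
        args = {}
        for part in match.group(2).split("|")[1:]:
            key, _, value = part.partition("=")
            args[key.strip()] = value.strip()
        # Substitute parameters; unknown parameters stay as written.
        return PARAM.sub(lambda p: args.get(p.group(1), p.group(0)),
                         templates[name])

    return CALL.sub(replace_call, wikitext)


if __name__ == "__main__":
    print(expand("{{Kind|ns=NCI|target=Kind}}"))
    # Changing the template once changes every page that calls it.
    bold = dict(TEMPLATES, Kind="'''Kind:''' [[:Category:{{{ns}}}_{{{target}}}]]")
    print(expand("{{Kind|ns=NCI|target=Kind}}", bold))
```

The second call in the usage example is the point of the slide: the page text stays the same, and only the template definition changes.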

Page 24: Terminology Curation  with the  Semantic MediaWiki

Commentary

Link to another NCI comment

Link to external Ontology

Categorization in external Ontology

Page 25: Terminology Curation  with the  Semantic MediaWiki

Computed

Page 26: Terminology Curation  with the  Semantic MediaWiki

How is it Working?

Very well!

Page 27: Terminology Curation  with the  Semantic MediaWiki

What can we do to improve it…

Page 28: Terminology Curation  with the  Semantic MediaWiki

Terminology

- Centrally curated
- Central to the practice of medicine
  - Insurance and reporting
  - Regulatory
  - Research
  - Clinical Practice
  - Information Sharing
- ICD-9, CPT-4, SNOMED, …

Page 29: Terminology Curation  with the  Semantic MediaWiki

Clinical Terminology

- Quality and content are important
- Needs central vetting, integration, QA
- Central model doesn’t scale
- Need input from (many) experts
- Need visible, active feedback loop

Page 30: Terminology Curation  with the  Semantic MediaWiki

Terminology Workflow 1995

[Diagram: Controlled Terminology, Curation and Distribution (Books, PDF, Lists and Tables), with numbered steps (1) to (4)]

Page 31: Terminology Curation  with the  Semantic MediaWiki

Terminology Workflow 1995

[Diagram: Controlled Terminology, ‘B’, Curation and Distribution (Books, PDF, Lists and Tables), with numbered steps (1) to (3)]

Page 32: Terminology Curation  with the  Semantic MediaWiki

Terminology Workflow 2008

[Diagram: Controlled Terminology, Curation, Distribution, Common Distribution Model and Online Services, with numbered steps (1) to (5)]

Page 33: Terminology Curation  with the  Semantic MediaWiki

Terminology Workflow 2008

[Diagram: Controlled Terminology, Curation, Distribution, Common Distribution Model, Online Services and Controlled Terminology B, with numbered steps (1) to (5)]

Page 34: Terminology Curation  with the  Semantic MediaWiki

Common Distribution Model

- LexGrid
- (a little bit of…) OWL
  - NCI Thesaurus & SNOMED CT
  - Still requires LexGrid-like additions
  - “Pushing the envelope”
- UMLS RRF
  - Although underspecified as a ‘model’

Page 35: Terminology Curation  with the  Semantic MediaWiki

Online Services

- OMG Terminology Query Services
  - Not heavily used
  - Perceived (incorrectly) as CORBA specific
  - Perceived as too complex
  - Object oriented and stateful
- ANSI Common Terminology Services
  - Being adopted
  - Necessary but not sufficient
  - Stateless
- CTS-2
  - Co-development beginning w/ HL7 & OMG

Page 36: Terminology Curation  with the  Semantic MediaWiki

Online Services

- LexBIG
  - LexGrid for the Bio Informatics Grid
  - Robust query specification
  - Meets many end-user (developer) requirements
  - Not simple to implement – it actually adds value
  - Not a standard, but will be used to guide CTS-2

Page 37: Terminology Curation  with the  Semantic MediaWiki

Workflow and Feedback

[Diagram: Controlled Terminology, Curation, Distribution, Common Distribution Model and Online Services, with numbered steps (1) to (5)]

Page 38: Terminology Curation  with the  Semantic MediaWiki

The Feedback Component

Curation

Page 39: Terminology Curation  with the  Semantic MediaWiki

The Feedback Component

[Diagram: Curation, Semantic MediaWiki (++), Annotations and Change Requests, Community Review, Distribution, Common Distribution Model, Online Services and Version Staging]

Page 40: Terminology Curation  with the  Semantic MediaWiki

Issues and Next Steps

(1) SHARED Semantics
{{Definition|…}}
{{Synonym|…}}
{{References|…}}
{{DLSome|…}}
{{DLAll|…}}
…

ISO 12620, anyone?

Page 41: Terminology Curation  with the  Semantic MediaWiki

Issues and Next Steps

(2) Figure out namespaces
- NCI:Activity, AgroVoc:Fish, …
- NCI_Activity, AgroVoc_Fish???

(2a) Identifiers (Activity vs. C12345)
(2b) Versions
(2c) URIs (vs. URLs)
- Internal
- External
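The naming options above can be compared with a small Python sketch. Everything here is hypothetical: the base URIs use `example.org` placeholders and the function names are made up for illustration; the slides do not settle on a scheme.

```python
# Hypothetical base URIs, for illustration only.
BASE_URIS = {
    "NCI": "http://example.org/nci/",
    "AgroVoc": "http://example.org/agrovoc/",
}


def parse_qualified(name):
    """Split 'NCI:Activity' into ('NCI', 'Activity')."""
    ns, sep, local = name.partition(":")
    if not sep or ns not in BASE_URIS:
        raise ValueError("unknown or missing namespace in %r" % name)
    return ns, local


def page_title(name):
    """Underscore form used as a wiki page title, e.g. 'NCI_Activity'."""
    ns, local = parse_qualified(name)
    return ns + "_" + local


def uri(name):
    """External identifier built from the namespace's base URI."""
    ns, local = parse_qualified(name)
    return BASE_URIS[ns] + local


if __name__ == "__main__":
    print(page_title("NCI:Activity"))
    print(uri("AgroVoc:Fish"))
```

One trade-off the sketch makes visible: the underscore form cannot be split back reliably, because local names such as `Semantic_Type` contain underscores themselves.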

Page 42: Terminology Curation  with the  Semantic MediaWiki

Certification and Sanctioning

- Who can edit?
- Who can validate?
- Who selects updates?
- … (see: http://en.citizendium.org/wiki/Main_Page)

Page 43: Terminology Curation  with the  Semantic MediaWiki

Automatic Export

- Selecting sets of updates
- Formatting update recommendations for target curators, etc…

Page 44: Terminology Curation  with the  Semantic MediaWiki

Synchronization

- Changes implemented in terminology
  - Update wiki pages
  - Say what changed
- What changes are incorporated by value? By reference?
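For the "say what changed" step, one possible approach is a plain text diff between the previous and the regenerated wikitext of a page. The sketch below uses Python's standard `difflib`; the function name and the sample page content are assumptions, and nothing in the slides says this is how the synchronization was or would be implemented.

```python
import difflib


def describe_change(old, new, title):
    """Return a unified diff between two versions of a page's wikitext."""
    diff = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=title + " (previous)",
        tofile=title + " (current)",
        lineterm="",
    )
    return "\n".join(diff)


if __name__ == "__main__":
    before = "{{Kind|ns=NCI|target=Kind}}\n{{Definition|text=Old wording}}"
    after = "{{Kind|ns=NCI|target=Kind}}\n{{Definition|text=New wording}}"
    print(describe_change(before, after, "NCI_Activity"))
```

An empty result means the terminology update left the page unchanged, so no notice is needed for it.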

Page 45: Terminology Curation  with the  Semantic MediaWiki

Approach and Responsible Parties

- Shared Semantics
  - Core set based on LexGrid & OWL
  - Post on WIKI and link on SMW site
  - Assigned to Apelon, Mayo, NCI, ???
  - Extend to OBO, SKOS (?), XMDR…
  - Connections to 12620

Page 46: Terminology Curation  with the  Semantic MediaWiki

Time Frame and Assignments

- URIs, namespaces, naming
  - UK NCR (CancerGrid) – looking at unAPI and servers
  - (Hopefully) can provide URI resolver svc.
  - Short term – use templates / extensions

Page 47: Terminology Curation  with the  Semantic MediaWiki

Content

- SNOMED-CT, ICD-9-CM, many, many others are already available via Apelon DTS Services
- Available soon
  - FMA, HL7 Version 3 Terminology, OBO Foundry (GO, PATO, etc.) as time permits
- Others as needed (and funded…)

Page 48: Terminology Curation  with the  Semantic MediaWiki

What we’ve got to date

- Apelon DTS Server Extension
  - Includes both defined and classified view (!)
  - Export in RESTful XML (currently Apelon, soon to be LexGrid)
- XMDR Export Format
- Protégé (Native and OWL 3.2) prototype
  - Done by Mayo
  - Both import and export
  - Still needs templates

Page 49: Terminology Curation  with the  Semantic MediaWiki

Questions?

This time for real

Note: SMW will be made externally available (w/ simple password) once we get contract specific info cleaned up (NCI will probably publish shortly)… contact: [email protected] for access.