Terminology Curation with the Semantic MediaWiki

Post on 05-Jan-2016

DESCRIPTION

Terminology Curation with the Semantic MediaWiki. Harold Solbrig, Informatics Architect, Apelon, Inc. The Primary Task: Evaluate the roles, categories and organization of the National Cancer Institute (NCI)’s Cancer Thesaurus with respect to: Upper Level Ontological Principles

TRANSCRIPT

04/18/2007

Terminology and the Semantic MediaWiki

Ecoterm IV – Vienna, 17–18 April 2007

EcoInformatics Initiative

Terminology Curation with the Semantic MediaWiki

Harold Solbrig, Informatics Architect, Apelon, Inc.


The Primary Task

Evaluate the roles, categories and organization of the National Cancer Institute (NCI)’s Cancer Thesaurus with respect to:

– Upper Level Ontological Principles
– ISO TC37 & Related principles

As with Ontology construction, it was understood by all parties that this was a process – not a goal.


Approach

1. Gather appropriate upper level ontologies (BFO, DOLCE, Top Bio, UMLS Semantic Net and OBO Relations Ontology) into a single, readily referenced format

2. Load NCI Thesaurus into same format

3. Multiple parties review, annotate, recommend and categorize

4. Publish, analyze and evaluate results


Solution

By using the Semantic MediaWiki (SMW), we were able to accomplish all of the goals in a (very) reasonable period of time


Discussion

We also discovered that, with some extensions, the SMW could be useful for publishing, annotating and cross-referencing other terminological (and other…) resources.


Questions?

… just kidding.


Wikis

– Community developed
– Collaborative
– “Organic” – to the very core…
– Primary focus (to date) is human consumption
– Traceable: provenance automatically recorded, differences, undo and redo


MediaWiki

– http://en.wikipedia.org/wiki/Wiki
– Base for Wikipedia and many others…
– Key characteristics
  – Web based editing
  – Page links
  – Categories
  – Templates


MediaWiki

Fully documented using (surprise!) MediaWiki

Rich mechanisms for discussion, curation, export, etc.


Common constructs

[[Train Transport]] – hyperlink to page named “Train_Transport”

''Italic'', '''Bold'''

* Bullet point

[http://www.w3c.org/ The W3C] – hyperlink

… and much more
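To make these constructs concrete, here is a minimal Python sketch of how a wiki engine might turn them into HTML. It is an illustration only, not MediaWiki's actual parser; the function name `render_wikitext` and the `/wiki/` URL prefix are assumptions made for the example.

```python
import re


def render_wikitext(line):
    """Render a tiny subset of wiki markup to HTML (illustrative only)."""
    # Bold (''') must be handled before italic (''), since ''' contains ''.
    line = re.sub(r"'''(.+?)'''", r"<b>\1</b>", line)
    line = re.sub(r"''(.+?)''", r"<i>\1</i>", line)

    # Internal link: [[Page Name]] points at the page titled Page_Name.
    def internal_link(match):
        title = match.group(1)
        target = title.replace(" ", "_")
        return f'<a href="/wiki/{target}">{title}</a>'

    line = re.sub(r"\[\[([^\]|]+)\]\]", internal_link, line)

    # External link: [url label]
    line = re.sub(r"\[(https?://\S+) ([^\]]+)\]", r'<a href="\1">\2</a>', line)

    # Bullet point: a leading "* " becomes a list item.
    if line.startswith("* "):
        line = f"<li>{line[2:]}</li>"
    return line


if __name__ == "__main__":
    for sample in ("[[Train Transport]]",
                   "''Italic'', '''Bold'''",
                   "* Bullet point",
                   "[http://www.w3c.org/ The W3C]"):
        print(render_wikitext(sample))
```

The point of the sketch is that the markup is line-oriented and regular enough for pattern-based processing, which is what makes it practical to layer semantic annotations on top of it.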


Templates


Templates


Sample Template

Parameter

Extension call


Semantic MediaWiki


Semantic MediaWiki: 3 key extensions to MediaWiki

1. Categories == Class
   – PageA … [[Category:X]] → PageA rdf:type category:X
   – Category:Y … [[Category:X]] → category:Y rdfs:subClassOf category:X

2. Links == Role
   – PageA … [[PageB]] → PageA … [[hasPart::PageB]]

3. Attributes == DataProperty
   – [[population:=32,154,773]]
   – Includes datatypes
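The mapping from the three annotation forms to RDF-style statements can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not SMW's implementation: the function name `extract_triples` and the regular expressions are invented for the example, and the `:=` attribute syntax is the one shown on the slide.

```python
import re

# Patterns for the three annotation forms listed above (assumed, simplified).
CATEGORY = re.compile(r"\[\[Category:([^\]|]+)\]\]")
RELATION = re.compile(r"\[\[([^\]:|]+)::([^\]|]+)(?:\|[^\]]*)?\]\]")
ATTRIBUTE = re.compile(r"\[\[([^\]:|]+):=([^\]|]+)\]\]")


def extract_triples(page, wikitext):
    """Return (subject, predicate, object) tuples found in one page's text."""
    # A category tag on a category page means subclass; elsewhere it means type.
    is_category_page = page.startswith("Category:")
    triples = []
    for match in CATEGORY.finditer(wikitext):
        predicate = "rdfs:subClassOf" if is_category_page else "rdf:type"
        triples.append((page, predicate, "Category:" + match.group(1).strip()))
    for match in RELATION.finditer(wikitext):
        triples.append((page, match.group(1).strip(), match.group(2).strip()))
    for match in ATTRIBUTE.finditer(wikitext):
        triples.append((page, match.group(1).strip(), match.group(2).strip()))
    return triples


if __name__ == "__main__":
    text = "… [[Category:X]] … [[hasPart::PageB]] … [[population:=32,154,773]]"
    for triple in extract_triples("PageA", text):
        print(triple)
    for triple in extract_triples("Category:Y", "… [[Category:X]]"):
        print(triple)
```

The attribute value is kept as a raw string here; the datatype handling mentioned on the slide (numbers, dates and so on) is left out of the sketch.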


Categories and Relations


Attributes


Semantic Rendering

Type (or superClass)

Attribute Value

Relation RDF (!)


Thesaurus Content


Templates?

; Gene_Product_Is_Biomarker_Type

: The role is used to designate the type of …

Kind: [[:Category:NCI_Kind]]

'''Semantic Type:''' [[NCI_Semantic_Type::Category:SN_Conceptual_Entity|Conceptual Entity]]

Brittle, not readily changed…


Templates?

{{OntylogDescription|ns=NCI|text=“The role is used to designate…”}}

{{Kind|ns=NCI|target=Kind}}

{{ResourceRef|name=Semantic_Type|ns=NCI|target=Conceptual_Entity|targetns=SN}}

Can readily be updated via template…
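Why the template form is less brittle can be shown with a small Python sketch of template expansion. The template bodies below are hypothetical (the slides do not show the real definitions of Kind or ResourceRef), and `expand` is an invented helper, not a MediaWiki API. The idea is only that every page stores the call, so editing one template body changes what all of those pages render.

```python
import re

# Hypothetical template bodies, written to reproduce the hand-coded markup
# from the previous slide. The real template definitions are not shown.
TEMPLATES = {
    "Kind": "Kind: [[:Category:{ns}_{target}]]",
    "ResourceRef": "'''{name}:''' [[{ns}_{name}::Category:{targetns}_{target}|{target}]]",
}

# Matches a simple, non-nested call such as {{Name|key=value|key=value}}.
CALL = re.compile(r"\{\{([^|{}]+)\|([^{}]*)\}\}")


def expand(wikitext, templates):
    """Replace each known template call with its body, filling in parameters."""
    def replace_call(match):
        name = match.group(1).strip()
        if name not in templates:
            return match.group(0)  # leave unknown templates untouched
        params = dict(part.split("=", 1)
                      for part in match.group(2).split("|") if "=" in part)
        return templates[name].format(**params)

    return CALL.sub(replace_call, wikitext)


if __name__ == "__main__":
    page = "{{Kind|ns=NCI|target=Kind}}"
    print(expand(page, TEMPLATES))
    # Changing the template once changes the rendering of every page using it.
    TEMPLATES["Kind"] = "'''Kind:''' [[:Category:{ns}_{target}]]"
    print(expand(page, TEMPLATES))
```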


Commentary

Link to another NCI comment

Link to external Ontology

Categorization in external Ontology


Computed


How is it Working?

Very well!


What can we do to improve it…


Terminology

– Centrally curated
– Central to the practice of medicine
  – Insurance and reporting
  – Regulatory
  – Research
  – Clinical Practice
  – Information Sharing
– ICD-9, CPT-4, SNOMED, …


Clinical Terminology

– Quality and content are important
– Needs central vetting, integration, QA
– Central model doesn’t scale
– Need input from (many) experts
– Need visible, active feedback loop


Terminology Workflow 1995

[Diagram: Controlled Terminology, Curation, Distribution (Books, PDF, Lists and Tables); steps (1) to (4)]


Terminology Workflow 1995

[Diagram: Controlled Terminology, Controlled Terminology ‘B’, Curation, Distribution (Books, PDF, Lists and Tables); steps (1) to (3)]


Terminology Workflow 2008

[Diagram: Controlled Terminology, Curation, Distribution, Common Distribution Model, Online Services; steps (1) to (5)]


Terminology Workflow 2008

[Diagram: Controlled Terminology, Controlled Terminology B, Curation, Distribution, Common Distribution Model, Online Services; steps (1) to (5)]


Common Distribution Model

– LexGrid
– (a little bit of…) OWL
  – NCI Thesaurus & SNOMED CT
  – Still requires LexGrid-like additions
  – “Pushing the envelope”
– UMLS RRF
  – Although underspecified as a ‘model’


Online Services

– OMG Terminology Query Services
  – Not heavily used
  – Perceived (incorrectly) as CORBA specific
  – Perceived as too complex
  – Object oriented and stateful
– ANSI Common Terminology Services
  – Being adopted
  – Necessary but not sufficient
  – Stateless
– CTS-2
  – Co-development beginning w/ HL7 & OMG


Online Services

– LexBIG
  – LexGrid for the Bio Informatics Grid
  – Robust query specification
  – Meets many end-user (developers) requirements
  – Not simple to implement – it actually adds value
  – Not a standard, but will be used to guide CTS-2


Workflow and Feedback

[Diagram: Controlled Terminology, Curation, Distribution, Common Distribution Model, Online Services; steps (1) to (5)]


The Feedback Component

Curation


The Feedback Component

[Diagram: Curation, Semantic MediaWiki (++), Annotations and Change Requests, Community Review, Distribution, Common Distribution Model, Online Services, Version Staging]


Issues and Next Steps

(1) SHARED Semantics

{{Definition|…}}
{{Synonym|…}}
{{References|…}}
{{DLSome|…}}
{{DLAll|…}}
…

12620 anyone?


Issues and Next Steps

(2) Figure out namespaces
  – NCI:Activity, AgroVoc:Fish, …
  – NCI_Activity, AgroVoc_Fish???

(2a) Identifiers (Activity vs. C12345)

(2b) Versions

(2c) URIs (vs. URLs)
  – Internal
  – External
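One way to keep the naming question open while content is loaded is to treat a name such as NCI:Activity as a (namespace, local name) pair and derive both the wiki page title and the external URI from it. The Python sketch below illustrates that idea only; the base URIs are placeholders and the helper names are invented for the example.

```python
# Placeholder base URIs for illustration; not real resolver addresses.
NAMESPACE_URIS = {
    "NCI": "http://example.org/nci/",
    "AgroVoc": "http://example.org/agrovoc/",
}


def split_qname(qname):
    """Split 'NCI:Activity' into ('NCI', 'Activity')."""
    namespace, sep, local = qname.partition(":")
    if not sep or not namespace or not local:
        raise ValueError(f"expected 'namespace:name', got {qname!r}")
    return namespace, local


def page_title(qname, style="underscore"):
    """Build the wiki page title in either of the two styles under discussion."""
    namespace, local = split_qname(qname)
    if style == "namespace":
        return f"{namespace}:{local}"   # e.g. NCI:Activity
    return f"{namespace}_{local}"       # e.g. NCI_Activity


def external_uri(qname):
    """Build an external URI from the (assumed) namespace table."""
    namespace, local = split_qname(qname)
    return NAMESPACE_URIS[namespace] + local


if __name__ == "__main__":
    for name in ("NCI:Activity", "AgroVoc:Fish"):
        print(page_title(name, "namespace"), page_title(name), external_uri(name))
```

Keeping the split in one place means a later decision between the two title styles, or between names and codes such as C12345, touches a single function rather than every page.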


Certification and Sanctioning

– Who can edit?
– Who can validate?
– Who selects updates?
– … (see: http://en.citizendium.org/wiki/Main_Page)


Automatic Export

– Selecting sets of updates
– Formatting update recommendations for target curators, etc…


Synchronization

– Changes implemented in terminology
  – Update wiki pages
  – Say what changed
– What changes are incorporated by value? By reference?
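The "say what changed" step could start from a plain comparison of two terminology releases. The Python sketch below assumes each release is available as a mapping from concept identifier to its text; that representation, the function name and the sample data are all invented for illustration.

```python
def summarize_changes(old, new):
    """Compare two releases given as {concept_id: text} mappings."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(key for key in old.keys() & new.keys()
                     if old[key] != new[key])
    return {"added": added, "removed": removed, "changed": changed}


if __name__ == "__main__":
    # Made-up content; the identifiers only mimic the C12345 style.
    release_1 = {"C00001": "first definition", "C00002": "second definition"}
    release_2 = {"C00001": "first definition, revised", "C00003": "third definition"}
    print(summarize_changes(release_1, release_2))
```

A summary like this tells the wiki which pages to regenerate and which to flag as retired, and it leaves the by-value versus by-reference question to how each page consumes the result.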


Approach and Responsible Parties

– Shared Semantics
  – Core set based on LexGrid & OWL
  – Post on WIKI and link on SMW site
  – Assigned to Apelon, Mayo, NCI, ???
  – Extend to OBO, SKOS (?), XMDR…
  – Connections to 12620


Time Frame and Assignments

– URIs, namespaces, naming
  – UK NCR (CancerGrid) – looking at unAPI and servers
  – (Hopefully) can provide URI resolver svc.
  – Short term – use templates / extensions


Content

– SNOMED-CT, ICD-9-CM, many, many others are already available via Apelon DTS Services
– Available soon
  – FMA, HL7 Version 3 Terminology, OBO Foundry (GO, PATO, etc.) as time permits
– Others as needed (and funded…)


What we’ve got to date

– Apelon DTS Server Extension
  – Includes both defined and classified view (!)
  – Export in RESTful XML (currently Apelon, soon to be LexGrid)
  – XMDR Export Format
– Protégé (Native and OWL 3.2) prototype
  – Done by Mayo
  – Both import and export
  – Still needs templates


Questions?

This time for real

Note: SMW will be made externally available (w/ simple password) once we get contract specific info cleaned up (NCI will probably publish shortly)… contact: hsolbrig@apelon.com for access.
