LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

Upload: charles-atkinson

Post on 05-Jan-2016


Page 1: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

LexBIG, EVS and NCBO Browser

Publish, Query, & Browse Vocabularies in

caBIG

January 2008

Page 2: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

Agenda

• Overview
• LexGrid/LexBIG infrastructure
• Distributed LexBIG API
• EVS Services
• BioPortal Browser
• Next Steps
• LexBIG Grid Services

Page 3: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

Team Members

Mayo: Scott Bauer, James Buntrock, Sridhar Dwarkanath, Thomas Johnson, Pradip Kanjamala, Jason Leisch, Jyoti Pathak, Kevin Peterson, Craig Stancl, Traci St. Martin

Coordination & Mentorship: Brian Davis, Bob Freimuth, Tahsin Kurc, Joshua Phillips

EVS team: Johnita Beasley, Gilberto Fragoso, Frank Hartel, Steven Hunter, Wilberto Garcia, Charles Griffin, Jason Lucas, Kim Ong, John Park, Tracy Safran, Rob Wynne

Page 4: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

Grid Services

NCI/NCBO Services

LexBIG Java API

Conceptual Overview

LexGridModel & Storage

Browsers and Applications

Page 5: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

Conceptual Overview

LexGridModel & Storage

Page 6: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

LexGrid Model & Storage

[Figure: UML class diagram of the LexGrid model. Top-level areas: Coding Scheme, Concepts, Relations, Properties, Associations.
Classes: codingScheme (describable); concepts::concepts (describable); concepts::codedEntry (versionableAndDescribable); concepts::property, with concepts::comment, concepts::definition, and concepts::presentation; relations::relations (describable); relations::association; relations::associationInstance; relations::associationTarget (associatableElement).
Association ends: +concepts 0..1, +relations 0..*, +concept 1..*, +property 0..*, +association 1..*, +sourceConcept 0..*, +targetConcept 0..*.]

Example:
• T005: Virus
• T047: Disease or Syndrome
• T005 -> causes -> T047
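The containment in the model diagram can be sketched with a few plain Java classes. This is an illustrative simplification, not the LexGrid schema: the class and field names (`CodingScheme`, `CodedEntry`, `Property`, `AssociationInstance`) are assumptions chosen to echo the diagram's labels, and most attributes are omitted. The sample data is the T005/T047 example from the slide, which uses UMLS Semantic Network type identifiers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for LexGrid model elements.
final class Property {
    final String name;
    final String value;

    Property(String name, String value) {
        this.name = name;
        this.value = value;
    }
}

final class CodedEntry {
    final String code;
    final String description;
    final List<Property> properties = new ArrayList<>(); // 0..* property

    CodedEntry(String code, String description) {
        this.code = code;
        this.description = description;
    }
}

final class AssociationInstance {
    final String association; // e.g. "causes"
    final CodedEntry source;
    final CodedEntry target;

    AssociationInstance(String association, CodedEntry source, CodedEntry target) {
        this.association = association;
        this.source = source;
        this.target = target;
    }

    @Override
    public String toString() {
        return source.code + " -> " + association + " -> " + target.code;
    }
}

final class CodingScheme {
    final String name;
    final List<CodedEntry> concepts = new ArrayList<>();           // concepts
    final List<AssociationInstance> relations = new ArrayList<>(); // relations

    CodingScheme(String name) {
        this.name = name;
    }
}

final class LexGridModelSketch {
    // Builds the two-concept example from the slide.
    static CodingScheme semanticNetworkExample() {
        CodingScheme scheme = new CodingScheme("UMLS Semantic Network");
        CodedEntry virus = new CodedEntry("T005", "Virus");
        CodedEntry disease = new CodedEntry("T047", "Disease or Syndrome");
        scheme.concepts.add(virus);
        scheme.concepts.add(disease);
        scheme.relations.add(new AssociationInstance("causes", virus, disease));
        return scheme;
    }

    public static void main(String[] args) {
        for (AssociationInstance a : semanticNetworkExample().relations) {
            System.out.println(a); // prints "T005 -> causes -> T047"
        }
    }
}
```

Keeping associations as separate objects that point at coded entries, rather than as fields on the entries, is what lets one model hold both lexical content (properties) and logical content (relations).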

Page 7: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

• Coverage
  • Lexical Semantics (e.g. names, ids, definitions, comments)
  • Logical Semantics (e.g. associations, qualifiers)
  • Context (e.g. language, source)

• Supports Load and Representation of …
  • Web Ontology Language (OWL)
  • Open Biomedical Ontologies (OBO)
  • UMLS Rich Release Format (RRF)
  • Protégé Frames (various, requires custom loader for each unique flavor)
  • XML & Text (various, requires custom loader for each unique flavor)

• Available Renderings of the Model
  • XML Schema (master)
  • Formal (XMI, UML)
  • Data Storage/Schema (e.g. RDB, LDAP)
  • Technology-specific (e.g. Castor, EMF)

LexGrid Model & Storage

Page 8: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

LexBIG Java API

Conceptual Overview

LexGridModel & Storage

Page 9: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

LexBIG - API

[Figure: LexBIG architecture. Content import and export formats: OWL, XML, OBO, Text, Protégé, RRF, Other. Representation: LexGrid Vocabulary Model and Data Repository, with File System (Metadata & Indexes). Access: Programming Interfaces / APIs (LexBIG, CTS) and Tools and Services. Apps reach the API through Java (embedded), Web Services, or the Distributed LexBIG API on an Application Server.]

• Each LexGrid ‘Node’ provides the software, metadata, indexes, and backing data store to service one or more vocabularies.

• Each LexBIG Installation represents one LexGrid Node and Java API to administer and query data.

Page 10: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

LexBIG - API

• API coverage
  • Administrative Functions
  • Query Available Code Systems and Metadata
  • Query Concepts, Concept Properties, and Qualifications
  • Query Concept Relationships and Qualifications

• API characteristics
  • Conscious separation of service and data classes
  • Deferred query resolution
  • Payload optimization
  • Support of iterators
  • Defined extension points (loaders, exporters, sort algorithms, match algorithms, convenience methods)

Page 11: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

LexBIG – API Example

• Prerequisites
  • ICD-9-CM loaded from UMLS distribution (RRF source files)

• Target a code system
  • Define a concept ‘space’ (a codedNodeSet) for ICD-9-CM, version 2007
  • Initially unrestricted and unresolved

• Restrict the space by adding constraints
  • Property ‘Semantic Type’ -> exact match -> “Disease or Syndrome”
  • Primary text match -> sounds like -> ‘infeksion’
  • Any text stemmed match -> ‘classify’ (to match ‘classified’, ‘classifying’, etc.)
  • Must contain a property with name -> ‘UMLS_CUI’
  • Concept must be active

• Indicate sort preferences and limit number returned
  • Sort by code, ascending
  • Limit to top 5

• Resolve!
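The restrict, sort, limit, resolve sequence above can be illustrated without a LexBIG installation. The following is a self-contained in-memory analogy, not LexBIG code: `ConceptSpace`, `Concept`, the codes, and the sample data are all hypothetical, and the phonetic and stemmed matches are replaced by a plain substring test. What it shows is the deferred style: constraints are only recorded until `resolve` is called.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical concept record; not a LexBIG class.
final class Concept {
    final String code;
    final String text;
    final String semanticType;
    final boolean active;

    Concept(String code, String text, String semanticType, boolean active) {
        this.code = code;
        this.text = text;
        this.semanticType = semanticType;
        this.active = active;
    }
}

// Hypothetical analogue of a coded node set: unrestricted and unresolved at first.
final class ConceptSpace {
    private final List<Concept> source;
    private final List<Predicate<Concept>> restrictions = new ArrayList<>();

    ConceptSpace(List<Concept> source) {
        this.source = source;
    }

    // Records a constraint; nothing is evaluated yet.
    ConceptSpace restrict(Predicate<Concept> constraint) {
        restrictions.add(constraint);
        return this;
    }

    // Applies every recorded constraint, then sorts and truncates.
    List<Concept> resolve(Comparator<Concept> order, int limit) {
        return source.stream()
                .filter(c -> restrictions.stream().allMatch(r -> r.test(c)))
                .sorted(order)
                .limit(limit)
                .collect(Collectors.toList());
    }
}

final class ConceptSpaceDemo {
    static List<String> run() {
        List<Concept> data = Arrays.asList(
                new Concept("C3", "Viral infection", "Disease or Syndrome", true),
                new Concept("C1", "Bacterial infection", "Disease or Syndrome", true),
                new Concept("C2", "Infection control", "Health Care Activity", true),
                new Concept("C4", "Retired infection term", "Disease or Syndrome", false));

        ConceptSpace space = new ConceptSpace(data)
                .restrict(c -> c.semanticType.equals("Disease or Syndrome"))
                .restrict(c -> c.text.toLowerCase().contains("infection"))
                .restrict(c -> c.active);

        // Sort by code, ascending, and keep at most 5 results.
        return space.resolve(Comparator.comparing((Concept c) -> c.code), 5)
                .stream()
                .map(c -> c.code)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [C1, C3]
    }
}
```

Because the constraints are held as data until resolution, a real service is free to combine them into a single index or database query instead of filtering step by step.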

Page 12: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

LexBIG – API Example

• OK, now find some relationships…

• Target a code system
  • Define an unrestricted graph for a target ontology (e.g. ICD-9-CM)

• Restrict by adding constraints
  • Restrict to parent/child relationships (UMLS-defined ‘PAR’ = has parent)
  • Restrict to the codedNodeSet defined in the previous example

• Indicate extent of navigation
  • Maximum 2 levels, moving in forward direction
  • Maximum 50 nodes resolved overall

• Resolve!
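The "extent of navigation" limits above amount to a bounded graph walk. The sketch below is a generic breadth-first traversal with a depth cap and a node cap; it is an assumption about how such limits behave, not the LexBIG implementation, and `BoundedGraphWalk`, `resolve`, and the sample edges are hypothetical names and data.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class BoundedGraphWalk {
    // Walks forward edges breadth-first from start, descending at most
    // maxDepth levels and returning at most maxNodes nodes in visit order.
    static List<String> resolve(Map<String, List<String>> edges, String start,
                                int maxDepth, int maxNodes) {
        Set<String> visited = new LinkedHashSet<>();
        Map<String, Integer> depth = new HashMap<>();
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(start);
        depth.put(start, 0);

        while (!frontier.isEmpty() && visited.size() < maxNodes) {
            String node = frontier.poll();
            visited.add(node);
            int d = depth.get(node);
            if (d == maxDepth) {
                continue; // do not expand below the depth limit
            }
            for (String next : edges.getOrDefault(node, Collections.<String>emptyList())) {
                if (!depth.containsKey(next)) { // enqueue each node once
                    depth.put(next, d + 1);
                    frontier.add(next);
                }
            }
        }
        return new ArrayList<>(visited);
    }

    // Hypothetical graph: A -> B, A -> C, B -> D, D -> E.
    static Map<String, List<String>> sampleEdges() {
        Map<String, List<String>> edges = new HashMap<>();
        edges.put("A", Arrays.asList("B", "C"));
        edges.put("B", Arrays.asList("D"));
        edges.put("D", Arrays.asList("E"));
        return edges;
    }

    public static void main(String[] args) {
        // Two levels below A, at most 50 nodes: E sits at level 3 and is skipped.
        System.out.println(resolve(sampleEdges(), "A", 2, 50)); // prints [A, B, C, D]
    }
}
```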

Page 13: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

• Example of Work in Progress for Mar 2008 release…

• Feature Request: Provide a more convenient way to query hierarchies with different underlying hierarchical names and structures.

• Solution: Register additional Metadata indicating supported hierarchical relationships, root nodes, and direction of navigation to traverse from parent to child. Provide relation-independent and direction-agnostic convenience methods to allow tree building without need to know specific behavior of each ontology.

LexBIG – API

[Figure: three example hierarchies rooted at R1 over nodes C1 to C5, each using different association names: hasSubtype; ‘CHD’; and isa together with developsFrom.]

Page 14: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

• Solution: Register additional Metadata indicating supported hierarchical relationships, root nodes, and direction of navigation to traverse from parent to child. Provide relation-independent and direction-agnostic convenience methods to allow tree building without need to know specific behavior of each ontology.

• supportedHierarchy #1
  • Hierarchy ID = is_a
  • Root = R1
  • Association = hasSubtype
  • ForwardNavigable = true

• supportedHierarchy #2
  • Hierarchy ID = is_a
  • Root = R1
  • Association = CHD
  • ForwardNavigable = true

• supportedHierarchy #3
  • Hierarchy ID = is_a
  • Root = R1
  • Association = isa, develops_from
  • ForwardNavigable = false

• New Convenience Methods…
  • Resolving root of the is_a type hierarchy is always ‘R1’
  • Resolving first level of tree always provides ‘C2’ & ‘C3’
  • Etc…

LexBIG – API
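One way to picture the registered hierarchy metadata is as a small descriptor that hides both the association names and the edge direction from the caller. The sketch below is an assumption about the idea, not the LexBIG convenience methods: `SupportedHierarchy`, `Edge`, `childrenOf`, and the sample edges are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical directed edge: source -> association -> target.
final class Edge {
    final String source;
    final String association;
    final String target;

    Edge(String source, String association, String target) {
        this.source = source;
        this.association = association;
        this.target = target;
    }
}

// Hypothetical mirror of the 'supportedHierarchy' metadata on the slide.
final class SupportedHierarchy {
    final String id;
    final String root;
    final List<String> associations;
    final boolean forwardNavigable;

    SupportedHierarchy(String id, String root, List<String> associations,
                       boolean forwardNavigable) {
        this.id = id;
        this.root = root;
        this.associations = associations;
        this.forwardNavigable = forwardNavigable;
    }

    // Returns the children of node. When forwardNavigable is true the parent
    // is the edge source; otherwise the parent is the edge target.
    List<String> childrenOf(String node, List<Edge> edges) {
        List<String> children = new ArrayList<>();
        for (Edge e : edges) {
            if (!associations.contains(e.association)) {
                continue;
            }
            if (forwardNavigable && e.source.equals(node)) {
                children.add(e.target);
            } else if (!forwardNavigable && e.target.equals(node)) {
                children.add(e.source);
            }
        }
        return children;
    }
}

final class HierarchyDemo {
    // Parent-to-child edges, as in a 'hasSubtype' style ontology.
    static List<String> forwardChildren() {
        List<Edge> edges = Arrays.asList(
                new Edge("R1", "hasSubtype", "C1"),
                new Edge("R1", "hasSubtype", "C2"));
        SupportedHierarchy h = new SupportedHierarchy(
                "is_a", "R1", Arrays.asList("hasSubtype"), true);
        return h.childrenOf(h.root, edges);
    }

    // Child-to-parent edges, as in an 'isa' style ontology.
    static List<String> reverseChildren() {
        List<Edge> edges = Arrays.asList(
                new Edge("C2", "isa", "R1"),
                new Edge("C3", "develops_from", "R1"));
        SupportedHierarchy h = new SupportedHierarchy(
                "is_a", "R1", Arrays.asList("isa", "develops_from"), false);
        return h.childrenOf(h.root, edges);
    }

    public static void main(String[] args) {
        System.out.println(forwardChildren()); // prints [C1, C2]
        System.out.println(reverseChildren()); // prints [C2, C3]
    }
}
```

The caller asks for children in both cases; only the metadata knows which association names apply and which end of the edge is the parent.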

Page 15: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

EVS caCORE APIs

LexBIG Java API

Conceptual Overview

LexGridModel & Storage

Page 16: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

EVS caCORE API - Distributed LexBIG

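[Figure: two ways of reaching LexBIG.
• Direct Invocation: LexBIG in Local JVM. The LexBIG Install talks to the Database Server over JDBC.
• Distributed API: LexBIG Distributed API Client in Local JVM. The LexBIG API Proxy reaches the caCORE EVS Server through Spring Remoting; the server's LexBIG Install talks to the Database Server over JDBC.]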

Page 17: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

• Query-by-example (QBE) system

• EVS 3.2 model

• Java

• Web Services

• REST (HTTP / XML)

[Figure: caCORE EVS Server. Client interfaces: Web Services, XML / HTML, Java QBE. Inside the server: Service Layer, Cache, DAO, and a LexBIG Install, which reaches the Database Server over JDBC.]

EVS caCORE API – QBE

Page 18: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

Grid Services

EVS caCORE APIs

LexBIG Java API

Conceptual Overview

LexGridModel & Storage

Page 19: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

Grid Services – EVS 3.1 Data Model

• The EVS Grid Service is accessible via the caGRID Portal: http://cagrid-portal.nci.nih.gov

• The following is a list of the operations available via the EVS Grid Service.

Operation Name              Description
==============              ===========
getHistoryRecords           Searches a valid vocabulary in NCI Thesaurus for history information.
searchSourceByCode          Searches the Metathesaurus based on source code.
searchMetaThesaurus         Searches NCI Metathesaurus and returns Metathesaurus information that meets the search criteria.
getMetaSources              Returns all Metathesaurus sources contained in the EVS.
searchDescLogicConcept      Searches a valid vocabulary such as NCI Thesaurus and returns Description Logic concepts that meet the search criteria.
getServiceSecurityMetadata  Returns the service's security metadata.
getVocabularyNames          Returns all the vocabularies present in the Description Logic in caCORE 3.1 EVS service.

• The next slide is a screenshot of the EVSGridService information viewable from the caGRID Portal.
• The next release of caCORE/EVS 4 will support the full EVS 3.2 model on the Grid.

Page 20: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008
Page 21: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

[Figure: a Client, a caGrid Node hosting the Service, and a LexBIG Installation, with the call flow below.]

• Client invokes caGrid Service
• caGrid Service uses Distributed LexBIG to implement call
• Distributed LexBIG returns requested information to caGrid Service
• caGrid Service returns response to client

Grid Services – LexBIG Prototype

Page 22: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

LexBIG Model in Introduce

The LexBIG Model is loaded into the Introduce Grid Service Authoring Toolkit via XSD files.

Grid Services – LexBIG Prototype

Page 23: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

Using the LexBIG Model in a caGrid Service

Services can then be defined using the imported types.

LexBIG Model types loaded from XSDs

Defining Services

‘CodingSchemeRenderingList’ is used as the output of caGrid Service ‘getSupportedCodingSchemes()’

Grid Services – LexBIG Prototype

Page 24: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

Sample caGrid Service call ‘getSupportedCodingSchemes()’

Participants: Client, caGrid Service, Distributed LexBIG

• Client calls caGrid ‘getSupportedCodingSchemes()’
• caGrid Service calls Distributed LexBIG ‘getSupportedCodingSchemes()’
• Distributed LexBIG returns result of call to caGrid Service
• Results are returned to client with all appropriate caGrid security mechanisms
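The call sequence on this slide is plain delegation: the grid-facing service forwards the request to a Distributed LexBIG client and hands back the result. A minimal sketch of that shape follows. The interface and class names are hypothetical, the return type is simplified to a list of names (the slides state the real output type is ‘CodingSchemeRenderingList’), the scheme names are sample data, and caGrid security handling is omitted.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the Distributed LexBIG client interface.
interface DistributedLexBig {
    List<String> getSupportedCodingSchemes();
}

// Hypothetical grid-facing service that delegates each call to the backend.
final class CodingSchemeGridService {
    private final DistributedLexBig backend;

    CodingSchemeGridService(DistributedLexBig backend) {
        this.backend = backend;
    }

    List<String> getSupportedCodingSchemes() {
        // A real grid service would apply its security mechanisms around this call.
        return backend.getSupportedCodingSchemes();
    }
}

final class GridCallDemo {
    static List<String> run() {
        // Stub backend standing in for a remote LexBIG installation.
        DistributedLexBig stub = () -> Arrays.asList("NCI_Thesaurus", "ICD-9-CM");
        return new CodingSchemeGridService(stub).getSupportedCodingSchemes();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [NCI_Thesaurus, ICD-9-CM]
    }
}
```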

Grid Services – LexBIG Prototype

Page 25: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

Creating a caGrid Service

With the model loaded and methods created, the service can then be deployed to a caGrid Node.

caGrid Node

Introduce Toolkit Output: A deployed service to the Grid, plus client software to access the service.

Page 26: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

Grid Services

EVS caCORE APIs

LexBIG Java API

Conceptual Overview

LexGridModel & Storage

Browsers and Applications

Page 27: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

LexBIG GUI

Page 28: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

Stand-alone terminologies

Individual Terminologies
Gene Ontology
HL7
Medical Dictionary for Regulatory Activities Terminology (MedDRA)
National Drug File - Reference Terminology
NCI MetaThesaurus
NCI Thesaurus
SNOMED Clinical Terms
The MGED Ontology
UMLS Semantic Network
Zebrafish

Metathesaurus terminologies
BioCarta Terms Derived from online maps of molecular relationships, adapted for NCI use, 0601C
Clinical Bioinformatics Ontology, June 2005
Canonical Clinical Problem Statement System, 1999
Clinical Classifications Software, 2003
Clinical Data Interchange Standards Consortium, 0601C
COSTAR, 1989-1995
CRISP Thesaurus, 2004
Common Terminology Criteria for Adverse Events, 2003
Cancer Therapy Evaluation Program (CTEP), 2004
DSM-IV, 1994
NCI Developmental Therapeutics Program, 0601C
Expression Library Classification
Food and Drug Administration, 0601C
Gene Ontology, 2004_03_02
Healthcare Common Procedure Coding System, 2005
Home Health Care Classification, 2003
Health Level Seven Vocabulary, 1998-2002
ICPC2E-ICD10 relationships from Dr. Henk Lamberts, 1998
HUGO Gene Nomenclature, 2004_04
ICD10, 1998
ICD-9-CM, 2005
International Classification of Diseases for Oncology (ICD)
ICPC2 - ICD10 Thesaurus, 200403
International Classification of Primary Care, 1993
Online Congenital Multiple Anomaly/Mental Retardation Syndromes, 1999
NCI Mouse Terminology, 0601C
KEGG Pathway Database, 0601C
LOINC 2.13
MEDLINE (1995-1999)
McMaster University Epidemiology Terms, 1992
Mitelman Database of Chromosome Aberrations in Cancer (MDBCAC), 2005_12
MedDRA, 6.0
MEDLINE (2000-2005)
MedlinePlus Health Topics_2004_08_14, 20040814
Online Mendelian Inheritance in Man, 1993
Multum MediSource Lexicon, 2004_03
Medical Subject Headings, MSH2005_2004_10_12
UMLS Metathesaurus
Metathesaurus FDA National Drug Code Directory, 2004_01
Metathesaurus additional entry terms for ICD-9-CM, 2005, 2005
ICPC2 - ICD10 Thesaurus, 7-bit Equivalents, 0403
ICPC2 - ICD10 Thesaurus, American English Equivalents, 0403
Metathesaurus Version of Minimal Standard Terminology Digestive Endoscopy, 2001
Metathesaurus forms of SNOMED Clinical Terms, 2004_01_31
NCBI Taxonomy, 2004_09_30
NCI modified Common Terminology Criteria for Adverse Events v3.0, 2003
NCI-GLOSS (Cancer.gov Dictionary), 0601C
National Cancer Institute Thesaurus, 2006_01C
NCI Metathesaurus
NCI SEER ICD Neoplasm Code Mappings, 1999
National Drug File - Reference Terminology, 2004_01
National Library of Medicine Medline Data
Omaha System, 1994
Physician Data Query, 2005_12
Portfolio Management Application (PMA), 2003
Quick Medical Reference (QMR), 1996
QMR clinically related terms from Randolph A. Miller, 1999
RXNORM Project, META2005AA Cumulative Update 2004_11_17, 2005AA
SNOMEDCT Clinical Terms, 2004_01_31
Standard Product Nomenclature, 2003
Metathesaurus Source Terminology Names
UMDNS: product category thesaurus, 2005
University of Washington Digital Anatomist, 1.7.3

Page 29: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

NCI BioPortal

Page 30: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

NCI BioPortal

• http://bioportal.nci.nih.gov/ncbo/faces/index.xhtml

• Encourage use and feedback

• Notes …

• General query, navigation, and visualization in place

• Some operations not performing well and under investigation

• Suspect bottlenecks in Distributed LexBIG API layer; compared with NCBO implementation which works directly against the LexBIG Java API.

Page 31: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

Future Browser Support

• Invitation for Participation!
  • NCI Terminology Open Portal (TOP) Project
  • https://gforge.nci.nih.gov/projects/openportal/
  • Every other Friday 2pm Eastern

• The OpenPortal is a collaborative effort to develop an open, site-neutral, and easily extensible Web service allowing users to browse, search, and visualize ontologies stored in LexGrid repositories.

• Participation from NCBO & others.
• Will inform future changes to the LexBIG model and service layers.

Page 32: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

Project Links

• LexBIG Project

http://gforge.nci.nih.gov/projects/lexbig/

• NCI BioPortal Project

https://gforge.nci.nih.gov/projects/lex-browser/

• NCI BioPortal Site

http://bioportal.nci.nih.gov/ncbo/faces/index.xhtml

• Open Terminology Portal Project

https://gforge.nci.nih.gov/projects/openportal/

Page 34: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

Additional Materials

Page 35: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

System.out.println("Example double restriction query with additional application of sort criteria and restricted return values");

// Declare the service...
LexBIGService lbs = new LexBIGServiceImpl();

// Start with an unconstrained set of all codes for the vocabulary...
CodedNodeSet cns = lbs.getCodingSchemeConcepts("NCI_Thesaurus", null, false);

// Constrain to concepts with designations (assigned text presentations)
// that contain text that sounds like 'heart ventricle'
cns.restrictToMatchingDesignations(
    "hart ventrikle",
    SearchDesignationOption.ALL,
    MatchAlgorithms.DoubleMetaphoneLuceneQuery.toString(),
    null);

// Further restrict the results to concepts with a semantic type of
// 'Anatomical Structure'.
cns.restrictToMatchingProperties(
    Constructors.createLocalNameList("Semantic_Type"),
    "Anatomical Structure",
    "exactMatch",
    null);

LexBIG API - Example

Page 36: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

// Indicate that the resulting list should be sorted,
// with best results first and then sorted by code if there is a tie.
SortOptionList sortCriteria =
    Constructors.createSortOptionList(new String[]{"matchToQuery", "code"});

// Indicate to return only the assigned UMLS_CUI and textualPresentation properties.
LocalNameList restrictTo =
    ConvenienceMethods.createLocalNameList(new String[]{"UMLS_CUI", "textualPresentation"});

// Still nothing computed yet!
// Perform the query and resolve the sorted/filtered list,
// with a maximum of 6 items returned ...
ResolvedConceptReferenceList list = cns.resolveToList(sortCriteria, restrictTo, 6);

// Print the results ...
ResolvedConceptReference[] rcr = list.getResolvedConceptReference();
for (ResolvedConceptReference rc : rcr)
    System.out.println("Resolved Concept: " + ObjectToString.toString(rc));

LexBIG API - Example

Page 37: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

LexBIG Model Harmonization

• These slides are examples of the harmonization process.

Page 38: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

ConceptReferenceList as it appeared in the original EA representation

class Collections

«XSDcomplexType» ConceptReferenceList
  «XSDelement»
  + ConceptReferenceCollection: lbCore:ConceptReference [0]
  + id: long

LexBIG Model Harmonization

Page 39: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

ConceptReferenceList following harmonization to caCORE modeling requirements

class Collections

«XSDcomplexType» ResolvedConceptReferenceList
  + id: Long
  «XSDattribute»
  + incomplete: boolean

«XSDcomplexType» ConceptReferenceList
  - id: Long

«XSDcomplexType» Core::ResolvedConceptReference
  «XSDattribute»
  + codingSchemeURN: String
  + codingSchemeVersion: String

«XSDcomplexType» Core::ConceptReference
  - id: Long
  «XSDattribute»
  + codingScheme: String
  + conceptCode: String

Relationships: «XSDextension»

Association ends:
  +ResolvedConceptReferenceList 0..1
  +ResolvedConceptReferenceCollection 0..*
  +ConceptReferenceList 0..1
  +ConceptReferenceCollection 1..*

LexBIG Model Harmonization

Page 40: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

SIW output prior to harmonization and annotation.

LexBIG Model Harmonization

Page 41: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

After Harmonization and prior to first pass of the annotation process

LexBIG Model Harmonization

Page 42: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

Results of the first pass of the SIW automated annotation tool.

LexBIG Model Harmonization

Page 43: LexBIG, EVS and NCBO Browser Publish, Query, & Browse Vocabularies in caBIG January 2008

Documentation of Harmonization requirements

LexBIG Model Harmonization