Post on 27-Mar-2015
Introduction to the caDSR
Presented to HL7 Vocab SIG
January 24, 2005
Denise Warzel
National Cancer Institute, Center for Bioinformatics
caDSR Project Officer, Software Development
D. Warzel 2
Presentation Outline
• caCORE Overview
• ISO/IEC 11179 Overview
• caDSR Implementation and tooling
caCORE Components
Enterprise Vocabulary
Data Standards
Bioinformatics Objects
• caCORE is the open-source foundation upon which the NCICB builds its research information management systems
caCORE Infrastructure wiring
Vocabulary for CDE specification
Dictionary, thesaurus services
Domain object metadata
Common data elements
Public APIs
Common data elements (CDEs)
Presentation Outline
• caCORE Overview
• ISO/IEC 11179 Overview
• caDSR Implementation and tooling
Terms and Definitions for ISO/IEC 11179
Administered Item: A registry item for which administrative information is recorded in an Administration Record.
Data Element: A unit of data for which the definition, identification, representation, and permissible values are specified by means of a set of attributes.
Data Element Concept: An idea that can be represented in the form of a data element, described independently of any particular representation.
Value Domain: A set of attributes describing representational characteristics of instance data, with or without enumerated permissible values.
Value Meaning: A member of the set of finite allowed inventory of notions that can be categorized for a conceptual domain.
Permissible Value: An expression of a value meaning in a specific value domain.
Representation Class: A classification of data elements based upon the type of representational form.
Conceptual Domain: A set of possible value meanings of a data element expressed without representation.
Data Element Representation: The part of a data element having a value domain, datatype, and other representational specifications.
What is ISO/IEC 11179?
• ISO/IEC 11179 Parts 1-6: Information technology – Specification and Standardization of data elements
– A metamodel for ‘data element’ metadata
– Standard by which to convey semantic, syntactic and lexical meaning
• Human and machine understandable
• Unambiguous
ISO/IEC 11179 Information technology Standard
• ISO/IEC 11179 Part 1: Framework for the specification and standardization of data elements
• ISO/IEC 11179 Part 2: Classification for data elements
• ISO/IEC 11179 Part 3: Registry metamodel and basic attributes
• ISO/IEC 11179 Part 4: Rules and Guidelines for the Formulation of Data Elements
• ISO/IEC 11179 Part 5: Naming and Identification Principles for Data Elements
• ISO/IEC 11179 Part 6: Registration of data elements
• Publicly available from:
• http://isotc.iso.ch/livelink/livelink/fetch/2000/2489/Ittf_Home/PubliclyAvailableStandards.htm??Redirect=1
Basic Metamodel Components
[Figure: UML class diagram of the four basic metamodel components: Conceptual_Domain, Data_Element_Concept, Data_Element and Value_Domain. The classes are joined by associations with 1..1 and 0..* multiplicities, named data_element_concept_conceptual_domain_relationship, expression, representation and specification. Role names on the association ends: specifying, having, providing_representation_to, represented_by, represented_with, providing_representation_for, representing, specified_by.]
[Figure: the same four components in two rows. Data Element Concept and Conceptual Domain are labelled “Perception”; Data Element and Value Domain are labelled “Representation”.]
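The relationships among the four components can be sketched in code. The following is a minimal illustration, not the caDSR schema: the class and field names are hypothetical, and it assumes the usual reading of the diagram in which each Data Element refers to exactly one Data Element Concept and one Value Domain, and each of those refers to exactly one Conceptual Domain.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ConceptualDomain:
    name: str


@dataclass
class DataElementConcept:
    name: str
    conceptual_domain: ConceptualDomain  # exactly one (1..1)


@dataclass
class ValueDomain:
    name: str
    conceptual_domain: ConceptualDomain  # exactly one (1..1)
    datatype: str = "string"
    # An empty list stands for a non-enumerated value domain.
    permissible_values: List[str] = field(default_factory=list)


@dataclass
class DataElement:
    name: str
    concept: DataElementConcept  # exactly one (1..1)
    value_domain: ValueDomain    # exactly one (1..1)


def data_elements_for(concept, elements):
    """Return the 0..* data elements that refer to one data element concept."""
    return [de for de in elements if de.concept is concept]


agent_cd = ConceptualDomain("Agent")
dec = DataElementConcept("Chemopreventive Agent", agent_cd)
vd = ValueDomain("Chemopreventive Agent Name", agent_cd)
de = DataElement("Chemopreventive Agent Name", dec, vd)
print([e.name for e in data_elements_for(dec, [de])])
```

The "many" side of each association is not stored; it is recovered by filtering, which keeps the 1..1 references as the single source of truth.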
Why ISO/IEC 11179?
• “What is this datum?”
– Provides concrete guidance on the creation and maintenance of discrete data element attributes and metadata (semantics), enabling the formulation of data elements in a consistent, standard manner
• “Metadata Repository/Registry”
– A framework for data element standardization and registration allows the creation of a shared data environment in much less time and with much less effort than conventional data management methodologies
• Adoption of 11179 allowed us to “Get on with it”
ISO/IEC 11179 Administered Items
Administered_Item
Context (for administered item)
Classification_Scheme
Object_Class
Property
Data_Element_Concept
Conceptual_Domain
Data_Element
Value_Domain
Representation_Class
Derivation_Rule
ISO/IEC Administered Item
Administration Record and Common Attributes
• Unique Identifier
• Administrative Status
• Registration Status
• Creation Date
• Administrative Note(s)
• Effective Date
• Change Date(s)
• Change Description(s)
• Origin
• Until Date
• Created By
• Modified By
• Name(s)
• Definition(s)
• Stewardship Information
• Submitter Information
• Reference Document(s)
• Classifications
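A few of these attributes can be sketched as a record type. This is an illustration only: the field names, the sample identifier and status values, and the rule that an item is in effect between its Effective Date and Until Date (inclusive) are assumptions, not definitions taken from the standard or from caDSR.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class AdministrationRecord:
    unique_identifier: str
    registration_status: str
    administrative_status: str
    creation_date: date
    effective_date: Optional[date] = None
    until_date: Optional[date] = None
    change_descriptions: List[str] = field(default_factory=list)

    def is_effective(self, on: date) -> bool:
        """True when `on` lies inside the effective/until window (assumed inclusive)."""
        if self.effective_date is not None and on < self.effective_date:
            return False
        if self.until_date is not None and on > self.until_date:
            return False
        return True


record = AdministrationRecord(
    unique_identifier="EXAMPLE-0001",  # hypothetical identifier
    registration_status="Standard",
    administrative_status="Released",
    creation_date=date(2004, 11, 18),
    effective_date=date(2005, 1, 1),
)
print(record.is_effective(date(2005, 1, 24)))
```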
ISO/IEC 11179 NCICB Extensions
Administered_Item
Context (for administered item)
Classification_Scheme
Object_Class
Property
Data_Element_Concept
Conceptual_Domain
Data_Element
Value_Domain
Representation_Class
Derivation_Rule
The Concept Class Provides Semantic Linkage
Form
Concept Class
caDSR Implementation of ISO/IEC 11179 Model
Object: Agent
Property: Chemopreventive
Conceptual Domain: Agent
Data Element Concept: Chemopreventive Agent
Data Element: Chemopreventive Agent Name
Value Domain: Chemopreventive Agent Name
Context: caCORE
Representation: Name
Classification Schemes: caDSR Training
Valid Values: Cyclooxygenase Inhibitor, Doxercalciferol, Eflornithine, …, Ursodiol
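The example on this slide can be written out as a small record to show how an enumerated value domain constrains instance data. This is a sketch under assumptions: the dictionary layout and the `is_permissible` helper are hypothetical, the values elided by "…" on the slide are left out rather than guessed, and composing the data element name as concept plus representation term is inferred from this one example.

```python
# Valid values copied from the slide; entries elided there are not filled in.
VALID_VALUES = {"Cyclooxygenase Inhibitor", "Doxercalciferol", "Eflornithine", "Ursodiol"}

data_element = {
    "context": "caCORE",
    "object_class": "Agent",
    "property": "Chemopreventive",
    "data_element_concept": "Chemopreventive Agent",
    "representation": "Name",
    "value_domain": {"name": "Chemopreventive Agent Name", "valid_values": VALID_VALUES},
}

# Assumed naming pattern: data element concept followed by the representation term.
data_element["name"] = (
    f'{data_element["data_element_concept"]} {data_element["representation"]}'
)


def is_permissible(element, value):
    """Check a collected value against the element's enumerated value domain."""
    return value in element["value_domain"]["valid_values"]


print(data_element["name"], is_permissible(data_element, "Ursodiol"))
```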
NCICB Concept Class
Common Attributes
• Concept Class
– Administered Item attributes +
• Concept Unique Identifier
– Pointer to an externally defined concept
• Concept Definition Source
– Names the source terminology/ontology/vocabulary
• Concept Relationship
– Semantic order of the concepts
– NOTE: ISO describes a ‘Concept Relationship’ as a semantic link among two or more concepts. There is a subtlety in our implementation: in caDSR we use the concept relationships as more of a derivation rule, naming the order of the concepts, not semantic relationships in an ontologic or object model sense of ‘relationship’.
• Object Class, Property, Representation term, Qualifier terms, Value Domains
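The idea of a concept relationship acting as a derivation rule, that is, an ordering of concepts rather than an ontological link, can be illustrated as follows. Everything here is a hypothetical sketch: the `ConceptRef` type, the placeholder codes, and the convention that the primary concept has order 0 and is written last are assumptions, not the caDSR implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ConceptRef:
    code: str    # identifier in the external vocabulary (placeholder values below)
    source: str  # concept definition source
    name: str
    order: int   # assumed convention: 0 = primary concept, higher = qualifiers


def ordered(concepts: List[ConceptRef]) -> List[ConceptRef]:
    """Qualifiers first (highest order), primary concept last."""
    return sorted(concepts, key=lambda c: c.order, reverse=True)


def derive_name(concepts: List[ConceptRef]) -> str:
    return " ".join(c.name for c in ordered(concepts))


def derive_code(concepts: List[ConceptRef]) -> str:
    return ":".join(c.code for c in ordered(concepts))


concepts = [
    ConceptRef("CODE-A", "NCI Thesaurus", "Agent", 0),            # placeholder code
    ConceptRef("CODE-B", "NCI Thesaurus", "Chemopreventive", 1),  # placeholder code
]
print(derive_name(concepts), derive_code(concepts))
```

Because the name and code are computed from the stored order, the result does not depend on the order in which the concepts happen to be listed.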
Why are vocabularies/ontologies important?
• Goal: “Semantically unambiguous interoperability”
• Data Element curators are not necessarily vocabulary experts
• NCI had a terminology and vocabulary services group: EVS
• Semantic integration is achieved by tying standard vocabulary identifier codes to the caDSR metadata
• ISO 11179 provides the framework; we were looking for something that could be computed without a human having to read and interpret definitions
• By abstracting the curation of concepts in caDSR and instead relying on external vocabularies
EVS and caDSR Distinctions
• caDSR is a metadata repository
– Maintains metadata to permit a user to locate the correct data element defining the characteristics of a datum, an instance of a specific concept, in sufficient detail to be collected and stored on a computer
• EVS is a terminology server
– Provides services for synonymy, mapping between vocabularies, hierarchical structures, Subconcepts, Superconcepts, Roles, Semantic type, etc.
Presentation Outline
• caCORE Overview
• ISO/IEC 11179 Overview
• caDSR Implementation and tooling
caDSR Overview
• NCI Data Element Metadata repository and registry
• Based on ISO/IEC 11179
• Designed to integrate caCORE infrastructure
• Supports the development and deployment of Data Elements that are used as metadata descriptors, primarily for NCI-sponsored research, with an ever-widening range of end users
• Available as an open-source download
caDSR Tools
• Goals of caDSR Tools development:
– Simplify development and creation of ISO/IEC 11179 compliant metadata by Data Element Curators and UML Modelers
– Simplify consumption of Data Elements by end users and application developers
– Enhance reuse of Data Elements for all
– Enable semantic consistency across research domains
– Support metadata life-cycle and governance processes
caDSR Home Page
Curators Developers General
Introduction to caDSR Tools
– CDE Browser to Search for and Download
– Form Builder to Create user specified collections of CDEs
– Side-by-Side Compare
– CDE Curation Tool to Create Data Elements
– Admin Tool to Curate and Administer caDSR - “Power Users”
– Sentinel Tool (3.0)
• Generates end user ‘Alerts’ triggered by metadata changes
– Batch Load to import Administered Items
• Excel Loader (MS Excel)
• UML Loader (XMI)
• Case Report Form Loader (MS Excel)
Access, Develop, Manage, Consume
CDE Browser
• View, Search, Download
– Shopping cart feature
• FormBuilder to Build / Download Forms and Data Elements
• “Context Browsing” Tree
– By Classification Schemes
– By Forms
• CDE Basic Search Criteria
– Google-like search
– Sortable search results by clicking on column headings
[Screenshot callouts: “Context Browsing”, “Basic Search”]
CDE Browser
• Advanced Search Criteria
– Leverages ISO attributes
• Find all with “18254-3” permissible value
• Find all with “Gene*”
• Find all with “Released” workflow status
• Find all with “Standard” Registration status
• Etc.
[Screenshot callout: “Advanced Search”]
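The kinds of criteria listed above can be sketched as a filter over CDE metadata. This is not the CDE Browser's implementation or API: the record layout, the `search` function, the sample names and the "Draft"/"Candidate" status values are hypothetical. Only the search values ("18254-3", "Gene*", "Released", "Standard") come from the slide. Wildcard matching uses Python's standard `fnmatch` module and is made case-insensitive by lowercasing both sides.

```python
from fnmatch import fnmatchcase

# Hypothetical records used only to exercise the filter.
CDES = [
    {"name": "Gene Symbol", "workflow_status": "Released",
     "registration_status": "Standard", "permissible_values": []},
    {"name": "Genetic Test Code", "workflow_status": "Draft",
     "registration_status": "Candidate", "permissible_values": ["18254-3"]},
    {"name": "Chemopreventive Agent Name", "workflow_status": "Released",
     "registration_status": "Candidate", "permissible_values": ["Ursodiol"]},
]


def search(cdes, name="*", workflow_status=None, registration_status=None,
           permissible_value=None):
    """Return the CDEs that satisfy every criterion that was supplied."""
    hits = []
    for cde in cdes:
        if not fnmatchcase(cde["name"].lower(), name.lower()):
            continue
        if workflow_status and cde["workflow_status"] != workflow_status:
            continue
        if registration_status and cde["registration_status"] != registration_status:
            continue
        if permissible_value and permissible_value not in cde["permissible_values"]:
            continue
        hits.append(cde)
    return hits


print([c["name"] for c in search(CDES, name="Gene*")])
```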
Form Builder
• Create and Manage Forms
– Organize CDEs into modules within a Form
– Attach PDF or Word format
– Classify Forms into groupings for specific end user communities
– “Publish” “Un-Publish” for Browser Catalog visibility
• “Printer Friendly” version
• Download CDEs
CDE Side-by-Side Compare
• CDE Side-by-Side Compare
– Build shopping cart, compare CDE metadata side by side
– Download to Excel spreadsheet
Curation Tool
• To Create, Edit or Version:
– Data Element Concepts
– Value Domains
– Data Elements
• ISO 11179 Wizard
– Construct ISO compliant Data Elements by building up the pieces
– Builds Names and Definitions from underlying components
• “Get Associated”
– Leverage ISO to retrieve related CDEs
• “Block Edit”
• “Shopping cart”
• Assign classification schemes
• Versioning
Administration Tool
• System Administration
– User Accounts and Security
– Lists of Values (LOVs) used in content creation
• Create “Framework”:
– Conceptual Domains
– Classification Schemes (basis for organizing CDEs in Browser)
– Protocols
Sentinel Tool
• Create “Alerts”
– User defined triggers based on data element metadata attributes
– “Notify me of any change to the Value Domain for any CDE on the Adverse Event Form”
• Generates and emails a report of changes matching “Alert” criteria
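The alert described above amounts to comparing two snapshots of the metadata and reporting the changes that match a user's criteria. The sketch below shows that comparison only; it is not the Sentinel Tool's code, it does not send email, and the snapshot layout, identifiers and value domain names are all hypothetical.

```python
def build_report(alert, before, after):
    """List changes to the watched attribute for CDEs on the watched form."""
    lines = []
    attr = alert["attribute"]
    for cde_id in sorted(after):
        old, new = before.get(cde_id), after[cde_id]
        if old is None or alert["form"] not in new["forms"]:
            continue  # newly added CDE, or not on the form named in the alert
        if old[attr] != new[attr]:
            lines.append(f"{cde_id}: {attr} changed from {old[attr]!r} to {new[attr]!r}")
    return lines


# "Notify me of any change to the Value Domain for any CDE on the Adverse Event Form"
alert = {"form": "Adverse Event", "attribute": "value_domain"}

# Hypothetical before/after snapshots keyed by a made-up CDE identifier.
before = {
    "CDE-1": {"forms": ["Adverse Event"], "value_domain": "Grade v1"},
    "CDE-2": {"forms": ["Demographics"], "value_domain": "Sex v1"},
}
after = {
    "CDE-1": {"forms": ["Adverse Event"], "value_domain": "Grade v2"},
    "CDE-2": {"forms": ["Demographics"], "value_domain": "Sex v2"},
}
report = build_report(alert, before, after)
print("\n".join(report))
```

Only CDE-1 appears in the report: CDE-2 also changed, but it is not on the form named in the alert.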
Batch Loading
OC caDSR DEFAULT VALUES: Workflow status = "Released" always. Version = 1.0 always. Create Date = date loaded by Loader. Created by = EVS. Long Name = EVS Preferred Name.
Column: EVS Preferred Name | Definition | Definition Source | Database | Context Preferred Name | Effective Begin Date | Change Note | Alternate Name Type
Data type: VARCHAR2 (20) | VARCHAR2 (2000) | VARCHAR2 (2000) | VARCHAR2 (255) | VARCHAR2 (20) | VARCHAR2 (30) | VARCHAR2 (2000) | VARCHAR2 (20)
Mapped to: Long Name and Preferred Name | PreferredDefinition | Definition Source | Database | Requestors Context | YY.MM.B | Text | AlternateName.Type
Nullability: Not Null | Not Null | Null | Not Null | Not Null | Null | Null | Not Null
Row 1: Celsius Scale | The temperature scale defined by the values 0 degree Celsius for the freezing point of water and 100 degrees Celsius for the boiling point of water. The Celsius degree (C) is the same size as a Kelvin and equal to (F - 32)/1.8. To convert Celsius to Fah | NCI | NCI Thesaurus | caBIG | 11/18/2004 | Requested by Dianne Reeves | NCI_Concept_Code
Row 2: HEENT | HEENT is the Head, Ears, Eyes, Nose and Throat, and is referred to as a body system on a physical or medical examination. The term is typically used as 'HEENT' in a physician or caregiver notes. | NCI | NCI Thesaurus | caBIG | 11/18/2004 | Requested by Dianne Reeves | NCI_Concept_Code
Row 3: Gracely Pain Unpleasantness Scale | The Gracely Pain Unpleasantness Scale is a visual analog scale of 0 to 20 used by a subject to define their pain unpleasantness experience. Together with the intensity scale these tools serve to differentiate the patient's sensory perception of pain inte | NCI | NCI Thesaurus | caBIG | 11/18/2004 | Requested by Dianne Reeves | NCI_Concept_Code
• Excel Loaders
– Formatted MS Worksheet
• Administered Item
• Form
• UML Loader
– XMI representation of a UML Class Diagram
• Class → Object Class
• Attribute → Property
• Data Element Concept, Value Domain and Data Element derived from the above
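The UML Loader mapping can be illustrated on a toy model. This sketch does not parse XMI and is not the loader's actual logic: the input is a plain dictionary, the `map_uml_model` function and the `Gene` class are hypothetical, and the way names are composed (class name, then attribute name, then datatype) is an assumption made for illustration.

```python
def map_uml_model(classes):
    """Map {class_name: {attribute_name: datatype}} to metadata item names."""
    items = []
    for class_name, attributes in classes.items():
        for attr_name, datatype in attributes.items():
            # Class -> Object Class, Attribute -> Property; the rest is derived.
            concept = f"{class_name} {attr_name}"
            items.append({
                "object_class": class_name,
                "property": attr_name,
                "data_element_concept": concept,
                "value_domain": datatype,
                "data_element": f"{concept} {datatype}",
            })
    return items


model = {"Gene": {"symbol": "String", "title": "String"}}  # hypothetical UML class
for item in map_uml_model(model):
    print(item["data_element"])
```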
Current User Base
• Cancer Biomedical Informatics Grid (caBIG) – 820/466/180/61% *
• Center for Cancer Research (CCR) – 821/573/506/12%
• Clinical Data Interchange Standards Consortium (CDISC) – 3/0
• Center for Cancer Imaging (CIP) – 238/151/148/2%
• Cancer Therapy Evaluation Program (CTEP) – 8029/2432/2428/0.1%
• Division of Cancer Prevention (DCP) – 427/321/286/11%
• National Heart, Lung, and Blood Institute (NHLBI) – 0/0
• Early Detection Research Network (EDRN) – 121/1/1/100%
• Divisions of Population Sciences and Cancer Control (PS & CC) – 85/9
• Specialized Programs of Research Excellence (SPOREs) – 719/197/120/39%
• Cancer Ontologic Research Environment (caCORE) – 1028/810/810/0%
* Total CDEs in this Context / “Released” workflow status / “Released” and developed by this context / “Reused” from other contexts
Exploring
• National Institute of Neurological Disorders and Stroke (NINDS)
• National Icelandic Center for Oncology
• Cancergrid – UK
Operating Environments
• Database Repository
– Oracle 9i
• Administration Tool
– Oracle PL/SQL, Oracle 9i Application Server
• CDE Browser
– Java, Oracle 9i Application Server
• CDE Curation Tool
– Jakarta Tomcat
Support
• NCICB Help Desk
– ncicb@pop.nci.nih.gov and telephone support
• Bi-weekly Software meetings
– Hosted by Denise Warzel
– Teleconference and web-cast
• Bi-weekly Content Development Meetings
– Hosted by George Komasoulis
– Teleconference and web-cast
• Open end user requirements meetings, design reviews and prototyping/feedback sessions
• Training
– Web-cast and teleconference
Contact Information
• caDSR Home Page
– http://ncicb.nci.nih.gov/core/caDSR
• caDSR Users ListServ
– http://list.nih.gov to subscribe to caDSR_Users@list.nih.gov
• caDSR Training Home Page
– http://ncicb.nci.nih.gov/NCICB/core/caDSR/Training
• caDSR Training ListServ
– http://list.nih.gov to subscribe to caDSR_Training-L@list.nih.gov
Documentation/Recommended Reading Materials
• caDSR Homepage:
– http://ncicb.nci.nih.gov/core/caDSR
• caCORE User Application Manual:
– ftp://ftp1.nci.nih.gov/pub/cacore/NCICBapplications/NCICBAppManual.pdf
• caCORE Technical Guide:
– ftp://ftp1.nci.nih.gov/pub/cacore/caCORE2.0_Tech_Guide.pdf
– caDSR APIs
• caDSR API Guide:
– ftp://ftp1.nci.nih.gov/pub/cacore/caDSR/caCORE2.0_caDSR_API.pdf
• caDSR Business Rules
– http://ncicb.nci.nih.gov/NCICB/core/caDSR/BusinessRules
• caDSR Content Meetings
– http://ncicb.nci.nih.gov/NCICB/core/caDSR/Content
• caDSR_Users ListServ subscribe:
– http://list.nih.gov
– Send Request for caDSR Account to: ncicb@pop.nci.nih.gov
caDSR Tools Team
• NCICB
– Peter Covitz
– Denise Warzel
• ScenPro
– Bill McCurry
– Tom Phillips
– Robert Harding
– Jennifer Brush
– Larry Hebel
– Smita Hastak
• Oracle
– Edmond Mulaire
– Ram Chilukuri
– Prerna Aggarwal
– Dan Ladino
– Christophe Ludet
– Shaji Kakkodi
– Jane Jiang
• SAIC
– Kathleen Gundry
– Tommie Curtis
– Brenda Maeske