New Metadata Standards Workshop Open Forum 2008 Sydney Australia Hosts: Denise Warzel, Gail Hodge

Page 1: New Metadata  Standards

New Metadata Standards

Workshop

Open Forum 2008

Sydney Australia

Hosts: Denise Warzel, Gail Hodge

Page 2: New Metadata  Standards

Many organizations have recognized the need to develop and share programmatically accessible terminologies, services and data collection forms. This workshop will focus on building a consensus and plan for advancing the development of new metadata registry standards for these types of objects: Services, Terminologies and Forms.

Accurate, unambiguous and verifiable metadata is a prerequisite for writing data processing systems that can discover and use these types of resources. With appropriate metadata standards for these types of objects, systems could be designed to allow owners to register and share electronically the essential characteristics necessary for a common understanding of what these resources are and how they are intended to be used. While ISO/IEC 11179 addresses unambiguous representation and registration of data elements, no such standard exists for these important objects, which are essential for developing interoperable information systems. A standard metadata framework is needed to describe the essential characteristics of these types of objects in ways that can be compared and interpreted.

The envisioned metadata standards would be similar to the ISO/IEC 11179 Metadata Registries standard Part 3 in which the registry and basic attributes of each type of object are defined and can be specified and registered in a metadata registry. This session will facilitate developing a common understanding of some of these characteristics for each type of object. Participants will share existing metamodels and emerging specifications, discuss common needs, approaches and priorities and possible next steps that could lead to the development of new ISO standards in these areas.

This session would be of interest to information developers, information managers, data administrators, standards developers and others who are responsible for designing systems in which these types of metadata objects are understandable and shareable.

Page 3: New Metadata  Standards

Many organizations have recognized the need to develop and share programmatically accessible terminologies, services and data collection forms. This workshop will focus on building a consensus and plan for advancing the development of new metadata registry standards for these types of objects: Services, Terminologies and Forms.

Type of Metadata Standard (Service, Terminology, Form or Other):

Existing/Recommended metamodel:

Benefits:

Use Cases/Justification if this standard existed:

Known initiatives/individuals that could be pooled to create a Functional Specification:

Plans for Implementation:

Suggested next steps if known:

Page 4: New Metadata  Standards

Goals for Today

• Discuss new areas for development of metadata standards

• Specifically: Terminology, Forms, Services

• Reach common understanding about what each of these metamodels would describe (agree what we mean by “Terminology”, “Forms”, “Services”)

• Reach consensus on priorities/areas of common interests

• Learn how to progress something into a new standard

• Form a strategy for progressing these new standards

Page 5: New Metadata  Standards

MDR Topology (diagram). Labels recoverable from the figure: Metadata Authoring Tools & Sources (DE, VD, Concept, XML, UML, Terminology, Source Code, Documents, Image, Form); Authors, Curators; Global or Parent node: Metadata Management (Submission, Harmonization, Review, Registration & Management); Local or Child node: Metadata Management (Submission, Harmonization, Review, Registration & Management); Registry; Repository; Service Interface; Metadata Exchange; Common Registry Services; Service Metadata; CDE Metadata; Terminology Metadata; Run time Operational Repository; Browser(s); Service Discovery; Service, Schema, Model Validations; CDE Search; Model Discovery; Systems; Enterprise Vocabulary Services; Schema Repository; Search; Transformation; Etc.

Page 6: New Metadata  Standards

Areas of Interest

• Registry Metamodel

– Profiles that would allow registry operators to ‘declare’ themselves and provide attributes that would enable others to understand certain characteristics of the registry, such as business rules, naming conventions, contact information, etc.

– Issue 127: 2005 Global Attributes for a Registry to be used with an RoR

– (ISO 29002)

– What is contained in the Registry (e.g. what type of registry is it? Image Registry, Service Registry, 11179 Metadata Registry, etc.)

– Establish “Trust”

• Interest in a registry for CWM mappings (Baba)
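As a rough illustration of how a registry might ‘declare’ itself to a registry of registries (RoR), the sketch below uses a small Python record. The class name `RegistryProfile`, its field names, the helper `find_by_type` and the example values are all hypothetical; they are not taken from Issue 127, ISO 29002 or any existing specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RegistryProfile:
    """Hypothetical self-description published by a registry operator."""
    name: str
    registry_type: str   # e.g. "Image Registry", "Service Registry"
    contact: str
    naming_conventions: List[str] = field(default_factory=list)
    business_rules: List[str] = field(default_factory=list)


def find_by_type(profiles, registry_type):
    """Return the profiles in an RoR that declare the given registry type."""
    return [p for p in profiles if p.registry_type == registry_type]


# A toy registry of registries holding two illustrative profiles.
ror = [
    RegistryProfile(
        name="Example MDR",
        registry_type="11179 Metadata Registry",
        contact="registrar@example.org",
        naming_conventions=["example naming rule"],
    ),
    RegistryProfile(
        name="Example Image Registry",
        registry_type="Image Registry",
        contact="images@example.org",
    ),
]

matches = find_by_type(ror, "11179 Metadata Registry")
print([p.name for p in matches])
```

With profiles like these, a client could ask the RoR what kinds of registries exist and whom to contact before it ever queries the registries themselves.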

Page 7: New Metadata  Standards

‘Terminology’

• There are different types of terminologies

• This would cover all types of terminologies:

– Taxonomies, Formal Ontologies, Standardized lists of terms, Vocabularies, Dictionaries, etc.

• The metadata would describe the Terminology as a WHOLE (NOT the model for the terms in the terminology)

Page 8: New Metadata  Standards

Terminology Issues

• Agree on a common definition of ‘Terminology’

– Look at TC 37 work (Sue Ellen Wright)

• Add proper Provenance class

• Related Work:

– Registry for Knowledge Organization Systems (NKOS) – Gail Hodge

– ‘Rights’ – UMLS has both public and restricted content

– UK Joint Information Systems Committee (JISC) – Dennis Nicolson(?), Doug Tudhope

– NCBO – BioPortal – resource quality, attributes

– Government Terminology Services – each country has some terminologies it is serving, with an implicit metamodel that could be harmonized

• (JoAnne Evans – Monash University)

• Open Data Linking

Page 9: New Metadata  Standards

Model - Overview

Page 10: New Metadata  Standards

Areas of Interest

• Services Metamodel

– Profiles that would allow Service Registry operators to ‘declare’ themselves and provide attributes that would enable others to understand certain characteristics of the registry, such as business rules, naming conventions, contact information, etc.

– Establish “Trust”

Page 11: New Metadata  Standards

‘Service’ definition

• OASIS (organization) defines service as "a mechanism to enable access to one or more capabilities, where the access is provided using a prescribed interface and is exercised consistent with constraints and policies as specified by the service description."
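The definition names three things a service description has to cover: the capabilities, the prescribed interface, and the constraints and policies. A minimal sketch, assuming a service description is held as a plain dictionary with hypothetical keys `capabilities`, `interface` and `policies`, could check that each part has been supplied:

```python
# Parts of a service description implied by the definition above.
# The key names are assumptions made for this sketch only.
REQUIRED_PARTS = ("capabilities", "interface", "policies")


def missing_parts(description):
    """Return the required parts that are absent or empty in a description."""
    return [part for part in REQUIRED_PARTS if not description.get(part)]


# Illustrative description; the URL is a placeholder, not a real service.
description = {
    "name": "Example sequence alignment service",
    "capabilities": ["align sequences"],
    "interface": "http://example.org/align?wsdl",
    "policies": [],
}

print(missing_parts(description))
```

Here the description would be flagged as incomplete because no constraints or policies are stated.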

Page 12: New Metadata  Standards

Service Metadata Issues

• myGrid Service Model

• caGrid Service Metamodel

• Need a way to describe a “Registry Service”?? Should use WSDL – service metadata would provide a way to standardize a description of the service described in a WSDL

• IBM has introduced Service Science as a new curriculum – may have some information that would be compatible

Page 13: New Metadata  Standards

caGrid Service Metamodel

Page 14: New Metadata  Standards

caGrid Service Metamodel

11179:Administered Item

Page 15: New Metadata  Standards

Courtesy: K. Wolstencroft et al The myGrid ontology: bioinformatics service discovery, Int. J. Bioinformatics Research and Applications, Vol. x, No. x, 200x


Page 16: New Metadata  Standards

Courtesy: K. Wolstencroft et al The myGrid ontology: bioinformatics service discovery, Int. J. Bioinformatics Research and Applications, Vol. x, No. x, 200x

Not in caGrid Service Domain model:

• Operation.task

• Operation.method

• Operation.resource

• Operation.relationship (shim services)

Not sure if ‘format’ is similar to ‘dimensionality’

Legend: match, new, missing

caGrid Service Metamodel

Page 17: New Metadata  Standards

Courtesy: K. Wolstencroft et al The myGrid ontology: bioinformatics service discovery, Int. J. Bioinformatics Research and Applications, Vol. x, No. x, 200x


Service Types

• Domain: Performs scientific function

• Shim: Does not perform scientific function, but is needed to make one service work with another

– E.g.
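The slide's own example is not preserved in this transcript. As a hypothetical illustration of the distinction, the sketch below chains two made-up domain services whose data shapes do not match, with a shim in between that only reshapes the data. The function names and the sample sequences are assumptions for this sketch, not part of myGrid or caGrid.

```python
def fetch_sequences():
    """Hypothetical domain service: returns sequence records as FASTA text."""
    return ">seq1\nACGT\nACGT\n>seq2\nTTGA\n"


def fasta_to_pairs(fasta_text):
    """Shim: reshapes FASTA text into (identifier, sequence) pairs.

    It performs no scientific function; it only lets the output of one
    service be used as the input of another.
    """
    records = []
    name, chunks = None, []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def longest_sequence(pairs):
    """Hypothetical domain service: expects (identifier, sequence) pairs."""
    return max(pairs, key=lambda pair: len(pair[1]))[0]


result = longest_sequence(fasta_to_pairs(fetch_sequences()))
print(result)
```

Registering the shim with metadata about the formats it accepts and produces is what would let a workflow tool find it when two domain services do not fit together directly.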

Page 18: New Metadata  Standards

myGrid portions courtesy: K. Wolstencroft et al The myGrid ontology: bioinformatics service discovery, Int. J. Bioinformatics Research and Applications, Vol. x, No. x, 200x

myGRID Service Operation:

• Operation.resource

• Operation.method (algorithm)

• Operation.task

These concepts could come from myGrid ontology

The concepts for each “Input” could come from the semantics registered for the UML class
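To make the idea of annotating operations concrete, here is a small sketch that attaches a task, method and resource to each operation and builds a lookup index per facet. The operation names and concept labels are placeholders chosen for illustration; they are not actual identifiers from the myGrid ontology or the caGrid model.

```python
from collections import defaultdict

# Each operation is annotated with the three facets named on the slide.
# All values below are illustrative placeholders.
annotations = [
    {"operation": "alignSequences", "task": "aligning",
     "method": "clustalw", "resource": None},
    {"operation": "getSequence", "task": "retrieving",
     "method": None, "resource": "sequence database"},
    {"operation": "showAlignment", "task": "displaying",
     "method": None, "resource": None},
]


def index_by(annotations, facet):
    """Map each value of a facet to the operations annotated with it."""
    index = defaultdict(list)
    for annotation in annotations:
        value = annotation.get(facet)
        if value is not None:
            index[value].append(annotation["operation"])
    return dict(index)


by_task = index_by(annotations, "task")
by_method = index_by(annotations, "method")
print(by_task["aligning"])
```

An index like this is the simplest form of service discovery: a user asks for operations that perform a given task and gets back the registered operations annotated with that concept.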

Page 19: New Metadata  Standards

Courtesy: K. Wolstencroft et al The myGrid ontology: bioinformatics service discovery, Int. J. Bioinformatics Research and Applications, Vol. x, No. x, 200x

Comparison of myGrid semantics and caGrid Semantics

myGRID semantic categories

• Informatics. Captures the key concepts of data, data structures, databases and metadata. The data and metadata hierarchies in the ontology contain this information.

• Bioinformatics. This builds on informatics. As well as data and metadata, there are domain-specific data sources (e.g., the model organism sequencing databases), and domain-specific algorithms for searching and analysing data (e.g., the sequence alignment algorithm, clustalw). The algorithm and data_resource hierarchies contain this information.

• Molecular biology. This includes the higher level concepts used to describe the bioinformatics data types used as inputs and outputs in services. These concepts include examples such as protein sequence and nucleic acid sequence.

• Tasks. A hierarchy describing the generic tasks a service operation can perform. Examples include retrieving, displaying and aligning.

• Formats. A hierarchy describing bioinformatics file formats. For example, fasta format for sequence data, or phylip format for phylogenetic data.

Comparison with caGrid:

• Informatics. Not sure if we have a corollary or need this.

• Bioinformatics. We could probably use theirs for attaching a concept to Operation algorithm.

• Molecular biology. We have this already; it's the concepts associated with the UML Class level.

• Tasks. We could probably use theirs for Operation task.

• Formats. Probably useful; this is at the Service level. Not sure if this is similar to the ‘dimensionality’ attribute we have at the input/output level?

Page 20: New Metadata  Standards

“Forms” Description

• A structured set of metadata for the collection of information and for the ‘form’ itself

• There are various types of forms

– Statistical forms

– Surveys, Questionnaires

– Population Science Measures (e.g. quality of life)

• Link to Rules for Form Completion

• Rights – can the form be reused? By whom? What is the process for authorization, etc.

• Includes a standard way to describe minimal Rendering information

– To describe how the form looks

– ‘Behavior’ – skip pattern

– Form Structure

• Standard way to associate the semantics of the Question and Answer Set (e.g. to associate an item on the form with an 11179 Data Element, DEC or Value Domain)
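Two of the points above, the skip pattern and the link from a form item to a registered data element, can be sketched in a few lines. This is a minimal sketch under assumed names: the question texts, the `ask_if` key and the `DE-…` identifiers are invented for illustration and do not come from any registry or forms standard.

```python
# Each question points at a (hypothetical) registered data element and may
# carry a skip condition: ask it only if an earlier answer has a given value.
QUESTIONS = [
    {"id": "q1", "text": "Do you currently smoke?", "data_element": "DE-0001"},
    {"id": "q2", "text": "How many cigarettes per day?", "data_element": "DE-0002",
     "ask_if": ("q1", "yes")},
    {"id": "q3", "text": "Rate your overall quality of life (1-5).",
     "data_element": "DE-0003"},
]


def questions_to_ask(questions, answers):
    """Return the ids of the questions that apply, honouring skip conditions."""
    visible = []
    for question in questions:
        condition = question.get("ask_if")
        if condition is None or answers.get(condition[0]) == condition[1]:
            visible.append(question["id"])
    return visible


print(questions_to_ask(QUESTIONS, {"q1": "no"}))
print(questions_to_ask(QUESTIONS, {"q1": "yes"}))
```

Because the skip rule is data rather than rendering code, the same form description could in principle drive a paper layout, a web page or an interview script.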

Page 21: New Metadata  Standards

Forms MetaModel Related Projects

• OASIS

• UN/CEFACT – Forms Metamodel

– UN eDocs – Cross Border Trade ‘form templates’ – may have an implicit form metamodel

– ISO 15000-5 ebXML

– ISO 7372

– Adobe PDF form metamodel

– MS InfoPath – has a metamodel for their forms

– XForms – language for representing forms

Page 22: New Metadata  Standards

Forms Metadata

• caDSR Protocol Forms Metadata

• cancerGrid Protocol Forms

• HL7 Clinical Document Architecture

• Westat Survey/Instrument Metamodel

• ??

Page 23: New Metadata  Standards

class ProtocolCaseReportForms

domain::ConditionMessage
+ id: String
+ message: String
+ messageType: String

domain::FormElement

domain::AdministeredComponent
+ beginDate: Date
+ changeNote: String
+ createdBy: String
+ dateCreated: Date
+ dateModified: Date
+ deletedIndicator: String
+ endDate: Date
+ id: String
+ latestVersionIndicator: String
+ longName: String
+ modifiedBy: String
+ origin: String
+ preferredDefinition: String
+ preferredName: String
+ publicID: Long
+ registrationStatus: String
+ unresolvedIssue: String
+ version: Float
+ workflowStatusDescription: String
+ workflowStatusName: String

domain::Form
+ displayName: String
+ type: String

domain::Module
+ displayOrder: Integer
+ maximumQuestionRepeat: Integer

domain::Question
+ defaultValidValueId: String
+ defaultValue: String
+ displayOrder: Integer
+ isEditable: String
+ isMandatory: String

domain::ValidValue
+ description: String
+ displayOrder: Integer
- meaningText: String

domain::Instruction
+ type: String

domain::TriggerAction
+ action: String
+ createdBy: String
+ criterionValue: String
+ dateCreated: Date
+ dateModified: Date
+ forcedValue: String
+ id: String
+ instruction: String
+ modifiedBy: String
+ triggerRelationship: String

domain::Protocol
+ approvedBy: String
+ approvedDate: Date
+ changeNumber: String
+ changeType: String
+ leadOrganizationName: String
+ phase: String
+ protocolID: String
+ reviewedBy: String
+ reviewedDate: Date
+ type: String

domain::AdministeredComponentClassSchemeItem
+ createdBy: String
+ dateCreated: Date
+ dateModified: Date
+ id: String
+ modifiedBy: String

domain::QuestionRepetition
+ defaultValue: String
+ isEditable: String
+ repeatSequenceNumber: Integer

domain::DataElement

domain::QuestionCondition
+ id: String

domain::QuestionConditionComponents
+ constantValue: String
+ displayOrder: Integer
+ id: String
+ logicalOperand: String
+ operand: String

domain::Function
+ createdBy: String
+ dateCreated: Date
+ dateModified: Date
+ id: String
+ modifiedBy: String
+ name: String
+ symbol: String

Association ends (role name and multiplicity, in the order they appear in the diagram):
+conditionComponent 0..*
+function 0..1
+administeredComponentClassSchemeItemCollection 0..*
+triggerActionCollection 0..*
+administeredComponentClassSchemeItemCollection 0..*
+administeredComponent 1
+instruction 0..*
+formElement 1
+sourceFormElement 1
+triggerActionCollection 0..*
+targetFormElement 1
+triggerActionCollection 0..*
+triggeredActionCollection 0..*
+enforcedCondition 0..1
+conditionComponentCollection 0..*
+questionCondition 0..1
+questionCondition 1
+condtionMessage 0..*
+protocolCollection 0..*
+triggerActionCollection 0..*
+protocolCollection 0..*
+formCollection 0..*
+validValue 0..1
+conditionComponent 0..*
+questionRepetitionCollection 0..*
+defaultValidValue 0..1
+module 1
+questionCollection 0..*
+questionCondition 0..1
+forcedConditionTriggeredActionCollection 0..*
+parentQuestionCondition 1
+questionCondition 0..*
+questionCollection 0..*
+dataElement 0..1
+question 1
+questionRepetitionCollection 0..*
+question 0..*
+questionCondition 0..1
+questionComponentCollection 0..*
+question 0..1
+validValueCollection 0..*
+question 1
+form 1
+moduleCollection 0..*

Page 24: New Metadata  Standards

class ProtocolCaseReportForms

(Class diagram showing the same classes, attributes and association ends as the previous page, except that domain::AdministeredComponent is not shown, together with its two association ends: +administeredComponent 1 and the second +administeredComponentClassSchemeItemCollection 0..*.)

caDSR Protocol Forms Metamodel December 2007
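A small part of this metamodel can be sketched as Python dataclasses. The class and attribute names below are taken from the diagram; the nesting of Form, Module, Question and ValidValue is an assumption read off the association role names (moduleCollection, questionCollection, validValueCollection), and the helper `rendering_order` and the sample values are hypothetical. This is not the caDSR API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ValidValue:
    description: str
    displayOrder: int


@dataclass
class Question:
    displayOrder: int
    isMandatory: str                      # String in the diagram
    defaultValue: Optional[str] = None
    validValueCollection: List[ValidValue] = field(default_factory=list)


@dataclass
class Module:
    displayOrder: int
    maximumQuestionRepeat: int = 0
    questionCollection: List[Question] = field(default_factory=list)


@dataclass
class Form:
    displayName: str
    type: str
    moduleCollection: List[Module] = field(default_factory=list)


def rendering_order(form):
    """List (module order, question order, valid values) sorted by displayOrder."""
    rows = []
    for module in sorted(form.moduleCollection, key=lambda m: m.displayOrder):
        for question in sorted(module.questionCollection,
                               key=lambda q: q.displayOrder):
            values = [v.description
                      for v in sorted(question.validValueCollection,
                                      key=lambda v: v.displayOrder)]
            rows.append((module.displayOrder, question.displayOrder, values))
    return rows


# Sample form with modules and questions deliberately out of order.
form = Form(
    displayName="Example case report form",
    type="CRF",
    moduleCollection=[
        Module(displayOrder=2,
               questionCollection=[Question(displayOrder=1, isMandatory="No")]),
        Module(displayOrder=1, questionCollection=[
            Question(displayOrder=2, isMandatory="Yes",
                     validValueCollection=[ValidValue("No", 2),
                                           ValidValue("Yes", 1)]),
            Question(displayOrder=1, isMandatory="Yes"),
        ]),
    ],
)

rows = rendering_order(form)
print(rows)
```

The displayOrder attributes carry the minimal rendering information: a client can reconstruct the intended sequence of modules, questions and answer choices without any layout details.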

Page 25: New Metadata  Standards

HL7 Clinical Document Architecture (CDA) Metamodel

Page 26: New Metadata  Standards

Metadata Standards Development

• SC 32 and WG 2 Options for progressing these

– Service Metadata, Forms Metadata, Terminology Metadata and Global Registry Metadata (RoR - Issue 127)

– Also need a Service Specification for Registries

– Work with Issue 127, or create another issue to put into 11179

– WG 2 ‘Study Period’

• Look at this and decide what approach we should take

• New part of 11179? New Standard?

• ‘Study Period’ involves more open participation, not as formal as a WG 2 meeting

• Could be proposed as a new Study Period next week as a ‘Resolution’

– Needs a Leader, provides an opportunity to circulate something for people to “sign-up”

– Could report back at the next WG 2 meeting in the Winter 2008 (5-8 months)

• ISO 19115 (TC 211) Geographic Information

• TC 184/SC 4 and ISO 8000 Data Quality – Quality of Services? Gerald Radack and Peter Benson

• OMG Evan Wallace/Elisa Kendall

• IT 4 Australian, New Zealand