Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Larry Fitzwater, U.S. EPA; Judith Newton, NIST; Lois Fritts, SAIC. January 17, 2000. Open Forum on Metadata Registries, Santa Fe, NM. SDC-0002-021-JE-2026


Page 1: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Larry Fitzwater, U.S. EPA

Judith Newton, NIST

Lois Fritts, SAIC

January 17, 2000

Open Forum on Metadata Registries

Santa Fe, NM

SDC-0002-021-JE-2026

Page 2: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Contents of Working Paper

1. Scope

2. References

3. Definitions

4. Types of Abstraction

5. Registry Population

Annexes

Page 3: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Scope

Describes business rules for the registration of data elements and their attributes in a registry, to assist in consistently establishing good quality data elements.

Based on the model of a data registry described in ISO/IEC 11179, Part 3.

Helps to achieve metadata content consistency through procedures and examples.

Page 4: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Judith Newton, NIST

Open Forum on Metadata Registries

Santa Fe, NM

Page 5: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Types of Abstraction Relevant to Data Elements

1. Specialization/generalization: all items in the superclass are also in the subclass

2. Decomposition/aggregation: the part-of relationship

Page 6: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Specialization/Generalization

[Figure: hierarchy of data elements, from most general to most specialized]

State USPS Code

Geographic State USPS Code

Mailing Address State USPS Code

FACILITY Geographic State USPS Code

CUSTOMER Geographic State USPS Code

FACILITY Mailing Address State USPS Code

CUSTOMER Mailing Address State USPS Code
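The box labels on this slide suggest a tree in which each more specialized name sits under a more general one. A minimal Python sketch of that idea follows; the `PARENT` mapping and the `generalizations` helper are hypothetical names for illustration, and the parent links are an assumption read off the labels, not a structure stated in the slides.

```python
# Hypothetical parent links: each specialized name points to its generalization.
# The links are inferred from the slide's box labels (an assumption).
PARENT = {
    "Geographic State USPS Code": "State USPS Code",
    "Mailing Address State USPS Code": "State USPS Code",
    "FACILITY Geographic State USPS Code": "Geographic State USPS Code",
    "CUSTOMER Geographic State USPS Code": "Geographic State USPS Code",
    "FACILITY Mailing Address State USPS Code": "Mailing Address State USPS Code",
    "CUSTOMER Mailing Address State USPS Code": "Mailing Address State USPS Code",
}


def generalizations(name):
    """Return the chain of increasingly general data elements above `name`."""
    chain = []
    while name in PARENT:
        name = PARENT[name]
        chain.append(name)
    return chain


print(generalizations("FACILITY Mailing Address State USPS Code"))
```

Walking the chain upward is one way a registrar could find the most general element whose value domain a specialized element should reuse.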

Page 7: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Decomposition/Aggregation

Country Identifier

Country Subdivision Code

Country Subdivision Code

County Code

Borough Code

Metropolitan District Code

Unitary Auth. Code

Special Area Code

Page 8: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Lois Fritts, SAIC

Open Forum on Metadata Registries

Santa Fe, NM

Page 9: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

A Metadata Registry Can Be Overwhelming

[Figure: UML class diagram, "11179-3 METAMODEL, Main Model", page 1 of the Proposal for Comments on the 11179-3 Revision (DD Mann, 1999-12-13). Classes shown with their required, optional, and conditional attributes and association multiplicities: Data_Element, Data_Element_Concept, Conceptual_Domain, Value_Domain, Enumerated_Domain, Non_enumerated_Domain, Permissible_Value, Value_Meaning, Example, Rule, Source_Data_Element, Data_Element_Concept_Relationship, Value_Domain_Relationship.]

NOTE: This model represents the logical structure of a registry for data elements and related components that are in a "recorded" or higher registration status.

For UML v1.3 documentation see: ftp://ftp.omg.org/pub/docs/ad/99-06-08.pdf

Page 10: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

This presentation is a practical approach to populating the content of a data registry for data elements.

Page 11: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Overview

General Procedures

Examples of Registration

Data Element Groups

Page 12: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

General Procedures

1. Understanding the data element

2. Content research

3. Population of metadata attributes

4. Classification

5. Quality control

Page 13: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Population of Metadata Attributes

Bottom Up Approach: A data element is attributed with known facts prior to defining the conceptual information about a data element.

Top Down Approach: A classified group is added to the registry, beginning with conceptual domains, value domains, and working down to the individual data elements.

Page 14: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Logical Bottom Up Process

Data Element Definition

Permissible Values

Data Element Name and Identifiers

Other Data Element Attributes

Data Element Concept

Conceptual Domain and Value Meanings

Page 15: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Logical Top Down Process

Data Element Definition

Permissible Values

Data Element Name and Identifiers

Other Data Element Attributes

Data Element Concept

Conceptual Domain and Value Meanings
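The two processes cover the same six stages and differ in direction. A rough Python sketch of that idea follows. It assumes the bottom-up order is the order in which the stages are listed and that the top-down order is its exact reverse; the slides only say that top-down begins with conceptual domains and works down to individual data elements, so both orderings, and the `next_step` helper, are illustrative assumptions.

```python
# Stage names as they appear on the slides. Treating this list order as the
# bottom-up sequence is an assumption.
BOTTOM_UP = [
    "Data Element Definition",
    "Permissible Values",
    "Data Element Name and Identifiers",
    "Other Data Element Attributes",
    "Data Element Concept",
    "Conceptual Domain and Value Meanings",
]

# Assumption: top-down visits the same stages in the opposite direction.
TOP_DOWN = list(reversed(BOTTOM_UP))


def next_step(completed, order):
    """Return the first stage in `order` that is not yet completed, or None."""
    for stage in order:
        if stage not in completed:
            return stage
    return None


print(next_step({"Data Element Definition"}, BOTTOM_UP))
print(next_step(set(), TOP_DOWN))
```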

Page 16: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Other Attributes

Submitting Organization

Data Steward

Comment

Example

Origin
– Document
– System
– Standard

Administrative

Page 17: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Data Element Examples

ISO Standard–Enumerated

ISO Standard–Non-enumerated

Application System–Enumerated

Page 18: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

ISO Standard–Enumerated

ISO 3166 Country Identifiers

Short English Name -- United States

Long English Name -- United States of America

2-character abbrev. -- US

3-character abbrev. -- USA

3-digit code -- 840

Short French Name -- ÉTATS-UNIS

Long French Name -- États-Unis d’Amérique

Page 19: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Codes for Data Element Registration

Definition (Def)

Permissible Value (PV)

Value Domain (VD)

Value Domain Origin (VDO)

Data Element Name and Identifiers (DEID)

Data Element Name Context (CNTX)

Data Element Concept (DEC)

Conceptual Domain (CD)

Classification (Cl)

Layer of Abstraction (LA)

Registration Status (RS)

Page 20: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

ISO 3166–Enumerated

Def: The short name of a country, represented in the English language

PV: Afghanistan, Albania,…Zimbabwe

VD: Short English-language country names

VDO: ISO 3166-1:1997

DEID: 209033:1 Short English-language country name

CNTX: Registry

DEC: Country identifier

CD: Countries of the world

Cl: Geopolitical entities; country identifiers

LA: Generalization

RS: Standard
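The registration above can be held as one record with a field per code. The following is a minimal Python sketch, not a registry schema: the `Registration` class, its field names, and the `missing_attributes` check are hypothetical, and storing every attribute as plain text is a simplifying assumption.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class Registration:
    """One data element registration; comments give the slide's code."""
    definition: str            # Def
    permissible_values: str    # PV
    value_domain: str          # VD
    value_domain_origin: str   # VDO
    name_and_identifier: str   # DEID
    name_context: str          # CNTX
    data_element_concept: str  # DEC
    conceptual_domain: str     # CD
    classification: str        # Cl
    layer_of_abstraction: str  # LA
    registration_status: str   # RS


def missing_attributes(record):
    """List the fields left blank, as a simple quality-control check."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


country_name = Registration(
    definition="The short name of a country, represented in the English language",
    permissible_values="Afghanistan, Albania,…Zimbabwe",
    value_domain="Short English-language country names",
    value_domain_origin="ISO 3166-1:1997",
    name_and_identifier="209033:1 Short English-language country name",
    name_context="Registry",
    data_element_concept="Country identifier",
    conceptual_domain="Countries of the world",
    classification="Geopolitical entities; country identifiers",
    layer_of_abstraction="Generalization",
    registration_status="Standard",
)

# A draft with one attribute still blank, to exercise the check.
draft = replace(country_name, value_domain_origin="")
print(missing_attributes(country_name), missing_attributes(draft))
```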

Page 21: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

ISO Standard–Non-enumerated

ISO 6709 Geographic Point Locations

Latitude

Longitude

Altitude

Latitude Sexagesimal Measure

Page 22: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

ISO 6709–Non-enumerated

Def: The sexagesimal measure of the angular distance of a position on the earth on a meridian north or south of the equator

PV: <All measures recorded as DDMMSS.SS>

VD: Sexagesimal measures of latitude

VDO: Not applicable

DEID: 312345:1 Latitude sexagesimal measure

CNTX: Registry

DEC: Latitude distance

CD: Latitude coordinates

Cl: Geographic point location

LA: Generalization

RS: Recorded
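Because a non-enumerated domain is described by a rule and not by a list, its permissible values can be checked with a format test. A minimal Python sketch of such a check for the DDMMSS.SS pattern follows. It assumes an unsigned value with exactly two decimal places and no hemisphere indicator, which the slide does not specify; `is_valid_latitude` is a hypothetical helper name.

```python
import re

# DD, MM, and SS.SS as fixed-width groups (assumed layout for DDMMSS.SS).
_PATTERN = re.compile(r"([0-9]{2})([0-9]{2})([0-9]{2}\.[0-9]{2})")


def is_valid_latitude(text):
    """Check that `text` is a plausible latitude in DDMMSS.SS form."""
    match = _PATTERN.fullmatch(text)
    if match is None:
        return False
    degrees = int(match.group(1))
    minutes = int(match.group(2))
    seconds = float(match.group(3))
    if minutes >= 60 or seconds >= 60:
        return False
    # 90 degrees is the pole, so nothing may follow it.
    return degrees < 90 or (degrees == 90 and minutes == 0 and seconds == 0)


print(is_valid_latitude("385322.50"))
```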

Page 23: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Application System–Enumerated

[Figure: sample address block, marked 33c]

Name

Street Address

City, State Postal Code

Country

Mailing Address Country Name

Page 24: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Application Data Element

Def: The name of a country where the addressee is located

PV: Afghanistan, Albania,…Zimbabwe

VD: Short English-language country names

VDO: ISO 3166-1:1997

DEID: 5394:1 Mailing_Address.Country_Name

CNTX: Facility data system

DEC: Address country identifier

CD: Countries of the world

Cl: Mailing address

LA: Specialization

RS: Recorded

Page 25: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Register a Classification of Data Elements

1. Understanding the classified group

2. Specifying the data elements

3. Understanding the group’s source:
– Name
– Definition
– Authority
– Rationale
– Potential usage
– Identifier

Page 26: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Data Element Classifications

Document

Standard

Composite data element

Page 27: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Classify by Document

Source: Facility Location and Identification Standard

Definition: Core set of data elements that supports location and identification of place-based objects

Authority: Federal Geographic Data Committee (FGDC)

Rationale: Proposed U.S. National Standard

Usage: Facilitates data sharing about facilities

Identifier: 1234

Page 28: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Data Elements in Document

Facility Name

Facility Category Type

Facility Identification Number

Latitude Measure

Longitude Measure
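A classified group pairs the descriptive attributes of its source with the list of data elements it contains. A minimal Python sketch of that pairing follows, using the Facility Location and Identification Standard group; the `Classification` class and the `groups_containing` helper are hypothetical names, not part of any registry product.

```python
from dataclasses import dataclass, field


@dataclass
class Classification:
    """A classified group of data elements and the facts about its source."""
    identifier: str
    source: str
    definition: str
    authority: str
    rationale: str
    usage: str
    data_elements: list = field(default_factory=list)


facility_standard = Classification(
    identifier="1234",
    source="Facility Location and Identification Standard",
    definition=("Core set of data elements that supports location and "
                "identification of place-based objects"),
    authority="Federal Geographic Data Committee (FGDC)",
    rationale="Proposed U.S. National Standard",
    usage="Facilitates data sharing about facilities",
    data_elements=[
        "Facility Name",
        "Facility Category Type",
        "Facility Identification Number",
        "Latitude Measure",
        "Longitude Measure",
    ],
)


def groups_containing(element_name, classifications):
    """Return identifiers of the groups that list `element_name`."""
    return [c.identifier for c in classifications
            if element_name in c.data_elements]


print(groups_containing("Latitude Measure", [facility_standard]))
```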

Page 29: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Classify by Standard

Source: Standard representation of latitude, longitude, and altitude for geographic point locations

Definition: The horizontal and vertical coordinates that define a point on the earth

Authority: ISO 6709

Rationale: International standard

Usage: System developers to design a database entity and transfer data files

Identifier: 1345

Page 30: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Data Elements in Standard

Latitude Degrees Measure

Longitude Degrees Measure

Altitude Measure

Latitude Sexagesimal Measure

Longitude Sexagesimal Measure

Page 31: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Classify by Composite Data Element

Name: Urban-style street address

Definition: A set of precise data elements that can be combined into a street address

Authority: U.S. Postal Service, Publication 28: Postal Address Standards

Rationale: U.S. national standard for creating a mail piece

Usage: Parse street address for validation of individual segments

Identifier: 2543

Page 32: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Data Elements in Street Address

Building Number

Pre-directional Code

Street Name

Street Suffix Code

Post-directional Code

Secondary Unit Code

Suite Number

Page 33: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Composite Data Element

Example of data values for Urban-Style Street Address

200 N Glebe Road SW Suite 300
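The example value splits cleanly into the seven component data elements of the urban-style street address. A minimal Python sketch of that decomposition and the matching aggregation follows. It assumes every component is present and is a single whitespace-separated token, which holds for this example but not for addresses in general; `decompose` and `aggregate` are hypothetical helper names.

```python
# Component data elements of the composite, in address order.
COMPONENTS = [
    "Building Number",
    "Pre-directional Code",
    "Street Name",
    "Street Suffix Code",
    "Post-directional Code",
    "Secondary Unit Code",
    "Suite Number",
]


def decompose(street_address):
    """Split a street address into its component data elements.

    Assumes one token per component (a simplification for illustration).
    """
    tokens = street_address.split()
    if len(tokens) != len(COMPONENTS):
        raise ValueError("expected exactly one token per component")
    return dict(zip(COMPONENTS, tokens))


def aggregate(parts):
    """Combine component values back into the composite street address."""
    return " ".join(parts[name] for name in COMPONENTS)


parts = decompose("200 N Glebe Road SW Suite 300")
print(parts["Street Name"], parts["Post-directional Code"])
```

Keeping the components separate is what allows each segment to be validated on its own, which is the usage the classification names.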

Page 34: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Linking Data Elements

Vertical

Horizontal

Used Together

Page 35: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Vertical Linking

Generalization to Specialization

State USPS Code

Mailing Address State Code

Facility Mailing Address State Code

Page 36: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Horizontal Linking

Equivalent Layer of Abstraction

PCS_Permit_Facility.Mailing_State

BRS_Site_Information.Mail_State

RCR_Mailing_Location.State

Facility Mailing Address State Code
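Horizontal linking ties application fields at the same layer of abstraction to one shared data element. A minimal Python sketch follows, assuming the three system field names on this slide all link to "Facility Mailing Address State Code"; the `LINKS` mapping and the `equivalents` helper are hypothetical.

```python
# Hypothetical link table: system field -> shared registry data element.
LINKS = {
    "PCS_Permit_Facility.Mailing_State": "Facility Mailing Address State Code",
    "BRS_Site_Information.Mail_State": "Facility Mailing Address State Code",
    "RCR_Mailing_Location.State": "Facility Mailing Address State Code",
}


def equivalents(system_field):
    """Return the other system fields linked to the same data element."""
    target = LINKS[system_field]
    return sorted(name for name, element in LINKS.items()
                  if element == target and name != system_field)


print(equivalents("RCR_Mailing_Location.State"))
```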

Page 37: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Linking by Use

Sample Quantity

Sample Quantity Units Name

Example: 17 milligrams
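Data elements that are used together only carry meaning as a pair: a quantity needs its units. A minimal Python sketch of such a link follows; the `SampleMeasurement` class and its rule that the units name must not be blank are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleMeasurement:
    """Two data elements that are linked because they are used together."""
    sample_quantity: float            # Sample Quantity
    sample_quantity_units_name: str   # Sample Quantity Units Name

    def __post_init__(self):
        # Assumed rule: a quantity without a units name is not usable.
        if not self.sample_quantity_units_name.strip():
            raise ValueError("Sample Quantity needs a Sample Quantity Units Name")

    def __str__(self):
        return f"{self.sample_quantity:g} {self.sample_quantity_units_name}"


measurement = SampleMeasurement(17, "milligrams")
print(measurement)
```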

Page 38: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

This is a practical, logical approach to registering “good” data elements.

Page 39: Procedures for Achieving Metadata Registry Content Consistency: Data Elements

Discussion
