A Multi-Discipline Metadata Registry for Science Interoperability
J. Steven Hughes/JPL
Daniel J. Crichton/JPL
Jason J. Hyon/JPL
Sean C. Kelly/UTA
Open Forum on Metadata Registries
January 17-21, 2000
Santa Fe, New Mexico
A Multi-Discipline Metadata Registry for Science Interoperability
•Background
•Problem Statement
•System Overview
•Profile Development
•Conclusion and Issues
Background
•NASA’s Office of Space Science
  •Planetary Science
    •Planetary Data System (PDS)
      •5 science discipline nodes - 2 support nodes - 1 central node
      •Heterogeneous domains - short term missions
  •Astrophysics
    •Astrophysics Data System
      •100s to 1000s of nodes
      •Homogeneous domains - long term missions
  •Space Physics
    •Space Physics Data System*
      •Several identified nodes
Background
•Planetary Data System (PDS)
•Archives essentially all science data from solar system exploration missions
•Prototype - 1986, Operational - 1990
•Publishes archive quality products
•Well defined standards architecture
Background Planetary Science Standards Architecture
[Figure: Planetary Science Standards Architecture. Component labels: Volume Organization Stds., Product Labels, Catalog Templates, Planetary Science Data Dictionary, Object Description Language. Layer labels: Standard Archive Product Architecture, Standard Description, Standard Vocabulary, Standard Grammar. Result: + Peer Review, Archive Quality Product.]
Background
•Planetary Science Data Dictionary
  •1000+ data elements spanning Planetary Science disciplines
  •Nomenclature standard
  •Meaning, type, ranges, enumerated values
•Planetary Science Data Model
  •Developed as Planetary Science enterprise E/R model
    •Planetary Science Entities - Spacecraft, Instruments
    •Science Data Entities - Data Products, Projections, ...
    •Data Organization Entities - Volumes
    •Management Entities - Nodes, Personnel
  •Implemented as the PDS Data Set Catalog in an RDBMS
  •Distributed in Object Description Language
Background
•Challenge
•Develop single interface for locating space science data.
•Provide data system interoperability.
•Support correlative Science.
Problem Statement
Space scientists cannot easily locate or use data across the hundreds, if not thousands, of autonomous, heterogeneous, and distributed data systems currently in the Space Science community.
•Heterogeneous Systems
  •Data Management - RDBMS, ODBMS, home-grown DBMS, binary files
  •Platforms - UNIX, Linux, Win3.x/9x/NT, Mac, VMS, …
  •Interfaces - Web, Windows, Command Line
  •Data Formats - HDF, CDF, NetCDF, PDS, FITS, VICAR, ASCII, ...
  •Data Volume - kilobytes to terabytes
•Heterogeneous Disciplines
  •Moving targets and stationary targets
  •Multiple coordinate systems
  •Multiple data object types (images, cubes, time series, spectra, tables, binary, documents)
  •Multiple interpretations of single object types
  •Multiple software solutions to the same problem
  •Incompatible and/or missing metadata
Proposed Solution
•Encapsulate individual data systems. (Hide uniqueness.)
•Communicate using metadata that describe resources
  •Data (e.g. data sets, images)
  •Non-data (e.g. catalogs, services)
•Enable interoperability based on metadata compatibility.
•Refocus problem on metadata development.
Proposed Solution (cont)
• Object Oriented Data Technology (OODT) Task
  – Domain independent data management infrastructure
• Domain independent data structures
  – XML - Standard interchange language
  – Metadata management
    • Resource profile
  – Message passing
• Domain independent system infrastructure
  – CORBA for interoperability between computer systems and languages
  – Message passing to simplify interface design
  – Standardized reusable server components
System Overview Object Oriented Data Technology Framework
[Figure: OODT framework diagram. Labels: Scientist, Web Server, OODT Server, Query Server, Profile Servers with their profiles (Prof), Product Servers, Archive Servers, SeaWinds Staging, PDS Staging, PTI Staging, Sybase, Oracle, PDS Systems.]
System Overview Profile Service
• Profile describes a resource
  – Available datasets and products
  – Types of resources and where they’re located
• Optionally reference other profile servers
[Figure: profile servers holding profiles (Prof) for Data system 1 and Data system 2, with links to other profile servers.]
System Overview Query Service
• Knows how to “crawl” through servers to produce a result
  – Crawls through profiles to discover other profiles and product servers
  – Crawls through product servers to display available products
• Accessible through CORBA API or through web browser
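The crawl described above can be sketched as a breadth-first walk over servers. This is only an illustration under stated assumptions: the in-memory `SERVERS` table, the server names, and the `crawl` function are hypothetical stand-ins, and the real query service would reach servers through CORBA or the web interface rather than a dictionary.

```python
from collections import deque

# Hypothetical in-memory stand-in for a network of servers (names invented).
# Profile servers reference other servers; product servers list products.
SERVERS = {
    "profile-root": {"kind": "profile", "refs": ["profile-pds", "profile-astro"]},
    "profile-pds": {"kind": "profile", "refs": ["product-pds", "profile-root"]},
    "profile-astro": {"kind": "profile", "refs": []},
    "product-pds": {"kind": "product", "products": ["VO_2001", "VO_2014"]},
}


def crawl(start, servers):
    """Walk breadth-first from one profile server and collect products."""
    seen = {start}          # guards against cycles between profile servers
    queue = deque([start])
    products = []
    while queue:
        name = queue.popleft()
        server = servers[name]
        if server["kind"] == "product":
            # Product servers are leaves: record what they offer.
            products.extend(server["products"])
            continue
        for ref in server["refs"]:
            if ref not in seen:
                seen.add(ref)
                queue.append(ref)
    return products
```

The `seen` set matters because profile servers may reference each other, so a crawl without it could loop indefinitely.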
Profile Development Objective
•Objective
  •Design and develop a domain generic structure that captures the metadata necessary for identifying and locating science data resources across distributed heterogeneous data systems.
•Result
  •Profile - A resource description (subset of meta-model) sufficient to determine if the resource might resolve a query.
Profile Development Approach
•Choose a common interchange format.
•Develop a domain generic language.
•Implement domain specific instances.
  •Model the domain.
  •Capture the meta-data.
•Develop system to manage the results.
Profile Development Choose a common interchange format
•XML
  •eXtensible Markup Language
  •More expressive than HTML
  •Simpler than SGML
•A meta-language used to define domain languages.
  •XSIL - eXtensible Scientific Interchange Language.
  •XIL - Instrument control language.
•Wide acceptance as an interchange format.
  •Electronic data interchange (EDI) standard.
Profile Development Develop a domain generic language
•Define a generic structure (XML DTD) that can describe heterogeneous domain-specific resources.
•Profile - A resource description with sufficient information to determine if the resource satisfies a query.
•Profile elements
  •name, syntax, unit, value_instance, meaning, alias, …
  •encode selected domain attributes and their values specific to this resource
•Resource attributes - id, title, discipline, location_id, …
•Profile attributes - id, title, desc, type, data_dictionary_id, …
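How a profile might be used to decide whether a resource could satisfy a query can be sketched as follows. This is a minimal illustration, not the OODT implementation: the dictionary layout, the function names, and the rule that an undescribed element excludes the resource are all assumptions made for the example. The range check mirrors the MINIMUM_VALUE / MAXIMUM_VALUE pair allowed by prof.dtd.

```python
def element_matches(element, value):
    """Return True if one profile element admits the queried value."""
    if "values" in element:
        # Enumerated VALUE_INSTANCE entries.
        return value in element["values"]
    if "min" in element and "max" in element:
        # MINIMUM_VALUE / MAXIMUM_VALUE range.
        return element["min"] <= value <= element["max"]
    # No value constraint recorded, so the resource cannot be ruled out.
    return True


def might_resolve(profile, query):
    """Return True if no queried element rules the resource out.

    Assumption for this sketch: a resource whose profile does not describe
    a queried element is treated as unable to answer that query.
    """
    for name, value in query.items():
        element = profile.get(name)
        if element is None or not element_matches(element, value):
            return False
    return True


# Hypothetical profile: TARGET_NAME values come from the slides; the
# EXAMPLE_RANGE element is invented to exercise the range branch.
example_profile = {
    "TARGET_NAME": {"values": {"MARS", "DEIMOS", "PHOBOS"}},
    "EXAMPLE_RANGE": {"min": 0.0, "max": 10.0},
}
```

A query is a mapping from element names to requested values, for example `might_resolve(example_profile, {"TARGET_NAME": "MARS"})`.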
Profile Development Develop a domain generic language
prof.dtd
<!ELEMENT PROFILES (PROFILE+)>
<!ELEMENT PROFILE (PROFILE_ATTRIBUTES, RESOURCE)>
<!ATTLIST PROFILE PROFILE_ID CDATA #REQUIRED >
<!ELEMENT PROFILE_ATTRIBUTES (ID, TITLE*, DESC*, TYPE*, STATUS_ID*, SECURITY_TYPE*, PARENT_ID*, CHILD_ID*, REVISION_NOTE*, DATA_DICTIONARY_ID*)>
<!ELEMENT RESOURCE (RESOURCE_ATTRIBUTES, PROFILE_ELEMENT*)>
<!ELEMENT RESOURCE_ATTRIBUTES (RESOURCE_ID, RESOURCE_TITLE, RESOURCE_DISCIPLINE, RESOURCE_AGGREGATION, RESOURCE_CLASS, RESOURCE_LOCATION_ID, RESULT_MIME_TYPE)>
<!ELEMENT PROFILE_ELEMENT (ELEMENT_NAME, ELEMENT_MEANING*, ELEMENT_ALIAS*, VALUE_SYNTAX*, VALUE_UNIT*, (VALUE_INSTANCE | (MINIMUM_VALUE, MAXIMUM_VALUE))*)>
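A document following this structure could be read with Python's standard `xml.etree.ElementTree` module, as in the sketch below. The sample document, its identifiers, the placeholder URL, and the `read_profiles` function are hypothetical. Note that ElementTree does not validate against the DTD; it only parses well-formed XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance shaped like prof.dtd; all values are placeholders.
SAMPLE = """
<PROFILES>
  <PROFILE PROFILE_ID="EXAMPLE_PROFILE">
    <PROFILE_ATTRIBUTES>
      <ID>EXAMPLE_PROFILE</ID>
    </PROFILE_ATTRIBUTES>
    <RESOURCE>
      <RESOURCE_ATTRIBUTES>
        <RESOURCE_ID>EXAMPLE_RESOURCE</RESOURCE_ID>
        <RESOURCE_TITLE>Example resource</RESOURCE_TITLE>
        <RESOURCE_DISCIPLINE>PDS</RESOURCE_DISCIPLINE>
        <RESOURCE_AGGREGATION>GRANULE+</RESOURCE_AGGREGATION>
        <RESOURCE_CLASS>DATA</RESOURCE_CLASS>
        <RESOURCE_LOCATION_ID>http://example.invalid/resource</RESOURCE_LOCATION_ID>
        <RESULT_MIME_TYPE>text/html</RESULT_MIME_TYPE>
      </RESOURCE_ATTRIBUTES>
      <PROFILE_ELEMENT>
        <ELEMENT_NAME>TARGET_NAME</ELEMENT_NAME>
        <VALUE_INSTANCE>MARS</VALUE_INSTANCE>
        <VALUE_INSTANCE>PHOBOS</VALUE_INSTANCE>
      </PROFILE_ELEMENT>
    </RESOURCE>
  </PROFILE>
</PROFILES>
"""


def read_profiles(xml_text):
    """Extract resource ids, locations, and element values from PROFILES XML."""
    root = ET.fromstring(xml_text)
    profiles = []
    for profile in root.iter("PROFILE"):
        attrs = profile.find("RESOURCE/RESOURCE_ATTRIBUTES")
        elements = {}
        for pe in profile.iterfind("RESOURCE/PROFILE_ELEMENT"):
            name = pe.findtext("ELEMENT_NAME").strip()
            # Collect every enumerated value listed for this element.
            elements[name] = [v.text.strip() for v in pe.findall("VALUE_INSTANCE")]
        profiles.append({
            "profile_id": profile.get("PROFILE_ID"),
            "resource_id": attrs.findtext("RESOURCE_ID").strip(),
            "location": attrs.findtext("RESOURCE_LOCATION_ID").strip(),
            "elements": elements,
        })
    return profiles
```

The `.strip()` calls are there because the profile examples in these slides pad element text with spaces.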
Profile Development Profile Example - PDS Distributed Inventory System
<PROFILE PROFILE_ID = "PROFILE_PDS_DIS_V1.3.n" >
  <PROFILE_ATTRIBUTES>
    <ID> PROFILE_PDS_DIS_V1.3.n </ID>
    <TITLE> Planetary Data System - Distributed Inventory System - Profile V1.0 </TITLE>
    <DESC> This profile describes the Planetary Data System (PDS) Distributed Inventory System (DIS) ...
    <TYPE> PROFILE </TYPE>
    <DATA_DICTIONARY_ID> OODT_PDS_DATA_SET_DD_V1.0 </DATA_DICTIONARY_ID>
  </PROFILE_ATTRIBUTES>
  <RESOURCE>
    <RESOURCE_ATTRIBUTES>
      <RESOURCE_ID> PDS_DIS_V1.3.n </RESOURCE_ID>
      <RESOURCE_TITLE> Planetary Data System - Distributed Inventory System </RESOURCE_TITLE>
      <RESOURCE_DISCIPLINE> PDS </RESOURCE_DISCIPLINE>
      <RESOURCE_AGGREGATION> GRANULE+ </RESOURCE_AGGREGATION>
      <RESOURCE_CLASS> INVENTORY </RESOURCE_CLASS>
      <RESOURCE_LOCATION_ID> http://pds.jpl.nasa.gov/pdsbrows.htm </RESOURCE_LOCATION_ID>
      <RESULT_MIME_TYPE> text/html </RESULT_MIME_TYPE>
    </RESOURCE_ATTRIBUTES>
...
Profile Development Profile Example (cont) - PDS Distributed Inventory System
…
    <PROFILE_ELEMENT>
      <ELEMENT_NAME> DATA_OBJECT_TYPE </ELEMENT_NAME>
      <ELEMENT_MEANING> The data_object_type element provides the type ...
      <VALUE_SYNTAX> ENUMERATION </VALUE_SYNTAX>
      <VALUE_UNIT> N/A </VALUE_UNIT>
      <VALUE_INSTANCE> IMAGE </VALUE_INSTANCE>
      ...
    </PROFILE_ELEMENT>
    <PROFILE_ELEMENT>
      <ELEMENT_NAME> DATA_SET_NAME </ELEMENT_NAME>
      <ELEMENT_MEANING> The data_set_name element identifies a PDS data set. -- example ...
      <VALUE_SYNTAX> ENUMERATION </VALUE_SYNTAX>
      <VALUE_UNIT> N/A </VALUE_UNIT>
      <VALUE_INSTANCE> VO1/VO2 MARS VISUAL IMAGING SUBSYSTEM DIGITAL ...
      <VALUE_INSTANCE> VO2 MARS RADIO SCIENCE SUBSYSTEM RESAMPLED LOS …
      ...
    </PROFILE_ELEMENT>
    <PROFILE_ELEMENT>
      <ELEMENT_NAME> TARGET_NAME </ELEMENT_NAME>
      <ELEMENT_MEANING> The target_name element provides the names of the targets ...
      <ELEMENT_ALIAS> ADS.OBJECT_ID </ELEMENT_ALIAS>
      <VALUE_SYNTAX> ENUMERATION </VALUE_SYNTAX>
      <VALUE_UNIT> N/A </VALUE_UNIT>
      <VALUE_INSTANCE> IDA </VALUE_INSTANCE>
      <VALUE_INSTANCE> JUPITER </VALUE_INSTANCE>
      ...
    </PROFILE_ELEMENT>
  </RESOURCE>
Profile Development Develop a domain generic language
•Specialize the profile class
  •Profile - One profile to one resource (e.g. inventory)
  •Inventory - One profile to many resources (e.g. data set, image)
    •Minimize profile element attributes
      •no meanings
      •subsets of preferred values
  •Dictionary - One profile to one discipline
    •Maximize profile element attributes
      •aliases, meanings
      •union of all preferred values
Profile Development Develop a domain generic language
•Profile element hierarchy
  •Dictionary - Planetary Science Data Dictionary
    •data elements - union of all data elements in all profiles
    •preferred values - union of all data element values
    •e.g. TARGET_NAME = {ADRASTEA, …, VENUS}
  •Profile - Planetary Image Atlas - Viking, Galileo, MPF, ...
    •data elements - union of all data elements for all entities managed by resource
    •preferred values - union of data element values
    •e.g. TARGET_NAME = {MARS, DEIMOS, PHOBOS, JUPITER, ...}
  •Inventory - Viking Orbiter Image Catalog
    •data elements - data elements associated with inventory item
    •preferred values - data element values for inventory item
    •e.g. TARGET_NAME = {MARS, DEIMOS, PHOBOS}
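The "union" relationship between the levels of this hierarchy can be illustrated with plain Python sets. This is a sketch only: the function name and the second inventory are hypothetical, and the value sets are shortened versions of the TARGET_NAME examples above.

```python
def union_of_values(descriptions, element_name):
    """Union the preferred values of one data element across descriptions."""
    values = set()
    for description in descriptions:
        values |= description.get(element_name, set())
    return values


# Inventory level: values for a single inventory (from the slide).
viking_inventory = {"TARGET_NAME": {"MARS", "DEIMOS", "PHOBOS"}}
# Hypothetical second inventory, added only to make the union visible.
other_inventory = {"TARGET_NAME": {"JUPITER"}}

# Profile level: union over the inventories the resource manages.
atlas_profile = {
    "TARGET_NAME": union_of_values([viking_inventory, other_inventory], "TARGET_NAME")
}
```

Applying the same function over all profiles in a discipline would give the dictionary-level value set, so each level is a superset of the level below it.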
Profile Development Implement domain specific instances
•Apply domain generic language to a specific domain, e.g. Space/Earth Science data and other resources.
•Model the domain
  •Data Dictionary
  •Data Model
•Capture the meta-data
  •Extracted from domain metadata repository
Profile Development Implement domain specific instances
Inventory Example - PDS Data Set
<RESOURCE>
  <RESOURCE_ATTRIBUTES>
    <RESOURCE_ID> VO1/VO2-M-VIS-5-DIM-V1.0 </RESOURCE_ID>
    <RESOURCE_TITLE> VO1/VO2 MARS VISUAL IMAGING SUBSYSTEM DIGITAL IMAGING MODEL ...
    <RESOURCE_DISCIPLINE> PDS </RESOURCE_DISCIPLINE>
    <RESOURCE_AGGREGATION> GRANULE+ </RESOURCE_AGGREGATION>
    <RESOURCE_CLASS> DATA </RESOURCE_CLASS>
    <RESOURCE_LOCATION_ID> http://pds.jpl.nasa.gov/cgi-bin/pdsserv.pl?OBJECT_ID=PDS100676 ...
    <RESULT_MIME_TYPE> text/html </RESULT_MIME_TYPE>
  </RESOURCE_ATTRIBUTES>
  <PROFILE_ELEMENT>
    <ELEMENT_NAME> DATA_SET_NAME </ELEMENT_NAME>
    <VALUE_INSTANCE> VO1/VO2 MARS VISUAL IMAGING SUBSYSTEM DIGITAL IMAGING MODEL ...
  </PROFILE_ELEMENT>
  <PROFILE_ELEMENT>
    <ELEMENT_NAME> DATA_OBJECT_TYPE </ELEMENT_NAME>
    <VALUE_INSTANCE> IMAGE </VALUE_INSTANCE>
  </PROFILE_ELEMENT>
  <PROFILE_ELEMENT>
    <ELEMENT_NAME> TARGET_NAME </ELEMENT_NAME>
    <VALUE_INSTANCE> MARS </VALUE_INSTANCE>
  </PROFILE_ELEMENT>
  <PROFILE_ELEMENT>
    <ELEMENT_NAME> VOLUME_ID </ELEMENT_NAME>
    <VALUE_INSTANCE> VO_2001 </VALUE_INSTANCE>
    ...
    <VALUE_INSTANCE> VO_2014 </VALUE_INSTANCE>
  </PROFILE_ELEMENT>
</RESOURCE>
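Capturing metadata from a domain repository and emitting it in this form could look roughly like the sketch below, which builds a RESOURCE element with Python's standard `xml.etree.ElementTree`. The record contents, the placeholder URL, and the `build_resource` function are hypothetical; only the element names and their order come from prof.dtd.

```python
import xml.etree.ElementTree as ET

# Child order required by the RESOURCE_ATTRIBUTES declaration in prof.dtd.
RESOURCE_ATTRIBUTE_TAGS = [
    "RESOURCE_ID",
    "RESOURCE_TITLE",
    "RESOURCE_DISCIPLINE",
    "RESOURCE_AGGREGATION",
    "RESOURCE_CLASS",
    "RESOURCE_LOCATION_ID",
    "RESULT_MIME_TYPE",
]


def build_resource(record, elements):
    """Build a RESOURCE element from a repository record (a plain dict)."""
    resource = ET.Element("RESOURCE")
    attrs = ET.SubElement(resource, "RESOURCE_ATTRIBUTES")
    for tag in RESOURCE_ATTRIBUTE_TAGS:
        ET.SubElement(attrs, tag).text = record[tag]
    for name, values in elements.items():
        pe = ET.SubElement(resource, "PROFILE_ELEMENT")
        ET.SubElement(pe, "ELEMENT_NAME").text = name
        for value in values:
            ET.SubElement(pe, "VALUE_INSTANCE").text = value
    return resource


# Hypothetical record standing in for one row of a metadata repository.
example_record = {
    "RESOURCE_ID": "EXAMPLE-DATA-SET-V1.0",
    "RESOURCE_TITLE": "Example data set",
    "RESOURCE_DISCIPLINE": "PDS",
    "RESOURCE_AGGREGATION": "GRANULE+",
    "RESOURCE_CLASS": "DATA",
    "RESOURCE_LOCATION_ID": "http://example.invalid/data-set",
    "RESULT_MIME_TYPE": "text/html",
}
example_resource = build_resource(example_record, {"TARGET_NAME": ["MARS"]})
example_xml = ET.tostring(example_resource, encoding="unicode")
```

Keeping the tag order in one list makes the generated element follow the sequence the DTD declares, which matters because DTD content models with commas are order-sensitive.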
Conclusion Profile Development - Review
•Choose a common interchange format. (XML)
•Develop a domain generic language. (X2PL - XML eXtensible Profile Language)
•Implement domain specific instances. (Resource Profiles)
•Develop system to manage the profiles. (Profile Servers)
Conclusion Issues
•Develop space science metadata registry
  •~10 high level concepts - “Anchor Points”
  •Complete development of discipline registries
•Determine management policy
  •Design meta-model and mandate conformance
  •Evolve meta-model through voluntary conformance
•Determine space science metadata standards
  •NASA Data Entity Dictionary Specification Language (DEDSL - XML syntax) currently being used