Post on 02-Jan-2016
Metadata
Generally speaking, metadata are data and information that describe and model data and information.
For example, a database schema is the metadata for the data stored in the database.
Metadata also include data that represent properties and relationships among individual objects (instances) of any type (e.g., tectonic, sedimentary, geochemical).
This kind of metadata is typical of an ontology, which is a Semantic Web technology.
Vocabulary
Metadata are used to specify vocabularies for exchanging data among different people in research groups or between machines.
The vocabulary enriches the data so that software can interact with and manipulate them.
Metadata tell software (algorithms, processors) what to do with the data and how to use them, and are of many kinds:
Syntactic (how code statements are put together)
Structural (how data are structured; e.g., relational, XML, OO, graph)
Referent (set of allowable relations or properties connecting objects to instances, e.g., subclass, part-of, intersection, disjoint)
Domain specific
These metadata are in forms that include database schemas, XML documents, UML diagrams, and domain-specific entity hierarchies.
Subsumptive/Partitive Hierarchy
Metadata allow representation of the format and organization of data (e.g., taxonomy, partonomy), for example:
Foliation isA PlanarStructure describes the subsumption relationship between foliation and planar structure. Subsumption is the word for the is-a relation: if B is a kind of A, then we say A subsumes B, and B is subsumed by A.
Mineral partOf Rock describes the meronomic (part-whole) relation between minerals and rocks.
Metadata can also express relationships between various types of data, e.g., AxialPlanarFoliation parallel FoldAxialPlane.
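The is-a and part-of relations above can be sketched in a few lines of Python. This is only a minimal illustration, not a real ontology tool: the dictionaries, the function names, and the extra statements "AxialPlanarFoliation isA Foliation" and "Lineation isA LinearStructure" are assumptions made for the example.

```python
# Hypothetical toy hierarchy; class names follow the examples in the text.
# Each key is related to exactly one parent (an assumption of this sketch).
IS_A = {
    "Foliation": "PlanarStructure",
    "AxialPlanarFoliation": "Foliation",  # assumed for illustration
    "Lineation": "LinearStructure",       # assumed for illustration
}
PART_OF = {"Mineral": "Rock"}


def ancestors(term, relation):
    """Return every term reachable from `term` by following `relation` upward."""
    found = []
    while term in relation:
        term = relation[term]
        found.append(term)
    return found


def subsumes(a, b):
    """True if class `a` subsumes class `b`, i.e. b isA ... isA a."""
    return a == b or a in ancestors(b, IS_A)


print(subsumes("PlanarStructure", "AxialPlanarFoliation"))  # True
print(subsumes("LinearStructure", "Foliation"))             # False
print(ancestors("Mineral", PART_OF))                        # ['Rock']
```

Because subsumption is transitive, the check walks the whole chain of is-a links rather than looking only at the direct parent.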
Entailment
Metadata also allow us to formally specify and represent our domain knowledge by describing the information domain (i.e., a field such as geochemistry), thereby helping us to infer implicit statements from explicit statements through inference rules and entailment, e.g.:
Suppose we assert in our ontology that PlanarStructure has strike, that LinearStructure disjointWith PlanarStructure, that Lineation isA LinearStructure, and that LinearStructure has trend. This knowledge can then be used to make inferences about the underlying data.
For example, if a structure, such as Foliation, has strike, we can infer that it is a planar structure; if it has trend, we infer that it is a linear structure.
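The inference described here can be sketched as follows. The sketch assumes that "PlanarStructure has strike" is read as "the strike property belongs to planar structures" (and likewise for trend), so that having the property implies membership in the class. The table names and both functions are hypothetical, and a real reasoner would be far more general.

```python
# Assumed schema-level statements, taken from the example in the text.
DOMAIN = {"strike": "PlanarStructure", "trend": "LinearStructure"}
SUBCLASS_OF = {"Foliation": "PlanarStructure", "Lineation": "LinearStructure"}
DISJOINT = {frozenset({"PlanarStructure", "LinearStructure"})}


def infer_types(asserted_types, properties):
    """Collect asserted classes, classes implied by property domains, and all superclasses."""
    types = set(asserted_types)
    for prop in properties:
        if prop in DOMAIN:
            types.add(DOMAIN[prop])
    for cls in list(types):
        while cls in SUBCLASS_OF:
            cls = SUBCLASS_OF[cls]
            types.add(cls)
    return types


def is_consistent(types):
    """False if the inferred types contain a pair of classes declared disjoint."""
    return not any(pair <= types for pair in DISJOINT)


# A structure known only to have a strike is inferred to be planar.
print(infer_types([], ["strike"]))  # {'PlanarStructure'}

# A Lineation that is given a strike ends up in two disjoint classes.
print(is_consistent(infer_types(["Lineation"], ["strike"])))  # False
```

The second call shows why the disjointness statement matters: it lets the software flag data that contradict the asserted domain knowledge.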
Applications of metadata
Metadata are used as a tool to describe and model domain information and knowledge, and can support several useful functionalities such as navigation, browsing, and retrieval of maps, images, and information about a specific geologic feature or phenomenon such as a rock or mineral sample.
Metadata will enable knowledge-based decision support and management systems.
The decision support system, when implemented, can be used by decision makers in geoscience communities, and the knowledge management system will be used by geologists in these communities who are trying to figure out the relationships between cross-disciplinary geological facts and phenomena (e.g., mineral reserves and petrology; geochemistry and water quality).
Types of metadata
Metadata can describe content-independent information, such as rock sample number or the date the sample was taken.
The URI (Uniform Resource Identifier) associated with a geological resource is another example of this kind of metadata.
Content-based metadata, on the other hand, describe the structural information of documents or artifacts, and domain-specific terminology and vocabulary, which capture both intra- and inter-domain relationships among data (i.e., within one field or between different fields, for example within the Geochemistry field, or between the Geochemistry and Petrology fields).
While the structural metadata describe the format and organization of the underlying data, the domain-specific metadata capture information about the domain (e.g., stratigraphy, geochemistry) and are the most relevant and useful as far as scientific semantics is concerned.
Metadata are commonly developed in isolation, and require intermediary software for interchange, interoperability, and integration.
The Semantic Web can help in developing systems that efficiently link and integrate distributed data across a community.
Decisions on where to explore for a specific mineral or drill wells for water or oil depend on the accuracy of the data, and on how these data (e.g., aquifer and rock type or contaminants) are related to each other.
Currently, these data are scattered across publications and unrelated databases and worksheets.
Ontology
Structured vocabularies define the metadata for specific fields (domains). The more domain-specific the metadata, the more useful they become for modeling the domain knowledge.
Therefore, the terms in the vocabularies should capture consensual domain terms and the interrelationships among these terms.
Among the different types of vocabularies, ontologies are at the top of the hierarchy in providing the most useful and complete metadata, and hence semantics.
An ontology is a formal specification and model of a domain’s knowledge (e.g., knowledge of Geochemistry). It defines the shared vocabulary and the interrelationships that exist among the real individual objects within a specific field or domain of discourse, such as plate tectonics.
Metadata Frameworks
Metadata frameworks are specifications that allow creating, manipulating, and querying metadata descriptions, and include those that are XML-, RDF-, and OWL-based (among others).
Each of these frameworks consists of a data model, semantics (applying RDF, RDFS, OWL), a serialization format (e.g., XML, N3), and a query language (e.g., SPARQL).
The XML-based metadata framework is used to capture both content (separate from presentation) and metadata, but not semantics.
In XML, the schema exists alongside the data in the form of tag names.
This allows the self-describing content to include both data and metadata.
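A small Python sketch may make the idea of self-describing XML content concrete. The document below, including its tag names, the sample identifier, and the values, is invented for illustration; only the standard-library `xml.etree.ElementTree` module is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record: tag and attribute names act as metadata,
# the text inside the tags is the data.
DOC = """
<sample id="S-001">
  <rockType>granite</rockType>
  <collected>2015-06-01</collected>
</sample>
"""

root = ET.fromstring(DOC.strip())

# Recover both the metadata (tag names) and the data (text) from the same document.
record = {child.tag: child.text for child in root}

print(root.tag, root.attrib["id"])  # sample S-001
print(record)  # {'rockType': 'granite', 'collected': '2015-06-01'}
```

Note that nothing here tells the software what "rockType" means or how it relates to other terms, which is the sense in which XML carries metadata but not semantics.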
RDF
The RDF-based metadata representation, which is commonly serialized in XML, is designed to describe metadata for resources on the Web.
RDF uses a subject-predicate-object triple graph format.
The subject and object are resources; on the Web, anyone can say anything about anything.
The RDF triple Sample analysis Chemistry means that a specific sample (a resource, the subject) has an analysis (the predicate) given by the chemistry resource (the object, which can be a list of trace element data).
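The triple model can be illustrated with plain Python tuples. This is a minimal sketch rather than a real RDF library: the `match` helper is hypothetical, and every triple other than Sample analysis Chemistry is invented for the example.

```python
# Each statement is a (subject, predicate, object) tuple.
TRIPLES = [
    ("Sample", "analysis", "Chemistry"),       # from the text
    ("Sample", "collectedFrom", "Outcrop"),    # assumed for illustration
    ("Chemistry", "lists", "TraceElements"),   # assumed for illustration
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every non-None position of the pattern."""
    pattern = (subject, predicate, obj)
    return [
        t for t in triples
        if all(want is None or want == got for want, got in zip(pattern, t))
    ]


# Everything said about the sample:
print(match(TRIPLES, subject="Sample"))

# Which resource gives the sample's analysis?
print(match(TRIPLES, subject="Sample", predicate="analysis"))
# [('Sample', 'analysis', 'Chemistry')]
```

Leaving a position as `None` plays the role of a variable, which is roughly how a triple pattern in a query language such as SPARQL works.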
OWL
The OWL-based metadata framework, which builds on RDF and RDF Schema (RDFS), allows construction of more complex semantic expressions at the schema and data levels.
OWL allows defining classes, class membership, and relations between classes (e.g., subclass-of, disjoint-from, equivalent).
Among many other constructs, OWL allows defining the domain and range of each property.
OWL-QL and SPARQL are two query languages that can be used with OWL data.
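How domain and range declarations let software derive class membership can be sketched as below. The property table, the instance names `sample42` and `traceList7`, and the function are all hypothetical, and the sketch assumes a single domain and a single range per property, which real OWL does not require.

```python
# Hypothetical property declarations: one domain class and one range class each.
PROPERTY_AXIOMS = {
    "analysis": {"domain": "Sample", "range": "Chemistry"},
}


def entailed_memberships(subject, predicate, obj):
    """Return the class memberships entailed by one statement via domain and range."""
    axioms = PROPERTY_AXIOMS.get(predicate, {})
    entailed = {}
    if "domain" in axioms:
        entailed[subject] = axioms["domain"]  # the subject belongs to the domain class
    if "range" in axioms:
        entailed[obj] = axioms["range"]       # the object belongs to the range class
    return entailed


print(entailed_memberships("sample42", "analysis", "traceList7"))
# {'sample42': 'Sample', 'traceList7': 'Chemistry'}
```

In this reading, domain and range are used to infer new facts about the individuals in a statement rather than to reject statements, which is how they are generally treated in RDFS and OWL.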