CONCEPT DEFINITION TASK GROUP
Rome, Aug. 30, 2010
Agenda
Current status of vocabularies
Reorganization of CGI workgroups
Vocabulary resource management
Change URI scheme from URN to http URI
Current GeoSciML
Vocabularies
(33 total)
Subversion repository
Current GeoSciML vocabs
Metadata
Preferred label (en, others)
Text definition (en)
Asserted hierarchy
Services
AuScope, BRGM
Reorganization
New vocab group being organized
Merge with Multilingual thesaurus group
Develop statement of work
Meeting planned this fall to formally organize/kick off
Management
Move repository to GeoSciML.org
Authorities for OGE vocabularies (CGI or ?)
New vocabulary requirements for v3 documents
Formal decisions and policies to be developed by new workgroup
URI SCHEME
For CGI resources
http URIs
Identify information and non-information resources
Expected to be dereferenceable using the existing DNS on the Internet
Reasons for selecting HTTP URI
HTTP URIs are URIs, so they may appear as the value of an @xlink:href in a GML-conformant document
They are persistent
May identify offline as well as online resources
Implies immediate resolvability, a Good Thing in most circumstances
Structure is "faceted"
  Enables more flexible rules for identifier governance
  Composed of slash-delimited alphanumeric fields
  Allows some explicit semantics to be visible: may imply resource type, ownership, even value; often useful during system development
OGC has adopted HTTP URIs in its service architecture
(with apologies to Simon Cox, https://www.seegrid.csiro.au/twiki/bin/view/CGIModel/CGIIdentifierScheme#URN_vs_URI)
What is identified
Four distinct but related resources that we might like to identify using an HTTP URI (Booth):
Identifier label: the actual string (info)
A concept or physical entity (non info)
Web location: the information resource that is produced by an HTTP GET request using that URI (non information). There is no guarantee that the same web location will return the same resource when it is requested again
A particular document instance (info)
URI requirements
Identifiable and transparent to people. Branding: trust, advertising; evident what is identified
Memorable
Keyboard compatible
Usable, reliable, documented
Distributed (support for the delegation and transfer of naming authority)
Stable, cost effective
Portable: change the dereferencing host system without reengineering identifiers
Redirect
303 response code: the redirect is invisible to a human user
Complicates relocation of the dereferencing host for legacy URIs
Suggestion: explicitly distinguish the dereferencing host from the name authority, so that the two may be decoupled
Content negotiation and URL redirection that are invisible to the user violate the requirement for URI transparency
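As a minimal sketch of the suggested decoupling, the function below builds a 303 response that keeps the identifier path unchanged and swaps in whichever host currently serves representations. The function name and the host names are hypothetical and used only for illustration; the slides do not specify a redirect implementation.

```python
from urllib.parse import urlsplit


def see_other(identifier_uri: str, representation_host: str):
    """Return (status, headers) for a 303 See Other redirect.

    Assumption: the path of the identifier is reused as-is on the
    host that currently serves representations, so that host can be
    changed without touching the identifier itself.
    """
    path = urlsplit(identifier_uri).path
    location = f"http://{representation_host}{path}"
    return 303, {"Location": location}


# Hypothetical identifier and hosts, for illustration only.
status, headers = see_other(
    "http://example.org/cgi-uri/cgi/classifier/granite",
    "host-a.example.net",
)
```

Moving the representations to another machine would then mean changing only the second argument; the identifier string stays the same.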
CGI URN scheme
"urn" ":" "cgi" ":" CGIResource ":" ResourceSpecificString
  "urn": protocol
  "cgi": name authority
  CGIResource: resource type
  ResourceSpecificString: resource ID
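The four fields of the URN pattern can be pulled apart with a short parser. This is a sketch under one stated assumption that the slides do not confirm: the resource-specific string may itself contain colons, so only the first three colons are treated as field separators. The class name, function name, and sample URN are illustrative.

```python
from typing import NamedTuple


class CgiUrn(NamedTuple):
    protocol: str        # always "urn"
    name_authority: str  # always "cgi"
    resource_type: str   # the CGIResource field
    resource_id: str     # the ResourceSpecificString field


def parse_cgi_urn(urn: str) -> CgiUrn:
    # Assumption: split on the first three colons only, leaving any
    # later colons inside the resource-specific string.
    parts = urn.split(":", 3)
    if len(parts) != 4:
        raise ValueError(f"not enough fields in {urn!r}")
    protocol, authority, resource_type, resource_id = parts
    if protocol.lower() != "urn" or authority.lower() != "cgi":
        raise ValueError(f"not a urn:cgi identifier: {urn!r}")
    if not resource_type or not resource_id:
        raise ValueError(f"empty field in {urn!r}")
    return CgiUrn(protocol.lower(), authority.lower(), resource_type, resource_id)


# Illustrative identifier, not taken from the slides.
example = parse_cgi_urn("urn:cgi:classifier:CGI:SimpleLithology:2008:granite")
```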
CGI http URI scheme
"http" "://" host "/" "cgi-uri" "/" Authority "/" cgi resource "/" resource specific identifier
  "http": protocol
  host: host authority
  "cgi-uri": URI scheme ID
  Authority: name authority
  cgi resource: resource type
  resource specific identifier: resource ID
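A small builder shows how the slash-delimited fields compose into one identifier. It is only a sketch: the function name, the host `example.org`, and the field values are hypothetical, and percent-encoding each field is an assumption made here so that a stray slash cannot shift the field boundaries.

```python
from urllib.parse import quote


def build_cgi_http_uri(host: str, authority: str, resource_type: str,
                       resource_id: str, scheme_id: str = "cgi-uri") -> str:
    fields = [scheme_id, authority, resource_type, resource_id]
    if not host or any(not field for field in fields):
        raise ValueError("every field of the URI pattern is required")
    # Assumption: encode each field fully (including "/") so the
    # number of path segments always matches the pattern.
    path = "/".join(quote(field, safe="") for field in fields)
    return f"http://{host}/{path}"


# Hypothetical values, for illustration only.
uri = build_cgi_http_uri("example.org", "cgi", "classifier", "granite")
```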
USGIN Scheme
"http:" "//" Host "/" URIscheme "/" nameAuthority "/" resourcePath "/" resourceSpecificString "/" [representationPart]
  "http:": protocol
  Host: dereferencing service
  URIscheme: URI scheme ID
  nameAuthority: name authority ID
  resourcePath: resource type
  resourceSpecificString: resource ID
  [representationPart]: representation ID
  Representation instance ID?
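The pattern above can be read back into its named parts with the standard library. This sketch assumes that every field is a single path segment and that the optional representation part, when present, is the fifth segment; the slides do not settle either point. The function name and sample URIs are hypothetical.

```python
from urllib.parse import urlsplit

FIELD_NAMES = [
    "URIscheme",
    "nameAuthority",
    "resourcePath",
    "resourceSpecificString",
    "representationPart",  # optional
]


def parse_usgin_uri(uri: str) -> dict:
    parts = urlsplit(uri)
    if parts.scheme != "http" or not parts.netloc:
        raise ValueError(f"not an http URI: {uri!r}")
    segments = [segment for segment in parts.path.split("/") if segment]
    # Assumption: four mandatory single-segment fields plus one
    # optional representation segment.
    if len(segments) not in (4, 5):
        raise ValueError(f"expected 4 or 5 path segments in {uri!r}")
    result = {"protocol": parts.scheme, "host": parts.netloc}
    result.update(zip(FIELD_NAMES, segments))
    result.setdefault("representationPart", None)
    return result


# Hypothetical URIs, for illustration only.
plain = parse_usgin_uri("http://example.org/uri-scheme/cgi/classifier/granite")
with_repr = parse_usgin_uri("http://example.org/uri-scheme/cgi/classifier/granite/rdf")
```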
Examples from the wild
Stem part with host name, authority, and resource type path part, followed by an opaque resource-specific part.
E.g.
http://vocab.ndg.nerc.ac.uk/term/C161/0/28Bh
http://www.eionet.europa.eu/gemet/concept/7769
http://zbw.eu/stw/descriptor/12880-5
Issues
Proposal to reverse the CGIResource/Authority pair to Authority/CGIResource
How are particular representations identified?
  File extensions ('.rdf', '.html')
  Content negotiation
Is the host part considered part of the identifier string?
Underscores ‘_’ or hyphens ‘-’ in http URIs?
All characters lower-case, or CamelCase?
What is the default representation?
  RDF
  HTML
  GeoSciML fragment
  Defined based on resource type
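The two options for identifying a representation can be compared side by side in a small sketch. Here an explicit file extension wins, otherwise the Accept header is consulted, and otherwise a default applies. The function name, the media-type table, the precedence order, and the choice of HTML as the default are all assumptions made for illustration; the slides leave these questions open.

```python
# Assumed mapping from file extension to media type.
EXTENSIONS = {".rdf": "application/rdf+xml", ".html": "text/html"}
# Assumption: HTML as the default; the slides list this as undecided.
DEFAULT_MEDIA_TYPE = "text/html"


def choose_representation(path: str, accept: str = "") -> str:
    # Option 1: an explicit file extension on the URI path.
    for extension, media_type in EXTENSIONS.items():
        if path.lower().endswith(extension):
            return media_type

    # Option 2: content negotiation on the Accept header.
    ranked = []
    for item in accept.split(","):
        media_type, _, params = item.strip().partition(";")
        media_type = media_type.strip().lower()
        if not media_type:
            continue
        quality = 1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        ranked.append((quality, media_type))
    # sorted() is stable, so equal q-values keep the header order.
    for quality, media_type in sorted(ranked, key=lambda pair: -pair[0]):
        if quality > 0 and media_type in EXTENSIONS.values():
            return media_type

    # Neither option matched: fall back to the assumed default.
    return DEFAULT_MEDIA_TYPE
```

With file extensions the choice is visible in the URI itself; with content negotiation the same URI can yield different documents depending on the request headers.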