Dutch government business case
TRANSCRIPT
Slide 1
Dutch Government Metadata
Overheid.nl Web Metadata Standaard: OWMS
Hans [email protected]
DC2009
Hello, my name is Hans Overbeek. I am a civil servant working for the Dutch government. My main responsibility is to develop and manage a metadata standard for Dutch government information on the Internet. The standard we develop is called OWMS.
OWMS is based on Dublin Core. We try to stay as close as possible to the specifications of the DCMI.
Interoperability
of government information
1200+ organisations
1600+ websites
16M+ citizens
Let's take a look at the business case to see what we are talking about. The Netherlands is a small but densely populated country in western Europe. We estimate the number of organizations within the Dutch government at 1200. There are about 1600 governmental websites that contain public information for a population of about 16 million people and companies.
Information Architecture Findability
Any question
Information (Content)
We see today that, with the vast growth in the supply of information, many people lose their way.
Information Architecture Findability
Any question
amsterdam.nl, government.nl, government.nl, xyz.nl, any website
Information (Content)
In an attempt to reach different target audiences, new websites are built with new editorial boards, but with reused information. To make this reuse easier, we use metadata to give structure to information that is not structured in itself. We distinguish between the supply of information by governmental organizations, here on your right side, and the demand for information by citizens and companies, here on your left.
Information Architecture Findability
Structure of demand (Standard question): Location (e.g. map), Life event, Theme, Search term
Any question
amsterdam.nl, government.nl, government.nl, xyz.nl, any website
Information (Content)
On the demand side we need a structure to create different views on the information and, for instance, to make mash-ups of information from different sources. This structure is what we call the structure of demand. You can think of postal code areas to specify location, standard navigation structures like themes or life events, or pre-cooked queries on mash-up pages.
Information Architecture Findability
Structure of demand (Standard question): Location (e.g. map), Life event, Theme, Search term
Any question
Metadata (Dutch Core): identifier, title, type, language, creator, modified, spatial, temporal
amsterdam.nl, government.nl, government.nl, xyz.nl, any website
Information (Content)
On the supply side we need a stable standard for metadata. Some information, if not all, can stay posted for a long time, and you don't want to change the metadata every time your presentation needs change.
Information Architecture Findability
Structure of demand (Standard question): Location (e.g. map), Life event, Theme, Search term
Any question
Metadata (Dutch Core): identifier, title, type, language, creator, modified, spatial, temporal
Knowledge model (Linked Data): Geo, Subject, Language
amsterdam.nl, government.nl, government.nl, xyz.nl, any website
Information (Content)
In order to bridge the gap between the need for stability in the so-called back end and the need for flexibility in the front end, we need to build a model of our world in Linked Data. We call this layer the knowledge models. You can think of semantic networks, geographic models, decision trees, inference rules, thesauri, synonyms, translations, etc. This is the 'Bigger Picture'. Let's have a look at the 'Smallest Parts'.
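To make the three layers concrete, here is a minimal Python sketch. It is only an illustration: the records, the identifiers under example.org and the synonym table (including the nickname "Mokum") are made-up sample data, and the function names are hypothetical. The point it tries to show is that the stable metadata stays untouched while the knowledge model translates a question into a concept.

```python
# Hypothetical sample data; none of these records or synonyms come from OWMS.

# Supply side: stable metadata records (a subset of the core properties).
RECORDS = [
    {"identifier": "https://example.org/doc/1", "title": "Parking permits", "spatial": "Amsterdam"},
    {"identifier": "https://example.org/doc/2", "title": "Waste collection", "spatial": "Utrecht"},
    {"identifier": "https://example.org/doc/3", "title": "Canal cruises", "spatial": "Amsterdam"},
]

# Knowledge model: maps alternative labels onto one preferred concept label.
SYNONYMS = {"mokum": "Amsterdam", "amsterdam": "Amsterdam", "utrecht": "Utrecht"}


def resolve(term):
    """Return the preferred label for a search term, or None if unknown."""
    return SYNONYMS.get(term.strip().lower())


def view_by_location(term, records=RECORDS):
    """Demand side: build a location view without changing the metadata."""
    concept = resolve(term)
    return [r["title"] for r in records if r["spatial"] == concept]


print(view_by_location("Mokum"))
```

A new front-end need, such as a new nickname or a translation, would then only touch the synonym table, not the records.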
Metadata: Dutch Government Core
dcterms:identifier
dcterms:title
dcterms:type
dcterms:language
dcterms:creator
dcterms:modified
(dcterms:temporal)
(dcterms:spatial)
8 properties
mandatory
(if applicable)
First, the metadata that forms the stable structure of the information supply. OWMS, the metadata standard for the Dutch government, is a Dublin Core Application Profile.
To be attractive enough for implementers of government information systems, such as Content Management, Record Management and Document Information Systems, we decided to make only a minimal set of properties mandatory. So we came to these eight core elements, the OWMS core.
Identifier, title and type - to identify and recognize the described information object.
Language - to filter objects by language, and because it is a property that is relatively easy to provide.
Creator - to know who says so. It gives the described resource authenticity.
Modified - also covers the date created, so you can tell how old the described information object is.
Temporal - the period that the information object is about.
Spatial - the area or point that the information object is about.
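As a rough illustration of what "mandatory" could mean in practice, the sketch below checks a record for the six always-required properties and treats temporal and spatial as "if applicable". The record layout (a plain dict keyed by property name), the sample values and the function name are assumptions made for this example, not part of OWMS.

```python
# Property names follow the slide; the dict-based record layout is an assumption.
MANDATORY = ("identifier", "title", "type", "language", "creator", "modified")
IF_APPLICABLE = ("temporal", "spatial")


def missing_core_properties(record):
    """Return the mandatory core properties that are absent or empty."""
    return [name for name in MANDATORY if not record.get(name)]


record = {
    "identifier": "https://example.org/doc/1",  # hypothetical values throughout
    "title": "Amsterdam.nl Homepage",
    "type": "webpage",
    "language": "nl",
    "creator": "Amsterdam",
    "spatial": "Amsterdam",  # optional: only given because it applies here
}

print(missing_core_properties(record))
```

Here the check reports that "modified" is missing, while the absence of "temporal" is not treated as an error.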
Aim: Simple content metadata
Text
Value from a VES
URI
URI with optional VES and Text
Amsterdam.nl Homepage
What does OWMS metadata look like? Our aim is to make it look as simple as possible. And that's not easy! ;-) In its simplest form it is just plain text, for example in the case of dcterms:title.
For other properties we provide controlled lists of values, or Vocabulary Encoding Schemes in Dublin Core terms. In this example we use the value Amsterdam from the Vocabulary Encoding Scheme called overheid:Gemeente. (Overheid means Government, Gemeente means Municipality.)
At this very moment we are developing a framework to define URIs, or pointers, for commonly used concepts, also known as non-literal values. They can be used stand-alone, like this pointer to the municipality of Amsterdam, ...
...or optionally in combination with a scheme and a label. These concepts will be defined as Linked Data and have relationships with other concepts.
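The four forms on the slide can be told apart mechanically. The following Python sketch is one possible way to do that; the function name, the keyword arguments and the example.org URI are hypothetical, since the real OWMS URIs were not yet published at the time of this talk.

```python
def value_form(text=None, scheme=None, uri=None):
    """Name which of the four slide forms a value corresponds to."""
    if uri is not None:
        if text is not None or scheme is not None:
            return "URI with optional VES and text"
        return "URI"
    if text is None:
        raise ValueError("a value needs at least a text or a URI")
    if scheme is not None:
        return "value from a VES"
    return "text"


examples = [
    {"text": "Amsterdam.nl Homepage"},
    {"text": "Amsterdam", "scheme": "overheid:Gemeente"},
    {"uri": "http://example.org/id/gemeente/Amsterdam"},  # hypothetical URI
    {
        "uri": "http://example.org/id/gemeente/Amsterdam",  # hypothetical URI
        "scheme": "overheid:Gemeente",
        "text": "Amsterdam",
    },
]

for example in examples:
    print(value_form(**example))
```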
Documentation of the standard
Human-readable documentation of the standard: http://tinyurl.com/nlgov-doc
Eigenschappen = Properties
Waardelijsten = Lists of values (VES)
Toepassingen (IPM) = Application Profile
Overheid = Government
An overview of the properties and ranges that we use in our standard can be found at the tinyurl I provided. The site is written in our beautiful language, Dutch, but this translation table should be enough to get an idea.
Metadata standard
Dublin Core abstract model
Simple OWMS model
Collection application profile
So our Dublin Core Application Profile follows an abstract model which is a little bit simpler than the Abstract Model of Dublin Core.
Dutch abstract model
From the top of the model going down, we see that a described resource is described by a description.
A description consists of statements.
A statement is a property-value pair.
A value can be:
- a URI, which is a pointer to a concept
- a plain value string
- a Vocabulary Encoding Scheme and a string
As said before, this model is a simplification of the full Dublin Core Abstract Model (DCAM). It is a bit more pragmatic and easier to understand, though it is not academically correct and not as flexible as the DCAM.
There are two reasons for that:
1. We want to keep the metadata itself as easy to understand as possible.
2. We want to validate the metadata provided by information suppliers, which means that we want organizations to use known encoding schemes and, for instance, not to use xsi:type constructions or their own datatypes.
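The simplified model and the second reason, validation against known encoding schemes, can be sketched together in a few lines of Python. This is not the XSD-based validation used for the standard; the class names, the KNOWN_SCHEMES registry and the "my:OwnList" scheme are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical registry of accepted encoding schemes.
KNOWN_SCHEMES = {"overheid:Gemeente"}


@dataclass
class Statement:
    """A property-value pair; the value is a URI, a plain string, or a VES plus string."""
    prop: str
    text: Optional[str] = None
    scheme: Optional[str] = None
    uri: Optional[str] = None


@dataclass
class Description:
    """Describes one resource through a list of statements."""
    resource: str
    statements: List[Statement] = field(default_factory=list)


def validation_errors(description):
    """Collect problems: statements without a value and unknown encoding schemes."""
    errors = []
    for st in description.statements:
        if st.uri is None and st.text is None:
            errors.append(f"{st.prop}: no value")
        if st.scheme is not None and st.scheme not in KNOWN_SCHEMES:
            errors.append(f"{st.prop}: unknown scheme {st.scheme}")
    return errors


home = Description(
    resource="http://www.amsterdam.nl/",
    statements=[
        Statement("dcterms:title", text="Amsterdam.nl Homepage"),
        Statement("dcterms:spatial", text="Amsterdam", scheme="overheid:Gemeente"),
        # A supplier using a home-made list, which the validator should reject.
        Statement("dcterms:creator", text="Amsterdam", scheme="my:OwnList"),
    ],
)

print(validation_errors(home))
```

Keeping the value down to three optional fields is what makes such a check short; a model that allowed arbitrary datatypes would need a far more open-ended validator.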
Description Set Profile in XSD
http://tinyurl.com/nlgov-xsd
In case you are interested in the XML Schema definition that we use to validate content metadata against the standard, you will find it at this tinyurl. I cannot go through the schema within the time frame of this presentation, but if you are interested you are welcome to investigate it. If you have any questions, please don't hesitate to contact me.
URIs: pointers to knowledge objects
Knowledge Objects are defined resources within the overheid namespace
Every KO belongs to at least 1 owl:Class
Some KOs are skos:Concepts in a skos:ConceptScheme
Classes, ConceptSchemes and KOs share the same namespace
Naming conflicts are solved by qualification (as in DBpedia:
e.g. http://dbpedia.org/resource/Paris
and http://dbpedia.org/resource/Paris,_Texas_(film))
Linked Data tutorial at http://tinyurl.com/berlin-tut
So if information suppliers provide URIs to concepts in their content metadata, they use URIs to Linked Data. Unfortunately we are not ready yet to publish our concepts, so these URIs are not dereferenceable yet. We define all knowledge objects in the same namespace. We define a number of reference classes. Every knowledge object belongs to at least one of these classes. Some knowledge objects are skos:Concepts and belong to a skos:ConceptScheme.
Our concepts will be published following the very useful tutorial about Linked Data by the Free University of Berlin. I provided a tinyurl again. So we will publish a human-readable HTML-page and a machine-readable RDF-snippet for each concept.
These concepts will have relations, represented in what we have called knowledge models.
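The rules for knowledge objects can be written down as a handful of triples. The Python sketch below builds them by hand and prints them as N-Triples. The base namespace under example.org and the local names (Gemeente, Onderwerpen, Amsterdam, Parkeren and the two qualified Bergen names) are hypothetical; only the RDF, OWL and SKOS URIs are the standard ones.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"
SKOS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/overheid/"  # hypothetical; the real namespace is not given here


def ko(local_name):
    """Mint a URI in the single shared namespace."""
    return BASE + local_name


triples = [
    # A reference class and a concept scheme live in the same namespace as the KOs.
    (ko("Gemeente"), RDF_TYPE, OWL_CLASS),
    (ko("Onderwerpen"), RDF_TYPE, SKOS + "ConceptScheme"),
    # Every knowledge object belongs to at least one class.
    (ko("Amsterdam"), RDF_TYPE, ko("Gemeente")),
    # Some knowledge objects are SKOS concepts in a concept scheme.
    (ko("Parkeren"), RDF_TYPE, SKOS + "Concept"),
    (ko("Parkeren"), SKOS + "inScheme", ko("Onderwerpen")),
    # Naming conflicts: two objects with the same name get qualified local names.
    (ko("Bergen_(Limburg)"), RDF_TYPE, ko("Gemeente")),
    (ko("Bergen_(Noord-Holland)"), RDF_TYPE, ko("Gemeente")),
]


def to_ntriples(rows):
    """Serialise URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in rows)


print(to_ntriples(triples))
```

In a real publication each of these URIs would also resolve to an HTML page and an RDF snippet; that part is left out of the sketch.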
Knowledge domains
Geo
Organization
Information type
Subject
Language
We distinguish roughly five different knowledge domains, each with its own characteristics.
The concepts within the geographical domain are the most tangible ones. Geographical entities can be projected on a map and are therefore quite easy to define and to recognize. They are very powerful when used for filtering or navigating to information. There are many initiatives around the world to standardize geo-concepts.
These knowledge models can be purchased on the market.
Organizations are also quite formally defined in many cases, though unfortunately not always.
Therefore we categorize organizations and geo-concepts using the OWL vocabulary. Many organizations correspond to a geo-concept, being the mandate area of that organization.
Then we distinguish a domain of government information types, products and services like laws, permits, announcements and so on.
They relate to the organizations which produce these kinds of information.
The most interesting and the most challenging domain is that of subjects. That is because there seem to be infinitely many ways to name and classify subjects. Mostly, the lemmas in these classifications are defined only by a label and lack a proper definition. Subjects relate to information types. The knowledge models for subjects are probably the holy grail of the semantic web. And the fact that we have only just begun to discover how we should handle this made us decide not to put dcterms:subject into our 'Dutch Core' so far.
And finally there is the Language domain with knowledge about synonyms and homonyms and stemming-rules like plurals and conjugations.
These knowledge models are available as thesauri, often for free.
Ideas, rules of thumb
Use owl:Class for more tangible entities
(e.g. organisations)
Use skos:ConceptScheme for more abstract entities
(e.g. information type, subject)
Start modeling from RDF/XML and try keeping models so simple that connection and conversion to alternate models (e.g. Topic Maps) remains possible.
So we aim to model geo-objects and organizations in OWL, and information types and subjects in SKOS. We realize that we are just starting this exercise and that a lot of the theory behind it is still developing. So we will not be able to make definitive decisions. We have been thinking about the choice between RDF and Topic Maps, and we decided to start developing our knowledge models in RDF, since that seems to be the more popular framework today. But we try to keep the models so simple that we can convert to or integrate with other frameworks if that is feasible.
Work in progress
Technical R&D issues:
How to determine classes and concept schemes?
How liberal or concise should our business rules be in order to achieve quality without discouraging contributors?
How to cope with change? e.g. Date in URI?
See: http://www.jenitennison.com/blog/node/108
and http://www.jenitennison.com/blog/node/112
Management issues:
many organisations and collections
time and money
There are still a lot of issues to be resolved. Our main concern is to develop a framework which is not only correct, but also simple enough to understand and to use. Therefore we need to present the classes and concept schemes in a way that is immediately clear to the user. We have to put business rules in place in order to enforce uniform application of the standard, but on the other hand, the standard should be flexible enough to serve a broad scope of applications and domains. There are some practical issues, like how to cope with changes. Jeni Tennison gave a very useful definition of the problem. And finally we need organizations to understand the need for metadata, to understand the standard and to start using it!
Thank you for your attention and thank you, Liddy, for giving me the opportunity to present our case to you all. If there are any questions, please feel free to send me an e-mail. I wish you all a good conference in Seoul and I hope I will be able to join you next year. Thank you.