What is Information Modelling (and why do we need it in NEII…)?
Dominic Lowe, Bureau of Meteorology
29 October 2013
A very simple information model
The real world Information model
Tree
Lake
Mountain
Snow
This is the domain we are interested in
This is how we conceptualise it
"height"
"depth"
"age" "deciduous"
"size" "water quality"
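One way to make the conceptualisation above concrete is to write it down as types. The sketch below is illustrative only. The slide does not say which attribute belongs to which feature, so the pairing used here (height with Mountain, depth with Snow, age and deciduous with Tree, size and water quality with Lake) is an assumption, as are the field names and units.

```python
from dataclasses import dataclass


@dataclass
class Tree:
    age_years: float     # "age" (unit is an assumption)
    deciduous: bool      # "deciduous"


@dataclass
class Lake:
    size_km2: float      # "size" (unit is an assumption)
    water_quality: str   # "water quality" (free text is an assumption)


@dataclass
class Mountain:
    height_m: float      # "height" (unit is an assumption)


@dataclass
class Snow:
    depth_m: float       # "depth" (unit is an assumption)


# An instance of the model standing in for one real-world tree.
tree = Tree(age_years=120, deciduous=True)
print(tree)
```

The same real-world scene could equally be described with different classes (species, rock type, chemistry), which is the point of the next slide.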
Another information model
The real world Another Information model
Spruce
H2O
Limestone
Permafrost
Same domain… …different conceptualisation
Chemistry
Land cover
Species
Rock type
Information models can be encoded in different ways
The real world Information model
Tree
Lake
Mountain
Snow
Encodings
Information Models are "implementation neutral"
Enabling integration of diverse data sources based on shared concepts
Mountain
Underpinning Information Model
"Many to 1" is more scalable than "Many to Many"
Information models
• Everyone who writes or stores data has one, even if it's not well defined.
• This is usually ok for closed systems where everyone roughly understands the same thing.
• In distributed systems (hint: NEII) it becomes problematic if there is no shared understanding of meaning across datasets.
• Effective data integration requires mapping to shared information models at some level.
• Data is our 'lifeblood' (S. Barrell, ISS division 1st staff meeting)
• Data carries information.
• NEII is an information infrastructure. It is not just about delivering data files.
• To what extent do we need to agree upon information models?
• How far can NEII get without addressing this?
Why is information modelling important to NEII?
The Environmental Info. Value Chain
discovery and access
integration
environmental intelligence
Diagram: A. Woolf
Metadata
Harmonised services
Harmonised data forms
Shared semantics
Shared 'understanding'
i.e. interoperability
Cataloguing
"As-is" Data Services
Results & Benefits
Information models needed to do this
Simple Example – pre NEII
Data Provider
A
Data Provider
B
Data Provider
C
Format A
Format B
Format C
Info Model A
Info Model B
Info Model C
• User wants data about species distribution
• User gets data about species distribution:
• 3 services
• 3 formats
• 3 information models
• Hard work!
• Not scalable!
NEII Example
NEII 'Species distribution'
Service, e.g. WFS
Data Provider
A
Data Provider
B
Data Provider
C
Format A
Format B
Format C
Info Model A
Info Model B
Info Model C
User wants data about species distribution
NEII Mediator role
Agreed service definition
Agreed information model
& Agreed encoding(s)
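The mediator role can be sketched as one small adapter per provider, each translating that provider's native records into the agreed model. Everything named below is hypothetical: the `SpeciesOccurrence` class, the provider record layouts, the field names, and the sample values are invented for illustration and do not describe any real NEII provider or schema.

```python
from dataclasses import dataclass


@dataclass
class SpeciesOccurrence:
    """Hypothetical agreed information model: one sighting of one species."""
    species: str
    lat: float
    lon: float


# One adapter per provider; each hides that provider's own model and format.
def from_provider_a(rec: dict) -> SpeciesOccurrence:
    return SpeciesOccurrence(rec["sci_name"], rec["y"], rec["x"])


def from_provider_b(rec: dict) -> SpeciesOccurrence:
    lat, lon = rec["position"]
    return SpeciesOccurrence(rec["taxon"], lat, lon)


def from_provider_c(row: str) -> SpeciesOccurrence:
    name, lat, lon = row.split(",")
    return SpeciesOccurrence(name.strip(), float(lat), float(lon))


occurrences = [
    from_provider_a({"sci_name": "Species X", "x": 147.3, "y": -42.9}),
    from_provider_b({"taxon": "Species X", "position": (-35.3, 149.1)}),
    from_provider_c("Species X, -37.8, 145.0"),
]

# The user works against a single model, whatever the source.
for occ in occurrences:
    print(occ.species, occ.lat, occ.lon)
```

Adding a fourth provider then means writing one more adapter, not changing what the user sees.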
Really needed for NEII ??
• How many data providers?
• How many 'domains' of interest?
• How many different information models exist… ?!
• … a lot probably
• How many are well defined?
• …probably not that many
• A significant challenge is integrating data in NEII by sharing common concepts.
Context: NEII Implementation
Based on Open Geospatial Consortium Services:
Catalog Service
Web Map Service
Sensor Observation Service
Web Feature Service
Web Coverage Service (?)
+ Vocabulary Services (RDF/Linked Data)
What do these services actually deliver?
Metadata (CSW)
Maps (WMS)
Features (WFS)
Observations (SOS)
Coverages (WCS)
Vocabularies (RDF)
What type of 'Metadata'?
Maps of what?
What type of 'Features'?
What type of 'Observations'?
What type of 'Coverage'?
What 'Vocabularies'?
Defining the 'what' is the role of information modelling
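The gap between the service and the 'what' shows up in a request. The sketch below builds a WFS 2.0.0 GetFeature request in key-value form. The `service`, `version`, `request`, `typeNames` and `count` parameters are standard WFS 2.0.0 ones; the endpoint URL and the feature type name `neii:SpeciesDistribution` are hypothetical placeholders. The service standard defines how to ask, while the information model defines what the named feature type means.

```python
from urllib.parse import urlencode

ENDPOINT = "https://example.org/wfs"  # hypothetical endpoint

params = {
    "service": "WFS",
    "version": "2.0.0",
    "request": "GetFeature",
    "typeNames": "neii:SpeciesDistribution",  # hypothetical feature type
    "count": 10,                              # limit the number of features
}

url = f"{ENDPOINT}?{urlencode(params)}"
print(url)
```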
How? Standards for Information Modelling
ISO TC 211
Geographic information/Geomatics
Overarching meta-model for geographic information and services
OGC
Open Geospatial Consortium
• Implementation of ISO concepts
• Service definitions used in NEII
ISO 19101 General Reference Model
The Reference model describes the use of Conceptual Modelling and how it is used in the
19100 family of standards to enable conforming application systems to inter-operate and share
conforming geographic data.
ISO General Feature Model
Features are objects with identity: Road, Lake, Observation, Bio-region, Borehole, etc. (anything!)
Features have:
- attributes/properties
- associations with other features
- operations that can be performed on them
E.g. a "Reservoir" feature has:
- 'perimeter', 'depth', 'use restrictions'
- fed by river, created by dam
- can be emptied
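The Reservoir example maps fairly directly onto a class. The sketch below is only an analogy in Python, not the ISO General Feature Model itself; the class names, field names, units and sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class River:
    name: str


@dataclass
class Dam:
    name: str


@dataclass
class Reservoir:
    # Identity
    id: str
    # Attributes/properties
    perimeter_km: float
    depth_m: float
    use_restrictions: List[str] = field(default_factory=list)
    # Associations with other features
    fed_by: List[River] = field(default_factory=list)
    created_by: Optional[Dam] = None

    # Operation that can be performed on the feature
    def empty(self) -> None:
        self.depth_m = 0.0


reservoir = Reservoir(
    id="reservoir.1",
    perimeter_km=12.0,
    depth_m=30.0,
    use_restrictions=["no swimming"],
    fed_by=[River("Example River")],
    created_by=Dam("Example Dam"),
)
reservoir.empty()
print(reservoir.depth_m)
```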
ISO Rules for defining features & a UML profile
S.Cox, Seegrid website
UML = Unified Modelling Language
ISO 19136 – Geography Markup Language
• GML = encoding of ISO concepts:
- spatial, temporal, features, coverages
• Encoding format used in:
OGC Web Feature Service
Sensor Observation Service
Web Coverage Service
• Expressive
• Needs profiling with an Information Model to develop 'application schemas' (domain-specific exchange formats)
BoM Case Study: WDTF (Water Data Transfer Format)
Information Model
Encoding Format (XML)
AWRIS Data Warehouse
Data from > 200 providers
Applications
Requirements: (The Water Act: Water Regulations)
One page summary
• Conceptual/information modelling is about modelling 'concepts' within a 'Universe of Discourse'.
• E.g. in the MetOcean universe of discourse, example concepts might be: Fronts, Forecasts, Grids, Surface Obs, Currents...
• The modelling process is about formalising these concepts so that a community has a well-documented, shared, stable and implementation-neutral model that can be a basis for applications and interoperability.
• Within the ISO TC 211 framework for Geographic Information, this process really means defining 'Feature Types', along with their attributes, operations and relations to other feature types.
• If we can agree upon and formalise all (or some) of our concepts, we can develop a strong basis for implementations that support interoperability and reuse.
• Using UML-to-GML rules we can automatically generate exchange formats from the information model that are compatible with NEII OGC services (WFS, SOS).
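The idea of deriving an encoding mechanically from a model can be illustrated in a few lines. The sketch below is a loose analogy, not the ISO 19136 UML-to-GML encoding rules: real tooling works from a UML model and generates an XML Schema, whereas this code takes a Python class and writes one XML instance using a simple rule (class name becomes the element name, each attribute becomes a child element). The `SurfaceObs` class, its fields, the sample values and the `http://example.org/metocean` namespace are hypothetical; only the GML 3.2 namespace URI is a real identifier.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields

GML_NS = "http://www.opengis.net/gml/3.2"
APP_NS = "http://example.org/metocean"  # hypothetical application namespace
ET.register_namespace("gml", GML_NS)
ET.register_namespace("app", APP_NS)


@dataclass
class SurfaceObs:
    """Hypothetical feature type from the MetOcean universe of discourse."""
    station: str
    air_temperature_c: float


def encode(feature, feature_id: str) -> str:
    """Apply the simplified rule: class -> element, attribute -> child element."""
    root = ET.Element(
        f"{{{APP_NS}}}{type(feature).__name__}",
        {f"{{{GML_NS}}}id": feature_id},
    )
    for f in fields(feature):
        child = ET.SubElement(root, f"{{{APP_NS}}}{f.name}")
        child.text = str(getattr(feature, f.name))
    return ET.tostring(root, encoding="unicode")


xml_text = encode(SurfaceObs("Example Station", 14.2), "obs.1")
print(xml_text)
```

Because the encoding is derived from the model by rule, changing the model changes the exchange format consistently, without hand-editing the XML.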