ceos idn task team chiang mai, thailand may 17, 2003


CEOS IDN Task Team

Chiang Mai, Thailand
May 17, 2003

IDN Agenda

• Minutes from Toulouse, France, May 2003
  http://idn.ceos.org/IDN/Meetings/2003_05_Toulouse
• IDN Reports
• Keyword Update
• Process for adopting DIF Changes
• Status of DIF Proposals
• Thesaurus
• Semantic Web
• IDN Tools and Software Update
  – MD8, MD9, DocBuilder
• Discussion/Issues

Minutes from Toulouse

• IDN Reports (4 presentations, 2 written)
• ISO 19115 Update
• Updated controlled keywords
• DIF Modification Proposals
• Enhanced Project Description
• MD8 Report from UNEP
  – MD8 operations and testing
  – Efforts with PostgreSQL
• MD9 Update

IDN Reports

Status at the American Coordinating Node

[Figure: New DIFs, January 1999 - August 2003. x-axis: Date; y-axis: # new DIFs, scaled from 6000 to 14000.]

DIF Population by Earth Science Topic

AGRICULTURE: 5%
ATMOSPHERE: 18%
BIOSPHERE: 17%
CLIMATE INDICATORS: 1%
CRYOSPHERE: 4%
HUMAN DIMENSIONS: 10%
HYDROSPHERE: 6%
LAND SURFACE: 13%
OCEANS: 13%
PALEOCLIMATE: 1%
RADIANCE OR IMAGERY: 6%
SUN-EARTH INTERACTIONS: 1%
SOLID EARTH: 5%

DIF Population by Node

NASA: 41%
NOAA: 10%
USDA: 5%
CIESIN: 1%
USGS: 4%
CONAE: 0%
INPE: 0%
CCRS: 4%
ESA/ESRIN: 1%
DLR: 0%
CNES: 0%
NEONET: 0%
PNRA: 0%
RAS: 0%
NASDA: 1%
JST: 1%
CSIRO: 0%
UNEP/GRID: 4%
AMD: 19%
G3OS: 0%
CLIVAR: 0%
UN: 5%

GCMD Usage by Domain Type
January 1999 - August 2003

.gov: 3%
.edu: 8%
.org: 1%
.com: 20%
.net: 20%
.mil: 1%
numeric: 24%
non-US (foreign): 22%
.us: 1%

Unique Hosts by Domain
Comparison of 2000, 2001 and 2002

[Figure: bar chart. x-axis: domains (.gov, .edu, .org, .com, .net, .mil, numeric, foreign, .us); y-axis: unique hosts, 0 to 60000; series: 2000 Total, 2001 Total, 2002 Total.]

# Web Page Hits Since January 2001

[Figure: x-axis: month, Jan-01 to Jul-03; y-axis: # hits, 200000 to 800000.]

Searches by Controlled Keyword

Agriculture: 9%
Atmosphere: 12%
Biosphere: 7%
Climate Indicators: 0%
Cryosphere: 4%
Human Dimensions: 7%
Hydrosphere: 6%
Land Surface: 8%
Oceans: 12%
Paleoclimate: 4%
Radiance or Imagery: 6%
Sun-Earth Interactions: 1%
Solid Earth: 7%
Location: 9%
Source: 4%
Sensor: 4%

Global Change Master Directory

GCMD Portals to Community-focused Data

DODS

Portal Index: http://gcmd.nasa.gov/Data/portal_index.html

• The index of all portals created by GCMD is available online from the portal index.

• Portal visibility will be improved with a link to the portals from the GCMD homepage (currently the link is available only via the GCMD sitemap).

New Portals This Past Year

• National Center for Atmospheric Research
• World Water Forum
• Remote Sensing for Conservation Portal
• United Nations Earth Science Data
• Antarctic Master Directory
  – Finland
  – Belgium
  – Argentina

IDN/CEOS-GRID Activities

• Participate on the CEOS/GRID Catalog Tiger Team

• Potential CEOS/GRID Collaborations
  – Directory search across CEOS/GRID databases
  – Use of IDN controlled keyword hierarchy
  – Mapping of IDN DIF fields to core ISO 19115 fields as a minimum set of metadata elements

IDN Agency Reports

Keyword Update

New Earth Science Parameter Keywords

47 new Earth science keywords were suggested by the Global Observing System Information Center (GOSIC), comprising the GOOS, GTOS and GCOS.

These new keywords were mapped to existing GCMD/IDN keywords. Although not all of the suggested keywords were adopted, some reorganization of existing keywords resulted in new and modified keywords.

New Earth Science Parameter Keywords

TOPIC keyword Radiance or Imagery will be restructured to Engineering/Spectral Measurements, to accommodate engineering and level 0 raw data measured from satellites and other remote sensing platforms.

A new TERM under this TOPIC will be:
Engineering/Spectral > Spectral Measurements > Acoustic Waves

New Earth Science Parameter Keywords

New Glaciers/Ice Sheets TERM and variables:

Cryosphere > Glaciers/Ice Sheets > Ablation Zones/Accumulation Zones
Cryosphere > Glaciers/Ice Sheets > Icebergs
Cryosphere > Glaciers/Ice Sheets > Glacier Facies
Cryosphere > Glaciers/Ice Sheets > Glacier Mass Balance/Ice Sheet Mass Balance
Cryosphere > Glaciers/Ice Sheets > Glacier Thickness/Ice Sheet Thickness
Cryosphere > Glaciers/Ice Sheets > Glacier Topography/Ice Sheet Topography
Cryosphere > Glaciers/Ice Sheets > Glacier Elevation/Ice Sheet Elevation
Cryosphere > Glaciers/Ice Sheets > Glacier Motion/Ice Sheet Motion

New Earth Science Parameter Keywords

New Soils Keywords:

Land Surface > Soils > Micronutrients/Trace Elements
Land Surface > Soils > Heavy Metals
Land Surface > Soils > Microfauna
Land Surface > Soils > Microflora
Land Surface > Soils > Macrofauna
Land Surface > Soils > Soil Rooting Depth
Land Surface > Soils > Soil Erosion
Land Surface > Soils > Soil Infiltration

Note: Also applies to TOPIC Agriculture > Soils

New Earth Science Parameter Keywords

New Biosphere keywords:

Biosphere > Ecological Dynamics > Habitat
Biosphere > Ecological Dynamics > Indicator Species
Biosphere > Ecological Dynamics > Biodiversity

New Earth Science Parameter Keywords

Additional new keywords:

Atmosphere > Atmospheric Radiation > Incoming Solar Radiation

Hydrosphere > Water Quality/Water Chemistry > Water Potability

Keyword Citation

GCMD will provide a citation that we kindly request be used by organizations and agencies that use the IDN keywords.

• Many groups have adopted the keywords
• It is important to know how they are being used so agencies can be informed of updates
• GCMD science coordinators put a lot of effort and expertise into choosing scientifically accepted terminology

Changes to the IDN DIF Standard

Proposals to Modify the DIF Structure

• Through Formal Standard: ISO 19115 compatibility

• Through CEOS IDN Interoperability Group
  – Current examples
    • Numerical Model Field
    • Data Set Spatial/Temporal Resolution
  – Advantages
    • More flexible for specified purposes beyond ISO.
    • Faster turnaround time for implementation.

DIF Changes and the Interoperability Process

• Anyone can request a change to the DIF format

• The proposal is circulated on the interop mailing list (interop@gcmd.gsfc.nasa.gov)

• Based on comments and feedback the proposal may be modified and resubmitted

• A vote is requested by the Interop Voting Committee

DIF Proposals

• A proposal passes with a majority vote

• Results of the vote are conveyed to the interop list

• Prioritization, scheduling, and completion of software changes are dependent on the resources of participating members.
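
The passing rule above can be sketched in a few lines. This is only an illustration, assuming "majority" means more than half of the yes/no votes actually cast; the slides do not say how abstentions or absent members are counted. The function name `proposal_passes` and the vote labels are hypothetical.

```python
def proposal_passes(votes):
    """Return True if 'yes' votes exceed half of the yes/no votes cast.

    Assumption (not stated in the slides): abstentions are ignored
    and a tie does not pass.
    """
    cast = [v for v in votes if v in ("yes", "no")]
    yes = sum(1 for v in cast if v == "yes")
    return yes * 2 > len(cast)


# Nine committee members: five in favor, three against, one abstention.
ballots = ["yes"] * 5 + ["no"] * 3 + ["abstain"]
print(proposal_passes(ballots))
```

If the committee instead required a majority of all nine members, the comparison would be made against the committee size rather than the votes cast.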

Interop Voting Committee

1. Andrea Buffam (CCRS/GeoConnections) Andrea.Buffam@ccrs.nrcan.gc.ca
2. Cheryl Solomon (USGS/BRD/GCMD) Cheryl_Solomon@gcmd.nasa.gov
3. Jolyon Martin (ESA) jolyon.martin@esa.int
4. Lee Belbin (JCADM) Lee.Belbin@aad.gov.au
5. Osamu Ochiai (NASDA) ochiai.osamu@nasda.go.jp
6. Lyne Yohe (NSIDC) yohe@nsidc.org
7. Victor Pusztai (UNEP GRID-Budapest) pusztai@mail5.ktm.hu
8. Lorant Czaran (UNEP) czaran@grida.no
9. Sherry Harrison (GHRC) sherry.harrison@msfc.nasa.gov

Proposals to Modify the DIF Structure

• ISO 19115 compatibility
• Numerical Model Field
• Data Set Spatial/Temporal Resolution

DIF/ISO Conversion (Fields that need attention)

Category>Topic > Term > Variable:

Earth Science > Biosphere > Ecological Dynamics > Population

DIF ISO

ISO Topic Category: farming, biota, boundaries, climatologyMeteorologyAtmosphere, economy, elevation, environment, geoscientific information, health, imageryBaseMapsEarthCover, inlandWaters

Metadata Standard Format
Metadata Standard Version

Personnel: Address

Personnel: Address, City, State or Province, Zip Code, Country

Not currently in the DIF

ISO Topic Categories

farming, biota, boundaries, climatologyMeteorologyAtmosphere, economy, elevation, environment, geoscientific information, health, imageryBaseMapsEarthCover, intelligenceMilitary, inlandWaters, location, oceans, planningCadastre, society, structure, transportation, utilitiesCommunication

Summary of Proposal Status

Non-controversial proposal to bring the DIF into compliance with the international standard.

o Anne Sophie Archambeau from IPSL, France proposed an addition to the DIF to handle numerical output data sets:

Group: Numerical_Experiment
  Model_Name:
  Model_Version:
  Model_URL:
  Model_Configuration:
  Model_Resolution:
  Model_Calendar:
  Group: Model_Integration_Period
    Start_Date:
    Stop_Date:
  End_Group
  Simulation_Name:
  Initial_Conditions:
  Perturbation:
  Imposed_Boundary_Conditions:
End_Group

o Interop discussions on the proposed new fields are summarized in the CEOS IDN Interop Newsletter for April 2003 (http://gcmd.gsfc.nasa.gov/pipermail/interop/2003-April/000011.html)
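
To show how the nested Group / End_Group layout of the proposed Numerical_Experiment fields could be read by software, here is a minimal sketch. It is not the GCMD parser: the function name `parse_dif_groups` is hypothetical, the field values in the sample are invented for illustration, and repeated fields within a group are assumed not to occur (a later value would overwrite an earlier one).

```python
def parse_dif_groups(text):
    """Parse 'Group: ... End_Group' blocks into nested dictionaries."""
    root = {}
    stack = [root]  # stack[-1] is the group currently being filled
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "End_Group":
            if len(stack) == 1:
                raise ValueError("End_Group without a matching Group")
            stack.pop()
            continue
        # Split on the first colon only, so URL values keep their colons.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Group":
            child = {}
            stack[-1][value] = child
            stack.append(child)
        else:
            stack[-1][key] = value
    if len(stack) != 1:
        raise ValueError("Group without a matching End_Group")
    return root


# Hypothetical values; only the field names come from the proposal.
SAMPLE = """\
Group: Numerical_Experiment
  Model_Name: ExampleModel
  Model_Version: 1.0
  Group: Model_Integration_Period
    Start_Date: 1990-01-01
    Stop_Date: 1999-12-31
  End_Group
  Simulation_Name: control
End_Group
"""

record = parse_dif_groups(SAMPLE)
print(record["Numerical_Experiment"]["Model_Integration_Period"]["Start_Date"])
```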

Numerical Model Fields Proposed for the DIF

Summary of Proposal Status

• Received November 18, 2002

• The additional information could be included in the summary

• Development schedules do not permit the prioritization of this addition to the database at this time.

• Comments will be extended until such time that development schedules might permit the initiation of change

• The proposal will be presented to a modeling group in London at the end of September for comment

Data Resolution Proposal

Proposal to provide users with the capability of refining GCMD database searches by Geospatial and Temporal Resolution. Proposed DIF Syntax (changes marked with "+"):

+Group: Data_Resolution

Latitude_Resolution:

Longitude_Resolution:

+Horizontal_Resolution_Range: [choose from the list of geospatial ranges]

Vertical_Resolution:

+Vertical_Resolution_Range: [choose from the list of vertical resolutions]

Temporal_Resolution:

+Temporal_Resolution_Range: [choose from the list of temporal resolutions]

End_Group

Data Resolution Proposal Valids

Horizontal Resolution Range

DIF authors must select from the set of Geospatial (Horizontal) resolution range valids:

< 1 meter
1 meter - < 30 meters
30 meters - < 100 meters
100 meters - < 1 km
1 km - < 100 km or approximately 0.1 degree - < 1 degree
100 km - < 250 km or approximately 1 degree - < 2.5 degrees
250 km - < 500 km or approximately 2.5 degrees - < 5.0 degrees
500 km - < 1000 km or approximately 5 degrees - < 10 degrees
1000 km or > 10 degrees
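
As an illustration of how an authoring tool might pick the horizontal range valid from a numeric resolution, the sketch below maps a value in meters to one of the labels listed on the slide. The function name `horizontal_range` is hypothetical, only the metric bounds are used (the approximate degree equivalents are carried along in the labels), and the last label is assumed to mean 1000 km and above.

```python
from bisect import bisect_right

# Lower bounds in meters of every range after the first one.
BOUNDS_M = [1, 30, 100, 1_000, 100_000, 250_000, 500_000, 1_000_000]

# Labels copied from the proposed list of valids.
LABELS = [
    "< 1 meter",
    "1 meter - < 30 meters",
    "30 meters - < 100 meters",
    "100 meters - < 1 km",
    "1 km - < 100 km or approximately 0.1 degree - < 1 degree",
    "100 km - < 250 km or approximately 1 degree - < 2.5 degrees",
    "250 km - < 500 km or approximately 2.5 degrees - < 5.0 degrees",
    "500 km - < 1000 km or approximately 5 degrees - < 10 degrees",
    "1000 km or > 10 degrees",
]


def horizontal_range(resolution_m):
    """Return the range label whose interval contains resolution_m (meters)."""
    if resolution_m < 0:
        raise ValueError("resolution must be non-negative")
    # bisect_right counts how many lower bounds are <= the value,
    # which is exactly the index of the matching label.
    return LABELS[bisect_right(BOUNDS_M, resolution_m)]


print(horizontal_range(25))       # a 25 m grid
print(horizontal_range(111_000))  # roughly a 1 degree grid
```

The same lookup would work for the vertical ranges by swapping in their bounds and labels.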

Data Resolution Proposal Valids

Vertical Resolution Range

Vertical Resolution refers to both Altitude and Depth resolution.

< 1 meter
1 meter - < 10 meters
10 meters - < 30 meters
30 meters - < 100 meters
100 meters - < 1 km
> 1 km

Temporal Resolution Range

< 1 second
1 second - < 1 minute
1 minute - < 1 hour
Hourly
Daily
Weekly
Monthly
Hourly Climatology
Daily Climatology
Pentad Climatology
Weekly Climatology
Monthly Climatology
Annual
Annual Climatology
Decadal
Climate Normal (30-year climatology)

Example

Group: Data_Resolution

Latitude_Resolution: 1 meter

Longitude_Resolution: 1 meter

Horizontal_Resolution_Range: 1 meter - < 10 meters

Vertical_Resolution: 5 meters

Vertical_Resolution_Range: 1 meter - < 10 meters

Temporal_Resolution: Daily

Temporal_Resolution_Range: Hourly - < Daily

End_Group

Data Resolution Proposal Example

Summary of Proposal Status

New proposal sent through the Interop

Thesaurus Support when Searching Earth Science Data

W.N. Martin, J. C. French (NSF) and A.K. Naidu
Department of Computer Science
University of Virginia

The Vocabulary Problem

• Data is indexed using one set of terms

• Searcher casts query using another set of terms

• Result: relevant data is overlooked

Objective

Assist users in specifying queries when the indexing vocabulary is unknown or unfamiliar.

Provide a thesaurus facility to Earth science searchers.

Thesaurus Overview

• DLR provided the Thesaurus Server in 1998.

• Lookup Assistant and Modification Tool by University of Virginia
  – Dr. Jim French
  – Dr. Worthy Martin
  – Amit K. Naidu (student)

• Minor server changes have been made.

[Figure: thesaurus term graph. Nodes, with the numbers shown on the slide: global change (4), pollution (6), air pollution (22), aerosols (5), carbon monoxide (6), NOx (1), sulfur dioxide (6), acidification (1), indoor pollution (1), global warming (5), food contamination (2), trace gases (11), air quality (2). Regions are labeled Broader Terms (Top Terms), Narrower Terms, Related Terms and Unrelated Terms.]

air contamination

air pollutant

air pollutants

air pollution

air-borne contaminants

air-borne contamination

air-borne contaminations

air-pollutants

atmospheric impurities

atmospheric impurity

atmospheric pollutant

atmospheric pollutants

atmospheric pollution

contaminated air

contaminated atmosphere

polluted air

polluted atmosphere

pollution of the air

pollution of the atmosphere

urban air

urban air pollution

urban atmosphere

Conceptual Thesaurus Structure
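
A thesaurus lookup of this kind can be sketched as a small query-expansion step. The code below is an illustration, not the ISIS Thesaurus Server interface: the function names are hypothetical, the synonym entries are a subset of the list above, and the narrower-term links are assumed for the example because the slide text does not state which terms are narrower than "air pollution".

```python
# Entry terms taken from the slide's list for "air pollution" (subset).
SYNONYMS = {
    "air pollution": [
        "air contamination",
        "atmospheric pollution",
        "polluted air",
        "urban air pollution",
    ],
}

# Hypothetical narrower-term links, for illustration only.
NARROWER = {
    "air pollution": ["aerosols", "carbon monoxide", "NOx", "sulfur dioxide"],
}


def preferred_term(user_term):
    """Map a searcher's wording to the indexing term, or None if unknown."""
    term = user_term.strip().lower()
    for preferred, entries in SYNONYMS.items():
        if term == preferred or term in entries:
            return preferred
    return None


def expand_query(user_term, include_narrower=False):
    """Return the list of index terms to search for."""
    preferred = preferred_term(user_term)
    if preferred is None:
        return [user_term.strip()]  # fall back to the searcher's own wording
    terms = [preferred]
    if include_narrower:
        terms.extend(NARROWER.get(preferred, []))
    return terms


print(expand_query("polluted air"))
print(expand_query("Polluted Air", include_narrower=True))
```

This addresses the vocabulary problem directly: the searcher's wording is translated into the indexing vocabulary before the query is sent to the document search.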

Thesaurus Architecture

[Diagram components: Main Page; Thesaurus Search Applet; Demo.cgi (Perl script); Saurus.pl (Perl script); ISIS Thesaurus Server (port 6188); Oracle Thesaurus Database (port 1521); Modified Query; Document Search servlet; Document Database; Final Search Results.]

Thesaurus Modification Tool

Last Three Months

• Amit Naidu started working on the project.

• Upgraded Makefiles to work with Oracle 9i and SunOS system on new Development machine.

• Now running both thesaurus server and thesaurus modification tool on the same machine.

• Queries go to Isite free-text search on gcmd.nasa.gov machine.

Future Plans

• Create a script to regularly back up the database and make sure that the server is running.

• Update the thesaurus database with new terms and relations

• Update the GCMD homepage to include the thesaurus button

DEMO

• Thesaurus Lookup Assistant: http://gcmddev.sesda.com/thesaurus/assistant.html

• Thesaurus Modification Tool: http://gcmddev.sesda.com/thesaurus/tool/applet.html

The Semantic Web for Spatial Data Search

Femke Reitsma

University of Maryland – College Park femke@geog.umd.edu

Why Ontologies: Could the Semantic Web Meet Discovery Challenges?


“The semantic web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation” (Tim Berners-Lee et al. 2001)

The semantic web makes web pages machine “understandable” rather than just human understandable.

Semantic web components:

1. Basic components:

ontology + semantically marked-up web page = semantic web

2. Moving up the semantic web layers:

[Figure: Semantic Web layers presented by Tim Berners-Lee]

Semantic Web Languages

• Primary languages:
  RDF (Resource Description Framework)
  RDFS (Resource Description Framework Schema)
  OWL (Web Ontology Language)

• Historical development:
  XML provides the basic syntax
  RDF and RDFS add some tags to XML
  DAML+OIL adds some tags to RDF
  OWL extends and replaces (almost) DAML+OIL

• Information is encoded as a triple: subject, predicate, and object

For example:

<Femke> <is a> <student>
<Zimbabwe> <is a part of> <Africa>

• All subjects and objects are identified with a Uniform Resource Identifier (URI): e.g. http://www.daml.org/2001/02/geofile/geofile-ont.daml#GeographicLocation

Basic Structure
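
The triple structure can be illustrated without any RDF library by storing each statement as a (subject, predicate, object) tuple of URIs and matching patterns against it. This is a minimal sketch: the namespace `http://example.org/terms#` and the helper `match` are hypothetical, and a real application would use an RDF store rather than a Python list.

```python
EX = "http://example.org/terms#"  # hypothetical namespace for the example

# The two statements from the slide, written as URI triples.
triples = [
    (EX + "Femke", EX + "isA", EX + "Student"),
    (EX + "Zimbabwe", EX + "isPartOf", EX + "Africa"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return the triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]


# "What is Zimbabwe a part of?"
for s, p, o in match(triples, subject=EX + "Zimbabwe", predicate=EX + "isPartOf"):
    print(o)
```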

What is an ontology?

Big “O” Ontology vs little “o” ontology:

Ontology = metaphysics, the essence of being, reality

ontology = “a logical theory which gives an explicit, partial account of a conceptualization” (Guarino and Giaretta, 1995)

What does an ontology look like?

What does a semantic web page look like?

Ontology ↔ Semantic Content

Ontology: Dublin Core ontology
Semantic Web page: http://owl.mindswap.org

ontology + semantic web =

• Computer parsable

• Inference ability

State code > city code > address code

A computer agent could deduce that a Cornell University address, being in Ithaca, must be in New York State, which is in the U.S., and therefore must be formatted to U.S. standards.
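
The deduction in the Cornell example amounts to applying a transitivity rule to "located in" facts until nothing new can be derived. The sketch below shows that forward-chaining step on the three facts from the slide; the function name `transitive_closure` and the plain-string encoding of the facts are assumptions made for the example, not how an ontology reasoner is actually invoked.

```python
# Facts stated on the slide, as (place, containing place) pairs.
LOCATED_IN = {
    ("Cornell University", "Ithaca"),
    ("Ithaca", "New York State"),
    ("New York State", "U.S."),
}


def transitive_closure(pairs):
    """Apply 'a in b and b in c implies a in c' until no new pair appears."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


inferred = transitive_closure(LOCATED_IN)

# The agent never stored this fact explicitly; it was derived.
print(("Cornell University", "U.S.") in inferred)
```

Once the derived fact is available, choosing the U.S. address format becomes a simple membership test.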

Objective:

Explore the potential of the Semantic Web for distributing spatial data

Current GCMD Search

North America?

2950 records matched your query

Future GCMD Search

North America?

North America [2950]

Limit search by:

- Spatial resolution

- Temporal resolution

- GCMD keywords

Explore results by:

Canada [1348]
USA [1602]

GCMD keywords

……

Key = ability to determine relationships between keywords without explicitly encoding them
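
The "explore results by" idea can be sketched as a search that follows a part-of hierarchy and then counts hits per narrower location. Everything here is illustrative: the record identifiers and the tiny data set are invented, the names `within` and `search` are hypothetical, and only the Canada / USA / North America relationship is taken from the slide.

```python
from collections import Counter

# Part-of links; only these two relationships appear on the slide.
PART_OF = {"Canada": "North America", "USA": "North America"}

# Hypothetical records: (identifier, location keyword).
RECORDS = [
    ("dif-001", "Canada"),
    ("dif-002", "USA"),
    ("dif-003", "USA"),
    ("dif-004", "Brazil"),
]


def within(location, region, part_of):
    """True if location equals region or lies inside it via part-of links."""
    while location is not None:
        if location == region:
            return True
        location = part_of.get(location)
    return False


def search(region, records, part_of):
    """Return matching records plus hit counts per location keyword."""
    hits = [(rid, loc) for rid, loc in records if within(loc, region, part_of)]
    facets = Counter(loc for _, loc in hits)
    return hits, facets


hits, facets = search("North America", RECORDS, PART_OF)
print(len(hits), dict(facets))
```

No record is tagged "North America" directly; the match comes from the relationship between keywords, which is the point made on the slide.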

North America?

GCMD Database

Sesame Ontology

Java Application

Progressing Towards Level 1:

Sesame = Open Source RDF Schema-based Repository and Querying facility

Keywords → Ontologies

• Importance of careful specification of relationships for ontology.

CATEGORY > TOPIC > TERM > VARIABLE

e.g. Hydrosphere > Ground Water > Saltwater Intrusion

• For the purposes of the Semantic Web, the keyword structure may need modification.

e.g. the Variable Fetch is a measurable property of the Term Ocean Waves; however, the Variable Fisheries is a sub-topic of the Term Agricultural Aquatic Sciences.

Keywords: Projects, Sensors, Sources, Locations, IDN Nodes, Data Centers, Science Keywords, Services Keywords, URL Content Types, Chronostratigraphic Units
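
Before keywords can be mapped to an ontology, each keyword path has to be split into its levels. A minimal sketch of that step follows, using the four level names from the hierarchy above and the science keyword cited earlier in these slides; the function name `parse_keyword` is hypothetical, and the sketch deliberately records only the position of each part, not the kind of relationship between levels.

```python
LEVELS = ("CATEGORY", "TOPIC", "TERM", "VARIABLE")


def parse_keyword(path):
    """Split 'A > B > C > D' into a dict keyed by hierarchy level."""
    parts = [p.strip() for p in path.split(">")]
    if not 1 <= len(parts) <= len(LEVELS) or not all(parts):
        raise ValueError(f"not a valid keyword path: {path!r}")
    # zip stops at the shorter sequence, so partial paths are allowed.
    return dict(zip(LEVELS, parts))


keyword = parse_keyword(
    "Earth Science > Biosphere > Ecological Dynamics > Population"
)
print(keyword["TERM"], "->", keyword["VARIABLE"])
```

As the Fetch and Fisheries examples show, position in the path alone does not say whether a variable is a measurable property or a sub-topic, so an ontology would need that relationship recorded separately.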

DIF Schema

• XSLT style sheet to create DIF schema in Semantic Web language

• Mapping terms to ontology

• Avoiding a monolithic ontology by mapping terms to other ontologies

e.g. Dublin Core

DIFs

• XSLT style sheet to convert DIFs to Semantic Language

• Mapping terms to ontology and DIF Schema

• Recording keywords of finest granularity

• Avoiding a monolithic ontology by mapping terms to other ontologies

e.g. Dublin Core, Cyc, DAML-time

Sesame:

[Figure: Sesame architecture. Client 1, Client 2 and Client 3 connect over HTTP and SOAP. Components shown: HTTP Protocol Handler, SOAP Protocol Handler, Request Router, Admin Module, Query Module, Export Module, Repository Abstraction Layer, and the GCMD Repository (RDF DIF files, Ontologies, DIF Schema).]

• Middleware

• GUI or API

• Database: PostgreSQL or Oracle

Advantages for the GCMD

• Semantic Web presents database structure in a machine parsable format

• Ability to search for the semantic relationships among any DIF terms within the ontology

• Do not need to change the database structure when new classes and relationships are added

• Real advantages = when ontology is enriched

IDN Tools and Software Update

DocBuilder: An XML Authoring Tool for ISO 19115

The Next Generation of Metadata Authoring Tools

• Web and stand-alone application

• The Web version will replace the current web tools (DIFbuilder, SERFbuilder, etc.).

• Increases software flexibility by allowing the user to choose what type of document to build (i.e. DIF or SERF or Project Supplemental or even an FGDC or ISO document)

DocBuilder Features

• Object-oriented design. Allows code reuse.

• Java/Jython implementation. Offers platform independence and maintenance reduction.

• XML support. Promotes extensibility and standardized information exchange.

• Multiple versions. Supports diverse users and environments by offering both Web and stand-alone applications.

• MD8 integration. Provides added functionality through distributed database operations.

• Multi-document support. Increases flexibility by allowing the user to choose the format type to build: DIF, SERF, FGDC, ANZLIC, ISO, or Project Supplemental.

• Customization capabilities. Strengthens integration with portals.

DocBuilder HTML Version

• HTML Version: The initial focus of effort was on the Java/Swing version. As this has become a stable product, attention has shifted to the HTML version, which will replace the current Web tools (DIFbuilder, SERFbuilder, etc.).

• Functionality similar to the Java/Swing (stand-alone) version

• Will look similar to the current Perl tool (DIFbuilder) so users won’t need to learn a whole new tool.

– All components are written strictly in HTML and JavaScript. No Java components (applets) are included in the Web client.

– A fully functional widget has been written that is comparable to the Java/Swing searchable list widget.

– Features specialized widgets for valids fields (Parameters, Sensor, Source, Location, Data Centers, Personnel) to provide increased functionality

– In alpha testing now.

Sample screens

Main Page: Overview showing fields “checklist”.

DocBuilder: The next generation of IDN authoring tools

MD8 Update

MD Software Purpose

• To assist JCADM and CEOS IDN members in their efforts to share collection metadata

• To reduce the manual (costly) effort when exchanging DIF metadata

• To improve the metadata validation and thus improve the quality of the descriptions

• To create a foundation for more advanced applications and APIs

MD8 Metadata Sharing

[Diagram: UNEP Node, AADC Node and Your Node, each with a DB, an MD Server and a Local Database Agent, connected through the Network.]

[Diagram: GCMD Node components: New Content, Local DB, Incoming Queue Table, MD Server, Local Database Agent, LDA Server, Announcer, Scheduler, Schedule Table, Trigger Table.]

MD8 Installation

• Much effort was placed on an easy install compared to MD7
• Read installation requirements for details
• Some database and web server configuration is required outside of the MD8 install
• Client is capable of auto-updating its code
• UNEP tested the install page and had no significant problems
• AAD tested the install from a more advanced ‘tarball’ and had some problems

MD8 Status

• Used in production at GCMD since 2001
• Operational at AADC
• Operational at UNEP/Budapest

http://griddata.ktm.hu:8080/Data/portals/ceos

• Final release is MD 8.0 build 3

MD8 at UNEP/Budapest

What’s Next: Future Database Changes

• ISO Changes
  – Change Personnel address field
  – Add ISO_Topic_Category field
  – Add Metadata Standard Name field
  – Make mandatory: Citation group, DIF Author, Spatial_Coverage, Dataset_Language, DIF_Creation_Date

Future Database Changes

• GOS project description enhancements
• Better tracking of valids and personnel
• Data center URLs will be tied to the data center valid instead of the DIF record
• New location valids hierarchy
• Dataset geospatial and temporal resolution

What’s Next

• Move HTML DocBuilder into production
• DIF compatibility with ISO
• Web Page Redesign

Home Page

What’s Next (continued)

• Enhanced portal visibility from GCMD homepage

• Adding Services to the AMD
• Automatic emails to DIF authors to remind them to review and update DIFs

MD Version 9

• ISO Changes are top priority
  – Implies some backward compatibility issues that will need to be dealt with
  – In progress
• Replace Oracle with PostgreSQL
• Replace Isite free-text search with Lucene
• Add GOS Project functionality
• Subscription Service for Valids
• Upgrade the spatial search
• Enhance data center buckets
• Implement temporal and spatial resolution refinement query
• Version control of IDN DIFs
• API to retrieve DIFs/Valids since a certain date (CCRS request)
• Tie data center URLs to dc valid, not DIF
• Mini-portal and a save search option

          

                                

Conclusion: Maintain Operations While Going Forward

– Operational Goals
  • Reduce maintenance
  • Increase content
– Development Goals
  • Increase functionality
  • Serve as the “public face” for discovery of and linkage to CEOS Earth science data

Discussion and Issues
