speeding up ontology creation of scientific terms

Speeding up ontology creation of scientific terms. Luis Bermudez, John Graybeal, Monterey Bay Aquarium Research Institute http://marinemetadata.org


Page 1: Speeding up ontology creation of  scientific terms

Speeding up ontology creation of scientific terms.

Luis Bermudez, John Graybeal, Monterey Bay Aquarium Research Institute

http://marinemetadata.org

December 7, 2005

Page 2: Speeding up ontology creation of  scientific terms

Marine Metadata Interoperability Initiative

Why are ontologies important?

At AGU we have 31 abstracts and 2 entire sessions related to ontologies.

Page 3: Speeding up ontology creation of  scientific terms


Problem: Semantic Interoperability

SSDS: get me Data for Parameter temperature_1 (deg C)

AOSN: get me Data for Variable ocean_temperature (C)

Page 4: Speeding up ontology creation of  scientific terms


Need for controlled vocabulary

A set of restricted words, used by an information community when describing resources or discovering data. The controlled vocabulary prevents misspellings and avoids the use of arbitrary, duplicative, or confusing words that cause inconsistencies when cataloging data.

Page 5: Speeding up ontology creation of  scientific terms


Controlled Vocabularies: Discovery of Data

GCMD (HTML): http://gcmd.gsfc.nasa.gov/Resources/valids
BODC Discovery (comma-separated values): http://wwwtest.bodc.ac.uk/data/codes_and_formats/parameter_codes/bodc_para_dict.html
AGU Index Terms (HTML): http://www.agu.org/pubs/gaplist.html
MEL (HTML): https://mel.dmso.mil/docs/metadata_guide/section_6.htm
NOAA CoRIS Thesauri (PDF): http://www.coris.noaa.gov/backmatter/keywords/discovery_thesaurus.pdf

Page 6: Speeding up ontology creation of  scientific terms


Controlled Vocabularies: Usage (tag the data collected)

BODC (comma-separated values): http://wwwtest.bodc.ac.uk/data/codes_and_formats/parameter_codes/bodc_para_dict.html
U.S. JGOFS Dictionary of parameters (HTML): http://usjgofs.whoi.edu/datasys/param_master.html
IOC GF3 parameter codes (HTML): http://ioc.unesco.org/oceanteacher/resourcekit/M3/Formats/Integrated/GF3/GF3.htm
SEACOOS (comma-separated values): http://twiki.sura.org/twiki/pub/Main/DataStandards/seacoos_draft_data_dictionary_v2.0.csv
CF (XML): http://www.cgd.ucar.edu/cms/eaton/cf-metadata/standard_name.xml

Page 7: Speeding up ontology creation of  scientific terms


Problem: Semantic Interoperability

semantics semantics

Standard vocabularies

Page 8: Speeding up ontology creation of  scientific terms


Harmonization

DTD

Comma Separated Values

HTML

Tab Separated Values

Relational Database

XML/XSD

RDF

Web Ontology Language (OWL)

Page 9: Speeding up ontology creation of  scientific terms


Web Ontology Language: OWL

A World Wide Web Consortium (W3C) recommendation, finalized in February 2004, for formally expressing ontologies.

Based on the Resource Description Framework (RDF).

Can be serialized in XML.

Supporting tools: JENA, Protégé, SWOOP, Sesame, Pangloss, Kowari, VINE, Voc2OWL

Page 10: Speeding up ontology creation of  scientific terms


Fast introduction to OWL

RDF Triples

RDF Resources

Classes - individuals - properties

RDF Graph

Page 11: Speeding up ontology creation of  scientific terms


RDF: Triples, triples, triples

id | description | units
ocean_temperature | Ocean Temperature | C
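As a rough illustration of how one table row becomes triples, the sketch below treats the id as the subject and every other column as a predicate with its cell as the object. The namespace `http://example.org/aosn#` and the function name `row_to_triples` are made up for this example; they do not come from the slides.

```python
# Hypothetical namespace; the slides do not give one for this table.
NAMESPACE = "http://example.org/aosn#"


def row_to_triples(row, id_column="id", namespace=NAMESPACE):
    """Turn one table row into (subject, predicate, object) triples.

    The id column names the subject; each remaining column yields one triple.
    """
    subject = namespace + row[id_column]
    return [
        (subject, namespace + column, value)
        for column, value in row.items()
        if column != id_column
    ]


# The single row shown on the slide.
row = {"id": "ocean_temperature", "description": "Ocean Temperature", "units": "C"}
triples = row_to_triples(row)

for triple in triples:
    print(triple)
```

One row with three columns gives two triples: one for the description and one for the units.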

Page 12: Speeding up ontology creation of  scientific terms


RDF: Resource

Resources

A resource is anything on the Web that has a unique identifier. Examples:

URI: urn:aosn.mbari.org.recordVariable.id:1900
URL: http://mmi.org/2005/08/gcmd-keyw#Chlorophyll
URL: ftp://mmi.org/data-example

Literal

Page 13: Speeding up ontology creation of  scientific terms


Classes, individuals, properties

Parameters (looks like a class)

id | description | units
Temperature_1 | water temperature from unit 00471 | deg C
Temperature_2 | water temperature from unit 00822 | deg C

The rows look like individuals (members) of the class Parameter.

The columns (id, description, units) look like properties (attributes).

Page 14: Speeding up ontology creation of  scientific terms


How are ontologies created?

Conceptual direction strategy:

Top-down

Bottom-up

Automation approach:

Manual

Automatic

Page 15: Speeding up ontology creation of  scientific terms


Top-down approach

Page 16: Speeding up ontology creation of  scientific terms


Bottom-up approach

Example:
1. Properties of real-world objects are identified.
2. Similarities are identified.
3. Concepts are created
4. and are expressed as a class.
5. Classes are related.

[Diagram: the real-world objects Lake and River; their properties "Has water", "Is inland body" and "Has a relative defined channel"; the classes River and Lake related as subclasses to the class Body of Water.]

Page 17: Speeding up ontology creation of  scientific terms


Bottom-up approach

Example:
1. Real-world objects: parameters in observatory systems.
2. They all have similar properties (id, description and units).
3. Make them a resource: instance of a class Parameter (rdf:type).

ssds:Parameter

id | description | units
Temperature | | deg C
temperature | Temperature inside the OASIS can, in degrees C |
Temperature | temperature measured inside the MMC controller |
Temperature | | Celsius
Temperature | | degrees C
Temperature | water temperature | deg C
Temperature_1 | water temperature from unit 00471 | deg C
Temperature_2 | water temperature from unit 03533 | deg C

aosn:Variable

id | description | units
ocean_temperature | Ocean Temperature | C
ocean_temperature_2 | Ocean Temperature 2 | C
ocean_temperature_all | Ocean Temperature All | C
ocean_temperature_qcflag | Ocean Temperature Qcflag | 0=good,1=missing,2=marginal,3=bad
ocean_temperature_raw | Ocean Temperature Raw | counts
sea_surface_temperature | Sea Surface Temperature | C
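The three steps of this example can be sketched in a few lines. This is only an illustration under stated assumptions: a handful of rows are copied from the two tables, the prefixes `ssds` and `aosn` are used as plain strings rather than real namespaces, and the helper names are invented here.

```python
RDF_TYPE = "rdf:type"

# Step 1: real-world objects, here a few parameters from each system.
ssds_rows = [
    {"id": "Temperature_1", "description": "water temperature from unit 00471", "units": "deg C"},
    {"id": "Temperature_2", "description": "water temperature from unit 03533", "units": "deg C"},
]
aosn_rows = [
    {"id": "ocean_temperature", "description": "Ocean Temperature", "units": "C"},
    {"id": "sea_surface_temperature", "description": "Sea Surface Temperature", "units": "C"},
]


def shared_properties(rows):
    """Step 2: find the properties that every object has in common."""
    return set.intersection(*(set(row) for row in rows))


def as_instances(rows, prefix, class_name):
    """Step 3: declare each row an instance of a class through rdf:type."""
    return [
        (f"{prefix}:{row['id']}", RDF_TYPE, f"{prefix}:{class_name}")
        for row in rows
    ]


common = shared_properties(ssds_rows + aosn_rows)
type_triples = (
    as_instances(ssds_rows, "ssds", "Parameter")
    + as_instances(aosn_rows, "aosn", "Variable")
)

print(sorted(common))
for triple in type_triples:
    print(triple)
```

Because all rows share id, description and units, each system's rows can be grouped under one class per system.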

Page 18: Speeding up ontology creation of  scientific terms


Bottom-up approach (cont.)

ssds:Parameter

aosn:Variable

mmi:Parameter

sweet:Property
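The transcript keeps only the four class names from this slide, not the arrows between them. The sketch below assumes one plausible reading: ssds:Parameter and aosn:Variable are subclasses of mmi:Parameter, which is in turn a subclass of sweet:Property. Under that assumption, asking for instances of the common class returns parameters from both systems. The two individuals and the helper names are illustrative only.

```python
# Assumed subclass links; the slide's arrows are not in the transcript.
SUBCLASS_OF = {
    "ssds:Parameter": "mmi:Parameter",
    "aosn:Variable": "mmi:Parameter",
    "mmi:Parameter": "sweet:Property",
}

# Illustrative individuals and the class each one is declared in.
INSTANCE_OF = {
    "ssds:Temperature_1": "ssds:Parameter",
    "aosn:ocean_temperature": "aosn:Variable",
}


def superclasses(cls):
    """Return all ancestors of a class, nearest first."""
    ancestors = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        ancestors.append(cls)
    return ancestors


def instances_of(cls):
    """Return individuals whose declared class is cls or a subclass of it."""
    return sorted(
        individual
        for individual, direct in INSTANCE_OF.items()
        if direct == cls or cls in superclasses(direct)
    )


print(superclasses("ssds:Parameter"))
print(instances_of("mmi:Parameter"))
```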

Page 19: Speeding up ontology creation of  scientific terms


Manual (Ontology editor)

List of more than 50 editors: http://www.xml.com/2002/11/06/Ontology_Editor_Survey.html

Protégé

Page 20: Speeding up ontology creation of  scientific terms


Automatic

[Diagram: a vocabulary table and a transformation properties file go into a software program, which produces an ontology in OWL.]

id | description | units
ocean_temperature | Ocean Temperature | C
ocean_temperature_2 | Ocean Temperature 2 | C
ocean_temperature_all | Ocean Temperature All | C
ocean_temperature_qcflag | Ocean Temperature Qcflag | 0=good,1=missing,2=marginal,3=bad
ocean_temperature_raw | Ocean Temperature Raw | counts
sea_surface_temperature | Sea Surface Temperature | C

Page 21: Speeding up ontology creation of  scientific terms


Automatic

Advantages
• Fast
• Preserves a connection with the source (backward compatibility)
• Avoids typing and copy/paste errors

Disadvantage
• Only works with simple vocabularies (flat vocabularies and some taxonomies)

Page 22: Speeding up ontology creation of  scientific terms


VOC2OWL

• Tool created by MMI
• Creates ontologies automatically, bottom-up, from two basic structures of simple vocabularies:
  • Flat vocabularies (e.g. phone directory)
  • Hierarchical vocabularies (e.g. taxonomies)
• Java (Eclipse) standalone application

Page 23: Speeding up ontology creation of  scientific terms


Page 24: Speeding up ontology creation of  scientific terms


Metadata

Page 25: Speeding up ontology creation of  scientific terms


Conversion Properties I/O

Format of the ASCII file to transform: tab or csv

Location of the ASCII file

Location where the ontology in OWL will be saved

Page 26: Speeding up ontology creation of  scientific terms


Ontology Conversion Properties

Namespace of the resources

Column from which the local names of the resources (individuals) will be created.

One class (at least) is always created.

More than one class can be created
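To make these conversion properties concrete, here is a minimal sketch of a flat-vocabulary conversion written with the Python standard library. It is not VOC2OWL's code. The function name `convert`, the namespace `http://example.org/ssds#` and the in-memory sample text are assumptions made for illustration; a real run would read the ASCII file from one location and save the OWL file to another.

```python
import csv
import io
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"


def convert(text, delimiter, namespace, id_column, class_name):
    """Convert a flat delimited vocabulary into a small RDF/XML document.

    One class is always created; each row becomes an individual of that
    class, named from id_column, with the other columns as properties.
    """
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("owl", OWL)
    ET.register_namespace("voc", namespace)

    root = ET.Element(f"{{{RDF}}}RDF")
    owl_class = ET.SubElement(root, f"{{{OWL}}}Class")
    owl_class.set(f"{{{RDF}}}about", namespace + class_name)

    for row in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        # A typed node: the element name states the individual's class.
        individual = ET.SubElement(root, f"{{{namespace}}}{class_name}")
        individual.set(f"{{{RDF}}}about", namespace + row[id_column])
        for column, value in row.items():
            if column != id_column:
                ET.SubElement(individual, f"{{{namespace}}}{column}").text = value

    return ET.tostring(root, encoding="unicode")


# Sample rows taken from the Parameters table; the namespace is hypothetical.
SAMPLE = (
    "id,description,units\n"
    "Temperature_1,water temperature from unit 00471,deg C\n"
    "Temperature_2,water temperature from unit 00822,deg C\n"
)

owl_xml = convert(SAMPLE, ",", "http://example.org/ssds#", "id", "Parameter")
print(owl_xml)
```

The sketch does not declare the columns as OWL properties and assumes the column names are valid XML names; a fuller conversion would handle both.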

Page 27: Speeding up ontology creation of  scientific terms


Result

Parameters

id | description | units
Temperature_1 | water temperature from unit 00471 | deg C
Temperature_2 | water temperature from unit 00822 | deg C

Page 28: Speeding up ontology creation of  scientific terms


Ontology Conversion Properties

If the vocabulary is treated as a hierarchy, there is no such primary class. All the lines in the ASCII file together represent the hierarchy.
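A hierarchical conversion might look like the sketch below. The layout of the ASCII file is an assumption: each line is taken to list one path from the broadest term to the narrowest, separated by commas. The sample lines are illustrative keyword paths, and the function name is invented for this example.

```python
def hierarchy_to_subclass_pairs(lines, delimiter=","):
    """Return (child, parent) pairs, one for each distinct link in the paths."""
    pairs = []
    for line in lines:
        levels = [level.strip() for level in line.split(delimiter) if level.strip()]
        # Each adjacent pair of levels is one subclass link.
        for parent, child in zip(levels, levels[1:]):
            if (child, parent) not in pairs:
                pairs.append((child, parent))
    return pairs


# Illustrative paths, broadest term first.
lines = [
    "Oceans,Ocean Temperature,Sea Surface Temperature",
    "Oceans,Ocean Temperature,Water Temperature",
    "Oceans,Ocean Waves,Wave Height",
]

subclass_pairs = hierarchy_to_subclass_pairs(lines)
for child, parent in subclass_pairs:
    print(f"{child} is a subclass of {parent}")
```

No primary class is created here: every term on every line becomes a class, and the links come only from the order of the columns. Terms with the same name under different parents would be merged, which a real converter would need to handle.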

Page 29: Speeding up ontology creation of  scientific terms


Example Hierarchy (GCMD)

Page 30: Speeding up ontology creation of  scientific terms


Has been tested!

About 50 vocabularies were converted to OWL for the MMI workshop “Advancing Domain Vocabularies” (August 2005)

Page 31: Speeding up ontology creation of  scientific terms


Why do we need all these ontologies?

The workshop was about relating terms from one controlled vocabulary to another.

Microsoft Excel was too hard to use for this purpose :-)

Page 32: Speeding up ontology creation of  scientific terms


Mapping results

Topic | Direct mappings | Inferred mappings | Total mappings
Plant Pigments | 405 | 1,022 | 1,427
PaCOOS | 131 | 375 | 506
Waves | 93 | 181 | 274
Currents | 90 | 153 | 243
CTD | 81 | 432 | 513
Habitats | 23 | 37 | 60
Total | 823 | 2,200 | 3,023

47 participants and 12 hours of mapping time
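The slides do not say how the inferred mappings were derived. One common way, assumed here purely for illustration, is to treat a mapping as a "same as" relation that is symmetric and transitive, so that two terms mapped to the same third term are also mapped to each other. The term names and the function below are hypothetical.

```python
from itertools import combinations

# Hypothetical direct mappings entered by workshop participants.
direct = [
    ("ssds:Temperature_1", "cf:sea_water_temperature"),
    ("aosn:ocean_temperature", "cf:sea_water_temperature"),
]


def infer_same_as(direct_pairs):
    """Return pairs implied by symmetry and transitivity but not stated."""
    # Merge terms into groups of mutually equivalent terms.
    groups = []
    for a, b in direct_pairs:
        merged = {a, b}
        remaining = []
        for group in groups:
            if group & merged:
                merged |= group
            else:
                remaining.append(group)
        groups = remaining + [merged]

    stated = {frozenset(pair) for pair in direct_pairs}
    inferred = set()
    for group in groups:
        for pair in combinations(sorted(group), 2):
            if frozenset(pair) not in stated:
                inferred.add(pair)
    return sorted(inferred)


inferred = infer_same_as(direct)
print(inferred)
```

Two direct mappings to a shared term yield one extra inferred mapping between the two local terms, which is why inferred counts can exceed direct counts.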

Page 33: Speeding up ontology creation of  scientific terms


VINE: Vocabulary Integration Environment

Page 34: Speeding up ontology creation of  scientific terms


More…

• Advance the Marine Knowledge: 250,000 RDF triples (ontologies + mappings)
• They are available as:
  • SOAP web services at: http://marinemetadata.org/webservices
  • Ontology files at: http://marinemetadata.org/ns

Page 35: Speeding up ontology creation of  scientific terms


Conclusions

• Solving semantic interoperability issues is fun.
• We need to relate data producers' vocabularies with standard vocabularies.
• OWL keeps growing in popularity; more and more tools will be available.
• VOC2OWL can help you!

Page 36: Speeding up ontology creation of  scientific terms


Our Guides

Steering Committee

Roy Lowry, BODC
Robert Arko, LDEO
Julie Bosch, NOAA
Ben Domenico, Unidata
Karen Stocks, SDSC
Steve Hankin, NOAA - Ocean.US/DMAC
Mark Musen, Stanford Univ
Michael Parke, Univ of Hawaii
Lola Olsen, NASA Goddard
Bob Weller, WHOI
Dawn Wright, Oregon State University

Executive Committee

John Graybeal, MBARI (PI)
Philip Bogden, SURA/SCOOP
Stephen Miller, SIO
Francisco Chavez, MBARI
Stephanie Watson, Texas A&M

Page 37: Speeding up ontology creation of  scientific terms


MMI: Your Handy Reference Guide

MMI: http://marinemetadata.org

Voc2OWL: http://marinemetadata.org/voc2owl

Vine: http://marinemetadata.org/vine

Help Line: [email protected]

Ontologies: http://marinemetadata.org/ns

Term Search: http://mmi.mbari.org:9600/mmi2/search.jsp

Tethys: http://marinemetadata.org/tethys