IAALD AFITA WCCA2008 WORLD CONFERENCE ON AGRICULTURAL INFORMATION AND IT
Engineering an ontology on organic agriculture and agroecology: the
case of the Organic.Edunet project
Salvador Sánchez-Alonso1, Jesús Cáceres1, Aage S. Holm2, Geir Lieblein2,
Tor Arvid Breland2, Roger A. Mills3 and Nikos Manouselis4
1 University of Alcalá, Information Engineering research unit, Edificio Politécnico, Ctra. de
Barcelona km. 33.6, Alcalá de Henares, Spain, {salvador.sanchez, jesus.caceres}@uah.es
2 Norwegian University of Life Sciences, Department of Plant and Environmental Sciences
Postboks 5003, 1432 Aas, Norway, {aage.holm, geir.lieblein, tor.arvid.breland}@umb.no
3 Oxford University Library Services, Radcliffe Science Library, Parks Road, Oxford OX1
3QP, UK, [email protected]
4 Greek Research & Technology Network (GRNET S.A.), 56 Messogion Str., Athens, Greece
Abstract
Education is essential to raise public awareness of organic agriculture and agroecology (OA &
AE). The EU-funded project Organic.Edunet will provide an online, freely available portal
where learning content on OA & AE can be published and accessed through specialized
technologies. The future Organic.Edunet portal will offer advanced services (such as
ontology-based searching and social recommendation) and will facilitate search, retrieval and
use of the collected content. This paper describes the method by which experts in different
fields agree on and jointly build a conceptual model (in the form of an ontology), which will
be the basis for the technical novelties and advanced functionalities provided by the portal.
Keywords: organic agriculture and agroecology, learning resources, ontology, semantic Web.
1. Introduction
Organic agriculture and agroecology are important efforts towards creating a more sustainable
form of agriculture and farm production. Both aim to apply ecological principles to the
development and management of sustainable agricultural systems, and both share a concern for
taking care of the environment and integrating farming activities into rural communities,
among others. To raise public awareness of organic agriculture and agroecology (OA & AE),
education is essential. The EU-funded project Organic.Edunet aims to promote the
e-Education of European youth, and also of producers, farmers and consumers, about the
benefits of OA & AE by providing an openly available portal where learning content on OA & AE
can be published and accessed. The portal will integrate and specialize state-of-the-art
technologies of the World Wide Web in order to provide end users with a single European
reference point (the Organic.Edunet Web portal) that will offer advanced services such as
ontology-based searching and social recommendation, and will facilitate search, retrieval and
use of the collected content (Manouselis et al., 2008).
To be successful in this attempt, a twofold effort is necessary. First, a technical effort is
needed both to make learning materials available and to provide advanced functionalities that would
make the Organic.Edunet portal unique. Second, a coordination effort for experts in different
fields (domain experts, ontology experts, librarians and external consultants) to agree and
build together a conceptual model (in the form of an ontology) which would finally be the basis
for the technical novelties and advanced functionalities provided by the portal.
On the technical side, the effort can be summarized in the following points:
- To develop a metadata scheme for the description of digital learning resources. This scheme
will be built upon previous efforts, such as the IEEE Learning Object Metadata standard
(LTSC 2001), the Dublin Core metadata standard (Weibel et al., 1998), FAO's agricultural
information management standards, and others (Stuempel et al., 2007).
- To use this scheme to describe existing contents. This needs to be first carried out at a local
level: universities and institutions interested in becoming part of the effort should annotate
their OA & AE digital resources with multilingual, standard-complying metadata according
to the mentioned scheme.
- To specialize existing software tools in order to provide content producers with a
customized suite of tools that will help them organize their content in learning repositories,
where learning objects (Polsani, 2003) will be described according to the metadata scheme.
- To interconnect local repositories to form a federation of learning repositories.
- To provide a common point of access including advanced search capabilities, based on the
Semantic Web vision (Berners-Lee, Hendler and Lassila, 2001).
On the knowledge side, all the tasks to be performed can be summarized in one macrotask:
building an ontology covering OA & AE. An ontology can be described as a logical structure of
a domain, which includes a description of terms and their relations to each other. In the
Organic.Edunet project, an ontology on OA & AE provides this type of structure for our field of
study. As this ontology has to be designed to help users of the Organic.Edunet portal to make
good searches, it will be deliberately application specific and dependent. According to the
classification based on the level of generality by Guarino (1998), this ontology could be
classified as an application ontology, as it will describe concepts depending both on a
particular domain and task.
Negotiation is a very important issue in the whole process of knowledge construction.
Reaching agreements is a constant concern because the persons involved in this process have
different backgrounds and knowledge (e.g. different definitions of the same thing depending on
which country/institution the expert belongs to), not to mention the management difficulties
(e.g. partners distributed in 15 institutions from all over Europe, from Estonia to Spain, from
Greece to Norway). Those differences and difficulties must be overcome to produce a final
ontology with which all the persons involved are comfortable. This is essential as the
information in the ontology will later be used to enrich learning resource metadata. If this
information is not properly used, or if it is not used at all, to annotate learning materials in the
portal, it will be impossible to deliver the advanced capabilities promised.
This paper sketches the process being followed for the development of the ontology: the
various stakeholders involved, the steps suggested, the sources used and the decisions taken.
The rest of the paper is structured as follows. Section 2 describes the general procedure of
ontology engineering. Section 3 takes a closer look at the procedure followed so far, while
sections 4 and 5 analyze the perspectives and the structure of terms in the initial list of terms
for the OA & AE ontology. Finally, section 6 summarizes the discussion and provides some
conclusions.
2. Engineering an ontology on Organic Agriculture and Agroecology
Building an ontology can be approached from several angles. However, as reported by Chen
and Williams (2008), none of the current methodologies can be considered mature when compared
to those of other disciplines such as software engineering or knowledge engineering. For
this reason, and due to the specificities of our work, the following procedure was agreed:
1. OA & AE experts elaborate a list including all the relevant terms in the domain of OA & AE.
As terms will later be used to annotate learning materials, the higher the probability of
annotating a resource with a given term, the more relevant this particular term will be. This
vocabulary has to be built upon previous efforts, such as Bio@gro (http://www.bioagro.gr),
FAO's AGROVOC (http://www.fao.org/aims/ag_intro.htm), and others, and thus some
terms will include mapping information to terms in those vocabularies.
2. Using the list of terms as an input, domain experts (with the help of librarians and guidance
from the ontology experts) identify sub-domains with the aim of dividing the original list
into microthesauri (sub-lists also known as modules or microtheories). Modules must be
cohesive: all the concepts logically related will be part of the same module. Tentative high
level modules are Farming, Distribution, Processing, Consumption and Waste management,
which can be later subdivided into lower level modules.
3. Domain experts add agreed, unambiguous definitions for the terms, thus producing a
concept list. We agreed on using the word "concept" to denote terms whose definition and
some information on their relations to other concepts have been established.
4. Ontology experts develop an initial ontology from the concept list. The process is to
separately engineer each module into a sub-ontology, although all of these sub-ontologies
are created in parallel from the definitions.
5. Evaluation of the ontology produced in step 4, making use of upper ontologies in a process
described by Sánchez-Alonso and García-Barriocanal (2006). This process, which has the
advantage of allowing the work of several experts to be contrasted in parallel, producing very
efficient results, is structured around four steps:
5.1. Find one or several terms in the OpenCyc upper ontology (Lenat, 1995) that subsume,
are equal, or similar to the category under consideration.
5.2. Check carefully that the mapping is consistent with the rest of the subsumers inside
the upper ontology.
5.3. Provide the appropriate predicates to characterize the new category.
5.4. Edit the term in an ontology editor to come up with the final formal version.
6. Validation by example, using scenarios of typical user interaction for searching and
retrieving resources as a way of improving and refining the ontology.
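The distinction drawn in step 3 between a term and a concept can be made concrete with a small data structure. The following Python sketch is illustrative only: the class name `TermEntry`, its fields, and the sample definition and mapping label are assumptions introduced here, not part of the project's actual tooling.

```python
from dataclasses import dataclass, field


@dataclass
class TermEntry:
    """One entry in the list of terms (hypothetical structure)."""
    label: str
    definition: str = ""                           # added by domain experts in step 3
    relations: list = field(default_factory=list)  # (relation, other label) pairs
    mappings: dict = field(default_factory=dict)   # vocabulary name -> label there

    def is_concept(self) -> bool:
        # A term counts as a concept once it has a definition and
        # some information on its relations to other concepts.
        return bool(self.definition.strip()) and bool(self.relations)


crop_rotation = TermEntry(label="Crop Rotation")
print(crop_rotation.is_concept())  # False: still a bare term

# Illustrative values only; the real definition is for the domain experts to agree.
crop_rotation.definition = "Growing different crops in succession on the same land."
crop_rotation.relations.append(("sub-term of", "Preventive Plant Protection"))
crop_rotation.mappings["AGROVOC"] = "crop rotation"  # assumed mapping label
print(crop_rotation.is_concept())  # True
```

Keeping the mappings alongside the definition means that the links to existing vocabularies requested in step 1 travel with the term through the later steps.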
The procedure sketched, shown in figure 1, is intended to be an iterative and incremental
process through which validated concepts and relations will be added to the ontology in a
continuous and systematic way. Because ontology experts have limited knowledge of the
OA & AE domain, domain experts will be regularly asked for clarification, and will provide
more detailed explanations whenever necessary. The final result of this
process will be an ontology including and integrating all the information in the modules.
Fig. 1 Procedure for the engineering of an ontology on OA & AE
However, the initial idea of building up the ontology module by module came close to being
discarded due to difficulties on the domain experts' side, as they preferred to see the whole
picture. The solution taken was to build up a full list of tagged terms, where tags linked terms
to tentative modules where they could fit in, thus postponing the division of this full map of
concepts into cohesive modules to the next iteration. This reassured domain experts, who felt
that it would be easier to come up with such a mind map of concepts, where they could
later easily work with one branch at a time.
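A full list of tagged terms of this kind can be held in a very simple structure and divided into modules later. The Python sketch below assumes the tentative module names from step 2; the tagging of the individual terms (and the term "Composting") is invented for illustration and does not reproduce the project's list.

```python
from collections import defaultdict

# Tentative high-level modules named in step 2 of the procedure.
MODULES = ["Farming", "Distribution", "Processing", "Consumption", "Waste management"]

# Hypothetical tagging: each term points to the module(s) where it could fit.
TAGGED_TERMS = {
    "Crop Rotation": ["Farming"],
    "Plant Health": ["Farming"],
    "Labeling": ["Distribution", "Consumption"],
    "Composting": ["Farming", "Waste management"],
}


def group_by_module(tagged_terms):
    """Invert term -> tags into module -> sorted list of terms."""
    grouped = defaultdict(list)
    for term, tags in tagged_terms.items():
        for tag in tags:
            if tag not in MODULES:
                raise ValueError(f"unknown module tag {tag!r} on term {term!r}")
            grouped[tag].append(term)
    # Report every module, including those that have no terms yet.
    return {module: sorted(grouped[module]) for module in MODULES}


for module, terms in group_by_module(TAGGED_TERMS).items():
    print(f"{module}: {', '.join(terms) or '(no terms yet)'}")
```

Because a term may carry several tags, the experts can keep working on the whole map while the eventual split into cohesive modules stays open.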
3. From lists of terms to ontologies
It is usual to think of a field, its concepts and its relations in a two-dimensional way, as a
simple hierarchy in which each term sits under one broader term. However, relations can also
cross-link and, in this way, all terms have a place in the full knowledge system, related to one
or many other terms in different ways. One concrete example could be the following:
- Plant Production is a main, high level term
- Plant Health is a sub-term of Plant Production
- Preventive Plant Protection is a sub-term of Plant Health
- Crop Rotation is a sub-term of Preventive Plant Protection
At the same time:
- Diversity is a sub-term of Plant Production
- Crop Rotation is a sub-term of Diversity
This in fact means that:
Crop Rotation is a twofold sub-term of Plant Production, as it is a sub-term of both
Plant Health and Diversity
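The cross-linking in this example can be checked mechanically. The following Python sketch stores the sub-term statements above as broader-term links and enumerates every chain from Crop Rotation up to a top-level term; the names `BROADER` and `paths_to_root` are chosen here for illustration.

```python
# Broader-term links taken from the example (sub-term -> broader terms).
BROADER = {
    "Plant Health": ["Plant Production"],
    "Diversity": ["Plant Production"],
    "Preventive Plant Protection": ["Plant Health"],
    "Crop Rotation": ["Preventive Plant Protection", "Diversity"],
}


def paths_to_root(term, broader=BROADER):
    """Return every chain of broader terms from `term` up to a top-level term."""
    parents = broader.get(term, [])
    if not parents:
        return [[term]]  # a term with no broader term is top-level
    return [[term] + path
            for parent in parents
            for path in paths_to_root(parent, broader)]


for path in paths_to_root("Crop Rotation"):
    print(" -> ".join(path))
# Crop Rotation -> Preventive Plant Protection -> Plant Health -> Plant Production
# Crop Rotation -> Diversity -> Plant Production
```

Two distinct chains ending in Plant Production are exactly what makes Crop Rotation a twofold sub-term.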
However, even if this is a valid knowledge classification from the point of view of
building a thesaurus, it is not enough to build an ontology. In thesaurus construction, the
"is a" that denotes a relationship between two terms has a very broad meaning, which needs to
be narrowed to disambiguate and formalize the knowledge in an ontology. In fact, usual relations
in thesauri such as "broader", "narrower", "used for" or "related" are not precisely defined in a
semantic form. In our effort, it is the ontology experts' work to depart from a thesaurus-like
knowledge representation and to arrive at an explicit knowledge representation in the form of
an ontology. In the original list of terms that domain experts produced (a thesaurus-like list),
"is a" relationships are overloaded: they can represent aggregation relations, hierarchy
relations or even individual-concept relations. The following sentence will serve as an example
of the kind of work that ontology experts have to carry out:
Plant Health is a sub-term of Plant Production
Let us reflect on this for a moment. In which form is Plant Health related to Plant
Production? Is Plant Health an individual of the concept Plant Production? Not really. Is it a
more specific term than Plant Production, which can be categorized as a specific form of Plant
Production? Neither. So, Plant Production has to be an aggregated concept, one of whose parts
is Plant Health, right? Well, maybe. The problem is that an ontology expert cannot determine,
without the intervention of domain experts, which specific relation links Plant Health to Plant
Production. Only after iteration, discussion and negotiation will both groups of experts (domain
experts and ontology experts) agree on a knowledge representation where all the relationships
are unambiguously determined. From this point, it will be possible to speak about an ontology
and forget about concept lists, thesauri and term lists. In this sense, the formal knowledge in
the ontology will derive from the knowledge implicit in the list of terms used as an input for
its creation, but the ontology will be richer and will include many more concepts and
relationships than the list. In this example, we would finally have in the
ontology the following assertions (formal expressions stating true knowledge):
Plant Production is an agricultural activity
Plant Health is a prerequisite for Plant Production
Plant Protection is a set of techniques and processes to achieve Plant Health
Preventive Plant Protection is a sub-set of techniques for Plant Protection
Crop Rotation is a specific technique for Preventive Plant Protection
Similarly:
Diversity is a prerequisite for Plant Production in organic agriculture
Crop Rotation is a specific technique for obtaining Diversity
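Once each overloaded "is a" link has been replaced by a named relation, the assertions can be stored and queried as subject-relation-object triples. The Python sketch below is one possible encoding; the relation identifiers (`prerequisite_for`, `technique_for` and so on) are labels invented here for the assertions listed above, and the qualifier "in organic agriculture" is omitted for brevity.

```python
from typing import NamedTuple


class Assertion(NamedTuple):
    subject: str
    relation: str
    obj: str


# Relation names are assumed labels, not the project's formal vocabulary.
ASSERTIONS = [
    Assertion("Plant Production", "is_a", "Agricultural Activity"),
    Assertion("Plant Health", "prerequisite_for", "Plant Production"),
    Assertion("Plant Protection", "achieves", "Plant Health"),
    Assertion("Preventive Plant Protection", "subset_of", "Plant Protection"),
    Assertion("Crop Rotation", "technique_for", "Preventive Plant Protection"),
    Assertion("Diversity", "prerequisite_for", "Plant Production"),
    Assertion("Crop Rotation", "technique_for", "Diversity"),
]


def subjects(relation, obj, assertions=ASSERTIONS):
    """All subjects linked to `obj` by `relation`, in listing order."""
    return [a.subject for a in assertions
            if a.relation == relation and a.obj == obj]


def relations_of(subject, assertions=ASSERTIONS):
    """All (relation, object) pairs asserted about `subject`."""
    return [(a.relation, a.obj) for a in assertions if a.subject == subject]


print(subjects("prerequisite_for", "Plant Production"))
print(relations_of("Crop Rotation"))
```

Unlike the thesaurus form, a query here can distinguish what Plant Production requires from what it is, which is the kind of distinction ontology-based searching relies on.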
4. Perspectives
Beyond the thesaurus view of a knowledge classification, where broader-narrower term
relations are used to create hierarchies of terms, there are also other dimensions which form an
ontology. In this particular case, the ontology is designed to help users of the Organic.Edunet
portal to improve searches, and for this reason some other important facets or perspectives had
to be taken into account. We have called the most important facets the three new dimensions of
our OA & AE ontology: the perspective dimension, the user dimension and the space-and-time
dimension.
These dimensions are important in the process of implementing advanced searches in the
web portal. They may, for example, be included in the metadata of each content item (or in other
ways), in order to help the user find the desired materials. They should definitely not be
forgotten in the construction of the ontology, as different dimensions create different varieties
of the knowledge base.
4.1. The perspective dimension
Different perspectives could be Social, Economic, Production and Environmental. The
term "Plant health" may bring up different associations and content when seen from an economic
point of view than from an environmental point of view. An environmentalist may focus on
the biological aspects of plant health, plant protection and also the content of nutrients. An
economist, on the other hand, would probably focus on the costs and benefits of different plant
protection strategies. In the same way, "Animal welfare" may be associated with the absence of
sickness by a medical professional, but includes terms like "natural behavior" and "access to
open air" in the eyes of an environmentalist.
4.2. The user dimension
A user can approach a search, or a content/term, from different angles. An elementary or
secondary school pupil will anticipate a different kind of content than a PhD student or a
professor. The complexity of a term, or the complexity of the relationships to other terms,
should be different depending on the reader. A consumer may be more interested in the structure
of regulations and accreditation which concerns "Labeling", or the ideology behind it, whereas
a farmer/producer may be more interested in the specific regulations concerning plant
production.
4.3. The space-and-time dimension
When searching for information about a phenomenon, it is very important to be aware of the
level of resolution. For example, questions regarding symbiotic N2 fixation may be asked at the
molecular, the soil-plant mesocosm, the field, the farm, the local, the national or the biosphere
spatial level. Effective searches should be targeted at the resolution level of
interest to the user. Similarly, when searching for information about soil nitrogen dynamics, it
is desirable to distinguish between within-year (e.g., plant-available N during a growing
season), medium-term (e.g., N balance during a crop rotation) and long-term (e.g., humus N
dynamics over decades) time horizons.
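One way to picture how the three dimensions could support searching is to treat them as facets in the metadata of each content item. The Python sketch below is a minimal illustration under that assumption: the `Resource` fields, the catalogue entries and the `search` helper are hypothetical and do not describe the portal's actual metadata scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    title: str
    perspective: str    # Social, Economic, Production or Environmental
    audience: str       # e.g. pupil, PhD student, consumer, farmer
    spatial_level: str  # e.g. molecular, field, farm, national, biosphere
    time_horizon: str   # within-year, medium-term or long-term


# Hypothetical catalogue entries, invented for this example.
CATALOGUE = [
    Resource("Cost-benefit of plant protection strategies",
             "Economic", "farmer", "farm", "medium-term"),
    Resource("Biology of plant health",
             "Environmental", "PhD student", "field", "within-year"),
    Resource("Humus N dynamics",
             "Environmental", "PhD student", "field", "long-term"),
]


def search(catalogue, **facets):
    """Keep resources whose metadata match every requested facet value."""
    return [r for r in catalogue
            if all(getattr(r, name) == value for name, value in facets.items())]


for hit in search(CATALOGUE, perspective="Environmental", time_horizon="long-term"):
    print(hit.title)  # Humus N dynamics
```

Each facet narrows the result set independently, so a user can fix only the dimensions that matter for a given question.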
5. Structure of terms in the OA & AE ontology
A question that soon came to the minds of domain experts during the elaboration of the initial
list of terms was the following: What terms should represent an ontology for OA & AE? As
mentioned earlier, several thesauri and lists with terms related to agriculture, as well
as other lists including terms relevant to organic agriculture, were used as an input for the
overall process. Some terms, e.g. the term "irrigation", are relevant to agriculture in general
and also to organic agriculture, and thus were part of more than one list. Consequently, not all
terms can be included in the first list of terms for an OA & AE ontology. As a starting point, we
chose to look at the principles and main focus of organic agriculture. These principles, as
enunciated by IFOAM, give some directions to which OA & AE pay special attention: Health, Care,
Natural, Ecology, Quality, Safety, Variation, Diversity and Fair are all descriptions of its goals.
The goal of OA & AE could also be expressed in one sentence: "To produce enough and
healthy food for the world's population in a sustainable and ecological way". In our search for
key ideas for the initial list of terms, we have focused on terms which are closely related to
these goals and qualities. This does not mean that "irrigation" could not be a term in the future
OA & AE ontology, but it is not related to the unique key terms and qualities of OA & AE.
The main structure of the ontology starts with the farming perspective: Plant Production,
Soil Quality, Animal Production and Service Production. This is at the farming level. Moving
out from the farm, issues concerning distribution, processing, marketing, consumption and,
finally, waste management come into consideration. These are the main modules of the OA &
AE draft list of terms. This draft shows the main branches and key terms of the future ontology.
Of course, many terms not mentioned in this list belong in it, and will need to be included in
subsequent iterations of the process. In particular, these principles have to be kept in mind
during the refinement and iteration process:
- The terms in this ontology must relate to the content which will finally be provided
for the Organic.Edunet portal. If an institution has substantial and useful content
materials about e.g. cereal production, we will need to place key terms relevant to
this content in the ontology.
- When filling in missing sub-terms, these should be placed under the correct terms,
and intermediate terms may need to be filled in at a sub-level between the main term and
the new sub-term added to the list.
- Synonyms of existing terms, if there are any, must be found and referenced.
- Corrections and amendments should be made to the existing list of terms.
Other questions, such as whether or not modules are strictly necessary for our application
purposes or whether some terms should be placed under another branch, arose during discussion,
and were noted as interesting issues to take into account in the last stages of the process.
6. Conclusions and future work
In this paper, the process of engineering an OA & AE ontology for the needs of the
Organic.Edunet project has been sketched. This process involves at least three types of experts:
domain experts, ontology experts and other experts such as librarians, external consultants, etc.
The ontology construction has to take into account the existence of multiple dimensions in the
knowledge of the OA & AE field of study. These dimensions have been studied as part of the
ontology engineering process and listed as categories in the initial list of terms after step 1.
At first, ontology experts suggested building the ontology starting from some very basic
concepts (from some public, popular ontologies) like "organization", "person", "plant",
"phenomenon", and later letting the rest of the actors associate with them the list of concepts
that they had elaborated, building something like a hierarchy of concepts that would be the
backbone of the ontology. However, this approach was modified to reach a final agreement on the
7-step procedure detailed in section 2. At the moment, only the first two steps of the full
procedure have been completed. The preliminary list of terms was created with the help and
cooperation of all the content providers, people from UN/FAO and other experts in the field by
February 2008. The initial list of tagged concepts was produced during an internal workshop at
UMB in May 2008, after months of work and input from all the Organic.Edunet partners and
external experts.
The first version of the ontology (containing terms only in English) is scheduled to be
ready in September 2008. The work of including translations into the other eight languages
represented in the Organic.Edunet consortium should also be finished by the end of September.
From this point, the knowledge in the ontology will be used in pilot searches, whose
results will in turn be used to iterate on the design and contents of the ontology, to improve it
and refine it.
Acknowledgements
The research reported herein is part of the activities of the EU-funded project Organic.Edunet
(ECP-2006-EDU-410012), from which it receives partial funding.
References
Berners-Lee, T., Hendler, J., & Lassila, O. (2001). The semantic Web. Scientific American, 284(5),
28-37.
Chen, S. and Williams, M.A. (2008). Learning personalized ontologies from text: A review on an
inherently transdisciplinary area. In González, R.A., Chen, N. and Dahanayake, A. (eds.)
Personalized Information Retrieval and Access: Concepts, Methods and Practices. Hershey,
PA: Information Science Reference.
Guarino, N. (1998). Formal ontology and information systems. In Proc. of the 1st International
Conference on Formal Ontologies in Information Systems (FOIS98), pp. 35.
IEEE Learning Technology Standards Committee (LTSC) (2001) Draft Standard for Learning Object
Metadata Version 6.1. Available online at: http://ltsc.ieee.org/doc/
Lenat, D. (1995). CYC: a large-scale investment in knowledge infrastructure. Communications of the
ACM, 38(11), 33-38.
Manouselis, N., Abian, A., Soto-Carrión, J., Ebner, H., Palmér, M. and Naeve, A. (2008). A Semantic
Infrastructure to Support a Federation of Agricultural Learning Repositories. In Proc. of the 8th
IEEE Int. Conf. on Advanced Learning Technologies (ICALT'08).
Polsani, P. R. (2003). Use and Abuse of Reusable Learning Objects. Journal of Digital Information, 3(4),
2003-02.
Sanchez-Alonso, S., Garcia, E. (2006). Making use of upper ontologies to foster interoperability
between SKOS concept schemes. Online Information Review, 30(3), 263-277.
Stuempel H., Salokhe G., Aubert A., Keizer J., Nadeau A., Katz S. and Rudgard S. (2007). Metadata
Application Profile for Agricultural Learning Resources. In Proc. of the 2nd Int. Conf. on
Metadata and Semantics Research (MTSR'07), Corfu, Greece.
Weibel, S., Kunze, J., Lagoze, C., & Wolf, M. (1998). Dublin Core Metadata for Resource Discovery.
Internet Engineering Task Force RFC, 2413.