    IAALD AFITA WCCA2008 WORLD CONFERENCE ON AGRICULTURAL INFORMATION AND IT

    Engineering an ontology on organic agriculture and agroecology: the

    case of the Organic.Edunet project

    Salvador Sánchez-Alonso1, Jesús Cáceres1, Aage S. Holm2, Geir Lieblein2, Tor Arvid Breland2, Roger A. Mills3 and Nikos Manouselis4

    1 University of Alcalá, Information Engineering research unit, Edificio Politécnico, Ctra. de

    Barcelona km. 33.6, Alcalá de Henares, Spain, {salvador.sanchez, jesus.caceres}@uah.es

    2 Norwegian University of Life Sciences, Department of Plant and Environmental Sciences

    Postboks 5003, 1432 Aas, Norway, {aage.holm, geir.lieblein, tor.arvid.breland}@umb.no

    3 Oxford University Library Services, Radcliffe Science Library, Parks Road, Oxford OX1

    3QP, UK, [email protected]

    4 Greek Research & Technology Network (GRNET S.A.), 56 Messogion Str., Athens, Greece

    [email protected]

    Abstract

    Education is essential to raise public awareness of organic agriculture and agroecology (OA &

    AE). The EU-funded project Organic.Edunet will provide an online, freely-available portal

    where learning contents on OA & AE can be published and accessed through specialized

    technologies. Thus, the future Organic.Edunet portal will offer advanced services (such as

    ontology-based searching and social recommendation) and will facilitate search, retrieval and

    use of the collected content. This paper describes the method by which experts in different

    fields agree on and jointly build a conceptual model (in the form of an ontology) that will

    ultimately be the basis for the technical novelties and advanced functionalities provided by

    the portal.

    Keywords: organic agriculture and agroecology, learning resources, ontology, semantic Web.

    1. Introduction

    Organic agriculture and agroecology are important efforts towards creating a more sustainable

    form of agriculture and farm production. Aimed at applying ecological principles to the

    development and management of sustainable agricultural systems, both share the concern for

    taking care of the environment, and integrating farming activities in the rural communities,

    among others. To raise public awareness of organic agriculture and agroecology (OA & AE),

    education is essential. The EU-funded project Organic.Edunet aims at the promotion of the

    e-Education of European youth, but also of producers, farmers, and consumers about OA & AE

    benefits, by providing an openly-available portal where learning contents on OA & AE could

    be published and accessed. This portal will integrate and specialize state-of-the-art technologies of

    the World Wide Web, in order to provide end-users with a single European reference point (the

    Organic.Edunet Web portal) that will offer advanced services such as ontology-based searching

    and social recommendation, and will facilitate search, retrieval and use of the collected content

    (Manouselis et al., 2008).

    To be successful in this attempt, a twofold effort is necessary. First, a technical effort

    both to make learning materials available and to provide advanced functionalities that would

    make the Organic.Edunet portal unique. Second, a coordination effort for experts in different

    fields (domain experts, ontology experts, librarians and external consultants) to agree and

    build together a conceptual model (in the form of an ontology) which would finally be the basis

    for the technical novelties and advanced functionalities provided by the portal.

    On the technical side, the effort can be summarized in the following points:

    - To develop a metadata scheme for the description of digital learning resources. This scheme will be built upon previous efforts, such as the IEEE Learning Object Metadata standard (LTSC 2001), the Dublin Core metadata standard (Weibel et al., 1998), FAO's agricultural information management standards, and others (Stuempel et al., 2007).

    - To use this scheme to describe existing contents. This needs to be first carried out at a local level: universities and institutions interested in becoming part of the effort should annotate their OA & AE digital resources with multilingual, standard-complying metadata according to the mentioned scheme.

    - To specialize existing software tools in order to provide content producers with a customized suite of tools that will help them organize their content in learning repositories, where learning objects (Polsani, 2003) will be described according to the metadata scheme.

    - To interconnect local repositories to form a federation of learning repositories.

    - To provide a common point of access including advanced search capabilities, based on the Semantic Web vision (Berners-Lee, Hendler and Lassila, 2001).
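
    The technical points above can be illustrated with a minimal Python sketch. It is not the Organic.Edunet metadata scheme: the class `LearningResource`, its field names, the repository labels and the sample records are all assumptions made for illustration. It only shows the general idea of multilingual metadata records, annotated with ontology terms, being merged from local repositories into a single searchable index.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LearningResource:
    """Hypothetical metadata record; field names are illustrative only."""
    identifier: str
    titles: dict[str, str]          # language code -> title (multilingual metadata)
    repository: str                 # local repository that holds the resource
    ontology_terms: set[str] = field(default_factory=set)


def federate(*repositories: list[LearningResource]) -> dict[str, LearningResource]:
    """Merge several local repositories into one index keyed by identifier."""
    index: dict[str, LearningResource] = {}
    for repository in repositories:
        for resource in repository:
            index[resource.identifier] = resource
    return index


def search_by_term(index: dict[str, LearningResource], term: str) -> list[str]:
    """Return the identifiers of all resources annotated with an ontology term."""
    return sorted(r.identifier for r in index.values() if term in r.ontology_terms)


# Made-up sample records from two hypothetical local repositories.
repo_a = [LearningResource("a:001",
                           {"en": "Crop rotation", "es": "Rotación de cultivos"},
                           "repository A", {"Crop Rotation", "Diversity"})]
repo_b = [LearningResource("b:001", {"en": "Soil quality"},
                           "repository B", {"Soil Quality"})]

index = federate(repo_a, repo_b)
print(search_by_term(index, "Crop Rotation"))
```

    The single access point is reduced here to one dictionary lookup; a real federation would exchange metadata between repositories rather than merge in-memory lists.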

    On the knowledge side, all the tasks to perform can be summarized in one macrotask:

    building an ontology covering OA & AE. An ontology can be described as a logical structure of

    a domain, which includes a description of terms and their relations to each other. In the

    Organic.Edunet project, an ontology on OA & AE provides this type of structure for our field of

    study. As this ontology has to be designed to help users of the Organic.Edunet portal to make

    good searches, it will be deliberately application specific and dependent. According to the classification based on the level of generality by Guarino (1998), this ontology could be

    classified as an application ontology, as it will describe concepts depending both on a

    particular domain and task.

    Negotiation is a very important issue in the whole process of knowledge construction.

    Reaching agreements is a constant concern because the persons involved in this process have

    different backgrounds and knowledge (e.g. different definitions of the same thing depending on

    which country/institution the expert belongs to), not to mention the management difficulties

    (e.g. partners distributed in 15 institutions from all over Europe, from Estonia to Spain, from

    Greece to Norway). Those differences and difficulties must be overcome to produce a final

    ontology with which all the persons involved are comfortable. This is essential as the

    information in the ontology will be later used to enrich learning resources metadata. If this information is not properly used or if it is not used at all to annotate learning materials in the

    portal, it will be impossible to deliver the advanced capabilities promised.

    This paper sketches the process being followed for the development of the ontology: the

    various stakeholders involved, the steps suggested, the sources used and the decisions taken.

    The rest of it is structured as follows. Section 2 describes the general procedure of ontology

    engineering. Section 3 provides a closer look at the procedure followed so far, while sections 4

    and 5 analyze the perspective and structure of terms in the initial list of terms for the OA & AE

    ontology. Finally, section 6 summarizes the discussion, providing some conclusions.

    2. Engineering an ontology on Organic Agriculture and Agroecology

    Building an ontology can be approached from several angles. However, as reported by Chen

    and Williams (2008), none of the current methodologies can be considered mature when compared

    to those of other disciplines, such as software engineering or knowledge engineering. For

    this reason, and due to the specificities of our work, the following procedure was agreed:

    1. OA & AE experts elaborate a list including all the relevant terms in the domain of OA & AE. As terms will be later used to annotate learning materials, the higher the probability of annotating a resource with a given term, the more relevant this particular term will be. This vocabulary has to be built upon previous efforts, such as Bio@gro (http://www.bioagro.gr), FAO's AGROVOC (http://www.fao.org/aims/ag_intro.htm), and others, and thus some terms will include mapping information to terms in those vocabularies.

    2. Using the list of terms as an input, domain experts (with the help of librarians and guidance from the ontology experts) identify sub-domains with the aim of dividing the original list into microthesauri (sub-lists also known as modules or microtheories). Modules must be cohesive: all the concepts logically related will be part of the same module. Tentative high-level modules are Farming, Distribution, Processing, Consumption and Waste management, which can be later subdivided into lower-level modules.

    3. Domain experts add agreed, unambiguous definitions for the terms, thus producing a concept list. We agreed on using the word "concept" to denote terms whose definition and some information on their relations to other concepts have been established.

    4. Ontology experts develop an initial ontology from the concept list. The process is to separately engineer each module into a sub-ontology, although all of these sub-ontologies are created in parallel from the definitions.

    5. Evaluation of the ontology produced in step 4, making use of upper ontologies in a process described by Sánchez-Alonso and García-Barriocanal (2006). This process, which has the advantage of allowing the work of several experts to be contrasted in parallel, producing very efficient results, is structured around four steps:

    5.1. Find one or several terms in the OpenCyc upper ontology (Lenat, 1995) that subsume, are equal to, or are similar to the category under consideration.

    5.2. Check carefully that the mapping is consistent with the rest of the subsumers inside the upper ontology.

    5.3. Provide the appropriate predicates to characterize the new category.

    5.4. Edit the term in an ontology editor to come up with the final formal version.

    6. Validation by example, using scenarios of typical user interaction for searching and retrieving resources as a way of improving and refining the ontology.

    The procedure sketched, shown in figure 1, is intended to be an iterative and incremental

    process through which validated concepts and relations will be added to the ontology in a

    continuous and systematic building method. Because ontology experts have limited

    knowledge of the OA & AE domain, domain experts will be regularly asked for clarification,

    and will provide more detailed explanations whenever necessary. The final result of this

    process will be an ontology including and integrating all the information in the modules.

    Fig. 1 Procedure for the engineering of an ontology on OA & AE

    However, the initial idea of building up the ontology module by module came close to being

    discarded because of difficulties on the side of the domain experts, who preferred to see the whole

    picture. The solution taken was to build up a full list of tagged terms, where tags linked terms

    to tentative modules where they could fit in, thus postponing the division of this full map of

    concepts into cohesive modules to the next iteration. This reassured the domain experts, who

    felt it would be easier to come up with such a mind map of concepts first and then

    work on one branch at a time.
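
    The idea of a full list of tagged terms can be sketched in a few lines of Python. The module names are the tentative high-level modules listed in the procedure; the term "Composting" and every tag assignment are made-up assumptions, not the project's actual list. The sketch shows that the division into modules can be derived from the tags at any later iteration, and that terms carrying several tags are the ones needing discussion before modules can be made cohesive.

```python
from collections import defaultdict

# Hypothetical tagged list: each term points to the tentative modules it could fit in.
tagged_terms = {
    "Crop Rotation": {"Farming"},
    "Labeling": {"Distribution", "Consumption"},
    "Composting": {"Farming", "Waste management"},
}


def group_by_module(tagged):
    """Derive tentative modules (sub-lists of terms) from the tags."""
    modules = defaultdict(set)
    for term, tags in tagged.items():
        for tag in tags:
            modules[tag].add(term)
    return dict(modules)


def ambiguous_terms(tagged):
    """Terms tagged with more than one module, to be settled in a later iteration."""
    return sorted(term for term, tags in tagged.items() if len(tags) > 1)


modules = group_by_module(tagged_terms)
print(sorted(modules["Farming"]))
print(ambiguous_terms(tagged_terms))
```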

    3. From lists of terms to ontologies

    It is usual to think of a field, its concepts and its relations in a two-dimensional way: some terms

    are related to each other, and so on. However, relations can also cross-link and, in this way, all

    terms have a place in the full knowledge system, related to one or many other terms in different

    ways. One concrete example could be the following:

    - Plant Production is a main, high level term

    - Plant Health is a sub-term of Plant Production

    - Preventive Plant Protection is a sub-term of Plant Health

    - Crop Rotation is a sub-term of Preventive Plant Protection

    At the same time:

    - Diversity is a sub-term of Plant Production

    - Crop Rotation is a sub-term of Diversity

    Which in fact means that:

    Crop Rotation is a twofold sub-term of Plant Production, as it descends from both

    Plant Health and Diversity
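
    The cross-linked example above can be written down as a small graph. The following Python sketch is an illustration only: the dictionary `broader` and the function `paths_to_top` are hypothetical names, and the data are just the six sub-term statements given in the example. It enumerates every chain of broader terms, which makes the two routes from Crop Rotation to Plant Production explicit.

```python
# Each term maps to its broader terms, exactly as stated in the example.
broader = {
    "Plant Health": ["Plant Production"],
    "Preventive Plant Protection": ["Plant Health"],
    "Diversity": ["Plant Production"],
    "Crop Rotation": ["Preventive Plant Protection", "Diversity"],
}


def paths_to_top(term):
    """Return every chain of broader terms from `term` up to a top-level term."""
    parents = broader.get(term, [])
    if not parents:
        return [[term]]  # a term without broader terms is a top-level term
    return [[term] + path for parent in parents for path in paths_to_top(parent)]


for path in paths_to_top("Crop Rotation"):
    print(" > ".join(path))
```

    The sketch assumes the sub-term statements contain no cycles; a thesaurus with circular broader-term links would need a visited set.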

    However, even if this is a valid knowledge classification from the point of view of

    building a thesaurus, it is not enough to build an ontology. In thesaurus construction, the "is a"

    relationship between two terms has a very broad meaning that needs to be

    narrowed to disambiguate and formalize the knowledge in an ontology. In fact, usual relations

    in thesauri such as "broader", "narrower", "used for" or "related" are not precisely defined in a

    semantic form. In our effort, it is the ontology experts' work to depart from a thesaurus-like

    knowledge representation and to arrive at an explicit knowledge representation in the form of

    an ontology. In the original list of terms that domain experts produced (a thesaurus-like list),

    "is a" relationships are overloaded: they can represent aggregation relations, hierarchy

    relations or even individual-concept relations. The following sentence will serve as an example

    of the kind of work that ontology experts have to carry out:

    Plant Health is a sub-term of Plant Production

    Let us reflect on this for a moment. In which form is Plant Health related to Plant

    Production? Is Plant Health an individual of the concept Plant Production? Not really. Is it a

    more specific term than Plant Production, which can be categorized as a specific form of Plant

    Production? Neither. So, Plant Production has to be an aggregated concept, one of whose parts

    is Plant Health, right? Well, maybe. The problem is that an ontology expert cannot determine,

    without the intervention of domain experts, which specific relation links Plant Health to Plant

    Production. Only after iteration, discussion and negotiation will both groups of experts (domain experts and

    ontology experts) agree on a knowledge representation where all the relationships are

    unambiguously determined. From this point, it will be possible to speak about an ontology and

    forget about concept lists, thesauri and term lists. In this sense, the formal knowledge in the

    ontology will derive from the knowledge implicit in the list of terms used as an input for the

    creation of the ontology, but the ontology will be richer, and will of course include many more

    concepts and relationships than the list. In this example, we would finally have in the

    ontology the following assertions (formal expressions stating true knowledge):

    Plant Production is an agricultural activity

    Plant Health is a prerequisite for Plant Production

    Plant Protection is a set of techniques and processes to achieve Plant Health

    Preventive Plant Protection is a sub-set of techniques for Plant Protection

    Crop Rotation is a specific technique for Preventive Plant Protection

    Similarly:

    Diversity is a prerequisite for Plant Production in organic agriculture

    Crop Rotation is a specific technique for obtaining Diversity
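
    To make the contrast with the thesaurus concrete, the assertions above can be sketched as subject-predicate-object statements. This is a plain Python illustration, not the formal ontology language used in the project; the predicate names (`is_a`, `prerequisite_for`, `achieves`, `subset_of`, `technique_for`) are assumptions chosen to paraphrase the sentences, and the qualifier "in organic agriculture" is left out for brevity.

```python
from typing import NamedTuple


class Assertion(NamedTuple):
    subject: str
    predicate: str   # hypothetical relation name paraphrasing the sentence
    obj: str


assertions = [
    Assertion("Plant Production", "is_a", "Agricultural Activity"),
    Assertion("Plant Health", "prerequisite_for", "Plant Production"),
    Assertion("Plant Protection", "achieves", "Plant Health"),
    Assertion("Preventive Plant Protection", "subset_of", "Plant Protection"),
    Assertion("Crop Rotation", "technique_for", "Preventive Plant Protection"),
    Assertion("Diversity", "prerequisite_for", "Plant Production"),
    Assertion("Crop Rotation", "technique_for", "Diversity"),
]


def subjects(predicate, obj):
    """All subjects linked to `obj` by the given relation."""
    return sorted(a.subject for a in assertions
                  if a.predicate == predicate and a.obj == obj)


# Queries that a single overloaded sub-term relation could not distinguish.
print(subjects("prerequisite_for", "Plant Production"))
print(subjects("technique_for", "Diversity"))
```

    With named relations, a search can ask specifically for prerequisites or for techniques, whereas the thesaurus-like list would return every sub-term alike.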

    4. Perspectives

    Beyond the thesaurus-style view of knowledge classification, where broader-narrower term

    relations are used to create hierarchies of terms, there are other dimensions that shape an

    ontology. In this particular case, the ontology is designed to help users of the Organic.Edunet

    portal to improve searches, and for this reason some other important facets or perspectives had to be

    taken into account. We have called the most important facets "the three new dimensions" of our

    OA & AE ontology: the perspective dimension, the user dimension and the space-and-time

    dimension.

    These dimensions are important in the process of implementing advanced searches in the

    web portal. They may, for example, be included in the metadata registry of each content item (or in other

    ways), in order to help the user find the desired materials. They should definitely not be

    forgotten in the construction of the ontology, as different dimensions create different varieties

    of the knowledge base.

    4.1. The perspective dimension

    Different perspectives could be "Social", "Economic", "Production" and "Environmental". The

    term "Plant health" may bring up different associations and content seen from an economic

    point of view, compared to an environmental point of view. An environmentalist may focus on

    the biological aspects of plant health, plant protection and also the content of nutrients. An

    economist would probably focus, on the other hand, on the cost-benefit of different plant

    protection strategies. In the same way, "Animal welfare" may be associated with "lack of

    sickness" by a medical professional, but includes terms like "natural behavior" and "access to

    open air" in the eyes of an environmentalist.

    4.2. The user dimension

    A user can approach a search, or a content/term, from different angles. An elementary or

    secondary school pupil will anticipate a different kind of content than a PhD student or a

    professor. The complexity of a term, or the complexity of the relationships to other terms,

    should be different depending on the reader. A consumer may be more interested in the structure

    of regulations and accreditation concerning "Labeling", or the ideology behind it, whereas

    a farmer/producer may be more interested in the specific regulations concerning plant

    production.

    4.3. The space-and-time dimension

    When searching for information about a phenomenon, it is very important to be aware of the level of

    resolution. For example, questions regarding symbiotic N2 fixation may be asked at the

    molecular, the soil-plant mesocosm, the field, the farm, the local, the national or the biosphere

    spatial levels, respectively. Effective searches should be targeted at the resolution level of

    interest to the user. Similarly, when searching for information about soil nitrogen dynamics, it

    is desirable to distinguish between within-year (e.g., plant-available N during a growing

    season), medium-term (e.g., N balance during a crop rotation) and long-term (e.g., humus N

    dynamics over decades or decennia) time horizons.
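
    One possible way of using the three dimensions in a search is sketched below in Python, under the assumption that each dimension is stored as a metadata facet of the content item. The class `Resource`, the facet names, the audience labels and the sample records are all hypothetical; only the facet values for perspective, spatial level and time horizon are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    title: str            # made-up sample title
    term: str             # ontology term the resource is annotated with
    perspective: str      # Social, Economic, Production or Environmental
    audience: str         # hypothetical user category
    spatial_level: str    # e.g. molecular, field, farm, national
    time_horizon: str     # within-year, medium-term or long-term


RESOURCES = [
    Resource("Cost-benefit of plant protection strategies", "Plant Health",
             "Economic", "farmer", "farm", "medium-term"),
    Resource("Biology of plant health", "Plant Health",
             "Environmental", "pupil", "field", "within-year"),
    Resource("Humus nitrogen over decades", "Soil Nitrogen Dynamics",
             "Environmental", "researcher", "field", "long-term"),
]


def search(resources, term, **facets):
    """Return titles annotated with `term` whose facets match all given values."""
    return [
        r.title
        for r in resources
        if r.term == term
        and all(getattr(r, name) == value for name, value in facets.items())
    ]


print(search(RESOURCES, "Plant Health"))
print(search(RESOURCES, "Plant Health", perspective="Economic"))
```

    The same term returns different material depending on the facets supplied, which is the behaviour the dimensions are meant to support.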

    5. Structure of terms in the OA & AE ontology

    A question that soon came to the minds of domain experts during the elaboration of the initial

    list of terms was the following: "What terms should represent an ontology for OA & AE?" As

    mentioned earlier, several thesauri and lists with terms related to agriculture, as well

    as other lists including terms relevant to organic agriculture, were used as an input for the

    overall process. Some terms, e.g. the term "irrigation", are relevant to agriculture in general and

    also to organic agriculture, and thus were part of more than one list. Consequently, not all terms

    can be included in the first list of terms for an OA & AE ontology. As a starting point, we chose

    to look at the principles and main focus of organic agriculture. These principles, as enunciated

    by IFOAM, indicate the areas to which OA & AE pay special attention: Health, Care,

    Natural, Ecology, Quality, Safety, Variation, Diversity and Fair are all descriptions of its goals.

    The goal of OA & AE could also be expressed in one sentence: "To produce enough and

    healthy food for the world's population in a sustainable and ecological way". In our search for

    key ideas for the initial list of terms, we have focused on terms which are closely related to

    these goals and qualities. This does not mean that "irrigation" could not be a term in the future

    OA & AE ontology, but it is not related to the unique key terms and qualities of OA & AE.

    The main structure of the ontology starts with the farming perspective: Plant Production,

    Soil Quality, Animal Production and Service Production. This is at the farming level. Moving

    out from the farm, issues concerning distribution, processing, marketing, consumption and, at

    the end, waste management come into consideration. These are the main modules of the OA &

    AE draft list of terms. This draft shows the main branches and key terms of the future ontology.

    Of course, many terms not mentioned in this list should be in it, but will need to be included in

    subsequent iterations of the process. In particular, these principles have to be kept in mind

    during the refinement and iteration process:

    - The terms in this ontology must relate to the content which will be finally provided for the Organic.Edunet portal. If an institution has substantial and useful content materials about e.g. cereal production, we will need to place key terms relevant to this content in the ontology.

    - When filling in missing sub-terms, these should be placed under the correct terms, and it will probably be necessary to fill in terms at a sub-level between the main term and the new sub-term added to the list.

    - Synonyms of existing terms, if there are any, must be found and referenced.

    - Corrections and amendments should be made to the existing list of terms.

    Other questions arose during discussion, such as whether modules are strictly necessary for our application

    purposes or whether some terms should be placed under another branch, and were noted as interesting issues to take into account in the last stages of the process.

    6. Conclusions and future work

    In this paper, the process of engineering an OA & AE ontology for the needs of the

    Organic.Edunet project has been sketched. This process involves at least three types of experts:

    domain experts, ontology experts and other experts such as librarians, external consultants, etc.

    The ontology construction has to take into account the existence of multiple dimensions in the

    knowledge of the OA & AE field of study. These dimensions have been studied as part of the

    ontology engineering process and listed as categories in the initial list of terms after step 1.

    At first, ontology experts suggested building the ontology starting from some very basic

    concepts (from some public, popular ontologies) like "organization", "person", "plant",

    "phenomenon", and later letting the rest of the actors associate the list of concepts that they

    had elaborated, building something like a hierarchy of concepts that would be the backbone of

    the ontology. However, this approach was modified to reach a final agreement on the 7 step

    procedure detailed in section 2. At the moment, only the first two steps of the full procedure

    have been completed. The preliminary list of terms was created with the help and cooperation of

    all the content providers, people from UN/FAO and other experts in the field by February 2008.

    The initial list of tagged concepts was produced during an internal workshop at UMB in May

    2008, after months of work and input from all the Organic.Edunet partners and external experts.

    The first version of the ontology (containing terms only in English) is scheduled to be

    ready in September 2008. The work of including translations into all the other eight languages

    represented in the Organic.Edunet consortium should be finished by the end of September as

    well. From this point, the knowledge in the ontology will be used in pilot searches whose

    results will be in turn used to iterate the design and contents of the ontology, to improve it and

    refine it.

    Acknowledgements

    The research reported herein is part of the activities of the EU-funded project Organic.Edunet

    (ECP-2006-EDU-410012), from which it receives partial funding.

    References

    Berners-Lee, T., Hendler, J., & Lassila, O. (2001). The semantic Web. Scientific American, 284(5),

    28-37.

    Chen, S. and Williams, M.A. (2008). Learning personalized ontologies from text: A review on an inherently transdisciplinary area. In González, R.A., Chen, N. and Dahanayake, A. (eds.)

    Personalized Information Retrieval and Access: Concepts, Methods and Practices. Hershey,

    PA: Information Science Reference.

    Guarino, N. (1998). Formal ontology and information systems. In Proc. of the 1st International

    Conference on Formal Ontologies in Information Systems (FOIS'98), pp. 3-15.

    IEEE Learning Technology Standards Committee (LTSC) (2001) Draft Standard for Learning Object

    Metadata Version 6.1. Available online at: http://ltsc.ieee.org/doc/

    Lenat, D. (1995). CYC: a large-scale investment in knowledge infrastructure. Communications of the

    ACM, 38(11), 33-38.

    Manouselis, N., Abian, A., Soto-Carrión, J., Ebner, H., Palmér, M. and Naeve, A. (2008). A Semantic

    Infrastructure to Support a Federation of Agricultural Learning Repositories. In Proc. of the 8th

    IEEE Int. Conf. on Advanced Learning Technologies (ICALT'08).

    Polsani, P. R. (2003). Use and Abuse of Reusable Learning Objects. Journal of Digital Information, 3(4),

    2003-02.

    Sánchez-Alonso, S. and García-Barriocanal, E. (2006). Making use of upper ontologies to foster interoperability

    between SKOS concept schemes. Online Information Review, 30(3), 263-277.

    Stuempel H., Salokhe G., Aubert A., Keizer J., Nadeau A., Katz S. and Rudgard S. (2007). Metadata

    Application Profile for Agricultural Learning Resources. In Proc. of the 2nd Int. Conf. on

    Metadata and Semantics Research (MTSR'07), Corfu, Greece.

    Weibel, S., Kunze, J., Lagoze, C., & Wolf, M. (1998). Dublin Core Metadata for Resource Discovery.

    Internet Engineering Task Force RFC, 2413.
