
Learning Domain Ontologies for Semantic Web Service Descriptions

Marta Sabou a, Chris Wroe b, Carole Goble b, Heiner Stuckenschmidt a

a Dept. of Artificial Intelligence, Vrije Universiteit Amsterdam
b Dept. of Computer Science, University of Manchester

Abstract

High quality domain ontologies are essential for successful employment of semantic Web services. However, their acquisition is difficult and costly, thus hampering the development of this field. In this paper we report on the first stage of research that aims to develop (semi-)automatic ontology learning tools in the context of Web services that can support domain experts in the ontology building task. The goal of this first stage was to get a better understanding of the problem at hand and to determine which techniques might be feasible to use. To this end, we developed a framework for (semi-)automatic ontology learning from textual sources attached to Web services. The framework exploits the fact that these sources are expressed in a specific sublanguage, making them amenable to automatic analysis. We implement two methods in this framework, which differ in the complexity of the employed linguistic analysis. We evaluate the methods in two different domains, verifying the quality of the extracted ontologies against high quality hand-built ontologies of these domains.

Our evaluation led to a set of valuable conclusions on which further work can be based. First, it appears that our method, while tailored for the Web services context, might be applicable across different domains. Second, we concluded that deeper linguistic analysis is likely to lead to better results. Finally, the evaluation metrics indicate that good results can be achieved using only relatively simple, off-the-shelf techniques. Indeed, the novelty of our work is not in the natural language processing methods used but rather in the way they are put together in a generic framework specialized for the context of Web services.

Key words: ontology learning, semantic Web services, ontology learning evaluation

Preprint submitted to Elsevier Science, 17 September 2005

In the last few years the Web has undergone two revolutionary changes which aim to transform it from a static document collection into an intelligent and dynamic data integration environment. First, Web service technology allowed uniform access via Web standards to software components residing on various platforms and written in different programming languages. As a result, software components providing a variety of functionalities (ranging from currency conversion to flight booking or book buying) are now accessible via a set of Web standards and protocols. Naturally, the real value of Web services lies in their composition, which allows creating new and complex functionalities from the existing services. The second novel Web technology, the Semantic Web, developed techniques for augmenting existing Web data with logic-based formal descriptions of their meaning. This semantic markup is machine processable and therefore facilitates access to and integration of the vast amount of Web data.

A major limitation of Web services technology is that finding and composing services still requires manual effort. This becomes a serious burden with the increasing number of Web services. To address this problem, semantic Web researchers advanced the idea of augmenting Web services with a semantic description of their functionality that could facilitate their discovery and integration. More precisely, Web services are described in terms of concepts provided by a domain ontology. These concepts denote entities in the domain of the Web service (e.g., Food, Hotel) as well as functionalities that can be performed by services in the given domain (e.g., OrderFood, BookHotel). To ensure high quality reasoning on these semantic Web service descriptions, it is essential that they rely on the use of quality domain ontologies, i.e., ontologies that have a broad coverage of their domain's terminology. This would allow many (if not all) services to use the same or a small number of different ontologies, thus reducing the need for mappings at reasoning time. Note that in practice different types of ontologies are used, ranging from catalogs of domain concepts to formal domain models.

Despite their importance, few domain ontologies for Web service descriptions exist and building them is a challenging task. One of the problematic aspects is that for building a high quality domain ontology one ideally needs to inspect a large number of Web services in that domain. Since many domains witnessed a rapid increase in the number of available Web services to several hundreds (1000+ in bioinformatics), tools that support ontology curators in building a Web service domain ontology from these large and dynamic data sets become crucial.

Our work addresses the problem of (semi-)automatically learning Web service domain ontologies. We report on the first stage of this work in which we aim to get a better understanding of the ontology learning task in the context of Web services and to identify potentially feasible technologies that could be used. Early in our work we learned that the context of Web services raises several issues that constrain the development of an ontology learning solution. We designed a framework for performing ontology learning in the context of Web services which addresses these issues in two ways. First, it exploits the particularities of Web service documentations to extract information used for ontology building. In particular, the sublanguage characteristics of these texts led to the identification of a set of heuristics. These heuristics are implemented as pattern based extraction rules defined on top of linguistic information. Second, the learned ontologies are suited for Web service descriptions as they contain both static and procedural knowledge.


We implemented two learning methods that follow the basic principles of the framework but use different linguistic knowledge. The first method uses basic Part-of-Speech (POS) information and was developed and tested in the context of the WonderWeb project 1 [52]. The second method uses deeper dependency parsing techniques to acquire linguistic knowledge. It was designed and tested on data sets provided by the myGrid project 2 [55]. In this paper we present both methods and compare them by applying and evaluating them in the context of both projects.

1 http://wonderweb.semanticweb.org/
2 http://www.mygrid.org.uk/

This paper is structured as follows. We first present some introductory notions about semantic Web service technology, concluding with the important role that Web service domain ontologies play as well as some requirements that they should fulfill (Section 1). Then, we analyze why it is difficult to build such domain ontologies. We do this by describing the process of building domain ontologies in the context of the two research projects that served as case studies for developing and evaluating our framework (Section 2). We conclude Section 2 with an overview of the issues that constrain the development of an ontology learning solution in the Web services context. Then, we present an ontology learning framework that deals with these constraints and the two concrete implementations of this framework in Section 3. Implementation details and some considerations about the usability of the extraction tools are provided in Section 4. In Section 5 we present an overview of existing ontology learning evaluation practices and show how they were adapted for our work. In this section we also detail our experimental results. We finally discuss our major findings and point out future work in Section 6.

1 Semantic Web Services

The advent of Web service technology brought a new dynamic aspect to the Web. The W3C Web Services Architecture Working Group defines a Web service as "a software application identified by an URI, whose interfaces and bindings are capable of being defined, described and discovered as XML artifacts. A web service supports direct interactions with other software agents using XML-based messages exchanged via Internet-based protocols" [22]. Typically a Web service interface is described using the XML based Web Service Description Language (WSDL) and registered in UDDI, a registry that permits Universal Description, Discovery and Integration of Web services. Web service exchange messages are encoded in the XML based SOAP messaging framework and transported over HTTP. By relying on these standards, Web services hide any implementation details, increasing cross-language and cross-platform interoperability. The WSDL language specifies the functionality and message formats of the service only at a syntactic level. While these descriptions can be automatically parsed and invoked by machines, the interpretation of their meaning is left to a human programmer. This lack of semantics limits the use of WSDL descriptions to facilitating invocation of the correct service.

The Semantic Web community addressed this limitation by augmenting the service descriptions with a semantic layer in order to achieve their automatic discovery, composition and execution. A semantic Web service description relies on two major constituents, as shown by the following hotel booking service description.

<DO:BookHotel rdf:ID="WS1">
  <GO:hasInput rdf:resource="DO:Hotel"/>
  <GO:hasInput rdf:resource="DO:ReservationDates"/>
  <GO:hasOutput rdf:resource="DO:HotelReservation"/>
</DO:BookHotel>

First, the semantic description uses the vocabulary defined by a generic Web service description language to specify the main elements of the service (e.g., inputs, outputs). The generic concepts used in our example are preceded by the GO (i.e., generic ontology) namespace prefix. A first major initiative in this direction of establishing a standard generic terminology for Web service description is the development of the OWL-S ontology ([37], [38]). Second, the description template built with these generic terms is filled in with domain specific concepts provided by a Web service domain ontology (see the concepts preceded by the DO (i.e., domain ontology) prefix). A Web service domain ontology typically specifies two types of domain knowledge. On one hand it contains concepts that describe functionalities offered in a domain (e.g., BookHotel, BuyTicket). On the other it specifies domain concepts that often appear as parameters of Web services (e.g., Hotel, Ticket). Web service domain ontologies differ from existing domain specific ontologies used in the Semantic Web by the fact that besides specifying domain concepts (i.e., static knowledge) they also specify functionality types (i.e., procedural knowledge).

Several Web service tasks can be automated by using semantic descriptions. For example, service offers and requests can be matched automatically [44]. This matchmaking is flexible because it allows retrieving services that only partially match a request but are still potentially interesting. For example, the hotel booking service will be considered a match for a request for Accommodation booking services if the used domain ontology specifies that Hotel is a kind of Accommodation. This matchmaking is superior to the keyword search offered by UDDI registries.
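To make the partial match concrete, consider the following minimal Python sketch (our own illustration over a hypothetical toy taxonomy; deployed matchmakers such as [44] use description logic reasoning instead):

# Toy taxonomy (hypothetical): child concept -> parent concept.
PARENT = {"Hotel": "Accommodation", "Accommodation": "Thing"}

def subsumed_by(concept, ancestor):
    # Walk up the hierarchy until the ancestor or the root is reached.
    while concept is not None:
        if concept == ancestor:
            return True
        concept = PARENT.get(concept)
    return False

# The BookHotel service matches a request for Accommodation booking
# services because Hotel is a kind of Accommodation.
print(subsumed_by("Hotel", "Accommodation"))  # True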

A basic requirement for being able to perform complex reasoning with multiple semantic Web service descriptions is that (many of) these descriptions should use concepts of the same (or a small number of) domain ontology. If each Web service description uses a different domain ontology then a mapping between these ontologies has to be performed before any reasoning task can take place. However, ontology mapping itself is a difficult and largely unsolved problem in the Semantic Web. Therefore, a quality domain ontology will reflect a high percentage of the domain's terminology (i.e., offer a high coverage of the domain's terminology) so that many Web services in that domain can be described with its concepts. This requirement makes the building of domain ontologies difficult, as is evident in the next section where we present an analysis of the ontology building process in two concrete research projects.

2 Building Web Service Domain Ontologies

The creation of semantic Web service descriptions is a time consuming and complex task whose automation is desirable, as signaled by many researchers in this field, for example [59]. This task can be broken down into two smaller tasks. First, acquiring a suitable Web service domain ontology is a prerequisite when creating semantic Web service descriptions. This is required to create a shared terminology for the semantic service descriptions. Second, the actual process of annotating Web services with the concepts provided by the domain ontology (i.e., creating their semantic descriptions) has to be performed.

To our knowledge, two research teams concentrate on the Web service annotation task. Hess and Kushmerick [28] employ Naive Bayes and SVM machine learning methods to classify WSDL files (or Web forms) into manually defined task hierarchies. Patil et al. [45] employ graph similarity techniques to select a relevant domain ontology for a given WSDL file from a collection of ontologies. Then they annotate the elements of the WSDL file with the concepts of the selected ontology. Both teams use existing domain ontologies and acknowledge that their work was hampered by the lack of such ontologies. The work presented in this paper is complementary, since we address the acquisition of Web service domain ontologies.

In the rest of this section we describe the ontology building process as it took place in the context of two research projects: WonderWeb and myGrid. These projects offered realistic requirements, data sets and evaluation standards for our work. In both cases we detail (1) the kind of data sources used for ontology building and (2) the resulting manually built domain ontologies. These manually built ontologies serve as Gold Standards when evaluating the automatically learned ontologies. We also (3) highlight the difficulties encountered while building these ontologies.

The benefit of this analysis is twofold. First, these projects reveal some of the major aspects that make Web service ontology building difficult. These aspects point to the need of automating (at least to some extent) the acquisition of domain ontologies. The second benefit of the analysis is an overview of a set of issues that constrain the development of ontology learning methods in the context of Web services. These constraints, discussed in Subsection 2.3, guided us in the design of the ontology learning framework described in Section 3.


2.1 Case Study 1: WonderWeb RDF(S) Storage Tools

Project description. The EU-funded WonderWeb research project aimed to develop an infrastructure for large-scale deployment of ontologies on the Semantic Web. The project's engineering infrastructure was provided by the KAON Application Server [34], a semantic middleware system which facilitates the interoperability of semantic Web tools [54]. Ontologies that describe the functionality of Semantic Web tools and services are core to the architecture of this middleware. As RDF(S) storage and query facilities are essential components of any Semantic Web application, they were the first ones to be integrated with KAON and thus required a domain ontology that would describe this domain. Besides WonderWeb, the ontology for describing RDF(S) storage functionality was also used in the AgentFactory project, which performs configuration of semantically described Web services using agent-based design algorithms ([49], [50]).

Data Sources. While there are many tools offering ontology storage (a major survey [18] reported 14 such tools), only a few are available as Web services (two, according to the same survey). Therefore, it is problematic to build a quality domain ontology by analyzing only the available Web services. However, since Web services are simply exposures of existing software to Web accessibility, there is a large overlap (often a one-to-one correspondence) between the functionality offered by a Web service and that of the underlying implementation. Based on this observation, the domain ontology was manually built by analyzing the APIs of three RDF(S) storage tools (Sesame [5], Jena [39], KAON RDF API [34]).

The data source used during ontology building consisted of the javadoc documentation of all methods offered by these APIs. A javadoc documentation contains a general description of the method's functionality, followed by the description of its parameters, result types and exceptions to be thrown. See for example the javadoc documentation of the add method from the Jena API.

add
    Add all the statements returned by an iterator to this model.
    Parameters:
        iter - An iterator which returns the statements to be added.
    Returns:
        this model
    Throws:
        RDFException - Generic RDF Exception

Manually built ontology. The manually built ontology contains 36 concepts distributed in two main hierarchies. The first hierarchy contains concepts that denote a set of functionalities offered by the analyzed APIs. These concepts are grouped under the Method concept, which is similar in meaning to the OWL-S Profile concept (see a snapshot of the ontology in Fig. 1). This hierarchy contains four main categories of methods for: adding data (AddData), removing data (RemoveData), retrieving data (RetrieveData) and querying (QueryMethod). Naturally, several specializations of these methods exist. For example, depending on the granularity of the added data, methods exist for adding a single RDF statement (AddStatement) or a whole ontology (AddOntology).

Besides the Method hierarchy, the ontology also contains the elements of the RDF Data Model (e.g., Statement, Predicate, ReifiedStatement) and their hierarchy, grouped under the Data concept. The ontology is rich in knowledge useful for several reasoning tasks. For example, the methods are defined by imposing restrictions on the type and cardinality of their parameters, describing their effects and types of special behavior (e.g., idempotent). Note that the hierarchy encoded by this ontology reflects a certain conceptualization and is not unique. Furthermore, the building of this manual ontology was a good indication that API documentations are rich enough to allow building domain ontologies.

Fig. 1: RDF(S) Storage Ontology Snapshot.

Encountered Problems. The major impediment in building a domain ontology for describing RDF(S) storage tools was the choice of data sources from which to build the domain ontology. Once the decision was taken, it took three weeks (for one person) to build the ontology. This time includes the time to read and understand the API documentations as well as the time to identify overlapping functionalities offered by the APIs and to model them in an ontology.

2.2 Case Study 2: myGrid Bioinformatics Services

Project description. myGrid is a UK EPSRC e-Science pilot project building semantic grid middleware to support in silico experiments in biology. The experimental protocol is captured as a workflow, with many steps performed by Web services. Core to the infrastructure is an ontology for describing the functionality of these services and the semantics of the manipulated data. A key role of the ontology is to facilitate user driven discovery of services at the time of workflow construction.

Data Sources. The ontology was initially built manually, using the documentation of 100 services as a source of relevant terms. These services are part of the EMBOSS (European Molecular Biology Open Software Suite) service collection 3 and are further referred to as EMBOSS services. Each service has a detailed description containing (among others) a short description of the service, detailed information about its command line arguments, examples of the input/output file formats and its relation to other services in the collection.

3 http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Apps/


Manually built ontology. The manually built myGrid ontology is much larger and more complex than the RDF(S) ontology. It contains over 550 concepts distributed over a set of distinct subsections covering the domains of molecular biology, bioinformatics, informatics and generic tasks, all under a common upper level structure. Several relations are defined between these concepts and multiple inheritance is often used. However, currently only a part of this ontology (accounting for 23% of its concepts) provides concepts for annotating Web service descriptions in a forms-based annotation tool. The semantic Web service descriptions obtained in this way are used to facilitate service discovery [59]. The rest of the ontology contains concepts from the domain of biology that are too generic for describing the existing services (e.g., organism), or concepts that are used to define orthogonal views on the existing services (see more on this in Section 5.4.3). The myGrid ontology contains only a small number of concepts denoting functionality (23) (see a snapshot of this part of the ontology in Fig. 2). Also, a different modelling principle is used here compared to the RDF(S) ontology. Namely, the functionality concepts simply denote generic actions that can be performed in bioinformatics, without being linked to the involved data structures. A possible explanation for this choice is that in bioinformatics one can perform these operations on a multitude of data structures and thus enumerating all these combinations would be impractical.

Fig. 2: myGrid Ontology Snapshot.

Encountered Problems. Several factors hampered the building of this ontology. First, ontology building in itself is time consuming. The ontology was initially built with two months of effort from an ontology expert with four years' experience in building description logic based biomedical ontologies. A second impediment is the dynamic nature of the field. The exponential rise in the number of bioinformatics Web services over the past year required a further two months of effort to maintain and extend the ontology. However, its content currently lags behind that needed to describe the 1000+ services available to the community. Thirdly, lack of tools hampered the process. At the time of development, tool support for handling separate ontology modules was minimal, hence the existence of a single substantial ontology. A fourth impediment was the lack of guidelines on how to build the domain specific ontology, or indeed how to relate it to upper level ontologies. Since at that time DAML-S (the predecessor of OWL-S) was still under development, the ontology curator devised their own generic Web service description schema based on DAML-S but much simplified to reflect narrower requirements. Lacking guidance from the Web services field, the curator relied on design principles employed in other large open source biomedical ontologies such as openGALEN [47] and the TAMBIS ontology [2].


2.3 Ontology Learning Characteristics

The analysis of the ontology building process presented in the previous two subsections led to two major conclusions. First, ontology building was experienced as a difficult process in both projects. This points to the need for an automated solution to this problem. Second, the Web service context presents a set of constraints that have to be taken into account when designing an automatic solution. In this subsection we detail both conclusions.

1. Ontology building is difficult and should be automated. Both case studies agree on a set of problematic factors that hampered the ontology building activity. First, the ontology curators had to analyze (read, understand and identify common concepts in) a high number of textual documents (over a hundred in both cases) to ensure the quality of their ontologies. The number of analyzed documents is likely to increase as many domains witness a rapid increase in the number of available Web services to several hundreds. A second impediment was the lack of guidelines on what knowledge such ontologies should contain and what design principles they should follow. This resulted in different groups building different ontologies to describe Web services in the same domain, as reportedly happened in bioinformatics [31]. These factors make ontology building a time consuming activity, creating a demand for tools that support ontology curators in extracting ontologies from large and rapidly changing textual data collections.

2. Several constraints have to be taken into account when building an automated ontology learning solution. We conclude that the two ontology building activities differ in several aspects. First, the application domains are different: computer science vs. biology related. Second, different kinds of data sources are used as a basis for ontology building: javadoc descriptions of several tool APIs in case study 1 and detailed service documentations in case study 2. These sources also differ in their grammatical quality, the descriptions used in case study 1 having a lower quality from this perspective. The manually built ontologies are also different. The myGrid ontology is much larger and more complex than the one about the RDF(S) domain. This is not necessarily an advantage, since experience has shown that only a small fraction of the ontology was usable for Web service annotation.

This diversity of domains, existing data sources and requirements for the extracted ontologies raises the challenge of building an ontology learning method which can deal with these variances. Namely, the learning method should be applicable to a wide range of texts from different domains and offer a configurable set of modelling principles for conceptualizing the extracted knowledge. However, there are some important characteristics that both case studies exhibit and which guided us in setting up the ontology learning framework (see Section 3). These relate to the quality of the analyzed data sets and the requirements for the learned Web service domain ontology. We discuss each of them in a separate subsection.


2.3.1 Low grammatical quality

The natural language descriptions associated with Web services are mostly comments written by their developers. Therefore, they have a low grammatical quality. Punctuation is often completely ignored and several spelling mistakes are present. Naturally, services that have a larger user base expose a better documentation, while some less-used services barely contain snippets of abbreviated text.

The evident drawback of this low grammatical quality of the analyzed texts is that they are difficult to process with off-the-shelf NLP tools. These tools were trained on high quality newspaper corpora, but even the best quality documentations often fall below the standard quality of newspaper texts. For example, some rule based part of speech taggers are sensitive to the capitalization of words, considering most capitalized words as nouns. A possible remedy is to use preprocessing, e.g., capitalizing the first word of each sentence and adding some punctuation.
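A minimal sketch of such a preprocessing step (an illustration of our own; the function and its two rules are not part of any cited tool):

def preprocess(doc):
    # Normalize a low-quality service description before POS tagging.
    sentences = []
    for s in doc.split("."):
        s = s.strip()
        if not s:
            continue
        # Capitalize the first word so that rule based taggers do not
        # treat a lower-case sentence-initial verb as a noun.
        sentences.append(s[0].upper() + s[1:])
    # Re-add the terminal punctuation that is often missing.
    return ". ".join(sentences) + "."

print(preprocess("find antigenic sites in proteins"))
# -> Find antigenic sites in proteins.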

There are also two advantages of working with such documentations. First, these texts usually employ simple sentences instead of complicated phrases. This reduced ambiguity allows using deeper linguistic analysis. For example, dependency parsers work better on short sentences than on complex phrases. The second advantage is that these texts use natural language in a specific way. This characteristic makes them amenable to automatic analysis, as discussed in the next subsection.

2.3.2 Sublanguage characteristics

Software documentation in general, and Web service descriptions in particular, employ natural language in a specific way. They belong to what is defined as a sublanguage in [21]. A sublanguage is a specialized form of natural language which is used within a particular domain or subject matter and characterized by a specialized vocabulary, semantic relations and syntax (e.g., weather reports, real estate advertisements). Harris, one of the first researchers to study the use of natural language in restricted domains, introduced the notion of sublanguage word classes, defined as sets of words that are acceptable in the same context within a sublanguage [25]. An intuitive example from the medical domain is that in the context ___ revealed a tumor we might find words such as X-ray, film, scan. These words belong to the MEDICAL_TEST sublanguage word class. There are several constraints on the co-occurrences of word classes in a sublanguage. For example, many valid sentences in the medical sublanguage have the form MEDICAL_TEST revealed DISEASE, while sentences of the form DISEASE revealed MEDICAL_TEST are meaningless in this sublanguage (even if grammatically valid). These constraints are called selectional constraints.

Several word classes and selectional constraints can be determined in the Web service sublanguage. For example, by considering EXT_VB a word class of verbs that indicate an extraction process (e.g., extract, get, retrieve), a frequently occurring pattern which involves this word class and the preposition "from" can be used to easily determine the output and the source of the action.

Selectional Constraint (Pattern):
    EXT_VB OUTPUT from SOURCE.

Examples:
    Extract data from aaindex.
    Extract cds, mrna and translations from feature tables.
    Get data from cutg.
    Retrieve features from a sequence.
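As a rough illustration, such a selectional constraint can be operationalized as a simple pattern matcher. The Python sketch below is our own paraphrase (the membership of the EXT_VB word class is assumed; our actual implementation uses the JAPE rules described in Section 3):

import re

# Instantiate the "EXT_VB OUTPUT from SOURCE" selectional constraint.
EXT_VB = r"(extract|get|retrieve)"  # assumed members of the word class
PATTERN = re.compile(EXT_VB + r"\s+(?P<output>.+?)\s+from\s+(?P<source>.+?)\.",
                     re.IGNORECASE)

for line in ["Extract data from aaindex.", "Retrieve features from a sequence."]:
    m = PATTERN.search(line)
    if m:
        print(m.group("output"), "<-", m.group("source"))
# -> data <- aaindex
# -> features <- a sequence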

Knowledge about word classes and their selectional constraints in a certain sublanguage can support several Natural Language Processing tasks, such as Information Extraction [19]. Using sublanguage analysis techniques also has a direct applicability in Ontology Learning, since word classes often denote semantic classes. Selectional constraints can help to determine the members of a word class given some knowledge about the members of other word classes involved in the restrictions.

One of the major problematic aspects of sublanguage analysis is that determining the interesting word classes and their selectional constraints is a time consuming process. There has been promising research on (partly) automating this process ([20], [51]). However, our intention was, for the first implementation of our framework, to focus on a few but frequently occurring sublanguage features that do not need laborious work to be identified. Such are patterns that do not rely on lexical information but only on syntactic structures. For example, one of the straightforward observations was that, in this sublanguage, almost any verb indicates an action performed by a Web service. So, a word class ACTION would include any identified verbs. Also, noun phrases that appear after an action verb denote a participant in the action, forming the word class ACTION_PARTICIPANT. These word classes can be easily identified by relying only on a minimal linguistic analysis. Any co-occurrence of these word classes identifies a Web service functionality and therefore provides the basic material for our ontology learning algorithm.

2.3.3 Ontologies of procedural knowledge

Ontology learning has to be adapted not only to deal with the characteristics of the input data but also to produce ontologies that are fit for the task of describing Web services. Web service domain ontologies contain both static (i.e., domain entity concepts) and procedural knowledge (i.e., functionalities offered by Web services). Existing ontology learning efforts, to our knowledge, have only focused on deriving static knowledge. One of the contributions of our work is to extend these techniques to the acquisition of procedural knowledge as well.


The ontologies we learn are, in the terminology introduced by Guarino in [23], Application Ontologies, i.e., ontologies that contain both domain concepts and problem solving knowledge useful in a particular application. However, the OWL-S coalition has coined the term Domain Ontology as referring to ontologies that provide any kind of domain knowledge, both static and procedural. In this paper we use the term domain ontology in the sense used by the Web services community.

3 The Web Service Ontology Learning Framework

In the previous section we identified a set of particularities that condition ontology learning when performed in the context of Web services. These particular characteristics require the adaptation of existing ontology learning methods. Our literature study showed that the ontology learning field offers a wide range of different approaches to ontology acquisition. However, while most work is targeted at specific domains, we are not aware of any efforts that analyze software documentation style texts. Several generic ontology learning tools exist, most prominently Text-To-Onto [36], OntoLearn [41] or OntoLT [7], but they are either not available for experimenting or they are workbenches of generic methods that have to be fine-tuned for a certain domain.

In this section we present an ontology learning framework which is tailored to address the particularities of the Web services domain. We first describe the learning framework as a whole (Subsection 3.1), then we detail each of its steps.

3.1 Overview of the Framework

The ontology learning framework consists of several steps, as depicted in Figure 3. We briefly describe these steps and show how the characteristics of the Web services context influenced their design.

Fig. 3. The Ontology Learning Framework.


1. Term Extraction. In the first step we identify words in the corpus that are relevant for ontology building. A word or a set of words identified as useful for ontology building forms a "term". Term extraction is done in two steps. First, in a linguistic analysis phase the corpus is annotated with linguistic information. Then, a set of extraction rules is applied on this linguistic information to identify the potentially interesting terms.

The characteristics of the Web services domain influenced our design choices in several ways. First, to overcome the limitations of the poor grammatical quality of the texts we employed linguistic analysis of different complexity. As is evident from the results of our experiments, more complex analysis led to better results. Then, the small size of the corpus and its sublanguage features facilitated the use of a rule-based solution. Namely, the sublanguage features of the corpora allowed us to easily observe a few heuristics for identifying important information and implement them in our extraction rules.

2. Ontology Building. The identified terms are centralized, analyzed and transformed into corresponding concepts and their hierarchical relations. The ontology building phase derives both static and procedural knowledge in the form of a hierarchy of frequent domain concepts and a hierarchy of Web service functionalities. The strong sublanguage features of the analyzed corpora allow extracting terms that are highly relevant for ontology building. Therefore, it suffices to use simple ontology learning techniques and to adapt them to the requirements of the domain (e.g., extract procedural knowledge).

3. Ontology Pruning. The low grammatical quality of the corpus and its sublanguage characteristics cause a suboptimal functioning of the linguistic tools used. Therefore, some of the derived concepts do not have any domain relevance. The pruning stage filters out these potentially uninteresting concepts.

In the next subsections we detail all three steps of the learning framework.

3.2 Step 1: Term Extraction

The term extraction phase identifies (sets of) words (terms) in the corpus that are relevant for ontology building. This phase can be implemented in different ways.

First, linguistic analysis of different complexity can be used. In this paper we report on two concrete implementations of the framework which use two different kinds of linguistic knowledge. The first implementation, M_POS, discussed in Section 3.2.1, uses basic Part-of-Speech (POS) information. The second implementation, M_DEP, presented in Section 3.2.2, uses deeper dependency parsing techniques.

Second, the different kinds of linguistic information require implementing different extraction patterns: surface patterns in the first case and syntactic patterns in the second. While the technical implementation of these pattern based rules differs (as described in Sections 3.2.1 and 3.2.2), the heuristics behind them remain the same. Independently of the technical implementation, we distinguish two major categories of rules according to the type of information they derive.

Rules for identifying domain concepts rely on the observation that domain concepts correspond to nouns in a corpus. Given the small size of the corpora and the concise style of the Web service documentations, the majority of nouns denote potentially interesting domain concepts. We extract entire noun phrases, where a noun phrase consists of a head noun preceded by an arbitrary (zero or more) number of modifiers (nouns or adjectives).

Rules for identifying functionalities implement the previously described sublanguage characteristics, i.e., that verbs and related nouns are good indicators of Web service functionality.

In the next two subsections we present the technical details of two concrete implementations of the term extraction step.

3.2.1 Method 1: Part-of-Speech Based Extraction

The first implementation of the framework relies on part of speech (POS) tags. We use the Hepple POS tagger [27] to perform the linguistic analysis phase. The tagger assigns each word in the sentence a corresponding POS tag. For example, in the sentence below, the tagger identifies a verb (i.e., find), two nouns (i.e., sites, proteins), an adjective (i.e., antigenic) and a preposition (i.e., in).

Find(VB) antigenic(JJ) sites(NN) in(Prep) proteins(NN).

Following the general steps described by the framework, a set of extraction rules is applied on the derived linguistic information. The extraction patterns which form the left hand side of the rules are implemented as surface patterns which, besides POS tag linguistic information, rely on surface knowledge such as the order of words in the sentence.

1. Identifying domain concepts. We stated above that extraction patterns are written to extract both static (domain concepts) and procedural (service functionalities) knowledge. The surface pattern that extracts noun phrases implements the heuristic observation described above. This rule is specified in JAPE [13], a rich and flexible regular expression based rule mechanism offered by the GATE framework [12].

( (DET)* (ADJ|NOUN|POS)* (NOUN) ):np
-->
:np.NP = {}


The pattern in the left hand side of the rule (i.e., before "-->") identifies noun phrases. Noun phrases are word sequences that start with zero or more determiners (identified by the (DET)* part of the pattern). Determiners can be followed by zero or more adjectives, nouns or possession indicators in any order (identified by the (ADJ|NOUN|POS)* part of the pattern). A noun phrase mandatorily finishes with a noun, called the head noun ((NOUN)). DET, ADJ, NOUN and POS are macros and act as placeholders for other rules identifying terms that are part of these categories. These macros rely on the actual POS tag information. For example, the ADJ macro has the following definition:

Macro: ADJ
(
  {Token.category == JJ, Token.kind == word} |
  {Token.category == JJR, Token.kind == word} |
  {Token.category == JJS, Token.kind == word}
)

The macro contains the disjunction of three patterns. This means that the macro fires if a word satisfies any of these three patterns. Each of these three patterns identifies words which were assigned one of the JJ, JJR and JJS POS tags. POS tags are assigned in the category feature of a Token annotation.

Any word sequence identified by the left hand side of a rule can be referenced in its right hand side. The text snippet identified by (a part of) a pattern is associated with a variable which can then be reused in the right hand side. For example, np identifies all noun phrases. This variable is then used in the right hand side of the rule, which specifies that strings denoted by np should be annotated with the NP annotation.

In the example, our pattern identifies "antigenic sites" 4 (ADJ NOUN) and "proteins" (NOUN) as noun phrases.

2. Identifying functionalities. One surface pattern identifies pairs of verbs and following noun phrases as potential functionality information to be added to the domain ontology. Having identified and annotated noun phrases (NP) and verbs (VB) with two previous rules, the JAPE rule for identifying and annotating functionalities is straightforward.

( {VB} {NP} ):funct
-->
:funct.Functionality = {}

In the example, the pattern identifies "find"_"antigenic site" as a verb phrase denoting a possible functionality in bioinformatics.

4 We use this notation convention to present “terms” extracted from the corpus.
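For readers who prefer pseudocode to JAPE, the following Python sketch (our own paraphrase, not the actual GATE implementation; determiners are omitted for brevity) mirrors the two surface patterns on already POS-tagged input:

# The two M_POS surface patterns over (word, POS tag) pairs.
# Tags follow the Penn Treebank conventions used by the Hepple tagger.
NOUN = {"NN", "NNS", "NNP", "NNPS"}
ADJ = {"JJ", "JJR", "JJS"}

def noun_phrases(tokens):
    # (ADJ|NOUN)* NOUN : collect modifier runs that end in a noun.
    phrases, run = [], []
    for word, tag in tokens + [("", "")]:  # sentinel flushes the last run
        if tag in ADJ or tag in NOUN:
            run.append((word, tag))
        else:
            while run and run[-1][1] not in NOUN:
                run.pop()  # a noun phrase must end with a head noun
            if run:
                phrases.append(" ".join(w for w, _ in run))
            run = []
    return phrases

def functionalities(tokens):
    # VB NP : a verb directly followed by a noun phrase.
    result = []
    for i, (word, tag) in enumerate(tokens[:-1]):
        if tag.startswith("VB") and tokens[i + 1][1] in ADJ | NOUN:
            result.append((word.lower(), noun_phrases(tokens[i + 1:])[0]))
    return result

tagged = [("Find", "VB"), ("antigenic", "JJ"), ("sites", "NNS"),
          ("in", "IN"), ("proteins", "NNS")]
print(noun_phrases(tagged))     # ['antigenic sites', 'proteins']
print(functionalities(tagged))  # [('find', 'antigenic sites')]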


3.2.2 Method 2: Dependency Relation Based Extraction

In a second implementation, M_DEP, we experimented with richer linguistic information than POS tags, i.e., dependency relations. Dependency parsing offers a deeper linguistic analysis than POS tagging and is a commonly used method in computational linguistics. A dependency relation is an asymmetric binary relation between a word called head and a word called modifier.

We use Minipar [30], a state of the art dependency parser with a reported high performance (88% precision and 80% recall). As an example, we list in Table 1 Minipar's analysis for our example sentence. For each word, the following information is provided: (i) its position in the sentence; (ii) its form as it appears in the sentence; (iii) its lemma; (iv) its part of speech; (v) the name of the dependency relation between this word and its head (e.g., obj) and (vi) the position of the head word modified by the current word. In the example, antigenic is an adjective which modifies the noun sites, and sites is the object of the verb find.

Pos.  Word       Lemma      POS   Relation  Head
1     find       find       V     -         -
2     antigenic  antigenic  A     mod       3
3     sites      site       N     obj       1
4     in         in         Prep  mod       3
5     proteins   protein    N     pcomp-n   4

Table 1: An example Minipar output.

The benefit of using richer linguistic information is that the potentially interesting information can be extracted in an easier way. While the same heuristics are used, the extraction patterns must be re-implemented. These patterns are defined on the syntactic relations within the sentences and are therefore called syntactic patterns.

1. Identifying domain concepts. The first category of patterns, those that identify domain concepts, explore the "nn" (noun modifier of a noun) and "mod" (adjective modifier of a noun) dependency relations to detect noun phrases. When such relations are identified, the head noun together with its modifiers is annotated as being a noun phrase. Regular expressions are not enough to encode these more complex patterns (they do not allow variables). We use extra Java code on the right hand side of the JAPE extraction rules to accomplish this.

2. Identifying functionalities. The pattern for functionality identification relies on the "obj" relationship and identifies pairs of verbs and their objects. If the object is the head of a noun phrase then the whole noun phrase is extracted. This pattern relies on the output of the previous NP extraction pattern.
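The two syntactic patterns can be paraphrased as follows (a Python sketch of our own; the actual implementation uses JAPE rules with embedded Java). It operates on rows shaped like the Minipar output of Table 1, i.e., (position, word, lemma, POS, relation, head position) tuples:

rows = [(1, "find", "find", "V", None, None),
        (2, "antigenic", "antigenic", "A", "mod", 3),
        (3, "sites", "site", "N", "obj", 1),
        (4, "in", "in", "Prep", "mod", 3),
        (5, "proteins", "protein", "N", "pcomp-n", 4)]

def noun_phrase(rows, head_pos):
    # Collect the "nn"/"mod" noun and adjective modifiers of the head
    # noun and emit them in sentence order, using lemmas (see Step 2).
    parts = [r for r in rows if r[5] == head_pos
             and r[4] in ("nn", "mod") and r[3] in ("N", "A")]
    parts.append(next(r for r in rows if r[0] == head_pos))
    return " ".join(r[2] for r in sorted(parts))

def functionalities(rows):
    # The "obj" relation links a verb (its head) to an object; if the
    # object heads a noun phrase, the whole noun phrase is extracted.
    pairs = []
    for r in rows:
        if r[4] == "obj":
            verb = next(v for v in rows if v[0] == r[5])
            pairs.append((verb[2], noun_phrase(rows, r[0])))
    return pairs

print(functionalities(rows))  # [('find', 'antigenic site')]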


Position  Word      Lemma    POS  Relation  Head
1         replace   replace  V    -         -
2         or        or       U    lex-mod   1
3         delete    delete   V    lex-dep   1
4         sequence  sequence N    nn        5
5         sections  section  N    obj       1

Table 2: Verb dependency example.

This pattern captures the desired information in the majority of cases, with a few exceptions. One of the exceptions occurs when several verbs in a sentence refer to the same object. For example, the sentence Replace or delete sequence sections suggests that both "replace"_"sequence section" and "delete"_"sequence section" are valid functionalities in this domain that we wish to extract. However, Minipar's output does not directly encode the verb-object relation between delete and section (see Table 2). On the other hand, the analysis denotes that there is a dependency relation between the two verbs of the sentence. Whenever two or more verbs are related by a logical operator they should be bound to a single noun (the object of one of the verbs). One of our extraction patterns identifies cases when several verbs are related via the "lex-dep" or "conj" relations. These relations denote cases when verbs are related via logical operators such as "or", "and" (e.g., Reverse and complement a sequence) or ",". Often the logical dependency between more than two verbs is only partially specified and we have to explicitly derive all dependents based on the transitivity of this relation (e.g., if dependency(v1,v2) and dependency(v2,v3) then dependency(v1,v3)).

Another exception is when several objects are in a conjunctive relation. For example, from Pick pcr primers and hybridization oligos we wish to extract both "pick"_"pcr primer" and "pick"_"hybridization oligos" as functionalities. However, the Minipar output specifies only the first verb-object relation (see Table 3). Nevertheless, knowing that there is a conjunctive relation between primers and oligos we can deduce that oligos also plays an object role with respect to the verb pick. Just as with verbs, we wrote a pattern that identifies conjunctive NPs and deduces the additional knowledge. The patterns that identify dependencies of verbs and objects are applied before the pattern that identifies functionalities.

Position  Word           Lemma          POS  Relation  Head
1         pick           pick           V    -         -
2         pcr            pcr            N    nn        3
3         primers        primer         N    obj       1
5         hybridization  hybridization  N    nn        6
6         oligos         oligos         N    conj      3

Table 3: Noun dependency example.
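The conjunction heuristics admit a similar paraphrase (again an illustrative sketch over Minipar-style rows; a complete implementation would iterate to a fixpoint to obtain the transitive closure for chains of more than two verbs):

def expand_conjunctions(rows, pairs):
    # pairs: (verb position, object position) tuples found by the "obj"
    # pattern; rows follow the same tuple shape as in the previous sketch.
    expanded = set(pairs)
    for r in rows:
        # "lex-dep"/"conj" between verbs: the dependent verb shares the
        # object of its head verb (Replace or delete sequence sections).
        if r[3] == "V" and r[4] in ("lex-dep", "conj"):
            expanded |= {(r[0], obj) for v, obj in pairs if v == r[5]}
        # "conj" between nouns: the conjunct is a further object of the
        # same verb (Pick pcr primers and hybridization oligos).
        if r[3] == "N" and r[4] == "conj":
            expanded |= {(v, r[0]) for v, obj in pairs if obj == r[5]}
    return expanded

# Rows of Table 2: delete (position 3) is lex-dep on replace (position 1),
# and sections (position 5) is the object of replace.
table2 = [(1, "replace", "replace", "V", None, None),
          (2, "or", "or", "U", "lex-mod", 1),
          (3, "delete", "delete", "V", "lex-dep", 1),
          (4, "sequence", "sequence", "N", "nn", 5),
          (5, "sections", "section", "N", "obj", 1)]
print(expand_conjunctions(table2, {(1, 5)}))  # {(1, 5), (3, 5)}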

3.2.3 Related Work on Pattern Based Techniques

The term extraction phase has a major importance in our framework, since the quality of this extraction directly influences the final ontology. We present an overview of related work on using pattern based techniques to derive semantic relations.



Pattern based techniques are widely used in several natural language processing applications. Notably they have been used to derive semantic relations from large corpora. A pioneer in this direction of research was the work of Hearst, which introduced the idea of learning hyponymy relations using lexico-syntactic patterns [26]. Lexico-syntactic patterns are defined on both lexical and basic syntactic information (POS tags). As such, they allow extracting relations after shallow text processing only. For example, the hyponymy relationship suggested by Bruises, wounds, broken bones or other injuries could be extracted using the "NP, NP*, or other NP" pattern [26]. As a follow up of this work, Charniak developed a set of lexico-syntactic patterns that identify meronymy (partOf) relations [3]. In both cases, the identified semantic relations were used to enlarge WordNet.

Naturally, such patterns have a clear relevance for ontology learning. Indeed, Hearst-style patterns are used in the work of Cimiano [10] and in the CAMELEON tool, which incorporates over 150 generic patterns for the French language ([56], [1]). While such generic patterns work well in general corpora, they often fail in small or domain specific corpora. In these cases domain-tailored patterns provide a better performance [1]. Besides using domain tailored patterns, one can enlarge the extraction corpora. For example, World Wide Web data can be used for pattern based learning [8]. In our work and in several other ontology learning approaches, pattern based extraction is just a first step in a more complex process ([32], [15], [7]). In these cases patterns identify potentially interesting terms in the corpus and then the next processing steps derive relevant semantic structures from these terms.

Summarizing Section 3.2, note that the sublanguage nature of the Web service specific corpora allowed us to extract sufficient material for ontology building by using only relatively simple, off-the-shelf natural language processing techniques. There are several advantages of using these simple extraction methods. First, they are fast. Second, they rely on off-the-shelf, thoroughly researched and high-performance techniques (POS tagging, dependency parsing). Finally, the pattern based extraction rules can be adjusted or extended by the users of the system according to the needs of their particular data sets.


3.3 Step 2: Ontology Building

The ontology building step collects the results of the pattern based extraction. Noun phrases are a basis for deriving a data structure hierarchy and the functionality information is used for building a functionality hierarchy. We employ the lemma (i.e., base form) of the extracted terms for ontology building.

Building the data structure hierarchy. We observed that many of the terms mentioned in the analyzed corpora (and especially in the bioinformatics corpus) have a high level of compositionality, in the sense that they incorporate other meaningful terms as proper substrings. Our observation is confirmed by a recent study of the Gene Ontology terms which proved that 63.5% of all terms in this domain are compositional in nature [43]. Another observation, also proved by this study, is that compositionality indicates the existence of a semantic relationship between terms. If a term t1 is obtained by adding a modifier to another term t2 then t1 is more specific than t2. This translates into the ontological subsumption relationship.

The hierarchy building algorithm reflects these observations. If a concept A's lexicalization is a proper substring ending another concept B's lexicalization (e.g., Site in AntigenicSite) then A is more generic than B and the corresponding subsumption relationship is added to the ontology. Also, if the lexicalizations of two concepts B and C end with the same substring, we speculate that this substring represents a valid domain concept (even if it does not appear as a stand-alone term in the corpus) and add it as a parent concept for B and C. As an example, Figure 4 depicts the Site data structure hierarchy. Such compositionality based hierarchy building has also been used in other ontology learning approaches ([7], [58]).

Fig. 4: The Site concept.
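A compact paraphrase of this hierarchy building step (an illustrative Python sketch; the rule that introduces a shared-suffix parent not occurring as a stand-alone term is omitted for brevity):

def build_hierarchy(terms):
    # Compositionality heuristic: a term whose lexicalization is a proper
    # suffix of another term's lexicalization subsumes it.
    parents = {}  # child concept -> parent concept
    words = {t: t.split() for t in terms}
    for child in terms:
        candidates = [p for p in terms if p != child
                      and len(words[p]) < len(words[child])
                      and words[child][-len(words[p]):] == words[p]]
        if candidates:  # choose the most specific (longest) parent
            parents[child] = max(candidates, key=lambda p: len(words[p]))
    return parents

terms = ["site", "antigenic site", "restriction site", "binding site"]
print(build_hierarchy(terms))
# {'antigenic site': 'site', 'restriction site': 'site',
#  'binding site': 'site'}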

Building the functionality hierarchy. There are no clear guidelines in the field of semantic Web services about what functionality hierarchies should look like. The OWL-S/IRS [40]/WSMO 5 style of modelling functionalities includes both the verb of the action and a directly involved data element in the functionality (e.g., BookTicket). This modelling style was followed in case study 1 (see Fig. 1). On the other hand, in the bioinformatics domain ontology developed in case study 2, functionalities are concepts denoting actions (e.g., Aligning) without any connection to the data structures (see Fig. 2). We provide modules that produce functionality hierarchies fulfilling either of these modelling styles, i.e., creating verb-noun phrase based (e.g., Delete_SequenceSection) or only verb based (e.g., Deleting) concepts.
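The two modelling styles then amount to two naming functions over the extracted verb and noun phrase pairs, roughly as follows (a sketch; the gerund rule is naive and a real implementation would need proper morphological generation):

def camel(phrase):
    return "".join(w.capitalize() for w in phrase.split())

def verb_np_concept(verb, np):
    # OWL-S/IRS/WSMO style, as in case study 1.
    return camel(verb) + "_" + camel(np)

def verb_concept(verb):
    # myGrid style, as in case study 2. Naive gerund: drop a final "e"
    # before "ing" (no consonant doubling, so e.g. "get" comes out wrong).
    stem = verb[:-1] if verb.endswith("e") and not verb.endswith("ee") else verb
    return stem.capitalize() + "ing"

print(verb_np_concept("delete", "sequence section"))  # Delete_SequenceSection
print(verb_concept("delete"))                         # Deleting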

5 http://www.wsmo.org/


3.4 Step 3: Ontology Pruning

The first two main steps of the framework, term extraction and ontology building, result in an initial ontology. These steps rely only on our initial heuristics to select the potential concepts. However, even if they capture strong sublanguage characteristics, our heuristics are not perfect and some of the derived concepts are not domain relevant. The pruning module filters out these irrelevant concepts.

Maedche describes two major strategies for performing ontology pruning [33]. First, the baseline pruning strategy is based on the assumption that frequent terms in a corpus are likely to denote domain concepts. Conversely, concepts that are based on low frequency terms should be eliminated from the ontology. The second pruning strategy, relative pruning, is based both on the frequency of the terms in the analyzed corpus and in an independent reference corpus. Only concepts that are frequent in both the analyzed and the reference corpora are maintained in the ontology.

We use a baseline pruning strategy in our current implementations. We consider the average frequency (Freq) of the n learned concepts as a threshold value and prune all concepts that have a lower frequency than this value:

$$ Freq = \frac{\sum_{i=1}^{n} freq(concept_i)}{n} $$

Another heuristic for the pruning is based on the observation that noun phrases included within a functionality annotation by our rules are more likely to denote domain concepts. Therefore, if a low frequency data structure concept's lexicalization is identified within a functionality annotation and the corresponding Functionality concept is not pruned, then the data structure concept is not pruned either.
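Putting the two pruning criteria together (an illustrative sketch with hypothetical frequencies):

def prune(freq, kept_in_functionality):
    # Baseline pruning: keep concepts at or above the average frequency.
    threshold = sum(freq.values()) / len(freq)
    # Exception: keep a rare concept if its lexicalization occurs inside
    # a functionality annotation whose Functionality concept survives.
    return {c for c, f in freq.items()
            if f >= threshold or c in kept_in_functionality}

freq = {"sequence": 40, "model": 12, "antigenic site": 2}
print(prune(freq, kept_in_functionality={"antigenic site"}))
# keeps 'sequence' and 'antigenic site'; 'model' is below the average (18)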

4 Implementation

We provide two concrete implementations 6 of the framework presented above. One of the goals of our implementation was to ensure a high usability of the extraction tool. We achieved this in two ways. First, we aimed for a modular, easy to run and understand implementation that can be easily modified and adapted to new situations. We achieved this by using the intuitive user interface offered by the GATE framework. Second, we used visual techniques to improve the presentation of the learned ontology. In the following, we discuss these two aspects.

Using GATE for implementation. Many of the off the shelf techniques on which our framework relies were readily offered by GATE. For example, the Linguistic Analysis step of the M_POS implementation was entirely performed using processing resources offered by GATE: a tokenizer (ANNIE English Tokenizer), a sentence splitter (ANNIE Sentence Splitter) and the Hepple POS tagger [27] (available as the ANNIE POS Tagger processing resource). In the case of M_DEP we performed the linguistic preprocessing external to GATE. For both approaches, the extraction patterns were implemented using the JAPE regular expression based rule mechanism which is part of GATE. The final two steps are jointly performed by a single module (OntologyBuilding&Pruning) which was implemented as a GATE Processing Resource and is therefore usable from the GATE GUI. The data used by our methods (such as the linguistic information or the structures identified by patterns) is represented as annotations on the analyzed texts. Both patterns and individual modules operate on these annotations and represent their output as annotations.

Fig. 5. The GATE implementation of M_POS.

6 Available at http://www.cs.vu.nl/~marta/experiments/extraction.html.

We greatly benefitted from the support of GATE during implementation. To briefly mention the most important benefits: first, we could reuse several existing libraries for document management and ontology representation. Second, by declaring parts of our method as GATE Processing Resources and using the offered annotation based system as a data transfer mechanism between these parts, we can run and manage our tool via the GATE GUI. It is now possible (1) to build and configure modular applications by visually selecting existing Processing Resources (our own or provided by GATE), (2) to select different corpora or (3) to inspect the annotation based output of each processing module. All these allow easy debugging and make the extraction process transparent to the end users, thus increasing the usability of the tool. Finally, we have used the data storage and evaluation facilities of GATE during the development and fine-tuning stages of our prototype.

Using visualization techniques for ontology presentation. Another approach to enhance the usability of our implementation was the use of visual techniques to present the learned ontology. There is an increasing awareness in the ontology learning community that the results of the extraction methods must be easily understandable by the domain engineer who needs to further refine the extracted ontologies. Therefore, some evidence about the provenance of the learned concepts or their intended meaning should be supplied. Usability was addressed differently by the existing ontology learning tools. The importance of an intuitive user interface was advocated by the developers of ASIUM [15]. In OntoLearn, a natural language description of the automatically learned formal concepts is generated based on WordNet glosses; these explanations support the domain expert in evaluating (and understanding) the learned concepts [41]. In OntoLT, for each derived concept one can inspect all its appearances in the analyzed corpus [7]. Text-to-Onto employs a TouchGraph 7 based ontology visualization technique to depict the extracted concepts and their (taxonomic) relations ([36], [24]). While this visualization allows for easy browsing of the conceptual structure, no explanation is provided of why a certain concept was extracted.

Fig. 6. Visual ontology inspection.

We use the Cluster Map visualization technique, developed by the Dutch software company Aduna 8, to present the extracted ontologies. The Cluster Map technique differs from the structural visualizations predominantly used in Semantic Web tools in that, besides depicting ontology concepts and their taxonomic relations, it visualizes the instances of a number of selected classes from a hierarchy, organized by their classifications [17]. The highly interactive GUI in which the Cluster Map is integrated is used for presenting the extracted ontologies. In Fig. 6 a part of the extracted bioinformatics ontology is shown. The left pane of the GUI presents the concepts and their hierarchical relations. By selecting a concept, all its instances (i.e., all the documents from which it was extracted) are visualized in the right pane. Each small yellow sphere represents an instance (i.e., one document). The concepts are represented as rounded rectangles, stating their names and cardinalities (the cardinality denotes the number of documents from which the concept was extracted). Each instance is visualized only once and balloon-shaped edges connect it to its most specific concept(s). For example, Fig. 6 shows that one document led to the extraction of three concepts (Finding, Maintaining, Removing). This document is placed in between these concepts, thus visualizing them close to each other on the diagram. An attractive property of this visualization is that visual closeness of concepts often denotes semantic closeness.

7 http://www.touchgraph.com/
8 http://aduna.biz

By using this visualization it is easy to access the documents from which a concept was extracted: a mouse click on any cluster results in a list of document instances. A more important benefit is that, by visually analyzing the extracted concepts and their relations to the underlying corpus, one can derive relations between them that are not explicitly stated in the corpus and therefore were not extracted by the learning method. For example, in Fig. 6 two larger groups of interconnected functionalities emerge. Each group represents functionalities that are often offered simultaneously by Web services. On closer inspection we observe that the first group contains functionalities that search or modify content, while in the second group we find functionalities concerned with input/output operations such as Reading or Writing. The domain expert can easily access (with a simple mouse click) and inspect the documents that interrelate these concepts and decide whether to introduce new abstract concepts (e.g., ContentServices and InputOutputServices). For more examples of the use of information visualization in the context of ontology learning we refer the interested reader to [53].

5 Evaluation

In previous work we verified the performance of the first implementation of the framework (M_POS) on the data sets provided by case study 1 [52] and we evaluated the second implementation (M_DEP) in the context of case study 2 [55]. The goal of the experiments reported in this paper is to compare the performance of the two extraction methods by applying and evaluating them on data drawn from both case studies 9. The ontology building algorithm was adjusted to follow the modelling principle employed by each Gold Standard ontology (i.e., producing compound functionality concepts for case study 1 and simple action verb based concepts for case study 2). In order to gain insight into the efficiency of our pruning heuristics we evaluated both a pruned (i.e., completing all processing steps in Fig. 3) and an unpruned (i.e., excluding the last processing step from Fig. 3) version of the extracted ontologies.

Since ontology learning evaluation is a non-trivial task, we start this section by giving an overview of some evaluation practices used by the community (Subsection 5.1). We use a subset of these practices and describe them in detail in Subsection 5.2. A description of the experimental corpora (Subsection 5.3), our experimental results (Subsection 5.4) and an attempted comparison with other ontology learning tools (Subsection 5.5) form the rest of this section.

9 All experimental data (corpora, extracted and gold standard ontologies) can be downloaded from http://www.cs.vu.nl/~marta/experiments/extraction.html.


5.1 Ontology Learning Evaluation Practices

Evaluation of ontology learning is a very important but largely unsolved issue, as reported by papers in a recent workshop [6]. Two evaluation stages are typically performed when evaluating an ontology learning method. First, term level evaluation assesses the performance of extracting terms relevant for ontology learning from the corpus. Naturally, the quality of term extraction has a direct influence on the quality of the built ontology. This evaluation can easily be performed by using the well-established recall/precision metrics. Second, an ontology quality evaluation stage assesses the quality of the learned ontology. Two different ontology evaluation approaches were identified by Maedche [33], depending on what is considered a "quality" ontology.

In an application specific ontology evaluation the quality of an ontology is directly proportional to the performance of an application that uses it. Several papers report on successfully using ontologies in various tasks such as text clustering and classification ([29], [4]) or information extraction [16]. However, initial considerations on task-based ontology evaluation are only reported in [46]. Two problematic issues surface for such evaluations. First, it is often difficult to assess the quality of the supported task or the performance of the application (e.g., search). Second, an experimental environment needs to be created in which no factor other than the ontology influences the performance of the application.

In a Gold Standard based ontology evaluation the quality of the ontology is expressed by its similarity to a manually built Gold Standard ontology. In some cases the authors use a Gold Standard ontology which was extracted from corpora other than those used by the learning method ([48]). Other authors use Gold Standard ontologies extracted strictly from the automatically analyzed corpora ([9], [11]). One of the difficulties encountered by this approach is that comparing two ontologies is non-trivial. According to [35], one of the few works on measuring the similarity between ontologies, one can compare ontologies at two different levels: lexical and conceptual. Lexical comparison assesses the similarity between the lexicons (sets of labels denoting concepts) of the two ontologies. At the conceptual level the taxonomic structures and the relations in the ontologies are compared.

The Gold Standard evaluation approach assumes that the Gold Standard ontology contains all the concepts extractable from a certain corpus, and only those. In reality, though, Gold Standards omit many potential concepts in the corpus and introduce concepts from other sources (such as the domain knowledge of the expert). The evaluation results are influenced by these imperfections of the Gold Standard. To compensate for these errors, a concept-per-concept evaluation by a domain expert can be performed. Such an evaluation is presented in [42]. Expert evaluation can also be performed when a Gold Standard is not available and its construction is too costly just for the sake of the experiment.


5.2 Chosen Evaluation Criteria

We employ a combination of these evaluation strategies to assess and compare the quality of the implemented learning methods. We first assess the performance of the term extraction algorithm (marked 1 in Fig. 7). To evaluate ontology quality, we first rely on the domain experts' concept per concept evaluation (2). The domain experts in both case studies are the curators of the corresponding Gold Standard ontologies. Then, we compare the extracted ontologies to the Gold Standard ontologies provided by each case study (3). In what follows, we present the methodology and metrics for performing each type of evaluation.

Fig. 7. Chosen Evaluation Strategies.

1. Term Extraction. This evaluation stage is only concerned with the performance of the term extraction modules. To measure the performance of these modules we manually identified all the relevant terms to be extracted from the corpus. Misspelled terms are not considered for extraction. Then, using the Benchmark Evaluation Tool offered by GATE, we compared this set of terms with the ones that were identified through pattern based extraction. We use Term Recall (TRecall) to quantify the ratio of (manually classified) relevant terms that are extracted from the analyzed corpus (correct_extracted) over all terms to be extracted from the corpus (all_corpus). Term Precision (TPrecision) denotes the ratio of correctly extracted terms over all extracted terms (all_extracted). We also compute the Fmeasure of the extraction by assigning an equal importance to both precision and recall.

TRecall = \frac{correct_{extracted}}{all_{corpus}}; \quad TPrecision = \frac{correct_{extracted}}{all_{extracted}}

Fmeasure = \frac{2 \cdot TPrecision \cdot TRecall}{TPrecision + TRecall}
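As an illustration, these metrics reduce to simple set arithmetic once the gold and extracted term sets are known; the sketch below uses made-up terms and assumes exact string matching.

def term_metrics(extracted, gold):
    correct = len(extracted & gold)            # correct_extracted
    t_recall = correct / len(gold)             # gold = all terms in the corpus
    t_precision = correct / len(extracted)     # extracted = all extracted terms
    f_measure = 2 * t_precision * t_recall / (t_precision + t_recall)
    return t_recall, t_precision, f_measure

gold = {"antigenic site", "sequence", "alignment", "restriction site"}
extracted = {"antigenic site", "sequence", "alignment", "java"}
print(term_metrics(extracted, gold))  # (0.75, 0.75, 0.75)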

2. Expert Evaluation. In this evaluation stage the domain expert performs a concept per concept evaluation of the learned ontology. Ontology precision (OPrecision) represents the percentage of domain relevant concepts in the extracted ontology. We observed that manually built Gold Standards often omit several concepts from the corpus or introduce concepts that are not named in the corpus. Therefore, for this evaluation step, a domain expert distinguishes between two categories of relevant concepts: existent in the Gold Standard (correct) and omitted by the Gold Standard (new). Irrelevant concepts are marked spurious.

OPrecision = \frac{correct + new}{correct + new + spurious}

We also evaluate the quality of the taxonomic relations. For this we count the number of taxonomic relations established between domain relevant (i.e., correct and new) concepts (allRelsRelevant). Then an expert assesses how many of these taxonomic relations indeed express an isA relation (allRelsCorrect). The TaxoPrecision metric is the ratio of correctly identified isA relations over all automatically discovered taxonomic relations between domain relevant concepts.

TaxoPrecision = \frac{allRelsCorrect}{allRelsRelevant}
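Computed over the expert's verdicts, these two metrics look as follows; the counts in the example are taken from the unpruned M_POS results of case study 1 (Tables 5 and 7), while the function itself is only an illustration.

def expert_metrics(correct, new, spurious, rels_relevant, rels_correct):
    o_precision = (correct + new) / (correct + new + spurious)
    taxo_precision = rels_correct / rels_relevant
    return o_precision, taxo_precision

print(expert_metrics(correct=29, new=70, spurious=165,
                     rels_relevant=18, rels_correct=18))
# (0.375, 1.0), i.e. OPrecision = 38% and TaxoPrecision = 100%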

The comments of the experts are useful side-results of the expert evaluation. This qualitative evaluation provided valuable ideas for further improvements.

3. Ontology Comparison. In the final evaluation stage we compare each extracted ontology to the manually built Gold Standard ontology in the corresponding domain. For the lexical comparison, our first metric denotes the shared concepts between the manual and extracted ontology. This metric was originally defined in [35] as the relative number of hits (RelHit), then renamed in [11] to Lexical Overlap (LO). Let LO1 be the set of all domain relevant extracted concepts (correct and new) and LO2 the set of all concepts of the Gold Standard. The Lexical Overlap is the ratio between the number of concepts shared by both ontologies (i.e., the intersection of these two sets) and the number of all Gold Standard concepts (noted all). Intuitively, this metric is equivalent to recall while the previously defined OPrecision represents precision. If two or more correctly extracted concepts are equivalent to a single concept in the Gold Standard (e.g., AddModel and LoadOntology are equivalent to AddOntology) then only one of them is counted. Therefore LO1 ∩ LO2 contains only individual concepts (noted correct_i).

LO(O_1, O_2) = \frac{|LO_1 \cap LO_2|}{|LO_2|} = \frac{correct}{all}

The extracted ontology can often bring important additions to the manual ontology by highlighting concepts that were ignored during its creation. We are not aware of any previously defined metric for measuring these additions. Therefore, we define Ontological Improvement (OI) as the ratio between all domain relevant extracted concepts that are not in the Gold Standard (noted new) and all the concepts of the Gold Standard.

OI(O_1, O_2) = \frac{|LO_1 \setminus LO_2|}{|LO_2|} = \frac{new}{all}
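Under the simplifying assumption that concept labels are normalized well enough for plain set operations to stand in for the expert's equivalence judgments, LO and OI can be sketched as follows (the concept names are invented for illustration):

def lo_oi(extracted_relevant, gold):
    lo = len(extracted_relevant & gold) / len(gold)   # recall-like overlap
    oi = len(extracted_relevant - gold) / len(gold)   # relevant additions
    return lo, oi

gold = {"AddOntology", "RemoveStatement", "Query"}
extracted_relevant = {"AddOntology", "Query", "Serialize", "LoadOntology"}
# Caveat: a real run would first fold synonyms such as LoadOntology into
# AddOntology (correct_i); plain set intersection ignores such equivalences.
print(lo_oi(extracted_relevant, gold))  # (0.666..., 0.666...)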

For comparing the taxonomic structures of the Gold Standard and the extracted ontology we employ a similar strategy as during the Expert Evaluation stage. We first count the number of taxonomic relations that were identified between two Gold Standard concepts (allRelsRelevant_GS). Then we count the number of relations that are qualified as isA relations by the Gold Standard (allRelsCorrect_GS). TaxoPrecision_GS is the ratio of these two numbers.

TaxoPrecision_{GS} = \frac{allRelsCorrect_{GS}}{allRelsRelevant_{GS}}

This taxonomic comparison is simpler than the cotopy based comparison introduced by Maedche [33]. However, it is feasible here because the compared hierarchies are not too deep and the overlap between them is quite small.

5.3 Experimental Corpora

The experimental corpora were provided by the two research projects.

Case study 1: RDF(S) tools. The first corpus, C_RDFS, contains 112 documents extracted from the documentation of the tools used to build the manual ontology (51 documents from Jena's API, 37 from the KAON RDF API and 24 from Sesame's API). Each document in the corpus contains the javadoc description of one method. Previous work showed that the short textual descriptions of these methods contain the most information, while other javadoc elements, such as the method syntax and the description of the parameters, introduce a lot of noise, severely diminishing the performance of the extraction [52]. Therefore, we only use these short descriptions in these experiments. We also exclude the syntax of the method because it introduces irrelevant technical terms such as java, com, org.

Case Study 2: Bioinformatics services. The corpus for this domain (C_BIO) consisted of 158 individual bioinformatics service descriptions as available at the EMBOSS web site. We worked only on the short method descriptions since they are representative of Web service descriptions in general, being similar to the descriptions found in online Web service repositories such as XMethods 10. The detailed descriptions of the EMBOSS services follow a specific layout which makes extraction much easier. However, using it would have biased our extraction methods towards this particular kind of documentation.

10 http://www.xmethods.net

5.4 Results

In this section we present an evaluation and comparison of the two implementations of the framework. We ran both implementations on both corpora and used the evaluation criteria described in Section 5.2 to evaluate them.

5.4.1 Term Extraction

The results of the first evaluation stage (see Table 4) indicate a better performance of M_DEP in both domains, the F-measure being higher for M_DEP than for M_POS. In each corpus M_DEP resulted in higher recall than M_POS even if the precision slightly decreased. However, in the context of ontology learning in general, and for Web service domain ontology learning in particular, recall is often more important than precision: the domain experts prefer deleting a few concepts rather than missing some important ones.

The errors are mostly due to mistakes in the output of the linguistic analysis tools. These tools perform worse on these specialized corpora than on the newspaper style texts used for their initial training. For example, verbs at the beginning of a sentence are often mistaken for nouns, thus causing a lower recall. It is likely that these performance values will remain in this range unless we train the linguistic analysis tools for this specific sublanguage. Note also that the performance of dependency parsing is sensitive to the length and complexity of the analyzed texts. Fortunately, the majority of sentences in our corpora are simple and allow a correct analysis. This partially explains the better performance of M_DEP.

A second source of errors is spelling and punctuation mistakes. The RDF(S) corpus C_RDFS has, from this perspective, a lower quality than the bioinformatics corpus C_BIO and, indeed, this affects the TPrecision of the extracted set. This leads to the conclusion that textual sources from Web service catalogs are preferable to low quality code documentation.

5.4.2 Expert Evaluation

The results of the expert evaluation for the extracted ontologies are shown in Table 5 (for the RDF(S) domain) and Table 6 (for the bioinformatics domain). For both data sets the second method resulted in a slightly decreased ontology precision as a direct consequence of a lower term extraction precision (more incorrectly extracted terms lead to more incorrect concepts).

28

Page 29: Learning Domain Ontologies for Semantic Web Service Descriptions

                        C_RDFS                C_BIO
                   M_POS    M_DEP        M_POS    M_DEP
  correct_extr       471      549          319      384
  all_corpus         631      631          446      446
  all_extr           658      774          393      480
  TRecall            75%      87%          72%      86%
  TPrecision         72%      70%          81%      80%
  F-measure           73       78           76       83

Table 4. Term extraction results for both case studies.

The pruning mechanism increased the ontology precision in both domains and for both methods, leading to precisions between 57% and 74%. This means that more than half of the concepts of all pruned ontologies are relevant for the analyzed domain.

                  Not Pruned             Pruned
               M_POS    M_DEP       M_POS    M_DEP
  correct(i)   29(23)   35(27)      24(20)   29(21)
  new            70       77          45       65
  spurious      165      211          46       71
  OPrecision    38%      35%         60%      57%
  LO            46%      54%         40%      42%
  OI           140%     154%         90%     130%

Table 5. Evaluation results for the RDF(S) domain, case study 1, in the expert evaluation and ontology comparison phases.

The taxonomic evaluation results (see Tables 7 and 8) show that both methods identified a similar number of taxonomic relations per corpus (18/17 for C_RDFS and 78/73 for C_BIO). Naturally, C_BIO resulted in more taxonomic relations given the high level of compositionality of bioinformatics concepts. For both corpora and both extraction methods all extracted taxonomic relations were correct (TaxoPrecision = 100%). This indicates that the hierarchy building algorithm we used, even if simple, performs well. Note that besides correctly discovering the taxonomic relations existing in the Gold Standards, many new taxonomic relations were discovered as well (between Gold Standard and Gold Standard concepts, Gold Standard and new concepts, or new and new concepts).

The effect of the pruning mechanism on the taxonomic structures differs between the two corpora.

29

Page 30: Learning Domain Ontologies for Semantic Web Service Descriptions

                  Not Pruned             Pruned
               M_POS    M_DEP       M_POS    M_DEP
  correct        25       27          12       18
  new           140      157          64       80
  spurious       98      105          26       39
  OPrecision    63%      63%         74%      72%
  LO            20%      22%         10%      14%
  OI           112%     126%         51%      64%

Table 6. Evaluation results for the Bioinformatics domain, case study 2, in the expert evaluation and ontology comparison phases.

Namely, in C_BIO more than half of the correct taxonomic relations disappear after pruning (unlike C_RDFS, where the effect of pruning is not so radical). One of the major reasons for this behavior is that, in bioinformatics, due to the compositionality of terms, deep data structure concept hierarchies are created in which the frequency of the concepts decreases as they become more specialized. These low frequency specialized concepts are often pruned even if important, and many taxonomic relations are deleted with them. The pruning threshold should be decreased when advancing deeper into the hierarchy. Also, since the ontology precision was already high without pruning, we might have adopted a lower value for the overall pruning threshold.

Qualitative Evaluation. Besides our quantitative results, we collected useful comments from the domain experts who performed the evaluation.

Recall vs. Precision. It seems that the cleanness of the ontology is not of major importance for the ontology engineer. Often even concepts that are not included in the final ontology are useful, as they provide insight into the domain itself and guide further abstraction activities. We should therefore concentrate on increasing the recall of the term extraction process, even at the expense of its precision.

Synonymy. During the evaluation, the expert recognized several potential synonym sets such as: {find, search, identify, extract, locate, report, scan}, {fetch, retrieve, return}, {pick, select}, {produce, calculate} or {reverse, invert}. Synonymy information is an important piece of knowledge for semantic Web services. Especially search and matchmaking algorithms would benefit from knowing which concepts are equivalent. The ontology engineer can decide to include synonymy in his ontology in different ways. For example, he can maintain all these concepts and describe their synonymy via an explicit mapping (e.g., owl:equivalentClass). Alternatively, he can choose to maintain one single concept per synonym set and link all lexical synonyms to this concept.

30

Page 31: Learning Domain Ontologies for Semantic Web Service Descriptions

Abstractions. The experts often redistributed the extracted domain concepts according to their domain view. For example, two subclasses identified for Protein belong to different domains, molecular biology and bioinformatics, and have to be placed in the corresponding hierarchies accordingly. Such abstractions still need to be created manually according to the ontology engineer's view of the domain. However, the abstraction step is considerably supported if the expert has an overview of the relevant domain concepts.

Support. The curators considered the extracted ontologies a useful start for deriving a domain ontology. Several complex structures could be directly included in a final ontology (e.g., the Site hierarchy in Fig. 4) or provided helpful hints on how certain concepts interrelate. The most appreciated contribution was that the learned ontologies even suggested new additions for the manually built ontologies.

5.4.3 Ontology Comparison

The unpruned RDF(S) ontology extracted with M_DEP contains more than half of the concepts existing in the manually built ontology, as well as many new potential concepts (see Table 5). These values are lower for the unpruned ontology derived with M_POS. Lexical overlap was computed based on the individual correctly extracted concepts (correct_i), shown in brackets in Table 5. The behavior of the pruning mechanism was satisfactory. While pruning almost doubled the ontology precision (from 38% to 60% for M_POS and from 35% to 57% for M_DEP), it only slightly affected the lexical overlap. Ontological improvement was more affected (halved) because many of the newly identified concepts possibly have a low domain relevance. Therefore our pruning distinguishes between important domain concepts and less important ones.

                          Not Pruned             Pruned
                       M_POS    M_DEP       M_POS    M_DEP
  allRelsRelevant        18       17          12       14
  allRelsCorrect         18       17          12       14
  TaxoPrecision         100%     100%        100%     100%
  allRelsRelevant_GS      4        3           2        2
  allRelsCorrect_GS       4        3           2        2
  TaxoPrecision_GS      100%     100%        100%     100%

Table 7. Taxonomy evaluation results for the RDF(S) domain, case study 1.

The comparison with the bioinformatics Gold Standard (the ontology currently used in Web service descriptions) shows the same trend but registers less success than the RDF(S) case study (see Table 6). The unpruned ontologies cover only 20%-22% of the manual ontology for both methods, even if they suggest many new possible concepts.

31

Page 32: Learning Domain Ontologies for Semantic Web Service Descriptions

                          Not Pruned             Pruned
                       M_POS    M_DEP       M_POS    M_DEP
  allRelsRelevant        78       73          27       30
  allRelsCorrect         78       73          27       30
  TaxoPrecision         100%     100%        100%     100%
  allRelsRelevant_GS     10       11           5        6
  allRelsCorrect_GS      10       11           5        6
  TaxoPrecision_GS      100%     100%        100%     100%

Table 8. Taxonomy evaluation results for the Bioinformatics domain, case study 2.

Pruning behavior is less satisfactory in this case: it halves both the lexical overlap and the ontological improvement while yielding only a small increase in ontology precision. This behavior is also explained by the fact that low frequency specialized terms are pruned even if they are important (see Section 5.4.2).

We tried to understand why the lexical overlap is so low. We concluded that the major cause of the ontological losses was that the curator also included concepts about the fields of biology, bioinformatics and informatics that are not present in the corpus. For this he relied on his expert knowledge and other ontologies in the domain (see Section 2). For example, the ontology contains the ten level deep organism-primate hierarchy as well as a hierarchy of measurement units. Further, the curator relied on a set of compound concepts to define different "views" on services. For example, a set of concepts define different kinds of services based on their compositionality or the types of their inputs and outputs. These view-defining concepts represent 18% of all concepts in the Gold Standard. Our algorithm is not able to learn such views; however, it is feasible to extend it in this direction.

We also investigated the causes of the high ontological improvement. Our results suggest that the ontology curator worked rather independently of the given corpus while building the Gold Standard, as he missed many concepts named in the corpus. Post-experimental interviewing revealed that the examination of the corpus was not meticulous. He used "just in time ontology development": concepts were added if needed for describing a certain service. Note also that he worked on a subset of our analyzed corpus (100 descriptions instead of the 158 analyzed by us). Name changes could also account for the mismatch. The curator expanded abbreviations or created a preferred term for several synonyms (e.g., Retrieving for fetch, get, return). In fact, he acknowledged that the automatic approach leads to a more faithful reflection of the terms the community uses to describe their services.

32

Page 33: Learning Domain Ontologies for Semantic Web Service Descriptions

5.5 Comparison with Other Ontology Learning Tools

While the primary goal of this research was to adapt existing techniques to the Web services context rather than to develop new ones, we still attempted to compare our tool with other existing tools. This task was hampered by several factors. First, few ontology learning tools are publicly available for download and experimentation; we know only of TextToOnto and OntoLT. However, these tools are essentially ontology learning workbenches that provide a set of generic techniques and allow their customization to the user's needs. For example, both tools offer a way to encode domain specific extraction patterns. Therefore, after tuning these tools to execute our patterns (which are the elements that make our tool tailored for Web services), we expect to obtain similar results. The differences would be caused by the performance of the underlying language processing tools (e.g., all tools use different POS taggers).

To compensate for not being able to perform a comparison with an existing tool, we searched for performance results that could give an insight into how our tool compares to others. This is again difficult because many ontology learning papers do not report any evaluation results or use metrics that are incompatible with ours. We found that [11] (and other papers related to this work) reported on a formal concept analysis based ontology learning algorithm where the lexical overlap reached a maximum value of 27.71%. We are aware that this result is not enough to give an overview of state of the art performance in ontology learning. In fact, the ontology learning community has only just begun an effort towards establishing evaluation benchmarks and metrics to be adopted by the whole community, with the purpose of making the different efforts comparable.

6 Discussion

The work presented in this article is motivated by the observation that, while domain ontologies are of major importance for semantic Web services, their acquisition is a time-consuming task which is made more difficult by the increasing number of Web services. The ultimate goal of our research is to build tool support for (semi-)automatic ontology learning in the context of Web services. In this article, we presented the first stage of this research, which pioneered ontology learning for Web services. Ontology learning in the context of Web services raises several non-trivial questions while inheriting some unsolved issues from ontology learning in general. The aim of the work was to better understand the problem at hand and to investigate which technologies might be feasible to use. We addressed a set of fundamental issues such as: data source selection, choosing the appropriate learning algorithms, deciding on evaluation methodologies and metrics, and considering usability. The contribution of our work lies in identifying and tackling these issues as well as offering (partial) solutions to them. We discuss our findings in this section and point out future work that can be based on them.

Selecting Data Sources. Traditional ontology learning predominantly focuses on learning ontologies that describe a set of textual documents; in this case the data sources for ontology learning are those textual sources. However, there are several possible data sources that could be used for learning Web service ontologies. First, resources connected to the underlying implementation might provide useful knowledge about the functionality of the Web service, since Web services are simply web accessible software products. Examples are the source code, its textual documentation or existing UML diagrams. Second, one could use Web service specific data sources such as associated WSDL files or activity logs.

During our work, we observed that Web services are almost always accompanied by a short textual description of their functionality which helps a user to quickly understand what the service does. Such descriptions exist in online Web service repositories such as XMethods and also in javadoc style code documentation. Besides being the most widely available sources, short textual descriptions of Web services (1) are characterized by a low grammatical quality and (2) use a specific sublanguage that makes them amenable to automatic processing. In our work we considered only these textual documentations. Current experiments, not reported here, show that WSDL documents also contain valuable information about the services they describe, often more detailed than the short textual descriptions we have considered. As future work, we will combine these two sources.

Choosing Learning Techniques. The goal of our work was to adapt existing ontology learning techniques to this specific domain rather than to develop novel ones. The choice of these techniques depends on the kind of data sources considered. For example, UML diagrams would require semantic mapping techniques [14] which are essentially different from the natural language processing techniques used for textual sources. We designed a framework for learning ontologies from textual Web service descriptions and implemented two methods within this framework that use natural language processing techniques of different complexity. During the design and evaluation of this framework we derived the following observations:

Simple techniques work fine in well-defined contexts. Although our methods are based on relatively simple, off the shelf term extraction and ontology building techniques, the learned ontologies have a good quality (as we argue in the evaluation part of this discussion). One explanation of this phenomenon is that we are considering a well-defined ontology learning task and work on specialized texts with strong sublanguage characteristics. Our context differs from efforts to design generic ontology learning methods which have to run on any kind of textual sources and build only generic ontological structures. Such generic ontology learning methods are therefore harder to build and rely on more complex techniques. We believe that since Semantic Web technology is used in a variety of specialized domains, tools that allow easy adaptation of basic ontology learning methods will have an increased practical importance. We consider this work a first step towards context directed ontology learning.

Deeper linguistic analysis seems to increase performance. The dependency relationship based method performs better than the POS tag based method. First, it increases the recall of the term extraction from the corpus with little impact on the term extraction precision. Second, while the extracted ontologies have a lower precision, this is compensated by higher values for ontological overlap and improvement (and experts consider precision less important than domain coverage). Another argument for the use of dependency parsing is that the richer dependency information makes it much easier to write and establish new syntactic patterns than surface ones.

It should be possible to build a domain independent tool. The sublanguage features on which our methods build can be identified in Web service descriptions written for various domains. Therefore, we believe that it is feasible to build an ontology learning tool that is tailored to the context of Web service descriptions but applicable across different application domains. A good indication is that both our methods perform similarly in two different domains. However, corpus particularities can influence the extraction performance. For example, punctuation and spelling mistakes lead to a low term extraction precision and, consequently, to a less precise ontology.

We envision several improvements to the basic framework presented here. First, we wish to extend the method with more fine-grained extraction patterns to complement the current high coverage patterns. There are a considerable number of sublanguage patterns that were not used in this iteration but could provide rich input for the ontology building. We also wish to exploit pronoun resolution and prepositional phrases. Machine learning techniques could help in discovering some new patterns. It is interesting to investigate whether these fine-grained lexical patterns would still make our framework applicable across different domains. Second, we want to enhance the ontology building step. Using synonymy information during the conceptualisation step is an important development possibility. Further, we would like to concentrate on strategies that enrich the basic extracted ontology, for example, defining different views on a set of domain concepts or providing a set of patterns that act on the already extracted semantic information.

Evaluation. To achieve the goal of this first stage of research, that of understanding the applicability of ontology learning techniques in the context of Web services, our evaluation was directed towards gaining insight into the performance of the learning methods. A possible extension of our evaluation would be to test the robustness of the methods, i.e., to see how their performance is affected when they are applied to incrementally enlarged data sets in the same domain. One of the major future tasks is to perform a task-based evaluation of the extracted ontologies in a Web service context, e.g., by powering Web service tasks such as search and matchmaking. This would complement the current evaluation and indicate the appropriateness of the learned ontologies for Web service tasks. However, we believe that the current evaluation is sufficient to encourage the continuation of this line of work.

One of our major observations is that the evaluation of ontology quality is difficult because the Gold Standards do not faithfully represent the knowledge in the corpus: the domain experts omit several concepts because it is not feasible to read and analyze all available documents in a reasonable time frame. Complementarily, our methods extract ontologies that contain a high percentage of the domain relevant concepts in a corpus. The amount (and domain relevance) of extracted concepts can be influenced by tuning the pruning algorithm.

The quality of the ontologies, even if extracted using simple methods, was encouraging. We state this based on the fact that similar work on open domain ontology learning reports a maximum lexical overlap of 27.71%, while we reach 54% in some cases. Further, during the qualitative evaluation, the experts indicated that the extracted ontologies represent the knowledge in the corpus more faithfully and that they provide a useful start for building a domain ontology. Indeed, providing ontology curators with ontologies that contain half of the extractable concepts is a considerable help for this time consuming task.

Usability. An important lesson from ontology learning research is that there are no generic ("one-size-fits-all") methods and therefore any learning method will need some degree of adaptation to a new domain, or even to new data sets within the same domain. It is important that the user of the ontology learning method can understand (1) each step of the extraction (to eventually adapt it to his needs) and (2) the learned ontology. To fulfill the first requirement we used the GATE framework to create a modular, easy to adapt implementation which gives insight into the working of each extraction module. The second objective was reached by presenting the extracted ontology using visual techniques. Future work will need to evaluate the efficiency of our usability measures through user-based case studies.

Acknowledgments. This work was carried out in the context of WonderWeb, an EU Information Society Technologies (IST) funded project (EU IST 2001-33052). We thank the GATE group for their support in the development of the prototype, and G. Mishne, F. van Harmelen and S. Schlobach for their comments on earlier versions of this paper. Finally, we are grateful to the three anonymous reviewers who provided insightful comments for the improvement of this article.

References

[1] N. Aussenac-Gilles. Supervised text analyses for ontology and terminology engineering. In Proceedings of the Dagstuhl Seminar on Machine Learning for the Semantic Web, February 2005.

[2] P.G. Baker, C.A. Goble, S. Bechhofer, N.W. Paton, R. Stevens, and A. Brass. An Ontology for Bioinformatics Applications. Bioinformatics, 15(6):510–520, 1999.

[3] M. Berland and E. Charniak. Finding parts in very large corpora. In Proceedings of the 37th Annual Meeting of the ACL, 1999.

[4] S. Bloehdorn and A. Hotho. Text Classification by Boosting Weak Learners based on Terms and Concepts. In Proceedings of the Fourth IEEE International Conference on Data Mining, pages 331–334. IEEE Computer Society Press, 2004.

[5] J. Broekstra, A. Kampman, and F. van Harmelen. Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema. In I. Horrocks and J. A. Hendler, editors, Proceedings of the First International Semantic Web Conference, LNCS, 2002.

[6] P. Buitelaar, S. Handschuh, and B. Magnini. ECAI Workshop on Ontology Learning and Population: Towards Evaluation of Text-based Methods in the Semantic Web and Knowledge Discovery Life Cycle. Valencia, Spain, August 2004.

[7] P. Buitelaar, D. Olejnik, and M. Sintek. A Protege Plug-In for Ontology Extraction from Text Based on Linguistic Analysis. In Proceedings of the 1st European Semantic Web Symposium (ESWS), May 2004.

[8] P. Cimiano, S. Handschuh, and S. Staab. Towards the self-annotating web. In Proceedings of the 13th World Wide Web Conference, May 2004.

[9] P. Cimiano, A. Hotho, and S. Staab. Clustering concept hierarchies from text. In Proceedings of LREC, 2004.

[10] P. Cimiano, A. Pivk, L. Schmidt-Thieme, and S. Staab. Learning Taxonomic Relations from Heterogeneous Evidence. In Proceedings of the ECAI 2004 Workshop on Ontology Learning and Evaluation, 2004.

[11] P. Cimiano, S. Staab, and J. Tane. Automatic Acquisition of Taxonomies from Text: FCA meets NLP. In Proceedings of the ECML/PKDD Workshop on Adaptive Text Extraction and Mining, Cavtat–Dubrovnik, Croatia, 2003.

[12] H. Cunningham, D. Maynard, K. Bontcheva, and V. Tablan. GATE: A framework and graphical development environment for robust NLP tools and applications. In Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics, 2002.

[13] H. Cunningham, D. Maynard, and V. Tablan. JAPE: a Java Annotation Patterns Engine (Second Edition). Research Memorandum CS-00-10, Department of Computer Science, University of Sheffield, November 2000.

[14] K. Falkovych, M. Sabou, and H. Stuckenschmidt. UML for the Semantic Web: Transformation-Based Approaches. In Knowledge Transformation for the Semantic Web, IOS Press, Amsterdam, 2003.

[15] D. Faure and C. Nedellec. ASIUM: learning subcategorization frames and restrictions of selection. In Yves Kodratoff, editor, Proceedings of the Workshop on Text Mining, 10th European Conference on Machine Learning (ECML 98), 1998.

[16] D. Faure and T. Poibeau. First experiments of using semantic knowledge learned by ASIUM for information extraction task using INTEX. In Proceedings of the ECAI 2000 Ontology Learning Workshop, 2000.

[17] C. Fluit, M. Sabou, and F. van Harmelen. Supporting User Tasks through Visualisation of Light-weight Ontologies. In Staab and Studer [57], pages 415–434.

[18] A. Gomez Perez. A survey on ontology tools. OntoWeb Deliverable 1.3, 2002.

[19] R. Grishman. Adaptive Information Extraction and Sublanguage Analysis. In IJCAI-2001 Workshop on Adaptive Text Extraction and Mining, 2001.

[20] R. Grishman, L. Hirschman, and N. T. Nhan. Discovery Procedures for Sublanguage Selectional Patterns: Initial Experiments. Computational Linguistics, 12(3):205–216, 1986.

[21] R. Grishman and R. Kittredge, editors. Analyzing Language in Restricted Domains: Sublanguage Description and Processing. Lawrence Erlbaum Assoc., 1986.

[22] W3C Web Services Architecture Working Group. Web services architecture requirements. W3C Working Draft, November 2002.

[23] N. Guarino. Semantic Matching: Formal Ontological Distinctions for Information Organization, Extraction, and Integration. In Information Extraction: A Multidisciplinary Approach to an Emerging Information Technology, volume 1299 of LNCS, pages 139–170. Springer, 1997.

[24] P. Haase, Y. Sure, and D. Vrandecic. Ontology Management and Evolution: Survey, Methods and Prototypes. SEKT Deliverable D3.1.1, December 2004.

[25] Z. Harris. Mathematical Structures of Language. Wiley Interscience, New York, 1968.

[26] M.A. Hearst. Automatic Acquisition of Hyponyms in Large Text Corpora. In Proceedings of the Fourteenth International Conference on Computational Linguistics, 1992.

[27] M. Hepple. Independence and commitment: Assumptions for rapid training and execution of rule-based POS taggers. In Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics (ACL-2000), Hong Kong, October 2000.

[28] A. Hess and N. Kushmerick. Machine Learning for Annotating Semantic Web Services. In AAAI Spring Symposium on Semantic Web Services, March 2004.

[29] A. Hotho, S. Staab, and G. Stumme. Wordnet improves text document clustering. In Proceedings of the Semantic Web Workshop at SIGIR-2003, 26th Annual International ACM SIGIR Conference, Toronto, Canada, July 28–August 1, 2003.

[30] D. Lin. Dependency-based Evaluation of MINIPAR. In Workshop on the Evaluation of Parsing Systems, First International Conference on Language Resources and Evaluation, Granada, Spain, May 1998.

[31] P. Lord, S. Bechhofer, M.D. Wilkinson, G. Schiltz, D. Gessler, D. Hull, C. Goble, and L. Stein. Applying Semantic Web Services to bioinformatics: Experiences gained, lessons learnt. In International Semantic Web Conference, 2004.

[32] M.-L. Reinberger, P. Spyns, A.J. Pretorius, and W. Daelemans. Automatic initiation of an ontology. In On the Move to Meaningful Internet Systems 2004: CoopIS, DOA, and ODBASE (part I), 2004.

[33] A. Maedche. Ontology Learning for the Semantic Web. Kluwer Academic Publishers, 2002.

[34] A. Maedche, B. Motik, and L. Stojanovic. Managing Multiple and Distributed Ontologies in the Semantic Web. VLDB Journal, 12(4):286–302, 2003.

[35] A. Maedche and S. Staab. Measuring similarity between ontologies. In Proceedings of EKAW. Springer, 2002.

[36] A. Maedche and S. Staab. Ontology Learning. In Staab and Studer [57], pages 173–190.

[37] D. Martin, M. Burstein, G. Denker, J. Hobbs, L. Kagal, O. Lassila, D. McDermott, S. McIlraith, M. Paolucci, B. Parsia, T. Payne, M. Sabou, E. Sirin, M. Solanki, N. Srinivasan, and K. Sycara. OWL-S 1.0 White Paper. http://www.daml.org/services/owl-s/1.0/, December 2003.

[38] D. Martin, M. Paolucci, S. McIlraith, M. Burstein, D. McDermott, D. McGuinness, B. Parsia, T. Payne, M. Sabou, M. Solanki, N. Srinivasan, and K. Sycara. Bringing Semantics to Web Services: The OWL-S Approach. In Proceedings of the First International Workshop on Semantic Web Services and Web Process Composition (SWSWPC 2004), San Diego, California, USA, July 2004.

[39] B. McBride. Jena: A Semantic Web Toolkit. IEEE Internet Computing, 6(6):55–59, 2002.

[40] E. Motta, J. Domingue, L. Cabral, and M. Gaspari. IRS-II: A Framework and Infrastructure for Semantic Web Services. In Proceedings of the Second International Semantic Web Conference, LNCS. Springer-Verlag, 2003.

[41] R. Navigli and P. Velardi. Learning Domain Ontologies from Document Warehouses and Dedicated Websites. Computational Linguistics, 30(2), 2004.

[42] R. Navigli, P. Velardi, A. Cucchiarelli, and F. Neri. Quantitative and Qualitative Evaluation of the OntoLearn Ontology Learning System. In ECAI Workshop on Ontology Learning and Population, Valencia, Spain, August 2004.

[43] P.V. Ogren, K.B. Cohen, G.K. Acquaah-Mensah, J. Eberlein, and L. Hunter. The Compositional Structure of Gene Ontology Terms. In Proceedings of the Pacific Symposium on Biocomputing, 2004.

[44] M. Paolucci, T. Kawamura, T. R. Payne, and K. Sycara. Semantic Matching of Web Services Capabilities. In Proceedings of the First International Semantic Web Conference, 2002.

[45] A. Patil, S. Oundhakar, A. Sheth, and K. Verma. METEOR-S Web service Annotation Framework. In Proceedings of the 13th World Wide Web Conference, May 2004.

[46] R. Porzel and R. Malaka. A Task-based Approach for Ontology Evaluation. In ECAI Workshop on Ontology Learning and Population, Valencia, Spain, 2004.

[47] A.L. Rector and J.E. Rogers. Ontological issues in using a description logic to represent medical concepts: Experience from GALEN. IMIA Working Group 6 Workshop, 1999.

[48] M.L. Reinberger and P. Spyns. Discovering Knowledge in Texts for the Learning of DOGMA-Inspired Ontologies. In ECAI Workshop on Ontology Learning and Population, Valencia, Spain, August 2004.

[49] D. Richards and M. Sabou. Semantic Markup for Semantic Web Tools: A DAML-S description of an RDF-Store. In Proceedings of the Second International Semantic Web Conference, LNCS, pages 274–289, Florida, USA, 2003. Springer.

[50] D. Richards, S. Splunter, F. M.T. Brazier, and M. Sabou. Composing Web Services using an Agent Factory. In Workshop on Web Services and Agent-Based Engineering, AAMAS03, Melbourne, Australia, July 14/15, 2003.

[51] E. Riloff. Automatically Generating Extraction Patterns from Untagged Text. In Proceedings of the 13th National Conference on Artificial Intelligence (AAAI), pages 1044–1049, 1996.

[52] M. Sabou. From Software APIs to Web Service Ontologies: a Semi-Automatic Extraction Method. In Proceedings of the Third International Semantic Web Conference, Hiroshima, Japan, November 2004.

[53] M. Sabou. Using Information Visualisation to support Ontology Learning. In Proceedings of the 9th International Information Visualisation Conference, London, UK, July 2005.

[54] M. Sabou, D. Oberle, and D. Richards. Enhancing Application Servers with Semantics. In Proceedings of the AWESOS Workshop, Australia, April 2004.

[55] M. Sabou, C. Wroe, C. Goble, and G. Mishne. Learning Domain Ontologies for Web Service Descriptions: an Experiment in Bioinformatics. In Proceedings of the 14th International World Wide Web Conference, 2005.

[56] P. Seguela and N. Aussenac-Gilles. Extraction de relations semantiques entre termes et enrichissement de modeles du domaine. In Actes de la Conference IC'99 - Plate-forme AFIA, 1999.

[57] S. Staab and R. Studer, editors. Handbook on Ontologies. International Handbooks on Information Systems. Springer-Verlag, 2004.

[58] P. Velardi, M. Missikoff, and P. Fabriani. Using Text Processing Techniques to Automatically enrich a Domain Ontology. In Proceedings of the International Conference on Formal Ontology in Information Systems. ACM Press, 2001.

[59] C. Wroe, C. Goble, M. Greenwood, P. Lord, S. Miles, J. Papay, T. Payne, and L. Moreau. Automating Experiments Using Semantic Data on a Bioinformatics Grid. IEEE Intelligent Systems, 19(1):48–55, 2004.
