FLEXIBLE TAXONOMIES AND ONTOLOGY EXTRACTION FROM TEXT IN THE CONTEXT OF A HELPDESK APPLICATION

Vivek Nair w1110831

5/3/2011

Supervisor: Dr. Epaminondas Kapetanios

This report is submitted in partial fulfilment of the requirements for the BSc (Hons) Computer Science Degree at the University of Westminster


Abstract

The final year project will consist of a system that incorporates the concepts of a flexible taxonomy and ontology extraction from text. In essence, the system crawls through text data such as problem reports and attempts to classify it and extract taxonomies, most likely in a tree-like structure, so that the appropriate expert can get a quick overview of the thematic areas of the problems reported, discussed or resolved. The intention of the system is to manage the flow of communication between problem reporters and experts as effectively as possible, so that the right experts can be allocated to the right type of problem as quickly as possible.


Table of Contents

1 Introduction
1.1 Motivation
1.2 Aims and Objectives
2 Background
2.1 Helpdesk
2.2 Ontology and Taxonomy
2.3 Software
2.4 Hardware
2.5 Summary
3 Requirements Specification
3.1 Functional Requirements
4 Design
4.1 System Name and Logo Design
4.2 System Architecture
4.3 Database Design
4.4 Web Application Design
4.5 Front End Design
4.6 Web Application Structure
4.7 Summary
5 Implementation
5.1 Front End
5.2 Web Application
5.3 Database Implementation
5.4 System Operation
5.5 Summary
6 Testing
6.1 Black Box Testing
7 Evaluation
7.1 System Aims Evaluation
7.2 System Requirements Evaluation
8 Conclusion
8.1 Review of Aims
8.2 Revisions to the design & implementation
8.3 Further Work
8.4 Summary
Acknowledgments
References
Appendices
Appendix A: NLP Test
Appendix B: Database Entities Schema
Appendix C: Front End Mock-up
Appendix D: Server Methods
Appendix E: Page Methods
Appendix F: Front End Methods
Appendix G: Testing Ontology
Appendix H: Files Directory
Appendix I: Project Proposal


List of Figures

Figure 1: General Helpdesk Process
Figure 2: Summary Table of the types of support
Figure 3: Web Services Summary
Figure 4: Summary of Rich Text Editors [12]
Figure 5: Comparison of Potential Languages
Figure 6: Summary of Potential Databases [13]
Figure 7: Sample JSON document
Figure 8: Sample relational database model
Figure 9: Waterfall Model Adapted, source: [26]
Figure 10: Spiral Model, source: [23]
Figure 11: Agile Development, source: [25]
Figure 12: Logo for System
Figure 13: System Architecture
Figure 14: Pipes and Filter process of Amekhania
Figure 15: High level entity design
Figure 16: High level entity design
Figure 17: Process of Managing Communication
Figure 18: Process Diagram for Ontology Extraction
Figure 19: Process of Ontology Extraction
Figure 20: Mock-up of taxonomy integrated into search facility
Figure 21: Process of directing call to category
Figure 22: Mock-ups of front end for user
Figure 23: Web Application Structure
Figure 24: default web page being the login screen
Figure 25: Validation Table
Figure 26: Example of Database Entity
Figure 27: Sequence diagram
Figure 28: Login/Registration Test Cases
Figure 29: Login/Registration Validation Testing
Figure 30: Expert Test Cases
Figure 31: User Test Cases
Figure 32: Front End Test Cases


1 Introduction

An important factor for any organization in the information age is the ability to support itself technically. This is where technical support or helpdesk departments have a role: providing help and assisting in the resolution of technical issues. Helpdesks are mainly contacted via email or via web applications that allow the user and the support staff to communicate in order to solve the problem.

1.1 Motivation

The motivation for this project derives from my working experience in a commercial environment. As part of my industrial placement year I was employed as an application support engineer, where part of the role involved supporting the helpdesk in resolving issues with in-house developed software. In practice, this meant that on a daily basis I was involved in fixing bugs and implementing patches. The helpdesk application I used was an in-house ASP.Net web application. Under the general helpdesk process, an employee with a “technical” problem (the “problem reporter”) would e-mail a mailbox; the web application would then pick the e-mail up and display its contents on a web page. A member of the helpdesk would check the web page for this email, or “call” [2], and log it into its correct category. Although quite a menial task, it was necessary so that the team could work on calls specific to their skill sets. I feel that this is where my project could add value, because it demonstrates a practical application of semantic web technology.

From a technical point of view, semantic web technology, which is the main concept underlying my project, is part of the next phase of the web, the “Web 3.0” era, so investigating these new concepts and trying to implement them is very exciting. The hope is that someday my application can provide a blueprint showing how communication can be managed effectively through the use of semantic web technology.


1.2 Aims and Objectives

1.2.1 Aims

Provide a platform in which users can communicate a technical problem such that it can be addressed by the appropriate expert(s) or department.

A user may come across a problem with a specific application they are using. Many technical people may be available, but to solve the problem it needs to be directed to the most appropriate expert. This can require several rounds of conversation between different members of the technical department, which is ultimately counter-productive because that time could be spent on other valuable tasks. Automatically directing the problem to the person with the most relevant experience, so that they can fix the bug or problem, is a much more effective method.

Provide functionality such that problem reports are automatically logged and directed to the appropriate expert with the use of ontology extraction.

For a problem report to be directed to the appropriate expert, it needs to be analyzed for keywords and concepts using natural language processing tools. The application can then attempt to match the skills of an expert or group of experts to the problem report so that they can deal with the problem.

Provide a dynamic view of the problematic areas in the form of a tree structure.

Problems will be categorised and grouped. If a problem has been directed successfully to the appropriate expert, the report can be categorised using its keywords together with information about the user and the expert it was directed to. An overall diagrammatic representation of the thematic areas of problems can then be produced.
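The second aim, matching the keywords extracted from a problem report against the skills of the available experts, can be illustrated with a minimal sketch. This is not the matching method used by the system; the `Expert` class, the `rank_experts` function, the sample experts and the simple overlap count are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expert:
    name: str
    skills: frozenset  # lower-cased skill keywords (hypothetical data)


def rank_experts(keywords, experts):
    """Return (expert name, overlap count) pairs, best match first.

    Experts who share no skill with the report keywords are left out.
    """
    wanted = {k.lower() for k in keywords}
    scored = [(e.name, len(wanted & e.skills)) for e in experts]
    matches = [pair for pair in scored if pair[1] > 0]
    # Sort by descending overlap, then by name so ties are deterministic.
    return sorted(matches, key=lambda pair: (-pair[1], pair[0]))


experts = [
    Expert("Alice", frozenset({"asp.net", "c#", "sql"})),
    Expert("Bob", frozenset({"networking", "vpn"})),
    Expert("Carol", frozenset({"asp.net", "javascript"})),
]

# Keywords as they might come back from an extraction service.
ranking = rank_experts(["ASP.Net", "SQL", "Exception"], experts)
print(ranking)
```

A count of shared keywords is the simplest possible score; a real system would probably weight keywords by relevance or fall back to a default category when no expert matches.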

1.2.2 Objectives

In order to meet the aims, below are the objectives of the project:

1. Produce a helpdesk web application through which the user and the expert can communicate about the problem.

2. Make use of existing ontology extraction technologies (possibly Alchemy API or Open Calais), or create a bespoke extraction technology, to help direct the problem to an appropriate expert.

3. Make the web application searchable, so users can search for similar or past problems.

4. Provide a summary page which gives an overview of the problematic areas.

5. Provide a statistical summary of the performance of the expert dealing with a problem.


2 Background

This chapter discusses the topic of ontology and taxonomy extraction, and evaluates existing natural language processing tools that could potentially be used for the project. It will also discuss software and hardware technologies that are suitable for the proposed system.

2.1 Helpdesk

2.1.1 What is helpdesk?

The “Helpdesk” [1] is a department that provides information and helps to resolve issues; it is the first port of call when an employee has a technical problem. The general process, illustrated in figure 1, is:

1. The user reports the problem via phone or email, describing the problem they are facing.

2. A member of the helpdesk attempts to identify the problem by reading the report and extracting information about it, then decides how to categorize it.

3. Once the problem has been identified and logged to the appropriate department or person, they analyze the problem.

4. Once the problem has been analyzed, they attempt to resolve it and communicate back to the user that it has been solved.

Figure 1: General Helpdesk Process (flow diagram: 1) User reports problem → 2) Identify and log the problem → 3) Analyze the problem → 4) Try to resolve and communicate back to the user)


In the report I will use the following terms:

The “user” is the entity that reports the problem.

The “expert” is the one who solves the problem.

The “helpdesk call” or “call” is the email and/or problem report.

2.1.2 Helpdesk Categories

Helpdesk calls fall into different categories depending on what has been reported as a problem. In this project I define three typical categories of support: desktop support, application support, and infrastructure support.

| Support | Responsibilities |
|---|---|
| Desktop | Provide support for desktop applications. Resolve hardware/software issues with laptops/handheld devices. |
| Application | Provides support for in-house applications. |
| Infrastructure | Provide support for infrastructure related issues |

Figure 2: Summary Table of the types of support

Figure 2 defines the responsibilities of each department/category; it is on the basis of these definitions that a judgment can be made about which category a helpdesk call belongs to.


2.1.3 Existing Helpdesk Applications

There are many helpdesk applications, which provide all the necessary functions such as query

management and communication tools, but none apply ontology extraction or build taxonomies

automatically to classify the data.

2.2 Ontology and Taxonomy

2.2.1.1 What is ontology?

According to Gruber (1993), an ontology in computer science and software engineering is a specification of the vocabulary used in a particular dialogue. In this project the language of focus will be English, and the topic will be technology.

2.2.1.2 Why use ontology?

The ability to make computers extract knowledge from information such as text supports the sharing of information between people. It enables text documents to be classified into useful forms, so that a representation of the topic areas being discussed can be viewed.

2.2.1.3 What is a flexible taxonomy?

A taxonomy, in simple terms, is a structure of classification that is usually equated to ontologies [6]. It is typically used to identify documents, and it is first realized from a set of pre-designed groups/categories; from these pre-designed groups, a dynamic taxonomy can be derived through the use of ontology extraction.

In summary, it is a framework that allows for the dynamic organisation of information.

2.2.1.4 What is the purpose of the taxonomy?

In the case of helpdesk an overall representation of the problem areas can be useful to gain a

heuristic view/approach to dealing with certain problems.

2.2.1.5 Representing the taxonomies

After extracting ontologies from the text, the extracted information will have meta-data that is

attached to it; therefore in order to represent the structure/categories of all the documents, the

web application will be required to produce a collation of the meta-data applied to all the

documents and then represent the taxonomies in the form of a tree structure. In turn, it will

represent the documents that are available to a particular category, thus providing a heuristic view

of the problematic areas.

To represent this in the web application a tree view will be used, as this approach shows the

information in a hierarchical format.
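As a rough illustration of collating the metadata attached to each document into a tree, the sketch below groups call identifiers by metadata type and then by metadata value. The field names (`Id`, `MetaData`, `Type`, `Value`) follow the sample JSON document given later in this chapter; the function names and the sample calls are hypothetical.

```python
from collections import defaultdict


def build_taxonomy(calls):
    """Group call ids under metadata type -> metadata value."""
    tree = defaultdict(lambda: defaultdict(list))
    for call in calls:
        for tag in call.get("MetaData", []):
            tree[tag["Type"]][tag["Value"]].append(call["Id"])
    # Convert the nested defaultdicts into plain dicts for the caller.
    return {kind: dict(values) for kind, values in tree.items()}


def render(tree):
    """Render the taxonomy as an indented, text-only tree view."""
    lines = []
    for kind in sorted(tree):
        lines.append(kind)
        for value in sorted(tree[kind]):
            ids = tree[kind][value]
            lines.append(f"  {value} [{len(ids)}]")
            for call_id in ids:
                lines.append(f"    call {call_id}")
    return "\n".join(lines)


# Hypothetical calls with metadata already applied by the extraction step.
calls = [
    {"Id": 1, "MetaData": [{"Type": "Technology", "Value": "ASP.Net"}]},
    {"Id": 2, "MetaData": [{"Type": "Technology", "Value": "ASP.Net"},
                           {"Type": "Company", "Value": "Microsoft"}]},
    {"Id": 3, "MetaData": [{"Type": "Technology", "Value": "MySQL"}]},
]

taxonomy = build_taxonomy(calls)
print(render(taxonomy))
```

Because the tree is rebuilt from whatever metadata the documents carry, new categories appear without any change to a predefined schema, which is the sense in which the taxonomy is flexible.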


2.2.2 Existing Natural Language processing tools

This section describes the tools found that enable the extraction of knowledge from text; these are free services that provide toolkits enabling software to interact with them, as shown in figure 3. The reason for using existing tools rather than creating a whole ontology from scratch is that reusing existing technologies is the most sensible approach considering the size of the project.

2.2.2.1 Information retrieval tool comparisons

| Service | Number of Transactions per day | Language Support | Methods supported | Text formats Accepted | Response Formats | Typical Time to process | Cost |
|---|---|---|---|---|---|---|---|
| OpenCalais [3] | 50,000 | English, French and Spanish | SOAP, REST | "TEXT/XML", "TEXT/HTML", "TEXT/HTML/RAW", "TEXT/RAW" | RDF, JSON, and Microformats | 1 second | Free |
| Alchemy API [4] | 30,000 | Notably English and 70 other languages | REST, HTTP Compression | Web pages, posted HTML or text content, and scanned document images. | RDF, JSON, XML and Microformats | 1 second | Free |
| GATE IR [18] | Unlimited | English, other languages | Java API | Plain Text, HTML, SGML, XML, RTF, Email, PDF (some documents), Microsoft Word | XML | Unknown | Free |

Figure 3: Web Services Summary

The general process of extracting information from the text via the API library is that the developer is supplied an API key, which authenticates the use of the service. Text is then submitted to the service, and the developer uses the API library functionalities to extract what has been processed by the service.
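The general process (authenticate with an API key, submit the text, read back what the service extracted) can be sketched as follows. This is only an illustration of the flow: the endpoint URL, the `X-Api-Key` header name and the shape of the JSON response are placeholders invented for the example, not the actual OpenCalais or Alchemy API interfaces. The request is built but deliberately not sent.

```python
import json
import urllib.request

# Placeholder values: NOT the real OpenCalais or Alchemy API details.
SERVICE_URL = "https://extraction.example.com/tag"
API_KEY = "your-api-key"


def build_request(text, api_key=API_KEY, url=SERVICE_URL):
    """Prepare (but do not send) a POST carrying the text and the API key."""
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "X-Api-Key": api_key,          # hypothetical header name
            "Content-Type": "text/plain",
            "Accept": "application/json",
        },
        method="POST",
    )


def parse_entities(response_body):
    """Pull (type, name) pairs out of an assumed JSON response shape."""
    payload = json.loads(response_body)
    return [(e["type"], e["name"]) for e in payload.get("entities", [])]


request = build_request("This web application keeps on throwing an exception")
# A real client would now call urllib.request.urlopen(request).read().
# Here a canned response stands in for the service's reply.
sample_response = '{"entities": [{"type": "Technology", "name": "ASP.Net"}]}'
entities = parse_entities(sample_response)
print(request.get_method(), request.full_url, entities)
```

In practice the API library's wrapper classes mentioned in the report would hide both steps, so application code would only see the parsed entities.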

To test the relevance of the output of the information retrieval services, the contents of a forum question (appendix A.1.) were submitted to Alchemy API and Open Calais.

After the text was submitted to Open Calais (appendix A.2.), the output contained a refined number of tags and entities, and what was extracted gives a sufficiently relevant indication of what the question is about. The results from submitting the text to Alchemy API are as accurate; however, it produces much more irrelevant data, as shown in appendix A.3.

Having also evaluated the GATE IR documentation [18], my assessment of the work needed to process the text documents is that using it would add a layer of complication to the application on top of communicating with the service, as the response format is XML. To read the data from GATE, the XML would need to be parsed, whereas Alchemy API and Open Calais both have an API library with wrapper classes that parse the respective response formats.

In summary, OpenCalais will be chosen to process the helpdesk calls because it is a free and accurate service. It is also powered by Thomson Reuters, so I believe it to be a more reliable service: an organization of that size will have access to greater infrastructure resources, and more documentation is available.

2.3 Software

This section describes the programming languages, frameworks and methodologies that will be used to develop the web application, which also uses a database for persistent data.

2.3.1 Front-End

The system will have to be browser compliant, so it will be delivered as HTML and will use CSS to apply a consistent style throughout the web application. It will also make use of AJAX [10], which allows the web application to carry out server-side processing and retrieve data from the server asynchronously; this effectively means that the user does not have to reload the page in order to see updates.

2.3.1.1 Rich Text Editor

For communication to occur between the user and the expert, they will need a way of entering text into the system; a rich text editor will be used so that the relevant user can input the information. Below is a summary of two potential rich text editors that could be used:


| Editor | Benefits | Cost |
|---|---|---|
| TinyMCE | Lightweight, faster load times, open source and aesthetically pleasing, has customizable themes | Free |
| NicEdit | Minimalistic, cross platform | Free |

Figure 4: Summary of Rich Text Editors [12]

The reason for using an RTE rather than a plain form field is so that the user can apply formatting to the text and, in particular, copy and paste error messages into the editor.

2.3.2 Application

In this section I compare the potential technologies I could use to build the web application.

The figure below compares the potential languages:

| Language/Framework | IDE | Advantages | Disadvantages | Database Compatibility |
|---|---|---|---|---|
| ASP.Net [11] | Visual Studio | Provides support for Ajax; well supported; performance is optimized down to server level; provides OOP; use of web controls | Microsoft platform required for deployment | Any |
| PHP [7] | Netbeans, Eclipse | Can run on any platform | Slower than ASP.Net, more of a scripting language | Any |

Figure 5: Comparison of Potential Languages


The application itself will be written in C# and will use the ASP.Net framework. The reason for this is that I want to expand my knowledge of constructing web applications with these technologies, and also build on my work experience. From a technical point of view, building the web application in ASP.Net also complements the use of RavenDB, as shown in the next section.

2.3.2.1 Deployment

The deployed project will consist of a database and the web application, so a server running ASP.NET 3.5 or higher should prove sufficient.


2.3.3 Database

As the proposed application deals with text-orientated documents, three candidate databases were considered: one relational database, MySQL, and two document-orientated databases, MongoDB and RavenDB.

| Database | Features | Query Language | Database Model |
|---|---|---|---|
| MySql | Runs cross platform; robust transactional support | SQL | Relational |
| MongoDB | Scalable; auto-sharding; full text searching; data modelling flexibility | Javascript; API Calls; Json | NoSQL, Document Orientated |
| RavenDB | Runs natively on Windows; supports partial updates; comes with fully functional .NET API; full text searching with use of Lucene.NET; sharding; indexing; data modelling flexibility | API Calls; REST; HTTP; Json | NoSQL, Document Orientated, Schema Less |

Figure 6: Summary of Potential Databases [13]

Figure 6 compares the different types of database considered. This is ultimately a text-based application, so a document-orientated database, i.e. MongoDB or RavenDB, is more suitable because of its performance when text searching. It also makes applying metadata convenient because of the structure in which documents are stored within document-orientated databases, both of which store documents in JSON format.


Having decided that a document-orientated database suits the nature of the project, the remaining decision is which one: RavenDB or MongoDB. The two databases are similar, but because RavenDB has a .Net API, as shown in the figure above, RavenDB will be the database selected, as this complements the set of technologies chosen in the previous sections.

JSON is a lightweight data-interchange format [15]; in essence it is a human-readable text document, as displayed below. The example is a model of a helpdesk call together with the metadata retrieved from the information retrieval service. Storing data in this format means that it can be used not only by this application but also by any third-party software, for example a CRM system that needs to integrate the data.

{
  "Id": 2321816,
  "Subject": "Web application",
  "Report": "this web application keeps on throwing an exception",
  "MetaData": [
    {
      "Type": "Technology",
      "Value": "ASP.Net"
    }
  ],
  "Responses": [
    {
      "Response1": "This has now been fixed",
      "UserID": "14584"
    }
  ]
}

Figure 7: Sample JSON document


By comparison, a MySQL database would, from a conceptual point of view, require more tables. The diagram below illustrates the JSON document shown in Figure 7 in the form of a relational database:

This is a reasonably simple problem, but in a relational database, whether MySQL or any other, it would require a minimum of three tables and, as a consequence, more queries against the database. As a result, performance may suffer because of the SQL joins needed to retrieve the metadata for a particular helpdesk call.

MySQL text searching performs well on relatively small databases, but once a table grows to around 10 GB or more, performance declines significantly [20]. There is the option of using Lucene, a fast text search engine [21] that indexes documents, in conjunction with the MySQL database. However, incorporating it would add another layer of processing and complexity to the application, which is not beneficial. MySQL is therefore not suitable, which leads to the use of a document-orientated database.

As for which document-orientated database, I have chosen RavenDB because it is written for .NET and supports LINQ [19], a .NET component that allows native querying. This has no effect on the performance of the database because queries are carried out via the application. Document-orientated databases are also very flexible in terms of data structure, because there are very few limits on how the data can be structured.

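[Figure 8 content]
HelpdeskReports (id, subject, report)
Metadata (id, reportID, metadatavalue, type)
MetaDataType (id, typeName)
Relationships: two one-to-many (1 to m) links, with Metadata on the "m" side of each.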

Figure 8: Sample relational database model
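To make the comparison concrete, the following is a minimal sketch written in Python purely for illustration (the project itself uses C# and RavenDB). It builds the three tables of Figure 8 in an in-memory SQLite database and shows the two joins needed to reassemble what the JSON form already holds in a single document. The sample rows and the helper name `fetch_report` are assumptions, not part of the project.

```python
import json
import sqlite3

# Relational form: the three tables named in Figure 8 (sample rows are invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE HelpdeskReports (id INTEGER PRIMARY KEY, subject TEXT, report TEXT);
    CREATE TABLE MetaDataType (id INTEGER PRIMARY KEY, typeName TEXT);
    CREATE TABLE Metadata (
        id INTEGER PRIMARY KEY,
        reportID INTEGER REFERENCES HelpdeskReports(id),
        metadatavalue TEXT,
        type INTEGER REFERENCES MetaDataType(id)
    );
    INSERT INTO HelpdeskReports VALUES (2321816, 'Web application', 'throws an exception');
    INSERT INTO MetaDataType VALUES (1, 'Technology');
    INSERT INTO Metadata VALUES (1, 2321816, 'ASP.Net', 1);
""")


def fetch_report(report_id):
    """Reassemble one helpdesk call; two joins are needed to reach the type name."""
    rows = conn.execute(
        """
        SELECT r.id, r.subject, r.report, t.typeName, m.metadatavalue
        FROM HelpdeskReports r
        JOIN Metadata m ON m.reportID = r.id
        JOIN MetaDataType t ON t.id = m.type
        WHERE r.id = ?
        """,
        (report_id,),
    ).fetchall()
    first = rows[0]
    return {
        "Id": first[0],
        "Subject": first[1],
        "Report": first[2],
        "MetaData": [{"Type": r[3], "Value": r[4]} for r in rows],
    }


# Document form: the same call is already one self-contained JSON document.
document = json.loads(
    '{"Id": 2321816, "Subject": "Web application", "Report": "throws an exception",'
    ' "MetaData": [{"Type": "Technology", "Value": "ASP.Net"}]}'
)

relational_view = fetch_report(2321816)
```

Both forms carry the same information; the difference is that the relational form has to be joined back together on every read, whereas the document can be loaded as it is.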


2.3.4 Methodology

In this section I will evaluate three software development methodologies: the traditional waterfall life cycle, the spiral life cycle and the agile methodology.

2.3.4.1 Traditional Waterfall Lifecycle

This methodology is a sequential development process. As shown in the figure below, once a phase has been completed it is not possible to go back to it, because doing so would affect the phases that follow [22]; this in turn would lengthen the project and cause further problems.

This project will use several new technologies, which carries an element of risk because of the unpredictable nature of software, so a process that does not allow earlier phases to be revisited is a poor fit.

[Figure 9 content: waterfall phases in sequence: Requirements, Design, Implementation, Testing, Maintenance]

Figure 9: Waterfall Model Adapted, source: [26]


2.3.4.2 Spiral Lifecycle

This model, an extension of the waterfall methodology, combines the design stages with prototyping; the prototyping helps to refine the requirements [17].

The main advantage of this methodology is its focus on quality assurance, as prototypes help to ensure that issues are caught earlier. Because of the use of prototyping, it also copes well with changes that occur during the development stage. However, as this is a relatively small, one-person project spanning a few months, this methodology may not be suitable, as it is better suited to larger projects.

Figure 10: Spiral Model, source: [23]


2.3.4.3 Agile

Unlike the waterfall and spiral lifecycles, the agile methodology suits small projects. The rigid planning stage of the waterfall methodology is omitted in order to accommodate changes to the requirements, although in this project the requirements are unlikely to change unless there is a technological issue.

The lack of documentation in agile methodologies is a distinct disadvantage if an application is to be supported afterwards; although such documentation would be beneficial, it is not essential for this project.

2.3.4.4 Choice of methodology

The available methodologies are designed on the assumption that teamwork (more than two members in a team) will take place. A methodology based on agile principles will therefore be used, but it will need to be adapted slightly, since much of the focus on documentation exists to alleviate the communication issues that arise from working as part of a team. The project uses new software technologies such as RavenDB, Lucene and .NET components such as LINQ, which increases the potential for unforeseen problems; rigid planning and low-level design will therefore not take place, although high-level designs such as storyboards will be used.

Figure 11: Agile Development, source: [25]


2.4 Hardware

The hardware required to run the web application is simply a computer running an ASP.NET web server. Connectivity to the internet is essential so that the web application can use external web services. From a user's perspective, all that is required is a computer with a web browser.

2.5 Summary

This chapter has introduced the concept of the helpdesk application and has discussed ontologies and taxonomies, development methodologies and the technologies that may be used.

In summary, the web application will be developed using ASP.NET and C#, the database will be RavenDB and the ontology extraction tool will be Open Calais. Finally, the development methodology will be based on agile principles.


3 Requirements Specification

3.1 Functional Requirements

3.1.1 Overall System Requirements

Below is a description of what the overall requirements of the system are:

ID Description

R1.1 The system should enable both user and expert to communicate via a web application

R1.2 The system should be able to extract information about a problem report so that it is logged and directed to the appropriate expert.

R1.3 The system should be able to provide a report view of the overall problems helpdesk

are facing.

R1.4 The system should be able to store a history of the report, its correspondence etc., and store statistics about the call

R1.5 The system should have a login facility which displays 2 separate views, one for the

user and the other for the expert

R1.6 The system should be able to direct the call to the appropriate expert by matching the tags with the expert's skills or area of focus

R1.7 The system should be searchable, which allows for the search of any previous calls

3.1.2 Front End Requirements

Below are requirements for the front-end of the web application.

3.1.2.1 Overall Front-End Requirements

ID Description

R2.1 The display should have a simple layout, and should be consistent throughout the web

application.

R2.2 The display should be cross-browser compliant, to ensure the design is consistent with

any browser

R2.3 The display should be potentially customizable in terms of logo and colour scheme

3.1.2.2 User

Below is a description of what the user requirements for the front end would be.

ID Description

R3.1 Display a text box field where the user can enter in the problems they are facing


R3.2 Display a real time update of the status of the helpdesk call

R3.3 Any updates or discussions made will be displayed on the screen.

R3.4 Provide a search page which allows the user to find any previous calls

R3.5 Display all the calls that the user has made that have not been resolved.

3.1.2.3 Expert

Below is a description of what the expert’s requirements for the front end would be:

ID Description

R4.1 A real time view of all the calls in helpdesk that appear in the front page

R4.1.1 Display subject of the call

R4.2 Display all calls in a page which stores all the logged calls

R4.2.1 Display a section where it displays calls logged to specific expert/category

R4.3 Any updates or discussions on the call will be displayed on the screen

R4.4 Display the helpdesk call in new window when clicked on.

R4.4.1 Provide buttons to re-assign calls, complete calls and reply

R4.4.2 Display detailed performance statistics in terms of call handling efficiency

R4.5 Display a tree like report page of the overall problematic areas

R4.6 Provide an advanced search facility

3.1.3 Application Requirements

Below are requirements for the functionalities of the web application.

ID Description

R5.1 All communication is managed via the application itself and is stored within the

database

R5.1.1 Provide feedback on users’ actions to communicate that the web application is working.

R5.2 The system should make use of natural language processing tools to gain an

“aboutness” for the helpdesk calls

R5.2.1 The system should have a fail-safe should the NLP tool not work.

R5.3 The system should have a login system for security purposes

R5.3.1 The system should then have the facility to register

R5.4 The system should store all information in the database

R5.4.1 Information such as: the helpdesk call itself, statistics about the call, the analysis about the call, the group it is in.

R5.5 The system should direct the helpdesk call to the appropriate expert

3.1.4 Database requirements

Below are the requirements for the database.

ID Description

R6.1 Database should be maintained and backed up regularly

R6.2 Database must be indexed to increase database performance


4 Design

This chapter describes the logic of the system. The designs are split into subsystems, and each is accompanied by a discussion of how it meets a particular requirement.

4.1 System Name and Logo Design

The system will be called Amekhania, because the name is memorable and has a “technical feel”. It refers to the Greek spirit of helplessness and want of means, which I feel is appropriate given that the web application is intended to provide technical help.

Below is the design of the system's logo; it is very simple and looks professional.

Figure 12: Logo for System

4.2 System Architecture

The diagram below illustrates the system architecture.

Figure 13: System Architecture


Several patterns could be used for the system architecture, but for this system a “pipes and filters” architecture is the most useful, because the web application's purpose is to store, process and extract information and then direct the call to an expert. It is a streamlined process, as shown in the diagram below.

Figure 14: Pipes and Filter process of Amekhania

This process is flexible in the sense that the processing steps are separated from one another; for example, should Open Calais fail, cease to exist or be superseded by a better alternative, there is the flexibility to change or add to the process.
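The pipes and filters idea can be sketched as follows. This is an illustrative Python sketch, not the project's C# implementation: the filter names follow the stages of Figure 14, while the keyword spotting in `process_call`, the field names and the sample text are assumptions standing in for Open Calais and RavenDB.

```python
import html
from functools import reduce


def html_decode(call):
    """Filter 1: turn HTML entities in the submitted text back into characters."""
    return {**call, "report": html.unescape(call["report"])}


def process_call(call):
    """Filter 2: stand-in for the extraction service (hypothetical keyword spotting)."""
    known_terms = ["ASP.Net", "RavenDB", "printer"]  # illustrative only
    found = [t for t in known_terms if t.lower() in call["report"].lower()]
    return {**call, "metadata": found}


def store_call(call):
    """Filter 3: stand-in for the database write; here it only marks the call."""
    return {**call, "stored": True}


def direct_call(call):
    """Filter 4: route the call, or leave it unassigned when nothing was extracted."""
    status = "Directed" if call["metadata"] else "Unassigned"
    return {**call, "status": status}


def run_pipeline(call, filters):
    """Pass the call through each filter in order; the output of one feeds the next."""
    return reduce(lambda current, step: step(current), filters, call)


PIPELINE = [html_decode, process_call, store_call, direct_call]
result = run_pipeline({"report": "ASP.Net app throws &quot;null&quot; errors"}, PIPELINE)
```

Because each filter only takes a call and returns a call, replacing the extraction step means swapping one entry in `PIPELINE` and leaving the others untouched.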

4.3 Database Design

As the database is essentially non-relational and the storage format is JSON, the structure is described below.

Two types of user are going to use this system:

User: sends queries to the helpdesk.

Expert: responds to and resolves queries in the helpdesk.

A type field is therefore required in the user document so that the system can recognise who is logging in and redirect that user to the appropriate page.

[Figure 14 content: HTML decode, then Pipe 1, then Process Call, then Pipe 2, then Store Call, then Pipe 3, then Direct Call]


The RavenDB documentation website (footnote 1) suggests thinking in terms of aggregates and entities, and therefore using domain-driven design, when designing the database. Below is a high-level design of the database, with the User as the aggregate root of the document database, because the user creates the helpdesk call.

Figure 15: High level entity design

This high-level design allows responses and statistics to be stored, which satisfies requirement R1.4 and enables the web application to manage the communication.

Below is a description of each key document:

Entity | Description
User | Stores the details of the user (including the expert), i.e. password and username. A type field is used to differentiate between the users; this satisfies R5.3.
Helpdesk Call | The report of the problem a user is currently facing. It stores the documents “About”, “Responses” and “Overall Statistic”.

1 http://ravendb.net/documentation/docs-document-design

[Figure 15 content, as an indented hierarchy]
User
    Helpdesk Updates
    Helpdesk Call
        About
            RDF-Entities (Entity)
            RDF-Relationships (RelationshipDetail)
            SimpleDocEntities (SimpleDocEntity)
            SimpleDocTopics (SimpleDocTopic)
        Responses (Response Statistic)
        Overall Statistic


About | Stores the Open Calais data, in the documents “RDF-Entities”, “RDF-Relationships”, “SimpleDocEntities” and “SimpleDocTopics”.
Responses | A helpdesk call can have more than one response; each response stores the response itself and its statistic.
Response-Statistic | Stores the statistic for a response.
Overall Statistic | Stores statistics such as the number of responses and the time taken to deal with the call, as well as the status of the call.
RDF-Entities | The entities that the Open Calais API delivers from the RDF document generated by the service; essentially a list of the entities.
Entity | Contains the RDF entity itself.
RDF-Relationships | Describes the relationships between the entities.
RelationshipDetail | Describes the relationships that a call has.
SimpleDocEntities | The entities taken from the SimpleDocument generated by the service; a simplified version of RDF-Entities.
SimpleDocEntity | Contains the SimpleDoc entity itself.
SimpleDocTopics | The topics taken from the SimpleDocument.
SimpleDocTopic | Contains the SimpleDoc topic itself.
Helpdesk Updates | Records any actions of the user/expert.

Figure 16 High level entity design
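As a rough illustration of how the entities in the table above nest beneath the User aggregate root, the sketch below models a few of them as Python dataclasses and serialises one user to JSON. It is not the project's C# entity code: the field names (`username`, `type`, `calls`, `status` and so on) and the sample values are assumptions, and several entities are reduced to plain lists of strings for brevity.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Response:
    text: str
    user_id: str


@dataclass
class About:
    # Lists mirror the four Open Calais result groups named in the table above.
    rdf_entities: List[str] = field(default_factory=list)
    rdf_relationships: List[str] = field(default_factory=list)
    simple_doc_entities: List[str] = field(default_factory=list)
    simple_doc_topics: List[str] = field(default_factory=list)


@dataclass
class HelpdeskCall:
    subject: str
    report: str
    about: About = field(default_factory=About)
    responses: List[Response] = field(default_factory=list)
    status: str = "Unassigned"  # part of the "Overall Statistic" in the design


@dataclass
class User:
    username: str
    type: str  # "User" or "Expert"; decides which view is shown after login
    calls: List[HelpdeskCall] = field(default_factory=list)


user = User(username="jsmith", type="User")
call = HelpdeskCall(subject="Web application", report="Keeps throwing an exception")
call.about.simple_doc_topics.append("Technology")
call.responses.append(Response(text="This has now been fixed", user_id="14584"))
user.calls.append(call)

# asdict() walks the nested dataclasses, giving one JSON document per user.
user_document = json.dumps(asdict(user), indent=2)
```

Keeping the design at this level leaves the exact fields open until implementation, which is the intention stated for this chapter.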


To satisfy requirements R3.3 and R4.3, which state that updates to any call should be displayed to both the user and the expert, the “helpdeskupdates” document will be used to display any updates made to a particular call.

RavenDB has a built-in automatic indexer: if you query on certain criteria it stores a temporary index for the query, and if the query is made repeatedly over a sustained period of time it makes the index permanent. This satisfies requirement R6.2.

To back up the database regularly, any backup solution can be used to copy the “data” directory to the backup directory of choice, and this can be automated with Windows Task Scheduler. This satisfies requirement R6.1.

The reason for keeping the design at a high level is that I can decide on the variables I need during the implementation stage of the project.

4.4 Web Application Design

4.4.1 Communication Process

As the communication between the user and expert needs to be managed by the application and stored in the database, as stated by requirement R5.1, the table below describes the process of managing communication.

Step Description

1 User logs into the web application

2 Server verifies credentials and, if successful, redirects the user to the appropriate page

3 Server retrieves the appropriate data from RavenDB

4 User submits a call or a response to a particular call

5 Server then stores the call or response

6 User logs off application and session ends.

Figure 17:Process of Managing Communication


4.4.2 Ontology Extraction Process

The steps below are taken to process the call. The process diagram below satisfies requirements R5.2 and R5.2.1: should Open Calais fail for any reason, the call is stored anyway, which ensures that the problem reported still reaches the expert.

The extraction process is applied only to the original call. Running it on every response would increase the amount of data stored in the database, lengthen the processing time for each response and create unnecessary load on the database.

Below is a table describing the process of ontology extraction.

Step Description

1 User submits a call

2 Server stores the call

3 Server processes the call

4 Server stores the extracted information

5 Server indicates to the user if it is successful

Figure 19: Process of Ontology Extraction

The call is processed as the user submits it, rather than in a batch of x calls awaiting processing, because batch processing would require a middle process or console application. Middleware that conducts batch processing could slow down the database for sustained periods of time, whereas processing on submission only means that the user has to wait a few seconds for the call to be stored and processed. This process satisfies R5.2.

[Figure 18 content: Process Helpdesk Call, then the decision “Did the helpdesk call get processed?”. YES: Store the Processed Data into Database, then Attempt to direct call to appropriate expert or call. NO: Store the Call anyway, then Log Call as Unassigned.]

Figure 18: Process Diagram for Ontology Extraction
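The fail-safe behaviour described in this section can be sketched as below. This is an assumption-laden Python illustration rather than the project's code: `submit_call`, `ExtractionError` and the two stand-in extractors are hypothetical names, and a plain list stands in for the database.

```python
class ExtractionError(Exception):
    """Raised by the stand-in extractor when the service cannot be reached."""


def submit_call(call, extractor, database):
    """Store the call first, then try extraction; a failure never loses the call."""
    record = {"report": call, "about": None, "status": "Unassigned"}
    database.append(record)  # step 2: the call is stored before processing
    try:
        record["about"] = extractor(call)  # step 3: process the call
        record["status"] = "Processed"  # step 4: the extracted data is kept
        return True  # step 5: tell the user it worked
    except ExtractionError:
        # Fail-safe (R5.2.1): keep the stored call and leave it unassigned.
        return False


def working_extractor(text):
    """Hypothetical extractor that always succeeds."""
    return {"topics": ["Technology"]}


def broken_extractor(text):
    """Hypothetical extractor that simulates the service being down."""
    raise ExtractionError("service unavailable")


db = []
ok = submit_call("Web application throws an exception", working_extractor, db)
failed = submit_call("Printer is offline", broken_extractor, db)
```

In both cases the call ends up in the store; only the presence of extracted data and the status differ, so an expert can still pick up a call that could not be analysed.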

4.4.3 Taxonomies

In terms of displaying taxonomies, the first (base) level of categories will be the type of call. The idea is then to count the number of calls in each category and make the category searchable; for example, if the user clicks on a category, all calls in that category are displayed.

The information extracted from the call will also be used for categorisation: as above, a count will be kept for each extracted entity, and each entity will again be made searchable, as illustrated in the figure below.

Figure 20: Mock-up of taxonomy integrated into search facility
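A minimal Python sketch of the counting behind such a taxonomy is given below. The sample calls, the field names and the choice to nest entity counts beneath each category are all assumptions made for illustration; the source only states that calls are counted per category and per extracted entity and that both are searchable.

```python
from collections import Counter, defaultdict

# Invented sample calls; "category" is the base level, "entities" come from extraction.
calls = [
    {"id": 1, "category": "Desktop Support", "entities": ["Windows", "Outlook"]},
    {"id": 2, "category": "Web", "entities": ["ASP.Net"]},
    {"id": 3, "category": "Desktop Support", "entities": ["Outlook"]},
]


def build_taxonomy(calls):
    """Count calls per category and, beneath each category, per extracted entity."""
    category_counts = Counter(call["category"] for call in calls)
    entity_counts = defaultdict(Counter)
    for call in calls:
        entity_counts[call["category"]].update(call["entities"])
    return category_counts, entity_counts


def calls_in(calls, category, entity=None):
    """What clicking a node would show: the ids of the calls under that node."""
    return [
        call["id"]
        for call in calls
        if call["category"] == category and (entity is None or entity in call["entities"])
    ]


category_counts, entity_counts = build_taxonomy(calls)
```

Here `category_counts` supplies the numbers shown next to each base-level category, and `calls_in` returns the calls that would be listed when a category or entity is clicked.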


4.4.4 Directing the call to the appropriate expert

There are two ways of directing the call based on the information extracted by Open Calais. In the first, the system searches the experts' skill sets, which would have to be entered manually, decides who is best placed to deal with the call and, if there is a match, assigns the call to that expert.

The second option is illustrated below:

To create the categories and be able to direct calls, each category needs a pre-inserted list of keywords that describe it. What has been extracted from the call is then matched against these keywords, and if the call matches a keyword it is assigned to that category.

To keep the description of each category updated and relevant, when a helpdesk call is logged to a particular category the information extracted from the call can be added to that category's description. The description of each category is thereby kept up to date, so that future calls can be logged to the correct category.

Each call that is processed or manually logged into a particular category is displayed with the status “Unassigned”, and the experts working in that category (for example, an expert working in desktop support) are able to see it. I think this is the most appropriate way of directing a call to an expert, as their skills will be aligned to the category of the call; this satisfies R1.6 and R5.5.

[Figure 21 content: decision “Has the call been processed?”. YES: Attempt to direct to category, then Compare extracted information with description of category, then the decision “Does the call match a category?”. YES: Assign call to category. NO: Change Status of Call to Unassigned.]

Figure 21: Process of directing call to category
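The keyword matching and the self-updating category descriptions can be sketched in Python as follows. This is an illustration under stated assumptions: the function name `direct_to_category`, the sample categories and the rule of picking the category with the largest keyword overlap are not taken from the project, which only says that a call matching a keyword is assigned to that category.

```python
def direct_to_category(extracted_terms, categories):
    """Assign a call to the category sharing the most keywords, then learn from it.

    `categories` maps a category name to its set of descriptive keywords.
    Returns the chosen category name, or None when the call stays unassigned.
    """
    terms = {term.lower() for term in extracted_terms}
    best_name, best_overlap = None, 0
    for name, keywords in categories.items():
        overlap = len(terms & keywords)
        if overlap > best_overlap:
            best_name, best_overlap = name, overlap
    if best_name is not None:
        # Feed the call's terms back into the description so it stays relevant.
        categories[best_name] |= terms
    return best_name


categories = {
    "desktop support": {"windows", "outlook", "printer"},
    "web": {"asp.net", "iis", "browser"},
}

first = direct_to_category(["ASP.Net", "Linq"], categories)
second = direct_to_category(["Linq"], categories)  # matches only thanks to the update
third = direct_to_category(["Payroll"], categories)
```

The second call shows the effect of the feedback step: "linq" was not an original keyword, but it was learned from the first call and now routes later calls to the same category. A call with no match is left for manual logging.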


4.5 Front End Design

Below is a mock-up of the expert's view of a helpdesk call. The user's view will be similar, but without the full range of functionality available to the expert.

The mock-up in Figure 22 allows the user to make relatively quick searches using the search bar aligned to the right of the header. Calls shown in the boxes/grids link to a call page, where the user and expert can view the call and perform the necessary operations on it; the functions available depend on the type of user. For example, a standard user will only be able to respond to the call, whereas an expert will also be able to reassign it, delete it, etc.

As the premise of the application is quite simple, the design will have clearly labelled buttons showing what the user is able to do; for example, if a user wants to respond to a call, there will be a button called “Respond”.

Figure 22: Mock-ups of front end for user


4.6 Web Application Structure

Figure 23: Web Application Structure

The figure above illustrates the web pages that will be available in the web application. The administration aspect of the website allows anyone to register as the “normal” type of user, but only an existing “expert” can create another “expert” user. I have structured the creation of user types in this manner because becoming an expert should naturally involve making a request.

4.7 Summary

This chapter has described, at a high level, the database, how it will be used in the context of the web application and how it will allow communication between user and expert.

It has also discussed the process by which information about a call is extracted and how the taxonomies are represented, and has provided an architecture diagram showing how the components work together. Major design decisions and their justifications were also documented.

[Figure 23 content: pages under Amekhania: Login, User, UserCall, Search, Expert, ExpertCall, Advanced Search, Report, AllCalls, Register Expert User, Register, Forgot]


5 Implementation

This chapter documents the implementation of the system according to the criteria set out in previous sections, and is split into the appropriate sections.

5.1 Front End

The front end of the website was built using ASP.NET master pages [8]. I created three master pages:

“Top.Master” – used for any page viewed by the expert; it has a modified menu tailored to the expert's needs.

“User.Master” – used for any page viewed by the user; it has a modified menu tailored to the user's needs.

“Login.Master” – used for any user, but provides a very limited view and no links to the other pages.

They all share the same CSS layout; the file “/styles/Main.css” applies the styling to the master pages, and I modified it to use my preferred colour scheme.

Using the jQuery UI ThemeRoller package also meant that I did not need to spend a lot of time on the front end design of the web application, as it has widgets such as tabs and a progress bar built in.

Figure 24: default web page being the login screen


5.1.1 Field Validation

Using ASP.NET controls such as the label and input form made validation very straightforward, because alongside these controls there are validation controls that can be “attached” to the control that needs to be validated.

Below is a table that shows how the main fields have been validated.

Webpage | Field(s) | Validation | Comment
Default.aspx | UserName and Password | Checks if it is empty | Built into the login control
ExpertCall.aspx | Submit Response | Required field, so cannot be empty | Used a validation control
UserCall.aspx | User_Response | Required field, so cannot be empty | Used a validation control
User.aspx | Subject and Body | Required fields, so cannot be empty | Used a validation control
RegisterExpertUser.aspx, RegisterUser.aspx | Email Address, Password, Surname, DOB | Field must be an email address; password cannot be longer than 15 characters; all fields are required, so cannot be empty | Date of Birth uses a jQuery datepicker, therefore user cannot enter data
Forgot.aspx | FirstName, Surname, DOB | All fields are required, so cannot be empty | Date of Birth uses a jQuery datepicker, therefore user cannot enter data

Figure 25: Validation Table


5.2 Web Application

The web application was built with the help of Appendix C, which contains mock-ups of the pages that needed to be built. Appendix D describes the web pages' methods and Appendix H describes each of the relevant files and directories in the web application.

5.2.1 Issues Faced

When implementing the web application there were a few complications, and these were:

5.2.1.1 Page refreshes

In order to receive updates or display any new data, the Microsoft Ajax Control Toolkit was used in most of the web pages. It was an excellent tool, as it enabled me to implement partial page refreshes, so that any new data is loaded dynamically without the user needing to refresh the page.

5.2.1.2 TinyMCE

As I wanted the form data to be submitted without a full page refresh, I enclosed a textbox in an ASP.NET AJAX UpdatePanel, so that data entered in the TinyMCE editor would be submitted asynchronously. However, the data submitted was always null. The issue appeared to be that the TinyMCE editor was not copying its content into the ASP.NET textbox; I therefore applied the function “updateTextArea”, as described in Appendix F, which fixed the issue.

Load time was also an issue, so a TinyMCE compressor was used to improve the speed at which the editor loaded.

In addition, the TinyMCE editor failed to reload after an asynchronous submit. I have yet to find a suitable solution to this problem, apart from refreshing the page that the editor is on.

5.2.1.3 Open Calais Wrapper

Another issue was that the Open Calais wrapper needed to be updated: the web service had changed slightly and the API library was throwing exceptions, so I had to modify the code as and when an exception was thrown. This was usually because a dictionary value could not be matched, as it had never existed in the library.


5.3 Database Implementation

In developing the database entities, I created entity classes, shown in Appendix B in a diagram generated by the Microsoft Visual Studio 2010 class diagram generator. These entity classes are the objects that are stored in RavenDB.

Where an entity holds a list of another type of entity class, the association is made by storing the ids of the associated entities. For example, in Figure 26 a helpdesk call may have a list of helpdesk statistics; the variable “Statistics” stores an id for each “helpdeskstatistic” document.

Storing an entity as a separate document, as opposed to storing it within its associated entity, makes it easier to perform wide-ranging queries over documents of a certain type, because the document is not nested inside another document.
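The trade-off between referencing by id and nesting can be sketched as follows. This is an illustration in Python, not the project's C# entity classes: an in-memory dictionary stands in for RavenDB, and the class names, field names and id strings are assumptions loosely modelled on the entities mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class HelpdeskStatistic:
    id: str
    responses: int = 0


@dataclass
class HelpdeskCall:
    id: str
    subject: str
    # Only the ids of the related statistic documents are stored here.
    statistics: list = field(default_factory=list)


# An in-memory dictionary stands in for the document store.
store = {}


def save(document) -> None:
    store[document.id] = document


def load_statistics(call: HelpdeskCall) -> list:
    """Resolve the ids held by a call into the statistic documents."""
    return [store[stat_id] for stat_id in call.statistics]


def all_of_type(doc_type) -> list:
    """Query every document of one type without opening any other document."""
    return [doc for doc in store.values() if isinstance(doc, doc_type)]
```

Because each statistic is a document in its own right, `all_of_type(HelpdeskStatistic)` does not need to walk through every call to find them, at the cost of one extra lookup when a call's statistics are displayed.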

Also, any query that is made repeatedly against RavenDB is initially served by a temporary index, which becomes permanent once that query/index has been used a certain number of times. This was advantageous because indexes are created dynamically rather than having to exist in advance.
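The promotion behaviour described above can be pictured with a small counter-based model. This is only a conceptual sketch in Python: the class, the index naming scheme and the threshold of three uses are invented for illustration and do not reflect RavenDB's actual internals or configuration.

```python
class DynamicIndexRegistry:
    """Toy model of temporary indexes that are promoted after repeated use."""

    def __init__(self, promote_after: int = 3):
        self.promote_after = promote_after  # Hypothetical usage threshold.
        self.usage = {}          # index name -> number of times queried
        self.permanent = set()   # names of promoted indexes

    def record_query(self, fields) -> str:
        """Register a query on the given fields and return the index state."""
        name = "Temp/" + "And".join(sorted(fields))
        self.usage[name] = self.usage.get(name, 0) + 1
        if self.usage[name] >= self.promote_after:
            self.permanent.add(name)
        return "permanent" if name in self.permanent else "temporary"
```

A query shape that is only used once or twice stays temporary, while one that the application issues on every page load is promoted without the developer defining an index by hand.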

Figure 26: Example of Database Entity

5.3.1 User submitting call sequence diagram

Appendix D describes the methods used in the classes that enable the user to perform the necessary actions. Below is a sequence diagram covering the user's path from logging into the system up to submitting a helpdesk call.

Figure 27: Sequence diagram

5.4 System Operation

This section describes “Amekhania” from deployment, through user registration, to creating calls and the expert responding to them.

5.4.1 RavenDB – Database

RavenDB is core to the running system, as the web application looks specifically for port 8080 to interact with the database.

Screenshot Description

This is the RavenDB console application that runs the RavenDB database server on port 8080, as shown in the screenshot.

This is the interface to the RavenDB; it allows viewing of the database statistics, and also the documents stored and the indexes.

5.4.2 Registering as an expert/user

| Step | Screenshot | Description |
|---|---|---|
| 1 | Default.aspx | The user browses to the URL at which the web application is hosted, where the default page is default.aspx. To register, they simply click the register button, which links to RegisterUser.aspx. |
| 2 | Registering onto the helpdesk system | The user enters all the necessary details and clicks the register button, after which they are sent an email confirming their details. For the expert user, the register page is hidden from the normal user. |

5.4.3 User Interaction

This section describes the typical interaction the user has with the system.

| Step | Screenshot | Description |
|---|---|---|
| 1 | Login Screen | The user browses to the URL at which the web application is hosted, where the default page loaded is default.aspx. To log in, they enter their username and password, and the system redirects according to the type of user. A standard user is redirected to “user.aspx”. |
| 2 | User Dashboard | When users first log in they can see whether there are any notifications; this page updates without the need to refresh because of the use of Ajax. Below is the menu that the user will have. |
| 3 | Creating a helpdesk call | The user just needs to click on “Create New Call” to describe the issue they are having. |
| 4 | Viewing a Helpdesk Call | A link to each helpdesk call is provided in the dashboard; from there the user can see any extracted information, the responses, the statistics and finally the properties. |
| 5 | Viewing Responses; Creating a response | The user can see the responses made to the helpdesk call, and submit a response via the TinyMCE plug-in. |
| 6 | Viewing Extracted Information | To view the extracted information, the user simply clicks on the about tab. |
| 7 | Searching helpdesk | To search the helpdesk for calls, the user enters words in the search bar, for something general or specific, and is redirected to the search page. |
| 8 | Retrieving Password | To retrieve their password, the user follows the “forgot password” link on the login page, which directs to forgot.aspx; this contains a form with which the user can retrieve their password. |

5.4.4 Expert Interaction

This section describes the typical interaction the expert has with the system.

| Step | Screenshot | Description |
|---|---|---|
| 1 | Expert Dashboard; Expert Menu | When experts first log in they can see whether there are any notifications, similar to the user dashboard, except that the expert's dashboard uses a different menu, also they |
| 2 | Expert View of Calls Assigned to themselves | The expert can click on the tab “Calls Assigned To You” and, using the drop-down menu, filter according to the status of the call. |
| 3 | Expert View of Calls that get sent by the user | This renders the calls that are sent by the user(s); should a call get directed to a category, it displays the category to which the call is assigned. |
| 4 | Taxonomy Report (report view) | This page provides a summary of what is in the helpdesk, rendered in the format of a tree structure. |
| 8 | All Calls in Helpdesk | Depending on which category the expert wants to look in for helpdesk calls, they can check according to the status. |
| 9 | Expert Viewing a helpdesk Call | Once the expert has been directed to the helpdesk call, they can view what the call is about, view the stats, view the properties and also change the properties of the helpdesk call. |
| 10 | Expert Viewing the extracted information | The expert can navigate to the about tab, where sub-tabs allow the expert to view the type of information extracted according to Open Calais |
| 11 | Expert Changing Properties | If the expert wants to change a property of the helpdesk call, they can click on the change properties tab; below it are sub-tabs that enable the expert to modify the user assigned, the category and the call status. |
| 12 | Expert Advanced Search on Calls; Expert Search results on subject | Depending on what the user wants to search for specifically, they can select it from the drop-down list. In the next image, the expert wanted to search by subject: once the textbox for searching by subject appears, the search button appears, and once that is clicked the results are shown. |

5.5 Summary

This section has described how each component has been implemented, and how the user would be able to use the system.

6 Testing

This chapter describes the logic of the system and how the designs are split into subsystems, and discusses how each subsystem has met a particular requirement.

The majority of testing is done via black box testing. Because the project is a web application, testing functionality from a black box point of view exposes the faults and issues with the system, which ultimately allows me to provide bug fixes, and it provides conclusive evidence of whether functionality works or not.

6.1 Black Box Testing

This phase of testing treats each component of Amekhania as a black box system.

6.1.1 Web Application Testing

6.1.1.1 Login/Registration Testing

| Test No | Test Name | Expected Behaviour | Precondition | Actual Behaviour | Result |
|---|---|---|---|---|---|
| 1 | Launch the web application | Website directs to default.aspx | | As Expected | Pass |
| 2 | Register as standard user | Database stores the user credentials and send email to the user | 1 | As Expected | Pass |
| 3 | Register as expert user | Database stores the user credentials and send email to the user | 1 | As Expected | Pass |
| 4 | Login as standard user | Directs to the user.aspx | 2 | As Expected | Pass |
| 5 | Login as Expert User | Directs to the expert.aspx | 3 | As Expected | Pass |
| 6 | Attempt to use random details to login | Does not log in | 1 | As Expected | Pass |

Figure 28: Login/Registration Test Cases

In this phase of testing, all of the login and registration tests passed successfully.

6.1.1.2 Login/Registration Validation Testing

As RegistrationExpertUser.aspx and RegisterUser.aspx both contain the same fields, I am only going

to test the RegisterUser.aspx page.

| Test No | Test Name | Expected Behaviour | Actual Behaviour | Result |
|---|---|---|---|---|
| 1 | Enter non valid data for field email address. Data: 46548648645113123asdvniomno | Tells user “Not an email address” | As Expected | Pass |
| 2 | Enter password longer than 15 characters. Data: asnfifasoiasioasfnofsmnkaslfasfklfa | Field does not allow for password longer than 15 characters | As Expected | Pass |
| 3 | Enter first name longer than 30 characters. Data: nasonviodnsionsdgimdgojkgdnkogsdmasfasfasfafsaf | Field does not allow for first name longer than 30 characters | As expected | Pass |
| 4 | Enter surname longer than 30 characters. Data: Nasonviodnsionsdgimdgojkgdnkogsdmasffasasffaasf | Field does not allow for surname longer than 30 characters | As expected | Pass |
| 5 | Enter an invalid date. Data: 444444444564564848 | Field does not allow for invalid date format | Produces no message | Fail |
| 6 | Attempt to register with missing data in field: Email Address | Does not register; tells user “Email Address Required” | As Expected | Pass |

Figure 29: Login/Registration Validation Testing

In this phase of testing, all tests passed bar one. Test 5 failed because no validation control had been applied to the date field. This is not a critical issue, but the field is used for password recovery, so I have added a validation control for it on both of the registration pages.
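The kind of check that such a validation control performs can be sketched as below. This is a Python illustration rather than the ASP.NET validator used in the project, and the day/month/year format is an assumption based on the dates used in the advanced search test data.

```python
from datetime import datetime


def is_valid_date(text: str, fmt: str = "%d/%m/%Y") -> bool:
    """Return True when text parses as a date in the given format."""
    try:
        datetime.strptime(text.strip(), fmt)
    except ValueError:
        # Raised for non-dates and for impossible dates such as 31/02/2011.
        return False
    return True
```

Applied to the input from test 5, `is_valid_date("444444444564564848")` is False, so the form could show a message instead of silently accepting the value.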

6.1.1.3 Testing ontology extraction

In this section, I test Amekhania’s ability to extract information from submitted calls. To check that what has been extracted is accurate, I compare it against data on stackoverflow.com as a benchmark. For example, I pick 3 problem reports from this website that are relevant to application support and compare their tags with what Amekhania produces. I do the same for the categories infrastructure support and desktop, as shown in appendix G.

The result is that in most cases the extraction component of the system worked well when compared against the tags on stackoverflow.com, which are applied manually on that website. Where data was extracted using Amekhania, it produced in most cases the same tags and, at times, more.
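A comparison of this kind can be made repeatable with a small helper that reports which benchmark tags were reproduced, which were missed and which were extra. The function below is a Python sketch written for illustration; it is not part of Amekhania, and the case-insensitive matching rule is an assumption.

```python
def compare_tags(extracted, benchmark):
    """Compare extracted tags with benchmark tags, ignoring case."""
    extracted_set = {tag.lower() for tag in extracted}
    benchmark_set = {tag.lower() for tag in benchmark}
    matched = extracted_set & benchmark_set
    return {
        "matched": sorted(matched),
        "missed": sorted(benchmark_set - extracted_set),
        "extra": sorted(extracted_set - benchmark_set),
        # Share of benchmark tags that the extraction reproduced.
        "recall": len(matched) / len(benchmark_set) if benchmark_set else 0.0,
    }
```

An "extra" list that is non-empty while "missed" is empty corresponds to the observation that the system produced the same tags and, at times, more.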

However, there is an issue with the system when it attempts to direct a call to a particular category: it has failed to do this correctly at times. A correction to the algorithm that designates a call to a category will therefore need to be made in the future, because it currently seems to be “hit and miss”.

6.1.1.4 Web Application Testing

6.1.1.4.1 Expert

This section tests the functionalities and web pages available to the expert

| Test No | Test Name | Expected Behaviour | Precondition | Actual Behaviour | Result |
|---|---|---|---|---|---|
| 1 | Expert able to see new helpdesk call incoming | Web page makes an automatic request to populate the calls unassigned grid with new data | | As Expected | Pass |
| 2 | View list of calls assigned | Displays calls that are assigned to the expert | | As expected | Pass |
| 3 | View helpdesk call | Expert gets directed to the expert call page where the expert can see the call | 1,2 | As Expected | Pass |
| 4 | Modify Category of Call | Category of call gets modified and saved, and once the call has been processed it gets added to category descriptions; creates an update; user is informed of their action | 1 | As Expected | Pass |
| 5 | Modify User Assigned to Call | User assigned is modified and saved; user is informed of their action; creates an update | 1 | As Expected | Pass |
| 6 | Create a response to a helpdesk call | Response gets added to the call; user is informed of their action | 1 | As Expected | Pass |
| 7 | Delete a helpdesk call | Helpdesk call gets deleted from database; user is informed of their action | 1 | As Expected | Pass |
| 8 | Modify status of helpdesk call | Status of call gets modified and saved; user is informed of their action | 1 | As Expected | Pass |
| 9 | View Statistics for helpdesk Call | Displays statistics | 1 | As Expected | Pass |
| 10 | View All Helpdesk Calls | Expert able to view calls in helpdesk | | As Expected | Pass |
| 11 | Check if notifications take place | Notification appears on expert's dashboard in notifications tab | Has to be logged in | As Expected | Pass |
| 12 | View report | Tree generated of extracted information. | | As Expected | Pass |
| 13 | Click on a node of a report | If parent node do nothing, child nodes will direct to search page | | As Expected | Pass |

Figure 30: Expert Test Cases

In this section all of the test cases have passed without error.

6.1.1.4.2 Expert Advanced Search Testing

| Test No | Test Name | Expected Behaviour | Precondition | Actual Behaviour | Result |
|---|---|---|---|---|---|
| 1 | Advanced Search by Category: Desktop Support | Renders all calls that are in category desktop support | | As Expected | Pass |
| 2 | Advanced Search by Expert Assigned | Renders all calls that are assigned to a particular user | | As Expected | Pass |
| 3 | Advanced search by Call Status: Unassigned | Renders all calls that have status Unassigned | | Renders nothing | Fail |
| 4 | Advanced search by date: First Date: 01/04/2011, Second Date: 30/04/2011 | Renders all calls between the two dates | | Renders nothing | Fail |
| 5 | Advanced search by Subject: Windows | Renders results | | As Expected | Pass |

In summary, two failures occurred in the testing of the advanced search facility. Both were caused by an exception thrown when the code tried to limit the length of a string to an arbitrary value. A try/catch statement has been added to patch the code, and the rendering has been modified to incorporate this change; the issue has now been fixed.
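An alternative to catching the exception is to check the length before truncating, so that no exception is raised in the first place. The Python sketch below illustrates that different technique; it is not the try/catch patch applied in the project, and the function name, default limit and suffix are assumptions.

```python
def truncate(text, limit: int = 100, suffix: str = "...") -> str:
    """Shorten text to at most `limit` characters without raising.

    None and short strings are handled explicitly, which is the case a
    fixed-length substring call can trip over in languages where the
    requested length must not exceed the string length.
    """
    if text is None:
        return ""
    if len(text) <= limit:
        return text
    if limit <= len(suffix):
        # Not enough room for the suffix, so cut the text only.
        return text[: max(limit, 0)]
    return text[: limit - len(suffix)] + suffix
```

Guarding the length first keeps the search results rendering even when a call body is shorter than the display limit, and avoids using exceptions for an expected condition.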

6.1.1.4.3 User

This section tests the functionalities and web pages available to the user.

| Test No | Test Name | Expected Behaviour | Precondition | Actual Behaviour | Result |
|---|---|---|---|---|---|
| 1 | Create a helpdesk call | Call gets submitted and is displayed in the Calls Unassigned Tab; user is informed of their action | Subject is not null, Body is not null | As Expected | Pass |
| 2 | Search for a helpdesk call by word: ASP.Net | Result generated | | As Expected | Pass |
| 3 | Check for notifications in dashboard | Asynchronously retrieve data | User logged in | As Expected | Pass |
| 4 | User clicks on a helpdesk call | User directed to the userCall.aspx page | | As Expected | Pass |
| 5 | Create and submit a response | Response gets saved, and rendered onto the helpdesk call; user is informed of their action | | As Expected | Pass |
| 6 | View Stats | Displays statistics about call | | As Expected | Pass |
| 7 | View Responses | Displays all responses | | As Expected | Pass |

Figure 31: User Test Cases

In this section all of the test cases have passed without error.

6.1.1.5 Front End Testing

| Test No | Test Name | Expected Behaviour | Precondition | Actual Behaviour | Result |
|---|---|---|---|---|---|
| 1 | Browse to Amekhania in Google Chrome | CSS layout maintained | Web application started | As Expected | Pass |
| 2 | Browse to Amekhania in Mozilla Firefox | CSS layout maintained | Web application started | As Expected | Pass |
| 3 | Browse to Amekhania in Internet Explorer 8 | CSS layout maintained | Web application started | Tabs become square rather than rounded | Pass |
| 4 | Search for calls: data: asp.net | Truncates the body | Logged in | As Expected | Pass |

Figure 32: Front End Test Cases

In this section all of the test cases passed without error, with the slight exception of test no 3, where the tabs rendered square rather than rounded in Internet Explorer 8.

6.1.2 Summary

This section has made use of black box testing, and has tested the functionality of the system against

the requirements.

7 Evaluation

This section evaluates the aims and requirements of the project as developed in the introduction

section and requirements specification respectively.

7.1 System Aims Evaluation

Aim: Provide a platform in which users can communicate a technical problem such that it can be addressed by the appropriate expert(s) or department.

Evidence: The web application has been developed to meet the need to communicate a problem in real time. Rather than emailing a mailbox and waiting for a typical system to reply that the problem has been logged, with Amekhania the user communicates a problem using TinyMCE, which is a rich text editor. The call is picked up by the helpdesk application, and should the expert respond or change the status of the call, this is indicated to the user (and vice versa) in the “notification box” on the respective user’s dashboard.

Not relying on email is a particular benefit: unlike a system that uses email as its main form of communication, should the email server go down, the helpdesk application would still be operational and able to receive communications from its users.

The user is also able to see what is currently happening to the helpdesk call they have submitted, as the status of the call is displayed.

Aim: Provide functionality such that the problem reports can get automatically logged and directed to the appropriate expert with the use of ontology extraction.

Evidence: The web application architecture has been designed so that once the user submits the helpdesk call, the system uses an ontology extraction tool to determine what the call is about, and directs the call to a category rather than to a particular expert. It was decided in the design process that it was beneficial to direct the call to the appropriate category, because then the helpdesk team, for example, would be able to see that the call is “theirs” and log it appropriately. If instead the call were directed to an individual expert who happened to be off ill, this could result in no one else being able to pick up the call.

At first, the system will not recognize where the call is to be directed, because the “description” for each category will be empty. When a call is logged to a particular category, the helpdesk call’s extracted information gets added to that of the category it has been assigned to. This is a continuous process that builds the description of each category.

When directing the call, the system searches each category; the call is then placed in the category that has been matched the most.

Aim: Provide a dynamic view of the problematic areas in the form of a tree structure.

Evidence: To build a dynamic view of the problematic areas in the helpdesk, the web application generates, on the spot, a tree view of what is currently being discussed on the helpdesk. This view is only available to expert users.
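The category-matching process described in the evidence for the second aim (empty descriptions at first, descriptions growing as calls are logged, and a call going to the best-matching category) can be sketched as follows. This is a Python illustration under stated assumptions, not the project's C# algorithm: the class and method names are hypothetical, and scoring by the number of distinct shared terms is only one possible reading of "matched the most".

```python
from collections import Counter


class CategoryMatcher:
    """Toy model: categories accumulate terms from calls logged to them."""

    def __init__(self, categories):
        # Every category starts with an empty description.
        self.descriptions = {name: Counter() for name in categories}

    def learn(self, category, terms):
        """Add a call's extracted terms to the category it was logged to."""
        self.descriptions[category].update(t.lower() for t in terms)

    def suggest(self, terms):
        """Return the category sharing the most terms, or None if no match."""
        wanted = {t.lower() for t in terms}
        scores = {
            name: sum(1 for t in wanted if t in description)
            for name, description in self.descriptions.items()
        }
        best = max(scores, key=scores.get, default=None)
        if best is None or scores[best] == 0:
            return None  # Descriptions are empty or nothing matched.
        return best
```

Returning None when nothing matches mirrors the initial state in which the system cannot yet direct a call; ties are resolved arbitrarily by category order here, which is one place where "hit and miss" behaviour could arise in a scheme like this.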

7.2 System Requirements Evaluation

7.2.1 Overall System Requirements

| ID | Description | Evidence |
|---|---|---|
| R1.1 | The system should enable both the user and expert to communicate via a web application | The web application was built using C# and ASP.Net, therefore this could be hosted on a web server, with a hosted database |
| R1.2 | The system should be able to extract information about a problem report so that it is logged and directed to the appropriate expert. | Where possible the system attempts to extract information, but this depends on whether the problem reported is. |
| R1.3 | The system should be able to provide a report view of the overall problems helpdesk are facing. | There is a web page that collates all of the topics of discussion, and displays them in a summarised tree view. |
| R1.4 | The system should be able to store a history of the report, its correspondence etc. and store statistics about the call | The helpdesk call entity has been designed to link together all of the associated information. |
| R1.5 | The system should have a login facility which displays 2 separate views, one for the user and the other for the expert | The login system casts each user into a type of user, so the system automatically recognizes the type of the user and redirects appropriately. |
| R1.6 | The system should be able to direct the call to the appropriate expert by matching the tags with the expert's skills or area of focus | The system attempts to direct the call to the appropriate category rather than expert, as this was a design decision that was made; this requirement has not been fully met, as there are slight problems with the system when trying to direct the call. |
| R1.7 | The system should be searchable, which allows for the search of any previous calls | The system collates all of the calls, and allows for the searching of any call. |

7.2.2 Front End Requirements

7.2.2.1 Overall Front-End Requirements

| ID | Description | Evidence |
|---|---|---|
| R2.1 | The display should have simple layout, and should be consistent throughout the web application. | The use of master pages has enabled the designs used throughout the web application to look consistent, in conjunction with the use of CSS. |
| R2.2 | The display should be cross-browser compliant, to ensure the design is consistent with any browser | As shown in testing phase. |
| R2.3 | The display should be potentially customizable in terms of logo and colour scheme | Replacing the main_logo image, with image of choice meets this requirement. |

7.2.2.2 User

| ID | Description | Evidence |
|---|---|---|
| R3.1 | Display a text box field where they can enter in the problems they are facing | Once the user has logged in to the user dashboard, the “create new call” tab contains two text boxes, subject and body, where the body text box has the TinyMCE plug-in applied to it. |
| R3.2 | Display a real time update of the status of the helpdesk call | Should any change with regards to the status of the helpdesk call occur, the user will be updated of the status of the call. |
| R3.3 | Any updates/or discussions made will be displayed on the screen. | Any updates that relate to a particular user are rendered on the respective dashboard pages. |
| R3.4 | Provide a search page which allows the user to find any previous calls | There is a search box that is available across all of the web pages that are available to the user. |
| R3.5 | Display all the calls that the user has made that have not been resolved. | On the dashboard, the user can see all the calls that have been made and the status of the call. |

7.2.2.3 Expert

| ID | Description | Evidence |
|---|---|---|
| R4.1 | A real time view of all the calls in helpdesk that come in the front page | The web application has been designed with the use of Microsoft Ajax Control toolkit, which allows for data to be picked up automatically, and this data is rendered in the expert dashboard |
| R4.1.1 | Display subject of the call | The web application displays the subject of the call, and who the call is from, which provides a link to the helpdesk call |
| R4.2 | Display all-calls in a page which stores all the logged calls | This can be accessed via the expert's menu, and displays all of the calls in different sections. |
| R4.2.1 | Display a section where it displays calls logged to specific expert/category | This has been implemented via the use of tabs, which are split into the different categories, from which the expert can then filter the calls according to the status they wish to see. |
| R4.3 | Any updates/or discussions on the call will be displayed on the screen | Any updates that relate to a particular expert are rendered on the expert’s dashboard page. |
| R4.4 | Display the helpdesk call in new window when clicked on. | The helpdesk call view page has not been implemented such that it opens a new window; instead this is optional in certain browsers where they can “open the link in a new window” |
| R4.4.1 | Provide buttons to re-assign calls, complete calls and reply | Available in the expert call page, which allows the expert to reply and modify the status of the call |
| R4.4.2 | Display detailed performance statistics in terms of call handling efficiency | The web application stores statistics about the call, such as the number of responses, the date on which it was made and the date it was completed. |
| R4.5 | Display a tree like report page of the overall problematic areas | As discussed earlier in the overall system requirements evaluation |
| R4.6 | Provide an advanced search facility | The web application has been developed to incorporate an advanced search facility. |

7.2.3 Application Requirements

Below are requirements for the functionalities of the web application.

| ID | Description | Evidence |
|---|---|---|
| R5.1 | All communication is managed via the application itself and is stored within the database | The web application interacts with the RavenDB database in order to manage all of the communication |
| R5.1.1 | Feedback on users’ actions to communicate the web application is working. | The web application makes use of asp.net “updateprogress” controls which display messages in the event of a postback on a web page, also shown in the test phase. |
| R5.2 | The system should make use of natural language processing tools to gain an “aboutness” for the helpdesk calls | The web application has integrated the use of Open Calais, which extracts information from the helpdesk call submitted. |
| R5.2.1 | The system should have a fail-safe should the NLP tool not work. | Should Open Calais fail, the helpdesk call will just be stored without processing it. |
| R5.3 | The system should have a login system for security purposes | The web application contains a login system but the connection has no form of security in the sense that the password has been encrypted. |
| R5.3.1 | The system should then have the facility to register | The web application has implemented the registration pages for the expert and user to register. |
| R5.4 | The system should store all information in the database | The use of the RavenDB as a means of storage has allowed for all communication to be persisted. |
| R1.4.1 | Information such as: the helpdesk call itself, statistics about the call, the analysis about the call, the group it is in. | The web application displays all of the required information from the call |
| R5.5 | The system should direct the helpdesk call to the appropriate expert | As explained in overall requirements section. |


7.2.4 Database requirements

Below are the requirements for the database of the web application.

ID | Description | Evidence
R6.1 | Database should be maintained and backed up regularly. | This can be done via normal backup utilities, as RavenDB stores its data in a Windows directory.
R6.2 | Database must be indexed to increase database performance. | By its nature RavenDB queries on indexes; therefore its performance is optimized.

This section has evaluated whether the requirements have been met, and has raised issues that remain with the system.

8 Conclusion

This chapter discusses the project, whether it has met its aims and objectives, and the potential improvements that could be made to the system.

8.1 Review of Aims

As evaluated in section 7.1, the original aims and requirements of the project have been met, although, as discussed in the evaluation section, the implementation has deviated slightly from the original specification; these deviations have been justified in the design section of this report.

This project has investigated various ontology extraction tools and has made practical use of these technologies in a real-time system. This points to a potential future use in the new era of web technology, where NLP tools could be used on a mass scale across organisations for data collection and categorization without human intervention.

In the testing phase I benchmarked my system against a similar one, stackoverflow.com, where the questions posed on the website are tagged collaboratively by people. Although my system is not perfect by any means, it has demonstrated that categorization is possible without human intervention.

8.2 Revisions to the design & implementation

This section describes the main areas of improvement that could be made to the system.

8.2.1 Accuracy of direction

When directing the call to the expert/category, the system has been inaccurate, as discussed in previous chapters; therefore a re-design of the process or a reimplementation of the code is needed so that calls are directed to an expert/category more accurately.

8.2.2 Notifications

Notifications on the system are quite primitive and are only displayed on the user dashboard. An improvement to the overall design may therefore be required so that the notification system resembles the one used on Facebook, where a notification bar appears across all pages. For real-time notifications, jGrowl, a jQuery notification plugin, could also be used [28].

8.2.3 Front End Improvements

The web application renders what the call is about in table format, which is not visually appealing; the use of tag clouds [30] is therefore a possible improvement. Expert and normal users would then be able to gain the full benefit of what has been extracted from the call.
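The tag cloud idea can be made concrete with a small sketch. The Python below is illustrative only (the project itself is written in C#/ASP.NET): it scales a font size linearly with how often each extracted entity occurs. The function name, the pixel range and the sample counts are assumptions made for this example and are not part of Amekhania.

```python
def tag_cloud_sizes(counts, min_px=12, max_px=32):
    """Map each tag to a font size (px) scaled linearly by its count."""
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo
    sizes = {}
    for tag, n in counts.items():
        # When every tag has the same count, fall back to the smallest size.
        weight = (n - lo) / span if span else 0.0
        sizes[tag] = round(min_px + weight * (max_px - min_px))
    return sizes

# Hypothetical entity counts, for illustration only.
sample = {"MySQL": 9, "HTML": 5, "Windows XP": 1}
print(tag_cloud_sizes(sample))
```

The resulting sizes could then be applied as inline styles when the extracted entities are rendered, so that frequent entities stand out without the user reading a table.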


A minor issue is that once a response or a helpdesk call has been submitted via TinyMCE, the editor fails to reload because the data is submitted via Ajax; this is an area where a fix may be required.

In the grids of helpdesk calls and responses, there is no limit on the number of rows produced. Although RavenDB does limit the number of results returned to 128 by default, this could still put unnecessary load on the database; paging on the front end is therefore a future improvement that could be made to the system.
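A minimal sketch of the paging idea follows, written in Python for illustration rather than in the project's C#. The function name, the page size of 25 and the sample data are assumptions; the sketch only shows the skip/take arithmetic that paging relies on.

```python
def page_of(rows, page, page_size=25):
    """Return one page of rows plus the total number of pages."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    total_pages = max(1, -(-len(rows) // page_size))  # ceiling division
    skip = (page - 1) * page_size
    return rows[skip:skip + page_size], total_pages

calls = [f"call-{i}" for i in range(1, 61)]  # 60 hypothetical helpdesk calls
rows, pages = page_of(calls, page=3)
print(rows[0], len(rows), pages)
```

In a real deployment the skip and take values would normally be passed to the database query, so that only one page of results is fetched at a time instead of slicing an in-memory list.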

8.2.4 Design Pattern usage

No design pattern was used in the development of the project. Using the MVC [29] design pattern, which separates the logic from the presentation layer, would have been more beneficial alongside RavenDB for session management and data concurrency purposes.

8.2.5 Real Time System Performance

As the system is a real-time system, the best way to fully test how it performs is in a live environment. Given that helpdesk systems are critical to an organization, this would have been impractical, but it would have given better insight into how the system performs under stress.

8.3 Further Work

This section discusses the potential ways the system could be extended.

8.3.1 Collaboration

Ontology extraction tools and categorization could potentially be used in a worldwide arena. For example, if every academic publication had ontology extraction applied to its content, and the most viewed or written-about topics were identified according to the category each publication has been placed into, people could gain an understanding of a topic from different sources. They could even contribute to the domain: as knowledge sharing becomes an automated process, the resulting build-up of content from different perspectives would enrich the knowledge in a particular category or domain. Changing the use of Amekhania to that of a collaborative problem-solving website is therefore a consideration.

8.3.2 Alternative use of the system

As the focus of this project was to make use of ontology extraction tools in the context of a helpdesk system, the extraction and categorization functionalities of the system could potentially be used in many other contexts. One example is a DVD rental web application, which needs to categorize movies according to genre and apply tags on what the films are about, with the synopsis of each film used for analysis.

Potentially, a software framework could be created that allows other websites to make use of the system to extract information from their content and categorize it. This could also bring benefits such as boosting the search engine optimization [31] rankings of a website through the use of keywords extracted by the system, as demonstrated in the searching aspect of the helpdesk system.

8.4 Summary

As the web is a rapidly expanding arena of information, managing the organization of its data is a massive task for humans; the use of ontology extraction allows for automated categorization, which yields many benefits, as discussed earlier.

During the implementation of this project I have learnt a great many new concepts. For example, the use of a document-oriented database has enabled me to approach designing the database with new techniques such as domain-driven design, as opposed to the traditional design approaches of SQL database development.

This has been an invaluable experience: the project has been very challenging at times, and as a result it has improved my technical skills overall.

This chapter has come to the conclusion that the project has been a success overall, in that most of the aims have been met, and further developments and improvements that could be made to the system have been identified.


Acknowledgments

This section acknowledges the third-party code and libraries used in building the web application.

Item | Source | Location in directory
Open Calais Wrapper Library for Open Calais Web Service | http://calaisdotnet.codeplex.com/ | /CalaisDotNet /bin/CalaisDotNet.dll
Main CSS Layout | http://www.code-sucks.com/css%20layouts/faux-css-layouts/2-column-css-layouts/faux-7-2-col/faux-7-2-col.zip | /masterpages/styles/main.css
Ajax Loader Image | http://www.ajaxload.info/ | /images/ajax-load.gif
Menu Style | http://cssmenumaker.com/builder/menu_info.php?menu=057 | /masterpages/styles/menu_style.css
Vertical Menu | http://cssmenumaker.com/builder/menu_info.php?menu=044 | /masterpages/styles/verticalmenu.css
jQuery UI Theme Redmond | http://jqueryui.com/themeroller/ | /jquery
Microsoft Ajax Control Toolkit | http://www.asp.net/ajax/downloads | /bin/AjaxControlToolkit.dll
TinyMCE Editor | http://tinymce.moxiecode.com/track.php?url=http%3A%2F%2Fgithub.com%2Fdownloads%2Ftinymce%2Ftinymce%2Ftinymce_3.4.2.zip | /tinymce
TinyMCE compressor | http://tinymce.moxiecode.com/track.php?url=http%3A%2F%2Fgithub.com%2Fdownloads%2Ftinymce%2Ftinymce%2Ftinymce_compressor_net_2_0_4.zip | /tinymce/tiny_mce_gzip.aspx, /tinymce/tiny_mce_gzip.js
jQuery Datepicker Plugin | http://keith-wood.name/datepick.html | /jquery/js/jquery.datepicker.js
jQuery Truncatable plugin | http://theodin.co.uk/blog/development/truncatable-jquery-plugin.html | /jquery/truncatable


References

The papers and journals used to write this report are listed below, and cited in the text where appropriate.

[1] Help desk - Wikipedia, the free encyclopedia. 2010. [ONLINE] Available at: http://en.wikipedia.org/wiki/Help_desk. [Accessed 02 November 2010].
[2] Dray, J. 2008. Accuracy is the key to call logging. [ONLINE] Available at: http://blogs.techrepublic.com.com/helpdesk/?p=334. [Accessed 02 November 2010].
[3] How Does Calais Work? | OpenCalais. 2010. [ONLINE] Available at: http://www.opencalais.com/about. [Accessed 02 November 2010].
[4] AlchemyAPI. 2010. AlchemyAPI - Transforming Text Into Knowledge. [ONLINE] Available at: http://www.alchemyapi.com/api/. [Accessed 03 November 2010].
[5] Composing the Semantic Web: Text Analysis with OpenCalais in TopBraid. 2010. [ONLINE] Available at: http://composing-the-semantic-web.blogspot.com/2008/02/text-analysis-with-opencalais-in.html. [Accessed 03 November 2010].
[6] What is an Ontology? 2010. [ONLINE] Available at: http://www-ksl.stanford.edu/kst/what-is-an-ontology.html. [Accessed 03 November 2010].
[7] Debate - .NET V. PHP: Top 6 Reasons to Use .NET. 2010. [ONLINE] Available at: http://articles.sitepoint.com/article/v-php-top-6-reasons-use-net/2. [Accessed 03 November 2010].
[8] ASP.NET Master Pages. 2010. [ONLINE] Available at: http://msdn.microsoft.com/en-us/library/wtxbf3hh.aspx#HowMasterPagesWork. [Accessed 03 November 2010].
[9] Gruber, T. R. 1993. A Translation Approach to Portable Ontology Specifications. Knowledge Acquisition, 5(2):199-220. See also What is an Ontology? http://www-ksl.stanford.edu/kst/what-is-an-ontology.html and http://tomgruber.org/writing/ontolingua-kaj-1993.htm.
[10] Implementing AJAX in ASP.NET - developer Fusion. 2010. [ONLINE] Available at: http://www.developerfusion.com/article/4704/implementing-ajax-in-aspnet/. [Accessed 23 November 2010].

[11] Why Use .NET? 2010. [ONLINE] Available at: http://articles.sitepoint.com/article/why-dot-net. [Accessed 23 November 2010].
[12] 20 Excellent Free Rich-Text Editors | Webdesigner Depot. 2010. [ONLINE] Available at: http://www.webdesignerdepot.com/2008/12/20-excellent-free-rich-text-editors/. [Accessed 01 December 2010].
[13] MongoDB vs. Raven DB vs. MySQL. 2010. [ONLINE] Available at: http://vschart.com/compare/mongodb/vs/raven-db/vs/mysql. [Accessed 16 December 2010].
[14] JSON: The Fat-Free Alternative to XML. 2010. [ONLINE] Available at: http://www.json.org/xml.html. [Accessed 21 December 2010].
[15] JSON. 2010. [ONLINE] Available at: http://www.json.org/. [Accessed 21 December 2010].
[16] Agile Software Development Methods-Review and Analysis. 2011. [ONLINE] Available at: http://www.vtt.fi/inf/pdf/publications/2002/P478.pdf. [Accessed 23 January 2011].
[17] Spiral Methodology. 2011. [ONLINE] Available at: http://www.mariosalexandrou.com/methodologies/spiral.asp. [Accessed 24 January 2011].
[18] GATE.ac.uk - sale/tao/split.html. 2011. [ONLINE] Available at: http://gate.ac.uk/sale/tao/split.html. [Accessed 27 January 2011].
[19] Language Integrated Query - Wikipedia, the free encyclopedia. 2011. [ONLINE] Available at: http://en.wikipedia.org/wiki/Language_Integrated_Query. [Accessed 28 January 2011].
[20] MySQL Full-Text Search Rocks My World (by Jeremy Zawodny). 2011. [ONLINE] Available at: http://jeremy.zawodny.com/blog/archives/000576.html. [Accessed 28 January 2011].
[21] Raven DB. 2011. How Indexes Work. [ONLINE] Available at: http://ravendb.net/documentation/how-indexes-work. [Accessed 28 January 2011].
[22] Waterfall Model Advantages and Disadvantages. 2011. [ONLINE] Available at: http://www.buzzle.com/articles/waterfall-model-advantages-and-disadvantages.html. [Accessed 28 January 2011].

[23] 2011. [ONLINE] Available at: http://zone.ni.com/images/reference/en-XX/help/371361D-01/loc_eps_spiral_lifecycle.gif. [Accessed 28 January 2011].
[24] 10 Key Principles of Agile Development | All About Agile. 2011. [ONLINE] Available at: http://www.allaboutagile.com/10-key-principles-of-agile-software-development/. [Accessed 28 January 2011].
[25] 2011. [ONLINE] Available at: http://agileelements.files.wordpress.com/2008/05/agile-1pg.jpg. [Accessed 28 January 2011].
[26] 2011. [ONLINE] Available at: http://3.bp.blogspot.com/_vp5SDmx0Vr4/SbI07ldsmAI/AAAAAAAAACA/Vo8wTVB1Yhc/s320/Untitled.jpg. [Accessed 28 January 2011].
[27] BLACK BOX TESTING tutorial documentations. 2011. [ONLINE] Available at: http://www.testingbrain.com/BLACKBOX/BLACK_BOX_Testing.html. [Accessed 18 April 2011].
[28] stanlemon.net : jgrowl. 2011. [ONLINE] Available at: http://www.stanlemon.net/projects/jgrowl.html. [Accessed 20 April 2011].
[29] 2011. [ONLINE] Available at: http://www.jdl.co.uk/briefings/MVC.pdf. [Accessed 20 March 2011].
[30] Tag cloud - Wikipedia, the free encyclopedia. 2011. [ONLINE] Available at: http://en.wikipedia.org/wiki/Tag_cloud. [Accessed 22 April 2011].
[31] Search engine optimization - Wikipedia, the free encyclopedia. 2011. [ONLINE] Available at: http://en.wikipedia.org/wiki/Search_engine_optimization. [Accessed 22 April 2011].


Appendices

Appendix A: NLP Test

A comparison between Alchemy API and Open Calais

Appendix A.1: Sample text

Query fails when i try to insert data.

Expand Post »

Hello! I’ve got three tables that I want to link together. When I try to upload images and text

throught a form, the web browser show this error message:

Quote ...

Error, query failed 1452-Cannot add or update a child row: a foreign key constraint fails

(`mn000532_almacen`.`images`, CONSTRAINT `images_ibfk_1` FOREIGN KEY (`itemID`) REFERENCES

`items` (`itemID`) ON DELETE CASCADE)

The itemID field is used in my php script($_GET["itemID"]) to retrieve the all the info about a certain

item therefore it needs to be the same in all tables.

MySQL Syntax (Toggle Plain Text

CREATE TABLE categories (
    categoryID INT(5) NOT NULL,
    cat_Description VARCHAR(50),
    PRIMARY KEY (itemID)
)TYPE = INNODB;

CREATE TABLE items (
    itemID INT(5) NOT NULL AUTO_INCREMENT,
    categoryID INT(5) NOT NULL,
    itemName CHAR(25) NOT NULL,
    item_Description VARCHAR(255),
    price CHAR(10),
    contactName VARCHAR(50),
    phone CHAR(15),
    email VARCHAR(50),
    website CHAR (25),
    submitDate DATE NOT NULL,
    expireDate DATE NOT NULL,
    PRIMARY KEY(itemID,categoryID),
    INDEX (submitDate),
    FOREIGN KEY (categoryID) REFERENCES categories (categoryID)
    ON DELETE CASCADE
)TYPE = INNODB;

CREATE TABLE images (
    imagesID INT (5) NOT NULL AUTO_INCREMENT,
    itemID INT (5) NOT NULL,
    name VARCHAR (30) NOT NULL,
    size INT (11) NOT NULL,
    type VARCHAR (30) NOT NULL,
    pix MEDIUMBLOB NOT NULL,
    PRIMARY KEY(imagesID),
    FOREIGN KEY (itemID) REFERENCES items (itemID)
    ON DELETE CASCADE
)TYPE = INNODB;

Thank you

Hernan

Source:


Appendix A.2: Output of Open Calais


Appendix A.3: Output of AlchemyAPI

Field | Value
Language | english
Person (3) | mysql_errno(, mysql_error(, Rizzuti
FieldTerminology (1) | MySQL
TelevisionShow (1) | Thank
Concept Tags (3) | Foreign key, SQL, Delete
Tags (30) | Toggle Plain Text, error message, web browser, images, hidden input field, certain item, MySQL Syntax, itemID field, script, debug output, insert query, database, long time, right ones, Hello, foreign key, link, info, Plain, submit, input, dont, image table, waste, I'll, items table, foreign key constraint, anymore, orphan, orphan records
Category | Computers & Internet (confidence: 0.8999)
RSS / ATOM Feeds | http://www.daniweb.com/ex ..., http://www.daniweb.com/ex ...


Appendix B: Database Entities Schema


Appendix C: Front End Mock-up

Expert Mock-up

User Mock-up


Appendix D: Server Methods

These files are held in the appcode folder.

Database.cs

Method | Description
Database() (constructor) | Initializes the static fields of the docStore.cs class.

ProcessCall.cs

Method | Description
processCall() (constructor) | Initializes the processCall class.
void insertCall(string subject, string body, string id) | Takes the subject and body, extracts information from the call by submitting the data to Open Calais, and stores the extracted information together with the call.
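The fail-safe behaviour required by R5.2.1 (store the call unprocessed if Open Calais fails) can be sketched as follows. This is illustrative Python rather than the project's C# code; `insert_call`, `extract`, `store` and the record layout are hypothetical names chosen for the example, and the extractor is passed in as a plain function so that no real Open Calais API is assumed.

```python
def insert_call(subject, body, extract, store):
    """Store a call, attaching extracted entities only if extraction succeeds."""
    try:
        entities = extract(subject + "\n" + body)
    except Exception:
        # Fail-safe: the NLP service is unavailable, so keep the call unprocessed.
        entities = None
    record = {"subject": subject, "body": body,
              "entities": entities, "processed": entities is not None}
    store(record)
    return record

def failing_extractor(text):
    raise ConnectionError("service unreachable")  # simulated outage

saved = []
insert_call("Printer offline", "It will not print.", failing_extractor, saved.append)
print(saved[0]["processed"])
```

Keeping a `processed` flag on the stored record is one possible design choice: it would allow unprocessed calls to be found and resubmitted once the service is available again.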

directCall.cs

Method | Description
directCall() (constructor) | Initializes the static fields of the directCall class.
void directTOCategory | Takes the call ID and searches the categories; whichever category gets the highest count becomes the category assigned.
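The "highest count wins" rule described for directTOCategory can be sketched as below. This is an illustrative Python approximation, not the project's C# implementation: the real method searches the categories in the database, whereas here each category is assumed to have a simple keyword list. The function name and the keyword lists are hypothetical.

```python
def direct_to_category(extracted_terms, category_keywords):
    """Pick the category whose keywords match the most extracted terms."""
    terms = {t.lower() for t in extracted_terms}
    counts = {name: len(terms & {k.lower() for k in keywords})
              for name, keywords in category_keywords.items()}
    best = max(counts, key=counts.get, default=None)
    # No match at all means the call stays uncategorised.
    if best is None or counts[best] == 0:
        return None
    return best

# Hypothetical keyword lists; the real categories' contents are not given in the report.
keywords = {
    "Application Support": ["html", "asp.net", "mysql"],
    "Desktop Support": ["windows xp", "printer"],
    "Infrastructure Support": ["wi-fi", "wireless router"],
}
print(direct_to_category(["Wi-Fi", "Windows XP", "wireless router"], keywords))
```

Returning `None` when nothing matches corresponds to leaving the call in the "no category" group; ties are resolved in favour of the first category listed, which is a simplification.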

query.cs

Method | Description
String byStatus(int val) | Returns the status query for Lucene searching.

descriptions.cs

Method | Description
public string getCategoryDescription(int call_category) | Returns the category as a string.
public string getStatusDescription(int callstatus1) | Returns the status as a string.
public string getTypeofUpdateDescription(int typeofcallupdate) | Returns the type of update as a string.

Global.asax

Method | Description
void Application_Start(object sender, EventArgs e) | On application start, creates a new instance of Database.cs so that it is initialized only once.
void Session_Start(object sender, EventArgs e) | Creates a session variable isLoggedIn.

Appendix E: Page Methods

This section describes the functions used in webpages.

Default.aspx.cs

Method | Description
protected void userLogin_Authenticate(object sender, AuthenticateEventArgs e) | Authenticates the user by checking for the user in the database, and redirects accordingly if the details match.

Expert.aspx.cs

Method | Description
protected void Page_Load(object sender, EventArgs e) | On page load, loads all of the user data using the session variable, then fills the update grid and the grids of unassigned calls.
protected void typesofcall_SelectedIndexChanged(object sender, EventArgs e) | Updates the grid according to the status of the calls assigned to the user.

User.aspx.cs

Method | Description
protected void Page_Load(object sender, EventArgs e) | On page load, loads all of the user data using the session variable and fills all the grids on the page.
protected void submitcall_Click1(object sender, EventArgs e) | Stores the user's call and tells the user whether the call was stored successfully.

userCall.aspx.cs

Method | Description
protected void Page_Load(object sender, EventArgs e) | On page load, loads all of the user data using the session variable, then fills all the grids, data lists and labels on the page.
protected void SubmitResponse_Click(object sender, EventArgs e) | Stores the response and creates an update.

expertCall.aspx.cs

Method | Description
protected void Page_Load(object sender, EventArgs e) | On page load, loads all of the user data using the session variable, then fills all the grids, data lists and labels on the page.
protected void SubmitResponse_Click(object sender, EventArgs e) | Stores the response and creates an update.
protected void Delete_Click(object sender, EventArgs e) | Deletes the helpdesk call.
protected void save_category_Click1(object sender, EventArgs e) | Saves the category for the call and creates an update.
protected void ExpertUserList_SelectedIndexChanged(object sender, EventArgs e) | Stores or modifies the expert user of the call and creates an update.
protected void StatusDropDownList_SelectedIndexChanged(object sender, EventArgs e) | Stores or modifies the status of the call and creates an update.

Report.aspx.cs

Method | Description
protected void Page_Load(object sender, EventArgs e) | On page load, loads all the extracted information, groups it with a count for each group, and displays it in a tree format.
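The grouping and counting performed by the report page can be illustrated with a short sketch. This is Python written for illustration, not the project's C# code; the function names and the two-level layout (entity type, then entity value) are assumptions, and the sample pairs follow the style of the Open Calais output shown in Appendix G.

```python
from collections import Counter, defaultdict

def build_report_tree(entities):
    """Group (entity_type, value) pairs into {type: Counter({value: count})}."""
    tree = defaultdict(Counter)
    for entity_type, value in entities:
        tree[entity_type][value] += 1
    return tree

def render(tree):
    """Render the grouped counts as indented text lines."""
    lines = []
    for entity_type in sorted(tree):
        total = sum(tree[entity_type].values())
        lines.append(f"{entity_type} ({total})")
        for value, n in tree[entity_type].most_common():
            lines.append(f"  {value} ({n})")
    return lines

# Entity pairs in the style of the Open Calais output shown in Appendix G.
sample = [("Technology", "HTML"), ("OperatingSystem", "Windows XP"),
          ("Technology", "HTML"), ("Technology", "jsp")]
print("\n".join(render(build_report_tree(sample))))
```

Each top-level line would correspond to a node in the tree view, with its children ordered by how often the value was extracted.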

Search.aspx.cs

Method | Description
protected void Page_Load(object sender, EventArgs e) | On page load, requests the search value and fills the data list with what has been found using Lucene.

advancedSearch.aspx.cs

Method | Description
protected void Page_Load(object sender, EventArgs e) | On page load, requests the search value and fills the data list with what has been found using Lucene; also fills the expert user list.
protected void Button1_Click1(object sender, EventArgs e) | Searches according to the dropdown list value, then fills the data list.
protected void DropDownList1_SelectedIndexChanged(object sender, EventArgs e) | Displays the selected fields according to the dropdown list.

allcalls.aspx.cs

Method | Description
protected void Page_Load(object sender, EventArgs e) | Does nothing on page load.
protected void typesofcall_SelectedIndexChanged(object sender, EventArgs e) | Filters the application support calls according to the status chosen and fills the grid.
protected void typesofcall1_SelectedIndexChanged(object sender, EventArgs e) | Filters the desktop support calls according to the status chosen and fills the grid.
protected void typesofcall2_SelectedIndexChanged(object sender, EventArgs e) | Filters the infrastructure support calls according to the status chosen and fills the grid.
protected void typeofcall3_SelectedIndexChanged(object sender, EventArgs e) | Filters the calls with no category according to the status chosen.

Forgot.aspx.cs

Method | Description
protected void Page_Load(object sender, EventArgs e) | Does nothing on page load.
protected void RetrievePassword_Click(object sender, EventArgs e) | Queries the database for the user details; if successful, sets a label with the password.

Logout.aspx.cs

Method | Description
protected void Page_Load(object sender, EventArgs e) | Abandons the session (clears all session variables) and redirects to Default.aspx.

RegisterExpertUser.aspx.cs

Method | Description
protected void Page_Load(object sender, EventArgs e) | Does nothing.
protected void register_Click(object sender, EventArgs e) | Checks whether the user exists; if not, creates the user with the expert user type and sends a confirmation email.

RegisterUser.aspx.cs

Method | Description
protected void Page_Load(object sender, EventArgs e) | Does nothing.
protected void register_Click(object sender, EventArgs e) | Checks whether the user exists; if not, creates the user with the normal user type and sends a confirmation email.

Insertcallcategories.aspx.cs

Method | Description
protected void Page_Load(object sender, EventArgs e) | Inserts the helpdesk categories; should only be used if they have not been inserted already.


Appendix F: Front End Methods

This section describes the javascript methods used.

Top.Master and User.Master

Method | Description
function callTinyMCE() | Initializes the TinyMCE compressor and initiates the TinyMCE editor.
function UpdateTextArea() | Copies the contents of the TinyMCE editor into the textbox.
.tabs() | Creates tabs out of a div element.
.hide() | Hides the div element.

advancedSearch and SearchPage.aspx

Method | Description
function pageLoad(sender, args) | JavaScript method that runs on page reload; within it, checks for a partial page refresh and calls the truncate method.
function truncate() | Limits the amount of text shown on the search results page.


Appendix G: Testing Ontology

Category | Source (question taken from the website) | Tags on StackOverflow.com | Amekhania Output (Entity Type: Entity Value) | Category Assigned by Amekhania
Application Support | http://stackoverflow.com/questions/4820362/html-cross-browser-issues-on-website | html margin | Technology: HTML; Company: Deaglegame.net; URL: http://deaglegame.net/"; ProgrammingLanguage: HTML; URL: http://i.imgur.com/2SBP9.png" | Application Support
Application Support | http://stackoverflow.com/questions/3185437/exception-of-type-system-outofmemoryexception-was-thrown | mysql ibatis ibatis.net | N/A | Application Support
Application Support | http://stackoverflow.com/questions/2529012/how-to-rectify-this-issues-asp-net-pagerequestmanagerparsererrorexception | asp.net | IndustryTerm: search button im getting; ProgrammingLanguage: asp.net; IndustryTerm: search panel | Application Support
Desktop Support | http://stackoverflow.com/questions/4491580/annoying-ie8-dropdown-issue-in-xp-but-not-windows-7 | javascript css windows internet-explorer internet-explorer-8 | Technology: html; OperatingSystem: XP | Desktop Support
Desktop Support | http://stackoverflow.com/questions/5399603/issues-remoting-to-perfmon | performance statistics remote perfmon | N/A | Application Support
Desktop Support | http://superuser.com/questions/270899/scanner-directly-to-printer | ng usb scan | Product: A6; Technology: JPEG; Person: Document Size; Technology: Flash Memory | Application Support
Infrastructure Support | http://stackoverflow.com/questions/4927356/android-ndk-networking-problems-tcp-connection-fails | android sockets networking tcp ndk | IndustryTerm: software components; IndustryTerm: actual android device; ProgrammingLanguage: Java; Technology: Java; OperatingSystem: Linux; OperatingSystem: Microsoft Windows; OperatingSystem: BSD; OperatingSystem: Android; IndustryTerm: internet permissions; Technology: Linux; ProgrammingLanguage: C | Application Support
Infrastructure Support | http://superuser.com/questions/270923/setup-a-home-server | networking windows-7 windows-xp home-networking windows-networking | Technology: Wi-Fi; OperatingSystem: Windows XP; IndustryTerm: internet connection; Technology: file sharing; IndustryTerm: wireless capability; IndustryTerm: wireless router; Technology: wireless router; OperatingSystem: Windows 7 | Infrastructure Support
Infrastructure Support | http://stackoverflow.com/questions/1739167/using-mysql-on-tomcat-server | mysql database jsp | Company: mySQL; Position: driver; Technology: jsp | Application Support

Appendix H: Files Directory

The following describes the relevant directories and files of the web application. Each folder is listed in turn, followed by a table describing the files it contains.

/App Code Folder

File Name Description

query.cs Builds the status query for Lucene searching

processCall.cs Inserts the helpdesk call into the database and uses OpenCalais to extract and store the data into the database

directCall.cs Makes a search on categories, and assigns the category to the call if possible.

descriptions.cs Contains methods that return string descriptions for updates, categories and statuses.

database.cs Class that initializes the database connection

/EnumeratedTypes folder

File Name Description

callcategory.cs Enumeration for call categories

callstatus.cs Enumeration for call statuses

typeofchange.cs Enumeration for changes made to a helpdesk call

userType.cs Enumeration for the type of user

/RavenDB folder

This folder contains the entities for storage into the Raven database as described in the design

section.

File Name

about.cs
helpdeskcall.cs
helpdeskcalls.cs
helpdeskcategory.cs
helpdeskresponse.cs
helpdeskresponses.cs
helpdeskstatistic.cs
helpdeskstatistics.cs
helpdeskupdates.cs
helpdeskUser.cs
rdfentities.cs
rdfentity.cs
relationship.cs
relationshipdetail.cs
relationshipdetails.cs
relationships.cs
sdEntities.cs
sdEntity.cs
sdTopic.cs
sdTopics.cs

/Bin Folder

This folder is used to reference library files.

File Name Description

AjaxControlToolkit.dll This contains the library methods for using Microsoft’s Ajax control toolkit.

CalaisDotNet.dll This contains the library methods for using the OpenCalais service

Lucene.Net.dll This contains the library methods for full text searching via Lucene

Newtonsoft.Json.dll Used to read Json data in .Net, possibly in conjunction with Raven.Abstractions and Raven.Client.Lightweight dll files.

Raven.Abstractions-3.5.dll Library of methods used to communicate with the RavenDB database.

Raven.Client.Lightweight-3.5.dll Library of methods used to communicate with the RavenDB database.

/CalaisDotNet folder

This folder contains the source files that build the dll file, used to interact with the open Calais web

service.

/Images

File Name Description

ajax-loader.gif Animated GIF file used for Ajax

mainlogo.jpg Logo for Amekhania

mainlogo.png Same as above

/JQuery

File Name Description

Index.html Example page displaying sample code for Jquery generated UITheme

../css/Redmond

Contains the stylesheets for UITheme “Redmond”

../development-bundle

Included with the downloaded JQuery UITheme package

../js/

Contains the javascript files used to make the JQuery UITheme function.

../styles

File Name Description

main.css CSS file that styles the layout for the master page files

menu_style.css CSS file that styles the horizontal menu used in the master page files

verticalmenu.css CSS file that styles the vertical menus used in the master page files

/TinyMCE

../Examples

This folder contains examples of implementing the TinyMCE editor

../Themes

This folder contains themes for the TinyMCE editor

../Jscripts/TinyMCE

File Name Description

tiny_mce_gzip.aspx Compresses the tiny_mce.js file

tiny_mce_gzip.js Initializes the tiny_mce_gzip.aspx file

tiny_mce.js JavaScript file for the TinyMCE editor

tiny_mce_src.js Source file for the TinyMCE editor

../truncatable/

This folder contains a Jquery plugin that truncates text.

Web pages

File Name Description

advancedSearch.aspx Allows the expert user to make an advanced search for helpdesk calls

allcalls.aspx Displays all the calls in each category

Default.aspx Start-up page for logging into Amekhania

Expert.aspx The dashboard for the expert; displays update notifications, calls assigned to the expert, and any incoming helpdesk calls that are unassigned.

expertCall.aspx Displays the helpdesk call, all the responses and statistics, and allows the expert to change the properties of the call

forgot.aspx Retrieves the password for the user and expert

Global.asax Handles the web application start-up event; it initializes the RavenDB database variables.

insertcategories.aspx Additional page, inserts category descriptions

logout.aspx Removes all session variables

RegisterExpertUser.aspx Allows an expert to register

RegisterUser.aspx Allows a user to register

report.aspx Displays all the topics of discussion in the helpdesk

search.aspx Allows the user to search the helpdesk via keywords

user.aspx The dashboard for the user; displays update notifications and allows the user to create a call.

userCall.aspx Allows the user to view the call, responses, statistics and the extracted information.

Web.config Asp.Net generated configuration file, modified to increase the session state length.

Appendix I: Project Proposal

8.4.2 Abstract

The final year project will consist of a system that incorporates the concepts of flexible taxonomies and ontology extraction from text. In essence, it is a system that can crawl through text data such as problem reports and attempt to classify them and extract taxonomies, probably in a tree-like structure, so that, once extracted, the appropriate expert gets a quick overview of the thematic areas of the problems reported, discussed or resolved. The intention of the system is to manage the flow of communication between problem reporters and experts as effectively as possible, so that the right experts can be allocated to the right type of problems as quickly as possible.

8.4.3 Introduction

Every organisation needs support for its equipment and software to operate optimally, so that productivity within the workforce is not lost as a result of technical failures. This is why technical support departments exist in organisations: they provide advice on resolving issues or fix the issues hands-on. These could range from desktop support issues, such as computers breaking down, to in-house software faults, such as unexpected exception errors, which are typically dealt with by application support engineers or the software developers.

In order to report and communicate such problems effectively, the concept of a helpdesk [1] is used.

What I propose for the project is essentially a web application. It is not written exclusively for use by my former company; rather, it is a project that explores the use of ontology extraction from text, so that the application can use the results to attempt to auto-categorise the calls [2] and direct them to the appropriate expert.

The rationale for my proposal comes mainly from my background: I worked in a commercial environment for my industrial placement year as an application support engineer. Part of the role involved working on the helpdesk to help resolve issues with in-house developed software, which potentially meant fixing bugs and implementing patches. The helpdesk application I was using was an ASP.Net web application written in-house. The general helpdesk process was that if there was a “technical” problem, an employee (the problem reporter) would email a mailbox, and the web application would pick this up and display it on the web page. Whoever saw the email, which in helpdesk terms is referred to as a “call” [2], would log it into its correct category. Although this was quite a menial task, it was necessary so that each member of the team could work on the problems that matched their specific skills.

Also, from a technical point of view, semantic web technology is seen as the next phase of the web, the “Web 3.0” era, so investigating new concepts and then trying to implement new technologies is very exciting. The hope is that someday my application can provide a blueprint showing how communication can be managed effectively via the use of semantic web technology.

8.4.4 Background

This section describes the term ontology and looks briefly at the methods and tools available for extracting ontologies from text.

So why use an ontology [6]? It provides a common vocabulary for sharing information within a domain; an ontology allows a knowledge-based system to be built and can be presented in taxonomy form, a hierarchical structure.

I have found two services that use natural language processing techniques to extract knowledge from text: OpenCalais and Alchemy API.

OpenCalais [3], powered by Thomson Reuters, provides a free web service: you provide it with unstructured text documents and it attempts to extract knowledge from the text. A toolkit is available that helps the web application interact with the service and extract the required information.

Figure 1: Open Calais Architecture [3]

Alchemy API [4] is another free web service and offers a service similar to that provided by OpenCalais.

8.4.5 The proposed project

The aims of the project are as follows:

Provide a platform in which users can communicate a technical problem such that it can be

addressed by the appropriate expert(s) or department

A user may come across a problem with a specific application they are using. There could be many technical people available, but in order to solve the problem it needs to be directed to the most appropriate expert. This could take many levels of conversation between different members of the technical department, which is not productive because this time could be spent on valuable tasks such as programming. The problem should therefore be directed to the person who has extensive experience of, or has personally worked on, the application, so that they can fix the bug or problem.

Provide functionality such that the problem reports can get automatically logged and directed to

the appropriate expert with the use of ontology extraction.

In order for the problem report to be directed to the appropriate expert, the problem report will

need to be analyzed for keywords and concepts with the use of natural language processing tools,

such that the application can attempt to match the skills of an expert or group of experts that can

deal with the problem.
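The matching step described above could be sketched as follows. This is a minimal Python illustration rather than the project's C# implementation: the `Expert` record, its `skills` field and the `rank_experts` function are hypothetical names, and the scoring rule (count how many extracted terms overlap an expert's skills) is an assumption chosen for simplicity.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Expert:
    """Hypothetical expert record; the field names are assumptions."""
    name: str
    skills: Set[str] = field(default_factory=set)


def normalise(term: str) -> str:
    """Lower-case and trim a term so 'ASP.NET' and 'asp.net ' compare equal."""
    return term.strip().lower()


def rank_experts(entities: List[str], experts: List[Expert]) -> List[Tuple[str, int]]:
    """Score each expert by how many extracted entities match their skills.

    Returns (name, score) pairs for experts with at least one match,
    best match first; ties are broken alphabetically by name.
    """
    wanted = {normalise(e) for e in entities}
    scored = []
    for expert in experts:
        overlap = wanted & {normalise(s) for s in expert.skills}
        if overlap:
            scored.append((expert.name, len(overlap)))
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))
```

With entity values such as "Java" and "Android" extracted from a problem report, an expert whose skills include both would rank above one who lists only "Java". An empty result means no expert matched, in which case the call would stay unassigned.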

Provide a “taxonomized” view of the problematic areas in the form of a tree structure.

Problems will be categorised and grouped. The aim is that, if a problem has been directed successfully to the appropriate expert, the report can be categorised by the keywords used and by information about the user and the expert it was directed to, so that an overall diagrammatic representation of the thematic areas of the problems can be produced.
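One way to picture the "taxonomized" view is as a three-level tree: call category, then entity type, then entity value, with a count at each leaf. The sketch below is a Python illustration under that assumption; `build_taxonomy` and `render` are hypothetical helper names, and the three-level shape is a simplification rather than the structure the system actually stores.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Taxonomy = Dict[str, Dict[str, Dict[str, int]]]


def build_taxonomy(calls: Iterable[Tuple[str, str, str]]) -> Taxonomy:
    """Group (category, entity_type, entity_value) triples into a tree.

    The leaves hold how often each entity value was seen in a category.
    """
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for category, entity_type, value in calls:
        tree[category][entity_type][value] += 1
    # Convert the nested defaultdicts into plain dicts for easy comparison.
    return {c: {t: dict(v) for t, v in types.items()} for c, types in tree.items()}


def render(tree: Taxonomy) -> List[str]:
    """Return the tree as indented text lines, sorted alphabetically."""
    lines = []
    for category in sorted(tree):
        lines.append(category)
        for entity_type in sorted(tree[category]):
            lines.append("  " + entity_type)
            for value, count in sorted(tree[category][entity_type].items()):
                lines.append("    {} ({})".format(value, count))
    return lines
```

Feeding in triples such as ("Infrastructure Support", "OperatingSystem", "Windows 7") gives the expert an indented overview of which technologies recur in each category.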

8.4.6 Objectives

In order to meet the aims, below are the objectives of the project:

1. Produce a helpdesk web application through which the user and expert communicate about the problem

2. Make use of existing ontology extraction technologies, possibly Alchemy API or OpenCalais, or create my own, to help direct the problem to the appropriate expert

3. Make the web application searchable, so users can search for similar or past problems

4. Provide a summary page in which overviews of the problematic areas are displayed

5. Provide a statistical summary of the performance of the expert dealing with the problem

System Architecture

The system architecture is of a typical nature and will be made up of four core components:

1. Server Application held on the ASP.Net server

The web application will process the interaction between the users and experts, and it will use the ontology service to help deliver the aims of the project.

The results of the ontology service will be stored in the database so that, based on the results, the application can decide where the report should be directed.

As the point of communication, the web application will allow the user or expert to see the history of the actions undertaken and will store statistical information about the helpdesk calls. For the expert, it will provide a real-time view of the problematic areas, derived by polling the database.

2. Front end

Figure 3: System Architecture

As a web service, the front end will need to deliver key information in a presentable way that can be read by both the problem reporter and the expert. The reporter will have a more restricted view of the helpdesk, as it makes sense for them to see only the problems that they are facing, but they can still communicate using the helpdesk as a platform. The expert can view the calls assigned to them, change their status, view a summary page of the problematic areas, and respond via the helpdesk system.

Figure 4 shows an example of a potential view of the helpdesk system: a clean and simple layout in terms of interaction for the problem reporter.

3. Back-end database

This will provide the web application with a point of data access and storage, and will provide real-time data for the application to manipulate and query.

4. Ontology Service

The service is a natural language processing tool that will attempt to extract knowledge from text and return the information to the web application.
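The way these components fit together can be sketched roughly as below. This is an illustrative Python outline, not the project's ASP.Net code: `InMemoryStore` stands in for the back-end database, the `extract` argument stands in for the ontology service, and `process_call` and the keyword rule table are hypothetical names introduced only for this example.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# An extractor takes the call text and returns (entity_type, entity_value) pairs.
Extractor = Callable[[str], List[Tuple[str, str]]]


class InMemoryStore:
    """Stand-in for the back-end database (assumption for illustration)."""

    def __init__(self) -> None:
        self.calls: Dict[int, dict] = {}

    def save(self, record: dict) -> int:
        """Store a record and return its newly assigned call id."""
        call_id = len(self.calls) + 1
        self.calls[call_id] = record
        return call_id


def process_call(
    text: str,
    extract: Extractor,
    store: InMemoryStore,
    category_keywords: Dict[str, Set[str]],
) -> int:
    """Send the text to the ontology service, pick a category, store the call."""
    entities = extract(text)
    values = {value.lower() for _type, value in entities}
    category: Optional[str] = None  # None means the call stays unassigned
    for name, keywords in category_keywords.items():
        if values & keywords:
            category = name
            break
    return store.save({"text": text, "entities": entities, "category": category})
```

Passing the extractor in as a function keeps the sketch independent of any particular service, so a stub can replace the real ontology service during testing.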

Figure 4: Sample layout

8.4.7 Time plan to achieve aims and objectives

This is a general time plan for achieving what I want from the project; the aim is to finish the project before the deadline.

Research Ontology Extraction: The aim is to gain an understanding of the field, investigate current ontology extraction technologies and test whether the existing technologies can provide what I require of them.

Requirements Specification: Develop the functional requirements of the application such that there

is a firm foundation on what needs to be built to meet the aims of the project.

Design: Design the web application based on the requirements, and develop designs of the front

end.

Implementation: Create the web application, database and front end.

Testing: Test the system for bugs and try to fix them, and test whether the application delivers what it is intended to do.

Project Write-up: Write a report of the findings and the development of the system, together with documentation of the web application.

Figure 5: Task Breakdown

8.4.8 Technology/Resources required

In order to create this web application I will need to use Microsoft Visual Studio, which is already installed on the machine described below.

I will use my personal computer, which has the following specification:

Main Specs

OS Windows 7 64-bit

Hard-drive space 1 TB

Processor Intel i5 2.66 GHz Quad Core

The web application will need to interact with hosted web services such as the database and the ontology extraction service.

8.4.9 Languages/Frameworks

As I want to develop a web-based application, the ASP.Net framework will prove sufficient. I intend to write the web application in C#, as I have had some commercial experience of using this language and I intend to build on what I have learnt from my studies and during my work experience. From my experience, it is far easier to debug applications with the Visual Studio IDE than with, for example, the most realistic alternative, PHP, because not all IDEs support debugging for PHP and those that do usually require XDebug, a debugging tool that you attach to an IDE.

In terms of performance, ASP.Net is reported to be faster than PHP because it is compiled on the server whereas PHP is interpreted [7]. This is one of many reasons why I have opted for ASP.Net. Having said this, the programmer's aptitude has a fair amount of say in the performance of any web application, but I am confident that I can deliver an effective web application.

Also, from a design point of view, ASP.Net uses the concept of master pages [8], which allow for consistent page design; this is useful as you do not need to repeat the same piece of code over and over.

8.4.10 References

[1] Help desk - Wikipedia, the free encyclopedia. 2010. Available at: http://en.wikipedia.org/wiki/Help_desk. Last accessed 02 November 2010.

[2] Dray, J. 2008. Accuracy is the key to call logging. Available at: http://blogs.techrepublic.com.com/helpdesk/?p=334. Last accessed 02 November 2010.

[3] How Does Calais Work? | OpenCalais. 2010. Available at: http://www.opencalais.com/about. Last accessed 02 November 2010.

[4] AlchemyAPI. 2010. AlchemyAPI - Transforming Text Into Knowledge. Available at: http://www.alchemyapi.com/api/. Last accessed 03 November 2010.

[5] Composing the Semantic Web: Text Analysis with OpenCalais in TopBraid. 2010. Available at: http://composing-the-semantic-web.blogspot.com/2008/02/text-analysis-with-opencalais-in.html. Last accessed 03 November 2010.

[6] What is an Ontology? 2010. Available at: http://www-ksl.stanford.edu/kst/what-is-an-ontology.html. Last accessed 03 November 2010.

[7] Debate - .NET V. PHP: Top 6 Reasons to Use .NET. 2010. Available at: http://articles.sitepoint.com/article/v-php-top-6-reasons-use-net/2. Last accessed 03 November 2010.

[8] ASP.NET Master Pages. 2010. Available at: http://msdn.microsoft.com/en-us/library/wtxbf3hh.aspx#HowMasterPagesWork. Last accessed 03 November 2010.