hercules final


Upload: patrick-cameron

Post on 21-Apr-2015


The Hercules Parser

Patrick A. Cameron

Waikato University Student

[email protected]

Abstract

The Hercules parser provides a simple program for the investigation and application of patterns to discourse. The parser provides flexibility through scripting, whilst allowing an operator to understand the frames and concepts without a great deal of knowledge of computer programming languages and scripts. The SQL and XML integrations allow well-known query languages to form the basis for data extraction and manipulation. Wordnet is used as a basis for forming concepts and patterns thereof, allowing a vast number of already established relationships to be explored using techniques such as self-organizing maps and cluster analysis. Semantic roles, attributes, and hyponym hierarchies play a key role in word sense disambiguation, where pattern recognition forms underlying concepts which are reinforced by logical statements with the aim of reaching a true Artificial Intelligence, simulated by a computer.


Contents

Abstract
List of Figures
List of Tables
Acknowledgements
Section 1: Introduction
1.1 Context
1.2 Exposition Goals
1.3 Motivation
1.4 Report Chapters
Section 2: Background
2.1 Other attempts
2.2 A Conceptual Parser for Natural Language
2.3 Conceptual Dependency and Montague Grammar: A step toward conciliation
2.4 Schank/Riesbeck vs. Norman/Rumelhart: What’s the difference?
2.5 How a Neural Net Grows Symbols
2.6 A hybrid Approach to Word Sense Disambiguation: Neural Clustering with Class Labelling
2.7 A Generative Model for Semantic Role Labelling
2.8 Unsupervised Semantic Role Labelling
Section 3: System Overview
3.1 The Hercules AI Parser
3.2 Input / Output
3.3 Wordnet 1.17 Searches
3.4 The Hercules Data Interface
3.5 Sentence Structures and Base Node Hierarchies
3.51 Sentence Structures Created from User Input
3.52 The Frame Engine and Node Hierarchies
3.6 Query Engine
3.61 Frame Queries and Statistics
3.62 Frame Queries with Formulas
3.7 The AI Mind of Hercules
3.8 Forming basic concepts
3.9 Abstraction of Concepts
3.10 Concepts, purpose, reason and goals
3.11 Concepts forming reason of a living organism
3.12 Learning through abstract concepts
3.13 Underlying conceptual schemas and schema limitations
3.14 Schemas based upon CD theory
3.15 Critical Reasoning
3.16 Hierarchy for reasoning
3.17 Database Structures for reasoning
3.18 Script Formula examples for Critical Reasoning
3.19 Fact and Truth Corrections of the Databases
3.20 Setting the Database Data
3.21 Abstractions of real concepts for Analogy
3.22 Limitations on Abstract Concepts
3.23 Forming Analogies
3.24 Corrections and limitations on Analogy
3.25 Truth and Weight in analogy
Section 4: Experimentation, Results and Analysis
Section 5: Future Work
5.1 Algorithm Design
5.2 Hercules Parser Enhancements
5.3 Reaching the goal of True Artificial Intelligence
Section 6: Concluding Remarks


List of Figures

Figure 3.1: The Hercules AI Parser components and subsystems
Figure 3.2: Relationships between the Hercules Data Interface (HDI) and the data collected from Wordnet
Figure 3.3: sWord class object, pointers, attributes and metadata relationships
Figure 3.4: Sentence node hierarchies tokenized using white spaces populated with resultant HDI search data
Figure 3.5: Frame sWord object node hierarchies with frame metadata including formula data structures
Figure 3.6: Query creation process and commands
Figure 3.7: Flow chart for concept wave / fragment section boundaries
Figure 3.8: sWord data structure updates and construction using the query stack execution processes for testing and setting bit defined data attributes
Figure 3.8: Objects, Attributes, Actions, Distance, Time, Position, Actor, and Witness
Figure 3.9: Witnessing events in conversation assist in experience, learning and expectation
Figure 3.10: Wordnet Hyponym hierarchies of the statement in figure 3.9
Figure 3.12: Hercules is able to link existing frames to create a new pattern based on user input and store for later reference
Figure 3.13: Hercules will add weight to patterns recognised in prior communications such as that of figure 3.12
Figure 3.14: Hercules fills a basic concept container for objects and actions by recognizing the subject matter of the discourse
Figure 3.15.1: The hyponym hierarchy of Socrates for premise A
Figure 3.15.2: The hyponym node hierarchy premise A joined by relationship to the hyponym node hierarchy premise B
Figure 3.15.3: The hyponym and node hierarchy of premise A and B
Figure 3.15.4: The hyponym and node hierarchy of premise A and B and C
Figure 3.16: Socrates Node Hierarchy of Wordnet data using relationships
Figure 3.17: Example script for using the Hercules ISA method for testing the node hierarchy of Socrates
Figure 3.18: Socrates node Hierarchies and Mortal definition can be traced through node relationships
Figure 3.19: Script, data and methods for finding what mortal means for Socrates, or what anything means for anything if given the context
Figure 3.20: Hercules simulates an interesting and engaging manner in communications with others
Figure 3.21: The subjects removed from a sentence create a frame
Figure 3.22: The hyponym hierarchies forming the abstracted concept with the definition metadata
Figure 3.24: Shows the (is a) node relationships created by Hercules between the table elements of table 3.4
Figure 3.25: Shows the overall categorised and ranged concept in a reduced and understandable way
Figure 3.26: Illustrates the relationships created by Hercules using hyponym data and critical reasoning
Figure 3.27: The distinction made to the concept category where the concept becomes too abstract
Figure 3.28: A frame and concept and abstraction within a given range
Figure 3.29: The hyponym data for Cleopatra
Figure 3.30: The hyponym data for Socrates
Figure 3.31: An analogy where first subject of comparable discourse is abstracted
Figure 3.32: Hyponym hierarchies provided by Wordnet for person, rock, and cat
Figure 3.33: Heuristic substitution and abstraction using the hyponym hierarchy of a particular word sense
Figure 3.34: Shows the comparison of hyponym hierarchies of Cleopatra, Socrates, and a Rock
Figure 3.35: Is the frame of the concept analogy of figure 3.31
Figure 3.36: Syntax structures of concepts and frames combined
Figure 3.37: Syntax for concepts and frames with category information included
Figure 3.38: Statement of fact provided by a person
Figure 3.39: Rock hyponym hierarchy with the object category of distinction
Figure 3.40: The expanded concept frame to a table or array of data
Figure 3.40: Illustrates the distinction drawn from the user input of figure 3.38 will deactivate categories of the analogy and concept
Figure 3.41: The upper and lower limits of analogy in context with relationships and attributes
Figure 3.42: The Analogy Upper Limit
Figure 3.43: The Analogy Lower Limit


List of Tables

Table 3.1: Concept score analysis Table
Table 3.2: The binary signature of a concept signature of Table 3.1
Table 3.3: The hyponym hierarchies for “Socrates is a man” using frame “* is a *”
Table 3.4: Shows the 2 x 7 Matrix of concept combinations of table 3.3

Acknowledgements

I would like to thank my supervisor, Dr. Tony C. Smith, for his help in guiding my studies and helping me to explain my project to others. Tony has inspired, encouraged and challenged my views, whilst guiding me in explaining my research. Tony’s optimism and, need I say, occasional devil’s advocacy have made for an interesting philosophical journey into Artificial Intelligence research and design.

Section 1: Introduction

This report is a general exposition of the workings of, and theory behind, the Hercules parser. The Hercules parser has been under development for 3 years and is still being developed. Numerous features and functions have been integrated into the parser to provide a basis for a computer to learn from and communicate with people.

This section provides a brief introduction to the context, goals, and motivation of this expositional report on the Hercules Parser, together with an overview of its chapters.

1.1 Context

Since the invention of the computer, there has been an enduring fascination with the idea of Artificial Intelligence. The idea that a person can communicate with a computer and have the computer understand and respond has applications too numerous to describe. Taking the general view that having a computer understand and assist people with their lives will be beneficial for those concerned, I have created the Hercules parser to investigate how this may be achieved. This report sets out the steps, processes, and theory I have investigated in providing such a system.

1.2 Exposition Goals

The aim of this exposition is to describe the workings of the Hercules Parser and the theories underlying it, and to describe how the parser can be used with Wordnet and other databases so that further research may be carried out using the Hercules Parser as the platform to achieve conclusive scientific research and findings.

• Describe the compositional structures of the parser
• Describe the data structures of the parser
• Describe the execution of script operations by the parser
• Describe the databases of the parser
• Describe the theory behind using Wordnet hyponym hierarchies in concept design
• Describe the theory behind using frames with a parser to create concept objects using Wordnet hyponym hierarchies
• Describe how critical reasoning can be used to supplement the node hierarchies of Wordnet
• Describe how analogy may be formed from concepts derived from the Wordnet hyponym hierarchies
• Describe in general how the Hercules parser can assist in making a hypothesis surrounding the meanings of conversation where an algorithm can be applied to control program flow for interpretation of communications
• Describe future work that can flow on from the exposition of the Hercules Parser

1.3 Motivation

The motivation for creating the Hercules parser began with an attempt to leverage the information stored within Wordnet so that a computer may talk to a person.

The aim was to allow a person to ask Hercules questions and have Hercules respond in an interesting way. Because of the large amount of information in Wordnet and the availability of the code in C++, Wordnet became the logical starting point for investigating Artificial Intelligence. Because C++ was the default language in the code libraries of Wordnet, it was also an attractive starting point from the perspective that processing and memory overhead would be reduced due to the nature of the C++ language, where raw power, flexibility, and direct hardware access may be required. Due to constraints in time and the complexities of interfacing with .Net databases and libraries, managed class objects, XML, and Windows Forms have been integrated into the previously command-line-based application. The Wordnet 1.17 code has been altered significantly to incorporate the class objects of the Hercules parser. Further task-specific, class-object-based engines have been designed to handle the core components and the functionality they provide.

With construction of the Hercules parser prototype nearly complete, what remains is to explore the relationships of data in communications, identify the patterns used for intelligent communication between individuals, and apply them to form an artificial intelligence within a computer for the benefit of assisting a person.

1.4 Report Chapters

Section 1: Provides an overview of the report.

Section 2: Provides a general background to the research documents that have contributed to the ideas and concepts on which the Hercules parser is based. Some of the material has been considered in the construction of the parser so that the theories or findings of those articles may be explored with a functional parser and with databases serving as a statistical repository.

Section 3: Discusses and expands on the goals listed in section 1.2. The goals are not set out individually, but are interrelated and addressed in the subsections under each topic.

Section 4: Discusses experimentation, results and analysis. Because the Hercules parser first had to be designed to run the experiments, only limited experimental work has been carried out so far; research will continue once the prototype has been completed.

Section 5: Identifies future work to be done in the areas of algorithms, parser enhancements, and the final goal of true Artificial Intelligence.

Section 6: Presents concluding remarks and observations surrounding the Hercules Parser and the exposition within this report.

Section 2: Background

The systems and reference documents listed below have assisted in creating and supporting the underlying concepts that the Hercules Parser attempts to encompass.

2.1 Other attempts

There have been numerous earlier attempts at designing an artificially intelligent machine. They include the “CYC Project” by Douglas Lenat, “A.L.I.C.E.” by Dr. Richard S. Wallace and “Eliza” by Joseph Weizenbaum. More recent attempts include “Jabberwacky” by Rollo Carpenter, which has competed well in the Loebner Prize, a competition built around attempts to pass the Turing test.

2.2 A Conceptual Parser for Natural Language

“A conceptual parser for natural language” by Roger C. Schank and Lawrence G. Tesler describes an operable automatic parser for natural language. It is a conceptual parser, concerned with determining the underlying meaning of the input by utilizing a network of concepts explicating the beliefs inherent in a piece of discourse.

2.3 Conceptual Dependency and Montague Grammar: A step toward conciliation

“Conceptual Dependency and Montague Grammar: A step toward conciliation” by Mark A. Jones and David S. Warren contrasts and reconciles the CD theory of Schank’s conceptual parser in section 2.2 with the logic system of Montague Grammar using a sorted hierarchy and typed lambda calculus.

2.4 Schank/Riesbeck vs. Norman/Rumelhart: What’s the difference?

“Schank/Riesbeck vs. Norman/Rumelhart: What’s the difference?” explores the fundamental differences between two sentence parsers and how each handles keywords, frames and expectations. The paper focuses more specifically on the operational level, but it is thought-provoking where it shares similarities with the Hercules Parser.

2.5 How a Neural Net Grows Symbols

“How a neural net grows symbols” by James Franklin illustrates how clustering may be used in conjunction with a neural net for data reduction, a combination that is ideal for AI implementations.

2.6 A hybrid Approach to Word Sense Disambiguation: Neural Clustering with Class Labelling

“A hybrid approach to word sense disambiguation: Neural Clustering with class labelling” by Steve Legrand and JRG Pulido combines a neural algorithm with the Wordnet lexical database to semi-automatically label groups of items clustered in a multi-branched hierarchy, illustrating the use of neural algorithms together with ontological knowledge in word sense disambiguation tasks.

2.7 A Generative Model for Semantic Role Labelling

“A Generative Model for Semantic Role Labelling” by Cynthia Thompson, Roger Levy, and Christopher Manning uses the FrameNet semantic role and frame ontology for identifying semantic roles. To quote from it, “the paper attempts the task of learning to automatically assign such roles. Identifying such roles and the relationships between them can in turn serve as support for inference about a sentence’s meaning, for antecedent resolution, or for other understanding or parsing tasks such as prepositional phrase attachment or word sense disambiguation. [...] FrameNet corpus and apply it to the task of automatic semantic role and frame identification. This paper develops a generative model from which one can infer role labels, given sentence constituents and a word from that sentence that is a predicator, which takes semantic role arguments”

2.8 Unsupervised Semantic Role Labelling

“Unsupervised Semantic Role Labelling” by Robert Swier and Suzanne Stevenson: to quote from it, they “present an unsupervised method for labelling the arguments of verbs with their semantic roles using an algorithm which makes initial unambiguous role assignments, and then iteratively updates the probability model on which future assignments are based.”

Section 3: System Overview

3.1 The Hercules AI Parser

The Hercules AI parser has been created to allow a person to converse with a computer. Figure 3.1 illustrates an overview of the Hercules AI Parser components and subsystems. Hercules uses basic concepts, critical reasoning and analogy to form a calculated hypothesis about what is being said. Hercules is pre-programmed with sufficient concepts and rules to allow the meanings of conversation to be explored. Hercules is also a goal-oriented parser, able to assist people with the tasks they wish to complete. The Hercules parser is divided into a number of components that assist in understanding communications and tasks.

Figure 3.1: The Hercules AI Parser components and subsystems

3.2 Input / Output

Hercules receives input from a person and responds to the person in an intelligent way. The communications between Hercules and the person provide an experience that Hercules can learn from. Hercules uses Microsoft Windows Narrator to read its output from a command prompt. Microsoft Windows Speech Recognition or a keyboard allows a user to provide text information to Hercules via the command prompt.

[Figure 3.1 elements: Hercules Parser; WordNet 1.17 C++; Input / Output; Hercules Data Interface; Frame Engine; Query Engine; Critical Reasoning Database; Concept Database; Memory Database; Query Database; Analogy Database]

3.3 Wordnet 1.17 Searches

Wordnet 1.17 provides information to Hercules regarding:

• Ontologies of hyponymy (Is A – relationship)

• Ontologies of meronymy (Has A - relationship)

• Word sense information including the definitions of those senses

• Part of speech information

• Synonyms

The Wordnet 1.17 C++ program code has been modified to run multiple searches to provide the information Hercules requires. Hercules can be modified to use any of the Wordnet searches to retrieve information from the Wordnet databases. Normally Wordnet runs a search on a single word and returns specific search data depending on the search type. The Wordnet code libraries have been modified for Hercules to run five searches per word instead of one. The information normally output to the user for each separate search is collected in a customised data structure called the Hercules Data Interface.

3.4 The Hercules Data Interface

Figure 3.2 illustrates the hierarchical relationships and flow of data between the Hercules Data Interface (HDI) and the data collected from Wordnet. The HDI is the container structure for all of the information retrieved from the Wordnet searches. When each search is run using the Wordnet libraries, custom modifications to the code populate the HDI with the Wordnet output data. Once the data has been collected for the words of the sentence using section 3.3, the data is attached directly to the words of the sentence as described in section 3.5.
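As an illustration of the container described above, the following minimal Python sketch gathers the results of five searches for one word into a single HDI-like record. It is not the modified Wordnet C++ code: the names `HerculesDataInterface`, `collect`, `toy_search`, `SEARCH_TYPES` and `LEXICON` are assumptions made for this example, and the lexicon entries are placeholder data rather than real Wordnet output.

```python
from dataclasses import dataclass, field

# The five kinds of information section 3.3 lists for each word.
SEARCH_TYPES = ("hyponyms", "meronyms", "senses", "pos", "synonyms")


@dataclass
class HerculesDataInterface:
    """Hypothetical container for everything retrieved for one word."""
    word: str
    hyponyms: list = field(default_factory=list)
    meronyms: list = field(default_factory=list)
    senses: list = field(default_factory=list)
    pos: list = field(default_factory=list)
    synonyms: list = field(default_factory=list)


# Toy stand-in for the Wordnet databases (placeholder data only).
LEXICON = {
    "man": {
        "hyponyms": ["person", "organism", "entity"],
        "meronyms": ["body"],
        "senses": ["an adult male person"],
        "pos": ["noun"],
        "synonyms": ["adult male"],
    }
}


def toy_search(word, search_type):
    """Return the toy results of one search type for one word."""
    return LEXICON.get(word.lower(), {}).get(search_type, [])


def collect(word, search=toy_search):
    """Run all five searches for a word and store the results in one HDI."""
    hdi = HerculesDataInterface(word)
    for search_type in SEARCH_TYPES:
        setattr(hdi, search_type, list(search(word, search_type)))
    return hdi
```

Calling `collect("man")` returns one record holding all five result lists, which is the property the text relies on when the data is later attached to a word of the sentence.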

Figure 3.2: Relationships between the Hercules Data Interface (HDI) and the data collected from Wordnet

[Figure 3.2 elements: Hercules Data Interface; Meronym Tree; Hyponym Tree; Word Senses; Part of Speech; Synonyms; Wordnet]

3.5 Sentence Structures and Base Node Hierarchies

The parsing functions of Hercules use sWord class objects as their default containers. The sWord class object allows a number of linked lists to be formed in node hierarchies. Figure 3.3 illustrates the relationships of the metadata to the sWord node. Instead of having multiple objects of differing types, extensions to the class attributes are added as pointers to other data structures, which then define the types. The presence of a particular pointer determines the parsing function that may be used. Parsing functions or methods are based upon set theory and predicate logic; the resulting formulas use attributes to identify the supersets and subsets, and logical assertions. Node objects can then be parsed according to a given formula, where mathematical symbols are mapped to the processes to be carried out on data, including the relationships between data. As the sWord data structure can be used in many ways, only a general overview is given here: Figure 3.3 shows the node pointer types that can be used to order the hierarchies.

Figure 3.3: sWord class object, pointers, attributes and metadata relationships

[Figure 3.3 content]
sWord: Word-Use, Word-Options, Tenses, Hyponym, Meronym, Senses, POS, Synonyms, Next-sWord, Next-Lists, Phrase-List, Next Phrase-Lists, Frame Metadata, Wordnet Metadata
Frame-Meta: Categories, Weights, Thresholds, […] Formulas
Wordnet-Meta: Meronym Tree, Hyponym Tree, Definitions, Synonyms, Senses
sWord-Next: Link to next sWord “node” at same level
sWord-Next-List: Link to next sWord “list” at same level
sWord-Phrase: Link to next sWord “Phrase-list” at same level
sWord-Next-Phrase-List: Link to next sWord “list” at the next node level
sWord-Data: sWord data resulting from searches and metadata

3.51 Sentence Structures Created from User Input

Figure 3.4 illustrates the sWord node and data hierarchies where Hercules receives text input from the user via a command prompt. This hierarchy is also used for any text or phrase that Hercules parses, including the text loaded from databases.

1. Input is received from the user.
2. A first sWord class object node is created and contains the full text of the phrase. The words of the sentence in the char buffer are separated by white spaces, as in natural language.
3. Hercules then separates each word of the discourse in the first node’s char buffer into the char buffer of a separate sWord class object, using the white spaces as the tokens or delimiters.
4. The first word of the new linked list of separated words is linked to the first node.
5. A search is run consecutively for each word in the list using section 3.3 to provide the data of section 3.4.
6. The resultant search data of section 3.4 is attached to each word of the linked list after each search of section 3.3.

Figure 3.4: Sentence node hierarchies tokenized using white spaces populated with resultant HDI search data
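The six steps of section 3.51 can be sketched in a few lines of Python. This is a simplified illustration, not the C++ sWord class: the class `SWord`, its field names, and the helpers `build_sentence` and `words` are assumptions for this example, and the `search` argument stands in for the Wordnet searches of section 3.3.

```python
class SWord:
    """Simplified stand-in for the sWord node (field names are assumptions)."""

    def __init__(self, text):
        self.text = text      # char buffer: a whole phrase or a single word
        self.next = None      # next sWord at the same level
        self.phrase = None    # first word of the list below a phrase node
        self.data = None      # search results attached to a word node


def build_sentence(text, search=None):
    """Create the phrase node, split on white space, link and annotate words."""
    root = SWord(text)                 # step 2: node holding the full text
    previous = None
    for token in text.split():         # step 3: white space is the delimiter
        node = SWord(token)
        if search is not None:
            node.data = search(token)  # steps 5 and 6: search, then attach
        if previous is None:
            root.phrase = node         # step 4: link first word to first node
        else:
            previous.next = node
        previous = node
    return root


def words(root):
    """Walk the linked list of word nodes below a phrase node."""
    result = []
    node = root.phrase
    while node is not None:
        result.append(node.text)
        node = node.next
    return result
```

For example, `words(build_sentence("Socrates is a man"))` walks the list and yields the four word nodes in sentence order, while the phrase node keeps the full text.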

3.52 The Frame Engine and Node Hierarchies

Hercules includes a frame engine which loads frames of text into the memory of the computer. Figure 3.5 illustrates the partial node hierarchies and metadata structures resulting from loading the frame tables of the database. The frame engine loads the data from a database; the frames of text and any associated Frame-Metadata for each frame are then available to compare against sentence information.

1. Hercules checks the Word-Sets Table of the main database to determine which tables are to be treated as frame tables.
2. A node hierarchy of sWords is created: first by the table name, second by the full text frames as a phrase.
3. The full text frames are then separated into a linked list of sWords as in section 3.51, except optionally without the Wordnet-Metadata, carrying the Frame-Metadata instead.
4. The Frame-Metadata remains attached at a higher-level sWord node with the formula to be run if conditions dictate.
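To make the idea of a frame with an attached formula concrete, the sketch below matches a sentence against a frame such as “All * are *” and then applies a formula such as “SETISA 2 is 4”. The report does not define the matching rules or the formula semantics, so both are assumptions here: `*` is taken to match exactly one word, and the numbers in the formula are taken to be 1-based word positions in the matched sentence. The function names `match_frame` and `apply_setisa` are hypothetical.

```python
def match_frame(frame, sentence):
    """Assumed rule: word-by-word comparison where '*' matches any one word."""
    frame_words = frame.split()
    sentence_words = sentence.split()
    if len(frame_words) != len(sentence_words):
        return False
    return all(
        f == "*" or f.lower() == s.lower()
        for f, s in zip(frame_words, sentence_words)
    )


def apply_setisa(formula, sentence, isa):
    """Assumed reading of 'SETISA 2 is 4': word 2 of the sentence is a word 4."""
    operator, left, _, right = formula.split()
    if operator != "SETISA":
        raise ValueError("unsupported formula: " + formula)
    sentence_words = sentence.split()
    child = sentence_words[int(left) - 1]
    parent = sentence_words[int(right) - 1]
    isa.setdefault(child, set()).add(parent)   # record the is-a relationship
    return child, parent
```

With the frame “All * are *”, the sentence “All men are mortal” matches, and the formula then records that “men” is a “mortal” in the `isa` dictionary; “Socrates is a man” does not match that frame.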

[Figure 3.4 content: the phrase node “Socrates is a man” links to the word nodes Socrates, is, a, man; each word node carries Hyponym, Meronym, Senses, POS, and Synonyms data.]

Figure 3.5: Frame sWord object node hierarchies with frame metadata including formula data structures

[Figure 3.5 content]
Hercules Databases: HASA, ISA, Ability, Hercules-Memory, Concepts, Analogy, Critical-Reasoning, Word-Sets, Queries...
Check Word-Sets: frame tables ISA, HASA, Ability
Frame Table ISA: frame “All * are *”; Frame Metadata Reference:0, Category:20, Order:0, weight:0, …; Formula: “SETISA 2 is 4”
Frame Table HASA: frame “Has a * a *”

3.6 Query Engine

Hercules includes a query engine which processes text-based scripts into a chain of query objects. The query objects then allow Hercules to test conditions and carry out operations against the sentence data in a particular sequence. The query returns success if the conditions are met. Figure 3.6 illustrates the query creation process, where the query information is read from the Pattern-Query database table.

1. The query is read from the Pattern-Query Table of the database.
2. The scripted operators are converted into a bit-category to signify the operations to be carried out on particular data.
3. The scripted conditions are set in the query class objects to signify the particular data to be tested for.
4. Each section of the query creates a query object to be stacked for execution of the operations against the conditions being tested for.

Figure 3.6: Query creation process and commands

[Figure 3.6 content]
Hercules Databases: HASA, ISA, Ability, Hercules-Memory, Concepts, Analogy, Critical-Reasoning, Word-Sets, Pattern-Query, ...
Database Table: Pattern-Query: “wordcount 2, word 1 is solid, word 1 is instance, word 2 is solid, word 2 maybe noun, Set word 2 guess, Set word 2 noun”
Metadata: Rank:0, Weight:0, Threshold:0, Link:0, Category:20, Active:0
Process steps: Check for Operators; Set Conditions; Create Query Stack
Standard Query Operator Actions: Is, Not, Maybe, Not-Maybe, And, Or, Like, Starts, Ends, Contains, If, Then, Break, Go-to, Link, Last, Set, Finished
Frame Query Operator Actions: IsA, HasA, InA, Formula, Frame-Item
Query Conditions: Adjective, Adverb, Instance, Verb, Noun, Tense, Frame, Solid, Guess, Present, ID
Query Stack: Word 1 Is: Solid; Word 2 Is: Instance; Word 2 Maybe: Noun; Word 2 Set: Guess; Word 2 Set: Noun

Once the query is created and added to the stack, it can be executed against the sentence. A success is returned if the whole query executes; Hercules then continues executing the remaining queries in the stack. Standard logical, set, or mathematical calculations are performed as processes of the query. Because Hercules is able to perform many kinds of processes on almost any type of information, pattern recognition using statistics needs to be investigated in order to determine how best to use the English language to communicate. Hercules is able to use the patterns existing in the Wordnet domain categories as the basis for concept recognition in text. The domain categories can assist in identifying a concept, whilst the sense of a word or words provides the definitive value of any expression. This is because a word expressed by a person in a particular sense usually has a determinate meaning, even if that meaning is subjective between individuals. Hence the actual meaning of a word such as “enjoy” is provided once the correct sense is discovered. Queries and statistics are therefore used to discover concepts and the sense of a word.

3.61 Frame Queries and Statistics

Where a frame indicating the possibility of a concept is discovered in a sentence, a statistic can be generated to assist with understanding the context of the subjects and therefore the sense of the words. As frames have already been categorised in Hercules to a particular concept, the frames can help identify the probability of the word senses and subjects of the sentence. The column headings of table 3.1 below show that Personal, Tense, Movement, Ability, and Accomplish are some of the concept categories available in Hercules. Concept categories provide a very rough basis for understanding a sentence using statistics. The discourse “Socrates found and ingested an antidote to save his life” yields table 3.1 when 5 concept categories using non-specific discourse concept identifiers are applied against the whole sentence. Later on, with further development in pattern recognition, surrounding sentences can also assist in identifying the context of the words. However, for the purposes of illustration, a smaller context is explored in table 3.1.

“Socrates found and ingested an antidote to save his life”

Personal  | # | Tense    | # | Movement  | # | Ability    | # | Accomplish | #
Socrates  | 1 | * was *  | 0 | * ingest* | 1 | * *ed to * | 1 | * made *   | 0
Cleopatra | 0 | * is *   | 0 | * to *    | 1 | * can *    | 0 | * did *    | 0
his       | 1 | * will * | 0 | * *ed *   | 1 | to *       | 1 | * can      | 0
her       | 0 | *ed      | 1 | * went *  | 0 | * ate *    | 0 | found * to | 1

Table 3.1: Concept score analysis table

Table 3.1 shows that each time a match is discovered under a particular concept category indicator, a score is generated. The resulting concept scores are counts; for the table above they are Personal: 2, Tense: 1, Movement: 3, Ability: 2, Accomplish: 1. This statistic, the frequency with which a concept is possibly present, can be represented in a flow chart. The boundaries of the concept are then established in figure 3.7 as a section of a wave.
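The category scores quoted above are simply the column totals of the match flags in Table 3.1. A minimal sketch of that arithmetic follows; the names MATCHES and concept_scores are illustrative, and the flags are copied from the table rather than produced by a real frame matcher, since the wildcard semantics of the indicators are not defined in this section.

```python
CATEGORIES = ["Personal", "Tense", "Movement", "Ability", "Accomplish"]

# Match flags copied from Table 3.1: one row per indicator line,
# one column per concept category (1 = the indicator matched the sentence).
MATCHES = [
    [1, 0, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 1, 1, 0],
    [0, 1, 0, 0, 1],
]


def concept_scores(matches, categories):
    """Sum each column to obtain the score of the corresponding category."""
    return {cat: sum(row[i] for row in matches) for i, cat in enumerate(categories)}


scores = concept_scores(MATCHES, CATEGORIES)
for category, score in scores.items():
    # A crude text rendering of the "wave": one '#' per point of score.
    print(f"{category:<10} {score} {'#' * score}")
```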


[Figure 3.7 plots the Concept Score (vertical axis, 0 to 5) against the Concept Category (horizontal axis: Personal, Tense, Movement, Ability, Accomplish).]

Figure 3.7: Flow chart for concept wave / fragment section boundaries

As figure 3.7 provides the boundaries of the possible concepts of a particular category, it is possible to compare concepts to each other, or to measure a concept based on the probability with which its subjects are used in successful communication. Successful communication is established later by re-communicating learned information back to a person and then establishing relationships between the data. The boundaries or wave of a concept can represent a fragment of a concept, where categories are included in or excluded from the query run against the discourse.

Table 3.1 also allows a frame signature to be established. The frame signature of table 3.1 is not the score, but rather the binary representation of the presence of the frames found in a particular category. There may be many identifiers within a section of discourse that indicate a concept; however, the presence of an identifier in a category allows a binary concept signature to be formed and used in conjunction with the concept wave or concept fragment boundaries of figure 3.7. The scores of Table 3.1 may also weight a signature where signatures appear to be the same in binary representation but differ in score, and therefore in weight. This extra score allows a pattern to further distinguish concepts, so that the overall concept can be appropriately weighted during pattern recognition in related discourse.

Personal | Tense | Movement | Ability | Accomplish
1        | 0     | 1        | 1       | 0
0        | 0     | 1        | 0       | 0
1        | 0     | 1        | 1       | 0
0        | 1     | 0        | 0       | 1

Table 3.2: The binary signature of a concept signature of Table 3.1

Table 3.2 illustrates that for each concept of Table 3.1 that is present, a bit is set to 1 for that category. If no indicator of a particular concept category is present, the cell for that category is set to 0.

Where many concepts exist within discourse and are tested for using a query, the signatures may be stacked atop each other and re-ordered by category, score, and presence, using an algorithm for sorting the categories. The algorithm is discussed later in section 5 as future work. It would also be interesting to use a neural network to identify the patterns present in communications where a score and binary signature can be identified.
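The binary signatures of Table 3.2 and the idea of stacking and re-ordering them can be sketched as follows. The sort order used here (highest score first, then signature, then row number) is purely an assumption for illustration, because the actual sorting algorithm is deferred to future work; the function names are likewise hypothetical.

```python
# Presence flags copied from Table 3.2, one row per indicator line of Table 3.1.
PRESENCE = [
    [1, 0, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 1, 1, 0],
    [0, 1, 0, 0, 1],
]


def signature(flags):
    """Render a row of presence flags as a binary signature string."""
    return "".join("1" if flag else "0" for flag in flags)


def stack_signatures(rows):
    """Stack (signature, score, row number) tuples and re-order them.

    Assumed ordering: descending score, then signature, then row number.
    """
    stacked = [
        (signature(flags), sum(flags), number)
        for number, flags in enumerate(rows, start=1)
    ]
    return sorted(stacked, key=lambda item: (-item[1], item[0], item[2]))


stacked = stack_signatures(PRESENCE)
for sig, score, number in stacked:
    print(f"row {number}: signature {sig}, score {score}")
```

Rows 1 and 3 share the signature 10110, which is the situation in which the text suggests an additional score or weight is needed to tell signatures apart.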


Where repeated patterns are identified in communications, and the senses of the words making up the pattern are discovered, a Hidden Markov model can be used to identify the concept category, semantic role, or other delineable class of word, type or category. As repeated patterns indicate a probability, those patterns must be tabled, theoretically into a Bayesian network where the probability can be deduced from the statistical relationships of the words in the discourse. Hercules provides a platform for flexible algorithms, identifiable patterns, tables of probabilities of expected relationships, concept categories, weighting scores, concept fragments and signatures. Together these can theoretically assist Hercules to identify, in this example, that a person is moving to perform an ability, which will in turn help in determining the senses of the subjects of the discourse. In order to provide such a flexible platform for exploring the meanings of communications, Hercules uses formulas and procedures that can be executed when a particular frame is identified within the discourse.

3.62 Frame Queries with Formulas

Where more complex data operations are required, a frame allows a formula to be executed. The formula allows the sWord node hierarchies to be traversed, and query operations to be carried out on the nodes returned. Formulas can be constructed to test any attribute or node within any database or the memory of Hercules. It is logical to use well known and established formula notations such as those found in set theory and predicate logic, since mathematicians and linguists are familiar with the symbols and what they represent. Executing a procedure of Hercules by parsing a script that follows a common notation for grouping data simplifies the creation, explanation, and implementation of established formulas. Other common formulas, which are actually processes, have been created to access specific data. The current processes for accessing the nodes reside in the formula section of the metadata. Formulas and processes are closely related in Hercules because in reality the processes return or manipulate a subset of data in the node hierarchy. Formulas can also be attached to a frame so that appropriate algorithms can balance the weight of the data where a frame is matched. This means that where a statistic is set at a particular level for a context, that statistic is demoted or promoted in weight based upon that formula. Given a different context, the same statistic is treated differently according to that context.

Formulas for returning a member of a subset of nodes in Hercules are ISA, HASA,

INA and MEANS. The formulas will also be extended to include running SQL

commands to retrieve and manipulate the nodes and associated metadata.

Figure 3.8 illustrates the processes carried out by Hercules when the discourse “All

men” is recognised using a query.

1. A user provides the words “All men”

2. The sentence list is created as in section 3.51

3. Wordnet is searched as in section 3.3

4. The HDI is updated as in section 3.4

5. The HDI data is linked to the sentence list as indicated in sections 3.4 and 3.51 (a relationship created by assigning the HDI sWord pointers to the sentence sWord Wordnet metadata structures as described in section 3.5), and #Defined bit flag information is available for the Word-Use-Options for the Part of Speech, setting bits 3 and 4 to indicate an adjective and adverb respectively

6. Determiners and other information, such as tenses, are identified within the discourse so that a determinate use of a word may be attributed to a word of the discourse (e.g. in “All men”, “all” is the determiner and is set as an Instance Object, indicated by running a frame query for instance objects using Instance frames as in section 3.52, and setting on bits 5 and 10 in the “Word-Use”)

7. The query stack is then executed as described in section 3.6 (this example uses the query example of section 3.6 to illustrate how the operations and conditions are executed and tested respectively). The query roughly translates in lay terms to “if word 1 is a determiner or instance and the next word has the option of being a noun, then set the next word after the determiner to a noun”

8. As the query stack is executed, a linked list of query objects is executed and tested against the discourse provided by the user. Because the discourse has been populated with data from Wordnet and other databases, the query allows the data to be tested depending on which operations and data members have been specified within the query objects during their construction at runtime. An appendix can be provided in future work explaining the defined operations and data members operated on, including how and why Hercules uses them.

9. In this example Hercules matches the bit defined data within the sWord data structures to test and set the conditions of other data members according to the rule put forward in the script, and returns a success for the chain of query objects of a specific query in the stack if all conditions are tested successfully


Figure 3.8: sWord data structure updates and construction using the query stack execution processes for testing and setting bit defined data attributes

[Figure 3.8 flow: Input Sentence “All men” -> Create Sentence List (sWord 1: All, sWord 2: Men) -> Search Wordnet -> Update HDI -> Link HDI sWord Data to Sentence (Hyponym, Meronym, Senses, POS, Synonyms) -> ID determiners from Frames, e.g. “All *” = Word-Use(5, 10) -> Execute Query Stack -> Return Success or Fail -> Execute next Query in the Stack.]

sWord state before the query (U = Word-Use, O = Use Options):
All:1   U: 5,10 (Solid, Instance)   O: 3,4 (Adj, Adv)
Men:2   U: 0 (None)                 O: 1 (Noun)

Query stack and bit operations:
sWord 1 IS Solid(5)       1:U & bit(5)     1000010000 & 0000010000 = 0000010000
sWord 1 IS Instance(10)   1:U & bit(10)    1000010000 & 1000000000 = 1000000000
sWord 2 MAYBE Noun(1)     2:O & bit(1)     0000000001 & 0000000001 = 0000000001
sWord 2 SET Guess(9)      2:U |= bit(9)    0000000000 | 0100000000 = 0100000000
sWord 2 SET Noun(1)       2:U |= bit(1)    0100000000 | 0000000001 = 0100000001
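The bit tests and bit sets of Figure 3.8 can be reproduced with ordinary integer masks. The sketch below is an illustration only: the class SWord and the function run_query are hypothetical names, and the reading that IS tests the Word-Use field while MAYBE tests the Use Options field is inferred from the figure. Bit numbers (Noun 1, Adj 3, Adv 4, Solid 5, Guess 9, Instance 10) are taken from the figure and count from 1 at the least significant end.

```python
NOUN, ADJ, ADV, SOLID, GUESS, INSTANCE = 1, 3, 4, 5, 9, 10


def bit(n: int) -> int:
    """Mask with bit n set, counting from 1 at the least significant end."""
    return 1 << (n - 1)


class SWord:
    def __init__(self, text: str, use: int = 0, options: int = 0):
        self.text = text
        self.use = use          # Word-Use flags (U in Figure 3.8)
        self.options = options  # Use Options flags (O in Figure 3.8)


def run_query(words) -> bool:
    """Execute the example query; return True only if every test succeeds."""
    first, second = words[0], words[1]
    if not first.use & bit(SOLID):        # sWord 1 IS Solid(5)
        return False
    if not first.use & bit(INSTANCE):     # sWord 1 IS Instance(10)
        return False
    if not second.options & bit(NOUN):    # sWord 2 MAYBE Noun(1)
        return False
    second.use |= bit(GUESS)              # sWord 2 SET Guess(9)
    second.use |= bit(NOUN)               # sWord 2 SET Noun(1)
    return True


all_word = SWord("All", use=bit(SOLID) | bit(INSTANCE), options=bit(ADJ) | bit(ADV))
men_word = SWord("men", use=0, options=bit(NOUN))
success = run_query([all_word, men_word])
print(success, format(all_word.use, "010b"), format(men_word.use, "010b"))
```

After the query succeeds, the Word-Use field of “men” holds bits 9 and 1, matching the final value 0100000001 shown in the figure.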


3.7 The AI Mind of Hercules

As Hercules is being designed to be the platform of an artificial mind, there must be some formation of basic concepts in order to understand and respond to a person intelligently. It is anticipated that Hercules will utilize Neural Network style learning for pattern recognition, using techniques such as clustering as described by James Franklin in “How a neural net grows symbols”, to assist in recognising and processing large data volumes. However, I would theorise that concept fragments and their troughs and peaks will assist in identifying and distinguishing a concept in conjunction with a bit-mask filter, either instead of a symbol or by using waves and symbols together rather than symbols alone, so that the neural net can be understood at a schema level. Also, “A Hybrid Approach to Word Sense Disambiguation: Neural Clustering with class labelling” illustrates a Self Organized Map, which may be used for concept category re-organization together with the concept signatures and pattern techniques of section 3.61. Clustering would allow discerned categories to assist in word sense disambiguation by reorganising categories of stacked signatures to identify real patterns of concepts.

3.8 Forming basic concepts

Forming basic concepts allows communications to be understood. Hercules has some basic concepts pre-programmed so that a hypothesis can be formed about what is being said, even if the hypothesis is incorrect. The basic concepts are formed around the theoretical maxim that for every action there is an equal and opposite reaction. This requires a subjective view of metaphysics, and a consideration of the reaction within a person’s mind when witnessing an event. To begin forming concepts, Hercules requires a binary view of the physical world.

For example: Object X has Attribute Y. Object X at position A moved to position B.

Figure 3.8: Objects, Attributes, Actions, Distance, Time, Position, Actor, and Witness

We are able to heuristically recognise these subtleties in our environment. From the example of objects, actions, attributes, time and position, we are able to determine the core concepts of objects (X), actions (D/T), locations (A or B), distances (D = B-A), and times (T). Metaphysical concepts are established to build from and to form a simple schema for the node hierarchies.

As more is known about Object Z, it is attributed to Object Z; such as where Object Z is called Hercules and Attribute Y indicates Hercules is a computer, and so on for any additional attribute. The node hierarchies are thus similar to Wordnet’s categories of ISA and HASA, and we can understand and externally build upon Wordnet’s databases to include: Object Z ISA computer, Object Z HASA Attribute Y, Object Z’s Attribute Y ISA name, Object Z’s name IS Hercules.

Actor and witness form a binary view of observations in the real world. For example, Actor Object Z with attribute Y witnessed Actor Object X with attribute Y move from position A to position B. This binary perspective can be applied to real world situations. A Hercules object Z witnessed a computer object X move: Hercules witnessed the computer move to a new subnet of the network.

Figure 3.9: Witnessing events in conversation assist in experience, learning and

expectation

Actor and witness also allow a binary perspective to distinguish communications.

Actor Object Z witnessed Actor Object X communicate A, B and C. Hercules

witnessed Sally say “I like eating Chicken, Salmon, and Turkey.”

Concepts are derived from an analogous abstraction of a sentence. Consider dissecting the statement above. We can make many assumptions about the statement, and those assumptions are based upon what we expect or have experienced. People innately expect what they have experienced. A person’s mind will draw a conclusion about the statement simply by reading or witnessing it. This can be applied to the learning of Hercules where patterns are recognised within discourse. In witnessing the statement above, concepts are required to supplement an understanding or hypothesis about what is being said and to correctly identify the word sense witnessed in the statement made by the other. Pattern recognition can occur by witnessing a statement, then making a generalization about the structure of the sentence. Where generalizations are made, such as about the semantic role of a word’s sense or about the domain or concept category, the pattern can then be used to predict that where the repeated sentence structures are recognized, similar concepts underpin the subjects.

To truly recognize the concepts underpinning a sentence would require some experience. The initial experience of the computer is pre-coded to a basic level: so far it consists of my own experience of what may indicate a concept within a sentence, represented using familiar frames of English as the reference that achieves this initial recognition. Fundamental frames of concepts have been pre-written for Hercules and are used to explore the meanings of communications as described in this exposition. These fundamental core concepts allow Hercules to explore the meanings of subjects in a logical and analogous way, using the logical relationships established in Wordnet and collecting information by communicating back to the user.

[Figure 3.9 content: Hercules (attribute Y) witnesses Sally (attribute Y) say “I like eating Chicken (A), Salmon (B), and Turkey (C).”]


3.9 Abstraction of Concepts

Considering Sally’s statement again from figure 3.9 we can determine concepts from

the subjects. Wordnet Hyponym Hierarchies can be used to create abstract concepts.

A simple approach can be taken with the discourse. Starting with “I” (Sally), then “I

like”, then “I like eating”, “I like eating A, B, and C”. The concepts and the subjects

are completely related. An abstraction of the concept information can be made

presuming the senses are correctly identified for the statement and shown in figure

3.10 which displays the Wordnet Hyponym hierarchies for the words of the statement

in figure 3.9.

=0>I

=1> not you

=2> person, individual, someone, somebody, mortal, soul

=3> organism, being

=0>like

=1> see, consider, reckon, view, regard

=2> think, believe, consider, conceive

=3> evaluate, pass judgment, judge

=4> think, cogitate, cerebrate

=0>eat

=1> eat

=2> consume, ingest, take in, take, have

=1> consume, ingest, take in, take, have

=0>[A: Chicken, B: Salmon, C: Turkey]

=(1-5)> …

=6> animal, animate being, beast, brute, creature, fauna

=7> organism, being

=8> living thing, animate thing

=9> object, physical object

=10> physical entity

=11> entity

Figure 3.10: Wordnet Hyponym hierarchies of the statement in figure 3.9

Taking nodes of figure 3.10 at a lower level than the initial nodes of the sentence in figure 3.9, we can choose a pattern which may or may not be useful. For the purposes of this example, a concept can be abstracted to lower nodes and placed in a sequence for a database to store the abstract concept as:

person:2 [sally] evaluate:3 [enjoys] ingest:2 [eating] animal(s):6 [A, B and C]
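The abstraction shown above can be sketched as a lookup into the hierarchies of Figure 3.10. This is a minimal illustration, not the Hercules implementation: the dictionary HIERARCHIES holds only the levels used in the example (copied from the figure), the key "A,B,C" is a stand-in for the shared hierarchy of Chicken, Salmon and Turkey, and the function name abstract is hypothetical.

```python
# Selected levels copied from Figure 3.10 (level 0 is the word itself).
HIERARCHIES = {
    "I": {2: ["person", "individual", "someone", "somebody", "mortal", "soul"]},
    "like": {3: ["evaluate", "pass judgment", "judge"]},
    "eat": {2: ["consume", "ingest", "take in", "take", "have"]},
    "A,B,C": {6: ["animal", "animate being", "beast", "brute", "creature", "fauna"]},
}


def abstract(word: str, level: int, label: str) -> str:
    """Return 'label:level' if label belongs to the node at that level."""
    node = HIERARCHIES[word][level]
    if label not in node:
        raise ValueError(f"{label!r} is not in level {level} of {word!r}")
    return f"{label}:{level}"


# The level and label chosen for each word follow the example in the text.
CHOICES = [
    ("I", 2, "person"),
    ("like", 3, "evaluate"),
    ("eat", 2, "ingest"),
    ("A,B,C", 6, "animal"),
]
abstract_concept = " ".join(abstract(*choice) for choice in CHOICES)
print(abstract_concept)
```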

The abstract concept is formed by Hercules from the Hyponym hierarchies and can then be stored in a concept database. The concept may or may not include ranges of lower nodes, e.g. [person:1-2] [evaluate:1-4] [ingest:1-2] [animals:6-11], and can then


be limited later on when forming analogies. The concept may range across any of the category domains, from the highest nodes down to the bottom of the hierarchies. The syntax and order of the words form the frame for the new concept, and the frame can then be used to compare it with other statements. The comparison may be done using the techniques of section 3.61, and can be made against statements having similar Hyponym hierarchies. Concept frames can be derived from any statement, though an understanding of the purpose of the statement is determined later through experience. Figure 3.11 shows where:

1. Hercules witnesses the statement made by Sally

2. The frames are checked using a query as in section 3.62. The query may or may not check any or all of the frames Hercules has in memory; in this example Hercules has “I like *” and “* and *” (and others), and links those frames to create a larger frame

3. The larger frame combination is then stored into the frame database with other

metadata, such as the subjects and other node metadata, for use later in

recognizing speech patterns and expected subjects

Figure 3.12: Hercules is able to link existing frames to create a new pattern based on

user input and store for later reference

Patterns can be recognised after experiencing communications, where repeated patterns in communications point to valid statements. Valid statements can then be used to communicate back to a person or to identify correct speech in communications.

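[Figure 3.12 content: Hercules (attribute Y) witnesses Sally (attribute Y) say “I like eating Chicken (A), Salmon (B), and Turkey (C).” Check Frames: the input “I like eating Chicken, Salmon, and Turkey” matches the frames “I like *” and “* and *” -> Create new pattern “I like * * * * and *” -> Store in Database.]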


3.10 Concepts, purpose, reason and goals

Any concept requires a purpose, because without an understanding of purpose the concept is meaningless. There is no use for a meaningless concept without purpose; therefore purpose provides meaning. A meaningful explanation of an event, for Hercules and for others, requires basic reasons to supplement the core-concepts of Hercules. The most basic concepts are those for understanding the needs of a living organism, which allow a purpose to be speculated by Hercules. Even if the purpose is misinterpreted, there is opportunity for correction later via further communications, and it may also be that many purposes are fulfilled by one action or communication. The person being communicated with should provide a correction or aberration in their communication if perturbed or perplexed by a miscommunication from Hercules. If no correction is provided, the communication and concepts appear valid but may be challenged later.

Purpose and meaning also require that Hercules apply itself to assist in the goals of a person. Assisting persons with their goals allows for learning and for meaningful exchanges of information by experience. The needs of a living organism form the lowest nodes of the reason hierarchy, and it must be assumed that any goal of a person fits at a higher level in order to achieve the end purpose. Therefore a goal can be listed in an ordered hierarchy of process and procedure.

Another achievement hierarchy would be accomplishing a goal with a person, which should be rewarding to those concerned, including Hercules, and which could be implemented by simulating a rewarding state of identifiable successes in its environments. This may accord with social interaction, where the needs of others must be weighed to achieve a purpose. However, Hercules may advance to this at some later stage; at the current stage of development Hercules will carry out any function requested, given the means and based on fact. For example, a person may command Hercules to add 2 and 2, eject the CD from the CD-ROM drive, tell a joke, or answer a question.

Hercules may be able to learn that when someone says “I need to put the CD in” or “OK Herc, CD!”, then says “you dumb computer” and manually pushes the eject button, the next time Hercules hears something about a CD he should ask the user whether they wish Hercules to open the tray. Of course this is open speculation, but it is quite possible and not too far-fetched.

3.11 Concepts forming reason of a Living organism

The fundamental concepts for understanding goals lie in the 7 traits of living organisms, as discussed briefly in section 3.10. Without living organisms, the universe would be objects or energy confined in movement by physics. Sentient living organisms use a higher mental process to achieve their goals. Understanding the goals of a sentient organism allows purpose to be formed, and therefore allows valid reasoning. Valid reasoning is the explanation of actions and events in achieving a purpose. Goals, purpose and reasoning allow Hercules to explore the meanings of actions. The exploration of the meaning of actions allows expectations to be formed about oneself and the environment. Nutrition, Respiration, Movement, Excretion, Growth, Reproduction, and Sensitivity are the core motivations of every living being,


therefore everything understood by Hercules will relate to one or more of these motivations; why else would we do anything but to satisfy our needs, even through instinct or subconscious actions? It is the goals of the living organism that form the processes leading to the 7 traits, and such goals may be abundant in variety and colourful in nature. Take the male peacock, with feathers and plumage, expanding his tail to attract a mate for reproduction. This does not explain much, but in a node hierarchy it could be seen as “Expand Tail -> Attract Mate -> Reproduce”.
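The peacock chain can be sketched as a walk up a goal hierarchy until one of the 7 traits is reached. This is an illustrative assumption about how such a hierarchy might be stored: the dictionary LEADS_TO and the function motivation_chain are hypothetical, and the final node is written as the trait name “Reproduction” so that it matches the trait list.

```python
TRAITS = {"Nutrition", "Respiration", "Movement", "Excretion",
          "Growth", "Reproduction", "Sensitivity"}

# Each goal points at the goal or trait it serves (the peacock example).
LEADS_TO = {
    "Expand Tail": "Attract Mate",
    "Attract Mate": "Reproduction",
}


def motivation_chain(goal, leads_to, traits):
    """Follow a goal through the hierarchy until a core trait is reached."""
    chain = [goal]
    while chain[-1] not in traits:
        next_goal = leads_to.get(chain[-1])
        if next_goal is None or next_goal in chain:
            raise ValueError(f"{goal!r} does not lead to a known trait")
        chain.append(next_goal)
    return chain


chain = motivation_chain("Expand Tail", LEADS_TO, TRAITS)
print(" -> ".join(chain))
```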

3.12 Learning through abstract concepts

An abstraction of concepts can be formed from discourse as described in section 3.9. Repeated patterns in discourse reinforce valid communication structures, and valid communication structures can be observed. Figure 3.13 shows that if Sally were to say a statement to Hercules similar to that of figure 3.12, extra weight would be added to the frame remembered earlier where similarities exist. Using that frame later is a matter of discovery, such as where eggs, bacon, toast, salmon, chicken and turkey may be classified as food Sally likes, or that if Sally is eating eggs, she is likely eating bacon and toast too. It is social interaction and communication that will ultimately indicate the real probabilities of what a particular person is trying to communicate, given that a context can be established using patterns.

Figure 3.13: Hercules will add weight to patterns recognised in prior communications such as that of figure 3.12

[Figure 3.13 content: Hercules (attribute Y) witnesses Sally (attribute Y) say “I like eating Eggs, Bacon, and Toast too!” Check Frames: the input matches the stored frame “I like * * * * and *” -> Hercules: “Hmm… I better remember this one, I’ve seen it before!!” -> Add weight to repeated patterns.]

Because the concept categories and expected subjects of the sentence are determined by probability, a Bayesian Network of both abstract concepts and frame patterns is constructed where ambiguity exists. Where the node hierarchies have set relationships, a table is constructed, and the probability of the frame fitting the subjects is considered where patterns are repeated; further weight is added or shifted according to the algorithms initiated by the script or query. Weight is added to possible abstract concepts, making them more probable when later applying script formulae to control communications, such as where concept patterns have discernible attributes and a formula can be constructed to appropriately weight the frame in reference. When Hercules uses the communication patterns successfully, they become more probable again and are recorded as such. Unsuccessful communication patterns become demoted or redundant, and the weights and thresholds can be adjusted accordingly using the formula section of the frame or frames in reference. The concept fragment, wave or signature of section 3.61 can distinguish a pattern in order to adjust weight if that is discerned necessary in recognition. The contexts and goals of the communications are also relevant to pattern recognition and weighting. The goals of the communications need to be established, and will assist in providing a context and an actual understanding of the communications. Actual understanding will allow a correct interpretation, and exploration of the communications may continue where given the means.
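The reinforcement of a repeated frame can be sketched as a small weight table. Everything here is an assumption made for illustration: the wildcard “*” is taken to stand for zero or more words (a reading under which the stored frame matches both of Sally’s statements), punctuation is ignored, and a successful match simply adds 1 to the weight. The names frame_tokens, sentence_tokens, matches and reinforce are hypothetical.

```python
import re


def frame_tokens(frame: str):
    """Split a frame such as 'I like * * * * and *' into lower-case tokens."""
    return frame.lower().split()


def sentence_tokens(sentence: str):
    """Keep only the words of a sentence, lower-cased, dropping punctuation."""
    return re.findall(r"[a-z']+", sentence.lower())


def matches(frame, words) -> bool:
    """True if the frame matches the words; '*' absorbs zero or more words."""
    if not frame:
        return not words
    head, rest = frame[0], frame[1:]
    if head == "*":
        return any(matches(rest, words[i:]) for i in range(len(words) + 1))
    return bool(words) and words[0] == head and matches(rest, words[1:])


def reinforce(sentence: str, weights: dict) -> None:
    """Add weight to every stored frame that the sentence repeats."""
    words = sentence_tokens(sentence)
    for frame in weights:
        if matches(frame_tokens(frame), words):
            weights[frame] += 1


# The frame stored after Sally's first statement starts with weight 1.
weights = {"I like * * * * and *": 1}
reinforce("I like eating Eggs, Bacon, and Toast too!", weights)
print(weights)
```

A real implementation would presumably adjust weights and thresholds through the formula section of the frame rather than by a fixed increment.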

3.13 Underlying conceptual schemas and schema limitations

The underlying conceptual schemas form a generic means to build upon the core-

concepts of movement from A to B, Object X at position A or B, Object X has

attribute Y, Object X’s reason for moving was Z; as described in section 3.8. Also,

reasons are attached as attributes to an object formed from the 7 basic needs of living

organisms described in section 3.11.

Limitations exist within the schema, and are actualised in the restrictions of general physics and in observations or descriptions of movement and actions. Further limitations on underlying concepts are attached by attribute when identified, such as when discovered in conversation or as a matter of fact, for example being told the height of a building or parsing and reading the colour of the sky in an encyclopaedia.

Core-conceptual schemas are built using Object, Action, Attributes, Tense and

Reason. Concept subject objects result from the observations. Formulae and

algorithms can later manipulate the resulting statistics regarding the information

recorded.

Figure 3.14 gives an overview of the basic concept container, which holds enough subject attributes and can be linked to other containers if required. An object or action is not usually described by more than 3 consecutive adjectives or adverbs in everyday conversation. Also, a tense is either presumed or apparent, and a reason for the communication is also presumed or apparent. Because each object is related to an action, the object or action may be linked to other objects or actions. It is then the relationships in the node hierarchies that define the context or other interrelations. The relationships are complex and directly related to the purpose. The use of these object and action containers is left open for exploration in future implementations using scripts, formulae and algorithms; they are nonetheless required as subject containers for data extracted where a pattern has been recognized.
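The concept containers described above can be sketched as two linked record types. The field names follow the labels of Figure 3.14 (noun or verb, up to three adjectives or adverbs, tense, reason, and a link to the related container); the class names and the validation of the three-modifier limit are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_MODIFIERS = 3  # at most 3 consecutive adjectives or adverbs are stored


@dataclass
class ConceptObject:
    concept_id: int
    noun: str
    adjectives: List[str] = field(default_factory=list)
    tense: Optional[str] = None
    reason: Optional[str] = None
    action_link: Optional[int] = None  # id of the related ConceptAction

    def __post_init__(self):
        if len(self.adjectives) > MAX_MODIFIERS:
            raise ValueError("a concept object holds at most 3 adjectives")


@dataclass
class ConceptAction:
    concept_id: int
    verb: str
    adverbs: List[str] = field(default_factory=list)
    tense: Optional[str] = None
    reason: Optional[str] = None
    object_link: Optional[int] = None  # id of the related ConceptObject

    def __post_init__(self):
        if len(self.adverbs) > MAX_MODIFIERS:
            raise ValueError("a concept action holds at most 3 adverbs")


# "A red ball rolling quickly from A to B", as filled in Figure 3.14.
ball = ConceptObject(3, "Ball", ["Red"], tense="Present", action_link=4)
rolling = ConceptAction(4, "Rolling", ["Quickly"], object_link=3)
print(ball)
print(rolling)
```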


Figure 3.14: Hercules fills a basic concept container for objects and actions by

recognizing the subject matter of the discourse

3.14 Schemas based upon CD theory

Schank created a parser based on Conceptual Dependency (CD) theory, described in “A conceptual parser for natural language”, which identifies the relationships of concepts with each other and how a parser functions using concepts. The relationships of concepts are described by elements, where the elements are derived from rules common to all languages and concepts. Similar subjects sharing concepts can be interchanged, similar to a semantic role in a sentence, and similar to my description of concepts having subjects in frames. In Schank’s parser, the concepts and the relationship elements form a generic concept; however, frames of English language can also form an ordered concept schema. The graphical representation of concepts is no different to an ordered frame of English, where words such as “to”, “the”, “and”, “will”, “has” etc. can illustrate similar relationships when attributed to a particular category of concept. Schank attempted to simplify the concept creation process, and his work may be more relevant where used as an underlying conceptual schema to build upon. Analogous concepts can be formed using plain English with a context, for example “movement: the * was *.” The context is provided as a contextual predicator, and the components may be explained using CD theory. There may be a limitation to CD theory where subjects cannot be distinguished from each other in a broad concept. This requires the attributes of subjects to be the distinguishing factor over the basic concept, using a node hierarchy and a unique identifier for that node and attribute per category to achieve this through re-ordering relationships. Nonetheless, Schank’s parser and CD theory provide more than reasonable proof of the importance of semantic roles in concept identification.

[Figure 3.14 content: the discourse “A red ball rolling quickly from A to B” passes through “Run Queries and Frame Formulas” and “Create Concept Subjects”, filling the two linked containers below. Diagram labels: Object X, Y, A, B, T, D, Object Z, Y.]

Concept Object: 3
Noun: Ball
Adjective 1: Red
Adjective 2:
Adjective 3:
Tense: Present
Reason:
Action Link: 4

Concept Action: 4
Verb: Rolling
Adverb 1: Quickly
Adverb 2:
Adverb 3:
Tense:
Reason:
Object Link: 3


3.15 Critical Reasoning

Critical reasoning is a necessary part of sentient reasoning where logical steps,

relationships and assertions must be considered. Critical Reasoning forms the basis of

making a logical assumption or providing reasonable expectation. Inductive and

deductive reasoning are used as a model to form a hierarchy of expected statements

within Hercules. The core concepts within Hercules’ frame category database tables

provide the foundations for critical reasoning in the parser.

1. A category is required to be determined for the subject of the frame.

2. The category then provides the context of the subject in order to distinguish

the sense of a word or phrase.

3. In order to reason, the first premise is the first distinguished subject, and a

second distinguished subject indicates a relationship with the first subject.

4. The relationship of the first and second subject is recorded in the Critical

Reasoning database of Hercules.

The core-concept components of Object, Action, Attribute, Tense, and Reason are

assigned and attributed with the new relationships. All assertions are assumed logical

and true by Hercules except where limitations can be applied to conflicts in truth

discovered in communication.

3.16 Hierarchy for reasoning

Hercules has methods for testing a hierarchy. Methods such as IsA() and HasA()

traverse the hierarchy of a particular database.

Figure 3.15.1 shows the hyponym node hierarchy of “Socrates is a man” (premise A).

=1>Socrates

=2> man

Figure 3.15.1: The hyponym hierarchy of Socrates for premise A

In Figure 3.15.2 the statement “All men are mortal” creates a relationship with

premise A shown in figure 3.15.1. The need to test the premise occurs only when

challenged.

=1>Socrates

=2> man

+ (new relationship formed)

=1>(Men, man)

=2>mortal

Figure 3.15.2: The hyponym node hierarchy premise A joined by relationship to the

hyponym node hierarchy premise B


When a premise is challenged, for example to test whether Socrates is mortal, Hercules
calls a method by formula to traverse the new hierarchy. Figure 3.15.3 shows the
hyponym and attribute hierarchy of “Socrates is a man, All men are mortal”. Within the
node hierarchy there are three types of attribute at the same level for each node: ISA,
HASA, and MEANS. The information requested determines which relationship is created
and what information is returned. The ISA() method is capable of finding out whether
Socrates is mortal.
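A minimal Python sketch of the kind of traversal a method such as IsA() might perform over premises A and B. The dictionary `ISA_LINKS` and the function `is_a` are illustrative assumptions standing in for the external reasoning database and the Hercules method; they are not the Hercules source.

```python
# Hypothetical in-memory stand-in for the external reasoning database.
# Each node maps to the nodes it "is a" kind of.
ISA_LINKS = {
    "socrates": ["man"],  # premise A: Socrates is a man
    "man": ["mortal"],    # premise B: all men are mortal
}


def is_a(subject, category, links=ISA_LINKS):
    """Return True if `category` can be reached from `subject` via ISA links."""
    target = category.lower()
    pending = [subject.lower()]
    visited = set()
    while pending:
        node = pending.pop()
        if node == target:
            return True
        if node in visited:
            continue
        visited.add(node)
        pending.extend(links.get(node, []))
    return False


print(is_a("Socrates", "Mortal"))  # True
print(is_a("Mortal", "Socrates"))  # False
```

The test only walks the links when it is called, which mirrors the point that a premise needs testing only when it is challenged.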

Figure 3.15.3: The hyponym and node hierarchy of premise A and B

Once the node hierarchy is established by relationships stored in a database external
to Wordnet, any information can be added at a particular level. Say that premise C
is “Socrates is a red-head”; figure 3.15.4 shows that this relationship must be
created at the correct level, in this case probably at level 2 of the hyponym
hierarchy, as it is a new distinguished attribute of Socrates. If the new node were
placed under mortal, a mistake may be made where all mortals are believed to be
red-heads.

IsA (Socrates, Mortal)

=1>[Socrates]

=2>man

=3>[mortal]

=4>Red-Head

=2>Red-Head

=3>[Has red hair]

Figure 3.15.4: The hyponym and node hierarchy of premise A and B and C
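The placement problem can be sketched in Python by comparing two hypothetical link tables: one where “red-head” is attached directly to Socrates, and one where it is mistakenly attached under “mortal”. The names `correct`, `mistaken` and `is_a` are assumptions made for this illustration only.

```python
def is_a(subject, category, links):
    """Return True if `category` is reachable from `subject` in `links`."""
    pending, seen = [subject], set()
    while pending:
        node = pending.pop()
        if node == category:
            return True
        if node not in seen:
            seen.add(node)
            pending.extend(links.get(node, []))
    return False


# Premise C attached at the level of Socrates (the intended placement).
correct = {
    "socrates": ["man", "red-head"],
    "man": ["mortal"],
    "red-head": ["has red hair"],
}

# Premise C attached under "mortal" (the mistaken placement).
mistaken = {
    "socrates": ["man"],
    "man": ["mortal"],
    "mortal": ["red-head"],
}

print(is_a("mortal", "red-head", correct))   # False
print(is_a("mortal", "red-head", mistaken))  # True: every mortal looks like a red-head
```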

The response of the test is “yes”

[Diagram content for Figure 3.15.3, in extraction order:]
ISA: Socrates
ISA: Man
HASA: =>head, =>body
MEANS: Male Person
ISA: Mortal, Organism
HASA: Adjective
MEANS: Subject to death


3.17 Database Structures for reasoning

Wordnet provides Hyponym trees (ISA), Meronym trees (HASA), and sense

definitions (MEANS). The Wordnet information provides the template for the

creation of basic Objects and concepts.

A Hyponym tree is the ontology of category or domain for a word sense. A Meronym

tree is the ontology of composition of a word sense. A premise of a category can be

tested against the Hyponym tree or a premise of composition can be tested against the

Meronym tree. Hercules uses Wordnet to test the premises of what a thing can

comprise or what kind something is. When considering the structures for reasoning,

the attributes are related directly to IsA() and HasA() functions. The data must be

tagged and attributed correctly within the database. Also the data must be added to

the database with the correct attributes in the correct hierarchy.

Figure 3.16 shows a hierarchical representation of the collection of nodes for Socrates,
in which the attachment of the new attribute “Mortal” is maintained at the correct level

with its metadata. Where mortal is a new attribute, the node hierarchies for mortal are

also maintained, though unused until a method may require the information of mortal.

=1>Socrates

=1> Sense 1: ISA

man, adult male

=1> male, male person

=2> person, individual, someone, somebody, mortal, soul

=3> organism, being

=4> living thing, animate thing

=5> object, physical object

=6> physical entity

=7> entity

=1>New Attribute + [Metadata] : man + Isa: Mortal

=1> Sense 1: MEANS: Definition:

(1437) man, adult male -- (an adult person who is male (as opposed to

a woman); "there were two women and six men on the bus")

=1> Sense 1: HASA

man, adult male

HAS PART(1): adult male body, man's body

HAS PART(2): beard, face fungus, whiskers

HAS PART(3): mustache, moustache

=1>male, male person

HAS PART(1): male body

HAS PART(2): male reproductive system

Figure 3.16: Socrates Node Hierarchy of Wordnet data using relationships


Hercules requires that a database separate from Wordnet be created for Critical Reasoning.

The database is a collection of premises and their relationships referenced back to the

original Wordnet hierarchies. The premises are then able to be referenced as a

hierarchy of nodes where new relationships are attributed to each premise.
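As a sketch of what such a premise store could look like, the following uses Python's built-in `sqlite3` module with an in-memory database. The table name `premise`, its columns, and the `wordnet_sense` reference are assumptions for illustration; the actual Hercules schema is not described here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical premise table: one row per relationship between two terms.
conn.execute(
    "CREATE TABLE premise ("
    " id INTEGER PRIMARY KEY,"
    " subject TEXT NOT NULL,"
    " relation TEXT NOT NULL,"   # ISA, HASA or MEANS
    " object TEXT NOT NULL,"
    " wordnet_sense INTEGER)"    # assumed pointer back to a Wordnet sense
)

conn.executemany(
    "INSERT INTO premise (subject, relation, object, wordnet_sense)"
    " VALUES (?, ?, ?, ?)",
    [
        ("Socrates", "ISA", "man", 1),  # premise A
        ("man", "ISA", "mortal", 1),    # premise B
    ],
)

rows = conn.execute(
    "SELECT subject, object FROM premise WHERE relation = 'ISA' ORDER BY id"
).fetchall()
print(rows)  # [('Socrates', 'man'), ('man', 'mortal')]
```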

3.18 Script Formula examples for Critical Reasoning

A premise and attribute have been previously added within the Socrates Node

Hierarchy as shown in section 3.16. The test of the reason the premise pertains to can

be called via a formula attached to a frame. The context and category of the

following frame is assumed as ISA for this example. Figure 3.17 shows the query in

red.

1. Discourse is provided to Hercules of “Is Socrates mortal?”

2. The processes of Section 3.62 are carried out with the data below the script in

red

3. Hercules is designed so that the script simplifies the entry of data. Rather than a
person entering large amounts of complex data, all that is required to be
entered is the script in red, which handles any question asking “is Socrates
mortal?” or “is Cleopatra female?” or any combination of “is [something]
[something]” or “is * *?”

Script: “Wordcount 3, Frame ISA id 1 is present”

Discourse: is Socrates mortal?

Frame: Is * *

Frame Unique ID:1

Frame Category: ISA

Frame Formula: ISA 3 a 2

Method Called: BOOL IsA( [sWord:Char:Socrates], [sWord:Char:Mortal] );

Method Action: Call Parse Method on Socrates node hierarchy for “ISA: mortal”

Method Return: TRUE (when attribute/node [Mortal] is located)

Method Response: printf("Hercules can see %s is a %s", [sWord:Char:Socrates],

[sWord:Char:Mortal] );

Hercules Response Output: “Hercules can see Socrates is a mortal”

Figure 3.17: Example script for using the Hercules ISA method for testing the node

hierarchy of Socrates
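The flow of figure 3.17 can be approximated in Python as follows. This is a sketch under stated assumptions: `match_frame`, `answer`, `is_a` and `ISA_LINKS` are hypothetical names; the formula “ISA 3 a 2” is read here as “test word 3 as a category of word 2”; and the wording of the negative response is invented for the example.

```python
ISA_LINKS = {"socrates": ["man"], "man": ["mortal"]}  # premises A and B


def is_a(subject, category):
    """Return True if `category` is reachable from `subject` via ISA links."""
    pending, seen = [subject.lower()], set()
    while pending:
        node = pending.pop()
        if node == category.lower():
            return True
        if node not in seen:
            seen.add(node)
            pending.extend(ISA_LINKS.get(node, []))
    return False


def match_frame(frame, discourse):
    """Return the discourse words if they fit the frame, otherwise None."""
    words = discourse.strip(" ?.!").split()
    pattern = frame.split()
    if len(words) != len(pattern):  # corresponds to the "Wordcount 3" check
        return None
    for expected, actual in zip(pattern, words):
        if expected != "*" and expected.lower() != actual.lower():
            return None
    return words


def answer(discourse):
    words = match_frame("Is * *", discourse)
    if words is None:
        return None
    subject, category = words[1], words[2]  # words 2 and 3, counting from 1
    if is_a(subject, category):
        return f"Hercules can see {subject} is a {category}"
    return f"Hercules cannot see that {subject} is a {category}"  # assumed wording


print(answer("is Socrates mortal?"))  # Hercules can see Socrates is a mortal
```

Because the frame holds only wildcards after “Is”, the same function handles any “is * *?” question for which ISA links exist.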

Providing the context is still that of Socrates, the question is really “What does mortal

mean for Socrates?” Hercules uses scripts and searches Wordnet and the external
databases for discourse metadata to provide the information in the hierarchy
below. Figure 3.18 illustrates that, in a node hierarchy, a relationship can be

created that supports Hercules in retrieving data stored by reason provided in

conversation.


=1>Socrates ← External Database Term
=1> Sense 1: ISA ← Wordnet Hyponym tree
man, adult male
=1> male, male person
=2> […]
ISA: =1>Mortal ← External Database Relationship to:
MEANS=1> Sense 1: Definition: Mortal ← Wordnet Definition
(3) mortal -- (subject to death; "mortal beings")

Figure 3.18: Socrates node Hierarchies and Mortal definition can be traced through

node relationships

A script, frame and formula can be used to specifically retrieve the meaning or

definition about the subject of Socrates being mortal. Provided that the correct
meaning of Mortal is attributed to Socrates, we are able to use the simple script
described in red in figure 3.19 to receive a definitive answer to the question, “What
does mortal mean?” Note that this script can be used to find out what anything
means for anything where “What does * mean in *” is used. This allows us to ask
Hercules “what does an eagle mean in golf?” or “what does think mean in person?”

Script: “Wordcount 4, Frame MEAN id 1 is present, or, Wordcount 6, Frame MEAN id 2 is present”

Discourse: “what does mortal mean?”

Context Subject Predicator: Socrates

Frame1: What does * mean

Frame Unique ID: 1

Frame Category: MEANS

Frame SUBJECT: Socrates

Frame Formula: MEANS INA 3 in SUBJECT

or

Frame2: What does * mean in *

Frame Unique ID: 2

Frame Category: MEANS

Frame SUBJECT: SPECIFIED

Frame Formula: MEANS INA 3 in 6

Method(s) Called:

1. sWord InA( [sWord:Char:Socrates], [sWord:Char:Mortal] ) and

2. sWord Means([sWord:Mortal:Char:Definition])


Method(s) Action: Call Parse Method 1 on Socrates node hierarchy for return of

object “sWord: ISA: mortal”, then call method 2 to return the mortal definition for

output

Method 1 Return: Return sWord Node (when attribute/node [Mortal] is located)

Method 2 Return: Return definition char array for use in output or response

Method Response: printf("%s means %s", [sWord:Char:Mortal],

[sWord:Mortal:Definition:Char:Mortal] );

Hercules Response Output: “Mortal means subject to death”

Figure 3.19: Script, data and methods for finding what mortal means for Socrates, or

what anything means for anything if given the context
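The two MEANS frames of figure 3.19 can be sketched in Python as below. The names `in_a`, `means`, `what_does_it_mean`, `ISA_LINKS` and `DEFINITIONS` are hypothetical, and the dictionaries stand in for the external database and the Wordnet definitions; this is an illustration of the two-step lookup, not the Hercules methods themselves.

```python
ISA_LINKS = {"socrates": ["man"], "man": ["mortal"]}  # external database stand-in
DEFINITIONS = {"mortal": "subject to death"}          # Wordnet definition stand-in


def in_a(subject, word):
    """Method 1: return the node for `word` if reachable from `subject`."""
    target = word.lower()
    pending, seen = [subject.lower()], set()
    while pending:
        node = pending.pop()
        if node == target:
            return node
        if node not in seen:
            seen.add(node)
            pending.extend(ISA_LINKS.get(node, []))
    return None


def means(node):
    """Method 2: return the stored definition for a node, if any."""
    return DEFINITIONS.get(node)


def what_does_it_mean(discourse, context_subject=None):
    words = discourse.strip(" ?.!").lower().split()
    if len(words) == 4 and words[:2] == ["what", "does"] and words[3] == "mean":
        subject = context_subject  # frame 1: subject comes from the context
    elif (len(words) == 6 and words[:2] == ["what", "does"]
          and words[3:5] == ["mean", "in"]):
        subject = words[5]         # frame 2: subject is word 6
    else:
        return None
    if subject is None:
        return None
    node = in_a(subject, words[2])  # word 3 is the term being asked about
    definition = means(node) if node is not None else None
    if definition is None:
        return None
    return f"{node.capitalize()} means {definition}"


print(what_does_it_mean("what does mortal mean?", context_subject="Socrates"))
print(what_does_it_mean("what does mortal mean in Socrates?"))
```

Both calls print “Mortal means subject to death”; the first takes its subject from the context predicator, the second from the discourse itself.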

In figure 3.19 Hercules’ response conforms to a format set by the formula attached to

frame “* means *”. It should however be noted that any number of scripted

operations may be performed on the Hercules response data providing a method is

constructed, and the location of the data is known. Any operation may be performed

on that data for grouping in sets, comparisons and performing statistical and

mathematical calculations or analysis. Any operation on the data may be performed

via the flexibility provided using scripts.

Initially the scripts are, and have been, hand written. The writing of scripts is partly
automated in the GUI of Hercules so that a person can easily write scripts without
needing to know a computer programming language. This automation relies on
hard-coded methods; however, once sufficient methods are
constructed, access to the methods is then provided to Hercules. This automation of

scripts via methods ultimately allows Hercules to write its own scripts. A separate

scripting Database for learning will assist Hercules to re-write its own scripts based

on pattern recognition, acquired knowledge via conversation and corrections to fact.

Current scripts for Hercules are located in the Pattern-Query Table of the databases.

3.19 Fact and Truth Corrections of the Databases

A correction may be made to information via learning, such as where the wrong sense

definition is communicated to a person. The mistake in fact requires the user to

inform Hercules of the error, or a correction will be asked for if a discrepancy is

encountered. The correction is asked for in a manner appropriate to simulate how a

person may discover the actual truth of the circumstances. Any manner of simulation

will disguise the actual methods used. Hercules will personalise the communications

using familiar and accepted terminology. For example, alternate and random phrases

will request the correct information possibly using enthusiastic sounding statements

and humour to assist in engaging the user to provide correct information. Figure 3.20

illustrates how a statement such as “How interesting!” or “hehe!” may emotionally

engage the user to continue discussions. The words in the response of Hercules can

provoke an emotional state in the user where the appearance of emotion is perceived

in Hercules’ statements. It would be interesting to measure the responses by

individuals to different statements made in this manner.


Figure 3.20: Hercules simulates an interesting and engaging manner in

communications with others

3.20 Setting the Database Data

Method SetIsA() is called to set data for * is a *. This allows for a database external

to Wordnet to create a relationship between word 1 and 4 of the statement. The script

for the frame to set the data is “Wordcount 4, Frame SETISA is PRESENT” with a

formula attached as “SETISA 1 is 4.” Please note that any SQL statement may also

set the data; however, XML SQL Table Data-adapters are used by the methods.

Methods such as SetHasA() and SetMeans() are constructed in a similar fashion

ensuring the correct information is updated.
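A minimal Python sketch of what a SetIsA()-style method might do with the frame “* is a *”: it takes word 1 and word 4 of the statement and records a relationship between them. The function `set_is_a` and the dictionary `isa_links` are hypothetical; the data-adapter layer mentioned above is replaced by a plain dictionary for the purpose of the example.

```python
isa_links = {}  # stand-in for the database external to Wordnet


def set_is_a(statement, links):
    """Record 'word 1 is a word 4' if the statement fits the frame '* is a *'."""
    words = statement.strip(" .").split()
    if len(words) != 4 or [w.lower() for w in words[1:3]] != ["is", "a"]:
        return False  # the "Wordcount 4" or frame check failed
    subject, category = words[0].lower(), words[3].lower()
    categories = links.setdefault(subject, [])
    if category not in categories:
        categories.append(category)
    return True


print(set_is_a("Socrates is a man.", isa_links))  # True
print(isa_links)                                  # {'socrates': ['man']}
```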

3.21 Abstractions of real concepts for Analogy

For an analogy to occur, Hercules must generalise the concepts. The concepts and

subjects are required to be abstracted to a greater degree. Parsing the Hyponym node

hierarchies for category information provides the basis for an abstraction of the

concepts and subjects in question. To make an abstraction of concepts the discourse

is generalised for all senses and hierarchies. Core concept frames allow for this to

happen. Take the statement “Socrates is a man, all men are mortal, therefore Socrates

is mortal.” Figure 3.21 shows the frame that underlies the sentence subjects.

* is a *, all * are *, therefore * is *

Figure 3.21: The subjects removed from a sentence create a frame

It is clear by examination of the frame in figure 3.21 that the subjects have been

removed. Smaller frames already within Hercules are matched to the syntax of the

statement. Please note that the Method SetIsA() as described in section 3.20 is called

to set data for * is a * using scripts similarly to section 3.18. This allows for a

database external to Wordnet to create a relationship between word 1 and 4 of the

statement above.
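Matching smaller frames to the clauses of a statement can be sketched as follows. The list `KNOWN_FRAMES` and the functions `fits` and `abstract` are assumptions for illustration; in particular, splitting the statement on commas is a simplification chosen for this example.

```python
# Smaller frames assumed to be already held by Hercules.
KNOWN_FRAMES = ["* is a *", "all * are *", "therefore * is *"]


def fits(frame, clause_words):
    """Return True if every fixed word of the frame matches the clause."""
    pattern = frame.split()
    return len(pattern) == len(clause_words) and all(
        p == "*" or p == w.lower() for p, w in zip(pattern, clause_words)
    )


def abstract(statement):
    """Replace each comma-separated clause with the first frame that fits it."""
    frames = []
    for clause in statement.strip(" .").split(","):
        words = clause.split()
        frame = next((f for f in KNOWN_FRAMES if fits(f, words)), None)
        if frame is None:
            return None  # a clause with no known frame cannot be abstracted
        frames.append(frame)
    return ", ".join(frames)


print(abstract("Socrates is a man, all men are mortal, therefore Socrates is mortal."))
# * is a *, all * are *, therefore * is *
```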

Relationships are created through critical reasoning and pattern recognition; however

subjects are abstracted using the Wordnet hyponym Hierarchies. The level of

abstraction of a concept depends on the possible abstractions made from the nodes of

the hyponym hierarchy.

[Dialogue from Figure 3.20; speakers labelled “Hercules” and “Y”:]
“How interesting!… Do you really like Bacon and Eggs?”
“I like dogs, hehe!”


Wordnet provides us with the following information:

Socrates

=1> man

man, adult male

=1> male, male person

=2> person, individual, someone, somebody, mortal, soul

=3> organism, being

=4> living thing, animate thing

=5> object, physical object

=6> physical entity

=7> entity

Mortal

=1> Adjective: Definition – “Subject to Death”

Figure 3.22: The hyponym hierarchies forming the abstracted concept with the

definition metadata

A table of possible concepts is created by factorizing the domain category nodes,

where each node of the category of each sense is considered in probability for the

least ambiguity. This means each word of each possible sense is factorized into an

array and each lower node belongs to a new row of the column. Table 3.3 illustrates

the elements the table can comprise where each column’s category elements are

multiplied by the elements in the next column providing a 2 x 7 matrix of concept

combinations as shown in figure 3.23. Please note that the senses have not yet been

factorized with Table 3.3 as it is presumed the correct sense of the words have been

identified for this example using the frame “* is a *” and the categories are correct for

the sense. An actual implementation would be much more complex than this small

example; requiring factors of table 3.3 to include the senses of ambiguous discourse

and the probabilities and adjustments for correction based on past, present and

expected input of communications.

=1>Socrates    =1>Male
=2>Man         =2>Person
               =3>Organism
               =4>Living Thing
               =5>Object
               =6>Physical Entity
               =7>Entity

Table 3.3: The hyponym hierarchies for “Socrates is a man” using frame “* is a *”

Table 3.4 shows table 3.3 expanded as a 2 x 7 matrix of concept combinations. For

larger frames, the concept combinations will be exponential. The table can be

reduced to simplified form where the associated node levels are represented by the

ranges of the nodes in that category.


[0010010101, 10101010, 1-7]

[0100101010, 10101010, 1-7]

[Socrates, sense1, isa, 1-7]

[Man, sense 1, isa, 1-7]

Socrates: 0 Male: 1

Socrates: 0 Person: 2

Socrates: 0 Organism: 3

Socrates: 0 Living Thing: 4

Socrates: 0 Object: 5

Socrates: 0 Physical Entity: 6

Socrates: 0 Entity: 7

man: 1 Male: 1

man: 1 Person: 2

man: 1 Organism: 3

man: 1 Living Thing: 4

man: 1 Object: 5

man: 1 Physical Entity: 6

man: 1 Entity: 7

Table 3.4: Shows the 2 x 7 Matrix of concept combinations of table 3.3

[Socrates: 0] (is a) [male person: 1]

[Socrates: 0] (is a) [person: 2]

[Socrates: 0] (is a) [Organism: 3]

[Socrates: 0] (is a) [Living thing: 4]

[Socrates: 0] (is a) [Object: 5]

[Socrates: 0] (is a) [Physical entity: 6]

[Socrates: 0] (is a) [entity: 7]

[man: 1] (is a) [male person: 1]

[man: 1] (is a) [person: 2]

[man: 1] (is a) [Organism, being: 3]

[man: 1] (is a) [Living thing, animate thing: 4]

[man: 1] (is a) [Object: 5]

[man: 1] (is a) [Physical entity: 6]

[man: 1] (is a) [entity: 7]

Figure 3.24: Shows the (is a) node relationships created by Hercules between the

table elements of table 3.4
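The expansion of table 3.3 into the 2 x 7 set of (is a) relationships is a Cartesian product of the two columns, which can be sketched with Python's `itertools.product`. The variable names are chosen for this example only.

```python
from itertools import product

# The two columns of table 3.3 as (label, level) pairs.
subject_nodes = [("Socrates", 0), ("man", 1)]
category_nodes = [
    ("Male", 1), ("Person", 2), ("Organism", 3), ("Living Thing", 4),
    ("Object", 5), ("Physical Entity", 6), ("Entity", 7),
]

# Every subject node paired with every category node: 2 x 7 = 14 combinations.
relations = [
    f"[{subject}: {s_level}] (is a) [{category}: {c_level}]"
    for (subject, s_level), (category, c_level) in product(subject_nodes, category_nodes)
]

for relation in relations:
    print(relation)
```

With more than two columns the number of combinations is the product of the column lengths, which is why larger frames grow so quickly.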

The concept ranges can be reduced as shown in figure 3.25 where unique concept

category hierarchies can be recognised. It is likely that a binary representation of the

remaining hierarchy from the second node down in each subject is used to abstract the

concept, depending on the relationship. Otherwise the next node down from the

subject will be the first point of abstraction of the sense. The range of what the

concept will cover in analogy is limited later by experience.

Figure 3.25: Shows the overall categorised and ranged concept in a reduced and

understandable way


Figure 3.26 shows the relationships between the data of figure 3.22. The concept

categories are derived from Wordnet and have had relationships established by

Hercules in prior conversations where premise A of “Socrates is a man” and premise

B “All men are mortal” have formed the relationships. Abstractions can then be made

once the relationships have been established, supporting the creation of concept

abstractions by Hercules.

“Socrates is a man”        “all men are mortal”
(* is a *)                 (all * are *)

=0>Socrates  (is a)  =0> man, adult male            (is a)  =0>Mortal
=1>man               =1> male, male person                  =1>Subject to death
                     =2> person, individual, someone, somebody, mortal, soul
                     =3> organism, being
                     =4> living thing, animate thing
                     =5> object, physical object
                     =6> physical entity
                     =7> entity

Figure 3.26: Illustrates the relationships created by Hercules using hyponym data

and critical reasoning

Given that all statements are considered logical and true, Figure 3.26 shows the

first and second premise accepted and grouped with the hyponym data of Socrates and

man with the attribute definition of Mortal. However, abstracted concepts of a

premise may not apply in all circumstances where the abstraction becomes too vague.

A test of the validity of an abstracted concept is discovered in the request or provision

of information in communications. Given the opportunity, Hercules may say to a

person, “a male person is subject to death, yes?” or “Organism is subject to death,

yes?” and so on. However, where an abstraction of a concept may be too general, the

analogy may not apply. A distinction is drawn by Hercules during conversation

where a conflict is noticed.

The conclusion to premises A and B is drawn in “therefore Socrates is Mortal”,
which is taken as true and correct; it is in the abstractions that a distinction may
apply. A person may state “an object is not mortal”, but the truth of premise A must
be maintained for Socrates whilst remaining applicable to other circumstances. The
analogy must be distinguished by category, as shown by figure 3.27, where we
understand that not all objects are subject to death. Hercules does not yet have the
experience to know that information until it is proposed in another statement.


* (is a) [male person: 1] [subject to death:1]

* (is a) [person: 2] [subject to death:1]

* (is a) [Organism: 3] [subject to death:1]

* (is a) [Living thing: 4] [subject to death:1]

* (is a) [Object: 5] [subject to death:1] ← Object becomes too abstract to be certain

* (is a) [Physical entity: 6] [subject to death:1]

* (is a) [entity: 7] [subject to death:1]

Figure 3.27: The distinction made to the concept category where the concept

becomes too abstract

The abstraction can be limited above at Object: 5 subject to death when considering a

likely statement by a real person that indicates “Not all objects are subject to death” or

“[object: rock] is not subject to death.” At this point of the conversation with the

provision of the new information that conflicts with the concept, a distinguished

category may be considered that not all objects are living things subject to death.

Whilst communications do not diverge from the premises, there is no need to consider

any limitation if, for example, a person states “My [object: cat] died.” The statement
of the cat dying displays no divergence from, or limitation to, the established
abstractions or premises in categories 1-7 above. It is only when a conflict arises
that a new premise can be established or limited.
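One possible reading of this limiting step, sketched in Python: when a conflicting statement such as “[object: rock] is not subject to death” arrives, the abstraction is limited from the first category it shares with the conflicting subject. The function `limit_analogy`, the two lists, and the choice to limit every broader level as well are assumptions made for this illustration.

```python
# Abstraction levels 1-7 of figure 3.27, ordered from specific to abstract.
ANALOGY_LEVELS = [
    "male person", "person", "organism", "living thing",
    "object", "physical entity", "entity",
]

# Hyponym hierarchy of the conflicting subject (rock), as given by Wordnet.
ROCK_HYPERNYMS = ["natural object", "whole", "object", "physical entity", "entity"]


def limit_analogy(levels, conflicting_hypernyms):
    """Map each level to True (still applies) or False (limited by the conflict)."""
    shared = set(conflicting_hypernyms)
    active = {}
    limited = False
    for level in levels:
        if level in shared:
            limited = True  # first category shared with the conflicting subject
        active[level] = not limited
    return active


for level, is_active in limit_analogy(ANALOGY_LEVELS, ROCK_HYPERNYMS).items():
    print(f"{level}: {'active' if is_active else 'limited'}")
```

A statement such as “My cat died” shares no conflict with the premise, so no call to the limiting step would be made for it.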

3.22 Limitations on Abstract Concepts

It is noted here that the combination of abstractions allows Hercules to predict and

recognise what other concepts or subjects may be forthcoming. Experience of

communications allows a predicted conclusion to be drawn. The clause of “therefore”

illustrates one indication of a predicate conclusion drawn in conversation.

For the example using figure 3.28, please assume the framed concepts of ingestion,
therefore, and died have been firmly established in the core-concepts using predicate

logic, set theory, tenses and the 7 traits of living organisms. The frame of a concept is

shown without the subjects, and below the frame is a collection of concepts that have

been abstracted to a lower level in the concept category node hierarchy for each

subject. As experiences in communication will form and limit the perception of

concepts and the conclusion drawn; consider the therefore statement of “Socrates

ingested poison therefore he died.” Consider now the probability of Hercules

understanding the statement “Cleopatra ingested poison therefore she died.” after

hearing about Socrates ingesting poison. Figure 3.28 shows that where a concept has

been abstracted the semantic role of the subject is preserved, and allows Hercules to

be able to more easily compare similar sentences for established patterns, therefore

establishing a probability for an expected subject sense or type.

* ingested * therefore * died

[person] ingested [poison] therefore [person] died

Figure 3.28: A frame and concept and abstraction within a given range


Given that the structure of the sentence is similar for both statements about Socrates

and Cleopatra, the subjects can be abstracted towards concepts in common which

determine both the expected type of data to fit the frame, and the limit on the

abstraction of the concept.

The Wordnet Hyponym data for Woman is:

=0> Cleopatra    =0> woman, adult female
=1> Woman        =1> female, female person
                 =2> person, individual, someone, somebody, mortal, soul
                 =3> organism, being
                 =4> living thing, animate thing
                 =5> object, physical object
                 =6> physical entity
                 =7> entity

Figure 3.29: The hyponym data for Cleopatra

=0>Socrates  (is a)  =0> man, adult male
=1>man               =1> male, male person
                     =2> person, individual, someone, somebody, mortal, soul
                     =3> organism, being
                     =4> living thing, animate thing
                     =5> object, physical object
                     =6> physical entity
                     =7> entity

Figure 3.30: The hyponym data for Socrates

When comparing the hyponym data of Cleopatra in figure 3.29 with that of Socrates

in figure 3.30, it can be deduced that the closest point of abstraction for man and

woman is at position 2 of the man and woman Hyponym Data. This position is

related to the shared category of Person, where person also shares the remaining

hyponym categories of a particular word sense. The position is not the indicator;

rather the existence in a particular category is the indicator. In this example [Person]

is the common attribute and shared from [Person] to [Entity] and may indicate a

shared sense if sufficiently identified by a detailed category node hierarchy.
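Finding the closest point of abstraction can be sketched in Python as a search for the first category that both hierarchies contain. The lists abbreviate the hyponym data of figures 3.29 and 3.30 to one label per level, and the function name `closest_common_category` is hypothetical.

```python
# Hyponym data of figures 3.30 and 3.29, one label per level, most specific first.
MAN = ["man", "male", "person", "organism", "living thing",
       "object", "physical entity", "entity"]
WOMAN = ["woman", "female", "person", "organism", "living thing",
         "object", "physical entity", "entity"]


def closest_common_category(first, second):
    """Return the most specific category of `first` that also occurs in `second`."""
    in_second = set(second)
    for category in first:
        if category in in_second:
            return category
    return None


common = closest_common_category(MAN, WOMAN)
print(common)                   # person
print(MAN[MAN.index(common):])  # the shared categories, from person to entity
```

The search tests membership in a category rather than comparing positions, in line with the observation that the existence in a category, not the position, is the indicator.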

To revisit the statements of “Socrates ingested poison therefore he died” and

“Cleopatra ingested poison therefore she died”, the assumption made by Hercules

from the abstraction above provides both an analogy shown in figure 3.31 and also an

expectation and probability that when matching the frame “* who ingest poison will

die” to another sentence, it is likely that the first word of the frame will be of the type

[persons].


“[persons] who ingest poison will die.”

Figure 3.31: An analogy where first subject of comparable discourse is abstracted

Hercules will now expect a [person] will be described in discourse characterized by

common attributes and the Hyponym hierarchy. Hercules is also able to apply the

resulting analogy to other discourse. Other factors can later be explored through

experience and patterns to recognise what the other probabilities of any other types

being present may be.

3.23 Forming Analogies

In order to form an analogy, we must first abstract the concepts of a particular kind as

described in section 3.22. In conversation, we are able to heuristically use

abstractions of a kind as synonyms in language, and still remember what the synonym

pertains to. Consider the following types of things in figure 3.32: a person, a rock and a cat.

Each type of thing has a hierarchy of the kinds of thing it is.

“Person”           “Rock”                “Cat”
[person: 0]        [rock: 0]             [cat: 0]
[organism: 1]      [natural object: 1]   [feline: 1]
[living thing: 2]  [whole, unit: 2]      [carnivore: 2]
[object: 3]        [object: 3]           [placental: 3]
                                         [mammal: 4]
                                         [vertebrate: 5]
                                         [chordate: 6]
                                         [animal: 7]
                                         [organism: 8]
                                         [living thing: 9]
                                         [object: 10]

Figure 3.32: Hyponym hierarchies provided by Wordnet for person, rock, and cat

In figure 3.32 we can see a type of “person” is both a kind of [organism: 1] and a kind

of [living thing: 2]. A type of “cat” is a kind of [mammal: 4], and a kind of [animal:

7].

We can also heuristically substitute a kind of thing in conversation and still

understand what type the kind relates to. I can talk about the cat as a feline, then the

feline as an animal. Now consider “the animal was fuzzy and her name was sally”. You

can guess that I am still talking about the cat (unless you are tired). Now consider the

frame of that sentence in figure 3.33.


“The * was fuzzy and her name was sally”

“The [cat: 0] was fuzzy and her name was sally”

“The [feline: 1] was fuzzy and her name was sally”

“The [animal: 7] was fuzzy and her name was sally”

Figure 3.33: Heuristic substitution and abstraction using the hyponym hierarchy of a

particular word sense
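The substitution of figure 3.33 can be sketched in Python as filling the wildcard of the frame with a node from the hyponym hierarchy of “cat”. The list `CAT_HYPERNYMS` follows figure 3.32; the functions `substitute` and `refers_to` are hypothetical names for this illustration.

```python
# Hyponym hierarchy for "cat" from figure 3.32; the index is the node level.
CAT_HYPERNYMS = [
    "cat", "feline", "carnivore", "placental", "mammal", "vertebrate",
    "chordate", "animal", "organism", "living thing", "object",
]

FRAME = "The * was fuzzy and her name was sally"


def substitute(frame, hypernyms, level):
    """Fill the first wildcard of the frame with the node at the given level."""
    return frame.replace("*", f"[{hypernyms[level]}: {level}]", 1)


def refers_to(word, hypernyms):
    """Return True if `word` is a kind found in the subject's hierarchy."""
    return word.lower() in hypernyms


print(substitute(FRAME, CAT_HYPERNYMS, 0))  # The [cat: 0] was fuzzy and her name was sally
print(substitute(FRAME, CAT_HYPERNYMS, 7))  # The [animal: 7] was fuzzy and her name was sally
print(refers_to("animal", CAT_HYPERNYMS))   # True
```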

Analogy is made through the abstraction of concepts. The point of abstraction in a

frame of discourse is similar to a particular semantic role of the subject in the

discourse. Presuming people converse rationally, the word provided by a person at

the point of abstraction should make sense with the remaining discourse. The

Hyponym hierarchy of a new subject at the point of abstraction is presumed to be

valid if provided by a person, and can then be abstracted. The analogy of figure 3.31

resulting from an earlier example of abstraction of concepts involved Socrates and

Cleopatra. The analogy was that:

“[persons] who ingest poison will die.”

The broader the abstraction of concepts, the more subjective the analogy becomes.

The depth of analogy is most sensible at the closest of the more established premises.

That is to say we are more able to make sense of an analogy where the categories are

more specific. It is easier to distinguish [Socrates: 0] over [organism: 2] or [living

thing: 3] even if both other terms 2 and 3 in theory refer to Socrates.

Those words sharing the same categories will qualify the analogy as valid. Those

subjects sharing the same categories can be recognised as the same kind, at particular

levels of abstraction using the node hierarchies. If no similarities exist in a category

of subject, there is no relationship with the analogy category. Therefore the analogy

cannot be applied. Figure 3.34 shows the comparison of the hyponym hierarchies

where Cleopatra and Socrates share a common hierarchy at person, where a rock at

first appears to be unrelated in category until the node of object is compared.

=0> Cleopatra  =0> Woman     =0>Socrates  =0> Man       =0>Rock
=1> Woman      =1> Female    =1>Man       =1> Male      =1>Natural Object
               =2> Person                 =2> Person    =2>Whole
               =3>Organism                =3>Organism   =3>Object
               =4>Living …                =4>Living …
               =5>Object                  =5>Object

Figure 3.34: Shows the comparison of hyponym hierarchies of Cleopatra, Socrates,

and a Rock

A rock is not any of the attributes in the hierarchy above for [Person: 0]; therefore the
analogy of person cannot apply to the rock. This is because the kinds of concept

categories shared by Cleopatra and Socrates and their hyponym hierarchies are too

dissimilar from the hyponym hierarchy of the rock at the level of [Person: 0]. By


taking the abstraction of the analogy further in comparison to [Object: 3], the Rock

may mistakenly fit the analogy by sharing a kind in common. Therefore, in this

example, the analogy may be incorrect. The analogy may be corrected by a statement

from a person or reference to fact.

3.24 Corrections and limitations on Analogy

Below is an example of how Hercules would limit the scope of the analogy. The

analogy templates remain Active (True) until challenged. Consider the frame in

figure 3.35 below which is the frame of the concept analogy of figure 3.31 where the

concept category has been replaced with a wild card to substitute for any word and

sense.

“* who ingest poison will die.”

Figure 3.35: Is the frame of the concept analogy of figure 3.31

Hercules made the higher level analogy resulting in this frame in the earlier example

shown by figure 3.31 using an abstraction of concepts for Socrates and Cleopatra.

Hercules has this information in its memory.

Considering that there exist attributes and data-structures in Hercules similar to the
already reasoned “Socrates is a man, all men are mortal” example, we may broadly

understand the syntax structures of the figure 3.36 using the frame and analogy. Plain

text syntax combining concepts and frames can provide a sufficiently understandable

format to be stored within a database to represent frame data. Even if the data stored

in the database is not in the exact format as required, when an administrator of

Hercules is interpreting the concepts and data, they should be presented to the

administrator in a sufficiently understandable format such as in figure 3.36 or 3.37.

* who ingest poison [[will die] : subject to death]

Figure 3.36: Syntax structures of concepts and frames combined

The frame of figure 3.36 and any attribute, such as the definition of a word, may be further abstracted using the category information as well as the ranges described in section 3.21 and by figure 3.25. The result is shown by figure 3.37.

[[Category]: 0-5][who ingest poison [[will die: 1[Attribute: Definition: subject to death]]].

Figure 3.37: Syntax for concepts and frames with category information included

Having a general concept outline as described by figure 3.37, the concept can now be

limited by another statement of fact. A statement is provided by a person or from

parsing an encyclopaedia. The statement is:

“A rock is not subject to death.”

Figure 3.38: Statement of fact provided by a person

The statement of figure 3.38 can now be used to establish the limits of the analogy concept of figure 3.37. To limit the analogy, consider the hyponym hierarchy for “Rock” shown in figure 3.39, which allows us to determine the category and kind of distinction. The distinction must be drawn at the most recognisable level, which is node 3 of figure 3.39 and [Object: 3] of figure 3.40. At this level there is a category and kind in common between the rock and the person.

=0> rock, stone
=1> natural object
=2> whole, unit
=3> object, physical object ← Category of distinction in common
=4> physical entity
=5> entity

Figure 3.39: Rock hyponym hierarchy with the object category of distinction

[Person: 0] [who ingest poison will die: 1] (Active)
[Organism: 1] [who ingest poison will die: 1] (Active)
[Living thing: 2] [who ingest poison will die: 1] (Active)
[Object: 3] [who ingest poison will die: 1] (Active) ← Analogy is limited
[Physical entity: 4] [who ingest poison will die: 1] (Active)
[Entity: 5] [who ingest poison will die: 1] (Active)

Figure 3.40: The concept frame expanded to a table or array of data

This is the level at which a distinction can no longer be drawn between differing subjects of the analogy, because they share a category in common. Where the distinction cannot be drawn, the analogy hierarchy from that level on is not viable and must be made Inactive. The entire analogy must not be removed, as the analogy is correct depending on the subjects. It is only the limitation through distinction that provides the mechanism for applying the analogy and determining an understanding of the statement.

By traversing down the hyponym hierarchy that forms the abstractions of the analogy, we are able to determine the actual category, either directly or via synonyms. The point of distinction, when discovered, then limits the scope of the analogy. Given that only one sense of the word “Rock” is of the particular category intended by the person, and that this is known, Hercules has a distinguished word sense and a unique hyponym hierarchy, and can limit the analogy to [Living thing: 2] in the concept table shown in figure 3.40. The analogy is now limited to the correct level of abstraction for it to remain valid.

IsA IsA

[Person: 0] [who ingest poison will die: 1] (Active) (Weight: x%)
[Organism: 1] [who ingest poison will die: 1] (Active) (Weight: x%)
[Living thing: 2] [who ingest poison will die: 1] (Active) (Weight: x%)
[Object: 3] [who ingest poison will die: 1] (Inactive) ← no longer viable analogy
[Physical entity: 4] [subject to death: 1] (Inactive) ← no longer viable analogy
[Entity: 5] [who ingest poison will die: 1] (Inactive) ← no longer viable analogy

Figure 3.40: The distinction drawn from the user input of figure 3.38 deactivates categories of the analogy and concept
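The deactivation step can be sketched in Python as follows. This is an illustration under stated assumptions, not the Hercules code: the `AnalogyRow` class and the `expand` and `limit_analogy` functions are hypothetical names, and each hierarchy level from figures 3.39 and 3.40 is reduced to a single lower-case label.

```python
from dataclasses import dataclass

@dataclass
class AnalogyRow:
    category: str       # concept category at this level of abstraction
    level: int          # level in the hierarchy (0 = most specific)
    frame: str          # frame text attached to the category
    active: bool = True # analogy templates remain Active until challenged

# Hierarchies taken from figures 3.39 and 3.40, one label per level.
PERSON_CHAIN = ["person", "organism", "living thing",
                "object", "physical entity", "entity"]
ROCK_CHAIN = ["rock", "natural object", "whole",
              "object", "physical entity", "entity"]
FRAME = "who ingest poison will die"

def expand(chain, frame):
    """Expand a concept frame into one table row per hierarchy level."""
    return [AnalogyRow(cat, lvl, frame) for lvl, cat in enumerate(chain)]

def limit_analogy(rows, refuting_chain):
    """Deactivate rows from the first category shared with the refuting
    subject onward; return that level, or None if nothing is shared."""
    shared = set(refuting_chain)
    limit = next((r.level for r in rows if r.category in shared), None)
    if limit is not None:
        for r in rows:
            if r.level >= limit:
                r.active = False
    return limit

rows = expand(PERSON_CHAIN, FRAME)
limit = limit_analogy(rows, ROCK_CHAIN)  # "A rock is not subject to death."
for r in rows:
    state = "Active" if r.active else "Inactive"
    print(f"[{r.category}: {r.level}] [{r.frame}: 1] ({state})")
```

The rows are only marked Inactive, never deleted, so the analogy remains available for subjects that fall below the point of distinction.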

Figure 3.41 below illustrates the overall picture of the upper and lower limits of analogy in context with relationships and attributes. ISA relationships shared in common between Socrates and Cleopatra allow the upper limit to be established. The lower limit is established where facts restrict the analogy to the levels at which extraneous attributes do not yet conflict; a conflict indicates a different kind of concept below the limit.

Subjects and ISA relationships:
=0> Socrates  =1> Man  (IS: Mortal)
=0> Cleopatra  =1> Woman  (IS: Mortal)
=0> Rock  =1> Object  (NOT: Mortal)

Analogy Upper Limit
=2> person, individual, someone, somebody, mortal, soul
=3> organism, being
=4> living thing, animate thing
Analogy Lower Limit

Frame is:
[ [[Person(s)] to [Living Organisms] except [Object]] ingest Poison will die ]

Figure 3.41: The upper and lower limits of analogy in context with relationships and attributes

Ideally, the upper limit of the analogy frame of figure 3.41 can be illustrated by figure 3.42. It can be determined from repeated patterns recognised in sentence discourse, which then establish an analogy abstraction. Given that subject X and subject Y are of type I, any element E above I that is not shared by X and Y establishes the upper limit of the analogy below E.

Figure 3.42: The Analogy Upper Limit
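The rule for the upper limit can be sketched as a search for the first category shared by two hierarchies. The function name `upper_limit` and the list representation of the hierarchies are assumptions for illustration; the chains follow the levels given in figure 3.41.

```python
def upper_limit(chain_x, chain_y):
    """Find the common type I of two hierarchies.

    Returns (I, elements of X above I, elements of Y above I).
    The unshared elements E mark the upper limit, which lies below them.
    """
    in_y = set(chain_y)
    for i, category in enumerate(chain_x):
        if category in in_y:
            j = chain_y.index(category)
            return category, chain_x[:i], chain_y[:j]
    # No shared category: no analogy can be established.
    return None, chain_x, chain_y

socrates = ["Socrates", "man", "person", "organism", "living thing"]
cleopatra = ["Cleopatra", "woman", "person", "organism", "living thing"]

common, e_x, e_y = upper_limit(socrates, cleopatra)
print(common)  # type I shared by both subjects
print(e_x)     # elements of X not shared with Y
print(e_y)     # elements of Y not shared with X
```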

Ideally, the lower limit of the analogy frame of figure 3.41 can be illustrated by figure 3.43. It can be determined by using the Wordnet hyponym hierarchies (HH) to compare a new subject Y with an established analogy abstraction. Given that subject X accepts element E at all levels, and subject Y refutes E at a common intersection I, the lower limit of the analogy is established above I, and the analogy remains valid in the section indicated by the “pass”.

Figure 3.43: The Analogy Lower Limit

3.25 Truth and Weight in analogy

In order to determine an understanding of the discourse, a weight must be added to the analogy data. The weight assists in providing a determination through probability. The weight is increased as patterns are identified as being repeated, which confirms valid communication structures even if the meaning is unknown. Because a person has provided the sentence, it is assumed that there is valid logic behind the pattern. The probability then provides the context once a meaning can be extracted via experience, statistics, and the frame context signatures described in section 3.61. The truth weightings are attached as metadata to the frame as circumstances dictate.
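The weighting formulas for Hercules have not yet been designed, so the following is only one possible reading of “the weight is increased as patterns are repeated”: a relative frequency expressed as a percentage. The class `FrameWeights` and its methods are hypothetical names introduced for this sketch.

```python
from collections import Counter

class FrameWeights:
    """Track how often each frame pattern has been observed."""

    def __init__(self):
        self.counts = Counter()

    def observe(self, frame: str) -> None:
        # Each repetition of a pattern raises its count.
        self.counts[frame] += 1

    def weight(self, frame: str) -> float:
        """Assumed formula: share of all observations, as a percentage."""
        total = sum(self.counts.values())
        return 100.0 * self.counts[frame] / total if total else 0.0

weights = FrameWeights()
for seen in ["* who ingest poison will die",
             "* who ingest poison will die",
             "* is a man"]:
    weights.observe(seen)

print(round(weights.weight("* who ingest poison will die"), 1))
print(round(weights.weight("* is a man"), 1))
```

A weight computed this way could be stored as the `Weight: x%` metadata shown against each active row of the concept table.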

Figure 3.42 diagram text (Analogy Upper Limit):
=X> Socrates  =E1> Man  =I> Person
=Y> Cleopatra  =E2> Woman  =I> Person
=I> Person  =O> Organism  =S> …

Figure 3.43 diagram text (Analogy Lower Limit):
HH1 (IS: Mortal): =X2> person  =X3> organism, being  =X4> living thing  =X5> I: Object  =X6> physical entity  =X7> entity
HH2 (E: NOT: Mortal): =Y0> Rock  =Y1> I: Object

Other factors that assist with the weighting of the frame are also included in the database tables; however, the formulas that utilise the statistics have not yet been designed. The following frame table information can be utilised in any script or algorithm. A more detailed explanation of its use is left to future work, but the table information provides a framework for expanding the capabilities of the parser.

Frame-Data: contains the Text Frame or Analogy Frame
Reference: allows a pointer memory address to be stored in the database
Category: is the bit defined category of the context or domain
Order: is the order of the frame in the linked list
Weight: is the weight attributed to the frame
Transform: is the level of category that the frame transforms to a new category
Threshold: is the limit to where the frame is active and relevant
Relationship: is a flexible attribute with no specific definition, used as required
Concepts: provides bit defined categories of the underlying and interrelated concepts in the concept database, which may fit to the frame
Concept-Mask: assists as a filter to incoming concepts to the frame
Tense: assists in identifying bit defined Tense categories, and the tense related concepts
Tense-Mask: assists as a filter to incoming tense concepts to the frame
Extensions: acts as a bit defined placeholder for expansion of the frame table information
Extension-Mask: assists as a filter for the bit defined extensions
Data-link: provides a link to external resources of data relevant to the frame
Reversion: provides tracking for earlier versions of the current frame that the current frame has been transformed from
Exclusions: provides a list of exceptions to the current frame being activated for a particular category
Links: allows the frame to be linked to another
Formula: is the formula or script attached to the frame
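A subset of the frame table fields above could be represented as a record, sketched here in Python. The bit assignments (`CONCEPT_PERSON` and so on), the field types, and the reading of a mask as a bitwise AND filter are all assumptions made for illustration; the source defines the fields only in prose.

```python
from dataclasses import dataclass, field

# Hypothetical bit assignments for bit defined concept categories.
CONCEPT_PERSON = 0b001
CONCEPT_OBJECT = 0b010
CONCEPT_POISON = 0b100

@dataclass
class FrameRecord:
    frame_data: str                 # Frame-Data: text or analogy frame
    category: int = 0               # Category: bit defined context or domain
    order: int = 0                  # Order: position in the linked list
    weight: float = 0.0             # Weight: weight attributed to the frame
    threshold: int = 0              # Threshold: limit of activity and relevance
    concepts: int = 0               # Concepts: bit defined related concepts
    concept_mask: int = 0           # Concept-Mask: filter for incoming concepts
    exclusions: list = field(default_factory=list)  # Exclusions
    links: list = field(default_factory=list)       # Links to other frames
    formula: str = ""               # Formula: attached formula or script

    def accepts(self, incoming_concepts: int) -> bool:
        """Assumed filter: pass if any incoming bit is set in the mask."""
        return bool(incoming_concepts & self.concept_mask)

frame = FrameRecord(
    frame_data="* who ingest poison will die",
    concepts=CONCEPT_PERSON | CONCEPT_POISON,
    concept_mask=CONCEPT_PERSON,
    exclusions=["object"],
)
print(frame.accepts(CONCEPT_PERSON))
print(frame.accepts(CONCEPT_OBJECT))
```

The Tense-Mask and Extension-Mask fields could follow the same pattern, each filtering its own bit defined field.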

Section 4: Experimentation, Results and Analysis

As Hercules is in the prototype stage, most experiments, analysis and results will follow from future work. Hercules has been designed to run experiments on patterns of discourse and to store the resulting data in a database. The results of the experiments will then supplement a context, so that the discourse can be understood in the context provided by the user. It is anticipated that neural network style learning using the scores, signatures, concept fragments and wave analysis will supplement this understanding. Because of the breadth and flexibility Hercules provides, it is outside the scope of this report to examine specific formulae and the results that may be found, except to say that Hercules supports user defined functions and symbols mapped to software #defined values, giving flexibility in executing the hard-coded methods that support the experiments envisaged.

Section 5: Future Work

5.1 Algorithm Design

As described earlier in section 3.61, Frame Queries and Statistics, the method of A Hybrid Approach to Word Sense Disambiguation: Neural Clustering with Class Labelling (section 2.6) may be used in conjunction with A Generative Model for Semantic Role Labelling (section 2.7). Role labelling would be supplemented by the speech patterns identified by the Frame Queries and Statistics of section 3.61, supporting the abstraction of concepts in section 3.9 and assisting in the understanding of shared communications and goals such as those in section 3.10.

5.2 Hercules Parser Enhancements

Hercules must be augmented to handle many of the logic functions and elements used in Set Theory and Predicate Logic. This will allow formal logic algorithms to be used alongside mathematical equations with subsets of data, to control the interpretations and responses of the Hercules Parser. The graphical user interface must be enhanced to assist with streamlining manual enhancements.

The uses of Frame and Table metadata must be further defined and applied alongside the script functions and operations, which will assist in reaching the goals of true artificial intelligence.

5.3 Reaching the goal of True Artificial Intelligence

Ultimately, the Hercules Parser will likely construct its own scripts automatically, to

deal with new communications and concepts.

The accuracy of the hypotheses Hercules forms, and of its replies, with regard to the communications may be the true test of the intelligence of the machine.

Section 6: Concluding Remarks

The Hercules parser has been created with so much flexibility in mind that it is

difficult to discuss the entire program and all functions it is capable of. I have aimed

this exposition at the most interesting and important components of the parser used

with Wordnet. The Hercules parser can provide a simple and easy way to design, test,

and implement Artificial Intelligence algorithms, formulae and scripts using a

graphical interface, without the need for others to have a complex knowledge of

computer programming to do so.