gellish a standard data and knowledge representation language and ontology


Upload: andriesvanrenssen

Post on 14-Jun-2015


DESCRIPTION

Database structures should allow for the expression of any fact that can be expressed in natural languages. This means that they should allow for any expression of facts in a universal formal language. Constraints should be specified in a separate layer. This article describes the basic concepts of such a universal semantic database structure and associated formal subset of a natural language.

TRANSCRIPT

Page 1: Gellish   A Standard Data And Knowledge Representation Language And Ontology

Gellish

A standard data and knowledge representation language and ontology

Are Data Models becoming Superfluous?

by Ir. Andries van Renssen

Shell Global Solutions International

[email protected]

Abstract

Data storage and data communication lack a common standard universal data model as well as a common data language and knowledge base with a taxonomy of concepts and a grammar for data exchange messages. This article presents a solution to this problem in the form of the new Gellish language and knowledge base, as an extension of the standard data models and ontology of two new ISO standards. The article presents Gellish as a language for neutral data exchange between systems that can replace data models and that provides an extendable ontology with standard reference data for customization and harmonization of systems. The definition of Gellish includes the public domain (“open data”) Gellish knowledge base with definitions of a large number of concepts and product models. The article illustrates that a single Gellish Table in a database or data exchange file is sufficient to express a wide range of kinds of facts about classes as well as facts about individual objects.

Keywords: knowledge representation, data exchange, language, data models, standards, ontology, semantic web, knowledge base, classification system

Table of Contents

1 Introduction

1.1 Standard data models, ontologies and reference data

2 The Gellish language and ontology

3 Storage and exchange of data as well as semantics in Gellish

4 Interpretation of expressions

5 Experiences and applications

6 Conclusions

7 References

Gellish 1 13/04/2023


1 Introduction

Currently, each software system stores its data using its own data model and usually communicates with other systems through a dedicated interface data structure, which means that it applies a dedicated interface data model. The large variety of data models makes data exchange between systems costly, because data must be converted from the semantics of one data model to those of another. This demonstrates the urgent need for widely applicable common standard data models.

Often systems can be ‘customized’ by adding ‘reference data’ as instances, such as the definition of equipment types, document types, activity types, property types, pick lists, etc. However, reference data usually differ per implementation, even when the database structures are identical, as is the case with several implementations of the same system, such as a CAD, CAE, PDM, PLM, ERP or CRM system. The consequence is that data in those implementations still cannot be compared, integrated or exchanged without costly data conversion processes. This illustrates the urgent need for a common dictionary, classification system or taxonomy of reference data, because there is currently no standard user data language.

In current systems there is a separation between the world of data models and the world of instances. Data models are developed by IT specialists (data modelers) who document them using either proprietary tools or a standard data modeling language, such as EXPRESS (ISO 10303-11) or UML, languages that are designed specifically to define data models. Once a data model is defined in such a language, the data model acts as another language in which the reference data as well as the user data have to be expressed. The use of two different languages, one for the model and one for the user data, illustrates the barrier between the two worlds. It is as if the definition of the English language were expressed in Chinese. On top of this, each programmer and each reference data producer is free to define his own terminology using those data definition languages!

The result of the current state of the art is that data storage is done in a Babylonian mix of data models and reference data ‘languages’ with the consequence that exchange of data between systems is impossible, except where dedicated bilateral translators are created not only for each pair of data models, but also for the data content ‘languages’.

The current situation is sketched by Smith and Welty (2001) as follows: “Out of the apparent chaos, some coherence is beginning to emerge. Gradually, computer scientists are beginning to recognize that the provision, once for all, of a common, robust reference ontology – a shared taxonomy of entities – might provide significant advantages over the ad-hoc, case-by-case methods previously used”.

Several attempts have been made to develop an ‘upper ontology’, such as SUMO by Niles and Pease (2001), the IEEE Standard Upper Ontology, SUO (2001), the Cyc ontology, Lenat (1995) and GOL, Degen et al (2001). However, none of them integrates the upper level ontology with a lower level ontology of reference data. In other words, they do not integrate a generic data model with reference data and a language for the description of knowledge and of individual objects and processes.

This article presents a solution to the above-mentioned issues in the form of the Gellish language. Gellish satisfies the criteria for proper ontologies as expressed by Degen et al (2001 par 6.1), but is not limited to an upper ontology. It includes and extends concept definitions that also appear in other sources such as ISO standards and IEC standards, and knowledge stemming from industry standards and proprietary sources. It is extendable just as any natural language. Its taxonomy and knowledge base use unique identifiers for concepts, thus allowing for synonyms and multiple names in various languages. The latter enables the expression of propositions about facts in one natural language and their automatic translation and presentation in any other natural language.
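The role of unique identifiers in translation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the UIDs 111, 1225 and 130206 are taken from the tables later in this article, but the `NAMES` dictionary, the `render` function and the Dutch names are hypothetical and are not taken from the Gellish dictionary.

```python
# Hypothetical name table: UID -> language code -> list of names (synonyms allowed).
# The Dutch names are illustrative assumptions, not quotes from the Gellish dictionary.
NAMES = {
    111: {"en": ["P-101"], "nl": ["P-101"]},
    1225: {"en": ["is classified as a"], "nl": ["is geclassificeerd als een"]},
    130206: {"en": ["pump"], "nl": ["pomp"]},
}


def render(fact, language):
    """Present a fact, stored purely as UIDs, using the first name in the given language."""
    left_uid, relation_uid, right_uid = fact
    return " ".join(NAMES[uid][language][0] for uid in (left_uid, relation_uid, right_uid))


# The stored fact is language independent: only identifiers are kept.
fact = (111, 1225, 130206)
print(render(fact, "en"))
print(render(fact, "nl"))
```

Because the fact itself holds only identifiers, adding a language means adding names, not changing facts.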

Gellish eliminates the traditional barrier between the data model definitions of classes and the data instances. The Gellish language demonstrates that this barrier is not necessary and that there are clear advantages when class definitions, reference data and user data are expressed in one and the same language.

1.1 Standard data models, ontologies and reference data

There are several developments of standard lower level ontologies and reference data libraries, stimulated among others by requirements of the e-commerce ‘market places’ and the developments around The Semantic Web promoted by Lee et al (2000) and the Web Ontology Language OWL. Examples are the UNSPSC code (http://www.unspsc.org/), Ecl@ss (http://www.eclass.de/) and Trade Ranger (http://www.trade-ranger.com/EN/Pages/ContentStandards.asp). These standards have their value mainly in the standardization of terminology, but do not provide a standard language or a standard data model for general use: their semantic expression power is limited because they apply only a few relation types and are not integrated with a rich upper ontology.

There have also been several attempts to develop standard data models for data exchange or for data storage. Some of them are proprietary, but others are in the public domain. Those standard data models are defined independent of a particular system, and are therefore called ‘neutral’. They are usually developed for a particular application domain instead of being limited to a particular system. Examples of standard data models are the STEP family of standards in ISO 10303, such as a graphics data model (AP203), a data model for the automotive industry (AP214), one for piping systems (AP227), one under development for the defense industry (AP239, PLCS), etc. The integration of all those data models into one overall data model is not yet fully achieved. Although the scopes of these valuable standard data models are wide, they are still limited to particular application areas and do not yet provide a general ‘common language’.

A further step towards a data model with a generic scope was the development and publication of the Epistle Core Data Model (2001), in which development the author of this article participated. From that, two new ISO standards were derived, ISO 15926-2 and its counterpart within the STEP family (AP221). Although these generic data models stem from the process industries, they have the generic nature of an upper level ontology, which makes them applicable in other application domains as well.

To become practically applicable in a particular application domain, these generic data models need a standard ‘reference data library’ or lower level ontology, in order to add standard definitions of application domain specific concepts and to specialize the generic data model. The author coordinated the development of such a standard reference data library, called STEPlib. This is a main source for the common standard library ISO 15926-4. Then it was discovered that the top of the specialization hierarchy of standard data in the library coincided with the entities, attributes and relations in the generic data model. This led to the inclusion of the data model in the library. In other words, the upper level ontology was combined with a lower level ontology. The insight that information should be contained in relations and not in objects led to the birth of the Gellish language, which is based on standard relation types, expressed by natural language ‘phrases’.

2 The Gellish language and ontology

Gellish is a public domain standard data and knowledge representation language and ontology that is defined in STEPlib. It does not have the barrier between the user data and the IT data model data. It contains the concepts of the above-mentioned generic data models, and integrates and extends them with standard reference data and a knowledge base with product and process models. The ontology also includes the definition of a large number of standard fact types (or relation types) that define the grammar of the Gellish language. It contains the definitions of over 20,000 concepts arranged in a specialization hierarchy of classes. These concepts can be interpreted as entity types, attribute types and relationship types or as a classification system or taxonomy. This makes Gellish equivalent to a very large data model. In addition, STEPlib contains a large number of relations between the concepts. They define the content of the knowledge base of product models and process models.

Gellish is not object oriented, but fact oriented. The basic Gellish object is therefore a fact. Each (atomic) fact is expressed as a relation between (two) objects. For example, fact 1 is expressed by a particular relation between objects with unique identifiers (UIDs) 100 and 101. This expression (1, 100, 101) illustrates the structure of each basic Gellish expression. Gellish requires that both the objects and the fact are classified explicitly by standard classes, including standard relation types. The standard classes are predefined in the Gellish ontology. In addition, objects may have a name. Together, this enables software to interpret the expression correctly.

Gellish and the above-mentioned ISO standards are both based on the understanding that there appears to exist a limited set of application independent standard relation types that are sufficient to model all kinds of products and processes. Gellish standardizes these relation types. The relation types also define the role types that the related objects play in the relations with each other. The variety and extendibility of standard relation types define the semantic expression capabilities of Gellish. A large part of the Gellish relation types is defined in the ISO standards and an extended set is defined in the TOP part of the Gellish language definition (STEPlib).

A standard implementation of Gellish is defined as a Gellish Table. In a Gellish Table the basic Gellish expression becomes:

| Left hand object UID | Left hand object name | Fact UID | Relation type UID | Relation type name | Right hand object UID | Right hand object name |
|---|---|---|---|---|---|---|
| 100 | thing-1 | 1 | 2850 | is related to | 101 | thing-2 |

In a Gellish Table one (atomic) fact is represented by one record that holds a relation between two object UIDs, the names of the objects and the classification of the fact (its relation type). The classification of the objects is done via separate classification facts in additional records.
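As a minimal sketch, one such record could be held in a small Python structure. The field names and helper methods below are assumptions chosen for illustration; only the column meanings and the example values come from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One record of a Gellish Table: a single atomic fact (field names are illustrative)."""
    lh_uid: int         # left hand object UID
    lh_name: str        # left hand object name
    fact_uid: int       # UID of the fact itself
    rel_type_uid: int   # UID of the standard relation type that classifies the fact
    rel_type_name: str  # phrase that names the relation type
    rh_uid: int         # right hand object UID
    rh_name: str        # right hand object name

    def as_triple(self):
        """Return the bare identifier structure (fact, left object, right object)."""
        return (self.fact_uid, self.lh_uid, self.rh_uid)

    def as_sentence(self):
        """Return the fact as a readable phrase built from the three names."""
        return f"{self.lh_name} {self.rel_type_name} {self.rh_name}"


example = Fact(100, "thing-1", 1, 2850, "is related to", 101, "thing-2")
print(example.as_triple())
print(example.as_sentence())
```

The triple corresponds to the expression (1, 100, 101) discussed earlier; the names and the relation type are what make the record readable for people and interpretable for software.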

Some examples of facts from a particular application domain, which illustrate the use of standard Gellish relation types, are:


| Left hand object UID | Left hand object name | Fact UID | Relation type UID | Relation type name | Right hand object UID | Right hand object name | Scale |
|---|---|---|---|---|---|---|---|
| 130091 | diesel engine | 2 | 1146 | is a specialization of | 130108 | engine | |
| 104 | M-1 | 3 | 1225 | is classified as a | 130091 | diesel engine | |
| 130802 | cylinder | 4 | 1146 | is a specialization of | 730063 | artifact | |
| 107 | C-1 | 5 | 1225 | is classified as a | 130802 | cylinder | |
| 107 | C-1 | 6 | 1190 | is part of | 104 | M-1 | |
| 107 | C-1 | 7 | 1727 | has aspect | 108 | volume of C-1 | |
| 108 | volume of C-1 | 8 | 1225 | is classified as a | 550140 | internal volume | |
| 108 | volume of C-1 | 9 | 2044 | is quantified as | 922235 | 1800 | cm3 |
| 104 | M-1 | 10 | 4760 | is subject of | 110 | order-1 | |

Note: for human readability, the relation type UID is omitted from the tables below.

The above table illustrates:

- Standard Gellish relation types, that classify the facts, and that determine the expression capabilities and semantics of Gellish.

- Examples from the large number of standard object types that are predefined in Gellish. For example: engine, diesel engine, cylinder, artifact, internal volume, 1800 and cm3.

- The way in which new object types can be added, as in facts 2 and 4. Diesel engine and cylinder already exist in Gellish, but if they had not existed, they could have been added in this way.

- It is possible in Gellish to express facts, such as the volume of C-1, without such a fact type being pre-modeled in a data model. Such a fact type can nevertheless be defined in Gellish, after which this particular instance can be verified against that definition. It can also be defined to be obligatory in a particular context, after which instances can be validated for completeness and compliance.

- One table is suitable to express many kinds of facts.

Note: The table above presents just an example of some of the capabilities of Gellish. For example, Gellish also allows expressing in which language the facts are expressed, whether the objects are real or imaginary, what the communicative intent is, who the author of a proposition is and who the addressee is, etc.
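The claim that one table suffices for many kinds of facts can be made concrete with a short sketch. The rows are copied from the example table above; the tuple layout and the functions `select`, `class_of`, `parts_of` and `aspect_values` are assumptions made for illustration and are not part of Gellish.

```python
# Rows of the example table:
# (lh_uid, lh_name, fact_uid, rel_uid, rel_name, rh_uid, rh_name, scale)
ROWS = [
    (130091, "diesel engine", 2, 1146, "is a specialization of", 130108, "engine", None),
    (104, "M-1", 3, 1225, "is classified as a", 130091, "diesel engine", None),
    (130802, "cylinder", 4, 1146, "is a specialization of", 730063, "artifact", None),
    (107, "C-1", 5, 1225, "is classified as a", 130802, "cylinder", None),
    (107, "C-1", 6, 1190, "is part of", 104, "M-1", None),
    (107, "C-1", 7, 1727, "has aspect", 108, "volume of C-1", None),
    (108, "volume of C-1", 8, 1225, "is classified as a", 550140, "internal volume", None),
    (108, "volume of C-1", 9, 2044, "is quantified as", 922235, "1800", "cm3"),
    (104, "M-1", 10, 4760, "is subject of", 110, "order-1", None),
]


def select(rows, lh_uid=None, rel_uid=None, rh_uid=None):
    """Return the rows that match every criterion that is not None."""
    return [r for r in rows
            if (lh_uid is None or r[0] == lh_uid)
            and (rel_uid is None or r[3] == rel_uid)
            and (rh_uid is None or r[5] == rh_uid)]


def class_of(rows, uid):
    """Return the name of the class of an object, via relation type 1225."""
    hits = select(rows, lh_uid=uid, rel_uid=1225)
    return hits[0][6] if hits else None


def parts_of(rows, whole_uid):
    """Return the names of the objects that are part of the given whole (type 1190)."""
    return [r[1] for r in select(rows, rel_uid=1190, rh_uid=whole_uid)]


def aspect_values(rows, uid):
    """Return (aspect class, value, scale) for each quantified aspect of an object."""
    result = []
    for aspect in select(rows, lh_uid=uid, rel_uid=1727):
        aspect_uid = aspect[5]
        for quantity in select(rows, lh_uid=aspect_uid, rel_uid=2044):
            result.append((class_of(rows, aspect_uid), quantity[6], quantity[7]))
    return result


print(class_of(ROWS, 104))
print(parts_of(ROWS, 104))
print(aspect_values(ROWS, 107))
```

Classification, composition and property values are all answered from the same list of rows; no table or column had to be designed for ‘engine’, ‘cylinder’ or ‘volume’.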

3 Storage and exchange of data as well as semantics in Gellish

In this section I will describe how knowledge, data and semantics are represented in Gellish. The generic nature of Gellish allows expressing any complex network of facts. For example, it allows expressing that:

- physical objects (of any kind) have properties (of any kind),

- properties have values,

- physical objects have parts,

- physical objects participate in activities or processes in particular roles,

- etc.

But for clarity I will use a specific example, being the fact that:

- a particular pump (‘P-1’) is pumping a particular stream (‘S-1’).

In a conventional database it is required to declare some entity types and attribute types that define the semantics in the form of a data model. In case of the example, the data model could for example consist of the entity types ‘pump’, ‘process’ and ‘stream’, each with some attributes.

In Gellish, the concepts ‘pump’, ‘process’ and ‘stream’ are not entity types, but they are concepts that are defined via facts that are expressed as relations in a generic knowledge base.

The knowledge base is built on a structure that only ‘knows’ the minimum number of ‘basic semantic concepts’ and contains the definition of a large number of concepts. The minimum set of ‘basic semantic concepts’ comprises the fundamental ontological axioms of Gellish in a structure that should be known and understood and which is sufficient for the definition of additional semantic concepts. That structure is presented in figure 1. For the definition of a new concept (‘anything’) it is required to define such a coherent structure of elementary facts. Each elementary fact is expressed as a relation between two concepts, represented by the blue boxes in figure 1. In other words, each new concept requires the creation of a structure as presented in figure 1.

Figure 1, Structure of basic semantic concepts

The minimum set of ‘basic semantic concepts’ that are the axioms of Gellish, and whose meaning should be understood, is:

- anything

- role

- relation / relations

- plays role

- requires role

- is / is a (is classified as a)

- individual thing / individual things

- kind of thing / kind of things

- single thing / plural thing

[Figure 1 diagram. Recoverable labels: object-1, role-1, relation-1, role-2 and object-2, connected by ‘plays’, ‘played by’, ‘requires’ and ‘in’; each is linked by ‘is a’ / ‘is (a)’ to a kind of thing: ‘anything playing a role’, ‘role (of something in relation)’, ‘relation’ and ‘requirement of role’.]

The structure of figure 1 holds for facts about classes as well as facts about individual objects (instances) or relations, but also for single objects as well as for plural objects. In other words, object-1 and object-2 in figure 1 can be either a single or plural individual object, relation or class. The lines in the top left corners of the boxes indicate that the structure is a typical instance.

Any other ‘atomic fact’ is expressed as such a structure. In other words, any atomic fact is expressed as an ‘atomic relation’ between two or more ‘objects’ and by the classification of the ‘objects’, the ‘roles’ and the ‘relation’. This implies that an atomic fact is expressed by a structure of nine (9) relations, formed by the blue boxes in figure 1 (note that 4 of the 5 boxes appear twice in an atomic fact).

For example, the fact that impeller O1 is part of centrifugal pump O2 is expressed in Gellish by the following 4 elementary relations:

- O1 plays role R1

- R1 is required by C1

- C1 requires role R2

- R2 is played by O2

These 4 relations relate 5 objects. To interpret them correctly the following 5 additional classification relations are required:

- O1 is classified as an impeller

- R1 is classified as a part

- C1 is classified as a composition relation (“is part of”)

- R2 is classified as a whole

- O2 is classified as a centrifugal pump

In practical implementations it appears that the explicit identification of the roles and their classification can be omitted, because they follow from the classification of the relation and the definition of the relation type. Therefore the above relations are usually summarized in 3 Gellish atomic expressions as follows:

- O1 is classified as an impeller

- O1 is part of O2

- O2 is classified as a centrifugal pump

From this example it can be seen that the 5 kinds of things with which the 5 objects are classified need to be present in or added to the semantics of the Gellish knowledge base in order to ensure that the fact can be interpreted correctly.
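The step from the 3 summarized expressions back to the full structure of 9 relations can be sketched as follows. The lookup table `RELATION_TYPES` and the function `expand` are hypothetical helpers; the role names ‘part’ and ‘whole’ and the identifiers O1, R1, C1, R2 and O2 are those used in the example above.

```python
# Assumed definition lookup: relation phrase -> (kind of relation, role-1 kind, role-2 kind).
RELATION_TYPES = {
    "is part of": ("composition relation", "part", "whole"),
}


def expand(o1, o1_kind, phrase, o2, o2_kind, r1="R1", c1="C1", r2="R2"):
    """Expand one summarized fact into 4 elementary and 5 classification relations."""
    relation_kind, role1_kind, role2_kind = RELATION_TYPES[phrase]
    elementary = [
        (o1, "plays role", r1),
        (r1, "is required by", c1),
        (c1, "requires role", r2),
        (r2, "is played by", o2),
    ]
    classifications = [
        (o1, "is classified as a", o1_kind),
        (r1, "is classified as a", role1_kind),
        (c1, "is classified as a", relation_kind),
        (r2, "is classified as a", role2_kind),
        (o2, "is classified as a", o2_kind),
    ]
    return elementary + classifications


relations = expand("O1", "impeller", "is part of", "O2", "centrifugal pump")
for left, phrase, right in relations:
    print(left, phrase, right)
```

The roles can be left implicit in the table precisely because a lookup of this kind, driven by the definition of the relation type, can regenerate them.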

The awareness that a knowledge base of predefined concepts is required for a correct interpretation of Gellish expressions resulted in the development of the top-down hierarchical definition of the Gellish knowledge base of concepts, including also relation types, as available in STEPlib.

Knowledge representation: relations between classes

Any fact type that extends the semantics is expressed as a relation between kinds of things. For example, assume that the concept ‘centrifugal pump’ needs to be added. Then the following two atomic relations define that concept:

1. A specialization relation that defines that: centrifugal pump is a specialization of pump


2. A relation that defines that a centrifugal pump by definition uses the centrifugal principle: centrifugal pump has by definition as aspect centrifugal.

These relations build respectively on the definition of the concept ‘pump’ and ‘centrifugal’.
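A minimal sketch of adding a concept in this way is given below. The function `add_concept` and the two containers are illustrative assumptions; the rule that every new concept must be a specialization of an existing one follows the description in this article, and the two relation phrases are the ones quoted above.

```python
SPECIALIZATION = "is a specialization of"

# A tiny stand-in for the knowledge base: known concept names and stored facts.
concepts = {"anything", "pump", "centrifugal"}
facts = []


def add_concept(name, defining_relations):
    """Add a concept defined by (phrase, target) relations to already known concepts."""
    if not any(phrase == SPECIALIZATION for phrase, _ in defining_relations):
        raise ValueError("a new concept needs at least one specialization relation")
    for _, target in defining_relations:
        if target not in concepts:
            raise KeyError(f"undefined concept: {target}")
    concepts.add(name)
    facts.extend((name, phrase, target) for phrase, target in defining_relations)


add_concept("centrifugal pump", [
    (SPECIALIZATION, "pump"),
    ("has by definition as aspect", "centrifugal"),
])
print(facts)
```

Rejecting a definition that points at an unknown concept mirrors the statement that the new relations build on the existing definitions of ‘pump’ and ‘centrifugal’.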

4 Interpretation of expressions

In current database technology the semantic interpretation of an expression is done via the fact that any object is implicitly classified by being an ‘instance’ of an entity of which the semantics are defined. For example, assume that P1 is an instance of an attribute called ‘name’ of an entity called ‘pump’. This probably means that P1 is the name of a thing that is classified as a pump, although this meaning comprises two facts that are usually not defined in a computer interpretable way. It should be noted that if there are no other attributes, this data structure does not allow the classification of P1 as a centrifugal pump.

In Gellish all semantics is made explicit by the creation of explicit classification relations between the elements in the expression and the Gellish concepts (classes of objects, including relations). This replaces the instantiation relations and eliminates the need to define a data model with entities and attributes, such as the entity ‘pump’ and the attribute ‘name’. This is illustrated in figure 2.

Figure 2, Linking a Gellish expression to Gellish concepts through classification

Figure 2 illustrates the expression “P-101 is pumping S-1” (in dark yellow). The ‘pumping S-1’ process is an interaction between the fluid S-1 and the pump P-101. The pump has the role of performer and the liquid has the role of subject in the pumping process. The blue boxes in the green shaded area represent the Gellish concepts, being instances in the Gellish knowledge base, STEPlib. The explicit classification relations with the concepts in those blue boxes provide the semantics for the interpretation of the expression.

[Figure 2 diagram. Recoverable labels: the individuals ‘P-101’ (111), ‘pumping S-1’ (112) and ‘S-1’ (113); the relations ‘is performer of pumping S-1’ (11) and ‘is subject in pumping S-1’ (12) with their player and requirer ends; the classification relations 13, 14 and 15 (‘is classified as a’, with classifier and classified ends); and, in the green shaded area (Gellish ontology, STEPlib), the concepts pump (130206), pumping (192512), liquid stream (730083), ‘is performer of’ and ‘is subject in’.]

In a Gellish Table this becomes:

| Left hand object UID | Left hand object name | Fact UID | Relation type name | Right hand object UID | Right hand object name |
|---|---|---|---|---|---|
| 111 | P-101 | 11 | is performer of | 112 | pumping S-1 |
| 113 | S-1 | 12 | is subject in | 112 | pumping S-1 |
| 111 | P-101 | 13 | is classified as a | 130206 | pump |
| 112 | pumping S-1 | 14 | is classified as a | 192512 | pumping |
| 113 | S-1 | 15 | is classified as a | 730083 | liquid stream |

Such a set of rows in a Gellish Table can be exchanged between Gellish enabled software packages in any kind of table, such as an MS-Access database table, an Oracle or DB2 table, XLS spreadsheet, an XML file (e.g. according to ISO 10303-28) or in STEP physical file format (ISO 10303-21). Further details are described in ref. 1.
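As a rough illustration of such an exchange, the sketch below writes the five rows above to a delimited text table and reads them back. CSV is an assumption chosen here for brevity; it is not one of the formats named in this article, and the functions `to_csv` and `from_csv` are hypothetical helpers, not part of any Gellish software.

```python
import csv
import io

COLUMNS = [
    "Left hand object UID", "Left hand object name", "Fact UID",
    "Relation type name", "Right hand object UID", "Right hand object name",
]

ROWS = [
    (111, "P-101", 11, "is performer of", 112, "pumping S-1"),
    (113, "S-1", 12, "is subject in", 112, "pumping S-1"),
    (111, "P-101", 13, "is classified as a", 130206, "pump"),
    (112, "pumping S-1", 14, "is classified as a", 192512, "pumping"),
    (113, "S-1", 15, "is classified as a", 730083, "liquid stream"),
]


def to_csv(rows):
    """Serialize fact rows, preceded by the column header, to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(rows)
    return buffer.getvalue()


def from_csv(text):
    """Parse CSV text back into fact rows, checking the column layout first."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    if header != COLUMNS:
        raise ValueError("unexpected column layout")
    return [(int(lh), lh_name, int(fact), rel, int(rh), rh_name)
            for lh, lh_name, fact, rel, rh, rh_name in reader]


text = to_csv(ROWS)
restored = from_csv(text)
print(text)
```

Because every fact has the same columns, the receiving side needs no knowledge of pumps or streams to parse the message; that knowledge is looked up through the UIDs.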

Note that the shaded light yellow boxes all have the same name: “is classified as a”. However, they are different individual classification relations. Each of those relations has a unique identifier (13, 14 and 15). The name in the shaded box indicates that each is (implicitly) “conceptualized” to be a classification relation. In other words, each of them is a “is classified as a” relation.

For a correct interpretation of the Gellish concepts they need to be defined in a computer interpretable way. This is done via specialization/generalization relations as is illustrated in figure 3. These specialization relations form one hierarchical network terminating at the top, called ‘anything’. This generic top supports the wide applicability of Gellish, as any missing concept can be added to Gellish as a subtype of an existing concept.

[Figure 3 diagram. Recoverable labels: the individual things ‘P-101’, ‘S-1’, ‘pumping S-1’, ‘performer of pumping S-1’ and ‘subject in pumping S-1’; ‘is classified as a’ relations (classifier / classified) to the kinds of things pump, pumping, liquid stream, ‘is performer of’ and ‘is subject in’; ‘is a specialization of’ relations (subtype / supertype) involving physical object, relation, activity, individual thing, kind of thing and anything; an ‘is an instance of’ relation to entity; the green area marks the Gellish ontology.]

Figure 3, Definition of Gellish concepts in a specialization hierarchy

In practice there are several intermediate levels of specialization, e.g. between ‘pump’ and ‘physical object’ and between ‘physical object’ and ‘anything’. Furthermore there are classes of physical objects defined as subtypes of ‘physical object’. These can be extended by specializations, such as standard components (e.g. from ASME, BSI or DIN standards) and manufacturer catalogue items (e.g. manufacturer models and types).

Figure 3 contains eight facts expressed as eight “is a specialization of” relations, each of which is a separate relation between classes. Similarly to what is described above about the “is classified as a” relation, this illustrates that the term ‘is a specialization of’ is not the name of each of those relations, but it is a name of the Gellish concept (the class) that is the conceptualization of those relations.

The knowledge about the meaning of the concepts pump, ‘is performer of’, liquid stream, ‘is subject in’ and pumping is defined in the Gellish ontology STEPlib. Some of that is illustrated in the following facts, which includes some intermediate facts not shown in figure 3 (the UID’s and names are taken from STEPlib, except for the UID’s of the facts):

| Left hand object UID | Left hand object name | Fact UID | Relation type name | Right hand object UID | Right hand object name |
|---|---|---|---|---|---|
| 130206 | pump | 16 | is a specialization of | 730044 | physical object |
| 4761 | is performer of | 17 | is a specialization of | 4767 | is involved in |
| 4761 | is performer of | 18 | requires as role-1 a | 640020 | performer |
| 730044 | physical object | 19 | can have as role as a | 640020 | performer |
| 4761 | is performer of | 20 | requires as role-2 a | 4773 | involver |
| 730083 | liquid stream | 21 | is a specialization of | 730045 | stream |
| 4760 | is subject in | 22 | is a specialization of | 4767 | is involved in |
| 192512 | pumping | 23 | is a specialization of | 190168 | process |


This knowledge is inherited from higher concepts in the hierarchy to lower level concepts. If an individual object is classified to be of such a class, then the knowledge is applicable to the individual object as a constraint for the specific aspects of the individual object.
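The inheritance described here can be sketched with the facts numbered 13, 16, 19, 21 and 23 above. The functions `supertypes` and `allowed_roles` are illustrative assumptions about how software might apply the hierarchy; they are not a description of an existing Gellish implementation.

```python
FACTS = [
    ("P-101", "is classified as a", "pump"),                    # fact 13
    ("pump", "is a specialization of", "physical object"),      # fact 16
    ("physical object", "can have as role as a", "performer"),  # fact 19
    ("liquid stream", "is a specialization of", "stream"),      # fact 21
    ("pumping", "is a specialization of", "process"),           # fact 23
]


def supertypes(facts, kind):
    """Return the kind itself followed by all its direct and indirect supertypes."""
    found, todo = [], [kind]
    while todo:
        current = todo.pop()
        if current in found:
            continue
        found.append(current)
        todo.extend(right for left, phrase, right in facts
                    if left == current and phrase == "is a specialization of")
    return found


def allowed_roles(facts, individual):
    """Collect the roles an individual may play, inherited through its classification."""
    roles = set()
    for left, phrase, kind in facts:
        if left == individual and phrase == "is classified as a":
            for ancestor in supertypes(facts, kind):
                roles.update(right for l2, p2, right in facts
                             if l2 == ancestor and p2 == "can have as role as a")
    return roles


print(supertypes(FACTS, "pump"))
print(allowed_roles(FACTS, "P-101"))
```

Nothing states directly that P-101 can be a performer; the conclusion follows from its classification as a pump and the knowledge attached to ‘physical object’.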

5 Experiences and applications

Gellish is applied to express:

- information about individual objects,

- knowledge about kinds of objects,

- requirements for data and documents in particular contexts about individual objects and about kinds of objects.

These three applications are related to each other, as is illustrated in Figure 4.

Figure 4, Three types of Gellish Models

The left hand of Figure 4 represents a Product Model that illustrates a Gellish model of a process plant (the thick black lines represent composition relations). The relation types in a product model generally start with ‘is’ or ‘has’. For example, K-1301 system is part of U-1300 and K-1301 is classified as a compressor. The right hand Knowledge Model illustrates the content of the STEPlib knowledge base. The relation types in a knowledge model generally start with ‘can be a’ or ‘can have a’. For example, a compressor can have a capacity and a lubrication oil system can be part of a compressor.


[Figure 4 content: three linked panels titled Product Model, Requirements Model and Knowledge Model. Object labels: Dongting, Coal gasification facility, K-1301 system, U-1300, K-1301, LubOil-100, SGP, STEPlib, SHELLlib, DEP xxx, compressor, luboil system, capacity. Relation labels: has / is; is classified as a; shall have a / shall be a (in the context of a); shall comply with; can have a / can be a. Copyright: Shell Global Solutions International B.V.]


The middle part of Figure 4 illustrates a proprietary Requirements Model that expresses which data has to be present in a particular context. The relation types in a requirements model generally start with ‘shall be a’ or ‘shall have a’. For example, we developed requirements models that express that, in the context of the ‘handover’ of data from design to operations, a compressor shall have a capacity and, in the same context, a compressor shall be compliant with design guide xx. This is expressed in Gellish as follows:

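| Left hand object UID | Left hand object name | Fact UID | Relation type name | Right hand object UID | Right hand object name |
|---|---|---|---|---|---|
| 130069 | compressor | 24 | shall have a | 551564 | capacity |
| 130069 | compressor | 25 | shall be compliant with | 5490386 | DEP 31…. |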

When data about a compressor is handed over, this Gellish specification makes it possible to verify the completeness of that data automatically, with the verification driven by the requirements model. This is illustrated in Figure 5.

Figure 5, Automated verification of a design against a requirements model

The right-hand side of Figure 5 illustrates the content of the SHELLlib knowledge base, which is a proprietary extension of STEPlib and is also expressed in Gellish. It illustrates how the knowledge in STEPlib and SHELLlib is inherited via the specialization hierarchy. Although P-101 is classified as a centrifugal pump, a requirement that is defined for a pump in general can automatically be made applicable to P-101, because of the inheritance defined via the specialization hierarchy.
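A completeness check of this kind might look as follows. This is a minimal sketch under stated assumptions, not the verification software referred to in the text: it uses concept names instead of UIDs, assumes each class has a single supertype, and the requirement on pump and the handed-over data for K-1301 and P-101 are hypothetical examples. Only the requirement "compressor shall have a capacity" is taken from the article.

```python
# subtype -> supertype (single-parent hierarchy assumed for this sketch)
SPECIALIZATIONS = {"centrifugal pump": "pump"}

# individual object -> class it is classified as
CLASSIFICATIONS = {"K-1301": "compressor", "P-101": "centrifugal pump"}

# (class, aspect) pairs read as "<class> shall have a <aspect>"
REQUIREMENTS = [
    ("compressor", "capacity"),  # fact 24 in the article
    ("pump", "capacity"),        # hypothetical requirement for illustration
]

# Hypothetical handed-over data: individual object -> aspects that were supplied
HANDED_OVER = {"K-1301": {"capacity"}, "P-101": set()}


def class_and_supertypes(cls):
    """Return the class followed by its chain of supertypes."""
    chain = [cls]
    while chain[-1] in SPECIALIZATIONS:
        chain.append(SPECIALIZATIONS[chain[-1]])
    return chain


def missing_aspects(individual):
    """List the required aspects that are absent from the handed-over data."""
    chain = class_and_supertypes(CLASSIFICATIONS[individual])
    required = [aspect for cls, aspect in REQUIREMENTS if cls in chain]
    present = HANDED_OVER.get(individual, set())
    return [aspect for aspect in required if aspect not in present]


if __name__ == "__main__":
    for name in CLASSIFICATIONS:
        print(name, "is missing:", missing_aspects(name))
```

Here the requirement stated for pump is applied to P-101 even though P-101 is classified only as a centrifugal pump, which mirrors the inheritance described above.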


The specialization hierarchy also enables intelligent queries, because search engines can extend a search on a keyword to the subtypes of that keyword. For example, a document that is recorded to contain information about a line shaft pump can also be found when documents about ‘centrifugal pump’ are searched for. Likewise, a query on ‘pump’ can also find P-101, which is classified as a centrifugal pump.
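Such a subtype-aware search can be sketched as follows. The sketch assumes, as the examples above imply, that line shaft pump is a subtype of centrifugal pump and centrifugal pump a subtype of pump; the document identifier DOC-1 and the function names are hypothetical.

```python
# (subtype, supertype) pairs from the specialization hierarchy
SPECIALIZATIONS = [
    ("line shaft pump", "centrifugal pump"),
    ("centrifugal pump", "pump"),
]

# (object, concept) pairs: what a document is about, or how an object is classified
ABOUT = [("DOC-1", "line shaft pump")]         # hypothetical document
CLASSIFIED = [("P-101", "centrifugal pump")]


def subtypes(concept):
    """Return the concept together with all its direct and indirect subtypes."""
    found = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for sub, sup in SPECIALIZATIONS:
            if sup == current and sub not in found:
                found.add(sub)
                frontier.append(sub)
    return found


def search(keyword):
    """Find objects related to the keyword or to any of its subtypes."""
    wanted = subtypes(keyword)
    return sorted(obj for obj, concept in CLASSIFIED + ABOUT if concept in wanted)


if __name__ == "__main__":
    print(search("pump"))
    print(search("line shaft pump"))
```

A search on the general keyword expands to its subtypes first, so objects recorded only against a more specific concept are still found.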

An example of a commercial application of Gellish is the Gellish Browser developed by Mi2. The Browser can read and write data expressed in the Gellish language and can present any knowledge about classes of objects and any data about individual objects. It was expected that an implementation of Gellish would have serious performance issues. To test this, the Browser was loaded with over 60,000 facts, originating from different systems but all expressed in a Gellish Table. These facts included the Gellish knowledge base, extended with a Shell proprietary standards database, data about documents, a materials catalogue, an equipment list and material balances of the design of a process plant. With this load the Browser appears to perform excellently.

We also customized an implementation of the Eigner PLM product lifecycle management system and loaded the same data into that system. This system also performed well.

We are currently working on the customization of existing systems so that they can export data in a Gellish Table. The Browser can then be used to view data from various systems and data can be imported and integrated with other data in the Eigner PLM system.

It is our intention to use a Gellish Table as, among other things, a data exchange language for the hand-over of design data between engineering contractors and plant owners, and for data about catalogue items and items delivered by suppliers.

Further work will explore the use of Gellish for the exchange of messages by intelligent Agent software, acting as nodes in the Semantic Web. An example is business communication messages about transactions in E-procurement.

6 Conclusions

The above illustrates that the current practice of defining data models separately from reference data and user data is unnecessary. Integration of data model concepts with reference data and user data in one consistent language can provide a single common standard language for data storage and exchange that can significantly reduce development costs and simplify data communication. Common use of the small data model of figure 2, together with common use of the Gellish ontology, makes it possible to express and interpret a very wide range of types of facts. This is possible because the explicit classification relations provide interpretation rules for expressions in which the relation types as well as the object types are defined in Gellish. It is only required that the concepts are defined in the Gellish knowledge base and that they are referred to as in the basic structure, using the ‘basic semantic axioms’ mentioned above.

The above illustrates that:

- It is possible that a common standard knowledge base of concepts and relations between concepts can replace many data models.

- A solution based on the Gellish knowledge base of concepts is more flexible than fixed data models and makes it easier to add semantics to the database.

- The Gellish knowledge base of concepts provides an application independent language with a semantic basis that is equivalent to a very large data model. If sufficient concepts of an application domain are present or added, then data models for such an application domain can become superfluous.

- The Gellish knowledge base, using the inheritance capabilities of the specialization hierarchy, provides extendable product models for many types of objects.

- The implementations have shown that a Gellish knowledge base can be implemented with good performance.

- The implementations have shown that neutral format data exchange using a Gellish Table is a feasible solution.

As Gellish is in the public domain, proposals for extensions of the Gellish language are invited.

7 References

1. Andries van Renssen, “The Gellish Table and its Formats”. A definition of the Gellish Table and its implementation syntax for Gellish messages. www.steplib.com.

2. Andries van Renssen, “Guide on STEPlib”. This guide describes how STEPlib is defined and how to extend the Gellish language and knowledge base. www.steplib.com.

3. STEPlib, the Gellish knowledge base. This is a set of Gellish Tables (available in Excel and in MS Access). The upper level ontology part is documented in the TOPini part. www.steplib.com.

4. Tim Berners-Lee, James Hendler and Ora Lassila, “The Semantic Web”, Scientific American, May 2001; http://www.sciam.com/issue.cfm?issueDate=May-01.

5. OWL, Web Ontology Language Overview. http://www.w3.org/TR/owl-features/

6. Ian Niles and Adam Pease (2001), “Towards a Standard Upper Ontology”, in: Formal Ontology in Information Systems, ISBN 1-58113-377-4.

7. SUO (2001), The IEEE Standard Upper Ontology website, http://suo.ieee.org.

8. D. Lenat (1995), “Cyc: A Large-Scale Investment in Knowledge Infrastructure”, Communications of the ACM, 38, no. 11 (November 1995).

9. Wolfgang Degen, Barbara Heller, Heinrich Herre and Barry Smith (2001), “GOL: A General Ontological Language”, in: Formal Ontology in Information Systems, ISBN 1-58113-377-4.

10. The Epistle Core Data Model (2001), http://www.btinternet.com/~chris.angus/epistle/specifications/ecm/ecm_400.html
