
COIN - Collaboration & Interoperability for Networked Enterprises

Project N. 216256

Deliverable D5.2.1b Information Interoperability Services Specifications M29 issue

Date 30/06/2010

D5.2.1b Information Interoperability Services Specifications M29 issue

Author: Del Grosso TXT, Taglino CNR, Smith CNR

Contributors: Del Grosso TXT, Taglino CNR, Smith CNR

Dissemination: Public

Contributing to: WP 5.2

Date: 30.06.2010

Revision: V1.0



TABLE OF CONTENTS

1. EXECUTIVE SUMMARY

2. INTRODUCTION
   2.1. WP 5.2 Innovative Information Interoperability Services
   2.2. Metrics and Indicators
   2.3. Structure of this deliverable

3. INTEROPERABILITY SPACES

4. INNOVATIVE SERVICES FOR SEMANTIC RECONCILIATION
   4.1. Semantic Reconciliation Approach
   4.2. Semantic Annotation
   4.3. Mapping Discovery Service
   4.4. Semantic Reconciliation Rule Generation Service
   4.5. Source2Target Mediator Generation Service
   4.6. Semantic Reconciliation Suite Platform
   4.7. State of the Art

5. DATA PAYLOAD INTEROPERABILITY SERVICE
   5.1. Updated requirements
   5.2. Rules application
   5.3. JRuleEngine
   5.4. The negotiation process

6. INNOVATIVE SERVICES FOR FEDERATED INTEROPERABILITY
   6.1. The COIN approach
   6.2. Requested features

7. CONCLUSIONS

8. REFERENCES




1. Executive Summary

This deliverable specifies the Innovative Information Interoperability Services (from now on

IIIS) that will be implemented and tested by month 24 of the COIN project.

The revised specifications of the IIIS have been defined starting from the results of the

previous set of services released at month 18.

Such services have been tested and evaluated by end users and as a result of such process the

new specifications have been designed.

The IIIS introduces the concept of Interoperability Space, where three different groups of

services have been created:

Innovative Services for Semantic Reconciliation: these start from the results obtained in the EU project ATHENA (ATHENA IP 507849) and develop new features that extend the translation between documents and ontologies and improve its performance.

Data Payload Interoperability Services: these analyze and study new ways to perform communication, coordination and exchange of business documents in interoperability spaces (1:1, 1:n, n:m communications).

Innovative Services for Federated Interoperability: these analyze and study new ways to perform information interoperability in a federated space, where no common reference models are available.




2. Introduction

2.1. WP 5.2 Innovative Information Interoperability Services

SP5 is the COIN subproject that deals with Enterprise Interoperability (EI) problems. The Sub-Project (SP) started with WP5.1, where a baseline of services was created starting from results obtained in past projects.

The next steps in SP5 are the analysis and development of the Innovative Services, which will cover three different fields of Enterprise Interoperability:

Information Interoperability in WP5.2,

Knowledge Interoperability in WP5.3, and

Business Interoperability in WP5.4.

The aim of WP5.2 is to develop new services that make it possible to:

Exchange business documents (BODs) written in UBL (Universal Business Language) inside new interoperability spaces, evaluating several possibilities of exchange: 1:1, 1:n and n:m.

Semantically annotate BODs in order to automatically derive the mediation rules necessary for the semantic reconciliation of formats and contents.

Study how to create federated spaces where business documents can interoperate with each other without the need for a common reference model.

2.2. Metrics and Indicators

The COIN description of work, as well as deliverable D1.2.1 Quality Assurance Manual,

report the minimum number of services which must be implemented in the scope of WP5.2 in

order to reach a satisfactory base of services as a result of the work performed.

Such minimal requirements are summarized in the following table:

Table 1: Metrics and performance indicators for EI innovative services

| Metric | Milestone M24 | Milestone M48 |
|---|---|---|
| Number of Innovative Information Interoperability Services | 1 | 2 |

2.3. Structure of this deliverable

This deliverable is structured into eight chapters:

Chapter 1 is the executive summary of this deliverable.

Chapter 2 introduces the objectives and structure of this deliverable.

Chapter 3 describes the interoperability spaces.

Chapter 4 describes the innovative services for semantic reconciliation.

Chapter 5 describes the data payload interoperability service.

Chapter 6 describes the innovative services for federated interoperability.

Chapter 7 presents the conclusions.

Chapter 8 lists the references.




3. Interoperability Spaces

The work package 5.2 introduces the concept of Interoperability Space in the context of the

COIN project.

With the concept of Interoperability Spaces we refer to a set of services whose purpose is to take into account all the possible kinds of data transformation that can be applied to documents.

Data Interoperability can be divided into two main branches:

Payload interoperability: refers to the transformations applied to the content of the documents.

Schema interoperability: refers to the transformations applied to the structure of the documents.

Schema interoperability, in turn, can be divided into two branches according to the approach followed for the transformation:

Unified approach: implies the use of a reference meta-model for managing the

transformations.

Federated approach: implies the absence of a reference meta-model for managing

the transformations.

Figure 5: Interoperability spaces structure




4. Innovative Services for Semantic Reconciliation

The goal of semantic interoperability is to allow the (seamless) cooperation of software applications, which were not initially developed for this purpose, by using semantics-based techniques. In particular, in this context, the focus is on the exchange of business documents, both

between different enterprises, for instance, to automatically transfer a purchase order from a client to a supplier, and

within the same enterprise, for instance, to share a certain document between different departments which use different data organizations.

Relevant work in this field was done during the ATHENA project, where a Semantic Reconciliation suite for the reconciliation of business documents was developed. This suite has been brought into COIN as part of the Baseline Interoperability Services (D5.1.2).

Concerning the COIN innovative services for semantic reconciliation, the approach is to start from the Semantic Reconciliation suite, as part of the baseline services, and to improve and enrich it, with particular focus on providing (semi)automatic support for, and optimizing, certain steps of the reconciliation process.

4.1. Semantic Reconciliation Approach

The semantic reconciliation approach is based on the use of a domain ontology as a common reference for the harmonization of heterogeneous data. This harmonization is accomplished in two phases: a preparation phase and a run-time phase (see Figure 1).

Figure 1: Semantic Reconciliation Approach

In the preparation phase, the schemas of the documents to be shared by the software

applications (say, RS1, RS2, RS3) are mapped against the reference ontology (RO). The

mapping is a two step activity which starts from the semantic annotation and ends with the

building of semantic reconciliation rules.




Semantic annotation allows the description of the elements composing a document, in terms of the reference ontology, by identifying conceptual correspondences between a schema and the reference ontology. Nevertheless, semantic annotations do not contain enough information for performing actual data reconciliation. For this reason, operational semantic reconciliation rules are built. In particular, starting from the previous annotation, for each document schema, a pair of reconciliation rule sets is built: a forward rule set and a backward rule set, which allow data transformation from the original format into the ontology representation and vice versa, respectively.

The run-time phase concerns the actual exchange of data from one application to another. For instance, when an application, say App3, wants to send a document to another application, say App2, the reconciliation between the format of RS3 and the format of RS2 is done by first applying the forward rule set of RS3 and then the backward rule set of RS2.
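The run-time step can be pictured as the composition of two functions that meet in the ontology representation. The following is only a minimal sketch: the field names, the unit conversions and the helper `mediate` are hypothetical and are not taken from the COIN services.

```python
from typing import Any, Callable, Dict

Document = Dict[str, Any]
RuleSet = Callable[[Document], Document]


def rs3_forward(doc: Document) -> Document:
    """Hypothetical forward rule set of RS3: resource format -> ontology representation."""
    return {
        "RFQ": doc["RFQuote"],                    # naming mismatch
        "Weight_kg": doc["weight_oz"] * 0.0283495,  # encoding mismatch (ounces -> kg)
    }


def rs2_backward(onto_doc: Document) -> Document:
    """Hypothetical backward rule set of RS2: ontology representation -> resource format."""
    return {
        "requestForQuotation": onto_doc["RFQ"],
        "weightGrams": round(onto_doc["Weight_kg"] * 1000),
    }


def mediate(doc: Document, forward: RuleSet, backward: RuleSet) -> Document:
    """Apply the sender's forward rules, then the receiver's backward rules."""
    return backward(forward(doc))


# App3 sends a document to App2.
result = mediate({"RFQuote": "Q-17", "weight_oz": 16}, rs3_forward, rs2_backward)
```

Because every schema is mapped only against the reference ontology, adding a new application requires one new pair of rule sets rather than one pair per partner.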

The objective of this deliverable is to provide the final specification of the innovative services for Semantic Reconciliation, with the further aim of underlining the enhancements with respect to the first release. In particular, we refer to the following services:

Semantic Mapping Discovery Service: the aim is to develop a powerful mapping discovery service which can help in the identification of semantic correspondences between a document schema and the reference ontology.

Semantic Reconciliation Rule Generation Service: like the semantic annotation, the building of reconciliation rules was also a manual activity. In this case, in order to provide automatic support, rule generation is now guided by the reuse of the knowledge represented by the semantic annotation.

Source2Target Mediator Generation Service: this service allows the generation and publication of a specific web service for the run-time translation of document instances between given source and target schemas.

4.2. Semantic Annotation

The specifications of the Semantic Annotation have changed substantially compared to the previous version, for two main reasons: i) to increase the expressive power of annotation expressions in order to cover more complex heterogeneities among the structures to be matched; ii) to allow an automatic translation into abstract reconciliation rules, i.e. first-order logic rules (see Section 4.4).

4.2.1. A classification of Semantic Mismatches

In the ATHENA project, a number of possible kinds of mismatch were identified and classified. These mismatch categories are recapped in Tables 2 and 3. The examples reported in the tables are drawn from an e-procurement application.

Our classification follows to some extent the one in [7], but the kinds of mismatch have been divided into two broad categories: lossless and lossy mismatches. Lossless mismatches are cases in which an annotation can fully capture the intended semantics, while lossy mismatches are cases where it is not possible to find a semantic annotation that fully captures the intended semantics. Furthermore, we focus only on Conceptualization Mismatches, i.e. mismatches between two conceptualizations of the same domain that differ in the ontological concepts distinguished or in the way these concepts are related. We do not address other classes of mismatch, such as Explication Mismatches, related to differences in the way the entities are described, or Language Mismatches, related to the heterogeneity of the formalisms used for the definition of the conceptualizations.




Lossless mismatches

| Name | Description | Example: Resource Schema | Example: Reference Ontology |
|---|---|---|---|
| Naming | Different labels for the same concept. | Request for quotation indicated as: RFQuote. | Request for quotation indicated as: RFQ. |
| Abstraction | Level of specialization/refinement of the information. The same concepts are recognized, but they are defined at different levels of abstraction. | A manager is an Employee who is the supervisor of some project. | The concept Manager is recognized as a specialization of Employee. |
| Structuring | The same set of concepts is modeled, but the way these concepts are structured by means of relations differs. | A Department is related to its controlled Projects and to the Employee that is its manager. | A Project is related to the supervisor Employee and to the controlling Department. |
| SubClass - Attribute value | An attribute, with a predefined value set, is represented by a set of subclasses, one for each value. | The type RawMaterial can be represented as an enumeration: (iron, copper) | or as two subclasses: iron subClassOf RawMaterial, copper subClassOf RawMaterial. |
| Class-Relation | A concept is represented as a relation. | A Product is related to a Buyer by a relation. | The concept Sale is related to a Product and to a Buyer. |
| Attribute Granularity | The same information is decomposed into a different number of attributes (or sub-attributes). | Telephone is represented as a single string. | Telephone is composed by hasPartCountryCode, hasPartAreaCode, hasPartLocalPhoneNumber. |
| Attribute Assignment | Two conceptualizations differ in the way they assign attributes to concepts. | Department has the attribute SupervisorName and controls some project. | Project has the attribute SupervisorName. |
| Complex Attribute | A set of attributes is grouped and represented as a concept. | Name, Address and PhoneNumber are attributes of Employee. | Name, Address and PhoneNumber are grouped in the concept ContactDetails, related to Employee. |
| Encoding | Different formats of data or units of measure. | Weight expressed in ounces. | Weight expressed in kilograms. |

Table 2: Lossless mismatch categories

Lossy mismatches

| Name | Description | Example: Resource Schema | Example: Reference Ontology |
|---|---|---|---|
| Overlapping | There is an intersection between the extensions of different concepts/attributes/roles. | Executive | Manager |
| Subsumption | There is an inclusion between the extensions of different concepts/attributes/roles. | Person | Employee |
| Categorization | Two conceptualizations distinguish the same concept but divide it into different and incomparable sub-concepts. | PublicOrganization and PrivateOrganization. | ResearchOrganization and BusinessOrganization. |
| Coverage | Two conceptualizations do not model all the entities or information of a given domain. | preferredDeliveryDate is not considered in the RS. | preferredDeliveryDate is present in the RO. |
| Precision | The accuracy of information. | Size of a pallet expressed by three integer values: height, length, width. | Size of a pallet expressed by a constant conventional value: (small, medium, large). |

Table 3: Lossy mismatch categories

4.2.2. Mismatch Templates

In the following we adopt standard notions from First-Order Logic (FOL) and Description Logic theory [2]. Regarding the formalism supported for both the RS and RO representation, we do not restrict ourselves to any particular ontology language in this work. Instead, we use a generic conceptual model (CM), which contains common aspects of most semantic data models, UML, ontology languages such as OWL, and description logics. In the sequel, we suppose that both the RS and the RO are represented by using this generic CM. Specifically, A, B, C denote atomic concepts, i.e. sets of individuals; D denotes data types, e.g. String; P and Q denote atomic roles, i.e. binary relations between individuals; U and V denote attributes, i.e. relations between individuals and data values. Individuals are denoted by a and b, data values by d.

Concepts are organized in the familiar is-a hierarchy, and disjointness relations can be specified among them. Roles, and their inverses (which are always present), are subject to constraints such as the specification of domain and range. We represent a given CM using a directed and labelled ontology graph, which has concept nodes labelled with concept names and edges labelled with role names. For each attribute of a concept, we create a separate attribute node. For expressive languages such as OWL, we also connect C1 to C2 by P if we find (by reasoning over the ontology) an existential restriction stating that each instance of C1 is related to some (or all) instances of C2 by P.

General roles, denoted R, can be:

atomic;

constructed as the inverse of a role, i.e. P⁻;

constructed as the composition of roles, i.e. P1 ∘ … ∘ Pn, which represents the paths in the ontology graph traversing the edges P1, …, Pn;

constructed as the constrained composition of roles, which represents the paths in the ontology graph traversing a given sequence of edges and a given sequence of nodes.

General attributes, denoted Z, can be atomic or attribute chains, i.e. a composition R ∘ U, where R is a (possibly constrained) composition of roles and U is an attribute. Complex concepts, denoted by C, can be constructed as the intersection of concepts, i.e. A ⊓ B, or by restrictions over roles and attributes. In particular we consider:

∃R.A, the set of individuals related to an individual instance of A by the general role R;

∃Z.D, the set of individuals related to a value ranging in D by the general attribute Z;

∃R.{a}, the set of individuals related to the individual a by the general role R;

∃Z.{d}, the set of individuals related to the value d by the general attribute Z.

Equivalence is denoted by ≡, whose FOL counterpart is ↔; subsumption (i.e. inclusion) is denoted by ⊑ (resp. ⊒), whose FOL counterpart is → (resp. ←).

Concept 2 Concept Mismatch Templates

Name: Atomic Concept
Related Mismatches: Naming
Description: There is an overlap between the instances of A and B.
Formal Notation:
Example:

Name: Conjunctive Concept
Related Mismatches: Naming, Abstraction
Description: There is an overlap between the instances of A and the intersection of … and ….
Formal Notation:
Example:

Name: Role Restriction
Description: There is an overlap between the instances of C and those instances of A that are related by the role R to instances of B.
Formal Notation:
Example:

Name: Role Restriction by individual
Description: There is an overlap between the instances of C and those instances of A that are related by the role R to the individual b.
Formal Notation:
Example:

Name: Attribute Restriction
Related Mismatches: Naming, Abstraction, Sub-class attribute
Description: There is an overlap between the instances of C and those instances of A that are related by the attribute Z to values of the type D.
Formal Notation:
Example:

Name: Attribute Restriction by value
Related Mismatches: Naming, Abstraction, Sub-class attribute
Description: There is an overlap between the instances of C and those instances of A that are related by the attribute Z to the value d.
Formal Notation:
Example:

Role 2 Role Mismatch Templates

Name: Atomic Role
Related Mismatches: Naming
Description: There is an overlap between the instances of P and Q.
Formal Notation: P
Example:

Name: Inverse Role
Related Mismatches: Naming, Abstraction, Organization
Description: There is an overlap between the instances of P and the inverse of Q.




Formal Notation:
Example:

Name: Chain Role
Related Mismatches: Naming, Abstraction, Organization
Description: There is an overlap between the instances of Q and the composite role ….
Formal Notation:
Example:

Name: Constrained Chain Role
Related Mismatches: Naming, Abstraction, Organization
Description: There is an overlap between the instances of Q and the constrained composite role ….
Formal Notation:
Example:

Attribute 2 Attribute Mismatch Templates

Name: Atomic Attribute
Related Mismatches: Naming, Attribute Assignment
Description: There is an overlap between the instances of U and V.
Formal Notation: U
Example:

Name: Constrained Attribute
Related Mismatches: Naming, Attribute assignment, Organization
Description: There is an overlap between the instances of V and those instances of U defined over instances of the concept C.
Formal Notation: C U
Example:

Name: Value transformation
Related Mismatches: Naming, Attribute assignment, Encoding
Description: There is an overlap between the instances of U and V, but their values have to be transformed by the application of a given function f.
Formal Notation:
Example:

Name: Attribute composition/aggregation




Related Mismatches: Naming, Attribute assignment, Granularity
Description: A set of attributes has to be aggregated to be translated into instances of V.
Formal Notation:
Example:

Complex Mismatch Templates

Name: Concept - Role
Related Mismatches: Concept - Role
Description: Instances of C are related by … and … to a pair of individuals a and b, which constitute the instances of Q.
Formal Notation:
Example:

Name: Complex attribute
Related Mismatches: Complex attribute
Description: There is an overlap between the instances of the attributes … defined over instances of … and the instances of … defined over instances of … related to instances of …. Furthermore, … and … are matched too.
Formal Notation:
Example:

Name: Subclass by Attribute Value
Related Mismatches: Subclass by Attribute Value
Description: There is an overlap between the instances of … related to the value d by the attribute U and the instances of … related to instances of A by the role P.
Formal Notation:
Example:

Name: Attribute Assignment over chain roles
Related Mismatches: Attribute Assignment over chain roles
Description: There is an overlap between the instances of the attributes … and … defined over instances involved in the composite roles … and …, respectively.
Formal Notation:
Example:




4.2.3. Semantic Annotation by Mismatch Templates

Given a Resource Schema (RS) and a Reference Ontology (RO), each describing a set of entities (concepts, roles and attributes), the Semantic Annotation of RS in terms of RO is a set of relations holding, or supposed to hold, between such entities. Basically, the Semantic Annotation SemAnn(RS,RO) is an alignment [8] made up of a set of instantiations of the templates discussed above. Such templates are grouped into four categories of annotations, namely Concept Annotations, Attribute Annotations, Path Annotations and Complex Annotations. In the following we introduce these notions in detail.

Concept Annotation (CA). A CA is a tuple with the following components:

C1 is a concept of RS (see atomic concept mismatch template) or a set of concepts of RS intended as their conjunction (see conjunctive concept mismatch template);

C2 is a concept of RO (see atomic concept mismatch template) or a set of concepts of RO intended as their conjunction (see conjunctive concept mismatch template);

REL is the relation supposed to hold between C1 and C2. It may be an equivalence or an inclusion in either direction, specifying whether the mapping is bidirectional or unidirectional;

R1 (resp. R2) is a set of restrictions, each having one of the following forms:
o a role restriction (see Role Restriction mismatch template);
o a role restriction by individual (see Role Restriction by individual mismatch template);
o an attribute restriction (see Attribute Restriction mismatch template);
o an attribute restriction by value (see Attribute Restriction by value mismatch template).

Attribute Annotation (AA). An AA is a tuple with the following components:

U1 is an attribute of RS (see atomic attribute mismatch template) or a set of attributes of RS (see attribute composition/aggregation mismatch template);

U2 is an attribute of RO (see atomic attribute mismatch template) or a set of attributes of RO (see attribute composition/aggregation mismatch template);

FN is a function to be applied to U1 and U2 (see attribute composition/aggregation and value transformation mismatch templates), e.g. SPLIT, EQ, CAST, CONVERT, COUNT;

CA is a concept annotation (optional) that can be specified to constrain the domain of U1 and U2.

Path Annotation (PA). A PA is a tuple with the following components:

P1 is a role of RS (see atomic role mismatch template), or the inverse of a role (see inverse role mismatch template), or a (constrained) composition of roles (see chain role and constrained chain role mismatch templates);

P2 is a role of RO (see atomic role mismatch template), or the inverse of a role (see inverse role mismatch template), or a (constrained) composition of roles (see chain role and constrained chain role mismatch templates);

REL is the relation supposed to hold between P1 and P2. It may be an equivalence or an inclusion in either direction, specifying whether the mapping is bidirectional or unidirectional;

CA1 (resp. CA2) is a concept annotation that can be specified to constrain the domain (resp. the range) of P1 and P2.

Complex Annotations. Complex annotations consist of a Path Annotation and a set of Attribute Annotations. Such expressions cover the complex attribute, subclass by attribute value and attribute assignment over chain roles mismatch templates.
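The four annotation categories map naturally onto small record types. The sketch below is illustrative only: the class names, field names, the string encoding of REL and the telephone example are assumptions made for this illustration, and the field order is not meant to reproduce the tuple order of the original definitions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ConceptAnnotation:
    c1: Tuple[str, ...]        # concept(s) of RS, read as a conjunction
    c2: Tuple[str, ...]        # concept(s) of RO, read as a conjunction
    rel: str                   # hypothetical encoding: "equivalent", "subsumed_by", "subsumes"
    r1: Tuple[str, ...] = ()   # restrictions on the RS side
    r2: Tuple[str, ...] = ()   # restrictions on the RO side


@dataclass(frozen=True)
class AttributeAnnotation:
    u1: Tuple[str, ...]        # attribute(s) of RS
    u2: Tuple[str, ...]        # attribute(s) of RO
    fn: str                    # e.g. "SPLIT", "EQ", "CAST", "CONVERT", "COUNT"
    ca: Optional[ConceptAnnotation] = None  # optional domain constraint


@dataclass(frozen=True)
class PathAnnotation:
    p1: Tuple[str, ...]        # role, inverse role or role chain of RS
    p2: Tuple[str, ...]        # role, inverse role or role chain of RO
    rel: str
    ca1: Optional[ConceptAnnotation] = None  # domain constraint
    ca2: Optional[ConceptAnnotation] = None  # range constraint


@dataclass(frozen=True)
class ComplexAnnotation:
    path: PathAnnotation
    attributes: Tuple[AttributeAnnotation, ...]


# Attribute Granularity example from Table 2: one string split into three parts.
telephone = AttributeAnnotation(
    u1=("Telephone",),
    u2=("hasPartCountryCode", "hasPartAreaCode", "hasPartLocalPhoneNumber"),
    fn="SPLIT",
)
```

Frozen dataclasses keep annotations immutable and hashable, which is convenient if a Semantic Annotation is stored as a set of such records.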

4.3. Mapping Discovery Service

The objective of this service is to provide semi-automatic support for the discovery of semantic annotations of a structured business document schema (e.g., a purchase order schema), here referred to as the resource schema (RS), against a reference ontology (RO). Mapping discovery is a hard task since, given a fragment of reality, there are infinitely many ways of modelling it, by using different names, different relationships and different complex structures. We assume that the observed (and modelled) reality is faithfully represented by the RO and that the RS is therefore a sub-model, in semantic terms, of the former. Then, the differences that appear on the terminological, syntactic and structural ground will be reconciled on the semantic ground.

In the literature the problem of mapping discovery (often referred to as Ontology or Schema Matching) has been widely addressed; however, the existing proposals have a limited scope, since they mainly address the mapping between individual elements (e.g., concepts and relationships) and only a few address complex sub-models as a whole. Furthermore, we go beyond logical correspondences (e.g., subsumption), introducing a formalism to represent mappings based on the instantiation of a set of mismatch templates defining rule-based transformations. Our ultimate goal is to discover in a (semi)automatic way the set of operations that allow one structure to be transformed into another without loss of semantics. Since the support is semi-automatic, a final validation by a human actor is needed.

The Mapping Discovery Service is a semi-automatic support for the definition of Semantic Annotations. The user is involved in this task for the validation and revision of the intermediate and final proposed results. The strategy is depicted in Figure 2 and summarized here:

1. In the first step of the matching process we consider only lexical knowledge. We start by processing the entity labels of the two graphs to build a term similarity matrix. The similarity matrix reports a similarity value (between 0 and 1) for every pair of elements (A, B), where A belongs to RS and B belongs to RO. This is achieved by running in parallel a string similarity algorithm and a linguistic similarity algorithm. The former is based on the Monge-Elkan distance [1], while the latter is a slightly modified version of the SemSim criteria [4], based on the Lin measure [3] applied to the WordNet [5] lexical taxonomy (see Section 4.3.1).

2. After the terminological analysis, relying on the similarity values computed in the previous step, a set of evidences is selected. An evidence is a pair of concepts (c1, c2), where c1 belongs to RS, c2 belongs to RO, their similarity value is high and they are adjacent to similar entities (see Section 4.3.2).

3. Semantic Annotation Expressions are finally built by the Mismatch Detection algorithm. The mismatch detection algorithm implements a search strategy for every mismatch pattern provided as input, relying on the information collected in the previous two steps (see Section 4.3.3).
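To make the first step concrete, the term similarity matrix can be sketched as a mapping from (RS label, RO label) pairs to values in [0, 1]. This is only an illustration: `difflib.SequenceMatcher` is used as a stand-in for the Monge-Elkan and WordNet-based measures of the service, and the labels are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, Iterable, Tuple


def stand_in_similarity(a: str, b: str) -> float:
    """Placeholder measure in [0, 1]; NOT the Monge-Elkan or Lin-based measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def similarity_matrix(
    rs_labels: Iterable[str],
    ro_labels: Iterable[str],
    sim: Callable[[str, str], float] = stand_in_similarity,
) -> Dict[Tuple[str, str], float]:
    """One similarity value for every pair (A, B) with A in RS and B in RO."""
    ro = list(ro_labels)
    return {(a, b): sim(a, b) for a in rs_labels for b in ro}


matrix = similarity_matrix(["RFQuote", "Buyer"], ["RFQ", "Customer"])
```

The later steps (evidence selection and mismatch detection) only read from this matrix, so the similarity measure can be swapped without changing them.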




Figure 2: Semantic Annotation Discovery Strategy

4.3.1. Label Similarity Lsim

In order to assign a similarity value to a pair of labels from the two graphs, we combine the results of a string similarity measure and a linguistic similarity measure. In the former, we consider labels as sequences of letters in an alphabet, while in the latter we consider labels as bags of words of a natural language (English in our case) and compute a similarity value based on their meaning. The label similarity value Lsim between two labels is hence obtained by taking i) the higher of the similarity values computed according to the two measures, if it is greater than a threshold; ii) 0 otherwise.
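The combination rule can be written in a few lines. In this sketch the two input values are assumed to have been computed already, and the default threshold of 0.5 is an arbitrary placeholder, since the source does not state the value used.

```python
def lsim(string_sim: float, linguistic_sim: float, threshold: float = 0.5) -> float:
    """Label similarity: the higher of the two measures if above the threshold, else 0.

    The threshold default is a hypothetical value chosen for illustration.
    """
    best = max(string_sim, linguistic_sim)
    return best if best > threshold else 0.0
```

For instance, `lsim(0.3, 0.8)` keeps the linguistic value 0.8, whereas `lsim(0.3, 0.4)` is cut to 0 because neither measure exceeds the threshold.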

String Similarity. We experimented with several string similarity measures and finally selected the Monge-Elkan distance [1], which was proposed as a field matching (i.e., record linkage) algorithm. The Monge-Elkan distance measures the similarity of two strings s and t by recursively evaluating every substring of s and t; this approach is also able to support the analysis of abbreviations and acronyms. To improve the accuracy of the algorithm, the input strings are normalized, i.e. characters are converted to lower case and special characters (digits, whitespace, apostrophes, underscores) are discarded.
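The normalization applied before the string comparison can be sketched as follows. Only the normalization is shown, not the Monge-Elkan measure itself; the function name is hypothetical.

```python
import re

# Characters discarded by the normalization: digits, whitespace, apostrophes, underscores.
_DISCARDED = re.compile(r"[\d\s'_]")


def normalize(label: str) -> str:
    """Lower-case the label and drop the discarded characters."""
    return _DISCARDED.sub("", label.lower())
```

With this sketch, `normalize("Contact_Details 2")` yields `contactdetails`, so two labels that differ only in case or separators compare as identical strings.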

Linguistic Similarity. Our approach is based on the Lin information-theoretic similarity [3] applied to the lexical database WordNet [5], which is particularly well suited for similarity measures since it organizes synsets (a synset is a collection of synonyms denoting a particular meaning of a term) into hierarchies of is-a relations. Given two synsets of the WordNet taxonomy, the Lin measure can be used to state their similarity depending on the increase of the information content from the synsets to their common subsumers. In order to use the Lin measure to compare two strings s and t, we consider such strings as word entries of WordNet and apply the Lin measure to all the pairs of synsets related to s and t, respectively. We then define Ssim(s,t) as the highest computed similarity value, since we expect that words used in the RS and in the RO belong to the same domain, sharing the same intended meaning.
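The Lin measure of two concepts is twice the information content (IC) of their most informative common subsumer divided by the sum of their own ICs, with IC(c) = -log p(c). The sketch below applies it to a toy taxonomy instead of WordNet; the taxonomy, the probabilities and the word senses are invented for illustration.

```python
import math
from itertools import product

# Toy is-a taxonomy (child -> parent) and hypothetical concept probabilities.
PARENT = {"entity": None, "person": "entity", "employee": "person",
          "manager": "employee", "furniture": "entity"}
PROB = {"entity": 1.0, "person": 0.5, "employee": 0.25,
        "manager": 0.125, "furniture": 0.25}
# Hypothetical senses (synsets) of each word.
SENSES = {"manager": ["manager"], "employee": ["employee"],
          "chair": ["manager", "furniture"]}


def ic(concept: str) -> float:
    """Information content of a concept."""
    return -math.log(PROB[concept])


def ancestors(concept: str) -> set:
    """The concept itself plus all its subsumers."""
    found = set()
    while concept is not None:
        found.add(concept)
        concept = PARENT[concept]
    return found


def lin(c1: str, c2: str) -> float:
    """Lin similarity: 2 * IC(most informative common subsumer) / (IC(c1) + IC(c2))."""
    lcs_ic = max(ic(c) for c in ancestors(c1) & ancestors(c2))
    denom = ic(c1) + ic(c2)
    return 2 * lcs_ic / denom if denom > 0 else 1.0


def ssim(s: str, t: str) -> float:
    """Highest Lin value over all pairs of senses of the two words."""
    return max(lin(a, b) for a, b in product(SENSES[s], SENSES[t]))
```

Taking the maximum over sense pairs means that the ambiguous word `chair` is compared with `employee` through its `manager` sense, which reflects the assumption that both schemas talk about the same domain.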

Entity labels are considered bags of words since they are, in general, compound words (LegalVerification, contact_details). To tokenize a string into a bag of words we implemented a label resolution algorithm that looks (from right to left) for maximal substrings that have, after stemming, an entry in WordNet; some special characters (e.g., _, -) are also taken into account. In this way, we can also filter the noise in the labels, deleting substrings recognized as prepositions, conjunctions, adverbs and adjectives that are not included in the WordNet taxonomy.
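A possible reading of the label resolution step is sketched below. It is a simplification under stated assumptions: a small hand-made lexicon stands in for WordNet, stemming is omitted, and the function name is hypothetical.

```python
import re

# Hypothetical stand-in for the WordNet entries.
LEXICON = {"legal", "verification", "contact", "details", "info", "order"}


def resolve_label(label: str, lexicon=LEXICON) -> list:
    """Split a compound label into known words, scanning each chunk right to left."""
    tokens = []
    for chunk in re.split(r"[\s_\-]+", label.lower()):
        end = len(chunk)
        found = []
        while end > 0:
            # Try the longest substring ending at `end` first.
            for start in range(0, end):
                if chunk[start:end] in lexicon:
                    found.append(chunk[start:end])
                    end = start
                    break
            else:
                end -= 1  # no known word ends here: drop one character as noise
        tokens.extend(reversed(found))
    return tokens
```

Under these assumptions `resolve_label("LegalVerification")` gives `["legal", "verification"]`, and in `order_of_info` the preposition is dropped because it has no lexicon entry.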

Finally we can define an algorithm to compute a linguistic similarity value given two

labels, represented as bags of strings:

begin
  Double: Lmatch(bag_of_strings term1, bag_of_strings term2)
    minL = min(term1.length, term2.length)
    maxL = max(term1.length, term2.length)
    totalscore = 0
    while (term1 and term2 not empty)
      score = max Ssim(s,t), for all s in term1 and t in term2
      totalscore += score
      remove s from term1 and t from term2
    denum = max(minL, minL + log(maxL - minL + 1))
    return totalscore/denum
end




The algorithm iteratively looks for the most similar pair of strings from term1 and term2, adding their similarity value to totalscore. Then it returns totalscore divided by the number of words in the label with fewer words; if the input terms have different sizes, the denominator is increased by a logarithmic term to reduce their similarity. For example, given the labels contact_info and RepresentativeDetails, we see that sim(contact,representative)=0.768 and sim(info,details)=0.746. Therefore we calculate:

Lmatch([contact, info],[representative, details]) = 1.514/2 = 0.757

where the denominator is the number of words (after the label resolution) contained in the shortest label.
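The Lmatch pseudocode translates almost line by line into Python. In this sketch the word similarity is passed in as a function; the table of similarity values reuses the two figures of the worked example, while the cross-pair values (0.2) and the use of the natural logarithm are assumptions.

```python
import math
from typing import Callable, Sequence


def lmatch(term1: Sequence[str], term2: Sequence[str],
           ssim: Callable[[str, str], float]) -> float:
    """Greedy bag-of-words similarity following the Lmatch pseudocode."""
    left, right = list(term1), list(term2)
    min_l, max_l = min(len(left), len(right)), max(len(left), len(right))
    if min_l == 0:
        return 0.0
    total = 0.0
    while left and right:
        # Pick the most similar remaining pair and remove both words.
        score, s, t = max(((ssim(s, t), s, t) for s in left for t in right),
                          key=lambda item: item[0])
        total += score
        left.remove(s)
        right.remove(t)
    # Natural log is an assumption; the pseudocode does not state the base.
    denom = max(min_l, min_l + math.log(max_l - min_l + 1))
    return total / denom


# Values from the worked example; the 0.2 cross-pair default is hypothetical.
TABLE = {("contact", "representative"): 0.768, ("info", "details"): 0.746}


def toy_ssim(s: str, t: str) -> float:
    return TABLE.get((s, t), 0.2)


example = lmatch(["contact", "info"], ["representative", "details"], toy_ssim)
```

With equal-length labels the logarithmic term is log(1) = 0, so the example reduces to 1.514 / 2 = 0.757; a third unmatched word in one label would raise the denominator to 2 + log(2).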

4.3.2. Neighbour Similarity and Evidence Selection

The goal of this step is to discover the set CE of concept evidences, i.e., pairs of concepts that exhibit a high level of semantic similarity and that will be used as input for the mismatch detection.

Neighbour Matching

The Neighbour Matching algorithm Nsim has been designed to overcome the limits of the pure lexical approach followed by Lsim. Nsim computes the similarity of two concepts by assigning a score to the similarity of their neighbours by a wedding approach. Given two concepts A and B, Nsim is defined as follows:

Where:

NA (resp. NB) is the set of neighbour entities of A (resp. B), i.e.
  o the incident roles of A (resp. B) in the ontology graph;
  o the adjacent concepts of A (resp. B) in the ontology graph;
  o the attributes of the adjacent concepts of A (resp. B) in the ontology graph.

Each entity of NA and NB can participate in one pair exclusively.

On the basis of Lsim and Nsim we define the following measure Sim, that takes into account three different aspects in stating the similarity of two concepts:

1. The similarity between their labels;
2. The similarity between their neighbours;
3. The percentage of the similar entities among their neighbours.

Where:

NA (resp. NB) is the set of neighbour entities of A (resp. B);

M is the set of the pairs considered in the computation of Nsim(A,B);

and are constants.

A first set CE is computed according to the criteria described above. Sim is computed for every pair of concepts belonging to RS and RO respectively, and the pairs (A,B) with Sim(A,B) greater than a threshold are added to CE.

Taxonomic Mismatch Patterns




The set of concept evidences discovered in the previous step does not take into account semantic consistency with respect to the constraints (i.e. axioms) asserted in the two conceptualizations to be matched. In particular there are two situations that may lead to undesired consequences:

1. Inconsistent evidences, e.g. the concept A is matched with both B1 and B2, but B1 is declared to be disjoint from B2. This may lead to stating that some individuals of A are individuals of both B1 and B2.
2. Cross-subsumption evidences, e.g. the concept A1 is matched with B1 and the concept A2 with B2, but A1 subsumes A2, while B2 subsumes B1. This may lead to stating a cyclic inclusion among A1, A2, B1, B2.

These situations may in some cases be detected by searching for the following taxonomic patterns, which only approximate the consistency of the evidences, trying to avoid potentially dangerous (and hence incorrect) evidences.

Sibling Concepts. This pattern requires that and there exists a concept

C such that are asserted axioms. constitute the sibling concepts set of A.

Inconsistent Correspondences. Given a concept evidence (A,B), the set of evidences

inconsistent with it is defined as

$$IE_{AB} = \{(X,Y) \in CE \mid (X = A \wedge Y \sqcap B \sqsubseteq \bot) \vee (Y = B \wedge X \sqcap A \sqsubseteq \bot)\}$$

Cross Subsumption. Given a concept evidence (A,B), the set of cross-subsumption evidences of (A,B) is defined as:

$$CCE_{AB} = \{(X,Y) \in CE \mid (X \sqsubseteq A \wedge B \sqsubseteq Y) \vee (Y \sqsubseteq B \wedge A \sqsubseteq X)\}$$

The set CE computed in the previous step is then refined by the search of the taxonomic mismatch patterns discussed above, according to the following strategy:

1. Sibling Concepts sets are identified
2. CE is ordered by decreasing values of similarity
3. For every (A,B) in CE (iterating over decreasing values of similarity)
   a. Remove from CE the set IE_AB
   b. Remove from CE the set CCE_AB
4. Sibling Concepts sets are added to CE

Basically we start from the evidences showing higher confidence, and we delete from CE the set of evidences that may conflict with them.
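The refinement strategy can be sketched as a greedy filter over the evidences sorted by similarity. The sketch below is illustrative only: the names refine_evidences, conflicts, DISJOINT and SUBSUMED_BY are hypothetical, and the two conflict tests encode the prose descriptions of inconsistent and cross-subsumption evidences given earlier in this section, not a confirmed formalization.

```python
# Toy axioms (assumptions for illustration only).
DISJOINT = {frozenset({"B1", "B2"})}          # B1 is disjoint from B2
SUBSUMED_BY = {("A2", "A1"), ("B3", "B4")}    # (X, Y): X is subsumed by Y

def conflicts(ab, xy):
    """True if evidence xy is inconsistent or cross-subsumed w.r.t. ab."""
    (a, b), (x, y) = ab, xy
    inconsistent = ((x == a and frozenset({y, b}) in DISJOINT) or
                    (y == b and frozenset({x, a}) in DISJOINT))
    cross = (((x, a) in SUBSUMED_BY and (b, y) in SUBSUMED_BY) or
             ((a, x) in SUBSUMED_BY and (y, b) in SUBSUMED_BY))
    return inconsistent or cross

def refine_evidences(ce, sim, conflicts, sibling_evidences=()):
    """Keep high-confidence evidences and drop those conflicting with them."""
    remaining = sorted(ce, key=lambda e: sim[e], reverse=True)
    i = 0
    while i < len(remaining):
        anchor = remaining[i]
        remaining = [e for e in remaining
                     if e == anchor or not conflicts(anchor, e)]
        i = remaining.index(anchor) + 1
    # Step 4: evidences coming from sibling concepts sets are added back.
    for e in sibling_evidences:
        if e not in remaining:
            remaining.append(e)
    return remaining

sim = {("A", "B1"): 0.9, ("A", "B2"): 0.8, ("A1", "B3"): 0.85, ("A2", "B4"): 0.7}
print(refine_evidences(sim.keys(), sim, conflicts))
```

In the toy run, (A, B2) is dropped because B1 and B2 are disjoint, and (A2, B4) is dropped because it crosses the subsumption direction of (A1, B3).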

4.3.3. Mismatch Detection

In this step SemAnn(RS,RO) is populated by searching the following mismatch templates in the given order.

Atomic Concepts. For every (A,B) in CE, if (A,B) is a 1:1 match in CE, the corresponding concept annotation is added to SemAnn(RS,RO).

Conjunctive Concepts. For every A in RS (resp. RO), if A is matched with other concepts

in CE, and they are not included in the sibling concepts set of A, the concept

annotations .... (resp. < , _, , _>.... < ,, _, ,_>) are added to SemAnn(RS,RO).




Disjunctive Concepts. For every A in RS (resp. RO), if A has an associated sibling concepts

set , the concept annotations .... (resp. < ,

_, , _>.... < ,, _, , _>) are added to SemAnn(RS,RO).

Constrained Attributes. For every concept annotation ca in SemAnn(RS,RO), given an attribute U having domain A and an attribute V having domain B, the attribute annotation between U and V is added to SemAnn(RS,RO) if:

1. Lsim(U,V) > lth;
2. There is not an attribute X having domain A such that Lsim(X,V) > Lsim(U,V);
3. There is not an attribute Y having domain B such that Lsim(U,Y) > Lsim(U,V).
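The three conditions amount to keeping only attribute pairs that are each other's best match and exceed the threshold lth. A small sketch follows, assuming a hypothetical function name attribute_annotations and a toy similarity table in place of Lsim.

```python
def attribute_annotations(attrs_a, attrs_b, lsim, lth):
    """Return (U, V) pairs satisfying the three Constrained Attributes conditions."""
    pairs = []
    for u in attrs_a:
        for v in attrs_b:
            score = lsim(u, v)
            if score <= lth:                                  # condition 1
                continue
            if any(lsim(x, v) > score for x in attrs_a):      # condition 2
                continue
            if any(lsim(u, y) > score for y in attrs_b):      # condition 3
                continue
            pairs.append((u, v))
    return pairs

# Toy similarity values (assumptions for illustration only).
_SCORES = {("name", "fullName"): 0.9,
           ("phone", "telephone"): 0.95,
           ("phone", "fax"): 0.6}

def toy_lsim(u, v):
    return _SCORES.get((u, v), 0.1)

print(attribute_annotations(["name", "phone"],
                            ["fullName", "telephone", "fax"],
                            toy_lsim, 0.5))
```

Here (phone, fax) exceeds the threshold but is discarded, because telephone is a better match for phone.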

Constrained Roles. For every pair of roles (R1,R2) such that:

1. there is a concept annotation CA1 involving the domains of R1 and R2;
2. there is a concept annotation CA2 involving the ranges of R1 and R2;
3. Lsim(R1,R2) > lth;

a path annotation is added to SemAnn(RS,RO).

Class-Attribute. This mismatch is identified if the following conditions hold:

1. (A1,A2) is involved in the concept annotation CA1;
2. An attribute U with domain A1 (resp. A2) is not involved in any attribute annotation related to CA1;
3. There is a concept C having attributes V1...Vn adjacent to A1 (resp. A2) by means of the role R;
4. Lsim(U,R) > lth or Lsim(U,C) > lth;
5. C is not matched with any concept, except for A1 (resp. A2), in any concept annotation.

If this mismatch is identified, the following complex annotation is built:

(resp. ) (resp. )

If present, the concept annotation between C and A1 (resp. A2) is removed.

Complex Attribute. This mismatch can be considered as more general than the class-attribute mismatch. It is identified if the following conditions hold:

1. (A1,A2) is involved in the concept annotation CA1;
2. A set of attributes U1...Un with domain A1 (resp. A2) are not involved in any attribute annotation related to CA1;
3. There is a concept C having attributes V1...Vn adjacent to A1 (resp. A2) by means of the role R;
4. C is not matched with any concept, except for A1 (resp. A2), in any concept annotation;
5. Every Ui can be matched with a Vi with Lsim(Ui,Vi) > lth.

If this mismatch is identified, the following complex annotation is built, where every Ui and Vi participate only in the match with higher similarity:

(resp. ) (resp. ) .

(resp. )

If present, the concept annotation between C and A1 (resp. A2) is removed.

Chain Role. This mismatch is identified if the following conditions hold:

1. A role R of RS (resp. RO) is not involved in any path annotation;
2. The domain of R is involved in a concept annotation CA1 and the range in a concept annotation CA2;




3. R1....R2 is the shortest path in the ontology graph representation of RO (resp. RS) such that the domain of R1 is involved in CA1 and the range of R2 is involved in CA2;

If this mismatch is identified, the following complex annotation is built:

(resp.

Attribute Assignment over chain roles. This mismatch is identified if the following conditions hold:

1. p is a path annotation previously discovered;
2. U1...Un are the attributes having as domain some concepts involved in the path (possibly composed of a single atomic role) of RS, and V1...Vn are the attributes having as domain some concepts involved in the path (possibly composed of a single atomic role) of RO;
3. U1...Un and V1...Vn are not involved in any attribute annotation;
4. Every Ui can be matched with a Vi with Lsim(Ui,Vi) > lth.

If this mismatch is identified, a complex annotation is built adding to p an attribute annotation for every (Ui,Vi), such that every Ui and Vi participate only in the match with higher similarity.

Other heuristics. The last step aims at enriching the 1:n concept annotation with restrictions over roles and attributes. To this end we adopt the algorithms described in [9] to discover role restriction and attribute restriction templates.

4.4. Semantic Reconciliation Rule Generation Service

The objective of the Semantic Reconciliation Rule Generation service is to provide a semi-

automatic support to the definition of backward and forward reconciliation rules (i.e.,

operational mappings) starting from the previously defined declarative mappings (Semantic

Annotations). A Semantic Annotation is not able to fully represent how to actually transform

data from one format to another. Nevertheless, the knowledge carried by these declarative

mappings is extremely useful for generating actual transformation rules.

The service works according to the following steps:

Abstract rules generation. Starting from the previously defined semantic annotations, an abstract representation of transformation rules (i.e., FOL rules) is generated, following the FOL grounding presented for every Mismatch Template in Section 4.2.2;

Rule validation and completion by a human user. Not all the knowledge needed for generating a transformation rule is contained in the annotation. For instance, splitting one string (e.g., name) into two strings (e.g., firstname and surname) needs the specification of a separator to identify the two substrings. In this phase, the human user operates through a graphical user interface, whose objective is to shield the user from the complexities of a formal syntax by showing the rules in a friendly way.
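The name-splitting case mentioned above illustrates why a human must complete the rule: the annotation says that name maps to firstname and surname, but not where to cut. A minimal sketch, where the function split_name and the record layout are hypothetical:

```python
def split_name(record, separator=" "):
    """Split record['name'] at the first occurrence of `separator`.

    The separator is the piece of knowledge the annotation cannot provide
    and that the user supplies during rule completion.
    """
    first, _, last = record["name"].partition(separator)
    return {"firstname": first, "surname": last}

print(split_name({"name": "Ada Lovelace"}))
print(split_name({"name": "Ada|Lovelace"}, separator="|"))
```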

Here we intend as abstract reconciliation rules First Order Logic formulas of the form:

$$\forall \mathbf{x}\, \forall \mathbf{y}\, \bigl(\varphi(\mathbf{x},\mathbf{y}) \rightarrow \exists \mathbf{w}\, \psi(\mathbf{x},\mathbf{w})\bigr)$$

where $\varphi$ and $\psi$ are conjunctive formulas defined:

in Forward Rules over the Alphabets of RS and RO, respectively;
in Backward Rules over the Alphabets of RO and RS, respectively.

This kind of logic-based representation of schema mappings is known in the literature as GLAV mappings or TGDs [6].




Figure 3 shows an example of forward reconciliation rule generation. The upper part graphically represents a Complex Semantic Annotation, which instantiates the Complex Attribute and Attribute Composition mismatch templates, together with the corresponding abstract reconciliation rule.

Figure 3: Example of abstract reconciliation rule generation

4.5. Source2Target Mediator Generation Service

The Source2Target Mediator Generation Service allows the generation and the publication of

a specific web-service for the run-time translation of document instances between a given pair

of source and target schemas. It works according to the following steps:

A source resource schema (SRS) and a target resource schema (TRS) are provided as input to the service, together with i) the reference ontology (RO) to be used in the reconciliation process, ii) the abstract forward rules defined between SRS and RO, iii) the abstract backward rules defined between TRS and RO.

An executable representation of the two sets of rules is compiled. In particular, abstract rules are serialized into Jena2 rules, in order to allow their execution by the reconciliation engine SIRE, based on the Transitive Rule Reasoner of the Jena2 [iii] toolkit.

A web-service S is automatically generated and published. This service takes as input the URI of an instance file conforming to SRS, and returns an instance file conforming to TRS. S represents an indirection level, wrapping a customized execution of the reconciliation engine SIRE.
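Conceptually, the generated mediator chains the forward rules (SRS to RO) with the backward rules (RO to TRS). The sketch below only illustrates this composition with plain Python functions over dictionaries; it does not use Jena2 or SIRE, and make_mediator, the rule functions and the field names are all hypothetical.

```python
def make_mediator(forward_rules, backward_rules):
    """Build a source-to-target translator from two rule sets."""
    def mediate(source_instance):
        # Forward step: lift the source instance to the reference ontology.
        ontology_instance = {}
        for rule in forward_rules:
            ontology_instance.update(rule(source_instance))
        # Backward step: lower the ontology instance to the target schema.
        target_instance = {}
        for rule in backward_rules:
            target_instance.update(rule(ontology_instance))
        return target_instance
    return mediate

# Hypothetical rules: the source has a single "name" field, the target
# expects "fullName" in "surname, firstname" order.
forward = [lambda src: dict(zip(("ro:firstName", "ro:surname"),
                                src["name"].split(" ", 1)))]
backward = [lambda ro: {"fullName": ro["ro:surname"] + ", " + ro["ro:firstName"]}]

mediate = make_mediator(forward, backward)
print(mediate({"name": "Ada Lovelace"}))
```

Routing every translation through the reference ontology means that each schema needs only one forward and one backward rule set, instead of one rule set per pair of schemas.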

4.6. Semantic Reconciliation Suite Platform

In Figure 4, a functional view of the Semantic Suite is depicted and its components are here recalled:

Athos is the ontology management system; in the reconciliation process it acts as the Ontology Repository through the Ontology Catalog interface. Athos is an autonomous system, provided with a Web user interface.

Semantic Annotation Tool (SAT) exposes the Annotation Definition interface that provides functionalities to define and edit Annotations between resources. The Semantic Mapping Discovery Service is implemented within this module (Sections 4.2 and 4.3).

Semantic Abstract Rule Builder (SARB) is the reconciliation rule building tool; it exposes the Rule Building interface that provides functionalities to define transformation rules (backward and forward). The Semantic Reconciliation Rule Generation Service is implemented within this module (Section 4.4).




Semantic Interoperability Reconciliation Engine (SIRE) is the reconciliation engine; it exposes the Rule Execution interface that provides functionalities to perform the actual data reconciliation between resource schemas.

Semantic Interoperability Mediator Generator (SIMEG) exposes in the S2T Mediator Generation interface the functionalities of the Source2Target Mediator Generation Service (Section 4.5).

Resource Repository stores the schemas and the instances of the resources that have to be reconciled.

Annotation Repository and Rule Repository store Semantic Annotations and Transformation Rules respectively.

Reconciliation Suite Web App is the server-side web application that provides a unified User Interface for the services of SARB, SAT, SIRE and SIMEG.

User Web Browser is a user web client.

Figure 4: Functional View of the Reconciliation Suite Platform

The Reconciliation Suite has been implemented as a Java application. In particular the web

application is based on the Google Web Toolkit. With respect to the first release of the

Reconciliation Suite Platform some functionalities have been added, regarding user

administration and the management of the persistent resources.

4.7. State of the Art

This section presents some existing results addressing the problem of interoperability among

software applications. The section is divided into three sub-sections, which present standards

for document exchange, semantics-based platforms for document reconciliation, and methods

for mapping discovery, respectively.

International standards for business documents exchange

Universal Business Language (UBL) [iv] is a library of standard electronic XML business documents such as purchase orders and invoices. UBL was developed by an OASIS Technical Committee with participation from a variety of industry data standards organizations. The UBL 2.0 Standard includes 31 documents in total, roughly grouped into the following categories: Presale Ordering, Delivery, Invoicing, and Payment. The Core Components Technical Specification (CCTS) [v] defines meta models and rules necessary for describing the structure and contents of conceptual and physical/logical data models, process models, and information


exchange models. Therefore, CCTS describes an approach for developing a common set of

semantic building blocks that represent the general types of business data in use today. This

approach provides for the creation of new business vocabularies as well as restructuring of

existing business vocabularies to achieve semantic interoperability of data.

Open Financial Exchange (OFX) [vi] is a unified specification for the electronic exchange of financial data between financial institutions, businesses and consumers via the Internet. In particular, it defines the request and response messages used by each financial

service as well as the common framework and infrastructure to support the communication of

those messages.

The e-GIF [vii] defines the technical policies and specifications governing information

flows across government and the public sector. They cover interconnectivity, data integration,

e-services access and content management. The e-GIF is presented as a set of policies,

technical standards, and guidelines, which cover ways to achieve interoperability of public

sector data and information resources, information and communications technology (ICT),

and electronic business processes. The aim is to enable any agency to join its information,

ICT or processes with those of any other agency using a predetermined framework based on

open (i.e. non-proprietary) international standards.

The adoption of standards to face the problems of document exchange implies a strong

effort in the refactoring of legacy software applications, which is exactly what the semantic

reconciliation suite here described wants to avoid. However, the existence of standards is very

relevant, because they represent an important result in terms of description and organization

of business documents. As such, they are a crucial resource in the construction of the

reference ontology that is at the basis of the usage of the semantic reconciliation suite. For

instance, in the ATHENA project, the e-procurement ontology, concerning purchase order

and invoice, has been built considering some standards like UBL and RosettaNet [viii].

Semantic reconciliation platforms

AMEF, the ARTEMIS Message Exchange Framework [11] for document reconciliation, is the

result of the ARTEMIS project [ix]. It allows the mediation of two OWL ontologies which represent the schemas of the documents to be reconciled. For this reason, the schemas of the

documents to be reconciled are previously transformed into OWL [x] by using a lift and

normalization process. The semantic mediation is realized in two phases: (i) Message

Ontology Mapping Process, where the two ontologies are mapped one to another, in order to

build Mapping definitions (i.e., transformation rules), with the support of the OWLmt

ontology mapping tool; (ii) Message Instance Mapping, where XML instances are first

transformed into OWL instances, and then Mapping definitions are applied to transform

messages from the original to the destination format.

The MAFRA (MApping FRAmework) [21] is a framework for mapping distributed

ontologies. It is based on the definition of Semantic Bridges as instances of a Semantic Bridging ontology which represents the types of allowed bridges. Such bridges represent

transformation rules. Also in this case a lift and normalization activity is performed to

transform the original documents (schemas and data) into the ontology format. Afterwards,

considering the transformed schemas, the Semantic Bridges between the two parties are

created on the basis of the Semantic Bridge Ontology (SBO). With respect to the automatic

support provided by the platform, AMEF does not provide any facility: mapping and rules

have to be created manually. Concerning the MAFRA platform, a very limited automatic

support is provided. Conversely, the main goal of our work is to provide an effective

automatic support to those activities that are error-prone and time consuming (i.e., semantic

annotation and transformation rules building).

The Web Service Modeling Toolkit (WSMT) [18] is an integrated development environment for Semantic Web Services that enables developers to develop Ontologies, Web

Services, Goals and Mediators through the Web Service Modeling Ontology (WSMO)




formalism. The WSMT is implemented as a collection of plug-ins for the Eclipse framework that cover several areas of functionality; among them the Mapping Perspective [24] is a tool for defining mappings between ontologies. Mappings are defined through a formal model, linked to a logic-based Abstract Mapping Language that does not commit to any existing ontology representation language. Such mappings are then grounded to a concrete and executable representation language, the WSML-Rule language designed for instance transformation. In this approach operational mapping rules are seen as a set of WSML axioms that are evaluated by a WSML reasoner. WSMT also offers automatic support for mapping discovery, accomplished by using a set of suggestion algorithms for both lexical and structural analysis of the concepts. Concerning methodological aspects of the resource mapping process, WSMT shares some analogies with the semantic reconciliation suite approach. However, the WSMT logical architecture is designed to create mediation mappings between a service requester and a service provider that use different conceptual models (ontologies) to describe the same domain. On the other hand, the purpose of the semantic suite is to allow interoperability within a network of software applications through the adoption of a common and shared conceptualization of the domain (the reference ontology) that provides a common view over the heterogeneous data sources. Furthermore, the WSMT is released as an Eclipse plug-in, which means a standalone application. On the contrary, the semantic reconciliation suite is being implemented as a web application and consequently follows a more service-oriented logic.

The Interoperability Service Utility (ISU) [16] developed within the scope of the

iSURF project [xi] provides interoperability between different UN/CEFACT CCTS based

document standards (i.e., OAGIS, UBL, GS1). The proposed approach is centred on the

notion of Harmonized Ontology that contains two types of OWL-DL ontologies: (1) the

Upper Ontology that describes the CCTS artefacts, as generic classes; (2) the Document

Schema Ontologies that describe the actual document artefacts for each electronic business

document standard as subclasses of the classes in the upper ontology. A Description Logic

Reasoner and a Rule Reasoner are used to identify the equivalence and subsumption relations in the Harmonized Ontology. The discovered similarities among the document artefacts are

then used to generate XSLT definitions for xml instance translation. The overall

methodological framework is very relevant to the proposed reconciliation suite; however the

ISU focuses on the integration of XML data defined with respect to CCTS-based standards, while

we intend to be more general aiming at the integration of heterogeneous data without any

assumption regarding the semantics and the structure of the resource schemata. Furthermore

the definition of the relations (DL-axioms or logic rules) between artefacts belonging to a

Document Schema Ontology and to the Upper Ontology is mainly a manual activity, while

our aim is to provide effective support for this activity, which is basically a mapping

discovery task.

Mapping discovery methods

In the recent period, the automation of the mapping discovery (i.e., finding correspondences

or relationship) between entities of different schemas (or models) has attracted much attention

in both the database (schema matching) and AI (ontology alignment) communities (see

[17,26] for a survey). Schema matching and ontology alignment use a plethora of techniques

to semi-automatically find semantic matches. These techniques are based on different

information sources that can be classified into: intensional knowledge (entities and associated

textual information, schema structure and constraints), extensional knowledge (content and

meaning of instances) and external knowledge (thesauri, ontologies, corpora of documents,

user input). We can divide the methods/algorithms used in existing matching solutions in:

rule-based methods, where several heuristics are used to exploit intensional knowledge, e.g. Prompt [25], Cupid [20];




graph analysis, where ontologies are treated as graphs and the corresponding sub-graphs are compared, e.g. Similarity flooding (Melnik, Garcia-Molina, & Rahm,

2002), Anchor-prompt [25];

machine learning based on statistics of data content, e.g. GLUE [13];

probabilistic approaches that combine results produced by other heuristics, e.g.

OMEN [23].

Complex approaches, obtained by the combination of the above techniques, have been

also proposed (e.g., OLA [14]), as well as frameworks that provide extensible libraries of

matching algorithms and an infrastructure for the management of mappings (e.g., COMA

[12]). To the best of our knowledge, most of the previous matching approaches focus on

finding a set of correspondences (typically 1:1) between elements of the input schemas,

possibly enriched by some kinds of relationship (equivalence, subsumption, intersection,

disjointness, part-of, merge/split).

The construction of operational mapping rules (i.e., directly usable for integration and

data transformation tasks) from such kind of correspondences is another challenging aspect.

Clio [15], taking in input n:m entity correspondences together with constraints coming from

the input schemas (relational schema or XSD), produces a set of logical mappings with formal semantics that can be serialized into different query languages (e.g., SQL, XSLT, XQuery).

MapOnto [10] can be viewed as an extension of Clio when the target schema is an ontology.

Most previous mapping constructors concentrate on creating executable mapping rules

between particular data-models; data sources, however, are of many different data models,

(e.g., XML, RDF, Relational, OWL). In our framework we allow general and rich

relationships (Semantic Annotations) that allow the mapping between a wide variety of data

models. Such general declarative mappings can then be used for the construction of model-

dependent operational mapping rules (e.g. conjunctive queries, SQL views, XSLT

transformations).

The main difference with respect to ontology matching and mapping construction

approaches present in literature is that these two steps are not seen as two separated tasks to

be executed in sequence. On the contrary in the proposed mapping discovery algorithm, the

matching phase is driven by the search of templates that can be seen as abstraction of

mapping rules (and hence a mapping construction task). The output of the mapping discovery

service is then a declarative mapping, close to the output of an ontology matching algorithm,

but capable of capturing complex correspondences that are directly interpreted as complex

mapping rules, to be used in a mediation task.




5. Data Payload Interoperability Service

The specifications of the Data Payload Interoperability Service have changed substantially compared to the previous version.

After the evaluation of the first set of services, described in deliverable D5.2.1a, the majority of the comments were that the services were too collaboration oriented.

Moreover, in the set of services dealing with Data Interoperability there was a lack of tools for managing the document payload transformations (those transformations that act on the content of the document, rather than the format).

For these reasons we decided to change the specification and to develop a service whose focus is the content transformation of business documents.

5.1. Updated requirements

The new service has one very specific requirement: automation.

The service needs to be extremely automated in its procedures.

After a necessary setup step to define the environment of the user, the service should be able to apply such environment to the submitted document and return the results.

The following table summarizes the other technical requirements expressed for the Data Payload Interoperability Service:

ID    | Name of the feature               | Description of the feature
REQ 1 | Creation of negotiation           | The service should allow the possibility to create a new negotiation, by providing at least name and description.
REQ 2 | 1:1 and 1:N negotiation scenarios | The service should provide the possibility to select multiple users for the negotiation, in order to enable the 1:N scenario.
REQ 3 | Creation of business rule         | The service should provide the possibility to create business rules.
REQ 4 | Management of business rules      | The service should allow managing the business rules: look at the content and delete them.
REQ 5 | Definition of user roles          | The service should give the possibility to define the rules for different roles, depending on whether the user is the creator of the negotiation or just a participant.
REQ 6 | Application of rules              | The service should give the possibility to select which rules to apply to specific negotiations and see the results.

5.2. Rules application

The most efficient way to deal with the automation requirement is the application of business rules.

Business rules are expressions in the form IF-THEN-ELSE, which define the behaviour of the software when some conditions are (or are not) verified.




Such rules are executed by a rule engine and can be written in a structured language, which can be based on a standard format (like XML) or be proprietary to the engine.

The syntax of the rules is always engine-specific, since every engine understands only its own language.

There are many business rule engines available on the Internet; here are some examples of those evaluated:

Drools - http://drools.org/
OpenRules - http://openrules.com
Mandarax - http://mandarax.sourceforge.net/
SweetRules - http://sweetrules.projects.semwebcentral.org
TermWare - http://www.gradsoft.ua/products/termware_eng.html
JRuleEngine - http://jruleengine.sourceforge.net/
JLisa - http://jlisa.sourceforge.net/
JEOPS - http://sourceforge.net/projects/jeops/
Prova - http://comas.soi.city.ac.uk/prova
Open Lexicon - http://openlexicon.org
Zilonis - http://www.zilonis.org
Hammurapi - http://www.hammurapi.biz

For the development of the Data Payload Interoperability Service we decided to use

JRuleEngine.

5.3. JRuleEngine

JRuleEngine (http://jruleengine.sourceforge.net/) is a Java rule engine, based on Java Specification Request (JSR) 94.

It has been selected among the other candidate engines because it is simple to use and because rules can be written in a very simple XML-based format.

Another very useful functionality is the possibility to bind the execution of the rules to methods defined at code level.

This offers the possibility to enrich the behavior of the rules by adding some logic to the execution of the IF-THEN statements.

The code below represents an example of definition of a rule in the JRuleEngine language:

RuleExecutionSet1Rule Execution Set


The engine allows the definition of multiple IF statements (connected with the logical AND operator) and multiple THEN statements.
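
The IF/THEN structure described above can be illustrated with a minimal Python sketch. This is not JRuleEngine syntax or its API: the `Rule` class, the `fire` method and the order fields (`quantity`, `unit_price`, `discount`, `needs_review`) are hypothetical names chosen only to show conditions joined by AND and several actions executed in sequence.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Order = Dict[str, float]  # hypothetical flat view of an order


@dataclass
class Rule:
    name: str
    conditions: List[Callable[[Order], bool]]  # IF parts, joined with AND
    actions: List[Callable[[Order], None]]     # THEN parts, run in order

    def fire(self, order: Order) -> bool:
        """Run every THEN action only when every IF condition holds."""
        if all(condition(order) for condition in self.conditions):
            for action in self.actions:
                action(order)
            return True
        return False


# Hypothetical rule: two IF conditions and two THEN actions.
bulk_rule = Rule(
    name="bulk-discount",
    conditions=[
        lambda o: o["quantity"] >= 100,
        lambda o: o["unit_price"] > 10.0,
    ],
    actions=[
        lambda o: o.update(discount=0.05),
        lambda o: o.update(needs_review=1.0),
    ],
)

large_order = {"quantity": 120.0, "unit_price": 12.5}
small_order = {"quantity": 10.0, "unit_price": 12.5}
large_fired = bulk_rule.fire(large_order)  # both conditions hold
small_fired = bulk_rule.fire(small_order)  # first condition fails
```

Because the conditions are combined with AND, a single failing condition prevents all of the THEN actions from running, so `small_order` is left untouched.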

5.4. The negotiation process

The core of the Data Payload Interoperability Service is the creation and management of business rules that enable the negotiation of business documents in 1:1 and 1:N scenarios.

The documents used within the service are UBL orders.

The service gives the users the possibility to define business rules and select which ones to

apply to specific negotiations.

The service is composed of three main parts:

Negotiation creation: it allows the creation of new negotiations. The user can specify the name and description of the negotiation and select the participants (one or many) allowed to take part in it.

Rules creation: it allows the creation and management of business rules. Rules are created with a specific role (sender and/or receiver) according to the foreseen use of the rule. Sender rules are used when the negotiation was created by the current user (the sender is the person who creates the negotiation), while Receiver rules are used when the current user is participating in a negotiation he does not own (the receiver is a participant selected by the sender).

Rules application: this is the core part of the service. It allows the selection of the specific negotiation to evaluate and of the rules to apply to it.

The service decomposes each 1:N negotiation into several 1:1 negotiations, which are easier to manage within the service logic.
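
A minimal sketch of this decomposition is given below, assuming a negotiation is described only by its name, description, sender and list of participants. The `Negotiation` class and the `decompose` function are hypothetical and do not reflect the actual data model of the service.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Negotiation:
    name: str
    description: str
    sender: str                    # the user who created the negotiation
    participants: Tuple[str, ...]  # one or many receivers


def decompose(negotiation: Negotiation) -> List[Negotiation]:
    """Split a 1:N negotiation into one 1:1 negotiation per participant."""
    return [
        Negotiation(negotiation.name, negotiation.description,
                    negotiation.sender, (participant,))
        for participant in negotiation.participants
    ]


# Hypothetical 1:3 negotiation split into three 1:1 negotiations.
tender = Negotiation("Q3 order", "Office supplies", "buyer",
                     ("supplierA", "supplierB", "supplierC"))
pairs = decompose(tender)
```

Each resulting negotiation keeps the same sender and metadata, so the rest of the logic only ever has to handle the 1:1 case.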

When a negotiation is selected for management, the system automatically applies all the rules of the participant user, without making this visible to the current user.

This means, for example, that if person X is the creator of a negotiation and is currently evaluating it, the service will automatically apply all the rules defined by the participant for that negotiation.

In the rules application system the order of the rules matters, because applying one rule can change the content seen by the next one. The service therefore applies the rules of the participants first, and then the rules of the current user.
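
The effect of this ordering can be shown with a small sketch in which each rule is modelled as a function from a document to a new document. The names `apply_rules`, `cap_quantity` and `bulk_discount`, and the `quantity` and `discount` fields, are assumptions made for illustration only.

```python
from typing import Callable, Dict, List

Document = Dict[str, float]
Rule = Callable[[Document], Document]


def apply_rules(document: Document,
                participant_rules: List[Rule],
                current_user_rules: List[Rule]) -> Document:
    """Apply the participant's rules first, then the current user's rules."""
    result = dict(document)  # work on a copy, leave the input untouched
    for rule in list(participant_rules) + list(current_user_rules):
        result = rule(result)
    return result


def cap_quantity(doc: Document) -> Document:
    # Hypothetical participant rule: never accept more than 50 units.
    return {**doc, "quantity": min(doc["quantity"], 50)}


def bulk_discount(doc: Document) -> Document:
    # Hypothetical current-user rule: discount only for large orders.
    if doc["quantity"] >= 100:
        return {**doc, "discount": 0.1}
    return doc


order = {"quantity": 120, "discount": 0.0}
participant_first = apply_rules(order, [cap_quantity], [bulk_discount])
swapped = apply_rules(order, [bulk_discount], [cap_quantity])
```

With the participant's rule applied first, the quantity is capped before the discount rule is evaluated, so no discount is granted. Swapping the two lists grants the discount, which shows why the service fixes the order.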




6. Innovative Services for Federated Interoperability

The federated approach is characterized by the absence of any reference meta-model that can be used to reconcile documents from one format to another.

The Federated Interoperability service works on the structure of UBL documents and lets the user choose how to perform the transformation and reconciliation of different formats.

6.1. The COIN approach

The approach adopted in COIN is the composition of micro-services.

The UBL document is analyzed at schema level and decomposed into several parts, each one

representing a single main node of the document.

For each part several transformations may exist, either depending on the target document format or simply uploaded by end users.

For this version of the service the target domain will be the transformation of the Swedish

invoice (UBL 1.0) to the Turkish invoice (UBL 2.0).

The system will offer a default set of micro-services that perform transformations based on XSLT, but the user can upload different ones according to his specific needs.

Users can then select and combine different micro-services to obtain a more complex and complete transformation.
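
The composition idea can be sketched as follows. The sketch assumes that each micro-service is a function transforming one main node of the document, and it uses plain Python functions instead of XSLT so that it needs only the standard library. The element names (`Invoice`, `ID`, `InvoiceNumber`, `TargetInvoice` and so on) are invented for the example and are not taken from the UBL schemas.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict

MicroService = Callable[[ET.Element], ET.Element]


def rename(new_tag: str) -> MicroService:
    """Build a micro-service that copies a part under a new tag name."""
    def service(part: ET.Element) -> ET.Element:
        out = ET.Element(new_tag, dict(part.attrib))
        out.text = part.text
        out.extend(list(part))
        return out
    return service


def transform(document: ET.Element,
              services: Dict[str, MicroService],
              target_root: str) -> ET.Element:
    """Decompose the document into main nodes and transform each one."""
    result = ET.Element(target_root)
    for part in document:  # one part per main node of the document
        service = services.get(part.tag)
        # Parts without a selected micro-service are copied unchanged.
        result.append(service(part) if service else part)
    return result


# Hypothetical source document and user-selected micro-services.
source = ET.fromstring(
    "<Invoice><ID>42</ID><IssueDate>2010-06-30</IssueDate>"
    "<Note>net 30</Note></Invoice>"
)
selected = {"ID": rename("InvoiceNumber"), "IssueDate": rename("Date")}
target = transform(source, selected, "TargetInvoice")
print(ET.tostring(target, encoding="unicode"))
```

Keeping one micro-service per main node means a user can replace the transformation of a single part without touching the others.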

6.2. Requested features

This section summarizes the features required of the service.

Selection of document formats: the user must be able to select which kind of documents he wants to apply the transformation to.

Presentation of single parts: the service must be able to decompose the structure of the selected document into smaller parts that are independent of one another.

Default transformation: the system must provide a basic way of performing the transformation.

Manual transformation: the system must give the user the possibility to upload a private transformation for the selected parts.

Testing: each transformation (default or private) must be testable by the user.

Composition of micro-services: the user must be able to select different parts and use a set of micro-services to compose a more complex and complete transformation.
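
The default, manual and testing requirements listed above could be supported by a small registry such as the one sketched below. This is only an illustration under stated assumptions: the `TransformationRegistry` class, its method names and the date example are hypothetical, and transformations are modelled as string-to-string functions rather than XSLT stylesheets.

```python
from typing import Callable, Dict

Transformation = Callable[[str], str]


class TransformationRegistry:
    """Holds default transformations and user-uploaded private ones."""

    def __init__(self) -> None:
        self._default: Dict[str, Transformation] = {}
        self._private: Dict[str, Transformation] = {}

    def register_default(self, part: str, transformation: Transformation) -> None:
        self._default[part] = transformation

    def upload_private(self, part: str, transformation: Transformation) -> None:
        self._private[part] = transformation

    def resolve(self, part: str) -> Transformation:
        """Prefer the private transformation, else fall back to the default."""
        if part in self._private:
            return self._private[part]
        return self._default[part]  # raises KeyError if nothing is registered

    def check(self, part: str, sample: str, expected: str) -> bool:
        """Let the user test a transformation on a sample value."""
        return self.resolve(part)(sample) == expected


def to_day_first(value: str) -> str:
    # Hypothetical private transformation: YYYY-MM-DD to DD/MM/YYYY.
    year, month, day = value.split("-")
    return f"{day}/{month}/{year}"


registry = TransformationRegistry()
registry.register_default("IssueDate", str.strip)
default_ok = registry.check("IssueDate", " 2010-06-30 ", "2010-06-30")
registry.upload_private("IssueDate", to_day_first)
private_ok = registry.check("IssueDate", "2010-06-30", "30/06/2010")
```

Resolving the private transformation before the default one lets a user override a single part while still relying on the defaults for everything else.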




7. Conclusions

WP 5.2 introduces the concept of Interoperability Space.

This is a set of services whose purpose is to take into account all the possible kinds of data transformation that can be applied to documents.

Data interoperability can be divided into two main branches:

Payload interoperability: refers to the transformations applied to the content of the documents.

Schema interoperability: refers to the transformations applied to the structure of the documents.

Schema interoperability, in turn, can be divided into two branches according to the approach followed for the transformation:

Unified approach: implies the use of a reference meta-model for managing the transformations.

Federated approach: implies the absence of a reference meta-model for managing the transformations.

For each of the three main groups (payload, unified and federated) WP 5.2 has developed a set of services.

The Innovative Services for Semantic Reconciliation group, in a unified environment, a set of functionalities that provide effective automatic support for the definition and execution of expressive mappings between heterogeneous resources, with the aim of providing a reconciliation framework for the exchange of eBusiness resources.

The Data Payload Interoperability Service works on the content of the documents and is used in the negotiation process of UBL orders.

The Innovative Services for Federated Interoperability work on the structure of the documents and propose an approach in which no reference meta-models are available and the users are instead free to provide personal transformations.




