PhD Program in Computer Science (Dottorato di Ricerca in Informatica)
First Cycle, New Series (I ciclo Nuova Serie)
Università di Salerno
A Grammar-based Approach to Specify and Implement Visual Languages
Vincenzo Deufemia
November 2002
Chairman: Prof. A. De Santis
Supervisor: Prof. G. Costagliola
Abstract
This thesis work presents a methodology for modeling and implementing visual languages.
The approach relies on the syntactic framework of eXtended Positional Grammars (XPG,
for short). This is a formalism to model the basic elements (visual symbols) of the visual
notation, their syntactic properties, the relations between them, and a set of syntactic
rules to formally define the feasible visual sentences.
We present a powerful LR-based (XpLR) methodology for parsing visual languages
described by XPGs. As a result, a broad class of visual languages can be described and
compiled while retaining most of the efficiency of LR parsing. We describe this new
algorithm, named the XpLR(0) parser, and provide heuristics that resolve a number of
conflicts that commonly arose in previous applications of the LR methodology to visual
languages.
The expressive power of the formalism has been highlighted by modeling UML state
diagram languages, which represent one of the most complex visual modeling languages
used in the software engineering field.
An interesting feature of the XpLR methodology is the possibility of using standard
compiler generation tools for the construction of compilers for visual languages. Indeed,
we define mapping rules and conflict handling techniques to convert a generic XPG into an
equivalent translation scheme. This conversion process allows compilers for XPGs to be
implemented rapidly with standard, well-known tools such as YACC.
Using the Visual Language Compiler-Compiler (VLCC) system extended with the
proposed XpLR methodology it is possible to automatically implement visual languages
once their formal XPG specification is given. VLCC generates both editor and compiler for
the specified visual language. This makes our methodology a sound basis for the definition
of a new meta-CASE technology, since VLCC can be used for defining and automatically
generating CASE tools. In fact, we have used it to model the diagrammatic notations of
the Unified Modeling Language (UML), and to generate a set of flexible CASE tools for
supporting them.
One of the most interesting applications of VLCC is the construction of meta-CASE
analysis and design workbenches. Such workbenches are usually visually oriented,
since they support the editing and manipulation of diagrammatic notations that allow
engineers to prototype models of the system. Until recently, the main difficulty with their
automatic generation derived from the lack of a formal specification of the syntax and
semantics of the diagrammatic notations used in analysis and design methods. The formal
specification methods proposed in the visual language research area can fill this gap.
In this thesis, we show how the VLCC system can be used for
the construction of meta-CASE workbenches. The meta-CASE generates a workbench by
integrating a set of visual modeling environments in agreement with a required method,
which includes a process model and suitable rules/guidelines and is specified by means of
an activity diagram.
Acknowledgements
I’d like to thank my advisor Gennaro Costagliola for the profitable discussions that con-
tributed to my research work and for the suggestions he gave me during the preparation
of this thesis. I would like to express my gratitude to Filomena Ferrucci for her careful
support. She has been of great help throughout the doctoral program. I am very glad to
have had the opportunity to work with her.
I thank Carmine Gravino and Giuseppe Polese who have been my co-authors for several
recent research papers related to the results presented in this thesis. I want to thank
Riccardo Distasi for his tireless LaTeX assistance and for his moral support. I also want
to thank all my friends and my colleagues.
A final word of thanks is due to my family for their constant and invaluable support.
Contents
Title Page
Abstract
Acknowledgements
Contents
1 Introduction
1.1 Visual Languages
1.2 A Framework for Describing Visual Languages
1.2.1 Visual symbols
1.2.2 Visual sentences
1.2.3 Representation of visual sentences
1.3 Thesis Outline
2 Extended Positional Grammars
2.1 Modeling UML Statechart Diagrams through XPGs
3 The XpLR Methodology
3.1 The XpLR Parser
3.1.1 The input
3.1.2 The stack
3.1.3 The XpLR parsing table
3.1.4 The XpLR parser
3.1.5 Parsing time complexity
3.2 Constructing XpLR(0) Parsing Tables
3.3 XpLR parsing table conflicts
3.3.1 Handling parsing table conflicts
3.3.2 Building parsing tables with ordered substates
3.4 Applicability of XpLR parsing
4 Building LR(0) Parsers for XPG Grammars
4.1 Converting an XPG into a translation scheme based on string grammars
4.2 Comparing the recognized languages
4.3 Resolving conflicts in non-LR(0) translation schemes
5 An XPG-based Visual Environments Generator
5.1 The Symbol Editor
5.2 The VLCC Textual grammar editor
5.3 The generated Visual Programming Environment
6 Constructing Meta-CASE Workbenches
6.1 The proposed approach for the construction of meta-CASE workbenches
6.2 The MEG Module
6.2.1 The architectural design of MEG and the underlying methodology
6.2.2 Using the VLCC system as a support to MEG construction
6.3 The Workbench Generator
7 Related Work
7.1 Picture Layout Grammars
7.2 Constraint Multiset Grammars
7.3 Relational Grammars
7.4 Symbol-Relation Grammars
7.5 Graph Grammars
7.5.1 Layered Graph Grammars
7.5.2 Hypergraph Grammars
7.6 Visual Conditional Attributed Rewriting Systems
8 Conclusions and Further Research
References
Chapter 1
Introduction
Visual languages are widely used in several application fields: teaching, development of
GUIs, and the software development process. Indeed, graphical notations allow for the
description and understanding of complex systems, such as concurrent and/or real-time
systems, for which traditional textual descriptions are inadequate. This thesis proposes
practical grammar-based means for specifying and implementing visual languages.
In this chapter, we first introduce the concept of visual language. Then, we provide a
framework for describing visual notations. We formalize the concepts of attribute-based,
relation-based and linear representation of a visual sentence. Finally, we describe the
structure of the rest of this thesis.
1.1 Visual Languages
In recent decades a wide variety of activities have made use of icons and diagrams to
allow for multimodal communication and interaction between humans and computers.
The ability to use graphics as a means of communication in any activity that involves
human-computer interaction is named visual programming. Typical activities that benefit
from the use of visual languages are the interpretation of low-level media (e.g., handwriting
recognition and image processing) and graphical user interfaces (e.g., interpretation of
user input, design support, visual and multimedia databases) [45]. Thus, a large number
of visual programming languages have been introduced. Such languages allow a user to
communicate with the system by spatially arranging visual objects on the screen, so as to
compose a “visual sentence” [31].
The use of visual programming is rapidly growing also in the industrial field. Indeed,
productivity improvements derive from a more effective communication between devel-
oper and customer, and from the availability of visual environments that facilitate the
combination of development and execution while promoting iterative design and interac-
tive prototyping (see, e.g., [5, 31,40]). Moreover, visual languages are widely employed to
support many activities of the software development process, such as specification, analy-
sis and design (e.g., Petri nets, FSAs, Statecharts, and Dataflow diagrams). For instance,
UML [57,50] diagrams are a common visual form of expressing and communicating design
information; they are used for modeling, testing, specifying, and programming of soft-
ware systems. Visual notations are able to provide abstract models and different views of
software systems, which allow software designers to devise solutions and design systems.
Considerable effort is currently being devoted to developing formal techniques for specifying,
designing and implementing visual (programming or modeling) languages [8, 31, 44, 47, 53].
Currently, there are three main approaches to visual language specification: the grammatical
approach, the logical approach, and the algebraic approach. The grammatical approach is
based on grammatical formalisms which extend the traditional rewriting mechanism used
in string language specification by describing geometric relationships between the objects
to be rewritten. Close to grammar-based approaches are formal specification methods
based on rewriting systems [49,8]. The logical approach uses first-order mathematical logic
or other forms of logic from artificial intelligence. Logical techniques are usually based
on spatial logics, which axiomatize the possible relationships between objects. Finally,
the algebraic approach uses algebraic specifications consisting of composition functions
constructing complex pictures from simpler picture elements. See [45] for an extensive
survey. In this thesis we will mainly rely on the grammatical approach.
The literature offers several grammatical formalisms for the specification of visual
languages, which differ one from another under several aspects [31, 44, 47, 53]. In general,
such formalisms extend traditional string grammars in that they rewrite sets or multisets
of symbols rather than sequences, and specify several relationships between objects rather
than mere concatenation. As a consequence, the analysis of visual languages is much
harder, due to the high cost of parsing. Indeed, while strings naturally drive a sequential
scan of the symbols, no precise scanning order is implicit in multi-dimensional structures.
However, the effective use of grammars for specifying visual language syntax requires
efficient parsing techniques. The parsing efficiency issue is a very relevant topic that
has been widely investigated in the literature. Several methods have been proposed that
impose some restrictions on the form of the grammars, balancing the ability to express
visual languages and the efficiency of parsing techniques [44].
Several grammar-based visual environment generators have been proposed [35, 11, 15,
22,30,62,58,56,70]. They are able to generate powerful visual environments within which
visual languages are embedded, and the editor and the compiler components are tightly
integrated. This characteristic derives from two specific issues. First, for any visual
language a specialized editor is needed to assist the user in the specification of visual models,
by providing him/her with a set of visual symbols and relationships. Indeed, the use of
general-purpose drawing editors would force users into the unacceptable task of drawing the
possibly complex shapes representing the symbols and relationships of the language, thus
significantly limiting the benefits of visual languages. The second issue is related to the lack
of a standard representation for visual input expressions, analogous to the ASCII format
for textual languages. As a result, any parser is based on the specific input representation
that is used by the associated editor [31].
Visual environment generators can also be used for the construction of meta-CASEs,
to generate workbenches supporting other phases of the software development process. In
particular, they are suited to the generation of visually oriented workbenches (i.e., analysis
and design workbenches), since they support the editing and manipulation of diagrammatic
notations. Such notations are key elements in the software engineering field, since they
effectively enhance human-to-human communication, which is essential for cooperative work.
1.2 A Framework for Describing Visual Languages
Recently, several techniques for describing visual language syntax have been proposed, and
these have been thoroughly analyzed and compared [45]. It turns out that two main
methods can be used to represent visual language sentences: relation-based and
attribute-based. The former describes a sentence as a set of graphical objects and a set of
relations on them, whereas the latter conceives a sentence as a set of attributed graphical objects.
In this section we provide a framework to formally describe visual languages [13, 21].
A visual language is formed by a set of visual symbols from an alphabet and a set of
feasible visual sentences over these symbols. A feasible visual sentence of a language L
is a spatial arrangement of visual symbols according to the syntax of L. The framework
presented below can be used to describe many different classes of visual languages [13].
In this thesis we will mainly focus on its use to describe visual languages of software
development methodologies.
1.2.1 Visual symbols
A visual symbol vs (vsymbol for short) is defined as a triple (M, S, L) where M is the
physical component that is needed to materialize vs to our senses; S is the syntactic
component used to relate vs to other vsymbols. This component depends on how the
visual symbol is used in a sentence. L is the semantic interpretation of a vsymbol and is
used to derive the meaning of the sentences in which it occurs. The three components are
not necessarily disjoint. In particular, M is a set of attributes that specify the physical
appearance of the vsymbol, including size, color, shape, etc.; S is a set of attributes, named
syntactic attributes, whose values depend on the “position” of the vsymbol in the sentence;
L is a conceptual structure defining the semantics of the vsymbol. When we say that the
three components are not necessarily disjoint we mean that certain attributes may be part
of more than one component of a vsymbol. In general, most syntactic attributes are also
part of M. As an example, let us consider the decision vsymbol of a flowchart. In this case, M
describes its graphical aspect, i.e., a rhombus and three little circles (attaching points);
S keeps track of the links connected to each attaching point. According to the syntax of
this language, in order not to cause syntactic errors, 1) the vsymbol must be connected to
other vsymbols through links leaving the attaching points, and 2) no two of its attaching
points can be connected to each other. The semantic component L qualifies the vsymbol as
the condition block of a conditional or loop statement in a flowchart.
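As an illustration only, the triple (M, S, L) and the two syntactic conditions stated for the flowchart vsymbol can be sketched in Python. The class name VSymbol, its field names, the link labels, and the choice of None for a free attaching point are assumptions made for this sketch and are not part of the framework.

```python
from dataclasses import dataclass


@dataclass
class VSymbol:
    """Hypothetical encoding of a vsymbol as the triple (M, S, L)."""
    physical: dict    # M: appearance attributes (shape, size, ...)
    syntactic: dict   # S: attaching point -> label of the plugged link (None if free)
    semantic: str     # L: meaning of the vsymbol


def decision_is_well_formed(v: VSymbol) -> bool:
    """Check the two syntactic conditions stated for the flowchart vsymbol."""
    links = list(v.syntactic.values())
    # 1) every attaching point must carry a link to another vsymbol
    all_connected = all(label is not None for label in links)
    # 2) no link may join two attaching points of the same vsymbol
    no_internal_link = len(set(links)) == len(links)
    return all_connected and no_internal_link


decision = VSymbol(
    physical={"shape": "rhombus", "attaching_points": 3},
    syntactic={1: "in", 2: "yes", 3: "no"},
    semantic="condition block",
)
dangling = VSymbol(decision.physical, {1: "in", 2: "yes", 3: None}, decision.semantic)
self_loop = VSymbol(decision.physical, {1: "in", 2: "loop", 3: "loop"}, decision.semantic)
```

Here decision satisfies both conditions, dangling violates the first, and self_loop violates the second.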
1.2.2 Visual sentences
A visual alphabet S is a set of vsymbols. A visual sentence (vsentence for short) on S is a
set of vsymbols {x1, x2,. . ., xn} with their physical and syntactic components completely
instantiated. Examples of vsentences are given in Fig. 1.1(i) and (ii). In Fig. 1.1(i)
the vsymbols are the blocks of the “flowchart”. The syntactic attributes of each block
correspond to its attaching points and are used to keep track of the connections among
the blocks. In the activity diagram of Fig. 1.1(ii) some of the vsymbols are characterized by
syntactic attributes corresponding to attaching regions visualized, in this case, by thicker
lines. It is easy to note that attaching points can be seen as a special case of attaching
regions. Thus, definitions and discussions on attaching regions apply also to attaching
points.
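Figure 1.1: A flowchart (i) and an activity diagram (ii).
[Figure omitted. (i) A flowchart with the blocks Start, Hear alarm, Turn alarm OFF, Get up,
Stop, Reset alarm, Go back to sleep, and the decision "Do you want to get up?" with branches
yes and no. (ii) An activity diagram with the activities Receive Order, Fill Order, Send
Invoice, Receive Payment, Regular Delivery, Overnight Delivery, Close Order, and the guards
[rush order] and [else].]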
In general, the different types of syntactic attributes can be used to identify several
classes of relations, which yield corresponding ways of modeling visual languages. In this
thesis we will consider two classes of relations, namely the class of connection relations
and the class of geometric relations [13]:
1. A connection relation is specified on a sequence of a finite number of attaching
regions. Visual sentences can be built by connecting attaching regions of vsymbols
through links or arrows. One or more links can be attached to an attaching region.
As an example, the vsymbol On in Fig. 1.2, semantically representing a superstate
in a statechart [34], has one attaching region represented by its border line, and the
two arrows on and off are attached to it.
2. A geometric relation between vsymbols is specified on the coordinates of the upper-
left and the lower-right vertices of their bounding boxes. Sentences can be built by
composing vsymbols through relations such as containment, sibling, right-to, etc. As
an example, the vsymbol NotOn in Fig. 1.2 is related to the two vsymbols Standby
and Off in its bounding box through the containment relation. As another example,
the vsymbol a in the string bacb is related to the vsymbol c through the right-to
relation. In this case right-to is the visual counterpart of the string concatenation
relation. As a consequence, in this framework we model string languages as a special
case of visual languages.
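The geometric relations of item 2 can be sketched as predicates over bounding boxes. This is a minimal sketch under stated assumptions: the Box type, the screen-style coordinates (y grows downward), the sample coordinates, and the tolerance parameter gap are hypothetical, and the precise definition of right-to used in the thesis may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Bounding box: upper-left vertex (x1, y1), lower-right vertex (x2, y2)."""
    x1: float
    y1: float
    x2: float
    y2: float


def contains(outer: Box, inner: Box) -> bool:
    """Containment: inner lies entirely within outer."""
    return (outer.x1 <= inner.x1 and outer.y1 <= inner.y1
            and inner.x2 <= outer.x2 and inner.y2 <= outer.y2)


def right_to(x: Box, y: Box, gap: float = 0.5) -> bool:
    """Assumed reading of right-to: y starts where x ends, on the same row."""
    same_row = x.y1 == y.y1 and x.y2 == y.y2
    return same_row and 0 <= y.x1 - x.x2 <= gap


# NotOn contains Standby and Off (coordinates are made up for the example).
not_on = Box(0, 0, 100, 40)
standby = Box(10, 10, 40, 30)
off = Box(60, 10, 90, 30)

# The string "bacb" as four unit boxes placed side by side.
chars = [Box(k, 0, k + 1, 1) for k in range(4)]
```

With these boxes, contains(not_on, standby) holds, and right_to(chars[1], chars[2]) relates the a to the c of bacb, whereas the first b is not related to c.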
Notice that some relations have an explicit visual representation in vsentences, whereas
others have only an implicit representation. The former are called explicit relations,
whereas the latter are called implicit relations. For example, in Fig. 1.2 the connection
relation is explicitly represented by edges, whereas the containment relation is implicitly
represented.
Moreover, a relation R is called univocal when, given a vsymbol x, there exists at most
one vsymbol y such that x R y. Otherwise, R is called non-univocal. As an example, the
right-to relation is univocal, while the connection relations are usually non-univocal, as we
will see in the following.
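The definition of a univocal relation can be checked directly on a finite set of pairs. The following sketch assumes the relation is given extensionally as (x, y) pairs meaning x R y; the function name and the sample data (a hypothetical vsymbol fork linked to two others) are illustrative only.

```python
def is_univocal(pairs):
    """Return True iff every x is related to at most one y."""
    successor = {}
    for x, y in pairs:
        # setdefault records the first y seen for x and returns it afterwards
        if successor.setdefault(x, y) != y:
            return False
    return True


# right-to over the positions of the string "bacb": each symbol has one right neighbour
right_to_pairs = [(0, 1), (1, 2), (2, 3)]

# a connection relation in which one vsymbol is linked to two others
link_pairs = [("fork", "left"), ("fork", "right")]
```

Under this encoding right_to_pairs is univocal and link_pairs is not.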
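Figure 1.2: A visual sentence with both connection and geometric relations.
[Figure omitted: a statechart including the superstates NotOn (containing Standby and Off)
and On, further states such as High, Low, Cool, and Hot, and transition arrows labeled
on, off, up, down, plus, and minus.]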
From the connection and geometric classes of relations it is possible to derive more
specialized subclasses of relations, as shown in Fig. 1.3 [13]. In particular, the class Graph
is suitable for modeling general graph-structured visual languages. Each vsymbol has a
pre-defined set of attaching regions on its image. The relations are graph interconnections.
Many of these languages have been used within software engineering methodologies. Ex-
amples include languages based on data flow graphs, state transition diagrams, Petri nets,
entity-relationship diagrams, SADT diagrams, Class and Object diagrams. The class Plex
is suitable for modeling graph-structured visual languages, with the limitation that each
vsymbol has a fixed number of attaching points [26], like flowcharts, chemical structures,
Boolean and electric circuits, and so on. For instance, in flowchart languages each graph-
ical object (boxes, diamonds, etc.) has a pre-defined number of attaching points (two for
boxes, three for diamonds, etc.), and graphical objects can be connected only through
links visualized as polylines. The vsymbols of the class Box are characterized by their
bounding boxes. The relations are spatial compositions of three types: inclusion, inter-
section and spatial concatenation, which are all defined on the vertices of the bounding
boxes of vsymbols. Most of the visual notations used in common software development
methodologies can be successfully modeled through these relation classes. In fact, some
notations can be modeled as pure geometric or pure connection, whereas others require
a hybrid modeling, since they combine characteristics from both classes. As an example
Statecharts [34] use arrows between nodes, which correspond to connection relations, and
the containment relation, which is a typical geometric relation.
Figure 1.3: A hierarchy of relation classes.
[Figure omitted: a tree rooted at Relation Classes, with the classes Geometric and Connection
and the subclasses Graph, Plex, Box, Iconic, and String.]
A particular type of relation is annotation: a vsymbol of a visual language can be
annotated with a visual sentence from the same or a different visual language to provide a
detailed description of it. As an example, Fig. 1.4 describes a UML class diagram contain-
ing two classes Person and Traditional Person combined in a generalization/specialization
hierarchy. Each class is annotated with a statechart diagram that describes the dynamic
behavior of the class. The vsymbols of the annotating language could in turn be
annotated, yielding a hierarchy of visual languages. Note that the annotation need not be
“visual”. As a matter of fact, a vsymbol could also be annotated by a sentence of a string
language, such as a high-level textual specification.
Figure 1.4: A visual sentence with two annotations.
[Figure omitted: a UML class diagram with the classes Person and Traditional Person, each
annotated with a statechart; the statecharts use the states Single, Married, NotEngaged, and
Engaged and the transitions getsMarried, getsDivorced, getsEngaged, and birthday.]
The concept of annotation leads to the definition of hierarchical visual notations [13].
More formally, a hierarchical visual notation is a visual language whose vocabulary con-
tains vsymbols that are annotated with textual or visual sentences from the same or from
another language, yielding homogeneous or heterogeneous hierarchies of visual languages,
respectively. Hierarchical visual language support is a vital aspect in software engineering.
In fact, most software development methodologies rely upon a hierarchical combination of
different visual notations. As an example, SADT diagrams are used to specify the func-
tional part of a software system. They use boxes to represent activities. Each box can
be annotated in turn with another SADT diagram to describe the details of the activity
it represents [55]. Similarly, visual symbols of a flowchart can be annotated by textual
Pascal-like code.
1.2.3 Representation of visual sentences
A visual sentence can be represented externally by materializing all of its vsymbols or
internally by taking into account the M, S and L components of its vsymbols. Fig. 1.5
classifies all the possible representations. Usually, the M component of each vsymbol is
represented as a set of attributes describing the physical characteristics of the vsymbol.
In the case of visual languages, the attributes of the M component of a vsymbol may
correspond to its position, a zoom factor, a list of elementary graphical objects, a bitmap
file, etc. The L component may be represented through a conceptual structure such as
a semantic network. In this thesis we mainly focus on the syntactic component S. We
consider three ways to syntactically represent a visual sentence: attribute-based, relation-
based and linear. We will see that these representations can be converted one into the
other, even though the conversions may not be 1-to-1.
visual sentence representations
    external
    internal
        M
        S
            attribute-based
            relation-based
            linear
        L
Figure 1.5: A classification of visual sentence representations.
Attribute-based representation
In the attribute-based case, a vsentence is represented by making explicit the syntactic
attributes of the vsymbols composing it. Let us examine how this representation method
applies to the visual languages in the class hierarchy of Fig. 1.3. In the case of visual
languages from the class String, we make explicit the value of the attribute position of a
vsymbol, which represents the position index of the vsymbol in a string. In the case of Plex
visual languages, the attaching points of a vsymbol v are numbered and represented by
an array ap[1], ..., ap[n]; the value of ap[i] is given by a unique label assigned to the link
plugged into attaching point i of v. In the case of Graph visual languages, the attaching
regions of a vsymbol v are numbered and represented by an array aps[1], ..., aps[n] of sets;
the value of aps[i] is the set of labels of the links plugged into attaching region i of v. Fig.
1.6(a) shows the attribute-based representation of an activity diagram by considering the
link labeling provided in Fig. 1.6(b).
name     aps[1]   aps[2]   aps[3]
Start    {a}      -        -
Activ1   {a}      {b}      -
Sync1    {b}      {c,d}    -
Activ2   {c}      {e}      -
Activ3   {d}      {f}      -
Cond     {e}      {g}      {h}
Activ4   {g}      {i}      -
Activ5   {h}      {l}      -
Mux      {i}      {l}      {m}
Activ6   {f}      {n}      -
Sync2    {m,n}    {o}      -
Activ7   {o}      {p}      -
Halt     {p}      -        -
(a)
[Diagram (b) omitted: the activity diagram with the vsymbols Start, Activ1, Sync1, Activ2,
Activ3, Cond, Activ4, Activ5, Mux, Activ6, Sync2, Activ7, and Halt, whose links are labeled
a, b, c, d, e, f, g, h, i, l, m, n, o, and p.]
(b)
Figure 1.6: Attribute-based representation (a) of the activity diagram in Fig. 1.1(ii)
based on the link labeling in (b).
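The attribute-based representation of Fig. 1.6(a) can be held in a simple data structure. The sketch below is one possible encoding, not the one used by any tool described in this thesis: the names TABLE, parse_aps, and aps_of are hypothetical, and "-" stands for an attaching region that the vsymbol does not have.

```python
from collections import Counter

# Rows of Fig. 1.6(a): name, then the link labels plugged into aps[1..3].
TABLE = """
Start   a    -    -
Activ1  a    b    -
Sync1   b    c,d  -
Activ2  c    e    -
Activ3  d    f    -
Cond    e    g    h
Activ4  g    i    -
Activ5  h    l    -
Mux     i    l    m
Activ6  f    n    -
Sync2   m,n  o    -
Activ7  o    p    -
Halt    p    -    -
"""


def parse_aps(table):
    """Map each vsymbol name to its list of link-label sets."""
    aps = {}
    for line in table.strip().splitlines():
        name, *cells = line.split()
        aps[name] = [set() if c == "-" else set(c.split(",")) for c in cells]
    return aps


APS = parse_aps(TABLE)


def aps_of(name, i):
    """Return aps[i] of the named vsymbol (i is 1-based, as in the text)."""
    return APS[name][i - 1]


# How many attaching regions each link label is plugged into.
usage = Counter(label for regions in APS.values()
                for region in regions for label in region)
```

A quick sanity check on such a table is that every link label occurs in exactly two attaching regions, since each link joins two vsymbols; the counter usage makes this easy to verify.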
Relation-based representation
Given a visual sentence vs, let us consider a set R of binary relation identifiers. A labeled
graph on R and vs, G〈R,vs〉 = (N, E), is defined as follows:
• each node in N identifies a distinct vsymbol in the sentence vs
• a labeled edge (x, y, REL) is in E iff REL ∈ R holds between subsets of syntactic
attributes from the vsymbols x and y, respectively.
Definition 1.1 Let R and vs be a set of binary relation identifiers and a visual sentence,
respectively, a relation-based representation of vs with respect to R is any labeled graph
G〈R,vs〉 that is connected.
In the following, we will denote a labeled graph G〈R,vs〉 by listing its labeled edges in the
format REL(x, y). As an example, let us consider the visual sentence in Fig. 1.6(b). It
can be modeled according to the visual language class Graph, by using a class of relations
of type LINK_{i,j} defined as follows: a vsymbol x is in relation LINK_{i,j} with a vsymbol y
iff attaching point i of x is connected to attaching point j of y, i.e., iff aps_x[i] ∩ aps_y[j]
is not empty. Under these assumptions, the relation-based representation of the activity
diagram of Fig. 1.6(b) is given by the set:
{LINK_{1,1}(Start, Activ1), LINK_{2,1}(Activ1, Sync1), LINK_{2,1}(Sync1, Activ2),
LINK_{2,1}(Sync1, Activ3), LINK_{2,1}(Activ2, Cond), LINK_{2,1}(Cond, Activ4),
LINK_{3,1}(Cond, Activ5), LINK_{2,1}(Activ4, Mux), LINK_{2,2}(Activ5, Mux),
LINK_{3,1}(Mux, Sync2), LINK_{2,1}(Activ3, Activ6), LINK_{2,1}(Activ6, Sync2),
LINK_{2,1}(Sync2, Activ7), LINK_{2,1}(Activ7, Halt)}.
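The LINK relations can be derived mechanically from the attribute-based table of Fig. 1.6(a) by intersecting the aps sets, and the connectivity required by Definition 1.1 can then be checked. The following is a sketch under assumptions: the tuple (i, j, x, y) stands for the relation with indices i, j between x and y, and each link is listed only once, from the vsymbol that comes first in the table to the one that comes later, which is a convention chosen here for the example.

```python
from collections import deque

TABLE = """
Start - a | Activ1 a b | Sync1 b c,d | Activ2 c e | Activ3 d f
Cond e g h | Activ4 g i | Activ5 h l | Mux i l m | Activ6 f n
Sync2 m,n o | Activ7 o p | Halt p -
"""


def parse_aps(table):
    """Map each vsymbol name to its list of link-label sets ("-" = empty)."""
    aps = {}
    for row in table.replace("\n", " | ").split("|"):
        if row.split():
            name, *cells = row.split()
            aps[name] = [set() if c == "-" else set(c.split(",")) for c in cells]
    return aps


APS = parse_aps(TABLE)
APS["Start"] = [{"a"}]  # Start has a single attaching region, as in Fig. 1.6(a)


def link_relations(aps):
    """All (i, j, x, y) with aps_x[i] and aps_y[j] sharing a link label."""
    names = list(aps)
    links = set()
    for a, x in enumerate(names):
        for y in names[a + 1:]:          # x precedes y in table order
            for i, rx in enumerate(aps[x], start=1):
                for j, ry in enumerate(aps[y], start=1):
                    if rx & ry:
                        links.add((i, j, x, y))
    return links


def is_connected(names, links):
    """Breadth-first search over the undirected graph induced by the links."""
    adj = {n: set() for n in names}
    for _, _, x, y in links:
        adj[x].add(y)
        adj[y].add(x)
    seen, queue = {names[0]}, deque([names[0]])
    while queue:
        for n in adj[queue.popleft()] - seen:
            seen.add(n)
            queue.append(n)
    return len(seen) == len(names)


LINKS = link_relations(APS)
```

For the table of Fig. 1.6(a) this yields fourteen relations, one per link label, and the resulting labeled graph is connected, so it qualifies as a relation-based representation.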
Linear representation
Definition 1.2 Given a visual sentence vs = {x_1, ..., x_n} and a set of relation identifiers
R, a linear representation of vs with respect to R is the pair (G〈R,vs〉, P) where:
1. G〈R,vs〉 is a relation-based representation of vs with each relation in R invertible;
2. P is a permutation (y_1, y_2, ..., y_n) of the vsymbols in vs such that for each y_i with
1 < i ≤ n, there exists at least one index k such that 1 ≤ k < i and an edge in G〈R,vs〉
on y_k and y_i.
A linear representation (G〈R,vs〉, P) will be denoted in the following as the string
y_1 R_1 y_2 R_2 y_3 ... R_{n-1} y_n, where each R_j is a non-empty sequence of type:
〈REL_1^{h_1}, ..., REL_i^{h_i}, ..., REL_m^{h_m}〉 with m ≥ 1.
Each REL_i^{h_i} denotes the pair (REL_i, h_i) where REL_i ∈ R or REL_i = REL^{-1} with
REL ∈ R; REL_i labels the edge (y_{j-h_i}, y_{j+1}) in G〈R,vs〉 and relates syntactic attributes
of y_{j+1} with syntactic attributes of y_{j-h_i}, with 0 ≤ h_i < j. In the rest of the thesis, we
will denote REL_i^0 simply as REL_i. Notice that, whenever the relations are invertible,
it is always possible to find a linear representation for vs, since G〈R,vs〉 is connected. As an
example, let us consider the activity diagram in Fig. 1.6. If we consider its relation-based
representation given above and the following permutation of vsymbols: (Start, Activ1,
Sync1, Activ2, Activ3, Cond, Activ4, Activ5, Mux, Activ6, Sync2, Activ7, Halt), the
corresponding linear representation is:
Start 〈LINK_{1,1}〉 Activ1 〈LINK_{2,1}〉 Sync1 〈LINK_{2,1}〉 Activ2 〈LINK^1_{2,1}〉 Activ3 〈LINK^1_{2,1}〉
Cond 〈LINK_{2,1}〉 Activ4 〈LINK^1_{3,1}〉 Activ5 〈LINK^1_{2,1}, LINK_{2,2}〉 Mux 〈LINK^4_{2,1}〉 Activ6
〈LINK^1_{3,1}, LINK_{2,1}〉 Sync2 〈LINK_{2,1}〉 Activ7 〈LINK_{2,1}〉 Halt.
This linear representation fits the interpretation of an activity diagram well. It follows the
natural flow of the described activities. The same cannot be said if an alternative linear
representation starting from Halt is chosen. Thus, the semantics of a visual sentence can
drive the construction of one of its linear representations.
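Definition 1.2 can be exercised on the activity diagram example with a short program. This is a sketch under stated assumptions: edges are written as tuples (i, j, x, y) for the LINK relation with indices i, j between x and y, the order of vsymbols is the one read off the linear representation shown above, and the entries of each sequence are sorted by decreasing h to mirror that example. When an edge is stored from the later vsymbol to the earlier one, the sketch uses the inverse relation, which for LINK amounts to swapping the two indices; that case is flagged in the output.

```python
EDGES = [  # (i, j, x, y): attaching region i of x is linked to region j of y
    (1, 1, "Start", "Activ1"), (2, 1, "Activ1", "Sync1"), (2, 1, "Sync1", "Activ2"),
    (2, 1, "Sync1", "Activ3"), (2, 1, "Activ2", "Cond"), (2, 1, "Cond", "Activ4"),
    (3, 1, "Cond", "Activ5"), (2, 1, "Activ4", "Mux"), (2, 2, "Activ5", "Mux"),
    (3, 1, "Mux", "Sync2"), (2, 1, "Activ3", "Activ6"), (2, 1, "Activ6", "Sync2"),
    (2, 1, "Sync2", "Activ7"), (2, 1, "Activ7", "Halt"),
]

ORDER = ["Start", "Activ1", "Sync1", "Activ2", "Activ3", "Cond", "Activ4",
         "Activ5", "Mux", "Activ6", "Sync2", "Activ7", "Halt"]


def is_valid_order(order, edges):
    """Condition 2: every vsymbol but the first shares an edge with an earlier one."""
    adjacent = {frozenset((x, y)) for _, _, x, y in edges}
    return all(
        any(frozenset((order[k], order[t])) in adjacent for k in range(t))
        for t in range(1, len(order))
    )


def linearize(order, edges):
    """Return [R_1, ..., R_{n-1}]; each entry of R_j is (i, j, h, inverted)."""
    pos = {v: n for n, v in enumerate(order)}      # 0-based positions
    sequences = []
    for t in range(1, len(order)):                 # order[t] plays the role of y_{j+1}
        seq = []
        for i, j, x, y in edges:
            if y == order[t] and pos[x] < t:       # edge already runs earlier -> later
                seq.append((i, j, t - 1 - pos[x], False))
            elif x == order[t] and pos[y] < t:     # use the inverse relation
                seq.append((j, i, t - 1 - pos[y], True))
        seq.sort(key=lambda entry: -entry[2])      # larger h first, as in the example
        sequences.append(seq)
    return sequences


R = linearize(ORDER, EDGES)
```

In the result, R[7] (the sequence placed before Mux) holds the entries for indices 2,1 with h = 1 and 2,2 with h = 0, and R[8] (before Activ6) holds the single entry 2,1 with h = 4, matching the string given above. An order that begins with Halt followed by Start is rejected by is_valid_order, because those two vsymbols share no edge.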
1.3 Thesis Outline
In Chapter 2, we present the eXtended Positional Grammars (XPG, for short), a grammar
formalism for modeling visual notations, including those used in most software develop-
ment methodologies. Then, in Chapter 3, we give an LR-based algorithm for the parsing
of visual notations modeled through XPGs. This algorithm, named XpLR(0) parser, is
also able to solve a number of conflicts previously arising in pLR parsing tables. XPG and
XpLR extend Positional Grammars (PG) [15], and the associated pLR parsing method-
ology. The extension enables us to model diagrammatic notations used in software engi-
neering and to efficiently parse them. In Chapter 4 we describe how to construct parsers
for XPGs by exploiting standard compiler generation tools, like YACC.
XPG and XpLR have been implemented within the latest version of the Visual Language
Compiler-Compiler [14], a system for implementing visual languages, which is described
in Chapter 5. This tool has been used for generating many practical visual environments,
such as environments for the diagrammatic notations of UML and for notations used in
multimedia software engineering and in workflow management.
In Chapter 6 we propose an approach for the construction of meta-CASE workbenches
based on the technology of visual language generation systems and on UML meta modeling.
In Chapter 7, we review the related work. Finally, in Chapter 8 we present our closing
remarks and discuss further directions for the research. Readers who are not familiar with
the implementation of visual languages may find it helpful to glance over Chapter 7 before
reading Chapters 2-6.
Chapter 2
Extended Positional Grammars
Extended Positional Grammars (XPGs) are a direct extension of Positional Grammars
(PGs) [15]. The latter have been successfully used to model and implement several important
visual languages, including languages from the classes Iconic, Plex, and Box. However,
PGs were not suitable to model some critical visual languages, such as those belonging to
the class Graph. This was a considerable limitation, preventing the application of the PG
methodology to some important application fields, such as software engineering. In fact,
most of the visual notations used in software development methodologies are based on the
class Graph. Examples are UML class diagrams, Petri nets, Statechart diagrams, Activity
Diagrams, etc. XPG overcomes this limitation, also thanks to a new parsing technique.
Moreover, we have also provided conflict handling techniques to simplify grammar design.
In fact, as opposed to grammars for string languages, grammar formalisms modeling the
two-dimensional space are more likely to run into ambiguities. Without efficient conflict
handling techniques such grammar formalisms could not be effectively used for modeling
many practical visual languages. In order to avoid conflicts, the designer would have to
produce complex grammars, even for simple visual notations.
An Extended Positional Grammar is the pair (G, PE), where PE is a positional evaluator,
and G can be seen as a particular type of context-free string attributed grammar
(N, T∪POS, S, P). Here “context-free” means that the grammar productions are in
“context-free” format and does not refer to the computational power of the formalism.
The components of G are:
• N is a finite non-empty set of non-terminal vsymbols;
• T is a finite non-empty set of terminal vsymbols, with N∩T = ∅;
• POS is a finite set of binary relation identifiers, with POS∩N= ∅ and POS∩T = ∅;
• S∈ N denotes the starting vsymbol;
• P is a finite non-empty set of productions having the following format:
A → x_1 R_1 x_2 R_2 … x_{m−1} R_{m−1} x_m, ∆, Γ
where A is a non-terminal vsymbol, x_1 R_1 x_2 R_2 … x_{m−1} R_{m−1} x_m is a linear
representation with respect to POS where each x_i is a vsymbol in N ∪ T and each R_j is
partitioned into two sub-sequences
(〈REL_1^{h_1}, …, REL_k^{h_k}〉, 〈REL_{k+1}^{h_{k+1}}, …, REL_n^{h_n}〉) with 1 ≤ k ≤ n
The relation identifiers in the first sub-sequence of an Rj are called driver relations,
whereas the ones in the second sub-sequence are called tester relations. During syn-
tax analysis driver relations are used to determine the next vsymbol to be scanned,
whereas tester relations are used to check whether the last scanned vsymbol (termi-
nal or non-terminal) is properly related to previously scanned vsymbols.
Without loss of generality we assume that there are no useless vsymbols, and no
unit and empty productions [1].
∆ is a set of rules used to synthesize the values of the syntactic attributes of A from
those of x_1, x_2, …, x_m;
Γ is a set of triples (N_j, Cond_j, ∆_j), j = 1, …, t, with t ≥ 0, used to dynamically insert
new terminal vsymbols in the input visual sentence during the parsing process. In
particular,
– N_j is a terminal vsymbol to be inserted in the input visual sentence;
– Cond_j is a pre-condition to be verified in order to insert N_j;
– ∆_j is the rule used to compute the values of the syntactic attributes of N_j from
those of x_1, …, x_m.
Moreover, a property that guarantees the convergence of parsing algorithms based on
XPGs is: “for each production A → x_1 … x_m, ∆, Γ the number of triples in Γ whose
conditions can simultaneously evaluate to true must be less than m−1”. This means that
no more than m−2 vsymbols can be inserted in the input during the application of a
production.
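As an illustration of the convergence property, the following minimal sketch represents a production by its right-hand-side vsymbols and its Γ triples, and applies a conservative version of the bound. The names `Production` and `within_insertion_bound` are hypothetical, the relations R_j and the ∆ rules are omitted, and the check counts all Γ triples instead of only those whose conditions can hold simultaneously, so it is a sufficient test rather than the exact property.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Production:
    """Hypothetical, simplified XPG production (relations and Delta omitted)."""
    lhs: str                       # non-terminal A
    rhs: List[str]                 # vsymbols x_1 ... x_m
    # Gamma triples: (terminal to insert, pre-condition, name of attribute rule)
    gamma: List[Tuple[str, Callable[[], bool], str]] = field(default_factory=list)


def within_insertion_bound(p: Production) -> bool:
    """Conservative check: the total number of Gamma triples is below m - 1."""
    return len(p.gamma) < len(p.rhs) - 1


# A production with m = 3 and one insertion satisfies the bound (1 < 2) ...
three_symbols = Production("A", ["x1", "x2", "x3"], [("n", lambda: True, "rule")])
# ... whereas one with m = 2 and one unconditional insertion does not (1 < 1 fails).
two_symbols = Production("A", ["x1", "x2"], [("n", lambda: True, "rule")])
```

Under this sketch, a production with two right-hand-side vsymbols cannot carry any unconditional insertion without violating the bound.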
Informally, a Positional Evaluator PE is a materialization function which transforms
a linear representation into the corresponding visual sentence in the attribute-based rep-
resentation and/or graphical representation.
In the following we characterize the languages described by an extended positional
grammar XPG = ((N, T ∪ POS, S, P), PE). We write α ⇐ β and say that β reduces to
α in one step, if there exist δ, γ, A, η such that
1. A → η, ∆, Γ is a production in P,
2. β = δηγ,
3. α = δA’πγ, where A’ is a vsymbol whose attributes are set according to the rule ∆
and π results from the application of the rule Γ.
We also write α ⇐^i β to indicate that the reduction has been achieved by applying
production i. Moreover, we write α ⇐^* β and say that β reduces to α, if there exist α_0,
α_1, …, α_m (m ≥ 0) such that
α = α_0 ⇐ α_1 ⇐ … ⇐ α_m = β
The sequence α_m, α_{m−1}, …, α_0 is called a derivation of α from β.
• a positional sentential form from S is a string β such that S ⇐^* β
• a positional sentence from S is a string β containing no non-terminals and such that
S ⇐^* β
• a visual sentential form (visual sentence, resp.) from S is the result of evaluating a
positional sentential form (positional sentence, resp.) from S through PE.
The language described by an XPG, L(XPG), is the set of the visual sentences from
the starting vsymbol S of XPG. Without loss of generality, let us assume that XPG has
no empty productions.
Given the two pairs (x, k) and (y, j), where x ∈ N ∪ T, y ∈ T, k is a syntactic attribute
of x, and j is a syntactic attribute of y, we say that (y, j) is reachable from (x, k) iff one
of the following situations occurs:
1. x = y;
2. there exists a production x→ x1R1x2. . .xi. . .Rm−1xm, ∆, Γ in P such that attribute
k of x is synthesized from attribute h of x1 by means of ∆, and (y, j) is reachable
from (x1, h).
If (y, j) is reachable from (x, k), we also say that y is reachable from x.
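The reachability definition above can be sketched as a graph search over the attribute dependencies induced by the ∆ rules. This is only an illustration under stated assumptions: the table `SYNTH` is a hand-written encoding of the ∆ rules of the State Transition Diagram grammar given later in Example 2.3, the function name `reachable` is hypothetical, an attribute is allowed to be synthesized from any right-hand-side vsymbol, and in the base case x = y the attribute indices are taken to coincide.

```python
# (vsymbol, attribute) -> pairs it is synthesized from, one entry per Delta rule.
# Graph' is another occurrence of Graph, so it appears as ("Graph", 1).
SYNTH = {
    ("Graph", 1): [("NODEI", 1), ("NODEIF", 1), ("Graph", 1), ("PLACEHOLD", 1)],
    ("Node", 1): [("NODEG", 1), ("NODEF", 1), ("PLACEHOLD", 1)],
}
TERMINALS = {"NODEI", "NODEIF", "NODEF", "NODEG", "EDGE", "PLACEHOLD"}


def reachable(synth, terminals, start):
    """Return the terminal (vsymbol, attribute) pairs reachable from `start`."""
    seen, found, stack = set(), set(), [start]
    while stack:
        pair = stack.pop()
        if pair in seen:          # recursive productions would otherwise loop
            continue
        seen.add(pair)
        if pair[0] in terminals:  # base case: x = y
            found.add(pair)
        stack.extend(synth.get(pair, ()))
    return found


from_graph = reachable(SYNTH, TERMINALS, ("Graph", 1))
```

With this encoding, `from_graph` contains the pairs for NODEI, NODEIF and PLACEHOLD, which are the terminals whose attaching region can end up as attaching region 1 of Graph.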
The new features of Extended Positional Grammars, as opposed to Positional Gram-
mars, include the use of multiple driver relations and the introduction of Γ rules to dynam-
ically modify the input visual sentence. It is easy to show that these features dramatically
improve the expressive power of positional grammars. In the following we show three
examples of XPG grammars, the first presenting a simple grammar to describe a plex
visual sentence, the second describing a context-sensitive string language, and the third
modeling a State Transition Diagram language.
Example 2.1 Let us consider the following grammar: N = {A, B}; T = {a, b, d, e, f};
POS = {LINK_{i,j}}, where LINK_{i,j} is defined as in Section 1.2.3, and will be denoted as
h k to simplify the notation. All the vsymbols have two attaching regions as syntactic
attributes except A, which has no attributes. In the following, the notation Vsym_i denotes
the attaching point i of the vsymbol Vsym. The set of productions P is:
(1) A → a 〈1 1〉 B 〈〈11 1〉,〈2 2〉〉 b 〈11 1〉 d
(2) B → e 〈2 2〉 f
∆: (B1 = e1; B2 = f2)
Γ: {(d; true ; d1 = e1)}.
Notice that d is a fictitious terminal vsymbol to be dynamically inserted in the input
sentence during the parsing process. Fig. 2.1 shows how the picture described by the
grammar is reduced to the starting non-terminal A by using productions 2 and 1.
Figure 2.1: A reduction process.
Example 2.2 Let us consider the context-sensitive language L = {a^n b^n c^n | n ≥ 1}. It is
generated by the string grammar with the following productions:
(1) S → a B S c
(2) S → a B c
(3) B c → b c
(4) B a → a B
(5) B b → b b
where the non-terminals are S and B, and the terminals are a, b and c. As a matter of
fact, the sentence a^2 b^2 c^2 is obtained through the following derivation:
S ⇒^1 aBSc ⇒^2 aBaBcc ⇒^3 aBabcc ⇒^4 aaBbcc ⇒^5 aabbcc
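The derivation above can be replayed mechanically. The sketch below is only an illustration: the helper name `derive` is hypothetical, and it assumes that each production rewrites the leftmost occurrence of its left-hand side, which is sufficient for this particular derivation.

```python
# String-grammar productions of Example 2.2, indexed by production number.
PRODUCTIONS = {
    1: ("S", "aBSc"),
    2: ("S", "aBc"),
    3: ("Bc", "bc"),
    4: ("Ba", "aB"),
    5: ("Bb", "bb"),
}


def derive(start, steps):
    """Apply the numbered productions in order, rewriting the leftmost match."""
    forms = [start]
    for number in steps:
        lhs, rhs = PRODUCTIONS[number]
        current = forms[-1]
        if lhs not in current:
            raise ValueError(f"production {number} is not applicable to {current!r}")
        forms.append(current.replace(lhs, rhs, 1))
    return forms


sentential_forms = derive("S", [1, 2, 3, 4, 5])
```

The list `sentential_forms` holds the successive sentential forms S, aBSc, aBaBcc, aBabcc, aaBbcc and aabbcc.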
The extended positional grammar which generates the context-sensitive language a^n b^n c^n
can be obtained by modifying this string grammar accordingly. In particular, the set of
non-terminals is given by N = {S, B} where each vsymbol has two syntactic attributes, named
head and tail, both specifying a position in the plane. The set of terminals is given by T
= {a, b, c}, where each terminal has one syntactic attribute (the pair of coordinates of its
centroid), referred to as head or tail interchangeably. As described in Section 1.2.2, the
right-to relation is the visual counterpart of the string concatenation relation. Thus, the
set of relations is given by POS = {right-to} and the right-to relation can be defined as:
x 〈right-to〉 y if and only if ∃! y | y_head = x_tail + 1
where x, y ∈ N ∪ T. The set of productions P is described below.
(1) S → a 〈right-to〉 B 〈right-to〉 S 〈right-to〉 c
∆: (S_head = a_head; S_tail = c_tail)
(2) S → a 〈right-to〉 B 〈right-to〉 c
∆: (S_head = a_head; S_tail = c_tail)
(3) B → b 〈right-to〉 c
∆: (B_head = b_head; B_tail = b_tail)
Γ: {(c′; true; c′_head = c_head, c′_tail = c_tail)}
(4) B → a 〈right-to〉 B′
∆: (B_head = a_head; B_tail = a_tail)
Γ: {(a′; true; a′_head = B′_head, a′_tail = B′_tail)}
(5) B → b 〈right-to〉 b′
∆: (B_head = b_head; B_tail = b_tail)
Γ: {(b′′; true; b′′_head = b′_head, b′′_tail = b′_tail)}
Notice that superscripts are used to distinguish different occurrences of the same
vsymbol and the terminals in the left-hand side of the string grammar productions are
moved into the Γ rules of the XPG productions.
These productions do not satisfy the property that guarantees the convergence of the
parsing algorithm, but it can easily be shown that the algorithm still converges.
Example 2.3 Let STD=((N, T ∪ POS, S, P), PE) be the XPG for State Transition
Diagrams, characterized as follows. The set of non-terminals is given by N = {StateTD,
Graph, Node} where each vsymbol has one attaching region as syntactic attribute, and
StateTD is the starting vsymbol, i.e. S = StateTD.
The set of terminals is given by T = {NODEI, NODEIF, NODEF, NODEG, EDGE,
PLACEHOLD}. The terminal vsymbols NODEI, NODEIF, NODEF, NODEG have one
attaching region as syntactic attribute. They represent the initial, the initial and final,
vsymbol EDGE has two attaching points as syntactic attributes corresponding to the start
and end points of the edge. Finally, PLACEHOLD is a fictitious terminal vsymbol to be
dynamically inserted in the input sentence during the parsing process. It has one attaching
region as syntactic attribute. The tokens are graphically depicted in Fig. 2.2. Here, each
attaching region is represented by a bold line and is identified by the number 1, whereas
the two attaching points of EDGE are represented by bullets and are identified each by a
number.
Figure 2.2: The terminals for the grammar STD.
The set of relations is given by POS = {LINK_{i,j}, any}, where the relation identifier
any denotes a relation that is always satisfied between any pair of vsymbols. Moreover, we
use the notation h k when describing the absence of a connection between two attaching
areas h and k.
Next, we provide the set of productions for describing State Transition Diagrams.
(1) StateTD → Graph
(2) Graph → NODEI
∆: (Graph1 = NODEI1)
(3) Graph → NODEIF
∆: (Graph1 = NODEIF1)
(4) Graph → Graph’ 〈〈1 1〉,〈1 2〉〉 EDGE 2 1 Node
∆: (Graph1 = Graph’1 - EDGE1)
Γ: {(PLACEHOLD; |Node1| >1; PLACEHOLD1 = Node1 - EDGE2)}
(5) Graph → Graph’ 〈〈1 1〉, 〈1 2〉〉 EDGE
∆: (Graph1 = (Graph’1 - EDGE1) - EDGE2)
(6) Graph → Graph’ 〈〈1 2〉, 〈1 1〉〉 EDGE 1 1 Node
∆: (Graph1 = Graph’1 - EDGE2)
Γ: {(PLACEHOLD; |Node1| >1; PLACEHOLD1 = Node1 - EDGE1)}
(7) Graph → Graph’ 〈any〉 PLACEHOLD
∆: (Graph1 = PLACEHOLD1)
(8) Node → NODEG
∆: (Node1 = NODEG1)
(9) Node → NODEF
∆: (Node1 = NODEF1)
(10) Node → PLACEHOLD
∆: (Node1 = PLACEHOLD1)
Notice that Graph1 = Graph’1 - EDGE1 indicates set difference and is to be interpreted
as follows: “the attaching area 1 of Graph has to be connected to whatever is attached to
the attaching area 1 of Graph’ except for the attaching point 1 of EDGE”. Moreover the
notation |Node1| indicates the number of connections to the attaching area 1 of Node.
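The set-difference reading of the ∆ and Γ rules can be made concrete with a small sketch of production 4. It is only an illustration under stated assumptions: an attaching region is modeled as the set of edge endpoints connected to it, each endpoint is a hypothetical pair (edge name, attaching point), and the function name `apply_production_4` and the edge names e1, e2, e3 are invented for the example.

```python
def apply_production_4(graph1, node1, edge):
    """Sketch of the Delta and Gamma rules of production 4.

    graph1, node1: sets of (edge name, attaching point) pairs connected to
    attaching region 1 of Graph' and of Node; `edge` names the EDGE in the handle.
    Returns the new Graph1 and the PLACEHOLD1 region (None if nothing is inserted).
    """
    edge1, edge2 = (edge, 1), (edge, 2)
    if edge1 not in graph1 or edge2 not in node1:
        raise ValueError("EDGE must leave Graph' (point 1) and enter Node (point 2)")
    new_graph1 = graph1 - {edge1}                # Delta: Graph1 = Graph'1 - EDGE1
    if len(node1) > 1:                           # Gamma condition: |Node1| > 1
        return new_graph1, node1 - {edge2}       # PLACEHOLD1 = Node1 - EDGE2
    return new_graph1, None


# Graph' has outgoing edges e1 and e2; Node receives e1 and is the source of e3.
graph1 = {("e1", 1), ("e2", 1)}
node1 = {("e1", 2), ("e3", 1)}
new_graph1, placehold1 = apply_production_4(graph1, node1, "e1")
```

Here the new Graph keeps only the connection to e2, while the inserted PLACEHOLD inherits the remaining connection of Node to e3; when Node has no other connection, no PLACEHOLD is inserted.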
According to these rules, a State Transition Diagram is described by a graph (produc-
tion 1) defined as
• an initial node (production 2) or as
• an initial-final node (production 3) or, recursively, as
• a graph connected to a node through an outgoing (production 4) or incoming (pro-
duction 6) edge, or as
• a graph with a loop edge (production 5).
A node can be either a generic node (production 8) or a final node (production 9). The
need for productions 7 and 10 will be clarified in the following example.
Example 2.4 Suppose we are to model the help module of a software package. Fig. 2.3
shows the state transition diagram describing the behavior of this help module. When
the user enters the software package, the package goes into state 1. At this point a window
pops up, letting the user access the help module by clicking on the Show Tips option, or
refuse help support and directly go to the package functions (state 3) by clicking on the
No Tips option. If the user chooses the first option, the package enters state 2, where a
help page is displayed. In this state the user can decide to exit tips and access other
package functions (state 3), or to show the next help page and to remain in state 2.
Figure 2.3: The state transition diagram for a help module.
[States: 1, 2, 3. Transition labels: Enter Package, Show Tips, No Tips, Exit Tips, Next Tip.]
Figs. 2.4(a-i) show the steps to reduce the state transition diagram of this help module
through the extended positional grammar STD shown above. In particular, dashed
ovals indicate the handles to be reduced, and their labels indicate the productions to be
used. The reduction process starts by applying production 2 to the initial state transition
diagram. This causes the terminal NODEI representing state 1 to be reduced to the
non-terminal Graph. Due to the ∆ rule of production 2, Graph inherits all the connections
of NODEI. Similarly, the application of production 8 replaces the unique NODEG of
Fig. 2.4(a) with the non-terminal Node. Fig. 2.4(b) shows the resulting visual sentential
form, and highlights the handle for the application of production 4. The vsymbols Graph,
EDGE, and Node are then reduced to the new non-terminal Graph. Due to the ∆ rule
of production 4, the new Graph is connected to all the remaining edges attached to the
old Graph. Moreover, due to the Γ rule, since |Node1| = 4 > 1, a new node PLACEHOLD
is inserted in the input, and it is connected to all the remaining edges attached to the
old Node. Fig. 2.4(c) shows the resulting visual sentential form. After the application of
productions 9 and 4 the visual sentential form reduces to the one shown in Fig. 2.4(e).
Then, production 7 reduces the non-terminal Graph and the terminal PLACEHOLD to a new
non-terminal Graph. By applying the ∆ rule of production 7, the new Graph inherits all the
connections to PLACEHOLD (see Fig. 2.4(f)). The subsequent application of productions
10, 5, 4 and 1 reduces the original state transition diagram to the starting vsymbol
in Fig. 2.4(i), confirming that the visual sentence associated with the initial state transition
diagram belongs to the visual language L(STD).
Figure 2.4: The reduction process for a state transition diagram.
Ambiguous grammars
An XPG may have two types of ambiguity. The first one says that an XPG G is structurally
ambiguous if there exists at least one positional sentence having more than one derivation.
This is similar to the definition of ambiguity for textual grammars. The second type of
ambiguity for an XPG G occurs when the application of PE to different positional sen-
tences produces the same visual sentence. In this case G is said to be visually ambiguous.
Obviously, an XPG may present both ambiguities simultaneously.
As an example of visual ambiguity, let us consider the grammar for state transition
diagrams given in Example 2.3. It is easy to verify that the visual sentence in Fig. 2.5 may
be described by the grammar using two different reduction processes. In particular, one
reduction applies production 4 and then production 5, while the other applies production
5 and then production 4.
Figure 2.5: A visually ambiguous sentence for the grammar of example 2.3.
2.1 Modeling UML Statechart Diagrams through XPGs
In this section we show how Extended Positional Grammars can be used to specify UML
Statechart Diagrams [17]. This language is derived from the original proposal by Harel [34],
with modifications in order to include object-oriented features. It models the states of an
object and how an object moves from state to state for its entire lifetime. In particular the
OMG Unified Modeling Language Specification states that “statechart diagrams represent
the behavior of entities capable of dynamic behavior by specifying its response to the receipt
of event instances. Typically, it is used for describing the behavior of class instances,
but statecharts may also describe the behavior of other entities such as use-cases, actors,
subsystems, operations, or methods” [50].
UML statechart diagrams are a very rich graphical specification formalism obtained
as an extension of conventional finite state machines with more powerful concepts such as
hierarchy of states, orthogonality, interlevel transitions, etc. [34]. There are five different
kinds of entities in a statechart diagram namely statevertexes, transitions (arcs), events,
conditions and actions. As described in [50], statevertexes may be states, pseudostates,
stub states or synch states. The states (shown as rectangles with rounded corners) may
be composite (such as Alive in Fig. 2.6), simple (such as Created and Dead) or final.
Composite states may be OR-States or AND-States. OR-States contain a set of other
states (substates) that are related to each other by “exclusive-or”. AND-States contain
at least two unnamed concurrent OR-States separated by dashed lines. Simple states are
those that have no substates (they are at the bottom of the hierarchy). A final state
(shown as a circle surrounding a small solid filled circle) represents the completion of
an activity in the enclosing state.
Figure 2.6: Java Thread Life Cycle.
A state may have associated internal transitions that depict what activities the object
will be doing while it is in that state. The general
format for the internal transitions is: event-signature ’[’ guard-condition ’]’ ’/’ action-expression,
where the event-signature describes an event, that is, an occurrence that may
trigger an action, the guard-condition is a boolean expression, and the action-expression is
executed if and when the event occurs. As an example, in Fig. 2.9 the state Working
has two internal transitions. The first transition, entry/i++, specifies that the counter i
is incremented upon entry to the state Working; the second, exit/i--, specifies that the
counter i is decremented upon exit from the state Working. The pseudostate kinds are:
initial, deepHistory, shallowHistory, join, fork, junction and choice. Each OR-State (as
well as the statechart diagram itself) contains exactly one initial pseudostate, which is
shown as a small solid filled circle. Transitions connect states and may be labeled by a
transition string that has the same format of internal transitions of the states. As an
example, in Fig. 2.6 the transition from the state Running to the state Waiting is labelled
with the transition string formed by the event wait and the action release lock.
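The transition string format described above can be illustrated with a small parser. This is a minimal sketch under stated assumptions, not part of the XPG formalism: the names `Transition` and `parse_transition` are hypothetical, the guard and the action are both treated as optional (as in the labels of Fig. 2.6 and Fig. 2.9), and the event signature is assumed to contain neither ’[’ nor ’/’.

```python
import re
from typing import NamedTuple, Optional


class Transition(NamedTuple):
    event: str
    guard: Optional[str]
    action: Optional[str]


# event-signature '[' guard-condition ']' '/' action-expression
_PATTERN = re.compile(
    r"^\s*(?P<event>[^\[/]*?)\s*"          # event signature (no '[' or '/')
    r"(?:\[(?P<guard>[^\]]*)\]\s*)?"       # optional guard in square brackets
    r"(?:/\s*(?P<action>.*?))?\s*$"        # optional action after '/'
)


def parse_transition(text: str) -> Transition:
    """Split a transition string into event, guard and action parts."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a transition string: {text!r}")
    guard = match.group("guard")
    return Transition(
        match.group("event"),
        guard.strip() if guard is not None else None,
        match.group("action"),
    )


entry = parse_transition("entry/i++")
wait = parse_transition("wait / release lock")
scheduled = parse_transition("get scheduled [lock available if synchronized]")
```

For the label on the transition from Running to Waiting, the sketch yields the event wait and the action release lock, matching the reading given in the text.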
To simplify the statechart diagram language description, the deepHistory indicator, the
stub states and the synch states will not be considered in the following. Moreover, we
consider statechart diagrams where the initial pseudostates are connected only to states.
Thus, statechart diagrams are a hybrid visual notation since they use links between
nodes to model transitions and spatial relations such as “containment” to model the
hierarchy of states.
It is widely recognized that the main difficulties for modeling statecharts derive from
the presence of interlevel transitions (like the transition labeled with the event run returns
in Fig. 2.6), multiple source/multiple target transitions, join/fork connectors, and history
connectors. In order to capture the interlevel transitions the extended positional grammar
proposed in this section models statechart diagrams in two different phases. The first
phase determines the hierarchy of states and the second phase analyzes the transitions
amongst states. Thus after the first phase, the statechart diagram is transformed into a
graph formed by the states and the transitions of the statechart. As an example, Fig. 2.7
shows the graph corresponding to the statechart in Fig. 2.6.
Figure 2.7: The graph describing the transitions in the statechart diagram of Fig. 2.6.
Now, we focus our attention on the first phase. In order to describe how a state
is contained in a superstate we introduce a spatial relation contains. To this aim we
associate to the states a containment area as syntactic attribute. Such containment area
corresponds to the rectangle area representing the state. Thus the containment relation
is defined as:
A 〈contains〉 B if and only if the containment area of A is the closest containment area
that contains B, where A and B are states of a statechart diagram.
As an example from Fig. 2.6 Alive 〈contains〉 Joined holds because there are no states
that contain Joined and are contained in Alive. Moreover the states of a statechart diagram
can be related through a sibling relationship whenever either the states are contained in
the same superstate, or they are not contained in superstates. Such relationship can be
described by a spatial relation, named sibling, defined as:
A 〈sibling〉 B if and only if
1. there exists C such that C 〈contains〉 A and C 〈contains〉 B, or
2. there does not exist a C such that C 〈contains〉 A or C 〈contains〉 B
where A, B and C are states of a statechart diagram. As an example, Running 〈sibling〉
Waiting holds because Alive 〈contains〉 Running and Alive 〈contains〉 Waiting; moreover,
Alive 〈sibling〉 Dead holds because condition 2 is satisfied.
The extended positional grammar SD = ((N, T∪POS, S, P), PE) for UML statechart diagrams
has the following characteristics. The set of non-terminals is given by N = {StateDiagram,
Hierarchy, HSeqState, SeqState, IState, Initial, Diagram, State, Connector, Component, Comp,
MultiGraph, Graph, Sync, Node} where the first eleven vsymbols have one containment area as
syntactic attribute, and the others have one attaching region as syntactic attribute.
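The contains and sibling relations defined earlier in this section can be sketched by modeling each containment area as an axis-aligned rectangle. This is only an illustration under stated assumptions: the coordinates below are invented and merely mimic the layout of some states of Fig. 2.6, and the helper names `encloses`, `parent`, `contains` and `sibling` are hypothetical.

```python
# Hypothetical containment areas as (x0, y0, x1, y1) rectangles.
STATES = {
    "Created": (0, 0, 10, 5),
    "Alive": (20, 0, 80, 40),
    "Running": (25, 5, 45, 15),
    "Waiting": (50, 5, 70, 15),
    "Joined": (50, 20, 70, 30),
    "Dead": (90, 0, 100, 5),
}


def encloses(outer, inner):
    """True if rectangle `outer` contains rectangle `inner` and differs from it."""
    ox0, oy0, ox1, oy1 = outer
    ix0, iy0, ix1, iy1 = inner
    return outer != inner and ox0 <= ix0 and oy0 <= iy0 and ix1 <= ox1 and iy1 <= oy1


def area(rect):
    return (rect[2] - rect[0]) * (rect[3] - rect[1])


def parent(states, name):
    """Closest (smallest) enclosing state of `name`, or None at the top level."""
    candidates = [n for n, r in states.items() if n != name and encloses(r, states[name])]
    return min(candidates, key=lambda n: area(states[n]), default=None)


def contains(states, a, b):
    """A <contains> B: the area of A is the closest area containing B."""
    return parent(states, b) == a


def sibling(states, a, b):
    """A <sibling> B: same closest container, or neither state is contained."""
    return parent(states, a) == parent(states, b)
```

With these coordinates, Alive 〈contains〉 Joined and Running 〈sibling〉 Waiting hold through the common container Alive, Alive 〈sibling〉 Dead holds because neither state has a container, and Running 〈sibling〉 Dead does not hold.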
The set of terminals is given by T = {INITIAL, STATE, CONCURRENT, FINAL,
HISTORY, JUNCTION, CHOICE, FORK_JOINT, EDGE, NEWSTATE, NEW_FJ,
PLACEHOLD}; the terminals are graphically depicted in Fig. 2.8.
Figure 2.8: The terminals for the grammar SD.
Here, each attaching region is represented by a bold line and identified by a number,
each containment area is represented by a filled dotted area, while the attaching points
are represented by bullets. The terminal vsymbol INITIAL has one attaching region as
syntactic attribute and represents the initial pseudostate. The terminal vsymbol STATE
has one attaching region and one containment area as syntactic attributes and represents
the state of a statechart diagram. The terminal vsymbol CONCURRENT has one containment
area as syntactic attribute, and it represents the concurrent substate contained
in an AND-State. The terminal vsymbol FINAL has one attaching region as syntactic
attribute and represents a final state. The terminal vsymbols HISTORY, JUNCTION and
CHOICE have one attaching region as syntactic attribute and they represent the history
state indicator, the junction point (which is used to merge and split transitions), and the
dynamic choice point, respectively. The terminal vsymbol FORK_JOINT has two attaching
regions as syntactic attributes corresponding to the incoming transitions and outgoing
transitions, and it represents synchronization, forking, or both. The terminal vsymbol
EDGE has two attaching points as syntactic attributes corresponding to the start and end
points of the transition. Finally, NEWSTATE, NEW_FJ and PLACEHOLD are fictitious
terminal vsymbols to be dynamically inserted in the input sentence during the parsing
process.
The set of relations is given by POS = {LINK_{i,j}, sibling, contains, any}, where
LINK_{i,j} is as defined in Section 2.3 and will be denoted as h k to simplify the notation.
Next, we provide the set of productions of the first phase to determine the hierarchy of
states in the statechart diagrams.
[Relation identifiers and operators that are not recoverable from the source are shown as 〈…〉 and […].]
(1) StateDiagram → Hierarchy 〈…〉 MultiGraph
(2) StateDiagram → Hierarchy
(3) Hierarchy → IState 〈…〉 HSeqState
∆: (Hierarchy_area = IState_area);
(4) Hierarchy → IState
∆: (Hierarchy_area = IState_area);
(5) HSeqState → SeqState 〈…〉 HISTORY
∆: (HSeqState_area = SeqState_area);
(6) HSeqState → SeqState
∆: (HSeqState_area = SeqState_area);
(7) SeqState → Diagram 〈…〉 SeqState
∆: (SeqState_area = Diagram_area);
(8) SeqState → Diagram
∆: (SeqState_area = Diagram_area);
(9) IState → Initial 〈…〉 Hierarchy
∆: (IState_area = Initial_area)
(10) IState → Initial
∆: (IState_area = Initial_area)
(11) IState → Initial 〈…〉 Component
∆: (IState_area = Initial_area)
(12) Initial → INITIAL 〈…〉 EDGE 〈…〉 STATE
∆: (Initial_area = STATE_area);
Γ: {(NEWSTATE; |STATE1| > 1; NEWSTATE1 = STATE1 - EDGE2)}
(13) Diagram → State
∆: (Diagram_area = State_area);
(14) Diagram → Connector
∆: (Diagram_area = Connector_area);
(15) State → STATE 〈…〉 Hierarchy
∆: (State_area = STATE_area);
Γ: {(NEWSTATE; |STATE1| […]; NEWSTATE1 = STATE1)}
(16) State → STATE
∆: (State_area = STATE_area);
Γ: {(NEWSTATE; |STATE1| […]; NEWSTATE1 = STATE1)}
(17) State → STATE 〈…〉 Component
∆: (State_area = STATE_area);
Γ: {(NEWSTATE; |STATE1| […]; NEWSTATE1 = STATE1)}
(18) State → FINAL
∆: (State_area = […]);
Γ: {(NEWSTATE; |FINAL1| […]; NEWSTATE1 = FINAL1)}
(19) Connector → JUNCTION
∆: (Connector_area = […]);
Γ: {(NEWSTATE; |JUNCTION1| […]; NEWSTATE1 = State1)}
(20) Connector → CHOICE
∆: (Connector_area = […]);
Γ: {(NEWSTATE; |CHOICE1| […]; NEWSTATE1 = State1)}
(21) Connector → FORK_JOINT
∆: (Connector_area = […]);
Γ: {(NEW_FJ; |FORK_JOINT1| […]; NEW_FJ1 = FORK_JOINT1,
NEW_FJ2 = FORK_JOINT2)}
(22) Component → Comp 〈…〉 Component
∆: (Component_area = Comp_area);
(23) Component → Comp 〈…〉 Comp’
∆: (Component_area = Comp_area);
(24) Comp → CONCURRENT 〈…〉 Hierarchy
∆: (Comp_area = CONCURRENT_area);
Productions 1 and 2 describe a statechart diagram as a hierarchy of states (first phase of
the reduction process) represented by the non-terminal Hierarchy and an optional set of
transitions between states (second phase) represented by the non-terminal MultiGraph. A
Hierarchy is formed by an initial state (IState) and an optional sequence of states and/or
connectors (HSeqState) as described in productions 3 and 4. An initial state is obtained
by linking an initial pseudostate to a state (production 12), and it can be:
• an initial simple state (production 10), or
• an initial OR-State (production 9) which is formed by the non-terminal Initial with
a Hierarchy in its containment area, or
• an initial AND-State (production 11) which is formed by the non-terminal Initial
with a non-terminal Component in its containment area.
A Component is formed by at least two non-terminals Comp, where each one is a termi-
nal CONCURRENT with a Hierarchy in its containment area (productions 22-24). The
productions 5 and 6 define a HSeqState as formed by the non-terminal SeqState and an
optional history indicator. SeqState is a sequence of states and/or connectors related
through the sibling relation (productions 7-8). A state can be: a simple state (production
16), or an OR-State (production 15), or an AND-State (production 17), or a final state
(production 18). A connector can be a junction connector (production 19), a dynamic
choice connector (production 20), or a fork/join connector (production 21).
During the first phase the terminals with incident links are reintroduced as described
in the Γ rules and reanalyzed in the second phase. Fig. 2.9 shows the statechart diagram
describing the behavior of a job processing environment. In the following we show the
steps to reduce such a statechart diagram through the extended positional grammar SD.
Figure 2.9: A statechart diagram that models the behavior of a job processing environment.
Figs. 2.10(a-i) show the first phase of the reduction process. In particular, the
process starts by applying production 12 to the initial state Working. This causes the
reduction of vsymbols INITIAL, EDGE and STATE to the non-terminal Initial. Due to
the Γ rule of production 12, since |STATE1| = 3, a new state NEWSTATE is inserted in
the input, and inherits all the connections of STATE except for the connection to EDGE.
Similarly, the application of production 16 reduces the vsymbol STATE representing the
state Waiting to the non-terminal State and the Γ rule inserts in the input sentence
a new state NEWSTATE. Fig. 2.10(b) shows the resulting visual sentential form, and
highlights the handles for the application of productions 12, 16 and 18 respectively to the
initial states, to the state Sending and to the final states contained in the AND-State,
and for the applications of production 10 to the non-terminal Initial and of production 13
to the non-terminal State. The resulting visual sentential form is shown in Fig. 2.10(c).
After the application of several productions the reduction process reduces the AND-State
to the non-terminal Diagram that will be reduced with the non-terminals SeqState to the
non-terminal HSeqState by applying productions 7 and 6. Fig. 2.10(g) shows the resulting
visual sentential form with the handle for the application of production 3. Fig. 2.10(i)
shows the visual sentential form resulting from the first phase of the reduction process of
the statechart diagram in Fig. 2.9. It is composed of the non-terminal Hierarchy and a
non-connected graph obtained by reintroducing new states during the reduction. In general a
node of the non-connected graph can be either the terminal NEWSTATE or the terminal
NEW_FJ.
Now we provide the set of productions of the second phase, which analyze the transitions
among such states. These can be easily obtained by modifying the productions of
the grammar for state transition diagrams introduced in example 2.3. In particular,
productions 25 and 26 describe a non-connected graph as formed by one or
more graphs, where each one is a state transition diagram with no initial node and no
final node (productions 27-31). Finally, we add the productions for the fork/join connector
(productions 32-37).
(25) MultiGraph → Graph ⟨…⟩ MultiGraph
∆: (MultiGraph1 = Graph1);
(26) MultiGraph → Graph
∆: (MultiGraph1 = Graph1);
(27) Graph → NEWSTATE
∆: (Graph1 = NEWSTATE1);
(28) Graph → Graph' ⟨⟨…⟩,⟨…⟩⟩ EDGE … Node
∆: (Graph1 = Graph'1 - EDGE1);
Γ: {(PLACEHOLD; |Node1|>1; PLACEHOLD1 = Node1 - EDGE2)}
(29) Graph → Graph' ⟨⟨…⟩,⟨…⟩⟩ EDGE
∆: (Graph1 = (Graph'1 - EDGE1) - EDGE2);
(30) Graph → Graph' ⟨⟨…⟩,⟨…⟩⟩ EDGE … Node
∆: (Graph1 = Graph'1 - EDGE2);
Γ: {(PLACEHOLD; |Node1|>1; PLACEHOLD1 = Node1 - EDGE1)}
(31) Graph → Graph' ⟨…⟩ PLACEHOLD
∆: (Graph1 = PLACEHOLD1);
(32) Graph → Graph' ⟨⟨…⟩,⟨…⟩⟩ EDGE … Sync
∆: (Graph1 = Graph'1 - EDGE1);
(33) Graph → Graph' ⟨⟨…⟩,⟨…⟩⟩ EDGE … Sync
∆: (Graph1 = Graph'1 - EDGE2);
(34) Sync → NEW_FJ ⟨⟨…⟩,⟨…⟩⟩ EDGE … Node
∆: (Sync1 = NEW_FJ1, Sync2 = NEW_FJ2 - EDGE1);
Γ: {(PLACEHOLD; |Node1|>1; PLACEHOLD1 = Node1 - EDGE2)}
(35) Sync → NEW_FJ ⟨⟨…⟩,⟨…⟩⟩ EDGE … Node
∆: (Sync1 = NEW_FJ1 - EDGE2, Sync2 = NEW_FJ2);
Γ: {(PLACEHOLD; |Node1|>1; PLACEHOLD1 = Node1 - EDGE1)}
(36) Sync → Sync' ⟨⟨…⟩,⟨…⟩⟩ EDGE … Node
∆: (Sync1 = Sync'1, Sync2 = Sync'2 - EDGE1);
Γ: {(PLACEHOLD; |Node1|>1; PLACEHOLD1 = Node1 - EDGE2)}
(37) Sync → Sync' ⟨⟨…⟩,⟨…⟩⟩ EDGE … Node
∆: (Sync1 = Sync'1 - EDGE2, Sync2 = Sync'2);
Γ: {(PLACEHOLD; |Node1|>1; PLACEHOLD1 = Node1 - EDGE1)}
(38) Node → NEWSTATE
∆: (Node1 = NODEG1);
(39) Node → PLACEHOLD
∆: (Node1 = PLACEHOLD1);
Figs. 2.11(a-h) show the steps to reduce the visual sentential form of Fig. 2.10(i)
through the productions shown above. The reduction process starts by applying production
27 to one node of each graph. This causes the terminals NEWSTATE to be reduced
to the non-terminal Graph. Due to the ∆ rule of production 27, Graph receives all the
connections of NEWSTATE. Similarly, the application of production 38 reduces the remaining
NEWSTATE of Fig. 2.11(a) to the non-terminal Node. Fig. 2.11(b) shows the resulting
visual sentential form, and highlights the handle for the application of productions 28 and
30. The vsymbols Graph, EDGE, and Node are then reduced to the new non-terminal
Graph. Due to the ∆ rules of productions 28 and 30, the new Graph is connected to all
the remaining edges attached to the old Graph. Moreover, due to the Γ rule, since
|Node1| = 2 > 1 for the graph on the left, a new node PLACEHOLD is inserted in the input, and
it is connected to all the remaining edges attached to the old Node. Fig. 2.11(c) shows the
resulting visual sentential form. After the application of productions 28 and 26 the visual
sentential form reduces to the one shown in Fig. 2.11(d). Then, production 31 reduces the
non-terminal Graph and the terminal PLACEHOLD to a new non-terminal Graph. By applying the
∆ rule of production 31, the new Graph receives all the connections to PLACEHOLD (see
Fig. 2.11(e)). Moreover, production 25 reduces the non-terminals MultiGraph and Graph
to a new non-terminal MultiGraph and production 39 reduces PLACEHOLD to a new
non-terminal Node. The subsequent application of productions 30, 25 and 1 reduces the
original statechart diagram to the starting vsymbol in Fig. 2.11(h), confirming that the
visual sentence associated with the initial statechart diagram belongs to the visual language
L(SD).

[Figure 2.10: The first phase of the reduction process for the statechart diagram in Fig.
2.9. Panels (a)-(g) and the final panel (i) show the successive visual sentential forms,
annotated with the applied productions (12, 16, 18, 13, 10, 8, 7, 6, 3 and 24).]
[Figure 2.11: The second phase of the reduction process of the statechart diagram in Fig.
2.9. Panels (a)-(h) show the successive visual sentential forms, annotated with the applied
productions (27, 38, 28, 30, 26, 25, 31, 39 and 1), ending with the starting vsymbol
StateDiagram in panel (h).]

It is worth noting that statechart diagrams that are not well-formed are also in L(SD). The language
L(SD) can be restricted to well-formed statechart diagrams by modifying and adding new
productions to the grammar SD. As an example, in a well-formed statechart diagram final
states cannot have any outgoing transitions. This property is captured by modifying the
Γ rule in production 18 so that a new state, named NEWFS, is reintroduced in the input
sentence. In particular the production becomes:
(18’) State → FINAL
∆: (Statearea = Ø);
Γ: {(NEWFS; |FINAL1|>0; NEWFS1 = FINAL1)}
Consequently, new productions must be introduced for the second phase of the reduction
process to manage the vsymbol NEWFS. The productions consider only incoming
transitions for such vsymbol, as shown in the following.
(40) Graph → Graph' ⟨⟨1_1⟩,⟨1_2⟩⟩ EDGE ⟨2_1⟩ NEWFS
∆: (Graph1 = Graph'1 - EDGE1);
Γ: {(NEWFS’; |NEWFS1|>1; NEWFS’1 = NEWFS1 - EDGE2)}
(41) Sync → NEW_FJ ⟨⟨2_1⟩,⟨2_2⟩⟩ EDGE ⟨2_1⟩ NEWFS
∆: (Sync1 = NEW_FJ1, Sync2 = NEW_FJ2 - EDGE1);
Γ: {(NEWFS’; |NEWFS1|>1; NEWFS’1 = NEWFS1 - EDGE2)}
(42) Sync → Sync’ ⟨⟨2_1⟩,⟨2_2⟩⟩ EDGE ⟨2_1⟩ NEWFS
∆: (Sync1 = Sync’1, Sync2 = Sync’2 - EDGE1);
Γ: {(NEWFS’; |NEWFS1|>1; NEWFS’1 = NEWFS1 - EDGE2)}
Now we show how to modify the grammar in order to analyze the text of the transitions
associated with the edges and the states. In particular, the transition strings can be
modelled by using textual annotations, so the terminal vsymbols STATE and EDGE also have
one textual annotation as a syntactic attribute, and the set of relations POS is extended by
adding the relations annotated-by and right-to. The former denotes a relation that is
satisfied between a vsymbol and a sentence, whereas the latter is defined as in example
2.2. The productions must take into account the new annotation relation, so the terminal
vsymbol EDGE must be substituted by the non-terminal Edge, which has two attaching
points as syntactic attributes. As an example, production 12 becomes:
(12’) Initial → INITIAL ⟨1_1⟩ Edge ⟨2_1⟩ STATE
∆: (Initialarea = STATEarea);
Γ: {(NEWSTATE; |STATE1|>1; NEWSTATE1 = STATE1-Edge2)}
In the following we provide the subset of productions for describing the string transitions
annotating the terminal vsymbol EDGE.
(43) Edge → EDGE <annotated-by> Label
∆: (Edge1 = EDGE1; Edge2 = EDGE2);
(44) Edge → EDGE
∆: (Edge1 = EDGE1; Edge2 = EDGE2);
(45) Label → Event <right-to> Condition <right-to> Action
∆: (Labeltext = Eventtext + Conditiontext + Actiontext);
(46) Event → EVENT
∆: (Eventtext = EVENTtext, Eventhead = EVENThead, Eventtail = EVENTtail);
(47) Event → EVENT <right-to> (<right-to> PARAM <right-to>)
∆: (Eventtext = EVENTtext + ‘(‘ + PARAMtext + ‘)’ , Eventhead = EVENThead, Eventtail = PARAMtail+1);
(48) Condition → [ <right-to> COND <right-to>]
∆: (Conditiontext = ‘[‘ + CONDtext + ‘]’, Conditionhead = CONDhead-1, Conditiontail = CONDtail+1);
(49) Action → / <right-to> ACT
∆: (Actiontext = ‘/’+ ACTtext, Actionhead = ACThead-1, Actiontail = ACTtail);
Productions 43 and 44 describe a transition as formed by the terminal vsymbol EDGE
and an optional annotating transition string (Label). The general format for the transition
strings is event-signature ’[’ guard-condition ’]’ ’/’ action-expression, where event-signature
is event-name ’(’ comma-separated-parameter-list ’)’. Thus productions 45-49 describe a
transition string as the string concatenation of event, condition and action. The
non-terminal Label has a string as syntactic attribute, referred to as text; the non-terminals
Event, Condition and Action, together with the terminals, have three syntactic attributes:
a string referred to as text, and two positions in the plane, named head and tail. Notice
that Labeltext = Eventtext + Conditiontext + Actiontext indicates string concatenation and is
to be interpreted as follows: “the text of Label is obtained from the concatenation of the
texts of Event, Condition and Action”.
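The ∆ rules of productions 45-49 can be illustrated with a minimal Python sketch. The class Text and the function names are hypothetical, and head and tail are simplified to integer offsets along one axis, whereas the grammar treats them as positions in the plane.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Text:
    """A textual vsymbol: its string plus head and tail offsets (assumed 1-D)."""
    text: str
    head: int
    tail: int


def event(name: Text, param: Optional[Text] = None) -> Text:
    # Production 46 (no parameter) and production 47 (with parameter).
    if param is None:
        return Text(name.text, name.head, name.tail)
    return Text(f"{name.text}({param.text})", name.head, param.tail + 1)


def condition(cond: Text) -> Text:
    # Production 48: the brackets widen the span by one on each side.
    return Text(f"[{cond.text}]", cond.head - 1, cond.tail + 1)


def action(act: Text) -> Text:
    # Production 49: the slash extends the span by one on the left.
    return Text(f"/{act.text}", act.head - 1, act.tail)


def label_text(ev: Text, cond: Text, act: Text) -> str:
    # Production 45: Labeltext = Eventtext + Conditiontext + Actiontext.
    return ev.text + cond.text + act.text
```

Each function mirrors one ∆ rule, so the synthesized text of a Label is built bottom-up from the terminals, in the same order in which the productions would be reduced.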
In order to describe the internal transitions associated to the states, the terminal vsymbol
STATE must be substituted by the non-terminal AnnState which has one containment
area as syntactic attribute. As an example the production 27 becomes:
(27’) State → AnnState ⟨contains⟩ Hierarchy
∆: (Statearea = AnnStatearea);
Γ: {(NEWSTATE; | AnnState1|>0; NEWSTATE1 = AnnState1)}
The productions describing internal transitions annotated to the terminal vsymbol STATE
are shown in the following:
(50) AnnState → STATE <annotated-by> SeqLabel
∆: (AnnStatearea = STATEarea);
Γ: {(NEWSTATE; |STATE1|>0; NEWSTATE1 = STATE1)}
(51) AnnState → STATE
∆: (AnnStatearea = STATEarea);
Γ: {(NEWSTATE; |STATE1|>0; NEWSTATE1 = STATE1)}
(52) SeqLabel → Label <right-to> SeqLabel’
∆: (SeqLabeltext = Labeltext + SeqLabel’text, SeqLabelhead = Labelhead, SeqLabeltail = SeqLabel’tail);
(53) SeqLabel → Label
∆: (SeqLabeltext = Labeltext, SeqLabelhead = Labelhead, SeqLabeltail = Labeltail);
The non-terminal SeqLabel has three syntactic attributes: a string referred to as text, and
two positions in the plane, named head and tail.
Chapter 3
The XpLR Methodology
The XpLR methodology is an extension of the pLR methodology as presented in [15]. It
is a framework for implementing visual systems based upon XPGs and LR parsing. As
in pLR parsing, an XpLR parser scans the input in a non-sequential way, driven by the
relations used in the grammar. In particular, the XpLR methodology differs from the pLR
one in that it handles the Γ rules, and provides algorithms to eliminate conflicts arising
during the construction of a pLR parsing table.
3.1 The XpLR Parser
The components of an XpLR parser are shown in Fig. 3.1 and are detailed in the following.
3.1.1 The input
The input to the parser is a dictionary, named Dp, storing the attribute-based represen-
tation of a picture as produced by the visual editor. No parsing order is defined on the
graphical objects in the dictionary. The parser retrieves the objects in the dictionary
through a find operation, driven by the relations in the grammar. The parser implicitly
builds and parses a linear representation from the input attribute-based representation.
If the input picture contains explicit relations, i.e., the relations have a graphical
representation, its attribute-based representation is augmented with an array COUNTER
containing an entry for each explicit relation. The entry COUNTER(r) for an explicit
relation labeled r with degree n contains the value n-1. This value indicates the number
of binary relations describing r in any relative representation of the picture. During the
parsing phase, all the visited tokens, and the traversed explicit binary relations, are marked
in order to guarantee that each object and each explicit relation be considered at most
once. The marking of an explicit binary relation REL labeled r is done by decreasing the
entry COUNTER(r) by 1.
The 0-entry of the dictionary always refers to the end-of-input symbol EOI. Similarly
to the usual end-of-string marker, the end-of-input symbol EOI is returned to the parser
if and only if the input has been completely visited, i.e., all the input tokens have been
parsed, and all the explicit relations have been traversed. These conditions are signaled
by having all the tokens marked and COUNTER(r) = 0 for each explicit relation r,
respectively.
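The bookkeeping described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the class InputDictionary and its method names are hypothetical, tokens are plain names, and each explicit relation is identified by its label and degree.

```python
class InputDictionary:
    """Sketch of the dictionary Dp with visit marks and the COUNTER array."""

    def __init__(self, tokens, explicit_relations):
        # Entry 0 always refers to the end-of-input symbol EOI.
        self.entries = ["EOI"] + list(tokens)
        self.visited = [False] * len(self.entries)
        # COUNTER(r) = n - 1 for an explicit relation labeled r with degree n.
        self.counter = {r: degree - 1 for r, degree in explicit_relations.items()}

    def mark_token(self, index):
        self.visited[index] = True

    def mark_relation(self, r):
        # Traversing one explicit binary relation labeled r.
        self.counter[r] -= 1

    def end_of_input(self):
        """Return row index 0 (EOI) only if the whole input has been visited."""
        all_tokens = all(self.visited[1:])
        all_relations = all(c == 0 for c in self.counter.values())
        return 0 if all_tokens and all_relations else None
```

The EOI symbol is therefore not an ordinary token: it becomes available only when both conditions hold at the same time.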
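[Figure 3.1: The architecture of an XpLR parser. Components: the input, the stack
(s0 X1 s1 ..... Xm sm), the XpLR parsing program (driver program), the XpLR parsing table
(action, goto, next) and the output. The parsing program sends a next vsymbol request to
the input and receives a vsymbol.]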
3.1.2 The stack
An instance of the stack has the general format s0X1s1X2s2 . . . Xmsm, where sm is the
stack top; Xi is a grammar vsymbol, and si is a generic state of the parsing table. The
parsing algorithm uses the state on the top of the stack, and the vsymbol currently under
examination, to access a specific entry of the parsing table in order to decide the next
action to execute.
3.1.3 The XpLR parsing table
An XpLR parsing table (see Fig. 3.2) is composed of a set of rows and is divided into three
main sections: action, goto, and next. The action and goto sections are similar to the
ones used in LR parsing tables for string languages [1], whereas the next section is used
by the parser to select the next vsymbol to be processed. An entry next[k] for a state sk
contains a pair (Rdriver, x ), which drives the parser in selecting the next vsymbol y that
is reachable from x, by using the sequence of driver relations Rdriver. Two special pairs
in the column next are (start, S) and (end, EOI), where S is the starting vsymbol and
EOI is the end-of-input marker. The first is used at the beginning of the parsing process
to retrieve the first vsymbol to be parsed. This vsymbol depends on the nature of the
language. The latter is used to check whether the whole input sentence has been parsed.
If all the vsymbols have been analyzed and all the explicit relations have been considered,
then the query returns the EOI marker.
St.  Action                                       Goto        Next
     a      b         d      e      f      EOI    A     B
0    :sh2                                         :1          (start, A)
1                                          acc                (end, EOI)
2                            :sh6                       :3    (1_1, B)
3           2_2:sh4                                           (1¹_1, b)
4                     :sh5                                    (1¹_1, d)
5    r1     r1        r1     r1     r1     r1                 -
6                                   :sh7                      (2_2, f)
7    r2     r2        r2     r2     r2     r2                 -
Figure 3.2: An XpLR(0) parsing table.
An action entry has one of the following four values:
1. “Rtester: shift s” where Rtester is a possibly empty sequence of tester relations and
s is a state. As an example, the entry (3, b) in Fig. 3.2 contains 〈2_2: sh4〉;
2. reduce by a grammar production (i) A→ β, shown in the table as ri;
3. accept;
4. error shown as an empty entry.
A goto entry contains “Rtester: s”, where Rtester is a possibly empty sequence of tester
relations and s is a state.
A shift or goto action is executed only if all the relations in the corresponding Rtester
are true, or if Rtester is empty. As an example, let us consider the XpLR(0) parsing table
in Fig. 3.2. If the current state corresponds to row 3, and the vsymbol currently scanned is
b, then the parser executes the action (2_2: sh4), that is, if the relation 2_2 holds between
b and the vsymbol on the stack top, then the parser shifts b and goes to state 4. Once in
state 4, the parser launches a query on the input, based on the entry next[4] = (1¹_1, d).
The query will search the input for a terminal vsymbol d, such that d is in relation 1_1
with the first vsymbol below the stack top.
3.1.4 The XpLR parser
In order to illustrate the XpLR parsing program we define the two functions Fetch_Vsymbol
and Test. The former uses the stack and the input as global data structures and takes its
arguments from the column NEXT of the parsing table. The latter is used to validate the
tester relations between vsymbols. It takes as input an action condition from the action
or goto part of the parsing table and returns a boolean value.
Function Fetch_Vsymbol(NEXT)
begin
  case NEXT of
    NEXT = (start, S):
      return the row index in Dp of the first object to parse
    NEXT = (end, EOI):
      if all the objects have been marked as visited and
         COUNTER(r) = 0 for each explicit relation r
      then return the row index 0 in Dp pointing to the end-of-input symbol EOI
      else return null;
    NEXT = (Rdriver, x), where Rdriver = 〈REL1^h1, ..., RELn^hn〉 and each
           RELi^hi acts on a syntactic attribute ki of x:
      let zi be the hi-th object below the stack top
      for i = 1 to n
        let next_seti = {b | b is in Dp, it is not marked as visited, it
          has an attribute j such that (b, j) is reachable from (x, ki),
          zi RELi b holds, and the relation RELi acts on a syntactic attribute
          of zi and the syntactic attribute j of b, respectively}
      if ∩i=1...n next_seti contains exactly one object b
      then for each explicit relation RELi in Rdriver do
             decrease by 1 the entry in the array COUNTER corresponding to
             the explicit relation zi RELi b
           mark the corresponding entry in Dp as visited
           return the row index of b in Dp
      else if ∩i=1...n next_seti contains more than one object b
        emit “run-time conflict” and exit
      else return null;
    NEXT = null:
      return null;
  endcase
end
Let us describe how the function works on the table in Fig. 3.2. In particular, the
relations Rdriver in the NEXT column are as follows:
• the special relation start: in this case Fetch_Vsymbol returns the index in Dp of an
instance of the vsymbol a in the visual sentence (in this simple example a plays the
role of the starting symbol);
• the special relation end: in this case Fetch_Vsymbol returns the index in Dp of the
EOI vsymbol only if all the vsymbols and all the explicit relations of the visual
sentence have been visited;
• a relation h_k: this relation must hold between the vsymbol z on the stack top and
exactly one non-visited vsymbol b in Dp. In particular, when NEXT = (h_k, x):
1. if x is a terminal vsymbol, Fetch_Vsymbol returns the index in Dp of a
non-visited vsymbol whose name is x and whose k-th syntactic attribute is linked
to the h-th syntactic attribute of z.
2. if x is a non-terminal vsymbol, Fetch_Vsymbol returns the index in Dp of a
non-visited terminal vsymbol b whose j-th syntactic attribute is linked to the
h-th syntactic attribute of z. The couples (x, k) and (b, j) are such that b is a
terminal that begins a positional sentence derived from x and the k-th syntactic
attribute of x is synthesized from j by successively applying the ∆ rules in the
derivation.
If no object is found then Fetch_Vsymbol returns null. On the other hand, if more
than one vsymbol is found, then the parser cannot proceed because it cannot decide
deterministically which token to analyze. As a consequence, the function issues a run-time
conflict message and stops the execution of the parser. The occurrence of this type of
conflict, named run-time conflict, might prevent the recognition of syntactically correct
input visual sentences. In section 3.4 we analyze the run-time conflicts, and give some
heuristics to solve this problem.
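The three outcomes of the function (one object, no object, run-time conflict) can be sketched in Python. This is a simplified illustration, not the thesis implementation: it handles only the case where x is a terminal, treats every relation as implicit (so the COUNTER array is left out), and relies on a hypothetical predicate holds(rel, z, b) supplied by the caller. Marking the starting vsymbol as visited on retrieval is also an assumption.

```python
class RunTimeConflict(Exception):
    """Raised when more than one candidate vsymbol is found."""


def fetch_vsymbol(next_entry, dp, stack, holds, start_index=1):
    """Return the index in dp of the next vsymbol, or None.

    dp         -- list of {"name": str, "visited": bool}; entry 0 is EOI
    stack      -- dp indices of the vsymbols on the stack (top is last)
    holds      -- hypothetical predicate holds(rel, z, b) on dp indices
    next_entry -- None, ("start", S), ("end", "EOI"), or
                  ([(rel, h), ...], x) with x a terminal name
    """
    if next_entry is None:
        return None
    drivers, x = next_entry
    if drivers == "start":
        dp[start_index]["visited"] = True  # assumption: marked on retrieval
        return start_index
    if drivers == "end":
        done = all(entry["visited"] for entry in dp[1:])
        return 0 if done else None
    # One candidate set per driver relation; z is the h-th object below the top.
    candidate_sets = []
    for rel, h in drivers:
        z = stack[-1 - h]
        candidate_sets.append({
            b for b, entry in enumerate(dp)
            if b != 0 and entry["name"] == x and not entry["visited"]
            and holds(rel, z, b)
        })
    candidates = set.intersection(*candidate_sets)
    if len(candidates) > 1:
        raise RunTimeConflict(f"{len(candidates)} candidates for {x}")
    if not candidates:
        return None
    b = candidates.pop()
    dp[b]["visited"] = True
    return b
```

Raising an exception on a conflict corresponds to the parser stopping its execution; a heuristic for resolving the conflict would replace that branch.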
The function Test shown below verifies that the grammar vsymbol to be pushed on
the stack top is properly related to a grammar vsymbol already in the stack.
Function Test(COND)
  let COND = (REL^i, x) where x is either a terminal or a non-terminal
  let z be the i-th object below the stack top
  if z REL x holds
  then begin
    if REL is an explicit relation then
      decrease by 1 the entry in the array COUNTER corresponding to the
      explicit relation z REL x
    return true
  end
  else return false
In the following, we give the complete XpLR(0) parsing algorithm.
Algorithm 3.1 The XpLR(0) parsing algorithm.
Input: A visual sentence in attribute-based representation and an XpLR(0) parsing table.
Output: A bottom-up analysis of the visual sentence if this is syntactically correct, an
error message otherwise.
Method: Start with the state s0 on the top of the stack.
repeat forever
  let s be the state on the stack top
  set ip = Fetch_Vsymbol(next[s])
  if ip is not null
  then // there is only one next vsymbol
    let b be the grammar vsymbol pointed to by ip
    if action[s, b] = “accept” then emit “success” and exit;
    if action[s, b] is a conditioned shift of type “Rt: shift s′”
    then
      if Rt is empty or Test(RELh, b) is true for each RELh ∈ Rt
      then push b and then s′ on the stack;
      else emit “syntax error” and exit;
    else emit “syntax error” and exit;
  else if next[s] is empty then // the state s is a reduce state
    let b be a terminal vsymbol
    if action[s, b] = reduce A → x1R1x2R2 . . . Rm−1xm, ∆, Γ then
      compute the syntactic attributes of the vsymbol A
      according to the synthesis rule ∆, apply rule Γ, if
      present, and pop 2*m symbols from the stack
      let s′ be the new state on the stack top
      if goto[s′, A] is a conditioned goto of type “Rt: s′′” then
        if Rt is empty or Test(RELh, A) is true for each RELh ∈ Rt then
          push A and then s′′ on the stack and
          output the production A → x1R1x2R2 . . . Rm−1xm, ∆, Γ
        else emit “syntax error” and exit;
      else emit “syntax error” and exit;
    else emit “syntax error” and exit;
  else emit “syntax error” and exit;
endrepeat
At each step, the XpLR parsing program checks the entry next[s] of the parsing table
corresponding to the state s on the top of the stack. If Fetch_Vsymbol(next[s]) is not null,
then the resulting pointer ip points to the next terminal b to be processed. In this case,
either the input picture is accepted (action[s, b] = “accept”) with b being the end-of-input
marker EOI, or b is shifted on the stack top (action[s, b] = “R: shift s′”). Whenever a
shift action is required and the action condition R is not empty, each relation RELi^hi
in R is tested between the hi-th object below the stack top and the object b to be shifted.
Otherwise, if next[s] is empty then a reduce action is required. In this case, the
pointer ip is not updated and it points to the last terminal b shifted on the stack top.
The reduction action[s, b] = “reduce A → x1R1x2 . . . Rm−1xm, ∆, Γ” is accomplished by
calculating the syntactic attributes of A as specified by ∆, applying the Γ rule, popping
2*m elements out of the stack, and pushing A on the stack top. If s′ is the state on the
stack top after popping the 2*m elements, then the next state s′′ of the parser is given
by the entry goto[s′, A]. Also in this case, the goto action may be guarded by an action
condition to be verified between objects below the stack top and the object A.
3.1.5 Parsing time complexity
In this subsection we analyze the time complexity of the XpLR parsing algorithm. As
described in Chapter 2, an XpLR parser may not converge in the analysis of a visual
sentence, since the parser may get into a loop while reducing productions where the number
of vsymbols introduced with Γ is greater than or equal to the number of vsymbols
popped from the stack minus one. Thus, we restrict the analysis to the class of convergent XpLR
parsers. In particular, we analyze the time complexity to parse a visual sentence containing
n vsymbols, and with nt vsymbols inserted during the parsing. The worst case complexity
is achieved for correct input pictures when all the input vsymbols are visited.
Given an extended positional grammar XPG, let
- na be the maximum number of attributes of a vsymbol,
- no be the maximum number of vsymbols in the right-hand side of a production,
- nr be the maximum number of relations in a tester, and
- t be the maximum number of triples in the Γ rules.
At each step, the parser performs a shift or a reduce action. Therefore, the total number
of shifts will be n + nt, while the number of reductions will be O(n + nt). The parsing
algorithm performs a shift action whenever next[s] is defined and a reduce action otherwise.
Let us compute separately the time complexity for shift and reduce actions.
To perform a shift action the parsing program must first access the input and then
test the action condition, if any. Let tq be the time required to perform the function
Fetch_Vsymbol (on next[s]). Moreover, if an action condition is to be tested, the cost of the
conditioned shift depends on the number nr of relations in a tester and on the time tr
to test each relation. As the push operation on the stack takes time na, the total time
complexity to perform a shift action is O(tq + nr ∗ tr + na).
To reduce a production, the parser has to perform the following steps:
(i) calculate the syntactic attributes of the left-hand side non-terminal;
(ii) apply the Γ rule;
(iii) pop the records corresponding to the right-hand side vsymbols from the stack;
(iv) test for conditioned gotos;
(v) push the record corresponding to the left-hand side non-terminal onto the stack.
The cost of step (i) depends on the particular function used to synthesize each syntactic
attribute. Let O(t∆(no)) be the time required to perform this task; then the time
complexity for step (i) will be O(na ∗ t∆(no)). The cost of step (ii) depends on the time c
to compute the conditions and the time na required to insert new terminal vsymbols in the
sentence. The total time is O(t(c + na ∗ t∆(no))). As the stack pop operation takes time na,
step (iii) will cost O(no ∗ na). Similarly to a conditioned shift, a conditioned goto has time
complexity O(nr ∗ tr + na). The final push operation (step (v)) takes time na. Therefore,
the total time complexity for a reduce action is O(na(t∆(no) + t(c + na) + no) + nr ∗ tr).
Then, the time complexity of the parser is O((n + nt)(na(t∆(no) + t ∗ c + no) + nr ∗ tr + tq)).
For a fixed grammar, na, nr, no, c and t are constants and the time complexity reduces to
O((n + nt)(tq + tr + t∆)). The parameters tq, tr, and t∆ depend on the particular class of
visual languages. For example, for the graph languages the access time tq may vary from a
constant to O(n), depending on the chosen implementation of the input dictionary, while
the test time tr is constant. Finally, synthesizing the syntactic
attributes of a vsymbol requires time t∆ = O(n). Thus, for a fixed grammar modelling a graph
language the time complexity is O((n + nt)(n ∗ tq)). By using proper hashing techniques
to implement the dictionary Dp, the expected time complexity reduces to O(n(n + nt)).
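The last simplification can be written out step by step. The following derivation is a sketch for graph languages under the assumptions stated in the text (tr constant, t∆ = O(n), and tq = O(1) expected when Dp is hashed); the inclusion in the third line is a loose upper bound that holds whenever n ≥ 1 and tq ≥ 1.

```latex
\begin{align*}
T(n) &= O\big((n+n_t)\,(t_q + t_r + t_\Delta)\big)
      && \text{fixed grammar}\\
     &= O\big((n+n_t)\,(t_q + 1 + n)\big)
      && t_r = O(1),\ t_\Delta = O(n)\\
     &\subseteq O\big((n+n_t)\,(n \cdot t_q)\big)
      && t_q + n + 1 \le 3\,n\,t_q \ \text{for } n, t_q \ge 1\\
     &= O\big(n\,(n+n_t)\big)
      && t_q = O(1) \ \text{expected, with hashing}
\end{align*}
```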
3.2 Constructing XpLR(0) Parsing Tables
In this section we present the algorithms for the construction of an XpLR(0) parsing table.
Let us start by providing the notion of item. An XpLR(0) item of an extended positional
grammar is a production without the ∆ and Γ rules, and with a dot at some position of
the right-hand-side. However, a dot can never be placed between a relation identifier and
the terminal or non-terminal vsymbol to its right.
As an example, the production A → X R1 Y R2 Z, ∆, Γ leads to the following four types
of XpLR(0) items:
[A → · X R1 Y R2 Z]
[A → X · R1 Y R2 Z]
[A → X R1 Y · R2 Z]
[A → X R1 Y R2 Z ·]
Intuitively, an item indicates how much of a production has already been examined during
the parsing process and what is yet to come. For instance, the item
[Graph → Graph’ · 〈〈1_1〉,〈1_2〉〉 EDGE] from example 2.3 means that the non-terminal Graph’ has
already been seen and a terminal EDGE in relation 〈1_1, 1_2〉 with Graph’ is expected next.
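The constraint on dot positions can be made concrete with a short Python sketch. The encoding is an assumption made for illustration: a right-hand side is a sequence of (relation, vsymbol) pairs, with None as the relation of the first vsymbol, so a dot index counts whole pairs and can never fall between a relation and the vsymbol to its right.

```python
def items_of(lhs, rhs):
    """All XpLR(0) items of a production; rhs is a list of (relation, vsymbol)."""
    return [(lhs, tuple(rhs), dot) for dot in range(len(rhs) + 1)]


def show(item):
    """Render an item in the bracket notation used in the text."""
    lhs, rhs, dot = item
    parts = []
    for i, (rel, sym) in enumerate(rhs):
        if i == dot:
            parts.append("·")
        if rel is not None:
            parts.append(rel)
        parts.append(sym)
    if dot == len(rhs):
        parts.append("·")
    return f"[{lhs} → {' '.join(parts)}]"


production = [(None, "X"), ("R1", "Y"), ("R2", "Z")]
for item in items_of("A", production):
    print(show(item))
```

A production with m vsymbols in its right-hand side yields m + 1 items under this encoding, as in the four-item example above.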
A collection of sets of XpLR(0) items provides the basis for constructing XpLR(0)
parsers. To construct such collection for a grammar, we define an augmented grammar
and two functions, closure and goto. Given an extended positional grammar G with start
vsymbol S, its augmented extended positional grammar G’ is derived from G by adding
the new start vsymbol S’ and the production S’ → S.
The Closure Operation
If I is a set of items for a grammar G, then closure(I) is the set of items constructed from
I by the two rules:
1. Initially, every item in I is added to closure(I).
2. If A → α · R Bβ with α ≠ ε or A → · Bβ is in closure(I) and B → γ is a production,
then add the item B → · γ to closure(I), if it is not already there. We apply this rule until
no more new items can be added to closure(I).
Intuitively, given a set of items I containing an item with a dot before a non-terminal
B, the function CLOSURE adds to I all the items with B in the left-hand side and the
dot preceding the first object of the right-hand side. This means that if the non-terminal
object B is expected next, then any object starting a positional sentential form from B is
expected next.
The function closure can be computed as follows.
Function CLOSURE(I)
begin
  J = I;
  repeat
    for each item [A → α · R Bβ] with α ≠ ε or [A → · Bβ] in J
      and each production B → γ in G’ such that B → · γ is not in J
    do add [B → · γ] to J
  until no more items can be added to J
  return J;
end.
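The same fixed-point computation can be sketched in Python with a worklist. The encoding is assumed for illustration only: an item is a triple (lhs, rhs, dot), a right-hand side is a tuple of (relation, vsymbol) pairs, and the grammar is a dict from non-terminals to their right-hand sides. The sample grammar follows the productions of example 2.1 as they appear in the item sets of this chapter, with relations written as plain strings.

```python
def closure(items, grammar):
    """Smallest superset of items closed under the rule for dotted non-terminals."""
    result = set(items)
    work = list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot == len(rhs):
            continue  # dot at the end: nothing is expected next
        _, sym = rhs[dot]
        for production in grammar.get(sym, []):  # terminals have no productions
            new_item = (sym, production, 0)
            if new_item not in result:
                result.add(new_item)
                work.append(new_item)
    return result


A_RHS = ((None, "a"), ("1_1", "B"), ("<1^1_1, 2_2>", "b"), ("1^1_1", "d"))
B_RHS = ((None, "e"), ("2_2", "f"))
GRAMMAR = {"A": [A_RHS], "B": [B_RHS]}

# Closing [A → a · 1_1 B ...] adds [B → · e 2_2 f].
print(closure({("A", A_RHS, 1)}, GRAMMAR))
```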
The Goto Operation
The second useful function is goto(I, x, R) where I is a set of items, x is a grammar
vsymbol and R is a sequence of tester relations. goto(I, x, Rtester) is defined to be the
closure of the set of all items [A → α 〈Rdriver, Rtester〉 x · β] such that [A → α · 〈Rdriver,
Rtester〉 x β] is in I. Intuitively, once a vsymbol x has been seen, the function GOTO
determines the ordered sequence of sets of items containing the vsymbols that can be seen
next.
Function GOTO(I, x, Rtester)
begin
  if Rtester = ∅ then
    let J = { [A → α Rdriver x · β] | α ≠ ε and [A → α · Rdriver x β] ∈ I }
            ∪ { [A → x · β] | [A → · x β] ∈ I }
  else
    let J = { [A → α 〈Rdriver, Rtester〉 x · β] | α ≠ ε, and
              [A → α · 〈Rdriver, Rtester〉 x β] ∈ I }
  return CLOSURE(J)
end
The Set-of-Items Construction
We are now ready to give the algorithm to construct C, the collection of sequences of
XpLR(0) item sets for an augmented grammar G’; the algorithm is shown in the following.
Algorithm 3.2 Construction of the sets of XpLR(0) items.
Input: An augmented extended positional grammar G’.
Output: The collection of XpLR(0) item sets.
Method: Item sets are constructed by the main procedure ITEMS, which in turn calls the
two functions CLOSURE and GOTO.
Procedure ITEMS(G’)
begin
  let C = { 〈CLOSURE({[S’ → · S]})〉 }
  repeat
    for each set of items I in C, each vsymbol x
      such that there exists [A → α · 〈Rdriver〉 x β] ∈ I or [A → · x β] ∈ I and
      GOTO(I, x, ∅) is not included in C
    do C = C ∪ GOTO(I, x, ∅)
    for each set of items I in C, each vsymbol x and each sequence
      of tester relations Rtester ≠ ∅ such that [A → α · 〈Rdriver, Rtester〉 x β] ∈ I
      and GOTO(I, x, Rtester) is not included in C
    do C = C ∪ GOTO(I, x, Rtester)
  until no more sets of items can be added to C
end
The collection of XpLR(0) item sets of an augmented extended positional grammar G’
is incrementally constructed by the main procedure ITEMS, starting from the initial
set containing the item [S’ → · S]. Similarly to the LR case, the sets of XpLR(0) items
correspond to the states of a finite automaton for viable prefixes [1] where the transitions
are determined by the function GOTO.
In the following we give an example of the construction of the XpLR(0) item sets.
Example 3.1 The collection of XpLR(0) item sets for the grammar of example 2.1 is
described in the following. The notation (goto j) to the right of an item K = [A
→ α · 〈Rdriver, Rtester〉 x β] in an item set I indicates the item set Ij returned by
GOTO(I, x, Rtester).
I0 = { S’ → · A (goto 1)
       A → · a 〈1_1〉 B 〈〈1¹_1〉,〈2_2〉〉 b 〈1¹_1〉 d (goto 2) }
I1 = { S’ → A · }
I2 = { A → a · 〈1_1〉 B 〈〈1¹_1〉,〈2_2〉〉 b 〈1¹_1〉 d (goto 3)
       B → · e 〈2_2〉 f (goto 6) }
I3 = { A → a 〈1_1〉 B · 〈〈1¹_1〉,〈2_2〉〉 b 〈1¹_1〉 d (goto 4) }
I4 = { A → a 〈1_1〉 B 〈〈1¹_1〉,〈2_2〉〉 b · 〈1¹_1〉 d (goto 5) }
I5 = { A → a 〈1_1〉 B 〈〈1¹_1〉,〈2_2〉〉 b 〈1¹_1〉 d · }
I6 = { B → e · 〈2_2〉 f (goto 7) }
I7 = { B → e 〈2_2〉 f · }
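The construction of this collection can be reproduced with a compact Python sketch. The encoding is an assumption for illustration: each right-hand-side element is a triple (driver, tester, vsymbol) with None for an absent relation, item sets are frozensets, and the order in which sets are discovered need not match the numbering I0-I7 used above.

```python
def closure(items, grammar):
    result, work = set(items), list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs):
            sym = rhs[dot][2]
            for production in grammar.get(sym, []):
                new_item = (sym, production, 0)
                if new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return result


def goto(items, x, tester, grammar):
    """Advance the dot over x in every item whose tester sequence matches."""
    moved = {(lhs, rhs, dot + 1) for lhs, rhs, dot in items
             if dot < len(rhs) and rhs[dot][2] == x and rhs[dot][1] == tester}
    return closure(moved, grammar)


def item_sets(grammar, start):
    initial = frozenset(closure({("S'", ((None, None, start),), 0)}, grammar))
    collection = [initial]
    for state in collection:  # the list grows while it is being scanned
        transitions = {(rhs[dot][2], rhs[dot][1])
                       for _, rhs, dot in state if dot < len(rhs)}
        for x, tester in sorted(transitions, key=str):
            target = frozenset(goto(state, x, tester, grammar))
            if target not in collection:
                collection.append(target)
    return collection


A_RHS = ((None, None, "a"), ("1_1", None, "B"),
         ("1^1_1", "2_2", "b"), ("1^1_1", None, "d"))
B_RHS = ((None, None, "e"), ("2_2", None, "f"))
GRAMMAR = {"A": [A_RHS], "B": [B_RHS]}

print(len(item_sets(GRAMMAR, "A")))
```

For this grammar the sketch yields eight item sets, matching the number of sets listed in the example.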
The XpLR parsing table Construction
Now we shall show how to construct an XpLR parsing table from the collection of item
sets.
Algorithm 3.3 Constructing an XpLR(0) parsing table.
Input: An augmented extended positional grammar G’.
Output: The XpLR(0) parsing table for G’.
Method:
1. Construct C = {I0, I1,. . ., Im}, the collection of sets of XpLR(0) items as described
in Algorithm 3.2.
2. State i of the parser is constructed from the set of items Ii. The entries for state i
of the parsing table action and next parts are determined as follows:
SHIFT ENTRIES
• If [A → α · Rdriver a β] or [A → · a β] is in Ii and GOTO(Ii, a, ∅) = Ij then
set action[i, a] = “T: shift j” (a is required to be a terminal), where T stands
for a condition that always returns true.
• If [A → α · 〈Rdriver, Rtester〉 a β] is in Ii and GOTO(Ii, a, Rtester) = Ij then
set action[i, a] = “Rtester: shift j” (a is required to be a terminal).
REDUCE ENTRIES
• If [A → α · ] is in Ii then set action[i, a] = “reduce A → α” for each terminal
a.
NEXT and ACCEPT ENTRIES
• Whenever [A → α · 〈Rdriver, Rtester〉 x β] is in Ii insert (Rdriver, x) in next [i].
• If [S’ → · S] is in Ii then insert (start, S) in next [i]. If [S’ → S ·] is in Ii then
insert (end, EOI) in next [i] and “accept” in action[i, EOI].
3. The entries for state i and the non-terminals X of the goto part are determined as
follows:
• If [A → α · 〈Rdriver, Rtester〉 X β] is in Ii and GOTO(Ii, X, Rtester) = Ij then
insert “Rtester : j” in goto[i, X].
• If [A → α · Rdriver X β] or [A → · X β] is in Ii and GOTO(Ii, X, ∅) = Ij then
insert “T: j” in goto[i, X].
The action and goto parts of the XpLR parsing table are constructed as in the LR parsing
tables. The action conditions and the entries in the column next are constructed as follows:
• a shift or goto action in state i has a sequence of tester relations Rtester as an action
condition if and only if the set of items Ii corresponding to state i contains an item
with a dot preceding a sequence Rtester.
• the entry next [i] contains the pair (Rdriver, x ) if and only if the set of items Ii
corresponding to state i contains an item with a dot preceding a sequence 〈Rdriver,
Rtester〉, and the vsymbol x.
3.3 XpLR parsing table conflicts
A conflict in an XpLR parsing table arises when multiple actions are contained in a
single entry of the action, goto or positional parts. An XpLR parsing table may present
shift/shift, goto/goto and positional conflicts, besides the classical shift/reduce and
reduce/reduce conflicts.
A shift/shift conflict occurs whenever multiple shift actions are present in a single entry
of the action part. Analogously, a goto/goto conflict occurs whenever multiple goto actions
appear in a single entry of the goto part. Shift/shift or goto/goto conflicts are generated
whenever a set of XpLR(0) items contains two or more items with the dot preceding the
same grammar object with the same sequence of driver relations, but with different tester
relations.
As in the LR methodology, a shift/reduce (reduce/reduce, resp.) conflict occurs whenever
a single entry of the action part contains both shift and reduce (multiple reduce,
resp.) actions.
A positional conflict occurs whenever multiple values (REL, x) are present in a single
entry of the next column. This conflict is generated whenever a set of XpLR(0) items
contains two or more items with the dot preceding pairs with different driver
relations, or with the same driver relation but different grammar objects.
An extended positional grammar for which it is possible to construct an XpLR parsing
table without conflicts is said to be an XpLR grammar. As an example, the grammar of
example 2.1 is an XpLR grammar, as shown by its parsing table in Fig. 3.2. As for LR
parsing, every ambiguous grammar G fails to be XpLR. Indeed, if G is visually ambiguous
then the corresponding parsing table has a positional conflict, whereas if G is structurally
ambiguous then the parsing table may present any conflict.
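The conflict types defined above can be detected mechanically once the table entries are available. The sketch below assumes a simplified, hypothetical encoding in which an action cell is a list of ("shift", state) or ("reduce", production) pairs and a next entry is a list of (driver relation, grammar object) pairs; action conditions are omitted.

```python
def action_conflicts(cell):
    """Return the conflict types present in one entry of the action part."""
    shifts = sum(1 for kind, _ in cell if kind == "shift")
    reduces = sum(1 for kind, _ in cell if kind == "reduce")
    found = []
    if shifts > 1:
        found.append("shift/shift")
    if shifts and reduces:
        found.append("shift/reduce")
    if reduces > 1:
        found.append("reduce/reduce")
    return found

def has_positional_conflict(next_entry):
    """A positional conflict: more than one distinct (REL, x) pair in next[i]."""
    return len(set(next_entry)) > 1
```

A table is conflict-free in this simplified sense when action_conflicts returns an empty list for every cell and has_positional_conflict is false for every state; goto/goto conflicts can be checked in the same way as shift/shift ones.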
3.3.1 Handling parsing table conflicts
Ambiguities in non XpLR grammars are handled by exploiting heuristics. In particular,
positional conflicts are solved by partitioning the conflicting state into a sequence of
substates on the basis of the driver relations, and ordering the values (RELh, x) in the same
entry of the column next. In this case, the XpLR parsing table has a slightly different
structure with respect to the pLR parsing table.
As an example, Fig. 3.3 shows the parsing table of the ambiguous grammar STD of
example 2.3. Note that state 4 is partitioned into four ordered substates. Thus when the
parser is in state 4, it has recognized a non-terminal Graph and proceeds with the parsing
of the visual sentence by looking for:
1. an outgoing edge or a self-edge of Graph (state 4.1), as shown in Fig. 3.4 (a)-(b); or
2. an incoming edge of Graph (state 4.2), as shown in Fig. 3.4 (c); or
3. a reintroduced vsymbol PLACEHOLD (state 4.3), as shown in Fig. 3.4 (d).
If none of these vsymbols is found then the parser executes a reduce operation (state
4.4).
St.   Action                                                   Goto                  NEXT
      NODEI  NODEIF  NODEF  NODEG  EDGE        PLACEHOLD  EOI  StateTD  Graph  Node
0     :sh2   :sh3                                              :1       :4           (start, StateTD)
1                                                         acc                        (end, EOI)
2     r2     r2      r2     r2     r2          r2         r2                         -
3     r3     r3      r3     r3     r3          r3         r3                         -
4.1                                ¬1_2: sh5                                         (1_1, EDGE)
                                   1_2: sh6
4.2                                ¬1_1: sh7                                         (1_2, EDGE)
4.3                                            :sh8                                  (any, PLACEHOLD)
4.4   r1     r1      r1     r1     r1          r1         r1                         -
5                    :sh11  :sh10              :sh12                           :9    (2_1, Node)
6     r5     r5      r5     r5     r5          r5         r5                         -
7                    :sh11  :sh10              :sh12                           :13   (1_1, Node)
8     r7     r7      r7     r7     r7          r7         r7                         -
9     r4     r4      r4     r4     r4          r4         r4                         -
10    r8     r8      r8     r8     r8          r8         r8                         -
11    r9     r9      r9     r9     r9          r9         r9                         -
12    r10    r10     r10    r10    r10         r10        r10                        -
13    r6     r6      r6     r6     r6          r6         r6                         -
Figure 3.3: An XpLR(0) parsing table with ordered substates.
The order of the substates in a state depends on the syntax of the language to be parsed.
In general, the language implementer may need to modify the order of the substates
accordingly.
It is easy to show that partitioning a state into a sequence of ordered substates allows us
to avoid all the conflicts caused by the introduction of Γ rules in the XpLR grammars, and
also some of the conflicts that could occur when using the XpLR parsing table construction
algorithm, such as shift/reduce and reduce/reduce conflicts.
[Figure panels: (a), (b) State 4.1; (c) State 4.2; (d) State 4.3. Labels appearing in the panels: Graph, 1_1, 1_2.]
Figure 3.4: A graphical representation of state 4.
The remaining shift/reduce and reduce/reduce conflicts are solved by using disambiguating
rules such as those used by tools like YACC [39]. In particular, a shift/reduce
conflict is resolved in favor of shift, and a reduce/reduce conflict is resolved by choosing
the conflicting production listed first in the grammar specification.
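The two YACC-like disambiguating rules can be sketched as a small resolver. The cell encoding is a hypothetical one chosen for this illustration: ("shift", state) and ("reduce", production number) pairs, where a lower production number means the production is listed earlier in the grammar specification.

```python
def resolve_yacc_style(cell):
    """Pick one action from a conflicting entry of the action part."""
    shifts = [action for action in cell if action[0] == "shift"]
    if shifts:
        # shift/reduce: prefer the shift
        return shifts[0]
    reduces = [action for action in cell if action[0] == "reduce"]
    # reduce/reduce: prefer the production listed first in the grammar
    return min(reduces, key=lambda action: action[1]) if reduces else None
```

Entries holding several shift actions are not covered here; they are shift/shift conflicts and are handled by ordering the conditioned actions.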
Finally, shift/shift and goto/goto conflicts are solved by ordering the conditioned
actions present in the same entry. The parser tests the action conditions sequentially and
executes the first action whose condition is verified. Similarly to YACC, the order of
multiple values in the same entry of the parsing table depends on the order of the items in
the same set.
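The sequential test of conditioned actions amounts to a first-match scan over an ordered list. In the sketch below an entry is a list of (condition, action) pairs, each condition being a predicate over the set of relations that currently hold; both the encoding and the two sample conditions are illustrative assumptions, not the thesis' data structures.

```python
def select_action(entry, held):
    """Return the first action whose condition is verified, or None."""
    for condition, action in entry:
        if condition(held):
            return action
    return None

# Hypothetical entry with two conditioned shifts on the same terminal.
ENTRY = [
    (lambda held: "1_2" not in held, "shift 5"),
    (lambda held: "1_2" in held, "shift 6"),
]
```

Because the scan stops at the first verified condition, reordering the pairs changes the parser's behaviour whenever the conditions are not mutually exclusive.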
It is easy to reproduce the reduction process in Fig. 2.4 by applying Algorithm 3.1
modified with the previous heuristics on the XpLR(0) parsing table in Fig. 3.3. Let us
observe that the parsing time complexity on the grammar STD is O(n²), where n is the
number of vsymbols in the sentence, since the number of vsymbols introduced during the
parsing is limited by the number of edges in the sentence.
3.3.2 Building parsing tables with ordered substates
In the following we show how to modify the algorithm for the construction of the collection
of XpLR(0) item sets in order to obtain a parsing table where the states can be partitioned
into substates. To this aim, we introduce a new function called Partition in addition to
the Closure and Goto functions.
The Partition Operation
If J is a set of XpLR(0) items then Partition(J) splits J into an ordered sequence of XpLR(0)
item sets. The function groups the items having the same driver relation following the
dot, so each set of the sequence can be identified by a driver relation. The order of the
sets in the sequence depends on the syntax of the language to be parsed, and the language
implementer may need to modify it accordingly. Moreover, if J contains one or more complete
items (i.e., of type [A → X R1 Y R2 Z · ]) then the function inserts the item whose
production is first listed in the XPG specification at the end of the sequence.
The function Partition can be computed as follows.
Function PARTITION(J)
begin
let D be any ordered sequence 〈Rdriver1,. . ., Rdrivern〉 of all the different driver
    relations following the dots in the items in J
if n ≥ 1 then
  for i = 1 to n do
    Ji = { items | items = [A → α’ x · 〈Rdriveri, Rtester〉 β’] ∈ J }
let m be the number of complete items in J
if m > 0 then
  Jn+1 = { [A → α’ x · ] | [A → α’ x · ] ∈ J, and A → α’ x is the conflicting
           production listed first in the grammar specification }
  return 〈J1,. . ., Jn+1〉;
if n ≤ 1 then return {〈J〉}
else return 〈J1,. . ., Jn〉;
end
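The PARTITION function can be sketched in Python as follows. The Item record is a hypothetical encoding introduced for this illustration (production number, printable text, driver relation after the dot, completeness flag), and the order of the driver relations is taken to be their order of first appearance unless the caller supplies one.

```python
from typing import NamedTuple, Optional

class Item(NamedTuple):
    production: int         # position of the production in the grammar specification
    text: str               # printable form of the item
    driver: Optional[str]   # driver relation following the dot, if any
    complete: bool          # True when the dot is at the end of the production

def partition(items, driver_order=None):
    """Split a set of items into an ordered sequence of item sets."""
    drivers = list(dict.fromkeys(i.driver for i in items if i.driver is not None))
    if driver_order is not None:
        drivers.sort(key=driver_order.index)
    groups = [[i for i in items if i.driver == d] for d in drivers]
    complete = [i for i in items if i.complete]
    if complete:
        # keep only the complete item whose production is listed first
        first = min(complete, key=lambda i: i.production)
        return groups + [[first]]
    if len(drivers) <= 1:
        return [list(items)]
    return groups

# Items of state 4 of the STD grammar, with loosely transcribed relation labels.
STATE_4 = [
    Item(4, "Graph -> Graph' . <1_1, ...> EDGE 2_1 Node", "1_1", False),
    Item(5, "Graph -> Graph' . <1_1, ...> EDGE", "1_1", False),
    Item(6, "Graph -> Graph' . <1_2, ...> EDGE 1_1 Node", "1_2", False),
    Item(7, "Graph -> Graph' . <any> PLACEHOLD", "any", False),
    Item(1, "StateTD -> Graph .", None, True),
]
```

Calling partition(STATE_4) yields four item sets, one per driver relation plus a final one holding the complete item, which mirrors the four substates of state 4.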
This function is invoked by the function Goto as described in the following.
The new version of the Goto Operation
Goto(I, x, R) is defined to be the closure of the sequence of sets of items obtained by
applying the partition operation to the set of all items [A → α 〈Rdriver, Rtester〉 x · β]
such that [A → α · 〈Rdriver, Rtester〉 x β] is in I.
Function GOTO(I, x, Rtester)
begin
if Rtester = ∅ then
  let J = { [A → α Rdriver x · β] | α ≠ ε and [A → α · Rdriver x β] ∈ I } ∪
          { [A → x · β] | [A → · x β] ∈ I }
else
  let J = { [A → α 〈Rdriver, Rtester〉 x · β] | α ≠ ε, and
            [A → α · 〈Rdriver, Rtester〉 x β] ∈ I }
set 〈J1,. . ., Jm〉 = PARTITION(J) where m is the length of the returned sequence
return 〈CLOSURE(J1),. . ., CLOSURE(Jm)〉
end
Finally, the function Items can be easily modified in order to take into account sequences
of item sets instead of item sets.
In the following we give an example of the construction of the sequences of XpLR(0) item sets.
Example 3.2 The collection of sequences of XpLR(0) item sets for the grammar of
example 2.3 is described in the following.
I0 = 〈 J01 = { S’ → · StateTD (goto 1)
StateTD → · Graph (goto 4)
Graph → · NODEI (goto 2)
Graph → · NODEIF (goto 3)
Graph → · Graph’ 〈〈1 1〉,〈1 2〉〉 EDGE 2 1 Node (goto 4)
Graph → · Graph’ 〈〈1 1〉,〈1 2〉〉 EDGE (goto 4)
Graph → · Graph’ 〈〈1 2〉,〈1 1〉〉 EDGE 1 1 Node (goto 4)
Graph → · Graph’ 〈any〉 PLACEHOLD (goto 4)}〉
I1 = 〈 J11 = { S’ → StateTD · }〉
I2 = 〈 J21 = { Graph → NODEI · }〉
I3 = 〈 J31 = { Graph → NODEIF · }〉
I4 = 〈 J41 = { Graph → Graph’ · 〈〈1 1〉,〈1 2〉〉 EDGE 2 1 Node (goto 5)
Graph → Graph’ · 〈〈1 1〉,〈1 2〉〉 EDGE (goto 6)}
J42 = { Graph → Graph’ · 〈〈1 2〉,〈1 1〉〉 EDGE 1 1 Node (goto 7)}
J43 = { Graph → Graph’ · 〈any〉 PLACEHOLD (goto 8)}
J44 = { StateTD → Graph · }〉
I5 = 〈 J51 = { Graph → Graph’ 〈〈1 1〉,〈1 2〉〉 EDGE · 2 1 Node (goto 9)
Node → · NODEG (goto 10)
Node → · NODEF (goto 11)
Node → · PLACEHOLD (goto 12)}〉
I6 = 〈 J61 = { Graph → Graph’ 〈〈1 1〉,〈1 2〉〉 EDGE · }〉
I7 = 〈 J71 = { Graph → Graph’ 〈〈1 2〉,〈1 1〉〉 EDGE · 1 1 Node (goto 13)
Node → · NODEG (goto 10)
Node → · NODEF (goto 11)
Node → · PLACEHOLD (goto 12)}〉
I8 = 〈 J81 = { Graph → Graph’ 〈any〉 PLACEHOLD · }〉
I9 = 〈 J91 = { Graph → Graph’ 〈〈1 1〉,〈1 2〉〉 EDGE 2 1 Node · }〉
I10 = 〈 J101 = { Node → NODEG · }〉
I11 = 〈 J111 = { Node → NODEF · }〉
I12 = 〈 J121 = { Node → PLACEHOLD · }〉
I13 = 〈 J131 = { Graph → Graph’ 〈〈1 2〉,〈1 1〉〉 EDGE 1 1 Node · }〉
Sequence I4 is the only one formed by more than one set. The function Partition has split
the set of items into four subsets: three obtained from the ordered sequence of driver
relations D = { 〈1 1〉, 〈1 2〉, 〈any〉 }, and a fourth one holding the complete item.
3.4 Applicability of XpLR parsing
In this section we show the properties that an extended positional grammar must
satisfy in order to obtain a correct and complete XpLR parser. Moreover, we show that
the XpLR methodology provides means to also handle grammars whose associated XpLR
parsers are not correct and/or complete.
Theorem 3.1 gives the conditions under which an XpLR parser is correct. The proof
is derived from theorem 7.1 of [15].
Theorem 3.1 (Correctness)
Let XPG be an XpLR grammar and P(XPG) its XpLR parser. If Π is a visual sentence
accepted by P(XPG) then Π ∈ L(XPG).
Vice versa, the absence of conflicts in an XpLR parsing table for a language L does not
guarantee that every visual sentence in L is accepted by the corresponding XpLR parser. Let
Π be a visual sentence in L(XPG) where XPG=(G,PE), generated by applying PE to a
positional sentence s ∈ L(G). At each step of the parsing process, the function Fetch Vsymbol
takes as argument the pair (REL_i^{h_i}, x) from the column next of the parsing table to
query the input dictionary. For the parsing program to execute correctly in a deterministic
way, there must be a single terminal xi reachable from x that is detected and returned by
Fetch Vsymbol. However, if Fetch Vsymbol detects more than one terminal on
the pair (REL_i^{h_i}, x), a “run-time conflict” message is returned and the parsing program
halts. In this case, we say that a run-time conflict occurred.
As an example, let us consider the visual sentential form in Fig. 3.5 obtained during
the reduction process described in example 2.3 (see Fig. 2.4(e)). The parser produces
this form by reaching state 4.3 containing the set of items J43 = { Graph → Graph’ · 〈any〉 PLACEHOLD }.
In the next step, the execution of Fetch Vsymbol on the pair (any,
PLACEHOLD) retrieves two occurrences of the terminal PLACEHOLD and, as a
consequence, detects a run-time conflict.
Figure 3.5: A visual sentential form.
Definition 3.1 (XpLR parsable)
Let P(XPG) be the XpLR parser of an XpLR grammar XPG. XPG is said to be XpLR parsable
if for each visual sentence Π ∈ L(XPG), each execution of Fetch Vsymbol invoked by
P(XPG) during the parsing of Π detects and returns one and only one terminal.
In other words, the parser P(XPG) of an XpLR parsable grammar XPG never incurs
run-time conflicts. The following theorem gives the conditions for completeness of the
XpLR parsing algorithm. The proof is an extension of theorem 7.2 of [15].
Theorem 3.2 (Completeness)
Let XPG be an XpLR parsable grammar and P(XPG) its XpLR parser. If Π ∈ L(XPG)
then Π is accepted by P(XPG).
Grammars that exhibit run-time conflicts are undesirable because they
are not suitable for XpLR parsing. In [16] an algorithm has been introduced to statically
verify, during the construction of the parsing table, whether or not a positional grammar
would produce run-time conflicts (this algorithm can also be applied to XpLR grammars).
In particular, whenever the algorithm detects a conflict it returns the set of items causing
the conflict. Therefore, this technique gives the designer feedback in the early
phases of the syntax definition of the visual language, together with information on
how and where to intervene in order to solve the conflict.
As a matter of fact, whenever the algorithm detects a run-time conflict the grammar
designer analyzes the relation R causing the conflict and checks whether the scanning
order of the vsymbols producing the conflict, i.e. those belonging to the set detected by
Fetch Vsymbol(NEXT), is irrelevant to the correct parsing of the sentence. In this
case any of the detected vsymbols can be chosen as the next input. It is easy to show that
the relation any in the set of items J43 is such that every PLACEHOLD can be chosen as
the next vsymbol to be parsed.
In general, when the algorithm statically detects a “non relevant run-time conflict”
produced by a relation of this type in a particular set of items, the grammar designer must
explicitly tag that relation in the XPG with a ’*’.
In order to support this approach, the function Fetch Vsymbol must be modified to
take into account the tagged relations. The modification consists in the addition of the
following new case:
NEXT = (R*driver, x), where R*driver = 〈REL_1^{h_1},. . ., REL_n^{h_n}〉* and each REL_i^{h_i} acts on
a syntactic attribute ki of x
  let zi be the hi-th object below the stack top
  for i = 1 to n do
    let next seti = { b | b is in Dp, it is not marked as visited, it has an attribute j
                      such that (b, j) is reachable from (x, ki), zi RELi b holds,
                      and the relation RELi acts on a syntactic attribute of zi
                      and the syntactic attribute j of b, respectively }
  if ∩i=1...n next seti is non-empty
  then
    randomly select an object b from ∩i=1...n next seti
    for each RELi in Rdriver that is an explicit relation do
      decrease by 1 the entry in the array COUNTER corresponding to the
      explicit relation zi RELi b
    mark the corresponding entry in Dp as visited
    return the row index of b in Dp
  else return null;
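The difference between a tagged and an untagged driver relation can be illustrated with a simplified fetch step. The sketch assumes that the candidate sets next seti have already been computed, represents vsymbols by hypothetical string identifiers, and leaves out the COUNTER bookkeeping for explicit relations.

```python
import random

class RunTimeConflict(Exception):
    """Raised when an untagged relation reaches more than one vsymbol."""

def fetch_vsymbol(next_sets, visited, tagged=False, rng=random):
    """Pick the next vsymbol from the intersection of the candidate sets."""
    if not next_sets:
        return None
    candidates = set.intersection(*(set(s) for s in next_sets)) - visited
    if not candidates:
        return None
    if len(candidates) > 1 and not tagged:
        raise RunTimeConflict(sorted(candidates))
    chosen = rng.choice(sorted(candidates))  # any candidate is acceptable when tagged
    visited.add(chosen)                      # mark the entry as visited
    return chosen
```

With two unvisited PLACEHOLD occurrences reachable through the relation any, the untagged call raises RunTimeConflict, whereas the tagged call returns one of them and marks it as visited.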
Although the relations used to model many popular visual languages are applied in
contexts in which they can be tagged, this technique cannot always be applied. In
these cases, the grammar designer must modify the grammar in order to solve the
conflict, analogously to what happens when using traditional compiler-compiler tools such as
YACC [39].
Fig. 3.6 describes the steps that the designer follows to construct an XpLR parser.
XpLR methodology
1. Design or modify an extended positional grammar G
2. Construct the parsing table for G
3. if G has no conflicts then exit;
else apply the following heuristics
a. positional conflicts are handled by partitioning states in substates
b. s/r and r/r are handled as with YACC
c. s/s and g/g are handled by sorting the conflicting entries according to the associated tester relations.
4. if the modified parsing table suits the designer’s needs then exit; else goto 1;
Figure 3.6: The steps for the construction of an XpLR parser.
It is worth noting that for some classes of non XpLR grammars the application of the
heuristics leads to deterministic parsers. In particular, the conflicts that preserve
determinism are: the positional conflicts that involve a univocal relation, and the shift/shift
(goto/goto, resp.) conflicts with mutually exclusive conditioned shift (goto, resp.)
actions. In the first case, the univocal relation may be satisfied by at most one vsymbol, thus
Fetch Vsymbol may succeed on at most one conflicting entry. Similarly, in the second case
only one condition is true, which guarantees that only one shift (goto, resp.) action can be
performed.
As an example, the grammar for the context-sensitive language a^n b^n c^n given in example
2.2 is non XpLR (see Fig. 3.7) but the positional conflicts involve the relation right-to,
which is a univocal relation. It is easy to prove that the time complexity of the parser
is O(n²) by analyzing the structure of the derivation steps needed to generate the word
a^n b^n c^n.
Thanks to the application of heuristics we could easily generate parsers for recognizing
many practical visual languages that would otherwise require the specification of a complex
grammar. As an example, the grammar for state transition diagrams given in example
2.3 is non XpLR. Nevertheless, we preserve its simple structure by handling the conflicts
through the partitioning and ordering of the parser states into substates.
St. Action Goto NEXT
a b c EOI S B
0 :sh2 :1 (start, S)
1 acc (end, EOI)
2 :sh5 :sh4 :3 (right-to, B)
3 :sh2 :sh10 :8 (right-to, S)
(right-to, c)
4 :sh7 :sh6 (right-to, c)
(right-to, b)
5 :sh5 :sh4 :9 (right-to, B)
6 r3 r3 r3 r3 -
7 r5 r5 r5 r5 -
8 :sh11 (right-to, c)
9 r4 r4 r4 r4 -
10 r2 r2 r2 r2 -
11 r1 r1 r1 r1 -
Figure 3.7: The XpLR(0) parsing table for the grammar of example 2.2.
Chapter 4
Building LR(0) Parsers for XPG Grammars
Given an extended positional grammar XPG, we can build an XpLR(0) parser that
recognizes L(XPG) by using the algorithms described in the previous chapter.
In this chapter, we show that it is also possible to construct a parser for an XPG
grammar by using a translation scheme directly derived from XPG by means of special
mapping rules. A translation scheme is a context-free string grammar in which attributes
are associated with the grammar vsymbols and semantic actions enclosed between braces
{} are inserted within the right sides of productions [1]. In the following we denote with
map(XPG) the translation scheme SG derived by applying mapping rules to XPG, G(SG)
the context-free grammar underlying SG, P(SG) the corresponding parser, and L(SG) the
language recognized by P(SG).
The conversion of an XPG into an “equivalent” translation scheme allows us to use
standard and well-known compiler generation tools, like YACC [39], for the rapid
implementation of compilers for visual languages. We also prove that given a translation scheme
SG=map(XPG),
1. G(SG) is LR(0) iff XPG is XpLR(0), and
2. the LR(0) parser built on SG recognizes the same set of visual sentences as the
XpLR(0) parser built on XPG.
However, in an attempt to keep the grammars simple, visual language designers often
prefer to leave ambiguities within the grammar, and to solve them later by using conflict
handling techniques to be specified when generating the parser. This means that we
might have to frequently deal with ambiguous XPGs. Thus, P(SG) needs heuristics for
conflict solving to preserve the equivalence between L(SG) and L(XPG). In particular,
we will prove that to each type of conflict in an XPG corresponds a precise type of
conflict in map(XPG). In this way, we devise conflict handling techniques for map(XPG)
simulating the behavior of the techniques used for XPG, so that L(XPG) is still equivalent
to L(map(XPG)).
Fig. 4.1 graphically illustrates our approach. Let us observe that, if the translation
scheme obtained from the conversion is non-LR(0) then it does not present shift/reduce
conflicts but only reduce/reduce.
Figure 4.1: An approach for the construction of LR grammars from XPGs.
The next section describes the mapping process, whereas section 4.2 shows the equivalence
between L(XPG) and L(map(XPG)) for an XpLR grammar XPG. Finally, section 4.3
provides techniques for constructing a translation scheme map(XPG) for a non-XpLR
grammar XPG, such that L(P(map(XPG)))=L(P(XPG)). In other words, we describe
how to construct an LR parser simulating P(XPG), including its heuristics.
4.1 Converting an XPG into a translation scheme based on string grammars
In this section we define the mapping rules to convert a generic extended positional
grammar into a translation scheme. The generated translation schemes have synthesized
attributes, i.e. each grammar production “A → α” is associated with an action that
calculates the attributes of the non-terminal A from the values of the vsymbols in the right
hand side α.
Let us consider a generic production of an extended positional grammar XPG:
A → α xi 〈Rdriveri, Rtesteri〉 xi+1 β, ∆, Γ    (4.1)
where xi, xi+1 are either terminals or non-terminals and Γ = {(N1, Cond1, ∆1),. . ., (Nt,
Condt, ∆t)} with t ≥ 0.
The syntactic attributes of each vsymbol in the production will be left unchanged in
the final translation scheme SG. The ∆ and Γ rules will be emulated within the action
sections of SG. In order to complete the mapping we need to introduce new non-terminals,
productions and actions within SG to simulate the behavior of each sequence of relations
Rdriveri and Rtesteri.
The conversion of XPG into SG is accomplished through the four mapping rules given
below, which are applied to the productions of XPG to derive the set of productions and
semantic actions of SG. In them we refer to Dp as the dictionary storing the vsymbols.
Moreover, the functions Fetch Vsymbol and Test have the same behavior as the
corresponding functions defined in subsection 3.1.4, with the only difference that they ignore
some non-terminals when accessing the stack. In particular, they ignore non-terminals
added during the conversion process, which did not belong to XPG.
The four mapping rules follow:
Rule 1. Replace each sequence of driver relations Rdriveri with a new unique non-terminal
DRki. Furthermore, build an empty production on DRki with an action emulating
the fetching of the next vsymbol to parse. Such an action calls the function
Fetch Vsymbol on arguments Rdriveri and xi+1, where xi+1 is the vsymbol following
Rdriveri in the XPG production. When DRki is reduced, the action retrieves the
next vsymbol to be processed from Dp. In particular, the added production is:
DRki → ε { ip = Fetch Vsymbol(Rdriveri, xi+1);
           if ip is not null then next vsymbol = Dp[ip]
           else {emit “syntax error”; exit;} }
Rule 2. Replace each sequence of non-empty tester relations Rtesteri with a sequence
formed by a new unique non-terminal TRki followed by a fictitious unique terminal
aki. Such a sequence must be placed after the vsymbol xi+1 following Rtesteri in the
XPG production. Moreover, introduce a new empty production for TRki with an
action emulating the tester relations. In particular, the action invokes the function
Test for each relation RELh in Rtesteri to verify whether RELh holds between xi+1
and the previously scanned vsymbols. If Test always returns true, then the fictitious
terminal aki is returned as the next vsymbol to be processed. The successful parsing
of aki signals the correct recognition of xi+1.
The following productions are the result of applying rules 1 and 2 to Rdriveri and Rtesteri
in the XPG production 4.1.
A → α′ xi DRki xi+1 TRki aki β′ { ∆;
                                  for j = 1 to t do
                                    if Condj is true then { insert(Dp, Nj);
                                                            ∆j;
                                                          } }
DRki → ε { ip = Fetch Vsymbol(Rdriveri, xi+1);
           if ip is not null then next vsymbol = Dp[ip]
           else {emit “syntax error”; exit;} }
TRki → ε { if Test(RELh, xi+1) is true for each RELh in Rtesteri
           then next vsymbol = aki
           else {emit “syntax error”; exit;} }
Rule 3. Add the following two productions to SG in order to calculate the first vsymbol
to be processed, and to verify that all the vsymbols in the input sentence have been
processed:
S’ → SP S { ip = Fetch Vsymbol(end);
            if ip is not null then {emit “syntax error”; exit;}
            else {emit “parsing ok”; exit;}
          }
SP → ε { ip = Fetch Vsymbol(start);
         if ip is not null then next vsymbol = Dp[ip]
         else {emit “syntax error”; exit;} }
Here, S is the starting vsymbol of XPG, whereas S’ is the starting vsymbol of SG.
The following rule aims to reduce the number of productions and non-terminals in SG, so
that the corresponding parser will have a reduced number of states.
Rule 4. Merge empty productions with identical actions to form a single production.
This entails the elimination of the non-terminals on the LHSs of merged productions,
and the introduction of a new non-terminal as the LHS of the resulting production.
Moreover, merge empty productions having the same parameters in the Test function
into a single production. This entails that the LHSs and the fictitious terminals
of merged productions need to be replaced by a single non-terminal and a single
fictitious terminal for the resulting production.
This renaming process needs to be propagated to all the productions referring to the
renamed vsymbols.
The application of these mapping rules to an extended positional grammar XPG without
empty productions produces a translation scheme SG in which the productions have two
possible formats:
1. B → y1 A1 y2 A2 . . . An−1 yn with n ≥ 1
where B is a non-terminal from XPG, each Ai is either a DR or a TR non-terminal
vsymbol, and each yi is either a terminal or a non-terminal vsymbol from XPG or
a unique fictitious terminal. Moreover a TR vsymbol can only be followed by a
fictitious terminal.
2. A → ε
where A is either a DR or a TR non-terminal vsymbol.
In the following, productions of type 1 will be referred to as ordinary productions, and
productions of type 2 as DR or TR productions depending on whether A is of type DR or
TR.
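The effect of Rules 1, 2 and 4 on the right-hand side of a production can be sketched as follows. The encoding is a hypothetical one chosen for this illustration: a right-hand side is a list of (driver, tester, symbol) triples, relation labels are opaque strings (including the label "not 1_2", which is an assumption), and the generated names DRk, TRk and ak are numbered in order of creation. Semantic actions are not generated.

```python
class Names:
    """Hands out DR, TR and fictitious-terminal names.

    The same name is reused for identical (relation, symbol) arguments,
    which approximates the merging described in Rule 4.
    """
    def __init__(self):
        self.dr = {}
        self.tr = {}

    def driver(self, relation, symbol):
        return self.dr.setdefault((relation, symbol), f"DR{len(self.dr) + 1}")

    def tester(self, relation, symbol):
        k = len(self.tr) + 1
        return self.tr.setdefault((relation, symbol), (f"TR{k}", f"a{k}"))

def map_rhs(rhs, names):
    """Rewrite an XPG right-hand side into the right-hand side used in SG."""
    out = []
    for driver, tester, symbol in rhs:
        if driver is not None:
            out.append(names.driver(driver, symbol))   # Rule 1: DR before the vsymbol
        out.append(symbol)
        if tester is not None:
            out.extend(names.tester(tester, symbol))   # Rule 2: TR and a after the vsymbol
    return out

names = Names()
outgoing = [(None, None, "Graph'"), ("1_1", "not 1_2", "EDGE"), ("2_1", None, "Node")]
self_edge = [(None, None, "Graph'"), ("1_1", "1_2", "EDGE")]
mapped_outgoing = map_rhs(outgoing, names)
mapped_self_edge = map_rhs(self_edge, names)
```

Both mapped right-hand sides share the same DR non-terminal for the pair (1_1, EDGE), since the two empty productions would carry identical Fetch Vsymbol actions, and each result has the shape of an ordinary production.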
Example 4.1 Given the XPG=((T, N∪POS, P, S), PE) for state transition diagrams
shown in example 2.3, the application of mapping rules 1-4 yields the translation scheme
SG=(T’, N’, P’, S’), where T’=T∪{A1, A2, A3}, N’=N∪{S’, SP, r1_1, r2_1, r1_2, r1_1b,
tn1_2, t1_2, tn1_1, rany} and P’ is the set of productions with actions described in the
following.
S’ → SP StateTD { … }
SP → ε { … }
StateTD → Graph { … }
Graph → NODEI { … }
Graph → NODEIF { … }
Graph → Graph’ r1_1 EDGE tn1_2 A1 r2_1 Node { … }
r1_1 → ε { … }
tn1_2 → ε { … }
r2_1 → ε { … }
Graph → Graph’ r1_1 EDGE t1_2 A2 { … }
t1_2 → ε { … }
Graph → Graph’ r1_2 EDGE tn1_1 A3 r1_1b Node { … }
r1_2 → ε { … }
tn1_1 → ε { … }
r1_1b → ε { … }
Graph → Graph’ rany PLACEHOLD { … }
rany → ε { … }
Node → NODEG { … }
Node → NODEF { … }
Node → PLACEHOLD { … }
4.2 Comparing the recognized languages
In this section we compare L(SG) and L(XPG) and analyze the circumstances under which
they are equivalent. In particular, we prove that the grammar G(SG) is LR(0) iff XPG is
XpLR(0), and that if G(SG) is LR(0) then L(SG) is equivalent to L(XPG).
Let XPG = ((N, T ∪ POS, S, P), PE) be an extended positional grammar, and let
Rd and Rt be sets of sequences of relations from POS. Sequences in Rd represent driver
relations, whereas those in Rt represent tester relations. Moreover, let SG = map(XPG)
and G’ = G(SG) = (N’, T’, S’, P’). From the mapping rules 1-4 seen above we know that
N ⊆ N’ and T ⊆ T’. In particular, N’ = N ∪ DR ∪ TR and T’ = T ∪ FICT ∪ {ε}, where
DR is the set of non-terminal vsymbols introduced in G’ by Rule 1, while TR and FICT are,
respectively, the sets of non-terminal and fictitious terminal vsymbols introduced in G’ by Rule 2.
Furthermore, the following regular expressions will also be used throughout the proofs:
N is the regular expression denoting the set N of non terminals in XPG;
T is the regular expression denoting the set T of terminals in XPG;
Rd is the regular expression denoting the set Rd of sequences of driver relations;
Rt is the regular expression denoting the set Rt of sequences of tester relations;
DR is the regular expression denoting the set DR of non-terminals resulting from Rules
1 and 4;
TR is the regular expression denoting the set TR of non-terminals resulting from Rules
2 and 4;
a is the regular expression denoting the set FICT of fictitious terminals resulting from
Rules 2 and 4;
x = (N | T) denotes the set of grammar vsymbols from XPG;
PREF = x(〈Rd, Rt?〉 x)∗ denotes the set of non empty prefixes of the right hand side
of a production in XPG;
SUFF = (〈Rd, Rt?〉 x)∗ denotes the set of suffixes of the right hand side of a production
in XPG;
PREF’ = x (DR x (TR a)?)∗ denotes the set of prefixes of the right hand side of a
production in G’;
SUFF’ = (DR x (TR a)?)∗ denotes the set of suffixes of the right hand side of a pro-
duction in G’.
If r is a regular expression we will use the standard notation L(r) to refer to the language
defined by r.
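As an illustration only, the expressions PREF’ and SUFF’ can be tried out with an ordinary regular-expression engine. The sketch below assumes a one-letter encoding that is not part of the original text: x stands for any vsymbol of XPG, D for a DR non-terminal, T for a TR non-terminal, and a for a fictitious terminal. The helper in_language is likewise hypothetical.

```python
import re

# One-letter encoding (an assumption of this sketch, not the thesis notation):
#   x = vsymbol of XPG, D = DR non-terminal,
#   T = TR non-terminal, a = fictitious terminal
PREF_PRIME = re.compile(r"x(Dx(Ta)?)*")   # PREF' = x (DR x (TR a)?)*
SUFF_PRIME = re.compile(r"(Dx(Ta)?)*")    # SUFF' = (DR x (TR a)?)*


def in_language(pattern, word):
    """Return True when the whole encoded word belongs to L(pattern)."""
    return pattern.fullmatch(word) is not None
```

For instance, the encoded string "xDxTaDx" belongs to L(PREF’), whereas "xTa" does not, because a TR non-terminal may only follow a vsymbol that was itself preceded by a DR non-terminal.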
In the following we define a correspondence between the set of items constructed from
an XPG by using the algorithms of section 3.2 and the set of items constructed from the
grammar G’.
Definition 4.1 (map-equivalence)
Let I be a set of XpLR(0) items derived from XPG and I’ a set of LR(0) items derived
from G’. The sets I and I’ are map-equivalent iff
1. the number of kernel items is the same, and
2. for each kernel item A → α′ · β′ in I’ there exists a kernel item A → α · β in I such
that:
(a) the production A → α′ β′ in G’ is derived by the application of the Rules 1-4
to the production A → α β in XPG, and
(b) α′ is the result of the translation process applied to α, whereas β′ is the result of
the translation process applied to β. From now on, whenever we use a Greek letter
α to denote a generic sequence on the RHS of some production in XPG, we
denote by α′ the sequence obtained by applying the translation process
to α.
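The correspondence between α and α′ in condition (b) can be sketched as follows. This is a minimal sketch under stated assumptions: it only reproduces the shape of the translated sequences used in this section (a pair 〈Rdriver, Rtester〉 x becomes DR x TR a, and a pair with an empty tester becomes DR x). The position-based names DR1, TR1, a1 and the functions translate and map_item are hypothetical; how Rules 1-4 share symbols across productions is not modelled.

```python
def translate(rhs):
    """Translate an XPG right-hand side (or a prefix of it) into G' symbols.

    rhs[0] is a vsymbol name; every later element is a triple
    (driver, tester, vsymbol), where tester is None when the tester
    sequence is empty. Symbol names are position-based placeholders.
    """
    if not rhs:
        return []
    out = [rhs[0]]
    for pos, (_driver, tester, vsymbol) in enumerate(rhs[1:], start=1):
        out += [f"DR{pos}", vsymbol]            # driver part: DR x
        if tester is not None:
            out += [f"TR{pos}", f"a{pos}"]      # tester part: TR a
    return out


def map_item(rhs, dot):
    """Return (alpha', beta') for the item with `dot` vsymbols before the dot."""
    full = translate(rhs)
    alpha = translate(rhs[:dot])
    return alpha, full[len(alpha):]


# Hypothetical production body: A <r1, t1> b <r2, > C
RHS = ["A", ("r1", "t1", "b"), ("r2", None, "C")]
```

With this encoding, map_item(RHS, 2) places the dot after the fictitious terminal a1, which matches the shape of the kernel items A → α′ · β′ used in the definition.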
An interesting property of the mapping process is given in the following proposition.
Proposition 4.1 Let I be a set of XpLR(0) items derived from an XPG and I’ a set of
LR(0) items derived from G’ and map-equivalent to I. For each shift/goto transition of
P(XPG) from I to an adjacent set of items Ix there exists a set of items Ix’ map-equivalent
to Ix that can be reached through 1, 2, or 4 consecutive transitions of P(G’) from I’.
Proof: The items derived from XPG that are subject to shift/goto transitions can be of
three types:
1. Kernel items with a non-empty sequence of tester relations following the dot.
2. Kernel items with an empty sequence of tester relations following the dot.
3. Items with a dot at the beginning of the right hand side.
Since we assume that XPG generates no conflicts, only three cases need to be
examined.
CASE 1.
I contains one or more items of type 1 with the same driver sequence, tester sequence, and
grammar vsymbol immediately following the dot. In this case there is a single transition
from the set of items
I: A → α · 〈Rdriver, Rtester〉 x β,
B → δ · 〈Rdriver, Rtester〉 x γ,
. . . . . . . . . . . . . . . . . . . . . . . . . . .
to the set of items
Ix: A → α 〈Rdriver, Rtester〉 x · β,
B → δ 〈Rdriver, Rtester〉 x · γ,
. . . . . . . . . . . . . . . . . . . . . . . . . . .
where α, δ ∈ L(PREF), Rdriver ∈ Rd, Rtester ∈ Rt, x ∈ N∪T, β, γ ∈ L(SUFF).
Given the nature of the mapping rules 1-4 above it is easy to prove that there exists a
sequence of four consecutive transitions in P(G’) starting from the set of items I’ (map-
equivalent to I):
I’: A → α′ · DRi x TRi a β′,
B → δ′ · DRi x TRi a γ′,
. . . . . . . . . . . . . . . . . . . . .
and ending in the set of items
Ix’: A → α′ DRi x TRi a · β′,
B → δ′ DRi x TRi a · γ′,
. . . . . . . . . . . . . . . . . . . . .
where α′, δ′ ∈ L(PREF’), DRi ∈ DR, TRi ∈ TR, a ∈ FICT, β′, γ′ ∈ L(SUFF’).
The execution of the actions associated with the empty productions for the non-terminals
DRi and TRi, together with the recognition of the vsymbol a, reproduces the same
effect as the XpLR parser invocations of the Fetch Vsymbol and Test algorithms on (Rdriver,
x) and (Rtester, x), respectively.
CASE 2.
As in Case 1, but here Rtester is empty. If x is the vsymbol following the dot, there might
exist items of type 3 derived by closure that happen to have x as the first vsymbol on
their RHS. Thus, in P(XPG) there is a shift/goto transition from the set of items:
I: A → α · 〈Rdriver, 〉 x β,
B → δ · 〈Rdriver, 〉 x γ,
X → σ · 〈Rdriver, Rtester〉 C τ ,
C → · x λ,
. . . . . . . . . . . . . . . . . . . . . . . .
to the set of items
Ix: A → α 〈Rdriver, 〉 x · β,
B → δ 〈Rdriver, 〉 x · γ,
C → x · λ,
. . . . . . . . . . . . . . . . . . . . . . . .
and there exists a sequence of two consecutive transitions in P(G’) starting from the set
of items I’
I’: A → α′ · DRi x β′,
B → δ′ · DRi x γ′,
X → σ′ · DRi C TRi a τ ′,
. . . . . . . . . . . . . . . . . . . . . . . .
passing through the set of items I”
I”: A → α′ DRi · x β′,
B → δ′ DRi · x γ′,
X → σ′ DRi · C TRi a τ ′,
C → · x λ′,
. . . . . . . . . . . . . . . . . . . . . . . .
and ending in an item set Ix’ map-equivalent to Ix
Ix’: A → α′ DRi x · β′
B → δ′ DRi x · γ′
C → x · λ′,
. . . . . . . . . . . . . . . . . . . . . . . .
with α′, δ′, σ′ ∈ L(PREF’), DRi ∈ DR, TRi ∈ TR, a ∈ FICT, β′, γ′, τ′, λ′ ∈ L(SUFF’).
The execution of the actions associated with the empty production for the non-terminal
DRi reproduces the same effect as the XpLR parser invocation of the Fetch Vsymbol algorithm on
(Rdriver, x).
CASE 3.
In this case I contains a certain number of items of type 3 followed by the same grammar
vsymbol x. This means that there is a shift/goto transition in P(XPG) from the set of
items
I: A → · x β,
B → · x γ,
. . . . . . . . . . . .
to the set of items
Ix: A → x · β,
B → x · γ,
. . . . . . . . . . . .
with x ∈ N ∪ T, β, γ ∈ L(SUFF), and there exists a transition in P(G’) from the set of
items
I’: A → · x β′,
B → · x γ′,
. . . . . . . . . . . .
to the set of items
Ix’: A → x · β′,
B → x · γ′,
. . . . . . . . . . . .
with β′, γ′ ∈ L(SUFF’).
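The three cases of the proof can be summarised by listing the symbols that P(G’) has to cross in order to mirror one shift/goto transition of P(XPG). The function below is only a sketch; its name and its boolean parameters are assumptions introduced for illustration, and the symbols DR, TR and a stand for whichever DRi, TRi and fictitious terminal the mapping rules introduce.

```python
def g_prime_steps(vsymbol, has_driver, has_tester):
    """Symbols crossed in P(G') to mirror one shift/goto of P(XPG) on vsymbol."""
    if not has_driver:
        if has_tester:
            raise ValueError("a tester sequence requires a driver sequence")
        return [vsymbol]                 # CASE 3: dot at the beginning of the RHS
    steps = ["DR", vsymbol]              # CASE 2: driver only, two transitions
    if has_tester:
        steps += ["TR", "a"]             # CASE 1: driver and tester, four transitions
    return steps
```

The lengths of the returned lists are 4, 2 and 1, which are exactly the numbers of consecutive transitions named in Proposition 4.1.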
In the next proposition we prove that G(map(XPG)) can only have reduce/reduce conflicts,
for any XPG.
Proposition 4.2 Let G’=G(map(XPG)). If G’ is non-LR(0) then the corresponding pars-
ing table can never present a shift/reduce conflict.
Proof: Suppose, by contradiction, that the parsing table derived from G’
has a shift/reduce conflict. For G’ to produce a shift/reduce conflict there must
exist a set of items K containing at least one complete item and one item with the dot preceding
a terminal vsymbol. The latter can be of the following three types:
1. The dot is between a TR non-terminal and a fictitious terminal
A → α′ DRki xi+1 TRki · aki β′
α′ ∈ L(PREF’), DRki ∈ DR, TRki ∈ TR, aki ∈ FICT, β′ ∈ L(SUFF’).
Since the set of items K must have been reached through a goto operation on TRki, and
a TR non-terminal has to be followed by a fictitious terminal, there cannot exist
any complete item in K. Hence, shift/reduce conflicts cannot involve an item with a dot
between a TR non-terminal and a fictitious terminal.
2. The dot is between a DR non-terminal and a terminal from XPG
A → α′ DRki · b ρ′ β′
α′ ∈ L(PREF’), b ∈ T, DRki ∈ DR, ρ′ ∈ L((TR a)?), β′ ∈ L(SUFF’).
The set of items K must have been reached by a goto operation on DRki. Moreover, a DR
non-terminal has to be followed by a vsymbol b ∈ N ∪ T. Thus, if b ∈ T there cannot
exist any complete item in K; if b ∈ N the closure on b cannot generate complete items
in K because no empty ordinary productions are allowed. Hence, shift/reduce conflicts
cannot involve an item with a dot between a DR non-terminal and a terminal.
3. The dot is at the beginning of the right hand side of a non-empty production
A → · b α′
b ∈ T, α′ ∈ L(SUFF’)
Thus, the set of items K should contain the following item:
B → δ′ DRki · Y ρ′ γ′
δ′ ∈ L(PREF’), Y ∈ N, Y ∗⇒ Aλ′, λ′, γ′ ∈ L(SUFF’), ρ′ ∈ L((TR a)?).
This corresponds to case 2. Hence, shift/reduce conflicts cannot involve an item with a
dot at the beginning of the right hand side of a non empty production.
This proves that G’ can never produce an LR(0) parsing table with shift/reduce
conflicts, regardless of whether XPG is XpLR(0).
Now we consider the case in which the translation scheme SG obtained from an XPG by
applying the mapping rules 1-4 is LR(0). We prove that G(SG) is LR(0) if and only if
XPG is XpLR(0).
Theorem 4.1 Let G’ = G(map(XPG)). G’ is LR(0) iff XPG is XpLR(0).
Proof: ⇒) Assume that G’ is LR(0); we need to prove that XPG is XpLR(0). We argue by
contradiction: suppose that XPG is not XpLR(0). Then its XpLR(0) parsing table has at
least one of the following types of conflicts: shift/shift case, goto/goto case,
shift/shift othercase, goto/goto othercase, or positional conflicts. In the following we
analyze each XpLR(0) conflict, identify the XpLR(0) items raising it, and use
Proposition 4.1 to prove that there must exist a corresponding set of LR(0) items
generated from G’ yielding a conflict in the associated LR(0) parsing table, so that G’
is not LR(0).
Shift/Shift Case (Goto/Goto Case, respectively)
There is only one way for an XpLR(0) set of items K to present a shift/shift case conflict
(goto/goto case conflict, resp.). The set of items K must contain at least two kernel items
k1 and k2 with the dot preceding a terminal (non-terminal, respectively) vsymbol. The
sequences of driver relations right after the dot must be equal, whereas the sequences of
tester relations must not be mutually exclusive to have a conflict. Thus, K contains the
following items:
K: A → α · 〈Rdriveri , Rtesteri〉 xi+1 β, (k1)
B → γ · 〈Rdriveri , R’testeri〉 xi+1 δ, (k2)
. . . . . . . . . . . . . . . . . . . . . . . . . . .
with α, γ ∈ L(PREF), Rdriveri ∈ Rd, Rtesteri , R’testeri ∈ Rt, β, δ ∈ L(SUFF), xi+1 ∈ T
(xi+1 ∈ N, respectively).
Then, the map-equivalent set of items K’ derived from G’ will contain the following items:
K’: A → α’ · DRi xi+1 TRi ai β’, (k1’)
B → γ’ · DRi xi+1 TR’i a’i δ’, (k2’)
DRi → ·
. . . . . . . . . . . . . . . . . . . . . . . . . . .
with α’, γ’ ∈ L(PREF’), DRi ∈ DR, TRi, TR’i ∈ TR, ai, a’i ∈ FICT, β’, δ’ ∈ L(SUFF’).
By executing the goto operation twice, first on DRi and then on xi+1, we reach a set of items
I’ containing the following items:
I’: A → α’ DRi xi+1 · TRi ai β’, (i1’)
B → γ’ DRi xi+1 · TR’i a’i δ’, (i2’)
TRi → ·
TR’i → ·
. . . . . . . . . . . . . . . . . . . . . . . . . . .
which presents a reduce/reduce conflict involving two different TR productions. As a con-
sequence, if the parsing table on XPG contains a shift/shift case conflict or a goto/goto
case conflict, then the parsing table built on G’ must contain a reduce/reduce conflict.
This leads to a contradiction since the hypothesis states that G’ is LR(0).
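The reduce/reduce conflict described above can be reproduced on a toy grammar with a textbook LR(0) item-set construction. The grammar below is a hypothetical instance of the shape A → α′ DRi xi+1 TRi ai β′ with two distinct TR non-terminals after the same DR and the same vsymbol; it is not taken from the thesis, and the code is a minimal sketch of the standard closure/goto construction rather than the XpLR or VLCC implementation.

```python
from collections import deque

# Toy G'-like grammar: (head, body) pairs; an empty body is an empty production.
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("DR1", "x", "TR1", "a1")),
    ("S", ("DR1", "x", "TR2", "a2")),
    ("DR1", ()),
    ("TR1", ()),
    ("TR2", ()),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def closure(items):
    """LR(0) closure of a set of items (production index, dot position)."""
    result, work = set(items), deque(items)
    while work:
        prod, dot = work.popleft()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for idx, (head, _) in enumerate(GRAMMAR):
                if head == body[dot] and (idx, 0) not in result:
                    result.add((idx, 0))
                    work.append((idx, 0))
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)


def item_sets():
    """All item sets reachable from the closure of the start item."""
    start = closure({(0, 0)})
    seen, work = {start}, deque([start])
    while work:
        state = work.popleft()
        symbols = {GRAMMAR[p][1][d] for p, d in state if d < len(GRAMMAR[p][1])}
        for sym in symbols:
            nxt = goto(state, sym)
            if nxt not in seen:
                seen.add(nxt)
                work.append(nxt)
    return seen


def conflicts(state):
    """LR(0) conflicts of one item set, reported with the heads of complete items."""
    complete = tuple(sorted(GRAMMAR[p][0] for p, d in state
                            if d == len(GRAMMAR[p][1])))
    shifts = any(d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] not in NONTERMINALS
                 for p, d in state)
    found = []
    if len(complete) > 1:
        found.append(("reduce/reduce", complete))
    if complete and shifts:
        found.append(("shift/reduce", complete))
    return found


ALL_CONFLICTS = [c for state in item_sets() for c in conflicts(state)]
```

On this grammar the only conflict found is a reduce/reduce conflict between the empty productions of TR1 and TR2, and no shift/reduce conflict arises. Replacing the second S production by ("S", ("DR1", "x")) instead produces a reduce/reduce conflict between an ordinary production and the TR1 production.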
Shift/Shift Othercase (Goto/Goto Othercase, respectively)
There are two ways for an XpLR(0) set of items K to present a shift/shift othercase con-
flict (goto/goto othercase conflict, resp.).
CASE 1
In the first case, the set of items K must contain two kernel items k1 and k2 with the
dot preceding a terminal (non-terminal, respectively) vsymbol. The two items must have
equal sequences of driver relations right after the dot, and exactly one of the two must
have an empty sequence of tester relations right after the dot.
K: A → α · 〈Rdriveri , Rtesteri〉 xi+1 β, (k1)
B → γ · 〈Rdriveri , 〉 xi+1 δ, (k2)
. . . . . . . . . . . . . . . . . . . . . . . . . . .
where α, γ ∈ L(PREF), Rdriveri ∈ Rd, Rtesteri ∈ Rt, β, δ ∈ L(SUFF), xi+1 ∈ T (xi+1 ∈ N,
respectively).
Then, the map-equivalent set of items K’ derived from G’ will contain the following items:
K’: A → α’ · DRi xi+1 TRi ai β’, (k1’)
B → γ’ · DRi xi+1 δ’, (k2’)
DRi → ·
. . . . . . . . . . . . . . . . . . . . . . . . . . .
where α’, γ’ ∈ L(PREF’), DRi ∈ DR, TRi ∈ TR, ai ∈ FICT, β’, δ’ ∈ L(SUFF’).
By executing the goto operation twice, on DRi and xi+1 successively, we reach a set of
items I’ containing the following items:
I’: A → α’ DRi xi+1 · TRi ai β’, (i1’)
B → γ’ DRi xi+1 · δ’, (i2’)
TRi → ·
. . . . . . . . . . . . . . . . . . . . . . . . . . .
If δ’ is empty, then I’ contains a reduce/reduce conflict between the TRi production
and the item i2’. Otherwise δ’ starts with the vsymbol DRi+1, so that the
item “DRi+1 → ·” must also have been added to I’ by closure. In this case there is a
reduce/reduce conflict involving the DR and the TR productions.
CASE 2
In the second case, the set of items K must contain a kernel item k1 and a nonkernel item
k2, both with the dot preceding a terminal (non terminal, respectively) vsymbol.
K: A → α · 〈Rdriveri , Rtesteri〉 xi+1 β, (k1)
B → · xi+1 δ, (k2)
. . . . . . . . . . . . . . . . . . . . . . . . . . .
where α ∈ L(PREF), Rdriveri ∈ Rd, Rtesteri ∈ Rt, β, δ ∈ L(SUFF), xi+1 ∈ T (xi+1 ∈ N,
respectively).
In this case K must also contain at least one other kernel item k0, from which k2 is derived
by closure. The two kernel items k0 and k1 must have equal sequences of driver relations
right after the dot, otherwise there would be no conflict.
X → σ · 〈Rdriveri , ρ〉 Y γ, (k0)
where σ ∈ L(PREF), Y ∈ N, Y ∗⇐ Bλ, λ ∈ L(SUFF), Rdriveri ∈ Rd, ρ ∈ L(Rt?), γ ∈ L(SUFF).
Then, the map-equivalent set of items K’ derived from G’ will contain the following items:
K’: X → σ’ · DRi Y ρ’ γ’, (k0’)
A → α’ · DRi xi+1 TRi ai β’, (k1’)
DRi → ·
where σ’, α’ ∈ L(PREF’), DRi ∈ DR, TRi ∈ TR, ρ’ ∈ L((TR a)?), ai ∈ FICT, β’, γ’ ∈ L(SUFF’).
By executing the goto operation on DRi we reach a set of items J’ containing the following
items:
J’: X → σ’ DRi · Y ρ’ γ’, (j0’)
A → α’ DRi · xi+1 TRi ai β’, (j1’)
B → · xi+1 δ’, (j2’)
Notice the presence of the item B → · xi+1 δ’, which is generated by closure
from Y. In fact, we know from the hypothesis that Y ∗⇐ Bλ with λ ∈ L(SUFF). This
means that there exists in XPG a (possibly empty) sequence of non-terminal vsymbols B1,
B2, . . ., Bn such that:
Y ⇐ B1λ1 ⇐ B2λ2 ⇐ . . . ⇐ Bnλn, with λ1, λ2, . . ., λn ∈ L(SUFF),
and there exist productions Bn → Bλn+1 and B → xi+1 δ, with λn+1, δ ∈ L(SUFF).
According to the mapping rules, this means that there must be a similar derivation generated
by G’:
Y ⇒ B1λ1’ ⇒ B2λ2’ ⇒ . . . ⇒ Bnλn’, with λ1’, λ2’, . . ., λn’∈ L(SUFF’),
and productions Bn → B λn+1’, and B → xi+1 δ’, with λn+1’, δ’ ∈ L(SUFF’), so that
when the dot precedes the string Y ρ’ γ’ we can derive the item B → · xi+1 δ’ by closure
through the n+2 productions seen above.
By executing the goto operation on xi+1 we reach a set of items I’ containing the following
items:
I’: A → α’ DRi xi+1 · TRi ai β’, (i1)
B → xi+1 · δ’, (i2)
TRi → ·
Again, if δ’ is empty then I’ contains a reduce/reduce conflict between the TRi production and
the item i2. Otherwise δ’ starts with the vsymbol DRi+1, so that the item “DRi+1 → ·”
must also have been added to I’ by closure. Thus, also in this case there is a reduce/reduce
conflict involving the DR and the TR productions.
Positional conflicts
There is only one way for an XpLR(0) set of items K to present a positional conflict. The
set of items K must contain at least two incomplete kernel items k1 and k2 having equal
sequences of driver relations and different vsymbols following the dot. No constraints are
imposed on the sequences of tester relations. Thus, K must contain the following items:
K: A → α · 〈Rdriveri , ρ〉 xi+1 β, (k1)
B → γ · 〈Rdriveri , φ〉 yi+1 δ, (k2)
where α, γ ∈ L(PREF), Rdriveri ∈ Rd, ρ, φ ∈ L(Rt?), β, δ ∈ L(SUFF), xi+1, yi+1 ∈ T
(xi+1, yi+1 ∈ N, respectively).
The map-equivalent set of items K’ derived from G’ will then contain the following items:
K’: A → α’ · DRi xi+1 ρ’ β’, (k1’)
B → γ’ · DR’i yi+1 φ’ δ’, (k2’)
DRi → ·
DR’i → ·
where α’, γ’ ∈ L(PREF’), DRi, DR’i ∈ DR, ρ’, φ’ ∈ L((TR a)?), β’, δ’ ∈ L(SUFF’),
with a reduce/reduce conflict involving two different DR productions. Again, this contradicts
the hypothesis.
Thus, we can conclude that if G’ is an LR(0) grammar obtained through the applica-
tion of the mapping rules 1-4 to an extended positional grammar XPG, then we can state
that XPG is an XpLR(0) grammar.
⇐) Let XPG be an XpLR(0) grammar. We need to prove that the grammar G’=G(map(XPG))
is LR(0). We prove this by assuming that G’ is not LR(0) and showing that such
a hypothesis leads to a contradiction. For G’ to be non-LR(0), its parsing table
must contain at least one shift/reduce or reduce/reduce conflict. By Proposition 4.2,
the parsing table derived from G’ can never present a shift/reduce conflict; thus, in the
following we show that each type of reduce/reduce conflict leads to a particular
conflict in the parsing table derived from XPG.
We distinguish different types of reduce/reduce conflicts produced by G’ depending
on the types of productions involved. Recall that G’ contains three types of
productions, namely ordinary, TR and DR productions. Therefore, the possible
types of reduce/reduce conflicts caused by G’ are given by the six pairwise combinations of
them.
Fig. 4.2 summarizes all the correspondences between conflicts caused by G’ and XPG
that we intend to prove. As an example, the first row of the table can be read as follows:
“a reduce/reduce conflict involving an ordinary production and a DR production in G’ implies a
shift/reduce conflict caused by XPG”. The correctness of this table implies that if XPG
is an XpLR(0) grammar, then the grammar G’ is an LR(0) grammar. In what follows we
prove the six cases one by one.
Type of reduce/reduce conflict caused by G’    Type of conflict caused by XPG
(ordinary, DR)                                 shift/reduce
(ordinary, ordinary)                           reduce/reduce
(TR, TR)                                       shift/shift or goto/goto case
(TR, ordinary)                                 shift/shift or goto/goto othercase
(TR, DR)                                       shift/shift or goto/goto othercase
(DR, DR)                                       positional
Figure 4.2: Correspondence between conflicts caused by G’ and XPG.
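For reference, the correspondence of Figure 4.2 can be written as a small lookup table. The dictionary and function names below are hypothetical; the entries are exactly the six rows of the figure, keyed by the unordered pair of production types involved in the reduce/reduce conflict of G’.

```python
# Figure 4.2 as a lookup table; keys are unordered pairs of production types.
CORRESPONDENCE = {
    frozenset({"ordinary", "DR"}): "shift/reduce",
    frozenset({"ordinary"}): "reduce/reduce",               # (ordinary, ordinary)
    frozenset({"TR"}): "shift/shift or goto/goto case",     # (TR, TR)
    frozenset({"TR", "ordinary"}): "shift/shift or goto/goto othercase",
    frozenset({"TR", "DR"}): "shift/shift or goto/goto othercase",
    frozenset({"DR"}): "positional",                        # (DR, DR)
}


def xpg_conflict(first, second):
    """XPG conflict implied by a reduce/reduce conflict of G' on (first, second)."""
    return CORRESPONDENCE[frozenset({first, second})]
```

Because the keys are unordered, xpg_conflict("DR", "ordinary") and xpg_conflict("ordinary", "DR") return the same entry, mirroring the fact that the figure lists pairwise combinations.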
1. (ORDINARY, DR)
Since XPG does not contain empty productions, this case occurs when G’ generates a set
of items I’ that contains at least one complete item like i1, and at least one item like i2
with the dot preceding a DR vsymbol on the RHS:
I’: A → α′ · (i1)
B → β′ · DRi xi+1 ρ′ γ′ (i2)
DRi → ·
. . . . . . . . . . . . . . . . . . . . . . . . . . .
xi+1 ∈ N ∪ T, α′, β′ ∈ L(PREF’), ρ′ ∈ L((TR a)?), γ′ ∈ L(SUFF’)
There must exist a map-equivalent set of items I generated from XPG containing the fol-
lowing items:
I: A → α ·
B → β · 〈Rdriveri , ρ〉 xi+1 γ
. . . . . . . . . . . . . . . . . . . . . . . . . . .
α, β ∈ L(PREF), Rdriveri ∈ Rd, ρ ∈ L(Rt?), γ ∈ L(SUFF).
Hence, there should have been a shift/reduce conflict generated by XPG, which would
contradict the hypothesis.
2. (ORDINARY, ORDINARY)
This case occurs when the grammar G’ generates a set of items I’ containing two or more
complete items. As an example, let us suppose that I’ contains two complete items i1
and i2. They should be terminated by the same vsymbol, because it is the last scanned
vsymbol in both of them and from the hypothesis the two items are not empty.
I’: A → α′ xi · (i1)
B → β′ xi · (i2)
. . . . . . . . . . . . . . . . . . . . . . . . . . .
xi ∈ N ∪ T, α′, β′ ∈ L(PREF’(DR)) ∪ {ε}.
Thus, there must exist a map-equivalent set of items I generated from XPG containing
the following items:
I: A → α xi ·
B → β xi ·
. . . . . . . . . . . . . . .
α, β ∈ L(PREF〈Rd, 〉) ∪ {ε}.
Hence, there should have been a reduce/reduce conflict generated by XPG, which would
contradict the hypothesis.
3. (TR, TR)
This case occurs when I’ contains two or more items with the dot preceding different TR
vsymbols. By following similar arguments as above, I’ will have the following structure:
I’: A → α′ DRi xi+1 · TRi ai δ′ (i1)
B → β′ DRi xi+1 · TRj aj γ′ (i2)
TRi → ·
TRj → ·
. . . . . . . . . . . . . . . . . . . . . . . . . . .
xi+1 ∈ N ∪ T, α′, β′ ∈ L(PREF’), DRi ∈ DR, TRi, TRj ∈ TR, ai, aj ∈ FICT, δ′, γ′ ∈ L(SUFF’).
Thus, there must exist the following set of items J’ generated by the grammar G’, such
that goto(J’, xi+1) = I’:
J’: A → α′ DRi · xi+1 TRi ai δ′ (j1)
B → β′ DRi · xi+1 TRj aj γ′ (j2)
. . . . . . . . . . . . . . . . . . . . . . . . . . .
Moreover, there must exist the following set of items K’ generated by the grammar G’,
such that goto(K’, DRi) = J’:
K’: A → α′ · DRi xi+1 TRi ai δ′ (k1)
B → β′ · DRi xi+1 TRj aj γ′ (k2)
DRi → ·
. . . . . . . . . . . . . . . . . . . . . . . . . . .
The map-equivalent set of items K generated from XPG contains the following conflicting
items:
K: A → α · 〈Rdriveri , Rtesteri〉 xi+1 δ
B → β · 〈Rdriveri , Rtesterj〉 xi+1 γ
. . . . . . . . . . . . . . . . . . . . . . . . . . .
α, β ∈ L(PREF), Rdriveri ∈ Rd, Rtesteri , Rtesterj ∈ Rt, δ, γ ∈ L(SUFF).
Thus, there should have been a shift/shift case (or goto/goto case) conflict generated by XPG, which would
contradict the hypothesis.
4. (TR, ORDINARY)
This case occurs when G’ generates a set of items I’ containing one or more complete items
like i1 and one or more items like i2 with the dot preceding a vsymbol TRi ∈ TR:
I’: A → α′ xi+1 · (i1)
B → β′ DRi xi+1 · TRi ai γ′ (i2)
TRi → ·
. . . . . . . . . . . . . . . . . . . . . . . . . . .
xi+1 ∈ N ∪ T, α′ ∈ L(PREF’(DRi)) ∪ {ε}, β′ ∈ L(PREF’), DRi ∈ DR, TRi ∈ TR, ai ∈ FICT, γ′ ∈ L(SUFF’).
We must distinguish two cases, according to whether α′ = ε or α′ ≠ ε.
In the first case, there must exist the following set of items J’ such that goto(J’, xi+1)
= I’:
J’: A → · xi+1 (j1)
B → β′ DRi · xi+1 TRi ai γ′ (j2)
. . . . . . . . . . . . . . . . . . . . . . . . . . .
This means that there must exist the following item in J’ from which j1 is derived by
closure:
X → σ′ DRi · Y ρ′ τ ′ (j0)
σ′ ∈ L(PREF’), DRi ∈ DR, Y ∈ N, Y ∗⇐ Aλ′, λ′, τ′ ∈ L(SUFF’), ρ′ ∈ L((TRi ai)?),
TRi ∈ TR, ai ∈ FICT.
Notice that there must be the same vsymbol DRi preceding the dot in j0 and j2. Moreover,
Y cannot be the vsymbol S following SP, otherwise DRi = SP and this is not possible since
SP can only occur once in a set of items. Thus, there must exist the following set of items
K’ such that goto(K’, DRi) = J’:
K’: X → σ′ · DRi Y ρ′ τ ′ (k0)
B → β′ · DRi xi+1 TRi ai γ′ (k2)
. . . . . . . . . . . . . . . . . . . . . . . . . . .
The map-equivalent set of items K generated from XPG contains the following items:
K: X → σ · 〈Rdriveri , ρ〉 Y τ
B → β · 〈Rdriveri , Rtesteri〉 xi+1 γ
A → · xi+1
. . . . . . . . . . . . . . . . . . . . . . . . . . .
σ, β ∈ L(PREF), Rdriveri ∈ Rd, Rtesteri ∈ Rt, ρ ∈ L(Rt?), τ, γ ∈ L(SUFF).
We can notice that the set of items K presents a shift/shift or goto/goto othercase conflict
depending on whether xi+1 is a terminal or non-terminal vsymbol, which would contradict
the hypothesis.
If α′ ≠ ε there must exist the following set of items J’ such that goto(J’, xi+1) = I’:
J’: A → α′ · xi+1 (j1)
B → β′ DRi · xi+1 TRi ai γ′ (j2)
. . . . . . . . . . . . . . . . . . . . . . . . . . .
and the following set of items K’ such that goto(K’, DRi) = J’, where α′ = α′′ DRi:
K’: A → α′′ · DRi xi+1 (k1)
B → β′ · DRi xi+1 TRi ai γ′ (k2)
. . . . . . . . . . . . . . . . . . . . . . . . . . .
Thus, the map-equivalent set of items K generated from XPG contains the following items:
K: A → α · 〈Rdriveri , 〉 xi+1
B → β · 〈Rdriveri , Rtesteri〉 xi+1 γ
. . . . . . . . . . . . . . . . . . . . . . . . . . .
α, β ∈ L(PREF), Rdriveri ∈ Rd, Rtesteri ∈ Rt, γ ∈ L(SUFF).
Thus, also in this case there is a shift/shift or goto/goto othercase conflict.
5. (TR, DR)
In this case the set of items I’ generated from G’ must contain at least an item i1 with the
dot preceding a vsymbol DRj ∈ DR and at least an item i2 with the dot preceding a
vsymbol TRi ∈ TR. By the nature of the mapping rules, I’ must have the following
structure:
I’: A → α′ xi · DRj xj ρ′ λ′ (i1)
B → β′ DRi xi · TRi ai γ′ (i2)
DRj → ·
TRi → ·
. . . . . . . . . . . . . . . . . . . . . . . . . . .
xi, xj ∈ N ∪ T, α′ ∈ L(PREF’(DRi)) ∪ {ε}, DRi, DRj ∈ DR, TRi ∈ TR, ai ∈ FICT, ρ′ ∈ L((TR a)?),
β′ ∈ L(PREF’), λ′, γ′ ∈ L(SUFF’).
Here too we must distinguish two cases, corresponding to the two alternatives α′ = ε
and α′ ≠ ε.
CASE 1.
I’: A → xi · DRj xj ρ′ λ′ (i1)
B → β′ DRi xi · TRi ai γ′ (i2)
DRj → ·
TRi → ·
. . . . . . . . . . . . . . . . . . . . . . . . . . .
There must exist the following set of items J’ such that goto(J’, xi) = I’:
J’: A → · xi DRj xj ρ′ λ′ (j1)
B → β′ DRi · xi TRi ai γ′ (j2)
. . . . . . . . . . . . . . . . . . . . . . . . . . .
This means that there must exist the following item in J’, from which j1 is derived by
closure:
X → σ′ DRi · Y ρ′ τ ′ (j0)
σ′ ∈ L(PREF’), DRi ∈ DR, Y ∈ N, Y ∗⇐ Aλ′, λ′, τ′ ∈ L(SUFF’), ρ′ ∈ L((TRi ai)?),
TRi ∈ TR, ai ∈ FICT.
Again, Y can be the vsymbol S, but not the one following SP, otherwise DRi = SP and
this is not possible since SP can only occur once in an item set. Thus, there must exist
the following set of items K’ such that goto(K’, DRi) = J’:
K’: X → σ′ · DRi Y ρ′ τ ′ (k0)
B → β′ · DRi xi TRi ai γ′ (k2)
DRi → ·
. . . . . . . . . . . . . . . . . . . . . . . . . . .
Consequently, the map-equivalent set of items K generated from XPG contains the fol-
lowing items:
K: X → σ · 〈Rdriveri , ρ〉 Y τ
B → β · 〈Rdriveri , Rtesteri〉 xi γ
A → · xi
. . . . . . . . . . . . . . . . . . . . . . . . . . .
σ, β ∈ L(PREF), Rdriveri ∈ Rd, Rtesteri ∈ Rt, ρ ∈ L(Rt?), τ, γ ∈ L(SUFF).
The derivation of the item A → · xi can be proven by similar arguments used in the
(TR,ORDINARY) case. We can notice that the set of items K presents a shift/shift or
goto/goto othercase conflict, which would contradict the hypothesis.
CASE 2.
I’: A → α′ xi · DRj xj ρ′ λ′ (i1)
B → β′ DRi xi · TRi ai γ′ (i2)
DRj → ·
TRi → ·
. . . . . . . . . . . . . . . . . . . . . . . . . . .
xi, xj ∈ N ∪ T, α′ = α′′DRi, α′′, β′ ∈ L(PREF’), DRi, DRj ∈ DR, TRi ∈ TR, ai ∈ FICT,
ρ′ ∈ L((TR a)?), λ′, γ′ ∈ L(SUFF’).
There must exist the following set of items J’ such that goto(J’, xi) = I’:
J’: A → α′ · xi DRj xj ρ′ λ′ (j1)
B → β′ DRi · xi TRi ai γ′ (j2)
. . . . . . . . . . . . . . . . . . . . . . . . . . .
and the following set of items K’ such that goto(K’, DRi) = J’:
K’: A → α′′ · DRi xi DRj xj ρ′ λ′ (k1)
B → β′ · DRi xi TRi ai γ′ (k2)
DRi → ·
. . . . . . . . . . . . . . . . . . . . . . . . . . .
Consequently, the map-equivalent set of items K generated from XPG contains the fol-
lowing items:
K: A → α · 〈Rdriveri , 〉 xi 〈Rdriverj , ρ〉 xj λ
B → β · 〈Rdriveri , Rtesteri〉 xi γ
. . . . . . . . . . . . . . . . . . . . . . . . . . .
α, β ∈ L(PREF), Rdriveri , Rdriverj ∈ Rd, Rtesteri ∈ Rt, ρ ∈ L(Rt?), λ, γ ∈ L(SUFF).
Thus, also in this case K presents a shift/shift or goto/goto othercase conflict, which would
contradict the hypothesis.
6. (DR, DR)
This case occurs when I’ contains two or more items with the dot preceding different DR
vsymbols. As an example, let us consider the following two conflicting items i1 and i2:
I’: A → α′ xi · DRj xj ρ′ λ′ (i1)
B → β′ xi · DRk xk θ′ γ′ (i2)
DRj → ·
DRk → ·
. . . . . . . . . . . . . . . . . . . . . . . . . . .
xi, xj, xk ∈ N ∪ T, α′, β′ ∈ L(PREF’(DRi)) ∪ {ε}, DRj, DRk, DRi ∈ DR, ρ′, θ′ ∈ L((TR a)?),
λ′, γ′ ∈ L(SUFF’).
The map-equivalent set of items I generated from XPG contains the following items:
I: A → α xi · 〈Rdriverj , ρ〉 xj λ
B → β xi · 〈Rdriverk , θ〉 xk γ
. . . . . . . . . . . . . . . . . . . . . . . . . . .
α, β ∈ L(PREF(〈Rdriveri , 〉)), Rdriveri , Rdriverj , Rdriverk ∈ Rd, ρ, θ ∈ L(Rt?), λ, γ ∈ L(SUFF).
Thus, there would also be a positional conflict generated from XPG, which would contra-
dict the hypothesis.
Now, we also prove that if the translation scheme SG=map(XPG) is LR(0), then P(SG)
and P(XPG) are equivalent.
Theorem 4.2 If SG=map(XPG) is LR(0) then the parser built on SG recognizes the
same set of visual sentences as the XpLR(0) parser built on XPG.
Proof: Let VS = {t1, t2,. . ., tn} be the set of terminals forming an input visual sentence.
By induction on the number of parsed terminals m (equal to n + nt, where nt is
the number of vsymbols introduced during the parsing) we prove that the two parsers
produce equivalent results after reading j ≤ m vsymbols. By equivalent results we mean
that either they both successfully scan the sub-sentence ti1 , ti2 ,. . ., tij , in the same order,
or they both reject it after reading the same vsymbol tix , 1 ≤ x ≤ j. From Proposition 4.1
P(SG) might do this by performing more transitions than P(XPG), although these do not
produce further effects, since they only simulate the effects of driver and tester relations
of XPG. Moreover, the hypothesis and Theorem 4.1 ensure that both XPG and G(SG)
are conflict free.
Induction Base (j = 1)
Let us examine the steps executed by P(XPG) when scanning VS. We know that the
starting set of items I0 derived from XPG must contain the item S’ → · S, which generates
by closure in I0 a certain number of nonkernel items of type A → · xi β, β ∈ L(SUFF),
which in turn might generate items of the same type within I0. The parser P(XPG) starts
by reading ti1 from the input. If I0 does not contain any nonkernel item of type A → · ti1 β,
then the parser returns error and VS is rejected. Otherwise, there exist k > 0 nonkernel
items of type A → · ti1 β, so that P(XPG) performs a shift to a set of items Ix containing
at least the k items of type A → ti1 · β. The parser P(SG) will have a similar behavior on
VS. In particular, the starting set of items I0’ derived from SG must contain the following
items:
I0’: S’ → · SP S,
SP → ·
with SP ∈ DR. There are no more items in I0’ generated by closure, but there exists a
set of items I1’ derived from SG which contains the item S’ → SP · S, and generates by
closure in I1’ the same number of non kernel items generated from S’ → · S in I0. In
particular, for each item in I0 of type A → · xi β, β ∈ L(SUFF), there exists in I1’ a
corresponding item A → · xi β’ , β’ ∈ L(SUFF’). Therefore, it is easy to verify that
the terminal vsymbols starting the right hand sides (RHS) of items in I1’ are the same
as those in I0. Moreover, I0’ contains no items starting with a terminal vsymbol, hence
P(SG) will necessarily reduce with SP → · and will execute the associated positioning
action to scan the vsymbol ti1. Then, it performs a transition from I0’ to I1’. If P(XPG)
rejected VS, it means that ti1 does not start any RHS of items in I0 and, from what was said
above, it cannot start any RHS of items in I1’. Thus, P(SG) also rejects VS and returns
a parse error. Vice versa, if P(XPG) scanned ti1 successfully, it means that I1’ contains
exactly k>0 items of type A → · ti1 β’ as I0. Thus, also P(SG) performs a transition to a
state Ix’ map-equivalent to Ix, containing k items of type A → ti1 · β’, and perhaps some
empty productions of type “DR → ·”.
Induction Hypothesis/Step
If the two parsers P(XPG) and P(SG) produce equivalent results after reading j < m
vsymbols, then they produce equivalent results after reading the (j+1)th vsymbol.
Obviously, if both P(XPG) and P(SG) returned a parse error there would be no (j+1)th
step. Vice versa, if they produced equivalent results reading j vsymbols, it means that they
have reached map-equivalent sets of items Ij and Ij’. We distinguish two cases according to
the different structures of Ij and Ij’.
Case 1.
Ij contains one or more kernel items like
A → α tij · 〈Rdriveri , Rtesteri〉 xi β, (k1)
where α ∈ L(PREF), Rdriveri ∈ Rd, Rtesteri ∈ Rt ∪ {ε}, β ∈ L(SUFF), xi ∈ N ∪ T.
From the induction hypothesis Ij’ contains similar kernel items like
A → α’ tij · DRi xi TRi ai β’, (k1’)
DRi → ·
with α’ ∈ L(PREF’) ∪ {ε}, DRi ∈ DR, TRi ∈ TR ∪ {ε}, ai ∈ FICT ∪ {ε}, β’ ∈ L(SUFF’).
For each xi ∈ N, xi will generate one or more items of type xi → · yi β, β ∈ L(SUFF),
by closure in Ij. The application of the transitive closure to each xi ∈ N yields a certain
number of terminal vsymbols b1, b2,. . ., bn, each appearing as the first vsymbol of one
or more RHSs of items in Ij. Thus, starting from the last scanned vsymbol tij , P(XPG)
executes the driver relations in Rdriveri in order to scan the next vsymbol tij+1 from the
input. If tij+1 does not coincide with any of the vsymbols xi ∈ T following Rdriveri in
Ij or the vsymbols b1, b2,. . ., bn, then P(XPG) returns parse error and VS is rejected.
Otherwise, P(XPG) successfully scans tij+1 and performs a shift to a state Ix. The latter
contains all the items of type X → σ tij+1 · δ, σ ∈ L(PREF) ∪ {ε}, δ ∈ L(SUFF), such
that X → σ · tij+1 δ was in Ij, plus those they generate by closure. Similarly, the same
situations will also have occurred in P(SG). In fact, from the assumption that there are no
conflicts, P(SG) can only reduce with DRi → · in Ij’, and the execution of the associated
action reproduces the same positioning effects of Rdriveri starting from the last scanned
vsymbol tij . Thus, P(SG) will have transited in a set of items Ij” containing items of
type A → α’ tij DRi · xi TRi ai β’. Thus, for each xi ∈ N, xi will generate by closure
similar items as those generated from xi in Ij, with the same set of terminal vsymbols b1,
b2,. . ., bn, starting their RHSs. Therefore, it is easy to see that if tij+1 was successfully
scanned by P(XPG), then it will also be scanned by P(SG), which will transit to a state
Ix’ map-equivalent to Ix. Conversely, if P(XPG) returned parse error then also P(SG)
returns parse error.
Case 2.
Ij contains one complete kernel item like
A → α tij · , (k2)
with α ∈ L(PREF) ∪ {ε}.
From the induction hypothesis Ij’ contains a similar complete kernel item like
A → α’ tij · , (k2’)
with α’ ∈ L(PREF’) ∪ {ε}.
This means that P(XPG) performs a reduction and will return to a set of items Ih containing
an item with the dot preceding the non terminal A:
X → σ · ρ A δ, (k3)
with σ ∈ L(PREF) ∪ {ε}, ρ ∈ L((〈Rd, Rt?〉)?), δ ∈ L(SUFF), and σ = ε ⇔ ρ = ε.
Analogously, P(SG) will reduce to A and will return to the following set of items Ih’:
X → σ’ · A δ’, (k3’)
. . . . . . . . . . . . . . .
with σ’ ∈ L(PREF’(DR)) ∪ {ε}, δ’ ∈ L(SUFF’), DR ∈ DR.
It is easy to prove that both parsers perform a goto on A, transiting to map-equivalent
states Ih+1 and Ih+1’, respectively. If δ and δ’ are not empty we run into case 1, so we can
apply the same arguments. If they are both empty and A is not S, then we are in case 2
again, so we apply the same reasoning until we run into case 1 or A becomes S. In the last
case, both check if there are vsymbols in the input which have not been examined. Since
from the inductive hypothesis both parsers have scanned the same vsymbols, in the same
order, it means that P(XPG) accepts VS if and only if P(SG) accepts VS.
In the next subsection we consider the non-LR(0) translation scheme generated through
mapping rules 1-4.
4.3 Resolving conflicts in non-LR(0) translation schemes
A grammar G’=G(map(XPG)) may not be LR(0), hence P(SG) needs conflict-solving heuristics
to preserve the equivalence between L(SG) and L(XPG). To this end, we have previously
proved that conflicts in G’ are introduced by conflicts in XPG. In particular,
we have proved that each conflict in XPG always yields one reduce/reduce conflict in G’.
This is an important property because it enables us to develop conflict-solving heuristics
for G’ that simulate the heuristics adopted on XPG (see section 3.3.1), so that L(XPG) is
still equivalent to L(G’). In this way, we can use the parsing implementation technique
presented in this thesis even in those cases when the XPG grammar is not XpLR(0).
As shown in Fig. 4.1, initially we ignore the non-LR problem and use the transformation
algorithms of section 4.1 to generate what we call an intermediate translation scheme.
Subsequently, we apply ad hoc transformation techniques to the intermediate grammar in
order to eliminate the conflicts possibly caused by the original non-XpLR grammar
XPG.
In order to devise conflict handling techniques for G’=map(XPG), we must identify the
possible reduce/reduce conflicts on the grammar G’ and modify it according to resolution
techniques preserving the property L(P(G’))=L(P(XPG)). As shown in Theorem 4.1, the
possible reduce/reduce conflicts in a set of items are given by the possible combinations
of ordinary, DR and TR productions. In the following we describe these conflicts and
the transformation techniques that we use to eliminate them from the grammar. These
techniques follow the heuristics defined in the XpLR methodology.
Case 1. Ordinary
This case occurs when the grammar G’ generates a set of items I’ containing two or more
complete items. As an example, let us suppose that I’ contains two complete items i1 and
i2.
I’: A → α′ xi · (i1)
    B → β′ xi · (i2)
    . . . . . . . . . . . . . . . . . . . . . . . . . . .
xi ∈ N ∪ T, α′, β′ ∈ L(PREF’(DR)) ∪ {ε}.
In the XpLR methodology this type of conflict is solved by choosing the conflicting production
listed first in the grammar specification. This approach can be simulated by introducing
the non-terminal ‘next1’ in the two conflicting productions, followed
by new fictitious terminals:
A → α′ xi next1 ai
B → β′ xi next1 aj
Then we introduce the empty production
next1 → ε { next_vsymbol = ak; }
where ak is ai if the production associated with (i1) precedes the production associated with
(i2) in the XPG specification, otherwise ak is aj.
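The choice of ak in Case 1 can be illustrated with a minimal Python sketch. The function name, the string encoding of productions and the terminal names a_i and a_j are hypothetical and serve only to illustrate the "first listed production wins" rule; they are not part of the XpLR implementation.

```python
def pick_fictitious_terminal(conflicting, production_order):
    """Return the fictitious terminal of the conflicting production listed first.

    conflicting: dict mapping a production (hypothetical string encoding)
                 to the fictitious terminal that follows next1 in it.
    production_order: productions in the order of the XPG specification.
    """
    first = min(conflicting, key=production_order.index)
    return conflicting[first]


# Hypothetical specification order: the production of (i1) precedes that of (i2).
order = ["A -> alpha x", "B -> beta x"]
next_vsymbol = pick_fictitious_terminal(
    {"B -> beta x": "a_j", "A -> alpha x": "a_i"}, order
)
print(next_vsymbol)
```

Since the order is fixed by the grammar specification, the value of ak can be computed once, when the translation scheme is generated.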
Case 2. DR
This case occurs when I’ contains two or more items with the dot preceding different DR
vsymbols. As an example, let us consider the following two conflicting items i1 and i2:
I’: A → α′ xi · DRj xj ρ′ λ′ (i1)
    B → β′ xi · DRk xk θ′ γ′ (i2)
    DRj → ·
    DRk → ·
    . . . . . . . . . . . . . . . . . . . . . . . . . . .
xi, xj, xk ∈ N ∪ T, α′, β′ ∈ L(PREF’(DRi)) ∪ {ε}, DRj, DRk, DRi ∈ DR, ρ′, θ′ ∈ L((TR a)?),
λ′, γ′ ∈ L(SUFF’).
In the XpLR methodology the set of items is partitioned according to the driver relations.
This establishes an evaluation order of the driver relations. We can resolve this conflict
by introducing the non-terminal ‘next1’ in the two conflicting productions:
A → α′ xi next1 xj ρ′ λ′
B → β′ xi next1 xk θ′ γ′
and this empty production:
next1 → ε { let Rseq be the ordered sequence of parameters of Fetch_Vsymbol
            in the conflicting DR productions;
            do { let R be the first element in Rseq;
                 ip = Fetch_Vsymbol(R);
                 if ip is not null then next_vsymbol = Dp[ip];
                 else delete R from Rseq;
            } while (ip is null and Rseq is not empty);
            if (ip is null) { emit “syntax error”; exit; } }
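The loop above can be sketched in Python as follows. This is only an illustration under stated assumptions: fetch_vsymbol stands in for Fetch_Vsymbol and is assumed to return an index into the dictionary dp, or None when no vsymbol is reachable through the given driver relation. The relation names and dictionary contents are made up for the demo.

```python
class VisualSyntaxError(Exception):
    """Raised when no driver relation leads to a vsymbol (hypothetical name)."""


def resolve_driver_conflict(rseq, fetch_vsymbol, dp):
    """Try the driver relations in order and return the first vsymbol reached."""
    for relation in rseq:
        ip = fetch_vsymbol(relation)
        if ip is not None:
            return dp[ip]
    raise VisualSyntaxError("syntax error")


# Demo with made-up data: only the second relation reaches a vsymbol.
dp = ["NODEI", "EDGE", "NODEG"]
reachable = {"2_1": 1}
print(resolve_driver_conflict(["1_1", "2_1"], reachable.get, dp))
```

Iterating over the sequence is equivalent to repeatedly deleting the first element of Rseq, because each relation is tried at most once.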
Case 3. TR
This case occurs when I’ contains two or more items with the dot preceding different TR
vsymbols. Thus, I’ will have the following structure:
I’: A → α′ DRi xi+1 · TRi ai δ′ (i1)
    B → β′ DRi xi+1 · TRj aj γ′ (i2)
    TRi → ·
    TRj → ·
    . . . . . . . . . . . . . . . . . . . . . . . . . . .
xi+1 ∈ N ∪ T, α′, β′ ∈ L(PREF’), DRi ∈ DR, TRi, TRj ∈ TR, ai, aj ∈ FICT, δ′, γ′ ∈ L(SUFF’).
This conflict is generated from a shift/shift or goto/goto case conflict in the XPG grammar.
The heuristic used by the XpLR methodology to eliminate these ambiguities is to
order the tester relations and to execute the first shift or goto whose condition is true.
This heuristic can be simulated by introducing a new non-terminal ‘next1’ and by defining
it through the following empty production:
next1 → ε { let Rseq be the ordered sequence of tester relations that are parameters of
            Test in the conflicting TR productions;
            do { let R be the first element in Rseq;
                 if Test(RELh, xi+1) is true for each RELh in R
                 then { next_vsymbol = ak; // where ak is the fictitious vsymbol following R
                        exit; }
                 else delete R from Rseq;
            } while (Rseq is not empty);
            if (Rseq is empty) { emit “syntax error”; exit; } }
Finally we introduce the non-terminal ’next1’ in the two conflicting productions:
A → α′ DRi xi+1 next1 ai δ′
B → β′ DRi xi+1 next1 aj γ′
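The ordered evaluation of tester relations in Case 3 can be sketched in the same style. Here test stands in for Test, and each element of rseq pairs a list of tester relations with the fictitious vsymbol that follows it. All names and data are hypothetical.

```python
def resolve_tester_conflict(rseq, test, x):
    """Return the fictitious vsymbol of the first tester whose relations all hold.

    rseq: ordered list of (relations, fictitious_vsymbol) pairs.
    test: predicate test(relation, x), a stand-in for Test(RELh, xi+1).
    """
    for relations, fictitious in rseq:
        if all(test(rel, x) for rel in relations):
            return fictitious
    raise ValueError("syntax error")


# Demo with made-up data: only relation "1_2" holds on the vsymbol "EDGE".
holds = {("1_2", "EDGE")}
rseq = [(["1_1"], "a_i"), (["1_2"], "a_j")]
print(resolve_tester_conflict(rseq, lambda rel, x: (rel, x) in holds, "EDGE"))
```

Note that an empty list of relations is trivially satisfied in this sketch, so a production without tester conditions should be placed last in the sequence.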
Case 4. (ordinary, DR)
This case occurs when G’ generates a set of items I’ that contains at least one complete
item, and at least one item with the dot preceding a DR vsymbol on the RHS:
I’: A → α′ · (i1)
    B → β′ · DRi xi+1 ρ′ γ′ (i2)
    DRi → ·
    . . . . . . . . . . . . . . . . . . . . . . . . . . .
xi+1 ∈ N ∪ T, α′, β′ ∈ L(PREF’), ρ′ ∈ L((TR a)?), γ′ ∈ L(SUFF’).
This conflict is generated by a shift/reduce conflict in the XpLR parsing table. In this
case, I is split by the function Partition into an ordered sequence of item sets on the basis
of the driver relations, and the parser gives priority to the shift of xi+1. So we can tackle
this conflict by introducing a new non-terminal ‘next1’ and by defining it through the
following empty production:
next1 → ε { ip = Fetch_Vsymbol(Rdriveri, xi+1);
            if ip is not null then next_vsymbol = Dp[ip];
            else next_vsymbol = ak;
          }
Then we introduce the non-terminal ‘next1’ in the two conflicting productions:
A → α′ next1 ak
B → β′ next1 xi+1 ρ′ γ′
Case 5. (Ordinary, TR)
This case occurs when G’ generates a set of items I’ containing one or more complete items
and one or more items with the dot preceding a vsymbol TRi ∈ TR:
I’: A → α′ xi+1 · (i1)
    B → β′ DRi xi+1 · TRi ai γ′ (i2)
    TRi → ·
    . . . . . . . . . . . . . . . . . . . . . . . . . . .
xi+1 ∈ N ∪ T, α′ ∈ L(PREF’(DRi)) ∪ {ε}, β′ ∈ L(PREF’), DRi ∈ DR, ai ∈ FICT, γ′ ∈ L(SUFF’).
Moreover, let us suppose that
TRi → ε { if Test(RELh, xi+1) is true for each RELh in Rtesteri
          then next_vsymbol = ai;
          else { emit “syntax error”; exit; } }
We must distinguish two cases, according to the following two alternatives: α′ = ε or
α′ ≠ ε.
In the first case, there is a shift/shift or goto/goto othercase conflict in XPG depending
on whether xi+1 is a terminal or non-terminal vsymbol. We can tackle this conflict by
introducing a new non-terminal ‘next1’ and by defining it through the following rule:
next1 → ε { let j: X → σ′ DRi · Y ρ′ τ′ be the kernel item such that i1 is in Closure(j);
            if ρ′ = TRj a’ then
               if in the ordered sequence of conditioned actions, the condition verified
                  by TRj precedes the condition verified by TRi
               then if Test(RELh, xi+1) is true for each RELh in Rtesterj
                    then { next_vsymbol = aj; exit; }
            if Test(RELh, xi+1) is true for each RELh in Rtesteri
            then next_vsymbol = ai;
            else { emit “syntax error”; exit; } }
Then we introduce the non-terminal ‘next1’ and a fictitious vsymbol aj in the two conflicting
productions:
A → xi+1 next1 aj
B → β′ DRi xi+1 next1 ai γ′
Also in the case α′ ≠ ε there is a shift/shift or goto/goto othercase conflict. By following
similar arguments as above, we can tackle this conflict by introducing a new non-terminal
‘next1’ and by defining it through the following empty production:
next1 → ε { if Test(RELh, xi+1) is true for each RELh in Rtesteri
            then next_vsymbol = ai;
            else next_vsymbol = aj;
          }
Then we introduce the non-terminal ‘next1’ and a fictitious vsymbol aj in the two conflicting
productions:
A → α′ xi+1 next1 aj
B → β′ DRi xi+1 next1 ai γ′
Case 6. (TR, DR)
In this case the set of items I’ generated from G’ must contain at least an item i1 with
the dot preceding a vsymbol DRj ∈ DR and at least an item i2 with the dot preceding a
vsymbol TRi ∈ TR:
I’: A → α′ xi · DRj xj ρ′ λ′ (i1)
    B → β′ DRi xi · TRi ai γ′ (i2)
    DRj → ·
    TRi → ·
    . . . . . . . . . . . . . . . . . . . . . . . . . . .
xi, xj ∈ N ∪ T, α′ ∈ L(PREF’(DRi)) ∪ {ε}, DRi, DRj ∈ DR, TRi ∈ TR, ai ∈ FICT, ρ′ ∈ L((TR a)?),
β′ ∈ L(PREF’), λ′, γ′ ∈ L(SUFF’).
This conflict is generated by a shift/shift or goto/goto case conflict in XPG. We can tackle
this conflict by introducing a new non-terminal ‘next1’ and define it through the following
empty production:
next1 → ε { if Test(RELh, xi) is true for each RELh in Rtesteri
            then next_vsymbol = ai;
            else { ip = Fetch_Vsymbol(Rdriverj, xj);
                   if ip is not null then next_vsymbol = Dp[ip];
                   else { emit “syntax error”; exit; } } }
Then we introduce the non-terminal ‘next1’ in the two conflicting productions:
A → α′ xi next1 xj ρ′ λ′
B → β′ DRi xi next1 ai γ′
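The mixed resolution of Case 6, where the tester is evaluated first and the driver relation is used as a fallback, can be sketched as follows. As in the previous sketches, test and fetch_vsymbol are hypothetical stand-ins for Test and Fetch_Vsymbol, and the demo data are made up.

```python
def resolve_tr_dr_conflict(tester_relations, fictitious, driver_relation,
                           test, fetch_vsymbol, dp, x):
    """Prefer the tester branch; otherwise follow the driver relation."""
    if all(test(rel, x) for rel in tester_relations):
        return fictitious
    ip = fetch_vsymbol(driver_relation)
    if ip is None:
        raise ValueError("syntax error")
    return dp[ip]


# Demo with made-up data: the tester fails, so the driver relation is followed.
dp = ["NODEI", "EDGE", "NODEG"]
result = resolve_tr_dr_conflict(
    ["1_2"], "a_i", "2_1",
    test=lambda rel, x: False,
    fetch_vsymbol={"2_1": 2}.get,
    dp=dp, x="EDGE",
)
print(result)
```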
Conflict resolution algorithm
Fig. 4.3 shows the algorithm for the resolution of reduce/reduce conflicts that uses the
approach proposed in the previous conflicts classification.
Input: A translation scheme G and the corresponding LR(0) parsing table T with reduce/reduce conflicts.
Output: A translation scheme with no reduce/reduce conflicts.
1. Repeat the following steps until T has no reduce/reduce conflicts.
2. For each state s in T containing at least two complete items, create the sequence
   Seq(s) = <TR1, .., TRh, DRh+1, .., DRk, Ak+1, .., An> (n ≥ 2) of the left hand-side
   non-terminals appearing in the complete items, with TRa ∈ TR, DRb ∈ DR, Ac ∈ N,
   1 ≤ a ≤ h, h+1 ≤ b ≤ k, k+1 ≤ c ≤ n.
3. For each state s in T create a graph node. For each pair of states s1 and s2 in T create
   an arc if Seq(s1) and Seq(s2) contain at least a common complete item.
4. Let {C1, .., Cm} be the set of connected graph components and set_of_nt(C) be the set of
   non-terminals appearing in a sequence Seq(s) where s is a node in C.
5. for i = 1, …, m do
      create non-terminal NEXTi
      for each non-terminal X in set_of_nt(Ci) do
         if (X ∈ DR or X ∈ TR) then
            eliminate all the productions X → ε { action } from G
            modify G by replacing each occurrence of X with NEXTi
         else
            replace X → σ with X → σ NEXTi ai;
      insert a production NEXTi → ε { action } in G where action is coded as follows:
         switch state of
            s1: code to eliminate the conflict in s1
            ………
            st: code to eliminate the conflict in st
            otherwise: syntax error
      Here s1 … st are the nodes in Ci.
6. Construct the parsing table T for the new grammar.
Figure 4.3: The algorithm for the resolution of reduce/reduce conflicts in the translation
scheme.
The algorithm takes into account the possibility that a production can be involved in
more than one conflict, so the modification of such production with the introduction of
the non-terminal NEXTi must consider the different situations. To this aim the algorithm
constructs a graph where the nodes are the states of the parser with a reduce/reduce
conflict and the edges connect the states having at least a common complete item. These
states are modified in agreement to the techniques presented in the previous reduce/reduce
conflicts classification.
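The grouping of conflicting states described above can be sketched as a connected-components computation. In this minimal illustration the parser states are integers and each state is mapped to the set of its complete items, encoded as strings. This representation and the function names are assumptions made for the example, not the data structures used by the algorithm in Fig. 4.3.

```python
from collections import defaultdict


def group_conflict_states(complete_items):
    """Group states that are linked by at least one common complete item.

    complete_items: dict mapping a state to the set of its complete items.
    Returns the connected components as sorted lists of states.
    """
    states = sorted(complete_items)
    adjacent = defaultdict(set)
    for i, s1 in enumerate(states):
        for s2 in states[i + 1:]:
            if complete_items[s1] & complete_items[s2]:
                adjacent[s1].add(s2)
                adjacent[s2].add(s1)

    seen, components = set(), []
    for start in states:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:  # depth-first visit of one component
            current = stack.pop()
            component.append(current)
            for neighbour in adjacent[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


def assign_next_nonterminals(components):
    """Associate one non-terminal NEXTi with each component."""
    return {f"NEXT{i}": comp for i, comp in enumerate(components, start=1)}


# Made-up conflict states: 3 and 7 share the complete item "DR2 -> ."
items = {
    3: {"DR1 -> .", "DR2 -> ."},
    7: {"DR2 -> .", "TR1 -> ."},
    9: {"A -> x .", "B -> x ."},
}
print(assign_next_nonterminals(group_conflict_states(items)))
```

A single NEXTi per component lets a production that is involved in several conflicts be rewritten only once, with the action of NEXTi switching on the current state.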
The equivalence of the languages recognized by P(G) and P(XPG) follows from
Theorem 4.1 and from the approaches used in the resolution of reduce/reduce conflicts.
Example 4.2 Let us consider the non-LR(0) translation scheme SG of example 4.1. The
application of the algorithm in Fig. 4.3 to SG yields the translation scheme SG’=(T’, N’,
P’, S’), where T’=T∪{A1, A2, A3, A4}, N’=N∪{S’, SP, r2_1, r1_1b, next1, next2} and P’
is the set of productions with actions described in the following.
S’ → SP StateTD { ip = Fetch_Vsymbol(end);
                  if ip is not null then { emit “syntax error”; exit; }
                  else { emit “parsing ok”; exit; }
                }
SP → ε { ip = Fetch_Vsymbol(start);
         if ip is not null then next_vsymbol = Dp[ip];
         else { emit “syntax error”; exit; }
       }
StateTD → Graph next1 A4
Graph → NODEI { Graph1 = NODEI1; }
Graph → NODEIF { Graph1 = NODEIF1; }
Graph → Graph’ next1 EDGE next2 A1 r2_1 Node {
          Graph1 = Graph’1 − EDGE1;
          for i=1 to t do
             if (|Node1| > 1) then {
                insert(PLACEHOLD);
                PLACEHOLD1 = Node1 − EDGE2;
             }
        }
next1 → ε { let Rseq = {(1_1, EDGE), (2_1, EDGE), (any, PLACEHOLD)}
            do { let R be the first element in Rseq;
                 ip = Fetch_Vsymbol(R);
                 if ip is not null then next_vsymbol = Dp[ip];
                 else delete R from Rseq;
            } while (ip is null and Rseq is not empty);
            if (ip is null) { next_vsymbol = A4; }
          }
next2 → ε { let Rseq = {(…, EDGE), (1_2, EDGE), (1_1, EDGE)}
            and Dict[] = {A1, A2, A3}
            do { let R be the first element in Rseq;
                 if Test(RELh, xi+1) is true for each RELh in R
                 then { next_vsymbol = Dict[R]; exit; }
                 else delete R from Rseq;
            } while (Rseq is not empty);
            if (Rseq is empty) { emit “syntax error”; exit; }
          }
r2_1 → ε { ip = Fetch_Vsymbol(2_1, Node);
           if ip is not null then next_vsymbol = Dp[ip]
           else { emit “syntax error”; exit; }
         }
Graph → Graph’ next1 EDGE next2 A2 {
          Graph1 = (Graph’1 − EDGE1) − EDGE2;
        }
Graph → Graph’ next1 EDGE next2 A3 r1_1b Node {
          Graph1 = Graph’1 − EDGE2;
          for i=1 to t do
             if (|Node1| > 1) then {
                insert(PLACEHOLD);
                PLACEHOLD1 = Node1 − EDGE1;
             }
        }
r1_1b → ε { ip = Fetch_Vsymbol(1_1, Node);
            if ip is not null then next_vsymbol = Dp[ip]
            else { emit “syntax error”; exit; }
          }
Graph → Graph’ next1 PLACEHOLD { Graph1 = PLACEHOLD1; }
Node → NODEG { Node1 = NODEG1; }
Node → NODEF { Node1 = NODEF1; }
Node → PLACEHOLD { Node1 = PLACEHOLD1; }
Chapter 5
An XPG-based Visual Environments Generator
In this chapter, we briefly present the Visual Language Compiler-Compiler (VLCC) [15],
a system for the automatic generation of visual programming environments supporting
the definition and construction of visual notations modeled through XPGs. The system
implements the concept of meta-CASE [59], since the visual programming environments
it generates can be used as CASE tools for the modeled visual notations. Changes to the
visual notations can be effectively executed within VLCC by acting on the XPG model
associated to the notations.
The system provides a “desk” where a developer can find an integrated set of tools
supporting the whole process for modeling and implementing visual languages. Fig. 5.1
shows the structure of the VLCC system. The main components are: the Symbol Editor,
the Visual Grammar Editor, the Textual Production Editor and the Language Inspector.
The Symbol Editor supports the language developer in the complete definition of vsymbols.
Basically, for each vsymbol, it allows us to draw its physical aspect (the M component), to
associate the syntactic attributes (the S component), and to “attach” a visual or a textual
annotation to it (in order to define hierarchical visual languages). Once defined, each
vsymbol is then inserted into a Terminals palette. The Visual Grammar Editor is a visual
component supporting the language designer in the specification of an XPG grammar
and semantic routines. This specification is translated into a YACC-like definition of the
visual language by using the technique described in the previous chapter. The designer
can further refine it through the Textual Grammar Editor.
The Language Inspector makes it possible for VLCC to directly support the hierarchical
visual notations. It allows the language designer/implementor to easily navigate through
the specifications of the visual notations composing the hierarchy.
Figure 5.1: The VLCC architecture. (Labels: Visual Language Compiler-Compiler with Grammar Editor, Symbol Editor, Production Editor and Visual Programming Environment Generator; Visual Programming Environment with LR-based compiler and Visual Editor; actors: Language designer, VPE user.)
Once the visual notation has been completely specified, VLCC automatically generates
an integrated Visual Programming Environment (VPE). In particular, VLCC generates a
VPE composed of a visual sentence editor based on the Terminals palette built with the
Symbol Editor and the visual language LR-based compiler generated from the translation
scheme derived by an XPG grammar and semantic routines. A final user can then use the
VPE to compose and process visual sentences from the implemented language.
The VLCC system maintains a repository of previously developed visual notations,
which can be reused to derive new visual notations, or to combine them in a hierarchy.
As a matter of fact, the visual environment development process supported by VLCC is
incremental, enabling the reuse of available components. Such components include vsym-
bols, grammar components, semantic rules, or previously developed visual environments.
In a first phase the designer uses VLCC as a prototyping tool. He/she identifies the visual
languages involved in the final VPE and their relationships. For each language he/she
sketches the graphical aspect of vsymbols and vsentences. In this phase the designer uses
VLCC to produce a first mock-up prototype of the visual environment [59], by just creating the
Terminals palette for drawing sentences.
Once the appearance of the visual notations has been completely defined the designer
adds the syntax and the semantics for the involved languages, where missing. In some
cases, when re-using already implemented notations, the designer/implementor may only
need to produce new semantic routines.
In the following we describe the main functionalities of the VLCC tool.
5.1 The Symbol Editor
Fig. 5.2 shows the VLCC Symbol Editor: a designer/implementor may recall the appro-
priate palette to define the physical aspect of the vsymbol (I palette button), or define
attaching areas or containment areas on the vsymbol (II or III palette, respectively). As
an example it can be noted from Fig. 5.2 that the attaching area palette allows a de-
signer/implementor to define as attaching areas: points, lines, curves, circles, squares,
circumferences, perimeters, etc.
Figure 5.2: Definition of a Statechart state through the Symbol Editor. (Labels: I palette, graphical aspect; II palette, attaching areas; III palette, containment areas; IV palette, language annotation; a containment area and an attaching area are marked on the vsymbol.)
Moreover, by clicking on the language annotation palette it is possible to annotate a
vsymbol with any of the already defined visual languages. As an example Fig. 5.3 shows
the Annotation dialog where a UML Class vsymbol is annotated with the visual language
named Statecharts. The dialog also allows adding a textual annotation to the vsymbol.
Figure 5.3: Annotating a UML Class symbol with the language Statecharts.
5.2 The VLCC Textual grammar editor
Fig. 5.4 shows the main window of VLCC. The Terminals palette includes all the vsymbols
that have been specified with the Symbol Editor. The Viewer window is a structured
editor that allows modifying or directly editing the translation scheme derived from an
XPG grammar together with the semantic rules in YACC format. The figure also shows
the Language Inspector window containing a hierarchy of languages: the main language
STATECHARTS has a symbol annotated with the language Classes, which is in turn annotated
with the language Classpec, itself annotated with Classpec and STATECHARTS. By clicking
on a language name in the window it is possible to recall its language specification in the
main window. This allows the language implementor to navigate through the language
specifications in a simple and visual way. Once all the XPG grammars and the semantic
routines are provided in YACC format the command build language under the menu
Visual CM Project allows compiling the whole project and building the corresponding
Visual Programming Environment.
Figure 5.4: The VLCC Textual Grammar Editor.
5.3 The generated Visual Programming Environment
Fig. 5.5 shows the Visual Programming Environment generated from the grammar specification
of subsection 2.1. A user can draw a statechart by using the Terminals palette and
compile it by using the flash button. The compilation process checks the syntactic and
semantic correctness of the diagrams and prints the output to the Output window. The
output depends on the semantic routines added to the XPG grammars. As an example,
we have added semantic rules to automatically translate the statechart diagrams into the
XML Model Interchange (XMI) format. XMI is a standard file format for saving and
loading UML designs [67].
Figure 5.5: The final VPE implementing the grammar specification given in subsection
2.1.
In Fig. 5.6 we show a generated hierarchical visual programming environment. A user can
draw a diagram and possibly zoom in a vsymbol to refine it. This is done by first clicking
on the Annotation button and then on the vsymbol that needs to be annotated.
The availability of such a tool turns out to be a useful support for the validation phase of
the language development. As a matter of fact, the designer can quickly receive feedback
from the customer during the language prototyping process and modify the prototype in
agreement with the customer’s advice. This user-centered design is especially desirable due
to the nature of visual languages whose effectiveness strongly depends on the consistency
between user’s intention and machine interpretation.
The VLCC provides a clean separation of the concerns of the graphical editing and the
interpretation of diagrams both from the architectural and the usability point of view. The
user draws the diagram in free order (not dictated by a syntax directed editor) and then
invokes the language analyzer to interpret the drawing. The analyzer informs the user
about any errors it finds during parsing and semantic processing. This approach to visual
language implementation makes it possible to combine the sketching and the checking of
diagrams into an explorative design style.
Figure 5.6: A hierarchical visual sentence (the Annotation button is labeled in the figure).
Separating the two concerns of editing and analyzing reduces the software complexity
of a tool that implements a visual language because the correctness of a diagram does not
have to be constantly enforced during editing.
Chapter 6
Constructing Meta-CASE Workbenches
CASE tools are recognized as useful means to strengthen and support software development.
They are especially useful when they also provide some kind of correctness and
consistency checking and do not support a single task of the software life-cycle, but rather
a whole process phase (workbenches) or a considerable part of the software process (environments).
In these cases they usually support a specific method and provide guidance
on when tools in the set should be used.
Although CASE tools are able to speed up development, they are not as widespread as one
would expect. It is widely recognized that the main obstacle to their widespread adoption
derives from their method inflexibility. Indeed, while software development organizations
dynamically change the adopted methodologies by tailoring methods to their own require-
ments, modifying supporting tools is usually impossible or too expensive.
In recent years, the use of meta-CASE technology for developing CASE tools has been
proposed as a solution to this problem [37, 59]. Nevertheless, the development of meta-
CASEs able to generate visual oriented workbenches (i.e. analysis and design workbenches)
is not easy. The main difficulties are concerned with the generation of suitable visual
modeling environments and the integration of such environments.
In this chapter we propose an approach for the construction of meta-CASE work-
benches which profitably exploits the research on the generation of visual programming
environments realized in the visual languages research field.
The software architecture of the proposed meta-CASE workbench consists of two
modules: the Modeling Environment Generator (MEG) and the Workbench Generator
(WoG) [20]. The former generates visual modeling environments starting from a
UML class diagram specifying the abstract syntax of the visual language. The implementation
of the MEG subsystem can be effectively supported by visual environment
generators. The WoG module allows the workbench designer to visually specify the target
workbench by defining the set of visual modeling environments inside a suitable process
model. Starting from such visual specification it generates the customized workbench by
integrating the required environments. The data integration is accomplished by using an
XML sublanguage, named GXL (Graph eXchange Language) [36, 63], introduced in the software
re-engineering field to provide a general format for describing graph structures. The
generated workbenches also include suitable mechanisms to check the consistency between
different diagrams by exploiting tools for constraints checking based on XML technologies.
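To give a concrete flavor of the data integration format, the following Python sketch serializes a small diagram as a GXL document using the standard library. The element names gxl, graph, node, edge and attr, and the attributes id, from, to and edgemode, belong to GXL. The function name, the attribute name "kind" and the sample diagram are hypothetical and do not reflect the schema adopted by the generated workbenches.

```python
import xml.etree.ElementTree as ET


def diagram_to_gxl(graph_id, nodes, edges):
    """Serialize a diagram as a GXL string.

    nodes: iterable of (node_id, kind) pairs; "kind" is a made-up attribute.
    edges: iterable of (source_id, target_id) pairs.
    """
    gxl = ET.Element("gxl")
    graph = ET.SubElement(gxl, "graph", {"id": graph_id, "edgemode": "directed"})
    for node_id, kind in nodes:
        node = ET.SubElement(graph, "node", {"id": node_id})
        attr = ET.SubElement(node, "attr", {"name": "kind"})
        ET.SubElement(attr, "string").text = kind
    for source, target in edges:
        ET.SubElement(graph, "edge", {"from": source, "to": target})
    return ET.tostring(gxl, encoding="unicode")


# A made-up two-state diagram with one transition.
document = diagram_to_gxl(
    "statechart1",
    nodes=[("s0", "initial"), ("s1", "state")],
    edges=[("s0", "s1")],
)
print(document)
```

Because every environment reads and writes the same graph format, a consistency checker only needs to understand GXL rather than the internal model of each tool.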
The next section introduces the main issues in the construction of meta-CASEs and presents
the software architecture of the proposed system. Section 6.2 illustrates the
MEG module and the underlying methodology for the generation of visual modeling environments.
Moreover, it describes how the construction of MEG can be effectively supported
by the Visual Language Compiler-Compiler (VLCC) system. Section 6.3 describes
the WoG module.
6.1 The proposed approach for the construction of meta-
CASE workbenches
It is widely recognized that the effective use of diagrammatic representations to model
software systems requires the availability of visual oriented workbenches providing a set
of tools that synergistically work to provide a comprehensive support to system mod-
eling. These workbenches should provide diagram editors which support designers in
the construction of software models and save the information in a central shared repository.
They should also include design analysis and checking tools able to process software
models and report on errors and inconsistencies. Many workbenches attempt to provide
import/export facilities to allow the interchange of information from the repository
with other tools. Method-oriented workbenches include also knowledge about a process
model, the rules, and guidelines which should be applied to the software being developed.
They should adopt the process model to advise engineers on what tools of the workbench
to apply and when they should be used. On the other hand, they should exploit the
rules/guidelines to provide some automatic checking of the diagrams.
Thus, the main relevant issues for the construction of meta-CASE workbenches for
analysis and design are:
1. how to generate customized visual modeling environments;
2. how to integrate such environments;
3. how to specify the required process model, and constraints;
4. how to support constraint checking.
The approach we propose is based on suitable choices for each one of the above issues.
It is worth pointing out that such choices are not independent: each one influences the
others. As a matter of fact, we will show how the choice for data integration strongly
influences the methodology for constraint checking and visual environments specification
and generation, and so on.
As for the first issue, the proposed approach suitably exploits the techniques and the
tools developed in the visual language research field [35,11,15,30,58,70].
As for the integration issue, different levels of integration can be considered for a
workbench, such as data integration, presentation integration, and process integration.
Presentation integration is obtained when the different tools composing the workbench
exhibit a similar user interface. The proposed approach easily ensures such integration for
all visual environments since they are generated with the same approach. On the other
hand, data integration is guaranteed by the use of an XML sublanguage, named GXL
(Graph Exchange Language), to represent the diagrams stored in a shared repository. The
GXL approach shows several appealing features and has been proposed to be a standard
exchange format for graph-based tools and to facilitate interoperability of reengineering
tools [36,63]. In particular, GXL has been designed in such a way that future extensions are
possible for handling different types of graphs, such as hypergraphs or hierarchical graphs [24],
and GXL descriptions can be enriched with layout information. Thus, the versatility of GXL
allows us to improve interoperability between visual modeling environments. As a matter
of fact, some groups from industry and research have committed to providing facilities to import
and export GXL documents in their tools [65]. However, the choice of GXL does not
prevent the use of other XML-based languages for import/export facilities. For
example, for UML visual environments, we may also need to represent the sentence in
the XMI format [67]. The proposed meta-CASEs can support the workbench designer in
accomplishing this task, allowing him/her to provide suitable semantic rules to carry out
the translation.
Method-oriented workbenches should also provide process integration, in the sense
that they should embed the knowledge of the process model, i.e., the activities,
the phases, the constraints and the tools of a software process. In particular, this model
should be used as guidance for the users in the process activities. Approaches to specifying
such process models range from precise, formal languages [3, 4, 6] to more high-level,
graphical workflow languages [2, 46, 61].
For specifying process models we propose the use of activity diagrams, a
notation particularly well suited to coordinating the diagram environments and to modeling
the essential dependencies between them. Such features turn out to be especially useful
because many activities in the process model are concurrent, and their coordination may
be interdependent. In order to specify constraints between elements of the software models,
such activity diagrams include dependency arcs annotated with constraints, which
express constraints on system models derived from the rules or guidelines of the method that
the workbench is supposed to support. Such constraints can also be seen as model consistency
constraints. Different notions of model consistency exist [25]. In particular, intra-consistency
(syntactical consistency) ensures that a model conforms to the abstract syntax
of the language, and inter-consistency (horizontal consistency) is related to diagrams of
different languages. Indeed, analysis and design workbenches are valuable because they allow
software designers to use several modeling environments and thus to describe the software
from different points of view. Nevertheless, checking the consistency of different diagrams
should be supported. Intuitively, managing inter-consistency is more
complicated because we have to consider the heterogeneity of the specifications provided
by using different visual languages.
In recent years, the problem of model constraint checking has been widely investigated
and some tools have been proposed for the automatic verification of model consistency
[23, 38, 42, 60]. However, most of them are conceived for checking the consistency of
UML documents. A highly generic technology is represented by the xlinkit [51] framework,
which is able to generate rule-based links automatically and to check the consistency of
XML documents. According to this approach, consistency constraints are specified in
terms of a rule language based on first-order logic that has been adapted to work on
XML [51]. The flexibility of xlinkit makes it a valuable tool to accomplish the consistency
checks on the GXL representation of the software models.
On the basis of the described choices for the construction of workbenches, the software
architecture of the proposed meta-CASE consists of two modules, namely the Modeling
language Environment Generator (MEG) and the Workbench Generator (WoG). Such subsystems
exchange information through a Visual Modeling Environments (VME) Repository,
as shown in Fig. 6.1. The MEG module, described in the next subsection, is based on
visual language environment generator technologies and allows us to generate visual modeling
environments starting from a high-level specification given in terms of a UML class
diagram.
The WoG module, described in subsection 6.3, generates customized workbenches by
integrating the VMEs, in agreement with the constraints and the dependencies expressed
in the activity diagrams modeling the customized software process. Fig. 6.2 shows the
architecture of a workbench generated by WoG.
[Figure 6.1 shows the language designer, the MEG module, the VME Repository, the WoG module, the workbench designer and the generated Workbench.]
Figure 6.1: The software architecture of the meta-CASE workbench.
The Workbench Interface is the module that interacts with the end user and coordinates
the use of the VMEs in agreement with the knowledge about the process model. Each
VME allows a user to edit models and obtain as output GXL documents which are stored
in the shared GXL Repository. The Checker module accesses the Constraint Repository
and verifies the consistency constraints of the GXL documents as specified by the designer
in the WoG. In particular, the GXL documents and the corresponding constraint rules
are submitted to the xlinkit checker, which returns a diagnostic report.
[Figure 6.2 shows the Workbench Interface, the environments VME1, VME2, …, VMEn, the GXL Repository, the Constraint Repository, and xlinkit with its diagnostic report.]
Figure 6.2: The architecture of the generated workbench.
6.2 The MEG Module
The core of the proposed meta-CASE is the MEG module, which supports language designers
in the definition and generation of visual modeling language environments. Each
generated environment is able to process visual sentences (software models) and output
suitable XML-based representations in order to facilitate data exchange and VME integration.
In the next subsections we describe the architecture of MEG, then we illustrate
how MEG can be constructed by using the VLCC system.
6.2.1 The architectural design of MEG and the underlying methodology
The architectural design of MEG consists of two subsystems, namely a UML class diagram
environment, and a grammar-based VME generator as described in Fig. 6.3. The process
of definition and generation of the language environment is carried out according to a
suitable methodology [19]. Such methodology has been strongly influenced by the choice
to adopt the GXL format as internal data representation. In the GXL approach, graph
classes are defined by using a suitable declarative language which is usually formed by
UML class diagrams [57]. Such UML class diagrams are translated into equivalent graph
representations, which are described by GXL documents, named graph schemas. A graph
schema provides the graph structure, i.e. the definition of node and edge classes, their
attribute schemas, and their incidence structure. Thus, GXL is used to represent instance
graphs as well as graph schemas for describing the structure of data. Schema and instance
graphs are exchanged by the same type of document, i.e. XML documents matching the
GXL DTD [63].
The GXL approach can be effectively integrated into MEG giving rise to a general
methodology able to generate interoperable visual language environments. According to
such methodology a visual language is defined starting from a high-level specification given
in terms of a UML class diagram. Such specification is analyzed to automatically generate
the GXL schema of the language. Moreover, it provides a start-up for the language
grammar definition, which is used to generate an integrated target visual environment.
Such environment provides support for editing visual sentences and produces the GXL
instance. The GXL schema and the GXL instance so obtained can be used as an exchange
format for the sentences of the language.
[Figure 6.3 shows the MEG subsystem with the numbered steps 1, 2.1, 2.2, 3 and 4. Its elements are the UML Class Diagram Environment, the Annotated UML Class Diagram, the GXL schema, the Grammar Skeleton, the Grammar with semantic rules, the Grammar-based VME Generator, the Target Visual Modeling Environment, the Visual sentence, the GXL instance and the GXL repository.]
Figure 6.3: The architecture of the MEG subsystem and the underlying methodology.
In the sequel we provide a more detailed description of the phases of the methodology. In
particular, it consists of four steps, which are numbered as depicted in Fig. 6.3.
In step 1 the language designer provides a high-level specification of a visual language
in terms of an annotated UML class diagram. This formalism offers a suitable declarative
language to define visual languages, and the annotation provides the concrete syntax of
the languages, i.e. the physical, syntactic and semantic features of the symbols.
In step 2.1 the GXL schema for the specified visual language is automatically generated
from the UML class diagram. To accomplish this task, we have identified a set of general
rules for the translation. In particular, the first rule states that for each class of the UML
class diagram the following node should be inserted into the GXL schema specification,
where classname is the name of the class.
<node id="classname">
  <type xlink:href="gxlmetaschema.gxl#NodeClass"/>
  <attr name="name">
    <string>classname</string>
  </attr>
</node>
Rule 2 states that for each aggregation association that holds between class X and class Y,
a node and two edges should be inserted into the GXL schema specification. In particular,
the node is defined as follows:
<node id="aggrname">
  <type xlink:href="gxlmetaschema.gxl#AggregationClass"/>
  <attr name="name">
    <string>aggrname</string>
  </attr>
</node>
and the edges are:
<edge from="aggrname" to="X">
  <type xlink:href="gxlmetaschema.gxl#from"/>
</edge>
<edge from="aggrname" to="Y">
  <type xlink:href="gxlmetaschema.gxl#to"/>
</edge>
where aggrname is the name of the aggregation. If the multiplicity of the aggregation association
is of type 0..* then the first edge will be:
<edge from="aggrname" to="X">
  <type xlink:href="gxlmetaschema.gxl#from"/>
  <attr name="limits">
    <int>-1</int>
  </attr>
</edge>
The specifications of other rules are in [19].
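The two translation rules above can be sketched in a few lines of Python using the standard xml.etree.ElementTree module. This is only an illustration under stated assumptions: the function names class_to_schema and aggregation_to_schema, the unbounded flag and the explicit declaration of the xlink namespace are not part of the methodology in [19]; they are introduced here to make the fragment self-contained.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
XLINK_HREF = f"{{{XLINK_NS}}}href"
META = "gxlmetaschema.gxl#"          # metaschema reference used in the text
ET.register_namespace("xlink", XLINK_NS)


def _schema_node(node_id, metaclass):
    """Build <node id=...> with a type reference and a 'name' attribute."""
    node = ET.Element("node", {"id": node_id})
    ET.SubElement(node, "type", {XLINK_HREF: META + metaclass})
    attr = ET.SubElement(node, "attr", {"name": "name"})
    ET.SubElement(attr, "string").text = node_id
    return node


def class_to_schema(classname):
    """Rule 1: one NodeClass node for each class of the class diagram."""
    return _schema_node(classname, "NodeClass")


def aggregation_to_schema(aggrname, x, y, unbounded=False):
    """Rule 2: one AggregationClass node plus a 'from' edge to class X
    and a 'to' edge to class Y. 'unbounded' (hypothetical flag) marks a
    0..* multiplicity, encoded as limits = -1 on the first edge."""
    node = _schema_node(aggrname, "AggregationClass")
    edges = []
    for target, role in ((x, "from"), (y, "to")):
        edge = ET.Element("edge", {"from": aggrname, "to": target})
        ET.SubElement(edge, "type", {XLINK_HREF: META + role})
        edges.append(edge)
    if unbounded:
        limits = ET.SubElement(edges[0], "attr", {"name": "limits"})
        ET.SubElement(limits, "int").text = "-1"
    return node, edges


if __name__ == "__main__":
    print(ET.tostring(class_to_schema("FinalState"), encoding="unicode"))
```

Sharing the helper _schema_node keeps the two rules aligned, since both produce a node whose name attribute repeats its identifier.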
In step 2.2 a context-free grammar skeleton is constructed from the annotated UML
class diagram. In order to carry out this translation automatically, general rules have been
identified. In particular, according to rule 1 each non-specialized class Cname produces a
terminal symbol, named T_Cname, whose attributes are the corresponding attributes of
the class Cname.
As an example, let us consider the UML class diagram depicted in Fig. 6.5 that specifies
the abstract syntax of UML State Diagrams. The application of rule 1 determines the
insertion of the symbols T_Guard, T_StubState, T_SynchState, T_PseudoState, T_SimpleState,
T_FinalState, T_Event, T_SubmachineState in the set of terminal symbols of the grammar
skeleton. Moreover, the symbols T_Guard, T_StubState, T_SynchState, T_PseudoState are
associated with the attributes expression, referenceState, bound, kind, respectively.
In Fig. 6.4 the main rules used to obtain a grammar skeleton from the annotated UML
class diagram are explained.
Rule 1. Each non-specialized class produces a terminal symbol of the grammar.
Rule 2. Each generalization relationship produces a production skeleton where the names of the specialized
classes are the names of the grammar symbols in the right-hand side of the production and the name of the
generalized class is the name of the grammar symbol on the left-hand side of the production.
Rule 3.1. Each aggregation or composition association between classes produces a production skeleton. In
such a production a grammar symbol with the name of the whole class is in the left-hand side and grammar
symbols with the names of the part classes are in the right-hand side. Moreover the symbols in the right-
hand side are at the same level in the hierarchy of objects and this relationship must be specified by the
language designer in step 4 of the methodology.
Rule 3.2. The multiplicity of the associations is used to determine the number of the grammar symbols in
the right-hand side of the productions and the number of productions with a given grammar symbol in the
left-hand side.
Rule 3.3. Each composition between a class and a stereotype produces a production skeleton. This
production describes a grammar symbol that contains a hierarchy of objects. The names of the grammar
symbols in the hierarchy are the names of the classes that “compose” the stereotype.
Figure 6.4: The main rules to obtain a grammar skeleton from a UML class diagram.
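Rules 1 and 2 of Fig. 6.4 lend themselves to a compact illustration. The sketch below is not the implementation used in MEG: the UmlClass record, the skeleton function, and the convention that a class with specializations is named with the prefix N_ are assumptions made here for the sake of the example.

```python
from dataclasses import dataclass, field


@dataclass
class UmlClass:
    name: str
    attributes: list = field(default_factory=list)
    subclasses: list = field(default_factory=list)  # names of specialized classes


def symbol(name, model):
    """T_ for a non-specialized class, N_ otherwise (naming assumption)."""
    return ("N_" if model[name].subclasses else "T_") + name


def skeleton(model):
    """Return (terminals, productions) obtained from rules 1 and 2."""
    # Rule 1: each non-specialized class yields a terminal symbol that
    # inherits the attributes of the class.
    terminals = {symbol(n, model): c.attributes
                 for n, c in model.items() if not c.subclasses}
    # Rule 2: each generalization yields a production skeleton
    # generalized class -> specialized class.
    productions = [(symbol(n, model), symbol(s, model))
                   for n, c in model.items() for s in c.subclasses]
    return terminals, productions


if __name__ == "__main__":
    model = {
        "State": UmlClass("State", subclasses=["FinalState", "SimpleState"]),
        "FinalState": UmlClass("FinalState"),
        "SimpleState": UmlClass("SimpleState"),
        "Guard": UmlClass("Guard", attributes=["expression"]),
    }
    for lhs, rhs in skeleton(model)[1]:
        print(lhs, "->", rhs)
```

Rules 3.1 to 3.3 are left out on purpose, since they require decisions by the language designer that cannot be derived from the class structure alone.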
In step 3 the language designer completes the grammar skeleton to obtain a visual
language specification, in agreement with the grammar formalism underlying the system.
Subsequently, semantic rules are added to the productions of the grammar in order to translate
visual sentences into GXL instances. Such semantic rules should take into account the
annotated UML class diagram and the GXL schema generated in step 2.1 (some examples
concerning step 3 are given in the next subsection).
In step 4 an integrated Visual Modeling Environment (VME) is generated starting
from the supplied language specification. Such an environment can present different features
depending on the characteristics of the adopted grammar-based generator. Nevertheless,
the environment should encompass a visual editor and a compiler for the specified language.
Thus, the user can edit a visual sentence by selecting terminals and arranging
them on the working window. Then he/she can compile the input sentence and obtain the
corresponding GXL instance of the sentence, which together with the GXL schema can
be used for data integration.
6.2.2 Using the VLCC system as a support to MEG construction
In this section, we illustrate how the construction of MEG can be effectively supported by
the VLCC system [14]. Indeed, the VLCC system can be used not only as the grammar-based
VME generator of MEG, but also to support the generation of a suitable UML class
diagram environment where the language designer can provide the high-level specification
of the language and obtain the corresponding GXL schema.
Using the class diagram environment module of the MEG subsystem we edit the class
diagram in Fig. 6.5 and add information on the concrete syntax and the semantics of the
language. Then, from the analysis of such a sentence we obtain the GXL schema and the
grammar skeleton sketched in Figs. 6.6 and 6.7, respectively.
In order to describe how to obtain an XPG grammar specification starting from the
grammar skeleton and the annotated UML class diagram, in the following we illustrate the
construction of the XPG productions for the selection of production skeletons depicted in
Fig. 6.7. Let us observe that productions 1, 2 and 3 in Fig. 6.7 are obtained by applying
rule 2 defined in Fig. 6.4. Such productions show that the vsymbol N_State can be a
T_FinalState, a T_SimpleState or an N_CompositeState. Indeed, a state of a UML state
diagram is described by the class State, which is the generalization of the three classes
CompositeState, SimpleState and FinalState representing the three types of states (see
Fig. 6.5).
[Figure 6.5 shows the classes StateMachine, ModelElement (from Core), StateVertex, State, Pseudostate (kind : PseudostateKind), SynchState (bound : UnlimitedInteger), StubState (referenceState : Name), SimpleState, CompositeState, FinalState, SubmachineState, Transition, Guard (expression : BooleanExpression), Event and Action (from Common Behavior), with the association ends subvertex/container, source/outgoing, target/incoming, guard, trigger, effect, deferrableEvent, internalTransition, entry, exit, doActivity, transitions, top, submachine and behavior/context.]
Figure 6.5: A chunk of the UML class diagram specifying the abstract syntax of UML
State Diagrams [50].
Productions 4 and 5 establish that an N_CompositeState is a state containing a set
of vsymbols N_StateVertex derivable from the non-terminal vsymbol AggCompositeState_Vertex.
In particular, container_subvertex specifies a containment relation holding
between the vsymbol N_CompositeState and the vsymbol N_StateVertex of the grammar
skeleton. As a matter of fact, the aggregation association between the classes CompositeState
and StateVertex describes a hierarchy of StateVertex.
Moreover, the analysis of annotations on the class State produces a grammar vsymbol
T_State having one attaching region and one containment area as syntactic attributes.
In order to construct the XPG productions, the designer identifies the relations used
to relate the vsymbols in the visual sentences by analyzing the associations in the class diagram
and the skeleton obtained. As an example, the containment relation container_subvertex
will be the spatial relation contains defined in Section 2.1. Moreover, the spatial
relation sibling will be used to relate the vsymbols N_StateVertex that are contained in
the same superstate N_CompositeState.
<?xml version="1.0"?>
<!DOCTYPE gxl SYSTEM "gxl.dtd">
<gxl>
  <graph id="StatediagramSchema">
    <type xlink:href="gxl.gxl"/>
    <node id="n1">
      <type xlink:href="gxl.gxl#NodeClass"/>
      <attr name="name"> <string>State</string> </attr>
    </node>
    <node id="n2">
      <type xlink:href="gxl.gxl#NodeClass"/>
      <attr name="name"> <string>FinalState</string> </attr>
    </node>
    …
    <node id="e1">
      <type xlink:href="gxl.gxl#EdgeClass"/>
      <attr name="name"> <string>source_outgoing</string> </attr>
    </node>
    …
    <node id="n15">
      <type xlink:href="gxl.gxl#CompositionClass"/>
      <attr name="name"> <string>container_subvertex</string> </attr>
    </node>
    …
  </graph>
</gxl>
Figure 6.6: A chunk of the statecharts GXL schema.
Thus, the XPG productions in Fig. 6.8 are obtained from the skeleton of Fig. 6.7.
Moreover, by adding semantic rules to the XPG productions it is possible to translate
any state diagram sentence into the corresponding GXL instance in agreement with the
state diagram GXL schema. As an example, the production for the vsymbol T_FinalState
in Fig. 6.9 contains the semantic rule to generate the corresponding GXL code. In
this production, T_FinalState_id is a unique identifier of the terminal T_FinalState, and
T_FinalState_action describes the action associated with the state FinalState. The semantic
rules for the other attributes of T_FinalState have been defined in a similar way.
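To make the effect of such a semantic rule concrete, the following Python fragment mimics the emission of the GXL node for a final state. It is a hedged sketch and not the code generated by VLCC: the function emit_final_state and its parameters are hypothetical, and only the attributes name and action mentioned in the text are handled.

```python
from xml.sax.saxutils import escape, quoteattr


def emit_final_state(state_id, name, action=None):
    """Return the GXL text for one final state.

    state_id, name and action stand for the attributes of the terminal
    T_FinalState; the action attribute is emitted only when present,
    mirroring the 'is not null' test of the semantic rule."""
    lines = [
        f"<node id={quoteattr(state_id)}>",
        '  <type xlink:href="statediagram.gxl#FinalState"/>',
        f'  <attr name="name"><string>{escape(name)}</string></attr>',
    ]
    if action is not None:
        lines.append(
            f'  <attr name="action"><string>{escape(action)}</string></attr>')
    lines.append("</node>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(emit_final_state("s7", "Done", action="log()"))
```

Escaping the attribute values is a precaution added here; the thesis does not discuss how special characters in state names are treated.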
It is worth noting that the verification of intra-consistency constraints of visual languages
can be carried out during a static semantic analysis by exploiting the syntax structure
given in output by the syntactic analysis [18]. To this aim, semantic attributes and
(1) N_State → T_FinalState
(2) N_State → T_SimpleState
(3) N_State → N_CompositeState
(4) N_CompositeState → N_CompositeState container_subvertex AggCompositeState_StateVertex
(5) AggCompositeState_StateVertex → AggCompositeState_StateVertex N_StateVertex
Figure 6.7: A part of the grammar skeleton for the statecharts language.
(1) N_State → T_FinalState
(2) N_State → T_SimpleState
(3) N_State → N_CompositeState
(4) N_CompositeState → N_CompositeState' <contains> AggCompositeState_StateVertex
(5) AggCompositeState_StateVertex → AggCompositeState_StateVertex' <sibling> N_StateVertex
[In the figure each production is accompanied by a Δ rule, which derives attaching region 1 and the containment area of the left-hand side vsymbol from the right-hand side, and by a Γ rule.]
Figure 6.8: Some productions of the XPG grammar for the state diagram language.
semantic rules can be added to the vsymbols and to the XPG productions in order to
obtain a syntax structure summarizing the information of the input visual sentence.
As an example, let us consider the state diagram of Fig. 6.10, which exhibits an
anomalous behavior whenever the system is off and an on event occurs. Indeed, two
different transitions are triggered by such an event, causing the system to enter both the on
and standby states, which should be mutually exclusive.
Fig. 6.11 shows the syntax graph corresponding to the state diagram in Fig. 6.10.
Algorithms on graphs can be applied to such syntax graph in order to determine the
presence of specific properties such as conflicting transitions and then detect possible
anomalous behaviors of the system. As an example, the conflicting transition of state
diagram in Fig. 6.10 can be detected by checking if there exists a node, in the syntax
N_State → T_FinalState
    Δ: …
    Γ: …
    { emit("<node id = T_FinalState_id> <type xlink:href="statediagram.gxl#FinalState"/>
            <attr name="name"> <string> T_FinalState_name </string> </attr>");
      if (T_FinalState_action is not null) emit("<attr name="action">
            <string> T_FinalState_action </string> </attr>");
      … emit("</node>"); }
Figure 6.9: A production with the semantic rule to generate the corresponding GXL
code.
[Figure 6.10 shows a state diagram with the states Off, Standby, NotOn, On, High, Low, Worm, Cool and Hot, and transitions labelled on, off, up, down, plus and minus.]
Figure 6.10: A state diagram with a conflicting transition.
graph of Fig. 6.11, with more than one outgoing transition labelled with the same event (see node
Off).
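The check described above reduces to grouping the outgoing transitions of each node by event. The sketch below illustrates it on a flat list of (source, event, target) triples; this representation and the function name conflicting_transitions are assumptions made here, since the syntax graph produced by the parser has a richer structure.

```python
from collections import defaultdict


def conflicting_transitions(transitions):
    """Return {(source, event): [targets]} for every node that has more
    than one outgoing transition labelled with the same event."""
    targets = defaultdict(list)
    for source, event, target in transitions:
        targets[(source, event)].append(target)
    return {key: tgts for key, tgts in targets.items() if len(tgts) > 1}


if __name__ == "__main__":
    # Only the two 'on' transitions leaving Off are taken from the text;
    # the third triple is illustrative.
    example = [("Off", "on", "On"), ("Off", "on", "Standby"),
               ("On", "off", "Off")]
    print(conflicting_transitions(example))
```

A single pass over the transitions suffices, so the cost of the check is linear in the number of edges of the syntax graph.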
Finally, from the XPG grammar specification the VLCC automatically generates a
VME for UML state diagrams.
6.3 The Workbench Generator
The WoG module is a visual environment that supports the workbench designer in the
definition and generation of workbenches. Fig. 6.12 depicts the architecture of the WoG
module.
[Figure 6.11 shows a syntax graph with the nodes root, NotOn, Off, Standby, On, On1, On2, High, Low, Worm, Hot and Cool, and edges labelled On, Off, Up, Down, Plus and Minus.]
Figure 6.11: The syntax graph corresponding to the state diagram in Fig. 6.10
[Figure 6.12 shows the WoG module with the VME Repository, the Process Model Environment, the Activity Diagram Editor, the Constraint Rule Editor, the Compiler and the resulting Visual-oriented Workbench.]
Figure 6.12: The architecture of the WoG module.
The Activity Diagram Editor allows the designer to define a process model in terms of
an activity diagram. In such a diagram, the activities correspond to specific visual modeling
environments, available in the VME repository, meant to support them. The transitions
define the required activity sequence in the process model. Moreover, the activities can be
linked by dependency arrows that allow the workbench designer to specify model constraints
using the Constraint Rule Editor.
As an example, let us consider the activity diagram shown in Fig. 6.13. It specifies
a workbench providing environments for UML class diagrams, UML sequence diagrams
and UML state diagrams. Due to the fork synchronization bar, a state diagram and/or
a sequence diagram can be edited after the definition of a class diagram. Moreover, the
presence of dashed arrows (labelled r1 and r2) shows that suitable consistency constraints
must hold between class diagrams and the other diagrams. The self-arrow labelled r3
indicates that a constraint must be verified on state diagram models.
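A process model of this kind can be represented with two small tables, one for the control flow and one for the dependency arrows. The sketch below is illustrative only: the dictionaries FLOW and CONSTRAINTS and the two helper functions are hypothetical, and the pairing of r1 and r2 with specific diagrams follows our reading of the example (r1 relates class and state diagrams, as in the rule of Fig. 6.14).

```python
# Control flow: each activity lists the activities that precede it.
# The fork after the class diagram enables both other activities.
FLOW = {
    "ClassDiagram": [],
    "SequenceDiagram": ["ClassDiagram"],
    "StateDiagram": ["ClassDiagram"],
}

# Dependency arrows: rule label -> the pair of activities it relates.
CONSTRAINTS = {
    "r1": ("ClassDiagram", "StateDiagram"),
    "r2": ("ClassDiagram", "SequenceDiagram"),
    "r3": ("StateDiagram", "StateDiagram"),   # self-arrow
}


def suggested_activities(flow, completed):
    """Activities not yet completed whose predecessors are all completed."""
    return sorted(a for a, preds in flow.items()
                  if a not in completed and all(p in completed for p in preds))


def rules_involving(constraints, activity):
    """Labels of the constraints to re-check when 'activity' produces a model."""
    return sorted(r for r, ends in constraints.items() if activity in ends)


if __name__ == "__main__":
    print(suggested_activities(FLOW, {"ClassDiagram"}))
    print(rules_involving(CONSTRAINTS, "StateDiagram"))
```

The first function only suggests activities and does not forbid any, which is in line with using the process model as guidance for the designer.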
[Figure 6.13 shows an activity diagram with the activities UML Class Diagram, UML Sequence Diagram and UML State Diagram, and dependency arrows labelled r1, r2 and r3.]
Figure 6.13: An activity diagram specifying a process model.
Fig. 6.14 depicts an example of a constraint associated with label r1. Let us
observe that the rule consists of two main parts. The first is a description element, which
is a natural language specification of the rule. In this example it states that all the
methods in a state diagram must have a corresponding class in the class diagram. The
second part of the rule is formed by a forall element containing the constraint formula
given in agreement with the rule language abstract syntax [51] shown in Fig. 6.15.
<ConsistencyRule id="r1">
  <description> All methods in a state diagram must have a corresponding class
                in the class diagram </description>
  <forall var="c" in="Event">
    <exists var="o" in="ClassDiagram">
      <equal op1="$o/CallEvent/operation.name/text()"
             op2="$c/Class/operation.name/text()"/>
    </exists>
  </forall>
</ConsistencyRule>
Figure 6.14: A constraint rule between class diagrams and state diagrams.
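The logical shape of such a rule, a universal quantifier enclosing an existential one, can be illustrated independently of xlinkit. The following sketch evaluates the formula over two plain lists of operation names; it does not use the xlinkit engine or XPath, and the function names violations and is_consistent are hypothetical.

```python
def violations(event_operations, class_operations):
    """Evaluate: forall c in events . exists o in classes . o == c.

    Returns the operation names referenced by events in the state
    diagram for which no equal operation name exists in the class
    diagram; an empty list means that the rule holds."""
    return [c for c in event_operations
            if not any(o == c for o in class_operations)]


def is_consistent(event_operations, class_operations):
    """True when the forall/exists formula is satisfied."""
    return not violations(event_operations, class_operations)


if __name__ == "__main__":
    print(violations(["start", "pause"], ["start", "stop"]))
```

Returning the offending elements rather than a bare truth value reflects the use of the check in the workbench, where inconsistencies must be reported back to the designer.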
It is worth noting that the constraint formula is concerned with elements of the visual
models expressed in terms of XPath [12] expressions over the GXL schemas. Since
the workbench designer may be different from the language designer, the WoG module assists the
workbench designer during constraint specification by providing him/her with the capability
of visualizing both the abstract syntax and the GXL schema of the visual languages involved
in the constraint. In particular, Figs. 6.16 and 6.17 show a chunk of the class diagram
and state diagram abstract syntax (including the class Operation on which the formula is
defined). The choice of the rule language has been dictated by the adoption of the xlinkit
rule    ::= ∀ var ∈ xpath (formula)
formula ::= ∀ var ∈ xpath (formula) |
            ∃ var ∈ xpath (formula) |
            formula and formula |
            formula or formula |
            formula implies formula |
            not formula |
            xpath = xpath |
            xpath ≠ xpath |
            same var var
Figure 6.15: The rule language abstract syntax.
Figure 6.16: A part of the class diagram specifying the abstract syntax of UML class
diagrams.
tool [51] to accomplish constraint checking in the generated workbench. However, a rule
language based on first-order logic exhibits some advantages, such as declarativeness,
concision and intuitiveness.
Whenever the workbench designer has completed the specification of the process
model, enriching the activity diagram with the required consistency constraints, the WoG
module analyzes the model and produces a workbench whose structure has been shown in
Fig. 6.2.
The Workbench Interface drives the software designer by coordinating the use of the
tools provided by the workbench. Indeed, the activity diagram describing the process
model is presented to the software designer by showing the completed activities and by
[Figure 6.17 shows the classes State, Transition, Event, CallEvent, Operation and Action, with their associations and multiplicities.]
Figure 6.17: A part of the class diagram specifying the abstract syntax of UML state
diagrams.
suggesting the ones that should be carried out. It is worth noting that, according
to a widely accepted opinion, the software designer should not be forced to follow the
activity sequence expressed by the process model; rather, he/she should have a useful guide
during his/her work.
Moreover, possible inconsistencies between models reported by the xlinkit checker are
shown by highlighting the corresponding dependency arrows in the activity diagram. The
checker is activated whenever the GXL repository is updated due to the definition of a
new software model expressed in terms of a GXL document. In this case it accesses the
Constraint Repository and verifies constraints involving the new GXL document. Then a
diagnostic report is output by xlinkit and possible inconsistent links are highlighted.
An example of xlinkit report is shown in Fig. 6.18. It identifies the inconsistency
between a class diagram and a state diagram with respect to the constraint rule specified
in Fig. 6.14. In this case the workbench interface will highlight the arc labelled with r1.
<xlinkit:consistencyLink ruleid="ConsistencyRule.gxl#/id('r1')">
  <xlinkit:State> inconsistent </xlinkit:State>
  <xlinkit:Locator xlink:href="ClassDiagram.gxl#/ClassDiagram[1]"/>
  <xlinkit:Locator xlink:href="StateDiagram.gxl#/Event[1]"/>
</xlinkit:consistencyLink>
Figure 6.18: A diagnostic report containing an inconsistency link.
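As an illustration of how a workbench interface might consume such a report, the following Python sketch extracts the violated rules and the located elements with the standard xml.etree.ElementTree module. Several points are assumptions: the rule reference is taken to be a ruleid attribute of the consistencyLink element, the namespace URI bound to the xlinkit prefix is a placeholder (the excerpt in Fig. 6.18 does not show namespace declarations), and inconsistent_rules is a hypothetical function name.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
XLINKIT = "urn:example:xlinkit"   # placeholder URI, not the real one

SAMPLE_REPORT = """\
<xlinkit:consistencyLink xmlns:xlinkit="urn:example:xlinkit"
                         xmlns:xlink="http://www.w3.org/1999/xlink"
                         ruleid="ConsistencyRule.gxl#/id('r1')">
  <xlinkit:State> inconsistent </xlinkit:State>
  <xlinkit:Locator xlink:href="ClassDiagram.gxl#/ClassDiagram[1]"/>
  <xlinkit:Locator xlink:href="StateDiagram.gxl#/Event[1]"/>
</xlinkit:consistencyLink>
"""


def inconsistent_rules(report_xml):
    """Map each violated rule reference to the list of located elements."""
    root = ET.fromstring(report_xml)
    result = {}
    # iter() also visits the root, so a report made of a single link works.
    for link in root.iter(f"{{{XLINKIT}}}consistencyLink"):
        state = link.findtext(f"{{{XLINKIT}}}State", default="").strip()
        if state != "inconsistent":
            continue
        locators = [loc.get(f"{{{XLINK}}}href")
                    for loc in link.findall(f"{{{XLINKIT}}}Locator")]
        result.setdefault(link.get("ruleid", ""), []).extend(locators)
    return result


if __name__ == "__main__":
    for rule, where in inconsistent_rules(SAMPLE_REPORT).items():
        print(rule, where)
```

The rule reference returned for each link is what the interface would need in order to decide which dependency arrow of the activity diagram to highlight.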
As we can see, the proposed meta-CASE can be effectively used to address the method
flexibility issue of workbenches. Indeed, it allows the workbench designer to easily tailor the tool
to specific design methods in agreement with the organization requirements. He/she can
generate a workbench supporting a different method by simply modifying the activity
diagram, i.e. by specifying different VMEs and/or different constraints. Moreover, the
time and the costs of workbench construction can be considerably reduced, also thanks to
the possibility of VME and constraint reuse.
Chapter 7
Related Work
In the literature there are many different types of grammar formalisms for specifying di-
agrammatic languages [44]. Some of them use attributes to handle information about
the spatial layout of symbols. Such attributes play an important role in the parsing pro-
cess, since grammar productions are applied if attribute values satisfy specific constraints.
Other than XPGs, in this category of formalisms we find Relational Grammars [64, 66],
Constraint Multiset Grammars [43], and Picture Layout Grammars [32,33]. Other gram-
mar formalisms specify relationships among visual symbols at a high level of abstraction
without attributes. In this category we find Symbol-Relation Grammars [28], Hypergraph
Grammars [47], and Layered Graph Grammars [53, 54].
Many of the proposed grammar formalisms support order-free pictorial parsers that
process the input objects according to no ordering criterion. The formalisms of Picture
Layout Grammars, Relation Grammars, and Constraint Multiset Grammars fall into this
class. In general, and in the worst case, an order-free parser proceeds with a purely
bottom-up enumeration. To limit the parsing computational cost, subclasses of PLGs,
CMGs, and RGs have been defined to provide the corresponding parsers with predictive
capabilities that restrict the search space. To further improve parsing efficiency, predictive
pictorial parsers have also been defined. In this category we find pLR parsers [15], based
on Positional Grammars, and the one presented in [64], based on Relational Grammars.
In general, the broader the class of languages to be treated is, the less efficient the parsing
algorithm is.
Close to grammar-based approaches are formal specification methods based on rewrit-
ing systems. In particular, it is worth mentioning the Conditional Set Rewriting System
by Najork and Kaplan [49] and its extension, the Visual Attributed Rewriting System,
by Bottoni et al. [7, 8]. The latter was especially conceived to specify the pictorial and
computational aspects of visual languages that formalize interactive sessions of the human-
computer dialogue.
In the sequel we give an outline of some of the formalisms cited above.
7.1 Picture Layout Grammars
In the formalism of picture layout grammars (PLG), proposed by Golin and Reiss [33],
a visual sentence is viewed as a multiset of visual symbols, with attributes containing
positional information about symbols. Each production of a PLG grammar is associated
with a set of semantic functions and constraints. The former specify rules by which
attributes of left-hand side symbols are derived from those of right-hand side symbols,
whereas constraints represent predicates over the attribute values of the right-hand side
symbols, and are used to determine when a production can be applied. The associated
parsing algorithm has the drawback that it works under certain restrictions that can only
be checked at run-time, possibly leading to nonterminating computations.
A visual environment generation tool based on this grammar formalism is the Visual
Programmer’s Workbench (VPW) [56], which enables the generation of visual program-
ming environments such as iconic languages and some diagrammatic languages. The
language specification is divided into four components: the syntactic and abstract structure
of the language, and the static and dynamic semantics. The syntactic structure specifies
the visual appearance and structure of the language by means of PLGs. The abstract structure
specifies a model which reflects the underlying structure of the language. The static
semantics allows static properties of visual programs to be extracted, analyzed, or synthesized. The
dynamic semantics specifies the execution properties of the language.
Fig. 7.1 shows the structure of the VPW system. The spatial parser takes a picture
layout grammar and the picture as its inputs and produces a concrete structure, the
parse-graph, as its output. A set of mapping functions transforms the parse-graph into
multiple abstract representations, on which static or dynamic processing action
routines can operate.
Figure 7.1: The architecture of VPW [56].
7.2 Constraint Multiset Grammars
Constraint multiset grammars (CMGs) are another constraint-based formalism, and are
closely related to PLGs. They provide a grammar formalism based on multiset rewriting,
where a nonterminal symbol in a multiset can be rewritten by a production in the
grammar whenever the attributes of the symbols in the multiset satisfy a given constraint
describing relationships between pictures. The main difference between CMGs and PLGs
is that a CMG production can specify constraints over the attributes of any symbol in
the current sentential form.
For example, the following CMG production is from a grammar for state transition dia-
grams:
TR:transition ::= A:arrow, T:text
where exists R:state, S:state where
T.midpoint close to A.midpoint,
R.radius = distance(A.startpoint, R.midpoint),
S.radius = distance(A.endpoint, S.midpoint)
and TR.from=R.name, TR.to=S.name, TR.label=T.string.
The production defines a transition to be formed by an arrow object and a text object
that is near to the midpoint of the arrow. Furthermore, the startpoint and endpoint of
the arrow are constrained to reside on the perimeter of a state object. In the production
above, midpoint, startpoint, endpoint, and radius are geometric attributes whereas from,
name, string and label are semantic attributes.
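As a rough illustration of how the constraints of the production above could be checked, the following Python sketch tests one arrow and one text object against a set of states. The class names, the `close` threshold standing in for "close to", and the numeric tolerance are assumptions made for this example; they are not part of the CMG formalism or of any CMG-based tool.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Arrow:
    startpoint: Point
    endpoint: Point

    @property
    def midpoint(self) -> Point:
        return ((self.startpoint[0] + self.endpoint[0]) / 2,
                (self.startpoint[1] + self.endpoint[1]) / 2)


@dataclass
class Text:
    midpoint: Point
    string: str


@dataclass
class State:
    name: str
    midpoint: Point
    radius: float


def distance(p: Point, q: Point) -> float:
    return math.hypot(p[0] - q[0], p[1] - q[1])


def on_perimeter(point: Point, state: State, tol: float = 1e-6) -> bool:
    # Mirrors "R.radius = distance(A.startpoint, R.midpoint)" up to a tolerance.
    return math.isclose(state.radius, distance(point, state.midpoint), abs_tol=tol)


def reduce_transition(arrow: Arrow, text: Text, states: List[State],
                      close: float = 10.0) -> Optional[Dict[str, str]]:
    """Return the attributes of the transition TR, or None if the constraints fail."""
    # "T.midpoint close to A.midpoint"; the threshold is an assumption of the sketch.
    if distance(text.midpoint, arrow.midpoint) > close:
        return None
    # "exists R:state, S:state": search the context for suitable states.
    for r in states:
        if not on_perimeter(arrow.startpoint, r):
            continue
        for s in states:
            if on_perimeter(arrow.endpoint, s):
                return {"from": r.name, "to": s.name, "label": text.string}
    return None
```

Note that the states are only consulted, not consumed: they play the role of the existentially quantified context symbols R and S in the production.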
CMGs also allow the specification of negative constraints, which enable the specification
of visual languages by deterministic CMGs, yielding efficient parsing [11]. In
general, the recognition algorithm for full CMGs has exponential complexity. However,
cycle-free, stratified, and deterministic CMGs [10] can be recognized in polynomial time.
A visual programming environment generator based on CMGs is the Penguins sys-
tem [10, 11]. Penguins automates the construction of graphical editors that support the
intelligent diagram concept. The intelligent diagram is a metaphor for diagramming where
the underlying graphical editor parses the diagram as it is being constructed, performing
error correction and collecting geometric constraints about the relationships between diagram
components. A constraint solver uses the geometric constraints to maintain the diagram's
semantics during diagram manipulation. The system follows the compiler-compiler
approach to the generation of the diagramming editor. The generated editor supports
the creation, manipulation, and interpretation of diagrams in agreement with the visual
language whose high-level specification is provided (by the programmer) in a specification
language based on constraint multiset grammars (CMG).
Fig. 7.2 shows the overall structure of the Penguins system. In Penguins, the parser
generator VisualGen generates from a CMG specification an incremental diagram parser
(Spatial Parser) which is incorporated into the standard diagramming environment VisualEdit.
The incremental parsing technique allows the presence of incorrect intermediate
visual sentences, yielding a user-friendly paradigm for manipulating visual sentences. The
Graphic Editor provides standard graphic primitives (lines, circles, text, arrows). The
system also provides a Tokenizer that can map input gestures to the graphic primitives.
A Constraint Solver is used by the editor to provide the constraint solving mechanism
needed in geometric error correction and diagram manipulation.
7.3 Relational Grammars
Relational grammars (RGs) are a formalism based on relational structures. A recent version
also considers attributes, but these can only be associated with terminal graphical objects.
[Figure 7.2 labels: Constraint Multiset Grammar (input), VisualGen, Constraint Solver, Incremental Spatial Parser, Graphic Editor, Tokenizer, Application-Specific Routines, VisualEdit, Diagram Editor; arrows labelled "Compile into" and "Generates".]
Figure 7.2: The architecture of the Penguins system [11].
The parsing algorithms of relational grammars have been applied to a variety of applica-
tion fields, such as mathematical expression analysis, line drawing for pen-based interfaces,
multidimensional data verification, interactive support for design, and multimedia document
generation [66].
The productions contain a multiset of symbols on the right-hand side and a set of constraints
to determine the validity of a production. In a Relational Grammar the constraints
are expressed as evaluation rules written in a Prolog-like notation, rather than being used
to compute attributes of aggregate objects as in PLGs and CMGs. An O(n log n) parser has
been constructed for a constraint-based subclass of RGs, called RG/1 grammars.
RGs are a member of the context-free family of constraint multiset grammars, and
several other subclasses of RGs have been proposed to meet the algorithmic demands of specific
applications. In particular, restrictions on the productions of RGs have led to the
definition of the class of Connected RGs, and of its subclasses Fringe RGs and Atomic RGs. The
latter has been applied in ShowBiz, a design tool for business process flow modeling.
7.4 Symbol-Relation Grammars
In symbol-relation grammars (SR grammars) a visual sentence is viewed as a set of symbol
occurrences (s-items) and a set of relational items (r-items) over symbol occurrences [28].
The derivation of a visual sentence is accomplished by rewriting both symbol occurrences
and relational items by means of simple context-free rules. In particular, during a deriva-
tion step a symbol occurrence X0 in a sentence S1 is replaced by a sentence S2 according
to a rewriting rule of the form X0 → S2, called s-production. After X0 has been rewritten,
the replacement of the set of r-items involving X0 is performed through r-item rewriting
rules (r-productions) of the form r(X0, Y1) → R, where R is a set of r-items relating Y1
to s-items in S2.
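A single derivation step of this kind can be sketched in Python as follows. The representation is an assumption of the example: s-items are plain strings, r-items are triples, and the r-productions are supplied as a dictionary keyed by the concrete r-item to be rewritten. An r-item involving X0 that has no entry in the dictionary raises a `KeyError`; this is a choice of the sketch, not a statement about the formalism.

```python
from typing import Dict, Set, Tuple

RItem = Tuple[str, str, str]            # (relation name, first s-item, second s-item)
Sentence = Tuple[Set[str], Set[RItem]]  # (s-items, r-items)


def derive(sentence: Sentence, x0: str, s_rhs: Sentence,
           r_rules: Dict[RItem, Set[RItem]]) -> Sentence:
    """Rewrite the symbol occurrence x0 and every r-item that involves it."""
    s_items, r_items = sentence
    if x0 not in s_items:
        raise ValueError(f"{x0} does not occur in the sentence")
    new_s, new_r = s_rhs
    # s-production X0 -> S2: replace the occurrence by the s-items of S2.
    out_s = (s_items - {x0}) | new_s
    out_r = set(new_r)
    for item in r_items:
        if x0 in item[1:]:
            # r-production r(X0, Y1) -> R: relate Y1 to s-items of S2.
            out_r |= r_rules[item]
        else:
            out_r.add(item)
    return out_s, out_r


# Hypothetical example: X0 is rewritten into a2 and b3, and r(X0, Y1)
# is re-attached to b3.
start: Sentence = ({"X0", "Y1"}, {("r", "X0", "Y1")})
s2: Sentence = ({"a2", "b3"}, {("next", "a2", "b3")})
rules = {("r", "X0", "Y1"): {("r", "b3", "Y1")}}
result = derive(start, "X0", s2, rules)
```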
The formalism of SR grammars has considerable expressive power; indeed, it can be
used to generate any context-sensitive string language [28]. However, the membership
problem for the whole class of SR languages is NP-hard. Thus, an expressive subclass
of SR grammars, named Boundary SR grammars, has been defined in [29], for which an
efficient parsing algorithm has been implemented.
The Visual Language Programming Environment Generator (VLPEG) [30] is based on
Boundary SR grammars, which allow for a high level of abstraction during the language
specification. This is possible also thanks to a lexical analyzer able to interpret the physical
layout of the input visual sentence and to identify the relevant relationships between the
graphical symbols composing the sentence. The system provides further support in the
rapid prototyping process, offering the capability of automatically generating the syntax
of the language by using an inference module. Such a module extends to diagrammatic
languages the grammatical inference techniques previously tried out with the VLG
system [22]. This capability allows the designer to focus on the structural features of
the target language and quickly receive feedback from the customer during the language
prototyping process.
7.5 Graph Grammars
In recent years, graph grammars have also been used for specifying the syntax of visual
languages.
A graph grammar is similar to a string grammar in the sense that the grammar consists
of finite sets of labels for nodes and edges, an axiom, i.e., an initial graph, and a finite
set of productions L ::= R, with L and R graphs. A production p can be applied to graph
G and rewrites it into G′, if G contains a subgraph M_L that matches L, and
G′ = G ∪ R \ M_L. The language of a graph grammar is the set of all graphs which can be derived
in a finite sequence of rewriting steps from the axiom graph. The strong point of graph
grammars is that they can have context-sensitive grammar productions. In unrestricted
context-sensitive productions the left hand side is a subgraph instead of just a single
nonterminal.
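The rewriting step described above can be illustrated with a small Python sketch. It makes several simplifying assumptions that are not taken from any of the cited systems: a graph is just a set of labelled edges with implicit nodes, the match is found by brute-force enumeration, node names shared by L and R are kept, and any other node of R is created fresh. The step is read here as removing the matched part and then adding the instantiated right hand side.

```python
from itertools import permutations
from typing import Dict, Optional, Set, Tuple

Edge = Tuple[str, str, str]  # (source node, edge label, target node)


def nodes_of(edges: Set[Edge]) -> Set[str]:
    """Collect the nodes mentioned by a set of edges."""
    return {node for (src, _, dst) in edges for node in (src, dst)}


def find_match(host: Set[Edge], lhs: Set[Edge]) -> Optional[Dict[str, str]]:
    """Find an injective binding of the nodes of L to host nodes, if any."""
    variables = sorted(nodes_of(lhs))
    for image in permutations(sorted(nodes_of(host)), len(variables)):
        binding = dict(zip(variables, image))
        if all((binding[s], lab, binding[t]) in host for (s, lab, t) in lhs):
            return binding
    return None


def apply_production(host: Set[Edge], lhs: Set[Edge],
                     rhs: Set[Edge]) -> Optional[Set[Edge]]:
    """Apply L ::= R once to the host graph, or return None if L does not match."""
    binding = find_match(host, lhs)
    if binding is None:
        return None
    matched = {(binding[s], lab, binding[t]) for (s, lab, t) in lhs}  # M_L
    used = nodes_of(host)
    counter = 0
    for var in sorted(nodes_of(rhs) - set(binding)):
        # Nodes that occur only in R get fresh names in the host graph.
        while f"new{counter}" in used:
            counter += 1
        binding[var] = f"new{counter}"
        used.add(binding[var])
    replacement = {(binding[s], lab, binding[t]) for (s, lab, t) in rhs}
    return (host - matched) | replacement


# Hypothetical production: split a "next" edge by inserting a new node z.
host = {("a", "next", "b")}
lhs = {("x", "next", "y")}
rhs = {("x", "next", "z"), ("z", "next", "y")}
rewritten = apply_production(host, lhs, rhs)
```

The enumeration over permutations makes the cost of matching explicit: it grows factorially with the size of L, which gives a feeling for why restricted grammar classes are of interest.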
A graph parser tries to decide the membership problem for graph grammars: for a given
graph and a graph grammar, it tries to find a derivation sequence from the grammar’s
axiom to the given graph. This problem is undecidable for general grammars. This derives
from the context-sensitivity of productions and the need to perform graph matching. An-
other technical issue is the embedding of the right hand side production elements into the
context (surrounding graph) of its application. Hence, restricted classes of graph gram-
mars have been considered for which the membership problem is more or less efficiently
decidable.
The class of layered graph grammars discussed in [54] allows for context-sensitive
productions but restricts the right hand side of a production to be lexicographically
smaller than the left hand side. The lexicographic order of graphs is based on the
decomposition of node and edge labels into a set of layers. Further, the right hand side
graphs must be connected and they must add new graph elements when applied. This
class is analyzed in the next subsection.
Moreover, [53] presents a three-level representation of visual programs (diagrams)
for a graph-based visual environment generator (see Fig. 7.3). The authors use two
kinds of graphs as internal representations of diagrams: a spatial relations graph (SRG)
that abstracts from the physical diagram layout and represents higher level spatial relations,
and an abstract syntax graph (ASG) that represents the diagram's logical structure
and is kept up-to-date with the SRG. Context-sensitive graph grammars are used to define
the syntax of both graphs. The lowest level of the model is the graphical representation of
a diagram consisting of graphic primitives (lines, circles, characters etc.) with properties
like size and location. Graphical scanning produces a spatial relations graph (SRG).
By defining the allowed SRGs, the allowed ASGs, and their correspondence relation,
one defines the syntax of the visual language.
[Figure 7.3 labels: Physical Layout, Spatial Relations Graph (SRG), Abstract Syntax Graph (ASG); relations: graphical scanning, constraint solving, represents, represented by; activities supported: low level graphics editing, layout editing, representation oriented editing, interpretation, syntax directed editing.]
Figure 7.3: The three level representation of visual programs [53].
7.5.1 Layered Graph Grammars
Layered graph grammars are context-sensitive graph grammars where the left and right
hand sides of productions are extended with context-elements. The latter cannot be
modified as a result of production applications, but they may be used as sources or targets
for new relationships [54]. More precisely, a production is defined as a couple of graphs
(L, R), where L and R can have a common subgraph K, called the interface graph. This
graph identifies all those context-elements in a host graph which have to be present, but
are not deleted by the application of the production. Thus, no explicit rules are provided
to specify the embedding phase. The presence of context-elements requires the use of quite
complex parsing algorithms. Although Rekers and Schürr argue that the requirements can
be dramatically reduced for real world examples, the general parsing algorithm proposed
by the authors has exponential time and space complexity.
The PROGRES tool [58] employs layered graph grammars and the approach in Fig.
7.3 to deliver a graph grammar engineering environment. PROGRES is a visual language
and a tool where users can edit and execute graph grammar productions. The idea of the
PROGRES language is to support the design of graph structures and the implementation of
graph manipulation tools. The PROGRES tool provides a standard editor environment.
In order to improve the time and space complexity in real cases, several heuristics are
discussed in [52].
Zhang and Zhang [69] have proposed a simple parser for a further restricted kind
of layered graph grammars, named reserved graph grammars (RGG). They reverse each
grammar production and thus obtain a new graph transformation system. As long as the
reverse graph transformation system is confluent, a graph can be parsed by reducing it to
an initial graph. The parsing algorithm for RGGs has polynomial worst-case behavior,
but its applicability is restricted by the requirement of confluence.
VisPro [70] is a set of visual programming tools capable of automatically generating
visual programming environments in a Lex/Yacc fashion. The construction of visual no-
tations consists of two steps: a lexicon definition and a grammar specification. During
the lexicon definition the user defines the visual objects and a visual editor. The formal-
ism of the reserved graph grammars is employed to specify the language grammar. This
specification is used by the toolset to generate a compiler and the visual programming
environment.
7.5.2 Hypergraph Grammars
A hypergraph is a generalization of a graph, in which edges are hyperedges. Each hyperedge
can be connected to any (fixed) number of nodes. In DiaGen [48, 47, 35] diagrams
are internally represented by hypergraphs. A visual language, named diagram class, is
specified by a hypergraph language and a mapping from hypergraphs to their visual repre-
sentation as diagrams. The hypergraph language is specified by a context-free hypergraph
grammar. The nodes and edges of hypergraphs have attributes and the productions of
a hypergraph grammar are adorned with constraints on the attributes. The constraints
direct the layout of a diagram derived by applying the production. A constraint solver
is employed to provide automatic layout of diagrams where the user can adjust layout.
An extension to the hypergraph grammar model, described in [47], allows for
restricted use of context-sensitive elements in the right hand sides of productions. This
makes it possible to model general graph structures.
The hypergraph parser used in DiaGen is based on the CYK parser for context-free
string grammars [68]. The parser constructs syntactic information for a given hypergraph,
searches for maximal subgraphs that are syntactically correct, and creates their syntax
trees. The parser can work incrementally, thus it is particularly suited for online parsing
during the editing of the diagrams.
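For reference, the classic CYK recognizer for context-free string grammars in Chomsky normal form can be sketched in Python as shown below. This is the textbook string algorithm only, not the hypergraph parser of DiaGen, and the toy grammar for the language a^n b^n used in the example is an assumption introduced for illustration.

```python
from typing import Dict, List, Sequence, Set, Tuple


def cyk(tokens: Sequence[str],
        terminal_rules: Dict[str, Set[str]],
        binary_rules: Dict[Tuple[str, str], Set[str]],
        start: str) -> bool:
    """Decide whether `tokens` can be derived from `start` (grammar in CNF)."""
    n = len(tokens)
    if n == 0:
        return False
    # table[i][j] holds the nonterminals deriving tokens[i..j] (inclusive).
    table: List[List[Set[str]]] = [[set() for _ in range(n)] for _ in range(n)]
    for i, tok in enumerate(tokens):
        table[i][i] = set(terminal_rules.get(tok, ()))
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):  # split point between the two halves
                for left in table[i][k]:
                    for right in table[k + 1][j]:
                        table[i][j] |= binary_rules.get((left, right), set())
    return start in table[0][n - 1]


# Toy grammar for a^n b^n (n >= 1): S -> A B | A T, T -> S B, A -> a, B -> b.
TERMINALS = {"a": {"A"}, "b": {"B"}}
BINARIES = {("A", "B"): {"S"}, ("A", "T"): {"S"}, ("S", "B"): {"T"}}
```

The table of partial results is what makes the approach attractive for diagram editing: entries for correct fragments remain available even when the whole input is not a sentence of the language.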
DiaGen consists of an editor framework and a generator. The hypergraph grammar
of a diagram class is given as input to the generator, which creates custom components.
The framework and the components build a graphical editor for the specified diagram
class. The specification of the diagram language can be augmented with transformation
rules which make it possible to provide syntax-directed manipulation of diagrams. The
transformation specifications define editing actions that transform a diagram from one
valid state to another.
In [41] DiaGen has been extended to support free-hand editing as well as syntax-directed
editing, by employing a hypergraph parser and by using graph transformation
to add syntax-directed editing.
7.6 Visual Conditional Attributed Rewriting Systems
Visual Conditional Attributed Rewriting (VCARW) systems are rewriting systems used
for the specification and the implementation of interactive systems [7, 8].
VCARW systems rewrite both the image and the description components of visual
sentences in one step, unlike the previous approaches, where the input picture
is first scanned and translated into an intermediate form that can then be processed by
the parser. Indeed, this approach does not construct parsers from the visual language
specifications, but it derives control systems to manage user interaction.
In [8] the authors introduce one-increasing VCARW systems for which a procedure
has been designed to automatically generate the control mechanisms of the interaction,
thus supporting the construction of more effective and usable systems. In particular, from
a one-increasing VCARW system an automaton can be derived to control user actions
enforcing the correctness of the visual sentences used in the interaction. In this way, the
composition of visual sentences evolves according to a control automaton. Moreover, the
interactions involving different visual languages have been managed by using a metacontrol
automaton formed by a state for each visual language to be used in the interaction.
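The idea of a control automaton that only admits user actions leading to correct visual sentences can be illustrated with the following Python sketch. It is a plain finite automaton written for this example; the class, the action names, and the toy language of boxes chained by arrows are all hypothetical, and the sketch does not reproduce the derivation procedure of [8].

```python
from typing import Dict, List, Set, Tuple


class ControlAutomaton:
    """A finite automaton whose transitions are the admissible user actions."""

    def __init__(self, start: str,
                 transitions: Dict[Tuple[str, str], str],
                 accepting: Set[str]) -> None:
        self.state = start
        self.transitions = transitions
        self.accepting = accepting

    def enabled_actions(self) -> List[str]:
        """Actions the user may perform in the current state."""
        return sorted(a for (s, a) in self.transitions if s == self.state)

    def perform(self, action: str) -> bool:
        """Execute an action if it is admissible; otherwise reject it."""
        target = self.transitions.get((self.state, action))
        if target is None:
            return False  # the action is refused and the state is unchanged
        self.state = target
        return True

    def is_complete(self) -> bool:
        """True when the sentence composed so far is a complete sentence."""
        return self.state in self.accepting


def chain_language() -> ControlAutomaton:
    # Hypothetical language: a box, followed by any number of (arrow, box) pairs.
    return ControlAutomaton(
        start="empty",
        transitions={
            ("empty", "add_box"): "complete",
            ("complete", "add_arrow"): "dangling",
            ("dangling", "add_box"): "complete",
        },
        accepting={"complete"},
    )
```

An editor built on such an automaton would offer only the actions returned by `enabled_actions`, so an incorrect sentence cannot be composed in the first place.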
This approach has been implemented in the GenIAL system. This tool generates
logbooks formed by a page with a window that presents the alphabet symbols, and a
window for composing the sentences. The latter allows the users to construct visual
sentences in free-order form, and implements the interpreter for the control automaton of
the corresponding visual language.
Chapter 8
Conclusions and Further Research
In this thesis, we have presented a methodology for modeling and implementing visual
notations. The methodology relies on a grammar formalism (XPG) and an LR-based
technique (XpLR), which allow us to model and parse a broad class of visual languages,
encompassing those used in software development methodologies. In particular, we have
highlighted the power of the methodology by implementing the UML state diagram languages,
which represent one of the most complex visual modeling languages used in the
software engineering field. Indeed, they have recently been used as a case study to evaluate
and compare the capabilities of different approaches to visual language specification [17].
The XPG/XpLR methodology has been implemented within the VLCC system,
supporting the generation of visual programming environments that can be used as CASE
tools for the modeled notations.
Moreover, we have shown how to construct a parser for an XPG grammar by deriving
an equivalent translation scheme by means of special mapping rules and particular
conflict handling techniques, which simulate the techniques used for XPG. This conversion
process allows a rapid implementation of compilers for XPGs thanks to the use of
standard, well-known, and tested tools like YACC.
We have also presented the design of a meta-CASE workbench based on the technol-
ogy of visual language generation systems, such as the one proposed in this thesis, and on
UML meta modeling. In particular, the visual modeling environments supporting the cus-
tomized software process are specified through UML class diagrams and suitable grammar
formalisms, such as XPGs. The customized workbench is generated by integrating visual
modeling environments in agreement with a process model, and with suitable constraints
expressing rules/guidelines of the method the workbench is supposed to support. Such a
process model is expressed through an activity diagram showing the essential dependencies
among models.
The use of UML meta-models for the specification of visual languages has been gaining interest
in recent years. As a matter of fact, a meta-model approach underlies most generators
of diagrammatic editors. As an example, Metabuilder [27] automatically generates an
editor for a new visual language starting from the class diagram modeling the language.
In [9] UML meta modeling has also been exploited to characterize families of diagrammatic
languages through an abstract syntax given as a class diagram and a set of constraints in
a logical language.
Regarding the XPG/XpLR methodology, in the future we intend to perform further
theoretical studies on the expressive power of XPG grammars and on the recognition power
of the XpLR parser. So far we have been able to assess them only empirically,
that is, by modeling and parsing many practical visual languages. In particular,
we have been able to model visual notations used in widespread software development
methodologies, those used in emerging multimedia and web development methodologies,
in workflow management, and in e-business specification tools. Currently, we are investigating
notations used in virtual reality applications. An interesting direction for future research is the
comparison of the visual formalisms defined so far with XPG, similarly to what
has been done by Marriott and Meyer in [44] with CMGs.
Moreover, we intend to automate the conversion process proposed in Chapter 4 in
order to obtain a converter that can be profitably integrated into the VLCC system.
Several promising research directions can be envisaged in the construction of meta-
CASE workbenches. First of all, it will be interesting to identify the systems described
in the previous chapter that could be effectively exploited for the construction of meta-
CASE workbenches in agreement with the approach proposed in Chapter 6. Then, we
intend to carry out an analysis and a comparison of the different proposed methods and
of the related results. To this aim, it will be crucial to have a unified theoretical and
methodological framework, which encompasses the different proposals and allows us to
identify the advantages, the drawbacks, and the specific application fields of each approach
from the point of view of expressiveness, efficiency, and ease of use. Thus, the different
proposals could be exploited in a single meta-CASE that would allow workbench designers
to select the most appropriate approach on the basis of the target application domain.
References
[1] A.V. Aho, R. Sethi, and J.D. Ullman, “Compilers, principles, techniques and tools”,
Addison-Wesley, New York , 1985.
[2] M. Baldi, S. Gai, M.L. Jaccheri, and P. Lago, “Object Oriented Software Process
Design in E3”, Software Process Modelling & Technology, A. Finkelstein, J. Kramer,
and B. Nuseibeh, Eds, Research Studies Press, 1994.
[3] S. Bandinelli, A. Fuggetta, C. Ghezzi, and L. Lavazza, “SPADE: an environment
for software process analysis, design and enactment”, Software Process Modelling &
Technology, A. Finkelstein, J. Kramer, and B. Nuseibeh, Eds, Research Studies Press,
1994.
[4] N.S. Barghouti, “Supporting Cooperation in the Marvel Process-Centred SDE”, in
Proceedings of the 1992 ACM Symposium on Software Development Environments,
ACM Press, 1992, pp. 21-31.
[5] E.C. Baroth and C. Hartsough, “Experience Report: Visual Programming in the Real
World”, Visual Object Oriented Programming, edited by M. M. Burnett, A. Goldberg
& T. G. Lewis, Manning Publications, Prentice Hall, 1995, pp. 21-42.
[6] I.Z. Ben-Shaul, and G.E. Kaiser, “A Paradigm for Decentralized Process Modeling
and its Realization in the Oz Environment”, in Sixteenth International Conference
on Software Engineering, May 1994, pp. 179-188.
[7] P. Bottoni, M.F. Costabile, S. Levialdi and P. Mussio, “Specifying Dialogue Control in
Visual Interactive Systems”, Journal of Visual Languages and Computing 9, 553-564,
(1998).
[8] P. Bottoni, M.F. Costabile, and P. Mussio, “Specification and Dialogue Control of
Visual Interaction through Visual Rewriting Systems”, ACM Transactions on Pro-
gramming Languages and Systems, 21(6), 1077-1136 (1999).
[9] P. Bottoni, G. Costagliola, “On the Definition of Visual Languages and Their Ed-
itors”, in Procs. of 2nd International Conference on the Theory and Application of
Diagrams (Diagrams 2002), Georgia, USA, April 18-20, 2002, pp. 305-319.
[10] S.S. Chok and K. Marriott, “Automatic construction of user interfaces from constraint
multiset grammars”, in Proceedings of the 11th IEEE International Symposium on
Visual Languages, Darmstadt, Germany, 1995, pp. 242-249.
[11] S.S. Chok and K. Marriott, “Automatic construction of intelligent diagram editors”,
in Proceedings of the ACM Symposium on User Interface Software and Technology
UIST98, San Francisco, California, 1998, pp. 185-194.
[12] J. Clark and S. DeRose. XML Path Language (XPath) version 1.0 Recommendation
http://www.w3.org/TR/1999/REC-xpath-19991116, World Wide Web Consortium,
June 1999.
[13] G. Costagliola, A. De Lucia, S. Orefice, G. Polese, “A Classification Framework to
Support the Design of Visual Languages”, to appear in Journal of Visual Languages
and Computing.
[14] G. Costagliola, A. De Lucia, S. Orefice, G. Tortora, “Automatic Generation of Visual
Programming Environments”, IEEE Computer, 28(3), 1995, pp. 56-66.
[15] G. Costagliola, A. De Lucia, S. Orefice, G. Tortora, “A Parsing Methodology for the
Implementation of Visual Systems”, IEEE Transactions on Software Engineering,
23(12), 1997, pp. 777-799.
[16] G. Costagliola, V. Deufemia, F. Ferrucci, C. Gravino, “On the pLR parsability of Vi-
sual Languages”, in Proceedings of IEEE International Symposia on Human-Centric
Computing Languages and Environments (HCC’01), Stresa, 5-7 September, 2001, pp.
48-49.
[17] G. Costagliola, V. Deufemia, F. Ferrucci, C. Gravino, “Implementing Statecharts
using Extended Positional Grammars”, Statecharts Modeling Contest in Symposia on
Human-Centric Computing Languages and Environments (HCC’01), Stresa, 5-7
September, 2001.
http://www2.informatik.uni-erlangen.de/VLFM01/Statecharts/CoDeFeGr.pdf.
[18] G. Costagliola, V. Deufemia, F. Ferrucci, C. Gravino, “Using Extended Positional
Grammars to Develop Visual Modeling Languages”, in Proceedings of the Four-
teenth International Conference on Software Engineering and Knowledge Engineering
(SEKE’02), Ischia, 15-19 July, 2002, ACM Press, pp. 201-208.
[19] G. Costagliola, V. Deufemia, F. Ferrucci, C. Gravino, “The Use of the GXL Approach
to Supporting Visual Language Specification and Interchanging”, in Proceedings of
IEEE International Symposia on Human-Centric Computing Languages and Environments
(HCC’02), Arlington, VA, USA, 3-6 September, 2002, pp. 131-138.
[20] G. Costagliola, V. Deufemia, F. Ferrucci, C. Gravino, “Exploiting Visual Languages
Generation and UML Meta Modeling to Construct Meta-CASE Workbenches”, in
Special Issue of Electronic Notes in Theoretical Computer Science 72 (3) (2002).
[21] G. Costagliola and G. Polese, “Extended Positional Grammars”, in Proceedings of
2000 IEEE Symposium on Visual Languages, Seattle, WA, USA.
[22] C. Crimi, A. Guercio, G. Pacini, G. Tortora, and M. Tucci, “Automating Visual
Language Generation”, IEEE Transactions on Software Engineering, 16(10), 1990,
pp. 1122-1135.
[23] J. C. Cruellas, A. Canals, J. P. Bodeveix, and T. Millan, “The NEPTUNE Technol-
ogy to Verify and to Document Software Components”, to appear in the book On
Business Component-Based Software Engineering. Barbier edition. Kluwer Academic
Publishers.
[24] J. Ebert and A. Franzke, “A Declarative Approach to Graph Based Modeling”, in E.
Mayr, G. Schmidt, and G. Tinhofer, editors. Graphtheoretic Concepts in Computer
Science, LNCS 903. Springer, Berlin, 1995, pp. 38-50.
[25] G. Engels, L. Groenewegen, R. Heckel, J. M. Küster, “A Methodology for Specifying
and Analyzing Consistency of Object-Oriented Behavioral Models”, in V. Gruhn
(ed.): Proceedings of the 8th European Software Engineering Conference (ESEC),
ACM Press, Vienna, Austria, September 2001, pp. 186-195.
[26] J. Feder, “Plex Languages”, Information Science, vol. 3, 1971, pp. 225-241.
[27] R. I. Ferguson, A. Hunter, C. Hardy, “MetaBuilder: The Diagrammer’s Diagrammer”,
in Proceedings Lecture Notes in Computer Science 1889 Springer 2000, Edinburgh,
Scotland, UK, September 1-3, 2000, pp. 407-421.
[28] F. Ferrucci, G. Pacini, G. Satta, M. Sessa, G. Tortora, M. Tucci, G. Vitiello, “Symbol-
Relation Grammars: A Formalism for Graphical Languages”, Information and Com-
putation, 131, 1 (November 1996), pp. 1-46.
[29] F. Ferrucci, G. Tortora, M. Tucci, G. Vitiello, “A Predictive Parser for Visual Lan-
guages Specified by Relational Grammars”, in Proceedings of IEEE Symposium on
Visual Languages, St. Louis, Missouri, October 1994, pp. 245-252.
[30] F. Ferrucci, G. Tortora, M. Tucci, G. Vitiello, “A System for Rapid Prototyp-
ing of Visual Languages”, in Proceedings of IEEE International Symposium on Vi-
sual/Multimedia Approaches to Programming and Software Engineering, Stresa, Italy,
September 5-7, 2001, pp. 382-389.
[31] F. Ferrucci, G. Tortora, and G. Vitiello, “Visual Programming”, in Encyclopedia of
Software Engineering, J.J. Marciniak (Ed.), John Wiley and Sons, 2001.
[32] E.J. Golin, “Parsing Visual Languages with Picture Layout Grammars”, Journal of
Visual Languages and Computing, 2, 1991, pp. 1-23.
[33] E.J. Golin, S.P. Reiss, “The Specification of Visual Language Syntax”, Journal of
Visual Languages and Computing, 1, 1990, pp. 141-157.
[34] D. Harel, “On Visual Formalisms”, Communications of the ACM 31, 5 (May 1988),
pp. 514-530.
[35] B. Hoffmann and M. Minas, “A generic model for diagram syntax and semantics”,
Workshop on Graph Transformation and Visual Modelling Techniques, July 15/16,
Geneva, Switzerland, 2000, pp. 443-450.
[36] R. C. Holt, A. Winter, and A. Schuerr, “GXL: Toward a Standard Exchange Format”,
in 7th Working Conference on Reverse Engineering, IEEE Computer Soc., 2000, pp.
162-171.
[37] H. Isazadeh and D.A. Lamb, “CASE Environments and MetaCASE Tools”. Technical
report 1997-403, Dept. of Computing and Information Science, Queen’s University,
Kingston, Canada K7L 3N6, February 1997.
[38] D. Jackson, I. Schechter, and I. Shlyakhter, “Alcoa: the Alloy constraint analyzer”,
in Procs. of the International Conference on Software Engineering, Limerick, Ireland,
June 2000.
[39] S. C. Johnson, “YACC: Yet Another Compiler Compiler”, Bell Laboratories, Murray
Hill, NJ, 1978.
[40] G. Kent, “Automated RF Test System for Digital Cellular Telephones”, Procs. of
NEPCON West ’93, Anaheim, California, 1993, pp. 1055-1064.
[41] O. Koth and M. Minas, “Generating Diagram Editors Providing Free-Hand Editing as
well as Syntax-Directed Editing”, in Proc. GRATRA’2000 - Joint APPLIGRAPH and
GETGRATS Workshop on Graph Transformation Systems, Technische Universität
Berlin, Germany, March 2000, pp. 32-39.
[42] W. Liu, S. Easterbrook and J. Mylopoulos, “Rule-Based Detection of Inconsistency
in UML Models”, Workshop on Consistency Problems in UML-based Software
Development (UML 2002), pp. 106-123.
[43] K. Marriott, “Constraint Multiset Grammars”, in Procs. of the IEEE Symposium on
Visual Languages, 1994, pp. 118-125.
[44] K. Marriott and B. Meyer, editors. Visual Language Theory. Springer-Verlag, 1998.
[45] K. Marriott, B. Meyer, K. Wittenburg, “A Survey of Visual Language Specification
and Recognition”, in [44].
[46] R. Medina-Mora, T. Winograd, R. Flores, “The Action Workflow Approach to Work-
flow Management Technology”, in Proceedings of CSCW’92, ACM Press, 1992, pp.
281-288.
[47] M. Minas, “Diagram Editing with Hypergraph Parser Support”, in Procs. of 13th
IEEE Symposium on Visual Languages, Capri, Italy, Sept. 1997, pp. 226-233.
[48] M. Minas and G. Viehstaedt, “DiaGen: A Generator for Diagram Editors Providing
Direct Manipulation and Execution of Diagrams”, in Procs. 11th IEEE International
Symposium on Visual Languages, Darmstadt, Germany, 1995, pp. 203-210.
[49] M. A. Najork, S. M. Kaplan, “Specifying Visual Languages with Conditional Set
Rewrite Systems”, Procs. IEEE Symposium on Visual Languages, 1993, pp. 12-18.
[50] Object Management Group: UML specification v1.4, OMG-Document formal/01-09-
67, 2001. http://www.omg.org/technology/documents/formal/uml.htm.
[51] C. Nentwich, L. Capra, W. Emmerich and A. Finkelstein, “xlinkit: a Consistency
Checking and Smart Link Generation Service”, ACM Transactions on Internet Tech-
nology, 2002.
[52] J. Rekers, A. Schürr, “A Graph Grammar Approach to Graphical Parsing”, in Pro-
ceedings 11th IEEE International Symposium on Visual Languages, 1995, pp. 195-202.
[53] J. Rekers, A. Schürr, “A Graph Based Framework for the Implementation of Vi-
sual Environments”, in Proceedings 12th IEEE International Symposium on Visual
Languages, Boulder, Colorado, Sept. 1996, pp. 148-157.
[54] J. Rekers, A. Schürr, “Defining and Parsing Visual Languages with Layered Graph
Grammars”, Journal of Visual Languages and Computing, 8, 1 (1997), pp. 27-55.
[55] D.T. Ross, K.E. Schoman Jr., “Structured Analysis for Requirement Definition”,
IEEE Transactions on Software Engineering, 3, 1 (1977), pp. 6-15.
[56] R.V. Rubin, J. Walker II, E. J. Golin, “Early Experience with the Visual Programmers
Workbench”, IEEE Transactions on Software Engineering 16, 10 (October 1990), pp.
1107-1121.
[57] J. Rumbaugh, I. Jacobson, and G. Booch, “The Unified Modeling Language Reference
Manual”, Addison Wesley, Reading, 1999.
[58] A. Schürr, A. Winter, A. Zündorf, “Graph Grammar Engineering with PROGRES”,
in Schäfer W., Botella P. (eds.): Proc. 5th European Software Engineering Conf.
(ESEC95), vol. 989 of LNCS, pp. 219-234. Springer Verlag, Berlin, 1995.
[59] I. Sommerville, Software Engineering, Addison Wesley, 1996.
[60] J. L. Sourrouille, G. Caplat, “Constraint Checking in UML Modeling”, in Proceedings
of the Fourteenth International Conference on Software Engineering and Knowledge
Engineering (SEKE’02), Ischia, 15-19 July, 2002, ACM Press, pp. 217-224.
[61] K.D. Swenson, R.J. Maxwell, T. Matsumoto, B. Saghari, and K. Irwin, “A Business
Process Environment Supporting Collaborative Planning”, Journal of Collaborative
Computing, vol. 1, no. 1, 1994.
[62] S.M. Uskudarli and T.B. Dinesh, “Towards a Visual Programming Environment
Generator for Algebraic Specifications”, Procs. 11th IEEE International Symposium
on Visual Languages, Darmstadt, Germany, 1995.
[63] A. Winter, “Exchanging Graphs with GXL”, Graph Drawing - 9th International Sym-
posium, GD 2001, Vienna, September 23-26, 2001, Mathematics and Visualization.
[64] K. Wittenburg, “Earley-style Parsing for Relational Grammars”, in Proceedings 8th
IEEE International Workshop on Visual Languages, Seattle, WA, U.S.A., 1992, pp.
192-199.
[65] A. Winter, B. Kullbach, V. Riediger, “An Overview of the GXL Graph Exchange
Language”, S. Diehl (Ed.), Software Visualization, LNCS 2269, pp. 324-336, 2002.
[66] K. Wittenburg, L. Weitzman, “Relational Grammars: Theory and Practice in a Visual
Language Interface for Process Modeling”, in Marriott and Meyer [44], chapter 6, pp.
193-217.
[67] XML Metadata Interchange (XMI). Object Management Group document formal/01-
09-76. Available from http://cgi.omg.org/docs/formal/01-09-76.pdf.
[68] D. Younger, “Recognition and parsing of context-free languages in time n^3”,
Information and Control, 10, 1967, pp. 189-208.
[69] D. Zhang and K. Zhang, “Reserved Graph Grammar: A Specification tool for dia-
grammatic VPLs”, in Procs. of 13th IEEE Symposium on Visual Languages, Capri,
Italy, Sept. 1997, pp. 288-295.
[70] D. Zhang and K. Zhang, “VisPro: A Visual Language Generation Toolset”, Proceed-
ings of IEEE Symposium on Visual Languages, Halifax, Canada, 1999, pp. 195-202.